todsacerdoti | 32 comments

pjmlp|next|

Interestingly enough, by following up some references in the article I discovered that Go is also following Java and .NET design decisions that could maybe have been there early on:

- Deprecating finalizers and using cleaner queues (https://openjdk.org/jeps/421)

- Weak references (https://learn.microsoft.com/en-us/dotnet/api/system.weakrefe..., https://docs.oracle.com/javase/8/docs/api/java/lang/ref/Weak...)

Related tickets:

"runtime: add AddCleanup and deprecate SetFinalizer" - https://github.com/golang/go/issues/67535

"weak: new package providing weak pointers" - https://github.com/golang/go/issues/67552

One day Go will have all those features it deemed unnecessary in "complex" languages.


nickcw|prev|next|

The unique package is my top feature for go1.23. I've been experimenting with it in rclone.

People often want to have millions of S3 objects in memory and reducing the memory used would be very desirable.

I interned all the strings used (there are a lot of duplicates, like Content-Type values) and it reduced memory usage by about 30%, which is great.

I wonder how much difference this little fix mentioned in the article for go1.23.2 will make? https://github.com/golang/go/issues/69370

The article also mentions strings.Clone, which has been around for a while. Using it is very easy, and it stops small substrings from pinning big strings in memory. I had a problem with this in the S3 backend, where one string was pinning the entire XML response from the server; strings.Clone fixed it.


tapirl|parent|next|

Be aware that there is a bug in the current implementation (1.23.0 and 1.23.1) https://github.com/golang/go/issues/69370

jfoutz|prev|next|

Interning is neat. Most of my experience is really dated. Primarily in the JVM, and mostly for class names, for reflection and class loaders. It's sort of surprising seeing this added to go, with its desires for minimalism. But when you can use it, it can be a big win.

Look past the "loading the whole book in memory" example; the author gets to the point soon enough.

The IP address example is OK. It's true, and it highlights some important points. But keep in mind that pointers are 64-bit. If you're not on IPv6 and you're shuffling a lot of addresses, you're probably better off just keeping the uint64 and converting to a string or allocating the struct as needed; interning doesn't appear to be much of a win in that narrow case. But if you do care about IPv6 and you're connecting to millions of upstreams, it's not unreasonable.

It's neat that it's available. It's good to be aware of interning, but it's generally not a huge win. For a few special cases, it can be really awesome.

** edit uint32 for ipv4. bit counting is hard.


wjholden|parent|next|

Fun fact: in Go, an IPv4 address is internally represented as an IPv6 address, starting with ten zero bytes and two 0xff bytes. The IPv4 address is copied into the last four bytes.

https://cs.opensource.google/go/go/+/refs/tags/go1.23.1:src/...

This is called an IPv4-Mapped Address.

https://www.rfc-editor.org/rfc/rfc5156#section-2.2


oefrha|parent|prev|next|

No, the example isn't about IP address octets, it's about IPv6 zones — we're talking strings like "eth0".

survivedurcode|prev|next|

Beware the trade-offs of interning affecting GC behavior. Now you can’t have a stack-allocation optimization, for example.

tapirl|parent|next|

The interning feature should be only used for some rare cases, it is not intended to be used widely.

cherryteastain|prev|next|

Cool idea, but sounds detrimental in terms of cache efficiency. Typically processing a string by reading it sequentially is quite cache efficient as the processor will prefetch, but with this method it seems like the string will not be contiguous in memory which will lead to more cache misses.

pimeys|parent|next|

It's quite common if you need to parse a schema and keep it in memory for a long time. We're working on a GraphQL federation gateway, and interning strings is really useful there, since schemas are sometimes hundreds of megabytes.

Writing your own interner is under a hundred lines of code and takes about 15 minutes. Writing a thread-safe interner is a trickier problem, but there are good libraries for that.
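A hand-rolled interner really is short; here is a hypothetical, deliberately non-thread-safe sketch (the `Interner` name and API are made up for illustration):

```go
package main

import "fmt"

// Interner deduplicates strings: equal contents share one
// canonical copy. Not safe for concurrent use; wrap the map
// access in a sync.Mutex (or use sync.Map) for that.
type Interner struct {
	seen map[string]string
}

func NewInterner() *Interner {
	return &Interner{seen: make(map[string]string)}
}

// Intern returns the canonical copy of s, storing s as the
// canonical copy if it hasn't been seen before.
func (i *Interner) Intern(s string) string {
	if canon, ok := i.seen[s]; ok {
		return canon
	}
	i.seen[s] = s
	return s
}

func main() {
	in := NewInterner()
	a := in.Intern("application/json")
	b := in.Intern("application/json")
	fmt.Println(a == b)       // true
	fmt.Println(len(in.seen)) // 1: one canonical copy stored
}
```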


SwiftyBug|parent|prev|next|

Seems like the classic trade-off of compute time x memory.

morkalork|prev|next|

This is new for Go? I remember learning about Java string interning decades ago in the context of xml parsers. If I remember correctly, there were even some memory leaks associated with it and thread locals?

pphysch|parent|next|

You could've implemented bespoke interning at any point in Go; it was added to the standard library only recently, though, likely because it may leverage Go's relatively recent support for generics.

peterldowns|prev|next|

I missed the initial blogpost about this; thanks for the solid explanation and the links. Probably won't make much of a difference for my use cases but cool to know this is now in the stdlib.

favflam|prev|next|

Is this the same as grpc.SharedBufferPool? The gRPC implementation does a lot of memory allocation.

pjot|prev|next|

  > I have a very large plaintext file and I’m loading it fully in memory. This results in a string variable book that occupies 282 MiB of memory.
At what point does something become (very) large?

smellybigbelly|prev|next|

Couldn’t you read in the book a bit smarter by deduplicating the io stream?

User23|prev|next|

For reference, the term comes from Lisp’s INTERN. [1]

[1] http://clhs.lisp.se/Body/f_intern.htm


liotier|prev|

Is this essentially dictionary compression ?

nine_k|parent|next|

It's similar. It's just not by word, but by entire string, assuming that strings are immutable, long enough, and the same string values are referenced often enough.

On 64-bit machines though strings shorter than 8 bytes are better off not interned.


jfoutz|root|parent|next|

As I read it, it's any struct, not just strings, which is cool.

TheDong|root|parent|next|

Any "comparable" struct, which is to say any struct where '==' works. No, you can't override '=='.

This will not work for the average struct, and if you do use it, you can no longer include, for example, any `[]byte` fields in the struct, or it will no longer compile.

I always find it funny that `string` is comparable, but `[]rune` and `[]byte` are not.


aatd86|root|parent|next|

It's because string values are immutable, so at any point in a program, the result of comparing two string values would be the same.

For slice types it's more complex because there is mutability and implicit aliasing, and the backing array pointed at may change. I guess the latter should point us toward deep value comparison for slice types, since any other comparison would be quite flimsy.


recursive|parent|prev|

No. This is avoiding keeping multiple copies of the same value. There's no compression.

VWWHFSfQ|root|parent|next|

That's pretty much the definition of compression

recursive|root|parent|next|

Data compression can reduce the space required to store a single payload. That's not what's going on here. I mean, I understand what you're saying, but calling this compression seems counter-productive to understanding what's going on.

ithkuil|root|parent|next|

I think it is useful to consider this a form of lossless compression, not only because it technically is one, but also because a simple compression scheme like this is a good introduction to the trade-offs that compression algorithms can have.

This is an instance of dictionary encoding where terms are words and dictionary code space is the entire machine addressing space (heap).

The disadvantage of this approach is that you can only "deduplicate" words that are exactly the same and you cannot remove duplicated substrings. Another disadvantage is that the dictionary code size is 8 bytes which means you achieve some saving only for words that are longer than 8 bytes (plus some other overhead for the interning index).

Compression algorithms that use dictionary encoding typically use code/symbol sizes smaller than 64 bits, often using variable length encoded numbers encoded with some kind of entropy coding that is able to use short bit sequences for codes representing dictionary entries appearing often and longer bit sequences for less frequent ones.

But the idea is the same. During decompression you read each "number" from the compressed stream and you use it effectively to index an array that contains the target (sub)strings.

The algorithm in the post just interprets the 64-bit dictionary code as an index in the giant array that is the machine addressing space.
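The dictionary-encoding idea described above can be sketched in a few lines; this toy version uses small integer codes where interning uses 64-bit pointers:

```go
package main

import "fmt"

// encode assigns each distinct word a small integer code. The
// "compressed" form is the code sequence plus the dictionary;
// interning is the same idea with heap pointers as the codes.
func encode(words []string) (codes []int, dict []string) {
	index := make(map[string]int)
	for _, w := range words {
		c, ok := index[w]
		if !ok {
			c = len(dict)
			index[w] = c
			dict = append(dict, w)
		}
		codes = append(codes, c)
	}
	return codes, dict
}

// decode reads each code and uses it as an index into the
// dictionary, exactly as the comment above describes.
func decode(codes []int, dict []string) []string {
	out := make([]string, len(codes))
	for i, c := range codes {
		out[i] = dict[c]
	}
	return out
}

func main() {
	words := []string{"to", "be", "or", "not", "to", "be"}
	codes, dict := encode(words)
	fmt.Println(codes) // [0 1 2 3 0 1]
	fmt.Println(dict)  // [to be or not]
	fmt.Println(decode(codes, dict))
}
```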


rat9988|root|parent|prev|next|

I'm not sure I can follow you here. Data compression cannot reduce the size of a single word if we count the dictionary size.

recursive|root|parent|next|

It's not impossible... but who said anything about a single word?

wjholden|root|parent|prev|next|

I like the term "deduplication."