[ruby-core:117469] [Ruby master Feature#20415] Precompute literal String hash code during compilation

Issue #20415 has been reported by byroot (Jean Boussier). ---------------------------------------- Feature #20415: Precompute literal String hash code during compilation https://bugs.ruby-lang.org/issues/20415 * Author: byroot (Jean Boussier) * Status: Open ---------------------------------------- I worked on a proof of concept with @etienne which I think has some potential, but I'm looking for feedback on what would be the best implementation. The proof of concept is here: https://github.com/Shopify/ruby/pull/553 ### Idea Most string literals are relatively short, hence embedded, and have some wasted bytes at the end of their slot. We could use that wasted space to store the string hash. The goal being to make **looking up a literal String key in a hash, as fast as a Symbol key**. The goal isn't to memoize the hash code of all strings, but to **only selectively precompute the hash code of literal strings in the compiler**. The compiler could even selectively do this when we literal string is used to lookup a hash (`opt_aref`). Here's the benchmark we used: ```ruby hash = 10.times.to_h do |i| [i, i] end dyn_sym = "dynamic_symbol".to_sym hash[:some_symbol] = 1 hash[dyn_sym] = 1 hash["small"] = 2 hash["frozen_string_literal"] = 2 Benchmark.ips do |x| x.report("symbol") { hash[:some_symbol] } x.report("dyn_symbol") { hash[:some_symbol] } x.report("small_lit") { hash["small"] } x.report("frozen_lit") { hash["frozen_string_literal"] } x.compare!(order: :baseline) end ``` On Ruby 3.3.0, looking up a String key is a bit slower based on the key size: ``` Calculating ------------------------------------- symbol 24.175M (± 1.7%) i/s - 122.002M in 5.048306s dyn_symbol 24.345M (± 1.6%) i/s - 122.019M in 5.013400s small_lit 21.252M (± 2.1%) i/s - 107.744M in 5.072042s frozen_lit 20.095M (± 1.3%) i/s - 100.489M in 5.001681s Comparison: symbol: 24174848.1 i/s dyn_symbol: 24345476.9 i/s - same-ish: difference falls within error small_lit: 21252403.2 i/s - 1.14x slower frozen_lit: 20094766.0 i/s - 1.20x slower ``` With the proof of concept performance is pretty much identical: ``` Calculating ------------------------------------- symbol 23.528M (± 6.9%) i/s - 117.584M in 5.033231s dyn_symbol 23.777M (± 4.7%) i/s - 120.231M in 5.071734s small_lit 23.066M (± 2.9%) i/s - 115.376M in 5.006947s frozen_lit 22.729M (± 1.1%) i/s - 115.693M in 5.090700s Comparison: symbol: 23527823.6 i/s dyn_symbol: 23776757.8 i/s - same-ish: difference falls within error small_lit: 23065535.3 i/s - same-ish: difference falls within error frozen_lit: 22729351.6 i/s - same-ish: difference falls within error ``` ### Possible implementation The reason I'm opening this issue early is to get feedback on which would be the best implementation. #### Store hashcode after the string terminator Right now the proof of concept simply stores the `st_index_t` after the string null terminator, and only when the string is embedded and as enough left over space. Strings with a precomputed hash are marked with an user flag. Pros: - Very simple implementation, no need to change a lot of code, and very easy to strip out if we want to. - Doesn't use any extra memory. If the string doesn't have enough left over bytes, the optimization simply isn't applied. - The worst case overhead is a single `FL_TEST_RAW` in `rb_str_hash`. Cons: - The optimization won't apply to certain string sizes. e.g. strings between `17` and `23` bytes won't have a precomputed hash code. - Extracting the hash code requires some not so nice pointer arithmetic. #### Create another RString union Another possibility would be to add another entry in the `RString` struct union, such as we'd have: ```c struct RString { struct RBasic basic; long len; union { // ... existing members struct { st_index_t hash; char ary[1]; } embded_literal; } as; }; ``` Pros: - The optimization can now be applied to all string sizes. - The hashcode is always at the same offset and properly aligned. Cons: - Some strings would be bumped by one slot size, so would use marginally more memory. - Complexify the code base more, need to modify a lot more string related code (e.g. `RSTRING_PTR` and many others) - When compiling such string, if an equal string already exists in the `fstring` table, we'd need to replace it, we can't just mutate it in place to add the hashcode. ### Prior art [Feature #15331] is somewhat similar in its idea, but it does it lazily for all strings. Here it's much simpler because limited to string literals, which are the ones likely to be used as Hash keys, and the overhead is on compilation, not runtime (aside from a single flag check). So I think most of the caveats of that original implementation don't apply here. -- https://bugs.ruby-lang.org/

Issue #20415 has been updated by Eregon (Benoit Daloze). FWIW TruffleRuby already does this, since frozen string literals need to be deduplicated, the hash needs to be computed, so might as well save it while doing so (the only downside being footprint). ``` truffleruby 24.0.0, like ruby 3.2.2, Oracle GraalVM Native [x86_64-linux] Calculating ------------------------------------- symbol 107.376M (± 0.7%) i/s (9.31 ns/i) - 541.038M in 5.038971s dyn_symbol 106.989M (± 0.7%) i/s (9.35 ns/i) - 543.771M in 5.082698s small_lit 88.014M (± 0.6%) i/s (11.36 ns/i) - 442.996M in 5.033433s frozen_lit 88.174M (± 0.3%) i/s (11.34 ns/i) - 444.293M in 5.038895s Comparison: symbol: 107376115.9 i/s dyn_symbol: 106989494.3 i/s - same-ish: difference falls within error frozen_lit: 88173794.6 i/s - 1.22x slower small_lit: 88013579.7 i/s - 1.22x slower ``` Strings are still slower than Symbol keys, I suspect because `eql?` is quite a bit more expensive for Strings. Even if two Strings are interned it's not correct to compare them by identity, because they could still be `eql?` with the same bytes but different encodings. That case does not exist for Symbols. ---------------------------------------- Feature #20415: Precompute literal String hash code during compilation https://bugs.ruby-lang.org/issues/20415#change-107860 * Author: byroot (Jean Boussier) * Status: Open ---------------------------------------- I worked on a proof of concept with @etienne which I think has some potential, but I'm looking for feedback on what would be the best implementation. The proof of concept is here: https://github.com/Shopify/ruby/pull/553 ### Idea Most string literals are relatively short, hence embedded, and have some wasted bytes at the end of their slot. We could use that wasted space to store the string hash. The goal being to make **looking up a literal String key in a hash, as fast as a Symbol key**. The goal isn't to memoize the hash code of all strings, but to **only selectively precompute the hash code of literal strings in the compiler**. The compiler could even selectively do this when we literal string is used to lookup a hash (`opt_aref`). Here's the benchmark we used: ```ruby hash = 10.times.to_h do |i| [i, i] end dyn_sym = "dynamic_symbol".to_sym hash[:some_symbol] = 1 hash[dyn_sym] = 1 hash["small"] = 2 hash["frozen_string_literal"] = 2 Benchmark.ips do |x| x.report("symbol") { hash[:some_symbol] } x.report("dyn_symbol") { hash[:some_symbol] } x.report("small_lit") { hash["small"] } x.report("frozen_lit") { hash["frozen_string_literal"] } x.compare!(order: :baseline) end ``` On Ruby 3.3.0, looking up a String key is a bit slower based on the key size: ``` Calculating ------------------------------------- symbol 24.175M (± 1.7%) i/s - 122.002M in 5.048306s dyn_symbol 24.345M (± 1.6%) i/s - 122.019M in 5.013400s small_lit 21.252M (± 2.1%) i/s - 107.744M in 5.072042s frozen_lit 20.095M (± 1.3%) i/s - 100.489M in 5.001681s Comparison: symbol: 24174848.1 i/s dyn_symbol: 24345476.9 i/s - same-ish: difference falls within error small_lit: 21252403.2 i/s - 1.14x slower frozen_lit: 20094766.0 i/s - 1.20x slower ``` With the proof of concept performance is pretty much identical: ``` Calculating ------------------------------------- symbol 23.528M (± 6.9%) i/s - 117.584M in 5.033231s dyn_symbol 23.777M (± 4.7%) i/s - 120.231M in 5.071734s small_lit 23.066M (± 2.9%) i/s - 115.376M in 5.006947s frozen_lit 22.729M (± 1.1%) i/s - 115.693M in 5.090700s Comparison: symbol: 23527823.6 i/s dyn_symbol: 23776757.8 i/s - same-ish: difference falls within error small_lit: 23065535.3 i/s - same-ish: difference falls within error frozen_lit: 22729351.6 i/s - same-ish: difference falls within error ``` ### Possible implementation The reason I'm opening this issue early is to get feedback on which would be the best implementation. #### Store hashcode after the string terminator Right now the proof of concept simply stores the `st_index_t` after the string null terminator, and only when the string is embedded and as enough left over space. Strings with a precomputed hash are marked with an user flag. Pros: - Very simple implementation, no need to change a lot of code, and very easy to strip out if we want to. - Doesn't use any extra memory. If the string doesn't have enough left over bytes, the optimization simply isn't applied. - The worst case overhead is a single `FL_TEST_RAW` in `rb_str_hash`. Cons: - The optimization won't apply to certain string sizes. e.g. strings between `17` and `23` bytes won't have a precomputed hash code. - Extracting the hash code requires some not so nice pointer arithmetic. #### Create another RString union Another possibility would be to add another entry in the `RString` struct union, such as we'd have: ```c struct RString { struct RBasic basic; long len; union { // ... existing members struct { st_index_t hash; char ary[1]; } embded_literal; } as; }; ``` Pros: - The optimization can now be applied to all string sizes. - The hashcode is always at the same offset and properly aligned. Cons: - Some strings would be bumped by one slot size, so would use marginally more memory. - Complexify the code base more, need to modify a lot more string related code (e.g. `RSTRING_PTR` and many others) - When compiling such string, if an equal string already exists in the `fstring` table, we'd need to replace it, we can't just mutate it in place to add the hashcode. ### Prior art [Feature #15331] is somewhat similar in its idea, but it does it lazily for all strings. Here it's much simpler because limited to string literals, which are the ones likely to be used as Hash keys, and the overhead is on compilation, not runtime (aside from a single flag check). So I think most of the caveats of that original implementation don't apply here. -- https://bugs.ruby-lang.org/

Issue #20415 has been updated by byroot (Jean Boussier).
if two Strings are interned it's not correct to compare them by identity, because they could still be eql? with the same bytes but different encodings.
I'm not sure I follow. Surely `x.eql?(x)` can use an identity check as a shortcut. If both refs are equal, you can immediately return true. Of course if refs aren't equal, then yes you need to do a full string comparison. So I'm a bit surprised to see TruffleRuby doesn't have the same performance on that benchmark. ---------------------------------------- Feature #20415: Precompute literal String hash code during compilation https://bugs.ruby-lang.org/issues/20415#change-107861 * Author: byroot (Jean Boussier) * Status: Open ---------------------------------------- I worked on a proof of concept with @etienne which I think has some potential, but I'm looking for feedback on what would be the best implementation. The proof of concept is here: https://github.com/Shopify/ruby/pull/553 ### Idea Most string literals are relatively short, hence embedded, and have some wasted bytes at the end of their slot. We could use that wasted space to store the string hash. The goal being to make **looking up a literal String key in a hash, as fast as a Symbol key**. The goal isn't to memoize the hash code of all strings, but to **only selectively precompute the hash code of literal strings in the compiler**. The compiler could even selectively do this when we literal string is used to lookup a hash (`opt_aref`). Here's the benchmark we used: ```ruby hash = 10.times.to_h do |i| [i, i] end dyn_sym = "dynamic_symbol".to_sym hash[:some_symbol] = 1 hash[dyn_sym] = 1 hash["small"] = 2 hash["frozen_string_literal"] = 2 Benchmark.ips do |x| x.report("symbol") { hash[:some_symbol] } x.report("dyn_symbol") { hash[:some_symbol] } x.report("small_lit") { hash["small"] } x.report("frozen_lit") { hash["frozen_string_literal"] } x.compare!(order: :baseline) end ``` On Ruby 3.3.0, looking up a String key is a bit slower based on the key size: ``` Calculating ------------------------------------- symbol 24.175M (± 1.7%) i/s - 122.002M in 5.048306s dyn_symbol 24.345M (± 1.6%) i/s - 122.019M in 5.013400s small_lit 21.252M (± 2.1%) i/s - 107.744M in 5.072042s frozen_lit 20.095M (± 1.3%) i/s - 100.489M in 5.001681s Comparison: symbol: 24174848.1 i/s dyn_symbol: 24345476.9 i/s - same-ish: difference falls within error small_lit: 21252403.2 i/s - 1.14x slower frozen_lit: 20094766.0 i/s - 1.20x slower ``` With the proof of concept performance is pretty much identical: ``` Calculating ------------------------------------- symbol 23.528M (± 6.9%) i/s - 117.584M in 5.033231s dyn_symbol 23.777M (± 4.7%) i/s - 120.231M in 5.071734s small_lit 23.066M (± 2.9%) i/s - 115.376M in 5.006947s frozen_lit 22.729M (± 1.1%) i/s - 115.693M in 5.090700s Comparison: symbol: 23527823.6 i/s dyn_symbol: 23776757.8 i/s - same-ish: difference falls within error small_lit: 23065535.3 i/s - same-ish: difference falls within error frozen_lit: 22729351.6 i/s - same-ish: difference falls within error ``` ### Possible implementation The reason I'm opening this issue early is to get feedback on which would be the best implementation. #### Store hashcode after the string terminator Right now the proof of concept simply stores the `st_index_t` after the string null terminator, and only when the string is embedded and as enough left over space. Strings with a precomputed hash are marked with an user flag. Pros: - Very simple implementation, no need to change a lot of code, and very easy to strip out if we want to. - Doesn't use any extra memory. If the string doesn't have enough left over bytes, the optimization simply isn't applied. - The worst case overhead is a single `FL_TEST_RAW` in `rb_str_hash`. Cons: - The optimization won't apply to certain string sizes. e.g. strings between `17` and `23` bytes won't have a precomputed hash code. - Extracting the hash code requires some not so nice pointer arithmetic. #### Create another RString union Another possibility would be to add another entry in the `RString` struct union, such as we'd have: ```c struct RString { struct RBasic basic; long len; union { // ... existing members struct { st_index_t hash; char ary[1]; } embded_literal; } as; }; ``` Pros: - The optimization can now be applied to all string sizes. - The hashcode is always at the same offset and properly aligned. Cons: - Some strings would be bumped by one slot size, so would use marginally more memory. - Complexify the code base more, need to modify a lot more string related code (e.g. `RSTRING_PTR` and many others) - When compiling such string, if an equal string already exists in the `fstring` table, we'd need to replace it, we can't just mutate it in place to add the hashcode. ### Prior art [Feature #15331] is somewhat similar in its idea, but it does it lazily for all strings. Here it's much simpler because limited to string literals, which are the ones likely to be used as Hash keys, and the overhead is on compilation, not runtime (aside from a single flag check). So I think most of the caveats of that original implementation don't apply here. -- https://bugs.ruby-lang.org/

Issue #20415 has been updated by Eregon (Benoit Daloze). byroot (Jean Boussier) wrote in #note-2:
I'm not sure I follow. Surely `x.eql?(x)` can use an identity check as a shortcut. If both refs are equal, you can immediately return true. Of course if refs aren't equal, then yes you need to do a full string comparison.
Yes, if they are the same object of course they are eql?, but if they are not the same object they can still be eql?, even if both are interned/fstring/frozen string literals.
So I'm a bit surprised to see TruffleRuby doesn't have the same performance on that benchmark.
For this benchmark, whether `eql?` is called only with equal keys depends on the Hash representation, and whether they are not two keys falling in the same bucket. For this benchmark `eql?` should only be called with equal keys on TruffleRuby (due to using a buckets representation), and not the case on CRuby (due to using an array instead of buckets until 8 pairs IIRC). I'll try to take a look at some compiler graphs to have more definite insights on it. ---------------------------------------- Feature #20415: Precompute literal String hash code during compilation https://bugs.ruby-lang.org/issues/20415#change-107863 * Author: byroot (Jean Boussier) * Status: Open ---------------------------------------- I worked on a proof of concept with @etienne which I think has some potential, but I'm looking for feedback on what would be the best implementation. The proof of concept is here: https://github.com/Shopify/ruby/pull/553 ### Idea Most string literals are relatively short, hence embedded, and have some wasted bytes at the end of their slot. We could use that wasted space to store the string hash. The goal being to make **looking up a literal String key in a hash, as fast as a Symbol key**. The goal isn't to memoize the hash code of all strings, but to **only selectively precompute the hash code of literal strings in the compiler**. The compiler could even selectively do this when we literal string is used to lookup a hash (`opt_aref`). Here's the benchmark we used: ```ruby hash = 10.times.to_h do |i| [i, i] end dyn_sym = "dynamic_symbol".to_sym hash[:some_symbol] = 1 hash[dyn_sym] = 1 hash["small"] = 2 hash["frozen_string_literal"] = 2 Benchmark.ips do |x| x.report("symbol") { hash[:some_symbol] } x.report("dyn_symbol") { hash[:some_symbol] } x.report("small_lit") { hash["small"] } x.report("frozen_lit") { hash["frozen_string_literal"] } x.compare!(order: :baseline) end ``` On Ruby 3.3.0, looking up a String key is a bit slower based on the key size: ``` Calculating ------------------------------------- symbol 24.175M (± 1.7%) i/s - 122.002M in 5.048306s dyn_symbol 24.345M (± 1.6%) i/s - 122.019M in 5.013400s small_lit 21.252M (± 2.1%) i/s - 107.744M in 5.072042s frozen_lit 20.095M (± 1.3%) i/s - 100.489M in 5.001681s Comparison: symbol: 24174848.1 i/s dyn_symbol: 24345476.9 i/s - same-ish: difference falls within error small_lit: 21252403.2 i/s - 1.14x slower frozen_lit: 20094766.0 i/s - 1.20x slower ``` With the proof of concept performance is pretty much identical: ``` Calculating ------------------------------------- symbol 23.528M (± 6.9%) i/s - 117.584M in 5.033231s dyn_symbol 23.777M (± 4.7%) i/s - 120.231M in 5.071734s small_lit 23.066M (± 2.9%) i/s - 115.376M in 5.006947s frozen_lit 22.729M (± 1.1%) i/s - 115.693M in 5.090700s Comparison: symbol: 23527823.6 i/s dyn_symbol: 23776757.8 i/s - same-ish: difference falls within error small_lit: 23065535.3 i/s - same-ish: difference falls within error frozen_lit: 22729351.6 i/s - same-ish: difference falls within error ``` ### Possible implementation The reason I'm opening this issue early is to get feedback on which would be the best implementation. #### Store hashcode after the string terminator Right now the proof of concept simply stores the `st_index_t` after the string null terminator, and only when the string is embedded and as enough left over space. Strings with a precomputed hash are marked with an user flag. Pros: - Very simple implementation, no need to change a lot of code, and very easy to strip out if we want to. - Doesn't use any extra memory. If the string doesn't have enough left over bytes, the optimization simply isn't applied. - The worst case overhead is a single `FL_TEST_RAW` in `rb_str_hash`. Cons: - The optimization won't apply to certain string sizes. e.g. strings between `17` and `23` bytes won't have a precomputed hash code. - Extracting the hash code requires some not so nice pointer arithmetic. #### Create another RString union Another possibility would be to add another entry in the `RString` struct union, such as we'd have: ```c struct RString { struct RBasic basic; long len; union { // ... existing members struct { st_index_t hash; char ary[1]; } embded_literal; } as; }; ``` Pros: - The optimization can now be applied to all string sizes. - The hashcode is always at the same offset and properly aligned. Cons: - Some strings would be bumped by one slot size, so would use marginally more memory. - Complexify the code base more, need to modify a lot more string related code (e.g. `RSTRING_PTR` and many others) - When compiling such string, if an equal string already exists in the `fstring` table, we'd need to replace it, we can't just mutate it in place to add the hashcode. ### Prior art [Feature #15331] is somewhat similar in its idea, but it does it lazily for all strings. Here it's much simpler because limited to string literals, which are the ones likely to be used as Hash keys, and the overhead is on compilation, not runtime (aside from a single flag check). So I think most of the caveats of that original implementation don't apply here. -- https://bugs.ruby-lang.org/

Issue #20415 has been updated by byroot (Jean Boussier).
Yes, if they are the same object of course they are eql?, but if they are not the same object they can still be eql?, even if both are interned/fstring/frozen string literals.
That was my understanding.
due to using an array instead of buckets until 8 pairs IIRC
Yes, that is why my benchmark use a larger Hash. ---------------------------------------- Feature #20415: Precompute literal String hash code during compilation https://bugs.ruby-lang.org/issues/20415#change-107864 * Author: byroot (Jean Boussier) * Status: Open ---------------------------------------- I worked on a proof of concept with @etienne which I think has some potential, but I'm looking for feedback on what would be the best implementation. The proof of concept is here: https://github.com/Shopify/ruby/pull/553 ### Idea Most string literals are relatively short, hence embedded, and have some wasted bytes at the end of their slot. We could use that wasted space to store the string hash. The goal being to make **looking up a literal String key in a hash, as fast as a Symbol key**. The goal isn't to memoize the hash code of all strings, but to **only selectively precompute the hash code of literal strings in the compiler**. The compiler could even selectively do this when we literal string is used to lookup a hash (`opt_aref`). Here's the benchmark we used: ```ruby hash = 10.times.to_h do |i| [i, i] end dyn_sym = "dynamic_symbol".to_sym hash[:some_symbol] = 1 hash[dyn_sym] = 1 hash["small"] = 2 hash["frozen_string_literal"] = 2 Benchmark.ips do |x| x.report("symbol") { hash[:some_symbol] } x.report("dyn_symbol") { hash[:some_symbol] } x.report("small_lit") { hash["small"] } x.report("frozen_lit") { hash["frozen_string_literal"] } x.compare!(order: :baseline) end ``` On Ruby 3.3.0, looking up a String key is a bit slower based on the key size: ``` Calculating ------------------------------------- symbol 24.175M (± 1.7%) i/s - 122.002M in 5.048306s dyn_symbol 24.345M (± 1.6%) i/s - 122.019M in 5.013400s small_lit 21.252M (± 2.1%) i/s - 107.744M in 5.072042s frozen_lit 20.095M (± 1.3%) i/s - 100.489M in 5.001681s Comparison: symbol: 24174848.1 i/s dyn_symbol: 24345476.9 i/s - same-ish: difference falls within error small_lit: 21252403.2 i/s - 1.14x slower frozen_lit: 20094766.0 i/s - 1.20x slower ``` With the proof of concept performance is pretty much identical: ``` Calculating ------------------------------------- symbol 23.528M (± 6.9%) i/s - 117.584M in 5.033231s dyn_symbol 23.777M (± 4.7%) i/s - 120.231M in 5.071734s small_lit 23.066M (± 2.9%) i/s - 115.376M in 5.006947s frozen_lit 22.729M (± 1.1%) i/s - 115.693M in 5.090700s Comparison: symbol: 23527823.6 i/s dyn_symbol: 23776757.8 i/s - same-ish: difference falls within error small_lit: 23065535.3 i/s - same-ish: difference falls within error frozen_lit: 22729351.6 i/s - same-ish: difference falls within error ``` ### Possible implementation The reason I'm opening this issue early is to get feedback on which would be the best implementation. #### Store hashcode after the string terminator Right now the proof of concept simply stores the `st_index_t` after the string null terminator, and only when the string is embedded and as enough left over space. Strings with a precomputed hash are marked with an user flag. Pros: - Very simple implementation, no need to change a lot of code, and very easy to strip out if we want to. - Doesn't use any extra memory. If the string doesn't have enough left over bytes, the optimization simply isn't applied. - The worst case overhead is a single `FL_TEST_RAW` in `rb_str_hash`. Cons: - The optimization won't apply to certain string sizes. e.g. strings between `17` and `23` bytes won't have a precomputed hash code. - Extracting the hash code requires some not so nice pointer arithmetic. #### Create another RString union Another possibility would be to add another entry in the `RString` struct union, such as we'd have: ```c struct RString { struct RBasic basic; long len; union { // ... existing members struct { st_index_t hash; char ary[1]; } embded_literal; } as; }; ``` Pros: - The optimization can now be applied to all string sizes. - The hashcode is always at the same offset and properly aligned. Cons: - Some strings would be bumped by one slot size, so would use marginally more memory. - Complexify the code base more, need to modify a lot more string related code (e.g. `RSTRING_PTR` and many others) - When compiling such string, if an equal string already exists in the `fstring` table, we'd need to replace it, we can't just mutate it in place to add the hashcode. ### Prior art [Feature #15331] is somewhat similar in its idea, but it does it lazily for all strings. Here it's much simpler because limited to string literals, which are the ones likely to be used as Hash keys, and the overhead is on compilation, not runtime (aside from a single flag check). So I think most of the caveats of that original implementation don't apply here. -- https://bugs.ruby-lang.org/

Issue #20415 has been updated by etienne (Étienne Barrié). We pushed a cleaned-up PR at https://github.com/ruby/ruby/pull/10596. We settled on storing the hash code after the terminator as it prevents having to add yet another union in RString that would have a general performance impact and complexify the entire code base. And we decided against storing it at the end of the object slot to avoid having to access the slot size which is slower. ``` compare-ruby: ruby 3.4.0dev (2024-04-22T06:32:21Z main f77618c1fa) [arm64-darwin23] built-ruby: ruby 3.4.0dev (2024-04-22T10:13:03Z interned-string-ha.. 8a1a32331b) [arm64-darwin23] last_commit=Precompute embedded string literals hash code | |compare-ruby|built-ruby| |:-----------|-----------:|---------:| |symbol | 39.275M| 39.753M| | | -| 1.01x| |dyn_symbol | 37.348M| 37.704M| | | -| 1.01x| |small_lit | 29.514M| 33.948M| | | -| 1.15x| |frozen_lit | 27.180M| 33.056M| | | -| 1.22x| |iseq_lit | 27.391M| 32.242M| | | -| 1.18x| ``` ---------------------------------------- Feature #20415: Precompute literal String hash code during compilation https://bugs.ruby-lang.org/issues/20415#change-108053 * Author: byroot (Jean Boussier) * Status: Open ---------------------------------------- I worked on a proof of concept with @etienne which I think has some potential, but I'm looking for feedback on what would be the best implementation. The proof of concept is here: https://github.com/Shopify/ruby/pull/553 ### Idea Most string literals are relatively short, hence embedded, and have some wasted bytes at the end of their slot. We could use that wasted space to store the string hash. The goal being to make **looking up a literal String key in a hash, as fast as a Symbol key**. The goal isn't to memoize the hash code of all strings, but to **only selectively precompute the hash code of literal strings in the compiler**. The compiler could even selectively do this when we literal string is used to lookup a hash (`opt_aref`). Here's the benchmark we used: ```ruby hash = 10.times.to_h do |i| [i, i] end dyn_sym = "dynamic_symbol".to_sym hash[:some_symbol] = 1 hash[dyn_sym] = 1 hash["small"] = 2 hash["frozen_string_literal"] = 2 Benchmark.ips do |x| x.report("symbol") { hash[:some_symbol] } x.report("dyn_symbol") { hash[:some_symbol] } x.report("small_lit") { hash["small"] } x.report("frozen_lit") { hash["frozen_string_literal"] } x.compare!(order: :baseline) end ``` On Ruby 3.3.0, looking up a String key is a bit slower based on the key size: ``` Calculating ------------------------------------- symbol 24.175M (± 1.7%) i/s - 122.002M in 5.048306s dyn_symbol 24.345M (± 1.6%) i/s - 122.019M in 5.013400s small_lit 21.252M (± 2.1%) i/s - 107.744M in 5.072042s frozen_lit 20.095M (± 1.3%) i/s - 100.489M in 5.001681s Comparison: symbol: 24174848.1 i/s dyn_symbol: 24345476.9 i/s - same-ish: difference falls within error small_lit: 21252403.2 i/s - 1.14x slower frozen_lit: 20094766.0 i/s - 1.20x slower ``` With the proof of concept performance is pretty much identical: ``` Calculating ------------------------------------- symbol 23.528M (± 6.9%) i/s - 117.584M in 5.033231s dyn_symbol 23.777M (± 4.7%) i/s - 120.231M in 5.071734s small_lit 23.066M (± 2.9%) i/s - 115.376M in 5.006947s frozen_lit 22.729M (± 1.1%) i/s - 115.693M in 5.090700s Comparison: symbol: 23527823.6 i/s dyn_symbol: 23776757.8 i/s - same-ish: difference falls within error small_lit: 23065535.3 i/s - same-ish: difference falls within error frozen_lit: 22729351.6 i/s - same-ish: difference falls within error ``` ### Possible implementation The reason I'm opening this issue early is to get feedback on which would be the best implementation. #### Store hashcode after the string terminator Right now the proof of concept simply stores the `st_index_t` after the string null terminator, and only when the string is embedded and as enough left over space. Strings with a precomputed hash are marked with an user flag. Pros: - Very simple implementation, no need to change a lot of code, and very easy to strip out if we want to. - Doesn't use any extra memory. If the string doesn't have enough left over bytes, the optimization simply isn't applied. - The worst case overhead is a single `FL_TEST_RAW` in `rb_str_hash`. Cons: - The optimization won't apply to certain string sizes. e.g. strings between `17` and `23` bytes won't have a precomputed hash code. - Extracting the hash code requires some not so nice pointer arithmetic. #### Create another RString union Another possibility would be to add another entry in the `RString` struct union, such as we'd have: ```c struct RString { struct RBasic basic; long len; union { // ... existing members struct { st_index_t hash; char ary[1]; } embded_literal; } as; }; ``` Pros: - The optimization can now be applied to all string sizes. - The hashcode is always at the same offset and properly aligned. Cons: - Some strings would be bumped by one slot size, so would use marginally more memory. - Complexify the code base more, need to modify a lot more string related code (e.g. `RSTRING_PTR` and many others) - When compiling such string, if an equal string already exists in the `fstring` table, we'd need to replace it, we can't just mutate it in place to add the hashcode. ### Prior art [Feature #15331] is somewhat similar in its idea, but it does it lazily for all strings. Here it's much simpler because limited to string literals, which are the ones likely to be used as Hash keys, and the overhead is on compilation, not runtime (aside from a single flag check). So I think most of the caveats of that original implementation don't apply here. -- https://bugs.ruby-lang.org/

Issue #20415 has been updated by shyouhei (Shyouhei Urabe). The benchmark seems great. But I'm not yet sure if this is worth the hustle. Is using a string _literal_ as a hash key very common? It would be much convincing to me if there are any non-micro benchmarks. ---------------------------------------- Feature #20415: Precompute literal String hash code during compilation https://bugs.ruby-lang.org/issues/20415#change-108228 * Author: byroot (Jean Boussier) * Status: Open ---------------------------------------- I worked on a proof of concept with @etienne which I think has some potential, but I'm looking for feedback on what would be the best implementation. The proof of concept is here: https://github.com/Shopify/ruby/pull/553 ### Idea Most string literals are relatively short, hence embedded, and have some wasted bytes at the end of their slot. We could use that wasted space to store the string hash. The goal being to make **looking up a literal String key in a hash, as fast as a Symbol key**. The goal isn't to memoize the hash code of all strings, but to **only selectively precompute the hash code of literal strings in the compiler**. The compiler could even selectively do this when we literal string is used to lookup a hash (`opt_aref`). Here's the benchmark we used: ```ruby hash = 10.times.to_h do |i| [i, i] end dyn_sym = "dynamic_symbol".to_sym hash[:some_symbol] = 1 hash[dyn_sym] = 1 hash["small"] = 2 hash["frozen_string_literal"] = 2 Benchmark.ips do |x| x.report("symbol") { hash[:some_symbol] } x.report("dyn_symbol") { hash[:some_symbol] } x.report("small_lit") { hash["small"] } x.report("frozen_lit") { hash["frozen_string_literal"] } x.compare!(order: :baseline) end ``` On Ruby 3.3.0, looking up a String key is a bit slower based on the key size: ``` Calculating ------------------------------------- symbol 24.175M (± 1.7%) i/s - 122.002M in 5.048306s dyn_symbol 24.345M (± 1.6%) i/s - 122.019M in 5.013400s small_lit 21.252M (± 2.1%) i/s - 107.744M in 5.072042s frozen_lit 20.095M (± 1.3%) i/s - 100.489M in 5.001681s Comparison: symbol: 24174848.1 i/s dyn_symbol: 24345476.9 i/s - same-ish: difference falls within error small_lit: 21252403.2 i/s - 1.14x slower frozen_lit: 20094766.0 i/s - 1.20x slower ``` With the proof of concept performance is pretty much identical: ``` Calculating ------------------------------------- symbol 23.528M (± 6.9%) i/s - 117.584M in 5.033231s dyn_symbol 23.777M (± 4.7%) i/s - 120.231M in 5.071734s small_lit 23.066M (± 2.9%) i/s - 115.376M in 5.006947s frozen_lit 22.729M (± 1.1%) i/s - 115.693M in 5.090700s Comparison: symbol: 23527823.6 i/s dyn_symbol: 23776757.8 i/s - same-ish: difference falls within error small_lit: 23065535.3 i/s - same-ish: difference falls within error frozen_lit: 22729351.6 i/s - same-ish: difference falls within error ``` ### Possible implementation The reason I'm opening this issue early is to get feedback on which would be the best implementation. #### Store hashcode after the string terminator Right now the proof of concept simply stores the `st_index_t` after the string null terminator, and only when the string is embedded and as enough left over space. Strings with a precomputed hash are marked with an user flag. Pros: - Very simple implementation, no need to change a lot of code, and very easy to strip out if we want to. - Doesn't use any extra memory. If the string doesn't have enough left over bytes, the optimization simply isn't applied. - The worst case overhead is a single `FL_TEST_RAW` in `rb_str_hash`. Cons: - The optimization won't apply to certain string sizes. e.g. strings between `17` and `23` bytes won't have a precomputed hash code. - Extracting the hash code requires some not so nice pointer arithmetic. #### Create another RString union Another possibility would be to add another entry in the `RString` struct union, such as we'd have: ```c struct RString { struct RBasic basic; long len; union { // ... existing members struct { st_index_t hash; char ary[1]; } embded_literal; } as; }; ``` Pros: - The optimization can now be applied to all string sizes. - The hashcode is always at the same offset and properly aligned. Cons: - Some strings would be bumped by one slot size, so would use marginally more memory. - Complexify the code base more, need to modify a lot more string related code (e.g. `RSTRING_PTR` and many others) - When compiling such string, if an equal string already exists in the `fstring` table, we'd need to replace it, we can't just mutate it in place to add the hashcode. ### Prior art [Feature #15331] is somewhat similar in its idea, but it does it lazily for all strings. Here it's much simpler because limited to string literals, which are the ones likely to be used as Hash keys, and the overhead is on compilation, not runtime (aside from a single flag check). So I think most of the caveats of that original implementation don't apply here. -- https://bugs.ruby-lang.org/

Issue #20415 has been updated by byroot (Jean Boussier).
Is using a string literal as a hash key very common?
It's quite common in a few places, e.g. Active Record attribute accessors generate code such as: ```ruby # frozen_string_literal: true def title _read_attribute("title") end ``` (NB: this is in part historical because symbols used to be immortal, this could eventually migrate to symbols in the future). Similarly, rack applications use a lot of string literal keys to look into the `env` hash. Then you have all the code that consume JSON APIs of various sort, etc.
It would be much convincing to me if there are any non-micro benchmarks.
Yes, we're working on it, @etienne has some `yjit-bench` results, I'll ask him if they are ready. Overall the difference on larger benchmark isn't huge (but visible IIRC), but I'd argue the patch is self-contained enough that it's worth merging. If the patch was very invasive I'd agree with you, but sometime to get to a 5% improvement you need 10 small `0.5%` improvements combined together, and I think this patch make sense in that context. ---------------------------------------- Feature #20415: Precompute literal String hash code during compilation https://bugs.ruby-lang.org/issues/20415#change-108229 * Author: byroot (Jean Boussier) * Status: Open ---------------------------------------- I worked on a proof of concept with @etienne which I think has some potential, but I'm looking for feedback on what would be the best implementation. The proof of concept is here: https://github.com/Shopify/ruby/pull/553 ### Idea Most string literals are relatively short, hence embedded, and have some wasted bytes at the end of their slot. We could use that wasted space to store the string hash. The goal being to make **looking up a literal String key in a hash, as fast as a Symbol key**. The goal isn't to memoize the hash code of all strings, but to **only selectively precompute the hash code of literal strings in the compiler**. The compiler could even selectively do this when we literal string is used to lookup a hash (`opt_aref`). Here's the benchmark we used: ```ruby hash = 10.times.to_h do |i| [i, i] end dyn_sym = "dynamic_symbol".to_sym hash[:some_symbol] = 1 hash[dyn_sym] = 1 hash["small"] = 2 hash["frozen_string_literal"] = 2 Benchmark.ips do |x| x.report("symbol") { hash[:some_symbol] } x.report("dyn_symbol") { hash[:some_symbol] } x.report("small_lit") { hash["small"] } x.report("frozen_lit") { hash["frozen_string_literal"] } x.compare!(order: :baseline) end ``` On Ruby 3.3.0, looking up a String key is a bit slower based on the key size: ``` Calculating ------------------------------------- symbol 24.175M (± 1.7%) i/s - 122.002M in 5.048306s dyn_symbol 24.345M (± 1.6%) i/s - 122.019M in 5.013400s small_lit 21.252M (± 2.1%) i/s - 107.744M in 5.072042s frozen_lit 20.095M (± 1.3%) i/s - 100.489M in 5.001681s Comparison: symbol: 24174848.1 i/s dyn_symbol: 24345476.9 i/s - same-ish: difference falls within error small_lit: 21252403.2 i/s - 1.14x slower frozen_lit: 20094766.0 i/s - 1.20x slower ``` With the proof of concept performance is pretty much identical: ``` Calculating ------------------------------------- symbol 23.528M (± 6.9%) i/s - 117.584M in 5.033231s dyn_symbol 23.777M (± 4.7%) i/s - 120.231M in 5.071734s small_lit 23.066M (± 2.9%) i/s - 115.376M in 5.006947s frozen_lit 22.729M (± 1.1%) i/s - 115.693M in 5.090700s Comparison: symbol: 23527823.6 i/s dyn_symbol: 23776757.8 i/s - same-ish: difference falls within error small_lit: 23065535.3 i/s - same-ish: difference falls within error frozen_lit: 22729351.6 i/s - same-ish: difference falls within error ``` ### Possible implementation The reason I'm opening this issue early is to get feedback on which would be the best implementation. #### Store hashcode after the string terminator Right now the proof of concept simply stores the `st_index_t` after the string null terminator, and only when the string is embedded and as enough left over space. Strings with a precomputed hash are marked with an user flag. Pros: - Very simple implementation, no need to change a lot of code, and very easy to strip out if we want to. - Doesn't use any extra memory. If the string doesn't have enough left over bytes, the optimization simply isn't applied. - The worst case overhead is a single `FL_TEST_RAW` in `rb_str_hash`. Cons: - The optimization won't apply to certain string sizes. e.g. strings between `17` and `23` bytes won't have a precomputed hash code. - Extracting the hash code requires some not so nice pointer arithmetic. #### Create another RString union Another possibility would be to add another entry in the `RString` struct union, such as we'd have: ```c struct RString { struct RBasic basic; long len; union { // ... existing members struct { st_index_t hash; char ary[1]; } embded_literal; } as; }; ``` Pros: - The optimization can now be applied to all string sizes. - The hashcode is always at the same offset and properly aligned. Cons: - Some strings would be bumped by one slot size, so would use marginally more memory. - Complexify the code base more, need to modify a lot more string related code (e.g. `RSTRING_PTR` and many others) - When compiling such string, if an equal string already exists in the `fstring` table, we'd need to replace it, we can't just mutate it in place to add the hashcode. ### Prior art [Feature #15331] is somewhat similar in its idea, but it does it lazily for all strings. Here it's much simpler because limited to string literals, which are the ones likely to be used as Hash keys, and the overhead is on compilation, not runtime (aside from a single flag check). So I think most of the caveats of that original implementation don't apply here. -- https://bugs.ruby-lang.org/

Issue #20415 has been updated by byroot (Jean Boussier). So there's pretty much only two `yjit-bench` benchmarks on which it makes a visible difference `rack` and `liquid-c`, both use string literal keys: (ran 3 times to account for variance) ``` head: ruby 3.4.0dev (2024-05-09T10:37:25Z master 74c911dfa9) +YJIT [arm64-darwin23] head-str-hashcode: ruby 3.4.0dev (2024-05-09T12:58:37Z interned-string-ha.. 44bc1f4c66) +YJIT [arm64-darwin23] -------- --------- ---------- ---------------------- ---------- ------------------------- ---------------------- bench head (ms) stddev (%) head-str-hashcode (ms) stddev (%) head-str-hashcode 1st itr head/head-str-hashcode liquid-c 19.4 4.1 18.8 1.6 1.11 1.03 rack 12.5 1.2 11.8 1.9 1.05 1.05 -------- --------- ---------- ---------------------- ---------- ------------------------- ---------------------- ------ -------- --------- ---------- ---------------------- ---------- ------------------------- ---------------------- bench head (ms) stddev (%) head-str-hashcode (ms) stddev (%) head-str-hashcode 1st itr head/head-str-hashcode liquid-c 19.4 1.0 19.0 1.7 0.98 1.02 rack 12.6 1.2 11.7 1.4 1.04 1.08 -------- --------- ---------- ---------------------- ---------- ------------------------- ---------------------- ------- -------- --------- ---------- ---------------------- ---------- ------------------------- ---------------------- bench head (ms) stddev (%) head-str-hashcode (ms) stddev (%) head-str-hashcode 1st itr head/head-str-hashcode liquid-c 19.4 1.0 19.0 1.2 1.00 1.02 rack 12.7 13.4 12.0 1.4 1.06 1.05 -------- --------- ---------- ---------------------- ---------- ------------------------- ---------------------- ``` On other benchmarks the difference isn't higher than standard deviation, so not conclusive. ---------------------------------------- Feature #20415: Precompute literal String hash code during compilation https://bugs.ruby-lang.org/issues/20415#change-108231 * Author: byroot (Jean Boussier) * Status: Open ---------------------------------------- I worked on a proof of concept with @etienne which I think has some potential, but I'm looking for feedback on what would be the best implementation. The proof of concept is here: https://github.com/Shopify/ruby/pull/553 ### Idea Most string literals are relatively short, hence embedded, and have some wasted bytes at the end of their slot. We could use that wasted space to store the string hash. The goal being to make **looking up a literal String key in a hash, as fast as a Symbol key**. The goal isn't to memoize the hash code of all strings, but to **only selectively precompute the hash code of literal strings in the compiler**. The compiler could even selectively do this when we literal string is used to lookup a hash (`opt_aref`). Here's the benchmark we used: ```ruby hash = 10.times.to_h do |i| [i, i] end dyn_sym = "dynamic_symbol".to_sym hash[:some_symbol] = 1 hash[dyn_sym] = 1 hash["small"] = 2 hash["frozen_string_literal"] = 2 Benchmark.ips do |x| x.report("symbol") { hash[:some_symbol] } x.report("dyn_symbol") { hash[:some_symbol] } x.report("small_lit") { hash["small"] } x.report("frozen_lit") { hash["frozen_string_literal"] } x.compare!(order: :baseline) end ``` On Ruby 3.3.0, looking up a String key is a bit slower based on the key size: ``` Calculating ------------------------------------- symbol 24.175M (± 1.7%) i/s - 122.002M in 5.048306s dyn_symbol 24.345M (± 1.6%) i/s - 122.019M in 5.013400s small_lit 21.252M (± 2.1%) i/s - 107.744M in 5.072042s frozen_lit 20.095M (± 1.3%) i/s - 100.489M in 5.001681s Comparison: symbol: 24174848.1 i/s dyn_symbol: 24345476.9 i/s - same-ish: difference falls within error small_lit: 21252403.2 i/s - 1.14x slower frozen_lit: 20094766.0 i/s - 1.20x slower ``` With the proof of concept performance is pretty much identical: ``` Calculating ------------------------------------- symbol 23.528M (± 6.9%) i/s - 117.584M in 5.033231s dyn_symbol 23.777M (± 4.7%) i/s - 120.231M in 5.071734s small_lit 23.066M (± 2.9%) i/s - 115.376M in 5.006947s frozen_lit 22.729M (± 1.1%) i/s - 115.693M in 5.090700s Comparison: symbol: 23527823.6 i/s dyn_symbol: 23776757.8 i/s - same-ish: difference falls within error small_lit: 23065535.3 i/s - same-ish: difference falls within error frozen_lit: 22729351.6 i/s - same-ish: difference falls within error ``` ### Possible implementation The reason I'm opening this issue early is to get feedback on which would be the best implementation. #### Store hashcode after the string terminator Right now the proof of concept simply stores the `st_index_t` after the string null terminator, and only when the string is embedded and as enough left over space. Strings with a precomputed hash are marked with an user flag. Pros: - Very simple implementation, no need to change a lot of code, and very easy to strip out if we want to. - Doesn't use any extra memory. If the string doesn't have enough left over bytes, the optimization simply isn't applied. - The worst case overhead is a single `FL_TEST_RAW` in `rb_str_hash`. Cons: - The optimization won't apply to certain string sizes. e.g. strings between `17` and `23` bytes won't have a precomputed hash code. - Extracting the hash code requires some not so nice pointer arithmetic. #### Create another RString union Another possibility would be to add another entry in the `RString` struct union, such as we'd have: ```c struct RString { struct RBasic basic; long len; union { // ... existing members struct { st_index_t hash; char ary[1]; } embded_literal; } as; }; ``` Pros: - The optimization can now be applied to all string sizes. - The hashcode is always at the same offset and properly aligned. Cons: - Some strings would be bumped by one slot size, so would use marginally more memory. - Complexify the code base more, need to modify a lot more string related code (e.g. `RSTRING_PTR` and many others) - When compiling such string, if an equal string already exists in the `fstring` table, we'd need to replace it, we can't just mutate it in place to add the hashcode. ### Prior art [Feature #15331] is somewhat similar in its idea, but it does it lazily for all strings. Here it's much simpler because limited to string literals, which are the ones likely to be used as Hash keys, and the overhead is on compilation, not runtime (aside from a single flag check). So I think most of the caveats of that original implementation don't apply here. -- https://bugs.ruby-lang.org/

Issue #20415 has been updated by byroot (Jean Boussier). Status changed from Open to Closed This was discussed and accepted at the last developer meeting, so I merged it as `9e9f1d9301b05604d475573ddd18d6bf5185466c`. ---------------------------------------- Feature #20415: Precompute literal String hash code during compilation https://bugs.ruby-lang.org/issues/20415#change-108456 * Author: byroot (Jean Boussier) * Status: Closed ---------------------------------------- I worked on a proof of concept with @etienne which I think has some potential, but I'm looking for feedback on what would be the best implementation. The proof of concept is here: https://github.com/Shopify/ruby/pull/553 ### Idea Most string literals are relatively short, hence embedded, and have some wasted bytes at the end of their slot. We could use that wasted space to store the string hash. The goal being to make **looking up a literal String key in a hash, as fast as a Symbol key**. The goal isn't to memoize the hash code of all strings, but to **only selectively precompute the hash code of literal strings in the compiler**. The compiler could even selectively do this when we literal string is used to lookup a hash (`opt_aref`). Here's the benchmark we used: ```ruby hash = 10.times.to_h do |i| [i, i] end dyn_sym = "dynamic_symbol".to_sym hash[:some_symbol] = 1 hash[dyn_sym] = 1 hash["small"] = 2 hash["frozen_string_literal"] = 2 Benchmark.ips do |x| x.report("symbol") { hash[:some_symbol] } x.report("dyn_symbol") { hash[:some_symbol] } x.report("small_lit") { hash["small"] } x.report("frozen_lit") { hash["frozen_string_literal"] } x.compare!(order: :baseline) end ``` On Ruby 3.3.0, looking up a String key is a bit slower based on the key size: ``` Calculating ------------------------------------- symbol 24.175M (± 1.7%) i/s - 122.002M in 5.048306s dyn_symbol 24.345M (± 1.6%) i/s - 122.019M in 5.013400s small_lit 21.252M (± 2.1%) i/s - 107.744M in 5.072042s frozen_lit 20.095M (± 1.3%) i/s - 100.489M in 5.001681s Comparison: symbol: 24174848.1 i/s dyn_symbol: 24345476.9 i/s - same-ish: difference falls within error small_lit: 21252403.2 i/s - 1.14x slower frozen_lit: 20094766.0 i/s - 1.20x slower ``` With the proof of concept performance is pretty much identical: ``` Calculating ------------------------------------- symbol 23.528M (± 6.9%) i/s - 117.584M in 5.033231s dyn_symbol 23.777M (± 4.7%) i/s - 120.231M in 5.071734s small_lit 23.066M (± 2.9%) i/s - 115.376M in 5.006947s frozen_lit 22.729M (± 1.1%) i/s - 115.693M in 5.090700s Comparison: symbol: 23527823.6 i/s dyn_symbol: 23776757.8 i/s - same-ish: difference falls within error small_lit: 23065535.3 i/s - same-ish: difference falls within error frozen_lit: 22729351.6 i/s - same-ish: difference falls within error ``` ### Possible implementation The reason I'm opening this issue early is to get feedback on which would be the best implementation. #### Store hashcode after the string terminator Right now the proof of concept simply stores the `st_index_t` after the string null terminator, and only when the string is embedded and as enough left over space. Strings with a precomputed hash are marked with an user flag. Pros: - Very simple implementation, no need to change a lot of code, and very easy to strip out if we want to. - Doesn't use any extra memory. If the string doesn't have enough left over bytes, the optimization simply isn't applied. - The worst case overhead is a single `FL_TEST_RAW` in `rb_str_hash`. Cons: - The optimization won't apply to certain string sizes. e.g. strings between `17` and `23` bytes won't have a precomputed hash code. - Extracting the hash code requires some not so nice pointer arithmetic. #### Create another RString union Another possibility would be to add another entry in the `RString` struct union, such as we'd have: ```c struct RString { struct RBasic basic; long len; union { // ... existing members struct { st_index_t hash; char ary[1]; } embded_literal; } as; }; ``` Pros: - The optimization can now be applied to all string sizes. - The hashcode is always at the same offset and properly aligned. Cons: - Some strings would be bumped by one slot size, so would use marginally more memory. - Complexify the code base more, need to modify a lot more string related code (e.g. `RSTRING_PTR` and many others) - When compiling such string, if an equal string already exists in the `fstring` table, we'd need to replace it, we can't just mutate it in place to add the hashcode. ### Prior art [Feature #15331] is somewhat similar in its idea, but it does it lazily for all strings. Here it's much simpler because limited to string literals, which are the ones likely to be used as Hash keys, and the overhead is on compilation, not runtime (aside from a single flag check). So I think most of the caveats of that original implementation don't apply here. -- https://bugs.ruby-lang.org/
participants (4)
-
byroot (Jean Boussier)
-
Eregon (Benoit Daloze)
-
etienne
-
shyouhei (Shyouhei Urabe)