Issue #21833 has been updated by byroot (Jean Boussier).
> Has there been any consideration switching to some other hash implementation?
There have been a few in the past, e.g. [Feature #16851].
> Admittedly, I'm not a hash expert nor a cryptographer. There doesn't seem to be any known vulnerabilities with XXH3 that I have found.
Well, the main concern is HashDOS, but looking at your branch, it seems you seed the hash function, so it's fine on that front.

----------------------------------------
Misc #21833: Switch default hash from SipHash13 to XXH3?
https://bugs.ruby-lang.org/issues/21833#change-116031

* Author: samyron (Scott Myron)
* Status: Open
----------------------------------------
Has there been any consideration switching to some other hash implementation? I've searched through the issues and haven't found anything related to switching the default hash from SipHash13 to anything else.

I created a [branch](https://github.com/ruby/ruby/compare/master...samyron:ruby:sm/xxh3) which switched `rb_memhash` from SipHash13 to [XXH3](https://github.com/Cyan4973/xxHash).

I created a few simple benchmarks and ran them on my M1 Macbook Air. The results are very promising.

```
% cat ~/string_hash.yml
prelude: |
  # Generate sets of short vs medium strings
  TINY_STRINGS = Array.new(100) { Array.new(3).map { (97 + rand(26)).chr }.join }.freeze
  SMALL_STRINGS = Array.new(100) { Array.new(8).map { (97 + rand(26)).chr }.join }.freeze
  MED_STRINGS = Array.new(100) { Array.new(20).map { (97 + rand(26)).chr }.join }.freeze
  LARGE_STRINGS = Array.new(100) { Array.new(200).map { (97 + rand(26)).chr }.join }.freeze
  HUGE_STRINGS = Array.new(100) { Array.new(65536).map { (97 + rand(26)).chr }.join }.freeze

benchmark:
  tiny_strings: |
    TINY_STRINGS.each { |s| s.hash }
  small_strings: |
    SMALL_STRINGS.each { |s| s.hash }
  medium_strings: |
    MED_STRINGS.each { |s| s.hash }
  large_strings: |
    LARGE_STRINGS.each { |s| s.hash }
  huge_strings: |
    HUGE_STRINGS.each { |s| s.hash }

% benchmark-driver ~/string_hash.yml \
  -e ruby-master::~/.rubies/ruby-master/bin/ruby \
  -e ruby-xxhash::~/.rubies/ruby-xxhash/bin/ruby \
  --output compare
Warming up --------------------------------------
        tiny_strings   262.513k i/s -    283.844k times in 1.081258s (3.81μs/i)
       small_strings   259.803k i/s -    280.445k times in 1.079454s (3.85μs/i)
      medium_strings   249.553k i/s -    267.531k times in 1.072041s (4.01μs/i)
       large_strings   116.426k i/s -    126.005k times in 1.082275s (8.59μs/i)
        huge_strings    498.481 i/s -     500.000 times in 1.003047s (2.01ms/i)
Calculating -------------------------------------
                     ruby-master  ruby-xxhash
        tiny_strings    264.070k     288.960k i/s -    787.538k times in 2.982305s 2.725421s
       small_strings    259.941k     286.229k i/s -    779.407k times in 2.998394s 2.723019s
      medium_strings    249.249k     283.952k i/s -    748.658k times in 3.003655s 2.636561s
       large_strings    116.572k     240.823k i/s -    349.278k times in 2.996244s 1.450351s
        huge_strings     500.164       5.296k i/s -      1.495k times in 2.989019s 0.282263s

Comparison:
                     tiny_strings
         ruby-xxhash:    288960.1 i/s
         ruby-master:    264070.2 i/s - 1.09x  slower

                    small_strings
         ruby-xxhash:    286229.0 i/s
         ruby-master:    259941.5 i/s - 1.10x  slower

                   medium_strings
         ruby-xxhash:    283952.5 i/s
         ruby-master:    249249.0 i/s - 1.14x  slower

                    large_strings
         ruby-xxhash:    240823.1 i/s
         ruby-master:    116571.9 i/s - 2.07x  slower

                     huge_strings
         ruby-xxhash:      5296.5 i/s
         ruby-master:       500.2 i/s - 10.59x  slower
```

Running something a bit more real-world:

```
% cat ~/json_parse.yml
prelude: |
  require 'json'
  activitypub_json_txt = File.read("/Users/scott/Development/json/benchmark/data/activitypub.json")
  twitter_json_txt = File.read("/Users/scott/Development/json/benchmark/data/twitter.json")
  citm_catalog_json_txt = File.read("/Users/scott/Development/json/benchmark/data/citm_catalog.json")
  ohai_json_txt = File.read("/Users/scott/Development/json/benchmark/data/ohai.json")

benchmark:
  parse_activitypub_json: |
    JSON.parse(activitypub_json_txt)
  parse_twitter_json_txt: |
    JSON.parse(twitter_json_txt)
  parse_citm_catalog_json_txt: |
    JSON.parse(citm_catalog_json_txt)
  parse_ohai_json_txt: |
    JSON.parse(ohai_json_txt)

% benchmark-driver ~/json_parse.yml \
  -e ruby-master::~/.rubies/ruby-master/bin/ruby \
  -e ruby-xxhash::~/.rubies/ruby-xxhash/bin/ruby \
  --output compare
Warming up --------------------------------------
     parse_activitypub_json    10.969k i/s -     12.023k times in 1.096043s (91.16μs/i)
     parse_twitter_json_txt     1.169k i/s -      1.265k times in 1.082330s (855.60μs/i)
parse_citm_catalog_json_txt    591.782 i/s -     600.000 times in 1.013887s (1.69ms/i)
        parse_ohai_json_txt    12.000k i/s -     12.782k times in 1.065168s (83.33μs/i)
Calculating -------------------------------------
                            ruby-master  ruby-xxhash
     parse_activitypub_json     10.986k      11.071k i/s -     32.908k times in 2.995440s 2.972542s
     parse_twitter_json_txt      1.162k       1.172k i/s -      3.506k times in 3.016331s 2.991486s
parse_citm_catalog_json_txt     588.758      601.926 i/s -      1.775k times in 3.014820s 2.948868s
        parse_ohai_json_txt     10.747k      12.400k i/s -     35.999k times in 3.349753s 2.903138s

Comparison:
                  parse_activitypub_json
                ruby-xxhash:     11070.7 i/s
                ruby-master:     10986.0 i/s - 1.01x  slower

                  parse_twitter_json_txt
                ruby-xxhash:      1172.0 i/s
                ruby-master:      1162.3 i/s - 1.01x  slower

             parse_citm_catalog_json_txt
                ruby-xxhash:       601.9 i/s
                ruby-master:       588.8 i/s - 1.02x  slower

                     parse_ohai_json_txt
                ruby-xxhash:     12400.0 i/s
                ruby-master:     10746.8 i/s - 1.15x  slower
```

Admittedly, I'm not a hash expert nor a cryptographer. There doesn't seem to be any known vulnerabilities with XXH3 that I have found.

--
https://bugs.ruby-lang.org/
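
On the HashDoS point raised in the reply: one rough way to sanity-check that a build still seeds its string hash is to compare `String#hash` for the same input across two fresh processes. The sketch below is only an illustration; the helper name `hash_in_new_process` is made up for this example, it observes the effect of seeding from the outside rather than inspecting `rb_memhash`, and it says nothing about the quality of the hash function itself.

```ruby
require "rbconfig"

# Hypothetical helper: spawn a fresh interpreter (the same binary that is
# running this script) and return the String#hash it computes for `str`.
def hash_in_new_process(str)
  IO.popen([RbConfig.ruby, "-e", "print ARGV[0].hash", str], &:read).to_i
end

first  = hash_in_new_process("ruby")
second = hash_in_new_process("ruby")

# Within one process, equal strings must hash equally.
stable_in_process = "ruby".hash == "ruby".dup.hash

# Across processes the values are expected to differ, because each process
# picks its own random seed at startup.
differs_across_processes = first != second

puts "stable in-process:        #{stable_in_process}"
puts "differs across processes: #{differs_across_processes}"
```

Pointing `RbConfig.ruby` at each of the two builds from the benchmarks (by running the script under each one) would be one way to confirm both behave the same in this respect.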