
Issue #19875 has been updated by Freaky (Thomas Hurst).

nobu (Nobuyoshi Nakada) wrote in #note-7:

> These are all generated by the same compiler?
Yes - FreeBSD clang version 14.0.5. I also tried 15.0.7, to no effect.

gcc 12.2.0 performs much better, though it too shows a regression from #4001, albeit a more modest one:

```
'./ruby.gcc12.revert test.rb' ran
    1.28 ± 0.01 times faster than './ruby.clang.revert test.rb'
    1.61 ± 0.01 times faster than './ruby.gcc12.master test.rb'
    4.38 ± 0.04 times faster than './ruby.clang.master test.rb'
```

The obvious solution works for both:

```
diff --git string.c string.c
index deeed4a12a..70644b2338 100644
--- string.c
+++ string.c
@@ -8455,6 +8455,13 @@ rb_str_count(int argc, VALUE *argv, VALUE str)
             s = RSTRING_PTR(str);
             if (!s || RSTRING_LEN(str) == 0) return INT2FIX(0);
             send = RSTRING_END(str);
+            if (RSTRING_LEN(str) < INT_MAX) {
+                i = 0;
+                while (s < send) {
+                    if (*(unsigned char*)s++ == c) i++;
+                }
+                return INT2NUM(i);
+            }
             while (s < send) {
                 if (*(unsigned char*)s++ == c) n++;
             }
```

```
'./ruby.gcc12.fix test.rb' ran
    1.04 ± 0.01 times faster than './ruby.gcc12.revert test.rb'
    1.32 ± 0.01 times faster than './ruby.clang.fix test.rb'
    1.33 ± 0.02 times faster than './ruby.revert test.rb'
    1.68 ± 0.01 times faster than './ruby.gcc12.master test.rb'
    4.57 ± 0.03 times faster than './ruby.master test.rb'
```

Oh. And you know what this reminds me of? That [one time](https://github.com/Freaky/fast-bytecount) I ported Rust's bytecount crate to C.

```
'./ruby.bytecount test.rb' ran
    3.73 ± 0.08 times faster than './ruby.gcc12.fix test.rb'
    3.87 ± 0.08 times faster than './ruby.gcc12.revert test.rb'
    4.94 ± 0.10 times faster than './ruby.clang.fix test.rb'
    4.96 ± 0.11 times faster than './ruby.revert test.rb'
    6.27 ± 0.13 times faster than './ruby.gcc12.master test.rb'
   17.05 ± 0.36 times faster than './ruby.master test.rb'
```

Who needs a compiler to vectorise for you when you can just ... copy what someone smarter than you did.
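
For anyone who wants to experiment with the fast-path idea outside of Ruby, it can be sketched as a standalone function. This is only an illustration under assumptions: `count_byte` is a hypothetical name that does not exist in `string.c`, and the counter types (`int` for short inputs, `long` otherwise) are a guess at what makes the difference, not something confirmed against the Ruby source.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical standalone helper, NOT part of Ruby's string.c.
 * Counts how many bytes in s[0..len) are equal to c. */
static long
count_byte(const char *s, size_t len, unsigned char c)
{
    const char *send = s + len;

    if (len < (size_t)INT_MAX) {
        /* Assumption: a narrow counter cannot overflow for inputs this
         * short, and may be easier for the compiler to optimise. */
        int i = 0;
        while (s < send) {
            if (*(const unsigned char *)s++ == c) i++;
        }
        return i;
    }

    /* Fallback with a wide counter for very long inputs. */
    long n = 0;
    while (s < send) {
        if (*(const unsigned char *)s++ == c) n++;
    }
    return n;
}
```

Calling `count_byte("\nruby\n", 6, '\n')` counts the two newlines in the benchmark's base string. Whether the narrow path actually helps depends on the compiler and flags, so it is worth measuring both branches separately.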

----------------------------------------
Bug #19875: Ruby 3.0 -> 3.1 Performance regression in String#count
https://bugs.ruby-lang.org/issues/19875#change-104621

* Author: iz (Illia Zub)
* Status: Feedback
* Priority: Normal
* ruby -v: 3.2.2
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
`String#count` has been slower since Ruby 3.1. Originally found by `@Freaky`: https://github.com/ruby/ruby/pull/4001#issuecomment-1714779781

Compared using the [`benchmark-driver` gem](https://github.com/benchmark-driver/benchmark-driver).

```
$ benchmark-driver tmp/string_count_benchmark_driver.yml --rbenv '3.1.1;3.1.4;2.7.2;3.2.2;3.0.6'
Calculating -------------------------------------
                          3.1.1       3.1.4       2.7.2       3.2.2       3.0.6
               count    465.804     463.741     865.783     462.711     857.395 i/s -     10.000k times in 21.468251s 21.563768s 11.550239s 21.611783s 11.663235s

Comparison:
                            count
               2.7.2:       865.8 i/s
               3.0.6:       857.4 i/s - 1.01x  slower
               3.1.1:       465.8 i/s - 1.86x  slower
               3.1.4:       463.7 i/s - 1.87x  slower
               3.2.2:       462.7 i/s - 1.87x  slower
```

Benchmark:

```yml
$ cat ./tmp/string_count_benchmark_driver.yml
loop_count: 10_000
prelude: |
  html = "\nruby\n" * 1024 * 1024
benchmark:
  count: html.count($/)
```

---

*Initially, I noticed the difference between `str.count($/)` and `str.lines.size` when working on the performance improvement: https://serpapi.com/blog/lines-count-failed-deployments/*

---Files--------------------------------
rb_str_len.fast (31.9 KB)
rb_str_len.slow (34 KB)
revert-4001.patch (1.71 KB)
rb_str_count.S (11.8 KB)

--
https://bugs.ruby-lang.org/