
Issue #20489 has been updated by tenderlovemaking (Aaron Patterson).


nekoyama32767 (Jinsong Yu) wrote in #note-10:

> tenderlovemaking (Aaron Patterson) wrote in #note-7:
> > The regression in Ruby 3.3 came from [this commit](https://github.com/ruby/ruby/pull/8064). It seems like `rb_vm_insns_count` is still in the master branch, so it could be impacting speed on 3.4 / 3.5, but I'm not sure. 3.4 is faster than 3.3, but not as fast as 3.2.
>
> Actually, on an x86_64 Linux virtual machine, 3.4 is ~14x slower. But I don't know why 3.4 became faster than 3.3 on arm64.

Yes, I'm seeing the same thing on my x86 machine. I tried [changing the counter to a thread local](https://github.com/ruby/ruby/compare/master...tenderlove:ruby:tl?expand=1), and that seems to improve speed on my x86.

Here is before:

```
$ time ./miniruby -v ../test.rb 8 8
ruby 3.5.0dev (2025-01-08T20:42:35Z master 96f23306f0) +PRISM [x86_64-linux]
[0...1, 1...2, 2...3, 3...4, 4...5, 5...6, 6...7, 7...8]
../test.rb:43: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues.

________________________________________________________
Executed in   25.30 secs    fish           external
   usr time  152.10 secs  265.00 micros  152.10 secs
   sys time    0.01 secs  226.00 micros    0.01 secs
```

Here is after:

```
$ time ./miniruby -v ../test.rb 8 8
ruby 3.5.0dev (2025-01-09T19:03:26Z tl 5bfb7753e3) +PRISM [x86_64-linux]
last_commit=move insn counting to a thread local
[0...1, 1...2, 2...3, 3...4, 4...5, 5...6, 6...7, 7...8]
../test.rb:43: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues.

________________________________________________________
Executed in    4.70 secs    fish           external
   usr time   19.08 secs  252.00 micros   19.08 secs
   sys time    0.00 secs  210.00 micros    0.00 secs
```

Unfortunately, it's still not as fast as `tarai(1, 1)`:

```
$ time ./miniruby -v ../test.rb 1 1
ruby 3.5.0dev (2025-01-09T19:03:26Z tl 5bfb7753e3) +PRISM [x86_64-linux]
last_commit=move insn counting to a thread local
[0...1]
../test.rb:43: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues.

________________________________________________________
Executed in    2.30 secs    fish           external
   usr time    2.30 secs    8.34 millis    2.30 secs
   sys time    0.01 secs    4.17 millis    0.00 secs
```

This counter is for YJIT statistics, but from what I understand, YJIT is basically disabled when Ractors are enabled.
To me, this means that putting the counter in a thread local is an acceptable change. I will clean up the patch I made and then ask other people on the YJIT team about it.

----------------------------------------
Bug #20489: Ractor behavior strange in ruby master
https://bugs.ruby-lang.org/issues/20489#change-111416

* Author: nekoyama32767 (Jinsong Yu)
* Status: Assigned
* Assignee: ko1 (Koichi Sasada)
* ruby -v: ruby 3.4.0dev (2024-05-14T01:58:31Z master 9d01f657b3) [x86_64-linux]
* Backport: 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
This is a tarai program.
Running `./ruby tarai_ractor.rb 2 8` uses 2 threads to run the tarai function 8 times in total, which means 4 tarai calls for each ractor (thread).

```
GC.disable
def split_len(len, split)
    ret = []
    mod = len % split
    head = 0
    tail = 0
    split.times do |i|
        if head >= len
            break
        end
        k = 0
        if i < mod then
            k = 1
        end
        tail = tail + (len/split) + k
        ret.append(head...tail)
        head = tail
    end
    return ret
end

def ary_split(ary, split)
    return split_len(ary.length,split)
end

def item_check(item)
    if item[0] != nil
        1 + item_check(item[0]) + item_check(item[1])
    else
        1
    end
end

def tarai(x, y, z) =
  x <= y ? y : tarai(tarai(x-1, y, z),
                     tarai(y-1, z, x),
                     tarai(z-1, x, y))

times = ARGV[0].to_i
split = ARGV[1].to_i
p split_len(times, split)
split_len(times, split).each.map do |sp|
    Ractor.new (sp) {
        s = _1
        s.each do
            tarai(13, 7, 0)
        end
    }
end.each(&:take)
```

In Ruby 3.1.2 and Ruby 3.3, `./ruby tarai_ractor.rb 1 1` has a similar execution time to `./ruby tarai_ractor.rb 8 8`, because each thread runs the tarai function only once, as follows:

ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x86_64-linux]:

```
time ruby exp_ractor_tarai.rb 1 1
[0...1]
<internal:ractor>:267: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues.

real    0m1.442s
user    0m1.429s
sys     0m0.014s
time ruby exp_ractor_tarai.rb 8 8
[0...1, 1...2, 2...3, 3...4, 4...5, 5...6, 6...7, 7...8]
<internal:ractor>:267: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues.

real    0m1.857s
user    0m13.817s
sys     0m0.041s
```

But in ruby master (ruby 3.4.0dev):

ruby 3.4.0dev (2024-05-14T01:58:31Z master 9d01f657b3) [x86_64-linux]

1 ractor 1 tarai:

```
time ../ruby exp_ractor_tarai.rb 1 1
`RubyGems' were not loaded.
`error_highlight' was not loaded.
`did_you_mean' was not loaded.
`syntax_suggest' was not loaded.
[0...1]
exp_ractor_tarai.rb:47: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues.

real    0m1.671s
user    0m1.666s
sys     0m0.005s
```

8 ractor 8 tarai:

```
time ../ruby exp_ractor_tarai.rb 8 8
`RubyGems' were not loaded.
`error_highlight' was not loaded.
`did_you_mean' was not loaded.
`syntax_suggest' was not loaded.
[0...1, 1...2, 2...3, 3...4, 4...5, 5...6, 6...7, 7...8]
exp_ractor_tarai.rb:47: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues.

real    0m18.408s
user    1m58.659s
sys     0m0.021s
```

Also, when running `time ../ruby exp_ractor_tarai.rb 16 16` on ruby 3.4.0dev, the system monitor should show 16 threads in use, but only 8 threads are used.
Ruby 3.3 and Ruby 3.1.2 do not have this problem.

---Files--------------------------------
thead16_16.png (168 KB)
thread16_8.png (165 KB)
Screenshot 2025-01-08 at 4.02.23 PM.png (126 KB)


-- 
https://bugs.ruby-lang.org/
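
For readers following the reproduction script in the issue description: the `split_len` helper decides how many ractors are created and how many `tarai` calls each one makes. The sketch below restates that helper in a slightly condensed form (the condensing is an editorial assumption and is intended to be behaviorally equivalent to the pasted version, not the author's exact text) and prints the ranges for a few argument pairs. In the script as pasted, `ARGV[0]` is passed as `len` (total number of calls) and `ARGV[1]` as `split` (number of ractors).

```ruby
# Condensed restatement of split_len from the pasted script (illustrative).
# len   = total number of tarai calls   (ARGV[0] in the pasted script)
# split = number of ractors to spread them over (ARGV[1] in the pasted script)
def split_len(len, split)
  ret = []
  mod = len % split          # the first `mod` ranges get one extra item
  head = 0
  tail = 0
  split.times do |i|
    break if head >= len     # nothing left to hand out
    k = i < mod ? 1 : 0
    tail = tail + (len / split) + k
    ret.append(head...tail)  # one range per ractor
    head = tail
  end
  ret
end

p split_len(1, 1)   # => [0...1]
p split_len(8, 8)   # => [0...1, 1...2, 2...3, 3...4, 4...5, 5...6, 6...7, 7...8]
p split_len(8, 2)   # => [0...4, 4...8]
p split_len(10, 3)  # => [0...4, 4...7, 7...10]
```

Each range becomes one `Ractor.new` call, and the ractor runs `tarai(13, 7, 0)` once per element of its range, so the `1 1` and `8 8` runs both do exactly one `tarai` call per ractor. That is why their wall-clock times would be expected to be close when the ractors run in parallel without contention.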