Issue #19288 has been updated by maciej.mensfeld (Maciej Mensfeld).
I would like to revisit our discussion about using Ruby Ractors for parallel JSON parsing.
This thread has not seen any activity for a long time.
At the recent RubyKaigi conference, Koichi Sasada highlighted the need for real-life,
commercial use cases that showcase Ractors' potential. I have exactly such a use case:
Karafka parses thousands (or more) of JSON payloads in parallel. Ractor support in this
context could substantially improve performance, which would be a tangible benefit to
end users.
Given this real-life use case, are there any updates or plans to continue the work on
making Ractors faster in the scenario described in this issue? It would be invaluable for
many users working with Kafka in Ruby. While end-user processing of the data will still
have to happen in a single Ractor, parsing seems like a great fit: the immutable raw
payload can be shipped to independent Ractors, and the frozen deserialized payloads can be
shipped back.
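To make the intended flow concrete, here is a minimal sketch of the pattern I have in mind. It is not how Karafka works today: `parse_in_ractors` is a hypothetical helper name, and the sketch assumes `JSON.parse` is usable from a non-main Ractor (as it is in the benchmark in this issue). Each batch of raw strings is made shareable so it is passed by reference instead of being copied, and each Ractor returns a deeply frozen result.

```ruby
require 'json'

# Hypothetical helper (not part of Karafka or the json gem): parses each
# batch of raw JSON strings in its own Ractor and returns the frozen results
# in the same order as the input batches.
def parse_in_ractors(batches)
  ractors = batches.map do |batch|
    # Deep-freezes the batch in place; shareable objects are passed to the
    # Ractor by reference rather than deep-copied.
    shareable = Ractor.make_shareable(batch)

    Ractor.new(shareable) do |raw|
      parsed = raw.map { |payload| JSON.parse(payload) }
      # Deep-freeze the deserialized payloads before handing them back.
      Ractor.make_shareable(parsed)
    end
  end

  # Ractor#take exists on Ruby 3.x; newer Rubies expose Ractor#value instead.
  ractors.map { |r| r.respond_to?(:value) ? r.value : r.take }
end

result = parse_in_ractors([['{"a":1}'], ['{"b":[2,3]}']])
p result
```

The single consumer Ractor would then only iterate over already parsed, frozen hashes, so no further copying should be needed on the way back.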
----------------------------------------
Bug #19288: Ractor JSON parsing significantly slower than linear parsing
https://bugs.ruby-lang.org/issues/19288#change-104792
* Author: maciej.mensfeld (Maciej Mensfeld)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.2.0 (2022-12-25 revision a528908271) [x86_64-linux]
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
A simple benchmark:
```ruby
require 'json'
require 'benchmark'

CONCURRENT = 5
RACTORS = true
ELEMENTS = 100_000

data = CONCURRENT.times.map do
  ELEMENTS.times.map do
    {
      rand => rand,
      rand => rand,
      rand => rand,
      rand => rand
    }.to_json
  end
end

ractors = CONCURRENT.times.map do
  Ractor.new do
    Ractor.receive.each { JSON.parse(_1) }
  end
end

result = Benchmark.measure do
  if RACTORS
    CONCURRENT.times do |i|
      ractors[i].send(data[i], move: false)
    end

    ractors.each(&:take)
  else
    # Linear without any threads
    data.each do |piece|
      piece.each { JSON.parse(_1) }
    end
  end
end

puts result
```
This gives the following results on my 8-core machine:
```shell
# without ractors:
2.731748 0.003993 2.735741 ( 2.736349)
# with ractors
12.580452 5.089802 17.670254 ( 5.209755)
```
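I would not expect Ractors to be about two times slower in wall-clock time (5.21 s vs 2.74 s), while using roughly 6.5 times the total CPU time (17.67 s vs 2.74 s), on CPU-intensive work.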
--
https://bugs.ruby-lang.org/