Issue #19288 has been updated by maciej.mensfeld (Maciej Mensfeld).
I find this issue important: if it were mitigated, I could release
production-grade functionality that would benefit users of the Ruby language.
I run an OSS project called Karafka (https://github.com/karafka/karafka) that allows for
processing Kafka messages using multiple threads in parallel. For non-IO-bound cases, the
users whose use cases I know spend the majority of their time (> 80%) on data
deserialization. JSON is by far the most popular format, and it is conveniently supported
natively by Ruby. While providing true parallelism around the whole processing may not be
easy due to the amount of synchronization involved, the atomicity of message
deserialization makes it an ideal use case for Ractors.
- Data can be sent there, and results can be transferred without interdependencies.
- Each message is atomic; hence, messages can be deserialized in parallel.
- All message deserialization requests can be sent to a generic queue from which Ractors
could consume.
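
A minimal sketch of the idea in the list above, under stated assumptions: `parallel_parse` is a hypothetical helper (it is not part of Karafka or Ruby), and for brevity it splits the batch into fixed slices, one per worker Ractor, instead of using the shared queue described above. It only illustrates that each message can be parsed independently and the results collected afterwards; it is not meant as a performant implementation.

```ruby
require 'json'

# Hypothetical helper: parse JSON payloads in parallel by handing each
# worker Ractor its own slice of the batch (static partitioning, not a queue).
def parallel_parse(payloads, workers: 4)
  return [] if payloads.empty?

  slice_size = (payloads.size / workers.to_f).ceil

  ractors = payloads.each_slice(slice_size).map do |slice|
    # Arguments given to Ractor.new are copied into the new Ractor.
    Ractor.new(slice) do |messages|
      messages.map { |raw| JSON.parse(raw) }
    end
  end

  # Newer Rubies expose Ractor#value in place of Ractor#take; support both.
  ractors.flat_map { |r| r.respond_to?(:value) ? r.value : r.take }
end

payloads = 8.times.map { |i| { 'id' => i, 'value' => i * 2 }.to_json }
parsed = parallel_parse(payloads, workers: 4)
puts parsed.size
```

Because slices are created and collected in order, the parsed results come back in the same order as the input payloads.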
I am not an expert in the Ruby code, but if there is anything I could help with to move
this forward, please just ping me.
----------------------------------------
Bug #19288: Ractor JSON parsing significantly slower than linear parsing
https://bugs.ruby-lang.org/issues/19288#change-101424
* Author: maciej.mensfeld (Maciej Mensfeld)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.2.0 (2022-12-25 revision a528908271) [x86_64-linux]
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
A simple benchmark:
```ruby
require 'json'
require 'benchmark'

CONCURRENT = 5
RACTORS = true
ELEMENTS = 100_000

data = CONCURRENT.times.map do
  ELEMENTS.times.map do
    {
      rand => rand,
      rand => rand,
      rand => rand,
      rand => rand
    }.to_json
  end
end

ractors = CONCURRENT.times.map do
  Ractor.new do
    Ractor.receive.each { JSON.parse(_1) }
  end
end

result = Benchmark.measure do
  if RACTORS
    CONCURRENT.times do |i|
      ractors[i].send(data[i], move: false)
    end

    ractors.each(&:take)
  else
    # Linear without any threads
    data.each do |piece|
      piece.each { JSON.parse(_1) }
    end
  end
end

puts result
```
This gives the following results on my 8-core machine:
```shell
# columns: user, system, total, (real), in seconds
# without ractors:
2.731748 0.003993 2.735741 ( 2.736349)
# with ractors
12.580452 5.089802 17.670254 ( 5.209755)
```
I would expect Ractors not to be about two times slower (in wall-clock time) on CPU-intensive work.
--
https://bugs.ruby-lang.org/