[ruby-talk:444105] Ractor status: are they used?

Hi,

I just evaluated the possibility of using Ractors to distribute processing that relies on various gems with native extensions (some matrix/vector computations with Numo::Linalg and approximate vector searches), and I have mixed feelings.

The Ractor design seems very sound to me and I would prefer to use Ractors to distribute the load across multiple CPUs, but in practice I see 2 major obstacles:

- The first use of Ractor code isn't encouraging, even with Ruby 3.2.0, as it outputs: "warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues."
- When heavy computations are involved you are probably already using gems with native extensions, and when you try to use them in a non-main Ractor you will almost always get an exception: "ractor unsafe method called from not main ractor (Ractor::UnsafeError)"

This last problem could be manageable. According to https://bugs.ruby-lang.org/issues/17307, native extensions should mark their functions as Ractor-safe using:

    #ifdef HAVE_RB_EXT_RACTOR_SAFE
    rb_ext_ractor_safe(true);
    #endif

I was considering contacting the maintainers of the various gems that we use, to check with them which methods are safe and to see if I can submit a pull request, but I'm a bit hesitant and wonder whether Ractors are actually used and stable enough for maintainers to care.

Do people on this list use Ractors? What for? Do people write gems with native extensions with Ractors in mind? I've seen traces of rb_ext_ractor_safe when searching on GitHub, but almost all of the matches are copies of load.h from the Ruby headers and very few are actual uses (in the first 4 pages of results Psych and TruffleRuby are the only exceptions, and the first external gem is ruby-extlz4 on the 5th page...).

I've yet to see how the ffi gem handles Ractors. I already know there's no trace of Ractors in its code, and this could be a whole new can of worms (or not...).

I'll dig a bit more, but in my current position I think I'll favor a multi-process solution to scale (mostly because I don't have a definitive list of gems I'll have to patch at this point, and time is short). That's a shame, and I'm willing to revisit this later.

Best regards,
Lionel
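For readers who have not tried Ractors, a minimal sketch of the kind of CPU distribution discussed above might look like the following. It uses only pure-Ruby work, so it does not hit Ractor::UnsafeError; the names parallel_sum_of_squares and ractor_result are hypothetical, and the helper assumes that results are collected with Ractor#take on the Ruby versions discussed in this thread and with Ractor#value on later releases.

```ruby
# Hypothetical helper: collect a Ractor's result with whichever method
# the running Ruby provides (Ractor#value on newer releases, Ractor#take before).
def ractor_result(ractor)
  ractor.respond_to?(:value) ? ractor.value : ractor.take
end

# Split 1..n into slices and sum the squares of each slice in its own Ractor.
def parallel_sum_of_squares(n, workers: 4)
  slice_size = [(n.to_f / workers).ceil, 1].max
  ractors = (1..n).each_slice(slice_size).map do |slice|
    # The slice is passed as an argument and copied into the new Ractor;
    # the Ractor block itself may not reference outer local variables.
    Ractor.new(slice) { |numbers| numbers.sum { |x| x * x } }
  end
  ractors.sum { |r| ractor_result(r) }
end

puts parallel_sum_of_squares(1_000)
```

On the Ruby versions mentioned above, running this prints the "Ractor is experimental" warning on stderr before the result.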

I’ve toyed around with Ractors, thinking they might be a godsend for GUI apps, like those built with Glimmer DSL for LibUI (https://github.com/AndyObtiva/glimmer-dsl-libui). But with their variable access restrictions they always turned out to be more painful to use than plain old threads, which I get for free as truly parallel in JRuby while using Glimmer DSL for SWT (https://github.com/AndyObtiva/glimmer-dsl-swt).

I don’t know. Maybe one day I’ll see the light of Ractors and change my mind, but as a long-time multithreading user, I have always thought the fears of using true multithreading were overblown in Ruby to the point of impracticality. In actuality, I use multithreading all the time when building desktop apps, and I never have issues with deadlocks, given that I use mutexes/semaphores correctly, or given that many apps don’t even need to share data between threads but could still benefit from threads for parallel processing. It would be nice to have the more convenient option of multithreading for those cases at least.

I’ve always thought Ruby’s stance on multithreading was extreme. Why not support it behind a non-default switch, and let those who don’t fear threads and know how to benefit from them safely use real multithreading by passing a switch, instead of being forced to use something very restrictive and inconvenient like Ractors?

But, if it were my decision, I’d even make multithreading always enabled in Ruby. I’ve built countless desktop GUI apps with JRuby’s multithreading support over the years, and never had any issues.

Check out this Mandelbrot Fractal renderer that takes advantage of all your CPU cores! https://github.com/AndyObtiva/glimmer-dsl-swt/blob/master/docs/reference/GLI...

I’ve observed that most people who have issues with multithreading don’t really have a solid foundation in parallel and concurrent programming, whether from a university degree (I have a BSc in CS), from reading books, or from real experience. They misunderstand how to build multithreaded apps and so run into the very problems they claim are horrifying to the point of discouraging multithreading.

Exhibit A: the Mandelbrot Fractal problem. Some people assume that, given that you want to calculate fractal pixels in a grid, you’d have to have a thread per pixel. Wrong!!! You are supposed to have only as many threads as your CPU has hardware threads, and distribute work to them in a pool. They won’t share data, so there won’t be a deadlock. But people weak in parallel and concurrent programming will try to use a thread per pixel and even share a data structure between all of them, and then will tell you that true multithreading is too memory consuming and dangerous, and will end up cursing threads and wanting to use fibers or Ractors instead. Well, that’s because of their inexperience in writing parallel code correctly with threads to begin with, more than anything.

As a result, everybody is suffering an impractical restriction in Ruby because a few bad programmers don’t know how to do multithreading, when in fact it’s a simple matter of practice makes perfect.

I’d rather we follow the true Ruby way, which is to empower programmers with freedom (like the freedom of dynamic typing) and let them be responsible adults. Multithreading should have that same Ruby-way freedom too. It’s no different. And, at minimum, provide a Ruby command-line switch for truly parallel multithreading to those who know what they’re doing.

Andy Maleh

On Sat, Jan 21, 2023 at 9:37 AM Lionel Bouton via ruby-talk <ruby-talk@ml.ruby-lang.org> wrote:
[...]
--
Andy Maleh
LinkedIn: https://www.linkedin.com/in/andymaleh
Blog: http://andymaleh.blogspot.com
GitHub: http://www.github.com/AndyObtiva
Twitter: @AndyObtiva <https://twitter.com/AndyObtiva>
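The pool approach Andy describes for the Mandelbrot case (one thread per CPU thread, work handed out from a queue, no data shared between workers) can be sketched roughly as follows. This is an illustration under assumptions, not the Glimmer renderer's code: the names mandelbrot_iterations and render_rows, the coordinate ranges, and the iteration limit are all hypothetical. Note that on MRI these threads take turns under the global lock, whereas on JRuby they run in parallel as described above.

```ruby
require "etc"

# Escape-time iteration count for the point c = (cr, ci).
def mandelbrot_iterations(cr, ci, max_iter)
  zr = zi = 0.0
  max_iter.times do |i|
    return i if zr * zr + zi * zi > 4.0
    zr, zi = zr * zr - zi * zi + cr, 2.0 * zr * zi + ci
  end
  max_iter
end

# Render the grid row by row with a fixed pool of worker threads.
def render_rows(width:, height:, max_iter: 50, workers: Etc.nprocessors)
  jobs = Queue.new
  height.times { |row| jobs << row }
  jobs.close # once drained, pop returns nil and the workers stop

  threads = Array.new(workers) do
    Thread.new do
      done = {} # private to this worker, so no locking is needed
      while (row = jobs.pop)
        ci = -1.0 + 2.0 * row / height
        done[row] = Array.new(width) do |col|
          cr = -2.0 + 3.0 * col / width
          mandelbrot_iterations(cr, ci, max_iter)
        end
      end
      done
    end
  end

  # Merge the per-worker results only after every worker has finished.
  threads.map(&:value).reduce({}, :merge).sort.map(&:last)
end

grid = render_rows(width: 60, height: 20)
puts grid.map { |row| row.map { |n| n == 50 ? "#" : " " }.join }
```

The only object the workers share is the Queue, which is thread-safe; each worker collects rows in its own hash and the results are merged once all threads are done.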

My experience with Ractor was that I wanted to run various instances of the whitequark/parser gem in parallel, but there were a couple of problems. For instance, the Ragel-generated code contained something like singleton getters/setters on global modules. This made me rethink the use of singleton getters/setters on globals in the future, as they are very much akin to global variables, but I didn't go as far as trying to fix Ragel to generate better code.

I also didn't like the fact that I had to write unclean code like:

    call_a_method(&Ractor.make_shareable(proc do
      do_something
    end))

for a Ractor to actually launch correctly.

In the end, I implemented what I wanted to do with fork.

But I agree with you in general: if you deal with threads carefully, the way Erlang forces you to, nothing will go wrong. Python is moving toward making its GIL optional (PEP 703). The GIL severely limits what you can do with threads; it basically only lets you speed up IO (and for IO, cooperative multitasking is a much better choice anyway).

On 1/21/23 23:23, Andy Maleh via ruby-talk wrote:
I’ve toyed around with Ractors, thinking they might be a Godsend for GUI apps, like those built by Glimmer DSL for LibUI (https://github.com/AndyObtiva/glimmer-dsl-libui). But, they always turned out to be more pain to use with the variable access restrictions than plain old threads, which I get for free as truly parallel in JRuby while using Glimmer DSL for SWT (https://github.com/AndyObtiva/glimmer-dsl-swt).
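The problem with singleton getters/setters on global modules mentioned in the message above can be illustrated with a small sketch. LexerState and Lexer are hypothetical stand-ins, not the code Ragel generates, and ractor_result is a hypothetical helper that assumes Ractor#take on the Ruby versions of this thread and Ractor#value on later ones.

```ruby
# Hypothetical helper: collect a Ractor's result on old and new Ruby versions.
def ractor_result(ractor)
  ractor.respond_to?(:value) ? ractor.value : ractor.take
end

# Stand-in for generated code that keeps its state on a module,
# which is effectively a global variable.
module LexerState
  @position = 0

  def self.position
    @position
  end

  def self.position=(value)
    @position = value
  end
end

# Writing module-level state from a non-main Ractor is rejected.
def try_module_write
  ractor = Ractor.new do
    begin
      LexerState.position = 42
      "allowed"
    rescue Ractor::IsolationError => e
      "rejected (#{e.class})"
    end
  end
  ractor_result(ractor)
end

# Keeping the same state in an instance created inside the Ractor works.
class Lexer
  attr_accessor :position

  def initialize
    @position = 0
  end
end

def per_ractor_instance
  ractor = Ractor.new do
    lexer = Lexer.new
    lexer.position = 42
    lexer.position
  end
  ractor_result(ractor)
end

puts try_module_write
puts per_ractor_instance
```

Moving such state from the module into per-instance objects is one way generated or library code could become usable from several Ractors, at the cost of changing its public interface.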

Hi,

On 21/01/2023 at 23:23, Andy Maleh via ruby-talk wrote:
[...] I’ve always thought Ruby’s stance on multithreading was extreme. why not support it behind a non-default switch, and let those who don’t fear using them and know how to benefit from them safely use real multithreading by passing a switch, instead of getting forced to use something very restrictive and inconvenient like Ractors.
If you are referring to true concurrent threading, from what I know the explanation is mostly library support. For example, PHP as an ecosystem took ages to truly support multi-threading (I'm not even sure it fully does today) because most of its libraries were designed before PHP supported threads, so they didn't support them (a chicken and egg problem). It was far easier for Java to do it right, as true concurrent threads were baked in from the start.
But, if it were my decision, I’d even make multithreading always enabled in Ruby. I’ve built countless desktop GUI apps with JRuby’s multithreading support over the years, and never had any issues.
Yes, if you only use pure Ruby or wrappers around Java libs. But unless I'm mistaken, interfacing JRuby with native C libraries is not straightforward and certainly not thread safe by default: I don't see a way around knowing which functions are safe and which aren't in order to wrap and use them properly.

To properly support concurrent multi-threading in MRI with native extensions, you'd have to expose only thread-safe functions or advise on how to protect unsafe ones. You might have to call rb_ext_ractor_safe(true) or use something similar to manage this, to avoid relying on trial and error to see what fails.

Ractors are just another tool in the box.

- Threads are the obvious match when many concurrent executions need access to your whole state (this can be motivated by the actual problem to solve or by the speed it gives for large amounts of information transfer).
- Ractors are great when you want to divide your problem into portions of code with very clear and concise interfaces. This promotes simplicity, which greatly helps when you need robustness (simplicity often indirectly helps performance too).
- Whole processes are another tool that can be even more robust when used properly: many Unix daemons (like Postfix, Apache in prefork mode, and PostgreSQL, to name a few I use regularly) have been very solid in part because they divide their work amongst restartable processes with clearly delimited responsibilities. This is the path we have taken until now and will probably continue with, although there's an expected slight performance disadvantage with our new components.

Ractors seem to be victims of the same chicken and egg problem PHP threads had. They are a good solution for a whole class of problems, but it seems almost nobody supports them, so they aren't as useful as they could be. If at least the interface weren't experimental anymore, there might be more incentive to work on making gems with native extensions Ractor-aware.

At least this is how I see it: I'm tempted to fork some gems and submit pull requests later after testing, but this probably involves several days of work for our needs, and the experimental state means there is a risk of our work going to waste. The whole-process route is less risky right now.

Best regards,

--
Lionel Bouton
Managing director, JTEK SARL
https://www.linkedin.com/in/lionelbouton/
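As an illustration of the whole-process route described above, a minimal sketch using fork and pipes might look like the following. The name process_map is hypothetical, the sketch assumes a platform where Kernel#fork is available (so not Windows or JRuby), and it assumes the results can be serialized with Marshal. Because each child is a separate process, native extensions do not need to be Ractor-safe.

```ruby
# Apply the given block to every item, spreading slices of the input
# over forked child processes and collecting the results through pipes.
def process_map(items, workers: 2)
  slice_size = [(items.size.to_f / workers).ceil, 1].max
  children = items.each_slice(slice_size).map do |slice|
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      writer.binmode
      writer.write(Marshal.dump(slice.map { |item| yield item }))
      writer.close
      exit!(0) # leave the child without running the parent's at_exit hooks
    end
    writer.close # the parent keeps only the read end
    [pid, reader]
  end

  children.flat_map do |pid, reader|
    reader.binmode
    data = reader.read # read until EOF before waiting, so a full pipe cannot block the child
    reader.close
    Process.wait(pid)
    Marshal.load(data)
  end
end

p process_map((1..10).to_a, workers: 3) { |x| x * x }
```

The cost compared with threads or Ractors is the serialization of inputs and results across the process boundary, which is the kind of slight performance disadvantage mentioned above.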

On 2023-1-21 10:37 pm, Lionel Bouton via ruby-talk wrote:
Hi,
I just evaluated the possibility of using Ractors to distribute processing using various gems with native extensions (some matrix/vector computations with Numo::Linalg and approximate vector searches) and I've mixed feelings.
The Ractor design seems very sound to me and I would prefer to use them to distribute the load on multiple CPUs but in practice I see 2 major obstacles : - the first use of Ractor code isn't encouraging, even with Ruby 3.2.0, as it outputs : "warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues." - when heavy computations are involved you are probably already using gems with native extensions and when trying to use them in a non-main Ractor you will almost always get an exception : "ractor unsafe method called from not main ractor (Ractor::UnsafeError)"
The remaining comments in this thread are useful, but I would recommend watching this talk by Samuel Williams (@ioquatix on Twitter/Mastodon) since he probably has the best knowledge to answer your question :)

https://www.youtube.com/watch?v=Y29SSOS4UOc

From what I have read, Ractors are still slow, but as with most things in Ruby, it takes a couple of versions before they are good to go.

Best wishes,
Mohit.
participants (4)
- Andy Maleh
- hmdne
- Lionel Bouton
- Mohit Sindhwani