
Could you report that to https://github.com/ruby/timeout/issues?

On Thu, Sep 12, 2024 at 1:16 PM yuri.kanivetsky--- via ruby-core <ruby-core@ml.ruby-lang.org> wrote:
Hi,
I'm not sure if it's a bug. Take a look at the following gist:
https://gist.github.com/x-yuri/253f76df6287441f64b5eaee418813c0
It's supposedly a minimal reproducible example of what I ran into a couple of days ago. In my case it was basically a kind of background job service. It received messages, processed them, and sent the results to another service. Every message was processed and sent in a separate thread, and processing a message involved a number of HTTP requests (via net/http, which uses timeout()).
It worked normally for a couple of hours (10-20 threads), but at some point the number of threads reached the thread pool's max_threads (320), HTTP requests started to time out (later than they should have: after 60-100 seconds instead of 3 seconds), and things probably started to get slow.
Then I noticed the following line:
patterns.any? { |pattern| quote.match(pattern['regex']) } ? 1 : 0
It was performed for every quote (10 times) in every thread, there were 300+ patterns and they (pattern['regex']) were strings. I moved converting strings to regexps to initialization (before starting to accept messages and creating threads) and it seems to work now with ~10 active threads.
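For illustration, the change described above might look roughly like the sketch below. The sample patterns and the helper name match_flag are hypothetical; the original message only says that there were 300+ patterns stored as strings under the 'regex' key.

```ruby
# Hypothetical sample data; the real service had 300+ patterns.
patterns = [
  { 'regex' => 'foo\d+' },
  { 'regex' => 'bar' },
]

# Compile once at initialization, before any worker threads are started.
compiled = patterns.map { |pattern| Regexp.new(pattern['regex']) }.freeze

# Per-quote check that reuses the precompiled Regexp objects.
# Returns 1 if any pattern matches the quote, 0 otherwise.
def match_flag(compiled, quote)
  compiled.any? { |regex| regex.match?(quote) } ? 1 : 0
end

match_flag(compiled, 'foo42')
match_flag(compiled, 'baz')
```

Regexp#match? is used here because only a yes/no answer is needed, so no MatchData object has to be allocated.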
I'm trying to understand what exactly happened, how to avoid it, and what awaits me in the future. Is there some critical load that breaks things? I guess in theory the number of threads should increase linearly with the load. But it looks like in this case there's some critical load at which things just stop working. Or maybe negative effects accumulate for some time and then things break. My conjecture is that it has something to do with the GIL. But what exactly happens? And what can I do to investigate the issue further? I'm going to try running ruby with RUBY_DEBUG_LOG and examining the debug output. I guess I need to figure out why the timeout thread doesn't always get enough time. Does something block the threads? How are they scheduled? When do they switch? Any pointers or links to the code are welcome.
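One rough way to observe whether a waiting thread gets scheduled on time is to measure how late a sleep returns while other threads are busy. The sketch below is not the gist linked above; the helper name wakeup_lateness and all of its parameters are made up for illustration, and the sleeping main thread only stands in for the timeout thread.

```ruby
# Hypothetical helper: returns how many seconds late a sleeping thread woke up
# while `busy_threads` other threads were doing CPU-bound work.
def wakeup_lateness(busy_threads:, sleep_for: 0.05, busy_for: 0.3)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + busy_for
  workers = Array.new(busy_threads) do
    Thread.new do
      # CPU-bound loop: matches against a string pattern until the deadline.
      while Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
        'quote 123'.match('\d+')
      end
    end
  end

  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  sleep(sleep_for)
  lateness = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started - sleep_for

  workers.each(&:join)
  lateness
end

[0, 4, 16].each do |n|
  printf("%2d busy threads: woke up %.3fs late\n", n, wakeup_lateness(busy_threads: n))
end
```

The numbers depend on the machine and the Ruby version, so they are only useful for comparing runs with different thread counts on the same setup.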
Regards,
Yuri