Issue #19588 has been reported by kyanagi (Kouhei Yanagita).
----------------------------------------
Feature #19588: Allow Comparable#clamp(min, max) to accept nil as a specification
https://bugs.ruby-lang.org/issues/19588
* Author: kyanagi (Kouhei Yanagita)
* Status: Open
* Priority: Normal
----------------------------------------
`Comparable#clamp(min, max)` (with two arguments) accepts `nil`. This behaves the same as beginless/endless Range.
~~~ruby
5.clamp(nil, 0) # => 0
5.clamp(10, nil) # => 10
5.clamp(..0) # => 0
5.clamp(10..) # => 10
~~~
This behavior is not documented. Presumably, this was not introduced intentionally.
Older Rubies did not accept a `Range` argument.
In Ruby 2.7, accepting `Range` as an argument was introduced.
At that time, the approach of passing `nil` as a two-argument method was also discussed but not adopted,
and using Range was chosen instead. https://bugs.ruby-lang.org/issues/14784
However, in Ruby 3.0, the behavior of `clamp` has changed to accept `nil`.
This change is not documented in the NEWS or the documentation for `clamp`,
and I believe that it was not an intentional change.
~~~
% docker run -it --rm rubylang/all-ruby env ALL_RUBY_SINCE=ruby-2.4.0 ./all-ruby -e "p 5.clamp(0, nil)"
ruby-2.4.0 -e:1:in `clamp': comparison of Integer with nil failed (ArgumentError)
from -e:1:in `<main>'
exit 1
...
ruby-2.7.8 -e:1:in `clamp': comparison of Integer with nil failed (ArgumentError)
from -e:1:in `<main>'
exit 1
ruby-3.0.0-preview1 5
...
ruby-3.2.2 5
~~~
It seems that https://github.com/ruby/ruby/commit/a93da4970be44a473b7b42e7516eb2663dece2c3 brought about this change.
How about making the current behavior a specification?
It has been three years since the behavior changed, and I don't see much point in prohibiting `nil` now.
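To make the proposed semantics concrete, here is a minimal sketch of a helper that treats `nil` as "no bound" by delegating to the `Range` form. The name `clamp_allowing_nil` is hypothetical and not part of Ruby. It assumes Ruby 2.7 or later, where `clamp` accepts beginless and endless ranges.

```ruby
# Hypothetical helper (not part of Ruby): clamp with nil meaning "unbounded".
def clamp_allowing_nil(value, min, max)
  # With both bounds present, the ordinary two-argument form applies.
  return value.clamp(min, max) unless min.nil? || max.nil?

  # Otherwise build a beginless/endless Range and use the Range form.
  value.clamp(Range.new(min, max))
end

clamp_allowing_nil(5, nil, 0)  # same as 5.clamp(..0)
clamp_allowing_nil(5, 10, nil) # same as 5.clamp(10..)
clamp_allowing_nil(5, 0, nil)  # same as 5.clamp(0..)
```

On Ruby 3.0 and later, calling `clamp(min, max)` directly with `nil` gives the same results, which is the behavior this ticket asks to make official.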
--
https://bugs.ruby-lang.org/
Issue #19370 has been reported by zverok (Victor Shepelev).
----------------------------------------
Feature #19370: Anonymous parameters for blocks?
https://bugs.ruby-lang.org/issues/19370
* Author: zverok (Victor Shepelev)
* Status: Open
* Priority: Normal
----------------------------------------
Just to clarify: is anonymous parameter delegation planned to be supported in blocks?
It would be a nice addition, if it is possible to implement:
```ruby
# data in form [request method, URL, params]:
[
[:get, 'https://google.com', {q: 'Ruby'}, {'User-Agent': 'Google-Chrome'}],
[:post, 'https://gist.github.com', 'body'],
# ...
].each { |method, *| request(method.to_s.upcase, *) }
```
...and at the very least, consistent with what the method definition can have.
If they are NOT planned to be implemented, I believe that at least error messages should be made much clearer, because currently, this would happen while running the code above:
> no anonymous rest parameter (SyntaxError)
I understand the reason (the `request` clause doesn't "see" anonymous parameter of the **block**, and claims that current **method** doesn't have them), but it looks honestly confusing and inconsistent.
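Until blocks support this, the rest parameter has to be named in the block. A minimal sketch, where `request` is a hypothetical stand-in that just returns its arguments:

```ruby
# Hypothetical stand-in for the `request` method used in the example above.
def request(method, *args)
  [method, *args]
end

rows = [
  [:get, 'https://google.com', { q: 'Ruby' }],
  [:post, 'https://gist.github.com', 'body'],
]

# Name the rest parameter in the block so it can be splatted explicitly.
results = rows.map { |method, *rest| request(method.to_s.upcase, *rest) }
```

The named `*rest` carries the same information the anonymous `*` would, at the cost of introducing a name that is otherwise unused.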
Issue #19246 has been reported by thomthom (Thomas Thomassen).
----------------------------------------
Bug #19246: Rebuilding the loaded feature index much slower in Ruby 3.1
https://bugs.ruby-lang.org/issues/19246
* Author: thomthom (Thomas Thomassen)
* Status: Open
* Priority: Normal
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
Some background to this issue: (This is a case that is unconventional usage of Ruby, but I hope you bear with me.)
We ship the Ruby interpreter with our desktop applications for plugin support in our application (SketchUp).
One feature we have had since at least 2006 (maybe earlier; it is hard to track history beyond that) is a custom alternate `require` method: `Sketchup.require`. This allows the users of our API to load encrypted Ruby files.
This originally used `rb_provide` to add the path of the encrypted file to the list of loaded features. However, somewhere between Ruby 2.2 and 2.5 some string optimisations were made, and `rb_provide` stopped copying the string passed to it. Instead it just held on to a pointer reference. In our case that string came from user-land, being passed in from `Sketchup.require`, and would eventually be garbage collected and cause access violation crashes.
To work around that we changed our custom `Sketchup.require` to push to `$LOADED_FEATURES` directly. There was a small penalty to the index being rebuilt after that, but it was negligible.
Recently we tried to upgrade the Ruby interpreter in our application from 2.7 to 3.1 and found a major performance reduction when using our `Sketchup.require`. As in, a plugin that would load in half a second would now take 30 seconds.
From https://bugs.ruby-lang.org/issues/18452 it sounds like there is _some_ expected extra penalty due to changes in how the index is built. But should it really be this much?
Example minimal repro to simulate the issue:
```
# frozen_string_literal: true
require 'benchmark'
require 'fileutils'
FileUtils.mkdir_p("#{__dir__}/tmp") # the fixture files below are written here
iterations = 200
foo_files = iterations.times.map { |i| "#{__dir__}/tmp/foo-#{i}.rb" }
foo_files.each { |f| File.write(f, "") }
bar_files = iterations.times.map { |i| "#{__dir__}/tmp/bar-#{i}.rb" }
bar_files.each { |f| File.write(f, "") }
biz_files = iterations.times.map { |i| "#{__dir__}/tmp/biz-#{i}.rb" }
biz_files.each { |f| File.write(f, "") }
Benchmark.bm do |x|
x.report('normal') {
foo_files.each { |file|
require file
}
}
x.report('loaded_features') {
foo_files.each { |file|
require file
$LOADED_FEATURES << "#{file}-fake.rb"
}
}
x.report('normal again') {
biz_files.each { |file|
require file
}
}
end
```
```
C:\Users\Thomas\SourceTree\ruby-perf>ruby27.bat
ruby 2.7.4p191 (2021-07-07 revision a21a3b7d23) [x64-mingw32]
C:\Users\Thomas\SourceTree\ruby-perf>ruby test-require.rb
user system total real
normal 0.000000 0.031000 0.031000 ( 0.078483)
loaded_features 0.015000 0.000000 0.015000 ( 0.038759)
normal again 0.016000 0.032000 0.048000 ( 0.076940)
```
```
C:\Users\Thomas\SourceTree\ruby-perf>ruby30.bat
ruby 2.7.4p191 (2021-07-07 revision a21a3b7d23) [x64-mingw32]
C:\Users\Thomas\SourceTree\ruby-perf>ruby test-require.rb
user system total real
normal 0.000000 0.031000 0.031000 ( 0.074733)
loaded_features 0.032000 0.000000 0.032000 ( 0.038898)
normal again 0.000000 0.047000 0.047000 ( 0.076343)
```
```
C:\Users\Thomas\SourceTree\ruby-perf>ruby31.bat
ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x64-mingw-ucrt]
C:\Users\Thomas\SourceTree\ruby-perf>ruby test-require.rb
user system total real
normal 0.016000 0.031000 0.047000 ( 0.132633)
loaded_features 1.969000 11.500000 13.469000 ( 18.395761)
normal again 0.031000 0.125000 0.156000 ( 0.249130)
```
Right now we're exploring options to deal with this, because the performance degradation blocks us from upgrading. We also have 16 years of plugins created by third-party developers, which makes it impossible for us to drop this feature.
Some options as-is, none of which are ideal:
1. We revert to using `rb_provide` but ensure the string passed in is not owned by Ruby, instead building a list of strings that we keep around for the duration of the application process. The problem is that some of our plugin developers have on occasion released plugins that touch `$LOADED_FEATURES`, and if such a plugin is installed on a user's machine it might cause the application to become unresponsive for minutes. The other non-ideal aspect of using `rb_provide` is that we'd be using it in ways it wasn't really intended for (from what I understand). And it's not an official API?
2. We create a separate way for our `Sketchup.require` to keep track of its loaded features, but then that would diverge even more from the behaviour of `require`. Replicating `require` functionality is not trivial and would be prone to subtle errors and possible divergence. It also doesn't address our issue that there is code out there in existing plugins that touches `$LOADED_FEATURES`. (And it's not something we can just ask people to clean up. From previous experience, old versions stick around for a long time and are very hard to purge from circulation.)
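For illustration, option 2 could look roughly like the sketch below. `PluginLoader` is a hypothetical name, the decryption step is omitted, and this is not SketchUp's actual implementation. It keeps its own record of loaded paths and never touches `$LOADED_FEATURES`, so it does not invalidate Ruby's loaded-feature index.

```ruby
require 'set'
require 'tmpdir'

# Hypothetical loader that tracks its own loaded files in a private Set.
module PluginLoader
  @loaded = Set.new

  # Returns true on first load and false if the file was already loaded,
  # mirroring the return values of Kernel#require.
  def self.require(path)
    full = File.expand_path(path)
    return false if @loaded.include?(full)

    load full        # a real implementation would decrypt the file first
    @loaded << full  # record only after the file loaded successfully
    true
  end
end

first = second = nil
Dir.mktmpdir do |dir|
  path = File.join(dir, 'plugin.rb')
  File.write(path, "$plugin_loads = ($plugin_loads || 0) + 1\n")
  first  = PluginLoader.require(path) # loads the file
  second = PluginLoader.require(path) # skipped, already recorded
end
```

As the list above notes, this diverges from `require` (no `$LOAD_PATH` search, no locking against concurrent loads) and does nothing about third-party plugins that modify `$LOADED_FEATURES` themselves.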
I have two questions for the Ruby maintainers:
1. Would it be reasonable to see an API for adding/removing/checking `$LOADED_FEATURES` that would allow for a more ideal implementation of a custom `require` functionality?
2. Is the performance difference in rebuilding the loaded feature index really expected to be as high as what we're seeing? An increase of nearly 100 times? Is there something there that might be addressed to make the rebuild less expensive? (This would really help to address our challenges with third-party plugins occasionally touching the global.)
Issue #19744 has been reported by tagomoris (Satoshi TAGOMORI).
----------------------------------------
Feature #19744: Namespace on read
https://bugs.ruby-lang.org/issues/19744
* Author: tagomoris (Satoshi TAGOMORI)
* Status: Open
* Priority: Normal
----------------------------------------
# What is the "Namespace on read"
This proposes a new feature to define virtual top-level namespaces in Ruby. Those namespaces can require/load libraries (either .rb or native extension) separately from the global namespace. Dependencies of required/loaded libraries are also required/loaded in the namespace.
### Motivation
The "namespace on read" can solve the 2 problems below, and can make a path to solve another problem:
The details of those motivations are described in the below section ("Motivation details").
#### Avoiding name conflicts between libraries
Applications can require two different libraries safely which use the same module name.
#### Avoiding unexpected globally shared modules/objects
Applications can make an independent/unshared module instance.
#### (In the future) Multiple versions of gems can be required
Application developers will have fewer version conflicts between gem dependencies if rubygems/bundler will support the namespace on read.
### Example code with this feature
```ruby
# your_module.rb
module YourModule
end
# my_module.rb
require 'your_module'
module MyModule
end
# example.rb
namespace1 = NameSpace.new
namespace1.require('my_module') #=> true
namespace1::MyModule #=> #<Module:0x00000001027ea650>::MyModule (or #<NameSpace:0x00...>::MyModule ?)
namespace1::YourModule # similar to the above
MyModule # NameError
YourModule # NameError
namespace2 = NameSpace.new # Any number of namespaces can be defined
namespace2.require('my_module') # Different library "instance" from namespace1
require 'my_module' # require in the global namespace
MyModule.object_id != namespace1::MyModule.object_id #=> true
namespace1::MyModule.object_id != namespace2::MyModule.object_id
```
The required/loaded libraries will define different "instances" of modules/classes in those namespaces (just like the "wrapper" 2nd argument of `Kernel.load`). This doesn't introduce compatibility problems if all libraries use relative name resolution (without forced top-level reference like `::Name`).
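The existing wrapper argument of `load` already gives a small-scale preview of this "separate instances" behavior. The sketch below assumes Ruby 3.1 or later, where the second argument of `load` may be a `Module`. The file name and `MyModule` are illustrative only, and this is not the proposed `NameSpace` API.

```ruby
require 'tmpdir'

ns1 = Module.new
ns2 = Module.new

Dir.mktmpdir do |dir|
  path = File.join(dir, 'my_module.rb')
  File.write(path, "module MyModule; end\n")

  # Each load evaluates the file with the given module as its top level,
  # so MyModule is defined under ns1 and ns2, not under Object.
  load(path, ns1)
  load(path, ns2)
end

same_object   = ns1::MyModule.equal?(ns2::MyModule)
leaked_global = Object.const_defined?(:MyModule)
```

Unlike the proposal, this wrapping does not follow `require` calls made inside the loaded file, which is the gap the namespace feature is meant to close.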
# "On read": optional, user-driven feature
"On read" is a key thing of this feature. That means:
* No changes are required in existing/new libraries (except for limited cases, described below)
* No changes are required in applications if it doesn't need namespaces
* Users can enable/use namespaces just for limited code in the whole library/application
Users can start using this feature step by step (if they want it) without any big jumps.
## Motivation details
This feature can solve multiple problems I have in writing/executing Ruby code. Those are from the 3 problems I mentioned above: name conflicts, globally shared modules, and library version conflicts between dependencies. I'll describe 4 scenarios about those problems.
### Running multiple applications on a Ruby process
Modern computers have many CPU cores and large memory spaces. We sometimes want to have many separate applications (either micro-service architecture or modular monolith). Currently, running those applications requires different processes. That incurs additional computation costs (especially while developing those applications).
If we have isolated namespaces and can load applications in those namespaces, we'll be able to run apps on a process, with less overhead.
(I want to run many AWS Lambda applications on a process in isolated namespaces.)
### Running tests in isolated namespaces
Tests that require external libraries need many hacks to:
* require a library multiple times
* require many different 3rd party libraries into isolated spaces (those may conflict with each other)
Software with plugin systems (for example, Fluentd) will benefit from namespaces.
In addition to it, application tests can avoid unexpected side effects if tests are executed in isolated namespaces.
### Safely isolated library instances
Libraries may have globally shared state. For example, [Oj](https://github.com/ohler55/oj) has a global `Oj.default_options` object to change the library behavior. Those options may be changed by any dependency library or application, and that changes the behavior of `Oj` globally and unexpectedly.
For such libraries, we'll be able to instantiate a safe library instance in an isolated namespace.
### Avoiding dependency hells
Modern applications use many libraries, and those libraries require many more dependencies. Those dependencies cause version conflicts very often. In such cases, application developers have to resolve them by updating each library, or just wait for new releases of the conflicting libraries. Sometimes, library maintainers don't release updated versions, and application developers can do nothing.
If namespaces can require/load a library multiple times, it also becomes possible to require/load different versions of a library in one process. This requires support from rubygems, but namespaces should be a good foundation for it.
## Expected problems
### Use of top-level references
In my expectation, `::Name` should refer to the top-level `Name` in the global namespace. I expect that `::ENV` should contain the environment variables. But it may cause compatibility problems if library code uses `::MyLibrary` to refer to itself from deeply nested library code.
### Additional memory consumption
An extension library (dynamically linked library) may be loaded multiple times (by `dlopen` for temporarily copied dll files) to load isolated library "instances" if different namespaces require the same extension library. That consumes additional memory.
In my opinion, additional memory consumption is a minimum cost to realize loading extension libraries multiple times without compatibility issues.
This occurs only when programmers use namespaces. And it's only about libraries that are used in 2 or more namespaces.
### The change of `dlopen` flag about extension libraries
To load an extension library multiple times without conflicting symbols, all extensions should stop sharing symbols globally. Libraries that refer to symbols from other extension libraries will have to change their code and dependencies.
(Regarding extension libraries, [Naruse also wrote an entry](https://naruse.hateblo.jp/entry/2023/05/22/193411).)
# Misc
The proof-of-concept branch is here: https://github.com/tagomoris/ruby/pull/1
It's still a work-in-progress branch, especially for extension libraries.
Issue #19160 has been reported by kaiquekandykoga (Kaíque Koga).
----------------------------------------
Bug #19160: cmp_clamp arguments
https://bugs.ruby-lang.org/issues/19160
* Author: kaiquekandykoga (Kaíque Koga)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x86_64-freebsd13.1]
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
If clamp receives min higher than max, it will raise an exception. The message says *min argument must be smaller than max argument* , but min can actually be equal to max too.
Patch https://github.com/ruby/ruby/pull/6802.
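A short check of the behavior the message should describe. The exact wording of the exception message is deliberately not asserted here, since it is what the patch changes.

```ruby
equal_bounds = 5.clamp(3, 3) # min == max is accepted

reversed_error =
  begin
    5.clamp(4, 3) # min > max is rejected
    nil
  rescue ArgumentError => e
    e
  end
```

So the accurate constraint is that min must be less than or equal to max, not strictly smaller.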
Issue #19716 has been reported by alexdowad (Alex Dowad).
----------------------------------------
Bug #19716: SystemStackError occurs too easily on Alpine Linux (due to small stack size reported by pthread_attr_getstacksize on musl libc)
https://bugs.ruby-lang.org/issues/19716
* Author: alexdowad (Alex Dowad)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.1.4p223 (2023-03-30 revision 957bb7cb81) [x86_64-linux-musl]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
This is the same problem previously reported against Ruby 2.5 in https://bugs.ruby-lang.org/issues/14387. I just ran into the same problem on Ruby 3.1.4, built on Alpine Linux 3.16.
@hsbt stated in the previous thread (https://bugs.ruby-lang.org/issues/14387#note-28):
> If you have this issue with Ruby 3.2, please file it with another issue.
I hacked `stack_check` in gc.c to print the values of `STACK_START` and `STACK_END` on stack overflow; on the Alpine 3.16 host where this problem just occurred, the values printed were:
> Start=0x7ffd0bf4f000, End=0x7ffd0bf32530
...which shows that Ruby thinks the stack size is only 131072 bytes. On the other hand, `ulimit -s` shows a stack size limit of 8192kb.
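The arithmetic behind these figures can be checked directly. In this sketch the two addresses and the 131072 and 8192 figures come from the report above; reading the remaining gap as a safety margin is an assumption, not something the report states.

```ruby
stack_start = 0x7ffd0bf4f000
stack_end   = 0x7ffd0bf32530

# The stack grows downward, so the bytes in use are start minus end.
used_bytes = stack_start - stack_end

assumed_limit = 131_072       # 128 KiB, the size Ruby appears to assume
ulimit_bytes  = 8192 * 1024   # `ulimit -s` reports 8192 KiB

headroom = assumed_limit - used_bytes   # presumably a safety margin
ratio    = ulimit_bytes / assumed_limit # how much larger the real limit is
```

The overflow was raised with 117456 bytes in use, just under 128 KiB, while the process limit is 64 times larger.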
This Ruby 3.1.4 was built from unmodified source code downloaded from https://cache.ruby-lang.org; the build was configured using `CFLAGS='-march=native' ./configure --disable-install-doc`.
The invocation of Ruby which blew the stack was `bundle exec rake db:migrate`, on a mid-sized Rails project.
Regarding @ncopa's patch from #14387, @wanabe listed some things which should be done before it is merged into mainline Ruby:
> Okay, The patch needs one or more proofs of its behaviour, like that:
>
> Original issue [ruby-dev:50421] has gone away.
> Standard test codes run well.
> test-all
> ruby/spec
> getrlimit works on some situations like:
> on single thread
> with multiple threads
> with RLIMIT_STACK environment variable
> getrlimit code of musl is implemented correctly as expected.
> (But It's doubtful whether it can be. I guess that a proof of code soundness is very difficult.)
> Some "real world" applications can work.
> I think it is better example that that application(s) can't work without the patch.
I am happy to help cover some of these points if the Ruby development team is still interested in merging @ncopa's patch.
Issue #19576 has been reported by jprokop (Jarek Prokop).
----------------------------------------
Bug #19576: Backport request: Gemfile.lock resolving is broken with bundler shipped with Ruby 3.1.4
https://bugs.ruby-lang.org/issues/19576
* Author: jprokop (Jarek Prokop)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.1.3p185 (2022-11-24 revision 1a6b16756e) [x86_64-linux]
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
This is a backport request for bundler, which regressed in 2.3.26 in a specific situation. Newer and older bundler versions that ship with Ruby are not problematic, only the version that ships with Ruby >= 3.1.3.
A few weeks ago, we discovered a bug in resolving in bundler shipped with Ruby 3.1.3 and 3.1.4:
Bundler version:
```
$ bundler --version
Bundler version 2.3.26
```
Affected rubies `ruby -v`:
First:
```
$ ruby -v
ruby 3.1.4p223 (2023-03-30 revision 957bb7cb81) [x86_64-linux]
```
Second:
```
$ ruby -v
ruby 3.1.3p185 (2022-11-24 revision 1a6b16756e) [x86_64-linux]
```
Initial bug report with reproducer and more in-depth description can be found here: https://github.com/sclorg/s2i-ruby-container/issues/469
Using the following Gemfile for a rails app:
https://github.com/sclorg/rails-ex/blob/67b7a61eae9efa1088ff3f634ae316e1022…
Bundler locks up trying to resolve Nokogiri for Ruby 3.1: it keeps failing because it keeps selecting an incompatible prebuilt binary gem instead of falling back to installing Nokogiri and building its extension locally.
We craft this Gemfile to be usable from Ruby 2.5 up to Ruby 3.1, as the app is used mainly for testing.
I have created a patch to fix the situation, see the attached files. There are 2 of them, one contains the fix and the other one contains the test from the rubygems repo PR#6225.
full commit available here: https://src.fedoraproject.org/fork/jackorp/rpms/ruby/c/5ef600a8f40b76de5636…
The patches are created from the following upstream changes in bundler:
https://github.com/rubygems/rubygems/pull/6225
and adapted:
https://github.com/rubygems/rubygems/commit/7b64c64262a7a980c0eb23b96ea56cf…
for the PR#6225.
With the fix applied I no longer have issues doing `bundle install` with our Gemfile.lock.
---Files--------------------------------
rubygem-bundler-2.3.26-Tests-from-bundler-PR-6225.patch (1.82 KB)
rubygem-bundler-2.3.26-Provide-fix-for-bundler-Gemfile-resolving-regression.patch (5.21 KB)
Issue #19842 has been reported by ko1 (Koichi Sasada).
----------------------------------------
Feature #19842: Introduce M:N threads
https://bugs.ruby-lang.org/issues/19842
* Author: ko1 (Koichi Sasada)
* Status: Open
* Priority: Normal
* Assignee: ko1 (Koichi Sasada)
----------------------------------------
This ticket proposes to introduce M:N threads to improve Threads/Ractors performance.
## Background
Ruby threads (RT in short) have been implemented since old Ruby versions and they have the following features:
* Can be created with simple notation `Thread.new{}`
* Can be switched to another ready Ruby thread by:
* Time-slice.
* I/O blocking.
* Synchronization such as Mutex features.
* And other blocking reasons.
* Can be interrupted by:
* OS-delivered signals (only for the main thread).
* `Thread#kill`.
* Can be terminated by:
* the end of each Ruby thread.
* the end of the main thread (and other Ruby threads are killed).
Ruby 1.8 and earlier versions use M:1 threads (green threads, user-level threads, ...; the term 1:N threads is more popular, but to keep this explanation consistent I use the term "M:1" here), which manage multiple Ruby threads on 1 native thread.
(Native threads are provided by C interfaces such as Pthreads. In many cases, native threads are OS threads, but there are also user-level implementations, such as user-level pthread libraries in theory. Therefore, they are referred to as native threads in this article and NT in short)
If a Ruby thread T1 is blocked because of an I/O operation, the Ruby interpreter switches to the next ready Ruby thread T2. The I/O operation is monitored by `select()` (or a similar facility), and when the I/O is ready, T1 is marked as a ready thread and will be resumed soon. However, when a Ruby thread issues some other blocking operation such as `gethostbyname()`, the Ruby interpreter can not switch to any other Ruby thread until `gethostbyname()` finishes.
We name two types of blocking operations:
* Managed blocking operations
* I/O (most of read/write)
* managed by I/O multiplexing API (select, poll, epoll, kqueue, IOCP, io_uring, ...)
* Sleeping
* Synchronization (Mutex, Queue, ...)
* Unmanaged operations
* All other blocking operations not listed above, written in C
* Huge number calculation like `Bignum#*`
* DNS lookup
* I/O (whose blocking can not be detected by a multiplexing API)
* open on FIFO, close on NFS, ...
* flock and other locking mechanism
* library call which uses blocking operations
* `libfoo` has `foo_func()` and `foo_func()` waits for a DNS lookup. A Ruby extension `foo-ruby` can call `foo_func()`.
With these terms we can say that M:1 threads can support managed blocking operations but can not support unmanaged operations (other Ruby threads can not make progress) without further tricks.
Note that even if a `select()`-like system call says an `fd` is ready, the I/O operation on that `fd` can still block because of contention (a read by another thread or process, for example).
M:1 threads have another disadvantage: they can not run in parallel because only one native thread is used.
From Ruby 1.9 we implemented 1:1 threads, which means each Ruby thread has a corresponding native thread. To make the implementation easy we also introduced the GVL. Only the Ruby thread that holds the GVL can run. With the 1:1 model, we can support both managed and unmanaged blocking operations by releasing the GVL. When a Ruby thread wants to issue a blocking operation, it releases the GVL and another ready Ruby thread continues to run. We don't care whether the blocking operation is managed or unmanaged.
(We can not make some of unmanaged blocking operations interruptible (stop by Ctrl-C for example)).
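The effect of releasing the GVL around a blocking operation can be observed from plain Ruby. In this minimal sketch, four threads each sleep for 0.2 seconds; the thread count and duration are arbitrary. Because a sleeping thread does not hold the GVL, the total wall-clock time stays close to one sleep rather than four.

```ruby
clock = -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) }

started = clock.call
# Each thread blocks in sleep, releasing the GVL so the others can run.
threads = 4.times.map { Thread.new { sleep 0.2 } }
threads.each(&:join)
elapsed = clock.call - started

# Sequential execution would need about 0.8 s; overlapped sleeps need about 0.2 s.
overlapped = elapsed < 0.7
```

The same overlap does not apply to CPU-bound Ruby code, which needs the GVL to run.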
Advantages of 1:1 threads over M:1 threads are:
* Easy to handle blocking operations by releasing GVL.
* We can utilize parallelism with multiple native threads by releasing GVL.
Disadvantages of 1:1 threads compared to M:1 threads are:
* Overhead to make many native threads for many Ruby threads
* We can not make a huge number of Ruby threads and Ractors on 1:1 threads.
* Thread switching overhead by GVL because inter-core communication is needed.
From Ruby 3.0 we introduced the fiber scheduler mechanism to manage multiple fibers.
Differences from Ruby 1.8 M:1 threads are:
* No timeslice (only switch fibers by managed blocking operations)
* Ruby users can make own schedulers for apps with favorite underlying mechanism
Disadvantages are similar to those of M:1 threads. Another disadvantage is that we need to consider Fiber's behavior.
From Ruby 3.0 we also introduced Ractors. Ractors can run in parallel because most objects are kept separate between them. 1 Ractor creates 1 Ruby thread, so Ractors have the same disadvantages as 1:1 threads. For example, we can not make a huge number of Ractors.
## Goal
Our goal is making lightweight Ractors on lightweight Ruby threads. To enable this goal we propose to implement M:N threads on MRI.
M:N threads manage M Ruby threads on N native threads, with limited N (~= CPU core numbers for example).
Advantages of M:N threads are:
1. We can run N ractors on N native threads simultaneously if the machine has N cores.
2. We can make a huge number of Ruby threads or Ractors because we don't need a huge number of native threads.
3. We can support unmanaged blocking operations by locking a native thread to a Ruby thread which issues an unmanaged blocking operation.
4. We can make our own Ruby threads or Ractors scheduler instead of the native thread (OS) scheduler.
Disadvantages of M:N threads are:
1. It is a complex implementation and it can be hard.
2. It can introduce incompatibility, especially with TLS (thread-local storage).
3. We need to maintain our own scheduler.
Without multiple Ractors, it is similar to Ruby 1.8 M:1 threads. The difference from M:1 threads is the NT locking mechanism that supports unmanaged blocking operations. Another advantage is that it is easy to fall back to 1:1 threads by locking all of the corresponding native threads to Ruby threads.
## Proposed design
### User facing changes
If a program only has a main Ractor (i.e., most Ruby programs), the user will not face any changes by default.
On main Ractor, all threads are 1:1 threads by default and there is no compatibility issue.
If the `RUBY_MN_THREADS=1` environment variable is given, the main Ractor enables M:N threads.
Note that the main thread locks its NT by default because the initial NT is special in some cases. I'm not sure we can relax this limitation.
On the multiple Ractors, N (+ alpha) native threads run M ractors. Now there is no way to disable M:N threads on multiple Ractors because there are only a few multi-Ractor programs and no compatibility issues.
The maximum N can be specified by `RUBY_MAX_PROC=N`. It is 8 by default, but this value should be specified according to the number of CPU processors (cores).
### TLS issue
With M:N threads a Ruby thread (RT1) can migrate from one native thread (NT1) to another (NT2), and so on, so TLS in native code can be a problem.
For example, RT1 calls a library function `foo()` and it sets TLS1 on NT1. After RT1 migrates to NT2, RT1 calls `foo()` again but there is no TLS1 record because TLS1 is recorded only on NT1.
In this case, RT1 should run on NT1 while using the native library foo. To avoid such a problem, we need the following features:
* 1:1 threads on main Ractor by default
* functionality to lock the NT for RT, maybe `Thread#lock_native_thread` and `Thread#unlock_native_thread` API is needed. For example, Go language has `runtime.LockOSThread()` and `runtime.UnlockOSThread()` for this purpose.
* Or C-API only for this purpose? (not fixed yet)
Thankfully, the same problem can occur with Fiber scheduler (and of course Ruby 1.8 M:1 threads), but I have not heard of it being much of a problem, so I expect that TLS will not be much of an issue.
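The migration problem can be modeled in a few lines. This is only a simulation in plain Ruby: `NATIVE_TLS`, the `:nt1`/`:nt2` labels, and `foo` are hypothetical stand-ins for real native thread-local storage and a real C library function.

```ruby
# Simulated per-native-thread storage: one Hash per NT label.
NATIVE_TLS = Hash.new { |store, nt| store[nt] = {} }

# Hypothetical native library function that keeps its state in the TLS of
# whichever NT it runs on. Returns true when it had to create that state.
def foo(native_thread)
  tls = NATIVE_TLS[native_thread]
  created = !tls.key?(:foo_state)
  tls[:foo_state] ||= :initialized
  created
end

first  = foo(:nt1) # RT1 runs on NT1: state is created
second = foo(:nt1) # still on NT1: state is found
third  = foo(:nt2) # RT1 migrated to NT2: state is missing again
```

Locking RT1 to NT1 for the duration of its use of `foo` corresponds to always passing the same label here, which is what a `Thread#lock_native_thread`-style API would guarantee.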
### Unmanaged blocking operations
From Ruby 1.9 (1:1 threads), the `nogvl(func)` API is used for most blocking operations to keep the threading system healthy. In other words, `nogvl(func)` indicates that the given function is a blocking operation. To support unmanaged blocking operations, we lock a native thread to the Ruby thread which issues the blocking operation.
If the blocking operation doesn't finish soon, other Ruby threads can not run because an RT has locked the NT. In this case, a system monitoring thread named "Timer thread" (a historical name, TT in short) creates another NT to run the other ready Ruby threads.
This TT's behavior is the same as the behavior of "sysmon" in the Go language.
We call locked NTs dedicated native threads (DNT) and other NTs shared native threads (SNT). The upper bound set by `RUBY_MAX_PROC` applies to the number of SNTs. In other words, the number of DNTs is not limited (just as the number of NTs with 1:1 threads is not limited).
### Managed blocking operations
Managed blocking operations are multiplexed by `select()`-like functions on the Timer thread. Now only `epoll()` is supported.
I/O operation flow (read on fd1) on Ruby thread RT1:
1. Check the readiness of fd1 with `poll(timeout = 0)`; if it is ready, go to step 4.
2. Register fd1 with the Timer thread (TT) epoll and resume another ready Ruby thread.
3. When TT detects that fd1 is ready, it marks RT1 as a ready thread.
4. When RT1 is resumed, do `read()` while locking the corresponding NT1.
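The flow above can be mimicked in plain Ruby as a rough sketch. Here a `Fiber` stands in for RT1, the main program stands in for the Timer thread, and `IO.select` stands in for `poll`/`epoll`; none of this is the actual MaNy implementation.

```ruby
reader, writer = IO.pipe
log = []

rt1 = Fiber.new do
  # Step 1: non-blocking readiness check (like poll with timeout 0).
  unless IO.select([reader], nil, nil, 0)
    # Step 2: hand the fd to the "timer thread" and give up the CPU.
    log << :registered
    Fiber.yield reader
  end
  # Step 4: resumed once the fd is ready, so this read does not block.
  log << :read
  reader.readpartial(5)
end

waiting_io = rt1.resume  # RT1 finds fd1 not ready and registers it
writer.write("hello")    # data arrives while other work could run

# Step 3: the timer-thread stand-in waits for readiness, then resumes RT1.
IO.select([waiting_io])
data = rt1.resume
```

In the real design the wait in step 3 happens on the Timer thread while other Ruby threads keep running on the shared native threads.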
`sleep(n)` operation flow on Ruby thread RT1:
1. register timeout of RT1 to TT epoll.
2. If TT detects the timeout of RT1 (n seconds), TT makes RT1 as a ready Ruby thread.
### Internal design
* 2 level scheduling
* Ruby threads of a Ractor are managed by 1:N threads
* Ruby threads of different Ractors are managed by M:N threads
* Timer thread has several duties
1. Monitoring I/O (or other event) ready-ness
2. Monitoring timeout
3. Produce timeslice signals
4. Help OS signal delivering
(In pthread environments) recent Ruby doesn't create a timer thread, but the MaNy implementation always creates the TT. This can be improved.
## Implementation
The code name is the MaNy project; the name comes from M:N threads.
https://github.com/ko1/ruby/tree/many2
The implementation is not mature yet (debugging now).
## Measurements
See RubyKaigi 2023 slides: https://atdot.net/~ko1/activities/2023_rubykaigi2023.pdf
## Discussion
* Enable/disable
* default behavior
* how to switch the behavior
* Should we lock the NT for main thread anytime?
* Ruby/C API to lock the native threads
## Misc
This description will be improved more later.