
Issue #21039 has been updated by Eregon (Benoit Daloze).
It makes very little sense to me that example 4 is fine but example 2 isn't.
Ideally example 4 would also be forbidden. It just seems difficult to do it, but maybe there are ways?
To implement it would require examining a CFG (which we don't currently build in CRuby) and find every assignment which could come after.
Yes it requires some non-trivial work (maybe this is easier if this is done in `compile.c/compile_prism.c`? They already deal with assigning indices to local vars, resolving captured vars, etc). That work is required to make this new primitive safe enough to not break things very far away in the program. If we have to choose to do some more work and be safe being vs being unsafe, I hope you agree it's worth being safe and not breaking language semantics? Using the example of https://bugs.ruby-lang.org/issues/21039#note-20: ```ruby counter = 0 get "/" do # assume the proc gets copied here so counter is 0 "Hello world #{counter}" counter += 1 end counter += 123 ``` Assuming somewhere deep in Sinatra or Puma or so `Ractor.shareable_proc` is called with that block, I think it's a clear violation of the most basic semantics of Ruby blocks if that block doesn't see updates of `counter`. If people or I see this problem, I think they would react "WTF, Ruby is broken, nothing can be relied on anymore, not even local variable assignments". So I think there are three ways forward: * Don't allow `Ractor.shareable_proc` blocks to use any variable from the environment (like current `Ractor.new { ... }`) * Allow `Ractor.shareable_proc` to use variables from the environment, but make sure they are not reassigned so there are no broken semantics. Edge cases like `eval` and `binding` seem a much smaller issue if they notice the copy, though clearly not ideal. * Only allow `Ractor.shareable_proc` with a literal block, then it's clear enough that block has special semantics different from regular blocks (like a 3rd type besides proc & lambda). What has been agreed in https://bugs.ruby-lang.org/issues/21039#note-12.
All the other new rules are great and improve consistency, but I don't think this one is viable or necessary.
Actually you already need to scan rescue and inner blocks for rule 3 (The proc shouldn't be allowed to mutate a shared environment), for cases like: ```ruby foo = 123 Ractor.shareable_proc { 1.tap { begin; expr; rescue; foo += 1; end } } ```
So it all amounts to a lot of work that's more surprising to the user.
I think it's clear the user would be surprised in far less cases.
I don't think this is a significant departure from existing semantics.
Really, you think ignoring variable assignments is not a significant departure from existing semantics? BTW there is Proc#dup, of course it doesn't copy the environment. These new semantics are clearly a huge departure from existing block semantics.
It's also not dissimilar to using, say, instance_exec to change self (only here it's a different environment).
True that there is similarity. Also `instance_exec` can already be problematic. The big assumption there is a given block must always be executed with the same kind of `self`, e.g. instances of a given class or subclasses. If it's called with unrelated `self`s, most likely things would break, e.g. any method call inside the block would break unless that method is defined on both objects. `instance_exec` is mostly used for DSLs and there the `self` is always that DSL object. That's fine, because it's predictable and consistent, a given block in the source program gets called with `self` of the same class. It's the same thing as a given block in the program is either a lambda or proc, but never both (except some artificial edge cases). It's not guaranteed for `instance_exec`, but if misused it would be correct to blame whatever code is misusing it, and the worst case is a wrong `self`, which can be debugged with just `p self`. The worse case with unsafe `Ractor.shareable_proc` is extreme confusion (very hard to debug), data loss or corruption (due to ignored assignments) and potentially even security vulnerabilities as a result. ---------------------------------------- Feature #21039: Ractor.make_shareable breaks block semantics (seeing updated captured variables) of existing blocks https://bugs.ruby-lang.org/issues/21039#change-114292 * Author: Eregon (Benoit Daloze) * Status: Assigned * Assignee: ko1 (Koichi Sasada) ---------------------------------------- ```ruby def make_counter count = 0 nil.instance_exec do [-> { count }, -> { count += 1 }] end end get, increment = make_counter reader = Thread.new { sleep 0.01 loop do p get.call sleep 0.1 end } writer = Thread.new { loop do increment.call sleep 0.1 end } ractor_thread = Thread.new { sleep 1 Ractor.make_shareable(get) } sleep 2 ``` This prints: ``` 1 2 3 4 5 6 7 8 9 10 10 10 10 10 10 10 10 10 10 10 ``` But it should print 1..20, and indeed it does when commenting out the `Ractor.make_shareable(get)`. This shows a given block/Proc instance is concurrently broken by `Ractor.make_shareable`, IOW Ractor is breaking fundamental Ruby semantics of blocks and their captured/outer variables or "environment". It's expected that `Ractor.make_shareable` can `freeze` objects and that may cause some FrozenError, but here it's not a FrozenError, it's wrong/stale values being read. I think what should happen instead is that `Ractor.make_shareable` should create a new Proc and mutate that. However, if the Proc is inside some other object and not just directly the argument, that wouldn't work (like `Ractor.make_shareable([get])`). So I think one fix would to be to only accept Procs for `Ractor.make_shareable(obj, copy: true)`. FWIW that currently doesn't allow Procs, it gives `<internal:ractor>:828:in 'Ractor.make_shareable': allocator undefined for Proc (TypeError)`. It makes sense to use `copy` here since `make_shareable` effectively takes a copy/snapshot of the Proc's environment. I think the only other way, and I think it would be a far better way would be to not support making Procs shareable with `Ractor.make_shareable`. Instead it could be some new method like `isolated { ... }` or `Proc.isolated { ... }` or `Proc.snapshot_outer_variables { ... }` or so, only accepting a literal block (to avoid mutating/breaking an existing block), and that would snapshot outer variables (or require no outer variables like Ractor.new's block, or maybe even do `Ractor.make_shareable(copy: true)` on outer variables) and possibly also set `self` since that's anyway needed. That would make such blocks with different semantics explicit, which would fix the problem of breaking the intention of who wrote that block and whoever read that code, expecting normal Ruby block semantics, which includes seeing updated outer variables. Related: #21033 https://bugs.ruby-lang.org/issues/18243#note-5 Extracted from https://bugs.ruby-lang.org/issues/21033#note-14 -- https://bugs.ruby-lang.org/