April 2023 - ruby-core - ml.ruby-lang.org

[ruby-core:113267] [Ruby master Bug#4040] SystemStackError with Hash[*a] for Large _a_
by Eregon (Benoit Daloze) 16 Apr '23

Issue #4040 has been updated by Eregon (Benoit Daloze).

@jeremyevans0
> I rebased my branch against master, and then ran all of the app_* benchmarks, here are the results:

Are the +N% there improvements or regressions?
From those numbers it sounds like `+` would be regressions (i.e., more time to execute the same thing).

---

I am thinking a bit more about the implications of this for Ruby implementations and JITs.

Only passing on the stack means not allowed to pass a huge number of arguments (the case on TruffleRuby).
Only passing as a heap array seems inefficient in general (would cause extra allocations, at least in interpreter, for `foo(1, 2)`).

I guess one could use 2 different calling conventions, on stack if no rest parameter, on heap if there is a rest parameter.
But more calling conventions is a clear cost as it causes extra checks for every call, even more so for polymorphic call site (+ it's messy to do callee-specific logic in the caller).

If supporting to pass both arguments on the stack or in a heap array, then the called method (the callee) will most likely need to branch and find out from where to read arguments.
It seems always an anti-pattern to have the callee need to deal with two calling conventions.
That may actually be easier to deal with in C because a `VALUE*` pointer can represent both, then it would be one check on method entry for which pointer and size to use.
In Java, if passing arguments as an Object[] and having hidden arguments at the start of the array, there is no way to share the logic with a Ruby Array from the heap, or it would need some offset for every argument access, which seems very expensive.
I suppose one could technically compile 2 variants of a method, one for on stack and one for heap array, but it seems very expensive from a warmup and memory perspective, and it's again costing more calling conventions.
Also when using array storage strategies, the array might be int[] behind the scenes and then passing it as a single argument vs a splat is so so so much faster.

Basically, I think efficient Ruby implementations and JITs might not want to deal with the complexity of on-heap arguments.
Such usage pattern is intrinsically inefficient.
For example `m(:name, *array)` is quite expensive if array is big, `m(:name, array)` is strictly better from a performance POV.
`m(*array)` can at best be as fast as `m(array)`, but can be much worse, e.g. if passed on stack (and < 128 for your PR) or if `array` is a `int[]`.

Of course CRuby devs will decide what they want here.
The real issue is if CRuby accepts this:

* There is probably no hope to ever revert that decision and to remove those costs, because some code will likely start to depend on it.
* It might encourage Ruby users to abuse splats more since they seem not much slower than non-splat on CRuby.

----------------------------------------
Bug #4040: SystemStackError with Hash[*a] for Large _a_
https://bugs.ruby-lang.org/issues/4040#change-102829

* Author: runpaint (Run Paint Run Run)
* Status: Open
* Priority: Normal
* Assignee: ko1 (Koichi Sasada)
* ruby -v: ruby 1.9.3dev (2010-11-09 trunk 29737) [x86_64-linux]
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
=begin
 I've been hesitating over whether to file a ticket about this, so please feel free to close if I've made the wrong choice.

 I often use Hash[*array.flatten] in IRB to convert arrays of arrays into hashes. Today I noticed that if the array is big enough, this would raise a SystemStackError. Puzzled, I looked deeper. I assumed I was hitting the maximum number of arguments a method's argc can hold, but realised that the minimum size of the array needed to trigger this exception differed depending on whether I used IRB or not. So, presumably this is indeed exhausting the stack...

 In IRB, the following is the minimal reproduction of this problem:

   Hash[*130648.times.map{ 1 }]; true

 I haven't looked for the minimum value needed with `ruby -e`, but the following reproduces:

   ruby -e 'Hash[*1380888.times.map{ 1 }]'

 I suppose this isn't technically a bug, but maybe it offers another argument for either #666 or an extension of #3131.
=end

-- https://bugs.ruby-lang.org/
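Archive note: the splat versus single-argument distinction discussed in this thread can be illustrated with a minimal sketch. The sample data and the `sum_all_*` method names below are hypothetical, and the array is kept small on purpose so the splat form does not hit the stack limit described in the report.

```ruby
# Hypothetical sample data: an array of [key, value] pairs.
pairs = Array.new(1_000) { |i| ["k#{i}", i] }

# Splat form from the report: every element becomes a separate argument,
# so a large enough array can exhaust the VM stack on affected versions.
via_splat = Hash[*pairs.flatten]

# Non-splat forms: the array travels as one argument.
via_array = Hash[pairs]
via_to_h  = pairs.to_h

# The same distinction for ordinary methods (made-up names).
def sum_all_splat(*nums)
  nums.sum
end

def sum_all_array(nums)
  nums.sum
end

numbers = (1..100).to_a
splat_total = sum_all_splat(*numbers) # elements passed individually
array_total = sum_all_array(numbers)  # array passed as a single object
```

Both routes give the same results; the difference is only in how the arguments reach the callee, which is the cost the messages above are weighing.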

[ruby-core:113265] [Ruby master Bug#4040] SystemStackError with Hash[*a] for Large _a_
by k0kubun (Takashi Kokubun) 16 Apr '23

Issue #4040 has been updated by k0kubun (Takashi Kokubun).

> It would be good to get benchmark results from Linux, so if someone could contribute that, I would appreciate it.

I have a linux-x86_64 environment with CPU frequency scaling disabled, so I benchmarked [your PR](https://github.com/ruby/ruby/pull/7522) with yjit-bench. I used `--category headline` because other benchmarks are less important in practice.

```
before: ruby 3.3.0dev (2023-04-14T03:43:46Z master 3733ee835b) [x86_64-linux]
after: ruby 3.3.0dev (2023-04-15T06:35:36Z large-array-splat-.. a0eb73211c) [x86_64-linux]

--------------  -----------  ----------  ----------  ----------  ------------  -------------
bench           before (ms)  stddev (%)  after (ms)  stddev (%)  before/after  after 1st itr
activerecord    65.6         0.4         65.8        0.3         1.00          0.99
erubi_rails     18.7         1.3         18.7        12.4        1.00          1.02
hexapdf         2164.2       0.4         2181.5      0.8         0.99          0.98
liquid-c        57.0         1.6         57.0        1.9         1.00          0.99
liquid-compile  52.6         0.6         52.4        0.6         1.00          1.00
liquid-render   139.8        1.3         140.2       1.1         1.00          1.00
mail            118.1        0.1         118.7       0.2         1.00          1.00
psych-load      1707.5       0.2         1750.9      0.1         0.98          0.97
railsbench      1907.4       0.8         1929.4      0.8         0.99          0.99
ruby-lsp        59.5         10.3        59.4        11.9        1.00          0.98
sequel          65.6         0.2         65.6        0.2         1.00          1.00
--------------  -----------  ----------  ----------  ----------  ------------  -------------
```

I tried running them a few times. In `before/after`, psych-load is stably 2% slower. hexapdf and railsbench show a 1% slowdown, which may be insignificant. Other benchmarks seem to have no difference.
----------------------------------------
Bug #4040: SystemStackError with Hash[*a] for Large _a_
https://bugs.ruby-lang.org/issues/4040#change-102826

-- https://bugs.ruby-lang.org/

[ruby-core:113264] [Ruby master Bug#4040] SystemStackError with Hash[*a] for Large _a_
by jeremyevans0 (Jeremy Evans) 15 Apr '23

Issue #4040 has been updated by jeremyevans0 (Jeremy Evans).

I ran yjit-bench with both the master branch and the PR branch. Here are the results:

```
Total time spent benchmarking: 5172s

master: ruby 3.3.0dev (2023-04-14T03:43:46Z master 3733ee835b) [x86_64-openbsd7.3]
heap_argv: ruby 3.3.0dev (2023-04-15T06:35:36Z large-array-splat-.. a0eb73211c) [x86_64-openbsd7.3]

--------------  -----------  ----------  --------------  ----------  ----------------  -----------------
bench           master (ms)  stddev (%)  heap_argv (ms)  stddev (%)  master/heap_argv  heap_argv 1st itr
activerecord    150.8        2.4         150.9           2.1         1.00              0.97
erubi_rails     52.6         7.9         52.9            8.1         1.00              1.05
hexapdf         6996.9       1.0         6925.5          0.6         1.01              1.11
liquid-c        177.5        2.0         175.3           1.5         1.01              1.03
liquid-compile  165.6        2.8         165.5           2.0         1.00              1.01
liquid-render   372.2        0.6         374.9           1.6         0.99              1.01
mail            389.6        0.7         394.0           2.0         0.99              1.01
psych-load      6431.2       0.2         6356.4          0.3         1.01              1.01
railsbench      4654.3       0.3         4696.0          0.6         0.99              0.99
ruby-lsp        159.6        6.0         155.6           5.8         1.03              1.05
sequel          215.0        2.6         214.8           0.9         1.00              1.00
binarytrees     840.2        0.3         840.2           0.9         1.00              0.99
chunky_png      2710.0       0.2         2739.4          0.4         0.99              0.98
erubi           732.7        1.6         726.9           1.1         1.01              1.02
etanni          984.5        1.6         974.1           0.5         1.01              1.01
fannkuchredux   4282.9       0.2         4334.9          0.2         0.99              0.99
lee             3625.8       0.4         3594.9          0.3         1.01              1.01
nbody           183.7        0.9         178.7           0.2         1.03              1.03
optcarrot       9673.8       1.0         9626.9          0.9         1.00              1.01
ruby-json       9889.0       0.1         9848.9          0.4         1.00              1.01
rubykon         23063.9      0.5         22953.8         0.3         1.00              1.00
30k_ifelse      3829.2       0.5         3824.4          1.0         1.00              0.99
30k_methods     7761.7       0.2         7665.6          0.2         1.01              1.01
cfunc_itself    327.7        0.3         326.6           0.5         1.00              1.00
fib             466.2        0.3         469.2           0.6         0.99              1.00
getivar         221.7        0.6         222.1           0.3         1.00              1.00
keyword_args    652.6        0.2         653.2           0.3         1.00              1.00
respond_to      893.4        0.2         909.5           0.1         0.98              0.98
setivar         148.3        0.3         143.5           0.2         1.03              1.01
setivar_object  295.0        0.5         291.6           0.5         1.01              1.01
setivar_young   295.1        0.4         291.6           0.6         1.01              1.01
str_concat      231.9        3.2         211.0           2.3         1.10              1.07
throw           40.2         1.3         41.6            9.2         0.97              0.98
--------------  -----------  ----------  --------------  ----------  ----------------  -----------------
Legend:
- master/heap_argv: ratio of master/heap_argv time. Higher is better for heap_argv. Above 1 represents a speedup.
- heap_argv 1st itr: ratio of master/heap_argv time for the first benchmarking iteration.
```

So it looks like it is slower on 8 benchmarks (6 1% slower, 1 2% slower, 1 3% slower), and faster on 13 benchmarks (9 1% faster, 3 3% faster, 1 10% faster). So on the whole, it looks like a net performance increase.

It would be good to get benchmark results from Linux, so if someone could contribute that, I would appreciate it.

----------------------------------------
Bug #4040: SystemStackError with Hash[*a] for Large _a_
https://bugs.ruby-lang.org/issues/4040#change-102825

-- https://bugs.ruby-lang.org/
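Archive note: the `master/heap_argv` column is a plain ratio of the two timings, as the legend states. A minimal sketch of that arithmetic, using two rows from the table above (`speed_ratio` is a made-up helper name, and rounding to two decimals is an assumption about how the report was formatted):

```ruby
# Ratio as described in the legend: master time divided by heap_argv time.
# Values above 1 mean the heap_argv branch took less time.
def speed_ratio(master_ms, heap_argv_ms)
  (master_ms / heap_argv_ms).round(2)
end

str_concat_ratio = speed_ratio(231.9, 211.0) # str_concat row
throw_ratio      = speed_ratio(40.2, 41.6)   # throw row

puts "str_concat: #{str_concat_ratio}"
puts "throw:      #{throw_ratio}"
```

Read this way, a value such as 0.99 corresponds to the "1% slower" bucket in the summary, and 1.03 to "3% faster".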

[ruby-core:111990] [Ruby master Bug#19371] Having Psych 5 installed raises an error during another gem's C-extension installation when parsing YAML
by tombruijn (Tom de Bruijn) 14 Apr '23

Issue #19371 has been reported by tombruijn (Tom de Bruijn).

----------------------------------------
Bug #19371: Having Psych 5 installed raises an error during another gem's C-extension installation when parsing YAML
https://bugs.ruby-lang.org/issues/19371

* Author: tombruijn (Tom de Bruijn)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [aarch64-linux]
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
## Summary

There's an issue on Ruby versions with Psych 4 installed by default (Ruby 2.6 through 3.1) after installing the Psych gem version 5. This problem occurs when a Ruby gem has a C-extension installation script that parses a YAML string.

I'm reporting it here and not on the Psych gem repo, because it looks more like an issue with which Ruby C-extension is loaded during another gem's C-extension installation.

## Background

I have a gem that parses a YAML string in the C-extension installation script, or it calls `Gem.configuration[:http_proxy]`, which parses the `.gemrc` file as YAML. This triggers the error mentioned below. This YAML parsing is done in the gem's `ext/extconf.rb` file.

An example gem can be found in this repository: https://github.com/tombruijn/yaml-dummy-gem, see the [`ext/extconf.rb` file](https://github.com/tombruijn/yaml-dummy-gem/blob/main/ext/extconf.rb#….

## The problem

On Ruby 3.1.3 Psych version 4 is installed by default. When it parses the YAML file, it will use Psych 4. When Psych 5 is also installed on Ruby 3.1.3, it is no longer able to parse the YAML file. The following error is raised:

```
$ bundle install
Fetching https://github.com/tombruijn/yaml-dummy-gem.git
Resolving dependencies...
Using bundler 2.3.7
Using yaml-dummy-gem 1.0.0 from https://github.com/tombruijn/yaml-dummy-gem.git (at main@a48852d)
Gem::Ext::BuildError: ERROR: Failed to build gem native extension.

    current directory: /usr/local/bundle/bundler/gems/yaml-dummy-gem-a48852dac33d/ext
/usr/local/bin/ruby -I /usr/local/lib/ruby/3.1.0 -r ./siteconf20230123-730-rmbnnl.rb extconf.rb
/usr/local/lib/ruby/3.1.0/psych.rb:459:in `parse_stream': undefined method `parse' for #<Psych::Parser:0x0000ffff8078c7f8 @handler=#<Psych::Handlers::DocumentStream:0x0000ffff8078c910 @stack=[], @last=nil, @root=nil, @start_line=nil, @start_column=nil, @end_line=nil, @end_column=nil, @block=#<Proc:0x0000ffff8078c848 /usr/local/lib/ruby/3.1.0/psych.rb:399>>, @external_encoding=0> (NoMethodError)

      parser.parse yaml, filename
            ^^^^^^
        from /usr/local/lib/ruby/3.1.0/psych.rb:399:in `parse'
        from extconf.rb:3:in `<main>'

extconf failed, exit code 1

Gem files will remain installed in /usr/local/bundle/bundler/gems/yaml-dummy-gem-a48852dac33d for inspection.
Results logged to /usr/local/bundle/bundler/gems/extensions/aarch64-linux/3.1.0/yaml-dummy-gem-a48852dac33d/gem_make.out

  /usr/local/lib/ruby/3.1.0/rubygems/ext/builder.rb:95:in `run'
  /usr/local/lib/ruby/3.1.0/rubygems/ext/ext_conf_builder.rb:47:in `block in build'
  /usr/local/lib/ruby/3.1.0/tempfile.rb:317:in `open'
  /usr/local/lib/ruby/3.1.0/rubygems/ext/ext_conf_builder.rb:26:in `build'
  /usr/local/lib/ruby/3.1.0/rubygems/ext/builder.rb:161:in `build_extension'
  /usr/local/lib/ruby/3.1.0/rubygems/ext/builder.rb:195:in `block in build_extensions'
  /usr/local/lib/ruby/3.1.0/rubygems/ext/builder.rb:192:in `each'
  /usr/local/lib/ruby/3.1.0/rubygems/ext/builder.rb:192:in `build_extensions'
  /usr/local/lib/ruby/3.1.0/rubygems/installer.rb:853:in `build_extensions'
  /usr/local/lib/ruby/3.1.0/bundler/rubygems_gem_installer.rb:71:in `build_extensions'
  /usr/local/lib/ruby/3.1.0/bundler/source/path/installer.rb:34:in `post_install'
  /usr/local/lib/ruby/3.1.0/bundler/source/path.rb:244:in `generate_bin'
  /usr/local/lib/ruby/3.1.0/bundler/source/git.rb:194:in `install'
  /usr/local/lib/ruby/3.1.0/bundler/installer/gem_installer.rb:54:in `install'
  /usr/local/lib/ruby/3.1.0/bundler/installer/gem_installer.rb:16:in `install_from_spec'
  /usr/local/lib/ruby/3.1.0/bundler/installer/parallel_installer.rb:186:in `do_install'
  /usr/local/lib/ruby/3.1.0/bundler/installer/parallel_installer.rb:177:in `block in worker_pool'
  /usr/local/lib/ruby/3.1.0/bundler/worker.rb:62:in `apply_func'
  /usr/local/lib/ruby/3.1.0/bundler/worker.rb:57:in `block in process_queue'
  /usr/local/lib/ruby/3.1.0/bundler/worker.rb:54:in `loop'
  /usr/local/lib/ruby/3.1.0/bundler/worker.rb:54:in `process_queue'
  /usr/local/lib/ruby/3.1.0/bundler/worker.rb:91:in `block (2 levels) in create_threads'

An error occurred while installing yaml-dummy-gem (1.0.0), and Bundler cannot continue.

In Gemfile:
  yaml-dummy-gem
```

## Debugging results

The error is raised because the `Psych::Parser#parse` method cannot be found.

In Psych version 4, this method is [defined by the Psych C-extension](https://github.com/ruby/psych/blob/2c3708e0a483c6d44ebddaff0b5….

In Psych version 5 the `parse` method is [defined in the gem's Ruby code](https://github.com/ruby/psych/blob/1f23e6e7f0ab4a6efab598c1ee528bb52d…. This method calls a [C function registered as the private `_native_parse` method](https://github.com/ruby/psych/blob/1f23e6e7f0ab4a6efab598c1ee528bb5…, which is the renamed version of the `parse` C-function in Psych version 4.

From what I can tell, the Psych version 4 C-extension is no longer loaded when Psych version 5 is installed in this scenario. There is a mix-up in which Psych gem version's C-extension is loaded during my dummy gem's C-extension installation. It loads the Psych 4 Ruby code with the Psych 5 C-extension.

I confirmed this by modifying the standard installed Psych gem's code on Ruby 3.1 (Docker image `ruby:3.1`), with the following change, which prints `true` on error. This means the Psych 4 gem has the Psych 5 C-extension loaded where `_native_parse` is defined.
```diff
diff --git lib/psych.rb lib/psych.rb
index 42d79ef..1a690d2 100644
--- lib/psych.rb
+++ lib/psych.rb
@@ -452,6 +452,9 @@ def self.parser
   def self.parse_stream yaml, filename: nil, &block
     if block_given?
       parser = Psych::Parser.new(Handlers::DocumentStream.new(&block))
+      # This returns `true`, but it should be `false`. The `_native_parse`
+      # method is defined in the Psych 5 C-extension, not Psych 4.
+      puts parser.respond_to? :_native_parse, true # => true
       parser.parse yaml, filename
```

This error only occurs during the gem's extension installation in `ext/extconf.rb`. If the gem parses YAML when a Ruby app is running, it will not produce the same error with Psych version 5 installed.

This issue does not occur on Ruby 3.2, where Psych version 5 is installed by default.

I have confirmed this error occurs on the latest patch releases of the following Ruby versions: 3.1, 3.0, 2.7 and 2.6.

```
$ gem list psych
psych (5.0.2, default: 4.0.3)
```

## Code to reproduce

Here is a basic Ruby gem that only parses a YAML file during extension installation: https://github.com/tombruijn/yaml-dummy-gem

Here is a small project that triggers the error: https://github.com/tombruijn/yaml-dummy-ruby-app

A GitHub actions workflow shows the results for all affected Ruby versions: https://github.com/tombruijn/yaml-dummy-ruby-app/actions/runs/3969088933

The [example app repo](https://github.com/tombruijn/yaml-dummy-ruby-app) also has instructions to run the example app locally. Please follow the instructions in the README to see the error.

-- https://bugs.ruby-lang.org/

[ruby-core:113258] [Ruby master Bug#4040] SystemStackError with Hash[*a] for Large _a_
by jeremyevans0 (Jeremy Evans) 14 Apr '23

Issue #4040 has been updated by jeremyevans0 (Jeremy Evans).

I rebased my branch against master, and then ran all of the `app_*` benchmarks, here are the results:

```
app_aobench: +1%
app_erb: 0%
app_factorial: 0%
app_fib: +5%
app_lc_fizzbuzz: 0%
app_mandelbrot: 0%
app_pentomino: -1%
app_raise: 0%
app_strconcat: +8%
app_tak: +4%
app_tarai: +3%
app_uri: -2%
```

For most of the benchmarks, I ran with `--repeat-count 10 --repeat-result best` (some take a long time and I only ran with 1 or 3 instead of 10).

So from this benchmarking, only `app_pentomino` and `app_uri` are slower, by 1-2%. 5 benchmarks are faster, by up to 8%. 5 benchmarks did not show any performance differences.

`app_fib` is showing up 5% faster now, when it was previously showing 1-3% slower. To make sure this wasn't an anomaly, I ran with `--repeat-count 25`, and still got the same results. Again, this could just be due to my environment (OpenBSD), as I cannot think of a reason why `app_fib` would be faster with the changes.

All of these benchmarks are more of the microbenchmark nature. More realistic benchmarks such as yjit-bench on Linux would be better for testing actual differences in performance.

----------------------------------------
Bug #4040: SystemStackError with Hash[*a] for Large _a_
https://bugs.ruby-lang.org/issues/4040#change-102814

-- https://bugs.ruby-lang.org/

[ruby-core:112866] [Ruby master Feature#19528] `JSON.load` defaults are surprising (`create_additions: true`)
by byroot (Jean Boussier) 14 Apr '23

Issue #19528 has been reported by byroot (Jean Boussier).

----------------------------------------
Feature #19528: `JSON.load` defaults are surprising (`create_additions: true`)
https://bugs.ruby-lang.org/issues/19528

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------
I'm not sure if it was actually intended, but there's some tacit naming convention for serializers in Ruby to use `load` and `dump` as methods, likely inspired by `Marshal` and `YAML`.

Because of this it's extremely common to see code that uses `JSON.load` expecting a simple, no surprise, and safe JSON parsing. However that's `JSON.parse`.

`JSON.load` has this very surprising behavior (albeit perfectly documented) of de-serializing more complex types:

```ruby
>> JSON.load('{ "json_class": "String", "raw": [72, 101, 108, 108, 111] }')
=> "Hello"
```

It's particularly weird because aside from the `String` extension that is eagerly defined, for other types you have to `require "json/add/core"`.

Seasoned Ruby developers know about this of course, and [it is banned by various linters](https://www.rubydoc.info/gems/rubocop/RuboCop/Cop/Security/JSONLoad), but it keeps popping up regularly in gem security releases and such.

### Proposal

Assuming entirely removing this feature is not an option, I think `json 2.x` should warn when this feature is actually being used, and `json 3.x` should disable it by default and require users to explicitly use `JSON.load(str, create_additions: true)` to keep the old behavior.

-- https://bugs.ruby-lang.org/
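Archive note: for contrast with the `JSON.load` call quoted in the report, here is a minimal sketch of what `JSON.parse` does with the same payload. It only exercises `JSON.parse` with its defaults; the variable names are illustrative.

```ruby
require "json"

# Same document as in the report above.
payload = '{ "json_class": "String", "raw": [72, 101, 108, 108, 111] }'

# JSON.parse does not apply create_additions by default, so the
# "json_class" key is treated as ordinary data and a Hash comes back.
plain = JSON.parse(payload)

puts plain.class
puts plain["json_class"]
```

The `json_class` marker stays an ordinary key here, which is the "no surprise" behavior the report says callers of `JSON.load` usually expect.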

[ruby-core:113088] [Ruby master Feature#19571] Add REMEMBERED_WB_UNPROTECTED_OBJECTS_LIMIT_RATIO to the GC
by peterzhu2118 (Peter Zhu) 14 Apr '23

Issue #19571 has been reported by peterzhu2118 (Peter Zhu).

----------------------------------------
Feature #19571: Add REMEMBERED_WB_UNPROTECTED_OBJECTS_LIMIT_RATIO to the GC
https://bugs.ruby-lang.org/issues/19571

* Author: peterzhu2118 (Peter Zhu)
* Status: Open
* Priority: Normal
----------------------------------------
GitHub PR: https://github.com/ruby/ruby/pull/7577

The proposed PR adds the environment variable `RUBY_GC_HEAP_REMEMBERED_WB_UNPROTECTED_OBJECTS_LIMIT_RATIO` which is used to calculate the `remembered_wb_unprotected_objects_limit` using a ratio of `old_objects`. This should improve performance by reducing major GC because, in a major GC, we mark all of the old objects, so we should have more uncollectible WB unprotected objects before starting a major GC. The default has been set to 0.01 (1% of old objects).

On one of [Shopify's highest traffic Ruby apps, Storefront Renderer](https://shopify.engineering/how-shopify-reduced-storefront-response-times-rewrite), we saw significant improvements after deploying this patch in production. In the graphs below, we have the `tuned` group which uses `RUBY_GC_HEAP_REMEMBERED_WB_UNPROTECTED_OBJECTS_LIMIT_RATIO=0.01` (the default value), and an `untuned` group, which turns this feature off with `RUBY_GC_HEAP_REMEMBERED_WB_UNPROTECTED_OBJECTS_LIMIT_RATIO=0`. We see that the tuned group spends significantly less time in GC, on average 0.67x of the time compared to the untuned group and 0.49x for p99. We see this improvement in GC time translate to improvements in response times. The average response time is now 0.96x of the time compared to the untuned group and 0.86x for p99.

![](Screenshot%202023-04-03%20at%2011.39.06%20AM.png)

---Files--------------------------------
Screenshot 2023-04-03 at 11.39.06 AM.png (554 KB)

-- https://bugs.ruby-lang.org/
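Archive note: a rough sketch of the arithmetic the proposal describes, where a limit is derived as a fraction of `old_objects`. The helper `wb_unprotected_limit` is hypothetical and deliberately simplistic; the actual computation in the PR may combine this with other limits. The `GC.stat` keys used are the ones CRuby exposes, fetched defensively in case another implementation omits them.

```ruby
# Hypothetical helper: a limit expressed as a plain fraction of old objects.
# This is an illustration of "1% of old objects", not the GC's real formula.
def wb_unprotected_limit(old_objects, ratio)
  (old_objects * ratio).round
end

stat = GC.stat
old_objects = stat.fetch(:old_objects, 0)

puts "old_objects:            #{old_objects}"
puts "1% of old objects:      #{wb_unprotected_limit(old_objects, 0.01)}"
puts "current reported limit: #{stat[:remembered_wb_unprotected_objects_limit].inspect}"
```

With a ratio of 0, the helper yields 0, which mirrors how the report describes the `untuned` group turning the ratio-based limit off.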

[ruby-core:113222] [Ruby master Bug#19598] Inconsistent behaviour of TracePoint API
by bgdimitrov (Bogdan Dimitrov) 14 Apr '23

Issue #19598 has been reported by bgdimitrov (Bogdan Dimitrov).

----------------------------------------
Bug #19598: Inconsistent behaviour of TracePoint API
https://bugs.ruby-lang.org/issues/19598

* Author: bgdimitrov (Bogdan Dimitrov)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.1.4p223 (2023-03-30 revision 957bb7cb81) [x86_64-darwin22]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
Hello, I am seeing inconsistent behaviour of the TracePoint API. If I raise an error from within the `:raise` event block it crashes the entire program with an `exception reentered (fatal)` next time any error is raised. However if I add a simple `if` check in the `:raise` event block the same program doesn't crash anymore.

My specific use case is that sometimes when I have `Exception`s being raised in my application they are being handled by ActiveRecord and wrapped in an `ActiveRecord::StatementInvalid`, which is a `StandardError`. The codebase has a lot of `rescue StandardError` statements which swallow the `StatementInvalid` and therefore the `Exception`s get ignored. I would like to bypass the `rescue StandardError` statements in this case.

My current solution is to manually check in every `rescue StandardError` if the `StatementInvalid` has an `Exception` in its `.cause` attribute and, if there is one, re-raise it, but the codebase is very big and this is not a very good solution as every developer needs to remember to do this check if they add a new `rescue StandardError` or modify an existing one.

Using TracePoint to do the aforementioned check before any `rescue` statements are called and then re-raise the Exception seems like a very neat way to automate the handling of these masked `Exception`s. However I am getting inconsistent behaviour from Ruby depending on what code I put inside the `:raise` event handler.

Here are two identical pieces of code apart from an extra `if` check in the second example. The first example crashes with `exception reentered (fatal)`, the second doesn't.

#### Code to reproduce crash

```
require "active_record"

class Test
  def run
    begin
      tp = TracePoint.new(:raise) do |t|
        puts "TracePoint received: #{t.raised_exception.class}"
        raise t.raised_exception.cause
      end
      puts "TracePoint created"
      tp.enable do
        puts "TracePoint enabled"
        # Generate an Exception masked as a StatementInvalid
        begin
          raise Exception
        catch Exception
          raise ActiveRecord::StatementInvalid
        end
      end
    rescue Exception => e
      puts "Got Exception instead of StatementInvalid"
    end
  end
end

t = Test.new
t.run

begin
  raise ArgumentError
rescue ArgumentError => e
  puts "Never reach here"
end
```

#### Output

```
TracePoint created
TracePoint enabled
TracePoint received: Exception
Got Exception instead of StatementInvalid
tp_test2.rb: exception reentered (fatal)
```

#### Code that doesn't crash, extra if check on line 8

```
require "active_record"

class Test
  def run
    begin
      tp = TracePoint.new(:raise) do |t|
        puts "TracePoint received: #{t.raised_exception.class}"
        if t.raised_exception.instance_of?(ActiveRecord::StatementInvalid)
          raise t.raised_exception.cause
        end
      end
      puts "TracePoint created"
      tp.enable do
        puts "TracePoint enabled"
        # Generate an Exception masked as a StatementInvalid
        begin
          raise Exception
        catch Exception
          raise ActiveRecord::StatementInvalid
        end
      end
    rescue Exception => e
      puts "Got Exception instead of StatementInvalid"
    end
  end
end

t = Test.new
t.run

begin
  raise ArgumentError
rescue ArgumentError => e
  puts "Never reach here"
end
```

#### Output

```
TracePoint created
TracePoint enabled
TracePoint received: Exception
Got Exception instead of StatementInvalid
Never reach here
```

-- https://bugs.ruby-lang.org/
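Archive note: a minimal sketch of a `:raise` hook that only records each exception and its `cause`, without raising from inside the hook. It is not the reporter's code and does not use ActiveRecord; `observe_raises` and the exception classes chosen are illustrative assumptions.

```ruby
# Record each raised exception class together with the class of its cause.
# The hook only observes; it never raises from inside the :raise event.
def observe_raises
  seen = []
  tp = TracePoint.new(:raise) do |t|
    error = t.raised_exception
    seen << [error.class, error.cause&.class]
  end

  tp.enable do
    begin
      begin
        raise IOError, "low-level failure"
      rescue IOError
        # Raising inside a rescue sets #cause to the rescued error.
        raise ArgumentError, "wrapper"
      end
    rescue ArgumentError
      # Swallowed on purpose, like a broad rescue in application code.
    end
  end

  seen
end

observed = observe_raises
observed.each { |klass, cause| puts "#{klass} (cause: #{cause.inspect})" }
```

Collecting the data in the hook and acting on it after the traced block returns sidesteps the question raised in the report of what happens when the hook itself raises.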

[ruby-core:113244] [Ruby master Bug#4040] SystemStackError with Hash[*a] for Large _a_
by ko1 (Koichi Sasada) 14 Apr '23

Issue #4040 has been updated by ko1 (Koichi Sasada).

jeremyevans0 (Jeremy Evans) wrote in #note-18:
> The bmethod/send/symproc/method_missing optimizations are all very large in certain cases and should definitely be included.

I mean how much "certain cases" are there in apps.
Debug counter feature in `debug_counter.[ch]` will help to confirm such statistics.

----------------------------------------
Bug #4040: SystemStackError with Hash[*a] for Large _a_
https://bugs.ruby-lang.org/issues/4040#change-102799

-- https://bugs.ruby-lang.org/

[ruby-core:113242] [Ruby master Bug#4040] SystemStackError with Hash[*a] for Large _a_
by jeremyevans0 (Jeremy Evans) 14 Apr '23

Issue #4040 has been updated by jeremyevans0 (Jeremy Evans). ko1 (Koichi Sasada) wrote in #note-17: > Quote from devmeeting agenda https://bugs.ruby-lang.org/issues/19525: > > > The fix results in a minor performance decrease in microbenchmarks. > > Could you show more details (results)? In terms of existing benchmarks: * For app_fib benchmark about 1-3% decrease. * vm_send and vm_send_var benchmark improves 3-5% due to the send optimization. I'll try to do some more benchmarking tomorrow and report back. Is there a decent real world benchmark in `benchmarks` I can use? With the patch set, some microbenchmarks are slower, but some cases I optimized (bmethod/send/symproc/method_missing) are much faster (over 2x). A real world benchmark would be more useful to determine the actual performance differences. I do most of my development on OpenBSD, which is a bit suboptimal for benchmarking small differences in performance in my experience (possibly due to the additional randomization). If someone could run yjit-bench on the pull request branch (in interpreter mode), that would be very helpful. > Do you have an analysis which line(s) makes slower? Unfortunately, I don't. My guess would be it is due to the additional branches in `CALLER_SETUP_ARG` and checking for `calling->heap_argv`. > I don't think this feature should be rejected. It is cool to support this feature (long splat can be accepted by rest argument). > However, personally speaking I feel the proposed patch (https://github.com/ruby/ruby/pull/7522) is too complex for future maintenance comparing with the benefits from the patch. Agreed. I wish the patch could be made simpler, but I think most of the complexity of the patch is necessary if we want to fix the bug. > Now I have no time to review the patch closely and I couldn't confirm this patch has such issue. > So I agree to merge it (and rewrite them if they can be more improved) because it is well tested. OK. 
Before it is merged, the yjit team needs to make the necessary changes to yjit to support it. Alternatively, they could temporarily disable the parts of yjit this breaks, but from talking to @alanwu, that could result in temporarily disabling a lot of yjit. I think it would be preferable to fix yjit before this is merged.

> I think it is better to have benchmark measurements on some benchmarks, though.

I added some benchmarks related to the optimizations I added, and basic results of those benchmarks in the related commits. As mentioned above, I'll try to do additional benchmarking and report back.

> General comments:
> * Some code are duplicated so maybe they can be more shorter.

OK. I will review and see if I can eliminate the redundant code.

> * (optimization) It is not sure how the optimization target cases are there (optimizations for the minor cases can introduce issues such as i-cache miss, difficulty on future maintenance and so on).

The bmethod/send/symproc/method_missing optimizations are all very large in certain cases and should definitely be included. The cfunc optimizations are limited to specific cases (`*args` or `*args, **kw` with empty `kw`) and not as large even in those cases (10-15% for `*args`, 35-40% for `*args, **kw` with empty `kw`). I'm guessing they are still a net performance improvement, though.

----------------------------------------
Bug #4040: SystemStackError with Hash[*a] for Large _a_
https://bugs.ruby-lang.org/issues/4040#change-102795

* Author: runpaint (Run Paint Run Run)
* Status: Open
* Priority: Normal
* Assignee: ko1 (Koichi Sasada)
* ruby -v: ruby 1.9.3dev (2010-11-09 trunk 29737) [x86_64-linux]
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
=begin
I've been hesitating over whether to file a ticket about this, so please feel free to close if I've made the wrong choice.

I often use Hash[*array.flatten] in IRB to convert arrays of arrays into hashes.
Today I noticed that if the array is big enough, this would raise a SystemStackError. Puzzled, I looked deeper. I assumed I was hitting the maximum number of arguments a method's argc can hold, but realised that the minimum size of the array needed to trigger this exception differed depending on whether I used IRB or not. So, presumably this is indeed exhausting the stack...

In IRB, the following is the minimal reproduction of this problem:

    Hash[*130648.times.map{ 1 }]; true

I haven't looked for the minimum value needed with `ruby -e`, but the following reproduces:

    ruby -e 'Hash[*1380888.times.map{ 1 }]'

I suppose this isn't technically a bug, but maybe it offers another argument for either #666 or an extension of #3131.
=end

--
https://bugs.ruby-lang.org/
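
For readers who hit the error described in the report above, here is a minimal sketch of doing the same conversion without a splat, so the array elements are never pushed as individual arguments. It assumes a Ruby with `Array#to_h` (2.1 or later); `Hash[pairs]` with a single array-of-pairs argument is the older equivalent. The helper names `pairs_to_hash` and `splat_to_hash` and the array size are hypothetical, chosen only for illustration. Whether the splat form raises depends on the Ruby version and the VM stack size, so the sketch rescues `SystemStackError` rather than assuming either outcome.

```ruby
# Hypothetical helpers for illustration; these names are not from the thread.

# Convert an array of [key, value] pairs to a Hash without a splat.
def pairs_to_hash(pairs)
  pairs.to_h
end

# The splat form from the report. Returns nil if the splat exhausts the
# VM stack, which depends on the Ruby version and stack size.
def splat_to_hash(flat)
  Hash[*flat]
rescue SystemStackError
  nil
end

# Assumed size: 200_000 pairs, i.e. 400_000 elements once flattened.
pairs = Array.new(200_000) { |i| [i, i * 2] }

from_to_h = pairs_to_hash(pairs)
from_splat = splat_to_hash(pairs.flatten)

splat_status = from_splat ? "#{from_splat.size} entries" : "SystemStackError"
puts "pairs.to_h:  #{from_to_h.size} entries"
puts "Hash[*flat]: #{splat_status}"
```

The non-splat forms pass a single argument (the array itself), so their cost in argument slots does not grow with the array size.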
