March 2024 - ruby-core - ml.ruby-lang.org

[ruby-core:116382] [Ruby master Feature#20205] Enable `frozen_string_literal` by default

by byroot (Jean Boussier)

Issue #20205 has been reported by byroot (Jean Boussier).

----------------------------------------
Feature #20205: Enable `frozen_string_literal` by default
https://bugs.ruby-lang.org/issues/20205

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------
### Context

The `frozen_string_literal: true` pragma was introduced in Ruby 2.3, and as far as I'm aware the plan was initially to make it the default for Ruby 3.0, but this plan was abandoned because it would have been too much of a breaking change without any real advance notice.

According to Matz, he still wishes to enable `frozen_string_literal` by default in the future, but a reasonable migration plan is required.

The main issue is backward compatibility: flipping the switch immediately would break a lot of code, so there must be some deprecation period.

The usual path forward for this kind of change is to emit deprecation warnings one or multiple versions in advance.

One example of that was the Ruby 2.7 keyword argument deprecation. It was quite verbose, and some users were initially annoyed, but I think the community pulled through it and I don't seem to hear much about it anymore.

So for frozen string literals, the first step would be to start warning when a string that would be frozen in the future is mutated.

### Deprecation Warning Implementation

I implemented a quick proof of concept with @etienne in https://github.com/Shopify/ruby/pull/549

In short:

- Files with `# frozen_string_literal: true` or `# frozen_string_literal: false` don't change in behavior at all.
- Files with no `# frozen_string_literal` comment are compiled to use the `putchilledstring` opcode instead of the regular `putstring`.
- This opcode marks the string with a user flag; when these strings are mutated, a warning is issued.

Currently the proof of concept issues the warning at the mutation location, which in some cases can make it a bit hard to locate where the string was allocated.
But it is possible to improve it so the message also includes the location at which the literal string was allocated, and learning from the keyword argument warning experience, we can record which warnings were already issued to avoid spamming users with duplicated warnings.

As currently implemented, there is almost no overhead. If we modify the implementation to record the literal location, we'd incur a small memory overhead for each literal string in a file without an explicit `frozen_string_literal` pragma.

But I believe we could do it in a way that has no overhead if `Warning[:deprecated] = false`.

### Timeline

The migration would happen in 3 steps; each step can potentially last multiple releases, e.g. `R0` could be `3.4`, `R1` be `3.7` and `R2` be `4.0`. I don't have a strong opinion on the pace.

- Release `R0`: introduce the deprecation warning (only if deprecation warnings are enabled).
- Release `R1`: make the deprecation warning show up regardless of verbosity level.
- Release `R2`: make string literals frozen by default.

### Impact

Given that `rubocop` is quite popular in the community and it has enforced the usage of `# frozen_string_literal: true` for years now, I suspect a large part of the actively maintained codebases in the wild wouldn't see any warnings.

And with recent versions of `minitest` enabling deprecation warnings by default (and [potentially RSpec too](https://github.com/rspec/rspec-core/issues/2867)), the few that didn't migrate will likely be made compatible quickly.

The real problem, of course, is the less actively developed libraries and applications. For such cases, any codebase can remain compatible by setting `RUBYOPT="--disable=frozen_string_literal"`, even after the `R2` release. The flag would never be removed, so any legacy codebase can continue upgrading Ruby without changing a single line of code by just flipping this flag.
### Workflow for library maintainers

As a library maintainer, fixing the deprecation warnings can be as simple as prepending `# frozen_string_literal: false` at the top of all their source files, and this will keep working forever.

Alternatively they can of course make their code compatible with frozen string literals.

Code that is frozen string literal compatible doesn't need to explicitly declare it. Only code that needs it turned off needs to do so.

### Workflow for application owners

For application owners, the workflow is the same as for libraries.

However, if they depend on a gem that hasn't been updated, or that they can't upgrade, they can run their application with `RUBYOPT="--disable=frozen_string_literal"` and it will keep working forever.

Any user running into an incompatibility issue can set `RUBYOPT="--disable=frozen_string_literal"` forever, even in `4.x`; the only thing changing is the default value.

And any application for which all dependencies have been made fully frozen string literal compatible can set `RUBYOPT="--enable=frozen_string_literal"` and immediately start removing magic comments from their codebase.

-- 
https://bugs.ruby-lang.org/
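As an illustration of what "making code compatible with frozen string literals" can mean in practice, here is a minimal sketch in plain Ruby. It is not taken from the proposal: `build_greeting` is a hypothetical helper, and an explicit `.freeze` stands in for a literal that would be frozen by default after `R2`.

```ruby
# Hypothetical helper, for illustration only.
# Unary plus returns the receiver if it is already mutable, or an
# unfrozen copy if it is frozen, so this works whether or not string
# literals are frozen in the current file.
def build_greeting(name)
  buffer = +"Hello, "
  buffer << name
  buffer
end

# `.freeze` simulates a literal under `frozen_string_literal: true`.
frozen_literal = "abc".freeze

begin
  frozen_literal << "d" # mutating a frozen string raises FrozenError
rescue FrozenError => e
  puts "FrozenError: #{e.message}"
end

puts build_greeting("Ruby")
```

The unary-plus idiom is one common choice; `String.new` or `dup` would serve the same purpose where an explicitly mutable buffer is wanted.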


[ruby-core:116039] [Ruby master Bug#20154] aarch64: configure overrides `-mbranch-protection` if it was set in CFLAGS via environment

by jprokop (Jarek Prokop)

Issue #20154 has been reported by jprokop (Jarek Prokop).

----------------------------------------
Bug #20154: aarch64: configure overrides `-mbranch-protection` if it was set in CFLAGS via environment
https://bugs.ruby-lang.org/issues/20154

* Author: jprokop (Jarek Prokop)
* Status: Open
* Priority: Normal
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
Recently, a GH PR was merged <https://github.com/ruby/ruby/pull/9306> for PAC/BTI support on ARM CPUs for Coroutine.S.

Without proper compilation support in configure.ac, it segfaults Ruby with fibers on CPUs where PAC is supported: https://bugs.ruby-lang.org/issues/20085

At the time of writing, configure.ac takes the first option from a list for the `-mbranch-protection` flag that successfully compiles a program <https://github.com/ruby/ruby/blob/master/configure.ac#L829> and appends it to XCFLAGS, and now also to ASFLAGS, to fix issue 20085 for Ruby master.

This is suboptimal for Fedora, as we set -mbranch-protection=standard by default in C{,XX}FLAGS:
```
CFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Werror=implicit-function-declaration -Werror=implicit-int -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -mbranch-protection=standard -fasynchronous-unwind-tables -fstack-clash-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
export CFLAGS
CXXFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -mbranch-protection=standard -fasynchronous-unwind-tables -fstack-clash-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer'
export CXXFLAGS
```

And the appended flag overrides the distribution's compilation configuration, which in this case ends up omitting BTI instructions and only using PAC.

Would it make sense to check if such flags exist and not overwrite them if they do?

Serious proposals:

1. The simplest fix, which does not overwrite what is set in the distribution and results in higher security, is simply prepending `-mbranch-protection=standard` to the list of options. It should cause no problems on ARMv8 CPUs and later: BTI instructions, like PAC instructions, result in NOPs where unsupported, so this only extends the capability. See the attached 0001-aarch64-Check-mbranch-protection-standard-first-to-u.patch

2. Another fix, which sounds saner IMO and avoids guessing all the correct places for the flag, is what another Fedora contributor, Florian Weimer, suggested: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org…

"The reliable way to do this would be to compile a C file and check whether that enables __ARM_FEATURE_PAC_DEFAULT, and if that's the case, define a *different* macro for use in the assembler implementation. This way, you don't need to care about the exact name of the option."

IOW, instead of using __ARM_FEATURE_* directly in that code, define a macro in the style of "USE_PAC" with the value of the feature if it is defined. I think that way we shouldn't need to append to ASFLAGS anymore.

However, it's also important to capture the values of those macros, as their values have meaning. I have an idea how to do that, but I'd get to it on Monday at the earliest.

---Files--------------------------------
0001-aarch64-Check-mbranch-protection-standard-first-to-u.patch (1004 Bytes)

-- 
https://bugs.ruby-lang.org/


[ruby-core:114070] [Ruby master Bug#19753] IO::Buffer#get_string can't handle negative offset

by noteflakes (Sharon Rosner)

Issue #19753 has been reported by noteflakes (Sharon Rosner).

----------------------------------------
Bug #19753: IO::Buffer#get_string can't handle negative offset
https://bugs.ruby-lang.org/issues/19753

* Author: noteflakes (Sharon Rosner)
* Status: Open
* Priority: Normal
* ruby -v: 3.2
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
```ruby
irb(main):001:0> b = IO::Buffer.for('abc')
=> #<IO::Buffer 0x00007f858f5450c0+3 EXTERNAL READONLY SLICE>
...
irb(main):002:0> b.get_string(-1)
=> "\x00abc"
irb(main):003:0> b.get_string(-1000, 3)
(irb):3:in `get_string': Specified offset+length exceeds data size! (ArgumentError)
        from (irb):3:in `<main>'
        from /home/sharon/.rbenv/versions/3.2.0/lib/ruby/gems/3.2.0/gems/irb-1.7.1/exe/irb:9:in `<top (required)>'
        from /home/sharon/.rbenv/versions/3.2.0/bin/irb:25:in `load'
        from /home/sharon/.rbenv/versions/3.2.0/bin/irb:25:in `<main>'
```

Using a negative offset returns garbage in the string, but it also might segfault:

```ruby
irb(main):003:0> b = IO::Buffer.map(File.open('sgt-nodes.sql', 'r+'))
=> #<IO::Buffer 0x00007f189de14000+2008858 EXTERNAL MAPPED SHARED>
irb(main):004:0> b.get_string(-1000)
(irb):4: [BUG] Segmentation fault at 0x00007f189de13c18
ruby 3.2.0 (2022-12-25 revision a528908271) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0021 p:---- s:0109 e:000108 CFUNC  :get_string
...
```

## Expected behaviour

I think it might be nice to have `#get_string` behave like other methods taking an offset, like `String#[]`. For example:

```ruby
irb(main):001:0> b = IO::Buffer.for('abc')
=> #<IO::Buffer 0x00007f858f5450c0+3 EXTERNAL READONLY SLICE>
...
irb(main):002:0> b.get_string(-1)
=> "c"
irb(main):003:0> b.get_string(-2)
=> "bc"
irb(main):003:0> b.get_string(-1000)
=> "abc"
irb(main):003:0> b.get_string(-1000, 2)
=> "ab"
```

-- 
https://bugs.ruby-lang.org/
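The expected results listed above amount to resolving a negative offset against the data size and clamping it at zero. A rough sketch of that rule in plain Ruby follows, using `String#byteslice` as a stand-in for the buffer. `normalize_offset` and `get_string_like` are hypothetical names, not part of `IO::Buffer`, and the clamping is inferred from the examples in the report rather than from any existing API.

```ruby
# Hypothetical helper: resolve a possibly negative offset against `size`,
# clamping at 0 so that a large negative offset starts at the beginning.
def normalize_offset(offset, size)
  return offset if offset >= 0

  [size + offset, 0].max
end

# Hypothetical stand-in for the proposed `get_string` semantics,
# operating on a String instead of an IO::Buffer.
def get_string_like(data, offset = 0, length = nil)
  start = normalize_offset(offset, data.bytesize)
  length ||= data.bytesize - start
  data.byteslice(start, length)
end

p get_string_like("abc", -1)       # => "c"
p get_string_like("abc", -2)       # => "bc"
p get_string_like("abc", -1000)    # => "abc"
p get_string_like("abc", -1000, 2) # => "ab"
```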


[ruby-core:116460] [Ruby master Bug#20218] aset/masgn/op_asgn with keyword arguments

by jeremyevans0 (Jeremy Evans)

Issue #20218 has been reported by jeremyevans0 (Jeremy Evans).

----------------------------------------
Bug #20218: aset/masgn/op_asgn with keyword arguments
https://bugs.ruby-lang.org/issues/20218

* Author: jeremyevans0 (Jeremy Evans)
* Status: Open
* Priority: Normal
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
I found that use of keyword arguments in multiple assignment is broken in 3.3 and master:

```ruby
h = {a: 1}
o = []
def o.[]=(*args, **kw)
  replace([args, kw])
end

# This segfaults as RHS argument is not a hash
o[1, a: 1], _ = [1, 2]

# This passes the RHS argument as keywords to the method, treating keyword splat as positional argument
o[1, **h], _ = [{b: 3}, 2]
o # => [[1, {:a=>1}], {:b=>3}]
```

Before 3.3, keyword arguments were treated as positional arguments.

This is similar to #19918, but for keyword arguments instead of block arguments.

@matz indicated he wanted to prohibit block arguments in aset/masgn and presumably also op_asgn (making them SyntaxErrors). Can we also prohibit keyword arguments in aset/masgn/op_asgn?

Note that aset treats keyword arguments as regular arguments:

```ruby
o[1, a: 1] = 2
o # => [[1, {:a=>1}, 2], {}]

o[1, **h] = {b: 3}
o # => [[1, {:a=>2}, {:b=>3}], {}]
```

While op_asgn treats keyword arguments as keywords:

```ruby
h = {a: 2}
o = []
def o.[](*args, **kw)
  concat([:[], args, kw])
  x = Object.new
  def x.+(v)
    [:x, v]
  end
  x
end
def o.[]=(*args, **kw)
  concat([:[]=, args, kw])
end

o[1, a: 1] += 2
o # => [:[], [1], {:a=>1}, :[]=, [1, [:x, 2]], {:a=>1}]

o.clear
o[1, **h] += {b: 3}
o # => [:[], [1], {:a=>2}, :[]=, [1, [:x, {:b=>3}]], {:a=>2}]
```

-- 
https://bugs.ruby-lang.org/


[ruby-core:116589] [Ruby master Misc#20238] Use prism for mk_builtin_loader.rb

by kddnewton (Kevin Newton)

Issue #20238 has been reported by kddnewton (Kevin Newton).

----------------------------------------
Misc #20238: Use prism for mk_builtin_loader.rb
https://bugs.ruby-lang.org/issues/20238

* Author: kddnewton (Kevin Newton)
* Status: Open
* Priority: Normal
----------------------------------------
I would like to propose that we use prism for mk_builtin_loader.rb.

Right now the Ruby syntax that you can use in builtin classes is restricted to the base Ruby version (2.7). This means you can't use a lot of the nicer syntax that Ruby has shipped in the last couple of years.

If we switch to using prism to parse the builtin files instead of using ripper, then we can always use the latest version of Ruby syntax. A pull request for this is here: https://github.com/kddnewton/ruby/pull/65. The approach for the PR is taken from how RJIT bindgen works.

-- 
https://bugs.ruby-lang.org/


[ruby-core:112304] [Ruby master Bug#19427] Marshal.load(source, freeze: true) doesn't freeze in some cases

by andrykonchin (Andrew Konchin)

Issue #19427 has been reported by andrykonchin (Andrew Konchin).

----------------------------------------
Bug #19427: Marshal.load(source, freeze: true) doesn't freeze in some cases
https://bugs.ruby-lang.org/issues/19427

* Author: andrykonchin (Andrew Konchin)
* Status: Open
* Priority: Normal
* ruby -v: 3.1
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
I've noticed that the `freeze` option doesn't work in the following cases:
- when dumped object extends a module
- when dumped object responds to `#marshal_dump` and `#marshal_load` methods
- when dumped object responds to `#_dump` method

Is it expected behaviour or a known issue?

Examples:

```ruby
module M
end

object = Object.new
object.extend(M)

object = Marshal.load(Marshal.dump(object), freeze: true)
object.frozen? # => false
```

```ruby
class UserMarshal
  attr_accessor :data

  def initialize
    @data = 'stuff'
  end
  def marshal_dump() :data end
  def marshal_load(data) @data = data end
end

object = Marshal.load(Marshal.dump(UserMarshal.new), freeze: true)
object.frozen? # => false
```

```ruby
class UserDefined
  attr_reader :a, :b

  def initialize
    @a = 'stuff'
    @b = @a
  end

  def _dump(depth)
    Marshal.dump [:stuff, :stuff]
  end

  def self._load(data)
    a, b = Marshal.load data

    obj = allocate
    obj.instance_variable_set :@a, a
    obj.instance_variable_set :@b, b

    obj
  end
end
```

-- 
https://bugs.ruby-lang.org/
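For callers who need the loaded top-level object to be frozen regardless of how this question is resolved, one possible workaround is to freeze it explicitly after loading. The sketch below assumes Ruby 3.1 or later (where `Marshal.load` accepts `freeze:`). `load_frozen` and `Taggable` are hypothetical names, and the helper only guarantees that the top-level object is frozen, not objects nested inside it.

```ruby
# Hypothetical module, used only to reproduce the "extends a module" case.
module Taggable
end

# Hypothetical helper: load with `freeze: true`, then freeze the top-level
# object explicitly in case the option did not apply to it.
def load_frozen(source)
  object = Marshal.load(source, freeze: true)
  object.frozen? ? object : object.freeze
end

object = Object.new
object.extend(Taggable)

copy = load_frozen(Marshal.dump(object))
p copy.frozen? # => true
```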


[ruby-core:112873] [Ruby master Bug#19530] `Array#sum` and `Enumerable#sum` sometimes show different behaviours

by dstosik (David Stosik)

Issue #19530 has been reported by dstosik (David Stosik).

----------------------------------------
Bug #19530: `Array#sum` and `Enumerable#sum` sometimes show different behaviours
https://bugs.ruby-lang.org/issues/19530

* Author: dstosik (David Stosik)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.2.1 (2023-02-08 revision 31819e82c8) [arm64-darwin22]
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
Hi everyone. 👋🏻

We recently discovered that `Array#sum` and `Enumerable#sum` will output different results in some edge cases.

Here's the smallest script I managed to write to reproduce the issue:

``` ruby
class Money
  def initialize(amount)
    @amount = amount.to_f
  end

  def +(other)
    self.class.new(@amount + other.to_f)
  end

  def to_f
    @amount
  end
end

p [7.0].each.sum(Money.new(0)).class #=> Money
p [7.0]     .sum(Money.new(0)).class #=> Float 💥
```

I understand that it is expected that `#sum` may not honor custom definitions of the `#+` method (particularly when the summed values are `Float`).

However, I would like to bring your attention to the fact that, in the example above, calling `#sum` on an `Array` of `Float` values and calling `#sum` on an `Enumerable` that yields the same `Float` values will return results of different types.

I've reproduced the same behaviour with multiple versions of Ruby going from 2.6.5 to 3.2.1.

Ideally, I would expect `[7.0].sum(Money.new(0))` to return a `Money` object identical to the one returned by `[7.0].each.sum(Money.new(0))`.

I think it would make sense if at least they returned an identical value (even if it is a `Float`).

-- 
https://bugs.ruby-lang.org/
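Where the custom `#+` has to be honoured, one option that sidesteps the difference is `inject` with an initial value, which dispatches `+` on the accumulator for every element. The sketch below reuses the `Money` class from the report. `sum_with_plus` is a hypothetical helper name, and this is a possible workaround rather than a statement about how `#sum` should behave.

```ruby
class Money
  def initialize(amount)
    @amount = amount.to_f
  end

  def +(other)
    self.class.new(@amount + other.to_f)
  end

  def to_f
    @amount
  end
end

# Hypothetical helper: fold with `+` so the accumulator's own `#+`
# is always called, for arrays and other enumerables alike.
def sum_with_plus(values, init)
  values.inject(init, :+)
end

p sum_with_plus([7.0], Money.new(0)).class      # => Money
p sum_with_plus([7.0].each, Money.new(0)).class # => Money
```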


[ruby-core:116200] [Ruby master Bug#20183] `erb/escape.so` cannot be loaded when `--with-static-linked-ext`

by nobu (Nobuyoshi Nakada)

Issue #20183 has been reported by nobu (Nobuyoshi Nakada).

----------------------------------------
Bug #20183: `erb/escape.so` cannot be loaded when `--with-static-linked-ext`
https://bugs.ruby-lang.org/issues/20183

* Author: nobu (Nobuyoshi Nakada)
* Status: Open
* Priority: Normal
* Backport: 3.0: REQUIRED, 3.1: REQUIRED, 3.2: REQUIRED, 3.3: REQUIRED
----------------------------------------
Since `cgi/escape.c` and `erb/escape.c` are both initialized by `Init_escape()` functions, both call the same function in `extinit.c`.

-- 
https://bugs.ruby-lang.org/


[ruby-core:114403] [Ruby master Feature#19840] [Proposal] Expand Find pattern to Multiple Find

by FlickGradley (Nick Bradley)

Issue #19840 has been reported by FlickGradley (Nick Bradley).

----------------------------------------
Feature #19840: [Proposal] Expand Find pattern to Multiple Find
https://bugs.ruby-lang.org/issues/19840

* Author: FlickGradley (Nick Bradley)
* Status: Open
* Priority: Normal
----------------------------------------
Hello! I love Ruby's pattern matching features. I would like to propose an expansion of the Find pattern which allows the selection of multiple matching elements of an array.

I often find myself dealing with data like this:

``` ruby
{
  results: [{ id: 1, name: "foo" }, { id: 2, name: "bar" }, ... ]
}
```

My problem is that I need to retrieve all the `id` values from the nested array of hashes, and I don't know how many there will be in advance. It seems that the Find pattern could be expanded from allowing `pattern` (matching a single element) to `*pattern`. Examples:

``` ruby
# Base case
case { results: [{ id: 1, name: "foo" }, { id: 2, name: "bar" }] }
in results: [*{ id: ids }]
  "matched: #{ids}"
else
  "not matched"
end
#=> matched: [1, 2]

# With * at the end (rest of args) - same result
case { results: [{ id: 1, name: "foo" }, { id: 2, name: "bar" }] }
in results: [*{ id: ids }, *]
  "matched: #{ids}"
else
  "not matched"
end
#=> matched: [1, 2]

# When one element doesn't match and there is no *rest
case { results: [{ name: "foo" }, { id: 2, name: "bar" }] }
in results: [*{ id: ids }]
  "matched: #{ids}"
else
  "not matched"
end
#=> not matched
```

Similarly, `*Constant` could work to pull out types with an As pattern:

``` ruby
case [1, 2, 3, "string"]
in *Integer => nums, *
  "matched: #{nums}"
else
  "not matched"
end
#=> matched: [1, 2, 3]
```

Other patterns would work in the same way. Essentially, this expands the concept of `*` in pattern matching to mean "a variable number of `things matching subpattern`". Today, the only pattern supported by `*` is a variable binding, but it could be any of the other subpatterns as well.

This proposal does imply that this would work:

``` ruby
a = 2
[1, 2, 2, 3] in [*, *^a, *] #=> true
```

To me, the `*` represents the variable number of matches, so the syntax makes intuitive sense. But others may have different opinions about `*^` being adjacent.

It may also imply this would work, though we could restrict the number of non-variable patterns (in other words, patterns that have the possibility of not matching) to 1 per Array so that this isn't possible. I'm not sure something like this would be useful or clear.

``` ruby
a = 2
[1, 2, "hello", "ruby"] in [*Integer, *String] #=> true
```

This feature feels like the missing piece of the Find pattern to me: I often want to "Find Multiple". If others agree, I would be happy to contribute by working on this feature and creating a pull request.

-- 
https://bugs.ruby-lang.org/
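For comparison, the base case of the proposal (collect every `id`, and fail if any element lacks one) can be approximated with the pattern matching Ruby already has, at the cost of an explicit loop. The sketch below assumes Ruby 2.7 or later. `extract_ids` is a hypothetical helper name, and returning `nil` for "not matched" is a choice made for this illustration.

```ruby
# Hypothetical helper approximating `in results: [*{ id: ids }]`:
# returns all ids when every element matches `{ id: }`, otherwise nil.
def extract_ids(payload)
  case payload
  in { results: Array => results }
    results.map do |item|
      case item
      in { id: } then id
      else return nil # one element without :id means no match at all
      end
    end
  else
    nil
  end
end

p extract_ids({ results: [{ id: 1, name: "foo" }, { id: 2, name: "bar" }] }) # => [1, 2]
p extract_ids({ results: [{ name: "foo" }, { id: 2, name: "bar" }] })        # => nil
```

The nested `case` is what the proposed `*pattern` form would fold into a single pattern.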


[ruby-core:117368] [Ruby master Bug#20401] Duplicated when clause warning line number

by kddnewton (Kevin Newton)

Issue #20401 has been reported by kddnewton (Kevin Newton).

----------------------------------------
Bug #20401: Duplicated when clause warning line number
https://bugs.ruby-lang.org/issues/20401

* Author: kddnewton (Kevin Newton)
* Status: Open
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
When you have a duplicated when clause, you get a warning for it. For example:

```ruby
case foo
when :bar
when :baz
when :bar
end
```

you get `warning: duplicated `when' clause with line 2 is ignored`. But the when clause that is ignored is the one on line 4, not line 2. It seems like it's warning for the wrong line.

-- 
https://bugs.ruby-lang.org/
