March 2024 - ruby-core - ml.ruby-lang.org

[ruby-core:117241] [Ruby master Feature#8421] add Enumerable#find_map and Enumerable#find_all_map
by jeremyevans0 (Jeremy Evans) 19 Mar '24


Issue #8421 has been updated by jeremyevans0 (Jeremy Evans).

`find_map` seems like a bad name, as there is no map. `map` implies calling the same function over all elements in a collection, and in this case there would be a single element (or none if nothing was found).

Combining `find` and `then` seems like the simplest way now if you don't want to use `break`:

```ruby
emails.find{ pattern.match(it) }&.then{ it[:identifier] }
```

Personally, I would use the following approach, as I think it is clearer:

```ruby
if match = emails.find{ pattern.match(it) }
  match[:identifier]
end
```

----------------------------------------
Feature #8421: add Enumerable#find_map and Enumerable#find_all_map
https://bugs.ruby-lang.org/issues/8421#change-107328

* Author: Hanmac (Hans Mackowiak)
* Status: Feedback
----------------------------------------
Currently, if you have an Enumerable and you want to return the block's return value from `#find`, you need either:

    (o = enum.find(block) && block.call(o)) || nil

or

    enum.inject(nil) {|ret,el| ret || block.call(el)}

Neither of them is as good as a directly provided method.

The same goes for `#find_all_map`:

    enum.lazy.map(&:block).find_all{|el| el}

It may work, but it is not so good.

--
https://bugs.ruby-lang.org/
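As a runnable sketch of the `find` plus `then` combination discussed in this message: the sample `emails` and `pattern` below are assumptions modelled on the thread's example, and explicit block parameters replace `it` so the code also runs on Ruby versions before 3.4. Since `find` returns the matching element rather than the `MatchData`, this variant repeats the match inside `then` to reach the named capture.

```ruby
# Sample data (assumed for illustration, modelled on the thread's example)
emails  = %w[username@domain.com username+thecode@domain.com]
pattern = /\Ausername\+(?<identifier>[a-z0-9]+)@domain\.com\z/i

# find returns the first matching email (a String) or nil;
# &.then is skipped entirely when nothing was found.
via_then = emails
  .find { |email| pattern.match?(email) }
  &.then { |email| pattern.match(email)[:identifier] }

# The same logic written as an explicit conditional.
via_if =
  if (email = emails.find { |e| pattern.match?(e) })
    pattern.match(email)[:identifier]
  end
```

Both forms run the regular expression twice on the element that is found, which is the cost a dedicated `find_map` would avoid.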


[ruby-core:117239] [Ruby master Feature#8421] add Enumerable#find_map and Enumerable#find_all_map
by zverok (Victor Shepelev) 19 Mar '24


Issue #8421 has been updated by zverok (Victor Shepelev).

@alexbarret There is a somewhat lesser-known trick which looks pretty close to your code:

```ruby
# proposal:
find_map(emails) do |email|
  (matches = pattern.match(email)) && matches[:identifier]
end

# a "trick"
emails.find { |email| match = pattern.match(email) and break match[:identifier] }
# => "thecode"
```

It might even be considered two tricks, depending on your point of view: the control-flow `and` allows chaining any statement to it (note it doesn't need extra parentheses after the assignment), and `break value` allows returning a non-standard value from a block.

Not saying it is beautiful, just one more option.

----------------------------------------
Feature #8421: add Enumerable#find_map and Enumerable#find_all_map
https://bugs.ruby-lang.org/issues/8421#change-107326

* Author: Hanmac (Hans Mackowiak)
* Status: Feedback
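The `and break` trick from this message can be tried in isolation with the sketch below. The `emails` list and `pattern` are assumed sample data in the spirit of the thread, not taken from any real application.

```ruby
# Sample data (assumed for illustration)
emails = %w[
  username@domain.com
  username+123@wrongdomain.com
  username+thecode@domain.com
]
pattern = /\Ausername\+(?<identifier>[a-z0-9]+)@domain\.com\z/i

# On a match, `break` leaves `find` immediately and makes the whole
# `find` call evaluate to the capture instead of the element.
identifier = emails.find do |email|
  match = pattern.match(email) and break match[:identifier]
end

# Without any match the block yields nil every time, so `find` returns nil.
missing = %w[username@domain.com].find do |email|
  match = pattern.match(email) and break match[:identifier]
end
```

The regular expression runs once per element here, and the caller still receives `nil` when nothing matches.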


[ruby-core:117238] [Ruby master Feature#16153] eventually_frozen flag to gradually phase-in frozen strings
by Dan0042 (Daniel DeLorme) 19 Mar '24


Issue #16153 has been updated by Dan0042 (Daniel DeLorme).

This proposal is made redundant by #20205 chilled strings. Please close.

----------------------------------------
Feature #16153: eventually_frozen flag to gradually phase-in frozen strings
https://bugs.ruby-lang.org/issues/16153#change-107325

* Author: Dan0042 (Daniel DeLorme)
* Status: Open
----------------------------------------
Freezing strings can give us a nice performance boost, but freezing previously non-frozen strings is a backward-incompatible change which is hard to handle because the place where the string is mutated can be far from where it was frozen, and tests might not cover the cases of frozen input vs non-frozen input.

I propose adding a flag which gives us a migration path for freezing strings. For purposes of discussion I will call this flag "eventually_frozen". It would act as a pseudo-frozen flag where mutating the object would result in a warning instead of an error. It would also change the return value of `Object#frozen?` so code like `obj = obj.dup if obj.frozen?` would work as expected to remove the warning. Note that eventually_frozen strings cannot be deduplicated, as they are in reality mutable.

This way it would be possible for Symbol#to_s (and many others) to return an eventually_frozen string in 2.7 which gives apps and gems time to migrate, before finally becoming a frozen deduplicated string in 3.0. This might even open up a migration path for eventually using `frozen_string_literal:true` as default. For example if it was possible to add `frozen_string_literal:eventual` to all files in a project (or as a global switch), we could run that in production to discover where to fix things, and then change it to `frozen_string_literal:true` for a bug-free performance boost.

### Proposed changes

* Object#freeze(immediately:true)
  * if `immediately` keyword is true, set frozen=true and eventually_frozen=false
  * if `immediately` keyword is false, set eventually_frozen=true UNLESS frozen flag is already true
* String#+@
  * if eventually_frozen is true, create a duplicate string with eventually_frozen=false
* Object#frozen?(immediately:false)
  * return true if `immediately` keyword is false and eventually_frozen flag is true
* rb_check_frozen
  * output warning if eventually_frozen flag is true

### Alternatively: the eventually_frozen flag is an internal detail only

* OBJ_EVENTUAL_FREEZE
  * used instead of OBJ_FREEZE in `rb_sym_to_s` and others to set eventually_frozen=true
* Object#freeze
  * set frozen=true and eventually_frozen=false
* String#+@
  * if eventually_frozen is true, create a duplicate string with eventually_frozen=false
* Object#frozen?
  * return true (or maybe `:eventually`) if eventually_frozen flag is true
* rb_check_frozen
  * output warning if eventually_frozen flag is true

--
https://bugs.ruby-lang.org/
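The migration problem described in this issue, and the `obj = obj.dup if obj.frozen?` idiom it mentions, can be illustrated with today's fully frozen strings. This is only a sketch of current behaviour using core `String` methods; the helper names are hypothetical, and the proposed `eventually_frozen` flag itself is not implemented here.

```ruby
# Hypothetical helper: mutates its argument in place.
def append_suffix!(str)
  str << "-suffix" # raises FrozenError when str is frozen
end

# Hypothetical helper: uses the idiom quoted in the issue to
# work on a mutable copy when the input is frozen.
def safe_append_suffix(str)
  str = str.dup if str.frozen?
  str << "-suffix"
end

frozen_input = "name".freeze

# The mutation fails far away from the place where the string was frozen.
error_class =
  begin
    append_suffix!(frozen_input)
    nil
  rescue FrozenError => e
    e.class
  end

result  = safe_append_suffix(frozen_input) # the frozen input is left untouched
mutable = +frozen_input                    # String#+@ returns an unfrozen duplicate
```

Under the proposal, the first call would presumably emit a warning instead of raising, while the `dup`-based idiom would keep working unchanged.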


[ruby-core:117237] [Ruby master Feature#8421] add Enumerable#find_map and Enumerable#find_all_map
by alexbarret (Alexandre Barret) 19 Mar '24


Issue #8421 has been updated by alexbarret (Alexandre Barret).

Can we reconsider introducing `#find_map` please, especially since `#find_all_map` has been introduced as `#filter_map` in Ruby 2.7?

Here are some examples:

```ruby
require "minitest/autorun"

# Option 1
def identifier(emails, pattern: /\Ausername\+(?<identifier>[a-z|0-9]+)@domain\.com\z/i)
  result = nil
  emails.each do |email|
    if matches = pattern.match(email)
      result = matches[:identifier]
      break
    end
  end
  result
end

# Option 2
def identifier(emails, pattern: /\Ausername\+(?<identifier>[a-z|0-9]+)@domain\.com\z/i)
  matches = nil
  matches[:identifier] if emails.find { |email| matches = pattern.match(email) }
end

class TestIdentifierMethod < Minitest::Test
  def test_identifier
    assert_equal 'thecode', identifier(%w[
      username@domain.com
      username+123@domainAcom
      wrongusername+123@domain.com
      username+123@wrongdomain.com
      username+thecode@domain.com
    ])

    assert_nil identifier(%w[
      username@domain.com
      username+123@domainAcom
      wrongusername+123@domain.com
      username+123@wrongdomain.com
    ])
  end
end
```

Having a `find_map` would ease it a bit:

```ruby
def find_map(collection, &block)
  result = nil
  collection.each do |item|
    break if result = yield(item)
  end
  result
end

def identifier(emails, pattern: /\Ausername\+(?<identifier>[a-z|0-9]+)@domain\.com\z/i)
  find_map(emails) do |email|
    (matches = pattern.match(email)) && matches[:identifier]
  end
end
```

Here is a second use case:

```ruby
# Problem 2
Pet = Struct.new(:name)
Person = Struct.new(:name, :pet, keyword_init: true)

class TestPetIdentitifer < Minitest::Test
  def setup
    @some_people_with_pet = [
      Person.new(name: 'Alex', pet: nil),
      Person.new(name: 'Olivier', pet: nil),
      Person.new(name: 'Romain', pet: Pet.new('Darwin')),
      Person.new(name: 'Mariano', pet: nil),
      Person.new(name: 'Sébastien', pet: nil),
      Person.new(name: 'Ben', pet: nil)
    ]

    @people_with_no_pet = [
      Person.new(name: 'Mariano', pet: nil),
      Person.new(name: 'Sébastien', pet: nil),
      Person.new(name: 'Ben', pet: nil)
    ]
  end

  def test_pet_found
    people = @some_people_with_pet
    expected_pet = Pet.new('Darwin')
    assert_equal expected_pet, people.find(&:pet)&.pet
    assert_equal expected_pet, find_map(people, &:pet) # -> people.find_map(&:pet)
  end

  def test_pet_not_found
    people = @people_with_no_pet
    assert_nil people.find(&:pet)&.pet
    assert_nil find_map(people, &:pet) # -> people.find_map(&:pet)
  end
end
```

Having `#find_map` allows these benefits:

* The caller does not need to guard against `nil` like when `#find` returns nothing
* `#find_map` would be faster than `filter_map.first` or even `lazy.filter_map.first`
* It would add parity with `filter_map`. `#find` is to `#filter` what `#find_map` is to `#filter_map`

----------------------------------------
Feature #8421: add Enumerable#find_map and Enumerable#find_all_map
https://bugs.ruby-lang.org/issues/8421#change-107324

* Author: Hanmac (Hans Mackowiak)
* Status: Feedback
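The early-exit part of the second benefit can be checked by counting block invocations. In the sketch below, `find_map` is a hypothetical stand-in for the proposed method, written as a plain helper, and the sample array is an assumption. It measures only how often the block runs, not wall-clock time.

```ruby
# Hypothetical stand-in for the proposed Enumerable#find_map.
def find_map(collection)
  collection.each do |item|
    result = yield(item)
    return result if result # stop at the first truthy block result
  end
  nil
end

values = [1, 2, 3, 4, 5, 6]
calls  = Hash.new(0)

# Each variant returns the doubled value of the first even number.
eager = values.filter_map { |v| calls[:filter_map] += 1; v * 2 if v.even? }.first
lazy  = values.lazy.filter_map { |v| calls[:lazy] += 1; v * 2 if v.even? }.first
found = find_map(values) { |v| calls[:find_map] += 1; v * 2 if v.even? }
```

All three return `4`, but the eager `filter_map` calls the block for all six elements, while the lazy chain and the helper stop after two. The lazy variant still allocates an intermediate `Enumerator::Lazy`, which is presumably the overhead the "or even" in the message refers to.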


[ruby-core:117230] [Ruby master Feature#19057] Hide implementation of `rb_io_t`.
by ioquatix (Samuel Williams) 19 Mar '24


Issue #19057 has been updated by ioquatix (Samuel Williams).

As an alternative course of action, I've released the latest head of unicorn as the `unicorn-maintained` gem. You can use this instead of `unicorn` and it will work on Ruby head.

https://github.com/unicorn-ruby/unicorn

Anyone who wants to help contribute/maintain it, I am happy to add you to the organisation.

It also includes `raindrops-maintained` as this has not been released.

----------------------------------------
Feature #19057: Hide implementation of `rb_io_t`.
https://bugs.ruby-lang.org/issues/19057#change-107317

* Author: ioquatix (Samuel Williams)
* Status: Assigned
* Assignee: ioquatix (Samuel Williams)
* Target version: 3.4
----------------------------------------
In order to make improvements to the IO implementation like <https://bugs.ruby-lang.org/issues/18455>, we need to add new fields to `struct rb_io_t`.

By the way, ending types in `_t` is not recommended by POSIX, so I'm also trying to rename the internal implementation to drop `_t` where possible during this conversion.

Anyway, we should try to hide the implementation of `struct rb_io`. Ideally, we don't expose any of it, but the problem is backwards compatibility.

So, in order to retain backwards compatibility, we should expose some fields of `struct rb_io`. The most commonly used ones are `fd` and `mode`, but several others are commonly used.

There are many fields which should not be exposed because they are implementation details.

## Current proposal

The current proposed change <https://github.com/ruby/ruby/pull/6511> creates two structs:

```c
// include/ruby/io.h

#ifndef RB_IO_T
struct rb_io {
  int fd;
  // ... public fields ...
};
#else
struct rb_io;
#endif

// internal/io.h

#define RB_IO_T
struct rb_io {
  int fd;
  // ... public fields ...
  // ... private fields ...
};
```

However, we are not 100% confident this is safe according to the C specification. My experience is not sufficiently wide to say this is safe in practice, but it does look okay to both myself, and @Eregon + @tenderlovemaking have both given some kind of approval.

That being said, maybe it's not safe.

There are two alternatives:

## Hide all details

We can make public `struct rb_io` completely invisible.

```c
// include/ruby/io.h

#define RB_IO_HIDDEN
struct rb_io;
int rb_ioptr_descriptor(struct rb_io *ioptr); // accessor for previously visible state.

// internal/io.h

struct rb_io {
  // ... all fields ...
};
```

This would only be forwards compatible, and code would need to feature detect like this:

```c
#ifdef RB_IO_HIDDEN
#define RB_IOPTR_DESCRIPTOR rb_ioptr_descriptor
#else
#define RB_IOPTR_DESCRIPTOR(ioptr) rb_ioptr_descriptor(ioptr)
#endif
```

## Nested public interface

Alternatively, we can nest the public fields into the private struct:

```c
// include/ruby/io.h

struct rb_io_public {
  int fd;
  // ... public fields ...
};

// internal/io.h

#define RB_IO_T
struct rb_io {
  struct rb_io_public public;
  // ... private fields ...
};
```

## Considerations

I personally think the "Hide all details" implementation is the best, but it's also the least compatible. This is also what we are ultimately aiming for; whether we decide to take an intermediate "compatibility step" is up to us.

I think "Nested public interface" is messy and introduces more complexity, but it might be slightly better defined than the "Current proposal", which might create undefined behaviour. That being said, all the tests are passing.

--
https://bugs.ruby-lang.org/


[ruby-core:117229] [Ruby master Feature#19057] Hide implementation of `rb_io_t`.
by Eregon (Benoit Daloze) 19 Mar '24


Issue #19057 has been updated by Eregon (Benoit Daloze).

> From the reaction to this ticket, it is clear that forcing the "hide all the details" approach could destroy the Ruby ecosystem.

"destroy the Ruby ecosystem" seems an exaggeration if it's just `unicorn` not working, because there was no release in 2+ years. Are we going to never remove an API because one gem seems not maintained anymore and relies on it?

It might even be useful if people notice unicorn is no longer maintained actively, e.g. if security issues are found there might not be a release fixing them either.

Note that I would be very happy to be proven wrong about unicorn being no longer maintained.

----------------------------------------
Feature #19057: Hide implementation of `rb_io_t`.
https://bugs.ruby-lang.org/issues/19057#change-107316

* Author: ioquatix (Samuel Williams)
* Status: Assigned
* Assignee: ioquatix (Samuel Williams)
* Target version: 3.4


[ruby-core:117228] [Ruby master Feature#19057] Hide implementation of `rb_io_t`.
by ioquatix (Samuel Williams) 19 Mar '24


Issue #19057 has been updated by ioquatix (Samuel Williams).

Here is the revert PR: https://github.com/ruby/ruby/pull/10283

----------------------------------------
Feature #19057: Hide implementation of `rb_io_t`.
https://bugs.ruby-lang.org/issues/19057#change-107315

* Author: ioquatix (Samuel Williams)
* Status: Assigned
* Assignee: ioquatix (Samuel Williams)
* Target version: 3.4


[ruby-core:117226] [Ruby master Feature#19057] Hide implementation of `rb_io_t`.
by ioquatix (Samuel Williams) 19 Mar '24


Issue #19057 has been updated by ioquatix (Samuel Williams).

The simplest option right now is to revert this change and try again later.

@mame are there any other gems apart from `unicorn` you are concerned about? In other words, if `unicorn` makes a release, then you don't have a problem with this change, right?

@matz is this also your position? Is `unicorn` the only blocker that you are concerned about?

----------------------------------------
Feature #19057: Hide implementation of `rb_io_t`.
https://bugs.ruby-lang.org/issues/19057#change-107314

* Author: ioquatix (Samuel Williams)
* Status: Assigned
* Assignee: ioquatix (Samuel Williams)
* Target version: 3.4


[ruby-core:117224] [Ruby master Feature#19057] Hide implementation of `rb_io_t`.
by matz (Yukihiro Matsumoto) 19 Mar '24


Issue #19057 has been updated by matz (Yukihiro Matsumoto).

I agree with @mame. This change would break too many tests, apps, etc. We cannot accept the change at the moment. Can we be more conservative?

Matz.

----------------------------------------
Feature #19057: Hide implementation of `rb_io_t`.
https://bugs.ruby-lang.org/issues/19057#change-107310

* Author: ioquatix (Samuel Williams)
* Status: Assigned
* Assignee: ioquatix (Samuel Williams)
* Target version: 3.4


[ruby-core:117212] [Ruby master Feature#20345] Add `--target-rbconfig` option to mkmf
by katei (Yuta Saito) 19 Mar '24


Issue #20345 has been reported by katei (Yuta Saito).

----------------------------------------
Feature #20345: Add `--target-rbconfig` option to mkmf
https://bugs.ruby-lang.org/issues/20345

* Author: katei (Yuta Saito)
* Status: Open
----------------------------------------
## Motivation

Today, CRuby runs on many platforms. But not all platforms are capable of running build tools (e.g. WebAssembly/WASI), so cross-target compilation of extension libraries is essential for those platforms.

We currently have 3 major mkmf users (`extconf.rb` consumers, in other words):

1. CRuby build system
2. rake-compiler
3. RubyGems

[1] CRuby build system and [2] rake-compiler have their bespoke tricks to support cross compilation, but [3] does not support cross compilation yet. So we are going to support cross-compilation in RubyGems to unlock the use of gems including non-precompiled extension libraries.

However, introducing the same tricks to RubyGems to support cross compilation as well as the other two is not ideal and cannot handle some edge cases properly. Therefore, this proposal aims to add cross-compilation support in mkmf itself and remove the need for special tricks in mkmf users.

Note that cross-compilation here includes:

- Cross *platform* compilation: Build extension libraries for platform A on platform B.
- Cross *ruby version* compilation: Build extension libraries for Ruby X with running mkmf.rb bundled with Ruby X on Ruby Y.

## Existing Solutions

We currently have two solutions to cross-compile extension libraries, but both solutions are based on faking `rbconfig`.

### CRuby build system

CRuby build system is capable of cross-compiling extension libraries for cross-platform and cross ruby version. The key trick here is that CRuby build system generates `<platform>-fake.rb` that fakes `RUBY_` constants like `RUBY_PLATFORM` and loads just built `rbconfig` describing Ruby version X for platform A and prevents loading `rbconfig` for Ruby version Y for platform B. As a result, this fakes the global `RbConfig` constant and mkmf generates Makefile using the faked `RbConfig`.

### rake-compiler

rake-compiler also fakes `RbConfig`, as CRuby build system does. One of the notable tricks here is that the faking script loads `resolv`, which expects the original `RUBY_PLATFORM`, at first and fakes RbConfig after that.

```ruby
# From https://github.com/rake-compiler/rake-compiler/blob/7357f9e917dae7935068778…

# Pre-load resolver library before faking, in order to avoid error
# "cannot load such file -- win32/resolv" when it is required later on.
# See also: https://github.com/tjschuck/rake-compiler-dev-box/issues/5
require 'resolv'
require 'rbconfig'
```

This has been introduced as a workaround, but this indicates that the faking method cannot be generally applied.

## Problems

Based on insights from the existing solutions, the problems here are:

1. There is no way to tell the target `RbConfig` to `mkmf` without polluting the global `RbConfig` constant.
2. There is no public API to retrieve the deployment target info, so existing `extconf.rb` assumes `::RbConfig` is the one.

## Proposal

I propose adding those interfaces to `mkmf`:

1. `--target-rbconfig` option to override the RbConfig used for generating Makefiles without replacing the global top-level `RbConfig` module.
2. `MakeMakefile::RbConfig` constant to access the RbConfig for the target platform. By default, it's an alias of top-level `RbConfig`. If `--target-rbconfig` is given, it points to the specified `RbConfig` definition.

```console
$ ruby extconf.rb --target-rbconfig path/to/rbconfig.rb
```

```ruby
require "mkmf"

system(
  "./libyaml/configure",
  # Before:
  # "--host=#{RbConfig::CONFIG['host']}",
  "--host=#{MakeMakefile::RbConfig::CONFIG['host']}",
  ...
)

# Before:
# case RUBY_PLATFORM
case MakeMakefile::RbConfig::CONFIG['platform']
when /mswin|mingw|bccwin/
  ...
when /linux/
  ...
end

create_makefile("psych")
```

Extension library authors who want to support cross-compilation just need to replace their use of some constants in `extconf.rb` that assume the config describes the deployment target. Here is the list of faked constant variables and corresponding representations compatible with cross-compilation.

| Before | After (to make the ext x-compile ready) |
|:------|:-----|
| `RbConfig` | `MakeMakefile::RbConfig` |
| `RUBY_PLATFORM` | `MakeMakefile::RbConfig::CONFIG["platform"]` |
| `RUBY_VERSION` | `MakeMakefile::RbConfig::expand("$(MAJOR).$(MINOR).$(TEENY)")` |
| `RUBY_DESCRIPTION` | No corresponding config entry |

## Compatibility

This is a completely additive change, so I expect there are no compatibility issues for existing `extconf.rb`.

Note that migrating `RbConfig` to `MakeMakefile::RbConfig` does not break existing faked `RbConfig` based cross-compilation because `MakeMakefile::RbConfig` is an alias of `::RbConfig` by default and it's the faked config describing the deployment target in this scenario.

Also, extension library authors who want to support cross-compilation and want to keep building with older Ruby before this change can include the following snippet at the beginning of `extconf.rb`:

```ruby
MakeMakefile::RbConfig ||= RbConfig
```

## Implementation

Literally a few lines of changes: https://github.com/kateinoigakukun/ruby/commit/9f3090c26ae1e5712dee702c19ba…

## Evaluation

I ported nokogiri gem, which has [1k lines of `extconf.rb`](https://github.com/sparklemotion/nokogiri/blob/v1.16.3/ext/nokogiri/extconf.rb) and several platform specific branches, to WebAssembly/WASI with this change, and the new API was enough to satisfy the cross-compilation scenario.

--
https://bugs.ruby-lang.org/
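The compatibility idea from this proposal can also be written as a small lookup helper that never touches the global constant. This is a sketch only: `target_rbconfig` is a hypothetical name that is not part of mkmf, and the block deliberately avoids `require "mkmf"` so it can run without Ruby development headers installed.

```ruby
require "rbconfig"

# Hypothetical helper (not part of mkmf): prefer the proposed
# MakeMakefile::RbConfig when it exists, otherwise fall back to the
# RbConfig of the running Ruby.
def target_rbconfig
  if defined?(MakeMakefile::RbConfig)
    MakeMakefile::RbConfig
  else
    ::RbConfig
  end
end

config = target_rbconfig::CONFIG

# Host triplet of the target, as used for --host in the proposal's example.
target_host = config["host"]

# Replacement for RUBY_VERSION from the proposal's table. RbConfig.expand
# modifies the string it is given, so an unfrozen copy is passed in.
target_version = target_rbconfig.expand(String.new("$(MAJOR).$(MINOR).$(TEENY)"))
```

When no `--target-rbconfig` is in play, both values describe the running interpreter, so an `extconf.rb` written this way should behave the same as one that reads the global `RbConfig` directly.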
