June 2023 - ruby-core - ml.ruby-lang.org

[ruby-core:113800] [Ruby master Feature#19719] Universal Parser

by yui-knk (Kaneko Yuichiro)

Issue #19719 has been reported by yui-knk (Kaneko Yuichiro). ---------------------------------------- Feature #19719: Universal Parser https://bugs.ruby-lang.org/issues/19719 * Author: yui-knk (Kaneko Yuichiro) * Status: Open * Priority: Normal ---------------------------------------- # Background There are three use cases where we need a CRuby parser which is independent of other CRuby features like Object, GC. I call such a parser a Universal Parser. 1. Use Universal Parser from Ruby applications. For example RuboCop. RuboCop needs a parser to get AST of source code. Currently RuboCop uses parser gem. In this sense Universal Parser is a replacement of parser gem. 2. Use Universal Parser from C/C++ or other language. For example Sorbet. Sorbet is written in C++. It has its own parser. Universal Parser can be used in such scenario. 3. Use Universal Parser for other Ruby implementations. mruby, JRuby and other Ruby implementations will use Universal Parser so that they don’t need to develop & manage their own parser. # Design * Implement Universal Parser by changing CRuby source code, especially parse.y and node.c. Introduce `UNIVERSAL_PARSER` macro and implement Universal Parser by passing all necessary CRuby related functions via `struct rb_parser_config_struct`. In this step there are two build modes with/without Universal Parser. * Reduce CRuby related functions passed by `struct rb_parser_config_struct`. Some of them are copied into parse.y, e.g. `rb_isspace`. Other are reimplemented, e.g. `NODE_LIT` has new String struct instead of `VALUE`. * Once CRuby related functions needed for Universal Parser are minimized, replace rest CRuby function references with functions provided by `struct rb_parser_config_struct` and remove `UNIVERSAL_PARSER` macro. # Release management There are three options for releasing a binary for Universal Parser (“librubyparser”). 1. Release it as so/dll a. Include it into Ruby release process b. Create another repository for release management of "librubyparser" 2. Release it as gem (ruby/universal_parser) "librubyparser" has only pure C interface and data structure. If you want to use it from Ruby code, you need C extension code to translate "librubyparser" data and Ruby objects. I propose to create "universal_parser" gem for this purpose. I prefer #1-b to #1-a because it doesn’t increase tasks for Ruby release process. I want to make #2 as a first milestone. -- https://bugs.ruby-lang.org/

11 months, 1 week

6
6
0 0

[ruby-core:113854] ruby/ruby Cirrus CI improvement

by Jun Aruga (he / him)

Hello Fedor @ Cirrus CI, CC: Ruby Core mailing list. Thank you for helping the Ruby project for adding Cirrus CI. I am sending an email to ask you for help to improve the Cirrus CI. We are running 4 parallel jobs (2 gcc cases ARM, 2 clang cases ARM) in the Cirrus CI. However, I am seeing some challenges such as the queue is stopping, and the clang cases are 2 or 3 times slower than gcc cases. So, today we stopped the 2 clang cases in a compromised way. Could you check our .cirrus.yml file on the reverted commit, and give advice to us and ideally just send a pull-request to the master branch for that? https://github.com/ruby/ruby/blob/master/.cirrus.yml ``` $ git clone https://github.com/ruby/ruby.git $ cd ruby $ git revert 72f07f0a5f882e87e305d668587152fa209a0568. ``` I suspect the current `cpu:` value might not be correct. And we may need the following change. ``` diff --git a/.cirrus.yml b/.cirrus.yml index be76e4ab4a..8b820be1a2 100644 --- a/.cirrus.yml +++ b/.cirrus.yml @@ -16,10 +16,10 @@ task: image: ghcr.io/ruby/ruby-ci-image:$CC # Define the used cpu core in each matrix task. We can use total 16 cpu # cores in entire matrix. [cpu] = [total cpu: 16] / [number of tasks] - cpu: 8 + cpu: 4 # We can request maximum 4 GB per cpu. # [memory per task] = [memory per cpu: 4 GB] * [cpu] - memory: 32G + memory: 16G env: CIRRUS_CLONE_DEPTH: 50 optflags: '-O1' @@ -71,10 +71,10 @@ yjit_task: image: ghcr.io/ruby/ruby-ci-image:$CC # Define the used cpu core in each matrix task. We can use total 16 cpu # cores in entire matrix. [cpu] = [total cpu: 16] / [number of tasks] - cpu: 8 + cpu: 4 # We can request maximum 4 GB per cpu. # [memory per task] = [memory per cpu: 4 GB] * [cpu] - memory: 32G + memory: 16G env: CIRRUS_CLONE_DEPTH: 50 optflags: '-O1' ``` Thank you for your help! -- Jun | He - Him | Timezone: UTC+1 or 2, Czech Republic See <https://www.worldtimebuddy.com/czech-republic-prague-to-utc> for the timezone.

11 months, 1 week

1
1
0 0

[ruby-core:113882] [Ruby master Feature#19725] Improve the match cache optimization to support look-ahead and atomic groups

by make_now_just (Hiroya Fujinami)

Issue #19725 has been reported by make_now_just (Hiroya Fujinami). ---------------------------------------- Feature #19725: Improve the match cache optimization to support look-ahead and atomic groups https://bugs.ruby-lang.org/issues/19725 * Author: make_now_just (Hiroya Fujinami) * Status: Open * Priority: Normal ---------------------------------------- Currently, the Regexp match cache optimization [Feature #19104] memoizes match cache points that once arrived; that is, it memoizes them **before backtracking**. This kind of implementation is simple and works fine in most cases. However, in some cases, it does not work well. For example, 1. it is hard to preserve the strange null-loop behavior, and 2. it is hard to support look-around operators and atomic groups correctly. The first problem is simple. Memoization may change the matching behavior if a cache point in a null loop is memoized because Onigmo allows a null loop (a loop without consuming a character) at the first time to keep captures. We avoid this problem by carefully resetting the memoization table on null loops currently. But, nested null loops are not supported. The second problem is a bit complex. For example, `/\Aa*(?=a*)a\z/` does not match against `"a"` on the memozation before backtracking. The reason is that when the first matching of the look-ahead is succeeded, the loop branch in the look-ahead is memoized, and it is impossible to succeed the second matching of the look-ahead. Furthermore, how do we prevent increasing the matching time of `/\Aa*?(?=a*)\z/` against `"a" * N + "z"` quadratically? I suggest two improvements to resolve these issues: 1. Memoizing **after backtracking**, and 2. Memoizing **partial match results** for match cache points in look-ahead and atomic groups. The first improvement is, I think, the canonical way to implement the match cache optimization. This avoids the first "null loop" problem entirely. This also enables to support look-ahead and atomic groups easily. This can be implemented by introducing an extra item to the stack and doing memoization on popping it. The second improvement is to simulate on the memoization the atomic behavior of look-ahead and atomic (Of course!) groups on the memoization. The current memoization table only records the arrived points; that is, it does not record the result of matching and only records failures of matching. Actually, it is not problematic in the current implementation because "matching is succeeded" means the matching is done. However, when a part of look-around and atomic groups is matched, the stack is hoisted to behave atomically. To simulate this behavior, successes in matching these parts are also memoized. This can be implemented by introducing an extra bit for the result of matching to the memoization table. These improvements will enable - to avoid the nested null loop restriction, and - to support positive/negative look-ahead and atomic groups correctly. However, look-ahead and atomic groups that include captures cannot be supported. This is because captures may be changed on successful matches. -- https://bugs.ruby-lang.org/

11 months, 1 week

1
1
0 0

[ruby-core:113529] [Ruby master Bug#19681] The final classpath of partially named modules is sometimes inconsistent once permanently named

by byroot (Jean Boussier)

Issue #19681 has been reported by byroot (Jean Boussier). ---------------------------------------- Bug #19681: The final classpath of partially named modules is sometimes inconsistent once permanently named https://bugs.ruby-lang.org/issues/19681 * Author: byroot (Jean Boussier) * Status: Open * Priority: Normal * Backport: 3.0: WONTFIX, 3.1: REQUIRED, 3.2: REQUIRED ---------------------------------------- Reported to me by @fxn ```ruby m = Module.new class m::C; end p m::C.name # => "#<Module:0x000000010789fbe0>::C" m::D = m::C p m::D.name # => "#<Module:0x000000010789fbe0>::C" M = m p M::C.name # => "#<M::D" ``` Expected behavior: ```ruby p M::C.name # => "#<M::C" ``` ### Reason When the parent is assigned its permanent classpath, we iterate over its `const_table` to recursively give a permanent name to all the constant it owns. However, `const_table` is an `id_table` so it doesn't retain the insertion order, which means that if the constant was aliased, we can no longer distinguish between the original name and its aliases, and whichever comes first in the `const_table` will be used as the permanent name. ### Potential solution I have a tentative fix for it in https://github.com/ruby/ruby/pull/7829. Instead of relying on the `const_table` key, it extract the original name from the temporary classpath. It does feel a bit wrong to do a string search in such a place, but it does work. -- https://bugs.ruby-lang.org/

11 months, 1 week

7
33
0 0

[ruby-core:113865] [Ruby master Feature#15192] Introduce a new "shortcut assigning" syntax to convenient setup instance variables

by bradly (Bradly Feeley)

Issue #15192 has been updated by bradly (Bradly Feeley). I really believe this would be seen a major improvement to Ruby. Personally, I still feel ``` ruby def initialize(@foo) ``` is the cleanest implementation (or something else the comes _before_ the arg name), but I would be happy with an agreement on any proposed syntax. ---------------------------------------- Feature #15192: Introduce a new "shortcut assigning" syntax to convenient setup instance variables https://bugs.ruby-lang.org/issues/15192#change-103513 * Author: jjyr (Jinyang Jiang) * Status: Open * Priority: Normal * Assignee: matz (Yukihiro Matsumoto) ---------------------------------------- Motivation: Introduce a new syntax for convenient setup instance variables for objects. The problem: Currently, setup instance variables in Ruby is too verbose. Basically, we need to write the meaningless assigning code again and again to assign variables ``` ruby class Person def initialize(name:, age:, gender:, country:) @name = name @age = age @gender = gender @country = country end end # we can use Struct to avoiding this Person = Struct.new(:name, :age, :gender, :country, keyword_init: true) # let's see a real-world case, which can't use Struct to describe an initializing process, from https://github.com/ciri-ethereum/ciri/blob/748985ccf7a620a2e480706a5a6b38f5… # Because we want to do something more than just assigning instance variables class Server def initialize(private_key:, protocol_manage:, bootstrap_nodes: [], node_name: 'Ciri', tcp_host: '127.0.0.1', tcp_port: 33033) @private_key = private_key @node_name = node_name @bootstrap_nodes = bootstrap_nodes @protocol_manage = protocol_manage server_node_id = NodeID.new(@private_key) caps = [Cap.new(name: 'eth', version: 63)] @handshake = ProtocolHandshake.new(version: BASE_PROTOCOL_VERSION, name: @node_name, id: server_node_id.id, caps: caps) @tcp_host = tcp_host @tcp_port = tcp_port @dial = Dial.new(bootstrap_nodes: bootstrap_nodes, private_key: private_key, handshake: @handshake) @network_state = NetworkState.new(protocol_manage) @dial_scheduler = DialScheduler.new(@network_state, @dial) end end # Introduce a new "shortcut assigning" syntax for convenient setup class Person # use @ prefix to describe instance variables. def initialize(@name:, @age:, @gender:, @country:) end # equal to def initialize2(name:, age:, gender:, country:) @name = name @age = age @gender = gender @country = country end # it should also work on position style arguments def initialize2(@name, @age, @gender, @country) end end # Our real-world case can be rewritten as below class Server def initialize(@private_key:, @protocol_manage:, @bootstrap_nodes: [], @node_name: 'Ciri', @tcp_host: '127.0.0.1', @tcp_port: 33033) server_node_id = NodeID.new(@private_key) caps = [Cap.new(name: 'eth', version: 63)] @handshake = ProtocolHandshake.new(version: BASE_PROTOCOL_VERSION, name: @node_name, id: server_node_id.id, caps: caps) @dial = Dial.new(bootstrap_nodes: @bootstrap_nodes, private_key: @private_key, handshake: @handshake) @network_state = NetworkState.new(@protocol_manage) @dial_scheduler = DialScheduler.new(@network_state, @dial) end end # consider to keep consistency, this "shortcut assigning" syntax should work for non-initialize methods class Foo def bar(@still_works) p @still_works end end ``` -- https://bugs.ruby-lang.org/

11 months, 2 weeks

1
0
0 0

[ruby-core:113660] [Ruby master Feature#19694] Add Regexp#timeout= setter

by aharpole (Aaron Harpole)

Issue #19694 has been reported by aharpole (Aaron Harpole). ---------------------------------------- Feature #19694: Add Regexp#timeout= setter https://bugs.ruby-lang.org/issues/19694 * Author: aharpole (Aaron Harpole) * Status: Open * Priority: Normal ---------------------------------------- # Abstract In addition to allowing for a Regexp timeout to be set on individual instances by setting a `timeout` argument in `Regexp.new`, I'm proposing that we also allow setting the timeout on Regexp objects with a `#timeout=` setter. # Background To be able to roll out a global Regexp timeout for a large application, there are inevitably some individual regexes for which a different timeout is appropriate. While the `timeout` keyword argument was added to `Regexp.new`, this isn't always a viable option. In the case of regex literal syntax (`/ab*/` or `%r{ab*}`, for instance), it's not possible to set a timeout at all right now without converting to `Regexp.new`, which may be awkward depending on the contents of the regex. It also is desirable from time to time to be able to set a timeout for a regex object after it's been initialized. Finally, because we offer a `Regexp#timeout` getter, for consistency it would be nice to also offer a setter. The introduction of a `Regexp#timeout=` setter was mentioned as a possible way to set individual timeouts in https://bugs.ruby-lang.org/issues/19104#Specification. # Proposal I propose that we add the method `Regexp#timeout=`. It works the same way the `timeout` argument works in `Regexp.new`, taking either a float or nil. This makes it relatively easy to add timeouts to specific regex literals (regex literals are frozen by default so you do have to `dup` them first): ``` emoji_filter_pattern = %r{ (?<!#{Regexp.quote(ZERO_WIDTH_JOINER)}) #{EmojiFilter.unicodes_pattern} (?!#{Regexp.union(EmojiFilter::MODIFIER_CHAR_MAP.keys.map { |k| Regexp.quote k })}) }x.dup emoji_filter_pattern.timeout = 1.0 emoji_filter_pattern.freeze ``` # Implementation This setter has been implemented in https://github.com/ruby/ruby/pull/7847. # Evaluation It's just a setter, so pretty straightforward in terms of implementation and use. # Discussion It's worth considering other options for overriding `Regexp.timeout`. I'd love to see something like the following for overriding regexp timeouts as well: ``` Regexp.timeout = 1.0 Regexp.with_timeout(5.0) do evaluate_slower_regexes end ``` It's possible to implement something like `Regexp.with_timeout` but it's not thread-safe by default since it would involve overwriting `Regexp.timeout`. # Summary Regexp instances have a getter for timeout, and adding a corresponding setter adds consistency and will make it easier for developers to adopt adding a global `Regexp.timeout` by making it simpler to adjust timeouts on a regex by regex basis. It's a minor change but the added consistency and flexibility help us optimize for developer happiness. -- https://bugs.ruby-lang.org/

11 months, 2 weeks

9
14
0 0

[ruby-core:113852] [Ruby master Bug#18572] Performance regression when invoking refined methods

by Eregon (Benoit Daloze)

Issue #18572 has been updated by Eregon (Benoit Daloze). shugo (Shugo Maeda) wrote in #note-10: > It's simple to prohibit using calls in blocks, but it will break backward compatibility, e.g., using in module_eval. It would still be possible to use `using` outside the `module_eval` (e.g. in a module body or at top-level), which seems more correct anyway since refinements should be per constant scope/cref, they are not per block. ---------------------------------------- Bug #18572: Performance regression when invoking refined methods https://bugs.ruby-lang.org/issues/18572#change-103501 * Author: palkan (Vladimir Dementyev) * Status: Assigned * Priority: Normal * Assignee: ko1 (Koichi Sasada) * Backport: 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN ---------------------------------------- Since Ruby 3.0, defining a refinement for a method slows down its execution even if we do not activate the refinement: ```ruby require "benchmark_driver" source = <<~RUBY class Hash def symbolize_keys transform_keys { |key| key.to_sym rescue key } end def refined_symbolize_keys transform_keys { |key| key.to_sym rescue key } end end module HashRefinements refine Hash do def refined_symbolize_keys raise "never called" end end end HASH = {foo: 1, bar: 2, baz: 3} class Foo def original end def refined end end module FooRefinements refine Foo do def refined raise "never called" end end end FOO = Foo.new RUBY Benchmark.driver do |x| x.prelude %Q{ #{source} } x.report "#symbolize_keys original", %{ HASH.symbolize_keys } x.report "#symbolize_keys refined", %{ HASH.refined_symbolize_keys } end Benchmark.driver do |x| x.prelude %Q{ #{source} } x.report "no-op original", %{ FOO.original } x.report "no-op refined", %{ FOO.refined } end ``` The results for Ruby 3.1: ```sh ... Comparison: #symbolize_keys original: 2372420.1 i/s #symbolize_keys refined: 1941019.0 i/s - 1.22x slower ... Comparison: no-op original: 51790974.2 i/s no-op refined: 14456518.9 i/s - 3.58x slower ``` For Ruby 2.6 and 2.7: ```sh Comparison: #symbolize_keys original: 2278339.7 i/s #symbolize_keys refined: 2264153.1 i/s - 1.01x slower ... Comparison: no-op refined: 64178338.5 i/s no-op original: 63357980.1 i/s - 1.01x slower ``` You can find the full code and more results in this [gist](https://gist.github.com/palkan/637dc83edd86d70b5dbf72f2a4d702e5). P.S. The problem was originally noticed by @byroot, see https://github.com/ruby-i18n/i18n/pull/573 -- https://bugs.ruby-lang.org/

11 months, 2 weeks

1
0
0 0

[ruby-core:113771] [Ruby master Feature#19712] IO#reopen removes singleton class

by itarato (Peter Arato)

Issue #19712 has been reported by itarato (Peter Arato). ---------------------------------------- Feature #19712: IO#reopen removes singleton class https://bugs.ruby-lang.org/issues/19712 * Author: itarato (Peter Arato) * Status: Open * Priority: Normal ---------------------------------------- The documentation states: > This may dynamically change the actual class of this stream. As well `#reopen` removes the singleton class, even when the logical class is the same. This can be surprising at times. An example: ``` ruby io = File.open(__FILE__) io.define_singleton_method(:foo) { $stdout.write("hello") } io.reopen(File.open(__FILE__)) io.foo # `<main>': undefined method `foo' for #<File:/test.rb> (NoMethodError) ``` An example where this was an issue: https://github.com/oracle/truffleruby/pull/3088 Tl;dr: a popular gem was enhancing the singleton class of STDOUT, while Rails/ActiveSupport was executing an `IO#reopen` - invalidating the changes by the gem. While it's hard to trivially solve this with keeping the intended functionality, could it be considered to make edge cases more visible? Examples: - `IO#reopen` issues a warning message when the receiver's singleton class has been updated (but keep removing it as of current state) - `IO#reopen` is limited on instances with a default singleton class, and issues an error when called on an instance with an updated singleton class - make `IO#reopen` carry over the singleton class *only* if the recipient and argument's class are the same, and yield an error otherwise (different classes) These are just ideas. I'm more looking for feedback from the core devs at this point. Thanks in advance! -- https://bugs.ruby-lang.org/

11 months, 2 weeks

5
10
0 0

[ruby-core:113845] [Ruby master Feature#19057] Hide implementation of `rb_io_t`.

by ioquatix (Samuel Williams)

Issue #19057 has been updated by ioquatix (Samuel Williams). Thanks for the report, fixed in https://github.com/ioquatix/raindrops/commit/94dbdd94977d895f98c084d0ca31c2… ---------------------------------------- Feature #19057: Hide implementation of `rb_io_t`. https://bugs.ruby-lang.org/issues/19057#change-103493 * Author: ioquatix (Samuel Williams) * Status: Assigned * Priority: Normal * Assignee: ioquatix (Samuel Williams) ---------------------------------------- In order to make improvements to the IO implementation like <https://bugs.ruby-lang.org/issues/18455>, we need to add new fields to `struct rb_io_t`. By the way, ending types in `_t` is not recommended by POSIX, so I'm also trying to rename the internal implementation to drop `_t` where possible during this conversion. Anyway, we should try to hide the implementation of `struct rb_io`. Ideally, we don't expose any of it, but the problem is backwards compatibility. So, in order to remain backwards compatibility, we should expose some fields of `struct rb_io`, the most commonly used one is `fd` and `mode`, but several others are commonly used. There are many fields which should not be exposed because they are implementation details. ## Current proposal The current proposed change <https://github.com/ruby/ruby/pull/6511> creates two structs: ```c // include/ruby/io.h #ifndef RB_IO_T struct rb_io { int fd; // ... public fields ... }; #else struct rb_io; #endif // internal/io.h #define RB_IO_T struct rb_io { int fd; // ... public fields ... // ... private fields ... }; ``` However, we are not 100% confident this is safe according to the C specification. My experience is not sufficiently wide to say this is safe in practice, but it does look okay to both myself, and @Eregon + @tenderlovemaking have both given some kind of approval. That being said, maybe it's not safe. There are two alternatives: ## Hide all details We can make public `struct rb_io` completely invisible. ```c // include/ruby/io.h #define RB_IO_HIDDEN struct rb_io; int rb_ioptr_descriptor(struct rb_io *ioptr); // accessor for previously visible state. // internal/io.h struct rb_io { // ... all fields ... }; ``` This would only be forwards compatible, and code would need to feature detect like this: ```c #ifdef RB_IO_HIDDEN #define RB_IOPTR_DESCRIPTOR rb_ioptr_descriptor #else #define RB_IOPTR_DESCRIPTOR(ioptr) rb_ioptr_descriptor(ioptr) #endif ``` ## Nested public interface Alternatively, we can nest the public fields into the private struct: ```c // include/ruby/io.h struct rb_io_public { int fd; // ... public fields ... }; // internal/io.h #define RB_IO_T struct rb_io { struct rb_io_public public; // ... private fields ... }; ``` ## Considerations I personally think the "Hide all details" implementation is the best, but it's also the lest compatible. This is also what we are ultimately aiming for, whether we decide to take an intermediate "compatibility step" is up to us. I think "Nested public interface" is messy and introduces more complexity, but it might be slightly better defined than the "Current proposal" which might create undefined behaviour. That being said, all the tests are passing. -- https://bugs.ruby-lang.org/

11 months, 2 weeks

1
0
0 0

[ruby-core:113844] [Ruby master Feature#19057] Hide implementation of `rb_io_t`.

by byroot (Jean Boussier)

Issue #19057 has been updated by byroot (Jean Boussier). @ioquatix I'm not sure which change exactly is the cause, but it appears that the recent `rb_io_t` changes broke `raindrops` ``` current directory: /usr/local/lib/ruby/gems/3.3.0+0/gems/raindrops-0.20.1/ext/raindrops make DESTDIR\= sitearchdir\=./.gem.20230609-25-giurc0 sitelibdir\=./.gem.20230609-25-giurc0 clean current directory: /usr/local/lib/ruby/gems/3.3.0+0/gems/raindrops-0.20.1/ext/raindrops make DESTDIR\= sitearchdir\=./.gem.20230609-25-giurc0 sitelibdir\=./.gem.20230609-25-giurc0 compiling linux_inet_diag.c In file included from /usr/include/x86_64-linux-gnu/bits/libc-header-start.h:33, from /usr/include/stdio.h:27, from /usr/local/include/ruby-3.3.0+0/ruby/defines.h:16, from /usr/local/include/ruby-3.3.0+0/ruby/ruby.h:25, from /usr/local/include/ruby-3.3.0+0/ruby.h:38, from linux_inet_diag.c:1: /usr/include/features.h:187:3: warning: #warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE" [-Wcpp] 187 | # warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE" | ^~~~~~~ In file included from linux_inet_diag.c:4: my_fileno.h:9:7: warning: "HAVE_RB_IO_T" is not defined, evaluates to 0 [-Wundef] 9 | #if ! HAVE_RB_IO_T | ^~~~~~~~~~~~ my_fileno.h:16:8: warning: "HAVE_RB_IO_T" is not defined, evaluates to 0 [-Wundef] 16 | # if !HAVE_RB_IO_T || (RUBY_VERSION_MAJOR == 1 && RUBY_VERSION_MINOR == 8) | ^~~~~~~~~~~~ my_fileno.h: In function ‘my_fileno’: my_fileno.h:10:19: error: unknown type name ‘OpenFile’; did you mean ‘GetOpenFile’? 10 | # define rb_io_t OpenFile | ^~~~~~~~ my_fileno.h:25:2: note: in expansion of macro ‘rb_io_t’ 25 | rb_io_t *fptr; | ^~~~~~~ In file included from my_fileno.h:3, from linux_inet_diag.c:4: /usr/local/include/ruby-3.3.0+0/ruby/io.h:395:55: warning: assignment to ‘int *’ from incompatible pointer type ‘struct rb_io *’ [-Wincompatible-pointer-types] 395 | #define RB_IO_POINTER(obj,fp) rb_io_check_closed((fp) = RFILE(rb_io_taint_check(obj))->fptr) | ^ /usr/local/include/ruby-3.3.0+0/ruby/io.h:401:21: note: in expansion of macro ‘RB_IO_POINTER’ 401 | #define GetOpenFile RB_IO_POINTER | ^~~~~~~~~~~~~ my_fileno.h:30:2: note: in expansion of macro ‘GetOpenFile’ 30 | GetOpenFile(io, fptr); | ^~~~~~~~~~~ /usr/local/include/ruby-3.3.0+0/ruby/io.h:395:55: warning: passing argument 1 of ‘rb_io_check_closed’ from incompatible pointer type [-Wincompatible-pointer-types] 395 | #define RB_IO_POINTER(obj,fp) rb_io_check_closed((fp) = RFILE(rb_io_taint_check(obj))->fptr) | ~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | | | int * /usr/local/include/ruby-3.3.0+0/ruby/io.h:401:21: note: in expansion of macro ‘RB_IO_POINTER’ 401 | #define GetOpenFile RB_IO_POINTER | ^~~~~~~~~~~~~ my_fileno.h:30:2: note: in expansion of macro ‘GetOpenFile’ 30 | GetOpenFile(io, fptr); | ^~~~~~~~~~~ /usr/local/include/ruby-3.3.0+0/ruby/io.h:638:34: note: expected ‘rb_io_t *’ {aka ‘struct rb_io *’} but argument is of type ‘int *’ 638 | void rb_io_check_closed(rb_io_t *fptr); | ~~~~~~~~~^~~~ In file included from linux_inet_diag.c:4: my_fileno.h:17:41: error: request for member ‘f’ in something not a structure or union 17 | # define FPTR_TO_FD(fptr) fileno(fptr->f) | ^~ my_fileno.h:31:7: note: in expansion of macro ‘FPTR_TO_FD’ 31 | fd = FPTR_TO_FD(fptr); | ^~~~~~~~~~ linux_inet_diag.c: At top level: cc1: warning: unrecognized command line option ‘-Wno-self-assign’ cc1: warning: unrecognized command line option ‘-Wno-parentheses-equality’ cc1: warning: unrecognized command line option ‘-Wno-constant-logical-operand’ make: *** [Makefile:248: linux_inet_diag.o] Error 1 ``` ---------------------------------------- Feature #19057: Hide implementation of `rb_io_t`. https://bugs.ruby-lang.org/issues/19057#change-103492 * Author: ioquatix (Samuel Williams) * Status: Assigned * Priority: Normal * Assignee: ioquatix (Samuel Williams) ---------------------------------------- In order to make improvements to the IO implementation like <https://bugs.ruby-lang.org/issues/18455>, we need to add new fields to `struct rb_io_t`. By the way, ending types in `_t` is not recommended by POSIX, so I'm also trying to rename the internal implementation to drop `_t` where possible during this conversion. Anyway, we should try to hide the implementation of `struct rb_io`. Ideally, we don't expose any of it, but the problem is backwards compatibility. So, in order to remain backwards compatibility, we should expose some fields of `struct rb_io`, the most commonly used one is `fd` and `mode`, but several others are commonly used. There are many fields which should not be exposed because they are implementation details. ## Current proposal The current proposed change <https://github.com/ruby/ruby/pull/6511> creates two structs: ```c // include/ruby/io.h #ifndef RB_IO_T struct rb_io { int fd; // ... public fields ... }; #else struct rb_io; #endif // internal/io.h #define RB_IO_T struct rb_io { int fd; // ... public fields ... // ... private fields ... }; ``` However, we are not 100% confident this is safe according to the C specification. My experience is not sufficiently wide to say this is safe in practice, but it does look okay to both myself, and @Eregon + @tenderlovemaking have both given some kind of approval. That being said, maybe it's not safe. There are two alternatives: ## Hide all details We can make public `struct rb_io` completely invisible. ```c // include/ruby/io.h #define RB_IO_HIDDEN struct rb_io; int rb_ioptr_descriptor(struct rb_io *ioptr); // accessor for previously visible state. // internal/io.h struct rb_io { // ... all fields ... }; ``` This would only be forwards compatible, and code would need to feature detect like this: ```c #ifdef RB_IO_HIDDEN #define RB_IOPTR_DESCRIPTOR rb_ioptr_descriptor #else #define RB_IOPTR_DESCRIPTOR(ioptr) rb_ioptr_descriptor(ioptr) #endif ``` ## Nested public interface Alternatively, we can nest the public fields into the private struct: ```c // include/ruby/io.h struct rb_io_public { int fd; // ... public fields ... }; // internal/io.h #define RB_IO_T struct rb_io { struct rb_io_public public; // ... private fields ... }; ``` ## Considerations I personally think the "Hide all details" implementation is the best, but it's also the lest compatible. This is also what we are ultimately aiming for, whether we decide to take an intermediate "compatibility step" is up to us. I think "Nested public interface" is messy and introduces more complexity, but it might be slightly better defined than the "Current proposal" which might create undefined behaviour. That being said, all the tests are passing. -- https://bugs.ruby-lang.org/

11 months, 2 weeks

1
0
0 0

2024

2023

2022

ruby-core June 2023