Issue #19719 has been reported by yui-knk (Kaneko Yuichiro).
----------------------------------------
Feature #19719: Universal Parser
https://bugs.ruby-lang.org/issues/19719
* Author: yui-knk (Kaneko Yuichiro)
* Status: Open
* Priority: Normal
----------------------------------------
# Background
There are three use cases where we need a CRuby parser which is independent of other CRuby features like Object, GC. I call such a parser a Universal Parser.
1. Use Universal Parser from Ruby applications.
For example RuboCop. RuboCop needs a parser to get AST of source code. Currently RuboCop uses parser gem. In this sense Universal Parser is a replacement of parser gem.
2. Use Universal Parser from C/C++ or other language.
For example Sorbet. Sorbet is written in C++. It has its own parser. Universal Parser can be used in such scenario.
3. Use Universal Parser for other Ruby implementations.
mruby, JRuby and other Ruby implementations will use Universal Parser so that they don’t need to develop & manage their own parser.
# Design
* Implement Universal Parser by changing CRuby source code, especially parse.y and node.c.
Introduce `UNIVERSAL_PARSER` macro and implement Universal Parser by passing all necessary CRuby related functions via `struct rb_parser_config_struct`. In this step there are two build modes with/without Universal Parser.
* Reduce CRuby related functions passed by `struct rb_parser_config_struct`. Some of them are copied into parse.y, e.g. `rb_isspace`. Other are reimplemented, e.g. `NODE_LIT` has new String struct instead of `VALUE`.
* Once CRuby related functions needed for Universal Parser are minimized, replace rest CRuby function references with functions provided by `struct rb_parser_config_struct` and remove `UNIVERSAL_PARSER` macro.
# Release management
There are three options for releasing a binary for Universal Parser (“librubyparser”).
1. Release it as so/dll
a. Include it into Ruby release process
b. Create another repository for release management of "librubyparser"
2. Release it as gem (ruby/universal_parser)
"librubyparser" has only pure C interface and data structure. If you want to use it from Ruby code, you need C extension code to translate "librubyparser" data and Ruby objects. I propose to create "universal_parser" gem for this purpose.
I prefer #1-b to #1-a because it doesn’t increase tasks for Ruby release process. I want to make #2 as a first milestone.
--
https://bugs.ruby-lang.org/
Hello Fedor @ Cirrus CI,
CC: Ruby Core mailing list.
Thank you for helping the Ruby project for adding Cirrus CI. I am
sending an email to ask you for help to improve the Cirrus CI. We are
running 4 parallel jobs (2 gcc cases ARM, 2 clang cases ARM) in the
Cirrus CI.
However, I am seeing some challenges such as the queue is stopping,
and the clang cases are 2 or 3 times slower than gcc cases. So, today
we stopped the 2 clang cases in a compromised way.
Could you check our .cirrus.yml file on the reverted commit, and give
advice to us and ideally just send a pull-request to the master branch
for that?
https://github.com/ruby/ruby/blob/master/.cirrus.yml
```
$ git clone https://github.com/ruby/ruby.git
$ cd ruby
$ git revert 72f07f0a5f882e87e305d668587152fa209a0568.
```
I suspect the current `cpu:` value might not be correct. And we may
need the following change.
```
diff --git a/.cirrus.yml b/.cirrus.yml
index be76e4ab4a..8b820be1a2 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -16,10 +16,10 @@ task:
image: ghcr.io/ruby/ruby-ci-image:$CC
# Define the used cpu core in each matrix task. We can use total 16 cpu
# cores in entire matrix. [cpu] = [total cpu: 16] / [number of tasks]
- cpu: 8
+ cpu: 4
# We can request maximum 4 GB per cpu.
# [memory per task] = [memory per cpu: 4 GB] * [cpu]
- memory: 32G
+ memory: 16G
env:
CIRRUS_CLONE_DEPTH: 50
optflags: '-O1'
@@ -71,10 +71,10 @@ yjit_task:
image: ghcr.io/ruby/ruby-ci-image:$CC
# Define the used cpu core in each matrix task. We can use total 16 cpu
# cores in entire matrix. [cpu] = [total cpu: 16] / [number of tasks]
- cpu: 8
+ cpu: 4
# We can request maximum 4 GB per cpu.
# [memory per task] = [memory per cpu: 4 GB] * [cpu]
- memory: 32G
+ memory: 16G
env:
CIRRUS_CLONE_DEPTH: 50
optflags: '-O1'
```
Thank you for your help!
--
Jun | He - Him | Timezone: UTC+1 or 2, Czech Republic
See <https://www.worldtimebuddy.com/czech-republic-prague-to-utc> for
the timezone.
Issue #19725 has been reported by make_now_just (Hiroya Fujinami).
----------------------------------------
Feature #19725: Improve the match cache optimization to support look-ahead and atomic groups
https://bugs.ruby-lang.org/issues/19725
* Author: make_now_just (Hiroya Fujinami)
* Status: Open
* Priority: Normal
----------------------------------------
Currently, the Regexp match cache optimization [Feature #19104] memoizes match cache points that once arrived; that is, it memoizes them **before backtracking**. This kind of implementation is simple and works fine in most cases. However, in some cases, it does not work well. For example,
1. it is hard to preserve the strange null-loop behavior, and
2. it is hard to support look-around operators and atomic groups correctly.
The first problem is simple. Memoization may change the matching behavior if a cache point in a null loop is memoized because Onigmo allows a null loop (a loop without consuming a character) at the first time to keep captures. We avoid this problem by carefully resetting the memoization table on null loops currently. But, nested null loops are not supported.
The second problem is a bit complex. For example, `/\Aa*(?=a*)a\z/` does not match against `"a"` on the memozation before backtracking. The reason is that when the first matching of the look-ahead is succeeded, the loop branch in the look-ahead is memoized, and it is impossible to succeed the second matching of the look-ahead. Furthermore, how do we prevent increasing the matching time of `/\Aa*?(?=a*)\z/` against `"a" * N + "z"` quadratically?
I suggest two improvements to resolve these issues:
1. Memoizing **after backtracking**, and
2. Memoizing **partial match results** for match cache points in look-ahead and atomic groups.
The first improvement is, I think, the canonical way to implement the match cache optimization. This avoids the first "null loop" problem entirely. This also enables to support look-ahead and atomic groups easily. This can be implemented by introducing an extra item to the stack and doing memoization on popping it.
The second improvement is to simulate on the memoization the atomic behavior of look-ahead and atomic (Of course!) groups on the memoization. The current memoization table only records the arrived points; that is, it does not record the result of matching and only records failures of matching. Actually, it is not problematic in the current implementation because "matching is succeeded" means the matching is done. However, when a part of look-around and atomic groups is matched, the stack is hoisted to behave atomically. To simulate this behavior, successes in matching these parts are also memoized. This can be implemented by introducing an extra bit for the result of matching to the memoization table.
These improvements will enable
- to avoid the nested null loop restriction, and
- to support positive/negative look-ahead and atomic groups correctly.
However, look-ahead and atomic groups that include captures cannot be supported. This is because captures may be changed on successful matches.
--
https://bugs.ruby-lang.org/
Issue #19681 has been reported by byroot (Jean Boussier).
----------------------------------------
Bug #19681: The final classpath of partially named modules is sometimes inconsistent once permanently named
https://bugs.ruby-lang.org/issues/19681
* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
* Backport: 3.0: WONTFIX, 3.1: REQUIRED, 3.2: REQUIRED
----------------------------------------
Reported to me by @fxn
```ruby
m = Module.new
class m::C; end
p m::C.name # => "#<Module:0x000000010789fbe0>::C"
m::D = m::C
p m::D.name # => "#<Module:0x000000010789fbe0>::C"
M = m
p M::C.name # => "#<M::D"
```
Expected behavior:
```ruby
p M::C.name # => "#<M::C"
```
### Reason
When the parent is assigned its permanent classpath, we iterate over its `const_table` to recursively give a permanent name to all the constant it owns.
However, `const_table` is an `id_table` so it doesn't retain the insertion order, which means that if the constant was aliased,
we can no longer distinguish between the original name and its aliases, and whichever comes first in the `const_table` will be used as the permanent name.
### Potential solution
I have a tentative fix for it in https://github.com/ruby/ruby/pull/7829. Instead of relying on the `const_table` key, it extract the original name from the temporary classpath.
It does feel a bit wrong to do a string search in such a place, but it does work.
--
https://bugs.ruby-lang.org/
Issue #15192 has been updated by bradly (Bradly Feeley).
I really believe this would be seen a major improvement to Ruby.
Personally, I still feel
``` ruby
def initialize(@foo)
```
is the cleanest implementation (or something else the comes _before_ the arg name), but I would be happy with an agreement on any proposed syntax.
----------------------------------------
Feature #15192: Introduce a new "shortcut assigning" syntax to convenient setup instance variables
https://bugs.ruby-lang.org/issues/15192#change-103513
* Author: jjyr (Jinyang Jiang)
* Status: Open
* Priority: Normal
* Assignee: matz (Yukihiro Matsumoto)
----------------------------------------
Motivation:
Introduce a new syntax for convenient setup instance variables for objects.
The problem:
Currently, setup instance variables in Ruby is too verbose.
Basically, we need to write the meaningless assigning code again and again to assign variables
``` ruby
class Person
def initialize(name:, age:, gender:, country:)
@name = name
@age = age
@gender = gender
@country = country
end
end
# we can use Struct to avoiding this
Person = Struct.new(:name, :age, :gender, :country, keyword_init: true)
# let's see a real-world case, which can't use Struct to describe an initializing process, from https://github.com/ciri-ethereum/ciri/blob/748985ccf7a620a2e480706a5a6b38f5…
# Because we want to do something more than just assigning instance variables
class Server
def initialize(private_key:, protocol_manage:, bootstrap_nodes: [],
node_name: 'Ciri', tcp_host: '127.0.0.1', tcp_port: 33033)
@private_key = private_key
@node_name = node_name
@bootstrap_nodes = bootstrap_nodes
@protocol_manage = protocol_manage
server_node_id = NodeID.new(@private_key)
caps = [Cap.new(name: 'eth', version: 63)]
@handshake = ProtocolHandshake.new(version: BASE_PROTOCOL_VERSION, name: @node_name, id: server_node_id.id, caps: caps)
@tcp_host = tcp_host
@tcp_port = tcp_port
@dial = Dial.new(bootstrap_nodes: bootstrap_nodes, private_key: private_key, handshake: @handshake)
@network_state = NetworkState.new(protocol_manage)
@dial_scheduler = DialScheduler.new(@network_state, @dial)
end
end
# Introduce a new "shortcut assigning" syntax for convenient setup
class Person
# use @ prefix to describe instance variables.
def initialize(@name:, @age:, @gender:, @country:)
end
# equal to
def initialize2(name:, age:, gender:, country:)
@name = name
@age = age
@gender = gender
@country = country
end
# it should also work on position style arguments
def initialize2(@name, @age, @gender, @country)
end
end
# Our real-world case can be rewritten as below
class Server
def initialize(@private_key:, @protocol_manage:, @bootstrap_nodes: [],
@node_name: 'Ciri', @tcp_host: '127.0.0.1', @tcp_port: 33033)
server_node_id = NodeID.new(@private_key)
caps = [Cap.new(name: 'eth', version: 63)]
@handshake = ProtocolHandshake.new(version: BASE_PROTOCOL_VERSION, name: @node_name, id: server_node_id.id, caps: caps)
@dial = Dial.new(bootstrap_nodes: @bootstrap_nodes, private_key: @private_key, handshake: @handshake)
@network_state = NetworkState.new(@protocol_manage)
@dial_scheduler = DialScheduler.new(@network_state, @dial)
end
end
# consider to keep consistency, this "shortcut assigning" syntax should work for non-initialize methods
class Foo
def bar(@still_works)
p @still_works
end
end
```
--
https://bugs.ruby-lang.org/
Issue #19694 has been reported by aharpole (Aaron Harpole).
----------------------------------------
Feature #19694: Add Regexp#timeout= setter
https://bugs.ruby-lang.org/issues/19694
* Author: aharpole (Aaron Harpole)
* Status: Open
* Priority: Normal
----------------------------------------
# Abstract
In addition to allowing for a Regexp timeout to be set on individual instances by setting a `timeout` argument in `Regexp.new`, I'm proposing that we also allow setting the timeout on Regexp objects with a `#timeout=` setter.
# Background
To be able to roll out a global Regexp timeout for a large application, there are inevitably some individual regexes for which a different timeout is appropriate. While the `timeout` keyword argument was added to `Regexp.new`, this isn't always a viable option.
In the case of regex literal syntax (`/ab*/` or `%r{ab*}`, for instance), it's not possible to set a timeout at all right now without converting to `Regexp.new`, which may be awkward depending on the contents of the regex.
It also is desirable from time to time to be able to set a timeout for a regex object after it's been initialized.
Finally, because we offer a `Regexp#timeout` getter, for consistency it would be nice to also offer a setter.
The introduction of a `Regexp#timeout=` setter was mentioned as a possible way to set individual timeouts in https://bugs.ruby-lang.org/issues/19104#Specification.
# Proposal
I propose that we add the method `Regexp#timeout=`. It works the same way the `timeout` argument works in `Regexp.new`, taking either a float or nil.
This makes it relatively easy to add timeouts to specific regex literals (regex literals are frozen by default so you do have to `dup` them first):
```
emoji_filter_pattern = %r{
(?<!#{Regexp.quote(ZERO_WIDTH_JOINER)})
#{EmojiFilter.unicodes_pattern}
(?!#{Regexp.union(EmojiFilter::MODIFIER_CHAR_MAP.keys.map { |k| Regexp.quote k })})
}x.dup
emoji_filter_pattern.timeout = 1.0
emoji_filter_pattern.freeze
```
# Implementation
This setter has been implemented in https://github.com/ruby/ruby/pull/7847.
# Evaluation
It's just a setter, so pretty straightforward in terms of implementation and use.
# Discussion
It's worth considering other options for overriding `Regexp.timeout`. I'd love to see something like the following for overriding regexp timeouts as well:
```
Regexp.timeout = 1.0
Regexp.with_timeout(5.0) do
evaluate_slower_regexes
end
```
It's possible to implement something like `Regexp.with_timeout` but it's not thread-safe by default since it would involve overwriting `Regexp.timeout`.
# Summary
Regexp instances have a getter for timeout, and adding a corresponding setter adds consistency and will make it easier for developers to adopt adding a global `Regexp.timeout` by making it simpler to adjust timeouts on a regex by regex basis.
It's a minor change but the added consistency and flexibility help us optimize for developer happiness.
--
https://bugs.ruby-lang.org/
Issue #18572 has been updated by Eregon (Benoit Daloze).
shugo (Shugo Maeda) wrote in #note-10:
> It's simple to prohibit using calls in blocks, but it will break backward compatibility, e.g., using in module_eval.
It would still be possible to use `using` outside the `module_eval` (e.g. in a module body or at top-level), which seems more correct anyway since refinements should be per constant scope/cref, they are not per block.
----------------------------------------
Bug #18572: Performance regression when invoking refined methods
https://bugs.ruby-lang.org/issues/18572#change-103501
* Author: palkan (Vladimir Dementyev)
* Status: Assigned
* Priority: Normal
* Assignee: ko1 (Koichi Sasada)
* Backport: 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
Since Ruby 3.0, defining a refinement for a method slows down its execution even if we do not activate the refinement:
```ruby
require "benchmark_driver"
source = <<~RUBY
class Hash
def symbolize_keys
transform_keys { |key| key.to_sym rescue key }
end
def refined_symbolize_keys
transform_keys { |key| key.to_sym rescue key }
end
end
module HashRefinements
refine Hash do
def refined_symbolize_keys
raise "never called"
end
end
end
HASH = {foo: 1, bar: 2, baz: 3}
class Foo
def original
end
def refined
end
end
module FooRefinements
refine Foo do
def refined
raise "never called"
end
end
end
FOO = Foo.new
RUBY
Benchmark.driver do |x|
x.prelude %Q{
#{source}
}
x.report "#symbolize_keys original", %{ HASH.symbolize_keys }
x.report "#symbolize_keys refined", %{ HASH.refined_symbolize_keys }
end
Benchmark.driver do |x|
x.prelude %Q{
#{source}
}
x.report "no-op original", %{ FOO.original }
x.report "no-op refined", %{ FOO.refined }
end
```
The results for Ruby 3.1:
```sh
...
Comparison:
#symbolize_keys original: 2372420.1 i/s
#symbolize_keys refined: 1941019.0 i/s - 1.22x slower
...
Comparison:
no-op original: 51790974.2 i/s
no-op refined: 14456518.9 i/s - 3.58x slower
```
For Ruby 2.6 and 2.7:
```sh
Comparison:
#symbolize_keys original: 2278339.7 i/s
#symbolize_keys refined: 2264153.1 i/s - 1.01x slower
...
Comparison:
no-op refined: 64178338.5 i/s
no-op original: 63357980.1 i/s - 1.01x slower
```
You can find the full code and more results in this [gist](https://gist.github.com/palkan/637dc83edd86d70b5dbf72f2a4d702e5).
P.S. The problem was originally noticed by @byroot, see https://github.com/ruby-i18n/i18n/pull/573
--
https://bugs.ruby-lang.org/
Issue #19712 has been reported by itarato (Peter Arato).
----------------------------------------
Feature #19712: IO#reopen removes singleton class
https://bugs.ruby-lang.org/issues/19712
* Author: itarato (Peter Arato)
* Status: Open
* Priority: Normal
----------------------------------------
The documentation states:
> This may dynamically change the actual class of this stream.
As well `#reopen` removes the singleton class, even when the logical class is the same. This can be surprising at times.
An example:
``` ruby
io = File.open(__FILE__)
io.define_singleton_method(:foo) { $stdout.write("hello") }
io.reopen(File.open(__FILE__))
io.foo # `<main>': undefined method `foo' for #<File:/test.rb> (NoMethodError)
```
An example where this was an issue: https://github.com/oracle/truffleruby/pull/3088 Tl;dr: a popular gem was enhancing the singleton class of STDOUT, while Rails/ActiveSupport was executing an `IO#reopen` - invalidating the changes by the gem.
While it's hard to trivially solve this with keeping the intended functionality, could it be considered to make edge cases more visible?
Examples:
- `IO#reopen` issues a warning message when the receiver's singleton class has been updated (but keep removing it as of current state)
- `IO#reopen` is limited on instances with a default singleton class, and issues an error when called on an instance with an updated singleton class
- make `IO#reopen` carry over the singleton class *only* if the recipient and argument's class are the same, and yield an error otherwise (different classes)
These are just ideas. I'm more looking for feedback from the core devs at this point. Thanks in advance!
--
https://bugs.ruby-lang.org/
Issue #19057 has been updated by ioquatix (Samuel Williams).
Thanks for the report, fixed in https://github.com/ioquatix/raindrops/commit/94dbdd94977d895f98c084d0ca31c2…
----------------------------------------
Feature #19057: Hide implementation of `rb_io_t`.
https://bugs.ruby-lang.org/issues/19057#change-103493
* Author: ioquatix (Samuel Williams)
* Status: Assigned
* Priority: Normal
* Assignee: ioquatix (Samuel Williams)
----------------------------------------
In order to make improvements to the IO implementation like <https://bugs.ruby-lang.org/issues/18455>, we need to add new fields to `struct rb_io_t`.
By the way, ending types in `_t` is not recommended by POSIX, so I'm also trying to rename the internal implementation to drop `_t` where possible during this conversion.
Anyway, we should try to hide the implementation of `struct rb_io`. Ideally, we don't expose any of it, but the problem is backwards compatibility.
So, in order to remain backwards compatibility, we should expose some fields of `struct rb_io`, the most commonly used one is `fd` and `mode`, but several others are commonly used.
There are many fields which should not be exposed because they are implementation details.
## Current proposal
The current proposed change <https://github.com/ruby/ruby/pull/6511> creates two structs:
```c
// include/ruby/io.h
#ifndef RB_IO_T
struct rb_io {
int fd;
// ... public fields ...
};
#else
struct rb_io;
#endif
// internal/io.h
#define RB_IO_T
struct rb_io {
int fd;
// ... public fields ...
// ... private fields ...
};
```
However, we are not 100% confident this is safe according to the C specification. My experience is not sufficiently wide to say this is safe in practice, but it does look okay to both myself, and @Eregon + @tenderlovemaking have both given some kind of approval.
That being said, maybe it's not safe.
There are two alternatives:
## Hide all details
We can make public `struct rb_io` completely invisible.
```c
// include/ruby/io.h
#define RB_IO_HIDDEN
struct rb_io;
int rb_ioptr_descriptor(struct rb_io *ioptr); // accessor for previously visible state.
// internal/io.h
struct rb_io {
// ... all fields ...
};
```
This would only be forwards compatible, and code would need to feature detect like this:
```c
#ifdef RB_IO_HIDDEN
#define RB_IOPTR_DESCRIPTOR rb_ioptr_descriptor
#else
#define RB_IOPTR_DESCRIPTOR(ioptr) rb_ioptr_descriptor(ioptr)
#endif
```
## Nested public interface
Alternatively, we can nest the public fields into the private struct:
```c
// include/ruby/io.h
struct rb_io_public {
int fd;
// ... public fields ...
};
// internal/io.h
#define RB_IO_T
struct rb_io {
struct rb_io_public public;
// ... private fields ...
};
```
## Considerations
I personally think the "Hide all details" implementation is the best, but it's also the lest compatible. This is also what we are ultimately aiming for, whether we decide to take an intermediate "compatibility step" is up to us.
I think "Nested public interface" is messy and introduces more complexity, but it might be slightly better defined than the "Current proposal" which might create undefined behaviour. That being said, all the tests are passing.
--
https://bugs.ruby-lang.org/
Issue #19057 has been updated by byroot (Jean Boussier).
@ioquatix I'm not sure which change exactly is the cause, but it appears that the recent `rb_io_t` changes broke `raindrops`
```
current directory: /usr/local/lib/ruby/gems/3.3.0+0/gems/raindrops-0.20.1/ext/raindrops
make DESTDIR\= sitearchdir\=./.gem.20230609-25-giurc0 sitelibdir\=./.gem.20230609-25-giurc0 clean
current directory: /usr/local/lib/ruby/gems/3.3.0+0/gems/raindrops-0.20.1/ext/raindrops
make DESTDIR\= sitearchdir\=./.gem.20230609-25-giurc0 sitelibdir\=./.gem.20230609-25-giurc0
compiling linux_inet_diag.c
In file included from /usr/include/x86_64-linux-gnu/bits/libc-header-start.h:33,
from /usr/include/stdio.h:27,
from /usr/local/include/ruby-3.3.0+0/ruby/defines.h:16,
from /usr/local/include/ruby-3.3.0+0/ruby/ruby.h:25,
from /usr/local/include/ruby-3.3.0+0/ruby.h:38,
from linux_inet_diag.c:1:
/usr/include/features.h:187:3: warning: #warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE" [-Wcpp]
187 | # warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE"
| ^~~~~~~
In file included from linux_inet_diag.c:4:
my_fileno.h:9:7: warning: "HAVE_RB_IO_T" is not defined, evaluates to 0 [-Wundef]
9 | #if ! HAVE_RB_IO_T
| ^~~~~~~~~~~~
my_fileno.h:16:8: warning: "HAVE_RB_IO_T" is not defined, evaluates to 0 [-Wundef]
16 | # if !HAVE_RB_IO_T || (RUBY_VERSION_MAJOR == 1 && RUBY_VERSION_MINOR == 8)
| ^~~~~~~~~~~~
my_fileno.h: In function ‘my_fileno’:
my_fileno.h:10:19: error: unknown type name ‘OpenFile’; did you mean ‘GetOpenFile’?
10 | # define rb_io_t OpenFile
| ^~~~~~~~
my_fileno.h:25:2: note: in expansion of macro ‘rb_io_t’
25 | rb_io_t *fptr;
| ^~~~~~~
In file included from my_fileno.h:3,
from linux_inet_diag.c:4:
/usr/local/include/ruby-3.3.0+0/ruby/io.h:395:55: warning: assignment to ‘int *’ from incompatible pointer type ‘struct rb_io *’ [-Wincompatible-pointer-types]
395 | #define RB_IO_POINTER(obj,fp) rb_io_check_closed((fp) = RFILE(rb_io_taint_check(obj))->fptr)
| ^
/usr/local/include/ruby-3.3.0+0/ruby/io.h:401:21: note: in expansion of macro ‘RB_IO_POINTER’
401 | #define GetOpenFile RB_IO_POINTER
| ^~~~~~~~~~~~~
my_fileno.h:30:2: note: in expansion of macro ‘GetOpenFile’
30 | GetOpenFile(io, fptr);
| ^~~~~~~~~~~
/usr/local/include/ruby-3.3.0+0/ruby/io.h:395:55: warning: passing argument 1 of ‘rb_io_check_closed’ from incompatible pointer type [-Wincompatible-pointer-types]
395 | #define RB_IO_POINTER(obj,fp) rb_io_check_closed((fp) = RFILE(rb_io_taint_check(obj))->fptr)
| ~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| |
| int *
/usr/local/include/ruby-3.3.0+0/ruby/io.h:401:21: note: in expansion of macro ‘RB_IO_POINTER’
401 | #define GetOpenFile RB_IO_POINTER
| ^~~~~~~~~~~~~
my_fileno.h:30:2: note: in expansion of macro ‘GetOpenFile’
30 | GetOpenFile(io, fptr);
| ^~~~~~~~~~~
/usr/local/include/ruby-3.3.0+0/ruby/io.h:638:34: note: expected ‘rb_io_t *’ {aka ‘struct rb_io *’} but argument is of type ‘int *’
638 | void rb_io_check_closed(rb_io_t *fptr);
| ~~~~~~~~~^~~~
In file included from linux_inet_diag.c:4:
my_fileno.h:17:41: error: request for member ‘f’ in something not a structure or union
17 | # define FPTR_TO_FD(fptr) fileno(fptr->f)
| ^~
my_fileno.h:31:7: note: in expansion of macro ‘FPTR_TO_FD’
31 | fd = FPTR_TO_FD(fptr);
| ^~~~~~~~~~
linux_inet_diag.c: At top level:
cc1: warning: unrecognized command line option ‘-Wno-self-assign’
cc1: warning: unrecognized command line option ‘-Wno-parentheses-equality’
cc1: warning: unrecognized command line option ‘-Wno-constant-logical-operand’
make: *** [Makefile:248: linux_inet_diag.o] Error 1
```
----------------------------------------
Feature #19057: Hide implementation of `rb_io_t`.
https://bugs.ruby-lang.org/issues/19057#change-103492
* Author: ioquatix (Samuel Williams)
* Status: Assigned
* Priority: Normal
* Assignee: ioquatix (Samuel Williams)
----------------------------------------
In order to make improvements to the IO implementation like <https://bugs.ruby-lang.org/issues/18455>, we need to add new fields to `struct rb_io_t`.
By the way, ending types in `_t` is not recommended by POSIX, so I'm also trying to rename the internal implementation to drop `_t` where possible during this conversion.
Anyway, we should try to hide the implementation of `struct rb_io`. Ideally, we don't expose any of it, but the problem is backwards compatibility.
So, in order to remain backwards compatibility, we should expose some fields of `struct rb_io`, the most commonly used one is `fd` and `mode`, but several others are commonly used.
There are many fields which should not be exposed because they are implementation details.
## Current proposal
The current proposed change <https://github.com/ruby/ruby/pull/6511> creates two structs:
```c
// include/ruby/io.h
#ifndef RB_IO_T
struct rb_io {
int fd;
// ... public fields ...
};
#else
struct rb_io;
#endif
// internal/io.h
#define RB_IO_T
struct rb_io {
int fd;
// ... public fields ...
// ... private fields ...
};
```
However, we are not 100% confident this is safe according to the C specification. My experience is not sufficiently wide to say this is safe in practice, but it does look okay to both myself, and @Eregon + @tenderlovemaking have both given some kind of approval.
That being said, maybe it's not safe.
There are two alternatives:
## Hide all details
We can make public `struct rb_io` completely invisible.
```c
// include/ruby/io.h
#define RB_IO_HIDDEN
struct rb_io;
int rb_ioptr_descriptor(struct rb_io *ioptr); // accessor for previously visible state.
// internal/io.h
struct rb_io {
// ... all fields ...
};
```
This would only be forwards compatible, and code would need to feature detect like this:
```c
#ifdef RB_IO_HIDDEN
#define RB_IOPTR_DESCRIPTOR rb_ioptr_descriptor
#else
#define RB_IOPTR_DESCRIPTOR(ioptr) rb_ioptr_descriptor(ioptr)
#endif
```
## Nested public interface
Alternatively, we can nest the public fields into the private struct:
```c
// include/ruby/io.h
struct rb_io_public {
int fd;
// ... public fields ...
};
// internal/io.h
#define RB_IO_T
struct rb_io {
struct rb_io_public public;
// ... private fields ...
};
```
## Considerations
I personally think the "Hide all details" implementation is the best, but it's also the lest compatible. This is also what we are ultimately aiming for, whether we decide to take an intermediate "compatibility step" is up to us.
I think "Nested public interface" is messy and introduces more complexity, but it might be slightly better defined than the "Current proposal" which might create undefined behaviour. That being said, all the tests are passing.
--
https://bugs.ruby-lang.org/