Issue #19315 has been updated by ianks (Ian Ker-Seymer).
> It seems a good idea to introduce a variant of
> `RSTRING_PTR` which doesn't guarantee \0-termination, so such callers can then use the
> existing bytes always without copy.
It would be nice to have a way to get the raw parts of a string ([ptr, len]) as part of
the official Ruby C API. As you mentioned, `RSTRING_PTR` has some caveats:
1. It may reallocate
2. It relies on inline code (not accessible via dylib)
As a workaround, I’ve seen a lot of hacks in the wild that manually implement this logic,
and it gets hairy since you have to consider embedded strings, etc.
So if we are going to add a feature, we should add something like `rb_string_raw_parts`
which can return a tuple of [ptr, len].
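
To make the idea concrete, here is a minimal sketch of what such an accessor could look like. It is a toy model only: `toy_string`, `toy_raw_parts`, `TOY_EMBED_MAX` and `toy_string_raw_parts` are hypothetical names, the struct is not the real `RString` layout, and `rb_string_raw_parts` does not exist in the Ruby C API today. The point is that one exported function can hide the embedded/heap distinction and hand back [ptr, len] without allocating.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_EMBED_MAX 16  /* hypothetical inline capacity, not CRuby's real value */

/* Toy model of a string whose bytes live either inside the object
 * (embedded) or in a separate buffer. NOT the real RString layout. */
typedef struct {
    int embedded;          /* nonzero: bytes live in as.embed */
    size_t len;            /* byte length, no NUL terminator counted */
    union {
        char embed[TOY_EMBED_MAX];
        const char *heap;  /* borrowed pointer; the toy model never frees it */
    } as;
} toy_string;

/* The [ptr, len] pair. No NUL terminator is promised after ptr[len - 1]. */
typedef struct {
    const char *ptr;
    size_t len;
} toy_raw_parts;

/* Build a toy string over `bytes`, embedding it when it fits. */
toy_string toy_string_new(const char *bytes, size_t len)
{
    toy_string s;
    s.len = len;
    if (len <= TOY_EMBED_MAX) {
        s.embedded = 1;
        memcpy(s.as.embed, bytes, len);
    } else {
        s.embedded = 0;
        s.as.heap = bytes;
    }
    return s;
}

/* Hypothetical analogue of the proposed rb_string_raw_parts: a single
 * call that picks the right storage and never allocates or copies. */
toy_raw_parts toy_string_raw_parts(const toy_string *s)
{
    toy_raw_parts parts;
    assert(s != NULL);
    parts.ptr = s->embedded ? s->as.embed : s->as.heap;
    parts.len = s->len;
    return parts;
}
```

A caller would write `toy_raw_parts p = toy_string_raw_parts(&s);` and then read exactly `p.len` bytes from `p.ptr`, without ever needing to know whether the string is embedded.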
----------------------------------------
Feature #19315: Lazy substrings in CRuby
https://bugs.ruby-lang.org/issues/19315#change-101599
* Author: Eregon (Benoit Daloze)
* Status: Open
* Priority: Normal
----------------------------------------
CRuby should implement lazy substrings, i.e., "abcdef"[1..3] must not copy
bytes.
Currently CRuby only reuses the char* if the substring extends to the end of the buffer.
But it should also work wherever the substring starts and ends.
Yes, this means `RSTRING_PTR()` might need to allocate to \0-terminate. So be it; it's
worth it.
There is already code for this (`SHARABLE_MIDDLE_SUBSTRING`), but it's disabled by
default and `RSTRING_PTR()` needs to be changed to deal with this.
It seems a good idea to introduce a variant of `RSTRING_PTR` which doesn't guarantee
\0-termination, so such callers can then use the existing bytes always without copy.
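
The trade-off described above can be sketched in plain C. This is an illustration under stated assumptions, not CRuby's implementation: `lazy_str`, `lazy_substr`, `lazy_bytes`, `lazy_cstr` and `lazy_free` are hypothetical names. `lazy_bytes` plays the role of the proposed non-terminating variant and never copies, while `lazy_cstr` plays the role of a `RSTRING_PTR()` that allocates only when the view stops before the end of the shared buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* A substring that shares the bytes of a NUL-terminated parent buffer. */
typedef struct {
    const char *base;   /* start of the shared buffer */
    size_t base_len;    /* bytes in the shared buffer, excluding its final NUL */
    size_t off;         /* where this view starts inside base */
    size_t len;         /* bytes in this view */
    char *owned;        /* private NUL-terminated copy, created on demand */
} lazy_str;

/* Create a view of `len` bytes starting at `off`; copies nothing. */
lazy_str lazy_substr(const char *base, size_t base_len, size_t off, size_t len)
{
    lazy_str s = { base, base_len, off, len, NULL };
    assert(off <= base_len && len <= base_len - off);
    return s;
}

/* Non-terminating accessor: returns the shared bytes and their length.
 * Never allocates; the byte after the view may not be NUL. */
const char *lazy_bytes(const lazy_str *s, size_t *len_out)
{
    *len_out = s->len;
    return s->base + s->off;
}

/* Terminating accessor: returns a NUL-terminated pointer. Copies only if
 * the view ends before the end of the shared buffer, and caches the copy.
 * Returns NULL if the allocation fails. */
const char *lazy_cstr(lazy_str *s)
{
    if (s->off + s->len == s->base_len)
        return s->base + s->off;  /* suffix: the buffer's own NUL terminates it */
    if (s->owned == NULL) {
        s->owned = malloc(s->len + 1);
        if (s->owned == NULL)
            return NULL;
        memcpy(s->owned, s->base + s->off, s->len);
        s->owned[s->len] = '\0';
    }
    return s->owned;
}

/* Release the on-demand copy, if one was made. */
void lazy_free(lazy_str *s)
{
    free(s->owned);
    s->owned = NULL;
}
```

With the buffer "abcdef", `lazy_substr(buf, 6, 1, 3)` models `"abcdef"[1..3]`: `lazy_bytes` returns `buf + 1` with length 3 and no copy, and only a call to `lazy_cstr` pays for an allocation.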
There are countless workarounds for this missing optimization, all of them less readable
and none of them worthwhile once substrings are lazy:
* https://bugs.ruby-lang.org/issues/19314
* https://bugs.ruby-lang.org/issues/18598#note-3
* https://github.com/ruby/net-protocol/pull/14
* Manual lazy substrings which track string + index + length
* More, but I don't remember them all right now; feel free to comment or link more URLs/tickets.
--
https://bugs.ruby-lang.org/