
Issue #19315 has been updated by duerst (Martin Dürst).


Hanmac (Hans Mackowiak) wrote in #note-11:
> it confused me too, i thought Copy On Write was default for shared strings
>
> https://patshaughnessy.net/2012/1/18/seeing-double-how-ruby-shares-string-va...

Pat Shaughnessy in his blog describes exactly the same thing as Benoit Daloze above: Ruby shares string data as long as the ends of the strings align. The reason for this is that (C)Ruby uses NUL-terminated string data.

----------------------------------------
Feature #19315: Lazy substrings in CRuby
https://bugs.ruby-lang.org/issues/19315#change-103567

* Author: Eregon (Benoit Daloze)
* Status: Open
* Priority: Normal
----------------------------------------
CRuby should implement lazy substrings, i.e., "abcdef"[1..3] must not copy bytes.

Currently CRuby only reuses the char* if the substring extends to the end of the buffer.
But it should also work wherever the substring starts and ends.
Yes, it means RSTRING_PTR() might need to allocate to \0-terminate, so be it, it's worth it.

There is already code for this (`SHARABLE_MIDDLE_SUBSTRING`), but it's disabled by default and `RSTRING_PTR()` needs to be changed to deal with this.

It seems a good idea to introduce a variant of `RSTRING_PTR` which doesn't guarantee \0-termination, so that callers which don't need the terminator can always use the existing bytes without a copy.

There are countless workarounds for this missing optimization, none of which would be needed with lazy substrings, and all of which are less readable:
* https://bugs.ruby-lang.org/issues/19314
* https://bugs.ruby-lang.org/issues/18598#note-3
* https://github.com/ruby/net-protocol/pull/14
* Manual lazy substrings which track string + index + length
* There are more, but I don't remember them all right now; feel free to comment or link more URLs/tickets.

-- 
https://bugs.ruby-lang.org/
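
For illustration, the "manual lazy substrings which track string + index + length" workaround from the list above might look roughly like the sketch below. `LazySubstring` and its methods are hypothetical names invented here, not part of Ruby or of any of the linked tickets. The sketch assumes byte offsets rather than character offsets, and it assumes the source string is not mutated while a view exists.

```ruby
# Hypothetical helper (not a Ruby API): a view onto part of a String that
# records (source, offset, length) instead of copying bytes up front.
class LazySubstring
  attr_reader :offset, :length

  def initialize(source, offset, length)
    if offset < 0 || length < 0 || offset + length > source.bytesize
      raise ArgumentError, "range out of bounds"
    end
    @source = source
    @offset = offset
    @length = length
  end

  # Narrow the view further; only the bookkeeping changes, no bytes are copied.
  def slice(start, len)
    if start < 0 || len < 0 || start + len > @length
      raise ArgumentError, "range out of bounds"
    end
    LazySubstring.new(@source, @offset + start, len)
  end

  # Read a single byte through the view (nil when out of range).
  def getbyte(index)
    return nil if index < 0 || index >= @length
    @source.getbyte(@offset + index)
  end

  # Materialize a real String; this is the only place a copy can happen.
  def to_s
    @source.byteslice(@offset, @length)
  end
end

view  = LazySubstring.new("abcdef", 1, 3) # same bytes as "abcdef"[1..3]
inner = view.slice(1, 2)                  # a view of a view, still no copy
```

Every caller has to carry such a wrapper around and call `to_s` at the boundary to code that expects a real String, which is the readability cost the issue description complains about.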