Issue #19784 has been updated by mame (Yusuke Endoh).
Do you mean that if the argument or if the receiver
String is not `String#valid_encoding?`, then we compare byte-by-byte, and otherwise we
compare character-by-character?
No.
I think we should not consider whether a given
substring is valid
I think it works this way. In the case of `"\xFF\xC3\x84"` (=
`"\xFFÄ"`), the byte sequence starting at byte offset 0 is invalid, so we take
out one byte, `"\xFF"`. The next byte sequence, starting at byte offset 1, is
valid, so we take out two bytes (one character), `"\xC3\x84"`, and so on.
I think @akr or @naruse can explain the rationale.
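
To illustrate the splitting described above, here is a minimal sketch using `String#chars` on the same byte sequence. It assumes a UTF-8 source file; the variable names are only for illustration.

``` ruby
# The sample string from the explanation: one invalid byte followed by "Ä".
s = "\xFF\xC3\x84"

p s.valid_encoding?             # false: the string as a whole is broken

# Characters are taken out one at a time from the front:
# an invalid byte becomes a one-byte "character", a valid sequence stays whole.
chars = s.chars
p chars.map(&:bytesize)         # [1, 2]
p chars.map(&:valid_encoding?)  # [false, true]
```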
----------------------------------------
Bug #19784: String#delete_prefix! problem
https://bugs.ruby-lang.org/issues/19784#change-104325
* Author: inversion (Yura Babak)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
Here is the snippet; the question is in the comments:
``` ruby
fp = 'with_BOM_16.txt'
body = File.read(fp).force_encoding('UTF-8')
p body # "\xFF\xFE1\u00001\u0000"
p body.start_with?("\xFF\xFE") # true
body.delete_prefix!("\xFF\xFE") # !!! why doesn't this work?
p body # "\xFF\xFE1\u00001\u0000"
p body.start_with?("\xFF\xFE") # true
body[0, 2] = ''
p body # "1\u00001\u0000"
p body.start_with?("\xFF\xFE") # false
```
It behaves the same
on Linux (ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux])
and Windows (ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x64-mingw-ucrt]).
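
One possible workaround, offered only as a sketch: if the goal is to strip a UTF-16LE BOM, the data can be handled as binary, where every byte is a valid character, and transcoded afterwards. The in-memory string below is a hypothetical stand-in for the contents of `with_BOM_16.txt`, and the variable names are illustrative.

``` ruby
# Stand-in for File.binread(fp): the same bytes as in the report, as ASCII-8BIT.
body = "\xFF\xFE1\x001\x00".b
bom  = "\xFF\xFE".b

# Both strings are binary, so neither is a broken string.
stripped = body.delete_prefix(bom)
p stripped.bytesize             # 4

# Reinterpret the remaining bytes as UTF-16LE and transcode to UTF-8.
text = stripped.force_encoding('UTF-16LE').encode('UTF-8')
p text                          # "11"
```

This does not explain the behaviour in the report; it only avoids calling `delete_prefix!` with an argument that is not valid in its encoding.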
--
https://bugs.ruby-lang.org/