Issue #19470 has been updated by ioquatix (Samuel Williams).
Sorry, I read the example more closely. It seems we are taking a small slice, which marks
the original array as shared. Both the original array and the slice then point at the same
memory allocation, and on mutation a copy is made. I think @mame is right: in general
this is a good optimisation. But I see the problem: there is a degenerate case here.
I wonder if we can propose some way to avoid this degenerate case when mutation is
frequent, or when the slice should definitely be a copy rather than invoke CoW. Maybe a
percentage-based approach, or maybe we can detect that CoW is frequently triggering a full
copy and stop doing further CoW.
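To make the percentage-based idea concrete, here is a rough sketch in plain Ruby. It is only a toy model of the decision, not how CRuby implements array sharing: the method name `share_buffer?` and the `SHARE_MIN_RATIO` threshold are hypothetical, and a real value would have to come from benchmarking.

```ruby
# Hypothetical policy sketch, NOT CRuby's actual implementation.
# Decides whether a slice should share the source buffer (CoW)
# or be copied eagerly, based on how much of the source it covers.
SHARE_MIN_RATIO = 0.5 # assumed threshold, chosen only for illustration

def share_buffer?(slice_len, source_len, min_ratio = SHARE_MIN_RATIO)
  return false if source_len.zero?

  slice_len.to_f / source_len >= min_ratio
end

share_buffer?(4, 100_000)      # tiny slice of a big array: copy eagerly
share_buffer?(90_000, 100_000) # large slice: sharing avoids a big copy
```

With a rule like this, a 4-element slice of a 100000-element array would simply be copied, so a later write to the original would not have to duplicate the whole buffer.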
----------------------------------------
Bug #19470: Frequent small range-reads from and then writes to a large array are very
slow
https://bugs.ruby-lang.org/issues/19470#change-102088
* Author: giner (Stanislav German-Evtushenko)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.2.1 (2023-02-08 revision 31819e82c8) [x86_64-linux]
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
Writing to a large array gets very slow when done after range-reading more than 3 items. In
such a case the original array gets marked as shared, which triggers CoW on a small change
afterwards. This leads to a significant performance impact and high memory utilization in
cases when we need to range-read/write from/to the same array many times. While this issue
can be avoided by reading <= 3 elements at a time, the main problem is that this
behaviour is not obvious and is hard to catch in non-trivial projects.
```ruby
# Each entry in `times` accumulates the time spent in `arr[5] = 100`
# across all iterations of one scenario.
times = []
arr = [0] * 100000

# Baseline: write only
times.push 0
100000.times do
  time_start = Time.now
  arr[5] = 100 # takes 0.01662315899999512
  times[-1] += Time.now - time_start
end

# Range-read of 3 elements before each write
times.push 0
100000.times do
  arr[0..2]
  time_start = Time.now
  arr[5] = 100 # takes 0.01826406799999659
  times[-1] += Time.now - time_start
end

# Range-read of 4 elements before each write
times.push 0
100000.times do
  arr[0..3]
  time_start = Time.now
  arr[5] = 100 # takes 7.757753919000069
  times[-1] += Time.now - time_start
end

# dup before each write
times.push 0
100000.times do
  arr.dup
  time_start = Time.now
  arr[5] = 100 # takes 7.626929300999957
  times[-1] += Time.now - time_start
end

# clone before each write
times.push 0
100000.times do
  arr.clone
  time_start = Time.now
  arr[5] = 100 # takes 8.216933763000046
  times[-1] += Time.now - time_start
end

p times
```
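One possible workaround, sketched here under the assumption that building the small array element by element gives it its own storage and so leaves the large array unshared. `copy_range` is a hypothetical helper name, not an existing Array method, and I have not benchmarked it against the script above.

```ruby
# Hypothetical helper: copy a small range element by element
# instead of slicing, so the result is a freshly built array.
def copy_range(ary, start, len)
  Array.new(len) { |i| ary[start + i] }
end

arr = [0] * 100_000
head = copy_range(arr, 0, 4) # independent 4-element array
arr[5] = 100                 # write to the original afterwards
```

Unlike `arr[0..3]`, this helper does not truncate at the end of the array: indices past the end yield `nil` entries, so callers would need to clamp `len` themselves.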
--
https://bugs.ruby-lang.org/