[ruby-core:116514] [Ruby master Bug#20228] Memory leak in Regexp timeout

Issue #20228 has been reported by peterzhu2118 (Peter Zhu).

----------------------------------------
Bug #20228: Memory leak in Regexp timeout
https://bugs.ruby-lang.org/issues/20228

* Author: peterzhu2118 (Peter Zhu)
* Status: Open
* Priority: Normal
* Backport: 3.0: UNKNOWN, 3.1: DONTNEED, 3.2: REQUIRED, 3.3: REQUIRED
----------------------------------------
GitHub PR: https://github.com/ruby/ruby/pull/9765

If a Regexp::TimeoutError is raised, the `stk_base` and `OnigRegion` will leak. For example:

```ruby
Regexp.timeout = 0.001
regex = /^(a*)*$/
str = "a" * 1000000 + "x"

10.times do
  100.times do
    begin
      regex =~ str
    rescue
    end
  end

  puts `ps -o rss= -p #{$$}`
end
```

Before:

```
328800
632416
934368
1230448
1531088
1831248
2125072
2414384
2703440
2995664
```

After:

```
39280
47888
49024
56240
56496
56512
56592
56592
56720
56720
```

--
https://bugs.ruby-lang.org/
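The reproduction above rescues every exception, so it does not show how many iterations actually hit the timeout path. A minimal sketch for checking that is below. The helper name `count_regexp_timeouts` is hypothetical and not part of the report or the PR; it assumes `Regexp.timeout=` and `Regexp::TimeoutError` are available (Ruby 3.2 and later) and returns `nil` otherwise. How many attempts time out depends on the machine, so no particular count is claimed.

```ruby
# Hypothetical helper (not from the report): count how many of `attempts`
# match attempts raise Regexp::TimeoutError under the given global timeout.
# Returns nil on Rubies that do not provide Regexp.timeout=.
def count_regexp_timeouts(attempts, timeout: 0.001)
  return nil unless Regexp.respond_to?(:timeout=)

  previous = Regexp.timeout          # remember the global setting
  Regexp.timeout = timeout
  regex = /^(a*)*$/                  # same pattern as the report
  str = "a" * 1_000_000 + "x"        # same input as the report

  timeouts = 0
  attempts.times do
    begin
      regex =~ str
    rescue Regexp::TimeoutError
      timeouts += 1                  # this is the path that leaked
    end
  end
  timeouts
ensure
  # Restore the previous global timeout (nil means "no timeout").
  Regexp.timeout = previous if Regexp.respond_to?(:timeout=)
end
```

Calling `count_regexp_timeouts(100)` next to the RSS measurement gives a rough idea of how much memory each timed-out match costs on an unpatched build.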

Issue #20228 has been updated by nobu (Nobuyoshi Nakada).

Using ruby APIs in onigmo doesn’t feel nice.

----------------------------------------
Bug #20228: Memory leak in Regexp timeout
https://bugs.ruby-lang.org/issues/20228#change-106539

Issue #20228 has been updated by mame (Yusuke Endoh).

Good find, thanks! I'm a little concerned about the overhead of `rb_protect` for a typical simple match, but it's neglectable?

----------------------------------------
Bug #20228: Memory leak in Regexp timeout
https://bugs.ruby-lang.org/issues/20228#change-106541

Issue #20228 has been updated by peterzhu2118 (Peter Zhu).

> Using ruby APIs in onigmo doesn’t feel nice.

I changed it to call `HANDLE_REG_TIMEOUT_IN_MATCH_AT` in onigmo, which calls `rb_reg_raise_timeout`, so that there is no Ruby code in onigmo.

> I'm a little concerned about the overhead of rb_protect for a typical simple match, but it's neglectable?

I think it can only raise when there is a timeout set, so I changed the implementation to only use `rb_protect` when there is a timeout.

----------------------------------------
Bug #20228: Memory leak in Regexp timeout
https://bugs.ruby-lang.org/issues/20228#change-106544
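The idea in this reply (pay for the guard only when a timeout can fire, and always free the scratch memory when it does) can be modelled at the Ruby level. The sketch below is an analogy, not the C patch: `ScratchPool`, `FakeTimeout`, `match_unguarded`, `match_guarded`, and `match` are all hypothetical names, and `ensure` stands in for the `rb_protect`-plus-cleanup pattern on the C side.

```ruby
# Hypothetical stand-ins for illustration; none of these exist in Ruby or Onigmo.
class FakeTimeout < StandardError; end

# Tracks how many scratch buffers are currently allocated.
class ScratchPool
  attr_reader :live

  def initialize
    @live = 0
  end

  def allocate
    @live += 1
  end

  def release
    @live -= 1
  end
end

# No guard: if the block raises, release is skipped and the buffer leaks.
def match_unguarded(pool)
  pool.allocate
  result = yield
  pool.release
  result
end

# Guarded: ensure runs the cleanup whether or not the block raises.
def match_guarded(pool)
  pool.allocate
  yield
ensure
  pool.release
end

# Use the guarded path only when a timeout is set, on the assumption
# stated in the reply that the match can only raise in that case.
def match(pool, timeout_set:, &body)
  timeout_set ? match_guarded(pool, &body) : match_unguarded(pool, &body)
end
```

With a block that raises `FakeTimeout`, `match_unguarded` leaves `pool.live` at 1, while `match(pool, timeout_set: true)` leaves it at 0. The unguarded path stays cheap, but it is only safe as long as the assumption that it cannot raise holds.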

Issue #20228 has been updated by naruse (Yui NARUSE).

Backport changed from 3.0: UNKNOWN, 3.1: DONTNEED, 3.2: REQUIRED, 3.3: REQUIRED to 3.0: UNKNOWN, 3.1: DONTNEED, 3.2: REQUIRED, 3.3: DONE

ruby_3_3 c626c201e4129bbea17583ecef73472c6f668c81 merged revision(s) 01bfd1a2bf013a9ed92a9722ac5228187e05e6a8,1c120efe02d079b0a1dea573cf0fd7978d9cc857,31378dc0969f4466b2122d730b7298dd7004acdf.

----------------------------------------
Bug #20228: Memory leak in Regexp timeout
https://bugs.ruby-lang.org/issues/20228#change-107359

* Status: Closed

Issue #20228 has been updated by nagachika (Tomoyuki Chikanaga).

Backport changed from 3.0: UNKNOWN, 3.1: DONTNEED, 3.2: REQUIRED, 3.3: DONE to 3.0: UNKNOWN, 3.1: DONTNEED, 3.2: WONTFIX, 3.3: DONE

I gave up on making a clean patch for the ruby_3_2 branch. Please open a PR if you want this backported.

----------------------------------------
Bug #20228: Memory leak in Regexp timeout
https://bugs.ruby-lang.org/issues/20228#change-108984

* Status: Closed
participants (5)
- mame (Yusuke Endoh)
- nagachika (Tomoyuki Chikanaga)
- naruse (Yui NARUSE)
- nobu (Nobuyoshi Nakada)
- peterzhu2118 (Peter Zhu)