
Issue #19571 has been updated by peterzhu2118 (Peter Zhu).
How is the new parameter used, and how does the current implementation calculate the limit without the new parameter?
The default value is 0.01 (1%). It's calculated as 1% of the `old_objects` count. You can see the implementation is:

```c
objspace->rgengc.uncollectible_wb_unprotected_objects_limit = MAX(
    (size_t)(objspace->rgengc.uncollectible_wb_unprotected_objects * r),
    (size_t)(objspace->rgengc.old_objects * gc_params.uncollectible_wb_unprotected_objects_limit_ratio)
);
```

The original implementation only used the `remembered_wb_unprotected_objects` multiplied by `RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR`:

```c
objspace->rgengc.uncollectible_wb_unprotected_objects_limit =
    (size_t)(objspace->rgengc.uncollectible_wb_unprotected_objects * r);
```
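To make the difference concrete, here is a minimal standalone sketch of both calculations outside of `gc.c`. The helper names (`max_size`, `old_limit`, `new_limit`) are hypothetical and not part of Ruby. The sample numbers below are illustrative only, and `r = 2.0` is an assumption about the value of `RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR`.

```c
#include <stddef.h>

/* Hypothetical helper standing in for the MAX() macro used in gc.c. */
static size_t
max_size(size_t a, size_t b)
{
    return a > b ? a : b;
}

/* Original calculation: only the WB unprotected object count scaled by r. */
static size_t
old_limit(size_t wb_unprotected_objects, double r)
{
    return (size_t)(wb_unprotected_objects * r);
}

/* Patched calculation: the larger of the original limit and a ratio of
 * the old object count. A ratio of 0 falls back to the original limit. */
static size_t
new_limit(size_t wb_unprotected_objects, size_t old_objects,
          double r, double ratio)
{
    return max_size((size_t)(wb_unprotected_objects * r),
                    (size_t)(old_objects * ratio));
}
```

With an illustrative 2,000 WB unprotected objects, 3,000,000 old objects, and an assumed `r` of 2.0, `old_limit` gives 4,000 while `new_limit` with a ratio of 0.01 gives 30,000. Passing a ratio of 0 gives 4,000 again, which matches the way the feature is turned off.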
My understanding is that this parameter is used for the major GC condition.
That is correct.
This new parameter can reduce the major GC count (and this is why the figures show the results)
Yes, with this feature, the number of major GCs run during requests is about 0.37x of the number without this feature.
Can we compare the major GC counts, unprotected objects count and memory footprint?
We have very few unprotected objects, so our `remembered_wb_unprotected_objects_limit` was very low. This meant that we reached the limit very frequently, which triggered major GC very frequently. But because we have a lot of old objects, we have to scan a few million old objects and we only free a few thousand remembered WB unprotected objects. This caused poor p99 response times. After this patch, the `remembered_wb_unprotected_objects_limit` is now 1% of the number of old objects, meaning that we don't trigger major GC as frequently. In Storefront Renderer, we don't see a change in average or p99 memory usage within the margin of error.
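The effect of a low limit can be sketched as follows. This is a simplified illustration, assuming the trigger is a plain greater-than comparison of the current count against the limit. The struct and function names are hypothetical, and the real `gc.c` also has other major GC triggers that are not modeled here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two rgengc counters discussed here. */
struct wb_unprotected_counters {
    size_t uncollectible_wb_unprotected_objects;
    size_t uncollectible_wb_unprotected_objects_limit;
};

/* Assumed trigger: request a major GC once the count exceeds the limit. */
static bool
needs_major_gc_by_wb_unprotected(const struct wb_unprotected_counters *c)
{
    return c->uncollectible_wb_unprotected_objects >
           c->uncollectible_wb_unprotected_objects_limit;
}
```

With illustrative numbers, a count of 5,000 exceeds a limit of 4,000 and requests a major GC, while the same count stays well under a limit of 30,000 (1% of 3,000,000 old objects).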
Could you try with other parameters, from 0.10 to 0.50 for example?
We tried with 0.02 and we saw an increase in response times because it makes minor GC slower (since we have many more objects to scan in minor GC). We also found that lower values (e.g. 0.005 or 0.0025) did not perform as well either. It seems that 0.01 is around the optimal value.
This parameter is a ratio for the "old objects", so I'm not sure it makes sense.
Do you mean `RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR` being used to calculate `remembered_wb_unprotected_objects_limit`? I agree in this case. It is confusing that `RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR` is used to calculate both `old_objects_limit` and `remembered_wb_unprotected_objects_limit`.

----------------------------------------
Feature #19571: Add REMEMBERED_WB_UNPROTECTED_OBJECTS_LIMIT_RATIO to the GC
https://bugs.ruby-lang.org/issues/19571#change-102630

* Author: peterzhu2118 (Peter Zhu)
* Status: Open
* Priority: Normal
----------------------------------------
GitHub PR: https://github.com/ruby/ruby/pull/7577

The proposed PR adds the environment variable `RUBY_GC_HEAP_REMEMBERED_WB_UNPROTECTED_OBJECTS_LIMIT_RATIO` which is used to calculate the `remembered_wb_unprotected_objects_limit` using a ratio of `old_objects`. This should improve performance by reducing major GC because, in a major GC, we mark all of the old objects, so we should have more uncollectible WB unprotected objects before starting a major GC. The default has been set to 0.01 (1% of old objects).

On one of [Shopify's highest traffic Ruby apps, Storefront Renderer](https://shopify.engineering/how-shopify-reduced-storefront-response-times-re...), we saw significant improvements after deploying this patch in production. In the graphs below, we have the `tuned` group which uses `RUBY_GC_HEAP_REMEMBERED_WB_UNPROTECTED_OBJECTS_LIMIT_RATIO=0.01` (the default value), and an `untuned` group, which turns this feature off with `RUBY_GC_HEAP_REMEMBERED_WB_UNPROTECTED_OBJECTS_LIMIT_RATIO=0`. We see that the tuned group spends significantly less time in GC, on average 0.67x of the time compared to the untuned group and 0.49x for p99.

We see this improvement in GC time translate to improvements in response times. The average response time is now 0.96x of the time compared to the untuned group and 0.86x for p99.

---Files--------------------------------
Screenshot 2023-04-03 at 11.39.06 AM.png (554 KB)

-- 
https://bugs.ruby-lang.org/