author | Hiroya Fujinami <make.just.on@gmail.com> | 2023-11-10 01:24:15 +0900 |
---|---|---|
committer | GitHub <noreply@github.com> | 2023-11-10 01:24:15 +0900 |
commit | c49adfab5d269942c44ebfd83e8c107299fc8015 (patch) | |
tree | f7570841a89c2222f5eab19a1a2dff0dfdec317c /doc | |
parent | ad3db6711c4aa48c82f4091342aab7394ee45736 (diff) | |
Add "Optimization" section to regexp.rdoc (#8849)
* Add "Optimization" section to regexp.rdoc
* Apply the suggestions by @BurdetteLamar
---------
Co-authored-by: Burdette Lamar <BurdetteLamar@Yahoo.com>
Diffstat (limited to 'doc')
-rw-r--r-- | doc/regexp.rdoc | 27 |
1 files changed, 27 insertions, 0 deletions
diff --git a/doc/regexp.rdoc b/doc/regexp.rdoc
index 6b4b435746..309e109afd 100644
--- a/doc/regexp.rdoc
+++ b/doc/regexp.rdoc
@@ -1228,6 +1228,33 @@ when regexp.timeout is non-+nil+, that value controls timing out:
 | nil | Float | Times out in Float seconds. |
 | Float | Any | Times out in Float seconds. |
 
+== Optimization
+
+For certain values of the pattern and target string,
+matching time can grow polynomially or exponentially in relation to the input size;
+the potential vulnerability arising from this is the {regular expression denial-of-service}[https://en.wikipedia.org/wiki/ReDoS] (ReDoS) attack.
+
+\Regexp matching can apply an optimization to prevent ReDoS attacks.
+When the optimization is applied, matching time increases linearly (not polynomially or exponentially)
+in relation to the input size, and a ReDoS attack is not possible.
+
+This optimization is applied if the pattern meets these criteria:
+
+- No backreferences.
+- No subexpression calls.
+- No nested lookaround anchors or atomic groups.
+- No nested quantifiers with counting (i.e. no nested <tt>{n}</tt>,
+  <tt>{min,}</tt>, <tt>{,max}</tt>, or <tt>{min,max}</tt> style quantifiers)
+
+You can use method Regexp.linear_time? to determine whether a pattern meets these criteria:
+
+  Regexp.linear_time?(/a*/) # => true
+  Regexp.linear_time?('a*') # => true
+  Regexp.linear_time?(/(a*)\1/) # => false
+
+However, an untrusted source may not be safe even if the method returns +true+,
+because the optimization uses memoization (which may invoke large memory consumption).
+
 == References
 
 Read (online PDF books):
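As a rough illustration of the Regexp.linear_time? check that this patch documents, the Ruby sketch below classifies the same three patterns used in the added text. It assumes Ruby 3.2 or later, where Regexp.linear_time? is available. The helper name classify and the symbols :linear and :may_backtrack are hypothetical, introduced only for this example and not part of the patch or of Ruby.

```ruby
# Sketch only: assumes Ruby 3.2+, where Regexp.linear_time? exists.
# `classify` is a hypothetical helper for illustration, not a Ruby API.
def classify(pattern)
  # Regexp.linear_time? accepts either a Regexp or a String pattern.
  Regexp.linear_time?(pattern) ? :linear : :may_backtrack
end

examples = {
  "/a*/"      => /a*/,       # simple quantifier, meets the criteria
  "'a*'"      => 'a*',       # same pattern given as a String
  "/(a*)\\1/" => /(a*)\1/    # backreference, so the optimization is not applied
}

examples.each do |label, pattern|
  puts "#{label}: #{classify(pattern)}"
end
```

As the added documentation warns, a :linear result here does not by itself make an untrusted pattern safe, because the memoization behind the optimization can still consume a large amount of memory.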