author Jeremy Evans <code@jeremyevans.net> 2023-11-30 09:53:01 -0800
committer Jeremy Evans <code@jeremyevans.net> 2023-11-30 10:40:40 -0800
commit 060f14bf62ad3f426a6666901c45b82d4334fa26 (patch)
tree d663f15cf44ebd1e8626934265b1253014e5eaab
parent f75fef66221e55ce9e9e302cfd8ee22062527c6c (diff)
Update documentation for [[:word:]] and \p{Word} in regexps
Onigmo uses Decimal_Number and not Number for these. Fixes [Bug #19417]
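
The distinction described above can be checked with a short script. This is a minimal sketch, not part of the commit: the sample characters and the `samples`/`results` names are chosen here for illustration, and it assumes a Ruby (2.6 or later) whose Onigmo tables behave as the commit message describes.

```ruby
# Illustrative sample characters (an assumption of this sketch, not taken
# from the patch), keyed to their Unicode general category.
samples = {
  "a"      => "Ll, Lowercase Letter (has the Alpha property)",
  "5"      => "Nd, Decimal Number",
  "_"      => "Pc, Connector Punctuation",
  "\u00BD" => "No, Other Number (VULGAR FRACTION ONE HALF)",
}

# Record whether each character matches \p{Word} and [[:word:]].
results = samples.keys.to_h do |ch|
  [ch, { prop: ch.match?(/\p{Word}/), bracket: ch.match?(/[[:word:]]/) }]
end

results.each do |ch, r|
  puts format("U+%04X  \\p{Word}: %-5s  [[:word:]]: %-5s  %s",
              ch.ord, r[:prop], r[:bracket], samples[ch])
end
```

If the documentation change is accurate, the first three characters should match both forms, while U+00BD, which is a Number but not a Decimal Number, should match neither.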
-rw-r--r-- doc/_regexp.rdoc 30
1 files changed, 19 insertions, 11 deletions
diff --git a/doc/_regexp.rdoc b/doc/_regexp.rdoc
index ffba14e78f..7b71eee984 100644
--- a/doc/_regexp.rdoc
+++ b/doc/_regexp.rdoc
@@ -838,13 +838,17 @@ These are also commonly used:
- <tt>/\p{Emoji}/</tt>: Unicode emoji.
- <tt>/\p{Graph}/</tt>: Non-blank character
(excludes spaces, control characters, and similar).
-- <tt>/\p{Word}/</tt>: A member of one of the following Unicode character
- categories (see below):
+- <tt>/\p{Word}/</tt>: A member in one of these Unicode character
+ categories (see below) or having one of these Unicode properties:
- - +Mark+ (+M+).
- - +Letter+ (+L+).
- - +Number+ (+N+)
- - <tt>Connector Punctuation</tt> (+Pc+).
+ - Unicode categories:
+ - +Mark+ (+M+).
+ - <tt>Decimal Number</tt> (+Nd+)
+ - <tt>Connector Punctuation</tt> (+Pc+).
+
+ - Unicode properties:
+ - +Alpha+
+ - <tt>Join_Control</tt>
- <tt>/\p{ASCII}/</tt>: A character in the ASCII character set.
- <tt>/\p{Any}/</tt>: Any Unicode character (including unassigned characters).
@@ -993,12 +997,16 @@ Ruby also supports these (non-POSIX) bracket expressions:
- <tt>/[[:ascii:]]/</tt>: Matches a character in the ASCII character set.
- <tt>/[[:word:]]/</tt>: Matches a character in one of these Unicode character
- categories (see below):
+ categories or having one of these Unicode properties:
+
+ - Unicode categories:
+ - +Mark+ (+M+).
+ - <tt>Decimal Number</tt> (+Nd+)
+ - <tt>Connector Punctuation</tt> (+Pc+).
- - +Mark+ (+M+).
- - +Letter+ (+L+).
- - +Number+ (+N+)
- - <tt>Connector Punctuation</tt> (+Pc+).
+ - Unicode properties:
+ - +Alpha+
+ - <tt>Join_Control</tt>
=== Comments