OLD | NEW |
1 RE2 regular expression syntax reference | 1 RE2 regular expression syntax reference |
2 ------------------------------------- | 2 ------------------------------------- |
3 | 3 |
4 Single characters: | 4 Single characters: |
5 . any character, possibly including newline (s=true) | 5 . any character, possibly including newline (s=true) |
6 [xyz] character class | 6 [xyz] character class |
7 [^xyz] negated character class | 7 [^xyz] negated character class |
8 \d Perl character class | 8 \d Perl character class |
9 \D negated Perl character class | 9 \D negated Perl character class |
10 [:alpha:]» ASCII character class | 10 [[:alpha:]]» ASCII character class |
11 [:^alpha:]» negated ASCII character class | 11 [[:^alpha:]]» negated ASCII character class |
12 \pN Unicode character class (one-letter name) | 12 \pN Unicode character class (one-letter name) |
13 \p{Greek} Unicode character class | 13 \p{Greek} Unicode character class |
14 \PN negated Unicode character class (one-letter name) | 14 \PN negated Unicode character class (one-letter name) |
15 \P{Greek} negated Unicode character class | 15 \P{Greek} negated Unicode character class |
16 | 16 |
17 Composites: | 17 Composites: |
18 xy «x» followed by «y» | 18 xy «x» followed by «y» |
19 x|y «x» or «y» (prefer «x») | 19 x|y «x» or «y» (prefer «x») |
20 | 20 |
21 Repetitions: | 21 Repetitions: |
22 x* zero or more «x», prefer more | 22 x* zero or more «x», prefer more |
23 x+ one or more «x», prefer more | 23 x+ one or more «x», prefer more |
24 x? zero or one «x», prefer one | 24 x? zero or one «x», prefer one |
25 x{n,m} «n» or «n»+1 or ... or «m» «x», prefer more | 25 x{n,m} «n» or «n»+1 or ... or «m» «x», prefer more |
26 x{n,} «n» or more «x», prefer more | 26 x{n,} «n» or more «x», prefer more |
27 x{n} exactly «n» «x» | 27 x{n} exactly «n» «x» |
28 x*? zero or more «x», prefer fewer | 28 x*? zero or more «x», prefer fewer |
29 x+? one or more «x», prefer fewer | 29 x+? one or more «x», prefer fewer |
30 x?? zero or one «x», prefer zero | 30 x?? zero or one «x», prefer zero |
31 x{n,m}? «n» or «n»+1 or ... or «m» «x», prefer fewer | 31 x{n,m}? «n» or «n»+1 or ... or «m» «x», prefer fewer |
32 x{n,}? «n» or more «x», prefer fewer | 32 x{n,}? «n» or more «x», prefer fewer |
33 x{n}? exactly «n» «x» | 33 x{n}? exactly «n» «x» |
34 x{} (== x*) NOT SUPPORTED vim | 34 x{} (== x*) NOT SUPPORTED vim |
35 x{-} (== x*?) NOT SUPPORTED vim | 35 x{-} (== x*?) NOT SUPPORTED vim |
36 x{-n} (== x{n}?) NOT SUPPORTED vim | 36 x{-n} (== x{n}?) NOT SUPPORTED vim |
37 x= (== x?) NOT SUPPORTED vim | 37 x= (== x?) NOT SUPPORTED vim |
38 | 38 |
| 39 Implementation restriction: The counting forms «x{n,m}», «x{n,}», and «x{n}» |
| 40 reject forms that create a minimum or maximum repetition count above 1000. |
| 41 Unlimited repetitions are not subject to this restriction. |
| 42 |
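The implementation restriction added above can be checked with a short program. This is a minimal sketch that assumes Go's standard `regexp` package, which documents the same 1000-repetition limit, as a stand-in for RE2 itself; the helper name `accepted` is made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// accepted reports whether Go's regexp package compiles the pattern.
// (Hypothetical helper; Go's regexp is used here as a stand-in for RE2.)
func accepted(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// Counted forms are limited to 1000; unlimited forms (* and +) are not.
	patterns := []string{`a{1000}`, `a{1001}`, `a{2,1001}`, `a{1001,}`, `a*`, `a+`}
	for _, p := range patterns {
		fmt.Printf("%-10s accepted=%v\n", p, accepted(p))
	}
}
```

Only the counted forms whose minimum or maximum exceeds 1000 should be rejected; `a*` and `a+` are not counted forms and are unaffected.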
39 Possessive repetitions: | 43 Possessive repetitions: |
40 x*+ zero or more «x», possessive NOT SUPPORTED | 44 x*+ zero or more «x», possessive NOT SUPPORTED |
41 x++ one or more «x», possessive NOT SUPPORTED | 45 x++ one or more «x», possessive NOT SUPPORTED |
42 x?+ zero or one «x», possessive NOT SUPPORTED | 46 x?+ zero or one «x», possessive NOT SUPPORTED |
43 x{n,m}+ «n» or ... or «m» «x», possessive NOT SUPPORTED | 47 x{n,m}+ «n» or ... or «m» «x», possessive NOT SUPPORTED |
44 x{n,}+ «n» or more «x», possessive NOT SUPPORTED | 48 x{n,}+ «n» or more «x», possessive NOT SUPPORTED |
45 x{n}+ exactly «n» «x», possessive NOT SUPPORTED | 49 x{n}+ exactly «n» «x», possessive NOT SUPPORTED |
46 | 50 |
47 Grouping: | 51 Grouping: |
48 (re)» numbered capturing group | 52 (re)» numbered capturing group (submatch) |
49 (?P<name>re)» named & numbered capturing group | 53 (?P<name>re)» named & numbered capturing group (submatch) |
50 (?<name>re)» named & numbered capturing group NOT SUPPORTED | 54 (?<name>re)» named & numbered capturing group (submatch) NOT SUPPORTED |
51 (?'name're)» named & numbered capturing group NOT SUPPORTED | 55 (?'name're)» named & numbered capturing group (submatch) NOT SUPPORTED |
52 (?:re) non-capturing group | 56 (?:re) non-capturing group |
53 (?flags) set flags within current group; non-capturing | 57 (?flags) set flags within current group; non-capturing |
54 (?flags:re) set flags during re; non-capturing | 58 (?flags:re) set flags during re; non-capturing |
55 (?#text) comment NOT SUPPORTED | 59 (?#text) comment NOT SUPPORTED |
56 (?|x|y|z) branch numbering reset NOT SUPPORTED | 60 (?|x|y|z) branch numbering reset NOT SUPPORTED |
57 (?>re) possessive match of «re» NOT SUPPORTED | 61 (?>re) possessive match of «re» NOT SUPPORTED |
58 re@> possessive match of «re» NOT SUPPORTED vim | 62 re@> possessive match of «re» NOT SUPPORTED vim |
59 %(re) non-capturing group NOT SUPPORTED vim | 63 %(re) non-capturing group NOT SUPPORTED vim |
60 | 64 |
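To illustrate the grouping rows above, the following sketch reads a named submatch written in the `(?P<name>re)` form and probes a few of the forms marked NOT SUPPORTED. It assumes Go's standard `regexp` package as a stand-in for RE2; `dateRE`, `field`, and `accepted` are names invented for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// dateRE uses the (?P<name>re) named-group form listed as supported.
var dateRE = regexp.MustCompile(`(?P<year>\d{4})-(?P<month>\d{2})`)

// field returns the submatch captured by the named group,
// or "" when there is no match or no such group.
func field(s, name string) string {
	m := dateRE.FindStringSubmatch(s)
	idx := dateRE.SubexpIndex(name)
	if m == nil || idx < 0 {
		return ""
	}
	return m[idx]
}

// accepted reports whether the pattern compiles.
func accepted(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	s := "released 2012-09"
	fmt.Println(field(s, "year"), field(s, "month"))
	// Non-capturing groups are fine; comments, atomic groups, and
	// possessive repetitions are expected to be rejected.
	for _, p := range []string{`(?:ab)+`, `(?#note)ab`, `(?>ab)`, `a*+`} {
		fmt.Printf("%-12s accepted=%v\n", p, accepted(p))
	}
}
```

Named groups are also numbered, so `year` is submatch 1 and `month` is submatch 2 here.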
61 Flags: | 65 Flags: |
62 i case-insensitive (default false) | 66 i case-insensitive (default false) |
63 m multi-line mode: «^» and «$» match begin/end line in addition to begin/end text (default false) | 67 m multi-line mode: «^» and «$» match begin/end line in addition to begin/end text (default false) |
64 s let «.» match «\n» (default false) | 68 s let «.» match «\n» (default false) |
65 U ungreedy: swap meaning of «x*» and «x*?», «x+» and «x+?», etc (default false) | 69 U ungreedy: swap meaning of «x*» and «x*?», «x+» and «x+?», etc (default false) |
66 Flag syntax is «xyz» (set) or «-xyz» (clear) or «xy-z» (set «xy», clear «z»). | 70 Flag syntax is «xyz» (set) or «-xyz» (clear) or «xy-z» (set «xy», clear «z»). |
67 | 71 |
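The four flags and the set/clear syntax described above can be exercised as follows. This is a sketch that assumes Go's standard `regexp` package, which accepts the same flag letters, rather than RE2's own API; `match` and `first` are illustrative helper names.

```go
package main

import (
	"fmt"
	"regexp"
)

// match reports whether pattern matches anywhere in s.
func match(pattern, s string) bool {
	return regexp.MustCompile(pattern).MatchString(s)
}

// first returns the leftmost match of pattern in s.
func first(pattern, s string) string {
	return regexp.MustCompile(pattern).FindString(s)
}

func main() {
	// i: case-insensitive matching.
	fmt.Println(match(`hello`, "HeLLo"), match(`(?i)hello`, "HeLLo"))
	// s: let . match \n.
	fmt.Println(match(`a.b`, "a\nb"), match(`(?s)a.b`, "a\nb"))
	// m: ^ and $ also match at line boundaries.
	fmt.Println(match(`^b$`, "a\nb\nc"), match(`(?m)^b$`, "a\nb\nc"))
	// U: swap greedy and non-greedy meanings.
	fmt.Println(first(`a+`, "aaa"), first(`(?U)a+`, "aaa"), first(`(?U)a+?`, "aaa"))
	// (?-i) clears a flag for the rest of the current group.
	fmt.Println(match(`(?i)a(?-i)b`, "Ab"), match(`(?i)a(?-i)b`, "AB"))
}
```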
68 Empty strings: | 72 Empty strings: |
69 ^ at beginning of text or line («m»=true) | 73 ^ at beginning of text or line («m»=true) |
70 $ at end of text (like «\z» not «\Z») or line («m»=true) | 74 $ at end of text (like «\z» not «\Z») or line («m»=true) |
71 \A at beginning of text | 75 \A at beginning of text |
72 \b» at word boundary («\w» on one side and «\W», «\A», or «\z» on the other) | 76 \b» at ASCII word boundary («\w» on one side and «\W», «\A», or «\z» on the other) |
73 \B» not a word boundary | 77 \B» not at ASCII word boundary |
74 \G at beginning of subtext being searched NOT SUPPORTED pcre | 78 \G at beginning of subtext being searched NOT SUPPORTED pcre |
75 \G at end of last match NOT SUPPORTED perl | 79 \G at end of last match NOT SUPPORTED perl |
76 \Z at end of text, or before newline at end of text NOT SUPPORTED | 80 \Z at end of text, or before newline at end of text NOT SUPPORTED |
77 \z at end of text | 81 \z at end of text |
78 (?=re) before text matching «re» NOT SUPPORTED | 82 (?=re) before text matching «re» NOT SUPPORTED |
79 (?!re) before text not matching «re» NOT SUPPORTED | 83 (?!re) before text not matching «re» NOT SUPPORTED |
80 (?<=re) after text matching «re» NOT SUPPORTED | 84 (?<=re) after text matching «re» NOT SUPPORTED |
81 (?<!re) after text not matching «re» NOT SUPPORTED | 85 (?<!re) after text not matching «re» NOT SUPPORTED |
82 re& before text matching «re» NOT SUPPORTED vim | 86 re& before text matching «re» NOT SUPPORTED vim |
83 re@= before text matching «re» NOT SUPPORTED vim | 87 re@= before text matching «re» NOT SUPPORTED vim |
(...skipping 64 matching lines...)
148 Named character classes as character class elements: | 152 Named character classes as character class elements: |
149 [\d] digits (== \d) | 153 [\d] digits (== \d) |
150 [^\d] not digits (== \D) | 154 [^\d] not digits (== \D) |
151 [\D] not digits (== \D) | 155 [\D] not digits (== \D) |
152 [^\D] not not digits (== \d) | 156 [^\D] not not digits (== \d) |
153 [[:name:]] named ASCII class inside character class (== [:name:]) | 157 [[:name:]] named ASCII class inside character class (== [:name:]) |
154 [^[:name:]] named ASCII class inside negated character class (== [:^name:]) | 158 [^[:name:]] named ASCII class inside negated character class (== [:^name:]) |
155 [\p{Name}] named Unicode property inside character class (== \p{Name}) | 159 [\p{Name}] named Unicode property inside character class (== \p{Name}) |
156 [^\p{Name}] named Unicode property inside negated character class (== \P{Name}) | 160 [^\p{Name}] named Unicode property inside negated character class (== \P{Name}) |
157 | 161 |
158 Perl character classes: | 162 Perl character classes (all ASCII-only): |
159 \d digits (== [0-9]) | 163 \d digits (== [0-9]) |
160 \D not digits (== [^0-9]) | 164 \D not digits (== [^0-9]) |
161 \s whitespace (== [\t\n\f\r ]) | 165 \s whitespace (== [\t\n\f\r ]) |
162 \S not whitespace (== [^\t\n\f\r ]) | 166 \S not whitespace (== [^\t\n\f\r ]) |
163 \w word characters (== [0-9A-Za-z_]) | 167 \w word characters (== [0-9A-Za-z_]) |
164 \W not word characters (== [^0-9A-Za-z_]) | 168 \W not word characters (== [^0-9A-Za-z_]) |
165 | 169 |
166 \h horizontal space NOT SUPPORTED | 170 \h horizontal space NOT SUPPORTED |
167 \H not horizontal space NOT SUPPORTED | 171 \H not horizontal space NOT SUPPORTED |
168 \v vertical space NOT SUPPORTED | 172 \v vertical space NOT SUPPORTED |
169 \V not vertical space NOT SUPPORTED | 173 \V not vertical space NOT SUPPORTED |
170 | 174 |
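The "(all ASCII-only)" note added to the Perl character classes heading has visible consequences for non-ASCII input. The sketch below assumes Go's standard `regexp` package, whose Perl classes are documented with the same ASCII definitions, as a stand-in for RE2; `match` and `first` are illustrative helper names.

```go
package main

import (
	"fmt"
	"regexp"
)

// match reports whether pattern matches anywhere in s.
func match(pattern, s string) bool {
	return regexp.MustCompile(pattern).MatchString(s)
}

// first returns the leftmost match of pattern in s.
func first(pattern, s string) string {
	return regexp.MustCompile(pattern).FindString(s)
}

func main() {
	arabicThree := "\u0663" // ARABIC-INDIC DIGIT THREE, general category Nd

	// \d is [0-9] only; use a Unicode class for other decimal digits.
	fmt.Println(match(`\d`, arabicThree), match(`\p{Nd}`, arabicThree))
	// \w is [0-9A-Za-z_], so it stops before the accented letter.
	fmt.Println(first(`\w+`, "café"), first(`\p{L}+`, "café"))
	// \s is [\t\n\f\r ] and omits \v, which [[:space:]] includes.
	fmt.Println(match(`\s`, "\v"), match(`[[:space:]]`, "\v"))
}
```

When input may contain non-ASCII text, the `\p{...}` classes are the Unicode-aware alternative to `\d` and `\w`.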
171 ASCII character classes: | 175 ASCII character classes: |
172 [:alnum:]» alphanumeric (== [0-9A-Za-z]) | 176 [[:alnum:]]» alphanumeric (== [0-9A-Za-z]) |
173 [:alpha:]» alphabetic (== [A-Za-z]) | 177 [[:alpha:]]» alphabetic (== [A-Za-z]) |
174 [:ascii:]» ASCII (== [\x00-\x7F]) | 178 [[:ascii:]]» ASCII (== [\x00-\x7F]) |
175 [:blank:]» blank (== [\t ]) | 179 [[:blank:]]» blank (== [\t ]) |
176 [:cntrl:]» control (== [\x00-\x1F\x7F]) | 180 [[:cntrl:]]» control (== [\x00-\x1F\x7F]) |
177 [:digit:]» digits (== [0-9]) | 181 [[:digit:]]» digits (== [0-9]) |
178 [:graph:]» graphical (== [!-~] == [A-Za-z0-9!"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~]) | 182 [[:graph:]]» graphical (== [!-~] == [A-Za-z0-9!"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~]) |
179 [:lower:]» lower case (== [a-z]) | 183 [[:lower:]]» lower case (== [a-z]) |
180 [:print:]» printable (== [ -~] == [ [:graph:]]) | 184 [[:print:]]» printable (== [ -~] == [ [:graph:]]) |
181 [:punct:]» punctuation (== [!-/:-@[-`{-~]) | 185 [[:punct:]]» punctuation (== [!-/:-@[-`{-~]) |
182 [:space:]» whitespace (== [\t\n\v\f\r ]) | 186 [[:space:]]» whitespace (== [\t\n\v\f\r ]) |
183 [:upper:]» upper case (== [A-Z]) | 187 [[:upper:]]» upper case (== [A-Z]) |
184 [:word:]» word characters (== [0-9A-Za-z_]) | 188 [[:word:]]» word characters (== [0-9A-Za-z_]) |
185 [:xdigit:]» hex digit (== [0-9A-Fa-f]) | 189 [[:xdigit:]]» hex digit (== [0-9A-Fa-f]) |
186 | 190 |
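The change from «[:alpha:]» to «[[:alpha:]]» in the rows above matters in practice: the named class is only recognized inside a bracket expression. The sketch below shows the difference, assuming Go's standard `regexp` package as a stand-in for RE2; the behaviour shown for the bare form is what Go's parser does and is an assumption for other implementations. `match` and `first` are illustrative helper names.

```go
package main

import (
	"fmt"
	"regexp"
)

// match reports whether pattern matches anywhere in s.
func match(pattern, s string) bool {
	return regexp.MustCompile(pattern).MatchString(s)
}

// first returns the leftmost match of pattern in s.
func first(pattern, s string) string {
	return regexp.MustCompile(pattern).FindString(s)
}

func main() {
	// Correct forms: the named class sits inside a bracket expression.
	fmt.Println(first(`[[:alpha:]]+`, "abc123"))
	fmt.Println(first(`[[:^alpha:]]+`, "abc123"))
	fmt.Println(first(`[^[:alpha:]]+`, "abc123"))
	fmt.Println(first(`[[:alpha:][:digit:]]+`, "abc123!"))

	// Bare form: parsed as an ordinary class of the characters
	// ':', 'a', 'l', 'p', 'h', not as "alphabetic".
	fmt.Println(match(`^[:alpha:]$`, "z"), match(`^[:alpha:]$`, ":"))
}
```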
187 Unicode character class names--general category: | 191 Unicode character class names--general category: |
188 C other | 192 C other |
189 Cc control | 193 Cc control |
190 Cf format | 194 Cf format |
191 Cn unassigned code points NOT SUPPORTED | 195 Cn unassigned code points NOT SUPPORTED |
192 Co private use | 196 Co private use |
193 Cs surrogate | 197 Cs surrogate |
194 L letter | 198 L letter |
195 LC cased letter NOT SUPPORTED | 199 LC cased letter NOT SUPPORTED |
(...skipping 26 matching lines...)
222 So other symbol | 226 So other symbol |
223 Z separator | 227 Z separator |
224 Zl line separator | 228 Zl line separator |
225 Zp paragraph separator | 229 Zp paragraph separator |
226 Zs space separator | 230 Zs space separator |
227 | 231 |
228 Unicode character class names--scripts: | 232 Unicode character class names--scripts: |
229 Arabic Arabic | 233 Arabic Arabic |
230 Armenian Armenian | 234 Armenian Armenian |
231 Balinese Balinese | 235 Balinese Balinese |
| 236 Bamum Bamum |
| 237 Batak Batak |
232 Bengali Bengali | 238 Bengali Bengali |
233 Bopomofo Bopomofo | 239 Bopomofo Bopomofo |
| 240 Brahmi Brahmi |
234 Braille Braille | 241 Braille Braille |
235 Buginese Buginese | 242 Buginese Buginese |
236 Buhid Buhid | 243 Buhid Buhid |
237 Canadian_Aboriginal Canadian Aboriginal | 244 Canadian_Aboriginal Canadian Aboriginal |
238 Carian Carian | 245 Carian Carian |
| 246 Chakma Chakma |
239 Cham Cham | 247 Cham Cham |
240 Cherokee Cherokee | 248 Cherokee Cherokee |
241 Common characters not specific to one script | 249 Common characters not specific to one script |
242 Coptic Coptic | 250 Coptic Coptic |
243 Cuneiform Cuneiform | 251 Cuneiform Cuneiform |
244 Cypriot Cypriot | 252 Cypriot Cypriot |
245 Cyrillic Cyrillic | 253 Cyrillic Cyrillic |
246 Deseret Deseret | 254 Deseret Deseret |
247 Devanagari Devanagari | 255 Devanagari Devanagari |
| 256 Egyptian_Hieroglyphs Egyptian Hieroglyphs |
248 Ethiopic Ethiopic | 257 Ethiopic Ethiopic |
249 Georgian Georgian | 258 Georgian Georgian |
250 Glagolitic Glagolitic | 259 Glagolitic Glagolitic |
251 Gothic Gothic | 260 Gothic Gothic |
252 Greek Greek | 261 Greek Greek |
253 Gujarati Gujarati | 262 Gujarati Gujarati |
254 Gurmukhi Gurmukhi | 263 Gurmukhi Gurmukhi |
255 Han Han | 264 Han Han |
256 Hangul Hangul | 265 Hangul Hangul |
257 Hanunoo Hanunoo | 266 Hanunoo Hanunoo |
258 Hebrew Hebrew | 267 Hebrew Hebrew |
259 Hiragana Hiragana | 268 Hiragana Hiragana |
| 269 Imperial_Aramaic Imperial Aramaic |
260 Inherited inherit script from previous character | 270 Inherited inherit script from previous character |
| 271 Inscriptional_Pahlavi Inscriptional Pahlavi |
| 272 Inscriptional_Parthian Inscriptional Parthian |
| 273 Javanese Javanese |
| 274 Kaithi Kaithi |
261 Kannada Kannada | 275 Kannada Kannada |
262 Katakana Katakana | 276 Katakana Katakana |
263 Kayah_Li Kayah Li | 277 Kayah_Li Kayah Li |
264 Kharoshthi Kharoshthi | 278 Kharoshthi Kharoshthi |
265 Khmer Khmer | 279 Khmer Khmer |
266 Lao Lao | 280 Lao Lao |
267 Latin Latin | 281 Latin Latin |
268 Lepcha Lepcha | 282 Lepcha Lepcha |
269 Limbu Limbu | 283 Limbu Limbu |
270 Linear_B Linear B | 284 Linear_B Linear B |
271 Lycian Lycian | 285 Lycian Lycian |
272 Lydian Lydian | 286 Lydian Lydian |
273 Malayalam Malayalam | 287 Malayalam Malayalam |
| 288 Mandaic Mandaic |
| 289 Meetei_Mayek Meetei Mayek |
| 290 Meroitic_Cursive Meroitic Cursive |
| 291 Meroitic_Hieroglyphs Meroitic Hieroglyphs |
| 292 Miao Miao |
274 Mongolian Mongolian | 293 Mongolian Mongolian |
275 Myanmar Myanmar | 294 Myanmar Myanmar |
276 New_Tai_Lue New Tai Lue (aka Simplified Tai Lue) | 295 New_Tai_Lue New Tai Lue (aka Simplified Tai Lue) |
277 Nko Nko | 296 Nko Nko |
278 Ogham Ogham | 297 Ogham Ogham |
279 Ol_Chiki Ol Chiki | 298 Ol_Chiki Ol Chiki |
280 Old_Italic Old Italic | 299 Old_Italic Old Italic |
281 Old_Persian Old Persian | 300 Old_Persian Old Persian |
| 301 Old_South_Arabian Old South Arabian |
| 302 Old_Turkic Old Turkic |
282 Oriya Oriya | 303 Oriya Oriya |
283 Osmanya Osmanya | 304 Osmanya Osmanya |
284 Phags_Pa 'Phags Pa | 305 Phags_Pa 'Phags Pa |
285 Phoenician Phoenician | 306 Phoenician Phoenician |
286 Rejang Rejang | 307 Rejang Rejang |
287 Runic Runic | 308 Runic Runic |
288 Saurashtra Saurashtra | 309 Saurashtra Saurashtra |
| 310 Sharada Sharada |
289 Shavian Shavian | 311 Shavian Shavian |
290 Sinhala Sinhala | 312 Sinhala Sinhala |
| 313 Sora_Sompeng Sora Sompeng |
291 Sundanese Sundanese | 314 Sundanese Sundanese |
292 Syloti_Nagri Syloti Nagri | 315 Syloti_Nagri Syloti Nagri |
293 Syriac Syriac | 316 Syriac Syriac |
294 Tagalog Tagalog | 317 Tagalog Tagalog |
295 Tagbanwa Tagbanwa | 318 Tagbanwa Tagbanwa |
296 Tai_Le Tai Le | 319 Tai_Le Tai Le |
| 320 Tai_Tham Tai Tham |
| 321 Tai_Viet Tai Viet |
| 322 Takri Takri |
297 Tamil Tamil | 323 Tamil Tamil |
298 Telugu Telugu | 324 Telugu Telugu |
299 Thaana Thaana | 325 Thaana Thaana |
300 Thai Thai | 326 Thai Thai |
301 Tibetan Tibetan | 327 Tibetan Tibetan |
302 Tifinagh Tifinagh | 328 Tifinagh Tifinagh |
303 Ugaritic Ugaritic | 329 Ugaritic Ugaritic |
304 Vai Vai | 330 Vai Vai |
305 Yi Yi | 331 Yi Yi |
306 | 332 |
(...skipping 57 matching lines...)
364 (*SKIP) NOT SUPPORTED | 390 (*SKIP) NOT SUPPORTED |
365 (*THEN) NOT SUPPORTED | 391 (*THEN) NOT SUPPORTED |
366 (*ANY) set newline convention NOT SUPPORTED | 392 (*ANY) set newline convention NOT SUPPORTED |
367 (*ANYCRLF) NOT SUPPORTED | 393 (*ANYCRLF) NOT SUPPORTED |
368 (*CR) NOT SUPPORTED | 394 (*CR) NOT SUPPORTED |
369 (*CRLF) NOT SUPPORTED | 395 (*CRLF) NOT SUPPORTED |
370 (*LF) NOT SUPPORTED | 396 (*LF) NOT SUPPORTED |
371 (*BSR_ANYCRLF) set \R convention NOT SUPPORTED pcre | 397 (*BSR_ANYCRLF) set \R convention NOT SUPPORTED pcre |
372 (*BSR_UNICODE) NOT SUPPORTED pcre | 398 (*BSR_UNICODE) NOT SUPPORTED pcre |
373 | 399 |