1 RE2 regular expression syntax reference | |
2 ------------------------------------- | |
3 | |
4 Single characters: | |
5 . any character, possibly including newline (s=true) | |
6 [xyz] character class | |
7 [^xyz] negated character class | |
8 \d Perl character class | |
9 \D negated Perl character class | |
10 [[:alpha:]] ASCII character class | |
11 [[:^alpha:]] negated ASCII character class | |
12 \pN Unicode character class (one-letter name) | |
13 \p{Greek} Unicode character class | |
14 \PN negated Unicode character class (one-letter name) | |
15 \P{Greek} negated Unicode character class | |
16 | |
17 Composites: | |
18 xy «x» followed by «y» | |
19 x|y «x» or «y» (prefer «x») | |
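The "prefer «x»" rule for alternation can be checked with a short program. This is a minimal sketch, assuming Go's standard `regexp` package (which documents itself as accepting RE2 syntax) as a stand-in for the RE2 library; the helper name `firstMatch` is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch returns the leftmost match of pattern in text ("" if none).
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	// Both alternatives can match at position 0; the first one listed wins.
	fmt.Println(firstMatch(`a|ab`, "ab")) // a
	fmt.Println(firstMatch(`ab|a`, "ab")) // ab
}
```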
20 | |
21 Repetitions: | |
22 x* zero or more «x», prefer more | |
23 x+ one or more «x», prefer more | |
24 x? zero or one «x», prefer one | |
25 x{n,m} «n» or «n»+1 or ... or «m» «x», prefer more | |
26 x{n,} «n» or more «x», prefer more | |
27 x{n} exactly «n» «x» | |
28 x*? zero or more «x», prefer fewer | |
29 x+? one or more «x», prefer fewer | |
30 x?? zero or one «x», prefer zero | |
31 x{n,m}? «n» or «n»+1 or ... or «m» «x», prefer fewer | |
32 x{n,}? «n» or more «x», prefer fewer | |
33 x{n}? exactly «n» «x» | |
34 x{} (== x*) NOT SUPPORTED vim | |
35 x{-} (== x*?) NOT SUPPORTED vim | |
36 x{-n} (== x{n}?) NOT SUPPORTED vim | |
37 x= (== x?) NOT SUPPORTED vim | |
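The "prefer more" and "prefer fewer" wording above can be illustrated by running the same text through the greedy and non-greedy forms. This is a sketch under the assumption that Go's `regexp` package follows the RE2 rules listed here; `find` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// find returns the leftmost match of pattern in text ("" if none).
func find(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	fmt.Println(find(`<.+>`, "<a><b>"))   // <a><b>  (prefer more)
	fmt.Println(find(`<.+?>`, "<a><b>"))  // <a>     (prefer fewer)
	fmt.Println(find(`x{2,3}`, "xxxx"))   // xxx
	fmt.Println(find(`x{2,3}?`, "xxxx"))  // xx
	fmt.Printf("%q\n", find(`x??`, "x")) // ""      (prefer zero)
}
```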
38 | |
39 Implementation restriction: The counting forms «x{n,m}», «x{n,}», and «x{n}» | |
40 are rejected when the minimum or maximum repetition count exceeds 1000. | |
41 Unlimited repetitions are not subject to this restriction. | |
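The limit of 1000 can be probed at compile time. The sketch below assumes Go's `regexp` package, which applies the same bound; the C++ library is expected to behave as the text describes, but that is not verified here. `compiles` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether pattern is accepted by the parser.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	fmt.Println(compiles(`a{1000}`))   // true:  count at the limit
	fmt.Println(compiles(`a{1001}`))   // false: count above the limit
	fmt.Println(compiles(`a{2,1001}`)) // false: maximum above the limit
	fmt.Println(compiles(`a*`))        // true:  unlimited repetition
}
```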
42 | |
43 Possessive repetitions: | |
44 x*+ zero or more «x», possessive NOT SUPPORTED | |
45 x++ one or more «x», possessive NOT SUPPORTED | |
46 x?+ zero or one «x», possessive NOT SUPPORTED | |
47 x{n,m}+ «n» or ... or «m» «x», possessive NOT SUPPORTED | |
48 x{n,}+ «n» or more «x», possessive NOT SUPPORTED | |
49 x{n}+ exactly «n» «x», possessive NOT SUPPORTED | |
50 | |
51 Grouping: | |
52 (re) numbered capturing group (submatch) | |
53 (?P<name>re) named & numbered capturing group (submatch) | |
54 (?<name>re) named & numbered capturing group (submatch) NOT SUPPORTED | |
55 (?'name're) named & numbered capturing group (submatch) NOT SUPPORTED | |
56 (?:re) non-capturing group | |
57 (?flags) set flags within current group; non-capturing | |
58 (?flags:re) set flags during re; non-capturing | |
59 (?#text) comment NOT SUPPORTED | |
60 (?|x|y|z) branch numbering reset NOT SUPPORTED | |
61 (?>re) possessive match of «re» NOT SUPPORTED | |
62 re@> possessive match of «re» NOT SUPPORTED vim | |
63 %(re) non-capturing group NOT SUPPORTED vim | |
64 | |
65 Flags: | |
66 i case-insensitive (default false) | |
67 m multi-line mode: «^» and «$» match begin/end line in addition to begin/end text (default false) | |
68 s let «.» match «\n» (default false) | |
69 U ungreedy: swap meaning of «x*» and «x*?», «x+» and «x+?», etc (default false) | |
70 Flag syntax is «xyz» (set) or «-xyz» (clear) or «xy-z» (set «xy», clear «z»). | |
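Each flag above can be exercised with the `(?flags)` and `(?flags:re)` forms. This sketch assumes Go's `regexp` package; `matches` and `find` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

// find returns the leftmost match of pattern in text ("" if none).
func find(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	fmt.Println(matches(`(?i)hello`, "HeLLo"))  // true:  i ignores case
	fmt.Println(matches(`^b$`, "a\nb\nc"))      // false: ^ and $ are text-only
	fmt.Println(matches(`(?m)^b$`, "a\nb\nc"))  // true:  m adds line anchors
	fmt.Println(matches(`a.b`, "a\nb"))         // false: . skips \n by default
	fmt.Println(matches(`(?s)a.b`, "a\nb"))     // true:  s lets . match \n
	fmt.Println(find(`(?U)x+`, "xxx"))          // x:     U makes x+ prefer fewer
	fmt.Println(matches(`(?i:a)b`, "AB"))       // false: i applies to «a» only
	fmt.Println(matches(`(?is-m)a.b`, "A\nB"))  // true:  set i and s, clear m
}
```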
71 | |
72 Empty strings: | |
73 ^ at beginning of text or line («m»=true) | |
74 $ at end of text (like «\z» not «\Z») or line («m»=true) | |
75 \A at beginning of text | |
76 \b at ASCII word boundary («\w» on one side and «\W», «\A», or «\z» on the other) | |
77 \B not at ASCII word boundary | |
78 \G at beginning of subtext being searched NOT SUPPORTED pcre | |
79 \G at end of last match NOT SUPPORTED perl | |
80 \Z at end of text, or before newline at end of text NOT SUPPORTED | |
81 \z at end of text | |
82 (?=re) before text matching «re» NOT SUPPORTED | |
83 (?!re) before text not matching «re» NOT SUPPORTED | |
84 (?<=re) after text matching «re» NOT SUPPORTED | |
85 (?<!re) after text not matching «re» NOT SUPPORTED | |
86 re& before text matching «re» NOT SUPPORTED vim | |
87 re@= before text matching «re» NOT SUPPORTED vim | |
88 re@! before text not matching «re» NOT SUPPORTED vim | |
89 re@<= after text matching «re» NOT SUPPORTED vim | |
90 re@<! after text not matching «re» NOT SUPPORTED vim | |
91 \zs sets start of match (= \K) NOT SUPPORTED vim | |
92 \ze sets end of match NOT SUPPORTED vim | |
93 \%^ beginning of file NOT SUPPORTED vim | |
94 \%$ end of file NOT SUPPORTED vim | |
95 \%V on screen NOT SUPPORTED vim | |
96 \%# cursor position NOT SUPPORTED vim | |
97 \%'m mark «m» position NOT SUPPORTED vim | |
98 \%23l in line 23 NOT SUPPORTED vim | |
99 \%23c in column 23 NOT SUPPORTED vim | |
100 \%23v in virtual column 23 NOT SUPPORTED vim | |
101 | |
102 Escape sequences: | |
103 \a bell (== \007) | |
104 \f form feed (== \014) | |
105 \t horizontal tab (== \011) | |
106 \n newline (== \012) | |
107 \r carriage return (== \015) | |
108 \v vertical tab character (== \013) | |
109 \* literal «*», for any punctuation character «*» | |
110 \123 octal character code (up to three digits) | |
111 \x7F hex character code (exactly two digits) | |
112 \x{10FFFF} hex character code | |
113 \C match a single byte even in UTF-8 mode | |
114 \Q...\E literal text «...» even if «...» has punctuation | |
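A few of the supported escapes in use. This sketch assumes Go's `regexp` package, which accepts these forms; note that Go documents «\C» as the one RE2 construct it does not accept, so it is left out. `matches` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

func main() {
	// \Q...\E treats punctuation literally.
	fmt.Println(matches(`^\Qa.b*c\E$`, "a.b*c")) // true
	fmt.Println(matches(`^\Qa.b*c\E$`, "aXbbc")) // false
	fmt.Println(matches(`^a.b*c$`, "aXbbc"))     // true: same text unquoted

	// Hex (two digits), octal, and braced hex forms.
	fmt.Println(matches(`^\x41\101\x{1F600}$`, "AA\U0001F600")) // true

	// Backslash before punctuation yields the literal character.
	fmt.Println(matches(`^\*\.$`, "*.")) // true
}
```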
115 | |
116 \1 backreference NOT SUPPORTED | |
117 \b backspace NOT SUPPORTED (use «\010») | |
118 \cK control char ^K NOT SUPPORTED (use «\001» etc) | |
119 \e escape NOT SUPPORTED (use «\033») | |
120 \g1 backreference NOT SUPPORTED | |
121 \g{1} backreference NOT SUPPORTED | |
122 \g{+1} backreference NOT SUPPORTED | |
123 \g{-1} backreference NOT SUPPORTED | |
124 \g{name} named backreference NOT SUPPORTED | |
125 \g<name> subroutine call NOT SUPPORTED | |
126 \g'name' subroutine call NOT SUPPORTED | |
127 \k<name> named backreference NOT SUPPORTED | |
128 \k'name' named backreference NOT SUPPORTED | |
129 \lX lowercase «X» NOT SUPPORTED | |
130 \ux uppercase «x» NOT SUPPORTED | |
131 \L...\E lowercase text «...» NOT SUPPORTED | |
132 \K reset beginning of «$0» NOT SUPPORTED | |
133 \N{name} named Unicode character NOT SUPPORTED | |
134 \R line break NOT SUPPORTED | |
135 \U...\E upper case text «...» NOT SUPPORTED | |
136 \X extended Unicode sequence NOT SUPPORTED | |
137 | |
138 \%d123 decimal character 123 NOT SUPPORTED vim | |
139 \%xFF hex character FF NOT SUPPORTED vim | |
140 \%o123 octal character 123 NOT SUPPORTED vim | |
141 \%u1234 Unicode character 0x1234 NOT SUPPORTED vim | |
142 \%U12345678 Unicode character 0x12345678 NOT SUPPORTED vim | |
143 | |
144 Character class elements: | |
145 x single character | |
146 A-Z character range (inclusive) | |
147 \d Perl character class | |
148 [:foo:] ASCII character class «foo» | |
149 \p{Foo} Unicode character class «Foo» | |
150 \pF Unicode character class «F» (one-letter name) | |
151 | |
152 Named character classes as character class elements: | |
153 [\d] digits (== \d) | |
154 [^\d] not digits (== \D) | |
155 [\D] not digits (== \D) | |
156 [^\D] not not digits (== \d) | |
157 [[:name:]] named ASCII class inside character class (== [:name:]) | |
158 [^[:name:]] named ASCII class inside negated character class (== [:^name:]) | |
159 [\p{Name}] named Unicode property inside character class (== \p{Name}) | |
160 [^\p{Name}] named Unicode property inside negated character class (== \P{Name}) | |
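Class elements can be mixed freely inside one bracket expression. A minimal sketch, assuming Go's `regexp` package; `find` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// find returns the leftmost match of pattern in text ("" if none).
func find(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	fmt.Println(find(`[^\D]+`, "ab12"))              // 12      (== \d+)
	fmt.Println(find(`[a-c\d]+`, "zzb2a9zz"))        // b2a9    (range plus Perl class)
	fmt.Println(find(`[[:alpha:]_]+`, "12 foo_bar")) // foo_bar (ASCII class plus literal)
	fmt.Println(find(`[^[:alpha:]]+`, "abc123def"))  // 123
	fmt.Println(find(`[\p{Greek}\d]+`, "x α1β2 y"))  // α1β2
}
```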
161 | |
162 Perl character classes (all ASCII-only): | |
163 \d digits (== [0-9]) | |
164 \D not digits (== [^0-9]) | |
165 \s whitespace (== [\t\n\f\r ]) | |
166 \S not whitespace (== [^\t\n\f\r ]) | |
167 \w word characters (== [0-9A-Za-z_]) | |
168 \W not word characters (== [^0-9A-Za-z_]) | |
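Because the Perl classes are ASCII-only, non-ASCII digits and letters need the Unicode classes instead. The sketch below assumes Go's `regexp` package; `matches` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

func main() {
	arabicThree := "\u0663" // ARABIC-INDIC DIGIT THREE
	eAcute := "\u00e9"      // LATIN SMALL LETTER E WITH ACUTE

	fmt.Println(matches(`\d`, arabicThree))      // false: \d is [0-9]
	fmt.Println(matches(`\p{Nd}`, arabicThree))  // true
	fmt.Println(matches(`\w`, eAcute))           // false: \w is [0-9A-Za-z_]
	fmt.Println(matches(`\pL`, eAcute))          // true
	fmt.Println(matches(`\s`, "\v"))             // false: \s omits vertical tab
}
```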
169 | |
170 \h horizontal space NOT SUPPORTED | |
171 \H not horizontal space NOT SUPPORTED | |
172 \v vertical space NOT SUPPORTED | |
173 \V not vertical space NOT SUPPORTED | |
174 | |
175 ASCII character classes: | |
176 [[:alnum:]] alphanumeric (== [0-9A-Za-z]) | |
177 [[:alpha:]] alphabetic (== [A-Za-z]) | |
178 [[:ascii:]] ASCII (== [\x00-\x7F]) | |
179 [[:blank:]] blank (== [\t ]) | |
180 [[:cntrl:]] control (== [\x00-\x1F\x7F]) | |
181 [[:digit:]] digits (== [0-9]) | |
182 [[:graph:]] graphical (== [!-~] == [A-Za-z0-9!"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~]) | |
183 [[:lower:]] lower case (== [a-z]) | |
184 [[:print:]] printable (== [ -~] == [ [:graph:]]) | |
185 [[:punct:]] punctuation (== [!-/:-@[-`{-~]) | |
186 [[:space:]] whitespace (== [\t\n\v\f\r ]) | |
187 [[:upper:]] upper case (== [A-Z]) | |
188 [[:word:]] word characters (== [0-9A-Za-z_]) | |
189 [[:xdigit:]] hex digit (== [0-9A-Fa-f]) | |
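The ASCII classes are written inside a bracket expression, and «[[:space:]]» differs from «\s» by including vertical tab. A minimal sketch, assuming Go's `regexp` package; `matches` and `find` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

// find returns the leftmost match of pattern in text ("" if none).
func find(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	fmt.Println(matches(`^[[:xdigit:]]+$`, "DEADbeef42")) // true
	fmt.Println(matches(`^[[:xdigit:]]+$`, "xyz"))        // false
	fmt.Println(find(`[[:^digit:]]+`, "123abc456"))       // abc
	fmt.Println(find(`[[:punct:]]+`, "a!?b"))             // !?
	fmt.Println(find(`[[:word:]]+`, "a_1-b"))             // a_1
	fmt.Println(matches(`[[:space:]]`, "\v"))             // true
	fmt.Println(matches(`[[:alpha:]]`, "\u00e9"))         // false: ASCII only
}
```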
190 | |
191 Unicode character class names--general category: | |
192 C other | |
193 Cc control | |
194 Cf format | |
195 Cn unassigned code points NOT SUPPORTED | |
196 Co private use | |
197 Cs surrogate | |
198 L letter | |
199 LC cased letter NOT SUPPORTED | |
200 L& cased letter NOT SUPPORTED | |
201 Ll lowercase letter | |
202 Lm modifier letter | |
203 Lo other letter | |
204 Lt titlecase letter | |
205 Lu uppercase letter | |
206 M mark | |
207 Mc spacing mark | |
208 Me enclosing mark | |
209 Mn non-spacing mark | |
210 N number | |
211 Nd decimal number | |
212 Nl letter number | |
213 No other number | |
214 P punctuation | |
215 Pc connector punctuation | |
216 Pd dash punctuation | |
217 Pe close punctuation | |
218 Pf final punctuation | |
219 Pi initial punctuation | |
220 Po other punctuation | |
221 Ps open punctuation | |
222 S symbol | |
223 Sc currency symbol | |
224 Sk modifier symbol | |
225 Sm math symbol | |
226 So other symbol | |
227 Z separator | |
228 Zl line separator | |
229 Zp paragraph separator | |
230 Zs space separator | |
231 | |
232 Unicode character class names--scripts: | |
233 Arabic Arabic | |
234 Armenian Armenian | |
235 Balinese Balinese | |
236 Bamum Bamum | |
237 Batak Batak | |
238 Bengali Bengali | |
239 Bopomofo Bopomofo | |
240 Brahmi Brahmi | |
241 Braille Braille | |
242 Buginese Buginese | |
243 Buhid Buhid | |
244 Canadian_Aboriginal Canadian Aboriginal | |
245 Carian Carian | |
246 Chakma Chakma | |
247 Cham Cham | |
248 Cherokee Cherokee | |
249 Common characters not specific to one script | |
250 Coptic Coptic | |
251 Cuneiform Cuneiform | |
252 Cypriot Cypriot | |
253 Cyrillic Cyrillic | |
254 Deseret Deseret | |
255 Devanagari Devanagari | |
256 Egyptian_Hieroglyphs Egyptian Hieroglyphs | |
257 Ethiopic Ethiopic | |
258 Georgian Georgian | |
259 Glagolitic Glagolitic | |
260 Gothic Gothic | |
261 Greek Greek | |
262 Gujarati Gujarati | |
263 Gurmukhi Gurmukhi | |
264 Han Han | |
265 Hangul Hangul | |
266 Hanunoo Hanunoo | |
267 Hebrew Hebrew | |
268 Hiragana Hiragana | |
269 Imperial_Aramaic Imperial Aramaic | |
270 Inherited inherit script from previous character | |
271 Inscriptional_Pahlavi Inscriptional Pahlavi | |
272 Inscriptional_Parthian Inscriptional Parthian | |
273 Javanese Javanese | |
274 Kaithi Kaithi | |
275 Kannada Kannada | |
276 Katakana Katakana | |
277 Kayah_Li Kayah Li | |
278 Kharoshthi Kharoshthi | |
279 Khmer Khmer | |
280 Lao Lao | |
281 Latin Latin | |
282 Lepcha Lepcha | |
283 Limbu Limbu | |
284 Linear_B Linear B | |
285 Lycian Lycian | |
286 Lydian Lydian | |
287 Malayalam Malayalam | |
288 Mandaic Mandaic | |
289 Meetei_Mayek Meetei Mayek | |
290 Meroitic_Cursive Meroitic Cursive | |
291 Meroitic_Hieroglyphs Meroitic Hieroglyphs | |
292 Miao Miao | |
293 Mongolian Mongolian | |
294 Myanmar Myanmar | |
295 New_Tai_Lue New Tai Lue (aka Simplified Tai Lue) | |
296 Nko Nko | |
297 Ogham Ogham | |
298 Ol_Chiki Ol Chiki | |
299 Old_Italic Old Italic | |
300 Old_Persian Old Persian | |
301 Old_South_Arabian Old South Arabian | |
302 Old_Turkic Old Turkic | |
303 Oriya Oriya | |
304 Osmanya Osmanya | |
305 Phags_Pa 'Phags Pa | |
306 Phoenician Phoenician | |
307 Rejang Rejang | |
308 Runic Runic | |
309 Saurashtra Saurashtra | |
310 Sharada Sharada | |
311 Shavian Shavian | |
312 Sinhala Sinhala | |
313 Sora_Sompeng Sora Sompeng | |
314 Sundanese Sundanese | |
315 Syloti_Nagri Syloti Nagri | |
316 Syriac Syriac | |
317 Tagalog Tagalog | |
318 Tagbanwa Tagbanwa | |
319 Tai_Le Tai Le | |
320 Tai_Tham Tai Tham | |
321 Tai_Viet Tai Viet | |
322 Takri Takri | |
323 Tamil Tamil | |
324 Telugu Telugu | |
325 Thaana Thaana | |
326 Thai Thai | |
327 Tibetan Tibetan | |
328 Tifinagh Tifinagh | |
329 Ugaritic Ugaritic | |
330 Vai Vai | |
331 Yi Yi | |
332 | |
333 Vim character classes: | |
334 \i identifier character NOT SUPPORTED vim | |
335 \I «\i» except digits NOT SUPPORTED vim | |
336 \k keyword character NOT SUPPORTED vim | |
337 \K «\k» except digits NOT SUPPORTED vim | |
338 \f file name character NOT SUPPORTED vim | |
339 \F «\f» except digits NOT SUPPORTED vim | |
340 \p printable character NOT SUPPORTED vim | |
341 \P «\p» except digits NOT SUPPORTED vim | |
342 \s whitespace character (== [ \t]) NOT SUPPORTED vim | |
343 \S non-white space character (== [^ \t]) NOT SUPPORTED vim | |
344 \d digits (== [0-9]) vim | |
345 \D not «\d» vim | |
346 \x hex digits (== [0-9A-Fa-f]) NOT SUPPORTED vim | |
347 \X not «\x» NOT SUPPORTED vim | |
348 \o octal digits (== [0-7]) NOT SUPPORTED vim | |
349 \O not «\o» NOT SUPPORTED vim | |
350 \w word character vim | |
351 \W not «\w» vim | |
352 \h head of word character NOT SUPPORTED vim | |
353 \H not «\h» NOT SUPPORTED vim | |
354 \a alphabetic NOT SUPPORTED vim | |
355 \A not «\a» NOT SUPPORTED vim | |
356 \l lowercase NOT SUPPORTED vim | |
357 \L not lowercase NOT SUPPORTED vim | |
358 \u uppercase NOT SUPPORTED vim | |
359 \U not uppercase NOT SUPPORTED vim | |
360 \_x «\x» plus newline, for any «x» NOT SUPPORTED vim | |
361 | |
362 Vim flags: | |
363 \c ignore case NOT SUPPORTED vim | |
364 \C match case NOT SUPPORTED vim | |
365 \m magic NOT SUPPORTED vim | |
366 \M nomagic NOT SUPPORTED vim | |
367 \v verymagic NOT SUPPORTED vim | |
368 \V verynomagic NOT SUPPORTED vim | |
369 \Z ignore differences in Unicode combining characters NOT SUPPORTED vim | |
370 | |
371 Magic: | |
372 (?{code}) arbitrary Perl code NOT SUPPORTED perl | |
373 (??{code}) postponed arbitrary Perl code NOT SUPPORTED perl | |
374 (?n) recursive call to regexp capturing group «n» NOT SUPPORTED | |
375 (?+n) recursive call to relative group «+n» NOT SUPPORTED | |
376 (?-n) recursive call to relative group «-n» NOT SUPPORTED | |
377 (?C) PCRE callout NOT SUPPORTED pcre | |
378 (?R) recursive call to entire regexp (== (?0)) NOT SUPPORTED | |
379 (?&name) recursive call to named group NOT SUPPORTED | |
380 (?P=name) named backreference NOT SUPPORTED | |
381 (?P>name) recursive call to named group NOT SUPPORTED | |
382 (?(cond)true|false) conditional branch NOT SUPPORTED | |
383 (?(cond)true) conditional branch NOT SUPPORTED | |
384 (*ACCEPT) make regexps more like Prolog NOT SUPPORTED | |
385 (*COMMIT) NOT SUPPORTED | |
386 (*F) NOT SUPPORTED | |
387 (*FAIL) NOT SUPPORTED | |
388 (*MARK) NOT SUPPORTED | |
389 (*PRUNE) NOT SUPPORTED | |
390 (*SKIP) NOT SUPPORTED | |
391 (*THEN) NOT SUPPORTED | |
392 (*ANY) set newline convention NOT SUPPORTED | |
393 (*ANYCRLF) NOT SUPPORTED | |
394 (*CR) NOT SUPPORTED | |
395 (*CRLF) NOT SUPPORTED | |
396 (*LF) NOT SUPPORTED | |
397 (*BSR_ANYCRLF) set \R convention NOT SUPPORTED pcre | |
398 (*BSR_UNICODE) NOT SUPPORTED pcre | |
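Constructs marked NOT SUPPORTED are expected to fail at compile time instead of being silently reinterpreted. The sketch below checks a handful of them, assuming Go's `regexp` package as a stand-in for the RE2 library; `supported` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// supported reports whether pattern is accepted by the parser.
func supported(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	for _, p := range []string{
		`(a)\1`,     // backreference
		`(?=a)b`,    // lookahead
		`(?!a)b`,    // negative lookahead
		`a++`,       // possessive repetition
		`(?>a+)`,    // possessive group
		`(?R)`,      // recursion
		`(?#note)a`, // comment
		`(a)(?:b)`,  // plain groups, for comparison
	} {
		fmt.Printf("%-12s supported=%v\n", p, supported(p))
	}
}
```

Only the last pattern in the list is expected to compile.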
399 | |