OLD | NEW |
| (Empty) |
1 | |
2 1. OVERVIEW | |
3 | |
4 This README file describes the syntax of the arguments that may be passed to | |
5 the FTS3 MATCH operator used for full-text queries. For example, if table | |
6 "t1" is an Fts3 virtual table, the following SQL query: | |
7 | |
8 SELECT * FROM t1 WHERE <col> MATCH <full-text query> | |
9 | |
10 may be used to retrieve all rows that match a specified for full-text query. | |
11 The text "<col>" should be replaced by either the name of the fts3 table | |
12 (in this case "t1"), or by the name of one of the columns of the fts3 | |
13 table. <full-text-query> should be replaced by an SQL expression that | |
14 computes to a string containing an Fts3 query. | |
15 | |
16 If the left-hand-side of the MATCH operator is set to the name of the | |
17 fts3 table, then by default the query may be matched against any column | |
18 of the table. If it is set to a column name, then by default the query | |
19 may only match the specified column. In both cases this may be overriden | |
20 as part of the query text (see sections 2 and 3 below). | |
21 | |
22 As of SQLite version 3.6.8, Fts3 supports two slightly different query | |
23 formats; the standard syntax, which is used by default, and the enhanced | |
24 query syntax which can be selected by compiling with the pre-processor | |
25 symbol SQLITE_ENABLE_FTS3_PARENTHESIS defined. | |
26 | |
27 -DSQLITE_ENABLE_FTS3_PARENTHESIS | |
28 | |
29 2. STANDARD QUERY SYNTAX | |
30 | |
31 When using the standard Fts3 query syntax, a query usually consists of a | |
32 list of terms (words) separated by white-space characters. To match a | |
33 query, a row (or column) of an Fts3 table must contain each of the specified | |
34 terms. For example, the following query: | |
35 | |
36 <col> MATCH 'hello world' | |
37 | |
38 matches rows (or columns, if <col> is the name of a column name) that | |
39 contain at least one instance of the token "hello", and at least one | |
40 instance of the token "world". Tokens may be grouped into phrases using | |
41 quotation marks. In this case, a matching row or column must contain each | |
42 of the tokens in the phrase in the order specified, with no intervening | |
43 tokens. For example, the query: | |
44 | |
45 <col> MATCH '"hello world" joe" | |
46 | |
47 matches the first of the following two documents, but not the second or | |
48 third: | |
49 | |
50 "'Hello world', said Joe." | |
51 "One should always greet the world with a cheery hello, thought Joe." | |
52 "How many hello world programs could their be?" | |
53 | |
54 As well as grouping tokens together by phrase, the binary NEAR operator | |
55 may be used to search for rows that contain two or more specified tokens | |
56 or phrases within a specified proximity of each other. The NEAR operator | |
57 must always be specified in upper case. The word "near" in lower or mixed | |
58 case is treated as an ordinary token. For example, the following query: | |
59 | |
60 <col> MATCH 'engineering NEAR consultancy' | |
61 | |
62 matches rows that contain both the "engineering" and "consultancy" tokens | |
63 in the same column with not more than 10 other words between them. It does | |
64 not matter which of the two terms occurs first in the document, only that | |
65 they be seperated by only 10 tokens or less. The user may also specify | |
66 a different required proximity by adding "/N" immediately after the NEAR | |
67 operator, where N is an integer. For example: | |
68 | |
69 <col> MATCH 'engineering NEAR/5 consultancy' | |
70 | |
71 searches for a row containing an instance of each specified token seperated | |
72 by not more than 5 other tokens. More than one NEAR operator can be used | |
73 in as sequence. For example this query: | |
74 | |
75 <col> MATCH 'reliable NEAR/2 engineering NEAR/5 consultancy' | |
76 | |
77 searches for a row that contains an instance of the token "reliable" | |
78 seperated by not more than two tokens from an instance of "engineering", | |
79 which is in turn separated by not more than 5 other tokens from an | |
80 instance of the term "consultancy". Phrases enclosed in quotes may | |
81 also be used as arguments to the NEAR operator. | |
82 | |
83 Similar to the NEAR operator, one or more tokens or phrases may be | |
84 separated by OR operators. In this case, only one of the specified tokens | |
85 or phrases must appear in the document. For example, the query: | |
86 | |
87 <col> MATCH 'hello OR world' | |
88 | |
89 matches rows that contain either the term "hello", or the term "world", | |
90 or both. Note that unlike in many programming languages, the OR operator | |
91 has a higher precedence than the AND operators implied between white-space | |
92 separated tokens. The following query matches documents that contain the | |
93 term 'sqlite' and at least one of the terms 'fantastic' or 'impressive', | |
94 not those that contain both 'sqlite' and 'fantastic' or 'impressive': | |
95 | |
96 <col> MATCH 'sqlite fantastic OR impressive' | |
97 | |
98 Any token that is part of an Fts3 query expression, whether or not it is | |
99 part of a phrase enclosed in quotes, may have a '*' character appended to | |
100 it. In this case, the token matches all terms that begin with the characters | |
101 of the token, not just those that exactly match it. For example, the | |
102 following query: | |
103 | |
104 <col> MATCH 'sql*' | |
105 | |
106 matches all rows that contain the term "SQLite", as well as those that | |
107 contain "SQL". | |
108 | |
109 A token that is not part of a quoted phrase may be preceded by a '-' | |
110 character, which indicates that matching rows must not contain the | |
111 specified term. For example, the following: | |
112 | |
113 <col> MATCH '"database engine" -sqlite' | |
114 | |
115 matches rows that contain the phrase "database engine" but do not contain | |
116 the term "sqlite". If the '-' character occurs inside a quoted phrase, | |
117 it is ignored. It is possible to use both the '-' prefix and the '*' postfix | |
118 on a single term. At this time, all Fts3 queries must contain at least | |
119 one term or phrase that is not preceded by the '-' prefix. | |
120 | |
121 Regardless of whether or not a table name or column name is used on the | |
122 left hand side of the MATCH operator, a specific column of the fts3 table | |
123 may be associated with each token in a query by preceding a token with | |
124 a column name followed by a ':' character. For example, regardless of what | |
125 is specified for <col>, the following query requires that column "col1" | |
126 of the table contains the term "hello", and that column "col2" of the | |
127 table contains the term "world". If the table does not contain columns | |
128 named "col1" and "col2", then an error is returned and the query is | |
129 not run. | |
130 | |
131 <col> MATCH 'col1:hello col2:world' | |
132 | |
133 It is not possible to associate a specific table column with a quoted | |
134 phrase or a term preceded by a '-' operator. A '*' character may be | |
135 appended to a term associated with a specific column for prefix matching. | |
136 | |
137 3. ENHANCED QUERY SYNTAX | |
138 | |
139 The enhanced query syntax is quite similar to the standard query syntax, | |
140 with the following four differences: | |
141 | |
142 1) Parenthesis are supported. When using the enhanced query syntax, | |
143 parenthesis may be used to overcome the built-in precedence of the | |
144 supplied binary operators. For example, the following query: | |
145 | |
146 <col> MATCH '(hello world) OR (simple example)' | |
147 | |
148 matches documents that contain both "hello" and "world", and documents | |
149 that contain both "simple" and "example". It is not possible to forumlate | |
150 such a query using the standard syntax. | |
151 | |
152 2) Instead of separating tokens and phrases by whitespace, an AND operator | |
153 may be explicitly specified. This does not change query processing at | |
154 all, but may be used to improve readability. For example, the following | |
155 query is handled identically to the one above: | |
156 | |
157 <col> MATCH '(hello AND world) OR (simple AND example)' | |
158 | |
159 As with the OR and NEAR operators, the AND operator must be specified | |
160 in upper case. The word "and" specified in lower or mixed case is | |
161 handled as a regular token. | |
162 | |
163 3) The '-' token prefix is not supported. Instead, a new binary operator, | |
164 NOT, is included. The NOT operator requires that the query specified | |
165 as its left-hand operator matches, but that the query specified as the | |
166 right-hand operator does not. For example, to query for all rows that | |
167 contain the term "example" but not the term "simple", the following | |
168 query could be used: | |
169 | |
170 <col> MATCH 'example NOT simple' | |
171 | |
172 As for all other operators, the NOT operator must be specified in | |
173 upper case. Otherwise it will be treated as a regular token. | |
174 | |
175 4) Unlike in the standard syntax, where the OR operator has a higher | |
176 precedence than the implicit AND operator, when using the enhanced | |
177 syntax implicit and explict AND operators have a higher precedence | |
178 than OR operators. Using the enhanced syntax, the following two | |
179 queries are equivalent: | |
180 | |
181 <col> MATCH 'sqlite fantastic OR impressive' | |
182 <col> MATCH '(sqlite AND fantastic) OR impressive' | |
183 | |
184 however, when using the standard syntax, the query: | |
185 | |
186 <col> MATCH 'sqlite fantastic OR impressive' | |
187 | |
188 is equivalent to the enhanced syntax query: | |
189 | |
190 <col> MATCH 'sqlite AND (fantastic OR impressive)' | |
191 | |
192 The precedence of all enhanced syntax operators, in order from highest | |
193 to lowest, is: | |
194 | |
195 NEAR (highest precedence, tightest grouping) | |
196 NOT | |
197 AND | |
198 OR (lowest precedence, loosest grouping) | |
199 | |
200 Using the advanced syntax, it is possible to specify expressions enclosed | |
201 in parenthesis as operands to the NOT, AND and OR operators. However both | |
202 the left and right hand side operands of NEAR operators must be either | |
203 tokens or phrases. Attempting the following query will return an error: | |
204 | |
205 <col> MATCH 'sqlite NEAR (fantastic OR impressive)' | |
206 | |
207 Queries of this form must be re-written as: | |
208 | |
209 <col> MATCH 'sqlite NEAR fantastic OR sqlite NEAR impressive' | |
OLD | NEW |