Chromium Code Reviews

Side by Side Diff: third_party/sqlite/amalgamation/sqlite3.07.c

Issue 1636873003: Try for backport (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@zzsql_import3_10_2_websql_backport
Patch Set: Created 4 years, 11 months ago
1 /************** Begin file sqlite3rbu.c **************************************/
2 /*
3 ** 2014 August 30
4 **
5 ** The author disclaims copyright to this source code. In place of
6 ** a legal notice, here is a blessing:
7 **
8 ** May you do good and not evil.
9 ** May you find forgiveness for yourself and forgive others.
10 ** May you share freely, never taking more than you give.
11 **
12 *************************************************************************
13 **
14 **
15 ** OVERVIEW
16 **
17 ** The RBU extension requires that the RBU update be packaged as an
18 ** SQLite database. The tables it expects to find are described in
19 ** sqlite3rbu.h. Essentially, for each table xyz in the target database
20 ** that the user wishes to write to, a corresponding data_xyz table is
21 ** created in the RBU database and populated with one row for each row to
22 ** update, insert or delete from the target table.
23 **
24 ** The update proceeds in three stages:
25 **
26 ** 1) The database is updated. The modified database pages are written
27 ** to a *-oal file. A *-oal file is just like a *-wal file, except
28 ** that it is named "<database>-oal" instead of "<database>-wal".
29 ** Because regular SQLite clients do not look for a file named
30 ** "<database>-oal", they go on using the original database in
31 ** rollback mode while the *-oal file is being generated.
32 **
33 ** During this stage RBU does not update the database by writing
34 ** directly to the target tables. Instead it creates "imposter"
35 ** tables using the SQLITE_TESTCTRL_IMPOSTER interface that it uses
36 ** to update each b-tree individually. All updates required by each
37 ** b-tree are completed before moving on to the next, and all
38 ** updates are done in sorted key order.
39 **
40 ** 2) The "<database>-oal" file is moved to the equivalent "<database>-wal"
41 ** location using a call to rename(2). Before doing this the RBU
42 ** module takes an EXCLUSIVE lock on the database file, ensuring
43 ** that there are no other active readers.
44 **
45 ** Once the EXCLUSIVE lock is released, any other database readers
46 ** detect the new *-wal file and read the database in wal mode. At
47 ** this point they see the new version of the database - including
48 ** the updates made as part of the RBU update.
49 **
50 ** 3) The new *-wal file is checkpointed. This proceeds in the same way
51 ** as a regular database checkpoint, except that a single frame is
52 ** checkpointed each time sqlite3rbu_step() is called. If the RBU
53 ** handle is closed before the entire *-wal file is checkpointed,
54 ** the checkpoint progress is saved in the RBU database and the
55 ** checkpoint can be resumed by another RBU client at some point in
56 ** the future.
57 **
58 ** POTENTIAL PROBLEMS
59 **
60 ** The rename() call might not be portable. And RBU is not currently
61 ** syncing the directory after renaming the file.
62 **
63 ** When state is saved, any commit to the *-oal file and the commit to
64 ** the RBU update database are not atomic. So if the power fails at the
65 ** wrong moment they might get out of sync. As the main database will be
66 ** committed before the RBU update database this will likely either just
67 ** pass unnoticed, or result in SQLITE_CONSTRAINT errors (due to UNIQUE
68 ** constraint violations).
69 **
70 ** If some client does modify the target database mid RBU update, or some
71 ** other error occurs, the RBU extension will keep throwing errors. It's
72 ** not really clear how to get out of this state. The system could just
73 ** delete the RBU update database and *-oal file and have the device
74 ** download the update again and start over.
75 **
76 ** At present, for an UPDATE, both the new.* and old.* records are
77 ** collected in the rbu_xyz table. And for both UPDATEs and DELETEs all
78 ** fields are collected. This means we're probably writing a lot more
79 ** data to disk when saving the state of an ongoing update to the RBU
80 ** update database than is strictly necessary.
81 **
82 */
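The file naming used in stages 1 and 2 above can be sketched in a few lines of C. This is only an illustration under stated assumptions: rbu_example_side_file() and rbu_example_move_oal() are hypothetical names that do not appear in sqlite3rbu.c, and the sketch omits the EXCLUSIVE lock and the directory sync that the comment discusses.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of sqlite3rbu.c): build "<database><suffix>",
** for example with suffix "-oal" or "-wal". Returns a malloc()ed string that
** the caller must free(), or NULL if the allocation fails. */
char *rbu_example_side_file(const char *zDb, const char *zSuffix){
  size_t n = strlen(zDb) + strlen(zSuffix) + 1;
  char *z = malloc(n);
  if( z ) snprintf(z, n, "%s%s", zDb, zSuffix);
  return z;
}

/* Stage 2 in miniature: move the finished *-oal file to the *-wal path
** with rename(). Returns 0 on success, non-zero otherwise. A real
** implementation would hold an EXCLUSIVE lock on the database file
** around this call; that part is deliberately left out here. */
int rbu_example_move_oal(const char *zDb){
  char *zOal = rbu_example_side_file(zDb, "-oal");
  char *zWal = rbu_example_side_file(zDb, "-wal");
  int rc = (zOal && zWal) ? rename(zOal, zWal) : -1;
  free(zOal);
  free(zWal);
  return rc;
}
```

Because ordinary clients only ever look for "<database>-wal", nothing they do changes until the rename succeeds, which is why a single rename can publish the whole update.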
83
84 /* #include <assert.h> */
85 /* #include <string.h> */
86 /* #include <stdio.h> */
87
88 /* #include "sqlite3.h" */
89
90 #if !defined(SQLITE_CORE) || defined(SQLITE_ENABLE_RBU)
91 /************** Include sqlite3rbu.h in the middle of sqlite3rbu.c ***********/
92 /************** Begin file sqlite3rbu.h **************************************/
93 /*
94 ** 2014 August 30
95 **
96 ** The author disclaims copyright to this source code. In place of
97 ** a legal notice, here is a blessing:
98 **
99 ** May you do good and not evil.
100 ** May you find forgiveness for yourself and forgive others.
101 ** May you share freely, never taking more than you give.
102 **
103 *************************************************************************
104 **
105 ** This file contains the public interface for the RBU extension.
106 */
107
108 /*
109 ** SUMMARY
110 **
111 ** Writing a transaction containing a large number of operations on
112 ** b-tree indexes that are collectively larger than the available cache
113 ** memory can be very inefficient.
114 **
115 ** The problem is that in order to update a b-tree, the leaf page (at least)
116 ** containing the entry being inserted or deleted must be modified. If the
117 ** working set of leaves is larger than the available cache memory, then a
118 ** single leaf that is modified more than once as part of the transaction
119 ** may be loaded from or written to the persistent media multiple times.
120 ** Additionally, because the index updates are likely to be applied in
121 ** random order, access to pages within the database is also likely to be in
122 ** random order, which is itself quite inefficient.
123 **
124 ** One way to improve the situation is to sort the operations on each index
125 ** by index key before applying them to the b-tree. This leads to an IO
126 ** pattern that resembles a single linear scan through the index b-tree,
127 ** and all but guarantees each modified leaf page is loaded and stored
128 ** exactly once. SQLite uses this trick to improve the performance of
129 ** CREATE INDEX commands. This extension allows it to be used to improve
130 ** the performance of large transactions on existing databases.
131 **
132 ** Additionally, this extension allows the work involved in writing the
133 ** large transaction to be broken down into sub-transactions performed
134 ** sequentially by separate processes. This is useful if the system cannot
135 ** guarantee that a single update process will run for long enough to apply
136 ** the entire update, for example because the update is being applied on a
137 ** mobile device that is frequently rebooted. Even after the writer process
138 ** has committed one or more sub-transactions, other database clients continue
139 ** to read from the original database snapshot. In other words, partially
140 ** applied transactions are not visible to other clients.
141 **
142 ** "RBU" stands for "Resumable Bulk Update". As in a large database update
143 ** transmitted via a wireless network to a mobile device. A transaction
144 ** applied using this extension is hence referred to as an "RBU update".
145 **
146 **
147 ** LIMITATIONS
148 **
149 ** An "RBU update" transaction is subject to the following limitations:
150 **
151 ** * The transaction must consist of INSERT, UPDATE and DELETE operations
152 ** only.
153 **
154 ** * INSERT statements may not use any default values.
155 **
156 ** * UPDATE and DELETE statements must identify their target rows by
157 ** non-NULL PRIMARY KEY values. Rows with NULL values stored in PRIMARY
158 ** KEY fields may not be updated or deleted. If the table being written
159 ** has no PRIMARY KEY, affected rows must be identified by rowid.
160 **
161 ** * UPDATE statements may not modify PRIMARY KEY columns.
162 **
163 ** * No triggers will be fired.
164 **
165 ** * No foreign key violations are detected or reported.
166 **
167 ** * CHECK constraints are not enforced.
168 **
169 ** * No constraint handling mode except for "OR ROLLBACK" is supported.
170 **
171 **
172 ** PREPARATION
173 **
174 ** An "RBU update" is stored as a separate SQLite database. A database
175 ** containing an RBU update is an "RBU database". For each table in the
176 ** target database to be updated, the RBU database should contain a table
177 ** named "data_<target name>" containing the same set of columns as the
178 ** target table, and one more - "rbu_control". The data_% table should
179 ** have no PRIMARY KEY or UNIQUE constraints, but each column should have
180 ** the same type as the corresponding column in the target database.
181 ** The "rbu_control" column should have no type at all. For example, if
182 ** the target database contains:
183 **
184 ** CREATE TABLE t1(a INTEGER PRIMARY KEY, b TEXT, c UNIQUE);
185 **
186 ** Then the RBU database should contain:
187 **
188 ** CREATE TABLE data_t1(a INTEGER, b TEXT, c, rbu_control);
189 **
190 ** The order of the columns in the data_% table does not matter.
191 **
192 ** Instead of a regular table, the RBU database may also contain virtual
193 ** tables or views named using the data_<target> naming scheme.
194 **
195 ** Instead of the plain data_<target> naming scheme, RBU database tables
196 ** may also be named data<integer>_<target>, where <integer> is any sequence
197 ** of zero or more numeric characters (0-9). This can be significant because
198 ** tables within the RBU database are always processed in order sorted by
199 ** name. By judicious selection of the <integer> portion of the names
200 ** of the RBU tables the user can therefore control the order in which they
201 ** are processed. This can be useful, for example, to ensure that "external
202 ** content" FTS4 tables are updated before their underlying content tables.
203 **
204 ** If the target database table is a virtual table or a table that has no
205 ** PRIMARY KEY declaration, the data_% table must also contain a column
206 ** named "rbu_rowid". This column is mapped to the table's implicit primary
207 ** key column - "rowid". Virtual tables for which the "rowid" column does
208 ** not function like a primary key value cannot be updated using RBU. For
209 ** example, if the target db contains either of the following:
210 **
211 ** CREATE VIRTUAL TABLE x1 USING fts3(a, b);
212 ** CREATE TABLE x1(a, b)
213 **
214 ** then the RBU database should contain:
215 **
216 ** CREATE TABLE data_x1(a, b, rbu_rowid, rbu_control);
217 **
218 ** All non-hidden columns (i.e. all columns matched by "SELECT *") of the
219 ** target table must be present in the input table. For virtual tables,
220 ** hidden columns are optional - they are updated by RBU if present in
221 ** the input table, or not otherwise. For example, to write to an fts4
222 ** table with a hidden languageid column such as:
223 **
224 ** CREATE VIRTUAL TABLE ft1 USING fts4(a, b, languageid='langid');
225 **
226 ** Either of the following input table schemas may be used:
227 **
228 ** CREATE TABLE data_ft1(a, b, langid, rbu_rowid, rbu_control);
229 ** CREATE TABLE data_ft1(a, b, rbu_rowid, rbu_control);
230 **
231 ** For each row to INSERT into the target database as part of the RBU
232 ** update, the corresponding data_% table should contain a single record
233 ** with the "rbu_control" column set to contain integer value 0. The
234 ** other columns should be set to the values that make up the new record
235 ** to insert.
236 **
237 ** If the target database table has an INTEGER PRIMARY KEY, it is not
238 ** possible to insert a NULL value into the IPK column. Attempting to
239 ** do so results in an SQLITE_MISMATCH error.
240 **
241 ** For each row to DELETE from the target database as part of the RBU
242 ** update, the corresponding data_% table should contain a single record
243 ** with the "rbu_control" column set to contain integer value 1. The
244 ** real primary key values of the row to delete should be stored in the
245 ** corresponding columns of the data_% table. The values stored in the
246 ** other columns are not used.
247 **
248 ** For each row to UPDATE in the target database as part of the RBU
249 ** update, the corresponding data_% table should contain a single record
250 ** with the "rbu_control" column set to contain a value of type text.
251 ** The real primary key values identifying the row to update should be
252 ** stored in the corresponding columns of the data_% table row, as should
253 ** the new values of all columns being updated. The text value in the
254 ** "rbu_control" column must contain the same number of characters as
255 ** there are columns in the target database table, and must consist entirely
256 ** of 'x' and '.' characters (or in some special cases 'd' - see below). For
257 ** each column that is being updated, the corresponding character is set to
258 ** 'x'. For those that remain as they are, the corresponding character of the
259 ** rbu_control value should be set to '.'. For example, given the tables
260 ** above, the update statement:
261 **
262 ** UPDATE t1 SET c = 'usa' WHERE a = 4;
263 **
264 ** is represented by the data_t1 row created by:
265 **
266 ** INSERT INTO data_t1(a, b, c, rbu_control) VALUES(4, NULL, 'usa', '..x');
267 **
268 ** Instead of an 'x' character, characters of the rbu_control value specified
269 ** for UPDATEs may also be set to 'd'. In this case, instead of updating the
270 ** target table with the value stored in the corresponding data_% column, the
271 ** user-defined SQL function "rbu_delta()" is invoked and the result stored in
272 ** the target table column. rbu_delta() is invoked with two arguments - the
273 ** original value currently stored in the target table column and the
274 ** value specified in the data_xxx table.
275 **
276 ** For example, this row:
277 **
278 ** INSERT INTO data_t1(a, b, c, rbu_control) VALUES(4, NULL, 'usa', '..d');
279 **
280 ** is similar to an UPDATE statement such as:
281 **
282 ** UPDATE t1 SET c = rbu_delta(c, 'usa') WHERE a = 4;
283 **
284 ** Finally, if an 'f' character appears in place of a 'd' or 'x' in an
285 ** rbu_control string, the contents of the data_xxx table column is assumed
286 ** to be a "fossil delta" - a patch to be applied to a blob value in the
287 ** format used by the fossil source-code management system. In this case
288 ** the existing value within the target database table must be of type BLOB.
289 ** It is replaced by the result of applying the specified fossil delta to
290 ** itself.
291 **
292 ** If the target database table is a virtual table or a table with no PRIMARY
293 ** KEY, the rbu_control value should not include a character corresponding
294 ** to the rbu_rowid value. For example, this:
295 **
296 ** INSERT INTO data_ft1(a, b, rbu_rowid, rbu_control)
297 ** VALUES(NULL, 'usa', 12, '.x');
298 **
299 ** causes a result similar to:
300 **
301 ** UPDATE ft1 SET b = 'usa' WHERE rowid = 12;
302 **
303 ** The data_xxx tables themselves should have no PRIMARY KEY declarations.
304 ** However, RBU is more efficient if reading the rows in from each data_xxx
305 ** table in "rowid" order is roughly the same as reading them sorted by
306 ** the PRIMARY KEY of the corresponding target database table. In other
307 ** words, rows should be sorted using the destination table PRIMARY KEY
308 ** fields before they are inserted into the data_xxx tables.
309 **
310 ** USAGE
311 **
312 ** The API declared below allows an application to apply an RBU update
313 ** stored on disk to an existing target database. Essentially, the
314 ** application:
315 **
316 ** 1) Opens an RBU handle using the sqlite3rbu_open() function.
317 **
318 ** 2) Registers any required virtual table modules with the database
319 ** handle returned by sqlite3rbu_db(). Also, if required, register
320 ** the rbu_delta() implementation.
321 **
322 ** 3) Calls the sqlite3rbu_step() function one or more times on
323 ** the new handle. Each call to sqlite3rbu_step() performs a single
324 ** b-tree operation, so thousands of calls may be required to apply
325 ** a complete update.
326 **
327 ** 4) Calls sqlite3rbu_close() to close the RBU update handle. If
328 ** sqlite3rbu_step() has been called enough times to completely
329 ** apply the update to the target database, then the RBU database
330 ** is marked as fully applied. Otherwise, the state of the RBU
331 ** update application is saved in the RBU database for later
332 ** resumption.
333 **
334 ** See comments below for more detail on APIs.
335 **
336 ** If an update is only partially applied to the target database by the
337 ** time sqlite3rbu_close() is called, various state information is saved
338 ** within the RBU database. This allows subsequent processes to automatically
339 ** resume the RBU update from where it left off.
340 **
341 ** To remove all RBU extension state information, returning an RBU database
342 ** to its original contents, it is sufficient to drop all tables that begin
343 ** with the prefix "rbu_".
344 **
345 ** DATABASE LOCKING
346 **
347 ** An RBU update may not be applied to a database in WAL mode. Attempting
348 ** to do so is an error (SQLITE_ERROR).
349 **
350 ** While an RBU handle is open, a SHARED lock may be held on the target
351 ** database file. This means it is possible for other clients to read the
352 ** database, but not to write it.
353 **
354 ** If an RBU update is started and then suspended before it is completed,
355 ** then an external client writes to the database, then attempting to resume
356 ** the suspended RBU update is also an error (SQLITE_BUSY).
357 */
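The rbu_control encoding for UPDATE rows described in the PREPARATION section above lends itself to a small worked example. The sketch below assumes the caller already knows which target columns change; rbu_example_update_mask() is a hypothetical helper, not part of the RBU API, and it only covers the plain 'x' and '.' characters, not 'd' or 'f'.

```c
#include <stdlib.h>

/* Hypothetical helper: build an rbu_control string for an UPDATE row.
** abChanged[i] is non-zero if target column i receives a new value.
** nCol is the number of columns in the target table; an rbu_rowid
** column, if any, is not counted. Returns a malloc()ed, nul-terminated
** string that the caller must free(), or NULL on failure. */
char *rbu_example_update_mask(const unsigned char *abChanged, int nCol){
  char *zMask;
  int i;
  if( nCol<0 ) return NULL;
  zMask = malloc((size_t)nCol + 1);
  if( zMask==NULL ) return NULL;
  for(i=0; i<nCol; i++){
    zMask[i] = abChanged[i] ? 'x' : '.';   /* 'x' = update, '.' = keep */
  }
  zMask[nCol] = '\0';
  return zMask;
}
```

For the t1(a, b, c) table used above, flags {0, 0, 1} yield the string "..x", which is the value stored in rbu_control for "UPDATE t1 SET c = 'usa' WHERE a = 4".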
358
359 #ifndef _SQLITE3RBU_H
360 #define _SQLITE3RBU_H
361
362 /* #include "sqlite3.h" ** Required for error code definitions ** */
363
364 #if 0
365 extern "C" {
366 #endif
367
368 typedef struct sqlite3rbu sqlite3rbu;
369
370 /*
371 ** Open an RBU handle.
372 **
373 ** Argument zTarget is the path to the target database. Argument zRbu is
374 ** the path to the RBU database. Each call to this function must be matched
375 ** by a call to sqlite3rbu_close(). When opening the databases, RBU passes
376 ** the SQLITE_OPEN_URI flag to sqlite3_open_v2(). So if either zTarget
377 ** or zRbu begins with "file:", it will be interpreted as an SQLite
378 ** database URI, not a regular file name.
379 **
380 ** If the zState argument is passed a NULL value, the RBU extension stores
381 ** the current state of the update (how many rows have been updated, which
382 ** indexes are yet to be updated etc.) within the RBU database itself. This
383 ** can be convenient, as it means that the RBU application does not need to
384 ** organize removing a separate state file after the update is concluded.
385 ** Or, if zState is non-NULL, it must be a path to a database file in which
386 ** the RBU extension can store the state of the update.
387 **
388 ** When resuming an RBU update, the zState argument must be passed the same
389 ** value as when the RBU update was started.
390 **
391 ** Once the RBU update is finished, the RBU extension does not
392 ** automatically remove any zState database file, even if it created it.
393 **
394 ** By default, RBU uses the default VFS to access the files on disk. To
395 ** use a VFS other than the default, an SQLite "file:" URI containing a
396 ** "vfs=..." option may be passed as the zTarget argument.
397 **
398 ** IMPORTANT NOTE FOR ZIPVFS USERS: The RBU extension works with all of
399 ** SQLite's built-in VFSs, including the multiplexor VFS. However it does
400 ** not work out of the box with zipvfs. Refer to the comment describing
401 ** the zipvfs_create_vfs() API below for details on using RBU with zipvfs.
402 */
403 SQLITE_API sqlite3rbu *SQLITE_STDCALL sqlite3rbu_open(
404 const char *zTarget,
405 const char *zRbu,
406 const char *zState
407 );
408
409 /*
410 ** Internally, each RBU connection uses a separate SQLite database
411 ** connection to access the target and rbu update databases. This
412 ** API allows the application direct access to these database handles.
413 **
414 ** The first argument passed to this function must be a valid, open, RBU
415 ** handle. The second argument should be passed zero to access the target
416 ** database handle, or non-zero to access the rbu update database handle.
417 ** Accessing the underlying database handles may be useful in the
418 ** following scenarios:
419 **
420 ** * If any target tables are virtual tables, it may be necessary to
421 ** call sqlite3_create_module() on the target database handle to
422 ** register the required virtual table implementations.
423 **
424 ** * If the data_xxx tables in the RBU source database are virtual
425 ** tables, the application may need to call sqlite3_create_module() on
426 ** the rbu update db handle to register any required virtual table
427 ** implementations.
428 **
429 ** * If the application uses the "rbu_delta()" feature described above,
430 ** it must use sqlite3_create_function() or similar to register the
431 ** rbu_delta() implementation with the target database handle.
432 **
433 ** If an error has occurred, either while opening or stepping the RBU object,
434 ** this function may return NULL. The error code and message may be collected
435 ** when sqlite3rbu_close() is called.
436 **
437 ** Database handles returned by this function remain valid until the next
438 ** call to any sqlite3rbu_xxx() function other than sqlite3rbu_db().
439 */
440 SQLITE_API sqlite3 *SQLITE_STDCALL sqlite3rbu_db(sqlite3rbu*, int bRbu);
441
442 /*
443 ** Do some work towards applying the RBU update to the target db.
444 **
445 ** Return SQLITE_DONE if the update has been completely applied, or
446 ** SQLITE_OK if no error occurs but there remains work to do to apply
447 ** the RBU update. If an error does occur, some other error code is
448 ** returned.
449 **
450 ** Once a call to sqlite3rbu_step() has returned a value other than
451 ** SQLITE_OK, all subsequent calls on the same RBU handle are no-ops
452 ** that immediately return the same value.
453 */
454 SQLITE_API int SQLITE_STDCALL sqlite3rbu_step(sqlite3rbu *pRbu);
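The return-value contract documented for sqlite3rbu_step() (SQLITE_OK while work remains, then a final code that every later call repeats) can be modelled without SQLite at all. The sketch below is a stand-alone simulation, not the real implementation: ExampleRbu, example_rbu_step() and example_rbu_run() are hypothetical names, and EX_SQLITE_OK and EX_SQLITE_DONE are local stand-ins carrying the numeric values of SQLITE_OK (0) and SQLITE_DONE (101).

```c
/* Local stand-ins for the SQLite result codes used below. */
#define EX_SQLITE_OK    0
#define EX_SQLITE_DONE  101

/* Hypothetical miniature of an RBU handle: nRemaining units of work are
** left, and rc holds the first non-OK result produced so far. */
typedef struct ExampleRbu {
  int nRemaining;
  int rc;
} ExampleRbu;

/* Perform one unit of work. Once rc is anything other than OK, the call
** is a no-op that returns the same value again. */
int example_rbu_step(ExampleRbu *p){
  if( p->rc!=EX_SQLITE_OK ) return p->rc;
  if( p->nRemaining>0 ) p->nRemaining--;
  if( p->nRemaining==0 ) p->rc = EX_SQLITE_DONE;
  return p->rc;
}

/* Drive the handle until it stops returning OK; returns the final code. */
int example_rbu_run(ExampleRbu *p){
  int rc;
  do {
    rc = example_rbu_step(p);
  } while( rc==EX_SQLITE_OK );
  return rc;
}
```

A caller written against the real API would presumably use the same loop shape, stopping early whenever it wants to suspend the update, and would then pass the handle to sqlite3rbu_close() to collect the final result.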
455
456 /*
457 ** Force RBU to save its state to disk.
458 **
459 ** If a power failure or application crash occurs during an update, following
460 ** system recovery RBU may resume the update from the point at which the state
461 ** was last saved. In other words, from the most recent successful call to
462 ** sqlite3rbu_close() or this function.
463 **
464 ** SQLITE_OK is returned if successful, or an SQLite error code otherwise.
465 */
466 SQLITE_API int SQLITE_STDCALL sqlite3rbu_savestate(sqlite3rbu *pRbu);
467
468 /*
469 ** Close an RBU handle.
470 **
471 ** If the RBU update has been completely applied, mark the RBU database
472 ** as fully applied. Otherwise, assuming no error has occurred, save the
473 ** current state of the RBU update application to the RBU database.
474 **
475 ** If an error has already occurred as part of an sqlite3rbu_step()
476 ** or sqlite3rbu_open() call, or if one occurs within this function, an
477 ** SQLite error code is returned. Additionally, *pzErrmsg may be set to
478 ** point to a buffer containing a utf-8 formatted English language error
479 ** message. It is the responsibility of the caller to eventually free any
480 ** such buffer using sqlite3_free().
481 **
482 ** Otherwise, if no error occurs, this function returns SQLITE_OK if the
483 ** update has been partially applied, or SQLITE_DONE if it has been
484 ** completely applied.
485 */
486 SQLITE_API int SQLITE_STDCALL sqlite3rbu_close(sqlite3rbu *pRbu, char **pzErrmsg );
487
488 /*
489 ** Return the total number of key-value operations (inserts, deletes or
490 ** updates) that have been performed on the target database since the
491 ** current RBU update was started.
492 */
493 SQLITE_API sqlite3_int64 SQLITE_STDCALL sqlite3rbu_progress(sqlite3rbu *pRbu);
494
495 /*
496 ** Create an RBU VFS named zName that accesses the underlying file-system
497 ** via existing VFS zParent. Or, if the zParent parameter is passed NULL,
498 ** then the new RBU VFS uses the default system VFS to access the file-system.
499 ** The new object is registered as a non-default VFS with SQLite before
500 ** returning.
501 **
502 ** Part of the RBU implementation uses a custom VFS object. Usually, this
503 ** object is created and deleted automatically by RBU.
504 **
505 ** The exception is for applications that also use zipvfs. In this case,
506 ** the custom VFS must be explicitly created by the user before the RBU
507 ** handle is opened. The RBU VFS should be installed so that the zipvfs
508 ** VFS uses the RBU VFS, which in turn uses any other VFS layers in use
509 ** (for example multiplexor) to access the file-system. For example,
510 ** to assemble an RBU enabled VFS stack that uses both zipvfs and
511 ** multiplexor (error checking omitted):
512 **
513 ** // Create a VFS named "multiplex" (not the default).
514 ** sqlite3_multiplex_initialize(0, 0);
515 **
516 ** // Create an rbu VFS named "rbu" that uses multiplexor. If the
517 ** // second argument were replaced with NULL, the "rbu" VFS would
518 ** // access the file-system via the system default VFS, bypassing the
519 ** // multiplexor.
520 ** sqlite3rbu_create_vfs("rbu", "multiplex");
521 **
522 ** // Create a zipvfs VFS named "zipvfs" that uses rbu.
523 ** zipvfs_create_vfs_v3("zipvfs", "rbu", 0, xCompressorAlgorithmDetector);
524 **
525 ** // Make zipvfs the default VFS.
526 ** sqlite3_vfs_register(sqlite3_vfs_find("zipvfs"), 1);
527 **
528 ** Because the default VFS created above includes RBU functionality, it
529 ** may be used by RBU clients. Attempting to use RBU with a zipvfs VFS stack
530 ** that does not include the RBU layer results in an error.
531 **
532 ** The overhead of adding the "rbu" VFS to the system is negligible for
533 ** non-RBU users. There is no harm in an application accessing the
534 ** file-system via "rbu" all the time, even if it only uses RBU functionality
535 ** occasionally.
536 */
537 SQLITE_API int SQLITE_STDCALL sqlite3rbu_create_vfs(const char *zName, const char *zParent);
538
539 /*
540 ** Deregister and destroy an RBU vfs created by an earlier call to
541 ** sqlite3rbu_create_vfs().
542 **
543 ** VFS objects are not reference counted. If a VFS object is destroyed
544 ** before all database handles that use it have been closed, the results
545 ** are undefined.
546 */
547 SQLITE_API void SQLITE_STDCALL sqlite3rbu_destroy_vfs(const char *zName);
548
549 #if 0
550 } /* end of the 'extern "C"' block */
551 #endif
552
553 #endif /* _SQLITE3RBU_H */
554
555 /************** End of sqlite3rbu.h ******************************************/
556 /************** Continuing where we left off in sqlite3rbu.c *****************/
557
558 #if defined(_WIN32_WCE)
559 /* #include "windows.h" */
560 #endif
561
562 /* Maximum number of prepared UPDATE statements held by this module */
563 #define SQLITE_RBU_UPDATE_CACHESIZE 16
564
565 /*
566 ** Swap two objects of type TYPE.
567 */
568 #if !defined(SQLITE_AMALGAMATION)
569 # define SWAP(TYPE,A,B) {TYPE t=A; A=B; B=t;}
570 #endif
571
572 /*
573 ** The rbu_state table is used to save the state of a partially applied
574 ** update so that it can be resumed later. The table consists of integer
575 ** keys mapped to values as follows:
576 **
577 ** RBU_STATE_STAGE:
578 ** May be set to integer values 1, 2, 4 or 5. As follows:
579 ** 1: the *-oal file is currently under construction.
580 ** 2: the *-oal file has been constructed, but not yet moved
581 ** to the *-wal path.
582 ** 4: the checkpoint is underway.
583 ** 5: the rbu update has been checkpointed.
584 **
585 ** RBU_STATE_TBL:
586 ** Only valid if STAGE==1. The target database name of the table
587 ** currently being written.
588 **
589 ** RBU_STATE_IDX:
590 ** Only valid if STAGE==1. The target database name of the index
591 ** currently being written, or NULL if the main table is currently being
592 ** updated.
593 **
594 ** RBU_STATE_ROW:
595 ** Only valid if STAGE==1. Number of rows already processed for the current
596 ** table/index.
597 **
598 ** RBU_STATE_PROGRESS:
599 ** Total number of sqlite3rbu_step() calls made so far as part of this
600 ** rbu update.
601 **
602 ** RBU_STATE_CKPT:
603 ** Valid if STAGE==4. The 64-bit checksum associated with the wal-index
604 ** header created by recovering the *-wal file. This is used to detect
605 ** cases when another client appends frames to the *-wal file in the
606 ** middle of an incremental checkpoint (an incremental checkpoint cannot
607 ** be continued if this happens).
608 **
609 ** RBU_STATE_COOKIE:
610 ** Valid if STAGE==1. The current change-counter cookie value in the
611 ** target db file.
612 **
613 ** RBU_STATE_OALSZ:
614 ** Valid if STAGE==1. The size in bytes of the *-oal file.
615 */
616 #define RBU_STATE_STAGE 1
617 #define RBU_STATE_TBL 2
618 #define RBU_STATE_IDX 3
619 #define RBU_STATE_ROW 4
620 #define RBU_STATE_PROGRESS 5
621 #define RBU_STATE_CKPT 6
622 #define RBU_STATE_COOKIE 7
623 #define RBU_STATE_OALSZ 8
624
625 #define RBU_STAGE_OAL 1
626 #define RBU_STAGE_MOVE 2
627 #define RBU_STAGE_CAPTURE 3
628 #define RBU_STAGE_CKPT 4
629 #define RBU_STAGE_DONE 5
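As a reading aid for the stage values above, the following sketch maps a saved RBU_STATE_STAGE value to a short label. rbu_example_stage_name() is a hypothetical helper introduced only for illustration; it treats 3 (RBU_STAGE_CAPTURE) as invalid on the assumption, taken from the comment above, that only 1, 2, 4 and 5 are ever stored in the rbu_state table.

```c
#include <stddef.h>

/* Hypothetical helper: map a saved RBU_STATE_STAGE value to a label.
** The numeric values mirror the RBU_STAGE_* constants. Returns NULL for
** any value the rbu_state comment does not list as a saved stage. */
const char *rbu_example_stage_name(int eStage){
  switch( eStage ){
    case 1: return "oal";         /* side file under construction */
    case 2: return "move";        /* side file complete, not yet renamed */
    case 4: return "checkpoint";  /* incremental checkpoint underway */
    case 5: return "done";        /* update fully checkpointed */
    default: return NULL;         /* includes 3, not listed as a saved value */
  }
}
```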
630
631
632 #define RBU_CREATE_STATE \
633 "CREATE TABLE IF NOT EXISTS %s.rbu_state(k INTEGER PRIMARY KEY, v)"
634
635 typedef struct RbuFrame RbuFrame;
636 typedef struct RbuObjIter RbuObjIter;
637 typedef struct RbuState RbuState;
638 typedef struct rbu_vfs rbu_vfs;
639 typedef struct rbu_file rbu_file;
640 typedef struct RbuUpdateStmt RbuUpdateStmt;
641
642 #if !defined(SQLITE_AMALGAMATION)
643 typedef unsigned int u32;
644 typedef unsigned char u8;
645 typedef sqlite3_int64 i64;
646 #endif
647
648 /*
649 ** These values must match the values defined in wal.c for the equivalent
650 ** locks. These are not magic numbers as they are part of the SQLite file
651 ** format.
652 */
653 #define WAL_LOCK_WRITE 0
654 #define WAL_LOCK_CKPT 1
655 #define WAL_LOCK_READ0 3
656
657 /*
658 ** A structure to store values read from the rbu_state table in memory.
659 */
660 struct RbuState {
661 int eStage;
662 char *zTbl;
663 char *zIdx;
664 i64 iWalCksum;
665 int nRow;
666 i64 nProgress;
667 u32 iCookie;
668 i64 iOalSz;
669 };
670
671 struct RbuUpdateStmt {
672 char *zMask; /* Copy of update mask used with pUpdate */
673 sqlite3_stmt *pUpdate; /* Last update statement (or NULL) */
674 RbuUpdateStmt *pNext;
675 };
676
677 /*
678 ** An iterator of this type is used to iterate through all objects in
679 ** the target database that require updating. For each such table, the
680 ** iterator visits, in order:
681 **
682 ** * the table itself,
683 ** * each index of the table (zero or more points to visit), and
684 ** * a special "cleanup table" state.
685 **
686 ** abIndexed:
687 ** If the table has no indexes on it, abIndexed is set to NULL. Otherwise,
688 ** it points to an array of flags nTblCol elements in size. The flag is
689 ** set for each column that is either a part of the PK or a part of an
690 ** index. Or clear otherwise.
691 **
692 */
693 struct RbuObjIter {
694 sqlite3_stmt *pTblIter; /* Iterate through tables */
695 sqlite3_stmt *pIdxIter; /* Index iterator */
696 int nTblCol; /* Size of azTblCol[] array */
697 char **azTblCol; /* Array of unquoted target column names */
698 char **azTblType; /* Array of target column types */
699 int *aiSrcOrder; /* src table col -> target table col */
700 u8 *abTblPk; /* Array of flags, set on target PK columns */
701 u8 *abNotNull; /* Array of flags, set on NOT NULL columns */
702 u8 *abIndexed; /* Array of flags, set on indexed & PK cols */
703 int eType; /* Table type - an RBU_PK_XXX value */
704
705 /* Output variables. zTbl==0 implies EOF. */
706 int bCleanup; /* True in "cleanup" state */
707 const char *zTbl; /* Name of target db table */
708 const char *zDataTbl; /* Name of rbu db table (or null) */
709 const char *zIdx; /* Name of target db index (or null) */
710 int iTnum; /* Root page of current object */
711 int iPkTnum; /* If eType==EXTERNAL, root of PK index */
712 int bUnique; /* Current index is unique */
713
714 /* Statements created by rbuObjIterPrepareAll() */
715 int nCol; /* Number of columns in current object */
716 sqlite3_stmt *pSelect; /* Source data */
717 sqlite3_stmt *pInsert; /* Statement for INSERT operations */
718 sqlite3_stmt *pDelete; /* Statement for DELETE ops */
719 sqlite3_stmt *pTmpInsert; /* Insert into rbu_tmp_$zDataTbl */
720
721 /* Last UPDATE used (for PK b-tree updates only), or NULL. */
722 RbuUpdateStmt *pRbuUpdate;
723 };
724
725 /*
726 ** Values for RbuObjIter.eType
727 **
728 ** 0: Table does not exist (error)
729 ** 1: Table has an implicit rowid.
730 ** 2: Table has an explicit IPK column.
731 ** 3: Table has an external PK index.
732 ** 4: Table is WITHOUT ROWID.
733 ** 5: Table is a virtual table.
734 */
735 #define RBU_PK_NOTABLE 0
736 #define RBU_PK_NONE 1
737 #define RBU_PK_IPK 2
738 #define RBU_PK_EXTERNAL 3
739 #define RBU_PK_WITHOUT_ROWID 4
740 #define RBU_PK_VTAB 5
741
742
743 /*
744 ** Within the RBU_STAGE_OAL stage, each call to sqlite3rbu_step() performs
745 ** one of the following operations.
746 */
747 #define RBU_INSERT 1 /* Insert on a main table b-tree */
748 #define RBU_DELETE 2 /* Delete a row from a main table b-tree */
749 #define RBU_IDX_DELETE 3 /* Delete a row from an aux. index b-tree */
750 #define RBU_IDX_INSERT 4 /* Insert on an aux. index b-tree */
751 #define RBU_UPDATE 5 /* Update a row in a main table b-tree */
752
753
754 /*
755 ** A single step of an incremental checkpoint - frame iWalFrame of the wal
756 ** file should be copied to page iDbPage of the database file.
757 */
758 struct RbuFrame {
759 u32 iDbPage;
760 u32 iWalFrame;
761 };
762
763 /*
764 ** RBU handle.
765 */
766 struct sqlite3rbu {
767 int eStage; /* Value of RBU_STATE_STAGE field */
768 sqlite3 *dbMain; /* target database handle */
769 sqlite3 *dbRbu; /* rbu database handle */
770 char *zTarget; /* Path to target db */
771 char *zRbu; /* Path to rbu db */
772 char *zState; /* Path to state db (or NULL if zRbu) */
773 char zStateDb[5]; /* Db name for state ("stat" or "main") */
774 int rc; /* Value returned by last rbu_step() call */
775 char *zErrmsg; /* Error message if rc!=SQLITE_OK */
776 int nStep; /* Rows processed for current object */
777 int nProgress; /* Rows processed for all objects */
778 RbuObjIter objiter; /* Iterator for skipping through tbl/idx */
779 const char *zVfsName; /* Name of automatically created rbu vfs */
780 rbu_file *pTargetFd; /* File handle open on target db */
781 i64 iOalSz;
782
783 /* The following state variables are used as part of the incremental
784 ** checkpoint stage (eStage==RBU_STAGE_CKPT). See comments surrounding
785 ** function rbuSetupCheckpoint() for details. */
786 u32 iMaxFrame; /* Largest iWalFrame value in aFrame[] */
787 u32 mLock;
788 int nFrame; /* Entries in aFrame[] array */
789 int nFrameAlloc; /* Allocated size of aFrame[] array */
790 RbuFrame *aFrame;
791 int pgsz;
792 u8 *aBuf;
793 i64 iWalCksum;
794 };
795
796 /*
797 ** An rbu VFS is implemented using an instance of this structure.
798 */
799 struct rbu_vfs {
800 sqlite3_vfs base; /* rbu VFS shim methods */
801 sqlite3_vfs *pRealVfs; /* Underlying VFS */
802 sqlite3_mutex *mutex; /* Mutex to protect pMain */
803 rbu_file *pMain; /* Linked list of main db files */
804 };
805
806 /*
807 ** Each file opened by an rbu VFS is represented by an instance of
808 ** the following structure.
809 */
810 struct rbu_file {
811 sqlite3_file base; /* sqlite3_file methods */
812 sqlite3_file *pReal; /* Underlying file handle */
813 rbu_vfs *pRbuVfs; /* Pointer to the rbu_vfs object */
814 sqlite3rbu *pRbu; /* Pointer to rbu object (rbu target only) */
815
816 int openFlags; /* Flags this file was opened with */
817 u32 iCookie; /* Cookie value for main db files */
818 u8 iWriteVer; /* "write-version" value for main db files */
819
820 int nShm; /* Number of entries in apShm[] array */
821 char **apShm; /* Array of mmap'd *-shm regions */
822 char *zDel; /* Delete this when closing file */
823
824 const char *zWal; /* Wal filename for this main db file */
825 rbu_file *pWalFd; /* Wal file descriptor for this main db */
826 rbu_file *pMainNext; /* Next MAIN_DB file */
827 };
828
829
830 /*************************************************************************
831 ** The following three functions, found below:
832 **
833 ** rbuDeltaGetInt()
834 ** rbuDeltaChecksum()
835 ** rbuDeltaApply()
836 **
837 ** are lifted from the fossil source code (http://fossil-scm.org). They
838 ** are used to implement the scalar SQL function rbu_fossil_delta().
839 */
840
841 /*
842 ** Read bytes from *pz and convert them into a positive integer. When
843 ** finished, leave *pz pointing to the first character past the end of
844 ** the integer. The *pLen parameter holds the length of the string
845 ** in *pz and is decremented once for each character in the integer.
846 */
847 static unsigned int rbuDeltaGetInt(const char **pz, int *pLen){
848 static const signed char zValue[] = {
849 -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
850 -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
851 -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
852 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, -1, -1, -1, -1, -1, -1,
853 -1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
854 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, -1, -1, -1, -1, 36,
855 -1, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51,
856 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, -1, -1, -1, 63, -1,
857 };
858 unsigned int v = 0;
859 int c;
860 unsigned char *z = (unsigned char*)*pz;
861 unsigned char *zStart = z;
862 while( (c = zValue[0x7f&*(z++)])>=0 ){
863 v = (v<<6) + c;
864 }
865 z--;
866 *pLen -= z - zStart;
867 *pz = (char*)z;
868 return v;
869 }
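Reviewer note: the lookup table in rbuDeltaGetInt() encodes a base-64 alphabet (0-9, A-Z, '_', a-z, '~'), most significant digit first. The sketch below restates that mapping with character-range tests instead of a table. The names demo_digit_value() and demo_get_int() are hypothetical and not part of the patch. Unlike the original, the sketch does not mask input bytes with 0x7f, and it assumes an ASCII execution character set.

```c
#include <stddef.h>

/* Hypothetical helper: map one character to its 6-bit digit value,
** or return -1 if the character is not part of the alphabet. */
static int demo_digit_value(unsigned char c){
  if( c>='0' && c<='9' ) return c - '0';        /* values 0..9   */
  if( c>='A' && c<='Z' ) return c - 'A' + 10;   /* values 10..35 */
  if( c=='_' ) return 36;
  if( c>='a' && c<='z' ) return c - 'a' + 37;   /* values 37..62 */
  if( c=='~' ) return 63;
  return -1;
}

/* Decode the run of digits at the start of z. The number of characters
** consumed is written to *pnUsed. Decoding stops at the first character
** outside the alphabet, including the NUL terminator. */
static unsigned int demo_get_int(const char *z, size_t *pnUsed){
  unsigned int v = 0;
  size_t i = 0;
  int c;
  while( (c = demo_digit_value((unsigned char)z[i]))>=0 ){
    v = (v<<6) + (unsigned int)c;   /* shift in six bits per digit */
    i++;
  }
  *pnUsed = i;
  return v;
}
```

For example, "1A" decodes to 1*64 + 10 = 74, and a string that starts with a non-digit decodes to 0 with nothing consumed.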
870
871 /*
872 ** Compute a 32-bit checksum on the N-byte buffer. Return the result.
873 */
874 static unsigned int rbuDeltaChecksum(const char *zIn, size_t N){
875 const unsigned char *z = (const unsigned char *)zIn;
876 unsigned sum0 = 0;
877 unsigned sum1 = 0;
878 unsigned sum2 = 0;
879 unsigned sum3 = 0;
880 while(N >= 16){
881 sum0 += ((unsigned)z[0] + z[4] + z[8] + z[12]);
882 sum1 += ((unsigned)z[1] + z[5] + z[9] + z[13]);
883 sum2 += ((unsigned)z[2] + z[6] + z[10]+ z[14]);
884 sum3 += ((unsigned)z[3] + z[7] + z[11]+ z[15]);
885 z += 16;
886 N -= 16;
887 }
888 while(N >= 4){
889 sum0 += z[0];
890 sum1 += z[1];
891 sum2 += z[2];
892 sum3 += z[3];
893 z += 4;
894 N -= 4;
895 }
896 sum3 += (sum2 << 8) + (sum1 << 16) + (sum0 << 24);
897 switch(N){
898 case 3: sum3 += (z[2] << 8);
899 case 2: sum3 += (z[1] << 16);
900 case 1: sum3 += (z[0] << 24);
901 default: ;
902 }
903 return sum3;
904 }
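Reviewer note: as far as I can tell, rbuDeltaChecksum() computes the sum, modulo 2^32, of the buffer read as big-endian 32-bit words, with the final partial word padded with zero bytes. The unrolled loops and the fall-through switch are an optimization of that idea. The byte-at-a-time sketch below is intended to produce the same value when unsigned int is 32 bits wide. demo_checksum() is a hypothetical name, not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference version: add each byte into the lane selected by
** its position within a 4-byte word. Byte 0 of each word is the most
** significant lane, byte 3 the least significant. */
static uint32_t demo_checksum(const unsigned char *z, size_t n){
  uint32_t sum = 0;
  size_t i;
  for(i=0; i<n; i++){
    sum += (uint32_t)z[i] << (8 * (3 - (i & 3)));
  }
  return sum;
}
```

With this reading, the checksum of the four bytes "abcd" is simply 0x61626364, and a fifth byte 'e' adds 0x65000000 to it.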
905
906 /*
907 ** Apply a delta.
908 **
909 ** The output buffer should be big enough to hold the whole output
910 ** file and a NUL terminator at the end. The rbuDeltaOutputSize()
911 ** routine below will determine this size for you.
912 **
913 ** The delta string should be null-terminated. But the delta string
914 ** may contain embedded NUL characters (if the input and output are
915 ** binary files) so we also have to pass in the length of the delta in
916 ** the lenDelta parameter.
917 **
918 ** This function returns the size of the output file in bytes (excluding
919 ** the final NUL terminator character). Except, if the delta string is
920 ** malformed or intended for use with a source file other than zSrc,
921 ** then this routine returns -1.
922 **
923 ** Refer to the delta_create() documentation in the fossil source code for
924 ** a description of the delta file format.
925 */
926 static int rbuDeltaApply(
927 const char *zSrc, /* The source or pattern file */
928 int lenSrc, /* Length of the source file */
929 const char *zDelta, /* Delta to apply to the pattern */
930 int lenDelta, /* Length of the delta */
931 char *zOut /* Write the output into this preallocated buffer */
932 ){
933 unsigned int limit;
934 unsigned int total = 0;
935 #ifndef FOSSIL_OMIT_DELTA_CKSUM_TEST
936 char *zOrigOut = zOut;
937 #endif
938
939 limit = rbuDeltaGetInt(&zDelta, &lenDelta);
940 if( *zDelta!='\n' ){
941 /* ERROR: size integer not terminated by "\n" */
942 return -1;
943 }
944 zDelta++; lenDelta--;
945 while( *zDelta && lenDelta>0 ){
946 unsigned int cnt, ofst;
947 cnt = rbuDeltaGetInt(&zDelta, &lenDelta);
948 switch( zDelta[0] ){
949 case '@': {
950 zDelta++; lenDelta--;
951 ofst = rbuDeltaGetInt(&zDelta, &lenDelta);
952 if( lenDelta>0 && zDelta[0]!=',' ){
953 /* ERROR: copy command not terminated by ',' */
954 return -1;
955 }
956 zDelta++; lenDelta--;
957 total += cnt;
958 if( total>limit ){
959 /* ERROR: copy exceeds output file size */
960 return -1;
961 }
962 if( (int)(ofst+cnt) > lenSrc ){
963 /* ERROR: copy extends past end of input */
964 return -1;
965 }
966 memcpy(zOut, &zSrc[ofst], cnt);
967 zOut += cnt;
968 break;
969 }
970 case ':': {
971 zDelta++; lenDelta--;
972 total += cnt;
973 if( total>limit ){
974 /* ERROR: insert command gives an output larger than predicted */
975 return -1;
976 }
977 if( (int)cnt>lenDelta ){
978 /* ERROR: insert count exceeds size of delta */
979 return -1;
980 }
981 memcpy(zOut, zDelta, cnt);
982 zOut += cnt;
983 zDelta += cnt;
984 lenDelta -= cnt;
985 break;
986 }
987 case ';': {
988 zDelta++; lenDelta--;
989 zOut[0] = 0;
990 #ifndef FOSSIL_OMIT_DELTA_CKSUM_TEST
991 if( cnt!=rbuDeltaChecksum(zOrigOut, total) ){
992 /* ERROR: bad checksum */
993 return -1;
994 }
995 #endif
996 if( total!=limit ){
997 /* ERROR: generated size does not match predicted size */
998 return -1;
999 }
1000 return total;
1001 }
1002 default: {
1003 /* ERROR: unknown delta operator */
1004 return -1;
1005 }
1006 }
1007 }
1008 /* ERROR: unterminated delta */
1009 return -1;
1010 }
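Reviewer note: to make the command stream handled by rbuDeltaApply() concrete, here is a reduced sketch of the same loop. A delta starts with the target size and a newline, followed by commands of the form "<cnt>@<ofst>," (copy from the source), "<cnt>:<bytes>" (insert literal bytes) and "<cksum>;" (end). All demo_* names are hypothetical. The sketch skips checksum verification, similar in spirit to building with FOSSIL_OMIT_DELTA_CKSUM_TEST, and it bounds every read by the delta length instead of relying on a NUL terminator.

```c
#include <string.h>

/* Hypothetical digit decoder for the alphabet 0-9, A-Z, '_', a-z, '~'. */
static int demo_digit(unsigned char c){
  if( c>='0' && c<='9' ) return c - '0';
  if( c>='A' && c<='Z' ) return c - 'A' + 10;
  if( c=='_' ) return 36;
  if( c>='a' && c<='z' ) return c - 'a' + 37;
  if( c=='~' ) return 63;
  return -1;
}

/* Read an integer at d[*pi], advancing *pi past its digits (bounded by n). */
static unsigned int demo_int(const char *d, int n, int *pi){
  unsigned int v = 0;
  int c;
  while( *pi<n && (c = demo_digit((unsigned char)d[*pi]))>=0 ){
    v = (v<<6) + (unsigned int)c;
    (*pi)++;
  }
  return v;
}

/* Apply delta d (n bytes) to src (nSrc bytes). The out buffer must hold the
** target size plus one byte. Returns the output size, or -1 if the delta is
** malformed. The checksum before ';' is parsed but not verified. */
static int demo_apply(const char *src, int nSrc, const char *d, int n, char *out){
  int i = 0;
  unsigned int total = 0;
  unsigned int limit = demo_int(d, n, &i);
  if( i>=n || d[i]!='\n' ) return -1;
  i++;
  while( i<n ){
    unsigned int cnt = demo_int(d, n, &i);
    if( i>=n ) return -1;
    switch( d[i++] ){
      case '@': {                      /* copy cnt bytes from src at ofst */
        unsigned int ofst = demo_int(d, n, &i);
        if( i>=n || d[i]!=',' ) return -1;
        i++;
        if( cnt>limit-total ) return -1;
        if( ofst>(unsigned int)nSrc || cnt>(unsigned int)nSrc-ofst ) return -1;
        memcpy(&out[total], &src[ofst], cnt);
        total += cnt;
        break;
      }
      case ':': {                      /* insert cnt literal bytes */
        if( cnt>limit-total ) return -1;
        if( cnt>(unsigned int)(n-i) ) return -1;
        memcpy(&out[total], &d[i], cnt);
        total += cnt;
        i += (int)cnt;
        break;
      }
      case ';': {                      /* end of delta */
        out[total] = 0;
        return total==limit ? (int)total : -1;
      }
      default: return -1;              /* unknown operator */
    }
  }
  return -1;                           /* unterminated delta */
}
```

As a worked case, the delta "B\n6@0,5:there0;" applied to "hello world" declares an 11-byte target ('B' is 11), copies 6 bytes from offset 0, inserts the 5 literal bytes "there", and ends, giving "hello there".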
1011
1012 static int rbuDeltaOutputSize(const char *zDelta, int lenDelta){
1013 int size;
1014 size = rbuDeltaGetInt(&zDelta, &lenDelta);
1015 if( *zDelta!='\n' ){
1016 /* ERROR: size integer not terminated by "\n" */
1017 return -1;
1018 }
1019 return size;
1020 }
1021
1022 /*
1023 ** End of code taken from fossil.
1024 *************************************************************************/
1025
1026 /*
1027 ** Implementation of SQL scalar function rbu_fossil_delta().
1028 **
1029 ** This function applies a fossil delta patch to a blob. Exactly two
1030 ** arguments must be passed to this function. The first is the blob to
1031 ** patch and the second the patch to apply. If no error occurs, this
1032 ** function returns the patched blob.
1033 */
1034 static void rbuFossilDeltaFunc(
1035 sqlite3_context *context,
1036 int argc,
1037 sqlite3_value **argv
1038 ){
1039 const char *aDelta;
1040 int nDelta;
1041 const char *aOrig;
1042 int nOrig;
1043
1044 int nOut;
1045 int nOut2;
1046 char *aOut;
1047
1048 assert( argc==2 );
1049
1050 nOrig = sqlite3_value_bytes(argv[0]);
1051 aOrig = (const char*)sqlite3_value_blob(argv[0]);
1052 nDelta = sqlite3_value_bytes(argv[1]);
1053 aDelta = (const char*)sqlite3_value_blob(argv[1]);
1054
1055 /* Figure out the size of the output */
1056 nOut = rbuDeltaOutputSize(aDelta, nDelta);
1057 if( nOut<0 ){
1058 sqlite3_result_error(context, "corrupt fossil delta", -1);
1059 return;
1060 }
1061
1062 aOut = sqlite3_malloc(nOut+1);
1063 if( aOut==0 ){
1064 sqlite3_result_error_nomem(context);
1065 }else{
1066 nOut2 = rbuDeltaApply(aOrig, nOrig, aDelta, nDelta, aOut);
1067 if( nOut2!=nOut ){
1068 sqlite3_result_error(context, "corrupt fossil delta", -1);
1069 }else{
1070 sqlite3_result_blob(context, aOut, nOut, sqlite3_free);
1071 }
1072 }
1073 }
1074
1075
1076 /*
1077 ** Prepare the SQL statement in buffer zSql against database handle db.
1078 ** If successful, set *ppStmt to point to the new statement and return
1079 ** SQLITE_OK.
1080 **
1081 ** Otherwise, if an error does occur, set *ppStmt to NULL and return
1082 ** an SQLite error code. Additionally, set output variable *pzErrmsg to
1083 ** point to a buffer containing an error message. It is the responsibility
1084 ** of the caller to (eventually) free this buffer using sqlite3_free().
1085 */
1086 static int prepareAndCollectError(
1087 sqlite3 *db,
1088 sqlite3_stmt **ppStmt,
1089 char **pzErrmsg,
1090 const char *zSql
1091 ){
1092 int rc = sqlite3_prepare_v2(db, zSql, -1, ppStmt, 0);
1093 if( rc!=SQLITE_OK ){
1094 *pzErrmsg = sqlite3_mprintf("%s", sqlite3_errmsg(db));
1095 *ppStmt = 0;
1096 }
1097 return rc;
1098 }
1099
1100 /*
1101 ** Reset the SQL statement passed as the first argument. Return a copy
1102 ** of the value returned by sqlite3_reset().
1103 **
1104 ** If an error has occurred, then set *pzErrmsg to point to a buffer
1105 ** containing an error message. It is the responsibility of the caller
1106 ** to eventually free this buffer using sqlite3_free().
1107 */
1108 static int resetAndCollectError(sqlite3_stmt *pStmt, char **pzErrmsg){
1109 int rc = sqlite3_reset(pStmt);
1110 if( rc!=SQLITE_OK ){
1111 *pzErrmsg = sqlite3_mprintf("%s", sqlite3_errmsg(sqlite3_db_handle(pStmt)));
1112 }
1113 return rc;
1114 }
1115
1116 /*
1117 ** Unless it is NULL, argument zSql points to a buffer allocated using
1118 ** sqlite3_malloc containing an SQL statement. This function prepares the SQL
1119 ** statement against database db and frees the buffer. If statement
1120 ** compilation is successful, *ppStmt is set to point to the new statement
1121 ** handle and SQLITE_OK is returned.
1122 **
1123 ** Otherwise, if an error occurs, *ppStmt is set to NULL and an error code
1124 ** returned. In this case, *pzErrmsg may also be set to point to an error
1125 ** message. It is the responsibility of the caller to free this error message
1126 ** buffer using sqlite3_free().
1127 **
1128 ** If argument zSql is NULL, this function assumes that an OOM has occurred.
1129 ** In this case SQLITE_NOMEM is returned and *ppStmt set to NULL.
1130 */
1131 static int prepareFreeAndCollectError(
1132 sqlite3 *db,
1133 sqlite3_stmt **ppStmt,
1134 char **pzErrmsg,
1135 char *zSql
1136 ){
1137 int rc;
1138 assert( *pzErrmsg==0 );
1139 if( zSql==0 ){
1140 rc = SQLITE_NOMEM;
1141 *ppStmt = 0;
1142 }else{
1143 rc = prepareAndCollectError(db, ppStmt, pzErrmsg, zSql);
1144 sqlite3_free(zSql);
1145 }
1146 return rc;
1147 }
1148
1149 /*
1150 ** Free the RbuObjIter.azTblCol[] and RbuObjIter.abTblPk[] arrays allocated
1151 ** by an earlier call to rbuObjIterCacheTableInfo().
1152 */
1153 static void rbuObjIterFreeCols(RbuObjIter *pIter){
1154 int i;
1155 for(i=0; i<pIter->nTblCol; i++){
1156 sqlite3_free(pIter->azTblCol[i]);
1157 sqlite3_free(pIter->azTblType[i]);
1158 }
1159 sqlite3_free(pIter->azTblCol);
1160 pIter->azTblCol = 0;
1161 pIter->azTblType = 0;
1162 pIter->aiSrcOrder = 0;
1163 pIter->abTblPk = 0;
1164 pIter->abNotNull = 0;
1165 pIter->nTblCol = 0;
1166 pIter->eType = 0; /* Invalid value */
1167 }
1168
1169 /*
1170 ** Finalize all statements and free all allocations that are specific to
1171 ** the current object (table/index pair).
1172 */
1173 static void rbuObjIterClearStatements(RbuObjIter *pIter){
1174 RbuUpdateStmt *pUp;
1175
1176 sqlite3_finalize(pIter->pSelect);
1177 sqlite3_finalize(pIter->pInsert);
1178 sqlite3_finalize(pIter->pDelete);
1179 sqlite3_finalize(pIter->pTmpInsert);
1180 pUp = pIter->pRbuUpdate;
1181 while( pUp ){
1182 RbuUpdateStmt *pTmp = pUp->pNext;
1183 sqlite3_finalize(pUp->pUpdate);
1184 sqlite3_free(pUp);
1185 pUp = pTmp;
1186 }
1187
1188 pIter->pSelect = 0;
1189 pIter->pInsert = 0;
1190 pIter->pDelete = 0;
1191 pIter->pRbuUpdate = 0;
1192 pIter->pTmpInsert = 0;
1193 pIter->nCol = 0;
1194 }
1195
1196 /*
1197 ** Clean up any resources allocated as part of the iterator object passed
1198 ** as the only argument.
1199 */
1200 static void rbuObjIterFinalize(RbuObjIter *pIter){
1201 rbuObjIterClearStatements(pIter);
1202 sqlite3_finalize(pIter->pTblIter);
1203 sqlite3_finalize(pIter->pIdxIter);
1204 rbuObjIterFreeCols(pIter);
1205 memset(pIter, 0, sizeof(RbuObjIter));
1206 }
1207
1208 /*
1209 ** Advance the iterator to the next position.
1210 **
1211 ** If no error occurs, SQLITE_OK is returned and the iterator is left
1212 ** pointing to the next entry. Otherwise, an error code and message are
1213 ** left in the RBU handle passed as the first argument. A copy of the
1214 ** error code is returned.
1215 */
1216 static int rbuObjIterNext(sqlite3rbu *p, RbuObjIter *pIter){
1217 int rc = p->rc;
1218 if( rc==SQLITE_OK ){
1219
1220 /* Free any SQLite statements used while processing the previous object */
1221 rbuObjIterClearStatements(pIter);
1222 if( pIter->zIdx==0 ){
1223 rc = sqlite3_exec(p->dbMain,
1224 "DROP TRIGGER IF EXISTS temp.rbu_insert_tr;"
1225 "DROP TRIGGER IF EXISTS temp.rbu_update1_tr;"
1226 "DROP TRIGGER IF EXISTS temp.rbu_update2_tr;"
1227 "DROP TRIGGER IF EXISTS temp.rbu_delete_tr;"
1228 , 0, 0, &p->zErrmsg
1229 );
1230 }
1231
1232 if( rc==SQLITE_OK ){
1233 if( pIter->bCleanup ){
1234 rbuObjIterFreeCols(pIter);
1235 pIter->bCleanup = 0;
1236 rc = sqlite3_step(pIter->pTblIter);
1237 if( rc!=SQLITE_ROW ){
1238 rc = resetAndCollectError(pIter->pTblIter, &p->zErrmsg);
1239 pIter->zTbl = 0;
1240 }else{
1241 pIter->zTbl = (const char*)sqlite3_column_text(pIter->pTblIter, 0);
1242 pIter->zDataTbl = (const char*)sqlite3_column_text(pIter->pTblIter,1);
1243 rc = (pIter->zDataTbl && pIter->zTbl) ? SQLITE_OK : SQLITE_NOMEM;
1244 }
1245 }else{
1246 if( pIter->zIdx==0 ){
1247 sqlite3_stmt *pIdx = pIter->pIdxIter;
1248 rc = sqlite3_bind_text(pIdx, 1, pIter->zTbl, -1, SQLITE_STATIC);
1249 }
1250 if( rc==SQLITE_OK ){
1251 rc = sqlite3_step(pIter->pIdxIter);
1252 if( rc!=SQLITE_ROW ){
1253 rc = resetAndCollectError(pIter->pIdxIter, &p->zErrmsg);
1254 pIter->bCleanup = 1;
1255 pIter->zIdx = 0;
1256 }else{
1257 pIter->zIdx = (const char*)sqlite3_column_text(pIter->pIdxIter, 0);
1258 pIter->iTnum = sqlite3_column_int(pIter->pIdxIter, 1);
1259 pIter->bUnique = sqlite3_column_int(pIter->pIdxIter, 2);
1260 rc = pIter->zIdx ? SQLITE_OK : SQLITE_NOMEM;
1261 }
1262 }
1263 }
1264 }
1265 }
1266
1267 if( rc!=SQLITE_OK ){
1268 rbuObjIterFinalize(pIter);
1269 p->rc = rc;
1270 }
1271 return rc;
1272 }
1273
1274
1275 /*
1276 ** The implementation of the rbu_target_name() SQL function. This function
1277 ** accepts one argument - the name of a table in the RBU database. If the
1278 ** table name matches the pattern:
1279 **
1280 ** data[0-9]*_<name>
1281 **
1282 ** where <name> is any sequence of 1 or more characters, <name> is returned.
1283 ** Otherwise, if the only argument does not match the above pattern, an SQL
1284 ** NULL is returned.
1285 **
1286 ** "data_t1" -> "t1"
1287 ** "data0123_t2" -> "t2"
1288 ** "dataAB_t3" -> NULL
1289 */
1290 static void rbuTargetNameFunc(
1291 sqlite3_context *context,
1292 int argc,
1293 sqlite3_value **argv
1294 ){
1295 const char *zIn;
1296 assert( argc==1 );
1297
1298 zIn = (const char*)sqlite3_value_text(argv[0]);
1299 if( zIn && strlen(zIn)>4 && memcmp("data", zIn, 4)==0 ){
1300 int i;
1301 for(i=4; zIn[i]>='0' && zIn[i]<='9'; i++);
1302 if( zIn[i]=='_' && zIn[i+1] ){
1303 sqlite3_result_text(context, &zIn[i+1], -1, SQLITE_STATIC);
1304 }
1305 }
1306 }
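Reviewer note: the matching rule in rbuTargetNameFunc() can be exercised without an SQLite context. The sketch below is a plain-C restatement under the assumption that returning a pointer into the input string is acceptable for illustration. demo_target_name() is a hypothetical name, and a NULL return stands in for the SQL NULL result.

```c
#include <string.h>

/* Hypothetical stand-alone version of the data[0-9]*_<name> match.
** Returns a pointer to <name> inside zIn, or NULL if zIn does not match. */
static const char *demo_target_name(const char *zIn){
  size_t i;
  if( zIn==0 || strncmp(zIn, "data", 4)!=0 ) return 0;
  for(i=4; zIn[i]>='0' && zIn[i]<='9'; i++){}   /* skip optional digits */
  if( zIn[i]=='_' && zIn[i+1]!='\0' ) return &zIn[i+1];
  return 0;
}
```

This reproduces the three cases listed in the header comment: "data_t1" and "data0123_t2" match, while "dataAB_t3" does not.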
1307
1308 /*
1309 ** Initialize the iterator structure passed as the second argument.
1310 **
1311 ** If no error occurs, SQLITE_OK is returned and the iterator is left
1312 ** pointing to the first entry. Otherwise, an error code and message are
1313 ** left in the RBU handle passed as the first argument. A copy of the
1314 ** error code is returned.
1315 */
1316 static int rbuObjIterFirst(sqlite3rbu *p, RbuObjIter *pIter){
1317 int rc;
1318 memset(pIter, 0, sizeof(RbuObjIter));
1319
1320 rc = prepareAndCollectError(p->dbRbu, &pIter->pTblIter, &p->zErrmsg,
1321 "SELECT rbu_target_name(name) AS target, name FROM sqlite_master "
1322 "WHERE type IN ('table', 'view') AND target IS NOT NULL "
1323 "ORDER BY name"
1324 );
1325
1326 if( rc==SQLITE_OK ){
1327 rc = prepareAndCollectError(p->dbMain, &pIter->pIdxIter, &p->zErrmsg,
1328 "SELECT name, rootpage, sql IS NULL OR substr(8, 6)=='UNIQUE' "
1329 " FROM main.sqlite_master "
1330 " WHERE type='index' AND tbl_name = ?"
1331 );
1332 }
1333
1334 pIter->bCleanup = 1;
1335 p->rc = rc;
1336 return rbuObjIterNext(p, pIter);
1337 }
1338
1339 /*
1340 ** This is a wrapper around "sqlite3_mprintf(zFmt, ...)". If an OOM occurs,
1341 ** an error code is stored in the RBU handle passed as the first argument.
1342 **
1343 ** If an error has already occurred (p->rc is already set to something other
1344 ** than SQLITE_OK), then this function returns NULL without modifying the
1345 ** stored error code. In this case it still calls sqlite3_free() on any
1346 ** printf() parameters associated with %z conversions.
1347 */
1348 static char *rbuMPrintf(sqlite3rbu *p, const char *zFmt, ...){
1349 char *zSql = 0;
1350 va_list ap;
1351 va_start(ap, zFmt);
1352 zSql = sqlite3_vmprintf(zFmt, ap);
1353 if( p->rc==SQLITE_OK ){
1354 if( zSql==0 ) p->rc = SQLITE_NOMEM;
1355 }else{
1356 sqlite3_free(zSql);
1357 zSql = 0;
1358 }
1359 va_end(ap);
1360 return zSql;
1361 }
1362
1363 /*
1364 ** Argument zFmt is a sqlite3_mprintf() style format string. The trailing
1365 ** arguments are the usual substitution values. This function performs
1366 ** the printf() style substitutions and executes the result as an SQL
1367 ** statement on database handle db.
1368 **
1369 ** If an error occurs, an error code and error message are stored in the
1370 ** RBU handle. If an error has already occurred when this function is
1371 ** called, it is a no-op.
1372 */
1373 static int rbuMPrintfExec(sqlite3rbu *p, sqlite3 *db, const char *zFmt, ...){
1374 va_list ap;
1375 char *zSql;
1376 va_start(ap, zFmt);
1377 zSql = sqlite3_vmprintf(zFmt, ap);
1378 if( p->rc==SQLITE_OK ){
1379 if( zSql==0 ){
1380 p->rc = SQLITE_NOMEM;
1381 }else{
1382 p->rc = sqlite3_exec(db, zSql, 0, 0, &p->zErrmsg);
1383 }
1384 }
1385 sqlite3_free(zSql);
1386 va_end(ap);
1387 return p->rc;
1388 }
1389
1390 /*
1391 ** Attempt to allocate and return a pointer to a zeroed block of nByte
1392 ** bytes.
1393 **
1394 ** If an error (i.e. an OOM condition) occurs, return NULL and leave an
1395 ** error code in the rbu handle passed as the first argument. Or, if an
1396 ** error has already occurred when this function is called, return NULL
1397 ** immediately without attempting the allocation or modifying the stored
1398 ** error code.
1399 */
1400 static void *rbuMalloc(sqlite3rbu *p, int nByte){
1401 void *pRet = 0;
1402 if( p->rc==SQLITE_OK ){
1403 assert( nByte>0 );
1404 pRet = sqlite3_malloc(nByte);
1405 if( pRet==0 ){
1406 p->rc = SQLITE_NOMEM;
1407 }else{
1408 memset(pRet, 0, nByte);
1409 }
1410 }
1411 return pRet;
1412 }
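Reviewer note: rbuMalloc(), rbuMPrintf() and rbuMPrintfExec() all follow the same "sticky error" convention: once the handle's rc is set, later helpers become no-ops, so callers can chain several calls and test the error code once at the end. A minimal sketch of that convention, using calloc() in place of sqlite3_malloc() plus memset(), is below. DemoHandle, DEMO_OK, DEMO_NOMEM and demo_malloc_zero() are hypothetical stand-ins, not names from the patch.

```c
#include <stdlib.h>

#define DEMO_OK    0   /* stand-in for SQLITE_OK */
#define DEMO_NOMEM 7   /* stand-in for SQLITE_NOMEM */

typedef struct DemoHandle { int rc; } DemoHandle;

/* Return nByte zeroed bytes, or NULL. If an error is already recorded in
** the handle, return NULL without allocating or touching the stored code. */
static void *demo_malloc_zero(DemoHandle *p, size_t nByte){
  void *pRet = 0;
  if( p->rc==DEMO_OK ){
    pRet = calloc(1, nByte);
    if( pRet==0 ) p->rc = DEMO_NOMEM;
  }
  return pRet;
}
```

The benefit of this design is that intermediate NULL checks can be dropped from call sites, at the cost of every helper having to tolerate NULL inputs from an earlier failed step.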
1413
1414
1415 /*
1416 ** Allocate and zero the pIter->azTblCol[] and abTblPk[] arrays so that
1417 ** there is room for at least nCol elements. If an OOM occurs, store an
1418 ** error code in the RBU handle passed as the first argument.
1419 */
1420 static void rbuAllocateIterArrays(sqlite3rbu *p, RbuObjIter *pIter, int nCol){
1421 int nByte = (2*sizeof(char*) + sizeof(int) + 3*sizeof(u8)) * nCol;
1422 char **azNew;
1423
1424 azNew = (char**)rbuMalloc(p, nByte);
1425 if( azNew ){
1426 pIter->azTblCol = azNew;
1427 pIter->azTblType = &azNew[nCol];
1428 pIter->aiSrcOrder = (int*)&pIter->azTblType[nCol];
1429 pIter->abTblPk = (u8*)&pIter->aiSrcOrder[nCol];
1430 pIter->abNotNull = (u8*)&pIter->abTblPk[nCol];
1431 pIter->abIndexed = (u8*)&pIter->abNotNull[nCol];
1432 }
1433 }
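Reviewer note: rbuAllocateIterArrays() makes one allocation and carves six arrays out of it, which is why rbuObjIterFreeCols() only frees azTblCol and then just clears the other pointers. The sketch below shows the same layout with calloc(). DemoArrays and demo_alloc_arrays() are hypothetical names, and nCol is assumed to be greater than zero. Placing the pointer arrays first, then the int array, then the byte arrays keeps each sub-array suitably aligned on typical platforms.

```c
#include <stdlib.h>

typedef struct DemoArrays {
  char **azName;              /* nCol pointers */
  char **azType;              /* nCol pointers */
  int *aiOrder;               /* nCol ints */
  unsigned char *abPk;        /* nCol flags */
  unsigned char *abNotNull;   /* nCol flags */
  unsigned char *abIndexed;   /* nCol flags */
} DemoArrays;

/* Allocate all six arrays as one zeroed block. Returns 0 on success or -1
** on allocation failure. Free with free(p->azName). */
static int demo_alloc_arrays(DemoArrays *p, int nCol){
  size_t nByte = (2*sizeof(char*) + sizeof(int) + 3*sizeof(unsigned char))
               * (size_t)nCol;
  char **azNew = (char**)calloc(1, nByte);
  if( azNew==0 ) return -1;
  p->azName = azNew;
  p->azType = &azNew[nCol];
  p->aiOrder = (int*)&p->azType[nCol];
  p->abPk = (unsigned char*)&p->aiOrder[nCol];
  p->abNotNull = &p->abPk[nCol];
  p->abIndexed = &p->abNotNull[nCol];
  return 0;
}
```

The end of the last sub-array coincides with the end of the block, so no slack is allocated.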
1434
1435 /*
1436 ** The first argument must be a nul-terminated string. This function
1437 ** returns a copy of the string in memory obtained from sqlite3_malloc().
1438 ** It is the responsibility of the caller to eventually free this memory
1439 ** using sqlite3_free().
1440 **
1441 ** If an OOM condition is encountered when attempting to allocate memory,
1442 ** output variable (*pRc) is set to SQLITE_NOMEM before returning. Otherwise,
1443 ** if the allocation succeeds, (*pRc) is left unchanged.
1444 */
1445 static char *rbuStrndup(const char *zStr, int *pRc){
1446 char *zRet = 0;
1447
1448 assert( *pRc==SQLITE_OK );
1449 if( zStr ){
1450 int nCopy = strlen(zStr) + 1;
1451 zRet = (char*)sqlite3_malloc(nCopy);
1452 if( zRet ){
1453 memcpy(zRet, zStr, nCopy);
1454 }else{
1455 *pRc = SQLITE_NOMEM;
1456 }
1457 }
1458
1459 return zRet;
1460 }
1461
1462 /*
1463 ** Finalize the statement passed as the second argument.
1464 **
1465 ** If the sqlite3_finalize() call indicates that an error occurs, and the
1466 ** rbu handle error code is not already set, set the error code and error
1467 ** message accordingly.
1468 */
1469 static void rbuFinalize(sqlite3rbu *p, sqlite3_stmt *pStmt){
1470 sqlite3 *db = sqlite3_db_handle(pStmt);
1471 int rc = sqlite3_finalize(pStmt);
1472 if( p->rc==SQLITE_OK && rc!=SQLITE_OK ){
1473 p->rc = rc;
1474 p->zErrmsg = sqlite3_mprintf("%s", sqlite3_errmsg(db));
1475 }
1476 }
1477
1478 /* Determine the type of a table.
1479 **
1480 ** peType is of type (int*), a pointer to an output parameter of type
1481 ** (int). This call sets the output parameter as follows, depending
1482 ** on the type of the table specified by parameters dbName and zTbl.
1483 **
1484 ** RBU_PK_NOTABLE: No such table.
1485 ** RBU_PK_NONE: Table has an implicit rowid.
1486 ** RBU_PK_IPK: Table has an explicit IPK column.
1487 ** RBU_PK_EXTERNAL: Table has an external PK index.
1488 ** RBU_PK_WITHOUT_ROWID: Table is WITHOUT ROWID.
1489 ** RBU_PK_VTAB: Table is a virtual table.
1490 **
1491 ** Argument piPk is also of type (int*), and also points to an output
1492 ** parameter. Unless the table has an external primary key index
1493 ** (i.e. unless *peType is set to RBU_PK_EXTERNAL), then *piPk is set to zero. Or,
1494 ** if the table does have an external primary key index, then *piPk
1495 ** is set to the root page number of the primary key index before
1496 ** returning.
1497 **
1498 ** ALGORITHM:
1499 **
1500 ** if( no entry exists in sqlite_master ){
1501 ** return RBU_PK_NOTABLE
1502 ** }else if( sql for the entry starts with "CREATE VIRTUAL" ){
1503 ** return RBU_PK_VTAB
1504 ** }else if( "PRAGMA index_list()" for the table contains a "pk" index ){
1505 ** if( the index that is the pk exists in sqlite_master ){
1506 ** *piPk = rootpage of that index.
1507 ** return RBU_PK_EXTERNAL
1508 ** }else{
1509 ** return RBU_PK_WITHOUT_ROWID
1510 ** }
1511 ** }else if( "PRAGMA table_info()" lists one or more "pk" columns ){
1512 ** return RBU_PK_IPK
1513 ** }else{
1514 ** return RBU_PK_NONE
1515 ** }
1516 */
1517 static void rbuTableType(
1518 sqlite3rbu *p,
1519 const char *zTab,
1520 int *peType,
1521 int *piTnum,
1522 int *piPk
1523 ){
1524 /*
1525 ** 0) SELECT (sql LIKE 'create virtual%'), rootpage FROM sqlite_master WHERE name=%Q
1526 ** 1) PRAGMA index_list = %Q
1527 ** 2) SELECT rootpage FROM sqlite_master WHERE name=%Q
1528 ** 3) PRAGMA table_info = %Q
1529 */
1530 sqlite3_stmt *aStmt[4] = {0, 0, 0, 0};
1531
1532 *peType = RBU_PK_NOTABLE;
1533 *piPk = 0;
1534
1535 assert( p->rc==SQLITE_OK );
1536 p->rc = prepareFreeAndCollectError(p->dbMain, &aStmt[0], &p->zErrmsg,
1537 sqlite3_mprintf(
1538 "SELECT (sql LIKE 'create virtual%%'), rootpage"
1539 " FROM sqlite_master"
1540 " WHERE name=%Q", zTab
1541 ));
1542 if( p->rc!=SQLITE_OK || sqlite3_step(aStmt[0])!=SQLITE_ROW ){
1543 /* Either an error, or no such table. */
1544 goto rbuTableType_end;
1545 }
1546 if( sqlite3_column_int(aStmt[0], 0) ){
1547 *peType = RBU_PK_VTAB; /* virtual table */
1548 goto rbuTableType_end;
1549 }
1550 *piTnum = sqlite3_column_int(aStmt[0], 1);
1551
1552 p->rc = prepareFreeAndCollectError(p->dbMain, &aStmt[1], &p->zErrmsg,
1553 sqlite3_mprintf("PRAGMA index_list=%Q",zTab)
1554 );
1555 if( p->rc ) goto rbuTableType_end;
1556 while( sqlite3_step(aStmt[1])==SQLITE_ROW ){
1557 const u8 *zOrig = sqlite3_column_text(aStmt[1], 3);
1558 const u8 *zIdx = sqlite3_column_text(aStmt[1], 1);
1559 if( zOrig && zIdx && zOrig[0]=='p' ){
1560 p->rc = prepareFreeAndCollectError(p->dbMain, &aStmt[2], &p->zErrmsg,
1561 sqlite3_mprintf(
1562 "SELECT rootpage FROM sqlite_master WHERE name = %Q", zIdx
1563 ));
1564 if( p->rc==SQLITE_OK ){
1565 if( sqlite3_step(aStmt[2])==SQLITE_ROW ){
1566 *piPk = sqlite3_column_int(aStmt[2], 0);
1567 *peType = RBU_PK_EXTERNAL;
1568 }else{
1569 *peType = RBU_PK_WITHOUT_ROWID;
1570 }
1571 }
1572 goto rbuTableType_end;
1573 }
1574 }
1575
1576 p->rc = prepareFreeAndCollectError(p->dbMain, &aStmt[3], &p->zErrmsg,
1577 sqlite3_mprintf("PRAGMA table_info=%Q",zTab)
1578 );
1579 if( p->rc==SQLITE_OK ){
1580 while( sqlite3_step(aStmt[3])==SQLITE_ROW ){
1581 if( sqlite3_column_int(aStmt[3],5)>0 ){
1582 *peType = RBU_PK_IPK; /* explicit IPK column */
1583 goto rbuTableType_end;
1584 }
1585 }
1586 *peType = RBU_PK_NONE;
1587 }
1588
1589 rbuTableType_end: {
1590 unsigned int i;
1591 for(i=0; i<sizeof(aStmt)/sizeof(aStmt[0]); i++){
1592 rbuFinalize(p, aStmt[i]);
1593 }
1594 }
1595 }
1596
1597 /*
1598 ** This is a helper function for rbuObjIterCacheTableInfo(). It populates
1599 ** the pIter->abIndexed[] array.
1600 */
1601 static void rbuObjIterCacheIndexedCols(sqlite3rbu *p, RbuObjIter *pIter){
1602 sqlite3_stmt *pList = 0;
1603 int bIndex = 0;
1604
1605 if( p->rc==SQLITE_OK ){
1606 memcpy(pIter->abIndexed, pIter->abTblPk, sizeof(u8)*pIter->nTblCol);
1607 p->rc = prepareFreeAndCollectError(p->dbMain, &pList, &p->zErrmsg,
1608 sqlite3_mprintf("PRAGMA main.index_list = %Q", pIter->zTbl)
1609 );
1610 }
1611
1612 while( p->rc==SQLITE_OK && SQLITE_ROW==sqlite3_step(pList) ){
1613 const char *zIdx = (const char*)sqlite3_column_text(pList, 1);
1614 sqlite3_stmt *pXInfo = 0;
1615 if( zIdx==0 ) break;
1616 p->rc = prepareFreeAndCollectError(p->dbMain, &pXInfo, &p->zErrmsg,
1617 sqlite3_mprintf("PRAGMA main.index_xinfo = %Q", zIdx)
1618 );
1619 while( p->rc==SQLITE_OK && SQLITE_ROW==sqlite3_step(pXInfo) ){
1620 int iCid = sqlite3_column_int(pXInfo, 1);
1621 if( iCid>=0 ) pIter->abIndexed[iCid] = 1;
1622 }
1623 rbuFinalize(p, pXInfo);
1624 bIndex = 1;
1625 }
1626
1627 rbuFinalize(p, pList);
1628 if( bIndex==0 ) pIter->abIndexed = 0;
1629 }
1630
1631
1632 /*
1633 ** If they are not already populated, populate the pIter->azTblCol[],
1634 ** pIter->abTblPk[], pIter->nTblCol and pIter->eType variables according to
1635 ** the table (not index) that the iterator currently points to.
1636 **
1637 ** Return SQLITE_OK if successful, or an SQLite error code otherwise. If
1638 ** an error does occur, an error code and error message are also left in
1639 ** the RBU handle.
1640 */
1641 static int rbuObjIterCacheTableInfo(sqlite3rbu *p, RbuObjIter *pIter){
1642 if( pIter->azTblCol==0 ){
1643 sqlite3_stmt *pStmt = 0;
1644 int nCol = 0;
1645 int i; /* for() loop iterator variable */
1646 int bRbuRowid = 0; /* If input table has column "rbu_rowid" */
1647 int iOrder = 0;
1648 int iTnum = 0;
1649
1650 /* Figure out the type of table this step will deal with. */
1651 assert( pIter->eType==0 );
1652 rbuTableType(p, pIter->zTbl, &pIter->eType, &iTnum, &pIter->iPkTnum);
1653 if( p->rc==SQLITE_OK && pIter->eType==RBU_PK_NOTABLE ){
1654 p->rc = SQLITE_ERROR;
1655 p->zErrmsg = sqlite3_mprintf("no such table: %s", pIter->zTbl);
1656 }
1657 if( p->rc ) return p->rc;
1658 if( pIter->zIdx==0 ) pIter->iTnum = iTnum;
1659
1660 assert( pIter->eType==RBU_PK_NONE || pIter->eType==RBU_PK_IPK
1661 || pIter->eType==RBU_PK_EXTERNAL || pIter->eType==RBU_PK_WITHOUT_ROWID
1662 || pIter->eType==RBU_PK_VTAB
1663 );
1664
1665 /* Populate the azTblCol[] and nTblCol variables based on the columns
1666 ** of the input table. Ignore any input table columns that begin with
1667 ** "rbu_". */
1668 p->rc = prepareFreeAndCollectError(p->dbRbu, &pStmt, &p->zErrmsg,
1669 sqlite3_mprintf("SELECT * FROM '%q'", pIter->zDataTbl)
1670 );
1671 if( p->rc==SQLITE_OK ){
1672 nCol = sqlite3_column_count(pStmt);
1673 rbuAllocateIterArrays(p, pIter, nCol);
1674 }
1675 for(i=0; p->rc==SQLITE_OK && i<nCol; i++){
1676 const char *zName = (const char*)sqlite3_column_name(pStmt, i);
1677 if( sqlite3_strnicmp("rbu_", zName, 4) ){
1678 char *zCopy = rbuStrndup(zName, &p->rc);
1679 pIter->aiSrcOrder[pIter->nTblCol] = pIter->nTblCol;
1680 pIter->azTblCol[pIter->nTblCol++] = zCopy;
1681 }
1682 else if( 0==sqlite3_stricmp("rbu_rowid", zName) ){
1683 bRbuRowid = 1;
1684 }
1685 }
1686 sqlite3_finalize(pStmt);
1687 pStmt = 0;
1688
1689 if( p->rc==SQLITE_OK
1690 && bRbuRowid!=(pIter->eType==RBU_PK_VTAB || pIter->eType==RBU_PK_NONE)
1691 ){
1692 p->rc = SQLITE_ERROR;
1693 p->zErrmsg = sqlite3_mprintf(
1694 "table %q %s rbu_rowid column", pIter->zDataTbl,
1695 (bRbuRowid ? "may not have" : "requires")
1696 );
1697 }
1698
1699 /* Check that all non-HIDDEN columns in the destination table are also
1700 ** present in the input table. Populate the abTblPk[], azTblType[] and
1701 ** aiSrcOrder[] arrays at the same time. */
1702 if( p->rc==SQLITE_OK ){
1703 p->rc = prepareFreeAndCollectError(p->dbMain, &pStmt, &p->zErrmsg,
1704 sqlite3_mprintf("PRAGMA table_info(%Q)", pIter->zTbl)
1705 );
1706 }
1707 while( p->rc==SQLITE_OK && SQLITE_ROW==sqlite3_step(pStmt) ){
1708 const char *zName = (const char*)sqlite3_column_text(pStmt, 1);
1709 if( zName==0 ) break; /* An OOM - finalize() below returns SQLITE_NOMEM */
1710 for(i=iOrder; i<pIter->nTblCol; i++){
1711 if( 0==strcmp(zName, pIter->azTblCol[i]) ) break;
1712 }
1713 if( i==pIter->nTblCol ){
1714 p->rc = SQLITE_ERROR;
1715 p->zErrmsg = sqlite3_mprintf("column missing from %q: %s",
1716 pIter->zDataTbl, zName
1717 );
1718 }else{
1719 int iPk = sqlite3_column_int(pStmt, 5);
1720 int bNotNull = sqlite3_column_int(pStmt, 3);
1721 const char *zType = (const char*)sqlite3_column_text(pStmt, 2);
1722
1723 if( i!=iOrder ){
1724 SWAP(int, pIter->aiSrcOrder[i], pIter->aiSrcOrder[iOrder]);
1725 SWAP(char*, pIter->azTblCol[i], pIter->azTblCol[iOrder]);
1726 }
1727
1728 pIter->azTblType[iOrder] = rbuStrndup(zType, &p->rc);
1729 pIter->abTblPk[iOrder] = (iPk!=0);
1730 pIter->abNotNull[iOrder] = (u8)bNotNull || (iPk!=0);
1731 iOrder++;
1732 }
1733 }
1734
1735 rbuFinalize(p, pStmt);
1736 rbuObjIterCacheIndexedCols(p, pIter);
1737 assert( pIter->eType!=RBU_PK_VTAB || pIter->abIndexed==0 );
1738 }
1739
1740 return p->rc;
1741 }
1742
1743 /*
1744 ** This function constructs and returns a pointer to a nul-terminated
1745 ** string containing some SQL clause or list based on one or more of the
1746 ** column names currently stored in the pIter->azTblCol[] array.
1747 */
1748 static char *rbuObjIterGetCollist(
1749 sqlite3rbu *p, /* RBU object */
1750 RbuObjIter *pIter /* Object iterator for column names */
1751 ){
1752 char *zList = 0;
1753 const char *zSep = "";
1754 int i;
1755 for(i=0; i<pIter->nTblCol; i++){
1756 const char *z = pIter->azTblCol[i];
1757 zList = rbuMPrintf(p, "%z%s\"%w\"", zList, zSep, z);
1758 zSep = ", ";
1759 }
1760 return zList;
1761 }
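A minimal standalone sketch of the separator idiom used by rbuObjIterGetCollist() above, assuming plain malloc() in place of rbuMPrintf() and a hand-written quoting helper in place of the %w conversion. The names sketch_quote_ident() and sketch_collist() are hypothetical and are not part of sqlite3rbu.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Append zCol to zOut as a double-quoted SQL identifier, doubling any
** embedded '"' character (the effect %w has in the code above). Returns
** a pointer to the byte following the closing quote. */
static char *sketch_quote_ident(char *zOut, const char *zCol){
  *zOut++ = '"';
  for(; *zCol; zCol++){
    if( *zCol=='"' ) *zOut++ = '"';
    *zOut++ = *zCol;
  }
  *zOut++ = '"';
  return zOut;
}

/* Build a comma separated list of quoted column names. The separator
** starts out empty and becomes ", " after the first column, so no
** trailing or leading comma is ever produced. Caller frees the result. */
static char *sketch_collist(const char **azCol, int nCol){
  size_t nByte = 1;              /* Space for the nul-terminator */
  char *zList;
  char *z;
  const char *zSep = "";
  int i;

  assert( nCol>=0 );
  /* Worst case per column: every byte doubled, two quotes, one ", ". */
  for(i=0; i<nCol; i++) nByte += 2*strlen(azCol[i]) + 4;
  zList = z = (char*)malloc(nByte);
  if( zList==0 ) return 0;

  for(i=0; i<nCol; i++){
    size_t nSep = strlen(zSep);
    memcpy(z, zSep, nSep);
    z += nSep;
    z = sketch_quote_ident(z, azCol[i]);
    zSep = ", ";
  }
  *z = '\0';
  return zList;
}
```

Unlike the original, which leaves zList as NULL when the table has no columns, this sketch returns an empty string in that case.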
1762
1763 /*
1764 ** This function is used to create a SELECT list (the list of SQL
1765 ** expressions that follows a SELECT keyword) for a SELECT statement
1766 ** used to read from a data_xxx or rbu_tmp_xxx table while updating the
1767 ** index object currently indicated by the iterator object passed as the
1768 ** second argument. A "PRAGMA index_xinfo = <idxname>" statement is used
1769 ** to obtain the required information.
1770 **
1771 ** If the index is of the following form:
1772 **
1773 ** CREATE INDEX i1 ON t1(c, b COLLATE nocase);
1774 **
1775 ** and "t1" is a table with an explicit INTEGER PRIMARY KEY column
1776 ** "ipk", the returned string is:
1777 **
1778 ** "`c` COLLATE 'BINARY', `b` COLLATE 'NOCASE', `ipk` COLLATE 'BINARY'"
1779 **
1780 ** As well as the returned string, three other malloc'd strings are
1781 ** returned via output parameters. As follows:
1782 **
1783 ** pzImposterCols: ...
1784 ** pzImposterPk: ...
1785 ** pzWhere: ...
1786 */
1787 static char *rbuObjIterGetIndexCols(
1788 sqlite3rbu *p, /* RBU object */
1789 RbuObjIter *pIter, /* Object iterator for column names */
1790 char **pzImposterCols, /* OUT: Columns for imposter table */
1791 char **pzImposterPk, /* OUT: Imposter PK clause */
1792 char **pzWhere, /* OUT: WHERE clause */
1793 int *pnBind /* OUT: Total number of columns */
1794 ){
1795 int rc = p->rc; /* Error code */
1796 int rc2; /* sqlite3_finalize() return code */
1797 char *zRet = 0; /* String to return */
1798 char *zImpCols = 0; /* String to return via *pzImposterCols */
1799 char *zImpPK = 0; /* String to return via *pzImposterPK */
1800 char *zWhere = 0; /* String to return via *pzWhere */
1801 int nBind = 0; /* Value to return via *pnBind */
1802 const char *zCom = ""; /* Set to ", " later on */
1803 const char *zAnd = ""; /* Set to " AND " later on */
1804 sqlite3_stmt *pXInfo = 0; /* PRAGMA index_xinfo = ? */
1805
1806 if( rc==SQLITE_OK ){
1807 assert( p->zErrmsg==0 );
1808 rc = prepareFreeAndCollectError(p->dbMain, &pXInfo, &p->zErrmsg,
1809 sqlite3_mprintf("PRAGMA main.index_xinfo = %Q", pIter->zIdx)
1810 );
1811 }
1812
1813 while( rc==SQLITE_OK && SQLITE_ROW==sqlite3_step(pXInfo) ){
1814 int iCid = sqlite3_column_int(pXInfo, 1);
1815 int bDesc = sqlite3_column_int(pXInfo, 3);
1816 const char *zCollate = (const char*)sqlite3_column_text(pXInfo, 4);
1817 const char *zCol;
1818 const char *zType;
1819
1820 if( iCid<0 ){
1821 /* An integer primary key. If the table has an explicit IPK, use
1822 ** its name. Otherwise, use "rbu_rowid". */
1823 if( pIter->eType==RBU_PK_IPK ){
1824 int i;
1825 for(i=0; pIter->abTblPk[i]==0; i++);
1826 assert( i<pIter->nTblCol );
1827 zCol = pIter->azTblCol[i];
1828 }else{
1829 zCol = "rbu_rowid";
1830 }
1831 zType = "INTEGER";
1832 }else{
1833 zCol = pIter->azTblCol[iCid];
1834 zType = pIter->azTblType[iCid];
1835 }
1836
1837 zRet = sqlite3_mprintf("%z%s\"%w\" COLLATE %Q", zRet, zCom, zCol, zCollate);
1838 if( pIter->bUnique==0 || sqlite3_column_int(pXInfo, 5) ){
1839 const char *zOrder = (bDesc ? " DESC" : "");
1840 zImpPK = sqlite3_mprintf("%z%s\"rbu_imp_%d%w\"%s",
1841 zImpPK, zCom, nBind, zCol, zOrder
1842 );
1843 }
1844 zImpCols = sqlite3_mprintf("%z%s\"rbu_imp_%d%w\" %s COLLATE %Q",
1845 zImpCols, zCom, nBind, zCol, zType, zCollate
1846 );
1847 zWhere = sqlite3_mprintf(
1848 "%z%s\"rbu_imp_%d%w\" IS ?", zWhere, zAnd, nBind, zCol
1849 );
1850 if( zRet==0 || zImpPK==0 || zImpCols==0 || zWhere==0 ) rc = SQLITE_NOMEM;
1851 zCom = ", ";
1852 zAnd = " AND ";
1853 nBind++;
1854 }
1855
1856 rc2 = sqlite3_finalize(pXInfo);
1857 if( rc==SQLITE_OK ) rc = rc2;
1858
1859 if( rc!=SQLITE_OK ){
1860 sqlite3_free(zRet);
1861 sqlite3_free(zImpCols);
1862 sqlite3_free(zImpPK);
1863 sqlite3_free(zWhere);
1864 zRet = 0;
1865 zImpCols = 0;
1866 zImpPK = 0;
1867 zWhere = 0;
1868 p->rc = rc;
1869 }
1870
1871 *pzImposterCols = zImpCols;
1872 *pzImposterPk = zImpPK;
1873 *pzWhere = zWhere;
1874 *pnBind = nBind;
1875 return zRet;
1876 }
1877
1878 /*
1879 ** Assuming the current table columns are "a", "b" and "c", and the zObj
1880 ** parameter is passed "old", return a string of the form:
1881 **
1882 ** "old.a, old.b, old.c"
1883 **
1884 ** With the column names escaped.
1885 **
1886 ** For tables with implicit rowids - RBU_PK_EXTERNAL and RBU_PK_NONE, append
1887 ** the text ", old._rowid_" to the returned value.
1888 */
1889 static char *rbuObjIterGetOldlist(
1890 sqlite3rbu *p,
1891 RbuObjIter *pIter,
1892 const char *zObj
1893 ){
1894 char *zList = 0;
1895 if( p->rc==SQLITE_OK && pIter->abIndexed ){
1896 const char *zS = "";
1897 int i;
1898 for(i=0; i<pIter->nTblCol; i++){
1899 if( pIter->abIndexed[i] ){
1900 const char *zCol = pIter->azTblCol[i];
1901 zList = sqlite3_mprintf("%z%s%s.\"%w\"", zList, zS, zObj, zCol);
1902 }else{
1903 zList = sqlite3_mprintf("%z%sNULL", zList, zS);
1904 }
1905 zS = ", ";
1906 if( zList==0 ){
1907 p->rc = SQLITE_NOMEM;
1908 break;
1909 }
1910 }
1911
1912 /* For a table with implicit rowids, append "old._rowid_" to the list. */
1913 if( pIter->eType==RBU_PK_EXTERNAL || pIter->eType==RBU_PK_NONE ){
1914 zList = rbuMPrintf(p, "%z, %s._rowid_", zList, zObj);
1915 }
1916 }
1917 return zList;
1918 }
1919
1920 /*
1921 ** Return an expression that can be used in a WHERE clause to match the
1922 ** primary key of the current table. For example, if the table is:
1923 **
1924 ** CREATE TABLE t1(a, b, c, PRIMARY KEY(b, c));
1925 **
1926 ** Return the string:
1927 **
1928 ** "b = ?1 AND c = ?2"
1929 */
1930 static char *rbuObjIterGetWhere(
1931 sqlite3rbu *p,
1932 RbuObjIter *pIter
1933 ){
1934 char *zList = 0;
1935 if( pIter->eType==RBU_PK_VTAB || pIter->eType==RBU_PK_NONE ){
1936 zList = rbuMPrintf(p, "_rowid_ = ?%d", pIter->nTblCol+1);
1937 }else if( pIter->eType==RBU_PK_EXTERNAL ){
1938 const char *zSep = "";
1939 int i;
1940 for(i=0; i<pIter->nTblCol; i++){
1941 if( pIter->abTblPk[i] ){
1942 zList = rbuMPrintf(p, "%z%sc%d=?%d", zList, zSep, i, i+1);
1943 zSep = " AND ";
1944 }
1945 }
1946 zList = rbuMPrintf(p,
1947 "_rowid_ = (SELECT id FROM rbu_imposter2 WHERE %z)", zList
1948 );
1949
1950 }else{
1951 const char *zSep = "";
1952 int i;
1953 for(i=0; i<pIter->nTblCol; i++){
1954 if( pIter->abTblPk[i] ){
1955 const char *zCol = pIter->azTblCol[i];
1956 zList = rbuMPrintf(p, "%z%s\"%w\"=?%d", zList, zSep, zCol, i+1);
1957 zSep = " AND ";
1958 }
1959 }
1960 }
1961 return zList;
1962 }
1963
1964 /*
1965 ** The SELECT statement iterating through the keys for the current object
1966 ** (p->objiter.pSelect) currently points to a valid row. However, there
1967 ** is something wrong with the rbu_control value
1968 ** stored in the (p->nCol+1)'th column. Set the error code and error message
1969 ** of the RBU handle to something reflecting this.
1970 */
1971 static void rbuBadControlError(sqlite3rbu *p){
1972 p->rc = SQLITE_ERROR;
1973 p->zErrmsg = sqlite3_mprintf("invalid rbu_control value");
1974 }
1975
1976
1977 /*
1978 ** Return a nul-terminated string containing the comma separated list of
1979 ** assignments that should be included following the "SET" keyword of
1980 ** an UPDATE statement used to update the table object that the iterator
1981 ** passed as the second argument currently points to if the rbu_control
1982 ** column of the data_xxx table entry is set to zMask.
1983 **
1984 ** The memory for the returned string is obtained from sqlite3_malloc().
1985 ** It is the responsibility of the caller to eventually free it using
1986 ** sqlite3_free().
1987 **
1988 ** If an OOM error is encountered when allocating space for the new
1989 ** string, an error code is left in the rbu handle passed as the first
1990 ** argument and NULL is returned. Or, if an error has already occurred
1991 ** when this function is called, NULL is returned immediately, without
1992 ** attempting the allocation or modifying the stored error code.
1993 */
1994 static char *rbuObjIterGetSetlist(
1995 sqlite3rbu *p,
1996 RbuObjIter *pIter,
1997 const char *zMask
1998 ){
1999 char *zList = 0;
2000 if( p->rc==SQLITE_OK ){
2001 int i;
2002
2003 if( (int)strlen(zMask)!=pIter->nTblCol ){
2004 rbuBadControlError(p);
2005 }else{
2006 const char *zSep = "";
2007 for(i=0; i<pIter->nTblCol; i++){
2008 char c = zMask[pIter->aiSrcOrder[i]];
2009 if( c=='x' ){
2010 zList = rbuMPrintf(p, "%z%s\"%w\"=?%d",
2011 zList, zSep, pIter->azTblCol[i], i+1
2012 );
2013 zSep = ", ";
2014 }
2015 else if( c=='d' ){
2016 zList = rbuMPrintf(p, "%z%s\"%w\"=rbu_delta(\"%w\", ?%d)",
2017 zList, zSep, pIter->azTblCol[i], pIter->azTblCol[i], i+1
2018 );
2019 zSep = ", ";
2020 }
2021 else if( c=='f' ){
2022 zList = rbuMPrintf(p, "%z%s\"%w\"=rbu_fossil_delta(\"%w\", ?%d)",
2023 zList, zSep, pIter->azTblCol[i], pIter->azTblCol[i], i+1
2024 );
2025 zSep = ", ";
2026 }
2027 }
2028 }
2029 }
2030 return zList;
2031 }
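The mapping from rbu_control mask characters to SET assignments described above can be sketched in isolation. This is an illustration under stated assumptions, not the code under review: sketch_setlist() is a hypothetical name, column names are assumed to need no escaping (the original uses %w), the mask is indexed directly rather than through aiSrcOrder[], and errors are reported by returning NULL instead of through the RBU handle.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return a malloc'd SET list for the columns selected by zMask, or NULL
** if the mask length does not match nCol, if the mask selects no columns,
** or if malloc() fails. Mask characters:
**
**   'x'  plain assignment            "col"=?N
**   'd'  rbu_delta() update          "col"=rbu_delta("col", ?N)
**   'f'  rbu_fossil_delta() update   "col"=rbu_fossil_delta("col", ?N)
**
** Any other character leaves the column untouched. */
static char *sketch_setlist(const char **azCol, int nCol, const char *zMask){
  size_t nByte = 1;
  size_t nUsed = 0;
  char *zList;
  const char *zSep = "";
  int i;

  assert( nCol>=0 );
  if( strlen(zMask)!=(size_t)nCol ) return 0;   /* Bad rbu_control value */

  /* Each assignment needs the name twice plus under 64 bytes of syntax. */
  for(i=0; i<nCol; i++) nByte += 2*strlen(azCol[i]) + 64;
  zList = (char*)malloc(nByte);
  if( zList==0 ) return 0;
  zList[0] = '\0';

  for(i=0; i<nCol; i++){
    const char *zCol = azCol[i];
    char c = zMask[i];
    int n;
    if( c=='x' ){
      n = snprintf(&zList[nUsed], nByte-nUsed, "%s\"%s\"=?%d",
                   zSep, zCol, i+1);
    }else if( c=='d' || c=='f' ){
      const char *zFunc = (c=='d') ? "rbu_delta" : "rbu_fossil_delta";
      n = snprintf(&zList[nUsed], nByte-nUsed, "%s\"%s\"=%s(\"%s\", ?%d)",
                   zSep, zCol, zFunc, zCol, i+1);
    }else{
      continue;                  /* Column is not updated */
    }
    nUsed += (size_t)n;
    zSep = ", ";
  }

  if( nUsed==0 ){                /* Mask selected nothing */
    free(zList);
    return 0;
  }
  return zList;
}
```

Note that the bound parameter number is always the column position plus one, so skipped columns leave gaps in the numbering (?1, ?3, ?4 for the mask "x.df").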
2032
2033 /*
2034 ** Return a nul-terminated string consisting of nBind comma separated
2035 ** "?" expressions. For example, if nBind is 3, return a pointer to
2036 ** a buffer containing the string "?,?,?".
2037 **
2038 ** The memory for the returned string is obtained from sqlite3_malloc().
2039 ** It is the responsibility of the caller to eventually free it using
2040 ** sqlite3_free().
2041 **
2042 ** If an OOM error is encountered when allocating space for the new
2043 ** string, an error code is left in the rbu handle passed as the first
2044 ** argument and NULL is returned. Or, if an error has already occurred
2045 ** when this function is called, NULL is returned immediately, without
2046 ** attempting the allocation or modifying the stored error code.
2047 */
2048 static char *rbuObjIterGetBindlist(sqlite3rbu *p, int nBind){
2049 char *zRet = 0;
2050 int nByte = nBind*2 + 1;
2051
2052 zRet = (char*)rbuMalloc(p, nByte);
2053 if( zRet ){
2054 int i;
2055 for(i=0; i<nBind; i++){
2056 zRet[i*2] = '?';
2057 zRet[i*2+1] = (i+1==nBind) ? '\0' : ',';
2058 }
2059 }
2060 return zRet;
2061 }
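A minimal standalone sketch of the buffer layout used by rbuObjIterGetBindlist() above, assuming calloc() in place of rbuMalloc() and no RBU handle for error reporting. The name sketch_bindlist() is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a calloc'd string of nBind "?" characters separated by commas,
** or NULL on allocation failure. The caller frees it with free(). */
static char *sketch_bindlist(int nBind){
  char *zRet;
  int i;

  assert( nBind>=0 );
  /* Each "?" takes two bytes: the "?" itself plus either a "," or, for
  ** the last one, the nul-terminator. The extra zeroed byte means that
  ** nBind==0 still produces a valid empty string. */
  zRet = (char*)calloc((size_t)nBind*2 + 1, 1);
  if( zRet==0 ) return 0;

  for(i=0; i<nBind; i++){
    zRet[i*2] = '?';
    zRet[i*2+1] = (i+1==nBind) ? '\0' : ',';
  }
  return zRet;
}
```

The zero-filled allocation matters here: with nBind==0 the loop body never runs, so the terminator comes only from the allocator.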
2062
2063 /*
2064 ** The iterator currently points to a table (not index) of type
2065 ** RBU_PK_WITHOUT_ROWID. This function creates the PRIMARY KEY
2066 ** declaration for the corresponding imposter table. For example,
2067 ** if the iterator points to a table created as:
2068 **
2069 ** CREATE TABLE t1(a, b, c, PRIMARY KEY(b, a DESC)) WITHOUT ROWID
2070 **
2071 ** this function returns:
2072 **
2073 ** PRIMARY KEY("b", "a" DESC)
2074 */
2075 static char *rbuWithoutRowidPK(sqlite3rbu *p, RbuObjIter *pIter){
2076 char *z = 0;
2077 assert( pIter->zIdx==0 );
2078 if( p->rc==SQLITE_OK ){
2079 const char *zSep = "PRIMARY KEY(";
2080 sqlite3_stmt *pXList = 0; /* PRAGMA index_list = (pIter->zTbl) */
2081 sqlite3_stmt *pXInfo = 0; /* PRAGMA index_xinfo = <pk-index> */
2082
2083 p->rc = prepareFreeAndCollectError(p->dbMain, &pXList, &p->zErrmsg,
2084 sqlite3_mprintf("PRAGMA main.index_list = %Q", pIter->zTbl)
2085 );
2086 while( p->rc==SQLITE_OK && SQLITE_ROW==sqlite3_step(pXList) ){
2087 const char *zOrig = (const char*)sqlite3_column_text(pXList,3);
2088 if( zOrig && strcmp(zOrig, "pk")==0 ){
2089 const char *zIdx = (const char*)sqlite3_column_text(pXList,1);
2090 if( zIdx ){
2091 p->rc = prepareFreeAndCollectError(p->dbMain, &pXInfo, &p->zErrmsg,
2092 sqlite3_mprintf("PRAGMA main.index_xinfo = %Q", zIdx)
2093 );
2094 }
2095 break;
2096 }
2097 }
2098 rbuFinalize(p, pXList);
2099
2100 while( p->rc==SQLITE_OK && SQLITE_ROW==sqlite3_step(pXInfo) ){
2101 if( sqlite3_column_int(pXInfo, 5) ){
2102 /* int iCid = sqlite3_column_int(pXInfo, 0); */
2103 const char *zCol = (const char*)sqlite3_column_text(pXInfo, 2);
2104 const char *zDesc = sqlite3_column_int(pXInfo, 3) ? " DESC" : "";
2105 z = rbuMPrintf(p, "%z%s\"%w\"%s", z, zSep, zCol, zDesc);
2106 zSep = ", ";
2107 }
2108 }
2109 z = rbuMPrintf(p, "%z)", z);
2110 rbuFinalize(p, pXInfo);
2111 }
2112 return z;
2113 }
2114
2115 /*
2116 ** This function creates the second imposter table used when writing to
2117 ** a table b-tree where the table has an external primary key. If the
2118 ** iterator passed as the second argument does not currently point to
2119 ** a table (not index) with an external primary key, this function is a
2120 ** no-op.
2121 **
2122 ** Assuming the iterator does point to a table with an external PK, this
2123 ** function creates a WITHOUT ROWID imposter table named "rbu_imposter2"
2124 ** used to access that PK index. For example, if the target table is
2125 ** declared as follows:
2126 **
2127 ** CREATE TABLE t1(a, b TEXT, c REAL, PRIMARY KEY(b, c));
2128 **
2129 ** then the imposter table schema is:
2130 **
2131 ** CREATE TABLE rbu_imposter2(c1 TEXT, c2 REAL, id INTEGER) WITHOUT ROWID;
2132 **
2133 */
2134 static void rbuCreateImposterTable2(sqlite3rbu *p, RbuObjIter *pIter){
2135 if( p->rc==SQLITE_OK && pIter->eType==RBU_PK_EXTERNAL ){
2136 int tnum = pIter->iPkTnum; /* Root page of PK index */
2137 sqlite3_stmt *pQuery = 0; /* SELECT name ... WHERE rootpage = $tnum */
2138 const char *zIdx = 0; /* Name of PK index */
2139 sqlite3_stmt *pXInfo = 0; /* PRAGMA main.index_xinfo = $zIdx */
2140 const char *zComma = "";
2141 char *zCols = 0; /* Used to build up list of table cols */
2142 char *zPk = 0; /* Used to build up table PK declaration */
2143
2144 /* Figure out the name of the primary key index for the current table.
2145 ** This is needed for the argument to "PRAGMA index_xinfo". Set
2146 ** zIdx to point to a nul-terminated string containing this name. */
2147 p->rc = prepareAndCollectError(p->dbMain, &pQuery, &p->zErrmsg,
2148 "SELECT name FROM sqlite_master WHERE rootpage = ?"
2149 );
2150 if( p->rc==SQLITE_OK ){
2151 sqlite3_bind_int(pQuery, 1, tnum);
2152 if( SQLITE_ROW==sqlite3_step(pQuery) ){
2153 zIdx = (const char*)sqlite3_column_text(pQuery, 0);
2154 }
2155 }
2156 if( zIdx ){
2157 p->rc = prepareFreeAndCollectError(p->dbMain, &pXInfo, &p->zErrmsg,
2158 sqlite3_mprintf("PRAGMA main.index_xinfo = %Q", zIdx)
2159 );
2160 }
2161 rbuFinalize(p, pQuery);
2162
2163 while( p->rc==SQLITE_OK && SQLITE_ROW==sqlite3_step(pXInfo) ){
2164 int bKey = sqlite3_column_int(pXInfo, 5);
2165 if( bKey ){
2166 int iCid = sqlite3_column_int(pXInfo, 1);
2167 int bDesc = sqlite3_column_int(pXInfo, 3);
2168 const char *zCollate = (const char*)sqlite3_column_text(pXInfo, 4);
2169 zCols = rbuMPrintf(p, "%z%sc%d %s COLLATE %s", zCols, zComma,
2170 iCid, pIter->azTblType[iCid], zCollate
2171 );
2172 zPk = rbuMPrintf(p, "%z%sc%d%s", zPk, zComma, iCid, bDesc?" DESC":"");
2173 zComma = ", ";
2174 }
2175 }
2176 zCols = rbuMPrintf(p, "%z, id INTEGER", zCols);
2177 rbuFinalize(p, pXInfo);
2178
2179 sqlite3_test_control(SQLITE_TESTCTRL_IMPOSTER, p->dbMain, "main", 1, tnum);
2180 rbuMPrintfExec(p, p->dbMain,
2181 "CREATE TABLE rbu_imposter2(%z, PRIMARY KEY(%z)) WITHOUT ROWID",
2182 zCols, zPk
2183 );
2184 sqlite3_test_control(SQLITE_TESTCTRL_IMPOSTER, p->dbMain, "main", 0, 0);
2185 }
2186 }
2187
2188 /*
2189 ** If an error has already occurred when this function is called, it
2190 ** immediately returns zero (without doing any work). Or, if an error
2191 ** occurs during the execution of this function, it sets the error code
2192 ** in the sqlite3rbu object indicated by the first argument and returns
2193 ** zero.
2194 **
2195 ** The iterator passed as the second argument is guaranteed to point to
2196 ** a table (not an index) when this function is called. This function
2197 ** attempts to create any imposter table required to write to the main
2198 ** table b-tree of the table before returning. Non-zero is returned if
2199 ** an imposter table is created, or zero otherwise.
2200 **
2201 ** An imposter table is required in all cases except RBU_PK_VTAB. Only
2202 ** virtual tables are written to directly. The imposter table has the
2203 ** same schema as the actual target table (less any UNIQUE constraints).
2204 ** More precisely, the "same schema" means the same columns, types,
2205 ** collation sequences. For tables that do not have an external PRIMARY
2206 ** KEY, it also means the same PRIMARY KEY declaration.
2207 */
2208 static void rbuCreateImposterTable(sqlite3rbu *p, RbuObjIter *pIter){
2209 if( p->rc==SQLITE_OK && pIter->eType!=RBU_PK_VTAB ){
2210 int tnum = pIter->iTnum;
2211 const char *zComma = "";
2212 char *zSql = 0;
2213 int iCol;
2214 sqlite3_test_control(SQLITE_TESTCTRL_IMPOSTER, p->dbMain, "main", 0, 1);
2215
2216 for(iCol=0; p->rc==SQLITE_OK && iCol<pIter->nTblCol; iCol++){
2217 const char *zPk = "";
2218 const char *zCol = pIter->azTblCol[iCol];
2219 const char *zColl = 0;
2220
2221 p->rc = sqlite3_table_column_metadata(
2222 p->dbMain, "main", pIter->zTbl, zCol, 0, &zColl, 0, 0, 0
2223 );
2224
2225 if( pIter->eType==RBU_PK_IPK && pIter->abTblPk[iCol] ){
2226 /* If the target table column is an "INTEGER PRIMARY KEY", add
2227 ** "PRIMARY KEY" to the imposter table column declaration. */
2228 zPk = "PRIMARY KEY ";
2229 }
2230 zSql = rbuMPrintf(p, "%z%s\"%w\" %s %sCOLLATE %s%s",
2231 zSql, zComma, zCol, pIter->azTblType[iCol], zPk, zColl,
2232 (pIter->abNotNull[iCol] ? " NOT NULL" : "")
2233 );
2234 zComma = ", ";
2235 }
2236
2237 if( pIter->eType==RBU_PK_WITHOUT_ROWID ){
2238 char *zPk = rbuWithoutRowidPK(p, pIter);
2239 if( zPk ){
2240 zSql = rbuMPrintf(p, "%z, %z", zSql, zPk);
2241 }
2242 }
2243
2244 sqlite3_test_control(SQLITE_TESTCTRL_IMPOSTER, p->dbMain, "main", 1, tnum);
2245 rbuMPrintfExec(p, p->dbMain, "CREATE TABLE \"rbu_imp_%w\"(%z)%s",
2246 pIter->zTbl, zSql,
2247 (pIter->eType==RBU_PK_WITHOUT_ROWID ? " WITHOUT ROWID" : "")
2248 );
2249 sqlite3_test_control(SQLITE_TESTCTRL_IMPOSTER, p->dbMain, "main", 0, 0);
2250 }
2251 }
2252
2253 /*
2254 ** Prepare a statement used to insert rows into the "rbu_tmp_xxx" table.
2255 ** Specifically a statement of the form:
2256 **
2257 ** INSERT INTO rbu_tmp_xxx VALUES(?, ?, ? ...);
2258 **
2259 ** The number of bound variables is equal to the number of columns in
2260 ** the target table, plus one (for the rbu_control column), plus one more
2261 ** (for the rbu_rowid column) if the target table is an implicit IPK or
2262 ** virtual table.
2263 */
2264 static void rbuObjIterPrepareTmpInsert(
2265 sqlite3rbu *p,
2266 RbuObjIter *pIter,
2267 const char *zCollist,
2268 const char *zRbuRowid
2269 ){
2270 int bRbuRowid = (pIter->eType==RBU_PK_EXTERNAL || pIter->eType==RBU_PK_NONE);
2271 char *zBind = rbuObjIterGetBindlist(p, pIter->nTblCol + 1 + bRbuRowid);
2272 if( zBind ){
2273 assert( pIter->pTmpInsert==0 );
2274 p->rc = prepareFreeAndCollectError(
2275 p->dbRbu, &pIter->pTmpInsert, &p->zErrmsg, sqlite3_mprintf(
2276 "INSERT INTO %s.'rbu_tmp_%q'(rbu_control,%s%s) VALUES(%z)",
2277 p->zStateDb, pIter->zDataTbl, zCollist, zRbuRowid, zBind
2278 ));
2279 }
2280 }
2281
2282 static void rbuTmpInsertFunc(
2283 sqlite3_context *pCtx,
2284 int nVal,
2285 sqlite3_value **apVal
2286 ){
2287 sqlite3rbu *p = sqlite3_user_data(pCtx);
2288 int rc = SQLITE_OK;
2289 int i;
2290
2291 for(i=0; rc==SQLITE_OK && i<nVal; i++){
2292 rc = sqlite3_bind_value(p->objiter.pTmpInsert, i+1, apVal[i]);
2293 }
2294 if( rc==SQLITE_OK ){
2295 sqlite3_step(p->objiter.pTmpInsert);
2296 rc = sqlite3_reset(p->objiter.pTmpInsert);
2297 }
2298
2299 if( rc!=SQLITE_OK ){
2300 sqlite3_result_error_code(pCtx, rc);
2301 }
2302 }
2303
2304 /*
2305 ** Ensure that the SQLite statement handles required to update the
2306 ** target database object currently indicated by the iterator passed
2307 ** as the second argument are available.
2308 */
2309 static int rbuObjIterPrepareAll(
2310 sqlite3rbu *p,
2311 RbuObjIter *pIter,
2312 int nOffset /* Add "LIMIT -1 OFFSET $nOffset" to SELECT */
2313 ){
2314 assert( pIter->bCleanup==0 );
2315 if( pIter->pSelect==0 && rbuObjIterCacheTableInfo(p, pIter)==SQLITE_OK ){
2316 const int tnum = pIter->iTnum;
2317 char *zCollist = 0; /* List of indexed columns */
2318 char **pz = &p->zErrmsg;
2319 const char *zIdx = pIter->zIdx;
2320 char *zLimit = 0;
2321
2322 if( nOffset ){
2323 zLimit = sqlite3_mprintf(" LIMIT -1 OFFSET %d", nOffset);
2324 if( !zLimit ) p->rc = SQLITE_NOMEM;
2325 }
2326
2327 if( zIdx ){
2328 const char *zTbl = pIter->zTbl;
2329 char *zImposterCols = 0; /* Columns for imposter table */
2330 char *zImposterPK = 0; /* Primary key declaration for imposter */
2331 char *zWhere = 0; /* WHERE clause on PK columns */
2332 char *zBind = 0;
2333 int nBind = 0;
2334
2335 assert( pIter->eType!=RBU_PK_VTAB );
2336 zCollist = rbuObjIterGetIndexCols(
2337 p, pIter, &zImposterCols, &zImposterPK, &zWhere, &nBind
2338 );
2339 zBind = rbuObjIterGetBindlist(p, nBind);
2340
2341 /* Create the imposter table used to write to this index. */
2342 sqlite3_test_control(SQLITE_TESTCTRL_IMPOSTER, p->dbMain, "main", 0, 1);
2343 sqlite3_test_control(SQLITE_TESTCTRL_IMPOSTER, p->dbMain, "main", 1,tnum);
2344 rbuMPrintfExec(p, p->dbMain,
2345 "CREATE TABLE \"rbu_imp_%w\"( %s, PRIMARY KEY( %s ) ) WITHOUT ROWID",
2346 zTbl, zImposterCols, zImposterPK
2347 );
2348 sqlite3_test_control(SQLITE_TESTCTRL_IMPOSTER, p->dbMain, "main", 0, 0);
2349
2350 /* Create the statement to insert index entries */
2351 pIter->nCol = nBind;
2352 if( p->rc==SQLITE_OK ){
2353 p->rc = prepareFreeAndCollectError(
2354 p->dbMain, &pIter->pInsert, &p->zErrmsg,
2355 sqlite3_mprintf("INSERT INTO \"rbu_imp_%w\" VALUES(%s)", zTbl, zBind)
2356 );
2357 }
2358
2359 /* And to delete index entries */
2360 if( p->rc==SQLITE_OK ){
2361 p->rc = prepareFreeAndCollectError(
2362 p->dbMain, &pIter->pDelete, &p->zErrmsg,
2363 sqlite3_mprintf("DELETE FROM \"rbu_imp_%w\" WHERE %s", zTbl, zWhere)
2364 );
2365 }
2366
2367 /* Create the SELECT statement to read keys in sorted order */
2368 if( p->rc==SQLITE_OK ){
2369 char *zSql;
2370 if( pIter->eType==RBU_PK_EXTERNAL || pIter->eType==RBU_PK_NONE ){
2371 zSql = sqlite3_mprintf(
2372 "SELECT %s, rbu_control FROM %s.'rbu_tmp_%q' ORDER BY %s%s",
2373 zCollist, p->zStateDb, pIter->zDataTbl,
2374 zCollist, zLimit
2375 );
2376 }else{
2377 zSql = sqlite3_mprintf(
2378 "SELECT %s, rbu_control FROM '%q' "
2379 "WHERE typeof(rbu_control)='integer' AND rbu_control!=1 "
2380 "UNION ALL "
2381 "SELECT %s, rbu_control FROM %s.'rbu_tmp_%q' "
2382 "ORDER BY %s%s",
2383 zCollist, pIter->zDataTbl,
2384 zCollist, p->zStateDb, pIter->zDataTbl,
2385 zCollist, zLimit
2386 );
2387 }
2388 p->rc = prepareFreeAndCollectError(p->dbRbu, &pIter->pSelect, pz, zSql);
2389 }
2390
2391 sqlite3_free(zImposterCols);
2392 sqlite3_free(zImposterPK);
2393 sqlite3_free(zWhere);
2394 sqlite3_free(zBind);
2395 }else{
2396 int bRbuRowid = (pIter->eType==RBU_PK_VTAB || pIter->eType==RBU_PK_NONE);
2397 const char *zTbl = pIter->zTbl; /* Table this step applies to */
2398 const char *zWrite; /* Imposter table name */
2399
2400 char *zBindings = rbuObjIterGetBindlist(p, pIter->nTblCol + bRbuRowid);
2401 char *zWhere = rbuObjIterGetWhere(p, pIter);
2402 char *zOldlist = rbuObjIterGetOldlist(p, pIter, "old");
2403 char *zNewlist = rbuObjIterGetOldlist(p, pIter, "new");
2404
2405 zCollist = rbuObjIterGetCollist(p, pIter);
2406 pIter->nCol = pIter->nTblCol;
2407
2408 /* Create the imposter table or tables (if required). */
2409 rbuCreateImposterTable(p, pIter);
2410 rbuCreateImposterTable2(p, pIter);
2411 zWrite = (pIter->eType==RBU_PK_VTAB ? "" : "rbu_imp_");
2412
2413 /* Create the INSERT statement to write to the target PK b-tree */
2414 if( p->rc==SQLITE_OK ){
2415 p->rc = prepareFreeAndCollectError(p->dbMain, &pIter->pInsert, pz,
2416 sqlite3_mprintf(
2417 "INSERT INTO \"%s%w\"(%s%s) VALUES(%s)",
2418 zWrite, zTbl, zCollist, (bRbuRowid ? ", _rowid_" : ""), zBindings
2419 )
2420 );
2421 }
2422
2423 /* Create the DELETE statement to write to the target PK b-tree */
2424 if( p->rc==SQLITE_OK ){
2425 p->rc = prepareFreeAndCollectError(p->dbMain, &pIter->pDelete, pz,
2426 sqlite3_mprintf(
2427 "DELETE FROM \"%s%w\" WHERE %s", zWrite, zTbl, zWhere
2428 )
2429 );
2430 }
2431
2432 if( pIter->abIndexed ){
2433 const char *zRbuRowid = "";
2434 if( pIter->eType==RBU_PK_EXTERNAL || pIter->eType==RBU_PK_NONE ){
2435 zRbuRowid = ", rbu_rowid";
2436 }
2437
2438 /* Create the rbu_tmp_xxx table and the triggers to populate it. */
2439 rbuMPrintfExec(p, p->dbRbu,
2440 "CREATE TABLE IF NOT EXISTS %s.'rbu_tmp_%q' AS "
2441 "SELECT *%s FROM '%q' WHERE 0;"
2442 , p->zStateDb, pIter->zDataTbl
2443 , (pIter->eType==RBU_PK_EXTERNAL ? ", 0 AS rbu_rowid" : "")
2444 , pIter->zDataTbl
2445 );
2446
2447 rbuMPrintfExec(p, p->dbMain,
2448 "CREATE TEMP TRIGGER rbu_delete_tr BEFORE DELETE ON \"%s%w\" "
2449 "BEGIN "
2450 " SELECT rbu_tmp_insert(2, %s);"
2451 "END;"
2452
2453 "CREATE TEMP TRIGGER rbu_update1_tr BEFORE UPDATE ON \"%s%w\" "
2454 "BEGIN "
2455 " SELECT rbu_tmp_insert(2, %s);"
2456 "END;"
2457
2458 "CREATE TEMP TRIGGER rbu_update2_tr AFTER UPDATE ON \"%s%w\" "
2459 "BEGIN "
2460 " SELECT rbu_tmp_insert(3, %s);"
2461 "END;",
2462 zWrite, zTbl, zOldlist,
2463 zWrite, zTbl, zOldlist,
2464 zWrite, zTbl, zNewlist
2465 );
2466
2467 if( pIter->eType==RBU_PK_EXTERNAL || pIter->eType==RBU_PK_NONE ){
2468 rbuMPrintfExec(p, p->dbMain,
2469 "CREATE TEMP TRIGGER rbu_insert_tr AFTER INSERT ON \"%s%w\" "
2470 "BEGIN "
2471 " SELECT rbu_tmp_insert(0, %s);"
2472 "END;",
2473 zWrite, zTbl, zNewlist
2474 );
2475 }
2476
2477 rbuObjIterPrepareTmpInsert(p, pIter, zCollist, zRbuRowid);
2478 }
2479
2480 /* Create the SELECT statement to read keys from data_xxx */
2481 if( p->rc==SQLITE_OK ){
2482 p->rc = prepareFreeAndCollectError(p->dbRbu, &pIter->pSelect, pz,
2483 sqlite3_mprintf(
2484 "SELECT %s, rbu_control%s FROM '%q'%s",
2485 zCollist, (bRbuRowid ? ", rbu_rowid" : ""),
2486 pIter->zDataTbl, zLimit
2487 )
2488 );
2489 }
2490
2491 sqlite3_free(zWhere);
2492 sqlite3_free(zOldlist);
2493 sqlite3_free(zNewlist);
2494 sqlite3_free(zBindings);
2495 }
2496 sqlite3_free(zCollist);
2497 sqlite3_free(zLimit);
2498 }
2499
2500 return p->rc;
2501 }
2502
2503 /*
2504 ** Set output variable *ppStmt to point to an UPDATE statement that may
2505 ** be used to update the imposter table for the main table b-tree of the
2506 ** table object that pIter currently points to, assuming that the
2507 ** rbu_control column of the data_xyz table contains zMask.
2508 **
2509 ** If the zMask string does not specify any columns to update, then this
2510 ** is not an error. Output variable *ppStmt is set to NULL in this case.
2511 */
2512 static int rbuGetUpdateStmt(
2513 sqlite3rbu *p, /* RBU handle */
2514 RbuObjIter *pIter, /* Object iterator */
2515 const char *zMask, /* rbu_control value ('x.x.') */
2516 sqlite3_stmt **ppStmt /* OUT: UPDATE statement handle */
2517 ){
2518 RbuUpdateStmt **pp;
2519 RbuUpdateStmt *pUp = 0;
2520 int nUp = 0;
2521
2522 /* In case an error occurs */
2523 *ppStmt = 0;
2524
2525 /* Search for an existing statement. If one is found, shift it to the front
2526 ** of the LRU queue and return immediately. Otherwise, leave nUp pointing
2527 ** to the number of statements currently in the cache and pUp to the
2528 ** last object in the list. */
2529 for(pp=&pIter->pRbuUpdate; *pp; pp=&((*pp)->pNext)){
2530 pUp = *pp;
2531 if( strcmp(pUp->zMask, zMask)==0 ){
2532 *pp = pUp->pNext;
2533 pUp->pNext = pIter->pRbuUpdate;
2534 pIter->pRbuUpdate = pUp;
2535 *ppStmt = pUp->pUpdate;
2536 return SQLITE_OK;
2537 }
2538 nUp++;
2539 }
2540 assert( pUp==0 || pUp->pNext==0 );
2541
2542 if( nUp>=SQLITE_RBU_UPDATE_CACHESIZE ){
2543 for(pp=&pIter->pRbuUpdate; *pp!=pUp; pp=&((*pp)->pNext));
2544 *pp = 0;
2545 sqlite3_finalize(pUp->pUpdate);
2546 pUp->pUpdate = 0;
2547 }else{
2548 pUp = (RbuUpdateStmt*)rbuMalloc(p, sizeof(RbuUpdateStmt)+pIter->nTblCol+1);
2549 }
2550
2551 if( pUp ){
2552 char *zWhere = rbuObjIterGetWhere(p, pIter);
2553 char *zSet = rbuObjIterGetSetlist(p, pIter, zMask);
2554 char *zUpdate = 0;
2555
2556 pUp->zMask = (char*)&pUp[1];
2557 memcpy(pUp->zMask, zMask, pIter->nTblCol);
2558 pUp->pNext = pIter->pRbuUpdate;
2559 pIter->pRbuUpdate = pUp;
2560
2561 if( zSet ){
2562 const char *zPrefix = "";
2563
2564 if( pIter->eType!=RBU_PK_VTAB ) zPrefix = "rbu_imp_";
2565 zUpdate = sqlite3_mprintf("UPDATE \"%s%w\" SET %s WHERE %s",
2566 zPrefix, pIter->zTbl, zSet, zWhere
2567 );
2568 p->rc = prepareFreeAndCollectError(
2569 p->dbMain, &pUp->pUpdate, &p->zErrmsg, zUpdate
2570 );
2571 *ppStmt = pUp->pUpdate;
2572 }
2573 sqlite3_free(zWhere);
2574 sqlite3_free(zSet);
2575 }
2576
2577 return p->rc;
2578 }
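
Note on the lookup loop in rbuGetUpdateStmt(): the pointer-to-pointer walk unlinks a matching entry and relinks it at the head in three assignments, with no special case for the first element. A minimal standalone sketch of that move-to-front step follows. The names Entry and lruLookup are hypothetical and not part of the patch; eviction and allocation are left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cache entry; stands in for RbuUpdateStmt. */
typedef struct Entry Entry;
struct Entry {
  const char *zKey;   /* Lookup key (plays the role of zMask) */
  Entry *pNext;       /* Next entry, most recently used first */
};

/* Search the list at *ppHead for zKey. On a hit, unlink the entry,
** relink it at the head of the list and return it. On a miss, return
** NULL and leave the list unchanged. */
static Entry *lruLookup(Entry **ppHead, const char *zKey){
  Entry **pp;
  for(pp=ppHead; *pp; pp=&(*pp)->pNext){
    Entry *p = *pp;
    if( strcmp(p->zKey, zKey)==0 ){
      *pp = p->pNext;        /* unlink from the current position */
      p->pNext = *ppHead;    /* old head becomes the second entry */
      *ppHead = p;           /* matched entry becomes the head */
      return p;
    }
  }
  return NULL;
}
```

Because pp always points at the link that refers to the current entry, the same code handles a hit on the head, in the middle, or at the tail.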
2579
2580 static sqlite3 *rbuOpenDbhandle(sqlite3rbu *p, const char *zName){
2581 sqlite3 *db = 0;
2582 if( p->rc==SQLITE_OK ){
2583 const int flags = SQLITE_OPEN_READWRITE|SQLITE_OPEN_CREATE|SQLITE_OPEN_URI;
2584 p->rc = sqlite3_open_v2(zName, &db, flags, p->zVfsName);
2585 if( p->rc ){
2586 p->zErrmsg = sqlite3_mprintf("%s", sqlite3_errmsg(db));
2587 sqlite3_close(db);
2588 db = 0;
2589 }
2590 }
2591 return db;
2592 }
2593
2594 /*
2595 ** Open the database handle and attach the RBU database as "rbu". If an
2596 ** error occurs, leave an error code and message in the RBU handle.
2597 */
2598 static void rbuOpenDatabase(sqlite3rbu *p){
2599 assert( p->rc==SQLITE_OK );
2600 assert( p->dbMain==0 && p->dbRbu==0 );
2601
2602 p->eStage = 0;
2603 p->dbMain = rbuOpenDbhandle(p, p->zTarget);
2604 p->dbRbu = rbuOpenDbhandle(p, p->zRbu);
2605
2606 /* If using separate RBU and state databases, attach the state database to
2607 ** the RBU db handle now. */
2608 if( p->zState ){
2609 rbuMPrintfExec(p, p->dbRbu, "ATTACH %Q AS stat", p->zState);
2610 memcpy(p->zStateDb, "stat", 4);
2611 }else{
2612 memcpy(p->zStateDb, "main", 4);
2613 }
2614
2615 if( p->rc==SQLITE_OK ){
2616 p->rc = sqlite3_create_function(p->dbMain,
2617 "rbu_tmp_insert", -1, SQLITE_UTF8, (void*)p, rbuTmpInsertFunc, 0, 0
2618 );
2619 }
2620
2621 if( p->rc==SQLITE_OK ){
2622 p->rc = sqlite3_create_function(p->dbMain,
2623 "rbu_fossil_delta", 2, SQLITE_UTF8, 0, rbuFossilDeltaFunc, 0, 0
2624 );
2625 }
2626
2627 if( p->rc==SQLITE_OK ){
2628 p->rc = sqlite3_create_function(p->dbRbu,
2629 "rbu_target_name", 1, SQLITE_UTF8, (void*)p, rbuTargetNameFunc, 0, 0
2630 );
2631 }
2632
2633 if( p->rc==SQLITE_OK ){
2634 p->rc = sqlite3_file_control(p->dbMain, "main", SQLITE_FCNTL_RBU, (void*)p);
2635 }
2636 rbuMPrintfExec(p, p->dbMain, "SELECT * FROM sqlite_master");
2637
2638 /* Mark the database file just opened as an RBU target database. If
2639 ** this call returns SQLITE_NOTFOUND, then the RBU vfs is not in use.
2640 ** This is an error. */
2641 if( p->rc==SQLITE_OK ){
2642 p->rc = sqlite3_file_control(p->dbMain, "main", SQLITE_FCNTL_RBU, (void*)p);
2643 }
2644
2645 if( p->rc==SQLITE_NOTFOUND ){
2646 p->rc = SQLITE_ERROR;
2647 p->zErrmsg = sqlite3_mprintf("rbu vfs not found");
2648 }
2649 }
2650
2651 /*
2652 ** This routine is a copy of the sqlite3FileSuffix3() routine from the core.
2653 ** It is a no-op unless SQLITE_ENABLE_8_3_NAMES is defined.
2654 **
2655 ** If SQLITE_ENABLE_8_3_NAMES is set at compile-time and if the database
2656 ** filename in zBaseFilename is a URI with the "8_3_names=1" parameter and
2657 ** if filename in z[] has a suffix (a.k.a. "extension") that is longer than
2658 ** three characters, then shorten the suffix on z[] to be the last three
2659 ** characters of the original suffix.
2660 **
2661 ** If SQLITE_ENABLE_8_3_NAMES is set to 2 at compile-time, then always
2662 ** do the suffix shortening regardless of URI parameter.
2663 **
2664 ** Examples:
2665 **
2666 ** test.db-journal => test.nal
2667 ** test.db-wal => test.wal
2668 ** test.db-shm => test.shm
2669 ** test.db-mj7f3319fa => test.9fa
2670 */
2671 static void rbuFileSuffix3(const char *zBase, char *z){
2672 #ifdef SQLITE_ENABLE_8_3_NAMES
2673 #if SQLITE_ENABLE_8_3_NAMES<2
2674 if( sqlite3_uri_boolean(zBase, "8_3_names", 0) )
2675 #endif
2676 {
2677 int i, sz;
2678 sz = sqlite3Strlen30(z);
2679 for(i=sz-1; i>0 && z[i]!='/' && z[i]!='.'; i--){}
2680 if( z[i]=='.' && ALWAYS(sz>i+4) ) memmove(&z[i+1], &z[sz-3], 4);
2681 }
2682 #endif
2683 }
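
Note on rbuFileSuffix3(): the suffix shortening can be checked in isolation against the examples in the header comment. The sketch below repeats the same scan-and-memmove logic using strlen() in place of sqlite3Strlen30() and without the compile-time or URI switches. shortenSuffix3 is a hypothetical name, and the empty-string guard is an addition that the original does not have.

```c
#include <string.h>

/* Shorten the suffix of the filename in z[] to its last three
** characters, if the suffix is longer than three characters. */
static void shortenSuffix3(char *z){
  int sz = (int)strlen(z);
  int i;
  if( sz==0 ) return;   /* guard added for this sketch only */
  /* Scan backwards for the last '.' that follows any '/' */
  for(i=sz-1; i>0 && z[i]!='/' && z[i]!='.'; i--){}
  /* Copy the final three characters plus the terminator */
  if( z[i]=='.' && sz>i+4 ) memmove(&z[i+1], &z[sz-3], 4);
}
```

Suffixes of three characters or fewer, and names with no '.' after the last '/', are left untouched.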
2684
2685 /*
2686 ** Return the current wal-index header checksum for the target database
2687 ** as a 64-bit integer.
2688 **
2689 ** The checksum is stored in the first page of xShmMap memory as an 8-byte
2690 ** blob starting at byte offset 40.
2691 */
2692 static i64 rbuShmChecksum(sqlite3rbu *p){
2693 i64 iRet = 0;
2694 if( p->rc==SQLITE_OK ){
2695 sqlite3_file *pDb = p->pTargetFd->pReal;
2696 u32 volatile *ptr;
2697 p->rc = pDb->pMethods->xShmMap(pDb, 0, 32*1024, 0, (void volatile**)&ptr);
2698 if( p->rc==SQLITE_OK ){
2699 iRet = ((i64)ptr[10] << 32) + ptr[11];
2700 }
2701 }
2702 return iRet;
2703 }
2704
2705 /*
2706 ** This function is called as part of initializing or reinitializing an
2707 ** incremental checkpoint.
2708 **
2709 ** It populates the sqlite3rbu.aFrame[] array with the set of
2710 ** (wal frame -> db page) copy operations required to checkpoint the
2711 ** current wal file, and obtains the set of shm locks required to safely
2712 ** perform the copy operations directly on the file-system.
2713 **
2714 ** If argument pState is not NULL, then the incremental checkpoint is
2715 ** being resumed. In this case, if the checksum of the wal-index-header
2716 ** following recovery is not the same as the checksum saved in the RbuState
2717 ** object, then the rbu handle is set to DONE state. This occurs if some
2718 ** other client appends a transaction to the wal file in the middle of
2719 ** an incremental checkpoint.
2720 */
2721 static void rbuSetupCheckpoint(sqlite3rbu *p, RbuState *pState){
2722
2723 /* If pState is NULL, then the wal file may not have been opened and
2724 ** recovered. Run a read statement here so that opening and recovering
2725 ** the wal file does not interfere with the "capture" process below. */
2726 if( pState==0 ){
2727 p->eStage = 0;
2728 if( p->rc==SQLITE_OK ){
2729 p->rc = sqlite3_exec(p->dbMain, "SELECT * FROM sqlite_master", 0, 0, 0);
2730 }
2731 }
2732
2733 /* Assuming no error has occurred, run a "restart" checkpoint with the
2734 ** sqlite3rbu.eStage variable set to CAPTURE. This turns on the following
2735 ** special behaviour in the rbu VFS:
2736 **
2737 ** * If the exclusive shm WRITER or READ0 lock cannot be obtained,
2738 ** the checkpoint fails with SQLITE_BUSY (normally SQLite would
2739 ** proceed with running a passive checkpoint instead of failing).
2740 **
2741 ** * Attempts to read from the *-wal file or write to the database file
2742 ** do not perform any IO. Instead, the frame/page combinations that
2743 ** would be read/written are recorded in the sqlite3rbu.aFrame[]
2744 ** array.
2745 **
2746 ** * Calls to xShmLock(UNLOCK) to release the exclusive shm WRITER,
2747 ** READ0 and CHECKPOINT locks taken as part of the checkpoint are
2748 ** no-ops. These locks will not be released until the connection
2749 ** is closed.
2750 **
2751 ** * Attempting to xSync() the database file causes an SQLITE_INTERNAL
2752 ** error.
2753 **
2754 ** As a result, unless an error (i.e. OOM or SQLITE_BUSY) occurs, the
2755 ** checkpoint below fails with SQLITE_INTERNAL, and leaves the aFrame[]
2756 ** array populated with a set of (frame -> page) mappings. Because the
2757 ** WRITER, CHECKPOINT and READ0 locks are still held, it is safe to copy
2758 ** data from the wal file into the database file according to the
2759 ** contents of aFrame[].
2760 */
2761 if( p->rc==SQLITE_OK ){
2762 int rc2;
2763 p->eStage = RBU_STAGE_CAPTURE;
2764 rc2 = sqlite3_exec(p->dbMain, "PRAGMA main.wal_checkpoint=restart", 0, 0,0);
2765 if( rc2!=SQLITE_INTERNAL ) p->rc = rc2;
2766 }
2767
2768 if( p->rc==SQLITE_OK ){
2769 p->eStage = RBU_STAGE_CKPT;
2770 p->nStep = (pState ? pState->nRow : 0);
2771 p->aBuf = rbuMalloc(p, p->pgsz);
2772 p->iWalCksum = rbuShmChecksum(p);
2773 }
2774
2775 if( p->rc==SQLITE_OK && pState && pState->iWalCksum!=p->iWalCksum ){
2776 p->rc = SQLITE_DONE;
2777 p->eStage = RBU_STAGE_DONE;
2778 }
2779 }
2780
2781 /*
2782 ** Called when iAmt bytes are read from offset iOff of the wal file while
2783 ** the rbu object is in capture mode. Record the frame number of the frame
2784 ** being read in the aFrame[] array.
2785 */
2786 static int rbuCaptureWalRead(sqlite3rbu *pRbu, i64 iOff, int iAmt){
2787 const u32 mReq = (1<<WAL_LOCK_WRITE)|(1<<WAL_LOCK_CKPT)|(1<<WAL_LOCK_READ0);
2788 u32 iFrame;
2789
2790 if( pRbu->mLock!=mReq ){
2791 pRbu->rc = SQLITE_BUSY;
2792 return SQLITE_INTERNAL;
2793 }
2794
2795 pRbu->pgsz = iAmt;
2796 if( pRbu->nFrame==pRbu->nFrameAlloc ){
2797 int nNew = (pRbu->nFrameAlloc ? pRbu->nFrameAlloc : 64) * 2;
2798 RbuFrame *aNew;
2799 aNew = (RbuFrame*)sqlite3_realloc(pRbu->aFrame, nNew * sizeof(RbuFrame));
2800 if( aNew==0 ) return SQLITE_NOMEM;
2801 pRbu->aFrame = aNew;
2802 pRbu->nFrameAlloc = nNew;
2803 }
2804
2805 iFrame = (u32)((iOff-32) / (i64)(iAmt+24)) + 1;
2806 if( pRbu->iMaxFrame<iFrame ) pRbu->iMaxFrame = iFrame;
2807 pRbu->aFrame[pRbu->nFrame].iWalFrame = iFrame;
2808 pRbu->aFrame[pRbu->nFrame].iDbPage = 0;
2809 pRbu->nFrame++;
2810 return SQLITE_OK;
2811 }
2812
2813 /*
2814 ** Called when a page of data is written to offset iOff of the database
2815 ** file while the rbu handle is in capture mode. Record the page number
2816 ** of the page being written in the aFrame[] array.
2817 */
2818 static int rbuCaptureDbWrite(sqlite3rbu *pRbu, i64 iOff){
2819 pRbu->aFrame[pRbu->nFrame-1].iDbPage = (u32)(iOff / pRbu->pgsz) + 1;
2820 return SQLITE_OK;
2821 }
2822
2823 /*
2824 ** This is called as part of an incremental checkpoint operation. Copy
2825 ** a single frame of data from the wal file into the database file, as
2826 ** indicated by the RbuFrame object.
2827 */
2828 static void rbuCheckpointFrame(sqlite3rbu *p, RbuFrame *pFrame){
2829 sqlite3_file *pWal = p->pTargetFd->pWalFd->pReal;
2830 sqlite3_file *pDb = p->pTargetFd->pReal;
2831 i64 iOff;
2832
2833 assert( p->rc==SQLITE_OK );
2834 iOff = (i64)(pFrame->iWalFrame-1) * (p->pgsz + 24) + 32 + 24;
2835 p->rc = pWal->pMethods->xRead(pWal, p->aBuf, p->pgsz, iOff);
2836 if( p->rc ) return;
2837
2838 iOff = (i64)(pFrame->iDbPage-1) * p->pgsz;
2839 p->rc = pDb->pMethods->xWrite(pDb, p->aBuf, p->pgsz, iOff);
2840 }
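
Note on the offset arithmetic in rbuCaptureWalRead() and rbuCheckpointFrame(): the constants 32 and 24 are the sizes of the wal file header and of each frame header. The sketch below names them and shows that the two computations are inverses of each other. The helper names and the enum are hypothetical; only the arithmetic comes from the code above.

```c
#include <stdint.h>

enum { WAL_HDR = 32, FRAME_HDR = 24 };  /* header sizes in bytes */

/* Offset within the wal file of the page data of frame iFrame (1-based). */
static int64_t framePageOffset(uint32_t iFrame, int pgsz){
  return (int64_t)(iFrame-1) * (pgsz + FRAME_HDR) + WAL_HDR + FRAME_HDR;
}

/* Frame number (1-based) whose page data starts at offset iOff. */
static uint32_t frameFromOffset(int64_t iOff, int pgsz){
  return (uint32_t)((iOff - WAL_HDR) / (int64_t)(pgsz + FRAME_HDR)) + 1;
}

/* Offset within the database file of page iPage (1-based). */
static int64_t dbPageOffset(uint32_t iPage, int pgsz){
  return (int64_t)(iPage-1) * pgsz;
}
```

With a 4096-byte page, frame 1's page data starts at offset 56 and each later frame starts 4120 bytes after the previous one. The integer division in frameFromOffset() discards the 24-byte frame header, which is why the capture code can recover the frame number from the read offset alone.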
2841
2842
2843 /*
2844 ** Take an EXCLUSIVE lock on the database file.
2845 */
2846 static void rbuLockDatabase(sqlite3rbu *p){
2847 sqlite3_file *pReal = p->pTargetFd->pReal;
2848 assert( p->rc==SQLITE_OK );
2849 p->rc = pReal->pMethods->xLock(pReal, SQLITE_LOCK_SHARED);
2850 if( p->rc==SQLITE_OK ){
2851 p->rc = pReal->pMethods->xLock(pReal, SQLITE_LOCK_EXCLUSIVE);
2852 }
2853 }
2854
2855 #if defined(_WIN32_WCE)
2856 static LPWSTR rbuWinUtf8ToUnicode(const char *zFilename){
2857 int nChar;
2858 LPWSTR zWideFilename;
2859
2860 nChar = MultiByteToWideChar(CP_UTF8, 0, zFilename, -1, NULL, 0);
2861 if( nChar==0 ){
2862 return 0;
2863 }
2864 zWideFilename = sqlite3_malloc( nChar*sizeof(zWideFilename[0]) );
2865 if( zWideFilename==0 ){
2866 return 0;
2867 }
2868 memset(zWideFilename, 0, nChar*sizeof(zWideFilename[0]));
2869 nChar = MultiByteToWideChar(CP_UTF8, 0, zFilename, -1, zWideFilename,
2870 nChar);
2871 if( nChar==0 ){
2872 sqlite3_free(zWideFilename);
2873 zWideFilename = 0;
2874 }
2875 return zWideFilename;
2876 }
2877 #endif
2878
2879 /*
2880 ** The RBU handle is currently in RBU_STAGE_OAL state, with a SHARED lock
2881 ** on the database file. This proc moves the *-oal file to the *-wal path,
2882 ** then reopens the database file (this time in vanilla, non-oal, WAL mode).
2883 ** If an error occurs, leave an error code and error message in the rbu
2884 ** handle.
2885 */
2886 static void rbuMoveOalFile(sqlite3rbu *p){
2887 const char *zBase = sqlite3_db_filename(p->dbMain, "main");
2888
2889 char *zWal = sqlite3_mprintf("%s-wal", zBase);
2890 char *zOal = sqlite3_mprintf("%s-oal", zBase);
2891
2892 assert( p->eStage==RBU_STAGE_MOVE );
2893 assert( p->rc==SQLITE_OK && p->zErrmsg==0 );
2894 if( zWal==0 || zOal==0 ){
2895 p->rc = SQLITE_NOMEM;
2896 }else{
2897 /* Move the *-oal file to *-wal. At this point connection p->dbMain is
2898 ** holding a SHARED lock on the target database file (because it is
2899 ** in WAL mode). So no other connection may be writing the db.
2900 **
2901 ** In order to ensure that there are no database readers, an EXCLUSIVE
2902 ** lock is obtained here before the *-oal is moved to *-wal.
2903 */
2904 rbuLockDatabase(p);
2905 if( p->rc==SQLITE_OK ){
2906 rbuFileSuffix3(zBase, zWal);
2907 rbuFileSuffix3(zBase, zOal);
2908
2909 /* Re-open the databases. */
2910 rbuObjIterFinalize(&p->objiter);
2911 sqlite3_close(p->dbMain);
2912 sqlite3_close(p->dbRbu);
2913 p->dbMain = 0;
2914 p->dbRbu = 0;
2915
2916 #if defined(_WIN32_WCE)
2917 {
2918 LPWSTR zWideOal;
2919 LPWSTR zWideWal;
2920
2921 zWideOal = rbuWinUtf8ToUnicode(zOal);
2922 if( zWideOal ){
2923 zWideWal = rbuWinUtf8ToUnicode(zWal);
2924 if( zWideWal ){
2925 if( MoveFileW(zWideOal, zWideWal) ){
2926 p->rc = SQLITE_OK;
2927 }else{
2928 p->rc = SQLITE_IOERR;
2929 }
2930 sqlite3_free(zWideWal);
2931 }else{
2932 p->rc = SQLITE_IOERR_NOMEM;
2933 }
2934 sqlite3_free(zWideOal);
2935 }else{
2936 p->rc = SQLITE_IOERR_NOMEM;
2937 }
2938 }
2939 #else
2940 p->rc = rename(zOal, zWal) ? SQLITE_IOERR : SQLITE_OK;
2941 #endif
2942
2943 if( p->rc==SQLITE_OK ){
2944 rbuOpenDatabase(p);
2945 rbuSetupCheckpoint(p, 0);
2946 }
2947 }
2948 }
2949
2950 sqlite3_free(zWal);
2951 sqlite3_free(zOal);
2952 }
2953
2954 /*
2955 ** The SELECT statement iterating through the keys for the current object
2956 ** (p->objiter.pSelect) currently points to a valid row. This function
2957 ** determines the type of operation requested by this row and returns
2958 ** one of the following values to indicate the result:
2959 **
2960 ** * RBU_INSERT
2961 ** * RBU_DELETE
2962 ** * RBU_IDX_DELETE (or RBU_IDX_INSERT)
2963 ** * RBU_UPDATE
2964 **
2965 ** If RBU_UPDATE is returned, then output variable *pzMask is set to
2966 ** point to the text value indicating the columns to update.
2967 **
2968 ** If the rbu_control field contains an invalid value, an error code and
2969 ** message are left in the RBU handle and zero returned.
2970 */
2971 static int rbuStepType(sqlite3rbu *p, const char **pzMask){
2972 int iCol = p->objiter.nCol; /* Index of rbu_control column */
2973 int res = 0; /* Return value */
2974
2975 switch( sqlite3_column_type(p->objiter.pSelect, iCol) ){
2976 case SQLITE_INTEGER: {
2977 int iVal = sqlite3_column_int(p->objiter.pSelect, iCol);
2978 if( iVal==0 ){
2979 res = RBU_INSERT;
2980 }else if( iVal==1 ){
2981 res = RBU_DELETE;
2982 }else if( iVal==2 ){
2983 res = RBU_IDX_DELETE;
2984 }else if( iVal==3 ){
2985 res = RBU_IDX_INSERT;
2986 }
2987 break;
2988 }
2989
2990 case SQLITE_TEXT: {
2991 const unsigned char *z = sqlite3_column_text(p->objiter.pSelect, iCol);
2992 if( z==0 ){
2993 p->rc = SQLITE_NOMEM;
2994 }else{
2995 *pzMask = (const char*)z;
2996 }
2997 res = RBU_UPDATE;
2998
2999 break;
3000 }
3001
3002 default:
3003 break;
3004 }
3005
3006 if( res==0 ){
3007 rbuBadControlError(p);
3008 }
3009 return res;
3010 }
3011
3012 #ifdef SQLITE_DEBUG
3013 /*
3014 ** Assert that column iCol of statement pStmt is named zName.
3015 */
3016 static void assertColumnName(sqlite3_stmt *pStmt, int iCol, const char *zName){
3017 const char *zCol = sqlite3_column_name(pStmt, iCol);
3018 assert( 0==sqlite3_stricmp(zName, zCol) );
3019 }
3020 #else
3021 # define assertColumnName(x,y,z)
3022 #endif
3023
3024 /*
3025 ** This function does the work for an sqlite3rbu_step() call.
3026 **
3027 ** The object-iterator (p->objiter) currently points to a valid object,
3028 ** and the input cursor (p->objiter.pSelect) currently points to a valid
3029 ** input row. Perform whatever processing is required and return.
3030 **
3031 ** If no error occurs, SQLITE_OK is returned. Otherwise, an error code
3032 ** and message are left in the RBU handle and a copy of the error code is
3033 ** returned.
3034 */
3035 static int rbuStep(sqlite3rbu *p){
3036 RbuObjIter *pIter = &p->objiter;
3037 const char *zMask = 0;
3038 int i;
3039 int eType = rbuStepType(p, &zMask);
3040
3041 if( eType ){
3042 assert( eType!=RBU_UPDATE || pIter->zIdx==0 );
3043
3044 if( pIter->zIdx==0 && eType==RBU_IDX_DELETE ){
3045 rbuBadControlError(p);
3046 }
3047 else if(
3048 eType==RBU_INSERT
3049 || eType==RBU_DELETE
3050 || eType==RBU_IDX_DELETE
3051 || eType==RBU_IDX_INSERT
3052 ){
3053 sqlite3_value *pVal;
3054 sqlite3_stmt *pWriter;
3055
3056 assert( eType!=RBU_UPDATE );
3057 assert( eType!=RBU_DELETE || pIter->zIdx==0 );
3058
3059 if( eType==RBU_IDX_DELETE || eType==RBU_DELETE ){
3060 pWriter = pIter->pDelete;
3061 }else{
3062 pWriter = pIter->pInsert;
3063 }
3064
3065 for(i=0; i<pIter->nCol; i++){
3066 /* If this is an INSERT into a table b-tree and the table has an
3067 ** explicit INTEGER PRIMARY KEY, check that this is not an attempt
3068 ** to write a NULL into the IPK column. That is not permitted. */
3069 if( eType==RBU_INSERT
3070 && pIter->zIdx==0 && pIter->eType==RBU_PK_IPK && pIter->abTblPk[i]
3071 && sqlite3_column_type(pIter->pSelect, i)==SQLITE_NULL
3072 ){
3073 p->rc = SQLITE_MISMATCH;
3074 p->zErrmsg = sqlite3_mprintf("datatype mismatch");
3075 goto step_out;
3076 }
3077
3078 if( eType==RBU_DELETE && pIter->abTblPk[i]==0 ){
3079 continue;
3080 }
3081
3082 pVal = sqlite3_column_value(pIter->pSelect, i);
3083 p->rc = sqlite3_bind_value(pWriter, i+1, pVal);
3084 if( p->rc ) goto step_out;
3085 }
3086 if( pIter->zIdx==0
3087 && (pIter->eType==RBU_PK_VTAB || pIter->eType==RBU_PK_NONE)
3088 ){
3089 /* For a virtual table, or a table with no primary key, the
3090 ** SELECT statement is:
3091 **
3092 ** SELECT <cols>, rbu_control, rbu_rowid FROM ....
3093 **
3094 ** Hence column_value(pIter->nCol+1).
3095 */
3096 assertColumnName(pIter->pSelect, pIter->nCol+1, "rbu_rowid");
3097 pVal = sqlite3_column_value(pIter->pSelect, pIter->nCol+1);
3098 p->rc = sqlite3_bind_value(pWriter, pIter->nCol+1, pVal);
3099 }
3100 if( p->rc==SQLITE_OK ){
3101 sqlite3_step(pWriter);
3102 p->rc = resetAndCollectError(pWriter, &p->zErrmsg);
3103 }
3104 }else{
3105 sqlite3_value *pVal;
3106 sqlite3_stmt *pUpdate = 0;
3107 assert( eType==RBU_UPDATE );
3108 rbuGetUpdateStmt(p, pIter, zMask, &pUpdate);
3109 if( pUpdate ){
3110 for(i=0; p->rc==SQLITE_OK && i<pIter->nCol; i++){
3111 char c = zMask[pIter->aiSrcOrder[i]];
3112 pVal = sqlite3_column_value(pIter->pSelect, i);
3113 if( pIter->abTblPk[i] || c!='.' ){
3114 p->rc = sqlite3_bind_value(pUpdate, i+1, pVal);
3115 }
3116 }
3117 if( p->rc==SQLITE_OK
3118 && (pIter->eType==RBU_PK_VTAB || pIter->eType==RBU_PK_NONE)
3119 ){
3120 /* Bind the rbu_rowid value to column _rowid_ */
3121 assertColumnName(pIter->pSelect, pIter->nCol+1, "rbu_rowid");
3122 pVal = sqlite3_column_value(pIter->pSelect, pIter->nCol+1);
3123 p->rc = sqlite3_bind_value(pUpdate, pIter->nCol+1, pVal);
3124 }
3125 if( p->rc==SQLITE_OK ){
3126 sqlite3_step(pUpdate);
3127 p->rc = resetAndCollectError(pUpdate, &p->zErrmsg);
3128 }
3129 }
3130 }
3131 }
3132
3133 step_out:
3134 return p->rc;
3135 }
3136
3137 /*
3138 ** Increment the schema cookie of the main database opened by p->dbMain.
3139 */
3140 static void rbuIncrSchemaCookie(sqlite3rbu *p){
3141 if( p->rc==SQLITE_OK ){
3142 int iCookie = 1000000;
3143 sqlite3_stmt *pStmt;
3144
3145 p->rc = prepareAndCollectError(p->dbMain, &pStmt, &p->zErrmsg,
3146 "PRAGMA schema_version"
3147 );
3148 if( p->rc==SQLITE_OK ){
3149 /* Coverage: it may be that this sqlite3_step() cannot fail. There
3150 ** is already a transaction open, so the prepared statement cannot
3151 ** throw an SQLITE_SCHEMA exception. The only database page the
3152 ** statement reads is page 1, which is guaranteed to be in the cache.
3153 ** And no memory allocations are required. */
3154 if( SQLITE_ROW==sqlite3_step(pStmt) ){
3155 iCookie = sqlite3_column_int(pStmt, 0);
3156 }
3157 rbuFinalize(p, pStmt);
3158 }
3159 if( p->rc==SQLITE_OK ){
3160 rbuMPrintfExec(p, p->dbMain, "PRAGMA schema_version = %d", iCookie+1);
3161 }
3162 }
3163 }
3164
3165 /*
3166 ** Update the contents of the rbu_state table within the rbu database. The
3167 ** value stored in the RBU_STATE_STAGE column is eStage. All other values
3168 ** are determined by inspecting the rbu handle passed as the first argument.
3169 */
3170 static void rbuSaveState(sqlite3rbu *p, int eStage){
3171 if( p->rc==SQLITE_OK || p->rc==SQLITE_DONE ){
3172 sqlite3_stmt *pInsert = 0;
3173 int rc;
3174
3175 assert( p->zErrmsg==0 );
3176 rc = prepareFreeAndCollectError(p->dbRbu, &pInsert, &p->zErrmsg,
3177 sqlite3_mprintf(
3178 "INSERT OR REPLACE INTO %s.rbu_state(k, v) VALUES "
3179 "(%d, %d), "
3180 "(%d, %Q), "
3181 "(%d, %Q), "
3182 "(%d, %d), "
3183 "(%d, %d), "
3184 "(%d, %lld), "
3185 "(%d, %lld), "
3186 "(%d, %lld) ",
3187 p->zStateDb,
3188 RBU_STATE_STAGE, eStage,
3189 RBU_STATE_TBL, p->objiter.zTbl,
3190 RBU_STATE_IDX, p->objiter.zIdx,
3191 RBU_STATE_ROW, p->nStep,
3192 RBU_STATE_PROGRESS, p->nProgress,
3193 RBU_STATE_CKPT, p->iWalCksum,
3194 RBU_STATE_COOKIE, (i64)p->pTargetFd->iCookie,
3195 RBU_STATE_OALSZ, p->iOalSz
3196 )
3197 );
3198 assert( pInsert==0 || rc==SQLITE_OK );
3199
3200 if( rc==SQLITE_OK ){
3201 sqlite3_step(pInsert);
3202 rc = sqlite3_finalize(pInsert);
3203 }
3204 if( rc!=SQLITE_OK ) p->rc = rc;
3205 }
3206 }
3207
3208
3209 /*
3210 ** Step the RBU object.
3211 */
3212 SQLITE_API int SQLITE_STDCALL sqlite3rbu_step(sqlite3rbu *p){
3213 if( p ){
3214 switch( p->eStage ){
3215 case RBU_STAGE_OAL: {
3216 RbuObjIter *pIter = &p->objiter;
3217 while( p->rc==SQLITE_OK && pIter->zTbl ){
3218
3219 if( pIter->bCleanup ){
3220 /* Clean up the rbu_tmp_xxx table for the previous table. It
3221 ** cannot be dropped as there are currently active SQL statements.
3222 ** But the contents can be deleted. */
3223 if( pIter->abIndexed ){
3224 rbuMPrintfExec(p, p->dbRbu,
3225 "DELETE FROM %s.'rbu_tmp_%q'", p->zStateDb, pIter->zDataTbl
3226 );
3227 }
3228 }else{
3229 rbuObjIterPrepareAll(p, pIter, 0);
3230
3231 /* Advance to the next row to process. */
3232 if( p->rc==SQLITE_OK ){
3233 int rc = sqlite3_step(pIter->pSelect);
3234 if( rc==SQLITE_ROW ){
3235 p->nProgress++;
3236 p->nStep++;
3237 return rbuStep(p);
3238 }
3239 p->rc = sqlite3_reset(pIter->pSelect);
3240 p->nStep = 0;
3241 }
3242 }
3243
3244 rbuObjIterNext(p, pIter);
3245 }
3246
3247 if( p->rc==SQLITE_OK ){
3248 assert( pIter->zTbl==0 );
3249 rbuSaveState(p, RBU_STAGE_MOVE);
3250 rbuIncrSchemaCookie(p);
3251 if( p->rc==SQLITE_OK ){
3252 p->rc = sqlite3_exec(p->dbMain, "COMMIT", 0, 0, &p->zErrmsg);
3253 }
3254 if( p->rc==SQLITE_OK ){
3255 p->rc = sqlite3_exec(p->dbRbu, "COMMIT", 0, 0, &p->zErrmsg);
3256 }
3257 p->eStage = RBU_STAGE_MOVE;
3258 }
3259 break;
3260 }
3261
3262 case RBU_STAGE_MOVE: {
3263 if( p->rc==SQLITE_OK ){
3264 rbuMoveOalFile(p);
3265 p->nProgress++;
3266 }
3267 break;
3268 }
3269
3270 case RBU_STAGE_CKPT: {
3271 if( p->rc==SQLITE_OK ){
3272 if( p->nStep>=p->nFrame ){
3273 sqlite3_file *pDb = p->pTargetFd->pReal;
3274
3275 /* Sync the db file */
3276 p->rc = pDb->pMethods->xSync(pDb, SQLITE_SYNC_NORMAL);
3277
3278 /* Update nBackfill */
3279 if( p->rc==SQLITE_OK ){
3280 void volatile *ptr;
3281 p->rc = pDb->pMethods->xShmMap(pDb, 0, 32*1024, 0, &ptr);
3282 if( p->rc==SQLITE_OK ){
3283 ((u32 volatile*)ptr)[24] = p->iMaxFrame;
3284 }
3285 }
3286
3287 if( p->rc==SQLITE_OK ){
3288 p->eStage = RBU_STAGE_DONE;
3289 p->rc = SQLITE_DONE;
3290 }
3291 }else{
3292 RbuFrame *pFrame = &p->aFrame[p->nStep];
3293 rbuCheckpointFrame(p, pFrame);
3294 p->nStep++;
3295 }
3296 p->nProgress++;
3297 }
3298 break;
3299 }
3300
3301 default:
3302 break;
3303 }
3304 return p->rc;
3305 }else{
3306 return SQLITE_NOMEM;
3307 }
3308 }
3309
3310 /*
3311 ** Free an RbuState object allocated by rbuLoadState().
3312 */
3313 static void rbuFreeState(RbuState *p){
3314 if( p ){
3315 sqlite3_free(p->zTbl);
3316 sqlite3_free(p->zIdx);
3317 sqlite3_free(p);
3318 }
3319 }
3320
3321 /*
3322 ** Allocate an RbuState object and load the contents of the rbu_state
3323 ** table into it. Return a pointer to the new object. It is the
3324 ** responsibility of the caller to eventually free the object using
3325 ** rbuFreeState().
3326 **
3327 ** If an error occurs, leave an error code and message in the rbu handle
3328 ** and return NULL.
3329 */
3330 static RbuState *rbuLoadState(sqlite3rbu *p){
3331 RbuState *pRet = 0;
3332 sqlite3_stmt *pStmt = 0;
3333 int rc;
3334 int rc2;
3335
3336 pRet = (RbuState*)rbuMalloc(p, sizeof(RbuState));
3337 if( pRet==0 ) return 0;
3338
3339 rc = prepareFreeAndCollectError(p->dbRbu, &pStmt, &p->zErrmsg,
3340 sqlite3_mprintf("SELECT k, v FROM %s.rbu_state", p->zStateDb)
3341 );
3342 while( rc==SQLITE_OK && SQLITE_ROW==sqlite3_step(pStmt) ){
3343 switch( sqlite3_column_int(pStmt, 0) ){
3344 case RBU_STATE_STAGE:
3345 pRet->eStage = sqlite3_column_int(pStmt, 1);
3346 if( pRet->eStage!=RBU_STAGE_OAL
3347 && pRet->eStage!=RBU_STAGE_MOVE
3348 && pRet->eStage!=RBU_STAGE_CKPT
3349 ){
3350 p->rc = SQLITE_CORRUPT;
3351 }
3352 break;
3353
3354 case RBU_STATE_TBL:
3355 pRet->zTbl = rbuStrndup((char*)sqlite3_column_text(pStmt, 1), &rc);
3356 break;
3357
3358 case RBU_STATE_IDX:
3359 pRet->zIdx = rbuStrndup((char*)sqlite3_column_text(pStmt, 1), &rc);
3360 break;
3361
3362 case RBU_STATE_ROW:
3363 pRet->nRow = sqlite3_column_int(pStmt, 1);
3364 break;
3365
3366 case RBU_STATE_PROGRESS:
3367 pRet->nProgress = sqlite3_column_int64(pStmt, 1);
3368 break;
3369
3370 case RBU_STATE_CKPT:
3371 pRet->iWalCksum = sqlite3_column_int64(pStmt, 1);
3372 break;
3373
3374 case RBU_STATE_COOKIE:
3375 pRet->iCookie = (u32)sqlite3_column_int64(pStmt, 1);
3376 break;
3377
3378 case RBU_STATE_OALSZ:
3379 pRet->iOalSz = (u32)sqlite3_column_int64(pStmt, 1);
3380 break;
3381
3382 default:
3383 rc = SQLITE_CORRUPT;
3384 break;
3385 }
3386 }
3387 rc2 = sqlite3_finalize(pStmt);
3388 if( rc==SQLITE_OK ) rc = rc2;
3389
3390 p->rc = rc;
3391 return pRet;
3392 }
3393
3394 /*
3395 ** Compare strings z1 and z2, returning 0 if they are identical, or non-zero
3396 ** otherwise. Either or both arguments may be NULL. Two NULL values are
3397 ** considered equal, and NULL is considered distinct from all other values.
3398 */
3399 static int rbuStrCompare(const char *z1, const char *z2){
3400 if( z1==0 && z2==0 ) return 0;
3401 if( z1==0 || z2==0 ) return 1;
3402 return (sqlite3_stricmp(z1, z2)!=0);
3403 }
3404
3405 /*
3406 ** This function is called as part of sqlite3rbu_open() when initializing
3407 ** an rbu handle in OAL stage. If the rbu update has not started (i.e.
3408 ** the rbu_state table was empty) it is a no-op. Otherwise, it arranges
3409 ** things so that the next call to sqlite3rbu_step() continues on from
3410 ** where the previous rbu handle left off.
3411 **
3412 ** If an error occurs, an error code and error message are left in the
3413 ** rbu handle passed as the first argument.
3414 */
3415 static void rbuSetupOal(sqlite3rbu *p, RbuState *pState){
3416 assert( p->rc==SQLITE_OK );
3417 if( pState->zTbl ){
3418 RbuObjIter *pIter = &p->objiter;
3419 int rc = SQLITE_OK;
3420
3421 while( rc==SQLITE_OK && pIter->zTbl && (pIter->bCleanup
3422 || rbuStrCompare(pIter->zIdx, pState->zIdx)
3423 || rbuStrCompare(pIter->zTbl, pState->zTbl)
3424 )){
3425 rc = rbuObjIterNext(p, pIter);
3426 }
3427
3428 if( rc==SQLITE_OK && !pIter->zTbl ){
3429 rc = SQLITE_ERROR;
3430 p->zErrmsg = sqlite3_mprintf("rbu_state mismatch error");
3431 }
3432
3433 if( rc==SQLITE_OK ){
3434 p->nStep = pState->nRow;
3435 rc = rbuObjIterPrepareAll(p, &p->objiter, p->nStep);
3436 }
3437
3438 p->rc = rc;
3439 }
3440 }
3441
3442 /*
3443 ** If there is a "*-oal" file in the file-system corresponding to the
3444 ** target database, delete it. If an error occurs, leave an error code
3445 ** and error message in the rbu handle.
3446 */
3447 static void rbuDeleteOalFile(sqlite3rbu *p){
3448 char *zOal = rbuMPrintf(p, "%s-oal", p->zTarget);
3449 if( zOal ){
3450 sqlite3_vfs *pVfs = sqlite3_vfs_find(0);
3451 assert( pVfs && p->rc==SQLITE_OK && p->zErrmsg==0 );
3452 pVfs->xDelete(pVfs, zOal, 0);
3453 sqlite3_free(zOal);
3454 }
3455 }
3456
3457 /*
3458 ** Allocate a private rbu VFS for the rbu handle passed as the only
3459 ** argument. This VFS will be used unless the call to sqlite3rbu_open()
3460 ** specified a URI with a vfs=? option in place of a target database
3461 ** file name.
3462 */
3463 static void rbuCreateVfs(sqlite3rbu *p){
3464 int rnd;
3465 char zRnd[64];
3466
3467 assert( p->rc==SQLITE_OK );
3468 sqlite3_randomness(sizeof(int), (void*)&rnd);
3469 sqlite3_snprintf(sizeof(zRnd), zRnd, "rbu_vfs_%d", rnd);
3470 p->rc = sqlite3rbu_create_vfs(zRnd, 0);
3471 if( p->rc==SQLITE_OK ){
3472 sqlite3_vfs *pVfs = sqlite3_vfs_find(zRnd);
3473 assert( pVfs );
3474 p->zVfsName = pVfs->zName;
3475 }
3476 }
3477
3478 /*
3479 ** Destroy the private VFS created for the rbu handle passed as the only
3480 ** argument by an earlier call to rbuCreateVfs().
3481 */
3482 static void rbuDeleteVfs(sqlite3rbu *p){
3483 if( p->zVfsName ){
3484 sqlite3rbu_destroy_vfs(p->zVfsName);
3485 p->zVfsName = 0;
3486 }
3487 }
3488
3489 /*
3490 ** Open and return a new RBU handle.
3491 */
3492 SQLITE_API sqlite3rbu *SQLITE_STDCALL sqlite3rbu_open(
3493 const char *zTarget,
3494 const char *zRbu,
3495 const char *zState
3496 ){
3497 sqlite3rbu *p;
3498 int nTarget = strlen(zTarget);
3499 int nRbu = strlen(zRbu);
3500 int nState = zState ? strlen(zState) : 0;
3501
3502 p = (sqlite3rbu*)sqlite3_malloc(sizeof(sqlite3rbu)+nTarget+1+nRbu+1+nState+1);
3503 if( p ){
3504 RbuState *pState = 0;
3505
3506 /* Create the custom VFS. */
3507 memset(p, 0, sizeof(sqlite3rbu));
3508 rbuCreateVfs(p);
3509
3510 /* Open the target database */
3511 if( p->rc==SQLITE_OK ){
3512 p->zTarget = (char*)&p[1];
3513 memcpy(p->zTarget, zTarget, nTarget+1);
3514 p->zRbu = &p->zTarget[nTarget+1];
3515 memcpy(p->zRbu, zRbu, nRbu+1);
3516 if( zState ){
3517 p->zState = &p->zRbu[nRbu+1];
3518 memcpy(p->zState, zState, nState+1);
3519 }
3520 rbuOpenDatabase(p);
3521 }
3522
3523 /* If it has not already been created, create the rbu_state table */
3524 rbuMPrintfExec(p, p->dbRbu, RBU_CREATE_STATE, p->zStateDb);
3525
3526 if( p->rc==SQLITE_OK ){
3527 pState = rbuLoadState(p);
3528 assert( pState || p->rc!=SQLITE_OK );
3529 if( p->rc==SQLITE_OK ){
3530
3531 if( pState->eStage==0 ){
3532 rbuDeleteOalFile(p);
3533 p->eStage = RBU_STAGE_OAL;
3534 }else{
3535 p->eStage = pState->eStage;
3536 }
3537 p->nProgress = pState->nProgress;
3538 p->iOalSz = pState->iOalSz;
3539 }
3540 }
3541 assert( p->rc!=SQLITE_OK || p->eStage!=0 );
3542
3543 if( p->rc==SQLITE_OK && p->pTargetFd->pWalFd ){
3544 if( p->eStage==RBU_STAGE_OAL ){
3545 p->rc = SQLITE_ERROR;
3546 p->zErrmsg = sqlite3_mprintf("cannot update wal mode database");
3547 }else if( p->eStage==RBU_STAGE_MOVE ){
3548 p->eStage = RBU_STAGE_CKPT;
3549 p->nStep = 0;
3550 }
3551 }
3552
3553 if( p->rc==SQLITE_OK
3554 && (p->eStage==RBU_STAGE_OAL || p->eStage==RBU_STAGE_MOVE)
3555 && pState->eStage!=0 && p->pTargetFd->iCookie!=pState->iCookie
3556 ){
3557 /* At this point (pTargetFd->iCookie) contains the value of the
3558 ** change-counter cookie (the thing that gets incremented when a
3559 ** transaction is committed in rollback mode) currently stored on
3560 ** page 1 of the database file. */
3561 p->rc = SQLITE_BUSY;
3562 p->zErrmsg = sqlite3_mprintf("database modified during rbu update");
3563 }
3564
3565 if( p->rc==SQLITE_OK ){
3566 if( p->eStage==RBU_STAGE_OAL ){
3567 sqlite3 *db = p->dbMain;
3568
3569 /* Open transactions on both databases. The *-oal file is opened or
3570 ** created at this point. */
3571 p->rc = sqlite3_exec(db, "BEGIN IMMEDIATE", 0, 0, &p->zErrmsg);
3572 if( p->rc==SQLITE_OK ){
3573 p->rc = sqlite3_exec(p->dbRbu, "BEGIN IMMEDIATE", 0, 0, &p->zErrmsg);
3574 }
3575
3576 /* Check if the main database is a zipvfs db. If it is, set the upper
3577 ** level pager to use "journal_mode=off". This prevents it from
3578 ** generating a large journal using a temp file. */
3579 if( p->rc==SQLITE_OK ){
3580 int frc = sqlite3_file_control(db, "main", SQLITE_FCNTL_ZIPVFS, 0);
3581 if( frc==SQLITE_OK ){
3582 p->rc = sqlite3_exec(db, "PRAGMA journal_mode=off",0,0,&p->zErrmsg);
3583 }
3584 }
3585
3586 /* Point the object iterator at the first object */
3587 if( p->rc==SQLITE_OK ){
3588 p->rc = rbuObjIterFirst(p, &p->objiter);
3589 }
3590
3591 /* If the RBU database contains no data_xxx tables, declare the RBU
3592 ** update finished. */
3593 if( p->rc==SQLITE_OK && p->objiter.zTbl==0 ){
3594 p->rc = SQLITE_DONE;
3595 }
3596
3597 if( p->rc==SQLITE_OK ){
3598 rbuSetupOal(p, pState);
3599 }
3600
3601 }else if( p->eStage==RBU_STAGE_MOVE ){
3602 /* no-op */
3603 }else if( p->eStage==RBU_STAGE_CKPT ){
3604 rbuSetupCheckpoint(p, pState);
3605 }else if( p->eStage==RBU_STAGE_DONE ){
3606 p->rc = SQLITE_DONE;
3607 }else{
3608 p->rc = SQLITE_CORRUPT;
3609 }
3610 }
3611
3612 rbuFreeState(pState);
3613 }
3614
3615 return p;
3616 }
3617
3618
3619 /*
3620 ** Return the database handle used by pRbu.
3621 */
3622 SQLITE_API sqlite3 *SQLITE_STDCALL sqlite3rbu_db(sqlite3rbu *pRbu, int bRbu){
3623 sqlite3 *db = 0;
3624 if( pRbu ){
3625 db = (bRbu ? pRbu->dbRbu : pRbu->dbMain);
3626 }
3627 return db;
3628 }
3629
3630
3631 /*
3632 ** If the error code currently stored in the RBU handle is SQLITE_CONSTRAINT,
3633 ** then edit any error message string so as to remove all occurrences of
3634 ** the pattern "rbu_imp_[0-9]*".
3635 */
3636 static void rbuEditErrmsg(sqlite3rbu *p){
3637 if( p->rc==SQLITE_CONSTRAINT && p->zErrmsg ){
3638 int i;
3639 int nErrmsg = strlen(p->zErrmsg);
3640 for(i=0; i<(nErrmsg-8); i++){
3641 if( memcmp(&p->zErrmsg[i], "rbu_imp_", 8)==0 ){
3642 int nDel = 8;
3643 while( p->zErrmsg[i+nDel]>='0' && p->zErrmsg[i+nDel]<='9' ) nDel++;
3644 memmove(&p->zErrmsg[i], &p->zErrmsg[i+nDel], nErrmsg + 1 - i - nDel);
3645 nErrmsg -= nDel;
3646 }
3647 }
3648 }
3649 }
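Reviewer note: the in-place edit performed by rbuEditErrmsg() can be tried in isolation. The following is a minimal sketch, not the code under review: `strip_rbu_imp` is a hypothetical name, and unlike the loop above it also handles a match at the very end of the string and rescans the same position after a deletion.

```c
#include <string.h>

/* Remove every occurrence of "rbu_imp_" followed by zero or more decimal
** digits from the nul-terminated string z, editing it in place.
** Hypothetical helper for illustration only. */
static void strip_rbu_imp(char *z){
  size_t n = strlen(z);
  size_t i = 0;
  while( n>=8 && i<=n-8 ){
    if( memcmp(&z[i], "rbu_imp_", 8)==0 ){
      size_t nDel = 8;
      /* Extend the deleted span over any trailing digits. */
      while( z[i+nDel]>='0' && z[i+nDel]<='9' ) nDel++;
      /* Shift the tail, including the terminator, down over the match. */
      memmove(&z[i], &z[i+nDel], n + 1 - i - nDel);
      n -= nDel;
    }else{
      i++;
    }
  }
}
```

The `n + 1 - i - nDel` length is the same expression the original uses: it covers the remaining tail plus the nul terminator.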
3650
3651 /*
3652 ** Close the RBU handle.
3653 */
3654 SQLITE_API int SQLITE_STDCALL sqlite3rbu_close(sqlite3rbu *p, char **pzErrmsg){
3655 int rc;
3656 if( p ){
3657
3658 /* Commit the transaction to the *-oal file. */
3659 if( p->rc==SQLITE_OK && p->eStage==RBU_STAGE_OAL ){
3660 p->rc = sqlite3_exec(p->dbMain, "COMMIT", 0, 0, &p->zErrmsg);
3661 }
3662
3663 rbuSaveState(p, p->eStage);
3664
3665 if( p->rc==SQLITE_OK && p->eStage==RBU_STAGE_OAL ){
3666 p->rc = sqlite3_exec(p->dbRbu, "COMMIT", 0, 0, &p->zErrmsg);
3667 }
3668
3669 /* Close any open statement handles. */
3670 rbuObjIterFinalize(&p->objiter);
3671
3672 /* Close the open database handle and VFS object. */
3673 sqlite3_close(p->dbMain);
3674 sqlite3_close(p->dbRbu);
3675 rbuDeleteVfs(p);
3676 sqlite3_free(p->aBuf);
3677 sqlite3_free(p->aFrame);
3678
3679 rbuEditErrmsg(p);
3680 rc = p->rc;
3681 *pzErrmsg = p->zErrmsg;
3682 sqlite3_free(p);
3683 }else{
3684 rc = SQLITE_NOMEM;
3685 *pzErrmsg = 0;
3686 }
3687 return rc;
3688 }
3689
3690 /*
3691 ** Return the total number of key-value operations (inserts, deletes or
3692 ** updates) that have been performed on the target database since the
3693 ** current RBU update was started.
3694 */
3695 SQLITE_API sqlite3_int64 SQLITE_STDCALL sqlite3rbu_progress(sqlite3rbu *pRbu){
3696 return pRbu->nProgress;
3697 }
3698
3699 SQLITE_API int SQLITE_STDCALL sqlite3rbu_savestate(sqlite3rbu *p){
3700 int rc = p->rc;
3701
3702 if( rc==SQLITE_DONE ) return SQLITE_OK;
3703
3704 assert( p->eStage>=RBU_STAGE_OAL && p->eStage<=RBU_STAGE_DONE );
3705 if( p->eStage==RBU_STAGE_OAL ){
3706 assert( rc!=SQLITE_DONE );
3707 if( rc==SQLITE_OK ) rc = sqlite3_exec(p->dbMain, "COMMIT", 0, 0, 0);
3708 }
3709
3710 p->rc = rc;
3711 rbuSaveState(p, p->eStage);
3712 rc = p->rc;
3713
3714 if( p->eStage==RBU_STAGE_OAL ){
3715 assert( rc!=SQLITE_DONE );
3716 if( rc==SQLITE_OK ) rc = sqlite3_exec(p->dbRbu, "COMMIT", 0, 0, 0);
3717 if( rc==SQLITE_OK ) rc = sqlite3_exec(p->dbRbu, "BEGIN IMMEDIATE", 0, 0, 0);
3718 if( rc==SQLITE_OK ) rc = sqlite3_exec(p->dbMain, "BEGIN IMMEDIATE", 0, 0,0);
3719 }
3720
3721 p->rc = rc;
3722 return rc;
3723 }
3724
3725 /**************************************************************************
3726 ** Beginning of RBU VFS shim methods. The VFS shim modifies the behaviour
3727 ** of a standard VFS in the following ways:
3728 **
3729 ** 1. Whenever the first page of a main database file is read or
3730 ** written, the value of the change-counter cookie is stored in
3731 ** rbu_file.iCookie. Similarly, the value of the "write-version"
3732 ** database header field is stored in rbu_file.iWriteVer. This ensures
3733 ** that the values are always trustworthy within an open transaction.
3734 **
3735 ** 2. Whenever an SQLITE_OPEN_WAL file is opened, the (rbu_file.pWalFd)
3736 ** member variable of the associated database file descriptor is set
3737 ** to point to the new file. A mutex protected linked list of all main
3738 ** db fds opened using a particular RBU VFS is maintained at
3739 ** rbu_vfs.pMain to facilitate this.
3740 **
3741 ** 3. Using a new file-control "SQLITE_FCNTL_RBU", a main db rbu_file
3742 ** object can be marked as the target database of an RBU update. This
3743 ** turns on the following extra special behaviour:
3744 **
3745 ** 3a. If xAccess() is called to check if there exists a *-wal file
3746 ** associated with an RBU target database currently in RBU_STAGE_OAL
3747 ** stage (preparing the *-oal file), the following special handling
3748 ** applies:
3749 **
3750 ** * if the *-wal file does exist, return SQLITE_CANTOPEN. An RBU
3751 ** target database may not be in wal mode already.
3752 **
3753 ** * if the *-wal file does not exist, set the output parameter to
3754 ** non-zero (to tell SQLite that it does exist) anyway.
3755 **
3756 ** Then, when xOpen() is called to open the *-wal file associated with
3757 ** the RBU target in RBU_STAGE_OAL stage, instead of opening the *-wal
3758 ** file, the rbu vfs opens the corresponding *-oal file instead.
3759 **
3760 ** 3b. The *-shm pages returned by xShmMap() for a target db file in
3761 ** RBU_STAGE_OAL mode are actually stored in heap memory. This is to
3762 ** avoid creating a *-shm file on disk. Additionally, xShmLock() calls
3763 ** are no-ops on target database files in RBU_STAGE_OAL mode. This is
3764 ** because assert() statements in some VFS implementations fail if
3765 ** xShmLock() is called before xShmMap().
3766 **
3767 ** 3c. If an EXCLUSIVE lock is attempted on a target database file in any
3768 ** mode except RBU_STAGE_DONE (all work completed and checkpointed), it
3769 ** fails with an SQLITE_BUSY error. This is to stop RBU connections
3770 ** from automatically checkpointing a *-wal (or *-oal) file from within
3771 ** sqlite3_close().
3772 **
3773 ** 3d. In RBU_STAGE_CAPTURE mode, all xRead() calls on the wal file, and
3774 ** all xWrite() calls on the target database file perform no IO.
3775 ** Instead the frame and page numbers that would be read and written
3776 ** are recorded. Additionally, successful attempts to obtain exclusive
3777 ** xShmLock() WRITER, CHECKPOINTER and READ0 locks on the target
3778 ** database file are recorded. xShmLock() calls to unlock the same
3779 ** locks are no-ops (so that once obtained, these locks are never
3780 ** relinquished). Finally, calls to xSync() on the target database
3781 ** file fail with SQLITE_INTERNAL errors.
3782 */
3783
3784 static void rbuUnlockShm(rbu_file *p){
3785 if( p->pRbu ){
3786 int (*xShmLock)(sqlite3_file*,int,int,int) = p->pReal->pMethods->xShmLock;
3787 int i;
3788 for(i=0; i<SQLITE_SHM_NLOCK;i++){
3789 if( (1<<i) & p->pRbu->mLock ){
3790 xShmLock(p->pReal, i, 1, SQLITE_SHM_UNLOCK|SQLITE_SHM_EXCLUSIVE);
3791 }
3792 }
3793 p->pRbu->mLock = 0;
3794 }
3795 }
3796
3797 /*
3798 ** Close an rbu file.
3799 */
3800 static int rbuVfsClose(sqlite3_file *pFile){
3801 rbu_file *p = (rbu_file*)pFile;
3802 int rc;
3803 int i;
3804
3805 /* Free the contents of the apShm[] array. And the array itself. */
3806 for(i=0; i<p->nShm; i++){
3807 sqlite3_free(p->apShm[i]);
3808 }
3809 sqlite3_free(p->apShm);
3810 p->apShm = 0;
3811 sqlite3_free(p->zDel);
3812
3813 if( p->openFlags & SQLITE_OPEN_MAIN_DB ){
3814 rbu_file **pp;
3815 sqlite3_mutex_enter(p->pRbuVfs->mutex);
3816 for(pp=&p->pRbuVfs->pMain; *pp!=p; pp=&((*pp)->pMainNext));
3817 *pp = p->pMainNext;
3818 sqlite3_mutex_leave(p->pRbuVfs->mutex);
3819 rbuUnlockShm(p);
3820 p->pReal->pMethods->xShmUnmap(p->pReal, 0);
3821 }
3822
3823 /* Close the underlying file handle */
3824 rc = p->pReal->pMethods->xClose(p->pReal);
3825 return rc;
3826 }
3827
3828
3829 /*
3830 ** Read and return an unsigned 32-bit big-endian integer from the buffer
3831 ** passed as the only argument.
3832 */
3833 static u32 rbuGetU32(u8 *aBuf){
3834 return ((u32)aBuf[0] << 24)
3835 + ((u32)aBuf[1] << 16)
3836 + ((u32)aBuf[2] << 8)
3837 + ((u32)aBuf[3]);
3838 }
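Reviewer note: a standalone sketch of the big-endian decode, applied to the change-counter field that rbuVfsRead() and rbuVfsWrite() read at byte offset 24 of page 1. The names `get_u32_be` and `header_change_counter` are hypothetical, and the fixed-width types come from <stdint.h> rather than the u8/u32 typedefs used in the amalgamation.

```c
#include <stdint.h>

/* Decode a 32-bit big-endian unsigned integer from a[0..3]. */
static uint32_t get_u32_be(const unsigned char *a){
  return ((uint32_t)a[0] << 24)
       | ((uint32_t)a[1] << 16)
       | ((uint32_t)a[2] << 8)
       |  (uint32_t)a[3];
}

/* Return the change counter stored at offset 24 of a database header.
** aHdr must point to at least 28 readable bytes. */
static uint32_t header_change_counter(const unsigned char *aHdr){
  return get_u32_be(&aHdr[24]);
}
```

Each byte is cast before shifting so that the shift is done in 32-bit unsigned arithmetic and never on a promoted signed int.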
3839
3840 /*
3841 ** Read data from an rbuVfs-file.
3842 */
3843 static int rbuVfsRead(
3844 sqlite3_file *pFile,
3845 void *zBuf,
3846 int iAmt,
3847 sqlite_int64 iOfst
3848 ){
3849 rbu_file *p = (rbu_file*)pFile;
3850 sqlite3rbu *pRbu = p->pRbu;
3851 int rc;
3852
3853 if( pRbu && pRbu->eStage==RBU_STAGE_CAPTURE ){
3854 assert( p->openFlags & SQLITE_OPEN_WAL );
3855 rc = rbuCaptureWalRead(p->pRbu, iOfst, iAmt);
3856 }else{
3857 if( pRbu && pRbu->eStage==RBU_STAGE_OAL
3858 && (p->openFlags & SQLITE_OPEN_WAL)
3859 && iOfst>=pRbu->iOalSz
3860 ){
3861 rc = SQLITE_OK;
3862 memset(zBuf, 0, iAmt);
3863 }else{
3864 rc = p->pReal->pMethods->xRead(p->pReal, zBuf, iAmt, iOfst);
3865 }
3866 if( rc==SQLITE_OK && iOfst==0 && (p->openFlags & SQLITE_OPEN_MAIN_DB) ){
3867 /* These look like magic numbers. But they are stable, as they are part
3868 ** of the definition of the SQLite file format, which may not change. */
3869 u8 *pBuf = (u8*)zBuf;
3870 p->iCookie = rbuGetU32(&pBuf[24]);
3871 p->iWriteVer = pBuf[19];
3872 }
3873 }
3874 return rc;
3875 }
3876
3877 /*
3878 ** Write data to an rbuVfs-file.
3879 */
3880 static int rbuVfsWrite(
3881 sqlite3_file *pFile,
3882 const void *zBuf,
3883 int iAmt,
3884 sqlite_int64 iOfst
3885 ){
3886 rbu_file *p = (rbu_file*)pFile;
3887 sqlite3rbu *pRbu = p->pRbu;
3888 int rc;
3889
3890 if( pRbu && pRbu->eStage==RBU_STAGE_CAPTURE ){
3891 assert( p->openFlags & SQLITE_OPEN_MAIN_DB );
3892 rc = rbuCaptureDbWrite(p->pRbu, iOfst);
3893 }else{
3894 if( pRbu && pRbu->eStage==RBU_STAGE_OAL
3895 && (p->openFlags & SQLITE_OPEN_WAL)
3896 && iOfst>=pRbu->iOalSz
3897 ){
3898 pRbu->iOalSz = iAmt + iOfst;
3899 }
3900 rc = p->pReal->pMethods->xWrite(p->pReal, zBuf, iAmt, iOfst);
3901 if( rc==SQLITE_OK && iOfst==0 && (p->openFlags & SQLITE_OPEN_MAIN_DB) ){
3902 /* These look like magic numbers. But they are stable, as they are part
3903 ** of the definition of the SQLite file format, which may not change. */
3904 u8 *pBuf = (u8*)zBuf;
3905 p->iCookie = rbuGetU32(&pBuf[24]);
3906 p->iWriteVer = pBuf[19];
3907 }
3908 }
3909 return rc;
3910 }
3911
3912 /*
3913 ** Truncate an rbuVfs-file.
3914 */
3915 static int rbuVfsTruncate(sqlite3_file *pFile, sqlite_int64 size){
3916 rbu_file *p = (rbu_file*)pFile;
3917 return p->pReal->pMethods->xTruncate(p->pReal, size);
3918 }
3919
3920 /*
3921 ** Sync an rbuVfs-file.
3922 */
3923 static int rbuVfsSync(sqlite3_file *pFile, int flags){
3924 rbu_file *p = (rbu_file *)pFile;
3925 if( p->pRbu && p->pRbu->eStage==RBU_STAGE_CAPTURE ){
3926 if( p->openFlags & SQLITE_OPEN_MAIN_DB ){
3927 return SQLITE_INTERNAL;
3928 }
3929 return SQLITE_OK;
3930 }
3931 return p->pReal->pMethods->xSync(p->pReal, flags);
3932 }
3933
3934 /*
3935 ** Return the current file-size of an rbuVfs-file.
3936 */
3937 static int rbuVfsFileSize(sqlite3_file *pFile, sqlite_int64 *pSize){
3938 rbu_file *p = (rbu_file *)pFile;
3939 return p->pReal->pMethods->xFileSize(p->pReal, pSize);
3940 }
3941
3942 /*
3943 ** Lock an rbuVfs-file.
3944 */
3945 static int rbuVfsLock(sqlite3_file *pFile, int eLock){
3946 rbu_file *p = (rbu_file*)pFile;
3947 sqlite3rbu *pRbu = p->pRbu;
3948 int rc = SQLITE_OK;
3949
3950 assert( p->openFlags & (SQLITE_OPEN_MAIN_DB|SQLITE_OPEN_TEMP_DB) );
3951 if( pRbu && eLock==SQLITE_LOCK_EXCLUSIVE && pRbu->eStage!=RBU_STAGE_DONE ){
3952 /* Do not allow EXCLUSIVE locks. Preventing SQLite from taking this
3953 ** prevents it from checkpointing the database from sqlite3_close(). */
3954 rc = SQLITE_BUSY;
3955 }else{
3956 rc = p->pReal->pMethods->xLock(p->pReal, eLock);
3957 }
3958
3959 return rc;
3960 }
3961
3962 /*
3963 ** Unlock an rbuVfs-file.
3964 */
3965 static int rbuVfsUnlock(sqlite3_file *pFile, int eLock){
3966 rbu_file *p = (rbu_file *)pFile;
3967 return p->pReal->pMethods->xUnlock(p->pReal, eLock);
3968 }
3969
3970 /*
3971 ** Check if another file-handle holds a RESERVED lock on an rbuVfs-file.
3972 */
3973 static int rbuVfsCheckReservedLock(sqlite3_file *pFile, int *pResOut){
3974 rbu_file *p = (rbu_file *)pFile;
3975 return p->pReal->pMethods->xCheckReservedLock(p->pReal, pResOut);
3976 }
3977
3978 /*
3979 ** File control method. For custom operations on an rbuVfs-file.
3980 */
3981 static int rbuVfsFileControl(sqlite3_file *pFile, int op, void *pArg){
3982 rbu_file *p = (rbu_file *)pFile;
3983 int (*xControl)(sqlite3_file*,int,void*) = p->pReal->pMethods->xFileControl;
3984 int rc;
3985
3986 assert( p->openFlags & (SQLITE_OPEN_MAIN_DB|SQLITE_OPEN_TEMP_DB)
3987 || p->openFlags & (SQLITE_OPEN_TRANSIENT_DB|SQLITE_OPEN_TEMP_JOURNAL)
3988 );
3989 if( op==SQLITE_FCNTL_RBU ){
3990 sqlite3rbu *pRbu = (sqlite3rbu*)pArg;
3991
3992 /* First try to find another RBU vfs lower down in the vfs stack. If
3993 ** one is found, this vfs will operate in pass-through mode. The lower
3994 ** level vfs will do the special RBU handling. */
3995 rc = xControl(p->pReal, op, pArg);
3996
3997 if( rc==SQLITE_NOTFOUND ){
3998 /* Now search for a zipvfs instance lower down in the VFS stack. If
3999 ** one is found, this is an error. */
4000 void *dummy = 0;
4001 rc = xControl(p->pReal, SQLITE_FCNTL_ZIPVFS, &dummy);
4002 if( rc==SQLITE_OK ){
4003 rc = SQLITE_ERROR;
4004 pRbu->zErrmsg = sqlite3_mprintf("rbu/zipvfs setup error");
4005 }else if( rc==SQLITE_NOTFOUND ){
4006 pRbu->pTargetFd = p;
4007 p->pRbu = pRbu;
4008 if( p->pWalFd ) p->pWalFd->pRbu = pRbu;
4009 rc = SQLITE_OK;
4010 }
4011 }
4012 return rc;
4013 }
4014
4015 rc = xControl(p->pReal, op, pArg);
4016 if( rc==SQLITE_OK && op==SQLITE_FCNTL_VFSNAME ){
4017 rbu_vfs *pRbuVfs = p->pRbuVfs;
4018 char *zIn = *(char**)pArg;
4019 char *zOut = sqlite3_mprintf("rbu(%s)/%z", pRbuVfs->base.zName, zIn);
4020 *(char**)pArg = zOut;
4021 if( zOut==0 ) rc = SQLITE_NOMEM;
4022 }
4023
4024 return rc;
4025 }
4026
4027 /*
4028 ** Return the sector-size in bytes for an rbuVfs-file.
4029 */
4030 static int rbuVfsSectorSize(sqlite3_file *pFile){
4031 rbu_file *p = (rbu_file *)pFile;
4032 return p->pReal->pMethods->xSectorSize(p->pReal);
4033 }
4034
4035 /*
4036 ** Return the device characteristic flags supported by an rbuVfs-file.
4037 */
4038 static int rbuVfsDeviceCharacteristics(sqlite3_file *pFile){
4039 rbu_file *p = (rbu_file *)pFile;
4040 return p->pReal->pMethods->xDeviceCharacteristics(p->pReal);
4041 }
4042
4043 /*
4044 ** Take or release a shared-memory lock.
4045 */
4046 static int rbuVfsShmLock(sqlite3_file *pFile, int ofst, int n, int flags){
4047 rbu_file *p = (rbu_file*)pFile;
4048 sqlite3rbu *pRbu = p->pRbu;
4049 int rc = SQLITE_OK;
4050
4051 #ifdef SQLITE_AMALGAMATION
4052 assert( WAL_CKPT_LOCK==1 );
4053 #endif
4054
4055 assert( p->openFlags & (SQLITE_OPEN_MAIN_DB|SQLITE_OPEN_TEMP_DB) );
4056 if( pRbu && (pRbu->eStage==RBU_STAGE_OAL || pRbu->eStage==RBU_STAGE_MOVE) ){
4057 /* Magic number 1 is the WAL_CKPT_LOCK lock. Preventing SQLite from
4058 ** taking this lock also prevents any checkpoints from occurring.
4059 ** todo: really, it's not clear why this might occur, as
4060 ** wal_autocheckpoint ought to be turned off. */
4061 if( ofst==WAL_LOCK_CKPT && n==1 ) rc = SQLITE_BUSY;
4062 }else{
4063 int bCapture = 0;
4064 if( n==1 && (flags & SQLITE_SHM_EXCLUSIVE)
4065 && pRbu && pRbu->eStage==RBU_STAGE_CAPTURE
4066 && (ofst==WAL_LOCK_WRITE || ofst==WAL_LOCK_CKPT || ofst==WAL_LOCK_READ0)
4067 ){
4068 bCapture = 1;
4069 }
4070
4071 if( bCapture==0 || 0==(flags & SQLITE_SHM_UNLOCK) ){
4072 rc = p->pReal->pMethods->xShmLock(p->pReal, ofst, n, flags);
4073 if( bCapture && rc==SQLITE_OK ){
4074 pRbu->mLock |= (1 << ofst);
4075 }
4076 }
4077 }
4078
4079 return rc;
4080 }
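Reviewer note: the mLock bookkeeping shared by rbuVfsShmLock() and rbuUnlockShm() boils down to a bitmask with one bit per lock slot. A minimal sketch under stated assumptions: `LockTracker` and its functions are hypothetical, `N_SHM_LOCK` is assumed to mirror SQLITE_SHM_NLOCK, and a counter stands in for the real xShmLock() call.

```c
#define N_SHM_LOCK 8   /* Assumption: mirrors SQLITE_SHM_NLOCK */

typedef struct LockTracker {
  unsigned mLock;      /* Bit i set means lock slot i is recorded as held */
  int nUnlockCalls;    /* Number of unlock calls that would be issued */
} LockTracker;

/* Record that the exclusive lock on slot ofst was obtained. */
static void tracker_record(LockTracker *p, int ofst){
  p->mLock |= (1u << ofst);
}

/* Release every recorded slot, then clear the mask. */
static void tracker_release_all(LockTracker *p){
  int i;
  for(i=0; i<N_SHM_LOCK; i++){
    if( (1u<<i) & p->mLock ){
      /* Stand-in for xShmLock(pReal, i, 1, UNLOCK|EXCLUSIVE). */
      p->nUnlockCalls++;
    }
  }
  p->mLock = 0;
}
```

Because unlock requests are swallowed while capturing, the mask is the only record of which slots must be released when the file is unmapped or closed.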
4081
4082 /*
4083 ** Obtain a pointer to a mapping of a single 32KiB page of the *-shm file.
4084 */
4085 static int rbuVfsShmMap(
4086 sqlite3_file *pFile,
4087 int iRegion,
4088 int szRegion,
4089 int isWrite,
4090 void volatile **pp
4091 ){
4092 rbu_file *p = (rbu_file*)pFile;
4093 int rc = SQLITE_OK;
4094 int eStage = (p->pRbu ? p->pRbu->eStage : 0);
4095
4096 /* If not in RBU_STAGE_OAL, allow this call to pass through. Or, if this
4097 ** rbu is in the RBU_STAGE_OAL state, use heap memory for *-shm space
4098 ** instead of a file on disk. */
4099 assert( p->openFlags & (SQLITE_OPEN_MAIN_DB|SQLITE_OPEN_TEMP_DB) );
4100 if( eStage==RBU_STAGE_OAL || eStage==RBU_STAGE_MOVE ){
4101 if( iRegion<=p->nShm ){
4102 int nByte = (iRegion+1) * sizeof(char*);
4103 char **apNew = (char**)sqlite3_realloc(p->apShm, nByte);
4104 if( apNew==0 ){
4105 rc = SQLITE_NOMEM;
4106 }else{
4107 memset(&apNew[p->nShm], 0, sizeof(char*) * (1 + iRegion - p->nShm));
4108 p->apShm = apNew;
4109 p->nShm = iRegion+1;
4110 }
4111 }
4112
4113 if( rc==SQLITE_OK && p->apShm[iRegion]==0 ){
4114 char *pNew = (char*)sqlite3_malloc(szRegion);
4115 if( pNew==0 ){
4116 rc = SQLITE_NOMEM;
4117 }else{
4118 memset(pNew, 0, szRegion);
4119 p->apShm[iRegion] = pNew;
4120 }
4121 }
4122
4123 if( rc==SQLITE_OK ){
4124 *pp = p->apShm[iRegion];
4125 }else{
4126 *pp = 0;
4127 }
4128 }else{
4129 assert( p->apShm==0 );
4130 rc = p->pReal->pMethods->xShmMap(p->pReal, iRegion, szRegion, isWrite, pp);
4131 }
4132
4133 return rc;
4134 }
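Reviewer note: the heap-backed *-shm regions can be modelled as a growable array of lazily allocated, zero-filled buffers. The sketch below is an illustration only: `RegionSet`, `region_get` and `region_free` are hypothetical names, it uses the C library allocator instead of sqlite3_realloc()/sqlite3_malloc(), and it grows the array whenever `iRegion>=nRegion`, which is not the exact condition used in the patch.

```c
#include <stdlib.h>
#include <string.h>

typedef struct RegionSet {
  int nRegion;        /* Number of slots in apRegion[] */
  char **apRegion;    /* Lazily allocated region buffers */
} RegionSet;

/* Return region iRegion, allocating a zeroed buffer of szRegion bytes on
** first use. Returns NULL if an allocation fails. */
static char *region_get(RegionSet *p, int iRegion, size_t szRegion){
  if( iRegion>=p->nRegion ){
    size_t nByte = (size_t)(iRegion+1) * sizeof(char*);
    char **apNew = realloc(p->apRegion, nByte);
    if( apNew==0 ) return 0;
    /* Zero only the newly added slots. */
    memset(&apNew[p->nRegion], 0,
           sizeof(char*) * (size_t)(iRegion + 1 - p->nRegion));
    p->apRegion = apNew;
    p->nRegion = iRegion+1;
  }
  if( p->apRegion[iRegion]==0 ){
    p->apRegion[iRegion] = calloc(1, szRegion);
  }
  return p->apRegion[iRegion];
}

/* Free every region and the array itself. */
static void region_free(RegionSet *p){
  int i;
  for(i=0; i<p->nRegion; i++) free(p->apRegion[i]);
  free(p->apRegion);
  p->apRegion = 0;
  p->nRegion = 0;
}
```

Repeated requests for the same region return the same pointer, which is what a caller of xShmMap() relies on.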
4135
4136 /*
4137 ** Memory barrier.
4138 */
4139 static void rbuVfsShmBarrier(sqlite3_file *pFile){
4140 rbu_file *p = (rbu_file *)pFile;
4141 p->pReal->pMethods->xShmBarrier(p->pReal);
4142 }
4143
4144 /*
4145 ** The xShmUnmap method.
4146 */
4147 static int rbuVfsShmUnmap(sqlite3_file *pFile, int delFlag){
4148 rbu_file *p = (rbu_file*)pFile;
4149 int rc = SQLITE_OK;
4150 int eStage = (p->pRbu ? p->pRbu->eStage : 0);
4151
4152 assert( p->openFlags & (SQLITE_OPEN_MAIN_DB|SQLITE_OPEN_TEMP_DB) );
4153 if( eStage==RBU_STAGE_OAL || eStage==RBU_STAGE_MOVE ){
4154 /* no-op */
4155 }else{
4156 /* Release the checkpointer and writer locks */
4157 rbuUnlockShm(p);
4158 rc = p->pReal->pMethods->xShmUnmap(p->pReal, delFlag);
4159 }
4160 return rc;
4161 }
4162
4163 /*
4164 ** Given that zWal points to a buffer containing a wal file name passed to
4165 ** either the xOpen() or xAccess() VFS method, return a pointer to the
4166 ** file-handle opened by the same database connection on the corresponding
4167 ** database file.
4168 */
4169 static rbu_file *rbuFindMaindb(rbu_vfs *pRbuVfs, const char *zWal){
4170 rbu_file *pDb;
4171 sqlite3_mutex_enter(pRbuVfs->mutex);
4172 for(pDb=pRbuVfs->pMain; pDb && pDb->zWal!=zWal; pDb=pDb->pMainNext);
4173 sqlite3_mutex_leave(pRbuVfs->mutex);
4174 return pDb;
4175 }
4176
4177 /*
4178 ** Open an rbu file handle.
4179 */
4180 static int rbuVfsOpen(
4181 sqlite3_vfs *pVfs,
4182 const char *zName,
4183 sqlite3_file *pFile,
4184 int flags,
4185 int *pOutFlags
4186 ){
4187 static sqlite3_io_methods rbuvfs_io_methods = {
4188 2, /* iVersion */
4189 rbuVfsClose, /* xClose */
4190 rbuVfsRead, /* xRead */
4191 rbuVfsWrite, /* xWrite */
4192 rbuVfsTruncate, /* xTruncate */
4193 rbuVfsSync, /* xSync */
4194 rbuVfsFileSize, /* xFileSize */
4195 rbuVfsLock, /* xLock */
4196 rbuVfsUnlock, /* xUnlock */
4197 rbuVfsCheckReservedLock, /* xCheckReservedLock */
4198 rbuVfsFileControl, /* xFileControl */
4199 rbuVfsSectorSize, /* xSectorSize */
4200 rbuVfsDeviceCharacteristics, /* xDeviceCharacteristics */
4201 rbuVfsShmMap, /* xShmMap */
4202 rbuVfsShmLock, /* xShmLock */
4203 rbuVfsShmBarrier, /* xShmBarrier */
4204 rbuVfsShmUnmap, /* xShmUnmap */
4205 0, 0 /* xFetch, xUnfetch */
4206 };
4207 rbu_vfs *pRbuVfs = (rbu_vfs*)pVfs;
4208 sqlite3_vfs *pRealVfs = pRbuVfs->pRealVfs;
4209 rbu_file *pFd = (rbu_file *)pFile;
4210 int rc = SQLITE_OK;
4211 const char *zOpen = zName;
4212
4213 memset(pFd, 0, sizeof(rbu_file));
4214 pFd->pReal = (sqlite3_file*)&pFd[1];
4215 pFd->pRbuVfs = pRbuVfs;
4216 pFd->openFlags = flags;
4217 if( zName ){
4218 if( flags & SQLITE_OPEN_MAIN_DB ){
4219 /* A main database has just been opened. The following block sets
4220 ** (pFd->zWal) to point to a buffer owned by SQLite that contains
4221 ** the name of the *-wal file this db connection will use. SQLite
4222 ** happens to pass a pointer to this buffer when using xAccess()
4223 ** or xOpen() to operate on the *-wal file. */
4224 int n = strlen(zName);
4225 const char *z = &zName[n];
4226 if( flags & SQLITE_OPEN_URI ){
4227 int odd = 0;
4228 while( 1 ){
4229 if( z[0]==0 ){
4230 odd = 1 - odd;
4231 if( odd && z[1]==0 ) break;
4232 }
4233 z++;
4234 }
4235 z += 2;
4236 }else{
4237 while( *z==0 ) z++;
4238 }
4239 z += (n + 8 + 1);
4240 pFd->zWal = z;
4241 }
4242 else if( flags & SQLITE_OPEN_WAL ){
4243 rbu_file *pDb = rbuFindMaindb(pRbuVfs, zName);
4244 if( pDb ){
4245 if( pDb->pRbu && pDb->pRbu->eStage==RBU_STAGE_OAL ){
4246 /* This call is to open a *-wal file. Instead, open the *-oal. This
4247 ** code ensures that the string passed to xOpen() is terminated by a
4248 ** pair of '\0' bytes in case the VFS attempts to extract a URI
4249 ** parameter from it. */
4250 int nCopy = strlen(zName);
4251 char *zCopy = sqlite3_malloc(nCopy+2);
4252 if( zCopy ){
4253 memcpy(zCopy, zName, nCopy);
4254 zCopy[nCopy-3] = 'o';
4255 zCopy[nCopy] = '\0';
4256 zCopy[nCopy+1] = '\0';
4257 zOpen = (const char*)(pFd->zDel = zCopy);
4258 }else{
4259 rc = SQLITE_NOMEM;
4260 }
4261 pFd->pRbu = pDb->pRbu;
4262 }
4263 pDb->pWalFd = pFd;
4264 }
4265 }
4266 }
4267
4268 if( rc==SQLITE_OK ){
4269 rc = pRealVfs->xOpen(pRealVfs, zOpen, pFd->pReal, flags, pOutFlags);
4270 }
4271 if( pFd->pReal->pMethods ){
4272 /* The xOpen() operation has succeeded. Set the sqlite3_file.pMethods
4273 ** pointer and, if the file is a main database file, link it into the
4274 ** mutex protected linked list of all such files. */
4275 pFile->pMethods = &rbuvfs_io_methods;
4276 if( flags & SQLITE_OPEN_MAIN_DB ){
4277 sqlite3_mutex_enter(pRbuVfs->mutex);
4278 pFd->pMainNext = pRbuVfs->pMain;
4279 pRbuVfs->pMain = pFd;
4280 sqlite3_mutex_leave(pRbuVfs->mutex);
4281 }
4282 }else{
4283 sqlite3_free(pFd->zDel);
4284 }
4285
4286 return rc;
4287 }
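Reviewer note: the *-wal to *-oal rename inside rbuVfsOpen() is a one-character substitution plus a second terminator. A minimal sketch, assuming the name ends in "-wal"; `oal_name_from_wal` is a hypothetical helper, uses malloc() instead of sqlite3_malloc(), and adds a suffix check that the patch does not perform.

```c
#include <stdlib.h>
#include <string.h>

/* Return a malloc'd copy of zWal with the trailing "-wal" changed to
** "-oal" and terminated by two nul bytes. Returns NULL on allocation
** failure or if zWal does not end in "-wal". Caller frees the result. */
static char *oal_name_from_wal(const char *zWal){
  size_t n = strlen(zWal);
  char *zCopy;
  if( n<4 || strcmp(&zWal[n-4], "-wal")!=0 ) return 0;
  zCopy = malloc(n+2);
  if( zCopy==0 ) return 0;
  memcpy(zCopy, zWal, n);
  zCopy[n-3] = 'o';       /* "-wal" becomes "-oal" */
  zCopy[n] = '\0';
  zCopy[n+1] = '\0';      /* Second nul, for VFS code that scans URI params */
  return zCopy;
}
```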
4288
4289 /*
4290 ** Delete the file located at zPath.
4291 */
4292 static int rbuVfsDelete(sqlite3_vfs *pVfs, const char *zPath, int dirSync){
4293 sqlite3_vfs *pRealVfs = ((rbu_vfs*)pVfs)->pRealVfs;
4294 return pRealVfs->xDelete(pRealVfs, zPath, dirSync);
4295 }
4296
4297 /*
4298 ** Test for access permissions. Return true if the requested permission
4299 ** is available, or false otherwise.
4300 */
4301 static int rbuVfsAccess(
4302 sqlite3_vfs *pVfs,
4303 const char *zPath,
4304 int flags,
4305 int *pResOut
4306 ){
4307 rbu_vfs *pRbuVfs = (rbu_vfs*)pVfs;
4308 sqlite3_vfs *pRealVfs = pRbuVfs->pRealVfs;
4309 int rc;
4310
4311 rc = pRealVfs->xAccess(pRealVfs, zPath, flags, pResOut);
4312
4313 /* If this call is to check if a *-wal file associated with an RBU target
4314 ** database connection exists, and the RBU update is in RBU_STAGE_OAL,
4315 ** the following special handling is activated:
4316 **
4317 ** a) if the *-wal file does exist, return SQLITE_CANTOPEN. This
4318 ** ensures that the RBU extension never tries to update a database
4319 ** in wal mode, even if the first page of the database file has
4320 ** been damaged.
4321 **
4322 ** b) if the *-wal file does not exist, claim that it does anyway,
4323 ** causing SQLite to call xOpen() to open it. This call will also
4324 ** be intercepted (see the rbuVfsOpen() function) and the *-oal
4325 ** file opened instead.
4326 */
4327 if( rc==SQLITE_OK && flags==SQLITE_ACCESS_EXISTS ){
4328 rbu_file *pDb = rbuFindMaindb(pRbuVfs, zPath);
4329 if( pDb && pDb->pRbu && pDb->pRbu->eStage==RBU_STAGE_OAL ){
4330 if( *pResOut ){
4331 rc = SQLITE_CANTOPEN;
4332 }else{
4333 *pResOut = 1;
4334 }
4335 }
4336 }
4337
4338 return rc;
4339 }
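Reviewer note: cases (a) and (b) in the comment above reduce to a small decision function. This sketch is hypothetical (`wal_access_override` and the `RC_*` names are not part of the patch); the value 14 is chosen to mirror SQLITE_CANTOPEN.

```c
enum { RC_OK = 0, RC_CANTOPEN = 14 };  /* 14 mirrors SQLITE_CANTOPEN */

/* bStageOal: non-zero if the path is the *-wal file of an RBU target that
** is currently in the OAL stage.
** *pResOut: on entry, whether the real *-wal file exists; on exit, the
** value reported back to SQLite. */
static int wal_access_override(int bStageOal, int *pResOut){
  if( bStageOal ){
    if( *pResOut ) return RC_CANTOPEN;  /* (a) target is already in wal mode */
    *pResOut = 1;                       /* (b) claim the file exists */
  }
  return RC_OK;
}
```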
4340
4341 /*
4342 ** Populate buffer zOut with the full canonical pathname corresponding
4343 ** to the pathname in zPath. zOut is guaranteed to point to a buffer
4344 ** of at least nOut bytes.
4345 */
4346 static int rbuVfsFullPathname(
4347 sqlite3_vfs *pVfs,
4348 const char *zPath,
4349 int nOut,
4350 char *zOut
4351 ){
4352 sqlite3_vfs *pRealVfs = ((rbu_vfs*)pVfs)->pRealVfs;
4353 return pRealVfs->xFullPathname(pRealVfs, zPath, nOut, zOut);
4354 }
4355
4356 #ifndef SQLITE_OMIT_LOAD_EXTENSION
4357 /*
4358 ** Open the dynamic library located at zPath and return a handle.
4359 */
4360 static void *rbuVfsDlOpen(sqlite3_vfs *pVfs, const char *zPath){
4361 sqlite3_vfs *pRealVfs = ((rbu_vfs*)pVfs)->pRealVfs;
4362 return pRealVfs->xDlOpen(pRealVfs, zPath);
4363 }
4364
4365 /*
4366 ** Populate the buffer zErrMsg (size nByte bytes) with a human readable
4367 ** utf-8 string describing the most recent error encountered associated
4368 ** with dynamic libraries.
4369 */
4370 static void rbuVfsDlError(sqlite3_vfs *pVfs, int nByte, char *zErrMsg){
4371 sqlite3_vfs *pRealVfs = ((rbu_vfs*)pVfs)->pRealVfs;
4372 pRealVfs->xDlError(pRealVfs, nByte, zErrMsg);
4373 }
4374
4375 /*
4376 ** Return a pointer to the symbol zSymbol in the dynamic library pHandle.
4377 */
4378 static void (*rbuVfsDlSym(
4379 sqlite3_vfs *pVfs,
4380 void *pArg,
4381 const char *zSym
4382 ))(void){
4383 sqlite3_vfs *pRealVfs = ((rbu_vfs*)pVfs)->pRealVfs;
4384 return pRealVfs->xDlSym(pRealVfs, pArg, zSym);
4385 }
4386
4387 /*
4388 ** Close the dynamic library handle pHandle.
4389 */
4390 static void rbuVfsDlClose(sqlite3_vfs *pVfs, void *pHandle){
4391 sqlite3_vfs *pRealVfs = ((rbu_vfs*)pVfs)->pRealVfs;
4392 pRealVfs->xDlClose(pRealVfs, pHandle);
4393 }
4394 #endif /* SQLITE_OMIT_LOAD_EXTENSION */
4395
4396 /*
4397 ** Populate the buffer pointed to by zBufOut with nByte bytes of
4398 ** random data.
4399 */
4400 static int rbuVfsRandomness(sqlite3_vfs *pVfs, int nByte, char *zBufOut){
4401 sqlite3_vfs *pRealVfs = ((rbu_vfs*)pVfs)->pRealVfs;
4402 return pRealVfs->xRandomness(pRealVfs, nByte, zBufOut);
4403 }
4404
4405 /*
4406 ** Sleep for nMicro microseconds. Return the number of microseconds
4407 ** actually slept.
4408 */
4409 static int rbuVfsSleep(sqlite3_vfs *pVfs, int nMicro){
4410 sqlite3_vfs *pRealVfs = ((rbu_vfs*)pVfs)->pRealVfs;
4411 return pRealVfs->xSleep(pRealVfs, nMicro);
4412 }
4413
4414 /*
4415 ** Return the current time as a Julian Day number in *pTimeOut.
4416 */
4417 static int rbuVfsCurrentTime(sqlite3_vfs *pVfs, double *pTimeOut){
4418 sqlite3_vfs *pRealVfs = ((rbu_vfs*)pVfs)->pRealVfs;
4419 return pRealVfs->xCurrentTime(pRealVfs, pTimeOut);
4420 }
4421
4422 /*
4423 ** No-op.
4424 */
4425 static int rbuVfsGetLastError(sqlite3_vfs *pVfs, int a, char *b){
4426 return 0;
4427 }
4428
4429 /*
4430 ** Deregister and destroy an RBU vfs created by an earlier call to
4431 ** sqlite3rbu_create_vfs().
4432 */
4433 SQLITE_API void SQLITE_STDCALL sqlite3rbu_destroy_vfs(const char *zName){
4434 sqlite3_vfs *pVfs = sqlite3_vfs_find(zName);
4435 if( pVfs && pVfs->xOpen==rbuVfsOpen ){
4436 sqlite3_mutex_free(((rbu_vfs*)pVfs)->mutex);
4437 sqlite3_vfs_unregister(pVfs);
4438 sqlite3_free(pVfs);
4439 }
4440 }
4441
4442 /*
4443 ** Create an RBU VFS named zName that accesses the underlying file-system
4444 ** via existing VFS zParent. The new object is registered as a non-default
4445 ** VFS with SQLite before returning.
4446 */
4447 SQLITE_API int SQLITE_STDCALL sqlite3rbu_create_vfs(const char *zName, const char *zParent){
4448
4449 /* Template for VFS */
4450 static sqlite3_vfs vfs_template = {
4451 1, /* iVersion */
4452 0, /* szOsFile */
4453 0, /* mxPathname */
4454 0, /* pNext */
4455 0, /* zName */
4456 0, /* pAppData */
4457 rbuVfsOpen, /* xOpen */
4458 rbuVfsDelete, /* xDelete */
4459 rbuVfsAccess, /* xAccess */
4460 rbuVfsFullPathname, /* xFullPathname */
4461
4462 #ifndef SQLITE_OMIT_LOAD_EXTENSION
4463 rbuVfsDlOpen, /* xDlOpen */
4464 rbuVfsDlError, /* xDlError */
4465 rbuVfsDlSym, /* xDlSym */
4466 rbuVfsDlClose, /* xDlClose */
4467 #else
4468 0, 0, 0, 0,
4469 #endif
4470
4471 rbuVfsRandomness, /* xRandomness */
4472 rbuVfsSleep, /* xSleep */
4473 rbuVfsCurrentTime, /* xCurrentTime */
4474 rbuVfsGetLastError, /* xGetLastError */
4475 0, /* xCurrentTimeInt64 (version 2) */
4476 0, 0, 0 /* Unimplemented version 3 methods */
4477 };
4478
4479 rbu_vfs *pNew = 0; /* Newly allocated VFS */
4480 int nName;
4481 int rc = SQLITE_OK;
4482
4483 int nByte;
4484 nName = strlen(zName);
4485 nByte = sizeof(rbu_vfs) + nName + 1;
4486 pNew = (rbu_vfs*)sqlite3_malloc(nByte);
4487 if( pNew==0 ){
4488 rc = SQLITE_NOMEM;
4489 }else{
4490 sqlite3_vfs *pParent; /* Parent VFS */
4491 memset(pNew, 0, nByte);
4492 pParent = sqlite3_vfs_find(zParent);
4493 if( pParent==0 ){
4494 rc = SQLITE_NOTFOUND;
4495 }else{
4496 char *zSpace;
4497 memcpy(&pNew->base, &vfs_template, sizeof(sqlite3_vfs));
4498 pNew->base.mxPathname = pParent->mxPathname;
4499 pNew->base.szOsFile = sizeof(rbu_file) + pParent->szOsFile;
4500 pNew->pRealVfs = pParent;
4501 pNew->base.zName = (const char*)(zSpace = (char*)&pNew[1]);
4502 memcpy(zSpace, zName, nName);
4503
4504 /* Allocate the mutex and register the new VFS (not as the default) */
4505 pNew->mutex = sqlite3_mutex_alloc(SQLITE_MUTEX_RECURSIVE);
4506 if( pNew->mutex==0 ){
4507 rc = SQLITE_NOMEM;
4508 }else{
4509 rc = sqlite3_vfs_register(&pNew->base, 0);
4510 }
4511 }
4512
4513 if( rc!=SQLITE_OK ){
4514 sqlite3_mutex_free(pNew->mutex);
4515 sqlite3_free(pNew);
4516 }
4517 }
4518
4519 return rc;
4520 }
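Reviewer note: sqlite3rbu_create_vfs() stores the VFS name in the same allocation as the rbu_vfs object, immediately after the struct, and relies on the earlier memset() for the terminator. A minimal sketch of that layout with a hypothetical `NamedObj` type and calloc() in place of sqlite3_malloc() plus memset():

```c
#include <stdlib.h>
#include <string.h>

typedef struct NamedObj {
  const char *zName;   /* Points into the same allocation, after the struct */
  int payload;         /* Placeholder for the object's other fields */
} NamedObj;

/* Allocate an object and a copy of zName in one block. A single free()
** on the returned pointer releases both. */
static NamedObj *named_obj_new(const char *zName){
  size_t nName = strlen(zName);
  NamedObj *p = calloc(1, sizeof(NamedObj) + nName + 1);
  if( p ){
    char *zSpace = (char*)&p[1];
    memcpy(zSpace, zName, nName);  /* calloc() already zeroed the terminator */
    p->zName = zSpace;
  }
  return p;
}
```

Keeping the name inside the object means the destroy path never has to track a second pointer.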
4521
4522
4523 /**************************************************************************/
4524
4525 #endif /* !defined(SQLITE_CORE) || defined(SQLITE_ENABLE_RBU) */
4526
4527 /************** End of sqlite3rbu.c ******************************************/
4528 /************** Begin file dbstat.c ******************************************/
4529 /*
4530 ** 2010 July 12
4531 **
4532 ** The author disclaims copyright to this source code. In place of
4533 ** a legal notice, here is a blessing:
4534 **
4535 ** May you do good and not evil.
4536 ** May you find forgiveness for yourself and forgive others.
4537 ** May you share freely, never taking more than you give.
4538 **
4539 ******************************************************************************
4540 **
4541 ** This file contains an implementation of the "dbstat" virtual table.
4542 **
4543 ** The dbstat virtual table is used to extract low-level formatting
4544 ** information from an SQLite database in order to implement the
4545 ** "sqlite3_analyzer" utility. See the ../tool/spaceanal.tcl script
4546 ** for an example implementation.
4547 **
4548 ** Additional information is available on the "dbstat.html" page of the
4549 ** official SQLite documentation.
4550 */
4551
4552 /* #include "sqliteInt.h" ** Requires access to internal data structures ** */
4553 #if (defined(SQLITE_ENABLE_DBSTAT_VTAB) || defined(SQLITE_TEST)) \
4554 && !defined(SQLITE_OMIT_VIRTUALTABLE)
4555
4556 /*
4557 ** Page paths:
4558 **
4559 ** The value of the 'path' column describes the path taken from the
4560 ** root-node of the b-tree structure to each page. The value of the
4561 ** root-node path is '/'.
4562 **
4563 ** The value of the path for the left-most child page of the root of
4564 ** a b-tree is '/000/'. (Btrees store content ordered from left to right
4565 ** so the pages to the left have smaller keys than the pages to the right.)
4566 ** The next to left-most child of the root page is
4567 ** '/001/', and so on, each sibling page identified by a 3-digit hex
4568 ** value. The children of the 451st left-most sibling have paths such
4569 ** as '/1c2/000/', '/1c2/001/' etc.
4570 **
4571 ** Overflow pages are specified by appending a '+' character and a
4572 ** six-digit hexadecimal value to the path to the cell they are linked
4573 ** from. For example, the three overflow pages in a chain linked from
4574 ** the left-most cell of the 451st child of the root page are identified
4575 ** by the paths:
4576 **
4577 ** '/1c2/000+000000' // First page in overflow chain
4578 ** '/1c2/000+000001' // Second page in overflow chain
4579 ** '/1c2/000+000002' // Third page in overflow chain
4580 **
4581 ** If the paths are sorted using the BINARY collation sequence, then
4582 ** the overflow pages associated with a cell will appear earlier in the
4583 ** sort-order than its child page:
4584 **
4585 ** '/1c2/000/' // Left-most child of 451st child of root
4586 */
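The path scheme described in the comment above can be sketched in plain C. The helpers `child_path` and `overflow_path` are hypothetical names introduced for illustration and are not part of dbstat.c. They reuse the "%s%.3x/" and "%s%.3x+%.6x" format strings that statNext() passes to sqlite3_mprintf(), but call snprintf() from the standard library so the sketch compiles on its own.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: path of child number iCell below the page at zParent. */
static void child_path(char *zOut, size_t nOut, const char *zParent, unsigned iCell){
  snprintf(zOut, nOut, "%s%.3x/", zParent, iCell);
}

/* Hypothetical helper: path of overflow page iOvfl in the chain linked
** from cell iCell of the page at zParent. */
static void overflow_path(char *zOut, size_t nOut, const char *zParent,
                          unsigned iCell, unsigned iOvfl){
  snprintf(zOut, nOut, "%s%.3x+%.6x", zParent, iCell, iOvfl);
}
```

Because '+' (0x2B) compares lower than '/' (0x2F) in a byte-wise comparison, an overflow path built this way sorts before the child path with the same prefix, which matches the BINARY collation remark in the comment.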
4587 #define VTAB_SCHEMA \
4588 "CREATE TABLE xx( " \
4589 " name STRING, /* Name of table or index */" \
4590 " path INTEGER, /* Path to page from root */" \
4591 " pageno INTEGER, /* Page number */" \
4592 " pagetype STRING, /* 'internal', 'leaf' or 'overflow' */" \
4593 " ncell INTEGER, /* Cells on page (0 for overflow) */" \
4594 " payload INTEGER, /* Bytes of payload on this page */" \
4595 " unused INTEGER, /* Bytes of unused space on this page */" \
4596 " mx_payload INTEGER, /* Largest payload size of all cells */" \
4597 " pgoffset INTEGER, /* Offset of page in file */" \
4598 " pgsize INTEGER, /* Size of the page */" \
4599 " schema TEXT HIDDEN /* Database schema being analyzed */" \
4600 ");"
4601
4602
4603 typedef struct StatTable StatTable;
4604 typedef struct StatCursor StatCursor;
4605 typedef struct StatPage StatPage;
4606 typedef struct StatCell StatCell;
4607
4608 struct StatCell {
4609 int nLocal; /* Bytes of local payload */
4610 u32 iChildPg; /* Child node (or 0 if this is a leaf) */
4611 int nOvfl; /* Entries in aOvfl[] */
4612 u32 *aOvfl; /* Array of overflow page numbers */
4613 int nLastOvfl; /* Bytes of payload on final overflow page */
4614 int iOvfl; /* Iterates through aOvfl[] */
4615 };
4616
4617 struct StatPage {
4618 u32 iPgno;
4619 DbPage *pPg;
4620 int iCell;
4621
4622 char *zPath; /* Path to this page */
4623
4624 /* Variables populated by statDecodePage(): */
4625 u8 flags; /* Copy of flags byte */
4626 int nCell; /* Number of cells on page */
4627 int nUnused; /* Number of unused bytes on page */
4628 StatCell *aCell; /* Array of parsed cells */
4629 u32 iRightChildPg; /* Right-child page number (or 0) */
4630 int nMxPayload; /* Largest payload of any cell on this page */
4631 };
4632
4633 struct StatCursor {
4634 sqlite3_vtab_cursor base;
4635 sqlite3_stmt *pStmt; /* Iterates through set of root pages */
4636 int isEof; /* After pStmt has returned SQLITE_DONE */
4637 int iDb; /* Schema used for this query */
4638
4639 StatPage aPage[32];
4640 int iPage; /* Current entry in aPage[] */
4641
4642 /* Values to return. */
4643 char *zName; /* Value of 'name' column */
4644 char *zPath; /* Value of 'path' column */
4645 u32 iPageno; /* Value of 'pageno' column */
4646 char *zPagetype; /* Value of 'pagetype' column */
4647 int nCell; /* Value of 'ncell' column */
4648 int nPayload; /* Value of 'payload' column */
4649 int nUnused; /* Value of 'unused' column */
4650 int nMxPayload; /* Value of 'mx_payload' column */
4651 i64 iOffset; /* Value of 'pgoffset' column */
4652 int szPage; /* Value of 'pgsize' column */
4653 };
4654
4655 struct StatTable {
4656 sqlite3_vtab base;
4657 sqlite3 *db;
4658 int iDb; /* Index of database to analyze */
4659 };
4660
4661 #ifndef get2byte
4662 # define get2byte(x) ((x)[0]<<8 | (x)[1])
4663 #endif
4664
4665 /*
4666 ** Connect to or create a dbstat virtual table.
4667 */
4668 static int statConnect(
4669 sqlite3 *db,
4670 void *pAux,
4671 int argc, const char *const*argv,
4672 sqlite3_vtab **ppVtab,
4673 char **pzErr
4674 ){
4675 StatTable *pTab = 0;
4676 int rc = SQLITE_OK;
4677 int iDb;
4678
4679 if( argc>=4 ){
4680 iDb = sqlite3FindDbName(db, argv[3]);
4681 if( iDb<0 ){
4682 *pzErr = sqlite3_mprintf("no such database: %s", argv[3]);
4683 return SQLITE_ERROR;
4684 }
4685 }else{
4686 iDb = 0;
4687 }
4688 rc = sqlite3_declare_vtab(db, VTAB_SCHEMA);
4689 if( rc==SQLITE_OK ){
4690 pTab = (StatTable *)sqlite3_malloc64(sizeof(StatTable));
4691 if( pTab==0 ) rc = SQLITE_NOMEM;
4692 }
4693
4694 assert( rc==SQLITE_OK || pTab==0 );
4695 if( rc==SQLITE_OK ){
4696 memset(pTab, 0, sizeof(StatTable));
4697 pTab->db = db;
4698 pTab->iDb = iDb;
4699 }
4700
4701 *ppVtab = (sqlite3_vtab*)pTab;
4702 return rc;
4703 }
4704
4705 /*
4706 ** Disconnect from or destroy a dbstat virtual table.
4707 */
4708 static int statDisconnect(sqlite3_vtab *pVtab){
4709 sqlite3_free(pVtab);
4710 return SQLITE_OK;
4711 }
4712
4713 /*
4714 ** There is no "best-index". This virtual table always does a linear
4715 ** scan. However, a schema=? constraint should cause this table to
4716 ** operate on a different database schema, so check for it.
4717 **
4718 ** idxNum is normally 0, but will be 1 if a schema=? constraint exists.
4719 */
4720 static int statBestIndex(sqlite3_vtab *tab, sqlite3_index_info *pIdxInfo){
4721 int i;
4722
4723 pIdxInfo->estimatedCost = 1.0e6; /* Initial cost estimate */
4724
4725 /* Look for a valid schema=? constraint. If found, change the idxNum to
4726 ** 1 and request the value of that constraint be sent to xFilter. And
4727 ** lower the cost estimate to encourage the constrained version to be
4728 ** used.
4729 */
4730 for(i=0; i<pIdxInfo->nConstraint; i++){
4731 if( pIdxInfo->aConstraint[i].usable==0 ) continue;
4732 if( pIdxInfo->aConstraint[i].op!=SQLITE_INDEX_CONSTRAINT_EQ ) continue;
4733 if( pIdxInfo->aConstraint[i].iColumn!=10 ) continue;
4734 pIdxInfo->idxNum = 1;
4735 pIdxInfo->estimatedCost = 1.0;
4736 pIdxInfo->aConstraintUsage[i].argvIndex = 1;
4737 pIdxInfo->aConstraintUsage[i].omit = 1;
4738 break;
4739 }
4740
4741
4742 /* Records are always returned in ascending order of (name, path).
4743 ** If this will satisfy the client, set the orderByConsumed flag so that
4744 ** SQLite does not do an external sort.
4745 */
4746 if( ( pIdxInfo->nOrderBy==1
4747 && pIdxInfo->aOrderBy[0].iColumn==0
4748 && pIdxInfo->aOrderBy[0].desc==0
4749 ) ||
4750 ( pIdxInfo->nOrderBy==2
4751 && pIdxInfo->aOrderBy[0].iColumn==0
4752 && pIdxInfo->aOrderBy[0].desc==0
4753 && pIdxInfo->aOrderBy[1].iColumn==1
4754 && pIdxInfo->aOrderBy[1].desc==0
4755 )
4756 ){
4757 pIdxInfo->orderByConsumed = 1;
4758 }
4759
4760 return SQLITE_OK;
4761 }
4762
4763 /*
4764 ** Open a new dbstat cursor.
4765 */
4766 static int statOpen(sqlite3_vtab *pVTab, sqlite3_vtab_cursor **ppCursor){
4767 StatTable *pTab = (StatTable *)pVTab;
4768 StatCursor *pCsr;
4769
4770 pCsr = (StatCursor *)sqlite3_malloc64(sizeof(StatCursor));
4771 if( pCsr==0 ){
4772 return SQLITE_NOMEM;
4773 }else{
4774 memset(pCsr, 0, sizeof(StatCursor));
4775 pCsr->base.pVtab = pVTab;
4776 pCsr->iDb = pTab->iDb;
4777 }
4778
4779 *ppCursor = (sqlite3_vtab_cursor *)pCsr;
4780 return SQLITE_OK;
4781 }
4782
4783 static void statClearPage(StatPage *p){
4784 int i;
4785 if( p->aCell ){
4786 for(i=0; i<p->nCell; i++){
4787 sqlite3_free(p->aCell[i].aOvfl);
4788 }
4789 sqlite3_free(p->aCell);
4790 }
4791 sqlite3PagerUnref(p->pPg);
4792 sqlite3_free(p->zPath);
4793 memset(p, 0, sizeof(StatPage));
4794 }
4795
4796 static void statResetCsr(StatCursor *pCsr){
4797 int i;
4798 sqlite3_reset(pCsr->pStmt);
4799 for(i=0; i<ArraySize(pCsr->aPage); i++){
4800 statClearPage(&pCsr->aPage[i]);
4801 }
4802 pCsr->iPage = 0;
4803 sqlite3_free(pCsr->zPath);
4804 pCsr->zPath = 0;
4805 pCsr->isEof = 0;
4806 }
4807
4808 /*
4809 ** Close a dbstat cursor.
4810 */
4811 static int statClose(sqlite3_vtab_cursor *pCursor){
4812 StatCursor *pCsr = (StatCursor *)pCursor;
4813 statResetCsr(pCsr);
4814 sqlite3_finalize(pCsr->pStmt);
4815 sqlite3_free(pCsr);
4816 return SQLITE_OK;
4817 }
4818
4819 static void getLocalPayload(
4820 int nUsable, /* Usable bytes per page */
4821 u8 flags, /* Page flags */
4822 int nTotal, /* Total record (payload) size */
4823 int *pnLocal /* OUT: Bytes stored locally */
4824 ){
4825 int nLocal;
4826 int nMinLocal;
4827 int nMaxLocal;
4828
4829 if( flags==0x0D ){ /* Table leaf node */
4830 nMinLocal = (nUsable - 12) * 32 / 255 - 23;
4831 nMaxLocal = nUsable - 35;
4832 }else{ /* Index interior and leaf nodes */
4833 nMinLocal = (nUsable - 12) * 32 / 255 - 23;
4834 nMaxLocal = (nUsable - 12) * 64 / 255 - 23;
4835 }
4836
4837 nLocal = nMinLocal + (nTotal - nMinLocal) % (nUsable - 4);
4838 if( nLocal>nMaxLocal ) nLocal = nMinLocal;
4839 *pnLocal = nLocal;
4840 }
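The arithmetic in getLocalPayload() is easier to follow with concrete numbers. The sketch below is a stand-alone copy of the same formulas; `local_payload` and `overflow_pages` are hypothetical names, and the 4096-byte usable size used in the note after the block is an assumption chosen for illustration. The second helper repeats the overflow page count expression that statDecodePage() uses.

```c
/* Hypothetical stand-alone copy of the arithmetic in getLocalPayload().
** flags==0x0D selects the table-leaf limits; any other value selects the
** index limits, as in the original. */
static int local_payload(int nUsable, unsigned char flags, int nTotal){
  int nMinLocal = (nUsable - 12) * 32 / 255 - 23;
  int nMaxLocal = (flags==0x0D) ? nUsable - 35
                                : (nUsable - 12) * 64 / 255 - 23;
  int nLocal = nMinLocal + (nTotal - nMinLocal) % (nUsable - 4);
  return nLocal>nMaxLocal ? nMinLocal : nLocal;
}

/* Number of overflow pages needed for the bytes that are not stored
** locally. Each overflow page holds nUsable-4 bytes of payload. */
static int overflow_pages(int nUsable, int nTotal, int nLocal){
  return ((nTotal - nLocal) + nUsable-4 - 1) / (nUsable - 4);
}
```

With nUsable equal to 4096, a 10000-byte record on a table leaf keeps 1816 bytes locally and spills the remaining 8184 bytes onto exactly two overflow pages of 4092 bytes each. A 100-byte record stays entirely local.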
4841
4842 static int statDecodePage(Btree *pBt, StatPage *p){
4843 int nUnused;
4844 int iOff;
4845 int nHdr;
4846 int isLeaf;
4847 int szPage;
4848
4849 u8 *aData = sqlite3PagerGetData(p->pPg);
4850 u8 *aHdr = &aData[p->iPgno==1 ? 100 : 0];
4851
4852 p->flags = aHdr[0];
4853 p->nCell = get2byte(&aHdr[3]);
4854 p->nMxPayload = 0;
4855
4856 isLeaf = (p->flags==0x0A || p->flags==0x0D);
4857 nHdr = 12 - isLeaf*4 + (p->iPgno==1)*100;
4858
4859 nUnused = get2byte(&aHdr[5]) - nHdr - 2*p->nCell;
4860 nUnused += (int)aHdr[7];
4861 iOff = get2byte(&aHdr[1]);
4862 while( iOff ){
4863 nUnused += get2byte(&aData[iOff+2]);
4864 iOff = get2byte(&aData[iOff]);
4865 }
4866 p->nUnused = nUnused;
4867 p->iRightChildPg = isLeaf ? 0 : sqlite3Get4byte(&aHdr[8]);
4868 szPage = sqlite3BtreeGetPageSize(pBt);
4869
4870 if( p->nCell ){
4871 int i; /* Used to iterate through cells */
4872 int nUsable; /* Usable bytes per page */
4873
4874 sqlite3BtreeEnter(pBt);
4875 nUsable = szPage - sqlite3BtreeGetReserveNoMutex(pBt);
4876 sqlite3BtreeLeave(pBt);
4877 p->aCell = sqlite3_malloc64((p->nCell+1) * sizeof(StatCell));
4878 if( p->aCell==0 ) return SQLITE_NOMEM;
4879 memset(p->aCell, 0, (p->nCell+1) * sizeof(StatCell));
4880
4881 for(i=0; i<p->nCell; i++){
4882 StatCell *pCell = &p->aCell[i];
4883
4884 iOff = get2byte(&aData[nHdr+i*2]);
4885 if( !isLeaf ){
4886 pCell->iChildPg = sqlite3Get4byte(&aData[iOff]);
4887 iOff += 4;
4888 }
4889 if( p->flags==0x05 ){
4890 /* A table interior node. nPayload==0. */
4891 }else{
4892 u32 nPayload; /* Bytes of payload total (local+overflow) */
4893 int nLocal; /* Bytes of payload stored locally */
4894 iOff += getVarint32(&aData[iOff], nPayload);
4895 if( p->flags==0x0D ){
4896 u64 dummy;
4897 iOff += sqlite3GetVarint(&aData[iOff], &dummy);
4898 }
4899 if( nPayload>(u32)p->nMxPayload ) p->nMxPayload = nPayload;
4900 getLocalPayload(nUsable, p->flags, nPayload, &nLocal);
4901 pCell->nLocal = nLocal;
4902 assert( nLocal>=0 );
4903 assert( nPayload>=(u32)nLocal );
4904 assert( nLocal<=(nUsable-35) );
4905 if( nPayload>(u32)nLocal ){
4906 int j;
4907 int nOvfl = ((nPayload - nLocal) + nUsable-4 - 1) / (nUsable - 4);
4908 pCell->nLastOvfl = (nPayload-nLocal) - (nOvfl-1) * (nUsable-4);
4909 pCell->nOvfl = nOvfl;
4910 pCell->aOvfl = sqlite3_malloc64(sizeof(u32)*nOvfl);
4911 if( pCell->aOvfl==0 ) return SQLITE_NOMEM;
4912 pCell->aOvfl[0] = sqlite3Get4byte(&aData[iOff+nLocal]);
4913 for(j=1; j<nOvfl; j++){
4914 int rc;
4915 u32 iPrev = pCell->aOvfl[j-1];
4916 DbPage *pPg = 0;
4917 rc = sqlite3PagerGet(sqlite3BtreePager(pBt), iPrev, &pPg, 0);
4918 if( rc!=SQLITE_OK ){
4919 assert( pPg==0 );
4920 return rc;
4921 }
4922 pCell->aOvfl[j] = sqlite3Get4byte(sqlite3PagerGetData(pPg));
4923 sqlite3PagerUnref(pPg);
4924 }
4925 }
4926 }
4927 }
4928 }
4929
4930 return SQLITE_OK;
4931 }
4932
4933 /*
4934 ** Populate the pCsr->iOffset and pCsr->szPage member variables based on
4935 ** the current value of pCsr->iPageno.
4936 */
4937 static void statSizeAndOffset(StatCursor *pCsr){
4938 StatTable *pTab = (StatTable *)((sqlite3_vtab_cursor *)pCsr)->pVtab;
4939 Btree *pBt = pTab->db->aDb[pTab->iDb].pBt;
4940 Pager *pPager = sqlite3BtreePager(pBt);
4941 sqlite3_file *fd;
4942 sqlite3_int64 x[2];
4943
4944 /* The default page size and offset */
4945 pCsr->szPage = sqlite3BtreeGetPageSize(pBt);
4946 pCsr->iOffset = (i64)pCsr->szPage * (pCsr->iPageno - 1);
4947
4948 /* If connected to a ZIPVFS backend, override the page size and
4949 ** offset with actual values obtained from ZIPVFS.
4950 */
4951 fd = sqlite3PagerFile(pPager);
4952 x[0] = pCsr->iPageno;
4953 if( fd->pMethods!=0 && sqlite3OsFileControl(fd, 230440, &x)==SQLITE_OK ){
4954 pCsr->iOffset = x[0];
4955 pCsr->szPage = (int)x[1];
4956 }
4957 }
4958
4959 /*
4960 ** Move a dbstat cursor to the next entry in the file.
4961 */
4962 static int statNext(sqlite3_vtab_cursor *pCursor){
4963 int rc;
4964 int nPayload;
4965 char *z;
4966 StatCursor *pCsr = (StatCursor *)pCursor;
4967 StatTable *pTab = (StatTable *)pCursor->pVtab;
4968 Btree *pBt = pTab->db->aDb[pCsr->iDb].pBt;
4969 Pager *pPager = sqlite3BtreePager(pBt);
4970
4971 sqlite3_free(pCsr->zPath);
4972 pCsr->zPath = 0;
4973
4974 statNextRestart:
4975 if( pCsr->aPage[0].pPg==0 ){
4976 rc = sqlite3_step(pCsr->pStmt);
4977 if( rc==SQLITE_ROW ){
4978 int nPage;
4979 u32 iRoot = (u32)sqlite3_column_int64(pCsr->pStmt, 1);
4980 sqlite3PagerPagecount(pPager, &nPage);
4981 if( nPage==0 ){
4982 pCsr->isEof = 1;
4983 return sqlite3_reset(pCsr->pStmt);
4984 }
4985 rc = sqlite3PagerGet(pPager, iRoot, &pCsr->aPage[0].pPg, 0);
4986 pCsr->aPage[0].iPgno = iRoot;
4987 pCsr->aPage[0].iCell = 0;
4988 pCsr->aPage[0].zPath = z = sqlite3_mprintf("/");
4989 pCsr->iPage = 0;
4990 if( z==0 ) rc = SQLITE_NOMEM;
4991 }else{
4992 pCsr->isEof = 1;
4993 return sqlite3_reset(pCsr->pStmt);
4994 }
4995 }else{
4996
4997 /* Page p itself has already been visited. */
4998 StatPage *p = &pCsr->aPage[pCsr->iPage];
4999
5000 while( p->iCell<p->nCell ){
5001 StatCell *pCell = &p->aCell[p->iCell];
5002 if( pCell->iOvfl<pCell->nOvfl ){
5003 int nUsable;
5004 sqlite3BtreeEnter(pBt);
5005 nUsable = sqlite3BtreeGetPageSize(pBt) -
5006 sqlite3BtreeGetReserveNoMutex(pBt);
5007 sqlite3BtreeLeave(pBt);
5008 pCsr->zName = (char *)sqlite3_column_text(pCsr->pStmt, 0);
5009 pCsr->iPageno = pCell->aOvfl[pCell->iOvfl];
5010 pCsr->zPagetype = "overflow";
5011 pCsr->nCell = 0;
5012 pCsr->nMxPayload = 0;
5013 pCsr->zPath = z = sqlite3_mprintf(
5014 "%s%.3x+%.6x", p->zPath, p->iCell, pCell->iOvfl
5015 );
5016 if( pCell->iOvfl<pCell->nOvfl-1 ){
5017 pCsr->nUnused = 0;
5018 pCsr->nPayload = nUsable - 4;
5019 }else{
5020 pCsr->nPayload = pCell->nLastOvfl;
5021 pCsr->nUnused = nUsable - 4 - pCsr->nPayload;
5022 }
5023 pCell->iOvfl++;
5024 statSizeAndOffset(pCsr);
5025 return z==0 ? SQLITE_NOMEM : SQLITE_OK;
5026 }
5027 if( p->iRightChildPg ) break;
5028 p->iCell++;
5029 }
5030
5031 if( !p->iRightChildPg || p->iCell>p->nCell ){
5032 statClearPage(p);
5033 if( pCsr->iPage==0 ) return statNext(pCursor);
5034 pCsr->iPage--;
5035 goto statNextRestart; /* Tail recursion */
5036 }
5037 pCsr->iPage++;
5038 assert( p==&pCsr->aPage[pCsr->iPage-1] );
5039
5040 if( p->iCell==p->nCell ){
5041 p[1].iPgno = p->iRightChildPg;
5042 }else{
5043 p[1].iPgno = p->aCell[p->iCell].iChildPg;
5044 }
5045 rc = sqlite3PagerGet(pPager, p[1].iPgno, &p[1].pPg, 0);
5046 p[1].iCell = 0;
5047 p[1].zPath = z = sqlite3_mprintf("%s%.3x/", p->zPath, p->iCell);
5048 p->iCell++;
5049 if( z==0 ) rc = SQLITE_NOMEM;
5050 }
5051
5052
5053 /* Populate the StatCursor fields with the values to be returned
5054 ** by the xColumn() and xRowid() methods.
5055 */
5056 if( rc==SQLITE_OK ){
5057 int i;
5058 StatPage *p = &pCsr->aPage[pCsr->iPage];
5059 pCsr->zName = (char *)sqlite3_column_text(pCsr->pStmt, 0);
5060 pCsr->iPageno = p->iPgno;
5061
5062 rc = statDecodePage(pBt, p);
5063 if( rc==SQLITE_OK ){
5064 statSizeAndOffset(pCsr);
5065
5066 switch( p->flags ){
5067 case 0x05: /* table internal */
5068 case 0x02: /* index internal */
5069 pCsr->zPagetype = "internal";
5070 break;
5071 case 0x0D: /* table leaf */
5072 case 0x0A: /* index leaf */
5073 pCsr->zPagetype = "leaf";
5074 break;
5075 default:
5076 pCsr->zPagetype = "corrupted";
5077 break;
5078 }
5079 pCsr->nCell = p->nCell;
5080 pCsr->nUnused = p->nUnused;
5081 pCsr->nMxPayload = p->nMxPayload;
5082 pCsr->zPath = z = sqlite3_mprintf("%s", p->zPath);
5083 if( z==0 ) rc = SQLITE_NOMEM;
5084 nPayload = 0;
5085 for(i=0; i<p->nCell; i++){
5086 nPayload += p->aCell[i].nLocal;
5087 }
5088 pCsr->nPayload = nPayload;
5089 }
5090 }
5091
5092 return rc;
5093 }
5094
5095 static int statEof(sqlite3_vtab_cursor *pCursor){
5096 StatCursor *pCsr = (StatCursor *)pCursor;
5097 return pCsr->isEof;
5098 }
5099
5100 static int statFilter(
5101 sqlite3_vtab_cursor *pCursor,
5102 int idxNum, const char *idxStr,
5103 int argc, sqlite3_value **argv
5104 ){
5105 StatCursor *pCsr = (StatCursor *)pCursor;
5106 StatTable *pTab = (StatTable*)(pCursor->pVtab);
5107 char *zSql;
5108 int rc = SQLITE_OK;
5109 char *zMaster;
5110
5111 if( idxNum==1 ){
5112 const char *zDbase = (const char*)sqlite3_value_text(argv[0]);
5113 pCsr->iDb = sqlite3FindDbName(pTab->db, zDbase);
5114 if( pCsr->iDb<0 ){
5115 sqlite3_free(pCursor->pVtab->zErrMsg);
5116 pCursor->pVtab->zErrMsg = sqlite3_mprintf("no such schema: %s", zDbase);
5117 return pCursor->pVtab->zErrMsg ? SQLITE_ERROR : SQLITE_NOMEM;
5118 }
5119 }else{
5120 pCsr->iDb = pTab->iDb;
5121 }
5122 statResetCsr(pCsr);
5123 sqlite3_finalize(pCsr->pStmt);
5124 pCsr->pStmt = 0;
5125 zMaster = pCsr->iDb==1 ? "sqlite_temp_master" : "sqlite_master";
5126 zSql = sqlite3_mprintf(
5127 "SELECT 'sqlite_master' AS name, 1 AS rootpage, 'table' AS type"
5128 " UNION ALL "
5129 "SELECT name, rootpage, type"
5130 " FROM \"%w\".%s WHERE rootpage!=0"
5131 " ORDER BY name", pTab->db->aDb[pCsr->iDb].zName, zMaster);
5132 if( zSql==0 ){
5133 return SQLITE_NOMEM;
5134 }else{
5135 rc = sqlite3_prepare_v2(pTab->db, zSql, -1, &pCsr->pStmt, 0);
5136 sqlite3_free(zSql);
5137 }
5138
5139 if( rc==SQLITE_OK ){
5140 rc = statNext(pCursor);
5141 }
5142 return rc;
5143 }
5144
5145 static int statColumn(
5146 sqlite3_vtab_cursor *pCursor,
5147 sqlite3_context *ctx,
5148 int i
5149 ){
5150 StatCursor *pCsr = (StatCursor *)pCursor;
5151 switch( i ){
5152 case 0: /* name */
5153 sqlite3_result_text(ctx, pCsr->zName, -1, SQLITE_TRANSIENT);
5154 break;
5155 case 1: /* path */
5156 sqlite3_result_text(ctx, pCsr->zPath, -1, SQLITE_TRANSIENT);
5157 break;
5158 case 2: /* pageno */
5159 sqlite3_result_int64(ctx, pCsr->iPageno);
5160 break;
5161 case 3: /* pagetype */
5162 sqlite3_result_text(ctx, pCsr->zPagetype, -1, SQLITE_STATIC);
5163 break;
5164 case 4: /* ncell */
5165 sqlite3_result_int(ctx, pCsr->nCell);
5166 break;
5167 case 5: /* payload */
5168 sqlite3_result_int(ctx, pCsr->nPayload);
5169 break;
5170 case 6: /* unused */
5171 sqlite3_result_int(ctx, pCsr->nUnused);
5172 break;
5173 case 7: /* mx_payload */
5174 sqlite3_result_int(ctx, pCsr->nMxPayload);
5175 break;
5176 case 8: /* pgoffset */
5177 sqlite3_result_int64(ctx, pCsr->iOffset);
5178 break;
5179 case 9: /* pgsize */
5180 sqlite3_result_int(ctx, pCsr->szPage);
5181 break;
5182 default: { /* schema */
5183 sqlite3 *db = sqlite3_context_db_handle(ctx);
5184 int iDb = pCsr->iDb;
5185 sqlite3_result_text(ctx, db->aDb[iDb].zName, -1, SQLITE_STATIC);
5186 break;
5187 }
5188 }
5189 return SQLITE_OK;
5190 }
5191
5192 static int statRowid(sqlite3_vtab_cursor *pCursor, sqlite_int64 *pRowid){
5193 StatCursor *pCsr = (StatCursor *)pCursor;
5194 *pRowid = pCsr->iPageno;
5195 return SQLITE_OK;
5196 }
5197
5198 /*
5199 ** Invoke this routine to register the "dbstat" virtual table module
5200 */
5201 SQLITE_PRIVATE int sqlite3DbstatRegister(sqlite3 *db){
5202 static sqlite3_module dbstat_module = {
5203 0, /* iVersion */
5204 statConnect, /* xCreate */
5205 statConnect, /* xConnect */
5206 statBestIndex, /* xBestIndex */
5207 statDisconnect, /* xDisconnect */
5208 statDisconnect, /* xDestroy */
5209 statOpen, /* xOpen - open a cursor */
5210 statClose, /* xClose - close a cursor */
5211 statFilter, /* xFilter - configure scan constraints */
5212 statNext, /* xNext - advance a cursor */
5213 statEof, /* xEof - check for end of scan */
5214 statColumn, /* xColumn - read data */
5215 statRowid, /* xRowid - read data */
5216 0, /* xUpdate */
5217 0, /* xBegin */
5218 0, /* xSync */
5219 0, /* xCommit */
5220 0, /* xRollback */
5221 0, /* xFindMethod */
5222 0, /* xRename */
5223 };
5224 return sqlite3_create_module(db, "dbstat", &dbstat_module, 0);
5225 }
5226 #elif defined(SQLITE_ENABLE_DBSTAT_VTAB)
5227 SQLITE_PRIVATE int sqlite3DbstatRegister(sqlite3 *db){ return SQLITE_OK; }
5228 #endif /* SQLITE_ENABLE_DBSTAT_VTAB */
5229
5230 /************** End of dbstat.c **********************************************/
5231 /************** Begin file json1.c *******************************************/
5232 /*
5233 ** 2015-08-12
5234 **
5235 ** The author disclaims copyright to this source code. In place of
5236 ** a legal notice, here is a blessing:
5237 **
5238 ** May you do good and not evil.
5239 ** May you find forgiveness for yourself and forgive others.
5240 ** May you share freely, never taking more than you give.
5241 **
5242 ******************************************************************************
5243 **
5244 ** This SQLite extension implements JSON functions. The interface is
5245 ** modeled after MySQL JSON functions:
5246 **
5247 ** https://dev.mysql.com/doc/refman/5.7/en/json.html
5248 **
5249 ** For the time being, all JSON is stored as pure text. (We might add
5250 ** a JSONB type in the future which stores a binary encoding of JSON in
5251 ** a BLOB, but there is no support for JSONB in the current implementation.
5252 ** This implementation parses JSON text at 250 MB/s, so it is hard to see
5253 ** how JSONB might improve on that.)
5254 */
5255 #if !defined(SQLITE_CORE) || defined(SQLITE_ENABLE_JSON1)
5256 #if !defined(_SQLITEINT_H_)
5257 /* #include "sqlite3ext.h" */
5258 #endif
5259 SQLITE_EXTENSION_INIT1
5260 /* #include <assert.h> */
5261 /* #include <string.h> */
5262 /* #include <stdlib.h> */
5263 /* #include <stdarg.h> */
5264
5265 #define UNUSED_PARAM(X) (void)(X)
5266
5267 #ifndef LARGEST_INT64
5268 # define LARGEST_INT64 (0xffffffff|(((sqlite3_int64)0x7fffffff)<<32))
5269 # define SMALLEST_INT64 (((sqlite3_int64)-1) - LARGEST_INT64)
5270 #endif
5271
5272 /*
5273 ** Versions of isspace(), isalnum() and isdigit() to which it is safe
5274 ** to pass signed char values.
5275 */
5276 #ifdef sqlite3Isdigit
5277 /* Use the SQLite core versions if this routine is part of the
5278 ** SQLite amalgamation */
5279 # define safe_isdigit(x) sqlite3Isdigit(x)
5280 # define safe_isalnum(x) sqlite3Isalnum(x)
5281 #else
5282 /* Use the standard library for separate compilation */
5283 #include <ctype.h> /* amalgamator: keep */
5284 # define safe_isdigit(x) isdigit((unsigned char)(x))
5285 # define safe_isalnum(x) isalnum((unsigned char)(x))
5286 #endif
5287
5288 /*
5289 ** Growing our own isspace() routine this way is twice as fast as
5290 ** the library isspace() function, resulting in a 7% overall performance
5291 ** increase for the parser. (Ubuntu14.10 gcc 4.8.4 x64 with -Os).
5292 */
5293 static const char jsonIsSpace[] = {
5294 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0,
5295 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5296 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5297 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5298 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5299 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5300 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5301 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5302 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5303 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5304 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5305 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5306 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5307 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5308 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5309 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5310 };
5311 #define safe_isspace(x) (jsonIsSpace[(unsigned char)(x)])
5312
5313 #ifndef SQLITE_AMALGAMATION
5314 /* Unsigned integer types. These are already defined in the sqliteInt.h,
5315 ** but the definitions need to be repeated for separate compilation. */
5316 typedef sqlite3_uint64 u64;
5317 typedef unsigned int u32;
5318 typedef unsigned char u8;
5319 #endif
5320
5321 /* Objects */
5322 typedef struct JsonString JsonString;
5323 typedef struct JsonNode JsonNode;
5324 typedef struct JsonParse JsonParse;
5325
5326 /* An instance of this object represents a JSON string
5327 ** under construction. Really, this is a generic string accumulator
5328 ** that can be and is used to create strings other than JSON.
5329 */
5330 struct JsonString {
5331 sqlite3_context *pCtx; /* Function context - put error messages here */
5332 char *zBuf; /* Append JSON content here */
5333 u64 nAlloc; /* Bytes of storage available in zBuf[] */
5334 u64 nUsed; /* Bytes of zBuf[] currently used */
5335 u8 bStatic; /* True if zBuf is static space */
5336 u8 bErr; /* True if an error has been encountered */
5337 char zSpace[100]; /* Initial static space */
5338 };
5339
5340 /* JSON type values
5341 */
5342 #define JSON_NULL 0
5343 #define JSON_TRUE 1
5344 #define JSON_FALSE 2
5345 #define JSON_INT 3
5346 #define JSON_REAL 4
5347 #define JSON_STRING 5
5348 #define JSON_ARRAY 6
5349 #define JSON_OBJECT 7
5350
5351 /* The "subtype" set for JSON values */
5352 #define JSON_SUBTYPE 74 /* Ascii for "J" */
5353
5354 /*
5355 ** Names of the various JSON types:
5356 */
5357 static const char * const jsonType[] = {
5358 "null", "true", "false", "integer", "real", "text", "array", "object"
5359 };
5360
5361 /* Bit values for the JsonNode.jnFlags field
5362 */
5363 #define JNODE_RAW 0x01 /* Content is raw, not JSON encoded */
5364 #define JNODE_ESCAPE 0x02 /* Content is text with \ escapes */
5365 #define JNODE_REMOVE 0x04 /* Do not output */
5366 #define JNODE_REPLACE 0x08 /* Replace with JsonNode.iVal */
5367 #define JNODE_APPEND 0x10 /* More ARRAY/OBJECT entries at u.iAppend */
5368 #define JNODE_LABEL 0x20 /* Is a label of an object */
5369
5370
5371 /* A single node of parsed JSON
5372 */
5373 struct JsonNode {
5374 u8 eType; /* One of the JSON_ type values */
5375 u8 jnFlags; /* JNODE flags */
5376 u8 iVal; /* Replacement value when JNODE_REPLACE */
5377 u32 n; /* Bytes of content, or number of sub-nodes */
5378 union {
5379 const char *zJContent; /* Content for INT, REAL, and STRING */
5380 u32 iAppend; /* More terms for ARRAY and OBJECT */
5381 u32 iKey; /* Key for ARRAY objects in json_tree() */
5382 } u;
5383 };
5384
5385 /* A completely parsed JSON string
5386 */
5387 struct JsonParse {
5388 u32 nNode; /* Number of slots of aNode[] used */
5389 u32 nAlloc; /* Number of slots of aNode[] allocated */
5390 JsonNode *aNode; /* Array of nodes containing the parse */
5391 const char *zJson; /* Original JSON string */
5392 u32 *aUp; /* Index of parent of each node */
5393 u8 oom; /* Set to true if out of memory */
5394 u8 nErr; /* Number of errors seen */
5395 };
5396
5397 /**************************************************************************
5398 ** Utility routines for dealing with JsonString objects
5399 **************************************************************************/
5400
5401 /* Set the JsonString object to an empty string
5402 */
5403 static void jsonZero(JsonString *p){
5404 p->zBuf = p->zSpace;
5405 p->nAlloc = sizeof(p->zSpace);
5406 p->nUsed = 0;
5407 p->bStatic = 1;
5408 }
5409
5410 /* Initialize the JsonString object
5411 */
5412 static void jsonInit(JsonString *p, sqlite3_context *pCtx){
5413 p->pCtx = pCtx;
5414 p->bErr = 0;
5415 jsonZero(p);
5416 }
5417
5418
5419 /* Free all allocated memory and reset the JsonString object back to its
5420 ** initial state.
5421 */
5422 static void jsonReset(JsonString *p){
5423 if( !p->bStatic ) sqlite3_free(p->zBuf);
5424 jsonZero(p);
5425 }
5426
5427
5428 /* Report an out-of-memory (OOM) condition
5429 */
5430 static void jsonOom(JsonString *p){
5431 p->bErr = 1;
5432 sqlite3_result_error_nomem(p->pCtx);
5433 jsonReset(p);
5434 }
5435
5436 /* Enlarge p->zBuf so that it can hold at least N more bytes.
5437 ** Return zero on success. Return non-zero on an OOM error
5438 */
5439 static int jsonGrow(JsonString *p, u32 N){
5440 u64 nTotal = N<p->nAlloc ? p->nAlloc*2 : p->nAlloc+N+10;
5441 char *zNew;
5442 if( p->bStatic ){
5443 if( p->bErr ) return 1;
5444 zNew = sqlite3_malloc64(nTotal);
5445 if( zNew==0 ){
5446 jsonOom(p);
5447 return SQLITE_NOMEM;
5448 }
5449 memcpy(zNew, p->zBuf, (size_t)p->nUsed);
5450 p->zBuf = zNew;
5451 p->bStatic = 0;
5452 }else{
5453 zNew = sqlite3_realloc64(p->zBuf, nTotal);
5454 if( zNew==0 ){
5455 jsonOom(p);
5456 return SQLITE_NOMEM;
5457 }
5458 p->zBuf = zNew;
5459 }
5460 p->nAlloc = nTotal;
5461 return SQLITE_OK;
5462 }
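The growth policy in jsonGrow() comes down to a single expression. The helper below, with the hypothetical name `next_alloc`, isolates that expression so the two branches can be checked by hand. It is only a sketch of the size calculation and leaves out the allocation and error handling.

```c
#include <stdint.h>

/* Hypothetical helper: the new buffer size jsonGrow() would request when
** the buffer currently holds nAlloc bytes and N more bytes are needed.
** A small request doubles the buffer. A request at least as large as the
** buffer adds N plus 10 bytes of slack. */
static uint64_t next_alloc(uint64_t nAlloc, uint32_t N){
  return N<nAlloc ? nAlloc*2 : nAlloc+N+10;
}
```

Starting from the 100-byte zSpace[] buffer, a 1-byte request grows the buffer to 200 bytes, while a 500-byte request grows it to 610 bytes.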
5463
5464 /* Append N bytes from zIn onto the end of the JsonString string.
5465 */
5466 static void jsonAppendRaw(JsonString *p, const char *zIn, u32 N){
5467 if( (N+p->nUsed >= p->nAlloc) && jsonGrow(p,N)!=0 ) return;
5468 memcpy(p->zBuf+p->nUsed, zIn, N);
5469 p->nUsed += N;
5470 }
5471
5472 /* Append formatted text (not to exceed N bytes) to the JsonString.
5473 */
5474 static void jsonPrintf(int N, JsonString *p, const char *zFormat, ...){
5475 va_list ap;
5476 if( (p->nUsed + N >= p->nAlloc) && jsonGrow(p, N) ) return;
5477 va_start(ap, zFormat);
5478 sqlite3_vsnprintf(N, p->zBuf+p->nUsed, zFormat, ap);
5479 va_end(ap);
5480 p->nUsed += (int)strlen(p->zBuf+p->nUsed);
5481 }
5482
5483 /* Append a single character
5484 */
5485 static void jsonAppendChar(JsonString *p, char c){
5486 if( p->nUsed>=p->nAlloc && jsonGrow(p,1)!=0 ) return;
5487 p->zBuf[p->nUsed++] = c;
5488 }
5489
5490 /* Append a comma separator to the output buffer, if the previous
5491 ** character is not '[' or '{'.
5492 */
5493 static void jsonAppendSeparator(JsonString *p){
5494 char c;
5495 if( p->nUsed==0 ) return;
5496 c = p->zBuf[p->nUsed-1];
5497 if( c!='[' && c!='{' ) jsonAppendChar(p, ',');
5498 }
5499
5500 /* Append the N-byte string in zIn to the end of the JsonString string
5501 ** under construction. Enclose the string in "..." and escape
5502 ** any double-quotes or backslash characters contained within the
5503 ** string.
5504 */
5505 static void jsonAppendString(JsonString *p, const char *zIn, u32 N){
5506 u32 i;
5507 if( (N+p->nUsed+2 >= p->nAlloc) && jsonGrow(p,N+2)!=0 ) return;
5508 p->zBuf[p->nUsed++] = '"';
5509 for(i=0; i<N; i++){
5510 char c = zIn[i];
5511 if( c=='"' || c=='\\' ){
5512 if( (p->nUsed+N+3-i > p->nAlloc) && jsonGrow(p,N+3-i)!=0 ) return;
5513 p->zBuf[p->nUsed++] = '\\';
5514 }
5515 p->zBuf[p->nUsed++] = c;
5516 }
5517 p->zBuf[p->nUsed++] = '"';
5518 assert( p->nUsed<p->nAlloc );
5519 }
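The quoting rule implemented by jsonAppendString() can be shown without the JsonString machinery. The sketch below writes into a caller-supplied buffer; `quote_json` is a hypothetical name and the worst-case capacity check is an assumption made here to keep the example simple, in place of the incremental jsonGrow() calls in the original. Only double-quote and backslash are escaped, as the comment above states.

```c
#include <stddef.h>

/* Hypothetical helper: copy zIn[0..n) into zOut enclosed in double quotes,
** with a backslash inserted before each '"' or '\\'. Returns the number of
** bytes written, not counting the terminating NUL, or 0 if zOut is too
** small for the worst case. */
static size_t quote_json(const char *zIn, size_t n, char *zOut, size_t nOut){
  size_t i, k = 0;
  if( nOut < 2*n + 3 ) return 0;  /* every byte escaped + 2 quotes + NUL */
  zOut[k++] = '"';
  for(i=0; i<n; i++){
    char c = zIn[i];
    if( c=='"' || c=='\\' ) zOut[k++] = '\\';
    zOut[k++] = c;
  }
  zOut[k++] = '"';
  zOut[k] = 0;
  return k;
}
```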
5520
5521 /*
5522 ** Append a function parameter value to the JSON string under
5523 ** construction.
5524 */
5525 static void jsonAppendValue(
5526 JsonString *p, /* Append to this JSON string */
5527 sqlite3_value *pValue /* Value to append */
5528 ){
5529 switch( sqlite3_value_type(pValue) ){
5530 case SQLITE_NULL: {
5531 jsonAppendRaw(p, "null", 4);
5532 break;
5533 }
5534 case SQLITE_INTEGER:
5535 case SQLITE_FLOAT: {
5536 const char *z = (const char*)sqlite3_value_text(pValue);
5537 u32 n = (u32)sqlite3_value_bytes(pValue);
5538 jsonAppendRaw(p, z, n);
5539 break;
5540 }
5541 case SQLITE_TEXT: {
5542 const char *z = (const char*)sqlite3_value_text(pValue);
5543 u32 n = (u32)sqlite3_value_bytes(pValue);
5544 if( sqlite3_value_subtype(pValue)==JSON_SUBTYPE ){
5545 jsonAppendRaw(p, z, n);
5546 }else{
5547 jsonAppendString(p, z, n);
5548 }
5549 break;
5550 }
5551 default: {
5552 if( p->bErr==0 ){
5553 sqlite3_result_error(p->pCtx, "JSON cannot hold BLOB values", -1);
5554 p->bErr = 1;
5555 jsonReset(p);
5556 }
5557 break;
5558 }
5559 }
5560 }
5561
5562
5563 /* Make the JSON in p the result of the SQL function.
5564 */
5565 static void jsonResult(JsonString *p){
5566 if( p->bErr==0 ){
5567 sqlite3_result_text64(p->pCtx, p->zBuf, p->nUsed,
5568 p->bStatic ? SQLITE_TRANSIENT : sqlite3_free,
5569 SQLITE_UTF8);
5570 jsonZero(p);
5571 }
5572 assert( p->bStatic );
5573 }
5574
5575 /**************************************************************************
5576 ** Utility routines for dealing with JsonNode and JsonParse objects
5577 **************************************************************************/
5578
5579 /*
5580 ** Return the number of consecutive JsonNode slots needed to represent
5581 ** the parsed JSON at pNode. The minimum answer is 1. For ARRAY and
5582 ** OBJECT types, the number might be larger.
5583 **
5584 ** Appended elements are not counted. The value returned is the number
5585 ** by which the JsonNode counter should increment in order to go to the
5586 ** next peer value.
5587 */
5588 static u32 jsonNodeSize(JsonNode *pNode){
5589 return pNode->eType>=JSON_ARRAY ? pNode->n+1 : 1;
5590 }
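
To make the slot arithmetic described above concrete, here is a minimal sketch, assuming a simplified node layout in which container type codes sort after scalar ones. `DemoNode`, the `DEMO_*` codes, `demoNodeSize()` and `demoArrayLength()` are hypothetical names for illustration only, not part of the SQLite source.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins for JsonNode and its type codes.
** Container codes are deliberately the largest values. */
enum { DEMO_NULL, DEMO_INT, DEMO_ARRAY, DEMO_OBJECT };

typedef struct DemoNode {
  unsigned char eType;   /* One of the DEMO_* codes */
  unsigned int n;        /* For containers: number of descendant slots */
} DemoNode;

/* Number of slots to advance to reach the next peer value. */
unsigned int demoNodeSize(const DemoNode *p){
  return p->eType>=DEMO_ARRAY ? p->n+1 : 1;
}

/* Count the direct children of an array by stepping from peer to peer. */
unsigned int demoArrayLength(const DemoNode *pArray){
  unsigned int i, cnt = 0;
  assert( pArray->eType==DEMO_ARRAY );
  for(i=1; i<=pArray->n; i += demoNodeSize(&pArray[i])) cnt++;
  return cnt;
}
```

For the JSON text `[1,[2,3],4]` the flattened array would hold six slots (outer array with n=5, an integer, an inner array with n=2, and three more integers), and stepping by node size visits only the three direct children.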
5591
5592 /*
5593 ** Reclaim all memory allocated by a JsonParse object. But do not
5594 ** delete the JsonParse object itself.
5595 */
5596 static void jsonParseReset(JsonParse *pParse){
5597 sqlite3_free(pParse->aNode);
5598 pParse->aNode = 0;
5599 pParse->nNode = 0;
5600 pParse->nAlloc = 0;
5601 sqlite3_free(pParse->aUp);
5602 pParse->aUp = 0;
5603 }
5604
5605 /*
5606 ** Convert the JsonNode pNode into a pure JSON string and
5607 ** append it to pOut. Nested substructure is also included. This
5608 ** routine has no return value.
5609 */
5610 static void jsonRenderNode(
5611 JsonNode *pNode, /* The node to render */
5612 JsonString *pOut, /* Write JSON here */
5613 sqlite3_value **aReplace /* Replacement values */
5614 ){
5615 switch( pNode->eType ){
5616 default: {
5617 assert( pNode->eType==JSON_NULL );
5618 jsonAppendRaw(pOut, "null", 4);
5619 break;
5620 }
5621 case JSON_TRUE: {
5622 jsonAppendRaw(pOut, "true", 4);
5623 break;
5624 }
5625 case JSON_FALSE: {
5626 jsonAppendRaw(pOut, "false", 5);
5627 break;
5628 }
5629 case JSON_STRING: {
5630 if( pNode->jnFlags & JNODE_RAW ){
5631 jsonAppendString(pOut, pNode->u.zJContent, pNode->n);
5632 break;
5633 }
5634 /* Fall through into the next case */
5635 }
5636 case JSON_REAL:
5637 case JSON_INT: {
5638 jsonAppendRaw(pOut, pNode->u.zJContent, pNode->n);
5639 break;
5640 }
5641 case JSON_ARRAY: {
5642 u32 j = 1;
5643 jsonAppendChar(pOut, '[');
5644 for(;;){
5645 while( j<=pNode->n ){
5646 if( pNode[j].jnFlags & (JNODE_REMOVE|JNODE_REPLACE) ){
5647 if( pNode[j].jnFlags & JNODE_REPLACE ){
5648 jsonAppendSeparator(pOut);
5649 jsonAppendValue(pOut, aReplace[pNode[j].iVal]);
5650 }
5651 }else{
5652 jsonAppendSeparator(pOut);
5653 jsonRenderNode(&pNode[j], pOut, aReplace);
5654 }
5655 j += jsonNodeSize(&pNode[j]);
5656 }
5657 if( (pNode->jnFlags & JNODE_APPEND)==0 ) break;
5658 pNode = &pNode[pNode->u.iAppend];
5659 j = 1;
5660 }
5661 jsonAppendChar(pOut, ']');
5662 break;
5663 }
5664 case JSON_OBJECT: {
5665 u32 j = 1;
5666 jsonAppendChar(pOut, '{');
5667 for(;;){
5668 while( j<=pNode->n ){
5669 if( (pNode[j+1].jnFlags & JNODE_REMOVE)==0 ){
5670 jsonAppendSeparator(pOut);
5671 jsonRenderNode(&pNode[j], pOut, aReplace);
5672 jsonAppendChar(pOut, ':');
5673 if( pNode[j+1].jnFlags & JNODE_REPLACE ){
5674 jsonAppendValue(pOut, aReplace[pNode[j+1].iVal]);
5675 }else{
5676 jsonRenderNode(&pNode[j+1], pOut, aReplace);
5677 }
5678 }
5679 j += 1 + jsonNodeSize(&pNode[j+1]);
5680 }
5681 if( (pNode->jnFlags & JNODE_APPEND)==0 ) break;
5682 pNode = &pNode[pNode->u.iAppend];
5683 j = 1;
5684 }
5685 jsonAppendChar(pOut, '}');
5686 break;
5687 }
5688 }
5689 }
5690
5691 /*
5692 ** Return a JsonNode and all its descendants as a JSON string.
5693 */
5694 static void jsonReturnJson(
5695 JsonNode *pNode, /* Node to return */
5696 sqlite3_context *pCtx, /* Return value for this function */
5697 sqlite3_value **aReplace /* Array of replacement values */
5698 ){
5699 JsonString s;
5700 jsonInit(&s, pCtx);
5701 jsonRenderNode(pNode, &s, aReplace);
5702 jsonResult(&s);
5703 sqlite3_result_subtype(pCtx, JSON_SUBTYPE);
5704 }
5705
5706 /*
5707 ** Make the JsonNode the return value of the function.
5708 */
5709 static void jsonReturn(
5710 JsonNode *pNode, /* Node to return */
5711 sqlite3_context *pCtx, /* Return value for this function */
5712 sqlite3_value **aReplace /* Array of replacement values */
5713 ){
5714 switch( pNode->eType ){
5715 default: {
5716 assert( pNode->eType==JSON_NULL );
5717 sqlite3_result_null(pCtx);
5718 break;
5719 }
5720 case JSON_TRUE: {
5721 sqlite3_result_int(pCtx, 1);
5722 break;
5723 }
5724 case JSON_FALSE: {
5725 sqlite3_result_int(pCtx, 0);
5726 break;
5727 }
5728 case JSON_INT: {
5729 sqlite3_int64 i = 0;
5730 const char *z = pNode->u.zJContent;
5731 if( z[0]=='-' ){ z++; }
5732 while( z[0]>='0' && z[0]<='9' ){
5733 unsigned v = *(z++) - '0';
5734 if( i>=LARGEST_INT64/10 ){
5735 if( i>LARGEST_INT64/10 ) goto int_as_real;
5736 if( z[0]>='0' && z[0]<='9' ) goto int_as_real;
5737 if( v==9 ) goto int_as_real;
5738 if( v==8 ){
5739 if( pNode->u.zJContent[0]=='-' ){
5740 sqlite3_result_int64(pCtx, SMALLEST_INT64);
5741 goto int_done;
5742 }else{
5743 goto int_as_real;
5744 }
5745 }
5746 }
5747 i = i*10 + v;
5748 }
5749 if( pNode->u.zJContent[0]=='-' ){ i = -i; }
5750 sqlite3_result_int64(pCtx, i);
5751 int_done:
5752 break;
5753 int_as_real: /* fall through to real */;
5754 }
5755 case JSON_REAL: {
5756 double r;
5757 #ifdef SQLITE_AMALGAMATION
5758 const char *z = pNode->u.zJContent;
5759 sqlite3AtoF(z, &r, sqlite3Strlen30(z), SQLITE_UTF8);
5760 #else
5761 r = strtod(pNode->u.zJContent, 0);
5762 #endif
5763 sqlite3_result_double(pCtx, r);
5764 break;
5765 }
5766 case JSON_STRING: {
5767 #if 0 /* Never happens because JNODE_RAW is only set by json_set(),
5768 ** json_insert() and json_replace() and those routines do not
5769 ** call jsonReturn() */
5770 if( pNode->jnFlags & JNODE_RAW ){
5771 sqlite3_result_text(pCtx, pNode->u.zJContent, pNode->n,
5772 SQLITE_TRANSIENT);
5773 }else
5774 #endif
5775 assert( (pNode->jnFlags & JNODE_RAW)==0 );
5776 if( (pNode->jnFlags & JNODE_ESCAPE)==0 ){
5777 /* JSON formatted without any backslash-escapes */
5778 sqlite3_result_text(pCtx, pNode->u.zJContent+1, pNode->n-2,
5779 SQLITE_TRANSIENT);
5780 }else{
5781 /* Translate JSON formatted string into raw text */
5782 u32 i;
5783 u32 n = pNode->n;
5784 const char *z = pNode->u.zJContent;
5785 char *zOut;
5786 u32 j;
5787 zOut = sqlite3_malloc( n+1 );
5788 if( zOut==0 ){
5789 sqlite3_result_error_nomem(pCtx);
5790 break;
5791 }
5792 for(i=1, j=0; i<n-1; i++){
5793 char c = z[i];
5794 if( c!='\\' ){
5795 zOut[j++] = c;
5796 }else{
5797 c = z[++i];
5798 if( c=='u' ){
5799 u32 v = 0, k;
5800 for(k=0; k<4 && i<n-2; i++, k++){
5801 c = z[i+1];
5802 if( c>='0' && c<='9' ) v = v*16 + c - '0';
5803 else if( c>='A' && c<='F' ) v = v*16 + c - 'A' + 10;
5804 else if( c>='a' && c<='f' ) v = v*16 + c - 'a' + 10;
5805 else break;
5806 }
5807 if( v==0 ) break;
5808 if( v<=0x7f ){
5809 zOut[j++] = (char)v;
5810 }else if( v<=0x7ff ){
5811 zOut[j++] = (char)(0xc0 | (v>>6));
5812 zOut[j++] = 0x80 | (v&0x3f);
5813 }else{
5814 zOut[j++] = (char)(0xe0 | (v>>12));
5815 zOut[j++] = 0x80 | ((v>>6)&0x3f);
5816 zOut[j++] = 0x80 | (v&0x3f);
5817 }
5818 }else{
5819 if( c=='b' ){
5820 c = '\b';
5821 }else if( c=='f' ){
5822 c = '\f';
5823 }else if( c=='n' ){
5824 c = '\n';
5825 }else if( c=='r' ){
5826 c = '\r';
5827 }else if( c=='t' ){
5828 c = '\t';
5829 }
5830 zOut[j++] = c;
5831 }
5832 }
5833 }
5834 zOut[j] = 0;
5835 sqlite3_result_text(pCtx, zOut, j, sqlite3_free);
5836 }
5837 break;
5838 }
5839 case JSON_ARRAY:
5840 case JSON_OBJECT: {
5841 jsonReturnJson(pNode, pCtx, aReplace);
5842 break;
5843 }
5844 }
5845 }
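
The `\uXXXX` branch of jsonReturn() above turns a 16-bit code point into one, two or three UTF-8 bytes. The following is a minimal standalone sketch of that encoding step; `demoUtf8Encode()` is a hypothetical helper name, and like the loop above it treats each escape independently and does not combine surrogate pairs.

```c
#include <assert.h>

/* Encode code point v (0x0001..0xFFFF) as UTF-8 into zOut, which must
** have room for at least 3 bytes. Return the number of bytes written. */
int demoUtf8Encode(unsigned int v, char *zOut){
  int j = 0;
  assert( v>0 && v<=0xffff );
  if( v<=0x7f ){
    zOut[j++] = (char)v;                        /* 0xxxxxxx */
  }else if( v<=0x7ff ){
    zOut[j++] = (char)(0xc0 | (v>>6));          /* 110xxxxx */
    zOut[j++] = (char)(0x80 | (v&0x3f));        /* 10xxxxxx */
  }else{
    zOut[j++] = (char)(0xe0 | (v>>12));         /* 1110xxxx */
    zOut[j++] = (char)(0x80 | ((v>>6)&0x3f));   /* 10xxxxxx */
    zOut[j++] = (char)(0x80 | (v&0x3f));        /* 10xxxxxx */
  }
  return j;
}
```

Because every escape sequence is at least six input bytes and produces at most three output bytes, an output buffer as large as the input (as allocated above with `n+1`) is always sufficient.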
5846
5847 /* Forward reference */
5848 static int jsonParseAddNode(JsonParse*,u32,u32,const char*);
5849
5850 /*
5851 ** A macro to hint to the compiler that a function should not be
5852 ** inlined.
5853 */
5854 #if defined(__GNUC__)
5855 # define JSON_NOINLINE __attribute__((noinline))
5856 #elif defined(_MSC_VER) && _MSC_VER>=1310
5857 # define JSON_NOINLINE __declspec(noinline)
5858 #else
5859 # define JSON_NOINLINE
5860 #endif
5861
5862
5863 static JSON_NOINLINE int jsonParseAddNodeExpand(
5864 JsonParse *pParse, /* Append the node to this object */
5865 u32 eType, /* Node type */
5866 u32 n, /* Content size or sub-node count */
5867 const char *zContent /* Content */
5868 ){
5869 u32 nNew;
5870 JsonNode *pNew;
5871 assert( pParse->nNode>=pParse->nAlloc );
5872 if( pParse->oom ) return -1;
5873 nNew = pParse->nAlloc*2 + 10;
5874 pNew = sqlite3_realloc(pParse->aNode, sizeof(JsonNode)*nNew);
5875 if( pNew==0 ){
5876 pParse->oom = 1;
5877 return -1;
5878 }
5879 pParse->nAlloc = nNew;
5880 pParse->aNode = pNew;
5881 assert( pParse->nNode<pParse->nAlloc );
5882 return jsonParseAddNode(pParse, eType, n, zContent);
5883 }
5884
5885 /*
5886 ** Create a new JsonNode instance based on the arguments and append that
5887 ** instance to the JsonParse. Return the index in pParse->aNode[] of the
5888 ** new node, or -1 if a memory allocation fails.
5889 */
5890 static int jsonParseAddNode(
5891 JsonParse *pParse, /* Append the node to this object */
5892 u32 eType, /* Node type */
5893 u32 n, /* Content size or sub-node count */
5894 const char *zContent /* Content */
5895 ){
5896 JsonNode *p;
5897 if( pParse->nNode>=pParse->nAlloc ){
5898 return jsonParseAddNodeExpand(pParse, eType, n, zContent);
5899 }
5900 p = &pParse->aNode[pParse->nNode];
5901 p->eType = (u8)eType;
5902 p->jnFlags = 0;
5903 p->iVal = 0;
5904 p->n = n;
5905 p->u.zJContent = zContent;
5906 return pParse->nNode++;
5907 }
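
The split between jsonParseAddNode() and jsonParseAddNodeExpand() keeps the common append path small and pushes reallocation into a rarely taken helper. Below is a minimal sketch of the same pattern on a plain `int` vector, assuming the C library `realloc()` in place of `sqlite3_realloc()`. `DemoVec`, `demoVecGrow()` and `demoVecAppend()` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct DemoVec {
  int *a;               /* Heap array of values */
  unsigned int nUsed;   /* Slots in use */
  unsigned int nAlloc;  /* Slots allocated */
  int oom;              /* Set to 1 after an allocation failure */
} DemoVec;

/* Slow path: enlarge the array using the same nAlloc*2+10 growth rule.
** Return 0 on success or -1 if memory cannot be obtained. */
static int demoVecGrow(DemoVec *p){
  unsigned int nNew;
  int *aNew;
  assert( p->nUsed>=p->nAlloc );
  if( p->oom ) return -1;
  nNew = p->nAlloc*2 + 10;
  aNew = (int*)realloc(p->a, sizeof(int)*nNew);
  if( aNew==0 ){
    p->oom = 1;
    return -1;
  }
  p->a = aNew;
  p->nAlloc = nNew;
  return 0;
}

/* Fast path: append v and return its index, or -1 on allocation failure. */
int demoVecAppend(DemoVec *p, int v){
  if( p->nUsed>=p->nAlloc && demoVecGrow(p)<0 ) return -1;
  p->a[p->nUsed] = v;
  return (int)p->nUsed++;
}
```

Starting from a zeroed `DemoVec`, the capacity goes 0, 10, 30, 70 and so on, so the number of reallocations grows only logarithmically with the number of appended values.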
5908
5909 /*
5910 ** Parse a single JSON value which begins at pParse->zJson[i]. Return the
5911 ** index of the first character past the end of the value parsed.
5912 **
5913 ** Return negative for a syntax error. Special cases: return -2 if the
5914 ** first non-whitespace character is '}' and return -3 if the first
5915 ** non-whitespace character is ']'.
5916 */
5917 static int jsonParseValue(JsonParse *pParse, u32 i){
5918 char c;
5919 u32 j;
5920 int iThis;
5921 int x;
5922 JsonNode *pNode;
5923 while( safe_isspace(pParse->zJson[i]) ){ i++; }
5924 if( (c = pParse->zJson[i])=='{' ){
5925 /* Parse object */
5926 iThis = jsonParseAddNode(pParse, JSON_OBJECT, 0, 0);
5927 if( iThis<0 ) return -1;
5928 for(j=i+1;;j++){
5929 while( safe_isspace(pParse->zJson[j]) ){ j++; }
5930 x = jsonParseValue(pParse, j);
5931 if( x<0 ){
5932 if( x==(-2) && pParse->nNode==(u32)iThis+1 ) return j+1;
5933 return -1;
5934 }
5935 if( pParse->oom ) return -1;
5936 pNode = &pParse->aNode[pParse->nNode-1];
5937 if( pNode->eType!=JSON_STRING ) return -1;
5938 pNode->jnFlags |= JNODE_LABEL;
5939 j = x;
5940 while( safe_isspace(pParse->zJson[j]) ){ j++; }
5941 if( pParse->zJson[j]!=':' ) return -1;
5942 j++;
5943 x = jsonParseValue(pParse, j);
5944 if( x<0 ) return -1;
5945 j = x;
5946 while( safe_isspace(pParse->zJson[j]) ){ j++; }
5947 c = pParse->zJson[j];
5948 if( c==',' ) continue;
5949 if( c!='}' ) return -1;
5950 break;
5951 }
5952 pParse->aNode[iThis].n = pParse->nNode - (u32)iThis - 1;
5953 return j+1;
5954 }else if( c=='[' ){
5955 /* Parse array */
5956 iThis = jsonParseAddNode(pParse, JSON_ARRAY, 0, 0);
5957 if( iThis<0 ) return -1;
5958 for(j=i+1;;j++){
5959 while( safe_isspace(pParse->zJson[j]) ){ j++; }
5960 x = jsonParseValue(pParse, j);
5961 if( x<0 ){
5962 if( x==(-3) && pParse->nNode==(u32)iThis+1 ) return j+1;
5963 return -1;
5964 }
5965 j = x;
5966 while( safe_isspace(pParse->zJson[j]) ){ j++; }
5967 c = pParse->zJson[j];
5968 if( c==',' ) continue;
5969 if( c!=']' ) return -1;
5970 break;
5971 }
5972 pParse->aNode[iThis].n = pParse->nNode - (u32)iThis - 1;
5973 return j+1;
5974 }else if( c=='"' ){
5975 /* Parse string */
5976 u8 jnFlags = 0;
5977 j = i+1;
5978 for(;;){
5979 c = pParse->zJson[j];
5980 if( c==0 ) return -1;
5981 if( c=='\\' ){
5982 c = pParse->zJson[++j];
5983 if( c==0 ) return -1;
5984 jnFlags = JNODE_ESCAPE;
5985 }else if( c=='"' ){
5986 break;
5987 }
5988 j++;
5989 }
5990 jsonParseAddNode(pParse, JSON_STRING, j+1-i, &pParse->zJson[i]);
5991 if( !pParse->oom ) pParse->aNode[pParse->nNode-1].jnFlags = jnFlags;
5992 return j+1;
5993 }else if( c=='n'
5994 && strncmp(pParse->zJson+i,"null",4)==0
5995 && !safe_isalnum(pParse->zJson[i+4]) ){
5996 jsonParseAddNode(pParse, JSON_NULL, 0, 0);
5997 return i+4;
5998 }else if( c=='t'
5999 && strncmp(pParse->zJson+i,"true",4)==0
6000 && !safe_isalnum(pParse->zJson[i+4]) ){
6001 jsonParseAddNode(pParse, JSON_TRUE, 0, 0);
6002 return i+4;
6003 }else if( c=='f'
6004 && strncmp(pParse->zJson+i,"false",5)==0
6005 && !safe_isalnum(pParse->zJson[i+5]) ){
6006 jsonParseAddNode(pParse, JSON_FALSE, 0, 0);
6007 return i+5;
6008 }else if( c=='-' || (c>='0' && c<='9') ){
6009 /* Parse number */
6010 u8 seenDP = 0;
6011 u8 seenE = 0;
6012 j = i+1;
6013 for(;; j++){
6014 c = pParse->zJson[j];
6015 if( c>='0' && c<='9' ) continue;
6016 if( c=='.' ){
6017 if( pParse->zJson[j-1]=='-' ) return -1;
6018 if( seenDP ) return -1;
6019 seenDP = 1;
6020 continue;
6021 }
6022 if( c=='e' || c=='E' ){
6023 if( pParse->zJson[j-1]<'0' ) return -1;
6024 if( seenE ) return -1;
6025 seenDP = seenE = 1;
6026 c = pParse->zJson[j+1];
6027 if( c=='+' || c=='-' ){
6028 j++;
6029 c = pParse->zJson[j+1];
6030 }
6031 if( c<'0' || c>'9' ) return -1;
6032 continue;
6033 }
6034 break;
6035 }
6036 if( pParse->zJson[j-1]<'0' ) return -1;
6037 jsonParseAddNode(pParse, seenDP ? JSON_REAL : JSON_INT,
6038 j - i, &pParse->zJson[i]);
6039 return j;
6040 }else if( c=='}' ){
6041 return -2; /* End of {...} */
6042 }else if( c==']' ){
6043 return -3; /* End of [...] */
6044 }else if( c==0 ){
6045 return 0; /* End of file */
6046 }else{
6047 return -1; /* Syntax error */
6048 }
6049 }
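
The number branch of jsonParseValue() above is easiest to follow in isolation. Here is a minimal sketch that applies the same checks to a NUL-terminated string; `demoScanNumber()` is a hypothetical name, and the sketch deliberately mirrors the checks above rather than enforcing the full RFC grammar (for example, it does not reject leading zeros).

```c
#include <assert.h>

/* Scan a number at the start of z. On success return its length in bytes
** and set *pIsReal to 1 if a decimal point or exponent was seen, else 0.
** Return -1 if z does not begin with an acceptable number. */
int demoScanNumber(const char *z, int *pIsReal){
  int j;
  int seenDP = 0, seenE = 0;
  char c;
  assert( z!=0 && pIsReal!=0 );
  if( z[0]!='-' && (z[0]<'0' || z[0]>'9') ) return -1;
  for(j=1;; j++){
    c = z[j];
    if( c>='0' && c<='9' ) continue;
    if( c=='.' ){
      if( z[j-1]=='-' ) return -1;     /* "-." is not allowed */
      if( seenDP ) return -1;          /* second '.', or '.' after exponent */
      seenDP = 1;
      continue;
    }
    if( c=='e' || c=='E' ){
      if( z[j-1]<'0' ) return -1;      /* exponent must follow a digit */
      if( seenE ) return -1;           /* only one exponent */
      seenDP = seenE = 1;
      c = z[j+1];
      if( c=='+' || c=='-' ){
        j++;
        c = z[j+1];
      }
      if( c<'0' || c>'9' ) return -1;  /* exponent needs a digit */
      continue;
    }
    break;
  }
  if( z[j-1]<'0' ) return -1;          /* must end on a digit */
  *pIsReal = seenDP;
  return j;
}
```

Setting `seenDP` when an exponent is seen serves two purposes: it classifies the value as real, and it rejects a decimal point that appears after the exponent.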
6050
6051 /*
6052 ** Parse a complete JSON string. Return 0 on success or non-zero if there
6053 ** are any errors. If an error occurs, free all memory associated with
6054 ** pParse.
6055 **
6056 ** pParse is uninitialized when this routine is called.
6057 */
6058 static int jsonParse(
6059 JsonParse *pParse, /* Initialize and fill this JsonParse object */
6060 sqlite3_context *pCtx, /* Report errors here */
6061 const char *zJson /* Input JSON text to be parsed */
6062 ){
6063 int i;
6064 memset(pParse, 0, sizeof(*pParse));
6065 if( zJson==0 ) return 1;
6066 pParse->zJson = zJson;
6067 i = jsonParseValue(pParse, 0);
6068 if( pParse->oom ) i = -1;
6069 if( i>0 ){
6070 while( safe_isspace(zJson[i]) ) i++;
6071 if( zJson[i] ) i = -1;
6072 }
6073 if( i<=0 ){
6074 if( pCtx!=0 ){
6075 if( pParse->oom ){
6076 sqlite3_result_error_nomem(pCtx);
6077 }else{
6078 sqlite3_result_error(pCtx, "malformed JSON", -1);
6079 }
6080 }
6081 jsonParseReset(pParse);
6082 return 1;
6083 }
6084 return 0;
6085 }
6086
6087 /* Mark node i of pParse as being a child of iParent. Call recursively
6088 ** to fill in all the descendants of node i.
6089 */
6090 static void jsonParseFillInParentage(JsonParse *pParse, u32 i, u32 iParent){
6091 JsonNode *pNode = &pParse->aNode[i];
6092 u32 j;
6093 pParse->aUp[i] = iParent;
6094 switch( pNode->eType ){
6095 case JSON_ARRAY: {
6096 for(j=1; j<=pNode->n; j += jsonNodeSize(pNode+j)){
6097 jsonParseFillInParentage(pParse, i+j, i);
6098 }
6099 break;
6100 }
6101 case JSON_OBJECT: {
6102 for(j=1; j<=pNode->n; j += jsonNodeSize(pNode+j+1)+1){
6103 pParse->aUp[i+j] = i;
6104 jsonParseFillInParentage(pParse, i+j+1, i);
6105 }
6106 break;
6107 }
6108 default: {
6109 break;
6110 }
6111 }
6112 }
6113
6114 /*
6115 ** Compute the parentage of all nodes in a completed parse.
6116 */
6117 static int jsonParseFindParents(JsonParse *pParse){
6118 u32 *aUp;
6119 assert( pParse->aUp==0 );
6120 aUp = pParse->aUp = sqlite3_malloc( sizeof(u32)*pParse->nNode );
6121 if( aUp==0 ){
6122 pParse->oom = 1;
6123 return SQLITE_NOMEM;
6124 }
6125 jsonParseFillInParentage(pParse, 0, 0);
6126 return SQLITE_OK;
6127 }
6128
6129 /*
6130 ** Compare the OBJECT label at pNode against zKey,nKey. Return true on
6131 ** a match.
6132 */
6133 static int jsonLabelCompare(JsonNode *pNode, const char *zKey, u32 nKey){
6134 if( pNode->jnFlags & JNODE_RAW ){
6135 if( pNode->n!=nKey ) return 0;
6136 return strncmp(pNode->u.zJContent, zKey, nKey)==0;
6137 }else{
6138 if( pNode->n!=nKey+2 ) return 0;
6139 return strncmp(pNode->u.zJContent+1, zKey, nKey)==0;
6140 }
6141 }
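
A standalone sketch of the comparison above may help when reading the lookup code. It assumes the label text and its length are passed directly instead of through a JsonNode; `demoLabelMatch()` and its parameters are hypothetical names. As in the routine above, the comparison is byte-wise, so a label stored with backslash escapes does not match its unescaped spelling in this sketch.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if the stored label matches zKey/nKey, else 0. isRaw is nonzero
** when the label is stored without its surrounding double-quotes. */
int demoLabelMatch(
  const char *zLabel, unsigned int nLabel, int isRaw,
  const char *zKey, unsigned int nKey
){
  assert( zLabel!=0 && zKey!=0 );
  if( isRaw ){
    if( nLabel!=nKey ) return 0;
    return strncmp(zLabel, zKey, nKey)==0;
  }
  if( nLabel!=nKey+2 ) return 0;            /* two extra bytes for quotes */
  return strncmp(zLabel+1, zKey, nKey)==0;  /* skip the opening quote */
}
```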
6142
6143 /* forward declaration */
6144 static JsonNode *jsonLookupAppend(JsonParse*,const char*,int*,const char**);
6145
6146 /*
6147 ** Search along zPath to find the node specified. Return a pointer
6148 ** to that node, or NULL if zPath is malformed or if there is no such
6149 ** node.
6150 **
6151 ** If pApnd!=0, then try to append new nodes to complete zPath if it is
6152 ** possible to do so and if no existing node corresponds to zPath. If
6153 ** new nodes are appended *pApnd is set to 1.
6154 */
6155 static JsonNode *jsonLookupStep(
6156 JsonParse *pParse, /* The JSON to search */
6157 u32 iRoot, /* Begin the search at this node */
6158 const char *zPath, /* The path to search */
6159 int *pApnd, /* Append nodes to complete path if not NULL */
6160 const char **pzErr /* Make *pzErr point to any syntax error in zPath */
6161 ){
6162 u32 i, j, nKey;
6163 const char *zKey;
6164 JsonNode *pRoot = &pParse->aNode[iRoot];
6165 if( zPath[0]==0 ) return pRoot;
6166 if( zPath[0]=='.' ){
6167 if( pRoot->eType!=JSON_OBJECT ) return 0;
6168 zPath++;
6169 if( zPath[0]=='"' ){
6170 zKey = zPath + 1;
6171 for(i=1; zPath[i] && zPath[i]!='"'; i++){}
6172 nKey = i-1;
6173 if( zPath[i] ){
6174 i++;
6175 }else{
6176 *pzErr = zPath;
6177 return 0;
6178 }
6179 }else{
6180 zKey = zPath;
6181 for(i=0; zPath[i] && zPath[i]!='.' && zPath[i]!='['; i++){}
6182 nKey = i;
6183 }
6184 if( nKey==0 ){
6185 *pzErr = zPath;
6186 return 0;
6187 }
6188 j = 1;
6189 for(;;){
6190 while( j<=pRoot->n ){
6191 if( jsonLabelCompare(pRoot+j, zKey, nKey) ){
6192 return jsonLookupStep(pParse, iRoot+j+1, &zPath[i], pApnd, pzErr);
6193 }
6194 j++;
6195 j += jsonNodeSize(&pRoot[j]);
6196 }
6197 if( (pRoot->jnFlags & JNODE_APPEND)==0 ) break;
6198 iRoot += pRoot->u.iAppend;
6199 pRoot = &pParse->aNode[iRoot];
6200 j = 1;
6201 }
6202 if( pApnd ){
6203 u32 iStart, iLabel;
6204 JsonNode *pNode;
6205 iStart = jsonParseAddNode(pParse, JSON_OBJECT, 2, 0);
6206 iLabel = jsonParseAddNode(pParse, JSON_STRING, i, zPath);
6207 zPath += i;
6208 pNode = jsonLookupAppend(pParse, zPath, pApnd, pzErr);
6209 if( pParse->oom ) return 0;
6210 if( pNode ){
6211 pRoot = &pParse->aNode[iRoot];
6212 pRoot->u.iAppend = iStart - iRoot;
6213 pRoot->jnFlags |= JNODE_APPEND;
6214 pParse->aNode[iLabel].jnFlags |= JNODE_RAW;
6215 }
6216 return pNode;
6217 }
6218 }else if( zPath[0]=='[' && safe_isdigit(zPath[1]) ){
6219 if( pRoot->eType!=JSON_ARRAY ) return 0;
6220 i = 0;
6221 j = 1;
6222 while( safe_isdigit(zPath[j]) ){
6223 i = i*10 + zPath[j] - '0';
6224 j++;
6225 }
6226 if( zPath[j]!=']' ){
6227 *pzErr = zPath;
6228 return 0;
6229 }
6230 zPath += j + 1;
6231 j = 1;
6232 for(;;){
6233 while( j<=pRoot->n && (i>0 || (pRoot[j].jnFlags & JNODE_REMOVE)!=0) ){
6234 if( (pRoot[j].jnFlags & JNODE_REMOVE)==0 ) i--;
6235 j += jsonNodeSize(&pRoot[j]);
6236 }
6237 if( (pRoot->jnFlags & JNODE_APPEND)==0 ) break;
6238 iRoot += pRoot->u.iAppend;
6239 pRoot = &pParse->aNode[iRoot];
6240 j = 1;
6241 }
6242 if( j<=pRoot->n ){
6243 return jsonLookupStep(pParse, iRoot+j, zPath, pApnd, pzErr);
6244 }
6245 if( i==0 && pApnd ){
6246 u32 iStart;
6247 JsonNode *pNode;
6248 iStart = jsonParseAddNode(pParse, JSON_ARRAY, 1, 0);
6249 pNode = jsonLookupAppend(pParse, zPath, pApnd, pzErr);
6250 if( pParse->oom ) return 0;
6251 if( pNode ){
6252 pRoot = &pParse->aNode[iRoot];
6253 pRoot->u.iAppend = iStart - iRoot;
6254 pRoot->jnFlags |= JNODE_APPEND;
6255 }
6256 return pNode;
6257 }
6258 }else{
6259 *pzErr = zPath;
6260 }
6261 return 0;
6262 }
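
The path syntax accepted by jsonLookupStep() above has two step forms: `.key` (optionally with the key in double-quotes) and `[N]`. Here is a minimal sketch that only tokenizes one step and does no tree lookup; `DemoStep` and `demoPathStep()` are hypothetical names, and the leading `$` is assumed to have been consumed already, as jsonLookup() does.

```c
#include <assert.h>
#include <string.h>

typedef struct DemoStep {
  int isIndex;           /* 1 for an [N] step, 0 for a .key step */
  unsigned int iIndex;   /* Array index when isIndex is set */
  const char *zKey;      /* Start of key text (not NUL-terminated) */
  unsigned int nKey;     /* Length of the key in bytes */
} DemoStep;

/* Parse one path step at z. Return the number of bytes consumed,
** or -1 if the step is malformed. */
int demoPathStep(const char *z, DemoStep *p){
  unsigned int i;
  assert( z!=0 && p!=0 );
  if( z[0]=='.' ){
    z++;
    p->isIndex = 0;
    p->iIndex = 0;
    if( z[0]=='"' ){
      p->zKey = z+1;
      for(i=1; z[i] && z[i]!='"'; i++){}
      if( z[i]==0 ) return -1;        /* unterminated quoted key */
      p->nKey = i-1;
      i++;                            /* step over the closing quote */
    }else{
      p->zKey = z;
      for(i=0; z[i] && z[i]!='.' && z[i]!='['; i++){}
      p->nKey = i;
    }
    if( p->nKey==0 ) return -1;       /* empty key */
    return (int)i + 1;                /* +1 for the leading '.' */
  }
  if( z[0]=='[' && z[1]>='0' && z[1]<='9' ){
    unsigned int v = 0;
    for(i=1; z[i]>='0' && z[i]<='9'; i++){
      v = v*10 + (unsigned int)(z[i]-'0');
    }
    if( z[i]!=']' ) return -1;        /* missing ']' */
    p->isIndex = 1;
    p->iIndex = v;
    p->zKey = 0;
    p->nKey = 0;
    return (int)i + 1;                /* +1 for the closing ']' */
  }
  return -1;
}
```

Calling it repeatedly on `.a[2]."b c"` yields a key step `a`, an index step 2, and a key step `b c`, consuming 2, 3 and 6 bytes respectively.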
6263
6264 /*
6265 ** Append content to pParse that will complete zPath. Return a pointer
6266 ** to the inserted node, or return NULL if the append fails.
6267 */
6268 static JsonNode *jsonLookupAppend(
6269 JsonParse *pParse, /* Append content to the JSON parse */
6270 const char *zPath, /* Description of content to append */
6271 int *pApnd, /* Set this flag to 1 */
6272 const char **pzErr /* Make this point to any syntax error */
6273 ){
6274 *pApnd = 1;
6275 if( zPath[0]==0 ){
6276 jsonParseAddNode(pParse, JSON_NULL, 0, 0);
6277 return pParse->oom ? 0 : &pParse->aNode[pParse->nNode-1];
6278 }
6279 if( zPath[0]=='.' ){
6280 jsonParseAddNode(pParse, JSON_OBJECT, 0, 0);
6281 }else if( strncmp(zPath,"[0]",3)==0 ){
6282 jsonParseAddNode(pParse, JSON_ARRAY, 0, 0);
6283 }else{
6284 return 0;
6285 }
6286 if( pParse->oom ) return 0;
6287 return jsonLookupStep(pParse, pParse->nNode-1, zPath, pApnd, pzErr);
6288 }
6289
6290 /*
6291 ** Return the text of a syntax error message on a JSON path. Space is
6292 ** obtained from sqlite3_malloc().
6293 */
6294 static char *jsonPathSyntaxError(const char *zErr){
6295 return sqlite3_mprintf("JSON path error near '%q'", zErr);
6296 }
6297
6298 /*
6299 ** Do a node lookup using zPath. Return a pointer to the node on success.
6300 ** Return NULL if not found or if there is an error.
6301 **
6302 ** On an error, write an error message into pCtx and increment the
6303 ** pParse->nErr counter.
6304 **
6305 ** If pApnd!=NULL then try to append missing nodes and set *pApnd = 1 if
6306 ** nodes are appended.
6307 */
6308 static JsonNode *jsonLookup(
6309 JsonParse *pParse, /* The JSON to search */
6310 const char *zPath, /* The path to search */
6311 int *pApnd, /* Append nodes to complete path if not NULL */
6312 sqlite3_context *pCtx /* Report errors here, if not NULL */
6313 ){
6314 const char *zErr = 0;
6315 JsonNode *pNode = 0;
6316 char *zMsg;
6317
6318 if( zPath==0 ) return 0;
6319 if( zPath[0]!='$' ){
6320 zErr = zPath;
6321 goto lookup_err;
6322 }
6323 zPath++;
6324 pNode = jsonLookupStep(pParse, 0, zPath, pApnd, &zErr);
6325 if( zErr==0 ) return pNode;
6326
6327 lookup_err:
6328 pParse->nErr++;
6329 assert( zErr!=0 && pCtx!=0 );
6330 zMsg = jsonPathSyntaxError(zErr);
6331 if( zMsg ){
6332 sqlite3_result_error(pCtx, zMsg, -1);
6333 sqlite3_free(zMsg);
6334 }else{
6335 sqlite3_result_error_nomem(pCtx);
6336 }
6337 return 0;
6338 }
6339
6340
6341 /*
6342 ** Report the wrong number of arguments for json_insert(), json_replace()
6343 ** or json_set().
6344 */
6345 static void jsonWrongNumArgs(
6346 sqlite3_context *pCtx,
6347 const char *zFuncName
6348 ){
6349 char *zMsg = sqlite3_mprintf("json_%s() needs an odd number of arguments",
6350 zFuncName);
6351 sqlite3_result_error(pCtx, zMsg, -1);
6352 sqlite3_free(zMsg);
6353 }
6354
6355
6356 /****************************************************************************
6357 ** SQL functions used for testing and debugging
6358 ****************************************************************************/
6359
6360 #ifdef SQLITE_DEBUG
6361 /*
6362 ** The json_parse(JSON) function returns a string which describes
6363 ** a parse of the JSON provided. Or it returns NULL if JSON is not
6364 ** well-formed.
6365 */
6366 static void jsonParseFunc(
6367 sqlite3_context *ctx,
6368 int argc,
6369 sqlite3_value **argv
6370 ){
6371 JsonString s; /* Output string - not real JSON */
6372 JsonParse x; /* The parse */
6373 u32 i;
6374
6375 assert( argc==1 );
6376 if( jsonParse(&x, ctx, (const char*)sqlite3_value_text(argv[0])) ) return;
6377 jsonParseFindParents(&x);
6378 jsonInit(&s, ctx);
6379 for(i=0; i<x.nNode; i++){
6380 const char *zType;
6381 if( x.aNode[i].jnFlags & JNODE_LABEL ){
6382 assert( x.aNode[i].eType==JSON_STRING );
6383 zType = "label";
6384 }else{
6385 zType = jsonType[x.aNode[i].eType];
6386 }
6387 jsonPrintf(100, &s,"node %3u: %7s n=%-4d up=%-4d",
6388 i, zType, x.aNode[i].n, x.aUp[i]);
6389 if( x.aNode[i].u.zJContent!=0 ){
6390 jsonAppendRaw(&s, " ", 1);
6391 jsonAppendRaw(&s, x.aNode[i].u.zJContent, x.aNode[i].n);
6392 }
6393 jsonAppendRaw(&s, "\n", 1);
6394 }
6395 jsonParseReset(&x);
6396 jsonResult(&s);
6397 }
6398
6399 /*
6400 ** The json_test1(JSON) function returns true (1) if the input is JSON
6401 ** text generated by another json function. It returns (0) if the input
6402 ** is not known to be JSON.
6403 */
6404 static void jsonTest1Func(
6405 sqlite3_context *ctx,
6406 int argc,
6407 sqlite3_value **argv
6408 ){
6409 UNUSED_PARAM(argc);
6410 sqlite3_result_int(ctx, sqlite3_value_subtype(argv[0])==JSON_SUBTYPE);
6411 }
6412 #endif /* SQLITE_DEBUG */
6413
6414 /****************************************************************************
6415 ** Scalar SQL function implementations
6416 ****************************************************************************/
6417
6418 /*
6419 ** Implementation of the json_array(VALUE,...) function. Return a JSON
6420 ** array that contains all values given in arguments. Or if any argument
6421 ** is a BLOB, throw an error.
6422 */
6423 static void jsonArrayFunc(
6424 sqlite3_context *ctx,
6425 int argc,
6426 sqlite3_value **argv
6427 ){
6428 int i;
6429 JsonString jx;
6430
6431 jsonInit(&jx, ctx);
6432 jsonAppendChar(&jx, '[');
6433 for(i=0; i<argc; i++){
6434 jsonAppendSeparator(&jx);
6435 jsonAppendValue(&jx, argv[i]);
6436 }
6437 jsonAppendChar(&jx, ']');
6438 jsonResult(&jx);
6439 sqlite3_result_subtype(ctx, JSON_SUBTYPE);
6440 }
6441
6442
6443 /*
6444 ** json_array_length(JSON)
6445 ** json_array_length(JSON, PATH)
6446 **
6447 ** Return the number of elements in the top-level JSON array, or in the
6448 ** array selected by PATH. Return 0 if that value is not a JSON array.
6449 */
6450 static void jsonArrayLengthFunc(
6451 sqlite3_context *ctx,
6452 int argc,
6453 sqlite3_value **argv
6454 ){
6455 JsonParse x; /* The parse */
6456 sqlite3_int64 n = 0;
6457 u32 i;
6458 JsonNode *pNode;
6459
6460 if( jsonParse(&x, ctx, (const char*)sqlite3_value_text(argv[0])) ) return;
6461 assert( x.nNode );
6462 if( argc==2 ){
6463 const char *zPath = (const char*)sqlite3_value_text(argv[1]);
6464 pNode = jsonLookup(&x, zPath, 0, ctx);
6465 }else{
6466 pNode = x.aNode;
6467 }
6468 if( pNode==0 ){
6469 x.nErr = 1;
6470 }else if( pNode->eType==JSON_ARRAY ){
6471 assert( (pNode->jnFlags & JNODE_APPEND)==0 );
6472 for(i=1; i<=pNode->n; n++){
6473 i += jsonNodeSize(&pNode[i]);
6474 }
6475 }
6476 if( x.nErr==0 ) sqlite3_result_int64(ctx, n);
6477 jsonParseReset(&x);
6478 }
6479
6480 /*
6481 ** json_extract(JSON, PATH, ...)
6482 **
6483 ** Return the element described by PATH. Return NULL if there is no
6484 ** PATH element. If there are multiple PATHs, then return a JSON array
6485 ** with the result from each path. Throw an error if the JSON or any PATH
6486 ** is malformed.
6487 */
6488 static void jsonExtractFunc(
6489 sqlite3_context *ctx,
6490 int argc,
6491 sqlite3_value **argv
6492 ){
6493 JsonParse x; /* The parse */
6494 JsonNode *pNode;
6495 const char *zPath;
6496 JsonString jx;
6497 int i;
6498
6499 if( argc<2 ) return;
6500 if( jsonParse(&x, ctx, (const char*)sqlite3_value_text(argv[0])) ) return;
6501 jsonInit(&jx, ctx);
6502 jsonAppendChar(&jx, '[');
6503 for(i=1; i<argc; i++){
6504 zPath = (const char*)sqlite3_value_text(argv[i]);
6505 pNode = jsonLookup(&x, zPath, 0, ctx);
6506 if( x.nErr ) break;
6507 if( argc>2 ){
6508 jsonAppendSeparator(&jx);
6509 if( pNode ){
6510 jsonRenderNode(pNode, &jx, 0);
6511 }else{
6512 jsonAppendRaw(&jx, "null", 4);
6513 }
6514 }else if( pNode ){
6515 jsonReturn(pNode, ctx, 0);
6516 }
6517 }
6518 if( argc>2 && i==argc ){
6519 jsonAppendChar(&jx, ']');
6520 jsonResult(&jx);
6521 sqlite3_result_subtype(ctx, JSON_SUBTYPE);
6522 }
6523 jsonReset(&jx);
6524 jsonParseReset(&x);
6525 }
6526
6527 /*
6528 ** Implementation of the json_object(NAME,VALUE,...) function. Return a JSON
6529 ** object that contains all name/value pairs given in arguments. Or if any name
6530 ** is not a string or if any value is a BLOB, throw an error.
6531 */
6532 static void jsonObjectFunc(
6533 sqlite3_context *ctx,
6534 int argc,
6535 sqlite3_value **argv
6536 ){
6537 int i;
6538 JsonString jx;
6539 const char *z;
6540 u32 n;
6541
6542 if( argc&1 ){
6543 sqlite3_result_error(ctx, "json_object() requires an even number "
6544 "of arguments", -1);
6545 return;
6546 }
6547 jsonInit(&jx, ctx);
6548 jsonAppendChar(&jx, '{');
6549 for(i=0; i<argc; i+=2){
6550 if( sqlite3_value_type(argv[i])!=SQLITE_TEXT ){
6551 sqlite3_result_error(ctx, "json_object() labels must be TEXT", -1);
6552 jsonReset(&jx);
6553 return;
6554 }
6555 jsonAppendSeparator(&jx);
6556 z = (const char*)sqlite3_value_text(argv[i]);
6557 n = (u32)sqlite3_value_bytes(argv[i]);
6558 jsonAppendString(&jx, z, n);
6559 jsonAppendChar(&jx, ':');
6560 jsonAppendValue(&jx, argv[i+1]);
6561 }
6562 jsonAppendChar(&jx, '}');
6563 jsonResult(&jx);
6564 sqlite3_result_subtype(ctx, JSON_SUBTYPE);
6565 }
6566
6567
6568 /*
6569 ** json_remove(JSON, PATH, ...)
6570 **
6571 ** Remove the named elements from JSON and return the result. Malformed
6572 ** JSON or PATH arguments result in an error.
6573 */
6574 static void jsonRemoveFunc(
6575 sqlite3_context *ctx,
6576 int argc,
6577 sqlite3_value **argv
6578 ){
6579 JsonParse x; /* The parse */
6580 JsonNode *pNode;
6581 const char *zPath;
6582 u32 i;
6583
6584 if( argc<1 ) return;
6585 if( jsonParse(&x, ctx, (const char*)sqlite3_value_text(argv[0])) ) return;
6586 assert( x.nNode );
6587 for(i=1; i<(u32)argc; i++){
6588 zPath = (const char*)sqlite3_value_text(argv[i]);
6589 if( zPath==0 ) goto remove_done;
6590 pNode = jsonLookup(&x, zPath, 0, ctx);
6591 if( x.nErr ) goto remove_done;
6592 if( pNode ) pNode->jnFlags |= JNODE_REMOVE;
6593 }
6594 if( (x.aNode[0].jnFlags & JNODE_REMOVE)==0 ){
6595 jsonReturnJson(x.aNode, ctx, 0);
6596 }
6597 remove_done:
6598 jsonParseReset(&x);
6599 }
6600
6601 /*
6602 ** json_replace(JSON, PATH, VALUE, ...)
6603 **
6604 ** Replace the value at PATH with VALUE. If PATH does not already exist,
6605 ** this routine is a no-op. If JSON or PATH is malformed, throw an error.
6606 */
6607 static void jsonReplaceFunc(
6608 sqlite3_context *ctx,
6609 int argc,
6610 sqlite3_value **argv
6611 ){
6612 JsonParse x; /* The parse */
6613 JsonNode *pNode;
6614 const char *zPath;
6615 u32 i;
6616
6617 if( argc<1 ) return;
6618 if( (argc&1)==0 ) {
6619 jsonWrongNumArgs(ctx, "replace");
6620 return;
6621 }
6622 if( jsonParse(&x, ctx, (const char*)sqlite3_value_text(argv[0])) ) return;
6623 assert( x.nNode );
6624 for(i=1; i<(u32)argc; i+=2){
6625 zPath = (const char*)sqlite3_value_text(argv[i]);
6626 pNode = jsonLookup(&x, zPath, 0, ctx);
6627 if( x.nErr ) goto replace_err;
6628 if( pNode ){
6629 pNode->jnFlags |= (u8)JNODE_REPLACE;
6630 pNode->iVal = (u8)(i+1);
6631 }
6632 }
6633 if( x.aNode[0].jnFlags & JNODE_REPLACE ){
6634 sqlite3_result_value(ctx, argv[x.aNode[0].iVal]);
6635 }else{
6636 jsonReturnJson(x.aNode, ctx, argv);
6637 }
6638 replace_err:
6639 jsonParseReset(&x);
6640 }
6641
6642 /*
6643 ** json_set(JSON, PATH, VALUE, ...)
6644 **
6645 ** Set the value at PATH to VALUE. Create the PATH if it does not already
6646 ** exist. Overwrite existing values that do exist.
6647 ** If JSON or PATH is malformed, throw an error.
6648 **
6649 ** json_insert(JSON, PATH, VALUE, ...)
6650 **
6651 ** Create PATH and initialize it to VALUE. If PATH already exists, this
6652 ** routine is a no-op. If JSON or PATH is malformed, throw an error.
6653 */
6654 static void jsonSetFunc(
6655 sqlite3_context *ctx,
6656 int argc,
6657 sqlite3_value **argv
6658 ){
6659 JsonParse x; /* The parse */
6660 JsonNode *pNode;
6661 const char *zPath;
6662 u32 i;
6663 int bApnd;
6664 int bIsSet = *(int*)sqlite3_user_data(ctx);
6665
6666 if( argc<1 ) return;
6667 if( (argc&1)==0 ) {
6668 jsonWrongNumArgs(ctx, bIsSet ? "set" : "insert");
6669 return;
6670 }
6671 if( jsonParse(&x, ctx, (const char*)sqlite3_value_text(argv[0])) ) return;
6672 assert( x.nNode );
6673 for(i=1; i<(u32)argc; i+=2){
6674 zPath = (const char*)sqlite3_value_text(argv[i]);
6675 bApnd = 0;
6676 pNode = jsonLookup(&x, zPath, &bApnd, ctx);
6677 if( x.oom ){
6678 sqlite3_result_error_nomem(ctx);
6679 goto jsonSetDone;
6680 }else if( x.nErr ){
6681 goto jsonSetDone;
6682 }else if( pNode && (bApnd || bIsSet) ){
6683 pNode->jnFlags |= (u8)JNODE_REPLACE;
6684 pNode->iVal = (u8)(i+1);
6685 }
6686 }
6687 if( x.aNode[0].jnFlags & JNODE_REPLACE ){
6688 sqlite3_result_value(ctx, argv[x.aNode[0].iVal]);
6689 }else{
6690 jsonReturnJson(x.aNode, ctx, argv);
6691 }
6692 jsonSetDone:
6693 jsonParseReset(&x);
6694 }
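
The three editing functions differ only in when they write the new value. Below is a minimal sketch of that decision, as read from jsonReplaceFunc() and jsonSetFunc() above. `DemoMode` and `demoWritesValue()` are hypothetical names, and the sketch assumes that a missing PATH can actually be created by appending, which the real lookup does not guarantee.

```c
#include <assert.h>

enum DemoMode { DEMO_INSERT, DEMO_REPLACE, DEMO_SET };

/* Return 1 if the given function would store VALUE at PATH, else 0.
** bPathExists is nonzero when PATH already names a node. */
int demoWritesValue(enum DemoMode eMode, int bPathExists){
  assert( eMode==DEMO_INSERT || eMode==DEMO_REPLACE || eMode==DEMO_SET );
  switch( eMode ){
    case DEMO_INSERT:  return !bPathExists;     /* only create new nodes */
    case DEMO_REPLACE: return bPathExists!=0;   /* only overwrite */
    default:           return 1;                /* set: create or overwrite */
  }
}
```

In the code above the same distinction falls out of two inputs: json_replace() calls jsonLookup() with a NULL pApnd so nothing is ever appended, while json_set() and json_insert() share one routine and differ only in the `bIsSet` flag tested alongside `bApnd`.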
6695
6696 /*
6697 ** json_type(JSON)
6698 ** json_type(JSON, PATH)
6699 **
6700 ** Return the top-level "type" of a JSON string. Throw an error if
6701 ** either the JSON or PATH inputs are not well-formed.
6702 */
6703 static void jsonTypeFunc(
6704 sqlite3_context *ctx,
6705 int argc,
6706 sqlite3_value **argv
6707 ){
6708 JsonParse x; /* The parse */
6709 const char *zPath;
6710 JsonNode *pNode;
6711
6712 if( jsonParse(&x, ctx, (const char*)sqlite3_value_text(argv[0])) ) return;
6713 assert( x.nNode );
6714 if( argc==2 ){
6715 zPath = (const char*)sqlite3_value_text(argv[1]);
6716 pNode = jsonLookup(&x, zPath, 0, ctx);
6717 }else{
6718 pNode = x.aNode;
6719 }
6720 if( pNode ){
6721 sqlite3_result_text(ctx, jsonType[pNode->eType], -1, SQLITE_STATIC);
6722 }
6723 jsonParseReset(&x);
6724 }
6725
6726 /*
6727 ** json_valid(JSON)
6728 **
6729 ** Return 1 if JSON is a well-formed JSON string according to RFC-7159.
6730 ** Return 0 otherwise.
6731 */
6732 static void jsonValidFunc(
6733 sqlite3_context *ctx,
6734 int argc,
6735 sqlite3_value **argv
6736 ){
6737 JsonParse x; /* The parse */
6738 int rc = 0;
6739
6740 UNUSED_PARAM(argc);
6741 if( jsonParse(&x, 0, (const char*)sqlite3_value_text(argv[0]))==0 ){
6742 rc = 1;
6743 }
6744 jsonParseReset(&x);
6745 sqlite3_result_int(ctx, rc);
6746 }
6747
6748
6749 /****************************************************************************
6750 ** Aggregate SQL function implementations
6751 ****************************************************************************/
6752 /*
6753 ** json_group_array(VALUE)
6754 **
6755 ** Return a JSON array composed of all values in the aggregate.
6756 */
6757 static void jsonArrayStep(
6758 sqlite3_context *ctx,
6759 int argc,
6760 sqlite3_value **argv
6761 ){
6762 JsonString *pStr;
6763 pStr = (JsonString*)sqlite3_aggregate_context(ctx, sizeof(*pStr));
6764 if( pStr ){
6765 if( pStr->zBuf==0 ){
6766 jsonInit(pStr, ctx);
6767 jsonAppendChar(pStr, '[');
6768 }else{
6769 jsonAppendChar(pStr, ',');
6770 pStr->pCtx = ctx;
6771 }
6772 jsonAppendValue(pStr, argv[0]);
6773 }
6774 }
6775 static void jsonArrayFinal(sqlite3_context *ctx){
6776 JsonString *pStr;
6777 pStr = (JsonString*)sqlite3_aggregate_context(ctx, 0);
6778 if( pStr ){
6779 pStr->pCtx = ctx;
6780 jsonAppendChar(pStr, ']');
6781 if( pStr->bErr ){
6782 sqlite3_result_error_nomem(ctx);
6783 assert( pStr->bStatic );
6784 }else{
6785 sqlite3_result_text(ctx, pStr->zBuf, pStr->nUsed,
6786 pStr->bStatic ? SQLITE_TRANSIENT : sqlite3_free);
6787 pStr->bStatic = 1;
6788 }
6789 }else{
6790 sqlite3_result_text(ctx, "[]", 2, SQLITE_STATIC);
6791 }
6792 sqlite3_result_subtype(ctx, JSON_SUBTYPE);
6793 }
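
Reviewer sketch (not part of the patch): the step/final pair above builds the array text incrementally. The first step writes '[', each later step writes ',' before its value, and the final callback closes with ']' or returns "[]" when no step ever ran. A minimal plain-C illustration of that pattern follows. `ArrayAcc`, `acc_append`, `acc_step` and `acc_final` are hypothetical names, and values are assumed to arrive as already-rendered text in a fixed-size buffer rather than through jsonAppendValue().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for JsonString: a fixed-size text accumulator. */
typedef struct ArrayAcc {
  char zBuf[256];   /* Accumulated text, always nul-terminated */
  size_t nUsed;     /* Bytes of zBuf in use; 0 means "no step yet" */
} ArrayAcc;

/* Append n bytes of z, silently truncating if the buffer is full. */
void acc_append(ArrayAcc *p, const char *z, size_t n){
  size_t nFree = sizeof(p->zBuf) - 1 - p->nUsed;
  if( n>nFree ) n = nFree;
  memcpy(p->zBuf + p->nUsed, z, n);
  p->nUsed += n;
  p->zBuf[p->nUsed] = 0;
}

/* Mirrors jsonArrayStep(): '[' on the first call, ',' on later calls. */
void acc_step(ArrayAcc *p, const char *zValue){
  assert( zValue!=0 );
  acc_append(p, p->nUsed==0 ? "[" : ",", 1);
  acc_append(p, zValue, strlen(zValue));
}

/* Mirrors jsonArrayFinal(): close the array, or yield "[]" when no step
** ever ran (p==0 plays the role of a NULL aggregate context). */
const char *acc_final(ArrayAcc *p){
  if( p==0 ) return "[]";
  acc_append(p, "]", 1);
  return p->zBuf;
}
```

A caller would zero an `ArrayAcc`, call `acc_step()` once per row, and read the result from `acc_final()`. The real code additionally tracks out-of-memory state (bErr) and hands buffer ownership to SQLite, which this sketch leaves out.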
6794
6795 /*
6796 ** json_group_object(NAME,VALUE)
6797 **
6798 ** Return a JSON object composed of all names and values in the aggregate.
6799 */
6800 static void jsonObjectStep(
6801 sqlite3_context *ctx,
6802 int argc,
6803 sqlite3_value **argv
6804 ){
6805 JsonString *pStr;
6806 const char *z;
6807 u32 n;
6808 pStr = (JsonString*)sqlite3_aggregate_context(ctx, sizeof(*pStr));
6809 if( pStr ){
6810 if( pStr->zBuf==0 ){
6811 jsonInit(pStr, ctx);
6812 jsonAppendChar(pStr, '{');
6813 }else{
6814 jsonAppendChar(pStr, ',');
6815 pStr->pCtx = ctx;
6816 }
6817 z = (const char*)sqlite3_value_text(argv[0]);
6818 n = (u32)sqlite3_value_bytes(argv[0]);
6819 jsonAppendString(pStr, z, n);
6820 jsonAppendChar(pStr, ':');
6821 jsonAppendValue(pStr, argv[1]);
6822 }
6823 }
6824 static void jsonObjectFinal(sqlite3_context *ctx){
6825 JsonString *pStr;
6826 pStr = (JsonString*)sqlite3_aggregate_context(ctx, 0);
6827 if( pStr ){
6828 jsonAppendChar(pStr, '}');
6829 if( pStr->bErr ){
6830 sqlite3_result_error_nomem(ctx);
6831 assert( pStr->bStatic );
6832 }else{
6833 sqlite3_result_text(ctx, pStr->zBuf, pStr->nUsed,
6834 pStr->bStatic ? SQLITE_TRANSIENT : sqlite3_free);
6835 pStr->bStatic = 1;
6836 }
6837 }else{
6838 sqlite3_result_text(ctx, "{}", 2, SQLITE_STATIC);
6839 }
6840 sqlite3_result_subtype(ctx, JSON_SUBTYPE);
6841 }
6842
6843
6844 #ifndef SQLITE_OMIT_VIRTUALTABLE
6845 /****************************************************************************
6846 ** The json_each virtual table
6847 ****************************************************************************/
6848 typedef struct JsonEachCursor JsonEachCursor;
6849 struct JsonEachCursor {
6850 sqlite3_vtab_cursor base; /* Base class - must be first */
6851 u32 iRowid; /* The rowid */
6852 u32 iBegin; /* The first node of the scan */
6853 u32 i; /* Index in sParse.aNode[] of current row */
6854 u32 iEnd; /* EOF when i equals or exceeds this value */
6855 u8 eType; /* Type of top-level element */
6856 u8 bRecursive; /* True for json_tree(). False for json_each() */
6857 char *zJson; /* Input JSON */
6858 char *zRoot; /* Path by which to filter zJson */
6859 JsonParse sParse; /* Parse of the input JSON */
6860 };
6861
6862 /* Constructor for the json_each virtual table */
6863 static int jsonEachConnect(
6864 sqlite3 *db,
6865 void *pAux,
6866 int argc, const char *const*argv,
6867 sqlite3_vtab **ppVtab,
6868 char **pzErr
6869 ){
6870 sqlite3_vtab *pNew;
6871 int rc;
6872
6873 /* Column numbers */
6874 #define JEACH_KEY 0
6875 #define JEACH_VALUE 1
6876 #define JEACH_TYPE 2
6877 #define JEACH_ATOM 3
6878 #define JEACH_ID 4
6879 #define JEACH_PARENT 5
6880 #define JEACH_FULLKEY 6
6881 #define JEACH_PATH 7
6882 #define JEACH_JSON 8
6883 #define JEACH_ROOT 9
6884
6885 UNUSED_PARAM(pzErr);
6886 UNUSED_PARAM(argv);
6887 UNUSED_PARAM(argc);
6888 UNUSED_PARAM(pAux);
6889 rc = sqlite3_declare_vtab(db,
6890 "CREATE TABLE x(key,value,type,atom,id,parent,fullkey,path,"
6891 "json HIDDEN,root HIDDEN)");
6892 if( rc==SQLITE_OK ){
6893 pNew = *ppVtab = sqlite3_malloc( sizeof(*pNew) );
6894 if( pNew==0 ) return SQLITE_NOMEM;
6895 memset(pNew, 0, sizeof(*pNew));
6896 }
6897 return rc;
6898 }
6899
6900 /* destructor for json_each virtual table */
6901 static int jsonEachDisconnect(sqlite3_vtab *pVtab){
6902 sqlite3_free(pVtab);
6903 return SQLITE_OK;
6904 }
6905
6906 /* constructor for a JsonEachCursor object for json_each(). */
6907 static int jsonEachOpenEach(sqlite3_vtab *p, sqlite3_vtab_cursor **ppCursor){
6908 JsonEachCursor *pCur;
6909
6910 UNUSED_PARAM(p);
6911 pCur = sqlite3_malloc( sizeof(*pCur) );
6912 if( pCur==0 ) return SQLITE_NOMEM;
6913 memset(pCur, 0, sizeof(*pCur));
6914 *ppCursor = &pCur->base;
6915 return SQLITE_OK;
6916 }
6917
6918 /* constructor for a JsonEachCursor object for json_tree(). */
6919 static int jsonEachOpenTree(sqlite3_vtab *p, sqlite3_vtab_cursor **ppCursor){
6920 int rc = jsonEachOpenEach(p, ppCursor);
6921 if( rc==SQLITE_OK ){
6922 JsonEachCursor *pCur = (JsonEachCursor*)*ppCursor;
6923 pCur->bRecursive = 1;
6924 }
6925 return rc;
6926 }
6927
6928 /* Reset a JsonEachCursor back to its original state. Free any memory
6929 ** held. */
6930 static void jsonEachCursorReset(JsonEachCursor *p){
6931 sqlite3_free(p->zJson);
6932 sqlite3_free(p->zRoot);
6933 jsonParseReset(&p->sParse);
6934 p->iRowid = 0;
6935 p->i = 0;
6936 p->iEnd = 0;
6937 p->eType = 0;
6938 p->zJson = 0;
6939 p->zRoot = 0;
6940 }
6941
6942 /* Destructor for a jsonEachCursor object */
6943 static int jsonEachClose(sqlite3_vtab_cursor *cur){
6944 JsonEachCursor *p = (JsonEachCursor*)cur;
6945 jsonEachCursorReset(p);
6946 sqlite3_free(cur);
6947 return SQLITE_OK;
6948 }
6949
6950 /* Return TRUE if the jsonEachCursor object has been advanced off the end
6951 ** of the JSON object */
6952 static int jsonEachEof(sqlite3_vtab_cursor *cur){
6953 JsonEachCursor *p = (JsonEachCursor*)cur;
6954 return p->i >= p->iEnd;
6955 }
6956
6957 /* Advance the cursor to the next element for json_each() or json_tree() */
6958 static int jsonEachNext(sqlite3_vtab_cursor *cur){
6959 JsonEachCursor *p = (JsonEachCursor*)cur;
6960 if( p->bRecursive ){
6961 if( p->sParse.aNode[p->i].jnFlags & JNODE_LABEL ) p->i++;
6962 p->i++;
6963 p->iRowid++;
6964 if( p->i<p->iEnd ){
6965 u32 iUp = p->sParse.aUp[p->i];
6966 JsonNode *pUp = &p->sParse.aNode[iUp];
6967 p->eType = pUp->eType;
6968 if( pUp->eType==JSON_ARRAY ){
6969 if( iUp==p->i-1 ){
6970 pUp->u.iKey = 0;
6971 }else{
6972 pUp->u.iKey++;
6973 }
6974 }
6975 }
6976 }else{
6977 switch( p->eType ){
6978 case JSON_ARRAY: {
6979 p->i += jsonNodeSize(&p->sParse.aNode[p->i]);
6980 p->iRowid++;
6981 break;
6982 }
6983 case JSON_OBJECT: {
6984 p->i += 1 + jsonNodeSize(&p->sParse.aNode[p->i+1]);
6985 p->iRowid++;
6986 break;
6987 }
6988 default: {
6989 p->i = p->iEnd;
6990 break;
6991 }
6992 }
6993 }
6994 return SQLITE_OK;
6995 }
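
Reviewer sketch (not part of the patch): in the non-recursive json_each() branch above, the cursor skips each child's whole subtree by adding the child's node size. The toy below illustrates that walk on a flat, pre-order node array. `ToyNode`, `toy_node_size` and `toy_each` are hypothetical, only arrays and integers are modelled, and the sketch assumes a container occupies n+1 slots, consistent with how jsonEachFilter() computes iEnd as `p->i + pNode->n + 1`.

```c
#include <assert.h>

enum { T_INT = 0, T_ARRAY = 1 };   /* Hypothetical, simplified type codes */

/* Hypothetical reduced node: a container records how many node slots
** its descendants occupy, as JsonNode.n is used in jsonEachFilter(). */
typedef struct ToyNode {
  int eType;        /* T_INT or T_ARRAY */
  unsigned n;       /* For T_ARRAY: number of descendant slots */
} ToyNode;

/* Slots occupied by a node plus everything beneath it. */
unsigned toy_node_size(const ToyNode *p){
  return p->eType==T_ARRAY ? p->n + 1 : 1;
}

/* Visit the direct children of the array at aNode[iRoot], writing their
** indexes into aOut[]. Returns the number of children recorded. */
unsigned toy_each(const ToyNode *aNode, unsigned iRoot,
                  unsigned *aOut, unsigned nOut){
  unsigned i, iEnd, nFound = 0;
  assert( aNode[iRoot].eType==T_ARRAY );
  i = iRoot + 1;                         /* First child */
  iEnd = iRoot + aNode[iRoot].n + 1;     /* One past the last descendant */
  while( i<iEnd && nFound<nOut ){
    aOut[nFound++] = i;
    i += toy_node_size(&aNode[i]);       /* Skip the child's whole subtree */
  }
  return nFound;
}
```

For the document `[1,[2,3],4]` laid out as six nodes, the walk lands on slots 1, 2 and 5, so the nested array's elements are never visited as rows of their own.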
6996
6997 /* Append the name of the path for element i to pStr
6998 */
6999 static void jsonEachComputePath(
7000 JsonEachCursor *p, /* The cursor */
7001 JsonString *pStr, /* Write the path here */
7002 u32 i /* Path to this element */
7003 ){
7004 JsonNode *pNode, *pUp;
7005 u32 iUp;
7006 if( i==0 ){
7007 jsonAppendChar(pStr, '$');
7008 return;
7009 }
7010 iUp = p->sParse.aUp[i];
7011 jsonEachComputePath(p, pStr, iUp);
7012 pNode = &p->sParse.aNode[i];
7013 pUp = &p->sParse.aNode[iUp];
7014 if( pUp->eType==JSON_ARRAY ){
7015 jsonPrintf(30, pStr, "[%d]", pUp->u.iKey);
7016 }else{
7017 assert( pUp->eType==JSON_OBJECT );
7018 if( (pNode->jnFlags & JNODE_LABEL)==0 ) pNode--;
7019 assert( pNode->eType==JSON_STRING );
7020 assert( pNode->jnFlags & JNODE_LABEL );
7021 jsonPrintf(pNode->n+1, pStr, ".%.*s", pNode->n-2, pNode->u.zJContent+1);
7022 }
7023 }
7024
7025 /* Return the value of a column */
7026 static int jsonEachColumn(
7027 sqlite3_vtab_cursor *cur, /* The cursor */
7028 sqlite3_context *ctx, /* First argument to sqlite3_result_...() */
7029 int i /* Which column to return */
7030 ){
7031 JsonEachCursor *p = (JsonEachCursor*)cur;
7032 JsonNode *pThis = &p->sParse.aNode[p->i];
7033 switch( i ){
7034 case JEACH_KEY: {
7035 if( p->i==0 ) break;
7036 if( p->eType==JSON_OBJECT ){
7037 jsonReturn(pThis, ctx, 0);
7038 }else if( p->eType==JSON_ARRAY ){
7039 u32 iKey;
7040 if( p->bRecursive ){
7041 if( p->iRowid==0 ) break;
7042 iKey = p->sParse.aNode[p->sParse.aUp[p->i]].u.iKey;
7043 }else{
7044 iKey = p->iRowid;
7045 }
7046 sqlite3_result_int64(ctx, (sqlite3_int64)iKey);
7047 }
7048 break;
7049 }
7050 case JEACH_VALUE: {
7051 if( pThis->jnFlags & JNODE_LABEL ) pThis++;
7052 jsonReturn(pThis, ctx, 0);
7053 break;
7054 }
7055 case JEACH_TYPE: {
7056 if( pThis->jnFlags & JNODE_LABEL ) pThis++;
7057 sqlite3_result_text(ctx, jsonType[pThis->eType], -1, SQLITE_STATIC);
7058 break;
7059 }
7060 case JEACH_ATOM: {
7061 if( pThis->jnFlags & JNODE_LABEL ) pThis++;
7062 if( pThis->eType>=JSON_ARRAY ) break;
7063 jsonReturn(pThis, ctx, 0);
7064 break;
7065 }
7066 case JEACH_ID: {
7067 sqlite3_result_int64(ctx,
7068 (sqlite3_int64)p->i + ((pThis->jnFlags & JNODE_LABEL)!=0));
7069 break;
7070 }
7071 case JEACH_PARENT: {
7072 if( p->i>p->iBegin && p->bRecursive ){
7073 sqlite3_result_int64(ctx, (sqlite3_int64)p->sParse.aUp[p->i]);
7074 }
7075 break;
7076 }
7077 case JEACH_FULLKEY: {
7078 JsonString x;
7079 jsonInit(&x, ctx);
7080 if( p->bRecursive ){
7081 jsonEachComputePath(p, &x, p->i);
7082 }else{
7083 if( p->zRoot ){
7084 jsonAppendRaw(&x, p->zRoot, (int)strlen(p->zRoot));
7085 }else{
7086 jsonAppendChar(&x, '$');
7087 }
7088 if( p->eType==JSON_ARRAY ){
7089 jsonPrintf(30, &x, "[%d]", p->iRowid);
7090 }else{
7091 jsonPrintf(pThis->n, &x, ".%.*s", pThis->n-2, pThis->u.zJContent+1);
7092 }
7093 }
7094 jsonResult(&x);
7095 break;
7096 }
7097 case JEACH_PATH: {
7098 if( p->bRecursive ){
7099 JsonString x;
7100 jsonInit(&x, ctx);
7101 jsonEachComputePath(p, &x, p->sParse.aUp[p->i]);
7102 jsonResult(&x);
7103 break;
7104 }
7105 /* For json_each() path and root are the same so fall through
7106 ** into the root case */
7107 }
7108 case JEACH_ROOT: {
7109 const char *zRoot = p->zRoot;
7110 if( zRoot==0 ) zRoot = "$";
7111 sqlite3_result_text(ctx, zRoot, -1, SQLITE_STATIC);
7112 break;
7113 }
7114 case JEACH_JSON: {
7115 assert( i==JEACH_JSON );
7116 sqlite3_result_text(ctx, p->sParse.zJson, -1, SQLITE_STATIC);
7117 break;
7118 }
7119 }
7120 return SQLITE_OK;
7121 }
7122
7123 /* Return the current rowid value */
7124 static int jsonEachRowid(sqlite3_vtab_cursor *cur, sqlite_int64 *pRowid){
7125 JsonEachCursor *p = (JsonEachCursor*)cur;
7126 *pRowid = p->iRowid;
7127 return SQLITE_OK;
7128 }
7129
7130 /* The query strategy is to look for an equality constraint on the json
7131 ** column. Without such a constraint, the table cannot operate. idxNum is
7132 ** 1 if the constraint is found, 3 if the constraint and zRoot are found,
7133 ** and 0 otherwise.
7134 */
7135 static int jsonEachBestIndex(
7136 sqlite3_vtab *tab,
7137 sqlite3_index_info *pIdxInfo
7138 ){
7139 int i;
7140 int jsonIdx = -1;
7141 int rootIdx = -1;
7142 const struct sqlite3_index_constraint *pConstraint;
7143
7144 UNUSED_PARAM(tab);
7145 pConstraint = pIdxInfo->aConstraint;
7146 for(i=0; i<pIdxInfo->nConstraint; i++, pConstraint++){
7147 if( pConstraint->usable==0 ) continue;
7148 if( pConstraint->op!=SQLITE_INDEX_CONSTRAINT_EQ ) continue;
7149 switch( pConstraint->iColumn ){
7150 case JEACH_JSON: jsonIdx = i; break;
7151 case JEACH_ROOT: rootIdx = i; break;
7152 default: /* no-op */ break;
7153 }
7154 }
7155 if( jsonIdx<0 ){
7156 pIdxInfo->idxNum = 0;
7157 pIdxInfo->estimatedCost = 1e99;
7158 }else{
7159 pIdxInfo->estimatedCost = 1.0;
7160 pIdxInfo->aConstraintUsage[jsonIdx].argvIndex = 1;
7161 pIdxInfo->aConstraintUsage[jsonIdx].omit = 1;
7162 if( rootIdx<0 ){
7163 pIdxInfo->idxNum = 1;
7164 }else{
7165 pIdxInfo->aConstraintUsage[rootIdx].argvIndex = 2;
7166 pIdxInfo->aConstraintUsage[rootIdx].omit = 1;
7167 pIdxInfo->idxNum = 3;
7168 }
7169 }
7170 return SQLITE_OK;
7171 }
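
Reviewer sketch (not part of the patch): the planner contract implemented above reduces to a small decision table over two inputs, namely whether a usable equality constraint exists on the hidden json column and on the hidden root column. The following plain-C summary is an illustration only. `ToyPlan` and `toy_best_index` are hypothetical names, and the real function writes these values into sqlite3_index_info rather than returning a struct.

```c
#include <assert.h>

/* Hypothetical summary of the plan jsonEachBestIndex() reports. */
typedef struct ToyPlan {
  int idxNum;          /* 0: unusable, 1: json only, 3: json and root */
  double cost;         /* Estimated cost handed to the planner */
  int jsonArgv;        /* argvIndex for the json constraint, 0 if unused */
  int rootArgv;        /* argvIndex for the root constraint, 0 if unused */
} ToyPlan;

ToyPlan toy_best_index(int hasJsonEq, int hasRootEq){
  ToyPlan p = {0, 1e99, 0, 0};     /* Default: no json constraint found */
  if( hasJsonEq ){
    p.cost = 1.0;
    p.jsonArgv = 1;                /* json text arrives as argv[0] in xFilter */
    p.idxNum = 1;
    if( hasRootEq ){
      p.rootArgv = 2;              /* root path arrives as argv[1] in xFilter */
      p.idxNum = 3;
    }
  }
  /* A root constraint without a json constraint is ignored. */
  assert( p.idxNum==0 || p.jsonArgv==1 );
  return p;
}
```

This matches how jsonEachFilter() later interprets idxNum: 0 returns an empty result, 1 scans from the top-level node, and 3 first resolves the root path.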
7172
7173 /* Start a search on a new JSON string */
7174 static int jsonEachFilter(
7175 sqlite3_vtab_cursor *cur,
7176 int idxNum, const char *idxStr,
7177 int argc, sqlite3_value **argv
7178 ){
7179 JsonEachCursor *p = (JsonEachCursor*)cur;
7180 const char *z;
7181 const char *zRoot = 0;
7182 sqlite3_int64 n;
7183
7184 UNUSED_PARAM(idxStr);
7185 UNUSED_PARAM(argc);
7186 jsonEachCursorReset(p);
7187 if( idxNum==0 ) return SQLITE_OK;
7188 z = (const char*)sqlite3_value_text(argv[0]);
7189 if( z==0 ) return SQLITE_OK;
7190 n = sqlite3_value_bytes(argv[0]);
7191 p->zJson = sqlite3_malloc64( n+1 );
7192 if( p->zJson==0 ) return SQLITE_NOMEM;
7193 memcpy(p->zJson, z, (size_t)n+1);
7194 if( jsonParse(&p->sParse, 0, p->zJson) ){
7195 int rc = SQLITE_NOMEM;
7196 if( p->sParse.oom==0 ){
7197 sqlite3_free(cur->pVtab->zErrMsg);
7198 cur->pVtab->zErrMsg = sqlite3_mprintf("malformed JSON");
7199 if( cur->pVtab->zErrMsg ) rc = SQLITE_ERROR;
7200 }
7201 jsonEachCursorReset(p);
7202 return rc;
7203 }else if( p->bRecursive && jsonParseFindParents(&p->sParse) ){
7204 jsonEachCursorReset(p);
7205 return SQLITE_NOMEM;
7206 }else{
7207 JsonNode *pNode = 0;
7208 if( idxNum==3 ){
7209 const char *zErr = 0;
7210 zRoot = (const char*)sqlite3_value_text(argv[1]);
7211 if( zRoot==0 ) return SQLITE_OK;
7212 n = sqlite3_value_bytes(argv[1]);
7213 p->zRoot = sqlite3_malloc64( n+1 );
7214 if( p->zRoot==0 ) return SQLITE_NOMEM;
7215 memcpy(p->zRoot, zRoot, (size_t)n+1);
7216 if( zRoot[0]!='$' ){
7217 zErr = zRoot;
7218 }else{
7219 pNode = jsonLookupStep(&p->sParse, 0, p->zRoot+1, 0, &zErr);
7220 }
7221 if( zErr ){
7222 sqlite3_free(cur->pVtab->zErrMsg);
7223 cur->pVtab->zErrMsg = jsonPathSyntaxError(zErr);
7224 jsonEachCursorReset(p);
7225 return cur->pVtab->zErrMsg ? SQLITE_ERROR : SQLITE_NOMEM;
7226 }else if( pNode==0 ){
7227 return SQLITE_OK;
7228 }
7229 }else{
7230 pNode = p->sParse.aNode;
7231 }
7232 p->iBegin = p->i = (int)(pNode - p->sParse.aNode);
7233 p->eType = pNode->eType;
7234 if( p->eType>=JSON_ARRAY ){
7235 pNode->u.iKey = 0;
7236 p->iEnd = p->i + pNode->n + 1;
7237 if( p->bRecursive ){
7238 p->eType = p->sParse.aNode[p->sParse.aUp[p->i]].eType;
7239 if( p->i>0 && (p->sParse.aNode[p->i-1].jnFlags & JNODE_LABEL)!=0 ){
7240 p->i--;
7241 }
7242 }else{
7243 p->i++;
7244 }
7245 }else{
7246 p->iEnd = p->i+1;
7247 }
7248 }
7249 return SQLITE_OK;
7250 }
7251
7252 /* The methods of the json_each virtual table */
7253 static sqlite3_module jsonEachModule = {
7254 0, /* iVersion */
7255 0, /* xCreate */
7256 jsonEachConnect, /* xConnect */
7257 jsonEachBestIndex, /* xBestIndex */
7258 jsonEachDisconnect, /* xDisconnect */
7259 0, /* xDestroy */
7260 jsonEachOpenEach, /* xOpen - open a cursor */
7261 jsonEachClose, /* xClose - close a cursor */
7262 jsonEachFilter, /* xFilter - configure scan constraints */
7263 jsonEachNext, /* xNext - advance a cursor */
7264 jsonEachEof, /* xEof - check for end of scan */
7265 jsonEachColumn, /* xColumn - read data */
7266 jsonEachRowid, /* xRowid - read data */
7267 0, /* xUpdate */
7268 0, /* xBegin */
7269 0, /* xSync */
7270 0, /* xCommit */
7271 0, /* xRollback */
7272 0, /* xFindMethod */
7273 0, /* xRename */
7274 0, /* xSavepoint */
7275 0, /* xRelease */
7276 0 /* xRollbackTo */
7277 };
7278
7279 /* The methods of the json_tree virtual table. */
7280 static sqlite3_module jsonTreeModule = {
7281 0, /* iVersion */
7282 0, /* xCreate */
7283 jsonEachConnect, /* xConnect */
7284 jsonEachBestIndex, /* xBestIndex */
7285 jsonEachDisconnect, /* xDisconnect */
7286 0, /* xDestroy */
7287 jsonEachOpenTree, /* xOpen - open a cursor */
7288 jsonEachClose, /* xClose - close a cursor */
7289 jsonEachFilter, /* xFilter - configure scan constraints */
7290 jsonEachNext, /* xNext - advance a cursor */
7291 jsonEachEof, /* xEof - check for end of scan */
7292 jsonEachColumn, /* xColumn - read data */
7293 jsonEachRowid, /* xRowid - read data */
7294 0, /* xUpdate */
7295 0, /* xBegin */
7296 0, /* xSync */
7297 0, /* xCommit */
7298 0, /* xRollback */
7299 0, /* xFindMethod */
7300 0, /* xRename */
7301 0, /* xSavepoint */
7302 0, /* xRelease */
7303 0 /* xRollbackTo */
7304 };
7305 #endif /* SQLITE_OMIT_VIRTUALTABLE */
7306
7307 /****************************************************************************
7308 ** The following routines are the only publicly visible identifiers in this
7309 ** file. Call the following routines in order to register the various SQL
7310 ** functions and the virtual table implemented by this file.
7311 ****************************************************************************/
7312
7313 SQLITE_PRIVATE int sqlite3Json1Init(sqlite3 *db){
7314 int rc = SQLITE_OK;
7315 unsigned int i;
7316 static const struct {
7317 const char *zName;
7318 int nArg;
7319 int flag;
7320 void (*xFunc)(sqlite3_context*,int,sqlite3_value**);
7321 } aFunc[] = {
7322 { "json", 1, 0, jsonRemoveFunc },
7323 { "json_array", -1, 0, jsonArrayFunc },
7324 { "json_array_length", 1, 0, jsonArrayLengthFunc },
7325 { "json_array_length", 2, 0, jsonArrayLengthFunc },
7326 { "json_extract", -1, 0, jsonExtractFunc },
7327 { "json_insert", -1, 0, jsonSetFunc },
7328 { "json_object", -1, 0, jsonObjectFunc },
7329 { "json_remove", -1, 0, jsonRemoveFunc },
7330 { "json_replace", -1, 0, jsonReplaceFunc },
7331 { "json_set", -1, 1, jsonSetFunc },
7332 { "json_type", 1, 0, jsonTypeFunc },
7333 { "json_type", 2, 0, jsonTypeFunc },
7334 { "json_valid", 1, 0, jsonValidFunc },
7335
7336 #if SQLITE_DEBUG
7337 /* DEBUG and TESTING functions */
7338 { "json_parse", 1, 0, jsonParseFunc },
7339 { "json_test1", 1, 0, jsonTest1Func },
7340 #endif
7341 };
7342 static const struct {
7343 const char *zName;
7344 int nArg;
7345 void (*xStep)(sqlite3_context*,int,sqlite3_value**);
7346 void (*xFinal)(sqlite3_context*);
7347 } aAgg[] = {
7348 { "json_group_array", 1, jsonArrayStep, jsonArrayFinal },
7349 { "json_group_object", 2, jsonObjectStep, jsonObjectFinal },
7350 };
7351 #ifndef SQLITE_OMIT_VIRTUALTABLE
7352 static const struct {
7353 const char *zName;
7354 sqlite3_module *pModule;
7355 } aMod[] = {
7356 { "json_each", &jsonEachModule },
7357 { "json_tree", &jsonTreeModule },
7358 };
7359 #endif
7360 for(i=0; i<sizeof(aFunc)/sizeof(aFunc[0]) && rc==SQLITE_OK; i++){
7361 rc = sqlite3_create_function(db, aFunc[i].zName, aFunc[i].nArg,
7362 SQLITE_UTF8 | SQLITE_DETERMINISTIC,
7363 (void*)&aFunc[i].flag,
7364 aFunc[i].xFunc, 0, 0);
7365 }
7366 for(i=0; i<sizeof(aAgg)/sizeof(aAgg[0]) && rc==SQLITE_OK; i++){
7367 rc = sqlite3_create_function(db, aAgg[i].zName, aAgg[i].nArg,
7368 SQLITE_UTF8 | SQLITE_DETERMINISTIC, 0,
7369 0, aAgg[i].xStep, aAgg[i].xFinal);
7370 }
7371 #ifndef SQLITE_OMIT_VIRTUALTABLE
7372 for(i=0; i<sizeof(aMod)/sizeof(aMod[0]) && rc==SQLITE_OK; i++){
7373 rc = sqlite3_create_module(db, aMod[i].zName, aMod[i].pModule, 0);
7374 }
7375 #endif
7376 return rc;
7377 }
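
Reviewer sketch (not part of the patch): the registration loop above passes `&aFunc[i].flag` as the user-data pointer, which is how json_set and json_insert share jsonSetFunc() and tell themselves apart through bIsSet. The toy below reproduces that pattern without SQLite. `ToyFn`, `aToyFunc`, `toy_set_or_insert` and `toy_call` are hypothetical names, and the lookup by name stands in for SQLite's own function dispatch.

```c
#include <assert.h>
#include <string.h>

typedef const char *(*ToyFn)(void *pUserData);

/* One implementation, two behaviours, selected by the int that the
** registration table points at (compare bIsSet in jsonSetFunc()). */
const char *toy_set_or_insert(void *pUserData){
  int bIsSet = *(int*)pUserData;
  return bIsSet ? "set" : "insert";
}

/* Registration table: each row carries its own flag value. */
static struct ToyFunc {
  const char *zName;
  int flag;
  ToyFn xFunc;
} aToyFunc[] = {
  { "toy_insert", 0, toy_set_or_insert },
  { "toy_set",    1, toy_set_or_insert },
};

/* Look up zName and invoke it with a pointer to its row's flag.
** Returns 0 if no such function is registered. */
const char *toy_call(const char *zName){
  unsigned i;
  assert( zName!=0 );
  for(i=0; i<sizeof(aToyFunc)/sizeof(aToyFunc[0]); i++){
    if( strcmp(aToyFunc[i].zName, zName)==0 ){
      return aToyFunc[i].xFunc((void*)&aToyFunc[i].flag);
    }
  }
  return 0;
}
```

Because the pointer refers to static storage, it stays valid for as long as the function is registered, which is the property the real table relies on.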
7378
7379
7380 #ifndef SQLITE_CORE
7381 #ifdef _WIN32
7382 __declspec(dllexport)
7383 #endif
7384 SQLITE_API int SQLITE_STDCALL sqlite3_json_init(
7385 sqlite3 *db,
7386 char **pzErrMsg,
7387 const sqlite3_api_routines *pApi
7388 ){
7389 SQLITE_EXTENSION_INIT2(pApi);
7390 (void)pzErrMsg; /* Unused parameter */
7391 return sqlite3Json1Init(db);
7392 }
7393 #endif
7394 #endif /* !defined(SQLITE_CORE) || defined(SQLITE_ENABLE_JSON1) */
7395
7396 /************** End of json1.c ***********************************************/
7397 /************** Begin file fts5.c ********************************************/
7398
7399
7400 #if !defined(SQLITE_CORE) || defined(SQLITE_ENABLE_FTS5)
7401
7402 #if !defined(NDEBUG) && !defined(SQLITE_DEBUG)
7403 # define NDEBUG 1
7404 #endif
7405 #if defined(NDEBUG) && defined(SQLITE_DEBUG)
7406 # undef NDEBUG
7407 #endif
7408
7409 /*
7410 ** 2014 May 31
7411 **
7412 ** The author disclaims copyright to this source code. In place of
7413 ** a legal notice, here is a blessing:
7414 **
7415 ** May you do good and not evil.
7416 ** May you find forgiveness for yourself and forgive others.
7417 ** May you share freely, never taking more than you give.
7418 **
7419 ******************************************************************************
7420 **
7421 ** Interfaces to extend FTS5. Using the interfaces defined in this file,
7422 ** FTS5 may be extended with:
7423 **
7424 ** * custom tokenizers, and
7425 ** * custom auxiliary functions.
7426 */
7427
7428
7429 #ifndef _FTS5_H
7430 #define _FTS5_H
7431
7432 /* #include "sqlite3.h" */
7433
7434 #if 0
7435 extern "C" {
7436 #endif
7437
7438 /*************************************************************************
7439 ** CUSTOM AUXILIARY FUNCTIONS
7440 **
7441 ** Virtual table implementations may overload SQL functions by implementing
7442 ** the sqlite3_module.xFindFunction() method.
7443 */
7444
7445 typedef struct Fts5ExtensionApi Fts5ExtensionApi;
7446 typedef struct Fts5Context Fts5Context;
7447 typedef struct Fts5PhraseIter Fts5PhraseIter;
7448
7449 typedef void (*fts5_extension_function)(
7450 const Fts5ExtensionApi *pApi, /* API offered by current FTS version */
7451 Fts5Context *pFts, /* First arg to pass to pApi functions */
7452 sqlite3_context *pCtx, /* Context for returning result/error */
7453 int nVal, /* Number of values in apVal[] array */
7454 sqlite3_value **apVal /* Array of trailing arguments */
7455 );
7456
7457 struct Fts5PhraseIter {
7458 const unsigned char *a;
7459 const unsigned char *b;
7460 };
7461
7462 /*
7463 ** EXTENSION API FUNCTIONS
7464 **
7465 ** xUserData(pFts):
7466 ** Return a copy of the context pointer the extension function was
7467 ** registered with.
7468 **
7469 ** xColumnTotalSize(pFts, iCol, pnToken):
7470 ** If parameter iCol is less than zero, set output variable *pnToken
7471 ** to the total number of tokens in the FTS5 table. Or, if iCol is
7472 ** non-negative but less than the number of columns in the table, return
7473 ** the total number of tokens in column iCol, considering all rows in
7474 ** the FTS5 table.
7475 **
7476 ** If parameter iCol is greater than or equal to the number of columns
7477 ** in the table, SQLITE_RANGE is returned. Or, if an error occurs (e.g.
7478 ** an OOM condition or IO error), an appropriate SQLite error code is
7479 ** returned.
7480 **
7481 ** xColumnCount(pFts):
7482 ** Return the number of columns in the table.
7483 **
7484 ** xColumnSize(pFts, iCol, pnToken):
7485 ** If parameter iCol is less than zero, set output variable *pnToken
7486 ** to the total number of tokens in the current row. Or, if iCol is
7487 ** non-negative but less than the number of columns in the table, set
7488 ** *pnToken to the number of tokens in column iCol of the current row.
7489 **
7490 ** If parameter iCol is greater than or equal to the number of columns
7491 ** in the table, SQLITE_RANGE is returned. Or, if an error occurs (e.g.
7492 ** an OOM condition or IO error), an appropriate SQLite error code is
7493 ** returned.
7494 **
7495 ** xColumnText:
7496 ** This function attempts to retrieve the text of column iCol of the
7497 ** current document. If successful, (*pz) is set to point to a buffer
7498 ** containing the text in utf-8 encoding, (*pn) is set to the size in bytes
7499 ** (not characters) of the buffer and SQLITE_OK is returned. Otherwise,
7500 ** if an error occurs, an SQLite error code is returned and the final values
7501 ** of (*pz) and (*pn) are undefined.
7502 **
7503 ** xPhraseCount:
7504 ** Returns the number of phrases in the current query expression.
7505 **
7506 ** xPhraseSize:
7507 ** Returns the number of tokens in phrase iPhrase of the query. Phrases
7508 ** are numbered starting from zero.
7509 **
7510 ** xInstCount:
7511 ** Set *pnInst to the total number of occurrences of all phrases within
7512 ** the query within the current row. Return SQLITE_OK if successful, or
7513 ** an error code (e.g. SQLITE_NOMEM) if an error occurs.
7514 **
7515 ** xInst:
7516 ** Query for the details of phrase match iIdx within the current row.
7517 ** Phrase matches are numbered starting from zero, so the iIdx argument
7518 ** should be greater than or equal to zero and smaller than the value
7519 ** output by xInstCount().
7520 **
7521 ** Returns SQLITE_OK if successful, or an error code (e.g. SQLITE_NOMEM)
7522 ** if an error occurs.
7523 **
7524 ** xRowid:
7525 ** Returns the rowid of the current row.
7526 **
7527 ** xTokenize:
7528 ** Tokenize text using the tokenizer belonging to the FTS5 table.
7529 **
7530 ** xQueryPhrase(pFts5, iPhrase, pUserData, xCallback):
7531 ** This API function is used to query the FTS table for phrase iPhrase
7532 ** of the current query. Specifically, a query equivalent to:
7533 **
7534 ** ... FROM ftstable WHERE ftstable MATCH $p ORDER BY rowid
7535 **
7536 ** with $p set to a phrase equivalent to the phrase iPhrase of the
7537 ** current query is executed. For each row visited, the callback function
7538 ** passed as the fourth argument is invoked. The context and API objects
7539 ** passed to the callback function may be used to access the properties of
7540 ** each matched row. Invoking Api.xUserData() returns a copy of the pointer
7541 ** passed as the third argument (pUserData).
7542 **
7543 ** If the callback function returns any value other than SQLITE_OK, the
7544 ** query is abandoned and the xQueryPhrase function returns immediately.
7545 ** If the returned value is SQLITE_DONE, xQueryPhrase returns SQLITE_OK.
7546 ** Otherwise, the error code is propagated upwards.
7547 **
7548 ** If the query runs to completion without incident, SQLITE_OK is returned.
7549 ** Or, if some error occurs before the query completes or is aborted by
7550 ** the callback, an SQLite error code is returned.
7551 **
7552 **
7553 ** xSetAuxdata(pFts5, pAux, xDelete)
7554 **
7555 ** Save the pointer passed as the second argument as the extension function's
7556 ** "auxiliary data". The pointer may then be retrieved by the current or any
7557 ** future invocation of the same fts5 extension function made as part of
7558 ** the same MATCH query using the xGetAuxdata() API.
7559 **
7560 ** Each extension function is allocated a single auxiliary data slot for
7561 ** each FTS query (MATCH expression). If the extension function is invoked
7562 ** more than once for a single FTS query, then all invocations share a
7563 ** single auxiliary data context.
7564 **
7565 ** If there is already an auxiliary data pointer when this function is
7566 ** invoked, then it is replaced by the new pointer. If an xDelete callback
7567 ** was specified along with the original pointer, it is invoked at this
7568 ** point.
7569 **
7570 ** The xDelete callback, if one is specified, is also invoked on the
7571 ** auxiliary data pointer after the FTS5 query has finished.
7572 **
7573 ** If an error (e.g. an OOM condition) occurs within this function,
7574 ** the auxiliary data is set to NULL and an error code is returned. If the
7575 ** xDelete parameter was not NULL, it is invoked on the auxiliary data
7576 ** pointer before returning.
7577 **
7578 **
7579 ** xGetAuxdata(pFts5, bClear)
7580 **
7581 ** Returns the current auxiliary data pointer for the fts5 extension
7582 ** function. See the xSetAuxdata() method for details.
7583 **
7584 ** If the bClear argument is non-zero, then the auxiliary data is cleared
7585 ** (set to NULL) before this function returns. In this case the xDelete
7586 ** callback, if any, is not invoked.
7587 **
7588 **
7589 ** xRowCount(pFts5, pnRow)
7590 **
7591 ** This function is used to retrieve the total number of rows in the table.
7592 ** In other words, the same value that would be returned by:
7593 **
7594 ** SELECT count(*) FROM ftstable;
7595 **
7596 ** xPhraseFirst()
7597 ** This function is used, along with type Fts5PhraseIter and the xPhraseNext
7598 ** method, to iterate through all instances of a single query phrase within
7599 ** the current row. This is the same information as is accessible via the
7600 ** xInstCount/xInst APIs. While the xInstCount/xInst APIs are more convenient
7601 ** to use, this API may be faster under some circumstances. To iterate
7602 ** through instances of phrase iPhrase, use the following code:
7603 **
7604 ** Fts5PhraseIter iter;
7605 ** int iCol, iOff;
7606 ** for(pApi->xPhraseFirst(pFts, iPhrase, &iter, &iCol, &iOff);
7607 ** iOff>=0;
7608 ** pApi->xPhraseNext(pFts, &iter, &iCol, &iOff)
7609 ** ){
7610 ** // An instance of phrase iPhrase at offset iOff of column iCol
7611 ** }
7612 **
7613 ** The Fts5PhraseIter structure is defined above. Applications should not
7614 ** modify this structure directly - it should only be used as shown above
7615 ** with the xPhraseFirst() and xPhraseNext() API methods.
7616 **
7617 ** xPhraseNext()
7618 ** See xPhraseFirst above.
7619 */
7620 struct Fts5ExtensionApi {
7621 int iVersion; /* Currently always set to 1 */
7622
7623 void *(*xUserData)(Fts5Context*);
7624
7625 int (*xColumnCount)(Fts5Context*);
7626 int (*xRowCount)(Fts5Context*, sqlite3_int64 *pnRow);
7627 int (*xColumnTotalSize)(Fts5Context*, int iCol, sqlite3_int64 *pnToken);
7628
7629 int (*xTokenize)(Fts5Context*,
7630 const char *pText, int nText, /* Text to tokenize */
7631 void *pCtx, /* Context passed to xToken() */
7632 int (*xToken)(void*, int, const char*, int, int, int) /* Callback */
7633 );
7634
7635 int (*xPhraseCount)(Fts5Context*);
7636 int (*xPhraseSize)(Fts5Context*, int iPhrase);
7637
7638 int (*xInstCount)(Fts5Context*, int *pnInst);
7639 int (*xInst)(Fts5Context*, int iIdx, int *piPhrase, int *piCol, int *piOff);
7640
7641 sqlite3_int64 (*xRowid)(Fts5Context*);
7642 int (*xColumnText)(Fts5Context*, int iCol, const char **pz, int *pn);
7643 int (*xColumnSize)(Fts5Context*, int iCol, int *pnToken);
7644
7645 int (*xQueryPhrase)(Fts5Context*, int iPhrase, void *pUserData,
7646 int(*)(const Fts5ExtensionApi*,Fts5Context*,void*)
7647 );
7648 int (*xSetAuxdata)(Fts5Context*, void *pAux, void(*xDelete)(void*));
7649 void *(*xGetAuxdata)(Fts5Context*, int bClear);
7650
7651 void (*xPhraseFirst)(Fts5Context*, int iPhrase, Fts5PhraseIter*, int*, int*);
7652 void (*xPhraseNext)(Fts5Context*, Fts5PhraseIter*, int *piCol, int *piOff);
7653 };
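
Reviewer sketch (not part of the patch): the xQueryPhrase() contract documented above treats a callback return of SQLITE_DONE as a request to stop early without error, while any other non-OK value is propagated. The toy loop below illustrates only that return-code handling. `toy_query_phrase`, `ToyRowCb` and the two sample callbacks are hypothetical; TOY_OK and TOY_DONE reuse the numeric values of SQLITE_OK (0) and SQLITE_DONE (101), and rows are simulated by a counter instead of an FTS5 index scan.

```c
#include <assert.h>

#define TOY_OK    0     /* Same numeric value as SQLITE_OK */
#define TOY_DONE  101   /* Same numeric value as SQLITE_DONE */

typedef int (*ToyRowCb)(int iRow, void *pUserData);

/* Invoke xCallback once per simulated row, applying the documented rules:
** TOY_DONE stops the scan and reports success, any other non-OK code
** stops the scan and is returned unchanged. */
int toy_query_phrase(int nRow, void *pUserData, ToyRowCb xCallback){
  int i;
  assert( xCallback!=0 );
  for(i=0; i<nRow; i++){
    int rc = xCallback(i, pUserData);
    if( rc==TOY_DONE ) return TOY_OK;   /* Early stop, not an error */
    if( rc!=TOY_OK ) return rc;         /* Propagate the error upwards */
  }
  return TOY_OK;                        /* Ran to completion */
}

/* Sample callback: count rows and ask to stop after the third one. */
int toy_count_three(int iRow, void *pUserData){
  int *pn = (int*)pUserData;
  (*pn)++;
  return iRow>=2 ? TOY_DONE : TOY_OK;
}

/* Sample callback: fail immediately with an arbitrary error code. */
int toy_fail(int iRow, void *pUserData){
  (void)iRow;
  (void)pUserData;
  return 7;
}
```

With ten simulated rows, `toy_count_three` sees exactly three rows and the scan still reports success, whereas `toy_fail` surfaces its error code after the first row.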
7654
7655 /*
7656 ** CUSTOM AUXILIARY FUNCTIONS
7657 *************************************************************************/
7658
7659 /*************************************************************************
7660 ** CUSTOM TOKENIZERS
7661 **
7662 ** Applications may also register custom tokenizer types. A tokenizer
7663 ** is registered by providing fts5 with a populated instance of the
7664 ** following structure. All structure methods must be defined; setting
7665 ** any member of the fts5_tokenizer struct to NULL leads to undefined
7666 ** behaviour. The structure methods are expected to function as follows:
7667 **
7668 ** xCreate:
7669 ** This function is used to allocate and initialize a tokenizer instance.
7670 ** A tokenizer instance is required to actually tokenize text.
7671 **
7672 ** The first argument passed to this function is a copy of the (void*)
7673 ** pointer provided by the application when the fts5_tokenizer object
7674 ** was registered with FTS5 (the third argument to xCreateTokenizer()).
7675 ** The second and third arguments are an array of nul-terminated strings
7676 ** containing the tokenizer arguments, if any, specified following the
7677 ** tokenizer name as part of the CREATE VIRTUAL TABLE statement used
7678 ** to create the FTS5 table.
7679 **
7680 ** The final argument is an output variable. If successful, (*ppOut)
7681 ** should be set to point to the new tokenizer handle and SQLITE_OK
7682 ** returned. If an error occurs, some value other than SQLITE_OK should
7683 ** be returned. In this case, fts5 assumes that the final value of *ppOut
7684 ** is undefined.
7685 **
7686 ** xDelete:
7687 ** This function is invoked to delete a tokenizer handle previously
7688 ** allocated using xCreate(). Fts5 guarantees that this function will
7689 ** be invoked exactly once for each successful call to xCreate().
7690 **
7691 ** xTokenize:
7692 ** This function is expected to tokenize the nText byte string indicated
7693 ** by argument pText. pText may or may not be nul-terminated. The first
7694 ** argument passed to this function is a pointer to an Fts5Tokenizer object
7695 ** returned by an earlier call to xCreate().
7696 **
7697 ** The third argument indicates the reason that FTS5 is requesting
7698 ** tokenization of the supplied text. This is always one of the following
7699 ** four values:
7700 **
7701 ** <ul><li> <b>FTS5_TOKENIZE_DOCUMENT</b> - A document is being inserted into
7702 ** or removed from the FTS table. The tokenizer is being invoked to
7703 ** determine the set of tokens to add to (or delete from) the
7704 ** FTS index.
7705 **
7706 ** <li> <b>FTS5_TOKENIZE_QUERY</b> - A MATCH query is being executed
7707 ** against the FTS index. The tokenizer is being called to tokenize
7708 ** a bareword or quoted string specified as part of the query.
7709 **
7710 ** <li> <b>(FTS5_TOKENIZE_QUERY | FTS5_TOKENIZE_PREFIX)</b> - Same as
7711 ** FTS5_TOKENIZE_QUERY, except that the bareword or quoted string is
7712 ** followed by a "*" character, indicating that the last token
7713 ** returned by the tokenizer will be treated as a token prefix.
7714 **
7715 ** <li> <b>FTS5_TOKENIZE_AUX</b> - The tokenizer is being invoked to
7716 ** satisfy an Fts5ExtensionApi.xTokenize() request made by an auxiliary
7717 ** function, or an Fts5ExtensionApi.xColumnSize() request made by an
7718 ** auxiliary function on a columnsize=0 database.
7719 ** </ul>
7720 **
7721 ** For each token in the input string, the supplied callback xToken() must
7722 ** be invoked. The first argument to it should be a copy of the pointer
7723 ** passed as the second argument to xTokenize(). The third and fourth
7724 ** arguments are a pointer to a buffer containing the token text, and the
7725 ** size of the token in bytes. The fifth and sixth arguments are the byte
7726 ** offsets of the first byte of, and the first byte immediately following,
7727 ** the text from which the token is derived within the input.
7728 **
7729 ** The second argument passed to the xToken() callback ("tflags") should
7730 ** normally be set to 0. The exception is if the tokenizer supports
7731 ** synonyms. In this case see the discussion below for details.
7732 **
7733 ** FTS5 assumes the xToken() callback is invoked for each token in the
7734 ** order that they occur within the input text.
7735 **
7736 ** If an xToken() callback returns any value other than SQLITE_OK, then
7737 ** the tokenization should be abandoned and the xTokenize() method should
7738 ** immediately return a copy of the xToken() return value. Or, if the
7739 ** input buffer is exhausted, xTokenize() should return SQLITE_OK. Finally,
7740 ** if an error occurs with the xTokenize() implementation itself, it
7741 ** may abandon the tokenization and return any error code other than
7742 ** SQLITE_OK or SQLITE_DONE.
7743 **
7744 ** SYNONYM SUPPORT
7745 **
7746 ** Custom tokenizers may also support synonyms. Consider a case in which a
7747 ** user wishes to query for a phrase such as "first place". Using the
7748 ** built-in tokenizers, the FTS5 query 'first + place' will match instances
7749 ** of "first place" within the document set, but not alternative forms
7750 ** such as "1st place". In some applications, it would be better to match
7751 ** all instances of "first place" or "1st place" regardless of which form
7752 ** the user specified in the MATCH query text.
7753 **
7754 ** There are several ways to approach this in FTS5:
7755 **
7756 ** <ol><li> By mapping all synonyms to a single token. In the above
7757 ** example, this means that the tokenizer returns the
7758 ** same token for inputs "first" and "1st". Say that token is in
7759 ** fact "first", so that when the user inserts the document "I won
7760 ** 1st place" entries are added to the index for tokens "i", "won",
7761 ** "first" and "place". If the user then queries for '1st + place',
7762 ** the tokenizer substitutes "first" for "1st" and the query works
7763 ** as expected.
7764 **
7765 ** <li> By querying the index for all synonyms of each query term separately.
7766 ** In this case, when tokenizing query text, the tokenizer may
7767 ** provide multiple synonyms for a single term within the document.
7768 ** FTS5 then queries the index for each synonym individually. For
7769 ** example, faced with the query:
7770 **
7771 ** <codeblock>
7772 ** ... MATCH 'first place'</codeblock>
7773 **
7774 ** the tokenizer offers both "1st" and "first" as synonyms for the
7775 ** first token in the MATCH query and FTS5 effectively runs a query
7776 ** similar to:
7777 **
7778 ** <codeblock>
7779 ** ... MATCH '(first OR 1st) place'</codeblock>
7780 **
7781 ** except that, for the purposes of auxiliary functions, the query
7782 ** still appears to contain just two phrases - "(first OR 1st)"
7783 ** being treated as a single phrase.
7784 **
7785 ** <li> By adding multiple synonyms for a single term to the FTS index.
7786 ** Using this method, when tokenizing document text, the tokenizer
7787 ** provides multiple synonyms for each token. So that when a
7788 ** document such as "I won first place" is tokenized, entries are
7789 ** added to the FTS index for "i", "won", "first", "1st" and
7790 ** "place".
7791 **
7792 ** This way, even if the tokenizer does not provide synonyms
7793 ** when tokenizing query text (it should not - to do so would be
7794 ** inefficient), it doesn't matter if the user queries for
7795 ** 'first + place' or '1st + place', as there are entries in the
7796 ** FTS index corresponding to both forms of the first token.
7797 ** </ol>
7798 **
7799 ** Whether it is parsing document or query text, any call to xToken that
7800 ** specifies a <i>tflags</i> argument with the FTS5_TOKEN_COLOCATED bit
7801 ** is considered to supply a synonym for the previous token. For example,
7802 ** when parsing the document "I won first place", a tokenizer that supports
7803 ** synonyms would call xToken() 5 times, as follows:
7804 **
7805 ** <codeblock>
7806 ** xToken(pCtx, 0, "i", 1, 0, 1);
7807 ** xToken(pCtx, 0, "won", 3, 2, 5);
7808 ** xToken(pCtx, 0, "first", 5, 6, 11);
7809 ** xToken(pCtx, FTS5_TOKEN_COLOCATED, "1st", 3, 6, 11);
7810 ** xToken(pCtx, 0, "place", 5, 12, 17);
7811 **</codeblock>
7812 **
7813 ** It is an error to specify the FTS5_TOKEN_COLOCATED flag the first time
7814 ** xToken() is called. Multiple synonyms may be specified for a single token
7815 ** by making multiple calls to xToken(FTS5_TOKEN_COLOCATED) in sequence.
7816 ** There is no limit to the number of synonyms that may be provided for a
7817 ** single token.
7818 **
7819 ** In many cases, method (1) above is the best approach. It does not add
7820 ** extra data to the FTS index or require FTS5 to query for multiple terms,
7821 ** so it is efficient in terms of disk space and query speed. However, it
7822 ** does not support prefix queries very well. If, as suggested above, the
7823 ** token "first" is substituted for "1st" by the tokenizer, then the query:
7824 **
7825 ** <codeblock>
7826 ** ... MATCH '1s*'</codeblock>
7827 **
7828 ** will not match documents that contain the token "1st" (as the tokenizer
7829 ** will probably not map "1s" to any prefix of "first").
7830 **
7831 ** For full prefix support, method (3) may be preferred. In this case,
7832 ** because the index contains entries for both "first" and "1st", prefix
7833 ** queries such as 'fi*' or '1s*' will match correctly. However, because
7834 ** extra entries are added to the FTS index, this method uses more space
7835 ** within the database.
7836 **
7837 ** Method (2) offers a midpoint between (1) and (3). Using this method,
7838 ** a query such as '1s*' will match documents that contain the literal
7839 ** token "1st", but not "first" (assuming the tokenizer is not able to
7840 ** provide synonyms for prefixes). However, a non-prefix query like '1st'
7841 ** will match against "1st" and "first". This method does not require
7842 ** extra disk space, as no extra entries are added to the FTS index.
7843 ** On the other hand, it may require more CPU cycles to run MATCH queries,
7844 ** as separate queries of the FTS index are required for each synonym.
7845 **
7846 ** When using methods (2) or (3), it is important that the tokenizer only
7847 ** provide synonyms when tokenizing document text (method (3)) or query
7848 ** text (method (2)), not both. Providing synonyms in both cases does not
7849 ** cause any errors, but is inefficient.
7850 */
7851 typedef struct Fts5Tokenizer Fts5Tokenizer;
7852 typedef struct fts5_tokenizer fts5_tokenizer;
7853 struct fts5_tokenizer {
7854 int (*xCreate)(void*, const char **azArg, int nArg, Fts5Tokenizer **ppOut);
7855 void (*xDelete)(Fts5Tokenizer*);
7856 int (*xTokenize)(Fts5Tokenizer*,
7857 void *pCtx,
7858 int flags, /* Mask of FTS5_TOKENIZE_* flags */
7859 const char *pText, int nText,
7860 int (*xToken)(
7861 void *pCtx, /* Copy of 2nd argument to xTokenize() */
7862 int tflags, /* Mask of FTS5_TOKEN_* flags */
7863 const char *pToken, /* Pointer to buffer containing token */
7864 int nToken, /* Size of token in bytes */
7865 int iStart, /* Byte offset of token within input text */
7866 int iEnd /* Byte offset of end of token within input text */
7867 )
7868 );
7869 };
7870
7871 /* Flags that may be passed as the third argument to xTokenize() */
7872 #define FTS5_TOKENIZE_QUERY 0x0001
7873 #define FTS5_TOKENIZE_PREFIX 0x0002
7874 #define FTS5_TOKENIZE_DOCUMENT 0x0004
7875 #define FTS5_TOKENIZE_AUX 0x0008
7876
7877 /* Flags that may be passed by the tokenizer implementation back to FTS5
7878 ** as the second argument to the supplied xToken callback. */
7879 #define FTS5_TOKEN_COLOCATED 0x0001 /* Same position as prev. token */
7880
7881 /*
7882 ** END OF CUSTOM TOKENIZERS
7883 *************************************************************************/
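The xTokenize contract and synonym method (3) described above can be sketched roughly as follows. This is only an illustration, not the FTS5 implementation: it does not include fts5.h, so the `DEMO_*` constants are local stand-ins that mirror `SQLITE_OK`, `FTS5_TOKEN_COLOCATED` and `FTS5_TOKENIZE_DOCUMENT` from the listing, and `demoTokenize`, `demoCollect` and `DemoLog` are hypothetical names. The leading `Fts5Tokenizer*` argument of the real method is omitted, tokens are assumed to be ASCII, and tokens longer than 64 bytes are truncated.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Local stand-ins so the sketch compiles without fts5.h (assumptions). */
#define DEMO_OK                0       /* stands in for SQLITE_OK */
#define DEMO_TOKEN_COLOCATED   0x0001  /* mirrors FTS5_TOKEN_COLOCATED */
#define DEMO_TOKENIZE_QUERY    0x0001  /* mirrors FTS5_TOKENIZE_QUERY */
#define DEMO_TOKENIZE_DOCUMENT 0x0004  /* mirrors FTS5_TOKENIZE_DOCUMENT */

typedef int (*demo_xToken)(void*, int, const char*, int, int, int);

/* Hypothetical xTokenize body. Splits on non-alphanumeric bytes, folds
** ASCII to lower case and, for document text only, reports "1st" as a
** colocated synonym of "first" (method 3 in the comment above). */
static int demoTokenize(
  void *pCtx, int flags, const char *pText, int nText, demo_xToken xToken
){
  int i = 0;
  while( i<nText ){
    char aFold[64];
    int iStart, n = 0, rc;
    while( i<nText && !isalnum((unsigned char)pText[i]) ) i++;
    if( i>=nText ) break;
    iStart = i;
    while( i<nText && isalnum((unsigned char)pText[i]) ){
      if( n<(int)sizeof(aFold) ){
        aFold[n++] = (char)tolower((unsigned char)pText[i]);
      }
      i++;
    }
    /* Tokens are reported in input order with byte offsets [iStart, i) */
    rc = xToken(pCtx, 0, aFold, n, iStart, i);
    if( rc==DEMO_OK && (flags & DEMO_TOKENIZE_DOCUMENT)
     && n==5 && memcmp(aFold, "first", 5)==0 ){
      rc = xToken(pCtx, DEMO_TOKEN_COLOCATED, "1st", 3, iStart, i);
    }
    if( rc!=DEMO_OK ) return rc;   /* pass the callback's error through */
  }
  return DEMO_OK;                  /* input exhausted */
}

/* Hypothetical callback: appends "text@start-end ", with a '+' prefix
** for colocated tokens, to a fixed-size log. */
typedef struct DemoLog { char z[256]; int n; } DemoLog;

static int demoCollect(
  void *pCtx, int tflags, const char *pTok, int nTok, int iStart, int iEnd
){
  DemoLog *p = (DemoLog*)pCtx;
  int nFree = (int)sizeof(p->z) - p->n;
  int nOut = snprintf(&p->z[p->n], (size_t)nFree, "%s%.*s@%d-%d ",
      (tflags & DEMO_TOKEN_COLOCATED) ? "+" : "", nTok, pTok, iStart, iEnd);
  if( nOut<0 || nOut>=nFree ) return 1;  /* non-OK value aborts tokenization */
  p->n += nOut;
  return DEMO_OK;
}
```

Called with `DEMO_TOKENIZE_DOCUMENT` on the 17-byte text "I won first place", the callback sees the same five calls as the xToken() example in the comment. Called with `DEMO_TOKENIZE_QUERY`, no synonym is reported, which matches the advice that a method (3) tokenizer should not add synonyms to query text.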
7884
7885 /*************************************************************************
7886 ** FTS5 EXTENSION REGISTRATION API
7887 */
7888 typedef struct fts5_api fts5_api;
7889 struct fts5_api {
7890 int iVersion; /* Currently always set to 2 */
7891
7892 /* Create a new tokenizer */
7893 int (*xCreateTokenizer)(
7894 fts5_api *pApi,
7895 const char *zName,
7896 void *pContext,
7897 fts5_tokenizer *pTokenizer,
7898 void (*xDestroy)(void*)
7899 );
7900
7901 /* Find an existing tokenizer */
7902 int (*xFindTokenizer)(
7903 fts5_api *pApi,
7904 const char *zName,
7905 void **ppContext,
7906 fts5_tokenizer *pTokenizer
7907 );
7908
7909 /* Create a new auxiliary function */
7910 int (*xCreateFunction)(
7911 fts5_api *pApi,
7912 const char *zName,
7913 void *pContext,
7914 fts5_extension_function xFunction,
7915 void (*xDestroy)(void*)
7916 );
7917 };
7918
7919 /*
7920 ** END OF REGISTRATION API
7921 *************************************************************************/
7922
7923 #if 0
7924 } /* end of the 'extern "C"' block */
7925 #endif
7926
7927 #endif /* _FTS5_H */
7928
7929
7930 /*
7931 ** 2014 May 31
7932 **
7933 ** The author disclaims copyright to this source code. In place of
7934 ** a legal notice, here is a blessing:
7935 **
7936 ** May you do good and not evil.
7937 ** May you find forgiveness for yourself and forgive others.
7938 ** May you share freely, never taking more than you give.
7939 **
7940 ******************************************************************************
7941 **
7942 */
7943 #ifndef _FTS5INT_H
7944 #define _FTS5INT_H
7945
7946 /* #include "fts5.h" */
7947 /* #include "sqlite3ext.h" */
7948 SQLITE_EXTENSION_INIT1
7949
7950 /* #include <string.h> */
7951 /* #include <assert.h> */
7952
7953 #ifndef SQLITE_AMALGAMATION
7954
7955 typedef unsigned char u8;
7956 typedef unsigned int u32;
7957 typedef unsigned short u16;
7958 typedef sqlite3_int64 i64;
7959 typedef sqlite3_uint64 u64;
7960
7961 #define ArraySize(x) (sizeof(x) / sizeof(x[0]))
7962
7963 #define testcase(x)
7964 #define ALWAYS(x) 1
7965 #define NEVER(x) 0
7966
7967 #define MIN(x,y) (((x) < (y)) ? (x) : (y))
7968 #define MAX(x,y) (((x) > (y)) ? (x) : (y))
7969
7970 /*
7971 ** Constants for the largest and smallest possible 64-bit signed integers.
7972 */
7973 # define LARGEST_INT64 (0xffffffff|(((i64)0x7fffffff)<<32))
7974 # define SMALLEST_INT64 (((i64)-1) - LARGEST_INT64)
7975
7976 #endif
7977
7978
7979 /*
7980 ** Maximum number of prefix indexes on a single FTS5 table. This must be
7981 ** less than 32. If it is set to anything larger than that, an #error
7982 ** directive in fts5_index.c will cause the build to fail.
7983 */
7984 #define FTS5_MAX_PREFIX_INDEXES 31
7985
7986 #define FTS5_DEFAULT_NEARDIST 10
7987 #define FTS5_DEFAULT_RANK "bm25"
7988
7989 /* Name of rank and rowid columns */
7990 #define FTS5_RANK_NAME "rank"
7991 #define FTS5_ROWID_NAME "rowid"
7992
7993 #ifdef SQLITE_DEBUG
7994 # define FTS5_CORRUPT sqlite3Fts5Corrupt()
7995 static int sqlite3Fts5Corrupt(void);
7996 #else
7997 # define FTS5_CORRUPT SQLITE_CORRUPT_VTAB
7998 #endif
7999
8000 /*
8001 ** The assert_nc() macro is similar to the assert() macro, except that it
8002 ** is used for assert() conditions that are true only if it can be
8003 ** guaranteed that the database is not corrupt.
8004 */
8005 #ifdef SQLITE_DEBUG
8006 SQLITE_API extern int sqlite3_fts5_may_be_corrupt;
8007 # define assert_nc(x) assert(sqlite3_fts5_may_be_corrupt || (x))
8008 #else
8009 # define assert_nc(x) assert(x)
8010 #endif
8011
8012 typedef struct Fts5Global Fts5Global;
8013 typedef struct Fts5Colset Fts5Colset;
8014
8015 /* If a NEAR() clump or phrase may only match a specific set of columns,
8016 ** then an object of the following type is used to record the set of columns.
8017 ** Each entry in the aiCol[] array is a column that may be matched.
8018 **
8019 ** This object is used by fts5_expr.c and fts5_index.c.
8020 */
8021 struct Fts5Colset {
8022 int nCol;
8023 int aiCol[1];
8024 };
8025
8026
8027
8028 /**************************************************************************
8029 ** Interface to code in fts5_config.c. fts5_config.c contains code
8030 ** to parse the arguments passed to the CREATE VIRTUAL TABLE statement.
8031 */
8032
8033 typedef struct Fts5Config Fts5Config;
8034
8035 /*
8036 ** An instance of the following structure encodes all information that can
8037 ** be gleaned from the CREATE VIRTUAL TABLE statement.
8038 **
8039 ** And all information loaded from the %_config table.
8040 **
8041 ** nAutomerge:
8042 ** The minimum number of segments that an auto-merge operation should
8043 ** attempt to merge together. A value of 1 sets the object to use the
8044 ** compile time default. Zero disables auto-merge altogether.
8045 **
8046 ** zContent:
8047 **
8048 ** zContentRowid:
8049 ** The value of the content_rowid= option, if one was specified. Or
8050 ** the string "rowid" otherwise. This text is not quoted - if it is
8051 ** used as part of an SQL statement it needs to be quoted appropriately.
8052 **
8053 ** zContentExprlist:
8054 **
8055 ** pzErrmsg:
8056 ** This exists in order to allow the fts5_index.c module to return a
8057 ** decent error message if it encounters a file-format version it does
8058 ** not understand.
8059 **
8060 ** bColumnsize:
8061 ** True if the %_docsize table is created.
8062 **
8063 ** bPrefixIndex:
8064 ** This is only used for debugging. If set to false, any prefix indexes
8065 ** are ignored. This value is configured using:
8066 **
8067 ** INSERT INTO tbl(tbl, rank) VALUES('prefix-index', $bPrefixIndex);
8068 **
8069 */
8070 struct Fts5Config {
8071 sqlite3 *db; /* Database handle */
8072 char *zDb; /* Database holding FTS index (e.g. "main") */
8073 char *zName; /* Name of FTS index */
8074 int nCol; /* Number of columns */
8075 char **azCol; /* Column names */
8076 u8 *abUnindexed; /* True for unindexed columns */
8077 int nPrefix; /* Number of prefix indexes */
8078 int *aPrefix; /* Sizes in bytes of nPrefix prefix indexes */
8079 int eContent; /* An FTS5_CONTENT value */
8080 char *zContent; /* content table */
8081 char *zContentRowid; /* "content_rowid=" option value */
8082 int bColumnsize; /* "columnsize=" option value (dflt==1) */
8083 char *zContentExprlist;
8084 Fts5Tokenizer *pTok;
8085 fts5_tokenizer *pTokApi;
8086
8087 /* Values loaded from the %_config table */
8088 int iCookie; /* Incremented when %_config is modified */
8089 int pgsz; /* Approximate page size used in %_data */
8090 int nAutomerge; /* 'automerge' setting */
8091 int nCrisisMerge; /* Maximum allowed segments per level */
8092 int nHashSize; /* Bytes of memory for in-memory hash */
8093 char *zRank; /* Name of rank function */
8094 char *zRankArgs; /* Arguments to rank function */
8095
8096 /* If non-NULL, points to sqlite3_vtab.base.zErrmsg. Often NULL. */
8097 char **pzErrmsg;
8098
8099 #ifdef SQLITE_DEBUG
8100 int bPrefixIndex; /* True to use prefix-indexes */
8101 #endif
8102 };
8103
8104 /* Current expected value of %_config table 'version' field */
8105 #define FTS5_CURRENT_VERSION 4
8106
8107 #define FTS5_CONTENT_NORMAL 0
8108 #define FTS5_CONTENT_NONE 1
8109 #define FTS5_CONTENT_EXTERNAL 2
8110
8111
8112
8113
8114 static int sqlite3Fts5ConfigParse(
8115 Fts5Global*, sqlite3*, int, const char **, Fts5Config**, char**
8116 );
8117 static void sqlite3Fts5ConfigFree(Fts5Config*);
8118
8119 static int sqlite3Fts5ConfigDeclareVtab(Fts5Config *pConfig);
8120
8121 static int sqlite3Fts5Tokenize(
8122 Fts5Config *pConfig, /* FTS5 Configuration object */
8123 int flags, /* FTS5_TOKENIZE_* flags */
8124 const char *pText, int nText, /* Text to tokenize */
8125 void *pCtx, /* Context passed to xToken() */
8126 int (*xToken)(void*, int, const char*, int, int, int) /* Callback */
8127 );
8128
8129 static void sqlite3Fts5Dequote(char *z);
8130
8131 /* Load the contents of the %_config table */
8132 static int sqlite3Fts5ConfigLoad(Fts5Config*, int);
8133
8134 /* Set the value of a single config attribute */
8135 static int sqlite3Fts5ConfigSetValue(Fts5Config*, const char*, sqlite3_value*, int*);
8136
8137 static int sqlite3Fts5ConfigParseRank(const char*, char**, char**);
8138
8139 /*
8140 ** End of interface to code in fts5_config.c.
8141 **************************************************************************/
8142
8143 /**************************************************************************
8144 ** Interface to code in fts5_buffer.c.
8145 */
8146
8147 /*
8148 ** Buffer object for the incremental building of string data.
8149 */
8150 typedef struct Fts5Buffer Fts5Buffer;
8151 struct Fts5Buffer {
8152 u8 *p;
8153 int n;
8154 int nSpace;
8155 };
8156
8157 static int sqlite3Fts5BufferSize(int*, Fts5Buffer*, int);
8158 static void sqlite3Fts5BufferAppendVarint(int*, Fts5Buffer*, i64);
8159 static void sqlite3Fts5BufferAppendBlob(int*, Fts5Buffer*, int, const u8*);
8160 static void sqlite3Fts5BufferAppendString(int *, Fts5Buffer*, const char*);
8161 static void sqlite3Fts5BufferFree(Fts5Buffer*);
8162 static void sqlite3Fts5BufferZero(Fts5Buffer*);
8163 static void sqlite3Fts5BufferSet(int*, Fts5Buffer*, int, const u8*);
8164 static void sqlite3Fts5BufferAppendPrintf(int *, Fts5Buffer*, char *zFmt, ...);
8165
8166 static char *sqlite3Fts5Mprintf(int *pRc, const char *zFmt, ...);
8167
8168 #define fts5BufferZero(x) sqlite3Fts5BufferZero(x)
8169 #define fts5BufferAppendVarint(a,b,c) sqlite3Fts5BufferAppendVarint(a,b,c)
8170 #define fts5BufferFree(a) sqlite3Fts5BufferFree(a)
8171 #define fts5BufferAppendBlob(a,b,c,d) sqlite3Fts5BufferAppendBlob(a,b,c,d)
8172 #define fts5BufferSet(a,b,c,d) sqlite3Fts5BufferSet(a,b,c,d)
8173
8174 #define fts5BufferGrow(pRc,pBuf,nn) ( \
8175 (pBuf)->n + (nn) <= (pBuf)->nSpace ? 0 : \
8176 sqlite3Fts5BufferSize((pRc),(pBuf),(nn)+(pBuf)->n) \
8177 )
8178
8179 /* Write and decode big-endian 32-bit integer values */
8180 static void sqlite3Fts5Put32(u8*, int);
8181 static int sqlite3Fts5Get32(const u8*);
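The big-endian layout that the two declarations above refer to can be sketched as below. `demoPut32` and `demoGet32` are hypothetical helpers written for this note, not the bodies of sqlite3Fts5Put32() and sqlite3Fts5Get32(), and they use `uint32_t` instead of `int` to keep the shifts well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Store v most-significant byte first (big-endian) in a[0..3]. */
static void demoPut32(uint8_t *a, uint32_t v){
  a[0] = (uint8_t)(v >> 24);
  a[1] = (uint8_t)(v >> 16);
  a[2] = (uint8_t)(v >> 8);
  a[3] = (uint8_t)v;
}

/* Decode a big-endian 32-bit value from a[0..3]. */
static uint32_t demoGet32(const uint8_t *a){
  return ((uint32_t)a[0] << 24) | ((uint32_t)a[1] << 16)
       | ((uint32_t)a[2] << 8)  |  (uint32_t)a[3];
}
```

Writing the bytes explicitly, rather than copying the integer's memory, gives the same stored form on little-endian and big-endian hosts.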
8182
8183 #define FTS5_POS2COLUMN(iPos) (int)((iPos) >> 32)
8184 #define FTS5_POS2OFFSET(iPos) (int)((iPos) & 0xFFFFFFFF)
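The two macros above unpack a 64-bit position whose upper 32 bits hold the column and whose lower 32 bits hold the token offset, as the `(iCol<<32) + iPos` note in Fts5PoslistReader suggests. A minimal round-trip sketch follows. The `demo*` names are hypothetical, and non-negative column and offset values are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a column number and token offset into one 64-bit position. */
static int64_t demoMakePos(int iCol, int iOff){
  assert( iCol>=0 && iOff>=0 );
  return ((int64_t)iCol << 32) + (int64_t)iOff;
}

/* Same arithmetic as FTS5_POS2COLUMN and FTS5_POS2OFFSET. */
static int demoPosColumn(int64_t iPos){ return (int)(iPos >> 32); }
static int demoPosOffset(int64_t iPos){ return (int)(iPos & 0xFFFFFFFF); }
```

Because the column occupies the high bits, comparing two packed positions as plain integers orders them by column first and by offset second.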
8185
8186 typedef struct Fts5PoslistReader Fts5PoslistReader;
8187 struct Fts5PoslistReader {
8188 /* Variables used only by sqlite3Fts5PoslistIterXXX() functions. */
8189 const u8 *a; /* Position list to iterate through */
8190 int n; /* Size of buffer at a[] in bytes */
8191 int i; /* Current offset in a[] */
8192
8193 u8 bFlag; /* For client use (any custom purpose) */
8194
8195 /* Output variables */
8196 u8 bEof; /* Set to true at EOF */
8197 i64 iPos; /* (iCol<<32) + iPos */
8198 };
8199 static int sqlite3Fts5PoslistReaderInit(
8200 const u8 *a, int n, /* Poslist buffer to iterate through */
8201 Fts5PoslistReader *pIter /* Iterator object to initialize */
8202 );
8203 static int sqlite3Fts5PoslistReaderNext(Fts5PoslistReader*);
8204
8205 typedef struct Fts5PoslistWriter Fts5PoslistWriter;
8206 struct Fts5PoslistWriter {
8207 i64 iPrev;
8208 };
8209 static int sqlite3Fts5PoslistWriterAppend(Fts5Buffer*, Fts5PoslistWriter*, i64);
8210
8211 static int sqlite3Fts5PoslistNext64(
8212 const u8 *a, int n, /* Buffer containing poslist */
8213 int *pi, /* IN/OUT: Offset within a[] */
8214 i64 *piOff /* IN/OUT: Current offset */
8215 );
8216
8217 /* Malloc utility */
8218 static void *sqlite3Fts5MallocZero(int *pRc, int nByte);
8219 static char *sqlite3Fts5Strndup(int *pRc, const char *pIn, int nIn);
8220
8221 /* Character set tests (like isspace(), isalpha() etc.) */
8222 static int sqlite3Fts5IsBareword(char t);
8223
8224 /*
8225 ** End of interface to code in fts5_buffer.c.
8226 **************************************************************************/
8227
8228 /**************************************************************************
8229 ** Interface to code in fts5_index.c. fts5_index.c contains code
8230 ** to access the data stored in the %_data table.
8231 */
8232
8233 typedef struct Fts5Index Fts5Index;
8234 typedef struct Fts5IndexIter Fts5IndexIter;
8235
8236 /*
8237 ** Values used as part of the flags argument passed to IndexQuery().
8238 */
8239 #define FTS5INDEX_QUERY_PREFIX 0x0001 /* Prefix query */
8240 #define FTS5INDEX_QUERY_DESC 0x0002 /* Docs in descending rowid order */
8241 #define FTS5INDEX_QUERY_TEST_NOIDX 0x0004 /* Do not use prefix index */
8242 #define FTS5INDEX_QUERY_SCAN 0x0008 /* Scan query (fts5vocab) */
8243
8244 /*
8245 ** Create/destroy an Fts5Index object.
8246 */
8247 static int sqlite3Fts5IndexOpen(Fts5Config *pConfig, int bCreate, Fts5Index**, char**);
8248 static int sqlite3Fts5IndexClose(Fts5Index *p);
8249
8250 /*
8251 ** for(
8252 ** sqlite3Fts5IndexQuery(p, "token", 5, 0, 0, &pIter);
8253 ** 0==sqlite3Fts5IterEof(pIter);
8254 ** sqlite3Fts5IterNext(pIter)
8255 ** ){
8256 ** i64 iRowid = sqlite3Fts5IterRowid(pIter);
8257 ** }
8258 */
8259
8260 /*
8261 ** Open a new iterator to iterate through all rowids that match the
8262 ** specified token or token prefix.
8263 */
8264 static int sqlite3Fts5IndexQuery(
8265 Fts5Index *p, /* FTS index to query */
8266 const char *pToken, int nToken, /* Token (or prefix) to query for */
8267 int flags, /* Mask of FTS5INDEX_QUERY_X flags */
8268 Fts5Colset *pColset, /* Match these columns only */
8269 Fts5IndexIter **ppIter /* OUT: New iterator object */
8270 );
8271
8272 /*
8273 ** The various operations on open token or token prefix iterators opened
8274 ** using sqlite3Fts5IndexQuery().
8275 */
8276 static int sqlite3Fts5IterEof(Fts5IndexIter*);
8277 static int sqlite3Fts5IterNext(Fts5IndexIter*);
8278 static int sqlite3Fts5IterNextFrom(Fts5IndexIter*, i64 iMatch);
8279 static i64 sqlite3Fts5IterRowid(Fts5IndexIter*);
8280 static int sqlite3Fts5IterPoslist(Fts5IndexIter*,Fts5Colset*, const u8**, int*, i64*);
8281 static int sqlite3Fts5IterPoslistBuffer(Fts5IndexIter *pIter, Fts5Buffer *pBuf);
8282
8283 /*
8284 ** Close an iterator opened by sqlite3Fts5IndexQuery().
8285 */
8286 static void sqlite3Fts5IterClose(Fts5IndexIter*);
8287
8288 /*
8289 ** This interface is used by the fts5vocab module.
8290 */
8291 static const char *sqlite3Fts5IterTerm(Fts5IndexIter*, int*);
8292 static int sqlite3Fts5IterNextScan(Fts5IndexIter*);
8293
8294
8295 /*
8296 ** Insert or remove data to or from the index. Each time a document is
8297 ** added to or removed from the index, this function is called one or more
8298 ** times.
8299 **
8300 ** For an insert, it must be called once for each token in the new document.
8301 ** If the operation is a delete, it must be called (at least) once for each
8302 ** unique token in the document with an iCol value less than zero. The iPos
8303 ** argument is ignored for a delete.
8304 */
8305 static int sqlite3Fts5IndexWrite(
8306 Fts5Index *p, /* Index to write to */
8307 int iCol, /* Column token appears in (-ve -> delete) */
8308 int iPos, /* Position of token within column */
8309 const char *pToken, int nToken /* Token to add or remove to or from index */
8310 );
8311
8312 /*
8313 ** Indicate that subsequent calls to sqlite3Fts5IndexWrite() pertain to
8314 ** document iDocid.
8315 */
8316 static int sqlite3Fts5IndexBeginWrite(
8317 Fts5Index *p, /* Index to write to */
8318 int bDelete, /* True if current operation is a delete */
8319 i64 iDocid /* Docid to add or remove data from */
8320 );
8321
8322 /*
8323 ** Flush any data stored in the in-memory hash tables to the database.
8324 ** If the bCommit flag is true, also close any open blob handles.
8325 */
8326 static int sqlite3Fts5IndexSync(Fts5Index *p, int bCommit);
8327
8328 /*
8329 ** Discard any data stored in the in-memory hash tables. Do not write it
8330 ** to the database. Additionally, assume that the contents of the %_data
8331 ** table may have changed on disk. So any in-memory caches of %_data
8332 ** records must be invalidated.
8333 */
8334 static int sqlite3Fts5IndexRollback(Fts5Index *p);
8335
8336 /*
8337 ** Get or set the "averages" values.
8338 */
8339 static int sqlite3Fts5IndexGetAverages(Fts5Index *p, i64 *pnRow, i64 *anSize);
8340 static int sqlite3Fts5IndexSetAverages(Fts5Index *p, const u8*, int);
8341
8342 /*
8343 ** Functions called by the storage module as part of integrity-check.
8344 */
8345 static u64 sqlite3Fts5IndexCksum(Fts5Config*,i64,int,int,const char*,int);
8346 static int sqlite3Fts5IndexIntegrityCheck(Fts5Index*, u64 cksum);
8347
8348 /*
8349 ** Called during virtual module initialization to register UDF
8350 ** fts5_decode() with SQLite
8351 */
8352 static int sqlite3Fts5IndexInit(sqlite3*);
8353
8354 static int sqlite3Fts5IndexSetCookie(Fts5Index*, int);
8355
8356 /*
8357 ** Return the total number of entries read from the %_data table by
8358 ** this connection since it was created.
8359 */
8360 static int sqlite3Fts5IndexReads(Fts5Index *p);
8361
8362 static int sqlite3Fts5IndexReinit(Fts5Index *p);
8363 static int sqlite3Fts5IndexOptimize(Fts5Index *p);
8364 static int sqlite3Fts5IndexMerge(Fts5Index *p, int nMerge);
8365
8366 static int sqlite3Fts5IndexLoadConfig(Fts5Index *p);
8367
8368 /*
8369 ** End of interface to code in fts5_index.c.
8370 **************************************************************************/
8371
8372 /**************************************************************************
8373 ** Interface to code in fts5_varint.c.
8374 */
8375 static int sqlite3Fts5GetVarint32(const unsigned char *p, u32 *v);
8376 static int sqlite3Fts5GetVarintLen(u32 iVal);
8377 static u8 sqlite3Fts5GetVarint(const unsigned char*, u64*);
8378 static int sqlite3Fts5PutVarint(unsigned char *p, u64 v);
8379
8380 #define fts5GetVarint32(a,b) sqlite3Fts5GetVarint32(a,(u32*)&b)
8381 #define fts5GetVarint sqlite3Fts5GetVarint
8382
8383 #define fts5FastGetVarint32(a, iOff, nVal) { \
8384 nVal = (a)[iOff++]; \
8385 if( nVal & 0x80 ){ \
8386 iOff--; \
8387 iOff += fts5GetVarint32(&(a)[iOff], nVal); \
8388 } \
8389 }
8390
8391
8392 /*
8393 ** End of interface to code in fts5_varint.c.
8394 **************************************************************************/
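For readers unfamiliar with the varint routines declared above, here is a rough sketch of the encoding for values that fit in 32 bits. It assumes the usual SQLite varint layout: 7-bit groups, most significant group first, with the high bit set on every byte except the last. `demoPutVarint32` and `demoGetVarint32` are hypothetical names and this is not the code in fts5_varint.c.

```c
#include <assert.h>
#include <stdint.h>

/* Encode v into p[] (at most 5 bytes). Returns the number of bytes written. */
static int demoPutVarint32(uint8_t *p, uint32_t v){
  uint8_t aTmp[5];
  int n = 0, i;
  do{
    aTmp[n++] = (uint8_t)(v & 0x7f);   /* collect 7-bit groups, low first */
    v >>= 7;
  }while( v!=0 );
  for(i=0; i<n; i++){
    /* Emit groups high first; every byte but the last has bit 0x80 set */
    p[i] = (uint8_t)(aTmp[n-1-i] | (i<n-1 ? 0x80 : 0x00));
  }
  return n;
}

/* Decode a varint of at most 5 bytes. Returns the number of bytes read. */
static int demoGetVarint32(const uint8_t *p, uint32_t *pv){
  uint32_t v = 0;
  int i = 0;
  do{
    v = (v << 7) | (uint32_t)(p[i] & 0x7f);
  }while( (p[i++] & 0x80) && i<5 );
  *pv = v;
  return i;
}
```

Values below 0x80 occupy a single byte with the high bit clear, which is the case that fts5FastGetVarint32() handles inline before falling back to the function call.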
8395
8396
8397 /**************************************************************************
8398 ** Interface to code in fts5.c.
8399 */
8400
8401 static int sqlite3Fts5GetTokenizer(
8402 Fts5Global*,
8403 const char **azArg,
8404 int nArg,
8405 Fts5Tokenizer**,
8406 fts5_tokenizer**,
8407 char **pzErr
8408 );
8409
8410 static Fts5Index *sqlite3Fts5IndexFromCsrid(Fts5Global*, i64, Fts5Config **);
8411
8412 /*
8413 ** End of interface to code in fts5.c.
8414 **************************************************************************/
8415
8416 /**************************************************************************
8417 ** Interface to code in fts5_hash.c.
8418 */
8419 typedef struct Fts5Hash Fts5Hash;
8420
8421 /*
8422 ** Create a hash table, free a hash table.
8423 */
8424 static int sqlite3Fts5HashNew(Fts5Hash**, int *pnSize);
8425 static void sqlite3Fts5HashFree(Fts5Hash*);
8426
8427 static int sqlite3Fts5HashWrite(
8428 Fts5Hash*,
8429 i64 iRowid, /* Rowid for this entry */
8430 int iCol, /* Column token appears in (-ve -> delete) */
8431 int iPos, /* Position of token within column */
8432 char bByte,
8433 const char *pToken, int nToken /* Token to add or remove to or from index */
8434 );
8435
8436 /*
8437 ** Empty (but do not delete) a hash table.
8438 */
8439 static void sqlite3Fts5HashClear(Fts5Hash*);
8440
8441 static int sqlite3Fts5HashQuery(
8442 Fts5Hash*, /* Hash table to query */
8443 const char *pTerm, int nTerm, /* Query term */
8444 const u8 **ppDoclist, /* OUT: Pointer to doclist for pTerm */
8445 int *pnDoclist /* OUT: Size of doclist in bytes */
8446 );
8447
8448 static int sqlite3Fts5HashScanInit(
8449 Fts5Hash*, /* Hash table to query */
8450 const char *pTerm, int nTerm /* Query prefix */
8451 );
8452 static void sqlite3Fts5HashScanNext(Fts5Hash*);
8453 static int sqlite3Fts5HashScanEof(Fts5Hash*);
8454 static void sqlite3Fts5HashScanEntry(Fts5Hash *,
8455 const char **pzTerm, /* OUT: term (nul-terminated) */
8456 const u8 **ppDoclist, /* OUT: pointer to doclist */
8457 int *pnDoclist /* OUT: size of doclist in bytes */
8458 );
8459
8460
8461 /*
8462 ** End of interface to code in fts5_hash.c.
8463 **************************************************************************/
8464
8465 /**************************************************************************
8466 ** Interface to code in fts5_storage.c. fts5_storage.c contains
8467 ** code to access the data stored in the %_content and %_docsize tables.
8468 */
8469
8470 #define FTS5_STMT_SCAN_ASC 0 /* SELECT rowid, * FROM ... ORDER BY 1 ASC */
8471 #define FTS5_STMT_SCAN_DESC 1 /* SELECT rowid, * FROM ... ORDER BY 1 DESC */
8472 #define FTS5_STMT_LOOKUP 2 /* SELECT rowid, * FROM ... WHERE rowid=? */
8473
8474 typedef struct Fts5Storage Fts5Storage;
8475
8476 static int sqlite3Fts5StorageOpen(Fts5Config*, Fts5Index*, int, Fts5Storage**, char**);
8477 static int sqlite3Fts5StorageClose(Fts5Storage *p);
8478 static int sqlite3Fts5StorageRename(Fts5Storage*, const char *zName);
8479
8480 static int sqlite3Fts5DropAll(Fts5Config*);
8481 static int sqlite3Fts5CreateTable(Fts5Config*, const char*, const char*, int, char **);
8482
8483 static int sqlite3Fts5StorageDelete(Fts5Storage *p, i64);
8484 static int sqlite3Fts5StorageContentInsert(Fts5Storage *p, sqlite3_value**, i64*);
8485 static int sqlite3Fts5StorageIndexInsert(Fts5Storage *p, sqlite3_value**, i64);
8486
8487 static int sqlite3Fts5StorageIntegrity(Fts5Storage *p);
8488
8489 static int sqlite3Fts5StorageStmt(Fts5Storage *p, int eStmt, sqlite3_stmt**, char**);
8490 static void sqlite3Fts5StorageStmtRelease(Fts5Storage *p, int eStmt, sqlite3_stmt*);
8491
8492 static int sqlite3Fts5StorageDocsize(Fts5Storage *p, i64 iRowid, int *aCol);
8493 static int sqlite3Fts5StorageSize(Fts5Storage *p, int iCol, i64 *pnAvg);
8494 static int sqlite3Fts5StorageRowCount(Fts5Storage *p, i64 *pnRow);
8495
8496 static int sqlite3Fts5StorageSync(Fts5Storage *p, int bCommit);
8497 static int sqlite3Fts5StorageRollback(Fts5Storage *p);
8498
8499 static int sqlite3Fts5StorageConfigValue(
8500 Fts5Storage *p, const char*, sqlite3_value*, int
8501 );
8502
8503 static int sqlite3Fts5StorageSpecialDelete(Fts5Storage *p, i64 iDel, sqlite3_value**);
8504
8505 static int sqlite3Fts5StorageDeleteAll(Fts5Storage *p);
8506 static int sqlite3Fts5StorageRebuild(Fts5Storage *p);
8507 static int sqlite3Fts5StorageOptimize(Fts5Storage *p);
8508 static int sqlite3Fts5StorageMerge(Fts5Storage *p, int nMerge);
8509
8510 /*
8511 ** End of interface to code in fts5_storage.c.
8512 **************************************************************************/
8513
8514
8515 /**************************************************************************
8516 ** Interface to code in fts5_expr.c.
8517 */
8518 typedef struct Fts5Expr Fts5Expr;
8519 typedef struct Fts5ExprNode Fts5ExprNode;
8520 typedef struct Fts5Parse Fts5Parse;
8521 typedef struct Fts5Token Fts5Token;
8522 typedef struct Fts5ExprPhrase Fts5ExprPhrase;
8523 typedef struct Fts5ExprNearset Fts5ExprNearset;
8524
8525 struct Fts5Token {
8526 const char *p; /* Token text (not NULL terminated) */
8527 int n; /* Size of buffer p in bytes */
8528 };
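Because Fts5Token.p is not nul-terminated, any code that prints or copies a token has to carry the length n along with the pointer. The following is a minimal sketch of that pattern, not code from the amalgamation; DemoToken and demoTokenToString are hypothetical names that only mirror the struct layout shown above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mirror of the Fts5Token layout: p is NOT nul-terminated. */
typedef struct DemoToken {
  const char *p;                  /* Token text (not nul-terminated) */
  int n;                          /* Size of buffer p in bytes */
} DemoToken;

/* Copy a token into zOut (capacity nOut bytes) as a nul-terminated string.
** The "%.*s" conversion reads at most pTok->n bytes from pTok->p, so no
** terminator is required in the source buffer. Returns the value reported
** by snprintf(), i.e. the token length when the output is not truncated. */
static int demoTokenToString(const DemoToken *pTok, char *zOut, size_t nOut){
  assert( pTok->n>=0 );
  return snprintf(zOut, nOut, "%.*s", pTok->n, pTok->p);
}
```

A caller would typically point p into the middle of the original MATCH expression text and set n to the token length, which is why the struct avoids copying.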
8529
8530 /* Parse a MATCH expression. */
8531 static int sqlite3Fts5ExprNew(
8532 Fts5Config *pConfig,
8533 const char *zExpr,
8534 Fts5Expr **ppNew,
8535 char **pzErr
8536 );
8537
8538 /*
8539 ** for(rc = sqlite3Fts5ExprFirst(pExpr, pIdx, iMin, bDesc);
8540 ** rc==SQLITE_OK && 0==sqlite3Fts5ExprEof(pExpr);
8541 ** rc = sqlite3Fts5ExprNext(pExpr, iMax)
8542 ** ){
8543 ** // The document with rowid iRowid matches the expression!
8544 ** i64 iRowid = sqlite3Fts5ExprRowid(pExpr);
8545 ** }
8546 */
8547 static int sqlite3Fts5ExprFirst(Fts5Expr*, Fts5Index *pIdx, i64 iMin, int bDesc);
8548 static int sqlite3Fts5ExprNext(Fts5Expr*, i64 iMax);
8549 static int sqlite3Fts5ExprEof(Fts5Expr*);
8550 static i64 sqlite3Fts5ExprRowid(Fts5Expr*);
8551
8552 static void sqlite3Fts5ExprFree(Fts5Expr*);
8553
8554 /* Called during startup to register a UDF with SQLite */
8555 static int sqlite3Fts5ExprInit(Fts5Global*, sqlite3*);
8556
8557 static int sqlite3Fts5ExprPhraseCount(Fts5Expr*);
8558 static int sqlite3Fts5ExprPhraseSize(Fts5Expr*, int iPhrase);
8559 static int sqlite3Fts5ExprPoslist(Fts5Expr*, int, const u8 **);
8560
8561 static int sqlite3Fts5ExprClonePhrase(Fts5Config*, Fts5Expr*, int, Fts5Expr**);
8562
8563 /*******************************************
8564 ** The fts5_expr.c API above this point is used by the other hand-written
8565 ** C code in this module. The interfaces below this point are called by
8566 ** the parser code in fts5parse.y. */
8567
8568 static void sqlite3Fts5ParseError(Fts5Parse *pParse, const char *zFmt, ...);
8569
8570 static Fts5ExprNode *sqlite3Fts5ParseNode(
8571 Fts5Parse *pParse,
8572 int eType,
8573 Fts5ExprNode *pLeft,
8574 Fts5ExprNode *pRight,
8575 Fts5ExprNearset *pNear
8576 );
8577
8578 static Fts5ExprPhrase *sqlite3Fts5ParseTerm(
8579 Fts5Parse *pParse,
8580 Fts5ExprPhrase *pPhrase,
8581 Fts5Token *pToken,
8582 int bPrefix
8583 );
8584
8585 static Fts5ExprNearset *sqlite3Fts5ParseNearset(
8586 Fts5Parse*,
8587 Fts5ExprNearset*,
8588 Fts5ExprPhrase*
8589 );
8590
8591 static Fts5Colset *sqlite3Fts5ParseColset(
8592 Fts5Parse*,
8593 Fts5Colset*,
8594 Fts5Token *
8595 );
8596
8597 static void sqlite3Fts5ParsePhraseFree(Fts5ExprPhrase*);
8598 static void sqlite3Fts5ParseNearsetFree(Fts5ExprNearset*);
8599 static void sqlite3Fts5ParseNodeFree(Fts5ExprNode*);
8600
8601 static void sqlite3Fts5ParseSetDistance(Fts5Parse*, Fts5ExprNearset*, Fts5Token*);
8602 static void sqlite3Fts5ParseSetColset(Fts5Parse*, Fts5ExprNearset*, Fts5Colset*);
8603 static void sqlite3Fts5ParseFinished(Fts5Parse *pParse, Fts5ExprNode *p);
8604 static void sqlite3Fts5ParseNear(Fts5Parse *pParse, Fts5Token*);
8605
8606 /*
8607 ** End of interface to code in fts5_expr.c.
8608 **************************************************************************/
8609
8610
8611
8612 /**************************************************************************
8613 ** Interface to code in fts5_aux.c.
8614 */
8615
8616 static int sqlite3Fts5AuxInit(fts5_api*);
8617 /*
8618 ** End of interface to code in fts5_aux.c.
8619 **************************************************************************/
8620
8621 /**************************************************************************
8622 ** Interface to code in fts5_tokenizer.c.
8623 */
8624
8625 static int sqlite3Fts5TokenizerInit(fts5_api*);
8626 /*
8627 ** End of interface to code in fts5_tokenizer.c.
8628 **************************************************************************/
8629
8630 /**************************************************************************
8631 ** Interface to code in fts5_vocab.c.
8632 */
8633
8634 static int sqlite3Fts5VocabInit(Fts5Global*, sqlite3*);
8635
8636 /*
8637 ** End of interface to code in fts5_vocab.c.
8638 **************************************************************************/
8639
8640
8641 /**************************************************************************
8642 ** Interface to automatically generated code in fts5_unicode2.c.
8643 */
8644 static int sqlite3Fts5UnicodeIsalnum(int c);
8645 static int sqlite3Fts5UnicodeIsdiacritic(int c);
8646 static int sqlite3Fts5UnicodeFold(int c, int bRemoveDiacritic);
8647 /*
8648 ** End of interface to code in fts5_unicode2.c.
8649 **************************************************************************/
8650
8651 #endif
8652
8653 #define FTS5_OR 1
8654 #define FTS5_AND 2
8655 #define FTS5_NOT 3
8656 #define FTS5_TERM 4
8657 #define FTS5_COLON 5
8658 #define FTS5_LP 6
8659 #define FTS5_RP 7
8660 #define FTS5_LCP 8
8661 #define FTS5_RCP 9
8662 #define FTS5_STRING 10
8663 #define FTS5_COMMA 11
8664 #define FTS5_PLUS 12
8665 #define FTS5_STAR 13
8666
8667 /*
8668 ** 2000-05-29
8669 **
8670 ** The author disclaims copyright to this source code. In place of
8671 ** a legal notice, here is a blessing:
8672 **
8673 ** May you do good and not evil.
8674 ** May you find forgiveness for yourself and forgive others.
8675 ** May you share freely, never taking more than you give.
8676 **
8677 *************************************************************************
8678 ** Driver template for the LEMON parser generator.
8679 **
8680 ** The "lemon" program processes an LALR(1) input grammar file, then uses
8681 ** this template to construct a parser. The "lemon" program inserts text
8682 ** at each "%%" line. Also, any "P-a-r-s-e" identifier prefix (without the
8683 ** interstitial "-" characters) contained in this template is changed into
8684 ** the value of the %name directive from the grammar. Otherwise, the content
8685 ** of this template is copied straight through into the generated parser
8686 ** source file.
8687 **
8688 ** The following is the concatenation of all %include directives from the
8689 ** input grammar file:
8690 */
8691 /* #include <stdio.h> */
8692 /************ Begin %include sections from the grammar ************************/
8693
8694 /* #include "fts5Int.h" */
8695 /* #include "fts5parse.h" */
8696
8697 /*
8698 ** Disable all error recovery processing in the parser push-down
8699 ** automaton.
8700 */
8701 #define fts5YYNOERRORRECOVERY 1
8702
8703 /*
8704 ** Make fts5yytestcase() the same as testcase()
8705 */
8706 #define fts5yytestcase(X) testcase(X)
8707
8708 /*
8709 ** Indicate that sqlite3ParserFree() will never be called with a null
8710 ** pointer.
8711 */
8712 #define fts5YYPARSEFREENOTNULL 1
8713
8714 /*
8715 ** Alternative datatype for the argument to the malloc() routine passed
8716 ** into sqlite3ParserAlloc(). The default is size_t.
8717 */
8718 #define fts5YYMALLOCARGTYPE u64
8719
8720 /**************** End of %include directives **********************************/
8721 /* These constants specify the various numeric values for terminal symbols
8722 ** in a format understandable to "makeheaders". This section is blank unless
8723 ** "lemon" is run with the "-m" command-line option.
8724 ***************** Begin makeheaders token definitions *************************/
8725 /**************** End makeheaders token definitions ***************************/
8726
8727 /* The next section is a series of #defines that control
8728 ** various aspects of the generated parser.
8729 ** fts5YYCODETYPE is the data type used to store the integer codes
8730 ** that represent terminal and non-terminal symbols.
8731 ** "unsigned char" is used if there are fewer than
8732 ** 256 symbols. Larger types otherwise.
8733 ** fts5YYNOCODE is a number of type fts5YYCODETYPE that is not used for
8734 ** any terminal or nonterminal symbol.
8735 ** fts5YYFALLBACK If defined, this indicates that one or more tokens
8736 ** (also known as: "terminal symbols") have fall-back
8737 ** values which should be used if the original symbol
8738 ** would not parse. This permits keywords to sometimes
8739 ** be used as identifiers, for example.
8740 ** fts5YYACTIONTYPE is the data type used for "action codes" - numbers
8741 ** that indicate what to do in response to the next
8742 ** token.
8743 ** sqlite3Fts5ParserFTS5TOKENTYPE is the data type used for the minor type of terminal
8744 ** symbols. Background: A "minor type" is a semantic
8745 ** value associated with a terminal or non-terminal
8746 ** symbol. For example, for an "ID" terminal symbol,
8747 ** the minor type might be the name of the identifier.
8748 ** Each non-terminal can have a different minor type.
8749 ** Terminal symbols all have the same minor type, though.
8750 ** This macro defines the minor type for terminal
8751 ** symbols.
8752 ** fts5YYMINORTYPE is the data type used for all minor types.
8753 ** This is typically a union of many types, one of
8754 ** which is sqlite3Fts5ParserFTS5TOKENTYPE. The entry in the union
8755 ** for terminal symbols is called "fts5yy0".
8756 ** fts5YYSTACKDEPTH is the maximum depth of the parser's stack. If
8757 ** zero the stack is dynamically sized using realloc()
8758 ** sqlite3Fts5ParserARG_SDECL A static variable declaration for the %extra_argument
8759 ** sqlite3Fts5ParserARG_PDECL A parameter declaration for the %extra_argument
8760 ** sqlite3Fts5ParserARG_STORE Code to store %extra_argument into fts5yypParser
8761 ** sqlite3Fts5ParserARG_FETCH Code to extract %extra_argument from fts5yypParser
8762 ** fts5YYERRORSYMBOL is the code number of the error symbol. If not
8763 ** defined, then do no error processing.
8764 ** fts5YYNSTATE the combined number of states.
8765 ** fts5YYNRULE the number of rules in the grammar
8766 ** fts5YY_MAX_SHIFT Maximum value for shift actions
8767 ** fts5YY_MIN_SHIFTREDUCE Minimum value for shift-reduce actions
8768 ** fts5YY_MAX_SHIFTREDUCE Maximum value for shift-reduce actions
8769 ** fts5YY_MIN_REDUCE Minimum value for reduce actions
8770 ** fts5YY_ERROR_ACTION The fts5yy_action[] code for syntax error
8771 ** fts5YY_ACCEPT_ACTION The fts5yy_action[] code for accept
8772 ** fts5YY_NO_ACTION The fts5yy_action[] code for no-op
8773 */
8774 #ifndef INTERFACE
8775 # define INTERFACE 1
8776 #endif
8777 /************* Begin control #defines *****************************************/
8778 #define fts5YYCODETYPE unsigned char
8779 #define fts5YYNOCODE 27
8780 #define fts5YYACTIONTYPE unsigned char
8781 #define sqlite3Fts5ParserFTS5TOKENTYPE Fts5Token
8782 typedef union {
8783 int fts5yyinit;
8784 sqlite3Fts5ParserFTS5TOKENTYPE fts5yy0;
8785 Fts5Colset* fts5yy3;
8786 Fts5ExprPhrase* fts5yy11;
8787 Fts5ExprNode* fts5yy18;
8788 int fts5yy20;
8789 Fts5ExprNearset* fts5yy26;
8790 } fts5YYMINORTYPE;
8791 #ifndef fts5YYSTACKDEPTH
8792 #define fts5YYSTACKDEPTH 100
8793 #endif
8794 #define sqlite3Fts5ParserARG_SDECL Fts5Parse *pParse;
8795 #define sqlite3Fts5ParserARG_PDECL ,Fts5Parse *pParse
8796 #define sqlite3Fts5ParserARG_FETCH Fts5Parse *pParse = fts5yypParser->pParse
8797 #define sqlite3Fts5ParserARG_STORE fts5yypParser->pParse = pParse
8798 #define fts5YYNSTATE 26
8799 #define fts5YYNRULE 24
8800 #define fts5YY_MAX_SHIFT 25
8801 #define fts5YY_MIN_SHIFTREDUCE 40
8802 #define fts5YY_MAX_SHIFTREDUCE 63
8803 #define fts5YY_MIN_REDUCE 64
8804 #define fts5YY_MAX_REDUCE 87
8805 #define fts5YY_ERROR_ACTION 88
8806 #define fts5YY_ACCEPT_ACTION 89
8807 #define fts5YY_NO_ACTION 90
8808 /************* End control #defines *******************************************/
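The numeric ranges in the control #defines above partition every action code into shift, shift-reduce, reduce, error, accept, or no-op. The sketch below illustrates that partition with the values copied from this file; the DEMO_ names and demoActionKind() are hypothetical helpers introduced only for the example and are not part of the generated parser.

```c
#include <assert.h>
#include <string.h>

/* Values copied from the fts5YY_* control #defines above. */
#define DEMO_MAX_SHIFT        25
#define DEMO_MIN_SHIFTREDUCE  40
#define DEMO_MAX_SHIFTREDUCE  63
#define DEMO_MIN_REDUCE       64
#define DEMO_MAX_REDUCE       87
#define DEMO_ERROR_ACTION     88
#define DEMO_ACCEPT_ACTION    89
#define DEMO_NO_ACTION        90

/* Classify action code N. For a shift, *pArg receives the new state.
** For a shift-reduce or reduce, *pArg receives the rule number.
** Otherwise *pArg is set to -1. */
static const char *demoActionKind(int N, int *pArg){
  assert( pArg!=0 );
  *pArg = -1;
  if( N>=0 && N<=DEMO_MAX_SHIFT ){
    *pArg = N;
    return "shift";
  }
  if( N>=DEMO_MIN_SHIFTREDUCE && N<=DEMO_MAX_SHIFTREDUCE ){
    *pArg = N - DEMO_MIN_SHIFTREDUCE;
    return "shift-reduce";
  }
  if( N>=DEMO_MIN_REDUCE && N<=DEMO_MAX_REDUCE ){
    *pArg = N - DEMO_MIN_REDUCE;
    return "reduce";
  }
  if( N==DEMO_ERROR_ACTION ) return "error";
  if( N==DEMO_ACCEPT_ACTION ) return "accept";
  if( N==DEMO_NO_ACTION ) return "no-op";
  return "invalid";               /* Codes 26..39 are unused in this grammar */
}
```

With 24 rules, the shift-reduce range 40..63 and the reduce range 64..87 each cover exactly one code per rule, so subtracting the range minimum recovers the rule number.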
8809
8810 /* The fts5yyzerominor constant is used to initialize instances of
8811 ** fts5YYMINORTYPE objects to zero. */
8812 static const fts5YYMINORTYPE fts5yyzerominor = { 0 };
8813
8814 /* Define the fts5yytestcase() macro to be a no-op if it is not already defined
8815 ** otherwise.
8816 **
8817 ** Applications can choose to define fts5yytestcase() in the %include section
8818 ** to a macro that can assist in verifying code coverage. For production
8819 ** code the fts5yytestcase() macro should be turned off. But it is useful
8820 ** for testing.
8821 */
8822 #ifndef fts5yytestcase
8823 # define fts5yytestcase(X)
8824 #endif
8825
8826
8827 /* Next are the tables used to determine what action to take based on the
8828 ** current state and lookahead token. These tables are used to implement
8829 ** functions that take a state number and lookahead value and return an
8830 ** action integer.
8831 **
8832 ** Suppose the action integer is N. Then the action is determined as
8833 ** follows
8834 **
8835 ** 0 <= N <= fts5YY_MAX_SHIFT Shift N. That is, push the lookahead
8836 ** token onto the stack and goto state N.
8837 **
8838 ** N between fts5YY_MIN_SHIFTREDUCE Shift to an arbitrary state then
8839 ** and fts5YY_MAX_SHIFTREDUCE reduce by rule N-fts5YY_MIN_SHIFTREDUCE.
8840 **
8841 ** N between fts5YY_MIN_REDUCE Reduce by rule N-fts5YY_MIN_REDUCE
8842 ** and fts5YY_MAX_REDUCE
8843
8844 ** N == fts5YY_ERROR_ACTION A syntax error has occurred.
8845 **
8846 ** N == fts5YY_ACCEPT_ACTION The parser accepts its input.
8847 **
8848 ** N == fts5YY_NO_ACTION No such action. Denotes unused
8849 ** slots in the fts5yy_action[] table.
8850 **
8851 ** The action table is constructed as a single large table named fts5yy_action[].
8852 ** Given state S and lookahead X, the action is computed as
8853 **
8854 ** fts5yy_action[ fts5yy_shift_ofst[S] + X ]
8855 **
8856 ** If the index value fts5yy_shift_ofst[S]+X is out of range or if the value
8857 ** fts5yy_lookahead[fts5yy_shift_ofst[S]+X] is not equal to X or if fts5yy_shift_ofst[S]
8858 ** is equal to fts5YY_SHIFT_USE_DFLT, it means that the action is not in the table
8859 ** and that fts5yy_default[S] should be used instead.
8860 **
8861 ** The formula above is for computing the action when the lookahead is
8862 ** a terminal symbol. If the lookahead is a non-terminal (as occurs after
8863 ** a reduce action) then the fts5yy_reduce_ofst[] array is used in place of
8864 ** the fts5yy_shift_ofst[] array and fts5YY_REDUCE_USE_DFLT is used in place of
8865 ** fts5YY_SHIFT_USE_DFLT.
8866 **
8867 ** The following are the tables generated in this section:
8868 **
8869 ** fts5yy_action[] A single table containing all actions.
8870 ** fts5yy_lookahead[] A table containing the lookahead for each entry in
8871 ** fts5yy_action. Used to detect hash collisions.
8872 ** fts5yy_shift_ofst[] For each state, the offset into fts5yy_action for
8873 ** shifting terminals.
8874 ** fts5yy_reduce_ofst[] For each state, the offset into fts5yy_action for
8875 ** shifting non-terminals after a reduce.
8876 ** fts5yy_default[] Default action for each state.
8877 **
8878 *********** Begin parsing tables **********************************************/
8879 #define fts5YY_ACTTAB_COUNT (78)
8880 static const fts5YYACTIONTYPE fts5yy_action[] = {
8881 /* 0 */ 89, 15, 46, 5, 48, 24, 12, 19, 23, 14,
8882 /* 10 */ 46, 5, 48, 24, 20, 21, 23, 43, 46, 5,
8883 /* 20 */ 48, 24, 6, 18, 23, 17, 46, 5, 48, 24,
8884 /* 30 */ 75, 7, 23, 25, 46, 5, 48, 24, 62, 47,
8885 /* 40 */ 23, 48, 24, 7, 11, 23, 9, 3, 4, 2,
8886 /* 50 */ 62, 50, 52, 44, 64, 3, 4, 2, 49, 4,
8887 /* 60 */ 2, 1, 23, 11, 16, 9, 12, 2, 10, 61,
8888 /* 70 */ 53, 59, 62, 60, 22, 13, 55, 8,
8889 };
8890 static const fts5YYCODETYPE fts5yy_lookahead[] = {
8891 /* 0 */ 15, 16, 17, 18, 19, 20, 10, 11, 23, 16,
8892 /* 10 */ 17, 18, 19, 20, 23, 24, 23, 16, 17, 18,
8893 /* 20 */ 19, 20, 22, 23, 23, 16, 17, 18, 19, 20,
8894 /* 30 */ 5, 6, 23, 16, 17, 18, 19, 20, 13, 17,
8895 /* 40 */ 23, 19, 20, 6, 8, 23, 10, 1, 2, 3,
8896 /* 50 */ 13, 9, 10, 7, 0, 1, 2, 3, 19, 2,
8897 /* 60 */ 3, 6, 23, 8, 21, 10, 10, 3, 10, 25,
8898 /* 70 */ 10, 10, 13, 25, 12, 10, 7, 5,
8899 };
8900 #define fts5YY_SHIFT_USE_DFLT (-5)
8901 #define fts5YY_SHIFT_COUNT (25)
8902 #define fts5YY_SHIFT_MIN (-4)
8903 #define fts5YY_SHIFT_MAX (72)
8904 static const signed char fts5yy_shift_ofst[] = {
8905 /* 0 */ 55, 55, 55, 55, 55, 36, -4, 56, 58, 25,
8906 /* 10 */ 37, 60, 59, 59, 46, 54, 42, 57, 62, 61,
8907 /* 20 */ 62, 69, 65, 62, 72, 64,
8908 };
8909 #define fts5YY_REDUCE_USE_DFLT (-16)
8910 #define fts5YY_REDUCE_COUNT (13)
8911 #define fts5YY_REDUCE_MIN (-15)
8912 #define fts5YY_REDUCE_MAX (48)
8913 static const signed char fts5yy_reduce_ofst[] = {
8914 /* 0 */ -15, -7, 1, 9, 17, 22, -9, 0, 39, 44,
8915 /* 10 */ 44, 43, 44, 48,
8916 };
8917 static const fts5YYACTIONTYPE fts5yy_default[] = {
8918 /* 0 */ 88, 88, 88, 88, 88, 69, 82, 88, 88, 87,
8919 /* 10 */ 87, 88, 87, 87, 88, 88, 88, 66, 80, 88,
8920 /* 20 */ 81, 88, 88, 78, 88, 65,
8921 };
8922 /********** End of lemon-generated parsing tables *****************************/
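The lookup rule described above (index fts5yy_action[] at fts5yy_shift_ofst[S]+X, verify the slot with fts5yy_lookahead[], otherwise fall back to fts5yy_default[S]) can be sketched as a standalone function. The tables below are copied from this file under demo names; demoFindShiftAction() is a hypothetical simplification that leaves out the fallback-token and wildcard handling.

```c
#include <assert.h>

#define DEMO_ACTTAB_COUNT    78
#define DEMO_SHIFT_USE_DFLT  (-5)
#define DEMO_NSTATE          26

/* Copies of fts5yy_action[], fts5yy_lookahead[], fts5yy_shift_ofst[]
** and fts5yy_default[] from the tables above. */
static const unsigned char demoAction[] = {
  89, 15, 46,  5, 48, 24, 12, 19, 23, 14,
  46,  5, 48, 24, 20, 21, 23, 43, 46,  5,
  48, 24,  6, 18, 23, 17, 46,  5, 48, 24,
  75,  7, 23, 25, 46,  5, 48, 24, 62, 47,
  23, 48, 24,  7, 11, 23,  9,  3,  4,  2,
  62, 50, 52, 44, 64,  3,  4,  2, 49,  4,
   2,  1, 23, 11, 16,  9, 12,  2, 10, 61,
  53, 59, 62, 60, 22, 13, 55,  8,
};
static const unsigned char demoLookahead[] = {
  15, 16, 17, 18, 19, 20, 10, 11, 23, 16,
  17, 18, 19, 20, 23, 24, 23, 16, 17, 18,
  19, 20, 22, 23, 23, 16, 17, 18, 19, 20,
   5,  6, 23, 16, 17, 18, 19, 20, 13, 17,
  23, 19, 20,  6,  8, 23, 10,  1,  2,  3,
  13,  9, 10,  7,  0,  1,  2,  3, 19,  2,
   3,  6, 23,  8, 21, 10, 10,  3, 10, 25,
  10, 10, 13, 25, 12, 10,  7,  5,
};
static const signed char demoShiftOfst[] = {
  55, 55, 55, 55, 55, 36, -4, 56, 58, 25,
  37, 60, 59, 59, 46, 54, 42, 57, 62, 61,
  62, 69, 65, 62, 72, 64,
};
static const unsigned char demoDefault[] = {
  88, 88, 88, 88, 88, 69, 82, 88, 88, 87,
  87, 88, 87, 87, 88, 88, 88, 66, 80, 88,
  81, 88, 88, 78, 88, 65,
};

/* Return the action for state S when the lookahead is terminal X. */
static int demoFindShiftAction(int S, int X){
  int i;
  assert( S>=0 && S<DEMO_NSTATE );
  i = demoShiftOfst[S];
  if( i==DEMO_SHIFT_USE_DFLT ) return demoDefault[S];
  i += X;
  /* An out-of-range index, or a slot owned by another (state, token) pair,
  ** means this state has no explicit entry for X. */
  if( i<0 || i>=DEMO_ACTTAB_COUNT || demoLookahead[i]!=X ){
    return demoDefault[S];
  }
  return demoAction[i];
}
```

For example, in state 0 the token STRING (code 10) lands on slot 65 and yields action 9 (shift to state 9), while OR (code 1) lands on slot 56, which belongs to a different lookahead, so the default action 88 (syntax error) is used.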
8923
8924 /* The next table maps tokens (terminal symbols) into fallback tokens.
8925 ** If a construct like the following:
8926 **
8927 ** %fallback ID X Y Z.
8928 **
8929 ** appears in the grammar, then ID becomes a fallback token for X, Y,
8930 ** and Z. Whenever one of the tokens X, Y, or Z is input to the parser
8931 ** but it does not parse, the type of the token is changed to ID and
8932 ** the parse is retried before an error is thrown.
8933 **
8934 ** This feature can be used, for example, to cause some keywords in a language
8935 ** to revert to identifiers if the keyword does not apply in the context where
8936 ** it appears.
8937 */
8938 #ifdef fts5YYFALLBACK
8939 static const fts5YYCODETYPE fts5yyFallback[] = {
8940 };
8941 #endif /* fts5YYFALLBACK */
8942
8943 /* The following structure represents a single element of the
8944 ** parser's stack. Information stored includes:
8945 **
8946 ** + The state number for the parser at this level of the stack.
8947 **
8948 ** + The value of the token stored at this level of the stack.
8949 ** (In other words, the "major" token.)
8950 **
8951 ** + The semantic value stored at this level of the stack. This is
8952 ** the information used by the action routines in the grammar.
8953 ** It is sometimes called the "minor" token.
8954 **
8955 ** After the "shift" half of a SHIFTREDUCE action, the stateno field
8956 ** actually contains the reduce action for the second half of the
8957 ** SHIFTREDUCE.
8958 */
8959 struct fts5yyStackEntry {
8960 fts5YYACTIONTYPE stateno; /* The state-number, or reduce action in SHIFTREDUCE */
8961 fts5YYCODETYPE major; /* The major token value. This is the code
8962 ** number for the token at this stack level */
8963 fts5YYMINORTYPE minor; /* The user-supplied minor token value. This
8964 ** is the value of the token */
8965 };
8966 typedef struct fts5yyStackEntry fts5yyStackEntry;
8967
8968 /* The state of the parser is completely contained in an instance of
8969 ** the following structure */
8970 struct fts5yyParser {
8971 int fts5yyidx; /* Index of top element in stack */
8972 #ifdef fts5YYTRACKMAXSTACKDEPTH
8973 int fts5yyidxMax; /* Maximum value of fts5yyidx */
8974 #endif
8975 int fts5yyerrcnt; /* Shifts left before out of the error */
8976 sqlite3Fts5ParserARG_SDECL /* A place to hold %extra_argument */
8977 #if fts5YYSTACKDEPTH<=0
8978 int fts5yystksz; /* Current size of the stack */
8979 fts5yyStackEntry *fts5yystack; /* The parser's stack */
8980 #else
8981 fts5yyStackEntry fts5yystack[fts5YYSTACKDEPTH]; /* The parser's stack */
8982 #endif
8983 };
8984 typedef struct fts5yyParser fts5yyParser;
8985
8986 #ifndef NDEBUG
8987 /* #include <stdio.h> */
8988 static FILE *fts5yyTraceFILE = 0;
8989 static char *fts5yyTracePrompt = 0;
8990 #endif /* NDEBUG */
8991
8992 #ifndef NDEBUG
8993 /*
8994 ** Turn parser tracing on by giving a stream to which to write the trace
8995 ** and a prompt to preface each trace message. Tracing is turned off
8996 ** by making either argument NULL
8997 **
8998 ** Inputs:
8999 ** <ul>
9000 ** <li> A FILE* to which trace output should be written.
9001 ** If NULL, then tracing is turned off.
9002 ** <li> A prefix string written at the beginning of every
9003 ** line of trace output. If NULL, then tracing is
9004 ** turned off.
9005 ** </ul>
9006 **
9007 ** Outputs:
9008 ** None.
9009 */
9010 static void sqlite3Fts5ParserTrace(FILE *TraceFILE, char *zTracePrompt){
9011 fts5yyTraceFILE = TraceFILE;
9012 fts5yyTracePrompt = zTracePrompt;
9013 if( fts5yyTraceFILE==0 ) fts5yyTracePrompt = 0;
9014 else if( fts5yyTracePrompt==0 ) fts5yyTraceFILE = 0;
9015 }
9016 #endif /* NDEBUG */
9017
9018 #ifndef NDEBUG
9019 /* For tracing shifts, the names of all terminals and nonterminals
9020 ** are required. The following table supplies these names */
9021 static const char *const fts5yyTokenName[] = {
9022 "$", "OR", "AND", "NOT",
9023 "TERM", "COLON", "LP", "RP",
9024 "LCP", "RCP", "STRING", "COMMA",
9025 "PLUS", "STAR", "error", "input",
9026 "expr", "cnearset", "exprlist", "nearset",
9027 "colset", "colsetlist", "nearphrases", "phrase",
9028 "neardist_opt", "star_opt",
9029 };
9030 #endif /* NDEBUG */
9031
9032 #ifndef NDEBUG
9033 /* For tracing reduce actions, the names of all rules are required.
9034 */
9035 static const char *const fts5yyRuleName[] = {
9036 /* 0 */ "input ::= expr",
9037 /* 1 */ "expr ::= expr AND expr",
9038 /* 2 */ "expr ::= expr OR expr",
9039 /* 3 */ "expr ::= expr NOT expr",
9040 /* 4 */ "expr ::= LP expr RP",
9041 /* 5 */ "expr ::= exprlist",
9042 /* 6 */ "exprlist ::= cnearset",
9043 /* 7 */ "exprlist ::= exprlist cnearset",
9044 /* 8 */ "cnearset ::= nearset",
9045 /* 9 */ "cnearset ::= colset COLON nearset",
9046 /* 10 */ "colset ::= LCP colsetlist RCP",
9047 /* 11 */ "colset ::= STRING",
9048 /* 12 */ "colsetlist ::= colsetlist STRING",
9049 /* 13 */ "colsetlist ::= STRING",
9050 /* 14 */ "nearset ::= phrase",
9051 /* 15 */ "nearset ::= STRING LP nearphrases neardist_opt RP",
9052 /* 16 */ "nearphrases ::= phrase",
9053 /* 17 */ "nearphrases ::= nearphrases phrase",
9054 /* 18 */ "neardist_opt ::=",
9055 /* 19 */ "neardist_opt ::= COMMA STRING",
9056 /* 20 */ "phrase ::= phrase PLUS STRING star_opt",
9057 /* 21 */ "phrase ::= STRING star_opt",
9058 /* 22 */ "star_opt ::= STAR",
9059 /* 23 */ "star_opt ::=",
9060 };
9061 #endif /* NDEBUG */
9062
9063
9064 #if fts5YYSTACKDEPTH<=0
9065 /*
9066 ** Try to increase the size of the parser stack.
9067 */
9068 static void fts5yyGrowStack(fts5yyParser *p){
9069 int newSize;
9070 fts5yyStackEntry *pNew;
9071
9072 newSize = p->fts5yystksz*2 + 100;
9073 pNew = realloc(p->fts5yystack, newSize*sizeof(pNew[0]));
9074 if( pNew ){
9075 p->fts5yystack = pNew;
9076 p->fts5yystksz = newSize;
9077 #ifndef NDEBUG
9078 if( fts5yyTraceFILE ){
9079 fprintf(fts5yyTraceFILE,"%sStack grows to %d entries!\n",
9080 fts5yyTracePrompt, p->fts5yystksz);
9081 }
9082 #endif
9083 }
9084 }
9085 #endif
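The growth policy in fts5yyGrowStack() above doubles the stack and adds 100 entries, and it leaves the old stack untouched when realloc() fails. Note that this path is only compiled when fts5YYSTACKDEPTH<=0, whereas this file sets it to 100. The sketch below reproduces the policy on a stand-in structure; DemoStack and demoGrowStack() are hypothetical names, and the int element type replaces fts5yyStackEntry purely for brevity.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the dynamically sized parser stack. */
typedef struct DemoStack {
  int nAlloc;                     /* Number of allocated entries */
  int *aEntry;                    /* The entries themselves */
} DemoStack;

/* Grow the stack to 2*nAlloc+100 entries. Returns 1 on success, or 0 if
** realloc() fails, in which case the existing stack is left unchanged. */
static int demoGrowStack(DemoStack *p){
  int newSize;
  int *pNew;
  assert( p!=0 && p->nAlloc>=0 );
  newSize = p->nAlloc*2 + 100;
  pNew = realloc(p->aEntry, (size_t)newSize*sizeof(pNew[0]));
  if( pNew==0 ) return 0;
  p->aEntry = pNew;
  p->nAlloc = newSize;
  return 1;
}
```

Starting from an empty stack, successive calls yield capacities of 100, 300 and 700 entries. Unlike the original, this sketch reports failure through its return value; the generated code instead detects a failed grow later, when the stack index reaches the unchanged capacity.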
9086
9087 /* Datatype of the argument to the memory allocator passed as the
9088 ** argument to sqlite3Fts5ParserAlloc() below. This can be changed by
9089 ** putting an appropriate #define in the %include section of the input
9090 ** grammar.
9091 */
9092 #ifndef fts5YYMALLOCARGTYPE
9093 # define fts5YYMALLOCARGTYPE size_t
9094 #endif
9095
9096 /*
9097 ** This function allocates a new parser.
9098 ** The only argument is a pointer to a function which works like
9099 ** malloc.
9100 **
9101 ** Inputs:
9102 ** A pointer to the function used to allocate memory.
9103 **
9104 ** Outputs:
9105 ** A pointer to a parser. This pointer is used in subsequent calls
9106 ** to sqlite3Fts5Parser and sqlite3Fts5ParserFree.
9107 */
9108 static void *sqlite3Fts5ParserAlloc(void *(*mallocProc)(fts5YYMALLOCARGTYPE)){
9109 fts5yyParser *pParser;
9110 pParser = (fts5yyParser*)(*mallocProc)( (fts5YYMALLOCARGTYPE)sizeof(fts5yyParser) );
9111 if( pParser ){
9112 pParser->fts5yyidx = -1;
9113 #ifdef fts5YYTRACKMAXSTACKDEPTH
9114 pParser->fts5yyidxMax = 0;
9115 #endif
9116 #if fts5YYSTACKDEPTH<=0
9117 pParser->fts5yystack = NULL;
9118 pParser->fts5yystksz = 0;
9119 fts5yyGrowStack(pParser);
9120 #endif
9121 }
9122 return pParser;
9123 }
9124
9125 /* The following function deletes the "minor type" or semantic value
9126 ** associated with a symbol. The symbol can be either a terminal
9127 ** or nonterminal. "fts5yymajor" is the symbol code, and "fts5yypminor" is
9128 ** a pointer to the value to be deleted. The code used to do the
9129 ** deletions is derived from the %destructor and/or %token_destructor
9130 ** directives of the input grammar.
9131 */
9132 static void fts5yy_destructor(
9133 fts5yyParser *fts5yypParser, /* The parser */
9134 fts5YYCODETYPE fts5yymajor, /* Type code for object to destroy */
9135 fts5YYMINORTYPE *fts5yypminor /* The object to be destroyed */
9136 ){
9137 sqlite3Fts5ParserARG_FETCH;
9138 switch( fts5yymajor ){
9139 /* Here is inserted the actions which take place when a
9140 ** terminal or non-terminal is destroyed. This can happen
9141 ** when the symbol is popped from the stack during a
9142 ** reduce or during error processing or when a parser is
9143 ** being destroyed before it is finished parsing.
9144 **
9145 ** Note: during a reduce, the only symbols destroyed are those
9146 ** which appear on the RHS of the rule, but which are *not* used
9147 ** inside the C code.
9148 */
9149 /********* Begin destructor definitions ***************************************/
9150 case 15: /* input */
9151 {
9152 (void)pParse;
9153 }
9154 break;
9155 case 16: /* expr */
9156 case 17: /* cnearset */
9157 case 18: /* exprlist */
9158 {
9159 sqlite3Fts5ParseNodeFree((fts5yypminor->fts5yy18));
9160 }
9161 break;
9162 case 19: /* nearset */
9163 case 22: /* nearphrases */
9164 {
9165 sqlite3Fts5ParseNearsetFree((fts5yypminor->fts5yy26));
9166 }
9167 break;
9168 case 20: /* colset */
9169 case 21: /* colsetlist */
9170 {
9171 sqlite3_free((fts5yypminor->fts5yy3));
9172 }
9173 break;
9174 case 23: /* phrase */
9175 {
9176 sqlite3Fts5ParsePhraseFree((fts5yypminor->fts5yy11));
9177 }
9178 break;
9179 /********* End destructor definitions *****************************************/
9180 default: break; /* If no destructor action specified: do nothing */
9181 }
9182 }
9183
9184 /*
9185 ** Pop the parser's stack once.
9186 **
9187 ** If there is a destructor routine associated with the token which
9188 ** is popped from the stack, then call it.
9189 */
9190 static void fts5yy_pop_parser_stack(fts5yyParser *pParser){
9191 fts5yyStackEntry *fts5yytos;
9192 assert( pParser->fts5yyidx>=0 );
9193 fts5yytos = &pParser->fts5yystack[pParser->fts5yyidx--];
9194 #ifndef NDEBUG
9195 if( fts5yyTraceFILE ){
9196 fprintf(fts5yyTraceFILE,"%sPopping %s\n",
9197 fts5yyTracePrompt,
9198 fts5yyTokenName[fts5yytos->major]);
9199 }
9200 #endif
9201 fts5yy_destructor(pParser, fts5yytos->major, &fts5yytos->minor);
9202 }
9203
9204 /*
9205 ** Deallocate and destroy a parser. Destructors are called for
9206 ** all stack elements before shutting the parser down.
9207 **
9208 ** If the fts5YYPARSEFREENEVERNULL macro exists (for example because it
9209 ** is defined in a %include section of the input grammar) then it is
9210 ** assumed that the input pointer is never NULL.
9211 */
9212 static void sqlite3Fts5ParserFree(
9213 void *p, /* The parser to be deleted */
9214 void (*freeProc)(void*) /* Function used to reclaim memory */
9215 ){
9216 fts5yyParser *pParser = (fts5yyParser*)p;
9217 #ifndef fts5YYPARSEFREENEVERNULL
9218 if( pParser==0 ) return;
9219 #endif
9220 while( pParser->fts5yyidx>=0 ) fts5yy_pop_parser_stack(pParser);
9221 #if fts5YYSTACKDEPTH<=0
9222 free(pParser->fts5yystack);
9223 #endif
9224 (*freeProc)((void*)pParser);
9225 }
9226
9227 /*
9228 ** Return the peak depth of the stack for a parser.
9229 */
9230 #ifdef fts5YYTRACKMAXSTACKDEPTH
9231 static int sqlite3Fts5ParserStackPeak(void *p){
9232 fts5yyParser *pParser = (fts5yyParser*)p;
9233 return pParser->fts5yyidxMax;
9234 }
9235 #endif
9236
9237 /*
9238 ** Find the appropriate action for a parser given the terminal
9239 ** look-ahead token iLookAhead.
9240 */
9241 static int fts5yy_find_shift_action(
9242 fts5yyParser *pParser, /* The parser */
9243 fts5YYCODETYPE iLookAhead /* The look-ahead token */
9244 ){
9245 int i;
9246 int stateno = pParser->fts5yystack[pParser->fts5yyidx].stateno;
9247
9248 if( stateno>=fts5YY_MIN_REDUCE ) return stateno;
9249 assert( stateno <= fts5YY_SHIFT_COUNT );
9250 do{
9251 i = fts5yy_shift_ofst[stateno];
9252 if( i==fts5YY_SHIFT_USE_DFLT ) return fts5yy_default[stateno];
9253 assert( iLookAhead!=fts5YYNOCODE );
9254 i += iLookAhead;
9255 if( i<0 || i>=fts5YY_ACTTAB_COUNT || fts5yy_lookahead[i]!=iLookAhead ){
9256 if( iLookAhead>0 ){
9257 #ifdef fts5YYFALLBACK
9258 fts5YYCODETYPE iFallback; /* Fallback token */
9259 if( iLookAhead<sizeof(fts5yyFallback)/sizeof(fts5yyFallback[0])
9260 && (iFallback = fts5yyFallback[iLookAhead])!=0 ){
9261 #ifndef NDEBUG
9262 if( fts5yyTraceFILE ){
9263 fprintf(fts5yyTraceFILE, "%sFALLBACK %s => %s\n",
9264 fts5yyTracePrompt, fts5yyTokenName[iLookAhead], fts5yyTokenName[iFallback]);
9265 }
9266 #endif
9267 assert( fts5yyFallback[iFallback]==0 ); /* Fallback loop must terminate */
9268 iLookAhead = iFallback;
9269 continue;
9270 }
9271 #endif
9272 #ifdef fts5YYWILDCARD
9273 {
9274 int j = i - iLookAhead + fts5YYWILDCARD;
9275 if(
9276 #if fts5YY_SHIFT_MIN+fts5YYWILDCARD<0
9277 j>=0 &&
9278 #endif
9279 #if fts5YY_SHIFT_MAX+fts5YYWILDCARD>=fts5YY_ACTTAB_COUNT
9280 j<fts5YY_ACTTAB_COUNT &&
9281 #endif
9282 fts5yy_lookahead[j]==fts5YYWILDCARD
9283 ){
9284 #ifndef NDEBUG
9285 if( fts5yyTraceFILE ){
9286 fprintf(fts5yyTraceFILE, "%sWILDCARD %s => %s\n",
9287 fts5yyTracePrompt, fts5yyTokenName[iLookAhead],
9288 fts5yyTokenName[fts5YYWILDCARD]);
9289 }
9290 #endif /* NDEBUG */
9291 return fts5yy_action[j];
9292 }
9293 }
9294 #endif /* fts5YYWILDCARD */
9295 }
9296 return fts5yy_default[stateno];
9297 }else{
9298 return fts5yy_action[i];
9299 }
9300 }while(1);
9301 }
9302
9303 /*
9304 ** Find the appropriate action for a parser given the non-terminal
9305 ** look-ahead token iLookAhead.
9306 */
9307 static int fts5yy_find_reduce_action(
9308 int stateno, /* Current state number */
9309 fts5YYCODETYPE iLookAhead /* The look-ahead token */
9310 ){
9311 int i;
9312 #ifdef fts5YYERRORSYMBOL
9313 if( stateno>fts5YY_REDUCE_COUNT ){
9314 return fts5yy_default[stateno];
9315 }
9316 #else
9317 assert( stateno<=fts5YY_REDUCE_COUNT );
9318 #endif
9319 i = fts5yy_reduce_ofst[stateno];
9320 assert( i!=fts5YY_REDUCE_USE_DFLT );
9321 assert( iLookAhead!=fts5YYNOCODE );
9322 i += iLookAhead;
9323 #ifdef fts5YYERRORSYMBOL
9324 if( i<0 || i>=fts5YY_ACTTAB_COUNT || fts5yy_lookahead[i]!=iLookAhead ){
9325 return fts5yy_default[stateno];
9326 }
9327 #else
9328 assert( i>=0 && i<fts5YY_ACTTAB_COUNT );
9329 assert( fts5yy_lookahead[i]==iLookAhead );
9330 #endif
9331 return fts5yy_action[i];
9332 }
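The two lookup routines above share one idea: each state stores an offset into a single packed action table, and a parallel lookahead table confirms that the slot really belongs to the requested token. A minimal sketch of that lookup follows, assuming tiny hand-made tables; `aAction`, `aLookahead`, `aOfst`, `aDefault`, `USE_DFLT` and `findAction` are illustrative names, not the generated fts5 tables.

```c
#include <assert.h>

#define ACTTAB_COUNT 4        /* Size of the packed action table (hypothetical) */
#define USE_DFLT     (-1000)  /* Offset meaning "always use the default action" */

/* Rows for two states are interleaved in one array. State 0 starts at
** offset 0 and owns slots 0 and 1; state 1 starts at offset 1 and owns
** slots 2 and 3. aLookahead[i] records which token slot i was built for. */
static const int aAction[ACTTAB_COUNT]    = { 10, 11, 20, 21 };
static const int aLookahead[ACTTAB_COUNT] = {  0,  1,  1,  2 };
static const int aOfst[2]    = { 0, 1 };
static const int aDefault[2] = { 99, 98 };

/* Return the action for (stateno, iLookAhead), or the state's default
** action if the packed table has no entry for that pair. */
static int findAction(int stateno, int iLookAhead){
  int i = aOfst[stateno];
  if( i==USE_DFLT ) return aDefault[stateno];
  i += iLookAhead;
  if( i<0 || i>=ACTTAB_COUNT || aLookahead[i]!=iLookAhead ){
    return aDefault[stateno];   /* Slot is out of range or owned by another pair */
  }
  return aAction[i];
}
```

With these tables, `findAction(0, 1)` lands on slot 1 and returns 11, while `findAction(0, 2)` lands on slot 2, which was built for token 1, so the default 99 is returned instead.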
9333
9334 /*
9335 ** The following routine is called if the stack overflows.
9336 */
9337 static void fts5yyStackOverflow(fts5yyParser *fts5yypParser, fts5YYMINORTYPE *fts5yypMinor){
9338 sqlite3Fts5ParserARG_FETCH;
9339 fts5yypParser->fts5yyidx--;
9340 #ifndef NDEBUG
9341 if( fts5yyTraceFILE ){
9342 fprintf(fts5yyTraceFILE,"%sStack Overflow!\n",fts5yyTracePrompt);
9343 }
9344 #endif
9345 while( fts5yypParser->fts5yyidx>=0 ) fts5yy_pop_parser_stack(fts5yypParser);
9346 /* Here code is inserted which will execute if the parser
9347 ** stack ever overflows */
9348 /******** Begin %stack_overflow code ******************************************/
9349
9350 assert( 0 );
9351 /******** End %stack_overflow code ********************************************/
9352 sqlite3Fts5ParserARG_STORE; /* Suppress warning about unused %extra_argument var */
9353 }
9354
9355 /*
9356 ** Print tracing information for a SHIFT action
9357 */
9358 #ifndef NDEBUG
9359 static void fts5yyTraceShift(fts5yyParser *fts5yypParser, int fts5yyNewState){
9360 if( fts5yyTraceFILE ){
9361 if( fts5yyNewState<fts5YYNSTATE ){
9362 fprintf(fts5yyTraceFILE,"%sShift '%s', go to state %d\n",
9363 fts5yyTracePrompt,fts5yyTokenName[fts5yypParser->fts5yystack[fts5yypParser->fts5yyidx].major],
9364 fts5yyNewState);
9365 }else{
9366 fprintf(fts5yyTraceFILE,"%sShift '%s'\n",
9367 fts5yyTracePrompt,fts5yyTokenName[fts5yypParser->fts5yystack[fts5yypParser->fts5yyidx].major]);
9368 }
9369 }
9370 }
9371 #else
9372 # define fts5yyTraceShift(X,Y)
9373 #endif
9374
9375 /*
9376 ** Perform a shift action.
9377 */
9378 static void fts5yy_shift(
9379 fts5yyParser *fts5yypParser, /* The parser to be shifted */
9380 int fts5yyNewState, /* The new state to shift in */
9381 int fts5yyMajor, /* The major token to shift in */
9382 fts5YYMINORTYPE *fts5yypMinor /* Pointer to the minor token to shift in */
9383 ){
9384 fts5yyStackEntry *fts5yytos;
9385 fts5yypParser->fts5yyidx++;
9386 #ifdef fts5YYTRACKMAXSTACKDEPTH
9387 if( fts5yypParser->fts5yyidx>fts5yypParser->fts5yyidxMax ){
9388 fts5yypParser->fts5yyidxMax = fts5yypParser->fts5yyidx;
9389 }
9390 #endif
9391 #if fts5YYSTACKDEPTH>0
9392 if( fts5yypParser->fts5yyidx>=fts5YYSTACKDEPTH ){
9393 fts5yyStackOverflow(fts5yypParser, fts5yypMinor);
9394 return;
9395 }
9396 #else
9397 if( fts5yypParser->fts5yyidx>=fts5yypParser->fts5yystksz ){
9398 fts5yyGrowStack(fts5yypParser);
9399 if( fts5yypParser->fts5yyidx>=fts5yypParser->fts5yystksz ){
9400 fts5yyStackOverflow(fts5yypParser, fts5yypMinor);
9401 return;
9402 }
9403 }
9404 #endif
9405 fts5yytos = &fts5yypParser->fts5yystack[fts5yypParser->fts5yyidx];
9406 fts5yytos->stateno = (fts5YYACTIONTYPE)fts5yyNewState;
9407 fts5yytos->major = (fts5YYCODETYPE)fts5yyMajor;
9408 fts5yytos->minor = *fts5yypMinor;
9409 fts5yyTraceShift(fts5yypParser, fts5yyNewState);
9410 }
9411
9412 /* The following table contains information about every rule that
9413 ** is used during the reduce.
9414 */
9415 static const struct {
9416 fts5YYCODETYPE lhs; /* Symbol on the left-hand side of the rule */
9417 unsigned char nrhs; /* Number of right-hand side symbols in the rule */
9418 } fts5yyRuleInfo[] = {
9419 { 15, 1 },
9420 { 16, 3 },
9421 { 16, 3 },
9422 { 16, 3 },
9423 { 16, 3 },
9424 { 16, 1 },
9425 { 18, 1 },
9426 { 18, 2 },
9427 { 17, 1 },
9428 { 17, 3 },
9429 { 20, 3 },
9430 { 20, 1 },
9431 { 21, 2 },
9432 { 21, 1 },
9433 { 19, 1 },
9434 { 19, 5 },
9435 { 22, 1 },
9436 { 22, 2 },
9437 { 24, 0 },
9438 { 24, 2 },
9439 { 23, 4 },
9440 { 23, 2 },
9441 { 25, 1 },
9442 { 25, 0 },
9443 };
9444
9445 static void fts5yy_accept(fts5yyParser*); /* Forward Declaration */
9446
9447 /*
9448 ** Perform a reduce action and the shift that must immediately
9449 ** follow the reduce.
9450 */
9451 static void fts5yy_reduce(
9452 fts5yyParser *fts5yypParser, /* The parser */
9453 int fts5yyruleno /* Number of the rule by which to reduce */
9454 ){
9455 int fts5yygoto; /* The next state */
9456 int fts5yyact; /* The next action */
9457 fts5YYMINORTYPE fts5yygotominor; /* The LHS of the rule reduced */
9458 fts5yyStackEntry *fts5yymsp; /* The top of the parser's stack */
9459 int fts5yysize; /* Amount to pop the stack */
9460 sqlite3Fts5ParserARG_FETCH;
9461 fts5yymsp = &fts5yypParser->fts5yystack[fts5yypParser->fts5yyidx];
9462 #ifndef NDEBUG
9463 if( fts5yyTraceFILE && fts5yyruleno>=0
9464 && fts5yyruleno<(int)(sizeof(fts5yyRuleName)/sizeof(fts5yyRuleName[0])) ){
9465 fts5yysize = fts5yyRuleInfo[fts5yyruleno].nrhs;
9466 fprintf(fts5yyTraceFILE, "%sReduce [%s], go to state %d.\n", fts5yyTracePrompt,
9467 fts5yyRuleName[fts5yyruleno], fts5yymsp[-fts5yysize].stateno);
9468 }
9469 #endif /* NDEBUG */
9470 fts5yygotominor = fts5yyzerominor;
9471
9472 switch( fts5yyruleno ){
9473 /* Beginning here are the reduction cases. A typical example
9474 ** follows:
9475 ** case 0:
9476 ** #line <lineno> <grammarfile>
9477 ** { ... } // User supplied code
9478 ** #line <lineno> <thisfile>
9479 ** break;
9480 */
9481 /********** Begin reduce actions **********************************************/
9482 case 0: /* input ::= expr */
9483 { sqlite3Fts5ParseFinished(pParse, fts5yymsp[0].minor.fts5yy18); }
9484 break;
9485 case 1: /* expr ::= expr AND expr */
9486 {
9487 fts5yygotominor.fts5yy18 = sqlite3Fts5ParseNode(pParse, FTS5_AND, fts5yymsp[-2].minor.fts5yy18, fts5yymsp[0].minor.fts5yy18, 0);
9488 }
9489 break;
9490 case 2: /* expr ::= expr OR expr */
9491 {
9492 fts5yygotominor.fts5yy18 = sqlite3Fts5ParseNode(pParse, FTS5_OR, fts5yymsp[-2].minor.fts5yy18, fts5yymsp[0].minor.fts5yy18, 0);
9493 }
9494 break;
9495 case 3: /* expr ::= expr NOT expr */
9496 {
9497 fts5yygotominor.fts5yy18 = sqlite3Fts5ParseNode(pParse, FTS5_NOT, fts5yymsp[-2].minor.fts5yy18, fts5yymsp[0].minor.fts5yy18, 0);
9498 }
9499 break;
9500 case 4: /* expr ::= LP expr RP */
9501 {fts5yygotominor.fts5yy18 = fts5yymsp[-1].minor.fts5yy18;}
9502 break;
9503 case 5: /* expr ::= exprlist */
9504 case 6: /* exprlist ::= cnearset */ fts5yytestcase(fts5yyruleno==6);
9505 {fts5yygotominor.fts5yy18 = fts5yymsp[0].minor.fts5yy18;}
9506 break;
9507 case 7: /* exprlist ::= exprlist cnearset */
9508 {
9509 fts5yygotominor.fts5yy18 = sqlite3Fts5ParseNode(pParse, FTS5_AND, fts5yymsp[-1].minor.fts5yy18, fts5yymsp[0].minor.fts5yy18, 0);
9510 }
9511 break;
9512 case 8: /* cnearset ::= nearset */
9513 {
9514 fts5yygotominor.fts5yy18 = sqlite3Fts5ParseNode(pParse, FTS5_STRING, 0, 0, fts5yymsp[0].minor.fts5yy26);
9515 }
9516 break;
9517 case 9: /* cnearset ::= colset COLON nearset */
9518 {
9519 sqlite3Fts5ParseSetColset(pParse, fts5yymsp[0].minor.fts5yy26, fts5yymsp[-2].minor.fts5yy3);
9520 fts5yygotominor.fts5yy18 = sqlite3Fts5ParseNode(pParse, FTS5_STRING, 0, 0, fts5yymsp[0].minor.fts5yy26);
9521 }
9522 break;
9523 case 10: /* colset ::= LCP colsetlist RCP */
9524 { fts5yygotominor.fts5yy3 = fts5yymsp[-1].minor.fts5yy3; }
9525 break;
9526 case 11: /* colset ::= STRING */
9527 {
9528 fts5yygotominor.fts5yy3 = sqlite3Fts5ParseColset(pParse, 0, &fts5yymsp[0].minor.fts5yy0);
9529 }
9530 break;
9531 case 12: /* colsetlist ::= colsetlist STRING */
9532 {
9533 fts5yygotominor.fts5yy3 = sqlite3Fts5ParseColset(pParse, fts5yymsp[-1].minor.fts5yy3, &fts5yymsp[0].minor.fts5yy0); }
9534 break;
9535 case 13: /* colsetlist ::= STRING */
9536 {
9537 fts5yygotominor.fts5yy3 = sqlite3Fts5ParseColset(pParse, 0, &fts5yymsp[0].minor.fts5yy0);
9538 }
9539 break;
9540 case 14: /* nearset ::= phrase */
9541 { fts5yygotominor.fts5yy26 = sqlite3Fts5ParseNearset(pParse, 0, fts5yymsp[0].minor.fts5yy11); }
9542 break;
9543 case 15: /* nearset ::= STRING LP nearphrases neardist_opt RP */
9544 {
9545 sqlite3Fts5ParseNear(pParse, &fts5yymsp[-4].minor.fts5yy0);
9546 sqlite3Fts5ParseSetDistance(pParse, fts5yymsp[-2].minor.fts5yy26, &fts5yymsp[-1].minor.fts5yy0);
9547 fts5yygotominor.fts5yy26 = fts5yymsp[-2].minor.fts5yy26;
9548 }
9549 break;
9550 case 16: /* nearphrases ::= phrase */
9551 {
9552 fts5yygotominor.fts5yy26 = sqlite3Fts5ParseNearset(pParse, 0, fts5yymsp[0].minor.fts5yy11);
9553 }
9554 break;
9555 case 17: /* nearphrases ::= nearphrases phrase */
9556 {
9557 fts5yygotominor.fts5yy26 = sqlite3Fts5ParseNearset(pParse, fts5yymsp[-1].minor.fts5yy26, fts5yymsp[0].minor.fts5yy11);
9558 }
9559 break;
9560 case 18: /* neardist_opt ::= */
9561 { fts5yygotominor.fts5yy0.p = 0; fts5yygotominor.fts5yy0.n = 0; }
9562 break;
9563 case 19: /* neardist_opt ::= COMMA STRING */
9564 { fts5yygotominor.fts5yy0 = fts5yymsp[0].minor.fts5yy0; }
9565 break;
9566 case 20: /* phrase ::= phrase PLUS STRING star_opt */
9567 {
9568 fts5yygotominor.fts5yy11 = sqlite3Fts5ParseTerm(pParse, fts5yymsp[-3].minor.fts5yy11, &fts5yymsp[-1].minor.fts5yy0, fts5yymsp[0].minor.fts5yy20);
9569 }
9570 break;
9571 case 21: /* phrase ::= STRING star_opt */
9572 {
9573 fts5yygotominor.fts5yy11 = sqlite3Fts5ParseTerm(pParse, 0, &fts5yymsp[-1].minor.fts5yy0, fts5yymsp[0].minor.fts5yy20);
9574 }
9575 break;
9576 case 22: /* star_opt ::= STAR */
9577 { fts5yygotominor.fts5yy20 = 1; }
9578 break;
9579 case 23: /* star_opt ::= */
9580 { fts5yygotominor.fts5yy20 = 0; }
9581 break;
9582 default:
9583 break;
9584 /********** End reduce actions ************************************************/
9585 };
9586 assert( fts5yyruleno>=0 && fts5yyruleno<sizeof(fts5yyRuleInfo)/sizeof(fts5yyRuleInfo[0]) );
9587 fts5yygoto = fts5yyRuleInfo[fts5yyruleno].lhs;
9588 fts5yysize = fts5yyRuleInfo[fts5yyruleno].nrhs;
9589 fts5yypParser->fts5yyidx -= fts5yysize;
9590 fts5yyact = fts5yy_find_reduce_action(fts5yymsp[-fts5yysize].stateno,(fts5YYCODETYPE)fts5yygoto);
9591 if( fts5yyact <= fts5YY_MAX_SHIFTREDUCE ){
9592 if( fts5yyact>fts5YY_MAX_SHIFT ) fts5yyact += fts5YY_MIN_REDUCE - fts5YY_MIN_SHIFTREDUCE;
9593 /* If the reduce action popped at least
9594 ** one element off the stack, then we can push the new element back
9595 ** onto the stack here, and skip the stack overflow test in fts5yy_shift().
9596 ** That gives a significant speed improvement. */
9597 if( fts5yysize ){
9598 fts5yypParser->fts5yyidx++;
9599 fts5yymsp -= fts5yysize-1;
9600 fts5yymsp->stateno = (fts5YYACTIONTYPE)fts5yyact;
9601 fts5yymsp->major = (fts5YYCODETYPE)fts5yygoto;
9602 fts5yymsp->minor = fts5yygotominor;
9603 fts5yyTraceShift(fts5yypParser, fts5yyact);
9604 }else{
9605 fts5yy_shift(fts5yypParser,fts5yyact,fts5yygoto,&fts5yygotominor);
9606 }
9607 }else{
9608 assert( fts5yyact == fts5YY_ACCEPT_ACTION );
9609 fts5yy_accept(fts5yypParser);
9610 }
9611 }
9612
9613 /*
9614 ** The following code executes when the parse fails
9615 */
9616 #ifndef fts5YYNOERRORRECOVERY
9617 static void fts5yy_parse_failed(
9618 fts5yyParser *fts5yypParser /* The parser */
9619 ){
9620 sqlite3Fts5ParserARG_FETCH;
9621 #ifndef NDEBUG
9622 if( fts5yyTraceFILE ){
9623 fprintf(fts5yyTraceFILE,"%sFail!\n",fts5yyTracePrompt);
9624 }
9625 #endif
9626 while( fts5yypParser->fts5yyidx>=0 ) fts5yy_pop_parser_stack(fts5yypParser);
9627 /* Here code is inserted which will be executed whenever the
9628 ** parser fails */
9629 /************ Begin %parse_failure code ***************************************/
9630 /************ End %parse_failure code *****************************************/
9631 sqlite3Fts5ParserARG_STORE; /* Suppress warning about unused %extra_argument variable */
9632 }
9633 #endif /* fts5YYNOERRORRECOVERY */
9634
9635 /*
9636 ** The following code executes when a syntax error first occurs.
9637 */
9638 static void fts5yy_syntax_error(
9639 fts5yyParser *fts5yypParser, /* The parser */
9640 int fts5yymajor, /* The major type of the error token */
9641 fts5YYMINORTYPE fts5yyminor /* The minor type of the error token */
9642 ){
9643 sqlite3Fts5ParserARG_FETCH;
9644 #define FTS5TOKEN (fts5yyminor.fts5yy0)
9645 /************ Begin %syntax_error code ****************************************/
9646
9647 sqlite3Fts5ParseError(
9648 pParse, "fts5: syntax error near \"%.*s\"",FTS5TOKEN.n,FTS5TOKEN.p
9649 );
9650 /************ End %syntax_error code ******************************************/
9651 sqlite3Fts5ParserARG_STORE; /* Suppress warning about unused %extra_argument variable */
9652 }
9653
9654 /*
9655 ** The following is executed when the parser accepts
9656 */
9657 static void fts5yy_accept(
9658 fts5yyParser *fts5yypParser /* The parser */
9659 ){
9660 sqlite3Fts5ParserARG_FETCH;
9661 #ifndef NDEBUG
9662 if( fts5yyTraceFILE ){
9663 fprintf(fts5yyTraceFILE,"%sAccept!\n",fts5yyTracePrompt);
9664 }
9665 #endif
9666 while( fts5yypParser->fts5yyidx>=0 ) fts5yy_pop_parser_stack(fts5yypParser);
9667 /* Here code is inserted which will be executed whenever the
9668 ** parser accepts */
9669 /*********** Begin %parse_accept code *****************************************/
9670 /*********** End %parse_accept code *******************************************/
9671 sqlite3Fts5ParserARG_STORE; /* Suppress warning about unused %extra_argument variable */
9672 }
9673
9674 /* The main parser program.
9675 ** The first argument is a pointer to a structure obtained from
9676 ** "sqlite3Fts5ParserAlloc" which describes the current state of the parser.
9677 ** The second argument is the major token number. The third is
9678 ** the minor token. The fourth optional argument is whatever the
9679 ** user wants (and specified in the grammar) and is available for
9680 ** use by the action routines.
9681 **
9682 ** Inputs:
9683 ** <ul>
9684 ** <li> A pointer to the parser (an opaque structure.)
9685 ** <li> The major token number.
9686 ** <li> The minor token number.
9687 ** <li> An optional argument of a grammar-specified type.
9688 ** </ul>
9689 **
9690 ** Outputs:
9691 ** None.
9692 */
9693 static void sqlite3Fts5Parser(
9694 void *fts5yyp, /* The parser */
9695 int fts5yymajor, /* The major token code number */
9696 sqlite3Fts5ParserFTS5TOKENTYPE fts5yyminor /* The value for the token */
9697 sqlite3Fts5ParserARG_PDECL /* Optional %extra_argument parameter */
9698 ){
9699 fts5YYMINORTYPE fts5yyminorunion;
9700 int fts5yyact; /* The parser action. */
9701 #if !defined(fts5YYERRORSYMBOL) && !defined(fts5YYNOERRORRECOVERY)
9702 int fts5yyendofinput; /* True if we are at the end of input */
9703 #endif
9704 #ifdef fts5YYERRORSYMBOL
9705 int fts5yyerrorhit = 0; /* True if fts5yymajor has invoked an error */
9706 #endif
9707 fts5yyParser *fts5yypParser; /* The parser */
9708
9709 /* (re)initialize the parser, if necessary */
9710 fts5yypParser = (fts5yyParser*)fts5yyp;
9711 if( fts5yypParser->fts5yyidx<0 ){
9712 #if fts5YYSTACKDEPTH<=0
9713 if( fts5yypParser->fts5yystksz <=0 ){
9714 /*memset(&fts5yyminorunion, 0, sizeof(fts5yyminorunion));*/
9715 fts5yyminorunion = fts5yyzerominor;
9716 fts5yyStackOverflow(fts5yypParser, &fts5yyminorunion);
9717 return;
9718 }
9719 #endif
9720 fts5yypParser->fts5yyidx = 0;
9721 fts5yypParser->fts5yyerrcnt = -1;
9722 fts5yypParser->fts5yystack[0].stateno = 0;
9723 fts5yypParser->fts5yystack[0].major = 0;
9724 #ifndef NDEBUG
9725 if( fts5yyTraceFILE ){
9726 fprintf(fts5yyTraceFILE,"%sInitialize. Empty stack. State 0\n",
9727 fts5yyTracePrompt);
9728 }
9729 #endif
9730 }
9731 fts5yyminorunion.fts5yy0 = fts5yyminor;
9732 #if !defined(fts5YYERRORSYMBOL) && !defined(fts5YYNOERRORRECOVERY)
9733 fts5yyendofinput = (fts5yymajor==0);
9734 #endif
9735 sqlite3Fts5ParserARG_STORE;
9736
9737 #ifndef NDEBUG
9738 if( fts5yyTraceFILE ){
9739 fprintf(fts5yyTraceFILE,"%sInput '%s'\n",fts5yyTracePrompt,fts5yyTokenName[fts5yymajor]);
9740 }
9741 #endif
9742
9743 do{
9744 fts5yyact = fts5yy_find_shift_action(fts5yypParser,(fts5YYCODETYPE)fts5yymaj or);
9745 if( fts5yyact <= fts5YY_MAX_SHIFTREDUCE ){
9746 if( fts5yyact > fts5YY_MAX_SHIFT ) fts5yyact += fts5YY_MIN_REDUCE - fts5YY_MIN_SHIFTREDUCE;
9747 fts5yy_shift(fts5yypParser,fts5yyact,fts5yymajor,&fts5yyminorunion);
9748 fts5yypParser->fts5yyerrcnt--;
9749 fts5yymajor = fts5YYNOCODE;
9750 }else if( fts5yyact <= fts5YY_MAX_REDUCE ){
9751 fts5yy_reduce(fts5yypParser,fts5yyact-fts5YY_MIN_REDUCE);
9752 }else{
9753 assert( fts5yyact == fts5YY_ERROR_ACTION );
9754 #ifdef fts5YYERRORSYMBOL
9755 int fts5yymx;
9756 #endif
9757 #ifndef NDEBUG
9758 if( fts5yyTraceFILE ){
9759 fprintf(fts5yyTraceFILE,"%sSyntax Error!\n",fts5yyTracePrompt);
9760 }
9761 #endif
9762 #ifdef fts5YYERRORSYMBOL
9763 /* A syntax error has occurred.
9764 ** The response to an error depends upon whether or not the
9765 ** grammar defines an error token "ERROR".
9766 **
9767 ** This is what we do if the grammar does define ERROR:
9768 **
9769 ** * Call the %syntax_error function.
9770 **
9771 ** * Begin popping the stack until we enter a state where
9772 ** it is legal to shift the error symbol, then shift
9773 ** the error symbol.
9774 **
9775 ** * Set the error count to three.
9776 **
9777 ** * Begin accepting and shifting new tokens. No new error
9778 ** processing will occur until three tokens have been
9779 ** shifted successfully.
9780 **
9781 */
9782 if( fts5yypParser->fts5yyerrcnt<0 ){
9783 fts5yy_syntax_error(fts5yypParser,fts5yymajor,fts5yyminorunion);
9784 }
9785 fts5yymx = fts5yypParser->fts5yystack[fts5yypParser->fts5yyidx].major;
9786 if( fts5yymx==fts5YYERRORSYMBOL || fts5yyerrorhit ){
9787 #ifndef NDEBUG
9788 if( fts5yyTraceFILE ){
9789 fprintf(fts5yyTraceFILE,"%sDiscard input token %s\n",
9790 fts5yyTracePrompt,fts5yyTokenName[fts5yymajor]);
9791 }
9792 #endif
9793 fts5yy_destructor(fts5yypParser, (fts5YYCODETYPE)fts5yymajor,&fts5yyminorunion);
9794 fts5yymajor = fts5YYNOCODE;
9795 }else{
9796 while(
9797 fts5yypParser->fts5yyidx >= 0 &&
9798 fts5yymx != fts5YYERRORSYMBOL &&
9799 (fts5yyact = fts5yy_find_reduce_action(
9800 fts5yypParser->fts5yystack[fts5yypParser->fts5yyidx].stateno,
9801 fts5YYERRORSYMBOL)) >= fts5YY_MIN_REDUCE
9802 ){
9803 fts5yy_pop_parser_stack(fts5yypParser);
9804 }
9805 if( fts5yypParser->fts5yyidx < 0 || fts5yymajor==0 ){
9806 fts5yy_destructor(fts5yypParser,(fts5YYCODETYPE)fts5yymajor,&fts5yyminorunion);
9807 fts5yy_parse_failed(fts5yypParser);
9808 fts5yymajor = fts5YYNOCODE;
9809 }else if( fts5yymx!=fts5YYERRORSYMBOL ){
9810 fts5YYMINORTYPE u2;
9811 u2.fts5YYERRSYMDT = 0;
9812 fts5yy_shift(fts5yypParser,fts5yyact,fts5YYERRORSYMBOL,&u2);
9813 }
9814 }
9815 fts5yypParser->fts5yyerrcnt = 3;
9816 fts5yyerrorhit = 1;
9817 #elif defined(fts5YYNOERRORRECOVERY)
9818 /* If the fts5YYNOERRORRECOVERY macro is defined, then do not attempt to
9819 ** do any kind of error recovery. Instead, simply invoke the syntax
9820 ** error routine and continue going as if nothing had happened.
9821 **
9822 ** Applications can set this macro (for example inside %include) if
9823 ** they intend to abandon the parse upon the first syntax error seen.
9824 */
9825 fts5yy_syntax_error(fts5yypParser,fts5yymajor,fts5yyminorunion);
9826 fts5yy_destructor(fts5yypParser,(fts5YYCODETYPE)fts5yymajor,&fts5yyminorunion);
9827 fts5yymajor = fts5YYNOCODE;
9828
9829 #else /* fts5YYERRORSYMBOL is not defined */
9830 /* This is what we do if the grammar does not define ERROR:
9831 **
9832 ** * Report an error message, and throw away the input token.
9833 **
9834 ** * If the input token is $, then fail the parse.
9835 **
9836 ** As before, subsequent error messages are suppressed until
9837 ** three input tokens have been successfully shifted.
9838 */
9839 if( fts5yypParser->fts5yyerrcnt<=0 ){
9840 fts5yy_syntax_error(fts5yypParser,fts5yymajor,fts5yyminorunion);
9841 }
9842 fts5yypParser->fts5yyerrcnt = 3;
9843 fts5yy_destructor(fts5yypParser,(fts5YYCODETYPE)fts5yymajor,&fts5yyminorunion);
9844 if( fts5yyendofinput ){
9845 fts5yy_parse_failed(fts5yypParser);
9846 }
9847 fts5yymajor = fts5YYNOCODE;
9848 #endif
9849 }
9850 }while( fts5yymajor!=fts5YYNOCODE && fts5yypParser->fts5yyidx>=0 );
9851 #ifndef NDEBUG
9852 if( fts5yyTraceFILE ){
9853 int i;
9854 fprintf(fts5yyTraceFILE,"%sReturn. Stack=",fts5yyTracePrompt);
9855 for(i=1; i<=fts5yypParser->fts5yyidx; i++)
9856 fprintf(fts5yyTraceFILE,"%c%s", i==1 ? '[' : ' ',
9857 fts5yyTokenName[fts5yypParser->fts5yystack[i].major]);
9858 fprintf(fts5yyTraceFILE,"]\n");
9859 }
9860 #endif
9861 return;
9862 }
9863
9864 /*
9865 ** 2014 May 31
9866 **
9867 ** The author disclaims copyright to this source code. In place of
9868 ** a legal notice, here is a blessing:
9869 **
9870 ** May you do good and not evil.
9871 ** May you find forgiveness for yourself and forgive others.
9872 ** May you share freely, never taking more than you give.
9873 **
9874 ******************************************************************************
9875 */
9876
9877
9878 /* #include "fts5Int.h" */
9879 #include <math.h> /* amalgamator: keep */
9880
9881 /*
9882 ** Object used to iterate through all "coalesced phrase instances" in
9883 ** a single column of the current row. If the phrase instances in the
9884 ** column being considered do not overlap, this object simply iterates
9885 ** through them. Or, if they do overlap (share one or more tokens in
9886 ** common), each set of overlapping instances is treated as a single
9887 ** match. See documentation for the highlight() auxiliary function for
9888 ** details.
9889 **
9890 ** Usage is:
9891 **
9892 ** for(rc = fts5CInstIterInit(pApi, pFts, iCol, &iter);
9893 ** rc==SQLITE_OK && 0==fts5CInstIterEof(&iter);
9894 ** rc = fts5CInstIterNext(&iter)
9895 ** ){
9896 ** printf("instance starts at %d, ends at %d\n", iter.iStart, iter.iEnd);
9897 ** }
9898 **
9899 */
9900 typedef struct CInstIter CInstIter;
9901 struct CInstIter {
9902 const Fts5ExtensionApi *pApi; /* API offered by current FTS version */
9903 Fts5Context *pFts; /* First arg to pass to pApi functions */
9904 int iCol; /* Column to search */
9905 int iInst; /* Next phrase instance index */
9906 int nInst; /* Total number of phrase instances */
9907
9908 /* Output variables */
9909 int iStart; /* First token in coalesced phrase instance */
9910 int iEnd; /* Last token in coalesced phrase instance */
9911 };
9912
9913 /*
9914 ** Advance the iterator to the next coalesced phrase instance. Return
9915 ** an SQLite error code if an error occurs, or SQLITE_OK otherwise.
9916 */
9917 static int fts5CInstIterNext(CInstIter *pIter){
9918 int rc = SQLITE_OK;
9919 pIter->iStart = -1;
9920 pIter->iEnd = -1;
9921
9922 while( rc==SQLITE_OK && pIter->iInst<pIter->nInst ){
9923 int ip; int ic; int io;
9924 rc = pIter->pApi->xInst(pIter->pFts, pIter->iInst, &ip, &ic, &io);
9925 if( rc==SQLITE_OK ){
9926 if( ic==pIter->iCol ){
9927 int iEnd = io - 1 + pIter->pApi->xPhraseSize(pIter->pFts, ip);
9928 if( pIter->iStart<0 ){
9929 pIter->iStart = io;
9930 pIter->iEnd = iEnd;
9931 }else if( io<=pIter->iEnd ){
9932 if( iEnd>pIter->iEnd ) pIter->iEnd = iEnd;
9933 }else{
9934 break;
9935 }
9936 }
9937 pIter->iInst++;
9938 }
9939 }
9940
9941 return rc;
9942 }
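The coalescing rule in fts5CInstIterNext() can be tried out in isolation. The sketch below assumes the phrase instances are already available as an array sorted by starting token, which stands in for the xInst() and xPhraseSize() calls; `Span` and `coalesceNext` are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for one phrase instance: the offset of its first
** token and the number of tokens in the phrase. Not part of the FTS5 API. */
typedef struct Span { int iStart; int nToken; } Span;

/* Starting at aSpan[*piNext], merge every span that shares at least one
** token with the running range into [*piStart, *piEnd]. Returns 1 if a
** range was produced, or 0 if the input is exhausted (in which case both
** outputs are set to -1). Assumes aSpan[] is sorted by iStart. */
static int coalesceNext(
  const Span *aSpan, int nSpan,   /* Input instances */
  int *piNext,                    /* IN/OUT: index of next unread instance */
  int *piStart, int *piEnd        /* OUT: first and last token of the range */
){
  int iStart = -1;
  int iEnd = -1;
  while( *piNext<nSpan ){
    int io = aSpan[*piNext].iStart;
    int iLast = io - 1 + aSpan[*piNext].nToken;
    if( iStart<0 ){
      iStart = io;                      /* First instance opens the range */
      iEnd = iLast;
    }else if( io<=iEnd ){
      if( iLast>iEnd ) iEnd = iLast;    /* Overlap: extend the range */
    }else{
      break;                            /* Gap: leave this one for next call */
    }
    (*piNext)++;
  }
  *piStart = iStart;
  *piEnd = iEnd;
  return iStart>=0;
}
```

For the instances {0,2}, {1,2} and {5,1}, the first call yields the range 0..2 and the second yields 5..5. Instances that merely touch (one ends at token 2, the next starts at token 3) are not merged, because the test is `io<=iEnd`, the same comparison used in the routine above.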
9943
9944 /*
9945 ** Initialize the iterator object indicated by the final parameter to
9946 ** iterate through coalesced phrase instances in column iCol.
9947 */
9948 static int fts5CInstIterInit(
9949 const Fts5ExtensionApi *pApi,
9950 Fts5Context *pFts,
9951 int iCol,
9952 CInstIter *pIter
9953 ){
9954 int rc;
9955
9956 memset(pIter, 0, sizeof(CInstIter));
9957 pIter->pApi = pApi;
9958 pIter->pFts = pFts;
9959 pIter->iCol = iCol;
9960 rc = pApi->xInstCount(pFts, &pIter->nInst);
9961
9962 if( rc==SQLITE_OK ){
9963 rc = fts5CInstIterNext(pIter);
9964 }
9965
9966 return rc;
9967 }
9968
9969
9970
9971 /*************************************************************************
9972 ** Start of highlight() implementation.
9973 */
9974 typedef struct HighlightContext HighlightContext;
9975 struct HighlightContext {
9976 CInstIter iter; /* Coalesced Instance Iterator */
9977 int iPos; /* Current token offset in zIn[] */
9978 int iRangeStart; /* First token to include */
9979 int iRangeEnd; /* If non-zero, last token to include */
9980 const char *zOpen; /* Opening highlight */
9981 const char *zClose; /* Closing highlight */
9982 const char *zIn; /* Input text */
9983 int nIn; /* Size of input text in bytes */
9984 int iOff; /* Current offset within zIn[] */
9985 char *zOut; /* Output value */
9986 };
9987
9988 /*
9989 ** Append text to the HighlightContext output string - p->zOut. Argument
9990 ** z points to a buffer containing n bytes of text to append. If n is
9991 ** negative, everything up until the first '\0' is appended to the output.
9992 **
9993 ** If *pRc is set to any value other than SQLITE_OK when this function is
9994 ** called, it is a no-op. If an error (i.e. an OOM condition) is encountered,
9995 ** *pRc is set to an error code before returning.
9996 */
9997 static void fts5HighlightAppend(
9998 int *pRc,
9999 HighlightContext *p,
10000 const char *z, int n
10001 ){
10002 if( *pRc==SQLITE_OK ){
10003 if( n<0 ) n = (int)strlen(z);
10004 p->zOut = sqlite3_mprintf("%z%.*s", p->zOut, n, z);
10005 if( p->zOut==0 ) *pRc = SQLITE_NOMEM;
10006 }
10007 }
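fts5HighlightAppend() uses a "sticky error code" convention: every call takes a pointer to the running result code and becomes a no-op once that code is non-zero, so callers can chain appends and check for failure once at the end. A minimal sketch of the same convention follows, assuming plain realloc() in place of sqlite3_mprintf() so that it builds without SQLite; `StrBuf` and `bufAppend` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string: z is NUL-terminated once non-NULL, and n is
** its length in bytes excluding the terminator. */
typedef struct StrBuf { char *z; size_t n; } StrBuf;

/* Append n bytes of z to p. If n is negative, append up to the first '\0'.
** If *pRc is already non-zero this is a no-op; on allocation failure *pRc
** is set to 1 and the existing buffer is left untouched. */
static void bufAppend(int *pRc, StrBuf *p, const char *z, int n){
  char *zNew;
  if( *pRc!=0 ) return;
  if( n<0 ) n = (int)strlen(z);
  zNew = realloc(p->z, p->n + (size_t)n + 1);
  if( zNew==0 ){
    *pRc = 1;
    return;
  }
  memcpy(zNew + p->n, z, (size_t)n);
  p->n += (size_t)n;
  zNew[p->n] = '\0';
  p->z = zNew;
}
```

A caller can issue several appends in a row, for example an opening marker, the token text and a closing marker, and test the result code only once afterwards. Note that this sketch keeps the old buffer on failure, which is a design choice of the sketch rather than a statement about how sqlite3_mprintf() behaves.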
10008
10009 /*
10010 ** Tokenizer callback used by implementation of highlight() function.
10011 */
10012 static int fts5HighlightCb(
10013 void *pContext, /* Pointer to HighlightContext object */
10014 int tflags, /* Mask of FTS5_TOKEN_* flags */
10015 const char *pToken, /* Buffer containing token */
10016 int nToken, /* Size of token in bytes */
10017 int iStartOff, /* Start offset of token */
10018 int iEndOff /* End offset of token */
10019 ){
10020 HighlightContext *p = (HighlightContext*)pContext;
10021 int rc = SQLITE_OK;
10022 int iPos;
10023
10024 if( tflags & FTS5_TOKEN_COLOCATED ) return SQLITE_OK;
10025 iPos = p->iPos++;
10026
10027 if( p->iRangeEnd>0 ){
10028 if( iPos<p->iRangeStart || iPos>p->iRangeEnd ) return SQLITE_OK;
10029 if( p->iRangeStart && iPos==p->iRangeStart ) p->iOff = iStartOff;
10030 }
10031
10032 if( iPos==p->iter.iStart ){
10033 fts5HighlightAppend(&rc, p, &p->zIn[p->iOff], iStartOff - p->iOff);
10034 fts5HighlightAppend(&rc, p, p->zOpen, -1);
10035 p->iOff = iStartOff;
10036 }
10037
10038 if( iPos==p->iter.iEnd ){
10039 if( p->iRangeEnd && p->iter.iStart<p->iRangeStart ){
10040 fts5HighlightAppend(&rc, p, p->zOpen, -1);
10041 }
10042 fts5HighlightAppend(&rc, p, &p->zIn[p->iOff], iEndOff - p->iOff);
10043 fts5HighlightAppend(&rc, p, p->zClose, -1);
10044 p->iOff = iEndOff;
10045 if( rc==SQLITE_OK ){
10046 rc = fts5CInstIterNext(&p->iter);
10047 }
10048 }
10049
10050 if( p->iRangeEnd>0 && iPos==p->iRangeEnd ){
10051 fts5HighlightAppend(&rc, p, &p->zIn[p->iOff], iEndOff - p->iOff);
10052 p->iOff = iEndOff;
10053 if( iPos<p->iter.iEnd ){
10054 fts5HighlightAppend(&rc, p, p->zClose, -1);
10055 }
10056 }
10057
10058 return rc;
10059 }
10060
10061 /*
10062 ** Implementation of highlight() function.
10063 */
10064 static void fts5HighlightFunction(
10065 const Fts5ExtensionApi *pApi, /* API offered by current FTS version */
10066 Fts5Context *pFts, /* First arg to pass to pApi functions */
10067 sqlite3_context *pCtx, /* Context for returning result/error */
10068 int nVal, /* Number of values in apVal[] array */
10069 sqlite3_value **apVal /* Array of trailing arguments */
10070 ){
10071 HighlightContext ctx;
10072 int rc;
10073 int iCol;
10074
10075 if( nVal!=3 ){
10076 const char *zErr = "wrong number of arguments to function highlight()";
10077 sqlite3_result_error(pCtx, zErr, -1);
10078 return;
10079 }
10080
10081 iCol = sqlite3_value_int(apVal[0]);
10082 memset(&ctx, 0, sizeof(HighlightContext));
10083 ctx.zOpen = (const char*)sqlite3_value_text(apVal[1]);
10084 ctx.zClose = (const char*)sqlite3_value_text(apVal[2]);
10085 rc = pApi->xColumnText(pFts, iCol, &ctx.zIn, &ctx.nIn);
10086
10087 if( ctx.zIn ){
10088 if( rc==SQLITE_OK ){
10089 rc = fts5CInstIterInit(pApi, pFts, iCol, &ctx.iter);
10090 }
10091
10092 if( rc==SQLITE_OK ){
10093 rc = pApi->xTokenize(pFts, ctx.zIn, ctx.nIn, (void*)&ctx,fts5HighlightCb);
10094 }
10095 fts5HighlightAppend(&rc, &ctx, &ctx.zIn[ctx.iOff], ctx.nIn - ctx.iOff);
10096
10097 if( rc==SQLITE_OK ){
10098 sqlite3_result_text(pCtx, (const char*)ctx.zOut, -1, SQLITE_TRANSIENT);
10099 }
10100 sqlite3_free(ctx.zOut);
10101 }
10102 if( rc!=SQLITE_OK ){
10103 sqlite3_result_error_code(pCtx, rc);
10104 }
10105 }
10106 /*
10107 ** End of highlight() implementation.
10108 **************************************************************************/
10109
10110 /*
10111 ** Implementation of snippet() function.
10112 */
10113 static void fts5SnippetFunction(
10114 const Fts5ExtensionApi *pApi, /* API offered by current FTS version */
10115 Fts5Context *pFts, /* First arg to pass to pApi functions */
10116 sqlite3_context *pCtx, /* Context for returning result/error */
10117 int nVal, /* Number of values in apVal[] array */
10118 sqlite3_value **apVal /* Array of trailing arguments */
10119 ){
10120 HighlightContext ctx;
10121 int rc = SQLITE_OK; /* Return code */
10122 int iCol; /* 1st argument to snippet() */
10123 const char *zEllips; /* 4th argument to snippet() */
10124 int nToken; /* 5th argument to snippet() */
10125 int nInst = 0; /* Number of instance matches this row */
10126 int i; /* Used to iterate through instances */
10127 int nPhrase; /* Number of phrases in query */
10128 unsigned char *aSeen; /* Array of "seen instance" flags */
10129 int iBestCol; /* Column containing best snippet */
10130 int iBestStart = 0; /* First token of best snippet */
10131 int iBestLast; /* Last token of best snippet */
10132 int nBestScore = 0; /* Score of best snippet */
10133 int nColSize = 0; /* Total size of iBestCol in tokens */
10134
10135 if( nVal!=5 ){
10136 const char *zErr = "wrong number of arguments to function snippet()";
10137 sqlite3_result_error(pCtx, zErr, -1);
10138 return;
10139 }
10140
10141 memset(&ctx, 0, sizeof(HighlightContext));
10142 iCol = sqlite3_value_int(apVal[0]);
10143 ctx.zOpen = (const char*)sqlite3_value_text(apVal[1]);
10144 ctx.zClose = (const char*)sqlite3_value_text(apVal[2]);
10145 zEllips = (const char*)sqlite3_value_text(apVal[3]);
10146 nToken = sqlite3_value_int(apVal[4]);
10147 iBestLast = nToken-1;
10148
10149 iBestCol = (iCol>=0 ? iCol : 0);
10150 nPhrase = pApi->xPhraseCount(pFts);
10151 aSeen = sqlite3_malloc(nPhrase);
10152 if( aSeen==0 ){
10153 rc = SQLITE_NOMEM;
10154 }
10155
10156 if( rc==SQLITE_OK ){
10157 rc = pApi->xInstCount(pFts, &nInst);
10158 }
10159 for(i=0; rc==SQLITE_OK && i<nInst; i++){
10160 int ip, iSnippetCol, iStart;
10161 memset(aSeen, 0, nPhrase);
10162 rc = pApi->xInst(pFts, i, &ip, &iSnippetCol, &iStart);
10163 if( rc==SQLITE_OK && (iCol<0 || iSnippetCol==iCol) ){
10164 int nScore = 1000;
10165 int iLast = iStart - 1 + pApi->xPhraseSize(pFts, ip);
10166 int j;
10167 aSeen[ip] = 1;
10168
10169 for(j=i+1; rc==SQLITE_OK && j<nInst; j++){
10170 int ic; int io; int iFinal;
10171 rc = pApi->xInst(pFts, j, &ip, &ic, &io);
10172 iFinal = io + pApi->xPhraseSize(pFts, ip) - 1;
10173 if( rc==SQLITE_OK && ic==iSnippetCol && iLast<iStart+nToken ){
10174 nScore += aSeen[ip] ? 1000 : 1;
10175 aSeen[ip] = 1;
10176 if( iFinal>iLast ) iLast = iFinal;
10177 }
10178 }
10179
10180 if( rc==SQLITE_OK && nScore>nBestScore ){
10181 iBestCol = iSnippetCol;
10182 iBestStart = iStart;
10183 iBestLast = iLast;
10184 nBestScore = nScore;
10185 }
10186 }
10187 }
10188
10189 if( rc==SQLITE_OK ){
10190 rc = pApi->xColumnSize(pFts, iBestCol, &nColSize);
10191 }
10192 if( rc==SQLITE_OK ){
10193 rc = pApi->xColumnText(pFts, iBestCol, &ctx.zIn, &ctx.nIn);
10194 }
10195 if( ctx.zIn ){
10196 if( rc==SQLITE_OK ){
10197 rc = fts5CInstIterInit(pApi, pFts, iBestCol, &ctx.iter);
10198 }
10199
10200 if( (iBestStart+nToken-1)>iBestLast ){
10201 iBestStart -= (iBestStart+nToken-1-iBestLast) / 2;
10202 }
10203 if( iBestStart+nToken>nColSize ){
10204 iBestStart = nColSize - nToken;
10205 }
10206 if( iBestStart<0 ) iBestStart = 0;
10207
10208 ctx.iRangeStart = iBestStart;
10209 ctx.iRangeEnd = iBestStart + nToken - 1;
10210
10211 if( iBestStart>0 ){
10212 fts5HighlightAppend(&rc, &ctx, zEllips, -1);
10213 }
10214 if( rc==SQLITE_OK ){
10215 rc = pApi->xTokenize(pFts, ctx.zIn, ctx.nIn, (void*)&ctx,fts5HighlightCb);
10216 }
10217 if( ctx.iRangeEnd>=(nColSize-1) ){
10218 fts5HighlightAppend(&rc, &ctx, &ctx.zIn[ctx.iOff], ctx.nIn - ctx.iOff);
10219 }else{
10220 fts5HighlightAppend(&rc, &ctx, zEllips, -1);
10221 }
10222
10223 if( rc==SQLITE_OK ){
10224 sqlite3_result_text(pCtx, (const char*)ctx.zOut, -1, SQLITE_TRANSIENT);
10225 }else{
10226 sqlite3_result_error_code(pCtx, rc);
10227 }
10228 sqlite3_free(ctx.zOut);
10229 }
10230 sqlite3_free(aSeen);
10231 }
10232
10233 /************************************************************************/
10234
10235 /*
10236 ** The first time the bm25() function is called for a query, an instance
10237 ** of the following structure is allocated and populated.
10238 */
10239 typedef struct Fts5Bm25Data Fts5Bm25Data;
10240 struct Fts5Bm25Data {
10241 int nPhrase; /* Number of phrases in query */
10242 double avgdl; /* Average number of tokens in each row */
10243 double *aIDF; /* IDF for each phrase */
10244 double *aFreq; /* Array used to calculate phrase freq. */
10245 };
10246
10247 /*
10248 ** Callback used by fts5Bm25GetData() to count the number of rows in the
10249 ** table matched by each individual phrase within the query.
10250 */
10251 static int fts5CountCb(
10252 const Fts5ExtensionApi *pApi,
10253 Fts5Context *pFts,
10254 void *pUserData /* Pointer to sqlite3_int64 variable */
10255 ){
10256 sqlite3_int64 *pn = (sqlite3_int64*)pUserData;
10257 (*pn)++;
10258 return SQLITE_OK;
10259 }
10260
10261 /*
10262 ** Set *ppData to point to the Fts5Bm25Data object for the current query.
10263 ** If the object has not already been allocated, allocate and populate it
10264 ** now.
10265 */
10266 static int fts5Bm25GetData(
10267 const Fts5ExtensionApi *pApi,
10268 Fts5Context *pFts,
10269 Fts5Bm25Data **ppData /* OUT: bm25-data object for this query */
10270 ){
10271 int rc = SQLITE_OK; /* Return code */
10272 Fts5Bm25Data *p; /* Object to return */
10273
10274 p = pApi->xGetAuxdata(pFts, 0);
10275 if( p==0 ){
10276 int nPhrase; /* Number of phrases in query */
10277 sqlite3_int64 nRow = 0; /* Number of rows in table */
10278 sqlite3_int64 nToken = 0; /* Number of tokens in table */
10279 int nByte; /* Bytes of space to allocate */
10280 int i;
10281
10282 /* Allocate the Fts5Bm25Data object */
10283 nPhrase = pApi->xPhraseCount(pFts);
10284 nByte = sizeof(Fts5Bm25Data) + nPhrase*2*sizeof(double);
10285 p = (Fts5Bm25Data*)sqlite3_malloc(nByte);
10286 if( p==0 ){
10287 rc = SQLITE_NOMEM;
10288 }else{
10289 memset(p, 0, nByte);
10290 p->nPhrase = nPhrase;
10291 p->aIDF = (double*)&p[1];
10292 p->aFreq = &p->aIDF[nPhrase];
10293 }
10294
10295 /* Calculate the average document length for this FTS5 table */
10296 if( rc==SQLITE_OK ) rc = pApi->xRowCount(pFts, &nRow);
10297 if( rc==SQLITE_OK ) rc = pApi->xColumnTotalSize(pFts, -1, &nToken);
10298 if( rc==SQLITE_OK ) p->avgdl = (double)nToken / (double)nRow;
10299
10300 /* Calculate an IDF for each phrase in the query */
10301 for(i=0; rc==SQLITE_OK && i<nPhrase; i++){
10302 sqlite3_int64 nHit = 0;
10303 rc = pApi->xQueryPhrase(pFts, i, (void*)&nHit, fts5CountCb);
10304 if( rc==SQLITE_OK ){
10305 /* Calculate the IDF (Inverse Document Frequency) for phrase i.
10306 ** This is done using the standard BM25 formula as found on Wikipedia:
10307 **
10308 ** IDF = log( (N - nHit + 0.5) / (nHit + 0.5) )
10309 **
10310 ** where "N" is the total number of documents in the set and nHit
10311 ** is the number that contain at least one instance of the phrase
10312 ** under consideration.
10313 **
10314 ** The problem with this is that if (N < 2*nHit), the IDF is
10315 ** negative, which is undesirable. So the minimum allowable IDF is
10316 ** (1e-6) - roughly the same as a term that appears in just over
10317 ** half of a set of 5,000,000 documents. */
10318 double idf = log( (nRow - nHit + 0.5) / (nHit + 0.5) );
10319 if( idf<=0.0 ) idf = 1e-6;
10320 p->aIDF[i] = idf;
10321 }
10322 }
10323
10324 if( rc!=SQLITE_OK ){
10325 sqlite3_free(p);
10326 }else{
10327 rc = pApi->xSetAuxdata(pFts, p, sqlite3_free);
10328 }
10329 if( rc!=SQLITE_OK ) p = 0;
10330 }
10331 *ppData = p;
10332 return rc;
10333 }
10334
10335 /*
10336 ** Implementation of bm25() function.
10337 */
10338 static void fts5Bm25Function(
10339 const Fts5ExtensionApi *pApi, /* API offered by current FTS version */
10340 Fts5Context *pFts, /* First arg to pass to pApi functions */
10341 sqlite3_context *pCtx, /* Context for returning result/error */
10342 int nVal, /* Number of values in apVal[] array */
10343 sqlite3_value **apVal /* Array of trailing arguments */
10344 ){
10345 const double k1 = 1.2; /* Constant "k1" from BM25 formula */
10346 const double b = 0.75; /* Constant "b" from BM25 formula */
10347 int rc = SQLITE_OK; /* Error code */
10348 double score = 0.0; /* SQL function return value */
10349 Fts5Bm25Data *pData; /* Values allocated/calculated once only */
10350 int i; /* Iterator variable */
10351 int nInst = 0; /* Value returned by xInstCount() */
10352 double D = 0.0; /* Total number of tokens in row */
10353 double *aFreq = 0; /* Array of phrase freq. for current row */
10354
10355 /* Calculate the phrase frequency (symbol "f(qi,D)" in the documentation)
10356 ** for each phrase in the query for the current row. */
10357 rc = fts5Bm25GetData(pApi, pFts, &pData);
10358 if( rc==SQLITE_OK ){
10359 aFreq = pData->aFreq;
10360 memset(aFreq, 0, sizeof(double) * pData->nPhrase);
10361 rc = pApi->xInstCount(pFts, &nInst);
10362 }
10363 for(i=0; rc==SQLITE_OK && i<nInst; i++){
10364 int ip; int ic; int io;
10365 rc = pApi->xInst(pFts, i, &ip, &ic, &io);
10366 if( rc==SQLITE_OK ){
10367 double w = (nVal > ic) ? sqlite3_value_double(apVal[ic]) : 1.0;
10368 aFreq[ip] += w;
10369 }
10370 }
10371
10372 /* Figure out the total size of the current row in tokens. */
10373 if( rc==SQLITE_OK ){
10374 int nTok;
10375 rc = pApi->xColumnSize(pFts, -1, &nTok);
10376 D = (double)nTok;
10377 }
10378
10379 /* Determine the BM25 score for the current row. */
10380 for(i=0; rc==SQLITE_OK && i<pData->nPhrase; i++){
10381 score += pData->aIDF[i] * (
10382 ( aFreq[i] * (k1 + 1.0) ) /
10383 ( aFreq[i] + k1 * (1 - b + b * D / pData->avgdl) )
10384 );
10385 }
10386
10387 /* If no error has occurred, return the calculated score. Otherwise,
10388 ** throw an SQL exception. */
10389 if( rc==SQLITE_OK ){
10390 sqlite3_result_double(pCtx, -1.0 * score);
10391 }else{
10392 sqlite3_result_error_code(pCtx, rc);
10393 }
10394 }
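The scoring loop in fts5Bm25Function() can be checked in isolation. The following is a minimal sketch, not the SQLite code: `sketch_clamp_idf` and `sketch_bm25` are hypothetical names. The arrays stand in for `pData->aIDF` and `aFreq`. The constants k1 = 1.2 and b = 0.75, the 1e-6 floor and the final sign flip are copied from the source above.

```c
#include <assert.h>

/* Hypothetical helper: apply the floor described in fts5Bm25GetData(),
** so that a non-positive IDF is replaced by 1e-6. */
static double sketch_clamp_idf(double idf){
  return idf<=0.0 ? 1e-6 : idf;
}

/* Hypothetical helper: BM25 score for one row.
**   aIDF[i]  - clamped IDF of phrase i
**   aFreq[i] - (weighted) frequency of phrase i in the row
**   D        - total number of tokens in the row
**   avgdl    - average number of tokens per row (assumed > 0 here)
** The result is negated, as in the source. */
static double sketch_bm25(
  const double *aIDF,
  const double *aFreq,
  int nPhrase,
  double D,
  double avgdl
){
  const double k1 = 1.2;
  const double b = 0.75;
  double score = 0.0;
  int i;
  assert( avgdl>0.0 );
  for(i=0; i<nPhrase; i++){
    score += aIDF[i] * (
        ( aFreq[i] * (k1 + 1.0) ) /
        ( aFreq[i] + k1 * (1 - b + b * D / avgdl) )
    );
  }
  return -1.0 * score;
}
```

With one phrase, IDF 1.0, frequency 1.0 and a row of exactly average length, the fraction reduces to 1, so the sketch returns a value of about -1.0. The `avgdl>0.0` assertion is an addition of the sketch. The source divides by `nRow` without a comparable check in the lines shown here.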
10395
10396 static int sqlite3Fts5AuxInit(fts5_api *pApi){
10397 struct Builtin {
10398 const char *zFunc; /* Function name (nul-terminated) */
10399 void *pUserData; /* User-data pointer */
10400 fts5_extension_function xFunc;/* Callback function */
10401 void (*xDestroy)(void*); /* Destructor function */
10402 } aBuiltin [] = {
10403 { "snippet", 0, fts5SnippetFunction, 0 },
10404 { "highlight", 0, fts5HighlightFunction, 0 },
10405 { "bm25", 0, fts5Bm25Function, 0 },
10406 };
10407 int rc = SQLITE_OK; /* Return code */
10408 int i; /* To iterate through builtin functions */
10409
10410 for(i=0; rc==SQLITE_OK && i<(int)ArraySize(aBuiltin); i++){
10411 rc = pApi->xCreateFunction(pApi,
10412 aBuiltin[i].zFunc,
10413 aBuiltin[i].pUserData,
10414 aBuiltin[i].xFunc,
10415 aBuiltin[i].xDestroy
10416 );
10417 }
10418
10419 return rc;
10420 }
10421
10422
10423
10424 /*
10425 ** 2014 May 31
10426 **
10427 ** The author disclaims copyright to this source code. In place of
10428 ** a legal notice, here is a blessing:
10429 **
10430 ** May you do good and not evil.
10431 ** May you find forgiveness for yourself and forgive others.
10432 ** May you share freely, never taking more than you give.
10433 **
10434 ******************************************************************************
10435 */
10436
10437
10438
10439 /* #include "fts5Int.h" */
10440
10441 static int sqlite3Fts5BufferSize(int *pRc, Fts5Buffer *pBuf, int nByte){
10442 int nNew = pBuf->nSpace ? pBuf->nSpace*2 : 64;
10443 u8 *pNew;
10444 while( nNew<nByte ){
10445 nNew = nNew * 2;
10446 }
10447 pNew = sqlite3_realloc(pBuf->p, nNew);
10448 if( pNew==0 ){
10449 *pRc = SQLITE_NOMEM;
10450 return 1;
10451 }else{
10452 pBuf->nSpace = nNew;
10453 pBuf->p = pNew;
10454 }
10455 return 0;
10456 }
10457
10458
10459 /*
10460 ** Encode value iVal as an SQLite varint and append it to the buffer object
10461 ** pBuf. If an OOM error occurs, set the error code in *pRc.
10462 */
10463 static void sqlite3Fts5BufferAppendVarint(int *pRc, Fts5Buffer *pBuf, i64 iVal){
10464 if( fts5BufferGrow(pRc, pBuf, 9) ) return;
10465 pBuf->n += sqlite3Fts5PutVarint(&pBuf->p[pBuf->n], iVal);
10466 }
10467
10468 static void sqlite3Fts5Put32(u8 *aBuf, int iVal){
10469 aBuf[0] = (iVal>>24) & 0x00FF;
10470 aBuf[1] = (iVal>>16) & 0x00FF;
10471 aBuf[2] = (iVal>> 8) & 0x00FF;
10472 aBuf[3] = (iVal>> 0) & 0x00FF;
10473 }
10474
10475 static int sqlite3Fts5Get32(const u8 *aBuf){
10476 return (aBuf[0] << 24) + (aBuf[1] << 16) + (aBuf[2] << 8) + aBuf[3];
10477 }
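sqlite3Fts5Put32() and sqlite3Fts5Get32() store a 32-bit value most-significant byte first. The following is a small round-trip sketch of that layout. `sketch_put32` and `sketch_get32` are hypothetical names. Unlike the source, the sketch uses unsigned fixed-width types so that the shift by 24 never touches the sign bit of an `int`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: write v to aBuf[0..3], most significant byte first. */
static void sketch_put32(uint8_t *aBuf, uint32_t v){
  assert( aBuf!=0 );
  aBuf[0] = (uint8_t)((v>>24) & 0xFF);
  aBuf[1] = (uint8_t)((v>>16) & 0xFF);
  aBuf[2] = (uint8_t)((v>> 8) & 0xFF);
  aBuf[3] = (uint8_t)((v>> 0) & 0xFF);
}

/* Hypothetical helper: read back a value written by sketch_put32(). */
static uint32_t sketch_get32(const uint8_t *aBuf){
  assert( aBuf!=0 );
  return ((uint32_t)aBuf[0]<<24)
       | ((uint32_t)aBuf[1]<<16)
       | ((uint32_t)aBuf[2]<< 8)
       |  (uint32_t)aBuf[3];
}
```

Writing 0x01020304 yields the bytes 01 02 03 04, and reading them back returns the same value.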
10478
10479 /*
10480 ** Append buffer nData/pData to buffer pBuf. If an OOM error occurs, set
10481 ** the error code in *pRc. If an error has already occurred when this function
10482 ** is called, it is a no-op.
10483 */
10484 static void sqlite3Fts5BufferAppendBlob(
10485 int *pRc,
10486 Fts5Buffer *pBuf,
10487 int nData,
10488 const u8 *pData
10489 ){
10490 assert( *pRc || nData>=0 );
10491 if( fts5BufferGrow(pRc, pBuf, nData) ) return;
10492 memcpy(&pBuf->p[pBuf->n], pData, nData);
10493 pBuf->n += nData;
10494 }
10495
10496 /*
10497 ** Append the nul-terminated string zStr to the buffer pBuf. This function
10498 ** ensures that the byte following the buffer data is set to 0x00, even
10499 ** though this byte is not included in the pBuf->n count.
10500 */
10501 static void sqlite3Fts5BufferAppendString(
10502 int *pRc,
10503 Fts5Buffer *pBuf,
10504 const char *zStr
10505 ){
10506 int nStr = (int)strlen(zStr);
10507 sqlite3Fts5BufferAppendBlob(pRc, pBuf, nStr+1, (const u8*)zStr);
10508 pBuf->n--;
10509 }
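The buffer helpers above combine two ideas: capacity doubles, starting at 64 bytes, until the request fits, and a string append copies the nul terminator but leaves it out of the length. The following is a minimal sketch of both, under stated assumptions. `SketchBuf`, `sketch_grow` and `sketch_append_string` are hypothetical names, plain `realloc()` stands in for `sqlite3_realloc()`, and the capacity check inside `sketch_grow` is a guess at what the `fts5BufferGrow` macro does, since that macro is not part of this hunk.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for Fts5Buffer. */
typedef struct SketchBuf {
  unsigned char *p;   /* Allocation, or NULL */
  int n;              /* Bytes of valid data */
  int nSpace;         /* Bytes allocated */
} SketchBuf;

/* Ensure there is room for nByte more bytes. Returns 0 on success, 1 on OOM.
** Capacity starts at 64 and doubles until the request fits. */
static int sketch_grow(SketchBuf *pBuf, int nByte){
  assert( nByte>=0 );
  if( pBuf->n + nByte > pBuf->nSpace ){
    int nNew = pBuf->nSpace ? pBuf->nSpace*2 : 64;
    unsigned char *pNew;
    while( nNew < pBuf->n + nByte ) nNew = nNew * 2;
    pNew = (unsigned char*)realloc(pBuf->p, (size_t)nNew);
    if( pNew==0 ) return 1;
    pBuf->p = pNew;
    pBuf->nSpace = nNew;
  }
  return 0;
}

/* Append zStr. The nul terminator is copied into the buffer but is not
** counted in pBuf->n, so the data can always be read as a C string. */
static int sketch_append_string(SketchBuf *pBuf, const char *zStr){
  int nStr = (int)strlen(zStr);
  if( sketch_grow(pBuf, nStr+1) ) return 1;
  memcpy(&pBuf->p[pBuf->n], zStr, (size_t)nStr + 1);
  pBuf->n += nStr;
  return 0;
}
```

The next append overwrites the terminator, so consecutive calls concatenate. The caller releases `pBuf->p` with `free()` in this sketch.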
10510
10511 /*
10512 ** Argument zFmt is a printf() style format string. This function performs
10513 ** the printf() style processing, then appends the results to buffer pBuf.
10514 **
10515 ** Like sqlite3Fts5BufferAppendString(), this function ensures that the byte
10516 ** following the buffer data is set to 0x00, even though this byte is not
10517 ** included in the pBuf->n count.
10518 */
10519 static void sqlite3Fts5BufferAppendPrintf(
10520 int *pRc,
10521 Fts5Buffer *pBuf,
10522 char *zFmt, ...
10523 ){
10524 if( *pRc==SQLITE_OK ){
10525 char *zTmp;
10526 va_list ap;
10527 va_start(ap, zFmt);
10528 zTmp = sqlite3_vmprintf(zFmt, ap);
10529 va_end(ap);
10530
10531 if( zTmp==0 ){
10532 *pRc = SQLITE_NOMEM;
10533 }else{
10534 sqlite3Fts5BufferAppendString(pRc, pBuf, zTmp);
10535 sqlite3_free(zTmp);
10536 }
10537 }
10538 }
10539
10540 static char *sqlite3Fts5Mprintf(int *pRc, const char *zFmt, ...){
10541 char *zRet = 0;
10542 if( *pRc==SQLITE_OK ){
10543 va_list ap;
10544 va_start(ap, zFmt);
10545 zRet = sqlite3_vmprintf(zFmt, ap);
10546 va_end(ap);
10547 if( zRet==0 ){
10548 *pRc = SQLITE_NOMEM;
10549 }
10550 }
10551 return zRet;
10552 }
10553
10554
10555 /*
10556 ** Free any buffer allocated by pBuf. Zero the structure before returning.
10557 */
10558 static void sqlite3Fts5BufferFree(Fts5Buffer *pBuf){
10559 sqlite3_free(pBuf->p);
10560 memset(pBuf, 0, sizeof(Fts5Buffer));
10561 }
10562
10563 /*
10564 ** Zero the contents of the buffer object. But do not free the associated
10565 ** memory allocation.
10566 */
10567 static void sqlite3Fts5BufferZero(Fts5Buffer *pBuf){
10568 pBuf->n = 0;
10569 }
10570
10571 /*
10572 ** Set the buffer to contain nData/pData. If an OOM error occurs, leave
10573 ** an error code in *pRc. If an error has already occurred when this function
10574 ** is called, it is a no-op.
10575 */
10576 static void sqlite3Fts5BufferSet(
10577 int *pRc,
10578 Fts5Buffer *pBuf,
10579 int nData,
10580 const u8 *pData
10581 ){
10582 pBuf->n = 0;
10583 sqlite3Fts5BufferAppendBlob(pRc, pBuf, nData, pData);
10584 }
10585
10586 static int sqlite3Fts5PoslistNext64(
10587 const u8 *a, int n, /* Buffer containing poslist */
10588 int *pi, /* IN/OUT: Offset within a[] */
10589 i64 *piOff /* IN/OUT: Current offset */
10590 ){
10591 int i = *pi;
10592 if( i>=n ){
10593 /* EOF */
10594 *piOff = -1;
10595 return 1;
10596 }else{
10597 i64 iOff = *piOff;
10598 int iVal;
10599 fts5FastGetVarint32(a, i, iVal);
10600 if( iVal==1 ){
10601 fts5FastGetVarint32(a, i, iVal);
10602 iOff = ((i64)iVal) << 32;
10603 fts5FastGetVarint32(a, i, iVal);
10604 }
10605 *piOff = iOff + (iVal-2);
10606 *pi = i;
10607 return 0;
10608 }
10609 }
10610
10611
10612 /*
10613 ** Advance the iterator object passed as the only argument. Return true
10614 ** if the iterator reaches EOF, or false otherwise.
10615 */
10616 static int sqlite3Fts5PoslistReaderNext(Fts5PoslistReader *pIter){
10617 if( sqlite3Fts5PoslistNext64(pIter->a, pIter->n, &pIter->i, &pIter->iPos) ){
10618 pIter->bEof = 1;
10619 }
10620 return pIter->bEof;
10621 }
10622
10623 static int sqlite3Fts5PoslistReaderInit(
10624 const u8 *a, int n, /* Poslist buffer to iterate through */
10625 Fts5PoslistReader *pIter /* Iterator object to initialize */
10626 ){
10627 memset(pIter, 0, sizeof(*pIter));
10628 pIter->a = a;
10629 pIter->n = n;
10630 sqlite3Fts5PoslistReaderNext(pIter);
10631 return pIter->bEof;
10632 }
10633
10634 static int sqlite3Fts5PoslistWriterAppend(
10635 Fts5Buffer *pBuf,
10636 Fts5PoslistWriter *pWriter,
10637 i64 iPos
10638 ){
10639 static const i64 colmask = ((i64)(0x7FFFFFFF)) << 32;
10640 int rc = SQLITE_OK;
10641 if( 0==fts5BufferGrow(&rc, pBuf, 5+5+5) ){
10642 if( (iPos & colmask) != (pWriter->iPrev & colmask) ){
10643 pBuf->p[pBuf->n++] = 1;
10644 pBuf->n += sqlite3Fts5PutVarint(&pBuf->p[pBuf->n], (iPos>>32));
10645 pWriter->iPrev = (iPos & colmask);
10646 }
10647 pBuf->n += sqlite3Fts5PutVarint(&pBuf->p[pBuf->n], (iPos-pWriter->iPrev)+2);
10648 pWriter->iPrev = iPos;
10649 }
10650 return rc;
10651 }
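The writer and reader above agree on a simple position-list layout. Each position is stored as a varint holding the delta from the previous position plus 2. A byte value of 1, followed by the column number, marks a change of column. The following is a self-contained sketch of that round trip. The `sketch_*` names are hypothetical, the caller supplies a large enough byte array, and the varint helpers are a simplified 32-bit stand-in for `sqlite3Fts5PutVarint` and `fts5FastGetVarint32`.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified 32-bit varint: 7 bits per byte, most significant group first,
** high bit set on every byte except the last. Returns bytes written. */
static int sketch_put_varint(unsigned char *p, uint32_t v){
  unsigned char tmp[5];
  int n = 0, i;
  do{ tmp[n++] = (unsigned char)(v & 0x7f); v >>= 7; }while( v );
  for(i=0; i<n; i++){
    p[i] = (unsigned char)(tmp[n-1-i] | (i<n-1 ? 0x80 : 0x00));
  }
  return n;
}

/* Decode one varint from p into *pv. Returns bytes consumed. */
static int sketch_get_varint(const unsigned char *p, uint32_t *pv){
  uint32_t v = 0;
  int i = 0;
  do{ v = (v<<7) | (uint32_t)(p[i] & 0x7f); }while( p[i++] & 0x80 );
  *pv = v;
  return i;
}

/* Append position iPos (column in the upper 32 bits, token offset in the
** lower 32) to a[] at offset n. *piPrev holds the previous position.
** Returns the new length. */
static int sketch_poslist_append(
  unsigned char *a, int n, int64_t *piPrev, int64_t iPos
){
  static const int64_t colmask = ((int64_t)0x7FFFFFFF) << 32;
  assert( iPos>=0 );
  if( (iPos & colmask)!=(*piPrev & colmask) ){
    a[n++] = 1;                                   /* column-change marker */
    n += sketch_put_varint(&a[n], (uint32_t)(iPos>>32));
    *piPrev = (iPos & colmask);
  }
  n += sketch_put_varint(&a[n], (uint32_t)(iPos - *piPrev) + 2);
  *piPrev = iPos;
  return n;
}

/* Read the next position. Returns 1 and sets *piOff to -1 at EOF. */
static int sketch_poslist_next(
  const unsigned char *a, int n, int *pi, int64_t *piOff
){
  uint32_t v;
  int i = *pi;
  int64_t iOff = *piOff;
  if( i>=n ){ *piOff = -1; return 1; }
  i += sketch_get_varint(&a[i], &v);
  if( v==1 ){
    i += sketch_get_varint(&a[i], &v);
    iOff = ((int64_t)v) << 32;
    i += sketch_get_varint(&a[i], &v);
  }
  *piOff = iOff + ((int64_t)v - 2);
  *pi = i;
  return 0;
}
```

Appending offsets 3 and 7 in column 0 and then offset 1 in column 2 produces the bytes 5, 6, 1, 2, 3. Because every delta is stored with 2 added, a stored delta is never 1, which is what lets 1 act as the marker.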
10652
10653 static void *sqlite3Fts5MallocZero(int *pRc, int nByte){
10654 void *pRet = 0;
10655 if( *pRc==SQLITE_OK ){
10656 pRet = sqlite3_malloc(nByte);
10657 if( pRet==0 && nByte>0 ){
10658 *pRc = SQLITE_NOMEM;
10659 }else{
10660 memset(pRet, 0, nByte);
10661 }
10662 }
10663 return pRet;
10664 }
10665
10666 /*
10667 ** Return a nul-terminated copy of the string indicated by pIn. If nIn
10668 ** is non-negative, then it is the length of the string in bytes. Otherwise,
10669 ** the length of the string is determined using strlen().
10670 **
10671 ** It is the responsibility of the caller to eventually free the returned
10672 ** buffer using sqlite3_free(). If an OOM error occurs, NULL is returned.
10673 */
10674 static char *sqlite3Fts5Strndup(int *pRc, const char *pIn, int nIn){
10675 char *zRet = 0;
10676 if( *pRc==SQLITE_OK ){
10677 if( nIn<0 ){
10678 nIn = (int)strlen(pIn);
10679 }
10680 zRet = (char*)sqlite3_malloc(nIn+1);
10681 if( zRet ){
10682 memcpy(zRet, pIn, nIn);
10683 zRet[nIn] = '\0';
10684 }else{
10685 *pRc = SQLITE_NOMEM;
10686 }
10687 }
10688 return zRet;
10689 }
10690
10691
10692 /*
10693 ** Return true if character 't' may be part of an FTS5 bareword, or false
10694 ** otherwise. Characters that may be part of barewords:
10695 **
10696 ** * All non-ASCII characters,
10697 ** * The 52 upper and lower case ASCII characters,
10698 ** * The 10 ASCII digit characters,
10699 ** * The underscore character "_" (0x5F), and
10700 ** * The unicode "substitute" character (0x1A).
10701 */
10702 static int sqlite3Fts5IsBareword(char t){
10703 u8 aBareword[128] = {
10704 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 0x00 .. 0x0F */
10705 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, /* 0x10 .. 0x1F */
10706 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 0x20 .. 0x2F */
10707 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, /* 0x30 .. 0x3F */
10708 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 0x40 .. 0x4F */
10709 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, /* 0x50 .. 0x5F */
10710 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 0x60 .. 0x6F */
10711 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0 /* 0x70 .. 0x7F */
10712 };
10713
10714 return (t & 0x80) || aBareword[(int)t];
10715 }
10716
10717
10718
10719 /*
10720 ** 2014 Jun 09
10721 **
10722 ** The author disclaims copyright to this source code. In place of
10723 ** a legal notice, here is a blessing:
10724 **
10725 ** May you do good and not evil.
10726 ** May you find forgiveness for yourself and forgive others.
10727 ** May you share freely, never taking more than you give.
10728 **
10729 ******************************************************************************
10730 **
10731 ** This is an SQLite module implementing full-text search.
10732 */
10733
10734
10735
10736 /* #include "fts5Int.h" */
10737
10738 #define FTS5_DEFAULT_PAGE_SIZE 4050
10739 #define FTS5_DEFAULT_AUTOMERGE 4
10740 #define FTS5_DEFAULT_CRISISMERGE 16
10741 #define FTS5_DEFAULT_HASHSIZE (1024*1024)
10742
10743 /* Maximum allowed page size */
10744 #define FTS5_MAX_PAGE_SIZE (128*1024)
10745
10746 static int fts5_iswhitespace(char x){
10747 return (x==' ');
10748 }
10749
10750 static int fts5_isopenquote(char x){
10751 return (x=='"' || x=='\'' || x=='[' || x=='`');
10752 }
10753
10754 /*
10755 ** Argument pIn points to a character that is part of a nul-terminated
10756 ** string. Return a pointer to the first character following *pIn in
10757 ** the string that is not a white-space character.
10758 */
10759 static const char *fts5ConfigSkipWhitespace(const char *pIn){
10760 const char *p = pIn;
10761 if( p ){
10762 while( fts5_iswhitespace(*p) ){ p++; }
10763 }
10764 return p;
10765 }
10766
10767 /*
10768 ** Argument pIn points to a character that is part of a nul-terminated
10769 ** string. Return a pointer to the first character following *pIn in
10770 ** the string that is not a "bareword" character.
10771 */
10772 static const char *fts5ConfigSkipBareword(const char *pIn){
10773 const char *p = pIn;
10774 while ( sqlite3Fts5IsBareword(*p) ) p++;
10775 if( p==pIn ) p = 0;
10776 return p;
10777 }
10778
10779 static int fts5_isdigit(char a){
10780 return (a>='0' && a<='9');
10781 }
10782
10783
10784
10785 static const char *fts5ConfigSkipLiteral(const char *pIn){
10786 const char *p = pIn;
10787 switch( *p ){
10788 case 'n': case 'N':
10789 if( sqlite3_strnicmp("null", p, 4)==0 ){
10790 p = &p[4];
10791 }else{
10792 p = 0;
10793 }
10794 break;
10795
10796 case 'x': case 'X':
10797 p++;
10798 if( *p=='\'' ){
10799 p++;
10800 while( (*p>='a' && *p<='f')
10801 || (*p>='A' && *p<='F')
10802 || (*p>='0' && *p<='9')
10803 ){
10804 p++;
10805 }
10806 if( *p=='\'' && 0==((p-pIn)%2) ){
10807 p++;
10808 }else{
10809 p = 0;
10810 }
10811 }else{
10812 p = 0;
10813 }
10814 break;
10815
10816 case '\'':
10817 p++;
10818 while( p ){
10819 if( *p=='\'' ){
10820 p++;
10821 if( *p!='\'' ) break;
10822 }
10823 p++;
10824 if( *p==0 ) p = 0;
10825 }
10826 break;
10827
10828 default:
10829 /* maybe a number */
10830 if( *p=='+' || *p=='-' ) p++;
10831 while( fts5_isdigit(*p) ) p++;
10832
10833 /* At this point, if the literal was an integer, the parse is
10834 ** finished. Or, if it is a floating point value, it may continue
10835 ** with either a decimal point or an 'E' character. */
10836 if( *p=='.' && fts5_isdigit(p[1]) ){
10837 p += 2;
10838 while( fts5_isdigit(*p) ) p++;
10839 }
10840 if( p==pIn ) p = 0;
10841
10842 break;
10843 }
10844
10845 return p;
10846 }
10847
10848 /*
10849 ** The first character of the string pointed to by argument z is guaranteed
10850 ** to be an open-quote character (see function fts5_isopenquote()).
10851 **
10852 ** This function searches for the corresponding close-quote character within
10853 ** the string and, if found, dequotes the string in place and adds a new
10854 ** nul-terminator byte.
10855 **
10856 ** If the close-quote is found, the value returned is the byte offset of
10857 ** the character immediately following it. Or, if the close-quote is not
10858 ** found, -1 is returned. If -1 is returned, the buffer is left in an
10859 ** undefined state.
10860 */
10861 static int fts5Dequote(char *z){
10862 char q;
10863 int iIn = 1;
10864 int iOut = 0;
10865 q = z[0];
10866
10867 /* Set stack variable q to the close-quote character */
10868 assert( q=='[' || q=='\'' || q=='"' || q=='`' );
10869 if( q=='[' ) q = ']';
10870
10871 while( ALWAYS(z[iIn]) ){
10872 if( z[iIn]==q ){
10873 if( z[iIn+1]!=q ){
10874 /* Character iIn was the close quote. */
10875 iIn++;
10876 break;
10877 }else{
10878 /* Character iIn and iIn+1 form an escaped quote character. Skip
10879 ** the input cursor past both and copy a single quote character
10880 ** to the output buffer. */
10881 iIn += 2;
10882 z[iOut++] = q;
10883 }
10884 }else{
10885 z[iOut++] = z[iIn++];
10886 }
10887 }
10888
10889 z[iOut] = '\0';
10890 return iIn;
10891 }
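The in-place dequoting in fts5Dequote() works because the output cursor never overtakes the input cursor. The following sketch follows the same loop on an ordinary `char` array. `sketch_dequote` is a hypothetical name, and the `ALWAYS()` wrapper from the source is replaced by a plain loop condition.

```c
#include <assert.h>

/* Hypothetical helper: z[0] must be an open-quote character. Dequote the
** string in place, turning a doubled close-quote into a single one.
** Returns the offset of the byte following the close-quote, or of the
** terminator if no close-quote was found. */
static int sketch_dequote(char *z){
  char q = z[0];
  int iIn = 1;
  int iOut = 0;
  assert( q=='[' || q=='\'' || q=='"' || q=='`' );
  if( q=='[' ) q = ']';
  while( z[iIn] ){
    if( z[iIn]==q ){
      if( z[iIn+1]!=q ){
        iIn++;                 /* z[iIn] was the close quote */
        break;
      }
      iIn += 2;                /* escaped quote: emit one copy */
      z[iOut++] = q;
    }else{
      z[iOut++] = z[iIn++];
    }
  }
  z[iOut] = '\0';
  return iIn;
}
```

Given the buffer `'it''s' tail`, the sketch leaves `it's` at the start of the buffer and returns 7, the offset of the space after the closing quote.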
10892
10893 /*
10894 ** Convert an SQL-style quoted string into a normal string by removing
10895 ** the quote characters. The conversion is done in-place. If the
10896 ** input does not begin with a quote character, then this routine
10897 ** is a no-op.
10898 **
10899 ** Examples:
10900 **
10901 ** "abc" becomes abc
10902 ** 'xyz' becomes xyz
10903 ** [pqr] becomes pqr
10904 ** `mno` becomes mno
10905 */
10906 static void sqlite3Fts5Dequote(char *z){
10907 char quote; /* Quote character (if any) */
10908
10909 assert( 0==fts5_iswhitespace(z[0]) );
10910 quote = z[0];
10911 if( quote=='[' || quote=='\'' || quote=='"' || quote=='`' ){
10912 fts5Dequote(z);
10913 }
10914 }
10915
10916 /*
10917 ** Parse a "special" CREATE VIRTUAL TABLE directive and update
10918 ** configuration object pConfig as appropriate.
10919 **
10920 ** If successful, object pConfig is updated and SQLITE_OK returned. If
10921 ** an error occurs, an SQLite error code is returned and an error message
10922 ** may be left in *pzErr. It is the responsibility of the caller to
10923 ** eventually free any such error message using sqlite3_free().
10924 */
10925 static int fts5ConfigParseSpecial(
10926 Fts5Global *pGlobal,
10927 Fts5Config *pConfig, /* Configuration object to update */
10928 const char *zCmd, /* Special command to parse */
10929 const char *zArg, /* Argument to parse */
10930 char **pzErr /* OUT: Error message */
10931 ){
10932 int rc = SQLITE_OK;
10933 int nCmd = (int)strlen(zCmd);
10934 if( sqlite3_strnicmp("prefix", zCmd, nCmd)==0 ){
10935 const int nByte = sizeof(int) * FTS5_MAX_PREFIX_INDEXES;
10936 const char *p;
10937 int bFirst = 1;
10938 if( pConfig->aPrefix==0 ){
10939 pConfig->aPrefix = sqlite3Fts5MallocZero(&rc, nByte);
10940 if( rc ) return rc;
10941 }
10942
10943 p = zArg;
10944 while( 1 ){
10945 int nPre = 0;
10946
10947 while( p[0]==' ' ) p++;
10948 if( bFirst==0 && p[0]==',' ){
10949 p++;
10950 while( p[0]==' ' ) p++;
10951 }else if( p[0]=='\0' ){
10952 break;
10953 }
10954 if( p[0]<'0' || p[0]>'9' ){
10955 *pzErr = sqlite3_mprintf("malformed prefix=... directive");
10956 rc = SQLITE_ERROR;
10957 break;
10958 }
10959
10960 if( pConfig->nPrefix==FTS5_MAX_PREFIX_INDEXES ){
10961 *pzErr = sqlite3_mprintf(
10962 "too many prefix indexes (max %d)", FTS5_MAX_PREFIX_INDEXES
10963 );
10964 rc = SQLITE_ERROR;
10965 break;
10966 }
10967
10968 while( p[0]>='0' && p[0]<='9' && nPre<1000 ){
10969 nPre = nPre*10 + (p[0] - '0');
10970 p++;
10971 }
10972
10973 if( rc==SQLITE_OK && (nPre<=0 || nPre>=1000) ){
10974 *pzErr = sqlite3_mprintf("prefix length out of range (max 999)");
10975 rc = SQLITE_ERROR;
10976 break;
10977 }
10978
10979 pConfig->aPrefix[pConfig->nPrefix] = nPre;
10980 pConfig->nPrefix++;
10981 bFirst = 0;
10982 }
10983 assert( pConfig->nPrefix<=FTS5_MAX_PREFIX_INDEXES );
10984 return rc;
10985 }
10986
10987 if( sqlite3_strnicmp("tokenize", zCmd, nCmd)==0 ){
10988 const char *p = (const char*)zArg;
10989 int nArg = (int)strlen(zArg) + 1;
10990 char **azArg = sqlite3Fts5MallocZero(&rc, sizeof(char*) * nArg);
10991 char *pDel = sqlite3Fts5MallocZero(&rc, nArg * 2);
10992 char *pSpace = pDel;
10993
10994 if( azArg && pSpace ){
10995 if( pConfig->pTok ){
10996 *pzErr = sqlite3_mprintf("multiple tokenize=... directives");
10997 rc = SQLITE_ERROR;
10998 }else{
10999 for(nArg=0; p && *p; nArg++){
11000 const char *p2 = fts5ConfigSkipWhitespace(p);
11001 if( *p2=='\'' ){
11002 p = fts5ConfigSkipLiteral(p2);
11003 }else{
11004 p = fts5ConfigSkipBareword(p2);
11005 }
11006 if( p ){
11007 memcpy(pSpace, p2, p-p2);
11008 azArg[nArg] = pSpace;
11009 sqlite3Fts5Dequote(pSpace);
11010 pSpace += (p - p2) + 1;
11011 p = fts5ConfigSkipWhitespace(p);
11012 }
11013 }
11014 if( p==0 ){
11015 *pzErr = sqlite3_mprintf("parse error in tokenize directive");
11016 rc = SQLITE_ERROR;
11017 }else{
11018 rc = sqlite3Fts5GetTokenizer(pGlobal,
11019 (const char**)azArg, nArg, &pConfig->pTok, &pConfig->pTokApi,
11020 pzErr
11021 );
11022 }
11023 }
11024 }
11025
11026 sqlite3_free(azArg);
11027 sqlite3_free(pDel);
11028 return rc;
11029 }
11030
11031 if( sqlite3_strnicmp("content", zCmd, nCmd)==0 ){
11032 if( pConfig->eContent!=FTS5_CONTENT_NORMAL ){
11033 *pzErr = sqlite3_mprintf("multiple content=... directives");
11034 rc = SQLITE_ERROR;
11035 }else{
11036 if( zArg[0] ){
11037 pConfig->eContent = FTS5_CONTENT_EXTERNAL;
11038 pConfig->zContent = sqlite3Fts5Mprintf(&rc, "%Q.%Q", pConfig->zDb,zArg);
11039 }else{
11040 pConfig->eContent = FTS5_CONTENT_NONE;
11041 }
11042 }
11043 return rc;
11044 }
11045
11046 if( sqlite3_strnicmp("content_rowid", zCmd, nCmd)==0 ){
11047 if( pConfig->zContentRowid ){
11048 *pzErr = sqlite3_mprintf("multiple content_rowid=... directives");
11049 rc = SQLITE_ERROR;
11050 }else{
11051 pConfig->zContentRowid = sqlite3Fts5Strndup(&rc, zArg, -1);
11052 }
11053 return rc;
11054 }
11055
11056 if( sqlite3_strnicmp("columnsize", zCmd, nCmd)==0 ){
11057 if( (zArg[0]!='0' && zArg[0]!='1') || zArg[1]!='\0' ){
11058 *pzErr = sqlite3_mprintf("malformed columnsize=... directive");
11059 rc = SQLITE_ERROR;
11060 }else{
11061 pConfig->bColumnsize = (zArg[0]=='1');
11062 }
11063 return rc;
11064 }
11065
11066 *pzErr = sqlite3_mprintf("unrecognized option: \"%.*s\"", nCmd, zCmd);
11067 return SQLITE_ERROR;
11068 }
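The `prefix=` branch of fts5ConfigParseSpecial() accepts a list of integers from 1 to 999, separated by spaces, commas or both. The following sketch isolates that loop so the accepted grammar is easier to see. `sketch_parse_prefix` is a hypothetical name, every error is collapsed into a -1 return instead of an error message, and `SKETCH_MAX_PREFIX` is an assumed stand-in for `FTS5_MAX_PREFIX_INDEXES`, whose value does not appear in this hunk.

```c
#include <assert.h>

#define SKETCH_MAX_PREFIX 31   /* Assumption: stands in for FTS5_MAX_PREFIX_INDEXES */

/* Hypothetical helper: parse zArg into aPrefix[], which must have room for
** SKETCH_MAX_PREFIX entries. Returns the number of prefix lengths parsed,
** or -1 if the directive is malformed, too long or out of range. */
static int sketch_parse_prefix(const char *zArg, int *aPrefix){
  const char *p = zArg;
  int nPrefix = 0;
  int bFirst = 1;
  assert( zArg!=0 && aPrefix!=0 );
  while( 1 ){
    int nPre = 0;
    while( p[0]==' ' ) p++;
    if( bFirst==0 && p[0]==',' ){
      p++;
      while( p[0]==' ' ) p++;
    }else if( p[0]=='\0' ){
      break;
    }
    if( p[0]<'0' || p[0]>'9' ) return -1;        /* malformed directive */
    if( nPrefix==SKETCH_MAX_PREFIX ) return -1;  /* too many indexes */
    while( p[0]>='0' && p[0]<='9' && nPre<1000 ){
      nPre = nPre*10 + (p[0] - '0');
      p++;
    }
    if( nPre<=0 || nPre>=1000 ) return -1;       /* out of range */
    aPrefix[nPrefix++] = nPre;
    bFirst = 0;
  }
  return nPrefix;
}
```

`"2, 3 4"` parses to three lengths. `"2,"` is rejected because a comma must be followed by another number, and `"0"` and `"1000"` fall outside the accepted range.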
11069
11070 /*
11071 ** Allocate an instance of the default tokenizer ("unicode61") at
11072 ** Fts5Config.pTok. Return SQLITE_OK if successful, or an SQLite error
11073 ** code if an error occurs.
11074 */
11075 static int fts5ConfigDefaultTokenizer(Fts5Global *pGlobal, Fts5Config *pConfig){
11076 assert( pConfig->pTok==0 && pConfig->pTokApi==0 );
11077 return sqlite3Fts5GetTokenizer(
11078 pGlobal, 0, 0, &pConfig->pTok, &pConfig->pTokApi, 0
11079 );
11080 }
11081
11082 /*
11083 ** Gobble up the first bareword or quoted word from the input buffer zIn.
11084 ** Return a pointer to the character immediately following the last in
11085 ** the gobbled word if successful, or a NULL pointer otherwise (failed
11086 ** to find close-quote character).
11087 **
11088 ** Before returning, set *pzOut to point to a new buffer containing a
11089 ** nul-terminated, dequoted copy of the gobbled word. If the word was
11090 ** quoted, *pbQuoted is also set to 1 before returning.
11091 **
11092 ** If *pRc is other than SQLITE_OK when this function is called, it is
11093 ** a no-op (NULL is returned). Otherwise, if an OOM occurs within this
11094 ** function, *pRc is set to SQLITE_NOMEM before returning. *pRc is *not*
11095 ** set if a parse error (failed to find close quote) occurs.
11096 */
11097 static const char *fts5ConfigGobbleWord(
11098 int *pRc, /* IN/OUT: Error code */
11099 const char *zIn, /* Buffer to gobble string/bareword from */
11100 char **pzOut, /* OUT: malloc'd buffer containing str/bw */
11101 int *pbQuoted /* OUT: Set to true if dequoting required */
11102 ){
11103 const char *zRet = 0;
11104
11105 int nIn = (int)strlen(zIn);
11106 char *zOut = sqlite3_malloc(nIn+1);
11107
11108 assert( *pRc==SQLITE_OK );
11109 *pbQuoted = 0;
11110 *pzOut = 0;
11111
11112 if( zOut==0 ){
11113 *pRc = SQLITE_NOMEM;
11114 }else{
11115 memcpy(zOut, zIn, nIn+1);
11116 if( fts5_isopenquote(zOut[0]) ){
11117 int ii = fts5Dequote(zOut);
11118 zRet = &zIn[ii];
11119 *pbQuoted = 1;
11120 }else{
11121 zRet = fts5ConfigSkipBareword(zIn);
11122 zOut[zRet-zIn] = '\0';
11123 }
11124 }
11125
11126 if( zRet==0 ){
11127 sqlite3_free(zOut);
11128 }else{
11129 *pzOut = zOut;
11130 }
11131
11132 return zRet;
11133 }
11134
11135 static int fts5ConfigParseColumn(
11136 Fts5Config *p,
11137 char *zCol,
11138 char *zArg,
11139 char **pzErr
11140 ){
11141 int rc = SQLITE_OK;
11142 if( 0==sqlite3_stricmp(zCol, FTS5_RANK_NAME)
11143 || 0==sqlite3_stricmp(zCol, FTS5_ROWID_NAME)
11144 ){
11145 *pzErr = sqlite3_mprintf("reserved fts5 column name: %s", zCol);
11146 rc = SQLITE_ERROR;
11147 }else if( zArg ){
11148 if( 0==sqlite3_stricmp(zArg, "unindexed") ){
11149 p->abUnindexed[p->nCol] = 1;
11150 }else{
11151 *pzErr = sqlite3_mprintf("unrecognized column option: %s", zArg);
11152 rc = SQLITE_ERROR;
11153 }
11154 }
11155
11156 p->azCol[p->nCol++] = zCol;
11157 return rc;
11158 }
11159
11160 /*
11161 ** Populate the Fts5Config.zContentExprlist string.
11162 */
11163 static int fts5ConfigMakeExprlist(Fts5Config *p){
11164 int i;
11165 int rc = SQLITE_OK;
11166 Fts5Buffer buf = {0, 0, 0};
11167
11168 sqlite3Fts5BufferAppendPrintf(&rc, &buf, "T.%Q", p->zContentRowid);
11169 if( p->eContent!=FTS5_CONTENT_NONE ){
11170 for(i=0; i<p->nCol; i++){
11171 if( p->eContent==FTS5_CONTENT_EXTERNAL ){
11172 sqlite3Fts5BufferAppendPrintf(&rc, &buf, ", T.%Q", p->azCol[i]);
11173 }else{
11174 sqlite3Fts5BufferAppendPrintf(&rc, &buf, ", T.c%d", i);
11175 }
11176 }
11177 }
11178
11179 assert( p->zContentExprlist==0 );
11180 p->zContentExprlist = (char*)buf.p;
11181 return rc;
11182 }
11183
11184 /*
11185 ** Arguments nArg/azArg contain the string arguments passed to the xCreate
11186 ** or xConnect method of the virtual table. This function attempts to
11187 ** allocate an instance of Fts5Config containing the results of parsing
11188 ** those arguments.
11189 **
11190 ** If successful, SQLITE_OK is returned and *ppOut is set to point to the
11191 ** new Fts5Config object. If an error occurs, an SQLite error code is
11192 ** returned, *ppOut is set to NULL and an error message may be left in
11193 ** *pzErr. It is the responsibility of the caller to eventually free any
11194 ** such error message using sqlite3_free().
11195 */
11196 static int sqlite3Fts5ConfigParse(
11197 Fts5Global *pGlobal,
11198 sqlite3 *db,
11199 int nArg, /* Number of arguments */
11200 const char **azArg, /* Array of nArg CREATE VIRTUAL TABLE args */
11201 Fts5Config **ppOut, /* OUT: Results of parse */
11202 char **pzErr /* OUT: Error message */
11203 ){
11204 int rc = SQLITE_OK; /* Return code */
11205 Fts5Config *pRet; /* New object to return */
11206 int i;
11207 int nByte;
11208
11209 *ppOut = pRet = (Fts5Config*)sqlite3_malloc(sizeof(Fts5Config));
11210 if( pRet==0 ) return SQLITE_NOMEM;
11211 memset(pRet, 0, sizeof(Fts5Config));
11212 pRet->db = db;
11213 pRet->iCookie = -1;
11214
11215 nByte = nArg * (sizeof(char*) + sizeof(u8));
11216 pRet->azCol = (char**)sqlite3Fts5MallocZero(&rc, nByte);
11217 pRet->abUnindexed = (u8*)&pRet->azCol[nArg];
11218 pRet->zDb = sqlite3Fts5Strndup(&rc, azArg[1], -1);
11219 pRet->zName = sqlite3Fts5Strndup(&rc, azArg[2], -1);
11220 pRet->bColumnsize = 1;
11221 #ifdef SQLITE_DEBUG
11222 pRet->bPrefixIndex = 1;
11223 #endif
11224 if( rc==SQLITE_OK && sqlite3_stricmp(pRet->zName, FTS5_RANK_NAME)==0 ){
11225 *pzErr = sqlite3_mprintf("reserved fts5 table name: %s", pRet->zName);
11226 rc = SQLITE_ERROR;
11227 }
11228
11229 for(i=3; rc==SQLITE_OK && i<nArg; i++){
11230 const char *zOrig = azArg[i];
11231 const char *z;
11232 char *zOne = 0;
11233 char *zTwo = 0;
11234 int bOption = 0;
11235 int bMustBeCol = 0;
11236
11237 z = fts5ConfigGobbleWord(&rc, zOrig, &zOne, &bMustBeCol);
11238 z = fts5ConfigSkipWhitespace(z);
11239 if( z && *z=='=' ){
11240 bOption = 1;
11241 z++;
11242 if( bMustBeCol ) z = 0;
11243 }
11244 z = fts5ConfigSkipWhitespace(z);
11245 if( z && z[0] ){
11246 int bDummy;
11247 z = fts5ConfigGobbleWord(&rc, z, &zTwo, &bDummy);
11248 if( z && z[0] ) z = 0;
11249 }
11250
11251 if( rc==SQLITE_OK ){
11252 if( z==0 ){
11253 *pzErr = sqlite3_mprintf("parse error in \"%s\"", zOrig);
11254 rc = SQLITE_ERROR;
11255 }else{
11256 if( bOption ){
11257 rc = fts5ConfigParseSpecial(pGlobal, pRet, zOne, zTwo?zTwo:"", pzErr);
11258 }else{
11259 rc = fts5ConfigParseColumn(pRet, zOne, zTwo, pzErr);
11260 zOne = 0;
11261 }
11262 }
11263 }
11264
11265 sqlite3_free(zOne);
11266 sqlite3_free(zTwo);
11267 }
11268
11269 /* If a tokenizer= option was successfully parsed, the tokenizer has
11270 ** already been allocated. Otherwise, allocate an instance of the default
11271 ** tokenizer (unicode61) now. */
11272 if( rc==SQLITE_OK && pRet->pTok==0 ){
11273 rc = fts5ConfigDefaultTokenizer(pGlobal, pRet);
11274 }
11275
11276 /* If no zContent option was specified, fill in the default values. */
11277 if( rc==SQLITE_OK && pRet->zContent==0 ){
11278 const char *zTail = 0;
11279 assert( pRet->eContent==FTS5_CONTENT_NORMAL
11280 || pRet->eContent==FTS5_CONTENT_NONE
11281 );
11282 if( pRet->eContent==FTS5_CONTENT_NORMAL ){
11283 zTail = "content";
11284 }else if( pRet->bColumnsize ){
11285 zTail = "docsize";
11286 }
11287
11288 if( zTail ){
11289 pRet->zContent = sqlite3Fts5Mprintf(
11290 &rc, "%Q.'%q_%s'", pRet->zDb, pRet->zName, zTail
11291 );
11292 }
11293 }
11294
11295 if( rc==SQLITE_OK && pRet->zContentRowid==0 ){
11296 pRet->zContentRowid = sqlite3Fts5Strndup(&rc, "rowid", -1);
11297 }
11298
11299 /* Formulate the zContentExprlist text */
11300 if( rc==SQLITE_OK ){
11301 rc = fts5ConfigMakeExprlist(pRet);
11302 }
11303
11304 if( rc!=SQLITE_OK ){
11305 sqlite3Fts5ConfigFree(pRet);
11306 *ppOut = 0;
11307 }
11308 return rc;
11309 }
11310
11311 /*
11312 ** Free the configuration object passed as the only argument.
11313 */
11314 static void sqlite3Fts5ConfigFree(Fts5Config *pConfig){
11315 if( pConfig ){
11316 int i;
11317 if( pConfig->pTok ){
11318 pConfig->pTokApi->xDelete(pConfig->pTok);
11319 }
11320 sqlite3_free(pConfig->zDb);
11321 sqlite3_free(pConfig->zName);
11322 for(i=0; i<pConfig->nCol; i++){
11323 sqlite3_free(pConfig->azCol[i]);
11324 }
11325 sqlite3_free(pConfig->azCol);
11326 sqlite3_free(pConfig->aPrefix);
11327 sqlite3_free(pConfig->zRank);
11328 sqlite3_free(pConfig->zRankArgs);
11329 sqlite3_free(pConfig->zContent);
11330 sqlite3_free(pConfig->zContentRowid);
11331 sqlite3_free(pConfig->zContentExprlist);
11332 sqlite3_free(pConfig);
11333 }
11334 }
11335
11336 /*
11337 ** Call sqlite3_declare_vtab() based on the contents of the configuration
11338 ** object passed as the only argument. Return SQLITE_OK if successful, or
11339 ** an SQLite error code if an error occurs.
11340 */
11341 static int sqlite3Fts5ConfigDeclareVtab(Fts5Config *pConfig){
11342 int i;
11343 int rc = SQLITE_OK;
11344 char *zSql;
11345
11346 zSql = sqlite3Fts5Mprintf(&rc, "CREATE TABLE x(");
11347 for(i=0; zSql && i<pConfig->nCol; i++){
11348 const char *zSep = (i==0?"":", ");
11349 zSql = sqlite3Fts5Mprintf(&rc, "%z%s%Q", zSql, zSep, pConfig->azCol[i]);
11350 }
11351 zSql = sqlite3Fts5Mprintf(&rc, "%z, %Q HIDDEN, %s HIDDEN)",
11352 zSql, pConfig->zName, FTS5_RANK_NAME
11353 );
11354
11355 assert( zSql || rc==SQLITE_NOMEM );
11356 if( zSql ){
11357 rc = sqlite3_declare_vtab(pConfig->db, zSql);
11358 sqlite3_free(zSql);
11359 }
11360
11361 return rc;
11362 }
11363
11364 /*
11365 ** Tokenize the text passed via the second and third arguments.
11366 **
11367 ** The callback is invoked once for each token in the input text. The
11368 ** arguments passed to it are, in order:
11369 **
11370 ** void *pCtx // Copy of 5th argument to sqlite3Fts5Tokenize()
11371 ** int tflags // Mask of FTS5_TOKEN_* flags
11372 ** const char *pToken // Pointer to buffer containing token
11373 ** int nToken // Size of token in bytes
11374 ** int iStart // Byte offset of start of token within input text
11375 ** int iEnd // Byte offset of end of token within input text
11376 **
11377 ** If the callback returns a non-zero value the tokenization is abandoned
11378 ** and no further callbacks are issued.
11379 **
11380 ** This function returns SQLITE_OK if successful or an SQLite error code
11381 ** if an error occurs. If the tokenization was abandoned early because
11382 ** the callback returned SQLITE_DONE, this is not an error and this function
11383 ** still returns SQLITE_OK. Or, if the tokenization was abandoned early
11384 ** because the callback returned another non-zero value, it is assumed
11385 ** to be an SQLite error code and returned to the caller.
11386 */
11387 static int sqlite3Fts5Tokenize(
11388 Fts5Config *pConfig, /* FTS5 Configuration object */
11389 int flags, /* FTS5_TOKENIZE_* flags */
11390 const char *pText, int nText, /* Text to tokenize */
11391 void *pCtx, /* Context passed to xToken() */
11392 int (*xToken)(void*, int, const char*, int, int, int) /* Callback */
11393 ){
11394 if( pText==0 ) return SQLITE_OK;
11395 return pConfig->pTokApi->xTokenize(
11396 pConfig->pTok, pCtx, flags, pText, nText, xToken
11397 );
11398 }
11399
11400 /*
11401 ** Argument pIn points to the first character in what is expected to be
11402 ** a comma-separated list of SQL literals followed by a ')' character.
11403 ** If it actually is this, return a pointer to the ')'. Otherwise, return
11404 ** NULL to indicate a parse error.
11405 */
11406 static const char *fts5ConfigSkipArgs(const char *pIn){
11407 const char *p = pIn;
11408
11409 while( 1 ){
11410 p = fts5ConfigSkipWhitespace(p);
11411 p = fts5ConfigSkipLiteral(p);
11412 p = fts5ConfigSkipWhitespace(p);
11413 if( p==0 || *p==')' ) break;
11414 if( *p!=',' ){
11415 p = 0;
11416 break;
11417 }
11418 p++;
11419 }
11420
11421 return p;
11422 }
11423
11424 /*
11425 ** Parameter zIn contains a rank() function specification. The format of
11426 ** this is:
11427 **
11428 ** + Bareword (function name)
11429 ** + Open parenthesis - "("
11430 ** + Zero or more SQL literals in a comma separated list
11431 ** + Close parenthesis - ")"
11432 */
11433 static int sqlite3Fts5ConfigParseRank(
11434 const char *zIn, /* Input string */
11435 char **pzRank, /* OUT: Rank function name */
11436 char **pzRankArgs /* OUT: Rank function arguments */
11437 ){
11438 const char *p = zIn;
11439 const char *pRank;
11440 char *zRank = 0;
11441 char *zRankArgs = 0;
11442 int rc = SQLITE_OK;
11443
11444 *pzRank = 0;
11445 *pzRankArgs = 0;
11446
11447 if( p==0 ){
11448 rc = SQLITE_ERROR;
11449 }else{
11450 p = fts5ConfigSkipWhitespace(p);
11451 pRank = p;
11452 p = fts5ConfigSkipBareword(p);
11453
11454 if( p ){
11455 zRank = sqlite3Fts5MallocZero(&rc, 1 + p - pRank);
11456 if( zRank ) memcpy(zRank, pRank, p-pRank);
11457 }else{
11458 rc = SQLITE_ERROR;
11459 }
11460
11461 if( rc==SQLITE_OK ){
11462 p = fts5ConfigSkipWhitespace(p);
11463 if( *p!='(' ) rc = SQLITE_ERROR;
11464 p++;
11465 }
11466 if( rc==SQLITE_OK ){
11467 const char *pArgs;
11468 p = fts5ConfigSkipWhitespace(p);
11469 pArgs = p;
11470 if( *p!=')' ){
11471 p = fts5ConfigSkipArgs(p);
11472 if( p==0 ){
11473 rc = SQLITE_ERROR;
11474 }else{
11475 zRankArgs = sqlite3Fts5MallocZero(&rc, 1 + p - pArgs);
11476 if( zRankArgs ) memcpy(zRankArgs, pArgs, p-pArgs);
11477 }
11478 }
11479 }
11480 }
11481
11482 if( rc!=SQLITE_OK ){
11483 sqlite3_free(zRank);
11484 assert( zRankArgs==0 );
11485 }else{
11486 *pzRank = zRank;
11487 *pzRankArgs = zRankArgs;
11488 }
11489 return rc;
11490 }
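The rank() specification format described above (bareword, open parenthesis, comma-separated literals, close parenthesis) can be sketched as a standalone validator. This is only an illustration under stated assumptions: `skipSpace`, `skipLiteral` and `isRankSpec` are hypothetical names, barewords are assumed to be ASCII identifiers, and literals are limited to decimal integers and single-quoted strings. The real `fts5ConfigSkipBareword()` and `fts5ConfigSkipLiteral()` may accept more than this.

```c
#include <ctype.h>

/* Hypothetical helper: skip any run of spaces. */
static const char *skipSpace(const char *p){
  while( *p==' ' ) p++;
  return p;
}

/* Hypothetical, simplified literal scanner. Accepts a decimal integer or
** a single-quoted string in which '' is an escaped quote. Returns a
** pointer to the byte after the literal, or NULL on a parse error. */
static const char *skipLiteral(const char *p){
  if( *p=='\'' ){
    p++;
    while( *p ){
      if( *p=='\'' ){
        p++;
        if( *p!='\'' ) return p;   /* Closing quote found */
      }
      p++;
    }
    return 0;                      /* Unterminated string */
  }
  if( isdigit((unsigned char)*p) ){
    while( isdigit((unsigned char)*p) ) p++;
    return p;
  }
  return 0;
}

/* Returns 1 if z has the form: bareword ( literal, literal, ... ).
** As in the function above, nothing after the ')' is examined. */
static int isRankSpec(const char *z){
  const char *p = skipSpace(z);
  if( !isalpha((unsigned char)*p) && *p!='_' ) return 0;
  while( isalnum((unsigned char)*p) || *p=='_' ) p++;
  p = skipSpace(p);
  if( *p!='(' ) return 0;
  p = skipSpace(p+1);
  if( *p==')' ) return 1;          /* Empty argument list */
  while( 1 ){
    p = skipLiteral(skipSpace(p));
    if( p==0 ) return 0;
    p = skipSpace(p);
    if( *p==')' ) return 1;
    if( *p!=',' ) return 0;
    p++;
  }
}
```

Under these assumptions, `isRankSpec("bm25(10, 5)")` accepts the input, while a missing comma or a missing parenthesis causes rejection.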
11491
11492 static int sqlite3Fts5ConfigSetValue(
11493 Fts5Config *pConfig,
11494 const char *zKey,
11495 sqlite3_value *pVal,
11496 int *pbBadkey
11497 ){
11498 int rc = SQLITE_OK;
11499
11500 if( 0==sqlite3_stricmp(zKey, "pgsz") ){
11501 int pgsz = 0;
11502 if( SQLITE_INTEGER==sqlite3_value_numeric_type(pVal) ){
11503 pgsz = sqlite3_value_int(pVal);
11504 }
11505 if( pgsz<=0 || pgsz>FTS5_MAX_PAGE_SIZE ){
11506 *pbBadkey = 1;
11507 }else{
11508 pConfig->pgsz = pgsz;
11509 }
11510 }
11511
11512 else if( 0==sqlite3_stricmp(zKey, "hashsize") ){
11513 int nHashSize = -1;
11514 if( SQLITE_INTEGER==sqlite3_value_numeric_type(pVal) ){
11515 nHashSize = sqlite3_value_int(pVal);
11516 }
11517 if( nHashSize<=0 ){
11518 *pbBadkey = 1;
11519 }else{
11520 pConfig->nHashSize = nHashSize;
11521 }
11522 }
11523
11524 else if( 0==sqlite3_stricmp(zKey, "automerge") ){
11525 int nAutomerge = -1;
11526 if( SQLITE_INTEGER==sqlite3_value_numeric_type(pVal) ){
11527 nAutomerge = sqlite3_value_int(pVal);
11528 }
11529 if( nAutomerge<0 || nAutomerge>64 ){
11530 *pbBadkey = 1;
11531 }else{
11532 if( nAutomerge==1 ) nAutomerge = FTS5_DEFAULT_AUTOMERGE;
11533 pConfig->nAutomerge = nAutomerge;
11534 }
11535 }
11536
11537 else if( 0==sqlite3_stricmp(zKey, "crisismerge") ){
11538 int nCrisisMerge = -1;
11539 if( SQLITE_INTEGER==sqlite3_value_numeric_type(pVal) ){
11540 nCrisisMerge = sqlite3_value_int(pVal);
11541 }
11542 if( nCrisisMerge<0 ){
11543 *pbBadkey = 1;
11544 }else{
11545 if( nCrisisMerge<=1 ) nCrisisMerge = FTS5_DEFAULT_CRISISMERGE;
11546 pConfig->nCrisisMerge = nCrisisMerge;
11547 }
11548 }
11549
11550 else if( 0==sqlite3_stricmp(zKey, "rank") ){
11551 const char *zIn = (const char*)sqlite3_value_text(pVal);
11552 char *zRank;
11553 char *zRankArgs;
11554 rc = sqlite3Fts5ConfigParseRank(zIn, &zRank, &zRankArgs);
11555 if( rc==SQLITE_OK ){
11556 sqlite3_free(pConfig->zRank);
11557 sqlite3_free(pConfig->zRankArgs);
11558 pConfig->zRank = zRank;
11559 pConfig->zRankArgs = zRankArgs;
11560 }else if( rc==SQLITE_ERROR ){
11561 rc = SQLITE_OK;
11562 *pbBadkey = 1;
11563 }
11564 }else{
11565 *pbBadkey = 1;
11566 }
11567 return rc;
11568 }
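The "automerge" branch above follows a pattern shared by the other integer options: reject out-of-range values through the bad-key flag, and map a special value to the compiled-in default. A minimal sketch of that pattern follows. `HYPOTHETICAL_DEFAULT_AUTOMERGE` and `setAutomerge` are names invented for illustration, and the default of 4 is an assumption, not taken from this file.

```c
/* Assumed default, for illustration only. */
#define HYPOTHETICAL_DEFAULT_AUTOMERGE 4

/* Validate a requested automerge value. Returns 1 (bad key) if the value
** is outside 0..64 and leaves *pnAutomerge untouched. Otherwise stores
** the value, replacing 1 with the default, and returns 0. */
static int setAutomerge(int nRequested, int *pnAutomerge){
  if( nRequested<0 || nRequested>64 ) return 1;
  if( nRequested==1 ) nRequested = HYPOTHETICAL_DEFAULT_AUTOMERGE;
  *pnAutomerge = nRequested;
  return 0;
}
```

A value of 0 is stored as-is, so only the value 1 is treated as a request for the default.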
11569
11570 /*
11571 ** Load the contents of the %_config table into memory.
11572 */
11573 static int sqlite3Fts5ConfigLoad(Fts5Config *pConfig, int iCookie){
11574 const char *zSelect = "SELECT k, v FROM %Q.'%q_config'";
11575 char *zSql;
11576 sqlite3_stmt *p = 0;
11577 int rc = SQLITE_OK;
11578 int iVersion = 0;
11579
11580 /* Set default values */
11581 pConfig->pgsz = FTS5_DEFAULT_PAGE_SIZE;
11582 pConfig->nAutomerge = FTS5_DEFAULT_AUTOMERGE;
11583 pConfig->nCrisisMerge = FTS5_DEFAULT_CRISISMERGE;
11584 pConfig->nHashSize = FTS5_DEFAULT_HASHSIZE;
11585
11586 zSql = sqlite3Fts5Mprintf(&rc, zSelect, pConfig->zDb, pConfig->zName);
11587 if( zSql ){
11588 rc = sqlite3_prepare_v2(pConfig->db, zSql, -1, &p, 0);
11589 sqlite3_free(zSql);
11590 }
11591
11592 assert( rc==SQLITE_OK || p==0 );
11593 if( rc==SQLITE_OK ){
11594 while( SQLITE_ROW==sqlite3_step(p) ){
11595 const char *zK = (const char*)sqlite3_column_text(p, 0);
11596 sqlite3_value *pVal = sqlite3_column_value(p, 1);
11597 if( 0==sqlite3_stricmp(zK, "version") ){
11598 iVersion = sqlite3_value_int(pVal);
11599 }else{
11600 int bDummy = 0;
11601 sqlite3Fts5ConfigSetValue(pConfig, zK, pVal, &bDummy);
11602 }
11603 }
11604 rc = sqlite3_finalize(p);
11605 }
11606
11607 if( rc==SQLITE_OK && iVersion!=FTS5_CURRENT_VERSION ){
11608 rc = SQLITE_ERROR;
11609 if( pConfig->pzErrmsg ){
11610 assert( 0==*pConfig->pzErrmsg );
11611 *pConfig->pzErrmsg = sqlite3_mprintf(
11612 "invalid fts5 file format (found %d, expected %d) - run 'rebuild'",
11613 iVersion, FTS5_CURRENT_VERSION
11614 );
11615 }
11616 }
11617
11618 if( rc==SQLITE_OK ){
11619 pConfig->iCookie = iCookie;
11620 }
11621 return rc;
11622 }
11623
11624
11625 /*
11626 ** 2014 May 31
11627 **
11628 ** The author disclaims copyright to this source code. In place of
11629 ** a legal notice, here is a blessing:
11630 **
11631 ** May you do good and not evil.
11632 ** May you find forgiveness for yourself and forgive others.
11633 ** May you share freely, never taking more than you give.
11634 **
11635 ******************************************************************************
11636 **
11637 */
11638
11639
11640
11641 /* #include "fts5Int.h" */
11642 /* #include "fts5parse.h" */
11643
11644 /*
11645 ** All token types in the generated fts5parse.h file are greater than 0.
11646 */
11647 #define FTS5_EOF 0
11648
11649 #define FTS5_LARGEST_INT64 (0xffffffff|(((i64)0x7fffffff)<<32))
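The macro above assembles the largest signed 64-bit value from two 32-bit halves: the low word is all ones and the high word is 0x7fffffff. The following sketch checks that reading, assuming `i64` is a signed 64-bit integer (`int64_t` stands in for it here); `fts5LargestIsInt64Max` is a hypothetical helper.

```c
#include <stdint.h>

/* Assumption: i64 in the amalgamation is a signed 64-bit integer. */
typedef int64_t i64;

#define FTS5_LARGEST_INT64  (0xffffffff|(((i64)0x7fffffff)<<32))

/* Returns 1 if the macro equals the largest value an int64_t can hold. */
static int fts5LargestIsInt64Max(void){
  return FTS5_LARGEST_INT64==INT64_MAX;
}
```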
11650
11651 typedef struct Fts5ExprTerm Fts5ExprTerm;
11652
11653 /*
11654 ** Functions generated by lemon from fts5parse.y.
11655 */
11656 static void *sqlite3Fts5ParserAlloc(void *(*mallocProc)(u64));
11657 static void sqlite3Fts5ParserFree(void*, void (*freeProc)(void*));
11658 static void sqlite3Fts5Parser(void*, int, Fts5Token, Fts5Parse*);
11659 #ifndef NDEBUG
11660 /* #include <stdio.h> */
11661 static void sqlite3Fts5ParserTrace(FILE*, char*);
11662 #endif
11663
11664
11665 struct Fts5Expr {
11666 Fts5Index *pIndex;
11667 Fts5ExprNode *pRoot;
11668 int bDesc; /* Iterate in descending rowid order */
11669 int nPhrase; /* Number of phrases in expression */
11670 Fts5ExprPhrase **apExprPhrase; /* Pointers to phrase objects */
11671 };
11672
11673 /*
11674 ** eType:
11675 ** Expression node type. Always one of:
11676 **
11677 ** FTS5_AND (nChild, apChild valid)
11678 ** FTS5_OR (nChild, apChild valid)
11679 ** FTS5_NOT (nChild, apChild valid)
11680 ** FTS5_STRING (pNear valid)
11681 ** FTS5_TERM (pNear valid)
11682 */
11683 struct Fts5ExprNode {
11684 int eType; /* Node type */
11685 int bEof; /* True at EOF */
11686 int bNomatch; /* True if entry is not a match */
11687
11688 i64 iRowid; /* Current rowid */
11689 Fts5ExprNearset *pNear; /* For FTS5_STRING - cluster of phrases */
11690
11691 /* Child nodes. For a NOT node, this array always contains 2 entries. For
11692 ** AND or OR nodes, it contains 2 or more entries. */
11693 int nChild; /* Number of child nodes */
11694 Fts5ExprNode *apChild[1]; /* Array of child nodes */
11695 };
11696
11697 #define Fts5NodeIsString(p) ((p)->eType==FTS5_TERM || (p)->eType==FTS5_STRING)
11698
11699 /*
11700 ** An instance of the following structure represents a single search term
11701 ** or term prefix.
11702 */
11703 struct Fts5ExprTerm {
11704 int bPrefix; /* True for a prefix term */
11705 char *zTerm; /* nul-terminated term */
11706 Fts5IndexIter *pIter; /* Iterator for this term */
11707 Fts5ExprTerm *pSynonym; /* Pointer to first in list of synonyms */
11708 };
11709
11710 /*
11711 ** A phrase. One or more terms that must appear in a contiguous sequence
11712 ** within a document for it to match.
11713 */
11714 struct Fts5ExprPhrase {
11715 Fts5ExprNode *pNode; /* FTS5_STRING node this phrase is part of */
11716 Fts5Buffer poslist; /* Current position list */
11717 int nTerm; /* Number of entries in aTerm[] */
11718 Fts5ExprTerm aTerm[1]; /* Terms that make up this phrase */
11719 };
11720
11721 /*
11722 ** One or more phrases that must appear within a certain token distance of
11723 ** each other within each matching document.
11724 */
11725 struct Fts5ExprNearset {
11726 int nNear; /* NEAR parameter */
11727 Fts5Colset *pColset; /* Columns to search (NULL -> all columns) */
11728 int nPhrase; /* Number of entries in aPhrase[] array */
11729 Fts5ExprPhrase *apPhrase[1]; /* Array of phrase pointers */
11730 };
11731
11732
11733 /*
11734 ** Parse context.
11735 */
11736 struct Fts5Parse {
11737 Fts5Config *pConfig;
11738 char *zErr;
11739 int rc;
11740 int nPhrase; /* Size of apPhrase array */
11741 Fts5ExprPhrase **apPhrase; /* Array of all phrases */
11742 Fts5ExprNode *pExpr; /* Result of a successful parse */
11743 };
11744
11745 static void sqlite3Fts5ParseError(Fts5Parse *pParse, const char *zFmt, ...){
11746 va_list ap;
11747 va_start(ap, zFmt);
11748 if( pParse->rc==SQLITE_OK ){
11749 pParse->zErr = sqlite3_vmprintf(zFmt, ap);
11750 pParse->rc = SQLITE_ERROR;
11751 }
11752 va_end(ap);
11753 }
11754
11755 static int fts5ExprIsspace(char t){
11756 return t==' ' || t=='\t' || t=='\n' || t=='\r';
11757 }
11758
11759 /*
11760 ** Read the first token from the nul-terminated string at *pz.
11761 */
11762 static int fts5ExprGetToken(
11763 Fts5Parse *pParse,
11764 const char **pz, /* IN/OUT: Pointer into buffer */
11765 Fts5Token *pToken
11766 ){
11767 const char *z = *pz;
11768 int tok;
11769
11770 /* Skip past any whitespace */
11771 while( fts5ExprIsspace(*z) ) z++;
11772
11773 pToken->p = z;
11774 pToken->n = 1;
11775 switch( *z ){
11776 case '(': tok = FTS5_LP; break;
11777 case ')': tok = FTS5_RP; break;
11778 case '{': tok = FTS5_LCP; break;
11779 case '}': tok = FTS5_RCP; break;
11780 case ':': tok = FTS5_COLON; break;
11781 case ',': tok = FTS5_COMMA; break;
11782 case '+': tok = FTS5_PLUS; break;
11783 case '*': tok = FTS5_STAR; break;
11784 case '\0': tok = FTS5_EOF; break;
11785
11786 case '"': {
11787 const char *z2;
11788 tok = FTS5_STRING;
11789
11790 for(z2=&z[1]; 1; z2++){
11791 if( z2[0]=='"' ){
11792 z2++;
11793 if( z2[0]!='"' ) break;
11794 }
11795 if( z2[0]=='\0' ){
11796 sqlite3Fts5ParseError(pParse, "unterminated string");
11797 return FTS5_EOF;
11798 }
11799 }
11800 pToken->n = (z2 - z);
11801 break;
11802 }
11803
11804 default: {
11805 const char *z2;
11806 if( sqlite3Fts5IsBareword(z[0])==0 ){
11807 sqlite3Fts5ParseError(pParse, "fts5: syntax error near \"%.1s\"", z);
11808 return FTS5_EOF;
11809 }
11810 tok = FTS5_STRING;
11811 for(z2=&z[1]; sqlite3Fts5IsBareword(*z2); z2++);
11812 pToken->n = (z2 - z);
11813 if( pToken->n==2 && memcmp(pToken->p, "OR", 2)==0 ) tok = FTS5_OR;
11814 if( pToken->n==3 && memcmp(pToken->p, "NOT", 3)==0 ) tok = FTS5_NOT;
11815 if( pToken->n==3 && memcmp(pToken->p, "AND", 3)==0 ) tok = FTS5_AND;
11816 break;
11817 }
11818 }
11819
11820 *pz = &pToken->p[pToken->n];
11821 return tok;
11822 }
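The quoted-string case in fts5ExprGetToken() treats two adjacent double-quote characters as an escaped quote and reports an error if the input ends before the closing quote. The same loop can be isolated as a sketch; `quotedTokenLength` is a hypothetical helper that returns -1 instead of calling sqlite3Fts5ParseError().

```c
/* Hypothetical helper, not part of the amalgamation. z must point at an
** opening '"'. Returns the number of bytes in the quoted token, counting
** both the opening and closing quote, or -1 if the string is
** unterminated. A pair of adjacent quotes ("") is an escaped quote. */
static int quotedTokenLength(const char *z){
  const char *z2;
  for(z2=&z[1]; 1; z2++){
    if( z2[0]=='"' ){
      z2++;
      if( z2[0]!='"' ) break;      /* Closing quote found */
    }
    if( z2[0]=='\0' ) return -1;   /* Ran off the end of the input */
  }
  return (int)(z2 - z);
}
```

The returned length corresponds to the value stored in `pToken->n` by the original code.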
11823
11824 static void *fts5ParseAlloc(u64 t){ return sqlite3_malloc((int)t); }
11825 static void fts5ParseFree(void *p){ sqlite3_free(p); }
11826
11827 static int sqlite3Fts5ExprNew(
11828 Fts5Config *pConfig, /* FTS5 Configuration */
11829 const char *zExpr, /* Expression text */
11830 Fts5Expr **ppNew,
11831 char **pzErr
11832 ){
11833 Fts5Parse sParse;
11834 Fts5Token token;
11835 const char *z = zExpr;
11836 int t; /* Next token type */
11837 void *pEngine;
11838 Fts5Expr *pNew;
11839
11840 *ppNew = 0;
11841 *pzErr = 0;
11842 memset(&sParse, 0, sizeof(sParse));
11843 pEngine = sqlite3Fts5ParserAlloc(fts5ParseAlloc);
11844 if( pEngine==0 ){ return SQLITE_NOMEM; }
11845 sParse.pConfig = pConfig;
11846
11847 do {
11848 t = fts5ExprGetToken(&sParse, &z, &token);
11849 sqlite3Fts5Parser(pEngine, t, token, &sParse);
11850 }while( sParse.rc==SQLITE_OK && t!=FTS5_EOF );
11851 sqlite3Fts5ParserFree(pEngine, fts5ParseFree);
11852
11853 assert( sParse.rc!=SQLITE_OK || sParse.zErr==0 );
11854 if( sParse.rc==SQLITE_OK ){
11855 *ppNew = pNew = sqlite3_malloc(sizeof(Fts5Expr));
11856 if( pNew==0 ){
11857 sParse.rc = SQLITE_NOMEM;
11858 sqlite3Fts5ParseNodeFree(sParse.pExpr);
11859 }else{
11860 pNew->pRoot = sParse.pExpr;
11861 pNew->pIndex = 0;
11862 pNew->apExprPhrase = sParse.apPhrase;
11863 pNew->nPhrase = sParse.nPhrase;
11864 sParse.apPhrase = 0;
11865 }
11866 }
11867
11868 sqlite3_free(sParse.apPhrase);
11869 *pzErr = sParse.zErr;
11870 return sParse.rc;
11871 }
11872
11873 /*
11874 ** Free the expression node object passed as the only argument.
11875 */
11876 static void sqlite3Fts5ParseNodeFree(Fts5ExprNode *p){
11877 if( p ){
11878 int i;
11879 for(i=0; i<p->nChild; i++){
11880 sqlite3Fts5ParseNodeFree(p->apChild[i]);
11881 }
11882 sqlite3Fts5ParseNearsetFree(p->pNear);
11883 sqlite3_free(p);
11884 }
11885 }
11886
11887 /*
11888 ** Free the expression object passed as the only argument.
11889 */
11890 static void sqlite3Fts5ExprFree(Fts5Expr *p){
11891 if( p ){
11892 sqlite3Fts5ParseNodeFree(p->pRoot);
11893 sqlite3_free(p->apExprPhrase);
11894 sqlite3_free(p);
11895 }
11896 }
11897
11898 /*
11899 ** Argument pTerm must be a synonym iterator. Return the current rowid
11900 ** that it points to.
11901 */
11902 static i64 fts5ExprSynonymRowid(Fts5ExprTerm *pTerm, int bDesc, int *pbEof){
11903 i64 iRet = 0;
11904 int bRetValid = 0;
11905 Fts5ExprTerm *p;
11906
11907 assert( pTerm->pSynonym );
11908 assert( bDesc==0 || bDesc==1 );
11909 for(p=pTerm; p; p=p->pSynonym){
11910 if( 0==sqlite3Fts5IterEof(p->pIter) ){
11911 i64 iRowid = sqlite3Fts5IterRowid(p->pIter);
11912 if( bRetValid==0 || (bDesc!=(iRowid<iRet)) ){
11913 iRet = iRowid;
11914 bRetValid = 1;
11915 }
11916 }
11917 }
11918
11919 if( pbEof && bRetValid==0 ) *pbEof = 1;
11920 return iRet;
11921 }
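The test `bDesc!=(iRowid<iRet)` in fts5ExprSynonymRowid() selects the smallest rowid for ascending iteration and the largest for descending iteration with a single comparison. A sketch of the same selection over a plain array follows. `firstRowid` is a hypothetical function, and the array stands in for the list of synonym iterators that are not at EOF.

```c
#include <stdint.h>

/* Hypothetical stand-in for the synonym list: an array of current rowids.
** Returns the smallest rowid when bDesc==0 and the largest when bDesc==1.
** If the array is empty, *pbEof (when not NULL) is set to 1. */
static int64_t firstRowid(const int64_t *aRowid, int n, int bDesc, int *pbEof){
  int64_t iRet = 0;
  int bRetValid = 0;
  int i;
  for(i=0; i<n; i++){
    /* (aRowid[i]<iRet) is 1 when the candidate is smaller. With bDesc==0
    ** the test passes for smaller candidates; with bDesc==1 it passes
    ** for candidates that are not smaller. */
    if( bRetValid==0 || (bDesc!=(aRowid[i]<iRet)) ){
      iRet = aRowid[i];
      bRetValid = 1;
    }
  }
  if( pbEof && bRetValid==0 ) *pbEof = 1;
  return iRet;
}
```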
11922
11923 /*
11924 ** Argument pTerm must be a synonym iterator.
11925 */
11926 static int fts5ExprSynonymPoslist(
11927 Fts5ExprTerm *pTerm,
11928 Fts5Colset *pColset,
11929 i64 iRowid,
11930 int *pbDel, /* OUT: Caller should sqlite3_free(*pa) */
11931 u8 **pa, int *pn
11932 ){
11933 Fts5PoslistReader aStatic[4];
11934 Fts5PoslistReader *aIter = aStatic;
11935 int nIter = 0;
11936 int nAlloc = 4;
11937 int rc = SQLITE_OK;
11938 Fts5ExprTerm *p;
11939
11940 assert( pTerm->pSynonym );
11941 for(p=pTerm; p; p=p->pSynonym){
11942 Fts5IndexIter *pIter = p->pIter;
11943 if( sqlite3Fts5IterEof(pIter)==0 && sqlite3Fts5IterRowid(pIter)==iRowid ){
11944 const u8 *a;
11945 int n;
11946 i64 dummy;
11947 rc = sqlite3Fts5IterPoslist(pIter, pColset, &a, &n, &dummy);
11948 if( rc!=SQLITE_OK ) goto synonym_poslist_out;
11949 if( nIter==nAlloc ){
11950 int nByte = sizeof(Fts5PoslistReader) * nAlloc * 2;
11951 Fts5PoslistReader *aNew = (Fts5PoslistReader*)sqlite3_malloc(nByte);
11952 if( aNew==0 ){
11953 rc = SQLITE_NOMEM;
11954 goto synonym_poslist_out;
11955 }
11956 memcpy(aNew, aIter, sizeof(Fts5PoslistReader) * nIter);
11957 nAlloc = nAlloc*2;
11958 if( aIter!=aStatic ) sqlite3_free(aIter);
11959 aIter = aNew;
11960 }
11961 sqlite3Fts5PoslistReaderInit(a, n, &aIter[nIter]);
11962 assert( aIter[nIter].bEof==0 );
11963 nIter++;
11964 }
11965 }
11966
11967 assert( *pbDel==0 );
11968 if( nIter==1 ){
11969 *pa = (u8*)aIter[0].a;
11970 *pn = aIter[0].n;
11971 }else{
11972 Fts5PoslistWriter writer = {0};
11973 Fts5Buffer buf = {0,0,0};
11974 i64 iPrev = -1;
11975 while( 1 ){
11976 int i;
11977 i64 iMin = FTS5_LARGEST_INT64;
11978 for(i=0; i<nIter; i++){
11979 if( aIter[i].bEof==0 ){
11980 if( aIter[i].iPos==iPrev ){
11981 if( sqlite3Fts5PoslistReaderNext(&aIter[i]) ) continue;
11982 }
11983 if( aIter[i].iPos<iMin ){
11984 iMin = aIter[i].iPos;
11985 }
11986 }
11987 }
11988 if( iMin==FTS5_LARGEST_INT64 || rc!=SQLITE_OK ) break;
11989 rc = sqlite3Fts5PoslistWriterAppend(&buf, &writer, iMin);
11990 iPrev = iMin;
11991 }
11992 if( rc ){
11993 sqlite3_free(buf.p);
11994 }else{
11995 *pa = buf.p;
11996 *pn = buf.n;
11997 *pbDel = 1;
11998 }
11999 }
12000
12001 synonym_poslist_out:
12002 if( aIter!=aStatic ) sqlite3_free(aIter);
12003 return rc;
12004 }
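When more than one synonym matches the row, fts5ExprSynonymPoslist() merges the sorted position lists, emitting each position once. The merge can be sketched over plain arrays as below. `PosCursor` and `mergePositions` are hypothetical, and the sketch assumes each input list is strictly increasing and contains only non-negative positions, so that -1 can serve as the initial "previous" value.

```c
#include <stdint.h>

/* Hypothetical stand-in for Fts5PoslistReader: a cursor over a sorted
** array of positions. */
typedef struct PosCursor {
  const int64_t *a;   /* Sorted positions */
  int n;              /* Number of entries in a[] */
  int i;              /* Index of current entry; i>=n means EOF */
} PosCursor;

/* Merge nIter sorted lists into aOut[], dropping duplicates. Returns the
** number of positions written. aOut[] must have room for every input. */
static int mergePositions(PosCursor *aIter, int nIter, int64_t *aOut){
  int nOut = 0;
  int64_t iPrev = -1;
  while( 1 ){
    int i;
    int bFound = 0;
    int64_t iMin = 0;
    for(i=0; i<nIter; i++){
      PosCursor *p = &aIter[i];
      /* Step past the position emitted on the previous pass */
      if( p->i<p->n && p->a[p->i]==iPrev ) p->i++;
      if( p->i<p->n && (bFound==0 || p->a[p->i]<iMin) ){
        iMin = p->a[p->i];
        bFound = 1;
      }
    }
    if( bFound==0 ) break;         /* Every cursor is at EOF */
    aOut[nOut++] = iMin;
    iPrev = iMin;
  }
  return nOut;
}
```

The original uses FTS5_LARGEST_INT64 as the "nothing found" sentinel; the sketch uses an explicit flag instead, which is a design substitution rather than the original approach.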
12005
12006
12007 /*
12008 ** All individual term iterators in pPhrase are guaranteed to be valid and
12009 ** pointing to the same rowid when this function is called. This function
12010 ** checks if the current rowid really is a match, and if so populates
12011 ** the pPhrase->poslist buffer accordingly. Output parameter *pbMatch
12012 ** is set to true if this is really a match, or false otherwise.
12013 **
12014 ** SQLITE_OK is returned if successful, or an SQLite error code if an
12015 ** error occurs. It is not considered an error if the current rowid is
12016 ** not a match.
12017 */
12018 static int fts5ExprPhraseIsMatch(
12019 Fts5ExprNode *pNode, /* Node pPhrase belongs to */
12020 Fts5Colset *pColset, /* Restrict matches to these columns */
12021 Fts5ExprPhrase *pPhrase, /* Phrase object to initialize */
12022 int *pbMatch /* OUT: Set to true if really a match */
12023 ){
12024 Fts5PoslistWriter writer = {0};
12025 Fts5PoslistReader aStatic[4];
12026 Fts5PoslistReader *aIter = aStatic;
12027 int i;
12028 int rc = SQLITE_OK;
12029
12030 fts5BufferZero(&pPhrase->poslist);
12031
12032 /* If the aStatic[] array is not large enough, allocate a larger array
12033 ** using sqlite3_malloc(). This approach could be improved upon. */
12034 if( pPhrase->nTerm>(int)ArraySize(aStatic) ){
12035 int nByte = sizeof(Fts5PoslistReader) * pPhrase->nTerm;
12036 aIter = (Fts5PoslistReader*)sqlite3_malloc(nByte);
12037 if( !aIter ) return SQLITE_NOMEM;
12038 }
12039 memset(aIter, 0, sizeof(Fts5PoslistReader) * pPhrase->nTerm);
12040
12041 /* Initialize a term iterator for each term in the phrase */
12042 for(i=0; i<pPhrase->nTerm; i++){
12043 Fts5ExprTerm *pTerm = &pPhrase->aTerm[i];
12044 i64 dummy;
12045 int n = 0;
12046 int bFlag = 0;
12047 const u8 *a = 0;
12048 if( pTerm->pSynonym ){
12049 rc = fts5ExprSynonymPoslist(
12050 pTerm, pColset, pNode->iRowid, &bFlag, (u8**)&a, &n
12051 );
12052 }else{
12053 rc = sqlite3Fts5IterPoslist(pTerm->pIter, pColset, &a, &n, &dummy);
12054 }
12055 if( rc!=SQLITE_OK ) goto ismatch_out;
12056 sqlite3Fts5PoslistReaderInit(a, n, &aIter[i]);
12057 aIter[i].bFlag = (u8)bFlag;
12058 if( aIter[i].bEof ) goto ismatch_out;
12059 }
12060
12061 while( 1 ){
12062 int bMatch;
12063 i64 iPos = aIter[0].iPos;
12064 do {
12065 bMatch = 1;
12066 for(i=0; i<pPhrase->nTerm; i++){
12067 Fts5PoslistReader *pPos = &aIter[i];
12068 i64 iAdj = iPos + i;
12069 if( pPos->iPos!=iAdj ){
12070 bMatch = 0;
12071 while( pPos->iPos<iAdj ){
12072 if( sqlite3Fts5PoslistReaderNext(pPos) ) goto ismatch_out;
12073 }
12074 if( pPos->iPos>iAdj ) iPos = pPos->iPos-i;
12075 }
12076 }
12077 }while( bMatch==0 );
12078
12079 /* Append position iPos to the output */
12080 rc = sqlite3Fts5PoslistWriterAppend(&pPhrase->poslist, &writer, iPos);
12081 if( rc!=SQLITE_OK ) goto ismatch_out;
12082
12083 for(i=0; i<pPhrase->nTerm; i++){
12084 if( sqlite3Fts5PoslistReaderNext(&aIter[i]) ) goto ismatch_out;
12085 }
12086 }
12087
12088 ismatch_out:
12089 *pbMatch = (pPhrase->poslist.n>0);
12090 for(i=0; i<pPhrase->nTerm; i++){
12091 if( aIter[i].bFlag ) sqlite3_free((u8*)aIter[i].a);
12092 }
12093 if( aIter!=aStatic ) sqlite3_free(aIter);
12094 return rc;
12095 }
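The matching loop in fts5ExprPhraseIsMatch() looks for a start position iPos such that term i of the phrase occurs at position iPos+i for every i. A sketch of that loop over plain arrays follows. `TermPos` and `phraseMatches` are hypothetical names, and each array is assumed to be a strictly increasing list of positions for one term.

```c
#include <stdint.h>

/* Hypothetical cursor over the sorted positions of one phrase term. */
typedef struct TermPos {
  const int64_t *a;   /* Sorted positions of this term */
  int n;              /* Number of entries in a[] */
  int i;              /* Index of current entry */
} TermPos;

/* Write the start position of every phrase match to aOut[] and return
** the number of matches found. */
static int phraseMatches(TermPos *aTerm, int nTerm, int64_t *aOut){
  int nOut = 0;
  int i;
  for(i=0; i<nTerm; i++){
    if( aTerm[i].n==0 ) return 0;  /* A term with no positions: no match */
  }
  while( 1 ){
    int bMatch;
    int64_t iPos = aTerm[0].a[aTerm[0].i];
    do {
      bMatch = 1;
      for(i=0; i<nTerm; i++){
        TermPos *p = &aTerm[i];
        int64_t iAdj = iPos + i;   /* Where term i must appear */
        if( p->a[p->i]!=iAdj ){
          bMatch = 0;
          while( p->a[p->i]<iAdj ){
            if( ++p->i>=p->n ) return nOut;
          }
          /* Term i overshot: move the candidate start forward */
          if( p->a[p->i]>iAdj ) iPos = p->a[p->i] - i;
        }
      }
    }while( bMatch==0 );
    aOut[nOut++] = iPos;
    for(i=0; i<nTerm; i++){
      if( ++aTerm[i].i>=aTerm[i].n ) return nOut;
    }
  }
}
```

For a two-term phrase whose terms occur at {1, 10} and {2, 5, 11}, the sketch reports matches starting at positions 1 and 10.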
12096
12097 typedef struct Fts5LookaheadReader Fts5LookaheadReader;
12098 struct Fts5LookaheadReader {
12099 const u8 *a; /* Buffer containing position list */
12100 int n; /* Size of buffer a[] in bytes */
12101 int i; /* Current offset in position list */
12102 i64 iPos; /* Current position */
12103 i64 iLookahead; /* Next position */
12104 };
12105
12106 #define FTS5_LOOKAHEAD_EOF (((i64)1) << 62)
12107
12108 static int fts5LookaheadReaderNext(Fts5LookaheadReader *p){
12109 p->iPos = p->iLookahead;
12110 if( sqlite3Fts5PoslistNext64(p->a, p->n, &p->i, &p->iLookahead) ){
12111 p->iLookahead = FTS5_LOOKAHEAD_EOF;
12112 }
12113 return (p->iPos==FTS5_LOOKAHEAD_EOF);
12114 }
12115
12116 static int fts5LookaheadReaderInit(
12117 const u8 *a, int n, /* Buffer to read position list from */
12118 Fts5LookaheadReader *p /* Iterator object to initialize */
12119 ){
12120 memset(p, 0, sizeof(Fts5LookaheadReader));
12121 p->a = a;
12122 p->n = n;
12123 fts5LookaheadReaderNext(p);
12124 return fts5LookaheadReaderNext(p);
12125 }
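The lookahead reader keeps both the current position and the next one, which is why its init function calls the "next" function twice: the first call loads the first entry into iLookahead, the second moves it into iPos. A sketch over a plain array illustrates this. `LookaheadReader`, `lookaheadNext` and `lookaheadInit` are hypothetical stand-ins, and array indexing replaces the varint decoding done by sqlite3Fts5PoslistNext64().

```c
#include <stdint.h>

#define LOOKAHEAD_EOF (((int64_t)1) << 62)

/* Hypothetical array-backed version of Fts5LookaheadReader. */
typedef struct LookaheadReader {
  const int64_t *a;     /* Sorted positions */
  int n;                /* Number of entries in a[] */
  int i;                /* Index of next entry to load */
  int64_t iPos;         /* Current position */
  int64_t iLookahead;   /* Next position, or LOOKAHEAD_EOF */
} LookaheadReader;

/* Advance by one entry. Returns 1 once the current position is EOF. */
static int lookaheadNext(LookaheadReader *p){
  p->iPos = p->iLookahead;
  if( p->i<p->n ){
    p->iLookahead = p->a[p->i++];
  }else{
    p->iLookahead = LOOKAHEAD_EOF;
  }
  return (p->iPos==LOOKAHEAD_EOF);
}

/* Prime the reader so that iPos holds a[0] and iLookahead holds a[1]. */
static int lookaheadInit(const int64_t *a, int n, LookaheadReader *p){
  p->a = a;
  p->n = n;
  p->i = 0;
  p->iPos = 0;
  p->iLookahead = 0;
  lookaheadNext(p);          /* Loads a[0] into iLookahead */
  return lookaheadNext(p);   /* Moves a[0] to iPos, a[1] to iLookahead */
}
```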
12126
12127 #if 0
12128 static int fts5LookaheadReaderEof(Fts5LookaheadReader *p){
12129 return (p->iPos==FTS5_LOOKAHEAD_EOF);
12130 }
12131 #endif
12132
12133 typedef struct Fts5NearTrimmer Fts5NearTrimmer;
12134 struct Fts5NearTrimmer {
12135 Fts5LookaheadReader reader; /* Input iterator */
12136 Fts5PoslistWriter writer; /* Writer context */
12137 Fts5Buffer *pOut; /* Output poslist */
12138 };
12139
12140 /*
12141 ** The near-set object passed as the first argument contains more than
12142 ** one phrase. All phrases currently point to the same row. The
12143 ** Fts5ExprPhrase.poslist buffers are populated accordingly. This function
12144 ** tests if the current row contains instances of each phrase sufficiently
12145 ** close together to meet the NEAR constraint. Non-zero is returned if it
12146 ** does, or zero otherwise.
12147 **
12148 ** If in/out parameter (*pRc) is set to other than SQLITE_OK when this
12149 ** function is called, it is a no-op. Or, if an error (e.g. SQLITE_NOMEM)
12150 ** occurs within this function (*pRc) is set accordingly before returning.
12151 ** The return value is undefined in both these cases.
12152 **
12153 ** If no error occurs and non-zero (a match) is returned, the position-list
12154 ** of each phrase object is edited to contain only those entries that
12155 ** meet the constraint before returning.
12156 */
12157 static int fts5ExprNearIsMatch(int *pRc, Fts5ExprNearset *pNear){
12158 Fts5NearTrimmer aStatic[4];
12159 Fts5NearTrimmer *a = aStatic;
12160 Fts5ExprPhrase **apPhrase = pNear->apPhrase;
12161
12162 int i;
12163 int rc = *pRc;
12164 int bMatch;
12165
12166 assert( pNear->nPhrase>1 );
12167
12168 /* If the aStatic[] array is not large enough, allocate a larger array
12169 ** using sqlite3_malloc(). This approach could be improved upon. */
12170 if( pNear->nPhrase>(int)ArraySize(aStatic) ){
12171 int nByte = sizeof(Fts5NearTrimmer) * pNear->nPhrase;
12172 a = (Fts5NearTrimmer*)sqlite3Fts5MallocZero(&rc, nByte);
12173 }else{
12174 memset(aStatic, 0, sizeof(aStatic));
12175 }
12176 if( rc!=SQLITE_OK ){
12177 *pRc = rc;
12178 return 0;
12179 }
12180
12181 /* Initialize a lookahead iterator for each phrase. After passing the
12182 ** buffer and buffer size to the lookahead-reader init function, zero
12183 ** the phrase poslist buffer. The new poslist for the phrase (containing
12184 ** the same entries as the original with some entries removed on account
12185 ** of the NEAR constraint) is written over the original even as it is
12186 ** being read. This is safe as the entries for the new poslist are a
12187 ** subset of the old, so it is not possible for data yet to be read to
12188 ** be overwritten. */
12189 for(i=0; i<pNear->nPhrase; i++){
12190 Fts5Buffer *pPoslist = &apPhrase[i]->poslist;
12191 fts5LookaheadReaderInit(pPoslist->p, pPoslist->n, &a[i].reader);
12192 pPoslist->n = 0;
12193 a[i].pOut = pPoslist;
12194 }
12195
12196 while( 1 ){
12197 int iAdv;
12198 i64 iMin;
12199 i64 iMax;
12200
12201 /* This block advances the phrase iterators until they point to a set of
12202 ** entries that together comprise a match. */
12203 iMax = a[0].reader.iPos;
12204 do {
12205 bMatch = 1;
12206 for(i=0; i<pNear->nPhrase; i++){
12207 Fts5LookaheadReader *pPos = &a[i].reader;
12208 iMin = iMax - pNear->apPhrase[i]->nTerm - pNear->nNear;
12209 if( pPos->iPos<iMin || pPos->iPos>iMax ){
12210 bMatch = 0;
12211 while( pPos->iPos<iMin ){
12212 if( fts5LookaheadReaderNext(pPos) ) goto ismatch_out;
12213 }
12214 if( pPos->iPos>iMax ) iMax = pPos->iPos;
12215 }
12216 }
12217 }while( bMatch==0 );
12218
12219 /* Add an entry to each output position list */
12220 for(i=0; i<pNear->nPhrase; i++){
12221 i64 iPos = a[i].reader.iPos;
12222 Fts5PoslistWriter *pWriter = &a[i].writer;
12223 if( a[i].pOut->n==0 || iPos!=pWriter->iPrev ){
12224 sqlite3Fts5PoslistWriterAppend(a[i].pOut, pWriter, iPos);
12225 }
12226 }
12227
12228 iAdv = 0;
12229 iMin = a[0].reader.iLookahead;
12230 for(i=0; i<pNear->nPhrase; i++){
12231 if( a[i].reader.iLookahead < iMin ){
12232 iMin = a[i].reader.iLookahead;
12233 iAdv = i;
12234 }
12235 }
12236 if( fts5LookaheadReaderNext(&a[iAdv].reader) ) goto ismatch_out;
12237 }
12238
12239 ismatch_out: {
12240 int bRet = a[0].pOut->n>0;
12241 *pRc = rc;
12242 if( a!=aStatic ) sqlite3_free(a);
12243 return bRet;
12244 }
12245 }
12246
12247 /*
12248 ** Advance the first term iterator in the first phrase of pNear. Set
12249 ** pNode->bEof to true if it reaches EOF or if an error occurs.
12250 **
12251 ** Return SQLITE_OK if successful, or an SQLite error code if an error
12252 ** occurs.
12253 */
12254 static int fts5ExprNearAdvanceFirst(
12255 Fts5Expr *pExpr, /* Expression pPhrase belongs to */
12256 Fts5ExprNode *pNode, /* FTS5_STRING or FTS5_TERM node */
12257 int bFromValid,
12258 i64 iFrom
12259 ){
12260 Fts5ExprTerm *pTerm = &pNode->pNear->apPhrase[0]->aTerm[0];
12261 int rc = SQLITE_OK;
12262
12263 if( pTerm->pSynonym ){
12264 int bEof = 1;
12265 Fts5ExprTerm *p;
12266
12267 /* Find the firstest rowid any synonym points to. */
12268 i64 iRowid = fts5ExprSynonymRowid(pTerm, pExpr->bDesc, 0);
12269
12270 /* Advance each iterator that currently points to iRowid. Or, if iFrom
12271 ** is valid - each iterator that points to a rowid before iFrom. */
12272 for(p=pTerm; p; p=p->pSynonym){
12273 if( sqlite3Fts5IterEof(p->pIter)==0 ){
12274 i64 ii = sqlite3Fts5IterRowid(p->pIter);
12275 if( ii==iRowid
12276 || (bFromValid && ii!=iFrom && (ii>iFrom)==pExpr->bDesc)
12277 ){
12278 if( bFromValid ){
12279 rc = sqlite3Fts5IterNextFrom(p->pIter, iFrom);
12280 }else{
12281 rc = sqlite3Fts5IterNext(p->pIter);
12282 }
12283 if( rc!=SQLITE_OK ) break;
12284 if( sqlite3Fts5IterEof(p->pIter)==0 ){
12285 bEof = 0;
12286 }
12287 }else{
12288 bEof = 0;
12289 }
12290 }
12291 }
12292
12293 /* Set the EOF flag if either all synonym iterators are at EOF or an
12294 ** error has occurred. */
12295 pNode->bEof = (rc || bEof);
12296 }else{
12297 Fts5IndexIter *pIter = pTerm->pIter;
12298
12299 assert( Fts5NodeIsString(pNode) );
12300 if( bFromValid ){
12301 rc = sqlite3Fts5IterNextFrom(pIter, iFrom);
12302 }else{
12303 rc = sqlite3Fts5IterNext(pIter);
12304 }
12305
12306 pNode->bEof = (rc || sqlite3Fts5IterEof(pIter));
12307 }
12308
12309 return rc;
12310 }
12311
12312 /*
12313 ** Advance iterator pIter until it points to a value equal to or laster
12314 ** than the initial value of *piLast. If this means the iterator points
12315 ** to a value laster than *piLast, update *piLast to the new lastest value.
12316 **
12317 ** If the iterator reaches EOF, set *pbEof to true before returning. If
12318 ** an error occurs, set *pRc to an error code. If either *pbEof or *pRc
12319 ** are set, return a non-zero value. Otherwise, return zero.
12320 */
12321 static int fts5ExprAdvanceto(
12322 Fts5IndexIter *pIter, /* Iterator to advance */
12323 int bDesc, /* True if iterator is "rowid DESC" */
12324 i64 *piLast, /* IN/OUT: Lastest rowid seen so far */
12325 int *pRc, /* OUT: Error code */
12326 int *pbEof /* OUT: Set to true if EOF */
12327 ){
12328 i64 iLast = *piLast;
12329 i64 iRowid;
12330
12331 iRowid = sqlite3Fts5IterRowid(pIter);
12332 if( (bDesc==0 && iLast>iRowid) || (bDesc && iLast<iRowid) ){
12333 int rc = sqlite3Fts5IterNextFrom(pIter, iLast);
12334 if( rc || sqlite3Fts5IterEof(pIter) ){
12335 *pRc = rc;
12336 *pbEof = 1;
12337 return 1;
12338 }
12339 iRowid = sqlite3Fts5IterRowid(pIter);
12340 assert( (bDesc==0 && iRowid>=iLast) || (bDesc==1 && iRowid<=iLast) );
12341 }
12342 *piLast = iRowid;
12343
12344 return 0;
12345 }
12346
12347 static int fts5ExprSynonymAdvanceto(
12348 Fts5ExprTerm *pTerm, /* Term iterator to advance */
12349 int bDesc, /* True if iterator is "rowid DESC" */
12350 i64 *piLast, /* IN/OUT: Lastest rowid seen so far */
12351 int *pRc /* OUT: Error code */
12352 ){
12353 int rc = SQLITE_OK;
12354 i64 iLast = *piLast;
12355 Fts5ExprTerm *p;
12356 int bEof = 0;
12357
12358 for(p=pTerm; rc==SQLITE_OK && p; p=p->pSynonym){
12359 if( sqlite3Fts5IterEof(p->pIter)==0 ){
12360 i64 iRowid = sqlite3Fts5IterRowid(p->pIter);
12361 if( (bDesc==0 && iLast>iRowid) || (bDesc && iLast<iRowid) ){
12362 rc = sqlite3Fts5IterNextFrom(p->pIter, iLast);
12363 }
12364 }
12365 }
12366
12367 if( rc!=SQLITE_OK ){
12368 *pRc = rc;
12369 bEof = 1;
12370 }else{
12371 *piLast = fts5ExprSynonymRowid(pTerm, bDesc, &bEof);
12372 }
12373 return bEof;
12374 }
12375
12376
12377 static int fts5ExprNearTest(
12378 int *pRc,
12379 Fts5Expr *pExpr, /* Expression that pNear is a part of */
12380 Fts5ExprNode *pNode /* The "NEAR" node (FTS5_STRING) */
12381 ){
12382 Fts5ExprNearset *pNear = pNode->pNear;
12383 int rc = *pRc;
12384 int i;
12385
12386 /* Check that each phrase in the nearset matches the current row.
12387 ** Populate the pPhrase->poslist buffers at the same time. If any
12388 ** phrase is not a match, break out of the loop early. */
12389 for(i=0; rc==SQLITE_OK && i<pNear->nPhrase; i++){
12390 Fts5ExprPhrase *pPhrase = pNear->apPhrase[i];
12391 if( pPhrase->nTerm>1 || pPhrase->aTerm[0].pSynonym || pNear->pColset ){
12392 int bMatch = 0;
12393 rc = fts5ExprPhraseIsMatch(pNode, pNear->pColset, pPhrase, &bMatch);
12394 if( bMatch==0 ) break;
12395 }else{
12396 rc = sqlite3Fts5IterPoslistBuffer(
12397 pPhrase->aTerm[0].pIter, &pPhrase->poslist
12398 );
12399 }
12400 }
12401
12402 *pRc = rc;
12403 if( i==pNear->nPhrase && (i==1 || fts5ExprNearIsMatch(pRc, pNear)) ){
12404 return 1;
12405 }
12406
12407 return 0;
12408 }
12409
12410 static int fts5ExprTokenTest(
12411 Fts5Expr *pExpr, /* Expression that pNear is a part of */
12412 Fts5ExprNode *pNode /* The "NEAR" node (FTS5_TERM) */
12413 ){
12414 /* As this "NEAR" object is actually a single phrase that consists
12415 ** of a single term only, grab pointers into the poslist managed by the
12416 ** fts5_index.c iterator object. This is much faster than synthesizing
12417 ** a new poslist the way we have to for more complicated phrase or NEAR
12418 ** expressions. */
12419 Fts5ExprNearset *pNear = pNode->pNear;
12420 Fts5ExprPhrase *pPhrase = pNear->apPhrase[0];
12421 Fts5IndexIter *pIter = pPhrase->aTerm[0].pIter;
12422 Fts5Colset *pColset = pNear->pColset;
12423 int rc;
12424
12425 assert( pNode->eType==FTS5_TERM );
12426 assert( pNear->nPhrase==1 && pPhrase->nTerm==1 );
12427 assert( pPhrase->aTerm[0].pSynonym==0 );
12428
12429 rc = sqlite3Fts5IterPoslist(pIter, pColset,
12430 (const u8**)&pPhrase->poslist.p, &pPhrase->poslist.n, &pNode->iRowid
12431 );
12432 pNode->bNomatch = (pPhrase->poslist.n==0);
12433 return rc;
12434 }
12435
12436 /*
12437 ** All individual term iterators in pNear are guaranteed to be valid when
12438 ** this function is called. This function checks if all term iterators
12439 ** point to the same rowid, and if not, advances them until they do.
12440 ** If an EOF is reached before this happens, pNode->bEof is set to true before
12441 ** returning.
12442 **
12443 ** SQLITE_OK is returned if successful, or an SQLite error code if an
12444 ** error occurs. It is not considered an error if an iterator reaches
12445 ** EOF.
12446 */
12447 static int fts5ExprNearNextMatch(
12448 Fts5Expr *pExpr, /* Expression pPhrase belongs to */
12449 Fts5ExprNode *pNode
12450 ){
12451 Fts5ExprNearset *pNear = pNode->pNear;
12452 Fts5ExprPhrase *pLeft = pNear->apPhrase[0];
12453 int rc = SQLITE_OK;
12454 i64 iLast; /* Lastest rowid any iterator points to */
12455 int i, j; /* Phrase and token index, respectively */
12456 int bMatch; /* True if all terms are at the same rowid */
12457 const int bDesc = pExpr->bDesc;
12458
12459 /* Check that this node should not be FTS5_TERM */
12460 assert( pNear->nPhrase>1
12461 || pNear->apPhrase[0]->nTerm>1
12462 || pNear->apPhrase[0]->aTerm[0].pSynonym
12463 );
12464
12465 /* Initialize iLast, the "lastest" rowid any iterator points to. If the
12466 ** iterator skips through rowids in the default ascending order, this means
12467 ** the maximum rowid. Or, if the iterator is "ORDER BY rowid DESC", then it
12468 ** means the minimum rowid. */
12469 if( pLeft->aTerm[0].pSynonym ){
12470 iLast = fts5ExprSynonymRowid(&pLeft->aTerm[0], bDesc, 0);
12471 }else{
12472 iLast = sqlite3Fts5IterRowid(pLeft->aTerm[0].pIter);
12473 }
12474
12475 do {
12476 bMatch = 1;
12477 for(i=0; i<pNear->nPhrase; i++){
12478 Fts5ExprPhrase *pPhrase = pNear->apPhrase[i];
12479 for(j=0; j<pPhrase->nTerm; j++){
12480 Fts5ExprTerm *pTerm = &pPhrase->aTerm[j];
12481 if( pTerm->pSynonym ){
12482 i64 iRowid = fts5ExprSynonymRowid(pTerm, bDesc, 0);
12483 if( iRowid==iLast ) continue;
12484 bMatch = 0;
12485 if( fts5ExprSynonymAdvanceto(pTerm, bDesc, &iLast, &rc) ){
12486 pNode->bEof = 1;
12487 return rc;
12488 }
12489 }else{
12490 Fts5IndexIter *pIter = pPhrase->aTerm[j].pIter;
12491 i64 iRowid = sqlite3Fts5IterRowid(pIter);
12492 if( iRowid==iLast ) continue;
12493 bMatch = 0;
12494 if( fts5ExprAdvanceto(pIter, bDesc, &iLast, &rc, &pNode->bEof) ){
12495 return rc;
12496 }
12497 }
12498 }
12499 }
12500 }while( bMatch==0 );
12501
12502 pNode->iRowid = iLast;
12503 pNode->bNomatch = (0==fts5ExprNearTest(&rc, pExpr, pNode));
12504
12505 return rc;
12506 }
12507
12508 /*
12509 ** Initialize all term iterators in the pNear object. If any term is found
12510 ** to match no documents at all, return immediately without initializing any
12511 ** further iterators.
12512 */
12513 static int fts5ExprNearInitAll(
12514 Fts5Expr *pExpr,
12515 Fts5ExprNode *pNode
12516 ){
12517 Fts5ExprNearset *pNear = pNode->pNear;
12518 int i, j;
12519 int rc = SQLITE_OK;
12520
12521 for(i=0; rc==SQLITE_OK && i<pNear->nPhrase; i++){
12522 Fts5ExprPhrase *pPhrase = pNear->apPhrase[i];
12523 for(j=0; j<pPhrase->nTerm; j++){
12524 Fts5ExprTerm *pTerm = &pPhrase->aTerm[j];
12525 Fts5ExprTerm *p;
12526 int bEof = 1;
12527
12528 for(p=pTerm; p && rc==SQLITE_OK; p=p->pSynonym){
12529 if( p->pIter ){
12530 sqlite3Fts5IterClose(p->pIter);
12531 p->pIter = 0;
12532 }
12533 rc = sqlite3Fts5IndexQuery(
12534 pExpr->pIndex, p->zTerm, (int)strlen(p->zTerm),
12535 (pTerm->bPrefix ? FTS5INDEX_QUERY_PREFIX : 0) |
12536 (pExpr->bDesc ? FTS5INDEX_QUERY_DESC : 0),
12537 pNear->pColset,
12538 &p->pIter
12539 );
12540 assert( rc==SQLITE_OK || p->pIter==0 );
12541 if( p->pIter && 0==sqlite3Fts5IterEof(p->pIter) ){
12542 bEof = 0;
12543 }
12544 }
12545
12546 if( bEof ){
12547 pNode->bEof = 1;
12548 return rc;
12549 }
12550 }
12551 }
12552
12553 return rc;
12554 }
12555
12556 /* fts5ExprNodeNext() calls fts5ExprNodeNextMatch(). And vice-versa. */
12557 static int fts5ExprNodeNextMatch(Fts5Expr*, Fts5ExprNode*);
12558
12559
12560 /*
12561 ** If pExpr is an ASC iterator, this function returns a value with the
12562 ** same sign as:
12563 **
12564 ** (iLhs - iRhs)
12565 **
12566 ** Otherwise, if this is a DESC iterator, the opposite is returned:
12567 **
12568 ** (iRhs - iLhs)
12569 */
12570 static int fts5RowidCmp(
12571 Fts5Expr *pExpr,
12572 i64 iLhs,
12573 i64 iRhs
12574 ){
12575 assert( pExpr->bDesc==0 || pExpr->bDesc==1 );
12576 if( pExpr->bDesc==0 ){
12577 if( iLhs<iRhs ) return -1;
12578 return (iLhs > iRhs);
12579 }else{
12580 if( iLhs>iRhs ) return -1;
12581 return (iLhs < iRhs);
12582 }
12583 }
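Note on fts5RowidCmp(): the header comment describes the result as having the same sign as (iLhs - iRhs), but the body compares rather than subtracts. One likely reason is that a 64-bit difference can overflow or be truncated when narrowed to int. The following is a minimal standalone sketch of the same idea; the name `rowid_cmp` and the plain `desc` flag are hypothetical stand-ins for the Fts5Expr-based original.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical standalone variant of the rowid comparison.
** desc is 0 for ascending iteration and 1 for descending iteration.
** Returns -1, 0 or +1 depending on whether lhs comes before, at, or
** after rhs in the iteration order. No subtraction is performed, so
** the result is well defined for the full int64_t range. */
int rowid_cmp(int desc, int64_t lhs, int64_t rhs){
  assert( desc==0 || desc==1 );
  if( desc==0 ){
    if( lhs<rhs ) return -1;
    return (lhs>rhs);
  }else{
    if( lhs>rhs ) return -1;
    return (lhs<rhs);
  }
}
```

With extreme inputs such as INT64_MIN and INT64_MAX, a subtraction-based comparison would overflow, while this form still returns the expected sign.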
12584
12585 static void fts5ExprSetEof(Fts5ExprNode *pNode){
12586 int i;
12587 pNode->bEof = 1;
12588 for(i=0; i<pNode->nChild; i++){
12589 fts5ExprSetEof(pNode->apChild[i]);
12590 }
12591 }
12592
12593 static void fts5ExprNodeZeroPoslist(Fts5ExprNode *pNode){
12594 if( pNode->eType==FTS5_STRING || pNode->eType==FTS5_TERM ){
12595 Fts5ExprNearset *pNear = pNode->pNear;
12596 int i;
12597 for(i=0; i<pNear->nPhrase; i++){
12598 Fts5ExprPhrase *pPhrase = pNear->apPhrase[i];
12599 pPhrase->poslist.n = 0;
12600 }
12601 }else{
12602 int i;
12603 for(i=0; i<pNode->nChild; i++){
12604 fts5ExprNodeZeroPoslist(pNode->apChild[i]);
12605 }
12606 }
12607 }
12608
12609
12610 static int fts5ExprNodeNext(Fts5Expr*, Fts5ExprNode*, int, i64);
12611
12612 /*
12613 ** Argument pNode is an FTS5_AND node.
12614 */
12615 static int fts5ExprAndNextRowid(
12616 Fts5Expr *pExpr, /* Expression pPhrase belongs to */
12617 Fts5ExprNode *pAnd /* FTS5_AND node to advance */
12618 ){
12619 int iChild;
12620 i64 iLast = pAnd->iRowid;
12621 int rc = SQLITE_OK;
12622 int bMatch;
12623
12624 assert( pAnd->bEof==0 );
12625 do {
12626 pAnd->bNomatch = 0;
12627 bMatch = 1;
12628 for(iChild=0; iChild<pAnd->nChild; iChild++){
12629 Fts5ExprNode *pChild = pAnd->apChild[iChild];
12630 if( 0 && pChild->eType==FTS5_STRING ){
12631 /* TODO */
12632 }else{
12633 int cmp = fts5RowidCmp(pExpr, iLast, pChild->iRowid);
12634 if( cmp>0 ){
12635 /* Advance pChild until it points to iLast or laster */
12636 rc = fts5ExprNodeNext(pExpr, pChild, 1, iLast);
12637 if( rc!=SQLITE_OK ) return rc;
12638 }
12639 }
12640
12641 /* If the child node is now at EOF, so is the parent AND node. Otherwise,
12642 ** the child node is guaranteed to have advanced at least as far as
12643 ** rowid iLast. So if it is not at exactly iLast, pChild->iRowid is the
12644 ** new lastest rowid seen so far. */
12645 assert( pChild->bEof || fts5RowidCmp(pExpr, iLast, pChild->iRowid)<=0 );
12646 if( pChild->bEof ){
12647 fts5ExprSetEof(pAnd);
12648 bMatch = 1;
12649 break;
12650 }else if( iLast!=pChild->iRowid ){
12651 bMatch = 0;
12652 iLast = pChild->iRowid;
12653 }
12654
12655 if( pChild->bNomatch ){
12656 pAnd->bNomatch = 1;
12657 }
12658 }
12659 }while( bMatch==0 );
12660
12661 if( pAnd->bNomatch && pAnd!=pExpr->pRoot ){
12662 fts5ExprNodeZeroPoslist(pAnd);
12663 }
12664 pAnd->iRowid = iLast;
12665 return SQLITE_OK;
12666 }
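Note on fts5ExprAndNextRowid(): the loop above is a "leapfrog" intersection. Each child is advanced to the latest rowid seen so far, and the pass repeats until a full pass leaves every child on the same rowid. The sketch below shows the same shape over plain ascending arrays. The `Cursor` type and both function names are hypothetical, and only ascending order is handled (the original also supports DESC through fts5RowidCmp()).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cursor over an ascending array of rowids. */
typedef struct Cursor {
  const int64_t *a;   /* Sorted rowids */
  size_t n;           /* Number of entries in a[] */
  size_t i;           /* Current position */
} Cursor;

/* Advance c until it points at a value >= target.
** Returns 0 if the cursor hits EOF first, 1 otherwise. */
int cursor_seek(Cursor *c, int64_t target){
  while( c->i<c->n && c->a[c->i]<target ) c->i++;
  return c->i<c->n;
}

/* Find the smallest rowid >= *piLast that every cursor contains.
** Returns 1 and stores it in *piLast, or returns 0 if any cursor
** reaches EOF before a common rowid is found. */
int and_next_match(Cursor *aCur, int nCur, int64_t *piLast){
  int64_t iLast = *piLast;
  int bMatch;
  assert( nCur>0 );
  do{
    int k;
    bMatch = 1;
    for(k=0; k<nCur; k++){
      if( !cursor_seek(&aCur[k], iLast) ) return 0;
      if( aCur[k].a[aCur[k].i]!=iLast ){
        /* This cursor overshot: its rowid becomes the new target. */
        bMatch = 0;
        iLast = aCur[k].a[aCur[k].i];
      }
    }
  }while( bMatch==0 );
  *piLast = iLast;
  return 1;
}
```

Because the target only ever moves forward, each cursor is scanned at most once over its whole lifetime, which is the property the AND node relies on.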
12667
12668
12669 /*
12670 ** Compare the values currently indicated by the two nodes as follows:
12671 **
12672 ** res = (*p1) - (*p2)
12673 **
12674 ** Nodes that point to values that come later in the iteration order are
12675 ** considered to be larger. Nodes at EOF are the largest of all.
12676 **
12677 ** This means that if the iteration order is ASC, then numerically larger
12678 ** rowids are considered larger. Or if it is DESC, numerically
12679 ** smaller rowids are larger.
12680 */
12681 static int fts5NodeCompare(
12682 Fts5Expr *pExpr,
12683 Fts5ExprNode *p1,
12684 Fts5ExprNode *p2
12685 ){
12686 if( p2->bEof ) return -1;
12687 if( p1->bEof ) return +1;
12688 return fts5RowidCmp(pExpr, p1->iRowid, p2->iRowid);
12689 }
12690
12691 /*
12692 ** Advance node iterator pNode, part of expression pExpr. If argument
12693 ** bFromValid is zero, then pNode is advanced exactly once. Or, if argument
12694 ** bFromValid is non-zero, then pNode is advanced until it is at or past
12695 ** rowid value iFrom. Whether "past" means "less than" or "greater than"
12696 ** depends on whether this is an ASC or DESC iterator.
12697 */
12698 static int fts5ExprNodeNext(
12699 Fts5Expr *pExpr,
12700 Fts5ExprNode *pNode,
12701 int bFromValid,
12702 i64 iFrom
12703 ){
12704 int rc = SQLITE_OK;
12705
12706 if( pNode->bEof==0 ){
12707 switch( pNode->eType ){
12708 case FTS5_STRING: {
12709 rc = fts5ExprNearAdvanceFirst(pExpr, pNode, bFromValid, iFrom);
12710 break;
12711 };
12712
12713 case FTS5_TERM: {
12714 Fts5IndexIter *pIter = pNode->pNear->apPhrase[0]->aTerm[0].pIter;
12715 if( bFromValid ){
12716 rc = sqlite3Fts5IterNextFrom(pIter, iFrom);
12717 }else{
12718 rc = sqlite3Fts5IterNext(pIter);
12719 }
12720 if( rc==SQLITE_OK && sqlite3Fts5IterEof(pIter)==0 ){
12721 assert( rc==SQLITE_OK );
12722 rc = fts5ExprTokenTest(pExpr, pNode);
12723 }else{
12724 pNode->bEof = 1;
12725 }
12726 return rc;
12727 };
12728
12729 case FTS5_AND: {
12730 Fts5ExprNode *pLeft = pNode->apChild[0];
12731 rc = fts5ExprNodeNext(pExpr, pLeft, bFromValid, iFrom);
12732 break;
12733 }
12734
12735 case FTS5_OR: {
12736 int i;
12737 i64 iLast = pNode->iRowid;
12738
12739 for(i=0; rc==SQLITE_OK && i<pNode->nChild; i++){
12740 Fts5ExprNode *p1 = pNode->apChild[i];
12741 assert( p1->bEof || fts5RowidCmp(pExpr, p1->iRowid, iLast)>=0 );
12742 if( p1->bEof==0 ){
12743 if( (p1->iRowid==iLast)
12744 || (bFromValid && fts5RowidCmp(pExpr, p1->iRowid, iFrom)<0)
12745 ){
12746 rc = fts5ExprNodeNext(pExpr, p1, bFromValid, iFrom);
12747 }
12748 }
12749 }
12750
12751 break;
12752 }
12753
12754 default: assert( pNode->eType==FTS5_NOT ); {
12755 assert( pNode->nChild==2 );
12756 rc = fts5ExprNodeNext(pExpr, pNode->apChild[0], bFromValid, iFrom);
12757 break;
12758 }
12759 }
12760
12761 if( rc==SQLITE_OK ){
12762 rc = fts5ExprNodeNextMatch(pExpr, pNode);
12763 }
12764 }
12765
12766 /* Assert that if bFromValid was true, either:
12767 **
12768 ** a) an error occurred, or
12769 ** b) the node is now at EOF, or
12770 ** c) the node is now at or past rowid iFrom.
12771 */
12772 assert( bFromValid==0
12773 || rc!=SQLITE_OK /* a */
12774 || pNode->bEof /* b */
12775 || pNode->iRowid==iFrom || pExpr->bDesc==(pNode->iRowid<iFrom) /* c */
12776 );
12777
12778 return rc;
12779 }
12780
12781
12782 /*
12783 ** If pNode currently points to a match, this function returns SQLITE_OK
12784 ** without modifying it. Otherwise, pNode is advanced until it does point
12785 ** to a match or EOF is reached.
12786 */
12787 static int fts5ExprNodeNextMatch(
12788 Fts5Expr *pExpr, /* Expression of which pNode is a part */
12789 Fts5ExprNode *pNode /* Expression node to test */
12790 ){
12791 int rc = SQLITE_OK;
12792 if( pNode->bEof==0 ){
12793 switch( pNode->eType ){
12794
12795 case FTS5_STRING: {
12796 /* Advance the iterators until they all point to the same rowid */
12797 rc = fts5ExprNearNextMatch(pExpr, pNode);
12798 break;
12799 }
12800
12801 case FTS5_TERM: {
12802 rc = fts5ExprTokenTest(pExpr, pNode);
12803 break;
12804 }
12805
12806 case FTS5_AND: {
12807 rc = fts5ExprAndNextRowid(pExpr, pNode);
12808 break;
12809 }
12810
12811 case FTS5_OR: {
12812 Fts5ExprNode *pNext = pNode->apChild[0];
12813 int i;
12814
12815 for(i=1; i<pNode->nChild; i++){
12816 Fts5ExprNode *pChild = pNode->apChild[i];
12817 int cmp = fts5NodeCompare(pExpr, pNext, pChild);
12818 if( cmp>0 || (cmp==0 && pChild->bNomatch==0) ){
12819 pNext = pChild;
12820 }
12821 }
12822 pNode->iRowid = pNext->iRowid;
12823 pNode->bEof = pNext->bEof;
12824 pNode->bNomatch = pNext->bNomatch;
12825 break;
12826 }
12827
12828 default: assert( pNode->eType==FTS5_NOT ); {
12829 Fts5ExprNode *p1 = pNode->apChild[0];
12830 Fts5ExprNode *p2 = pNode->apChild[1];
12831 assert( pNode->nChild==2 );
12832
12833 while( rc==SQLITE_OK && p1->bEof==0 ){
12834 int cmp = fts5NodeCompare(pExpr, p1, p2);
12835 if( cmp>0 ){
12836 rc = fts5ExprNodeNext(pExpr, p2, 1, p1->iRowid);
12837 cmp = fts5NodeCompare(pExpr, p1, p2);
12838 }
12839 assert( rc!=SQLITE_OK || cmp<=0 );
12840 if( cmp || p2->bNomatch ) break;
12841 rc = fts5ExprNodeNext(pExpr, p1, 0, 0);
12842 }
12843 pNode->bEof = p1->bEof;
12844 pNode->iRowid = p1->iRowid;
12845 break;
12846 }
12847 }
12848 }
12849 return rc;
12850 }
12851
12852
12853 /*
12854 ** Set node pNode, which is part of expression pExpr, to point to the first
12855 ** match. If there are no matches, set the Node.bEof flag to indicate EOF.
12856 **
12857 ** Return an SQLite error code if an error occurs, or SQLITE_OK otherwise.
12858 ** It is not an error if there are no matches.
12859 */
12860 static int fts5ExprNodeFirst(Fts5Expr *pExpr, Fts5ExprNode *pNode){
12861 int rc = SQLITE_OK;
12862 pNode->bEof = 0;
12863
12864 if( Fts5NodeIsString(pNode) ){
12865 /* Initialize all term iterators in the NEAR object. */
12866 rc = fts5ExprNearInitAll(pExpr, pNode);
12867 }else{
12868 int i;
12869 for(i=0; i<pNode->nChild && rc==SQLITE_OK; i++){
12870 rc = fts5ExprNodeFirst(pExpr, pNode->apChild[i]);
12871 }
12872 pNode->iRowid = pNode->apChild[0]->iRowid;
12873 }
12874
12875 if( rc==SQLITE_OK ){
12876 rc = fts5ExprNodeNextMatch(pExpr, pNode);
12877 }
12878 return rc;
12879 }
12880
12881
12882 /*
12883 ** Begin iterating through the set of documents in index pIdx matched by
12884 ** the MATCH expression passed as the first argument. If the "bDesc"
12885 ** parameter is passed a non-zero value, iteration is in descending rowid
12886 ** order. Or, if it is zero, in ascending order.
12887 **
12888 ** If iterating in ascending rowid order (bDesc==0), the first document
12889 ** visited is that with the smallest rowid that is larger than or equal
12890 ** to parameter iFirst. Or, if iterating in descending order (bDesc==1),
12891 ** then the first document visited must have a rowid smaller than or
12892 ** equal to iFirst.
12893 **
12894 ** Return SQLITE_OK if successful, or an SQLite error code otherwise. It
12895 ** is not considered an error if the query does not match any documents.
12896 */
12897 static int sqlite3Fts5ExprFirst(Fts5Expr *p, Fts5Index *pIdx, i64 iFirst, int bDesc){
12898 Fts5ExprNode *pRoot = p->pRoot;
12899 int rc = SQLITE_OK;
12900 if( pRoot ){
12901 p->pIndex = pIdx;
12902 p->bDesc = bDesc;
12903 rc = fts5ExprNodeFirst(p, pRoot);
12904
12905 /* If not at EOF but the current rowid occurs earlier than iFirst in
12906 ** the iteration order, move to document iFirst or later. */
12907 if( pRoot->bEof==0 && fts5RowidCmp(p, pRoot->iRowid, iFirst)<0 ){
12908 rc = fts5ExprNodeNext(p, pRoot, 1, iFirst);
12909 }
12910
12911 /* If the iterator is not at a real match, skip forward until it is. */
12912 while( pRoot->bNomatch && rc==SQLITE_OK && pRoot->bEof==0 ){
12913 rc = fts5ExprNodeNext(p, pRoot, 0, 0);
12914 }
12915 }
12916 return rc;
12917 }
12918
12919 /*
12920 ** Move to the next document
12921 **
12922 ** Return SQLITE_OK if successful, or an SQLite error code otherwise. It
12923 ** is not considered an error if the query does not match any documents.
12924 */
12925 static int sqlite3Fts5ExprNext(Fts5Expr *p, i64 iLast){
12926 int rc;
12927 Fts5ExprNode *pRoot = p->pRoot;
12928 do {
12929 rc = fts5ExprNodeNext(p, pRoot, 0, 0);
12930 }while( pRoot->bNomatch && pRoot->bEof==0 && rc==SQLITE_OK );
12931 if( fts5RowidCmp(p, pRoot->iRowid, iLast)>0 ){
12932 pRoot->bEof = 1;
12933 }
12934 return rc;
12935 }
12936
12937 static int sqlite3Fts5ExprEof(Fts5Expr *p){
12938 return (p->pRoot==0 || p->pRoot->bEof);
12939 }
12940
12941 static i64 sqlite3Fts5ExprRowid(Fts5Expr *p){
12942 return p->pRoot->iRowid;
12943 }
12944
12945 static int fts5ParseStringFromToken(Fts5Token *pToken, char **pz){
12946 int rc = SQLITE_OK;
12947 *pz = sqlite3Fts5Strndup(&rc, pToken->p, pToken->n);
12948 return rc;
12949 }
12950
12951 /*
12952 ** Free the phrase object passed as the only argument.
12953 */
12954 static void fts5ExprPhraseFree(Fts5ExprPhrase *pPhrase){
12955 if( pPhrase ){
12956 int i;
12957 for(i=0; i<pPhrase->nTerm; i++){
12958 Fts5ExprTerm *pSyn;
12959 Fts5ExprTerm *pNext;
12960 Fts5ExprTerm *pTerm = &pPhrase->aTerm[i];
12961 sqlite3_free(pTerm->zTerm);
12962 sqlite3Fts5IterClose(pTerm->pIter);
12963
12964 for(pSyn=pTerm->pSynonym; pSyn; pSyn=pNext){
12965 pNext = pSyn->pSynonym;
12966 sqlite3Fts5IterClose(pSyn->pIter);
12967 sqlite3_free(pSyn);
12968 }
12969 }
12970 if( pPhrase->poslist.nSpace>0 ) fts5BufferFree(&pPhrase->poslist);
12971 sqlite3_free(pPhrase);
12972 }
12973 }
12974
12975 /*
12976 ** If argument pNear is NULL, then a new Fts5ExprNearset object is allocated
12977 ** and populated with pPhrase. Or, if pNear is not NULL, phrase pPhrase is
12978 ** appended to it and the results returned.
12979 **
12980 ** If an OOM error occurs, both the pNear and pPhrase objects are freed and
12981 ** NULL returned.
12982 */
12983 static Fts5ExprNearset *sqlite3Fts5ParseNearset(
12984 Fts5Parse *pParse, /* Parse context */
12985 Fts5ExprNearset *pNear, /* Existing nearset, or NULL */
12986 Fts5ExprPhrase *pPhrase /* Recently parsed phrase */
12987 ){
12988 const int SZALLOC = 8;
12989 Fts5ExprNearset *pRet = 0;
12990
12991 if( pParse->rc==SQLITE_OK ){
12992 if( pPhrase==0 ){
12993 return pNear;
12994 }
12995 if( pNear==0 ){
12996 int nByte = sizeof(Fts5ExprNearset) + SZALLOC * sizeof(Fts5ExprPhrase*);
12997 pRet = sqlite3_malloc(nByte);
12998 if( pRet==0 ){
12999 pParse->rc = SQLITE_NOMEM;
13000 }else{
13001 memset(pRet, 0, nByte);
13002 }
13003 }else if( (pNear->nPhrase % SZALLOC)==0 ){
13004 int nNew = pNear->nPhrase + SZALLOC;
13005 int nByte = sizeof(Fts5ExprNearset) + nNew * sizeof(Fts5ExprPhrase*);
13006
13007 pRet = (Fts5ExprNearset*)sqlite3_realloc(pNear, nByte);
13008 if( pRet==0 ){
13009 pParse->rc = SQLITE_NOMEM;
13010 }
13011 }else{
13012 pRet = pNear;
13013 }
13014 }
13015
13016 if( pRet==0 ){
13017 assert( pParse->rc!=SQLITE_OK );
13018 sqlite3Fts5ParseNearsetFree(pNear);
13019 sqlite3Fts5ParsePhraseFree(pPhrase);
13020 }else{
13021 pRet->apPhrase[pRet->nPhrase++] = pPhrase;
13022 }
13023 return pRet;
13024 }
13025
13026 typedef struct TokenCtx TokenCtx;
13027 struct TokenCtx {
13028 Fts5ExprPhrase *pPhrase;
13029 int rc;
13030 };
13031
13032 /*
13033 ** Callback for tokenizing terms used by ParseTerm().
13034 */
13035 static int fts5ParseTokenize(
13036 void *pContext, /* Pointer to TokenCtx object */
13037 int tflags, /* Mask of FTS5_TOKEN_* flags */
13038 const char *pToken, /* Buffer containing token */
13039 int nToken, /* Size of token in bytes */
13040 int iUnused1, /* Start offset of token */
13041 int iUnused2 /* End offset of token */
13042 ){
13043 int rc = SQLITE_OK;
13044 const int SZALLOC = 8;
13045 TokenCtx *pCtx = (TokenCtx*)pContext;
13046 Fts5ExprPhrase *pPhrase = pCtx->pPhrase;
13047
13048 /* If an error has already occurred, this is a no-op */
13049 if( pCtx->rc!=SQLITE_OK ) return pCtx->rc;
13050
13051 assert( pPhrase==0 || pPhrase->nTerm>0 );
13052 if( pPhrase && (tflags & FTS5_TOKEN_COLOCATED) ){
13053 Fts5ExprTerm *pSyn;
13054 int nByte = sizeof(Fts5ExprTerm) + nToken+1;
13055 pSyn = (Fts5ExprTerm*)sqlite3_malloc(nByte);
13056 if( pSyn==0 ){
13057 rc = SQLITE_NOMEM;
13058 }else{
13059 memset(pSyn, 0, nByte);
13060 pSyn->zTerm = (char*)&pSyn[1];
13061 memcpy(pSyn->zTerm, pToken, nToken);
13062 pSyn->pSynonym = pPhrase->aTerm[pPhrase->nTerm-1].pSynonym;
13063 pPhrase->aTerm[pPhrase->nTerm-1].pSynonym = pSyn;
13064 }
13065 }else{
13066 Fts5ExprTerm *pTerm;
13067 if( pPhrase==0 || (pPhrase->nTerm % SZALLOC)==0 ){
13068 Fts5ExprPhrase *pNew;
13069 int nNew = SZALLOC + (pPhrase ? pPhrase->nTerm : 0);
13070
13071 pNew = (Fts5ExprPhrase*)sqlite3_realloc(pPhrase,
13072 sizeof(Fts5ExprPhrase) + sizeof(Fts5ExprTerm) * nNew
13073 );
13074 if( pNew==0 ){
13075 rc = SQLITE_NOMEM;
13076 }else{
13077 if( pPhrase==0 ) memset(pNew, 0, sizeof(Fts5ExprPhrase));
13078 pCtx->pPhrase = pPhrase = pNew;
13079 pNew->nTerm = nNew - SZALLOC;
13080 }
13081 }
13082
13083 if( rc==SQLITE_OK ){
13084 pTerm = &pPhrase->aTerm[pPhrase->nTerm++];
13085 memset(pTerm, 0, sizeof(Fts5ExprTerm));
13086 pTerm->zTerm = sqlite3Fts5Strndup(&rc, pToken, nToken);
13087 }
13088 }
13089
13090 pCtx->rc = rc;
13091 return rc;
13092 }
13093
13094
13095 /*
13096 ** Free the phrase object passed as the only argument.
13097 */
13098 static void sqlite3Fts5ParsePhraseFree(Fts5ExprPhrase *pPhrase){
13099 fts5ExprPhraseFree(pPhrase);
13100 }
13101
13102 /*
13103 ** Free the nearset object passed as the only argument.
13104 */
13105 static void sqlite3Fts5ParseNearsetFree(Fts5ExprNearset *pNear){
13106 if( pNear ){
13107 int i;
13108 for(i=0; i<pNear->nPhrase; i++){
13109 fts5ExprPhraseFree(pNear->apPhrase[i]);
13110 }
13111 sqlite3_free(pNear->pColset);
13112 sqlite3_free(pNear);
13113 }
13114 }
13115
13116 static void sqlite3Fts5ParseFinished(Fts5Parse *pParse, Fts5ExprNode *p){
13117 assert( pParse->pExpr==0 );
13118 pParse->pExpr = p;
13119 }
13120
13121 /*
13122 ** This function is called by the parser to process a string token. The
13123 ** string may or may not be quoted. In any case it is tokenized and a
13124 ** phrase object consisting of all tokens returned.
13125 */
13126 static Fts5ExprPhrase *sqlite3Fts5ParseTerm(
13127 Fts5Parse *pParse, /* Parse context */
13128 Fts5ExprPhrase *pAppend, /* Phrase to append to */
13129 Fts5Token *pToken, /* String to tokenize */
13130 int bPrefix /* True if there is a trailing "*" */
13131 ){
13132 Fts5Config *pConfig = pParse->pConfig;
13133 TokenCtx sCtx; /* Context object passed to callback */
13134 int rc; /* Tokenize return code */
13135 char *z = 0;
13136
13137 memset(&sCtx, 0, sizeof(TokenCtx));
13138 sCtx.pPhrase = pAppend;
13139
13140 rc = fts5ParseStringFromToken(pToken, &z);
13141 if( rc==SQLITE_OK ){
13142 int flags = FTS5_TOKENIZE_QUERY | (bPrefix ? FTS5_TOKENIZE_PREFIX : 0);
13143 int n;
13144 sqlite3Fts5Dequote(z);
13145 n = (int)strlen(z);
13146 rc = sqlite3Fts5Tokenize(pConfig, flags, z, n, &sCtx, fts5ParseTokenize);
13147 }
13148 sqlite3_free(z);
13149 if( rc || (rc = sCtx.rc) ){
13150 pParse->rc = rc;
13151 fts5ExprPhraseFree(sCtx.pPhrase);
13152 sCtx.pPhrase = 0;
13153 }else if( sCtx.pPhrase ){
13154
13155 if( pAppend==0 ){
13156 if( (pParse->nPhrase % 8)==0 ){
13157 int nByte = sizeof(Fts5ExprPhrase*) * (pParse->nPhrase + 8);
13158 Fts5ExprPhrase **apNew;
13159 apNew = (Fts5ExprPhrase**)sqlite3_realloc(pParse->apPhrase, nByte);
13160 if( apNew==0 ){
13161 pParse->rc = SQLITE_NOMEM;
13162 fts5ExprPhraseFree(sCtx.pPhrase);
13163 return 0;
13164 }
13165 pParse->apPhrase = apNew;
13166 }
13167 pParse->nPhrase++;
13168 }
13169
13170 pParse->apPhrase[pParse->nPhrase-1] = sCtx.pPhrase;
13171 assert( sCtx.pPhrase->nTerm>0 );
13172 sCtx.pPhrase->aTerm[sCtx.pPhrase->nTerm-1].bPrefix = bPrefix;
13173 }
13174
13175 return sCtx.pPhrase;
13176 }
13177
13178 /*
13179 ** Create a new FTS5 expression by cloning phrase iPhrase of the
13180 ** expression passed as the second argument.
13181 */
13182 static int sqlite3Fts5ExprClonePhrase(
13183 Fts5Config *pConfig,
13184 Fts5Expr *pExpr,
13185 int iPhrase,
13186 Fts5Expr **ppNew
13187 ){
13188 int rc = SQLITE_OK; /* Return code */
13189 Fts5ExprPhrase *pOrig; /* The phrase extracted from pExpr */
13190 int i; /* Used to iterate through phrase terms */
13191
13192 Fts5Expr *pNew = 0; /* Expression to return via *ppNew */
13193
13194 TokenCtx sCtx = {0,0}; /* Context object for fts5ParseTokenize */
13195
13196
13197 pOrig = pExpr->apExprPhrase[iPhrase];
13198
13199 pNew = (Fts5Expr*)sqlite3Fts5MallocZero(&rc, sizeof(Fts5Expr));
13200 if( rc==SQLITE_OK ){
13201 pNew->apExprPhrase = (Fts5ExprPhrase**)sqlite3Fts5MallocZero(&rc,
13202 sizeof(Fts5ExprPhrase*));
13203 }
13204 if( rc==SQLITE_OK ){
13205 pNew->pRoot = (Fts5ExprNode*)sqlite3Fts5MallocZero(&rc,
13206 sizeof(Fts5ExprNode));
13207 }
13208 if( rc==SQLITE_OK ){
13209 pNew->pRoot->pNear = (Fts5ExprNearset*)sqlite3Fts5MallocZero(&rc,
13210 sizeof(Fts5ExprNearset) + sizeof(Fts5ExprPhrase*));
13211 }
13212
13213 for(i=0; rc==SQLITE_OK && i<pOrig->nTerm; i++){
13214 int tflags = 0;
13215 Fts5ExprTerm *p;
13216 for(p=&pOrig->aTerm[i]; p && rc==SQLITE_OK; p=p->pSynonym){
13217 const char *zTerm = p->zTerm;
13218 rc = fts5ParseTokenize((void*)&sCtx, tflags, zTerm, (int)strlen(zTerm),
13219 0, 0);
13220 tflags = FTS5_TOKEN_COLOCATED;
13221 }
13222 if( rc==SQLITE_OK ){
13223 sCtx.pPhrase->aTerm[i].bPrefix = pOrig->aTerm[i].bPrefix;
13224 }
13225 }
13226
13227 if( rc==SQLITE_OK ){
13228 /* All the allocations succeeded. Put the expression object together. */
13229 pNew->pIndex = pExpr->pIndex;
13230 pNew->nPhrase = 1;
13231 pNew->apExprPhrase[0] = sCtx.pPhrase;
13232 pNew->pRoot->pNear->apPhrase[0] = sCtx.pPhrase;
13233 pNew->pRoot->pNear->nPhrase = 1;
13234 sCtx.pPhrase->pNode = pNew->pRoot;
13235
13236 if( pOrig->nTerm==1 && pOrig->aTerm[0].pSynonym==0 ){
13237 pNew->pRoot->eType = FTS5_TERM;
13238 }else{
13239 pNew->pRoot->eType = FTS5_STRING;
13240 }
13241 }else{
13242 sqlite3Fts5ExprFree(pNew);
13243 fts5ExprPhraseFree(sCtx.pPhrase);
13244 pNew = 0;
13245 }
13246
13247 *ppNew = pNew;
13248 return rc;
13249 }
13250
13251
13252 /*
13253 ** Token pTok has appeared in a MATCH expression where the NEAR operator
13254 ** is expected. If token pTok does not contain "NEAR", store an error
13255 ** in the pParse object.
13256 */
13257 static void sqlite3Fts5ParseNear(Fts5Parse *pParse, Fts5Token *pTok){
13258 if( pTok->n!=4 || memcmp("NEAR", pTok->p, 4) ){
13259 sqlite3Fts5ParseError(
13260 pParse, "fts5: syntax error near \"%.*s\"", pTok->n, pTok->p
13261 );
13262 }
13263 }
13264
13265 static void sqlite3Fts5ParseSetDistance(
13266 Fts5Parse *pParse,
13267 Fts5ExprNearset *pNear,
13268 Fts5Token *p
13269 ){
13270 int nNear = 0;
13271 int i;
13272 if( p->n ){
13273 for(i=0; i<p->n; i++){
13274 char c = (char)p->p[i];
13275 if( c<'0' || c>'9' ){
13276 sqlite3Fts5ParseError(
13277 pParse, "expected integer, got \"%.*s\"", p->n, p->p
13278 );
13279 return;
13280 }
13281 nNear = nNear * 10 + (p->p[i] - '0');
13282 }
13283 }else{
13284 nNear = FTS5_DEFAULT_NEARDIST;
13285 }
13286 pNear->nNear = nNear;
13287 }
13288
13289 /*
13290 ** The second argument passed to this function may be NULL, or it may be
13291 ** an existing Fts5Colset object. This function returns a pointer to
13292 ** a colset object containing the contents of (p) plus column number iCol,
13293 ** inserted so that the array stays sorted and free of duplicates.
13294 **
13295 ** If an OOM error occurs, store an error code in pParse and return NULL.
13296 ** The old colset object (if any) is not freed in this case.
13297 */
13298 static Fts5Colset *fts5ParseColset(
13299 Fts5Parse *pParse, /* Store SQLITE_NOMEM here if required */
13300 Fts5Colset *p, /* Existing colset object */
13301 int iCol /* New column to add to colset object */
13302 ){
13303 int nCol = p ? p->nCol : 0; /* Num. columns already in colset object */
13304 Fts5Colset *pNew; /* New colset object to return */
13305
13306 assert( pParse->rc==SQLITE_OK );
13307 assert( iCol>=0 && iCol<pParse->pConfig->nCol );
13308
13309 pNew = sqlite3_realloc(p, sizeof(Fts5Colset) + sizeof(int)*nCol);
13310 if( pNew==0 ){
13311 pParse->rc = SQLITE_NOMEM;
13312 }else{
13313 int *aiCol = pNew->aiCol;
13314 int i, j;
13315 for(i=0; i<nCol; i++){
13316 if( aiCol[i]==iCol ) return pNew;
13317 if( aiCol[i]>iCol ) break;
13318 }
13319 for(j=nCol; j>i; j--){
13320 aiCol[j] = aiCol[j-1];
13321 }
13322 aiCol[i] = iCol;
13323 pNew->nCol = nCol+1;
13324
13325 #ifndef NDEBUG
13326 /* Check that the array is in order and contains no duplicate entries. */
13327 for(i=1; i<pNew->nCol; i++) assert( pNew->aiCol[i]>pNew->aiCol[i-1] );
13328 #endif
13329 }
13330
13331 return pNew;
13332 }
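
Review note: the sorted, duplicate-free insert performed by fts5ParseColset() can be sketched in isolation as below. This is a minimal illustration and not the FTS5 code: `IntSet` and `intset_add` are hypothetical names, and plain `realloc()` stands in for `sqlite3_realloc()`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for Fts5Colset: a count plus a sorted array. */
typedef struct IntSet {
  int n;        /* Number of values currently in a[] */
  int *a;       /* Values in strictly ascending order */
} IntSet;

/* Add v to the set, keeping a[] sorted and free of duplicates.
** Returns 0 on success, or -1 if realloc() fails (set left unchanged). */
static int intset_add(IntSet *p, int v){
  int i, j;
  int *aNew = realloc(p->a, sizeof(int) * (size_t)(p->n + 1));
  if( aNew==0 ) return -1;
  p->a = aNew;

  /* Find the insertion point; stop early if v is already present. */
  for(i=0; i<p->n; i++){
    if( aNew[i]==v ) return 0;
    if( aNew[i]>v ) break;
  }

  /* Shift the tail up by one slot and store v in the gap. */
  for(j=p->n; j>i; j--) aNew[j] = aNew[j-1];
  aNew[i] = v;
  p->n++;

  /* Same invariant the original checks under #ifndef NDEBUG. */
  for(i=1; i<p->n; i++) assert( aNew[i]>aNew[i-1] );
  return 0;
}
```

Starting from an empty `IntSet` (both fields zero), adding 3, 1, 3, 2 leaves the array holding 1, 2, 3. The caller frees `a` with `free()`.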
13333
13334 static Fts5Colset *sqlite3Fts5ParseColset(
13335 Fts5Parse *pParse, /* Store SQLITE_NOMEM here if required */
13336 Fts5Colset *pColset, /* Existing colset object */
13337 Fts5Token *p
13338 ){
13339 Fts5Colset *pRet = 0;
13340 int iCol;
13341 char *z; /* Dequoted copy of token p */
13342
13343 z = sqlite3Fts5Strndup(&pParse->rc, p->p, p->n);
13344 if( pParse->rc==SQLITE_OK ){
13345 Fts5Config *pConfig = pParse->pConfig;
13346 sqlite3Fts5Dequote(z);
13347 for(iCol=0; iCol<pConfig->nCol; iCol++){
13348 if( 0==sqlite3_stricmp(pConfig->azCol[iCol], z) ) break;
13349 }
13350 if( iCol==pConfig->nCol ){
13351 sqlite3Fts5ParseError(pParse, "no such column: %s", z);
13352 }else{
13353 pRet = fts5ParseColset(pParse, pColset, iCol);
13354 }
13355 sqlite3_free(z);
13356 }
13357
13358 if( pRet==0 ){
13359 assert( pParse->rc!=SQLITE_OK );
13360 sqlite3_free(pColset);
13361 }
13362
13363 return pRet;
13364 }
13365
13366 static void sqlite3Fts5ParseSetColset(
13367 Fts5Parse *pParse,
13368 Fts5ExprNearset *pNear,
13369 Fts5Colset *pColset
13370 ){
13371 if( pNear ){
13372 pNear->pColset = pColset;
13373 }else{
13374 sqlite3_free(pColset);
13375 }
13376 }
13377
13378 static void fts5ExprAddChildren(Fts5ExprNode *p, Fts5ExprNode *pSub){
13379 if( p->eType!=FTS5_NOT && pSub->eType==p->eType ){
13380 int nByte = sizeof(Fts5ExprNode*) * pSub->nChild;
13381 memcpy(&p->apChild[p->nChild], pSub->apChild, nByte);
13382 p->nChild += pSub->nChild;
13383 sqlite3_free(pSub);
13384 }else{
13385 p->apChild[p->nChild++] = pSub;
13386 }
13387 }
13388
13389 /*
13390 ** Allocate and return a new expression object. If anything goes wrong (i.e.
13391 ** OOM error), leave an error code in pParse and return NULL.
13392 */
13393 static Fts5ExprNode *sqlite3Fts5ParseNode(
13394 Fts5Parse *pParse, /* Parse context */
13395 int eType, /* FTS5_STRING, AND, OR or NOT */
13396 Fts5ExprNode *pLeft, /* Left hand child expression */
13397 Fts5ExprNode *pRight, /* Right hand child expression */
13398 Fts5ExprNearset *pNear /* For STRING expressions, the near cluster */
13399 ){
13400 Fts5ExprNode *pRet = 0;
13401
13402 if( pParse->rc==SQLITE_OK ){
13403 int nChild = 0; /* Number of children of returned node */
13404 int nByte; /* Bytes of space to allocate for this node */
13405
13406 assert( (eType!=FTS5_STRING && !pNear)
13407 || (eType==FTS5_STRING && !pLeft && !pRight)
13408 );
13409 if( eType==FTS5_STRING && pNear==0 ) return 0;
13410 if( eType!=FTS5_STRING && pLeft==0 ) return pRight;
13411 if( eType!=FTS5_STRING && pRight==0 ) return pLeft;
13412
13413 if( eType==FTS5_NOT ){
13414 nChild = 2;
13415 }else if( eType==FTS5_AND || eType==FTS5_OR ){
13416 nChild = 2;
13417 if( pLeft->eType==eType ) nChild += pLeft->nChild-1;
13418 if( pRight->eType==eType ) nChild += pRight->nChild-1;
13419 }
13420
13421 nByte = sizeof(Fts5ExprNode) + sizeof(Fts5ExprNode*)*(nChild-1);
13422 pRet = (Fts5ExprNode*)sqlite3Fts5MallocZero(&pParse->rc, nByte);
13423
13424 if( pRet ){
13425 pRet->eType = eType;
13426 pRet->pNear = pNear;
13427 if( eType==FTS5_STRING ){
13428 int iPhrase;
13429 for(iPhrase=0; iPhrase<pNear->nPhrase; iPhrase++){
13430 pNear->apPhrase[iPhrase]->pNode = pRet;
13431 }
13432 if( pNear->nPhrase==1
13433 && pNear->apPhrase[0]->nTerm==1
13434 && pNear->apPhrase[0]->aTerm[0].pSynonym==0
13435 ){
13436 pRet->eType = FTS5_TERM;
13437 }
13438 }else{
13439 fts5ExprAddChildren(pRet, pLeft);
13440 fts5ExprAddChildren(pRet, pRight);
13441 }
13442 }
13443 }
13444
13445 if( pRet==0 ){
13446 assert( pParse->rc!=SQLITE_OK );
13447 sqlite3Fts5ParseNodeFree(pLeft);
13448 sqlite3Fts5ParseNodeFree(pRight);
13449 sqlite3Fts5ParseNearsetFree(pNear);
13450 }
13451 return pRet;
13452 }
13453
13454 static char *fts5ExprTermPrint(Fts5ExprTerm *pTerm){
13455 int nByte = 0;
13456 Fts5ExprTerm *p;
13457 char *zQuoted;
13458
13459 /* Determine the maximum amount of space required. */
13460 for(p=pTerm; p; p=p->pSynonym){
13461 nByte += (int)strlen(p->zTerm) * 2 + 3 + 2;
13462 }
13463 zQuoted = sqlite3_malloc(nByte);
13464
13465 if( zQuoted ){
13466 int i = 0;
13467 for(p=pTerm; p; p=p->pSynonym){
13468 char *zIn = p->zTerm;
13469 zQuoted[i++] = '"';
13470 while( *zIn ){
13471 if( *zIn=='"' ) zQuoted[i++] = '"';
13472 zQuoted[i++] = *zIn++;
13473 }
13474 zQuoted[i++] = '"';
13475 if( p->pSynonym ) zQuoted[i++] = '|';
13476 }
13477 if( pTerm->bPrefix ){
13478 zQuoted[i++] = ' ';
13479 zQuoted[i++] = '*';
13480 }
13481 zQuoted[i++] = '\0';
13482 }
13483 return zQuoted;
13484 }
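
Review note: the quoting rule used by fts5ExprTermPrint() (wrap the term in double quotes and double any embedded quote) is sketched below for a single term, without the synonym and prefix handling. `quote_term` is a hypothetical helper and uses `malloc()` in place of `sqlite3_malloc()`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a malloc'd copy of zIn wrapped in double quotes, with every
** embedded '"' doubled. Returns NULL on allocation failure. The caller
** frees the result with free(). */
static char *quote_term(const char *zIn){
  size_t nIn = strlen(zIn);
  /* Worst case: every byte is a quote (2*nIn), plus the two enclosing
  ** quotes and the nul terminator. */
  char *zOut = malloc(nIn*2 + 3);
  size_t i = 0;
  if( zOut==0 ) return 0;

  zOut[i++] = '"';
  while( *zIn ){
    if( *zIn=='"' ) zOut[i++] = '"';
    zOut[i++] = *zIn++;
  }
  zOut[i++] = '"';
  assert( i<=nIn*2+2 );
  zOut[i] = '\0';
  return zOut;
}
```

The size bound is computed from the string actually being copied, which is the property the per-synonym loop in the original needs to preserve.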
13485
13486 static char *fts5PrintfAppend(char *zApp, const char *zFmt, ...){
13487 char *zNew;
13488 va_list ap;
13489 va_start(ap, zFmt);
13490 zNew = sqlite3_vmprintf(zFmt, ap);
13491 va_end(ap);
13492 if( zApp && zNew ){
13493 char *zNew2 = sqlite3_mprintf("%s%s", zApp, zNew);
13494 sqlite3_free(zNew);
13495 zNew = zNew2;
13496 }
13497 sqlite3_free(zApp);
13498 return zNew;
13499 }
13500
13501 /*
13502 ** Compose a tcl-readable representation of expression pExpr. Return a
13503 ** pointer to a buffer containing that representation. It is the
13504 ** responsibility of the caller to at some point free the buffer using
13505 ** sqlite3_free().
13506 */
13507 static char *fts5ExprPrintTcl(
13508 Fts5Config *pConfig,
13509 const char *zNearsetCmd,
13510 Fts5ExprNode *pExpr
13511 ){
13512 char *zRet = 0;
13513 if( pExpr->eType==FTS5_STRING || pExpr->eType==FTS5_TERM ){
13514 Fts5ExprNearset *pNear = pExpr->pNear;
13515 int i;
13516 int iTerm;
13517
13518 zRet = fts5PrintfAppend(zRet, "%s ", zNearsetCmd);
13519 if( zRet==0 ) return 0;
13520 if( pNear->pColset ){
13521 int *aiCol = pNear->pColset->aiCol;
13522 int nCol = pNear->pColset->nCol;
13523 if( nCol==1 ){
13524 zRet = fts5PrintfAppend(zRet, "-col %d ", aiCol[0]);
13525 }else{
13526 zRet = fts5PrintfAppend(zRet, "-col {%d", aiCol[0]);
13527 for(i=1; i<pNear->pColset->nCol; i++){
13528 zRet = fts5PrintfAppend(zRet, " %d", aiCol[i]);
13529 }
13530 zRet = fts5PrintfAppend(zRet, "} ");
13531 }
13532 if( zRet==0 ) return 0;
13533 }
13534
13535 if( pNear->nPhrase>1 ){
13536 zRet = fts5PrintfAppend(zRet, "-near %d ", pNear->nNear);
13537 if( zRet==0 ) return 0;
13538 }
13539
13540 zRet = fts5PrintfAppend(zRet, "--");
13541 if( zRet==0 ) return 0;
13542
13543 for(i=0; i<pNear->nPhrase; i++){
13544 Fts5ExprPhrase *pPhrase = pNear->apPhrase[i];
13545
13546 zRet = fts5PrintfAppend(zRet, " {");
13547 for(iTerm=0; zRet && iTerm<pPhrase->nTerm; iTerm++){
13548 char *zTerm = pPhrase->aTerm[iTerm].zTerm;
13549 zRet = fts5PrintfAppend(zRet, "%s%s", iTerm==0?"":" ", zTerm);
13550 }
13551
13552 if( zRet ) zRet = fts5PrintfAppend(zRet, "}");
13553 if( zRet==0 ) return 0;
13554 }
13555
13556 }else{
13557 char const *zOp = 0;
13558 int i;
13559 switch( pExpr->eType ){
13560 case FTS5_AND: zOp = "AND"; break;
13561 case FTS5_NOT: zOp = "NOT"; break;
13562 default:
13563 assert( pExpr->eType==FTS5_OR );
13564 zOp = "OR";
13565 break;
13566 }
13567
13568 zRet = sqlite3_mprintf("%s", zOp);
13569 for(i=0; zRet && i<pExpr->nChild; i++){
13570 char *z = fts5ExprPrintTcl(pConfig, zNearsetCmd, pExpr->apChild[i]);
13571 if( !z ){
13572 sqlite3_free(zRet);
13573 zRet = 0;
13574 }else{
13575 zRet = fts5PrintfAppend(zRet, " [%z]", z);
13576 }
13577 }
13578 }
13579
13580 return zRet;
13581 }
13582
13583 static char *fts5ExprPrint(Fts5Config *pConfig, Fts5ExprNode *pExpr){
13584 char *zRet = 0;
13585 if( pExpr->eType==FTS5_STRING || pExpr->eType==FTS5_TERM ){
13586 Fts5ExprNearset *pNear = pExpr->pNear;
13587 int i;
13588 int iTerm;
13589
13590 if( pNear->pColset ){
13591 int iCol = pNear->pColset->aiCol[0];
13592 zRet = fts5PrintfAppend(zRet, "%s : ", pConfig->azCol[iCol]);
13593 if( zRet==0 ) return 0;
13594 }
13595
13596 if( pNear->nPhrase>1 ){
13597 zRet = fts5PrintfAppend(zRet, "NEAR(");
13598 if( zRet==0 ) return 0;
13599 }
13600
13601 for(i=0; i<pNear->nPhrase; i++){
13602 Fts5ExprPhrase *pPhrase = pNear->apPhrase[i];
13603 if( i!=0 ){
13604 zRet = fts5PrintfAppend(zRet, " ");
13605 if( zRet==0 ) return 0;
13606 }
13607 for(iTerm=0; iTerm<pPhrase->nTerm; iTerm++){
13608 char *zTerm = fts5ExprTermPrint(&pPhrase->aTerm[iTerm]);
13609 if( zTerm ){
13610 zRet = fts5PrintfAppend(zRet, "%s%s", iTerm==0?"":" + ", zTerm);
13611 sqlite3_free(zTerm);
13612 }
13613 if( zTerm==0 || zRet==0 ){
13614 sqlite3_free(zRet);
13615 return 0;
13616 }
13617 }
13618 }
13619
13620 if( pNear->nPhrase>1 ){
13621 zRet = fts5PrintfAppend(zRet, ", %d)", pNear->nNear);
13622 if( zRet==0 ) return 0;
13623 }
13624
13625 }else{
13626 char const *zOp = 0;
13627 int i;
13628
13629 switch( pExpr->eType ){
13630 case FTS5_AND: zOp = " AND "; break;
13631 case FTS5_NOT: zOp = " NOT "; break;
13632 default:
13633 assert( pExpr->eType==FTS5_OR );
13634 zOp = " OR ";
13635 break;
13636 }
13637
13638 for(i=0; i<pExpr->nChild; i++){
13639 char *z = fts5ExprPrint(pConfig, pExpr->apChild[i]);
13640 if( z==0 ){
13641 sqlite3_free(zRet);
13642 zRet = 0;
13643 }else{
13644 int e = pExpr->apChild[i]->eType;
13645 int b = (e!=FTS5_STRING && e!=FTS5_TERM);
13646 zRet = fts5PrintfAppend(zRet, "%s%s%z%s",
13647 (i==0 ? "" : zOp),
13648 (b?"(":""), z, (b?")":"")
13649 );
13650 }
13651 if( zRet==0 ) break;
13652 }
13653 }
13654
13655 return zRet;
13656 }
13657
13658 /*
13659 ** The implementation of user-defined scalar functions fts5_expr() (bTcl==0)
13660 ** and fts5_expr_tcl() (bTcl!=0).
13661 */
13662 static void fts5ExprFunction(
13663 sqlite3_context *pCtx, /* Function call context */
13664 int nArg, /* Number of args */
13665 sqlite3_value **apVal, /* Function arguments */
13666 int bTcl
13667 ){
13668 Fts5Global *pGlobal = (Fts5Global*)sqlite3_user_data(pCtx);
13669 sqlite3 *db = sqlite3_context_db_handle(pCtx);
13670 const char *zExpr = 0;
13671 char *zErr = 0;
13672 Fts5Expr *pExpr = 0;
13673 int rc;
13674 int i;
13675
13676 const char **azConfig; /* Array of arguments for Fts5Config */
13677 const char *zNearsetCmd = "nearset";
13678 int nConfig; /* Size of azConfig[] */
13679 Fts5Config *pConfig = 0;
13680 int iArg = 1;
13681
13682 if( nArg<1 ){
13683 zErr = sqlite3_mprintf("wrong number of arguments to function %s",
13684 bTcl ? "fts5_expr_tcl" : "fts5_expr"
13685 );
13686 sqlite3_result_error(pCtx, zErr, -1);
13687 sqlite3_free(zErr);
13688 return;
13689 }
13690
13691 if( bTcl && nArg>1 ){
13692 zNearsetCmd = (const char*)sqlite3_value_text(apVal[1]);
13693 iArg = 2;
13694 }
13695
13696 nConfig = 3 + (nArg-iArg);
13697 azConfig = (const char**)sqlite3_malloc(sizeof(char*) * nConfig);
13698 if( azConfig==0 ){
13699 sqlite3_result_error_nomem(pCtx);
13700 return;
13701 }
13702 azConfig[0] = 0;
13703 azConfig[1] = "main";
13704 azConfig[2] = "tbl";
13705 for(i=3; iArg<nArg; iArg++){
13706 azConfig[i++] = (const char*)sqlite3_value_text(apVal[iArg]);
13707 }
13708
13709 zExpr = (const char*)sqlite3_value_text(apVal[0]);
13710
13711 rc = sqlite3Fts5ConfigParse(pGlobal, db, nConfig, azConfig, &pConfig, &zErr);
13712 if( rc==SQLITE_OK ){
13713 rc = sqlite3Fts5ExprNew(pConfig, zExpr, &pExpr, &zErr);
13714 }
13715 if( rc==SQLITE_OK ){
13716 char *zText;
13717 if( pExpr->pRoot==0 ){
13718 zText = sqlite3_mprintf("");
13719 }else if( bTcl ){
13720 zText = fts5ExprPrintTcl(pConfig, zNearsetCmd, pExpr->pRoot);
13721 }else{
13722 zText = fts5ExprPrint(pConfig, pExpr->pRoot);
13723 }
13724 if( zText==0 ){
13725 rc = SQLITE_NOMEM;
13726 }else{
13727 sqlite3_result_text(pCtx, zText, -1, SQLITE_TRANSIENT);
13728 sqlite3_free(zText);
13729 }
13730 }
13731
13732 if( rc!=SQLITE_OK ){
13733 if( zErr ){
13734 sqlite3_result_error(pCtx, zErr, -1);
13735 sqlite3_free(zErr);
13736 }else{
13737 sqlite3_result_error_code(pCtx, rc);
13738 }
13739 }
13740 sqlite3_free((void *)azConfig);
13741 sqlite3Fts5ConfigFree(pConfig);
13742 sqlite3Fts5ExprFree(pExpr);
13743 }
13744
13745 static void fts5ExprFunctionHr(
13746 sqlite3_context *pCtx, /* Function call context */
13747 int nArg, /* Number of args */
13748 sqlite3_value **apVal /* Function arguments */
13749 ){
13750 fts5ExprFunction(pCtx, nArg, apVal, 0);
13751 }
13752 static void fts5ExprFunctionTcl(
13753 sqlite3_context *pCtx, /* Function call context */
13754 int nArg, /* Number of args */
13755 sqlite3_value **apVal /* Function arguments */
13756 ){
13757 fts5ExprFunction(pCtx, nArg, apVal, 1);
13758 }
13759
13760 /*
13761 ** The implementation of an SQLite user-defined-function that accepts a
13762 ** single integer as an argument. If the integer is an alpha-numeric
13763 ** unicode code point, 1 is returned. Otherwise 0.
13764 */
13765 static void fts5ExprIsAlnum(
13766 sqlite3_context *pCtx, /* Function call context */
13767 int nArg, /* Number of args */
13768 sqlite3_value **apVal /* Function arguments */
13769 ){
13770 int iCode;
13771 if( nArg!=1 ){
13772 sqlite3_result_error(pCtx,
13773 "wrong number of arguments to function fts5_isalnum", -1
13774 );
13775 return;
13776 }
13777 iCode = sqlite3_value_int(apVal[0]);
13778 sqlite3_result_int(pCtx, sqlite3Fts5UnicodeIsalnum(iCode));
13779 }
13780
13781 static void fts5ExprFold(
13782 sqlite3_context *pCtx, /* Function call context */
13783 int nArg, /* Number of args */
13784 sqlite3_value **apVal /* Function arguments */
13785 ){
13786 if( nArg!=1 && nArg!=2 ){
13787 sqlite3_result_error(pCtx,
13788 "wrong number of arguments to function fts5_fold", -1
13789 );
13790 }else{
13791 int iCode;
13792 int bRemoveDiacritics = 0;
13793 iCode = sqlite3_value_int(apVal[0]);
13794 if( nArg==2 ) bRemoveDiacritics = sqlite3_value_int(apVal[1]);
13795 sqlite3_result_int(pCtx, sqlite3Fts5UnicodeFold(iCode, bRemoveDiacritics));
13796 }
13797 }
13798
13799 /*
13800 ** This is called during initialization to register the scalar UDFs fts5_expr(),
13801 ** fts5_expr_tcl(), fts5_isalnum() and fts5_fold() with SQLite handle db.
13802 */
13803 static int sqlite3Fts5ExprInit(Fts5Global *pGlobal, sqlite3 *db){
13804 struct Fts5ExprFunc {
13805 const char *z;
13806 void (*x)(sqlite3_context*,int,sqlite3_value**);
13807 } aFunc[] = {
13808 { "fts5_expr", fts5ExprFunctionHr },
13809 { "fts5_expr_tcl", fts5ExprFunctionTcl },
13810 { "fts5_isalnum", fts5ExprIsAlnum },
13811 { "fts5_fold", fts5ExprFold },
13812 };
13813 int i;
13814 int rc = SQLITE_OK;
13815 void *pCtx = (void*)pGlobal;
13816
13817 for(i=0; rc==SQLITE_OK && i<(int)ArraySize(aFunc); i++){
13818 struct Fts5ExprFunc *p = &aFunc[i];
13819 rc = sqlite3_create_function(db, p->z, -1, SQLITE_UTF8, pCtx, p->x, 0, 0);
13820 }
13821
13822 /* Avoid a warning indicating that sqlite3Fts5ParserTrace() is unused */
13823 #ifndef NDEBUG
13824 (void)sqlite3Fts5ParserTrace;
13825 #endif
13826
13827 return rc;
13828 }
13829
13830 /*
13831 ** Return the number of phrases in expression pExpr.
13832 */
13833 static int sqlite3Fts5ExprPhraseCount(Fts5Expr *pExpr){
13834 return (pExpr ? pExpr->nPhrase : 0);
13835 }
13836
13837 /*
13838 ** Return the number of terms in the iPhrase'th phrase in pExpr.
13839 */
13840 static int sqlite3Fts5ExprPhraseSize(Fts5Expr *pExpr, int iPhrase){
13841 if( iPhrase<0 || iPhrase>=pExpr->nPhrase ) return 0;
13842 return pExpr->apExprPhrase[iPhrase]->nTerm;
13843 }
13844
13845 /*
13846 ** This function is used to access the current position list for phrase
13847 ** iPhrase.
13848 */
13849 static int sqlite3Fts5ExprPoslist(Fts5Expr *pExpr, int iPhrase, const u8 **pa){
13850 int nRet;
13851 Fts5ExprPhrase *pPhrase = pExpr->apExprPhrase[iPhrase];
13852 Fts5ExprNode *pNode = pPhrase->pNode;
13853 if( pNode->bEof==0 && pNode->iRowid==pExpr->pRoot->iRowid ){
13854 *pa = pPhrase->poslist.p;
13855 nRet = pPhrase->poslist.n;
13856 }else{
13857 *pa = 0;
13858 nRet = 0;
13859 }
13860 return nRet;
13861 }
13862
13863 /*
13864 ** 2014 August 11
13865 **
13866 ** The author disclaims copyright to this source code. In place of
13867 ** a legal notice, here is a blessing:
13868 **
13869 ** May you do good and not evil.
13870 ** May you find forgiveness for yourself and forgive others.
13871 ** May you share freely, never taking more than you give.
13872 **
13873 ******************************************************************************
13874 **
13875 */
13876
13877
13878
13879 /* #include "fts5Int.h" */
13880
13881 typedef struct Fts5HashEntry Fts5HashEntry;
13882
13883 /*
13884 ** This file contains the implementation of an in-memory hash table used
13885 ** to accumulate "term -> doclist" content before it is flushed to a level-0
13886 ** segment.
13887 */
13888
13889
13890 struct Fts5Hash {
13891 int *pnByte; /* Pointer to bytes counter */
13892 int nEntry; /* Number of entries currently in hash */
13893 int nSlot; /* Size of aSlot[] array */
13894 Fts5HashEntry *pScan; /* Current ordered scan item */
13895 Fts5HashEntry **aSlot; /* Array of hash slots */
13896 };
13897
13898 /*
13899 ** Each entry in the hash table is represented by an object of the
13900 ** following type. Each object, its key (zKey[]) and its current data
13901 ** are stored in a single memory allocation. The position list data
13902 ** immediately follows the key data in memory.
13903 **
13904 ** The data that follows the key is in a similar, but not identical format
13905 ** to the doclist data stored in the database. It is:
13906 **
13907 ** * Rowid, as a varint
13908 ** * Position list, without 0x00 terminator.
13909 ** * Size of previous position list and rowid, as a 4 byte
13910 ** big-endian integer.
13911 **
13912 ** iRowidOff:
13913 ** Offset of last rowid written to data area. Relative to first byte of
13914 ** structure.
13915 **
13916 ** nData:
13917 ** Bytes of data written since iRowidOff.
13918 */
13919 struct Fts5HashEntry {
13920 Fts5HashEntry *pHashNext; /* Next hash entry with same hash-key */
13921 Fts5HashEntry *pScanNext; /* Next entry in sorted order */
13922
13923 int nAlloc; /* Total size of allocation */
13924 int iSzPoslist; /* Offset of space for 4-byte poslist size */
13925 int nData; /* Total bytes of data (incl. structure) */
13926 u8 bDel; /* Set delete-flag @ iSzPoslist */
13927
13928 int iCol; /* Column of last value written */
13929 int iPos; /* Position of last value written */
13930 i64 iRowid; /* Rowid of last value written */
13931 char zKey[8]; /* Nul-terminated entry key */
13932 };
13933
13934 /*
13935 ** Size of Fts5HashEntry without the zKey[] array.
13936 */
13937 #define FTS5_HASHENTRYSIZE (sizeof(Fts5HashEntry)-8)
13938
13939
13940
13941 /*
13942 ** Allocate a new hash table.
13943 */
13944 static int sqlite3Fts5HashNew(Fts5Hash **ppNew, int *pnByte){
13945 int rc = SQLITE_OK;
13946 Fts5Hash *pNew;
13947
13948 *ppNew = pNew = (Fts5Hash*)sqlite3_malloc(sizeof(Fts5Hash));
13949 if( pNew==0 ){
13950 rc = SQLITE_NOMEM;
13951 }else{
13952 int nByte;
13953 memset(pNew, 0, sizeof(Fts5Hash));
13954 pNew->pnByte = pnByte;
13955
13956 pNew->nSlot = 1024;
13957 nByte = sizeof(Fts5HashEntry*) * pNew->nSlot;
13958 pNew->aSlot = (Fts5HashEntry**)sqlite3_malloc(nByte);
13959 if( pNew->aSlot==0 ){
13960 sqlite3_free(pNew);
13961 *ppNew = 0;
13962 rc = SQLITE_NOMEM;
13963 }else{
13964 memset(pNew->aSlot, 0, nByte);
13965 }
13966 }
13967 return rc;
13968 }
13969
13970 /*
13971 ** Free a hash table object.
13972 */
13973 static void sqlite3Fts5HashFree(Fts5Hash *pHash){
13974 if( pHash ){
13975 sqlite3Fts5HashClear(pHash);
13976 sqlite3_free(pHash->aSlot);
13977 sqlite3_free(pHash);
13978 }
13979 }
13980
13981 /*
13982 ** Empty (but do not delete) a hash table.
13983 */
13984 static void sqlite3Fts5HashClear(Fts5Hash *pHash){
13985 int i;
13986 for(i=0; i<pHash->nSlot; i++){
13987 Fts5HashEntry *pNext;
13988 Fts5HashEntry *pSlot;
13989 for(pSlot=pHash->aSlot[i]; pSlot; pSlot=pNext){
13990 pNext = pSlot->pHashNext;
13991 sqlite3_free(pSlot);
13992 }
13993 }
13994 memset(pHash->aSlot, 0, pHash->nSlot * sizeof(Fts5HashEntry*));
13995 pHash->nEntry = 0;
13996 }
13997
13998 static unsigned int fts5HashKey(int nSlot, const u8 *p, int n){
13999 int i;
14000 unsigned int h = 13;
14001 for(i=n-1; i>=0; i--){
14002 h = (h << 3) ^ h ^ p[i];
14003 }
14004 return (h % nSlot);
14005 }
14006
14007 static unsigned int fts5HashKey2(int nSlot, u8 b, const u8 *p, int n){
14008 int i;
14009 unsigned int h = 13;
14010 for(i=n-1; i>=0; i--){
14011 h = (h << 3) ^ h ^ p[i];
14012 }
14013 h = (h << 3) ^ h ^ b;
14014 return (h % nSlot);
14015 }
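
Review note: the two hash functions above are meant to agree. fts5HashKey2() hashes the token bytes from last to first and then the prefix byte, which matches fts5HashKey() run over the stored key (prefix byte followed by the token). The assert() in sqlite3Fts5HashWrite() relies on this. A standalone sketch of the relationship follows, with `hash_key` and `hash_key2` as renamed copies for illustration:

```c
#include <assert.h>
#include <string.h>

typedef unsigned char u8;

/* Hash n bytes of p, walking from the last byte to the first. */
static unsigned int hash_key(int nSlot, const u8 *p, int n){
  int i;
  unsigned int h = 13;
  assert( nSlot>0 );
  for(i=n-1; i>=0; i--){
    h = (h << 3) ^ h ^ p[i];
  }
  return (h % nSlot);
}

/* Hash the logical key formed by byte b followed by n bytes of p,
** without building that key in memory. */
static unsigned int hash_key2(int nSlot, u8 b, const u8 *p, int n){
  int i;
  unsigned int h = 13;
  assert( nSlot>0 );
  for(i=n-1; i>=0; i--){
    h = (h << 3) ^ h ^ p[i];
  }
  h = (h << 3) ^ h ^ b;   /* b is key[0], so it is mixed in last */
  return (h % nSlot);
}
```

Because both functions visit the same bytes in the same order, a lookup by (prefix byte, token) and a rehash of the stored key during a resize land in the same slot.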
14016
14017 /*
14018 ** Resize the hash table by doubling the number of slots.
14019 */
14020 static int fts5HashResize(Fts5Hash *pHash){
14021 int nNew = pHash->nSlot*2;
14022 int i;
14023 Fts5HashEntry **apNew;
14024 Fts5HashEntry **apOld = pHash->aSlot;
14025
14026 apNew = (Fts5HashEntry**)sqlite3_malloc(nNew*sizeof(Fts5HashEntry*));
14027 if( !apNew ) return SQLITE_NOMEM;
14028 memset(apNew, 0, nNew*sizeof(Fts5HashEntry*));
14029
14030 for(i=0; i<pHash->nSlot; i++){
14031 while( apOld[i] ){
14032 int iHash;
14033 Fts5HashEntry *p = apOld[i];
14034 apOld[i] = p->pHashNext;
14035 iHash = fts5HashKey(nNew, (u8*)p->zKey, (int)strlen(p->zKey));
14036 p->pHashNext = apNew[iHash];
14037 apNew[iHash] = p;
14038 }
14039 }
14040
14041 sqlite3_free(apOld);
14042 pHash->nSlot = nNew;
14043 pHash->aSlot = apNew;
14044 return SQLITE_OK;
14045 }
14046
14047 static void fts5HashAddPoslistSize(Fts5HashEntry *p){
14048 if( p->iSzPoslist ){
14049 u8 *pPtr = (u8*)p;
14050 int nSz = (p->nData - p->iSzPoslist - 1); /* Size in bytes */
14051 int nPos = nSz*2 + p->bDel; /* Value of nPos field */
14052
14053 assert( p->bDel==0 || p->bDel==1 );
14054 if( nPos<=127 ){
14055 pPtr[p->iSzPoslist] = (u8)nPos;
14056 }else{
14057 int nByte = sqlite3Fts5GetVarintLen((u32)nPos);
14058 memmove(&pPtr[p->iSzPoslist + nByte], &pPtr[p->iSzPoslist + 1], nSz);
14059 sqlite3Fts5PutVarint(&pPtr[p->iSzPoslist], nPos);
14060 p->nData += (nByte-1);
14061 }
14062 p->bDel = 0;
14063 p->iSzPoslist = 0;
14064 }
14065 }
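
Review note: fts5HashAddPoslistSize() stores a single value that carries both the position-list length and the delete flag, computed as nSz*2 + bDel. The sketch below shows that packing and an inverse. `poslist_pack` and `poslist_unpack` are hypothetical names, and the unpack step is an assumption about how a reader would split the value; it is not taken from this section.

```c
#include <assert.h>

/* Combine a position-list size in bytes with a 0/1 delete flag. */
static int poslist_pack(int nSz, int bDel){
  assert( nSz>=0 );
  assert( bDel==0 || bDel==1 );
  return nSz*2 + bDel;
}

/* Split a packed value back into its size and delete flag. */
static void poslist_unpack(int nPos, int *pnSz, int *pbDel){
  *pnSz = nPos >> 1;
  *pbDel = nPos & 1;
}
```

A 63-byte list with the delete flag set packs to 127, the largest value that fits the single byte reserved at iSzPoslist. Anything larger takes the memmove() branch so that a multi-byte varint can be written.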
14066
14067 static int sqlite3Fts5HashWrite(
14068 Fts5Hash *pHash,
14069 i64 iRowid, /* Rowid for this entry */
14070 int iCol, /* Column token appears in (-ve -> delete) */
14071 int iPos, /* Position of token within column */
14072 char bByte, /* First byte of token */
14073 const char *pToken, int nToken /* Token to add or remove to or from index */
14074 ){
14075 unsigned int iHash;
14076 Fts5HashEntry *p;
14077 u8 *pPtr;
14078 int nIncr = 0; /* Amount to increment (*pHash->pnByte) by */
14079
14080 /* Attempt to locate an existing hash entry */
14081 iHash = fts5HashKey2(pHash->nSlot, (u8)bByte, (const u8*)pToken, nToken);
14082 for(p=pHash->aSlot[iHash]; p; p=p->pHashNext){
14083 if( p->zKey[0]==bByte
14084 && memcmp(&p->zKey[1], pToken, nToken)==0
14085 && p->zKey[nToken+1]==0
14086 ){
14087 break;
14088 }
14089 }
14090
14091 /* If an existing hash entry cannot be found, create a new one. */
14092 if( p==0 ){
14093 int nByte = FTS5_HASHENTRYSIZE + (nToken+1) + 1 + 64;
14094 if( nByte<128 ) nByte = 128;
14095
14096 if( (pHash->nEntry*2)>=pHash->nSlot ){
14097 int rc = fts5HashResize(pHash);
14098 if( rc!=SQLITE_OK ) return rc;
14099 iHash = fts5HashKey2(pHash->nSlot, (u8)bByte, (const u8*)pToken, nToken);
14100 }
14101
14102 p = (Fts5HashEntry*)sqlite3_malloc(nByte);
14103 if( !p ) return SQLITE_NOMEM;
14104 memset(p, 0, FTS5_HASHENTRYSIZE);
14105 p->nAlloc = nByte;
14106 p->zKey[0] = bByte;
14107 memcpy(&p->zKey[1], pToken, nToken);
14108 assert( iHash==fts5HashKey(pHash->nSlot, (u8*)p->zKey, nToken+1) );
14109 p->zKey[nToken+1] = '\0';
14110 p->nData = nToken+1 + 1 + FTS5_HASHENTRYSIZE;
14111 p->nData += sqlite3Fts5PutVarint(&((u8*)p)[p->nData], iRowid);
14112 p->iSzPoslist = p->nData;
14113 p->nData += 1;
14114 p->iRowid = iRowid;
14115 p->pHashNext = pHash->aSlot[iHash];
14116 pHash->aSlot[iHash] = p;
14117 pHash->nEntry++;
14118 nIncr += p->nData;
14119 }
14120
14121 /* Check there is enough space to append a new entry. Worst case scenario
14122 ** is:
14123 **
14124 ** + 9 bytes for a new rowid,
14125 ** + 4 bytes reserved for the "poslist size" varint,
14126 ** + 1 byte for a "new column" byte,
14127 ** + 3 bytes for a new column number (16-bit max) as a varint,
14128 ** + 5 bytes for the new position offset (32-bit max).
14129 */
14130 if( (p->nAlloc - p->nData) < (9 + 4 + 1 + 3 + 5) ){
14131 int nNew = p->nAlloc * 2;
14132 Fts5HashEntry *pNew;
14133 Fts5HashEntry **pp;
14134 pNew = (Fts5HashEntry*)sqlite3_realloc(p, nNew);
14135 if( pNew==0 ) return SQLITE_NOMEM;
14136 pNew->nAlloc = nNew;
14137 for(pp=&pHash->aSlot[iHash]; *pp!=p; pp=&(*pp)->pHashNext);
14138 *pp = pNew;
14139 p = pNew;
14140 }
14141 pPtr = (u8*)p;
14142 nIncr -= p->nData;
14143
14144 /* If this is a new rowid, append the 4-byte size field for the previous
14145 ** entry, and the new rowid for this entry. */
14146 if( iRowid!=p->iRowid ){
14147 fts5HashAddPoslistSize(p);
14148 p->nData += sqlite3Fts5PutVarint(&pPtr[p->nData], iRowid - p->iRowid);
14149 p->iSzPoslist = p->nData;
14150 p->nData += 1;
14151 p->iCol = 0;
14152 p->iPos = 0;
14153 p->iRowid = iRowid;
14154 }
14155
14156 if( iCol>=0 ){
14157 /* Append a new column value, if necessary */
14158 assert( iCol>=p->iCol );
14159 if( iCol!=p->iCol ){
14160 pPtr[p->nData++] = 0x01;
14161 p->nData += sqlite3Fts5PutVarint(&pPtr[p->nData], iCol);
14162 p->iCol = iCol;
14163 p->iPos = 0;
14164 }
14165
14166 /* Append the new position offset */
14167 p->nData += sqlite3Fts5PutVarint(&pPtr[p->nData], iPos - p->iPos + 2);
14168 p->iPos = iPos;
14169 }else{
14170 /* This is a delete. Set the delete flag. */
14171 p->bDel = 1;
14172 }
14173 nIncr += p->nData;
14174
14175 *pHash->pnByte += nIncr;
14176 return SQLITE_OK;
14177 }
14178
14179
14180 /*
14181 ** Arguments pLeft and pRight point to linked-lists of hash-entry objects,
14182 ** each sorted in key order. This function merges the two lists into a
14183 ** single list and returns a pointer to its first element.
14184 */
14185 static Fts5HashEntry *fts5HashEntryMerge(
14186 Fts5HashEntry *pLeft,
14187 Fts5HashEntry *pRight
14188 ){
14189 Fts5HashEntry *p1 = pLeft;
14190 Fts5HashEntry *p2 = pRight;
14191 Fts5HashEntry *pRet = 0;
14192 Fts5HashEntry **ppOut = &pRet;
14193
14194 while( p1 || p2 ){
14195 if( p1==0 ){
14196 *ppOut = p2;
14197 p2 = 0;
14198 }else if( p2==0 ){
14199 *ppOut = p1;
14200 p1 = 0;
14201 }else{
14202 int i = 0;
14203 while( p1->zKey[i]==p2->zKey[i] ) i++;
14204
14205 if( ((u8)p1->zKey[i])>((u8)p2->zKey[i]) ){
14206 /* p2 is smaller */
14207 *ppOut = p2;
14208 ppOut = &p2->pScanNext;
14209 p2 = p2->pScanNext;
14210 }else{
14211 /* p1 is smaller */
14212 *ppOut = p1;
14213 ppOut = &p1->pScanNext;
14214 p1 = p1->pScanNext;
14215 }
14216 *ppOut = 0;
14217 }
14218 }
14219
14220 return pRet;
14221 }
14222
14223 /*
14224 ** Extract all tokens from hash table pHash and link them into a list
14225 ** in sorted order. The hash table is cleared before returning. It is
14226 ** the responsibility of the caller to free the elements of the returned
14227 ** list.
14228 */
14229 static int fts5HashEntrySort(
14230 Fts5Hash *pHash,
14231 const char *pTerm, int nTerm, /* Query prefix, if any */
14232 Fts5HashEntry **ppSorted
14233 ){
14234 const int nMergeSlot = 32;
14235 Fts5HashEntry **ap;
14236 Fts5HashEntry *pList;
14237 int iSlot;
14238 int i;
14239
14240 *ppSorted = 0;
14241 ap = sqlite3_malloc(sizeof(Fts5HashEntry*) * nMergeSlot);
14242 if( !ap ) return SQLITE_NOMEM;
14243 memset(ap, 0, sizeof(Fts5HashEntry*) * nMergeSlot);
14244
14245 for(iSlot=0; iSlot<pHash->nSlot; iSlot++){
14246 Fts5HashEntry *pIter;
14247 for(pIter=pHash->aSlot[iSlot]; pIter; pIter=pIter->pHashNext){
14248 if( pTerm==0 || 0==memcmp(pIter->zKey, pTerm, nTerm) ){
14249 Fts5HashEntry *pEntry = pIter;
14250 pEntry->pScanNext = 0;
14251 for(i=0; ap[i]; i++){
14252 pEntry = fts5HashEntryMerge(pEntry, ap[i]);
14253 ap[i] = 0;
14254 }
14255 ap[i] = pEntry;
14256 }
14257 }
14258 }
14259
14260 pList = 0;
14261 for(i=0; i<nMergeSlot; i++){
14262 pList = fts5HashEntryMerge(pList, ap[i]);
14263 }
14264
14265 pHash->nEntry = 0;
14266 sqlite3_free(ap);
14267 *ppSorted = pList;
14268 return SQLITE_OK;
14269 }
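
Review note: fts5HashEntrySort() is a bottom-up merge sort over linked lists. Slot i of the ap[] array holds either nothing or a sorted run of 2^i entries, and each new entry is merged upward until it reaches an empty slot. The same scheme is sketched below on integer keys. `Node`, `merge_nodes` and `sort_nodes` are hypothetical names, and the loop bound on the slot index is a defensive addition that the original does not have.

```c
#include <assert.h>
#include <stddef.h>

#define N_MERGE_SLOT 32

typedef struct Node Node;
struct Node {
  int key;
  Node *pNext;
};

/* Merge two sorted lists into one sorted list and return its head. */
static Node *merge_nodes(Node *p1, Node *p2){
  Node *pRet = 0;
  Node **ppOut = &pRet;
  while( p1 && p2 ){
    if( p2->key < p1->key ){
      *ppOut = p2; ppOut = &p2->pNext; p2 = p2->pNext;
    }else{
      *ppOut = p1; ppOut = &p1->pNext; p1 = p1->pNext;
    }
  }
  *ppOut = p1 ? p1 : p2;    /* Append whichever list is left over */
  return pRet;
}

/* Link the nNode elements of aNode[] into a list sorted by key. */
static Node *sort_nodes(Node *aNode, int nNode){
  Node *ap[N_MERGE_SLOT] = {0};
  Node *pList = 0;
  int i, k;

  for(k=0; k<nNode; k++){
    Node *pEntry = &aNode[k];
    pEntry->pNext = 0;
    /* Carry upward through occupied slots, like binary addition. */
    for(i=0; i<N_MERGE_SLOT-1 && ap[i]; i++){
      pEntry = merge_nodes(pEntry, ap[i]);
      ap[i] = 0;
    }
    assert( i<N_MERGE_SLOT );
    ap[i] = merge_nodes(pEntry, ap[i]);
  }

  /* Fold the remaining runs together, smallest first. */
  for(i=0; i<N_MERGE_SLOT; i++){
    pList = merge_nodes(pList, ap[i]);
  }
  return pList;
}
```

Each element takes part in O(log n) merges, so the whole sort is O(n log n) and needs only the fixed-size slot array as extra memory.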
14270
14271 /*
14272 ** Query the hash table for a doclist associated with term pTerm/nTerm.
14273 */
14274 static int sqlite3Fts5HashQuery(
14275 Fts5Hash *pHash, /* Hash table to query */
14276 const char *pTerm, int nTerm, /* Query term */
14277 const u8 **ppDoclist, /* OUT: Pointer to doclist for pTerm */
14278 int *pnDoclist /* OUT: Size of doclist in bytes */
14279 ){
14280 unsigned int iHash = fts5HashKey(pHash->nSlot, (const u8*)pTerm, nTerm);
14281 Fts5HashEntry *p;
14282
14283 for(p=pHash->aSlot[iHash]; p; p=p->pHashNext){
14284 if( memcmp(p->zKey, pTerm, nTerm)==0 && p->zKey[nTerm]==0 ) break;
14285 }
14286
14287 if( p ){
14288 fts5HashAddPoslistSize(p);
14289 *ppDoclist = (const u8*)&p->zKey[nTerm+1];
14290 *pnDoclist = p->nData - (FTS5_HASHENTRYSIZE + nTerm + 1);
14291 }else{
14292 *ppDoclist = 0;
14293 *pnDoclist = 0;
14294 }
14295
14296 return SQLITE_OK;
14297 }
14298
14299 static int sqlite3Fts5HashScanInit(
14300 Fts5Hash *p, /* Hash table to query */
14301 const char *pTerm, int nTerm /* Query prefix */
14302 ){
14303 return fts5HashEntrySort(p, pTerm, nTerm, &p->pScan);
14304 }
14305
14306 static void sqlite3Fts5HashScanNext(Fts5Hash *p){
14307 assert( !sqlite3Fts5HashScanEof(p) );
14308 p->pScan = p->pScan->pScanNext;
14309 }
14310
14311 static int sqlite3Fts5HashScanEof(Fts5Hash *p){
14312 return (p->pScan==0);
14313 }
14314
14315 static void sqlite3Fts5HashScanEntry(
14316 Fts5Hash *pHash,
14317 const char **pzTerm, /* OUT: term (nul-terminated) */
14318 const u8 **ppDoclist, /* OUT: pointer to doclist */
14319 int *pnDoclist /* OUT: size of doclist in bytes */
14320 ){
14321 Fts5HashEntry *p;
14322 if( (p = pHash->pScan) ){
14323 int nTerm = (int)strlen(p->zKey);
14324 fts5HashAddPoslistSize(p);
14325 *pzTerm = p->zKey;
14326 *ppDoclist = (const u8*)&p->zKey[nTerm+1];
14327 *pnDoclist = p->nData - (FTS5_HASHENTRYSIZE + nTerm + 1);
14328 }else{
14329 *pzTerm = 0;
14330 *ppDoclist = 0;
14331 *pnDoclist = 0;
14332 }
14333 }
14334
14335
14336 /*
14337 ** 2014 May 31
14338 **
14339 ** The author disclaims copyright to this source code. In place of
14340 ** a legal notice, here is a blessing:
14341 **
14342 ** May you do good and not evil.
14343 ** May you find forgiveness for yourself and forgive others.
14344 ** May you share freely, never taking more than you give.
14345 **
14346 ******************************************************************************
14347 **
14348 ** Low level access to the FTS index stored in the database file. The
14349 ** routines in this file implement all read and write access to the
14350 ** %_data table. Other parts of the system access this functionality via
14351 ** the interface defined in fts5Int.h.
14352 */
14353
14354
14355 /* #include "fts5Int.h" */
14356
14357 /*
14358 ** Overview:
14359 **
14360 ** The %_data table contains all the FTS indexes for an FTS5 virtual table.
14361 ** As well as the main term index, there may be up to 31 prefix indexes.
14362 ** The format is similar to FTS3/4, except that:
14363 **
14364 ** * all segment b-tree leaf data is stored in fixed size page records
14365 ** (e.g. 1000 bytes). A single doclist may span multiple pages. Care is
14366 ** taken to ensure it is possible to iterate in either direction through
14367 ** the entries in a doclist, or to seek to a specific entry within a
14368 ** doclist, without loading it into memory.
14369 **
14370 ** * large doclists that span many pages have associated "doclist index"
14371 ** records that contain a copy of the first rowid on each page spanned by
14372 ** the doclist. This is used to speed up seek operations, and merges of
14373 ** large doclists with very small doclists.
14374 **
14375 ** * extra fields in the "structure record" record the state of ongoing
14376 ** incremental merge operations.
14377 **
14378 */
14379
14380
14381 #define FTS5_OPT_WORK_UNIT 1000 /* Number of leaf pages per optimize step */
14382 #define FTS5_WORK_UNIT 64 /* Number of leaf pages in unit of work */
14383
14384 #define FTS5_MIN_DLIDX_SIZE 4 /* Add dlidx if this many empty pages */
14385
14386 #define FTS5_MAIN_PREFIX '0'
14387
14388 #if FTS5_MAX_PREFIX_INDEXES > 31
14389 # error "FTS5_MAX_PREFIX_INDEXES is too large"
14390 #endif
14391
14392 /*
14393 ** Details:
14394 **
14395 ** The %_data table managed by this module,
14396 **
14397 ** CREATE TABLE %_data(id INTEGER PRIMARY KEY, block BLOB);
14398 **
14399 ** , contains the following 5 types of records. See the comments surrounding
14400 ** the FTS5_*_ROWID macros below for a description of how %_data rowids are
14401 ** assigned to each of them.
14402 **
14403 ** 1. Structure Records:
14404 **
14405 ** The set of segments that make up an index - the index structure - are
14406 ** recorded in a single record within the %_data table. The record consists
14407 ** of a single 32-bit configuration cookie value followed by a list of
14408 ** SQLite varints. If the FTS table features more than one index (because
14409 ** there are one or more prefix indexes), it is guaranteed that all share
14410 ** the same cookie value.
14411 **
14412 ** Immediately following the configuration cookie, the record begins with
14413 ** three varints:
14414 **
14415 ** + number of levels,
14416 ** + total number of segments on all levels,
14417 ** + value of write counter.
14418 **
14419 ** Then, for each level from 0 to nMax:
14420 **
14421 ** + number of input segments in ongoing merge.
14422 ** + total number of segments in level.
14423 ** + for each segment from oldest to newest:
14424 ** + segment id (always > 0)
14425 ** + first leaf page number (often 1, always greater than 0)
14426 ** + final leaf page number
14427 **
14428 ** 2. The Averages Record:
14429 **
14430 ** A single record within the %_data table. The data is a list of varints.
14431 ** The first value is the number of rows in the index. Then, for each column
14432 ** from left to right, the total number of tokens in the column for all
14433 ** rows of the table.
14434 **
14435 ** 3. Segment leaves:
14436 **
14437 ** TERM/DOCLIST FORMAT:
14438 **
14439 ** Most of each segment leaf is taken up by term/doclist data. The
14440 ** general format of term/doclist, starting with the first term
14441 ** on the leaf page, is:
14442 **
14443 ** varint: size of first term
14444 ** blob: first term data
14445 ** doclist: first doclist
14446 ** zero-or-more {
14447 ** varint: number of bytes in common with previous term
14448 ** varint: number of bytes of new term data (nNew)
14449 ** blob: nNew bytes of new term data
14450 ** doclist: next doclist
14451 ** }
14452 **
14453 ** doclist format:
14454 **
14455 ** varint: first rowid
14456 ** poslist: first poslist
14457 ** zero-or-more {
14458 ** varint: rowid delta (always > 0)
14459 ** poslist: next poslist
14460 ** }
14461 **
14462 ** poslist format:
14463 **
14464 ** varint: size of poslist in bytes multiplied by 2, not including
14465 ** this field. Plus 1 if this entry carries the "delete" flag.
14466 ** collist: collist for column 0
14467 ** zero-or-more {
14468 ** 0x01 byte
14469 ** varint: column number (I)
14470 ** collist: collist for column I
14471 ** }
14472 **
14473 ** collist format:
14474 **
14475 ** varint: first offset + 2
14476 ** zero-or-more {
14477 ** varint: offset delta + 2
14478 ** }
14479 **
14480 ** PAGE FORMAT
14481 **
14482 ** Each leaf page begins with a 4-byte header containing 2 16-bit
14483 ** unsigned integer fields in big-endian format. They are:
14484 **
14485 ** * The byte offset of the first rowid on the page, if it exists
14486 ** and occurs before the first term (otherwise 0).
14487 **
14488 ** * The byte offset of the start of the page footer. If the page
14489 ** footer is 0 bytes in size, then this field is the same as the
14490 ** size of the leaf page in bytes.
14491 **
14492 ** The page footer consists of a single varint for each term located
14493 ** on the page. Each varint is the byte offset of the current term
14494 ** within the page, delta-compressed against the previous value. In
14495 ** other words, the first varint in the footer is the byte offset of
14496 ** the first term, the second is the byte offset of the second less that
14497 ** of the first, and so on.
14498 **
14499 ** The term/doclist format described above is accurate if the entire
14500 ** term/doclist data fits on a single leaf page. If this is not the case,
14501 ** the format is changed in two ways:
14502 **
14503 ** + if the first rowid on a page occurs before the first term, it
14504 ** is stored as a literal value:
14505 **
14506 ** varint: first rowid
14507 **
14508 ** + the first term on each page is stored in the same way as the
14509 ** very first term of the segment:
14510 **
14511 ** varint: size of first term
14512 ** blob: first term data
14513 **
14514 ** 5. Segment doclist indexes:
14515 **
14516 ** Doclist indexes are themselves b-trees, however they usually consist of
14517 ** a single leaf record only. The format of each doclist index leaf page
14518 ** is:
14519 **
14520 ** * Flags byte. Bits are:
14521 ** 0x01: Clear if leaf is also the root page, otherwise set.
14522 **
14523 ** * Page number of fts index leaf page. As a varint.
14524 **
14525 ** * First rowid on page indicated by previous field. As a varint.
14526 **
14527 ** * A list of varints, one for each subsequent termless page. A
14528 ** positive delta if the termless page contains at least one rowid,
14529 ** or an 0x00 byte otherwise.
14530 **
14531 ** Internal doclist index nodes are:
14532 **
14533 ** * Flags byte. Bits are:
14534 ** 0x01: Clear for root page, otherwise set.
14535 **
14536 ** * Page number of first child page. As a varint.
14537 **
14538 ** * Copy of first rowid on page indicated by previous field. As a varint.
14539 **
14540 ** * A list of delta-encoded varints - the first rowid on each subsequent
14541 ** child page.
14542 **
14543 */
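The PAGE FORMAT description above can be sketched as a small stand-alone decoder. This is an illustration only, not code from the patch: `get_u16`, `get_varint32` and `leaf_term_offsets` are hypothetical names, and the varint reader is a simplified version (at most 4 bytes, 28 bits) of the 7-bits-per-byte, most-significant-group-first encoding. It reads the footer offset from bytes 2..3 of the header, then undoes the delta compression of the term offsets stored in the footer.

```c
#include <assert.h>
#include <stdint.h>

/* Read a big-endian 16-bit field, as used by the 4-byte leaf header. */
static uint16_t get_u16(const uint8_t *a){
  return (uint16_t)((a[0] << 8) | a[1]);
}

/* Simplified varint reader (assumption: values fit in 4 bytes / 28 bits).
** Returns the number of bytes consumed, or 0 if the input is truncated or
** wider than this sketch supports. */
static int get_varint32(const uint8_t *a, int n, uint32_t *pVal){
  uint32_t v = 0;
  int i;
  for(i=0; i<n && i<4; i++){
    v = (v << 7) | (uint32_t)(a[i] & 0x7F);
    if( (a[i] & 0x80)==0 ){
      *pVal = v;
      return i+1;
    }
  }
  return 0;
}

/* Decode the absolute term offsets from the footer of leaf page
** aPage/nPage. Up to nMax offsets are written to aOff[]. Returns the
** number of terms on the page, or -1 if the page looks malformed. */
static int leaf_term_offsets(const uint8_t *aPage, int nPage, int *aOff, int nMax){
  int szLeaf, iOff, nTerm = 0, iPrev = 0;
  assert( aPage!=0 );
  if( nPage<4 ) return -1;
  szLeaf = get_u16(&aPage[2]);          /* second header field: footer start */
  if( szLeaf<4 || szLeaf>nPage ) return -1;
  iOff = szLeaf;
  while( iOff<nPage ){
    uint32_t delta;
    int nByte = get_varint32(&aPage[iOff], nPage-iOff, &delta);
    if( nByte==0 ) return -1;
    iOff += nByte;
    iPrev += (int)delta;                /* footer values are deltas */
    if( iPrev<4 || iPrev>=szLeaf ) return -1;
    if( nTerm<nMax ) aOff[nTerm] = iPrev;
    nTerm++;
  }
  return nTerm;
}
```

A page whose footer offset equals the page size yields zero terms, which corresponds to the "termless" case tested by fts5LeafIsTermless() further down.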
14544
14545 /*
14546 ** Rowids for the averages and structure records in the %_data table.
14547 */
14548 #define FTS5_AVERAGES_ROWID 1 /* Rowid used for the averages record */
14549 #define FTS5_STRUCTURE_ROWID 10 /* The structure record */
14550
14551 /*
14552 ** Macros determining the rowids used by segment leaves and dlidx leaves
14553 ** and nodes. All nodes and leaves are stored in the %_data table with large
14554 ** positive rowids.
14555 **
14556 ** Each segment has a unique non-zero 16-bit id.
14557 **
14558 ** The rowid for each segment leaf is found by passing the segment id and
14559 ** the leaf page number to the FTS5_SEGMENT_ROWID macro. Leaves are numbered
14560 ** sequentially starting from 1.
14561 */
14562 #define FTS5_DATA_ID_B 16 /* Max seg id number 65535 */
14563 #define FTS5_DATA_DLI_B 1 /* Doclist-index flag (1 bit) */
14564 #define FTS5_DATA_HEIGHT_B 5 /* Max dlidx tree height of 32 */
14565 #define FTS5_DATA_PAGE_B 31 /* Max page number of 2147483647 */
14566
14567 #define fts5_dri(segid, dlidx, height, pgno) ( \
14568 ((i64)(segid) << (FTS5_DATA_PAGE_B+FTS5_DATA_HEIGHT_B+FTS5_DATA_DLI_B)) + \
14569 ((i64)(dlidx) << (FTS5_DATA_PAGE_B + FTS5_DATA_HEIGHT_B)) + \
14570 ((i64)(height) << (FTS5_DATA_PAGE_B)) + \
14571 ((i64)(pgno)) \
14572 )
14573
14574 #define FTS5_SEGMENT_ROWID(segid, pgno) fts5_dri(segid, 0, 0, pgno)
14575 #define FTS5_DLIDX_ROWID(segid, height, pgno) fts5_dri(segid, 1, height, pgno)
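To make the bit layout behind fts5_dri() easier to check, here is a minimal stand-alone sketch that packs and unpacks the four fields using the same widths (16, 1, 5 and 31 bits). The names `data_rowid`, `data_rowid_unpack` and `RowidParts` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { ID_B = 16, DLI_B = 1, HEIGHT_B = 5, PAGE_B = 31 };

typedef struct { int segid, dlidx, height, pgno; } RowidParts;

/* Pack the fields in the same order as fts5_dri(): segid in the most
** significant bits, then the dlidx flag, the height, and the page number. */
static int64_t data_rowid(int segid, int dlidx, int height, int pgno){
  assert( segid>=0 && segid<(1<<ID_B) );
  assert( dlidx>=0 && dlidx<(1<<DLI_B) );
  assert( height>=0 && height<(1<<HEIGHT_B) );
  assert( pgno>=0 );
  return ((int64_t)segid  << (PAGE_B + HEIGHT_B + DLI_B))
       + ((int64_t)dlidx  << (PAGE_B + HEIGHT_B))
       + ((int64_t)height << PAGE_B)
       + (int64_t)pgno;
}

/* Inverse of data_rowid(): mask each field back out of the rowid. */
static RowidParts data_rowid_unpack(int64_t r){
  RowidParts out;
  out.pgno   = (int)(r & (((int64_t)1 << PAGE_B) - 1));
  out.height = (int)((r >> PAGE_B) & ((1 << HEIGHT_B) - 1));
  out.dlidx  = (int)((r >> (PAGE_B + HEIGHT_B)) & ((1 << DLI_B) - 1));
  out.segid  = (int)((r >> (PAGE_B + HEIGHT_B + DLI_B)) & ((1 << ID_B) - 1));
  return out;
}
```

Because the segment id occupies the top bits, every record of segment N (leaves and doclist-index pages alike) falls in the range data_rowid(N,0,0,0) to data_rowid(N+1,0,0,0)-1. That appears to be the property fts5DataRemoveSegment() relies on when it deletes a segment with a single range DELETE.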
14576
14577 /*
14578 ** Maximum segments permitted in a single index
14579 */
14580 #define FTS5_MAX_SEGMENT 2000
14581
14582 #ifdef SQLITE_DEBUG
14583 static int sqlite3Fts5Corrupt() { return SQLITE_CORRUPT_VTAB; }
14584 #endif
14585
14586
14587 /*
14588 ** Each time a blob is read from the %_data table, it is padded with this
14589 ** many zero bytes. This makes it easier to decode the various record formats
14590 ** without overreading if the records are corrupt.
14591 */
14592 #define FTS5_DATA_ZERO_PADDING 8
14593 #define FTS5_DATA_PADDING 20
14594
14595 typedef struct Fts5Data Fts5Data;
14596 typedef struct Fts5DlidxIter Fts5DlidxIter;
14597 typedef struct Fts5DlidxLvl Fts5DlidxLvl;
14598 typedef struct Fts5DlidxWriter Fts5DlidxWriter;
14599 typedef struct Fts5PageWriter Fts5PageWriter;
14600 typedef struct Fts5SegIter Fts5SegIter;
14601 typedef struct Fts5DoclistIter Fts5DoclistIter;
14602 typedef struct Fts5SegWriter Fts5SegWriter;
14603 typedef struct Fts5Structure Fts5Structure;
14604 typedef struct Fts5StructureLevel Fts5StructureLevel;
14605 typedef struct Fts5StructureSegment Fts5StructureSegment;
14606
14607 struct Fts5Data {
14608 u8 *p; /* Pointer to buffer containing record */
14609 int nn; /* Size of record in bytes */
14610 int szLeaf; /* Size of leaf without page-index */
14611 };
14612
14613 /*
14614 ** One object per %_data table.
14615 */
14616 struct Fts5Index {
14617 Fts5Config *pConfig; /* Virtual table configuration */
14618 char *zDataTbl; /* Name of %_data table */
14619 int nWorkUnit; /* Leaf pages in a "unit" of work */
14620
14621 /*
14622 ** Variables related to the accumulation of tokens and doclists within the
14623 ** in-memory hash tables before they are flushed to disk.
14624 */
14625 Fts5Hash *pHash; /* Hash table for in-memory data */
14626 int nPendingData; /* Current bytes of pending data */
14627 i64 iWriteRowid; /* Rowid for current doc being written */
14628 int bDelete; /* Current write is a delete */
14629
14630 /* Error state. */
14631 int rc; /* Current error code */
14632
14633 /* State used by the fts5DataXXX() functions. */
14634 sqlite3_blob *pReader; /* RO incr-blob open on %_data table */
14635 sqlite3_stmt *pWriter; /* "INSERT ... %_data VALUES(?,?)" */
14636 sqlite3_stmt *pDeleter; /* "DELETE FROM %_data ... id>=? AND id<=?" */
14637 sqlite3_stmt *pIdxWriter; /* "INSERT ... %_idx VALUES(?,?,?,?)" */
14638 sqlite3_stmt *pIdxDeleter; /* "DELETE FROM %_idx WHERE segid=?" */
14639 sqlite3_stmt *pIdxSelect;
14640 int nRead; /* Total number of blocks read */
14641 };
14642
14643 struct Fts5DoclistIter {
14644 u8 *aEof; /* Pointer to 1 byte past end of doclist */
14645
14646 /* Output variables. aPoslist==0 at EOF */
14647 i64 iRowid;
14648 u8 *aPoslist;
14649 int nPoslist;
14650 int nSize;
14651 };
14652
14653 /*
14654 ** The contents of the "structure" record for each index are represented
14655 ** using an Fts5Structure record in memory, which uses instances of the
14656 ** other Fts5StructureXXX types as components.
14657 */
14658 struct Fts5StructureSegment {
14659 int iSegid; /* Segment id */
14660 int pgnoFirst; /* First leaf page number in segment */
14661 int pgnoLast; /* Last leaf page number in segment */
14662 };
14663 struct Fts5StructureLevel {
14664 int nMerge; /* Number of segments in incr-merge */
14665 int nSeg; /* Total number of segments on level */
14666 Fts5StructureSegment *aSeg; /* Array of segments. aSeg[0] is oldest. */
14667 };
14668 struct Fts5Structure {
14669 int nRef; /* Object reference count */
14670 u64 nWriteCounter; /* Total leaves written to level 0 */
14671 int nSegment; /* Total segments in this structure */
14672 int nLevel; /* Number of levels in this index */
14673 Fts5StructureLevel aLevel[1]; /* Array of nLevel level objects */
14674 };
14675
14676 /*
14677 ** An object of type Fts5SegWriter is used to write to segments.
14678 */
14679 struct Fts5PageWriter {
14680 int pgno; /* Page number for this page */
14681 int iPrevPgidx; /* Previous value written into pgidx */
14682 Fts5Buffer buf; /* Buffer containing leaf data */
14683 Fts5Buffer pgidx; /* Buffer containing page-index */
14684 Fts5Buffer term; /* Buffer containing previous term on page */
14685 };
14686 struct Fts5DlidxWriter {
14687 int pgno; /* Page number for this page */
14688 int bPrevValid; /* True if iPrev is valid */
14689 i64 iPrev; /* Previous rowid value written to page */
14690 Fts5Buffer buf; /* Buffer containing page data */
14691 };
14692 struct Fts5SegWriter {
14693 int iSegid; /* Segid to write to */
14694 Fts5PageWriter writer; /* PageWriter object */
14695 i64 iPrevRowid; /* Previous rowid written to current leaf */
14696 u8 bFirstRowidInDoclist; /* True if next rowid is first in doclist */
14697 u8 bFirstRowidInPage; /* True if next rowid is first in page */
14698 /* TODO1: Can use (writer.pgidx.n==0) instead of bFirstTermInPage */
14699 u8 bFirstTermInPage; /* True if next term will be first in leaf */
14700 int nLeafWritten; /* Number of leaf pages written */
14701 int nEmpty; /* Number of contiguous term-less nodes */
14702
14703 int nDlidx; /* Allocated size of aDlidx[] array */
14704 Fts5DlidxWriter *aDlidx; /* Array of Fts5DlidxWriter objects */
14705
14706 /* Values to insert into the %_idx table */
14707 Fts5Buffer btterm; /* Next term to insert into %_idx table */
14708 int iBtPage; /* Page number corresponding to btterm */
14709 };
14710
14711 typedef struct Fts5CResult Fts5CResult;
14712 struct Fts5CResult {
14713 u16 iFirst; /* aSeg[] index of firstest iterator */
14714 u8 bTermEq; /* True if the terms are equal */
14715 };
14716
14717 /*
14718 ** Object for iterating through a single segment, visiting each term/rowid
14719 ** pair in the segment.
14720 **
14721 ** pSeg:
14722 ** The segment to iterate through.
14723 **
14724 ** iLeafPgno:
14725 ** Current leaf page number within segment.
14726 **
14727 ** iLeafOffset:
14728 ** Byte offset within the current leaf that is the first byte of the
14729 ** position list data (one byte past the position-list size field).
14730 ** rowid field of the current entry. Usually this is the size field of the
14731 ** position list data. The exception is if the rowid for the current entry
14732 ** is the last thing on the leaf page.
14733 **
14734 ** pLeaf:
14735 ** Buffer containing current leaf page data. Set to NULL at EOF.
14736 **
14737 ** iTermLeafPgno, iTermLeafOffset:
14738 ** Leaf page number containing the last term read from the segment. And
14739 ** the offset immediately following the term data.
14740 **
14741 ** flags:
14742 ** Mask of FTS5_SEGITER_XXX values. Interpreted as follows:
14743 **
14744 ** FTS5_SEGITER_ONETERM:
14745 ** If set, set the iterator to point to EOF after the current doclist
14746 ** has been exhausted. Do not proceed to the next term in the segment.
14747 **
14748 ** FTS5_SEGITER_REVERSE:
14749 ** This flag is only ever set if FTS5_SEGITER_ONETERM is also set. If
14750 ** it is set, iterate through rowid in descending order instead of the
14751 ** default ascending order.
14752 **
14753 ** iRowidOffset/nRowidOffset/aRowidOffset:
14754 ** These are used if the FTS5_SEGITER_REVERSE flag is set.
14755 **
14756 ** For each rowid on the page corresponding to the current term, the
14757 ** corresponding aRowidOffset[] entry is set to the byte offset of the
14758 ** start of the "position-list-size" field within the page.
14759 **
14760 ** iTermIdx:
14761 ** Index of current term on iTermLeafPgno.
14762 */
14763 struct Fts5SegIter {
14764 Fts5StructureSegment *pSeg; /* Segment to iterate through */
14765 int flags; /* Mask of configuration flags */
14766 int iLeafPgno; /* Current leaf page number */
14767 Fts5Data *pLeaf; /* Current leaf data */
14768 Fts5Data *pNextLeaf; /* Leaf page (iLeafPgno+1) */
14769 int iLeafOffset; /* Byte offset within current leaf */
14770
14771 /* The page and offset from which the current term was read. The offset
14772 ** is the offset of the first rowid in the current doclist. */
14773 int iTermLeafPgno;
14774 int iTermLeafOffset;
14775
14776 int iPgidxOff; /* Next offset in pgidx */
14777 int iEndofDoclist;
14778
14779 /* The following are only used if the FTS5_SEGITER_REVERSE flag is set. */
14780 int iRowidOffset; /* Current entry in aRowidOffset[] */
14781 int nRowidOffset; /* Allocated size of aRowidOffset[] array */
14782 int *aRowidOffset; /* Array of offset to rowid fields */
14783
14784 Fts5DlidxIter *pDlidx; /* If there is a doclist-index */
14785
14786 /* Variables populated based on current entry. */
14787 Fts5Buffer term; /* Current term */
14788 i64 iRowid; /* Current rowid */
14789 int nPos; /* Number of bytes in current position list */
14790 int bDel; /* True if the delete flag is set */
14791 };
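The nPos and bDel fields of Fts5SegIter both come from the single size varint described in the "poslist format" comment earlier: the byte count is doubled and the low bit carries the delete flag. A minimal sketch of that packing, with hypothetical helper names and operating on the already-decoded integer value rather than on raw varint bytes:

```c
#include <assert.h>
#include <stdint.h>

/* Combine a poslist byte count and a delete flag into the value that
** would be stored as the poslist size varint. */
static uint32_t poslist_size_encode(uint32_t nPos, int bDel){
  assert( nPos<=0x7FFFFFFFu );          /* must survive the shift */
  return (nPos << 1) | (bDel ? 1u : 0u);
}

/* Split a stored size value back into byte count and delete flag. */
static void poslist_size_decode(uint32_t v, uint32_t *pnPos, int *pbDel){
  *pnPos = v >> 1;
  *pbDel = (int)(v & 1u);
}
```

For example, a 5-byte poslist on a deleted entry is stored as 11, and an empty poslist without the flag is stored as 0.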
14792
14793 /*
14794 ** Argument is a pointer to an Fts5Data structure that contains a
14795 ** leaf page.
14796 */
14797 #define ASSERT_SZLEAF_OK(x) assert( \
14798 (x)->szLeaf==(x)->nn || (x)->szLeaf==fts5GetU16(&(x)->p[2]) \
14799 )
14800
14801 #define FTS5_SEGITER_ONETERM 0x01
14802 #define FTS5_SEGITER_REVERSE 0x02
14803
14804
14805 /*
14806 ** Argument is a pointer to an Fts5Data structure that contains a leaf
14807 ** page. This macro evaluates to true if the leaf contains no terms, or
14808 ** false if it contains at least one term.
14809 */
14810 #define fts5LeafIsTermless(x) ((x)->szLeaf >= (x)->nn)
14811
14812 #define fts5LeafTermOff(x, i) (fts5GetU16(&(x)->p[(x)->szLeaf + (i)*2]))
14813
14814 #define fts5LeafFirstRowidOff(x) (fts5GetU16((x)->p))
14815
14816 /*
14817 ** Object for iterating through the merged results of one or more segments,
14818 ** visiting each term/rowid pair in the merged data.
14819 **
14820 ** nSeg is always a power of two greater than or equal to the number of
14821 ** segments that this object is merging data from. Both the aSeg[] and
14822 ** aFirst[] arrays are sized at nSeg entries. The aSeg[] array is padded
14823 ** with zeroed objects - these are handled as if they were iterators opened
14824 ** on empty segments.
14825 **
14826 ** The results of comparing segments aSeg[N] and aSeg[N+1], where N is an
14827 ** even number, is stored in aFirst[(nSeg+N)/2]. The "result" of the
14828 ** comparison in this context is the index of the iterator that currently
14829 ** points to the smaller term/rowid combination. Iterators at EOF are
14830 ** considered to be greater than all other iterators.
14831 **
14832 ** aFirst[1] contains the index in aSeg[] of the iterator that points to
14833 ** the smallest key overall. aFirst[0] is unused.
14834 **
14835 ** poslist:
14836 ** Used by sqlite3Fts5IterPoslist() when the poslist needs to be buffered.
14837 ** There is no way to tell if this is populated or not.
14838 */
14839 struct Fts5IndexIter {
14840 Fts5Index *pIndex; /* Index that owns this iterator */
14841 Fts5Structure *pStruct; /* Database structure for this iterator */
14842 Fts5Buffer poslist; /* Buffer containing current poslist */
14843
14844 int nSeg; /* Size of aSeg[] array */
14845 int bRev; /* True to iterate in reverse order */
14846 u8 bSkipEmpty; /* True to skip deleted entries */
14847 u8 bEof; /* True at EOF */
14848 u8 bFiltered; /* True if column-filter already applied */
14849
14850 i64 iSwitchRowid; /* Firstest rowid of other than aFirst[1] */
14851 Fts5CResult *aFirst; /* Current merge state (see above) */
14852 Fts5SegIter aSeg[1]; /* Array of segment iterators */
14853 };
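The aFirst[] layout described in the comment above is a tournament tree stored in an array. The sketch below rebuilds such a tree for four iterators, using plain integer keys in place of term/rowid pairs and INT_MAX in place of an iterator at EOF. `NSEG` and `tree_build` are hypothetical names, and breaking ties in favour of the lower index is an assumption of this sketch, not something the comment specifies.

```c
#include <assert.h>
#include <limits.h>

#define NSEG 4   /* number of iterator slots; must be a power of two */

/* Rebuild aFirst[1..NSEG-1] from scratch. aKey[i] is the current key of
** iterator i. Slots NSEG/2..NSEG-1 compare adjacent pairs of iterators;
** lower slots compare the winners of their two children. */
static void tree_build(const int aKey[NSEG], int aFirst[NSEG]){
  int i;
  assert( (NSEG & (NSEG-1))==0 );
  aFirst[0] = 0;                        /* slot 0 is unused */
  for(i=NSEG-1; i>=1; i--){
    int i1, i2;
    if( i>=NSEG/2 ){
      i1 = (i - NSEG/2) * 2;            /* compares aKey[N] and aKey[N+1] */
      i2 = i1 + 1;
    }else{
      i1 = aFirst[i*2];
      i2 = aFirst[i*2+1];
    }
    aFirst[i] = (aKey[i2] < aKey[i1]) ? i2 : i1;
  }
}
```

After a build, aFirst[1] names the iterator holding the smallest key. When that iterator advances, only the slots on the path from its leaf pair up to slot 1 need to be recomputed, which is the usual reason for choosing this layout; the sketch recomputes everything for brevity.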
14854
14855
14856 /*
14857 ** An instance of the following type is used to iterate through the contents
14858 ** of a doclist-index record.
14859 **
14860 ** pData:
14861 ** Record containing the doclist-index data.
14862 **
14863 ** bEof:
14864 ** Set to true once iterator has reached EOF.
14865 **
14866 ** iOff:
14867 ** Set to the current offset within record pData.
14868 */
14869 struct Fts5DlidxLvl {
14870 Fts5Data *pData; /* Data for current page of this level */
14871 int iOff; /* Current offset into pData */
14872 int bEof; /* At EOF already */
14873 int iFirstOff; /* Used by reverse iterators */
14874
14875 /* Output variables */
14876 int iLeafPgno; /* Page number of current leaf page */
14877 i64 iRowid; /* First rowid on leaf iLeafPgno */
14878 };
14879 struct Fts5DlidxIter {
14880 int nLvl;
14881 int iSegid;
14882 Fts5DlidxLvl aLvl[1];
14883 };
14884
14885 static void fts5PutU16(u8 *aOut, u16 iVal){
14886 aOut[0] = (iVal>>8);
14887 aOut[1] = (iVal&0xFF);
14888 }
14889
14890 static u16 fts5GetU16(const u8 *aIn){
14891 return ((u16)aIn[0] << 8) + aIn[1];
14892 }
14893
14894 /*
14895 ** Allocate and return a buffer at least nByte bytes in size.
14896 **
14897 ** If an OOM error is encountered, return NULL and set the error code in
14898 ** the Fts5Index handle passed as the first argument.
14899 */
14900 static void *fts5IdxMalloc(Fts5Index *p, int nByte){
14901 return sqlite3Fts5MallocZero(&p->rc, nByte);
14902 }
14903
14904 /*
14905 ** Compare the contents of the pLeft buffer with the pRight/nRight blob.
14906 **
14907 ** Return -ve if pLeft is smaller than pRight, 0 if they are equal or
14908 ** +ve if pRight is smaller than pLeft. In other words:
14909 **
14910 ** res = *pLeft - *pRight
14911 */
14912 #ifdef SQLITE_DEBUG
14913 static int fts5BufferCompareBlob(
14914 Fts5Buffer *pLeft, /* Left hand side of comparison */
14915 const u8 *pRight, int nRight /* Right hand side of comparison */
14916 ){
14917 int nCmp = MIN(pLeft->n, nRight);
14918 int res = memcmp(pLeft->p, pRight, nCmp);
14919 return (res==0 ? (pLeft->n - nRight) : res);
14920 }
14921 #endif
14922
14923 /*
14924 ** Compare the contents of the two buffers using memcmp(). If one buffer
14925 ** is a prefix of the other, it is considered the lesser.
14926 **
14927 ** Return -ve if pLeft is smaller than pRight, 0 if they are equal or
14928 ** +ve if pRight is smaller than pLeft. In other words:
14929 **
14930 ** res = *pLeft - *pRight
14931 */
14932 static int fts5BufferCompare(Fts5Buffer *pLeft, Fts5Buffer *pRight){
14933 int nCmp = MIN(pLeft->n, pRight->n);
14934 int res = memcmp(pLeft->p, pRight->p, nCmp);
14935 return (res==0 ? (pLeft->n - pRight->n) : res);
14936 }
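The "prefix is the lesser" rule used by the comparison helpers above can be checked with a small stand-alone version. `Buf` and `buf_compare` are hypothetical stand-ins for Fts5Buffer and fts5BufferCompare(); the only deliberate difference is that this sketch skips the memcmp() call when the common length is zero.

```c
#include <assert.h>
#include <string.h>

typedef struct { const unsigned char *p; int n; } Buf;   /* hypothetical */

/* Compare the common prefix with memcmp(); if it matches, the shorter
** buffer sorts first. Returns <0, 0 or >0. */
static int buf_compare(const Buf *pLeft, const Buf *pRight){
  int nCmp, res;
  assert( pLeft!=0 && pRight!=0 );
  nCmp = pLeft->n < pRight->n ? pLeft->n : pRight->n;
  res = nCmp>0 ? memcmp(pLeft->p, pRight->p, (size_t)nCmp) : 0;
  return res==0 ? (pLeft->n - pRight->n) : res;
}
```

With this ordering "abc" sorts before "abcd", while "abd" sorts after "abcd" because the third byte already differs.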
14937
14938 #ifdef SQLITE_DEBUG
14939 static int fts5BlobCompare(
14940 const u8 *pLeft, int nLeft,
14941 const u8 *pRight, int nRight
14942 ){
14943 int nCmp = MIN(nLeft, nRight);
14944 int res = memcmp(pLeft, pRight, nCmp);
14945 return (res==0 ? (nLeft - nRight) : res);
14946 }
14947 #endif
14948
14949 static int fts5LeafFirstTermOff(Fts5Data *pLeaf){
14950 int ret;
14951 fts5GetVarint32(&pLeaf->p[pLeaf->szLeaf], ret);
14952 return ret;
14953 }
14954
14955 /*
14956 ** Close the read-only blob handle, if it is open.
14957 */
14958 static void fts5CloseReader(Fts5Index *p){
14959 if( p->pReader ){
14960 sqlite3_blob *pReader = p->pReader;
14961 p->pReader = 0;
14962 sqlite3_blob_close(pReader);
14963 }
14964 }
14965
14966
14967 /*
14968 ** Retrieve a record from the %_data table.
14969 **
14970 ** If an error occurs, NULL is returned and an error left in the
14971 ** Fts5Index object.
14972 */
14973 static Fts5Data *fts5DataRead(Fts5Index *p, i64 iRowid){
14974 Fts5Data *pRet = 0;
14975 if( p->rc==SQLITE_OK ){
14976 int rc = SQLITE_OK;
14977
14978 if( p->pReader ){
14979 /* This call may return SQLITE_ABORT if there has been a savepoint
14980 ** rollback since it was last used. In this case a new blob handle
14981 ** is required. */
14982 sqlite3_blob *pBlob = p->pReader;
14983 p->pReader = 0;
14984 rc = sqlite3_blob_reopen(pBlob, iRowid);
14985 assert( p->pReader==0 );
14986 p->pReader = pBlob;
14987 if( rc!=SQLITE_OK ){
14988 fts5CloseReader(p);
14989 }
14990 if( rc==SQLITE_ABORT ) rc = SQLITE_OK;
14991 }
14992
14993 /* If the blob handle is not open at this point, open it and seek
14994 ** to the requested entry. */
14995 if( p->pReader==0 && rc==SQLITE_OK ){
14996 Fts5Config *pConfig = p->pConfig;
14997 rc = sqlite3_blob_open(pConfig->db,
14998 pConfig->zDb, p->zDataTbl, "block", iRowid, 0, &p->pReader
14999 );
15000 }
15001
15002 /* If either of the sqlite3_blob_open() or sqlite3_blob_reopen() calls
15003 ** above returned SQLITE_ERROR, return SQLITE_CORRUPT_VTAB instead.
15004 ** All the reasons those functions might return SQLITE_ERROR - missing
15005 ** table, missing row, non-blob/text in block column - indicate
15006 ** backing store corruption. */
15007 if( rc==SQLITE_ERROR ) rc = FTS5_CORRUPT;
15008
15009 if( rc==SQLITE_OK ){
15010 u8 *aOut = 0; /* Read blob data into this buffer */
15011 int nByte = sqlite3_blob_bytes(p->pReader);
15012 int nAlloc = sizeof(Fts5Data) + nByte + FTS5_DATA_PADDING;
15013 pRet = (Fts5Data*)sqlite3_malloc(nAlloc);
15014 if( pRet ){
15015 pRet->nn = nByte;
15016 aOut = pRet->p = (u8*)&pRet[1];
15017 }else{
15018 rc = SQLITE_NOMEM;
15019 }
15020
15021 if( rc==SQLITE_OK ){
15022 rc = sqlite3_blob_read(p->pReader, aOut, nByte, 0);
15023 }
15024 if( rc!=SQLITE_OK ){
15025 sqlite3_free(pRet);
15026 pRet = 0;
15027 }else{
15028 /* TODO1: Fix this */
15029 pRet->szLeaf = fts5GetU16(&pRet->p[2]);
15030 }
15031 }
15032 p->rc = rc;
15033 p->nRead++;
15034 }
15035
15036 assert( (pRet==0)==(p->rc!=SQLITE_OK) );
15037 return pRet;
15038 }
15039
15040 /*
15041 ** Release a reference to data record returned by an earlier call to
15042 ** fts5DataRead().
15043 */
15044 static void fts5DataRelease(Fts5Data *pData){
15045 sqlite3_free(pData);
15046 }
15047
15048 static int fts5IndexPrepareStmt(
15049 Fts5Index *p,
15050 sqlite3_stmt **ppStmt,
15051 char *zSql
15052 ){
15053 if( p->rc==SQLITE_OK ){
15054 if( zSql ){
15055 p->rc = sqlite3_prepare_v2(p->pConfig->db, zSql, -1, ppStmt, 0);
15056 }else{
15057 p->rc = SQLITE_NOMEM;
15058 }
15059 }
15060 sqlite3_free(zSql);
15061 return p->rc;
15062 }
15063
15064
15065 /*
15066 ** INSERT OR REPLACE a record into the %_data table.
15067 */
15068 static void fts5DataWrite(Fts5Index *p, i64 iRowid, const u8 *pData, int nData){
15069 if( p->rc!=SQLITE_OK ) return;
15070
15071 if( p->pWriter==0 ){
15072 Fts5Config *pConfig = p->pConfig;
15073 fts5IndexPrepareStmt(p, &p->pWriter, sqlite3_mprintf(
15074 "REPLACE INTO '%q'.'%q_data'(id, block) VALUES(?,?)",
15075 pConfig->zDb, pConfig->zName
15076 ));
15077 if( p->rc ) return;
15078 }
15079
15080 sqlite3_bind_int64(p->pWriter, 1, iRowid);
15081 sqlite3_bind_blob(p->pWriter, 2, pData, nData, SQLITE_STATIC);
15082 sqlite3_step(p->pWriter);
15083 p->rc = sqlite3_reset(p->pWriter);
15084 }
15085
15086 /*
15087 ** Execute the following SQL:
15088 **
15089 ** DELETE FROM %_data WHERE id BETWEEN $iFirst AND $iLast
15090 */
15091 static void fts5DataDelete(Fts5Index *p, i64 iFirst, i64 iLast){
15092 if( p->rc!=SQLITE_OK ) return;
15093
15094 if( p->pDeleter==0 ){
15095 int rc;
15096 Fts5Config *pConfig = p->pConfig;
15097 char *zSql = sqlite3_mprintf(
15098 "DELETE FROM '%q'.'%q_data' WHERE id>=? AND id<=?",
15099 pConfig->zDb, pConfig->zName
15100 );
15101 if( zSql==0 ){
15102 rc = SQLITE_NOMEM;
15103 }else{
15104 rc = sqlite3_prepare_v2(pConfig->db, zSql, -1, &p->pDeleter, 0);
15105 sqlite3_free(zSql);
15106 }
15107 if( rc!=SQLITE_OK ){
15108 p->rc = rc;
15109 return;
15110 }
15111 }
15112
15113 sqlite3_bind_int64(p->pDeleter, 1, iFirst);
15114 sqlite3_bind_int64(p->pDeleter, 2, iLast);
15115 sqlite3_step(p->pDeleter);
15116 p->rc = sqlite3_reset(p->pDeleter);
15117 }
15118
15119 /*
15120 ** Remove all records associated with segment iSegid.
15121 */
15122 static void fts5DataRemoveSegment(Fts5Index *p, int iSegid){
15123 i64 iFirst = FTS5_SEGMENT_ROWID(iSegid, 0);
15124 i64 iLast = FTS5_SEGMENT_ROWID(iSegid+1, 0)-1;
15125 fts5DataDelete(p, iFirst, iLast);
15126 if( p->pIdxDeleter==0 ){
15127 Fts5Config *pConfig = p->pConfig;
15128 fts5IndexPrepareStmt(p, &p->pIdxDeleter, sqlite3_mprintf(
15129 "DELETE FROM '%q'.'%q_idx' WHERE segid=?",
15130 pConfig->zDb, pConfig->zName
15131 ));
15132 }
15133 if( p->rc==SQLITE_OK ){
15134 sqlite3_bind_int(p->pIdxDeleter, 1, iSegid);
15135 sqlite3_step(p->pIdxDeleter);
15136 p->rc = sqlite3_reset(p->pIdxDeleter);
15137 }
15138 }
15139
15140 /*
15141 ** Release a reference to an Fts5Structure object returned by an earlier
15142 ** call to fts5StructureRead() or fts5StructureDecode().
15143 */
15144 static void fts5StructureRelease(Fts5Structure *pStruct){
15145 if( pStruct && 0>=(--pStruct->nRef) ){
15146 int i;
15147 assert( pStruct->nRef==0 );
15148 for(i=0; i<pStruct->nLevel; i++){
15149 sqlite3_free(pStruct->aLevel[i].aSeg);
15150 }
15151 sqlite3_free(pStruct);
15152 }
15153 }
15154
15155 static void fts5StructureRef(Fts5Structure *pStruct){
15156 pStruct->nRef++;
15157 }
15158
15159 /*
15160 ** Deserialize and return the structure record currently stored in serialized
15161 ** form within buffer pData/nData.
15162 **
15163 ** The Fts5Structure.aLevel[] and each Fts5StructureLevel.aSeg[] array
15164 ** are over-allocated by one slot. This allows the structure contents
15165 ** to be more easily edited.
15166 **
15167 ** If an error occurs, *ppOut is set to NULL and an SQLite error code
15168 ** returned. Otherwise, *ppOut is set to point to the new object and
15169 ** SQLITE_OK returned.
15170 */
15171 static int fts5StructureDecode(
15172 const u8 *pData, /* Buffer containing serialized structure */
15173 int nData, /* Size of buffer pData in bytes */
15174 int *piCookie, /* Configuration cookie value */
15175 Fts5Structure **ppOut /* OUT: Deserialized object */
15176 ){
15177 int rc = SQLITE_OK;
15178 int i = 0;
15179 int iLvl;
15180 int nLevel = 0;
15181 int nSegment = 0;
15182 int nByte; /* Bytes of space to allocate at pRet */
15183 Fts5Structure *pRet = 0; /* Structure object to return */
15184
15185 /* Grab the cookie value */
15186 if( piCookie ) *piCookie = sqlite3Fts5Get32(pData);
15187 i = 4;
15188
15189 /* Read the total number of levels and segments from the start of the
15190 ** structure record. */
15191 i += fts5GetVarint32(&pData[i], nLevel);
15192 i += fts5GetVarint32(&pData[i], nSegment);
15193 nByte = (
15194 sizeof(Fts5Structure) + /* Main structure */
15195 sizeof(Fts5StructureLevel) * (nLevel-1) /* aLevel[] array */
15196 );
15197 pRet = (Fts5Structure*)sqlite3Fts5MallocZero(&rc, nByte);
15198
15199 if( pRet ){
15200 pRet->nRef = 1;
15201 pRet->nLevel = nLevel;
15202 pRet->nSegment = nSegment;
15203 i += sqlite3Fts5GetVarint(&pData[i], &pRet->nWriteCounter);
15204
15205 for(iLvl=0; rc==SQLITE_OK && iLvl<nLevel; iLvl++){
15206 Fts5StructureLevel *pLvl = &pRet->aLevel[iLvl];
15207 int nTotal;
15208 int iSeg;
15209
15210 i += fts5GetVarint32(&pData[i], pLvl->nMerge);
15211 i += fts5GetVarint32(&pData[i], nTotal);
15212 assert( nTotal>=pLvl->nMerge );
15213 pLvl->aSeg = (Fts5StructureSegment*)sqlite3Fts5MallocZero(&rc,
15214 nTotal * sizeof(Fts5StructureSegment)
15215 );
15216
15217 if( rc==SQLITE_OK ){
15218 pLvl->nSeg = nTotal;
15219 for(iSeg=0; iSeg<nTotal; iSeg++){
15220 i += fts5GetVarint32(&pData[i], pLvl->aSeg[iSeg].iSegid);
15221 i += fts5GetVarint32(&pData[i], pLvl->aSeg[iSeg].pgnoFirst);
15222 i += fts5GetVarint32(&pData[i], pLvl->aSeg[iSeg].pgnoLast);
15223 }
15224 }else{
15225 fts5StructureRelease(pRet);
15226 pRet = 0;
15227 }
15228 }
15229 }
15230
15231 *ppOut = pRet;
15232 return rc;
15233 }
15234
15235 /*
15236 ** Add a new, zeroed level to the aLevel[] array of (*ppStruct).
15237 */
15238 static void fts5StructureAddLevel(int *pRc, Fts5Structure **ppStruct){
15239 if( *pRc==SQLITE_OK ){
15240 Fts5Structure *pStruct = *ppStruct;
15241 int nLevel = pStruct->nLevel;
15242 int nByte = (
15243 sizeof(Fts5Structure) + /* Main structure */
15244 sizeof(Fts5StructureLevel) * (nLevel+1) /* aLevel[] array */
15245 );
15246
15247 pStruct = sqlite3_realloc(pStruct, nByte);
15248 if( pStruct ){
15249 memset(&pStruct->aLevel[nLevel], 0, sizeof(Fts5StructureLevel));
15250 pStruct->nLevel++;
15251 *ppStruct = pStruct;
15252 }else{
15253 *pRc = SQLITE_NOMEM;
15254 }
15255 }
15256 }
15257
15258 /*
15259 ** Extend level iLvl so that there is room for at least nExtra more
15260 ** segments.
15261 */
15262 static void fts5StructureExtendLevel(
15263 int *pRc,
15264 Fts5Structure *pStruct,
15265 int iLvl,
15266 int nExtra,
15267 int bInsert
15268 ){
15269 if( *pRc==SQLITE_OK ){
15270 Fts5StructureLevel *pLvl = &pStruct->aLevel[iLvl];
15271 Fts5StructureSegment *aNew;
15272 int nByte;
15273
15274 nByte = (pLvl->nSeg + nExtra) * sizeof(Fts5StructureSegment);
15275 aNew = sqlite3_realloc(pLvl->aSeg, nByte);
15276 if( aNew ){
15277 if( bInsert==0 ){
15278 memset(&aNew[pLvl->nSeg], 0, sizeof(Fts5StructureSegment) * nExtra);
15279 }else{
15280 int nMove = pLvl->nSeg * sizeof(Fts5StructureSegment);
15281 memmove(&aNew[nExtra], aNew, nMove);
15282 memset(aNew, 0, sizeof(Fts5StructureSegment) * nExtra);
15283 }
15284 pLvl->aSeg = aNew;
15285 }else{
15286 *pRc = SQLITE_NOMEM;
15287 }
15288 }
15289 }
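
The grow-then-shift pattern in fts5StructureExtendLevel() can be sketched in isolation. The following is a minimal illustration on a plain int array, not the FTS5 code itself; `extend_array` is a hypothetical name, and it uses the C library realloc() where the original uses sqlite3_realloc(). As in the original, the caller is assumed to update its own element count afterwards.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for fts5StructureExtendLevel(): grow *pa, which
** holds n ints, by nExtra zeroed slots. If bInsert is non-zero the new
** slots go at the front, otherwise at the end. Returns 0 on success, or
** 1 if realloc() fails, in which case *pa is left untouched. */
static int extend_array(int **pa, int n, int nExtra, int bInsert){
  int *aNew;
  assert( n>=0 && nExtra>0 );
  aNew = realloc(*pa, (size_t)(n + nExtra) * sizeof(int));
  if( aNew==0 ) return 1;
  if( bInsert==0 ){
    /* Append: zero the tail that realloc() left uninitialized. */
    memset(&aNew[n], 0, sizeof(int) * (size_t)nExtra);
  }else{
    /* Insert: source and destination overlap, so memmove() is required. */
    memmove(&aNew[nExtra], aNew, sizeof(int) * (size_t)n);
    memset(aNew, 0, sizeof(int) * (size_t)nExtra);
  }
  *pa = aNew;
  return 0;
}
```

Keeping the old pointer until realloc() succeeds is what lets the caller report SQLITE_NOMEM without losing the existing array.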
15290
15291 /*
15292 ** Read, deserialize and return the structure record.
15293 **
15294 ** The Fts5Structure.aLevel[] and each Fts5StructureLevel.aSeg[] array
15295 ** are over-allocated as described for function fts5StructureDecode()
15296 ** above.
15297 **
15298 ** If an error occurs, NULL is returned and an error code left in the
15299 ** Fts5Index handle. If an error has already occurred when this function
15300 ** is called, it is a no-op.
15301 */
15302 static Fts5Structure *fts5StructureRead(Fts5Index *p){
15303 Fts5Config *pConfig = p->pConfig;
15304 Fts5Structure *pRet = 0; /* Object to return */
15305 int iCookie; /* Configuration cookie */
15306 Fts5Data *pData;
15307
15308 pData = fts5DataRead(p, FTS5_STRUCTURE_ROWID);
15309 if( p->rc ) return 0;
15310 /* TODO: Do we need this if the leaf-index is appended? Probably... */
15311 memset(&pData->p[pData->nn], 0, FTS5_DATA_PADDING);
15312 p->rc = fts5StructureDecode(pData->p, pData->nn, &iCookie, &pRet);
15313 if( p->rc==SQLITE_OK && pConfig->iCookie!=iCookie ){
15314 p->rc = sqlite3Fts5ConfigLoad(pConfig, iCookie);
15315 }
15316
15317 fts5DataRelease(pData);
15318 if( p->rc!=SQLITE_OK ){
15319 fts5StructureRelease(pRet);
15320 pRet = 0;
15321 }
15322 return pRet;
15323 }
15324
15325 /*
15326 ** Return the total number of segments in index structure pStruct. This
15327 ** function is only ever used as part of assert() conditions.
15328 */
15329 #ifdef SQLITE_DEBUG
15330 static int fts5StructureCountSegments(Fts5Structure *pStruct){
15331 int nSegment = 0; /* Total number of segments */
15332 if( pStruct ){
15333 int iLvl; /* Used to iterate through levels */
15334 for(iLvl=0; iLvl<pStruct->nLevel; iLvl++){
15335 nSegment += pStruct->aLevel[iLvl].nSeg;
15336 }
15337 }
15338
15339 return nSegment;
15340 }
15341 #endif
15342
15343 #define fts5BufferSafeAppendBlob(pBuf, pBlob, nBlob) { \
15344 assert( (pBuf)->nSpace>=((pBuf)->n+nBlob) ); \
15345 memcpy(&(pBuf)->p[(pBuf)->n], pBlob, nBlob); \
15346 (pBuf)->n += nBlob; \
15347 }
15348
15349 #define fts5BufferSafeAppendVarint(pBuf, iVal) { \
15350 (pBuf)->n += sqlite3Fts5PutVarint(&(pBuf)->p[(pBuf)->n], (iVal)); \
15351 assert( (pBuf)->nSpace>=(pBuf)->n ); \
15352 }
15353
15354
15355 /*
15356 ** Serialize and store the "structure" record.
15357 **
15358 ** If an error occurs, leave an error code in the Fts5Index object. If an
15359 ** error has already occurred, this function is a no-op.
15360 */
15361 static void fts5StructureWrite(Fts5Index *p, Fts5Structure *pStruct){
15362 if( p->rc==SQLITE_OK ){
15363 Fts5Buffer buf; /* Buffer to serialize record into */
15364 int iLvl; /* Used to iterate through levels */
15365 int iCookie; /* Cookie value to store */
15366
15367 assert( pStruct->nSegment==fts5StructureCountSegments(pStruct) );
15368 memset(&buf, 0, sizeof(Fts5Buffer));
15369
15370 /* Append the current configuration cookie */
15371 iCookie = p->pConfig->iCookie;
15372 if( iCookie<0 ) iCookie = 0;
15373
15374 if( 0==sqlite3Fts5BufferSize(&p->rc, &buf, 4+9+9+9) ){
15375 sqlite3Fts5Put32(buf.p, iCookie);
15376 buf.n = 4;
15377 fts5BufferSafeAppendVarint(&buf, pStruct->nLevel);
15378 fts5BufferSafeAppendVarint(&buf, pStruct->nSegment);
15379 fts5BufferSafeAppendVarint(&buf, (i64)pStruct->nWriteCounter);
15380 }
15381
15382 for(iLvl=0; iLvl<pStruct->nLevel; iLvl++){
15383 int iSeg; /* Used to iterate through segments */
15384 Fts5StructureLevel *pLvl = &pStruct->aLevel[iLvl];
15385 fts5BufferAppendVarint(&p->rc, &buf, pLvl->nMerge);
15386 fts5BufferAppendVarint(&p->rc, &buf, pLvl->nSeg);
15387 assert( pLvl->nMerge<=pLvl->nSeg );
15388
15389 for(iSeg=0; iSeg<pLvl->nSeg; iSeg++){
15390 fts5BufferAppendVarint(&p->rc, &buf, pLvl->aSeg[iSeg].iSegid);
15391 fts5BufferAppendVarint(&p->rc, &buf, pLvl->aSeg[iSeg].pgnoFirst);
15392 fts5BufferAppendVarint(&p->rc, &buf, pLvl->aSeg[iSeg].pgnoLast);
15393 }
15394 }
15395
15396 fts5DataWrite(p, FTS5_STRUCTURE_ROWID, buf.p, buf.n);
15397 fts5BufferFree(&buf);
15398 }
15399 }
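
For reference, the field order that fts5StructureWrite() and fts5StructureDecode() agree on can be sketched as a round trip. This is a simplified illustration under stated assumptions, not the FTS5 implementation: `put_varint`, `get_varint`, `write_structure`, `read_structure` and `Seg` are hypothetical names, the varint helpers only cover values below 2^28, the cookie is assumed to be stored big-endian, and only a single level is serialized.

```c
#include <assert.h>
#include <stdint.h>

typedef struct Seg { uint32_t iSegid, pgnoFirst, pgnoLast; } Seg;

/* Simplified varint writer: big-endian base-128, high bit set on every
** byte except the last. ASSUMPTION: v < 2^28, so at most 4 bytes. */
static int put_varint(uint8_t *p, uint32_t v){
  int n = 1, i;
  uint32_t t;
  assert( v<(1u<<28) );
  for(t=v>>7; t; t>>=7) n++;
  for(i=n-1; i>=0; i--){
    p[i] = (uint8_t)((v & 0x7f) | (i==n-1 ? 0x00 : 0x80));
    v >>= 7;
  }
  return n;
}

/* Matching reader. Returns the number of bytes consumed. */
static int get_varint(const uint8_t *p, uint32_t *pv){
  uint32_t v = 0;
  int n = 0;
  do{
    v = (v<<7) | (p[n] & 0x7f);
  }while( (p[n++] & 0x80) && n<4 );
  *pv = v;
  return n;
}

/* Serialize a one-level record: cookie, nLevel, nSegment, nWriteCounter,
** then nMerge, nSeg and three varints per segment. The caller supplies
** at least 4 + 4*(5 + 3*nSeg) bytes. Returns the bytes written. */
static int write_structure(uint8_t *buf, uint32_t iCookie, uint32_t nWrite,
                           uint32_t nMerge, const Seg *aSeg, uint32_t nSeg){
  int i = 4;
  uint32_t iSeg;
  buf[0] = (uint8_t)(iCookie>>24);
  buf[1] = (uint8_t)(iCookie>>16);
  buf[2] = (uint8_t)(iCookie>>8);
  buf[3] = (uint8_t)iCookie;
  i += put_varint(&buf[i], 1);       /* nLevel */
  i += put_varint(&buf[i], nSeg);    /* nSegment (total over all levels) */
  i += put_varint(&buf[i], nWrite);  /* nWriteCounter */
  i += put_varint(&buf[i], nMerge);  /* level 0: nMerge */
  i += put_varint(&buf[i], nSeg);    /* level 0: nSeg */
  for(iSeg=0; iSeg<nSeg; iSeg++){
    i += put_varint(&buf[i], aSeg[iSeg].iSegid);
    i += put_varint(&buf[i], aSeg[iSeg].pgnoFirst);
    i += put_varint(&buf[i], aSeg[iSeg].pgnoLast);
  }
  return i;
}

/* Decode a record produced by write_structure(). Returns nSeg. */
static uint32_t read_structure(const uint8_t *buf, uint32_t *piCookie,
                               Seg *aSeg, uint32_t nMax){
  int i = 4;
  uint32_t nLevel, nSegment, nWrite, nMerge, nSeg, iSeg;
  *piCookie = ((uint32_t)buf[0]<<24) | ((uint32_t)buf[1]<<16)
            | ((uint32_t)buf[2]<<8) | (uint32_t)buf[3];
  i += get_varint(&buf[i], &nLevel);
  i += get_varint(&buf[i], &nSegment);
  i += get_varint(&buf[i], &nWrite);
  i += get_varint(&buf[i], &nMerge);
  i += get_varint(&buf[i], &nSeg);
  (void)nWrite;
  assert( nLevel==1 && nSeg==nSegment && nMerge<=nSeg && nSeg<=nMax );
  for(iSeg=0; iSeg<nSeg; iSeg++){
    i += get_varint(&buf[i], &aSeg[iSeg].iSegid);
    i += get_varint(&buf[i], &aSeg[iSeg].pgnoFirst);
    i += get_varint(&buf[i], &aSeg[iSeg].pgnoLast);
  }
  return nSeg;
}
```

Unlike this sketch, the decoder in the patch takes nData but does not appear to bound its reads by it, so a real reader of untrusted data would need explicit length checks.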
15400
15401 #if 0
15402 static void fts5DebugStructure(int*,Fts5Buffer*,Fts5Structure*);
15403 static void fts5PrintStructure(const char *zCaption, Fts5Structure *pStruct){
15404 int rc = SQLITE_OK;
15405 Fts5Buffer buf;
15406 memset(&buf, 0, sizeof(buf));
15407 fts5DebugStructure(&rc, &buf, pStruct);
15408 fprintf(stdout, "%s: %s\n", zCaption, buf.p);
15409 fflush(stdout);
15410 fts5BufferFree(&buf);
15411 }
15412 #else
15413 # define fts5PrintStructure(x,y)
15414 #endif
15415
15416 static int fts5SegmentSize(Fts5StructureSegment *pSeg){
15417 return 1 + pSeg->pgnoLast - pSeg->pgnoFirst;
15418 }
15419
15420 /*
15421 ** Promote as many segments as possible to level iPromote of index
15422 ** structure pStruct, which is modified in place. If an OOM occurs, an
15423 ** error code is left in p->rc.
15424 */
15425 static void fts5StructurePromoteTo(
15426 Fts5Index *p,
15427 int iPromote,
15428 int szPromote,
15429 Fts5Structure *pStruct
15430 ){
15431 int il, is;
15432 Fts5StructureLevel *pOut = &pStruct->aLevel[iPromote];
15433
15434 if( pOut->nMerge==0 ){
15435 for(il=iPromote+1; il<pStruct->nLevel; il++){
15436 Fts5StructureLevel *pLvl = &pStruct->aLevel[il];
15437 if( pLvl->nMerge ) return;
15438 for(is=pLvl->nSeg-1; is>=0; is--){
15439 int sz = fts5SegmentSize(&pLvl->aSeg[is]);
15440 if( sz>szPromote ) return;
15441 fts5StructureExtendLevel(&p->rc, pStruct, iPromote, 1, 1);
15442 if( p->rc ) return;
15443 memcpy(pOut->aSeg, &pLvl->aSeg[is], sizeof(Fts5StructureSegment));
15444 pOut->nSeg++;
15445 pLvl->nSeg--;
15446 }
15447 }
15448 }
15449 }
15450
15451 /*
15452 ** A new segment has just been written to level iLvl of index structure
15453 ** pStruct. This function determines if any segments should be promoted
15454 ** as a result. Segments are promoted in two scenarios:
15455 **
15456 ** a) If the segment just written is smaller than one or more segments
15457 ** within the previous populated level, it is promoted to the previous
15458 ** populated level.
15459 **
15460 ** b) If the segment just written is larger than the newest segment on
15461 ** the next populated level, then that segment, and any other adjacent
15462 ** segments that are also smaller than the one just written, are
15463 ** promoted.
15464 **
15465 ** If one or more segments are promoted, the structure object is updated
15466 ** to reflect this.
15467 */
15468 static void fts5StructurePromote(
15469 Fts5Index *p, /* FTS5 backend object */
15470 int iLvl, /* Index level just updated */
15471 Fts5Structure *pStruct /* Index structure */
15472 ){
15473 if( p->rc==SQLITE_OK ){
15474 int iTst;
15475 int iPromote = -1;
15476 int szPromote = 0; /* Promote anything this size or smaller */
15477 Fts5StructureSegment *pSeg; /* Segment just written */
15478 int szSeg; /* Size of segment just written */
15479 int nSeg = pStruct->aLevel[iLvl].nSeg;
15480
15481 if( nSeg==0 ) return;
15482 pSeg = &pStruct->aLevel[iLvl].aSeg[pStruct->aLevel[iLvl].nSeg-1];
15483 szSeg = (1 + pSeg->pgnoLast - pSeg->pgnoFirst);
15484
15485 /* Check for condition (a) */
15486 for(iTst=iLvl-1; iTst>=0 && pStruct->aLevel[iTst].nSeg==0; iTst--);
15487 if( iTst>=0 ){
15488 int i;
15489 int szMax = 0;
15490 Fts5StructureLevel *pTst = &pStruct->aLevel[iTst];
15491 assert( pTst->nMerge==0 );
15492 for(i=0; i<pTst->nSeg; i++){
15493 int sz = pTst->aSeg[i].pgnoLast - pTst->aSeg[i].pgnoFirst + 1;
15494 if( sz>szMax ) szMax = sz;
15495 }
15496 if( szMax>=szSeg ){
15497 /* Condition (a) is true. Promote the newest segment on level
15498 ** iLvl to level iTst. */
15499 iPromote = iTst;
15500 szPromote = szMax;
15501 }
15502 }
15503
15504 /* If condition (a) is not met, assume (b) is true. StructurePromoteTo()
15505 ** is a no-op if it is not. */
15506 if( iPromote<0 ){
15507 iPromote = iLvl;
15508 szPromote = szSeg;
15509 }
15510 fts5StructurePromoteTo(p, iPromote, szPromote, pStruct);
15511 }
15512 }
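
The test for condition (a) reduces to comparing the new segment's page count against the largest segment on the previous populated level. A minimal sketch of just that comparison, with hypothetical names (`PageRange`, `range_size`, `condition_a_size`) and none of the surrounding structure handling:

```c
#include <assert.h>

/* Hypothetical reduced form of Fts5StructureSegment: page range only. */
typedef struct PageRange { int pgnoFirst, pgnoLast; } PageRange;

/* Number of leaf pages in a segment, as in fts5SegmentSize(). */
static int range_size(const PageRange *p){
  return 1 + p->pgnoLast - p->pgnoFirst;
}

/* If condition (a) holds for new segment pNew against the nPrev segments
** of the previous populated level, return the size threshold that would
** be passed on as szPromote. Otherwise return 0. */
static int condition_a_size(const PageRange *aPrev, int nPrev,
                            const PageRange *pNew){
  int i;
  int szMax = 0;
  assert( nPrev>=0 );
  for(i=0; i<nPrev; i++){
    int sz = range_size(&aPrev[i]);
    if( sz>szMax ) szMax = sz;
  }
  return szMax>=range_size(pNew) ? szMax : 0;
}
```

Note that the comparison is `>=`, so a new segment exactly as large as the largest existing one also qualifies, which is slightly broader than the "smaller than" wording of the comment.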
15513
15514
15515 /*
15516 ** Advance the iterator passed as the only argument. If the end of the
15517 ** doclist-index page is reached, return non-zero.
15518 */
15519 static int fts5DlidxLvlNext(Fts5DlidxLvl *pLvl){
15520 Fts5Data *pData = pLvl->pData;
15521
15522 if( pLvl->iOff==0 ){
15523 assert( pLvl->bEof==0 );
15524 pLvl->iOff = 1;
15525 pLvl->iOff += fts5GetVarint32(&pData->p[1], pLvl->iLeafPgno);
15526 pLvl->iOff += fts5GetVarint(&pData->p[pLvl->iOff], (u64*)&pLvl->iRowid);
15527 pLvl->iFirstOff = pLvl->iOff;
15528 }else{
15529 int iOff;
15530 for(iOff=pLvl->iOff; iOff<pData->nn; iOff++){
15531 if( pData->p[iOff] ) break;
15532 }
15533
15534 if( iOff<pData->nn ){
15535 i64 iVal;
15536 pLvl->iLeafPgno += (iOff - pLvl->iOff) + 1;
15537 iOff += fts5GetVarint(&pData->p[iOff], (u64*)&iVal);
15538 pLvl->iRowid += iVal;
15539 pLvl->iOff = iOff;
15540 }else{
15541 pLvl->bEof = 1;
15542 }
15543 }
15544
15545 return pLvl->bEof;
15546 }
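
The else-branch of fts5DlidxLvlNext() treats each 0x00 byte as a leaf page that holds no rowid, and each non-zero varint as the rowid delta for the next leaf that does. A minimal sketch of that stepping rule follows. It is an illustration only: `DlidxCursor` and `dlidx_next` are hypothetical names, the cursor is assumed to start just after the page header and first entry, and every delta is assumed to fit in a single varint byte (1..127).

```c
#include <assert.h>
#include <stdint.h>

typedef struct DlidxCursor {
  const uint8_t *a;   /* Bytes following the first entry on the page */
  int n;              /* Size of a[] in bytes */
  int iOff;           /* Offset of the next byte to read */
  int iLeafPgno;      /* Leaf page the cursor currently refers to */
  int64_t iRowid;     /* First rowid on that leaf page */
  int bEof;           /* Set once the end of the page is reached */
} DlidxCursor;

/* Advance to the next leaf page that has a rowid. Returns non-zero at
** EOF. ASSUMPTION: each delta is one byte with the 0x80 bit clear. */
static int dlidx_next(DlidxCursor *p){
  int iOff = p->iOff;
  /* Each 0x00 byte stands for one leaf page without a rowid. */
  while( iOff<p->n && p->a[iOff]==0 ) iOff++;
  if( iOff<p->n ){
    assert( (p->a[iOff] & 0x80)==0 );
    p->iLeafPgno += (iOff - p->iOff) + 1;
    p->iRowid += p->a[iOff];
    p->iOff = iOff + 1;
  }else{
    p->bEof = 1;
  }
  return p->bEof;
}
```

For the byte sequence {5, 0, 0, 3} starting at page 10 and rowid 100, the cursor visits page 11 (rowid 105), then skips two empty pages to land on page 14 (rowid 108).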
15547
15548 /*
15549 ** Advance level iLvl of doclist-index iterator pIter to its next entry.
15550 */
15551 static int fts5DlidxIterNextR(Fts5Index *p, Fts5DlidxIter *pIter, int iLvl){
15552 Fts5DlidxLvl *pLvl = &pIter->aLvl[iLvl];
15553
15554 assert( iLvl<pIter->nLvl );
15555 if( fts5DlidxLvlNext(pLvl) ){
15556 if( (iLvl+1) < pIter->nLvl ){
15557 fts5DlidxIterNextR(p, pIter, iLvl+1);
15558 if( pLvl[1].bEof==0 ){
15559 fts5DataRelease(pLvl->pData);
15560 memset(pLvl, 0, sizeof(Fts5DlidxLvl));
15561 pLvl->pData = fts5DataRead(p,
15562 FTS5_DLIDX_ROWID(pIter->iSegid, iLvl, pLvl[1].iLeafPgno)
15563 );
15564 if( pLvl->pData ) fts5DlidxLvlNext(pLvl);
15565 }
15566 }
15567 }
15568
15569 return pIter->aLvl[0].bEof;
15570 }
15571 static int fts5DlidxIterNext(Fts5Index *p, Fts5DlidxIter *pIter){
15572 return fts5DlidxIterNextR(p, pIter, 0);
15573 }
15574
15575 /*
15576 ** The iterator passed as the only argument has the fields listed below
15577 ** already set. This function sets up the rest of the iterator so that it
15578 ** points to the first rowid in the doclist-index.
15579 **
15580 ** pData:
15581 ** pointer to doclist-index record,
15582 **
15583 ** When this function is called pIter->iLeafPgno is the page number the
15584 ** doclist is associated with (the one featuring the term).
15585 */
15586 static int fts5DlidxIterFirst(Fts5DlidxIter *pIter){
15587 int i;
15588 for(i=0; i<pIter->nLvl; i++){
15589 fts5DlidxLvlNext(&pIter->aLvl[i]);
15590 }
15591 return pIter->aLvl[0].bEof;
15592 }
15593
15594
15595 static int fts5DlidxIterEof(Fts5Index *p, Fts5DlidxIter *pIter){
15596 return p->rc!=SQLITE_OK || pIter->aLvl[0].bEof;
15597 }
15598
15599 static void fts5DlidxIterLast(Fts5Index *p, Fts5DlidxIter *pIter){
15600 int i;
15601
15602 /* Advance each level to the last entry on the last page */
15603 for(i=pIter->nLvl-1; p->rc==SQLITE_OK && i>=0; i--){
15604 Fts5DlidxLvl *pLvl = &pIter->aLvl[i];
15605 while( fts5DlidxLvlNext(pLvl)==0 );
15606 pLvl->bEof = 0;
15607
15608 if( i>0 ){
15609 Fts5DlidxLvl *pChild = &pLvl[-1];
15610 fts5DataRelease(pChild->pData);
15611 memset(pChild, 0, sizeof(Fts5DlidxLvl));
15612 pChild->pData = fts5DataRead(p,
15613 FTS5_DLIDX_ROWID(pIter->iSegid, i-1, pLvl->iLeafPgno)
15614 );
15615 }
15616 }
15617 }
15618
15619 /*
15620 ** Move the iterator passed as the only argument to the previous entry.
15621 */
15622 static int fts5DlidxLvlPrev(Fts5DlidxLvl *pLvl){
15623 int iOff = pLvl->iOff;
15624
15625 assert( pLvl->bEof==0 );
15626 if( iOff<=pLvl->iFirstOff ){
15627 pLvl->bEof = 1;
15628 }else{
15629 u8 *a = pLvl->pData->p;
15630 i64 iVal;
15631 int iLimit;
15632 int ii;
15633 int nZero = 0;
15634
15635 /* Currently iOff points to the first byte of a varint. This block
15636 ** decrements iOff until it points to the first byte of the previous
15637 ** varint, taking care not to read any memory locations that occur
15638 ** before the buffer in memory. */
15639 iLimit = (iOff>9 ? iOff-9 : 0);
15640 for(iOff--; iOff>iLimit; iOff--){
15641 if( (a[iOff-1] & 0x80)==0 ) break;
15642 }
15643
15644 fts5GetVarint(&a[iOff], (u64*)&iVal);
15645 pLvl->iRowid -= iVal;
15646 pLvl->iLeafPgno--;
15647
15648 /* Skip backwards past any 0x00 varints. */
15649 for(ii=iOff-1; ii>=pLvl->iFirstOff && a[ii]==0x00; ii--){
15650 nZero++;
15651 }
15652 if( ii>=pLvl->iFirstOff && (a[ii] & 0x80) ){
15653 /* The byte immediately before the last 0x00 byte has the 0x80 bit
15654 ** set. So the last 0x00 is only a varint 0 if there are 8 more 0x80
15655 ** bytes before a[ii]. */
15656 int bZero = 0; /* True if last 0x00 counts */
15657 if( (ii-8)>=pLvl->iFirstOff ){
15658 int j;
15659 for(j=1; j<=8 && (a[ii-j] & 0x80); j++);
15660 bZero = (j>8);
15661 }
15662 if( bZero==0 ) nZero--;
15663 }
15664 pLvl->iLeafPgno -= nZero;
15665 pLvl->iOff = iOff - nZero;
15666 }
15667
15668 return pLvl->bEof;
15669 }
15670
15671 static int fts5DlidxIterPrevR(Fts5Index *p, Fts5DlidxIter *pIter, int iLvl){
15672 Fts5DlidxLvl *pLvl = &pIter->aLvl[iLvl];
15673
15674 assert( iLvl<pIter->nLvl );
15675 if( fts5DlidxLvlPrev(pLvl) ){
15676 if( (iLvl+1) < pIter->nLvl ){
15677 fts5DlidxIterPrevR(p, pIter, iLvl+1);
15678 if( pLvl[1].bEof==0 ){
15679 fts5DataRelease(pLvl->pData);
15680 memset(pLvl, 0, sizeof(Fts5DlidxLvl));
15681 pLvl->pData = fts5DataRead(p,
15682 FTS5_DLIDX_ROWID(pIter->iSegid, iLvl, pLvl[1].iLeafPgno)
15683 );
15684 if( pLvl->pData ){
15685 while( fts5DlidxLvlNext(pLvl)==0 );
15686 pLvl->bEof = 0;
15687 }
15688 }
15689 }
15690 }
15691
15692 return pIter->aLvl[0].bEof;
15693 }
15694 static int fts5DlidxIterPrev(Fts5Index *p, Fts5DlidxIter *pIter){
15695 return fts5DlidxIterPrevR(p, pIter, 0);
15696 }
15697
15698 /*
15699 ** Free a doclist-index iterator object allocated by fts5DlidxIterInit().
15700 */
15701 static void fts5DlidxIterFree(Fts5DlidxIter *pIter){
15702 if( pIter ){
15703 int i;
15704 for(i=0; i<pIter->nLvl; i++){
15705 fts5DataRelease(pIter->aLvl[i].pData);
15706 }
15707 sqlite3_free(pIter);
15708 }
15709 }
15710
15711 static Fts5DlidxIter *fts5DlidxIterInit(
15712 Fts5Index *p, /* Fts5 Backend to iterate within */
15713 int bRev, /* True for ORDER BY DESC */
15714 int iSegid, /* Segment id */
15715 int iLeafPg /* Leaf page number to load dlidx for */
15716 ){
15717 Fts5DlidxIter *pIter = 0;
15718 int i;
15719 int bDone = 0;
15720
15721 for(i=0; p->rc==SQLITE_OK && bDone==0; i++){
15722 int nByte = sizeof(Fts5DlidxIter) + i * sizeof(Fts5DlidxLvl);
15723 Fts5DlidxIter *pNew;
15724
15725 pNew = (Fts5DlidxIter*)sqlite3_realloc(pIter, nByte);
15726 if( pNew==0 ){
15727 p->rc = SQLITE_NOMEM;
15728 }else{
15729 i64 iRowid = FTS5_DLIDX_ROWID(iSegid, i, iLeafPg);
15730 Fts5DlidxLvl *pLvl = &pNew->aLvl[i];
15731 pIter = pNew;
15732 memset(pLvl, 0, sizeof(Fts5DlidxLvl));
15733 pLvl->pData = fts5DataRead(p, iRowid);
15734 if( pLvl->pData && (pLvl->pData->p[0] & 0x0001)==0 ){
15735 bDone = 1;
15736 }
15737 pIter->nLvl = i+1;
15738 }
15739 }
15740
15741 if( p->rc==SQLITE_OK ){
15742 pIter->iSegid = iSegid;
15743 if( bRev==0 ){
15744 fts5DlidxIterFirst(pIter);
15745 }else{
15746 fts5DlidxIterLast(p, pIter);
15747 }
15748 }
15749
15750 if( p->rc!=SQLITE_OK ){
15751 fts5DlidxIterFree(pIter);
15752 pIter = 0;
15753 }
15754
15755 return pIter;
15756 }
15757
15758 static i64 fts5DlidxIterRowid(Fts5DlidxIter *pIter){
15759 return pIter->aLvl[0].iRowid;
15760 }
15761 static int fts5DlidxIterPgno(Fts5DlidxIter *pIter){
15762 return pIter->aLvl[0].iLeafPgno;
15763 }
15764
15765 /*
15766 ** Load the next leaf page into the segment iterator.
15767 */
15768 static void fts5SegIterNextPage(
15769 Fts5Index *p, /* FTS5 backend object */
15770 Fts5SegIter *pIter /* Iterator to advance to next page */
15771 ){
15772 Fts5Data *pLeaf;
15773 Fts5StructureSegment *pSeg = pIter->pSeg;
15774 fts5DataRelease(pIter->pLeaf);
15775 pIter->iLeafPgno++;
15776 if( pIter->pNextLeaf ){
15777 pIter->pLeaf = pIter->pNextLeaf;
15778 pIter->pNextLeaf = 0;
15779 }else if( pIter->iLeafPgno<=pSeg->pgnoLast ){
15780 pIter->pLeaf = fts5DataRead(p,
15781 FTS5_SEGMENT_ROWID(pSeg->iSegid, pIter->iLeafPgno)
15782 );
15783 }else{
15784 pIter->pLeaf = 0;
15785 }
15786 pLeaf = pIter->pLeaf;
15787
15788 if( pLeaf ){
15789 pIter->iPgidxOff = pLeaf->szLeaf;
15790 if( fts5LeafIsTermless(pLeaf) ){
15791 pIter->iEndofDoclist = pLeaf->nn+1;
15792 }else{
15793 pIter->iPgidxOff += fts5GetVarint32(&pLeaf->p[pIter->iPgidxOff],
15794 pIter->iEndofDoclist
15795 );
15796 }
15797 }
15798 }
15799
15800 /*
15801 ** Argument p points to a buffer containing a varint to be interpreted as a
15802 ** position list size field. Read the varint and return the number of bytes
15803 ** read. Before returning, set *pnSz to the number of bytes in the position
15804 ** list, and *pbDel to true if the delete flag is set, or false otherwise.
15805 */
15806 static int fts5GetPoslistSize(const u8 *p, int *pnSz, int *pbDel){
15807 int nSz;
15808 int n = 0;
15809 fts5FastGetVarint32(p, n, nSz);
15810 assert_nc( nSz>=0 );
15811 *pnSz = nSz/2;
15812 *pbDel = nSz & 0x0001;
15813 return n;
15814 }
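
The size field decoded above packs two values into one integer: the position-list length in bytes, shifted left by one, plus the delete flag in the low bit. A minimal sketch of that packing with the varint layer left out; `pack_poslist_size` and `unpack_poslist_size` are hypothetical helpers, not FTS5 functions:

```c
#include <assert.h>

/* Hypothetical helper: combine a position-list byte count and a delete
** flag into the integer stored in the size field. */
static int pack_poslist_size(int nPos, int bDel){
  assert( nPos>=0 && (bDel==0 || bDel==1) );
  return nPos*2 + bDel;
}

/* Hypothetical helper: the inverse, using the same arithmetic as
** fts5GetPoslistSize() once the varint has been read. */
static void unpack_poslist_size(int nSz, int *pnPos, int *pbDel){
  *pnPos = nSz/2;
  *pbDel = nSz & 0x0001;
}
```

This is also why fts5SegIterReverse() can recompute the width of the size field from `pIter->nPos*2+pIter->bDel` when it backs iLeafOffset up.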
15815
15816 /*
15817 ** Fts5SegIter.iLeafOffset currently points to the first byte of a
15818 ** position-list size field. Read the value of the field and store it
15819 ** in the following variables:
15820 **
15821 ** Fts5SegIter.nPos
15822 ** Fts5SegIter.bDel
15823 **
15824 ** Leave Fts5SegIter.iLeafOffset pointing to the first byte of the
15825 ** position list content (if any).
15826 */
15827 static void fts5SegIterLoadNPos(Fts5Index *p, Fts5SegIter *pIter){
15828 if( p->rc==SQLITE_OK ){
15829 int iOff = pIter->iLeafOffset; /* Offset to read at */
15830 int nSz;
15831 ASSERT_SZLEAF_OK(pIter->pLeaf);
15832 fts5FastGetVarint32(pIter->pLeaf->p, iOff, nSz);
15833 pIter->bDel = (nSz & 0x0001);
15834 pIter->nPos = nSz>>1;
15835 pIter->iLeafOffset = iOff;
15836 assert_nc( pIter->nPos>=0 );
15837 }
15838 }
15839
15840 static void fts5SegIterLoadRowid(Fts5Index *p, Fts5SegIter *pIter){
15841 u8 *a = pIter->pLeaf->p; /* Buffer to read data from */
15842 int iOff = pIter->iLeafOffset;
15843
15844 ASSERT_SZLEAF_OK(pIter->pLeaf);
15845 if( iOff>=pIter->pLeaf->szLeaf ){
15846 fts5SegIterNextPage(p, pIter);
15847 if( pIter->pLeaf==0 ){
15848 if( p->rc==SQLITE_OK ) p->rc = FTS5_CORRUPT;
15849 return;
15850 }
15851 iOff = 4;
15852 a = pIter->pLeaf->p;
15853 }
15854 iOff += sqlite3Fts5GetVarint(&a[iOff], (u64*)&pIter->iRowid);
15855 pIter->iLeafOffset = iOff;
15856 }
15857
15858 /*
15859 ** Fts5SegIter.iLeafOffset currently points to the first byte of the
15860 ** "nSuffix" field of a term. Function parameter nKeep contains the value
15861 ** of the "nPrefix" field (if there was one - it is passed 0 if this is
15862 ** the first term in the segment).
15863 **
15864 ** This function populates:
15865 **
15866 ** Fts5SegIter.term
15867 ** Fts5SegIter.iRowid
15868 **
15869 ** accordingly and leaves (Fts5SegIter.iLeafOffset) pointing to the size
15870 ** field of the first position list, which is the position list belonging
15871 ** to document (Fts5SegIter.iRowid).
15872 */
15873 static void fts5SegIterLoadTerm(Fts5Index *p, Fts5SegIter *pIter, int nKeep){
15874 u8 *a = pIter->pLeaf->p; /* Buffer to read data from */
15875 int iOff = pIter->iLeafOffset; /* Offset to read at */
15876 int nNew; /* Bytes of new data */
15877
15878 iOff += fts5GetVarint32(&a[iOff], nNew);
15879 pIter->term.n = nKeep;
15880 fts5BufferAppendBlob(&p->rc, &pIter->term, nNew, &a[iOff]);
15881 iOff += nNew;
15882 pIter->iTermLeafOffset = iOff;
15883 pIter->iTermLeafPgno = pIter->iLeafPgno;
15884 pIter->iLeafOffset = iOff;
15885
15886 if( pIter->iPgidxOff>=pIter->pLeaf->nn ){
15887 pIter->iEndofDoclist = pIter->pLeaf->nn+1;
15888 }else{
15889 int nExtra;
15890 pIter->iPgidxOff += fts5GetVarint32(&a[pIter->iPgidxOff], nExtra);
15891 pIter->iEndofDoclist += nExtra;
15892 }
15893
15894 fts5SegIterLoadRowid(p, pIter);
15895 }
15896
15897 /*
15898 ** Initialize the iterator object pIter to iterate through the entries in
15899 ** segment pSeg. The iterator is left pointing to the first entry when
15900 ** this function returns.
15901 **
15902 ** If an error occurs, Fts5Index.rc is set to an appropriate error code. If
15903 ** an error has already occurred when this function is called, it is a no-op.
15904 */
15905 static void fts5SegIterInit(
15906 Fts5Index *p, /* FTS index object */
15907 Fts5StructureSegment *pSeg, /* Description of segment */
15908 Fts5SegIter *pIter /* Object to populate */
15909 ){
15910 if( pSeg->pgnoFirst==0 ){
15911 /* This happens if the segment is being used as an input to an incremental
15912 ** merge and all data has already been "trimmed". See function
15913 ** fts5TrimSegments() for details. In this case leave the iterator empty.
15914 ** The caller will see the (pIter->pLeaf==0) and assume the iterator is
15915 ** at EOF already. */
15916 assert( pIter->pLeaf==0 );
15917 return;
15918 }
15919
15920 if( p->rc==SQLITE_OK ){
15921 memset(pIter, 0, sizeof(*pIter));
15922 pIter->pSeg = pSeg;
15923 pIter->iLeafPgno = pSeg->pgnoFirst-1;
15924 fts5SegIterNextPage(p, pIter);
15925 }
15926
15927 if( p->rc==SQLITE_OK ){
15928 pIter->iLeafOffset = 4;
15929 assert_nc( pIter->pLeaf->nn>4 );
15930 assert( fts5LeafFirstTermOff(pIter->pLeaf)==4 );
15931 pIter->iPgidxOff = pIter->pLeaf->szLeaf+1;
15932 fts5SegIterLoadTerm(p, pIter, 0);
15933 fts5SegIterLoadNPos(p, pIter);
15934 }
15935 }
15936
15937 /*
15938 ** This function is only ever called on iterators created by calls to
15939 ** Fts5IndexQuery() with the FTS5INDEX_QUERY_DESC flag set.
15940 **
15941 ** The iterator is in an unusual state when this function is called: the
15942 ** Fts5SegIter.iLeafOffset variable is set to the offset of the start of
15943 ** the position-list size field for the first relevant rowid on the page.
15944 ** Fts5SegIter.iRowid is set, but nPos and bDel are not.
15945 **
15946 ** This function advances the iterator so that it points to the last
15947 ** relevant rowid on the page and, if necessary, initializes the
15948 ** aRowidOffset[] and iRowidOffset variables. At this point the iterator
15949 ** is in its regular state - Fts5SegIter.iLeafOffset points to the first
15950 ** byte of the position list content associated with said rowid.
15951 */
15952 static void fts5SegIterReverseInitPage(Fts5Index *p, Fts5SegIter *pIter){
15953 int n = pIter->pLeaf->szLeaf;
15954 int i = pIter->iLeafOffset;
15955 u8 *a = pIter->pLeaf->p;
15956 int iRowidOffset = 0;
15957
15958 if( n>pIter->iEndofDoclist ){
15959 n = pIter->iEndofDoclist;
15960 }
15961
15962 ASSERT_SZLEAF_OK(pIter->pLeaf);
15963 while( 1 ){
15964 i64 iDelta = 0;
15965 int nPos;
15966 int bDummy;
15967
15968 i += fts5GetPoslistSize(&a[i], &nPos, &bDummy);
15969 i += nPos;
15970 if( i>=n ) break;
15971 i += fts5GetVarint(&a[i], (u64*)&iDelta);
15972 pIter->iRowid += iDelta;
15973
15974 if( iRowidOffset>=pIter->nRowidOffset ){
15975 int nNew = pIter->nRowidOffset + 8;
15976 int *aNew = (int*)sqlite3_realloc(pIter->aRowidOffset, nNew*sizeof(int));
15977 if( aNew==0 ){
15978 p->rc = SQLITE_NOMEM;
15979 break;
15980 }
15981 pIter->aRowidOffset = aNew;
15982 pIter->nRowidOffset = nNew;
15983 }
15984
15985 pIter->aRowidOffset[iRowidOffset++] = pIter->iLeafOffset;
15986 pIter->iLeafOffset = i;
15987 }
15988 pIter->iRowidOffset = iRowidOffset;
15989 fts5SegIterLoadNPos(p, pIter);
15990 }
15991
15992 /*
15993 ** Step reverse iterator pIter back to the previous leaf that holds a rowid.
15994 */
15995 static void fts5SegIterReverseNewPage(Fts5Index *p, Fts5SegIter *pIter){
15996 assert( pIter->flags & FTS5_SEGITER_REVERSE );
15997 assert( pIter->flags & FTS5_SEGITER_ONETERM );
15998
15999 fts5DataRelease(pIter->pLeaf);
16000 pIter->pLeaf = 0;
16001 while( p->rc==SQLITE_OK && pIter->iLeafPgno>pIter->iTermLeafPgno ){
16002 Fts5Data *pNew;
16003 pIter->iLeafPgno--;
16004 pNew = fts5DataRead(p, FTS5_SEGMENT_ROWID(
16005 pIter->pSeg->iSegid, pIter->iLeafPgno
16006 ));
16007 if( pNew ){
16008 /* iTermLeafOffset may be equal to szLeaf if the term is the last
16009 ** thing on the page - i.e. the first rowid is on the following page.
16010 ** In this case leave pIter->pLeaf==0, this iterator is at EOF. */
16011 if( pIter->iLeafPgno==pIter->iTermLeafPgno ){
16012 assert( pIter->pLeaf==0 );
16013 if( pIter->iTermLeafOffset<pNew->szLeaf ){
16014 pIter->pLeaf = pNew;
16015 pIter->iLeafOffset = pIter->iTermLeafOffset;
16016 }
16017 }else{
16018 int iRowidOff;
16019 iRowidOff = fts5LeafFirstRowidOff(pNew);
16020 if( iRowidOff ){
16021 pIter->pLeaf = pNew;
16022 pIter->iLeafOffset = iRowidOff;
16023 }
16024 }
16025
16026 if( pIter->pLeaf ){
16027 u8 *a = &pIter->pLeaf->p[pIter->iLeafOffset];
16028 pIter->iLeafOffset += fts5GetVarint(a, (u64*)&pIter->iRowid);
16029 break;
16030 }else{
16031 fts5DataRelease(pNew);
16032 }
16033 }
16034 }
16035
16036 if( pIter->pLeaf ){
16037 pIter->iEndofDoclist = pIter->pLeaf->nn+1;
16038 fts5SegIterReverseInitPage(p, pIter);
16039 }
16040 }
16041
16042 /*
16043 ** Return true if the iterator passed as the second argument currently
16044 ** points to a delete marker. A delete marker is an entry with a 0 byte
16045 ** position-list.
16046 */
16047 static int fts5MultiIterIsEmpty(Fts5Index *p, Fts5IndexIter *pIter){
16048 Fts5SegIter *pSeg = &pIter->aSeg[pIter->aFirst[1].iFirst];
16049 return (p->rc==SQLITE_OK && pSeg->pLeaf && pSeg->nPos==0);
16050 }
16051
16052 /*
16053 ** Advance iterator pIter to the next entry.
16054 **
16055 ** If an error occurs, Fts5Index.rc is set to an appropriate error code. It
16056 ** is not considered an error if the iterator reaches EOF. If an error has
16057 ** already occurred when this function is called, it is a no-op.
16058 */
16059 static void fts5SegIterNext(
16060 Fts5Index *p, /* FTS5 backend object */
16061 Fts5SegIter *pIter, /* Iterator to advance */
16062 int *pbNewTerm /* OUT: Set for new term */
16063 ){
16064 assert( pbNewTerm==0 || *pbNewTerm==0 );
16065 if( p->rc==SQLITE_OK ){
16066 if( pIter->flags & FTS5_SEGITER_REVERSE ){
16067 assert( pIter->pNextLeaf==0 );
16068 if( pIter->iRowidOffset>0 ){
16069 u8 *a = pIter->pLeaf->p;
16070 int iOff;
16071 int nPos;
16072 int bDummy;
16073 i64 iDelta;
16074
16075 pIter->iRowidOffset--;
16076 pIter->iLeafOffset = iOff = pIter->aRowidOffset[pIter->iRowidOffset];
16077 iOff += fts5GetPoslistSize(&a[iOff], &nPos, &bDummy);
16078 iOff += nPos;
16079 fts5GetVarint(&a[iOff], (u64*)&iDelta);
16080 pIter->iRowid -= iDelta;
16081 fts5SegIterLoadNPos(p, pIter);
16082 }else{
16083 fts5SegIterReverseNewPage(p, pIter);
16084 }
16085 }else{
16086 Fts5Data *pLeaf = pIter->pLeaf;
16087 int iOff;
16088 int bNewTerm = 0;
16089 int nKeep = 0;
16090
16091 /* Search for the end of the position list within the current page. */
16092 u8 *a = pLeaf->p;
16093 int n = pLeaf->szLeaf;
16094
16095 ASSERT_SZLEAF_OK(pLeaf);
16096 iOff = pIter->iLeafOffset + pIter->nPos;
16097
16098 if( iOff<n ){
16099 /* The next entry is on the current page. */
16100 assert_nc( iOff<=pIter->iEndofDoclist );
16101 if( iOff>=pIter->iEndofDoclist ){
16102 bNewTerm = 1;
16103 if( iOff!=fts5LeafFirstTermOff(pLeaf) ){
16104 iOff += fts5GetVarint32(&a[iOff], nKeep);
16105 }
16106 }else{
16107 u64 iDelta;
16108 iOff += sqlite3Fts5GetVarint(&a[iOff], &iDelta);
16109 pIter->iRowid += iDelta;
16110 assert_nc( iDelta>0 );
16111 }
16112 pIter->iLeafOffset = iOff;
16113
16114 }else if( pIter->pSeg==0 ){
16115 const u8 *pList = 0;
16116 const char *zTerm = 0;
16117 int nList = 0;
16118 assert( (pIter->flags & FTS5_SEGITER_ONETERM) || pbNewTerm );
16119 if( 0==(pIter->flags & FTS5_SEGITER_ONETERM) ){
16120 sqlite3Fts5HashScanNext(p->pHash);
16121 sqlite3Fts5HashScanEntry(p->pHash, &zTerm, &pList, &nList);
16122 }
16123 if( pList==0 ){
16124 fts5DataRelease(pIter->pLeaf);
16125 pIter->pLeaf = 0;
16126 }else{
16127 pIter->pLeaf->p = (u8*)pList;
16128 pIter->pLeaf->nn = nList;
16129 pIter->pLeaf->szLeaf = nList;
16130 pIter->iEndofDoclist = nList+1;
16131 sqlite3Fts5BufferSet(&p->rc, &pIter->term, (int)strlen(zTerm),
16132 (u8*)zTerm);
16133 pIter->iLeafOffset = fts5GetVarint(pList, (u64*)&pIter->iRowid);
16134 *pbNewTerm = 1;
16135 }
16136 }else{
16137 iOff = 0;
16138 /* Next entry is not on the current page */
16139 while( iOff==0 ){
16140 fts5SegIterNextPage(p, pIter);
16141 pLeaf = pIter->pLeaf;
16142 if( pLeaf==0 ) break;
16143 ASSERT_SZLEAF_OK(pLeaf);
16144 if( (iOff = fts5LeafFirstRowidOff(pLeaf)) && iOff<pLeaf->szLeaf ){
16145 iOff += sqlite3Fts5GetVarint(&pLeaf->p[iOff], (u64*)&pIter->iRowid);
16146 pIter->iLeafOffset = iOff;
16147
16148 if( pLeaf->nn>pLeaf->szLeaf ){
16149 pIter->iPgidxOff = pLeaf->szLeaf + fts5GetVarint32(
16150 &pLeaf->p[pLeaf->szLeaf], pIter->iEndofDoclist
16151 );
16152 }
16153
16154 }
16155 else if( pLeaf->nn>pLeaf->szLeaf ){
16156 pIter->iPgidxOff = pLeaf->szLeaf + fts5GetVarint32(
16157 &pLeaf->p[pLeaf->szLeaf], iOff
16158 );
16159 pIter->iLeafOffset = iOff;
16160 pIter->iEndofDoclist = iOff;
16161 bNewTerm = 1;
16162 }
16163 if( iOff>=pLeaf->szLeaf ){
16164 p->rc = FTS5_CORRUPT;
16165 return;
16166 }
16167 }
16168 }
16169
16170 /* Check if the iterator is now at EOF. If so, return early. */
16171 if( pIter->pLeaf ){
16172 if( bNewTerm ){
16173 if( pIter->flags & FTS5_SEGITER_ONETERM ){
16174 fts5DataRelease(pIter->pLeaf);
16175 pIter->pLeaf = 0;
16176 }else{
16177 fts5SegIterLoadTerm(p, pIter, nKeep);
16178 fts5SegIterLoadNPos(p, pIter);
16179 if( pbNewTerm ) *pbNewTerm = 1;
16180 }
16181 }else{
16182 /* The following could be done by calling fts5SegIterLoadNPos(). But
16183 ** this block is particularly performance critical, so equivalent
16184 ** code is inlined. */
16185 int nSz;
16186 assert( p->rc==SQLITE_OK );
16187 fts5FastGetVarint32(pIter->pLeaf->p, pIter->iLeafOffset, nSz);
16188 pIter->bDel = (nSz & 0x0001);
16189 pIter->nPos = nSz>>1;
16190 assert_nc( pIter->nPos>=0 );
16191 }
16192 }
16193 }
16194 }
16195 }
16196
16197 #define SWAPVAL(T, a, b) { T tmp; tmp=a; a=b; b=tmp; }
16198
16199 /*
16200 ** Iterator pIter currently points to the first rowid in a doclist. This
16201 ** function sets the iterator up so that it iterates in reverse order through
16202 ** the doclist.
16203 */
16204 static void fts5SegIterReverse(Fts5Index *p, Fts5SegIter *pIter){
16205 Fts5DlidxIter *pDlidx = pIter->pDlidx;
16206 Fts5Data *pLast = 0;
16207 int pgnoLast = 0;
16208
16209 if( pDlidx ){
16210 int iSegid = pIter->pSeg->iSegid;
16211 pgnoLast = fts5DlidxIterPgno(pDlidx);
16212 pLast = fts5DataRead(p, FTS5_SEGMENT_ROWID(iSegid, pgnoLast));
16213 }else{
16214 Fts5Data *pLeaf = pIter->pLeaf; /* Current leaf data */
16215
16216 /* Currently, Fts5SegIter.iLeafOffset points to the first byte of
16217 ** position-list content for the current rowid. Back it up so that it
16218 ** points to the start of the position-list size field. */
16219 pIter->iLeafOffset -= sqlite3Fts5GetVarintLen(pIter->nPos*2+pIter->bDel);
16220
16221 /* If this condition is true then the largest rowid for the current
16222 ** term may not be stored on the current page. So search forward to
16223 ** see where said rowid really is. */
16224 if( pIter->iEndofDoclist>=pLeaf->szLeaf ){
16225 int pgno;
16226 Fts5StructureSegment *pSeg = pIter->pSeg;
16227
16228 /* The last rowid in the doclist may not be on the current page. Search
16229 ** forward to find the page containing the last rowid. */
16230 for(pgno=pIter->iLeafPgno+1; !p->rc && pgno<=pSeg->pgnoLast; pgno++){
16231 i64 iAbs = FTS5_SEGMENT_ROWID(pSeg->iSegid, pgno);
16232 Fts5Data *pNew = fts5DataRead(p, iAbs);
16233 if( pNew ){
16234 int iRowid, bTermless;
16235 iRowid = fts5LeafFirstRowidOff(pNew);
16236 bTermless = fts5LeafIsTermless(pNew);
16237 if( iRowid ){
16238 SWAPVAL(Fts5Data*, pNew, pLast);
16239 pgnoLast = pgno;
16240 }
16241 fts5DataRelease(pNew);
16242 if( bTermless==0 ) break;
16243 }
16244 }
16245 }
16246 }
16247
16248 /* If pLast is NULL at this point, then the last rowid for this doclist
16249 ** lies on the page currently indicated by the iterator. In this case
16250 ** pIter->iLeafOffset is already set to point to the position-list size
16251 ** field associated with the first relevant rowid on the page.
16252 **
16253 ** Or, if pLast is non-NULL, then it is the page that contains the last
16254 ** rowid. In this case configure the iterator so that it points to the
16255 ** first rowid on this page.
16256 */
16257 if( pLast ){
16258 int iOff;
16259 fts5DataRelease(pIter->pLeaf);
16260 pIter->pLeaf = pLast;
16261 pIter->iLeafPgno = pgnoLast;
16262 iOff = fts5LeafFirstRowidOff(pLast);
16263 iOff += fts5GetVarint(&pLast->p[iOff], (u64*)&pIter->iRowid);
16264 pIter->iLeafOffset = iOff;
16265
16266 if( fts5LeafIsTermless(pLast) ){
16267 pIter->iEndofDoclist = pLast->nn+1;
16268 }else{
16269 pIter->iEndofDoclist = fts5LeafFirstTermOff(pLast);
16270 }
16271
16272 }
16273
16274 fts5SegIterReverseInitPage(p, pIter);
16275 }
16276
16277 /*
16278 ** Iterator pIter currently points to the first rowid of a doclist.
16279 ** There is a doclist-index associated with the final term on the current
16280 ** page. If the current term is the last term on the page, load the
16281 ** doclist-index from disk and initialize an iterator at (pIter->pDlidx).
16282 */
16283 static void fts5SegIterLoadDlidx(Fts5Index *p, Fts5SegIter *pIter){
16284 int iSeg = pIter->pSeg->iSegid;
16285 int bRev = (pIter->flags & FTS5_SEGITER_REVERSE);
16286 Fts5Data *pLeaf = pIter->pLeaf; /* Current leaf data */
16287
16288 assert( pIter->flags & FTS5_SEGITER_ONETERM );
16289 assert( pIter->pDlidx==0 );
16290
16291 /* Check if the current doclist ends on this page. If it does, return
16292 ** early without loading the doclist-index (as it belongs to a different
16293 ** term). */
16294 if( pIter->iTermLeafPgno==pIter->iLeafPgno
16295 && pIter->iEndofDoclist<pLeaf->szLeaf
16296 ){
16297 return;
16298 }
16299
16300 pIter->pDlidx = fts5DlidxIterInit(p, bRev, iSeg, pIter->iTermLeafPgno);
16301 }
16302
16303 #define fts5IndexSkipVarint(a, iOff) { \
16304 int iEnd = iOff+9; \
16305 while( (a[iOff++] & 0x80) && iOff<iEnd ); \
16306 }
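
Note on fts5IndexSkipVarint(): the macro steps over one varint without decoding it. A byte with its high bit set means another byte follows, and the loop stops after at most 9 bytes. The same logic as a function, as a sketch only. The name skip_varint() is hypothetical, and the caller is assumed to guarantee that the buffer extends at least 9 bytes past iOff or contains a terminating byte earlier.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of FTS5): return the offset of the first
** byte following the varint that starts at a[iOff]. At most 9 bytes are
** consumed, matching the "iEnd = iOff+9" bound in the macro. */
static size_t skip_varint(const unsigned char *a, size_t iOff){
  size_t iEnd = iOff + 9;
  assert( a!=0 );
  while( (a[iOff++] & 0x80) && iOff<iEnd );
  return iOff;
}
```

Unlike the macro, the function returns the new offset instead of modifying its argument in place, which makes the bound easier to check in isolation.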
16307
16308 /*
16309 ** The iterator object passed as the second argument currently contains
16310 ** no valid values except for the Fts5SegIter.pLeaf member variable. This
16311 ** function searches the leaf page for a term matching (pTerm/nTerm).
16312 **
16313 ** If the specified term is found on the page, then the iterator is left
16314 ** pointing to it. If argument bGe is zero and the term is not found,
16315 ** the iterator is left pointing at EOF.
16316 **
16317 ** If bGe is non-zero and the specified term is not found, then the
16318 ** iterator is left pointing to the smallest term in the segment that
16319 ** is larger than the specified term, even if this term is not on the
16320 ** current page.
16321 */
16322 static void fts5LeafSeek(
16323 Fts5Index *p, /* Leave any error code here */
16324 int bGe, /* True for a >= search */
16325 Fts5SegIter *pIter, /* Iterator to seek */
16326 const u8 *pTerm, int nTerm /* Term to search for */
16327 ){
16328 int iOff;
16329 const u8 *a = pIter->pLeaf->p;
16330 int szLeaf = pIter->pLeaf->szLeaf;
16331 int n = pIter->pLeaf->nn;
16332
16333 int nMatch = 0;
16334 int nKeep = 0;
16335 int nNew = 0;
16336 int iTermOff;
16337 int iPgidx; /* Current offset in pgidx */
16338 int bEndOfPage = 0;
16339
16340 assert( p->rc==SQLITE_OK );
16341
16342 iPgidx = szLeaf;
16343 iPgidx += fts5GetVarint32(&a[iPgidx], iTermOff);
16344 iOff = iTermOff;
16345
16346 while( 1 ){
16347
16348 /* Figure out how many new bytes are in this term */
16349 fts5FastGetVarint32(a, iOff, nNew);
16350 if( nKeep<nMatch ){
16351 goto search_failed;
16352 }
16353
16354 assert( nKeep>=nMatch );
16355 if( nKeep==nMatch ){
16356 int nCmp;
16357 int i;
16358 nCmp = MIN(nNew, nTerm-nMatch);
16359 for(i=0; i<nCmp; i++){
16360 if( a[iOff+i]!=pTerm[nMatch+i] ) break;
16361 }
16362 nMatch += i;
16363
16364 if( nTerm==nMatch ){
16365 if( i==nNew ){
16366 goto search_success;
16367 }else{
16368 goto search_failed;
16369 }
16370 }else if( i<nNew && a[iOff+i]>pTerm[nMatch] ){
16371 goto search_failed;
16372 }
16373 }
16374
16375 if( iPgidx>=n ){
16376 bEndOfPage = 1;
16377 break;
16378 }
16379
16380 iPgidx += fts5GetVarint32(&a[iPgidx], nKeep);
16381 iTermOff += nKeep;
16382 iOff = iTermOff;
16383
16384 /* Read the nKeep field of the next term. */
16385 fts5FastGetVarint32(a, iOff, nKeep);
16386 }
16387
16388 search_failed:
16389 if( bGe==0 ){
16390 fts5DataRelease(pIter->pLeaf);
16391 pIter->pLeaf = 0;
16392 return;
16393 }else if( bEndOfPage ){
16394 do {
16395 fts5SegIterNextPage(p, pIter);
16396 if( pIter->pLeaf==0 ) return;
16397 a = pIter->pLeaf->p;
16398 if( fts5LeafIsTermless(pIter->pLeaf)==0 ){
16399 iPgidx = pIter->pLeaf->szLeaf;
16400 iPgidx += fts5GetVarint32(&pIter->pLeaf->p[iPgidx], iOff);
16401 if( iOff<4 || iOff>=pIter->pLeaf->szLeaf ){
16402 p->rc = FTS5_CORRUPT;
16403 }else{
16404 nKeep = 0;
16405 iTermOff = iOff;
16406 n = pIter->pLeaf->nn;
16407 iOff += fts5GetVarint32(&a[iOff], nNew);
16408 break;
16409 }
16410 }
16411 }while( 1 );
16412 }
16413
16414 search_success:
16415
16416 pIter->iLeafOffset = iOff + nNew;
16417 pIter->iTermLeafOffset = pIter->iLeafOffset;
16418 pIter->iTermLeafPgno = pIter->iLeafPgno;
16419
16420 fts5BufferSet(&p->rc, &pIter->term, nKeep, pTerm);
16421 fts5BufferAppendBlob(&p->rc, &pIter->term, nNew, &a[iOff]);
16422
16423 if( iPgidx>=n ){
16424 pIter->iEndofDoclist = pIter->pLeaf->nn+1;
16425 }else{
16426 int nExtra;
16427 iPgidx += fts5GetVarint32(&a[iPgidx], nExtra);
16428 pIter->iEndofDoclist = iTermOff + nExtra;
16429 }
16430 pIter->iPgidxOff = iPgidx;
16431
16432 fts5SegIterLoadRowid(p, pIter);
16433 fts5SegIterLoadNPos(p, pIter);
16434 }
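
Note on the term encoding walked by fts5LeafSeek(): each term after the first on a page is stored as nKeep (bytes shared with the previous term) followed by nNew bytes of new suffix, which is why the success path builds the term from nKeep bytes of pTerm plus nNew bytes from the page. The sketch below decodes a simplified version of that scheme. It is not the FTS5 leaf format: lengths are single bytes rather than varints, every entry carries an nKeep byte, and there is no page index. decode_next_term() is a hypothetical name.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified prefix-compression decoder. Each entry is:
**   1 byte nKeep, 1 byte nNew, then nNew suffix bytes.
** On entry zTerm/(*pnTerm) hold the previous term; on success they hold
** the decoded term. Returns the offset of the next entry, or -1 if the
** entry is truncated or inconsistent. */
static int decode_next_term(
  const unsigned char *a, int n,      /* Encoded buffer and its size */
  int iOff,                           /* Offset of the entry to decode */
  unsigned char *zTerm, int *pnTerm,  /* IN/OUT: previous -> current term */
  int nTermMax                        /* Capacity of zTerm in bytes */
){
  int nKeep, nNew;
  assert( a!=0 && zTerm!=0 && pnTerm!=0 );
  if( iOff<0 || iOff+2>n ) return -1;
  nKeep = a[iOff];
  nNew = a[iOff+1];
  if( nKeep>*pnTerm ) return -1;            /* Cannot keep more than exists */
  if( nKeep+nNew>nTermMax ) return -1;      /* Output buffer too small */
  if( iOff+2+nNew>n ) return -1;            /* Suffix runs off the buffer */
  memcpy(&zTerm[nKeep], &a[iOff+2], (size_t)nNew);
  *pnTerm = nKeep + nNew;
  return iOff + 2 + nNew;
}
```

Encoding "apple", "apply", "apt" this way stores the full first term, then only "y" with nKeep=4, then only "t" with nKeep=2. The bounds checks play the role of the FTS5_CORRUPT paths in the real code.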
16435
16436 /*
16437 ** Initialize the object pIter to point to term pTerm/nTerm within segment
16438 ** pSeg. If there is no such term in the index, the iterator is set to EOF.
16439 **
16440 ** If an error occurs, Fts5Index.rc is set to an appropriate error code. If
16441 ** an error has already occurred when this function is called, it is a no-op.
16442 */
16443 static void fts5SegIterSeekInit(
16444 Fts5Index *p, /* FTS5 backend */
16445 Fts5Buffer *pBuf, /* Buffer to use for loading pages */
16446 const u8 *pTerm, int nTerm, /* Term to seek to */
16447 int flags, /* Mask of FTS5INDEX_XXX flags */
16448 Fts5StructureSegment *pSeg, /* Description of segment */
16449 Fts5SegIter *pIter /* Object to populate */
16450 ){
16451 int iPg = 1;
16452 int bGe = (flags & FTS5INDEX_QUERY_SCAN);
16453 int bDlidx = 0; /* True if there is a doclist-index */
16454
16455 static int nCall = 0;
16456 nCall++;
16457
16458 assert( bGe==0 || (flags & FTS5INDEX_QUERY_DESC)==0 );
16459 assert( pTerm && nTerm );
16460 memset(pIter, 0, sizeof(*pIter));
16461 pIter->pSeg = pSeg;
16462
16463 /* This block sets stack variable iPg to the leaf page number that may
16464 ** contain term (pTerm/nTerm), if it is present in the segment. */
16465 if( p->pIdxSelect==0 ){
16466 Fts5Config *pConfig = p->pConfig;
16467 fts5IndexPrepareStmt(p, &p->pIdxSelect, sqlite3_mprintf(
16468 "SELECT pgno FROM '%q'.'%q_idx' WHERE "
16469 "segid=? AND term<=? ORDER BY term DESC LIMIT 1",
16470 pConfig->zDb, pConfig->zName
16471 ));
16472 }
16473 if( p->rc ) return;
16474 sqlite3_bind_int(p->pIdxSelect, 1, pSeg->iSegid);
16475 sqlite3_bind_blob(p->pIdxSelect, 2, pTerm, nTerm, SQLITE_STATIC);
16476 if( SQLITE_ROW==sqlite3_step(p->pIdxSelect) ){
16477 i64 val = sqlite3_column_int(p->pIdxSelect, 0);
16478 iPg = (int)(val>>1);
16479 bDlidx = (val & 0x0001);
16480 }
16481 p->rc = sqlite3_reset(p->pIdxSelect);
16482
16483 if( iPg<pSeg->pgnoFirst ){
16484 iPg = pSeg->pgnoFirst;
16485 bDlidx = 0;
16486 }
16487
16488 pIter->iLeafPgno = iPg - 1;
16489 fts5SegIterNextPage(p, pIter);
16490
16491 if( pIter->pLeaf ){
16492 fts5LeafSeek(p, bGe, pIter, pTerm, nTerm);
16493 }
16494
16495 if( p->rc==SQLITE_OK && bGe==0 ){
16496 pIter->flags |= FTS5_SEGITER_ONETERM;
16497 if( pIter->pLeaf ){
16498 if( flags & FTS5INDEX_QUERY_DESC ){
16499 pIter->flags |= FTS5_SEGITER_REVERSE;
16500 }
16501 if( bDlidx ){
16502 fts5SegIterLoadDlidx(p, pIter);
16503 }
16504 if( flags & FTS5INDEX_QUERY_DESC ){
16505 fts5SegIterReverse(p, pIter);
16506 }
16507 }
16508 }
16509
16510 /* Either:
16511 **
16512 ** 1) an error has occurred, or
16513 ** 2) the iterator points to EOF, or
16514 ** 3) the iterator points to an entry with term (pTerm/nTerm), or
16515 ** 4) the FTS5INDEX_QUERY_SCAN flag was set and the iterator points
16516 ** to an entry with a term greater than or equal to (pTerm/nTerm).
16517 */
16518 assert( p->rc!=SQLITE_OK /* 1 */
16519 || pIter->pLeaf==0 /* 2 */
16520 || fts5BufferCompareBlob(&pIter->term, pTerm, nTerm)==0 /* 3 */
16521 || (bGe && fts5BufferCompareBlob(&pIter->term, pTerm, nTerm)>0) /* 4 */
16522 );
16523 }
16524
16525 /*
16526 ** Initialize the object pIter to point to term pTerm/nTerm within the
16527 ** in-memory hash table. If there is no such term in the hash-table, the
16528 ** iterator is set to EOF.
16529 **
16530 ** If an error occurs, Fts5Index.rc is set to an appropriate error code. If
16531 ** an error has already occurred when this function is called, it is a no-op.
16532 */
16533 static void fts5SegIterHashInit(
16534 Fts5Index *p, /* FTS5 backend */
16535 const u8 *pTerm, int nTerm, /* Term to seek to */
16536 int flags, /* Mask of FTS5INDEX_XXX flags */
16537 Fts5SegIter *pIter /* Object to populate */
16538 ){
16539 const u8 *pList = 0;
16540 int nList = 0;
16541 const u8 *z = 0;
16542 int n = 0;
16543
16544 assert( p->pHash );
16545 assert( p->rc==SQLITE_OK );
16546
16547 if( pTerm==0 || (flags & FTS5INDEX_QUERY_SCAN) ){
16548 p->rc = sqlite3Fts5HashScanInit(p->pHash, (const char*)pTerm, nTerm);
16549 sqlite3Fts5HashScanEntry(p->pHash, (const char**)&z, &pList, &nList);
16550 n = (z ? (int)strlen((const char*)z) : 0);
16551 }else{
16552 pIter->flags |= FTS5_SEGITER_ONETERM;
16553 sqlite3Fts5HashQuery(p->pHash, (const char*)pTerm, nTerm, &pList, &nList);
16554 z = pTerm;
16555 n = nTerm;
16556 }
16557
16558 if( pList ){
16559 Fts5Data *pLeaf;
16560 sqlite3Fts5BufferSet(&p->rc, &pIter->term, n, z);
16561 pLeaf = fts5IdxMalloc(p, sizeof(Fts5Data));
16562 if( pLeaf==0 ) return;
16563 pLeaf->p = (u8*)pList;
16564 pLeaf->nn = pLeaf->szLeaf = nList;
16565 pIter->pLeaf = pLeaf;
16566 pIter->iLeafOffset = fts5GetVarint(pLeaf->p, (u64*)&pIter->iRowid);
16567 pIter->iEndofDoclist = pLeaf->nn+1;
16568
16569 if( flags & FTS5INDEX_QUERY_DESC ){
16570 pIter->flags |= FTS5_SEGITER_REVERSE;
16571 fts5SegIterReverseInitPage(p, pIter);
16572 }else{
16573 fts5SegIterLoadNPos(p, pIter);
16574 }
16575 }
16576 }
16577
16578 /*
16579 ** Zero the iterator passed as the only argument.
16580 */
16581 static void fts5SegIterClear(Fts5SegIter *pIter){
16582 fts5BufferFree(&pIter->term);
16583 fts5DataRelease(pIter->pLeaf);
16584 fts5DataRelease(pIter->pNextLeaf);
16585 fts5DlidxIterFree(pIter->pDlidx);
16586 sqlite3_free(pIter->aRowidOffset);
16587 memset(pIter, 0, sizeof(Fts5SegIter));
16588 }
16589
16590 #ifdef SQLITE_DEBUG
16591
16592 /*
16593 ** This function is used as part of the big assert() procedure implemented by
16594 ** fts5AssertMultiIterSetup(). It ensures that the result currently stored
16595 ** in *pRes is the correct result of comparing the current positions of the
16596 ** two iterators.
16597 */
16598 static void fts5AssertComparisonResult(
16599 Fts5IndexIter *pIter,
16600 Fts5SegIter *p1,
16601 Fts5SegIter *p2,
16602 Fts5CResult *pRes
16603 ){
16604 int i1 = p1 - pIter->aSeg;
16605 int i2 = p2 - pIter->aSeg;
16606
16607 if( p1->pLeaf || p2->pLeaf ){
16608 if( p1->pLeaf==0 ){
16609 assert( pRes->iFirst==i2 );
16610 }else if( p2->pLeaf==0 ){
16611 assert( pRes->iFirst==i1 );
16612 }else{
16613 int nMin = MIN(p1->term.n, p2->term.n);
16614 int res = memcmp(p1->term.p, p2->term.p, nMin);
16615 if( res==0 ) res = p1->term.n - p2->term.n;
16616
16617 if( res==0 ){
16618 assert( pRes->bTermEq==1 );
16619 assert( p1->iRowid!=p2->iRowid );
16620 res = ((p1->iRowid > p2->iRowid)==pIter->bRev) ? -1 : 1;
16621 }else{
16622 assert( pRes->bTermEq==0 );
16623 }
16624
16625 if( res<0 ){
16626 assert( pRes->iFirst==i1 );
16627 }else{
16628 assert( pRes->iFirst==i2 );
16629 }
16630 }
16631 }
16632 }
16633
16634 /*
16635 ** This function is a no-op unless SQLITE_DEBUG is defined when this module
16636 ** is compiled. In that case, this function is essentially an assert()
16637 ** statement used to verify that the contents of the pIter->aFirst[] array
16638 ** are correct.
16639 */
16640 static void fts5AssertMultiIterSetup(Fts5Index *p, Fts5IndexIter *pIter){
16641 if( p->rc==SQLITE_OK ){
16642 Fts5SegIter *pFirst = &pIter->aSeg[ pIter->aFirst[1].iFirst ];
16643 int i;
16644
16645 assert( (pFirst->pLeaf==0)==pIter->bEof );
16646
16647 /* Check that pIter->iSwitchRowid is set correctly. */
16648 for(i=0; i<pIter->nSeg; i++){
16649 Fts5SegIter *p1 = &pIter->aSeg[i];
16650 assert( p1==pFirst
16651 || p1->pLeaf==0
16652 || fts5BufferCompare(&pFirst->term, &p1->term)
16653 || p1->iRowid==pIter->iSwitchRowid
16654 || (p1->iRowid<pIter->iSwitchRowid)==pIter->bRev
16655 );
16656 }
16657
16658 for(i=0; i<pIter->nSeg; i+=2){
16659 Fts5SegIter *p1 = &pIter->aSeg[i];
16660 Fts5SegIter *p2 = &pIter->aSeg[i+1];
16661 Fts5CResult *pRes = &pIter->aFirst[(pIter->nSeg + i) / 2];
16662 fts5AssertComparisonResult(pIter, p1, p2, pRes);
16663 }
16664
16665 for(i=1; i<(pIter->nSeg / 2); i+=2){
16666 Fts5SegIter *p1 = &pIter->aSeg[ pIter->aFirst[i*2].iFirst ];
16667 Fts5SegIter *p2 = &pIter->aSeg[ pIter->aFirst[i*2+1].iFirst ];
16668 Fts5CResult *pRes = &pIter->aFirst[i];
16669 fts5AssertComparisonResult(pIter, p1, p2, pRes);
16670 }
16671 }
16672 }
16673 #else
16674 # define fts5AssertMultiIterSetup(x,y)
16675 #endif
16676
16677 /*
16678 ** Do the comparison necessary to populate pIter->aFirst[iOut].
16679 **
16680 ** If the returned value is non-zero, then it is the index of an entry
16681 ** in the pIter->aSeg[] array that is (a) not at EOF, and (b) pointing
16682 ** to a key that is a duplicate of another, higher priority,
16683 ** segment-iterator in the pIter->aSeg[] array.
16684 */
16685 static int fts5MultiIterDoCompare(Fts5IndexIter *pIter, int iOut){
16686 int i1; /* Index of left-hand Fts5SegIter */
16687 int i2; /* Index of right-hand Fts5SegIter */
16688 int iRes;
16689 Fts5SegIter *p1; /* Left-hand Fts5SegIter */
16690 Fts5SegIter *p2; /* Right-hand Fts5SegIter */
16691 Fts5CResult *pRes = &pIter->aFirst[iOut];
16692
16693 assert( iOut<pIter->nSeg && iOut>0 );
16694 assert( pIter->bRev==0 || pIter->bRev==1 );
16695
16696 if( iOut>=(pIter->nSeg/2) ){
16697 i1 = (iOut - pIter->nSeg/2) * 2;
16698 i2 = i1 + 1;
16699 }else{
16700 i1 = pIter->aFirst[iOut*2].iFirst;
16701 i2 = pIter->aFirst[iOut*2+1].iFirst;
16702 }
16703 p1 = &pIter->aSeg[i1];
16704 p2 = &pIter->aSeg[i2];
16705
16706 pRes->bTermEq = 0;
16707 if( p1->pLeaf==0 ){ /* If p1 is at EOF */
16708 iRes = i2;
16709 }else if( p2->pLeaf==0 ){ /* If p2 is at EOF */
16710 iRes = i1;
16711 }else{
16712 int res = fts5BufferCompare(&p1->term, &p2->term);
16713 if( res==0 ){
16714 assert( i2>i1 );
16715 assert( i2!=0 );
16716 pRes->bTermEq = 1;
16717 if( p1->iRowid==p2->iRowid ){
16718 p1->bDel = p2->bDel;
16719 return i2;
16720 }
16721 res = ((p1->iRowid > p2->iRowid)==pIter->bRev) ? -1 : +1;
16722 }
16723 assert( res!=0 );
16724 if( res<0 ){
16725 iRes = i1;
16726 }else{
16727 iRes = i2;
16728 }
16729 }
16730
16731 pRes->iFirst = (u16)iRes;
16732 return 0;
16733 }
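
Note on the aFirst[] layout used by fts5MultiIterDoCompare(): the array behaves like a tournament tree. Entries with index >= nSeg/2 compare two adjacent sub-iterators directly, lower entries compare the winners of their two children, and aFirst[1] holds the overall winner. The sketch below applies the same index arithmetic to a merge of sorted integer arrays. All names (Cursor, Merge, merge_compare, merge_init, merge_next) are hypothetical, N_SEG is fixed at 4 for brevity, and duplicate keys are emitted rather than collapsed as FTS5 does.

```c
#include <assert.h>

#define N_SEG 4                 /* Must be a power of two, >= 2 */

typedef struct Cursor {
  const int *a;                 /* Sorted input values */
  int n;                        /* Number of values in a[] */
  int i;                        /* Current position; i==n means EOF */
} Cursor;

typedef struct Merge {
  Cursor aSeg[N_SEG];           /* Inputs being merged */
  int aFirst[N_SEG];            /* aFirst[1] is the overall winner */
} Merge;

/* Populate aFirst[iOut] with the index of the smaller of its two inputs. */
static void merge_compare(Merge *p, int iOut){
  int i1, i2;
  Cursor *p1, *p2;
  assert( iOut>0 && iOut<N_SEG );
  if( iOut>=N_SEG/2 ){
    i1 = (iOut - N_SEG/2) * 2;  /* Bottom row: compare adjacent cursors */
    i2 = i1 + 1;
  }else{
    i1 = p->aFirst[iOut*2];     /* Upper rows: compare child winners */
    i2 = p->aFirst[iOut*2+1];
  }
  p1 = &p->aSeg[i1];
  p2 = &p->aSeg[i2];
  if( p1->i>=p1->n ){           /* p1 is at EOF */
    p->aFirst[iOut] = i2;
  }else if( p2->i>=p2->n ){     /* p2 is at EOF */
    p->aFirst[iOut] = i1;
  }else{
    p->aFirst[iOut] = (p1->a[p1->i] <= p2->a[p2->i]) ? i1 : i2;
  }
}

/* Build the whole tree bottom-up. */
static void merge_init(Merge *p){
  int i;
  for(i=N_SEG-1; i>0; i--) merge_compare(p, i);
}

/* Write the next merged value to *pOut and return 1, or return 0 at EOF. */
static int merge_next(Merge *p, int *pOut){
  int iFirst = p->aFirst[1];
  Cursor *pSeg = &p->aSeg[iFirst];
  int i;
  if( pSeg->i>=pSeg->n ) return 0;
  *pOut = pSeg->a[pSeg->i++];
  /* Only the path from the advanced cursor to the root needs updating. */
  for(i=(N_SEG+iFirst)/2; i>=1; i=i/2) merge_compare(p, i);
  return 1;
}
```

Merging {1,4,9}, {2,3}, an empty input and {5} yields 1, 2, 3, 4, 5, 9. After each step only the log2(N_SEG) comparisons on the path to the root are redone, the same walk that fts5MultiIterAdvanced() performs with i=(nSeg+iChanged)/2.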
16734
16735 /*
16736 ** Move the seg-iter so that it points to the first rowid on page iLeafPgno.
16737 ** It is an error if leaf iLeafPgno does not exist or contains no rowids.
16738 */
16739 static void fts5SegIterGotoPage(
16740 Fts5Index *p, /* FTS5 backend object */
16741 Fts5SegIter *pIter, /* Iterator to advance */
16742 int iLeafPgno
16743 ){
16744 assert( iLeafPgno>pIter->iLeafPgno );
16745
16746 if( iLeafPgno>pIter->pSeg->pgnoLast ){
16747 p->rc = FTS5_CORRUPT;
16748 }else{
16749 fts5DataRelease(pIter->pNextLeaf);
16750 pIter->pNextLeaf = 0;
16751 pIter->iLeafPgno = iLeafPgno-1;
16752 fts5SegIterNextPage(p, pIter);
16753 assert( p->rc!=SQLITE_OK || pIter->iLeafPgno==iLeafPgno );
16754
16755 if( p->rc==SQLITE_OK ){
16756 int iOff;
16757 u8 *a = pIter->pLeaf->p;
16758 int n = pIter->pLeaf->szLeaf;
16759
16760 iOff = fts5LeafFirstRowidOff(pIter->pLeaf);
16761 if( iOff<4 || iOff>=n ){
16762 p->rc = FTS5_CORRUPT;
16763 }else{
16764 iOff += fts5GetVarint(&a[iOff], (u64*)&pIter->iRowid);
16765 pIter->iLeafOffset = iOff;
16766 fts5SegIterLoadNPos(p, pIter);
16767 }
16768 }
16769 }
16770 }
16771
16772 /*
16773 ** Advance the iterator passed as the second argument until it is at or
16774 ** past rowid iMatch. Regardless of the value of iMatch, the iterator is
16775 ** always advanced at least once.
16776 */
16777 static void fts5SegIterNextFrom(
16778 Fts5Index *p, /* FTS5 backend object */
16779 Fts5SegIter *pIter, /* Iterator to advance */
16780 i64 iMatch /* Advance iterator at least this far */
16781 ){
16782 int bRev = (pIter->flags & FTS5_SEGITER_REVERSE);
16783 Fts5DlidxIter *pDlidx = pIter->pDlidx;
16784 int iLeafPgno = pIter->iLeafPgno;
16785 int bMove = 1;
16786
16787 assert( pIter->flags & FTS5_SEGITER_ONETERM );
16788 assert( pIter->pDlidx );
16789 assert( pIter->pLeaf );
16790
16791 if( bRev==0 ){
16792 while( !fts5DlidxIterEof(p, pDlidx) && iMatch>fts5DlidxIterRowid(pDlidx) ){
16793 iLeafPgno = fts5DlidxIterPgno(pDlidx);
16794 fts5DlidxIterNext(p, pDlidx);
16795 }
16796 assert_nc( iLeafPgno>=pIter->iLeafPgno || p->rc );
16797 if( iLeafPgno>pIter->iLeafPgno ){
16798 fts5SegIterGotoPage(p, pIter, iLeafPgno);
16799 bMove = 0;
16800 }
16801 }else{
16802 assert( pIter->pNextLeaf==0 );
16803 assert( iMatch<pIter->iRowid );
16804 while( !fts5DlidxIterEof(p, pDlidx) && iMatch<fts5DlidxIterRowid(pDlidx) ){
16805 fts5DlidxIterPrev(p, pDlidx);
16806 }
16807 iLeafPgno = fts5DlidxIterPgno(pDlidx);
16808
16809 assert( fts5DlidxIterEof(p, pDlidx) || iLeafPgno<=pIter->iLeafPgno );
16810
16811 if( iLeafPgno<pIter->iLeafPgno ){
16812 pIter->iLeafPgno = iLeafPgno+1;
16813 fts5SegIterReverseNewPage(p, pIter);
16814 bMove = 0;
16815 }
16816 }
16817
16818 do{
16819 if( bMove ) fts5SegIterNext(p, pIter, 0);
16820 if( pIter->pLeaf==0 ) break;
16821 if( bRev==0 && pIter->iRowid>=iMatch ) break;
16822 if( bRev!=0 && pIter->iRowid<=iMatch ) break;
16823 bMove = 1;
16824 }while( p->rc==SQLITE_OK );
16825 }
16826
16827
16828 /*
16829 ** Free the iterator object passed as the second argument.
16830 */
16831 static void fts5MultiIterFree(Fts5Index *p, Fts5IndexIter *pIter){
16832 if( pIter ){
16833 int i;
16834 for(i=0; i<pIter->nSeg; i++){
16835 fts5SegIterClear(&pIter->aSeg[i]);
16836 }
16837 fts5StructureRelease(pIter->pStruct);
16838 fts5BufferFree(&pIter->poslist);
16839 sqlite3_free(pIter);
16840 }
16841 }
16842
16843 static void fts5MultiIterAdvanced(
16844 Fts5Index *p, /* FTS5 backend to iterate within */
16845 Fts5IndexIter *pIter, /* Iterator to update aFirst[] array for */
16846 int iChanged, /* Index of sub-iterator just advanced */
16847 int iMinset /* Minimum entry in aFirst[] to set */
16848 ){
16849 int i;
16850 for(i=(pIter->nSeg+iChanged)/2; i>=iMinset && p->rc==SQLITE_OK; i=i/2){
16851 int iEq;
16852 if( (iEq = fts5MultiIterDoCompare(pIter, i)) ){
16853 fts5SegIterNext(p, &pIter->aSeg[iEq], 0);
16854 i = pIter->nSeg + iEq;
16855 }
16856 }
16857 }
16858
16859 /*
16860 ** Sub-iterator iChanged of iterator pIter has just been advanced. It still
16861 ** points to the same term though - just a different rowid. This function
16862 ** attempts to update the contents of the pIter->aFirst[] array accordingly.
16863 ** If it does so successfully, 0 is returned. Otherwise 1.
16864 **
16865 ** If non-zero is returned, the caller should call fts5MultiIterAdvanced()
16866 ** on the iterator instead. That function does the same as this one, except
16867 ** that it deals with more complicated cases as well.
16868 */
16869 static int fts5MultiIterAdvanceRowid(
16870 Fts5Index *p, /* FTS5 backend to iterate within */
16871 Fts5IndexIter *pIter, /* Iterator to update aFirst[] array for */
16872 int iChanged /* Index of sub-iterator just advanced */
16873 ){
16874 Fts5SegIter *pNew = &pIter->aSeg[iChanged];
16875
16876 if( pNew->iRowid==pIter->iSwitchRowid
16877 || (pNew->iRowid<pIter->iSwitchRowid)==pIter->bRev
16878 ){
16879 int i;
16880 Fts5SegIter *pOther = &pIter->aSeg[iChanged ^ 0x0001];
16881 pIter->iSwitchRowid = pIter->bRev ? SMALLEST_INT64 : LARGEST_INT64;
16882 for(i=(pIter->nSeg+iChanged)/2; 1; i=i/2){
16883 Fts5CResult *pRes = &pIter->aFirst[i];
16884
16885 assert( pNew->pLeaf );
16886 assert( pRes->bTermEq==0 || pOther->pLeaf );
16887
16888 if( pRes->bTermEq ){
16889 if( pNew->iRowid==pOther->iRowid ){
16890 return 1;
16891 }else if( (pOther->iRowid>pNew->iRowid)==pIter->bRev ){
16892 pIter->iSwitchRowid = pOther->iRowid;
16893 pNew = pOther;
16894 }else if( (pOther->iRowid>pIter->iSwitchRowid)==pIter->bRev ){
16895 pIter->iSwitchRowid = pOther->iRowid;
16896 }
16897 }
16898 pRes->iFirst = (u16)(pNew - pIter->aSeg);
16899 if( i==1 ) break;
16900
16901 pOther = &pIter->aSeg[ pIter->aFirst[i ^ 0x0001].iFirst ];
16902 }
16903 }
16904
16905 return 0;
16906 }
16907
16908 /*
16909 ** Set the pIter->bEof variable based on the state of the sub-iterators.
16910 */
16911 static void fts5MultiIterSetEof(Fts5IndexIter *pIter){
16912 Fts5SegIter *pSeg = &pIter->aSeg[ pIter->aFirst[1].iFirst ];
16913 pIter->bEof = pSeg->pLeaf==0;
16914 pIter->iSwitchRowid = pSeg->iRowid;
16915 }
16916
16917 /*
16918 ** Move the iterator to the next entry.
16919 **
16920 ** If an error occurs, an error code is left in Fts5Index.rc. It is not
16921 ** considered an error if the iterator reaches EOF, or if it is already at
16922 ** EOF when this function is called.
16923 */
16924 static void fts5MultiIterNext(
16925 Fts5Index *p,
16926 Fts5IndexIter *pIter,
16927 int bFrom, /* True if argument iFrom is valid */
16928 i64 iFrom /* Advance at least as far as this */
16929 ){
16930 if( p->rc==SQLITE_OK ){
16931 int bUseFrom = bFrom;
16932 do {
16933 int iFirst = pIter->aFirst[1].iFirst;
16934 int bNewTerm = 0;
16935 Fts5SegIter *pSeg = &pIter->aSeg[iFirst];
16936 assert( p->rc==SQLITE_OK );
16937 if( bUseFrom && pSeg->pDlidx ){
16938 fts5SegIterNextFrom(p, pSeg, iFrom);
16939 }else{
16940 fts5SegIterNext(p, pSeg, &bNewTerm);
16941 }
16942
16943 if( pSeg->pLeaf==0 || bNewTerm
16944 || fts5MultiIterAdvanceRowid(p, pIter, iFirst)
16945 ){
16946 fts5MultiIterAdvanced(p, pIter, iFirst, 1);
16947 fts5MultiIterSetEof(pIter);
16948 }
16949 fts5AssertMultiIterSetup(p, pIter);
16950
16951 bUseFrom = 0;
16952 }while( pIter->bSkipEmpty && fts5MultiIterIsEmpty(p, pIter) );
16953 }
16954 }
16955
16956 static void fts5MultiIterNext2(
16957 Fts5Index *p,
16958 Fts5IndexIter *pIter,
16959 int *pbNewTerm /* OUT: True if *might* be new term */
16960 ){
16961 assert( pIter->bSkipEmpty );
16962 if( p->rc==SQLITE_OK ){
16963 do {
16964 int iFirst = pIter->aFirst[1].iFirst;
16965 Fts5SegIter *pSeg = &pIter->aSeg[iFirst];
16966 int bNewTerm = 0;
16967
16968 fts5SegIterNext(p, pSeg, &bNewTerm);
16969 if( pSeg->pLeaf==0 || bNewTerm
16970 || fts5MultiIterAdvanceRowid(p, pIter, iFirst)
16971 ){
16972 fts5MultiIterAdvanced(p, pIter, iFirst, 1);
16973 fts5MultiIterSetEof(pIter);
16974 *pbNewTerm = 1;
16975 }else{
16976 *pbNewTerm = 0;
16977 }
16978 fts5AssertMultiIterSetup(p, pIter);
16979
16980 }while( fts5MultiIterIsEmpty(p, pIter) );
16981 }
16982 }
16983
16984
16985 static Fts5IndexIter *fts5MultiIterAlloc(
16986 Fts5Index *p, /* FTS5 backend to iterate within */
16987 int nSeg
16988 ){
16989 Fts5IndexIter *pNew;
16990 int nSlot; /* Power of two >= nSeg */
16991
16992 for(nSlot=2; nSlot<nSeg; nSlot=nSlot*2);
16993 pNew = fts5IdxMalloc(p,
16994 sizeof(Fts5IndexIter) + /* pNew */
16995 sizeof(Fts5SegIter) * (nSlot-1) + /* pNew->aSeg[] */
16996 sizeof(Fts5CResult) * nSlot /* pNew->aFirst[] */
16997 );
16998 if( pNew ){
16999 pNew->nSeg = nSlot;
17000 pNew->aFirst = (Fts5CResult*)&pNew->aSeg[nSlot];
17001 pNew->pIndex = p;
17002 }
17003 return pNew;
17004 }
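
Note on the slot count in fts5MultiIterAlloc(): the loop rounds nSeg up to a power of two with a minimum of 2, so that the comparison tree in aFirst[] is always complete. Unused slots are left zeroed and therefore read as EOF. The rounding step in isolation, as a sketch with the hypothetical name slot_count():

```c
#include <assert.h>

/* Hypothetical helper: smallest power of two that is >= nSeg and >= 2,
** using the same loop as fts5MultiIterAlloc(). */
static int slot_count(int nSeg){
  int nSlot;
  assert( nSeg>=0 );
  for(nSlot=2; nSlot<nSeg; nSlot=nSlot*2);
  return nSlot;
}
```

For example, 3 segments need 4 slots and 9 segments need 16. The sketch assumes nSeg is small enough that the doubling cannot overflow an int.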
17005
17006 /*
17007 ** Allocate a new Fts5IndexIter object.
17008 **
17009 ** The new object will be used to iterate through data in structure pStruct.
17010 ** If iLevel is -ve, then all data in all segments is merged. Or, if iLevel
17011 ** is zero or greater, data from the first nSegment segments on level iLevel
17012 ** is merged.
17013 **
17014 ** The iterator initially points to the first term/rowid entry in the
17015 ** iterated data.
17016 */
17017 static void fts5MultiIterNew(
17018 Fts5Index *p, /* FTS5 backend to iterate within */
17019 Fts5Structure *pStruct, /* Structure of specific index */
17020 int bSkipEmpty, /* True to ignore delete-keys */
17021 int flags, /* FTS5INDEX_QUERY_XXX flags */
17022 const u8 *pTerm, int nTerm, /* Term to seek to (or NULL/0) */
17023 int iLevel, /* Level to iterate (-1 for all) */
17024 int nSegment, /* Number of segments to merge (iLevel>=0) */
17025 Fts5IndexIter **ppOut /* New object */
17026 ){
17027 int nSeg = 0; /* Number of segment-iters in use */
17028 int iIter = 0; /* Index of next entry in pNew->aSeg[] to set up */
17029 int iSeg; /* Used to iterate through segments */
17030 Fts5Buffer buf = {0,0,0}; /* Buffer used by fts5SegIterSeekInit() */
17031 Fts5StructureLevel *pLvl;
17032 Fts5IndexIter *pNew;
17033
17034 assert( (pTerm==0 && nTerm==0) || iLevel<0 );
17035
17036 /* Allocate space for the new multi-seg-iterator. */
17037 if( p->rc==SQLITE_OK ){
17038 if( iLevel<0 ){
17039 assert( pStruct->nSegment==fts5StructureCountSegments(pStruct) );
17040 nSeg = pStruct->nSegment;
17041 nSeg += (p->pHash ? 1 : 0);
17042 }else{
17043 nSeg = MIN(pStruct->aLevel[iLevel].nSeg, nSegment);
17044 }
17045 }
17046 *ppOut = pNew = fts5MultiIterAlloc(p, nSeg);
17047 if( pNew==0 ) return;
17048 pNew->bRev = (0!=(flags & FTS5INDEX_QUERY_DESC));
17049 pNew->bSkipEmpty = (u8)bSkipEmpty;
17050 pNew->pStruct = pStruct;
17051 fts5StructureRef(pStruct);
17052
17053 /* Initialize each of the component segment iterators. */
17054 if( iLevel<0 ){
17055 Fts5StructureLevel *pEnd = &pStruct->aLevel[pStruct->nLevel];
17056 if( p->pHash ){
17057 /* Add a segment iterator for the current contents of the hash table. */
17058 Fts5SegIter *pIter = &pNew->aSeg[iIter++];
17059 fts5SegIterHashInit(p, pTerm, nTerm, flags, pIter);
17060 }
17061 for(pLvl=&pStruct->aLevel[0]; pLvl<pEnd; pLvl++){
17062 for(iSeg=pLvl->nSeg-1; iSeg>=0; iSeg--){
17063 Fts5StructureSegment *pSeg = &pLvl->aSeg[iSeg];
17064 Fts5SegIter *pIter = &pNew->aSeg[iIter++];
17065 if( pTerm==0 ){
17066 fts5SegIterInit(p, pSeg, pIter);
17067 }else{
17068 fts5SegIterSeekInit(p, &buf, pTerm, nTerm, flags, pSeg, pIter);
17069 }
17070 }
17071 }
17072 }else{
17073 pLvl = &pStruct->aLevel[iLevel];
17074 for(iSeg=nSeg-1; iSeg>=0; iSeg--){
17075 fts5SegIterInit(p, &pLvl->aSeg[iSeg], &pNew->aSeg[iIter++]);
17076 }
17077 }
17078 assert( iIter==nSeg );
17079
17080 /* If the above was successful, each component iterator now points
17081 ** to the first entry in its segment. In this case initialize the
17082 ** aFirst[] array. Or, if an error has occurred, free the iterator
17083 ** object and set the output variable to NULL. */
17084 if( p->rc==SQLITE_OK ){
17085 for(iIter=pNew->nSeg-1; iIter>0; iIter--){
17086 int iEq;
17087 if( (iEq = fts5MultiIterDoCompare(pNew, iIter)) ){
17088 fts5SegIterNext(p, &pNew->aSeg[iEq], 0);
17089 fts5MultiIterAdvanced(p, pNew, iEq, iIter);
17090 }
17091 }
17092 fts5MultiIterSetEof(pNew);
17093 fts5AssertMultiIterSetup(p, pNew);
17094
17095 if( pNew->bSkipEmpty && fts5MultiIterIsEmpty(p, pNew) ){
17096 fts5MultiIterNext(p, pNew, 0, 0);
17097 }
17098 }else{
17099 fts5MultiIterFree(p, pNew);
17100 *ppOut = 0;
17101 }
17102 fts5BufferFree(&buf);
17103 }
17104
17105 /*
17106 ** Create an Fts5IndexIter that iterates through the doclist provided
17107 ** as the second argument.
17108 */
17109 static void fts5MultiIterNew2(
17110 Fts5Index *p, /* FTS5 backend to iterate within */
17111 Fts5Data *pData, /* Doclist to iterate through */
17112 int bDesc, /* True for descending rowid order */
17113 Fts5IndexIter **ppOut /* New object */
17114 ){
17115 Fts5IndexIter *pNew;
17116 pNew = fts5MultiIterAlloc(p, 2);
17117 if( pNew ){
17118 Fts5SegIter *pIter = &pNew->aSeg[1];
17119
17120 pNew->bFiltered = 1;
17121 pIter->flags = FTS5_SEGITER_ONETERM;
17122 if( pData->szLeaf>0 ){
17123 pIter->pLeaf = pData;
17124 pIter->iLeafOffset = fts5GetVarint(pData->p, (u64*)&pIter->iRowid);
17125 pIter->iEndofDoclist = pData->nn;
17126 pNew->aFirst[1].iFirst = 1;
17127 if( bDesc ){
17128 pNew->bRev = 1;
17129 pIter->flags |= FTS5_SEGITER_REVERSE;
17130 fts5SegIterReverseInitPage(p, pIter);
17131 }else{
17132 fts5SegIterLoadNPos(p, pIter);
17133 }
17134 pData = 0;
17135 }else{
17136 pNew->bEof = 1;
17137 }
17138
17139 *ppOut = pNew;
17140 }
17141
17142 fts5DataRelease(pData);
17143 }
17144
17145 /*
17146 ** Return true if the iterator is at EOF or if an error has occurred.
17147 ** False otherwise.
17148 */
17149 static int fts5MultiIterEof(Fts5Index *p, Fts5IndexIter *pIter){
17150 assert( p->rc
17151 || (pIter->aSeg[ pIter->aFirst[1].iFirst ].pLeaf==0)==pIter->bEof
17152 );
17153 return (p->rc || pIter->bEof);
17154 }
17155
17156 /*
17157 ** Return the rowid of the entry that the iterator currently points
17158 ** to. If the iterator points to EOF when this function is called the
17159 ** results are undefined.
17160 */
17161 static i64 fts5MultiIterRowid(Fts5IndexIter *pIter){
17162 assert( pIter->aSeg[ pIter->aFirst[1].iFirst ].pLeaf );
17163 return pIter->aSeg[ pIter->aFirst[1].iFirst ].iRowid;
17164 }
17165
17166 /*
17167 ** Move the iterator to the next entry at or following iMatch.
17168 */
17169 static void fts5MultiIterNextFrom(
17170 Fts5Index *p,
17171 Fts5IndexIter *pIter,
17172 i64 iMatch
17173 ){
17174 while( 1 ){
17175 i64 iRowid;
17176 fts5MultiIterNext(p, pIter, 1, iMatch);
17177 if( fts5MultiIterEof(p, pIter) ) break;
17178 iRowid = fts5MultiIterRowid(pIter);
17179 if( pIter->bRev==0 && iRowid>=iMatch ) break;
17180 if( pIter->bRev!=0 && iRowid<=iMatch ) break;
17181 }
17182 }
17183
17184 /*
17185 ** Return a pointer to a buffer containing the term associated with the
17186 ** entry that the iterator currently points to.
17187 */
17188 static const u8 *fts5MultiIterTerm(Fts5IndexIter *pIter, int *pn){
17189 Fts5SegIter *p = &pIter->aSeg[ pIter->aFirst[1].iFirst ];
17190 *pn = p->term.n;
17191 return p->term.p;
17192 }
17193
17194 static void fts5ChunkIterate(
17195 Fts5Index *p, /* Index object */
17196 Fts5SegIter *pSeg, /* Poslist of this iterator */
17197 void *pCtx, /* Context pointer for xChunk callback */
17198 void (*xChunk)(Fts5Index*, void*, const u8*, int)
17199 ){
17200 int nRem = pSeg->nPos; /* Number of bytes still to come */
17201 Fts5Data *pData = 0;
17202 u8 *pChunk = &pSeg->pLeaf->p[pSeg->iLeafOffset];
17203 int nChunk = MIN(nRem, pSeg->pLeaf->szLeaf - pSeg->iLeafOffset);
17204 int pgno = pSeg->iLeafPgno;
17205 int pgnoSave = 0;
17206
17207 if( (pSeg->flags & FTS5_SEGITER_REVERSE)==0 ){
17208 pgnoSave = pgno+1;
17209 }
17210
17211 while( 1 ){
17212 xChunk(p, pCtx, pChunk, nChunk);
17213 nRem -= nChunk;
17214 fts5DataRelease(pData);
17215 if( nRem<=0 ){
17216 break;
17217 }else{
17218 pgno++;
17219 pData = fts5DataRead(p, FTS5_SEGMENT_ROWID(pSeg->pSeg->iSegid, pgno));
17220 if( pData==0 ) break;
17221 pChunk = &pData->p[4];
17222 nChunk = MIN(nRem, pData->szLeaf - 4);
17223 if( pgno==pgnoSave ){
17224 assert( pSeg->pNextLeaf==0 );
17225 pSeg->pNextLeaf = pData;
17226 pData = 0;
17227 }
17228 }
17229 }
17230 }
17231
17232
17233
17234 /*
17235 ** Allocate a new segment-id for the structure pStruct. The new segment
17236 ** id must be between 1 and 65535 inclusive, and must not be used by
17237 ** any currently existing segment. If a free segment id cannot be found,
17238 ** Fts5Index.rc is set to SQLITE_FULL and 0 is returned.
17239 **
17240 ** If an error has already occurred, this function is a no-op. 0 is
17241 ** returned in this case.
17242 */
17243 static int fts5AllocateSegid(Fts5Index *p, Fts5Structure *pStruct){
17244 int iSegid = 0;
17245
17246 if( p->rc==SQLITE_OK ){
17247 if( pStruct->nSegment>=FTS5_MAX_SEGMENT ){
17248 p->rc = SQLITE_FULL;
17249 }else{
17250 while( iSegid==0 ){
17251 int iLvl, iSeg;
17252 sqlite3_randomness(sizeof(u32), (void*)&iSegid);
17253 iSegid = iSegid & ((1 << FTS5_DATA_ID_B)-1);
17254 for(iLvl=0; iLvl<pStruct->nLevel; iLvl++){
17255 for(iSeg=0; iSeg<pStruct->aLevel[iLvl].nSeg; iSeg++){
17256 if( iSegid==pStruct->aLevel[iLvl].aSeg[iSeg].iSegid ){
17257 iSegid = 0;
17258 }
17259 }
17260 }
17261 }
17262 }
17263 }
17264
17265 return iSegid;
17266 }
17267
17268 /*
17269 ** Discard all data currently cached in the hash-tables.
17270 */
17271 static void fts5IndexDiscardData(Fts5Index *p){
17272 assert( p->pHash || p->nPendingData==0 );
17273 if( p->pHash ){
17274 sqlite3Fts5HashClear(p->pHash);
17275 p->nPendingData = 0;
17276 }
17277 }
17278
17279 /*
17280 ** Return the size of the prefix, in bytes, that buffer (nNew/pNew) shares
17281 ** with buffer (nOld/pOld).
17282 */
17283 static int fts5PrefixCompress(
17284 int nOld, const u8 *pOld,
17285 int nNew, const u8 *pNew
17286 ){
17287 int i;
17288 assert( fts5BlobCompare(pOld, nOld, pNew, nNew)<0 );
17289 for(i=0; i<nOld; i++){
17290 if( pOld[i]!=pNew[i] ) break;
17291 }
17292 return i;
17293 }
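The prefix compression above, together with the split-key rule described in fts5WriteAppendTerm(), can be sketched in isolation. This is a minimal illustration, not SQLite code: the names `shared_prefix_len` and `split_key_len` are hypothetical, and unlike fts5PrefixCompress() the loop is bounded by both lengths rather than relying on the terms arriving in sorted order.

```c
#include <assert.h>

/* Hypothetical helper (not part of SQLite): number of leading bytes that
** old_term and new_term have in common. */
static int shared_prefix_len(
  const unsigned char *old_term, int n_old,
  const unsigned char *new_term, int n_new
){
  int limit = (n_old<n_new) ? n_old : n_new;
  int i;
  for(i=0; i<limit; i++){
    if( old_term[i]!=new_term[i] ) break;
  }
  return i;
}

/* Hypothetical helper: length of the shortest prefix of new_term that still
** sorts after old_term, assuming old_term < new_term. This is one byte
** longer than the prefix the two terms share. */
static int split_key_len(
  const unsigned char *old_term, int n_old,
  const unsigned char *new_term, int n_new
){
  return 1 + shared_prefix_len(old_term, n_old, new_term, n_new);
}
```

For example, with previous term "apple" and new term "apply" the shared prefix is 4 bytes, so only the final "y" needs to be stored on the leaf, and the split key is the full 5 bytes.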
17294
17295 static void fts5WriteDlidxClear(
17296 Fts5Index *p,
17297 Fts5SegWriter *pWriter,
17298 int bFlush /* If true, write dlidx to disk */
17299 ){
17300 int i;
17301 assert( bFlush==0 || (pWriter->nDlidx>0 && pWriter->aDlidx[0].buf.n>0) );
17302 for(i=0; i<pWriter->nDlidx; i++){
17303 Fts5DlidxWriter *pDlidx = &pWriter->aDlidx[i];
17304 if( pDlidx->buf.n==0 ) break;
17305 if( bFlush ){
17306 assert( pDlidx->pgno!=0 );
17307 fts5DataWrite(p,
17308 FTS5_DLIDX_ROWID(pWriter->iSegid, i, pDlidx->pgno),
17309 pDlidx->buf.p, pDlidx->buf.n
17310 );
17311 }
17312 sqlite3Fts5BufferZero(&pDlidx->buf);
17313 pDlidx->bPrevValid = 0;
17314 }
17315 }
17316
17317 /*
17318 ** Grow the pWriter->aDlidx[] array to at least nLvl elements in size.
17319 ** Any new array elements are zeroed before returning.
17320 */
17321 static int fts5WriteDlidxGrow(
17322 Fts5Index *p,
17323 Fts5SegWriter *pWriter,
17324 int nLvl
17325 ){
17326 if( p->rc==SQLITE_OK && nLvl>=pWriter->nDlidx ){
17327 Fts5DlidxWriter *aDlidx = (Fts5DlidxWriter*)sqlite3_realloc(
17328 pWriter->aDlidx, sizeof(Fts5DlidxWriter) * nLvl
17329 );
17330 if( aDlidx==0 ){
17331 p->rc = SQLITE_NOMEM;
17332 }else{
17333 int nByte = sizeof(Fts5DlidxWriter) * (nLvl - pWriter->nDlidx);
17334 memset(&aDlidx[pWriter->nDlidx], 0, nByte);
17335 pWriter->aDlidx = aDlidx;
17336 pWriter->nDlidx = nLvl;
17337 }
17338 }
17339 return p->rc;
17340 }
17341
17342 /*
17343 ** If the current doclist-index accumulating in pWriter->aDlidx[] is large
17344 ** enough, flush it to disk and return 1. Otherwise discard it and return
17345 ** zero.
17346 */
17347 static int fts5WriteFlushDlidx(Fts5Index *p, Fts5SegWriter *pWriter){
17348 int bFlag = 0;
17349
17350 /* If there were FTS5_MIN_DLIDX_SIZE or more empty leaf pages written
17351 ** to the database, also write the doclist-index to disk. */
17352 if( pWriter->aDlidx[0].buf.n>0 && pWriter->nEmpty>=FTS5_MIN_DLIDX_SIZE ){
17353 bFlag = 1;
17354 }
17355 fts5WriteDlidxClear(p, pWriter, bFlag);
17356 pWriter->nEmpty = 0;
17357 return bFlag;
17358 }
17359
17360 /*
17361 ** This function is called whenever processing of the doclist for the
17362 ** last term on leaf page (pWriter->iBtPage) is completed.
17363 **
17364 ** The doclist-index for that term is currently stored in-memory within the
17365 ** Fts5SegWriter.aDlidx[] array. If it is large enough, this function
17366 ** writes it out to disk. Or, if it is too small to bother with, discards
17367 ** it.
17368 **
17369 ** Fts5SegWriter.btterm currently contains the first term on page iBtPage.
17370 */
17371 static void fts5WriteFlushBtree(Fts5Index *p, Fts5SegWriter *pWriter){
17372 int bFlag;
17373
17374 assert( pWriter->iBtPage || pWriter->nEmpty==0 );
17375 if( pWriter->iBtPage==0 ) return;
17376 bFlag = fts5WriteFlushDlidx(p, pWriter);
17377
17378 if( p->rc==SQLITE_OK ){
17379 const char *z = (pWriter->btterm.n>0?(const char*)pWriter->btterm.p:"");
17380 /* The following was already done in fts5WriteInit(): */
17381 /* sqlite3_bind_int(p->pIdxWriter, 1, pWriter->iSegid); */
17382 sqlite3_bind_blob(p->pIdxWriter, 2, z, pWriter->btterm.n, SQLITE_STATIC);
17383 sqlite3_bind_int64(p->pIdxWriter, 3, bFlag + ((i64)pWriter->iBtPage<<1));
17384 sqlite3_step(p->pIdxWriter);
17385 p->rc = sqlite3_reset(p->pIdxWriter);
17386 }
17387 pWriter->iBtPage = 0;
17388 }
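fts5WriteFlushBtree() stores the doclist-index flag in the low bit of the value bound to the pgno column, with the page number shifted left by one. A minimal sketch of that packing follows; the helper names are hypothetical and do not exist in SQLite.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: combine a leaf page number and a "has doclist-index"
** flag the same way as bFlag + ((i64)iBtPage<<1). */
static int64_t idx_pgno_pack(int pgno, int has_dlidx){
  return (int64_t)(has_dlidx ? 1 : 0) + ((int64_t)pgno << 1);
}

/* Hypothetical helpers: recover the two fields from a packed value. */
static int idx_pgno_page(int64_t packed){
  return (int)(packed >> 1);
}
static int idx_pgno_has_dlidx(int64_t packed){
  return (int)(packed & 1);
}
```

Packing page 7 with the flag set gives 15, and unpacking 15 gives page 7 with the flag set.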
17389
17390 /*
17391 ** This is called once for each leaf page except the first that contains
17392 ** at least one term. Argument (nTerm/pTerm) is the split-key - a term that
17393 ** is larger than all terms written to earlier leaves, and equal to or
17394 ** smaller than the first term on the new leaf.
17395 **
17396 ** If an error occurs, an error code is left in Fts5Index.rc. If an error
17397 ** has already occurred when this function is called, it is a no-op.
17398 */
17399 static void fts5WriteBtreeTerm(
17400 Fts5Index *p, /* FTS5 backend object */
17401 Fts5SegWriter *pWriter, /* Writer object */
17402 int nTerm, const u8 *pTerm /* First term on new page */
17403 ){
17404 fts5WriteFlushBtree(p, pWriter);
17405 fts5BufferSet(&p->rc, &pWriter->btterm, nTerm, pTerm);
17406 pWriter->iBtPage = pWriter->writer.pgno;
17407 }
17408
17409 /*
17410 ** This function is called when flushing a leaf page that contains no
17411 ** terms at all to disk.
17412 */
17413 static void fts5WriteBtreeNoTerm(
17414 Fts5Index *p, /* FTS5 backend object */
17415 Fts5SegWriter *pWriter /* Writer object */
17416 ){
17417 /* If there were no rowids on the leaf page either and the doclist-index
17418 ** has already been started, append a 0x00 byte to it. */
17419 if( pWriter->bFirstRowidInPage && pWriter->aDlidx[0].buf.n>0 ){
17420 Fts5DlidxWriter *pDlidx = &pWriter->aDlidx[0];
17421 assert( pDlidx->bPrevValid );
17422 sqlite3Fts5BufferAppendVarint(&p->rc, &pDlidx->buf, 0);
17423 }
17424
17425 /* Increment the "number of sequential leaves without a term" counter. */
17426 pWriter->nEmpty++;
17427 }
17428
17429 static i64 fts5DlidxExtractFirstRowid(Fts5Buffer *pBuf){
17430 i64 iRowid;
17431 int iOff;
17432
17433 iOff = 1 + fts5GetVarint(&pBuf->p[1], (u64*)&iRowid);
17434 fts5GetVarint(&pBuf->p[iOff], (u64*)&iRowid);
17435 return iRowid;
17436 }
17437
17438 /*
17439 ** Rowid iRowid has just been appended to the current leaf page. It is the
17440 ** first on the page. This function appends an appropriate entry to the current
17441 ** doclist-index.
17442 */
17443 static void fts5WriteDlidxAppend(
17444 Fts5Index *p,
17445 Fts5SegWriter *pWriter,
17446 i64 iRowid
17447 ){
17448 int i;
17449 int bDone = 0;
17450
17451 for(i=0; p->rc==SQLITE_OK && bDone==0; i++){
17452 i64 iVal;
17453 Fts5DlidxWriter *pDlidx = &pWriter->aDlidx[i];
17454
17455 if( pDlidx->buf.n>=p->pConfig->pgsz ){
17456 /* The current doclist-index page is full. Write it to disk and push
17457 ** a copy of iRowid (which will become the first rowid on the next
17458 ** doclist-index leaf page) up into the next level of the b-tree
17459 ** hierarchy. If the node being flushed is currently the root node,
17460 ** also push its first rowid upwards. */
17461 pDlidx->buf.p[0] = 0x01; /* Not the root node */
17462 fts5DataWrite(p,
17463 FTS5_DLIDX_ROWID(pWriter->iSegid, i, pDlidx->pgno),
17464 pDlidx->buf.p, pDlidx->buf.n
17465 );
17466 fts5WriteDlidxGrow(p, pWriter, i+2);
17467 pDlidx = &pWriter->aDlidx[i];
17468 if( p->rc==SQLITE_OK && pDlidx[1].buf.n==0 ){
17469 i64 iFirst = fts5DlidxExtractFirstRowid(&pDlidx->buf);
17470
17471 /* This was the root node. Push its first rowid up to the new root. */
17472 pDlidx[1].pgno = pDlidx->pgno;
17473 sqlite3Fts5BufferAppendVarint(&p->rc, &pDlidx[1].buf, 0);
17474 sqlite3Fts5BufferAppendVarint(&p->rc, &pDlidx[1].buf, pDlidx->pgno);
17475 sqlite3Fts5BufferAppendVarint(&p->rc, &pDlidx[1].buf, iFirst);
17476 pDlidx[1].bPrevValid = 1;
17477 pDlidx[1].iPrev = iFirst;
17478 }
17479
17480 sqlite3Fts5BufferZero(&pDlidx->buf);
17481 pDlidx->bPrevValid = 0;
17482 pDlidx->pgno++;
17483 }else{
17484 bDone = 1;
17485 }
17486
17487 if( pDlidx->bPrevValid ){
17488 iVal = iRowid - pDlidx->iPrev;
17489 }else{
17490 i64 iPgno = (i==0 ? pWriter->writer.pgno : pDlidx[-1].pgno);
17491 assert( pDlidx->buf.n==0 );
17492 sqlite3Fts5BufferAppendVarint(&p->rc, &pDlidx->buf, !bDone);
17493 sqlite3Fts5BufferAppendVarint(&p->rc, &pDlidx->buf, iPgno);
17494 iVal = iRowid;
17495 }
17496
17497 sqlite3Fts5BufferAppendVarint(&p->rc, &pDlidx->buf, iVal);
17498 pDlidx->bPrevValid = 1;
17499 pDlidx->iPrev = iRowid;
17500 }
17501 }
17502
17503 static void fts5WriteFlushLeaf(Fts5Index *p, Fts5SegWriter *pWriter){
17504 static const u8 zero[] = { 0x00, 0x00, 0x00, 0x00 };
17505 Fts5PageWriter *pPage = &pWriter->writer;
17506 i64 iRowid;
17507
17508 assert( (pPage->pgidx.n==0)==(pWriter->bFirstTermInPage) );
17509
17510 /* Set the szLeaf header field. */
17511 assert( 0==fts5GetU16(&pPage->buf.p[2]) );
17512 fts5PutU16(&pPage->buf.p[2], (u16)pPage->buf.n);
17513
17514 if( pWriter->bFirstTermInPage ){
17515 /* No term was written to this page. */
17516 assert( pPage->pgidx.n==0 );
17517 fts5WriteBtreeNoTerm(p, pWriter);
17518 }else{
17519 /* Append the pgidx to the page buffer. The szLeaf field was set above. */
17520 fts5BufferAppendBlob(&p->rc, &pPage->buf, pPage->pgidx.n, pPage->pgidx.p);
17521 }
17522
17523 /* Write the page out to disk */
17524 iRowid = FTS5_SEGMENT_ROWID(pWriter->iSegid, pPage->pgno);
17525 fts5DataWrite(p, iRowid, pPage->buf.p, pPage->buf.n);
17526
17527 /* Initialize the next page. */
17528 fts5BufferZero(&pPage->buf);
17529 fts5BufferZero(&pPage->pgidx);
17530 fts5BufferAppendBlob(&p->rc, &pPage->buf, 4, zero);
17531 pPage->iPrevPgidx = 0;
17532 pPage->pgno++;
17533
17534 /* Increase the leaves written counter */
17535 pWriter->nLeafWritten++;
17536
17537 /* The new leaf holds no terms or rowids */
17538 pWriter->bFirstTermInPage = 1;
17539 pWriter->bFirstRowidInPage = 1;
17540 }
17541
17542 /*
17543 ** Append term pTerm/nTerm to the segment being written by the writer passed
17544 ** as the second argument.
17545 **
17546 ** If an error occurs, set the Fts5Index.rc error code. If an error has
17547 ** already occurred, this function is a no-op.
17548 */
17549 static void fts5WriteAppendTerm(
17550 Fts5Index *p,
17551 Fts5SegWriter *pWriter,
17552 int nTerm, const u8 *pTerm
17553 ){
17554 int nPrefix; /* Bytes of prefix compression for term */
17555 Fts5PageWriter *pPage = &pWriter->writer;
17556 Fts5Buffer *pPgidx = &pWriter->writer.pgidx;
17557
17558 assert( p->rc==SQLITE_OK );
17559 assert( pPage->buf.n>=4 );
17560 assert( pPage->buf.n>4 || pWriter->bFirstTermInPage );
17561
17562 /* If the current leaf page is full, flush it to disk. */
17563 if( (pPage->buf.n + pPgidx->n + nTerm + 2)>=p->pConfig->pgsz ){
17564 if( pPage->buf.n>4 ){
17565 fts5WriteFlushLeaf(p, pWriter);
17566 }
17567 fts5BufferGrow(&p->rc, &pPage->buf, nTerm+FTS5_DATA_PADDING);
17568 }
17569
17570 /* TODO1: Updating pgidx here. */
17571 pPgidx->n += sqlite3Fts5PutVarint(
17572 &pPgidx->p[pPgidx->n], pPage->buf.n - pPage->iPrevPgidx
17573 );
17574 pPage->iPrevPgidx = pPage->buf.n;
17575 #if 0
17576 fts5PutU16(&pPgidx->p[pPgidx->n], pPage->buf.n);
17577 pPgidx->n += 2;
17578 #endif
17579
17580 if( pWriter->bFirstTermInPage ){
17581 nPrefix = 0;
17582 if( pPage->pgno!=1 ){
17583 /* This is the first term on a leaf that is not the leftmost leaf in
17584 ** the segment b-tree. In this case it is necessary to add a term to
17585 ** the b-tree hierarchy that is (a) larger than the largest term
17586 ** already written to the segment and (b) smaller than or equal to
17587 ** this term. In other words, a prefix of (pTerm/nTerm) that is one
17588 ** byte longer than the longest prefix (pTerm/nTerm) shares with the
17589 ** previous term.
17590 **
17591 ** Usually, the previous term is available in pPage->term. The exception
17592 ** is if this is the first term written in an incremental-merge step.
17593 ** In this case the previous term is not available, so just write a
17594 ** copy of (pTerm/nTerm) into the parent node. This is slightly
17595 ** inefficient, but still correct. */
17596 int n = nTerm;
17597 if( pPage->term.n ){
17598 n = 1 + fts5PrefixCompress(pPage->term.n, pPage->term.p, nTerm, pTerm);
17599 }
17600 fts5WriteBtreeTerm(p, pWriter, n, pTerm);
17601 pPage = &pWriter->writer;
17602 }
17603 }else{
17604 nPrefix = fts5PrefixCompress(pPage->term.n, pPage->term.p, nTerm, pTerm);
17605 fts5BufferAppendVarint(&p->rc, &pPage->buf, nPrefix);
17606 }
17607
17608 /* Append the number of bytes of new data, then the term data itself
17609 ** to the page. */
17610 fts5BufferAppendVarint(&p->rc, &pPage->buf, nTerm - nPrefix);
17611 fts5BufferAppendBlob(&p->rc, &pPage->buf, nTerm - nPrefix, &pTerm[nPrefix]);
17612
17613 /* Update the Fts5PageWriter.term field. */
17614 fts5BufferSet(&p->rc, &pPage->term, nTerm, pTerm);
17615 pWriter->bFirstTermInPage = 0;
17616
17617 pWriter->bFirstRowidInPage = 0;
17618 pWriter->bFirstRowidInDoclist = 1;
17619
17620 assert( p->rc || (pWriter->nDlidx>0 && pWriter->aDlidx[0].buf.n==0) );
17621 pWriter->aDlidx[0].pgno = pPage->pgno;
17622 }
17623
17624 /*
17625 ** Append a rowid and position-list size field to the writer's output.
17626 */
17627 static void fts5WriteAppendRowid(
17628 Fts5Index *p,
17629 Fts5SegWriter *pWriter,
17630 i64 iRowid,
17631 int nPos
17632 ){
17633 if( p->rc==SQLITE_OK ){
17634 Fts5PageWriter *pPage = &pWriter->writer;
17635
17636 if( (pPage->buf.n + pPage->pgidx.n)>=p->pConfig->pgsz ){
17637 fts5WriteFlushLeaf(p, pWriter);
17638 }
17639
17640 /* If this is to be the first rowid written to the page, set the
17641 ** rowid-pointer in the page-header. Also append a value to the dlidx
17642 ** buffer, in case a doclist-index is required. */
17643 if( pWriter->bFirstRowidInPage ){
17644 fts5PutU16(pPage->buf.p, (u16)pPage->buf.n);
17645 fts5WriteDlidxAppend(p, pWriter, iRowid);
17646 }
17647
17648 /* Write the rowid. */
17649 if( pWriter->bFirstRowidInDoclist || pWriter->bFirstRowidInPage ){
17650 fts5BufferAppendVarint(&p->rc, &pPage->buf, iRowid);
17651 }else{
17652 assert( p->rc || iRowid>pWriter->iPrevRowid );
17653 fts5BufferAppendVarint(&p->rc, &pPage->buf, iRowid - pWriter->iPrevRowid);
17654 }
17655 pWriter->iPrevRowid = iRowid;
17656 pWriter->bFirstRowidInDoclist = 0;
17657 pWriter->bFirstRowidInPage = 0;
17658
17659 fts5BufferAppendVarint(&p->rc, &pPage->buf, nPos);
17660 }
17661 }
17662
17663 static void fts5WriteAppendPoslistData(
17664 Fts5Index *p,
17665 Fts5SegWriter *pWriter,
17666 const u8 *aData,
17667 int nData
17668 ){
17669 Fts5PageWriter *pPage = &pWriter->writer;
17670 const u8 *a = aData;
17671 int n = nData;
17672
17673 assert( p->pConfig->pgsz>0 );
17674 while( p->rc==SQLITE_OK
17675 && (pPage->buf.n + pPage->pgidx.n + n)>=p->pConfig->pgsz
17676 ){
17677 int nReq = p->pConfig->pgsz - pPage->buf.n - pPage->pgidx.n;
17678 int nCopy = 0;
17679 while( nCopy<nReq ){
17680 i64 dummy;
17681 nCopy += fts5GetVarint(&a[nCopy], (u64*)&dummy);
17682 }
17683 fts5BufferAppendBlob(&p->rc, &pPage->buf, nCopy, a);
17684 a += nCopy;
17685 n -= nCopy;
17686 fts5WriteFlushLeaf(p, pWriter);
17687 }
17688 if( n>0 ){
17689 fts5BufferAppendBlob(&p->rc, &pPage->buf, n, a);
17690 }
17691 }
17692
17693 /*
17694 ** Flush any data cached by the writer object to the database. Free any
17695 ** allocations associated with the writer.
17696 */
17697 static void fts5WriteFinish(
17698 Fts5Index *p,
17699 Fts5SegWriter *pWriter, /* Writer object */
17700 int *pnLeaf /* OUT: Number of leaf pages in b-tree */
17701 ){
17702 int i;
17703 Fts5PageWriter *pLeaf = &pWriter->writer;
17704 if( p->rc==SQLITE_OK ){
17705 assert( pLeaf->pgno>=1 );
17706 if( pLeaf->buf.n>4 ){
17707 fts5WriteFlushLeaf(p, pWriter);
17708 }
17709 *pnLeaf = pLeaf->pgno-1;
17710 fts5WriteFlushBtree(p, pWriter);
17711 }
17712 fts5BufferFree(&pLeaf->term);
17713 fts5BufferFree(&pLeaf->buf);
17714 fts5BufferFree(&pLeaf->pgidx);
17715 fts5BufferFree(&pWriter->btterm);
17716
17717 for(i=0; i<pWriter->nDlidx; i++){
17718 sqlite3Fts5BufferFree(&pWriter->aDlidx[i].buf);
17719 }
17720 sqlite3_free(pWriter->aDlidx);
17721 }
17722
17723 static void fts5WriteInit(
17724 Fts5Index *p,
17725 Fts5SegWriter *pWriter,
17726 int iSegid
17727 ){
17728 const int nBuffer = p->pConfig->pgsz + FTS5_DATA_PADDING;
17729
17730 memset(pWriter, 0, sizeof(Fts5SegWriter));
17731 pWriter->iSegid = iSegid;
17732
17733 fts5WriteDlidxGrow(p, pWriter, 1);
17734 pWriter->writer.pgno = 1;
17735 pWriter->bFirstTermInPage = 1;
17736 pWriter->iBtPage = 1;
17737
17738 assert( pWriter->writer.buf.n==0 );
17739 assert( pWriter->writer.pgidx.n==0 );
17740
17741 /* Grow the two buffers to pgsz + padding bytes in size. */
17742 sqlite3Fts5BufferSize(&p->rc, &pWriter->writer.pgidx, nBuffer);
17743 sqlite3Fts5BufferSize(&p->rc, &pWriter->writer.buf, nBuffer);
17744
17745 if( p->pIdxWriter==0 ){
17746 Fts5Config *pConfig = p->pConfig;
17747 fts5IndexPrepareStmt(p, &p->pIdxWriter, sqlite3_mprintf(
17748 "INSERT INTO '%q'.'%q_idx'(segid,term,pgno) VALUES(?,?,?)",
17749 pConfig->zDb, pConfig->zName
17750 ));
17751 }
17752
17753 if( p->rc==SQLITE_OK ){
17754 /* Initialize the 4-byte leaf-page header to 0x00. */
17755 memset(pWriter->writer.buf.p, 0, 4);
17756 pWriter->writer.buf.n = 4;
17757
17758 /* Bind the current output segment id to the index-writer. This is an
17759 ** optimization over binding the same value over and over as rows are
17760 ** inserted into %_idx by the current writer. */
17761 sqlite3_bind_int(p->pIdxWriter, 1, pWriter->iSegid);
17762 }
17763 }
17764
17765 /*
17766 ** Iterator pIter was used to iterate through the input segments of an
17767 ** incremental merge operation. This function is called if the incremental
17768 ** merge step has finished but the input has not been completely exhausted.
17769 */
17770 static void fts5TrimSegments(Fts5Index *p, Fts5IndexIter *pIter){
17771 int i;
17772 Fts5Buffer buf;
17773 memset(&buf, 0, sizeof(Fts5Buffer));
17774 for(i=0; i<pIter->nSeg; i++){
17775 Fts5SegIter *pSeg = &pIter->aSeg[i];
17776 if( pSeg->pSeg==0 ){
17777 /* no-op */
17778 }else if( pSeg->pLeaf==0 ){
17779 /* All keys from this input segment have been transferred to the output.
17780 ** Set both the first and last page-numbers to 0 to indicate that the
17781 ** segment is now empty. */
17782 pSeg->pSeg->pgnoLast = 0;
17783 pSeg->pSeg->pgnoFirst = 0;
17784 }else{
17785 int iOff = pSeg->iTermLeafOffset; /* Offset on new first leaf page */
17786 i64 iLeafRowid;
17787 Fts5Data *pData;
17788 int iId = pSeg->pSeg->iSegid;
17789 u8 aHdr[4] = {0x00, 0x00, 0x00, 0x00};
17790
17791 iLeafRowid = FTS5_SEGMENT_ROWID(iId, pSeg->iTermLeafPgno);
17792 pData = fts5DataRead(p, iLeafRowid);
17793 if( pData ){
17794 fts5BufferZero(&buf);
17795 fts5BufferGrow(&p->rc, &buf, pData->nn);
17796 fts5BufferAppendBlob(&p->rc, &buf, sizeof(aHdr), aHdr);
17797 fts5BufferAppendVarint(&p->rc, &buf, pSeg->term.n);
17798 fts5BufferAppendBlob(&p->rc, &buf, pSeg->term.n, pSeg->term.p);
17799 fts5BufferAppendBlob(&p->rc, &buf, pData->szLeaf-iOff, &pData->p[iOff]);
17800 if( p->rc==SQLITE_OK ){
17801 /* Set the szLeaf field */
17802 fts5PutU16(&buf.p[2], (u16)buf.n);
17803 }
17804
17805 /* Set up the new page-index array */
17806 fts5BufferAppendVarint(&p->rc, &buf, 4);
17807 if( pSeg->iLeafPgno==pSeg->iTermLeafPgno
17808 && pSeg->iEndofDoclist<pData->szLeaf
17809 ){
17810 int nDiff = pData->szLeaf - pSeg->iEndofDoclist;
17811 fts5BufferAppendVarint(&p->rc, &buf, buf.n - 1 - nDiff - 4);
17812 fts5BufferAppendBlob(&p->rc, &buf,
17813 pData->nn - pSeg->iPgidxOff, &pData->p[pSeg->iPgidxOff]
17814 );
17815 }
17816
17817 fts5DataRelease(pData);
17818 pSeg->pSeg->pgnoFirst = pSeg->iTermLeafPgno;
17819 fts5DataDelete(p, FTS5_SEGMENT_ROWID(iId, 1), iLeafRowid);
17820 fts5DataWrite(p, iLeafRowid, buf.p, buf.n);
17821 }
17822 }
17823 }
17824 fts5BufferFree(&buf);
17825 }
17826
17827 static void fts5MergeChunkCallback(
17828 Fts5Index *p,
17829 void *pCtx,
17830 const u8 *pChunk, int nChunk
17831 ){
17832 Fts5SegWriter *pWriter = (Fts5SegWriter*)pCtx;
17833 fts5WriteAppendPoslistData(p, pWriter, pChunk, nChunk);
17834 }
17835
17836 /*
17837 ** Merge input segments from level iLvl into an output segment on level iLvl+1.
17838 */
17839 static void fts5IndexMergeLevel(
17840 Fts5Index *p, /* FTS5 backend object */
17841 Fts5Structure **ppStruct, /* IN/OUT: Structure of index */
17842 int iLvl, /* Level to read input from */
17843 int *pnRem /* Write up to this many output leaves */
17844 ){
17845 Fts5Structure *pStruct = *ppStruct;
17846 Fts5StructureLevel *pLvl = &pStruct->aLevel[iLvl];
17847 Fts5StructureLevel *pLvlOut;
17848 Fts5IndexIter *pIter = 0; /* Iterator to read input data */
17849 int nRem = pnRem ? *pnRem : 0; /* Output leaf pages left to write */
17850 int nInput; /* Number of input segments */
17851 Fts5SegWriter writer; /* Writer object */
17852 Fts5StructureSegment *pSeg; /* Output segment */
17853 Fts5Buffer term;
17854 int bOldest; /* True if the output segment is the oldest */
17855
17856 assert( iLvl<pStruct->nLevel );
17857 assert( pLvl->nMerge<=pLvl->nSeg );
17858
17859 memset(&writer, 0, sizeof(Fts5SegWriter));
17860 memset(&term, 0, sizeof(Fts5Buffer));
17861 if( pLvl->nMerge ){
17862 pLvlOut = &pStruct->aLevel[iLvl+1];
17863 assert( pLvlOut->nSeg>0 );
17864 nInput = pLvl->nMerge;
17865 pSeg = &pLvlOut->aSeg[pLvlOut->nSeg-1];
17866
17867 fts5WriteInit(p, &writer, pSeg->iSegid);
17868 writer.writer.pgno = pSeg->pgnoLast+1;
17869 writer.iBtPage = 0;
17870 }else{
17871 int iSegid = fts5AllocateSegid(p, pStruct);
17872
17873 /* Extend the Fts5Structure object as required to ensure the output
17874 ** segment exists. */
17875 if( iLvl==pStruct->nLevel-1 ){
17876 fts5StructureAddLevel(&p->rc, ppStruct);
17877 pStruct = *ppStruct;
17878 }
17879 fts5StructureExtendLevel(&p->rc, pStruct, iLvl+1, 1, 0);
17880 if( p->rc ) return;
17881 pLvl = &pStruct->aLevel[iLvl];
17882 pLvlOut = &pStruct->aLevel[iLvl+1];
17883
17884 fts5WriteInit(p, &writer, iSegid);
17885
17886 /* Add the new segment to the output level */
17887 pSeg = &pLvlOut->aSeg[pLvlOut->nSeg];
17888 pLvlOut->nSeg++;
17889 pSeg->pgnoFirst = 1;
17890 pSeg->iSegid = iSegid;
17891 pStruct->nSegment++;
17892
17893 /* Read input from all segments in the input level */
17894 nInput = pLvl->nSeg;
17895 }
17896 bOldest = (pLvlOut->nSeg==1 && pStruct->nLevel==iLvl+2);
17897
17898 assert( iLvl>=0 );
17899 for(fts5MultiIterNew(p, pStruct, 0, 0, 0, 0, iLvl, nInput, &pIter);
17900 fts5MultiIterEof(p, pIter)==0;
17901 fts5MultiIterNext(p, pIter, 0, 0)
17902 ){
17903 Fts5SegIter *pSegIter = &pIter->aSeg[ pIter->aFirst[1].iFirst ];
17904 int nPos; /* position-list size field value */
17905 int nTerm;
17906 const u8 *pTerm;
17907
17908 /* Check for key annihilation. */
17909 if( pSegIter->nPos==0 && (bOldest || pSegIter->bDel==0) ) continue;
17910
17911 pTerm = fts5MultiIterTerm(pIter, &nTerm);
17912 if( nTerm!=term.n || memcmp(pTerm, term.p, nTerm) ){
17913 if( pnRem && writer.nLeafWritten>nRem ){
17914 break;
17915 }
17916
17917 /* This is a new term. Append a term to the output segment. */
17918 fts5WriteAppendTerm(p, &writer, nTerm, pTerm);
17919 fts5BufferSet(&p->rc, &term, nTerm, pTerm);
17920 }
17921
17922 /* Append the rowid to the output */
17923 /* WRITEPOSLISTSIZE */
17924 nPos = pSegIter->nPos*2 + pSegIter->bDel;
17925 fts5WriteAppendRowid(p, &writer, fts5MultiIterRowid(pIter), nPos);
17926
17927 /* Append the position-list data to the output */
17928 fts5ChunkIterate(p, pSegIter, (void*)&writer, fts5MergeChunkCallback);
17929 }
17930
17931 /* Flush the last leaf page to disk and record the output segment's last
17932 ** leaf page number at the same time. */
17933 fts5WriteFinish(p, &writer, &pSeg->pgnoLast);
17934
17935 if( fts5MultiIterEof(p, pIter) ){
17936 int i;
17937
17938 /* Remove the redundant segments from the %_data table */
17939 for(i=0; i<nInput; i++){
17940 fts5DataRemoveSegment(p, pLvl->aSeg[i].iSegid);
17941 }
17942
17943 /* Remove the redundant segments from the input level */
17944 if( pLvl->nSeg!=nInput ){
17945 int nMove = (pLvl->nSeg - nInput) * sizeof(Fts5StructureSegment);
17946 memmove(pLvl->aSeg, &pLvl->aSeg[nInput], nMove);
17947 }
17948 pStruct->nSegment -= nInput;
17949 pLvl->nSeg -= nInput;
17950 pLvl->nMerge = 0;
17951 if( pSeg->pgnoLast==0 ){
17952 pLvlOut->nSeg--;
17953 pStruct->nSegment--;
17954 }
17955 }else{
17956 assert( pSeg->pgnoLast>0 );
17957 fts5TrimSegments(p, pIter);
17958 pLvl->nMerge = nInput;
17959 }
17960
17961 fts5MultiIterFree(p, pIter);
17962 fts5BufferFree(&term);
17963 if( pnRem ) *pnRem -= writer.nLeafWritten;
17964 }
17965
17966 /*
17967 ** Do up to nPg pages of automerge work on the index.
17968 */
17969 static void fts5IndexMerge(
17970 Fts5Index *p, /* FTS5 backend object */
17971 Fts5Structure **ppStruct, /* IN/OUT: Current structure of index */
17972 int nPg /* Pages of work to do */
17973 ){
17974 int nRem = nPg;
17975 Fts5Structure *pStruct = *ppStruct;
17976 while( nRem>0 && p->rc==SQLITE_OK ){
17977 int iLvl; /* To iterate through levels */
17978 int iBestLvl = 0; /* Level offering the most input segments */
17979 int nBest = 0; /* Number of input segments on best level */
17980
17981 /* Set iBestLvl to the level to read input segments from. */
17982 assert( pStruct->nLevel>0 );
17983 for(iLvl=0; iLvl<pStruct->nLevel; iLvl++){
17984 Fts5StructureLevel *pLvl = &pStruct->aLevel[iLvl];
17985 if( pLvl->nMerge ){
17986 if( pLvl->nMerge>nBest ){
17987 iBestLvl = iLvl;
17988 nBest = pLvl->nMerge;
17989 }
17990 break;
17991 }
17992 if( pLvl->nSeg>nBest ){
17993 nBest = pLvl->nSeg;
17994 iBestLvl = iLvl;
17995 }
17996 }
17997
17998 /* If nBest is still 0, then the index must be empty. */
17999 #ifdef SQLITE_DEBUG
18000 for(iLvl=0; nBest==0 && iLvl<pStruct->nLevel; iLvl++){
18001 assert( pStruct->aLevel[iLvl].nSeg==0 );
18002 }
18003 #endif
18004
18005 if( nBest<p->pConfig->nAutomerge
18006 && pStruct->aLevel[iBestLvl].nMerge==0
18007 ){
18008 break;
18009 }
18010 fts5IndexMergeLevel(p, &pStruct, iBestLvl, &nRem);
18011 if( p->rc==SQLITE_OK && pStruct->aLevel[iBestLvl].nMerge==0 ){
18012 fts5StructurePromote(p, iBestLvl+1, pStruct);
18013 }
18014 }
18015 *ppStruct = pStruct;
18016 }
18017
18018 /*
18019 ** A total of nLeaf leaf pages of data has just been flushed to a level-0
18020 ** segment. This function updates the write-counter accordingly and, if
18021 ** necessary, performs incremental merge work.
18022 **
18023 ** If an error occurs, set the Fts5Index.rc error code. If an error has
18024 ** already occurred, this function is a no-op.
18025 */
18026 static void fts5IndexAutomerge(
18027 Fts5Index *p, /* FTS5 backend object */
18028 Fts5Structure **ppStruct, /* IN/OUT: Current structure of index */
18029 int nLeaf /* Number of output leaves just written */
18030 ){
18031 if( p->rc==SQLITE_OK && p->pConfig->nAutomerge>0 ){
18032 Fts5Structure *pStruct = *ppStruct;
18033 u64 nWrite; /* Initial value of write-counter */
18034 int nWork; /* Number of work-quanta to perform */
18035 int nRem; /* Number of leaf pages left to write */
18036
18037 /* Update the write-counter. While doing so, set nWork. */
18038 nWrite = pStruct->nWriteCounter;
18039 nWork = (int)(((nWrite + nLeaf) / p->nWorkUnit) - (nWrite / p->nWorkUnit));
18040 pStruct->nWriteCounter += nLeaf;
18041 nRem = (int)(p->nWorkUnit * nWork * pStruct->nLevel);
18042
18043 fts5IndexMerge(p, ppStruct, nRem);
18044 }
18045 }
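The arithmetic in fts5IndexAutomerge() counts how many work-unit boundaries the write-counter crosses when nLeaf leaves are added, then scales that into a leaf-page budget. A small sketch of the calculation follows, with hypothetical function names and made-up numbers chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of work-unit boundaries crossed when the
** write-counter advances from n_write to n_write+n_leaf. */
static int work_quanta(uint64_t n_write, int n_leaf, uint64_t work_unit){
  return (int)(((n_write + (uint64_t)n_leaf) / work_unit)
             - (n_write / work_unit));
}

/* Hypothetical helper: leaf pages of merge work to attempt, mirroring
** nRem = nWorkUnit * nWork * nLevel. */
static int leaf_budget(uint64_t work_unit, int n_work, int n_level){
  return (int)(work_unit * (uint64_t)n_work * (uint64_t)n_level);
}
```

With an assumed work unit of 64, a counter at 60 and 10 new leaves, one boundary is crossed, so an index with 3 levels gets a budget of 192 leaf pages. With the counter at 0, the same 10 leaves cross no boundary and no merge work is scheduled.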
18046
18047 static void fts5IndexCrisismerge(
18048 Fts5Index *p, /* FTS5 backend object */
18049 Fts5Structure **ppStruct /* IN/OUT: Current structure of index */
18050 ){
18051 const int nCrisis = p->pConfig->nCrisisMerge;
18052 Fts5Structure *pStruct = *ppStruct;
18053 int iLvl = 0;
18054
18055 assert( p->rc!=SQLITE_OK || pStruct->nLevel>0 );
18056 while( p->rc==SQLITE_OK && pStruct->aLevel[iLvl].nSeg>=nCrisis ){
18057 fts5IndexMergeLevel(p, &pStruct, iLvl, 0);
18058 assert( p->rc!=SQLITE_OK || pStruct->nLevel>(iLvl+1) );
18059 fts5StructurePromote(p, iLvl+1, pStruct);
18060 iLvl++;
18061 }
18062 *ppStruct = pStruct;
18063 }
18064
18065 static int fts5IndexReturn(Fts5Index *p){
18066 int rc = p->rc;
18067 p->rc = SQLITE_OK;
18068 return rc;
18069 }
18070
18071 typedef struct Fts5FlushCtx Fts5FlushCtx;
18072 struct Fts5FlushCtx {
18073 Fts5Index *pIdx;
18074 Fts5SegWriter writer;
18075 };
18076
18077 /*
18078 ** Buffer aBuf[] contains a list of varints, all small enough to fit
18079 ** in a 32-bit integer. Return the size of the largest prefix of this
18080 ** list nMax bytes or less in size.
18081 */
18082 static int fts5PoslistPrefix(const u8 *aBuf, int nMax){
18083 int ret;
18084 u32 dummy;
18085 ret = fts5GetVarint32(aBuf, dummy);
18086 if( ret<nMax ){
18087 while( 1 ){
18088 int i = fts5GetVarint32(&aBuf[ret], dummy);
18089 if( (ret + i) > nMax ) break;
18090 ret += i;
18091 }
18092 }
18093 return ret;
18094 }
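The idea behind fts5PoslistPrefix(), finding the largest run of whole varints that fits in a byte budget, can be sketched without the SQLite macros. This is an illustration under stated assumptions, not the original routine. The names are hypothetical, and varints are assumed to use the SQLite layout (high bit set on every byte except the last, at most 9 bytes). Unlike the original, the sketch takes an explicit buffer length and does not force the first varint into the result when it exceeds the budget.

```c
#include <assert.h>

/* Hypothetical helper: size in bytes of the varint starting at a[0], or 0
** if fewer than a complete varint's worth of bytes is available. */
static int varint_len(const unsigned char *a, int n_avail){
  int n;
  for(n=1; n<=n_avail; n++){
    if( n==9 || (a[n-1] & 0x80)==0 ) return n;
  }
  return 0;
}

/* Hypothetical helper: size of the largest prefix of buf[0..n_buf) that
** consists of whole varints and is n_max bytes or less. */
static int poslist_prefix(const unsigned char *buf, int n_buf, int n_max){
  int ret = 0;
  for(;;){
    int n = varint_len(&buf[ret], n_buf-ret);
    if( n==0 || (ret+n)>n_max ) break;
    ret += n;
  }
  return ret;
}
```

For the byte sequence 05 81 00 07, which holds three varints of 1, 2 and 1 bytes, a budget of 2 bytes yields a 1-byte prefix, because the second varint cannot be split.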
18095
18096 /*
18097 ** Flush the contents of the in-memory hash table to a new level-0
18098 ** segment on disk. Also update the corresponding structure record.
18099 **
18100 ** If an error occurs, set the Fts5Index.rc error code. If an error has
18101 ** already occurred, this function is a no-op.
18102 */
18103 static void fts5FlushOneHash(Fts5Index *p){
18104 Fts5Hash *pHash = p->pHash;
18105 Fts5Structure *pStruct;
18106 int iSegid;
18107 int pgnoLast = 0; /* Last leaf page number in segment */
18108
18109 /* Obtain a reference to the index structure and allocate a new segment-id
18110 ** for the new level-0 segment. */
18111 pStruct = fts5StructureRead(p);
18112 iSegid = fts5AllocateSegid(p, pStruct);
18113
18114 if( iSegid ){
18115 const int pgsz = p->pConfig->pgsz;
18116
18117 Fts5StructureSegment *pSeg; /* New segment within pStruct */
18118 Fts5Buffer *pBuf; /* Buffer in which to assemble leaf page */
18119 Fts5Buffer *pPgidx; /* Buffer in which to assemble pgidx */
18120
18121 Fts5SegWriter writer;
18122 fts5WriteInit(p, &writer, iSegid);
18123
18124 pBuf = &writer.writer.buf;
18125 pPgidx = &writer.writer.pgidx;
18126
18127 /* fts5WriteInit() should have initialized the buffers to (most likely)
18128 ** the maximum space required. */
18129 assert( p->rc || pBuf->nSpace>=(pgsz + FTS5_DATA_PADDING) );
18130 assert( p->rc || pPgidx->nSpace>=(pgsz + FTS5_DATA_PADDING) );
18131
18132 /* Begin scanning through hash table entries. This loop runs once for each
18133 ** term/doclist currently stored within the hash table. */
18134 if( p->rc==SQLITE_OK ){
18135 p->rc = sqlite3Fts5HashScanInit(pHash, 0, 0);
18136 }
18137 while( p->rc==SQLITE_OK && 0==sqlite3Fts5HashScanEof(pHash) ){
18138 const char *zTerm; /* Buffer containing term */
18139 const u8 *pDoclist; /* Pointer to doclist for this term */
18140 int nDoclist; /* Size of doclist in bytes */
18141
18142 /* Write the term for this entry to disk. */
18143 sqlite3Fts5HashScanEntry(pHash, &zTerm, &pDoclist, &nDoclist);
18144 fts5WriteAppendTerm(p, &writer, (int)strlen(zTerm), (const u8*)zTerm);
18145
18146 assert( writer.bFirstRowidInPage==0 );
18147 if( pgsz>=(pBuf->n + pPgidx->n + nDoclist + 1) ){
18148 /* The entire doclist will fit on the current leaf. */
18149 fts5BufferSafeAppendBlob(pBuf, pDoclist, nDoclist);
18150 }else{
18151 i64 iRowid = 0;
18152 i64 iDelta = 0;
18153 int iOff = 0;
18154
18155 /* The entire doclist will not fit on this leaf. The following
18156 ** loop iterates through the poslists that make up the current
18157 ** doclist. */
18158 while( p->rc==SQLITE_OK && iOff<nDoclist ){
18159 int nPos;
18160 int nCopy;
18161 int bDummy;
18162 iOff += fts5GetVarint(&pDoclist[iOff], (u64*)&iDelta);
18163 nCopy = fts5GetPoslistSize(&pDoclist[iOff], &nPos, &bDummy);
18164 nCopy += nPos;
18165 iRowid += iDelta;
18166
18167 if( writer.bFirstRowidInPage ){
18168 fts5PutU16(&pBuf->p[0], (u16)pBuf->n); /* first rowid on page */
18169 pBuf->n += sqlite3Fts5PutVarint(&pBuf->p[pBuf->n], iRowid);
18170 writer.bFirstRowidInPage = 0;
18171 fts5WriteDlidxAppend(p, &writer, iRowid);
18172 }else{
18173 pBuf->n += sqlite3Fts5PutVarint(&pBuf->p[pBuf->n], iDelta);
18174 }
18175 assert( pBuf->n<=pBuf->nSpace );
18176
18177 if( (pBuf->n + pPgidx->n + nCopy) <= pgsz ){
18178 /* The entire poslist will fit on the current leaf. So copy
18179 ** it in one go. */
18180 fts5BufferSafeAppendBlob(pBuf, &pDoclist[iOff], nCopy);
18181 }else{
18182 /* The entire poslist will not fit on this leaf. So it needs
18183 ** to be broken into sections. The only qualification being
18184 ** that each varint must be stored contiguously. */
18185 const u8 *pPoslist = &pDoclist[iOff];
18186 int iPos = 0;
18187 while( p->rc==SQLITE_OK ){
18188 int nSpace = pgsz - pBuf->n - pPgidx->n;
18189 int n = 0;
18190 if( (nCopy - iPos)<=nSpace ){
18191 n = nCopy - iPos;
18192 }else{
18193 n = fts5PoslistPrefix(&pPoslist[iPos], nSpace);
18194 }
18195 assert( n>0 );
18196 fts5BufferSafeAppendBlob(pBuf, &pPoslist[iPos], n);
18197 iPos += n;
18198 if( (pBuf->n + pPgidx->n)>=pgsz ){
18199 fts5WriteFlushLeaf(p, &writer);
18200 }
18201 if( iPos>=nCopy ) break;
18202 }
18203 }
18204 iOff += nCopy;
18205 }
18206 }
18207
18208 /* TODO2: Doclist terminator written here. */
18209 /* pBuf->p[pBuf->n++] = '\0'; */
18210 assert( pBuf->n<=pBuf->nSpace );
18211 sqlite3Fts5HashScanNext(pHash);
18212 }
18213 sqlite3Fts5HashClear(pHash);
18214 fts5WriteFinish(p, &writer, &pgnoLast);
18215
18216 /* Update the Fts5Structure. It is written back to the database by the
18217 ** fts5StructureWrite() call below. */
18218 if( pStruct->nLevel==0 ){
18219 fts5StructureAddLevel(&p->rc, &pStruct);
18220 }
18221 fts5StructureExtendLevel(&p->rc, pStruct, 0, 1, 0);
18222 if( p->rc==SQLITE_OK ){
18223 pSeg = &pStruct->aLevel[0].aSeg[ pStruct->aLevel[0].nSeg++ ];
18224 pSeg->iSegid = iSegid;
18225 pSeg->pgnoFirst = 1;
18226 pSeg->pgnoLast = pgnoLast;
18227 pStruct->nSegment++;
18228 }
18229 fts5StructurePromote(p, 0, pStruct);
18230 }
18231
18232 fts5IndexAutomerge(p, &pStruct, pgnoLast);
18233 fts5IndexCrisismerge(p, &pStruct);
18234 fts5StructureWrite(p, pStruct);
18235 fts5StructureRelease(pStruct);
18236 }
18237
18238 /*
18239 ** Flush any data stored in the in-memory hash tables to the database.
18240 */
18241 static void fts5IndexFlush(Fts5Index *p){
18242 /* Unless it is empty, flush the hash table to disk */
18243 if( p->nPendingData ){
18244 assert( p->pHash );
18245 p->nPendingData = 0;
18246 fts5FlushOneHash(p);
18247 }
18248 }
18249
18250
18251 static int sqlite3Fts5IndexOptimize(Fts5Index *p){
18252 Fts5Structure *pStruct;
18253 Fts5Structure *pNew = 0;
18254 int nSeg = 0;
18255
18256 assert( p->rc==SQLITE_OK );
18257 fts5IndexFlush(p);
18258 pStruct = fts5StructureRead(p);
18259
18260 if( pStruct ){
18261 assert( pStruct->nSegment==fts5StructureCountSegments(pStruct) );
18262 nSeg = pStruct->nSegment;
18263 if( nSeg>1 ){
18264 int nByte = sizeof(Fts5Structure);
18265 nByte += (pStruct->nLevel+1) * sizeof(Fts5StructureLevel);
18266 pNew = (Fts5Structure*)sqlite3Fts5MallocZero(&p->rc, nByte);
18267 }
18268 }
18269 if( pNew ){
18270 Fts5StructureLevel *pLvl;
18271 int nByte = nSeg * sizeof(Fts5StructureSegment);
18272 pNew->nLevel = pStruct->nLevel+1;
18273 pNew->nRef = 1;
18274 pNew->nWriteCounter = pStruct->nWriteCounter;
18275 pLvl = &pNew->aLevel[pStruct->nLevel];
18276 pLvl->aSeg = (Fts5StructureSegment*)sqlite3Fts5MallocZero(&p->rc, nByte);
18277 if( pLvl->aSeg ){
18278 int iLvl, iSeg;
18279 int iSegOut = 0;
18280 for(iLvl=0; iLvl<pStruct->nLevel; iLvl++){
18281 for(iSeg=0; iSeg<pStruct->aLevel[iLvl].nSeg; iSeg++){
18282 pLvl->aSeg[iSegOut] = pStruct->aLevel[iLvl].aSeg[iSeg];
18283 iSegOut++;
18284 }
18285 }
18286 pNew->nSegment = pLvl->nSeg = nSeg;
18287 }else{
18288 sqlite3_free(pNew);
18289 pNew = 0;
18290 }
18291 }
18292
18293 if( pNew ){
18294 int iLvl = pNew->nLevel-1;
18295 while( p->rc==SQLITE_OK && pNew->aLevel[iLvl].nSeg>0 ){
18296 int nRem = FTS5_OPT_WORK_UNIT;
18297 fts5IndexMergeLevel(p, &pNew, iLvl, &nRem);
18298 }
18299
18300 fts5StructureWrite(p, pNew);
18301 fts5StructureRelease(pNew);
18302 }
18303
18304 fts5StructureRelease(pStruct);
18305 return fts5IndexReturn(p);
18306 }
18307
18308 static int sqlite3Fts5IndexMerge(Fts5Index *p, int nMerge){
18309 Fts5Structure *pStruct;
18310
18311 pStruct = fts5StructureRead(p);
18312 if( pStruct && pStruct->nLevel ){
18313 fts5IndexMerge(p, &pStruct, nMerge);
18314 fts5StructureWrite(p, pStruct);
18315 }
18316 fts5StructureRelease(pStruct);
18317
18318 return fts5IndexReturn(p);
18319 }
18320
18321 static void fts5PoslistCallback(
18322 Fts5Index *p,
18323 void *pContext,
18324 const u8 *pChunk, int nChunk
18325 ){
18326 assert_nc( nChunk>=0 );
18327 if( nChunk>0 ){
18328 fts5BufferSafeAppendBlob((Fts5Buffer*)pContext, pChunk, nChunk);
18329 }
18330 }
18331
18332 typedef struct PoslistCallbackCtx PoslistCallbackCtx;
18333 struct PoslistCallbackCtx {
18334 Fts5Buffer *pBuf; /* Append to this buffer */
18335 Fts5Colset *pColset; /* Restrict matches to this column */
18336 int eState; /* 0: skip, 1: copy, 2: column number pending (chunk ended after 0x01) */
18337 };
18338
18339 /*
18340 ** TODO: Make this more efficient!
18341 */
18342 static int fts5IndexColsetTest(Fts5Colset *pColset, int iCol){
18343 int i;
18344 for(i=0; i<pColset->nCol; i++){
18345 if( pColset->aiCol[i]==iCol ) return 1;
18346 }
18347 return 0;
18348 }
18349
18350 static void fts5PoslistFilterCallback(
18351 Fts5Index *p,
18352 void *pContext,
18353 const u8 *pChunk, int nChunk
18354 ){
18355 PoslistCallbackCtx *pCtx = (PoslistCallbackCtx*)pContext;
18356 assert_nc( nChunk>=0 );
18357 if( nChunk>0 ){
18358 /* Search through to find the first varint with value 1. This is the
18359 ** start of the next column's hits. */
18360 int i = 0;
18361 int iStart = 0;
18362
18363 if( pCtx->eState==2 ){
18364 int iCol;
18365 fts5FastGetVarint32(pChunk, i, iCol);
18366 if( fts5IndexColsetTest(pCtx->pColset, iCol) ){
18367 pCtx->eState = 1;
18368 fts5BufferSafeAppendVarint(pCtx->pBuf, 1);
18369 }else{
18370 pCtx->eState = 0;
18371 }
18372 }
18373
18374 do {
18375 while( i<nChunk && pChunk[i]!=0x01 ){
18376 while( pChunk[i] & 0x80 ) i++;
18377 i++;
18378 }
18379 if( pCtx->eState ){
18380 fts5BufferSafeAppendBlob(pCtx->pBuf, &pChunk[iStart], i-iStart);
18381 }
18382 if( i<nChunk ){
18383 int iCol;
18384 iStart = i;
18385 i++;
18386 if( i>=nChunk ){
18387 pCtx->eState = 2;
18388 }else{
18389 fts5FastGetVarint32(pChunk, i, iCol);
18390 pCtx->eState = fts5IndexColsetTest(pCtx->pColset, iCol);
18391 if( pCtx->eState ){
18392 fts5BufferSafeAppendBlob(pCtx->pBuf, &pChunk[iStart], i-iStart);
18393 iStart = i;
18394 }
18395 }
18396 }
18397 }while( i<nChunk );
18398 }
18399 }
18400
18401 /*
18402 ** Iterator pIter currently points to a valid entry (not EOF). This
18403 ** function appends the position list data for the current entry to
18404 ** buffer pBuf. It does not make a copy of the position-list size
18405 ** field.
18406 */
18407 static void fts5SegiterPoslist(
18408 Fts5Index *p,
18409 Fts5SegIter *pSeg,
18410 Fts5Colset *pColset,
18411 Fts5Buffer *pBuf
18412 ){
18413 if( 0==fts5BufferGrow(&p->rc, pBuf, pSeg->nPos) ){
18414 if( pColset==0 ){
18415 fts5ChunkIterate(p, pSeg, (void*)pBuf, fts5PoslistCallback);
18416 }else{
18417 PoslistCallbackCtx sCtx;
18418 sCtx.pBuf = pBuf;
18419 sCtx.pColset = pColset;
18420 sCtx.eState = fts5IndexColsetTest(pColset, 0);
18421 assert( sCtx.eState==0 || sCtx.eState==1 );
18422 fts5ChunkIterate(p, pSeg, (void*)&sCtx, fts5PoslistFilterCallback);
18423 }
18424 }
18425 }
18426
18427 /*
18428 ** IN/OUT parameter (*pa) points to a position list n bytes in size. If
18429 ** the position list contains entries for column iCol, then (*pa) is set
18430 ** to point to the sub-position-list for that column and the number of
18431 ** bytes in it returned. Or, if the argument position list does not
18432 ** contain any entries for column iCol, return 0.
18433 */
18434 static int fts5IndexExtractCol(
18435 const u8 **pa, /* IN/OUT: Pointer to poslist */
18436 int n, /* IN: Size of poslist in bytes */
18437 int iCol /* Column to extract from poslist */
18438 ){
18439 int iCurrent = 0; /* Anything before the first 0x01 is col 0 */
18440 const u8 *p = *pa;
18441 const u8 *pEnd = &p[n]; /* One byte past end of position list */
18442 u8 prev = 0;
18443
18444 while( iCol>iCurrent ){
18445 /* Advance pointer p until it points to pEnd or an 0x01 byte that is
18446 ** not part of a varint */
18447 while( (prev & 0x80) || *p!=0x01 ){
18448 prev = *p++;
18449 if( p==pEnd ) return 0;
18450 }
18451 *pa = p++;
18452 p += fts5GetVarint32(p, iCurrent);
18453 }
18454 if( iCol!=iCurrent ) return 0;
18455
18456 /* Advance pointer p until it points to pEnd or an 0x01 byte that is
18457 ** not part of a varint */
18458 assert( (prev & 0x80)==0 );
18459 while( p<pEnd && ((prev & 0x80) || *p!=0x01) ){
18460 prev = *p++;
18461 }
18462 return p - (*pa);
18463 }
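Reviewer sketch: the column extraction above relies on the poslist convention that a 0x01 byte (one that is not the tail of a multi-byte varint) introduces a column-number varint, and that anything before the first such marker belongs to column 0. The following is a minimal, self-contained sketch of that scan. The `sketch_` names are hypothetical, and the sketch adds explicit bounds checks that the listing does not have.

```c
#include <assert.h>
#include <stddef.h>

/* Decode a varint of up to 5 bytes (enough for 32-bit values) from
** a[0..n). Stores the value in *pVal, returns the bytes consumed. */
static int sketch_get_varint32(const unsigned char *a, int n, unsigned *pVal){
  unsigned v = 0;
  int i = 0;
  while( i<n && i<5 ){
    unsigned char c = a[i++];
    v = (v<<7) | (c & 0x7f);
    if( (c & 0x80)==0 ) break;
  }
  *pVal = v;
  return i;
}

/* If the n-byte poslist at *pa has entries for column iCol, point *pa at
** the sub-list for that column and return its size in bytes. Otherwise
** return 0 (in which case *pa may have been moved). */
static int sketch_extract_col(const unsigned char **pa, int n, unsigned iCol){
  const unsigned char *p = *pa;
  const unsigned char *pEnd = p + n;
  unsigned iCurrent = 0;      /* data before the first 0x01 is column 0 */
  unsigned char prev = 0;

  assert( n>=0 );
  while( iCol>iCurrent ){
    /* Skip to the next 0x01 that is not part of a multi-byte varint. */
    while( p<pEnd && ((prev & 0x80) || *p!=0x01) ) prev = *p++;
    if( p>=pEnd ) return 0;
    *pa = p++;
    p += sketch_get_varint32(p, (int)(pEnd-p), &iCurrent);
    prev = 0;
  }
  if( iCol!=iCurrent ) return 0;

  /* The sub-list runs to the next column marker or to the end. */
  while( p<pEnd && ((prev & 0x80) || *p!=0x01) ) prev = *p++;
  return (int)(p - *pa);
}
```

For a non-zero column the returned span starts at the 0x01 marker itself, matching the behaviour of the listing.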
18464
18465
18466 /*
18467 ** Iterator pMulti currently points to a valid entry (not EOF). This
18468 ** function appends the following to buffer pBuf:
18469 **
18470 ** * The varint iDelta, and
18471 ** * the position list that pMulti currently points to, including the size field.
18472 **
18473 ** If argument pColset is not NULL, then the position list is filtered according
18474 ** to pColset before being appended to the buffer. If this means there are
18475 ** no entries in the position list, nothing is appended to the buffer (not
18476 ** even iDelta).
18477 **
18478 ** If an error occurs, an error code is left in p->rc.
18479 */
18480 static int fts5AppendPoslist(
18481 Fts5Index *p,
18482 i64 iDelta,
18483 Fts5IndexIter *pMulti,
18484 Fts5Colset *pColset,
18485 Fts5Buffer *pBuf
18486 ){
18487 if( p->rc==SQLITE_OK ){
18488 Fts5SegIter *pSeg = &pMulti->aSeg[ pMulti->aFirst[1].iFirst ];
18489 assert( fts5MultiIterEof(p, pMulti)==0 );
18490 assert( pSeg->nPos>0 );
18491 if( 0==fts5BufferGrow(&p->rc, pBuf, pSeg->nPos+9+9) ){
18492
18493 if( pSeg->iLeafOffset+pSeg->nPos<=pSeg->pLeaf->szLeaf
18494 && (pColset==0 || pColset->nCol==1)
18495 ){
18496 const u8 *pPos = &pSeg->pLeaf->p[pSeg->iLeafOffset];
18497 int nPos;
18498 if( pColset ){
18499 nPos = fts5IndexExtractCol(&pPos, pSeg->nPos, pColset->aiCol[0]);
18500 if( nPos==0 ) return 1;
18501 }else{
18502 nPos = pSeg->nPos;
18503 }
18504 assert( nPos>0 );
18505 fts5BufferSafeAppendVarint(pBuf, iDelta);
18506 fts5BufferSafeAppendVarint(pBuf, nPos*2);
18507 fts5BufferSafeAppendBlob(pBuf, pPos, nPos);
18508 }else{
18509 int iSv1;
18510 int iSv2;
18511 int iData;
18512
18513 /* Append iDelta */
18514 iSv1 = pBuf->n;
18515 fts5BufferSafeAppendVarint(pBuf, iDelta);
18516
18517 /* WRITEPOSLISTSIZE */
18518 iSv2 = pBuf->n;
18519 fts5BufferSafeAppendVarint(pBuf, pSeg->nPos*2);
18520 iData = pBuf->n;
18521
18522 fts5SegiterPoslist(p, pSeg, pColset, pBuf);
18523
18524 if( pColset ){
18525 int nActual = pBuf->n - iData;
18526 if( nActual!=pSeg->nPos ){
18527 if( nActual==0 ){
18528 pBuf->n = iSv1;
18529 return 1;
18530 }else{
18531 int nReq = sqlite3Fts5GetVarintLen((u32)(nActual*2));
18532 while( iSv2<(iData-nReq) ){ pBuf->p[iSv2++] = 0x80; }
18533 sqlite3Fts5PutVarint(&pBuf->p[iSv2], nActual*2);
18534 }
18535 }
18536 }
18537 }
18538
18539 }
18540 }
18541
18542 return 0;
18543 }
18544
18545 static void fts5DoclistIterNext(Fts5DoclistIter *pIter){
18546 u8 *p = pIter->aPoslist + pIter->nSize + pIter->nPoslist;
18547
18548 assert( pIter->aPoslist );
18549 if( p>=pIter->aEof ){
18550 pIter->aPoslist = 0;
18551 }else{
18552 i64 iDelta;
18553
18554 p += fts5GetVarint(p, (u64*)&iDelta);
18555 pIter->iRowid += iDelta;
18556
18557 /* Read position list size */
18558 if( p[0] & 0x80 ){
18559 int nPos;
18560 pIter->nSize = fts5GetVarint32(p, nPos);
18561 pIter->nPoslist = (nPos>>1);
18562 }else{
18563 pIter->nPoslist = ((int)(p[0])) >> 1;
18564 pIter->nSize = 1;
18565 }
18566
18567 pIter->aPoslist = p;
18568 }
18569 }
18570
18571 static void fts5DoclistIterInit(
18572 Fts5Buffer *pBuf,
18573 Fts5DoclistIter *pIter
18574 ){
18575 memset(pIter, 0, sizeof(*pIter));
18576 pIter->aPoslist = pBuf->p;
18577 pIter->aEof = &pBuf->p[pBuf->n];
18578 fts5DoclistIterNext(pIter);
18579 }
18580
18581 #if 0
18582 /*
18583 ** Append a docid to buffer pBuf.
18584 **
18585 ** This function assumes that space within the buffer has already been
18586 ** allocated.
18587 */
18588 static void fts5MergeAppendDocid(
18589 Fts5Buffer *pBuf, /* Buffer to write to */
18590 i64 *piLastRowid, /* IN/OUT: Previous rowid written (if any) */
18591 i64 iRowid /* Rowid to append */
18592 ){
18593 assert( pBuf->n!=0 || (*piLastRowid)==0 );
18594 fts5BufferSafeAppendVarint(pBuf, iRowid - *piLastRowid);
18595 *piLastRowid = iRowid;
18596 }
18597 #endif
18598
18599 #define fts5MergeAppendDocid(pBuf, iLastRowid, iRowid) { \
18600 assert( (pBuf)->n!=0 || (iLastRowid)==0 ); \
18601 fts5BufferSafeAppendVarint((pBuf), (iRowid) - (iLastRowid)); \
18602 (iLastRowid) = (iRowid); \
18603 }
18604
18605 /*
18606 ** Buffers p1 and p2 contain doclists. This function merges the content
18607 ** of the two doclists together and sets buffer p1 to the result before
18608 ** returning.
18609 **
18610 ** If an error occurs, an error code is left in p->rc. If an error has
18611 ** already occurred, this function is a no-op.
18612 */
18613 static void fts5MergePrefixLists(
18614 Fts5Index *p, /* FTS5 backend object */
18615 Fts5Buffer *p1, /* First list to merge */
18616 Fts5Buffer *p2 /* Second list to merge */
18617 ){
18618 if( p2->n ){
18619 i64 iLastRowid = 0;
18620 Fts5DoclistIter i1;
18621 Fts5DoclistIter i2;
18622 Fts5Buffer out;
18623 Fts5Buffer tmp;
18624 memset(&out, 0, sizeof(out));
18625 memset(&tmp, 0, sizeof(tmp));
18626
18627 sqlite3Fts5BufferSize(&p->rc, &out, p1->n + p2->n);
18628 fts5DoclistIterInit(p1, &i1);
18629 fts5DoclistIterInit(p2, &i2);
18630 while( p->rc==SQLITE_OK && (i1.aPoslist!=0 || i2.aPoslist!=0) ){
18631 if( i2.aPoslist==0 || (i1.aPoslist && i1.iRowid<i2.iRowid) ){
18632 /* Copy entry from i1 */
18633 fts5MergeAppendDocid(&out, iLastRowid, i1.iRowid);
18634 fts5BufferSafeAppendBlob(&out, i1.aPoslist, i1.nPoslist+i1.nSize);
18635 fts5DoclistIterNext(&i1);
18636 }
18637 else if( i1.aPoslist==0 || i2.iRowid!=i1.iRowid ){
18638 /* Copy entry from i2 */
18639 fts5MergeAppendDocid(&out, iLastRowid, i2.iRowid);
18640 fts5BufferSafeAppendBlob(&out, i2.aPoslist, i2.nPoslist+i2.nSize);
18641 fts5DoclistIterNext(&i2);
18642 }
18643 else{
18644 i64 iPos1 = 0;
18645 i64 iPos2 = 0;
18646 int iOff1 = 0;
18647 int iOff2 = 0;
18648 u8 *a1 = &i1.aPoslist[i1.nSize];
18649 u8 *a2 = &i2.aPoslist[i2.nSize];
18650
18651 Fts5PoslistWriter writer;
18652 memset(&writer, 0, sizeof(writer));
18653
18654 /* Merge the two position lists. */
18655 fts5MergeAppendDocid(&out, iLastRowid, i2.iRowid);
18656 fts5BufferZero(&tmp);
18657
18658 sqlite3Fts5PoslistNext64(a1, i1.nPoslist, &iOff1, &iPos1);
18659 sqlite3Fts5PoslistNext64(a2, i2.nPoslist, &iOff2, &iPos2);
18660
18661 while( p->rc==SQLITE_OK && (iPos1>=0 || iPos2>=0) ){
18662 i64 iNew;
18663 if( iPos2<0 || (iPos1>=0 && iPos1<iPos2) ){
18664 iNew = iPos1;
18665 sqlite3Fts5PoslistNext64(a1, i1.nPoslist, &iOff1, &iPos1);
18666 }else{
18667 iNew = iPos2;
18668 sqlite3Fts5PoslistNext64(a2, i2.nPoslist, &iOff2, &iPos2);
18669 if( iPos1==iPos2 ){
18670 sqlite3Fts5PoslistNext64(a1, i1.nPoslist, &iOff1,&iPos1);
18671 }
18672 }
18673 p->rc = sqlite3Fts5PoslistWriterAppend(&tmp, &writer, iNew);
18674 }
18675
18676 /* WRITEPOSLISTSIZE */
18677 fts5BufferSafeAppendVarint(&out, tmp.n * 2);
18678 fts5BufferSafeAppendBlob(&out, tmp.p, tmp.n);
18679 fts5DoclistIterNext(&i1);
18680 fts5DoclistIterNext(&i2);
18681 }
18682 }
18683
18684 fts5BufferSet(&p->rc, p1, out.n, out.p);
18685 fts5BufferFree(&tmp);
18686 fts5BufferFree(&out);
18687 }
18688 }
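Reviewer sketch: when both doclists contain the same rowid, the function above merges the two position lists into one sorted list. The idea can be shown on plain arrays. This is a minimal sketch under simplifying assumptions: positions are already decoded into sorted `long long` arrays instead of varint-encoded buffers, the name `sketch_merge_positions` is hypothetical, and the sketch uses a three-way comparison rather than the exact control flow of the listing.

```c
#include <assert.h>
#include <stddef.h>

/* Merge sorted arrays a1[0..n1) and a2[0..n2) into aOut, emitting a
** position that appears in both inputs only once. aOut must have room
** for n1+n2 entries. Returns the number of entries written. */
static int sketch_merge_positions(
  const long long *a1, int n1,
  const long long *a2, int n2,
  long long *aOut
){
  int i1 = 0, i2 = 0, nOut = 0;
  assert( n1>=0 && n2>=0 && aOut!=NULL );
  while( i1<n1 || i2<n2 ){
    long long iNew;
    if( i2>=n2 || (i1<n1 && a1[i1]<a2[i2]) ){
      iNew = a1[i1++];          /* list 1 has the smaller position */
    }else if( i1>=n1 || a2[i2]<a1[i1] ){
      iNew = a2[i2++];          /* list 2 has the smaller position */
    }else{
      iNew = a1[i1++];          /* equal: emit once, advance both */
      i2++;
    }
    aOut[nOut++] = iNew;
  }
  return nOut;
}
```

The output never exceeds `n1+n2` entries, which is the same bound the listing uses when it sizes the `out` buffer to `p1->n + p2->n`.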
18689
18690 static void fts5BufferSwap(Fts5Buffer *p1, Fts5Buffer *p2){
18691 Fts5Buffer tmp = *p1;
18692 *p1 = *p2;
18693 *p2 = tmp;
18694 }
18695
18696 static void fts5SetupPrefixIter(
18697 Fts5Index *p, /* Index to read from */
18698 int bDesc, /* True for "ORDER BY rowid DESC" */
18699 const u8 *pToken, /* Buffer containing prefix to match */
18700 int nToken, /* Size of buffer pToken in bytes */
18701 Fts5Colset *pColset, /* Restrict matches to these columns */
18702 Fts5IndexIter **ppIter /* OUT: New iterator */
18703 ){
18704 Fts5Structure *pStruct;
18705 Fts5Buffer *aBuf;
18706 const int nBuf = 32;
18707
18708 aBuf = (Fts5Buffer*)fts5IdxMalloc(p, sizeof(Fts5Buffer)*nBuf);
18709 pStruct = fts5StructureRead(p);
18710
18711 if( aBuf && pStruct ){
18712 const int flags = FTS5INDEX_QUERY_SCAN;
18713 int i;
18714 i64 iLastRowid = 0;
18715 Fts5IndexIter *p1 = 0; /* Iterator used to gather data from index */
18716 Fts5Data *pData;
18717 Fts5Buffer doclist;
18718 int bNewTerm = 1;
18719
18720 memset(&doclist, 0, sizeof(doclist));
18721 for(fts5MultiIterNew(p, pStruct, 1, flags, pToken, nToken, -1, 0, &p1);
18722 fts5MultiIterEof(p, p1)==0;
18723 fts5MultiIterNext2(p, p1, &bNewTerm)
18724 ){
18725 i64 iRowid = fts5MultiIterRowid(p1);
18726 int nTerm;
18727 const u8 *pTerm = fts5MultiIterTerm(p1, &nTerm);
18728 assert_nc( memcmp(pToken, pTerm, MIN(nToken, nTerm))<=0 );
18729 if( bNewTerm ){
18730 if( nTerm<nToken || memcmp(pToken, pTerm, nToken) ) break;
18731 }
18732
18733 if( doclist.n>0 && iRowid<=iLastRowid ){
18734 for(i=0; p->rc==SQLITE_OK && doclist.n; i++){
18735 assert( i<nBuf );
18736 if( aBuf[i].n==0 ){
18737 fts5BufferSwap(&doclist, &aBuf[i]);
18738 fts5BufferZero(&doclist);
18739 }else{
18740 fts5MergePrefixLists(p, &doclist, &aBuf[i]);
18741 fts5BufferZero(&aBuf[i]);
18742 }
18743 }
18744 iLastRowid = 0;
18745 }
18746
18747 if( !fts5AppendPoslist(p, iRowid-iLastRowid, p1, pColset, &doclist) ){
18748 iLastRowid = iRowid;
18749 }
18750 }
18751
18752 for(i=0; i<nBuf; i++){
18753 if( p->rc==SQLITE_OK ){
18754 fts5MergePrefixLists(p, &doclist, &aBuf[i]);
18755 }
18756 fts5BufferFree(&aBuf[i]);
18757 }
18758 fts5MultiIterFree(p, p1);
18759
18760 pData = fts5IdxMalloc(p, sizeof(Fts5Data) + doclist.n);
18761 if( pData ){
18762 pData->p = (u8*)&pData[1];
18763 pData->nn = pData->szLeaf = doclist.n;
18764 memcpy(pData->p, doclist.p, doclist.n);
18765 fts5MultiIterNew2(p, pData, bDesc, ppIter);
18766 }
18767 fts5BufferFree(&doclist);
18768 }
18769
18770 fts5StructureRelease(pStruct);
18771 sqlite3_free(aBuf);
18772 }
18773
18774
18775 /*
18776 ** Indicate that all subsequent calls to sqlite3Fts5IndexWrite() pertain
18777 ** to the document with rowid iRowid.
18778 */
18779 static int sqlite3Fts5IndexBeginWrite(Fts5Index *p, int bDelete, i64 iRowid){
18780 assert( p->rc==SQLITE_OK );
18781
18782 /* Allocate the hash table if it has not already been allocated */
18783 if( p->pHash==0 ){
18784 p->rc = sqlite3Fts5HashNew(&p->pHash, &p->nPendingData);
18785 }
18786
18787 /* Flush the hash table to disk if required */
18788 if( iRowid<p->iWriteRowid
18789 || (iRowid==p->iWriteRowid && p->bDelete==0)
18790 || (p->nPendingData > p->pConfig->nHashSize)
18791 ){
18792 fts5IndexFlush(p);
18793 }
18794
18795 p->iWriteRowid = iRowid;
18796 p->bDelete = bDelete;
18797 return fts5IndexReturn(p);
18798 }
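Reviewer sketch: the flush condition in sqlite3Fts5IndexBeginWrite() can be read as a pure predicate over the incoming rowid and the state left by the previous call. The following minimal sketch restates it. The function name and parameter names are hypothetical stand-ins for the Fts5Index and Fts5Config fields used in the listing.

```c
#include <assert.h>

/* Return non-zero if pending hash data must be flushed before writing
** rowid iRowid. iWriteRowid and bPrevDelete describe the previous
** BeginWrite call; nPending and nHashSize mirror p->nPendingData and
** p->pConfig->nHashSize in the listing. */
static int sketch_needs_flush(
  long long iRowid,
  long long iWriteRowid,
  int bPrevDelete,
  int nPending,
  int nHashSize
){
  assert( nPending>=0 );
  return iRowid<iWriteRowid                          /* rowids went backwards */
      || (iRowid==iWriteRowid && bPrevDelete==0)     /* same rowid, not after a delete */
      || nPending>nHashSize;                         /* hash table is over budget */
}
```

Writing the same rowid again is allowed without a flush only when the previous operation on it was a delete.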
18799
18800 /*
18801 ** Commit data to disk.
18802 */
18803 static int sqlite3Fts5IndexSync(Fts5Index *p, int bCommit){
18804 assert( p->rc==SQLITE_OK );
18805 fts5IndexFlush(p);
18806 if( bCommit ) fts5CloseReader(p);
18807 return fts5IndexReturn(p);
18808 }
18809
18810 /*
18811 ** Discard any data stored in the in-memory hash tables. Do not write it
18812 ** to the database. Additionally, assume that the contents of the %_data
18813 ** table may have changed on disk. So any in-memory caches of %_data
18814 ** records must be invalidated.
18815 */
18816 static int sqlite3Fts5IndexRollback(Fts5Index *p){
18817 fts5CloseReader(p);
18818 fts5IndexDiscardData(p);
18819 assert( p->rc==SQLITE_OK );
18820 return SQLITE_OK;
18821 }
18822
18823 /*
18824 ** The %_data table is completely empty when this function is called. This
18825 ** function populates it with the initial structure objects for each index,
18826 ** and the initial version of the "averages" record (a zero-byte blob).
18827 */
18828 static int sqlite3Fts5IndexReinit(Fts5Index *p){
18829 Fts5Structure s;
18830 memset(&s, 0, sizeof(Fts5Structure));
18831 fts5DataWrite(p, FTS5_AVERAGES_ROWID, (const u8*)"", 0);
18832 fts5StructureWrite(p, &s);
18833 return fts5IndexReturn(p);
18834 }
18835
18836 /*
18837 ** Open a new Fts5Index handle. If the bCreate argument is true, create
18838 ** and initialize the underlying %_data table.
18839 **
18840 ** If successful, set *pp to point to the new object and return SQLITE_OK.
18841 ** Otherwise, set *pp to NULL and return an SQLite error code.
18842 */
18843 static int sqlite3Fts5IndexOpen(
18844 Fts5Config *pConfig,
18845 int bCreate,
18846 Fts5Index **pp,
18847 char **pzErr
18848 ){
18849 int rc = SQLITE_OK;
18850 Fts5Index *p; /* New object */
18851
18852 *pp = p = (Fts5Index*)sqlite3Fts5MallocZero(&rc, sizeof(Fts5Index));
18853 if( rc==SQLITE_OK ){
18854 p->pConfig = pConfig;
18855 p->nWorkUnit = FTS5_WORK_UNIT;
18856 p->zDataTbl = sqlite3Fts5Mprintf(&rc, "%s_data", pConfig->zName);
18857 if( p->zDataTbl && bCreate ){
18858 rc = sqlite3Fts5CreateTable(
18859 pConfig, "data", "id INTEGER PRIMARY KEY, block BLOB", 0, pzErr
18860 );
18861 if( rc==SQLITE_OK ){
18862 rc = sqlite3Fts5CreateTable(pConfig, "idx",
18863 "segid, term, pgno, PRIMARY KEY(segid, term)",
18864 1, pzErr
18865 );
18866 }
18867 if( rc==SQLITE_OK ){
18868 rc = sqlite3Fts5IndexReinit(p);
18869 }
18870 }
18871 }
18872
18873 assert( rc!=SQLITE_OK || p->rc==SQLITE_OK );
18874 if( rc ){
18875 sqlite3Fts5IndexClose(p);
18876 *pp = 0;
18877 }
18878 return rc;
18879 }
18880
18881 /*
18882 ** Close a handle opened by an earlier call to sqlite3Fts5IndexOpen().
18883 */
18884 static int sqlite3Fts5IndexClose(Fts5Index *p){
18885 int rc = SQLITE_OK;
18886 if( p ){
18887 assert( p->pReader==0 );
18888 sqlite3_finalize(p->pWriter);
18889 sqlite3_finalize(p->pDeleter);
18890 sqlite3_finalize(p->pIdxWriter);
18891 sqlite3_finalize(p->pIdxDeleter);
18892 sqlite3_finalize(p->pIdxSelect);
18893 sqlite3Fts5HashFree(p->pHash);
18894 sqlite3_free(p->zDataTbl);
18895 sqlite3_free(p);
18896 }
18897 return rc;
18898 }
18899
18900 /*
18901 ** Argument p points to a buffer containing utf-8 text that is nByte bytes in
18902 ** size. Return the number of bytes in the nChar character prefix of the
18903 ** buffer, or 0 if there are fewer than nChar characters in total.
18904 */
18905 static int fts5IndexCharlenToBytelen(const char *p, int nByte, int nChar){
18906 int n = 0;
18907 int i;
18908 for(i=0; i<nChar; i++){
18909 if( n>=nByte ) return 0; /* Input contains fewer than nChar chars */
18910 if( (unsigned char)p[n++]>=0xc0 ){
18911 while( (p[n] & 0xc0)==0x80 ) n++;
18912 }
18913 }
18914 return n;
18915 }
18916
18917 /*
18918 ** pIn is a UTF-8 encoded string, nIn bytes in size. Return the number of
18919 ** unicode characters in the string.
18920 */
18921 static int fts5IndexCharlen(const char *pIn, int nIn){
18922 int nChar = 0;
18923 int i = 0;
18924 while( i<nIn ){
18925 if( (unsigned char)pIn[i++]>=0xc0 ){
18926 while( i<nIn && (pIn[i] & 0xc0)==0x80 ) i++;
18927 }
18928 nChar++;
18929 }
18930 return nChar;
18931 }
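Reviewer sketch: both helpers above treat any byte of 0xc0 or more as the lead byte of a multi-byte UTF-8 character and skip the 10xxxxxx continuation bytes that follow it. A minimal standalone version is shown below. The `sketch_` names are hypothetical, and unlike fts5IndexCharlenToBytelen() in the listing, the inner loop of the byte-length helper here is also bounded by nByte.

```c
#include <assert.h>
#include <stddef.h>

/* Count the characters in the n-byte UTF-8 buffer z. */
static int sketch_charlen(const char *z, int n){
  int nChar = 0, i = 0;
  assert( z!=NULL || n==0 );
  while( i<n ){
    if( (unsigned char)z[i++]>=0xc0 ){
      /* Lead byte: skip its continuation bytes. */
      while( i<n && ((unsigned char)z[i] & 0xc0)==0x80 ) i++;
    }
    nChar++;
  }
  return nChar;
}

/* Return the size in bytes of the first nChar characters of the
** nByte-byte UTF-8 buffer z, or 0 if it holds fewer than nChar. */
static int sketch_charlen_to_bytelen(const char *z, int nByte, int nChar){
  int n = 0, i;
  for(i=0; i<nChar; i++){
    if( n>=nByte ) return 0;
    if( (unsigned char)z[n++]>=0xc0 ){
      while( n<nByte && ((unsigned char)z[n] & 0xc0)==0x80 ) n++;
    }
  }
  return n;
}
```

sqlite3Fts5IndexWrite() uses the byte length to decide how much of each token to store in a prefix index, skipping tokens that are shorter than the configured prefix.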
18932
18933 /*
18934 ** Insert or remove data to or from the index. Each time a document is
18935 ** added to or removed from the index, this function is called one or more
18936 ** times.
18937 **
18938 ** For an insert, it must be called once for each token in the new document.
18939 ** If the operation is a delete, it must be called (at least) once for each
18940 ** unique token in the document with an iCol value less than zero. The iPos
18941 ** argument is ignored for a delete.
18942 */
18943 static int sqlite3Fts5IndexWrite(
18944 Fts5Index *p, /* Index to write to */
18945 int iCol, /* Column token appears in (-ve -> delete) */
18946 int iPos, /* Position of token within column */
18947 const char *pToken, int nToken /* Token to add or remove to or from index */
18948 ){
18949 int i; /* Used to iterate through indexes */
18950 int rc = SQLITE_OK; /* Return code */
18951 Fts5Config *pConfig = p->pConfig;
18952
18953 assert( p->rc==SQLITE_OK );
18954 assert( (iCol<0)==p->bDelete );
18955
18956 /* Add the entry to the main terms index. */
18957 rc = sqlite3Fts5HashWrite(
18958 p->pHash, p->iWriteRowid, iCol, iPos, FTS5_MAIN_PREFIX, pToken, nToken
18959 );
18960
18961 for(i=0; i<pConfig->nPrefix && rc==SQLITE_OK; i++){
18962 int nByte = fts5IndexCharlenToBytelen(pToken, nToken, pConfig->aPrefix[i]);
18963 if( nByte ){
18964 rc = sqlite3Fts5HashWrite(p->pHash,
18965 p->iWriteRowid, iCol, iPos, (char)(FTS5_MAIN_PREFIX+i+1), pToken,
18966 nByte
18967 );
18968 }
18969 }
18970
18971 return rc;
18972 }
18973
18974 /*
18975 ** Open a new iterator to iterate through all rowids that match the
18976 ** specified token or token prefix.
18977 */
18978 static int sqlite3Fts5IndexQuery(
18979 Fts5Index *p, /* FTS index to query */
18980 const char *pToken, int nToken, /* Token (or prefix) to query for */
18981 int flags, /* Mask of FTS5INDEX_QUERY_X flags */
18982 Fts5Colset *pColset, /* Match these columns only */
18983 Fts5IndexIter **ppIter /* OUT: New iterator object */
18984 ){
18985 Fts5Config *pConfig = p->pConfig;
18986 Fts5IndexIter *pRet = 0;
18987 int iIdx = 0;
18988 Fts5Buffer buf = {0, 0, 0};
18989
18990 /* If the QUERY_SCAN flag is set, all other flags must be clear. */
18991 assert( (flags & FTS5INDEX_QUERY_SCAN)==0 || flags==FTS5INDEX_QUERY_SCAN );
18992
18993 if( sqlite3Fts5BufferSize(&p->rc, &buf, nToken+1)==0 ){
18994 memcpy(&buf.p[1], pToken, nToken);
18995
18996 #ifdef SQLITE_DEBUG
18997 /* If the QUERY_TEST_NOIDX flag was specified, then this must be a
18998 ** prefix-query. Instead of using a prefix-index (if one exists),
18999 ** evaluate the prefix query using the main FTS index. This is used
19000 ** for internal sanity checking by the integrity-check in debug
19001 ** mode only. */
19002 if( pConfig->bPrefixIndex==0 || (flags & FTS5INDEX_QUERY_TEST_NOIDX) ){
19003 assert( flags & FTS5INDEX_QUERY_PREFIX );
19004 iIdx = 1+pConfig->nPrefix;
19005 }else
19006 #endif
19007 if( flags & FTS5INDEX_QUERY_PREFIX ){
19008 int nChar = fts5IndexCharlen(pToken, nToken);
19009 for(iIdx=1; iIdx<=pConfig->nPrefix; iIdx++){
19010 if( pConfig->aPrefix[iIdx-1]==nChar ) break;
19011 }
19012 }
19013
19014 if( iIdx<=pConfig->nPrefix ){
19015 Fts5Structure *pStruct = fts5StructureRead(p);
19016 buf.p[0] = (u8)(FTS5_MAIN_PREFIX + iIdx);
19017 if( pStruct ){
19018 fts5MultiIterNew(p, pStruct, 1, flags, buf.p, nToken+1, -1, 0, &pRet);
19019 fts5StructureRelease(pStruct);
19020 }
19021 }else{
19022 int bDesc = (flags & FTS5INDEX_QUERY_DESC)!=0;
19023 buf.p[0] = FTS5_MAIN_PREFIX;
19024 fts5SetupPrefixIter(p, bDesc, buf.p, nToken+1, pColset, &pRet);
19025 }
19026
19027 if( p->rc ){
19028 sqlite3Fts5IterClose(pRet);
19029 pRet = 0;
19030 fts5CloseReader(p);
19031 }
19032 *ppIter = pRet;
19033 sqlite3Fts5BufferFree(&buf);
19034 }
19035 return fts5IndexReturn(p);
19036 }
19037
19038 /*
19039 ** Return true if the iterator passed as the only argument is at EOF.
19040 */
19041 static int sqlite3Fts5IterEof(Fts5IndexIter *pIter){
19042 assert( pIter->pIndex->rc==SQLITE_OK );
19043 return pIter->bEof;
19044 }
19045
19046 /*
19047 ** Move to the next matching rowid.
19048 */
19049 static int sqlite3Fts5IterNext(Fts5IndexIter *pIter){
19050 assert( pIter->pIndex->rc==SQLITE_OK );
19051 fts5MultiIterNext(pIter->pIndex, pIter, 0, 0);
19052 return fts5IndexReturn(pIter->pIndex);
19053 }
19054
19055 /*
19056 ** Move to the next matching term/rowid. Used by the fts5vocab module.
19057 */
19058 static int sqlite3Fts5IterNextScan(Fts5IndexIter *pIter){
19059 Fts5Index *p = pIter->pIndex;
19060
19061 assert( pIter->pIndex->rc==SQLITE_OK );
19062
19063 fts5MultiIterNext(p, pIter, 0, 0);
19064 if( p->rc==SQLITE_OK ){
19065 Fts5SegIter *pSeg = &pIter->aSeg[ pIter->aFirst[1].iFirst ];
19066 if( pSeg->pLeaf && pSeg->term.p[0]!=FTS5_MAIN_PREFIX ){
19067 fts5DataRelease(pSeg->pLeaf);
19068 pSeg->pLeaf = 0;
19069 pIter->bEof = 1;
19070 }
19071 }
19072
19073 return fts5IndexReturn(pIter->pIndex);
19074 }
19075
19076 /*
19077 ** Move to the next matching rowid that occurs at or after iMatch. The
19078 ** definition of "at or after" depends on whether this iterator iterates
19079 ** in ascending or descending rowid order.
19080 */
19081 static int sqlite3Fts5IterNextFrom(Fts5IndexIter *pIter, i64 iMatch){
19082 fts5MultiIterNextFrom(pIter->pIndex, pIter, iMatch);
19083 return fts5IndexReturn(pIter->pIndex);
19084 }
19085
19086 /*
19087 ** Return the current rowid.
19088 */
19089 static i64 sqlite3Fts5IterRowid(Fts5IndexIter *pIter){
19090 return fts5MultiIterRowid(pIter);
19091 }
19092
19093 /*
19094 ** Return the current term.
19095 */
19096 static const char *sqlite3Fts5IterTerm(Fts5IndexIter *pIter, int *pn){
19097 int n;
19098 const char *z = (const char*)fts5MultiIterTerm(pIter, &n);
19099 *pn = n-1;
19100 return &z[1];
19101 }
19102
19103
19104 static int fts5IndexExtractColset (
19105 Fts5Colset *pColset, /* Colset to filter on */
19106 const u8 *pPos, int nPos, /* Position list */
19107 Fts5Buffer *pBuf /* Output buffer */
19108 ){
19109 int rc = SQLITE_OK;
19110 int i;
19111
19112 fts5BufferZero(pBuf);
19113 for(i=0; i<pColset->nCol; i++){
19114 const u8 *pSub = pPos;
19115 int nSub = fts5IndexExtractCol(&pSub, nPos, pColset->aiCol[i]);
19116 if( nSub ){
19117 fts5BufferAppendBlob(&rc, pBuf, nSub, pSub);
19118 }
19119 }
19120 return rc;
19121 }
19122
19123
19124 /*
19125 ** Return a pointer to a buffer containing a copy of the position list for
19126 ** the current entry. Output variable *pn is set to the size of the buffer
19127 ** in bytes before returning.
19128 **
19129 ** The returned position list does not include the "number of bytes" varint
19130 ** field that starts the position list on disk.
19131 */
19132 static int sqlite3Fts5IterPoslist(
19133 Fts5IndexIter *pIter,
19134 Fts5Colset *pColset, /* Column filter (or NULL) */
19135 const u8 **pp, /* OUT: Pointer to position-list data */
19136 int *pn, /* OUT: Size of position-list in bytes */
19137 i64 *piRowid /* OUT: Current rowid */
19138 ){
19139 Fts5SegIter *pSeg = &pIter->aSeg[ pIter->aFirst[1].iFirst ];
19140 assert( pIter->pIndex->rc==SQLITE_OK );
19141 *piRowid = pSeg->iRowid;
19142 if( pSeg->iLeafOffset+pSeg->nPos<=pSeg->pLeaf->szLeaf ){
19143 u8 *pPos = &pSeg->pLeaf->p[pSeg->iLeafOffset];
19144 if( pColset==0 || pIter->bFiltered ){
19145 *pn = pSeg->nPos;
19146 *pp = pPos;
19147 }else if( pColset->nCol==1 ){
19148 *pp = pPos;
19149 *pn = fts5IndexExtractCol(pp, pSeg->nPos, pColset->aiCol[0]);
19150 }else{
19151 fts5BufferZero(&pIter->poslist);
19152 fts5IndexExtractColset(pColset, pPos, pSeg->nPos, &pIter->poslist);
19153 *pp = pIter->poslist.p;
19154 *pn = pIter->poslist.n;
19155 }
19156 }else{
19157 fts5BufferZero(&pIter->poslist);
19158 fts5SegiterPoslist(pIter->pIndex, pSeg, pColset, &pIter->poslist);
19159 *pp = pIter->poslist.p;
19160 *pn = pIter->poslist.n;
19161 }
19162 return fts5IndexReturn(pIter->pIndex);
19163 }
19164
19165 /*
19166 ** This function is similar to sqlite3Fts5IterPoslist(), except that it
19167 ** copies the position list into the buffer supplied as the second
19168 ** argument.
19169 */
19170 static int sqlite3Fts5IterPoslistBuffer(Fts5IndexIter *pIter, Fts5Buffer *pBuf){
19171 Fts5Index *p = pIter->pIndex;
19172 Fts5SegIter *pSeg = &pIter->aSeg[ pIter->aFirst[1].iFirst ];
19173 assert( p->rc==SQLITE_OK );
19174 fts5BufferZero(pBuf);
19175 fts5SegiterPoslist(p, pSeg, 0, pBuf);
19176 return fts5IndexReturn(p);
19177 }
19178
19179 /*
19180 ** Close an iterator opened by an earlier call to sqlite3Fts5IndexQuery().
19181 */
19182 static void sqlite3Fts5IterClose(Fts5IndexIter *pIter){
19183 if( pIter ){
19184 Fts5Index *pIndex = pIter->pIndex;
19185 fts5MultiIterFree(pIter->pIndex, pIter);
19186 fts5CloseReader(pIndex);
19187 }
19188 }
19189
19190 /*
19191 ** Read and decode the "averages" record from the database.
19192 **
19193 ** Parameter anSize must point to an array of size nCol, where nCol is
19194 ** the number of user defined columns in the FTS table.
19195 */
19196 static int sqlite3Fts5IndexGetAverages(Fts5Index *p, i64 *pnRow, i64 *anSize){
19197 int nCol = p->pConfig->nCol;
19198 Fts5Data *pData;
19199
19200 *pnRow = 0;
19201 memset(anSize, 0, sizeof(i64) * nCol);
19202 pData = fts5DataRead(p, FTS5_AVERAGES_ROWID);
19203 if( p->rc==SQLITE_OK && pData->nn ){
19204 int i = 0;
19205 int iCol;
19206 i += fts5GetVarint(&pData->p[i], (u64*)pnRow);
19207 for(iCol=0; i<pData->nn && iCol<nCol; iCol++){
19208 i += fts5GetVarint(&pData->p[i], (u64*)&anSize[iCol]);
19209 }
19210 }
19211
19212 fts5DataRelease(pData);
19213 return fts5IndexReturn(p);
19214 }
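Reviewer note: the "averages" record read above is a run of varints, first the row count and then one total size per column. The sketch below shows that decoding loop in isolation. It assumes the standard SQLite varint layout (big-endian, 7 payload bits per byte, a 9th byte carrying 8 bits); the `sk_` names are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Decode one SQLite-style varint starting at p[0]. Returns the number of
** bytes consumed (1..9). The caller must guarantee that the buffer ends
** with a terminating byte or padding, as this sketch does no bounds check. */
static int sk_get_varint(const unsigned char *p, uint64_t *pVal){
  uint64_t v = 0;
  int i;
  for(i=0; i<8; i++){
    v = (v<<7) | (uint64_t)(p[i] & 0x7f);
    if( (p[i] & 0x80)==0 ){ *pVal = v; return i+1; }
  }
  v = (v<<8) | p[8];            /* 9th byte contributes all 8 bits */
  *pVal = v;
  return 9;
}

/* Decode an averages-style record: row count, then up to nCol sizes.
** Columns missing from the record are left at zero. Returns the number
** of bytes consumed. */
static int sk_decode_averages(
  const unsigned char *a, int n,     /* Record and its size in bytes */
  int nCol,                          /* Number of entries in anSize[] */
  int64_t *pnRow, int64_t *anSize    /* OUT: row count, per-column sizes */
){
  int i = 0;
  int iCol;
  uint64_t v;
  assert( nCol>=0 );
  *pnRow = 0;
  for(iCol=0; iCol<nCol; iCol++) anSize[iCol] = 0;
  if( n>0 ){
    i += sk_get_varint(&a[i], &v);
    *pnRow = (int64_t)v;
    for(iCol=0; i<n && iCol<nCol; iCol++){
      i += sk_get_varint(&a[i], &v);
      anSize[iCol] = (int64_t)v;
    }
  }
  return i;
}
```

As in the original, the loop stops at whichever comes first, the end of the record or the last column, so a short record yields zeros for the trailing columns.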
19215
19216 /*
19217 ** Replace the current "averages" record with the contents of the buffer
19218 ** supplied as the second argument.
19219 */
19220 static int sqlite3Fts5IndexSetAverages(Fts5Index *p, const u8 *pData, int nData) {
19221 assert( p->rc==SQLITE_OK );
19222 fts5DataWrite(p, FTS5_AVERAGES_ROWID, pData, nData);
19223 return fts5IndexReturn(p);
19224 }
19225
19226 /*
19227 ** Return the total number of blocks this module has read from the %_data
19228 ** table since it was created.
19229 */
19230 static int sqlite3Fts5IndexReads(Fts5Index *p){
19231 return p->nRead;
19232 }
19233
19234 /*
19235 ** Set the 32-bit cookie value stored at the start of all structure
19236 ** records to the value passed as the second argument.
19237 **
19238 ** Return SQLITE_OK if successful, or an SQLite error code if an error
19239 ** occurs.
19240 */
19241 static int sqlite3Fts5IndexSetCookie(Fts5Index *p, int iNew){
19242 int rc; /* Return code */
19243 Fts5Config *pConfig = p->pConfig; /* Configuration object */
19244 u8 aCookie[4]; /* Binary representation of iNew */
19245 sqlite3_blob *pBlob = 0;
19246
19247 assert( p->rc==SQLITE_OK );
19248 sqlite3Fts5Put32(aCookie, iNew);
19249
19250 rc = sqlite3_blob_open(pConfig->db, pConfig->zDb, p->zDataTbl,
19251 "block", FTS5_STRUCTURE_ROWID, 1, &pBlob
19252 );
19253 if( rc==SQLITE_OK ){
19254 sqlite3_blob_write(pBlob, aCookie, 4, 0);
19255 rc = sqlite3_blob_close(pBlob);
19256 }
19257
19258 return rc;
19259 }
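Reviewer note: sqlite3Fts5IndexSetCookie() serializes iNew into four bytes and overwrites only the start of the structure record. A minimal sketch of that idea follows, assuming sqlite3Fts5Put32() stores the value big-endian (its body is not shown in this part of the diff). The `sk_` helpers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 32-bit value most-significant byte first. */
static void sk_put32(unsigned char *aBuf, uint32_t iVal){
  aBuf[0] = (unsigned char)((iVal>>24) & 0xFF);
  aBuf[1] = (unsigned char)((iVal>>16) & 0xFF);
  aBuf[2] = (unsigned char)((iVal>> 8) & 0xFF);
  aBuf[3] = (unsigned char)((iVal    ) & 0xFF);
}

/* Read back a value written by sk_put32(). */
static uint32_t sk_get32(const unsigned char *aBuf){
  return ((uint32_t)aBuf[0]<<24) | ((uint32_t)aBuf[1]<<16)
       | ((uint32_t)aBuf[2]<<8)  |  (uint32_t)aBuf[3];
}

/* Overwrite the leading 4-byte cookie of an in-memory record, leaving
** the remainder of the record untouched. */
static void sk_set_cookie(unsigned char *aRecord, int nRecord, uint32_t iNew){
  assert( nRecord>=4 );
  sk_put32(aRecord, iNew);
}
```

The patch performs the same 4-byte overwrite through the incremental blob API at offset 0 rather than on an in-memory copy.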
19260
19261 static int sqlite3Fts5IndexLoadConfig(Fts5Index *p){
19262 Fts5Structure *pStruct;
19263 pStruct = fts5StructureRead(p);
19264 fts5StructureRelease(pStruct);
19265 return fts5IndexReturn(p);
19266 }
19267
19268
19269 /*************************************************************************
19270 **************************************************************************
19271 ** Below this point is the implementation of the integrity-check
19272 ** functionality.
19273 */
19274
19275 /*
19276 ** Return a simple checksum value based on the arguments.
19277 */
19278 static u64 fts5IndexEntryCksum(
19279 i64 iRowid,
19280 int iCol,
19281 int iPos,
19282 int iIdx,
19283 const char *pTerm,
19284 int nTerm
19285 ){
19286 int i;
19287 u64 ret = iRowid;
19288 ret += (ret<<3) + iCol;
19289 ret += (ret<<3) + iPos;
19290 if( iIdx>=0 ) ret += (ret<<3) + (FTS5_MAIN_PREFIX + iIdx);
19291 for(i=0; i<nTerm; i++) ret += (ret<<3) + pTerm[i];
19292 return ret;
19293 }
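Reviewer note: the integrity check XORs one fts5IndexEntryCksum() value per index entry, so the total does not depend on the order in which entries are visited. The sketch below mirrors the mixing steps and demonstrates that property. It assumes FTS5_MAIN_PREFIX is the character '0' (the definition is outside this part of the diff); the `sk_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: prefix byte of the main index. */
#define SK_MAIN_PREFIX '0'

/* Same mixing steps as fts5IndexEntryCksum(): each step computes
** ret = ret*9 + value. */
static uint64_t sk_entry_cksum(
  int64_t iRowid, int iCol, int iPos, int iIdx,
  const char *pTerm, int nTerm
){
  int i;
  uint64_t ret = (uint64_t)iRowid;
  assert( nTerm>=0 );
  ret += (ret<<3) + iCol;
  ret += (ret<<3) + iPos;
  if( iIdx>=0 ) ret += (ret<<3) + (SK_MAIN_PREFIX + iIdx);
  for(i=0; i<nTerm; i++) ret += (ret<<3) + pTerm[i];
  return ret;
}

/* XOR together the checksums of n entries for the term "term", visiting
** them front-to-back (bReverse==0) or back-to-front (bReverse!=0). */
static uint64_t sk_xor_all(
  const int64_t *aRowid, const int *aPos, int n, int bReverse
){
  uint64_t ck = 0;
  int i;
  for(i=0; i<n; i++){
    int j = bReverse ? (n-1-i) : i;
    ck ^= sk_entry_cksum(aRowid[j], 0, aPos[j], 0, "term", 4);
  }
  return ck;
}
```

This order independence is what lets the debug-only tests compare ASC and DESC scans of the same term by checksum alone.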
19294
19295 #ifdef SQLITE_DEBUG
19296 /*
19297 ** This function is purely an internal test. It does not contribute to
19298 ** FTS functionality, or even the integrity-check, in any way.
19299 **
19300 ** Instead, it tests that the same set of pgno/rowid combinations are
19301 ** visited regardless of whether the doclist-index identified by parameters
19302 ** iSegid/iLeaf is iterated in forwards or reverse order.
19303 */
19304 static void fts5TestDlidxReverse(
19305 Fts5Index *p,
19306 int iSegid, /* Segment id to load from */
19307 int iLeaf /* Load doclist-index for this leaf */
19308 ){
19309 Fts5DlidxIter *pDlidx = 0;
19310 u64 cksum1 = 13;
19311 u64 cksum2 = 13;
19312
19313 for(pDlidx=fts5DlidxIterInit(p, 0, iSegid, iLeaf);
19314 fts5DlidxIterEof(p, pDlidx)==0;
19315 fts5DlidxIterNext(p, pDlidx)
19316 ){
19317 i64 iRowid = fts5DlidxIterRowid(pDlidx);
19318 int pgno = fts5DlidxIterPgno(pDlidx);
19319 assert( pgno>iLeaf );
19320 cksum1 += iRowid + ((i64)pgno<<32);
19321 }
19322 fts5DlidxIterFree(pDlidx);
19323 pDlidx = 0;
19324
19325 for(pDlidx=fts5DlidxIterInit(p, 1, iSegid, iLeaf);
19326 fts5DlidxIterEof(p, pDlidx)==0;
19327 fts5DlidxIterPrev(p, pDlidx)
19328 ){
19329 i64 iRowid = fts5DlidxIterRowid(pDlidx);
19330 int pgno = fts5DlidxIterPgno(pDlidx);
19331 assert( fts5DlidxIterPgno(pDlidx)>iLeaf );
19332 cksum2 += iRowid + ((i64)pgno<<32);
19333 }
19334 fts5DlidxIterFree(pDlidx);
19335 pDlidx = 0;
19336
19337 if( p->rc==SQLITE_OK && cksum1!=cksum2 ) p->rc = FTS5_CORRUPT;
19338 }
19339
19340 static int fts5QueryCksum(
19341 Fts5Index *p, /* Fts5 index object */
19342 int iIdx,
19343 const char *z, /* Index key to query for */
19344 int n, /* Size of index key in bytes */
19345 int flags, /* Flags for Fts5IndexQuery */
19346 u64 *pCksum /* IN/OUT: Checksum value */
19347 ){
19348 u64 cksum = *pCksum;
19349 Fts5IndexIter *pIdxIter = 0;
19350 int rc = sqlite3Fts5IndexQuery(p, z, n, flags, 0, &pIdxIter);
19351
19352 while( rc==SQLITE_OK && 0==sqlite3Fts5IterEof(pIdxIter) ){
19353 i64 dummy;
19354 const u8 *pPos;
19355 int nPos;
19356 i64 rowid = sqlite3Fts5IterRowid(pIdxIter);
19357 rc = sqlite3Fts5IterPoslist(pIdxIter, 0, &pPos, &nPos, &dummy);
19358 if( rc==SQLITE_OK ){
19359 Fts5PoslistReader sReader;
19360 for(sqlite3Fts5PoslistReaderInit(pPos, nPos, &sReader);
19361 sReader.bEof==0;
19362 sqlite3Fts5PoslistReaderNext(&sReader)
19363 ){
19364 int iCol = FTS5_POS2COLUMN(sReader.iPos);
19365 int iOff = FTS5_POS2OFFSET(sReader.iPos);
19366 cksum ^= fts5IndexEntryCksum(rowid, iCol, iOff, iIdx, z, n);
19367 }
19368 rc = sqlite3Fts5IterNext(pIdxIter);
19369 }
19370 }
19371 sqlite3Fts5IterClose(pIdxIter);
19372
19373 *pCksum = cksum;
19374 return rc;
19375 }
19376
19377
19378 /*
19379 ** This function is also purely an internal test. It does not contribute to
19380 ** FTS functionality, or even the integrity-check, in any way.
19381 */
19382 static void fts5TestTerm(
19383 Fts5Index *p,
19384 Fts5Buffer *pPrev, /* Previous term */
19385 const char *z, int n, /* Possibly new term to test */
19386 u64 expected,
19387 u64 *pCksum
19388 ){
19389 int rc = p->rc;
19390 if( pPrev->n==0 ){
19391 fts5BufferSet(&rc, pPrev, n, (const u8*)z);
19392 }else
19393 if( rc==SQLITE_OK && (pPrev->n!=n || memcmp(pPrev->p, z, n)) ){
19394 u64 cksum3 = *pCksum;
19395 const char *zTerm = (const char*)&pPrev->p[1]; /* term sans prefix-byte */
19396 int nTerm = pPrev->n-1; /* Size of zTerm in bytes */
19397 int iIdx = (pPrev->p[0] - FTS5_MAIN_PREFIX);
19398 int flags = (iIdx==0 ? 0 : FTS5INDEX_QUERY_PREFIX);
19399 u64 ck1 = 0;
19400 u64 ck2 = 0;
19401
19402 /* Check that the results returned for ASC and DESC queries are
19403 ** the same. If not, call this corruption. */
19404 rc = fts5QueryCksum(p, iIdx, zTerm, nTerm, flags, &ck1);
19405 if( rc==SQLITE_OK ){
19406 int f = flags|FTS5INDEX_QUERY_DESC;
19407 rc = fts5QueryCksum(p, iIdx, zTerm, nTerm, f, &ck2);
19408 }
19409 if( rc==SQLITE_OK && ck1!=ck2 ) rc = FTS5_CORRUPT;
19410
19411 /* If this is a prefix query, check that the results returned when
19412 ** the index is disabled are the same, in both ASC and DESC order.
19413 **
19414 ** This check may only be performed if the hash table is empty. This
19415 ** is because the hash table only supports a single scan query at
19416 ** a time, and the multi-iter loop from which this function is called
19417 ** is already performing such a scan. */
19418 if( p->nPendingData==0 ){
19419 if( iIdx>0 && rc==SQLITE_OK ){
19420 int f = flags|FTS5INDEX_QUERY_TEST_NOIDX;
19421 ck2 = 0;
19422 rc = fts5QueryCksum(p, iIdx, zTerm, nTerm, f, &ck2);
19423 if( rc==SQLITE_OK && ck1!=ck2 ) rc = FTS5_CORRUPT;
19424 }
19425 if( iIdx>0 && rc==SQLITE_OK ){
19426 int f = flags|FTS5INDEX_QUERY_TEST_NOIDX|FTS5INDEX_QUERY_DESC;
19427 ck2 = 0;
19428 rc = fts5QueryCksum(p, iIdx, zTerm, nTerm, f, &ck2);
19429 if( rc==SQLITE_OK && ck1!=ck2 ) rc = FTS5_CORRUPT;
19430 }
19431 }
19432
19433 cksum3 ^= ck1;
19434 fts5BufferSet(&rc, pPrev, n, (const u8*)z);
19435
19436 if( rc==SQLITE_OK && cksum3!=expected ){
19437 rc = FTS5_CORRUPT;
19438 }
19439 *pCksum = cksum3;
19440 }
19441 p->rc = rc;
19442 }
19443
19444 #else
19445 # define fts5TestDlidxReverse(x,y,z)
19446 # define fts5TestTerm(u,v,w,x,y,z)
19447 #endif
19448
19449 /*
19450 ** Check that:
19451 **
19452 ** 1) All leaves of pSeg between iFirst and iLast (inclusive) exist and
19453 ** contain zero terms.
19454 ** 2) All leaves of pSeg between iNoRowid and iLast (inclusive) exist and
19455 ** contain zero rowids.
19456 */
19457 static void fts5IndexIntegrityCheckEmpty(
19458 Fts5Index *p,
19459 Fts5StructureSegment *pSeg, /* Segment to check internal consistency */
19460 int iFirst,
19461 int iNoRowid,
19462 int iLast
19463 ){
19464 int i;
19465
19466 /* Check that leaves iFirst..iLast of the segment (a) exist and
19467 ** (b) contain no terms, and that those from iNoRowid on contain no rowids. */
19468 for(i=iFirst; p->rc==SQLITE_OK && i<=iLast; i++){
19469 Fts5Data *pLeaf = fts5DataRead(p, FTS5_SEGMENT_ROWID(pSeg->iSegid, i));
19470 if( pLeaf ){
19471 if( !fts5LeafIsTermless(pLeaf) ) p->rc = FTS5_CORRUPT;
19472 if( i>=iNoRowid && 0!=fts5LeafFirstRowidOff(pLeaf) ) p->rc = FTS5_CORRUPT;
19473 }
19474 fts5DataRelease(pLeaf);
19475 }
19476 }
19477
19478 static void fts5IntegrityCheckPgidx(Fts5Index *p, Fts5Data *pLeaf){
19479 int iTermOff = 0;
19480 int ii;
19481
19482 Fts5Buffer buf1 = {0,0,0};
19483 Fts5Buffer buf2 = {0,0,0};
19484
19485 ii = pLeaf->szLeaf;
19486 while( ii<pLeaf->nn && p->rc==SQLITE_OK ){
19487 int res;
19488 int iOff;
19489 int nIncr;
19490
19491 ii += fts5GetVarint32(&pLeaf->p[ii], nIncr);
19492 iTermOff += nIncr;
19493 iOff = iTermOff;
19494
19495 if( iOff>=pLeaf->szLeaf ){
19496 p->rc = FTS5_CORRUPT;
19497 }else if( iTermOff==nIncr ){
19498 int nByte;
19499 iOff += fts5GetVarint32(&pLeaf->p[iOff], nByte);
19500 if( (iOff+nByte)>pLeaf->szLeaf ){
19501 p->rc = FTS5_CORRUPT;
19502 }else{
19503 fts5BufferSet(&p->rc, &buf1, nByte, &pLeaf->p[iOff]);
19504 }
19505 }else{
19506 int nKeep, nByte;
19507 iOff += fts5GetVarint32(&pLeaf->p[iOff], nKeep);
19508 iOff += fts5GetVarint32(&pLeaf->p[iOff], nByte);
19509 if( nKeep>buf1.n || (iOff+nByte)>pLeaf->szLeaf ){
19510 p->rc = FTS5_CORRUPT;
19511 }else{
19512 buf1.n = nKeep;
19513 fts5BufferAppendBlob(&p->rc, &buf1, nByte, &pLeaf->p[iOff]);
19514 }
19515
19516 if( p->rc==SQLITE_OK ){
19517 res = fts5BufferCompare(&buf1, &buf2);
19518 if( res<=0 ) p->rc = FTS5_CORRUPT;
19519 }
19520 }
19521 fts5BufferSet(&p->rc, &buf2, buf1.n, buf1.p);
19522 }
19523
19524 fts5BufferFree(&buf1);
19525 fts5BufferFree(&buf2);
19526 }
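Reviewer note: fts5IntegrityCheckPgidx() rebuilds each term on a leaf from an (nKeep, suffix) pair and requires every rebuilt term to sort strictly after the previous one. A minimal sketch of that step follows, using a fixed-size buffer instead of Fts5Buffer. All `Sk`/`sk_` names and the 64-byte limit are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define SK_TERM_MAX 64

typedef struct SkTerm {
  char a[SK_TERM_MAX];          /* Term bytes (not nul-terminated) */
  int n;                        /* Number of bytes in use */
} SkTerm;

/* memcmp() over the common length; on a tie the shorter term sorts first. */
static int sk_term_cmp(const SkTerm *p1, const SkTerm *p2){
  int nCmp = p1->n < p2->n ? p1->n : p2->n;
  int res = memcmp(p1->a, p2->a, (size_t)nCmp);
  return res ? res : (p1->n - p2->n);
}

/* Keep the first nKeep bytes of *pTerm and append the suffix. Returns 0 on
** success. Returns -1, leaving *pTerm unchanged, if nKeep exceeds the
** current term, the result would not fit, or the new term does not sort
** strictly after the old one. */
static int sk_next_term(
  SkTerm *pTerm, int nKeep, const char *zSuffix, int nSuffix
){
  SkTerm prev = *pTerm;
  assert( pTerm->n>=0 && pTerm->n<=SK_TERM_MAX );
  if( nKeep<0 || nSuffix<0 ) return -1;
  if( nKeep>pTerm->n || nKeep+nSuffix>SK_TERM_MAX ) return -1;
  memcpy(&pTerm->a[nKeep], zSuffix, (size_t)nSuffix);
  pTerm->n = nKeep + nSuffix;
  if( sk_term_cmp(pTerm, &prev)<=0 ){
    *pTerm = prev;
    return -1;
  }
  return 0;
}
```

In the patch the same two failure modes, an oversized nKeep and a non-increasing term, are both reported as FTS5_CORRUPT.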
19527
19528 static void fts5IndexIntegrityCheckSegment(
19529 Fts5Index *p, /* FTS5 backend object */
19530 Fts5StructureSegment *pSeg /* Segment to check internal consistency */
19531 ){
19532 Fts5Config *pConfig = p->pConfig;
19533 sqlite3_stmt *pStmt = 0;
19534 int rc2;
19535 int iIdxPrevLeaf = pSeg->pgnoFirst-1;
19536 int iDlidxPrevLeaf = pSeg->pgnoLast;
19537
19538 if( pSeg->pgnoFirst==0 ) return;
19539
19540 fts5IndexPrepareStmt(p, &pStmt, sqlite3_mprintf(
19541 "SELECT segid, term, (pgno>>1), (pgno&1) FROM %Q.'%q_idx' WHERE segid=%d",
19542 pConfig->zDb, pConfig->zName, pSeg->iSegid
19543 ));
19544
19545 /* Iterate through the b-tree hierarchy. */
19546 while( p->rc==SQLITE_OK && SQLITE_ROW==sqlite3_step(pStmt) ){
19547 i64 iRow; /* Rowid for this leaf */
19548 Fts5Data *pLeaf; /* Data for this leaf */
19549
19550 int nIdxTerm = sqlite3_column_bytes(pStmt, 1);
19551 const char *zIdxTerm = (const char*)sqlite3_column_text(pStmt, 1);
19552 int iIdxLeaf = sqlite3_column_int(pStmt, 2);
19553 int bIdxDlidx = sqlite3_column_int(pStmt, 3);
19554
19555 /* If the leaf in question has already been trimmed from the segment,
19556 ** ignore this b-tree entry. Otherwise, load it into memory. */
19557 if( iIdxLeaf<pSeg->pgnoFirst ) continue;
19558 iRow = FTS5_SEGMENT_ROWID(pSeg->iSegid, iIdxLeaf);
19559 pLeaf = fts5DataRead(p, iRow);
19560 if( pLeaf==0 ) break;
19561
19562 /* Check that the leaf contains at least one term, and that it is equal
19563 ** to or larger than the split-key in zIdxTerm. Also check that if there
19564 ** is a rowid pointer within the leaf page header, it points to a
19565 ** location before the term. */
19566 if( pLeaf->nn<=pLeaf->szLeaf ){
19567 p->rc = FTS5_CORRUPT;
19568 }else{
19569 int iOff; /* Offset of first term on leaf */
19570 int iRowidOff; /* Offset of first rowid on leaf */
19571 int nTerm; /* Size of term on leaf in bytes */
19572 int res; /* Comparison of term and split-key */
19573
19574 iOff = fts5LeafFirstTermOff(pLeaf);
19575 iRowidOff = fts5LeafFirstRowidOff(pLeaf);
19576 if( iRowidOff>=iOff ){
19577 p->rc = FTS5_CORRUPT;
19578 }else{
19579 iOff += fts5GetVarint32(&pLeaf->p[iOff], nTerm);
19580 res = memcmp(&pLeaf->p[iOff], zIdxTerm, MIN(nTerm, nIdxTerm));
19581 if( res==0 ) res = nTerm - nIdxTerm;
19582 if( res<0 ) p->rc = FTS5_CORRUPT;
19583 }
19584
19585 fts5IntegrityCheckPgidx(p, pLeaf);
19586 }
19587 fts5DataRelease(pLeaf);
19588 if( p->rc ) break;
19589
19590 /* Now check that the iter.nEmpty leaves following the current leaf
19591 ** (a) exist and (b) contain no terms. */
19592 fts5IndexIntegrityCheckEmpty(
19593 p, pSeg, iIdxPrevLeaf+1, iDlidxPrevLeaf+1, iIdxLeaf-1
19594 );
19595 if( p->rc ) break;
19596
19597 /* If there is a doclist-index, check that it looks right. */
19598 if( bIdxDlidx ){
19599 Fts5DlidxIter *pDlidx = 0; /* For iterating through doclist index */
19600 int iPrevLeaf = iIdxLeaf;
19601 int iSegid = pSeg->iSegid;
19602 int iPg = 0;
19603 i64 iKey;
19604
19605 for(pDlidx=fts5DlidxIterInit(p, 0, iSegid, iIdxLeaf);
19606 fts5DlidxIterEof(p, pDlidx)==0;
19607 fts5DlidxIterNext(p, pDlidx)
19608 ){
19609
19610 /* Check any rowid-less pages that occur before the current leaf. */
19611 for(iPg=iPrevLeaf+1; iPg<fts5DlidxIterPgno(pDlidx); iPg++){
19612 iKey = FTS5_SEGMENT_ROWID(iSegid, iPg);
19613 pLeaf = fts5DataRead(p, iKey);
19614 if( pLeaf ){
19615 if( fts5LeafFirstRowidOff(pLeaf)!=0 ) p->rc = FTS5_CORRUPT;
19616 fts5DataRelease(pLeaf);
19617 }
19618 }
19619 iPrevLeaf = fts5DlidxIterPgno(pDlidx);
19620
19621 /* Check that the leaf page indicated by the iterator really does
19622 ** contain the rowid suggested by the same. */
19623 iKey = FTS5_SEGMENT_ROWID(iSegid, iPrevLeaf);
19624 pLeaf = fts5DataRead(p, iKey);
19625 if( pLeaf ){
19626 i64 iRowid;
19627 int iRowidOff = fts5LeafFirstRowidOff(pLeaf);
19628 ASSERT_SZLEAF_OK(pLeaf);
19629 if( iRowidOff>=pLeaf->szLeaf ){
19630 p->rc = FTS5_CORRUPT;
19631 }else{
19632 fts5GetVarint(&pLeaf->p[iRowidOff], (u64*)&iRowid);
19633 if( iRowid!=fts5DlidxIterRowid(pDlidx) ) p->rc = FTS5_CORRUPT;
19634 }
19635 fts5DataRelease(pLeaf);
19636 }
19637 }
19638
19639 iDlidxPrevLeaf = iPg;
19640 fts5DlidxIterFree(pDlidx);
19641 fts5TestDlidxReverse(p, iSegid, iIdxLeaf);
19642 }else{
19643 iDlidxPrevLeaf = pSeg->pgnoLast;
19644 /* TODO: Check there is no doclist index */
19645 }
19646
19647 iIdxPrevLeaf = iIdxLeaf;
19648 }
19649
19650 rc2 = sqlite3_finalize(pStmt);
19651 if( p->rc==SQLITE_OK ) p->rc = rc2;
19652
19653 /* Page iter.iLeaf must now be the rightmost leaf-page in the segment */
19654 #if 0
19655 if( p->rc==SQLITE_OK && iter.iLeaf!=pSeg->pgnoLast ){
19656 p->rc = FTS5_CORRUPT;
19657 }
19658 #endif
19659 }
19660
19661
19662 /*
19663 ** Run internal checks to ensure that the FTS index (a) is internally
19664 ** consistent and (b) contains entries for which the XOR of the checksums
19665 ** as calculated by fts5IndexEntryCksum() is cksum.
19666 **
19667 ** Return SQLITE_CORRUPT if any of the internal checks fail, or if the
19668 ** checksum does not match. Return SQLITE_OK if all checks pass without
19669 ** error, or some other SQLite error code if another error (e.g. OOM)
19670 ** occurs.
19671 */
19672 static int sqlite3Fts5IndexIntegrityCheck(Fts5Index *p, u64 cksum){
19673 u64 cksum2 = 0; /* Checksum based on contents of indexes */
19674 Fts5Buffer poslist = {0,0,0}; /* Buffer used to hold a poslist */
19675 Fts5IndexIter *pIter; /* Used to iterate through entire index */
19676 Fts5Structure *pStruct; /* Index structure */
19677
19678 #ifdef SQLITE_DEBUG
19679 /* Used by extra internal tests that only run if SQLITE_DEBUG is defined */
19680 u64 cksum3 = 0; /* Checksum based on contents of indexes */
19681 Fts5Buffer term = {0,0,0}; /* Buffer used to hold most recent term */
19682 #endif
19683
19684 /* Load the FTS index structure */
19685 pStruct = fts5StructureRead(p);
19686
19687 /* Check that the internal nodes of each segment match the leaves */
19688 if( pStruct ){
19689 int iLvl, iSeg;
19690 for(iLvl=0; iLvl<pStruct->nLevel; iLvl++){
19691 for(iSeg=0; iSeg<pStruct->aLevel[iLvl].nSeg; iSeg++){
19692 Fts5StructureSegment *pSeg = &pStruct->aLevel[iLvl].aSeg[iSeg];
19693 fts5IndexIntegrityCheckSegment(p, pSeg);
19694 }
19695 }
19696 }
19697
19698 /* The cksum argument passed to this function is a checksum calculated
19699 ** based on all expected entries in the FTS index (including prefix index
19700 ** entries). This block checks that a checksum calculated based on the
19701 ** actual contents of FTS index is identical.
19702 **
19703 ** Two versions of the same checksum are calculated. The first (stack
19704 ** variable cksum2) based on entries extracted from the full-text index
19705 ** while doing a linear scan of each individual index in turn.
19706 **
19707 ** As each term is visited by the linear scans, a separate query for the
19708 ** same term is performed. cksum3 is calculated based on the entries
19709 ** extracted by these queries.
19710 */
19711 for(fts5MultiIterNew(p, pStruct, 0, 0, 0, 0, -1, 0, &pIter);
19712 fts5MultiIterEof(p, pIter)==0;
19713 fts5MultiIterNext(p, pIter, 0, 0)
19714 ){
19715 int n; /* Size of term in bytes */
19716 i64 iPos = 0; /* Position read from poslist */
19717 int iOff = 0; /* Offset within poslist */
19718 i64 iRowid = fts5MultiIterRowid(pIter);
19719 char *z = (char*)fts5MultiIterTerm(pIter, &n);
19720
19721 /* If this is a new term, query for it. Update cksum3 with the results. */
19722 fts5TestTerm(p, &term, z, n, cksum2, &cksum3);
19723
19724 poslist.n = 0;
19725 fts5SegiterPoslist(p, &pIter->aSeg[pIter->aFirst[1].iFirst] , 0, &poslist);
19726 while( 0==sqlite3Fts5PoslistNext64(poslist.p, poslist.n, &iOff, &iPos) ){
19727 int iCol = FTS5_POS2COLUMN(iPos);
19728 int iTokOff = FTS5_POS2OFFSET(iPos);
19729 cksum2 ^= fts5IndexEntryCksum(iRowid, iCol, iTokOff, -1, z, n);
19730 }
19731 }
19732 fts5TestTerm(p, &term, 0, 0, cksum2, &cksum3);
19733
19734 fts5MultiIterFree(p, pIter);
19735 if( p->rc==SQLITE_OK && cksum!=cksum2 ) p->rc = FTS5_CORRUPT;
19736
19737 fts5StructureRelease(pStruct);
19738 #ifdef SQLITE_DEBUG
19739 fts5BufferFree(&term);
19740 #endif
19741 fts5BufferFree(&poslist);
19742 return fts5IndexReturn(p);
19743 }
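Reviewer note: the scan loop above splits each 64-bit poslist position into a column and a token offset before checksumming it. The sketch below shows one way such a split can work, assuming the column lives in the upper 32 bits and the offset in the lower 32 bits. The real FTS5_POS2COLUMN and FTS5_POS2OFFSET macros are defined outside this part of the diff, so this layout is an assumption, and the `sk_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a column number and token offset into one 64-bit position. */
static int64_t sk_make_pos(int iCol, int iOff){
  assert( iCol>=0 && iOff>=0 );
  return ((int64_t)iCol << 32) + (int64_t)iOff;
}

/* Recover the column number from a packed position. */
static int sk_pos2column(int64_t iPos){
  return (int)(iPos >> 32);
}

/* Recover the token offset from a packed position. */
static int sk_pos2offset(int64_t iPos){
  return (int)(iPos & 0xFFFFFFFF);
}
```

With such a layout, positions in a lower-numbered column always compare less than positions in a higher-numbered column.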
19744
19745
19746 /*
19747 ** Calculate and return a checksum that is the XOR of the index entry
19748 ** checksum of all entries that would be generated by the token specified
19749 ** by the final 5 arguments.
19750 */
19751 static u64 sqlite3Fts5IndexCksum(
19752 Fts5Config *pConfig, /* Configuration object */
19753 i64 iRowid, /* Document term appears in */
19754 int iCol, /* Column term appears in */
19755 int iPos, /* Position term appears in */
19756 const char *pTerm, int nTerm /* Term at iPos */
19757 ){
19758 u64 ret = 0; /* Return value */
19759 int iIdx; /* For iterating through indexes */
19760
19761 ret = fts5IndexEntryCksum(iRowid, iCol, iPos, 0, pTerm, nTerm);
19762
19763 for(iIdx=0; iIdx<pConfig->nPrefix; iIdx++){
19764 int nByte = fts5IndexCharlenToBytelen(pTerm, nTerm, pConfig->aPrefix[iIdx]);
19765 if( nByte ){
19766 ret ^= fts5IndexEntryCksum(iRowid, iCol, iPos, iIdx+1, pTerm, nByte);
19767 }
19768 }
19769
19770 return ret;
19771 }
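Reviewer note: sqlite3Fts5IndexCksum() adds one checksum per prefix index, using the term truncated to aPrefix[iIdx] characters, and skips the index when the term is too short. The body of fts5IndexCharlenToBytelen() is not shown here, so the sketch below is an assumption about how a character count could be converted to a UTF-8 byte length. The `sk_` name is hypothetical.

```c
#include <assert.h>

/* Return the number of bytes occupied by the first nChar UTF-8 characters
** of p[0..nByte), or 0 if the input holds fewer than nChar characters. */
static int sk_charlen_to_bytelen(const char *p, int nByte, int nChar){
  int n = 0;
  int i;
  assert( nByte>=0 && nChar>=0 );
  for(i=0; i<nChar; i++){
    if( n>=nByte ) return 0;          /* Ran out of input */
    if( (unsigned char)p[n++]>=0xc0 ){
      /* Lead byte of a multi-byte character: skip continuation bytes */
      while( n<nByte && ((unsigned char)p[n] & 0xc0)==0x80 ) n++;
    }
  }
  return n;
}
```

A zero return corresponds to the `if( nByte )` guard in the loop above: no prefix entry exists for a term shorter than the prefix length.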
19772
19773 /*************************************************************************
19774 **************************************************************************
19775 ** Below this point is the implementation of the fts5_decode() scalar
19776 ** function only.
19777 */
19778
19779 /*
19780 ** Decode a segment-data rowid from the %_data table. This function is
19781 ** the opposite of macro FTS5_SEGMENT_ROWID().
19782 */
19783 static void fts5DecodeRowid(
19784 i64 iRowid, /* Rowid from %_data table */
19785 int *piSegid, /* OUT: Segment id */
19786 int *pbDlidx, /* OUT: Dlidx flag */
19787 int *piHeight, /* OUT: Height */
19788 int *piPgno /* OUT: Page number */
19789 ){
19790 *piPgno = (int)(iRowid & (((i64)1 << FTS5_DATA_PAGE_B) - 1));
19791 iRowid >>= FTS5_DATA_PAGE_B;
19792
19793 *piHeight = (int)(iRowid & (((i64)1 << FTS5_DATA_HEIGHT_B) - 1));
19794 iRowid >>= FTS5_DATA_HEIGHT_B;
19795
19796 *pbDlidx = (int)(iRowid & 0x0001);
19797 iRowid >>= FTS5_DATA_DLI_B;
19798
19799 *piSegid = (int)(iRowid & (((i64)1 << FTS5_DATA_ID_B) - 1));
19800 }
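Reviewer note: fts5DecodeRowid() peels four bit fields off a %_data rowid, lowest field first. The round-trip sketch below pairs that decoding with a matching encoder. The field widths (16, 1, 5 and 31 bits) are assumptions standing in for the FTS5_DATA_*_B constants, which are defined outside this part of the diff. The `SK_`/`sk_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field widths, most significant field first. */
#define SK_ID_B     16          /* Segment id */
#define SK_DLI_B     1          /* Doclist-index flag */
#define SK_HEIGHT_B  5          /* Height within the dlidx tree */
#define SK_PAGE_B   31          /* Page number */

/* Pack the four components into a single rowid. */
static int64_t sk_pack_rowid(int segid, int bDlidx, int height, int pgno){
  assert( segid>=0 && segid<(1<<SK_ID_B) );
  assert( bDlidx==0 || bDlidx==1 );
  assert( height>=0 && height<(1<<SK_HEIGHT_B) );
  assert( pgno>=0 );
  return ((int64_t)segid  << (SK_PAGE_B + SK_HEIGHT_B + SK_DLI_B))
       + ((int64_t)bDlidx << (SK_PAGE_B + SK_HEIGHT_B))
       + ((int64_t)height << SK_PAGE_B)
       + (int64_t)pgno;
}

/* Inverse of sk_pack_rowid(): mask off the lowest field, then shift. */
static void sk_unpack_rowid(
  int64_t iRowid, int *piSegid, int *pbDlidx, int *piHeight, int *piPgno
){
  *piPgno = (int)(iRowid & (((int64_t)1 << SK_PAGE_B) - 1));
  iRowid >>= SK_PAGE_B;
  *piHeight = (int)(iRowid & (((int64_t)1 << SK_HEIGHT_B) - 1));
  iRowid >>= SK_HEIGHT_B;
  *pbDlidx = (int)(iRowid & 0x0001);
  iRowid >>= SK_DLI_B;
  *piSegid = (int)(iRowid & (((int64_t)1 << SK_ID_B) - 1));
}
```

Under these widths a segment id of 0 leaves the upper bits clear, which is consistent with fts5DebugRowid() treating iSegid==0 as the averages or structure record.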
19801
19802 static void fts5DebugRowid(int *pRc, Fts5Buffer *pBuf, i64 iKey){
19803 int iSegid, iHeight, iPgno, bDlidx; /* Rowid components */
19804 fts5DecodeRowid(iKey, &iSegid, &bDlidx, &iHeight, &iPgno);
19805
19806 if( iSegid==0 ){
19807 if( iKey==FTS5_AVERAGES_ROWID ){
19808 sqlite3Fts5BufferAppendPrintf(pRc, pBuf, "{averages} ");
19809 }else{
19810 sqlite3Fts5BufferAppendPrintf(pRc, pBuf, "{structure}");
19811 }
19812 }
19813 else{
19814 sqlite3Fts5BufferAppendPrintf(pRc, pBuf, "{%ssegid=%d h=%d pgno=%d}",
19815 bDlidx ? "dlidx " : "", iSegid, iHeight, iPgno
19816 );
19817 }
19818 }
19819
19820 static void fts5DebugStructure(
19821 int *pRc, /* IN/OUT: error code */
19822 Fts5Buffer *pBuf,
19823 Fts5Structure *p
19824 ){
19825 int iLvl, iSeg; /* Iterate through levels, segments */
19826
19827 for(iLvl=0; iLvl<p->nLevel; iLvl++){
19828 Fts5StructureLevel *pLvl = &p->aLevel[iLvl];
19829 sqlite3Fts5BufferAppendPrintf(pRc, pBuf,
19830 " {lvl=%d nMerge=%d nSeg=%d", iLvl, pLvl->nMerge, pLvl->nSeg
19831 );
19832 for(iSeg=0; iSeg<pLvl->nSeg; iSeg++){
19833 Fts5StructureSegment *pSeg = &pLvl->aSeg[iSeg];
19834 sqlite3Fts5BufferAppendPrintf(pRc, pBuf, " {id=%d leaves=%d..%d}",
19835 pSeg->iSegid, pSeg->pgnoFirst, pSeg->pgnoLast
19836 );
19837 }
19838 sqlite3Fts5BufferAppendPrintf(pRc, pBuf, "}");
19839 }
19840 }
19841
19842 /*
19843 ** This is part of the fts5_decode() debugging aid.
19844 **
19845 ** Arguments pBlob/nBlob contain a serialized Fts5Structure object. This
19846 ** function appends a human-readable representation of the same object
19847 ** to the buffer passed as the second argument.
19848 */
19849 static void fts5DecodeStructure(
19850 int *pRc, /* IN/OUT: error code */
19851 Fts5Buffer *pBuf,
19852 const u8 *pBlob, int nBlob
19853 ){
19854 int rc; /* Return code */
19855 Fts5Structure *p = 0; /* Decoded structure object */
19856
19857 rc = fts5StructureDecode(pBlob, nBlob, 0, &p);
19858 if( rc!=SQLITE_OK ){
19859 *pRc = rc;
19860 return;
19861 }
19862
19863 fts5DebugStructure(pRc, pBuf, p);
19864 fts5StructureRelease(p);
19865 }
19866
19867 /*
19868 ** This is part of the fts5_decode() debugging aid.
19869 **
19870 ** Arguments pBlob/nBlob contain an "averages" record. This function
19871 ** appends a human-readable representation of the record to the buffer passed
19872 ** as the second argument.
19873 */
19874 static void fts5DecodeAverages(
19875 int *pRc, /* IN/OUT: error code */
19876 Fts5Buffer *pBuf,
19877 const u8 *pBlob, int nBlob
19878 ){
19879 int i = 0;
19880 const char *zSpace = "";
19881
19882 while( i<nBlob ){
19883 u64 iVal;
19884 i += sqlite3Fts5GetVarint(&pBlob[i], &iVal);
19885 sqlite3Fts5BufferAppendPrintf(pRc, pBuf, "%s%d", zSpace, (int)iVal);
19886 zSpace = " ";
19887 }
19888 }
19889
19890 /*
19891 ** Buffer (a/n) is assumed to contain a list of serialized varints. Read
19892 ** each varint and append its string representation to buffer pBuf. Return
19893 ** after either the input buffer is exhausted or a 0 value is read.
19894 **
19895 ** The return value is the number of bytes read from the input buffer.
19896 */
19897 static int fts5DecodePoslist(int *pRc, Fts5Buffer *pBuf, const u8 *a, int n){
19898 int iOff = 0;
19899 while( iOff<n ){
19900 int iVal;
19901 iOff += fts5GetVarint32(&a[iOff], iVal);
19902 sqlite3Fts5BufferAppendPrintf(pRc, pBuf, " %d", iVal);
19903 }
19904 return iOff;
19905 }
19906
19907 /*
19908 ** The start of buffer (a/n) contains the start of a doclist. The doclist
19909 ** may or may not finish within the buffer. This function appends a text
19910 ** representation of the part of the doclist that is present to buffer
19911 ** pBuf.
19912 **
19913 ** The return value is the number of bytes read from the input buffer.
19914 */
19915 static int fts5DecodeDoclist(int *pRc, Fts5Buffer *pBuf, const u8 *a, int n){
19916 i64 iDocid = 0;
19917 int iOff = 0;
19918
19919 if( n>0 ){
19920 iOff = sqlite3Fts5GetVarint(a, (u64*)&iDocid);
19921 sqlite3Fts5BufferAppendPrintf(pRc, pBuf, " id=%lld", iDocid);
19922 }
19923 while( iOff<n ){
19924 int nPos;
19925 int bDel;
19926 iOff += fts5GetPoslistSize(&a[iOff], &nPos, &bDel);
19927 sqlite3Fts5BufferAppendPrintf(pRc, pBuf, " nPos=%d%s", nPos, bDel?"*":"");
19928 iOff += fts5DecodePoslist(pRc, pBuf, &a[iOff], MIN(n-iOff, nPos));
19929 if( iOff<n ){
19930 i64 iDelta;
19931 iOff += sqlite3Fts5GetVarint(&a[iOff], (u64*)&iDelta);
19932 iDocid += iDelta;
19933 sqlite3Fts5BufferAppendPrintf(pRc, pBuf, " id=%lld", iDocid);
19934 }
19935 }
19936
19937 return iOff;
19938 }
19939
19940 /*
19941 ** The implementation of user-defined scalar function fts5_decode().
19942 */
19943 static void fts5DecodeFunction(
19944 sqlite3_context *pCtx, /* Function call context */
19945 int nArg, /* Number of args (always 2) */
19946 sqlite3_value **apVal /* Function arguments */
19947 ){
19948 i64 iRowid; /* Rowid for record being decoded */
19949 int iSegid,iHeight,iPgno,bDlidx;/* Rowid components */
19950 const u8 *aBlob; int n; /* Record to decode */
19951 u8 *a = 0;
19952 Fts5Buffer s; /* Build up text to return here */
19953 int rc = SQLITE_OK; /* Return code */
19954 int nSpace = 0;
19955
19956 assert( nArg==2 );
19957 memset(&s, 0, sizeof(Fts5Buffer));
19958 iRowid = sqlite3_value_int64(apVal[0]);
19959
19960 /* Make a copy of the second argument (a blob) in aBlob[]. The aBlob[]
19961 ** copy is followed by FTS5_DATA_ZERO_PADDING 0x00 bytes, which prevents
19962 ** buffer overreads even if the record is corrupt. */
19963 n = sqlite3_value_bytes(apVal[1]);
19964 aBlob = sqlite3_value_blob(apVal[1]);
19965 nSpace = n + FTS5_DATA_ZERO_PADDING;
19966 a = (u8*)sqlite3Fts5MallocZero(&rc, nSpace);
19967 if( a==0 ) goto decode_out;
19968 memcpy(a, aBlob, n);
19969
19970
19971 fts5DecodeRowid(iRowid, &iSegid, &bDlidx, &iHeight, &iPgno);
19972
19973 fts5DebugRowid(&rc, &s, iRowid);
19974 if( bDlidx ){
19975 Fts5Data dlidx;
19976 Fts5DlidxLvl lvl;
19977
19978 dlidx.p = a;
19979 dlidx.nn = n;
19980
19981 memset(&lvl, 0, sizeof(Fts5DlidxLvl));
19982 lvl.pData = &dlidx;
19983 lvl.iLeafPgno = iPgno;
19984
19985 for(fts5DlidxLvlNext(&lvl); lvl.bEof==0; fts5DlidxLvlNext(&lvl)){
19986 sqlite3Fts5BufferAppendPrintf(&rc, &s,
19987 " %d(%lld)", lvl.iLeafPgno, lvl.iRowid
19988 );
19989 }
19990 }else if( iSegid==0 ){
19991 if( iRowid==FTS5_AVERAGES_ROWID ){
19992 fts5DecodeAverages(&rc, &s, a, n);
19993 }else{
19994 fts5DecodeStructure(&rc, &s, a, n);
19995 }
19996 }else{
19997 Fts5Buffer term; /* Current term read from page */
19998 int szLeaf; /* Offset of pgidx in a[] */
19999 int iPgidxOff;
20000 int iPgidxPrev = 0; /* Previous value read from pgidx */
20001 int iTermOff = 0;
20002 int iRowidOff = 0;
20003 int iOff;
20004 int nDoclist;
20005
20006 memset(&term, 0, sizeof(Fts5Buffer));
20007
20008 if( n<4 ){
20009 sqlite3Fts5BufferSet(&rc, &s, 7, (const u8*)"corrupt");
20010 goto decode_out;
20011 }else{
20012 iRowidOff = fts5GetU16(&a[0]);
20013 iPgidxOff = szLeaf = fts5GetU16(&a[2]);
20014 if( iPgidxOff<n ){
20015 fts5GetVarint32(&a[iPgidxOff], iTermOff);
20016 }
20017 }
20018
20019 /* Decode the position list tail at the start of the page */
20020 if( iRowidOff!=0 ){
20021 iOff = iRowidOff;
20022 }else if( iTermOff!=0 ){
20023 iOff = iTermOff;
20024 }else{
20025 iOff = szLeaf;
20026 }
20027 fts5DecodePoslist(&rc, &s, &a[4], iOff-4);
20028
20029 /* Decode any more doclist data that appears on the page before the
20030 ** first term. */
20031 nDoclist = (iTermOff ? iTermOff : szLeaf) - iOff;
20032 fts5DecodeDoclist(&rc, &s, &a[iOff], nDoclist);
20033
20034 while( iPgidxOff<n ){
20035 int bFirst = (iPgidxOff==szLeaf); /* True for first term on page */
20036 int nByte; /* Bytes of data */
20037 int iEnd;
20038
20039 iPgidxOff += fts5GetVarint32(&a[iPgidxOff], nByte);
20040 iPgidxPrev += nByte;
20041 iOff = iPgidxPrev;
20042
20043 if( iPgidxOff<n ){
20044 fts5GetVarint32(&a[iPgidxOff], nByte);
20045 iEnd = iPgidxPrev + nByte;
20046 }else{
20047 iEnd = szLeaf;
20048 }
20049
20050 if( bFirst==0 ){
20051 iOff += fts5GetVarint32(&a[iOff], nByte);
20052 term.n = nByte;
20053 }
20054 iOff += fts5GetVarint32(&a[iOff], nByte);
20055 fts5BufferAppendBlob(&rc, &term, nByte, &a[iOff]);
20056 iOff += nByte;
20057
20058 sqlite3Fts5BufferAppendPrintf(
20059 &rc, &s, " term=%.*s", term.n, (const char*)term.p
20060 );
20061 iOff += fts5DecodeDoclist(&rc, &s, &a[iOff], iEnd-iOff);
20062 }
20063
20064 fts5BufferFree(&term);
20065 }
20066
20067 decode_out:
20068 sqlite3_free(a);
20069 if( rc==SQLITE_OK ){
20070 sqlite3_result_text(pCtx, (const char*)s.p, s.n, SQLITE_TRANSIENT);
20071 }else{
20072 sqlite3_result_error_code(pCtx, rc);
20073 }
20074 fts5BufferFree(&s);
20075 }
20076
20077 /*
20078 ** The implementation of user-defined scalar function fts5_rowid().
20079 */
20080 static void fts5RowidFunction(
20081 sqlite3_context *pCtx, /* Function call context */
20082 int nArg, /* Number of args */
20083 sqlite3_value **apVal /* Function arguments */
20084 ){
20085 const char *zArg;
20086 if( nArg==0 ){
20087 sqlite3_result_error(pCtx, "should be: fts5_rowid(subject, ....)", -1);
20088 }else{
20089 zArg = (const char*)sqlite3_value_text(apVal[0]);
20090 if( 0==sqlite3_stricmp(zArg, "segment") ){
20091 i64 iRowid;
20092 int segid, pgno;
20093 if( nArg!=3 ){
20094 sqlite3_result_error(pCtx,
20095 "should be: fts5_rowid('segment', segid, pgno)", -1
20096 );
20097 }else{
20098 segid = sqlite3_value_int(apVal[1]);
20099 pgno = sqlite3_value_int(apVal[2]);
20100 iRowid = FTS5_SEGMENT_ROWID(segid, pgno);
20101 sqlite3_result_int64(pCtx, iRowid);
20102 }
20103 }else{
20104 sqlite3_result_error(pCtx,
20105 "first arg to fts5_rowid() must be 'segment'" , -1
20106 );
20107 }
20108 }
20109 }
20110
20111 /*
20112 ** This is called as part of registering the FTS5 module with database
20113 ** connection db. It registers several user-defined scalar functions useful
20114 ** with FTS5.
20115 **
20116 ** If successful, SQLITE_OK is returned. If an error occurs, some other
20117 ** SQLite error code is returned instead.
20118 */
20119 static int sqlite3Fts5IndexInit(sqlite3 *db){
20120 int rc = sqlite3_create_function(
20121 db, "fts5_decode", 2, SQLITE_UTF8, 0, fts5DecodeFunction, 0, 0
20122 );
20123 if( rc==SQLITE_OK ){
20124 rc = sqlite3_create_function(
20125 db, "fts5_rowid", -1, SQLITE_UTF8, 0, fts5RowidFunction, 0, 0
20126 );
20127 }
20128 return rc;
20129 }
20130
20131
20132 /*
20133 ** 2014 Jun 09
20134 **
20135 ** The author disclaims copyright to this source code. In place of
20136 ** a legal notice, here is a blessing:
20137 **
20138 ** May you do good and not evil.
20139 ** May you find forgiveness for yourself and forgive others.
20140 ** May you share freely, never taking more than you give.
20141 **
20142 ******************************************************************************
20143 **
20144 ** This is an SQLite module implementing full-text search.
20145 */
20146
20147
20148 /* #include "fts5Int.h" */
20149
20150 /*
20151 ** This variable is set to false when running tests for which the on disk
20152 ** structures should not be corrupt. Otherwise, true. If it is false, extra
20153 ** assert() conditions in the fts5 code are activated - conditions that are
20154 ** only true if it is guaranteed that the fts5 database is not corrupt.
20155 */
20156 SQLITE_API int sqlite3_fts5_may_be_corrupt = 1;
20157
20158
20159 typedef struct Fts5Auxdata Fts5Auxdata;
20160 typedef struct Fts5Auxiliary Fts5Auxiliary;
20161 typedef struct Fts5Cursor Fts5Cursor;
20162 typedef struct Fts5Sorter Fts5Sorter;
20163 typedef struct Fts5Table Fts5Table;
20164 typedef struct Fts5TokenizerModule Fts5TokenizerModule;
20165
20166 /*
20167 ** NOTES ON TRANSACTIONS:
20168 **
20169 ** SQLite invokes the following virtual table methods as transactions are
20170 ** opened and closed by the user:
20171 **
20172 ** xBegin(): Start of a new transaction.
20173 ** xSync(): Initial part of two-phase commit.
20174 ** xCommit(): Final part of two-phase commit.
20175 ** xRollback(): Rollback the transaction.
20176 **
20177 ** Anything that is required as part of a commit that may fail is performed
20178 ** in the xSync() callback. Current versions of SQLite ignore any errors
20179 ** returned by xCommit().
20180 **
20181 ** And as sub-transactions are opened/closed:
20182 **
20183 ** xSavepoint(int S): Open savepoint S.
20184 ** xRelease(int S): Commit and close savepoint S.
20185 ** xRollbackTo(int S): Rollback to start of savepoint S.
20186 **
20187 ** During a write-transaction the fts5_index.c module may cache some data
20188 ** in-memory. It is flushed to disk whenever xSync(), xRelease() or
20189 ** xSavepoint() is called, and discarded whenever xRollback() or
20190 ** xRollbackTo() is called.
20191 **
20192 ** Additionally, if SQLITE_DEBUG is defined, an instance of the following
20193 ** structure is used to record the current transaction state. This information
20194 ** is not required, but it is used in the assert() statements executed by
20195 ** function fts5CheckTransactionState() (see below).
20196 */
20197 struct Fts5TransactionState {
20198 int eState; /* 0==closed, 1==open, 2==synced */
20199 int iSavepoint; /* Index of newest open savepoint (-1 -> none) */
20200 };
20201
20202 /*
20203 ** A single object of this type is allocated when the FTS5 module is
20204 ** registered with a database handle. It is used to store pointers to
20205 ** all registered FTS5 extensions - tokenizers and auxiliary functions.
20206 */
20207 struct Fts5Global {
20208 fts5_api api; /* User visible part of object (see fts5.h) */
20209 sqlite3 *db; /* Associated database connection */
20210 i64 iNextId; /* Used to allocate unique cursor ids */
20211 Fts5Auxiliary *pAux; /* First in list of all aux. functions */
20212 Fts5TokenizerModule *pTok; /* First in list of all tokenizer modules */
20213 Fts5TokenizerModule *pDfltTok; /* Default tokenizer module */
20214 Fts5Cursor *pCsr; /* First in list of all open cursors */
20215 };
20216
20217 /*
20218 ** Each auxiliary function registered with the FTS5 module is represented
20219 ** by an object of the following type. All such objects are stored as part
20220 ** of the Fts5Global.pAux list.
20221 */
20222 struct Fts5Auxiliary {
20223 Fts5Global *pGlobal; /* Global context for this function */
20224 char *zFunc; /* Function name (nul-terminated) */
20225 void *pUserData; /* User-data pointer */
20226 fts5_extension_function xFunc; /* Callback function */
20227 void (*xDestroy)(void*); /* Destructor function */
20228 Fts5Auxiliary *pNext; /* Next registered auxiliary function */
20229 };
20230
20231 /*
20232 ** Each tokenizer module registered with the FTS5 module is represented
20233 ** by an object of the following type. All such objects are stored as part
20234 ** of the Fts5Global.pTok list.
20235 */
20236 struct Fts5TokenizerModule {
20237 char *zName; /* Name of tokenizer */
20238 void *pUserData; /* User pointer passed to xCreate() */
20239 fts5_tokenizer x; /* Tokenizer functions */
20240 void (*xDestroy)(void*); /* Destructor function */
20241 Fts5TokenizerModule *pNext; /* Next registered tokenizer module */
20242 };
20243
20244 /*
20245 ** Virtual-table object.
20246 */
20247 struct Fts5Table {
20248 sqlite3_vtab base; /* Base class used by SQLite core */
20249 Fts5Config *pConfig; /* Virtual table configuration */
20250 Fts5Index *pIndex; /* Full-text index */
20251 Fts5Storage *pStorage; /* Document store */
20252 Fts5Global *pGlobal; /* Global (connection wide) data */
20253 Fts5Cursor *pSortCsr; /* Sort data from this cursor */
20254 #ifdef SQLITE_DEBUG
20255 struct Fts5TransactionState ts;
20256 #endif
20257 };
20258
20259 struct Fts5MatchPhrase {
20260 Fts5Buffer *pPoslist; /* Pointer to current poslist */
20261 int nTerm; /* Size of phrase in terms */
20262 };
20263
20264 /*
20265 ** pStmt:
20266 ** SELECT rowid, <fts> FROM <fts> ORDER BY +rank;
20267 **
20268 ** aIdx[]:
20269 ** There is one entry in the aIdx[] array for each phrase in the query,
20270 ** the value of which is the offset within aPoslist[] following the last
20271 ** byte of the position list for the corresponding phrase.
20272 */
20273 struct Fts5Sorter {
20274 sqlite3_stmt *pStmt;
20275 i64 iRowid; /* Current rowid */
20276 const u8 *aPoslist; /* Position lists for current row */
20277 int nIdx; /* Number of entries in aIdx[] */
20278 int aIdx[1]; /* Offsets into aPoslist for current row */
20279 };
20280
20281
20282 /*
20283 ** Virtual-table cursor object.
20284 **
20285 ** iSpecial:
20286 ** If this is a 'special' query (refer to function fts5SpecialMatch()),
20287 ** then this variable contains the result of the query.
20288 **
20289 ** iFirstRowid, iLastRowid:
20290 ** These variables are only used for FTS5_PLAN_MATCH cursors. Assuming the
20291 ** cursor iterates in ascending order of rowids, iFirstRowid is the lower
20292 ** limit of rowids to return, and iLastRowid the upper. In other words, the
20293 ** WHERE clause in the user's query might have been:
20294 **
20295 ** <tbl> MATCH <expr> AND rowid BETWEEN $iFirstRowid AND $iLastRowid
20296 **
20297 ** If the cursor iterates in descending order of rowid, iFirstRowid
20298 ** is the upper limit (i.e. the "first" rowid visited) and iLastRowid
20299 ** the lower.
20300 */
20301 struct Fts5Cursor {
20302 sqlite3_vtab_cursor base; /* Base class used by SQLite core */
20303 Fts5Cursor *pNext; /* Next cursor in Fts5Cursor.pCsr list */
20304 int *aColumnSize; /* Values for xColumnSize() */
20305 i64 iCsrId; /* Cursor id */
20306
20307 /* Zero from this point onwards on cursor reset */
20308 int ePlan; /* FTS5_PLAN_XXX value */
20309 int bDesc; /* True for "ORDER BY rowid DESC" queries */
20310 i64 iFirstRowid; /* Return no rowids earlier than this */
20311 i64 iLastRowid; /* Return no rowids later than this */
20312 sqlite3_stmt *pStmt; /* Statement used to read %_content */
20313 Fts5Expr *pExpr; /* Expression for MATCH queries */
20314 Fts5Sorter *pSorter; /* Sorter for "ORDER BY rank" queries */
20315 int csrflags; /* Mask of cursor flags (see below) */
20316 i64 iSpecial; /* Result of special query */
20317
20318 /* "rank" function. Populated on demand from vtab.xColumn(). */
20319 char *zRank; /* Custom rank function */
20320 char *zRankArgs; /* Custom rank function args */
20321 Fts5Auxiliary *pRank; /* Rank callback (or NULL) */
20322 int nRankArg; /* Number of trailing arguments for rank() */
20323 sqlite3_value **apRankArg; /* Array of trailing arguments */
20324 sqlite3_stmt *pRankArgStmt; /* Origin of objects in apRankArg[] */
20325
20326 /* Auxiliary data storage */
20327 Fts5Auxiliary *pAux; /* Currently executing extension function */
20328 Fts5Auxdata *pAuxdata; /* First in linked list of saved aux-data */
20329
20330 /* Cache used by auxiliary functions xInst() and xInstCount() */
20331 Fts5PoslistReader *aInstIter; /* One for each phrase */
20332 int nInstAlloc; /* Size of aInst[] array (entries / 3) */
20333 int nInstCount; /* Number of phrase instances */
20334 int *aInst; /* 3 integers per phrase instance */
20335 };
20336
20337 /*
20338 ** Bits that make up the "idxNum" parameter passed indirectly by
20339 ** xBestIndex() to xFilter().
20340 */
20341 #define FTS5_BI_MATCH 0x0001 /* <tbl> MATCH ? */
20342 #define FTS5_BI_RANK 0x0002 /* rank MATCH ? */
20343 #define FTS5_BI_ROWID_EQ 0x0004 /* rowid == ? */
20344 #define FTS5_BI_ROWID_LE 0x0008 /* rowid <= ? */
20345 #define FTS5_BI_ROWID_GE 0x0010 /* rowid >= ? */
20346
20347 #define FTS5_BI_ORDER_RANK 0x0020
20348 #define FTS5_BI_ORDER_ROWID 0x0040
20349 #define FTS5_BI_ORDER_DESC 0x0080
20350
20351 /*
20352 ** Values for Fts5Cursor.csrflags
20353 */
20354 #define FTS5CSR_REQUIRE_CONTENT 0x01
20355 #define FTS5CSR_REQUIRE_DOCSIZE 0x02
20356 #define FTS5CSR_REQUIRE_INST 0x04
20357 #define FTS5CSR_EOF 0x08
20358 #define FTS5CSR_FREE_ZRANK 0x10
20359 #define FTS5CSR_REQUIRE_RESEEK 0x20
20360
20361 #define BitFlagAllTest(x,y) (((x) & (y))==(y))
20362 #define BitFlagTest(x,y) (((x) & (y))!=0)
20363
20364
20365 /*
20366 ** Macros to Set(), Clear() and Test() cursor flags.
20367 */
20368 #define CsrFlagSet(pCsr, flag) ((pCsr)->csrflags |= (flag))
20369 #define CsrFlagClear(pCsr, flag) ((pCsr)->csrflags &= ~(flag))
20370 #define CsrFlagTest(pCsr, flag) ((pCsr)->csrflags & (flag))
20371
20372 struct Fts5Auxdata {
20373 Fts5Auxiliary *pAux; /* Extension to which this belongs */
20374 void *pPtr; /* Pointer value */
20375 void(*xDelete)(void*); /* Destructor */
20376 Fts5Auxdata *pNext; /* Next object in linked list */
20377 };
20378
20379 #ifdef SQLITE_DEBUG
20380 #define FTS5_BEGIN 1
20381 #define FTS5_SYNC 2
20382 #define FTS5_COMMIT 3
20383 #define FTS5_ROLLBACK 4
20384 #define FTS5_SAVEPOINT 5
20385 #define FTS5_RELEASE 6
20386 #define FTS5_ROLLBACKTO 7
20387 static void fts5CheckTransactionState(Fts5Table *p, int op, int iSavepoint){
20388 switch( op ){
20389 case FTS5_BEGIN:
20390 assert( p->ts.eState==0 );
20391 p->ts.eState = 1;
20392 p->ts.iSavepoint = -1;
20393 break;
20394
20395 case FTS5_SYNC:
20396 assert( p->ts.eState==1 );
20397 p->ts.eState = 2;
20398 break;
20399
20400 case FTS5_COMMIT:
20401 assert( p->ts.eState==2 );
20402 p->ts.eState = 0;
20403 break;
20404
20405 case FTS5_ROLLBACK:
20406 assert( p->ts.eState==1 || p->ts.eState==2 || p->ts.eState==0 );
20407 p->ts.eState = 0;
20408 break;
20409
20410 case FTS5_SAVEPOINT:
20411 assert( p->ts.eState==1 );
20412 assert( iSavepoint>=0 );
20413 assert( iSavepoint>p->ts.iSavepoint );
20414 p->ts.iSavepoint = iSavepoint;
20415 break;
20416
20417 case FTS5_RELEASE:
20418 assert( p->ts.eState==1 );
20419 assert( iSavepoint>=0 );
20420 assert( iSavepoint<=p->ts.iSavepoint );
20421 p->ts.iSavepoint = iSavepoint-1;
20422 break;
20423
20424 case FTS5_ROLLBACKTO:
20425 assert( p->ts.eState==1 );
20426 assert( iSavepoint>=0 );
20427 assert( iSavepoint<=p->ts.iSavepoint );
20428 p->ts.iSavepoint = iSavepoint;
20429 break;
20430 }
20431 }
20432 #else
20433 # define fts5CheckTransactionState(x,y,z)
20434 #endif
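/*
** Illustrative sketch, not part of the patch: the transaction notes and
** fts5CheckTransactionState() above describe a small state machine. The
** version below mirrors those assert() conditions but reports an illegal
** transition by returning 0 instead of aborting. The names SketchTs,
** sketchTsApply() and the SK_* constants are hypothetical.
*/

```c
#include <assert.h>

/* Operation codes, numbered like FTS5_BEGIN..FTS5_ROLLBACKTO above. */
enum {
  SK_BEGIN = 1, SK_SYNC, SK_COMMIT, SK_ROLLBACK,
  SK_SAVEPOINT, SK_RELEASE, SK_ROLLBACKTO
};

typedef struct SketchTs {
  int eState;       /* 0==closed, 1==open, 2==synced */
  int iSavepoint;   /* Newest open savepoint, -1 for none */
} SketchTs;

/* Apply op to *p. Return 1 if the transition is legal, 0 if it is not
** (in which case *p is left unchanged). */
static int sketchTsApply(SketchTs *p, int op, int iSavepoint){
  switch( op ){
    case SK_BEGIN:
      if( p->eState!=0 ) return 0;
      p->eState = 1;
      p->iSavepoint = -1;
      return 1;
    case SK_SYNC:
      if( p->eState!=1 ) return 0;
      p->eState = 2;
      return 1;
    case SK_COMMIT:
      if( p->eState!=2 ) return 0;
      p->eState = 0;
      return 1;
    case SK_ROLLBACK:
      /* Permitted from any state */
      p->eState = 0;
      return 1;
    case SK_SAVEPOINT:
      if( p->eState!=1 || iSavepoint<0 ) return 0;
      if( iSavepoint<=p->iSavepoint ) return 0;
      p->iSavepoint = iSavepoint;
      return 1;
    case SK_RELEASE:
      if( p->eState!=1 || iSavepoint<0 ) return 0;
      if( iSavepoint>p->iSavepoint ) return 0;
      p->iSavepoint = iSavepoint-1;
      return 1;
    case SK_ROLLBACKTO:
      if( p->eState!=1 || iSavepoint<0 ) return 0;
      if( iSavepoint>p->iSavepoint ) return 0;
      p->iSavepoint = iSavepoint;
      return 1;
  }
  return 0;
}
```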
20435
20436 /*
20437 ** Return true if pTab is a contentless table.
20438 */
20439 static int fts5IsContentless(Fts5Table *pTab){
20440 return pTab->pConfig->eContent==FTS5_CONTENT_NONE;
20441 }
20442
20443 /*
20444 ** Delete a virtual table handle allocated by fts5InitVtab().
20445 */
20446 static void fts5FreeVtab(Fts5Table *pTab){
20447 if( pTab ){
20448 sqlite3Fts5IndexClose(pTab->pIndex);
20449 sqlite3Fts5StorageClose(pTab->pStorage);
20450 sqlite3Fts5ConfigFree(pTab->pConfig);
20451 sqlite3_free(pTab);
20452 }
20453 }
20454
20455 /*
20456 ** The xDisconnect() virtual table method.
20457 */
20458 static int fts5DisconnectMethod(sqlite3_vtab *pVtab){
20459 fts5FreeVtab((Fts5Table*)pVtab);
20460 return SQLITE_OK;
20461 }
20462
20463 /*
20464 ** The xDestroy() virtual table method.
20465 */
20466 static int fts5DestroyMethod(sqlite3_vtab *pVtab){
20467 Fts5Table *pTab = (Fts5Table*)pVtab;
20468 int rc = sqlite3Fts5DropAll(pTab->pConfig);
20469 if( rc==SQLITE_OK ){
20470 fts5FreeVtab((Fts5Table*)pVtab);
20471 }
20472 return rc;
20473 }
20474
20475 /*
20476 ** This function is the implementation of both the xConnect and xCreate
20477 ** methods of the FTS5 virtual table.
20478 **
20479 ** The argv[] array contains the following:
20480 **
20481 ** argv[0] -> module name ("fts5")
20482 ** argv[1] -> database name
20483 ** argv[2] -> table name
20484 ** argv[...] -> "column name" and other module argument fields.
20485 */
20486 static int fts5InitVtab(
20487 int bCreate, /* True for xCreate, false for xConnect */
20488 sqlite3 *db, /* The SQLite database connection */
20489 void *pAux, /* Fts5Global object for this connection */
20490 int argc, /* Number of elements in argv array */
20491 const char * const *argv, /* xCreate/xConnect argument array */
20492 sqlite3_vtab **ppVTab, /* Write the resulting vtab structure here */
20493 char **pzErr /* Write any error message here */
20494 ){
20495 Fts5Global *pGlobal = (Fts5Global*)pAux;
20496 const char **azConfig = (const char**)argv;
20497 int rc = SQLITE_OK; /* Return code */
20498 Fts5Config *pConfig = 0; /* Results of parsing argc/argv */
20499 Fts5Table *pTab = 0; /* New virtual table object */
20500
20501 /* Allocate the new vtab object and parse the configuration */
20502 pTab = (Fts5Table*)sqlite3Fts5MallocZero(&rc, sizeof(Fts5Table));
20503 if( rc==SQLITE_OK ){
20504 rc = sqlite3Fts5ConfigParse(pGlobal, db, argc, azConfig, &pConfig, pzErr);
20505 assert( (rc==SQLITE_OK && *pzErr==0) || pConfig==0 );
20506 }
20507 if( rc==SQLITE_OK ){
20508 pTab->pConfig = pConfig;
20509 pTab->pGlobal = pGlobal;
20510 }
20511
20512 /* Open the index sub-system */
20513 if( rc==SQLITE_OK ){
20514 rc = sqlite3Fts5IndexOpen(pConfig, bCreate, &pTab->pIndex, pzErr);
20515 }
20516
20517 /* Open the storage sub-system */
20518 if( rc==SQLITE_OK ){
20519 rc = sqlite3Fts5StorageOpen(
20520 pConfig, pTab->pIndex, bCreate, &pTab->pStorage, pzErr
20521 );
20522 }
20523
20524 /* Call sqlite3_declare_vtab() */
20525 if( rc==SQLITE_OK ){
20526 rc = sqlite3Fts5ConfigDeclareVtab(pConfig);
20527 }
20528
20529 /* Load the initial configuration */
20530 if( rc==SQLITE_OK ){
20531 assert( pConfig->pzErrmsg==0 );
20532 pConfig->pzErrmsg = pzErr;
20533 rc = sqlite3Fts5IndexLoadConfig(pTab->pIndex);
20534 sqlite3Fts5IndexRollback(pTab->pIndex);
20535 pConfig->pzErrmsg = 0;
20536 }
20537
20538 if( rc!=SQLITE_OK ){
20539 fts5FreeVtab(pTab);
20540 pTab = 0;
20541 }else if( bCreate ){
20542 fts5CheckTransactionState(pTab, FTS5_BEGIN, 0);
20543 }
20544 *ppVTab = (sqlite3_vtab*)pTab;
20545 return rc;
20546 }
20547
20548 /*
20549 ** The xConnect() and xCreate() methods for the virtual table. All the
20550 ** work is done in function fts5InitVtab().
20551 */
20552 static int fts5ConnectMethod(
20553 sqlite3 *db, /* Database connection */
20554 void *pAux, /* Pointer to Fts5Global object */
20555 int argc, /* Number of elements in argv array */
20556 const char * const *argv, /* xCreate/xConnect argument array */
20557 sqlite3_vtab **ppVtab, /* OUT: New sqlite3_vtab object */
20558 char **pzErr /* OUT: sqlite3_malloc'd error message */
20559 ){
20560 return fts5InitVtab(0, db, pAux, argc, argv, ppVtab, pzErr);
20561 }
20562 static int fts5CreateMethod(
20563 sqlite3 *db, /* Database connection */
20564 void *pAux, /* Pointer to Fts5Global object */
20565 int argc, /* Number of elements in argv array */
20566 const char * const *argv, /* xCreate/xConnect argument array */
20567 sqlite3_vtab **ppVtab, /* OUT: New sqlite3_vtab object */
20568 char **pzErr /* OUT: sqlite3_malloc'd error message */
20569 ){
20570 return fts5InitVtab(1, db, pAux, argc, argv, ppVtab, pzErr);
20571 }
20572
20573 /*
20574 ** The different query plans.
20575 */
20576 #define FTS5_PLAN_MATCH 1 /* (<tbl> MATCH ?) */
20577 #define FTS5_PLAN_SOURCE 2 /* A source cursor for SORTED_MATCH */
20578 #define FTS5_PLAN_SPECIAL 3 /* An internal query */
20579 #define FTS5_PLAN_SORTED_MATCH 4 /* (<tbl> MATCH ? ORDER BY rank) */
20580 #define FTS5_PLAN_SCAN 5 /* No usable constraint */
20581 #define FTS5_PLAN_ROWID 6 /* (rowid = ?) */
20582
20583 /*
20584 ** Set the SQLITE_INDEX_SCAN_UNIQUE flag in pIdxInfo->idxFlags, unless this
20585 ** extension is currently being used by a version of SQLite too old to
20586 ** support index-info flags. In that case this function is a no-op.
20587 */
20588 static void fts5SetUniqueFlag(sqlite3_index_info *pIdxInfo){
20589 #if SQLITE_VERSION_NUMBER>=3008012
20590 #ifndef SQLITE_CORE
20591 if( sqlite3_libversion_number()>=3008012 )
20592 #endif
20593 {
20594 pIdxInfo->idxFlags |= SQLITE_INDEX_SCAN_UNIQUE;
20595 }
20596 #endif
20597 }
20598
20599 /*
20600 ** Implementation of the xBestIndex method for FTS5 tables. Within the
20601 ** WHERE constraint, it searches for the following:
20602 **
20603 ** 1. A MATCH constraint against the special column.
20604 ** 2. A MATCH constraint against the "rank" column.
20605 ** 3. An == constraint against the rowid column.
20606 ** 4. A < or <= constraint against the rowid column.
20607 ** 5. A > or >= constraint against the rowid column.
20608 **
20609 ** Within the ORDER BY, either:
20610 **
20611 ** 6. ORDER BY rank [ASC|DESC]
20612 ** 7. ORDER BY rowid [ASC|DESC]
20613 **
20614 ** Costs are assigned as follows:
20615 **
20616 ** a) If an unusable MATCH operator is present in the WHERE clause, the
20617 ** cost is unconditionally set to 1e50 (a really big number).
20618 **
20619 ** b) If a MATCH operator is present, the cost depends on the other
20620 ** constraints also present, as follows:
20621 **
20622 ** * No other constraints: cost=1000.0
20623 ** * One rowid range constraint: cost=750.0
20624 ** * Both rowid range constraints: cost=500.0
20625 ** * An == rowid constraint: cost=100.0
20626 **
20627 ** c) Otherwise, if there is no MATCH:
20628 **
20629 ** * No other constraints: cost=1000000.0
20630 ** * One rowid range constraint: cost=750000.0
20631 ** * Both rowid range constraints: cost=250000.0
20632 ** * An == rowid constraint: cost=10.0
20633 **
20634 ** Costs are not modified by the ORDER BY clause.
20635 */
20636 static int fts5BestIndexMethod(sqlite3_vtab *pVTab, sqlite3_index_info *pInfo){
20637 Fts5Table *pTab = (Fts5Table*)pVTab;
20638 Fts5Config *pConfig = pTab->pConfig;
20639 int idxFlags = 0; /* Parameter passed through to xFilter() */
20640 int bHasMatch;
20641 int iNext;
20642 int i;
20643
20644 struct Constraint {
20645 int op; /* Mask against sqlite3_index_constraint.op */
20646 int fts5op; /* FTS5 mask for idxFlags */
20647 int iCol; /* 0==rowid, 1==tbl, 2==rank */
20648 int omit; /* True to omit this if found */
20649 int iConsIndex; /* Index in pInfo->aConstraint[] */
20650 } aConstraint[] = {
20651 {SQLITE_INDEX_CONSTRAINT_MATCH|SQLITE_INDEX_CONSTRAINT_EQ,
20652 FTS5_BI_MATCH, 1, 1, -1},
20653 {SQLITE_INDEX_CONSTRAINT_MATCH|SQLITE_INDEX_CONSTRAINT_EQ,
20654 FTS5_BI_RANK, 2, 1, -1},
20655 {SQLITE_INDEX_CONSTRAINT_EQ, FTS5_BI_ROWID_EQ, 0, 0, -1},
20656 {SQLITE_INDEX_CONSTRAINT_LT|SQLITE_INDEX_CONSTRAINT_LE,
20657 FTS5_BI_ROWID_LE, 0, 0, -1},
20658 {SQLITE_INDEX_CONSTRAINT_GT|SQLITE_INDEX_CONSTRAINT_GE,
20659 FTS5_BI_ROWID_GE, 0, 0, -1},
20660 };
20661
20662 int aColMap[3];
20663 aColMap[0] = -1;
20664 aColMap[1] = pConfig->nCol;
20665 aColMap[2] = pConfig->nCol+1;
20666
20667 /* Set idxFlags flags for all WHERE clause terms that will be used. */
20668 for(i=0; i<pInfo->nConstraint; i++){
20669 struct sqlite3_index_constraint *p = &pInfo->aConstraint[i];
20670 int j;
20671 for(j=0; j<(int)ArraySize(aConstraint); j++){
20672 struct Constraint *pC = &aConstraint[j];
20673 if( p->iColumn==aColMap[pC->iCol] && p->op & pC->op ){
20674 if( p->usable ){
20675 pC->iConsIndex = i;
20676 idxFlags |= pC->fts5op;
20677 }else if( j==0 ){
20678 /* As there exists an unusable MATCH constraint this is an
20679 ** unusable plan. Set a prohibitively high cost. */
20680 pInfo->estimatedCost = 1e50;
20681 return SQLITE_OK;
20682 }
20683 }
20684 }
20685 }
20686
20687 /* Set idxFlags flags for the ORDER BY clause */
20688 if( pInfo->nOrderBy==1 ){
20689 int iSort = pInfo->aOrderBy[0].iColumn;
20690 if( iSort==(pConfig->nCol+1) && BitFlagTest(idxFlags, FTS5_BI_MATCH) ){
20691 idxFlags |= FTS5_BI_ORDER_RANK;
20692 }else if( iSort==-1 ){
20693 idxFlags |= FTS5_BI_ORDER_ROWID;
20694 }
20695 if( BitFlagTest(idxFlags, FTS5_BI_ORDER_RANK|FTS5_BI_ORDER_ROWID) ){
20696 pInfo->orderByConsumed = 1;
20697 if( pInfo->aOrderBy[0].desc ){
20698 idxFlags |= FTS5_BI_ORDER_DESC;
20699 }
20700 }
20701 }
20702
20703 /* Calculate the estimated cost based on the flags set in idxFlags. */
20704 bHasMatch = BitFlagTest(idxFlags, FTS5_BI_MATCH);
20705 if( BitFlagTest(idxFlags, FTS5_BI_ROWID_EQ) ){
20706 pInfo->estimatedCost = bHasMatch ? 100.0 : 10.0;
20707 if( bHasMatch==0 ) fts5SetUniqueFlag(pInfo);
20708 }else if( BitFlagAllTest(idxFlags, FTS5_BI_ROWID_LE|FTS5_BI_ROWID_GE) ){
20709 pInfo->estimatedCost = bHasMatch ? 500.0 : 250000.0;
20710 }else if( BitFlagTest(idxFlags, FTS5_BI_ROWID_LE|FTS5_BI_ROWID_GE) ){
20711 pInfo->estimatedCost = bHasMatch ? 750.0 : 750000.0;
20712 }else{
20713 pInfo->estimatedCost = bHasMatch ? 1000.0 : 1000000.0;
20714 }
20715
20716 /* Assign argvIndex values to each constraint in use. */
20717 iNext = 1;
20718 for(i=0; i<(int)ArraySize(aConstraint); i++){
20719 struct Constraint *pC = &aConstraint[i];
20720 if( pC->iConsIndex>=0 ){
20721 pInfo->aConstraintUsage[pC->iConsIndex].argvIndex = iNext++;
20722 pInfo->aConstraintUsage[pC->iConsIndex].omit = (unsigned char)pC->omit;
20723 }
20724 }
20725
20726 pInfo->idxNum = idxFlags;
20727 return SQLITE_OK;
20728 }
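/*
** Illustrative sketch, not part of the patch: the cost table documented
** for fts5BestIndexMethod() can be reduced to a pure function of the
** idxFlags mask. The SKETCH_BI_* values are copied from the FTS5_BI_*
** definitions earlier in this file; sketchEstimatedCost() is a
** hypothetical helper. The unusable-MATCH case (cost 1e50) is left out
** because it depends on the constraint array, not on the mask.
*/

```c
#include <assert.h>

#define SKETCH_BI_MATCH     0x0001   /* <tbl> MATCH ? */
#define SKETCH_BI_ROWID_EQ  0x0004   /* rowid == ? */
#define SKETCH_BI_ROWID_LE  0x0008   /* rowid <= ? */
#define SKETCH_BI_ROWID_GE  0x0010   /* rowid >= ? */

/* Map a mask of SKETCH_BI_* flags to the estimated cost. */
static double sketchEstimatedCost(int idxFlags){
  const int range = SKETCH_BI_ROWID_LE|SKETCH_BI_ROWID_GE;
  int bHasMatch = (idxFlags & SKETCH_BI_MATCH)!=0;

  if( idxFlags & SKETCH_BI_ROWID_EQ ){
    /* An == rowid constraint wins over any range constraint */
    return bHasMatch ? 100.0 : 10.0;
  }else if( (idxFlags & range)==range ){
    /* Both bits set: the "all bits" test, as in BitFlagAllTest() */
    return bHasMatch ? 500.0 : 250000.0;
  }else if( (idxFlags & range)!=0 ){
    /* Exactly one bit set: the "any bit" test, as in BitFlagTest() */
    return bHasMatch ? 750.0 : 750000.0;
  }
  return bHasMatch ? 1000.0 : 1000000.0;
}
```

/* The order of the tests matters: the "both range constraints" branch
** has to be checked before the "any range constraint" branch. */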
20729
20730 /*
20731 ** Implementation of xOpen method.
20732 */
20733 static int fts5OpenMethod(sqlite3_vtab *pVTab, sqlite3_vtab_cursor **ppCsr){
20734 Fts5Table *pTab = (Fts5Table*)pVTab;
20735 Fts5Config *pConfig = pTab->pConfig;
20736 Fts5Cursor *pCsr; /* New cursor object */
20737 int nByte; /* Bytes of space to allocate */
20738 int rc = SQLITE_OK; /* Return code */
20739
20740 nByte = sizeof(Fts5Cursor) + pConfig->nCol * sizeof(int);
20741 pCsr = (Fts5Cursor*)sqlite3_malloc(nByte);
20742 if( pCsr ){
20743 Fts5Global *pGlobal = pTab->pGlobal;
20744 memset(pCsr, 0, nByte);
20745 pCsr->aColumnSize = (int*)&pCsr[1];
20746 pCsr->pNext = pGlobal->pCsr;
20747 pGlobal->pCsr = pCsr;
20748 pCsr->iCsrId = ++pGlobal->iNextId;
20749 }else{
20750 rc = SQLITE_NOMEM;
20751 }
20752 *ppCsr = (sqlite3_vtab_cursor*)pCsr;
20753 return rc;
20754 }
20755
20756 static int fts5StmtType(Fts5Cursor *pCsr){
20757 if( pCsr->ePlan==FTS5_PLAN_SCAN ){
20758 return (pCsr->bDesc) ? FTS5_STMT_SCAN_DESC : FTS5_STMT_SCAN_ASC;
20759 }
20760 return FTS5_STMT_LOOKUP;
20761 }
20762
20763 /*
20764 ** This function is called after the cursor passed as the only argument
20765 ** is moved to point at a different row. It clears all cached data
20766 ** specific to the previous row stored by the cursor object.
20767 */
20768 static void fts5CsrNewrow(Fts5Cursor *pCsr){
20769 CsrFlagSet(pCsr,
20770 FTS5CSR_REQUIRE_CONTENT
20771 | FTS5CSR_REQUIRE_DOCSIZE
20772 | FTS5CSR_REQUIRE_INST
20773 );
20774 }
20775
20776 static void fts5FreeCursorComponents(Fts5Cursor *pCsr){
20777 Fts5Table *pTab = (Fts5Table*)(pCsr->base.pVtab);
20778 Fts5Auxdata *pData;
20779 Fts5Auxdata *pNext;
20780
20781 sqlite3_free(pCsr->aInstIter);
20782 sqlite3_free(pCsr->aInst);
20783 if( pCsr->pStmt ){
20784 int eStmt = fts5StmtType(pCsr);
20785 sqlite3Fts5StorageStmtRelease(pTab->pStorage, eStmt, pCsr->pStmt);
20786 }
20787 if( pCsr->pSorter ){
20788 Fts5Sorter *pSorter = pCsr->pSorter;
20789 sqlite3_finalize(pSorter->pStmt);
20790 sqlite3_free(pSorter);
20791 }
20792
20793 if( pCsr->ePlan!=FTS5_PLAN_SOURCE ){
20794 sqlite3Fts5ExprFree(pCsr->pExpr);
20795 }
20796
20797 for(pData=pCsr->pAuxdata; pData; pData=pNext){
20798 pNext = pData->pNext;
20799 if( pData->xDelete ) pData->xDelete(pData->pPtr);
20800 sqlite3_free(pData);
20801 }
20802
20803 sqlite3_finalize(pCsr->pRankArgStmt);
20804 sqlite3_free(pCsr->apRankArg);
20805
20806 if( CsrFlagTest(pCsr, FTS5CSR_FREE_ZRANK) ){
20807 sqlite3_free(pCsr->zRank);
20808 sqlite3_free(pCsr->zRankArgs);
20809 }
20810
20811 memset(&pCsr->ePlan, 0, sizeof(Fts5Cursor) - ((u8*)&pCsr->ePlan - (u8*)pCsr));
20812 }
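/*
** Illustrative sketch, not part of the patch: fts5FreeCursorComponents()
** ends by zeroing every field from ePlan to the end of the structure,
** which is what the "Zero from this point onwards on cursor reset"
** marker in struct Fts5Cursor refers to. The reduced structure
** SketchCursor and the function sketchCursorReset() are hypothetical.
*/

```c
#include <assert.h>
#include <string.h>

typedef struct SketchCursor {
  long long iCsrId;     /* Survives a reset */
  int *aColumnSize;     /* Survives a reset */

  /* Zero from this point onwards on cursor reset */
  int ePlan;
  int bDesc;
  long long iSpecial;
} SketchCursor;

/* Clear the resettable tail of the cursor, leaving the head intact. */
static void sketchCursorReset(SketchCursor *p){
  /* Byte offset of the first field that must be cleared */
  size_t iStart = (size_t)((unsigned char*)&p->ePlan - (unsigned char*)p);
  memset(&p->ePlan, 0, sizeof(SketchCursor) - iStart);
}
```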
20813
20814
20815 /*
20816 ** Close the cursor. For additional information see the documentation
20817 ** on the xClose method of the virtual table interface.
20818 */
20819 static int fts5CloseMethod(sqlite3_vtab_cursor *pCursor){
20820 if( pCursor ){
20821 Fts5Table *pTab = (Fts5Table*)(pCursor->pVtab);
20822 Fts5Cursor *pCsr = (Fts5Cursor*)pCursor;
20823 Fts5Cursor **pp;
20824
20825 fts5FreeCursorComponents(pCsr);
20826 /* Remove the cursor from the Fts5Global.pCsr list */
20827 for(pp=&pTab->pGlobal->pCsr; (*pp)!=pCsr; pp=&(*pp)->pNext);
20828 *pp = pCsr->pNext;
20829
20830 sqlite3_free(pCsr);
20831 }
20832 return SQLITE_OK;
20833 }
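/*
** Illustrative sketch, not part of the patch: fts5CloseMethod() unlinks
** the cursor from the Fts5Global.pCsr list with a pointer-to-pointer
** walk, so the head of the list needs no special case. SketchNode and
** sketchListRemove() are hypothetical names. As in the original loop,
** the node to remove is assumed to be present in the list.
*/

```c
#include <assert.h>
#include <stddef.h>

typedef struct SketchNode SketchNode;
struct SketchNode {
  int id;
  SketchNode *pNext;
};

/* Unlink pDel from the list whose head pointer is *ppHead. */
static void sketchListRemove(SketchNode **ppHead, SketchNode *pDel){
  SketchNode **pp;
  /* pp always addresses the pointer that leads to the current node */
  for(pp=ppHead; (*pp)!=pDel; pp=&(*pp)->pNext);
  *pp = pDel->pNext;
}
```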
20834
20835 static int fts5SorterNext(Fts5Cursor *pCsr){
20836 Fts5Sorter *pSorter = pCsr->pSorter;
20837 int rc;
20838
20839 rc = sqlite3_step(pSorter->pStmt);
20840 if( rc==SQLITE_DONE ){
20841 rc = SQLITE_OK;
20842 CsrFlagSet(pCsr, FTS5CSR_EOF);
20843 }else if( rc==SQLITE_ROW ){
20844 const u8 *a;
20845 const u8 *aBlob;
20846 int nBlob;
20847 int i;
20848 int iOff = 0;
20849 rc = SQLITE_OK;
20850
20851 pSorter->iRowid = sqlite3_column_int64(pSorter->pStmt, 0);
20852 nBlob = sqlite3_column_bytes(pSorter->pStmt, 1);
20853 aBlob = a = sqlite3_column_blob(pSorter->pStmt, 1);
20854
20855 for(i=0; i<(pSorter->nIdx-1); i++){
20856 int iVal;
20857 a += fts5GetVarint32(a, iVal);
20858 iOff += iVal;
20859 pSorter->aIdx[i] = iOff;
20860 }
20861 pSorter->aIdx[i] = &aBlob[nBlob] - a;
20862
20863 pSorter->aPoslist = a;
20864 fts5CsrNewrow(pCsr);
20865 }
20866
20867 return rc;
20868 }
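/*
** Illustrative sketch, not part of the patch: fts5SorterNext() appears to
** treat the blob as (nIdx-1) varint sizes followed by the concatenated
** position lists, and turns the sizes into cumulative end offsets in
** aIdx[]. The final entry is the number of bytes left after the header.
** sketchGetVarint32() is a simplified, hypothetical stand-in for the
** fts5GetVarint32() macro (at most 5 bytes, no corner-case handling),
** and sketchDecodeOffsets() is likewise hypothetical.
*/

```c
#include <assert.h>

/* Decode a big-endian base-128 varint. Return the bytes consumed. */
static int sketchGetVarint32(const unsigned char *a, int *pVal){
  unsigned int v = 0;
  int n = 0;
  do{
    v = (v<<7) | (unsigned int)(a[n] & 0x7f);
  }while( (a[n++] & 0x80) && n<5 );
  *pVal = (int)v;
  return n;
}

/* Fill aIdx[0..nIdx-1] from the blob header. Return a pointer to the
** first byte following the header (the start of the position lists). */
static const unsigned char *sketchDecodeOffsets(
  const unsigned char *aBlob,   /* Blob read from the sorter statement */
  int nBlob,                    /* Size of aBlob[] in bytes */
  int nIdx,                     /* Number of phrases */
  int *aIdx                     /* OUT: end offset of each poslist */
){
  const unsigned char *a = aBlob;
  int iOff = 0;
  int i;
  for(i=0; i<(nIdx-1); i++){
    int iVal;
    a += sketchGetVarint32(a, &iVal);
    iOff += iVal;
    aIdx[i] = iOff;
  }
  /* The last list is not given a size: it runs to the end of the blob */
  aIdx[i] = (int)(&aBlob[nBlob] - a);
  return a;
}
```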
20869
20870
20871 /*
20872 ** Set the FTS5CSR_REQUIRE_RESEEK flag on all FTS5_PLAN_MATCH cursors
20873 ** open on table pTab.
20874 */
20875 static void fts5TripCursors(Fts5Table *pTab){
20876 Fts5Cursor *pCsr;
20877 for(pCsr=pTab->pGlobal->pCsr; pCsr; pCsr=pCsr->pNext){
20878 if( pCsr->ePlan==FTS5_PLAN_MATCH
20879 && pCsr->base.pVtab==(sqlite3_vtab*)pTab
20880 ){
20881 CsrFlagSet(pCsr, FTS5CSR_REQUIRE_RESEEK);
20882 }
20883 }
20884 }
20885
20886 /*
20887 ** If the REQUIRE_RESEEK flag is set on the cursor passed as the first
20888 ** argument, close and reopen all Fts5IndexIter iterators that the cursor
20889 ** is using. Then attempt to move the cursor to a rowid equal to or later
20890 ** (in the cursor's sort order - ASC or DESC) than the current rowid.
20891 **
20892 ** If the new rowid is not equal to the old, set output parameter *pbSkip
20893 ** to 1 before returning. Otherwise, leave it unchanged.
20894 **
20895 ** Return SQLITE_OK if successful or if no reseek was required, or an
20896 ** error code if an error occurred.
20897 */
20898 static int fts5CursorReseek(Fts5Cursor *pCsr, int *pbSkip){
20899 int rc = SQLITE_OK;
20900 assert( *pbSkip==0 );
20901 if( CsrFlagTest(pCsr, FTS5CSR_REQUIRE_RESEEK) ){
20902 Fts5Table *pTab = (Fts5Table*)(pCsr->base.pVtab);
20903 int bDesc = pCsr->bDesc;
20904 i64 iRowid = sqlite3Fts5ExprRowid(pCsr->pExpr);
20905
20906 rc = sqlite3Fts5ExprFirst(pCsr->pExpr, pTab->pIndex, iRowid, bDesc);
20907 if( rc==SQLITE_OK && iRowid!=sqlite3Fts5ExprRowid(pCsr->pExpr) ){
20908 *pbSkip = 1;
20909 }
20910
20911 CsrFlagClear(pCsr, FTS5CSR_REQUIRE_RESEEK);
20912 fts5CsrNewrow(pCsr);
20913 if( sqlite3Fts5ExprEof(pCsr->pExpr) ){
20914 CsrFlagSet(pCsr, FTS5CSR_EOF);
20915 }
20916 }
20917 return rc;
20918 }
20919
20920
20921 /*
20922 ** Advance the cursor to the next row in the table that matches the
20923 ** search criteria.
20924 **
20925 ** Return SQLITE_OK if nothing goes wrong. SQLITE_OK is returned
20926 ** even if we reach end-of-file. The fts5EofMethod() will be called
20927 ** subsequently to determine whether or not an EOF was hit.
20928 */
20929 static int fts5NextMethod(sqlite3_vtab_cursor *pCursor){
20930 Fts5Cursor *pCsr = (Fts5Cursor*)pCursor;
20931 int rc = SQLITE_OK;
20932
20933 assert( (pCsr->ePlan<3)==
20934 (pCsr->ePlan==FTS5_PLAN_MATCH || pCsr->ePlan==FTS5_PLAN_SOURCE)
20935 );
20936
20937 if( pCsr->ePlan<3 ){
20938 int bSkip = 0;
20939 if( (rc = fts5CursorReseek(pCsr, &bSkip)) || bSkip ) return rc;
20940 rc = sqlite3Fts5ExprNext(pCsr->pExpr, pCsr->iLastRowid);
20941 if( sqlite3Fts5ExprEof(pCsr->pExpr) ){
20942 CsrFlagSet(pCsr, FTS5CSR_EOF);
20943 }
20944 fts5CsrNewrow(pCsr);
20945 }else{
20946 switch( pCsr->ePlan ){
20947 case FTS5_PLAN_SPECIAL: {
20948 CsrFlagSet(pCsr, FTS5CSR_EOF);
20949 break;
20950 }
20951
20952 case FTS5_PLAN_SORTED_MATCH: {
20953 rc = fts5SorterNext(pCsr);
20954 break;
20955 }
20956
20957 default:
20958 rc = sqlite3_step(pCsr->pStmt);
20959 if( rc!=SQLITE_ROW ){
20960 CsrFlagSet(pCsr, FTS5CSR_EOF);
20961 rc = sqlite3_reset(pCsr->pStmt);
20962 }else{
20963 rc = SQLITE_OK;
20964 }
20965 break;
20966 }
20967 }
20968
20969 return rc;
20970 }
20971
20972
20973 static sqlite3_stmt *fts5PrepareStatement(
20974 int *pRc,
20975 Fts5Config *pConfig,
20976 const char *zFmt,
20977 ...
20978 ){
20979 sqlite3_stmt *pRet = 0;
20980 va_list ap;
20981 va_start(ap, zFmt);
20982
20983 if( *pRc==SQLITE_OK ){
20984 int rc;
20985 char *zSql = sqlite3_vmprintf(zFmt, ap);
20986 if( zSql==0 ){
20987 rc = SQLITE_NOMEM;
20988 }else{
20989 rc = sqlite3_prepare_v2(pConfig->db, zSql, -1, &pRet, 0);
20990 if( rc!=SQLITE_OK ){
20991 *pConfig->pzErrmsg = sqlite3_mprintf("%s", sqlite3_errmsg(pConfig->db));
20992 }
20993 sqlite3_free(zSql);
20994 }
20995 *pRc = rc;
20996 }
20997
20998 va_end(ap);
20999 return pRet;
21000 }
21001
21002 static int fts5CursorFirstSorted(Fts5Table *pTab, Fts5Cursor *pCsr, int bDesc){
21003 Fts5Config *pConfig = pTab->pConfig;
21004 Fts5Sorter *pSorter;
21005 int nPhrase;
21006 int nByte;
21007 int rc = SQLITE_OK;
21008 const char *zRank = pCsr->zRank;
21009 const char *zRankArgs = pCsr->zRankArgs;
21010
21011 nPhrase = sqlite3Fts5ExprPhraseCount(pCsr->pExpr);
21012 nByte = sizeof(Fts5Sorter) + sizeof(int) * (nPhrase-1);
21013 pSorter = (Fts5Sorter*)sqlite3_malloc(nByte);
21014 if( pSorter==0 ) return SQLITE_NOMEM;
21015 memset(pSorter, 0, nByte);
21016 pSorter->nIdx = nPhrase;
21017
21018 /* TODO: It would be better to have some system for reusing statement
21019 ** handles here, rather than preparing a new one for each query. But that
21020 ** is not possible as SQLite reference counts the virtual table objects.
21021 ** And since the statement required here reads from this very virtual
21022 ** table, saving it creates a circular reference.
21023 **
21024 ** If SQLite had a built-in statement cache, this wouldn't be a problem. */
21025 pSorter->pStmt = fts5PrepareStatement(&rc, pConfig,
21026 "SELECT rowid, rank FROM %Q.%Q ORDER BY %s(%s%s%s) %s",
21027 pConfig->zDb, pConfig->zName, zRank, pConfig->zName,
21028 (zRankArgs ? ", " : ""),
21029 (zRankArgs ? zRankArgs : ""),
21030 bDesc ? "DESC" : "ASC"
21031 );
21032
21033 pCsr->pSorter = pSorter;
21034 if( rc==SQLITE_OK ){
21035 assert( pTab->pSortCsr==0 );
21036 pTab->pSortCsr = pCsr;
21037 rc = fts5SorterNext(pCsr);
21038 pTab->pSortCsr = 0;
21039 }
21040
21041 if( rc!=SQLITE_OK ){
21042 sqlite3_finalize(pSorter->pStmt);
21043 sqlite3_free(pSorter);
21044 pCsr->pSorter = 0;
21045 }
21046
21047 return rc;
21048 }
21049
21050 static int fts5CursorFirst(Fts5Table *pTab, Fts5Cursor *pCsr, int bDesc){
21051 int rc;
21052 Fts5Expr *pExpr = pCsr->pExpr;
21053 rc = sqlite3Fts5ExprFirst(pExpr, pTab->pIndex, pCsr->iFirstRowid, bDesc);
21054 if( sqlite3Fts5ExprEof(pExpr) ){
21055 CsrFlagSet(pCsr, FTS5CSR_EOF);
21056 }
21057 fts5CsrNewrow(pCsr);
21058 return rc;
21059 }
21060
21061 /*
21062 ** Process a "special" query. A special query is identified as one with a
21063 ** MATCH expression that begins with a '*' character. The remainder of
21064 ** the text passed to the MATCH operator is used as the special query
21065 ** parameters.
21066 */
21067 static int fts5SpecialMatch(
21068 Fts5Table *pTab,
21069 Fts5Cursor *pCsr,
21070 const char *zQuery
21071 ){
21072 int rc = SQLITE_OK; /* Return code */
21073 const char *z = zQuery; /* Special query text */
21074 int n; /* Number of bytes in text at z */
21075
21076 while( z[0]==' ' ) z++;
21077 for(n=0; z[n] && z[n]!=' '; n++);
21078
21079 assert( pTab->base.zErrMsg==0 );
21080 pCsr->ePlan = FTS5_PLAN_SPECIAL;
21081
21082 if( 0==sqlite3_strnicmp("reads", z, n) ){
21083 pCsr->iSpecial = sqlite3Fts5IndexReads(pTab->pIndex);
21084 }
21085 else if( 0==sqlite3_strnicmp("id", z, n) ){
21086 pCsr->iSpecial = pCsr->iCsrId;
21087 }
21088 else{
21089 /* An unrecognized directive. Return an error message. */
21090 pTab->base.zErrMsg = sqlite3_mprintf("unknown special query: %.*s", n, z);
21091 rc = SQLITE_ERROR;
21092 }
21093
21094 return rc;
21095 }
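
Note: because the comparison in fts5SpecialMatch() is limited to the n bytes of the directive, any prefix of a known name, including the empty string, appears to be accepted. The sketch below reproduces that parsing logic in isolation so the behaviour can be checked. demoStrnicmp and demoDirective are hypothetical names, and demoStrnicmp is only an assumed approximation of a length-limited, case-insensitive compare, not the SQLite routine itself.

```c
#include <ctype.h>

/* Hypothetical length-limited, case-insensitive comparison. */
static int demoStrnicmp(const char *zLeft, const char *zRight, int n){
  while( n>0 && *zLeft
      && tolower((unsigned char)*zLeft)==tolower((unsigned char)*zRight)
  ){
    zLeft++; zRight++; n--;
  }
  if( n<=0 ) return 0;
  return tolower((unsigned char)*zLeft) - tolower((unsigned char)*zRight);
}

/* Return 1 for "reads", 2 for "id", 0 for an unrecognized directive. */
static int demoDirective(const char *z){
  int n;
  while( z[0]==' ' ) z++;                 /* skip leading spaces */
  for(n=0; z[n] && z[n]!=' '; n++);       /* measure the first word */
  if( 0==demoStrnicmp("reads", z, n) ) return 1;
  if( 0==demoStrnicmp("id", z, n) ) return 2;
  return 0;
}
```

Under these assumptions "re", "READS" and "" all select the first branch, while "idx" is rejected because it is longer than "id".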
21096
21097 /*
21098 ** Search for an auxiliary function named zName that can be used with table
21099 ** pTab. If one is found, return a pointer to the corresponding Fts5Auxiliary
21100 ** structure. Otherwise, if no such function exists, return NULL.
21101 */
21102 static Fts5Auxiliary *fts5FindAuxiliary(Fts5Table *pTab, const char *zName){
21103 Fts5Auxiliary *pAux;
21104
21105 for(pAux=pTab->pGlobal->pAux; pAux; pAux=pAux->pNext){
21106 if( sqlite3_stricmp(zName, pAux->zFunc)==0 ) return pAux;
21107 }
21108
21109 /* No function of the specified name was found. Return 0. */
21110 return 0;
21111 }
21112
21113
21114 static int fts5FindRankFunction(Fts5Cursor *pCsr){
21115 Fts5Table *pTab = (Fts5Table*)(pCsr->base.pVtab);
21116 Fts5Config *pConfig = pTab->pConfig;
21117 int rc = SQLITE_OK;
21118 Fts5Auxiliary *pAux = 0;
21119 const char *zRank = pCsr->zRank;
21120 const char *zRankArgs = pCsr->zRankArgs;
21121
21122 if( zRankArgs ){
21123 char *zSql = sqlite3Fts5Mprintf(&rc, "SELECT %s", zRankArgs);
21124 if( zSql ){
21125 sqlite3_stmt *pStmt = 0;
21126 rc = sqlite3_prepare_v2(pConfig->db, zSql, -1, &pStmt, 0);
21127 sqlite3_free(zSql);
21128 assert( rc==SQLITE_OK || pCsr->pRankArgStmt==0 );
21129 if( rc==SQLITE_OK ){
21130 if( SQLITE_ROW==sqlite3_step(pStmt) ){
21131 int nByte;
21132 pCsr->nRankArg = sqlite3_column_count(pStmt);
21133 nByte = sizeof(sqlite3_value*)*pCsr->nRankArg;
21134 pCsr->apRankArg = (sqlite3_value**)sqlite3Fts5MallocZero(&rc, nByte);
21135 if( rc==SQLITE_OK ){
21136 int i;
21137 for(i=0; i<pCsr->nRankArg; i++){
21138 pCsr->apRankArg[i] = sqlite3_column_value(pStmt, i);
21139 }
21140 }
21141 pCsr->pRankArgStmt = pStmt;
21142 }else{
21143 rc = sqlite3_finalize(pStmt);
21144 assert( rc!=SQLITE_OK );
21145 }
21146 }
21147 }
21148 }
21149
21150 if( rc==SQLITE_OK ){
21151 pAux = fts5FindAuxiliary(pTab, zRank);
21152 if( pAux==0 ){
21153 assert( pTab->base.zErrMsg==0 );
21154 pTab->base.zErrMsg = sqlite3_mprintf("no such function: %s", zRank);
21155 rc = SQLITE_ERROR;
21156 }
21157 }
21158
21159 pCsr->pRank = pAux;
21160 return rc;
21161 }
21162
21163
21164 static int fts5CursorParseRank(
21165 Fts5Config *pConfig,
21166 Fts5Cursor *pCsr,
21167 sqlite3_value *pRank
21168 ){
21169 int rc = SQLITE_OK;
21170 if( pRank ){
21171 const char *z = (const char*)sqlite3_value_text(pRank);
21172 char *zRank = 0;
21173 char *zRankArgs = 0;
21174
21175 if( z==0 ){
21176 if( sqlite3_value_type(pRank)==SQLITE_NULL ) rc = SQLITE_ERROR;
21177 }else{
21178 rc = sqlite3Fts5ConfigParseRank(z, &zRank, &zRankArgs);
21179 }
21180 if( rc==SQLITE_OK ){
21181 pCsr->zRank = zRank;
21182 pCsr->zRankArgs = zRankArgs;
21183 CsrFlagSet(pCsr, FTS5CSR_FREE_ZRANK);
21184 }else if( rc==SQLITE_ERROR ){
21185 pCsr->base.pVtab->zErrMsg = sqlite3_mprintf(
21186 "parse error in rank function: %s", z
21187 );
21188 }
21189 }else{
21190 if( pConfig->zRank ){
21191 pCsr->zRank = (char*)pConfig->zRank;
21192 pCsr->zRankArgs = (char*)pConfig->zRankArgs;
21193 }else{
21194 pCsr->zRank = (char*)FTS5_DEFAULT_RANK;
21195 pCsr->zRankArgs = 0;
21196 }
21197 }
21198 return rc;
21199 }
21200
21201 static i64 fts5GetRowidLimit(sqlite3_value *pVal, i64 iDefault){
21202 if( pVal ){
21203 int eType = sqlite3_value_numeric_type(pVal);
21204 if( eType==SQLITE_INTEGER ){
21205 return sqlite3_value_int64(pVal);
21206 }
21207 }
21208 return iDefault;
21209 }
21210
21211 /*
21212 ** This is the xFilter interface for the virtual table. See
21213 ** the virtual table xFilter method documentation for additional
21214 ** information.
21215 **
21216 ** There are three possible query strategies:
21217 **
21218 ** 1. Full-text search using a MATCH operator.
21219 ** 2. A by-rowid lookup.
21220 ** 3. A full-table scan.
21221 */
21222 static int fts5FilterMethod(
21223 sqlite3_vtab_cursor *pCursor, /* The cursor used for this query */
21224 int idxNum, /* Strategy index */
21225 const char *idxStr, /* Unused */
21226 int nVal, /* Number of elements in apVal */
21227 sqlite3_value **apVal /* Arguments for the indexing scheme */
21228 ){
21229 Fts5Table *pTab = (Fts5Table*)(pCursor->pVtab);
21230 Fts5Config *pConfig = pTab->pConfig;
21231 Fts5Cursor *pCsr = (Fts5Cursor*)pCursor;
21232 int rc = SQLITE_OK; /* Error code */
21233 int iVal = 0; /* Counter for apVal[] */
21234 int bDesc; /* True if ORDER BY [rank|rowid] DESC */
21235 int bOrderByRank; /* True if ORDER BY rank */
21236 sqlite3_value *pMatch = 0; /* <tbl> MATCH ? expression (or NULL) */
21237 sqlite3_value *pRank = 0; /* rank MATCH ? expression (or NULL) */
21238 sqlite3_value *pRowidEq = 0; /* rowid = ? expression (or NULL) */
21239 sqlite3_value *pRowidLe = 0; /* rowid <= ? expression (or NULL) */
21240 sqlite3_value *pRowidGe = 0; /* rowid >= ? expression (or NULL) */
21241 char **pzErrmsg = pConfig->pzErrmsg;
21242
21243 if( pCsr->ePlan ){
21244 fts5FreeCursorComponents(pCsr);
21245 memset(&pCsr->ePlan, 0, sizeof(Fts5Cursor) - ((u8*)&pCsr->ePlan-(u8*)pCsr));
21246 }
21247
21248 assert( pCsr->pStmt==0 );
21249 assert( pCsr->pExpr==0 );
21250 assert( pCsr->csrflags==0 );
21251 assert( pCsr->pRank==0 );
21252 assert( pCsr->zRank==0 );
21253 assert( pCsr->zRankArgs==0 );
21254
21255 assert( pzErrmsg==0 || pzErrmsg==&pTab->base.zErrMsg );
21256 pConfig->pzErrmsg = &pTab->base.zErrMsg;
21257
21258 /* Decode the arguments passed through to this function.
21259 **
21260 ** Note: The following set of if(...) statements must be in the same
21261 ** order as the corresponding entries in the struct at the top of
21262 ** fts5BestIndexMethod(). */
21263 if( BitFlagTest(idxNum, FTS5_BI_MATCH) ) pMatch = apVal[iVal++];
21264 if( BitFlagTest(idxNum, FTS5_BI_RANK) ) pRank = apVal[iVal++];
21265 if( BitFlagTest(idxNum, FTS5_BI_ROWID_EQ) ) pRowidEq = apVal[iVal++];
21266 if( BitFlagTest(idxNum, FTS5_BI_ROWID_LE) ) pRowidLe = apVal[iVal++];
21267 if( BitFlagTest(idxNum, FTS5_BI_ROWID_GE) ) pRowidGe = apVal[iVal++];
21268 assert( iVal==nVal );
21269 bOrderByRank = ((idxNum & FTS5_BI_ORDER_RANK) ? 1 : 0);
21270 pCsr->bDesc = bDesc = ((idxNum & FTS5_BI_ORDER_DESC) ? 1 : 0);
21271
21272 /* Set the cursor upper and lower rowid limits. Only some strategies
21273 ** actually use them. This is ok, as the xBestIndex() method leaves the
21274 ** sqlite3_index_constraint.omit flag clear for range constraints
21275 ** on the rowid field. */
21276 if( pRowidEq ){
21277 pRowidLe = pRowidGe = pRowidEq;
21278 }
21279 if( bDesc ){
21280 pCsr->iFirstRowid = fts5GetRowidLimit(pRowidLe, LARGEST_INT64);
21281 pCsr->iLastRowid = fts5GetRowidLimit(pRowidGe, SMALLEST_INT64);
21282 }else{
21283 pCsr->iLastRowid = fts5GetRowidLimit(pRowidLe, LARGEST_INT64);
21284 pCsr->iFirstRowid = fts5GetRowidLimit(pRowidGe, SMALLEST_INT64);
21285 }
21286
21287 if( pTab->pSortCsr ){
21288 /* If pSortCsr is non-NULL, then this call is being made as part of
21289 ** processing for a "... MATCH <expr> ORDER BY rank" query (ePlan is
21290 ** set to FTS5_PLAN_SORTED_MATCH). pSortCsr is the cursor that will
21291 ** return results to the user for this query. The current cursor
21292 ** (pCursor) is used to execute the query issued by function
21293 ** fts5CursorFirstSorted() above. */
21294 assert( pRowidEq==0 && pRowidLe==0 && pRowidGe==0 && pRank==0 );
21295 assert( nVal==0 && pMatch==0 && bOrderByRank==0 && bDesc==0 );
21296 assert( pCsr->iLastRowid==LARGEST_INT64 );
21297 assert( pCsr->iFirstRowid==SMALLEST_INT64 );
21298 pCsr->ePlan = FTS5_PLAN_SOURCE;
21299 pCsr->pExpr = pTab->pSortCsr->pExpr;
21300 rc = fts5CursorFirst(pTab, pCsr, bDesc);
21301 }else if( pMatch ){
21302 const char *zExpr = (const char*)sqlite3_value_text(apVal[0]);
21303 if( zExpr==0 ) zExpr = "";
21304
21305 rc = fts5CursorParseRank(pConfig, pCsr, pRank);
21306 if( rc==SQLITE_OK ){
21307 if( zExpr[0]=='*' ){
21308 /* The user has issued a query of the form "MATCH '*...'". This
21309 ** indicates that the MATCH expression is not a full text query,
21310 ** but a request for an internal parameter. */
21311 rc = fts5SpecialMatch(pTab, pCsr, &zExpr[1]);
21312 }else{
21313 char **pzErr = &pTab->base.zErrMsg;
21314 rc = sqlite3Fts5ExprNew(pConfig, zExpr, &pCsr->pExpr, pzErr);
21315 if( rc==SQLITE_OK ){
21316 if( bOrderByRank ){
21317 pCsr->ePlan = FTS5_PLAN_SORTED_MATCH;
21318 rc = fts5CursorFirstSorted(pTab, pCsr, bDesc);
21319 }else{
21320 pCsr->ePlan = FTS5_PLAN_MATCH;
21321 rc = fts5CursorFirst(pTab, pCsr, bDesc);
21322 }
21323 }
21324 }
21325 }
21326 }else if( pConfig->zContent==0 ){
21327 *pConfig->pzErrmsg = sqlite3_mprintf(
21328 "%s: table does not support scanning", pConfig->zName
21329 );
21330 rc = SQLITE_ERROR;
21331 }else{
21332 /* This is either a full-table scan (ePlan==FTS5_PLAN_SCAN) or a lookup
21333 ** by rowid (ePlan==FTS5_PLAN_ROWID). */
21334 pCsr->ePlan = (pRowidEq ? FTS5_PLAN_ROWID : FTS5_PLAN_SCAN);
21335 rc = sqlite3Fts5StorageStmt(
21336 pTab->pStorage, fts5StmtType(pCsr), &pCsr->pStmt, &pTab->base.zErrMsg
21337 );
21338 if( rc==SQLITE_OK ){
21339 if( pCsr->ePlan==FTS5_PLAN_ROWID ){
21340 sqlite3_bind_value(pCsr->pStmt, 1, apVal[0]);
21341 }else{
21342 sqlite3_bind_int64(pCsr->pStmt, 1, pCsr->iFirstRowid);
21343 sqlite3_bind_int64(pCsr->pStmt, 2, pCsr->iLastRowid);
21344 }
21345 rc = fts5NextMethod(pCursor);
21346 }
21347 }
21348
21349 pConfig->pzErrmsg = pzErrmsg;
21350 return rc;
21351 }
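
Note: fts5FilterMethod() receives its arguments as a packed array, and idxNum records which constraints are present. The values can only be unpacked correctly if the flags are tested in the same order in which the planner packed them. A small sketch of that decoding step, with hypothetical flag values and plain strings in place of sqlite3_value pointers:

```c
#define DEMO_MATCH    0x01   /* hypothetical flag values */
#define DEMO_RANK     0x02
#define DEMO_ROWID_EQ 0x04

typedef struct DemoArgs {
  const char *zMatch;      /* MATCH argument, or NULL */
  const char *zRank;       /* rank argument, or NULL */
  const char *zRowidEq;    /* rowid = ? argument, or NULL */
} DemoArgs;

/* Unpack azVal[] into named slots; return the number of values consumed. */
static int demoDecode(int idxNum, const char **azVal, DemoArgs *p){
  int iVal = 0;
  p->zMatch = p->zRank = p->zRowidEq = 0;
  if( idxNum & DEMO_MATCH )    p->zMatch = azVal[iVal++];
  if( idxNum & DEMO_RANK )     p->zRank = azVal[iVal++];
  if( idxNum & DEMO_ROWID_EQ ) p->zRowidEq = azVal[iVal++];
  return iVal;
}
```

The return value plays the role of the assert( iVal==nVal ) check in the real method: it should equal the number of values supplied.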
21352
21353 /*
21354 ** This is the xEof method of the virtual table. SQLite calls this
21355 ** routine to find out if it has reached the end of a result set.
21356 */
21357 static int fts5EofMethod(sqlite3_vtab_cursor *pCursor){
21358 Fts5Cursor *pCsr = (Fts5Cursor*)pCursor;
21359 return (CsrFlagTest(pCsr, FTS5CSR_EOF) ? 1 : 0);
21360 }
21361
21362 /*
21363 ** Return the rowid that the cursor currently points to.
21364 */
21365 static i64 fts5CursorRowid(Fts5Cursor *pCsr){
21366 assert( pCsr->ePlan==FTS5_PLAN_MATCH
21367 || pCsr->ePlan==FTS5_PLAN_SORTED_MATCH
21368 || pCsr->ePlan==FTS5_PLAN_SOURCE
21369 );
21370 if( pCsr->pSorter ){
21371 return pCsr->pSorter->iRowid;
21372 }else{
21373 return sqlite3Fts5ExprRowid(pCsr->pExpr);
21374 }
21375 }
21376
21377 /*
21378 ** This is the xRowid method. The SQLite core calls this routine to
21379 ** retrieve the rowid for the current row of the result set. fts5
21380 ** exposes %_content.rowid as the rowid for the virtual table. The
21381 ** rowid should be written to *pRowid.
21382 */
21383 static int fts5RowidMethod(sqlite3_vtab_cursor *pCursor, sqlite_int64 *pRowid){
21384 Fts5Cursor *pCsr = (Fts5Cursor*)pCursor;
21385 int ePlan = pCsr->ePlan;
21386
21387 assert( CsrFlagTest(pCsr, FTS5CSR_EOF)==0 );
21388 switch( ePlan ){
21389 case FTS5_PLAN_SPECIAL:
21390 *pRowid = 0;
21391 break;
21392
21393 case FTS5_PLAN_SOURCE:
21394 case FTS5_PLAN_MATCH:
21395 case FTS5_PLAN_SORTED_MATCH:
21396 *pRowid = fts5CursorRowid(pCsr);
21397 break;
21398
21399 default:
21400 *pRowid = sqlite3_column_int64(pCsr->pStmt, 0);
21401 break;
21402 }
21403
21404 return SQLITE_OK;
21405 }
21406
21407 /*
21408 ** If the cursor requires seeking (FTS5CSR_REQUIRE_CONTENT is set), seek it.
21409 ** Return SQLITE_OK if no error occurs, or an SQLite error code otherwise.
21410 **
21411 ** If argument bErrormsg is true and an error occurs, an error message may
21412 ** be left in sqlite3_vtab.zErrMsg.
21413 */
21414 static int fts5SeekCursor(Fts5Cursor *pCsr, int bErrormsg){
21415 int rc = SQLITE_OK;
21416
21417 /* If the cursor does not yet have a statement handle, obtain one now. */
21418 if( pCsr->pStmt==0 ){
21419 Fts5Table *pTab = (Fts5Table*)(pCsr->base.pVtab);
21420 int eStmt = fts5StmtType(pCsr);
21421 rc = sqlite3Fts5StorageStmt(
21422 pTab->pStorage, eStmt, &pCsr->pStmt, (bErrormsg?&pTab->base.zErrMsg:0)
21423 );
21424 assert( rc!=SQLITE_OK || pTab->base.zErrMsg==0 );
21425 assert( CsrFlagTest(pCsr, FTS5CSR_REQUIRE_CONTENT) );
21426 }
21427
21428 if( rc==SQLITE_OK && CsrFlagTest(pCsr, FTS5CSR_REQUIRE_CONTENT) ){
21429 assert( pCsr->pExpr );
21430 sqlite3_reset(pCsr->pStmt);
21431 sqlite3_bind_int64(pCsr->pStmt, 1, fts5CursorRowid(pCsr));
21432 rc = sqlite3_step(pCsr->pStmt);
21433 if( rc==SQLITE_ROW ){
21434 rc = SQLITE_OK;
21435 CsrFlagClear(pCsr, FTS5CSR_REQUIRE_CONTENT);
21436 }else{
21437 rc = sqlite3_reset(pCsr->pStmt);
21438 if( rc==SQLITE_OK ){
21439 rc = FTS5_CORRUPT;
21440 }
21441 }
21442 }
21443 return rc;
21444 }
21445
21446 static void fts5SetVtabError(Fts5Table *p, const char *zFormat, ...){
21447 va_list ap; /* ... printf arguments */
21448 va_start(ap, zFormat);
21449 assert( p->base.zErrMsg==0 );
21450 p->base.zErrMsg = sqlite3_vmprintf(zFormat, ap);
21451 va_end(ap);
21452 }
21453
21454 /*
21455 ** This function is called to handle an FTS INSERT command. In other words,
21456 ** an INSERT statement of the form:
21457 **
21458 ** INSERT INTO fts(fts) VALUES($pCmd)
21459 ** INSERT INTO fts(fts, rank) VALUES($pCmd, $pVal)
21460 **
21461 ** Argument zCmd is the text assigned to column "fts" by the INSERT
21462 ** statement, and pVal is the value assigned to column "rank" (if any).
21463 ** This function returns SQLITE_OK, or an SQLite error code on failure.
21464 **
21465 ** The commands implemented by this function are documented in the "Special
21466 ** INSERT Directives" section of the documentation. It should be updated if
21467 ** more commands are added to this function.
21468 */
21469 static int fts5SpecialInsert(
21470 Fts5Table *pTab, /* Fts5 table object */
21471 const char *zCmd, /* Text inserted into table-name column */
21472 sqlite3_value *pVal /* Value inserted into rank column */
21473 ){
21474 Fts5Config *pConfig = pTab->pConfig;
21475 int rc = SQLITE_OK;
21476 int bError = 0;
21477
21478 if( 0==sqlite3_stricmp("delete-all", zCmd) ){
21479 if( pConfig->eContent==FTS5_CONTENT_NORMAL ){
21480 fts5SetVtabError(pTab,
21481 "'delete-all' may only be used with a "
21482 "contentless or external content fts5 table"
21483 );
21484 rc = SQLITE_ERROR;
21485 }else{
21486 rc = sqlite3Fts5StorageDeleteAll(pTab->pStorage);
21487 }
21488 }else if( 0==sqlite3_stricmp("rebuild", zCmd) ){
21489 if( pConfig->eContent==FTS5_CONTENT_NONE ){
21490 fts5SetVtabError(pTab,
21491 "'rebuild' may not be used with a contentless fts5 table"
21492 );
21493 rc = SQLITE_ERROR;
21494 }else{
21495 rc = sqlite3Fts5StorageRebuild(pTab->pStorage);
21496 }
21497 }else if( 0==sqlite3_stricmp("optimize", zCmd) ){
21498 rc = sqlite3Fts5StorageOptimize(pTab->pStorage);
21499 }else if( 0==sqlite3_stricmp("merge", zCmd) ){
21500 int nMerge = sqlite3_value_int(pVal);
21501 rc = sqlite3Fts5StorageMerge(pTab->pStorage, nMerge);
21502 }else if( 0==sqlite3_stricmp("integrity-check", zCmd) ){
21503 rc = sqlite3Fts5StorageIntegrity(pTab->pStorage);
21504 #ifdef SQLITE_DEBUG
21505 }else if( 0==sqlite3_stricmp("prefix-index", zCmd) ){
21506 pConfig->bPrefixIndex = sqlite3_value_int(pVal);
21507 #endif
21508 }else{
21509 rc = sqlite3Fts5IndexLoadConfig(pTab->pIndex);
21510 if( rc==SQLITE_OK ){
21511 rc = sqlite3Fts5ConfigSetValue(pTab->pConfig, zCmd, pVal, &bError);
21512 }
21513 if( rc==SQLITE_OK ){
21514 if( bError ){
21515 rc = SQLITE_ERROR;
21516 }else{
21517 rc = sqlite3Fts5StorageConfigValue(pTab->pStorage, zCmd, pVal, 0);
21518 }
21519 }
21520 }
21521 return rc;
21522 }
21523
21524 static int fts5SpecialDelete(
21525 Fts5Table *pTab,
21526 sqlite3_value **apVal,
21527 sqlite3_int64 *piRowid
21528 ){
21529 int rc = SQLITE_OK;
21530 int eType1 = sqlite3_value_type(apVal[1]);
21531 if( eType1==SQLITE_INTEGER ){
21532 sqlite3_int64 iDel = sqlite3_value_int64(apVal[1]);
21533 rc = sqlite3Fts5StorageSpecialDelete(pTab->pStorage, iDel, &apVal[2]);
21534 }
21535 return rc;
21536 }
21537
21538 static void fts5StorageInsert(
21539 int *pRc,
21540 Fts5Table *pTab,
21541 sqlite3_value **apVal,
21542 i64 *piRowid
21543 ){
21544 int rc = *pRc;
21545 if( rc==SQLITE_OK ){
21546 rc = sqlite3Fts5StorageContentInsert(pTab->pStorage, apVal, piRowid);
21547 }
21548 if( rc==SQLITE_OK ){
21549 rc = sqlite3Fts5StorageIndexInsert(pTab->pStorage, apVal, *piRowid);
21550 }
21551 *pRc = rc;
21552 }
21553
21554 /*
21555 ** This function is the implementation of the xUpdate callback used by
21556 ** FTS5 virtual tables. It is invoked by SQLite each time a row is to be
21557 ** inserted, updated or deleted.
21558 **
21559 ** A delete specifies a single argument - the rowid of the row to remove.
21560 **
21561 ** Update and insert operations pass:
21562 **
21563 ** 1. The "old" rowid, or NULL.
21564 ** 2. The "new" rowid.
21565 ** 3. Values for each of the nCol matchable columns.
21566 ** 4. Values for the two hidden columns (<tablename> and "rank").
21567 */
21568 static int fts5UpdateMethod(
21569 sqlite3_vtab *pVtab, /* Virtual table handle */
21570 int nArg, /* Size of argument array */
21571 sqlite3_value **apVal, /* Array of arguments */
21572 sqlite_int64 *pRowid /* OUT: The affected (or effected) rowid */
21573 ){
21574 Fts5Table *pTab = (Fts5Table*)pVtab;
21575 Fts5Config *pConfig = pTab->pConfig;
21576 int eType0; /* value_type() of apVal[0] */
21577 int rc = SQLITE_OK; /* Return code */
21578
21579 /* A transaction must be open when this is called. */
21580 assert( pTab->ts.eState==1 );
21581
21582 assert( pVtab->zErrMsg==0 );
21583 assert( nArg==1 || nArg==(2+pConfig->nCol+2) );
21584 assert( nArg==1
21585 || sqlite3_value_type(apVal[1])==SQLITE_INTEGER
21586 || sqlite3_value_type(apVal[1])==SQLITE_NULL
21587 );
21588 assert( pTab->pConfig->pzErrmsg==0 );
21589 pTab->pConfig->pzErrmsg = &pTab->base.zErrMsg;
21590
21591 /* Put any active cursors into REQUIRE_SEEK state. */
21592 fts5TripCursors(pTab);
21593
21594 eType0 = sqlite3_value_type(apVal[0]);
21595 if( eType0==SQLITE_NULL
21596 && sqlite3_value_type(apVal[2+pConfig->nCol])!=SQLITE_NULL
21597 ){
21598 /* A "special" INSERT op. These are handled separately. */
21599 const char *z = (const char*)sqlite3_value_text(apVal[2+pConfig->nCol]);
21600 if( pConfig->eContent!=FTS5_CONTENT_NORMAL
21601 && 0==sqlite3_stricmp("delete", z)
21602 ){
21603 rc = fts5SpecialDelete(pTab, apVal, pRowid);
21604 }else{
21605 rc = fts5SpecialInsert(pTab, z, apVal[2 + pConfig->nCol + 1]);
21606 }
21607 }else{
21608 /* A regular INSERT, UPDATE or DELETE statement. The trick here is that
21609 ** any conflict on the rowid value must be detected before any
21610 ** modifications are made to the database file. There are 4 cases:
21611 **
21612 ** 1) DELETE
21613 ** 2) UPDATE (rowid not modified)
21614 ** 3) UPDATE (rowid modified)
21615 ** 4) INSERT
21616 **
21617 ** Cases 3 and 4 may violate the rowid constraint.
21618 */
21619 int eConflict = SQLITE_ABORT;
21620 if( pConfig->eContent==FTS5_CONTENT_NORMAL ){
21621 eConflict = sqlite3_vtab_on_conflict(pConfig->db);
21622 }
21623
21624 assert( eType0==SQLITE_INTEGER || eType0==SQLITE_NULL );
21625 assert( nArg!=1 || eType0==SQLITE_INTEGER );
21626
21627 /* Filter out attempts to run UPDATE or DELETE on contentless tables.
21628 ** This is not supported. */
21629 if( eType0==SQLITE_INTEGER && fts5IsContentless(pTab) ){
21630 pTab->base.zErrMsg = sqlite3_mprintf(
21631 "cannot %s contentless fts5 table: %s",
21632 (nArg>1 ? "UPDATE" : "DELETE from"), pConfig->zName
21633 );
21634 rc = SQLITE_ERROR;
21635 }
21636
21637 /* Case 1: DELETE */
21638 else if( nArg==1 ){
21639 i64 iDel = sqlite3_value_int64(apVal[0]); /* Rowid to delete */
21640 rc = sqlite3Fts5StorageDelete(pTab->pStorage, iDel);
21641 }
21642
21643 /* Case 4: INSERT */
21644 else if( eType0!=SQLITE_INTEGER ){
21645 /* If this is a REPLACE, first remove the current entry (if any) */
21646 if( eConflict==SQLITE_REPLACE
21647 && sqlite3_value_type(apVal[1])==SQLITE_INTEGER
21648 ){
21649 i64 iNew = sqlite3_value_int64(apVal[1]); /* Rowid to delete */
21650 rc = sqlite3Fts5StorageDelete(pTab->pStorage, iNew);
21651 }
21652 fts5StorageInsert(&rc, pTab, apVal, pRowid);
21653 }
21654
21655 /* Cases 2 and 3: UPDATE */
21656 else{
21657 i64 iOld = sqlite3_value_int64(apVal[0]); /* Old rowid */
21658 i64 iNew = sqlite3_value_int64(apVal[1]); /* New rowid */
21659 if( iOld!=iNew ){
21660 if( eConflict==SQLITE_REPLACE ){
21661 rc = sqlite3Fts5StorageDelete(pTab->pStorage, iOld);
21662 if( rc==SQLITE_OK ){
21663 rc = sqlite3Fts5StorageDelete(pTab->pStorage, iNew);
21664 }
21665 fts5StorageInsert(&rc, pTab, apVal, pRowid);
21666 }else{
21667 rc = sqlite3Fts5StorageContentInsert(pTab->pStorage, apVal, pRowid);
21668 if( rc==SQLITE_OK ){
21669 rc = sqlite3Fts5StorageDelete(pTab->pStorage, iOld);
21670 }
21671 if( rc==SQLITE_OK ){
21672 rc = sqlite3Fts5StorageIndexInsert(pTab->pStorage, apVal, *pRowid);
21673 }
21674 }
21675 }else{
21676 rc = sqlite3Fts5StorageDelete(pTab->pStorage, iOld);
21677 fts5StorageInsert(&rc, pTab, apVal, pRowid);
21678 }
21679 }
21680 }
21681
21682 pTab->pConfig->pzErrmsg = 0;
21683 return rc;
21684 }
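
Note: the branch structure of fts5UpdateMethod() follows from how the arguments encode the operation: a single argument means DELETE, a NULL first argument means INSERT, and anything else is an UPDATE. A minimal sketch of that classification, using a hypothetical DemoOp enum and a boolean in place of the sqlite3_value_type() test:

```c
typedef enum { DEMO_DELETE, DEMO_INSERT, DEMO_UPDATE } DemoOp;

/*
** nArg is the argument count. bOldRowidIsNull is true if the first
** argument (the "old" rowid) is NULL.
*/
static DemoOp demoClassify(int nArg, int bOldRowidIsNull){
  if( nArg==1 ) return DEMO_DELETE;
  if( bOldRowidIsNull ) return DEMO_INSERT;
  return DEMO_UPDATE;
}
```

The sketch leaves out the "special" INSERT commands, which the real method detects first by checking the hidden table-name column.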
21685
21686 /*
21687 ** Implementation of xSync() method.
21688 */
21689 static int fts5SyncMethod(sqlite3_vtab *pVtab){
21690 int rc;
21691 Fts5Table *pTab = (Fts5Table*)pVtab;
21692 fts5CheckTransactionState(pTab, FTS5_SYNC, 0);
21693 pTab->pConfig->pzErrmsg = &pTab->base.zErrMsg;
21694 fts5TripCursors(pTab);
21695 rc = sqlite3Fts5StorageSync(pTab->pStorage, 1);
21696 pTab->pConfig->pzErrmsg = 0;
21697 return rc;
21698 }
21699
21700 /*
21701 ** Implementation of xBegin() method.
21702 */
21703 static int fts5BeginMethod(sqlite3_vtab *pVtab){
21704 fts5CheckTransactionState((Fts5Table*)pVtab, FTS5_BEGIN, 0);
21705 return SQLITE_OK;
21706 }
21707
21708 /*
21709 ** Implementation of xCommit() method. This is a no-op. The contents of
21710 ** the pending-terms hash-table have already been flushed into the database
21711 ** by fts5SyncMethod().
21712 */
21713 static int fts5CommitMethod(sqlite3_vtab *pVtab){
21714 fts5CheckTransactionState((Fts5Table*)pVtab, FTS5_COMMIT, 0);
21715 return SQLITE_OK;
21716 }
21717
21718 /*
21719 ** Implementation of xRollback(). Discard the contents of the pending-terms
21720 ** hash-table. Any changes made to the database are reverted by SQLite.
21721 */
21722 static int fts5RollbackMethod(sqlite3_vtab *pVtab){
21723 int rc;
21724 Fts5Table *pTab = (Fts5Table*)pVtab;
21725 fts5CheckTransactionState(pTab, FTS5_ROLLBACK, 0);
21726 rc = sqlite3Fts5StorageRollback(pTab->pStorage);
21727 return rc;
21728 }
21729
21730 static void *fts5ApiUserData(Fts5Context *pCtx){
21731 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx;
21732 return pCsr->pAux->pUserData;
21733 }
21734
21735 static int fts5ApiColumnCount(Fts5Context *pCtx){
21736 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx;
21737 return ((Fts5Table*)(pCsr->base.pVtab))->pConfig->nCol;
21738 }
21739
21740 static int fts5ApiColumnTotalSize(
21741 Fts5Context *pCtx,
21742 int iCol,
21743 sqlite3_int64 *pnToken
21744 ){
21745 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx;
21746 Fts5Table *pTab = (Fts5Table*)(pCsr->base.pVtab);
21747 return sqlite3Fts5StorageSize(pTab->pStorage, iCol, pnToken);
21748 }
21749
21750 static int fts5ApiRowCount(Fts5Context *pCtx, i64 *pnRow){
21751 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx;
21752 Fts5Table *pTab = (Fts5Table*)(pCsr->base.pVtab);
21753 return sqlite3Fts5StorageRowCount(pTab->pStorage, pnRow);
21754 }
21755
21756 static int fts5ApiTokenize(
21757 Fts5Context *pCtx,
21758 const char *pText, int nText,
21759 void *pUserData,
21760 int (*xToken)(void*, int, const char*, int, int, int)
21761 ){
21762 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx;
21763 Fts5Table *pTab = (Fts5Table*)(pCsr->base.pVtab);
21764 return sqlite3Fts5Tokenize(
21765 pTab->pConfig, FTS5_TOKENIZE_AUX, pText, nText, pUserData, xToken
21766 );
21767 }
21768
21769 static int fts5ApiPhraseCount(Fts5Context *pCtx){
21770 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx;
21771 return sqlite3Fts5ExprPhraseCount(pCsr->pExpr);
21772 }
21773
21774 static int fts5ApiPhraseSize(Fts5Context *pCtx, int iPhrase){
21775 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx;
21776 return sqlite3Fts5ExprPhraseSize(pCsr->pExpr, iPhrase);
21777 }
21778
21779 static int fts5CsrPoslist(Fts5Cursor *pCsr, int iPhrase, const u8 **pa){
21780 int n;
21781 if( pCsr->pSorter ){
21782 Fts5Sorter *pSorter = pCsr->pSorter;
21783 int i1 = (iPhrase==0 ? 0 : pSorter->aIdx[iPhrase-1]);
21784 n = pSorter->aIdx[iPhrase] - i1;
21785 *pa = &pSorter->aPoslist[i1];
21786 }else{
21787 n = sqlite3Fts5ExprPoslist(pCsr->pExpr, iPhrase, pa);
21788 }
21789 return n;
21790 }
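
Note: in the sorter branch of fts5CsrPoslist(), aIdx[] holds cumulative end offsets, so the position list for phrase i runs from the end of phrase i-1 (or 0 for the first phrase) to aIdx[i]. A standalone sketch of that slice computation, with the hypothetical name demoSlice:

```c
/* Return the size of slice iPhrase and set *piStart to its first offset. */
static int demoSlice(const int *aIdx, int iPhrase, int *piStart){
  int i1 = (iPhrase==0 ? 0 : aIdx[iPhrase-1]);
  *piStart = i1;
  return aIdx[iPhrase] - i1;
}
```

With end offsets {3, 3, 10}, the second phrase has an empty list and the third starts at offset 3 and is 7 bytes long.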
21791
21792 /*
21793 ** Ensure that the Fts5Cursor.nInstCount and aInst[] variables are populated
21794 ** correctly for the current view. Return SQLITE_OK if successful, or an
21795 ** SQLite error code otherwise.
21796 */
21797 static int fts5CacheInstArray(Fts5Cursor *pCsr){
21798 int rc = SQLITE_OK;
21799 Fts5PoslistReader *aIter; /* One iterator for each phrase */
21800 int nIter; /* Number of iterators/phrases */
21801
21802 nIter = sqlite3Fts5ExprPhraseCount(pCsr->pExpr);
21803 if( pCsr->aInstIter==0 ){
21804 int nByte = sizeof(Fts5PoslistReader) * nIter;
21805 pCsr->aInstIter = (Fts5PoslistReader*)sqlite3Fts5MallocZero(&rc, nByte);
21806 }
21807 aIter = pCsr->aInstIter;
21808
21809 if( aIter ){
21810 int nInst = 0; /* Number of instances seen so far */
21811 int i;
21812
21813 /* Initialize all iterators */
21814 for(i=0; i<nIter; i++){
21815 const u8 *a;
21816 int n = fts5CsrPoslist(pCsr, i, &a);
21817 sqlite3Fts5PoslistReaderInit(a, n, &aIter[i]);
21818 }
21819
21820 while( 1 ){
21821 int *aInst;
21822 int iBest = -1;
21823 for(i=0; i<nIter; i++){
21824 if( (aIter[i].bEof==0)
21825 && (iBest<0 || aIter[i].iPos<aIter[iBest].iPos)
21826 ){
21827 iBest = i;
21828 }
21829 }
21830 if( iBest<0 ) break;
21831
21832 nInst++;
21833 if( nInst>=pCsr->nInstAlloc ){
21834 pCsr->nInstAlloc = pCsr->nInstAlloc ? pCsr->nInstAlloc*2 : 32;
21835 aInst = (int*)sqlite3_realloc(
21836 pCsr->aInst, pCsr->nInstAlloc*sizeof(int)*3
21837 );
21838 if( aInst ){
21839 pCsr->aInst = aInst;
21840 }else{
21841 rc = SQLITE_NOMEM;
21842 break;
21843 }
21844 }
21845
21846 aInst = &pCsr->aInst[3 * (nInst-1)];
21847 aInst[0] = iBest;
21848 aInst[1] = FTS5_POS2COLUMN(aIter[iBest].iPos);
21849 aInst[2] = FTS5_POS2OFFSET(aIter[iBest].iPos);
21850 sqlite3Fts5PoslistReaderNext(&aIter[iBest]);
21851 }
21852
21853 pCsr->nInstCount = nInst;
21854 CsrFlagClear(pCsr, FTS5CSR_REQUIRE_INST);
21855 }
21856 return rc;
21857 }
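
Note: the while(1) loop in fts5CacheInstArray() is a k-way merge: on each pass it picks the iterator with the smallest current position, records it, and advances only that iterator. The sketch below shows the same loop over plain sorted int arrays. DemoIter and demoMerge are hypothetical names, and the output buffer is supplied by the caller rather than grown with realloc.

```c
typedef struct DemoIter {
  const int *a;   /* sorted positions */
  int n;          /* number of entries in a[] */
  int i;          /* index of the current entry */
} DemoIter;

/*
** Merge nIter sorted lists. Write (iterator index, position) pairs to
** aOut, which must have room for two ints per input entry. Return the
** number of pairs written.
*/
static int demoMerge(DemoIter *aIter, int nIter, int *aOut){
  int nOut = 0;
  while( 1 ){
    int i;
    int iBest = -1;
    for(i=0; i<nIter; i++){
      if( aIter[i].i<aIter[i].n
       && (iBest<0 || aIter[i].a[aIter[i].i]<aIter[iBest].a[aIter[iBest].i])
      ){
        iBest = i;
      }
    }
    if( iBest<0 ) break;                  /* all iterators exhausted */
    aOut[2*nOut] = iBest;
    aOut[2*nOut+1] = aIter[iBest].a[aIter[iBest].i];
    aIter[iBest].i++;
    nOut++;
  }
  return nOut;
}
```

The linear scan for the minimum costs O(k) per output entry, which is reasonable when k is the small number of phrases in a query.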
21858
21859 static int fts5ApiInstCount(Fts5Context *pCtx, int *pnInst){
21860 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx;
21861 int rc = SQLITE_OK;
21862 if( CsrFlagTest(pCsr, FTS5CSR_REQUIRE_INST)==0
21863 || SQLITE_OK==(rc = fts5CacheInstArray(pCsr)) ){
21864 *pnInst = pCsr->nInstCount;
21865 }
21866 return rc;
21867 }
21868
21869 static int fts5ApiInst(
21870 Fts5Context *pCtx,
21871 int iIdx,
21872 int *piPhrase,
21873 int *piCol,
21874 int *piOff
21875 ){
21876 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx;
21877 int rc = SQLITE_OK;
21878 if( CsrFlagTest(pCsr, FTS5CSR_REQUIRE_INST)==0
21879 || SQLITE_OK==(rc = fts5CacheInstArray(pCsr))
21880 ){
21881 if( iIdx<0 || iIdx>=pCsr->nInstCount ){
21882 rc = SQLITE_RANGE;
21883 }else{
21884 *piPhrase = pCsr->aInst[iIdx*3];
21885 *piCol = pCsr->aInst[iIdx*3 + 1];
21886 *piOff = pCsr->aInst[iIdx*3 + 2];
21887 }
21888 }
21889 return rc;
21890 }
21891
21892 static sqlite3_int64 fts5ApiRowid(Fts5Context *pCtx){
21893 return fts5CursorRowid((Fts5Cursor*)pCtx);
21894 }
21895
21896 static int fts5ApiColumnText(
21897 Fts5Context *pCtx,
21898 int iCol,
21899 const char **pz,
21900 int *pn
21901 ){
21902 int rc = SQLITE_OK;
21903 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx;
21904 if( fts5IsContentless((Fts5Table*)(pCsr->base.pVtab)) ){
21905 *pz = 0;
21906 *pn = 0;
21907 }else{
21908 rc = fts5SeekCursor(pCsr, 0);
21909 if( rc==SQLITE_OK ){
21910 *pz = (const char*)sqlite3_column_text(pCsr->pStmt, iCol+1);
21911 *pn = sqlite3_column_bytes(pCsr->pStmt, iCol+1);
21912 }
21913 }
21914 return rc;
21915 }
21916
21917 static int fts5ColumnSizeCb(
21918 void *pContext, /* Pointer to int */
21919 int tflags,
21920 const char *pToken, /* Buffer containing token */
21921 int nToken, /* Size of token in bytes */
21922 int iStart, /* Start offset of token */
21923 int iEnd /* End offset of token */
21924 ){
21925 int *pCnt = (int*)pContext;
21926 if( (tflags & FTS5_TOKEN_COLOCATED)==0 ){
21927 (*pCnt)++;
21928 }
21929 return SQLITE_OK;
21930 }
21931
21932 static int fts5ApiColumnSize(Fts5Context *pCtx, int iCol, int *pnToken){
21933 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx;
21934 Fts5Table *pTab = (Fts5Table*)(pCsr->base.pVtab);
21935 Fts5Config *pConfig = pTab->pConfig;
21936 int rc = SQLITE_OK;
21937
21938 if( CsrFlagTest(pCsr, FTS5CSR_REQUIRE_DOCSIZE) ){
21939 if( pConfig->bColumnsize ){
21940 i64 iRowid = fts5CursorRowid(pCsr);
21941 rc = sqlite3Fts5StorageDocsize(pTab->pStorage, iRowid, pCsr->aColumnSize);
21942 }else if( pConfig->zContent==0 ){
21943 int i;
21944 for(i=0; i<pConfig->nCol; i++){
21945 if( pConfig->abUnindexed[i]==0 ){
21946 pCsr->aColumnSize[i] = -1;
21947 }
21948 }
21949 }else{
21950 int i;
21951 for(i=0; rc==SQLITE_OK && i<pConfig->nCol; i++){
21952 if( pConfig->abUnindexed[i]==0 ){
21953 const char *z; int n;
21954 void *p = (void*)(&pCsr->aColumnSize[i]);
21955 pCsr->aColumnSize[i] = 0;
21956 rc = fts5ApiColumnText(pCtx, i, &z, &n);
21957 if( rc==SQLITE_OK ){
21958 rc = sqlite3Fts5Tokenize(
21959 pConfig, FTS5_TOKENIZE_AUX, z, n, p, fts5ColumnSizeCb
21960 );
21961 }
21962 }
21963 }
21964 }
21965 CsrFlagClear(pCsr, FTS5CSR_REQUIRE_DOCSIZE);
21966 }
21967 if( iCol<0 ){
21968 int i;
21969 *pnToken = 0;
21970 for(i=0; i<pConfig->nCol; i++){
21971 *pnToken += pCsr->aColumnSize[i];
21972 }
21973 }else if( iCol<pConfig->nCol ){
21974 *pnToken = pCsr->aColumnSize[iCol];
21975 }else{
21976 *pnToken = 0;
21977 rc = SQLITE_RANGE;
21978 }
21979 return rc;
21980 }
21981
21982 /*
21983 ** Implementation of the xSetAuxdata() method.
21984 */
21985 static int fts5ApiSetAuxdata(
21986 Fts5Context *pCtx, /* Fts5 context */
21987 void *pPtr, /* Pointer to save as auxdata */
21988 void(*xDelete)(void*) /* Destructor for pPtr (or NULL) */
21989 ){
21990 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx;
21991 Fts5Auxdata *pData;
21992
21993 /* Search through the cursor's list of Fts5Auxdata objects for one that
21994 ** corresponds to the currently executing auxiliary function. */
21995 for(pData=pCsr->pAuxdata; pData; pData=pData->pNext){
21996 if( pData->pAux==pCsr->pAux ) break;
21997 }
21998
21999 if( pData ){
22000 if( pData->xDelete ){
22001 pData->xDelete(pData->pPtr);
22002 }
22003 }else{
22004 int rc = SQLITE_OK;
22005 pData = (Fts5Auxdata*)sqlite3Fts5MallocZero(&rc, sizeof(Fts5Auxdata));
22006 if( pData==0 ){
22007 if( xDelete ) xDelete(pPtr);
22008 return rc;
22009 }
22010 pData->pAux = pCsr->pAux;
22011 pData->pNext = pCsr->pAuxdata;
22012 pCsr->pAuxdata = pData;
22013 }
22014
22015 pData->xDelete = xDelete;
22016 pData->pPtr = pPtr;
22017 return SQLITE_OK;
22018 }
22019
22020 static void *fts5ApiGetAuxdata(Fts5Context *pCtx, int bClear){
22021 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx;
22022 Fts5Auxdata *pData;
22023 void *pRet = 0;
22024
22025 for(pData=pCsr->pAuxdata; pData; pData=pData->pNext){
22026 if( pData->pAux==pCsr->pAux ) break;
22027 }
22028
22029 if( pData ){
22030 pRet = pData->pPtr;
22031 if( bClear ){
22032 pData->pPtr = 0;
22033 pData->xDelete = 0;
22034 }
22035 }
22036
22037 return pRet;
22038 }
22039
22040 static void fts5ApiPhraseNext(
22041 Fts5Context *pCtx,
22042 Fts5PhraseIter *pIter,
22043 int *piCol, int *piOff
22044 ){
22045 if( pIter->a>=pIter->b ){
22046 *piCol = -1;
22047 *piOff = -1;
22048 }else{
22049 int iVal;
22050 pIter->a += fts5GetVarint32(pIter->a, iVal);
22051 if( iVal==1 ){
22052 pIter->a += fts5GetVarint32(pIter->a, iVal);
22053 *piCol = iVal;
22054 *piOff = 0;
22055 pIter->a += fts5GetVarint32(pIter->a, iVal);
22056 }
22057 *piOff += (iVal-2);
22058 }
22059 }
22060
22061 static void fts5ApiPhraseFirst(
22062 Fts5Context *pCtx,
22063 int iPhrase,
22064 Fts5PhraseIter *pIter,
22065 int *piCol, int *piOff
22066 ){
22067 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx;
22068 int n = fts5CsrPoslist(pCsr, iPhrase, &pIter->a);
22069 pIter->b = &pIter->a[n];
22070 *piCol = 0;
22071 *piOff = 0;
22072 fts5ApiPhraseNext(pCtx, pIter, piCol, piOff);
22073 }
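Review note: the encoding walked by fts5ApiPhraseFirst()/fts5ApiPhraseNext() is easier to follow in isolation. A value of 1 announces a column change (the next value is the column number and the running offset resets to 0); any other value is an offset delta stored with a bias of 2. The sketch below reproduces only that control flow. It assumes every value fits in a single byte, whereas the real fts5GetVarint32() also handles multi-byte varints, and the names PosIter, readSmallVarint, posIterNext and posIterFirst are hypothetical, not part of FTS5.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cursor over a position list; mirrors the a/b pointer pair
** used by Fts5PhraseIter in the code above. */
typedef struct PosIter {
  const unsigned char *a;   /* Next byte to read */
  const unsigned char *b;   /* One past the last byte of the list */
} PosIter;

/* Simplified varint reader. ASSUMPTION: the value is < 0x80 and therefore
** occupies exactly one byte. Returns the number of bytes consumed. */
static int readSmallVarint(const unsigned char *p, int *pVal){
  assert( (*p & 0x80)==0 );
  *pVal = (int)*p;
  return 1;
}

/* Advance to the next position. Sets both outputs to -1 at end of list. */
static void posIterNext(PosIter *it, int *piCol, int *piOff){
  if( it->a>=it->b ){
    *piCol = -1;
    *piOff = -1;
  }else{
    int v;
    it->a += readSmallVarint(it->a, &v);
    if( v==1 ){
      /* Column change: read the column number, then the first offset. */
      it->a += readSmallVarint(it->a, &v);
      *piCol = v;
      *piOff = 0;
      it->a += readSmallVarint(it->a, &v);
    }
    *piOff += (v-2);          /* Offsets are stored as delta+2 */
  }
}

/* Position the iterator on the first entry of an n-byte list. */
static void posIterFirst(
  PosIter *it, const unsigned char *aList, int n, int *piCol, int *piOff
){
  it->a = aList;
  it->b = &aList[n];
  *piCol = 0;
  *piOff = 0;
  posIterNext(it, piCol, piOff);
}
```

Like the original, the sketch does not bounds-check the reads that follow a column-change marker, so it is only safe on well-formed lists.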
22074
22075 static int fts5ApiQueryPhrase(Fts5Context*, int, void*,
22076 int(*)(const Fts5ExtensionApi*, Fts5Context*, void*)
22077 );
22078
22079 static const Fts5ExtensionApi sFts5Api = {
22080 2, /* iVersion */
22081 fts5ApiUserData,
22082 fts5ApiColumnCount,
22083 fts5ApiRowCount,
22084 fts5ApiColumnTotalSize,
22085 fts5ApiTokenize,
22086 fts5ApiPhraseCount,
22087 fts5ApiPhraseSize,
22088 fts5ApiInstCount,
22089 fts5ApiInst,
22090 fts5ApiRowid,
22091 fts5ApiColumnText,
22092 fts5ApiColumnSize,
22093 fts5ApiQueryPhrase,
22094 fts5ApiSetAuxdata,
22095 fts5ApiGetAuxdata,
22096 fts5ApiPhraseFirst,
22097 fts5ApiPhraseNext,
22098 };
22099
22100
22101 /*
22102 ** Implementation of API function xQueryPhrase().
22103 */
22104 static int fts5ApiQueryPhrase(
22105 Fts5Context *pCtx,
22106 int iPhrase,
22107 void *pUserData,
22108 int(*xCallback)(const Fts5ExtensionApi*, Fts5Context*, void*)
22109 ){
22110 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx;
22111 Fts5Table *pTab = (Fts5Table*)(pCsr->base.pVtab);
22112 int rc;
22113 Fts5Cursor *pNew = 0;
22114
22115 rc = fts5OpenMethod(pCsr->base.pVtab, (sqlite3_vtab_cursor**)&pNew);
22116 if( rc==SQLITE_OK ){
22117 Fts5Config *pConf = pTab->pConfig;
22118 pNew->ePlan = FTS5_PLAN_MATCH;
22119 pNew->iFirstRowid = SMALLEST_INT64;
22120 pNew->iLastRowid = LARGEST_INT64;
22121 pNew->base.pVtab = (sqlite3_vtab*)pTab;
22122 rc = sqlite3Fts5ExprClonePhrase(pConf, pCsr->pExpr, iPhrase, &pNew->pExpr);
22123 }
22124
22125 if( rc==SQLITE_OK ){
22126 for(rc = fts5CursorFirst(pTab, pNew, 0);
22127 rc==SQLITE_OK && CsrFlagTest(pNew, FTS5CSR_EOF)==0;
22128 rc = fts5NextMethod((sqlite3_vtab_cursor*)pNew)
22129 ){
22130 rc = xCallback(&sFts5Api, (Fts5Context*)pNew, pUserData);
22131 if( rc!=SQLITE_OK ){
22132 if( rc==SQLITE_DONE ) rc = SQLITE_OK;
22133 break;
22134 }
22135 }
22136 }
22137
22138 fts5CloseMethod((sqlite3_vtab_cursor*)pNew);
22139 return rc;
22140 }
22141
22142 static void fts5ApiInvoke(
22143 Fts5Auxiliary *pAux,
22144 Fts5Cursor *pCsr,
22145 sqlite3_context *context,
22146 int argc,
22147 sqlite3_value **argv
22148 ){
22149 assert( pCsr->pAux==0 );
22150 pCsr->pAux = pAux;
22151 pAux->xFunc(&sFts5Api, (Fts5Context*)pCsr, context, argc, argv);
22152 pCsr->pAux = 0;
22153 }
22154
22155 static Fts5Cursor *fts5CursorFromCsrid(Fts5Global *pGlobal, i64 iCsrId){
22156 Fts5Cursor *pCsr;
22157 for(pCsr=pGlobal->pCsr; pCsr; pCsr=pCsr->pNext){
22158 if( pCsr->iCsrId==iCsrId ) break;
22159 }
22160 return pCsr;
22161 }
22162
22163 static void fts5ApiCallback(
22164 sqlite3_context *context,
22165 int argc,
22166 sqlite3_value **argv
22167 ){
22168
22169 Fts5Auxiliary *pAux;
22170 Fts5Cursor *pCsr;
22171 i64 iCsrId;
22172
22173 assert( argc>=1 );
22174 pAux = (Fts5Auxiliary*)sqlite3_user_data(context);
22175 iCsrId = sqlite3_value_int64(argv[0]);
22176
22177 pCsr = fts5CursorFromCsrid(pAux->pGlobal, iCsrId);
22178 if( pCsr==0 ){
22179 char *zErr = sqlite3_mprintf("no such cursor: %lld", iCsrId);
22180 sqlite3_result_error(context, zErr, -1);
22181 sqlite3_free(zErr);
22182 }else{
22183 fts5ApiInvoke(pAux, pCsr, context, argc-1, &argv[1]);
22184 }
22185 }
22186
22187
22188 /*
22189 ** Given cursor id iId, return a pointer to the corresponding Fts5Index
22190 ** object, or NULL if the cursor id does not exist.
22191 **
22192 ** If successful, set *ppConfig to point to the associated config object
22193 ** before returning.
22194 */
22195 static Fts5Index *sqlite3Fts5IndexFromCsrid(
22196 Fts5Global *pGlobal, /* FTS5 global context for db handle */
22197 i64 iCsrId, /* Id of cursor to find */
22198 Fts5Config **ppConfig /* OUT: Configuration object */
22199 ){
22200 Fts5Cursor *pCsr;
22201 Fts5Table *pTab;
22202
22203 pCsr = fts5CursorFromCsrid(pGlobal, iCsrId);
22204 pTab = (Fts5Table*)pCsr->base.pVtab;
22205 *ppConfig = pTab->pConfig;
22206
22207 return pTab->pIndex;
22208 }
22209
22210 /*
22211 ** Return a "position-list blob" corresponding to the current position of
22212 ** cursor pCsr via sqlite3_result_blob(). A position-list blob contains
22213 ** the current position-list for each phrase in the query associated with
22214 ** cursor pCsr.
22215 **
22216 ** A position-list blob begins with (nPhrase-1) varints, where nPhrase is
22217 ** the number of phrases in the query. Following the varints are the
22218 ** concatenated position lists for each phrase, in order.
22219 **
22220 ** The first varint (if it exists) contains the size of the position list
22221 ** for phrase 0. The second (same disclaimer) contains the size of position
22222 ** list 1. And so on. There is no size field for the final position list,
22223 ** as it can be derived from the total size of the blob.
22224 */
22225 static int fts5PoslistBlob(sqlite3_context *pCtx, Fts5Cursor *pCsr){
22226 int i;
22227 int rc = SQLITE_OK;
22228 int nPhrase = sqlite3Fts5ExprPhraseCount(pCsr->pExpr);
22229 Fts5Buffer val;
22230
22231 memset(&val, 0, sizeof(Fts5Buffer));
22232
22233 /* Append the varints */
22234 for(i=0; i<(nPhrase-1); i++){
22235 const u8 *dummy;
22236 int nByte = sqlite3Fts5ExprPoslist(pCsr->pExpr, i, &dummy);
22237 sqlite3Fts5BufferAppendVarint(&rc, &val, nByte);
22238 }
22239
22240 /* Append the position lists */
22241 for(i=0; i<nPhrase; i++){
22242 const u8 *pPoslist;
22243 int nPoslist;
22244 nPoslist = sqlite3Fts5ExprPoslist(pCsr->pExpr, i, &pPoslist);
22245 sqlite3Fts5BufferAppendBlob(&rc, &val, nPoslist, pPoslist);
22246 }
22247
22248 sqlite3_result_blob(pCtx, val.p, val.n, sqlite3_free);
22249 return rc;
22250 }
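Review note: the blob layout described in the comment above fts5PoslistBlob() can be checked by writing the matching reader. The sketch below splits a blob into per-phrase spans: the first (nPhrase-1) values are list sizes, and the final list takes whatever bytes remain. It is an illustration under a stated assumption, namely that every size is below 128 so that each header varint is one byte; PoslistSpan and splitPoslistBlob are hypothetical names that do not exist in FTS5.

```c
#include <assert.h>

/* Hypothetical description of one position list inside the blob. */
typedef struct PoslistSpan {
  int iStart;               /* Byte offset of the list within the blob */
  int nByte;                /* Size of the list in bytes */
} PoslistSpan;

/* Split a position-list blob into nPhrase spans.
** ASSUMPTION: each size varint in the header is a single byte (< 0x80).
** Returns 0 on success, -1 if the blob is malformed or outside the
** scope of this sketch. */
static int splitPoslistBlob(
  const unsigned char *aBlob, int nBlob, int nPhrase, PoslistSpan *aSpan
){
  int i;
  int iOff;
  assert( aSpan!=0 );
  if( nPhrase<1 || nBlob<nPhrase-1 ) return -1;
  iOff = nPhrase-1;                     /* Lists start after the header */
  for(i=0; i<nPhrase; i++){
    int n;
    if( i<nPhrase-1 ){
      if( aBlob[i] & 0x80 ) return -1;  /* Multi-byte varint: not handled */
      n = aBlob[i];
    }else{
      n = nBlob - iOff;                 /* Final list: all remaining bytes */
    }
    if( iOff+n>nBlob ) return -1;       /* Size runs past end of blob */
    aSpan[i].iStart = iOff;
    aSpan[i].nByte = n;
    iOff += n;
  }
  return 0;
}
```

Omitting the last size field saves one varint per row, at the cost of the reader needing the total blob size up front.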
22251
22252 /*
22253 ** This is the xColumn method, called by SQLite to request a value from
22254 ** the row that the supplied cursor currently points to.
22255 */
22256 static int fts5ColumnMethod(
22257 sqlite3_vtab_cursor *pCursor, /* Cursor to retrieve value from */
22258 sqlite3_context *pCtx, /* Context for sqlite3_result_xxx() calls */
22259 int iCol /* Index of column to read value from */
22260 ){
22261 Fts5Table *pTab = (Fts5Table*)(pCursor->pVtab);
22262 Fts5Config *pConfig = pTab->pConfig;
22263 Fts5Cursor *pCsr = (Fts5Cursor*)pCursor;
22264 int rc = SQLITE_OK;
22265
22266 assert( CsrFlagTest(pCsr, FTS5CSR_EOF)==0 );
22267
22268 if( pCsr->ePlan==FTS5_PLAN_SPECIAL ){
22269 if( iCol==pConfig->nCol ){
22270 sqlite3_result_int64(pCtx, pCsr->iSpecial);
22271 }
22272 }else
22273
22274 if( iCol==pConfig->nCol ){
22275 /* User is requesting the value of the special column with the same name
22276 ** as the table. Return the cursor integer id number. This value is only
22277 ** useful in that it may be passed as the first argument to an FTS5
22278 ** auxiliary function. */
22279 sqlite3_result_int64(pCtx, pCsr->iCsrId);
22280 }else if( iCol==pConfig->nCol+1 ){
22281
22282 /* The value of the "rank" column. */
22283 if( pCsr->ePlan==FTS5_PLAN_SOURCE ){
22284 fts5PoslistBlob(pCtx, pCsr);
22285 }else if(
22286 pCsr->ePlan==FTS5_PLAN_MATCH
22287 || pCsr->ePlan==FTS5_PLAN_SORTED_MATCH
22288 ){
22289 if( pCsr->pRank || SQLITE_OK==(rc = fts5FindRankFunction(pCsr)) ){
22290 fts5ApiInvoke(pCsr->pRank, pCsr, pCtx, pCsr->nRankArg, pCsr->apRankArg);
22291 }
22292 }
22293 }else if( !fts5IsContentless(pTab) ){
22294 rc = fts5SeekCursor(pCsr, 1);
22295 if( rc==SQLITE_OK ){
22296 sqlite3_result_value(pCtx, sqlite3_column_value(pCsr->pStmt, iCol+1));
22297 }
22298 }
22299 return rc;
22300 }
22301
22302
22303 /*
22304 ** This routine implements the xFindFunction method for the FTS5
22305 ** virtual table.
22306 */
22307 static int fts5FindFunctionMethod(
22308 sqlite3_vtab *pVtab, /* Virtual table handle */
22309 int nArg, /* Number of SQL function arguments */
22310 const char *zName, /* Name of SQL function */
22311 void (**pxFunc)(sqlite3_context*,int,sqlite3_value**), /* OUT: Result */
22312 void **ppArg /* OUT: User data for *pxFunc */
22313 ){
22314 Fts5Table *pTab = (Fts5Table*)pVtab;
22315 Fts5Auxiliary *pAux;
22316
22317 pAux = fts5FindAuxiliary(pTab, zName);
22318 if( pAux ){
22319 *pxFunc = fts5ApiCallback;
22320 *ppArg = (void*)pAux;
22321 return 1;
22322 }
22323
22324 /* No function of the specified name was found. Return 0. */
22325 return 0;
22326 }
22327
22328 /*
22329 ** Implementation of FTS5 xRename method. Rename an fts5 table.
22330 */
22331 static int fts5RenameMethod(
22332 sqlite3_vtab *pVtab, /* Virtual table handle */
22333 const char *zName /* New name of table */
22334 ){
22335 Fts5Table *pTab = (Fts5Table*)pVtab;
22336 return sqlite3Fts5StorageRename(pTab->pStorage, zName);
22337 }
22338
22339 /*
22340 ** The xSavepoint() method.
22341 **
22342 ** Flush the contents of the pending-terms table to disk.
22343 */
22344 static int fts5SavepointMethod(sqlite3_vtab *pVtab, int iSavepoint){
22345 Fts5Table *pTab = (Fts5Table*)pVtab;
22346 fts5CheckTransactionState(pTab, FTS5_SAVEPOINT, iSavepoint);
22347 fts5TripCursors(pTab);
22348 return sqlite3Fts5StorageSync(pTab->pStorage, 0);
22349 }
22350
22351 /*
22352 ** The xRelease() method.
22353 **
22354 ** Flush the contents of the pending-terms table to disk.
22355 */
22356 static int fts5ReleaseMethod(sqlite3_vtab *pVtab, int iSavepoint){
22357 Fts5Table *pTab = (Fts5Table*)pVtab;
22358 fts5CheckTransactionState(pTab, FTS5_RELEASE, iSavepoint);
22359 fts5TripCursors(pTab);
22360 return sqlite3Fts5StorageSync(pTab->pStorage, 0);
22361 }
22362
22363 /*
22364 ** The xRollbackTo() method.
22365 **
22366 ** Discard the contents of the pending terms table.
22367 */
22368 static int fts5RollbackToMethod(sqlite3_vtab *pVtab, int iSavepoint){
22369 Fts5Table *pTab = (Fts5Table*)pVtab;
22370 fts5CheckTransactionState(pTab, FTS5_ROLLBACKTO, iSavepoint);
22371 fts5TripCursors(pTab);
22372 return sqlite3Fts5StorageRollback(pTab->pStorage);
22373 }
22374
22375 /*
22376 ** Register a new auxiliary function with global context pGlobal.
22377 */
22378 static int fts5CreateAux(
22379 fts5_api *pApi, /* Global context (one per db handle) */
22380 const char *zName, /* Name of new function */
22381 void *pUserData, /* User data for aux. function */
22382 fts5_extension_function xFunc, /* Aux. function implementation */
22383 void(*xDestroy)(void*) /* Destructor for pUserData */
22384 ){
22385 Fts5Global *pGlobal = (Fts5Global*)pApi;
22386 int rc = sqlite3_overload_function(pGlobal->db, zName, -1);
22387 if( rc==SQLITE_OK ){
22388 Fts5Auxiliary *pAux;
22389 int nName; /* Size of zName in bytes, including \0 */
22390 int nByte; /* Bytes of space to allocate */
22391
22392 nName = (int)strlen(zName) + 1;
22393 nByte = sizeof(Fts5Auxiliary) + nName;
22394 pAux = (Fts5Auxiliary*)sqlite3_malloc(nByte);
22395 if( pAux ){
22396 memset(pAux, 0, nByte);
22397 pAux->zFunc = (char*)&pAux[1];
22398 memcpy(pAux->zFunc, zName, nName);
22399 pAux->pGlobal = pGlobal;
22400 pAux->pUserData = pUserData;
22401 pAux->xFunc = xFunc;
22402 pAux->xDestroy = xDestroy;
22403 pAux->pNext = pGlobal->pAux;
22404 pGlobal->pAux = pAux;
22405 }else{
22406 rc = SQLITE_NOMEM;
22407 }
22408 }
22409
22410 return rc;
22411 }
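Review note: fts5CreateAux() and fts5CreateTokenizer() both store the registered name in the same allocation as the list node, directly after the struct, so a single free releases both. The pattern is sketched below with plain malloc()/free() in place of sqlite3_malloc()/sqlite3_free(). NamedEntry, namedEntryPush and namedEntryFreeAll are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list node whose name lives in the same allocation. */
typedef struct NamedEntry NamedEntry;
struct NamedEntry {
  char *zName;              /* Points just past this struct */
  void *pUserData;          /* Caller-supplied context */
  NamedEntry *pNext;        /* Next entry in the list */
};

/* Allocate a node plus a copy of zName in one block and push it onto the
** front of *ppHead. Returns the new node, or NULL on allocation failure. */
static NamedEntry *namedEntryPush(
  NamedEntry **ppHead, const char *zName, void *pUserData
){
  size_t nName;
  NamedEntry *pNew;
  assert( ppHead!=0 && zName!=0 );
  nName = strlen(zName) + 1;            /* Include the nul terminator */
  pNew = (NamedEntry*)malloc(sizeof(NamedEntry) + nName);
  if( pNew==0 ) return 0;
  memset(pNew, 0, sizeof(NamedEntry) + nName);
  pNew->zName = (char*)&pNew[1];        /* First byte after the struct */
  memcpy(pNew->zName, zName, nName);
  pNew->pUserData = pUserData;
  pNew->pNext = *ppHead;
  *ppHead = pNew;
  return pNew;
}

/* Free every node in the list. The names need no separate free. */
static void namedEntryFreeAll(NamedEntry *pHead){
  while( pHead ){
    NamedEntry *pNext = pHead->pNext;
    free(pHead);
    pHead = pNext;
  }
}
```

Because new entries are pushed at the head, a later registration under the same name is found first by a front-to-back search.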
22412
22413 /*
22414 ** Register a new tokenizer. This is the implementation of the
22415 ** fts5_api.xCreateTokenizer() method.
22416 */
22417 static int fts5CreateTokenizer(
22418 fts5_api *pApi, /* Global context (one per db handle) */
22419 const char *zName, /* Name of new tokenizer */
22420 void *pUserData, /* User data for tokenizer */
22421 fts5_tokenizer *pTokenizer, /* Tokenizer implementation */
22422 void(*xDestroy)(void*) /* Destructor for pUserData */
22423 ){
22424 Fts5Global *pGlobal = (Fts5Global*)pApi;
22425 Fts5TokenizerModule *pNew;
22426 int nName; /* Size of zName and its \0 terminator */
22427 int nByte; /* Bytes of space to allocate */
22428 int rc = SQLITE_OK;
22429
22430 nName = (int)strlen(zName) + 1;
22431 nByte = sizeof(Fts5TokenizerModule) + nName;
22432 pNew = (Fts5TokenizerModule*)sqlite3_malloc(nByte);
22433 if( pNew ){
22434 memset(pNew, 0, nByte);
22435 pNew->zName = (char*)&pNew[1];
22436 memcpy(pNew->zName, zName, nName);
22437 pNew->pUserData = pUserData;
22438 pNew->x = *pTokenizer;
22439 pNew->xDestroy = xDestroy;
22440 pNew->pNext = pGlobal->pTok;
22441 pGlobal->pTok = pNew;
22442 if( pNew->pNext==0 ){
22443 pGlobal->pDfltTok = pNew;
22444 }
22445 }else{
22446 rc = SQLITE_NOMEM;
22447 }
22448
22449 return rc;
22450 }
22451
22452 static Fts5TokenizerModule *fts5LocateTokenizer(
22453 Fts5Global *pGlobal,
22454 const char *zName
22455 ){
22456 Fts5TokenizerModule *pMod = 0;
22457
22458 if( zName==0 ){
22459 pMod = pGlobal->pDfltTok;
22460 }else{
22461 for(pMod=pGlobal->pTok; pMod; pMod=pMod->pNext){
22462 if( sqlite3_stricmp(zName, pMod->zName)==0 ) break;
22463 }
22464 }
22465
22466 return pMod;
22467 }
22468
22469 /*
22470 ** Find a tokenizer. This is the implementation of the
22471 ** fts5_api.xFindTokenizer() method.
22472 */
22473 static int fts5FindTokenizer(
22474 fts5_api *pApi, /* Global context (one per db handle) */
22475 const char *zName, /* Name of tokenizer to find */
22476 void **ppUserData,
22477 fts5_tokenizer *pTokenizer /* Populate this object */
22478 ){
22479 int rc = SQLITE_OK;
22480 Fts5TokenizerModule *pMod;
22481
22482 pMod = fts5LocateTokenizer((Fts5Global*)pApi, zName);
22483 if( pMod ){
22484 *pTokenizer = pMod->x;
22485 *ppUserData = pMod->pUserData;
22486 }else{
22487 memset(pTokenizer, 0, sizeof(fts5_tokenizer));
22488 rc = SQLITE_ERROR;
22489 }
22490
22491 return rc;
22492 }
22493
22494 static int sqlite3Fts5GetTokenizer(
22495 Fts5Global *pGlobal,
22496 const char **azArg,
22497 int nArg,
22498 Fts5Tokenizer **ppTok,
22499 fts5_tokenizer **ppTokApi,
22500 char **pzErr
22501 ){
22502 Fts5TokenizerModule *pMod;
22503 int rc = SQLITE_OK;
22504
22505 pMod = fts5LocateTokenizer(pGlobal, nArg==0 ? 0 : azArg[0]);
22506 if( pMod==0 ){
22507 assert( nArg>0 );
22508 rc = SQLITE_ERROR;
22509 *pzErr = sqlite3_mprintf("no such tokenizer: %s", azArg[0]);
22510 }else{
22511 rc = pMod->x.xCreate(pMod->pUserData, &azArg[1], (nArg?nArg-1:0), ppTok);
22512 *ppTokApi = &pMod->x;
22513 if( rc!=SQLITE_OK && pzErr ){
22514 *pzErr = sqlite3_mprintf("error in tokenizer constructor");
22515 }
22516 }
22517
22518 if( rc!=SQLITE_OK ){
22519 *ppTokApi = 0;
22520 *ppTok = 0;
22521 }
22522
22523 return rc;
22524 }
22525
22526 static void fts5ModuleDestroy(void *pCtx){
22527 Fts5TokenizerModule *pTok, *pNextTok;
22528 Fts5Auxiliary *pAux, *pNextAux;
22529 Fts5Global *pGlobal = (Fts5Global*)pCtx;
22530
22531 for(pAux=pGlobal->pAux; pAux; pAux=pNextAux){
22532 pNextAux = pAux->pNext;
22533 if( pAux->xDestroy ) pAux->xDestroy(pAux->pUserData);
22534 sqlite3_free(pAux);
22535 }
22536
22537 for(pTok=pGlobal->pTok; pTok; pTok=pNextTok){
22538 pNextTok = pTok->pNext;
22539 if( pTok->xDestroy ) pTok->xDestroy(pTok->pUserData);
22540 sqlite3_free(pTok);
22541 }
22542
22543 sqlite3_free(pGlobal);
22544 }
22545
22546 static void fts5Fts5Func(
22547 sqlite3_context *pCtx, /* Function call context */
22548 int nArg, /* Number of args */
22549 sqlite3_value **apVal /* Function arguments */
22550 ){
22551 Fts5Global *pGlobal = (Fts5Global*)sqlite3_user_data(pCtx);
22552 char buf[8];
22553 assert( nArg==0 );
22554 assert( sizeof(buf)>=sizeof(pGlobal) );
22555 memcpy(buf, (void*)&pGlobal, sizeof(pGlobal));
22556 sqlite3_result_blob(pCtx, buf, sizeof(pGlobal), SQLITE_TRANSIENT);
22557 }
22558
22559 /*
22560 ** Implementation of fts5_source_id() function.
22561 */
22562 static void fts5SourceIdFunc(
22563 sqlite3_context *pCtx, /* Function call context */
22564 int nArg, /* Number of args */
22565 sqlite3_value **apVal /* Function arguments */
22566 ){
22567 assert( nArg==0 );
22568 sqlite3_result_text(pCtx, "fts5: 2016-01-20 15:27:19 17efb4209f97fb4971656086b138599a91a75ff9", -1, SQLITE_TRANSIENT);
22569 }
22570
22571 static int fts5Init(sqlite3 *db){
22572 static const sqlite3_module fts5Mod = {
22573 /* iVersion */ 2,
22574 /* xCreate */ fts5CreateMethod,
22575 /* xConnect */ fts5ConnectMethod,
22576 /* xBestIndex */ fts5BestIndexMethod,
22577 /* xDisconnect */ fts5DisconnectMethod,
22578 /* xDestroy */ fts5DestroyMethod,
22579 /* xOpen */ fts5OpenMethod,
22580 /* xClose */ fts5CloseMethod,
22581 /* xFilter */ fts5FilterMethod,
22582 /* xNext */ fts5NextMethod,
22583 /* xEof */ fts5EofMethod,
22584 /* xColumn */ fts5ColumnMethod,
22585 /* xRowid */ fts5RowidMethod,
22586 /* xUpdate */ fts5UpdateMethod,
22587 /* xBegin */ fts5BeginMethod,
22588 /* xSync */ fts5SyncMethod,
22589 /* xCommit */ fts5CommitMethod,
22590 /* xRollback */ fts5RollbackMethod,
22591 /* xFindFunction */ fts5FindFunctionMethod,
22592 /* xRename */ fts5RenameMethod,
22593 /* xSavepoint */ fts5SavepointMethod,
22594 /* xRelease */ fts5ReleaseMethod,
22595 /* xRollbackTo */ fts5RollbackToMethod,
22596 };
22597
22598 int rc;
22599 Fts5Global *pGlobal = 0;
22600
22601 pGlobal = (Fts5Global*)sqlite3_malloc(sizeof(Fts5Global));
22602 if( pGlobal==0 ){
22603 rc = SQLITE_NOMEM;
22604 }else{
22605 void *p = (void*)pGlobal;
22606 memset(pGlobal, 0, sizeof(Fts5Global));
22607 pGlobal->db = db;
22608 pGlobal->api.iVersion = 2;
22609 pGlobal->api.xCreateFunction = fts5CreateAux;
22610 pGlobal->api.xCreateTokenizer = fts5CreateTokenizer;
22611 pGlobal->api.xFindTokenizer = fts5FindTokenizer;
22612 rc = sqlite3_create_module_v2(db, "fts5", &fts5Mod, p, fts5ModuleDestroy);
22613 if( rc==SQLITE_OK ) rc = sqlite3Fts5IndexInit(db);
22614 if( rc==SQLITE_OK ) rc = sqlite3Fts5ExprInit(pGlobal, db);
22615 if( rc==SQLITE_OK ) rc = sqlite3Fts5AuxInit(&pGlobal->api);
22616 if( rc==SQLITE_OK ) rc = sqlite3Fts5TokenizerInit(&pGlobal->api);
22617 if( rc==SQLITE_OK ) rc = sqlite3Fts5VocabInit(pGlobal, db);
22618 if( rc==SQLITE_OK ){
22619 rc = sqlite3_create_function(
22620 db, "fts5", 0, SQLITE_UTF8, p, fts5Fts5Func, 0, 0
22621 );
22622 }
22623 if( rc==SQLITE_OK ){
22624 rc = sqlite3_create_function(
22625 db, "fts5_source_id", 0, SQLITE_UTF8, p, fts5SourceIdFunc, 0, 0
22626 );
22627 }
22628 }
22629 return rc;
22630 }
22631
22632 /*
22633 ** The following functions are used to register the module with SQLite. If
22634 ** this module is being built as part of the SQLite core (SQLITE_CORE is
22635 ** defined), then sqlite3_open() will call sqlite3Fts5Init() directly.
22636 **
22637 ** Or, if this module is being built as a loadable extension,
22638 ** sqlite3Fts5Init() is omitted and the two standard entry points
22639 ** sqlite3_fts_init() and sqlite3_fts5_init() are defined instead.
22640 */
22641 #ifndef SQLITE_CORE
22642 #ifdef _WIN32
22643 __declspec(dllexport)
22644 #endif
22645 SQLITE_API int SQLITE_STDCALL sqlite3_fts_init(
22646 sqlite3 *db,
22647 char **pzErrMsg,
22648 const sqlite3_api_routines *pApi
22649 ){
22650 SQLITE_EXTENSION_INIT2(pApi);
22651 (void)pzErrMsg; /* Unused parameter */
22652 return fts5Init(db);
22653 }
22654
22655 #ifdef _WIN32
22656 __declspec(dllexport)
22657 #endif
22658 SQLITE_API int SQLITE_STDCALL sqlite3_fts5_init(
22659 sqlite3 *db,
22660 char **pzErrMsg,
22661 const sqlite3_api_routines *pApi
22662 ){
22663 SQLITE_EXTENSION_INIT2(pApi);
22664 (void)pzErrMsg; /* Unused parameter */
22665 return fts5Init(db);
22666 }
22667 #else
22668 SQLITE_PRIVATE int sqlite3Fts5Init(sqlite3 *db){
22669 return fts5Init(db);
22670 }
22671 #endif
22672
22673 /*
22674 ** 2014 May 31
22675 **
22676 ** The author disclaims copyright to this source code. In place of
22677 ** a legal notice, here is a blessing:
22678 **
22679 ** May you do good and not evil.
22680 ** May you find forgiveness for yourself and forgive others.
22681 ** May you share freely, never taking more than you give.
22682 **
22683 ******************************************************************************
22684 **
22685 */
22686
22687
22688
22689 /* #include "fts5Int.h" */
22690
22691 struct Fts5Storage {
22692 Fts5Config *pConfig;
22693 Fts5Index *pIndex;
22694 int bTotalsValid; /* True if nTotalRow/aTotalSize[] are valid */
22695 i64 nTotalRow; /* Total number of rows in FTS table */
22696 i64 *aTotalSize; /* Total sizes of each column */
22697 sqlite3_stmt *aStmt[11];
22698 };
22699
22700
22701 #if FTS5_STMT_SCAN_ASC!=0
22702 # error "FTS5_STMT_SCAN_ASC mismatch"
22703 #endif
22704 #if FTS5_STMT_SCAN_DESC!=1
22705 # error "FTS5_STMT_SCAN_DESC mismatch"
22706 #endif
22707 #if FTS5_STMT_LOOKUP!=2
22708 # error "FTS5_STMT_LOOKUP mismatch"
22709 #endif
22710
22711 #define FTS5_STMT_INSERT_CONTENT 3
22712 #define FTS5_STMT_REPLACE_CONTENT 4
22713 #define FTS5_STMT_DELETE_CONTENT 5
22714 #define FTS5_STMT_REPLACE_DOCSIZE 6
22715 #define FTS5_STMT_DELETE_DOCSIZE 7
22716 #define FTS5_STMT_LOOKUP_DOCSIZE 8
22717 #define FTS5_STMT_REPLACE_CONFIG 9
22718 #define FTS5_STMT_SCAN 10
22719
22720 /*
22721 ** Prepare the statement identified by eStmt (an FTS5_STMT_XXX constant), if
22722 ** it has not already been prepared, and set *ppStmt to point to it.
22723 ** Return SQLITE_OK if successful, or an SQLite error code if an error
22724 ** occurs.
22725 */
22726 static int fts5StorageGetStmt(
22727 Fts5Storage *p, /* Storage handle */
22728 int eStmt, /* FTS5_STMT_XXX constant */
22729 sqlite3_stmt **ppStmt, /* OUT: Prepared statement handle */
22730 char **pzErrMsg /* OUT: Error message (if any) */
22731 ){
22732 int rc = SQLITE_OK;
22733
22734 /* If there is no %_docsize table, there should be no requests for
22735 ** statements to operate on it. */
22736 assert( p->pConfig->bColumnsize || (
22737 eStmt!=FTS5_STMT_REPLACE_DOCSIZE
22738 && eStmt!=FTS5_STMT_DELETE_DOCSIZE
22739 && eStmt!=FTS5_STMT_LOOKUP_DOCSIZE
22740 ));
22741
22742 assert( eStmt>=0 && eStmt<ArraySize(p->aStmt) );
22743 if( p->aStmt[eStmt]==0 ){
22744 const char *azStmt[] = {
22745 "SELECT %s FROM %s T WHERE T.%Q >= ? AND T.%Q <= ? ORDER BY T.%Q ASC",
22746 "SELECT %s FROM %s T WHERE T.%Q <= ? AND T.%Q >= ? ORDER BY T.%Q DESC",
22747 "SELECT %s FROM %s T WHERE T.%Q=?", /* LOOKUP */
22748
22749 "INSERT INTO %Q.'%q_content' VALUES(%s)", /* INSERT_CONTENT */
22750 "REPLACE INTO %Q.'%q_content' VALUES(%s)", /* REPLACE_CONTENT */
22751 "DELETE FROM %Q.'%q_content' WHERE id=?", /* DELETE_CONTENT */
22752 "REPLACE INTO %Q.'%q_docsize' VALUES(?,?)", /* REPLACE_DOCSIZE */
22753 "DELETE FROM %Q.'%q_docsize' WHERE id=?", /* DELETE_DOCSIZE */
22754
22755 "SELECT sz FROM %Q.'%q_docsize' WHERE id=?", /* LOOKUP_DOCSIZE */
22756
22757 "REPLACE INTO %Q.'%q_config' VALUES(?,?)", /* REPLACE_CONFIG */
22758 "SELECT %s FROM %s AS T", /* SCAN */
22759 };
22760 Fts5Config *pC = p->pConfig;
22761 char *zSql = 0;
22762
22763 switch( eStmt ){
22764 case FTS5_STMT_SCAN:
22765 zSql = sqlite3_mprintf(azStmt[eStmt],
22766 pC->zContentExprlist, pC->zContent
22767 );
22768 break;
22769
22770 case FTS5_STMT_SCAN_ASC:
22771 case FTS5_STMT_SCAN_DESC:
22772 zSql = sqlite3_mprintf(azStmt[eStmt], pC->zContentExprlist,
22773 pC->zContent, pC->zContentRowid, pC->zContentRowid,
22774 pC->zContentRowid
22775 );
22776 break;
22777
22778 case FTS5_STMT_LOOKUP:
22779 zSql = sqlite3_mprintf(azStmt[eStmt],
22780 pC->zContentExprlist, pC->zContent, pC->zContentRowid
22781 );
22782 break;
22783
22784 case FTS5_STMT_INSERT_CONTENT:
22785 case FTS5_STMT_REPLACE_CONTENT: {
22786 int nCol = pC->nCol + 1;
22787 char *zBind;
22788 int i;
22789
22790 zBind = sqlite3_malloc(1 + nCol*2);
22791 if( zBind ){
22792 for(i=0; i<nCol; i++){
22793 zBind[i*2] = '?';
22794 zBind[i*2 + 1] = ',';
22795 }
22796 zBind[i*2-1] = '\0';
22797 zSql = sqlite3_mprintf(azStmt[eStmt], pC->zDb, pC->zName, zBind);
22798 sqlite3_free(zBind);
22799 }
22800 break;
22801 }
22802
22803 default:
22804 zSql = sqlite3_mprintf(azStmt[eStmt], pC->zDb, pC->zName);
22805 break;
22806 }
22807
22808 if( zSql==0 ){
22809 rc = SQLITE_NOMEM;
22810 }else{
22811 rc = sqlite3_prepare_v2(pC->db, zSql, -1, &p->aStmt[eStmt], 0);
22812 sqlite3_free(zSql);
22813 if( rc!=SQLITE_OK && pzErrMsg ){
22814 *pzErrMsg = sqlite3_mprintf("%s", sqlite3_errmsg(pC->db));
22815 }
22816 }
22817 }
22818
22819 *ppStmt = p->aStmt[eStmt];
22820 return rc;
22821 }
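Review note: the INSERT_CONTENT/REPLACE_CONTENT branch of fts5StorageGetStmt() builds its "?,?,...,?" placeholder list by writing "?," pairs and then overwriting the final comma with the terminator. The same trick in standalone form is sketched below. It uses malloc() rather than sqlite3_malloc() and allocates exactly 2*nCol bytes; makeBindList is a hypothetical helper, not an FTS5 function.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a heap-allocated string of nCol '?' placeholders separated by
** commas, e.g. "?,?,?" for nCol==3. Returns NULL if nCol<1 or if the
** allocation fails. The caller must free() the result. */
static char *makeBindList(int nCol){
  char *zBind;
  int i;
  if( nCol<1 ) return 0;
  /* nCol '?' + (nCol-1) ',' + 1 terminator == 2*nCol bytes */
  zBind = (char*)malloc((size_t)nCol*2);
  if( zBind==0 ) return 0;
  for(i=0; i<nCol; i++){
    zBind[i*2] = '?';
    zBind[i*2 + 1] = ',';
  }
  zBind[nCol*2 - 1] = '\0';             /* Replace the trailing comma */
  assert( strlen(zBind)==(size_t)(nCol*2 - 1) );
  return zBind;
}
```

In the code under review nCol is pC->nCol + 1, because the content table carries the rowid column in addition to the user columns.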
22822
22823
22824 static int fts5ExecPrintf(
22825 sqlite3 *db,
22826 char **pzErr,
22827 const char *zFormat,
22828 ...
22829 ){
22830 int rc;
22831 va_list ap; /* ... printf arguments */
22832 char *zSql;
22833
22834 va_start(ap, zFormat);
22835 zSql = sqlite3_vmprintf(zFormat, ap);
22836
22837 if( zSql==0 ){
22838 rc = SQLITE_NOMEM;
22839 }else{
22840 rc = sqlite3_exec(db, zSql, 0, 0, pzErr);
22841 sqlite3_free(zSql);
22842 }
22843
22844 va_end(ap);
22845 return rc;
22846 }
22847
22848 /*
22849 ** Drop all shadow tables. Return SQLITE_OK if successful or an SQLite error
22850 ** code otherwise.
22851 */
22852 static int sqlite3Fts5DropAll(Fts5Config *pConfig){
22853 int rc = fts5ExecPrintf(pConfig->db, 0,
22854 "DROP TABLE IF EXISTS %Q.'%q_data';"
22855 "DROP TABLE IF EXISTS %Q.'%q_idx';"
22856 "DROP TABLE IF EXISTS %Q.'%q_config';",
22857 pConfig->zDb, pConfig->zName,
22858 pConfig->zDb, pConfig->zName,
22859 pConfig->zDb, pConfig->zName
22860 );
22861 if( rc==SQLITE_OK && pConfig->bColumnsize ){
22862 rc = fts5ExecPrintf(pConfig->db, 0,
22863 "DROP TABLE IF EXISTS %Q.'%q_docsize';",
22864 pConfig->zDb, pConfig->zName
22865 );
22866 }
22867 if( rc==SQLITE_OK && pConfig->eContent==FTS5_CONTENT_NORMAL ){
22868 rc = fts5ExecPrintf(pConfig->db, 0,
22869 "DROP TABLE IF EXISTS %Q.'%q_content';",
22870 pConfig->zDb, pConfig->zName
22871 );
22872 }
22873 return rc;
22874 }
22875
22876 static void fts5StorageRenameOne(
22877 Fts5Config *pConfig, /* Current FTS5 configuration */
22878 int *pRc, /* IN/OUT: Error code */
22879 const char *zTail, /* Tail of table name e.g. "data", "config" */
22880 const char *zName /* New name of FTS5 table */
22881 ){
22882 if( *pRc==SQLITE_OK ){
22883 *pRc = fts5ExecPrintf(pConfig->db, 0,
22884 "ALTER TABLE %Q.'%q_%s' RENAME TO '%q_%s';",
22885 pConfig->zDb, pConfig->zName, zTail, zName, zTail
22886 );
22887 }
22888 }
22889
22890 static int sqlite3Fts5StorageRename(Fts5Storage *pStorage, const char *zName){
22891 Fts5Config *pConfig = pStorage->pConfig;
22892 int rc = sqlite3Fts5StorageSync(pStorage, 1);
22893
22894 fts5StorageRenameOne(pConfig, &rc, "data", zName);
22895 fts5StorageRenameOne(pConfig, &rc, "idx", zName);
22896 fts5StorageRenameOne(pConfig, &rc, "config", zName);
22897 if( pConfig->bColumnsize ){
22898 fts5StorageRenameOne(pConfig, &rc, "docsize", zName);
22899 }
22900 if( pConfig->eContent==FTS5_CONTENT_NORMAL ){
22901 fts5StorageRenameOne(pConfig, &rc, "content", zName);
22902 }
22903 return rc;
22904 }
22905
22906 /*
22907 ** Create the shadow table named zPost, with definition zDefn. Return
22908 ** SQLITE_OK if successful, or an SQLite error code otherwise.
22909 */
22910 static int sqlite3Fts5CreateTable(
22911 Fts5Config *pConfig, /* FTS5 configuration */
22912 const char *zPost, /* Shadow table to create (e.g. "content") */
22913 const char *zDefn, /* Columns etc. for shadow table */
22914 int bWithout, /* True for without rowid */
22915 char **pzErr /* OUT: Error message */
22916 ){
22917 int rc;
22918 char *zErr = 0;
22919
22920 rc = fts5ExecPrintf(pConfig->db, &zErr, "CREATE TABLE %Q.'%q_%q'(%s)%s",
22921 pConfig->zDb, pConfig->zName, zPost, zDefn, bWithout?" WITHOUT ROWID":""
22922 );
22923 if( zErr ){
22924 *pzErr = sqlite3_mprintf(
22925 "fts5: error creating shadow table %q_%s: %s",
22926 pConfig->zName, zPost, zErr
22927 );
22928 sqlite3_free(zErr);
22929 }
22930
22931 return rc;
22932 }
22933
22934 /*
22935 ** Open a new Fts5Storage handle. If the bCreate argument is true, create
22936 ** and initialize the underlying tables.
22937 **
22938 ** If successful, set *pp to point to the new object and return SQLITE_OK.
22939 ** Otherwise, set *pp to NULL and return an SQLite error code.
22940 */
22941 static int sqlite3Fts5StorageOpen(
22942 Fts5Config *pConfig,
22943 Fts5Index *pIndex,
22944 int bCreate,
22945 Fts5Storage **pp,
22946 char **pzErr /* OUT: Error message */
22947 ){
22948 int rc = SQLITE_OK;
22949 Fts5Storage *p; /* New object */
22950 int nByte; /* Bytes of space to allocate */
22951
22952 nByte = sizeof(Fts5Storage) /* Fts5Storage object */
22953 + pConfig->nCol * sizeof(i64); /* Fts5Storage.aTotalSize[] */
22954 *pp = p = (Fts5Storage*)sqlite3_malloc(nByte);
22955 if( !p ) return SQLITE_NOMEM;
22956
22957 memset(p, 0, nByte);
22958 p->aTotalSize = (i64*)&p[1];
22959 p->pConfig = pConfig;
22960 p->pIndex = pIndex;
22961
22962 if( bCreate ){
22963 if( pConfig->eContent==FTS5_CONTENT_NORMAL ){
22964 int nDefn = 32 + pConfig->nCol*10;
22965 char *zDefn = sqlite3_malloc(32 + pConfig->nCol * 10);
22966 if( zDefn==0 ){
22967 rc = SQLITE_NOMEM;
22968 }else{
22969 int i;
22970 int iOff;
22971 sqlite3_snprintf(nDefn, zDefn, "id INTEGER PRIMARY KEY");
22972 iOff = (int)strlen(zDefn);
22973 for(i=0; i<pConfig->nCol; i++){
22974 sqlite3_snprintf(nDefn-iOff, &zDefn[iOff], ", c%d", i);
22975 iOff += (int)strlen(&zDefn[iOff]);
22976 }
22977 rc = sqlite3Fts5CreateTable(pConfig, "content", zDefn, 0, pzErr);
22978 }
22979 sqlite3_free(zDefn);
22980 }
22981
22982 if( rc==SQLITE_OK && pConfig->bColumnsize ){
22983 rc = sqlite3Fts5CreateTable(
22984 pConfig, "docsize", "id INTEGER PRIMARY KEY, sz BLOB", 0, pzErr
22985 );
22986 }
22987 if( rc==SQLITE_OK ){
22988 rc = sqlite3Fts5CreateTable(
22989 pConfig, "config", "k PRIMARY KEY, v", 1, pzErr
22990 );
22991 }
22992 if( rc==SQLITE_OK ){
22993 rc = sqlite3Fts5StorageConfigValue(p, "version", 0, FTS5_CURRENT_VERSION);
22994 }
22995 }
22996
22997 if( rc ){
22998 sqlite3Fts5StorageClose(p);
22999 *pp = 0;
23000 }
23001 return rc;
23002 }
23003
23004 /*
23005 ** Close a handle opened by an earlier call to sqlite3Fts5StorageOpen().
23006 */
23007 static int sqlite3Fts5StorageClose(Fts5Storage *p){
23008 int rc = SQLITE_OK;
23009 if( p ){
23010 int i;
23011
23012 /* Finalize all SQL statements */
23013 for(i=0; i<(int)ArraySize(p->aStmt); i++){
23014 sqlite3_finalize(p->aStmt[i]);
23015 }
23016
23017 sqlite3_free(p);
23018 }
23019 return rc;
23020 }
23021
23022 typedef struct Fts5InsertCtx Fts5InsertCtx;
23023 struct Fts5InsertCtx {
23024 Fts5Storage *pStorage;
23025 int iCol;
23026 int szCol; /* Size of column value in tokens */
23027 };
23028
23029 /*
23030 ** Tokenization callback used when inserting tokens into the FTS index.
23031 */
23032 static int fts5StorageInsertCallback(
23033 void *pContext, /* Pointer to Fts5InsertCtx object */
23034 int tflags,
23035 const char *pToken, /* Buffer containing token */
23036 int nToken, /* Size of token in bytes */
23037 int iStart, /* Start offset of token */
23038 int iEnd /* End offset of token */
23039 ){
23040 Fts5InsertCtx *pCtx = (Fts5InsertCtx*)pContext;
23041 Fts5Index *pIdx = pCtx->pStorage->pIndex;
23042 if( (tflags & FTS5_TOKEN_COLOCATED)==0 || pCtx->szCol==0 ){
23043 pCtx->szCol++;
23044 }
23045 return sqlite3Fts5IndexWrite(pIdx, pCtx->iCol, pCtx->szCol-1, pToken, nToken);
23046 }
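The position bookkeeping in fts5StorageInsertCallback() can be read on its own: a token advances the column's token count unless it is flagged as colocated, and the first token of a column always advances it. The sketch below isolates that rule. PosCounter and next_position() are hypothetical names, and TOKEN_COLOCATED is a stand-in whose value is assumed to match FTS5_TOKEN_COLOCATED.

```c
#include <assert.h>

#define TOKEN_COLOCATED 0x0001   /* Stand-in for FTS5_TOKEN_COLOCATED (assumed value) */

/* Hypothetical counter mirroring the szCol field of Fts5InsertCtx. */
typedef struct PosCounter {
  int szCol;                     /* Number of token positions seen so far */
} PosCounter;

/* Return the position at which the next token is indexed, updating the
** running count in the same way as the callback above. */
static int next_position(PosCounter *p, int tflags){
  assert( p!=0 );
  if( (tflags & TOKEN_COLOCATED)==0 || p->szCol==0 ){
    p->szCol++;
  }
  return p->szCol-1;
}
```

Under this rule a synonym emitted with the colocated flag shares the position of the token before it, so it does not inflate the column size later written to %_docsize.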
23047
23048 /*
23049 ** If a row with rowid iDel is present in the %_content table, add the
23050 ** delete-markers to the FTS index necessary to delete it. Do not actually
23051 ** remove the %_content row at this time though.
23052 */
23053 static int fts5StorageDeleteFromIndex(Fts5Storage *p, i64 iDel){
23054 Fts5Config *pConfig = p->pConfig;
23055 sqlite3_stmt *pSeek; /* SELECT to read row iDel from %_content */
23056 int rc; /* Return code */
23057
23058 rc = fts5StorageGetStmt(p, FTS5_STMT_LOOKUP, &pSeek, 0);
23059 if( rc==SQLITE_OK ){
23060 int rc2;
23061 sqlite3_bind_int64(pSeek, 1, iDel);
23062 if( sqlite3_step(pSeek)==SQLITE_ROW ){
23063 int iCol;
23064 Fts5InsertCtx ctx;
23065 ctx.pStorage = p;
23066 ctx.iCol = -1;
23067 rc = sqlite3Fts5IndexBeginWrite(p->pIndex, 1, iDel);
23068 for(iCol=1; rc==SQLITE_OK && iCol<=pConfig->nCol; iCol++){
23069 if( pConfig->abUnindexed[iCol-1] ) continue;
23070 ctx.szCol = 0;
23071 rc = sqlite3Fts5Tokenize(pConfig,
23072 FTS5_TOKENIZE_DOCUMENT,
23073 (const char*)sqlite3_column_text(pSeek, iCol),
23074 sqlite3_column_bytes(pSeek, iCol),
23075 (void*)&ctx,
23076 fts5StorageInsertCallback
23077 );
23078 p->aTotalSize[iCol-1] -= (i64)ctx.szCol;
23079 }
23080 p->nTotalRow--;
23081 }
23082 rc2 = sqlite3_reset(pSeek);
23083 if( rc==SQLITE_OK ) rc = rc2;
23084 }
23085
23086 return rc;
23087 }
23088
23089
23090 /*
23091 ** Insert a record into the %_docsize table. Specifically, do:
23092 **
23093 ** INSERT OR REPLACE INTO %_docsize(id, sz) VALUES(iRowid, pBuf);
23094 **
23095 ** If there is no %_docsize table (as happens if the columnsize=0 option
23096 ** is specified when the FTS5 table is created), this function is a no-op.
23097 */
23098 static int fts5StorageInsertDocsize(
23099 Fts5Storage *p, /* Storage module to write to */
23100 i64 iRowid, /* id value */
23101 Fts5Buffer *pBuf /* sz value */
23102 ){
23103 int rc = SQLITE_OK;
23104 if( p->pConfig->bColumnsize ){
23105 sqlite3_stmt *pReplace = 0;
23106 rc = fts5StorageGetStmt(p, FTS5_STMT_REPLACE_DOCSIZE, &pReplace, 0);
23107 if( rc==SQLITE_OK ){
23108 sqlite3_bind_int64(pReplace, 1, iRowid);
23109 sqlite3_bind_blob(pReplace, 2, pBuf->p, pBuf->n, SQLITE_STATIC);
23110 sqlite3_step(pReplace);
23111 rc = sqlite3_reset(pReplace);
23112 }
23113 }
23114 return rc;
23115 }
23116
23117 /*
23118 ** Load the contents of the "averages" record from disk into the
23119 ** p->nTotalRow and p->aTotalSize[] variables. If successful, and if
23120 ** argument bCache is true, set the p->bTotalsValid flag to indicate
23121 ** that the contents of aTotalSize[] and nTotalRow are valid until
23122 ** further notice.
23123 **
23124 ** Return SQLITE_OK if successful, or an SQLite error code if an error
23125 ** occurs.
23126 */
23127 static int fts5StorageLoadTotals(Fts5Storage *p, int bCache){
23128 int rc = SQLITE_OK;
23129 if( p->bTotalsValid==0 ){
23130 rc = sqlite3Fts5IndexGetAverages(p->pIndex, &p->nTotalRow, p->aTotalSize);
23131 p->bTotalsValid = bCache;
23132 }
23133 return rc;
23134 }
23135
23136 /*
23137 ** Store the current contents of the p->nTotalRow and p->aTotalSize[]
23138 ** variables in the "averages" record on disk.
23139 **
23140 ** Return SQLITE_OK if successful, or an SQLite error code if an error
23141 ** occurs.
23142 */
23143 static int fts5StorageSaveTotals(Fts5Storage *p){
23144 int nCol = p->pConfig->nCol;
23145 int i;
23146 Fts5Buffer buf;
23147 int rc = SQLITE_OK;
23148 memset(&buf, 0, sizeof(buf));
23149
23150 sqlite3Fts5BufferAppendVarint(&rc, &buf, p->nTotalRow);
23151 for(i=0; i<nCol; i++){
23152 sqlite3Fts5BufferAppendVarint(&rc, &buf, p->aTotalSize[i]);
23153 }
23154 if( rc==SQLITE_OK ){
23155 rc = sqlite3Fts5IndexSetAverages(p->pIndex, buf.p, buf.n);
23156 }
23157 sqlite3_free(buf.p);
23158
23159 return rc;
23160 }
23161
23162 /*
23163 ** Remove a row from the FTS table.
23164 */
23165 static int sqlite3Fts5StorageDelete(Fts5Storage *p, i64 iDel){
23166 Fts5Config *pConfig = p->pConfig;
23167 int rc;
23168 sqlite3_stmt *pDel = 0;
23169
23170 rc = fts5StorageLoadTotals(p, 1);
23171
23172 /* Delete the index records */
23173 if( rc==SQLITE_OK ){
23174 rc = fts5StorageDeleteFromIndex(p, iDel);
23175 }
23176
23177 /* Delete the %_docsize record */
23178 if( rc==SQLITE_OK && pConfig->bColumnsize ){
23179 rc = fts5StorageGetStmt(p, FTS5_STMT_DELETE_DOCSIZE, &pDel, 0);
23180 if( rc==SQLITE_OK ){
23181 sqlite3_bind_int64(pDel, 1, iDel);
23182 sqlite3_step(pDel);
23183 rc = sqlite3_reset(pDel);
23184 }
23185 }
23186
23187 /* Delete the %_content record */
23188 if( pConfig->eContent==FTS5_CONTENT_NORMAL ){
23189 if( rc==SQLITE_OK ){
23190 rc = fts5StorageGetStmt(p, FTS5_STMT_DELETE_CONTENT, &pDel, 0);
23191 }
23192 if( rc==SQLITE_OK ){
23193 sqlite3_bind_int64(pDel, 1, iDel);
23194 sqlite3_step(pDel);
23195 rc = sqlite3_reset(pDel);
23196 }
23197 }
23198
23199 /* Write the averages record */
23200 if( rc==SQLITE_OK ){
23201 rc = fts5StorageSaveTotals(p);
23202 }
23203
23204 return rc;
23205 }
23206
23207 static int sqlite3Fts5StorageSpecialDelete(
23208 Fts5Storage *p,
23209 i64 iDel,
23210 sqlite3_value **apVal
23211 ){
23212 Fts5Config *pConfig = p->pConfig;
23213 int rc;
23214 sqlite3_stmt *pDel = 0;
23215
23216 assert( pConfig->eContent!=FTS5_CONTENT_NORMAL );
23217 rc = fts5StorageLoadTotals(p, 1);
23218
23219 /* Delete the index records */
23220 if( rc==SQLITE_OK ){
23221 int iCol;
23222 Fts5InsertCtx ctx;
23223 ctx.pStorage = p;
23224 ctx.iCol = -1;
23225
23226 rc = sqlite3Fts5IndexBeginWrite(p->pIndex, 1, iDel);
23227 for(iCol=0; rc==SQLITE_OK && iCol<pConfig->nCol; iCol++){
23228 if( pConfig->abUnindexed[iCol] ) continue;
23229 ctx.szCol = 0;
23230 rc = sqlite3Fts5Tokenize(pConfig,
23231 FTS5_TOKENIZE_DOCUMENT,
23232 (const char*)sqlite3_value_text(apVal[iCol]),
23233 sqlite3_value_bytes(apVal[iCol]),
23234 (void*)&ctx,
23235 fts5StorageInsertCallback
23236 );
23237 p->aTotalSize[iCol] -= (i64)ctx.szCol;
23238 }
23239 p->nTotalRow--;
23240 }
23241
23242 /* Delete the %_docsize record */
23243 if( pConfig->bColumnsize ){
23244 if( rc==SQLITE_OK ){
23245 rc = fts5StorageGetStmt(p, FTS5_STMT_DELETE_DOCSIZE, &pDel, 0);
23246 }
23247 if( rc==SQLITE_OK ){
23248 sqlite3_bind_int64(pDel, 1, iDel);
23249 sqlite3_step(pDel);
23250 rc = sqlite3_reset(pDel);
23251 }
23252 }
23253
23254 /* Write the averages record */
23255 if( rc==SQLITE_OK ){
23256 rc = fts5StorageSaveTotals(p);
23257 }
23258
23259 return rc;
23260 }
23261
23262 /*
23263 ** Delete all entries in the FTS5 index.
23264 */
23265 static int sqlite3Fts5StorageDeleteAll(Fts5Storage *p){
23266 Fts5Config *pConfig = p->pConfig;
23267 int rc;
23268
23269 /* Delete the contents of the %_data, %_idx and (if present) %_docsize tables. */
23270 rc = fts5ExecPrintf(pConfig->db, 0,
23271 "DELETE FROM %Q.'%q_data';"
23272 "DELETE FROM %Q.'%q_idx';",
23273 pConfig->zDb, pConfig->zName,
23274 pConfig->zDb, pConfig->zName
23275 );
23276 if( rc==SQLITE_OK && pConfig->bColumnsize ){
23277 rc = fts5ExecPrintf(pConfig->db, 0,
23278 "DELETE FROM %Q.'%q_docsize';",
23279 pConfig->zDb, pConfig->zName
23280 );
23281 }
23282
23283 /* Reinitialize the %_data table. This call creates the initial structure
23284 ** and averages records. */
23285 if( rc==SQLITE_OK ){
23286 rc = sqlite3Fts5IndexReinit(p->pIndex);
23287 }
23288 if( rc==SQLITE_OK ){
23289 rc = sqlite3Fts5StorageConfigValue(p, "version", 0, FTS5_CURRENT_VERSION);
23290 }
23291 return rc;
23292 }
23293
23294 static int sqlite3Fts5StorageRebuild(Fts5Storage *p){
23295 Fts5Buffer buf = {0,0,0};
23296 Fts5Config *pConfig = p->pConfig;
23297 sqlite3_stmt *pScan = 0;
23298 Fts5InsertCtx ctx;
23299 int rc;
23300
23301 memset(&ctx, 0, sizeof(Fts5InsertCtx));
23302 ctx.pStorage = p;
23303 rc = sqlite3Fts5StorageDeleteAll(p);
23304 if( rc==SQLITE_OK ){
23305 rc = fts5StorageLoadTotals(p, 1);
23306 }
23307
23308 if( rc==SQLITE_OK ){
23309 rc = fts5StorageGetStmt(p, FTS5_STMT_SCAN, &pScan, 0);
23310 }
23311
23312 while( rc==SQLITE_OK && SQLITE_ROW==sqlite3_step(pScan) ){
23313 i64 iRowid = sqlite3_column_int64(pScan, 0);
23314
23315 sqlite3Fts5BufferZero(&buf);
23316 rc = sqlite3Fts5IndexBeginWrite(p->pIndex, 0, iRowid);
23317 for(ctx.iCol=0; rc==SQLITE_OK && ctx.iCol<pConfig->nCol; ctx.iCol++){
23318 ctx.szCol = 0;
23319 if( pConfig->abUnindexed[ctx.iCol]==0 ){
23320 rc = sqlite3Fts5Tokenize(pConfig,
23321 FTS5_TOKENIZE_DOCUMENT,
23322 (const char*)sqlite3_column_text(pScan, ctx.iCol+1),
23323 sqlite3_column_bytes(pScan, ctx.iCol+1),
23324 (void*)&ctx,
23325 fts5StorageInsertCallback
23326 );
23327 }
23328 sqlite3Fts5BufferAppendVarint(&rc, &buf, ctx.szCol);
23329 p->aTotalSize[ctx.iCol] += (i64)ctx.szCol;
23330 }
23331 p->nTotalRow++;
23332
23333 if( rc==SQLITE_OK ){
23334 rc = fts5StorageInsertDocsize(p, iRowid, &buf);
23335 }
23336 }
23337 sqlite3_free(buf.p);
23338
23339 /* Write the averages record */
23340 if( rc==SQLITE_OK ){
23341 rc = fts5StorageSaveTotals(p);
23342 }
23343 return rc;
23344 }
23345
23346 static int sqlite3Fts5StorageOptimize(Fts5Storage *p){
23347 return sqlite3Fts5IndexOptimize(p->pIndex);
23348 }
23349
23350 static int sqlite3Fts5StorageMerge(Fts5Storage *p, int nMerge){
23351 return sqlite3Fts5IndexMerge(p->pIndex, nMerge);
23352 }
23353
23354 /*
23355 ** Allocate a new rowid. This is used for "external content" tables when
23356 ** a NULL value is inserted into the rowid column. The new rowid is allocated
23357 ** by inserting a dummy row into the %_docsize table. The dummy will be
23358 ** overwritten later.
23359 **
23360 ** If the %_docsize table does not exist, SQLITE_MISMATCH is returned. In
23361 ** this case the user is required to provide a rowid explicitly.
23362 */
23363 static int fts5StorageNewRowid(Fts5Storage *p, i64 *piRowid){
23364 int rc = SQLITE_MISMATCH;
23365 if( p->pConfig->bColumnsize ){
23366 sqlite3_stmt *pReplace = 0;
23367 rc = fts5StorageGetStmt(p, FTS5_STMT_REPLACE_DOCSIZE, &pReplace, 0);
23368 if( rc==SQLITE_OK ){
23369 sqlite3_bind_null(pReplace, 1);
23370 sqlite3_bind_null(pReplace, 2);
23371 sqlite3_step(pReplace);
23372 rc = sqlite3_reset(pReplace);
23373 }
23374 if( rc==SQLITE_OK ){
23375 *piRowid = sqlite3_last_insert_rowid(p->pConfig->db);
23376 }
23377 }
23378 return rc;
23379 }
23380
23381 /*
23382 ** Insert a new row into the FTS content table.
23383 */
23384 static int sqlite3Fts5StorageContentInsert(
23385 Fts5Storage *p,
23386 sqlite3_value **apVal,
23387 i64 *piRowid
23388 ){
23389 Fts5Config *pConfig = p->pConfig;
23390 int rc = SQLITE_OK;
23391
23392 /* Insert the new row into the %_content table. */
23393 if( pConfig->eContent!=FTS5_CONTENT_NORMAL ){
23394 if( sqlite3_value_type(apVal[1])==SQLITE_INTEGER ){
23395 *piRowid = sqlite3_value_int64(apVal[1]);
23396 }else{
23397 rc = fts5StorageNewRowid(p, piRowid);
23398 }
23399 }else{
23400 sqlite3_stmt *pInsert = 0; /* Statement to write %_content table */
23401 int i; /* Counter variable */
23402 rc = fts5StorageGetStmt(p, FTS5_STMT_INSERT_CONTENT, &pInsert, 0);
23403 for(i=1; rc==SQLITE_OK && i<=pConfig->nCol+1; i++){
23404 rc = sqlite3_bind_value(pInsert, i, apVal[i]);
23405 }
23406 if( rc==SQLITE_OK ){
23407 sqlite3_step(pInsert);
23408 rc = sqlite3_reset(pInsert);
23409 }
23410 *piRowid = sqlite3_last_insert_rowid(pConfig->db);
23411 }
23412
23413 return rc;
23414 }
23415
23416 /*
23417 ** Insert new entries into the FTS index and %_docsize table.
23418 */
23419 static int sqlite3Fts5StorageIndexInsert(
23420 Fts5Storage *p,
23421 sqlite3_value **apVal,
23422 i64 iRowid
23423 ){
23424 Fts5Config *pConfig = p->pConfig;
23425 int rc = SQLITE_OK; /* Return code */
23426 Fts5InsertCtx ctx; /* Tokenization callback context object */
23427 Fts5Buffer buf; /* Buffer used to build up %_docsize blob */
23428
23429 memset(&buf, 0, sizeof(Fts5Buffer));
23430 ctx.pStorage = p;
23431 rc = fts5StorageLoadTotals(p, 1);
23432
23433 if( rc==SQLITE_OK ){
23434 rc = sqlite3Fts5IndexBeginWrite(p->pIndex, 0, iRowid);
23435 }
23436 for(ctx.iCol=0; rc==SQLITE_OK && ctx.iCol<pConfig->nCol; ctx.iCol++){
23437 ctx.szCol = 0;
23438 if( pConfig->abUnindexed[ctx.iCol]==0 ){
23439 rc = sqlite3Fts5Tokenize(pConfig,
23440 FTS5_TOKENIZE_DOCUMENT,
23441 (const char*)sqlite3_value_text(apVal[ctx.iCol+2]),
23442 sqlite3_value_bytes(apVal[ctx.iCol+2]),
23443 (void*)&ctx,
23444 fts5StorageInsertCallback
23445 );
23446 }
23447 sqlite3Fts5BufferAppendVarint(&rc, &buf, ctx.szCol);
23448 p->aTotalSize[ctx.iCol] += (i64)ctx.szCol;
23449 }
23450 p->nTotalRow++;
23451
23452 /* Write the %_docsize record */
23453 if( rc==SQLITE_OK ){
23454 rc = fts5StorageInsertDocsize(p, iRowid, &buf);
23455 }
23456 sqlite3_free(buf.p);
23457
23458 /* Write the averages record */
23459 if( rc==SQLITE_OK ){
23460 rc = fts5StorageSaveTotals(p);
23461 }
23462
23463 return rc;
23464 }
23465
23466 static int fts5StorageCount(Fts5Storage *p, const char *zSuffix, i64 *pnRow){
23467 Fts5Config *pConfig = p->pConfig;
23468 char *zSql;
23469 int rc;
23470
23471 zSql = sqlite3_mprintf("SELECT count(*) FROM %Q.'%q_%s'",
23472 pConfig->zDb, pConfig->zName, zSuffix
23473 );
23474 if( zSql==0 ){
23475 rc = SQLITE_NOMEM;
23476 }else{
23477 sqlite3_stmt *pCnt = 0;
23478 rc = sqlite3_prepare_v2(pConfig->db, zSql, -1, &pCnt, 0);
23479 if( rc==SQLITE_OK ){
23480 if( SQLITE_ROW==sqlite3_step(pCnt) ){
23481 *pnRow = sqlite3_column_int64(pCnt, 0);
23482 }
23483 rc = sqlite3_finalize(pCnt);
23484 }
23485 }
23486
23487 sqlite3_free(zSql);
23488 return rc;
23489 }
23490
23491 /*
23492 ** Context object used by sqlite3Fts5StorageIntegrity().
23493 */
23494 typedef struct Fts5IntegrityCtx Fts5IntegrityCtx;
23495 struct Fts5IntegrityCtx {
23496 i64 iRowid;
23497 int iCol;
23498 int szCol;
23499 u64 cksum;
23500 Fts5Config *pConfig;
23501 };
23502
23503 /*
23504 ** Tokenization callback used by integrity check.
23505 */
23506 static int fts5StorageIntegrityCallback(
23507 void *pContext, /* Pointer to Fts5IntegrityCtx object */
23508 int tflags,
23509 const char *pToken, /* Buffer containing token */
23510 int nToken, /* Size of token in bytes */
23511 int iStart, /* Start offset of token */
23512 int iEnd /* End offset of token */
23513 ){
23514 Fts5IntegrityCtx *pCtx = (Fts5IntegrityCtx*)pContext;
23515 if( (tflags & FTS5_TOKEN_COLOCATED)==0 || pCtx->szCol==0 ){
23516 pCtx->szCol++;
23517 }
23518 pCtx->cksum ^= sqlite3Fts5IndexCksum(
23519 pCtx->pConfig, pCtx->iRowid, pCtx->iCol, pCtx->szCol-1, pToken, nToken
23520 );
23521 return SQLITE_OK;
23522 }
23523
23524 /*
23525 ** Check that the contents of the FTS index match that of the %_content
23526 ** table. Return SQLITE_OK if they do, or SQLITE_CORRUPT if not. Return
23527 ** some other SQLite error code if an error occurs while attempting to
23528 ** determine this.
23529 */
23530 static int sqlite3Fts5StorageIntegrity(Fts5Storage *p){
23531 Fts5Config *pConfig = p->pConfig;
23532 int rc; /* Return code */
23533 int *aColSize; /* Array of size pConfig->nCol */
23534 i64 *aTotalSize; /* Array of size pConfig->nCol */
23535 Fts5IntegrityCtx ctx;
23536 sqlite3_stmt *pScan;
23537
23538 memset(&ctx, 0, sizeof(Fts5IntegrityCtx));
23539 ctx.pConfig = p->pConfig;
23540 aTotalSize = (i64*)sqlite3_malloc(pConfig->nCol * (sizeof(int)+sizeof(i64)));
23541 if( !aTotalSize ) return SQLITE_NOMEM;
23542 aColSize = (int*)&aTotalSize[pConfig->nCol];
23543 memset(aTotalSize, 0, sizeof(i64) * pConfig->nCol);
23544
23545 /* Generate the expected index checksum based on the contents of the
23546 ** %_content table. This block stores the checksum in ctx.cksum. */
23547 rc = fts5StorageGetStmt(p, FTS5_STMT_SCAN, &pScan, 0);
23548 if( rc==SQLITE_OK ){
23549 int rc2;
23550 while( SQLITE_ROW==sqlite3_step(pScan) ){
23551 int i;
23552 ctx.iRowid = sqlite3_column_int64(pScan, 0);
23553 ctx.szCol = 0;
23554 if( pConfig->bColumnsize ){
23555 rc = sqlite3Fts5StorageDocsize(p, ctx.iRowid, aColSize);
23556 }
23557 for(i=0; rc==SQLITE_OK && i<pConfig->nCol; i++){
23558 if( pConfig->abUnindexed[i] ) continue;
23559 ctx.iCol = i;
23560 ctx.szCol = 0;
23561 rc = sqlite3Fts5Tokenize(pConfig,
23562 FTS5_TOKENIZE_DOCUMENT,
23563 (const char*)sqlite3_column_text(pScan, i+1),
23564 sqlite3_column_bytes(pScan, i+1),
23565 (void*)&ctx,
23566 fts5StorageIntegrityCallback
23567 );
23568 if( pConfig->bColumnsize && ctx.szCol!=aColSize[i] ){
23569 rc = FTS5_CORRUPT;
23570 }
23571 aTotalSize[i] += ctx.szCol;
23572 }
23573 if( rc!=SQLITE_OK ) break;
23574 }
23575 rc2 = sqlite3_reset(pScan);
23576 if( rc==SQLITE_OK ) rc = rc2;
23577 }
23578
23579 /* Test that the "totals" (sometimes called "averages") record looks Ok */
23580 if( rc==SQLITE_OK ){
23581 int i;
23582 rc = fts5StorageLoadTotals(p, 0);
23583 for(i=0; rc==SQLITE_OK && i<pConfig->nCol; i++){
23584 if( p->aTotalSize[i]!=aTotalSize[i] ) rc = FTS5_CORRUPT;
23585 }
23586 }
23587
23588 /* Check that the %_docsize and %_content tables contain the expected
23589 ** number of rows. */
23590 if( rc==SQLITE_OK && pConfig->eContent==FTS5_CONTENT_NORMAL ){
23591 i64 nRow = 0;
23592 rc = fts5StorageCount(p, "content", &nRow);
23593 if( rc==SQLITE_OK && nRow!=p->nTotalRow ) rc = FTS5_CORRUPT;
23594 }
23595 if( rc==SQLITE_OK && pConfig->bColumnsize ){
23596 i64 nRow = 0;
23597 rc = fts5StorageCount(p, "docsize", &nRow);
23598 if( rc==SQLITE_OK && nRow!=p->nTotalRow ) rc = FTS5_CORRUPT;
23599 }
23600
23601 /* Pass the expected checksum down to the FTS index module. It will
23602 ** verify, amongst other things, that it matches the checksum generated by
23603 ** inspecting the index itself. */
23604 if( rc==SQLITE_OK ){
23605 rc = sqlite3Fts5IndexIntegrityCheck(p->pIndex, ctx.cksum);
23606 }
23607
23608 sqlite3_free(aTotalSize);
23609 return rc;
23610 }
23611
23612 /*
23613 ** Obtain an SQLite statement handle that may be used to read data from the
23614 ** %_content table.
23615 */
23616 static int sqlite3Fts5StorageStmt(
23617 Fts5Storage *p,
23618 int eStmt,
23619 sqlite3_stmt **pp,
23620 char **pzErrMsg
23621 ){
23622 int rc;
23623 assert( eStmt==FTS5_STMT_SCAN_ASC
23624 || eStmt==FTS5_STMT_SCAN_DESC
23625 || eStmt==FTS5_STMT_LOOKUP
23626 );
23627 rc = fts5StorageGetStmt(p, eStmt, pp, pzErrMsg);
23628 if( rc==SQLITE_OK ){
23629 assert( p->aStmt[eStmt]==*pp );
23630 p->aStmt[eStmt] = 0;
23631 }
23632 return rc;
23633 }
23634
23635 /*
23636 ** Release an SQLite statement handle obtained via an earlier call to
23637 ** sqlite3Fts5StorageStmt(). The eStmt parameter passed to this function
23638 ** must match that passed to the sqlite3Fts5StorageStmt() call.
23639 */
23640 static void sqlite3Fts5StorageStmtRelease(
23641 Fts5Storage *p,
23642 int eStmt,
23643 sqlite3_stmt *pStmt
23644 ){
23645 assert( eStmt==FTS5_STMT_SCAN_ASC
23646 || eStmt==FTS5_STMT_SCAN_DESC
23647 || eStmt==FTS5_STMT_LOOKUP
23648 );
23649 if( p->aStmt[eStmt]==0 ){
23650 sqlite3_reset(pStmt);
23651 p->aStmt[eStmt] = pStmt;
23652 }else{
23653 sqlite3_finalize(pStmt);
23654 }
23655 }
23656
23657 static int fts5StorageDecodeSizeArray(
23658 int *aCol, int nCol, /* Array to populate */
23659 const u8 *aBlob, int nBlob /* Record to read varints from */
23660 ){
23661 int i;
23662 int iOff = 0;
23663 for(i=0; i<nCol; i++){
23664 if( iOff>=nBlob ) return 1;
23665 iOff += fts5GetVarint32(&aBlob[iOff], aCol[i]);
23666 }
23667 return (iOff!=nBlob);
23668 }
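The decoder above treats a %_docsize blob as exactly nCol varints with no trailing bytes. The sketch below shows that contract in isolation. It assumes a simplified big-endian base-128 varint limited to values below 2^28, and it uses a bounds-checked reader rather than fts5GetVarint32(). put_varint(), get_varint() and decode_size_array() are hypothetical names.

```c
#include <assert.h>

/* Hypothetical helper: write v (< 2^28) as a big-endian base-128 varint.
** Returns the number of bytes written (1..4). */
static int put_varint(unsigned char *a, unsigned int v){
  unsigned char tmp[4];
  int n = 0;
  int i;
  assert( v<(1u<<28) );
  do{
    tmp[n++] = (unsigned char)(v & 0x7f);
    v >>= 7;
  }while( v && n<4 );
  for(i=0; i<n; i++){
    /* Every byte except the last carries the continuation bit. */
    a[i] = (unsigned char)(tmp[n-1-i] | (i<n-1 ? 0x80 : 0x00));
  }
  return n;
}

/* Hypothetical helper: read one varint from at most nAvail bytes. Returns
** the bytes consumed, or 0 if the input is truncated or too long. */
static int get_varint(const unsigned char *a, int nAvail, int *pv){
  unsigned int v = 0;
  int i;
  for(i=0; i<nAvail && i<4; i++){
    v = (v<<7) | (unsigned int)(a[i] & 0x7f);
    if( (a[i] & 0x80)==0 ){
      *pv = (int)v;
      return i+1;
    }
  }
  return 0;
}

/* Same contract as the function above: 0 on success, 1 if the blob does
** not hold exactly nCol varints. */
static int decode_size_array(
  int *aCol, int nCol,
  const unsigned char *aBlob, int nBlob
){
  int i;
  int iOff = 0;
  for(i=0; i<nCol; i++){
    int n;
    if( iOff>=nBlob ) return 1;
    n = get_varint(&aBlob[iOff], nBlob-iOff, &aCol[i]);
    if( n==0 ) return 1;
    iOff += n;
  }
  return (iOff!=nBlob);
}
```

Both a short blob and a blob with leftover bytes are reported as a mismatch, which the caller in the source maps to FTS5_CORRUPT.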
23669
23670 /*
23671 ** Argument aCol points to an array of integers containing one entry for
23672 ** each table column. This function reads the %_docsize record for the
23673 ** specified rowid and populates aCol[] with the results.
23674 **
23675 ** An SQLite error code is returned if an error occurs, or SQLITE_OK
23676 ** otherwise.
23677 */
23678 static int sqlite3Fts5StorageDocsize(Fts5Storage *p, i64 iRowid, int *aCol){
23679 int nCol = p->pConfig->nCol; /* Number of user columns in table */
23680 sqlite3_stmt *pLookup = 0; /* Statement to query %_docsize */
23681 int rc; /* Return Code */
23682
23683 assert( p->pConfig->bColumnsize );
23684 rc = fts5StorageGetStmt(p, FTS5_STMT_LOOKUP_DOCSIZE, &pLookup, 0);
23685 if( rc==SQLITE_OK ){
23686 int bCorrupt = 1;
23687 sqlite3_bind_int64(pLookup, 1, iRowid);
23688 if( SQLITE_ROW==sqlite3_step(pLookup) ){
23689 const u8 *aBlob = sqlite3_column_blob(pLookup, 0);
23690 int nBlob = sqlite3_column_bytes(pLookup, 0);
23691 if( 0==fts5StorageDecodeSizeArray(aCol, nCol, aBlob, nBlob) ){
23692 bCorrupt = 0;
23693 }
23694 }
23695 rc = sqlite3_reset(pLookup);
23696 if( bCorrupt && rc==SQLITE_OK ){
23697 rc = FTS5_CORRUPT;
23698 }
23699 }
23700
23701 return rc;
23702 }
23703
23704 static int sqlite3Fts5StorageSize(Fts5Storage *p, int iCol, i64 *pnToken){
23705 int rc = fts5StorageLoadTotals(p, 0);
23706 if( rc==SQLITE_OK ){
23707 *pnToken = 0;
23708 if( iCol<0 ){
23709 int i;
23710 for(i=0; i<p->pConfig->nCol; i++){
23711 *pnToken += p->aTotalSize[i];
23712 }
23713 }else if( iCol<p->pConfig->nCol ){
23714 *pnToken = p->aTotalSize[iCol];
23715 }else{
23716 rc = SQLITE_RANGE;
23717 }
23718 }
23719 return rc;
23720 }
23721
23722 static int sqlite3Fts5StorageRowCount(Fts5Storage *p, i64 *pnRow){
23723 int rc = fts5StorageLoadTotals(p, 0);
23724 if( rc==SQLITE_OK ){
23725 *pnRow = p->nTotalRow;
23726 }
23727 return rc;
23728 }
23729
23730 /*
23731 ** Flush any data currently held in-memory to disk.
23732 */
23733 static int sqlite3Fts5StorageSync(Fts5Storage *p, int bCommit){
23734 if( bCommit && p->bTotalsValid ){
23735 int rc = fts5StorageSaveTotals(p);
23736 p->bTotalsValid = 0;
23737 if( rc!=SQLITE_OK ) return rc;
23738 }
23739 return sqlite3Fts5IndexSync(p->pIndex, bCommit);
23740 }
23741
23742 static int sqlite3Fts5StorageRollback(Fts5Storage *p){
23743 p->bTotalsValid = 0;
23744 return sqlite3Fts5IndexRollback(p->pIndex);
23745 }
23746
23747 static int sqlite3Fts5StorageConfigValue(
23748 Fts5Storage *p,
23749 const char *z,
23750 sqlite3_value *pVal,
23751 int iVal
23752 ){
23753 sqlite3_stmt *pReplace = 0;
23754 int rc = fts5StorageGetStmt(p, FTS5_STMT_REPLACE_CONFIG, &pReplace, 0);
23755 if( rc==SQLITE_OK ){
23756 sqlite3_bind_text(pReplace, 1, z, -1, SQLITE_STATIC);
23757 if( pVal ){
23758 sqlite3_bind_value(pReplace, 2, pVal);
23759 }else{
23760 sqlite3_bind_int(pReplace, 2, iVal);
23761 }
23762 sqlite3_step(pReplace);
23763 rc = sqlite3_reset(pReplace);
23764 }
23765 if( rc==SQLITE_OK && pVal ){
23766 int iNew = p->pConfig->iCookie + 1;
23767 rc = sqlite3Fts5IndexSetCookie(p->pIndex, iNew);
23768 if( rc==SQLITE_OK ){
23769 p->pConfig->iCookie = iNew;
23770 }
23771 }
23772 return rc;
23773 }
23774
23775
23776
23777 /*
23778 ** 2014 May 31
23779 **
23780 ** The author disclaims copyright to this source code. In place of
23781 ** a legal notice, here is a blessing:
23782 **
23783 ** May you do good and not evil.
23784 ** May you find forgiveness for yourself and forgive others.
23785 ** May you share freely, never taking more than you give.
23786 **
23787 ******************************************************************************
23788 */
23789
23790
23791 /* #include "fts5Int.h" */
23792
23793 /**************************************************************************
23794 ** Start of ascii tokenizer implementation.
23795 */
23796
23797 /*
23798 ** For tokenizers with no "unicode" modifier, the set of token characters
23799 ** is the same as the set of ASCII range alphanumeric characters.
23800 */
23801 static unsigned char aAsciiTokenChar[128] = {
23802 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 0x00..0x0F */
23803 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 0x10..0x1F */
23804 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 0x20..0x2F */
23805 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, /* 0x30..0x3F */
23806 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 0x40..0x4F */
23807 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, /* 0x50..0x5F */
23808 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 0x60..0x6F */
23809 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, /* 0x70..0x7F */
23810 };
23811
23812 typedef struct AsciiTokenizer AsciiTokenizer;
23813 struct AsciiTokenizer {
23814 unsigned char aTokenChar[128];
23815 };
23816
23817 static void fts5AsciiAddExceptions(
23818 AsciiTokenizer *p,
23819 const char *zArg,
23820 int bTokenChars
23821 ){
23822 int i;
23823 for(i=0; zArg[i]; i++){
23824 if( (zArg[i] & 0x80)==0 ){
23825 p->aTokenChar[(int)zArg[i]] = (unsigned char)bTokenChars;
23826 }
23827 }
23828 }
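The effect of the "tokenchars" and "separators" options on the 128-entry table can be sketched as follows. The default table is rebuilt here from range checks rather than copied from aAsciiTokenChar[]. TokenTable, table_init() and table_add_exceptions() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <string.h>

typedef struct TokenTable {
  unsigned char aTokenChar[128];   /* 1 for token characters, 0 for separators */
} TokenTable;

/* Mark the ASCII alphanumerics as token characters, everything else as a
** separator. This reproduces the default table by computation. */
static void table_init(TokenTable *p){
  int c;
  memset(p, 0, sizeof(*p));
  for(c=0; c<128; c++){
    p->aTokenChar[c] = (unsigned char)(
        (c>='0' && c<='9') || (c>='A' && c<='Z') || (c>='a' && c<='z')
    );
  }
}

/* Override the classification of each ASCII byte in zArg. Bytes with the
** high bit set are ignored, as in the function above. */
static void table_add_exceptions(TokenTable *p, const char *zArg, int bTokenChars){
  int i;
  assert( bTokenChars==0 || bTokenChars==1 );
  for(i=0; zArg[i]; i++){
    if( (zArg[i] & 0x80)==0 ){
      p->aTokenChar[(int)zArg[i]] = (unsigned char)bTokenChars;
    }
  }
}
```

Calling table_add_exceptions(&t, "_-", 1) would make underscores and hyphens part of tokens, while table_add_exceptions(&t, "x", 0) would turn the letter x into a separator.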
23829
23830 /*
23831 ** Delete an "ascii" tokenizer.
23832 */
23833 static void fts5AsciiDelete(Fts5Tokenizer *p){
23834 sqlite3_free(p);
23835 }
23836
23837 /*
23838 ** Create an "ascii" tokenizer.
23839 */
23840 static int fts5AsciiCreate(
23841 void *pCtx,
23842 const char **azArg, int nArg,
23843 Fts5Tokenizer **ppOut
23844 ){
23845 int rc = SQLITE_OK;
23846 AsciiTokenizer *p = 0;
23847 if( nArg%2 ){
23848 rc = SQLITE_ERROR;
23849 }else{
23850 p = sqlite3_malloc(sizeof(AsciiTokenizer));
23851 if( p==0 ){
23852 rc = SQLITE_NOMEM;
23853 }else{
23854 int i;
23855 memset(p, 0, sizeof(AsciiTokenizer));
23856 memcpy(p->aTokenChar, aAsciiTokenChar, sizeof(aAsciiTokenChar));
23857 for(i=0; rc==SQLITE_OK && i<nArg; i+=2){
23858 const char *zArg = azArg[i+1];
23859 if( 0==sqlite3_stricmp(azArg[i], "tokenchars") ){
23860 fts5AsciiAddExceptions(p, zArg, 1);
23861 }else
23862 if( 0==sqlite3_stricmp(azArg[i], "separators") ){
23863 fts5AsciiAddExceptions(p, zArg, 0);
23864 }else{
23865 rc = SQLITE_ERROR;
23866 }
23867 }
23868 if( rc!=SQLITE_OK ){
23869 fts5AsciiDelete((Fts5Tokenizer*)p);
23870 p = 0;
23871 }
23872 }
23873 }
23874
23875 *ppOut = (Fts5Tokenizer*)p;
23876 return rc;
23877 }
23878
23879
23880 static void asciiFold(char *aOut, const char *aIn, int nByte){
23881 int i;
23882 for(i=0; i<nByte; i++){
23883 char c = aIn[i];
23884 if( c>='A' && c<='Z' ) c += 32;
23885 aOut[i] = c;
23886 }
23887 }
23888
23889 /*
23890 ** Tokenize some text using the ascii tokenizer.
23891 */
23892 static int fts5AsciiTokenize(
23893 Fts5Tokenizer *pTokenizer,
23894 void *pCtx,
23895 int flags,
23896 const char *pText, int nText,
23897 int (*xToken)(void*, int, const char*, int nToken, int iStart, int iEnd)
23898 ){
23899 AsciiTokenizer *p = (AsciiTokenizer*)pTokenizer;
23900 int rc = SQLITE_OK;
23901 int ie;
23902 int is = 0;
23903
23904 char aFold[64];
23905 int nFold = sizeof(aFold);
23906 char *pFold = aFold;
23907 unsigned char *a = p->aTokenChar;
23908
23909 while( is<nText && rc==SQLITE_OK ){
23910 int nByte;
23911
23912 /* Skip any leading divider characters. */
23913 while( is<nText && ((pText[is]&0x80)==0 && a[(int)pText[is]]==0) ){
23914 is++;
23915 }
23916 if( is==nText ) break;
23917
23918 /* Count the token characters */
23919 ie = is+1;
23920 while( ie<nText && ((pText[ie]&0x80) || a[(int)pText[ie]] ) ){
23921 ie++;
23922 }
23923
23924 /* Fold to lower case */
23925 nByte = ie-is;
23926 if( nByte>nFold ){
23927 if( pFold!=aFold ) sqlite3_free(pFold);
23928 pFold = sqlite3_malloc(nByte*2);
23929 if( pFold==0 ){
23930 rc = SQLITE_NOMEM;
23931 break;
23932 }
23933 nFold = nByte*2;
23934 }
23935 asciiFold(pFold, &pText[is], nByte);
23936
23937 /* Invoke the token callback */
23938 rc = xToken(pCtx, 0, pFold, nByte, is, ie);
23939 is = ie+1;
23940 }
23941
23942 if( pFold!=aFold ) sqlite3_free(pFold);
23943 if( rc==SQLITE_DONE ) rc = SQLITE_OK;
23944 return rc;
23945 }
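The scan loop of the ascii tokenizer (skip separators, measure the token, fold to lower case, invoke the callback with byte offsets) can be tried standalone. This is a sketch under stated assumptions: it classifies bytes with range checks instead of the aTokenChar[] table, and it truncates tokens longer than 64 bytes where the source grows the fold buffer. ascii_tokenize(), TokenLog and log_token() are hypothetical names.

```c
#include <assert.h>
#include <string.h>

typedef int (*token_cb)(void *ctx, const char *tok, int nTok, int iStart, int iEnd);

/* Bytes with the high bit set always count as token bytes. */
static int is_token_byte(unsigned char c){
  if( c & 0x80 ) return 1;
  return (c>='0' && c<='9') || (c>='A' && c<='Z') || (c>='a' && c<='z');
}

static int ascii_tokenize(const char *pText, int nText, void *ctx, token_cb xToken){
  int rc = 0;
  int is = 0;                      /* Start offset of the current token */
  char aFold[64];
  while( is<nText && rc==0 ){
    int ie, n, i;
    /* Skip any leading separator bytes. */
    while( is<nText && !is_token_byte((unsigned char)pText[is]) ) is++;
    if( is==nText ) break;
    /* Find the end of the token. */
    ie = is+1;
    while( ie<nText && is_token_byte((unsigned char)pText[ie]) ) ie++;
    n = ie-is;
    if( n>(int)sizeof(aFold) ) n = (int)sizeof(aFold);  /* Sketch only: truncate */
    for(i=0; i<n; i++){
      char c = pText[is+i];
      if( c>='A' && c<='Z' ) c += 32;
      aFold[i] = c;
    }
    rc = xToken(ctx, aFold, n, is, ie);
    is = ie+1;                     /* pText[ie] is a separator or the end */
  }
  return rc;
}

/* Hypothetical collector used to observe the callback arguments. */
typedef struct TokenLog {
  int n;
  char aTok[8][16];
  int aStart[8];
  int aEnd[8];
} TokenLog;

static int log_token(void *ctx, const char *tok, int nTok, int iStart, int iEnd){
  TokenLog *p = (TokenLog*)ctx;
  assert( nTok>0 );
  if( p->n>=8 ) return 1;          /* A non-zero return stops the scan */
  if( nTok>15 ) nTok = 15;
  memcpy(p->aTok[p->n], tok, (size_t)nTok);
  p->aTok[p->n][nTok] = '\0';
  p->aStart[p->n] = iStart;
  p->aEnd[p->n] = iEnd;
  p->n++;
  return 0;
}
```

The offsets passed to the callback refer to the original text, not to the folded copy, so callers can map each token back to its source bytes.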
23946
23947 /**************************************************************************
23948 ** Start of unicode61 tokenizer implementation.
23949 */
23950
23951
23952 /*
23953 ** The following two macros - READ_UTF8 and WRITE_UTF8 - have been copied
23954 ** from the sqlite3 source file utf.c. If this file is compiled as part
23955 ** of the amalgamation, they are not required.
23956 */
23957 #ifndef SQLITE_AMALGAMATION
23958
23959 static const unsigned char sqlite3Utf8Trans1[] = {
23960 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
23961 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
23962 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
23963 0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f,
23964 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
23965 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
23966 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
23967 0x00, 0x01, 0x02, 0x03, 0x00, 0x01, 0x00, 0x00,
23968 };
23969
23970 #define READ_UTF8(zIn, zTerm, c) \
23971 c = *(zIn++); \
23972 if( c>=0xc0 ){ \
23973 c = sqlite3Utf8Trans1[c-0xc0]; \
23974 while( zIn!=zTerm && (*zIn & 0xc0)==0x80 ){ \
23975 c = (c<<6) + (0x3f & *(zIn++)); \
23976 } \
23977 if( c<0x80 \
23978 || (c&0xFFFFF800)==0xD800 \
23979 || (c&0xFFFFFFFE)==0xFFFE ){ c = 0xFFFD; } \
23980 }
23981
23982
23983 #define WRITE_UTF8(zOut, c) { \
23984 if( c<0x00080 ){ \
23985 *zOut++ = (unsigned char)(c&0xFF); \
23986 } \
23987 else if( c<0x00800 ){ \
23988 *zOut++ = 0xC0 + (unsigned char)((c>>6)&0x1F); \
23989 *zOut++ = 0x80 + (unsigned char)(c & 0x3F); \
23990 } \
23991 else if( c<0x10000 ){ \
23992 *zOut++ = 0xE0 + (unsigned char)((c>>12)&0x0F); \
23993 *zOut++ = 0x80 + (unsigned char)((c>>6) & 0x3F); \
23994 *zOut++ = 0x80 + (unsigned char)(c & 0x3F); \
23995 }else{ \
23996 *zOut++ = 0xF0 + (unsigned char)((c>>18) & 0x07); \
23997 *zOut++ = 0x80 + (unsigned char)((c>>12) & 0x3F); \
23998 *zOut++ = 0x80 + (unsigned char)((c>>6) & 0x3F); \
23999 *zOut++ = 0x80 + (unsigned char)(c & 0x3F); \
24000 } \
24001 }
24002
24003 #endif /* ifndef SQLITE_AMALGAMATION */
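For readers who find the macro forms hard to follow, the same encoding and decoding steps can be written as functions. This is a sketch, not the code used by the patch: utf8_write() and utf8_read() are hypothetical names, and the lead-byte masks below replace the sqlite3Utf8Trans1[] lookup with equivalent bit masks.

```c
#include <assert.h>

/* Encode code point c into zOut. Returns the number of bytes written (1..4). */
static int utf8_write(unsigned char *zOut, unsigned int c){
  assert( zOut!=0 );
  if( c<0x00080 ){
    zOut[0] = (unsigned char)(c & 0xFF);
    return 1;
  }else if( c<0x00800 ){
    zOut[0] = (unsigned char)(0xC0 + ((c>>6) & 0x1F));
    zOut[1] = (unsigned char)(0x80 + (c & 0x3F));
    return 2;
  }else if( c<0x10000 ){
    zOut[0] = (unsigned char)(0xE0 + ((c>>12) & 0x0F));
    zOut[1] = (unsigned char)(0x80 + ((c>>6) & 0x3F));
    zOut[2] = (unsigned char)(0x80 + (c & 0x3F));
    return 3;
  }
  zOut[0] = (unsigned char)(0xF0 + ((c>>18) & 0x07));
  zOut[1] = (unsigned char)(0x80 + ((c>>12) & 0x3F));
  zOut[2] = (unsigned char)(0x80 + ((c>>6) & 0x3F));
  zOut[3] = (unsigned char)(0x80 + (c & 0x3F));
  return 4;
}

/* Decode one code point from [z, zTerm), which must be non-empty. Stores
** the result in *pc and returns the number of bytes consumed. */
static int utf8_read(
  const unsigned char *z, const unsigned char *zTerm, unsigned int *pc
){
  const unsigned char *zIn = z;
  unsigned int c = *(zIn++);
  if( c>=0xC0 ){
    /* Keep only the payload bits of the lead byte. */
    if( c<0xE0 ) c &= 0x1F;
    else if( c<0xF0 ) c &= 0x0F;
    else if( c<0xF8 ) c &= 0x07;
    else if( c<0xFC ) c &= 0x03;
    else if( c<0xFE ) c &= 0x01;
    else c = 0;
    while( zIn!=zTerm && (*zIn & 0xC0)==0x80 ){
      c = (c<<6) + (0x3Fu & *(zIn++));
    }
    /* Overlong forms, surrogates and 0xFFFE/0xFFFF become U+FFFD. */
    if( c<0x80
     || (c & 0xFFFFF800)==0xD800
     || (c & 0xFFFFFFFE)==0xFFFE ){ c = 0xFFFD; }
  }
  *pc = c;
  return (int)(zIn - z);
}
```

As in the macro, the decoder does not validate the number of continuation bytes against the lead byte; it consumes as many continuation bytes as are present.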
24004
24005 typedef struct Unicode61Tokenizer Unicode61Tokenizer;
24006 struct Unicode61Tokenizer {
24007 unsigned char aTokenChar[128]; /* ASCII range token characters */
24008 char *aFold; /* Buffer to fold text into */
24009 int nFold; /* Size of aFold[] in bytes */
24010 int bRemoveDiacritic; /* True if remove_diacritics=1 is set */
24011 int nException;
24012 int *aiException;
24013 };
24014
24015 static int fts5UnicodeAddExceptions(
24016 Unicode61Tokenizer *p, /* Tokenizer object */
24017 const char *z, /* Characters to treat as exceptions */
24018 int bTokenChars /* 1 for 'tokenchars', 0 for 'separators' */
24019 ){
24020 int rc = SQLITE_OK;
24021 int n = (int)strlen(z);
24022 int *aNew;
24023
24024 if( n>0 ){
24025 aNew = (int*)sqlite3_realloc(p->aiException, (n+p->nException)*sizeof(int));
24026 if( aNew ){
24027 int nNew = p->nException;
24028 const unsigned char *zCsr = (const unsigned char*)z;
24029 const unsigned char *zTerm = (const unsigned char*)&z[n];
24030 while( zCsr<zTerm ){
24031 int iCode;
24032 int bToken;
24033 READ_UTF8(zCsr, zTerm, iCode);
24034 if( iCode<128 ){
24035 p->aTokenChar[iCode] = (unsigned char)bTokenChars;
24036 }else{
24037 bToken = sqlite3Fts5UnicodeIsalnum(iCode);
24038 assert( (bToken==0 || bToken==1) );
24039 assert( (bTokenChars==0 || bTokenChars==1) );
24040 if( bToken!=bTokenChars && sqlite3Fts5UnicodeIsdiacritic(iCode)==0 ){
24041 int i;
24042 for(i=0; i<nNew; i++){
24043 if( aNew[i]>iCode ) break;
24044 }
24045 memmove(&aNew[i+1], &aNew[i], (nNew-i)*sizeof(int));
24046 aNew[i] = iCode;
24047 nNew++;
24048 }
24049 }
24050 }
24051 p->aiException = aNew;
24052 p->nException = nNew;
24053 }else{
24054 rc = SQLITE_NOMEM;
24055 }
24056 }
24057
24058 return rc;
24059 }
24060
24061 /*
24062 ** Return true if the p->aiException[] array contains the value iCode.
24063 */
24064 static int fts5UnicodeIsException(Unicode61Tokenizer *p, int iCode){
24065 if( p->nException>0 ){
24066 int *a = p->aiException;
24067 int iLo = 0;
24068 int iHi = p->nException-1;
24069
24070 while( iHi>=iLo ){
24071 int iTest = (iHi + iLo) / 2;
24072 if( iCode==a[iTest] ){
24073 return 1;
24074 }else if( iCode>a[iTest] ){
24075 iLo = iTest+1;
24076 }else{
24077 iHi = iTest-1;
24078 }
24079 }
24080 }
24081
24082 return 0;
24083 }
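Review note: fts5UnicodeAddExceptions() keeps aiException[] in ascending order so that fts5UnicodeIsException() can binary search it. The sketch below shows that pairing in isolation. It is an illustration only: `ExcSet`, `exc_insert` and `exc_contains` are hypothetical names, and a fixed-capacity array stands in for the sqlite3_realloc()-managed buffer.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical fixed-capacity stand-in for p->aiException[]. */
#define EXC_CAP 16
typedef struct ExcSet { int a[EXC_CAP]; int n; } ExcSet;

/* Insert iCode so that a[] stays sorted ascending.
** Returns 0 on success, -1 if the set is full. */
static int exc_insert(ExcSet *p, int iCode){
  int i;
  assert( p->n>=0 && p->n<=EXC_CAP );
  if( p->n>=EXC_CAP ) return -1;
  for(i=0; i<p->n; i++){
    if( p->a[i]>iCode ) break;
  }
  /* Shift the tail up by one slot to open a gap at index i. */
  memmove(&p->a[i+1], &p->a[i], (size_t)(p->n-i)*sizeof(int));
  p->a[i] = iCode;
  p->n++;
  return 0;
}

/* Binary search over the sorted array. Returns 1 if iCode is present. */
static int exc_contains(const ExcSet *p, int iCode){
  int iLo = 0;
  int iHi = p->n-1;
  while( iHi>=iLo ){
    int iTest = iLo + (iHi-iLo)/2;
    if( iCode==p->a[iTest] ) return 1;
    if( iCode>p->a[iTest] ){
      iLo = iTest+1;
    }else{
      iHi = iTest-1;
    }
  }
  return 0;
}
```

Like the code under review, the sketch does not deduplicate, so passing the same codepoint twice stores it twice. Lookups still work in that case.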
24084
24085 /*
24086 ** Delete a "unicode61" tokenizer.
24087 */
24088 static void fts5UnicodeDelete(Fts5Tokenizer *pTok){
24089 if( pTok ){
24090 Unicode61Tokenizer *p = (Unicode61Tokenizer*)pTok;
24091 sqlite3_free(p->aiException);
24092 sqlite3_free(p->aFold);
24093 sqlite3_free(p);
24094 }
24095 return;
24096 }
24097
24098 /*
24099 ** Create a "unicode61" tokenizer.
24100 */
24101 static int fts5UnicodeCreate(
24102 void *pCtx,
24103 const char **azArg, int nArg,
24104 Fts5Tokenizer **ppOut
24105 ){
24106 int rc = SQLITE_OK; /* Return code */
24107 Unicode61Tokenizer *p = 0; /* New tokenizer object */
24108
24109 if( nArg%2 ){
24110 rc = SQLITE_ERROR;
24111 }else{
24112 p = (Unicode61Tokenizer*)sqlite3_malloc(sizeof(Unicode61Tokenizer));
24113 if( p ){
24114 int i;
24115 memset(p, 0, sizeof(Unicode61Tokenizer));
24116 memcpy(p->aTokenChar, aAsciiTokenChar, sizeof(aAsciiTokenChar));
24117 p->bRemoveDiacritic = 1;
24118 p->nFold = 64;
24119 p->aFold = sqlite3_malloc(p->nFold * sizeof(char));
24120 if( p->aFold==0 ){
24121 rc = SQLITE_NOMEM;
24122 }
24123 for(i=0; rc==SQLITE_OK && i<nArg; i+=2){
24124 const char *zArg = azArg[i+1];
24125 if( 0==sqlite3_stricmp(azArg[i], "remove_diacritics") ){
24126 if( (zArg[0]!='0' && zArg[0]!='1') || zArg[1] ){
24127 rc = SQLITE_ERROR;
24128 }
24129 p->bRemoveDiacritic = (zArg[0]=='1');
24130 }else
24131 if( 0==sqlite3_stricmp(azArg[i], "tokenchars") ){
24132 rc = fts5UnicodeAddExceptions(p, zArg, 1);
24133 }else
24134 if( 0==sqlite3_stricmp(azArg[i], "separators") ){
24135 rc = fts5UnicodeAddExceptions(p, zArg, 0);
24136 }else{
24137 rc = SQLITE_ERROR;
24138 }
24139 }
24140 }else{
24141 rc = SQLITE_NOMEM;
24142 }
24143 if( rc!=SQLITE_OK ){
24144 fts5UnicodeDelete((Fts5Tokenizer*)p);
24145 p = 0;
24146 }
24147 *ppOut = (Fts5Tokenizer*)p;
24148 }
24149 return rc;
24150 }
24151
24152 /*
24153 ** Return true if, for the purposes of tokenizing with the tokenizer
24154 ** passed as the first argument, codepoint iCode is considered a token
24155 ** character (not a separator).
24156 */
24157 static int fts5UnicodeIsAlnum(Unicode61Tokenizer *p, int iCode){
24158 assert( (sqlite3Fts5UnicodeIsalnum(iCode) & 0xFFFFFFFE)==0 );
24159 return sqlite3Fts5UnicodeIsalnum(iCode) ^ fts5UnicodeIsException(p, iCode);
24160 }
24161
24162 static int fts5UnicodeTokenize(
24163 Fts5Tokenizer *pTokenizer,
24164 void *pCtx,
24165 int flags,
24166 const char *pText, int nText,
24167 int (*xToken)(void*, int, const char*, int nToken, int iStart, int iEnd)
24168 ){
24169 Unicode61Tokenizer *p = (Unicode61Tokenizer*)pTokenizer;
24170 int rc = SQLITE_OK;
24171 unsigned char *a = p->aTokenChar;
24172
24173 unsigned char *zTerm = (unsigned char*)&pText[nText];
24174 unsigned char *zCsr = (unsigned char *)pText;
24175
24176 /* Output buffer */
24177 char *aFold = p->aFold;
24178 int nFold = p->nFold;
24179 const char *pEnd = &aFold[nFold-6];
24180
24181 /* Each iteration of this loop gobbles up a contiguous run of separators,
24182 ** then the next token. */
24183 while( rc==SQLITE_OK ){
24184 int iCode; /* non-ASCII codepoint read from input */
24185 char *zOut = aFold;
24186 int is;
24187 int ie;
24188
24189 /* Skip any separator characters. */
24190 while( 1 ){
24191 if( zCsr>=zTerm ) goto tokenize_done;
24192 if( *zCsr & 0x80 ) {
24193 /* A character outside of the ascii range. Skip past it if it is
24194 ** a separator character. Or break out of the loop if it is not. */
24195 is = zCsr - (unsigned char*)pText;
24196 READ_UTF8(zCsr, zTerm, iCode);
24197 if( fts5UnicodeIsAlnum(p, iCode) ){
24198 goto non_ascii_tokenchar;
24199 }
24200 }else{
24201 if( a[*zCsr] ){
24202 is = zCsr - (unsigned char*)pText;
24203 goto ascii_tokenchar;
24204 }
24205 zCsr++;
24206 }
24207 }
24208
24209 /* Run through the tokenchars. Fold them into the output buffer along
24210 ** the way. */
24211 while( zCsr<zTerm ){
24212
24213 /* Grow the output buffer so that there is sufficient space to fit the
24214 ** largest possible utf-8 character. */
24215 if( zOut>pEnd ){
24216 aFold = sqlite3_malloc(nFold*2);
24217 if( aFold==0 ){
24218 rc = SQLITE_NOMEM;
24219 goto tokenize_done;
24220 }
24221 zOut = &aFold[zOut - p->aFold];
24222 memcpy(aFold, p->aFold, nFold);
24223 sqlite3_free(p->aFold);
24224 p->aFold = aFold;
24225 p->nFold = nFold = nFold*2;
24226 pEnd = &aFold[nFold-6];
24227 }
24228
24229 if( *zCsr & 0x80 ){
24230 /* A non-ascii-range character. Fold it into the output buffer if
24231 ** it is a token character, or break out of the loop if it is not. */
24232 READ_UTF8(zCsr, zTerm, iCode);
24233 if( fts5UnicodeIsAlnum(p,iCode)||sqlite3Fts5UnicodeIsdiacritic(iCode) ){
24234 non_ascii_tokenchar:
24235 iCode = sqlite3Fts5UnicodeFold(iCode, p->bRemoveDiacritic);
24236 if( iCode ) WRITE_UTF8(zOut, iCode);
24237 }else{
24238 break;
24239 }
24240 }else if( a[*zCsr]==0 ){
24241 /* An ascii-range separator character. End of token. */
24242 break;
24243 }else{
24244 ascii_tokenchar:
24245 if( *zCsr>='A' && *zCsr<='Z' ){
24246 *zOut++ = *zCsr + 32;
24247 }else{
24248 *zOut++ = *zCsr;
24249 }
24250 zCsr++;
24251 }
24252 ie = zCsr - (unsigned char*)pText;
24253 }
24254
24255 /* Invoke the token callback */
24256 rc = xToken(pCtx, 0, aFold, zOut-aFold, is, ie);
24257 }
24258
24259 tokenize_done:
24260 if( rc==SQLITE_DONE ) rc = SQLITE_OK;
24261 return rc;
24262 }
24263
24264 /**************************************************************************
24265 ** Start of porter stemmer implementation.
24266 */
24267
24268 /* Any tokens larger than this (in bytes) are passed through without
24269 ** stemming. */
24270 #define FTS5_PORTER_MAX_TOKEN 64
24271
24272 typedef struct PorterTokenizer PorterTokenizer;
24273 struct PorterTokenizer {
24274 fts5_tokenizer tokenizer; /* Parent tokenizer module */
24275 Fts5Tokenizer *pTokenizer; /* Parent tokenizer instance */
24276 char aBuf[FTS5_PORTER_MAX_TOKEN + 64];
24277 };
24278
24279 /*
24280 ** Delete a "porter" tokenizer.
24281 */
24282 static void fts5PorterDelete(Fts5Tokenizer *pTok){
24283 if( pTok ){
24284 PorterTokenizer *p = (PorterTokenizer*)pTok;
24285 if( p->pTokenizer ){
24286 p->tokenizer.xDelete(p->pTokenizer);
24287 }
24288 sqlite3_free(p);
24289 }
24290 }
24291
24292 /*
24293 ** Create a "porter" tokenizer.
24294 */
24295 static int fts5PorterCreate(
24296 void *pCtx,
24297 const char **azArg, int nArg,
24298 Fts5Tokenizer **ppOut
24299 ){
24300 fts5_api *pApi = (fts5_api*)pCtx;
24301 int rc = SQLITE_OK;
24302 PorterTokenizer *pRet;
24303 void *pUserdata = 0;
24304 const char *zBase = "unicode61";
24305
24306 if( nArg>0 ){
24307 zBase = azArg[0];
24308 }
24309
24310 pRet = (PorterTokenizer*)sqlite3_malloc(sizeof(PorterTokenizer));
24311 if( pRet ){
24312 memset(pRet, 0, sizeof(PorterTokenizer));
24313 rc = pApi->xFindTokenizer(pApi, zBase, &pUserdata, &pRet->tokenizer);
24314 }else{
24315 rc = SQLITE_NOMEM;
24316 }
24317 if( rc==SQLITE_OK ){
24318 int nArg2 = (nArg>0 ? nArg-1 : 0);
24319 const char **azArg2 = (nArg2 ? &azArg[1] : 0);
24320 rc = pRet->tokenizer.xCreate(pUserdata, azArg2, nArg2, &pRet->pTokenizer);
24321 }
24322
24323 if( rc!=SQLITE_OK ){
24324 fts5PorterDelete((Fts5Tokenizer*)pRet);
24325 pRet = 0;
24326 }
24327 *ppOut = (Fts5Tokenizer*)pRet;
24328 return rc;
24329 }
24330
24331 typedef struct PorterContext PorterContext;
24332 struct PorterContext {
24333 void *pCtx;
24334 int (*xToken)(void*, int, const char*, int, int, int);
24335 char *aBuf;
24336 };
24337
24338 typedef struct PorterRule PorterRule;
24339 struct PorterRule {
24340 const char *zSuffix;
24341 int nSuffix;
24342 int (*xCond)(char *zStem, int nStem);
24343 const char *zOutput;
24344 int nOutput;
24345 };
24346
24347 #if 0
24348 static int fts5PorterApply(char *aBuf, int *pnBuf, PorterRule *aRule){
24349 int ret = -1;
24350 int nBuf = *pnBuf;
24351 PorterRule *p;
24352
24353 for(p=aRule; p->zSuffix; p++){
24354 assert( strlen(p->zSuffix)==p->nSuffix );
24355 assert( strlen(p->zOutput)==p->nOutput );
24356 if( nBuf<p->nSuffix ) continue;
24357 if( 0==memcmp(&aBuf[nBuf - p->nSuffix], p->zSuffix, p->nSuffix) ) break;
24358 }
24359
24360 if( p->zSuffix ){
24361 int nStem = nBuf - p->nSuffix;
24362 if( p->xCond==0 || p->xCond(aBuf, nStem) ){
24363 memcpy(&aBuf[nStem], p->zOutput, p->nOutput);
24364 *pnBuf = nStem + p->nOutput;
24365 ret = p - aRule;
24366 }
24367 }
24368
24369 return ret;
24370 }
24371 #endif
24372
24373 static int fts5PorterIsVowel(char c, int bYIsVowel){
24374 return (
24375 c=='a' || c=='e' || c=='i' || c=='o' || c=='u' || (bYIsVowel && c=='y')
24376 );
24377 }
24378
24379 static int fts5PorterGobbleVC(char *zStem, int nStem, int bPrevCons){
24380 int i;
24381 int bCons = bPrevCons;
24382
24383 /* Scan for a vowel */
24384 for(i=0; i<nStem; i++){
24385 if( 0==(bCons = !fts5PorterIsVowel(zStem[i], bCons)) ) break;
24386 }
24387
24388 /* Scan for a consonant */
24389 for(i++; i<nStem; i++){
24390 if( (bCons = !fts5PorterIsVowel(zStem[i], bCons)) ) return i+1;
24391 }
24392 return 0;
24393 }
24394
24395 /* porter rule condition: (m > 0) */
24396 static int fts5Porter_MGt0(char *zStem, int nStem){
24397 return !!fts5PorterGobbleVC(zStem, nStem, 0);
24398 }
24399
24400 /* porter rule condition: (m > 1) */
24401 static int fts5Porter_MGt1(char *zStem, int nStem){
24402 int n;
24403 n = fts5PorterGobbleVC(zStem, nStem, 0);
24404 if( n && fts5PorterGobbleVC(&zStem[n], nStem-n, 1) ){
24405 return 1;
24406 }
24407 return 0;
24408 }
24409
24410 /* porter rule condition: (m = 1) */
24411 static int fts5Porter_MEq1(char *zStem, int nStem){
24412 int n;
24413 n = fts5PorterGobbleVC(zStem, nStem, 0);
24414 if( n && 0==fts5PorterGobbleVC(&zStem[n], nStem-n, 1) ){
24415 return 1;
24416 }
24417 return 0;
24418 }
24419
24420 /* porter rule condition: (*o) */
24421 static int fts5Porter_Ostar(char *zStem, int nStem){
24422 if( zStem[nStem-1]=='w' || zStem[nStem-1]=='x' || zStem[nStem-1]=='y' ){
24423 return 0;
24424 }else{
24425 int i;
24426 int mask = 0;
24427 int bCons = 0;
24428 for(i=0; i<nStem; i++){
24429 bCons = !fts5PorterIsVowel(zStem[i], bCons);
24430 assert( bCons==0 || bCons==1 );
24431 mask = (mask << 1) + bCons;
24432 }
24433 return ((mask & 0x0007)==0x0005);
24434 }
24435 }
24436
24437 /* porter rule condition: (m > 1 and (*S or *T)) */
24438 static int fts5Porter_MGt1_and_S_or_T(char *zStem, int nStem){
24439 assert( nStem>0 );
24440 return (zStem[nStem-1]=='s' || zStem[nStem-1]=='t')
24441 && fts5Porter_MGt1(zStem, nStem);
24442 }
24443
24444 /* porter rule condition: (*v*) */
24445 static int fts5Porter_Vowel(char *zStem, int nStem){
24446 int i;
24447 for(i=0; i<nStem; i++){
24448 if( fts5PorterIsVowel(zStem[i], i>0) ){
24449 return 1;
24450 }
24451 }
24452 return 0;
24453 }
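Review note: the (m > 0), (m > 1) and (m = 1) conditions above all test the Porter "measure" m of a stem, that is, the number of vowel-run-then-consonant-run pairs in [C](VC)^m[V]. The patch never computes m directly; it gobbles one VC pair at a time. The sketch below computes m outright, which may help when checking the condition functions by hand. `porter_measure` and `is_vowel` are hypothetical names, and the treatment of 'y' (a vowel only when it follows a consonant) is assumed to mirror fts5PorterIsVowel().

```c
#include <assert.h>

/* Hypothetical mirror of fts5PorterIsVowel(): 'y' counts as a vowel only
** when the previous character was a consonant. */
static int is_vowel(char c, int bPrevCons){
  return c=='a' || c=='e' || c=='i' || c=='o' || c=='u'
      || (bPrevCons && c=='y');
}

/* Return the Porter measure of the n-byte stem z: the number of times a
** vowel is immediately followed by a consonant. */
static int porter_measure(const char *z, int n){
  int m = 0;
  int i;
  int bPrevCons = 0;    /* Previous character was a consonant */
  int bPrevVowel = 0;   /* Previous character was a vowel */
  assert( n>=0 );
  for(i=0; i<n; i++){
    int bVowel = is_vowel(z[i], bPrevCons);
    if( bPrevVowel && !bVowel ) m++;
    bPrevVowel = bVowel;
    bPrevCons = !bVowel;
  }
  return m;
}
```

With this helper, fts5Porter_MGt0() corresponds to `porter_measure(z, n) > 0` and fts5Porter_MGt1() to `porter_measure(z, n) > 1`, assuming the sketch matches the gobbling logic as intended.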
24454
24455
24456 /**************************************************************************
24457 ***************************************************************************
24458 ** GENERATED CODE STARTS HERE (mkportersteps.tcl)
24459 */
24460
24461 static int fts5PorterStep4(char *aBuf, int *pnBuf){
24462 int ret = 0;
24463 int nBuf = *pnBuf;
24464 switch( aBuf[nBuf-2] ){
24465
24466 case 'a':
24467 if( nBuf>2 && 0==memcmp("al", &aBuf[nBuf-2], 2) ){
24468 if( fts5Porter_MGt1(aBuf, nBuf-2) ){
24469 *pnBuf = nBuf - 2;
24470 }
24471 }
24472 break;
24473
24474 case 'c':
24475 if( nBuf>4 && 0==memcmp("ance", &aBuf[nBuf-4], 4) ){
24476 if( fts5Porter_MGt1(aBuf, nBuf-4) ){
24477 *pnBuf = nBuf - 4;
24478 }
24479 }else if( nBuf>4 && 0==memcmp("ence", &aBuf[nBuf-4], 4) ){
24480 if( fts5Porter_MGt1(aBuf, nBuf-4) ){
24481 *pnBuf = nBuf - 4;
24482 }
24483 }
24484 break;
24485
24486 case 'e':
24487 if( nBuf>2 && 0==memcmp("er", &aBuf[nBuf-2], 2) ){
24488 if( fts5Porter_MGt1(aBuf, nBuf-2) ){
24489 *pnBuf = nBuf - 2;
24490 }
24491 }
24492 break;
24493
24494 case 'i':
24495 if( nBuf>2 && 0==memcmp("ic", &aBuf[nBuf-2], 2) ){
24496 if( fts5Porter_MGt1(aBuf, nBuf-2) ){
24497 *pnBuf = nBuf - 2;
24498 }
24499 }
24500 break;
24501
24502 case 'l':
24503 if( nBuf>4 && 0==memcmp("able", &aBuf[nBuf-4], 4) ){
24504 if( fts5Porter_MGt1(aBuf, nBuf-4) ){
24505 *pnBuf = nBuf - 4;
24506 }
24507 }else if( nBuf>4 && 0==memcmp("ible", &aBuf[nBuf-4], 4) ){
24508 if( fts5Porter_MGt1(aBuf, nBuf-4) ){
24509 *pnBuf = nBuf - 4;
24510 }
24511 }
24512 break;
24513
24514 case 'n':
24515 if( nBuf>3 && 0==memcmp("ant", &aBuf[nBuf-3], 3) ){
24516 if( fts5Porter_MGt1(aBuf, nBuf-3) ){
24517 *pnBuf = nBuf - 3;
24518 }
24519 }else if( nBuf>5 && 0==memcmp("ement", &aBuf[nBuf-5], 5) ){
24520 if( fts5Porter_MGt1(aBuf, nBuf-5) ){
24521 *pnBuf = nBuf - 5;
24522 }
24523 }else if( nBuf>4 && 0==memcmp("ment", &aBuf[nBuf-4], 4) ){
24524 if( fts5Porter_MGt1(aBuf, nBuf-4) ){
24525 *pnBuf = nBuf - 4;
24526 }
24527 }else if( nBuf>3 && 0==memcmp("ent", &aBuf[nBuf-3], 3) ){
24528 if( fts5Porter_MGt1(aBuf, nBuf-3) ){
24529 *pnBuf = nBuf - 3;
24530 }
24531 }
24532 break;
24533
24534 case 'o':
24535 if( nBuf>3 && 0==memcmp("ion", &aBuf[nBuf-3], 3) ){
24536 if( fts5Porter_MGt1_and_S_or_T(aBuf, nBuf-3) ){
24537 *pnBuf = nBuf - 3;
24538 }
24539 }else if( nBuf>2 && 0==memcmp("ou", &aBuf[nBuf-2], 2) ){
24540 if( fts5Porter_MGt1(aBuf, nBuf-2) ){
24541 *pnBuf = nBuf - 2;
24542 }
24543 }
24544 break;
24545
24546 case 's':
24547 if( nBuf>3 && 0==memcmp("ism", &aBuf[nBuf-3], 3) ){
24548 if( fts5Porter_MGt1(aBuf, nBuf-3) ){
24549 *pnBuf = nBuf - 3;
24550 }
24551 }
24552 break;
24553
24554 case 't':
24555 if( nBuf>3 && 0==memcmp("ate", &aBuf[nBuf-3], 3) ){
24556 if( fts5Porter_MGt1(aBuf, nBuf-3) ){
24557 *pnBuf = nBuf - 3;
24558 }
24559 }else if( nBuf>3 && 0==memcmp("iti", &aBuf[nBuf-3], 3) ){
24560 if( fts5Porter_MGt1(aBuf, nBuf-3) ){
24561 *pnBuf = nBuf - 3;
24562 }
24563 }
24564 break;
24565
24566 case 'u':
24567 if( nBuf>3 && 0==memcmp("ous", &aBuf[nBuf-3], 3) ){
24568 if( fts5Porter_MGt1(aBuf, nBuf-3) ){
24569 *pnBuf = nBuf - 3;
24570 }
24571 }
24572 break;
24573
24574 case 'v':
24575 if( nBuf>3 && 0==memcmp("ive", &aBuf[nBuf-3], 3) ){
24576 if( fts5Porter_MGt1(aBuf, nBuf-3) ){
24577 *pnBuf = nBuf - 3;
24578 }
24579 }
24580 break;
24581
24582 case 'z':
24583 if( nBuf>3 && 0==memcmp("ize", &aBuf[nBuf-3], 3) ){
24584 if( fts5Porter_MGt1(aBuf, nBuf-3) ){
24585 *pnBuf = nBuf - 3;
24586 }
24587 }
24588 break;
24589
24590 }
24591 return ret;
24592 }
24593
24594
24595 static int fts5PorterStep1B2(char *aBuf, int *pnBuf){
24596 int ret = 0;
24597 int nBuf = *pnBuf;
24598 switch( aBuf[nBuf-2] ){
24599
24600 case 'a':
24601 if( nBuf>2 && 0==memcmp("at", &aBuf[nBuf-2], 2) ){
24602 memcpy(&aBuf[nBuf-2], "ate", 3);
24603 *pnBuf = nBuf - 2 + 3;
24604 ret = 1;
24605 }
24606 break;
24607
24608 case 'b':
24609 if( nBuf>2 && 0==memcmp("bl", &aBuf[nBuf-2], 2) ){
24610 memcpy(&aBuf[nBuf-2], "ble", 3);
24611 *pnBuf = nBuf - 2 + 3;
24612 ret = 1;
24613 }
24614 break;
24615
24616 case 'i':
24617 if( nBuf>2 && 0==memcmp("iz", &aBuf[nBuf-2], 2) ){
24618 memcpy(&aBuf[nBuf-2], "ize", 3);
24619 *pnBuf = nBuf - 2 + 3;
24620 ret = 1;
24621 }
24622 break;
24623
24624 }
24625 return ret;
24626 }
24627
24628
24629 static int fts5PorterStep2(char *aBuf, int *pnBuf){
24630 int ret = 0;
24631 int nBuf = *pnBuf;
24632 switch( aBuf[nBuf-2] ){
24633
24634 case 'a':
24635 if( nBuf>7 && 0==memcmp("ational", &aBuf[nBuf-7], 7) ){
24636 if( fts5Porter_MGt0(aBuf, nBuf-7) ){
24637 memcpy(&aBuf[nBuf-7], "ate", 3);
24638 *pnBuf = nBuf - 7 + 3;
24639 }
24640 }else if( nBuf>6 && 0==memcmp("tional", &aBuf[nBuf-6], 6) ){
24641 if( fts5Porter_MGt0(aBuf, nBuf-6) ){
24642 memcpy(&aBuf[nBuf-6], "tion", 4);
24643 *pnBuf = nBuf - 6 + 4;
24644 }
24645 }
24646 break;
24647
24648 case 'c':
24649 if( nBuf>4 && 0==memcmp("enci", &aBuf[nBuf-4], 4) ){
24650 if( fts5Porter_MGt0(aBuf, nBuf-4) ){
24651 memcpy(&aBuf[nBuf-4], "ence", 4);
24652 *pnBuf = nBuf - 4 + 4;
24653 }
24654 }else if( nBuf>4 && 0==memcmp("anci", &aBuf[nBuf-4], 4) ){
24655 if( fts5Porter_MGt0(aBuf, nBuf-4) ){
24656 memcpy(&aBuf[nBuf-4], "ance", 4);
24657 *pnBuf = nBuf - 4 + 4;
24658 }
24659 }
24660 break;
24661
24662 case 'e':
24663 if( nBuf>4 && 0==memcmp("izer", &aBuf[nBuf-4], 4) ){
24664 if( fts5Porter_MGt0(aBuf, nBuf-4) ){
24665 memcpy(&aBuf[nBuf-4], "ize", 3);
24666 *pnBuf = nBuf - 4 + 3;
24667 }
24668 }
24669 break;
24670
24671 case 'g':
24672 if( nBuf>4 && 0==memcmp("logi", &aBuf[nBuf-4], 4) ){
24673 if( fts5Porter_MGt0(aBuf, nBuf-4) ){
24674 memcpy(&aBuf[nBuf-4], "log", 3);
24675 *pnBuf = nBuf - 4 + 3;
24676 }
24677 }
24678 break;
24679
24680 case 'l':
24681 if( nBuf>3 && 0==memcmp("bli", &aBuf[nBuf-3], 3) ){
24682 if( fts5Porter_MGt0(aBuf, nBuf-3) ){
24683 memcpy(&aBuf[nBuf-3], "ble", 3);
24684 *pnBuf = nBuf - 3 + 3;
24685 }
24686 }else if( nBuf>4 && 0==memcmp("alli", &aBuf[nBuf-4], 4) ){
24687 if( fts5Porter_MGt0(aBuf, nBuf-4) ){
24688 memcpy(&aBuf[nBuf-4], "al", 2);
24689 *pnBuf = nBuf - 4 + 2;
24690 }
24691 }else if( nBuf>5 && 0==memcmp("entli", &aBuf[nBuf-5], 5) ){
24692 if( fts5Porter_MGt0(aBuf, nBuf-5) ){
24693 memcpy(&aBuf[nBuf-5], "ent", 3);
24694 *pnBuf = nBuf - 5 + 3;
24695 }
24696 }else if( nBuf>3 && 0==memcmp("eli", &aBuf[nBuf-3], 3) ){
24697 if( fts5Porter_MGt0(aBuf, nBuf-3) ){
24698 memcpy(&aBuf[nBuf-3], "e", 1);
24699 *pnBuf = nBuf - 3 + 1;
24700 }
24701 }else if( nBuf>5 && 0==memcmp("ousli", &aBuf[nBuf-5], 5) ){
24702 if( fts5Porter_MGt0(aBuf, nBuf-5) ){
24703 memcpy(&aBuf[nBuf-5], "ous", 3);
24704 *pnBuf = nBuf - 5 + 3;
24705 }
24706 }
24707 break;
24708
24709 case 'o':
24710 if( nBuf>7 && 0==memcmp("ization", &aBuf[nBuf-7], 7) ){
24711 if( fts5Porter_MGt0(aBuf, nBuf-7) ){
24712 memcpy(&aBuf[nBuf-7], "ize", 3);
24713 *pnBuf = nBuf - 7 + 3;
24714 }
24715 }else if( nBuf>5 && 0==memcmp("ation", &aBuf[nBuf-5], 5) ){
24716 if( fts5Porter_MGt0(aBuf, nBuf-5) ){
24717 memcpy(&aBuf[nBuf-5], "ate", 3);
24718 *pnBuf = nBuf - 5 + 3;
24719 }
24720 }else if( nBuf>4 && 0==memcmp("ator", &aBuf[nBuf-4], 4) ){
24721 if( fts5Porter_MGt0(aBuf, nBuf-4) ){
24722 memcpy(&aBuf[nBuf-4], "ate", 3);
24723 *pnBuf = nBuf - 4 + 3;
24724 }
24725 }
24726 break;
24727
24728 case 's':
24729 if( nBuf>5 && 0==memcmp("alism", &aBuf[nBuf-5], 5) ){
24730 if( fts5Porter_MGt0(aBuf, nBuf-5) ){
24731 memcpy(&aBuf[nBuf-5], "al", 2);
24732 *pnBuf = nBuf - 5 + 2;
24733 }
24734 }else if( nBuf>7 && 0==memcmp("iveness", &aBuf[nBuf-7], 7) ){
24735 if( fts5Porter_MGt0(aBuf, nBuf-7) ){
24736 memcpy(&aBuf[nBuf-7], "ive", 3);
24737 *pnBuf = nBuf - 7 + 3;
24738 }
24739 }else if( nBuf>7 && 0==memcmp("fulness", &aBuf[nBuf-7], 7) ){
24740 if( fts5Porter_MGt0(aBuf, nBuf-7) ){
24741 memcpy(&aBuf[nBuf-7], "ful", 3);
24742 *pnBuf = nBuf - 7 + 3;
24743 }
24744 }else if( nBuf>7 && 0==memcmp("ousness", &aBuf[nBuf-7], 7) ){
24745 if( fts5Porter_MGt0(aBuf, nBuf-7) ){
24746 memcpy(&aBuf[nBuf-7], "ous", 3);
24747 *pnBuf = nBuf - 7 + 3;
24748 }
24749 }
24750 break;
24751
24752 case 't':
24753 if( nBuf>5 && 0==memcmp("aliti", &aBuf[nBuf-5], 5) ){
24754 if( fts5Porter_MGt0(aBuf, nBuf-5) ){
24755 memcpy(&aBuf[nBuf-5], "al", 2);
24756 *pnBuf = nBuf - 5 + 2;
24757 }
24758 }else if( nBuf>5 && 0==memcmp("iviti", &aBuf[nBuf-5], 5) ){
24759 if( fts5Porter_MGt0(aBuf, nBuf-5) ){
24760 memcpy(&aBuf[nBuf-5], "ive", 3);
24761 *pnBuf = nBuf - 5 + 3;
24762 }
24763 }else if( nBuf>6 && 0==memcmp("biliti", &aBuf[nBuf-6], 6) ){
24764 if( fts5Porter_MGt0(aBuf, nBuf-6) ){
24765 memcpy(&aBuf[nBuf-6], "ble", 3);
24766 *pnBuf = nBuf - 6 + 3;
24767 }
24768 }
24769 break;
24770
24771 }
24772 return ret;
24773 }
24774
24775
24776 static int fts5PorterStep3(char *aBuf, int *pnBuf){
24777 int ret = 0;
24778 int nBuf = *pnBuf;
24779 switch( aBuf[nBuf-2] ){
24780
24781 case 'a':
24782 if( nBuf>4 && 0==memcmp("ical", &aBuf[nBuf-4], 4) ){
24783 if( fts5Porter_MGt0(aBuf, nBuf-4) ){
24784 memcpy(&aBuf[nBuf-4], "ic", 2);
24785 *pnBuf = nBuf - 4 + 2;
24786 }
24787 }
24788 break;
24789
24790 case 's':
24791 if( nBuf>4 && 0==memcmp("ness", &aBuf[nBuf-4], 4) ){
24792 if( fts5Porter_MGt0(aBuf, nBuf-4) ){
24793 *pnBuf = nBuf - 4;
24794 }
24795 }
24796 break;
24797
24798 case 't':
24799 if( nBuf>5 && 0==memcmp("icate", &aBuf[nBuf-5], 5) ){
24800 if( fts5Porter_MGt0(aBuf, nBuf-5) ){
24801 memcpy(&aBuf[nBuf-5], "ic", 2);
24802 *pnBuf = nBuf - 5 + 2;
24803 }
24804 }else if( nBuf>5 && 0==memcmp("iciti", &aBuf[nBuf-5], 5) ){
24805 if( fts5Porter_MGt0(aBuf, nBuf-5) ){
24806 memcpy(&aBuf[nBuf-5], "ic", 2);
24807 *pnBuf = nBuf - 5 + 2;
24808 }
24809 }
24810 break;
24811
24812 case 'u':
24813 if( nBuf>3 && 0==memcmp("ful", &aBuf[nBuf-3], 3) ){
24814 if( fts5Porter_MGt0(aBuf, nBuf-3) ){
24815 *pnBuf = nBuf - 3;
24816 }
24817 }
24818 break;
24819
24820 case 'v':
24821 if( nBuf>5 && 0==memcmp("ative", &aBuf[nBuf-5], 5) ){
24822 if( fts5Porter_MGt0(aBuf, nBuf-5) ){
24823 *pnBuf = nBuf - 5;
24824 }
24825 }
24826 break;
24827
24828 case 'z':
24829 if( nBuf>5 && 0==memcmp("alize", &aBuf[nBuf-5], 5) ){
24830 if( fts5Porter_MGt0(aBuf, nBuf-5) ){
24831 memcpy(&aBuf[nBuf-5], "al", 2);
24832 *pnBuf = nBuf - 5 + 2;
24833 }
24834 }
24835 break;
24836
24837 }
24838 return ret;
24839 }
24840
24841
24842 static int fts5PorterStep1B(char *aBuf, int *pnBuf){
24843 int ret = 0;
24844 int nBuf = *pnBuf;
24845 switch( aBuf[nBuf-2] ){
24846
24847 case 'e':
24848 if( nBuf>3 && 0==memcmp("eed", &aBuf[nBuf-3], 3) ){
24849 if( fts5Porter_MGt0(aBuf, nBuf-3) ){
24850 memcpy(&aBuf[nBuf-3], "ee", 2);
24851 *pnBuf = nBuf - 3 + 2;
24852 }
24853 }else if( nBuf>2 && 0==memcmp("ed", &aBuf[nBuf-2], 2) ){
24854 if( fts5Porter_Vowel(aBuf, nBuf-2) ){
24855 *pnBuf = nBuf - 2;
24856 ret = 1;
24857 }
24858 }
24859 break;
24860
24861 case 'n':
24862 if( nBuf>3 && 0==memcmp("ing", &aBuf[nBuf-3], 3) ){
24863 if( fts5Porter_Vowel(aBuf, nBuf-3) ){
24864 *pnBuf = nBuf - 3;
24865 ret = 1;
24866 }
24867 }
24868 break;
24869
24870 }
24871 return ret;
24872 }
24873
24874 /*
24875 ** GENERATED CODE ENDS HERE (mkportersteps.tcl)
24876 ***************************************************************************
24877 **************************************************************************/
24878
24879 static void fts5PorterStep1A(char *aBuf, int *pnBuf){
24880 int nBuf = *pnBuf;
24881 if( aBuf[nBuf-1]=='s' ){
24882 if( aBuf[nBuf-2]=='e' ){
24883 if( (nBuf>4 && aBuf[nBuf-4]=='s' && aBuf[nBuf-3]=='s')
24884 || (nBuf>3 && aBuf[nBuf-3]=='i' )
24885 ){
24886 *pnBuf = nBuf-2;
24887 }else{
24888 *pnBuf = nBuf-1;
24889 }
24890 }
24891 else if( aBuf[nBuf-2]!='s' ){
24892 *pnBuf = nBuf-1;
24893 }
24894 }
24895 }
24896
24897 static int fts5PorterCb(
24898 void *pCtx,
24899 int tflags,
24900 const char *pToken,
24901 int nToken,
24902 int iStart,
24903 int iEnd
24904 ){
24905 PorterContext *p = (PorterContext*)pCtx;
24906
24907 char *aBuf;
24908 int nBuf;
24909
24910 if( nToken>FTS5_PORTER_MAX_TOKEN || nToken<3 ) goto pass_through;
24911 aBuf = p->aBuf;
24912 nBuf = nToken;
24913 memcpy(aBuf, pToken, nBuf);
24914
24915 /* Step 1. */
24916 fts5PorterStep1A(aBuf, &nBuf);
24917 if( fts5PorterStep1B(aBuf, &nBuf) ){
24918 if( fts5PorterStep1B2(aBuf, &nBuf)==0 ){
24919 char c = aBuf[nBuf-1];
24920 if( fts5PorterIsVowel(c, 0)==0
24921 && c!='l' && c!='s' && c!='z' && c==aBuf[nBuf-2]
24922 ){
24923 nBuf--;
24924 }else if( fts5Porter_MEq1(aBuf, nBuf) && fts5Porter_Ostar(aBuf, nBuf) ){
24925 aBuf[nBuf++] = 'e';
24926 }
24927 }
24928 }
24929
24930 /* Step 1C. */
24931 if( aBuf[nBuf-1]=='y' && fts5Porter_Vowel(aBuf, nBuf-1) ){
24932 aBuf[nBuf-1] = 'i';
24933 }
24934
24935 /* Steps 2 through 4. */
24936 fts5PorterStep2(aBuf, &nBuf);
24937 fts5PorterStep3(aBuf, &nBuf);
24938 fts5PorterStep4(aBuf, &nBuf);
24939
24940 /* Step 5a. */
24941 assert( nBuf>0 );
24942 if( aBuf[nBuf-1]=='e' ){
24943 if( fts5Porter_MGt1(aBuf, nBuf-1)
24944 || (fts5Porter_MEq1(aBuf, nBuf-1) && !fts5Porter_Ostar(aBuf, nBuf-1))
24945 ){
24946 nBuf--;
24947 }
24948 }
24949
24950 /* Step 5b. */
24951 if( nBuf>1 && aBuf[nBuf-1]=='l'
24952 && aBuf[nBuf-2]=='l' && fts5Porter_MGt1(aBuf, nBuf-1)
24953 ){
24954 nBuf--;
24955 }
24956
24957 return p->xToken(p->pCtx, tflags, aBuf, nBuf, iStart, iEnd);
24958
24959 pass_through:
24960 return p->xToken(p->pCtx, tflags, pToken, nToken, iStart, iEnd);
24961 }
24962
24963 /*
24964 ** Tokenize using the porter tokenizer.
24965 */
24966 static int fts5PorterTokenize(
24967 Fts5Tokenizer *pTokenizer,
24968 void *pCtx,
24969 int flags,
24970 const char *pText, int nText,
24971 int (*xToken)(void*, int, const char*, int nToken, int iStart, int iEnd)
24972 ){
24973 PorterTokenizer *p = (PorterTokenizer*)pTokenizer;
24974 PorterContext sCtx;
24975 sCtx.xToken = xToken;
24976 sCtx.pCtx = pCtx;
24977 sCtx.aBuf = p->aBuf;
24978 return p->tokenizer.xTokenize(
24979 p->pTokenizer, (void*)&sCtx, flags, pText, nText, fts5PorterCb
24980 );
24981 }
24982
24983 /*
24984 ** Register all built-in tokenizers with FTS5.
24985 */
24986 static int sqlite3Fts5TokenizerInit(fts5_api *pApi){
24987 struct BuiltinTokenizer {
24988 const char *zName;
24989 fts5_tokenizer x;
24990 } aBuiltin[] = {
24991 { "unicode61", {fts5UnicodeCreate, fts5UnicodeDelete, fts5UnicodeTokenize}},
24992 { "ascii", {fts5AsciiCreate, fts5AsciiDelete, fts5AsciiTokenize }},
24993 { "porter", {fts5PorterCreate, fts5PorterDelete, fts5PorterTokenize }},
24994 };
24995
24996 int rc = SQLITE_OK; /* Return code */
24997 int i; /* To iterate through builtin tokenizers */
24998
24999 for(i=0; rc==SQLITE_OK && i<(int)ArraySize(aBuiltin); i++){
25000 rc = pApi->xCreateTokenizer(pApi,
25001 aBuiltin[i].zName,
25002 (void*)pApi,
25003 &aBuiltin[i].x,
25004 0
25005 );
25006 }
25007
25008 return rc;
25009 }
25010
25011
25012
25013 /*
25014 ** 2012 May 25
25015 **
25016 ** The author disclaims copyright to this source code. In place of
25017 ** a legal notice, here is a blessing:
25018 **
25019 ** May you do good and not evil.
25020 ** May you find forgiveness for yourself and forgive others.
25021 ** May you share freely, never taking more than you give.
25022 **
25023 ******************************************************************************
25024 */
25025
25026 /*
25027 ** DO NOT EDIT THIS MACHINE GENERATED FILE.
25028 */
25029
25030
25031 /* #include <assert.h> */
25032
25033 /*
25034 ** Return true if the argument corresponds to a unicode codepoint
25035 ** classified as either a letter or a number. Otherwise false.
25036 **
25037 ** The results are undefined if the value passed to this function
25038 ** is less than zero.
25039 */
25040 static int sqlite3Fts5UnicodeIsalnum(int c){
25041 /* Each unsigned integer in the following array corresponds to a contiguous
25042 ** range of unicode codepoints that are neither letters nor numbers (i.e.
25043 ** codepoints for which this function should return 0).
25044 **
25045 ** The most significant 22 bits in each 32-bit value contain the first
25046 ** codepoint in the range. The least significant 10 bits are used to store
25047 ** the size of the range (always at least 1). In other words, the value
25048 ** ((C<<10) + N) represents a range of N codepoints starting with codepoint
25049 ** C. It is not possible to represent a range larger than 1023 codepoints
25050 ** using this format.
25051 */
25052 static const unsigned int aEntry[] = {
25053 0x00000030, 0x0000E807, 0x00016C06, 0x0001EC2F, 0x0002AC07,
25054 0x0002D001, 0x0002D803, 0x0002EC01, 0x0002FC01, 0x00035C01,
25055 0x0003DC01, 0x000B0804, 0x000B480E, 0x000B9407, 0x000BB401,
25056 0x000BBC81, 0x000DD401, 0x000DF801, 0x000E1002, 0x000E1C01,
25057 0x000FD801, 0x00120808, 0x00156806, 0x00162402, 0x00163C01,
25058 0x00164437, 0x0017CC02, 0x00180005, 0x00181816, 0x00187802,
25059 0x00192C15, 0x0019A804, 0x0019C001, 0x001B5001, 0x001B580F,
25060 0x001B9C07, 0x001BF402, 0x001C000E, 0x001C3C01, 0x001C4401,
25061 0x001CC01B, 0x001E980B, 0x001FAC09, 0x001FD804, 0x00205804,
25062 0x00206C09, 0x00209403, 0x0020A405, 0x0020C00F, 0x00216403,
25063 0x00217801, 0x0023901B, 0x00240004, 0x0024E803, 0x0024F812,
25064 0x00254407, 0x00258804, 0x0025C001, 0x00260403, 0x0026F001,
25065 0x0026F807, 0x00271C02, 0x00272C03, 0x00275C01, 0x00278802,
25066 0x0027C802, 0x0027E802, 0x00280403, 0x0028F001, 0x0028F805,
25067 0x00291C02, 0x00292C03, 0x00294401, 0x0029C002, 0x0029D401,
25068 0x002A0403, 0x002AF001, 0x002AF808, 0x002B1C03, 0x002B2C03,
25069 0x002B8802, 0x002BC002, 0x002C0403, 0x002CF001, 0x002CF807,
25070 0x002D1C02, 0x002D2C03, 0x002D5802, 0x002D8802, 0x002DC001,
25071 0x002E0801, 0x002EF805, 0x002F1803, 0x002F2804, 0x002F5C01,
25072 0x002FCC08, 0x00300403, 0x0030F807, 0x00311803, 0x00312804,
25073 0x00315402, 0x00318802, 0x0031FC01, 0x00320802, 0x0032F001,
25074 0x0032F807, 0x00331803, 0x00332804, 0x00335402, 0x00338802,
25075 0x00340802, 0x0034F807, 0x00351803, 0x00352804, 0x00355C01,
25076 0x00358802, 0x0035E401, 0x00360802, 0x00372801, 0x00373C06,
25077 0x00375801, 0x00376008, 0x0037C803, 0x0038C401, 0x0038D007,
25078 0x0038FC01, 0x00391C09, 0x00396802, 0x003AC401, 0x003AD006,
25079 0x003AEC02, 0x003B2006, 0x003C041F, 0x003CD00C, 0x003DC417,
25080 0x003E340B, 0x003E6424, 0x003EF80F, 0x003F380D, 0x0040AC14,
25081 0x00412806, 0x00415804, 0x00417803, 0x00418803, 0x00419C07,
25082 0x0041C404, 0x0042080C, 0x00423C01, 0x00426806, 0x0043EC01,
25083 0x004D740C, 0x004E400A, 0x00500001, 0x0059B402, 0x005A0001,
25084 0x005A6C02, 0x005BAC03, 0x005C4803, 0x005CC805, 0x005D4802,
25085 0x005DC802, 0x005ED023, 0x005F6004, 0x005F7401, 0x0060000F,
25086 0x0062A401, 0x0064800C, 0x0064C00C, 0x00650001, 0x00651002,
25087 0x0066C011, 0x00672002, 0x00677822, 0x00685C05, 0x00687802,
25088 0x0069540A, 0x0069801D, 0x0069FC01, 0x006A8007, 0x006AA006,
25089 0x006C0005, 0x006CD011, 0x006D6823, 0x006E0003, 0x006E840D,
25090 0x006F980E, 0x006FF004, 0x00709014, 0x0070EC05, 0x0071F802,
25091 0x00730008, 0x00734019, 0x0073B401, 0x0073C803, 0x00770027,
25092 0x0077F004, 0x007EF401, 0x007EFC03, 0x007F3403, 0x007F7403,
25093 0x007FB403, 0x007FF402, 0x00800065, 0x0081A806, 0x0081E805,
25094 0x00822805, 0x0082801A, 0x00834021, 0x00840002, 0x00840C04,
25095 0x00842002, 0x00845001, 0x00845803, 0x00847806, 0x00849401,
25096 0x00849C01, 0x0084A401, 0x0084B801, 0x0084E802, 0x00850005,
25097 0x00852804, 0x00853C01, 0x00864264, 0x00900027, 0x0091000B,
25098 0x0092704E, 0x00940200, 0x009C0475, 0x009E53B9, 0x00AD400A,
25099 0x00B39406, 0x00B3BC03, 0x00B3E404, 0x00B3F802, 0x00B5C001,
25100 0x00B5FC01, 0x00B7804F, 0x00B8C00C, 0x00BA001A, 0x00BA6C59,
25101 0x00BC00D6, 0x00BFC00C, 0x00C00005, 0x00C02019, 0x00C0A807,
25102 0x00C0D802, 0x00C0F403, 0x00C26404, 0x00C28001, 0x00C3EC01,
25103 0x00C64002, 0x00C6580A, 0x00C70024, 0x00C8001F, 0x00C8A81E,
25104 0x00C94001, 0x00C98020, 0x00CA2827, 0x00CB003F, 0x00CC0100,
25105 0x01370040, 0x02924037, 0x0293F802, 0x02983403, 0x0299BC10,
25106 0x029A7C01, 0x029BC008, 0x029C0017, 0x029C8002, 0x029E2402,
25107 0x02A00801, 0x02A01801, 0x02A02C01, 0x02A08C09, 0x02A0D804,
25108 0x02A1D004, 0x02A20002, 0x02A2D011, 0x02A33802, 0x02A38012,
25109 0x02A3E003, 0x02A4980A, 0x02A51C0D, 0x02A57C01, 0x02A60004,
25110 0x02A6CC1B, 0x02A77802, 0x02A8A40E, 0x02A90C01, 0x02A93002,
25111 0x02A97004, 0x02A9DC03, 0x02A9EC01, 0x02AAC001, 0x02AAC803,
25112 0x02AADC02, 0x02AAF802, 0x02AB0401, 0x02AB7802, 0x02ABAC07,
25113 0x02ABD402, 0x02AF8C0B, 0x03600001, 0x036DFC02, 0x036FFC02,
25114 0x037FFC01, 0x03EC7801, 0x03ECA401, 0x03EEC810, 0x03F4F802,
25115 0x03F7F002, 0x03F8001A, 0x03F88007, 0x03F8C023, 0x03F95013,
25116 0x03F9A004, 0x03FBFC01, 0x03FC040F, 0x03FC6807, 0x03FCEC06,
25117 0x03FD6C0B, 0x03FF8007, 0x03FFA007, 0x03FFE405, 0x04040003,
25118 0x0404DC09, 0x0405E411, 0x0406400C, 0x0407402E, 0x040E7C01,
25119 0x040F4001, 0x04215C01, 0x04247C01, 0x0424FC01, 0x04280403,
25120 0x04281402, 0x04283004, 0x0428E003, 0x0428FC01, 0x04294009,
25121 0x0429FC01, 0x042CE407, 0x04400003, 0x0440E016, 0x04420003,
25122 0x0442C012, 0x04440003, 0x04449C0E, 0x04450004, 0x04460003,
25123 0x0446CC0E, 0x04471404, 0x045AAC0D, 0x0491C004, 0x05BD442E,
25124 0x05BE3C04, 0x074000F6, 0x07440027, 0x0744A4B5, 0x07480046,
25125 0x074C0057, 0x075B0401, 0x075B6C01, 0x075BEC01, 0x075C5401,
25126 0x075CD401, 0x075D3C01, 0x075DBC01, 0x075E2401, 0x075EA401,
25127 0x075F0C01, 0x07BBC002, 0x07C0002C, 0x07C0C064, 0x07C2800F,
25128 0x07C2C40E, 0x07C3040F, 0x07C3440F, 0x07C4401F, 0x07C4C03C,
25129 0x07C5C02B, 0x07C7981D, 0x07C8402B, 0x07C90009, 0x07C94002,
25130 0x07CC0021, 0x07CCC006, 0x07CCDC46, 0x07CE0014, 0x07CE8025,
25131 0x07CF1805, 0x07CF8011, 0x07D0003F, 0x07D10001, 0x07D108B6,
25132 0x07D3E404, 0x07D4003E, 0x07D50004, 0x07D54018, 0x07D7EC46,
25133 0x07D9140B, 0x07DA0046, 0x07DC0074, 0x38000401, 0x38008060,
25134 0x380400F0,
25135 };
25136 static const unsigned int aAscii[4] = {
25137 0xFFFFFFFF, 0xFC00FFFF, 0xF8000001, 0xF8000001,
25138 };
25139
25140 if( c<128 ){
25141 return ( (aAscii[c >> 5] & (1 << (c & 0x001F)))==0 );
25142 }else if( c<(1<<22) ){
25143 unsigned int key = (((unsigned int)c)<<10) | 0x000003FF;
25144 int iRes = 0;
25145 int iHi = sizeof(aEntry)/sizeof(aEntry[0]) - 1;
25146 int iLo = 0;
25147 while( iHi>=iLo ){
25148 int iTest = (iHi + iLo) / 2;
25149 if( key >= aEntry[iTest] ){
25150 iRes = iTest;
25151 iLo = iTest+1;
25152 }else{
25153 iHi = iTest-1;
25154 }
25155 }
25156 assert( aEntry[0]<key );
25157 assert( key>=aEntry[iRes] );
25158 return (((unsigned int)c) >= ((aEntry[iRes]>>10) + (aEntry[iRes]&0x3FF)));
25159 }
25160 return 1;
25161 }
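
The lookup above packs each codepoint range into one integer (first codepoint in the upper 22 bits, length in the lower 10) and binary-searches for the last entry whose start is at or below the key. A minimal sketch of that technique follows. It assumes a tiny hypothetical table, and `demo_ranges` and `demo_in_ranges` are illustrative names, not part of SQLite. The routine above returns the negation of this test: it reports nonzero when the codepoint lies outside every listed range.

```c
#include <assert.h>

/* Hypothetical table: each entry is (first_codepoint<<10) | count,
** sorted by first_codepoint. */
static const unsigned int demo_ranges[] = {
  (0x30u<<10) | 10,   /* '0'..'9' */
  (0x41u<<10) | 26,   /* 'A'..'Z' */
  (0x61u<<10) | 26,   /* 'a'..'z' */
};

/* Return 1 if c falls inside one of the packed ranges, else 0. */
static int demo_in_ranges(unsigned int c){
  unsigned int key;
  int lo = 0;
  int hi = (int)(sizeof(demo_ranges)/sizeof(demo_ranges[0])) - 1;
  int res = -1;

  assert( c<(1u<<22) );        /* c must fit in the upper 22 bits */
  /* Setting the low 10 bits makes key compare >= any entry that
  ** starts at c, whatever that entry's length field is. */
  key = (c<<10) | 0x3FF;

  /* Find the last entry that is <= key. */
  while( hi>=lo ){
    int mid = lo + (hi - lo)/2;
    if( key>=demo_ranges[mid] ){
      res = mid;
      lo = mid+1;
    }else{
      hi = mid-1;
    }
  }
  if( res<0 ) return 0;        /* c is below the first range */
  return c < (demo_ranges[res]>>10) + (demo_ranges[res] & 0x3FF);
}
```

With this table, `demo_in_ranges('5')` is 1 and `demo_in_ranges(':')` is 0, because ':' is the first codepoint past the end of the digit range.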
25162
25163
25164 /*
25165 ** If the argument is a codepoint corresponding to a lowercase letter
25166 ** in the ASCII range with a diacritic added, return the codepoint
25167 ** of the ASCII letter only. For example, if passed 235 - "LATIN
25168 ** SMALL LETTER E WITH DIAERESIS" - return 101 ("LATIN SMALL LETTER
25169 ** E"). The results of passing a codepoint that corresponds to an
25170 ** uppercase letter are undefined.
25171 */
25172 static int fts5_remove_diacritic(int c){
25173 unsigned short aDia[] = {
25174 0, 1797, 1848, 1859, 1891, 1928, 1940, 1995,
25175 2024, 2040, 2060, 2110, 2168, 2206, 2264, 2286,
25176 2344, 2383, 2472, 2488, 2516, 2596, 2668, 2732,
25177 2782, 2842, 2894, 2954, 2984, 3000, 3028, 3336,
25178 3456, 3696, 3712, 3728, 3744, 3896, 3912, 3928,
25179 3968, 4008, 4040, 4106, 4138, 4170, 4202, 4234,
25180 4266, 4296, 4312, 4344, 4408, 4424, 4472, 4504,
25181 6148, 6198, 6264, 6280, 6360, 6429, 6505, 6529,
25182 61448, 61468, 61534, 61592, 61642, 61688, 61704, 61726,
25183 61784, 61800, 61836, 61880, 61914, 61948, 61998, 62122,
25184 62154, 62200, 62218, 62302, 62364, 62442, 62478, 62536,
25185 62554, 62584, 62604, 62640, 62648, 62656, 62664, 62730,
25186 62924, 63050, 63082, 63274, 63390,
25187 };
25188 char aChar[] = {
25189 '\0', 'a', 'c', 'e', 'i', 'n', 'o', 'u', 'y', 'y', 'a', 'c',
25190 'd', 'e', 'e', 'g', 'h', 'i', 'j', 'k', 'l', 'n', 'o', 'r',
25191 's', 't', 'u', 'u', 'w', 'y', 'z', 'o', 'u', 'a', 'i', 'o',
25192 'u', 'g', 'k', 'o', 'j', 'g', 'n', 'a', 'e', 'i', 'o', 'r',
25193 'u', 's', 't', 'h', 'a', 'e', 'o', 'y', '\0', '\0', '\0', '\0',
25194 '\0', '\0', '\0', '\0', 'a', 'b', 'd', 'd', 'e', 'f', 'g', 'h',
25195 'h', 'i', 'k', 'l', 'l', 'm', 'n', 'p', 'r', 'r', 's', 't',
25196 'u', 'v', 'w', 'w', 'x', 'y', 'z', 'h', 't', 'w', 'y', 'a',
25197 'e', 'i', 'o', 'u', 'y',
25198 };
25199
25200 unsigned int key = (((unsigned int)c)<<3) | 0x00000007;
25201 int iRes = 0;
25202 int iHi = sizeof(aDia)/sizeof(aDia[0]) - 1;
25203 int iLo = 0;
25204 while( iHi>=iLo ){
25205 int iTest = (iHi + iLo) / 2;
25206 if( key >= aDia[iTest] ){
25207 iRes = iTest;
25208 iLo = iTest+1;
25209 }else{
25210 iHi = iTest-1;
25211 }
25212 }
25213 assert( key>=aDia[iRes] );
25214 return ((c > (aDia[iRes]>>3) + (aDia[iRes]&0x07)) ? c : (int)aChar[iRes]);
25215 }
25216
25217
25218 /*
25219 ** Return true if the argument interpreted as a unicode codepoint
25220 ** is a diacritical modifier character.
25221 */
25222 static int sqlite3Fts5UnicodeIsdiacritic(int c){
25223 unsigned int mask0 = 0x08029FDF;
25224 unsigned int mask1 = 0x000361F8;
25225 if( c<768 || c>817 ) return 0;
25226 return (c < 768+32) ?
25227 (mask0 & (1 << (c-768))) :
25228 (mask1 & (1 << (c-768-32)));
25229 }
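
The diacritic test above is a bitmap membership check: two 32-bit masks cover a window of codepoints starting at 768 (U+0300), one bit per codepoint. The sketch below shows the same idea with the masks passed in as parameters. `demo_in_mask` and `DEMO_BASE` are hypothetical names, and the sketch shifts the mask rather than a signed `1`, which is a stylistic choice here and not a claim about the original.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_BASE 768   /* first codepoint covered by the bitmap (U+0300) */

/* Return 1 if the bit for codepoint c is set in the 64-bit window
** formed by mask0 (offsets 0..31) and mask1 (offsets 32..63). */
static int demo_in_mask(int c, uint32_t mask0, uint32_t mask1){
  int off = c - DEMO_BASE;
  if( off<0 || off>=64 ) return 0;      /* outside the window */
  assert( off>=0 && off<64 );
  if( off<32 ) return (int)((mask0 >> off) & 1u);
  return (int)((mask1 >> (off-32)) & 1u);
}
```

Called with the two masks from the function above, `demo_in_mask(768, 0x08029FDF, 0x000361F8)` is 1 and `demo_in_mask(773, 0x08029FDF, 0x000361F8)` is 0, since bit 5 of the first mask is clear.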
25230
25231
25232 /*
25233 ** Interpret the argument as a unicode codepoint. If the codepoint
25234 ** is an upper case character that has a lower case equivalent,
25235 ** return the codepoint corresponding to the lower case version.
25236 ** Otherwise, return a copy of the argument.
25237 **
25238 ** The results are undefined if the value passed to this function
25239 ** is less than zero.
25240 */
25241 static int sqlite3Fts5UnicodeFold(int c, int bRemoveDiacritic){
25242 /* Each entry in the following array defines a rule for folding a range
25243 ** of codepoints to lower case. The rule applies to a range of nRange
25244 ** codepoints starting at codepoint iCode.
25245 **
25246 ** If the least significant bit in flags is clear, then the rule applies
25247 ** to all nRange codepoints (i.e. all nRange codepoints are upper case and
25248 ** need to be folded). Or, if it is set, then the rule only applies to
25249 ** every second codepoint in the range, starting with codepoint iCode.
25250 **
25251 ** The 7 most significant bits in flags are an index into the aiOff[]
25252 ** array. If a specific codepoint C does require folding, then its lower
25253 ** case equivalent is ((C + aiOff[flags>>1]) & 0xFFFF).
25254 **
25255 ** The contents of this array are generated by parsing the CaseFolding.txt
25256 ** file distributed as part of the "Unicode Character Database". See
25257 ** http://www.unicode.org for details.
25258 */
25259 static const struct TableEntry {
25260 unsigned short iCode;
25261 unsigned char flags;
25262 unsigned char nRange;
25263 } aEntry[] = {
25264 {65, 14, 26}, {181, 64, 1}, {192, 14, 23},
25265 {216, 14, 7}, {256, 1, 48}, {306, 1, 6},
25266 {313, 1, 16}, {330, 1, 46}, {376, 116, 1},
25267 {377, 1, 6}, {383, 104, 1}, {385, 50, 1},
25268 {386, 1, 4}, {390, 44, 1}, {391, 0, 1},
25269 {393, 42, 2}, {395, 0, 1}, {398, 32, 1},
25270 {399, 38, 1}, {400, 40, 1}, {401, 0, 1},
25271 {403, 42, 1}, {404, 46, 1}, {406, 52, 1},
25272 {407, 48, 1}, {408, 0, 1}, {412, 52, 1},
25273 {413, 54, 1}, {415, 56, 1}, {416, 1, 6},
25274 {422, 60, 1}, {423, 0, 1}, {425, 60, 1},
25275 {428, 0, 1}, {430, 60, 1}, {431, 0, 1},
25276 {433, 58, 2}, {435, 1, 4}, {439, 62, 1},
25277 {440, 0, 1}, {444, 0, 1}, {452, 2, 1},
25278 {453, 0, 1}, {455, 2, 1}, {456, 0, 1},
25279 {458, 2, 1}, {459, 1, 18}, {478, 1, 18},
25280 {497, 2, 1}, {498, 1, 4}, {502, 122, 1},
25281 {503, 134, 1}, {504, 1, 40}, {544, 110, 1},
25282 {546, 1, 18}, {570, 70, 1}, {571, 0, 1},
25283 {573, 108, 1}, {574, 68, 1}, {577, 0, 1},
25284 {579, 106, 1}, {580, 28, 1}, {581, 30, 1},
25285 {582, 1, 10}, {837, 36, 1}, {880, 1, 4},
25286 {886, 0, 1}, {902, 18, 1}, {904, 16, 3},
25287 {908, 26, 1}, {910, 24, 2}, {913, 14, 17},
25288 {931, 14, 9}, {962, 0, 1}, {975, 4, 1},
25289 {976, 140, 1}, {977, 142, 1}, {981, 146, 1},
25290 {982, 144, 1}, {984, 1, 24}, {1008, 136, 1},
25291 {1009, 138, 1}, {1012, 130, 1}, {1013, 128, 1},
25292 {1015, 0, 1}, {1017, 152, 1}, {1018, 0, 1},
25293 {1021, 110, 3}, {1024, 34, 16}, {1040, 14, 32},
25294 {1120, 1, 34}, {1162, 1, 54}, {1216, 6, 1},
25295 {1217, 1, 14}, {1232, 1, 88}, {1329, 22, 38},
25296 {4256, 66, 38}, {4295, 66, 1}, {4301, 66, 1},
25297 {7680, 1, 150}, {7835, 132, 1}, {7838, 96, 1},
25298 {7840, 1, 96}, {7944, 150, 8}, {7960, 150, 6},
25299 {7976, 150, 8}, {7992, 150, 8}, {8008, 150, 6},
25300 {8025, 151, 8}, {8040, 150, 8}, {8072, 150, 8},
25301 {8088, 150, 8}, {8104, 150, 8}, {8120, 150, 2},
25302 {8122, 126, 2}, {8124, 148, 1}, {8126, 100, 1},
25303 {8136, 124, 4}, {8140, 148, 1}, {8152, 150, 2},
25304 {8154, 120, 2}, {8168, 150, 2}, {8170, 118, 2},
25305 {8172, 152, 1}, {8184, 112, 2}, {8186, 114, 2},
25306 {8188, 148, 1}, {8486, 98, 1}, {8490, 92, 1},
25307 {8491, 94, 1}, {8498, 12, 1}, {8544, 8, 16},
25308 {8579, 0, 1}, {9398, 10, 26}, {11264, 22, 47},
25309 {11360, 0, 1}, {11362, 88, 1}, {11363, 102, 1},
25310 {11364, 90, 1}, {11367, 1, 6}, {11373, 84, 1},
25311 {11374, 86, 1}, {11375, 80, 1}, {11376, 82, 1},
25312 {11378, 0, 1}, {11381, 0, 1}, {11390, 78, 2},
25313 {11392, 1, 100}, {11499, 1, 4}, {11506, 0, 1},
25314 {42560, 1, 46}, {42624, 1, 24}, {42786, 1, 14},
25315 {42802, 1, 62}, {42873, 1, 4}, {42877, 76, 1},
25316 {42878, 1, 10}, {42891, 0, 1}, {42893, 74, 1},
25317 {42896, 1, 4}, {42912, 1, 10}, {42922, 72, 1},
25318 {65313, 14, 26},
25319 };
25320 static const unsigned short aiOff[] = {
25321 1, 2, 8, 15, 16, 26, 28, 32,
25322 37, 38, 40, 48, 63, 64, 69, 71,
25323 79, 80, 116, 202, 203, 205, 206, 207,
25324 209, 210, 211, 213, 214, 217, 218, 219,
25325 775, 7264, 10792, 10795, 23228, 23256, 30204, 54721,
25326 54753, 54754, 54756, 54787, 54793, 54809, 57153, 57274,
25327 57921, 58019, 58363, 61722, 65268, 65341, 65373, 65406,
25328 65408, 65410, 65415, 65424, 65436, 65439, 65450, 65462,
25329 65472, 65476, 65478, 65480, 65482, 65488, 65506, 65511,
25330 65514, 65521, 65527, 65528, 65529,
25331 };
25332
25333 int ret = c;
25334
25335 assert( sizeof(unsigned short)==2 && sizeof(unsigned char)==1 );
25336
25337 if( c<128 ){
25338 if( c>='A' && c<='Z' ) ret = c + ('a' - 'A');
25339 }else if( c<65536 ){
25340 const struct TableEntry *p;
25341 int iHi = sizeof(aEntry)/sizeof(aEntry[0]) - 1;
25342 int iLo = 0;
25343 int iRes = -1;
25344
25345 assert( c>aEntry[0].iCode );
25346 while( iHi>=iLo ){
25347 int iTest = (iHi + iLo) / 2;
25348 int cmp = (c - aEntry[iTest].iCode);
25349 if( cmp>=0 ){
25350 iRes = iTest;
25351 iLo = iTest+1;
25352 }else{
25353 iHi = iTest-1;
25354 }
25355 }
25356
25357 assert( iRes>=0 && c>=aEntry[iRes].iCode );
25358 p = &aEntry[iRes];
25359 if( c<(p->iCode + p->nRange) && 0==(0x01 & p->flags & (p->iCode ^ c)) ){
25360 ret = (c + (aiOff[p->flags>>1])) & 0x0000FFFF;
25361 assert( ret>0 );
25362 }
25363
25364 if( bRemoveDiacritic ) ret = fts5_remove_diacritic(ret);
25365 }
25366
25367 else if( c>=66560 && c<66600 ){
25368 ret = c + 40;
25369 }
25370
25371 return ret;
25372 }
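
The folding rule described in the comment above (range check, optional every-second-codepoint parity check, then an offset added modulo 65536) can be sketched in isolation. This is a minimal illustration under stated assumptions: `DemoRule`, `demo_off` and `demo_apply_rule` are hypothetical names, and the three-entry offset table is re-indexed for the example, so the `flags` values below do not match the real `aEntry[]` rows.

```c
#include <assert.h>

/* Hypothetical rule type with the same three fields as TableEntry. */
typedef struct DemoRule {
  unsigned short iCode;    /* first codepoint covered by the rule */
  unsigned char flags;     /* bit 0: alternate flag; bits 1..7: offset index */
  unsigned char nRange;    /* number of codepoints covered */
} DemoRule;

/* Hypothetical, re-indexed offsets (not the real aiOff[] contents). */
static const unsigned short demo_off[] = { 1, 32, 65415 };

/* Apply one rule to codepoint c; return c unchanged if it does not apply. */
static int demo_apply_rule(const DemoRule *p, int c){
  int ret;
  if( c<p->iCode || c>=p->iCode + p->nRange ) return c;       /* out of range */
  /* With bit 0 set, only codepoints of the same parity as iCode fold. */
  if( (p->flags & 0x01) && ((p->iCode ^ c) & 0x01) ) return c;
  /* The 16-bit wrap lets a large offset act as a negative one. */
  ret = (c + demo_off[p->flags>>1]) & 0xFFFF;
  assert( ret>0 );
  return ret;
}
```

For instance, a rule `{256, 1, 48}` maps 256 to 257 but leaves 257 alone, and a rule `{376, 4, 1}` maps 376 to 255 because 376 + 65415 wraps around 65536.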
25373
25374 /*
25375 ** 2015 May 30
25376 **
25377 ** The author disclaims copyright to this source code. In place of
25378 ** a legal notice, here is a blessing:
25379 **
25380 ** May you do good and not evil.
25381 ** May you find forgiveness for yourself and forgive others.
25382 ** May you share freely, never taking more than you give.
25383 **
25384 ******************************************************************************
25385 **
25386 ** Routines for varint serialization and deserialization.
25387 */
25388
25389
25390 /* #include "fts5Int.h" */
25391
25392 /*
25393 ** This is a copy of the sqlite3GetVarint32() routine from the SQLite core.
25394 ** Unlike the core version, which depends on the single-byte case being
25395 ** handled before it is called, this version handles that case itself.
25396 */
25397 static int sqlite3Fts5GetVarint32(const unsigned char *p, u32 *v){
25398 u32 a,b;
25399
25400 /* The 1-byte case. Overwhelmingly the most common. */
25401 a = *p;
25402 /* a: p0 (unmasked) */
25403 if (!(a&0x80))
25404 {
25405 /* Values between 0 and 127 */
25406 *v = a;
25407 return 1;
25408 }
25409
25410 /* The 2-byte case */
25411 p++;
25412 b = *p;
25413 /* b: p1 (unmasked) */
25414 if (!(b&0x80))
25415 {
25416 /* Values between 128 and 16383 */
25417 a &= 0x7f;
25418 a = a<<7;
25419 *v = a | b;
25420 return 2;
25421 }
25422
25423 /* The 3-byte case */
25424 p++;
25425 a = a<<14;
25426 a |= *p;
25427 /* a: p0<<14 | p2 (unmasked) */
25428 if (!(a&0x80))
25429 {
25430 /* Values between 16384 and 2097151 */
25431 a &= (0x7f<<14)|(0x7f);
25432 b &= 0x7f;
25433 b = b<<7;
25434 *v = a | b;
25435 return 3;
25436 }
25437
25438 /* A 32-bit varint is used to store size information in btrees.
25439 ** Objects are rarely larger than the 2MiB limit of a 3-byte varint.
25440 ** A 3-byte varint is sufficient, for example, to record the size
25441 ** of a 1048569-byte BLOB or string.
25442 **
25443 ** We only unroll the first 1-, 2-, and 3- byte cases. The very
25444 ** rare larger cases can be handled by the slower 64-bit varint
25445 ** routine.
25446 */
25447 {
25448 u64 v64;
25449 u8 n;
25450 p -= 2;
25451 n = sqlite3Fts5GetVarint(p, &v64);
25452 *v = (u32)v64;
25453 assert( n>3 && n<=9 );
25454 return n;
25455 }
25456 }
25457
25458
25459 /*
25460 ** Bitmasks used by sqlite3Fts5GetVarint(). These precomputed constants
25461 ** are defined here rather than simply putting the constant expressions
25462 ** inline in order to work around bugs in the RVT compiler.
25463 **
25464 ** SLOT_2_0 A mask for (0x7f<<14) | 0x7f
25465 **
25466 ** SLOT_4_2_0 A mask for (0xf<<28) | SLOT_2_0, i.e. (0x7f<<28) truncated to 32 bits
25467 */
25468 #define SLOT_2_0 0x001fc07f
25469 #define SLOT_4_2_0 0xf01fc07f
25470
25471 /*
25472 ** Read a 64-bit variable-length integer from memory starting at p[0].
25473 ** Return the number of bytes read. The value is stored in *v.
25474 */
25475 static u8 sqlite3Fts5GetVarint(const unsigned char *p, u64 *v){
25476 u32 a,b,s;
25477
25478 a = *p;
25479 /* a: p0 (unmasked) */
25480 if (!(a&0x80))
25481 {
25482 *v = a;
25483 return 1;
25484 }
25485
25486 p++;
25487 b = *p;
25488 /* b: p1 (unmasked) */
25489 if (!(b&0x80))
25490 {
25491 a &= 0x7f;
25492 a = a<<7;
25493 a |= b;
25494 *v = a;
25495 return 2;
25496 }
25497
25498 /* Verify that constants are precomputed correctly */
25499 assert( SLOT_2_0 == ((0x7f<<14) | (0x7f)) );
25500 assert( SLOT_4_2_0 == ((0xfU<<28) | (0x7f<<14) | (0x7f)) );
25501
25502 p++;
25503 a = a<<14;
25504 a |= *p;
25505 /* a: p0<<14 | p2 (unmasked) */
25506 if (!(a&0x80))
25507 {
25508 a &= SLOT_2_0;
25509 b &= 0x7f;
25510 b = b<<7;
25511 a |= b;
25512 *v = a;
25513 return 3;
25514 }
25515
25516 /* CSE1 from below */
25517 a &= SLOT_2_0;
25518 p++;
25519 b = b<<14;
25520 b |= *p;
25521 /* b: p1<<14 | p3 (unmasked) */
25522 if (!(b&0x80))
25523 {
25524 b &= SLOT_2_0;
25525 /* moved CSE1 up */
25526 /* a &= (0x7f<<14)|(0x7f); */
25527 a = a<<7;
25528 a |= b;
25529 *v = a;
25530 return 4;
25531 }
25532
25533 /* a: p0<<14 | p2 (masked) */
25534 /* b: p1<<14 | p3 (unmasked) */
25535 /* 1:save off p0<<21 | p1<<14 | p2<<7 | p3 (masked) */
25536 /* moved CSE1 up */
25537 /* a &= (0x7f<<14)|(0x7f); */
25538 b &= SLOT_2_0;
25539 s = a;
25540 /* s: p0<<14 | p2 (masked) */
25541
25542 p++;
25543 a = a<<14;
25544 a |= *p;
25545 /* a: p0<<28 | p2<<14 | p4 (unmasked) */
25546 if (!(a&0x80))
25547 {
25548 /* we can skip these cause they were (effectively) done above in calc'ing s */
25549 /* a &= (0x7f<<28)|(0x7f<<14)|(0x7f); */
25550 /* b &= (0x7f<<14)|(0x7f); */
25551 b = b<<7;
25552 a |= b;
25553 s = s>>18;
25554 *v = ((u64)s)<<32 | a;
25555 return 5;
25556 }
25557
25558 /* 2:save off p0<<21 | p1<<14 | p2<<7 | p3 (masked) */
25559 s = s<<7;
25560 s |= b;
25561 /* s: p0<<21 | p1<<14 | p2<<7 | p3 (masked) */
25562
25563 p++;
25564 b = b<<14;
25565 b |= *p;
25566 /* b: p1<<28 | p3<<14 | p5 (unmasked) */
25567 if (!(b&0x80))
25568 {
25569 /* we can skip this cause it was (effectively) done above in calc'ing s */
25570 /* b &= (0x7f<<28)|(0x7f<<14)|(0x7f); */
25571 a &= SLOT_2_0;
25572 a = a<<7;
25573 a |= b;
25574 s = s>>18;
25575 *v = ((u64)s)<<32 | a;
25576 return 6;
25577 }
25578
25579 p++;
25580 a = a<<14;
25581 a |= *p;
25582 /* a: p2<<28 | p4<<14 | p6 (unmasked) */
25583 if (!(a&0x80))
25584 {
25585 a &= SLOT_4_2_0;
25586 b &= SLOT_2_0;
25587 b = b<<7;
25588 a |= b;
25589 s = s>>11;
25590 *v = ((u64)s)<<32 | a;
25591 return 7;
25592 }
25593
25594 /* CSE2 from below */
25595 a &= SLOT_2_0;
25596 p++;
25597 b = b<<14;
25598 b |= *p;
25599 /* b: p3<<28 | p5<<14 | p7 (unmasked) */
25600 if (!(b&0x80))
25601 {
25602 b &= SLOT_4_2_0;
25603 /* moved CSE2 up */
25604 /* a &= (0x7f<<14)|(0x7f); */
25605 a = a<<7;
25606 a |= b;
25607 s = s>>4;
25608 *v = ((u64)s)<<32 | a;
25609 return 8;
25610 }
25611
25612 p++;
25613 a = a<<15;
25614 a |= *p;
25615 /* a: p4<<29 | p6<<15 | p8 (unmasked) */
25616
25617 /* moved CSE2 up */
25618 /* a &= (0x7f<<29)|(0x7f<<15)|(0xff); */
25619 b &= SLOT_2_0;
25620 b = b<<8;
25621 a |= b;
25622
25623 s = s<<4;
25624 b = p[-4];
25625 b &= 0x7f;
25626 b = b>>3;
25627 s |= b;
25628
25629 *v = ((u64)s)<<32 | a;
25630
25631 return 9;
25632 }
25633
25634 /*
25635 ** The variable-length integer encoding is as follows:
25636 **
25637 ** KEY:
25638 ** A = 0xxxxxxx 7 bits of data and one flag bit
25639 ** B = 1xxxxxxx 7 bits of data and one flag bit
25640 ** C = xxxxxxxx 8 bits of data
25641 **
25642 ** 7 bits - A
25643 ** 14 bits - BA
25644 ** 21 bits - BBA
25645 ** 28 bits - BBBA
25646 ** 35 bits - BBBBA
25647 ** 42 bits - BBBBBA
25648 ** 49 bits - BBBBBBA
25649 ** 56 bits - BBBBBBBA
25650 ** 64 bits - BBBBBBBBC
25651 */
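
The encoding table above can be exercised with a straightforward, non-unrolled encoder and decoder. The sketch below is an illustration of the format only, not the optimized routines in this file. `demo_put_varint` and `demo_get_varint` are hypothetical names, and the output buffer is assumed to hold at least 9 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Encode v into out[] (at least 9 bytes). Return the number of bytes. */
static int demo_put_varint(unsigned char *out, uint64_t v){
  unsigned char tmp[9];
  int n = 0, i;
  if( v>>56 ){
    /* Top 8 bits in use: the 9th byte carries 8 raw bits and the
    ** first 8 bytes carry 7 bits each, all with the flag bit set. */
    out[8] = (unsigned char)(v & 0xff);
    v >>= 8;
    for(i=7; i>=0; i--){
      out[i] = (unsigned char)((v & 0x7f) | 0x80);
      v >>= 7;
    }
    return 9;
  }
  /* Collect 7-bit groups, least significant first. */
  do{
    tmp[n++] = (unsigned char)(v & 0x7f);
    v >>= 7;
  }while( v!=0 );
  assert( n<=8 );
  /* Emit most significant group first; every byte but the last has 0x80. */
  for(i=0; i<n; i++){
    out[i] = tmp[n-1-i];
    if( i<n-1 ) out[i] |= 0x80;
  }
  return n;
}

/* Decode a varint from in[]. Return the number of bytes consumed. */
static int demo_get_varint(const unsigned char *in, uint64_t *pv){
  uint64_t v = 0;
  int i;
  for(i=0; i<8; i++){
    v = (v<<7) | (uint64_t)(in[i] & 0x7f);
    if( (in[i] & 0x80)==0 ){
      *pv = v;
      return i+1;
    }
  }
  *pv = (v<<8) | in[8];    /* 9th byte: all 8 bits are data */
  return 9;
}
```

As a worked case, 300 encodes to the two bytes 0x82 0x2c: the high group is 2 with the flag bit set, and the low group is 44 with the flag bit clear.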
25652
25653 #ifdef SQLITE_NOINLINE
25654 # define FTS5_NOINLINE SQLITE_NOINLINE
25655 #else
25656 # define FTS5_NOINLINE
25657 #endif
25658
25659 /*
25660 ** Write a 64-bit variable-length integer to memory starting at p[0].
25661 ** The length of data written will be between 1 and 9 bytes. The number
25662 ** of bytes written is returned.
25663 **
25664 ** A variable-length integer consists of the lower 7 bits of each byte
25665 ** for all bytes that have the 8th bit set and one byte with the 8th
25666 ** bit clear. Except, if we get to the 9th byte, it stores the full
25667 ** 8 bits and is the last byte.
25668 */
25669 static int FTS5_NOINLINE fts5PutVarint64(unsigned char *p, u64 v){
25670 int i, j, n;
25671 u8 buf[10];
25672 if( v & (((u64)0xff000000)<<32) ){
25673 p[8] = (u8)v;
25674 v >>= 8;
25675 for(i=7; i>=0; i--){
25676 p[i] = (u8)((v & 0x7f) | 0x80);
25677 v >>= 7;
25678 }
25679 return 9;
25680 }
25681 n = 0;
25682 do{
25683 buf[n++] = (u8)((v & 0x7f) | 0x80);
25684 v >>= 7;
25685 }while( v!=0 );
25686 buf[0] &= 0x7f;
25687 assert( n<=9 );
25688 for(i=0, j=n-1; j>=0; j--, i++){
25689 p[i] = buf[j];
25690 }
25691 return n;
25692 }
25693
25694 static int sqlite3Fts5PutVarint(unsigned char *p, u64 v){
25695 if( v<=0x7f ){
25696 p[0] = v&0x7f;
25697 return 1;
25698 }
25699 if( v<=0x3fff ){
25700 p[0] = ((v>>7)&0x7f)|0x80;
25701 p[1] = v&0x7f;
25702 return 2;
25703 }
25704 return fts5PutVarint64(p,v);
25705 }
25706
25707
25708 static int sqlite3Fts5GetVarintLen(u32 iVal){
25709 if( iVal<(1 << 7 ) ) return 1;
25710 if( iVal<(1 << 14) ) return 2;
25711 if( iVal<(1 << 21) ) return 3;
25712 if( iVal<(1 << 28) ) return 4;
25713 return 5;
25714 }
25715
25716
25717 /*
25718 ** 2015 May 08
25719 **
25720 ** The author disclaims copyright to this source code. In place of
25721 ** a legal notice, here is a blessing:
25722 **
25723 ** May you do good and not evil.
25724 ** May you find forgiveness for yourself and forgive others.
25725 ** May you share freely, never taking more than you give.
25726 **
25727 ******************************************************************************
25728 **
25729 ** This is an SQLite virtual table module implementing direct access to an
25730 ** existing FTS5 index. The module may create several different types of
25731 ** tables:
25732 **
25733 ** col:
25734 ** CREATE TABLE vocab(term, col, doc, cnt, PRIMARY KEY(term, col));
25735 **
25736 ** One row for each term/column combination. The value of $doc is set to
25737 ** the number of fts5 rows that contain at least one instance of term
25738 ** $term within column $col. Field $cnt is set to the total number of
25739 ** instances of term $term in column $col (in any row of the fts5 table).
25740 **
25741 ** row:
25742 ** CREATE TABLE vocab(term, doc, cnt, PRIMARY KEY(term));
25743 **
25744 ** One row for each term in the database. The value of $doc is set to
25745 ** the number of fts5 rows that contain at least one instance of term
25746 ** $term. Field $cnt is set to the total number of instances of term
25747 ** $term in the database.
25748 */
25749
25750
25751 /* #include "fts5Int.h" */
25752
25753
25754 typedef struct Fts5VocabTable Fts5VocabTable;
25755 typedef struct Fts5VocabCursor Fts5VocabCursor;
25756
25757 struct Fts5VocabTable {
25758 sqlite3_vtab base;
25759 char *zFts5Tbl; /* Name of fts5 table */
25760 char *zFts5Db; /* Db containing fts5 table */
25761 sqlite3 *db; /* Database handle */
25762 Fts5Global *pGlobal; /* FTS5 global object for this database */
25763 int eType; /* FTS5_VOCAB_COL or ROW */
25764 };
25765
25766 struct Fts5VocabCursor {
25767 sqlite3_vtab_cursor base;
25768 sqlite3_stmt *pStmt; /* Statement holding lock on pIndex */
25769 Fts5Index *pIndex; /* Associated FTS5 index */
25770
25771 int bEof; /* True if this cursor is at EOF */
25772 Fts5IndexIter *pIter; /* Term/rowid iterator object */
25773
25774 int nLeTerm; /* Size of zLeTerm in bytes */
25775 char *zLeTerm; /* (term <= $zLeTerm) parameter, or NULL */
25776
25777 /* These are used by 'col' tables only */
25778 Fts5Config *pConfig; /* Fts5 table configuration */
25779 int iCol;
25780 i64 *aCnt;
25781 i64 *aDoc;
25782
25783 /* Output values used by 'row' and 'col' tables */
25784 i64 rowid; /* This table's current rowid value */
25785 Fts5Buffer term; /* Current value of 'term' column */
25786 };
25787
25788 #define FTS5_VOCAB_COL 0
25789 #define FTS5_VOCAB_ROW 1
25790
25791 #define FTS5_VOCAB_COL_SCHEMA "term, col, doc, cnt"
25792 #define FTS5_VOCAB_ROW_SCHEMA "term, doc, cnt"
25793
25794 /*
25795 ** Bits for the mask used as the idxNum value by xBestIndex/xFilter.
25796 */
25797 #define FTS5_VOCAB_TERM_EQ 0x01
25798 #define FTS5_VOCAB_TERM_GE 0x02
25799 #define FTS5_VOCAB_TERM_LE 0x04
25800
25801
25802 /*
25803 ** Translate a string containing an fts5vocab table type to an
25804 ** FTS5_VOCAB_XXX constant. If successful, set *peType to the output
25805 ** value and return SQLITE_OK. Otherwise, set *pzErr to an error message
25806 ** and return SQLITE_ERROR.
25807 */
25808 static int fts5VocabTableType(const char *zType, char **pzErr, int *peType){
25809 int rc = SQLITE_OK;
25810 char *zCopy = sqlite3Fts5Strndup(&rc, zType, -1);
25811 if( rc==SQLITE_OK ){
25812 sqlite3Fts5Dequote(zCopy);
25813 if( sqlite3_stricmp(zCopy, "col")==0 ){
25814 *peType = FTS5_VOCAB_COL;
25815 }else
25816
25817 if( sqlite3_stricmp(zCopy, "row")==0 ){
25818 *peType = FTS5_VOCAB_ROW;
25819 }else
25820 {
25821 *pzErr = sqlite3_mprintf("fts5vocab: unknown table type: %Q", zCopy);
25822 rc = SQLITE_ERROR;
25823 }
25824 sqlite3_free(zCopy);
25825 }
25826
25827 return rc;
25828 }
25829
25830
25831 /*
25832 ** The xDisconnect() virtual table method.
25833 */
25834 static int fts5VocabDisconnectMethod(sqlite3_vtab *pVtab){
25835 Fts5VocabTable *pTab = (Fts5VocabTable*)pVtab;
25836 sqlite3_free(pTab);
25837 return SQLITE_OK;
25838 }
25839
25840 /*
25841 ** The xDestroy() virtual table method.
25842 */
25843 static int fts5VocabDestroyMethod(sqlite3_vtab *pVtab){
25844 Fts5VocabTable *pTab = (Fts5VocabTable*)pVtab;
25845 sqlite3_free(pTab);
25846 return SQLITE_OK;
25847 }
25848
25849 /*
25850 ** This function is the implementation of both the xConnect and xCreate
25851 ** methods of the fts5vocab virtual table.
25852 **
25853 ** The argv[] array contains the following:
25854 **
25855 ** argv[0] -> module name ("fts5vocab")
25856 ** argv[1] -> database name
25857 ** argv[2] -> table name
25858 **
25859 ** then:
25860 **
25861 ** argv[3] -> name of fts5 table
25862 ** argv[4] -> type of fts5vocab table
25863 **
25864 ** or, for tables in the TEMP schema only:
25865 **
25866 ** argv[3] -> name of fts5 tables database
25867 ** argv[4] -> name of fts5 table
25868 ** argv[5] -> type of fts5vocab table
25869 */
25870 static int fts5VocabInitVtab(
25871 sqlite3 *db, /* The SQLite database connection */
25872 void *pAux, /* Pointer to Fts5Global object */
25873 int argc, /* Number of elements in argv array */
25874 const char * const *argv, /* xCreate/xConnect argument array */
25875 sqlite3_vtab **ppVTab, /* Write the resulting vtab structure here */
25876 char **pzErr /* Write any error message here */
25877 ){
25878 const char *azSchema[] = {
25879 "CREATE TABLE vocab(" FTS5_VOCAB_COL_SCHEMA ")",
25880 "CREATE TABLE vocab(" FTS5_VOCAB_ROW_SCHEMA ")"
25881 };
25882
25883 Fts5VocabTable *pRet = 0;
25884 int rc = SQLITE_OK; /* Return code */
25885 int bDb;
25886
25887 bDb = (argc==6 && strlen(argv[1])==4 && memcmp("temp", argv[1], 4)==0);
25888
25889 if( argc!=5 && bDb==0 ){
25890 *pzErr = sqlite3_mprintf("wrong number of vtable arguments");
25891 rc = SQLITE_ERROR;
25892 }else{
25893 int nByte; /* Bytes of space to allocate */
25894 const char *zDb = bDb ? argv[3] : argv[1];
25895 const char *zTab = bDb ? argv[4] : argv[3];
25896 const char *zType = bDb ? argv[5] : argv[4];
25897 int nDb = (int)strlen(zDb)+1;
25898 int nTab = (int)strlen(zTab)+1;
25899 int eType = 0;
25900
25901 rc = fts5VocabTableType(zType, pzErr, &eType);
25902 if( rc==SQLITE_OK ){
25903 assert( eType>=0 && eType<sizeof(azSchema)/sizeof(azSchema[0]) );
25904 rc = sqlite3_declare_vtab(db, azSchema[eType]);
25905 }
25906
25907 nByte = sizeof(Fts5VocabTable) + nDb + nTab;
25908 pRet = sqlite3Fts5MallocZero(&rc, nByte);
25909 if( pRet ){
25910 pRet->pGlobal = (Fts5Global*)pAux;
25911 pRet->eType = eType;
25912 pRet->db = db;
25913 pRet->zFts5Tbl = (char*)&pRet[1];
25914 pRet->zFts5Db = &pRet->zFts5Tbl[nTab];
25915 memcpy(pRet->zFts5Tbl, zTab, nTab);
25916 memcpy(pRet->zFts5Db, zDb, nDb);
25917 sqlite3Fts5Dequote(pRet->zFts5Tbl);
25918 sqlite3Fts5Dequote(pRet->zFts5Db);
25919 }
25920 }
25921
25922 *ppVTab = (sqlite3_vtab*)pRet;
25923 return rc;
25924 }
25925
25926
25927 /*
25928 ** The xConnect() and xCreate() methods for the virtual table. All the
25929 ** work is done in function fts5VocabInitVtab().
25930 */
25931 static int fts5VocabConnectMethod(
25932 sqlite3 *db, /* Database connection */
25933 void *pAux, /* Pointer to Fts5Global object */
25934 int argc, /* Number of elements in argv array */
25935 const char * const *argv, /* xCreate/xConnect argument array */
25936 sqlite3_vtab **ppVtab, /* OUT: New sqlite3_vtab object */
25937 char **pzErr /* OUT: sqlite3_malloc'd error message */
25938 ){
25939 return fts5VocabInitVtab(db, pAux, argc, argv, ppVtab, pzErr);
25940 }
25941 static int fts5VocabCreateMethod(
25942 sqlite3 *db, /* Database connection */
25943 void *pAux, /* Pointer to Fts5Global object */
25944 int argc, /* Number of elements in argv array */
25945 const char * const *argv, /* xCreate/xConnect argument array */
25946 sqlite3_vtab **ppVtab, /* OUT: New sqlite3_vtab object */
25947 char **pzErr /* OUT: sqlite3_malloc'd error message */
25948 ){
25949 return fts5VocabInitVtab(db, pAux, argc, argv, ppVtab, pzErr);
25950 }
25951
25952 /*
25953 ** Implementation of the xBestIndex method.
25954 */
25955 static int fts5VocabBestIndexMethod(
25956 sqlite3_vtab *pVTab,
25957 sqlite3_index_info *pInfo
25958 ){
25959 int i;
25960 int iTermEq = -1;
25961 int iTermGe = -1;
25962 int iTermLe = -1;
25963 int idxNum = 0;
25964 int nArg = 0;
25965
25966 for(i=0; i<pInfo->nConstraint; i++){
25967 struct sqlite3_index_constraint *p = &pInfo->aConstraint[i];
25968 if( p->usable==0 ) continue;
25969 if( p->iColumn==0 ){ /* term column */
25970 if( p->op==SQLITE_INDEX_CONSTRAINT_EQ ) iTermEq = i;
25971 if( p->op==SQLITE_INDEX_CONSTRAINT_LE ) iTermLe = i;
25972 if( p->op==SQLITE_INDEX_CONSTRAINT_LT ) iTermLe = i;
25973 if( p->op==SQLITE_INDEX_CONSTRAINT_GE ) iTermGe = i;
25974 if( p->op==SQLITE_INDEX_CONSTRAINT_GT ) iTermGe = i;
25975 }
25976 }
25977
25978 if( iTermEq>=0 ){
25979 idxNum |= FTS5_VOCAB_TERM_EQ;
25980 pInfo->aConstraintUsage[iTermEq].argvIndex = ++nArg;
25981 pInfo->estimatedCost = 100;
25982 }else{
25983 pInfo->estimatedCost = 1000000;
25984 if( iTermGe>=0 ){
25985 idxNum |= FTS5_VOCAB_TERM_GE;
25986 pInfo->aConstraintUsage[iTermGe].argvIndex = ++nArg;
25987 pInfo->estimatedCost = pInfo->estimatedCost / 2;
25988 }
25989 if( iTermLe>=0 ){
25990 idxNum |= FTS5_VOCAB_TERM_LE;
25991 pInfo->aConstraintUsage[iTermLe].argvIndex = ++nArg;
25992 pInfo->estimatedCost = pInfo->estimatedCost / 2;
25993 }
25994 }
25995
25996 pInfo->idxNum = idxNum;
25997
25998 return SQLITE_OK;
25999 }
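
The planning logic above reduces to a small decision: an equality constraint on the term column wins outright, otherwise each usable bound halves the full-scan cost and takes the next argument slot. A standalone sketch of that decision, without the SQLite structures, might look as follows. `DemoPlan`, `demo_plan` and the `DEMO_TERM_*` names are hypothetical stand-ins, and the cost figures are copied from the function above.

```c
#include <assert.h>

#define DEMO_TERM_EQ 0x01
#define DEMO_TERM_GE 0x02
#define DEMO_TERM_LE 0x04

/* Hypothetical planner result: argument slots are 1-based, 0 = unused. */
typedef struct DemoPlan {
  int idxNum;              /* bitmask of DEMO_TERM_* values */
  int argEq, argGe, argLe; /* slot assigned to each constraint */
  double cost;             /* estimated cost */
} DemoPlan;

static DemoPlan demo_plan(int hasEq, int hasGe, int hasLe){
  DemoPlan p = {0, 0, 0, 0, 0.0};
  int nArg = 0;
  if( hasEq ){
    /* Equality makes the range bounds redundant, so they are ignored. */
    p.idxNum |= DEMO_TERM_EQ;
    p.argEq = ++nArg;
    p.cost = 100;
  }else{
    p.cost = 1000000;
    if( hasGe ){
      p.idxNum |= DEMO_TERM_GE;
      p.argGe = ++nArg;
      p.cost = p.cost / 2;
    }
    if( hasLe ){
      p.idxNum |= DEMO_TERM_LE;
      p.argLe = ++nArg;
      p.cost = p.cost / 2;
    }
  }
  assert( nArg<=2 );       /* at most two arguments are ever requested */
  return p;
}
```

The slot order matters because the filter step reads its arguments in the same order the bits are tested: with both bounds present and no equality, the lower bound is argument 1 and the upper bound is argument 2.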
26000
26001 /*
26002 ** Implementation of xOpen method.
26003 */
26004 static int fts5VocabOpenMethod(
26005 sqlite3_vtab *pVTab,
26006 sqlite3_vtab_cursor **ppCsr
26007 ){
26008 Fts5VocabTable *pTab = (Fts5VocabTable*)pVTab;
26009 Fts5Index *pIndex = 0;
26010 Fts5Config *pConfig = 0;
26011 Fts5VocabCursor *pCsr = 0;
26012 int rc = SQLITE_OK;
26013 sqlite3_stmt *pStmt = 0;
26014 char *zSql = 0;
26015
26016 zSql = sqlite3Fts5Mprintf(&rc,
26017 "SELECT t.%Q FROM %Q.%Q AS t WHERE t.%Q MATCH '*id'",
26018 pTab->zFts5Tbl, pTab->zFts5Db, pTab->zFts5Tbl, pTab->zFts5Tbl
26019 );
26020 if( zSql ){
26021 rc = sqlite3_prepare_v2(pTab->db, zSql, -1, &pStmt, 0);
26022 }
26023 sqlite3_free(zSql);
26024 assert( rc==SQLITE_OK || pStmt==0 );
26025 if( rc==SQLITE_ERROR ) rc = SQLITE_OK;
26026
26027 if( pStmt && sqlite3_step(pStmt)==SQLITE_ROW ){
26028 i64 iId = sqlite3_column_int64(pStmt, 0);
26029 pIndex = sqlite3Fts5IndexFromCsrid(pTab->pGlobal, iId, &pConfig);
26030 }
26031
26032 if( rc==SQLITE_OK && pIndex==0 ){
26033 rc = sqlite3_finalize(pStmt);
26034 pStmt = 0;
26035 if( rc==SQLITE_OK ){
26036 pVTab->zErrMsg = sqlite3_mprintf(
26037 "no such fts5 table: %s.%s", pTab->zFts5Db, pTab->zFts5Tbl
26038 );
26039 rc = SQLITE_ERROR;
26040 }
26041 }
26042
26043 if( rc==SQLITE_OK ){
26044 int nByte = pConfig->nCol * sizeof(i64) * 2 + sizeof(Fts5VocabCursor);
26045 pCsr = (Fts5VocabCursor*)sqlite3Fts5MallocZero(&rc, nByte);
26046 }
26047
26048 if( pCsr ){
26049 pCsr->pIndex = pIndex;
26050 pCsr->pStmt = pStmt;
26051 pCsr->pConfig = pConfig;
26052 pCsr->aCnt = (i64*)&pCsr[1];
26053 pCsr->aDoc = &pCsr->aCnt[pConfig->nCol];
26054 }else{
26055 sqlite3_finalize(pStmt);
26056 }
26057
26058 *ppCsr = (sqlite3_vtab_cursor*)pCsr;
26059 return rc;
26060 }
26061
26062 static void fts5VocabResetCursor(Fts5VocabCursor *pCsr){
26063 pCsr->rowid = 0;
26064 sqlite3Fts5IterClose(pCsr->pIter);
26065 pCsr->pIter = 0;
26066 sqlite3_free(pCsr->zLeTerm);
26067 pCsr->nLeTerm = -1;
26068 pCsr->zLeTerm = 0;
26069 }
26070
26071 /*
26072 ** Close the cursor. For additional information see the documentation
26073 ** on the xClose method of the virtual table interface.
26074 */
26075 static int fts5VocabCloseMethod(sqlite3_vtab_cursor *pCursor){
26076 Fts5VocabCursor *pCsr = (Fts5VocabCursor*)pCursor;
26077 fts5VocabResetCursor(pCsr);
26078 sqlite3Fts5BufferFree(&pCsr->term);
26079 sqlite3_finalize(pCsr->pStmt);
26080 sqlite3_free(pCsr);
26081 return SQLITE_OK;
26082 }
26083
26084
26085 /*
26086 ** Advance the cursor to the next row in the table.
26087 */
26088 static int fts5VocabNextMethod(sqlite3_vtab_cursor *pCursor){
26089 Fts5VocabCursor *pCsr = (Fts5VocabCursor*)pCursor;
26090 Fts5VocabTable *pTab = (Fts5VocabTable*)pCursor->pVtab;
26091 int rc = SQLITE_OK;
26092 int nCol = pCsr->pConfig->nCol;
26093
26094 pCsr->rowid++;
26095
26096 if( pTab->eType==FTS5_VOCAB_COL ){
26097 for(pCsr->iCol++; pCsr->iCol<nCol; pCsr->iCol++){
26098 if( pCsr->aCnt[pCsr->iCol] ) break;
26099 }
26100 }
26101
26102 if( pTab->eType==FTS5_VOCAB_ROW || pCsr->iCol>=nCol ){
26103 if( sqlite3Fts5IterEof(pCsr->pIter) ){
26104 pCsr->bEof = 1;
26105 }else{
26106 const char *zTerm;
26107 int nTerm;
26108
26109 zTerm = sqlite3Fts5IterTerm(pCsr->pIter, &nTerm);
26110 if( pCsr->nLeTerm>=0 ){
26111 int nCmp = MIN(nTerm, pCsr->nLeTerm);
26112 int bCmp = memcmp(pCsr->zLeTerm, zTerm, nCmp);
26113 if( bCmp<0 || (bCmp==0 && pCsr->nLeTerm<nTerm) ){
26114 pCsr->bEof = 1;
26115 return SQLITE_OK;
26116 }
26117 }
26118
26119 sqlite3Fts5BufferSet(&rc, &pCsr->term, nTerm, (const u8*)zTerm);
26120 memset(pCsr->aCnt, 0, nCol * sizeof(i64));
26121 memset(pCsr->aDoc, 0, nCol * sizeof(i64));
26122 pCsr->iCol = 0;
26123
26124 assert( pTab->eType==FTS5_VOCAB_COL || pTab->eType==FTS5_VOCAB_ROW );
26125 while( rc==SQLITE_OK ){
26126 i64 dummy;
26127 const u8 *pPos; int nPos; /* Position list */
26128 i64 iPos = 0; /* 64-bit position read from poslist */
26129 int iOff = 0; /* Current offset within position list */
26130
26131 rc = sqlite3Fts5IterPoslist(pCsr->pIter, 0, &pPos, &nPos, &dummy);
26132 if( rc==SQLITE_OK ){
26133 if( pTab->eType==FTS5_VOCAB_ROW ){
26134 while( 0==sqlite3Fts5PoslistNext64(pPos, nPos, &iOff, &iPos) ){
26135 pCsr->aCnt[0]++;
26136 }
26137 pCsr->aDoc[0]++;
26138 }else{
26139 int iCol = -1;
26140 while( 0==sqlite3Fts5PoslistNext64(pPos, nPos, &iOff, &iPos) ){
26141 int ii = FTS5_POS2COLUMN(iPos);
26142 pCsr->aCnt[ii]++;
26143 if( iCol!=ii ){
26144 pCsr->aDoc[ii]++;
26145 iCol = ii;
26146 }
26147 }
26148 }
26149 rc = sqlite3Fts5IterNextScan(pCsr->pIter);
26150 }
26151
26152 if( rc==SQLITE_OK ){
26153 zTerm = sqlite3Fts5IterTerm(pCsr->pIter, &nTerm);
26154 if( nTerm!=pCsr->term.n || memcmp(zTerm, pCsr->term.p, nTerm) ){
26155 break;
26156 }
26157 if( sqlite3Fts5IterEof(pCsr->pIter) ) break;
26158 }
26159 }
26160 }
26161 }
26162
26163 if( pCsr->bEof==0 && pTab->eType==FTS5_VOCAB_COL ){
26164 while( pCsr->aCnt[pCsr->iCol]==0 ) pCsr->iCol++;
26165 assert( pCsr->iCol<pCsr->pConfig->nCol );
26166 }
26167 return rc;
26168 }
26169
26170 /*
26171 ** This is the xFilter implementation for the virtual table.
26172 */
26173 static int fts5VocabFilterMethod(
26174 sqlite3_vtab_cursor *pCursor, /* The cursor used for this query */
26175 int idxNum, /* Strategy index */
26176 const char *idxStr, /* Unused */
26177 int nVal, /* Number of elements in apVal */
26178 sqlite3_value **apVal /* Arguments for the indexing scheme */
26179 ){
26180 Fts5VocabCursor *pCsr = (Fts5VocabCursor*)pCursor;
26181 int rc = SQLITE_OK;
26182
26183 int iVal = 0;
26184 int f = FTS5INDEX_QUERY_SCAN;
26185 const char *zTerm = 0;
26186 int nTerm = 0;
26187
26188 sqlite3_value *pEq = 0;
26189 sqlite3_value *pGe = 0;
26190 sqlite3_value *pLe = 0;
26191
26192 fts5VocabResetCursor(pCsr);
26193 if( idxNum & FTS5_VOCAB_TERM_EQ ) pEq = apVal[iVal++];
26194 if( idxNum & FTS5_VOCAB_TERM_GE ) pGe = apVal[iVal++];
26195 if( idxNum & FTS5_VOCAB_TERM_LE ) pLe = apVal[iVal++];
26196
26197 if( pEq ){
26198 zTerm = (const char *)sqlite3_value_text(pEq);
26199 nTerm = sqlite3_value_bytes(pEq);
26200 f = 0;
26201 }else{
26202 if( pGe ){
26203 zTerm = (const char *)sqlite3_value_text(pGe);
26204 nTerm = sqlite3_value_bytes(pGe);
26205 }
26206 if( pLe ){
26207 const char *zCopy = (const char *)sqlite3_value_text(pLe);
26208 pCsr->nLeTerm = sqlite3_value_bytes(pLe);
26209 pCsr->zLeTerm = sqlite3_malloc(pCsr->nLeTerm+1);
26210 if( pCsr->zLeTerm==0 ){
26211 rc = SQLITE_NOMEM;
26212 }else{
26213 memcpy(pCsr->zLeTerm, zCopy ? zCopy : "", pCsr->nLeTerm+1);
26214 }
26215 }
26216 }
26217
26218
26219 if( rc==SQLITE_OK ){
26220 rc = sqlite3Fts5IndexQuery(pCsr->pIndex, zTerm, nTerm, f, 0, &pCsr->pIter);
26221 }
26222 if( rc==SQLITE_OK ){
26223 rc = fts5VocabNextMethod(pCursor);
26224 }
26225
26226 return rc;
26227 }
26228
26229 /*
26230 ** This is the xEof method of the virtual table. SQLite calls this
26231 ** routine to find out if it has reached the end of a result set.
26232 */
26233 static int fts5VocabEofMethod(sqlite3_vtab_cursor *pCursor){
26234 Fts5VocabCursor *pCsr = (Fts5VocabCursor*)pCursor;
26235 return pCsr->bEof;
26236 }
26237
26238 static int fts5VocabColumnMethod(
26239 sqlite3_vtab_cursor *pCursor, /* Cursor to retrieve value from */
26240 sqlite3_context *pCtx, /* Context for sqlite3_result_xxx() calls */
26241 int iCol /* Index of column to read value from */
26242 ){
26243 Fts5VocabCursor *pCsr = (Fts5VocabCursor*)pCursor;
26244
26245 if( iCol==0 ){
26246 sqlite3_result_text(
26247 pCtx, (const char*)pCsr->term.p, pCsr->term.n, SQLITE_TRANSIENT
26248 );
26249 }
26250 else if( ((Fts5VocabTable*)(pCursor->pVtab))->eType==FTS5_VOCAB_COL ){
26251 assert( iCol==1 || iCol==2 || iCol==3 );
26252 if( iCol==1 ){
26253 const char *z = pCsr->pConfig->azCol[pCsr->iCol];
26254 sqlite3_result_text(pCtx, z, -1, SQLITE_STATIC);
26255 }else if( iCol==2 ){
26256 sqlite3_result_int64(pCtx, pCsr->aDoc[pCsr->iCol]);
26257 }else{
26258 sqlite3_result_int64(pCtx, pCsr->aCnt[pCsr->iCol]);
26259 }
26260 }else{
26261 assert( iCol==1 || iCol==2 );
26262 if( iCol==1 ){
26263 sqlite3_result_int64(pCtx, pCsr->aDoc[0]);
26264 }else{
26265 sqlite3_result_int64(pCtx, pCsr->aCnt[0]);
26266 }
26267 }
26268 return SQLITE_OK;
26269 }
26270
26271 /*
26272 ** This is the xRowid method. The SQLite core calls this routine to
26273 ** retrieve the rowid for the current row of the result set. The
26274 ** rowid should be written to *pRowid.
26275 */
26276 static int fts5VocabRowidMethod(
26277 sqlite3_vtab_cursor *pCursor,
26278 sqlite_int64 *pRowid
26279 ){
26280 Fts5VocabCursor *pCsr = (Fts5VocabCursor*)pCursor;
26281 *pRowid = pCsr->rowid;
26282 return SQLITE_OK;
26283 }
26284
26285 static int sqlite3Fts5VocabInit(Fts5Global *pGlobal, sqlite3 *db){
26286 static const sqlite3_module fts5Vocab = {
26287 /* iVersion */ 2,
26288 /* xCreate */ fts5VocabCreateMethod,
26289 /* xConnect */ fts5VocabConnectMethod,
26290 /* xBestIndex */ fts5VocabBestIndexMethod,
26291 /* xDisconnect */ fts5VocabDisconnectMethod,
26292 /* xDestroy */ fts5VocabDestroyMethod,
26293 /* xOpen */ fts5VocabOpenMethod,
26294 /* xClose */ fts5VocabCloseMethod,
26295 /* xFilter */ fts5VocabFilterMethod,
26296 /* xNext */ fts5VocabNextMethod,
26297 /* xEof */ fts5VocabEofMethod,
26298 /* xColumn */ fts5VocabColumnMethod,
26299 /* xRowid */ fts5VocabRowidMethod,
26300 /* xUpdate */ 0,
26301 /* xBegin */ 0,
26302 /* xSync */ 0,
26303 /* xCommit */ 0,
26304 /* xRollback */ 0,
26305 /* xFindFunction */ 0,
26306 /* xRename */ 0,
26307 /* xSavepoint */ 0,
26308 /* xRelease */ 0,
26309 /* xRollbackTo */ 0,
26310 };
26311 void *p = (void*)pGlobal;
26312
26313 return sqlite3_create_module_v2(db, "fts5vocab", &fts5Vocab, p, 0);
26314 }
26315
26316
26317
26318
26319
26320 #endif /* !defined(SQLITE_CORE) || defined(SQLITE_ENABLE_FTS5) */
26321
26322 /************** End of fts5.c ************************************************/
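
Reviewer note on the upper-bound check in fts5VocabNextMethod(): when a "term <= X" constraint is in effect, the scan stops once the iterator's current term sorts after the saved zLeTerm. The test compares the shared prefix with memcmp() and breaks ties on length. The sketch below isolates that comparison so it can be exercised on its own. The names vocabTermPastUpperBound and vocabUpperBoundSelfCheck are hypothetical and not part of SQLite; the logic is only meant to mirror the bCmp/nCmp test shown in the diff, not to replace it.

```c
#include <assert.h>
#include <string.h>

/*
** Hypothetical helper (not in SQLite). Returns 1 if the term (zTerm,nTerm)
** sorts strictly after the upper bound (zLe,nLe), meaning a scan limited to
** "term <= zLe" should report EOF. Returns 0 otherwise.
**
** The comparison is bytewise over the shared prefix. If the prefixes are
** equal, the shorter string sorts first.
*/
int vocabTermPastUpperBound(
  const char *zLe, int nLe,       /* Upper bound and its length in bytes */
  const char *zTerm, int nTerm    /* Current term and its length in bytes */
){
  int nCmp = nTerm<nLe ? nTerm : nLe;
  int bCmp = memcmp(zLe, zTerm, (size_t)nCmp);
  return bCmp<0 || (bCmp==0 && nLe<nTerm);
}

/* A few cases illustrating the ordering, assuming plain byte strings. */
void vocabUpperBoundSelfCheck(void){
  assert( vocabTermPastUpperBound("abc", 3, "abc", 3)==0 );  /* equal: keep */
  assert( vocabTermPastUpperBound("abc", 3, "ab", 2)==0 );   /* prefix: keep */
  assert( vocabTermPastUpperBound("abc", 3, "abcd", 4)==1 ); /* longer: stop */
  assert( vocabTermPastUpperBound("abc", 3, "abd", 3)==1 );  /* greater: stop */
}
```

Because the comparison is bytewise, it matches the order in which the index scan yields terms only if terms are stored in memcmp() order, which is what the code in the diff appears to rely on.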