OLD | NEW |
(Empty) | |
| 1 /************** Begin file sqlite3rbu.c **************************************/ |
| 2 /* |
| 3 ** 2014 August 30 |
| 4 ** |
| 5 ** The author disclaims copyright to this source code. In place of |
| 6 ** a legal notice, here is a blessing: |
| 7 ** |
| 8 ** May you do good and not evil. |
| 9 ** May you find forgiveness for yourself and forgive others. |
| 10 ** May you share freely, never taking more than you give. |
| 11 ** |
| 12 ************************************************************************* |
| 13 ** |
| 14 ** |
| 15 ** OVERVIEW |
| 16 ** |
| 17 ** The RBU extension requires that the RBU update be packaged as an |
| 18 ** SQLite database. The tables it expects to find are described in |
| 19 ** sqlite3rbu.h. Essentially, for each table xyz in the target database |
| 20 ** that the user wishes to write to, a corresponding data_xyz table is |
| 21 ** created in the RBU database and populated with one row for each row to |
| 22 ** update, insert or delete from the target table. |
| 23 ** |
| 24 ** The update proceeds in three stages: |
| 25 ** |
| 26 ** 1) The database is updated. The modified database pages are written |
| 27 ** to a *-oal file. A *-oal file is just like a *-wal file, except |
| 28 ** that it is named "<database>-oal" instead of "<database>-wal". |
| 29 ** Because regular SQLite clients do not look for a file named |
| 30 ** "<database>-oal", they go on using the original database in |
| 31 ** rollback mode while the *-oal file is being generated. |
| 32 ** |
| 33 ** During this stage RBU does not update the database by writing |
| 34 ** directly to the target tables. Instead it creates "imposter" |
| 35 ** tables using the SQLITE_TESTCTRL_IMPOSTER interface that it uses |
| 36 ** to update each b-tree individually. All updates required by each |
| 37 ** b-tree are completed before moving on to the next, and all |
| 38 ** updates are done in sorted key order. |
| 39 ** |
| 40 ** 2) The "<database>-oal" file is moved to the equivalent "<database>-wal" |
| 41 ** location using a call to rename(2). Before doing this the RBU |
| 42 ** module takes an EXCLUSIVE lock on the database file, ensuring |
| 43 ** that there are no other active readers. |
| 44 ** |
| 45 ** Once the EXCLUSIVE lock is released, any other database readers |
| 46 ** detect the new *-wal file and read the database in wal mode. At |
| 47 ** this point they see the new version of the database - including |
| 48 ** the updates made as part of the RBU update. |
| 49 ** |
| 50 ** 3) The new *-wal file is checkpointed. This proceeds in the same way |
| 51 ** as a regular database checkpoint, except that a single frame is |
| 52 ** checkpointed each time sqlite3rbu_step() is called. If the RBU |
| 53 ** handle is closed before the entire *-wal file is checkpointed, |
| 54 ** the checkpoint progress is saved in the RBU database and the |
| 55 ** checkpoint can be resumed by another RBU client at some point in |
| 56 ** the future. |
| 57 ** |
| 58 ** POTENTIAL PROBLEMS |
| 59 ** |
| 60 ** The rename() call might not be portable. And RBU is not currently |
| 61 ** syncing the directory after renaming the file. |
| 62 ** |
| 63 ** When state is saved, any commit to the *-oal file and the commit to |
| 64 ** the RBU update database are not atomic. So if the power fails at the |
| 65 ** wrong moment they might get out of sync. As the main database will be |
| 66 ** committed before the RBU update database this will likely either just |
| 67 ** pass unnoticed, or result in SQLITE_CONSTRAINT errors (due to UNIQUE |
| 68 ** constraint violations). |
| 69 ** |
| 70 ** If some client does modify the target database mid RBU update, or some |
| 71 ** other error occurs, the RBU extension will keep throwing errors. It's |
| 72 ** not really clear how to get out of this state. The system could just |
| 73 ** delete the RBU update database and *-oal file and have the device |
| 74 ** download the update again and start over. |
| 75 ** |
| 76 ** At present, for an UPDATE, both the new.* and old.* records are |
| 77 ** collected in the rbu_xyz table. And for both UPDATEs and DELETEs all |
| 78 ** fields are collected. This means we're probably writing a lot more |
| 79 ** data to disk when saving the state of an ongoing update to the RBU |
| 80 ** update database than is strictly necessary. |
| 81 ** |
| 82 */ |
| 83 |
| 84 /* #include <assert.h> */ |
| 85 /* #include <string.h> */ |
| 86 /* #include <stdio.h> */ |
| 87 |
| 88 /* #include "sqlite3.h" */ |
| 89 |
| 90 #if !defined(SQLITE_CORE) || defined(SQLITE_ENABLE_RBU) |
| 91 /************** Include sqlite3rbu.h in the middle of sqlite3rbu.c ***********/ |
| 92 /************** Begin file sqlite3rbu.h **************************************/ |
| 93 /* |
| 94 ** 2014 August 30 |
| 95 ** |
| 96 ** The author disclaims copyright to this source code. In place of |
| 97 ** a legal notice, here is a blessing: |
| 98 ** |
| 99 ** May you do good and not evil. |
| 100 ** May you find forgiveness for yourself and forgive others. |
| 101 ** May you share freely, never taking more than you give. |
| 102 ** |
| 103 ************************************************************************* |
| 104 ** |
| 105 ** This file contains the public interface for the RBU extension. |
| 106 */ |
| 107 |
| 108 /* |
| 109 ** SUMMARY |
| 110 ** |
| 111 ** Writing a transaction containing a large number of operations on |
| 112 ** b-tree indexes that are collectively larger than the available cache |
| 113 ** memory can be very inefficient. |
| 114 ** |
| 115 ** The problem is that in order to update a b-tree, the leaf page (at least) |
| 116 ** containing the entry being inserted or deleted must be modified. If the |
| 117 ** working set of leaves is larger than the available cache memory, then a |
| 118 ** single leaf that is modified more than once as part of the transaction |
| 119 ** may be loaded from or written to the persistent media multiple times. |
| 120 ** Additionally, because the index updates are likely to be applied in |
| 121 ** random order, access to pages within the database is also likely to be in |
| 122 ** random order, which is itself quite inefficient. |
| 123 ** |
| 124 ** One way to improve the situation is to sort the operations on each index |
| 125 ** by index key before applying them to the b-tree. This leads to an IO |
| 126 ** pattern that resembles a single linear scan through the index b-tree, |
| 127 ** and all but guarantees each modified leaf page is loaded and stored |
| 128 ** exactly once. SQLite uses this trick to improve the performance of |
| 129 ** CREATE INDEX commands. This extension allows it to be used to improve |
| 130 ** the performance of large transactions on existing databases. |
| 131 ** |
| 132 ** Additionally, this extension allows the work involved in writing the |
| 133 ** large transaction to be broken down into sub-transactions performed |
| 134 ** sequentially by separate processes. This is useful if the system cannot |
| 135 ** guarantee that a single update process will run for long enough to apply |
| 136 ** the entire update, for example because the update is being applied on a |
| 137 ** mobile device that is frequently rebooted. Even after the writer process |
| 138 ** has committed one or more sub-transactions, other database clients continue |
| 139 ** to read from the original database snapshot. In other words, partially |
| 140 ** applied transactions are not visible to other clients. |
| 141 ** |
| 142 ** "RBU" stands for "Resumable Bulk Update". As in a large database update |
| 143 ** transmitted via a wireless network to a mobile device. A transaction |
| 144 ** applied using this extension is hence referred to as an "RBU update". |
| 145 ** |
| 146 ** |
| 147 ** LIMITATIONS |
| 148 ** |
| 149 ** An "RBU update" transaction is subject to the following limitations: |
| 150 ** |
| 151 ** * The transaction must consist of INSERT, UPDATE and DELETE operations |
| 152 ** only. |
| 153 ** |
| 154 ** * INSERT statements may not use any default values. |
| 155 ** |
| 156 ** * UPDATE and DELETE statements must identify their target rows by |
| 157 ** non-NULL PRIMARY KEY values. Rows with NULL values stored in PRIMARY |
| 158 ** KEY fields may not be updated or deleted. If the table being written |
| 159 ** has no PRIMARY KEY, affected rows must be identified by rowid. |
| 160 ** |
| 161 ** * UPDATE statements may not modify PRIMARY KEY columns. |
| 162 ** |
| 163 ** * No triggers will be fired. |
| 164 ** |
| 165 ** * No foreign key violations are detected or reported. |
| 166 ** |
| 167 ** * CHECK constraints are not enforced. |
| 168 ** |
| 169 ** * No constraint handling mode except for "OR ROLLBACK" is supported. |
| 170 ** |
| 171 ** |
| 172 ** PREPARATION |
| 173 ** |
| 174 ** An "RBU update" is stored as a separate SQLite database. A database |
| 175 ** containing an RBU update is an "RBU database". For each table in the |
| 176 ** target database to be updated, the RBU database should contain a table |
| 177 ** named "data_<target name>" containing the same set of columns as the |
| 178 ** target table, and one more - "rbu_control". The data_% table should |
| 179 ** have no PRIMARY KEY or UNIQUE constraints, but each column should have |
| 180 ** the same type as the corresponding column in the target database. |
| 181 ** The "rbu_control" column should have no type at all. For example, if |
| 182 ** the target database contains: |
| 183 ** |
| 184 ** CREATE TABLE t1(a INTEGER PRIMARY KEY, b TEXT, c UNIQUE); |
| 185 ** |
| 186 ** Then the RBU database should contain: |
| 187 ** |
| 188 ** CREATE TABLE data_t1(a INTEGER, b TEXT, c, rbu_control); |
| 189 ** |
| 190 ** The order of the columns in the data_% table does not matter. |
| 191 ** |
| 192 ** Instead of a regular table, the RBU database may also contain virtual |
| 193 ** tables or views named using the data_<target> naming scheme. |
| 194 ** |
| 195 ** Instead of the plain data_<target> naming scheme, RBU database tables |
| 196 ** may also be named data<integer>_<target>, where <integer> is any sequence |
| 197 ** of zero or more numeric characters (0-9). This can be significant because |
| 198 ** tables within the RBU database are always processed in order sorted by |
| 199 ** name. By judicious selection of the <integer> portion of the names |
| 200 ** of the RBU tables the user can therefore control the order in which they |
| 201 ** are processed. This can be useful, for example, to ensure that "external |
| 202 ** content" FTS4 tables are updated before their underlying content tables. |
| 203 ** |
| 204 ** If the target database table is a virtual table or a table that has no |
| 205 ** PRIMARY KEY declaration, the data_% table must also contain a column |
| 206 ** named "rbu_rowid". This column is mapped to the table's implicit primary |
| 207 ** key column - "rowid". Virtual tables for which the "rowid" column does |
| 208 ** not function like a primary key value cannot be updated using RBU. For |
| 209 ** example, if the target db contains either of the following: |
| 210 ** |
| 211 ** CREATE VIRTUAL TABLE x1 USING fts3(a, b); |
| 212 ** CREATE TABLE x1(a, b); |
| 213 ** |
| 214 ** then the RBU database should contain: |
| 215 ** |
| 216 ** CREATE TABLE data_x1(a, b, rbu_rowid, rbu_control); |
| 217 ** |
| 218 ** All non-hidden columns (i.e. all columns matched by "SELECT *") of the |
| 219 ** target table must be present in the input table. For virtual tables, |
| 220 ** hidden columns are optional - they are updated by RBU if present in |
| 221 ** the input table, or not otherwise. For example, to write to an fts4 |
| 222 ** table with a hidden languageid column such as: |
| 223 ** |
| 224 ** CREATE VIRTUAL TABLE ft1 USING fts4(a, b, languageid='langid'); |
| 225 ** |
| 226 ** Either of the following input table schemas may be used: |
| 227 ** |
| 228 ** CREATE TABLE data_ft1(a, b, langid, rbu_rowid, rbu_control); |
| 229 ** CREATE TABLE data_ft1(a, b, rbu_rowid, rbu_control); |
| 230 ** |
| 231 ** For each row to INSERT into the target database as part of the RBU |
| 232 ** update, the corresponding data_% table should contain a single record |
| 233 ** with the "rbu_control" column set to contain integer value 0. The |
| 234 ** other columns should be set to the values that make up the new record |
| 235 ** to insert. |
| 236 ** |
| 237 ** If the target database table has an INTEGER PRIMARY KEY, it is not |
| 238 ** possible to insert a NULL value into the IPK column. Attempting to |
| 239 ** do so results in an SQLITE_MISMATCH error. |
| 240 ** |
| 241 ** For each row to DELETE from the target database as part of the RBU |
| 242 ** update, the corresponding data_% table should contain a single record |
| 243 ** with the "rbu_control" column set to contain integer value 1. The |
| 244 ** real primary key values of the row to delete should be stored in the |
| 245 ** corresponding columns of the data_% table. The values stored in the |
| 246 ** other columns are not used. |
| 247 ** |
| 248 ** For each row to UPDATE in the target database as part of the RBU |
| 249 ** update, the corresponding data_% table should contain a single record |
| 250 ** with the "rbu_control" column set to contain a value of type text. |
| 251 ** The real primary key values identifying the row to update should be |
| 252 ** stored in the corresponding columns of the data_% table row, as should |
| 253 ** the new values of all columns being updated. The text value in the |
| 254 ** "rbu_control" column must contain the same number of characters as |
| 255 ** there are columns in the target database table, and must consist entirely |
| 256 ** of 'x' and '.' characters (or in some special cases 'd' - see below). For |
| 257 ** each column that is being updated, the corresponding character is set to |
| 258 ** 'x'. For those that remain as they are, the corresponding character of the |
| 259 ** rbu_control value should be set to '.'. For example, given the tables |
| 260 ** above, the update statement: |
| 261 ** |
| 262 ** UPDATE t1 SET c = 'usa' WHERE a = 4; |
| 263 ** |
| 264 ** is represented by the data_t1 row created by: |
| 265 ** |
| 266 ** INSERT INTO data_t1(a, b, c, rbu_control) VALUES(4, NULL, 'usa', '..x'); |
| 267 ** |
| 268 ** Instead of an 'x' character, characters of the rbu_control value specified |
| 269 ** for UPDATEs may also be set to 'd'. In this case, instead of updating the |
| 270 ** target table with the value stored in the corresponding data_% column, the |
| 271 ** user-defined SQL function "rbu_delta()" is invoked and the result stored in |
| 272 ** the target table column. rbu_delta() is invoked with two arguments - the |
| 273 ** original value currently stored in the target table column and the |
| 274 ** value specified in the data_xxx table. |
| 275 ** |
| 276 ** For example, this row: |
| 277 ** |
| 278 ** INSERT INTO data_t1(a, b, c, rbu_control) VALUES(4, NULL, 'usa', '..d'); |
| 279 ** |
| 280 ** is similar to an UPDATE statement such as: |
| 281 ** |
| 282 ** UPDATE t1 SET c = rbu_delta(c, 'usa') WHERE a = 4; |
| 283 ** |
| 284 ** Finally, if an 'f' character appears in place of a 'd' or 'x' in an |
| 285 ** rbu_control string, the contents of the data_xxx table column are assumed |
| 286 ** to be a "fossil delta" - a patch to be applied to a blob value in the |
| 287 ** format used by the fossil source-code management system. In this case |
| 288 ** the existing value within the target database table must be of type BLOB. |
| 289 ** It is replaced by the result of applying the specified fossil delta to |
| 290 ** itself. |
| 291 ** |
| 292 ** If the target database table is a virtual table or a table with no PRIMARY |
| 293 ** KEY, the rbu_control value should not include a character corresponding |
| 294 ** to the rbu_rowid value. For example, this: |
| 295 ** |
| 296 ** INSERT INTO data_ft1(a, b, rbu_rowid, rbu_control) |
| 297 ** VALUES(NULL, 'usa', 12, '.x'); |
| 298 ** |
| 299 ** causes a result similar to: |
| 300 ** |
| 301 ** UPDATE ft1 SET b = 'usa' WHERE rowid = 12; |
| 302 ** |
| 303 ** The data_xxx tables themselves should have no PRIMARY KEY declarations. |
| 304 ** However, RBU is more efficient if reading the rows in from each data_xxx |
| 305 ** table in "rowid" order is roughly the same as reading them sorted by |
| 306 ** the PRIMARY KEY of the corresponding target database table. In other |
| 307 ** words, rows should be sorted using the destination table PRIMARY KEY |
| 308 ** fields before they are inserted into the data_xxx tables. |
| 309 ** |
| 310 ** USAGE |
| 311 ** |
| 312 ** The API declared below allows an application to apply an RBU update |
| 313 ** stored on disk to an existing target database. Essentially, the |
| 314 ** application: |
| 315 ** |
| 316 ** 1) Opens an RBU handle using the sqlite3rbu_open() function. |
| 317 ** |
| 318 ** 2) Registers any required virtual table modules with the database |
| 319 ** handle returned by sqlite3rbu_db(). Also, if required, register |
| 320 ** the rbu_delta() implementation. |
| 321 ** |
| 322 ** 3) Calls the sqlite3rbu_step() function one or more times on |
| 323 ** the new handle. Each call to sqlite3rbu_step() performs a single |
| 324 ** b-tree operation, so thousands of calls may be required to apply |
| 325 ** a complete update. |
| 326 ** |
| 327 ** 4) Calls sqlite3rbu_close() to close the RBU update handle. If |
| 328 ** sqlite3rbu_step() has been called enough times to completely |
| 329 ** apply the update to the target database, then the RBU database |
| 330 ** is marked as fully applied. Otherwise, the state of the RBU |
| 331 ** update application is saved in the RBU database for later |
| 332 ** resumption. |
| 333 ** |
| 334 ** See comments below for more detail on APIs. |
| 335 ** |
| 336 ** If an update is only partially applied to the target database by the |
| 337 ** time sqlite3rbu_close() is called, various state information is saved |
| 338 ** within the RBU database. This allows subsequent processes to automatically |
| 339 ** resume the RBU update from where it left off. |
| 340 ** |
| 341 ** To remove all RBU extension state information, returning an RBU database |
| 342 ** to its original contents, it is sufficient to drop all tables that begin |
| 343 ** with the prefix "rbu_". |
| 344 ** |
| 345 ** DATABASE LOCKING |
| 346 ** |
| 347 ** An RBU update may not be applied to a database in WAL mode. Attempting |
| 348 ** to do so is an error (SQLITE_ERROR). |
| 349 ** |
| 350 ** While an RBU handle is open, a SHARED lock may be held on the target |
| 351 ** database file. This means it is possible for other clients to read the |
| 352 ** database, but not to write it. |
| 353 ** |
| 354 ** If an RBU update is started and then suspended before it is completed, |
| 355 ** then an external client writes to the database, then attempting to resume |
| 356 ** the suspended RBU update is also an error (SQLITE_BUSY). |
| 357 */ |
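The rbu_control mask for an UPDATE row, as described in the PREPARATION notes above, can be built mechanically from a per-column "changed" flag. The following is a minimal sketch only: demoBuildControlMask() is a hypothetical helper that is not part of the RBU API, and it assumes the flags are supplied in target-table column order (excluding any rbu_rowid column). It emits only 'x' and '.'; the 'd' and 'f' variants are left out.

```c
#include <assert.h>
#include <stddef.h>

/*
** Hypothetical helper, not part of the RBU extension.
**
** Write an rbu_control mask for an UPDATE row into zOut. aChanged[i] is
** non-zero if target table column i is being modified. zOut must have
** room for nCol+1 bytes. Each modified column yields 'x', each unchanged
** column yields '.'.
*/
static void demoBuildControlMask(
  const int *aChanged,            /* One flag per target table column */
  size_t nCol,                    /* Number of columns in target table */
  char *zOut                      /* OUT: nul-terminated mask */
){
  size_t i;
  assert( aChanged!=NULL && zOut!=NULL );
  for(i=0; i<nCol; i++){
    zOut[i] = aChanged[i] ? 'x' : '.';
  }
  zOut[nCol] = '\0';
}
```

For the t1(a, b, c) example, flags {0, 0, 1} give the mask used by the "UPDATE t1 SET c = 'usa' WHERE a = 4" row.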
| 358 |
| 359 #ifndef _SQLITE3RBU_H |
| 360 #define _SQLITE3RBU_H |
| 361 |
| 362 /* #include "sqlite3.h" ** Required for error code definitions ** */ |
| 363 |
| 364 #if 0 |
| 365 extern "C" { |
| 366 #endif |
| 367 |
| 368 typedef struct sqlite3rbu sqlite3rbu; |
| 369 |
| 370 /* |
| 371 ** Open an RBU handle. |
| 372 ** |
| 373 ** Argument zTarget is the path to the target database. Argument zRbu is |
| 374 ** the path to the RBU database. Each call to this function must be matched |
| 375 ** by a call to sqlite3rbu_close(). When opening the databases, RBU passes |
| 376 ** the SQLITE_OPEN_URI flag to sqlite3_open_v2(). So if either zTarget |
| 377 ** or zRbu begins with "file:", it will be interpreted as an SQLite |
| 378 ** database URI, not a regular file name. |
| 379 ** |
| 380 ** If the zState argument is passed a NULL value, the RBU extension stores |
| 381 ** the current state of the update (how many rows have been updated, which |
| 382 ** indexes are yet to be updated etc.) within the RBU database itself. This |
| 383 ** can be convenient, as it means that the RBU application does not need to |
| 384 ** organize removing a separate state file after the update is concluded. |
| 385 ** Or, if zState is non-NULL, it must be a path to a database file in which |
| 386 ** the RBU extension can store the state of the update. |
| 387 ** |
| 388 ** When resuming an RBU update, the zState argument must be passed the same |
| 389 ** value as when the RBU update was started. |
| 390 ** |
| 391 ** Once the RBU update is finished, the RBU extension does not |
| 392 ** automatically remove any zState database file, even if it created it. |
| 393 ** |
| 394 ** By default, RBU uses the default VFS to access the files on disk. To |
| 395 ** use a VFS other than the default, an SQLite "file:" URI containing a |
| 396 ** "vfs=..." option may be passed as the zTarget argument. |
| 397 ** |
| 398 ** IMPORTANT NOTE FOR ZIPVFS USERS: The RBU extension works with all of |
| 399 ** SQLite's built-in VFSs, including the multiplexor VFS. However it does |
| 400 ** not work out of the box with zipvfs. Refer to the comment describing |
| 401 ** the zipvfs_create_vfs() API below for details on using RBU with zipvfs. |
| 402 */ |
| 403 SQLITE_API sqlite3rbu *SQLITE_STDCALL sqlite3rbu_open( |
| 404 const char *zTarget, |
| 405 const char *zRbu, |
| 406 const char *zState |
| 407 ); |
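The comment above says a non-default VFS is selected by passing a "file:" URI with a "vfs=..." option as zTarget. A minimal sketch of assembling such a string is shown here. demoTargetUri() is a hypothetical helper, not part of the RBU API. It does no percent-encoding, so it simply rejects paths containing '?', '#' or '%' rather than trying to escape them.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
** Hypothetical helper, not part of the RBU extension.
**
** Format "file:<zPath>?vfs=<zVfs>" into the nOut byte buffer zOut.
** Returns 0 on success, or -1 if the result does not fit or if zPath
** contains a character that would need URI escaping ('?', '#' or '%').
*/
static int demoTargetUri(
  char *zOut, size_t nOut,        /* Output buffer and its size */
  const char *zPath,              /* Plain path of the target database */
  const char *zVfs                /* Name of the VFS to request */
){
  int n;
  assert( zOut!=NULL && zPath!=NULL && zVfs!=NULL );
  if( strpbrk(zPath, "?#%")!=NULL ) return -1;
  n = snprintf(zOut, nOut, "file:%s?vfs=%s", zPath, zVfs);
  return (n<0 || (size_t)n>=nOut) ? -1 : 0;
}
```

The resulting string would be passed as the zTarget argument of sqlite3rbu_open(). The VFS name used with it must already be registered.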
| 408 |
| 409 /* |
| 410 ** Internally, each RBU connection uses a separate SQLite database |
| 411 ** connection to access the target and rbu update databases. This |
| 412 ** API allows the application direct access to these database handles. |
| 413 ** |
| 414 ** The first argument passed to this function must be a valid, open, RBU |
| 415 ** handle. The second argument should be passed zero to access the target |
| 416 ** database handle, or non-zero to access the rbu update database handle. |
| 417 ** Accessing the underlying database handles may be useful in the |
| 418 ** following scenarios: |
| 419 ** |
| 420 ** * If any target tables are virtual tables, it may be necessary to |
| 421 ** call sqlite3_create_module() on the target database handle to |
| 422 ** register the required virtual table implementations. |
| 423 ** |
| 424 ** * If the data_xxx tables in the RBU source database are virtual |
| 425 ** tables, the application may need to call sqlite3_create_module() on |
| 426 ** the rbu update db handle to register any required virtual table |
| 427 ** implementations. |
| 428 ** |
| 429 ** * If the application uses the "rbu_delta()" feature described above, |
| 430 ** it must use sqlite3_create_function() or similar to register the |
| 431 ** rbu_delta() implementation with the target database handle. |
| 432 ** |
| 433 ** If an error has occurred, either while opening or stepping the RBU object, |
| 434 ** this function may return NULL. The error code and message may be collected |
| 435 ** when sqlite3rbu_close() is called. |
| 436 ** |
| 437 ** Database handles returned by this function remain valid until the next |
| 438 ** call to any sqlite3rbu_xxx() function other than sqlite3rbu_db(). |
| 439 */ |
| 440 SQLITE_API sqlite3 *SQLITE_STDCALL sqlite3rbu_db(sqlite3rbu*, int bRbu); |
| 441 |
| 442 /* |
| 443 ** Do some work towards applying the RBU update to the target db. |
| 444 ** |
| 445 ** Return SQLITE_DONE if the update has been completely applied, or |
| 446 ** SQLITE_OK if no error occurs but there remains work to do to apply |
| 447 ** the RBU update. If an error does occur, some other error code is |
| 448 ** returned. |
| 449 ** |
| 450 ** Once a call to sqlite3rbu_step() has returned a value other than |
| 451 ** SQLITE_OK, all subsequent calls on the same RBU handle are no-ops |
| 452 ** that immediately return the same value. |
| 453 */ |
| 454 SQLITE_API int SQLITE_STDCALL sqlite3rbu_step(sqlite3rbu *pRbu); |
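Because each call does a small amount of work and the update can be resumed later, a caller may want to run only a bounded number of steps per session. The sketch below shows that loop shape without linking against SQLite: DemoHandle and demoStep() are hypothetical stand-ins for an RBU handle and sqlite3rbu_step(), and DEMO_OK / DEMO_DONE reuse the numeric values of SQLITE_OK and SQLITE_DONE.

```c
#include <assert.h>

#define DEMO_OK    0     /* Same numeric value as SQLITE_OK */
#define DEMO_DONE  101   /* Same numeric value as SQLITE_DONE */

/* Hypothetical stand-in for an RBU handle: counts remaining work units. */
typedef struct DemoHandle { int nRemaining; } DemoHandle;

/*
** Stand-in for sqlite3rbu_step(). Performs one unit of work per call.
** Once DEMO_DONE has been returned, further calls are no-ops that
** return DEMO_DONE again.
*/
static int demoStep(DemoHandle *p){
  assert( p!=0 );
  if( p->nRemaining<=0 ) return DEMO_DONE;
  p->nRemaining--;
  return p->nRemaining==0 ? DEMO_DONE : DEMO_OK;
}

/*
** Call demoStep() at most nMax times. Returns DEMO_DONE if the update
** finished, DEMO_OK if the budget ran out with work still pending, or
** whatever other code demoStep() returned.
*/
static int demoRunBudget(DemoHandle *p, int nMax){
  int rc = DEMO_OK;
  int i;
  for(i=0; i<nMax && rc==DEMO_OK; i++){
    rc = demoStep(p);
  }
  return rc;
}
```

With the real API, the caller would follow such a loop with sqlite3rbu_close(), which saves state when the result is SQLITE_OK so that a later process can resume.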
| 455 |
| 456 /* |
| 457 ** Force RBU to save its state to disk. |
| 458 ** |
| 459 ** If a power failure or application crash occurs during an update, following |
| 460 ** system recovery RBU may resume the update from the point at which the state |
| 461 ** was last saved. In other words, from the most recent successful call to |
| 462 ** sqlite3rbu_close() or this function. |
| 463 ** |
| 464 ** SQLITE_OK is returned if successful, or an SQLite error code otherwise. |
| 465 */ |
| 466 SQLITE_API int SQLITE_STDCALL sqlite3rbu_savestate(sqlite3rbu *pRbu); |
| 467 |
| 468 /* |
| 469 ** Close an RBU handle. |
| 470 ** |
| 471 ** If the RBU update has been completely applied, mark the RBU database |
| 472 ** as fully applied. Otherwise, assuming no error has occurred, save the |
| 473 ** current state of the RBU update application to the RBU database. |
| 474 ** |
| 475 ** If an error has already occurred as part of an sqlite3rbu_step() |
| 476 ** or sqlite3rbu_open() call, or if one occurs within this function, an |
| 477 ** SQLite error code is returned. Additionally, *pzErrmsg may be set to |
| 478 ** point to a buffer containing a utf-8 formatted English language error |
| 479 ** message. It is the responsibility of the caller to eventually free any |
| 480 ** such buffer using sqlite3_free(). |
| 481 ** |
| 482 ** Otherwise, if no error occurs, this function returns SQLITE_OK if the |
| 483 ** update has been partially applied, or SQLITE_DONE if it has been |
| 484 ** completely applied. |
| 485 */ |
| 486 SQLITE_API int SQLITE_STDCALL sqlite3rbu_close(sqlite3rbu *pRbu, char **pzErrmsg); |
| 487 |
| 488 /* |
| 489 ** Return the total number of key-value operations (inserts, deletes or |
| 490 ** updates) that have been performed on the target database since the |
| 491 ** current RBU update was started. |
| 492 */ |
| 493 SQLITE_API sqlite3_int64 SQLITE_STDCALL sqlite3rbu_progress(sqlite3rbu *pRbu); |
| 494 |
| 495 /* |
| 496 ** Create an RBU VFS named zName that accesses the underlying file-system |
| 497 ** via existing VFS zParent. Or, if the zParent parameter is passed NULL, |
| 498 ** then the new RBU VFS uses the default system VFS to access the file-system. |
| 499 ** The new object is registered as a non-default VFS with SQLite before |
| 500 ** returning. |
| 501 ** |
| 502 ** Part of the RBU implementation uses a custom VFS object. Usually, this |
| 503 ** object is created and deleted automatically by RBU. |
| 504 ** |
| 505 ** The exception is for applications that also use zipvfs. In this case, |
| 506 ** the custom VFS must be explicitly created by the user before the RBU |
| 507 ** handle is opened. The RBU VFS should be installed so that the zipvfs |
| 508 ** VFS uses the RBU VFS, which in turn uses any other VFS layers in use |
| 509 ** (for example multiplexor) to access the file-system. For example, |
| 510 ** to assemble an RBU enabled VFS stack that uses both zipvfs and |
| 511 ** multiplexor (error checking omitted): |
| 512 ** |
| 513 ** // Create a VFS named "multiplex" (not the default). |
| 514 ** sqlite3_multiplex_initialize(0, 0); |
| 515 ** |
| 516 ** // Create an rbu VFS named "rbu" that uses multiplexor. If the |
| 517 ** // second argument were replaced with NULL, the "rbu" VFS would |
| 518 ** // access the file-system via the system default VFS, bypassing the |
| 519 ** // multiplexor. |
| 520 ** sqlite3rbu_create_vfs("rbu", "multiplex"); |
| 521 ** |
| 522 ** // Create a zipvfs VFS named "zipvfs" that uses rbu. |
| 523 ** zipvfs_create_vfs_v3("zipvfs", "rbu", 0, xCompressorAlgorithmDetector); |
| 524 ** |
| 525 ** // Make zipvfs the default VFS. |
| 526 ** sqlite3_vfs_register(sqlite3_vfs_find("zipvfs"), 1); |
| 527 ** |
| 528 ** Because the default VFS created above includes RBU functionality, it |
| 529 ** may be used by RBU clients. Attempting to use RBU with a zipvfs VFS stack |
| 530 ** that does not include the RBU layer results in an error. |
| 531 ** |
| 532 ** The overhead of adding the "rbu" VFS to the system is negligible for |
| 533 ** non-RBU users. There is no harm in an application accessing the |
| 534 ** file-system via "rbu" all the time, even if it only uses RBU functionality |
| 535 ** occasionally. |
| 536 */ |
| 537 SQLITE_API int SQLITE_STDCALL sqlite3rbu_create_vfs(const char *zName, const char *zParent); |
| 538 |
| 539 /* |
| 540 ** Deregister and destroy an RBU vfs created by an earlier call to |
| 541 ** sqlite3rbu_create_vfs(). |
| 542 ** |
| 543 ** VFS objects are not reference counted. If a VFS object is destroyed |
| 544 ** before all database handles that use it have been closed, the results |
| 545 ** are undefined. |
| 546 */ |
| 547 SQLITE_API void SQLITE_STDCALL sqlite3rbu_destroy_vfs(const char *zName); |
| 548 |
| 549 #if 0 |
| 550 } /* end of the 'extern "C"' block */ |
| 551 #endif |
| 552 |
| 553 #endif /* _SQLITE3RBU_H */ |
| 554 |
| 555 /************** End of sqlite3rbu.h ******************************************/ |
| 556 /************** Continuing where we left off in sqlite3rbu.c *****************/ |
| 557 |
| 558 #if defined(_WIN32_WCE) |
| 559 /* #include "windows.h" */ |
| 560 #endif |
| 561 |
| 562 /* Maximum number of prepared UPDATE statements held by this module */ |
| 563 #define SQLITE_RBU_UPDATE_CACHESIZE 16 |
| 564 |
| 565 /* |
| 566 ** Swap two objects of type TYPE. |
| 567 */ |
| 568 #if !defined(SQLITE_AMALGAMATION) |
| 569 # define SWAP(TYPE,A,B) {TYPE t=A; A=B; B=t;} |
| 570 #endif |
| 571 |
| 572 /* |
| 573 ** The rbu_state table is used to save the state of a partially applied |
| 574 ** update so that it can be resumed later. The table consists of integer |
| 575 ** keys mapped to values as follows: |
| 576 ** |
| 577 ** RBU_STATE_STAGE: |
| 578 ** May be set to integer values 1, 2, 4 or 5. As follows: |
| 579 ** 1: the *-oal file is currently under construction. |
| 580 ** 2: the *-oal file has been constructed, but not yet moved |
| 581 ** to the *-wal path. |
| 582 ** 4: the checkpoint is underway. |
| 583 ** 5: the rbu update has been checkpointed. |
| 584 ** |
| 585 ** RBU_STATE_TBL: |
| 586 ** Only valid if STAGE==1. The target database name of the table |
| 587 ** currently being written. |
| 588 ** |
| 589 ** RBU_STATE_IDX: |
| 590 ** Only valid if STAGE==1. The target database name of the index |
| 591 ** currently being written, or NULL if the main table is currently being |
| 592 ** updated. |
| 593 ** |
| 594 ** RBU_STATE_ROW: |
| 595 ** Only valid if STAGE==1. Number of rows already processed for the current |
| 596 ** table/index. |
| 597 ** |
| 598 ** RBU_STATE_PROGRESS: |
| 599 ** Total number of sqlite3rbu_step() calls made so far as part of this |
| 600 ** rbu update. |
| 601 ** |
| 602 ** RBU_STATE_CKPT: |
| 603 ** Valid if STAGE==4. The 64-bit checksum associated with the wal-index |
| 604 ** header created by recovering the *-wal file. This is used to detect |
| 605 ** cases when another client appends frames to the *-wal file in the |
| 606 ** middle of an incremental checkpoint (an incremental checkpoint cannot |
| 607 ** be continued if this happens). |
| 608 ** |
| 609 ** RBU_STATE_COOKIE: |
| 610 ** Valid if STAGE==1. The current change-counter cookie value in the |
| 611 ** target db file. |
| 612 ** |
| 613 ** RBU_STATE_OALSZ: |
| 614 ** Valid if STAGE==1. The size in bytes of the *-oal file. |
| 615 */ |
| 616 #define RBU_STATE_STAGE 1 |
| 617 #define RBU_STATE_TBL 2 |
| 618 #define RBU_STATE_IDX 3 |
| 619 #define RBU_STATE_ROW 4 |
| 620 #define RBU_STATE_PROGRESS 5 |
| 621 #define RBU_STATE_CKPT 6 |
| 622 #define RBU_STATE_COOKIE 7 |
| 623 #define RBU_STATE_OALSZ 8 |
| 624 |
| 625 #define RBU_STAGE_OAL 1 |
| 626 #define RBU_STAGE_MOVE 2 |
| 627 #define RBU_STAGE_CAPTURE 3 |
| 628 #define RBU_STAGE_CKPT 4 |
| 629 #define RBU_STAGE_DONE 5 |
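The rbu_state comment above ties several keys to a particular stage. As an illustration only, that rule can be written as a small predicate. The DEMO_* names and demoStateKeyIsValid() are hypothetical. The numeric values are copied from the definitions above, and the validity rules are taken from the comment text, not from the module's own logic.

```c
#include <assert.h>

/* Values copied from the RBU_STATE_* and RBU_STAGE_* definitions above. */
#define DEMO_STATE_STAGE     1
#define DEMO_STATE_TBL       2
#define DEMO_STATE_IDX       3
#define DEMO_STATE_ROW       4
#define DEMO_STATE_PROGRESS  5
#define DEMO_STATE_CKPT      6
#define DEMO_STATE_COOKIE    7
#define DEMO_STATE_OALSZ     8

#define DEMO_STAGE_OAL       1
#define DEMO_STAGE_CKPT      4
#define DEMO_STAGE_DONE      5

/*
** Hypothetical helper. Return 1 if the rbu_state comment documents key
** eKey as meaningful while the update is in stage eStage, or 0 if not.
*/
static int demoStateKeyIsValid(int eStage, int eKey){
  assert( eStage>=DEMO_STAGE_OAL && eStage<=DEMO_STAGE_DONE );
  switch( eKey ){
    case DEMO_STATE_STAGE:
    case DEMO_STATE_PROGRESS:
      return 1;                          /* No stage restriction documented */
    case DEMO_STATE_TBL:
    case DEMO_STATE_IDX:
    case DEMO_STATE_ROW:
    case DEMO_STATE_COOKIE:
    case DEMO_STATE_OALSZ:
      return eStage==DEMO_STAGE_OAL;     /* "Valid if STAGE==1" */
    case DEMO_STATE_CKPT:
      return eStage==DEMO_STAGE_CKPT;    /* "Valid if STAGE==4" */
    default:
      return 0;                          /* Not a documented key */
  }
}
```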
| 630 |
| 631 |
| 632 #define RBU_CREATE_STATE \ |
| 633 "CREATE TABLE IF NOT EXISTS %s.rbu_state(k INTEGER PRIMARY KEY, v)" |
| 634 |
| 635 typedef struct RbuFrame RbuFrame; |
| 636 typedef struct RbuObjIter RbuObjIter; |
| 637 typedef struct RbuState RbuState; |
| 638 typedef struct rbu_vfs rbu_vfs; |
| 639 typedef struct rbu_file rbu_file; |
| 640 typedef struct RbuUpdateStmt RbuUpdateStmt; |
| 641 |
| 642 #if !defined(SQLITE_AMALGAMATION) |
| 643 typedef unsigned int u32; |
| 644 typedef unsigned char u8; |
| 645 typedef sqlite3_int64 i64; |
| 646 #endif |
| 647 |
| 648 /* |
| 649 ** These values must match the values defined in wal.c for the equivalent |
| 650 ** locks. These are not magic numbers as they are part of the SQLite file |
| 651 ** format. |
| 652 */ |
| 653 #define WAL_LOCK_WRITE 0 |
| 654 #define WAL_LOCK_CKPT 1 |
| 655 #define WAL_LOCK_READ0 3 |
| 656 |
| 657 /* |
| 658 ** A structure to store values read from the rbu_state table in memory. |
| 659 */ |
| 660 struct RbuState { |
| 661 int eStage; |
| 662 char *zTbl; |
| 663 char *zIdx; |
| 664 i64 iWalCksum; |
| 665 int nRow; |
| 666 i64 nProgress; |
| 667 u32 iCookie; |
| 668 i64 iOalSz; |
| 669 }; |
| 670 |
| 671 struct RbuUpdateStmt { |
| 672 char *zMask; /* Copy of update mask used with pUpdate */ |
| 673 sqlite3_stmt *pUpdate; /* Last update statement (or NULL) */ |
| 674 RbuUpdateStmt *pNext; |
| 675 }; |
| 676 |
| 677 /* |
| 678 ** An iterator of this type is used to iterate through all objects in |
| 679 ** the target database that require updating. For each such table, the |
| 680 ** iterator visits, in order: |
| 681 ** |
| 682 ** * the table itself, |
| 683 ** * each index of the table (zero or more points to visit), and |
| 684 ** * a special "cleanup table" state. |
| 685 ** |
| 686 ** abIndexed: |
| 687 ** If the table has no indexes on it, abIndexed is set to NULL. Otherwise, |
| 688 ** it points to an array of flags nTblCol elements in size. The flag is |
| 689 ** set for each column that is either a part of the PK or a part of an |
| 690 ** index, and is clear otherwise. |
| 691 ** |
| 692 */ |
| 693 struct RbuObjIter { |
| 694 sqlite3_stmt *pTblIter; /* Iterate through tables */ |
| 695 sqlite3_stmt *pIdxIter; /* Index iterator */ |
| 696 int nTblCol; /* Size of azTblCol[] array */ |
| 697 char **azTblCol; /* Array of unquoted target column names */ |
| 698 char **azTblType; /* Array of target column types */ |
| 699 int *aiSrcOrder; /* src table col -> target table col */ |
| 700 u8 *abTblPk; /* Array of flags, set on target PK columns */ |
| 701 u8 *abNotNull; /* Array of flags, set on NOT NULL columns */ |
| 702 u8 *abIndexed; /* Array of flags, set on indexed & PK cols */ |
| 703 int eType; /* Table type - an RBU_PK_XXX value */ |
| 704 |
| 705 /* Output variables. zTbl==0 implies EOF. */ |
| 706 int bCleanup; /* True in "cleanup" state */ |
| 707 const char *zTbl; /* Name of target db table */ |
| 708 const char *zDataTbl; /* Name of rbu db table (or null) */ |
| 709 const char *zIdx; /* Name of target db index (or null) */ |
| 710 int iTnum; /* Root page of current object */ |
| 711 int iPkTnum; /* If eType==EXTERNAL, root of PK index */ |
| 712 int bUnique; /* Current index is unique */ |
| 713 |
| 714 /* Statements created by rbuObjIterPrepareAll() */ |
| 715 int nCol; /* Number of columns in current object */ |
| 716 sqlite3_stmt *pSelect; /* Source data */ |
| 717 sqlite3_stmt *pInsert; /* Statement for INSERT operations */ |
| 718 sqlite3_stmt *pDelete; /* Statement for DELETE ops */ |
| 719 sqlite3_stmt *pTmpInsert; /* Insert into rbu_tmp_$zDataTbl */ |
| 720 |
| 721 /* Last UPDATE used (for PK b-tree updates only), or NULL. */ |
| 722 RbuUpdateStmt *pRbuUpdate; |
| 723 }; |
| 724 |
| 725 /* |
| 726 ** Values for RbuObjIter.eType |
| 727 ** |
| 728 ** 0: Table does not exist (error) |
| 729 ** 1: Table has an implicit rowid. |
| 730 ** 2: Table has an explicit IPK column. |
| 731 ** 3: Table has an external PK index. |
| 732 ** 4: Table is WITHOUT ROWID. |
| 733 ** 5: Table is a virtual table. |
| 734 */ |
| 735 #define RBU_PK_NOTABLE 0 |
| 736 #define RBU_PK_NONE 1 |
| 737 #define RBU_PK_IPK 2 |
| 738 #define RBU_PK_EXTERNAL 3 |
| 739 #define RBU_PK_WITHOUT_ROWID 4 |
| 740 #define RBU_PK_VTAB 5 |
| 741 |
| 742 |
| 743 /* |
| 744 ** Within the RBU_STAGE_OAL stage, each call to sqlite3rbu_step() performs |
| 745 ** one of the following operations. |
| 746 */ |
| 747 #define RBU_INSERT 1 /* Insert on a main table b-tree */ |
| 748 #define RBU_DELETE 2 /* Delete a row from a main table b-tree */ |
| 749 #define RBU_IDX_DELETE 3 /* Delete a row from an aux. index b-tree */ |
| 750 #define RBU_IDX_INSERT 4 /* Insert on an aux. index b-tree */ |
| 751 #define RBU_UPDATE 5 /* Update a row in a main table b-tree */ |
| 752 |
| 753 |
| 754 /* |
| 755 ** A single step of an incremental checkpoint - frame iWalFrame of the wal |
| 756 ** file should be copied to page iDbPage of the database file. |
| 757 */ |
| 758 struct RbuFrame { |
| 759 u32 iDbPage; |
| 760 u32 iWalFrame; |
| 761 }; |
| 762 |
| 763 /* |
| 764 ** RBU handle. |
| 765 */ |
| 766 struct sqlite3rbu { |
| 767 int eStage; /* Value of RBU_STATE_STAGE field */ |
| 768 sqlite3 *dbMain; /* target database handle */ |
| 769 sqlite3 *dbRbu; /* rbu database handle */ |
| 770 char *zTarget; /* Path to target db */ |
| 771 char *zRbu; /* Path to rbu db */ |
| 772 char *zState; /* Path to state db (or NULL if zRbu) */ |
| 773 char zStateDb[5]; /* Db name for state ("stat" or "main") */ |
| 774 int rc; /* Value returned by last rbu_step() call */ |
| 775 char *zErrmsg; /* Error message if rc!=SQLITE_OK */ |
| 776 int nStep; /* Rows processed for current object */ |
| 777 int nProgress; /* Rows processed for all objects */ |
| 778 RbuObjIter objiter; /* Iterator for skipping through tbl/idx */ |
| 779 const char *zVfsName; /* Name of automatically created rbu vfs */ |
| 780 rbu_file *pTargetFd; /* File handle open on target db */ |
| 781 i64 iOalSz; |
| 782 |
| 783 /* The following state variables are used as part of the incremental |
| 784 ** checkpoint stage (eStage==RBU_STAGE_CKPT). See comments surrounding |
| 785 ** function rbuSetupCheckpoint() for details. */ |
| 786 u32 iMaxFrame; /* Largest iWalFrame value in aFrame[] */ |
| 787 u32 mLock; |
| 788 int nFrame; /* Entries in aFrame[] array */ |
| 789 int nFrameAlloc; /* Allocated size of aFrame[] array */ |
| 790 RbuFrame *aFrame; |
| 791 int pgsz; |
| 792 u8 *aBuf; |
| 793 i64 iWalCksum; |
| 794 }; |
| 795 |
| 796 /* |
| 797 ** An rbu VFS is implemented using an instance of this structure. |
| 798 */ |
| 799 struct rbu_vfs { |
| 800 sqlite3_vfs base; /* rbu VFS shim methods */ |
| 801 sqlite3_vfs *pRealVfs; /* Underlying VFS */ |
| 802 sqlite3_mutex *mutex; /* Mutex to protect pMain */ |
| 803 rbu_file *pMain; /* Linked list of main db files */ |
| 804 }; |
| 805 |
| 806 /* |
| 807 ** Each file opened by an rbu VFS is represented by an instance of |
| 808 ** the following structure. |
| 809 */ |
| 810 struct rbu_file { |
| 811 sqlite3_file base; /* sqlite3_file methods */ |
| 812 sqlite3_file *pReal; /* Underlying file handle */ |
| 813 rbu_vfs *pRbuVfs; /* Pointer to the rbu_vfs object */ |
| 814 sqlite3rbu *pRbu; /* Pointer to rbu object (rbu target only) */ |
| 815 |
| 816 int openFlags; /* Flags this file was opened with */ |
| 817 u32 iCookie; /* Cookie value for main db files */ |
| 818 u8 iWriteVer; /* "write-version" value for main db files */ |
| 819 |
| 820 int nShm; /* Number of entries in apShm[] array */ |
| 821 char **apShm; /* Array of mmap'd *-shm regions */ |
| 822 char *zDel; /* Delete this when closing file */ |
| 823 |
| 824 const char *zWal; /* Wal filename for this main db file */ |
| 825 rbu_file *pWalFd; /* Wal file descriptor for this main db */ |
| 826 rbu_file *pMainNext; /* Next MAIN_DB file */ |
| 827 }; |
| 828 |
| 829 |
| 830 /************************************************************************* |
| 831 ** The following three functions, found below: |
| 832 ** |
| 833 ** rbuDeltaGetInt() |
| 834 ** rbuDeltaChecksum() |
| 835 ** rbuDeltaApply() |
| 836 ** |
| 837 ** are lifted from the fossil source code (http://fossil-scm.org). They |
| 838 ** are used to implement the scalar SQL function rbu_fossil_delta(). |
| 839 */ |
| 840 |
| 841 /* |
| 842 ** Read bytes from *pz and convert them into a positive integer. When |
| 843 ** finished, leave *pz pointing to the first character past the end of |
| 844 ** the integer. The *pLen parameter holds the length of the string |
| 845 ** in *pz and is decremented once for each character in the integer. |
| 846 */ |
| 847 static unsigned int rbuDeltaGetInt(const char **pz, int *pLen){ |
| 848 static const signed char zValue[] = { |
| 849 -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, |
| 850 -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, |
| 851 -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, |
| 852 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, -1, -1, -1, -1, -1, -1, |
| 853 -1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, |
| 854 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, -1, -1, -1, -1, 36, |
| 855 -1, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, |
| 856 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, -1, -1, -1, 63, -1, |
| 857 }; |
| 858 unsigned int v = 0; |
| 859 int c; |
| 860 unsigned char *z = (unsigned char*)*pz; |
| 861 unsigned char *zStart = z; |
| 862 while( (c = zValue[0x7f&*(z++)])>=0 ){ |
| 863 v = (v<<6) + c; |
| 864 } |
| 865 z--; |
| 866 *pLen -= z - zStart; |
| 867 *pz = (char*)z; |
| 868 return v; |
| 869 } |
| 870 |
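The lookup table in rbuDeltaGetInt() encodes a base-64 digit alphabet: '0'-'9' map to 0..9, 'A'-'Z' to 10..35, '_' to 36, 'a'-'z' to 37..62 and '~' to 63. The sketch below restates that mapping with explicit comparisons instead of a table. The names demoDigitValue() and demoGetInt() are hypothetical and are not part of sqlite3rbu.c. Unlike the original, this version takes an explicit length bound and does not mask input bytes with 0x7f, so bytes of 0x80 and above are simply treated as non-digits.

```c
#include <assert.h>
#include <stddef.h>

/* Map one character to its 6-bit digit value, or -1 if it is not a digit.
** Digit order: 0-9, A-Z, '_', a-z, '~' (values 0..63). */
static int demoDigitValue(unsigned char c){
  if( c>='0' && c<='9' ) return c - '0';
  if( c>='A' && c<='Z' ) return c - 'A' + 10;
  if( c=='_' ) return 36;
  if( c>='a' && c<='z' ) return c - 'a' + 37;
  if( c=='~' ) return 63;
  return -1;
}

/* Decode the integer at the start of z, reading at most n bytes. Store the
** number of bytes consumed in *pnUsed and return the decoded value. */
static unsigned int demoGetInt(const char *z, size_t n, size_t *pnUsed){
  unsigned int v = 0;
  size_t i = 0;
  int d;
  assert( z!=0 && pnUsed!=0 );
  while( i<n && (d = demoDigitValue((unsigned char)z[i]))>=0 ){
    v = (v<<6) + (unsigned int)d;   /* Each digit contributes 6 bits */
    i++;
  }
  *pnUsed = i;
  return v;
}
```

For instance, the two digits "1A" decode to 1*64 + 10 = 74, and decoding stops at the first byte that is not a digit (such as the "\n" that ends a delta header).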
| 871 /* |
| 872 ** Compute a 32-bit checksum on the N-byte buffer. Return the result. |
| 873 */ |
| 874 static unsigned int rbuDeltaChecksum(const char *zIn, size_t N){ |
| 875 const unsigned char *z = (const unsigned char *)zIn; |
| 876 unsigned sum0 = 0; |
| 877 unsigned sum1 = 0; |
| 878 unsigned sum2 = 0; |
| 879 unsigned sum3 = 0; |
| 880 while(N >= 16){ |
| 881 sum0 += ((unsigned)z[0] + z[4] + z[8] + z[12]); |
| 882 sum1 += ((unsigned)z[1] + z[5] + z[9] + z[13]); |
| 883 sum2 += ((unsigned)z[2] + z[6] + z[10]+ z[14]); |
| 884 sum3 += ((unsigned)z[3] + z[7] + z[11]+ z[15]); |
| 885 z += 16; |
| 886 N -= 16; |
| 887 } |
| 888 while(N >= 4){ |
| 889 sum0 += z[0]; |
| 890 sum1 += z[1]; |
| 891 sum2 += z[2]; |
| 892 sum3 += z[3]; |
| 893 z += 4; |
| 894 N -= 4; |
| 895 } |
| 896 sum3 += (sum2 << 8) + (sum1 << 16) + (sum0 << 24); |
| 897 switch(N){ |
| 898 case 3: sum3 += (z[2] << 8); |
| 899 case 2: sum3 += (z[1] << 16); |
| 900 case 1: sum3 += (z[0] << 24); |
| 901 default: ; |
| 902 } |
| 903 return sum3; |
| 904 } |
| 905 |
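One way to read rbuDeltaChecksum(): it is the sum, modulo 2^32, of the buffer interpreted as big-endian 32-bit words, with a trailing partial word padded on the right with zero bytes. The four-lane loop is an unrolled form of that sum. The sketch below is a byte-at-a-time restatement for illustration only; demoChecksum() is a hypothetical name, and the equivalence assumes that `unsigned` is 32 bits wide in the original.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum the buffer as big-endian 32-bit words, wrapping modulo 2^32. A final
** partial word is implicitly padded on the right with zero bytes. */
static uint32_t demoChecksum(const char *zIn, size_t N){
  const unsigned char *z = (const unsigned char*)zIn;
  uint32_t sum = 0;
  size_t i;
  assert( zIn!=0 || N==0 );
  for(i=0; i<N; i++){
    /* Byte i lands in bits 24, 16, 8 or 0 depending on its position
    ** within its 4-byte word. */
    sum += (uint32_t)z[i] << (8 * (3 - (i & 3)));
  }
  return sum;
}
```

With this reading, the checksum of the four bytes "abcd" is simply 0x61626364, and appending "e" adds 0x65000000 to it.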
| 906 /* |
| 907 ** Apply a delta. |
| 908 ** |
| 909 ** The output buffer should be big enough to hold the whole output |
| 910 ** file and a NUL terminator at the end. The rbuDeltaOutputSize() |
| 911 ** routine below determines this size. |
| 912 ** |
| 913 ** The delta string should be null-terminated. But the delta string |
| 914 ** may contain embedded NUL characters (if the input and output are |
| 915 ** binary files) so we also have to pass in the length of the delta in |
| 916 ** the lenDelta parameter. |
| 917 ** |
| 918 ** This function returns the size of the output file in bytes (excluding |
| 919 ** the final NUL terminator character). Except, if the delta string is |
| 920 ** malformed or intended for use with a source file other than zSrc, |
| 921 ** then this routine returns -1. |
| 922 ** |
| 923 ** Refer to the delta_create() documentation in the fossil source code |
| 924 ** for a description of the delta file format. |
| 925 */ |
| 926 static int rbuDeltaApply( |
| 927 const char *zSrc, /* The source or pattern file */ |
| 928 int lenSrc, /* Length of the source file */ |
| 929 const char *zDelta, /* Delta to apply to the pattern */ |
| 930 int lenDelta, /* Length of the delta */ |
| 931 char *zOut /* Write the output into this preallocated buffer */ |
| 932 ){ |
| 933 unsigned int limit; |
| 934 unsigned int total = 0; |
| 935 #ifndef FOSSIL_OMIT_DELTA_CKSUM_TEST |
| 936 char *zOrigOut = zOut; |
| 937 #endif |
| 938 |
| 939 limit = rbuDeltaGetInt(&zDelta, &lenDelta); |
| 940 if( *zDelta!='\n' ){ |
| 941 /* ERROR: size integer not terminated by "\n" */ |
| 942 return -1; |
| 943 } |
| 944 zDelta++; lenDelta--; |
| 945 while( *zDelta && lenDelta>0 ){ |
| 946 unsigned int cnt, ofst; |
| 947 cnt = rbuDeltaGetInt(&zDelta, &lenDelta); |
| 948 switch( zDelta[0] ){ |
| 949 case '@': { |
| 950 zDelta++; lenDelta--; |
| 951 ofst = rbuDeltaGetInt(&zDelta, &lenDelta); |
| 952 if( lenDelta>0 && zDelta[0]!=',' ){ |
| 953 /* ERROR: copy command not terminated by ',' */ |
| 954 return -1; |
| 955 } |
| 956 zDelta++; lenDelta--; |
| 957 total += cnt; |
| 958 if( total>limit ){ |
| 959 /* ERROR: copy exceeds output file size */ |
| 960 return -1; |
| 961 } |
| 962 if( (int)(ofst+cnt) > lenSrc ){ |
| 963 /* ERROR: copy extends past end of input */ |
| 964 return -1; |
| 965 } |
| 966 memcpy(zOut, &zSrc[ofst], cnt); |
| 967 zOut += cnt; |
| 968 break; |
| 969 } |
| 970 case ':': { |
| 971 zDelta++; lenDelta--; |
| 972 total += cnt; |
| 973 if( total>limit ){ |
| 974 /* ERROR: insert command gives an output larger than predicted */ |
| 975 return -1; |
| 976 } |
| 977 if( (int)cnt>lenDelta ){ |
| 978 /* ERROR: insert count exceeds size of delta */ |
| 979 return -1; |
| 980 } |
| 981 memcpy(zOut, zDelta, cnt); |
| 982 zOut += cnt; |
| 983 zDelta += cnt; |
| 984 lenDelta -= cnt; |
| 985 break; |
| 986 } |
| 987 case ';': { |
| 988 zDelta++; lenDelta--; |
| 989 zOut[0] = 0; |
| 990 #ifndef FOSSIL_OMIT_DELTA_CKSUM_TEST |
| 991 if( cnt!=rbuDeltaChecksum(zOrigOut, total) ){ |
| 992 /* ERROR: bad checksum */ |
| 993 return -1; |
| 994 } |
| 995 #endif |
| 996 if( total!=limit ){ |
| 997 /* ERROR: generated size does not match predicted size */ |
| 998 return -1; |
| 999 } |
| 1000 return total; |
| 1001 } |
| 1002 default: { |
| 1003 /* ERROR: unknown delta operator */ |
| 1004 return -1; |
| 1005 } |
| 1006 } |
| 1007 } |
| 1008 /* ERROR: unterminated delta */ |
| 1009 return -1; |
| 1010 } |
| 1011 |
| 1012 static int rbuDeltaOutputSize(const char *zDelta, int lenDelta){ |
| 1013 int size; |
| 1014 size = rbuDeltaGetInt(&zDelta, &lenDelta); |
| 1015 if( *zDelta!='\n' ){ |
| 1016 /* ERROR: size integer not terminated by "\n" */ |
| 1017 return -1; |
| 1018 } |
| 1019 return size; |
| 1020 } |
| 1021 |
| 1022 /* |
| 1023 ** End of code taken from fossil. |
| 1024 *************************************************************************/ |
| 1025 |
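As read from rbuDeltaApply(), a delta is a header "<size>\n" followed by commands: "<cnt>@<ofst>," copies cnt bytes from offset ofst of the source, "<cnt>:" inserts the next cnt literal bytes, and "<cksum>;" ends the delta. All integers use the base-64 digits described earlier. The following is a simplified sketch of that loop, not the routine above: demoInt() and demoApply() are hypothetical names, the output buffer size is passed explicitly, and the checksum in the final ';' command is parsed but not verified.

```c
#include <assert.h>
#include <string.h>

/* Parse a base-64 integer (digits 0-9, A-Z, '_', a-z, '~') at *pz. */
static unsigned int demoInt(const char **pz, const char *zEnd){
  static const char zDigits[] =
    "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz~";
  unsigned int v = 0;
  const char *p;
  while( *pz<zEnd && **pz!=0 && (p = strchr(zDigits, **pz))!=0 ){
    v = (v<<6) + (unsigned int)(p - zDigits);
    (*pz)++;
  }
  return v;
}

/* Apply zDelta (nDelta bytes) to zSrc (nSrc bytes), writing at most nOut
** bytes to zOut. Return the output size, or -1 if the delta is malformed.
** The checksum is NOT verified (an assumption made to keep this short). */
static int demoApply(
  const char *zSrc, unsigned int nSrc,
  const char *zDelta, unsigned int nDelta,
  char *zOut, unsigned int nOut
){
  const char *z = zDelta;
  const char *zEnd = zDelta + nDelta;
  unsigned int limit, total = 0;

  assert( zSrc!=0 && zDelta!=0 && zOut!=0 );
  limit = demoInt(&z, zEnd);              /* Header: expected output size */
  if( z>=zEnd || *z!='\n' || limit>nOut ) return -1;
  z++;
  while( z<zEnd ){
    unsigned int cnt = demoInt(&z, zEnd);
    if( z>=zEnd ) return -1;
    switch( *z++ ){
      case '@': {                         /* Copy cnt bytes from the source */
        unsigned int ofst = demoInt(&z, zEnd);
        if( z>=zEnd || *z++!=',' ) return -1;
        if( cnt>limit-total || ofst>nSrc || cnt>nSrc-ofst ) return -1;
        memcpy(&zOut[total], &zSrc[ofst], cnt);
        total += cnt;
        break;
      }
      case ':': {                         /* Insert cnt literal bytes */
        if( cnt>limit-total || cnt>(unsigned int)(zEnd-z) ) return -1;
        memcpy(&zOut[total], z, cnt);
        total += cnt;
        z += cnt;
        break;
      }
      case ';':                           /* End of delta; cnt is the checksum */
        return total==limit ? (int)total : -1;
      default:
        return -1;                        /* Unknown command */
    }
  }
  return -1;                              /* Unterminated delta */
}
```

Applied to the source "hello world", the hand-written delta "B\n5@0,6: there0;" declares an 11-byte output ('B' is 11), copies "hello" from offset 0 and inserts the six literal bytes " there". The real rbuDeltaApply() would reject this delta because its checksum field is 0 rather than the checksum of the output.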
| 1026 /* |
| 1027 ** Implementation of SQL scalar function rbu_fossil_delta(). |
| 1028 ** |
| 1029 ** This function applies a fossil delta patch to a blob. Exactly two |
| 1030 ** arguments must be passed to this function. The first is the blob to |
| 1031 ** patch and the second the patch to apply. If no error occurs, this |
| 1032 ** function returns the patched blob. |
| 1033 */ |
| 1034 static void rbuFossilDeltaFunc( |
| 1035 sqlite3_context *context, |
| 1036 int argc, |
| 1037 sqlite3_value **argv |
| 1038 ){ |
| 1039 const char *aDelta; |
| 1040 int nDelta; |
| 1041 const char *aOrig; |
| 1042 int nOrig; |
| 1043 |
| 1044 int nOut; |
| 1045 int nOut2; |
| 1046 char *aOut; |
| 1047 |
| 1048 assert( argc==2 ); |
| 1049 |
| 1050 nOrig = sqlite3_value_bytes(argv[0]); |
| 1051 aOrig = (const char*)sqlite3_value_blob(argv[0]); |
| 1052 nDelta = sqlite3_value_bytes(argv[1]); |
| 1053 aDelta = (const char*)sqlite3_value_blob(argv[1]); |
| 1054 |
| 1055 /* Figure out the size of the output */ |
| 1056 nOut = rbuDeltaOutputSize(aDelta, nDelta); |
| 1057 if( nOut<0 ){ |
| 1058 sqlite3_result_error(context, "corrupt fossil delta", -1); |
| 1059 return; |
| 1060 } |
| 1061 |
| 1062 aOut = sqlite3_malloc(nOut+1); |
| 1063 if( aOut==0 ){ |
| 1064 sqlite3_result_error_nomem(context); |
| 1065 }else{ |
| 1066 nOut2 = rbuDeltaApply(aOrig, nOrig, aDelta, nDelta, aOut); |
| 1067 if( nOut2!=nOut ){ |
| 1068 sqlite3_result_error(context, "corrupt fossil delta", -1); |
| 1069 }else{ |
| 1070 sqlite3_result_blob(context, aOut, nOut, sqlite3_free); |
| 1071 } |
| 1072 } |
| 1073 } |
| 1074 |
| 1075 |
| 1076 /* |
| 1077 ** Prepare the SQL statement in buffer zSql against database handle db. |
| 1078 ** If successful, set *ppStmt to point to the new statement and return |
| 1079 ** SQLITE_OK. |
| 1080 ** |
| 1081 ** Otherwise, if an error does occur, set *ppStmt to NULL and return |
| 1082 ** an SQLite error code. Additionally, set output variable *pzErrmsg to |
| 1083 ** point to a buffer containing an error message. It is the responsibility |
| 1084 ** of the caller to (eventually) free this buffer using sqlite3_free(). |
| 1085 */ |
| 1086 static int prepareAndCollectError( |
| 1087 sqlite3 *db, |
| 1088 sqlite3_stmt **ppStmt, |
| 1089 char **pzErrmsg, |
| 1090 const char *zSql |
| 1091 ){ |
| 1092 int rc = sqlite3_prepare_v2(db, zSql, -1, ppStmt, 0); |
| 1093 if( rc!=SQLITE_OK ){ |
| 1094 *pzErrmsg = sqlite3_mprintf("%s", sqlite3_errmsg(db)); |
| 1095 *ppStmt = 0; |
| 1096 } |
| 1097 return rc; |
| 1098 } |
| 1099 |
| 1100 /* |
| 1101 ** Reset the SQL statement passed as the first argument. Return a copy |
| 1102 ** of the value returned by sqlite3_reset(). |
| 1103 ** |
| 1104 ** If an error has occurred, then set *pzErrmsg to point to a buffer |
| 1105 ** containing an error message. It is the responsibility of the caller |
| 1106 ** to eventually free this buffer using sqlite3_free(). |
| 1107 */ |
| 1108 static int resetAndCollectError(sqlite3_stmt *pStmt, char **pzErrmsg){ |
| 1109 int rc = sqlite3_reset(pStmt); |
| 1110 if( rc!=SQLITE_OK ){ |
| 1111 *pzErrmsg = sqlite3_mprintf("%s", sqlite3_errmsg(sqlite3_db_handle(pStmt))); |
| 1112 } |
| 1113 return rc; |
| 1114 } |
| 1115 |
| 1116 /* |
| 1117 ** Unless it is NULL, argument zSql points to a buffer allocated using |
| 1118 ** sqlite3_malloc containing an SQL statement. This function prepares the SQL |
| 1119 ** statement against database db and frees the buffer. If statement |
| 1120 ** compilation is successful, *ppStmt is set to point to the new statement |
| 1121 ** handle and SQLITE_OK is returned. |
| 1122 ** |
| 1123 ** Otherwise, if an error occurs, *ppStmt is set to NULL and an error code |
| 1124 ** returned. In this case, *pzErrmsg may also be set to point to an error |
| 1125 ** message. It is the responsibility of the caller to free this error message |
| 1126 ** buffer using sqlite3_free(). |
| 1127 ** |
| 1128 ** If argument zSql is NULL, this function assumes that an OOM has occurred. |
| 1129 ** In this case SQLITE_NOMEM is returned and *ppStmt set to NULL. |
| 1130 */ |
| 1131 static int prepareFreeAndCollectError( |
| 1132 sqlite3 *db, |
| 1133 sqlite3_stmt **ppStmt, |
| 1134 char **pzErrmsg, |
| 1135 char *zSql |
| 1136 ){ |
| 1137 int rc; |
| 1138 assert( *pzErrmsg==0 ); |
| 1139 if( zSql==0 ){ |
| 1140 rc = SQLITE_NOMEM; |
| 1141 *ppStmt = 0; |
| 1142 }else{ |
| 1143 rc = prepareAndCollectError(db, ppStmt, pzErrmsg, zSql); |
| 1144 sqlite3_free(zSql); |
| 1145 } |
| 1146 return rc; |
| 1147 } |
| 1148 |
| 1149 /* |
| 1150 ** Free the RbuObjIter.azTblCol[] and RbuObjIter.abTblPk[] arrays allocated |
| 1151 ** by an earlier call to rbuObjIterCacheTableInfo(). |
| 1152 */ |
| 1153 static void rbuObjIterFreeCols(RbuObjIter *pIter){ |
| 1154 int i; |
| 1155 for(i=0; i<pIter->nTblCol; i++){ |
| 1156 sqlite3_free(pIter->azTblCol[i]); |
| 1157 sqlite3_free(pIter->azTblType[i]); |
| 1158 } |
| 1159 sqlite3_free(pIter->azTblCol); |
| 1160 pIter->azTblCol = 0; |
| 1161 pIter->azTblType = 0; |
| 1162 pIter->aiSrcOrder = 0; |
| 1163 pIter->abTblPk = 0; |
| 1164 pIter->abNotNull = 0; |
| 1165 pIter->nTblCol = 0; |
| 1166 pIter->eType = 0; /* Invalid value */ |
| 1167 } |
| 1168 |
| 1169 /* |
| 1170 ** Finalize all statements and free all allocations that are specific to |
| 1171 ** the current object (table/index pair). |
| 1172 */ |
| 1173 static void rbuObjIterClearStatements(RbuObjIter *pIter){ |
| 1174 RbuUpdateStmt *pUp; |
| 1175 |
| 1176 sqlite3_finalize(pIter->pSelect); |
| 1177 sqlite3_finalize(pIter->pInsert); |
| 1178 sqlite3_finalize(pIter->pDelete); |
| 1179 sqlite3_finalize(pIter->pTmpInsert); |
| 1180 pUp = pIter->pRbuUpdate; |
| 1181 while( pUp ){ |
| 1182 RbuUpdateStmt *pTmp = pUp->pNext; |
| 1183 sqlite3_finalize(pUp->pUpdate); |
| 1184 sqlite3_free(pUp); |
| 1185 pUp = pTmp; |
| 1186 } |
| 1187 |
| 1188 pIter->pSelect = 0; |
| 1189 pIter->pInsert = 0; |
| 1190 pIter->pDelete = 0; |
| 1191 pIter->pRbuUpdate = 0; |
| 1192 pIter->pTmpInsert = 0; |
| 1193 pIter->nCol = 0; |
| 1194 } |
| 1195 |
| 1196 /* |
| 1197 ** Clean up any resources allocated as part of the iterator object passed |
| 1198 ** as the only argument. |
| 1199 */ |
| 1200 static void rbuObjIterFinalize(RbuObjIter *pIter){ |
| 1201 rbuObjIterClearStatements(pIter); |
| 1202 sqlite3_finalize(pIter->pTblIter); |
| 1203 sqlite3_finalize(pIter->pIdxIter); |
| 1204 rbuObjIterFreeCols(pIter); |
| 1205 memset(pIter, 0, sizeof(RbuObjIter)); |
| 1206 } |
| 1207 |
| 1208 /* |
| 1209 ** Advance the iterator to the next position. |
| 1210 ** |
| 1211 ** If no error occurs, SQLITE_OK is returned and the iterator is left |
| 1212 ** pointing to the next entry. Otherwise, an error code and message are |
| 1213 ** left in the RBU handle passed as the first argument. A copy of the |
| 1214 ** error code is returned. |
| 1215 */ |
| 1216 static int rbuObjIterNext(sqlite3rbu *p, RbuObjIter *pIter){ |
| 1217 int rc = p->rc; |
| 1218 if( rc==SQLITE_OK ){ |
| 1219 |
| 1220 /* Free any SQLite statements used while processing the previous object */ |
| 1221 rbuObjIterClearStatements(pIter); |
| 1222 if( pIter->zIdx==0 ){ |
| 1223 rc = sqlite3_exec(p->dbMain, |
| 1224 "DROP TRIGGER IF EXISTS temp.rbu_insert_tr;" |
| 1225 "DROP TRIGGER IF EXISTS temp.rbu_update1_tr;" |
| 1226 "DROP TRIGGER IF EXISTS temp.rbu_update2_tr;" |
| 1227 "DROP TRIGGER IF EXISTS temp.rbu_delete_tr;" |
| 1228 , 0, 0, &p->zErrmsg |
| 1229 ); |
| 1230 } |
| 1231 |
| 1232 if( rc==SQLITE_OK ){ |
| 1233 if( pIter->bCleanup ){ |
| 1234 rbuObjIterFreeCols(pIter); |
| 1235 pIter->bCleanup = 0; |
| 1236 rc = sqlite3_step(pIter->pTblIter); |
| 1237 if( rc!=SQLITE_ROW ){ |
| 1238 rc = resetAndCollectError(pIter->pTblIter, &p->zErrmsg); |
| 1239 pIter->zTbl = 0; |
| 1240 }else{ |
| 1241 pIter->zTbl = (const char*)sqlite3_column_text(pIter->pTblIter, 0); |
| 1242 pIter->zDataTbl = (const char*)sqlite3_column_text(pIter->pTblIter,1); |
| 1243 rc = (pIter->zDataTbl && pIter->zTbl) ? SQLITE_OK : SQLITE_NOMEM; |
| 1244 } |
| 1245 }else{ |
| 1246 if( pIter->zIdx==0 ){ |
| 1247 sqlite3_stmt *pIdx = pIter->pIdxIter; |
| 1248 rc = sqlite3_bind_text(pIdx, 1, pIter->zTbl, -1, SQLITE_STATIC); |
| 1249 } |
| 1250 if( rc==SQLITE_OK ){ |
| 1251 rc = sqlite3_step(pIter->pIdxIter); |
| 1252 if( rc!=SQLITE_ROW ){ |
| 1253 rc = resetAndCollectError(pIter->pIdxIter, &p->zErrmsg); |
| 1254 pIter->bCleanup = 1; |
| 1255 pIter->zIdx = 0; |
| 1256 }else{ |
| 1257 pIter->zIdx = (const char*)sqlite3_column_text(pIter->pIdxIter, 0); |
| 1258 pIter->iTnum = sqlite3_column_int(pIter->pIdxIter, 1); |
| 1259 pIter->bUnique = sqlite3_column_int(pIter->pIdxIter, 2); |
| 1260 rc = pIter->zIdx ? SQLITE_OK : SQLITE_NOMEM; |
| 1261 } |
| 1262 } |
| 1263 } |
| 1264 } |
| 1265 } |
| 1266 |
| 1267 if( rc!=SQLITE_OK ){ |
| 1268 rbuObjIterFinalize(pIter); |
| 1269 p->rc = rc; |
| 1270 } |
| 1271 return rc; |
| 1272 } |
| 1273 |
| 1274 |
| 1275 /* |
| 1276 ** The implementation of the rbu_target_name() SQL function. This function |
| 1277 ** accepts one argument - the name of a table in the RBU database. If the |
| 1278 ** table name matches the pattern: |
| 1279 ** |
| 1280 ** data[0-9]*_<name> |
| 1281 ** |
| 1282 ** where <name> is any sequence of 1 or more characters, <name> is returned. |
| 1283 ** Otherwise, if the only argument does not match the above pattern, an SQL |
| 1284 ** NULL is returned. |
| 1285 ** |
| 1286 ** "data_t1" -> "t1" |
| 1287 ** "data0123_t2" -> "t2" |
| 1288 ** "dataAB_t3" -> NULL |
| 1289 */ |
| 1290 static void rbuTargetNameFunc( |
| 1291 sqlite3_context *context, |
| 1292 int argc, |
| 1293 sqlite3_value **argv |
| 1294 ){ |
| 1295 const char *zIn; |
| 1296 assert( argc==1 ); |
| 1297 |
| 1298 zIn = (const char*)sqlite3_value_text(argv[0]); |
| 1299 if( zIn && strlen(zIn)>4 && memcmp("data", zIn, 4)==0 ){ |
| 1300 int i; |
| 1301 for(i=4; zIn[i]>='0' && zIn[i]<='9'; i++); |
| 1302 if( zIn[i]=='_' && zIn[i+1] ){ |
| 1303 sqlite3_result_text(context, &zIn[i+1], -1, SQLITE_STATIC); |
| 1304 } |
| 1305 } |
| 1306 } |
| 1307 |
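The name mapping performed by rbuTargetNameFunc() can be exercised without an SQLite connection. The sketch below restates the same rule ("data", then zero or more decimal digits, then "_", then a non-empty name) as a plain C helper. demoTargetName() is a hypothetical name used only for illustration; it returns a pointer into its argument instead of setting an SQL result.

```c
#include <assert.h>
#include <string.h>

/* Return a pointer to the target-table part of zIn, or NULL if zIn does
** not have the form "data" + zero or more decimal digits + "_" + name. */
static const char *demoTargetName(const char *zIn){
  size_t i = 4;
  assert( zIn!=0 );
  if( strncmp(zIn, "data", 4)!=0 ) return 0;
  while( zIn[i]>='0' && zIn[i]<='9' ) i++;        /* Skip optional digits */
  if( zIn[i]!='_' || zIn[i+1]=='\0' ) return 0;   /* Need "_" and a name */
  return &zIn[i+1];
}
```

This reproduces the three cases listed in the comment above: "data_t1" gives "t1", "data0123_t2" gives "t2", and "dataAB_t3" gives NULL.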
| 1308 /* |
| 1309 ** Initialize the iterator structure passed as the second argument. |
| 1310 ** |
| 1311 ** If no error occurs, SQLITE_OK is returned and the iterator is left |
| 1312 ** pointing to the first entry. Otherwise, an error code and message are |
| 1313 ** left in the RBU handle passed as the first argument. A copy of the |
| 1314 ** error code is returned. |
| 1315 */ |
| 1316 static int rbuObjIterFirst(sqlite3rbu *p, RbuObjIter *pIter){ |
| 1317 int rc; |
| 1318 memset(pIter, 0, sizeof(RbuObjIter)); |
| 1319 |
| 1320 rc = prepareAndCollectError(p->dbRbu, &pIter->pTblIter, &p->zErrmsg, |
| 1321 "SELECT rbu_target_name(name) AS target, name FROM sqlite_master " |
| 1322 "WHERE type IN ('table', 'view') AND target IS NOT NULL " |
| 1323 "ORDER BY name" |
| 1324 ); |
| 1325 |
| 1326 if( rc==SQLITE_OK ){ |
| 1327 rc = prepareAndCollectError(p->dbMain, &pIter->pIdxIter, &p->zErrmsg, |
| 1328 "SELECT name, rootpage, sql IS NULL OR substr(8, 6)=='UNIQUE' " |
| 1329 " FROM main.sqlite_master " |
| 1330 " WHERE type='index' AND tbl_name = ?" |
| 1331 ); |
| 1332 } |
| 1333 |
| 1334 pIter->bCleanup = 1; |
| 1335 p->rc = rc; |
| 1336 return rbuObjIterNext(p, pIter); |
| 1337 } |
| 1338 |
| 1339 /* |
| 1340 ** This is a wrapper around "sqlite3_mprintf(zFmt, ...)". If an OOM occurs, |
| 1341 ** an error code is stored in the RBU handle passed as the first argument. |
| 1342 ** |
| 1343 ** If an error has already occurred (p->rc is already set to something other |
| 1344 ** than SQLITE_OK), then this function returns NULL without modifying the |
| 1345 ** stored error code. In this case it still calls sqlite3_free() on any |
| 1346 ** printf() parameters associated with %z conversions. |
| 1347 */ |
| 1348 static char *rbuMPrintf(sqlite3rbu *p, const char *zFmt, ...){ |
| 1349 char *zSql = 0; |
| 1350 va_list ap; |
| 1351 va_start(ap, zFmt); |
| 1352 zSql = sqlite3_vmprintf(zFmt, ap); |
| 1353 if( p->rc==SQLITE_OK ){ |
| 1354 if( zSql==0 ) p->rc = SQLITE_NOMEM; |
| 1355 }else{ |
| 1356 sqlite3_free(zSql); |
| 1357 zSql = 0; |
| 1358 } |
| 1359 va_end(ap); |
| 1360 return zSql; |
| 1361 } |
| 1362 |
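rbuMPrintf() follows a "sticky error" convention: once p->rc is set, later helpers return NULL and leave the stored code alone, so callers can chain several calls and check for errors once at the end. The sketch below shows that convention using only the C standard library. DemoCtx, DEMO_NOMEM and demoMPrintf() are hypothetical names. One deliberate difference is that this version skips formatting entirely when an error is already stored, whereas rbuMPrintf() still formats the string so that %z arguments are freed.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct DemoCtx { int rc; } DemoCtx;   /* rc==0 means no error so far */
#define DEMO_NOMEM 7                          /* Hypothetical error code */

/* Return a malloc()ed formatted string, or NULL if an error is already
** stored in p->rc or if the allocation fails. A stored error code is
** never overwritten. The caller frees the result with free(). */
static char *demoMPrintf(DemoCtx *p, const char *zFmt, ...){
  va_list ap;
  char *z = 0;
  int n;
  assert( p!=0 && zFmt!=0 );
  if( p->rc!=0 ) return 0;          /* Earlier error: do nothing */
  va_start(ap, zFmt);
  n = vsnprintf(0, 0, zFmt, ap);    /* First pass: measure the output */
  va_end(ap);
  if( n>=0 ) z = (char*)malloc((size_t)n + 1);
  if( z==0 ){
    p->rc = DEMO_NOMEM;
    return 0;
  }
  va_start(ap, zFmt);
  vsnprintf(z, (size_t)n + 1, zFmt, ap);   /* Second pass: format */
  va_end(ap);
  return z;
}
```

A caller might build several strings in a row with demoMPrintf() and test ctx.rc only once afterwards, which is the usage style the helpers in this file are designed for.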
| 1363 /* |
| 1364 ** Argument zFmt is a sqlite3_mprintf() style format string. The trailing |
| 1365 ** arguments are the usual substitution values. This function performs |
| 1366 ** the printf() style substitutions and executes the result as an SQL |
| 1367 ** statement on database handle db. |
| 1368 ** |
| 1369 ** If an error occurs, an error code and error message are stored in the |
| 1370 ** RBU handle. If an error has already occurred when this function is |
| 1371 ** called, it is a no-op. |
| 1372 */ |
| 1373 static int rbuMPrintfExec(sqlite3rbu *p, sqlite3 *db, const char *zFmt, ...){ |
| 1374 va_list ap; |
| 1375 char *zSql; |
| 1376 va_start(ap, zFmt); |
| 1377 zSql = sqlite3_vmprintf(zFmt, ap); |
| 1378 if( p->rc==SQLITE_OK ){ |
| 1379 if( zSql==0 ){ |
| 1380 p->rc = SQLITE_NOMEM; |
| 1381 }else{ |
| 1382 p->rc = sqlite3_exec(db, zSql, 0, 0, &p->zErrmsg); |
| 1383 } |
| 1384 } |
| 1385 sqlite3_free(zSql); |
| 1386 va_end(ap); |
| 1387 return p->rc; |
| 1388 } |
| 1389 |
| 1390 /* |
| 1391 ** Attempt to allocate and return a pointer to a zeroed block of nByte |
| 1392 ** bytes. |
| 1393 ** |
| 1394 ** If an error (i.e. an OOM condition) occurs, return NULL and leave an |
| 1395 ** error code in the rbu handle passed as the first argument. Or, if an |
| 1396 ** error has already occurred when this function is called, return NULL |
| 1397 ** immediately without attempting the allocation or modifying the stored |
| 1398 ** error code. |
| 1399 */ |
| 1400 static void *rbuMalloc(sqlite3rbu *p, int nByte){ |
| 1401 void *pRet = 0; |
| 1402 if( p->rc==SQLITE_OK ){ |
| 1403 assert( nByte>0 ); |
| 1404 pRet = sqlite3_malloc(nByte); |
| 1405 if( pRet==0 ){ |
| 1406 p->rc = SQLITE_NOMEM; |
| 1407 }else{ |
| 1408 memset(pRet, 0, nByte); |
| 1409 } |
| 1410 } |
| 1411 return pRet; |
| 1412 } |
| 1413 |
| 1414 |
| 1415 /* |
| 1416 ** Allocate and zero the pIter->azTblCol[] and abTblPk[] arrays so that |
| 1417 ** there is room for at least nCol elements. If an OOM occurs, store an |
| 1418 ** error code in the RBU handle passed as the first argument. |
| 1419 */ |
| 1420 static void rbuAllocateIterArrays(sqlite3rbu *p, RbuObjIter *pIter, int nCol){ |
| 1421 int nByte = (2*sizeof(char*) + sizeof(int) + 3*sizeof(u8)) * nCol; |
| 1422 char **azNew; |
| 1423 |
| 1424 azNew = (char**)rbuMalloc(p, nByte); |
| 1425 if( azNew ){ |
| 1426 pIter->azTblCol = azNew; |
| 1427 pIter->azTblType = &azNew[nCol]; |
| 1428 pIter->aiSrcOrder = (int*)&pIter->azTblType[nCol]; |
| 1429 pIter->abTblPk = (u8*)&pIter->aiSrcOrder[nCol]; |
| 1430 pIter->abNotNull = (u8*)&pIter->abTblPk[nCol]; |
| 1431 pIter->abIndexed = (u8*)&pIter->abNotNull[nCol]; |
| 1432 } |
| 1433 } |
| 1434 |
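rbuAllocateIterArrays() obtains one block and carves several parallel arrays out of it, ordered from the most strictly aligned element type (pointers) down to single bytes, so every sub-array starts at a suitably aligned address and the whole set is released with one free of the first pointer. The sketch below shows the same layout with the standard allocator. DemoCols and demoAllocCols() are hypothetical names, and calloc() stands in for rbuMalloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct DemoCols DemoCols;
struct DemoCols {
  char **azName;               /* nCol pointers */
  char **azType;               /* nCol pointers */
  int *aiOrder;                /* nCol ints */
  unsigned char *abPk;         /* nCol flags */
  unsigned char *abNotNull;    /* nCol flags */
  unsigned char *abIndexed;    /* nCol flags */
};

/* Carve six parallel arrays of nCol elements out of one zeroed block.
** Return 0 on success or -1 on allocation failure. On success the whole
** block is released with a single free(p->azName). */
static int demoAllocCols(DemoCols *p, size_t nCol){
  size_t nByte = (2*sizeof(char*) + sizeof(int) + 3) * nCol;
  char **azNew;
  assert( p!=0 && nCol>0 );
  azNew = (char**)calloc(1, nByte);
  if( azNew==0 ) return -1;
  p->azName = azNew;
  p->azType = &azNew[nCol];
  p->aiOrder = (int*)&p->azType[nCol];
  p->abPk = (unsigned char*)&p->aiOrder[nCol];
  p->abNotNull = &p->abPk[nCol];
  p->abIndexed = &p->abNotNull[nCol];
  return 0;
}
```

The trade-off of this layout is that the sub-arrays cannot be resized or freed individually, which matches how rbuObjIterFreeCols() frees only azTblCol and then clears the other pointers.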
| 1435 /* |
| 1436 ** The first argument must be a nul-terminated string. This function |
| 1437 ** returns a copy of the string in memory obtained from sqlite3_malloc(). |
| 1438 ** It is the responsibility of the caller to eventually free this memory |
| 1439 ** using sqlite3_free(). |
| 1440 ** |
| 1441 ** If an OOM condition is encountered when attempting to allocate memory, |
| 1442 ** output variable (*pRc) is set to SQLITE_NOMEM before returning. Otherwise, |
| 1443 ** if the allocation succeeds, (*pRc) is left unchanged. |
| 1444 */ |
| 1445 static char *rbuStrndup(const char *zStr, int *pRc){ |
| 1446 char *zRet = 0; |
| 1447 |
| 1448 assert( *pRc==SQLITE_OK ); |
| 1449 if( zStr ){ |
| 1450 int nCopy = strlen(zStr) + 1; |
| 1451 zRet = (char*)sqlite3_malloc(nCopy); |
| 1452 if( zRet ){ |
| 1453 memcpy(zRet, zStr, nCopy); |
| 1454 }else{ |
| 1455 *pRc = SQLITE_NOMEM; |
| 1456 } |
| 1457 } |
| 1458 |
| 1459 return zRet; |
| 1460 } |
| 1461 |
| 1462 /* |
| 1463 ** Finalize the statement passed as the second argument. |
| 1464 ** |
| 1465 ** If the sqlite3_finalize() call indicates that an error occurs, and the |
| 1466 ** rbu handle error code is not already set, set the error code and error |
| 1467 ** message accordingly. |
| 1468 */ |
| 1469 static void rbuFinalize(sqlite3rbu *p, sqlite3_stmt *pStmt){ |
| 1470 sqlite3 *db = sqlite3_db_handle(pStmt); |
| 1471 int rc = sqlite3_finalize(pStmt); |
| 1472 if( p->rc==SQLITE_OK && rc!=SQLITE_OK ){ |
| 1473 p->rc = rc; |
| 1474 p->zErrmsg = sqlite3_mprintf("%s", sqlite3_errmsg(db)); |
| 1475 } |
| 1476 } |
| 1477 |
| 1478 /* Determine the type of a table. |
| 1479 ** |
| 1480 ** peType is of type (int*), a pointer to an output parameter of type |
| 1481 ** (int). This call sets the output parameter as follows, depending |
| 1482 ** on the type of the table specified by parameter zTab. |
| 1483 ** |
| 1484 ** RBU_PK_NOTABLE: No such table. |
| 1485 ** RBU_PK_NONE: Table has an implicit rowid. |
| 1486 ** RBU_PK_IPK: Table has an explicit IPK column. |
| 1487 ** RBU_PK_EXTERNAL: Table has an external PK index. |
| 1488 ** RBU_PK_WITHOUT_ROWID: Table is WITHOUT ROWID. |
| 1489 ** RBU_PK_VTAB: Table is a virtual table. |
| 1490 ** |
| 1491 ** Argument piPk is also of type (int*), and also points to an output |
| 1492 ** parameter. If the table has an external primary key index (i.e. if |
| 1493 ** *peType is set to RBU_PK_EXTERNAL), then *piPk is set to the root |
| 1494 ** page number of the primary key index before returning. Otherwise, |
| 1495 ** *piPk is set to zero. If the table exists and is not a virtual |
| 1496 ** table, *piTnum is set to the root page number of the table itself. |
| 1497 ** |
| 1498 ** ALGORITHM: |
| 1499 ** |
| 1500 ** if( no entry exists in sqlite_master ){ |
| 1501 ** return RBU_PK_NOTABLE |
| 1502 ** }else if( sql for the entry starts with "CREATE VIRTUAL" ){ |
| 1503 ** return RBU_PK_VTAB |
| 1504 ** }else if( "PRAGMA index_list()" for the table contains a "pk" index ){ |
| 1505 ** if( the index that is the pk exists in sqlite_master ){ |
| 1506 ** *piPk = rootpage of that index. |
| 1507 ** return RBU_PK_EXTERNAL |
| 1508 ** }else{ |
| 1509 ** return RBU_PK_WITHOUT_ROWID |
| 1510 ** } |
| 1511 ** }else if( "PRAGMA table_info()" lists one or more "pk" columns ){ |
| 1512 ** return RBU_PK_IPK |
| 1513 ** }else{ |
| 1514 ** return RBU_PK_NONE |
| 1515 ** } |
| 1516 */ |
| 1517 static void rbuTableType( |
| 1518 sqlite3rbu *p, |
| 1519 const char *zTab, |
| 1520 int *peType, |
| 1521 int *piTnum, |
| 1522 int *piPk |
| 1523 ){ |
| 1524 /* |
| 1525 ** 0) SELECT <is-virtual>, rootpage FROM sqlite_master WHERE name=%Q |
| 1526 ** 1) PRAGMA index_list = ? |
| 1527 ** 2) SELECT rootpage FROM sqlite_master WHERE name=%Q |
| 1528 ** 3) PRAGMA table_info = ? |
| 1529 */ |
| 1530 sqlite3_stmt *aStmt[4] = {0, 0, 0, 0}; |
| 1531 |
| 1532 *peType = RBU_PK_NOTABLE; |
| 1533 *piPk = 0; |
| 1534 |
| 1535 assert( p->rc==SQLITE_OK ); |
| 1536 p->rc = prepareFreeAndCollectError(p->dbMain, &aStmt[0], &p->zErrmsg, |
| 1537 sqlite3_mprintf( |
| 1538 "SELECT (sql LIKE 'create virtual%%'), rootpage" |
| 1539 " FROM sqlite_master" |
| 1540 " WHERE name=%Q", zTab |
| 1541 )); |
| 1542 if( p->rc!=SQLITE_OK || sqlite3_step(aStmt[0])!=SQLITE_ROW ){ |
| 1543 /* Either an error, or no such table. */ |
| 1544 goto rbuTableType_end; |
| 1545 } |
| 1546 if( sqlite3_column_int(aStmt[0], 0) ){ |
| 1547 *peType = RBU_PK_VTAB; /* virtual table */ |
| 1548 goto rbuTableType_end; |
| 1549 } |
| 1550 *piTnum = sqlite3_column_int(aStmt[0], 1); |
| 1551 |
| 1552 p->rc = prepareFreeAndCollectError(p->dbMain, &aStmt[1], &p->zErrmsg, |
| 1553 sqlite3_mprintf("PRAGMA index_list=%Q",zTab) |
| 1554 ); |
| 1555 if( p->rc ) goto rbuTableType_end; |
| 1556 while( sqlite3_step(aStmt[1])==SQLITE_ROW ){ |
| 1557 const u8 *zOrig = sqlite3_column_text(aStmt[1], 3); |
| 1558 const u8 *zIdx = sqlite3_column_text(aStmt[1], 1); |
| 1559 if( zOrig && zIdx && zOrig[0]=='p' ){ |
| 1560 p->rc = prepareFreeAndCollectError(p->dbMain, &aStmt[2], &p->zErrmsg, |
| 1561 sqlite3_mprintf( |
| 1562 "SELECT rootpage FROM sqlite_master WHERE name = %Q", zIdx |
| 1563 )); |
| 1564 if( p->rc==SQLITE_OK ){ |
| 1565 if( sqlite3_step(aStmt[2])==SQLITE_ROW ){ |
| 1566 *piPk = sqlite3_column_int(aStmt[2], 0); |
| 1567 *peType = RBU_PK_EXTERNAL; |
| 1568 }else{ |
| 1569 *peType = RBU_PK_WITHOUT_ROWID; |
| 1570 } |
| 1571 } |
| 1572 goto rbuTableType_end; |
| 1573 } |
| 1574 } |
| 1575 |
| 1576 p->rc = prepareFreeAndCollectError(p->dbMain, &aStmt[3], &p->zErrmsg, |
| 1577 sqlite3_mprintf("PRAGMA table_info=%Q",zTab) |
| 1578 ); |
| 1579 if( p->rc==SQLITE_OK ){ |
| 1580 while( sqlite3_step(aStmt[3])==SQLITE_ROW ){ |
| 1581 if( sqlite3_column_int(aStmt[3],5)>0 ){ |
| 1582 *peType = RBU_PK_IPK; /* explicit IPK column */ |
| 1583 goto rbuTableType_end; |
| 1584 } |
| 1585 } |
| 1586 *peType = RBU_PK_NONE; |
| 1587 } |
| 1588 |
| 1589 rbuTableType_end: { |
| 1590 unsigned int i; |
| 1591 for(i=0; i<sizeof(aStmt)/sizeof(aStmt[0]); i++){ |
| 1592 rbuFinalize(p, aStmt[i]); |
| 1593 } |
| 1594 } |
| 1595 } |
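The ALGORITHM comment above rbuTableType() can be read as a pure decision function. The sketch below is illustrative only and is not part of sqlite3rbu.c: the SKETCH_PK_* enum, the TableFacts struct and sketchTableType() are hypothetical names, and the facts that the real function obtains by running SQL are passed in as plain integers. It also assumes that a root page of 0 means "the pk index has no entry in sqlite_master".

```c
#include <assert.h>

/* Hypothetical stand-ins for the RBU_PK_* constants; values are arbitrary. */
enum SketchPkType {
  SKETCH_PK_NOTABLE, SKETCH_PK_NONE, SKETCH_PK_IPK,
  SKETCH_PK_EXTERNAL, SKETCH_PK_WITHOUT_ROWID, SKETCH_PK_VTAB
};

/* Facts that rbuTableType() discovers by querying the target database. */
typedef struct TableFacts {
  int bInMaster;   /* An entry for the table exists in sqlite_master */
  int bVirtual;    /* Its SQL starts with "CREATE VIRTUAL" */
  int bPkIndex;    /* PRAGMA index_list reports a "pk" index */
  int iPkRoot;     /* Root page of that index in sqlite_master, or 0 */
  int bPkColumn;   /* PRAGMA table_info reports at least one "pk" column */
} TableFacts;

/* Apply the decision tree in the same order as the ALGORITHM comment. */
static enum SketchPkType sketchTableType(const TableFacts *f, int *piPk){
  assert( f!=0 && piPk!=0 );
  *piPk = 0;
  if( !f->bInMaster ) return SKETCH_PK_NOTABLE;
  if( f->bVirtual ) return SKETCH_PK_VTAB;
  if( f->bPkIndex ){
    if( f->iPkRoot ){
      *piPk = f->iPkRoot;              /* Only set for an external PK */
      return SKETCH_PK_EXTERNAL;
    }
    return SKETCH_PK_WITHOUT_ROWID;    /* "pk" index with no b-tree of its own */
  }
  return f->bPkColumn ? SKETCH_PK_IPK : SKETCH_PK_NONE;
}
```

The order of the tests matters: the "pk" index check must come before the table_info check, because tables with an external or WITHOUT ROWID primary key also report "pk" columns.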
| 1596 |
| 1597 /* |
| 1598 ** This is a helper function for rbuObjIterCacheTableInfo(). It populates |
| 1599 ** the pIter->abIndexed[] array. |
| 1600 */ |
| 1601 static void rbuObjIterCacheIndexedCols(sqlite3rbu *p, RbuObjIter *pIter){ |
| 1602 sqlite3_stmt *pList = 0; |
| 1603 int bIndex = 0; |
| 1604 |
| 1605 if( p->rc==SQLITE_OK ){ |
| 1606 memcpy(pIter->abIndexed, pIter->abTblPk, sizeof(u8)*pIter->nTblCol); |
| 1607 p->rc = prepareFreeAndCollectError(p->dbMain, &pList, &p->zErrmsg, |
| 1608 sqlite3_mprintf("PRAGMA main.index_list = %Q", pIter->zTbl) |
| 1609 ); |
| 1610 } |
| 1611 |
| 1612 while( p->rc==SQLITE_OK && SQLITE_ROW==sqlite3_step(pList) ){ |
| 1613 const char *zIdx = (const char*)sqlite3_column_text(pList, 1); |
| 1614 sqlite3_stmt *pXInfo = 0; |
| 1615 if( zIdx==0 ) break; |
| 1616 p->rc = prepareFreeAndCollectError(p->dbMain, &pXInfo, &p->zErrmsg, |
| 1617 sqlite3_mprintf("PRAGMA main.index_xinfo = %Q", zIdx) |
| 1618 ); |
| 1619 while( p->rc==SQLITE_OK && SQLITE_ROW==sqlite3_step(pXInfo) ){ |
| 1620 int iCid = sqlite3_column_int(pXInfo, 1); |
| 1621 if( iCid>=0 ) pIter->abIndexed[iCid] = 1; |
| 1622 } |
| 1623 rbuFinalize(p, pXInfo); |
| 1624 bIndex = 1; |
| 1625 } |
| 1626 |
| 1627 rbuFinalize(p, pList); |
| 1628 if( bIndex==0 ) pIter->abIndexed = 0; |
| 1629 } |
| 1630 |
| 1631 |
| 1632 /* |
| 1633 ** If they are not already populated, populate the pIter->azTblCol[], |
| 1634 ** pIter->abTblPk[], pIter->nTblCol and pIter->bRowid variables according to |
| 1635 ** the table (not index) that the iterator currently points to. |
| 1636 ** |
| 1637 ** Return SQLITE_OK if successful, or an SQLite error code otherwise. If |
| 1638 ** an error does occur, an error code and error message are also left in |
| 1639 ** the RBU handle. |
| 1640 */ |
| 1641 static int rbuObjIterCacheTableInfo(sqlite3rbu *p, RbuObjIter *pIter){ |
| 1642 if( pIter->azTblCol==0 ){ |
| 1643 sqlite3_stmt *pStmt = 0; |
| 1644 int nCol = 0; |
| 1645 int i; /* for() loop iterator variable */ |
| 1646 int bRbuRowid = 0; /* If input table has column "rbu_rowid" */ |
| 1647 int iOrder = 0; |
| 1648 int iTnum = 0; |
| 1649 |
| 1650 /* Figure out the type of table this step will deal with. */ |
| 1651 assert( pIter->eType==0 ); |
| 1652 rbuTableType(p, pIter->zTbl, &pIter->eType, &iTnum, &pIter->iPkTnum); |
| 1653 if( p->rc==SQLITE_OK && pIter->eType==RBU_PK_NOTABLE ){ |
| 1654 p->rc = SQLITE_ERROR; |
| 1655 p->zErrmsg = sqlite3_mprintf("no such table: %s", pIter->zTbl); |
| 1656 } |
| 1657 if( p->rc ) return p->rc; |
| 1658 if( pIter->zIdx==0 ) pIter->iTnum = iTnum; |
| 1659 |
| 1660 assert( pIter->eType==RBU_PK_NONE || pIter->eType==RBU_PK_IPK |
| 1661 || pIter->eType==RBU_PK_EXTERNAL || pIter->eType==RBU_PK_WITHOUT_ROWID |
| 1662 || pIter->eType==RBU_PK_VTAB |
| 1663 ); |
| 1664 |
| 1665 /* Populate the azTblCol[] and nTblCol variables based on the columns |
| 1666 ** of the input table. Ignore any input table columns that begin with |
| 1667 ** "rbu_". */ |
| 1668 p->rc = prepareFreeAndCollectError(p->dbRbu, &pStmt, &p->zErrmsg, |
| 1669 sqlite3_mprintf("SELECT * FROM '%q'", pIter->zDataTbl) |
| 1670 ); |
| 1671 if( p->rc==SQLITE_OK ){ |
| 1672 nCol = sqlite3_column_count(pStmt); |
| 1673 rbuAllocateIterArrays(p, pIter, nCol); |
| 1674 } |
| 1675 for(i=0; p->rc==SQLITE_OK && i<nCol; i++){ |
| 1676 const char *zName = (const char*)sqlite3_column_name(pStmt, i); |
| 1677 if( sqlite3_strnicmp("rbu_", zName, 4) ){ |
| 1678 char *zCopy = rbuStrndup(zName, &p->rc); |
| 1679 pIter->aiSrcOrder[pIter->nTblCol] = pIter->nTblCol; |
| 1680 pIter->azTblCol[pIter->nTblCol++] = zCopy; |
| 1681 } |
| 1682 else if( 0==sqlite3_stricmp("rbu_rowid", zName) ){ |
| 1683 bRbuRowid = 1; |
| 1684 } |
| 1685 } |
| 1686 sqlite3_finalize(pStmt); |
| 1687 pStmt = 0; |
| 1688 |
| 1689 if( p->rc==SQLITE_OK |
| 1690 && bRbuRowid!=(pIter->eType==RBU_PK_VTAB || pIter->eType==RBU_PK_NONE) |
| 1691 ){ |
| 1692 p->rc = SQLITE_ERROR; |
| 1693 p->zErrmsg = sqlite3_mprintf( |
| 1694 "table %q %s rbu_rowid column", pIter->zDataTbl, |
| 1695 (bRbuRowid ? "may not have" : "requires") |
| 1696 ); |
| 1697 } |
| 1698 |
| 1699 /* Check that all non-HIDDEN columns in the destination table are also |
| 1700 ** present in the input table. Populate the abTblPk[], abNotNull[], |
| 1701 ** azTblType[] and aiSrcOrder[] arrays at the same time. */ |
| 1702 if( p->rc==SQLITE_OK ){ |
| 1703 p->rc = prepareFreeAndCollectError(p->dbMain, &pStmt, &p->zErrmsg, |
| 1704 sqlite3_mprintf("PRAGMA table_info(%Q)", pIter->zTbl) |
| 1705 ); |
| 1706 } |
| 1707 while( p->rc==SQLITE_OK && SQLITE_ROW==sqlite3_step(pStmt) ){ |
| 1708 const char *zName = (const char*)sqlite3_column_text(pStmt, 1); |
| 1709 if( zName==0 ) break; /* An OOM - finalize() below returns SQLITE_NOMEM */ |
| 1710 for(i=iOrder; i<pIter->nTblCol; i++){ |
| 1711 if( 0==strcmp(zName, pIter->azTblCol[i]) ) break; |
| 1712 } |
| 1713 if( i==pIter->nTblCol ){ |
| 1714 p->rc = SQLITE_ERROR; |
| 1715 p->zErrmsg = sqlite3_mprintf("column missing from %q: %s", |
| 1716 pIter->zDataTbl, zName |
| 1717 ); |
| 1718 }else{ |
| 1719 int iPk = sqlite3_column_int(pStmt, 5); |
| 1720 int bNotNull = sqlite3_column_int(pStmt, 3); |
| 1721 const char *zType = (const char*)sqlite3_column_text(pStmt, 2); |
| 1722 |
| 1723 if( i!=iOrder ){ |
| 1724 SWAP(int, pIter->aiSrcOrder[i], pIter->aiSrcOrder[iOrder]); |
| 1725 SWAP(char*, pIter->azTblCol[i], pIter->azTblCol[iOrder]); |
| 1726 } |
| 1727 |
| 1728 pIter->azTblType[iOrder] = rbuStrndup(zType, &p->rc); |
| 1729 pIter->abTblPk[iOrder] = (iPk!=0); |
| 1730 pIter->abNotNull[iOrder] = (u8)bNotNull || (iPk!=0); |
| 1731 iOrder++; |
| 1732 } |
| 1733 } |
| 1734 |
| 1735 rbuFinalize(p, pStmt); |
| 1736 rbuObjIterCacheIndexedCols(p, pIter); |
| 1737 assert( pIter->eType!=RBU_PK_VTAB || pIter->abIndexed==0 ); |
| 1738 } |
| 1739 |
| 1740 return p->rc; |
| 1741 } |
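The second loop in rbuObjIterCacheTableInfo() reorders the data_xxx column names so that they follow the target table's column order, while aiSrcOrder[] remembers where each column originally sat. A minimal stand-alone sketch of that swap-into-place step is shown below. sketchReorder() is a hypothetical helper, not a function from sqlite3rbu.c, and it reports a missing column with a plain -1 instead of setting an RBU error.

```c
#include <assert.h>
#include <string.h>

/* Reorder azCol[0..nCol-1] (names in data_xxx order) so that its leading
** entries match azTarget[0..nTarget-1] (names in target-table order).
** aiSrcOrder[] must start out as the identity mapping. On return,
** aiSrcOrder[k] holds the original position of the column now in slot k.
** Returns 0 on success, or -1 if a target column is missing from azCol. */
static int sketchReorder(
  const char **azCol, int *aiSrcOrder, int nCol,
  const char **azTarget, int nTarget
){
  int iOrder = 0;                 /* Next slot to fill */
  int t;
  assert( azCol!=0 && aiSrcOrder!=0 && azTarget!=0 );
  for(t=0; t<nTarget; t++){
    int i;
    /* Only slots at or after iOrder are still unplaced. */
    for(i=iOrder; i<nCol; i++){
      if( 0==strcmp(azTarget[t], azCol[i]) ) break;
    }
    if( i==nCol ) return -1;      /* cf. "column missing from ..." */
    if( i!=iOrder ){
      const char *zTmp = azCol[i];
      int iTmp = aiSrcOrder[i];
      azCol[i] = azCol[iOrder];
      azCol[iOrder] = zTmp;
      aiSrcOrder[i] = aiSrcOrder[iOrder];
      aiSrcOrder[iOrder] = iTmp;
    }
    iOrder++;
  }
  return 0;
}
```

For example, with data_xxx columns (c, a, b) and target columns (a, b, c), the names end up as (a, b, c) and aiSrcOrder[] becomes {1, 2, 0}.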
| 1742 |
| 1743 /* |
| 1744 ** This function constructs and returns a pointer to a nul-terminated |
| 1745 ** string containing some SQL clause or list based on one or more of the |
| 1746 ** column names currently stored in the pIter->azTblCol[] array. |
| 1747 */ |
| 1748 static char *rbuObjIterGetCollist( |
| 1749 sqlite3rbu *p, /* RBU object */ |
| 1750 RbuObjIter *pIter /* Object iterator for column names */ |
| 1751 ){ |
| 1752 char *zList = 0; |
| 1753 const char *zSep = ""; |
| 1754 int i; |
| 1755 for(i=0; i<pIter->nTblCol; i++){ |
| 1756 const char *z = pIter->azTblCol[i]; |
| 1757 zList = rbuMPrintf(p, "%z%s\"%w\"", zList, zSep, z); |
| 1758 zSep = ", "; |
| 1759 } |
| 1760 return zList; |
| 1761 } |
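rbuObjIterGetCollist() relies on the "%w" conversion, which doubles any double-quote characters in the column name so that the name is safe inside a double-quoted SQL identifier. The stand-alone sketch below reproduces that quoting with only the C standard library. sketchCollist() is a hypothetical name, it writes into a caller-supplied buffer rather than allocating, and it yields an empty string (not NULL) when there are no columns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write a comma-separated list of double-quoted identifiers to zOut
** (capacity nOut bytes). Any '"' inside a name is doubled. Returns 0 on
** success, or -1 if the buffer is too small. */
static int sketchCollist(
  char *zOut, size_t nOut,
  const char **azCol, int nCol
){
  size_t n = 0;
  int i;
  assert( zOut!=0 && nOut>0 );
  for(i=0; i<nCol; i++){
    const char *z = azCol[i];
    /* Worst case: ", " + two quotes + every byte doubled + nul */
    if( n + 2 + 2 + 2*strlen(z) + 1 > nOut ) return -1;
    if( i>0 ){
      zOut[n++] = ',';
      zOut[n++] = ' ';
    }
    zOut[n++] = '"';
    for(; *z; z++){
      if( *z=='"' ) zOut[n++] = '"';   /* Escape by doubling */
      zOut[n++] = *z;
    }
    zOut[n++] = '"';
  }
  zOut[n] = '\0';
  return 0;
}
```

Given the two names a and b"c, the buffer ends up holding "a", "b""c".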
| 1762 |
| 1763 /* |
| 1764 ** This function is used to create a SELECT list (the list of SQL |
| 1765 ** expressions that follows a SELECT keyword) for a SELECT statement |
| 1766 ** used to read from a data_xxx or rbu_tmp_xxx table while updating the |
| 1767 ** index object currently indicated by the iterator object passed as the |
| 1768 ** second argument. A "PRAGMA index_xinfo = <idxname>" statement is used |
| 1769 ** to obtain the required information. |
| 1770 ** |
| 1771 ** If the index is of the following form: |
| 1772 ** |
| 1773 ** CREATE INDEX i1 ON t1(c, b COLLATE nocase); |
| 1774 ** |
| 1775 ** and "t1" is a table with an explicit INTEGER PRIMARY KEY column |
| 1776 ** "ipk", the returned string is: |
| 1777 ** |
| 1778 ** "`c` COLLATE 'BINARY', `b` COLLATE 'NOCASE', `ipk` COLLATE 'BINARY'" |
| 1779 ** |
| 1780 ** As well as the returned string, three other malloc'd strings are |
| 1781 ** returned via output parameters. As follows: |
| 1782 ** |
| 1783 ** pzImposterCols: ... |
| 1784 ** pzImposterPk: ... |
| 1785 ** pzWhere: ... |
| 1786 */ |
| 1787 static char *rbuObjIterGetIndexCols( |
| 1788 sqlite3rbu *p, /* RBU object */ |
| 1789 RbuObjIter *pIter, /* Object iterator for column names */ |
| 1790 char **pzImposterCols, /* OUT: Columns for imposter table */ |
| 1791 char **pzImposterPk, /* OUT: Imposter PK clause */ |
| 1792 char **pzWhere, /* OUT: WHERE clause */ |
| 1793 int *pnBind /* OUT: Total number of columns */ |
| 1794 ){ |
| 1795 int rc = p->rc; /* Error code */ |
| 1796 int rc2; /* sqlite3_finalize() return code */ |
| 1797 char *zRet = 0; /* String to return */ |
| 1798 char *zImpCols = 0; /* String to return via *pzImposterCols */ |
| 1799 char *zImpPK = 0; /* String to return via *pzImposterPK */ |
| 1800 char *zWhere = 0; /* String to return via *pzWhere */ |
| 1801 int nBind = 0; /* Value to return via *pnBind */ |
| 1802 const char *zCom = ""; /* Set to ", " later on */ |
| 1803 const char *zAnd = ""; /* Set to " AND " later on */ |
| 1804 sqlite3_stmt *pXInfo = 0; /* PRAGMA index_xinfo = ? */ |
| 1805 |
| 1806 if( rc==SQLITE_OK ){ |
| 1807 assert( p->zErrmsg==0 ); |
| 1808 rc = prepareFreeAndCollectError(p->dbMain, &pXInfo, &p->zErrmsg, |
| 1809 sqlite3_mprintf("PRAGMA main.index_xinfo = %Q", pIter->zIdx) |
| 1810 ); |
| 1811 } |
| 1812 |
| 1813 while( rc==SQLITE_OK && SQLITE_ROW==sqlite3_step(pXInfo) ){ |
| 1814 int iCid = sqlite3_column_int(pXInfo, 1); |
| 1815 int bDesc = sqlite3_column_int(pXInfo, 3); |
| 1816 const char *zCollate = (const char*)sqlite3_column_text(pXInfo, 4); |
| 1817 const char *zCol; |
| 1818 const char *zType; |
| 1819 |
| 1820 if( iCid<0 ){ |
| 1821 /* An integer primary key. If the table has an explicit IPK, use |
| 1822 ** its name. Otherwise, use "rbu_rowid". */ |
| 1823 if( pIter->eType==RBU_PK_IPK ){ |
| 1824 int i; |
| 1825 for(i=0; pIter->abTblPk[i]==0; i++); |
| 1826 assert( i<pIter->nTblCol ); |
| 1827 zCol = pIter->azTblCol[i]; |
| 1828 }else{ |
| 1829 zCol = "rbu_rowid"; |
| 1830 } |
| 1831 zType = "INTEGER"; |
| 1832 }else{ |
| 1833 zCol = pIter->azTblCol[iCid]; |
| 1834 zType = pIter->azTblType[iCid]; |
| 1835 } |
| 1836 |
| 1837 zRet = sqlite3_mprintf("%z%s\"%w\" COLLATE %Q", zRet, zCom, zCol, zCollate); |
| 1838 if( pIter->bUnique==0 || sqlite3_column_int(pXInfo, 5) ){ |
| 1839 const char *zOrder = (bDesc ? " DESC" : ""); |
| 1840 zImpPK = sqlite3_mprintf("%z%s\"rbu_imp_%d%w\"%s", |
| 1841 zImpPK, zCom, nBind, zCol, zOrder |
| 1842 ); |
| 1843 } |
| 1844 zImpCols = sqlite3_mprintf("%z%s\"rbu_imp_%d%w\" %s COLLATE %Q", |
| 1845 zImpCols, zCom, nBind, zCol, zType, zCollate |
| 1846 ); |
| 1847 zWhere = sqlite3_mprintf( |
| 1848 "%z%s\"rbu_imp_%d%w\" IS ?", zWhere, zAnd, nBind, zCol |
| 1849 ); |
| 1850 if( zRet==0 || zImpPK==0 || zImpCols==0 || zWhere==0 ) rc = SQLITE_NOMEM; |
| 1851 zCom = ", "; |
| 1852 zAnd = " AND "; |
| 1853 nBind++; |
| 1854 } |
| 1855 |
| 1856 rc2 = sqlite3_finalize(pXInfo); |
| 1857 if( rc==SQLITE_OK ) rc = rc2; |
| 1858 |
| 1859 if( rc!=SQLITE_OK ){ |
| 1860 sqlite3_free(zRet); |
| 1861 sqlite3_free(zImpCols); |
| 1862 sqlite3_free(zImpPK); |
| 1863 sqlite3_free(zWhere); |
| 1864 zRet = 0; |
| 1865 zImpCols = 0; |
| 1866 zImpPK = 0; |
| 1867 zWhere = 0; |
| 1868 p->rc = rc; |
| 1869 } |
| 1870 |
| 1871 *pzImposterCols = zImpCols; |
| 1872 *pzImposterPk = zImpPK; |
| 1873 *pzWhere = zWhere; |
| 1874 *pnBind = nBind; |
| 1875 return zRet; |
| 1876 } |
| 1877 |
| 1878 /* |
| 1879 ** Assuming the current table columns are "a", "b" and "c", and the zObj |
| 1880 ** parameter is passed "old", return a string of the form: |
| 1881 ** |
| 1882 ** "old.a, old.b, old.c" |
| 1883 ** |
| 1884 ** With the column names escaped. |
| 1885 ** |
| 1886 ** For tables with implicit rowids - RBU_PK_EXTERNAL and RBU_PK_NONE, append |
| 1887 ** the text ", old._rowid_" to the returned value. |
| 1888 */ |
| 1889 static char *rbuObjIterGetOldlist( |
| 1890 sqlite3rbu *p, |
| 1891 RbuObjIter *pIter, |
| 1892 const char *zObj |
| 1893 ){ |
| 1894 char *zList = 0; |
| 1895 if( p->rc==SQLITE_OK && pIter->abIndexed ){ |
| 1896 const char *zS = ""; |
| 1897 int i; |
| 1898 for(i=0; i<pIter->nTblCol; i++){ |
| 1899 if( pIter->abIndexed[i] ){ |
| 1900 const char *zCol = pIter->azTblCol[i]; |
| 1901 zList = sqlite3_mprintf("%z%s%s.\"%w\"", zList, zS, zObj, zCol); |
| 1902 }else{ |
| 1903 zList = sqlite3_mprintf("%z%sNULL", zList, zS); |
| 1904 } |
| 1905 zS = ", "; |
| 1906 if( zList==0 ){ |
| 1907 p->rc = SQLITE_NOMEM; |
| 1908 break; |
| 1909 } |
| 1910 } |
| 1911 |
| 1912 /* For a table with implicit rowids, append "old._rowid_" to the list. */ |
| 1913 if( pIter->eType==RBU_PK_EXTERNAL || pIter->eType==RBU_PK_NONE ){ |
| 1914 zList = rbuMPrintf(p, "%z, %s._rowid_", zList, zObj); |
| 1915 } |
| 1916 } |
| 1917 return zList; |
| 1918 } |
| 1919 |
| 1920 /* |
| 1921 ** Return an expression that can be used in a WHERE clause to match the |
| 1922 ** primary key of the current table. For example, if the table is: |
| 1923 ** |
| 1924 ** CREATE TABLE t1(a, b, c, PRIMARY KEY(b, c)); |
| 1925 ** |
| 1926 ** Return the string: |
| 1927 ** |
| 1928 ** "b = ?1 AND c = ?2" |
| 1929 */ |
| 1930 static char *rbuObjIterGetWhere( |
| 1931 sqlite3rbu *p, |
| 1932 RbuObjIter *pIter |
| 1933 ){ |
| 1934 char *zList = 0; |
| 1935 if( pIter->eType==RBU_PK_VTAB || pIter->eType==RBU_PK_NONE ){ |
| 1936 zList = rbuMPrintf(p, "_rowid_ = ?%d", pIter->nTblCol+1); |
| 1937 }else if( pIter->eType==RBU_PK_EXTERNAL ){ |
| 1938 const char *zSep = ""; |
| 1939 int i; |
| 1940 for(i=0; i<pIter->nTblCol; i++){ |
| 1941 if( pIter->abTblPk[i] ){ |
| 1942 zList = rbuMPrintf(p, "%z%sc%d=?%d", zList, zSep, i, i+1); |
| 1943 zSep = " AND "; |
| 1944 } |
| 1945 } |
| 1946 zList = rbuMPrintf(p, |
| 1947 "_rowid_ = (SELECT id FROM rbu_imposter2 WHERE %z)", zList |
| 1948 ); |
| 1949 |
| 1950 }else{ |
| 1951 const char *zSep = ""; |
| 1952 int i; |
| 1953 for(i=0; i<pIter->nTblCol; i++){ |
| 1954 if( pIter->abTblPk[i] ){ |
| 1955 const char *zCol = pIter->azTblCol[i]; |
| 1956 zList = rbuMPrintf(p, "%z%s\"%w\"=?%d", zList, zSep, zCol, i+1); |
| 1957 zSep = " AND "; |
| 1958 } |
| 1959 } |
| 1960 } |
| 1961 return zList; |
| 1962 } |
| 1963 |
| 1964 /* |
| 1965 ** The SELECT statement iterating through the keys for the current object |
| 1966 ** (p->objiter.pSelect) currently points to a valid row. However, there |
| 1967 ** is something wrong with the rbu_control value stored in the |
| 1968 ** (p->nCol+1)'th column. Set the error code and error message |
| 1969 ** of the RBU handle to something reflecting this. |
| 1970 */ |
| 1971 static void rbuBadControlError(sqlite3rbu *p){ |
| 1972 p->rc = SQLITE_ERROR; |
| 1973 p->zErrmsg = sqlite3_mprintf("invalid rbu_control value"); |
| 1974 } |
| 1975 |
| 1976 |
| 1977 /* |
| 1978 ** Return a nul-terminated string containing the comma separated list of |
| 1979 ** assignments that should be included following the "SET" keyword of |
| 1980 ** an UPDATE statement used to update the table object that the iterator |
| 1981 ** passed as the second argument currently points to if the rbu_control |
| 1982 ** column of the data_xxx table entry is set to zMask. |
| 1983 ** |
| 1984 ** The memory for the returned string is obtained from sqlite3_malloc(). |
| 1985 ** It is the responsibility of the caller to eventually free it using |
| 1986 ** sqlite3_free(). |
| 1987 ** |
| 1988 ** If an OOM error is encountered when allocating space for the new |
| 1989 ** string, an error code is left in the rbu handle passed as the first |
| 1990 ** argument and NULL is returned. Or, if an error has already occurred |
| 1991 ** when this function is called, NULL is returned immediately, without |
| 1992 ** attempting the allocation or modifying the stored error code. |
| 1993 */ |
| 1994 static char *rbuObjIterGetSetlist( |
| 1995 sqlite3rbu *p, |
| 1996 RbuObjIter *pIter, |
| 1997 const char *zMask |
| 1998 ){ |
| 1999 char *zList = 0; |
| 2000 if( p->rc==SQLITE_OK ){ |
| 2001 int i; |
| 2002 |
| 2003 if( (int)strlen(zMask)!=pIter->nTblCol ){ |
| 2004 rbuBadControlError(p); |
| 2005 }else{ |
| 2006 const char *zSep = ""; |
| 2007 for(i=0; i<pIter->nTblCol; i++){ |
| 2008 char c = zMask[pIter->aiSrcOrder[i]]; |
| 2009 if( c=='x' ){ |
| 2010 zList = rbuMPrintf(p, "%z%s\"%w\"=?%d", |
| 2011 zList, zSep, pIter->azTblCol[i], i+1 |
| 2012 ); |
| 2013 zSep = ", "; |
| 2014 } |
| 2015 else if( c=='d' ){ |
| 2016 zList = rbuMPrintf(p, "%z%s\"%w\"=rbu_delta(\"%w\", ?%d)", |
| 2017 zList, zSep, pIter->azTblCol[i], pIter->azTblCol[i], i+1 |
| 2018 ); |
| 2019 zSep = ", "; |
| 2020 } |
| 2021 else if( c=='f' ){ |
| 2022 zList = rbuMPrintf(p, "%z%s\"%w\"=rbu_fossil_delta(\"%w\", ?%d)", |
| 2023 zList, zSep, pIter->azTblCol[i], pIter->azTblCol[i], i+1 |
| 2024 ); |
| 2025 zSep = ", "; |
| 2026 } |
| 2027 } |
| 2028 } |
| 2029 } |
| 2030 return zList; |
| 2031 } |
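The mapping from an rbu_control mask to a SET list, as done by rbuObjIterGetSetlist(), can be sketched without SQLite. The code below is an illustration under simplifying assumptions, not the real implementation: sketchSetlist() is a hypothetical name, the output goes to a fixed-size buffer, column names are assumed to contain no double quotes (so no "%w" escaping is needed), and the mask is indexed directly rather than through aiSrcOrder[].

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the text that follows "SET" in an UPDATE statement. zMask has one
** character per column: 'x' assigns the bound value, 'd' and 'f' apply a
** delta function, and any other character leaves the column unchanged.
** Returns 0 on success, or -1 for a bad mask length or a full buffer. */
static int sketchSetlist(
  char *zOut, size_t nOut,
  const char **azCol, int nCol,
  const char *zMask
){
  size_t n = 0;
  const char *zSep = "";
  int i;
  assert( zOut!=0 && nOut>0 );
  zOut[0] = '\0';
  if( (int)strlen(zMask)!=nCol ) return -1;   /* cf. rbuBadControlError() */
  for(i=0; i<nCol; i++){
    const char *zCol = azCol[i];
    int nNew;
    switch( zMask[i] ){
      case 'x':
        nNew = snprintf(&zOut[n], nOut-n, "%s\"%s\"=?%d", zSep, zCol, i+1);
        break;
      case 'd':
        nNew = snprintf(&zOut[n], nOut-n,
            "%s\"%s\"=rbu_delta(\"%s\", ?%d)", zSep, zCol, zCol, i+1);
        break;
      case 'f':
        nNew = snprintf(&zOut[n], nOut-n,
            "%s\"%s\"=rbu_fossil_delta(\"%s\", ?%d)", zSep, zCol, zCol, i+1);
        break;
      default:
        continue;                 /* Column is not modified by this row */
    }
    if( nNew<0 || (size_t)nNew>=nOut-n ) return -1;
    n += (size_t)nNew;
    zSep = ", ";
  }
  return 0;
}
```

Note that the bind index is always the column position plus one, so the same prepared UPDATE can be reused for every row that carries the same mask.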
| 2032 |
| 2033 /* |
| 2034 ** Return a nul-terminated string consisting of nBind comma separated |
| 2035 ** "?" expressions. For example, if nBind is 3, return a pointer to |
| 2036 ** a buffer containing the string "?,?,?". |
| 2037 ** |
| 2038 ** The memory for the returned string is obtained from sqlite3_malloc(). |
| 2039 ** It is the responsibility of the caller to eventually free it using |
| 2040 ** sqlite3_free(). |
| 2041 ** |
| 2042 ** If an OOM error is encountered when allocating space for the new |
| 2043 ** string, an error code is left in the rbu handle passed as the first |
| 2044 ** argument and NULL is returned. Or, if an error has already occurred |
| 2045 ** when this function is called, NULL is returned immediately, without |
| 2046 ** attempting the allocation or modifying the stored error code. |
| 2047 */ |
| 2048 static char *rbuObjIterGetBindlist(sqlite3rbu *p, int nBind){ |
| 2049 char *zRet = 0; |
| 2050 int nByte = nBind*2 + 1; |
| 2051 |
| 2052 zRet = (char*)rbuMalloc(p, nByte); |
| 2053 if( zRet ){ |
| 2054 int i; |
| 2055 for(i=0; i<nBind; i++){ |
| 2056 zRet[i*2] = '?'; |
| 2057 zRet[i*2+1] = (i+1==nBind) ? '\0' : ','; |
| 2058 } |
| 2059 } |
| 2060 return zRet; |
| 2061 } |
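The buffer arithmetic in rbuObjIterGetBindlist() is easy to check in isolation: each of the nBind entries takes two bytes ("?" plus either "," or the terminating nul), and one spare byte covers the nBind==0 case. The sketch below uses calloc() and free() in place of the RBU allocator. sketchBindlist() is a hypothetical name, and the zero-filled allocation is what makes the zero-bind case return an empty string here.

```c
#include <assert.h>
#include <stdlib.h>

/* Return a heap-allocated string of nBind comma-separated "?" characters,
** or NULL if the allocation fails. The caller must free() the result. */
static char *sketchBindlist(int nBind){
  char *zRet;
  int i;
  assert( nBind>=0 );
  zRet = (char*)calloc((size_t)nBind*2 + 1, 1);   /* Zero-filled */
  if( zRet==0 ) return 0;
  for(i=0; i<nBind; i++){
    zRet[i*2] = '?';
    /* The last entry writes the terminator instead of a comma. */
    zRet[i*2+1] = (i+1==nBind) ? '\0' : ',';
  }
  return zRet;
}
```

With nBind set to 3 the result is the six-byte buffer "?,?,?" including its terminator, which leaves one allocated byte unused.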
| 2062 |
| 2063 /* |
| 2064 ** The iterator currently points to a table (not index) of type |
| 2065 ** RBU_PK_WITHOUT_ROWID. This function creates the PRIMARY KEY |
| 2066 ** declaration for the corresponding imposter table. For example, |
| 2067 ** if the iterator points to a table created as: |
| 2068 ** |
| 2069 ** CREATE TABLE t1(a, b, c, PRIMARY KEY(b, a DESC)) WITHOUT ROWID |
| 2070 ** |
| 2071 ** this function returns: |
| 2072 ** |
| 2073 ** PRIMARY KEY("b", "a" DESC) |
| 2074 */ |
| 2075 static char *rbuWithoutRowidPK(sqlite3rbu *p, RbuObjIter *pIter){ |
| 2076 char *z = 0; |
| 2077 assert( pIter->zIdx==0 ); |
| 2078 if( p->rc==SQLITE_OK ){ |
| 2079 const char *zSep = "PRIMARY KEY("; |
| 2080 sqlite3_stmt *pXList = 0; /* PRAGMA index_list = (pIter->zTbl) */ |
| 2081 sqlite3_stmt *pXInfo = 0; /* PRAGMA index_xinfo = <pk-index> */ |
| 2082 |
| 2083 p->rc = prepareFreeAndCollectError(p->dbMain, &pXList, &p->zErrmsg, |
| 2084 sqlite3_mprintf("PRAGMA main.index_list = %Q", pIter->zTbl) |
| 2085 ); |
| 2086 while( p->rc==SQLITE_OK && SQLITE_ROW==sqlite3_step(pXList) ){ |
| 2087 const char *zOrig = (const char*)sqlite3_column_text(pXList,3); |
| 2088 if( zOrig && strcmp(zOrig, "pk")==0 ){ |
| 2089 const char *zIdx = (const char*)sqlite3_column_text(pXList,1); |
| 2090 if( zIdx ){ |
| 2091 p->rc = prepareFreeAndCollectError(p->dbMain, &pXInfo, &p->zErrmsg, |
| 2092 sqlite3_mprintf("PRAGMA main.index_xinfo = %Q", zIdx) |
| 2093 ); |
| 2094 } |
| 2095 break; |
| 2096 } |
| 2097 } |
| 2098 rbuFinalize(p, pXList); |
| 2099 |
| 2100 while( p->rc==SQLITE_OK && SQLITE_ROW==sqlite3_step(pXInfo) ){ |
| 2101 if( sqlite3_column_int(pXInfo, 5) ){ |
| 2102 /* int iCid = sqlite3_column_int(pXInfo, 0); */ |
| 2103 const char *zCol = (const char*)sqlite3_column_text(pXInfo, 2); |
| 2104 const char *zDesc = sqlite3_column_int(pXInfo, 3) ? " DESC" : ""; |
| 2105 z = rbuMPrintf(p, "%z%s\"%w\"%s", z, zSep, zCol, zDesc); |
| 2106 zSep = ", "; |
| 2107 } |
| 2108 } |
| 2109 z = rbuMPrintf(p, "%z)", z); |
| 2110 rbuFinalize(p, pXInfo); |
| 2111 } |
| 2112 return z; |
| 2113 } |
| 2114 |
| 2115 /* |
| 2116 ** This function creates the second imposter table used when writing to |
| 2117 ** a table b-tree where the table has an external primary key. If the |
| 2118 ** iterator passed as the second argument does not currently point to |
| 2119 ** a table (not index) with an external primary key, this function is a |
| 2120 ** no-op. |
| 2121 ** |
| 2122 ** Assuming the iterator does point to a table with an external PK, this |
| 2123 ** function creates a WITHOUT ROWID imposter table named "rbu_imposter2" |
| 2124 ** used to access that PK index. For example, if the target table is |
| 2125 ** declared as follows: |
| 2126 ** |
| 2127 ** CREATE TABLE t1(a, b TEXT, c REAL, PRIMARY KEY(b, c)); |
| 2128 ** |
| 2129 ** then the imposter table schema is: |
| 2130 ** |
| 2131 ** CREATE TABLE rbu_imposter2(c1 TEXT, c2 REAL, id INTEGER, PRIMARY KEY(c1, c2)) WITHOUT ROWID; |
| 2132 ** |
| 2133 */ |
| 2134 static void rbuCreateImposterTable2(sqlite3rbu *p, RbuObjIter *pIter){ |
| 2135 if( p->rc==SQLITE_OK && pIter->eType==RBU_PK_EXTERNAL ){ |
| 2136 int tnum = pIter->iPkTnum; /* Root page of PK index */ |
| 2137 sqlite3_stmt *pQuery = 0; /* SELECT name ... WHERE rootpage = $tnum */ |
| 2138 const char *zIdx = 0; /* Name of PK index */ |
| 2139 sqlite3_stmt *pXInfo = 0; /* PRAGMA main.index_xinfo = $zIdx */ |
| 2140 const char *zComma = ""; |
| 2141 char *zCols = 0; /* Used to build up list of table cols */ |
| 2142 char *zPk = 0; /* Used to build up table PK declaration */ |
| 2143 |
| 2144 /* Figure out the name of the primary key index for the current table. |
| 2145 ** This is needed for the argument to "PRAGMA index_xinfo". Set |
| 2146 ** zIdx to point to a nul-terminated string containing this name. */ |
| 2147 p->rc = prepareAndCollectError(p->dbMain, &pQuery, &p->zErrmsg, |
| 2148 "SELECT name FROM sqlite_master WHERE rootpage = ?" |
| 2149 ); |
| 2150 if( p->rc==SQLITE_OK ){ |
| 2151 sqlite3_bind_int(pQuery, 1, tnum); |
| 2152 if( SQLITE_ROW==sqlite3_step(pQuery) ){ |
| 2153 zIdx = (const char*)sqlite3_column_text(pQuery, 0); |
| 2154 } |
| 2155 } |
| 2156 if( zIdx ){ |
| 2157 p->rc = prepareFreeAndCollectError(p->dbMain, &pXInfo, &p->zErrmsg, |
| 2158 sqlite3_mprintf("PRAGMA main.index_xinfo = %Q", zIdx) |
| 2159 ); |
| 2160 } |
| 2161 rbuFinalize(p, pQuery); |
| 2162 |
| 2163 while( p->rc==SQLITE_OK && SQLITE_ROW==sqlite3_step(pXInfo) ){ |
| 2164 int bKey = sqlite3_column_int(pXInfo, 5); |
| 2165 if( bKey ){ |
| 2166 int iCid = sqlite3_column_int(pXInfo, 1); |
| 2167 int bDesc = sqlite3_column_int(pXInfo, 3); |
| 2168 const char *zCollate = (const char*)sqlite3_column_text(pXInfo, 4); |
| 2169 zCols = rbuMPrintf(p, "%z%sc%d %s COLLATE %s", zCols, zComma, |
| 2170 iCid, pIter->azTblType[iCid], zCollate |
| 2171 ); |
| 2172 zPk = rbuMPrintf(p, "%z%sc%d%s", zPk, zComma, iCid, bDesc?" DESC":""); |
| 2173 zComma = ", "; |
| 2174 } |
| 2175 } |
| 2176 zCols = rbuMPrintf(p, "%z, id INTEGER", zCols); |
| 2177 rbuFinalize(p, pXInfo); |
| 2178 |
| 2179 sqlite3_test_control(SQLITE_TESTCTRL_IMPOSTER, p->dbMain, "main", 1, tnum); |
| 2180 rbuMPrintfExec(p, p->dbMain, |
| 2181 "CREATE TABLE rbu_imposter2(%z, PRIMARY KEY(%z)) WITHOUT ROWID", |
| 2182 zCols, zPk |
| 2183 ); |
| 2184 sqlite3_test_control(SQLITE_TESTCTRL_IMPOSTER, p->dbMain, "main", 0, 0); |
| 2185 } |
| 2186 } |
| 2187 |
| 2188 /* |
| 2189 ** If an error has already occurred when this function is called, it |
| 2190 ** returns immediately (without doing any work). Or, if an error |
| 2191 ** occurs during the execution of this function, it sets the error code |
| 2192 ** in the sqlite3rbu object indicated by the first argument before |
| 2193 ** returning. |
| 2194 ** |
| 2195 ** The iterator passed as the second argument is guaranteed to point to |
| 2196 ** a table (not an index) when this function is called. This function |
| 2197 ** attempts to create any imposter table required to write to the main |
| 2198 ** table b-tree of the table before returning. The function returns no |
| 2199 ** value; success or failure is reported only through p->rc. |
| 2200 ** |
| 2201 ** An imposter table is required in all cases except RBU_PK_VTAB. Only |
| 2202 ** virtual tables are written to directly. The imposter table has the |
| 2203 ** same schema as the actual target table (less any UNIQUE constraints). |
| 2204 ** More precisely, the "same schema" means the same columns, types, |
| 2205 ** collation sequences. For tables that do not have an external PRIMARY |
| 2206 ** KEY, it also means the same PRIMARY KEY declaration. |
| 2207 */ |
| 2208 static void rbuCreateImposterTable(sqlite3rbu *p, RbuObjIter *pIter){ |
| 2209 if( p->rc==SQLITE_OK && pIter->eType!=RBU_PK_VTAB ){ |
| 2210 int tnum = pIter->iTnum; |
| 2211 const char *zComma = ""; |
| 2212 char *zSql = 0; |
| 2213 int iCol; |
| 2214 sqlite3_test_control(SQLITE_TESTCTRL_IMPOSTER, p->dbMain, "main", 0, 1); |
| 2215 |
| 2216 for(iCol=0; p->rc==SQLITE_OK && iCol<pIter->nTblCol; iCol++){ |
| 2217 const char *zPk = ""; |
| 2218 const char *zCol = pIter->azTblCol[iCol]; |
| 2219 const char *zColl = 0; |
| 2220 |
| 2221 p->rc = sqlite3_table_column_metadata( |
| 2222 p->dbMain, "main", pIter->zTbl, zCol, 0, &zColl, 0, 0, 0 |
| 2223 ); |
| 2224 |
| 2225 if( pIter->eType==RBU_PK_IPK && pIter->abTblPk[iCol] ){ |
| 2226 /* If the target table column is an "INTEGER PRIMARY KEY", add |
| 2227 ** "PRIMARY KEY" to the imposter table column declaration. */ |
| 2228 zPk = "PRIMARY KEY "; |
| 2229 } |
| 2230 zSql = rbuMPrintf(p, "%z%s\"%w\" %s %sCOLLATE %s%s", |
| 2231 zSql, zComma, zCol, pIter->azTblType[iCol], zPk, zColl, |
| 2232 (pIter->abNotNull[iCol] ? " NOT NULL" : "") |
| 2233 ); |
| 2234 zComma = ", "; |
| 2235 } |
| 2236 |
| 2237 if( pIter->eType==RBU_PK_WITHOUT_ROWID ){ |
| 2238 char *zPk = rbuWithoutRowidPK(p, pIter); |
| 2239 if( zPk ){ |
| 2240 zSql = rbuMPrintf(p, "%z, %z", zSql, zPk); |
| 2241 } |
| 2242 } |
| 2243 |
| 2244 sqlite3_test_control(SQLITE_TESTCTRL_IMPOSTER, p->dbMain, "main", 1, tnum); |
| 2245 rbuMPrintfExec(p, p->dbMain, "CREATE TABLE \"rbu_imp_%w\"(%z)%s", |
| 2246 pIter->zTbl, zSql, |
| 2247 (pIter->eType==RBU_PK_WITHOUT_ROWID ? " WITHOUT ROWID" : "") |
| 2248 ); |
| 2249 sqlite3_test_control(SQLITE_TESTCTRL_IMPOSTER, p->dbMain, "main", 0, 0); |
| 2250 } |
| 2251 } |
| 2252 |
| 2253 /* |
| 2254 ** Prepare a statement used to insert rows into the "rbu_tmp_xxx" table. |
| 2255 ** Specifically a statement of the form: |
| 2256 ** |
| 2257 ** INSERT INTO rbu_tmp_xxx VALUES(?, ?, ? ...); |
| 2258 ** |
| 2259 ** The number of bound variables is equal to the number of columns in |
| 2260 ** the target table, plus one (for the rbu_control column), plus one more |
| 2261 ** (for the rbu_rowid column) if the target table has an implicit rowid |
| 2262 ** (types RBU_PK_EXTERNAL and RBU_PK_NONE). |
| 2263 */ |
| 2264 static void rbuObjIterPrepareTmpInsert( |
| 2265 sqlite3rbu *p, |
| 2266 RbuObjIter *pIter, |
| 2267 const char *zCollist, |
| 2268 const char *zRbuRowid |
| 2269 ){ |
| 2270 int bRbuRowid = (pIter->eType==RBU_PK_EXTERNAL || pIter->eType==RBU_PK_NONE); |
| 2271 char *zBind = rbuObjIterGetBindlist(p, pIter->nTblCol + 1 + bRbuRowid); |
| 2272 if( zBind ){ |
| 2273 assert( pIter->pTmpInsert==0 ); |
| 2274 p->rc = prepareFreeAndCollectError( |
| 2275 p->dbRbu, &pIter->pTmpInsert, &p->zErrmsg, sqlite3_mprintf( |
| 2276 "INSERT INTO %s.'rbu_tmp_%q'(rbu_control,%s%s) VALUES(%z)", |
| 2277 p->zStateDb, pIter->zDataTbl, zCollist, zRbuRowid, zBind |
| 2278 )); |
| 2279 } |
| 2280 } |
| 2281 |
| 2282 static void rbuTmpInsertFunc( |
| 2283 sqlite3_context *pCtx, |
| 2284 int nVal, |
| 2285 sqlite3_value **apVal |
| 2286 ){ |
| 2287 sqlite3rbu *p = sqlite3_user_data(pCtx); |
| 2288 int rc = SQLITE_OK; |
| 2289 int i; |
| 2290 |
| 2291 for(i=0; rc==SQLITE_OK && i<nVal; i++){ |
| 2292 rc = sqlite3_bind_value(p->objiter.pTmpInsert, i+1, apVal[i]); |
| 2293 } |
| 2294 if( rc==SQLITE_OK ){ |
| 2295 sqlite3_step(p->objiter.pTmpInsert); |
| 2296 rc = sqlite3_reset(p->objiter.pTmpInsert); |
| 2297 } |
| 2298 |
| 2299 if( rc!=SQLITE_OK ){ |
| 2300 sqlite3_result_error_code(pCtx, rc); |
| 2301 } |
| 2302 } |
| 2303 |
| 2304 /* |
| 2305 ** Ensure that the SQLite statement handles required to update the |
| 2306 ** target database object currently indicated by the iterator passed |
| 2307 ** as the second argument are available. |
| 2308 */ |
| 2309 static int rbuObjIterPrepareAll( |
| 2310 sqlite3rbu *p, |
| 2311 RbuObjIter *pIter, |
| 2312 int nOffset /* Add "LIMIT -1 OFFSET $nOffset" to SELECT */ |
| 2313 ){ |
| 2314 assert( pIter->bCleanup==0 ); |
| 2315 if( pIter->pSelect==0 && rbuObjIterCacheTableInfo(p, pIter)==SQLITE_OK ){ |
| 2316 const int tnum = pIter->iTnum; |
| 2317 char *zCollist = 0; /* List of indexed columns */ |
| 2318 char **pz = &p->zErrmsg; |
| 2319 const char *zIdx = pIter->zIdx; |
| 2320 char *zLimit = 0; |
| 2321 |
| 2322 if( nOffset ){ |
| 2323 zLimit = sqlite3_mprintf(" LIMIT -1 OFFSET %d", nOffset); |
| 2324 if( !zLimit ) p->rc = SQLITE_NOMEM; |
| 2325 } |
| 2326 |
| 2327 if( zIdx ){ |
| 2328 const char *zTbl = pIter->zTbl; |
| 2329 char *zImposterCols = 0; /* Columns for imposter table */ |
| 2330 char *zImposterPK = 0; /* Primary key declaration for imposter */ |
| 2331 char *zWhere = 0; /* WHERE clause on PK columns */ |
| 2332 char *zBind = 0; |
| 2333 int nBind = 0; |
| 2334 |
| 2335 assert( pIter->eType!=RBU_PK_VTAB ); |
| 2336 zCollist = rbuObjIterGetIndexCols( |
| 2337 p, pIter, &zImposterCols, &zImposterPK, &zWhere, &nBind |
| 2338 ); |
| 2339 zBind = rbuObjIterGetBindlist(p, nBind); |
| 2340 |
| 2341 /* Create the imposter table used to write to this index. */ |
| 2342 sqlite3_test_control(SQLITE_TESTCTRL_IMPOSTER, p->dbMain, "main", 0, 1); |
| 2343 sqlite3_test_control(SQLITE_TESTCTRL_IMPOSTER, p->dbMain, "main", 1,tnum); |
| 2344 rbuMPrintfExec(p, p->dbMain, |
| 2345 "CREATE TABLE \"rbu_imp_%w\"( %s, PRIMARY KEY( %s ) ) WITHOUT ROWID", |
| 2346 zTbl, zImposterCols, zImposterPK |
| 2347 ); |
| 2348 sqlite3_test_control(SQLITE_TESTCTRL_IMPOSTER, p->dbMain, "main", 0, 0); |
| 2349 |
| 2350 /* Create the statement to insert index entries */ |
| 2351 pIter->nCol = nBind; |
| 2352 if( p->rc==SQLITE_OK ){ |
| 2353 p->rc = prepareFreeAndCollectError( |
| 2354 p->dbMain, &pIter->pInsert, &p->zErrmsg, |
| 2355 sqlite3_mprintf("INSERT INTO \"rbu_imp_%w\" VALUES(%s)", zTbl, zBind) |
| 2356 ); |
| 2357 } |
| 2358 |
| 2359 /* And to delete index entries */ |
| 2360 if( p->rc==SQLITE_OK ){ |
| 2361 p->rc = prepareFreeAndCollectError( |
| 2362 p->dbMain, &pIter->pDelete, &p->zErrmsg, |
| 2363 sqlite3_mprintf("DELETE FROM \"rbu_imp_%w\" WHERE %s", zTbl, zWhere) |
| 2364 ); |
| 2365 } |
| 2366 |
| 2367 /* Create the SELECT statement to read keys in sorted order */ |
| 2368 if( p->rc==SQLITE_OK ){ |
| 2369 char *zSql; |
| 2370 if( pIter->eType==RBU_PK_EXTERNAL || pIter->eType==RBU_PK_NONE ){ |
| 2371 zSql = sqlite3_mprintf( |
| 2372 "SELECT %s, rbu_control FROM %s.'rbu_tmp_%q' ORDER BY %s%s", |
| 2373 zCollist, p->zStateDb, pIter->zDataTbl, |
| 2374 zCollist, zLimit |
| 2375 ); |
| 2376 }else{ |
| 2377 zSql = sqlite3_mprintf( |
| 2378 "SELECT %s, rbu_control FROM '%q' " |
| 2379 "WHERE typeof(rbu_control)='integer' AND rbu_control!=1 " |
| 2380 "UNION ALL " |
| 2381 "SELECT %s, rbu_control FROM %s.'rbu_tmp_%q' " |
| 2382 "ORDER BY %s%s", |
| 2383 zCollist, pIter->zDataTbl, |
| 2384 zCollist, p->zStateDb, pIter->zDataTbl, |
| 2385 zCollist, zLimit |
| 2386 ); |
| 2387 } |
| 2388 p->rc = prepareFreeAndCollectError(p->dbRbu, &pIter->pSelect, pz, zSql); |
| 2389 } |
| 2390 |
| 2391 sqlite3_free(zImposterCols); |
| 2392 sqlite3_free(zImposterPK); |
| 2393 sqlite3_free(zWhere); |
| 2394 sqlite3_free(zBind); |
| 2395 }else{ |
| 2396 int bRbuRowid = (pIter->eType==RBU_PK_VTAB || pIter->eType==RBU_PK_NONE); |
| 2397 const char *zTbl = pIter->zTbl; /* Table this step applies to */ |
| 2398 const char *zWrite; /* Imposter table name */ |
| 2399 |
| 2400 char *zBindings = rbuObjIterGetBindlist(p, pIter->nTblCol + bRbuRowid); |
| 2401 char *zWhere = rbuObjIterGetWhere(p, pIter); |
| 2402 char *zOldlist = rbuObjIterGetOldlist(p, pIter, "old"); |
| 2403 char *zNewlist = rbuObjIterGetOldlist(p, pIter, "new"); |
| 2404 |
| 2405 zCollist = rbuObjIterGetCollist(p, pIter); |
| 2406 pIter->nCol = pIter->nTblCol; |
| 2407 |
| 2408 /* Create the imposter table or tables (if required). */ |
| 2409 rbuCreateImposterTable(p, pIter); |
| 2410 rbuCreateImposterTable2(p, pIter); |
| 2411 zWrite = (pIter->eType==RBU_PK_VTAB ? "" : "rbu_imp_"); |
| 2412 |
| 2413 /* Create the INSERT statement to write to the target PK b-tree */ |
| 2414 if( p->rc==SQLITE_OK ){ |
| 2415 p->rc = prepareFreeAndCollectError(p->dbMain, &pIter->pInsert, pz, |
| 2416 sqlite3_mprintf( |
| 2417 "INSERT INTO \"%s%w\"(%s%s) VALUES(%s)", |
| 2418 zWrite, zTbl, zCollist, (bRbuRowid ? ", _rowid_" : ""), zBindings |
| 2419 ) |
| 2420 ); |
| 2421 } |
| 2422 |
| 2423 /* Create the DELETE statement to write to the target PK b-tree */ |
| 2424 if( p->rc==SQLITE_OK ){ |
| 2425 p->rc = prepareFreeAndCollectError(p->dbMain, &pIter->pDelete, pz, |
| 2426 sqlite3_mprintf( |
| 2427 "DELETE FROM \"%s%w\" WHERE %s", zWrite, zTbl, zWhere |
| 2428 ) |
| 2429 ); |
| 2430 } |
| 2431 |
| 2432 if( pIter->abIndexed ){ |
| 2433 const char *zRbuRowid = ""; |
| 2434 if( pIter->eType==RBU_PK_EXTERNAL || pIter->eType==RBU_PK_NONE ){ |
| 2435 zRbuRowid = ", rbu_rowid"; |
| 2436 } |
| 2437 |
| 2438 /* Create the rbu_tmp_xxx table and the triggers to populate it. */ |
| 2439 rbuMPrintfExec(p, p->dbRbu, |
| 2440 "CREATE TABLE IF NOT EXISTS %s.'rbu_tmp_%q' AS " |
| 2441 "SELECT *%s FROM '%q' WHERE 0;" |
| 2442 , p->zStateDb, pIter->zDataTbl |
| 2443 , (pIter->eType==RBU_PK_EXTERNAL ? ", 0 AS rbu_rowid" : "") |
| 2444 , pIter->zDataTbl |
| 2445 ); |
| 2446 |
| 2447 rbuMPrintfExec(p, p->dbMain, |
| 2448 "CREATE TEMP TRIGGER rbu_delete_tr BEFORE DELETE ON \"%s%w\" " |
| 2449 "BEGIN " |
| 2450 " SELECT rbu_tmp_insert(2, %s);" |
| 2451 "END;" |
| 2452 |
| 2453 "CREATE TEMP TRIGGER rbu_update1_tr BEFORE UPDATE ON \"%s%w\" " |
| 2454 "BEGIN " |
| 2455 " SELECT rbu_tmp_insert(2, %s);" |
| 2456 "END;" |
| 2457 |
| 2458 "CREATE TEMP TRIGGER rbu_update2_tr AFTER UPDATE ON \"%s%w\" " |
| 2459 "BEGIN " |
| 2460 " SELECT rbu_tmp_insert(3, %s);" |
| 2461 "END;", |
| 2462 zWrite, zTbl, zOldlist, |
| 2463 zWrite, zTbl, zOldlist, |
| 2464 zWrite, zTbl, zNewlist |
| 2465 ); |
| 2466 |
| 2467 if( pIter->eType==RBU_PK_EXTERNAL || pIter->eType==RBU_PK_NONE ){ |
| 2468 rbuMPrintfExec(p, p->dbMain, |
| 2469 "CREATE TEMP TRIGGER rbu_insert_tr AFTER INSERT ON \"%s%w\" " |
| 2470 "BEGIN " |
| 2471 " SELECT rbu_tmp_insert(0, %s);" |
| 2472 "END;", |
| 2473 zWrite, zTbl, zNewlist |
| 2474 ); |
| 2475 } |
| 2476 |
| 2477 rbuObjIterPrepareTmpInsert(p, pIter, zCollist, zRbuRowid); |
| 2478 } |
| 2479 |
| 2480 /* Create the SELECT statement to read keys from data_xxx */ |
| 2481 if( p->rc==SQLITE_OK ){ |
| 2482 p->rc = prepareFreeAndCollectError(p->dbRbu, &pIter->pSelect, pz, |
| 2483 sqlite3_mprintf( |
| 2484 "SELECT %s, rbu_control%s FROM '%q'%s", |
| 2485 zCollist, (bRbuRowid ? ", rbu_rowid" : ""), |
| 2486 pIter->zDataTbl, zLimit |
| 2487 ) |
| 2488 ); |
| 2489 } |
| 2490 |
| 2491 sqlite3_free(zWhere); |
| 2492 sqlite3_free(zOldlist); |
| 2493 sqlite3_free(zNewlist); |
| 2494 sqlite3_free(zBindings); |
| 2495 } |
| 2496 sqlite3_free(zCollist); |
| 2497 sqlite3_free(zLimit); |
| 2498 } |
| 2499 |
| 2500 return p->rc; |
| 2501 } |
| 2502 |
| 2503 /* |
| 2504 ** Set output variable *ppStmt to point to an UPDATE statement that may |
| 2505 ** be used to update the imposter table for the main table b-tree of the |
| 2506 ** table object that pIter currently points to, assuming that the |
| 2507 ** rbu_control column of the data_xyz table contains zMask. |
| 2508 ** |
| 2509 ** If the zMask string does not specify any columns to update, then this |
| 2510 ** is not an error. Output variable *ppStmt is set to NULL in this case. |
| 2511 */ |
| 2512 static int rbuGetUpdateStmt( |
| 2513 sqlite3rbu *p, /* RBU handle */ |
| 2514 RbuObjIter *pIter, /* Object iterator */ |
| 2515 const char *zMask, /* rbu_control value ('x.x.') */ |
| 2516 sqlite3_stmt **ppStmt /* OUT: UPDATE statement handle */ |
| 2517 ){ |
| 2518 RbuUpdateStmt **pp; |
| 2519 RbuUpdateStmt *pUp = 0; |
| 2520 int nUp = 0; |
| 2521 |
| 2522 /* In case an error occurs */ |
| 2523 *ppStmt = 0; |
| 2524 |
| 2525 /* Search for an existing statement. If one is found, shift it to the front |
| 2526 ** of the LRU queue and return immediately. Otherwise, leave nUp set |
| 2527 ** to the number of statements currently in the cache and pUp pointing to the |
| 2528 ** last object in the list. */ |
| 2529 for(pp=&pIter->pRbuUpdate; *pp; pp=&((*pp)->pNext)){ |
| 2530 pUp = *pp; |
| 2531 if( strcmp(pUp->zMask, zMask)==0 ){ |
| 2532 *pp = pUp->pNext; |
| 2533 pUp->pNext = pIter->pRbuUpdate; |
| 2534 pIter->pRbuUpdate = pUp; |
| 2535 *ppStmt = pUp->pUpdate; |
| 2536 return SQLITE_OK; |
| 2537 } |
| 2538 nUp++; |
| 2539 } |
| 2540 assert( pUp==0 || pUp->pNext==0 ); |
| 2541 |
| 2542 if( nUp>=SQLITE_RBU_UPDATE_CACHESIZE ){ |
| 2543 for(pp=&pIter->pRbuUpdate; *pp!=pUp; pp=&((*pp)->pNext)); |
| 2544 *pp = 0; |
| 2545 sqlite3_finalize(pUp->pUpdate); |
| 2546 pUp->pUpdate = 0; |
| 2547 }else{ |
| 2548 pUp = (RbuUpdateStmt*)rbuMalloc(p, sizeof(RbuUpdateStmt)+pIter->nTblCol+1); |
| 2549 } |
| 2550 |
| 2551 if( pUp ){ |
| 2552 char *zWhere = rbuObjIterGetWhere(p, pIter); |
| 2553 char *zSet = rbuObjIterGetSetlist(p, pIter, zMask); |
| 2554 char *zUpdate = 0; |
| 2555 |
| 2556 pUp->zMask = (char*)&pUp[1]; |
| 2557 memcpy(pUp->zMask, zMask, pIter->nTblCol); |
| 2558 pUp->pNext = pIter->pRbuUpdate; |
| 2559 pIter->pRbuUpdate = pUp; |
| 2560 |
| 2561 if( zSet ){ |
| 2562 const char *zPrefix = ""; |
| 2563 |
| 2564 if( pIter->eType!=RBU_PK_VTAB ) zPrefix = "rbu_imp_"; |
| 2565 zUpdate = sqlite3_mprintf("UPDATE \"%s%w\" SET %s WHERE %s", |
| 2566 zPrefix, pIter->zTbl, zSet, zWhere |
| 2567 ); |
| 2568 p->rc = prepareFreeAndCollectError( |
| 2569 p->dbMain, &pUp->pUpdate, &p->zErrmsg, zUpdate |
| 2570 ); |
| 2571 *ppStmt = pUp->pUpdate; |
| 2572 } |
| 2573 sqlite3_free(zWhere); |
| 2574 sqlite3_free(zSet); |
| 2575 } |
| 2576 |
| 2577 return p->rc; |
| 2578 } |
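The lookup in rbuGetUpdateStmt() keeps prepared UPDATE statements in a small move-to-front list keyed by the rbu_control mask, recycling the tail entry once the list is full. The following is a minimal standalone sketch of that caching pattern only. The names `Cache`, `Entry`, `cacheGet` and `CACHE_MAX` are hypothetical, and each entry stores just a key rather than a statement handle.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_MAX 3            /* Hypothetical capacity, for illustration only */

typedef struct Entry Entry;
struct Entry {
  Entry *pNext;                /* Next entry in most-recently-used order */
  char zKey[16];               /* Lookup key (stands in for the mask string) */
};

typedef struct Cache {
  Entry *pHead;                /* Most recently used entry, or NULL */
} Cache;

/*
** Return the entry for zKey, moving it to the front of the list. If no
** such entry exists, create one, or recycle the least recently used entry
** if the cache already holds CACHE_MAX entries. Set *pbHit to 1 on a
** cache hit and 0 otherwise. Return NULL if an allocation fails.
*/
static Entry *cacheGet(Cache *c, const char *zKey, int *pbHit){
  Entry **pp;
  Entry *pLast = 0;
  int n = 0;

  *pbHit = 0;
  for(pp=&c->pHead; *pp; pp=&(*pp)->pNext){
    pLast = *pp;
    if( strcmp(pLast->zKey, zKey)==0 ){
      *pp = pLast->pNext;        /* Unlink from the current position */
      pLast->pNext = c->pHead;   /* Relink at the front */
      c->pHead = pLast;
      *pbHit = 1;
      return pLast;
    }
    n++;
  }

  if( n>=CACHE_MAX ){
    /* Cache is full: unlink the tail entry and reuse its memory. */
    for(pp=&c->pHead; *pp!=pLast; pp=&(*pp)->pNext);
    *pp = 0;
  }else{
    pLast = (Entry*)malloc(sizeof(Entry));
    if( pLast==0 ) return 0;
  }

  strncpy(pLast->zKey, zKey, sizeof(pLast->zKey)-1);
  pLast->zKey[sizeof(pLast->zKey)-1] = 0;
  pLast->pNext = c->pHead;
  c->pHead = pLast;
  return pLast;
}
```

A singly linked list is adequate here because the capacity is small, so the linear scans for lookup and for finding the tail stay cheap.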
| 2579 |
| 2580 static sqlite3 *rbuOpenDbhandle(sqlite3rbu *p, const char *zName){ |
| 2581 sqlite3 *db = 0; |
| 2582 if( p->rc==SQLITE_OK ){ |
| 2583 const int flags = SQLITE_OPEN_READWRITE|SQLITE_OPEN_CREATE|SQLITE_OPEN_URI; |
| 2584 p->rc = sqlite3_open_v2(zName, &db, flags, p->zVfsName); |
| 2585 if( p->rc ){ |
| 2586 p->zErrmsg = sqlite3_mprintf("%s", sqlite3_errmsg(db)); |
| 2587 sqlite3_close(db); |
| 2588 db = 0; |
| 2589 } |
| 2590 } |
| 2591 return db; |
| 2592 } |
| 2593 |
| 2594 /* |
| 2595 ** Open the database handle and attach the RBU database as "rbu". If an |
| 2596 ** error occurs, leave an error code and message in the RBU handle. |
| 2597 */ |
| 2598 static void rbuOpenDatabase(sqlite3rbu *p){ |
| 2599 assert( p->rc==SQLITE_OK ); |
| 2600 assert( p->dbMain==0 && p->dbRbu==0 ); |
| 2601 |
| 2602 p->eStage = 0; |
| 2603 p->dbMain = rbuOpenDbhandle(p, p->zTarget); |
| 2604 p->dbRbu = rbuOpenDbhandle(p, p->zRbu); |
| 2605 |
| 2606 /* If using separate RBU and state databases, attach the state database to |
| 2607 ** the RBU db handle now. */ |
| 2608 if( p->zState ){ |
| 2609 rbuMPrintfExec(p, p->dbRbu, "ATTACH %Q AS stat", p->zState); |
| 2610 memcpy(p->zStateDb, "stat", 4); |
| 2611 }else{ |
| 2612 memcpy(p->zStateDb, "main", 4); |
| 2613 } |
| 2614 |
| 2615 if( p->rc==SQLITE_OK ){ |
| 2616 p->rc = sqlite3_create_function(p->dbMain, |
| 2617 "rbu_tmp_insert", -1, SQLITE_UTF8, (void*)p, rbuTmpInsertFunc, 0, 0 |
| 2618 ); |
| 2619 } |
| 2620 |
| 2621 if( p->rc==SQLITE_OK ){ |
| 2622 p->rc = sqlite3_create_function(p->dbMain, |
| 2623 "rbu_fossil_delta", 2, SQLITE_UTF8, 0, rbuFossilDeltaFunc, 0, 0 |
| 2624 ); |
| 2625 } |
| 2626 |
| 2627 if( p->rc==SQLITE_OK ){ |
| 2628 p->rc = sqlite3_create_function(p->dbRbu, |
| 2629 "rbu_target_name", 1, SQLITE_UTF8, (void*)p, rbuTargetNameFunc, 0, 0 |
| 2630 ); |
| 2631 } |
| 2632 |
| 2633 if( p->rc==SQLITE_OK ){ |
| 2634 p->rc = sqlite3_file_control(p->dbMain, "main", SQLITE_FCNTL_RBU, (void*)p); |
| 2635 } |
| 2636 rbuMPrintfExec(p, p->dbMain, "SELECT * FROM sqlite_master"); |
| 2637 |
| 2638 /* Mark the database file just opened as an RBU target database. If |
| 2639 ** this call returns SQLITE_NOTFOUND, then the RBU vfs is not in use. |
| 2640 ** This is an error. */ |
| 2641 if( p->rc==SQLITE_OK ){ |
| 2642 p->rc = sqlite3_file_control(p->dbMain, "main", SQLITE_FCNTL_RBU, (void*)p); |
| 2643 } |
| 2644 |
| 2645 if( p->rc==SQLITE_NOTFOUND ){ |
| 2646 p->rc = SQLITE_ERROR; |
| 2647 p->zErrmsg = sqlite3_mprintf("rbu vfs not found"); |
| 2648 } |
| 2649 } |
| 2650 |
| 2651 /* |
| 2652 ** This routine is a copy of the sqlite3FileSuffix3() routine from the core. |
| 2653 ** It is a no-op unless SQLITE_ENABLE_8_3_NAMES is defined. |
| 2654 ** |
| 2655 ** If SQLITE_ENABLE_8_3_NAMES is set at compile-time and if the database |
| 2656 ** filename in zBaseFilename is a URI with the "8_3_names=1" parameter and |
| 2657 ** if filename in z[] has a suffix (a.k.a. "extension") that is longer than |
| 2658 ** three characters, then shorten the suffix on z[] to be the last three |
| 2659 ** characters of the original suffix. |
| 2660 ** |
| 2661 ** If SQLITE_ENABLE_8_3_NAMES is set to 2 at compile-time, then always |
| 2662 ** do the suffix shortening regardless of URI parameter. |
| 2663 ** |
| 2664 ** Examples: |
| 2665 ** |
| 2666 ** test.db-journal => test.nal |
| 2667 ** test.db-wal => test.wal |
| 2668 ** test.db-shm => test.shm |
| 2669 ** test.db-mj7f3319fa => test.9fa |
| 2670 */ |
| 2671 static void rbuFileSuffix3(const char *zBase, char *z){ |
| 2672 #ifdef SQLITE_ENABLE_8_3_NAMES |
| 2673 #if SQLITE_ENABLE_8_3_NAMES<2 |
| 2674 if( sqlite3_uri_boolean(zBase, "8_3_names", 0) ) |
| 2675 #endif |
| 2676 { |
| 2677 int i, sz; |
| 2678 sz = sqlite3Strlen30(z); |
| 2679 for(i=sz-1; i>0 && z[i]!='/' && z[i]!='.'; i--){} |
| 2680 if( z[i]=='.' && ALWAYS(sz>i+4) ) memmove(&z[i+1], &z[sz-3], 4); |
| 2681 } |
| 2682 #endif |
| 2683 } |
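The suffix shortening described in the comment above can be tried in isolation. This is a sketch only: `shortenSuffix` is a hypothetical name, the compile-time switch and URI parameter check are left out, and an empty-string guard has been added that the original does not have.

```c
#include <assert.h>
#include <string.h>

/*
** Hypothetical standalone version of the 8.3 suffix shortening. If the
** final path component of z[] has a suffix longer than three characters,
** overwrite the suffix in place with its last three characters.
*/
static void shortenSuffix(char *z){
  int sz = (int)strlen(z);
  int i;
  if( sz==0 ) return;          /* Guard added for this sketch */
  /* Scan backwards for the last '.' in the final path component. */
  for(i=sz-1; i>0 && z[i]!='/' && z[i]!='.'; i--){}
  if( z[i]=='.' && sz>i+4 ){
    /* Copy the last three characters plus the nul terminator. */
    memmove(&z[i+1], &z[sz-3], 4);
  }
}
```

For instance, passing a buffer holding "test.db-journal" leaves "test.nal" in the buffer, matching the first example in the comment.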
| 2684 |
| 2685 /* |
| 2686 ** Return the current wal-index header checksum for the target database |
| 2687 ** as a 64-bit integer. |
| 2688 ** |
| 2689 ** The checksum is stored in the first page of xShmMap memory as an 8-byte |
| 2690 ** blob starting at byte offset 40. |
| 2691 */ |
| 2692 static i64 rbuShmChecksum(sqlite3rbu *p){ |
| 2693 i64 iRet = 0; |
| 2694 if( p->rc==SQLITE_OK ){ |
| 2695 sqlite3_file *pDb = p->pTargetFd->pReal; |
| 2696 u32 volatile *ptr; |
| 2697 p->rc = pDb->pMethods->xShmMap(pDb, 0, 32*1024, 0, (void volatile**)&ptr); |
| 2698 if( p->rc==SQLITE_OK ){ |
| 2699 iRet = ((i64)ptr[10] << 32) + ptr[11]; |
| 2700 } |
| 2701 } |
| 2702 return iRet; |
| 2703 } |
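Byte offset 40 in an array of 32-bit words is word index 10, which is why the function above reads ptr[10] and ptr[11]. A small sketch of the same combination step follows. `headerChecksum` is a hypothetical name, and it uses unsigned 64-bit arithmetic as a choice made for this sketch, whereas the function above returns an i64.

```c
#include <assert.h>
#include <stdint.h>

/*
** Hypothetical helper: combine the two 32-bit words that start at byte
** offset 40 (word index 10) of a header buffer into one 64-bit value,
** with word 10 in the upper half and word 11 in the lower half.
*/
static uint64_t headerChecksum(const uint32_t *aHdr){
  return ((uint64_t)aHdr[10] << 32) + aHdr[11];
}
```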
| 2704 |
| 2705 /* |
| 2706 ** This function is called as part of initializing or reinitializing an |
| 2707 ** incremental checkpoint. |
| 2708 ** |
| 2709 ** It populates the sqlite3rbu.aFrame[] array with the set of |
| 2710 ** (wal frame -> db page) copy operations required to checkpoint the |
| 2711 ** current wal file, and obtains the set of shm locks required to safely |
| 2712 ** perform the copy operations directly on the file-system. |
| 2713 ** |
| 2714 ** If argument pState is not NULL, then the incremental checkpoint is |
| 2715 ** being resumed. In this case, if the checksum of the wal-index-header |
| 2716 ** following recovery is not the same as the checksum saved in the RbuState |
| 2717 ** object, then the rbu handle is set to DONE state. This occurs if some |
| 2718 ** other client appends a transaction to the wal file in the middle of |
| 2719 ** an incremental checkpoint. |
| 2720 */ |
| 2721 static void rbuSetupCheckpoint(sqlite3rbu *p, RbuState *pState){ |
| 2722 |
| 2723 /* If pState is NULL, then the wal file may not have been opened and |
| 2724 ** recovered. Run a read-statement here to ensure that doing so |
| 2725 ** does not interfere with the "capture" process below. */ |
| 2726 if( pState==0 ){ |
| 2727 p->eStage = 0; |
| 2728 if( p->rc==SQLITE_OK ){ |
| 2729 p->rc = sqlite3_exec(p->dbMain, "SELECT * FROM sqlite_master", 0, 0, 0); |
| 2730 } |
| 2731 } |
| 2732 |
| 2733 /* Assuming no error has occurred, run a "restart" checkpoint with the |
| 2734 ** sqlite3rbu.eStage variable set to CAPTURE. This turns on the following |
| 2735 ** special behaviour in the rbu VFS: |
| 2736 ** |
| 2737 ** * If the exclusive shm WRITER or READ0 lock cannot be obtained, |
| 2738 ** the checkpoint fails with SQLITE_BUSY (normally SQLite would |
| 2739 ** proceed with running a passive checkpoint instead of failing). |
| 2740 ** |
| 2741 ** * Attempts to read from the *-wal file or write to the database file |
| 2742 ** do not perform any IO. Instead, the frame/page combinations that |
| 2743 ** would be read/written are recorded in the sqlite3rbu.aFrame[] |
| 2744 ** array. |
| 2745 ** |
| 2746 ** * Calls to xShmLock(UNLOCK) to release the exclusive shm WRITER, |
| 2747 ** READ0 and CHECKPOINT locks taken as part of the checkpoint are |
| 2748 ** no-ops. These locks will not be released until the connection |
| 2749 ** is closed. |
| 2750 ** |
| 2751 ** * Attempting to xSync() the database file causes an SQLITE_INTERNAL |
| 2752 ** error. |
| 2753 ** |
| 2754 ** As a result, unless an error (i.e. OOM or SQLITE_BUSY) occurs, the |
| 2755 ** checkpoint below fails with SQLITE_INTERNAL, and leaves the aFrame[] |
| 2756 ** array populated with a set of (frame -> page) mappings. Because the |
| 2757 ** WRITER, CHECKPOINT and READ0 locks are still held, it is safe to copy |
| 2758 ** data from the wal file into the database file according to the |
| 2759 ** contents of aFrame[]. |
| 2760 */ |
| 2761 if( p->rc==SQLITE_OK ){ |
| 2762 int rc2; |
| 2763 p->eStage = RBU_STAGE_CAPTURE; |
| 2764 rc2 = sqlite3_exec(p->dbMain, "PRAGMA main.wal_checkpoint=restart", 0, 0,0); |
| 2765 if( rc2!=SQLITE_INTERNAL ) p->rc = rc2; |
| 2766 } |
| 2767 |
| 2768 if( p->rc==SQLITE_OK ){ |
| 2769 p->eStage = RBU_STAGE_CKPT; |
| 2770 p->nStep = (pState ? pState->nRow : 0); |
| 2771 p->aBuf = rbuMalloc(p, p->pgsz); |
| 2772 p->iWalCksum = rbuShmChecksum(p); |
| 2773 } |
| 2774 |
| 2775 if( p->rc==SQLITE_OK && pState && pState->iWalCksum!=p->iWalCksum ){ |
| 2776 p->rc = SQLITE_DONE; |
| 2777 p->eStage = RBU_STAGE_DONE; |
| 2778 } |
| 2779 } |
| 2780 |
| 2781 /* |
| 2782 ** Called when iAmt bytes are read from offset iOff of the wal file while |
| 2783 ** the rbu object is in capture mode. Record the frame number of the frame |
| 2784 ** being read in the aFrame[] array. |
| 2785 */ |
| 2786 static int rbuCaptureWalRead(sqlite3rbu *pRbu, i64 iOff, int iAmt){ |
| 2787 const u32 mReq = (1<<WAL_LOCK_WRITE)|(1<<WAL_LOCK_CKPT)|(1<<WAL_LOCK_READ0); |
| 2788 u32 iFrame; |
| 2789 |
| 2790 if( pRbu->mLock!=mReq ){ |
| 2791 pRbu->rc = SQLITE_BUSY; |
| 2792 return SQLITE_INTERNAL; |
| 2793 } |
| 2794 |
| 2795 pRbu->pgsz = iAmt; |
| 2796 if( pRbu->nFrame==pRbu->nFrameAlloc ){ |
| 2797 int nNew = (pRbu->nFrameAlloc ? pRbu->nFrameAlloc : 64) * 2; |
| 2798 RbuFrame *aNew; |
| 2799 aNew = (RbuFrame*)sqlite3_realloc(pRbu->aFrame, nNew * sizeof(RbuFrame)); |
| 2800 if( aNew==0 ) return SQLITE_NOMEM; |
| 2801 pRbu->aFrame = aNew; |
| 2802 pRbu->nFrameAlloc = nNew; |
| 2803 } |
| 2804 |
| 2805 iFrame = (u32)((iOff-32) / (i64)(iAmt+24)) + 1; |
| 2806 if( pRbu->iMaxFrame<iFrame ) pRbu->iMaxFrame = iFrame; |
| 2807 pRbu->aFrame[pRbu->nFrame].iWalFrame = iFrame; |
| 2808 pRbu->aFrame[pRbu->nFrame].iDbPage = 0; |
| 2809 pRbu->nFrame++; |
| 2810 return SQLITE_OK; |
| 2811 } |
| 2812 |
| 2813 /* |
| 2814 ** Called when a page of data is written to offset iOff of the database |
| 2815 ** file while the rbu handle is in capture mode. Record the page number |
| 2816 ** of the page being written in the aFrame[] array. |
| 2817 */ |
| 2818 static int rbuCaptureDbWrite(sqlite3rbu *pRbu, i64 iOff){ |
| 2819 pRbu->aFrame[pRbu->nFrame-1].iDbPage = (u32)(iOff / pRbu->pgsz) + 1; |
| 2820 return SQLITE_OK; |
| 2821 } |
| 2822 |
| 2823 /* |
| 2824 ** This is called as part of an incremental checkpoint operation. Copy |
| 2825 ** a single frame of data from the wal file into the database file, as |
| 2826 ** indicated by the RbuFrame object. |
| 2827 */ |
| 2828 static void rbuCheckpointFrame(sqlite3rbu *p, RbuFrame *pFrame){ |
| 2829 sqlite3_file *pWal = p->pTargetFd->pWalFd->pReal; |
| 2830 sqlite3_file *pDb = p->pTargetFd->pReal; |
| 2831 i64 iOff; |
| 2832 |
| 2833 assert( p->rc==SQLITE_OK ); |
| 2834 iOff = (i64)(pFrame->iWalFrame-1) * (p->pgsz + 24) + 32 + 24; |
| 2835 p->rc = pWal->pMethods->xRead(pWal, p->aBuf, p->pgsz, iOff); |
| 2836 if( p->rc ) return; |
| 2837 |
| 2838 iOff = (i64)(pFrame->iDbPage-1) * p->pgsz; |
| 2839 p->rc = pDb->pMethods->xWrite(pDb, p->aBuf, p->pgsz, iOff); |
| 2840 } |
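The offset arithmetic in rbuCaptureWalRead(), rbuCaptureDbWrite() and rbuCheckpointFrame() relies on the constants 32 and 24, which those functions use as the size of the wal file header and of each frame header. The helpers below restate that arithmetic in one place. All function and macro names are hypothetical and local to this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define WAL_HDR_BYTES   32     /* Bytes before the first frame (sketch-local name) */
#define FRAME_HDR_BYTES 24     /* Bytes of header before each page image */

/* 1-based frame number for a read of page data at wal file offset iOff. */
static uint32_t frameFromOffset(int64_t iOff, int pgsz){
  return (uint32_t)((iOff - WAL_HDR_BYTES) / (int64_t)(pgsz + FRAME_HDR_BYTES)) + 1;
}

/* Wal file offset of the page data belonging to 1-based frame iFrame. */
static int64_t frameDataOffset(uint32_t iFrame, int pgsz){
  return (int64_t)(iFrame-1) * (pgsz + FRAME_HDR_BYTES)
       + WAL_HDR_BYTES + FRAME_HDR_BYTES;
}

/* 1-based page number for a write at database file offset iOff. */
static uint32_t pageFromOffset(int64_t iOff, int pgsz){
  return (uint32_t)(iOff / pgsz) + 1;
}

/* Database file offset of 1-based page number pgno. */
static int64_t pageOffset(uint32_t pgno, int pgsz){
  return (int64_t)(pgno-1) * pgsz;
}
```

With a 4096-byte page size, the data for frame 3 starts at offset 2*(4096+24)+32+24 = 8296, and feeding that offset back into frameFromOffset() yields 3 again.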
| 2841 |
| 2842 |
| 2843 /* |
| 2844 ** Take an EXCLUSIVE lock on the database file. |
| 2845 */ |
| 2846 static void rbuLockDatabase(sqlite3rbu *p){ |
| 2847 sqlite3_file *pReal = p->pTargetFd->pReal; |
| 2848 assert( p->rc==SQLITE_OK ); |
| 2849 p->rc = pReal->pMethods->xLock(pReal, SQLITE_LOCK_SHARED); |
| 2850 if( p->rc==SQLITE_OK ){ |
| 2851 p->rc = pReal->pMethods->xLock(pReal, SQLITE_LOCK_EXCLUSIVE); |
| 2852 } |
| 2853 } |
| 2854 |
| 2855 #if defined(_WIN32_WCE) |
| 2856 static LPWSTR rbuWinUtf8ToUnicode(const char *zFilename){ |
| 2857 int nChar; |
| 2858 LPWSTR zWideFilename; |
| 2859 |
| 2860 nChar = MultiByteToWideChar(CP_UTF8, 0, zFilename, -1, NULL, 0); |
| 2861 if( nChar==0 ){ |
| 2862 return 0; |
| 2863 } |
| 2864 zWideFilename = sqlite3_malloc( nChar*sizeof(zWideFilename[0]) ); |
| 2865 if( zWideFilename==0 ){ |
| 2866 return 0; |
| 2867 } |
| 2868 memset(zWideFilename, 0, nChar*sizeof(zWideFilename[0])); |
| 2869 nChar = MultiByteToWideChar(CP_UTF8, 0, zFilename, -1, zWideFilename, |
| 2870 nChar); |
| 2871 if( nChar==0 ){ |
| 2872 sqlite3_free(zWideFilename); |
| 2873 zWideFilename = 0; |
| 2874 } |
| 2875 return zWideFilename; |
| 2876 } |
| 2877 #endif |
| 2878 |
| 2879 /* |
| 2880 ** The RBU handle is currently in RBU_STAGE_MOVE state, with a SHARED lock |
| 2881 ** on the database file. This proc moves the *-oal file to the *-wal path, |
| 2882 ** then reopens the database file (this time in vanilla, non-oal, WAL mode). |
| 2883 ** If an error occurs, leave an error code and error message in the rbu |
| 2884 ** handle. |
| 2885 */ |
| 2886 static void rbuMoveOalFile(sqlite3rbu *p){ |
| 2887 const char *zBase = sqlite3_db_filename(p->dbMain, "main"); |
| 2888 |
| 2889 char *zWal = sqlite3_mprintf("%s-wal", zBase); |
| 2890 char *zOal = sqlite3_mprintf("%s-oal", zBase); |
| 2891 |
| 2892 assert( p->eStage==RBU_STAGE_MOVE ); |
| 2893 assert( p->rc==SQLITE_OK && p->zErrmsg==0 ); |
| 2894 if( zWal==0 || zOal==0 ){ |
| 2895 p->rc = SQLITE_NOMEM; |
| 2896 }else{ |
| 2897 /* Move the *-oal file to *-wal. At this point connection p->dbMain is |
| 2898 ** holding a SHARED lock on the target database file (because it is |
| 2899 ** in WAL mode). So no other connection may be writing the db. |
| 2900 ** |
| 2901 ** In order to ensure that there are no database readers, an EXCLUSIVE |
| 2902 ** lock is obtained here before the *-oal is moved to *-wal. |
| 2903 */ |
| 2904 rbuLockDatabase(p); |
| 2905 if( p->rc==SQLITE_OK ){ |
| 2906 rbuFileSuffix3(zBase, zWal); |
| 2907 rbuFileSuffix3(zBase, zOal); |
| 2908 |
| 2909 /* Re-open the databases. */ |
| 2910 rbuObjIterFinalize(&p->objiter); |
| 2911 sqlite3_close(p->dbMain); |
| 2912 sqlite3_close(p->dbRbu); |
| 2913 p->dbMain = 0; |
| 2914 p->dbRbu = 0; |
| 2915 |
| 2916 #if defined(_WIN32_WCE) |
| 2917 { |
| 2918 LPWSTR zWideOal; |
| 2919 LPWSTR zWideWal; |
| 2920 |
| 2921 zWideOal = rbuWinUtf8ToUnicode(zOal); |
| 2922 if( zWideOal ){ |
| 2923 zWideWal = rbuWinUtf8ToUnicode(zWal); |
| 2924 if( zWideWal ){ |
| 2925 if( MoveFileW(zWideOal, zWideWal) ){ |
| 2926 p->rc = SQLITE_OK; |
| 2927 }else{ |
| 2928 p->rc = SQLITE_IOERR; |
| 2929 } |
| 2930 sqlite3_free(zWideWal); |
| 2931 }else{ |
| 2932 p->rc = SQLITE_IOERR_NOMEM; |
| 2933 } |
| 2934 sqlite3_free(zWideOal); |
| 2935 }else{ |
| 2936 p->rc = SQLITE_IOERR_NOMEM; |
| 2937 } |
| 2938 } |
| 2939 #else |
| 2940 p->rc = rename(zOal, zWal) ? SQLITE_IOERR : SQLITE_OK; |
| 2941 #endif |
| 2942 |
| 2943 if( p->rc==SQLITE_OK ){ |
| 2944 rbuOpenDatabase(p); |
| 2945 rbuSetupCheckpoint(p, 0); |
| 2946 } |
| 2947 } |
| 2948 } |
| 2949 |
| 2950 sqlite3_free(zWal); |
| 2951 sqlite3_free(zOal); |
| 2952 } |
| 2953 |
| 2954 /* |
| 2955 ** The SELECT statement iterating through the keys for the current object |
| 2956 ** (p->objiter.pSelect) currently points to a valid row. This function |
| 2957 ** determines the type of operation requested by this row and returns |
| 2958 ** one of the following values to indicate the result: |
| 2959 ** |
| 2960 ** * RBU_INSERT |
| 2961 ** * RBU_DELETE |
| 2962 ** * RBU_IDX_DELETE or RBU_IDX_INSERT |
| 2963 ** * RBU_UPDATE |
| 2964 ** |
| 2965 ** If RBU_UPDATE is returned, then output variable *pzMask is set to |
| 2966 ** point to the text value indicating the columns to update. |
| 2967 ** |
| 2968 ** If the rbu_control field contains an invalid value, an error code and |
| 2969 ** message are left in the RBU handle and zero returned. |
| 2970 */ |
| 2971 static int rbuStepType(sqlite3rbu *p, const char **pzMask){ |
| 2972 int iCol = p->objiter.nCol; /* Index of rbu_control column */ |
| 2973 int res = 0; /* Return value */ |
| 2974 |
| 2975 switch( sqlite3_column_type(p->objiter.pSelect, iCol) ){ |
| 2976 case SQLITE_INTEGER: { |
| 2977 int iVal = sqlite3_column_int(p->objiter.pSelect, iCol); |
| 2978 if( iVal==0 ){ |
| 2979 res = RBU_INSERT; |
| 2980 }else if( iVal==1 ){ |
| 2981 res = RBU_DELETE; |
| 2982 }else if( iVal==2 ){ |
| 2983 res = RBU_IDX_DELETE; |
| 2984 }else if( iVal==3 ){ |
| 2985 res = RBU_IDX_INSERT; |
| 2986 } |
| 2987 break; |
| 2988 } |
| 2989 |
| 2990 case SQLITE_TEXT: { |
| 2991 const unsigned char *z = sqlite3_column_text(p->objiter.pSelect, iCol); |
| 2992 if( z==0 ){ |
| 2993 p->rc = SQLITE_NOMEM; |
| 2994 }else{ |
| 2995 *pzMask = (const char*)z; |
| 2996 } |
| 2997 res = RBU_UPDATE; |
| 2998 |
| 2999 break; |
| 3000 } |
| 3001 |
| 3002 default: |
| 3003 break; |
| 3004 } |
| 3005 |
| 3006 if( res==0 ){ |
| 3007 rbuBadControlError(p); |
| 3008 } |
| 3009 return res; |
| 3010 } |
| 3011 |
| 3012 #ifdef SQLITE_DEBUG |
| 3013 /* |
| 3014 ** Assert that column iCol of statement pStmt is named zName. |
| 3015 */ |
| 3016 static void assertColumnName(sqlite3_stmt *pStmt, int iCol, const char *zName){ |
| 3017 const char *zCol = sqlite3_column_name(pStmt, iCol); |
| 3018 assert( 0==sqlite3_stricmp(zName, zCol) ); |
| 3019 } |
| 3020 #else |
| 3021 # define assertColumnName(x,y,z) |
| 3022 #endif |
| 3023 |
| 3024 /* |
| 3025 ** This function does the work for an sqlite3rbu_step() call. |
| 3026 ** |
| 3027 ** The object-iterator (p->objiter) currently points to a valid object, |
| 3028 ** and the input cursor (p->objiter.pSelect) currently points to a valid |
| 3029 ** input row. Perform whatever processing is required and return. |
| 3030 ** |
| 3031 ** If no error occurs, SQLITE_OK is returned. Otherwise, an error code |
| 3032 ** and message are left in the RBU handle and a copy of the error code |
| 3033 ** returned. |
| 3034 */ |
| 3035 static int rbuStep(sqlite3rbu *p){ |
| 3036 RbuObjIter *pIter = &p->objiter; |
| 3037 const char *zMask = 0; |
| 3038 int i; |
| 3039 int eType = rbuStepType(p, &zMask); |
| 3040 |
| 3041 if( eType ){ |
| 3042 assert( eType!=RBU_UPDATE || pIter->zIdx==0 ); |
| 3043 |
| 3044 if( pIter->zIdx==0 && eType==RBU_IDX_DELETE ){ |
| 3045 rbuBadControlError(p); |
| 3046 } |
| 3047 else if( |
| 3048 eType==RBU_INSERT |
| 3049 || eType==RBU_DELETE |
| 3050 || eType==RBU_IDX_DELETE |
| 3051 || eType==RBU_IDX_INSERT |
| 3052 ){ |
| 3053 sqlite3_value *pVal; |
| 3054 sqlite3_stmt *pWriter; |
| 3055 |
| 3056 assert( eType!=RBU_UPDATE ); |
| 3057 assert( eType!=RBU_DELETE || pIter->zIdx==0 ); |
| 3058 |
| 3059 if( eType==RBU_IDX_DELETE || eType==RBU_DELETE ){ |
| 3060 pWriter = pIter->pDelete; |
| 3061 }else{ |
| 3062 pWriter = pIter->pInsert; |
| 3063 } |
| 3064 |
| 3065 for(i=0; i<pIter->nCol; i++){ |
| 3066 /* If this is an INSERT into a table b-tree and the table has an |
| 3067 ** explicit INTEGER PRIMARY KEY, check that this is not an attempt |
| 3068 ** to write a NULL into the IPK column. That is not permitted. */ |
| 3069 if( eType==RBU_INSERT |
| 3070 && pIter->zIdx==0 && pIter->eType==RBU_PK_IPK && pIter->abTblPk[i] |
| 3071 && sqlite3_column_type(pIter->pSelect, i)==SQLITE_NULL |
| 3072 ){ |
| 3073 p->rc = SQLITE_MISMATCH; |
| 3074 p->zErrmsg = sqlite3_mprintf("datatype mismatch"); |
| 3075 goto step_out; |
| 3076 } |
| 3077 |
| 3078 if( eType==RBU_DELETE && pIter->abTblPk[i]==0 ){ |
| 3079 continue; |
| 3080 } |
| 3081 |
| 3082 pVal = sqlite3_column_value(pIter->pSelect, i); |
| 3083 p->rc = sqlite3_bind_value(pWriter, i+1, pVal); |
| 3084 if( p->rc ) goto step_out; |
| 3085 } |
| 3086 if( pIter->zIdx==0 |
| 3087 && (pIter->eType==RBU_PK_VTAB || pIter->eType==RBU_PK_NONE) |
| 3088 ){ |
| 3089 /* For a virtual table, or a table with no primary key, the |
| 3090 ** SELECT statement is: |
| 3091 ** |
| 3092 ** SELECT <cols>, rbu_control, rbu_rowid FROM .... |
| 3093 ** |
| 3094 ** Hence column_value(pIter->nCol+1). |
| 3095 */ |
| 3096 assertColumnName(pIter->pSelect, pIter->nCol+1, "rbu_rowid"); |
| 3097 pVal = sqlite3_column_value(pIter->pSelect, pIter->nCol+1); |
| 3098 p->rc = sqlite3_bind_value(pWriter, pIter->nCol+1, pVal); |
| 3099 } |
| 3100 if( p->rc==SQLITE_OK ){ |
| 3101 sqlite3_step(pWriter); |
| 3102 p->rc = resetAndCollectError(pWriter, &p->zErrmsg); |
| 3103 } |
| 3104 }else{ |
| 3105 sqlite3_value *pVal; |
| 3106 sqlite3_stmt *pUpdate = 0; |
| 3107 assert( eType==RBU_UPDATE ); |
| 3108 rbuGetUpdateStmt(p, pIter, zMask, &pUpdate); |
| 3109 if( pUpdate ){ |
| 3110 for(i=0; p->rc==SQLITE_OK && i<pIter->nCol; i++){ |
| 3111 char c = zMask[pIter->aiSrcOrder[i]]; |
| 3112 pVal = sqlite3_column_value(pIter->pSelect, i); |
| 3113 if( pIter->abTblPk[i] || c!='.' ){ |
| 3114 p->rc = sqlite3_bind_value(pUpdate, i+1, pVal); |
| 3115 } |
| 3116 } |
| 3117 if( p->rc==SQLITE_OK |
| 3118 && (pIter->eType==RBU_PK_VTAB || pIter->eType==RBU_PK_NONE) |
| 3119 ){ |
| 3120 /* Bind the rbu_rowid value to column _rowid_ */ |
| 3121 assertColumnName(pIter->pSelect, pIter->nCol+1, "rbu_rowid"); |
| 3122 pVal = sqlite3_column_value(pIter->pSelect, pIter->nCol+1); |
| 3123 p->rc = sqlite3_bind_value(pUpdate, pIter->nCol+1, pVal); |
| 3124 } |
| 3125 if( p->rc==SQLITE_OK ){ |
| 3126 sqlite3_step(pUpdate); |
| 3127 p->rc = resetAndCollectError(pUpdate, &p->zErrmsg); |
| 3128 } |
| 3129 } |
| 3130 } |
| 3131 } |
| 3132 |
| 3133 step_out: |
| 3134 return p->rc; |
| 3135 } |
| 3136 |
| 3137 /* |
| 3138 ** Increment the schema cookie of the main database opened by p->dbMain. |
| 3139 */ |
| 3140 static void rbuIncrSchemaCookie(sqlite3rbu *p){ |
| 3141 if( p->rc==SQLITE_OK ){ |
| 3142 int iCookie = 1000000; |
| 3143 sqlite3_stmt *pStmt; |
| 3144 |
| 3145 p->rc = prepareAndCollectError(p->dbMain, &pStmt, &p->zErrmsg, |
| 3146 "PRAGMA schema_version" |
| 3147 ); |
| 3148 if( p->rc==SQLITE_OK ){ |
| 3149 /* Coverage: it may be that this sqlite3_step() cannot fail. There |
| 3150 ** is already a transaction open, so the prepared statement cannot |
| 3151 ** throw an SQLITE_SCHEMA exception. The only database page the |
| 3152 ** statement reads is page 1, which is guaranteed to be in the cache. |
| 3153 ** And no memory allocations are required. */ |
| 3154 if( SQLITE_ROW==sqlite3_step(pStmt) ){ |
| 3155 iCookie = sqlite3_column_int(pStmt, 0); |
| 3156 } |
| 3157 rbuFinalize(p, pStmt); |
| 3158 } |
| 3159 if( p->rc==SQLITE_OK ){ |
| 3160 rbuMPrintfExec(p, p->dbMain, "PRAGMA schema_version = %d", iCookie+1); |
| 3161 } |
| 3162 } |
| 3163 } |
| 3164 |
| 3165 /* |
| 3166 ** Update the contents of the rbu_state table within the rbu database. The |
| 3167 ** value stored in the RBU_STATE_STAGE column is eStage. All other values |
| 3168 ** are determined by inspecting the rbu handle passed as the first argument. |
| 3169 */ |
| 3170 static void rbuSaveState(sqlite3rbu *p, int eStage){ |
| 3171 if( p->rc==SQLITE_OK || p->rc==SQLITE_DONE ){ |
| 3172 sqlite3_stmt *pInsert = 0; |
| 3173 int rc; |
| 3174 |
| 3175 assert( p->zErrmsg==0 ); |
| 3176 rc = prepareFreeAndCollectError(p->dbRbu, &pInsert, &p->zErrmsg, |
| 3177 sqlite3_mprintf( |
| 3178 "INSERT OR REPLACE INTO %s.rbu_state(k, v) VALUES " |
| 3179 "(%d, %d), " |
| 3180 "(%d, %Q), " |
| 3181 "(%d, %Q), " |
| 3182 "(%d, %d), " |
| 3183 "(%d, %d), " |
| 3184 "(%d, %lld), " |
| 3185 "(%d, %lld), " |
| 3186 "(%d, %lld) ", |
| 3187 p->zStateDb, |
| 3188 RBU_STATE_STAGE, eStage, |
| 3189 RBU_STATE_TBL, p->objiter.zTbl, |
| 3190 RBU_STATE_IDX, p->objiter.zIdx, |
| 3191 RBU_STATE_ROW, p->nStep, |
| 3192 RBU_STATE_PROGRESS, p->nProgress, |
| 3193 RBU_STATE_CKPT, p->iWalCksum, |
| 3194 RBU_STATE_COOKIE, (i64)p->pTargetFd->iCookie, |
| 3195 RBU_STATE_OALSZ, p->iOalSz |
| 3196 ) |
| 3197 ); |
| 3198 assert( pInsert==0 || rc==SQLITE_OK ); |
| 3199 |
| 3200 if( rc==SQLITE_OK ){ |
| 3201 sqlite3_step(pInsert); |
| 3202 rc = sqlite3_finalize(pInsert); |
| 3203 } |
| 3204 if( rc!=SQLITE_OK ) p->rc = rc; |
| 3205 } |
| 3206 } |
| 3207 |
| 3208 |
| 3209 /* |
| 3210 ** Step the RBU object. |
| 3211 */ |
| 3212 SQLITE_API int SQLITE_STDCALL sqlite3rbu_step(sqlite3rbu *p){ |
| 3213 if( p ){ |
| 3214 switch( p->eStage ){ |
| 3215 case RBU_STAGE_OAL: { |
| 3216 RbuObjIter *pIter = &p->objiter; |
| 3217 while( p->rc==SQLITE_OK && pIter->zTbl ){ |
| 3218 |
| 3219 if( pIter->bCleanup ){ |
| 3220 /* Clean up the rbu_tmp_xxx table for the previous table. It |
| 3221 ** cannot be dropped as there are currently active SQL statements. |
| 3222 ** But the contents can be deleted. */ |
| 3223 if( pIter->abIndexed ){ |
| 3224 rbuMPrintfExec(p, p->dbRbu, |
| 3225 "DELETE FROM %s.'rbu_tmp_%q'", p->zStateDb, pIter->zDataTbl |
| 3226 ); |
| 3227 } |
| 3228 }else{ |
| 3229 rbuObjIterPrepareAll(p, pIter, 0); |
| 3230 |
| 3231 /* Advance to the next row to process. */ |
| 3232 if( p->rc==SQLITE_OK ){ |
| 3233 int rc = sqlite3_step(pIter->pSelect); |
| 3234 if( rc==SQLITE_ROW ){ |
| 3235 p->nProgress++; |
| 3236 p->nStep++; |
| 3237 return rbuStep(p); |
| 3238 } |
| 3239 p->rc = sqlite3_reset(pIter->pSelect); |
| 3240 p->nStep = 0; |
| 3241 } |
| 3242 } |
| 3243 |
| 3244 rbuObjIterNext(p, pIter); |
| 3245 } |
| 3246 |
| 3247 if( p->rc==SQLITE_OK ){ |
| 3248 assert( pIter->zTbl==0 ); |
| 3249 rbuSaveState(p, RBU_STAGE_MOVE); |
| 3250 rbuIncrSchemaCookie(p); |
| 3251 if( p->rc==SQLITE_OK ){ |
| 3252 p->rc = sqlite3_exec(p->dbMain, "COMMIT", 0, 0, &p->zErrmsg); |
| 3253 } |
| 3254 if( p->rc==SQLITE_OK ){ |
| 3255 p->rc = sqlite3_exec(p->dbRbu, "COMMIT", 0, 0, &p->zErrmsg); |
| 3256 } |
| 3257 p->eStage = RBU_STAGE_MOVE; |
| 3258 } |
| 3259 break; |
| 3260 } |
| 3261 |
| 3262 case RBU_STAGE_MOVE: { |
| 3263 if( p->rc==SQLITE_OK ){ |
| 3264 rbuMoveOalFile(p); |
| 3265 p->nProgress++; |
| 3266 } |
| 3267 break; |
| 3268 } |
| 3269 |
| 3270 case RBU_STAGE_CKPT: { |
| 3271 if( p->rc==SQLITE_OK ){ |
| 3272 if( p->nStep>=p->nFrame ){ |
| 3273 sqlite3_file *pDb = p->pTargetFd->pReal; |
| 3274 |
| 3275 /* Sync the db file */ |
| 3276 p->rc = pDb->pMethods->xSync(pDb, SQLITE_SYNC_NORMAL); |
| 3277 |
| 3278 /* Update nBackfill */ |
| 3279 if( p->rc==SQLITE_OK ){ |
| 3280 void volatile *ptr; |
| 3281 p->rc = pDb->pMethods->xShmMap(pDb, 0, 32*1024, 0, &ptr); |
| 3282 if( p->rc==SQLITE_OK ){ |
| 3283 ((u32 volatile*)ptr)[24] = p->iMaxFrame; |
| 3284 } |
| 3285 } |
| 3286 |
| 3287 if( p->rc==SQLITE_OK ){ |
| 3288 p->eStage = RBU_STAGE_DONE; |
| 3289 p->rc = SQLITE_DONE; |
| 3290 } |
| 3291 }else{ |
| 3292 RbuFrame *pFrame = &p->aFrame[p->nStep]; |
| 3293 rbuCheckpointFrame(p, pFrame); |
| 3294 p->nStep++; |
| 3295 } |
| 3296 p->nProgress++; |
| 3297 } |
| 3298 break; |
| 3299 } |
| 3300 |
| 3301 default: |
| 3302 break; |
| 3303 } |
| 3304 return p->rc; |
| 3305 }else{ |
| 3306 return SQLITE_NOMEM; |
| 3307 } |
| 3308 } |
| 3309 |
| 3310 /* |
| 3311 ** Free an RbuState object allocated by rbuLoadState(). |
| 3312 */ |
| 3313 static void rbuFreeState(RbuState *p){ |
| 3314 if( p ){ |
| 3315 sqlite3_free(p->zTbl); |
| 3316 sqlite3_free(p->zIdx); |
| 3317 sqlite3_free(p); |
| 3318 } |
| 3319 } |
| 3320 |
| 3321 /* |
| 3322 ** Allocate an RbuState object and load the contents of the rbu_state |
| 3323 ** table into it. Return a pointer to the new object. It is the |
| 3324 ** responsibility of the caller to eventually free the object using |
| 3325 ** rbuFreeState(). |
| 3326 ** |
| 3327 ** If an error occurs, leave an error code and message in the rbu handle |
| 3328 ** and return NULL. |
| 3329 */ |
| 3330 static RbuState *rbuLoadState(sqlite3rbu *p){ |
| 3331 RbuState *pRet = 0; |
| 3332 sqlite3_stmt *pStmt = 0; |
| 3333 int rc; |
| 3334 int rc2; |
| 3335 |
| 3336 pRet = (RbuState*)rbuMalloc(p, sizeof(RbuState)); |
| 3337 if( pRet==0 ) return 0; |
| 3338 |
| 3339 rc = prepareFreeAndCollectError(p->dbRbu, &pStmt, &p->zErrmsg, |
| 3340 sqlite3_mprintf("SELECT k, v FROM %s.rbu_state", p->zStateDb) |
| 3341 ); |
| 3342 while( rc==SQLITE_OK && SQLITE_ROW==sqlite3_step(pStmt) ){ |
| 3343 switch( sqlite3_column_int(pStmt, 0) ){ |
| 3344 case RBU_STATE_STAGE: |
| 3345 pRet->eStage = sqlite3_column_int(pStmt, 1); |
| 3346 if( pRet->eStage!=RBU_STAGE_OAL |
| 3347 && pRet->eStage!=RBU_STAGE_MOVE |
| 3348 && pRet->eStage!=RBU_STAGE_CKPT |
| 3349 ){ |
| 3350 rc = SQLITE_CORRUPT; |
| 3351 } |
| 3352 break; |
| 3353 |
| 3354 case RBU_STATE_TBL: |
| 3355 pRet->zTbl = rbuStrndup((char*)sqlite3_column_text(pStmt, 1), &rc); |
| 3356 break; |
| 3357 |
| 3358 case RBU_STATE_IDX: |
| 3359 pRet->zIdx = rbuStrndup((char*)sqlite3_column_text(pStmt, 1), &rc); |
| 3360 break; |
| 3361 |
| 3362 case RBU_STATE_ROW: |
| 3363 pRet->nRow = sqlite3_column_int(pStmt, 1); |
| 3364 break; |
| 3365 |
| 3366 case RBU_STATE_PROGRESS: |
| 3367 pRet->nProgress = sqlite3_column_int64(pStmt, 1); |
| 3368 break; |
| 3369 |
| 3370 case RBU_STATE_CKPT: |
| 3371 pRet->iWalCksum = sqlite3_column_int64(pStmt, 1); |
| 3372 break; |
| 3373 |
| 3374 case RBU_STATE_COOKIE: |
| 3375 pRet->iCookie = (u32)sqlite3_column_int64(pStmt, 1); |
| 3376 break; |
| 3377 |
| 3378 case RBU_STATE_OALSZ: |
| 3379 pRet->iOalSz = (u32)sqlite3_column_int64(pStmt, 1); |
| 3380 break; |
| 3381 |
| 3382 default: |
| 3383 rc = SQLITE_CORRUPT; |
| 3384 break; |
| 3385 } |
| 3386 } |
| 3387 rc2 = sqlite3_finalize(pStmt); |
| 3388 if( rc==SQLITE_OK ) rc = rc2; |
| 3389 |
| 3390 p->rc = rc; |
| 3391 return pRet; |
| 3392 } |
| 3393 |
| 3394 /* |
| 3395 ** Compare strings z1 and z2, returning 0 if they are identical, or non-zero |
| 3396 ** otherwise. Either or both arguments may be NULL. Two NULL values are |
| 3397 ** considered equal, and NULL is considered distinct from all other values. |
| 3398 */ |
| 3399 static int rbuStrCompare(const char *z1, const char *z2){ |
| 3400 if( z1==0 && z2==0 ) return 0; |
| 3401 if( z1==0 || z2==0 ) return 1; |
| 3402 return (sqlite3_stricmp(z1, z2)!=0); |
| 3403 } |
| 3404 |
| 3405 /* |
| 3406 ** This function is called as part of sqlite3rbu_open() when initializing |
| 3407 ** an rbu handle in OAL stage. If the rbu update has not started (i.e. |
| 3408 ** the rbu_state table was empty) it is a no-op. Otherwise, it arranges |
| 3409 ** things so that the next call to sqlite3rbu_step() continues on from |
| 3410 ** where the previous rbu handle left off. |
| 3411 ** |
| 3412 ** If an error occurs, an error code and error message are left in the |
| 3413 ** rbu handle passed as the first argument. |
| 3414 */ |
| 3415 static void rbuSetupOal(sqlite3rbu *p, RbuState *pState){ |
| 3416 assert( p->rc==SQLITE_OK ); |
| 3417 if( pState->zTbl ){ |
| 3418 RbuObjIter *pIter = &p->objiter; |
| 3419 int rc = SQLITE_OK; |
| 3420 |
| 3421 while( rc==SQLITE_OK && pIter->zTbl && (pIter->bCleanup |
| 3422 || rbuStrCompare(pIter->zIdx, pState->zIdx) |
| 3423 || rbuStrCompare(pIter->zTbl, pState->zTbl) |
| 3424 )){ |
| 3425 rc = rbuObjIterNext(p, pIter); |
| 3426 } |
| 3427 |
| 3428 if( rc==SQLITE_OK && !pIter->zTbl ){ |
| 3429 rc = SQLITE_ERROR; |
| 3430 p->zErrmsg = sqlite3_mprintf("rbu_state mismatch error"); |
| 3431 } |
| 3432 |
| 3433 if( rc==SQLITE_OK ){ |
| 3434 p->nStep = pState->nRow; |
| 3435 rc = rbuObjIterPrepareAll(p, &p->objiter, p->nStep); |
| 3436 } |
| 3437 |
| 3438 p->rc = rc; |
| 3439 } |
| 3440 } |
| 3441 |
| 3442 /* |
| 3443 ** If there is a "*-oal" file in the file-system corresponding to the |
| 3444 ** target database, delete it. If an error occurs, leave an error code |
| 3445 ** and error message in the rbu handle. |
| 3446 */ |
| 3447 static void rbuDeleteOalFile(sqlite3rbu *p){ |
| 3448 char *zOal = rbuMPrintf(p, "%s-oal", p->zTarget); |
| 3449 if( zOal ){ |
| 3450 sqlite3_vfs *pVfs = sqlite3_vfs_find(0); |
| 3451 assert( pVfs && p->rc==SQLITE_OK && p->zErrmsg==0 ); |
| 3452 pVfs->xDelete(pVfs, zOal, 0); |
| 3453 sqlite3_free(zOal); |
| 3454 } |
| 3455 } |
| 3456 |
| 3457 /* |
| 3458 ** Allocate a private rbu VFS for the rbu handle passed as the only |
| 3459 ** argument. This VFS will be used unless the call to sqlite3rbu_open() |
| 3460 ** specified a URI with a vfs=? option in place of a target database |
| 3461 ** file name. |
| 3462 */ |
| 3463 static void rbuCreateVfs(sqlite3rbu *p){ |
| 3464 int rnd; |
| 3465 char zRnd[64]; |
| 3466 |
| 3467 assert( p->rc==SQLITE_OK ); |
| 3468 sqlite3_randomness(sizeof(int), (void*)&rnd); |
| 3469 sqlite3_snprintf(sizeof(zRnd), zRnd, "rbu_vfs_%d", rnd); |
| 3470 p->rc = sqlite3rbu_create_vfs(zRnd, 0); |
| 3471 if( p->rc==SQLITE_OK ){ |
| 3472 sqlite3_vfs *pVfs = sqlite3_vfs_find(zRnd); |
| 3473 assert( pVfs ); |
| 3474 p->zVfsName = pVfs->zName; |
| 3475 } |
| 3476 } |
| 3477 |
| 3478 /* |
| 3479 ** Destroy the private VFS created for the rbu handle passed as the only |
| 3480 ** argument by an earlier call to rbuCreateVfs(). |
| 3481 */ |
| 3482 static void rbuDeleteVfs(sqlite3rbu *p){ |
| 3483 if( p->zVfsName ){ |
| 3484 sqlite3rbu_destroy_vfs(p->zVfsName); |
| 3485 p->zVfsName = 0; |
| 3486 } |
| 3487 } |
| 3488 |
| 3489 /* |
| 3490 ** Open and return a new RBU handle. |
| 3491 */ |
| 3492 SQLITE_API sqlite3rbu *SQLITE_STDCALL sqlite3rbu_open( |
| 3493 const char *zTarget, |
| 3494 const char *zRbu, |
| 3495 const char *zState |
| 3496 ){ |
| 3497 sqlite3rbu *p; |
| 3498 int nTarget = strlen(zTarget); |
| 3499 int nRbu = strlen(zRbu); |
| 3500 int nState = zState ? strlen(zState) : 0; |
| 3501 |
| 3502 p = (sqlite3rbu*)sqlite3_malloc(sizeof(sqlite3rbu)+nTarget+1+nRbu+1+nState+1); |
| 3503 if( p ){ |
| 3504 RbuState *pState = 0; |
| 3505 |
| 3506 /* Create the custom VFS. */ |
| 3507 memset(p, 0, sizeof(sqlite3rbu)); |
| 3508 rbuCreateVfs(p); |
| 3509 |
| 3510 /* Open the target database */ |
| 3511 if( p->rc==SQLITE_OK ){ |
| 3512 p->zTarget = (char*)&p[1]; |
| 3513 memcpy(p->zTarget, zTarget, nTarget+1); |
| 3514 p->zRbu = &p->zTarget[nTarget+1]; |
| 3515 memcpy(p->zRbu, zRbu, nRbu+1); |
| 3516 if( zState ){ |
| 3517 p->zState = &p->zRbu[nRbu+1]; |
| 3518 memcpy(p->zState, zState, nState+1); |
| 3519 } |
| 3520 rbuOpenDatabase(p); |
| 3521 } |
| 3522 |
| 3523 /* If it has not already been created, create the rbu_state table */ |
| 3524 rbuMPrintfExec(p, p->dbRbu, RBU_CREATE_STATE, p->zStateDb); |
| 3525 |
| 3526 if( p->rc==SQLITE_OK ){ |
| 3527 pState = rbuLoadState(p); |
| 3528 assert( pState || p->rc!=SQLITE_OK ); |
| 3529 if( p->rc==SQLITE_OK ){ |
| 3530 |
| 3531 if( pState->eStage==0 ){ |
| 3532 rbuDeleteOalFile(p); |
| 3533 p->eStage = RBU_STAGE_OAL; |
| 3534 }else{ |
| 3535 p->eStage = pState->eStage; |
| 3536 } |
| 3537 p->nProgress = pState->nProgress; |
| 3538 p->iOalSz = pState->iOalSz; |
| 3539 } |
| 3540 } |
| 3541 assert( p->rc!=SQLITE_OK || p->eStage!=0 ); |
| 3542 |
| 3543 if( p->rc==SQLITE_OK && p->pTargetFd->pWalFd ){ |
| 3544 if( p->eStage==RBU_STAGE_OAL ){ |
| 3545 p->rc = SQLITE_ERROR; |
| 3546 p->zErrmsg = sqlite3_mprintf("cannot update wal mode database"); |
| 3547 }else if( p->eStage==RBU_STAGE_MOVE ){ |
| 3548 p->eStage = RBU_STAGE_CKPT; |
| 3549 p->nStep = 0; |
| 3550 } |
| 3551 } |
| 3552 |
| 3553 if( p->rc==SQLITE_OK |
| 3554 && (p->eStage==RBU_STAGE_OAL || p->eStage==RBU_STAGE_MOVE) |
| 3555 && pState->eStage!=0 && p->pTargetFd->iCookie!=pState->iCookie |
| 3556 ){ |
| 3557 /* At this point (pTargetFd->iCookie) contains the value of the |
| 3558 ** change-counter cookie (the thing that gets incremented when a |
| 3559 ** transaction is committed in rollback mode) currently stored on |
| 3560 ** page 1 of the database file. */ |
| 3561 p->rc = SQLITE_BUSY; |
| 3562 p->zErrmsg = sqlite3_mprintf("database modified during rbu update"); |
| 3563 } |
| 3564 |
| 3565 if( p->rc==SQLITE_OK ){ |
| 3566 if( p->eStage==RBU_STAGE_OAL ){ |
| 3567 sqlite3 *db = p->dbMain; |
| 3568 |
| 3569 /* Open transactions on both databases. The *-oal file is opened or |
| 3570 ** created at this point. */ |
| 3571 p->rc = sqlite3_exec(db, "BEGIN IMMEDIATE", 0, 0, &p->zErrmsg); |
| 3572 if( p->rc==SQLITE_OK ){ |
| 3573 p->rc = sqlite3_exec(p->dbRbu, "BEGIN IMMEDIATE", 0, 0, &p->zErrmsg); |
| 3574 } |
| 3575 |
| 3576 /* Check if the main database is a zipvfs db. If it is, set the upper |
| 3577 ** level pager to use "journal_mode=off". This prevents it from |
| 3578 ** generating a large journal using a temp file. */ |
| 3579 if( p->rc==SQLITE_OK ){ |
| 3580 int frc = sqlite3_file_control(db, "main", SQLITE_FCNTL_ZIPVFS, 0); |
| 3581 if( frc==SQLITE_OK ){ |
| 3582 p->rc = sqlite3_exec(db, "PRAGMA journal_mode=off",0,0,&p->zErrmsg); |
| 3583 } |
| 3584 } |
| 3585 |
| 3586 /* Point the object iterator at the first object */ |
| 3587 if( p->rc==SQLITE_OK ){ |
| 3588 p->rc = rbuObjIterFirst(p, &p->objiter); |
| 3589 } |
| 3590 |
| 3591 /* If the RBU database contains no data_xxx tables, declare the RBU |
| 3592 ** update finished. */ |
| 3593 if( p->rc==SQLITE_OK && p->objiter.zTbl==0 ){ |
| 3594 p->rc = SQLITE_DONE; |
| 3595 } |
| 3596 |
| 3597 if( p->rc==SQLITE_OK ){ |
| 3598 rbuSetupOal(p, pState); |
| 3599 } |
| 3600 |
| 3601 }else if( p->eStage==RBU_STAGE_MOVE ){ |
| 3602 /* no-op */ |
| 3603 }else if( p->eStage==RBU_STAGE_CKPT ){ |
| 3604 rbuSetupCheckpoint(p, pState); |
| 3605 }else if( p->eStage==RBU_STAGE_DONE ){ |
| 3606 p->rc = SQLITE_DONE; |
| 3607 }else{ |
| 3608 p->rc = SQLITE_CORRUPT; |
| 3609 } |
| 3610 } |
| 3611 |
| 3612 rbuFreeState(pState); |
| 3613 } |
| 3614 |
| 3615 return p; |
| 3616 } |
| 3617 |
| 3618 |
| 3619 /* |
| 3620 ** Return the database handle used by pRbu. |
| 3621 */ |
| 3622 SQLITE_API sqlite3 *SQLITE_STDCALL sqlite3rbu_db(sqlite3rbu *pRbu, int bRbu){ |
| 3623 sqlite3 *db = 0; |
| 3624 if( pRbu ){ |
| 3625 db = (bRbu ? pRbu->dbRbu : pRbu->dbMain); |
| 3626 } |
| 3627 return db; |
| 3628 } |
| 3629 |
| 3630 |
| 3631 /* |
| 3632 ** If the error code currently stored in the RBU handle is SQLITE_CONSTRAINT, |
| 3633 ** then edit any error message string so as to remove all occurrences of |
| 3634 ** the pattern "rbu_imp_[0-9]*". |
| 3635 */ |
| 3636 static void rbuEditErrmsg(sqlite3rbu *p){ |
| 3637 if( p->rc==SQLITE_CONSTRAINT && p->zErrmsg ){ |
| 3638 int i; |
| 3639 int nErrmsg = strlen(p->zErrmsg); |
| 3640 for(i=0; i<(nErrmsg-8); i++){ |
| 3641 if( memcmp(&p->zErrmsg[i], "rbu_imp_", 8)==0 ){ |
| 3642 int nDel = 8; |
| 3643 while( p->zErrmsg[i+nDel]>='0' && p->zErrmsg[i+nDel]<='9' ) nDel++; |
| 3644 memmove(&p->zErrmsg[i], &p->zErrmsg[i+nDel], nErrmsg + 1 - i - nDel); |
| 3645 nErrmsg -= nDel; |
| 3646 } |
| 3647 } |
| 3648 } |
| 3649 } |
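The in-place edit performed by rbuEditErrmsg() can be tried outside of SQLite with a small standalone sketch. The name stripImposterNames() is hypothetical and not part of sqlite3rbu.c. It assumes a writable, nul-terminated buffer and removes every "rbu_imp_" prefix together with any decimal digits that follow it.

```c
#include <assert.h>
#include <string.h>

/*
** Hypothetical helper, for illustration only: remove every occurrence of
** the pattern "rbu_imp_[0-9]*" from the nul-terminated string z, editing
** the buffer in place.
*/
static void stripImposterNames(char *z){
  size_t n;
  size_t i = 0;
  assert( z!=0 );
  n = strlen(z);
  while( i+8<=n ){
    if( memcmp(&z[i], "rbu_imp_", 8)==0 ){
      size_t nDel = 8;
      /* Extend the deleted span over any digits following the prefix. The
      ** nul-terminator stops this loop at the end of the string. */
      while( z[i+nDel]>='0' && z[i+nDel]<='9' ) nDel++;
      /* Shift the tail, including the nul-terminator, down over the span. */
      memmove(&z[i], &z[i+nDel], n + 1 - i - nDel);
      n -= nDel;
      /* Do not advance i: the shifted text is re-checked at this offset. */
    }else{
      i++;
    }
  }
}
```

With this sketch, a buffer holding "UNIQUE constraint failed: rbu_imp_12t1.a" is rewritten to "UNIQUE constraint failed: t1.a".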
| 3650 |
| 3651 /* |
| 3652 ** Close the RBU handle. |
| 3653 */ |
| 3654 SQLITE_API int SQLITE_STDCALL sqlite3rbu_close(sqlite3rbu *p, char **pzErrmsg){ |
| 3655 int rc; |
| 3656 if( p ){ |
| 3657 |
| 3658 /* Commit the transaction to the *-oal file. */ |
| 3659 if( p->rc==SQLITE_OK && p->eStage==RBU_STAGE_OAL ){ |
| 3660 p->rc = sqlite3_exec(p->dbMain, "COMMIT", 0, 0, &p->zErrmsg); |
| 3661 } |
| 3662 |
| 3663 rbuSaveState(p, p->eStage); |
| 3664 |
| 3665 if( p->rc==SQLITE_OK && p->eStage==RBU_STAGE_OAL ){ |
| 3666 p->rc = sqlite3_exec(p->dbRbu, "COMMIT", 0, 0, &p->zErrmsg); |
| 3667 } |
| 3668 |
| 3669 /* Close any open statement handles. */ |
| 3670 rbuObjIterFinalize(&p->objiter); |
| 3671 |
| 3672 /* Close the open database handle and VFS object. */ |
| 3673 sqlite3_close(p->dbMain); |
| 3674 sqlite3_close(p->dbRbu); |
| 3675 rbuDeleteVfs(p); |
| 3676 sqlite3_free(p->aBuf); |
| 3677 sqlite3_free(p->aFrame); |
| 3678 |
| 3679 rbuEditErrmsg(p); |
| 3680 rc = p->rc; |
| 3681 *pzErrmsg = p->zErrmsg; |
| 3682 sqlite3_free(p); |
| 3683 }else{ |
| 3684 rc = SQLITE_NOMEM; |
| 3685 *pzErrmsg = 0; |
| 3686 } |
| 3687 return rc; |
| 3688 } |
| 3689 |
| 3690 /* |
| 3691 ** Return the total number of key-value operations (inserts, deletes or |
| 3692 ** updates) that have been performed on the target database since the |
| 3693 ** current RBU update was started. |
| 3694 */ |
| 3695 SQLITE_API sqlite3_int64 SQLITE_STDCALL sqlite3rbu_progress(sqlite3rbu *pRbu){ |
| 3696 return pRbu->nProgress; |
| 3697 } |
| 3698 |
| 3699 SQLITE_API int SQLITE_STDCALL sqlite3rbu_savestate(sqlite3rbu *p){ |
| 3700 int rc = p->rc; |
| 3701 |
| 3702 if( rc==SQLITE_DONE ) return SQLITE_OK; |
| 3703 |
| 3704 assert( p->eStage>=RBU_STAGE_OAL && p->eStage<=RBU_STAGE_DONE ); |
| 3705 if( p->eStage==RBU_STAGE_OAL ){ |
| 3706 assert( rc!=SQLITE_DONE ); |
| 3707 if( rc==SQLITE_OK ) rc = sqlite3_exec(p->dbMain, "COMMIT", 0, 0, 0); |
| 3708 } |
| 3709 |
| 3710 p->rc = rc; |
| 3711 rbuSaveState(p, p->eStage); |
| 3712 rc = p->rc; |
| 3713 |
| 3714 if( p->eStage==RBU_STAGE_OAL ){ |
| 3715 assert( rc!=SQLITE_DONE ); |
| 3716 if( rc==SQLITE_OK ) rc = sqlite3_exec(p->dbRbu, "COMMIT", 0, 0, 0); |
| 3717 if( rc==SQLITE_OK ) rc = sqlite3_exec(p->dbRbu, "BEGIN IMMEDIATE", 0, 0, 0); |
| 3718 if( rc==SQLITE_OK ) rc = sqlite3_exec(p->dbMain, "BEGIN IMMEDIATE", 0, 0,0); |
| 3719 } |
| 3720 |
| 3721 p->rc = rc; |
| 3722 return rc; |
| 3723 } |
| 3724 |
| 3725 /************************************************************************** |
| 3726 ** Beginning of RBU VFS shim methods. The VFS shim modifies the behaviour |
| 3727 ** of a standard VFS in the following ways: |
| 3728 ** |
| 3729 ** 1. Whenever the first page of a main database file is read or |
| 3730 ** written, the value of the change-counter cookie is stored in |
| 3731 ** rbu_file.iCookie. Similarly, the value of the "write-version" |
| 3732 ** database header field is stored in rbu_file.iWriteVer. This ensures |
| 3733 ** that the values are always trustworthy within an open transaction. |
| 3734 ** |
| 3735 ** 2. Whenever an SQLITE_OPEN_WAL file is opened, the (rbu_file.pWalFd) |
| 3736 ** member variable of the associated database file descriptor is set |
| 3737 ** to point to the new file. A mutex protected linked list of all main |
| 3738 ** db fds opened using a particular RBU VFS is maintained at |
| 3739 ** rbu_vfs.pMain to facilitate this. |
| 3740 ** |
| 3741 ** 3. Using a new file-control "SQLITE_FCNTL_RBU", a main db rbu_file |
| 3742 ** object can be marked as the target database of an RBU update. This |
| 3743 ** turns on the following extra special behaviour: |
| 3744 ** |
| 3745 ** 3a. If xAccess() is called to check if there exists a *-wal file |
| 3746 ** associated with an RBU target database currently in RBU_STAGE_OAL |
| 3747 ** stage (preparing the *-oal file), the following special handling |
| 3748 ** applies: |
| 3749 ** |
| 3750 ** * if the *-wal file does exist, return SQLITE_CANTOPEN. An RBU |
| 3751 ** target database may not be in wal mode already. |
| 3752 ** |
| 3753 ** * if the *-wal file does not exist, set the output parameter to |
| 3754 ** non-zero (to tell SQLite that it does exist) anyway. |
| 3755 ** |
| 3756 ** Then, when xOpen() is called to open the *-wal file associated with |
| 3757 ** the RBU target in RBU_STAGE_OAL stage, instead of opening the *-wal |
| 3758 ** file, the rbu vfs opens the corresponding *-oal file instead. |
| 3759 ** |
| 3760 ** 3b. The *-shm pages returned by xShmMap() for a target db file in |
| 3761 ** RBU_STAGE_OAL mode are actually stored in heap memory. This is to |
| 3762 ** avoid creating a *-shm file on disk. Additionally, xShmLock() calls |
| 3763 ** are no-ops on target database files in RBU_STAGE_OAL mode. This is |
| 3764 ** because assert() statements in some VFS implementations fail if |
| 3765 ** xShmLock() is called before xShmMap(). |
| 3766 ** |
| 3767 ** 3c. If an EXCLUSIVE lock is attempted on a target database file in any |
| 3768 ** mode except RBU_STAGE_DONE (all work completed and checkpointed), it |
| 3769 ** fails with an SQLITE_BUSY error. This is to stop RBU connections |
| 3770 ** from automatically checkpointing a *-wal (or *-oal) file from within |
| 3771 ** sqlite3_close(). |
| 3772 ** |
| 3773 ** 3d. In RBU_STAGE_CAPTURE mode, all xRead() calls on the wal file, and |
| 3774 ** all xWrite() calls on the target database file perform no IO. |
| 3775 ** Instead the frame and page numbers that would be read and written |
| 3776 ** are recorded. Additionally, successful attempts to obtain exclusive |
| 3777 ** xShmLock() WRITER, CHECKPOINTER and READ0 locks on the target |
| 3778 ** database file are recorded. xShmLock() calls to unlock the same |
| 3779 ** locks are no-ops (so that once obtained, these locks are never |
| 3780 ** relinquished). Finally, calls to xSync() on the target database |
| 3781 ** file fail with SQLITE_INTERNAL errors. |
| 3782 */ |
| 3783 |
| 3784 static void rbuUnlockShm(rbu_file *p){ |
| 3785 if( p->pRbu ){ |
| 3786 int (*xShmLock)(sqlite3_file*,int,int,int) = p->pReal->pMethods->xShmLock; |
| 3787 int i; |
| 3788 for(i=0; i<SQLITE_SHM_NLOCK;i++){ |
| 3789 if( (1<<i) & p->pRbu->mLock ){ |
| 3790 xShmLock(p->pReal, i, 1, SQLITE_SHM_UNLOCK|SQLITE_SHM_EXCLUSIVE); |
| 3791 } |
| 3792 } |
| 3793 p->pRbu->mLock = 0; |
| 3794 } |
| 3795 } |
| 3796 |
| 3797 /* |
| 3798 ** Close an rbu file. |
| 3799 */ |
| 3800 static int rbuVfsClose(sqlite3_file *pFile){ |
| 3801 rbu_file *p = (rbu_file*)pFile; |
| 3802 int rc; |
| 3803 int i; |
| 3804 |
| 3805 /* Free the contents of the apShm[] array. And the array itself. */ |
| 3806 for(i=0; i<p->nShm; i++){ |
| 3807 sqlite3_free(p->apShm[i]); |
| 3808 } |
| 3809 sqlite3_free(p->apShm); |
| 3810 p->apShm = 0; |
| 3811 sqlite3_free(p->zDel); |
| 3812 |
| 3813 if( p->openFlags & SQLITE_OPEN_MAIN_DB ){ |
| 3814 rbu_file **pp; |
| 3815 sqlite3_mutex_enter(p->pRbuVfs->mutex); |
| 3816 for(pp=&p->pRbuVfs->pMain; *pp!=p; pp=&((*pp)->pMainNext)); |
| 3817 *pp = p->pMainNext; |
| 3818 sqlite3_mutex_leave(p->pRbuVfs->mutex); |
| 3819 rbuUnlockShm(p); |
| 3820 p->pReal->pMethods->xShmUnmap(p->pReal, 0); |
| 3821 } |
| 3822 |
| 3823 /* Close the underlying file handle */ |
| 3824 rc = p->pReal->pMethods->xClose(p->pReal); |
| 3825 return rc; |
| 3826 } |
| 3827 |
| 3828 |
| 3829 /* |
| 3830 ** Read and return an unsigned 32-bit big-endian integer from the buffer |
| 3831 ** passed as the only argument. |
| 3832 */ |
| 3833 static u32 rbuGetU32(u8 *aBuf){ |
| 3834 return ((u32)aBuf[0] << 24) |
| 3835 + ((u32)aBuf[1] << 16) |
| 3836 + ((u32)aBuf[2] << 8) |
| 3837 + ((u32)aBuf[3]); |
| 3838 } |
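As a minimal standalone sketch of the big-endian decoding above, the following reads the 4-byte change counter at byte offset 24 of a page 1 buffer, which is the offset the read and write methods in this file use. The names getU32be(), putU32be() and changeCounter() are hypothetical and exist only for this illustration. The encoder is included so the round trip can be checked.

```c
#include <assert.h>
#include <stdint.h>

/* Decode a 32-bit big-endian integer from the first 4 bytes of a[]. */
static uint32_t getU32be(const unsigned char *a){
  assert( a!=0 );
  return ((uint32_t)a[0] << 24)
       | ((uint32_t)a[1] << 16)
       | ((uint32_t)a[2] << 8)
       |  (uint32_t)a[3];
}

/* Encode v as a 32-bit big-endian integer into the first 4 bytes of a[]. */
static void putU32be(unsigned char *a, uint32_t v){
  assert( a!=0 );
  a[0] = (unsigned char)(v >> 24);
  a[1] = (unsigned char)(v >> 16);
  a[2] = (unsigned char)(v >> 8);
  a[3] = (unsigned char)v;
}

/*
** Hypothetical helper: return the change counter stored at byte offset 24
** of a database header. aPage1 must hold at least 28 bytes.
*/
static uint32_t changeCounter(const unsigned char *aPage1){
  return getU32be(&aPage1[24]);
}
```

Writing the decoder with shifts rather than a cast to a 32-bit pointer keeps it independent of host byte order and alignment.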
| 3839 |
| 3840 /* |
| 3841 ** Read data from an rbuVfs-file. |
| 3842 */ |
| 3843 static int rbuVfsRead( |
| 3844 sqlite3_file *pFile, |
| 3845 void *zBuf, |
| 3846 int iAmt, |
| 3847 sqlite_int64 iOfst |
| 3848 ){ |
| 3849 rbu_file *p = (rbu_file*)pFile; |
| 3850 sqlite3rbu *pRbu = p->pRbu; |
| 3851 int rc; |
| 3852 |
| 3853 if( pRbu && pRbu->eStage==RBU_STAGE_CAPTURE ){ |
| 3854 assert( p->openFlags & SQLITE_OPEN_WAL ); |
| 3855 rc = rbuCaptureWalRead(p->pRbu, iOfst, iAmt); |
| 3856 }else{ |
| 3857 if( pRbu && pRbu->eStage==RBU_STAGE_OAL |
| 3858 && (p->openFlags & SQLITE_OPEN_WAL) |
| 3859 && iOfst>=pRbu->iOalSz |
| 3860 ){ |
| 3861 rc = SQLITE_OK; |
| 3862 memset(zBuf, 0, iAmt); |
| 3863 }else{ |
| 3864 rc = p->pReal->pMethods->xRead(p->pReal, zBuf, iAmt, iOfst); |
| 3865 } |
| 3866 if( rc==SQLITE_OK && iOfst==0 && (p->openFlags & SQLITE_OPEN_MAIN_DB) ){ |
| 3867 /* These look like magic numbers. But they are stable, as they are part |
| 3868 ** of the definition of the SQLite file format, which may not change. */ |
| 3869 u8 *pBuf = (u8*)zBuf; |
| 3870 p->iCookie = rbuGetU32(&pBuf[24]); |
| 3871 p->iWriteVer = pBuf[19]; |
| 3872 } |
| 3873 } |
| 3874 return rc; |
| 3875 } |
| 3876 |
| 3877 /* |
| 3878 ** Write data to an rbuVfs-file. |
| 3879 */ |
| 3880 static int rbuVfsWrite( |
| 3881 sqlite3_file *pFile, |
| 3882 const void *zBuf, |
| 3883 int iAmt, |
| 3884 sqlite_int64 iOfst |
| 3885 ){ |
| 3886 rbu_file *p = (rbu_file*)pFile; |
| 3887 sqlite3rbu *pRbu = p->pRbu; |
| 3888 int rc; |
| 3889 |
| 3890 if( pRbu && pRbu->eStage==RBU_STAGE_CAPTURE ){ |
| 3891 assert( p->openFlags & SQLITE_OPEN_MAIN_DB ); |
| 3892 rc = rbuCaptureDbWrite(p->pRbu, iOfst); |
| 3893 }else{ |
| 3894 if( pRbu && pRbu->eStage==RBU_STAGE_OAL |
| 3895 && (p->openFlags & SQLITE_OPEN_WAL) |
| 3896 && iOfst>=pRbu->iOalSz |
| 3897 ){ |
| 3898 pRbu->iOalSz = iAmt + iOfst; |
| 3899 } |
| 3900 rc = p->pReal->pMethods->xWrite(p->pReal, zBuf, iAmt, iOfst); |
| 3901 if( rc==SQLITE_OK && iOfst==0 && (p->openFlags & SQLITE_OPEN_MAIN_DB) ){ |
| 3902 /* These look like magic numbers. But they are stable, as they are part |
| 3903 ** of the definition of the SQLite file format, which may not change. */ |
| 3904 u8 *pBuf = (u8*)zBuf; |
| 3905 p->iCookie = rbuGetU32(&pBuf[24]); |
| 3906 p->iWriteVer = pBuf[19]; |
| 3907 } |
| 3908 } |
| 3909 return rc; |
| 3910 } |
| 3911 |
| 3912 /* |
| 3913 ** Truncate an rbuVfs-file. |
| 3914 */ |
| 3915 static int rbuVfsTruncate(sqlite3_file *pFile, sqlite_int64 size){ |
| 3916 rbu_file *p = (rbu_file*)pFile; |
| 3917 return p->pReal->pMethods->xTruncate(p->pReal, size); |
| 3918 } |
| 3919 |
| 3920 /* |
| 3921 ** Sync an rbuVfs-file. |
| 3922 */ |
| 3923 static int rbuVfsSync(sqlite3_file *pFile, int flags){ |
| 3924 rbu_file *p = (rbu_file *)pFile; |
| 3925 if( p->pRbu && p->pRbu->eStage==RBU_STAGE_CAPTURE ){ |
| 3926 if( p->openFlags & SQLITE_OPEN_MAIN_DB ){ |
| 3927 return SQLITE_INTERNAL; |
| 3928 } |
| 3929 return SQLITE_OK; |
| 3930 } |
| 3931 return p->pReal->pMethods->xSync(p->pReal, flags); |
| 3932 } |
| 3933 |
| 3934 /* |
| 3935 ** Return the current file-size of an rbuVfs-file. |
| 3936 */ |
| 3937 static int rbuVfsFileSize(sqlite3_file *pFile, sqlite_int64 *pSize){ |
| 3938 rbu_file *p = (rbu_file *)pFile; |
| 3939 return p->pReal->pMethods->xFileSize(p->pReal, pSize); |
| 3940 } |
| 3941 |
| 3942 /* |
| 3943 ** Lock an rbuVfs-file. |
| 3944 */ |
| 3945 static int rbuVfsLock(sqlite3_file *pFile, int eLock){ |
| 3946 rbu_file *p = (rbu_file*)pFile; |
| 3947 sqlite3rbu *pRbu = p->pRbu; |
| 3948 int rc = SQLITE_OK; |
| 3949 |
| 3950 assert( p->openFlags & (SQLITE_OPEN_MAIN_DB|SQLITE_OPEN_TEMP_DB) ); |
| 3951 if( pRbu && eLock==SQLITE_LOCK_EXCLUSIVE && pRbu->eStage!=RBU_STAGE_DONE ){ |
| 3952 /* Do not allow EXCLUSIVE locks. Preventing SQLite from taking this |
| 3953 ** prevents it from checkpointing the database from sqlite3_close(). */ |
| 3954 rc = SQLITE_BUSY; |
| 3955 }else{ |
| 3956 rc = p->pReal->pMethods->xLock(p->pReal, eLock); |
| 3957 } |
| 3958 |
| 3959 return rc; |
| 3960 } |
| 3961 |
| 3962 /* |
| 3963 ** Unlock an rbuVfs-file. |
| 3964 */ |
| 3965 static int rbuVfsUnlock(sqlite3_file *pFile, int eLock){ |
| 3966 rbu_file *p = (rbu_file *)pFile; |
| 3967 return p->pReal->pMethods->xUnlock(p->pReal, eLock); |
| 3968 } |
| 3969 |
| 3970 /* |
| 3971 ** Check if another file-handle holds a RESERVED lock on an rbuVfs-file. |
| 3972 */ |
| 3973 static int rbuVfsCheckReservedLock(sqlite3_file *pFile, int *pResOut){ |
| 3974 rbu_file *p = (rbu_file *)pFile; |
| 3975 return p->pReal->pMethods->xCheckReservedLock(p->pReal, pResOut); |
| 3976 } |
| 3977 |
| 3978 /* |
| 3979 ** File control method. For custom operations on an rbuVfs-file. |
| 3980 */ |
| 3981 static int rbuVfsFileControl(sqlite3_file *pFile, int op, void *pArg){ |
| 3982 rbu_file *p = (rbu_file *)pFile; |
| 3983 int (*xControl)(sqlite3_file*,int,void*) = p->pReal->pMethods->xFileControl; |
| 3984 int rc; |
| 3985 |
| 3986 assert( p->openFlags & (SQLITE_OPEN_MAIN_DB|SQLITE_OPEN_TEMP_DB) |
| 3987 || p->openFlags & (SQLITE_OPEN_TRANSIENT_DB|SQLITE_OPEN_TEMP_JOURNAL) |
| 3988 ); |
| 3989 if( op==SQLITE_FCNTL_RBU ){ |
| 3990 sqlite3rbu *pRbu = (sqlite3rbu*)pArg; |
| 3991 |
| 3992 /* First try to find another RBU vfs lower down in the vfs stack. If |
| 3993 ** one is found, this vfs will operate in pass-through mode. The lower |
| 3994 ** level vfs will do the special RBU handling. */ |
| 3995 rc = xControl(p->pReal, op, pArg); |
| 3996 |
| 3997 if( rc==SQLITE_NOTFOUND ){ |
| 3998 /* Now search for a zipvfs instance lower down in the VFS stack. If |
| 3999 ** one is found, this is an error. */ |
| 4000 void *dummy = 0; |
| 4001 rc = xControl(p->pReal, SQLITE_FCNTL_ZIPVFS, &dummy); |
| 4002 if( rc==SQLITE_OK ){ |
| 4003 rc = SQLITE_ERROR; |
| 4004 pRbu->zErrmsg = sqlite3_mprintf("rbu/zipvfs setup error"); |
| 4005 }else if( rc==SQLITE_NOTFOUND ){ |
| 4006 pRbu->pTargetFd = p; |
| 4007 p->pRbu = pRbu; |
| 4008 if( p->pWalFd ) p->pWalFd->pRbu = pRbu; |
| 4009 rc = SQLITE_OK; |
| 4010 } |
| 4011 } |
| 4012 return rc; |
| 4013 } |
| 4014 |
| 4015 rc = xControl(p->pReal, op, pArg); |
| 4016 if( rc==SQLITE_OK && op==SQLITE_FCNTL_VFSNAME ){ |
| 4017 rbu_vfs *pRbuVfs = p->pRbuVfs; |
| 4018 char *zIn = *(char**)pArg; |
| 4019 char *zOut = sqlite3_mprintf("rbu(%s)/%z", pRbuVfs->base.zName, zIn); |
| 4020 *(char**)pArg = zOut; |
| 4021 if( zOut==0 ) rc = SQLITE_NOMEM; |
| 4022 } |
| 4023 |
| 4024 return rc; |
| 4025 } |
| 4026 |
| 4027 /* |
| 4028 ** Return the sector-size in bytes for an rbuVfs-file. |
| 4029 */ |
| 4030 static int rbuVfsSectorSize(sqlite3_file *pFile){ |
| 4031 rbu_file *p = (rbu_file *)pFile; |
| 4032 return p->pReal->pMethods->xSectorSize(p->pReal); |
| 4033 } |
| 4034 |
| 4035 /* |
| 4036 ** Return the device characteristic flags supported by an rbuVfs-file. |
| 4037 */ |
| 4038 static int rbuVfsDeviceCharacteristics(sqlite3_file *pFile){ |
| 4039 rbu_file *p = (rbu_file *)pFile; |
| 4040 return p->pReal->pMethods->xDeviceCharacteristics(p->pReal); |
| 4041 } |
| 4042 |
| 4043 /* |
| 4044 ** Take or release a shared-memory lock. |
| 4045 */ |
| 4046 static int rbuVfsShmLock(sqlite3_file *pFile, int ofst, int n, int flags){ |
| 4047 rbu_file *p = (rbu_file*)pFile; |
| 4048 sqlite3rbu *pRbu = p->pRbu; |
| 4049 int rc = SQLITE_OK; |
| 4050 |
| 4051 #ifdef SQLITE_AMALGAMATION |
| 4052 assert( WAL_CKPT_LOCK==1 ); |
| 4053 #endif |
| 4054 |
| 4055 assert( p->openFlags & (SQLITE_OPEN_MAIN_DB|SQLITE_OPEN_TEMP_DB) ); |
| 4056 if( pRbu && (pRbu->eStage==RBU_STAGE_OAL || pRbu->eStage==RBU_STAGE_MOVE) ){ |
| 4057 /* Magic number 1 is the WAL_CKPT_LOCK lock. Preventing SQLite from |
| 4058 ** taking this lock also prevents any checkpoints from occurring. |
| 4059 ** todo: really, it's not clear why this might occur, as |
| 4060 ** wal_autocheckpoint ought to be turned off. */ |
| 4061 if( ofst==WAL_LOCK_CKPT && n==1 ) rc = SQLITE_BUSY; |
| 4062 }else{ |
| 4063 int bCapture = 0; |
| 4064 if( n==1 && (flags & SQLITE_SHM_EXCLUSIVE) |
| 4065 && pRbu && pRbu->eStage==RBU_STAGE_CAPTURE |
| 4066 && (ofst==WAL_LOCK_WRITE || ofst==WAL_LOCK_CKPT || ofst==WAL_LOCK_READ0) |
| 4067 ){ |
| 4068 bCapture = 1; |
| 4069 } |
| 4070 |
| 4071 if( bCapture==0 || 0==(flags & SQLITE_SHM_UNLOCK) ){ |
| 4072 rc = p->pReal->pMethods->xShmLock(p->pReal, ofst, n, flags); |
| 4073 if( bCapture && rc==SQLITE_OK ){ |
| 4074 pRbu->mLock |= (1 << ofst); |
| 4075 } |
| 4076 } |
| 4077 } |
| 4078 |
| 4079 return rc; |
| 4080 } |
| 4081 |
| 4082 /* |
| 4083 ** Obtain a pointer to a mapping of a single 32KiB page of the *-shm file. |
| 4084 */ |
| 4085 static int rbuVfsShmMap( |
| 4086 sqlite3_file *pFile, |
| 4087 int iRegion, |
| 4088 int szRegion, |
| 4089 int isWrite, |
| 4090 void volatile **pp |
| 4091 ){ |
| 4092 rbu_file *p = (rbu_file*)pFile; |
| 4093 int rc = SQLITE_OK; |
| 4094 int eStage = (p->pRbu ? p->pRbu->eStage : 0); |
| 4095 |
| 4096 /* If not in RBU_STAGE_OAL or RBU_STAGE_MOVE, allow this call to pass |
| 4097 ** through. Otherwise, use heap memory for *-shm space instead of a file |
| 4098 ** on disk. */ |
| 4099 assert( p->openFlags & (SQLITE_OPEN_MAIN_DB|SQLITE_OPEN_TEMP_DB) ); |
| 4100 if( eStage==RBU_STAGE_OAL || eStage==RBU_STAGE_MOVE ){ |
| 4101 if( iRegion>=p->nShm ){ |
| 4102 int nByte = (iRegion+1) * sizeof(char*); |
| 4103 char **apNew = (char**)sqlite3_realloc(p->apShm, nByte); |
| 4104 if( apNew==0 ){ |
| 4105 rc = SQLITE_NOMEM; |
| 4106 }else{ |
| 4107 memset(&apNew[p->nShm], 0, sizeof(char*) * (1 + iRegion - p->nShm)); |
| 4108 p->apShm = apNew; |
| 4109 p->nShm = iRegion+1; |
| 4110 } |
| 4111 } |
| 4112 |
| 4113 if( rc==SQLITE_OK && p->apShm[iRegion]==0 ){ |
| 4114 char *pNew = (char*)sqlite3_malloc(szRegion); |
| 4115 if( pNew==0 ){ |
| 4116 rc = SQLITE_NOMEM; |
| 4117 }else{ |
| 4118 memset(pNew, 0, szRegion); |
| 4119 p->apShm[iRegion] = pNew; |
| 4120 } |
| 4121 } |
| 4122 |
| 4123 if( rc==SQLITE_OK ){ |
| 4124 *pp = p->apShm[iRegion]; |
| 4125 }else{ |
| 4126 *pp = 0; |
| 4127 } |
| 4128 }else{ |
| 4129 assert( p->apShm==0 ); |
| 4130 rc = p->pReal->pMethods->xShmMap(p->pReal, iRegion, szRegion, isWrite, pp); |
| 4131 } |
| 4132 |
| 4133 return rc; |
| 4134 } |
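While the update is in the OAL or MOVE stage, the function above serves *-shm regions from heap memory: it grows an array of region pointers on demand and zero-fills each region the first time it is requested. The following is a minimal sketch of that pattern using the C standard allocator instead of `sqlite3_realloc()`. `DemoShm`, `demo_shm_map` and `demo_shm_free` are hypothetical names, not part of SQLite.

```c
#include <stdlib.h>
#include <string.h>

typedef struct DemoShm {
  char **apShm;   /* Array of region pointers (NULL if not yet allocated) */
  int nShm;       /* Number of entries in apShm[] */
} DemoShm;

/* Return a pointer to region iRegion, allocating a zeroed buffer of
** szRegion bytes on first use. Returns NULL if an allocation fails. */
char *demo_shm_map(DemoShm *p, int iRegion, int szRegion){
  if( iRegion>=p->nShm ){
    size_t nByte = (size_t)(iRegion+1) * sizeof(char*);
    char **apNew = (char**)realloc(p->apShm, nByte);
    if( apNew==0 ) return 0;
    /* Zero only the newly added slots */
    memset(&apNew[p->nShm], 0, sizeof(char*) * (size_t)(iRegion+1-p->nShm));
    p->apShm = apNew;
    p->nShm = iRegion+1;
  }
  if( p->apShm[iRegion]==0 ){
    p->apShm[iRegion] = (char*)calloc(1, (size_t)szRegion);
  }
  return p->apShm[iRegion];
}

/* Free every region and the pointer array itself. */
void demo_shm_free(DemoShm *p){
  int i;
  for(i=0; i<p->nShm; i++) free(p->apShm[i]);
  free(p->apShm);
  p->apShm = 0;
  p->nShm = 0;
}
```

Requesting the same region twice returns the same buffer, which is what lets the heap copy stand in for a shared mapping within one process.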
| 4135 |
| 4136 /* |
| 4137 ** Memory barrier. |
| 4138 */ |
| 4139 static void rbuVfsShmBarrier(sqlite3_file *pFile){ |
| 4140 rbu_file *p = (rbu_file *)pFile; |
| 4141 p->pReal->pMethods->xShmBarrier(p->pReal); |
| 4142 } |
| 4143 |
| 4144 /* |
| 4145 ** The xShmUnmap method. |
| 4146 */ |
| 4147 static int rbuVfsShmUnmap(sqlite3_file *pFile, int delFlag){ |
| 4148 rbu_file *p = (rbu_file*)pFile; |
| 4149 int rc = SQLITE_OK; |
| 4150 int eStage = (p->pRbu ? p->pRbu->eStage : 0); |
| 4151 |
| 4152 assert( p->openFlags & (SQLITE_OPEN_MAIN_DB|SQLITE_OPEN_TEMP_DB) ); |
| 4153 if( eStage==RBU_STAGE_OAL || eStage==RBU_STAGE_MOVE ){ |
| 4154 /* no-op */ |
| 4155 }else{ |
| 4156 /* Release the checkpointer and writer locks */ |
| 4157 rbuUnlockShm(p); |
| 4158 rc = p->pReal->pMethods->xShmUnmap(p->pReal, delFlag); |
| 4159 } |
| 4160 return rc; |
| 4161 } |
| 4162 |
| 4163 /* |
| 4164 ** Given that zWal points to a buffer containing a wal file name passed to |
| 4165 ** either the xOpen() or xAccess() VFS method, return a pointer to the |
| 4166 ** file-handle opened by the same database connection on the corresponding |
| 4167 ** database file. |
| 4168 */ |
| 4169 static rbu_file *rbuFindMaindb(rbu_vfs *pRbuVfs, const char *zWal){ |
| 4170 rbu_file *pDb; |
| 4171 sqlite3_mutex_enter(pRbuVfs->mutex); |
| 4172 for(pDb=pRbuVfs->pMain; pDb && pDb->zWal!=zWal; pDb=pDb->pMainNext); |
| 4173 sqlite3_mutex_leave(pRbuVfs->mutex); |
| 4174 return pDb; |
| 4175 } |
| 4176 |
| 4177 /* |
| 4178 ** Open an rbu file handle. |
| 4179 */ |
| 4180 static int rbuVfsOpen( |
| 4181 sqlite3_vfs *pVfs, |
| 4182 const char *zName, |
| 4183 sqlite3_file *pFile, |
| 4184 int flags, |
| 4185 int *pOutFlags |
| 4186 ){ |
| 4187 static sqlite3_io_methods rbuvfs_io_methods = { |
| 4188 2, /* iVersion */ |
| 4189 rbuVfsClose, /* xClose */ |
| 4190 rbuVfsRead, /* xRead */ |
| 4191 rbuVfsWrite, /* xWrite */ |
| 4192 rbuVfsTruncate, /* xTruncate */ |
| 4193 rbuVfsSync, /* xSync */ |
| 4194 rbuVfsFileSize, /* xFileSize */ |
| 4195 rbuVfsLock, /* xLock */ |
| 4196 rbuVfsUnlock, /* xUnlock */ |
| 4197 rbuVfsCheckReservedLock, /* xCheckReservedLock */ |
| 4198 rbuVfsFileControl, /* xFileControl */ |
| 4199 rbuVfsSectorSize, /* xSectorSize */ |
| 4200 rbuVfsDeviceCharacteristics, /* xDeviceCharacteristics */ |
| 4201 rbuVfsShmMap, /* xShmMap */ |
| 4202 rbuVfsShmLock, /* xShmLock */ |
| 4203 rbuVfsShmBarrier, /* xShmBarrier */ |
| 4204 rbuVfsShmUnmap, /* xShmUnmap */ |
| 4205 0, 0 /* xFetch, xUnfetch */ |
| 4206 }; |
| 4207 rbu_vfs *pRbuVfs = (rbu_vfs*)pVfs; |
| 4208 sqlite3_vfs *pRealVfs = pRbuVfs->pRealVfs; |
| 4209 rbu_file *pFd = (rbu_file *)pFile; |
| 4210 int rc = SQLITE_OK; |
| 4211 const char *zOpen = zName; |
| 4212 |
| 4213 memset(pFd, 0, sizeof(rbu_file)); |
| 4214 pFd->pReal = (sqlite3_file*)&pFd[1]; |
| 4215 pFd->pRbuVfs = pRbuVfs; |
| 4216 pFd->openFlags = flags; |
| 4217 if( zName ){ |
| 4218 if( flags & SQLITE_OPEN_MAIN_DB ){ |
| 4219 /* A main database has just been opened. The following block sets |
| 4220 ** (pFd->zWal) to point to a buffer owned by SQLite that contains |
| 4221 ** the name of the *-wal file this db connection will use. SQLite |
| 4222 ** happens to pass a pointer to this buffer when using xAccess() |
| 4223 ** or xOpen() to operate on the *-wal file. */ |
| 4224 int n = strlen(zName); |
| 4225 const char *z = &zName[n]; |
| 4226 if( flags & SQLITE_OPEN_URI ){ |
| 4227 int odd = 0; |
| 4228 while( 1 ){ |
| 4229 if( z[0]==0 ){ |
| 4230 odd = 1 - odd; |
| 4231 if( odd && z[1]==0 ) break; |
| 4232 } |
| 4233 z++; |
| 4234 } |
| 4235 z += 2; |
| 4236 }else{ |
| 4237 while( *z==0 ) z++; |
| 4238 } |
| 4239 z += (n + 8 + 1); |
| 4240 pFd->zWal = z; |
| 4241 } |
| 4242 else if( flags & SQLITE_OPEN_WAL ){ |
| 4243 rbu_file *pDb = rbuFindMaindb(pRbuVfs, zName); |
| 4244 if( pDb ){ |
| 4245 if( pDb->pRbu && pDb->pRbu->eStage==RBU_STAGE_OAL ){ |
| 4246 /* This call is to open a *-wal file. Instead, open the *-oal. This |
| 4247 ** code ensures that the string passed to xOpen() is terminated by a |
| 4248 ** pair of '\0' bytes in case the VFS attempts to extract a URI |
| 4249 ** parameter from it. */ |
| 4250 int nCopy = strlen(zName); |
| 4251 char *zCopy = sqlite3_malloc(nCopy+2); |
| 4252 if( zCopy ){ |
| 4253 memcpy(zCopy, zName, nCopy); |
| 4254 zCopy[nCopy-3] = 'o'; |
| 4255 zCopy[nCopy] = '\0'; |
| 4256 zCopy[nCopy+1] = '\0'; |
| 4257 zOpen = (const char*)(pFd->zDel = zCopy); |
| 4258 }else{ |
| 4259 rc = SQLITE_NOMEM; |
| 4260 } |
| 4261 pFd->pRbu = pDb->pRbu; |
| 4262 } |
| 4263 pDb->pWalFd = pFd; |
| 4264 } |
| 4265 } |
| 4266 } |
| 4267 |
| 4268 if( rc==SQLITE_OK ){ |
| 4269 rc = pRealVfs->xOpen(pRealVfs, zOpen, pFd->pReal, flags, pOutFlags); |
| 4270 } |
| 4271 if( pFd->pReal->pMethods ){ |
| 4272 /* The xOpen() operation has succeeded. Set the sqlite3_file.pMethods |
| 4273 ** pointer and, if the file is a main database file, link it into the |
| 4274 ** mutex protected linked list of all such files. */ |
| 4275 pFile->pMethods = &rbuvfs_io_methods; |
| 4276 if( flags & SQLITE_OPEN_MAIN_DB ){ |
| 4277 sqlite3_mutex_enter(pRbuVfs->mutex); |
| 4278 pFd->pMainNext = pRbuVfs->pMain; |
| 4279 pRbuVfs->pMain = pFd; |
| 4280 sqlite3_mutex_leave(pRbuVfs->mutex); |
| 4281 } |
| 4282 }else{ |
| 4283 sqlite3_free(pFd->zDel); |
| 4284 } |
| 4285 |
| 4286 return rc; |
| 4287 } |
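When a *-wal open is redirected during the OAL stage, the function above copies the name, overwrites the third character from the end with 'o', and appends two NUL bytes so that a VFS scanning for URI parameters stops safely. A minimal sketch of that string handling follows. `demo_oal_name` is a hypothetical helper, and it uses `malloc()` where the listing uses `sqlite3_malloc()`.

```c
#include <stdlib.h>
#include <string.h>

/* Return a heap copy of zWal with the 'w' of the trailing "wal" replaced
** by 'o', terminated by two NUL bytes. Returns NULL if zWal is shorter
** than three characters or if the allocation fails. The caller frees
** the result with free(). */
char *demo_oal_name(const char *zWal){
  size_t n = strlen(zWal);
  char *zCopy;
  if( n<3 ) return 0;
  zCopy = (char*)malloc(n+2);
  if( zCopy==0 ) return 0;
  memcpy(zCopy, zWal, n);
  zCopy[n-3] = 'o';        /* "...-wal" becomes "...-oal" */
  zCopy[n] = '\0';
  zCopy[n+1] = '\0';       /* Second terminator for URI-parameter scans */
  return zCopy;
}
```

For instance, `demo_oal_name("test.db-wal")` yields `"test.db-oal"` followed by an extra NUL byte.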
| 4288 |
| 4289 /* |
| 4290 ** Delete the file located at zPath. |
| 4291 */ |
| 4292 static int rbuVfsDelete(sqlite3_vfs *pVfs, const char *zPath, int dirSync){ |
| 4293 sqlite3_vfs *pRealVfs = ((rbu_vfs*)pVfs)->pRealVfs; |
| 4294 return pRealVfs->xDelete(pRealVfs, zPath, dirSync); |
| 4295 } |
| 4296 |
| 4297 /* |
| 4298 ** Test for access permissions. Return true if the requested permission |
| 4299 ** is available, or false otherwise. |
| 4300 */ |
| 4301 static int rbuVfsAccess( |
| 4302 sqlite3_vfs *pVfs, |
| 4303 const char *zPath, |
| 4304 int flags, |
| 4305 int *pResOut |
| 4306 ){ |
| 4307 rbu_vfs *pRbuVfs = (rbu_vfs*)pVfs; |
| 4308 sqlite3_vfs *pRealVfs = pRbuVfs->pRealVfs; |
| 4309 int rc; |
| 4310 |
| 4311 rc = pRealVfs->xAccess(pRealVfs, zPath, flags, pResOut); |
| 4312 |
| 4313 /* If this call is to check if a *-wal file associated with an RBU target |
| 4314 ** database connection exists, and the RBU update is in RBU_STAGE_OAL, |
| 4315 ** the following special handling is activated: |
| 4316 ** |
| 4317 ** a) if the *-wal file does exist, return SQLITE_CANTOPEN. This |
| 4318 ** ensures that the RBU extension never tries to update a database |
| 4319 ** in wal mode, even if the first page of the database file has |
| 4320 ** been damaged. |
| 4321 ** |
| 4322 ** b) if the *-wal file does not exist, claim that it does anyway, |
| 4323 ** causing SQLite to call xOpen() to open it. This call will also |
| 4324 ** be intercepted (see the rbuVfsOpen() function) and the *-oal |
| 4325 ** file opened instead. |
| 4326 */ |
| 4327 if( rc==SQLITE_OK && flags==SQLITE_ACCESS_EXISTS ){ |
| 4328 rbu_file *pDb = rbuFindMaindb(pRbuVfs, zPath); |
| 4329 if( pDb && pDb->pRbu && pDb->pRbu->eStage==RBU_STAGE_OAL ){ |
| 4330 if( *pResOut ){ |
| 4331 rc = SQLITE_CANTOPEN; |
| 4332 }else{ |
| 4333 *pResOut = 1; |
| 4334 } |
| 4335 } |
| 4336 } |
| 4337 |
| 4338 return rc; |
| 4339 } |
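The special handling in cases (a) and (b) above reduces to a small decision on two inputs: whether the update is in the OAL stage and whether the *-wal file really exists. The sketch below isolates that decision. `demo_access_wal` and the `DEMO_` constants are hypothetical names; the value 14 mirrors `SQLITE_CANTOPEN`.

```c
enum { DEMO_OK = 0, DEMO_CANTOPEN = 14 };

/* *pExists holds the real answer from the underlying VFS on entry and
** the answer reported to SQLite on return. inOalStage is non-zero when
** the path is the *-wal file of an RBU target in the OAL stage. */
int demo_access_wal(int inOalStage, int *pExists){
  if( inOalStage ){
    if( *pExists ){
      return DEMO_CANTOPEN;   /* (a) a real *-wal file: refuse to proceed */
    }
    *pExists = 1;             /* (b) no *-wal file: claim it exists anyway */
  }
  return DEMO_OK;             /* Outside the OAL stage: report unchanged */
}
```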
| 4340 |
| 4341 /* |
| 4342 ** Populate buffer zOut with the full canonical pathname corresponding |
| 4343 ** to the pathname in zPath. zOut is guaranteed to point to a buffer |
| 4344 ** of at least (mxPathname+1) bytes. |
| 4345 */ |
| 4346 static int rbuVfsFullPathname( |
| 4347 sqlite3_vfs *pVfs, |
| 4348 const char *zPath, |
| 4349 int nOut, |
| 4350 char *zOut |
| 4351 ){ |
| 4352 sqlite3_vfs *pRealVfs = ((rbu_vfs*)pVfs)->pRealVfs; |
| 4353 return pRealVfs->xFullPathname(pRealVfs, zPath, nOut, zOut); |
| 4354 } |
| 4355 |
| 4356 #ifndef SQLITE_OMIT_LOAD_EXTENSION |
| 4357 /* |
| 4358 ** Open the dynamic library located at zPath and return a handle. |
| 4359 */ |
| 4360 static void *rbuVfsDlOpen(sqlite3_vfs *pVfs, const char *zPath){ |
| 4361 sqlite3_vfs *pRealVfs = ((rbu_vfs*)pVfs)->pRealVfs; |
| 4362 return pRealVfs->xDlOpen(pRealVfs, zPath); |
| 4363 } |
| 4364 |
| 4365 /* |
| 4366 ** Populate the buffer zErrMsg (size nByte bytes) with a human readable |
| 4367 ** utf-8 string describing the most recent error encountered associated |
| 4368 ** with dynamic libraries. |
| 4369 */ |
| 4370 static void rbuVfsDlError(sqlite3_vfs *pVfs, int nByte, char *zErrMsg){ |
| 4371 sqlite3_vfs *pRealVfs = ((rbu_vfs*)pVfs)->pRealVfs; |
| 4372 pRealVfs->xDlError(pRealVfs, nByte, zErrMsg); |
| 4373 } |
| 4374 |
| 4375 /* |
| 4376 ** Return a pointer to the symbol zSymbol in the dynamic library pHandle. |
| 4377 */ |
| 4378 static void (*rbuVfsDlSym( |
| 4379 sqlite3_vfs *pVfs, |
| 4380 void *pArg, |
| 4381 const char *zSym |
| 4382 ))(void){ |
| 4383 sqlite3_vfs *pRealVfs = ((rbu_vfs*)pVfs)->pRealVfs; |
| 4384 return pRealVfs->xDlSym(pRealVfs, pArg, zSym); |
| 4385 } |
| 4386 |
| 4387 /* |
| 4388 ** Close the dynamic library handle pHandle. |
| 4389 */ |
| 4390 static void rbuVfsDlClose(sqlite3_vfs *pVfs, void *pHandle){ |
| 4391 sqlite3_vfs *pRealVfs = ((rbu_vfs*)pVfs)->pRealVfs; |
| 4392 pRealVfs->xDlClose(pRealVfs, pHandle); |
| 4393 } |
| 4394 #endif /* SQLITE_OMIT_LOAD_EXTENSION */ |
| 4395 |
| 4396 /* |
| 4397 ** Populate the buffer pointed to by zBufOut with nByte bytes of |
| 4398 ** random data. |
| 4399 */ |
| 4400 static int rbuVfsRandomness(sqlite3_vfs *pVfs, int nByte, char *zBufOut){ |
| 4401 sqlite3_vfs *pRealVfs = ((rbu_vfs*)pVfs)->pRealVfs; |
| 4402 return pRealVfs->xRandomness(pRealVfs, nByte, zBufOut); |
| 4403 } |
| 4404 |
| 4405 /* |
| 4406 ** Sleep for nMicro microseconds. Return the number of microseconds |
| 4407 ** actually slept. |
| 4408 */ |
| 4409 static int rbuVfsSleep(sqlite3_vfs *pVfs, int nMicro){ |
| 4410 sqlite3_vfs *pRealVfs = ((rbu_vfs*)pVfs)->pRealVfs; |
| 4411 return pRealVfs->xSleep(pRealVfs, nMicro); |
| 4412 } |
| 4413 |
| 4414 /* |
| 4415 ** Return the current time as a Julian Day number in *pTimeOut. |
| 4416 */ |
| 4417 static int rbuVfsCurrentTime(sqlite3_vfs *pVfs, double *pTimeOut){ |
| 4418 sqlite3_vfs *pRealVfs = ((rbu_vfs*)pVfs)->pRealVfs; |
| 4419 return pRealVfs->xCurrentTime(pRealVfs, pTimeOut); |
| 4420 } |
| 4421 |
| 4422 /* |
| 4423 ** No-op. |
| 4424 */ |
| 4425 static int rbuVfsGetLastError(sqlite3_vfs *pVfs, int a, char *b){ |
| 4426 return 0; |
| 4427 } |
| 4428 |
| 4429 /* |
| 4430 ** Deregister and destroy an RBU vfs created by an earlier call to |
| 4431 ** sqlite3rbu_create_vfs(). |
| 4432 */ |
| 4433 SQLITE_API void SQLITE_STDCALL sqlite3rbu_destroy_vfs(const char *zName){ |
| 4434 sqlite3_vfs *pVfs = sqlite3_vfs_find(zName); |
| 4435 if( pVfs && pVfs->xOpen==rbuVfsOpen ){ |
| 4436 sqlite3_mutex_free(((rbu_vfs*)pVfs)->mutex); |
| 4437 sqlite3_vfs_unregister(pVfs); |
| 4438 sqlite3_free(pVfs); |
| 4439 } |
| 4440 } |
| 4441 |
| 4442 /* |
| 4443 ** Create an RBU VFS named zName that accesses the underlying file-system |
| 4444 ** via existing VFS zParent. The new object is registered as a non-default |
| 4445 ** VFS with SQLite before returning. |
| 4446 */ |
| 4447 SQLITE_API int SQLITE_STDCALL sqlite3rbu_create_vfs(const char *zName, const char *zParent){ |
| 4448 |
| 4449 /* Template for VFS */ |
| 4450 static sqlite3_vfs vfs_template = { |
| 4451 1, /* iVersion */ |
| 4452 0, /* szOsFile */ |
| 4453 0, /* mxPathname */ |
| 4454 0, /* pNext */ |
| 4455 0, /* zName */ |
| 4456 0, /* pAppData */ |
| 4457 rbuVfsOpen, /* xOpen */ |
| 4458 rbuVfsDelete, /* xDelete */ |
| 4459 rbuVfsAccess, /* xAccess */ |
| 4460 rbuVfsFullPathname, /* xFullPathname */ |
| 4461 |
| 4462 #ifndef SQLITE_OMIT_LOAD_EXTENSION |
| 4463 rbuVfsDlOpen, /* xDlOpen */ |
| 4464 rbuVfsDlError, /* xDlError */ |
| 4465 rbuVfsDlSym, /* xDlSym */ |
| 4466 rbuVfsDlClose, /* xDlClose */ |
| 4467 #else |
| 4468 0, 0, 0, 0, |
| 4469 #endif |
| 4470 |
| 4471 rbuVfsRandomness, /* xRandomness */ |
| 4472 rbuVfsSleep, /* xSleep */ |
| 4473 rbuVfsCurrentTime, /* xCurrentTime */ |
| 4474 rbuVfsGetLastError, /* xGetLastError */ |
| 4475 0, /* xCurrentTimeInt64 (version 2) */ |
| 4476 0, 0, 0 /* Unimplemented version 3 methods */ |
| 4477 }; |
| 4478 |
| 4479 rbu_vfs *pNew = 0; /* Newly allocated VFS */ |
| 4480 int nName; |
| 4481 int rc = SQLITE_OK; |
| 4482 |
| 4483 int nByte; |
| 4484 nName = strlen(zName); |
| 4485 nByte = sizeof(rbu_vfs) + nName + 1; |
| 4486 pNew = (rbu_vfs*)sqlite3_malloc(nByte); |
| 4487 if( pNew==0 ){ |
| 4488 rc = SQLITE_NOMEM; |
| 4489 }else{ |
| 4490 sqlite3_vfs *pParent; /* Parent VFS */ |
| 4491 memset(pNew, 0, nByte); |
| 4492 pParent = sqlite3_vfs_find(zParent); |
| 4493 if( pParent==0 ){ |
| 4494 rc = SQLITE_NOTFOUND; |
| 4495 }else{ |
| 4496 char *zSpace; |
| 4497 memcpy(&pNew->base, &vfs_template, sizeof(sqlite3_vfs)); |
| 4498 pNew->base.mxPathname = pParent->mxPathname; |
| 4499 pNew->base.szOsFile = sizeof(rbu_file) + pParent->szOsFile; |
| 4500 pNew->pRealVfs = pParent; |
| 4501 pNew->base.zName = (const char*)(zSpace = (char*)&pNew[1]); |
| 4502 memcpy(zSpace, zName, nName); |
| 4503 |
| 4504 /* Allocate the mutex and register the new VFS (not as the default) */ |
| 4505 pNew->mutex = sqlite3_mutex_alloc(SQLITE_MUTEX_RECURSIVE); |
| 4506 if( pNew->mutex==0 ){ |
| 4507 rc = SQLITE_NOMEM; |
| 4508 }else{ |
| 4509 rc = sqlite3_vfs_register(&pNew->base, 0); |
| 4510 } |
| 4511 } |
| 4512 |
| 4513 if( rc!=SQLITE_OK ){ |
| 4514 sqlite3_mutex_free(pNew->mutex); |
| 4515 sqlite3_free(pNew); |
| 4516 } |
| 4517 } |
| 4518 |
| 4519 return rc; |
| 4520 } |
| 4521 |
| 4522 |
| 4523 /**************************************************************************/ |
| 4524 |
| 4525 #endif /* !defined(SQLITE_CORE) || defined(SQLITE_ENABLE_RBU) */ |
| 4526 |
| 4527 /************** End of sqlite3rbu.c ******************************************/ |
| 4528 /************** Begin file dbstat.c ******************************************/ |
| 4529 /* |
| 4530 ** 2010 July 12 |
| 4531 ** |
| 4532 ** The author disclaims copyright to this source code. In place of |
| 4533 ** a legal notice, here is a blessing: |
| 4534 ** |
| 4535 ** May you do good and not evil. |
| 4536 ** May you find forgiveness for yourself and forgive others. |
| 4537 ** May you share freely, never taking more than you give. |
| 4538 ** |
| 4539 ****************************************************************************** |
| 4540 ** |
| 4541 ** This file contains an implementation of the "dbstat" virtual table. |
| 4542 ** |
| 4543 ** The dbstat virtual table is used to extract low-level formatting |
| 4544 ** information from an SQLite database in order to implement the |
| 4545 ** "sqlite3_analyzer" utility. See the ../tool/spaceanal.tcl script |
| 4546 ** for an example implementation. |
| 4547 ** |
| 4548 ** Additional information is available on the "dbstat.html" page of the |
| 4549 ** official SQLite documentation. |
| 4550 */ |
| 4551 |
| 4552 /* #include "sqliteInt.h" ** Requires access to internal data structures ** */ |
| 4553 #if (defined(SQLITE_ENABLE_DBSTAT_VTAB) || defined(SQLITE_TEST)) \ |
| 4554 && !defined(SQLITE_OMIT_VIRTUALTABLE) |
| 4555 |
| 4556 /* |
| 4557 ** Page paths: |
| 4558 ** |
| 4559 ** The value of the 'path' column describes the path taken from the |
| 4560 ** root-node of the b-tree structure to each page. The value of the |
| 4561 ** root-node path is '/'. |
| 4562 ** |
| 4563 ** The value of the path for the left-most child page of the root of |
| 4564 ** a b-tree is '/000/'. (Btrees store content ordered from left to right |
| 4565 ** so the pages to the left have smaller keys than the pages to the right.) |
| 4566 ** The next to left-most child of the root page is |
| 4567 ** '/001/', and so on, each sibling page identified by a 3-digit hex |
| 4568 ** value. The children of the 451st left-most sibling have paths such |
| 4569 ** as '/1c2/000/', '/1c2/001/' etc. |
| 4570 ** |
| 4571 ** Overflow pages are specified by appending a '+' character and a |
| 4572 ** six-digit hexadecimal value to the path to the cell they are linked |
| 4573 ** from. For example, the three overflow pages in a chain linked from |
| 4574 ** the left-most cell of the 451st child of the root page are identified |
| 4575 ** by the paths: |
| 4576 ** |
| 4577 ** '/1c2/000+000000' // First page in overflow chain |
| 4578 ** '/1c2/000+000001' // Second page in overflow chain |
| 4579 ** '/1c2/000+000002' // Third page in overflow chain |
| 4580 ** |
| 4581 ** If the paths are sorted using the BINARY collation sequence, then |
| 4582 ** the overflow pages associated with a cell will appear earlier in the |
| 4583 ** sort-order than its child page: |
| 4584 ** |
| 4585 ** '/1c2/000/' // Left-most child of 451st child of root |
| 4586 */ |
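The path scheme described above can be reproduced with ordinary `printf`-style formatting: three hex digits per child step and six per overflow index. The sketch below builds such paths and relies on '+' (0x2B) sorting before '/' (0x2F) in a byte-wise comparison, which is why overflow pages sort ahead of the child page. The `demo_` helpers are hypothetical; the listing itself uses `sqlite3_mprintf()`.

```c
#include <stdio.h>
#include <string.h>

/* Write the path of child number iCell of the page at zParent into zOut. */
void demo_child_path(char *zOut, size_t nOut, const char *zParent, int iCell){
  snprintf(zOut, nOut, "%s%.3x/", zParent, (unsigned)iCell);
}

/* Write the path of overflow page iOvfl in the chain linked from cell
** iCell of the page at zParent into zOut. */
void demo_ovfl_path(char *zOut, size_t nOut, const char *zParent,
                    int iCell, int iOvfl){
  snprintf(zOut, nOut, "%s%.3x+%.6x", zParent, (unsigned)iCell,
           (unsigned)iOvfl);
}
```

Child index 450 is 0x1c2, so the 451st child of the root gets the path '/1c2/'. Comparing '/1c2/000+000001' with '/1c2/000/' using `strcmp()` orders the overflow page first.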
| 4587 #define VTAB_SCHEMA \ |
| 4588 "CREATE TABLE xx( " \ |
| 4589 " name STRING, /* Name of table or index */" \ |
| 4590 " path TEXT, /* Path to page from root */" \ |
| 4591 " pageno INTEGER, /* Page number */" \ |
| 4592 " pagetype STRING, /* 'internal', 'leaf' or 'overflow' */" \ |
| 4593 " ncell INTEGER, /* Cells on page (0 for overflow) */" \ |
| 4594 " payload INTEGER, /* Bytes of payload on this page */" \ |
| 4595 " unused INTEGER, /* Bytes of unused space on this page */" \ |
| 4596 " mx_payload INTEGER, /* Largest payload size of all cells */" \ |
| 4597 " pgoffset INTEGER, /* Offset of page in file */" \ |
| 4598 " pgsize INTEGER, /* Size of the page */" \ |
| 4599 " schema TEXT HIDDEN /* Database schema being analyzed */" \ |
| 4600 ");" |
| 4601 |
| 4602 |
| 4603 typedef struct StatTable StatTable; |
| 4604 typedef struct StatCursor StatCursor; |
| 4605 typedef struct StatPage StatPage; |
| 4606 typedef struct StatCell StatCell; |
| 4607 |
| 4608 struct StatCell { |
| 4609 int nLocal; /* Bytes of local payload */ |
| 4610 u32 iChildPg; /* Child node (or 0 if this is a leaf) */ |
| 4611 int nOvfl; /* Entries in aOvfl[] */ |
| 4612 u32 *aOvfl; /* Array of overflow page numbers */ |
| 4613 int nLastOvfl; /* Bytes of payload on final overflow page */ |
| 4614 int iOvfl; /* Iterates through aOvfl[] */ |
| 4615 }; |
| 4616 |
| 4617 struct StatPage { |
| 4618 u32 iPgno; |
| 4619 DbPage *pPg; |
| 4620 int iCell; |
| 4621 |
| 4622 char *zPath; /* Path to this page */ |
| 4623 |
| 4624 /* Variables populated by statDecodePage(): */ |
| 4625 u8 flags; /* Copy of flags byte */ |
| 4626 int nCell; /* Number of cells on page */ |
| 4627 int nUnused; /* Number of unused bytes on page */ |
| 4628 StatCell *aCell; /* Array of parsed cells */ |
| 4629 u32 iRightChildPg; /* Right-child page number (or 0) */ |
| 4630 int nMxPayload; /* Largest payload of any cell on this page */ |
| 4631 }; |
| 4632 |
| 4633 struct StatCursor { |
| 4634 sqlite3_vtab_cursor base; |
| 4635 sqlite3_stmt *pStmt; /* Iterates through set of root pages */ |
| 4636 int isEof; /* After pStmt has returned SQLITE_DONE */ |
| 4637 int iDb; /* Schema used for this query */ |
| 4638 |
| 4639 StatPage aPage[32]; |
| 4640 int iPage; /* Current entry in aPage[] */ |
| 4641 |
| 4642 /* Values to return. */ |
| 4643 char *zName; /* Value of 'name' column */ |
| 4644 char *zPath; /* Value of 'path' column */ |
| 4645 u32 iPageno; /* Value of 'pageno' column */ |
| 4646 char *zPagetype; /* Value of 'pagetype' column */ |
| 4647 int nCell; /* Value of 'ncell' column */ |
| 4648 int nPayload; /* Value of 'payload' column */ |
| 4649 int nUnused; /* Value of 'unused' column */ |
| 4650 int nMxPayload; /* Value of 'mx_payload' column */ |
| 4651 i64 iOffset; /* Value of 'pgoffset' column */ |
| 4652 int szPage; /* Value of 'pgsize' column */ |
| 4653 }; |
| 4654 |
| 4655 struct StatTable { |
| 4656 sqlite3_vtab base; |
| 4657 sqlite3 *db; |
| 4658 int iDb; /* Index of database to analyze */ |
| 4659 }; |
| 4660 |
| 4661 #ifndef get2byte |
| 4662 # define get2byte(x) ((x)[0]<<8 | (x)[1]) |
| 4663 #endif |
| 4664 |
| 4665 /* |
| 4666 ** Connect to or create a dbstat virtual table. |
| 4667 */ |
| 4668 static int statConnect( |
| 4669 sqlite3 *db, |
| 4670 void *pAux, |
| 4671 int argc, const char *const*argv, |
| 4672 sqlite3_vtab **ppVtab, |
| 4673 char **pzErr |
| 4674 ){ |
| 4675 StatTable *pTab = 0; |
| 4676 int rc = SQLITE_OK; |
| 4677 int iDb; |
| 4678 |
| 4679 if( argc>=4 ){ |
| 4680 iDb = sqlite3FindDbName(db, argv[3]); |
| 4681 if( iDb<0 ){ |
| 4682 *pzErr = sqlite3_mprintf("no such database: %s", argv[3]); |
| 4683 return SQLITE_ERROR; |
| 4684 } |
| 4685 }else{ |
| 4686 iDb = 0; |
| 4687 } |
| 4688 rc = sqlite3_declare_vtab(db, VTAB_SCHEMA); |
| 4689 if( rc==SQLITE_OK ){ |
| 4690 pTab = (StatTable *)sqlite3_malloc64(sizeof(StatTable)); |
| 4691 if( pTab==0 ) rc = SQLITE_NOMEM; |
| 4692 } |
| 4693 |
| 4694 assert( rc==SQLITE_OK || pTab==0 ); |
| 4695 if( rc==SQLITE_OK ){ |
| 4696 memset(pTab, 0, sizeof(StatTable)); |
| 4697 pTab->db = db; |
| 4698 pTab->iDb = iDb; |
| 4699 } |
| 4700 |
| 4701 *ppVtab = (sqlite3_vtab*)pTab; |
| 4702 return rc; |
| 4703 } |
| 4704 |
| 4705 /* |
| 4706 ** Disconnect from or destroy a dbstat virtual table. |
| 4707 */ |
| 4708 static int statDisconnect(sqlite3_vtab *pVtab){ |
| 4709 sqlite3_free(pVtab); |
| 4710 return SQLITE_OK; |
| 4711 } |
| 4712 |
| 4713 /* |
| 4714 ** There is no "best-index". This virtual table always does a linear |
| 4715 ** scan. However, a schema=? constraint should cause this table to |
| 4716 ** operate on a different database schema, so check for it. |
| 4717 ** |
| 4718 ** idxNum is normally 0, but will be 1 if a schema=? constraint exists. |
| 4719 */ |
| 4720 static int statBestIndex(sqlite3_vtab *tab, sqlite3_index_info *pIdxInfo){ |
| 4721 int i; |
| 4722 |
| 4723 pIdxInfo->estimatedCost = 1.0e6; /* Initial cost estimate */ |
| 4724 |
| 4725 /* Look for a valid schema=? constraint. If found, change the idxNum to |
| 4726 ** 1 and request the value of that constraint be sent to xFilter. And |
| 4727 ** lower the cost estimate to encourage the constrained version to be |
| 4728 ** used. |
| 4729 */ |
| 4730 for(i=0; i<pIdxInfo->nConstraint; i++){ |
| 4731 if( pIdxInfo->aConstraint[i].usable==0 ) continue; |
| 4732 if( pIdxInfo->aConstraint[i].op!=SQLITE_INDEX_CONSTRAINT_EQ ) continue; |
| 4733 if( pIdxInfo->aConstraint[i].iColumn!=10 ) continue; |
| 4734 pIdxInfo->idxNum = 1; |
| 4735 pIdxInfo->estimatedCost = 1.0; |
| 4736 pIdxInfo->aConstraintUsage[i].argvIndex = 1; |
| 4737 pIdxInfo->aConstraintUsage[i].omit = 1; |
| 4738 break; |
| 4739 } |
| 4740 |
| 4741 |
| 4742 /* Records are always returned in ascending order of (name, path). |
| 4743 ** If this will satisfy the client, set the orderByConsumed flag so that |
| 4744 ** SQLite does not do an external sort. |
| 4745 */ |
| 4746 if( ( pIdxInfo->nOrderBy==1 |
| 4747 && pIdxInfo->aOrderBy[0].iColumn==0 |
| 4748 && pIdxInfo->aOrderBy[0].desc==0 |
| 4749 ) || |
| 4750 ( pIdxInfo->nOrderBy==2 |
| 4751 && pIdxInfo->aOrderBy[0].iColumn==0 |
| 4752 && pIdxInfo->aOrderBy[0].desc==0 |
| 4753 && pIdxInfo->aOrderBy[1].iColumn==1 |
| 4754 && pIdxInfo->aOrderBy[1].desc==0 |
| 4755 ) |
| 4756 ){ |
| 4757 pIdxInfo->orderByConsumed = 1; |
| 4758 } |
| 4759 |
| 4760 return SQLITE_OK; |
| 4761 } |
| 4762 |
| 4763 /* |
| 4764 ** Open a new dbstat cursor. |
| 4765 */ |
| 4766 static int statOpen(sqlite3_vtab *pVTab, sqlite3_vtab_cursor **ppCursor){ |
| 4767 StatTable *pTab = (StatTable *)pVTab; |
| 4768 StatCursor *pCsr; |
| 4769 |
| 4770 pCsr = (StatCursor *)sqlite3_malloc64(sizeof(StatCursor)); |
| 4771 if( pCsr==0 ){ |
| 4772 return SQLITE_NOMEM; |
| 4773 }else{ |
| 4774 memset(pCsr, 0, sizeof(StatCursor)); |
| 4775 pCsr->base.pVtab = pVTab; |
| 4776 pCsr->iDb = pTab->iDb; |
| 4777 } |
| 4778 |
| 4779 *ppCursor = (sqlite3_vtab_cursor *)pCsr; |
| 4780 return SQLITE_OK; |
| 4781 } |
| 4782 |
| 4783 static void statClearPage(StatPage *p){ |
| 4784 int i; |
| 4785 if( p->aCell ){ |
| 4786 for(i=0; i<p->nCell; i++){ |
| 4787 sqlite3_free(p->aCell[i].aOvfl); |
| 4788 } |
| 4789 sqlite3_free(p->aCell); |
| 4790 } |
| 4791 sqlite3PagerUnref(p->pPg); |
| 4792 sqlite3_free(p->zPath); |
| 4793 memset(p, 0, sizeof(StatPage)); |
| 4794 } |
| 4795 |
| 4796 static void statResetCsr(StatCursor *pCsr){ |
| 4797 int i; |
| 4798 sqlite3_reset(pCsr->pStmt); |
| 4799 for(i=0; i<ArraySize(pCsr->aPage); i++){ |
| 4800 statClearPage(&pCsr->aPage[i]); |
| 4801 } |
| 4802 pCsr->iPage = 0; |
| 4803 sqlite3_free(pCsr->zPath); |
| 4804 pCsr->zPath = 0; |
| 4805 pCsr->isEof = 0; |
| 4806 } |
| 4807 |
| 4808 /* |
| 4809 ** Close a dbstat cursor. |
| 4810 */ |
| 4811 static int statClose(sqlite3_vtab_cursor *pCursor){ |
| 4812 StatCursor *pCsr = (StatCursor *)pCursor; |
| 4813 statResetCsr(pCsr); |
| 4814 sqlite3_finalize(pCsr->pStmt); |
| 4815 sqlite3_free(pCsr); |
| 4816 return SQLITE_OK; |
| 4817 } |
| 4818 |
| 4819 static void getLocalPayload( |
| 4820 int nUsable, /* Usable bytes per page */ |
| 4821 u8 flags, /* Page flags */ |
| 4822 int nTotal, /* Total record (payload) size */ |
| 4823 int *pnLocal /* OUT: Bytes stored locally */ |
| 4824 ){ |
| 4825 int nLocal; |
| 4826 int nMinLocal; |
| 4827 int nMaxLocal; |
| 4828 |
| 4829 if( flags==0x0D ){ /* Table leaf node */ |
| 4830 nMinLocal = (nUsable - 12) * 32 / 255 - 23; |
| 4831 nMaxLocal = nUsable - 35; |
| 4832 }else{ /* Index interior and leaf nodes */ |
| 4833 nMinLocal = (nUsable - 12) * 32 / 255 - 23; |
| 4834 nMaxLocal = (nUsable - 12) * 64 / 255 - 23; |
| 4835 } |
| 4836 |
| 4837 nLocal = nMinLocal + (nTotal - nMinLocal) % (nUsable - 4); |
| 4838 if( nLocal>nMaxLocal ) nLocal = nMinLocal; |
| 4839 *pnLocal = nLocal; |
| 4840 } |
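The arithmetic in getLocalPayload() is easier to follow with concrete numbers. The sketch below restates the same formula as a function that returns the local byte count. `demo_local_payload` is a hypothetical name and the added assert is not in the listing.

```c
#include <assert.h>

/* Bytes of an nTotal-byte payload stored on the b-tree page itself, for
** a page with nUsable usable bytes. flags is the page-type byte: 0x0D is
** a table leaf, anything else is treated as an index page. */
int demo_local_payload(int nUsable, unsigned char flags, int nTotal){
  int nMinLocal = (nUsable - 12) * 32 / 255 - 23;
  int nMaxLocal;
  int nLocal;
  assert( nUsable>4 );
  if( flags==0x0D ){
    nMaxLocal = nUsable - 35;
  }else{
    nMaxLocal = (nUsable - 12) * 64 / 255 - 23;
  }
  /* Try to leave a spill that fills whole overflow pages; fall back to
  ** the minimum when that would exceed the per-page maximum. */
  nLocal = nMinLocal + (nTotal - nMinLocal) % (nUsable - 4);
  if( nLocal>nMaxLocal ) nLocal = nMinLocal;
  return nLocal;
}
```

With 1024 usable bytes on a table leaf, the minimum is 103 and the maximum is 989. A 2000-byte payload keeps 103 + (1897 % 1020) = 980 bytes locally, while a 1010-byte payload would compute 1010, exceed the maximum, and fall back to 103.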
| 4841 |
| 4842 static int statDecodePage(Btree *pBt, StatPage *p){ |
| 4843 int nUnused; |
| 4844 int iOff; |
| 4845 int nHdr; |
| 4846 int isLeaf; |
| 4847 int szPage; |
| 4848 |
| 4849 u8 *aData = sqlite3PagerGetData(p->pPg); |
| 4850 u8 *aHdr = &aData[p->iPgno==1 ? 100 : 0]; |
| 4851 |
| 4852 p->flags = aHdr[0]; |
| 4853 p->nCell = get2byte(&aHdr[3]); |
| 4854 p->nMxPayload = 0; |
| 4855 |
| 4856 isLeaf = (p->flags==0x0A || p->flags==0x0D); |
| 4857 nHdr = 12 - isLeaf*4 + (p->iPgno==1)*100; |
| 4858 |
| 4859 nUnused = get2byte(&aHdr[5]) - nHdr - 2*p->nCell; |
| 4860 nUnused += (int)aHdr[7]; |
| 4861 iOff = get2byte(&aHdr[1]); |
| 4862 while( iOff ){ |
| 4863 nUnused += get2byte(&aData[iOff+2]); |
| 4864 iOff = get2byte(&aData[iOff]); |
| 4865 } |
| 4866 p->nUnused = nUnused; |
| 4867 p->iRightChildPg = isLeaf ? 0 : sqlite3Get4byte(&aHdr[8]); |
| 4868 szPage = sqlite3BtreeGetPageSize(pBt); |
| 4869 |
| 4870 if( p->nCell ){ |
| 4871 int i; /* Used to iterate through cells */ |
| 4872 int nUsable; /* Usable bytes per page */ |
| 4873 |
| 4874 sqlite3BtreeEnter(pBt); |
| 4875 nUsable = szPage - sqlite3BtreeGetReserveNoMutex(pBt); |
| 4876 sqlite3BtreeLeave(pBt); |
| 4877 p->aCell = sqlite3_malloc64((p->nCell+1) * sizeof(StatCell)); |
| 4878 if( p->aCell==0 ) return SQLITE_NOMEM; |
| 4879 memset(p->aCell, 0, (p->nCell+1) * sizeof(StatCell)); |
| 4880 |
| 4881 for(i=0; i<p->nCell; i++){ |
| 4882 StatCell *pCell = &p->aCell[i]; |
| 4883 |
| 4884 iOff = get2byte(&aData[nHdr+i*2]); |
| 4885 if( !isLeaf ){ |
| 4886 pCell->iChildPg = sqlite3Get4byte(&aData[iOff]); |
| 4887 iOff += 4; |
| 4888 } |
| 4889 if( p->flags==0x05 ){ |
| 4890 /* A table interior node. nPayload==0. */ |
| 4891 }else{ |
| 4892 u32 nPayload; /* Bytes of payload total (local+overflow) */ |
| 4893 int nLocal; /* Bytes of payload stored locally */ |
| 4894 iOff += getVarint32(&aData[iOff], nPayload); |
| 4895 if( p->flags==0x0D ){ |
| 4896 u64 dummy; |
| 4897 iOff += sqlite3GetVarint(&aData[iOff], &dummy); |
| 4898 } |
| 4899 if( nPayload>(u32)p->nMxPayload ) p->nMxPayload = nPayload; |
| 4900 getLocalPayload(nUsable, p->flags, nPayload, &nLocal); |
| 4901 pCell->nLocal = nLocal; |
| 4902 assert( nLocal>=0 ); |
| 4903 assert( nPayload>=(u32)nLocal ); |
| 4904 assert( nLocal<=(nUsable-35) ); |
| 4905 if( nPayload>(u32)nLocal ){ |
| 4906 int j; |
| 4907 int nOvfl = ((nPayload - nLocal) + nUsable-4 - 1) / (nUsable - 4); |
| 4908 pCell->nLastOvfl = (nPayload-nLocal) - (nOvfl-1) * (nUsable-4); |
| 4909 pCell->nOvfl = nOvfl; |
| 4910 pCell->aOvfl = sqlite3_malloc64(sizeof(u32)*nOvfl); |
| 4911 if( pCell->aOvfl==0 ) return SQLITE_NOMEM; |
| 4912 pCell->aOvfl[0] = sqlite3Get4byte(&aData[iOff+nLocal]); |
| 4913 for(j=1; j<nOvfl; j++){ |
| 4914 int rc; |
| 4915 u32 iPrev = pCell->aOvfl[j-1]; |
| 4916 DbPage *pPg = 0; |
| 4917 rc = sqlite3PagerGet(sqlite3BtreePager(pBt), iPrev, &pPg, 0); |
| 4918 if( rc!=SQLITE_OK ){ |
| 4919 assert( pPg==0 ); |
| 4920 return rc; |
| 4921 } |
| 4922 pCell->aOvfl[j] = sqlite3Get4byte(sqlite3PagerGetData(pPg)); |
| 4923 sqlite3PagerUnref(pPg); |
| 4924 } |
| 4925 } |
| 4926 } |
| 4927 } |
| 4928 } |
| 4929 |
| 4930 return SQLITE_OK; |
| 4931 } |
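Once the local byte count is known, statDecodePage() derives the length of the overflow chain and the number of payload bytes on its final page. Each overflow page carries `nUsable-4` payload bytes because its first four bytes hold the next page number. A minimal sketch of that calculation follows, with hypothetical names `DemoOvfl` and `demo_overflow_layout`.

```c
#include <assert.h>

typedef struct DemoOvfl {
  int nOvfl;       /* Number of overflow pages in the chain */
  int nLastOvfl;   /* Payload bytes on the final overflow page */
} DemoOvfl;

/* Describe the overflow chain for a cell holding nPayload bytes in
** total, nLocal of which are stored on the b-tree page itself. */
DemoOvfl demo_overflow_layout(int nPayload, int nLocal, int nUsable){
  DemoOvfl r = {0, 0};
  int nSpill = nPayload - nLocal;   /* Bytes that do not fit locally */
  int nPerPage = nUsable - 4;       /* Payload capacity per overflow page */
  assert( nPerPage>0 && nSpill>=0 );
  if( nSpill>0 ){
    r.nOvfl = (nSpill + nPerPage - 1) / nPerPage;       /* Round up */
    r.nLastOvfl = nSpill - (r.nOvfl-1) * nPerPage;
  }
  return r;
}
```

For a 2030-byte payload with 103 local bytes and 1024 usable bytes per page, 1927 bytes spill over into two overflow pages, the second holding 907 bytes.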
| 4932 |
| 4933 /* |
| 4934 ** Populate the pCsr->iOffset and pCsr->szPage member variables based on |
| 4935 ** the current value of pCsr->iPageno. |
| 4936 */ |
| 4937 static void statSizeAndOffset(StatCursor *pCsr){ |
| 4938 StatTable *pTab = (StatTable *)((sqlite3_vtab_cursor *)pCsr)->pVtab; |
| 4939 Btree *pBt = pTab->db->aDb[pTab->iDb].pBt; |
| 4940 Pager *pPager = sqlite3BtreePager(pBt); |
| 4941 sqlite3_file *fd; |
| 4942 sqlite3_int64 x[2]; |
| 4943 |
| 4944 /* The default page size and offset */ |
| 4945 pCsr->szPage = sqlite3BtreeGetPageSize(pBt); |
| 4946 pCsr->iOffset = (i64)pCsr->szPage * (pCsr->iPageno - 1); |
| 4947 |
| 4948 /* If connected to a ZIPVFS backend, override the page size and |
| 4949 ** offset with actual values obtained from ZIPVFS. |
| 4950 */ |
| 4951 fd = sqlite3PagerFile(pPager); |
| 4952 x[0] = pCsr->iPageno; |
| 4953 if( fd->pMethods!=0 && sqlite3OsFileControl(fd, 230440, &x)==SQLITE_OK ){ |
| 4954 pCsr->iOffset = x[0]; |
| 4955 pCsr->szPage = (int)x[1]; |
| 4956 } |
| 4957 } |
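The default offset computed above widens the page size to 64 bits before multiplying, because the product of page size and page number can exceed the range of a 32-bit int on large databases. A minimal sketch, with the hypothetical name `demo_page_offset`:

```c
#include <stdint.h>

/* Byte offset of 1-based page iPageno in a file of szPage-byte pages.
** The multiplication is done in 64 bits to avoid int overflow. */
int64_t demo_page_offset(int szPage, uint32_t iPageno){
  return (int64_t)szPage * ((int64_t)iPageno - 1);
}
```

Page 1 starts at offset 0. With 65536-byte pages, page 100000 starts at byte 6553534464, which does not fit in 32 bits.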
| 4958 |
| 4959 /* |
| 4960 ** Move a dbstat cursor to the next entry in the file. |
| 4961 */ |
| 4962 static int statNext(sqlite3_vtab_cursor *pCursor){ |
| 4963 int rc; |
| 4964 int nPayload; |
| 4965 char *z; |
| 4966 StatCursor *pCsr = (StatCursor *)pCursor; |
| 4967 StatTable *pTab = (StatTable *)pCursor->pVtab; |
| 4968 Btree *pBt = pTab->db->aDb[pCsr->iDb].pBt; |
| 4969 Pager *pPager = sqlite3BtreePager(pBt); |
| 4970 |
| 4971 sqlite3_free(pCsr->zPath); |
| 4972 pCsr->zPath = 0; |
| 4973 |
| 4974 statNextRestart: |
| 4975 if( pCsr->aPage[0].pPg==0 ){ |
| 4976 rc = sqlite3_step(pCsr->pStmt); |
| 4977 if( rc==SQLITE_ROW ){ |
| 4978 int nPage; |
| 4979 u32 iRoot = (u32)sqlite3_column_int64(pCsr->pStmt, 1); |
| 4980 sqlite3PagerPagecount(pPager, &nPage); |
| 4981 if( nPage==0 ){ |
| 4982 pCsr->isEof = 1; |
| 4983 return sqlite3_reset(pCsr->pStmt); |
| 4984 } |
| 4985 rc = sqlite3PagerGet(pPager, iRoot, &pCsr->aPage[0].pPg, 0); |
| 4986 pCsr->aPage[0].iPgno = iRoot; |
| 4987 pCsr->aPage[0].iCell = 0; |
| 4988 pCsr->aPage[0].zPath = z = sqlite3_mprintf("/"); |
| 4989 pCsr->iPage = 0; |
| 4990 if( z==0 ) rc = SQLITE_NOMEM; |
| 4991 }else{ |
| 4992 pCsr->isEof = 1; |
| 4993 return sqlite3_reset(pCsr->pStmt); |
| 4994 } |
| 4995 }else{ |
| 4996 |
| 4997 /* Page p itself has already been visited. */ |
| 4998 StatPage *p = &pCsr->aPage[pCsr->iPage]; |
| 4999 |
| 5000 while( p->iCell<p->nCell ){ |
| 5001 StatCell *pCell = &p->aCell[p->iCell]; |
| 5002 if( pCell->iOvfl<pCell->nOvfl ){ |
| 5003 int nUsable; |
| 5004 sqlite3BtreeEnter(pBt); |
| 5005 nUsable = sqlite3BtreeGetPageSize(pBt) - |
| 5006 sqlite3BtreeGetReserveNoMutex(pBt); |
| 5007 sqlite3BtreeLeave(pBt); |
| 5008 pCsr->zName = (char *)sqlite3_column_text(pCsr->pStmt, 0); |
| 5009 pCsr->iPageno = pCell->aOvfl[pCell->iOvfl]; |
| 5010 pCsr->zPagetype = "overflow"; |
| 5011 pCsr->nCell = 0; |
| 5012 pCsr->nMxPayload = 0; |
| 5013 pCsr->zPath = z = sqlite3_mprintf( |
| 5014 "%s%.3x+%.6x", p->zPath, p->iCell, pCell->iOvfl |
| 5015 ); |
| 5016 if( pCell->iOvfl<pCell->nOvfl-1 ){ |
| 5017 pCsr->nUnused = 0; |
| 5018 pCsr->nPayload = nUsable - 4; |
| 5019 }else{ |
| 5020 pCsr->nPayload = pCell->nLastOvfl; |
| 5021 pCsr->nUnused = nUsable - 4 - pCsr->nPayload; |
| 5022 } |
| 5023 pCell->iOvfl++; |
| 5024 statSizeAndOffset(pCsr); |
| 5025 return z==0 ? SQLITE_NOMEM : SQLITE_OK; |
| 5026 } |
| 5027 if( p->iRightChildPg ) break; |
| 5028 p->iCell++; |
| 5029 } |
| 5030 |
| 5031 if( !p->iRightChildPg || p->iCell>p->nCell ){ |
| 5032 statClearPage(p); |
| 5033 if( pCsr->iPage==0 ) return statNext(pCursor); |
| 5034 pCsr->iPage--; |
| 5035 goto statNextRestart; /* Tail recursion */ |
| 5036 } |
| 5037 pCsr->iPage++; |
| 5038 assert( p==&pCsr->aPage[pCsr->iPage-1] ); |
| 5039 |
| 5040 if( p->iCell==p->nCell ){ |
| 5041 p[1].iPgno = p->iRightChildPg; |
| 5042 }else{ |
| 5043 p[1].iPgno = p->aCell[p->iCell].iChildPg; |
| 5044 } |
| 5045 rc = sqlite3PagerGet(pPager, p[1].iPgno, &p[1].pPg, 0); |
| 5046 p[1].iCell = 0; |
| 5047 p[1].zPath = z = sqlite3_mprintf("%s%.3x/", p->zPath, p->iCell); |
| 5048 p->iCell++; |
| 5049 if( z==0 ) rc = SQLITE_NOMEM; |
| 5050 } |
| 5051 |
| 5052 |
| 5053 /* Populate the StatCursor fields with the values to be returned |
| 5054 ** by the xColumn() and xRowid() methods. |
| 5055 */ |
| 5056 if( rc==SQLITE_OK ){ |
| 5057 int i; |
| 5058 StatPage *p = &pCsr->aPage[pCsr->iPage]; |
| 5059 pCsr->zName = (char *)sqlite3_column_text(pCsr->pStmt, 0); |
| 5060 pCsr->iPageno = p->iPgno; |
| 5061 |
| 5062 rc = statDecodePage(pBt, p); |
| 5063 if( rc==SQLITE_OK ){ |
| 5064 statSizeAndOffset(pCsr); |
| 5065 |
| 5066 switch( p->flags ){ |
| 5067 case 0x05: /* table internal */ |
| 5068 case 0x02: /* index internal */ |
| 5069 pCsr->zPagetype = "internal"; |
| 5070 break; |
| 5071 case 0x0D: /* table leaf */ |
| 5072 case 0x0A: /* index leaf */ |
| 5073 pCsr->zPagetype = "leaf"; |
| 5074 break; |
| 5075 default: |
| 5076 pCsr->zPagetype = "corrupted"; |
| 5077 break; |
| 5078 } |
| 5079 pCsr->nCell = p->nCell; |
| 5080 pCsr->nUnused = p->nUnused; |
| 5081 pCsr->nMxPayload = p->nMxPayload; |
| 5082 pCsr->zPath = z = sqlite3_mprintf("%s", p->zPath); |
| 5083 if( z==0 ) rc = SQLITE_NOMEM; |
| 5084 nPayload = 0; |
| 5085 for(i=0; i<p->nCell; i++){ |
| 5086 nPayload += p->aCell[i].nLocal; |
| 5087 } |
| 5088 pCsr->nPayload = nPayload; |
| 5089 } |
| 5090 } |
| 5091 |
| 5092 return rc; |
| 5093 } |
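The path column that statNext() builds uses two format strings: "%s%.3x/" for a child page reached through a cell, and "%s%.3x+%.6x" for an overflow page hanging off a cell. The following is a minimal sketch of those two formats outside SQLite, using standard snprintf() instead of sqlite3_mprintf(). The helper names demoChildPath() and demoOverflowPath() are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: path of the child page reached through cell
** iCell of the page whose path is zParent ("%s%.3x/" in statNext()). */
static void demoChildPath(char *zOut, size_t nOut,
                          const char *zParent, int iCell){
  assert( zOut!=0 && nOut>0 );
  snprintf(zOut, nOut, "%s%.3x/", zParent, (unsigned)iCell);
}

/* Hypothetical helper: path of overflow page number iOvfl in the
** overflow chain of cell iCell ("%s%.3x+%.6x" in statNext()). */
static void demoOverflowPath(char *zOut, size_t nOut,
                             const char *zParent, int iCell, int iOvfl){
  assert( zOut!=0 && nOut>0 );
  snprintf(zOut, nOut, "%s%.3x+%.6x", zParent,
           (unsigned)iCell, (unsigned)iOvfl);
}
```

The root page has the path "/", so the child reached through cell 1 of the root would be "/001/", and the first overflow page of cell 10 on that child would be "/001/00a+000000".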
| 5094 |
| 5095 static int statEof(sqlite3_vtab_cursor *pCursor){ |
| 5096 StatCursor *pCsr = (StatCursor *)pCursor; |
| 5097 return pCsr->isEof; |
| 5098 } |
| 5099 |
| 5100 static int statFilter( |
| 5101 sqlite3_vtab_cursor *pCursor, |
| 5102 int idxNum, const char *idxStr, |
| 5103 int argc, sqlite3_value **argv |
| 5104 ){ |
| 5105 StatCursor *pCsr = (StatCursor *)pCursor; |
| 5106 StatTable *pTab = (StatTable*)(pCursor->pVtab); |
| 5107 char *zSql; |
| 5108 int rc = SQLITE_OK; |
| 5109 char *zMaster; |
| 5110 |
| 5111 if( idxNum==1 ){ |
| 5112 const char *zDbase = (const char*)sqlite3_value_text(argv[0]); |
| 5113 pCsr->iDb = sqlite3FindDbName(pTab->db, zDbase); |
| 5114 if( pCsr->iDb<0 ){ |
| 5115 sqlite3_free(pCursor->pVtab->zErrMsg); |
| 5116 pCursor->pVtab->zErrMsg = sqlite3_mprintf("no such schema: %s", zDbase); |
| 5117 return pCursor->pVtab->zErrMsg ? SQLITE_ERROR : SQLITE_NOMEM; |
| 5118 } |
| 5119 }else{ |
| 5120 pCsr->iDb = pTab->iDb; |
| 5121 } |
| 5122 statResetCsr(pCsr); |
| 5123 sqlite3_finalize(pCsr->pStmt); |
| 5124 pCsr->pStmt = 0; |
| 5125 zMaster = pCsr->iDb==1 ? "sqlite_temp_master" : "sqlite_master"; |
| 5126 zSql = sqlite3_mprintf( |
| 5127 "SELECT 'sqlite_master' AS name, 1 AS rootpage, 'table' AS type" |
| 5128 " UNION ALL " |
| 5129 "SELECT name, rootpage, type" |
| 5130 " FROM \"%w\".%s WHERE rootpage!=0" |
| 5131 " ORDER BY name", pTab->db->aDb[pCsr->iDb].zName, zMaster); |
| 5132 if( zSql==0 ){ |
| 5133 return SQLITE_NOMEM; |
| 5134 }else{ |
| 5135 rc = sqlite3_prepare_v2(pTab->db, zSql, -1, &pCsr->pStmt, 0); |
| 5136 sqlite3_free(zSql); |
| 5137 } |
| 5138 |
| 5139 if( rc==SQLITE_OK ){ |
| 5140 rc = statNext(pCursor); |
| 5141 } |
| 5142 return rc; |
| 5143 } |
| 5144 |
| 5145 static int statColumn( |
| 5146 sqlite3_vtab_cursor *pCursor, |
| 5147 sqlite3_context *ctx, |
| 5148 int i |
| 5149 ){ |
| 5150 StatCursor *pCsr = (StatCursor *)pCursor; |
| 5151 switch( i ){ |
| 5152 case 0: /* name */ |
| 5153 sqlite3_result_text(ctx, pCsr->zName, -1, SQLITE_TRANSIENT); |
| 5154 break; |
| 5155 case 1: /* path */ |
| 5156 sqlite3_result_text(ctx, pCsr->zPath, -1, SQLITE_TRANSIENT); |
| 5157 break; |
| 5158 case 2: /* pageno */ |
| 5159 sqlite3_result_int64(ctx, pCsr->iPageno); |
| 5160 break; |
| 5161 case 3: /* pagetype */ |
| 5162 sqlite3_result_text(ctx, pCsr->zPagetype, -1, SQLITE_STATIC); |
| 5163 break; |
| 5164 case 4: /* ncell */ |
| 5165 sqlite3_result_int(ctx, pCsr->nCell); |
| 5166 break; |
| 5167 case 5: /* payload */ |
| 5168 sqlite3_result_int(ctx, pCsr->nPayload); |
| 5169 break; |
| 5170 case 6: /* unused */ |
| 5171 sqlite3_result_int(ctx, pCsr->nUnused); |
| 5172 break; |
| 5173 case 7: /* mx_payload */ |
| 5174 sqlite3_result_int(ctx, pCsr->nMxPayload); |
| 5175 break; |
| 5176 case 8: /* pgoffset */ |
| 5177 sqlite3_result_int64(ctx, pCsr->iOffset); |
| 5178 break; |
| 5179 case 9: /* pgsize */ |
| 5180 sqlite3_result_int(ctx, pCsr->szPage); |
| 5181 break; |
| 5182 default: { /* schema */ |
| 5183 sqlite3 *db = sqlite3_context_db_handle(ctx); |
| 5184 int iDb = pCsr->iDb; |
| 5185 sqlite3_result_text(ctx, db->aDb[iDb].zName, -1, SQLITE_STATIC); |
| 5186 break; |
| 5187 } |
| 5188 } |
| 5189 return SQLITE_OK; |
| 5190 } |
| 5191 |
| 5192 static int statRowid(sqlite3_vtab_cursor *pCursor, sqlite_int64 *pRowid){ |
| 5193 StatCursor *pCsr = (StatCursor *)pCursor; |
| 5194 *pRowid = pCsr->iPageno; |
| 5195 return SQLITE_OK; |
| 5196 } |
| 5197 |
| 5198 /* |
| 5199 ** Invoke this routine to register the "dbstat" virtual table module |
| 5200 */ |
| 5201 SQLITE_PRIVATE int sqlite3DbstatRegister(sqlite3 *db){ |
| 5202 static sqlite3_module dbstat_module = { |
| 5203 0, /* iVersion */ |
| 5204 statConnect, /* xCreate */ |
| 5205 statConnect, /* xConnect */ |
| 5206 statBestIndex, /* xBestIndex */ |
| 5207 statDisconnect, /* xDisconnect */ |
| 5208 statDisconnect, /* xDestroy */ |
| 5209 statOpen, /* xOpen - open a cursor */ |
| 5210 statClose, /* xClose - close a cursor */ |
| 5211 statFilter, /* xFilter - configure scan constraints */ |
| 5212 statNext, /* xNext - advance a cursor */ |
| 5213 statEof, /* xEof - check for end of scan */ |
| 5214 statColumn, /* xColumn - read data */ |
| 5215 statRowid, /* xRowid - read the rowid */ |
| 5216 0, /* xUpdate */ |
| 5217 0, /* xBegin */ |
| 5218 0, /* xSync */ |
| 5219 0, /* xCommit */ |
| 5220 0, /* xRollback */ |
| 5221 0, /* xFindMethod */ |
| 5222 0, /* xRename */ |
| 5223 }; |
| 5224 return sqlite3_create_module(db, "dbstat", &dbstat_module, 0); |
| 5225 } |
| 5226 #elif defined(SQLITE_ENABLE_DBSTAT_VTAB) |
| 5227 SQLITE_PRIVATE int sqlite3DbstatRegister(sqlite3 *db){ return SQLITE_OK; } |
| 5228 #endif /* SQLITE_ENABLE_DBSTAT_VTAB */ |
| 5229 |
| 5230 /************** End of dbstat.c **********************************************/ |
| 5231 /************** Begin file json1.c *******************************************/ |
| 5232 /* |
| 5233 ** 2015-08-12 |
| 5234 ** |
| 5235 ** The author disclaims copyright to this source code. In place of |
| 5236 ** a legal notice, here is a blessing: |
| 5237 ** |
| 5238 ** May you do good and not evil. |
| 5239 ** May you find forgiveness for yourself and forgive others. |
| 5240 ** May you share freely, never taking more than you give. |
| 5241 ** |
| 5242 ****************************************************************************** |
| 5243 ** |
| 5244 ** This SQLite extension implements JSON functions. The interface is |
| 5245 ** modeled after MySQL JSON functions: |
| 5246 ** |
| 5247 ** https://dev.mysql.com/doc/refman/5.7/en/json.html |
| 5248 ** |
| 5249 ** For the time being, all JSON is stored as pure text. (We might add |
| 5250 ** a JSONB type in the future which stores a binary encoding of JSON in |
| 5251 ** a BLOB, but there is no support for JSONB in the current implementation. |
| 5252 ** This implementation parses JSON text at 250 MB/s, so it is hard to see |
| 5253 ** how JSONB might improve on that.) |
| 5254 */ |
| 5255 #if !defined(SQLITE_CORE) || defined(SQLITE_ENABLE_JSON1) |
| 5256 #if !defined(_SQLITEINT_H_) |
| 5257 /* #include "sqlite3ext.h" */ |
| 5258 #endif |
| 5259 SQLITE_EXTENSION_INIT1 |
| 5260 /* #include <assert.h> */ |
| 5261 /* #include <string.h> */ |
| 5262 /* #include <stdlib.h> */ |
| 5263 /* #include <stdarg.h> */ |
| 5264 |
| 5265 #define UNUSED_PARAM(X) (void)(X) |
| 5266 |
| 5267 #ifndef LARGEST_INT64 |
| 5268 # define LARGEST_INT64 (0xffffffff|(((sqlite3_int64)0x7fffffff)<<32)) |
| 5269 # define SMALLEST_INT64 (((sqlite3_int64)-1) - LARGEST_INT64) |
| 5270 #endif |
| 5271 |
| 5272 /* |
| 5273 ** Versions of isspace(), isalnum() and isdigit() to which it is safe |
| 5274 ** to pass signed char values. |
| 5275 */ |
| 5276 #ifdef sqlite3Isdigit |
| 5277 /* Use the SQLite core versions if this routine is part of the |
| 5278 ** SQLite amalgamation */ |
| 5279 # define safe_isdigit(x) sqlite3Isdigit(x) |
| 5280 # define safe_isalnum(x) sqlite3Isalnum(x) |
| 5281 #else |
| 5282 /* Use the standard library for separate compilation */ |
| 5283 #include <ctype.h> /* amalgamator: keep */ |
| 5284 # define safe_isdigit(x) isdigit((unsigned char)(x)) |
| 5285 # define safe_isalnum(x) isalnum((unsigned char)(x)) |
| 5286 #endif |
| 5287 |
| 5288 /* |
| 5289 ** Growing our own isspace() routine this way is twice as fast as |
| 5290 ** the library isspace() function, resulting in a 7% overall performance |
| 5291 ** increase for the parser. (Ubuntu14.10 gcc 4.8.4 x64 with -Os). |
| 5292 */ |
| 5293 static const char jsonIsSpace[] = { |
| 5294 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, |
| 5295 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, |
| 5296 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, |
| 5297 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, |
| 5298 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, |
| 5299 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, |
| 5300 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, |
| 5301 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, |
| 5302 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, |
| 5303 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, |
| 5304 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, |
| 5305 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, |
| 5306 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, |
| 5307 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, |
| 5308 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, |
| 5309 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, |
| 5310 }; |
| 5311 #define safe_isspace(x) (jsonIsSpace[(unsigned char)(x)]) |
| 5312 |
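The table above marks only bytes 9, 10, 13 and 32 (tab, newline, carriage return and space) as whitespace. A minimal sketch of the same lookup-table technique follows, written with C99 designated initializers so the four entries are visible at a glance. The names demoIsSpace, demo_isspace() and demoSkipSpace() are hypothetical; the real code spells out all 256 entries.

```c
#include <assert.h>
#include <stddef.h>

/* 256-entry table; entries not named here are zero-initialized. */
static const char demoIsSpace[256] = {
  ['\t'] = 1, ['\n'] = 1, ['\r'] = 1, [' '] = 1
};

/* The cast to unsigned char keeps negative char values from indexing
** before the start of the table. */
#define demo_isspace(x) (demoIsSpace[(unsigned char)(x)])

/* Hypothetical helper: index of the first non-whitespace byte of z. */
static size_t demoSkipSpace(const char *z){
  size_t i = 0;
  assert( z!=0 );
  while( demo_isspace(z[i]) ){ i++; }
  return i;
}
```

Unlike the C library isspace(), this table does not treat vertical tab or form feed as whitespace. The loop in demoSkipSpace() stops at the NUL terminator because entry 0 of the table is zero.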
| 5313 #ifndef SQLITE_AMALGAMATION |
| 5314 /* Unsigned integer types. These are already defined in the sqliteInt.h, |
| 5315 ** but the definitions need to be repeated for separate compilation. */ |
| 5316 typedef sqlite3_uint64 u64; |
| 5317 typedef unsigned int u32; |
| 5318 typedef unsigned char u8; |
| 5319 #endif |
| 5320 |
| 5321 /* Objects */ |
| 5322 typedef struct JsonString JsonString; |
| 5323 typedef struct JsonNode JsonNode; |
| 5324 typedef struct JsonParse JsonParse; |
| 5325 |
| 5326 /* An instance of this object represents a JSON string |
| 5327 ** under construction. Really, this is a generic string accumulator |
| 5328 ** that can be and is used to create strings other than JSON. |
| 5329 */ |
| 5330 struct JsonString { |
| 5331 sqlite3_context *pCtx; /* Function context - put error messages here */ |
| 5332 char *zBuf; /* Append JSON content here */ |
| 5333 u64 nAlloc; /* Bytes of storage available in zBuf[] */ |
| 5334 u64 nUsed; /* Bytes of zBuf[] currently used */ |
| 5335 u8 bStatic; /* True if zBuf is static space */ |
| 5336 u8 bErr; /* True if an error has been encountered */ |
| 5337 char zSpace[100]; /* Initial static space */ |
| 5338 }; |
| 5339 |
| 5340 /* JSON type values |
| 5341 */ |
| 5342 #define JSON_NULL 0 |
| 5343 #define JSON_TRUE 1 |
| 5344 #define JSON_FALSE 2 |
| 5345 #define JSON_INT 3 |
| 5346 #define JSON_REAL 4 |
| 5347 #define JSON_STRING 5 |
| 5348 #define JSON_ARRAY 6 |
| 5349 #define JSON_OBJECT 7 |
| 5350 |
| 5351 /* The "subtype" set for JSON values */ |
| 5352 #define JSON_SUBTYPE 74 /* Ascii for "J" */ |
| 5353 |
| 5354 /* |
| 5355 ** Names of the various JSON types: |
| 5356 */ |
| 5357 static const char * const jsonType[] = { |
| 5358 "null", "true", "false", "integer", "real", "text", "array", "object" |
| 5359 }; |
| 5360 |
| 5361 /* Bit values for the JsonNode.jnFlags field |
| 5362 */ |
| 5363 #define JNODE_RAW 0x01 /* Content is raw, not JSON encoded */ |
| 5364 #define JNODE_ESCAPE 0x02 /* Content is text with \ escapes */ |
| 5365 #define JNODE_REMOVE 0x04 /* Do not output */ |
| 5366 #define JNODE_REPLACE 0x08 /* Replace with JsonNode.iVal */ |
| 5367 #define JNODE_APPEND 0x10 /* More ARRAY/OBJECT entries at u.iAppend */ |
| 5368 #define JNODE_LABEL 0x20 /* Is a label of an object */ |
| 5369 |
| 5370 |
| 5371 /* A single node of parsed JSON |
| 5372 */ |
| 5373 struct JsonNode { |
| 5374 u8 eType; /* One of the JSON_ type values */ |
| 5375 u8 jnFlags; /* JNODE flags */ |
| 5376 u8 iVal; /* Replacement value when JNODE_REPLACE */ |
| 5377 u32 n; /* Bytes of content, or number of sub-nodes */ |
| 5378 union { |
| 5379 const char *zJContent; /* Content for INT, REAL, and STRING */ |
| 5380 u32 iAppend; /* More terms for ARRAY and OBJECT */ |
| 5381 u32 iKey; /* Key for ARRAY objects in json_tree() */ |
| 5382 } u; |
| 5383 }; |
| 5384 |
| 5385 /* A completely parsed JSON string |
| 5386 */ |
| 5387 struct JsonParse { |
| 5388 u32 nNode; /* Number of slots of aNode[] used */ |
| 5389 u32 nAlloc; /* Number of slots of aNode[] allocated */ |
| 5390 JsonNode *aNode; /* Array of nodes containing the parse */ |
| 5391 const char *zJson; /* Original JSON string */ |
| 5392 u32 *aUp; /* Index of parent of each node */ |
| 5393 u8 oom; /* Set to true if out of memory */ |
| 5394 u8 nErr; /* Number of errors seen */ |
| 5395 }; |
| 5396 |
| 5397 /************************************************************************** |
| 5398 ** Utility routines for dealing with JsonString objects |
| 5399 **************************************************************************/ |
| 5400 |
| 5401 /* Set the JsonString object to an empty string |
| 5402 */ |
| 5403 static void jsonZero(JsonString *p){ |
| 5404 p->zBuf = p->zSpace; |
| 5405 p->nAlloc = sizeof(p->zSpace); |
| 5406 p->nUsed = 0; |
| 5407 p->bStatic = 1; |
| 5408 } |
| 5409 |
| 5410 /* Initialize the JsonString object |
| 5411 */ |
| 5412 static void jsonInit(JsonString *p, sqlite3_context *pCtx){ |
| 5413 p->pCtx = pCtx; |
| 5414 p->bErr = 0; |
| 5415 jsonZero(p); |
| 5416 } |
| 5417 |
| 5418 |
| 5419 /* Free all allocated memory and reset the JsonString object back to its |
| 5420 ** initial state. |
| 5421 */ |
| 5422 static void jsonReset(JsonString *p){ |
| 5423 if( !p->bStatic ) sqlite3_free(p->zBuf); |
| 5424 jsonZero(p); |
| 5425 } |
| 5426 |
| 5427 |
| 5428 /* Report an out-of-memory (OOM) condition |
| 5429 */ |
| 5430 static void jsonOom(JsonString *p){ |
| 5431 p->bErr = 1; |
| 5432 sqlite3_result_error_nomem(p->pCtx); |
| 5433 jsonReset(p); |
| 5434 } |
| 5435 |
| 5436 /* Enlarge p->zBuf so that it can hold at least N more bytes. |
| 5437 ** Return zero on success. Return non-zero on an OOM error. |
| 5438 */ |
| 5439 static int jsonGrow(JsonString *p, u32 N){ |
| 5440 u64 nTotal = N<p->nAlloc ? p->nAlloc*2 : p->nAlloc+N+10; |
| 5441 char *zNew; |
| 5442 if( p->bStatic ){ |
| 5443 if( p->bErr ) return 1; |
| 5444 zNew = sqlite3_malloc64(nTotal); |
| 5445 if( zNew==0 ){ |
| 5446 jsonOom(p); |
| 5447 return SQLITE_NOMEM; |
| 5448 } |
| 5449 memcpy(zNew, p->zBuf, (size_t)p->nUsed); |
| 5450 p->zBuf = zNew; |
| 5451 p->bStatic = 0; |
| 5452 }else{ |
| 5453 zNew = sqlite3_realloc64(p->zBuf, nTotal); |
| 5454 if( zNew==0 ){ |
| 5455 jsonOom(p); |
| 5456 return SQLITE_NOMEM; |
| 5457 } |
| 5458 p->zBuf = zNew; |
| 5459 } |
| 5460 p->nAlloc = nTotal; |
| 5461 return SQLITE_OK; |
| 5462 } |
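JsonString starts out writing into its embedded zSpace[] array and moves to the heap only when that space runs out. The sketch below shows the same growth policy as a standalone accumulator, assuming plain malloc()/realloc() in place of sqlite3_malloc64()/sqlite3_realloc64() and leaving out the error reporting through sqlite3_context. DemoStr and the demo* functions are hypothetical names, and the 16-byte static buffer is chosen only to make the switch to the heap easy to trigger.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct DemoStr DemoStr;
struct DemoStr {
  char *zBuf;        /* Current buffer: zSpace[] or heap memory */
  size_t nAlloc;     /* Bytes available in zBuf[] */
  size_t nUsed;      /* Bytes of zBuf[] currently used */
  int bStatic;       /* True while zBuf points at zSpace[] */
  char zSpace[16];   /* Initial static space */
};

static void demoInit(DemoStr *p){
  p->zBuf = p->zSpace;
  p->nAlloc = sizeof(p->zSpace);
  p->nUsed = 0;
  p->bStatic = 1;
}

/* Make room for at least N more bytes. Return 0 on success, 1 on OOM. */
static int demoGrow(DemoStr *p, size_t N){
  size_t nTotal = N<p->nAlloc ? p->nAlloc*2 : p->nAlloc+N+10;
  char *zNew;
  if( p->bStatic ){
    /* First growth: copy out of the static space into the heap */
    zNew = (char*)malloc(nTotal);
    if( zNew==0 ) return 1;
    memcpy(zNew, p->zBuf, p->nUsed);
    p->bStatic = 0;
  }else{
    zNew = (char*)realloc(p->zBuf, nTotal);
    if( zNew==0 ) return 1;
  }
  p->zBuf = zNew;
  p->nAlloc = nTotal;
  return 0;
}

/* Append N bytes from zIn. Return 0 on success, 1 on OOM. */
static int demoAppend(DemoStr *p, const char *zIn, size_t N){
  assert( p!=0 && zIn!=0 );
  if( N+p->nUsed >= p->nAlloc && demoGrow(p, N)!=0 ) return 1;
  memcpy(p->zBuf+p->nUsed, zIn, N);
  p->nUsed += N;
  return 0;
}

/* Release any heap memory and return to the initial state. */
static void demoFree(DemoStr *p){
  if( !p->bStatic ) free(p->zBuf);
  demoInit(p);
}
```

Short results never touch the allocator, which is presumably why the real structure reserves 100 bytes of static space.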
| 5463 |
| 5464 /* Append N bytes from zIn onto the end of the JsonString string. |
| 5465 */ |
| 5466 static void jsonAppendRaw(JsonString *p, const char *zIn, u32 N){ |
| 5467 if( (N+p->nUsed >= p->nAlloc) && jsonGrow(p,N)!=0 ) return; |
| 5468 memcpy(p->zBuf+p->nUsed, zIn, N); |
| 5469 p->nUsed += N; |
| 5470 } |
| 5471 |
| 5472 /* Append formatted text (not to exceed N bytes) to the JsonString. |
| 5473 */ |
| 5474 static void jsonPrintf(int N, JsonString *p, const char *zFormat, ...){ |
| 5475 va_list ap; |
| 5476 if( (p->nUsed + N >= p->nAlloc) && jsonGrow(p, N) ) return; |
| 5477 va_start(ap, zFormat); |
| 5478 sqlite3_vsnprintf(N, p->zBuf+p->nUsed, zFormat, ap); |
| 5479 va_end(ap); |
| 5480 p->nUsed += (int)strlen(p->zBuf+p->nUsed); |
| 5481 } |
| 5482 |
| 5483 /* Append a single character |
| 5484 */ |
| 5485 static void jsonAppendChar(JsonString *p, char c){ |
| 5486 if( p->nUsed>=p->nAlloc && jsonGrow(p,1)!=0 ) return; |
| 5487 p->zBuf[p->nUsed++] = c; |
| 5488 } |
| 5489 |
| 5490 /* Append a comma separator to the output buffer, if the previous |
| 5491 ** character is not '[' or '{'. |
| 5492 */ |
| 5493 static void jsonAppendSeparator(JsonString *p){ |
| 5494 char c; |
| 5495 if( p->nUsed==0 ) return; |
| 5496 c = p->zBuf[p->nUsed-1]; |
| 5497 if( c!='[' && c!='{' ) jsonAppendChar(p, ','); |
| 5498 } |
| 5499 |
| 5500 /* Append the N-byte string in zIn to the end of the JsonString string |
| 5501 ** under construction. Enclose the string in "..." and escape |
| 5502 ** any double-quotes or backslash characters contained within the |
| 5503 ** string. |
| 5504 */ |
| 5505 static void jsonAppendString(JsonString *p, const char *zIn, u32 N){ |
| 5506 u32 i; |
| 5507 if( (N+p->nUsed+2 >= p->nAlloc) && jsonGrow(p,N+2)!=0 ) return; |
| 5508 p->zBuf[p->nUsed++] = '"'; |
| 5509 for(i=0; i<N; i++){ |
| 5510 char c = zIn[i]; |
| 5511 if( c=='"' || c=='\\' ){ |
| 5512 if( (p->nUsed+N+3-i > p->nAlloc) && jsonGrow(p,N+3-i)!=0 ) return; |
| 5513 p->zBuf[p->nUsed++] = '\\'; |
| 5514 } |
| 5515 p->zBuf[p->nUsed++] = c; |
| 5516 } |
| 5517 p->zBuf[p->nUsed++] = '"'; |
| 5518 assert( p->nUsed<p->nAlloc ); |
| 5519 } |
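jsonAppendString() wraps its input in double quotes and puts a backslash in front of every double-quote or backslash byte. As a minimal sketch of that rule, the hypothetical demoQuote() below writes into a caller-supplied buffer rather than a JsonString and refuses to run unless the buffer can hold the worst case. Like the routine above, it escapes only those two characters and leaves control characters untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: write the N bytes of zIn into zOut as a quoted
** string, escaping '"' and '\\'. Return the number of bytes written,
** not counting the terminator, or -1 if zOut might be too small. */
static int demoQuote(char *zOut, size_t nOut, const char *zIn, size_t N){
  size_t i, j = 0;
  assert( zOut!=0 && zIn!=0 );
  /* Worst case: every byte escaped, two quotes, one terminator */
  if( nOut < 2*N + 3 ) return -1;
  zOut[j++] = '"';
  for(i=0; i<N; i++){
    char c = zIn[i];
    if( c=='"' || c=='\\' ) zOut[j++] = '\\';
    zOut[j++] = c;
  }
  zOut[j++] = '"';
  zOut[j] = 0;
  return (int)j;
}
```

For example, the five input bytes a"b\c become the nine output bytes "a\"b\\c".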
| 5520 |
| 5521 /* |
| 5522 ** Append a function parameter value to the JSON string under |
| 5523 ** construction. |
| 5524 */ |
| 5525 static void jsonAppendValue( |
| 5526 JsonString *p, /* Append to this JSON string */ |
| 5527 sqlite3_value *pValue /* Value to append */ |
| 5528 ){ |
| 5529 switch( sqlite3_value_type(pValue) ){ |
| 5530 case SQLITE_NULL: { |
| 5531 jsonAppendRaw(p, "null", 4); |
| 5532 break; |
| 5533 } |
| 5534 case SQLITE_INTEGER: |
| 5535 case SQLITE_FLOAT: { |
| 5536 const char *z = (const char*)sqlite3_value_text(pValue); |
| 5537 u32 n = (u32)sqlite3_value_bytes(pValue); |
| 5538 jsonAppendRaw(p, z, n); |
| 5539 break; |
| 5540 } |
| 5541 case SQLITE_TEXT: { |
| 5542 const char *z = (const char*)sqlite3_value_text(pValue); |
| 5543 u32 n = (u32)sqlite3_value_bytes(pValue); |
| 5544 if( sqlite3_value_subtype(pValue)==JSON_SUBTYPE ){ |
| 5545 jsonAppendRaw(p, z, n); |
| 5546 }else{ |
| 5547 jsonAppendString(p, z, n); |
| 5548 } |
| 5549 break; |
| 5550 } |
| 5551 default: { |
| 5552 if( p->bErr==0 ){ |
| 5553 sqlite3_result_error(p->pCtx, "JSON cannot hold BLOB values", -1); |
| 5554 p->bErr = 1; |
| 5555 jsonReset(p); |
| 5556 } |
| 5557 break; |
| 5558 } |
| 5559 } |
| 5560 } |
| 5561 |
| 5562 |
| 5563 /* Make the JSON in p the result of the SQL function. |
| 5564 */ |
| 5565 static void jsonResult(JsonString *p){ |
| 5566 if( p->bErr==0 ){ |
| 5567 sqlite3_result_text64(p->pCtx, p->zBuf, p->nUsed, |
| 5568 p->bStatic ? SQLITE_TRANSIENT : sqlite3_free, |
| 5569 SQLITE_UTF8); |
| 5570 jsonZero(p); |
| 5571 } |
| 5572 assert( p->bStatic ); |
| 5573 } |
| 5574 |
| 5575 /************************************************************************** |
| 5576 ** Utility routines for dealing with JsonNode and JsonParse objects |
| 5577 **************************************************************************/ |
| 5578 |
| 5579 /* |
| 5580 ** Return the number of consecutive JsonNode slots needed to represent |
| 5581 ** the parsed JSON at pNode. The minimum answer is 1. For ARRAY and |
| 5582 ** OBJECT types, the number might be larger. |
| 5583 ** |
| 5584 ** Appended elements are not counted. The value returned is the number |
| 5585 ** by which the JsonNode counter should increment in order to go to the |
| 5586 ** next peer value. |
| 5587 */ |
| 5588 static u32 jsonNodeSize(JsonNode *pNode){ |
| 5589 return pNode->eType>=JSON_ARRAY ? pNode->n+1 : 1; |
| 5590 } |
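Because a parse is stored as a flat array in which each ARRAY or OBJECT node is followed immediately by its descendants, jsonNodeSize() is what lets a loop step from one child to its next peer. The sketch below shows that walk on a stripped-down node type. DemoNode, demoNodeSize() and demoCountChildren() are hypothetical names; the type codes 3 and 6 mirror JSON_INT and JSON_ARRAY from the definitions above.

```c
#include <assert.h>

enum { DEMO_INT = 3, DEMO_ARRAY = 6 };

typedef struct DemoNode DemoNode;
struct DemoNode {
  unsigned char eType;   /* DEMO_INT or DEMO_ARRAY */
  unsigned n;            /* For arrays: number of descendant slots */
};

/* Slots occupied by pNode and everything below it */
static unsigned demoNodeSize(const DemoNode *pNode){
  return pNode->eType>=DEMO_ARRAY ? pNode->n+1 : 1;
}

/* Count the direct children of the array node at pArr by hopping
** from peer to peer, skipping over nested substructure. */
static unsigned demoCountChildren(const DemoNode *pArr){
  unsigned j = 1, nChild = 0;
  assert( pArr->eType==DEMO_ARRAY );
  while( j<=pArr->n ){
    nChild++;
    j += demoNodeSize(&pArr[j]);
  }
  return nChild;
}
```

Under this layout the JSON text [1,[2,3],4] occupies six slots: the outer array with n=5, the integer 1, the inner array with n=2, the integers 2 and 3, and the integer 4. The walk visits slots 1, 2 and 5.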
| 5591 |
| 5592 /* |
| 5593 ** Reclaim all memory allocated by a JsonParse object. But do not |
| 5594 ** delete the JsonParse object itself. |
| 5595 */ |
| 5596 static void jsonParseReset(JsonParse *pParse){ |
| 5597 sqlite3_free(pParse->aNode); |
| 5598 pParse->aNode = 0; |
| 5599 pParse->nNode = 0; |
| 5600 pParse->nAlloc = 0; |
| 5601 sqlite3_free(pParse->aUp); |
| 5602 pParse->aUp = 0; |
| 5603 } |
| 5604 |
| 5605 /* |
| 5606 ** Convert the JsonNode pNode into a pure JSON string and |
| 5607 ** append to pOut. Any substructure is also included. Nothing |
| 5608 ** is returned; all output accumulates in pOut. |
| 5609 */ |
| 5610 static void jsonRenderNode( |
| 5611 JsonNode *pNode, /* The node to render */ |
| 5612 JsonString *pOut, /* Write JSON here */ |
| 5613 sqlite3_value **aReplace /* Replacement values */ |
| 5614 ){ |
| 5615 switch( pNode->eType ){ |
| 5616 default: { |
| 5617 assert( pNode->eType==JSON_NULL ); |
| 5618 jsonAppendRaw(pOut, "null", 4); |
| 5619 break; |
| 5620 } |
| 5621 case JSON_TRUE: { |
| 5622 jsonAppendRaw(pOut, "true", 4); |
| 5623 break; |
| 5624 } |
| 5625 case JSON_FALSE: { |
| 5626 jsonAppendRaw(pOut, "false", 5); |
| 5627 break; |
| 5628 } |
| 5629 case JSON_STRING: { |
| 5630 if( pNode->jnFlags & JNODE_RAW ){ |
| 5631 jsonAppendString(pOut, pNode->u.zJContent, pNode->n); |
| 5632 break; |
| 5633 } |
| 5634 /* Fall through into the next case */ |
| 5635 } |
| 5636 case JSON_REAL: |
| 5637 case JSON_INT: { |
| 5638 jsonAppendRaw(pOut, pNode->u.zJContent, pNode->n); |
| 5639 break; |
| 5640 } |
| 5641 case JSON_ARRAY: { |
| 5642 u32 j = 1; |
| 5643 jsonAppendChar(pOut, '['); |
| 5644 for(;;){ |
| 5645 while( j<=pNode->n ){ |
| 5646 if( pNode[j].jnFlags & (JNODE_REMOVE|JNODE_REPLACE) ){ |
| 5647 if( pNode[j].jnFlags & JNODE_REPLACE ){ |
| 5648 jsonAppendSeparator(pOut); |
| 5649 jsonAppendValue(pOut, aReplace[pNode[j].iVal]); |
| 5650 } |
| 5651 }else{ |
| 5652 jsonAppendSeparator(pOut); |
| 5653 jsonRenderNode(&pNode[j], pOut, aReplace); |
| 5654 } |
| 5655 j += jsonNodeSize(&pNode[j]); |
| 5656 } |
| 5657 if( (pNode->jnFlags & JNODE_APPEND)==0 ) break; |
| 5658 pNode = &pNode[pNode->u.iAppend]; |
| 5659 j = 1; |
| 5660 } |
| 5661 jsonAppendChar(pOut, ']'); |
| 5662 break; |
| 5663 } |
| 5664 case JSON_OBJECT: { |
| 5665 u32 j = 1; |
| 5666 jsonAppendChar(pOut, '{'); |
| 5667 for(;;){ |
| 5668 while( j<=pNode->n ){ |
| 5669 if( (pNode[j+1].jnFlags & JNODE_REMOVE)==0 ){ |
| 5670 jsonAppendSeparator(pOut); |
| 5671 jsonRenderNode(&pNode[j], pOut, aReplace); |
| 5672 jsonAppendChar(pOut, ':'); |
| 5673 if( pNode[j+1].jnFlags & JNODE_REPLACE ){ |
| 5674 jsonAppendValue(pOut, aReplace[pNode[j+1].iVal]); |
| 5675 }else{ |
| 5676 jsonRenderNode(&pNode[j+1], pOut, aReplace); |
| 5677 } |
| 5678 } |
| 5679 j += 1 + jsonNodeSize(&pNode[j+1]); |
| 5680 } |
| 5681 if( (pNode->jnFlags & JNODE_APPEND)==0 ) break; |
| 5682 pNode = &pNode[pNode->u.iAppend]; |
| 5683 j = 1; |
| 5684 } |
| 5685 jsonAppendChar(pOut, '}'); |
| 5686 break; |
| 5687 } |
| 5688 } |
| 5689 } |
| 5690 |
| 5691 /* |
| 5692 ** Return a JsonNode and all its descendants as a JSON string. |
| 5693 */ |
| 5694 static void jsonReturnJson( |
| 5695 JsonNode *pNode, /* Node to return */ |
| 5696 sqlite3_context *pCtx, /* Return value for this function */ |
| 5697 sqlite3_value **aReplace /* Array of replacement values */ |
| 5698 ){ |
| 5699 JsonString s; |
| 5700 jsonInit(&s, pCtx); |
| 5701 jsonRenderNode(pNode, &s, aReplace); |
| 5702 jsonResult(&s); |
| 5703 sqlite3_result_subtype(pCtx, JSON_SUBTYPE); |
| 5704 } |
| 5705 |
| 5706 /* |
| 5707 ** Make the JsonNode the return value of the function. |
| 5708 */ |
| 5709 static void jsonReturn( |
| 5710 JsonNode *pNode, /* Node to return */ |
| 5711 sqlite3_context *pCtx, /* Return value for this function */ |
| 5712 sqlite3_value **aReplace /* Array of replacement values */ |
| 5713 ){ |
| 5714 switch( pNode->eType ){ |
| 5715 default: { |
| 5716 assert( pNode->eType==JSON_NULL ); |
| 5717 sqlite3_result_null(pCtx); |
| 5718 break; |
| 5719 } |
| 5720 case JSON_TRUE: { |
| 5721 sqlite3_result_int(pCtx, 1); |
| 5722 break; |
| 5723 } |
| 5724 case JSON_FALSE: { |
| 5725 sqlite3_result_int(pCtx, 0); |
| 5726 break; |
| 5727 } |
| 5728 case JSON_INT: { |
| 5729 sqlite3_int64 i = 0; |
| 5730 const char *z = pNode->u.zJContent; |
| 5731 if( z[0]=='-' ){ z++; } |
| 5732 while( z[0]>='0' && z[0]<='9' ){ |
| 5733 unsigned v = *(z++) - '0'; |
| 5734 if( i>=LARGEST_INT64/10 ){ |
| 5735 if( i>LARGEST_INT64/10 ) goto int_as_real; |
| 5736 if( z[0]>='0' && z[0]<='9' ) goto int_as_real; |
| 5737 if( v==9 ) goto int_as_real; |
| 5738 if( v==8 ){ |
| 5739 if( pNode->u.zJContent[0]=='-' ){ |
| 5740 sqlite3_result_int64(pCtx, SMALLEST_INT64); |
| 5741 goto int_done; |
| 5742 }else{ |
| 5743 goto int_as_real; |
| 5744 } |
| 5745 } |
| 5746 } |
| 5747 i = i*10 + v; |
| 5748 } |
| 5749 if( pNode->u.zJContent[0]=='-' ){ i = -i; } |
| 5750 sqlite3_result_int64(pCtx, i); |
| 5751 int_done: |
| 5752 break; |
| 5753 int_as_real: /* fall through to real */; |
| 5754 } |
| 5755 case JSON_REAL: { |
| 5756 double r; |
| 5757 #ifdef SQLITE_AMALGAMATION |
| 5758 const char *z = pNode->u.zJContent; |
| 5759 sqlite3AtoF(z, &r, sqlite3Strlen30(z), SQLITE_UTF8); |
| 5760 #else |
| 5761 r = strtod(pNode->u.zJContent, 0); |
| 5762 #endif |
| 5763 sqlite3_result_double(pCtx, r); |
| 5764 break; |
| 5765 } |
| 5766 case JSON_STRING: { |
| 5767 #if 0 /* Never happens because JNODE_RAW is only set by json_set(), |
| 5768 ** json_insert() and json_replace() and those routines do not |
| 5769 ** call jsonReturn() */ |
| 5770 if( pNode->jnFlags & JNODE_RAW ){ |
| 5771 sqlite3_result_text(pCtx, pNode->u.zJContent, pNode->n, |
| 5772 SQLITE_TRANSIENT); |
| 5773 }else |
| 5774 #endif |
| 5775 assert( (pNode->jnFlags & JNODE_RAW)==0 ); |
| 5776 if( (pNode->jnFlags & JNODE_ESCAPE)==0 ){ |
| 5777 /* JSON formatted without any backslash-escapes */ |
| 5778 sqlite3_result_text(pCtx, pNode->u.zJContent+1, pNode->n-2, |
| 5779 SQLITE_TRANSIENT); |
| 5780 }else{ |
| 5781 /* Translate JSON formatted string into raw text */ |
| 5782 u32 i; |
| 5783 u32 n = pNode->n; |
| 5784 const char *z = pNode->u.zJContent; |
| 5785 char *zOut; |
| 5786 u32 j; |
| 5787 zOut = sqlite3_malloc( n+1 ); |
| 5788 if( zOut==0 ){ |
| 5789 sqlite3_result_error_nomem(pCtx); |
| 5790 break; |
| 5791 } |
| 5792 for(i=1, j=0; i<n-1; i++){ |
| 5793 char c = z[i]; |
| 5794 if( c!='\\' ){ |
| 5795 zOut[j++] = c; |
| 5796 }else{ |
| 5797 c = z[++i]; |
| 5798 if( c=='u' ){ |
| 5799 u32 v = 0, k; |
| 5800 for(k=0; k<4 && i<n-2; i++, k++){ |
| 5801 c = z[i+1]; |
| 5802 if( c>='0' && c<='9' ) v = v*16 + c - '0'; |
| 5803 else if( c>='A' && c<='F' ) v = v*16 + c - 'A' + 10; |
| 5804 else if( c>='a' && c<='f' ) v = v*16 + c - 'a' + 10; |
| 5805 else break; |
| 5806 } |
| 5807 if( v==0 ) break; |
| 5808 if( v<=0x7f ){ |
| 5809 zOut[j++] = (char)v; |
| 5810 }else if( v<=0x7ff ){ |
| 5811 zOut[j++] = (char)(0xc0 | (v>>6)); |
| 5812 zOut[j++] = 0x80 | (v&0x3f); |
| 5813 }else{ |
| 5814 zOut[j++] = (char)(0xe0 | (v>>12)); |
| 5815 zOut[j++] = 0x80 | ((v>>6)&0x3f); |
| 5816 zOut[j++] = 0x80 | (v&0x3f); |
| 5817 } |
| 5818 }else{ |
| 5819 if( c=='b' ){ |
| 5820 c = '\b'; |
| 5821 }else if( c=='f' ){ |
| 5822 c = '\f'; |
| 5823 }else if( c=='n' ){ |
| 5824 c = '\n'; |
| 5825 }else if( c=='r' ){ |
| 5826 c = '\r'; |
| 5827 }else if( c=='t' ){ |
| 5828 c = '\t'; |
| 5829 } |
| 5830 zOut[j++] = c; |
| 5831 } |
| 5832 } |
| 5833 } |
| 5834 zOut[j] = 0; |
| 5835 sqlite3_result_text(pCtx, zOut, j, sqlite3_free); |
| 5836 } |
| 5837 break; |
| 5838 } |
| 5839 case JSON_ARRAY: |
| 5840 case JSON_OBJECT: { |
| 5841 jsonReturnJson(pNode, pCtx, aReplace); |
| 5842 break; |
| 5843 } |
| 5844 } |
| 5845 } |
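The JSON_STRING branch of jsonReturn() turns each \uXXXX escape into one, two or three bytes of UTF-8. The sketch below isolates that encoding step, assuming the four hex digits have already been accumulated into v, so v is at most 0xFFFF. The name demoUtf8() is hypothetical. As in the code above, surrogate pairs are not combined into a single code point.

```c
#include <assert.h>

/* Hypothetical helper: encode code point v (0..0xFFFF) as UTF-8 into
** zOut, which must have room for 3 bytes. Return the byte count. */
static int demoUtf8(unsigned v, char *zOut){
  int j = 0;
  assert( zOut!=0 && v<=0xffff );
  if( v<=0x7f ){
    zOut[j++] = (char)v;
  }else if( v<=0x7ff ){
    zOut[j++] = (char)(0xc0 | (v>>6));
    zOut[j++] = (char)(0x80 | (v&0x3f));
  }else{
    zOut[j++] = (char)(0xe0 | (v>>12));
    zOut[j++] = (char)(0x80 | ((v>>6)&0x3f));
    zOut[j++] = (char)(0x80 | (v&0x3f));
  }
  return j;
}
```

For instance, \u00e9 would encode as the two bytes 0xC3 0xA9 and \u20ac as the three bytes 0xE2 0x82 0xAC.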
| 5846 |
| 5847 /* Forward reference */ |
| 5848 static int jsonParseAddNode(JsonParse*,u32,u32,const char*); |
| 5849 |
| 5850 /* |
| 5851 ** A macro to hint to the compiler that a function should not be |
| 5852 ** inlined. |
| 5853 */ |
| 5854 #if defined(__GNUC__) |
| 5855 # define JSON_NOINLINE __attribute__((noinline)) |
| 5856 #elif defined(_MSC_VER) && _MSC_VER>=1310 |
| 5857 # define JSON_NOINLINE __declspec(noinline) |
| 5858 #else |
| 5859 # define JSON_NOINLINE |
| 5860 #endif |
| 5861 |
| 5862 |
| 5863 static JSON_NOINLINE int jsonParseAddNodeExpand( |
| 5864 JsonParse *pParse, /* Append the node to this object */ |
| 5865 u32 eType, /* Node type */ |
| 5866 u32 n, /* Content size or sub-node count */ |
| 5867 const char *zContent /* Content */ |
| 5868 ){ |
| 5869 u32 nNew; |
| 5870 JsonNode *pNew; |
| 5871 assert( pParse->nNode>=pParse->nAlloc ); |
| 5872 if( pParse->oom ) return -1; |
| 5873 nNew = pParse->nAlloc*2 + 10; |
| 5874 pNew = sqlite3_realloc(pParse->aNode, sizeof(JsonNode)*nNew); |
| 5875 if( pNew==0 ){ |
| 5876 pParse->oom = 1; |
| 5877 return -1; |
| 5878 } |
| 5879 pParse->nAlloc = nNew; |
| 5880 pParse->aNode = pNew; |
| 5881 assert( pParse->nNode<pParse->nAlloc ); |
| 5882 return jsonParseAddNode(pParse, eType, n, zContent); |
| 5883 } |
| 5884 |
| 5885 /* |
| 5886 ** Create a new JsonNode instance based on the arguments and append that |
| 5887 ** instance to the JsonParse. Return the index in pParse->aNode[] of the |
| 5888 ** new node, or -1 if a memory allocation fails. |
| 5889 */ |
| 5890 static int jsonParseAddNode( |
| 5891 JsonParse *pParse, /* Append the node to this object */ |
| 5892 u32 eType, /* Node type */ |
| 5893 u32 n, /* Content size or sub-node count */ |
| 5894 const char *zContent /* Content */ |
| 5895 ){ |
| 5896 JsonNode *p; |
| 5897 if( pParse->nNode>=pParse->nAlloc ){ |
| 5898 return jsonParseAddNodeExpand(pParse, eType, n, zContent); |
| 5899 } |
| 5900 p = &pParse->aNode[pParse->nNode]; |
| 5901 p->eType = (u8)eType; |
| 5902 p->jnFlags = 0; |
| 5903 p->iVal = 0; |
| 5904 p->n = n; |
| 5905 p->u.zJContent = zContent; |
| 5906 return pParse->nNode++; |
| 5907 } |
| 5908 |
| 5909 /* |
| 5910 ** Parse a single JSON value which begins at pParse->zJson[i]. Return the |
| 5911 ** index of the first character past the end of the value parsed. |
| 5912 ** |
| 5913 ** Return negative for a syntax error. Special cases: return -2 if the |
| 5914 ** first non-whitespace character is '}' and return -3 if the first |
| 5915 ** non-whitespace character is ']'. |
| 5916 */ |
| 5917 static int jsonParseValue(JsonParse *pParse, u32 i){ |
| 5918 char c; |
| 5919 u32 j; |
| 5920 int iThis; |
| 5921 int x; |
| 5922 JsonNode *pNode; |
| 5923 while( safe_isspace(pParse->zJson[i]) ){ i++; } |
| 5924 if( (c = pParse->zJson[i])=='{' ){ |
| 5925 /* Parse object */ |
| 5926 iThis = jsonParseAddNode(pParse, JSON_OBJECT, 0, 0); |
| 5927 if( iThis<0 ) return -1; |
| 5928 for(j=i+1;;j++){ |
| 5929 while( safe_isspace(pParse->zJson[j]) ){ j++; } |
| 5930 x = jsonParseValue(pParse, j); |
| 5931 if( x<0 ){ |
| 5932 if( x==(-2) && pParse->nNode==(u32)iThis+1 ) return j+1; |
| 5933 return -1; |
| 5934 } |
| 5935 if( pParse->oom ) return -1; |
| 5936 pNode = &pParse->aNode[pParse->nNode-1]; |
| 5937 if( pNode->eType!=JSON_STRING ) return -1; |
| 5938 pNode->jnFlags |= JNODE_LABEL; |
| 5939 j = x; |
| 5940 while( safe_isspace(pParse->zJson[j]) ){ j++; } |
| 5941 if( pParse->zJson[j]!=':' ) return -1; |
| 5942 j++; |
| 5943 x = jsonParseValue(pParse, j); |
| 5944 if( x<0 ) return -1; |
| 5945 j = x; |
| 5946 while( safe_isspace(pParse->zJson[j]) ){ j++; } |
| 5947 c = pParse->zJson[j]; |
| 5948 if( c==',' ) continue; |
| 5949 if( c!='}' ) return -1; |
| 5950 break; |
| 5951 } |
| 5952 pParse->aNode[iThis].n = pParse->nNode - (u32)iThis - 1; |
| 5953 return j+1; |
| 5954 }else if( c=='[' ){ |
| 5955 /* Parse array */ |
| 5956 iThis = jsonParseAddNode(pParse, JSON_ARRAY, 0, 0); |
| 5957 if( iThis<0 ) return -1; |
| 5958 for(j=i+1;;j++){ |
| 5959 while( safe_isspace(pParse->zJson[j]) ){ j++; } |
| 5960 x = jsonParseValue(pParse, j); |
| 5961 if( x<0 ){ |
| 5962 if( x==(-3) && pParse->nNode==(u32)iThis+1 ) return j+1; |
| 5963 return -1; |
| 5964 } |
| 5965 j = x; |
| 5966 while( safe_isspace(pParse->zJson[j]) ){ j++; } |
| 5967 c = pParse->zJson[j]; |
| 5968 if( c==',' ) continue; |
| 5969 if( c!=']' ) return -1; |
| 5970 break; |
| 5971 } |
| 5972 pParse->aNode[iThis].n = pParse->nNode - (u32)iThis - 1; |
| 5973 return j+1; |
| 5974 }else if( c=='"' ){ |
| 5975 /* Parse string */ |
| 5976 u8 jnFlags = 0; |
| 5977 j = i+1; |
| 5978 for(;;){ |
| 5979 c = pParse->zJson[j]; |
| 5980 if( c==0 ) return -1; |
| 5981 if( c=='\\' ){ |
| 5982 c = pParse->zJson[++j]; |
| 5983 if( c==0 ) return -1; |
| 5984 jnFlags = JNODE_ESCAPE; |
| 5985 }else if( c=='"' ){ |
| 5986 break; |
| 5987 } |
| 5988 j++; |
| 5989 } |
| 5990 jsonParseAddNode(pParse, JSON_STRING, j+1-i, &pParse->zJson[i]); |
| 5991 if( !pParse->oom ) pParse->aNode[pParse->nNode-1].jnFlags = jnFlags; |
| 5992 return j+1; |
| 5993 }else if( c=='n' |
| 5994 && strncmp(pParse->zJson+i,"null",4)==0 |
| 5995 && !safe_isalnum(pParse->zJson[i+4]) ){ |
| 5996 jsonParseAddNode(pParse, JSON_NULL, 0, 0); |
| 5997 return i+4; |
| 5998 }else if( c=='t' |
| 5999 && strncmp(pParse->zJson+i,"true",4)==0 |
| 6000 && !safe_isalnum(pParse->zJson[i+4]) ){ |
| 6001 jsonParseAddNode(pParse, JSON_TRUE, 0, 0); |
| 6002 return i+4; |
| 6003 }else if( c=='f' |
| 6004 && strncmp(pParse->zJson+i,"false",5)==0 |
| 6005 && !safe_isalnum(pParse->zJson[i+5]) ){ |
| 6006 jsonParseAddNode(pParse, JSON_FALSE, 0, 0); |
| 6007 return i+5; |
| 6008 }else if( c=='-' || (c>='0' && c<='9') ){ |
| 6009 /* Parse number */ |
| 6010 u8 seenDP = 0; |
| 6011 u8 seenE = 0; |
| 6012 j = i+1; |
| 6013 for(;; j++){ |
| 6014 c = pParse->zJson[j]; |
| 6015 if( c>='0' && c<='9' ) continue; |
| 6016 if( c=='.' ){ |
| 6017 if( pParse->zJson[j-1]=='-' ) return -1; |
| 6018 if( seenDP ) return -1; |
| 6019 seenDP = 1; |
| 6020 continue; |
| 6021 } |
| 6022 if( c=='e' || c=='E' ){ |
| 6023 if( pParse->zJson[j-1]<'0' ) return -1; |
| 6024 if( seenE ) return -1; |
| 6025 seenDP = seenE = 1; |
| 6026 c = pParse->zJson[j+1]; |
| 6027 if( c=='+' || c=='-' ){ |
| 6028 j++; |
| 6029 c = pParse->zJson[j+1]; |
| 6030 } |
| 6031 if( c<'0' || c>'9' ) return -1; |
| 6032 continue; |
| 6033 } |
| 6034 break; |
| 6035 } |
| 6036 if( pParse->zJson[j-1]<'0' ) return -1; |
| 6037 jsonParseAddNode(pParse, seenDP ? JSON_REAL : JSON_INT, |
| 6038 j - i, &pParse->zJson[i]); |
| 6039 return j; |
| 6040 }else if( c=='}' ){ |
| 6041 return -2; /* End of {...} */ |
| 6042 }else if( c==']' ){ |
| 6043 return -3; /* End of [...] */ |
| 6044 }else if( c==0 ){ |
| 6045 return 0; /* End of file */ |
| 6046 }else{ |
| 6047 return -1; /* Syntax error */ |
| 6048 } |
| 6049 } |
| 6050 |
| 6051 /* |
| 6052 ** Parse a complete JSON string. Return 0 on success or non-zero if there |
| 6053 ** are any errors. If an error occurs, free all memory associated with |
| 6054 ** pParse. |
| 6055 ** |
| 6056 ** pParse is uninitialized when this routine is called. |
| 6057 */ |
| 6058 static int jsonParse( |
| 6059 JsonParse *pParse, /* Initialize and fill this JsonParse object */ |
| 6060 sqlite3_context *pCtx, /* Report errors here */ |
| 6061 const char *zJson /* Input JSON text to be parsed */ |
| 6062 ){ |
| 6063 int i; |
| 6064 memset(pParse, 0, sizeof(*pParse)); |
| 6065 if( zJson==0 ) return 1; |
| 6066 pParse->zJson = zJson; |
| 6067 i = jsonParseValue(pParse, 0); |
| 6068 if( pParse->oom ) i = -1; |
| 6069 if( i>0 ){ |
| 6070 while( safe_isspace(zJson[i]) ) i++; |
| 6071 if( zJson[i] ) i = -1; |
| 6072 } |
| 6073 if( i<=0 ){ |
| 6074 if( pCtx!=0 ){ |
| 6075 if( pParse->oom ){ |
| 6076 sqlite3_result_error_nomem(pCtx); |
| 6077 }else{ |
| 6078 sqlite3_result_error(pCtx, "malformed JSON", -1); |
| 6079 } |
| 6080 } |
| 6081 jsonParseReset(pParse); |
| 6082 return 1; |
| 6083 } |
| 6084 return 0; |
| 6085 } |
| 6086 |
| 6087 /* Mark node i of pParse as being a child of iParent. Call recursively |
| 6088 ** to fill in all the descendants of node i. |
| 6089 */ |
| 6090 static void jsonParseFillInParentage(JsonParse *pParse, u32 i, u32 iParent){ |
| 6091 JsonNode *pNode = &pParse->aNode[i]; |
| 6092 u32 j; |
| 6093 pParse->aUp[i] = iParent; |
| 6094 switch( pNode->eType ){ |
| 6095 case JSON_ARRAY: { |
| 6096 for(j=1; j<=pNode->n; j += jsonNodeSize(pNode+j)){ |
| 6097 jsonParseFillInParentage(pParse, i+j, i); |
| 6098 } |
| 6099 break; |
| 6100 } |
| 6101 case JSON_OBJECT: { |
| 6102 for(j=1; j<=pNode->n; j += jsonNodeSize(pNode+j+1)+1){ |
| 6103 pParse->aUp[i+j] = i; |
| 6104 jsonParseFillInParentage(pParse, i+j+1, i); |
| 6105 } |
| 6106 break; |
| 6107 } |
| 6108 default: { |
| 6109 break; |
| 6110 } |
| 6111 } |
| 6112 } |
| 6113 |
| 6114 /* |
| 6115 ** Compute the parentage of all nodes in a completed parse. |
| 6116 */ |
| 6117 static int jsonParseFindParents(JsonParse *pParse){ |
| 6118 u32 *aUp; |
| 6119 assert( pParse->aUp==0 ); |
| 6120 aUp = pParse->aUp = sqlite3_malloc( sizeof(u32)*pParse->nNode ); |
| 6121 if( aUp==0 ){ |
| 6122 pParse->oom = 1; |
| 6123 return SQLITE_NOMEM; |
| 6124 } |
| 6125 jsonParseFillInParentage(pParse, 0, 0); |
| 6126 return SQLITE_OK; |
| 6127 } |
| 6128 |
| 6129 /* |
| 6130 ** Compare the OBJECT label at pNode against zKey,nKey. Return true on |
| 6131 ** a match. |
| 6132 */ |
| 6133 static int jsonLabelCompare(JsonNode *pNode, const char *zKey, u32 nKey){ |
| 6134 if( pNode->jnFlags & JNODE_RAW ){ |
| 6135 if( pNode->n!=nKey ) return 0; |
| 6136 return strncmp(pNode->u.zJContent, zKey, nKey)==0; |
| 6137 }else{ |
| 6138 if( pNode->n!=nKey+2 ) return 0; |
| 6139 return strncmp(pNode->u.zJContent+1, zKey, nKey)==0; |
| 6140 } |
| 6141 } |
| 6142 |
| 6143 /* forward declaration */ |
| 6144 static JsonNode *jsonLookupAppend(JsonParse*,const char*,int*,const char**); |
| 6145 |
| 6146 /* |
| 6147 ** Search along zPath to find the node specified. Return a pointer |
| 6148 ** to that node, or NULL if zPath is malformed or if there is no such |
| 6149 ** node. |
| 6150 ** |
| 6151 ** If pApnd!=0, then try to append new nodes to complete zPath if it is |
| 6152 ** possible to do so and if no existing node corresponds to zPath. If |
| 6153 ** new nodes are appended *pApnd is set to 1. |
| 6154 */ |
| 6155 static JsonNode *jsonLookupStep( |
| 6156 JsonParse *pParse, /* The JSON to search */ |
| 6157 u32 iRoot, /* Begin the search at this node */ |
| 6158 const char *zPath, /* The path to search */ |
| 6159 int *pApnd, /* Append nodes to complete path if not NULL */ |
| 6160 const char **pzErr /* Make *pzErr point to any syntax error in zPath */ |
| 6161 ){ |
| 6162 u32 i, j, nKey; |
| 6163 const char *zKey; |
| 6164 JsonNode *pRoot = &pParse->aNode[iRoot]; |
| 6165 if( zPath[0]==0 ) return pRoot; |
| 6166 if( zPath[0]=='.' ){ |
| 6167 if( pRoot->eType!=JSON_OBJECT ) return 0; |
| 6168 zPath++; |
| 6169 if( zPath[0]=='"' ){ |
| 6170 zKey = zPath + 1; |
| 6171 for(i=1; zPath[i] && zPath[i]!='"'; i++){} |
| 6172 nKey = i-1; |
| 6173 if( zPath[i] ){ |
| 6174 i++; |
| 6175 }else{ |
| 6176 *pzErr = zPath; |
| 6177 return 0; |
| 6178 } |
| 6179 }else{ |
| 6180 zKey = zPath; |
| 6181 for(i=0; zPath[i] && zPath[i]!='.' && zPath[i]!='['; i++){} |
| 6182 nKey = i; |
| 6183 } |
| 6184 if( nKey==0 ){ |
| 6185 *pzErr = zPath; |
| 6186 return 0; |
| 6187 } |
| 6188 j = 1; |
| 6189 for(;;){ |
| 6190 while( j<=pRoot->n ){ |
| 6191 if( jsonLabelCompare(pRoot+j, zKey, nKey) ){ |
| 6192 return jsonLookupStep(pParse, iRoot+j+1, &zPath[i], pApnd, pzErr); |
| 6193 } |
| 6194 j++; |
| 6195 j += jsonNodeSize(&pRoot[j]); |
| 6196 } |
| 6197 if( (pRoot->jnFlags & JNODE_APPEND)==0 ) break; |
| 6198 iRoot += pRoot->u.iAppend; |
| 6199 pRoot = &pParse->aNode[iRoot]; |
| 6200 j = 1; |
| 6201 } |
| 6202 if( pApnd ){ |
| 6203 u32 iStart, iLabel; |
| 6204 JsonNode *pNode; |
| 6205 iStart = jsonParseAddNode(pParse, JSON_OBJECT, 2, 0); |
| 6206 iLabel = jsonParseAddNode(pParse, JSON_STRING, i, zPath); |
| 6207 zPath += i; |
| 6208 pNode = jsonLookupAppend(pParse, zPath, pApnd, pzErr); |
| 6209 if( pParse->oom ) return 0; |
| 6210 if( pNode ){ |
| 6211 pRoot = &pParse->aNode[iRoot]; |
| 6212 pRoot->u.iAppend = iStart - iRoot; |
| 6213 pRoot->jnFlags |= JNODE_APPEND; |
| 6214 pParse->aNode[iLabel].jnFlags |= JNODE_RAW; |
| 6215 } |
| 6216 return pNode; |
| 6217 } |
| 6218 }else if( zPath[0]=='[' && safe_isdigit(zPath[1]) ){ |
| 6219 if( pRoot->eType!=JSON_ARRAY ) return 0; |
| 6220 i = 0; |
| 6221 j = 1; |
| 6222 while( safe_isdigit(zPath[j]) ){ |
| 6223 i = i*10 + zPath[j] - '0'; |
| 6224 j++; |
| 6225 } |
| 6226 if( zPath[j]!=']' ){ |
| 6227 *pzErr = zPath; |
| 6228 return 0; |
| 6229 } |
| 6230 zPath += j + 1; |
| 6231 j = 1; |
| 6232 for(;;){ |
| 6233 while( j<=pRoot->n && (i>0 || (pRoot[j].jnFlags & JNODE_REMOVE)!=0) ){ |
| 6234 if( (pRoot[j].jnFlags & JNODE_REMOVE)==0 ) i--; |
| 6235 j += jsonNodeSize(&pRoot[j]); |
| 6236 } |
| 6237 if( (pRoot->jnFlags & JNODE_APPEND)==0 ) break; |
| 6238 iRoot += pRoot->u.iAppend; |
| 6239 pRoot = &pParse->aNode[iRoot]; |
| 6240 j = 1; |
| 6241 } |
| 6242 if( j<=pRoot->n ){ |
| 6243 return jsonLookupStep(pParse, iRoot+j, zPath, pApnd, pzErr); |
| 6244 } |
| 6245 if( i==0 && pApnd ){ |
| 6246 u32 iStart; |
| 6247 JsonNode *pNode; |
| 6248 iStart = jsonParseAddNode(pParse, JSON_ARRAY, 1, 0); |
| 6249 pNode = jsonLookupAppend(pParse, zPath, pApnd, pzErr); |
| 6250 if( pParse->oom ) return 0; |
| 6251 if( pNode ){ |
| 6252 pRoot = &pParse->aNode[iRoot]; |
| 6253 pRoot->u.iAppend = iStart - iRoot; |
| 6254 pRoot->jnFlags |= JNODE_APPEND; |
| 6255 } |
| 6256 return pNode; |
| 6257 } |
| 6258 }else{ |
| 6259 *pzErr = zPath; |
| 6260 } |
| 6261 return 0; |
| 6262 } |
| 6263 |
| 6264 /* |
| 6265 ** Append content to pParse that will complete zPath. Return a pointer |
| 6266 ** to the inserted node, or return NULL if the append fails. |
| 6267 */ |
| 6268 static JsonNode *jsonLookupAppend( |
| 6269 JsonParse *pParse, /* Append content to the JSON parse */ |
| 6270 const char *zPath, /* Description of content to append */ |
| 6271 int *pApnd, /* Set this flag to 1 */ |
| 6272 const char **pzErr /* Make this point to any syntax error */ |
| 6273 ){ |
| 6274 *pApnd = 1; |
| 6275 if( zPath[0]==0 ){ |
| 6276 jsonParseAddNode(pParse, JSON_NULL, 0, 0); |
| 6277 return pParse->oom ? 0 : &pParse->aNode[pParse->nNode-1]; |
| 6278 } |
| 6279 if( zPath[0]=='.' ){ |
| 6280 jsonParseAddNode(pParse, JSON_OBJECT, 0, 0); |
| 6281 }else if( strncmp(zPath,"[0]",3)==0 ){ |
| 6282 jsonParseAddNode(pParse, JSON_ARRAY, 0, 0); |
| 6283 }else{ |
| 6284 return 0; |
| 6285 } |
| 6286 if( pParse->oom ) return 0; |
| 6287 return jsonLookupStep(pParse, pParse->nNode-1, zPath, pApnd, pzErr); |
| 6288 } |
| 6289 |
| 6290 /* |
| 6291 ** Return the text of a syntax error message on a JSON path. Space is |
| 6292 ** obtained from sqlite3_malloc(). |
| 6293 */ |
| 6294 static char *jsonPathSyntaxError(const char *zErr){ |
| 6295 return sqlite3_mprintf("JSON path error near '%q'", zErr); |
| 6296 } |
| 6297 |
| 6298 /* |
| 6299 ** Do a node lookup using zPath. Return a pointer to the node on success. |
| 6300 ** Return NULL if not found or if there is an error. |
| 6301 ** |
| 6302 ** On an error, write an error message into pCtx and increment the |
| 6303 ** pParse->nErr counter. |
| 6304 ** |
| 6305 ** If pApnd!=NULL then try to append missing nodes and set *pApnd = 1 if |
| 6306 ** nodes are appended. |
| 6307 */ |
| 6308 static JsonNode *jsonLookup( |
| 6309 JsonParse *pParse, /* The JSON to search */ |
| 6310 const char *zPath, /* The path to search */ |
| 6311 int *pApnd, /* Append nodes to complete path if not NULL */ |
| 6312 sqlite3_context *pCtx /* Report errors here, if not NULL */ |
| 6313 ){ |
| 6314 const char *zErr = 0; |
| 6315 JsonNode *pNode = 0; |
| 6316 char *zMsg; |
| 6317 |
| 6318 if( zPath==0 ) return 0; |
| 6319 if( zPath[0]!='$' ){ |
| 6320 zErr = zPath; |
| 6321 goto lookup_err; |
| 6322 } |
| 6323 zPath++; |
| 6324 pNode = jsonLookupStep(pParse, 0, zPath, pApnd, &zErr); |
| 6325 if( zErr==0 ) return pNode; |
| 6326 |
| 6327 lookup_err: |
| 6328 pParse->nErr++; |
| 6329 assert( zErr!=0 && pCtx!=0 ); |
| 6330 zMsg = jsonPathSyntaxError(zErr); |
| 6331 if( zMsg ){ |
| 6332 sqlite3_result_error(pCtx, zMsg, -1); |
| 6333 sqlite3_free(zMsg); |
| 6334 }else{ |
| 6335 sqlite3_result_error_nomem(pCtx); |
| 6336 } |
| 6337 return 0; |
| 6338 } |
| 6339 |
| 6340 |
| 6341 /* |
| 6342 ** Report the wrong number of arguments for json_insert(), json_replace() |
| 6343 ** or json_set(). |
| 6344 */ |
| 6345 static void jsonWrongNumArgs( |
| 6346 sqlite3_context *pCtx, |
| 6347 const char *zFuncName |
| 6348 ){ |
| 6349 char *zMsg = sqlite3_mprintf("json_%s() needs an odd number of arguments", |
| 6350 zFuncName); |
| 6351 sqlite3_result_error(pCtx, zMsg, -1); |
| 6352 sqlite3_free(zMsg); |
| 6353 } |
| 6354 |
| 6355 |
| 6356 /**************************************************************************** |
| 6357 ** SQL functions used for testing and debugging |
| 6358 ****************************************************************************/ |
| 6359 |
| 6360 #ifdef SQLITE_DEBUG |
| 6361 /* |
| 6362 ** The json_parse(JSON) function returns a string which describes |
| 6363 ** a parse of the JSON provided. Or it raises a "malformed JSON" error |
| 6364 ** if JSON is not well-formed. |
| 6365 */ |
| 6366 static void jsonParseFunc( |
| 6367 sqlite3_context *ctx, |
| 6368 int argc, |
| 6369 sqlite3_value **argv |
| 6370 ){ |
| 6371 JsonString s; /* Output string - not real JSON */ |
| 6372 JsonParse x; /* The parse */ |
| 6373 u32 i; |
| 6374 |
| 6375 assert( argc==1 ); |
| 6376 if( jsonParse(&x, ctx, (const char*)sqlite3_value_text(argv[0])) ) return; |
| 6377 jsonParseFindParents(&x); |
| 6378 jsonInit(&s, ctx); |
| 6379 for(i=0; i<x.nNode; i++){ |
| 6380 const char *zType; |
| 6381 if( x.aNode[i].jnFlags & JNODE_LABEL ){ |
| 6382 assert( x.aNode[i].eType==JSON_STRING ); |
| 6383 zType = "label"; |
| 6384 }else{ |
| 6385 zType = jsonType[x.aNode[i].eType]; |
| 6386 } |
| 6387 jsonPrintf(100, &s,"node %3u: %7s n=%-4d up=%-4d", |
| 6388 i, zType, x.aNode[i].n, x.aUp[i]); |
| 6389 if( x.aNode[i].u.zJContent!=0 ){ |
| 6390 jsonAppendRaw(&s, " ", 1); |
| 6391 jsonAppendRaw(&s, x.aNode[i].u.zJContent, x.aNode[i].n); |
| 6392 } |
| 6393 jsonAppendRaw(&s, "\n", 1); |
| 6394 } |
| 6395 jsonParseReset(&x); |
| 6396 jsonResult(&s); |
| 6397 } |
| 6398 |
| 6399 /* |
| 6400 ** The json_test1(JSON) function returns true (1) if the input is JSON |
| 6401 ** text generated by another json function. It returns (0) if the input |
| 6402 ** is not known to be JSON. |
| 6403 */ |
| 6404 static void jsonTest1Func( |
| 6405 sqlite3_context *ctx, |
| 6406 int argc, |
| 6407 sqlite3_value **argv |
| 6408 ){ |
| 6409 UNUSED_PARAM(argc); |
| 6410 sqlite3_result_int(ctx, sqlite3_value_subtype(argv[0])==JSON_SUBTYPE); |
| 6411 } |
| 6412 #endif /* SQLITE_DEBUG */ |
| 6413 |
| 6414 /**************************************************************************** |
| 6415 ** Scalar SQL function implementations |
| 6416 ****************************************************************************/ |
| 6417 |
| 6418 /* |
| 6419 ** Implementation of the json_array(VALUE,...) function. Return a JSON |
| 6420 ** array that contains all values given in the arguments. Or if any |
| 6421 ** argument is a BLOB, throw an error. |
| 6422 */ |
| 6423 static void jsonArrayFunc( |
| 6424 sqlite3_context *ctx, |
| 6425 int argc, |
| 6426 sqlite3_value **argv |
| 6427 ){ |
| 6428 int i; |
| 6429 JsonString jx; |
| 6430 |
| 6431 jsonInit(&jx, ctx); |
| 6432 jsonAppendChar(&jx, '['); |
| 6433 for(i=0; i<argc; i++){ |
| 6434 jsonAppendSeparator(&jx); |
| 6435 jsonAppendValue(&jx, argv[i]); |
| 6436 } |
| 6437 jsonAppendChar(&jx, ']'); |
| 6438 jsonResult(&jx); |
| 6439 sqlite3_result_subtype(ctx, JSON_SUBTYPE); |
| 6440 } |
| 6441 |
| 6442 |
| 6443 /* |
| 6444 ** json_array_length(JSON) |
| 6445 ** json_array_length(JSON, PATH) |
| 6446 ** |
| 6447 ** Return the number of elements in the top-level JSON array, or in the |
| 6448 ** array at PATH if PATH is given. Return 0 if that element is not an array. |
| 6449 */ |
| 6450 static void jsonArrayLengthFunc( |
| 6451 sqlite3_context *ctx, |
| 6452 int argc, |
| 6453 sqlite3_value **argv |
| 6454 ){ |
| 6455 JsonParse x; /* The parse */ |
| 6456 sqlite3_int64 n = 0; |
| 6457 u32 i; |
| 6458 JsonNode *pNode; |
| 6459 |
| 6460 if( jsonParse(&x, ctx, (const char*)sqlite3_value_text(argv[0])) ) return; |
| 6461 assert( x.nNode ); |
| 6462 if( argc==2 ){ |
| 6463 const char *zPath = (const char*)sqlite3_value_text(argv[1]); |
| 6464 pNode = jsonLookup(&x, zPath, 0, ctx); |
| 6465 }else{ |
| 6466 pNode = x.aNode; |
| 6467 } |
| 6468 if( pNode==0 ){ |
| 6469 x.nErr = 1; |
| 6470 }else if( pNode->eType==JSON_ARRAY ){ |
| 6471 assert( (pNode->jnFlags & JNODE_APPEND)==0 ); |
| 6472 for(i=1; i<=pNode->n; n++){ |
| 6473 i += jsonNodeSize(&pNode[i]); |
| 6474 } |
| 6475 } |
| 6476 if( x.nErr==0 ) sqlite3_result_int64(ctx, n); |
| 6477 jsonParseReset(&x); |
| 6478 } |
| 6479 |
| 6480 /* |
| 6481 ** json_extract(JSON, PATH, ...) |
| 6482 ** |
| 6483 ** Return the element described by PATH. Return NULL if there is no |
| 6484 ** PATH element. If there are multiple PATHs, then return a JSON array |
| 6485 ** with the result from each path. Throw an error if the JSON or any PATH |
| 6486 ** is malformed. |
| 6487 */ |
| 6488 static void jsonExtractFunc( |
| 6489 sqlite3_context *ctx, |
| 6490 int argc, |
| 6491 sqlite3_value **argv |
| 6492 ){ |
| 6493 JsonParse x; /* The parse */ |
| 6494 JsonNode *pNode; |
| 6495 const char *zPath; |
| 6496 JsonString jx; |
| 6497 int i; |
| 6498 |
| 6499 if( argc<2 ) return; |
| 6500 if( jsonParse(&x, ctx, (const char*)sqlite3_value_text(argv[0])) ) return; |
| 6501 jsonInit(&jx, ctx); |
| 6502 jsonAppendChar(&jx, '['); |
| 6503 for(i=1; i<argc; i++){ |
| 6504 zPath = (const char*)sqlite3_value_text(argv[i]); |
| 6505 pNode = jsonLookup(&x, zPath, 0, ctx); |
| 6506 if( x.nErr ) break; |
| 6507 if( argc>2 ){ |
| 6508 jsonAppendSeparator(&jx); |
| 6509 if( pNode ){ |
| 6510 jsonRenderNode(pNode, &jx, 0); |
| 6511 }else{ |
| 6512 jsonAppendRaw(&jx, "null", 4); |
| 6513 } |
| 6514 }else if( pNode ){ |
| 6515 jsonReturn(pNode, ctx, 0); |
| 6516 } |
| 6517 } |
| 6518 if( argc>2 && i==argc ){ |
| 6519 jsonAppendChar(&jx, ']'); |
| 6520 jsonResult(&jx); |
| 6521 sqlite3_result_subtype(ctx, JSON_SUBTYPE); |
| 6522 } |
| 6523 jsonReset(&jx); |
| 6524 jsonParseReset(&x); |
| 6525 } |
| 6526 |
| 6527 /* |
| 6528 ** Implementation of the json_object(NAME,VALUE,...) function. Return a JSON |
| 6529 ** object that contains all name/value pairs given in the arguments. Or if |
| 6530 ** any name is not a string or if any value is a BLOB, throw an error. |
| 6531 */ |
| 6532 static void jsonObjectFunc( |
| 6533 sqlite3_context *ctx, |
| 6534 int argc, |
| 6535 sqlite3_value **argv |
| 6536 ){ |
| 6537 int i; |
| 6538 JsonString jx; |
| 6539 const char *z; |
| 6540 u32 n; |
| 6541 |
| 6542 if( argc&1 ){ |
| 6543 sqlite3_result_error(ctx, "json_object() requires an even number " |
| 6544 "of arguments", -1); |
| 6545 return; |
| 6546 } |
| 6547 jsonInit(&jx, ctx); |
| 6548 jsonAppendChar(&jx, '{'); |
| 6549 for(i=0; i<argc; i+=2){ |
| 6550 if( sqlite3_value_type(argv[i])!=SQLITE_TEXT ){ |
| 6551 sqlite3_result_error(ctx, "json_object() labels must be TEXT", -1); |
| 6552 jsonReset(&jx); |
| 6553 return; |
| 6554 } |
| 6555 jsonAppendSeparator(&jx); |
| 6556 z = (const char*)sqlite3_value_text(argv[i]); |
| 6557 n = (u32)sqlite3_value_bytes(argv[i]); |
| 6558 jsonAppendString(&jx, z, n); |
| 6559 jsonAppendChar(&jx, ':'); |
| 6560 jsonAppendValue(&jx, argv[i+1]); |
| 6561 } |
| 6562 jsonAppendChar(&jx, '}'); |
| 6563 jsonResult(&jx); |
| 6564 sqlite3_result_subtype(ctx, JSON_SUBTYPE); |
| 6565 } |
| 6566 |
| 6567 |
| 6568 /* |
| 6569 ** json_remove(JSON, PATH, ...) |
| 6570 ** |
| 6571 ** Remove the named elements from JSON and return the result. Malformed |
| 6572 ** JSON or PATH arguments result in an error. |
| 6573 */ |
| 6574 static void jsonRemoveFunc( |
| 6575 sqlite3_context *ctx, |
| 6576 int argc, |
| 6577 sqlite3_value **argv |
| 6578 ){ |
| 6579 JsonParse x; /* The parse */ |
| 6580 JsonNode *pNode; |
| 6581 const char *zPath; |
| 6582 u32 i; |
| 6583 |
| 6584 if( argc<1 ) return; |
| 6585 if( jsonParse(&x, ctx, (const char*)sqlite3_value_text(argv[0])) ) return; |
| 6586 assert( x.nNode ); |
| 6587 for(i=1; i<(u32)argc; i++){ |
| 6588 zPath = (const char*)sqlite3_value_text(argv[i]); |
| 6589 if( zPath==0 ) goto remove_done; |
| 6590 pNode = jsonLookup(&x, zPath, 0, ctx); |
| 6591 if( x.nErr ) goto remove_done; |
| 6592 if( pNode ) pNode->jnFlags |= JNODE_REMOVE; |
| 6593 } |
| 6594 if( (x.aNode[0].jnFlags & JNODE_REMOVE)==0 ){ |
| 6595 jsonReturnJson(x.aNode, ctx, 0); |
| 6596 } |
| 6597 remove_done: |
| 6598 jsonParseReset(&x); |
| 6599 } |
| 6600 |
| 6601 /* |
| 6602 ** json_replace(JSON, PATH, VALUE, ...) |
| 6603 ** |
| 6604 ** Replace the value at PATH with VALUE. If PATH does not already exist, |
| 6605 ** this routine is a no-op. If JSON or PATH is malformed, throw an error. |
| 6606 */ |
| 6607 static void jsonReplaceFunc( |
| 6608 sqlite3_context *ctx, |
| 6609 int argc, |
| 6610 sqlite3_value **argv |
| 6611 ){ |
| 6612 JsonParse x; /* The parse */ |
| 6613 JsonNode *pNode; |
| 6614 const char *zPath; |
| 6615 u32 i; |
| 6616 |
| 6617 if( argc<1 ) return; |
| 6618 if( (argc&1)==0 ) { |
| 6619 jsonWrongNumArgs(ctx, "replace"); |
| 6620 return; |
| 6621 } |
| 6622 if( jsonParse(&x, ctx, (const char*)sqlite3_value_text(argv[0])) ) return; |
| 6623 assert( x.nNode ); |
| 6624 for(i=1; i<(u32)argc; i+=2){ |
| 6625 zPath = (const char*)sqlite3_value_text(argv[i]); |
| 6626 pNode = jsonLookup(&x, zPath, 0, ctx); |
| 6627 if( x.nErr ) goto replace_err; |
| 6628 if( pNode ){ |
| 6629 pNode->jnFlags |= (u8)JNODE_REPLACE; |
| 6630 pNode->iVal = (u8)(i+1); |
| 6631 } |
| 6632 } |
| 6633 if( x.aNode[0].jnFlags & JNODE_REPLACE ){ |
| 6634 sqlite3_result_value(ctx, argv[x.aNode[0].iVal]); |
| 6635 }else{ |
| 6636 jsonReturnJson(x.aNode, ctx, argv); |
| 6637 } |
| 6638 replace_err: |
| 6639 jsonParseReset(&x); |
| 6640 } |
| 6641 |
| 6642 /* |
| 6643 ** json_set(JSON, PATH, VALUE, ...) |
| 6644 ** |
| 6645 ** Set the value at PATH to VALUE. Create the PATH if it does not already |
| 6646 ** exist. Overwrite existing values that do exist. |
| 6647 ** If JSON or PATH is malformed, throw an error. |
| 6648 ** |
| 6649 ** json_insert(JSON, PATH, VALUE, ...) |
| 6650 ** |
| 6651 ** Create PATH and initialize it to VALUE. If PATH already exists, this |
| 6652 ** routine is a no-op. If JSON or PATH is malformed, throw an error. |
| 6653 */ |
| 6654 static void jsonSetFunc( |
| 6655 sqlite3_context *ctx, |
| 6656 int argc, |
| 6657 sqlite3_value **argv |
| 6658 ){ |
| 6659 JsonParse x; /* The parse */ |
| 6660 JsonNode *pNode; |
| 6661 const char *zPath; |
| 6662 u32 i; |
| 6663 int bApnd; |
| 6664 int bIsSet = *(int*)sqlite3_user_data(ctx); |
| 6665 |
| 6666 if( argc<1 ) return; |
| 6667 if( (argc&1)==0 ) { |
| 6668 jsonWrongNumArgs(ctx, bIsSet ? "set" : "insert"); |
| 6669 return; |
| 6670 } |
| 6671 if( jsonParse(&x, ctx, (const char*)sqlite3_value_text(argv[0])) ) return; |
| 6672 assert( x.nNode ); |
| 6673 for(i=1; i<(u32)argc; i+=2){ |
| 6674 zPath = (const char*)sqlite3_value_text(argv[i]); |
| 6675 bApnd = 0; |
| 6676 pNode = jsonLookup(&x, zPath, &bApnd, ctx); |
| 6677 if( x.oom ){ |
| 6678 sqlite3_result_error_nomem(ctx); |
| 6679 goto jsonSetDone; |
| 6680 }else if( x.nErr ){ |
| 6681 goto jsonSetDone; |
| 6682 }else if( pNode && (bApnd || bIsSet) ){ |
| 6683 pNode->jnFlags |= (u8)JNODE_REPLACE; |
| 6684 pNode->iVal = (u8)(i+1); |
| 6685 } |
| 6686 } |
| 6687 if( x.aNode[0].jnFlags & JNODE_REPLACE ){ |
| 6688 sqlite3_result_value(ctx, argv[x.aNode[0].iVal]); |
| 6689 }else{ |
| 6690 jsonReturnJson(x.aNode, ctx, argv); |
| 6691 } |
| 6692 jsonSetDone: |
| 6693 jsonParseReset(&x); |
| 6694 } |
| 6695 |
| 6696 /* |
| 6697 ** json_type(JSON) |
| 6698 ** json_type(JSON, PATH) |
| 6699 ** |
| 6700 ** Return the top-level "type" of a JSON string. Throw an error if |
| 6701 ** either the JSON or PATH inputs are not well-formed. |
| 6702 */ |
| 6703 static void jsonTypeFunc( |
| 6704 sqlite3_context *ctx, |
| 6705 int argc, |
| 6706 sqlite3_value **argv |
| 6707 ){ |
| 6708 JsonParse x; /* The parse */ |
| 6709 const char *zPath; |
| 6710 JsonNode *pNode; |
| 6711 |
| 6712 if( jsonParse(&x, ctx, (const char*)sqlite3_value_text(argv[0])) ) return; |
| 6713 assert( x.nNode ); |
| 6714 if( argc==2 ){ |
| 6715 zPath = (const char*)sqlite3_value_text(argv[1]); |
| 6716 pNode = jsonLookup(&x, zPath, 0, ctx); |
| 6717 }else{ |
| 6718 pNode = x.aNode; |
| 6719 } |
| 6720 if( pNode ){ |
| 6721 sqlite3_result_text(ctx, jsonType[pNode->eType], -1, SQLITE_STATIC); |
| 6722 } |
| 6723 jsonParseReset(&x); |
| 6724 } |
| 6725 |
| 6726 /* |
| 6727 ** json_valid(JSON) |
| 6728 ** |
| 6729 ** Return 1 if JSON is a well-formed JSON string according to RFC-7159. |
| 6730 ** Return 0 otherwise. |
| 6731 */ |
| 6732 static void jsonValidFunc( |
| 6733 sqlite3_context *ctx, |
| 6734 int argc, |
| 6735 sqlite3_value **argv |
| 6736 ){ |
| 6737 JsonParse x; /* The parse */ |
| 6738 int rc = 0; |
| 6739 |
| 6740 UNUSED_PARAM(argc); |
| 6741 if( jsonParse(&x, 0, (const char*)sqlite3_value_text(argv[0]))==0 ){ |
| 6742 rc = 1; |
| 6743 } |
| 6744 jsonParseReset(&x); |
| 6745 sqlite3_result_int(ctx, rc); |
| 6746 } |
| 6747 |
| 6748 |
| 6749 /**************************************************************************** |
| 6750 ** Aggregate SQL function implementations |
| 6751 ****************************************************************************/ |
| 6752 /* |
| 6753 ** json_group_array(VALUE) |
| 6754 ** |
| 6755 ** Return a JSON array composed of all values in the aggregate. |
| 6756 */ |
| 6757 static void jsonArrayStep( |
| 6758 sqlite3_context *ctx, |
| 6759 int argc, |
| 6760 sqlite3_value **argv |
| 6761 ){ |
| 6762 JsonString *pStr; |
| 6763 pStr = (JsonString*)sqlite3_aggregate_context(ctx, sizeof(*pStr)); |
| 6764 if( pStr ){ |
| 6765 if( pStr->zBuf==0 ){ |
| 6766 jsonInit(pStr, ctx); |
| 6767 jsonAppendChar(pStr, '['); |
| 6768 }else{ |
| 6769 jsonAppendChar(pStr, ','); |
| 6770 pStr->pCtx = ctx; |
| 6771 } |
| 6772 jsonAppendValue(pStr, argv[0]); |
| 6773 } |
| 6774 } |
| 6775 static void jsonArrayFinal(sqlite3_context *ctx){ |
| 6776 JsonString *pStr; |
| 6777 pStr = (JsonString*)sqlite3_aggregate_context(ctx, 0); |
| 6778 if( pStr ){ |
| 6779 pStr->pCtx = ctx; |
| 6780 jsonAppendChar(pStr, ']'); |
| 6781 if( pStr->bErr ){ |
| 6782 sqlite3_result_error_nomem(ctx); |
| 6783 assert( pStr->bStatic ); |
| 6784 }else{ |
| 6785 sqlite3_result_text(ctx, pStr->zBuf, pStr->nUsed, |
| 6786 pStr->bStatic ? SQLITE_TRANSIENT : sqlite3_free); |
| 6787 pStr->bStatic = 1; |
| 6788 } |
| 6789 }else{ |
| 6790 sqlite3_result_text(ctx, "[]", 2, SQLITE_STATIC); |
| 6791 } |
| 6792 sqlite3_result_subtype(ctx, JSON_SUBTYPE); |
| 6793 } |
| 6794 |
| 6795 /* |
| 6796 ** json_group_object(NAME,VALUE) |
| 6797 ** |
| 6798 ** Return a JSON object composed of all names and values in the aggregate. |
| 6799 */ |
| 6800 static void jsonObjectStep( |
| 6801 sqlite3_context *ctx, |
| 6802 int argc, |
| 6803 sqlite3_value **argv |
| 6804 ){ |
| 6805 JsonString *pStr; |
| 6806 const char *z; |
| 6807 u32 n; |
| 6808 pStr = (JsonString*)sqlite3_aggregate_context(ctx, sizeof(*pStr)); |
| 6809 if( pStr ){ |
| 6810 if( pStr->zBuf==0 ){ |
| 6811 jsonInit(pStr, ctx); |
| 6812 jsonAppendChar(pStr, '{'); |
| 6813 }else{ |
| 6814 jsonAppendChar(pStr, ','); |
| 6815 pStr->pCtx = ctx; |
| 6816 } |
| 6817 z = (const char*)sqlite3_value_text(argv[0]); |
| 6818 n = (u32)sqlite3_value_bytes(argv[0]); |
| 6819 jsonAppendString(pStr, z, n); |
| 6820 jsonAppendChar(pStr, ':'); |
| 6821 jsonAppendValue(pStr, argv[1]); |
| 6822 } |
| 6823 } |
| 6824 static void jsonObjectFinal(sqlite3_context *ctx){ |
| 6825 JsonString *pStr; |
| 6826 pStr = (JsonString*)sqlite3_aggregate_context(ctx, 0); |
| 6827 if( pStr ){ |
| 6828 jsonAppendChar(pStr, '}'); |
| 6829 if( pStr->bErr ){ |
| 6830 sqlite3_result_error_nomem(ctx); |
| 6831 assert( pStr->bStatic ); |
| 6832 }else{ |
| 6833 sqlite3_result_text(ctx, pStr->zBuf, pStr->nUsed, |
| 6834 pStr->bStatic ? SQLITE_TRANSIENT : sqlite3_free); |
| 6835 pStr->bStatic = 1; |
| 6836 } |
| 6837 }else{ |
| 6838 sqlite3_result_text(ctx, "{}", 2, SQLITE_STATIC); |
| 6839 } |
| 6840 sqlite3_result_subtype(ctx, JSON_SUBTYPE); |
| 6841 } |
| 6842 |
| 6843 |
| 6844 #ifndef SQLITE_OMIT_VIRTUALTABLE |
| 6845 /**************************************************************************** |
| 6846 ** The json_each virtual table |
| 6847 ****************************************************************************/ |
| 6848 typedef struct JsonEachCursor JsonEachCursor; |
| 6849 struct JsonEachCursor { |
| 6850 sqlite3_vtab_cursor base; /* Base class - must be first */ |
| 6851 u32 iRowid; /* The rowid */ |
| 6852 u32 iBegin; /* The first node of the scan */ |
| 6853 u32 i; /* Index in sParse.aNode[] of current row */ |
| 6854 u32 iEnd; /* EOF when i equals or exceeds this value */ |
| 6855 u8 eType; /* Type of top-level element */ |
| 6856 u8 bRecursive; /* True for json_tree(). False for json_each() */ |
| 6857 char *zJson; /* Input JSON */ |
| 6858 char *zRoot; /* Path by which to filter zJson */ |
| 6859 JsonParse sParse; /* Parse of the input JSON */ |
| 6860 }; |
| 6861 |
| 6862 /* Constructor for the json_each virtual table */ |
| 6863 static int jsonEachConnect( |
| 6864 sqlite3 *db, |
| 6865 void *pAux, |
| 6866 int argc, const char *const*argv, |
| 6867 sqlite3_vtab **ppVtab, |
| 6868 char **pzErr |
| 6869 ){ |
| 6870 sqlite3_vtab *pNew; |
| 6871 int rc; |
| 6872 |
| 6873 /* Column numbers */ |
| 6874 #define JEACH_KEY 0 |
| 6875 #define JEACH_VALUE 1 |
| 6876 #define JEACH_TYPE 2 |
| 6877 #define JEACH_ATOM 3 |
| 6878 #define JEACH_ID 4 |
| 6879 #define JEACH_PARENT 5 |
| 6880 #define JEACH_FULLKEY 6 |
| 6881 #define JEACH_PATH 7 |
| 6882 #define JEACH_JSON 8 |
| 6883 #define JEACH_ROOT 9 |
| 6884 |
| 6885 UNUSED_PARAM(pzErr); |
| 6886 UNUSED_PARAM(argv); |
| 6887 UNUSED_PARAM(argc); |
| 6888 UNUSED_PARAM(pAux); |
| 6889 rc = sqlite3_declare_vtab(db, |
| 6890 "CREATE TABLE x(key,value,type,atom,id,parent,fullkey,path," |
| 6891 "json HIDDEN,root HIDDEN)"); |
| 6892 if( rc==SQLITE_OK ){ |
| 6893 pNew = *ppVtab = sqlite3_malloc( sizeof(*pNew) ); |
| 6894 if( pNew==0 ) return SQLITE_NOMEM; |
| 6895 memset(pNew, 0, sizeof(*pNew)); |
| 6896 } |
| 6897 return rc; |
| 6898 } |
| 6899 |
| 6900 /* destructor for json_each virtual table */ |
| 6901 static int jsonEachDisconnect(sqlite3_vtab *pVtab){ |
| 6902 sqlite3_free(pVtab); |
| 6903 return SQLITE_OK; |
| 6904 } |
| 6905 |
| 6906 /* constructor for a JsonEachCursor object for json_each(). */ |
| 6907 static int jsonEachOpenEach(sqlite3_vtab *p, sqlite3_vtab_cursor **ppCursor){ |
| 6908 JsonEachCursor *pCur; |
| 6909 |
| 6910 UNUSED_PARAM(p); |
| 6911 pCur = sqlite3_malloc( sizeof(*pCur) ); |
| 6912 if( pCur==0 ) return SQLITE_NOMEM; |
| 6913 memset(pCur, 0, sizeof(*pCur)); |
| 6914 *ppCursor = &pCur->base; |
| 6915 return SQLITE_OK; |
| 6916 } |
| 6917 |
| 6918 /* constructor for a JsonEachCursor object for json_tree(). */ |
| 6919 static int jsonEachOpenTree(sqlite3_vtab *p, sqlite3_vtab_cursor **ppCursor){ |
| 6920 int rc = jsonEachOpenEach(p, ppCursor); |
| 6921 if( rc==SQLITE_OK ){ |
| 6922 JsonEachCursor *pCur = (JsonEachCursor*)*ppCursor; |
| 6923 pCur->bRecursive = 1; |
| 6924 } |
| 6925 return rc; |
| 6926 } |
| 6927 |
| 6928 /* Reset a JsonEachCursor back to its original state. Free any memory |
| 6929 ** held. */ |
| 6930 static void jsonEachCursorReset(JsonEachCursor *p){ |
| 6931 sqlite3_free(p->zJson); |
| 6932 sqlite3_free(p->zRoot); |
| 6933 jsonParseReset(&p->sParse); |
| 6934 p->iRowid = 0; |
| 6935 p->i = 0; |
| 6936 p->iEnd = 0; |
| 6937 p->eType = 0; |
| 6938 p->zJson = 0; |
| 6939 p->zRoot = 0; |
| 6940 } |
| 6941 |
| 6942 /* Destructor for a jsonEachCursor object */ |
| 6943 static int jsonEachClose(sqlite3_vtab_cursor *cur){ |
| 6944 JsonEachCursor *p = (JsonEachCursor*)cur; |
| 6945 jsonEachCursorReset(p); |
| 6946 sqlite3_free(cur); |
| 6947 return SQLITE_OK; |
| 6948 } |
| 6949 |
| 6950 /* Return TRUE if the jsonEachCursor object has been advanced off the end |
| 6951 ** of the JSON object */ |
| 6952 static int jsonEachEof(sqlite3_vtab_cursor *cur){ |
| 6953 JsonEachCursor *p = (JsonEachCursor*)cur; |
| 6954 return p->i >= p->iEnd; |
| 6955 } |
| 6956 |
| 6957 /* Advance the cursor to the next element for json_each() or json_tree() */ |
| 6958 static int jsonEachNext(sqlite3_vtab_cursor *cur){ |
| 6959 JsonEachCursor *p = (JsonEachCursor*)cur; |
| 6960 if( p->bRecursive ){ |
| 6961 if( p->sParse.aNode[p->i].jnFlags & JNODE_LABEL ) p->i++; |
| 6962 p->i++; |
| 6963 p->iRowid++; |
| 6964 if( p->i<p->iEnd ){ |
| 6965 u32 iUp = p->sParse.aUp[p->i]; |
| 6966 JsonNode *pUp = &p->sParse.aNode[iUp]; |
| 6967 p->eType = pUp->eType; |
| 6968 if( pUp->eType==JSON_ARRAY ){ |
| 6969 if( iUp==p->i-1 ){ |
| 6970 pUp->u.iKey = 0; |
| 6971 }else{ |
| 6972 pUp->u.iKey++; |
| 6973 } |
| 6974 } |
| 6975 } |
| 6976 }else{ |
| 6977 switch( p->eType ){ |
| 6978 case JSON_ARRAY: { |
| 6979 p->i += jsonNodeSize(&p->sParse.aNode[p->i]); |
| 6980 p->iRowid++; |
| 6981 break; |
| 6982 } |
| 6983 case JSON_OBJECT: { |
| 6984 p->i += 1 + jsonNodeSize(&p->sParse.aNode[p->i+1]); |
| 6985 p->iRowid++; |
| 6986 break; |
| 6987 } |
| 6988 default: { |
| 6989 p->i = p->iEnd; |
| 6990 break; |
| 6991 } |
| 6992 } |
| 6993 } |
| 6994 return SQLITE_OK; |
| 6995 } |
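The non-recursive branch of jsonEachNext() relies on the parse being a flat array in which a container node is immediately followed by all of its descendants, so stepping over a child means adding that child's subtree size to the index. The following is a minimal sketch of that idea, not the json1.c code itself: the `Node` type, the `T_INT`/`T_ARRAY` constants, and the `nodeSize()` and `topLevelChildren()` helpers are hypothetical simplifications introduced for illustration, and object labels are left out.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for JsonNode. Container types are
** numbered after primitive types, as the eType>=JSON_ARRAY tests assume. */
enum { T_INT = 0, T_ARRAY = 1 };

typedef struct {
  int eType;      /* T_INT or T_ARRAY */
  unsigned n;     /* For T_ARRAY: number of descendant nodes that follow */
} Node;

/* Number of array slots used by a node together with its descendants. */
static unsigned nodeSize(const Node *p){
  return p->eType>=T_ARRAY ? p->n+1 : 1;
}

/* Record the index of each direct child of the array at aNode[iRoot] in
** aOut[] (at most nOut entries). Returns the number of children recorded. */
static unsigned topLevelChildren(
  const Node *aNode, unsigned iRoot, unsigned *aOut, unsigned nOut
){
  unsigned iEnd = iRoot + aNode[iRoot].n + 1;  /* One past last descendant */
  unsigned i = iRoot + 1;                      /* First child */
  unsigned nFound = 0;
  assert( aNode[iRoot].eType==T_ARRAY );
  while( i<iEnd && nFound<nOut ){
    aOut[nFound++] = i;
    i += nodeSize(&aNode[i]);                  /* Skip the child's subtree */
  }
  return nFound;
}
```

For the JSON text `[1,[2,3],4]` laid out as six nodes, the walk visits indexes 1, 2 and 5, skipping the two nodes nested inside the inner array.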
| 6996 |
| 6997 /* Append the name of the path for element i to pStr |
| 6998 */ |
| 6999 static void jsonEachComputePath( |
| 7000 JsonEachCursor *p, /* The cursor */ |
| 7001 JsonString *pStr, /* Write the path here */ |
| 7002 u32 i /* Path to this element */ |
| 7003 ){ |
| 7004 JsonNode *pNode, *pUp; |
| 7005 u32 iUp; |
| 7006 if( i==0 ){ |
| 7007 jsonAppendChar(pStr, '$'); |
| 7008 return; |
| 7009 } |
| 7010 iUp = p->sParse.aUp[i]; |
| 7011 jsonEachComputePath(p, pStr, iUp); |
| 7012 pNode = &p->sParse.aNode[i]; |
| 7013 pUp = &p->sParse.aNode[iUp]; |
| 7014 if( pUp->eType==JSON_ARRAY ){ |
| 7015 jsonPrintf(30, pStr, "[%d]", pUp->u.iKey); |
| 7016 }else{ |
| 7017 assert( pUp->eType==JSON_OBJECT ); |
| 7018 if( (pNode->jnFlags & JNODE_LABEL)==0 ) pNode--; |
| 7019 assert( pNode->eType==JSON_STRING ); |
| 7020 assert( pNode->jnFlags & JNODE_LABEL ); |
| 7021 jsonPrintf(pNode->n+1, pStr, ".%.*s", pNode->n-2, pNode->u.zJContent+1); |
| 7022 } |
| 7023 } |
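jsonEachComputePath() builds a path by recursing to the parent first and then appending the current element's own step, so the text comes out in root-to-leaf order. The sketch below shows the same recursion on a simplified model. The `PathNode` type and `computePath()` function are hypothetical; unlike json1.c, which keeps the array position on the parent node (`u.iKey`) and stores object labels as separate nodes, this model stores the key or position directly on each child.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical simplified node: just enough to rebuild a path. */
typedef struct {
  unsigned iUp;        /* Index of the parent node; unused for node 0 */
  const char *zKey;    /* Key if the parent is an object, else NULL */
  unsigned iIdx;       /* Position if the parent is an array */
} PathNode;

/* Write the path of node i into zOut, a buffer of nOut bytes (nOut>0).
** Output is truncated if the buffer is too small. */
static void computePath(const PathNode *a, unsigned i, char *zOut, size_t nOut){
  size_t n;
  assert( nOut>0 );
  if( i==0 ){
    snprintf(zOut, nOut, "$");              /* The root is always "$" */
    return;
  }
  computePath(a, a[i].iUp, zOut, nOut);     /* Parent's path comes first */
  n = strlen(zOut);
  if( a[i].zKey ){
    snprintf(zOut+n, nOut-n, ".%s", a[i].zKey);
  }else{
    snprintf(zOut+n, nOut-n, "[%u]", a[i].iIdx);
  }
}
```

With nodes modelling `{"a":[10,{"b":1}]}`, the node holding the value of "b" yields the path `$.a[1].b`.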
| 7024 |
| 7025 /* Return the value of a column */ |
| 7026 static int jsonEachColumn( |
| 7027 sqlite3_vtab_cursor *cur, /* The cursor */ |
| 7028 sqlite3_context *ctx, /* First argument to sqlite3_result_...() */ |
| 7029 int i /* Which column to return */ |
| 7030 ){ |
| 7031 JsonEachCursor *p = (JsonEachCursor*)cur; |
| 7032 JsonNode *pThis = &p->sParse.aNode[p->i]; |
| 7033 switch( i ){ |
| 7034 case JEACH_KEY: { |
| 7035 if( p->i==0 ) break; |
| 7036 if( p->eType==JSON_OBJECT ){ |
| 7037 jsonReturn(pThis, ctx, 0); |
| 7038 }else if( p->eType==JSON_ARRAY ){ |
| 7039 u32 iKey; |
| 7040 if( p->bRecursive ){ |
| 7041 if( p->iRowid==0 ) break; |
| 7042 iKey = p->sParse.aNode[p->sParse.aUp[p->i]].u.iKey; |
| 7043 }else{ |
| 7044 iKey = p->iRowid; |
| 7045 } |
| 7046 sqlite3_result_int64(ctx, (sqlite3_int64)iKey); |
| 7047 } |
| 7048 break; |
| 7049 } |
| 7050 case JEACH_VALUE: { |
| 7051 if( pThis->jnFlags & JNODE_LABEL ) pThis++; |
| 7052 jsonReturn(pThis, ctx, 0); |
| 7053 break; |
| 7054 } |
| 7055 case JEACH_TYPE: { |
| 7056 if( pThis->jnFlags & JNODE_LABEL ) pThis++; |
| 7057 sqlite3_result_text(ctx, jsonType[pThis->eType], -1, SQLITE_STATIC); |
| 7058 break; |
| 7059 } |
| 7060 case JEACH_ATOM: { |
| 7061 if( pThis->jnFlags & JNODE_LABEL ) pThis++; |
| 7062 if( pThis->eType>=JSON_ARRAY ) break; |
| 7063 jsonReturn(pThis, ctx, 0); |
| 7064 break; |
| 7065 } |
| 7066 case JEACH_ID: { |
| 7067 sqlite3_result_int64(ctx, |
| 7068 (sqlite3_int64)p->i + ((pThis->jnFlags & JNODE_LABEL)!=0)); |
| 7069 break; |
| 7070 } |
| 7071 case JEACH_PARENT: { |
| 7072 if( p->i>p->iBegin && p->bRecursive ){ |
| 7073 sqlite3_result_int64(ctx, (sqlite3_int64)p->sParse.aUp[p->i]); |
| 7074 } |
| 7075 break; |
| 7076 } |
| 7077 case JEACH_FULLKEY: { |
| 7078 JsonString x; |
| 7079 jsonInit(&x, ctx); |
| 7080 if( p->bRecursive ){ |
| 7081 jsonEachComputePath(p, &x, p->i); |
| 7082 }else{ |
| 7083 if( p->zRoot ){ |
| 7084 jsonAppendRaw(&x, p->zRoot, (int)strlen(p->zRoot)); |
| 7085 }else{ |
| 7086 jsonAppendChar(&x, '$'); |
| 7087 } |
| 7088 if( p->eType==JSON_ARRAY ){ |
| 7089 jsonPrintf(30, &x, "[%d]", p->iRowid); |
| 7090 }else{ |
| 7091 jsonPrintf(pThis->n, &x, ".%.*s", pThis->n-2, pThis->u.zJContent+1); |
| 7092 } |
| 7093 } |
| 7094 jsonResult(&x); |
| 7095 break; |
| 7096 } |
| 7097 case JEACH_PATH: { |
| 7098 if( p->bRecursive ){ |
| 7099 JsonString x; |
| 7100 jsonInit(&x, ctx); |
| 7101 jsonEachComputePath(p, &x, p->sParse.aUp[p->i]); |
| 7102 jsonResult(&x); |
| 7103 break; |
| 7104 } |
| 7105 /* For json_each() path and root are the same so fall through |
| 7106 ** into the root case */ |
| 7107 } |
| 7108 case JEACH_ROOT: { |
| 7109 const char *zRoot = p->zRoot; |
| 7110 if( zRoot==0 ) zRoot = "$"; |
| 7111 sqlite3_result_text(ctx, zRoot, -1, SQLITE_STATIC); |
| 7112 break; |
| 7113 } |
| 7114 case JEACH_JSON: { |
| 7115 assert( i==JEACH_JSON ); |
| 7116 sqlite3_result_text(ctx, p->sParse.zJson, -1, SQLITE_STATIC); |
| 7117 break; |
| 7118 } |
| 7119 } |
| 7120 return SQLITE_OK; |
| 7121 } |
| 7122 |
| 7123 /* Return the current rowid value */ |
| 7124 static int jsonEachRowid(sqlite3_vtab_cursor *cur, sqlite_int64 *pRowid){ |
| 7125 JsonEachCursor *p = (JsonEachCursor*)cur; |
| 7126 *pRowid = p->iRowid; |
| 7127 return SQLITE_OK; |
| 7128 } |
| 7129 |
| 7130 /* The query strategy is to look for an equality constraint on the json |
| 7131 ** column. Without such a constraint, the table cannot operate. idxNum is |
| 7132 ** 1 if the constraint is found, 3 if the constraint and zRoot are found, |
| 7133 ** and 0 otherwise. |
| 7134 */ |
| 7135 static int jsonEachBestIndex( |
| 7136 sqlite3_vtab *tab, |
| 7137 sqlite3_index_info *pIdxInfo |
| 7138 ){ |
| 7139 int i; |
| 7140 int jsonIdx = -1; |
| 7141 int rootIdx = -1; |
| 7142 const struct sqlite3_index_constraint *pConstraint; |
| 7143 |
| 7144 UNUSED_PARAM(tab); |
| 7145 pConstraint = pIdxInfo->aConstraint; |
| 7146 for(i=0; i<pIdxInfo->nConstraint; i++, pConstraint++){ |
| 7147 if( pConstraint->usable==0 ) continue; |
| 7148 if( pConstraint->op!=SQLITE_INDEX_CONSTRAINT_EQ ) continue; |
| 7149 switch( pConstraint->iColumn ){ |
| 7150 case JEACH_JSON: jsonIdx = i; break; |
| 7151 case JEACH_ROOT: rootIdx = i; break; |
| 7152 default: /* no-op */ break; |
| 7153 } |
| 7154 } |
| 7155 if( jsonIdx<0 ){ |
| 7156 pIdxInfo->idxNum = 0; |
| 7157 pIdxInfo->estimatedCost = 1e99; |
| 7158 }else{ |
| 7159 pIdxInfo->estimatedCost = 1.0; |
| 7160 pIdxInfo->aConstraintUsage[jsonIdx].argvIndex = 1; |
| 7161 pIdxInfo->aConstraintUsage[jsonIdx].omit = 1; |
| 7162 if( rootIdx<0 ){ |
| 7163 pIdxInfo->idxNum = 1; |
| 7164 }else{ |
| 7165 pIdxInfo->aConstraintUsage[rootIdx].argvIndex = 2; |
| 7166 pIdxInfo->aConstraintUsage[rootIdx].omit = 1; |
| 7167 pIdxInfo->idxNum = 3; |
| 7168 } |
| 7169 } |
| 7170 return SQLITE_OK; |
| 7171 } |
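The idxNum decision in jsonEachBestIndex() can be read in isolation from the cost and argvIndex bookkeeping. The sketch below reproduces only that decision. The `Constraint` struct and `chooseIdxNum()` are hypothetical simplifications of sqlite3_index_constraint and the function above, with the operator reduced to an "is equality" flag; the column numbers mirror JEACH_JSON and JEACH_ROOT.

```c
#include <assert.h>

enum { COL_JSON = 8, COL_ROOT = 9 };  /* Mirror JEACH_JSON and JEACH_ROOT */

/* Hypothetical, reduced form of sqlite3_index_constraint. */
typedef struct {
  int iColumn;   /* Column the constraint applies to */
  int isEq;      /* True if the operator is equality */
  int usable;    /* True if the planner allows this constraint to be used */
} Constraint;

/* Return 0 if there is no usable equality constraint on the json column,
** 1 if only json is constrained, and 3 if both json and root are. */
static int chooseIdxNum(const Constraint *a, int n){
  int jsonIdx = -1;
  int rootIdx = -1;
  int i;
  assert( n>=0 );
  for(i=0; i<n; i++){
    if( !a[i].usable || !a[i].isEq ) continue;
    if( a[i].iColumn==COL_JSON ) jsonIdx = i;
    else if( a[i].iColumn==COL_ROOT ) rootIdx = i;
  }
  if( jsonIdx<0 ) return 0;        /* Table cannot operate without json */
  return rootIdx<0 ? 1 : 3;
}
```

A constraint on root alone still returns 0, which matches the comment above: without the json argument there is nothing to scan.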
| 7172 |
| 7173 /* Start a search on a new JSON string */ |
| 7174 static int jsonEachFilter( |
| 7175 sqlite3_vtab_cursor *cur, |
| 7176 int idxNum, const char *idxStr, |
| 7177 int argc, sqlite3_value **argv |
| 7178 ){ |
| 7179 JsonEachCursor *p = (JsonEachCursor*)cur; |
| 7180 const char *z; |
| 7181 const char *zRoot = 0; |
| 7182 sqlite3_int64 n; |
| 7183 |
| 7184 UNUSED_PARAM(idxStr); |
| 7185 UNUSED_PARAM(argc); |
| 7186 jsonEachCursorReset(p); |
| 7187 if( idxNum==0 ) return SQLITE_OK; |
| 7188 z = (const char*)sqlite3_value_text(argv[0]); |
| 7189 if( z==0 ) return SQLITE_OK; |
| 7190 n = sqlite3_value_bytes(argv[0]); |
| 7191 p->zJson = sqlite3_malloc64( n+1 ); |
| 7192 if( p->zJson==0 ) return SQLITE_NOMEM; |
| 7193 memcpy(p->zJson, z, (size_t)n+1); |
| 7194 if( jsonParse(&p->sParse, 0, p->zJson) ){ |
| 7195 int rc = SQLITE_NOMEM; |
| 7196 if( p->sParse.oom==0 ){ |
| 7197 sqlite3_free(cur->pVtab->zErrMsg); |
| 7198 cur->pVtab->zErrMsg = sqlite3_mprintf("malformed JSON"); |
| 7199 if( cur->pVtab->zErrMsg ) rc = SQLITE_ERROR; |
| 7200 } |
| 7201 jsonEachCursorReset(p); |
| 7202 return rc; |
| 7203 }else if( p->bRecursive && jsonParseFindParents(&p->sParse) ){ |
| 7204 jsonEachCursorReset(p); |
| 7205 return SQLITE_NOMEM; |
| 7206 }else{ |
| 7207 JsonNode *pNode = 0; |
| 7208 if( idxNum==3 ){ |
| 7209 const char *zErr = 0; |
| 7210 zRoot = (const char*)sqlite3_value_text(argv[1]); |
| 7211 if( zRoot==0 ) return SQLITE_OK; |
| 7212 n = sqlite3_value_bytes(argv[1]); |
| 7213 p->zRoot = sqlite3_malloc64( n+1 ); |
| 7214 if( p->zRoot==0 ) return SQLITE_NOMEM; |
| 7215 memcpy(p->zRoot, zRoot, (size_t)n+1); |
| 7216 if( zRoot[0]!='$' ){ |
| 7217 zErr = zRoot; |
| 7218 }else{ |
| 7219 pNode = jsonLookupStep(&p->sParse, 0, p->zRoot+1, 0, &zErr); |
| 7220 } |
| 7221 if( zErr ){ |
| 7222 sqlite3_free(cur->pVtab->zErrMsg); |
| 7223 cur->pVtab->zErrMsg = jsonPathSyntaxError(zErr); |
| 7224 jsonEachCursorReset(p); |
| 7225 return cur->pVtab->zErrMsg ? SQLITE_ERROR : SQLITE_NOMEM; |
| 7226 }else if( pNode==0 ){ |
| 7227 return SQLITE_OK; |
| 7228 } |
| 7229 }else{ |
| 7230 pNode = p->sParse.aNode; |
| 7231 } |
| 7232 p->iBegin = p->i = (int)(pNode - p->sParse.aNode); |
| 7233 p->eType = pNode->eType; |
| 7234 if( p->eType>=JSON_ARRAY ){ |
| 7235 pNode->u.iKey = 0; |
| 7236 p->iEnd = p->i + pNode->n + 1; |
| 7237 if( p->bRecursive ){ |
| 7238 p->eType = p->sParse.aNode[p->sParse.aUp[p->i]].eType; |
| 7239 if( p->i>0 && (p->sParse.aNode[p->i-1].jnFlags & JNODE_LABEL)!=0 ){ |
| 7240 p->i--; |
| 7241 } |
| 7242 }else{ |
| 7243 p->i++; |
| 7244 } |
| 7245 }else{ |
| 7246 p->iEnd = p->i+1; |
| 7247 } |
| 7248 } |
| 7249 return SQLITE_OK; |
| 7250 } |
| 7251 |
| 7252 /* The methods of the json_each virtual table */ |
| 7253 static sqlite3_module jsonEachModule = { |
| 7254 0, /* iVersion */ |
| 7255 0, /* xCreate */ |
| 7256 jsonEachConnect, /* xConnect */ |
| 7257 jsonEachBestIndex, /* xBestIndex */ |
| 7258 jsonEachDisconnect, /* xDisconnect */ |
| 7259 0, /* xDestroy */ |
| 7260 jsonEachOpenEach, /* xOpen - open a cursor */ |
| 7261 jsonEachClose, /* xClose - close a cursor */ |
| 7262 jsonEachFilter, /* xFilter - configure scan constraints */ |
| 7263 jsonEachNext, /* xNext - advance a cursor */ |
| 7264 jsonEachEof, /* xEof - check for end of scan */ |
| 7265 jsonEachColumn, /* xColumn - read data */ |
| 7266 jsonEachRowid, /* xRowid - read the rowid */ |
| 7267 0, /* xUpdate */ |
| 7268 0, /* xBegin */ |
| 7269 0, /* xSync */ |
| 7270 0, /* xCommit */ |
| 7271 0, /* xRollback */ |
| 7272 0, /* xFindMethod */ |
| 7273 0, /* xRename */ |
| 7274 0, /* xSavepoint */ |
| 7275 0, /* xRelease */ |
| 7276 0 /* xRollbackTo */ |
| 7277 }; |
| 7278 |
| 7279 /* The methods of the json_tree virtual table. */ |
| 7280 static sqlite3_module jsonTreeModule = { |
| 7281 0, /* iVersion */ |
| 7282 0, /* xCreate */ |
| 7283 jsonEachConnect, /* xConnect */ |
| 7284 jsonEachBestIndex, /* xBestIndex */ |
| 7285 jsonEachDisconnect, /* xDisconnect */ |
| 7286 0, /* xDestroy */ |
| 7287 jsonEachOpenTree, /* xOpen - open a cursor */ |
| 7288 jsonEachClose, /* xClose - close a cursor */ |
| 7289 jsonEachFilter, /* xFilter - configure scan constraints */ |
| 7290 jsonEachNext, /* xNext - advance a cursor */ |
| 7291 jsonEachEof, /* xEof - check for end of scan */ |
| 7292 jsonEachColumn, /* xColumn - read data */ |
| 7293 jsonEachRowid, /* xRowid - read the rowid */ |
| 7294 0, /* xUpdate */ |
| 7295 0, /* xBegin */ |
| 7296 0, /* xSync */ |
| 7297 0, /* xCommit */ |
| 7298 0, /* xRollback */ |
| 7299 0, /* xFindMethod */ |
| 7300 0, /* xRename */ |
| 7301 0, /* xSavepoint */ |
| 7302 0, /* xRelease */ |
| 7303 0 /* xRollbackTo */ |
| 7304 }; |
| 7305 #endif /* SQLITE_OMIT_VIRTUALTABLE */ |
| 7306 |
| 7307 /**************************************************************************** |
| 7308 ** The following routines are the only publicly visible identifiers in this |
| 7309 ** file. Call the following routines in order to register the various SQL |
| 7310 ** functions and the virtual tables implemented by this file. |
| 7311 ****************************************************************************/ |
| 7312 |
| 7313 SQLITE_PRIVATE int sqlite3Json1Init(sqlite3 *db){ |
| 7314 int rc = SQLITE_OK; |
| 7315 unsigned int i; |
| 7316 static const struct { |
| 7317 const char *zName; |
| 7318 int nArg; |
| 7319 int flag; |
| 7320 void (*xFunc)(sqlite3_context*,int,sqlite3_value**); |
| 7321 } aFunc[] = { |
| 7322 { "json", 1, 0, jsonRemoveFunc }, |
| 7323 { "json_array", -1, 0, jsonArrayFunc }, |
| 7324 { "json_array_length", 1, 0, jsonArrayLengthFunc }, |
| 7325 { "json_array_length", 2, 0, jsonArrayLengthFunc }, |
| 7326 { "json_extract", -1, 0, jsonExtractFunc }, |
| 7327 { "json_insert", -1, 0, jsonSetFunc }, |
| 7328 { "json_object", -1, 0, jsonObjectFunc }, |
| 7329 { "json_remove", -1, 0, jsonRemoveFunc }, |
| 7330 { "json_replace", -1, 0, jsonReplaceFunc }, |
| 7331 { "json_set", -1, 1, jsonSetFunc }, |
| 7332 { "json_type", 1, 0, jsonTypeFunc }, |
| 7333 { "json_type", 2, 0, jsonTypeFunc }, |
| 7334 { "json_valid", 1, 0, jsonValidFunc }, |
| 7335 |
| 7336 #if SQLITE_DEBUG |
| 7337 /* DEBUG and TESTING functions */ |
| 7338 { "json_parse", 1, 0, jsonParseFunc }, |
| 7339 { "json_test1", 1, 0, jsonTest1Func }, |
| 7340 #endif |
| 7341 }; |
| 7342 static const struct { |
| 7343 const char *zName; |
| 7344 int nArg; |
| 7345 void (*xStep)(sqlite3_context*,int,sqlite3_value**); |
| 7346 void (*xFinal)(sqlite3_context*); |
| 7347 } aAgg[] = { |
| 7348 { "json_group_array", 1, jsonArrayStep, jsonArrayFinal }, |
| 7349 { "json_group_object", 2, jsonObjectStep, jsonObjectFinal }, |
| 7350 }; |
| 7351 #ifndef SQLITE_OMIT_VIRTUALTABLE |
| 7352 static const struct { |
| 7353 const char *zName; |
| 7354 sqlite3_module *pModule; |
| 7355 } aMod[] = { |
| 7356 { "json_each", &jsonEachModule }, |
| 7357 { "json_tree", &jsonTreeModule }, |
| 7358 }; |
| 7359 #endif |
| 7360 for(i=0; i<sizeof(aFunc)/sizeof(aFunc[0]) && rc==SQLITE_OK; i++){ |
| 7361 rc = sqlite3_create_function(db, aFunc[i].zName, aFunc[i].nArg, |
| 7362 SQLITE_UTF8 | SQLITE_DETERMINISTIC, |
| 7363 (void*)&aFunc[i].flag, |
| 7364 aFunc[i].xFunc, 0, 0); |
| 7365 } |
| 7366 for(i=0; i<sizeof(aAgg)/sizeof(aAgg[0]) && rc==SQLITE_OK; i++){ |
| 7367 rc = sqlite3_create_function(db, aAgg[i].zName, aAgg[i].nArg, |
| 7368 SQLITE_UTF8 | SQLITE_DETERMINISTIC, 0, |
| 7369 0, aAgg[i].xStep, aAgg[i].xFinal); |
| 7370 } |
| 7371 #ifndef SQLITE_OMIT_VIRTUALTABLE |
| 7372 for(i=0; i<sizeof(aMod)/sizeof(aMod[0]) && rc==SQLITE_OK; i++){ |
| 7373 rc = sqlite3_create_module(db, aMod[i].zName, aMod[i].pModule, 0); |
| 7374 } |
| 7375 #endif |
| 7376 return rc; |
| 7377 } |
| 7378 |
| 7379 |
| 7380 #ifndef SQLITE_CORE |
| 7381 #ifdef _WIN32 |
| 7382 __declspec(dllexport) |
| 7383 #endif |
| 7384 SQLITE_API int SQLITE_STDCALL sqlite3_json_init( |
| 7385 sqlite3 *db, |
| 7386 char **pzErrMsg, |
| 7387 const sqlite3_api_routines *pApi |
| 7388 ){ |
| 7389 SQLITE_EXTENSION_INIT2(pApi); |
| 7390 (void)pzErrMsg; /* Unused parameter */ |
| 7391 return sqlite3Json1Init(db); |
| 7392 } |
| 7393 #endif |
| 7394 #endif /* !defined(SQLITE_CORE) || defined(SQLITE_ENABLE_JSON1) */ |
| 7395 |
| 7396 /************** End of json1.c ***********************************************/ |
| 7397 /************** Begin file fts5.c ********************************************/ |
| 7398 |
| 7399 |
| 7400 #if !defined(SQLITE_CORE) || defined(SQLITE_ENABLE_FTS5) |
| 7401 |
| 7402 #if !defined(NDEBUG) && !defined(SQLITE_DEBUG) |
| 7403 # define NDEBUG 1 |
| 7404 #endif |
| 7405 #if defined(NDEBUG) && defined(SQLITE_DEBUG) |
| 7406 # undef NDEBUG |
| 7407 #endif |
| 7408 |
| 7409 /* |
| 7410 ** 2014 May 31 |
| 7411 ** |
| 7412 ** The author disclaims copyright to this source code. In place of |
| 7413 ** a legal notice, here is a blessing: |
| 7414 ** |
| 7415 ** May you do good and not evil. |
| 7416 ** May you find forgiveness for yourself and forgive others. |
| 7417 ** May you share freely, never taking more than you give. |
| 7418 ** |
| 7419 ****************************************************************************** |
| 7420 ** |
| 7421 ** Interfaces to extend FTS5. Using the interfaces defined in this file, |
| 7422 ** FTS5 may be extended with: |
| 7423 ** |
| 7424 ** * custom tokenizers, and |
| 7425 ** * custom auxiliary functions. |
| 7426 */ |
| 7427 |
| 7428 |
| 7429 #ifndef _FTS5_H |
| 7430 #define _FTS5_H |
| 7431 |
| 7432 /* #include "sqlite3.h" */ |
| 7433 |
| 7434 #if 0 |
| 7435 extern "C" { |
| 7436 #endif |
| 7437 |
| 7438 /************************************************************************* |
| 7439 ** CUSTOM AUXILIARY FUNCTIONS |
| 7440 ** |
| 7441 ** Virtual table implementations may overload SQL functions by implementing |
| 7442 ** the sqlite3_module.xFindFunction() method. |
| 7443 */ |
| 7444 |
| 7445 typedef struct Fts5ExtensionApi Fts5ExtensionApi; |
| 7446 typedef struct Fts5Context Fts5Context; |
| 7447 typedef struct Fts5PhraseIter Fts5PhraseIter; |
| 7448 |
| 7449 typedef void (*fts5_extension_function)( |
| 7450 const Fts5ExtensionApi *pApi, /* API offered by current FTS version */ |
| 7451 Fts5Context *pFts, /* First arg to pass to pApi functions */ |
| 7452 sqlite3_context *pCtx, /* Context for returning result/error */ |
| 7453 int nVal, /* Number of values in apVal[] array */ |
| 7454 sqlite3_value **apVal /* Array of trailing arguments */ |
| 7455 ); |
| 7456 |
| 7457 struct Fts5PhraseIter { |
| 7458 const unsigned char *a; |
| 7459 const unsigned char *b; |
| 7460 }; |
| 7461 |
| 7462 /* |
| 7463 ** EXTENSION API FUNCTIONS |
| 7464 ** |
| 7465 ** xUserData(pFts): |
| 7466 ** Return a copy of the context pointer the extension function was |
| 7467 ** registered with. |
| 7468 ** |
| 7469 ** xColumnTotalSize(pFts, iCol, pnToken): |
| 7470 ** If parameter iCol is less than zero, set output variable *pnToken |
| 7471 ** to the total number of tokens in the FTS5 table. Or, if iCol is |
| 7472 ** non-negative but less than the number of columns in the table, return |
| 7473 ** the total number of tokens in column iCol, considering all rows in |
| 7474 ** the FTS5 table. |
| 7475 ** |
| 7476 ** If parameter iCol is greater than or equal to the number of columns |
| 7477 ** in the table, SQLITE_RANGE is returned. Or, if an error occurs (e.g. |
| 7478 ** an OOM condition or IO error), an appropriate SQLite error code is |
| 7479 ** returned. |
| 7480 ** |
| 7481 ** xColumnCount(pFts): |
| 7482 ** Return the number of columns in the table. |
| 7483 ** |
| 7484 ** xColumnSize(pFts, iCol, pnToken): |
| 7485 ** If parameter iCol is less than zero, set output variable *pnToken |
| 7486 ** to the total number of tokens in the current row. Or, if iCol is |
| 7487 ** non-negative but less than the number of columns in the table, set |
| 7488 ** *pnToken to the number of tokens in column iCol of the current row. |
| 7489 ** |
| 7490 ** If parameter iCol is greater than or equal to the number of columns |
| 7491 ** in the table, SQLITE_RANGE is returned. Or, if an error occurs (e.g. |
| 7492 ** an OOM condition or IO error), an appropriate SQLite error code is |
| 7493 ** returned. |
| 7494 ** |
| 7495 ** xColumnText: |
| 7496 ** This function attempts to retrieve the text of column iCol of the |
| 7497 ** current document. If successful, (*pz) is set to point to a buffer |
| 7498 ** containing the text in utf-8 encoding, (*pn) is set to the size in bytes |
| 7499 ** (not characters) of the buffer and SQLITE_OK is returned. Otherwise, |
| 7500 ** if an error occurs, an SQLite error code is returned and the final values |
| 7501 ** of (*pz) and (*pn) are undefined. |
| 7502 ** |
| 7503 ** xPhraseCount: |
| 7504 ** Returns the number of phrases in the current query expression. |
| 7505 ** |
| 7506 ** xPhraseSize: |
| 7507 ** Returns the number of tokens in phrase iPhrase of the query. Phrases |
| 7508 ** are numbered starting from zero. |
| 7509 ** |
| 7510 ** xInstCount: |
| 7511 ** Set *pnInst to the total number of occurrences of all phrases within |
| 7512 ** the query within the current row. Return SQLITE_OK if successful, or |
| 7513 ** an error code (e.g. SQLITE_NOMEM) if an error occurs. |
| 7514 ** |
| 7515 ** xInst: |
| 7516 ** Query for the details of phrase match iIdx within the current row. |
| 7517 ** Phrase matches are numbered starting from zero, so the iIdx argument |
| 7518 ** should be greater than or equal to zero and smaller than the value |
| 7519 ** output by xInstCount(). |
| 7520 ** |
| 7521 ** Returns SQLITE_OK if successful, or an error code (e.g. SQLITE_NOMEM) |
| 7522 ** if an error occurs. |
| 7523 ** |
| 7524 ** xRowid: |
| 7525 ** Returns the rowid of the current row. |
| 7526 ** |
| 7527 ** xTokenize: |
| 7528 ** Tokenize text using the tokenizer belonging to the FTS5 table. |
| 7529 ** |
| 7530 ** xQueryPhrase(pFts5, iPhrase, pUserData, xCallback): |
| 7531 ** This API function is used to query the FTS table for phrase iPhrase |
| 7532 ** of the current query. Specifically, a query equivalent to: |
| 7533 ** |
| 7534 ** ... FROM ftstable WHERE ftstable MATCH $p ORDER BY rowid |
| 7535 ** |
| 7536 ** with $p set to a phrase equivalent to the phrase iPhrase of the |
| 7537 ** current query is executed. For each row visited, the callback function |
| 7538 ** passed as the fourth argument is invoked. The context and API objects |
| 7539 ** passed to the callback function may be used to access the properties of |
| 7540 ** each matched row. Invoking Api.xUserData() returns a copy of the pointer |
| 7541 ** passed as the third argument (pUserData). |
| 7542 ** |
| 7543 ** If the callback function returns any value other than SQLITE_OK, the |
| 7544 ** query is abandoned and the xQueryPhrase function returns immediately. |
| 7545 ** If the returned value is SQLITE_DONE, xQueryPhrase returns SQLITE_OK. |
| 7546 ** Otherwise, the error code is propagated upwards. |
| 7547 ** |
| 7548 ** If the query runs to completion without incident, SQLITE_OK is returned. |
| 7549 ** Or, if some error occurs before the query completes or is aborted by |
| 7550 ** the callback, an SQLite error code is returned. |
| 7551 ** |
| 7552 ** |
| 7553 ** xSetAuxdata(pFts5, pAux, xDelete) |
| 7554 ** |
| 7555 ** Save the pointer passed as the second argument as the extension function's |
| 7556 ** "auxiliary data". The pointer may then be retrieved by the current or any |
| 7557 ** future invocation of the same fts5 extension function made as part of |
| 7558 ** the same MATCH query using the xGetAuxdata() API. |
| 7559 ** |
| 7560 ** Each extension function is allocated a single auxiliary data slot for |
| 7561 ** each FTS query (MATCH expression). If the extension function is invoked |
| 7562 ** more than once for a single FTS query, then all invocations share a |
| 7563 ** single auxiliary data context. |
| 7564 ** |
| 7565 ** If there is already an auxiliary data pointer when this function is |
| 7566 ** invoked, then it is replaced by the new pointer. If an xDelete callback |
| 7567 ** was specified along with the original pointer, it is invoked at this |
| 7568 ** point. |
| 7569 ** |
| 7570 ** The xDelete callback, if one is specified, is also invoked on the |
| 7571 ** auxiliary data pointer after the FTS5 query has finished. |
| 7572 ** |
| 7573 ** If an error (e.g. an OOM condition) occurs within this function, |
| 7574 ** the auxiliary data is set to NULL and an error code returned. If the |
| 7575 ** xDelete parameter was not NULL, it is invoked on the auxiliary data |
| 7576 ** pointer before returning. |
| 7577 ** |
| 7578 ** |
| 7579 ** xGetAuxdata(pFts5, bClear) |
| 7580 ** |
| 7581 ** Returns the current auxiliary data pointer for the fts5 extension |
| 7582 ** function. See the xSetAuxdata() method for details. |
| 7583 ** |
| 7584 ** If the bClear argument is non-zero, then the auxiliary data is cleared |
| 7585 ** (set to NULL) before this function returns. In this case the xDelete |
| 7586 ** callback, if any, is not invoked. |
| 7587 ** |
| 7588 ** |
| 7589 ** xRowCount(pFts5, pnRow) |
| 7590 ** |
| 7591 ** This function is used to retrieve the total number of rows in the table. |
| 7592 ** In other words, the same value that would be returned by: |
| 7593 ** |
| 7594 ** SELECT count(*) FROM ftstable; |
| 7595 ** |
| 7596 ** xPhraseFirst() |
| 7597 ** This function is used, along with type Fts5PhraseIter and the xPhraseNext |
| 7598 ** method, to iterate through all instances of a single query phrase within |
| 7599 ** the current row. This is the same information as is accessible via the |
| 7600 ** xInstCount/xInst APIs. While the xInstCount/xInst APIs are more convenient |
| 7601 ** to use, this API may be faster under some circumstances. To iterate |
| 7602 ** through instances of phrase iPhrase, use the following code: |
| 7603 ** |
| 7604 ** Fts5PhraseIter iter; |
| 7605 ** int iCol, iOff; |
| 7606 ** for(pApi->xPhraseFirst(pFts, iPhrase, &iter, &iCol, &iOff); |
| 7607 ** iOff>=0; |
| 7608 ** pApi->xPhraseNext(pFts, &iter, &iCol, &iOff) |
| 7609 ** ){ |
| 7610 ** // An instance of phrase iPhrase at offset iOff of column iCol |
| 7611 ** } |
| 7612 ** |
| 7613 ** The Fts5PhraseIter structure is defined above. Applications should not |
| 7614 ** modify this structure directly - it should only be used as shown above |
| 7615 ** with the xPhraseFirst() and xPhraseNext() API methods. |
| 7616 ** |
| 7617 ** xPhraseNext() |
| 7618 ** See xPhraseFirst above. |
| 7619 */ |
| 7620 struct Fts5ExtensionApi { |
| 7621 int iVersion; /* Currently always set to 1 */ |
| 7622 |
| 7623 void *(*xUserData)(Fts5Context*); |
| 7624 |
| 7625 int (*xColumnCount)(Fts5Context*); |
| 7626 int (*xRowCount)(Fts5Context*, sqlite3_int64 *pnRow); |
| 7627 int (*xColumnTotalSize)(Fts5Context*, int iCol, sqlite3_int64 *pnToken); |
| 7628 |
| 7629 int (*xTokenize)(Fts5Context*, |
| 7630 const char *pText, int nText, /* Text to tokenize */ |
| 7631 void *pCtx, /* Context passed to xToken() */ |
| 7632 int (*xToken)(void*, int, const char*, int, int, int) /* Callback */ |
| 7633 ); |
| 7634 |
| 7635 int (*xPhraseCount)(Fts5Context*); |
| 7636 int (*xPhraseSize)(Fts5Context*, int iPhrase); |
| 7637 |
| 7638 int (*xInstCount)(Fts5Context*, int *pnInst); |
| 7639 int (*xInst)(Fts5Context*, int iIdx, int *piPhrase, int *piCol, int *piOff); |
| 7640 |
| 7641 sqlite3_int64 (*xRowid)(Fts5Context*); |
| 7642 int (*xColumnText)(Fts5Context*, int iCol, const char **pz, int *pn); |
| 7643 int (*xColumnSize)(Fts5Context*, int iCol, int *pnToken); |
| 7644 |
| 7645 int (*xQueryPhrase)(Fts5Context*, int iPhrase, void *pUserData, |
| 7646 int(*)(const Fts5ExtensionApi*,Fts5Context*,void*) |
| 7647 ); |
| 7648 int (*xSetAuxdata)(Fts5Context*, void *pAux, void(*xDelete)(void*)); |
| 7649 void *(*xGetAuxdata)(Fts5Context*, int bClear); |
| 7650 |
| 7651 void (*xPhraseFirst)(Fts5Context*, int iPhrase, Fts5PhraseIter*, int*, int*); |
| 7652 void (*xPhraseNext)(Fts5Context*, Fts5PhraseIter*, int *piCol, int *piOff); |
| 7653 }; |
| 7654 |
| 7655 /* |
| 7656 ** CUSTOM AUXILIARY FUNCTIONS |
| 7657 *************************************************************************/ |
| 7658 |
| 7659 /************************************************************************* |
| 7660 ** CUSTOM TOKENIZERS |
| 7661 ** |
| 7662 ** Applications may also register custom tokenizer types. A tokenizer |
| 7663 ** is registered by providing fts5 with a populated instance of the |
| 7664 ** following structure. All structure methods must be defined; setting |
| 7665 ** any member of the fts5_tokenizer struct to NULL leads to undefined |
| 7666 ** behaviour. The structure methods are expected to function as follows: |
| 7667 ** |
| 7668 ** xCreate: |
| 7669 ** This function is used to allocate and initialize a tokenizer instance. |
| 7670 ** A tokenizer instance is required to actually tokenize text. |
| 7671 ** |
| 7672 ** The first argument passed to this function is a copy of the (void*) |
| 7673 ** pointer provided by the application when the fts5_tokenizer object |
| 7674 ** was registered with FTS5 (the third argument to xCreateTokenizer()). |
| 7675 ** The second and third arguments are an array of nul-terminated strings |
| 7676 ** containing the tokenizer arguments, if any, specified following the |
| 7677 ** tokenizer name as part of the CREATE VIRTUAL TABLE statement used |
| 7678 ** to create the FTS5 table. |
| 7679 ** |
| 7680 ** The final argument is an output variable. If successful, (*ppOut) |
| 7681 ** should be set to point to the new tokenizer handle and SQLITE_OK |
| 7682 ** returned. If an error occurs, some value other than SQLITE_OK should |
| 7683 ** be returned. In this case, fts5 assumes that the final value of *ppOut |
| 7684 ** is undefined. |
| 7685 ** |
| 7686 ** xDelete: |
| 7687 ** This function is invoked to delete a tokenizer handle previously |
| 7688 ** allocated using xCreate(). Fts5 guarantees that this function will |
| 7689 ** be invoked exactly once for each successful call to xCreate(). |
| 7690 ** |
| 7691 ** xTokenize: |
| 7692 ** This function is expected to tokenize the nText byte string indicated |
| 7693 ** by argument pText. pText may or may not be nul-terminated. The first |
| 7694 ** argument passed to this function is a pointer to an Fts5Tokenizer object |
| 7695 ** returned by an earlier call to xCreate(). |
| 7696 ** |
| 7697 ** The second argument indicates the reason that FTS5 is requesting |
| 7698 ** tokenization of the supplied text. This is always one of the following |
| 7699 ** four values: |
| 7700 ** |
| 7701 ** <ul><li> <b>FTS5_TOKENIZE_DOCUMENT</b> - A document is being inserted into |
| 7702 ** or removed from the FTS table. The tokenizer is being invoked to |
| 7703 ** determine the set of tokens to add to (or delete from) the |
| 7704 ** FTS index. |
| 7705 ** |
| 7706 ** <li> <b>FTS5_TOKENIZE_QUERY</b> - A MATCH query is being executed |
| 7707 ** against the FTS index. The tokenizer is being called to tokenize |
| 7708 ** a bareword or quoted string specified as part of the query. |
| 7709 ** |
| 7710 ** <li> <b>(FTS5_TOKENIZE_QUERY | FTS5_TOKENIZE_PREFIX)</b> - Same as |
| 7711 ** FTS5_TOKENIZE_QUERY, except that the bareword or quoted string is |
| 7712 ** followed by a "*" character, indicating that the last token |
| 7713 ** returned by the tokenizer will be treated as a token prefix. |
| 7714 ** |
| 7715 ** <li> <b>FTS5_TOKENIZE_AUX</b> - The tokenizer is being invoked to |
| 7716 ** satisfy an fts5_api.xTokenize() request made by an auxiliary |
| 7717 ** function, or an fts5_api.xColumnSize() request made by an auxiliary |
| 7718 ** function on a columnsize=0 database. |
| 7719 ** </ul> |
| 7720 ** |
| 7721 ** For each token in the input string, the supplied callback xToken() must |
| 7722 ** be invoked. The first argument to it should be a copy of the pointer |
| 7723 ** passed as the second argument to xTokenize(). The third and fourth |
| 7724 ** arguments are a pointer to a buffer containing the token text, and the |
| 7725 ** size of the token in bytes. The 5th and 6th arguments are the byte offsets |
| 7726 ** of the first byte of and first byte immediately following the text from |
| 7727 ** which the token is derived within the input. |
| 7728 ** |
| 7729 ** The second argument passed to the xToken() callback ("tflags") should |
| 7730 ** normally be set to 0. The exception is if the tokenizer supports |
| 7731 ** synonyms. In this case see the discussion below for details. |
| 7732 ** |
| 7733 ** FTS5 assumes the xToken() callback is invoked for each token in the |
| 7734 ** order that they occur within the input text. |
| 7735 ** |
| 7736 ** If an xToken() callback returns any value other than SQLITE_OK, then |
| 7737 ** the tokenization should be abandoned and the xTokenize() method should |
| 7738 ** immediately return a copy of the xToken() return value. Or, if the |
| 7739 ** input buffer is exhausted, xTokenize() should return SQLITE_OK. Finally, |
| 7740 ** if an error occurs with the xTokenize() implementation itself, it |
| 7741 ** may abandon the tokenization and return any error code other than |
| 7742 ** SQLITE_OK or SQLITE_DONE. |
| 7743 ** |
| 7744 ** SYNONYM SUPPORT |
| 7745 ** |
| 7746 ** Custom tokenizers may also support synonyms. Consider a case in which a |
| 7747 ** user wishes to query for a phrase such as "first place". Using the |
| 7748 ** built-in tokenizers, the FTS5 query 'first + place' will match instances |
| 7749 ** of "first place" within the document set, but not alternative forms |
| 7750 ** such as "1st place". In some applications, it would be better to match |
| 7751 ** all instances of "first place" or "1st place" regardless of which form |
| 7752 ** the user specified in the MATCH query text. |
| 7753 ** |
| 7754 ** There are several ways to approach this in FTS5: |
| 7755 ** |
| 7756 ** <ol><li> By mapping all synonyms to a single token. In this case, using |
| 7757 ** the above example, this means that the tokenizer returns the |
| 7758 ** same token for inputs "first" and "1st". Say that token is in |
| 7759 ** fact "first", so that when the user inserts the document "I won |
| 7760 ** 1st place" entries are added to the index for tokens "i", "won", |
| 7761 ** "first" and "place". If the user then queries for '1st + place', |
| 7762 ** the tokenizer substitutes "first" for "1st" and the query works |
| 7763 ** as expected. |
| 7764 ** |
| 7765 ** <li> By querying the index for all synonyms of each query term separately. |
| 7766 ** In this case, when tokenizing query text, the tokenizer may |
| 7767 ** provide multiple synonyms for a single term within the document. |
| 7768 ** FTS5 then queries the index for each synonym individually. For |
| 7769 ** example, faced with the query: |
| 7770 ** |
| 7771 ** <codeblock> |
| 7772 ** ... MATCH 'first place'</codeblock> |
| 7773 ** |
| 7774 ** the tokenizer offers both "1st" and "first" as synonyms for the |
| 7775 ** first token in the MATCH query and FTS5 effectively runs a query |
| 7776 ** similar to: |
| 7777 ** |
| 7778 ** <codeblock> |
| 7779 ** ... MATCH '(first OR 1st) place'</codeblock> |
| 7780 ** |
| 7781 ** except that, for the purposes of auxiliary functions, the query |
| 7782 ** still appears to contain just two phrases - "(first OR 1st)" |
| 7783 ** being treated as a single phrase. |
| 7784 ** |
| 7785 ** <li> By adding multiple synonyms for a single term to the FTS index. |
| 7786 ** Using this method, when tokenizing document text, the tokenizer |
| 7787 ** provides multiple synonyms for each token. So that when a |
| 7788 ** document such as "I won first place" is tokenized, entries are |
| 7789 ** added to the FTS index for "i", "won", "first", "1st" and |
| 7790 ** "place". |
| 7791 ** |
| 7792 ** This way, even if the tokenizer does not provide synonyms |
| 7793 ** when tokenizing query text (it should not - to do so would be |
| 7794 ** inefficient), it doesn't matter if the user queries for |
| 7795 ** 'first + place' or '1st + place', as there are entries in the |
| 7796 ** FTS index corresponding to both forms of the first token. |
| 7797 ** </ol> |
| 7798 ** |
| 7799 ** Whether it is parsing document or query text, any call to xToken that |
| 7800 ** specifies a <i>tflags</i> argument with the FTS5_TOKEN_COLOCATED bit |
| 7801 ** is considered to supply a synonym for the previous token. For example, |
| 7802 ** when parsing the document "I won first place", a tokenizer that supports |
| 7803 ** synonyms would call xToken() 5 times, as follows: |
| 7804 ** |
| 7805 ** <codeblock> |
| 7806 ** xToken(pCtx, 0, "i", 1, 0, 1); |
| 7807 ** xToken(pCtx, 0, "won", 3, 2, 5); |
| 7808 ** xToken(pCtx, 0, "first", 5, 6, 11); |
| 7809 ** xToken(pCtx, FTS5_TOKEN_COLOCATED, "1st", 3, 6, 11); |
| 7810 ** xToken(pCtx, 0, "place", 5, 12, 17); |
| 7811 **</codeblock> |
| 7812 ** |
| 7813 ** It is an error to specify the FTS5_TOKEN_COLOCATED flag the first time |
| 7814 ** xToken() is called. Multiple synonyms may be specified for a single token |
| 7815 ** by making multiple calls to xToken(FTS5_TOKEN_COLOCATED) in sequence. |
| 7816 ** There is no limit to the number of synonyms that may be provided for a |
| 7817 ** single token. |
| 7818 ** |
| 7819 ** In many cases, method (1) above is the best approach. It does not add |
| 7820 ** extra data to the FTS index or require FTS5 to query for multiple terms, |
| 7821 ** so it is efficient in terms of disk space and query speed. However, it |
| 7822 ** does not support prefix queries very well. If, as suggested above, the |
| 7823 ** token "first" is substituted for "1st" by the tokenizer, then the query: |
| 7824 ** |
| 7825 ** <codeblock> |
| 7826 ** ... MATCH '1s*'</codeblock> |
| 7827 ** |
| 7828 ** will not match documents that contain the token "1st" (as the tokenizer |
| 7829 ** will probably not map "1s" to any prefix of "first"). |
| 7830 ** |
| 7831 ** For full prefix support, method (3) may be preferred. In this case, |
| 7832 ** because the index contains entries for both "first" and "1st", prefix |
| 7833 ** queries such as 'fi*' or '1s*' will match correctly. However, because |
| 7834 ** extra entries are added to the FTS index, this method uses more space |
| 7835 ** within the database. |
| 7836 ** |
| 7837 ** Method (2) offers a midpoint between (1) and (3). Using this method, |
| 7838 ** a query such as '1s*' will match documents that contain the literal |
| 7839 ** token "1st", but not "first" (assuming the tokenizer is not able to |
| 7840 ** provide synonyms for prefixes). However, a non-prefix query like '1st' |
| 7841 ** will match against "1st" and "first". This method does not require |
| 7842 ** extra disk space, as no extra entries are added to the FTS index. |
| 7843 ** On the other hand, it may require more CPU cycles to run MATCH queries, |
| 7844 ** as separate queries of the FTS index are required for each synonym. |
| 7845 ** |
| 7846 ** When using methods (2) or (3), it is important that the tokenizer only |
| 7847 ** provide synonyms when tokenizing document text (method (3)) or query |
| 7848 ** text (method (2)), not both. Providing them in both cases will not |
| 7849 ** cause any errors, but is inefficient. |
| 7850 */ |
| 7851 typedef struct Fts5Tokenizer Fts5Tokenizer; |
| 7852 typedef struct fts5_tokenizer fts5_tokenizer; |
| 7853 struct fts5_tokenizer { |
| 7854 int (*xCreate)(void*, const char **azArg, int nArg, Fts5Tokenizer **ppOut); |
| 7855 void (*xDelete)(Fts5Tokenizer*); |
| 7856 int (*xTokenize)(Fts5Tokenizer*, |
| 7857 void *pCtx, |
| 7858 int flags, /* Mask of FTS5_TOKENIZE_* flags */ |
| 7859 const char *pText, int nText, |
| 7860 int (*xToken)( |
| 7861 void *pCtx, /* Copy of 2nd argument to xTokenize() */ |
| 7862 int tflags, /* Mask of FTS5_TOKEN_* flags */ |
| 7863 const char *pToken, /* Pointer to buffer containing token */ |
| 7864 int nToken, /* Size of token in bytes */ |
| 7865 int iStart, /* Byte offset of token within input text */ |
| 7866 int iEnd /* Byte offset of end of token within input text */ |
| 7867 ) |
| 7868 ); |
| 7869 }; |
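The xTokenize() contract described above can be illustrated with a minimal, standalone sketch. It is not the FTS5 implementation: `demoTokenize`, `demoCollect`, `DemoLog` and `DEMO_TOKEN_COLOCATED` are hypothetical names that only mirror the shape of the xToken() callback, and 0 stands in for SQLITE_OK so the block compiles without the SQLite headers. The sketch splits on ASCII whitespace, folds to lower case, and reports "1st" as a colocated synonym of "first".

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Local stand-in for FTS5_TOKEN_COLOCATED (assumption: same bit value). */
#define DEMO_TOKEN_COLOCATED 0x0001

/* Same argument order as the xToken() callback in struct fts5_tokenizer. */
typedef int (*demo_xToken)(void *pCtx, int tflags, const char *pToken,
                           int nToken, int iStart, int iEnd);

/* Hypothetical collector used to observe what the tokenizer reports. */
typedef struct DemoLog {
  int nCall;          /* Total callback invocations */
  int nColocated;     /* Invocations flagged as synonyms */
  int iLastStart;     /* iStart of the most recent call */
  int iLastEnd;       /* iEnd of the most recent call */
  char zLast[64];     /* Text of the most recent token */
} DemoLog;

static int demoCollect(void *pCtx, int tflags, const char *pToken,
                       int nToken, int iStart, int iEnd){
  DemoLog *p = (DemoLog*)pCtx;
  int n = nToken<63 ? nToken : 63;
  p->nCall++;
  if( tflags & DEMO_TOKEN_COLOCATED ) p->nColocated++;
  memcpy(p->zLast, pToken, (size_t)n);
  p->zLast[n] = '\0';
  p->iLastStart = iStart;
  p->iLastEnd = iEnd;
  return 0;
}

/* Tokenize the nText bytes at pText (not necessarily nul-terminated).
** Tokens longer than the local buffer are truncated in this sketch. */
static int demoTokenize(void *pCtx, const char *pText, int nText,
                        demo_xToken xToken){
  int i = 0;
  assert( pText!=0 || nText==0 );
  while( i<nText ){
    char aBuf[64];
    int iStart, n = 0, rc;
    while( i<nText && isspace((unsigned char)pText[i]) ) i++;
    if( i>=nText ) break;
    iStart = i;
    while( i<nText && !isspace((unsigned char)pText[i]) ){
      if( n<(int)sizeof(aBuf) ){
        aBuf[n++] = (char)tolower((unsigned char)pText[i]);
      }
      i++;
    }
    /* Offsets are the first byte of the token and the byte after it. */
    rc = xToken(pCtx, 0, aBuf, n, iStart, i);
    if( rc==0 && n==5 && memcmp(aBuf, "first", 5)==0 ){
      /* A synonym shares the position and offsets of the previous token. */
      rc = xToken(pCtx, DEMO_TOKEN_COLOCATED, "1st", 3, iStart, i);
    }
    if( rc!=0 ) return rc;   /* Abandon and return the callback's code */
  }
  return 0;                  /* Input exhausted */
}
```

Run against "I won first place", this sketch makes five callback invocations, one of them colocated, which matches the call sequence listed in the synonym discussion.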
| 7870 |
| 7871 /* Flags that may be passed as the third argument to xTokenize() */ |
| 7872 #define FTS5_TOKENIZE_QUERY 0x0001 |
| 7873 #define FTS5_TOKENIZE_PREFIX 0x0002 |
| 7874 #define FTS5_TOKENIZE_DOCUMENT 0x0004 |
| 7875 #define FTS5_TOKENIZE_AUX 0x0008 |
| 7876 |
| 7877 /* Flags that may be passed by the tokenizer implementation back to FTS5 |
| 7878 ** as the second argument to the supplied xToken callback. */ |
| 7879 #define FTS5_TOKEN_COLOCATED 0x0001 /* Same position as prev. token */ |
| 7880 |
| 7881 /* |
| 7882 ** END OF CUSTOM TOKENIZERS |
| 7883 *************************************************************************/ |
| 7884 |
| 7885 /************************************************************************* |
| 7886 ** FTS5 EXTENSION REGISTRATION API |
| 7887 */ |
| 7888 typedef struct fts5_api fts5_api; |
| 7889 struct fts5_api { |
| 7890 int iVersion; /* Currently always set to 2 */ |
| 7891 |
| 7892 /* Create a new tokenizer */ |
| 7893 int (*xCreateTokenizer)( |
| 7894 fts5_api *pApi, |
| 7895 const char *zName, |
| 7896 void *pContext, |
| 7897 fts5_tokenizer *pTokenizer, |
| 7898 void (*xDestroy)(void*) |
| 7899 ); |
| 7900 |
| 7901 /* Find an existing tokenizer */ |
| 7902 int (*xFindTokenizer)( |
| 7903 fts5_api *pApi, |
| 7904 const char *zName, |
| 7905 void **ppContext, |
| 7906 fts5_tokenizer *pTokenizer |
| 7907 ); |
| 7908 |
| 7909 /* Create a new auxiliary function */ |
| 7910 int (*xCreateFunction)( |
| 7911 fts5_api *pApi, |
| 7912 const char *zName, |
| 7913 void *pContext, |
| 7914 fts5_extension_function xFunction, |
| 7915 void (*xDestroy)(void*) |
| 7916 ); |
| 7917 }; |
| 7918 |
| 7919 /* |
| 7920 ** END OF REGISTRATION API |
| 7921 *************************************************************************/ |
| 7922 |
| 7923 #if 0 |
| 7924 } /* end of the 'extern "C"' block */ |
| 7925 #endif |
| 7926 |
| 7927 #endif /* _FTS5_H */ |
| 7928 |
| 7929 |
| 7930 /* |
| 7931 ** 2014 May 31 |
| 7932 ** |
| 7933 ** The author disclaims copyright to this source code. In place of |
| 7934 ** a legal notice, here is a blessing: |
| 7935 ** |
| 7936 ** May you do good and not evil. |
| 7937 ** May you find forgiveness for yourself and forgive others. |
| 7938 ** May you share freely, never taking more than you give. |
| 7939 ** |
| 7940 ****************************************************************************** |
| 7941 ** |
| 7942 */ |
| 7943 #ifndef _FTS5INT_H |
| 7944 #define _FTS5INT_H |
| 7945 |
| 7946 /* #include "fts5.h" */ |
| 7947 /* #include "sqlite3ext.h" */ |
| 7948 SQLITE_EXTENSION_INIT1 |
| 7949 |
| 7950 /* #include <string.h> */ |
| 7951 /* #include <assert.h> */ |
| 7952 |
| 7953 #ifndef SQLITE_AMALGAMATION |
| 7954 |
| 7955 typedef unsigned char u8; |
| 7956 typedef unsigned int u32; |
| 7957 typedef unsigned short u16; |
| 7958 typedef sqlite3_int64 i64; |
| 7959 typedef sqlite3_uint64 u64; |
| 7960 |
| 7961 #define ArraySize(x) (sizeof(x) / sizeof(x[0])) |
| 7962 |
| 7963 #define testcase(x) |
| 7964 #define ALWAYS(x) 1 |
| 7965 #define NEVER(x) 0 |
| 7966 |
| 7967 #define MIN(x,y) (((x) < (y)) ? (x) : (y)) |
| 7968 #define MAX(x,y) (((x) > (y)) ? (x) : (y)) |
| 7969 |
| 7970 /* |
| 7971 ** Constants for the largest and smallest possible 64-bit signed integers. |
| 7972 */ |
| 7973 # define LARGEST_INT64 (0xffffffff|(((i64)0x7fffffff)<<32)) |
| 7974 # define SMALLEST_INT64 (((i64)-1) - LARGEST_INT64) |
| 7975 |
| 7976 #endif |
| 7977 |
| 7978 |
| 7979 /* |
| 7980 ** Maximum number of prefix indexes on single FTS5 table. This must be |
| 7981 ** less than 32. If it is set to anything larger than that, an #error |
| 7982 ** directive in fts5_index.c will cause the build to fail. |
| 7983 */ |
| 7984 #define FTS5_MAX_PREFIX_INDEXES 31 |
| 7985 |
| 7986 #define FTS5_DEFAULT_NEARDIST 10 |
| 7987 #define FTS5_DEFAULT_RANK "bm25" |
| 7988 |
| 7989 /* Name of rank and rowid columns */ |
| 7990 #define FTS5_RANK_NAME "rank" |
| 7991 #define FTS5_ROWID_NAME "rowid" |
| 7992 |
| 7993 #ifdef SQLITE_DEBUG |
| 7994 # define FTS5_CORRUPT sqlite3Fts5Corrupt() |
| 7995 static int sqlite3Fts5Corrupt(void); |
| 7996 #else |
| 7997 # define FTS5_CORRUPT SQLITE_CORRUPT_VTAB |
| 7998 #endif |
| 7999 |
| 8000 /* |
| 8001 ** The assert_nc() macro is similar to the assert() macro, except that it |
| 8002 ** is used for assert() conditions that are true only if it can be |
| 8003 ** guaranteed that the database is not corrupt. |
| 8004 */ |
| 8005 #ifdef SQLITE_DEBUG |
| 8006 SQLITE_API extern int sqlite3_fts5_may_be_corrupt; |
| 8007 # define assert_nc(x) assert(sqlite3_fts5_may_be_corrupt || (x)) |
| 8008 #else |
| 8009 # define assert_nc(x) assert(x) |
| 8010 #endif |
| 8011 |
| 8012 typedef struct Fts5Global Fts5Global; |
| 8013 typedef struct Fts5Colset Fts5Colset; |
| 8014 |
| 8015 /* If a NEAR() clump or phrase may only match a specific set of columns, |
| 8016 ** then an object of the following type is used to record the set of columns. |
| 8017 ** Each entry in the aiCol[] array is a column that may be matched. |
| 8018 ** |
| 8019 ** This object is used by fts5_expr.c and fts5_index.c. |
| 8020 */ |
| 8021 struct Fts5Colset { |
| 8022 int nCol; |
| 8023 int aiCol[1]; |
| 8024 }; |
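A struct that ends in a one-element array, like the one above, is commonly allocated with extra trailing space so that the array can hold nCol entries. The following is a minimal sketch of that idiom under the assumption that this is how such objects are sized; `DemoColset`, `demoColsetNew` and `demoColsetTest` are hypothetical names, not functions from the FTS5 sources.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct DemoColset DemoColset;
struct DemoColset {
  int nCol;        /* Number of valid entries in aiCol[] */
  int aiCol[1];    /* Really nCol entries; space is over-allocated */
};

/* Allocate a column set holding a copy of the nCol values in aiCol[].
** Returns NULL if the allocation fails. The caller releases it with free(). */
static DemoColset *demoColsetNew(const int *aiCol, int nCol){
  DemoColset *p;
  size_t nByte;
  int i;
  assert( nCol>=1 );
  /* One entry is already part of sizeof(DemoColset). */
  nByte = sizeof(DemoColset) + (size_t)(nCol-1)*sizeof(int);
  p = (DemoColset*)malloc(nByte);
  if( p ){
    p->nCol = nCol;
    for(i=0; i<nCol; i++) p->aiCol[i] = aiCol[i];
  }
  return p;
}

/* Return 1 if column iCol may be matched, or 0 otherwise. */
static int demoColsetTest(const DemoColset *p, int iCol){
  int i;
  for(i=0; i<p->nCol; i++){
    if( p->aiCol[i]==iCol ) return 1;
  }
  return 0;
}
```

In C99 and later, a flexible array member (`int aiCol[];`) expresses the same layout in a standard-sanctioned way; the one-element form is the older spelling of the idiom.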
| 8025 |
| 8026 |
| 8027 |
| 8028 /************************************************************************** |
| 8029 ** Interface to code in fts5_config.c. fts5_config.c contains code |
| 8030 ** to parse the arguments passed to the CREATE VIRTUAL TABLE statement. |
| 8031 */ |
| 8032 |
| 8033 typedef struct Fts5Config Fts5Config; |
| 8034 |
| 8035 /* |
| 8036 ** An instance of the following structure encodes all information that can |
| 8037 ** be gleaned from the CREATE VIRTUAL TABLE statement. |
| 8038 ** |
| 8039 ** And all information loaded from the %_config table. |
| 8040 ** |
| 8041 ** nAutomerge: |
| 8042 ** The minimum number of segments that an auto-merge operation should |
| 8043 ** attempt to merge together. A value of 1 sets the object to use the |
| 8044 ** compile time default. Zero disables auto-merge altogether. |
| 8045 ** |
| 8046 ** zContent: |
| 8047 ** |
| 8048 ** zContentRowid: |
| 8049 ** The value of the content_rowid= option, if one was specified. Or |
| 8050 ** the string "rowid" otherwise. This text is not quoted - if it is |
| 8051 ** used as part of an SQL statement it needs to be quoted appropriately. |
| 8052 ** |
| 8053 ** zContentExprlist: |
| 8054 ** |
| 8055 ** pzErrmsg: |
| 8056 ** This exists in order to allow the fts5_index.c module to return a |
| 8057 ** decent error message if it encounters a file-format version it does |
| 8058 ** not understand. |
| 8059 ** |
| 8060 ** bColumnsize: |
| 8061 ** True if the %_docsize table is created. |
| 8062 ** |
| 8063 ** bPrefixIndex: |
| 8064 ** This is only used for debugging. If set to false, any prefix indexes |
| 8065 ** are ignored. This value is configured using: |
| 8066 ** |
| 8067 ** INSERT INTO tbl(tbl, rank) VALUES('prefix-index', $bPrefixIndex); |
| 8068 ** |
| 8069 */ |
| 8070 struct Fts5Config { |
| 8071 sqlite3 *db; /* Database handle */ |
| 8072 char *zDb; /* Database holding FTS index (e.g. "main") */ |
| 8073 char *zName; /* Name of FTS index */ |
| 8074 int nCol; /* Number of columns */ |
| 8075 char **azCol; /* Column names */ |
| 8076 u8 *abUnindexed; /* True for unindexed columns */ |
| 8077 int nPrefix; /* Number of prefix indexes */ |
| 8078 int *aPrefix; /* Sizes in bytes of nPrefix prefix indexes */ |
| 8079 int eContent; /* An FTS5_CONTENT value */ |
| 8080 char *zContent; /* content table */ |
| 8081 char *zContentRowid; /* "content_rowid=" option value */ |
| 8082 int bColumnsize; /* "columnsize=" option value (dflt==1) */ |
| 8083 char *zContentExprlist; |
| 8084 Fts5Tokenizer *pTok; |
| 8085 fts5_tokenizer *pTokApi; |
| 8086 |
| 8087 /* Values loaded from the %_config table */ |
| 8088 int iCookie; /* Incremented when %_config is modified */ |
| 8089 int pgsz; /* Approximate page size used in %_data */ |
| 8090 int nAutomerge; /* 'automerge' setting */ |
| 8091 int nCrisisMerge; /* Maximum allowed segments per level */ |
| 8092 int nHashSize; /* Bytes of memory for in-memory hash */ |
| 8093 char *zRank; /* Name of rank function */ |
| 8094 char *zRankArgs; /* Arguments to rank function */ |
| 8095 |
| 8096 /* If non-NULL, points to sqlite3_vtab.base.zErrmsg. Often NULL. */ |
| 8097 char **pzErrmsg; |
| 8098 |
| 8099 #ifdef SQLITE_DEBUG |
| 8100 int bPrefixIndex; /* True to use prefix-indexes */ |
| 8101 #endif |
| 8102 }; |
| 8103 |
| 8104 /* Current expected value of %_config table 'version' field */ |
| 8105 #define FTS5_CURRENT_VERSION 4 |
| 8106 |
| 8107 #define FTS5_CONTENT_NORMAL 0 |
| 8108 #define FTS5_CONTENT_NONE 1 |
| 8109 #define FTS5_CONTENT_EXTERNAL 2 |
| 8110 |
| 8111 |
| 8112 |
| 8113 |
| 8114 static int sqlite3Fts5ConfigParse( |
| 8115 Fts5Global*, sqlite3*, int, const char **, Fts5Config**, char** |
| 8116 ); |
| 8117 static void sqlite3Fts5ConfigFree(Fts5Config*); |
| 8118 |
| 8119 static int sqlite3Fts5ConfigDeclareVtab(Fts5Config *pConfig); |
| 8120 |
| 8121 static int sqlite3Fts5Tokenize( |
| 8122 Fts5Config *pConfig, /* FTS5 Configuration object */ |
| 8123 int flags, /* FTS5_TOKENIZE_* flags */ |
| 8124 const char *pText, int nText, /* Text to tokenize */ |
| 8125 void *pCtx, /* Context passed to xToken() */ |
| 8126 int (*xToken)(void*, int, const char*, int, int, int) /* Callback */ |
| 8127 ); |
| 8128 |
| 8129 static void sqlite3Fts5Dequote(char *z); |
| 8130 |
| 8131 /* Load the contents of the %_config table */ |
| 8132 static int sqlite3Fts5ConfigLoad(Fts5Config*, int); |
| 8133 |
| 8134 /* Set the value of a single config attribute */ |
| 8135 static int sqlite3Fts5ConfigSetValue(Fts5Config*, const char*, sqlite3_value*, int*); |
| 8136 |
| 8137 static int sqlite3Fts5ConfigParseRank(const char*, char**, char**); |
| 8138 |
| 8139 /* |
| 8140 ** End of interface to code in fts5_config.c. |
| 8141 **************************************************************************/ |
| 8142 |
| 8143 /************************************************************************** |
| 8144 ** Interface to code in fts5_buffer.c. |
| 8145 */ |
| 8146 |
| 8147 /* |
| 8148 ** Buffer object for the incremental building of string data. |
| 8149 */ |
| 8150 typedef struct Fts5Buffer Fts5Buffer; |
| 8151 struct Fts5Buffer { |
| 8152 u8 *p; |
| 8153 int n; |
| 8154 int nSpace; |
| 8155 }; |
| 8156 |
| 8157 static int sqlite3Fts5BufferSize(int*, Fts5Buffer*, int); |
| 8158 static void sqlite3Fts5BufferAppendVarint(int*, Fts5Buffer*, i64); |
| 8159 static void sqlite3Fts5BufferAppendBlob(int*, Fts5Buffer*, int, const u8*); |
| 8160 static void sqlite3Fts5BufferAppendString(int *, Fts5Buffer*, const char*); |
| 8161 static void sqlite3Fts5BufferFree(Fts5Buffer*); |
| 8162 static void sqlite3Fts5BufferZero(Fts5Buffer*); |
| 8163 static void sqlite3Fts5BufferSet(int*, Fts5Buffer*, int, const u8*); |
| 8164 static void sqlite3Fts5BufferAppendPrintf(int *, Fts5Buffer*, char *zFmt, ...); |
| 8165 |
| 8166 static char *sqlite3Fts5Mprintf(int *pRc, const char *zFmt, ...); |
| 8167 |
| 8168 #define fts5BufferZero(x) sqlite3Fts5BufferZero(x) |
| 8169 #define fts5BufferAppendVarint(a,b,c) sqlite3Fts5BufferAppendVarint(a,b,c) |
| 8170 #define fts5BufferFree(a) sqlite3Fts5BufferFree(a) |
| 8171 #define fts5BufferAppendBlob(a,b,c,d) sqlite3Fts5BufferAppendBlob(a,b,c,d) |
| 8172 #define fts5BufferSet(a,b,c,d) sqlite3Fts5BufferSet(a,b,c,d) |
| 8173 |
| 8174 #define fts5BufferGrow(pRc,pBuf,nn) ( \ |
| 8175 (pBuf)->n + (nn) <= (pBuf)->nSpace ? 0 : \ |
| 8176 sqlite3Fts5BufferSize((pRc),(pBuf),(nn)+(pBuf)->n) \ |
| 8177 ) |
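The grow-on-demand pattern behind the fts5BufferGrow() macro can be sketched as follows. This is an illustration only: `DemoBuffer`, `demoBufferSize`, `demoBufferGrow` and `demoBufferAppendBlob` are hypothetical stand-ins, the doubling policy and the initial size of 64 bytes are assumptions, and the value 7 stands in for SQLITE_NOMEM. The macro evaluates to 0 when the buffer already has room and otherwise defers to the resizing function.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct DemoBuffer {
  unsigned char *p;   /* Allocated space */
  int n;              /* Bytes currently in use */
  int nSpace;         /* Bytes allocated at p */
} DemoBuffer;

/* Ensure at least nByte bytes are allocated. Returns 0 on success. On
** failure, stores an error code in *pRc and returns non-zero. */
static int demoBufferSize(int *pRc, DemoBuffer *pBuf, int nByte){
  if( pBuf->nSpace<nByte ){
    int nNew = pBuf->nSpace ? pBuf->nSpace : 64;
    unsigned char *pNew;
    while( nNew<nByte ) nNew = nNew*2;
    pNew = (unsigned char*)realloc(pBuf->p, (size_t)nNew);
    if( pNew==0 ){
      *pRc = 7;       /* Stand-in for SQLITE_NOMEM */
      return 1;
    }
    pBuf->p = pNew;
    pBuf->nSpace = nNew;
  }
  return 0;
}

/* Same shape as fts5BufferGrow(): cheap inline test, slow path on demand. */
#define demoBufferGrow(pRc,pBuf,nn) ( \
  (pBuf)->n + (nn) <= (pBuf)->nSpace ? 0 : \
  demoBufferSize((pRc),(pBuf),(nn)+(pBuf)->n) \
)

/* Append nData bytes. A no-op if *pRc already holds an error, so a run
** of appends can be followed by a single error check. */
static void demoBufferAppendBlob(int *pRc, DemoBuffer *pBuf,
                                 int nData, const unsigned char *pData){
  assert( nData>=0 );
  if( *pRc ) return;
  if( demoBufferGrow(pRc, pBuf, nData) ) return;
  memcpy(&pBuf->p[pBuf->n], pData, (size_t)nData);
  pBuf->n += nData;
}
```

Passing the error code by pointer lets a caller chain several appends and inspect the result once at the end.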
| 8178 |
| 8179 /* Write and decode big-endian 32-bit integer values */ |
| 8180 static void sqlite3Fts5Put32(u8*, int); |
| 8181 static int sqlite3Fts5Get32(const u8*); |
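Writing and decoding a big-endian 32-bit integer, as the comment above describes, can be sketched like this. `demoPut32` and `demoGet32` are hypothetical names that only mirror the two prototypes; the bodies are an assumption about what such helpers do, not the fts5_buffer.c code.

```c
#include <assert.h>
#include <stdint.h>

/* Store iVal at a[0..3], most significant byte first. */
static void demoPut32(unsigned char *a, int iVal){
  uint32_t v = (uint32_t)iVal;
  assert( a!=0 );
  a[0] = (unsigned char)((v>>24) & 0xff);
  a[1] = (unsigned char)((v>>16) & 0xff);
  a[2] = (unsigned char)((v>>8) & 0xff);
  a[3] = (unsigned char)(v & 0xff);
}

/* Read back a value written by demoPut32(). */
static int demoGet32(const unsigned char *a){
  uint32_t v = ((uint32_t)a[0]<<24) | ((uint32_t)a[1]<<16)
             | ((uint32_t)a[2]<<8)  |  (uint32_t)a[3];
  return (int)v;
}
```

Because the byte order is fixed, the stored bytes are the same regardless of the host CPU's endianness.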
| 8182 |
| 8183 #define FTS5_POS2COLUMN(iPos) (int)(iPos >> 32) |
| 8184 #define FTS5_POS2OFFSET(iPos) (int)(iPos & 0xFFFFFFFF) |
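The two macros above split a 64-bit position into a column number (upper 32 bits) and a token offset (lower 32 bits). The following sketch shows the round trip; `demoMakePos` and the `DEMO_` macros are hypothetical, and the packing formula (iCol<<32) + iOff is taken from the iPos comment in struct Fts5PoslistReader.

```c
#include <assert.h>
#include <stdint.h>

/* Fully parenthesized equivalents of the two decoding macros. */
#define DEMO_POS2COLUMN(iPos)  ((int)((iPos) >> 32))
#define DEMO_POS2OFFSET(iPos)  ((int)((iPos) & 0xFFFFFFFF))

/* Pack a column number and token offset into one 64-bit position. */
static int64_t demoMakePos(int iCol, int iOff){
  assert( iCol>=0 && iOff>=0 );
  return (((int64_t)iCol)<<32) + (int64_t)iOff;
}
```

A useful property of this layout is that comparing two packed values as plain integers orders them by column first and by offset second.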
| 8185 |
| 8186 typedef struct Fts5PoslistReader Fts5PoslistReader; |
| 8187 struct Fts5PoslistReader { |
| 8188 /* Variables used only by sqlite3Fts5PoslistIterXXX() functions. */ |
| 8189 const u8 *a; /* Position list to iterate through */ |
| 8190 int n; /* Size of buffer at a[] in bytes */ |
| 8191 int i; /* Current offset in a[] */ |
| 8192 |
| 8193 u8 bFlag; /* For client use (any custom purpose) */ |
| 8194 |
| 8195 /* Output variables */ |
| 8196 u8 bEof; /* Set to true at EOF */ |
| 8197 i64 iPos; /* (iCol<<32) + iPos */ |
| 8198 }; |
| 8199 static int sqlite3Fts5PoslistReaderInit( |
| 8200 const u8 *a, int n, /* Poslist buffer to iterate through */ |
| 8201 Fts5PoslistReader *pIter /* Iterator object to initialize */ |
| 8202 ); |
| 8203 static int sqlite3Fts5PoslistReaderNext(Fts5PoslistReader*); |
| 8204 |
| 8205 typedef struct Fts5PoslistWriter Fts5PoslistWriter; |
| 8206 struct Fts5PoslistWriter { |
| 8207 i64 iPrev; |
| 8208 }; |
| 8209 static int sqlite3Fts5PoslistWriterAppend(Fts5Buffer*, Fts5PoslistWriter*, i64); |
| 8210 |
| 8211 static int sqlite3Fts5PoslistNext64( |
| 8212 const u8 *a, int n, /* Buffer containing poslist */ |
| 8213 int *pi, /* IN/OUT: Offset within a[] */ |
| 8214 i64 *piOff /* IN/OUT: Current offset */ |
| 8215 ); |
| 8216 |
| 8217 /* Malloc utility */ |
| 8218 static void *sqlite3Fts5MallocZero(int *pRc, int nByte); |
| 8219 static char *sqlite3Fts5Strndup(int *pRc, const char *pIn, int nIn); |
| 8220 |
| 8221 /* Character set tests (like isspace(), isalpha() etc.) */ |
| 8222 static int sqlite3Fts5IsBareword(char t); |
| 8223 |
| 8224 /* |
| 8225 ** End of interface to code in fts5_buffer.c. |
| 8226 **************************************************************************/ |
| 8227 |
| 8228 /************************************************************************** |
| 8229 ** Interface to code in fts5_index.c. fts5_index.c contains code |
| 8230 ** to access the data stored in the %_data table. |
| 8231 */ |
| 8232 |
| 8233 typedef struct Fts5Index Fts5Index; |
| 8234 typedef struct Fts5IndexIter Fts5IndexIter; |
| 8235 |
| 8236 /* |
| 8237 ** Values used as part of the flags argument passed to IndexQuery(). |
| 8238 */ |
| 8239 #define FTS5INDEX_QUERY_PREFIX 0x0001 /* Prefix query */ |
| 8240 #define FTS5INDEX_QUERY_DESC 0x0002 /* Docs in descending rowid order */ |
| 8241 #define FTS5INDEX_QUERY_TEST_NOIDX 0x0004 /* Do not use prefix index */ |
| 8242 #define FTS5INDEX_QUERY_SCAN 0x0008 /* Scan query (fts5vocab) */ |
| 8243 |
| 8244 /* |
| 8245 ** Create/destroy an Fts5Index object. |
| 8246 */ |
| 8247 static int sqlite3Fts5IndexOpen(Fts5Config *pConfig, int bCreate, Fts5Index**, char**); |
| 8248 static int sqlite3Fts5IndexClose(Fts5Index *p); |
| 8249 |
| 8250 /* |
| 8251 ** for( |
| 8252 ** sqlite3Fts5IndexQuery(p, "token", 5, 0, 0, &pIter); |
| 8253 ** 0==sqlite3Fts5IterEof(pIter); |
| 8254 ** sqlite3Fts5IterNext(pIter) |
| 8255 ** ){ |
| 8256 ** i64 iRowid = sqlite3Fts5IterRowid(pIter); |
| 8257 ** } |
| 8258 */ |
| 8259 |
| 8260 /* |
| 8261 ** Open a new iterator to iterate through all rowids that match the |
| 8262 ** specified token or token prefix. |
| 8263 */ |
| 8264 static int sqlite3Fts5IndexQuery( |
| 8265 Fts5Index *p, /* FTS index to query */ |
| 8266 const char *pToken, int nToken, /* Token (or prefix) to query for */ |
| 8267 int flags, /* Mask of FTS5INDEX_QUERY_X flags */ |
| 8268 Fts5Colset *pColset, /* Match these columns only */ |
| 8269 Fts5IndexIter **ppIter /* OUT: New iterator object */ |
| 8270 ); |
| 8271 |
| 8272 /* |
| 8273 ** The various operations on open token or token prefix iterators opened |
| 8274 ** using sqlite3Fts5IndexQuery(). |
| 8275 */ |
| 8276 static int sqlite3Fts5IterEof(Fts5IndexIter*); |
| 8277 static int sqlite3Fts5IterNext(Fts5IndexIter*); |
| 8278 static int sqlite3Fts5IterNextFrom(Fts5IndexIter*, i64 iMatch); |
| 8279 static i64 sqlite3Fts5IterRowid(Fts5IndexIter*); |
| 8280 static int sqlite3Fts5IterPoslist(Fts5IndexIter*,Fts5Colset*, const u8**, int*, i64*); |
| 8281 static int sqlite3Fts5IterPoslistBuffer(Fts5IndexIter *pIter, Fts5Buffer *pBuf); |
| 8282 |
| 8283 /* |
| 8284 ** Close an iterator opened by sqlite3Fts5IndexQuery(). |
| 8285 */ |
| 8286 static void sqlite3Fts5IterClose(Fts5IndexIter*); |
| 8287 |
| 8288 /* |
| 8289 ** This interface is used by the fts5vocab module. |
| 8290 */ |
| 8291 static const char *sqlite3Fts5IterTerm(Fts5IndexIter*, int*); |
| 8292 static int sqlite3Fts5IterNextScan(Fts5IndexIter*); |
| 8293 |
| 8294 |
| 8295 /* |
| 8296 ** Insert or remove data to or from the index. Each time a document is |
| 8297 ** added to or removed from the index, this function is called one or more |
| 8298 ** times. |
| 8299 ** |
| 8300 ** For an insert, it must be called once for each token in the new document. |
| 8301 ** If the operation is a delete, it must be called (at least) once for each |
| 8302 ** unique token in the document with an iCol value less than zero. The iPos |
| 8303 ** argument is ignored for a delete. |
| 8304 */ |
| 8305 static int sqlite3Fts5IndexWrite( |
| 8306 Fts5Index *p, /* Index to write to */ |
| 8307 int iCol, /* Column token appears in (-ve -> delete) */ |
| 8308 int iPos, /* Position of token within column */ |
| 8309 const char *pToken, int nToken /* Token to add or remove to or from index */ |
| 8310 ); |
| 8311 |
| 8312 /* |
| 8313 ** Indicate that subsequent calls to sqlite3Fts5IndexWrite() pertain to |
| 8314 ** document iDocid. |
| 8315 */ |
| 8316 static int sqlite3Fts5IndexBeginWrite( |
| 8317 Fts5Index *p, /* Index to write to */ |
| 8318 int bDelete, /* True if current operation is a delete */ |
| 8319 i64 iDocid /* Docid to add or remove data from */ |
| 8320 ); |
| 8321 |
| 8322 /* |
| 8323 ** Flush any data stored in the in-memory hash tables to the database. |
| 8324 ** If the bCommit flag is true, also close any open blob handles. |
| 8325 */ |
| 8326 static int sqlite3Fts5IndexSync(Fts5Index *p, int bCommit); |
| 8327 |
| 8328 /* |
| 8329 ** Discard any data stored in the in-memory hash tables. Do not write it |
| 8330 ** to the database. Additionally, assume that the contents of the %_data |
| 8331 ** table may have changed on disk. So any in-memory caches of %_data |
| 8332 ** records must be invalidated. |
| 8333 */ |
| 8334 static int sqlite3Fts5IndexRollback(Fts5Index *p); |
| 8335 |
| 8336 /* |
| 8337 ** Get or set the "averages" values. |
| 8338 */ |
| 8339 static int sqlite3Fts5IndexGetAverages(Fts5Index *p, i64 *pnRow, i64 *anSize); |
| 8340 static int sqlite3Fts5IndexSetAverages(Fts5Index *p, const u8*, int); |
| 8341 |
| 8342 /* |
| 8343 ** Functions called by the storage module as part of integrity-check. |
| 8344 */ |
| 8345 static u64 sqlite3Fts5IndexCksum(Fts5Config*,i64,int,int,const char*,int); |
| 8346 static int sqlite3Fts5IndexIntegrityCheck(Fts5Index*, u64 cksum); |
| 8347 |
| 8348 /* |
| 8349 ** Called during virtual module initialization to register UDF |
| 8350 ** fts5_decode() with SQLite |
| 8351 */ |
| 8352 static int sqlite3Fts5IndexInit(sqlite3*); |
| 8353 |
| 8354 static int sqlite3Fts5IndexSetCookie(Fts5Index*, int); |
| 8355 |
| 8356 /* |
| 8357 ** Return the total number of entries read from the %_data table by |
| 8358 ** this connection since it was created. |
| 8359 */ |
| 8360 static int sqlite3Fts5IndexReads(Fts5Index *p); |
| 8361 |
| 8362 static int sqlite3Fts5IndexReinit(Fts5Index *p); |
| 8363 static int sqlite3Fts5IndexOptimize(Fts5Index *p); |
| 8364 static int sqlite3Fts5IndexMerge(Fts5Index *p, int nMerge); |
| 8365 |
| 8366 static int sqlite3Fts5IndexLoadConfig(Fts5Index *p); |
| 8367 |
| 8368 /* |
| 8369 ** End of interface to code in fts5_index.c. |
| 8370 **************************************************************************/ |
| 8371 |
| 8372 /************************************************************************** |
| 8373 ** Interface to code in fts5_varint.c. |
| 8374 */ |
| 8375 static int sqlite3Fts5GetVarint32(const unsigned char *p, u32 *v); |
| 8376 static int sqlite3Fts5GetVarintLen(u32 iVal); |
| 8377 static u8 sqlite3Fts5GetVarint(const unsigned char*, u64*); |
| 8378 static int sqlite3Fts5PutVarint(unsigned char *p, u64 v); |
| 8379 |
| 8380 #define fts5GetVarint32(a,b) sqlite3Fts5GetVarint32(a,(u32*)&b) |
| 8381 #define fts5GetVarint sqlite3Fts5GetVarint |
| 8382 |
| 8383 #define fts5FastGetVarint32(a, iOff, nVal) { \ |
| 8384 nVal = (a)[iOff++]; \ |
| 8385 if( nVal & 0x80 ){ \ |
| 8386 iOff--; \ |
| 8387 iOff += fts5GetVarint32(&(a)[iOff], nVal); \ |
| 8388 } \ |
| 8389 } |
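The fast-path idea in fts5FastGetVarint32() is that a value below 0x80 occupies a single byte and can be read inline, while anything larger falls back to the general decoder. Below is a simplified sketch limited to 32-bit values. It assumes the varint layout is big-endian groups of 7 bits with the high bit set on every byte except the last; `demoPutVarint32`, `demoGetVarint32` and `demoFastGetVarint32` are hypothetical names, not the fts5_varint.c routines.

```c
#include <assert.h>
#include <stdint.h>

/* Encode v at p[] using 1 to 5 bytes. Returns the number of bytes written. */
static int demoPutVarint32(unsigned char *p, uint32_t v){
  unsigned char aTmp[5];
  int n = 0, i;
  assert( p!=0 );
  do{
    aTmp[n++] = (unsigned char)(v & 0x7f);   /* Collect low group first */
    v >>= 7;
  }while( v!=0 );
  for(i=0; i<n; i++){
    /* Emit most significant group first; flag all but the last byte. */
    p[i] = (unsigned char)(aTmp[n-1-i] | (i<n-1 ? 0x80 : 0x00));
  }
  return n;
}

/* Decode a value written by demoPutVarint32(). Returns bytes consumed. */
static int demoGetVarint32(const unsigned char *p, uint32_t *pv){
  uint32_t v = 0;
  int i = 0;
  do{
    v = (v<<7) | (uint32_t)(p[i] & 0x7f);
  }while( (p[i++] & 0x80) && i<5 );
  *pv = v;
  return i;
}

/* Function form of the fast-path macro: read one byte, and only call the
** general decoder when the continuation bit is set. */
static void demoFastGetVarint32(const unsigned char *a, int *piOff,
                                uint32_t *pnVal){
  *pnVal = a[(*piOff)++];
  if( *pnVal & 0x80 ){
    (*piOff)--;
    *piOff += demoGetVarint32(&a[*piOff], pnVal);
  }
}
```

For example, 5 encodes as the single byte 0x05 and 300 encodes as the two bytes 0x82 0x2c, so short values, which dominate typical position lists, never leave the inline path.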
| 8390 |
| 8391 |
| 8392 /* |
| 8393 ** End of interface to code in fts5_varint.c. |
| 8394 **************************************************************************/ |
| 8395 |
| 8396 |
| 8397 /************************************************************************** |
| 8398 ** Interface to code in fts5.c. |
| 8399 */ |
| 8400 |
| 8401 static int sqlite3Fts5GetTokenizer( |
| 8402 Fts5Global*, |
| 8403 const char **azArg, |
| 8404 int nArg, |
| 8405 Fts5Tokenizer**, |
| 8406 fts5_tokenizer**, |
| 8407 char **pzErr |
| 8408 ); |
| 8409 |
| 8410 static Fts5Index *sqlite3Fts5IndexFromCsrid(Fts5Global*, i64, Fts5Config **); |
| 8411 |
| 8412 /* |
| 8413 ** End of interface to code in fts5.c. |
| 8414 **************************************************************************/ |
| 8415 |
| 8416 /************************************************************************** |
| 8417 ** Interface to code in fts5_hash.c. |
| 8418 */ |
| 8419 typedef struct Fts5Hash Fts5Hash; |
| 8420 |
| 8421 /* |
| 8422 ** Create a hash table, free a hash table. |
| 8423 */ |
| 8424 static int sqlite3Fts5HashNew(Fts5Hash**, int *pnSize); |
| 8425 static void sqlite3Fts5HashFree(Fts5Hash*); |
| 8426 |
| 8427 static int sqlite3Fts5HashWrite( |
| 8428 Fts5Hash*, |
| 8429 i64 iRowid, /* Rowid for this entry */ |
| 8430 int iCol, /* Column token appears in (-ve -> delete) */ |
| 8431 int iPos, /* Position of token within column */ |
| 8432 char bByte, |
| 8433 const char *pToken, int nToken /* Token to add or remove to or from index */ |
| 8434 ); |
| 8435 |
| 8436 /* |
| 8437 ** Empty (but do not delete) a hash table. |
| 8438 */ |
| 8439 static void sqlite3Fts5HashClear(Fts5Hash*); |
| 8440 |
| 8441 static int sqlite3Fts5HashQuery( |
| 8442 Fts5Hash*, /* Hash table to query */ |
| 8443 const char *pTerm, int nTerm, /* Query term */ |
| 8444 const u8 **ppDoclist, /* OUT: Pointer to doclist for pTerm */ |
| 8445 int *pnDoclist /* OUT: Size of doclist in bytes */ |
| 8446 ); |
| 8447 |
| 8448 static int sqlite3Fts5HashScanInit( |
| 8449 Fts5Hash*, /* Hash table to query */ |
| 8450 const char *pTerm, int nTerm /* Query prefix */ |
| 8451 ); |
| 8452 static void sqlite3Fts5HashScanNext(Fts5Hash*); |
| 8453 static int sqlite3Fts5HashScanEof(Fts5Hash*); |
| 8454 static void sqlite3Fts5HashScanEntry(Fts5Hash *, |
| 8455 const char **pzTerm, /* OUT: term (nul-terminated) */ |
| 8456 const u8 **ppDoclist, /* OUT: pointer to doclist */ |
| 8457 int *pnDoclist /* OUT: size of doclist in bytes */ |
| 8458 ); |
| 8459 |
| 8460 |
| 8461 /* |
| 8462 ** End of interface to code in fts5_hash.c. |
| 8463 **************************************************************************/ |
| 8464 |
| 8465 /************************************************************************** |
| 8466 ** Interface to code in fts5_storage.c. fts5_storage.c contains |
| 8467 ** code to access the data stored in the %_content and %_docsize tables. |
| 8468 */ |
| 8469 |
| 8470 #define FTS5_STMT_SCAN_ASC 0 /* SELECT rowid, * FROM ... ORDER BY 1 ASC */ |
| 8471 #define FTS5_STMT_SCAN_DESC 1 /* SELECT rowid, * FROM ... ORDER BY 1 DESC */ |
| 8472 #define FTS5_STMT_LOOKUP 2 /* SELECT rowid, * FROM ... WHERE rowid=? */ |
| 8473 |
| 8474 typedef struct Fts5Storage Fts5Storage; |
| 8475 |
| 8476 static int sqlite3Fts5StorageOpen(Fts5Config*, Fts5Index*, int, Fts5Storage**, char**); |
| 8477 static int sqlite3Fts5StorageClose(Fts5Storage *p); |
| 8478 static int sqlite3Fts5StorageRename(Fts5Storage*, const char *zName); |
| 8479 |
| 8480 static int sqlite3Fts5DropAll(Fts5Config*); |
| 8481 static int sqlite3Fts5CreateTable(Fts5Config*, const char*, const char*, int, char **); |
| 8482 |
| 8483 static int sqlite3Fts5StorageDelete(Fts5Storage *p, i64); |
| 8484 static int sqlite3Fts5StorageContentInsert(Fts5Storage *p, sqlite3_value**, i64*); |
| 8485 static int sqlite3Fts5StorageIndexInsert(Fts5Storage *p, sqlite3_value**, i64); |
| 8486 |
| 8487 static int sqlite3Fts5StorageIntegrity(Fts5Storage *p); |
| 8488 |
| 8489 static int sqlite3Fts5StorageStmt(Fts5Storage *p, int eStmt, sqlite3_stmt**, char**); |
| 8490 static void sqlite3Fts5StorageStmtRelease(Fts5Storage *p, int eStmt, sqlite3_stmt*); |
| 8491 |
| 8492 static int sqlite3Fts5StorageDocsize(Fts5Storage *p, i64 iRowid, int *aCol); |
| 8493 static int sqlite3Fts5StorageSize(Fts5Storage *p, int iCol, i64 *pnAvg); |
| 8494 static int sqlite3Fts5StorageRowCount(Fts5Storage *p, i64 *pnRow); |
| 8495 |
| 8496 static int sqlite3Fts5StorageSync(Fts5Storage *p, int bCommit); |
| 8497 static int sqlite3Fts5StorageRollback(Fts5Storage *p); |
| 8498 |
| 8499 static int sqlite3Fts5StorageConfigValue( |
| 8500 Fts5Storage *p, const char*, sqlite3_value*, int |
| 8501 ); |
| 8502 |
| 8503 static int sqlite3Fts5StorageSpecialDelete(Fts5Storage *p, i64 iDel, sqlite3_value**); |
| 8504 |
| 8505 static int sqlite3Fts5StorageDeleteAll(Fts5Storage *p); |
| 8506 static int sqlite3Fts5StorageRebuild(Fts5Storage *p); |
| 8507 static int sqlite3Fts5StorageOptimize(Fts5Storage *p); |
| 8508 static int sqlite3Fts5StorageMerge(Fts5Storage *p, int nMerge); |
| 8509 |
| 8510 /* |
| 8511 ** End of interface to code in fts5_storage.c. |
| 8512 **************************************************************************/ |
| 8513 |
| 8514 |
| 8515 /************************************************************************** |
| 8516 ** Interface to code in fts5_expr.c. |
| 8517 */ |
| 8518 typedef struct Fts5Expr Fts5Expr; |
| 8519 typedef struct Fts5ExprNode Fts5ExprNode; |
| 8520 typedef struct Fts5Parse Fts5Parse; |
| 8521 typedef struct Fts5Token Fts5Token; |
| 8522 typedef struct Fts5ExprPhrase Fts5ExprPhrase; |
| 8523 typedef struct Fts5ExprNearset Fts5ExprNearset; |
| 8524 |
| 8525 struct Fts5Token { |
| 8526 const char *p; /* Token text (not NULL terminated) */ |
| 8527 int n; /* Size of buffer p in bytes */ |
| 8528 }; |
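Because `p` is not NUL-terminated, any code that inspects a token has to carry `n` alongside it rather than rely on `strlen()`. The following is a minimal sketch of that pattern. The struct is repeated so the fragment stands alone, and `tokenEquals()` is a hypothetical helper introduced only for illustration; it is not part of FTS5.

```c
#include <string.h>

/* Same layout as the Fts5Token declared above. */
typedef struct Fts5Token Fts5Token;
struct Fts5Token {
  const char *p;   /* Token text (not NUL-terminated) */
  int n;           /* Size of buffer p in bytes */
};

/* Hypothetical helper (not part of FTS5): return 1 if the token text is
** byte-for-byte equal to the NUL-terminated string z, or 0 otherwise.
** The length is compared first, so memcmp() never reads past p[n-1]. */
static int tokenEquals(const Fts5Token *pTok, const char *z){
  size_t nZ = strlen(z);
  return pTok->n>=0 && (size_t)pTok->n==nZ && memcmp(pTok->p, z, nZ)==0;
}
```

A token that points into the middle of a larger expression string, such as the four bytes `NEAR` inside `one NEAR two`, compares equal to `"NEAR"` even though the byte after it is a space and not a terminator.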
| 8529 |
| 8530 /* Parse a MATCH expression. */ |
| 8531 static int sqlite3Fts5ExprNew( |
| 8532 Fts5Config *pConfig, |
| 8533 const char *zExpr, |
| 8534 Fts5Expr **ppNew, |
| 8535 char **pzErr |
| 8536 ); |
| 8537 |
| 8538 /* |
| 8539 ** for(rc = sqlite3Fts5ExprFirst(pExpr, pIdx, iMin, bDesc); |
| 8540 ** rc==SQLITE_OK && 0==sqlite3Fts5ExprEof(pExpr); |
| 8541 ** rc = sqlite3Fts5ExprNext(pExpr, iMax) |
| 8542 ** ){ |
| 8543 ** // The document with rowid iRowid matches the expression! |
| 8544 ** i64 iRowid = sqlite3Fts5ExprRowid(pExpr); |
| 8545 ** } |
| 8546 */ |
| 8547 static int sqlite3Fts5ExprFirst(Fts5Expr*, Fts5Index *pIdx, i64 iMin, int bDesc); |
| 8548 static int sqlite3Fts5ExprNext(Fts5Expr*, i64 iMax); |
| 8549 static int sqlite3Fts5ExprEof(Fts5Expr*); |
| 8550 static i64 sqlite3Fts5ExprRowid(Fts5Expr*); |
| 8551 |
| 8552 static void sqlite3Fts5ExprFree(Fts5Expr*); |
| 8553 |
| 8554 /* Called during startup to register a UDF with SQLite */ |
| 8555 static int sqlite3Fts5ExprInit(Fts5Global*, sqlite3*); |
| 8556 |
| 8557 static int sqlite3Fts5ExprPhraseCount(Fts5Expr*); |
| 8558 static int sqlite3Fts5ExprPhraseSize(Fts5Expr*, int iPhrase); |
| 8559 static int sqlite3Fts5ExprPoslist(Fts5Expr*, int, const u8 **); |
| 8560 |
| 8561 static int sqlite3Fts5ExprClonePhrase(Fts5Config*, Fts5Expr*, int, Fts5Expr**); |
| 8562 |
| 8563 /******************************************* |
| 8564 ** The fts5_expr.c API above this point is used by the other hand-written |
| 8565 ** C code in this module. The interfaces below this point are called by |
| 8566 ** the parser code in fts5parse.y. */ |
| 8567 |
| 8568 static void sqlite3Fts5ParseError(Fts5Parse *pParse, const char *zFmt, ...); |
| 8569 |
| 8570 static Fts5ExprNode *sqlite3Fts5ParseNode( |
| 8571 Fts5Parse *pParse, |
| 8572 int eType, |
| 8573 Fts5ExprNode *pLeft, |
| 8574 Fts5ExprNode *pRight, |
| 8575 Fts5ExprNearset *pNear |
| 8576 ); |
| 8577 |
| 8578 static Fts5ExprPhrase *sqlite3Fts5ParseTerm( |
| 8579 Fts5Parse *pParse, |
| 8580 Fts5ExprPhrase *pPhrase, |
| 8581 Fts5Token *pToken, |
| 8582 int bPrefix |
| 8583 ); |
| 8584 |
| 8585 static Fts5ExprNearset *sqlite3Fts5ParseNearset( |
| 8586 Fts5Parse*, |
| 8587 Fts5ExprNearset*, |
| 8588 Fts5ExprPhrase* |
| 8589 ); |
| 8590 |
| 8591 static Fts5Colset *sqlite3Fts5ParseColset( |
| 8592 Fts5Parse*, |
| 8593 Fts5Colset*, |
| 8594 Fts5Token * |
| 8595 ); |
| 8596 |
| 8597 static void sqlite3Fts5ParsePhraseFree(Fts5ExprPhrase*); |
| 8598 static void sqlite3Fts5ParseNearsetFree(Fts5ExprNearset*); |
| 8599 static void sqlite3Fts5ParseNodeFree(Fts5ExprNode*); |
| 8600 |
| 8601 static void sqlite3Fts5ParseSetDistance(Fts5Parse*, Fts5ExprNearset*, Fts5Token*); |
| 8602 static void sqlite3Fts5ParseSetColset(Fts5Parse*, Fts5ExprNearset*, Fts5Colset*); |
| 8603 static void sqlite3Fts5ParseFinished(Fts5Parse *pParse, Fts5ExprNode *p); |
| 8604 static void sqlite3Fts5ParseNear(Fts5Parse *pParse, Fts5Token*); |
| 8605 |
| 8606 /* |
| 8607 ** End of interface to code in fts5_expr.c. |
| 8608 **************************************************************************/ |
| 8609 |
| 8610 |
| 8611 |
| 8612 /************************************************************************** |
| 8613 ** Interface to code in fts5_aux.c. |
| 8614 */ |
| 8615 |
| 8616 static int sqlite3Fts5AuxInit(fts5_api*); |
| 8617 /* |
| 8618 ** End of interface to code in fts5_aux.c. |
| 8619 **************************************************************************/ |
| 8620 |
| 8621 /************************************************************************** |
| 8622 ** Interface to code in fts5_tokenizer.c. |
| 8623 */ |
| 8624 |
| 8625 static int sqlite3Fts5TokenizerInit(fts5_api*); |
| 8626 /* |
| 8627 ** End of interface to code in fts5_tokenizer.c. |
| 8628 **************************************************************************/ |
| 8629 |
| 8630 /************************************************************************** |
| 8631 ** Interface to code in fts5_vocab.c. |
| 8632 */ |
| 8633 |
| 8634 static int sqlite3Fts5VocabInit(Fts5Global*, sqlite3*); |
| 8635 |
| 8636 /* |
| 8637 ** End of interface to code in fts5_vocab.c. |
| 8638 **************************************************************************/ |
| 8639 |
| 8640 |
| 8641 /************************************************************************** |
| 8642 ** Interface to automatically generated code in fts5_unicode2.c. |
| 8643 */ |
| 8644 static int sqlite3Fts5UnicodeIsalnum(int c); |
| 8645 static int sqlite3Fts5UnicodeIsdiacritic(int c); |
| 8646 static int sqlite3Fts5UnicodeFold(int c, int bRemoveDiacritic); |
| 8647 /* |
| 8648 ** End of interface to code in fts5_unicode2.c. |
| 8649 **************************************************************************/ |
| 8650 |
| 8651 #endif |
| 8652 |
| 8653 #define FTS5_OR 1 |
| 8654 #define FTS5_AND 2 |
| 8655 #define FTS5_NOT 3 |
| 8656 #define FTS5_TERM 4 |
| 8657 #define FTS5_COLON 5 |
| 8658 #define FTS5_LP 6 |
| 8659 #define FTS5_RP 7 |
| 8660 #define FTS5_LCP 8 |
| 8661 #define FTS5_RCP 9 |
| 8662 #define FTS5_STRING 10 |
| 8663 #define FTS5_COMMA 11 |
| 8664 #define FTS5_PLUS 12 |
| 8665 #define FTS5_STAR 13 |
| 8666 |
| 8667 /* |
| 8668 ** 2000-05-29 |
| 8669 ** |
| 8670 ** The author disclaims copyright to this source code. In place of |
| 8671 ** a legal notice, here is a blessing: |
| 8672 ** |
| 8673 ** May you do good and not evil. |
| 8674 ** May you find forgiveness for yourself and forgive others. |
| 8675 ** May you share freely, never taking more than you give. |
| 8676 ** |
| 8677 ************************************************************************* |
| 8678 ** Driver template for the LEMON parser generator. |
| 8679 ** |
| 8680 ** The "lemon" program processes an LALR(1) input grammar file, then uses |
| 8681 ** this template to construct a parser. The "lemon" program inserts text |
| 8682 ** at each "%%" line. Also, any "P-a-r-s-e" identifier prefix (without the |
| 8683 ** interstitial "-" characters) contained in this template is changed into |
| 8684 ** the value of the %name directive from the grammar. Otherwise, the content |
| 8685 ** of this template is copied straight through into the generated parser |
| 8686 ** source file. |
| 8687 ** |
| 8688 ** The following is the concatenation of all %include directives from the |
| 8689 ** input grammar file: |
| 8690 */ |
| 8691 /* #include <stdio.h> */ |
| 8692 /************ Begin %include sections from the grammar ************************/ |
| 8693 |
| 8694 /* #include "fts5Int.h" */ |
| 8695 /* #include "fts5parse.h" */ |
| 8696 |
| 8697 /* |
| 8698 ** Disable all error recovery processing in the parser push-down |
| 8699 ** automaton. |
| 8700 */ |
| 8701 #define fts5YYNOERRORRECOVERY 1 |
| 8702 |
| 8703 /* |
| 8704 ** Make fts5yytestcase() the same as testcase() |
| 8705 */ |
| 8706 #define fts5yytestcase(X) testcase(X) |
| 8707 |
| 8708 /* |
| 8709 ** Indicate that sqlite3Fts5ParserFree() will never be called with a null |
| 8710 ** pointer. |
| 8711 */ |
| 8712 #define fts5YYPARSEFREENOTNULL 1 |
| 8713 |
| 8714 /* |
| 8715 ** Alternative datatype for the argument to the malloc() routine passed |
| 8716 ** into sqlite3Fts5ParserAlloc(). The default is size_t. |
| 8717 */ |
| 8718 #define fts5YYMALLOCARGTYPE u64 |
| 8719 |
| 8720 /**************** End of %include directives **********************************/ |
| 8721 /* These constants specify the various numeric values for terminal symbols |
| 8722 ** in a format understandable to "makeheaders". This section is blank unless |
| 8723 ** "lemon" is run with the "-m" command-line option. |
| 8724 ***************** Begin makeheaders token definitions *************************/ |
| 8725 /**************** End makeheaders token definitions ***************************/ |
| 8726 |
| 8727 /* The next section is a series of control #defines that set |
| 8728 ** various aspects of the generated parser. |
| 8729 ** fts5YYCODETYPE is the data type used to store the integer codes |
| 8730 ** that represent terminal and non-terminal symbols. |
| 8731 ** "unsigned char" is used if there are fewer than |
| 8732 ** 256 symbols. Larger types otherwise. |
| 8733 ** fts5YYNOCODE is a number of type fts5YYCODETYPE that is not used for |
| 8734 ** any terminal or nonterminal symbol. |
| 8735 ** fts5YYFALLBACK If defined, this indicates that one or more tokens |
| 8736 ** (also known as: "terminal symbols") have fall-back |
| 8737 ** values which should be used if the original symbol |
| 8738 ** would not parse. This permits keywords to sometimes |
| 8739 ** be used as identifiers, for example. |
| 8740 ** fts5YYACTIONTYPE is the data type used for "action codes" - numbers |
| 8741 ** that indicate what to do in response to the next |
| 8742 ** token. |
| 8743 ** sqlite3Fts5ParserFTS5TOKENTYPE is the data type used for minor type for terminal |
| 8744 ** symbols. Background: A "minor type" is a semantic |
| 8745 ** value associated with a terminal or non-terminal |
| 8746 ** symbols. For example, for an "ID" terminal symbol, |
| 8747 ** the minor type might be the name of the identifier. |
| 8748 ** Each non-terminal can have a different minor type. |
| 8749 ** Terminal symbols all have the same minor type, though. |
| 8750 ** This macro defines the minor type for terminal |
| 8751 ** symbols. |
| 8752 ** fts5YYMINORTYPE is the data type used for all minor types. |
| 8753 ** This is typically a union of many types, one of |
| 8754 ** which is sqlite3Fts5ParserFTS5TOKENTYPE. The entry in the union |
| 8755 ** for terminal symbols is called "fts5yy0". |
| 8756 ** fts5YYSTACKDEPTH is the maximum depth of the parser's stack. If |
| 8757 ** zero the stack is dynamically sized using realloc() |
| 8758 ** sqlite3Fts5ParserARG_SDECL A static variable declaration for the %extra_argument |
| 8759 ** sqlite3Fts5ParserARG_PDECL A parameter declaration for the %extra_argument |
| 8760 ** sqlite3Fts5ParserARG_STORE Code to store %extra_argument into fts5yypParser |
| 8761 ** sqlite3Fts5ParserARG_FETCH Code to extract %extra_argument from fts5yypParser |
| 8762 ** fts5YYERRORSYMBOL is the code number of the error symbol. If not |
| 8763 ** defined, then do no error processing. |
| 8764 ** fts5YYNSTATE the combined number of states. |
| 8765 ** fts5YYNRULE the number of rules in the grammar |
| 8766 ** fts5YY_MAX_SHIFT Maximum value for shift actions |
| 8767 ** fts5YY_MIN_SHIFTREDUCE Minimum value for shift-reduce actions |
| 8768 ** fts5YY_MAX_SHIFTREDUCE Maximum value for shift-reduce actions |
| 8769 ** fts5YY_MIN_REDUCE Minimum value for reduce actions |
| 8770 ** fts5YY_ERROR_ACTION The fts5yy_action[] code for syntax error |
| 8771 ** fts5YY_ACCEPT_ACTION The fts5yy_action[] code for accept |
| 8772 ** fts5YY_NO_ACTION The fts5yy_action[] code for no-op |
| 8773 */ |
| 8774 #ifndef INTERFACE |
| 8775 # define INTERFACE 1 |
| 8776 #endif |
| 8777 /************* Begin control #defines *****************************************/ |
| 8778 #define fts5YYCODETYPE unsigned char |
| 8779 #define fts5YYNOCODE 27 |
| 8780 #define fts5YYACTIONTYPE unsigned char |
| 8781 #define sqlite3Fts5ParserFTS5TOKENTYPE Fts5Token |
| 8782 typedef union { |
| 8783 int fts5yyinit; |
| 8784 sqlite3Fts5ParserFTS5TOKENTYPE fts5yy0; |
| 8785 Fts5Colset* fts5yy3; |
| 8786 Fts5ExprPhrase* fts5yy11; |
| 8787 Fts5ExprNode* fts5yy18; |
| 8788 int fts5yy20; |
| 8789 Fts5ExprNearset* fts5yy26; |
| 8790 } fts5YYMINORTYPE; |
| 8791 #ifndef fts5YYSTACKDEPTH |
| 8792 #define fts5YYSTACKDEPTH 100 |
| 8793 #endif |
| 8794 #define sqlite3Fts5ParserARG_SDECL Fts5Parse *pParse; |
| 8795 #define sqlite3Fts5ParserARG_PDECL ,Fts5Parse *pParse |
| 8796 #define sqlite3Fts5ParserARG_FETCH Fts5Parse *pParse = fts5yypParser->pParse |
| 8797 #define sqlite3Fts5ParserARG_STORE fts5yypParser->pParse = pParse |
| 8798 #define fts5YYNSTATE 26 |
| 8799 #define fts5YYNRULE 24 |
| 8800 #define fts5YY_MAX_SHIFT 25 |
| 8801 #define fts5YY_MIN_SHIFTREDUCE 40 |
| 8802 #define fts5YY_MAX_SHIFTREDUCE 63 |
| 8803 #define fts5YY_MIN_REDUCE 64 |
| 8804 #define fts5YY_MAX_REDUCE 87 |
| 8805 #define fts5YY_ERROR_ACTION 88 |
| 8806 #define fts5YY_ACCEPT_ACTION 89 |
| 8807 #define fts5YY_NO_ACTION 90 |
| 8808 /************* End control #defines *******************************************/ |
| 8809 |
| 8810 /* The fts5yyzerominor constant is used to initialize instances of |
| 8811 ** fts5YYMINORTYPE objects to zero. */ |
| 8812 static const fts5YYMINORTYPE fts5yyzerominor = { 0 }; |
| 8813 |
| 8814 /* Define the fts5yytestcase() macro to be a no-op if it is not already |
| 8815 ** defined. |
| 8816 ** |
| 8817 ** Applications can choose to define fts5yytestcase() in the %include section |
| 8818 ** to a macro that can assist in verifying code coverage. For production |
| 8819 ** code the fts5yytestcase() macro should be turned off. But it is useful |
| 8820 ** for testing. |
| 8821 */ |
| 8822 #ifndef fts5yytestcase |
| 8823 # define fts5yytestcase(X) |
| 8824 #endif |
| 8825 |
| 8826 |
| 8827 /* Next are the tables used to determine what action to take based on the |
| 8828 ** current state and lookahead token. These tables are used to implement |
| 8829 ** functions that take a state number and lookahead value and return an |
| 8830 ** action integer. |
| 8831 ** |
| 8832 ** Suppose the action integer is N. Then the action is determined as |
| 8833 ** follows |
| 8834 ** |
| 8835 ** 0 <= N <= fts5YY_MAX_SHIFT Shift N. That is, push the lookahead |
| 8836 ** token onto the stack and goto state N. |
| 8837 ** |
| 8838 ** N between fts5YY_MIN_SHIFTREDUCE Shift to an arbitrary state then |
| 8839 ** and fts5YY_MAX_SHIFTREDUCE reduce by rule N-fts5YY_MIN_SHIFTREDUCE. |
| 8840 ** |
| 8841 ** N between fts5YY_MIN_REDUCE Reduce by rule N-fts5YY_MIN_REDUCE |
| 8842 ** and fts5YY_MAX_REDUCE |
| 8843 ** |
| 8844 ** N == fts5YY_ERROR_ACTION A syntax error has occurred. |
| 8845 ** |
| 8846 ** N == fts5YY_ACCEPT_ACTION The parser accepts its input. |
| 8847 ** |
| 8848 ** N == fts5YY_NO_ACTION No such action. Denotes unused |
| 8849 ** slots in the fts5yy_action[] table. |
| 8850 ** |
| 8851 ** The action table is constructed as a single large table named fts5yy_action[]. |
| 8852 ** Given state S and lookahead X, the action is computed as |
| 8853 ** |
| 8854 ** fts5yy_action[ fts5yy_shift_ofst[S] + X ] |
| 8855 ** |
| 8856 ** If the index value fts5yy_shift_ofst[S]+X is out of range or if the value |
| 8857 ** fts5yy_lookahead[fts5yy_shift_ofst[S]+X] is not equal to X or if fts5yy_shift_ofst[S] |
| 8858 ** is equal to fts5YY_SHIFT_USE_DFLT, it means that the action is not in the table |
| 8859 ** and that fts5yy_default[S] should be used instead. |
| 8860 ** |
| 8861 ** The formula above is for computing the action when the lookahead is |
| 8862 ** a terminal symbol. If the lookahead is a non-terminal (as occurs after |
| 8863 ** a reduce action) then the fts5yy_reduce_ofst[] array is used in place of |
| 8864 ** the fts5yy_shift_ofst[] array and fts5YY_REDUCE_USE_DFLT is used in place of |
| 8865 ** fts5YY_SHIFT_USE_DFLT. |
| 8866 ** |
| 8867 ** The following are the tables generated in this section: |
| 8868 ** |
| 8869 ** fts5yy_action[] A single table containing all actions. |
| 8870 ** fts5yy_lookahead[] A table containing the lookahead for each entry in |
| 8871 ** fts5yy_action. Used to detect hash collisions. |
| 8872 ** fts5yy_shift_ofst[] For each state, the offset into fts5yy_action for |
| 8873 ** shifting terminals. |
| 8874 ** fts5yy_reduce_ofst[] For each state, the offset into fts5yy_action for |
| 8875 ** shifting non-terminals after a reduce. |
| 8876 ** fts5yy_default[] Default action for each state. |
| 8877 ** |
| 8878 *********** Begin parsing tables **********************************************/ |
| 8879 #define fts5YY_ACTTAB_COUNT (78) |
| 8880 static const fts5YYACTIONTYPE fts5yy_action[] = { |
| 8881 /* 0 */ 89, 15, 46, 5, 48, 24, 12, 19, 23, 14, |
| 8882 /* 10 */ 46, 5, 48, 24, 20, 21, 23, 43, 46, 5, |
| 8883 /* 20 */ 48, 24, 6, 18, 23, 17, 46, 5, 48, 24, |
| 8884 /* 30 */ 75, 7, 23, 25, 46, 5, 48, 24, 62, 47, |
| 8885 /* 40 */ 23, 48, 24, 7, 11, 23, 9, 3, 4, 2, |
| 8886 /* 50 */ 62, 50, 52, 44, 64, 3, 4, 2, 49, 4, |
| 8887 /* 60 */ 2, 1, 23, 11, 16, 9, 12, 2, 10, 61, |
| 8888 /* 70 */ 53, 59, 62, 60, 22, 13, 55, 8, |
| 8889 }; |
| 8890 static const fts5YYCODETYPE fts5yy_lookahead[] = { |
| 8891 /* 0 */ 15, 16, 17, 18, 19, 20, 10, 11, 23, 16, |
| 8892 /* 10 */ 17, 18, 19, 20, 23, 24, 23, 16, 17, 18, |
| 8893 /* 20 */ 19, 20, 22, 23, 23, 16, 17, 18, 19, 20, |
| 8894 /* 30 */ 5, 6, 23, 16, 17, 18, 19, 20, 13, 17, |
| 8895 /* 40 */ 23, 19, 20, 6, 8, 23, 10, 1, 2, 3, |
| 8896 /* 50 */ 13, 9, 10, 7, 0, 1, 2, 3, 19, 2, |
| 8897 /* 60 */ 3, 6, 23, 8, 21, 10, 10, 3, 10, 25, |
| 8898 /* 70 */ 10, 10, 13, 25, 12, 10, 7, 5, |
| 8899 }; |
| 8900 #define fts5YY_SHIFT_USE_DFLT (-5) |
| 8901 #define fts5YY_SHIFT_COUNT (25) |
| 8902 #define fts5YY_SHIFT_MIN (-4) |
| 8903 #define fts5YY_SHIFT_MAX (72) |
| 8904 static const signed char fts5yy_shift_ofst[] = { |
| 8905 /* 0 */ 55, 55, 55, 55, 55, 36, -4, 56, 58, 25, |
| 8906 /* 10 */ 37, 60, 59, 59, 46, 54, 42, 57, 62, 61, |
| 8907 /* 20 */ 62, 69, 65, 62, 72, 64, |
| 8908 }; |
| 8909 #define fts5YY_REDUCE_USE_DFLT (-16) |
| 8910 #define fts5YY_REDUCE_COUNT (13) |
| 8911 #define fts5YY_REDUCE_MIN (-15) |
| 8912 #define fts5YY_REDUCE_MAX (48) |
| 8913 static const signed char fts5yy_reduce_ofst[] = { |
| 8914 /* 0 */ -15, -7, 1, 9, 17, 22, -9, 0, 39, 44, |
| 8915 /* 10 */ 44, 43, 44, 48, |
| 8916 }; |
| 8917 static const fts5YYACTIONTYPE fts5yy_default[] = { |
| 8918 /* 0 */ 88, 88, 88, 88, 88, 69, 82, 88, 88, 87, |
| 8919 /* 10 */ 87, 88, 87, 87, 88, 88, 88, 66, 80, 88, |
| 8920 /* 20 */ 81, 88, 88, 78, 88, 65, |
| 8921 }; |
| 8922 /********** End of lemon-generated parsing tables *****************************/ |
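The lookup rule described in the comment above can be sketched as a small standalone program fragment. This is an illustration under stated assumptions, not the generated parser's own routine: the tables and limits are copied from the listing above with shortened names, `demoFindShiftAction()` and `demoActionKind()` are hypothetical names, and only the terminal-lookahead path (the `fts5yy_shift_ofst[]` case) is covered.

```c
#include <assert.h>

/* Control values copied from the #defines above (prefix shortened). */
enum {
  ACTTAB_COUNT = 78, SHIFT_USE_DFLT = -5, MAX_SHIFT = 25,
  MIN_SHIFTREDUCE = 40, MAX_SHIFTREDUCE = 63,
  MIN_REDUCE = 64, MAX_REDUCE = 87,
  ERROR_ACTION = 88, ACCEPT_ACTION = 89
};

/* Hypothetical labels for the kinds of action integer. */
enum ActionKind {
  ACT_SHIFT, ACT_SHIFTREDUCE, ACT_REDUCE, ACT_ERROR, ACT_ACCEPT, ACT_NONE
};

static const unsigned char aAction[] = {
  89, 15, 46,  5, 48, 24, 12, 19, 23, 14,
  46,  5, 48, 24, 20, 21, 23, 43, 46,  5,
  48, 24,  6, 18, 23, 17, 46,  5, 48, 24,
  75,  7, 23, 25, 46,  5, 48, 24, 62, 47,
  23, 48, 24,  7, 11, 23,  9,  3,  4,  2,
  62, 50, 52, 44, 64,  3,  4,  2, 49,  4,
   2,  1, 23, 11, 16,  9, 12,  2, 10, 61,
  53, 59, 62, 60, 22, 13, 55,  8,
};
static const unsigned char aLookahead[] = {
  15, 16, 17, 18, 19, 20, 10, 11, 23, 16,
  17, 18, 19, 20, 23, 24, 23, 16, 17, 18,
  19, 20, 22, 23, 23, 16, 17, 18, 19, 20,
   5,  6, 23, 16, 17, 18, 19, 20, 13, 17,
  23, 19, 20,  6,  8, 23, 10,  1,  2,  3,
  13,  9, 10,  7,  0,  1,  2,  3, 19,  2,
   3,  6, 23,  8, 21, 10, 10,  3, 10, 25,
  10, 10, 13, 25, 12, 10,  7,  5,
};
static const signed char aShiftOfst[] = {
  55, 55, 55, 55, 55, 36, -4, 56, 58, 25,
  37, 60, 59, 59, 46, 54, 42, 57, 62, 61,
  62, 69, 65, 62, 72, 64,
};
static const unsigned char aDefault[] = {
  88, 88, 88, 88, 88, 69, 82, 88, 88, 87,
  87, 88, 87, 87, 88, 88, 88, 66, 80, 88,
  81, 88, 88, 78, 88, 65,
};

/* Return the action integer for state S and terminal lookahead X, falling
** back to the per-state default when the slot does not belong to (S,X). */
static int demoFindShiftAction(int S, int X){
  int i;
  assert( S>=0 && S<=MAX_SHIFT );
  i = aShiftOfst[S];
  if( i==SHIFT_USE_DFLT ) return aDefault[S];
  i += X;
  if( i<0 || i>=ACTTAB_COUNT || aLookahead[i]!=X ) return aDefault[S];
  return aAction[i];
}

/* Classify an action integer N using the ranges listed in the comment. */
static enum ActionKind demoActionKind(int N){
  if( N>=0 && N<=MAX_SHIFT ) return ACT_SHIFT;
  if( N>=MIN_SHIFTREDUCE && N<=MAX_SHIFTREDUCE ) return ACT_SHIFTREDUCE;
  if( N>=MIN_REDUCE && N<=MAX_REDUCE ) return ACT_REDUCE;
  if( N==ERROR_ACTION ) return ACT_ERROR;
  if( N==ACCEPT_ACTION ) return ACT_ACCEPT;
  return ACT_NONE;
}
```

With these tables, state 0 on a STRING lookahead (code 10) yields 9, a plain shift, while an OR lookahead (code 1) lands on a slot whose `aLookahead[]` entry does not match and so falls through to the default, 88, the syntax-error action.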
| 8923 |
| 8924 /* The next table maps tokens (terminal symbols) into fallback tokens. |
| 8925 ** If a construct like the following: |
| 8926 ** |
| 8927 ** %fallback ID X Y Z. |
| 8928 ** |
| 8929 ** appears in the grammar, then ID becomes a fallback token for X, Y, |
| 8930 ** and Z. Whenever one of the tokens X, Y, or Z is input to the parser |
| 8931 ** but it does not parse, the type of the token is changed to ID and |
| 8932 ** the parse is retried before an error is thrown. |
| 8933 ** |
| 8934 ** This feature can be used, for example, to cause some keywords in a language |
| 8935 ** to revert to identifiers if the keyword does not apply in the context where |
| 8936 ** it appears. |
| 8937 */ |
| 8938 #ifdef fts5YYFALLBACK |
| 8939 static const fts5YYCODETYPE fts5yyFallback[] = { |
| 8940 }; |
| 8941 #endif /* fts5YYFALLBACK */ |
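In this listing the fallback table is empty and is compiled only if fts5YYFALLBACK is defined, so the FTS5 grammar does not appear to use the feature. For a grammar that does, the table is indexed by token code and an entry of 0 means "no fallback". The sketch below assumes a made-up grammar containing `%fallback ID X Y Z.`; the token codes and the names `aFallback` and `fallbackFor()` are hypothetical.

```c
/* Hypothetical token codes for a grammar with "%fallback ID X Y Z." */
enum { TK_EOF = 0, TK_ID = 1, TK_X = 2, TK_Y = 3, TK_Z = 4, TK_SEMI = 5 };

/* One entry per terminal symbol; 0 means the token has no fallback. */
static const unsigned char aFallback[] = {
  0,      /* $    => nothing */
  0,      /* ID   => nothing */
  TK_ID,  /* X    => ID */
  TK_ID,  /* Y    => ID */
  TK_ID,  /* Z    => ID */
  0,      /* SEMI => nothing */
};

/* Return the fallback token for iToken, or 0 if there is none. Codes
** outside the table are treated as having no fallback. */
static int fallbackFor(int iToken){
  int n = (int)(sizeof(aFallback)/sizeof(aFallback[0]));
  if( iToken<0 || iToken>=n ) return 0;
  return aFallback[iToken];
}
```

A parser using such a table would retry the lookup with `fallbackFor(X)` when the original token has no action in the current state, and report a syntax error only if the retry also fails.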
| 8942 |
| 8943 /* The following structure represents a single element of the |
| 8944 ** parser's stack. Information stored includes: |
| 8945 ** |
| 8946 ** + The state number for the parser at this level of the stack. |
| 8947 ** |
| 8948 ** + The value of the token stored at this level of the stack. |
| 8949 ** (In other words, the "major" token.) |
| 8950 ** |
| 8951 ** + The semantic value stored at this level of the stack. This is |
| 8952 ** the information used by the action routines in the grammar. |
| 8953 ** It is sometimes called the "minor" token. |
| 8954 ** |
| 8955 ** After the "shift" half of a SHIFTREDUCE action, the stateno field |
| 8956 ** actually contains the reduce action for the second half of the |
| 8957 ** SHIFTREDUCE. |
| 8958 */ |
| 8959 struct fts5yyStackEntry { |
| 8960 fts5YYACTIONTYPE stateno; /* The state-number, or reduce action in SHIFTREDUCE */ |
| 8961 fts5YYCODETYPE major; /* The major token value. This is the code |
| 8962 ** number for the token at this stack level */ |
| 8963 fts5YYMINORTYPE minor; /* The user-supplied minor token value. This |
| 8964 ** is the value of the token */ |
| 8965 }; |
| 8966 typedef struct fts5yyStackEntry fts5yyStackEntry; |
| 8967 |
| 8968 /* The state of the parser is completely contained in an instance of |
| 8969 ** the following structure */ |
| 8970 struct fts5yyParser { |
| 8971 int fts5yyidx; /* Index of top element in stack */ |
| 8972 #ifdef fts5YYTRACKMAXSTACKDEPTH |
| 8973 int fts5yyidxMax; /* Maximum value of fts5yyidx */ |
| 8974 #endif |
| 8975 int fts5yyerrcnt; /* Shifts left before out of the error */ |
| 8976 sqlite3Fts5ParserARG_SDECL /* A place to hold %extra_argument */ |
| 8977 #if fts5YYSTACKDEPTH<=0 |
| 8978 int fts5yystksz; /* Current size of the stack */ |
| 8979 fts5yyStackEntry *fts5yystack; /* The parser's stack */ |
| 8980 #else |
| 8981 fts5yyStackEntry fts5yystack[fts5YYSTACKDEPTH]; /* The parser's stack */ |
| 8982 #endif |
| 8983 }; |
| 8984 typedef struct fts5yyParser fts5yyParser; |
| 8985 |
| 8986 #ifndef NDEBUG |
| 8987 /* #include <stdio.h> */ |
| 8988 static FILE *fts5yyTraceFILE = 0; |
| 8989 static char *fts5yyTracePrompt = 0; |
| 8990 #endif /* NDEBUG */ |
| 8991 |
| 8992 #ifndef NDEBUG |
| 8993 /* |
| 8994 ** Turn parser tracing on by giving a stream to which to write the trace |
| 8995 ** and a prompt to preface each trace message. Tracing is turned off |
| 8996 ** by making either argument NULL |
| 8997 ** |
| 8998 ** Inputs: |
| 8999 ** <ul> |
| 9000 ** <li> A FILE* to which trace output should be written. |
| 9001 ** If NULL, then tracing is turned off. |
| 9002 ** <li> A prefix string written at the beginning of every |
| 9003 ** line of trace output. If NULL, then tracing is |
| 9004 ** turned off. |
| 9005 ** </ul> |
| 9006 ** |
| 9007 ** Outputs: |
| 9008 ** None. |
| 9009 */ |
| 9010 static void sqlite3Fts5ParserTrace(FILE *TraceFILE, char *zTracePrompt){ |
| 9011 fts5yyTraceFILE = TraceFILE; |
| 9012 fts5yyTracePrompt = zTracePrompt; |
| 9013 if( fts5yyTraceFILE==0 ) fts5yyTracePrompt = 0; |
| 9014 else if( fts5yyTracePrompt==0 ) fts5yyTraceFILE = 0; |
| 9015 } |
| 9016 #endif /* NDEBUG */ |
| 9017 |
| 9018 #ifndef NDEBUG |
| 9019 /* For tracing shifts, the names of all terminals and nonterminals |
| 9020 ** are required. The following table supplies these names */ |
| 9021 static const char *const fts5yyTokenName[] = { |
| 9022 "$", "OR", "AND", "NOT", |
| 9023 "TERM", "COLON", "LP", "RP", |
| 9024 "LCP", "RCP", "STRING", "COMMA", |
| 9025 "PLUS", "STAR", "error", "input", |
| 9026 "expr", "cnearset", "exprlist", "nearset", |
| 9027 "colset", "colsetlist", "nearphrases", "phrase", |
| 9028 "neardist_opt", "star_opt", |
| 9029 }; |
| 9030 #endif /* NDEBUG */ |
| 9031 |
| 9032 #ifndef NDEBUG |
| 9033 /* For tracing reduce actions, the names of all rules are required. |
| 9034 */ |
| 9035 static const char *const fts5yyRuleName[] = { |
| 9036 /* 0 */ "input ::= expr", |
| 9037 /* 1 */ "expr ::= expr AND expr", |
| 9038 /* 2 */ "expr ::= expr OR expr", |
| 9039 /* 3 */ "expr ::= expr NOT expr", |
| 9040 /* 4 */ "expr ::= LP expr RP", |
| 9041 /* 5 */ "expr ::= exprlist", |
| 9042 /* 6 */ "exprlist ::= cnearset", |
| 9043 /* 7 */ "exprlist ::= exprlist cnearset", |
| 9044 /* 8 */ "cnearset ::= nearset", |
| 9045 /* 9 */ "cnearset ::= colset COLON nearset", |
| 9046 /* 10 */ "colset ::= LCP colsetlist RCP", |
| 9047 /* 11 */ "colset ::= STRING", |
| 9048 /* 12 */ "colsetlist ::= colsetlist STRING", |
| 9049 /* 13 */ "colsetlist ::= STRING", |
| 9050 /* 14 */ "nearset ::= phrase", |
| 9051 /* 15 */ "nearset ::= STRING LP nearphrases neardist_opt RP", |
| 9052 /* 16 */ "nearphrases ::= phrase", |
| 9053 /* 17 */ "nearphrases ::= nearphrases phrase", |
| 9054 /* 18 */ "neardist_opt ::=", |
| 9055 /* 19 */ "neardist_opt ::= COMMA STRING", |
| 9056 /* 20 */ "phrase ::= phrase PLUS STRING star_opt", |
| 9057 /* 21 */ "phrase ::= STRING star_opt", |
| 9058 /* 22 */ "star_opt ::= STAR", |
| 9059 /* 23 */ "star_opt ::=", |
| 9060 }; |
| 9061 #endif /* NDEBUG */ |
| 9062 |
| 9063 |
| 9064 #if fts5YYSTACKDEPTH<=0 |
| 9065 /* |
| 9066 ** Try to increase the size of the parser stack. |
| 9067 */ |
| 9068 static void fts5yyGrowStack(fts5yyParser *p){ |
| 9069 int newSize; |
| 9070 fts5yyStackEntry *pNew; |
| 9071 |
| 9072 newSize = p->fts5yystksz*2 + 100; |
| 9073 pNew = realloc(p->fts5yystack, newSize*sizeof(pNew[0])); |
| 9074 if( pNew ){ |
| 9075 p->fts5yystack = pNew; |
| 9076 p->fts5yystksz = newSize; |
| 9077 #ifndef NDEBUG |
| 9078 if( fts5yyTraceFILE ){ |
| 9079 fprintf(fts5yyTraceFILE,"%sStack grows to %d entries!\n", |
| 9080 fts5yyTracePrompt, p->fts5yystksz); |
| 9081 } |
| 9082 #endif |
| 9083 } |
| 9084 } |
| 9085 #endif |
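The growth policy used by fts5yyGrowStack() (new size = 2*old + 100, with the old stack kept if realloc() fails) can be tried in isolation. The sketch below is not the parser's code: `DemoStack` and `demoGrowStack()` are hypothetical stand-ins that store plain ints instead of fts5yyStackEntry objects.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the dynamically sized parser stack. */
typedef struct DemoStack DemoStack;
struct DemoStack {
  int nAlloc;    /* Number of entries currently allocated */
  int *aEntry;   /* Array of nAlloc entries, or NULL if nAlloc==0 */
};

/* Try to grow the stack to 2*nAlloc+100 entries. If realloc() fails the
** stack is left exactly as it was. Return the resulting capacity, so a
** caller can detect failure by seeing an unchanged value. */
static int demoGrowStack(DemoStack *p){
  int nNew = p->nAlloc*2 + 100;
  int *aNew = realloc(p->aEntry, (size_t)nNew*sizeof(aNew[0]));
  if( aNew ){
    p->aEntry = aNew;
    p->nAlloc = nNew;
  }
  return p->nAlloc;
}
```

Starting from an empty stack, successive successful calls give capacities of 100, 300 and 700 entries. Because a failed call changes nothing, the caller still owns a valid (if too small) stack and must release it with free() when done.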
| 9086 |
| 9087 /* Datatype of the argument to the memory allocator passed as the |
| 9088 ** argument to sqlite3Fts5ParserAlloc() below. This can be changed by |
| 9089 ** putting an appropriate #define in the %include section of the input |
| 9090 ** grammar. |
| 9091 */ |
| 9092 #ifndef fts5YYMALLOCARGTYPE |
| 9093 # define fts5YYMALLOCARGTYPE size_t |
| 9094 #endif |
| 9095 |
| 9096 /* |
| 9097 ** This function allocates a new parser. |
| 9098 ** The only argument is a pointer to a function which works like |
| 9099 ** malloc. |
| 9100 ** |
| 9101 ** Inputs: |
| 9102 ** A pointer to the function used to allocate memory. |
| 9103 ** |
| 9104 ** Outputs: |
| 9105 ** A pointer to a parser. This pointer is used in subsequent calls |
| 9106 ** to sqlite3Fts5Parser and sqlite3Fts5ParserFree. |
| 9107 */ |
| 9108 static void *sqlite3Fts5ParserAlloc(void *(*mallocProc)(fts5YYMALLOCARGTYPE)){ |
| 9109 fts5yyParser *pParser; |
| 9110 pParser = (fts5yyParser*)(*mallocProc)( (fts5YYMALLOCARGTYPE)sizeof(fts5yyParser) ); |
| 9111 if( pParser ){ |
| 9112 pParser->fts5yyidx = -1; |
| 9113 #ifdef fts5YYTRACKMAXSTACKDEPTH |
| 9114 pParser->fts5yyidxMax = 0; |
| 9115 #endif |
| 9116 #if fts5YYSTACKDEPTH<=0 |
| 9117 pParser->fts5yystack = NULL; |
| 9118 pParser->fts5yystksz = 0; |
| 9119 fts5yyGrowStack(pParser); |
| 9120 #endif |
| 9121 } |
| 9122 return pParser; |
| 9123 } |
| 9124 |
| 9125 /* The following function deletes the "minor type" or semantic value |
| 9126 ** associated with a symbol. The symbol can be either a terminal |
| 9127 ** or nonterminal. "fts5yymajor" is the symbol code, and "fts5yypminor" is |
| 9128 ** a pointer to the value to be deleted. The code used to do the |
| 9129 ** deletions is derived from the %destructor and/or %token_destructor |
| 9130 ** directives of the input grammar. |
| 9131 */ |
| 9132 static void fts5yy_destructor( |
| 9133 fts5yyParser *fts5yypParser, /* The parser */ |
| 9134 fts5YYCODETYPE fts5yymajor, /* Type code for object to destroy */ |
| 9135 fts5YYMINORTYPE *fts5yypminor /* The object to be destroyed */ |
| 9136 ){ |
| 9137 sqlite3Fts5ParserARG_FETCH; |
| 9138 switch( fts5yymajor ){ |
| 9139 /* Here are inserted the actions which take place when a |
| 9140 ** terminal or non-terminal is destroyed. This can happen |
| 9141 ** when the symbol is popped from the stack during a |
| 9142 ** reduce or during error processing or when a parser is |
| 9143 ** being destroyed before it is finished parsing. |
| 9144 ** |
| 9145 ** Note: during a reduce, the only symbols destroyed are those |
| 9146 ** which appear on the RHS of the rule, but which are *not* used |
| 9147 ** inside the C code. |
| 9148 */ |
| 9149 /********* Begin destructor definitions ***************************************/ |
| 9150 case 15: /* input */ |
| 9151 { |
| 9152 (void)pParse; |
| 9153 } |
| 9154 break; |
| 9155 case 16: /* expr */ |
| 9156 case 17: /* cnearset */ |
| 9157 case 18: /* exprlist */ |
| 9158 { |
| 9159 sqlite3Fts5ParseNodeFree((fts5yypminor->fts5yy18)); |
| 9160 } |
| 9161 break; |
| 9162 case 19: /* nearset */ |
| 9163 case 22: /* nearphrases */ |
| 9164 { |
| 9165 sqlite3Fts5ParseNearsetFree((fts5yypminor->fts5yy26)); |
| 9166 } |
| 9167 break; |
| 9168 case 20: /* colset */ |
| 9169 case 21: /* colsetlist */ |
| 9170 { |
| 9171 sqlite3_free((fts5yypminor->fts5yy3)); |
| 9172 } |
| 9173 break; |
| 9174 case 23: /* phrase */ |
| 9175 { |
| 9176 sqlite3Fts5ParsePhraseFree((fts5yypminor->fts5yy11)); |
| 9177 } |
| 9178 break; |
| 9179 /********* End destructor definitions *****************************************/ |
| 9180 default: break; /* If no destructor action specified: do nothing */ |
| 9181 } |
| 9182 } |
| 9183 |
| 9184 /* |
| 9185 ** Pop the parser's stack once. |
| 9186 ** |
| 9187 ** If there is a destructor routine associated with the token which |
| 9188 ** is popped from the stack, then call it. |
| 9189 */ |
| 9190 static void fts5yy_pop_parser_stack(fts5yyParser *pParser){ |
| 9191 fts5yyStackEntry *fts5yytos; |
| 9192 assert( pParser->fts5yyidx>=0 ); |
| 9193 fts5yytos = &pParser->fts5yystack[pParser->fts5yyidx--]; |
| 9194 #ifndef NDEBUG |
| 9195 if( fts5yyTraceFILE ){ |
| 9196 fprintf(fts5yyTraceFILE,"%sPopping %s\n", |
| 9197 fts5yyTracePrompt, |
| 9198 fts5yyTokenName[fts5yytos->major]); |
| 9199 } |
| 9200 #endif |
| 9201 fts5yy_destructor(pParser, fts5yytos->major, &fts5yytos->minor); |
| 9202 } |
| 9203 |
| 9204 /* |
| 9205 ** Deallocate and destroy a parser. Destructors are called for |
| 9206 ** all stack elements before shutting the parser down. |
| 9207 ** |
| 9208 ** If the fts5YYPARSEFREENEVERNULL macro exists (for example because it |
| 9209 ** is defined in a %include section of the input grammar) then it is |
| 9210 ** assumed that the input pointer is never NULL. |
| 9211 */ |
| 9212 static void sqlite3Fts5ParserFree( |
| 9213 void *p, /* The parser to be deleted */ |
| 9214 void (*freeProc)(void*) /* Function used to reclaim memory */ |
| 9215 ){ |
| 9216 fts5yyParser *pParser = (fts5yyParser*)p; |
| 9217 #ifndef fts5YYPARSEFREENEVERNULL |
| 9218 if( pParser==0 ) return; |
| 9219 #endif |
| 9220 while( pParser->fts5yyidx>=0 ) fts5yy_pop_parser_stack(pParser); |
| 9221 #if fts5YYSTACKDEPTH<=0 |
| 9222 free(pParser->fts5yystack); |
| 9223 #endif |
| 9224 (*freeProc)((void*)pParser); |
| 9225 } |
| 9226 |
| 9227 /* |
| 9228 ** Return the peak depth of the stack for a parser. |
| 9229 */ |
| 9230 #ifdef fts5YYTRACKMAXSTACKDEPTH |
| 9231 static int sqlite3Fts5ParserStackPeak(void *p){ |
| 9232 fts5yyParser *pParser = (fts5yyParser*)p; |
| 9233 return pParser->fts5yyidxMax; |
| 9234 } |
| 9235 #endif |
| 9236 |
| 9237 /* |
| 9238 ** Find the appropriate action for a parser given the terminal |
| 9239 ** look-ahead token iLookAhead. |
| 9240 */ |
| 9241 static int fts5yy_find_shift_action( |
| 9242 fts5yyParser *pParser, /* The parser */ |
| 9243 fts5YYCODETYPE iLookAhead /* The look-ahead token */ |
| 9244 ){ |
| 9245 int i; |
| 9246 int stateno = pParser->fts5yystack[pParser->fts5yyidx].stateno; |
| 9247 |
| 9248 if( stateno>=fts5YY_MIN_REDUCE ) return stateno; |
| 9249 assert( stateno <= fts5YY_SHIFT_COUNT ); |
| 9250 do{ |
| 9251 i = fts5yy_shift_ofst[stateno]; |
| 9252 if( i==fts5YY_SHIFT_USE_DFLT ) return fts5yy_default[stateno]; |
| 9253 assert( iLookAhead!=fts5YYNOCODE ); |
| 9254 i += iLookAhead; |
| 9255 if( i<0 || i>=fts5YY_ACTTAB_COUNT || fts5yy_lookahead[i]!=iLookAhead ){ |
| 9256 if( iLookAhead>0 ){ |
| 9257 #ifdef fts5YYFALLBACK |
| 9258 fts5YYCODETYPE iFallback; /* Fallback token */ |
| 9259 if( iLookAhead<sizeof(fts5yyFallback)/sizeof(fts5yyFallback[0]) |
| 9260 && (iFallback = fts5yyFallback[iLookAhead])!=0 ){ |
| 9261 #ifndef NDEBUG |
| 9262 if( fts5yyTraceFILE ){ |
| 9263 fprintf(fts5yyTraceFILE, "%sFALLBACK %s => %s\n", |
| 9264 fts5yyTracePrompt, fts5yyTokenName[iLookAhead], fts5yyTokenName[iFallback]); |
| 9265 } |
| 9266 #endif |
| 9267 assert( fts5yyFallback[iFallback]==0 ); /* Fallback loop must terminate */ |
| 9268 iLookAhead = iFallback; |
| 9269 continue; |
| 9270 } |
| 9271 #endif |
| 9272 #ifdef fts5YYWILDCARD |
| 9273 { |
| 9274 int j = i - iLookAhead + fts5YYWILDCARD; |
| 9275 if( |
| 9276 #if fts5YY_SHIFT_MIN+fts5YYWILDCARD<0 |
| 9277 j>=0 && |
| 9278 #endif |
| 9279 #if fts5YY_SHIFT_MAX+fts5YYWILDCARD>=fts5YY_ACTTAB_COUNT |
| 9280 j<fts5YY_ACTTAB_COUNT && |
| 9281 #endif |
| 9282 fts5yy_lookahead[j]==fts5YYWILDCARD |
| 9283 ){ |
| 9284 #ifndef NDEBUG |
| 9285 if( fts5yyTraceFILE ){ |
| 9286 fprintf(fts5yyTraceFILE, "%sWILDCARD %s => %s\n", |
| 9287 fts5yyTracePrompt, fts5yyTokenName[iLookAhead], |
| 9288 fts5yyTokenName[fts5YYWILDCARD]); |
| 9289 } |
| 9290 #endif /* NDEBUG */ |
| 9291 return fts5yy_action[j]; |
| 9292 } |
| 9293 } |
| 9294 #endif /* fts5YYWILDCARD */ |
| 9295 } |
| 9296 return fts5yy_default[stateno]; |
| 9297 }else{ |
| 9298 return fts5yy_action[i]; |
| 9299 } |
| 9300 }while(1); |
| 9301 } |
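The lookup in fts5yy_find_shift_action() can be illustrated with a much smaller table. The following is only a sketch under stated assumptions: the toy_* tables, their contents, and toy_find_action() are hypothetical and are not part of the generated parser. It shows the core idea: add the look-ahead code to a per-state offset, confirm the slot really belongs to that look-ahead, and otherwise fall back to the state's default action. Fallback tokens and wildcards are left out.

```c
#include <assert.h>

/* Hypothetical toy tables. The real fts5yy_* tables are generated by Lemon. */
enum { TOY_NSTATE = 2, TOY_ACTTAB_COUNT = 4, TOY_USE_DFLT = -1 };

static const int toy_shift_ofst[TOY_NSTATE]      = { 0, TOY_USE_DFLT };
static const int toy_lookahead[TOY_ACTTAB_COUNT] = { 0, 1, 9, 9 };
static const int toy_action[TOY_ACTTAB_COUNT]    = { 10, 11, 0, 0 };
static const int toy_default[TOY_NSTATE]         = { 99, 42 };

/* Return the action for (stateno, iLookAhead), or the state's default
** action when the compressed table has no entry for that pair. */
static int toy_find_action(int stateno, int iLookAhead){
  int i;
  assert( stateno>=0 && stateno<TOY_NSTATE );
  i = toy_shift_ofst[stateno];
  if( i==TOY_USE_DFLT ) return toy_default[stateno];
  i += iLookAhead;
  /* The slot is valid only if it is in range and tagged with this token. */
  if( i<0 || i>=TOY_ACTTAB_COUNT || toy_lookahead[i]!=iLookAhead ){
    return toy_default[stateno];
  }
  return toy_action[i];
}
```

In this toy layout, state 0 has explicit entries for tokens 0 and 1 only, and state 1 always uses its default action.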
| 9302 |
| 9303 /* |
| 9304 ** Find the appropriate action for a parser given the non-terminal |
| 9305 ** look-ahead token iLookAhead. |
| 9306 */ |
| 9307 static int fts5yy_find_reduce_action( |
| 9308 int stateno, /* Current state number */ |
| 9309 fts5YYCODETYPE iLookAhead /* The look-ahead token */ |
| 9310 ){ |
| 9311 int i; |
| 9312 #ifdef fts5YYERRORSYMBOL |
| 9313 if( stateno>fts5YY_REDUCE_COUNT ){ |
| 9314 return fts5yy_default[stateno]; |
| 9315 } |
| 9316 #else |
| 9317 assert( stateno<=fts5YY_REDUCE_COUNT ); |
| 9318 #endif |
| 9319 i = fts5yy_reduce_ofst[stateno]; |
| 9320 assert( i!=fts5YY_REDUCE_USE_DFLT ); |
| 9321 assert( iLookAhead!=fts5YYNOCODE ); |
| 9322 i += iLookAhead; |
| 9323 #ifdef fts5YYERRORSYMBOL |
| 9324 if( i<0 || i>=fts5YY_ACTTAB_COUNT || fts5yy_lookahead[i]!=iLookAhead ){ |
| 9325 return fts5yy_default[stateno]; |
| 9326 } |
| 9327 #else |
| 9328 assert( i>=0 && i<fts5YY_ACTTAB_COUNT ); |
| 9329 assert( fts5yy_lookahead[i]==iLookAhead ); |
| 9330 #endif |
| 9331 return fts5yy_action[i]; |
| 9332 } |
| 9333 |
| 9334 /* |
| 9335 ** The following routine is called if the stack overflows. |
| 9336 */ |
| 9337 static void fts5yyStackOverflow(fts5yyParser *fts5yypParser, fts5YYMINORTYPE *fts5yypMinor){ |
| 9338 sqlite3Fts5ParserARG_FETCH; |
| 9339 fts5yypParser->fts5yyidx--; |
| 9340 #ifndef NDEBUG |
| 9341 if( fts5yyTraceFILE ){ |
| 9342 fprintf(fts5yyTraceFILE,"%sStack Overflow!\n",fts5yyTracePrompt); |
| 9343 } |
| 9344 #endif |
| 9345 while( fts5yypParser->fts5yyidx>=0 ) fts5yy_pop_parser_stack(fts5yypParser); |
| 9346 /* Here code is inserted which will execute if the parser |
| 9347 ** stack ever overflows */ |
| 9348 /******** Begin %stack_overflow code ******************************************/ |
| 9349 |
| 9350 assert( 0 ); |
| 9351 /******** End %stack_overflow code ********************************************/ |
| 9352 sqlite3Fts5ParserARG_STORE; /* Suppress warning about unused %extra_argument var */ |
| 9353 } |
| 9354 |
| 9355 /* |
| 9356 ** Print tracing information for a SHIFT action |
| 9357 */ |
| 9358 #ifndef NDEBUG |
| 9359 static void fts5yyTraceShift(fts5yyParser *fts5yypParser, int fts5yyNewState){ |
| 9360 if( fts5yyTraceFILE ){ |
| 9361 if( fts5yyNewState<fts5YYNSTATE ){ |
| 9362 fprintf(fts5yyTraceFILE,"%sShift '%s', go to state %d\n", |
| 9363 fts5yyTracePrompt,fts5yyTokenName[fts5yypParser->fts5yystack[fts5yypParser->fts5yyidx].major], |
| 9364 fts5yyNewState); |
| 9365 }else{ |
| 9366 fprintf(fts5yyTraceFILE,"%sShift '%s'\n", |
| 9367 fts5yyTracePrompt,fts5yyTokenName[fts5yypParser->fts5yystack[fts5yypParser->fts5yyidx].major]); |
| 9368 } |
| 9369 } |
| 9370 } |
| 9371 #else |
| 9372 # define fts5yyTraceShift(X,Y) |
| 9373 #endif |
| 9374 |
| 9375 /* |
| 9376 ** Perform a shift action. |
| 9377 */ |
| 9378 static void fts5yy_shift( |
| 9379 fts5yyParser *fts5yypParser, /* The parser to be shifted */ |
| 9380 int fts5yyNewState, /* The new state to shift in */ |
| 9381 int fts5yyMajor, /* The major token to shift in */ |
| 9382 fts5YYMINORTYPE *fts5yypMinor /* Pointer to the minor token to shift in */ |
| 9383 ){ |
| 9384 fts5yyStackEntry *fts5yytos; |
| 9385 fts5yypParser->fts5yyidx++; |
| 9386 #ifdef fts5YYTRACKMAXSTACKDEPTH |
| 9387 if( fts5yypParser->fts5yyidx>fts5yypParser->fts5yyidxMax ){ |
| 9388 fts5yypParser->fts5yyidxMax = fts5yypParser->fts5yyidx; |
| 9389 } |
| 9390 #endif |
| 9391 #if fts5YYSTACKDEPTH>0 |
| 9392 if( fts5yypParser->fts5yyidx>=fts5YYSTACKDEPTH ){ |
| 9393 fts5yyStackOverflow(fts5yypParser, fts5yypMinor); |
| 9394 return; |
| 9395 } |
| 9396 #else |
| 9397 if( fts5yypParser->fts5yyidx>=fts5yypParser->fts5yystksz ){ |
| 9398 fts5yyGrowStack(fts5yypParser); |
| 9399 if( fts5yypParser->fts5yyidx>=fts5yypParser->fts5yystksz ){ |
| 9400 fts5yyStackOverflow(fts5yypParser, fts5yypMinor); |
| 9401 return; |
| 9402 } |
| 9403 } |
| 9404 #endif |
| 9405 fts5yytos = &fts5yypParser->fts5yystack[fts5yypParser->fts5yyidx]; |
| 9406 fts5yytos->stateno = (fts5YYACTIONTYPE)fts5yyNewState; |
| 9407 fts5yytos->major = (fts5YYCODETYPE)fts5yyMajor; |
| 9408 fts5yytos->minor = *fts5yypMinor; |
| 9409 fts5yyTraceShift(fts5yypParser, fts5yyNewState); |
| 9410 } |
| 9411 |
| 9412 /* The following table contains information about every rule that |
| 9413 ** is used during the reduce. |
| 9414 */ |
| 9415 static const struct { |
| 9416 fts5YYCODETYPE lhs; /* Symbol on the left-hand side of the rule */ |
| 9417 unsigned char nrhs; /* Number of right-hand side symbols in the rule */ |
| 9418 } fts5yyRuleInfo[] = { |
| 9419 { 15, 1 }, |
| 9420 { 16, 3 }, |
| 9421 { 16, 3 }, |
| 9422 { 16, 3 }, |
| 9423 { 16, 3 }, |
| 9424 { 16, 1 }, |
| 9425 { 18, 1 }, |
| 9426 { 18, 2 }, |
| 9427 { 17, 1 }, |
| 9428 { 17, 3 }, |
| 9429 { 20, 3 }, |
| 9430 { 20, 1 }, |
| 9431 { 21, 2 }, |
| 9432 { 21, 1 }, |
| 9433 { 19, 1 }, |
| 9434 { 19, 5 }, |
| 9435 { 22, 1 }, |
| 9436 { 22, 2 }, |
| 9437 { 24, 0 }, |
| 9438 { 24, 2 }, |
| 9439 { 23, 4 }, |
| 9440 { 23, 2 }, |
| 9441 { 25, 1 }, |
| 9442 { 25, 0 }, |
| 9443 }; |
| 9444 |
| 9445 static void fts5yy_accept(fts5yyParser*); /* Forward Declaration */ |
| 9446 |
| 9447 /* |
| 9448 ** Perform a reduce action and the shift that must immediately |
| 9449 ** follow the reduce. |
| 9450 */ |
| 9451 static void fts5yy_reduce( |
| 9452 fts5yyParser *fts5yypParser, /* The parser */ |
| 9453 int fts5yyruleno /* Number of the rule by which to reduce */ |
| 9454 ){ |
| 9455 int fts5yygoto; /* The next state */ |
| 9456 int fts5yyact; /* The next action */ |
| 9457 fts5YYMINORTYPE fts5yygotominor; /* The LHS of the rule reduced */ |
| 9458 fts5yyStackEntry *fts5yymsp; /* The top of the parser's stack */ |
| 9459 int fts5yysize; /* Amount to pop the stack */ |
| 9460 sqlite3Fts5ParserARG_FETCH; |
| 9461 fts5yymsp = &fts5yypParser->fts5yystack[fts5yypParser->fts5yyidx]; |
| 9462 #ifndef NDEBUG |
| 9463 if( fts5yyTraceFILE && fts5yyruleno>=0 |
| 9464 && fts5yyruleno<(int)(sizeof(fts5yyRuleName)/sizeof(fts5yyRuleName[0])) ){ |
| 9465 fts5yysize = fts5yyRuleInfo[fts5yyruleno].nrhs; |
| 9466 fprintf(fts5yyTraceFILE, "%sReduce [%s], go to state %d.\n", fts5yyTracePrompt, |
| 9467 fts5yyRuleName[fts5yyruleno], fts5yymsp[-fts5yysize].stateno); |
| 9468 } |
| 9469 #endif /* NDEBUG */ |
| 9470 fts5yygotominor = fts5yyzerominor; |
| 9471 |
| 9472 switch( fts5yyruleno ){ |
| 9473 /* Beginning here are the reduction cases. A typical example |
| 9474 ** follows: |
| 9475 ** case 0: |
| 9476 ** #line <lineno> <grammarfile> |
| 9477 ** { ... } // User supplied code |
| 9478 ** #line <lineno> <thisfile> |
| 9479 ** break; |
| 9480 */ |
| 9481 /********** Begin reduce actions **********************************************/ |
| 9482 case 0: /* input ::= expr */ |
| 9483 { sqlite3Fts5ParseFinished(pParse, fts5yymsp[0].minor.fts5yy18); } |
| 9484 break; |
| 9485 case 1: /* expr ::= expr AND expr */ |
| 9486 { |
| 9487 fts5yygotominor.fts5yy18 = sqlite3Fts5ParseNode(pParse, FTS5_AND, fts5yymsp[-2].minor.fts5yy18, fts5yymsp[0].minor.fts5yy18, 0); |
| 9488 } |
| 9489 break; |
| 9490 case 2: /* expr ::= expr OR expr */ |
| 9491 { |
| 9492 fts5yygotominor.fts5yy18 = sqlite3Fts5ParseNode(pParse, FTS5_OR, fts5yymsp[-2].minor.fts5yy18, fts5yymsp[0].minor.fts5yy18, 0); |
| 9493 } |
| 9494 break; |
| 9495 case 3: /* expr ::= expr NOT expr */ |
| 9496 { |
| 9497 fts5yygotominor.fts5yy18 = sqlite3Fts5ParseNode(pParse, FTS5_NOT, fts5yymsp[-2].minor.fts5yy18, fts5yymsp[0].minor.fts5yy18, 0); |
| 9498 } |
| 9499 break; |
| 9500 case 4: /* expr ::= LP expr RP */ |
| 9501 {fts5yygotominor.fts5yy18 = fts5yymsp[-1].minor.fts5yy18;} |
| 9502 break; |
| 9503 case 5: /* expr ::= exprlist */ |
| 9504 case 6: /* exprlist ::= cnearset */ fts5yytestcase(fts5yyruleno==6); |
| 9505 {fts5yygotominor.fts5yy18 = fts5yymsp[0].minor.fts5yy18;} |
| 9506 break; |
| 9507 case 7: /* exprlist ::= exprlist cnearset */ |
| 9508 { |
| 9509 fts5yygotominor.fts5yy18 = sqlite3Fts5ParseNode(pParse, FTS5_AND, fts5yymsp[-1].minor.fts5yy18, fts5yymsp[0].minor.fts5yy18, 0); |
| 9510 } |
| 9511 break; |
| 9512 case 8: /* cnearset ::= nearset */ |
| 9513 { |
| 9514 fts5yygotominor.fts5yy18 = sqlite3Fts5ParseNode(pParse, FTS5_STRING, 0, 0, fts5yymsp[0].minor.fts5yy26); |
| 9515 } |
| 9516 break; |
| 9517 case 9: /* cnearset ::= colset COLON nearset */ |
| 9518 { |
| 9519 sqlite3Fts5ParseSetColset(pParse, fts5yymsp[0].minor.fts5yy26, fts5yymsp[-2].minor.fts5yy3); |
| 9520 fts5yygotominor.fts5yy18 = sqlite3Fts5ParseNode(pParse, FTS5_STRING, 0, 0, fts5yymsp[0].minor.fts5yy26); |
| 9521 } |
| 9522 break; |
| 9523 case 10: /* colset ::= LCP colsetlist RCP */ |
| 9524 { fts5yygotominor.fts5yy3 = fts5yymsp[-1].minor.fts5yy3; } |
| 9525 break; |
| 9526 case 11: /* colset ::= STRING */ |
| 9527 { |
| 9528 fts5yygotominor.fts5yy3 = sqlite3Fts5ParseColset(pParse, 0, &fts5yymsp[0].minor.fts5yy0); |
| 9529 } |
| 9530 break; |
| 9531 case 12: /* colsetlist ::= colsetlist STRING */ |
| 9532 { |
| 9533 fts5yygotominor.fts5yy3 = sqlite3Fts5ParseColset(pParse, fts5yymsp[-1].minor.fts5yy3, &fts5yymsp[0].minor.fts5yy0); } |
| 9534 break; |
| 9535 case 13: /* colsetlist ::= STRING */ |
| 9536 { |
| 9537 fts5yygotominor.fts5yy3 = sqlite3Fts5ParseColset(pParse, 0, &fts5yymsp[0].minor.fts5yy0); |
| 9538 } |
| 9539 break; |
| 9540 case 14: /* nearset ::= phrase */ |
| 9541 { fts5yygotominor.fts5yy26 = sqlite3Fts5ParseNearset(pParse, 0, fts5yymsp[0].minor.fts5yy11); } |
| 9542 break; |
| 9543 case 15: /* nearset ::= STRING LP nearphrases neardist_opt RP */ |
| 9544 { |
| 9545 sqlite3Fts5ParseNear(pParse, &fts5yymsp[-4].minor.fts5yy0); |
| 9546 sqlite3Fts5ParseSetDistance(pParse, fts5yymsp[-2].minor.fts5yy26, &fts5yymsp[-1].minor.fts5yy0); |
| 9547 fts5yygotominor.fts5yy26 = fts5yymsp[-2].minor.fts5yy26; |
| 9548 } |
| 9549 break; |
| 9550 case 16: /* nearphrases ::= phrase */ |
| 9551 { |
| 9552 fts5yygotominor.fts5yy26 = sqlite3Fts5ParseNearset(pParse, 0, fts5yymsp[0].minor.fts5yy11); |
| 9553 } |
| 9554 break; |
| 9555 case 17: /* nearphrases ::= nearphrases phrase */ |
| 9556 { |
| 9557 fts5yygotominor.fts5yy26 = sqlite3Fts5ParseNearset(pParse, fts5yymsp[-1].minor.fts5yy26, fts5yymsp[0].minor.fts5yy11); |
| 9558 } |
| 9559 break; |
| 9560 case 18: /* neardist_opt ::= */ |
| 9561 { fts5yygotominor.fts5yy0.p = 0; fts5yygotominor.fts5yy0.n = 0; } |
| 9562 break; |
| 9563 case 19: /* neardist_opt ::= COMMA STRING */ |
| 9564 { fts5yygotominor.fts5yy0 = fts5yymsp[0].minor.fts5yy0; } |
| 9565 break; |
| 9566 case 20: /* phrase ::= phrase PLUS STRING star_opt */ |
| 9567 { |
| 9568 fts5yygotominor.fts5yy11 = sqlite3Fts5ParseTerm(pParse, fts5yymsp[-3].minor.fts5yy11, &fts5yymsp[-1].minor.fts5yy0, fts5yymsp[0].minor.fts5yy20); |
| 9569 } |
| 9570 break; |
| 9571 case 21: /* phrase ::= STRING star_opt */ |
| 9572 { |
| 9573 fts5yygotominor.fts5yy11 = sqlite3Fts5ParseTerm(pParse, 0, &fts5yymsp[-1].minor.fts5yy0, fts5yymsp[0].minor.fts5yy20); |
| 9574 } |
| 9575 break; |
| 9576 case 22: /* star_opt ::= STAR */ |
| 9577 { fts5yygotominor.fts5yy20 = 1; } |
| 9578 break; |
| 9579 case 23: /* star_opt ::= */ |
| 9580 { fts5yygotominor.fts5yy20 = 0; } |
| 9581 break; |
| 9582 default: |
| 9583 break; |
| 9584 /********** End reduce actions ************************************************/ |
| 9585 }; |
| 9586 assert( fts5yyruleno>=0 && fts5yyruleno<sizeof(fts5yyRuleInfo)/sizeof(fts5yyRuleInfo[0]) ); |
| 9587 fts5yygoto = fts5yyRuleInfo[fts5yyruleno].lhs; |
| 9588 fts5yysize = fts5yyRuleInfo[fts5yyruleno].nrhs; |
| 9589 fts5yypParser->fts5yyidx -= fts5yysize; |
| 9590 fts5yyact = fts5yy_find_reduce_action(fts5yymsp[-fts5yysize].stateno,(fts5YYCODETYPE)fts5yygoto); |
| 9591 if( fts5yyact <= fts5YY_MAX_SHIFTREDUCE ){ |
| 9592 if( fts5yyact>fts5YY_MAX_SHIFT ) fts5yyact += fts5YY_MIN_REDUCE - fts5YY_MIN_SHIFTREDUCE; |
| 9593 /* If the reduce action popped at least |
| 9594 ** one element off the stack, then we can push the new element back |
| 9595 ** onto the stack here, and skip the stack overflow test in fts5yy_shift(). |
| 9596 ** That gives a significant speed improvement. */ |
| 9597 if( fts5yysize ){ |
| 9598 fts5yypParser->fts5yyidx++; |
| 9599 fts5yymsp -= fts5yysize-1; |
| 9600 fts5yymsp->stateno = (fts5YYACTIONTYPE)fts5yyact; |
| 9601 fts5yymsp->major = (fts5YYCODETYPE)fts5yygoto; |
| 9602 fts5yymsp->minor = fts5yygotominor; |
| 9603 fts5yyTraceShift(fts5yypParser, fts5yyact); |
| 9604 }else{ |
| 9605 fts5yy_shift(fts5yypParser,fts5yyact,fts5yygoto,&fts5yygotominor); |
| 9606 } |
| 9607 }else{ |
| 9608 assert( fts5yyact == fts5YY_ACCEPT_ACTION ); |
| 9609 fts5yy_accept(fts5yypParser); |
| 9610 } |
| 9611 } |
| 9612 |
| 9613 /* |
| 9614 ** The following code executes when the parse fails |
| 9615 */ |
| 9616 #ifndef fts5YYNOERRORRECOVERY |
| 9617 static void fts5yy_parse_failed( |
| 9618 fts5yyParser *fts5yypParser /* The parser */ |
| 9619 ){ |
| 9620 sqlite3Fts5ParserARG_FETCH; |
| 9621 #ifndef NDEBUG |
| 9622 if( fts5yyTraceFILE ){ |
| 9623 fprintf(fts5yyTraceFILE,"%sFail!\n",fts5yyTracePrompt); |
| 9624 } |
| 9625 #endif |
| 9626 while( fts5yypParser->fts5yyidx>=0 ) fts5yy_pop_parser_stack(fts5yypParser); |
| 9627 /* Here code is inserted which will be executed whenever the |
| 9628 ** parser fails */ |
| 9629 /************ Begin %parse_failure code ***************************************/ |
| 9630 /************ End %parse_failure code *****************************************/ |
| 9631 sqlite3Fts5ParserARG_STORE; /* Suppress warning about unused %extra_argument variable */ |
| 9632 } |
| 9633 #endif /* fts5YYNOERRORRECOVERY */ |
| 9634 |
| 9635 /* |
| 9636 ** The following code executes when a syntax error first occurs. |
| 9637 */ |
| 9638 static void fts5yy_syntax_error( |
| 9639 fts5yyParser *fts5yypParser, /* The parser */ |
| 9640 int fts5yymajor, /* The major type of the error token */ |
| 9641 fts5YYMINORTYPE fts5yyminor /* The minor type of the error token */ |
| 9642 ){ |
| 9643 sqlite3Fts5ParserARG_FETCH; |
| 9644 #define FTS5TOKEN (fts5yyminor.fts5yy0) |
| 9645 /************ Begin %syntax_error code ****************************************/ |
| 9646 |
| 9647 sqlite3Fts5ParseError( |
| 9648 pParse, "fts5: syntax error near \"%.*s\"",FTS5TOKEN.n,FTS5TOKEN.p |
| 9649 ); |
| 9650 /************ End %syntax_error code ******************************************/ |
| 9651 sqlite3Fts5ParserARG_STORE; /* Suppress warning about unused %extra_argument variable */ |
| 9652 } |
| 9653 |
| 9654 /* |
| 9655 ** The following is executed when the parser accepts |
| 9656 */ |
| 9657 static void fts5yy_accept( |
| 9658 fts5yyParser *fts5yypParser /* The parser */ |
| 9659 ){ |
| 9660 sqlite3Fts5ParserARG_FETCH; |
| 9661 #ifndef NDEBUG |
| 9662 if( fts5yyTraceFILE ){ |
| 9663 fprintf(fts5yyTraceFILE,"%sAccept!\n",fts5yyTracePrompt); |
| 9664 } |
| 9665 #endif |
| 9666 while( fts5yypParser->fts5yyidx>=0 ) fts5yy_pop_parser_stack(fts5yypParser); |
| 9667 /* Here code is inserted which will be executed whenever the |
| 9668 ** parser accepts */ |
| 9669 /*********** Begin %parse_accept code *****************************************/ |
| 9670 /*********** End %parse_accept code *******************************************/ |
| 9671 sqlite3Fts5ParserARG_STORE; /* Suppress warning about unused %extra_argument variable */ |
| 9672 } |
| 9673 |
| 9674 /* The main parser program. |
| 9675 ** The first argument is a pointer to a structure obtained from |
| 9676 ** "sqlite3Fts5ParserAlloc" which describes the current state of the parser. |
| 9677 ** The second argument is the major token number. The third is |
| 9678 ** the minor token. The fourth optional argument is whatever the |
| 9679 ** user wants (and specified in the grammar) and is available for |
| 9680 ** use by the action routines. |
| 9681 ** |
| 9682 ** Inputs: |
| 9683 ** <ul> |
| 9684 ** <li> A pointer to the parser (an opaque structure.) |
| 9685 ** <li> The major token number. |
| 9686 ** <li> The minor token number. |
| 9687 ** <li> An optional argument of a grammar-specified type. |
| 9688 ** </ul> |
| 9689 ** |
| 9690 ** Outputs: |
| 9691 ** None. |
| 9692 */ |
| 9693 static void sqlite3Fts5Parser( |
| 9694 void *fts5yyp, /* The parser */ |
| 9695 int fts5yymajor, /* The major token code number */ |
| 9696 sqlite3Fts5ParserFTS5TOKENTYPE fts5yyminor /* The value for the token */ |
| 9697 sqlite3Fts5ParserARG_PDECL /* Optional %extra_argument parameter */ |
| 9698 ){ |
| 9699 fts5YYMINORTYPE fts5yyminorunion; |
| 9700 int fts5yyact; /* The parser action. */ |
| 9701 #if !defined(fts5YYERRORSYMBOL) && !defined(fts5YYNOERRORRECOVERY) |
| 9702 int fts5yyendofinput; /* True if we are at the end of input */ |
| 9703 #endif |
| 9704 #ifdef fts5YYERRORSYMBOL |
| 9705 int fts5yyerrorhit = 0; /* True if fts5yymajor has invoked an error */ |
| 9706 #endif |
| 9707 fts5yyParser *fts5yypParser; /* The parser */ |
| 9708 |
| 9709 /* (re)initialize the parser, if necessary */ |
| 9710 fts5yypParser = (fts5yyParser*)fts5yyp; |
| 9711 if( fts5yypParser->fts5yyidx<0 ){ |
| 9712 #if fts5YYSTACKDEPTH<=0 |
| 9713 if( fts5yypParser->fts5yystksz <=0 ){ |
| 9714 /*memset(&fts5yyminorunion, 0, sizeof(fts5yyminorunion));*/ |
| 9715 fts5yyminorunion = fts5yyzerominor; |
| 9716 fts5yyStackOverflow(fts5yypParser, &fts5yyminorunion); |
| 9717 return; |
| 9718 } |
| 9719 #endif |
| 9720 fts5yypParser->fts5yyidx = 0; |
| 9721 fts5yypParser->fts5yyerrcnt = -1; |
| 9722 fts5yypParser->fts5yystack[0].stateno = 0; |
| 9723 fts5yypParser->fts5yystack[0].major = 0; |
| 9724 #ifndef NDEBUG |
| 9725 if( fts5yyTraceFILE ){ |
| 9726 fprintf(fts5yyTraceFILE,"%sInitialize. Empty stack. State 0\n", |
| 9727 fts5yyTracePrompt); |
| 9728 } |
| 9729 #endif |
| 9730 } |
| 9731 fts5yyminorunion.fts5yy0 = fts5yyminor; |
| 9732 #if !defined(fts5YYERRORSYMBOL) && !defined(fts5YYNOERRORRECOVERY) |
| 9733 fts5yyendofinput = (fts5yymajor==0); |
| 9734 #endif |
| 9735 sqlite3Fts5ParserARG_STORE; |
| 9736 |
| 9737 #ifndef NDEBUG |
| 9738 if( fts5yyTraceFILE ){ |
| 9739 fprintf(fts5yyTraceFILE,"%sInput '%s'\n",fts5yyTracePrompt,fts5yyTokenName[fts5yymajor]); |
| 9740 } |
| 9741 #endif |
| 9742 |
| 9743 do{ |
| 9744 fts5yyact = fts5yy_find_shift_action(fts5yypParser,(fts5YYCODETYPE)fts5yymajor); |
| 9745 if( fts5yyact <= fts5YY_MAX_SHIFTREDUCE ){ |
| 9746 if( fts5yyact > fts5YY_MAX_SHIFT ) fts5yyact += fts5YY_MIN_REDUCE - fts5YY_MIN_SHIFTREDUCE; |
| 9747 fts5yy_shift(fts5yypParser,fts5yyact,fts5yymajor,&fts5yyminorunion); |
| 9748 fts5yypParser->fts5yyerrcnt--; |
| 9749 fts5yymajor = fts5YYNOCODE; |
| 9750 }else if( fts5yyact <= fts5YY_MAX_REDUCE ){ |
| 9751 fts5yy_reduce(fts5yypParser,fts5yyact-fts5YY_MIN_REDUCE); |
| 9752 }else{ |
| 9753 assert( fts5yyact == fts5YY_ERROR_ACTION ); |
| 9754 #ifdef fts5YYERRORSYMBOL |
| 9755 int fts5yymx; |
| 9756 #endif |
| 9757 #ifndef NDEBUG |
| 9758 if( fts5yyTraceFILE ){ |
| 9759 fprintf(fts5yyTraceFILE,"%sSyntax Error!\n",fts5yyTracePrompt); |
| 9760 } |
| 9761 #endif |
| 9762 #ifdef fts5YYERRORSYMBOL |
| 9763 /* A syntax error has occurred. |
| 9764 ** The response to an error depends upon whether or not the |
| 9765 ** grammar defines an error token "ERROR". |
| 9766 ** |
| 9767 ** This is what we do if the grammar does define ERROR: |
| 9768 ** |
| 9769 ** * Call the %syntax_error function. |
| 9770 ** |
| 9771 ** * Begin popping the stack until we enter a state where |
| 9772 ** it is legal to shift the error symbol, then shift |
| 9773 ** the error symbol. |
| 9774 ** |
| 9775 ** * Set the error count to three. |
| 9776 ** |
| 9777 ** * Begin accepting and shifting new tokens. No new error |
| 9778 ** processing will occur until three tokens have been |
| 9779 ** shifted successfully. |
| 9780 ** |
| 9781 */ |
| 9782 if( fts5yypParser->fts5yyerrcnt<0 ){ |
| 9783 fts5yy_syntax_error(fts5yypParser,fts5yymajor,fts5yyminorunion); |
| 9784 } |
| 9785 fts5yymx = fts5yypParser->fts5yystack[fts5yypParser->fts5yyidx].major; |
| 9786 if( fts5yymx==fts5YYERRORSYMBOL || fts5yyerrorhit ){ |
| 9787 #ifndef NDEBUG |
| 9788 if( fts5yyTraceFILE ){ |
| 9789 fprintf(fts5yyTraceFILE,"%sDiscard input token %s\n", |
| 9790 fts5yyTracePrompt,fts5yyTokenName[fts5yymajor]); |
| 9791 } |
| 9792 #endif |
| 9793 fts5yy_destructor(fts5yypParser, (fts5YYCODETYPE)fts5yymajor,&fts5yyminorunion); |
| 9794 fts5yymajor = fts5YYNOCODE; |
| 9795 }else{ |
| 9796 while( |
| 9797 fts5yypParser->fts5yyidx >= 0 && |
| 9798 fts5yymx != fts5YYERRORSYMBOL && |
| 9799 (fts5yyact = fts5yy_find_reduce_action( |
| 9800 fts5yypParser->fts5yystack[fts5yypParser->fts5yyidx].stateno, |
| 9801 fts5YYERRORSYMBOL)) >= fts5YY_MIN_REDUCE |
| 9802 ){ |
| 9803 fts5yy_pop_parser_stack(fts5yypParser); |
| 9804 } |
| 9805 if( fts5yypParser->fts5yyidx < 0 || fts5yymajor==0 ){ |
| 9806 fts5yy_destructor(fts5yypParser,(fts5YYCODETYPE)fts5yymajor,&fts5yyminorunion); |
| 9807 fts5yy_parse_failed(fts5yypParser); |
| 9808 fts5yymajor = fts5YYNOCODE; |
| 9809 }else if( fts5yymx!=fts5YYERRORSYMBOL ){ |
| 9810 fts5YYMINORTYPE u2; |
| 9811 u2.fts5YYERRSYMDT = 0; |
| 9812 fts5yy_shift(fts5yypParser,fts5yyact,fts5YYERRORSYMBOL,&u2); |
| 9813 } |
| 9814 } |
| 9815 fts5yypParser->fts5yyerrcnt = 3; |
| 9816 fts5yyerrorhit = 1; |
| 9817 #elif defined(fts5YYNOERRORRECOVERY) |
| 9818 /* If the fts5YYNOERRORRECOVERY macro is defined, then do not attempt to |
| 9819 ** do any kind of error recovery. Instead, simply invoke the syntax |
| 9820 ** error routine and continue going as if nothing had happened. |
| 9821 ** |
| 9822 ** Applications can set this macro (for example inside %include) if |
| 9823 ** they intend to abandon the parse upon the first syntax error seen. |
| 9824 */ |
| 9825 fts5yy_syntax_error(fts5yypParser,fts5yymajor,fts5yyminorunion); |
| 9826 fts5yy_destructor(fts5yypParser,(fts5YYCODETYPE)fts5yymajor,&fts5yyminorunion); |
| 9827 fts5yymajor = fts5YYNOCODE; |
| 9828 |
| 9829 #else /* fts5YYERRORSYMBOL is not defined */ |
| 9830 /* This is what we do if the grammar does not define ERROR: |
| 9831 ** |
| 9832 ** * Report an error message, and throw away the input token. |
| 9833 ** |
| 9834 ** * If the input token is $, then fail the parse. |
| 9835 ** |
| 9836 ** As before, subsequent error messages are suppressed until |
| 9837 ** three input tokens have been successfully shifted. |
| 9838 */ |
| 9839 if( fts5yypParser->fts5yyerrcnt<=0 ){ |
| 9840 fts5yy_syntax_error(fts5yypParser,fts5yymajor,fts5yyminorunion); |
| 9841 } |
| 9842 fts5yypParser->fts5yyerrcnt = 3; |
| 9843 fts5yy_destructor(fts5yypParser,(fts5YYCODETYPE)fts5yymajor,&fts5yyminorunion); |
| 9844 if( fts5yyendofinput ){ |
| 9845 fts5yy_parse_failed(fts5yypParser); |
| 9846 } |
| 9847 fts5yymajor = fts5YYNOCODE; |
| 9848 #endif |
| 9849 } |
| 9850 }while( fts5yymajor!=fts5YYNOCODE && fts5yypParser->fts5yyidx>=0 ); |
| 9851 #ifndef NDEBUG |
| 9852 if( fts5yyTraceFILE ){ |
| 9853 int i; |
| 9854 fprintf(fts5yyTraceFILE,"%sReturn. Stack=",fts5yyTracePrompt); |
| 9855 for(i=1; i<=fts5yypParser->fts5yyidx; i++) |
| 9856 fprintf(fts5yyTraceFILE,"%c%s", i==1 ? '[' : ' ', |
| 9857 fts5yyTokenName[fts5yypParser->fts5yystack[i].major]); |
| 9858 fprintf(fts5yyTraceFILE,"]\n"); |
| 9859 } |
| 9860 #endif |
| 9861 return; |
| 9862 } |
| 9863 |
| 9864 /* |
| 9865 ** 2014 May 31 |
| 9866 ** |
| 9867 ** The author disclaims copyright to this source code. In place of |
| 9868 ** a legal notice, here is a blessing: |
| 9869 ** |
| 9870 ** May you do good and not evil. |
| 9871 ** May you find forgiveness for yourself and forgive others. |
| 9872 ** May you share freely, never taking more than you give. |
| 9873 ** |
| 9874 ****************************************************************************** |
| 9875 */ |
| 9876 |
| 9877 |
| 9878 /* #include "fts5Int.h" */ |
| 9879 #include <math.h> /* amalgamator: keep */ |
| 9880 |
| 9881 /* |
| 9882 ** Object used to iterate through all "coalesced phrase instances" in |
| 9883 ** a single column of the current row. If the phrase instances in the |
| 9884 ** column being considered do not overlap, this object simply iterates |
| 9885 ** through them. Or, if they do overlap (share one or more tokens in |
| 9886 ** common), each set of overlapping instances is treated as a single |
| 9887 ** match. See documentation for the highlight() auxiliary function for |
| 9888 ** details. |
| 9889 ** |
| 9890 ** Usage is: |
| 9891 ** |
| 9892 ** for(rc = fts5CInstIterInit(pApi, pFts, iCol, &iter); |
| 9893 ** rc==SQLITE_OK && 0==fts5CInstIterEof(&iter); |
| 9894 ** rc = fts5CInstIterNext(&iter) |
| 9895 ** ){ |
| 9896 ** printf("instance starts at %d, ends at %d\n", iter.iStart, iter.iEnd); |
| 9897 ** } |
| 9898 ** |
| 9899 */ |
| 9900 typedef struct CInstIter CInstIter; |
| 9901 struct CInstIter { |
| 9902 const Fts5ExtensionApi *pApi; /* API offered by current FTS version */ |
| 9903 Fts5Context *pFts; /* First arg to pass to pApi functions */ |
| 9904 int iCol; /* Column to search */ |
| 9905 int iInst; /* Next phrase instance index */ |
| 9906 int nInst; /* Total number of phrase instances */ |
| 9907 |
| 9908 /* Output variables */ |
| 9909 int iStart; /* First token in coalesced phrase instance */ |
| 9910 int iEnd; /* Last token in coalesced phrase instance */ |
| 9911 }; |
| 9912 |
| 9913 /* |
| 9914 ** Advance the iterator to the next coalesced phrase instance. Return |
| 9915 ** an SQLite error code if an error occurs, or SQLITE_OK otherwise. |
| 9916 */ |
| 9917 static int fts5CInstIterNext(CInstIter *pIter){ |
| 9918 int rc = SQLITE_OK; |
| 9919 pIter->iStart = -1; |
| 9920 pIter->iEnd = -1; |
| 9921 |
| 9922 while( rc==SQLITE_OK && pIter->iInst<pIter->nInst ){ |
| 9923 int ip; int ic; int io; |
| 9924 rc = pIter->pApi->xInst(pIter->pFts, pIter->iInst, &ip, &ic, &io); |
| 9925 if( rc==SQLITE_OK ){ |
| 9926 if( ic==pIter->iCol ){ |
| 9927 int iEnd = io - 1 + pIter->pApi->xPhraseSize(pIter->pFts, ip); |
| 9928 if( pIter->iStart<0 ){ |
| 9929 pIter->iStart = io; |
| 9930 pIter->iEnd = iEnd; |
| 9931 }else if( io<=pIter->iEnd ){ |
| 9932 if( iEnd>pIter->iEnd ) pIter->iEnd = iEnd; |
| 9933 }else{ |
| 9934 break; |
| 9935 } |
| 9936 } |
| 9937 pIter->iInst++; |
| 9938 } |
| 9939 } |
| 9940 |
| 9941 return rc; |
| 9942 } |
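The coalescing rule used by fts5CInstIterNext() can be tried out without an FTS5 table. The sketch below is an illustration under assumptions: ToyInst, ToyIter and toyIterNext() are hypothetical names, the instances are supplied as a plain array sorted by starting offset (standing in for the xInst() and xPhraseSize() calls), and the column filter and error handling are omitted. An instance that starts at or before the current end token is merged into the current range. The first instance that starts after it ends the range and is left for the next call.

```c
#include <assert.h>

/* Hypothetical stand-in for one phrase instance: first token and length. */
typedef struct ToyInst { int iOff; int nToken; } ToyInst;

typedef struct ToyIter {
  const ToyInst *aInst;   /* Instances, sorted by iOff */
  int nInst;              /* Number of entries in aInst[] */
  int iInst;              /* Next instance to visit */
  int iStart;             /* Output: first token of coalesced range, or -1 */
  int iEnd;               /* Output: last token of coalesced range, or -1 */
} ToyIter;

/* Advance to the next coalesced range. iStart is -1 once input is exhausted. */
static void toyIterNext(ToyIter *p){
  p->iStart = -1;
  p->iEnd = -1;
  while( p->iInst<p->nInst ){
    int io = p->aInst[p->iInst].iOff;
    int iEnd = io - 1 + p->aInst[p->iInst].nToken;
    if( p->iStart<0 ){
      p->iStart = io;                  /* First instance opens the range */
      p->iEnd = iEnd;
    }else if( io<=p->iEnd ){
      if( iEnd>p->iEnd ) p->iEnd = iEnd;   /* Overlap: extend the range */
    }else{
      break;                           /* Gap: leave this one for next call */
    }
    p->iInst++;
  }
}
```

With instances covering tokens 0..1, 1..2 and 5..5, the first call reports the range 0..2 and the second reports 5..5.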
| 9943 |
| 9944 /* |
| 9945 ** Initialize the iterator object indicated by the final parameter to |
| 9946 ** iterate through coalesced phrase instances in column iCol. |
| 9947 */ |
| 9948 static int fts5CInstIterInit( |
| 9949 const Fts5ExtensionApi *pApi, |
| 9950 Fts5Context *pFts, |
| 9951 int iCol, |
| 9952 CInstIter *pIter |
| 9953 ){ |
| 9954 int rc; |
| 9955 |
| 9956 memset(pIter, 0, sizeof(CInstIter)); |
| 9957 pIter->pApi = pApi; |
| 9958 pIter->pFts = pFts; |
| 9959 pIter->iCol = iCol; |
| 9960 rc = pApi->xInstCount(pFts, &pIter->nInst); |
| 9961 |
| 9962 if( rc==SQLITE_OK ){ |
| 9963 rc = fts5CInstIterNext(pIter); |
| 9964 } |
| 9965 |
| 9966 return rc; |
| 9967 } |
| 9968 |
| 9969 |
| 9970 |
| 9971 /************************************************************************* |
| 9972 ** Start of highlight() implementation. |
| 9973 */ |
| 9974 typedef struct HighlightContext HighlightContext; |
| 9975 struct HighlightContext { |
| 9976 CInstIter iter; /* Coalesced Instance Iterator */ |
| 9977 int iPos; /* Current token offset in zIn[] */ |
| 9978 int iRangeStart; /* First token to include */ |
| 9979 int iRangeEnd; /* If non-zero, last token to include */ |
| 9980 const char *zOpen; /* Opening highlight */ |
| 9981 const char *zClose; /* Closing highlight */ |
| 9982 const char *zIn; /* Input text */ |
| 9983 int nIn; /* Size of input text in bytes */ |
| 9984 int iOff; /* Current offset within zIn[] */ |
| 9985 char *zOut; /* Output value */ |
| 9986 }; |
| 9987 |
| 9988 /* |
| 9989 ** Append text to the HighlightContext output string - p->zOut. Argument |
| 9990 ** z points to a buffer containing n bytes of text to append. If n is |
| 9991 ** negative, everything up until the first '\0' is appended to the output. |
| 9992 ** |
| 9993 ** If *pRc is set to any value other than SQLITE_OK when this function is |
| 9994 ** called, it is a no-op. If an error (i.e. an OOM condition) is encountered, |
| 9995 ** *pRc is set to an error code before returning. |
| 9996 */ |
| 9997 static void fts5HighlightAppend( |
| 9998 int *pRc, |
| 9999 HighlightContext *p, |
| 10000 const char *z, int n |
| 10001 ){ |
| 10002 if( *pRc==SQLITE_OK ){ |
| 10003 if( n<0 ) n = (int)strlen(z); |
| 10004 p->zOut = sqlite3_mprintf("%z%.*s", p->zOut, n, z); |
| 10005 if( p->zOut==0 ) *pRc = SQLITE_NOMEM; |
| 10006 } |
| 10007 } |
| 10008 |
| 10009 /* |
| 10010 ** Tokenizer callback used by implementation of highlight() function. |
| 10011 */ |
| 10012 static int fts5HighlightCb( |
| 10013 void *pContext, /* Pointer to HighlightContext object */ |
| 10014 int tflags, /* Mask of FTS5_TOKEN_* flags */ |
| 10015 const char *pToken, /* Buffer containing token */ |
| 10016 int nToken, /* Size of token in bytes */ |
| 10017 int iStartOff, /* Start offset of token */ |
| 10018 int iEndOff /* End offset of token */ |
| 10019 ){ |
| 10020 HighlightContext *p = (HighlightContext*)pContext; |
| 10021 int rc = SQLITE_OK; |
| 10022 int iPos; |
| 10023 |
| 10024 if( tflags & FTS5_TOKEN_COLOCATED ) return SQLITE_OK; |
| 10025 iPos = p->iPos++; |
| 10026 |
| 10027 if( p->iRangeEnd>0 ){ |
| 10028 if( iPos<p->iRangeStart || iPos>p->iRangeEnd ) return SQLITE_OK; |
| 10029 if( p->iRangeStart && iPos==p->iRangeStart ) p->iOff = iStartOff; |
| 10030 } |
| 10031 |
| 10032 if( iPos==p->iter.iStart ){ |
| 10033 fts5HighlightAppend(&rc, p, &p->zIn[p->iOff], iStartOff - p->iOff); |
| 10034 fts5HighlightAppend(&rc, p, p->zOpen, -1); |
| 10035 p->iOff = iStartOff; |
| 10036 } |
| 10037 |
| 10038 if( iPos==p->iter.iEnd ){ |
| 10039 if( p->iRangeEnd && p->iter.iStart<p->iRangeStart ){ |
| 10040 fts5HighlightAppend(&rc, p, p->zOpen, -1); |
| 10041 } |
| 10042 fts5HighlightAppend(&rc, p, &p->zIn[p->iOff], iEndOff - p->iOff); |
| 10043 fts5HighlightAppend(&rc, p, p->zClose, -1); |
| 10044 p->iOff = iEndOff; |
| 10045 if( rc==SQLITE_OK ){ |
| 10046 rc = fts5CInstIterNext(&p->iter); |
| 10047 } |
| 10048 } |
| 10049 |
| 10050 if( p->iRangeEnd>0 && iPos==p->iRangeEnd ){ |
| 10051 fts5HighlightAppend(&rc, p, &p->zIn[p->iOff], iEndOff - p->iOff); |
| 10052 p->iOff = iEndOff; |
| 10053 if( iPos<p->iter.iEnd ){ |
| 10054 fts5HighlightAppend(&rc, p, p->zClose, -1); |
| 10055 } |
| 10056 } |
| 10057 |
| 10058 return rc; |
| 10059 } |
| 10060 |
| 10061 /* |
| 10062 ** Implementation of highlight() function. |
| 10063 */ |
| 10064 static void fts5HighlightFunction( |
| 10065 const Fts5ExtensionApi *pApi, /* API offered by current FTS version */ |
| 10066 Fts5Context *pFts, /* First arg to pass to pApi functions */ |
| 10067 sqlite3_context *pCtx, /* Context for returning result/error */ |
| 10068 int nVal, /* Number of values in apVal[] array */ |
| 10069 sqlite3_value **apVal /* Array of trailing arguments */ |
| 10070 ){ |
| 10071 HighlightContext ctx; |
| 10072 int rc; |
| 10073 int iCol; |
| 10074 |
| 10075 if( nVal!=3 ){ |
| 10076 const char *zErr = "wrong number of arguments to function highlight()"; |
| 10077 sqlite3_result_error(pCtx, zErr, -1); |
| 10078 return; |
| 10079 } |
| 10080 |
| 10081 iCol = sqlite3_value_int(apVal[0]); |
| 10082 memset(&ctx, 0, sizeof(HighlightContext)); |
| 10083 ctx.zOpen = (const char*)sqlite3_value_text(apVal[1]); |
| 10084 ctx.zClose = (const char*)sqlite3_value_text(apVal[2]); |
| 10085 rc = pApi->xColumnText(pFts, iCol, &ctx.zIn, &ctx.nIn); |
| 10086 |
| 10087 if( ctx.zIn ){ |
| 10088 if( rc==SQLITE_OK ){ |
| 10089 rc = fts5CInstIterInit(pApi, pFts, iCol, &ctx.iter); |
| 10090 } |
| 10091 |
| 10092 if( rc==SQLITE_OK ){ |
| 10093 rc = pApi->xTokenize(pFts, ctx.zIn, ctx.nIn, (void*)&ctx,fts5HighlightCb); |
| 10094 } |
| 10095 fts5HighlightAppend(&rc, &ctx, &ctx.zIn[ctx.iOff], ctx.nIn - ctx.iOff); |
| 10096 |
| 10097 if( rc==SQLITE_OK ){ |
| 10098 sqlite3_result_text(pCtx, (const char*)ctx.zOut, -1, SQLITE_TRANSIENT); |
| 10099 } |
| 10100 sqlite3_free(ctx.zOut); |
| 10101 } |
| 10102 if( rc!=SQLITE_OK ){ |
| 10103 sqlite3_result_error_code(pCtx, rc); |
| 10104 } |
| 10105 } |
| 10106 /* |
| 10107 ** End of highlight() implementation. |
| 10108 **************************************************************************/ |
| 10109 |
| 10110 /* |
| 10111 ** Implementation of snippet() function. |
| 10112 */ |
| 10113 static void fts5SnippetFunction( |
| 10114 const Fts5ExtensionApi *pApi, /* API offered by current FTS version */ |
| 10115 Fts5Context *pFts, /* First arg to pass to pApi functions */ |
| 10116 sqlite3_context *pCtx, /* Context for returning result/error */ |
| 10117 int nVal, /* Number of values in apVal[] array */ |
| 10118 sqlite3_value **apVal /* Array of trailing arguments */ |
| 10119 ){ |
| 10120 HighlightContext ctx; |
| 10121 int rc = SQLITE_OK; /* Return code */ |
| 10122 int iCol; /* 1st argument to snippet() */ |
| 10123 const char *zEllips; /* 4th argument to snippet() */ |
| 10124 int nToken; /* 5th argument to snippet() */ |
| 10125 int nInst = 0; /* Number of instance matches this row */ |
| 10126 int i; /* Used to iterate through instances */ |
| 10127 int nPhrase; /* Number of phrases in query */ |
| 10128 unsigned char *aSeen; /* Array of "seen instance" flags */ |
| 10129 int iBestCol; /* Column containing best snippet */ |
| 10130 int iBestStart = 0; /* First token of best snippet */ |
| 10131 int iBestLast; /* Last token of best snippet */ |
| 10132 int nBestScore = 0; /* Score of best snippet */ |
| 10133 int nColSize = 0; /* Total size of iBestCol in tokens */ |
| 10134 |
| 10135 if( nVal!=5 ){ |
| 10136 const char *zErr = "wrong number of arguments to function snippet()"; |
| 10137 sqlite3_result_error(pCtx, zErr, -1); |
| 10138 return; |
| 10139 } |
| 10140 |
| 10141 memset(&ctx, 0, sizeof(HighlightContext)); |
| 10142 iCol = sqlite3_value_int(apVal[0]); |
| 10143 ctx.zOpen = (const char*)sqlite3_value_text(apVal[1]); |
| 10144 ctx.zClose = (const char*)sqlite3_value_text(apVal[2]); |
| 10145 zEllips = (const char*)sqlite3_value_text(apVal[3]); |
| 10146 nToken = sqlite3_value_int(apVal[4]); |
| 10147 iBestLast = nToken-1; |
| 10148 |
| 10149 iBestCol = (iCol>=0 ? iCol : 0); |
| 10150 nPhrase = pApi->xPhraseCount(pFts); |
| 10151 aSeen = sqlite3_malloc(nPhrase); |
| 10152 if( aSeen==0 ){ |
| 10153 rc = SQLITE_NOMEM; |
| 10154 } |
| 10155 |
| 10156 if( rc==SQLITE_OK ){ |
| 10157 rc = pApi->xInstCount(pFts, &nInst); |
| 10158 } |
| 10159 for(i=0; rc==SQLITE_OK && i<nInst; i++){ |
| 10160 int ip, iSnippetCol, iStart; |
| 10161 memset(aSeen, 0, nPhrase); |
| 10162 rc = pApi->xInst(pFts, i, &ip, &iSnippetCol, &iStart); |
| 10163 if( rc==SQLITE_OK && (iCol<0 || iSnippetCol==iCol) ){ |
| 10164 int nScore = 1000; |
| 10165 int iLast = iStart - 1 + pApi->xPhraseSize(pFts, ip); |
| 10166 int j; |
| 10167 aSeen[ip] = 1; |
| 10168 |
| 10169 for(j=i+1; rc==SQLITE_OK && j<nInst; j++){ |
| 10170 int ic; int io; int iFinal; |
| 10171 rc = pApi->xInst(pFts, j, &ip, &ic, &io); |
| 10172 iFinal = io + pApi->xPhraseSize(pFts, ip) - 1; |
| 10173 if( rc==SQLITE_OK && ic==iSnippetCol && iLast<iStart+nToken ){ |
| 10174 nScore += aSeen[ip] ? 1000 : 1; |
| 10175 aSeen[ip] = 1; |
| 10176 if( iFinal>iLast ) iLast = iFinal; |
| 10177 } |
| 10178 } |
| 10179 |
| 10180 if( rc==SQLITE_OK && nScore>nBestScore ){ |
| 10181 iBestCol = iSnippetCol; |
| 10182 iBestStart = iStart; |
| 10183 iBestLast = iLast; |
| 10184 nBestScore = nScore; |
| 10185 } |
| 10186 } |
| 10187 } |
| 10188 |
| 10189 if( rc==SQLITE_OK ){ |
| 10190 rc = pApi->xColumnSize(pFts, iBestCol, &nColSize); |
| 10191 } |
| 10192 if( rc==SQLITE_OK ){ |
| 10193 rc = pApi->xColumnText(pFts, iBestCol, &ctx.zIn, &ctx.nIn); |
| 10194 } |
| 10195 if( ctx.zIn ){ |
| 10196 if( rc==SQLITE_OK ){ |
| 10197 rc = fts5CInstIterInit(pApi, pFts, iBestCol, &ctx.iter); |
| 10198 } |
| 10199 |
| 10200 if( (iBestStart+nToken-1)>iBestLast ){ |
| 10201 iBestStart -= (iBestStart+nToken-1-iBestLast) / 2; |
| 10202 } |
| 10203 if( iBestStart+nToken>nColSize ){ |
| 10204 iBestStart = nColSize - nToken; |
| 10205 } |
| 10206 if( iBestStart<0 ) iBestStart = 0; |
| 10207 |
| 10208 ctx.iRangeStart = iBestStart; |
| 10209 ctx.iRangeEnd = iBestStart + nToken - 1; |
| 10210 |
| 10211 if( iBestStart>0 ){ |
| 10212 fts5HighlightAppend(&rc, &ctx, zEllips, -1); |
| 10213 } |
| 10214 if( rc==SQLITE_OK ){ |
| 10215 rc = pApi->xTokenize(pFts, ctx.zIn, ctx.nIn, (void*)&ctx,fts5HighlightCb); |
| 10216 } |
| 10217 if( ctx.iRangeEnd>=(nColSize-1) ){ |
| 10218 fts5HighlightAppend(&rc, &ctx, &ctx.zIn[ctx.iOff], ctx.nIn - ctx.iOff); |
| 10219 }else{ |
| 10220 fts5HighlightAppend(&rc, &ctx, zEllips, -1); |
| 10221 } |
| 10222 |
| 10223 if( rc==SQLITE_OK ){ |
| 10224 sqlite3_result_text(pCtx, (const char*)ctx.zOut, -1, SQLITE_TRANSIENT); |
| 10225 }else{ |
| 10226 sqlite3_result_error_code(pCtx, rc); |
| 10227 } |
| 10228 sqlite3_free(ctx.zOut); |
| 10229 } |
| 10230 sqlite3_free(aSeen); |
| 10231 } |
| 10232 |
| 10233 /************************************************************************/ |
| 10234 |
| 10235 /* |
| 10236 ** The first time the bm25() function is called for a query, an instance |
| 10237 ** of the following structure is allocated and populated. |
| 10238 */ |
| 10239 typedef struct Fts5Bm25Data Fts5Bm25Data; |
| 10240 struct Fts5Bm25Data { |
| 10241 int nPhrase; /* Number of phrases in query */ |
| 10242 double avgdl; /* Average number of tokens in each row */ |
| 10243 double *aIDF; /* IDF for each phrase */ |
| 10244 double *aFreq; /* Array used to calculate phrase freq. */ |
| 10245 }; |
| 10246 |
| 10247 /* |
| 10248 ** Callback used by fts5Bm25GetData() to count the number of rows in the |
| 10249 ** table matched by each individual phrase within the query. |
| 10250 */ |
| 10251 static int fts5CountCb( |
| 10252 const Fts5ExtensionApi *pApi, |
| 10253 Fts5Context *pFts, |
| 10254 void *pUserData /* Pointer to sqlite3_int64 variable */ |
| 10255 ){ |
| 10256 sqlite3_int64 *pn = (sqlite3_int64*)pUserData; |
| 10257 (*pn)++; |
| 10258 return SQLITE_OK; |
| 10259 } |
| 10260 |
| 10261 /* |
| 10262 ** Set *ppData to point to the Fts5Bm25Data object for the current query. |
| 10263 ** If the object has not already been allocated, allocate and populate it |
| 10264 ** now. |
| 10265 */ |
| 10266 static int fts5Bm25GetData( |
| 10267 const Fts5ExtensionApi *pApi, |
| 10268 Fts5Context *pFts, |
| 10269 Fts5Bm25Data **ppData /* OUT: bm25-data object for this query */ |
| 10270 ){ |
| 10271 int rc = SQLITE_OK; /* Return code */ |
| 10272 Fts5Bm25Data *p; /* Object to return */ |
| 10273 |
| 10274 p = pApi->xGetAuxdata(pFts, 0); |
| 10275 if( p==0 ){ |
| 10276 int nPhrase; /* Number of phrases in query */ |
| 10277 sqlite3_int64 nRow = 0; /* Number of rows in table */ |
| 10278 sqlite3_int64 nToken = 0; /* Number of tokens in table */ |
| 10279 int nByte; /* Bytes of space to allocate */ |
| 10280 int i; |
| 10281 |
| 10282 /* Allocate the Fts5Bm25Data object */ |
| 10283 nPhrase = pApi->xPhraseCount(pFts); |
| 10284 nByte = sizeof(Fts5Bm25Data) + nPhrase*2*sizeof(double); |
| 10285 p = (Fts5Bm25Data*)sqlite3_malloc(nByte); |
| 10286 if( p==0 ){ |
| 10287 rc = SQLITE_NOMEM; |
| 10288 }else{ |
| 10289 memset(p, 0, nByte); |
| 10290 p->nPhrase = nPhrase; |
| 10291 p->aIDF = (double*)&p[1]; |
| 10292 p->aFreq = &p->aIDF[nPhrase]; |
| 10293 } |
| 10294 |
| 10295 /* Calculate the average document length for this FTS5 table */ |
| 10296 if( rc==SQLITE_OK ) rc = pApi->xRowCount(pFts, &nRow); |
| 10297 if( rc==SQLITE_OK ) rc = pApi->xColumnTotalSize(pFts, -1, &nToken); |
| 10298 if( rc==SQLITE_OK ) p->avgdl = (double)nToken / (double)nRow; |
| 10299 |
| 10300 /* Calculate an IDF for each phrase in the query */ |
| 10301 for(i=0; rc==SQLITE_OK && i<nPhrase; i++){ |
| 10302 sqlite3_int64 nHit = 0; |
| 10303 rc = pApi->xQueryPhrase(pFts, i, (void*)&nHit, fts5CountCb); |
| 10304 if( rc==SQLITE_OK ){ |
| 10305 /* Calculate the IDF (Inverse Document Frequency) for phrase i. |
| 10306 ** This is done using the standard BM25 formula as found on wikipedia: |
| 10307 ** |
| 10308 ** IDF = log( (N - nHit + 0.5) / (nHit + 0.5) ) |
| 10309 ** |
| 10310 ** where "N" is the total number of documents in the set and nHit |
| 10311 ** is the number that contain at least one instance of the phrase |
| 10312 ** under consideration. |
| 10313 ** |
| 10314 ** The problem with this is that if (N < 2*nHit), the IDF is |
| 10315 ** negative. Which is undesirable. So the minimum allowable IDF is |
| 10316 ** (1e-6) - roughly the same as a term that appears in just over |
| 10317 ** half of a set of 5,000,000 documents. */ |
| 10318 double idf = log( (nRow - nHit + 0.5) / (nHit + 0.5) ); |
| 10319 if( idf<=0.0 ) idf = 1e-6; |
| 10320 p->aIDF[i] = idf; |
| 10321 } |
| 10322 } |
| 10323 |
| 10324 if( rc!=SQLITE_OK ){ |
| 10325 sqlite3_free(p); |
| 10326 }else{ |
| 10327 rc = pApi->xSetAuxdata(pFts, p, sqlite3_free); |
| 10328 } |
| 10329 if( rc!=SQLITE_OK ) p = 0; |
| 10330 } |
| 10331 *ppData = p; |
| 10332 return rc; |
| 10333 } |
| 10334 |
| 10335 /* |
| 10336 ** Implementation of bm25() function. |
| 10337 */ |
| 10338 static void fts5Bm25Function( |
| 10339 const Fts5ExtensionApi *pApi, /* API offered by current FTS version */ |
| 10340 Fts5Context *pFts, /* First arg to pass to pApi functions */ |
| 10341 sqlite3_context *pCtx, /* Context for returning result/error */ |
| 10342 int nVal, /* Number of values in apVal[] array */ |
| 10343 sqlite3_value **apVal /* Array of trailing arguments */ |
| 10344 ){ |
| 10345 const double k1 = 1.2; /* Constant "k1" from BM25 formula */ |
| 10346 const double b = 0.75; /* Constant "b" from BM25 formula */ |
| 10347 int rc = SQLITE_OK; /* Error code */ |
| 10348 double score = 0.0; /* SQL function return value */ |
| 10349 Fts5Bm25Data *pData; /* Values allocated/calculated once only */ |
| 10350 int i; /* Iterator variable */ |
| 10351 int nInst = 0; /* Value returned by xInstCount() */ |
| 10352 double D = 0.0; /* Total number of tokens in row */ |
| 10353 double *aFreq = 0; /* Array of phrase freq. for current row */ |
| 10354 |
| 10355 /* Calculate the phrase frequency (symbol "f(qi,D)" in the documentation) |
| 10356 ** for each phrase in the query for the current row. */ |
| 10357 rc = fts5Bm25GetData(pApi, pFts, &pData); |
| 10358 if( rc==SQLITE_OK ){ |
| 10359 aFreq = pData->aFreq; |
| 10360 memset(aFreq, 0, sizeof(double) * pData->nPhrase); |
| 10361 rc = pApi->xInstCount(pFts, &nInst); |
| 10362 } |
| 10363 for(i=0; rc==SQLITE_OK && i<nInst; i++){ |
| 10364 int ip; int ic; int io; |
| 10365 rc = pApi->xInst(pFts, i, &ip, &ic, &io); |
| 10366 if( rc==SQLITE_OK ){ |
| 10367 double w = (nVal > ic) ? sqlite3_value_double(apVal[ic]) : 1.0; |
| 10368 aFreq[ip] += w; |
| 10369 } |
| 10370 } |
| 10371 |
| 10372 /* Figure out the total size of the current row in tokens. */ |
| 10373 if( rc==SQLITE_OK ){ |
| 10374 int nTok; |
| 10375 rc = pApi->xColumnSize(pFts, -1, &nTok); |
| 10376 D = (double)nTok; |
| 10377 } |
| 10378 |
| 10379 /* Determine the BM25 score for the current row. */ |
| 10380 for(i=0; rc==SQLITE_OK && i<pData->nPhrase; i++){ |
| 10381 score += pData->aIDF[i] * ( |
| 10382 ( aFreq[i] * (k1 + 1.0) ) / |
| 10383 ( aFreq[i] + k1 * (1 - b + b * D / pData->avgdl) ) |
| 10384 ); |
| 10385 } |
| 10386 |
| 10387 /* If no error has occurred, return the calculated score. Otherwise, |
| 10388 ** throw an SQL exception. */ |
| 10389 if( rc==SQLITE_OK ){ |
| 10390 sqlite3_result_double(pCtx, -1.0 * score); |
| 10391 }else{ |
| 10392 sqlite3_result_error_code(pCtx, rc); |
| 10393 } |
| 10394 } |
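The scoring loop in fts5Bm25Function() can be tried in isolation. The following is a minimal sketch, not the FTS5 code itself: `bm25_row_score` is a hypothetical helper, the per-phrase IDF and frequency arrays are passed in directly rather than obtained through `Fts5ExtensionApi`, and the constants `k1` and `b` are copied from the function above.

```c
/* Hypothetical stand-alone version of the BM25 loop shown above.
** aIDF[i]  - inverse document frequency of phrase i (already clamped)
** aFreq[i] - weighted frequency of phrase i in the current row
** D        - number of tokens in the current row
** avgdl    - average number of tokens per row in the table
** The result is negated, as in fts5Bm25Function(), so that better
** matches have numerically smaller scores. */
static double bm25_row_score(
  const double *aIDF,
  const double *aFreq,
  int nPhrase,
  double D,
  double avgdl
){
  const double k1 = 1.2;   /* Constant "k1" from the BM25 formula */
  const double b = 0.75;   /* Constant "b" from the BM25 formula */
  double score = 0.0;
  int i;
  for(i=0; i<nPhrase; i++){
    score += aIDF[i] * (
      ( aFreq[i] * (k1 + 1.0) ) /
      ( aFreq[i] + k1 * (1.0 - b + b * D / avgdl) )
    );
  }
  return -1.0 * score;
}
```

When a row has exactly the average length (D equal to avgdl), the length normalization term reduces to 1, so a phrase that occurs once contributes its IDF unchanged and a phrase that does not occur contributes nothing.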
| 10395 |
| 10396 static int sqlite3Fts5AuxInit(fts5_api *pApi){ |
| 10397 struct Builtin { |
| 10398 const char *zFunc; /* Function name (nul-terminated) */ |
| 10399 void *pUserData; /* User-data pointer */ |
| 10400 fts5_extension_function xFunc;/* Callback function */ |
| 10401 void (*xDestroy)(void*); /* Destructor function */ |
| 10402 } aBuiltin [] = { |
| 10403 { "snippet", 0, fts5SnippetFunction, 0 }, |
| 10404 { "highlight", 0, fts5HighlightFunction, 0 }, |
| 10405 { "bm25", 0, fts5Bm25Function, 0 }, |
| 10406 }; |
| 10407 int rc = SQLITE_OK; /* Return code */ |
| 10408 int i; /* To iterate through builtin functions */ |
| 10409 |
| 10410 for(i=0; rc==SQLITE_OK && i<(int)ArraySize(aBuiltin); i++){ |
| 10411 rc = pApi->xCreateFunction(pApi, |
| 10412 aBuiltin[i].zFunc, |
| 10413 aBuiltin[i].pUserData, |
| 10414 aBuiltin[i].xFunc, |
| 10415 aBuiltin[i].xDestroy |
| 10416 ); |
| 10417 } |
| 10418 |
| 10419 return rc; |
| 10420 } |
| 10421 |
| 10422 |
| 10423 |
| 10424 /* |
| 10425 ** 2014 May 31 |
| 10426 ** |
| 10427 ** The author disclaims copyright to this source code. In place of |
| 10428 ** a legal notice, here is a blessing: |
| 10429 ** |
| 10430 ** May you do good and not evil. |
| 10431 ** May you find forgiveness for yourself and forgive others. |
| 10432 ** May you share freely, never taking more than you give. |
| 10433 ** |
| 10434 ****************************************************************************** |
| 10435 */ |
| 10436 |
| 10437 |
| 10438 |
| 10439 /* #include "fts5Int.h" */ |
| 10440 |
| 10441 static int sqlite3Fts5BufferSize(int *pRc, Fts5Buffer *pBuf, int nByte){ |
| 10442 int nNew = pBuf->nSpace ? pBuf->nSpace*2 : 64; |
| 10443 u8 *pNew; |
| 10444 while( nNew<nByte ){ |
| 10445 nNew = nNew * 2; |
| 10446 } |
| 10447 pNew = sqlite3_realloc(pBuf->p, nNew); |
| 10448 if( pNew==0 ){ |
| 10449 *pRc = SQLITE_NOMEM; |
| 10450 return 1; |
| 10451 }else{ |
| 10452 pBuf->nSpace = nNew; |
| 10453 pBuf->p = pNew; |
| 10454 } |
| 10455 return 0; |
| 10456 } |
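The growth policy in sqlite3Fts5BufferSize() is a plain capacity-doubling scheme. Below is a minimal sketch of the same idea using the C standard library instead of sqlite3_realloc(). The `Buf` type and `buf_reserve` name are hypothetical, and `nByte` is assumed to be the total capacity required, as the `while( nNew<nByte )` loop above suggests.

```c
#include <stdlib.h>

/* Hypothetical stand-in for Fts5Buffer. */
typedef struct Buf {
  unsigned char *p;   /* Allocation, or NULL */
  int n;              /* Bytes currently in use */
  int nSpace;         /* Bytes allocated */
} Buf;

/* Grow pBuf so that it can hold at least nByte bytes. The new capacity
** starts at double the old one (or 64 for an empty buffer) and keeps
** doubling until it is large enough. Returns 0 on success, 1 on OOM.
** On failure the existing allocation is left untouched. */
static int buf_reserve(Buf *pBuf, int nByte){
  int nNew = pBuf->nSpace ? pBuf->nSpace*2 : 64;
  unsigned char *pNew;
  while( nNew<nByte ){
    nNew = nNew * 2;
  }
  pNew = (unsigned char*)realloc(pBuf->p, (size_t)nNew);
  if( pNew==0 ) return 1;
  pBuf->p = pNew;
  pBuf->nSpace = nNew;
  return 0;
}
```

Doubling keeps the number of reallocations logarithmic in the final size, at the cost of up to half of the allocation being unused.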
| 10457 |
| 10458 |
| 10459 /* |
| 10460 ** Encode value iVal as an SQLite varint and append it to the buffer object |
| 10461 ** pBuf. If an OOM error occurs, set *pRc to SQLITE_NOMEM. |
| 10462 */ |
| 10463 static void sqlite3Fts5BufferAppendVarint(int *pRc, Fts5Buffer *pBuf, i64 iVal){ |
| 10464 if( fts5BufferGrow(pRc, pBuf, 9) ) return; |
| 10465 pBuf->n += sqlite3Fts5PutVarint(&pBuf->p[pBuf->n], iVal); |
| 10466 } |
| 10467 |
| 10468 static void sqlite3Fts5Put32(u8 *aBuf, int iVal){ |
| 10469 aBuf[0] = (iVal>>24) & 0x00FF; |
| 10470 aBuf[1] = (iVal>>16) & 0x00FF; |
| 10471 aBuf[2] = (iVal>> 8) & 0x00FF; |
| 10472 aBuf[3] = (iVal>> 0) & 0x00FF; |
| 10473 } |
| 10474 |
| 10475 static int sqlite3Fts5Get32(const u8 *aBuf){ |
| 10476 return (int)((((unsigned int)aBuf[0]) << 24) + (aBuf[1] << 16) + (aBuf[2] << 8) + aBuf[3]); |
| 10477 } |
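sqlite3Fts5Put32() and sqlite3Fts5Get32() store a 32-bit value most significant byte first. The round trip can be sketched with fixed-width unsigned types, which sidesteps any signed-shift concerns. The names `put32be` and `get32be` are hypothetical and are not part of FTS5.

```c
#include <stdint.h>

/* Write v to aBuf[0..3] in big-endian byte order. */
static void put32be(uint8_t *aBuf, uint32_t v){
  aBuf[0] = (uint8_t)((v>>24) & 0xFF);
  aBuf[1] = (uint8_t)((v>>16) & 0xFF);
  aBuf[2] = (uint8_t)((v>> 8) & 0xFF);
  aBuf[3] = (uint8_t)((v>> 0) & 0xFF);
}

/* Read a big-endian 32-bit value from aBuf[0..3]. Each byte is widened
** to uint32_t before shifting so that no shift overflows a signed int. */
static uint32_t get32be(const uint8_t *aBuf){
  return ((uint32_t)aBuf[0] << 24)
       | ((uint32_t)aBuf[1] << 16)
       | ((uint32_t)aBuf[2] <<  8)
       |  (uint32_t)aBuf[3];
}
```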
| 10478 |
| 10479 /* |
| 10480 ** Append buffer nData/pData to buffer pBuf. If an OOM error occurs, set |
| 10481 ** *pRc to SQLITE_NOMEM. If an error has already occurred when this function |
| 10482 ** is called, it is a no-op. |
| 10483 */ |
| 10484 static void sqlite3Fts5BufferAppendBlob( |
| 10485 int *pRc, |
| 10486 Fts5Buffer *pBuf, |
| 10487 int nData, |
| 10488 const u8 *pData |
| 10489 ){ |
| 10490 assert( *pRc || nData>=0 ); |
| 10491 if( fts5BufferGrow(pRc, pBuf, nData) ) return; |
| 10492 memcpy(&pBuf->p[pBuf->n], pData, nData); |
| 10493 pBuf->n += nData; |
| 10494 } |
| 10495 |
| 10496 /* |
| 10497 ** Append the nul-terminated string zStr to the buffer pBuf. This function |
| 10498 ** ensures that the byte following the buffer data is set to 0x00, even |
| 10499 ** though this byte is not included in the pBuf->n count. |
| 10500 */ |
| 10501 static void sqlite3Fts5BufferAppendString( |
| 10502 int *pRc, |
| 10503 Fts5Buffer *pBuf, |
| 10504 const char *zStr |
| 10505 ){ |
| 10506 int nStr = (int)strlen(zStr); |
| 10507 sqlite3Fts5BufferAppendBlob(pRc, pBuf, nStr+1, (const u8*)zStr); |
| 10508 pBuf->n--; |
| 10509 } |
| 10510 |
| 10511 /* |
| 10512 ** Argument zFmt is a printf() style format string. This function performs |
| 10513 ** the printf() style processing, then appends the results to buffer pBuf. |
| 10514 ** |
| 10515 ** Like sqlite3Fts5BufferAppendString(), this function ensures that the byte |
| 10516 ** following the buffer data is set to 0x00, even though this byte is not |
| 10517 ** included in the pBuf->n count. |
| 10518 */ |
| 10519 static void sqlite3Fts5BufferAppendPrintf( |
| 10520 int *pRc, |
| 10521 Fts5Buffer *pBuf, |
| 10522 char *zFmt, ... |
| 10523 ){ |
| 10524 if( *pRc==SQLITE_OK ){ |
| 10525 char *zTmp; |
| 10526 va_list ap; |
| 10527 va_start(ap, zFmt); |
| 10528 zTmp = sqlite3_vmprintf(zFmt, ap); |
| 10529 va_end(ap); |
| 10530 |
| 10531 if( zTmp==0 ){ |
| 10532 *pRc = SQLITE_NOMEM; |
| 10533 }else{ |
| 10534 sqlite3Fts5BufferAppendString(pRc, pBuf, zTmp); |
| 10535 sqlite3_free(zTmp); |
| 10536 } |
| 10537 } |
| 10538 } |
| 10539 |
| 10540 static char *sqlite3Fts5Mprintf(int *pRc, const char *zFmt, ...){ |
| 10541 char *zRet = 0; |
| 10542 if( *pRc==SQLITE_OK ){ |
| 10543 va_list ap; |
| 10544 va_start(ap, zFmt); |
| 10545 zRet = sqlite3_vmprintf(zFmt, ap); |
| 10546 va_end(ap); |
| 10547 if( zRet==0 ){ |
| 10548 *pRc = SQLITE_NOMEM; |
| 10549 } |
| 10550 } |
| 10551 return zRet; |
| 10552 } |
| 10553 |
| 10554 |
| 10555 /* |
| 10556 ** Free any buffer allocated by pBuf. Zero the structure before returning. |
| 10557 */ |
| 10558 static void sqlite3Fts5BufferFree(Fts5Buffer *pBuf){ |
| 10559 sqlite3_free(pBuf->p); |
| 10560 memset(pBuf, 0, sizeof(Fts5Buffer)); |
| 10561 } |
| 10562 |
| 10563 /* |
| 10564 ** Zero the contents of the buffer object. But do not free the associated |
| 10565 ** memory allocation. |
| 10566 */ |
| 10567 static void sqlite3Fts5BufferZero(Fts5Buffer *pBuf){ |
| 10568 pBuf->n = 0; |
| 10569 } |
| 10570 |
| 10571 /* |
| 10572 ** Set the buffer to contain nData/pData. If an OOM error occurs, set |
| 10573 ** *pRc to SQLITE_NOMEM. If an error has already occurred when this function |
| 10574 ** is called, it is a no-op. |
| 10575 */ |
| 10576 static void sqlite3Fts5BufferSet( |
| 10577 int *pRc, |
| 10578 Fts5Buffer *pBuf, |
| 10579 int nData, |
| 10580 const u8 *pData |
| 10581 ){ |
| 10582 pBuf->n = 0; |
| 10583 sqlite3Fts5BufferAppendBlob(pRc, pBuf, nData, pData); |
| 10584 } |
| 10585 |
| 10586 static int sqlite3Fts5PoslistNext64( |
| 10587 const u8 *a, int n, /* Buffer containing poslist */ |
| 10588 int *pi, /* IN/OUT: Offset within a[] */ |
| 10589 i64 *piOff /* IN/OUT: Current offset */ |
| 10590 ){ |
| 10591 int i = *pi; |
| 10592 if( i>=n ){ |
| 10593 /* EOF */ |
| 10594 *piOff = -1; |
| 10595 return 1; |
| 10596 }else{ |
| 10597 i64 iOff = *piOff; |
| 10598 int iVal; |
| 10599 fts5FastGetVarint32(a, i, iVal); |
| 10600 if( iVal==1 ){ |
| 10601 fts5FastGetVarint32(a, i, iVal); |
| 10602 iOff = ((i64)iVal) << 32; |
| 10603 fts5FastGetVarint32(a, i, iVal); |
| 10604 } |
| 10605 *piOff = iOff + (iVal-2); |
| 10606 *pi = i; |
| 10607 return 0; |
| 10608 } |
| 10609 } |
| 10610 |
| 10611 |
| 10612 /* |
| 10613 ** Advance the iterator object passed as the only argument. Return true |
| 10614 ** if the iterator reaches EOF, or false otherwise. |
| 10615 */ |
| 10616 static int sqlite3Fts5PoslistReaderNext(Fts5PoslistReader *pIter){ |
| 10617 if( sqlite3Fts5PoslistNext64(pIter->a, pIter->n, &pIter->i, &pIter->iPos) ){ |
| 10618 pIter->bEof = 1; |
| 10619 } |
| 10620 return pIter->bEof; |
| 10621 } |
| 10622 |
| 10623 static int sqlite3Fts5PoslistReaderInit( |
| 10624 const u8 *a, int n, /* Poslist buffer to iterate through */ |
| 10625 Fts5PoslistReader *pIter /* Iterator object to initialize */ |
| 10626 ){ |
| 10627 memset(pIter, 0, sizeof(*pIter)); |
| 10628 pIter->a = a; |
| 10629 pIter->n = n; |
| 10630 sqlite3Fts5PoslistReaderNext(pIter); |
| 10631 return pIter->bEof; |
| 10632 } |
| 10633 |
| 10634 static int sqlite3Fts5PoslistWriterAppend( |
| 10635 Fts5Buffer *pBuf, |
| 10636 Fts5PoslistWriter *pWriter, |
| 10637 i64 iPos |
| 10638 ){ |
| 10639 static const i64 colmask = ((i64)(0x7FFFFFFF)) << 32; |
| 10640 int rc = SQLITE_OK; |
| 10641 if( 0==fts5BufferGrow(&rc, pBuf, 5+5+5) ){ |
| 10642 if( (iPos & colmask) != (pWriter->iPrev & colmask) ){ |
| 10643 pBuf->p[pBuf->n++] = 1; |
| 10644 pBuf->n += sqlite3Fts5PutVarint(&pBuf->p[pBuf->n], (iPos>>32)); |
| 10645 pWriter->iPrev = (iPos & colmask); |
| 10646 } |
| 10647 pBuf->n += sqlite3Fts5PutVarint(&pBuf->p[pBuf->n], (iPos-pWriter->iPrev)+2); |
| 10648 pWriter->iPrev = iPos; |
| 10649 } |
| 10650 return rc; |
| 10651 } |
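The position-list format used by sqlite3Fts5PoslistWriterAppend() and sqlite3Fts5PoslistNext64() packs each position as (column << 32) + offset, stores the delta from the previous position plus 2, and emits the byte 0x01 followed by the column number whenever the column changes. The sketch below reproduces that scheme under stated assumptions: `put_varint` and `get_varint` are simplified stand-ins for the FTS5 varint routines (values below 2^28 only), the output buffer is assumed large enough, and all names are hypothetical.

```c
#include <stdint.h>

/* Simplified varint: 7 bits per byte, most significant group first,
** high bit set on every byte except the last. Values < 2^28 only. */
static int put_varint(uint8_t *p, uint32_t v){
  int n = 0;
  if( v>=(1u<<21) ) p[n++] = (uint8_t)(((v>>21) & 0x7F) | 0x80);
  if( v>=(1u<<14) ) p[n++] = (uint8_t)(((v>>14) & 0x7F) | 0x80);
  if( v>=(1u<<7)  ) p[n++] = (uint8_t)(((v>>7)  & 0x7F) | 0x80);
  p[n++] = (uint8_t)(v & 0x7F);
  return n;
}
static int get_varint(const uint8_t *p, uint32_t *pv){
  uint32_t v = 0;
  int n = 0;
  do{ v = (v<<7) | (uint32_t)(p[n] & 0x7F); }while( p[n++] & 0x80 );
  *pv = v;
  return n;
}

typedef struct PosWriter { int64_t iPrev; } PosWriter;

/* Append position iPos to aBuf[], which already holds n bytes.
** Returns the new number of bytes in the buffer. */
static int poslist_append(uint8_t *aBuf, int n, PosWriter *pW, int64_t iPos){
  static const int64_t colmask = ((int64_t)0x7FFFFFFF) << 32;
  if( (iPos & colmask)!=(pW->iPrev & colmask) ){
    aBuf[n++] = 1;                                   /* column-change marker */
    n += put_varint(&aBuf[n], (uint32_t)(iPos>>32)); /* new column number */
    pW->iPrev = (iPos & colmask);
  }
  n += put_varint(&aBuf[n], (uint32_t)(iPos - pW->iPrev) + 2);
  pW->iPrev = iPos;
  return n;
}

/* Read the next position. *pi is the read offset and *piOff the previous
** position (initially 0). Returns 1 and sets *piOff to -1 at EOF. */
static int poslist_next(const uint8_t *a, int n, int *pi, int64_t *piOff){
  int i = *pi;
  int64_t iOff = *piOff;
  uint32_t v;
  if( i>=n ){
    *piOff = -1;
    return 1;
  }
  i += get_varint(&a[i], &v);
  if( v==1 ){
    i += get_varint(&a[i], &v);
    iOff = ((int64_t)v) << 32;
    i += get_varint(&a[i], &v);
  }
  *piOff = iOff + ((int64_t)v - 2);
  *pi = i;
  return 0;
}
```

Because deltas are stored with 2 added, a plain position entry is never 0 or 1, which is what lets the reader treat a leading 1 as the column-change marker.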
| 10652 |
| 10653 static void *sqlite3Fts5MallocZero(int *pRc, int nByte){ |
| 10654 void *pRet = 0; |
| 10655 if( *pRc==SQLITE_OK ){ |
| 10656 pRet = sqlite3_malloc(nByte); |
| 10657 if( pRet==0 ){ |
| 10658 if( nByte>0 ) *pRc = SQLITE_NOMEM; |
| 10659 }else{ |
| 10660 memset(pRet, 0, nByte); |
| 10661 } |
| 10662 } |
| 10663 return pRet; |
| 10664 } |
| 10665 |
| 10666 /* |
| 10667 ** Return a nul-terminated copy of the string indicated by pIn. If nIn |
| 10668 ** is non-negative, then it is the length of the string in bytes. Otherwise, |
| 10669 ** the length of the string is determined using strlen(). |
| 10670 ** |
| 10671 ** It is the responsibility of the caller to eventually free the returned |
| 10672 ** buffer using sqlite3_free(). If an OOM error occurs, NULL is returned. |
| 10673 */ |
| 10674 static char *sqlite3Fts5Strndup(int *pRc, const char *pIn, int nIn){ |
| 10675 char *zRet = 0; |
| 10676 if( *pRc==SQLITE_OK ){ |
| 10677 if( nIn<0 ){ |
| 10678 nIn = (int)strlen(pIn); |
| 10679 } |
| 10680 zRet = (char*)sqlite3_malloc(nIn+1); |
| 10681 if( zRet ){ |
| 10682 memcpy(zRet, pIn, nIn); |
| 10683 zRet[nIn] = '\0'; |
| 10684 }else{ |
| 10685 *pRc = SQLITE_NOMEM; |
| 10686 } |
| 10687 } |
| 10688 return zRet; |
| 10689 } |
| 10690 |
| 10691 |
| 10692 /* |
| 10693 ** Return true if character 't' may be part of an FTS5 bareword, or false |
| 10694 ** otherwise. Characters that may be part of barewords: |
| 10695 ** |
| 10696 ** * All non-ASCII characters, |
| 10697 ** * The 52 upper and lower case ASCII characters, and |
| 10698 ** * The 10 ASCII digit characters. |
| 10699 ** * The underscore character "_" (0x5F). |
| 10700 ** * The unicode "substitute" character (0x1A). |
| 10701 */ |
| 10702 static int sqlite3Fts5IsBareword(char t){ |
| 10703 u8 aBareword[128] = { |
| 10704 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 0x00 .. 0x0F */ |
| 10705 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, /* 0x10 .. 0x1F */ |
| 10706 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 0x20 .. 0x2F */ |
| 10707 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, /* 0x30 .. 0x3F */ |
| 10708 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 0x40 .. 0x4F */ |
| 10709 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, /* 0x50 .. 0x5F */ |
| 10710 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 0x60 .. 0x6F */ |
| 10711 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0 /* 0x70 .. 0x7F */ |
| 10712 }; |
| 10713 |
| 10714 return (t & 0x80) || aBareword[(int)t]; |
| 10715 } |
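The lookup table in sqlite3Fts5IsBareword() can be cross-checked against an explicit range test. This is a sketch written for illustration, not the FTS5 implementation: `is_bareword` is a hypothetical name, and the ranges are read off the table above.

```c
/* Return non-zero if t may be part of a bareword: any byte with the
** high bit set, ASCII letters and digits, '_' (0x5F), or the
** "substitute" character (0x1A). The cast to unsigned char keeps the
** comparisons well defined on platforms where char is signed. */
static int is_bareword(char t){
  unsigned char c = (unsigned char)t;
  return c>=0x80
      || (c>='0' && c<='9')
      || (c>='A' && c<='Z')
      || (c>='a' && c<='z')
      || c=='_'
      || c==0x1A;
}
```

The table-driven version trades 128 bytes of data for a single indexed load per character. The range test is easier to audit by eye.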
| 10716 |
| 10717 |
| 10718 |
| 10719 /* |
| 10720 ** 2014 Jun 09 |
| 10721 ** |
| 10722 ** The author disclaims copyright to this source code. In place of |
| 10723 ** a legal notice, here is a blessing: |
| 10724 ** |
| 10725 ** May you do good and not evil. |
| 10726 ** May you find forgiveness for yourself and forgive others. |
| 10727 ** May you share freely, never taking more than you give. |
| 10728 ** |
| 10729 ****************************************************************************** |
| 10730 ** |
| 10731 ** This is an SQLite module implementing full-text search. |
| 10732 */ |
| 10733 |
| 10734 |
| 10735 |
| 10736 /* #include "fts5Int.h" */ |
| 10737 |
| 10738 #define FTS5_DEFAULT_PAGE_SIZE 4050 |
| 10739 #define FTS5_DEFAULT_AUTOMERGE 4 |
| 10740 #define FTS5_DEFAULT_CRISISMERGE 16 |
| 10741 #define FTS5_DEFAULT_HASHSIZE (1024*1024) |
| 10742 |
| 10743 /* Maximum allowed page size */ |
| 10744 #define FTS5_MAX_PAGE_SIZE (128*1024) |
| 10745 |
| 10746 static int fts5_iswhitespace(char x){ |
| 10747 return (x==' '); |
| 10748 } |
| 10749 |
| 10750 static int fts5_isopenquote(char x){ |
| 10751 return (x=='"' || x=='\'' || x=='[' || x=='`'); |
| 10752 } |
| 10753 |
| 10754 /* |
| 10755 ** Argument pIn points to a character that is part of a nul-terminated |
| 10756 ** string. Return a pointer to the first character following *pIn in |
| 10757 ** the string that is not a white-space character. |
| 10758 */ |
| 10759 static const char *fts5ConfigSkipWhitespace(const char *pIn){ |
| 10760 const char *p = pIn; |
| 10761 if( p ){ |
| 10762 while( fts5_iswhitespace(*p) ){ p++; } |
| 10763 } |
| 10764 return p; |
| 10765 } |
| 10766 |
| 10767 /* |
| 10768 ** Argument pIn points to a character that is part of a nul-terminated |
| 10769 ** string. Return a pointer to the first character following *pIn in |
| 10770 ** the string that is not a "bareword" character. |
| 10771 */ |
| 10772 static const char *fts5ConfigSkipBareword(const char *pIn){ |
| 10773 const char *p = pIn; |
| 10774 while ( sqlite3Fts5IsBareword(*p) ) p++; |
| 10775 if( p==pIn ) p = 0; |
| 10776 return p; |
| 10777 } |
| 10778 |
| 10779 static int fts5_isdigit(char a){ |
| 10780 return (a>='0' && a<='9'); |
| 10781 } |
| 10782 |
| 10783 |
| 10784 |
| 10785 static const char *fts5ConfigSkipLiteral(const char *pIn){ |
| 10786 const char *p = pIn; |
| 10787 switch( *p ){ |
| 10788 case 'n': case 'N': |
| 10789 if( sqlite3_strnicmp("null", p, 4)==0 ){ |
| 10790 p = &p[4]; |
| 10791 }else{ |
| 10792 p = 0; |
| 10793 } |
| 10794 break; |
| 10795 |
| 10796 case 'x': case 'X': |
| 10797 p++; |
| 10798 if( *p=='\'' ){ |
| 10799 p++; |
| 10800 while( (*p>='a' && *p<='f') |
| 10801 || (*p>='A' && *p<='F') |
| 10802 || (*p>='0' && *p<='9') |
| 10803 ){ |
| 10804 p++; |
| 10805 } |
| 10806 if( *p=='\'' && 0==((p-pIn)%2) ){ |
| 10807 p++; |
| 10808 }else{ |
| 10809 p = 0; |
| 10810 } |
| 10811 }else{ |
| 10812 p = 0; |
| 10813 } |
| 10814 break; |
| 10815 |
| 10816 case '\'': |
| 10817 p++; |
| 10818 while( p ){ |
| 10819 if( *p=='\'' ){ |
| 10820 p++; |
| 10821 if( *p!='\'' ) break; |
| 10822 } |
| 10823 p++; |
| 10824 if( *p==0 ) p = 0; |
| 10825 } |
| 10826 break; |
| 10827 |
| 10828 default: |
| 10829 /* maybe a number */ |
| 10830 if( *p=='+' || *p=='-' ) p++; |
| 10831 while( fts5_isdigit(*p) ) p++; |
| 10832 |
| 10833 /* At this point, if the literal was an integer, the parse is |
| 10834 ** finished. Or, if it is a floating point value, it may continue |
| 10835 ** with either a decimal point or an 'E' character. */ |
| 10836 if( *p=='.' && fts5_isdigit(p[1]) ){ |
| 10837 p += 2; |
| 10838 while( fts5_isdigit(*p) ) p++; |
| 10839 } |
| 10840 if( p==pIn ) p = 0; |
| 10841 |
| 10842 break; |
| 10843 } |
| 10844 |
| 10845 return p; |
| 10846 } |
| 10847 |
| 10848 /* |
| 10849 ** The first character of the string pointed to by argument z is guaranteed |
| 10850 ** to be an open-quote character (see function fts5_isopenquote()). |
| 10851 ** |
| 10852 ** This function searches for the corresponding close-quote character within |
| 10853 ** the string and, if found, dequotes the string in place and adds a new |
| 10854 ** nul-terminator byte. |
| 10855 ** |
| 10856 ** If the close-quote is found, the value returned is the byte offset of |
| 10857 ** the character immediately following it. Or, if the close-quote is not |
| 10858 ** found, -1 is returned. If -1 is returned, the buffer is left in an |
| 10859 ** undefined state. |
| 10860 */ |
| 10861 static int fts5Dequote(char *z){ |
| 10862 char q; |
| 10863 int iIn = 1; |
| 10864 int iOut = 0; |
| 10865 q = z[0]; |
| 10866 |
| 10867 /* Set stack variable q to the close-quote character */ |
| 10868 assert( q=='[' || q=='\'' || q=='"' || q=='`' ); |
| 10869 if( q=='[' ) q = ']'; |
| 10870 |
| 10871 while( ALWAYS(z[iIn]) ){ |
| 10872 if( z[iIn]==q ){ |
| 10873 if( z[iIn+1]!=q ){ |
| 10874 /* Character iIn was the close quote. */ |
| 10875 iIn++; |
| 10876 break; |
| 10877 }else{ |
| 10878 /* Character iIn and iIn+1 form an escaped quote character. Skip |
| 10879 ** the input cursor past both and copy a single quote character |
| 10880 ** to the output buffer. */ |
| 10881 iIn += 2; |
| 10882 z[iOut++] = q; |
| 10883 } |
| 10884 }else{ |
| 10885 z[iOut++] = z[iIn++]; |
| 10886 } |
| 10887 } |
| 10888 |
| 10889 z[iOut] = '\0'; |
| 10890 return iIn; |
| 10891 } |
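The in-place dequoting described above can be tried in isolation. What follows is a minimal sketch, not the SQLite routine: `demo_dequote` is a hypothetical name, and it returns -1 when no close quote is found, as the header comment describes, instead of relying on the caller to guarantee one.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not part of SQLite): dequote z in place.
** Returns the offset just past the close quote, or -1 if the close
** quote is missing. On -1 the buffer contents are unspecified. */
static int demo_dequote(char *z){
  char q = z[0];
  int iIn = 1;                 /* read cursor, starts after the open quote */
  int iOut = 0;                /* write cursor */

  assert( q=='[' || q=='\'' || q=='"' || q=='`' );
  if( q=='[' ) q = ']';        /* '[' is closed by ']' */

  while( z[iIn] ){
    if( z[iIn]==q ){
      if( z[iIn+1]!=q ){
        /* A lone quote character is the close quote. */
        z[iOut] = '\0';
        return iIn+1;
      }
      /* A doubled quote is an escape: emit a single quote character. */
      iIn += 2;
      z[iOut++] = q;
    }else{
      z[iOut++] = z[iIn++];
    }
  }
  return -1;                   /* ran off the end of the string */
}
```

Because the write cursor never overtakes the read cursor, the copy is safe to perform within a single buffer.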
| 10892 |
| 10893 /* |
| 10894 ** Convert an SQL-style quoted string into a normal string by removing |
| 10895 ** the quote characters. The conversion is done in-place. If the |
| 10896 ** input does not begin with a quote character, then this routine |
| 10897 ** is a no-op. |
| 10898 ** |
| 10899 ** Examples: |
| 10900 ** |
| 10901 ** "abc" becomes abc |
| 10902 ** 'xyz' becomes xyz |
| 10903 ** [pqr] becomes pqr |
| 10904 ** `mno` becomes mno |
| 10905 */ |
| 10906 static void sqlite3Fts5Dequote(char *z){ |
| 10907 char quote; /* Quote character (if any) */ |
| 10908 |
| 10909 assert( 0==fts5_iswhitespace(z[0]) ); |
| 10910 quote = z[0]; |
| 10911 if( quote=='[' || quote=='\'' || quote=='"' || quote=='`' ){ |
| 10912 fts5Dequote(z); |
| 10913 } |
| 10914 } |
| 10915 |
| 10916 /* |
| 10917 ** Parse a "special" CREATE VIRTUAL TABLE directive and update |
| 10918 ** configuration object pConfig as appropriate. |
| 10919 ** |
| 10920 ** If successful, object pConfig is updated and SQLITE_OK returned. If |
| 10921 ** an error occurs, an SQLite error code is returned and an error message |
| 10922 ** may be left in *pzErr. It is the responsibility of the caller to |
| 10923 ** eventually free any such error message using sqlite3_free(). |
| 10924 */ |
| 10925 static int fts5ConfigParseSpecial( |
| 10926 Fts5Global *pGlobal, |
| 10927 Fts5Config *pConfig, /* Configuration object to update */ |
| 10928 const char *zCmd, /* Special command to parse */ |
| 10929 const char *zArg, /* Argument to parse */ |
| 10930 char **pzErr /* OUT: Error message */ |
| 10931 ){ |
| 10932 int rc = SQLITE_OK; |
| 10933 int nCmd = (int)strlen(zCmd); |
| 10934 if( sqlite3_strnicmp("prefix", zCmd, nCmd)==0 ){ |
| 10935 const int nByte = sizeof(int) * FTS5_MAX_PREFIX_INDEXES; |
| 10936 const char *p; |
| 10937 int bFirst = 1; |
| 10938 if( pConfig->aPrefix==0 ){ |
| 10939 pConfig->aPrefix = sqlite3Fts5MallocZero(&rc, nByte); |
| 10940 if( rc ) return rc; |
| 10941 } |
| 10942 |
| 10943 p = zArg; |
| 10944 while( 1 ){ |
| 10945 int nPre = 0; |
| 10946 |
| 10947 while( p[0]==' ' ) p++; |
| 10948 if( bFirst==0 && p[0]==',' ){ |
| 10949 p++; |
| 10950 while( p[0]==' ' ) p++; |
| 10951 }else if( p[0]=='\0' ){ |
| 10952 break; |
| 10953 } |
| 10954 if( p[0]<'0' || p[0]>'9' ){ |
| 10955 *pzErr = sqlite3_mprintf("malformed prefix=... directive"); |
| 10956 rc = SQLITE_ERROR; |
| 10957 break; |
| 10958 } |
| 10959 |
| 10960 if( pConfig->nPrefix==FTS5_MAX_PREFIX_INDEXES ){ |
| 10961 *pzErr = sqlite3_mprintf( |
| 10962 "too many prefix indexes (max %d)", FTS5_MAX_PREFIX_INDEXES |
| 10963 ); |
| 10964 rc = SQLITE_ERROR; |
| 10965 break; |
| 10966 } |
| 10967 |
| 10968 while( p[0]>='0' && p[0]<='9' && nPre<1000 ){ |
| 10969 nPre = nPre*10 + (p[0] - '0'); |
| 10970 p++; |
| 10971 } |
| 10972 |
| 10973 if( rc==SQLITE_OK && (nPre<=0 || nPre>=1000) ){ |
| 10974 *pzErr = sqlite3_mprintf("prefix length out of range (max 999)"); |
| 10975 rc = SQLITE_ERROR; |
| 10976 break; |
| 10977 } |
| 10978 |
| 10979 pConfig->aPrefix[pConfig->nPrefix] = nPre; |
| 10980 pConfig->nPrefix++; |
| 10981 bFirst = 0; |
| 10982 } |
| 10983 assert( pConfig->nPrefix<=FTS5_MAX_PREFIX_INDEXES ); |
| 10984 return rc; |
| 10985 } |
| 10986 |
| 10987 if( sqlite3_strnicmp("tokenize", zCmd, nCmd)==0 ){ |
| 10988 const char *p = (const char*)zArg; |
| 10989 int nArg = (int)strlen(zArg) + 1; |
| 10990 char **azArg = sqlite3Fts5MallocZero(&rc, sizeof(char*) * nArg); |
| 10991 char *pDel = sqlite3Fts5MallocZero(&rc, nArg * 2); |
| 10992 char *pSpace = pDel; |
| 10993 |
| 10994 if( azArg && pSpace ){ |
| 10995 if( pConfig->pTok ){ |
| 10996 *pzErr = sqlite3_mprintf("multiple tokenize=... directives"); |
| 10997 rc = SQLITE_ERROR; |
| 10998 }else{ |
| 10999 for(nArg=0; p && *p; nArg++){ |
| 11000 const char *p2 = fts5ConfigSkipWhitespace(p); |
| 11001 if( *p2=='\'' ){ |
| 11002 p = fts5ConfigSkipLiteral(p2); |
| 11003 }else{ |
| 11004 p = fts5ConfigSkipBareword(p2); |
| 11005 } |
| 11006 if( p ){ |
| 11007 memcpy(pSpace, p2, p-p2); |
| 11008 azArg[nArg] = pSpace; |
| 11009 sqlite3Fts5Dequote(pSpace); |
| 11010 pSpace += (p - p2) + 1; |
| 11011 p = fts5ConfigSkipWhitespace(p); |
| 11012 } |
| 11013 } |
| 11014 if( p==0 ){ |
| 11015 *pzErr = sqlite3_mprintf("parse error in tokenize directive"); |
| 11016 rc = SQLITE_ERROR; |
| 11017 }else{ |
| 11018 rc = sqlite3Fts5GetTokenizer(pGlobal, |
| 11019 (const char**)azArg, nArg, &pConfig->pTok, &pConfig->pTokApi, |
| 11020 pzErr |
| 11021 ); |
| 11022 } |
| 11023 } |
| 11024 } |
| 11025 |
| 11026 sqlite3_free(azArg); |
| 11027 sqlite3_free(pDel); |
| 11028 return rc; |
| 11029 } |
| 11030 |
| 11031 if( sqlite3_strnicmp("content", zCmd, nCmd)==0 ){ |
| 11032 if( pConfig->eContent!=FTS5_CONTENT_NORMAL ){ |
| 11033 *pzErr = sqlite3_mprintf("multiple content=... directives"); |
| 11034 rc = SQLITE_ERROR; |
| 11035 }else{ |
| 11036 if( zArg[0] ){ |
| 11037 pConfig->eContent = FTS5_CONTENT_EXTERNAL; |
| 11038 pConfig->zContent = sqlite3Fts5Mprintf(&rc, "%Q.%Q", pConfig->zDb,zArg); |
| 11039 }else{ |
| 11040 pConfig->eContent = FTS5_CONTENT_NONE; |
| 11041 } |
| 11042 } |
| 11043 return rc; |
| 11044 } |
| 11045 |
| 11046 if( sqlite3_strnicmp("content_rowid", zCmd, nCmd)==0 ){ |
| 11047 if( pConfig->zContentRowid ){ |
| 11048 *pzErr = sqlite3_mprintf("multiple content_rowid=... directives"); |
| 11049 rc = SQLITE_ERROR; |
| 11050 }else{ |
| 11051 pConfig->zContentRowid = sqlite3Fts5Strndup(&rc, zArg, -1); |
| 11052 } |
| 11053 return rc; |
| 11054 } |
| 11055 |
| 11056 if( sqlite3_strnicmp("columnsize", zCmd, nCmd)==0 ){ |
| 11057 if( (zArg[0]!='0' && zArg[0]!='1') || zArg[1]!='\0' ){ |
| 11058 *pzErr = sqlite3_mprintf("malformed columnsize=... directive"); |
| 11059 rc = SQLITE_ERROR; |
| 11060 }else{ |
| 11061 pConfig->bColumnsize = (zArg[0]=='1'); |
| 11062 } |
| 11063 return rc; |
| 11064 } |
| 11065 |
| 11066 *pzErr = sqlite3_mprintf("unrecognized option: \"%.*s\"", nCmd, zCmd); |
| 11067 return SQLITE_ERROR; |
| 11068 } |
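The `prefix=` branch above accepts a list of integers in the range 1..999, separated by spaces or by a single comma. The sketch below restates that loop as a standalone function so the accepted grammar is easier to see. The names `demo_parse_prefix` and `DEMO_MAX_PREFIX` are hypothetical, and the limit of 31 is an assumption standing in for `FTS5_MAX_PREFIX_INDEXES`.

```c
#include <assert.h>

#define DEMO_MAX_PREFIX 31   /* assumed stand-in for FTS5_MAX_PREFIX_INDEXES */

/* Hypothetical helper: parse a list such as "2, 3 4" into aOut[], which
** must have room for DEMO_MAX_PREFIX entries. Returns the number of
** values parsed, or -1 if the list is malformed, too long, or contains
** a value outside 1..999. */
static int demo_parse_prefix(const char *zArg, int *aOut){
  const char *p = zArg;
  int n = 0;

  while( 1 ){
    int nPre = 0;

    while( *p==' ' ) p++;
    if( n>0 && *p==',' ){
      /* A comma is only accepted between two values. */
      p++;
      while( *p==' ' ) p++;
    }else if( *p=='\0' ){
      break;
    }
    if( *p<'0' || *p>'9' ) return -1;      /* malformed directive */
    if( n==DEMO_MAX_PREFIX ) return -1;    /* too many prefix indexes */

    /* Stop accumulating once the value reaches 1000 to avoid overflow. */
    while( *p>='0' && *p<='9' && nPre<1000 ){
      nPre = nPre*10 + (*p - '0');
      p++;
    }
    if( nPre<=0 || nPre>=1000 ) return -1; /* out of range */

    aOut[n++] = nPre;
  }
  return n;
}
```

Under these rules a trailing comma, as in "2,", is rejected, because the digit check runs immediately after the comma is consumed.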
| 11069 |
| 11070 /* |
| 11071 ** Allocate an instance of the default tokenizer ("unicode61") at |
| 11072 ** Fts5Config.pTok. Return SQLITE_OK if successful, or an SQLite error |
| 11073 ** code if an error occurs. |
| 11074 */ |
| 11075 static int fts5ConfigDefaultTokenizer(Fts5Global *pGlobal, Fts5Config *pConfig){ |
| 11076 assert( pConfig->pTok==0 && pConfig->pTokApi==0 ); |
| 11077 return sqlite3Fts5GetTokenizer( |
| 11078 pGlobal, 0, 0, &pConfig->pTok, &pConfig->pTokApi, 0 |
| 11079 ); |
| 11080 } |
| 11081 |
| 11082 /* |
| 11083 ** Gobble up the first bareword or quoted word from the input buffer zIn. |
| 11084 ** Return a pointer to the character immediately following the last in |
| 11085 ** the gobbled word if successful, or a NULL pointer otherwise (failed |
| 11086 ** to find close-quote character). |
| 11087 ** |
| 11088 ** Before returning, set *pzOut to point to a new buffer containing a |
| 11089 ** nul-terminated, dequoted copy of the gobbled word. If the word was |
| 11090 ** quoted, *pbQuoted is also set to 1 before returning. |
| 11091 ** |
| 11092 ** If *pRc is other than SQLITE_OK when this function is called, it is |
| 11093 ** a no-op (NULL is returned). Otherwise, if an OOM occurs within this |
| 11094 ** function, *pRc is set to SQLITE_NOMEM before returning. *pRc is *not* |
| 11095 ** set if a parse error (failed to find close quote) occurs. |
| 11096 */ |
| 11097 static const char *fts5ConfigGobbleWord( |
| 11098 int *pRc, /* IN/OUT: Error code */ |
| 11099 const char *zIn, /* Buffer to gobble string/bareword from */ |
| 11100 char **pzOut, /* OUT: malloc'd buffer containing str/bw */ |
| 11101 int *pbQuoted /* OUT: Set to true if dequoting required */ |
| 11102 ){ |
| 11103 const char *zRet = 0; |
| 11104 |
| 11105 int nIn = (int)strlen(zIn); |
| 11106 char *zOut = sqlite3_malloc(nIn+1); |
| 11107 |
| 11108 assert( *pRc==SQLITE_OK ); |
| 11109 *pbQuoted = 0; |
| 11110 *pzOut = 0; |
| 11111 |
| 11112 if( zOut==0 ){ |
| 11113 *pRc = SQLITE_NOMEM; |
| 11114 }else{ |
| 11115 memcpy(zOut, zIn, nIn+1); |
| 11116 if( fts5_isopenquote(zOut[0]) ){ |
| 11117 int ii = fts5Dequote(zOut); |
| 11118 zRet = &zIn[ii]; |
| 11119 *pbQuoted = 1; |
| 11120 }else{ |
| 11121 zRet = fts5ConfigSkipBareword(zIn); |
| 11122 zOut[zRet-zIn] = '\0'; |
| 11123 } |
| 11124 } |
| 11125 |
| 11126 if( zRet==0 ){ |
| 11127 sqlite3_free(zOut); |
| 11128 }else{ |
| 11129 *pzOut = zOut; |
| 11130 } |
| 11131 |
| 11132 return zRet; |
| 11133 } |
| 11134 |
| 11135 static int fts5ConfigParseColumn( |
| 11136 Fts5Config *p, |
| 11137 char *zCol, |
| 11138 char *zArg, |
| 11139 char **pzErr |
| 11140 ){ |
| 11141 int rc = SQLITE_OK; |
| 11142 if( 0==sqlite3_stricmp(zCol, FTS5_RANK_NAME) |
| 11143 || 0==sqlite3_stricmp(zCol, FTS5_ROWID_NAME) |
| 11144 ){ |
| 11145 *pzErr = sqlite3_mprintf("reserved fts5 column name: %s", zCol); |
| 11146 rc = SQLITE_ERROR; |
| 11147 }else if( zArg ){ |
| 11148 if( 0==sqlite3_stricmp(zArg, "unindexed") ){ |
| 11149 p->abUnindexed[p->nCol] = 1; |
| 11150 }else{ |
| 11151 *pzErr = sqlite3_mprintf("unrecognized column option: %s", zArg); |
| 11152 rc = SQLITE_ERROR; |
| 11153 } |
| 11154 } |
| 11155 |
| 11156 p->azCol[p->nCol++] = zCol; |
| 11157 return rc; |
| 11158 } |
| 11159 |
| 11160 /* |
| 11161 ** Populate the Fts5Config.zContentExprlist string. |
| 11162 */ |
| 11163 static int fts5ConfigMakeExprlist(Fts5Config *p){ |
| 11164 int i; |
| 11165 int rc = SQLITE_OK; |
| 11166 Fts5Buffer buf = {0, 0, 0}; |
| 11167 |
| 11168 sqlite3Fts5BufferAppendPrintf(&rc, &buf, "T.%Q", p->zContentRowid); |
| 11169 if( p->eContent!=FTS5_CONTENT_NONE ){ |
| 11170 for(i=0; i<p->nCol; i++){ |
| 11171 if( p->eContent==FTS5_CONTENT_EXTERNAL ){ |
| 11172 sqlite3Fts5BufferAppendPrintf(&rc, &buf, ", T.%Q", p->azCol[i]); |
| 11173 }else{ |
| 11174 sqlite3Fts5BufferAppendPrintf(&rc, &buf, ", T.c%d", i); |
| 11175 } |
| 11176 } |
| 11177 } |
| 11178 |
| 11179 assert( p->zContentExprlist==0 ); |
| 11180 p->zContentExprlist = (char*)buf.p; |
| 11181 return rc; |
| 11182 } |
| 11183 |
| 11184 /* |
| 11185 ** Arguments nArg/azArg contain the string arguments passed to the xCreate |
| 11186 ** or xConnect method of the virtual table. This function attempts to |
| 11187 ** allocate an instance of Fts5Config containing the results of parsing |
| 11188 ** those arguments. |
| 11189 ** |
| 11190 ** If successful, SQLITE_OK is returned and *ppOut is set to point to the |
| 11191 ** new Fts5Config object. If an error occurs, an SQLite error code is |
| 11192 ** returned, *ppOut is set to NULL and an error message may be left in |
| 11193 ** *pzErr. It is the responsibility of the caller to eventually free any |
| 11194 ** such error message using sqlite3_free(). |
| 11195 */ |
| 11196 static int sqlite3Fts5ConfigParse( |
| 11197 Fts5Global *pGlobal, |
| 11198 sqlite3 *db, |
| 11199 int nArg, /* Number of arguments */ |
| 11200 const char **azArg, /* Array of nArg CREATE VIRTUAL TABLE args */ |
| 11201 Fts5Config **ppOut, /* OUT: Results of parse */ |
| 11202 char **pzErr /* OUT: Error message */ |
| 11203 ){ |
| 11204 int rc = SQLITE_OK; /* Return code */ |
| 11205 Fts5Config *pRet; /* New object to return */ |
| 11206 int i; |
| 11207 int nByte; |
| 11208 |
| 11209 *ppOut = pRet = (Fts5Config*)sqlite3_malloc(sizeof(Fts5Config)); |
| 11210 if( pRet==0 ) return SQLITE_NOMEM; |
| 11211 memset(pRet, 0, sizeof(Fts5Config)); |
| 11212 pRet->db = db; |
| 11213 pRet->iCookie = -1; |
| 11214 |
| 11215 nByte = nArg * (sizeof(char*) + sizeof(u8)); |
| 11216 pRet->azCol = (char**)sqlite3Fts5MallocZero(&rc, nByte); |
| 11217 pRet->abUnindexed = (u8*)&pRet->azCol[nArg]; |
| 11218 pRet->zDb = sqlite3Fts5Strndup(&rc, azArg[1], -1); |
| 11219 pRet->zName = sqlite3Fts5Strndup(&rc, azArg[2], -1); |
| 11220 pRet->bColumnsize = 1; |
| 11221 #ifdef SQLITE_DEBUG |
| 11222 pRet->bPrefixIndex = 1; |
| 11223 #endif |
| 11224 if( rc==SQLITE_OK && sqlite3_stricmp(pRet->zName, FTS5_RANK_NAME)==0 ){ |
| 11225 *pzErr = sqlite3_mprintf("reserved fts5 table name: %s", pRet->zName); |
| 11226 rc = SQLITE_ERROR; |
| 11227 } |
| 11228 |
| 11229 for(i=3; rc==SQLITE_OK && i<nArg; i++){ |
| 11230 const char *zOrig = azArg[i]; |
| 11231 const char *z; |
| 11232 char *zOne = 0; |
| 11233 char *zTwo = 0; |
| 11234 int bOption = 0; |
| 11235 int bMustBeCol = 0; |
| 11236 |
| 11237 z = fts5ConfigGobbleWord(&rc, zOrig, &zOne, &bMustBeCol); |
| 11238 z = fts5ConfigSkipWhitespace(z); |
| 11239 if( z && *z=='=' ){ |
| 11240 bOption = 1; |
| 11241 z++; |
| 11242 if( bMustBeCol ) z = 0; |
| 11243 } |
| 11244 z = fts5ConfigSkipWhitespace(z); |
| 11245 if( z && z[0] ){ |
| 11246 int bDummy; |
| 11247 z = fts5ConfigGobbleWord(&rc, z, &zTwo, &bDummy); |
| 11248 if( z && z[0] ) z = 0; |
| 11249 } |
| 11250 |
| 11251 if( rc==SQLITE_OK ){ |
| 11252 if( z==0 ){ |
| 11253 *pzErr = sqlite3_mprintf("parse error in \"%s\"", zOrig); |
| 11254 rc = SQLITE_ERROR; |
| 11255 }else{ |
| 11256 if( bOption ){ |
| 11257 rc = fts5ConfigParseSpecial(pGlobal, pRet, zOne, zTwo?zTwo:"", pzErr); |
| 11258 }else{ |
| 11259 rc = fts5ConfigParseColumn(pRet, zOne, zTwo, pzErr); |
| 11260 zOne = 0; |
| 11261 } |
| 11262 } |
| 11263 } |
| 11264 |
| 11265 sqlite3_free(zOne); |
| 11266 sqlite3_free(zTwo); |
| 11267 } |
| 11268 |
| 11269 /* If a tokenize= option was successfully parsed, the tokenizer has |
| 11270 ** already been allocated. Otherwise, allocate an instance of the default |
| 11271 ** tokenizer (unicode61) now. */ |
| 11272 if( rc==SQLITE_OK && pRet->pTok==0 ){ |
| 11273 rc = fts5ConfigDefaultTokenizer(pGlobal, pRet); |
| 11274 } |
| 11275 |
| 11276 /* If zContent was not set by a content= option, fill in the default. */ |
| 11277 if( rc==SQLITE_OK && pRet->zContent==0 ){ |
| 11278 const char *zTail = 0; |
| 11279 assert( pRet->eContent==FTS5_CONTENT_NORMAL |
| 11280 || pRet->eContent==FTS5_CONTENT_NONE |
| 11281 ); |
| 11282 if( pRet->eContent==FTS5_CONTENT_NORMAL ){ |
| 11283 zTail = "content"; |
| 11284 }else if( pRet->bColumnsize ){ |
| 11285 zTail = "docsize"; |
| 11286 } |
| 11287 |
| 11288 if( zTail ){ |
| 11289 pRet->zContent = sqlite3Fts5Mprintf( |
| 11290 &rc, "%Q.'%q_%s'", pRet->zDb, pRet->zName, zTail |
| 11291 ); |
| 11292 } |
| 11293 } |
| 11294 |
| 11295 if( rc==SQLITE_OK && pRet->zContentRowid==0 ){ |
| 11296 pRet->zContentRowid = sqlite3Fts5Strndup(&rc, "rowid", -1); |
| 11297 } |
| 11298 |
| 11299 /* Formulate the zContentExprlist text */ |
| 11300 if( rc==SQLITE_OK ){ |
| 11301 rc = fts5ConfigMakeExprlist(pRet); |
| 11302 } |
| 11303 |
| 11304 if( rc!=SQLITE_OK ){ |
| 11305 sqlite3Fts5ConfigFree(pRet); |
| 11306 *ppOut = 0; |
| 11307 } |
| 11308 return rc; |
| 11309 } |
| 11310 |
| 11311 /* |
| 11312 ** Free the configuration object passed as the only argument. |
| 11313 */ |
| 11314 static void sqlite3Fts5ConfigFree(Fts5Config *pConfig){ |
| 11315 if( pConfig ){ |
| 11316 int i; |
| 11317 if( pConfig->pTok ){ |
| 11318 pConfig->pTokApi->xDelete(pConfig->pTok); |
| 11319 } |
| 11320 sqlite3_free(pConfig->zDb); |
| 11321 sqlite3_free(pConfig->zName); |
| 11322 for(i=0; i<pConfig->nCol; i++){ |
| 11323 sqlite3_free(pConfig->azCol[i]); |
| 11324 } |
| 11325 sqlite3_free(pConfig->azCol); |
| 11326 sqlite3_free(pConfig->aPrefix); |
| 11327 sqlite3_free(pConfig->zRank); |
| 11328 sqlite3_free(pConfig->zRankArgs); |
| 11329 sqlite3_free(pConfig->zContent); |
| 11330 sqlite3_free(pConfig->zContentRowid); |
| 11331 sqlite3_free(pConfig->zContentExprlist); |
| 11332 sqlite3_free(pConfig); |
| 11333 } |
| 11334 } |
| 11335 |
| 11336 /* |
| 11337 ** Call sqlite3_declare_vtab() based on the contents of the configuration |
| 11338 ** object passed as the only argument. Return SQLITE_OK if successful, or |
| 11339 ** an SQLite error code if an error occurs. |
| 11340 */ |
| 11341 static int sqlite3Fts5ConfigDeclareVtab(Fts5Config *pConfig){ |
| 11342 int i; |
| 11343 int rc = SQLITE_OK; |
| 11344 char *zSql; |
| 11345 |
| 11346 zSql = sqlite3Fts5Mprintf(&rc, "CREATE TABLE x("); |
| 11347 for(i=0; zSql && i<pConfig->nCol; i++){ |
| 11348 const char *zSep = (i==0?"":", "); |
| 11349 zSql = sqlite3Fts5Mprintf(&rc, "%z%s%Q", zSql, zSep, pConfig->azCol[i]); |
| 11350 } |
| 11351 zSql = sqlite3Fts5Mprintf(&rc, "%z, %Q HIDDEN, %s HIDDEN)", |
| 11352 zSql, pConfig->zName, FTS5_RANK_NAME |
| 11353 ); |
| 11354 |
| 11355 assert( zSql || rc==SQLITE_NOMEM ); |
| 11356 if( zSql ){ |
| 11357 rc = sqlite3_declare_vtab(pConfig->db, zSql); |
| 11358 sqlite3_free(zSql); |
| 11359 } |
| 11360 |
| 11361 return rc; |
| 11362 } |
| 11363 |
| 11364 /* |
| 11365 ** Tokenize the text passed via the third and fourth arguments. |
| 11366 ** |
| 11367 ** The callback is invoked once for each token in the input text. The |
| 11368 ** arguments passed to it are, in order: |
| 11369 ** |
| 11370 ** void *pCtx // Copy of 5th argument to sqlite3Fts5Tokenize() |
| 11371 ** int tflags // Mask of FTS5_TOKEN_* flags |
| 11372 ** const char *pToken // Pointer to buffer containing token |
| 11373 ** int nToken // Size of token in bytes |
| 11374 ** int iStart // Byte offset of start of token within input text |
| 11375 ** int iEnd // Byte offset of end of token within input text |
| 11376 ** |
| 11377 ** If the callback returns a non-zero value the tokenization is abandoned |
| 11378 ** and no further callbacks are issued. |
| 11379 ** |
| 11380 ** This function returns SQLITE_OK if successful or an SQLite error code |
| 11381 ** if an error occurs. If the tokenization was abandoned early because |
| 11382 ** the callback returned SQLITE_DONE, this is not an error and this function |
| 11383 ** still returns SQLITE_OK. Or, if the tokenization was abandoned early |
| 11384 ** because the callback returned another non-zero value, it is assumed |
| 11385 ** to be an SQLite error code and returned to the caller. |
| 11386 */ |
| 11387 static int sqlite3Fts5Tokenize( |
| 11388 Fts5Config *pConfig, /* FTS5 Configuration object */ |
| 11389 int flags, /* FTS5_TOKENIZE_* flags */ |
| 11390 const char *pText, int nText, /* Text to tokenize */ |
| 11391 void *pCtx, /* Context passed to xToken() */ |
| 11392 int (*xToken)(void*, int, const char*, int, int, int) /* Callback */ |
| 11393 ){ |
| 11394 if( pText==0 ) return SQLITE_OK; |
| 11395 return pConfig->pTokApi->xTokenize( |
| 11396 pConfig->pTok, pCtx, flags, pText, nText, xToken |
| 11397 ); |
| 11398 } |
| 11399 |
| 11400 /* |
| 11401 ** Argument pIn points to the first character in what is expected to be |
| 11402 ** a comma-separated list of SQL literals followed by a ')' character. |
| 11403 ** If it actually is this, return a pointer to the ')'. Otherwise, return |
| 11404 ** NULL to indicate a parse error. |
| 11405 */ |
| 11406 static const char *fts5ConfigSkipArgs(const char *pIn){ |
| 11407 const char *p = pIn; |
| 11408 |
| 11409 while( 1 ){ |
| 11410 p = fts5ConfigSkipWhitespace(p); |
| 11411 p = fts5ConfigSkipLiteral(p); |
| 11412 p = fts5ConfigSkipWhitespace(p); |
| 11413 if( p==0 || *p==')' ) break; |
| 11414 if( *p!=',' ){ |
| 11415 p = 0; |
| 11416 break; |
| 11417 } |
| 11418 p++; |
| 11419 } |
| 11420 |
| 11421 return p; |
| 11422 } |
| 11423 |
| 11424 /* |
| 11425 ** Parameter zIn contains a rank() function specification. The format of |
| 11426 ** this is: |
| 11427 ** |
| 11428 ** + Bareword (function name) |
| 11429 ** + Open parenthesis - "(" |
| 11430 ** + Zero or more SQL literals in a comma separated list |
| 11431 ** + Close parenthesis - ")" |
| 11432 */ |
| 11433 static int sqlite3Fts5ConfigParseRank( |
| 11434 const char *zIn, /* Input string */ |
| 11435 char **pzRank, /* OUT: Rank function name */ |
| 11436 char **pzRankArgs /* OUT: Rank function arguments */ |
| 11437 ){ |
| 11438 const char *p = zIn; |
| 11439 const char *pRank; |
| 11440 char *zRank = 0; |
| 11441 char *zRankArgs = 0; |
| 11442 int rc = SQLITE_OK; |
| 11443 |
| 11444 *pzRank = 0; |
| 11445 *pzRankArgs = 0; |
| 11446 |
| 11447 if( p==0 ){ |
| 11448 rc = SQLITE_ERROR; |
| 11449 }else{ |
| 11450 p = fts5ConfigSkipWhitespace(p); |
| 11451 pRank = p; |
| 11452 p = fts5ConfigSkipBareword(p); |
| 11453 |
| 11454 if( p ){ |
| 11455 zRank = sqlite3Fts5MallocZero(&rc, 1 + p - pRank); |
| 11456 if( zRank ) memcpy(zRank, pRank, p-pRank); |
| 11457 }else{ |
| 11458 rc = SQLITE_ERROR; |
| 11459 } |
| 11460 |
| 11461 if( rc==SQLITE_OK ){ |
| 11462 p = fts5ConfigSkipWhitespace(p); |
| 11463 if( *p!='(' ) rc = SQLITE_ERROR; |
| 11464 p++; |
| 11465 } |
| 11466 if( rc==SQLITE_OK ){ |
| 11467 const char *pArgs; |
| 11468 p = fts5ConfigSkipWhitespace(p); |
| 11469 pArgs = p; |
| 11470 if( *p!=')' ){ |
| 11471 p = fts5ConfigSkipArgs(p); |
| 11472 if( p==0 ){ |
| 11473 rc = SQLITE_ERROR; |
| 11474 }else{ |
| 11475 zRankArgs = sqlite3Fts5MallocZero(&rc, 1 + p - pArgs); |
| 11476 if( zRankArgs ) memcpy(zRankArgs, pArgs, p-pArgs); |
| 11477 } |
| 11478 } |
| 11479 } |
| 11480 } |
| 11481 |
| 11482 if( rc!=SQLITE_OK ){ |
| 11483 sqlite3_free(zRank); |
| 11484 assert( zRankArgs==0 ); |
| 11485 }else{ |
| 11486 *pzRank = zRank; |
| 11487 *pzRankArgs = zRankArgs; |
| 11488 } |
| 11489 return rc; |
| 11490 } |
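To make the rank() specification format concrete, here is a deliberately simplified sketch that splits a string such as `bm25(10.0, 5.0)` into a function name and an argument string. It is not the parser above: `demo_split_rank` is a hypothetical name, barewords are assumed to consist of ASCII alphanumerics and underscores only, and the arguments are not validated as SQL literals (the first ')' ends the list, so a ')' inside a quoted literal would be mishandled).

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split "name ( args )" into two heap strings.
** *pzArgs is left NULL when the argument list is empty. Returns 0 on
** success, or -1 on a parse error or allocation failure. The caller
** releases both strings with free(). */
static int demo_split_rank(const char *zIn, char **pzName, char **pzArgs){
  const char *p = zIn;
  const char *pName;
  const char *pNameEnd;
  const char *pArgs;
  char *zName;
  char *zArgs = 0;

  assert( pzName!=0 && pzArgs!=0 );
  *pzName = 0;
  *pzArgs = 0;
  if( p==0 ) return -1;

  /* Bareword: the function name. */
  while( isspace((unsigned char)*p) ) p++;
  pName = p;
  while( isalnum((unsigned char)*p) || *p=='_' ) p++;
  pNameEnd = p;
  if( pNameEnd==pName ) return -1;

  /* Open parenthesis. */
  while( isspace((unsigned char)*p) ) p++;
  if( *p!='(' ) return -1;
  p++;

  /* Everything up to the first ')' is treated as the argument text. */
  while( isspace((unsigned char)*p) ) p++;
  pArgs = p;
  p = strchr(pArgs, ')');
  if( p==0 ) return -1;

  zName = calloc(1, (size_t)(pNameEnd - pName) + 1);
  if( zName==0 ) return -1;
  memcpy(zName, pName, (size_t)(pNameEnd - pName));

  if( p>pArgs ){
    zArgs = calloc(1, (size_t)(p - pArgs) + 1);
    if( zArgs==0 ){ free(zName); return -1; }
    memcpy(zArgs, pArgs, (size_t)(p - pArgs));
  }

  *pzName = zName;
  *pzArgs = zArgs;
  return 0;
}
```

As in the function above, the outputs are only assigned once the whole input has parsed, so a failed call never hands back a partially built result.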
| 11491 |
| 11492 static int sqlite3Fts5ConfigSetValue( |
| 11493 Fts5Config *pConfig, |
| 11494 const char *zKey, |
| 11495 sqlite3_value *pVal, |
| 11496 int *pbBadkey |
| 11497 ){ |
| 11498 int rc = SQLITE_OK; |
| 11499 |
| 11500 if( 0==sqlite3_stricmp(zKey, "pgsz") ){ |
| 11501 int pgsz = 0; |
| 11502 if( SQLITE_INTEGER==sqlite3_value_numeric_type(pVal) ){ |
| 11503 pgsz = sqlite3_value_int(pVal); |
| 11504 } |
| 11505 if( pgsz<=0 || pgsz>FTS5_MAX_PAGE_SIZE ){ |
| 11506 *pbBadkey = 1; |
| 11507 }else{ |
| 11508 pConfig->pgsz = pgsz; |
| 11509 } |
| 11510 } |
| 11511 |
| 11512 else if( 0==sqlite3_stricmp(zKey, "hashsize") ){ |
| 11513 int nHashSize = -1; |
| 11514 if( SQLITE_INTEGER==sqlite3_value_numeric_type(pVal) ){ |
| 11515 nHashSize = sqlite3_value_int(pVal); |
| 11516 } |
| 11517 if( nHashSize<=0 ){ |
| 11518 *pbBadkey = 1; |
| 11519 }else{ |
| 11520 pConfig->nHashSize = nHashSize; |
| 11521 } |
| 11522 } |
| 11523 |
| 11524 else if( 0==sqlite3_stricmp(zKey, "automerge") ){ |
| 11525 int nAutomerge = -1; |
| 11526 if( SQLITE_INTEGER==sqlite3_value_numeric_type(pVal) ){ |
| 11527 nAutomerge = sqlite3_value_int(pVal); |
| 11528 } |
| 11529 if( nAutomerge<0 || nAutomerge>64 ){ |
| 11530 *pbBadkey = 1; |
| 11531 }else{ |
| 11532 if( nAutomerge==1 ) nAutomerge = FTS5_DEFAULT_AUTOMERGE; |
| 11533 pConfig->nAutomerge = nAutomerge; |
| 11534 } |
| 11535 } |
| 11536 |
| 11537 else if( 0==sqlite3_stricmp(zKey, "crisismerge") ){ |
| 11538 int nCrisisMerge = -1; |
| 11539 if( SQLITE_INTEGER==sqlite3_value_numeric_type(pVal) ){ |
| 11540 nCrisisMerge = sqlite3_value_int(pVal); |
| 11541 } |
| 11542 if( nCrisisMerge<0 ){ |
| 11543 *pbBadkey = 1; |
| 11544 }else{ |
| 11545 if( nCrisisMerge<=1 ) nCrisisMerge = FTS5_DEFAULT_CRISISMERGE; |
| 11546 pConfig->nCrisisMerge = nCrisisMerge; |
| 11547 } |
| 11548 } |
| 11549 |
| 11550 else if( 0==sqlite3_stricmp(zKey, "rank") ){ |
| 11551 const char *zIn = (const char*)sqlite3_value_text(pVal); |
| 11552 char *zRank; |
| 11553 char *zRankArgs; |
| 11554 rc = sqlite3Fts5ConfigParseRank(zIn, &zRank, &zRankArgs); |
| 11555 if( rc==SQLITE_OK ){ |
| 11556 sqlite3_free(pConfig->zRank); |
| 11557 sqlite3_free(pConfig->zRankArgs); |
| 11558 pConfig->zRank = zRank; |
| 11559 pConfig->zRankArgs = zRankArgs; |
| 11560 }else if( rc==SQLITE_ERROR ){ |
| 11561 rc = SQLITE_OK; |
| 11562 *pbBadkey = 1; |
| 11563 } |
| 11564 }else{ |
| 11565 *pbBadkey = 1; |
| 11566 } |
| 11567 return rc; |
| 11568 } |
| 11569 |
| 11570 /* |
| 11571 ** Load the contents of the %_config table into memory. |
| 11572 */ |
| 11573 static int sqlite3Fts5ConfigLoad(Fts5Config *pConfig, int iCookie){ |
| 11574 const char *zSelect = "SELECT k, v FROM %Q.'%q_config'"; |
| 11575 char *zSql; |
| 11576 sqlite3_stmt *p = 0; |
| 11577 int rc = SQLITE_OK; |
| 11578 int iVersion = 0; |
| 11579 |
| 11580 /* Set default values */ |
| 11581 pConfig->pgsz = FTS5_DEFAULT_PAGE_SIZE; |
| 11582 pConfig->nAutomerge = FTS5_DEFAULT_AUTOMERGE; |
| 11583 pConfig->nCrisisMerge = FTS5_DEFAULT_CRISISMERGE; |
| 11584 pConfig->nHashSize = FTS5_DEFAULT_HASHSIZE; |
| 11585 |
| 11586 zSql = sqlite3Fts5Mprintf(&rc, zSelect, pConfig->zDb, pConfig->zName); |
| 11587 if( zSql ){ |
| 11588 rc = sqlite3_prepare_v2(pConfig->db, zSql, -1, &p, 0); |
| 11589 sqlite3_free(zSql); |
| 11590 } |
| 11591 |
| 11592 assert( rc==SQLITE_OK || p==0 ); |
| 11593 if( rc==SQLITE_OK ){ |
| 11594 while( SQLITE_ROW==sqlite3_step(p) ){ |
| 11595 const char *zK = (const char*)sqlite3_column_text(p, 0); |
| 11596 sqlite3_value *pVal = sqlite3_column_value(p, 1); |
| 11597 if( 0==sqlite3_stricmp(zK, "version") ){ |
| 11598 iVersion = sqlite3_value_int(pVal); |
| 11599 }else{ |
| 11600 int bDummy = 0; |
| 11601 sqlite3Fts5ConfigSetValue(pConfig, zK, pVal, &bDummy); |
| 11602 } |
| 11603 } |
| 11604 rc = sqlite3_finalize(p); |
| 11605 } |
| 11606 |
| 11607 if( rc==SQLITE_OK && iVersion!=FTS5_CURRENT_VERSION ){ |
| 11608 rc = SQLITE_ERROR; |
| 11609 if( pConfig->pzErrmsg ){ |
| 11610 assert( 0==*pConfig->pzErrmsg ); |
| 11611 *pConfig->pzErrmsg = sqlite3_mprintf( |
| 11612 "invalid fts5 file format (found %d, expected %d) - run 'rebuild'", |
| 11613 iVersion, FTS5_CURRENT_VERSION |
| 11614 ); |
| 11615 } |
| 11616 } |
| 11617 |
| 11618 if( rc==SQLITE_OK ){ |
| 11619 pConfig->iCookie = iCookie; |
| 11620 } |
| 11621 return rc; |
| 11622 } |
| 11623 |
| 11624 |
| 11625 /* |
| 11626 ** 2014 May 31 |
| 11627 ** |
| 11628 ** The author disclaims copyright to this source code. In place of |
| 11629 ** a legal notice, here is a blessing: |
| 11630 ** |
| 11631 ** May you do good and not evil. |
| 11632 ** May you find forgiveness for yourself and forgive others. |
| 11633 ** May you share freely, never taking more than you give. |
| 11634 ** |
| 11635 ****************************************************************************** |
| 11636 ** |
| 11637 */ |
| 11638 |
| 11639 |
| 11640 |
| 11641 /* #include "fts5Int.h" */ |
| 11642 /* #include "fts5parse.h" */ |
| 11643 |
| 11644 /* |
| 11645 ** All token types in the generated fts5parse.h file are greater than 0. |
| 11646 */ |
| 11647 #define FTS5_EOF 0 |
| 11648 |
| 11649 #define FTS5_LARGEST_INT64 (0xffffffff|(((i64)0x7fffffff)<<32)) |
| 11650 |
| 11651 typedef struct Fts5ExprTerm Fts5ExprTerm; |
| 11652 |
| 11653 /* |
| 11654 ** Functions generated by lemon from fts5parse.y. |
| 11655 */ |
| 11656 static void *sqlite3Fts5ParserAlloc(void *(*mallocProc)(u64)); |
| 11657 static void sqlite3Fts5ParserFree(void*, void (*freeProc)(void*)); |
| 11658 static void sqlite3Fts5Parser(void*, int, Fts5Token, Fts5Parse*); |
| 11659 #ifndef NDEBUG |
| 11660 /* #include <stdio.h> */ |
| 11661 static void sqlite3Fts5ParserTrace(FILE*, char*); |
| 11662 #endif |
| 11663 |
| 11664 |
| 11665 struct Fts5Expr { |
| 11666 Fts5Index *pIndex; |
| 11667 Fts5ExprNode *pRoot; |
| 11668 int bDesc; /* Iterate in descending rowid order */ |
| 11669 int nPhrase; /* Number of phrases in expression */ |
| 11670 Fts5ExprPhrase **apExprPhrase; /* Pointers to phrase objects */ |
| 11671 }; |
| 11672 |
| 11673 /* |
| 11674 ** eType: |
| 11675 ** Expression node type. Always one of: |
| 11676 ** |
| 11677 ** FTS5_AND (nChild, apChild valid) |
| 11678 ** FTS5_OR (nChild, apChild valid) |
| 11679 ** FTS5_NOT (nChild, apChild valid) |
| 11680 ** FTS5_STRING (pNear valid) |
| 11681 ** FTS5_TERM (pNear valid) |
| 11682 */ |
| 11683 struct Fts5ExprNode { |
| 11684 int eType; /* Node type */ |
| 11685 int bEof; /* True at EOF */ |
| 11686 int bNomatch; /* True if entry is not a match */ |
| 11687 |
| 11688 i64 iRowid; /* Current rowid */ |
| 11689 Fts5ExprNearset *pNear; /* For FTS5_STRING - cluster of phrases */ |
| 11690 |
| 11691 /* Child nodes. For a NOT node, this array always contains 2 entries. For |
| 11692 ** AND or OR nodes, it contains 2 or more entries. */ |
| 11693 int nChild; /* Number of child nodes */ |
| 11694 Fts5ExprNode *apChild[1]; /* Array of child nodes */ |
| 11695 }; |
| 11696 |
| 11697 #define Fts5NodeIsString(p) ((p)->eType==FTS5_TERM || (p)->eType==FTS5_STRING) |
| 11698 |
| 11699 /* |
| 11700 ** An instance of the following structure represents a single search term |
| 11701 ** or term prefix. |
| 11702 */ |
| 11703 struct Fts5ExprTerm { |
| 11704 int bPrefix; /* True for a prefix term */ |
| 11705 char *zTerm; /* nul-terminated term */ |
| 11706 Fts5IndexIter *pIter; /* Iterator for this term */ |
| 11707 Fts5ExprTerm *pSynonym; /* Pointer to first in list of synonyms */ |
| 11708 }; |
| 11709 |
| 11710 /* |
| 11711 ** A phrase. One or more terms that must appear in a contiguous sequence |
| 11712 ** within a document for it to match. |
| 11713 */ |
| 11714 struct Fts5ExprPhrase { |
| 11715 Fts5ExprNode *pNode; /* FTS5_STRING node this phrase is part of */ |
| 11716 Fts5Buffer poslist; /* Current position list */ |
| 11717 int nTerm; /* Number of entries in aTerm[] */ |
| 11718 Fts5ExprTerm aTerm[1]; /* Terms that make up this phrase */ |
| 11719 }; |
| 11720 |
| 11721 /* |
| 11722 ** One or more phrases that must appear within a certain token distance of |
| 11723 ** each other within each matching document. |
| 11724 */ |
| 11725 struct Fts5ExprNearset { |
| 11726 int nNear; /* NEAR parameter */ |
| 11727 Fts5Colset *pColset; /* Columns to search (NULL -> all columns) */ |
| 11728 int nPhrase; /* Number of entries in aPhrase[] array */ |
| 11729 Fts5ExprPhrase *apPhrase[1]; /* Array of phrase pointers */ |
| 11730 }; |
| 11731 |
| 11732 |
| 11733 /* |
| 11734 ** Parse context. |
| 11735 */ |
| 11736 struct Fts5Parse { |
| 11737 Fts5Config *pConfig; |
| 11738 char *zErr; |
| 11739 int rc; |
| 11740 int nPhrase; /* Size of apPhrase array */ |
| 11741 Fts5ExprPhrase **apPhrase; /* Array of all phrases */ |
| 11742 Fts5ExprNode *pExpr; /* Result of a successful parse */ |
| 11743 }; |
| 11744 |
| 11745 static void sqlite3Fts5ParseError(Fts5Parse *pParse, const char *zFmt, ...){ |
| 11746 va_list ap; |
| 11747 va_start(ap, zFmt); |
| 11748 if( pParse->rc==SQLITE_OK ){ |
| 11749 pParse->zErr = sqlite3_vmprintf(zFmt, ap); |
| 11750 pParse->rc = SQLITE_ERROR; |
| 11751 } |
| 11752 va_end(ap); |
| 11753 } |
| 11754 |
| 11755 static int fts5ExprIsspace(char t){ |
| 11756 return t==' ' || t=='\t' || t=='\n' || t=='\r'; |
| 11757 } |
| 11758 |
| 11759 /* |
| 11760 ** Read the first token from the nul-terminated string at *pz. |
| 11761 */ |
| 11762 static int fts5ExprGetToken( |
| 11763 Fts5Parse *pParse, |
| 11764 const char **pz, /* IN/OUT: Pointer into buffer */ |
| 11765 Fts5Token *pToken |
| 11766 ){ |
| 11767 const char *z = *pz; |
| 11768 int tok; |
| 11769 |
| 11770 /* Skip past any whitespace */ |
| 11771 while( fts5ExprIsspace(*z) ) z++; |
| 11772 |
| 11773 pToken->p = z; |
| 11774 pToken->n = 1; |
| 11775 switch( *z ){ |
| 11776 case '(': tok = FTS5_LP; break; |
| 11777 case ')': tok = FTS5_RP; break; |
| 11778 case '{': tok = FTS5_LCP; break; |
| 11779 case '}': tok = FTS5_RCP; break; |
| 11780 case ':': tok = FTS5_COLON; break; |
| 11781 case ',': tok = FTS5_COMMA; break; |
| 11782 case '+': tok = FTS5_PLUS; break; |
| 11783 case '*': tok = FTS5_STAR; break; |
| 11784 case '\0': tok = FTS5_EOF; break; |
| 11785 |
| 11786 case '"': { |
| 11787 const char *z2; |
| 11788 tok = FTS5_STRING; |
| 11789 |
| 11790 for(z2=&z[1]; 1; z2++){ |
| 11791 if( z2[0]=='"' ){ |
| 11792 z2++; |
| 11793 if( z2[0]!='"' ) break; |
| 11794 } |
| 11795 if( z2[0]=='\0' ){ |
| 11796 sqlite3Fts5ParseError(pParse, "unterminated string"); |
| 11797 return FTS5_EOF; |
| 11798 } |
| 11799 } |
| 11800 pToken->n = (z2 - z); |
| 11801 break; |
| 11802 } |
| 11803 |
| 11804 default: { |
| 11805 const char *z2; |
| 11806 if( sqlite3Fts5IsBareword(z[0])==0 ){ |
| 11807 sqlite3Fts5ParseError(pParse, "fts5: syntax error near \"%.1s\"", z); |
| 11808 return FTS5_EOF; |
| 11809 } |
| 11810 tok = FTS5_STRING; |
| 11811 for(z2=&z[1]; sqlite3Fts5IsBareword(*z2); z2++); |
| 11812 pToken->n = (z2 - z); |
| 11813 if( pToken->n==2 && memcmp(pToken->p, "OR", 2)==0 ) tok = FTS5_OR; |
| 11814 if( pToken->n==3 && memcmp(pToken->p, "NOT", 3)==0 ) tok = FTS5_NOT; |
| 11815 if( pToken->n==3 && memcmp(pToken->p, "AND", 3)==0 ) tok = FTS5_AND; |
| 11816 break; |
| 11817 } |
| 11818 } |
| 11819 |
| 11820 *pz = &pToken->p[pToken->n]; |
| 11821 return tok; |
| 11822 } |
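The quoted-string branch of the tokenizer above treats a doubled quote (`""`) as an escaped quote and reports an error if the input ends first. The scanning loop can be isolated as follows. This is a minimal sketch: `quoted_token_len` is a hypothetical helper that is not part of FTS5, and it returns -1 where the real code records "unterminated string" in the parse context.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of FTS5): return the length in bytes of
** the quoted string that starts at z, including both quote characters,
** or -1 if the string is unterminated. */
static int quoted_token_len(const char *z){
  const char *z2;
  assert( z[0]=='"' );
  for(z2=&z[1]; 1; z2++){
    if( z2[0]=='"' ){
      z2++;
      if( z2[0]!='"' ) break;     /* Closing quote; z2 is one past it */
    }
    if( z2[0]=='\0' ) return -1;  /* Input ended inside the string */
  }
  return (int)(z2 - z);
}
```

For the input `"a""b" c` the function returns 6, which covers `"a""b"` and leaves the scan position on the space that follows.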
| 11823 |
| 11824 static void *fts5ParseAlloc(u64 t){ return sqlite3_malloc((int)t); } |
| 11825 static void fts5ParseFree(void *p){ sqlite3_free(p); } |
| 11826 |
| 11827 static int sqlite3Fts5ExprNew( |
| 11828 Fts5Config *pConfig, /* FTS5 Configuration */ |
| 11829 const char *zExpr, /* Expression text */ |
| 11830 Fts5Expr **ppNew, |
| 11831 char **pzErr |
| 11832 ){ |
| 11833 Fts5Parse sParse; |
| 11834 Fts5Token token; |
| 11835 const char *z = zExpr; |
| 11836 int t; /* Next token type */ |
| 11837 void *pEngine; |
| 11838 Fts5Expr *pNew; |
| 11839 |
| 11840 *ppNew = 0; |
| 11841 *pzErr = 0; |
| 11842 memset(&sParse, 0, sizeof(sParse)); |
| 11843 pEngine = sqlite3Fts5ParserAlloc(fts5ParseAlloc); |
| 11844 if( pEngine==0 ){ return SQLITE_NOMEM; } |
| 11845 sParse.pConfig = pConfig; |
| 11846 |
| 11847 do { |
| 11848 t = fts5ExprGetToken(&sParse, &z, &token); |
| 11849 sqlite3Fts5Parser(pEngine, t, token, &sParse); |
| 11850 }while( sParse.rc==SQLITE_OK && t!=FTS5_EOF ); |
| 11851 sqlite3Fts5ParserFree(pEngine, fts5ParseFree); |
| 11852 |
| 11853 assert( sParse.rc!=SQLITE_OK || sParse.zErr==0 ); |
| 11854 if( sParse.rc==SQLITE_OK ){ |
| 11855 *ppNew = pNew = sqlite3_malloc(sizeof(Fts5Expr)); |
| 11856 if( pNew==0 ){ |
| 11857 sParse.rc = SQLITE_NOMEM; |
| 11858 sqlite3Fts5ParseNodeFree(sParse.pExpr); |
| 11859 }else{ |
| 11860 pNew->pRoot = sParse.pExpr; |
| 11861 pNew->pIndex = 0; |
| 11862 pNew->apExprPhrase = sParse.apPhrase; |
| 11863 pNew->nPhrase = sParse.nPhrase; |
| 11864 sParse.apPhrase = 0; |
| 11865 } |
| 11866 } |
| 11867 |
| 11868 sqlite3_free(sParse.apPhrase); |
| 11869 *pzErr = sParse.zErr; |
| 11870 return sParse.rc; |
| 11871 } |
| 11872 |
| 11873 /* |
| 11874 ** Free the expression node object passed as the only argument. |
| 11875 */ |
| 11876 static void sqlite3Fts5ParseNodeFree(Fts5ExprNode *p){ |
| 11877 if( p ){ |
| 11878 int i; |
| 11879 for(i=0; i<p->nChild; i++){ |
| 11880 sqlite3Fts5ParseNodeFree(p->apChild[i]); |
| 11881 } |
| 11882 sqlite3Fts5ParseNearsetFree(p->pNear); |
| 11883 sqlite3_free(p); |
| 11884 } |
| 11885 } |
| 11886 |
| 11887 /* |
| 11888 ** Free the expression object passed as the only argument. |
| 11889 */ |
| 11890 static void sqlite3Fts5ExprFree(Fts5Expr *p){ |
| 11891 if( p ){ |
| 11892 sqlite3Fts5ParseNodeFree(p->pRoot); |
| 11893 sqlite3_free(p->apExprPhrase); |
| 11894 sqlite3_free(p); |
| 11895 } |
| 11896 } |
| 11897 |
| 11898 /* |
| 11899 ** Argument pTerm must be a synonym iterator. Return the current rowid |
| 11900 ** that it points to. |
| 11901 */ |
| 11902 static i64 fts5ExprSynonymRowid(Fts5ExprTerm *pTerm, int bDesc, int *pbEof){ |
| 11903 i64 iRet = 0; |
| 11904 int bRetValid = 0; |
| 11905 Fts5ExprTerm *p; |
| 11906 |
| 11907 assert( pTerm->pSynonym ); |
| 11908 assert( bDesc==0 || bDesc==1 ); |
| 11909 for(p=pTerm; p; p=p->pSynonym){ |
| 11910 if( 0==sqlite3Fts5IterEof(p->pIter) ){ |
| 11911 i64 iRowid = sqlite3Fts5IterRowid(p->pIter); |
| 11912 if( bRetValid==0 || (bDesc!=(iRowid<iRet)) ){ |
| 11913 iRet = iRowid; |
| 11914 bRetValid = 1; |
| 11915 } |
| 11916 } |
| 11917 } |
| 11918 |
| 11919 if( pbEof && bRetValid==0 ) *pbEof = 1; |
| 11920 return iRet; |
| 11921 } |
| 11922 |
| 11923 /* |
| 11924 ** Argument pTerm must be a synonym iterator. |
| 11925 */ |
| 11926 static int fts5ExprSynonymPoslist( |
| 11927 Fts5ExprTerm *pTerm, |
| 11928 Fts5Colset *pColset, |
| 11929 i64 iRowid, |
| 11930 int *pbDel, /* OUT: Caller should sqlite3_free(*pa) */ |
| 11931 u8 **pa, int *pn |
| 11932 ){ |
| 11933 Fts5PoslistReader aStatic[4]; |
| 11934 Fts5PoslistReader *aIter = aStatic; |
| 11935 int nIter = 0; |
| 11936 int nAlloc = 4; |
| 11937 int rc = SQLITE_OK; |
| 11938 Fts5ExprTerm *p; |
| 11939 |
| 11940 assert( pTerm->pSynonym ); |
| 11941 for(p=pTerm; p; p=p->pSynonym){ |
| 11942 Fts5IndexIter *pIter = p->pIter; |
| 11943 if( sqlite3Fts5IterEof(pIter)==0 && sqlite3Fts5IterRowid(pIter)==iRowid ){ |
| 11944 const u8 *a; |
| 11945 int n; |
| 11946 i64 dummy; |
| 11947 rc = sqlite3Fts5IterPoslist(pIter, pColset, &a, &n, &dummy); |
| 11948 if( rc!=SQLITE_OK ) goto synonym_poslist_out; |
| 11949 if( nIter==nAlloc ){ |
| 11950 int nByte = sizeof(Fts5PoslistReader) * nAlloc * 2; |
| 11951 Fts5PoslistReader *aNew = (Fts5PoslistReader*)sqlite3_malloc(nByte); |
| 11952 if( aNew==0 ){ |
| 11953 rc = SQLITE_NOMEM; |
| 11954 goto synonym_poslist_out; |
| 11955 } |
| 11956 memcpy(aNew, aIter, sizeof(Fts5PoslistReader) * nIter); |
| 11957 nAlloc = nAlloc*2; |
| 11958 if( aIter!=aStatic ) sqlite3_free(aIter); |
| 11959 aIter = aNew; |
| 11960 } |
| 11961 sqlite3Fts5PoslistReaderInit(a, n, &aIter[nIter]); |
| 11962 assert( aIter[nIter].bEof==0 ); |
| 11963 nIter++; |
| 11964 } |
| 11965 } |
| 11966 |
| 11967 assert( *pbDel==0 ); |
| 11968 if( nIter==1 ){ |
| 11969 *pa = (u8*)aIter[0].a; |
| 11970 *pn = aIter[0].n; |
| 11971 }else{ |
| 11972 Fts5PoslistWriter writer = {0}; |
| 11973 Fts5Buffer buf = {0,0,0}; |
| 11974 i64 iPrev = -1; |
| 11975 while( 1 ){ |
| 11976 int i; |
| 11977 i64 iMin = FTS5_LARGEST_INT64; |
| 11978 for(i=0; i<nIter; i++){ |
| 11979 if( aIter[i].bEof==0 ){ |
| 11980 if( aIter[i].iPos==iPrev ){ |
| 11981 if( sqlite3Fts5PoslistReaderNext(&aIter[i]) ) continue; |
| 11982 } |
| 11983 if( aIter[i].iPos<iMin ){ |
| 11984 iMin = aIter[i].iPos; |
| 11985 } |
| 11986 } |
| 11987 } |
| 11988 if( iMin==FTS5_LARGEST_INT64 || rc!=SQLITE_OK ) break; |
| 11989 rc = sqlite3Fts5PoslistWriterAppend(&buf, &writer, iMin); |
| 11990 iPrev = iMin; |
| 11991 } |
| 11992 if( rc ){ |
| 11993 sqlite3_free(buf.p); |
| 11994 }else{ |
| 11995 *pa = buf.p; |
| 11996 *pn = buf.n; |
| 11997 *pbDel = 1; |
| 11998 } |
| 11999 } |
| 12000 |
| 12001 synonym_poslist_out: |
| 12002 if( aIter!=aStatic ) sqlite3_free(aIter); |
| 12003 return rc; |
| 12004 } |
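When more than one synonym matches the row, the function above merges their position lists into a single sorted list with duplicates removed. The sketch below shows the same idea on plain `int` arrays. It is an illustration under stated assumptions, not the FTS5 code: `merge_poslists` is a hypothetical name, positions are assumed to be smaller than `INT_MAX`, at most 8 lists are supported, and the duplicate handling is restructured slightly (every list sitting at the minimum is advanced in the same pass, rather than being skipped on the next pass via `iPrev`).

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical sketch: merge nList sorted position arrays into aOut,
** dropping duplicates. aOut must have room for the sum of all aN[i].
** Returns the number of positions written. */
static int merge_poslists(
  int nList, const int **aList, const int *aN, int *aOut
){
  int aIdx[8] = {0};              /* Read offset into each input list */
  int nOut = 0;
  assert( nList>0 && nList<=8 );
  while( 1 ){
    int i;
    int iMin = INT_MAX;
    /* Find the smallest position any list currently points to */
    for(i=0; i<nList; i++){
      if( aIdx[i]<aN[i] && aList[i][aIdx[i]]<iMin ) iMin = aList[i][aIdx[i]];
    }
    if( iMin==INT_MAX ) break;    /* Every list is exhausted */
    aOut[nOut++] = iMin;
    /* Step past iMin in every list that contains it */
    for(i=0; i<nList; i++){
      if( aIdx[i]<aN[i] && aList[i][aIdx[i]]==iMin ) aIdx[i]++;
    }
  }
  return nOut;
}
```

Merging `{1,4,9}` with `{2,4,10}` yields `{1,2,4,9,10}`.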
| 12005 |
| 12006 |
| 12007 /* |
| 12008 ** All individual term iterators in pPhrase are guaranteed to be valid and |
| 12009 ** pointing to the same rowid when this function is called. This function |
| 12010 ** checks if the current rowid really is a match, and if so populates |
| 12011 ** the pPhrase->poslist buffer accordingly. Output parameter *pbMatch |
| 12012 ** is set to true if this is really a match, or false otherwise. |
| 12013 ** |
| 12014 ** SQLITE_OK is returned if successful, or an SQLite error code if an |
| 12015 ** error occurs. It is not considered an error if the current rowid is |
| 12016 ** not a match. |
| 12017 */ |
| 12018 static int fts5ExprPhraseIsMatch( |
| 12019 Fts5ExprNode *pNode, /* Node pPhrase belongs to */ |
| 12020 Fts5Colset *pColset, /* Restrict matches to these columns */ |
| 12021 Fts5ExprPhrase *pPhrase, /* Phrase object to initialize */ |
| 12022 int *pbMatch /* OUT: Set to true if really a match */ |
| 12023 ){ |
| 12024 Fts5PoslistWriter writer = {0}; |
| 12025 Fts5PoslistReader aStatic[4]; |
| 12026 Fts5PoslistReader *aIter = aStatic; |
| 12027 int i; |
| 12028 int rc = SQLITE_OK; |
| 12029 |
| 12030 fts5BufferZero(&pPhrase->poslist); |
| 12031 |
| 12032 /* If the aStatic[] array is not large enough, allocate a larger array |
| 12033 ** using sqlite3_malloc(). This approach could be improved upon. */ |
| 12034 if( pPhrase->nTerm>(int)ArraySize(aStatic) ){ |
| 12035 int nByte = sizeof(Fts5PoslistReader) * pPhrase->nTerm; |
| 12036 aIter = (Fts5PoslistReader*)sqlite3_malloc(nByte); |
| 12037 if( !aIter ) return SQLITE_NOMEM; |
| 12038 } |
| 12039 memset(aIter, 0, sizeof(Fts5PoslistReader) * pPhrase->nTerm); |
| 12040 |
| 12041 /* Initialize a term iterator for each term in the phrase */ |
| 12042 for(i=0; i<pPhrase->nTerm; i++){ |
| 12043 Fts5ExprTerm *pTerm = &pPhrase->aTerm[i]; |
| 12044 i64 dummy; |
| 12045 int n = 0; |
| 12046 int bFlag = 0; |
| 12047 const u8 *a = 0; |
| 12048 if( pTerm->pSynonym ){ |
| 12049 rc = fts5ExprSynonymPoslist( |
| 12050 pTerm, pColset, pNode->iRowid, &bFlag, (u8**)&a, &n |
| 12051 ); |
| 12052 }else{ |
| 12053 rc = sqlite3Fts5IterPoslist(pTerm->pIter, pColset, &a, &n, &dummy); |
| 12054 } |
| 12055 if( rc!=SQLITE_OK ) goto ismatch_out; |
| 12056 sqlite3Fts5PoslistReaderInit(a, n, &aIter[i]); |
| 12057 aIter[i].bFlag = (u8)bFlag; |
| 12058 if( aIter[i].bEof ) goto ismatch_out; |
| 12059 } |
| 12060 |
| 12061 while( 1 ){ |
| 12062 int bMatch; |
| 12063 i64 iPos = aIter[0].iPos; |
| 12064 do { |
| 12065 bMatch = 1; |
| 12066 for(i=0; i<pPhrase->nTerm; i++){ |
| 12067 Fts5PoslistReader *pPos = &aIter[i]; |
| 12068 i64 iAdj = iPos + i; |
| 12069 if( pPos->iPos!=iAdj ){ |
| 12070 bMatch = 0; |
| 12071 while( pPos->iPos<iAdj ){ |
| 12072 if( sqlite3Fts5PoslistReaderNext(pPos) ) goto ismatch_out; |
| 12073 } |
| 12074 if( pPos->iPos>iAdj ) iPos = pPos->iPos-i; |
| 12075 } |
| 12076 } |
| 12077 }while( bMatch==0 ); |
| 12078 |
| 12079 /* Append position iPos to the output */ |
| 12080 rc = sqlite3Fts5PoslistWriterAppend(&pPhrase->poslist, &writer, iPos); |
| 12081 if( rc!=SQLITE_OK ) goto ismatch_out; |
| 12082 |
| 12083 for(i=0; i<pPhrase->nTerm; i++){ |
| 12084 if( sqlite3Fts5PoslistReaderNext(&aIter[i]) ) goto ismatch_out; |
| 12085 } |
| 12086 } |
| 12087 |
| 12088 ismatch_out: |
| 12089 *pbMatch = (pPhrase->poslist.n>0); |
| 12090 for(i=0; i<pPhrase->nTerm; i++){ |
| 12091 if( aIter[i].bFlag ) sqlite3_free((u8*)aIter[i].a); |
| 12092 } |
| 12093 if( aIter!=aStatic ) sqlite3_free(aIter); |
| 12094 return rc; |
| 12095 } |
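The matching loop above looks for a position `iPos` such that term `i` of the phrase occurs at `iPos+i` for every `i`. The sketch below reproduces that loop on plain sorted `int` arrays instead of `Fts5PoslistReader` objects. It rests on assumptions that are not in the source: `phrase_match` is a hypothetical name, phrases have at most 8 terms, and the caller supplies an output array at least as long as the first term's list.

```c
#include <assert.h>

/* Hypothetical sketch: aPos[i] is the sorted position list for term i and
** aN[i] its length. Write the start position of each phrase match to aOut
** and return the number of matches found. */
static int phrase_match(
  int nTerm, const int **aPos, const int *aN, int *aOut
){
  int aIdx[8] = {0};              /* Read offset into each term's list */
  int nOut = 0;
  int i;
  assert( nTerm>0 && nTerm<=8 );
  for(i=0; i<nTerm; i++){
    if( aN[i]==0 ) return 0;      /* A term with no positions: no match */
  }
  while( 1 ){
    int bMatch;
    int iPos = aPos[0][aIdx[0]];
    do{
      bMatch = 1;
      for(i=0; i<nTerm; i++){
        int iAdj = iPos + i;      /* Where term i must appear */
        if( aPos[i][aIdx[i]]!=iAdj ){
          bMatch = 0;
          while( aPos[i][aIdx[i]]<iAdj ){
            if( ++aIdx[i]>=aN[i] ) return nOut;
          }
          /* Term i overshot: move the candidate start forward */
          if( aPos[i][aIdx[i]]>iAdj ) iPos = aPos[i][aIdx[i]] - i;
        }
      }
    }while( bMatch==0 );
    aOut[nOut++] = iPos;
    for(i=0; i<nTerm; i++){
      if( ++aIdx[i]>=aN[i] ) return nOut;
    }
  }
}
```

With the first term at positions `{0,3,7}` and the second at `{1,5,8}`, the phrase matches at start positions 0 and 7.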
| 12096 |
| 12097 typedef struct Fts5LookaheadReader Fts5LookaheadReader; |
| 12098 struct Fts5LookaheadReader { |
| 12099 const u8 *a; /* Buffer containing position list */ |
| 12100 int n; /* Size of buffer a[] in bytes */ |
| 12101 int i; /* Current offset in position list */ |
| 12102 i64 iPos; /* Current position */ |
| 12103 i64 iLookahead; /* Next position */ |
| 12104 }; |
| 12105 |
| 12106 #define FTS5_LOOKAHEAD_EOF (((i64)1) << 62) |
| 12107 |
| 12108 static int fts5LookaheadReaderNext(Fts5LookaheadReader *p){ |
| 12109 p->iPos = p->iLookahead; |
| 12110 if( sqlite3Fts5PoslistNext64(p->a, p->n, &p->i, &p->iLookahead) ){ |
| 12111 p->iLookahead = FTS5_LOOKAHEAD_EOF; |
| 12112 } |
| 12113 return (p->iPos==FTS5_LOOKAHEAD_EOF); |
| 12114 } |
| 12115 |
| 12116 static int fts5LookaheadReaderInit( |
| 12117 const u8 *a, int n, /* Buffer to read position list from */ |
| 12118 Fts5LookaheadReader *p /* Iterator object to initialize */ |
| 12119 ){ |
| 12120 memset(p, 0, sizeof(Fts5LookaheadReader)); |
| 12121 p->a = a; |
| 12122 p->n = n; |
| 12123 fts5LookaheadReaderNext(p); |
| 12124 return fts5LookaheadReaderNext(p); |
| 12125 } |
| 12126 |
| 12127 #if 0 |
| 12128 static int fts5LookaheadReaderEof(Fts5LookaheadReader *p){ |
| 12129 return (p->iPos==FTS5_LOOKAHEAD_EOF); |
| 12130 } |
| 12131 #endif |
| 12132 |
| 12133 typedef struct Fts5NearTrimmer Fts5NearTrimmer; |
| 12134 struct Fts5NearTrimmer { |
| 12135 Fts5LookaheadReader reader; /* Input iterator */ |
| 12136 Fts5PoslistWriter writer; /* Writer context */ |
| 12137 Fts5Buffer *pOut; /* Output poslist */ |
| 12138 }; |
| 12139 |
| 12140 /* |
| 12141 ** The near-set object passed as the first argument contains more than |
| 12142 ** one phrase. All phrases currently point to the same row. The |
| 12143 ** Fts5ExprPhrase.poslist buffers are populated accordingly. This function |
| 12144 ** tests if the current row contains instances of each phrase sufficiently |
| 12145 ** close together to meet the NEAR constraint. Non-zero is returned if it |
| 12146 ** does, or zero otherwise. |
| 12147 ** |
| 12148 ** If in/out parameter (*pRc) is set to other than SQLITE_OK when this |
| 12149 ** function is called, it is a no-op. Or, if an error (e.g. SQLITE_NOMEM) |
| 12150 ** occurs within this function (*pRc) is set accordingly before returning. |
| 12151 ** The return value is undefined in both these cases. |
| 12152 ** |
| 12153 ** If no error occurs and non-zero (a match) is returned, the position-list |
| 12154 ** of each phrase object is edited to contain only those entries that |
| 12155 ** meet the constraint before returning. |
| 12156 */ |
| 12157 static int fts5ExprNearIsMatch(int *pRc, Fts5ExprNearset *pNear){ |
| 12158 Fts5NearTrimmer aStatic[4]; |
| 12159 Fts5NearTrimmer *a = aStatic; |
| 12160 Fts5ExprPhrase **apPhrase = pNear->apPhrase; |
| 12161 |
| 12162 int i; |
| 12163 int rc = *pRc; |
| 12164 int bMatch; |
| 12165 |
| 12166 assert( pNear->nPhrase>1 ); |
| 12167 |
| 12168 /* If the aStatic[] array is not large enough, allocate a larger array |
| 12169 ** using sqlite3_malloc(). This approach could be improved upon. */ |
| 12170 if( pNear->nPhrase>(int)ArraySize(aStatic) ){ |
| 12171 int nByte = sizeof(Fts5NearTrimmer) * pNear->nPhrase; |
| 12172 a = (Fts5NearTrimmer*)sqlite3Fts5MallocZero(&rc, nByte); |
| 12173 }else{ |
| 12174 memset(aStatic, 0, sizeof(aStatic)); |
| 12175 } |
| 12176 if( rc!=SQLITE_OK ){ |
| 12177 *pRc = rc; |
| 12178 return 0; |
| 12179 } |
| 12180 |
| 12181 /* Initialize a lookahead iterator for each phrase. After passing the |
| 12182 ** buffer and buffer size to the lookahead-reader init function, zero |
| 12183 ** the phrase poslist buffer. The new poslist for the phrase (containing |
| 12184 ** the same entries as the original with some entries removed on account |
| 12185 ** of the NEAR constraint) is written over the original even as it is |
| 12186 ** being read. This is safe as the entries for the new poslist are a |
| 12187 ** subset of the old, so it is not possible for data yet to be read to |
| 12188 ** be overwritten. */ |
| 12189 for(i=0; i<pNear->nPhrase; i++){ |
| 12190 Fts5Buffer *pPoslist = &apPhrase[i]->poslist; |
| 12191 fts5LookaheadReaderInit(pPoslist->p, pPoslist->n, &a[i].reader); |
| 12192 pPoslist->n = 0; |
| 12193 a[i].pOut = pPoslist; |
| 12194 } |
| 12195 |
| 12196 while( 1 ){ |
| 12197 int iAdv; |
| 12198 i64 iMin; |
| 12199 i64 iMax; |
| 12200 |
| 12201 /* This block advances the phrase iterators until they point to a set of |
| 12202 ** entries that together comprise a match. */ |
| 12203 iMax = a[0].reader.iPos; |
| 12204 do { |
| 12205 bMatch = 1; |
| 12206 for(i=0; i<pNear->nPhrase; i++){ |
| 12207 Fts5LookaheadReader *pPos = &a[i].reader; |
| 12208 iMin = iMax - pNear->apPhrase[i]->nTerm - pNear->nNear; |
| 12209 if( pPos->iPos<iMin || pPos->iPos>iMax ){ |
| 12210 bMatch = 0; |
| 12211 while( pPos->iPos<iMin ){ |
| 12212 if( fts5LookaheadReaderNext(pPos) ) goto ismatch_out; |
| 12213 } |
| 12214 if( pPos->iPos>iMax ) iMax = pPos->iPos; |
| 12215 } |
| 12216 } |
| 12217 }while( bMatch==0 ); |
| 12218 |
| 12219 /* Add an entry to each output position list */ |
| 12220 for(i=0; i<pNear->nPhrase; i++){ |
| 12221 i64 iPos = a[i].reader.iPos; |
| 12222 Fts5PoslistWriter *pWriter = &a[i].writer; |
| 12223 if( a[i].pOut->n==0 || iPos!=pWriter->iPrev ){ |
| 12224 sqlite3Fts5PoslistWriterAppend(a[i].pOut, pWriter, iPos); |
| 12225 } |
| 12226 } |
| 12227 |
| 12228 iAdv = 0; |
| 12229 iMin = a[0].reader.iLookahead; |
| 12230 for(i=0; i<pNear->nPhrase; i++){ |
| 12231 if( a[i].reader.iLookahead < iMin ){ |
| 12232 iMin = a[i].reader.iLookahead; |
| 12233 iAdv = i; |
| 12234 } |
| 12235 } |
| 12236 if( fts5LookaheadReaderNext(&a[iAdv].reader) ) goto ismatch_out; |
| 12237 } |
| 12238 |
| 12239 ismatch_out: { |
| 12240 int bRet = a[0].pOut->n>0; |
| 12241 *pRc = rc; |
| 12242 if( a!=aStatic ) sqlite3_free(a); |
| 12243 return bRet; |
| 12244 } |
| 12245 } |
| 12246 |
| 12247 /* |
| 12248 ** Advance the first term iterator in the first phrase of pNear. Set |
| 12249 ** pNode->bEof to true if it reaches EOF or if an error occurs. |
| 12250 ** |
| 12251 ** Return SQLITE_OK if successful, or an SQLite error code if an error |
| 12252 ** occurs. |
| 12253 */ |
| 12254 static int fts5ExprNearAdvanceFirst( |
| 12255 Fts5Expr *pExpr, /* Expression pPhrase belongs to */ |
| 12256 Fts5ExprNode *pNode, /* FTS5_STRING or FTS5_TERM node */ |
| 12257 int bFromValid, |
| 12258 i64 iFrom |
| 12259 ){ |
| 12260 Fts5ExprTerm *pTerm = &pNode->pNear->apPhrase[0]->aTerm[0]; |
| 12261 int rc = SQLITE_OK; |
| 12262 |
| 12263 if( pTerm->pSynonym ){ |
| 12264 int bEof = 1; |
| 12265 Fts5ExprTerm *p; |
| 12266 |
| 12267 /* Find the firstest rowid any synonym points to. */ |
| 12268 i64 iRowid = fts5ExprSynonymRowid(pTerm, pExpr->bDesc, 0); |
| 12269 |
| 12270 /* Advance each iterator that currently points to iRowid. Or, if iFrom |
| 12271 ** is valid - each iterator that points to a rowid before iFrom. */ |
| 12272 for(p=pTerm; p; p=p->pSynonym){ |
| 12273 if( sqlite3Fts5IterEof(p->pIter)==0 ){ |
| 12274 i64 ii = sqlite3Fts5IterRowid(p->pIter); |
| 12275 if( ii==iRowid |
| 12276 || (bFromValid && ii!=iFrom && (ii>iFrom)==pExpr->bDesc) |
| 12277 ){ |
| 12278 if( bFromValid ){ |
| 12279 rc = sqlite3Fts5IterNextFrom(p->pIter, iFrom); |
| 12280 }else{ |
| 12281 rc = sqlite3Fts5IterNext(p->pIter); |
| 12282 } |
| 12283 if( rc!=SQLITE_OK ) break; |
| 12284 if( sqlite3Fts5IterEof(p->pIter)==0 ){ |
| 12285 bEof = 0; |
| 12286 } |
| 12287 }else{ |
| 12288 bEof = 0; |
| 12289 } |
| 12290 } |
| 12291 } |
| 12292 |
| 12293 /* Set the EOF flag if either all synonym iterators are at EOF or an |
| 12294 ** error has occurred. */ |
| 12295 pNode->bEof = (rc || bEof); |
| 12296 }else{ |
| 12297 Fts5IndexIter *pIter = pTerm->pIter; |
| 12298 |
| 12299 assert( Fts5NodeIsString(pNode) ); |
| 12300 if( bFromValid ){ |
| 12301 rc = sqlite3Fts5IterNextFrom(pIter, iFrom); |
| 12302 }else{ |
| 12303 rc = sqlite3Fts5IterNext(pIter); |
| 12304 } |
| 12305 |
| 12306 pNode->bEof = (rc || sqlite3Fts5IterEof(pIter)); |
| 12307 } |
| 12308 |
| 12309 return rc; |
| 12310 } |
| 12311 |
| 12312 /* |
| 12313 ** Advance iterator pIter until it points to a value equal to or laster |
| 12314 ** than the initial value of *piLast. If this means the iterator points |
| 12315 ** to a value laster than *piLast, update *piLast to the new lastest value. |
| 12316 ** |
| 12317 ** If the iterator reaches EOF, set *pbEof to true before returning. If |
| 12318 ** an error occurs, set *pRc to an error code. If either *pbEof or *pRc |
| 12319 ** are set, return a non-zero value. Otherwise, return zero. |
| 12320 */ |
| 12321 static int fts5ExprAdvanceto( |
| 12322 Fts5IndexIter *pIter, /* Iterator to advance */ |
| 12323 int bDesc, /* True if iterator is "rowid DESC" */ |
| 12324 i64 *piLast, /* IN/OUT: Lastest rowid seen so far */ |
| 12325 int *pRc, /* OUT: Error code */ |
| 12326 int *pbEof /* OUT: Set to true if EOF */ |
| 12327 ){ |
| 12328 i64 iLast = *piLast; |
| 12329 i64 iRowid; |
| 12330 |
| 12331 iRowid = sqlite3Fts5IterRowid(pIter); |
| 12332 if( (bDesc==0 && iLast>iRowid) || (bDesc && iLast<iRowid) ){ |
| 12333 int rc = sqlite3Fts5IterNextFrom(pIter, iLast); |
| 12334 if( rc || sqlite3Fts5IterEof(pIter) ){ |
| 12335 *pRc = rc; |
| 12336 *pbEof = 1; |
| 12337 return 1; |
| 12338 } |
| 12339 iRowid = sqlite3Fts5IterRowid(pIter); |
| 12340 assert( (bDesc==0 && iRowid>=iLast) || (bDesc==1 && iRowid<=iLast) ); |
| 12341 } |
| 12342 *piLast = iRowid; |
| 12343 |
| 12344 return 0; |
| 12345 } |
| 12346 |
| 12347 static int fts5ExprSynonymAdvanceto( |
| 12348 Fts5ExprTerm *pTerm, /* Term iterator to advance */ |
| 12349 int bDesc, /* True if iterator is "rowid DESC" */ |
| 12350 i64 *piLast, /* IN/OUT: Lastest rowid seen so far */ |
| 12351 int *pRc /* OUT: Error code */ |
| 12352 ){ |
| 12353 int rc = SQLITE_OK; |
| 12354 i64 iLast = *piLast; |
| 12355 Fts5ExprTerm *p; |
| 12356 int bEof = 0; |
| 12357 |
| 12358 for(p=pTerm; rc==SQLITE_OK && p; p=p->pSynonym){ |
| 12359 if( sqlite3Fts5IterEof(p->pIter)==0 ){ |
| 12360 i64 iRowid = sqlite3Fts5IterRowid(p->pIter); |
| 12361 if( (bDesc==0 && iLast>iRowid) || (bDesc && iLast<iRowid) ){ |
| 12362 rc = sqlite3Fts5IterNextFrom(p->pIter, iLast); |
| 12363 } |
| 12364 } |
| 12365 } |
| 12366 |
| 12367 if( rc!=SQLITE_OK ){ |
| 12368 *pRc = rc; |
| 12369 bEof = 1; |
| 12370 }else{ |
| 12371 *piLast = fts5ExprSynonymRowid(pTerm, bDesc, &bEof); |
| 12372 } |
| 12373 return bEof; |
| 12374 } |
| 12375 |
| 12376 |
| 12377 static int fts5ExprNearTest( |
| 12378 int *pRc, |
| 12379 Fts5Expr *pExpr, /* Expression that pNear is a part of */ |
| 12380 Fts5ExprNode *pNode /* The "NEAR" node (FTS5_STRING) */ |
| 12381 ){ |
| 12382 Fts5ExprNearset *pNear = pNode->pNear; |
| 12383 int rc = *pRc; |
| 12384 int i; |
| 12385 |
| 12386 /* Check that each phrase in the nearset matches the current row. |
| 12387 ** Populate the pPhrase->poslist buffers at the same time. If any |
| 12388 ** phrase is not a match, break out of the loop early. */ |
| 12389 for(i=0; rc==SQLITE_OK && i<pNear->nPhrase; i++){ |
| 12390 Fts5ExprPhrase *pPhrase = pNear->apPhrase[i]; |
| 12391 if( pPhrase->nTerm>1 || pPhrase->aTerm[0].pSynonym || pNear->pColset ){ |
| 12392 int bMatch = 0; |
| 12393 rc = fts5ExprPhraseIsMatch(pNode, pNear->pColset, pPhrase, &bMatch); |
| 12394 if( bMatch==0 ) break; |
| 12395 }else{ |
| 12396 rc = sqlite3Fts5IterPoslistBuffer( |
| 12397 pPhrase->aTerm[0].pIter, &pPhrase->poslist |
| 12398 ); |
| 12399 } |
| 12400 } |
| 12401 |
| 12402 *pRc = rc; |
| 12403 if( i==pNear->nPhrase && (i==1 || fts5ExprNearIsMatch(pRc, pNear)) ){ |
| 12404 return 1; |
| 12405 } |
| 12406 |
| 12407 return 0; |
| 12408 } |
| 12409 |
| 12410 static int fts5ExprTokenTest( |
| 12411 Fts5Expr *pExpr, /* Expression that pNear is a part of */ |
| 12412 Fts5ExprNode *pNode /* The "NEAR" node (FTS5_TERM) */ |
| 12413 ){ |
| 12414 /* As this "NEAR" object is actually a single phrase that consists |
| 12415 ** of a single term only, grab pointers into the poslist managed by the |
| 12416 ** fts5_index.c iterator object. This is much faster than synthesizing |
| 12417 ** a new poslist the way we have to for more complicated phrase or NEAR |
| 12418 ** expressions. */ |
| 12419 Fts5ExprNearset *pNear = pNode->pNear; |
| 12420 Fts5ExprPhrase *pPhrase = pNear->apPhrase[0]; |
| 12421 Fts5IndexIter *pIter = pPhrase->aTerm[0].pIter; |
| 12422 Fts5Colset *pColset = pNear->pColset; |
| 12423 int rc; |
| 12424 |
| 12425 assert( pNode->eType==FTS5_TERM ); |
| 12426 assert( pNear->nPhrase==1 && pPhrase->nTerm==1 ); |
| 12427 assert( pPhrase->aTerm[0].pSynonym==0 ); |
| 12428 |
| 12429 rc = sqlite3Fts5IterPoslist(pIter, pColset, |
| 12430 (const u8**)&pPhrase->poslist.p, &pPhrase->poslist.n, &pNode->iRowid |
| 12431 ); |
| 12432 pNode->bNomatch = (pPhrase->poslist.n==0); |
| 12433 return rc; |
| 12434 } |
| 12435 |
| 12436 /* |
| 12437 ** All individual term iterators in pNear are guaranteed to be valid when |
| 12438 ** this function is called. This function checks if all term iterators |
| 12439 ** point to the same rowid, and if not, advances them until they do. |
| 12440 ** If an EOF is reached before this happens, pNode->bEof is set to true |
| 12441 ** before returning. |
| 12442 ** |
| 12443 ** SQLITE_OK is returned if successful, or an SQLite error code if an |
| 12444 ** error occurs. It is not considered an error if an iterator reaches |
| 12445 ** EOF. |
| 12446 */ |
| 12447 static int fts5ExprNearNextMatch( |
| 12448 Fts5Expr *pExpr, /* Expression pPhrase belongs to */ |
| 12449 Fts5ExprNode *pNode |
| 12450 ){ |
| 12451 Fts5ExprNearset *pNear = pNode->pNear; |
| 12452 Fts5ExprPhrase *pLeft = pNear->apPhrase[0]; |
| 12453 int rc = SQLITE_OK; |
| 12454 i64 iLast; /* Lastest rowid any iterator points to */ |
| 12455 int i, j; /* Phrase and token index, respectively */ |
| 12456 int bMatch; /* True if all terms are at the same rowid */ |
| 12457 const int bDesc = pExpr->bDesc; |
| 12458 |
| 12459 /* Check that this node should not be FTS5_TERM */ |
| 12460 assert( pNear->nPhrase>1 |
| 12461 || pNear->apPhrase[0]->nTerm>1 |
| 12462 || pNear->apPhrase[0]->aTerm[0].pSynonym |
| 12463 ); |
| 12464 |
| 12465 /* Initialize iLast, the "lastest" rowid any iterator points to. If the |
| 12466 ** iterator skips through rowids in the default ascending order, this means |
| 12467 ** the maximum rowid. Or, if the iterator is "ORDER BY rowid DESC", then it |
| 12468 ** means the minimum rowid. */ |
| 12469 if( pLeft->aTerm[0].pSynonym ){ |
| 12470 iLast = fts5ExprSynonymRowid(&pLeft->aTerm[0], bDesc, 0); |
| 12471 }else{ |
| 12472 iLast = sqlite3Fts5IterRowid(pLeft->aTerm[0].pIter); |
| 12473 } |
| 12474 |
| 12475 do { |
| 12476 bMatch = 1; |
| 12477 for(i=0; i<pNear->nPhrase; i++){ |
| 12478 Fts5ExprPhrase *pPhrase = pNear->apPhrase[i]; |
| 12479 for(j=0; j<pPhrase->nTerm; j++){ |
| 12480 Fts5ExprTerm *pTerm = &pPhrase->aTerm[j]; |
| 12481 if( pTerm->pSynonym ){ |
| 12482 i64 iRowid = fts5ExprSynonymRowid(pTerm, bDesc, 0); |
| 12483 if( iRowid==iLast ) continue; |
| 12484 bMatch = 0; |
| 12485 if( fts5ExprSynonymAdvanceto(pTerm, bDesc, &iLast, &rc) ){ |
| 12486 pNode->bEof = 1; |
| 12487 return rc; |
| 12488 } |
| 12489 }else{ |
| 12490 Fts5IndexIter *pIter = pPhrase->aTerm[j].pIter; |
| 12491 i64 iRowid = sqlite3Fts5IterRowid(pIter); |
| 12492 if( iRowid==iLast ) continue; |
| 12493 bMatch = 0; |
| 12494 if( fts5ExprAdvanceto(pIter, bDesc, &iLast, &rc, &pNode->bEof) ){ |
| 12495 return rc; |
| 12496 } |
| 12497 } |
| 12498 } |
| 12499 } |
| 12500 }while( bMatch==0 ); |
| 12501 |
| 12502 pNode->iRowid = iLast; |
| 12503 pNode->bNomatch = (0==fts5ExprNearTest(&rc, pExpr, pNode)); |
| 12504 |
| 12505 return rc; |
| 12506 } |
| 12507 |
| 12508 /* |
| 12509 ** Initialize all term iterators in the pNear object. If any term is found |
| 12510 ** to match no documents at all, return immediately without initializing any |
| 12511 ** further iterators. |
| 12512 */ |
| 12513 static int fts5ExprNearInitAll( |
| 12514 Fts5Expr *pExpr, |
| 12515 Fts5ExprNode *pNode |
| 12516 ){ |
| 12517 Fts5ExprNearset *pNear = pNode->pNear; |
| 12518 int i, j; |
| 12519 int rc = SQLITE_OK; |
| 12520 |
| 12521 for(i=0; rc==SQLITE_OK && i<pNear->nPhrase; i++){ |
| 12522 Fts5ExprPhrase *pPhrase = pNear->apPhrase[i]; |
| 12523 for(j=0; j<pPhrase->nTerm; j++){ |
| 12524 Fts5ExprTerm *pTerm = &pPhrase->aTerm[j]; |
| 12525 Fts5ExprTerm *p; |
| 12526 int bEof = 1; |
| 12527 |
| 12528 for(p=pTerm; p && rc==SQLITE_OK; p=p->pSynonym){ |
| 12529 if( p->pIter ){ |
| 12530 sqlite3Fts5IterClose(p->pIter); |
| 12531 p->pIter = 0; |
| 12532 } |
| 12533 rc = sqlite3Fts5IndexQuery( |
| 12534 pExpr->pIndex, p->zTerm, (int)strlen(p->zTerm), |
| 12535 (pTerm->bPrefix ? FTS5INDEX_QUERY_PREFIX : 0) | |
| 12536 (pExpr->bDesc ? FTS5INDEX_QUERY_DESC : 0), |
| 12537 pNear->pColset, |
| 12538 &p->pIter |
| 12539 ); |
| 12540 assert( rc==SQLITE_OK || p->pIter==0 ); |
| 12541 if( p->pIter && 0==sqlite3Fts5IterEof(p->pIter) ){ |
| 12542 bEof = 0; |
| 12543 } |
| 12544 } |
| 12545 |
| 12546 if( bEof ){ |
| 12547 pNode->bEof = 1; |
| 12548 return rc; |
| 12549 } |
| 12550 } |
| 12551 } |
| 12552 |
| 12553 return rc; |
| 12554 } |
| 12555 |
| 12556 /* fts5ExprNodeNext() calls fts5ExprNodeNextMatch(). And vice-versa. */ |
| 12557 static int fts5ExprNodeNextMatch(Fts5Expr*, Fts5ExprNode*); |
| 12558 |
| 12559 |
| 12560 /* |
| 12561 ** If pExpr is an ASC iterator, this function returns a value with the |
| 12562 ** same sign as: |
| 12563 ** |
| 12564 ** (iLhs - iRhs) |
| 12565 ** |
| 12566 ** Otherwise, if this is a DESC iterator, the opposite is returned: |
| 12567 ** |
| 12568 ** (iRhs - iLhs) |
| 12569 */ |
| 12570 static int fts5RowidCmp( |
| 12571 Fts5Expr *pExpr, |
| 12572 i64 iLhs, |
| 12573 i64 iRhs |
| 12574 ){ |
| 12575 assert( pExpr->bDesc==0 || pExpr->bDesc==1 ); |
| 12576 if( pExpr->bDesc==0 ){ |
| 12577 if( iLhs<iRhs ) return -1; |
| 12578 return (iLhs > iRhs); |
| 12579 }else{ |
| 12580 if( iLhs>iRhs ) return -1; |
| 12581 return (iLhs < iRhs); |
| 12582 } |
| 12583 } |
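The comparison above returns only the sign of the difference, flipped for DESC iterators, and never computes `iLhs - iRhs` directly. One plausible reason is that subtracting two 64-bit rowids can overflow. The standalone sketch below mirrors the same logic; `rowid_cmp` is a hypothetical name and takes the direction flag directly instead of an `Fts5Expr` pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: return -1, 0 or +1 according to whether iLhs comes
** before, at, or after iRhs in iteration order. bDesc is 0 for ascending
** rowid order and 1 for descending. */
static int rowid_cmp(int bDesc, int64_t iLhs, int64_t iRhs){
  assert( bDesc==0 || bDesc==1 );
  if( bDesc==0 ){
    if( iLhs<iRhs ) return -1;
    return (iLhs>iRhs);
  }else{
    if( iLhs>iRhs ) return -1;
    return (iLhs<iRhs);
  }
}
```

For example, `rowid_cmp(0, 1, 2)` is -1 while `rowid_cmp(1, 1, 2)` is +1, because rowid 1 is visited after rowid 2 when iterating in descending order.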
| 12584 |
| 12585 static void fts5ExprSetEof(Fts5ExprNode *pNode){ |
| 12586 int i; |
| 12587 pNode->bEof = 1; |
| 12588 for(i=0; i<pNode->nChild; i++){ |
| 12589 fts5ExprSetEof(pNode->apChild[i]); |
| 12590 } |
| 12591 } |
| 12592 |
| 12593 static void fts5ExprNodeZeroPoslist(Fts5ExprNode *pNode){ |
| 12594 if( pNode->eType==FTS5_STRING || pNode->eType==FTS5_TERM ){ |
| 12595 Fts5ExprNearset *pNear = pNode->pNear; |
| 12596 int i; |
| 12597 for(i=0; i<pNear->nPhrase; i++){ |
| 12598 Fts5ExprPhrase *pPhrase = pNear->apPhrase[i]; |
| 12599 pPhrase->poslist.n = 0; |
| 12600 } |
| 12601 }else{ |
| 12602 int i; |
| 12603 for(i=0; i<pNode->nChild; i++){ |
| 12604 fts5ExprNodeZeroPoslist(pNode->apChild[i]); |
| 12605 } |
| 12606 } |
| 12607 } |
| 12608 |
| 12609 |
| 12610 static int fts5ExprNodeNext(Fts5Expr*, Fts5ExprNode*, int, i64); |
| 12611 |
| 12612 /* |
| 12613 ** Argument pNode is an FTS5_AND node. |
| 12614 */ |
| 12615 static int fts5ExprAndNextRowid( |
| 12616 Fts5Expr *pExpr, /* Expression pPhrase belongs to */ |
| 12617 Fts5ExprNode *pAnd /* FTS5_AND node to advance */ |
| 12618 ){ |
| 12619 int iChild; |
| 12620 i64 iLast = pAnd->iRowid; |
| 12621 int rc = SQLITE_OK; |
| 12622 int bMatch; |
| 12623 |
| 12624 assert( pAnd->bEof==0 ); |
| 12625 do { |
| 12626 pAnd->bNomatch = 0; |
| 12627 bMatch = 1; |
| 12628 for(iChild=0; iChild<pAnd->nChild; iChild++){ |
| 12629 Fts5ExprNode *pChild = pAnd->apChild[iChild]; |
| 12630 if( 0 && pChild->eType==FTS5_STRING ){ |
| 12631 /* TODO */ |
| 12632 }else{ |
| 12633 int cmp = fts5RowidCmp(pExpr, iLast, pChild->iRowid); |
| 12634 if( cmp>0 ){ |
| 12635 /* Advance pChild until it points to iLast or laster */ |
| 12636 rc = fts5ExprNodeNext(pExpr, pChild, 1, iLast); |
| 12637 if( rc!=SQLITE_OK ) return rc; |
| 12638 } |
| 12639 } |
| 12640 |
| 12641 /* If the child node is now at EOF, so is the parent AND node. Otherwise, |
| 12642 ** the child node is guaranteed to have advanced at least as far as |
| 12643 ** rowid iLast. So if it is not at exactly iLast, pChild->iRowid is the |
| 12644 ** new lastest rowid seen so far. */ |
| 12645 assert( pChild->bEof || fts5RowidCmp(pExpr, iLast, pChild->iRowid)<=0 ); |
| 12646 if( pChild->bEof ){ |
| 12647 fts5ExprSetEof(pAnd); |
| 12648 bMatch = 1; |
| 12649 break; |
| 12650 }else if( iLast!=pChild->iRowid ){ |
| 12651 bMatch = 0; |
| 12652 iLast = pChild->iRowid; |
| 12653 } |
| 12654 |
| 12655 if( pChild->bNomatch ){ |
| 12656 pAnd->bNomatch = 1; |
| 12657 } |
| 12658 } |
| 12659 }while( bMatch==0 ); |
| 12660 |
| 12661 if( pAnd->bNomatch && pAnd!=pExpr->pRoot ){ |
| 12662 fts5ExprNodeZeroPoslist(pAnd); |
| 12663 } |
| 12664 pAnd->iRowid = iLast; |
| 12665 return SQLITE_OK; |
| 12666 } |
| 12667 |
| 12668 |
| 12669 /* |
| 12670 ** Compare the values currently indicated by the two nodes as follows: |
| 12671 ** |
| 12672 ** res = (*p1) - (*p2) |
| 12673 ** |
| 12674 ** Nodes that point to values that come later in the iteration order are |
| 12675 ** considered to be larger. Nodes at EOF are the largest of all. |
| 12676 ** |
| 12677 ** This means that if the iteration order is the default ASC, numerically |
| 12678 ** larger rowids are considered larger. Or, if it is DESC, numerically |
| 12679 ** smaller rowids are larger. |
| 12680 */ |
| 12681 static int fts5NodeCompare( |
| 12682 Fts5Expr *pExpr, |
| 12683 Fts5ExprNode *p1, |
| 12684 Fts5ExprNode *p2 |
| 12685 ){ |
| 12686 if( p2->bEof ) return -1; |
| 12687 if( p1->bEof ) return +1; |
| 12688 return fts5RowidCmp(pExpr, p1->iRowid, p2->iRowid); |
| 12689 } |
| 12690 |
| 12691 /* |
| 12692 ** Advance node iterator pNode, part of expression pExpr. If argument |
| 12693 ** bFromValid is zero, then pNode is advanced exactly once. Or, if argument |
| 12694 ** bFromValid is non-zero, then pNode is advanced until it is at or past |
| 12695 ** rowid value iFrom. Whether "past" means "less than" or "greater than" |
| 12696 ** depends on whether this is an ASC or DESC iterator. |
| 12697 */ |
| 12698 static int fts5ExprNodeNext( |
| 12699 Fts5Expr *pExpr, |
| 12700 Fts5ExprNode *pNode, |
| 12701 int bFromValid, |
| 12702 i64 iFrom |
| 12703 ){ |
| 12704 int rc = SQLITE_OK; |
| 12705 |
| 12706 if( pNode->bEof==0 ){ |
| 12707 switch( pNode->eType ){ |
| 12708 case FTS5_STRING: { |
| 12709 rc = fts5ExprNearAdvanceFirst(pExpr, pNode, bFromValid, iFrom); |
| 12710 break; |
| 12711 }; |
| 12712 |
| 12713 case FTS5_TERM: { |
| 12714 Fts5IndexIter *pIter = pNode->pNear->apPhrase[0]->aTerm[0].pIter; |
| 12715 if( bFromValid ){ |
| 12716 rc = sqlite3Fts5IterNextFrom(pIter, iFrom); |
| 12717 }else{ |
| 12718 rc = sqlite3Fts5IterNext(pIter); |
| 12719 } |
| 12720 if( rc==SQLITE_OK && sqlite3Fts5IterEof(pIter)==0 ){ |
| 12721 assert( rc==SQLITE_OK ); |
| 12722 rc = fts5ExprTokenTest(pExpr, pNode); |
| 12723 }else{ |
| 12724 pNode->bEof = 1; |
| 12725 } |
| 12726 return rc; |
| 12727 }; |
| 12728 |
| 12729 case FTS5_AND: { |
| 12730 Fts5ExprNode *pLeft = pNode->apChild[0]; |
| 12731 rc = fts5ExprNodeNext(pExpr, pLeft, bFromValid, iFrom); |
| 12732 break; |
| 12733 } |
| 12734 |
| 12735 case FTS5_OR: { |
| 12736 int i; |
| 12737 i64 iLast = pNode->iRowid; |
| 12738 |
| 12739 for(i=0; rc==SQLITE_OK && i<pNode->nChild; i++){ |
| 12740 Fts5ExprNode *p1 = pNode->apChild[i]; |
| 12741 assert( p1->bEof || fts5RowidCmp(pExpr, p1->iRowid, iLast)>=0 ); |
| 12742 if( p1->bEof==0 ){ |
| 12743 if( (p1->iRowid==iLast) |
| 12744 || (bFromValid && fts5RowidCmp(pExpr, p1->iRowid, iFrom)<0) |
| 12745 ){ |
| 12746 rc = fts5ExprNodeNext(pExpr, p1, bFromValid, iFrom); |
| 12747 } |
| 12748 } |
| 12749 } |
| 12750 |
| 12751 break; |
| 12752 } |
| 12753 |
| 12754 default: assert( pNode->eType==FTS5_NOT ); { |
| 12755 assert( pNode->nChild==2 ); |
| 12756 rc = fts5ExprNodeNext(pExpr, pNode->apChild[0], bFromValid, iFrom); |
| 12757 break; |
| 12758 } |
| 12759 } |
| 12760 |
| 12761 if( rc==SQLITE_OK ){ |
| 12762 rc = fts5ExprNodeNextMatch(pExpr, pNode); |
| 12763 } |
| 12764 } |
| 12765 |
| 12766 /* Assert that if bFromValid was true, either: |
| 12767 ** |
| 12768 ** a) an error occurred, or |
| 12769 ** b) the node is now at EOF, or |
| 12770 ** c) the node is now at or past rowid iFrom. |
| 12771 */ |
| 12772 assert( bFromValid==0 |
| 12773 || rc!=SQLITE_OK /* a */ |
| 12774 || pNode->bEof /* b */ |
| 12775 || pNode->iRowid==iFrom || pExpr->bDesc==(pNode->iRowid<iFrom) /* c */ |
| 12776 ); |
| 12777 |
| 12778 return rc; |
| 12779 } |
| 12780 |
| 12781 |
| 12782 /* |
| 12783 ** If pNode currently points to a match, this function returns SQLITE_OK |
| 12784 ** without modifying it. Otherwise, pNode is advanced until it does point |
| 12785 ** to a match or EOF is reached. |
| 12786 */ |
| 12787 static int fts5ExprNodeNextMatch( |
| 12788 Fts5Expr *pExpr, /* Expression of which pNode is a part */ |
| 12789 Fts5ExprNode *pNode /* Expression node to test */ |
| 12790 ){ |
| 12791 int rc = SQLITE_OK; |
| 12792 if( pNode->bEof==0 ){ |
| 12793 switch( pNode->eType ){ |
| 12794 |
| 12795 case FTS5_STRING: { |
| 12796 /* Advance the iterators until they all point to the same rowid */ |
| 12797 rc = fts5ExprNearNextMatch(pExpr, pNode); |
| 12798 break; |
| 12799 } |
| 12800 |
| 12801 case FTS5_TERM: { |
| 12802 rc = fts5ExprTokenTest(pExpr, pNode); |
| 12803 break; |
| 12804 } |
| 12805 |
| 12806 case FTS5_AND: { |
| 12807 rc = fts5ExprAndNextRowid(pExpr, pNode); |
| 12808 break; |
| 12809 } |
| 12810 |
| 12811 case FTS5_OR: { |
| 12812 Fts5ExprNode *pNext = pNode->apChild[0]; |
| 12813 int i; |
| 12814 |
| 12815 for(i=1; i<pNode->nChild; i++){ |
| 12816 Fts5ExprNode *pChild = pNode->apChild[i]; |
| 12817 int cmp = fts5NodeCompare(pExpr, pNext, pChild); |
| 12818 if( cmp>0 || (cmp==0 && pChild->bNomatch==0) ){ |
| 12819 pNext = pChild; |
| 12820 } |
| 12821 } |
| 12822 pNode->iRowid = pNext->iRowid; |
| 12823 pNode->bEof = pNext->bEof; |
| 12824 pNode->bNomatch = pNext->bNomatch; |
| 12825 break; |
| 12826 } |
| 12827 |
| 12828 default: assert( pNode->eType==FTS5_NOT ); { |
| 12829 Fts5ExprNode *p1 = pNode->apChild[0]; |
| 12830 Fts5ExprNode *p2 = pNode->apChild[1]; |
| 12831 assert( pNode->nChild==2 ); |
| 12832 |
| 12833 while( rc==SQLITE_OK && p1->bEof==0 ){ |
| 12834 int cmp = fts5NodeCompare(pExpr, p1, p2); |
| 12835 if( cmp>0 ){ |
| 12836 rc = fts5ExprNodeNext(pExpr, p2, 1, p1->iRowid); |
| 12837 cmp = fts5NodeCompare(pExpr, p1, p2); |
| 12838 } |
| 12839 assert( rc!=SQLITE_OK || cmp<=0 ); |
| 12840 if( cmp || p2->bNomatch ) break; |
| 12841 rc = fts5ExprNodeNext(pExpr, p1, 0, 0); |
| 12842 } |
| 12843 pNode->bEof = p1->bEof; |
| 12844 pNode->iRowid = p1->iRowid; |
| 12845 break; |
| 12846 } |
| 12847 } |
| 12848 } |
| 12849 return rc; |
| 12850 } |
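The FTS5_NOT branch above is, in essence, a set difference computed over two sorted streams. A minimal sketch of that idea on plain ascending arrays follows; `demoNotMerge` is a hypothetical helper, and the sketch deliberately ignores the `bNomatch` flag and descending iteration that the real code has to handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write to aOut every value of aLeft[] that does not appear in aRight[].
** Both inputs must be sorted ascending without duplicates. Returns the
** number of values written; aOut must have room for nLeft entries. */
static size_t demoNotMerge(
  const int64_t *aLeft, size_t nLeft,
  const int64_t *aRight, size_t nRight,
  int64_t *aOut
){
  size_t i = 0;      /* Cursor into aLeft (plays the role of p1) */
  size_t j = 0;      /* Cursor into aRight (plays the role of p2) */
  size_t nOut = 0;
  while( i<nLeft ){
    /* Move the right-hand cursor up to at least aLeft[i]. This is the
    ** array equivalent of advancing p2 "from" p1's current rowid. */
    while( j<nRight && aRight[j]<aLeft[i] ) j++;
    if( j>=nRight || aRight[j]!=aLeft[i] ){
      aOut[nOut++] = aLeft[i];   /* Left value survives the NOT */
    }
    i++;
  }
  return nOut;
}
```

The right-hand cursor never moves backwards, so the whole pass is linear in the combined length of the two inputs.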
| 12851 |
| 12852 |
| 12853 /* |
| 12854 ** Set node pNode, which is part of expression pExpr, to point to the first |
| 12855 ** match. If there are no matches, set the Node.bEof flag to indicate EOF. |
| 12856 ** |
| 12857 ** Return an SQLite error code if an error occurs, or SQLITE_OK otherwise. |
| 12858 ** It is not an error if there are no matches. |
| 12859 */ |
| 12860 static int fts5ExprNodeFirst(Fts5Expr *pExpr, Fts5ExprNode *pNode){ |
| 12861 int rc = SQLITE_OK; |
| 12862 pNode->bEof = 0; |
| 12863 |
| 12864 if( Fts5NodeIsString(pNode) ){ |
| 12865 /* Initialize all term iterators in the NEAR object. */ |
| 12866 rc = fts5ExprNearInitAll(pExpr, pNode); |
| 12867 }else{ |
| 12868 int i; |
| 12869 for(i=0; i<pNode->nChild && rc==SQLITE_OK; i++){ |
| 12870 rc = fts5ExprNodeFirst(pExpr, pNode->apChild[i]); |
| 12871 } |
| 12872 pNode->iRowid = pNode->apChild[0]->iRowid; |
| 12873 } |
| 12874 |
| 12875 if( rc==SQLITE_OK ){ |
| 12876 rc = fts5ExprNodeNextMatch(pExpr, pNode); |
| 12877 } |
| 12878 return rc; |
| 12879 } |
| 12880 |
| 12881 |
| 12882 /* |
| 12883 ** Begin iterating through the set of documents in index pIdx matched by |
| 12884 ** the MATCH expression passed as the first argument. If the "bDesc" |
| 12885 ** parameter is passed a non-zero value, iteration is in descending rowid |
| 12886 ** order. Or, if it is zero, in ascending order. |
| 12887 ** |
| 12888 ** If iterating in ascending rowid order (bDesc==0), the first document |
| 12889 ** visited is that with the smallest rowid that is larger than or equal |
| 12890 ** to parameter iFirst. Or, if iterating in descending order (bDesc==1), |
| 12891 ** then the first document visited must have a rowid smaller than or |
| 12892 ** equal to iFirst. |
| 12893 ** |
| 12894 ** Return SQLITE_OK if successful, or an SQLite error code otherwise. It |
| 12895 ** is not considered an error if the query does not match any documents. |
| 12896 */ |
| 12897 static int sqlite3Fts5ExprFirst(Fts5Expr *p, Fts5Index *pIdx, i64 iFirst, int bDesc){ |
| 12898 Fts5ExprNode *pRoot = p->pRoot; |
| 12899 int rc = SQLITE_OK; |
| 12900 if( pRoot ){ |
| 12901 p->pIndex = pIdx; |
| 12902 p->bDesc = bDesc; |
| 12903 rc = fts5ExprNodeFirst(p, pRoot); |
| 12904 |
| 12905 /* If not at EOF but the current rowid occurs earlier than iFirst in |
| 12906 ** the iteration order, move to document iFirst or later. */ |
| 12907 if( pRoot->bEof==0 && fts5RowidCmp(p, pRoot->iRowid, iFirst)<0 ){ |
| 12908 rc = fts5ExprNodeNext(p, pRoot, 1, iFirst); |
| 12909 } |
| 12910 |
| 12911 /* If the iterator is not at a real match, skip forward until it is. */ |
| 12912 while( pRoot->bNomatch && rc==SQLITE_OK && pRoot->bEof==0 ){ |
| 12913 rc = fts5ExprNodeNext(p, pRoot, 0, 0); |
| 12914 } |
| 12915 } |
| 12916 return rc; |
| 12917 } |
| 12918 |
| 12919 /* |
| 12920 ** Move to the next document, flagging EOF if its rowid is past iLast. |
| 12921 ** |
| 12922 ** Return SQLITE_OK if successful, or an SQLite error code otherwise. It |
| 12923 ** is not considered an error if the query does not match any documents. |
| 12924 */ |
| 12925 static int sqlite3Fts5ExprNext(Fts5Expr *p, i64 iLast){ |
| 12926 int rc; |
| 12927 Fts5ExprNode *pRoot = p->pRoot; |
| 12928 do { |
| 12929 rc = fts5ExprNodeNext(p, pRoot, 0, 0); |
| 12930 }while( pRoot->bNomatch && pRoot->bEof==0 && rc==SQLITE_OK ); |
| 12931 if( fts5RowidCmp(p, pRoot->iRowid, iLast)>0 ){ |
| 12932 pRoot->bEof = 1; |
| 12933 } |
| 12934 return rc; |
| 12935 } |
| 12936 |
| 12937 static int sqlite3Fts5ExprEof(Fts5Expr *p){ |
| 12938 return (p->pRoot==0 || p->pRoot->bEof); |
| 12939 } |
| 12940 |
| 12941 static i64 sqlite3Fts5ExprRowid(Fts5Expr *p){ |
| 12942 return p->pRoot->iRowid; |
| 12943 } |
| 12944 |
| 12945 static int fts5ParseStringFromToken(Fts5Token *pToken, char **pz){ |
| 12946 int rc = SQLITE_OK; |
| 12947 *pz = sqlite3Fts5Strndup(&rc, pToken->p, pToken->n); |
| 12948 return rc; |
| 12949 } |
| 12950 |
| 12951 /* |
| 12952 ** Free the phrase object passed as the only argument. |
| 12953 */ |
| 12954 static void fts5ExprPhraseFree(Fts5ExprPhrase *pPhrase){ |
| 12955 if( pPhrase ){ |
| 12956 int i; |
| 12957 for(i=0; i<pPhrase->nTerm; i++){ |
| 12958 Fts5ExprTerm *pSyn; |
| 12959 Fts5ExprTerm *pNext; |
| 12960 Fts5ExprTerm *pTerm = &pPhrase->aTerm[i]; |
| 12961 sqlite3_free(pTerm->zTerm); |
| 12962 sqlite3Fts5IterClose(pTerm->pIter); |
| 12963 |
| 12964 for(pSyn=pTerm->pSynonym; pSyn; pSyn=pNext){ |
| 12965 pNext = pSyn->pSynonym; |
| 12966 sqlite3Fts5IterClose(pSyn->pIter); |
| 12967 sqlite3_free(pSyn); |
| 12968 } |
| 12969 } |
| 12970 if( pPhrase->poslist.nSpace>0 ) fts5BufferFree(&pPhrase->poslist); |
| 12971 sqlite3_free(pPhrase); |
| 12972 } |
| 12973 } |
| 12974 |
| 12975 /* |
| 12976 ** If argument pNear is NULL, then a new Fts5ExprNearset object is allocated |
| 12977 ** and populated with pPhrase. Or, if pNear is not NULL, phrase pPhrase is |
| 12978 ** appended to it and the results returned. |
| 12979 ** |
| 12980 ** If an OOM error occurs, both the pNear and pPhrase objects are freed and |
| 12981 ** NULL returned. |
| 12982 */ |
| 12983 static Fts5ExprNearset *sqlite3Fts5ParseNearset( |
| 12984 Fts5Parse *pParse, /* Parse context */ |
| 12985 Fts5ExprNearset *pNear, /* Existing nearset, or NULL */ |
| 12986 Fts5ExprPhrase *pPhrase /* Recently parsed phrase */ |
| 12987 ){ |
| 12988 const int SZALLOC = 8; |
| 12989 Fts5ExprNearset *pRet = 0; |
| 12990 |
| 12991 if( pParse->rc==SQLITE_OK ){ |
| 12992 if( pPhrase==0 ){ |
| 12993 return pNear; |
| 12994 } |
| 12995 if( pNear==0 ){ |
| 12996 int nByte = sizeof(Fts5ExprNearset) + SZALLOC * sizeof(Fts5ExprPhrase*); |
| 12997 pRet = sqlite3_malloc(nByte); |
| 12998 if( pRet==0 ){ |
| 12999 pParse->rc = SQLITE_NOMEM; |
| 13000 }else{ |
| 13001 memset(pRet, 0, nByte); |
| 13002 } |
| 13003 }else if( (pNear->nPhrase % SZALLOC)==0 ){ |
| 13004 int nNew = pNear->nPhrase + SZALLOC; |
| 13005 int nByte = sizeof(Fts5ExprNearset) + nNew * sizeof(Fts5ExprPhrase*); |
| 13006 |
| 13007 pRet = (Fts5ExprNearset*)sqlite3_realloc(pNear, nByte); |
| 13008 if( pRet==0 ){ |
| 13009 pParse->rc = SQLITE_NOMEM; |
| 13010 } |
| 13011 }else{ |
| 13012 pRet = pNear; |
| 13013 } |
| 13014 } |
| 13015 |
| 13016 if( pRet==0 ){ |
| 13017 assert( pParse->rc!=SQLITE_OK ); |
| 13018 sqlite3Fts5ParseNearsetFree(pNear); |
| 13019 sqlite3Fts5ParsePhraseFree(pPhrase); |
| 13020 }else{ |
| 13021 pRet->apPhrase[pRet->nPhrase++] = pPhrase; |
| 13022 } |
| 13023 return pRet; |
| 13024 } |
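Both sqlite3Fts5ParseNearset() above and the tokenizer callback below grow their arrays in fixed steps of SZALLOC entries, reallocating only when the current count is a multiple of the step. The sketch below shows that pattern on a plain array of ints. `DemoList`, `demoListAppend` and `DEMO_SZALLOC` are hypothetical names, and the standard `realloc` stands in for `sqlite3_realloc`.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_SZALLOC 8     /* Growth step, in entries */

typedef struct DemoList {
  int nItem;               /* Number of entries in use */
  int *aItem;              /* Allocated in multiples of DEMO_SZALLOC */
} DemoList;

/* Append iVal to the list. Returns 0 on success, or -1 if memory could
** not be allocated, in which case the list is left unchanged. */
static int demoListAppend(DemoList *p, int iVal){
  if( (p->nItem % DEMO_SZALLOC)==0 ){
    /* The array is exactly full (or empty): make room for one more step. */
    size_t nByte = sizeof(int) * (size_t)(p->nItem + DEMO_SZALLOC);
    int *aNew = realloc(p->aItem, nByte);
    if( aNew==0 ) return -1;
    p->aItem = aNew;
  }
  p->aItem[p->nItem++] = iVal;
  return 0;
}
```

The capacity is never stored; it is implied by rounding nItem up to the next multiple of the step, which is why the modulo test is enough to decide when to reallocate.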
| 13025 |
| 13026 typedef struct TokenCtx TokenCtx; |
| 13027 struct TokenCtx { |
| 13028 Fts5ExprPhrase *pPhrase; |
| 13029 int rc; |
| 13030 }; |
| 13031 |
| 13032 /* |
| 13033 ** Callback for tokenizing terms used by ParseTerm(). |
| 13034 */ |
| 13035 static int fts5ParseTokenize( |
| 13036 void *pContext, /* Pointer to TokenCtx object */ |
| 13037 int tflags, /* Mask of FTS5_TOKEN_* flags */ |
| 13038 const char *pToken, /* Buffer containing token */ |
| 13039 int nToken, /* Size of token in bytes */ |
| 13040 int iUnused1, /* Start offset of token */ |
| 13041 int iUnused2 /* End offset of token */ |
| 13042 ){ |
| 13043 int rc = SQLITE_OK; |
| 13044 const int SZALLOC = 8; |
| 13045 TokenCtx *pCtx = (TokenCtx*)pContext; |
| 13046 Fts5ExprPhrase *pPhrase = pCtx->pPhrase; |
| 13047 |
| 13048 /* If an error has already occurred, this is a no-op */ |
| 13049 if( pCtx->rc!=SQLITE_OK ) return pCtx->rc; |
| 13050 |
| 13051 assert( pPhrase==0 || pPhrase->nTerm>0 ); |
| 13052 if( pPhrase && (tflags & FTS5_TOKEN_COLOCATED) ){ |
| 13053 Fts5ExprTerm *pSyn; |
| 13054 int nByte = sizeof(Fts5ExprTerm) + nToken+1; |
| 13055 pSyn = (Fts5ExprTerm*)sqlite3_malloc(nByte); |
| 13056 if( pSyn==0 ){ |
| 13057 rc = SQLITE_NOMEM; |
| 13058 }else{ |
| 13059 memset(pSyn, 0, nByte); |
| 13060 pSyn->zTerm = (char*)&pSyn[1]; |
| 13061 memcpy(pSyn->zTerm, pToken, nToken); |
| 13062 pSyn->pSynonym = pPhrase->aTerm[pPhrase->nTerm-1].pSynonym; |
| 13063 pPhrase->aTerm[pPhrase->nTerm-1].pSynonym = pSyn; |
| 13064 } |
| 13065 }else{ |
| 13066 Fts5ExprTerm *pTerm; |
| 13067 if( pPhrase==0 || (pPhrase->nTerm % SZALLOC)==0 ){ |
| 13068 Fts5ExprPhrase *pNew; |
| 13069 int nNew = SZALLOC + (pPhrase ? pPhrase->nTerm : 0); |
| 13070 |
| 13071 pNew = (Fts5ExprPhrase*)sqlite3_realloc(pPhrase, |
| 13072 sizeof(Fts5ExprPhrase) + sizeof(Fts5ExprTerm) * nNew |
| 13073 ); |
| 13074 if( pNew==0 ){ |
| 13075 rc = SQLITE_NOMEM; |
| 13076 }else{ |
| 13077 if( pPhrase==0 ) memset(pNew, 0, sizeof(Fts5ExprPhrase)); |
| 13078 pCtx->pPhrase = pPhrase = pNew; |
| 13079 pNew->nTerm = nNew - SZALLOC; |
| 13080 } |
| 13081 } |
| 13082 |
| 13083 if( rc==SQLITE_OK ){ |
| 13084 pTerm = &pPhrase->aTerm[pPhrase->nTerm++]; |
| 13085 memset(pTerm, 0, sizeof(Fts5ExprTerm)); |
| 13086 pTerm->zTerm = sqlite3Fts5Strndup(&rc, pToken, nToken); |
| 13087 } |
| 13088 } |
| 13089 |
| 13090 pCtx->rc = rc; |
| 13091 return rc; |
| 13092 } |
| 13093 |
| 13094 |
| 13095 /* |
| 13096 ** Free the phrase object passed as the only argument. |
| 13097 */ |
| 13098 static void sqlite3Fts5ParsePhraseFree(Fts5ExprPhrase *pPhrase){ |
| 13099 fts5ExprPhraseFree(pPhrase); |
| 13100 } |
| 13101 |
| 13102 /* |
| 13103 ** Free the nearset object passed as the only argument. |
| 13104 */ |
| 13105 static void sqlite3Fts5ParseNearsetFree(Fts5ExprNearset *pNear){ |
| 13106 if( pNear ){ |
| 13107 int i; |
| 13108 for(i=0; i<pNear->nPhrase; i++){ |
| 13109 fts5ExprPhraseFree(pNear->apPhrase[i]); |
| 13110 } |
| 13111 sqlite3_free(pNear->pColset); |
| 13112 sqlite3_free(pNear); |
| 13113 } |
| 13114 } |
| 13115 |
| 13116 static void sqlite3Fts5ParseFinished(Fts5Parse *pParse, Fts5ExprNode *p){ |
| 13117 assert( pParse->pExpr==0 ); |
| 13118 pParse->pExpr = p; |
| 13119 } |
| 13120 |
| 13121 /* |
| 13122 ** This function is called by the parser to process a string token. The |
| 13123 ** string may or may not be quoted. In any case it is tokenized and a |
| 13124 ** phrase object consisting of all tokens returned. |
| 13125 */ |
| 13126 static Fts5ExprPhrase *sqlite3Fts5ParseTerm( |
| 13127 Fts5Parse *pParse, /* Parse context */ |
| 13128 Fts5ExprPhrase *pAppend, /* Phrase to append to */ |
| 13129 Fts5Token *pToken, /* String to tokenize */ |
| 13130 int bPrefix /* True if there is a trailing "*" */ |
| 13131 ){ |
| 13132 Fts5Config *pConfig = pParse->pConfig; |
| 13133 TokenCtx sCtx; /* Context object passed to callback */ |
| 13134 int rc; /* Tokenize return code */ |
| 13135 char *z = 0; |
| 13136 |
| 13137 memset(&sCtx, 0, sizeof(TokenCtx)); |
| 13138 sCtx.pPhrase = pAppend; |
| 13139 |
| 13140 rc = fts5ParseStringFromToken(pToken, &z); |
| 13141 if( rc==SQLITE_OK ){ |
| 13142 int flags = FTS5_TOKENIZE_QUERY | (bPrefix ? FTS5_TOKENIZE_PREFIX : 0); |
| 13143 int n; |
| 13144 sqlite3Fts5Dequote(z); |
| 13145 n = (int)strlen(z); |
| 13146 rc = sqlite3Fts5Tokenize(pConfig, flags, z, n, &sCtx, fts5ParseTokenize); |
| 13147 } |
| 13148 sqlite3_free(z); |
| 13149 if( rc || (rc = sCtx.rc) ){ |
| 13150 pParse->rc = rc; |
| 13151 fts5ExprPhraseFree(sCtx.pPhrase); |
| 13152 sCtx.pPhrase = 0; |
| 13153 }else if( sCtx.pPhrase ){ |
| 13154 |
| 13155 if( pAppend==0 ){ |
| 13156 if( (pParse->nPhrase % 8)==0 ){ |
| 13157 int nByte = sizeof(Fts5ExprPhrase*) * (pParse->nPhrase + 8); |
| 13158 Fts5ExprPhrase **apNew; |
| 13159 apNew = (Fts5ExprPhrase**)sqlite3_realloc(pParse->apPhrase, nByte); |
| 13160 if( apNew==0 ){ |
| 13161 pParse->rc = SQLITE_NOMEM; |
| 13162 fts5ExprPhraseFree(sCtx.pPhrase); |
| 13163 return 0; |
| 13164 } |
| 13165 pParse->apPhrase = apNew; |
| 13166 } |
| 13167 pParse->nPhrase++; |
| 13168 } |
| 13169 |
| 13170 pParse->apPhrase[pParse->nPhrase-1] = sCtx.pPhrase; |
| 13171 assert( sCtx.pPhrase->nTerm>0 ); |
| 13172 sCtx.pPhrase->aTerm[sCtx.pPhrase->nTerm-1].bPrefix = bPrefix; |
| 13173 } |
| 13174 |
| 13175 return sCtx.pPhrase; |
| 13176 } |
| 13177 |
| 13178 /* |
| 13179 ** Create a new FTS5 expression by cloning phrase iPhrase of the |
| 13180 ** expression passed as the second argument. |
| 13181 */ |
| 13182 static int sqlite3Fts5ExprClonePhrase( |
| 13183 Fts5Config *pConfig, |
| 13184 Fts5Expr *pExpr, |
| 13185 int iPhrase, |
| 13186 Fts5Expr **ppNew |
| 13187 ){ |
| 13188 int rc = SQLITE_OK; /* Return code */ |
| 13189 Fts5ExprPhrase *pOrig; /* The phrase extracted from pExpr */ |
| 13190 int i; /* Used to iterate through phrase terms */ |
| 13191 |
| 13192 Fts5Expr *pNew = 0; /* Expression to return via *ppNew */ |
| 13193 |
| 13194 TokenCtx sCtx = {0,0}; /* Context object for fts5ParseTokenize */ |
| 13195 |
| 13196 |
| 13197 pOrig = pExpr->apExprPhrase[iPhrase]; |
| 13198 |
| 13199 pNew = (Fts5Expr*)sqlite3Fts5MallocZero(&rc, sizeof(Fts5Expr)); |
| 13200 if( rc==SQLITE_OK ){ |
| 13201 pNew->apExprPhrase = (Fts5ExprPhrase**)sqlite3Fts5MallocZero(&rc, |
| 13202 sizeof(Fts5ExprPhrase*)); |
| 13203 } |
| 13204 if( rc==SQLITE_OK ){ |
| 13205 pNew->pRoot = (Fts5ExprNode*)sqlite3Fts5MallocZero(&rc, |
| 13206 sizeof(Fts5ExprNode)); |
| 13207 } |
| 13208 if( rc==SQLITE_OK ){ |
| 13209 pNew->pRoot->pNear = (Fts5ExprNearset*)sqlite3Fts5MallocZero(&rc, |
| 13210 sizeof(Fts5ExprNearset) + sizeof(Fts5ExprPhrase*)); |
| 13211 } |
| 13212 |
| 13213 for(i=0; rc==SQLITE_OK && i<pOrig->nTerm; i++){ |
| 13214 int tflags = 0; |
| 13215 Fts5ExprTerm *p; |
| 13216 for(p=&pOrig->aTerm[i]; p && rc==SQLITE_OK; p=p->pSynonym){ |
| 13217 const char *zTerm = p->zTerm; |
| 13218 rc = fts5ParseTokenize((void*)&sCtx, tflags, zTerm, (int)strlen(zTerm), |
| 13219 0, 0); |
| 13220 tflags = FTS5_TOKEN_COLOCATED; |
| 13221 } |
| 13222 if( rc==SQLITE_OK ){ |
| 13223 sCtx.pPhrase->aTerm[i].bPrefix = pOrig->aTerm[i].bPrefix; |
| 13224 } |
| 13225 } |
| 13226 |
| 13227 if( rc==SQLITE_OK ){ |
| 13228 /* All the allocations succeeded. Put the expression object together. */ |
| 13229 pNew->pIndex = pExpr->pIndex; |
| 13230 pNew->nPhrase = 1; |
| 13231 pNew->apExprPhrase[0] = sCtx.pPhrase; |
| 13232 pNew->pRoot->pNear->apPhrase[0] = sCtx.pPhrase; |
| 13233 pNew->pRoot->pNear->nPhrase = 1; |
| 13234 sCtx.pPhrase->pNode = pNew->pRoot; |
| 13235 |
| 13236 if( pOrig->nTerm==1 && pOrig->aTerm[0].pSynonym==0 ){ |
| 13237 pNew->pRoot->eType = FTS5_TERM; |
| 13238 }else{ |
| 13239 pNew->pRoot->eType = FTS5_STRING; |
| 13240 } |
| 13241 }else{ |
| 13242 sqlite3Fts5ExprFree(pNew); |
| 13243 fts5ExprPhraseFree(sCtx.pPhrase); |
| 13244 pNew = 0; |
| 13245 } |
| 13246 |
| 13247 *ppNew = pNew; |
| 13248 return rc; |
| 13249 } |
| 13250 |
| 13251 |
| 13252 /* |
| 13253 ** Token pTok has appeared in a MATCH expression where the NEAR operator |
| 13254 ** is expected. If token pTok does not contain "NEAR", store an error |
| 13255 ** in the pParse object. |
| 13256 */ |
| 13257 static void sqlite3Fts5ParseNear(Fts5Parse *pParse, Fts5Token *pTok){ |
| 13258 if( pTok->n!=4 || memcmp("NEAR", pTok->p, 4) ){ |
| 13259 sqlite3Fts5ParseError( |
| 13260 pParse, "fts5: syntax error near \"%.*s\"", pTok->n, pTok->p |
| 13261 ); |
| 13262 } |
| 13263 } |
| 13264 |
| 13265 static void sqlite3Fts5ParseSetDistance( |
| 13266 Fts5Parse *pParse, |
| 13267 Fts5ExprNearset *pNear, |
| 13268 Fts5Token *p |
| 13269 ){ |
| 13270 int nNear = 0; |
| 13271 int i; |
| 13272 if( p->n ){ |
| 13273 for(i=0; i<p->n; i++){ |
| 13274 char c = (char)p->p[i]; |
| 13275 if( c<'0' || c>'9' ){ |
| 13276 sqlite3Fts5ParseError( |
| 13277 pParse, "expected integer, got \"%.*s\"", p->n, p->p |
| 13278 ); |
| 13279 return; |
| 13280 } |
| 13281 nNear = nNear * 10 + (p->p[i] - '0'); |
| 13282 } |
| 13283 }else{ |
| 13284 nNear = FTS5_DEFAULT_NEARDIST; |
| 13285 } |
| 13286 pNear->nNear = nNear; |
| 13287 } |
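The digit loop in sqlite3Fts5ParseSetDistance() can be exercised on its own. The following is a minimal sketch under stated assumptions: `demoParseDistance` is a hypothetical helper, the default distance is passed in as a parameter instead of using FTS5_DEFAULT_NEARDIST, and the overflow check is an addition that the function above does not perform.

```c
#include <assert.h>
#include <limits.h>

/* Parse the n bytes at z as an unsigned decimal integer. Returns 0 and
** stores the value in *pnOut on success, or -1 if z contains a non-digit
** or the value does not fit in an int. An empty token yields nDefault. */
static int demoParseDistance(const char *z, int n, int nDefault, int *pnOut){
  int i;
  int nVal = 0;
  if( n==0 ){
    *pnOut = nDefault;
    return 0;
  }
  for(i=0; i<n; i++){
    char c = z[i];
    if( c<'0' || c>'9' ) return -1;
    /* Overflow guard (not in the original): refuse the digit if
    ** nVal*10 + digit would exceed INT_MAX. */
    if( nVal>(INT_MAX - (c-'0'))/10 ) return -1;
    nVal = nVal*10 + (c-'0');
  }
  *pnOut = nVal;
  return 0;
}
```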
| 13288 |
| 13289 /* |
| 13290 ** The second argument passed to this function may be NULL, or it may be |
| 13291 ** an existing Fts5Colset object. This function returns a pointer to |
| 13292 ** a new colset object containing the contents of (p) with new value column |
| 13293 ** number iCol appended. |
| 13294 ** |
| 13295 ** If an OOM error occurs, store an error code in pParse and return NULL. |
| 13296 ** The old colset object (if any) is not freed in this case. |
| 13297 */ |
| 13298 static Fts5Colset *fts5ParseColset( |
| 13299 Fts5Parse *pParse, /* Store SQLITE_NOMEM here if required */ |
| 13300 Fts5Colset *p, /* Existing colset object */ |
| 13301 int iCol /* New column to add to colset object */ |
| 13302 ){ |
| 13303 int nCol = p ? p->nCol : 0; /* Num. columns already in colset object */ |
| 13304 Fts5Colset *pNew; /* New colset object to return */ |
| 13305 |
| 13306 assert( pParse->rc==SQLITE_OK ); |
| 13307 assert( iCol>=0 && iCol<pParse->pConfig->nCol ); |
| 13308 |
| 13309 pNew = sqlite3_realloc(p, sizeof(Fts5Colset) + sizeof(int)*nCol); |
| 13310 if( pNew==0 ){ |
| 13311 pParse->rc = SQLITE_NOMEM; |
| 13312 }else{ |
| 13313 int *aiCol = pNew->aiCol; |
| 13314 int i, j; |
| 13315 for(i=0; i<nCol; i++){ |
| 13316 if( aiCol[i]==iCol ) return pNew; |
| 13317 if( aiCol[i]>iCol ) break; |
| 13318 } |
| 13319 for(j=nCol; j>i; j--){ |
| 13320 aiCol[j] = aiCol[j-1]; |
| 13321 } |
| 13322 aiCol[i] = iCol; |
| 13323 pNew->nCol = nCol+1; |
| 13324 |
| 13325 #ifndef NDEBUG |
| 13326 /* Check that the array is in order and contains no duplicate entries. */ |
| 13327 for(i=1; i<pNew->nCol; i++) assert( pNew->aiCol[i]>pNew->aiCol[i-1] ); |
| 13328 #endif |
| 13329 } |
| 13330 |
| 13331 return pNew; |
| 13332 } |
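fts5ParseColset() keeps the column list sorted and free of duplicates by scanning for the insertion point and shifting the tail up by one slot. The same idea on a caller-supplied array can be sketched as follows; `demoSortedInsert` is a hypothetical name and the sketch leaves allocation to the caller.

```c
#include <assert.h>

/* Insert iVal into the ascending array a[] of *pn entries, keeping it
** sorted and free of duplicates. The caller must guarantee room for one
** more entry. Returns 1 if the value was added, 0 if already present. */
static int demoSortedInsert(int *a, int *pn, int iVal){
  int n = *pn;
  int i, j;
  /* Find the first entry that is not smaller than iVal. */
  for(i=0; i<n; i++){
    if( a[i]==iVal ) return 0;
    if( a[i]>iVal ) break;
  }
  /* Shift the tail up one slot to open a gap at position i. */
  for(j=n; j>i; j--){
    a[j] = a[j-1];
  }
  a[i] = iVal;
  *pn = n+1;
  return 1;
}
```

A linear scan is a reasonable fit here because the number of columns named in a single colset is expected to be small.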
| 13333 |
| 13334 static Fts5Colset *sqlite3Fts5ParseColset( |
| 13335 Fts5Parse *pParse, /* Store SQLITE_NOMEM here if required */ |
| 13336 Fts5Colset *pColset, /* Existing colset object */ |
| 13337 Fts5Token *p |
| 13338 ){ |
| 13339 Fts5Colset *pRet = 0; |
| 13340 int iCol; |
| 13341 char *z; /* Dequoted copy of token p */ |
| 13342 |
| 13343 z = sqlite3Fts5Strndup(&pParse->rc, p->p, p->n); |
| 13344 if( pParse->rc==SQLITE_OK ){ |
| 13345 Fts5Config *pConfig = pParse->pConfig; |
| 13346 sqlite3Fts5Dequote(z); |
| 13347 for(iCol=0; iCol<pConfig->nCol; iCol++){ |
| 13348 if( 0==sqlite3_stricmp(pConfig->azCol[iCol], z) ) break; |
| 13349 } |
| 13350 if( iCol==pConfig->nCol ){ |
| 13351 sqlite3Fts5ParseError(pParse, "no such column: %s", z); |
| 13352 }else{ |
| 13353 pRet = fts5ParseColset(pParse, pColset, iCol); |
| 13354 } |
| 13355 sqlite3_free(z); |
| 13356 } |
| 13357 |
| 13358 if( pRet==0 ){ |
| 13359 assert( pParse->rc!=SQLITE_OK ); |
| 13360 sqlite3_free(pColset); |
| 13361 } |
| 13362 |
| 13363 return pRet; |
| 13364 } |
| 13365 |
| 13366 static void sqlite3Fts5ParseSetColset( |
| 13367 Fts5Parse *pParse, |
| 13368 Fts5ExprNearset *pNear, |
| 13369 Fts5Colset *pColset |
| 13370 ){ |
| 13371 if( pNear ){ |
| 13372 pNear->pColset = pColset; |
| 13373 }else{ |
| 13374 sqlite3_free(pColset); |
| 13375 } |
| 13376 } |
| 13377 |
| 13378 static void fts5ExprAddChildren(Fts5ExprNode *p, Fts5ExprNode *pSub){ |
| 13379 if( p->eType!=FTS5_NOT && pSub->eType==p->eType ){ |
| 13380 int nByte = sizeof(Fts5ExprNode*) * pSub->nChild; |
| 13381 memcpy(&p->apChild[p->nChild], pSub->apChild, nByte); |
| 13382 p->nChild += pSub->nChild; |
| 13383 sqlite3_free(pSub); |
| 13384 }else{ |
| 13385 p->apChild[p->nChild++] = pSub; |
| 13386 } |
| 13387 } |
| 13388 |
| 13389 /* |
| 13390 ** Allocate and return a new expression object. If anything goes wrong (i.e. |
| 13391 ** OOM error), leave an error code in pParse and return NULL. |
| 13392 */ |
| 13393 static Fts5ExprNode *sqlite3Fts5ParseNode( |
| 13394 Fts5Parse *pParse, /* Parse context */ |
| 13395 int eType, /* FTS5_STRING, AND, OR or NOT */ |
| 13396 Fts5ExprNode *pLeft, /* Left hand child expression */ |
| 13397 Fts5ExprNode *pRight, /* Right hand child expression */ |
| 13398 Fts5ExprNearset *pNear /* For STRING expressions, the near cluster */ |
| 13399 ){ |
| 13400 Fts5ExprNode *pRet = 0; |
| 13401 |
| 13402 if( pParse->rc==SQLITE_OK ){ |
| 13403 int nChild = 0; /* Number of children of returned node */ |
| 13404 int nByte; /* Bytes of space to allocate for this node */ |
| 13405 |
| 13406 assert( (eType!=FTS5_STRING && !pNear) |
| 13407 || (eType==FTS5_STRING && !pLeft && !pRight) |
| 13408 ); |
| 13409 if( eType==FTS5_STRING && pNear==0 ) return 0; |
| 13410 if( eType!=FTS5_STRING && pLeft==0 ) return pRight; |
| 13411 if( eType!=FTS5_STRING && pRight==0 ) return pLeft; |
| 13412 |
| 13413 if( eType==FTS5_NOT ){ |
| 13414 nChild = 2; |
| 13415 }else if( eType==FTS5_AND || eType==FTS5_OR ){ |
| 13416 nChild = 2; |
| 13417 if( pLeft->eType==eType ) nChild += pLeft->nChild-1; |
| 13418 if( pRight->eType==eType ) nChild += pRight->nChild-1; |
| 13419 } |
| 13420 |
| 13421 nByte = sizeof(Fts5ExprNode) + sizeof(Fts5ExprNode*)*(nChild-1); |
| 13422 pRet = (Fts5ExprNode*)sqlite3Fts5MallocZero(&pParse->rc, nByte); |
| 13423 |
| 13424 if( pRet ){ |
| 13425 pRet->eType = eType; |
| 13426 pRet->pNear = pNear; |
| 13427 if( eType==FTS5_STRING ){ |
| 13428 int iPhrase; |
| 13429 for(iPhrase=0; iPhrase<pNear->nPhrase; iPhrase++){ |
| 13430 pNear->apPhrase[iPhrase]->pNode = pRet; |
| 13431 } |
| 13432 if( pNear->nPhrase==1 |
| 13433 && pNear->apPhrase[0]->nTerm==1 |
| 13434 && pNear->apPhrase[0]->aTerm[0].pSynonym==0 |
| 13435 ){ |
| 13436 pRet->eType = FTS5_TERM; |
| 13437 } |
| 13438 }else{ |
| 13439 fts5ExprAddChildren(pRet, pLeft); |
| 13440 fts5ExprAddChildren(pRet, pRight); |
| 13441 } |
| 13442 } |
| 13443 } |
| 13444 |
| 13445 if( pRet==0 ){ |
| 13446 assert( pParse->rc!=SQLITE_OK ); |
| 13447 sqlite3Fts5ParseNodeFree(pLeft); |
| 13448 sqlite3Fts5ParseNodeFree(pRight); |
| 13449 sqlite3Fts5ParseNearsetFree(pNear); |
| 13450 } |
| 13451 return pRet; |
| 13452 } |
| 13453 |
| 13454 static char *fts5ExprTermPrint(Fts5ExprTerm *pTerm){ |
| 13455 int nByte = 0; |
| 13456 Fts5ExprTerm *p; |
| 13457 char *zQuoted; |
| 13458 |
| 13459 /* Determine the maximum amount of space required. */ |
| 13460 for(p=pTerm; p; p=p->pSynonym){ |
| 13461 nByte += (int)strlen(p->zTerm) * 2 + 3 + 2; |
| 13462 } |
| 13463 zQuoted = sqlite3_malloc(nByte); |
| 13464 |
| 13465 if( zQuoted ){ |
| 13466 int i = 0; |
| 13467 for(p=pTerm; p; p=p->pSynonym){ |
| 13468 char *zIn = p->zTerm; |
| 13469 zQuoted[i++] = '"'; |
| 13470 while( *zIn ){ |
| 13471 if( *zIn=='"' ) zQuoted[i++] = '"'; |
| 13472 zQuoted[i++] = *zIn++; |
| 13473 } |
| 13474 zQuoted[i++] = '"'; |
| 13475 if( p->pSynonym ) zQuoted[i++] = '|'; |
| 13476 } |
| 13477 if( pTerm->bPrefix ){ |
| 13478 zQuoted[i++] = ' '; |
| 13479 zQuoted[i++] = '*'; |
| 13480 } |
| 13481 zQuoted[i++] = '\0'; |
| 13482 } |
| 13483 return zQuoted; |
| 13484 } |
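The quoting rule used by fts5ExprTermPrint() is the usual SQL-style one: wrap the term in double quotes and double any quote character inside it. A minimal sketch for a single term, without the synonym chain or the prefix marker, might look like this. `demoQuote` is a hypothetical helper and uses the standard `malloc` in place of `sqlite3_malloc`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a heap-allocated copy of zIn wrapped in double quotes, with
** every embedded double quote doubled. Returns NULL if the allocation
** fails. The caller is responsible for freeing the result. */
static char *demoQuote(const char *zIn){
  size_t nIn = strlen(zIn);
  /* Worst case: every byte is a quote (2 bytes each), plus the two
  ** enclosing quotes and the terminator. */
  char *zOut = malloc(nIn*2 + 3);
  size_t i = 0;
  if( zOut==0 ) return 0;
  zOut[i++] = '"';
  while( *zIn ){
    if( *zIn=='"' ) zOut[i++] = '"';
    zOut[i++] = *zIn++;
  }
  zOut[i++] = '"';
  zOut[i] = '\0';
  return zOut;
}
```

Sizing the buffer from the length of the string that is actually copied is what keeps the worst-case bound valid.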
| 13485 |
| 13486 static char *fts5PrintfAppend(char *zApp, const char *zFmt, ...){ |
| 13487 char *zNew; |
| 13488 va_list ap; |
| 13489 va_start(ap, zFmt); |
| 13490 zNew = sqlite3_vmprintf(zFmt, ap); |
| 13491 va_end(ap); |
| 13492 if( zApp && zNew ){ |
| 13493 char *zNew2 = sqlite3_mprintf("%s%s", zApp, zNew); |
| 13494 sqlite3_free(zNew); |
| 13495 zNew = zNew2; |
| 13496 } |
| 13497 sqlite3_free(zApp); |
| 13498 return zNew; |
| 13499 } |
| 13500 |
| 13501 /* |
| 13502 ** Compose a tcl-readable representation of expression pExpr. Return a |
| 13503 ** pointer to a buffer containing that representation. It is the |
| 13504 ** responsibility of the caller to at some point free the buffer using |
| 13505 ** sqlite3_free(). |
| 13506 */ |
| 13507 static char *fts5ExprPrintTcl( |
| 13508 Fts5Config *pConfig, |
| 13509 const char *zNearsetCmd, |
| 13510 Fts5ExprNode *pExpr |
| 13511 ){ |
| 13512 char *zRet = 0; |
| 13513 if( pExpr->eType==FTS5_STRING || pExpr->eType==FTS5_TERM ){ |
| 13514 Fts5ExprNearset *pNear = pExpr->pNear; |
| 13515 int i; |
| 13516 int iTerm; |
| 13517 |
| 13518 zRet = fts5PrintfAppend(zRet, "%s ", zNearsetCmd); |
| 13519 if( zRet==0 ) return 0; |
| 13520 if( pNear->pColset ){ |
| 13521 int *aiCol = pNear->pColset->aiCol; |
| 13522 int nCol = pNear->pColset->nCol; |
| 13523 if( nCol==1 ){ |
| 13524 zRet = fts5PrintfAppend(zRet, "-col %d ", aiCol[0]); |
| 13525 }else{ |
| 13526 zRet = fts5PrintfAppend(zRet, "-col {%d", aiCol[0]); |
| 13527 for(i=1; i<pNear->pColset->nCol; i++){ |
| 13528 zRet = fts5PrintfAppend(zRet, " %d", aiCol[i]); |
| 13529 } |
| 13530 zRet = fts5PrintfAppend(zRet, "} "); |
| 13531 } |
| 13532 if( zRet==0 ) return 0; |
| 13533 } |
| 13534 |
| 13535 if( pNear->nPhrase>1 ){ |
| 13536 zRet = fts5PrintfAppend(zRet, "-near %d ", pNear->nNear); |
| 13537 if( zRet==0 ) return 0; |
| 13538 } |
| 13539 |
| 13540 zRet = fts5PrintfAppend(zRet, "--"); |
| 13541 if( zRet==0 ) return 0; |
| 13542 |
| 13543 for(i=0; i<pNear->nPhrase; i++){ |
| 13544 Fts5ExprPhrase *pPhrase = pNear->apPhrase[i]; |
| 13545 |
| 13546 zRet = fts5PrintfAppend(zRet, " {"); |
| 13547 for(iTerm=0; zRet && iTerm<pPhrase->nTerm; iTerm++){ |
| 13548 char *zTerm = pPhrase->aTerm[iTerm].zTerm; |
| 13549 zRet = fts5PrintfAppend(zRet, "%s%s", iTerm==0?"":" ", zTerm); |
| 13550 } |
| 13551 |
| 13552 if( zRet ) zRet = fts5PrintfAppend(zRet, "}"); |
| 13553 if( zRet==0 ) return 0; |
| 13554 } |
| 13555 |
| 13556 }else{ |
| 13557 char const *zOp = 0; |
| 13558 int i; |
| 13559 switch( pExpr->eType ){ |
| 13560 case FTS5_AND: zOp = "AND"; break; |
| 13561 case FTS5_NOT: zOp = "NOT"; break; |
| 13562 default: |
| 13563 assert( pExpr->eType==FTS5_OR ); |
| 13564 zOp = "OR"; |
| 13565 break; |
| 13566 } |
| 13567 |
| 13568 zRet = sqlite3_mprintf("%s", zOp); |
| 13569 for(i=0; zRet && i<pExpr->nChild; i++){ |
| 13570 char *z = fts5ExprPrintTcl(pConfig, zNearsetCmd, pExpr->apChild[i]); |
| 13571 if( !z ){ |
| 13572 sqlite3_free(zRet); |
| 13573 zRet = 0; |
| 13574 }else{ |
| 13575 zRet = fts5PrintfAppend(zRet, " [%z]", z); |
| 13576 } |
| 13577 } |
| 13578 } |
| 13579 |
| 13580 return zRet; |
| 13581 } |
| 13582 |
| 13583 static char *fts5ExprPrint(Fts5Config *pConfig, Fts5ExprNode *pExpr){ |
| 13584 char *zRet = 0; |
| 13585 if( pExpr->eType==FTS5_STRING || pExpr->eType==FTS5_TERM ){ |
| 13586 Fts5ExprNearset *pNear = pExpr->pNear; |
| 13587 int i; |
| 13588 int iTerm; |
| 13589 |
| 13590 if( pNear->pColset ){ |
| 13591 int iCol = pNear->pColset->aiCol[0]; |
| 13592 zRet = fts5PrintfAppend(zRet, "%s : ", pConfig->azCol[iCol]); |
| 13593 if( zRet==0 ) return 0; |
| 13594 } |
| 13595 |
| 13596 if( pNear->nPhrase>1 ){ |
| 13597 zRet = fts5PrintfAppend(zRet, "NEAR("); |
| 13598 if( zRet==0 ) return 0; |
| 13599 } |
| 13600 |
| 13601 for(i=0; i<pNear->nPhrase; i++){ |
| 13602 Fts5ExprPhrase *pPhrase = pNear->apPhrase[i]; |
| 13603 if( i!=0 ){ |
| 13604 zRet = fts5PrintfAppend(zRet, " "); |
| 13605 if( zRet==0 ) return 0; |
| 13606 } |
| 13607 for(iTerm=0; iTerm<pPhrase->nTerm; iTerm++){ |
| 13608 char *zTerm = fts5ExprTermPrint(&pPhrase->aTerm[iTerm]); |
| 13609 if( zTerm ){ |
| 13610 zRet = fts5PrintfAppend(zRet, "%s%s", iTerm==0?"":" + ", zTerm); |
| 13611 sqlite3_free(zTerm); |
| 13612 } |
| 13613 if( zTerm==0 || zRet==0 ){ |
| 13614 sqlite3_free(zRet); |
| 13615 return 0; |
| 13616 } |
| 13617 } |
| 13618 } |
| 13619 |
| 13620 if( pNear->nPhrase>1 ){ |
| 13621 zRet = fts5PrintfAppend(zRet, ", %d)", pNear->nNear); |
| 13622 if( zRet==0 ) return 0; |
| 13623 } |
| 13624 |
| 13625 }else{ |
| 13626 char const *zOp = 0; |
| 13627 int i; |
| 13628 |
| 13629 switch( pExpr->eType ){ |
| 13630 case FTS5_AND: zOp = " AND "; break; |
| 13631 case FTS5_NOT: zOp = " NOT "; break; |
| 13632 default: |
| 13633 assert( pExpr->eType==FTS5_OR ); |
| 13634 zOp = " OR "; |
| 13635 break; |
| 13636 } |
| 13637 |
| 13638 for(i=0; i<pExpr->nChild; i++){ |
| 13639 char *z = fts5ExprPrint(pConfig, pExpr->apChild[i]); |
| 13640 if( z==0 ){ |
| 13641 sqlite3_free(zRet); |
| 13642 zRet = 0; |
| 13643 }else{ |
| 13644 int e = pExpr->apChild[i]->eType; |
| 13645 int b = (e!=FTS5_STRING && e!=FTS5_TERM); |
| 13646 zRet = fts5PrintfAppend(zRet, "%s%s%z%s", |
| 13647 (i==0 ? "" : zOp), |
| 13648 (b?"(":""), z, (b?")":"") |
| 13649 ); |
| 13650 } |
| 13651 if( zRet==0 ) break; |
| 13652 } |
| 13653 } |
| 13654 |
| 13655 return zRet; |
| 13656 } |
| 13657 |
| 13658 /* |
| 13659 ** The implementation of user-defined scalar functions fts5_expr() (bTcl==0) |
| 13660 ** and fts5_expr_tcl() (bTcl!=0). |
| 13661 */ |
| 13662 static void fts5ExprFunction( |
| 13663 sqlite3_context *pCtx, /* Function call context */ |
| 13664 int nArg, /* Number of args */ |
| 13665 sqlite3_value **apVal, /* Function arguments */ |
| 13666 int bTcl |
| 13667 ){ |
| 13668 Fts5Global *pGlobal = (Fts5Global*)sqlite3_user_data(pCtx); |
| 13669 sqlite3 *db = sqlite3_context_db_handle(pCtx); |
| 13670 const char *zExpr = 0; |
| 13671 char *zErr = 0; |
| 13672 Fts5Expr *pExpr = 0; |
| 13673 int rc; |
| 13674 int i; |
| 13675 |
| 13676 const char **azConfig; /* Array of arguments for Fts5Config */ |
| 13677 const char *zNearsetCmd = "nearset"; |
| 13678 int nConfig; /* Size of azConfig[] */ |
| 13679 Fts5Config *pConfig = 0; |
| 13680 int iArg = 1; |
| 13681 |
| 13682 if( nArg<1 ){ |
| 13683 zErr = sqlite3_mprintf("wrong number of arguments to function %s", |
| 13684 bTcl ? "fts5_expr_tcl" : "fts5_expr" |
| 13685 ); |
| 13686 sqlite3_result_error(pCtx, zErr, -1); |
| 13687 sqlite3_free(zErr); |
| 13688 return; |
| 13689 } |
| 13690 |
| 13691 if( bTcl && nArg>1 ){ |
| 13692 zNearsetCmd = (const char*)sqlite3_value_text(apVal[1]); |
| 13693 iArg = 2; |
| 13694 } |
| 13695 |
| 13696 nConfig = 3 + (nArg-iArg); |
| 13697 azConfig = (const char**)sqlite3_malloc(sizeof(char*) * nConfig); |
| 13698 if( azConfig==0 ){ |
| 13699 sqlite3_result_error_nomem(pCtx); |
| 13700 return; |
| 13701 } |
| 13702 azConfig[0] = 0; |
| 13703 azConfig[1] = "main"; |
| 13704 azConfig[2] = "tbl"; |
| 13705 for(i=3; iArg<nArg; iArg++){ |
| 13706 azConfig[i++] = (const char*)sqlite3_value_text(apVal[iArg]); |
| 13707 } |
| 13708 |
| 13709 zExpr = (const char*)sqlite3_value_text(apVal[0]); |
| 13710 |
| 13711 rc = sqlite3Fts5ConfigParse(pGlobal, db, nConfig, azConfig, &pConfig, &zErr); |
| 13712 if( rc==SQLITE_OK ){ |
| 13713 rc = sqlite3Fts5ExprNew(pConfig, zExpr, &pExpr, &zErr); |
| 13714 } |
| 13715 if( rc==SQLITE_OK ){ |
| 13716 char *zText; |
| 13717 if( pExpr->pRoot==0 ){ |
| 13718 zText = sqlite3_mprintf(""); |
| 13719 }else if( bTcl ){ |
| 13720 zText = fts5ExprPrintTcl(pConfig, zNearsetCmd, pExpr->pRoot); |
| 13721 }else{ |
| 13722 zText = fts5ExprPrint(pConfig, pExpr->pRoot); |
| 13723 } |
| 13724 if( zText==0 ){ |
| 13725 rc = SQLITE_NOMEM; |
| 13726 }else{ |
| 13727 sqlite3_result_text(pCtx, zText, -1, SQLITE_TRANSIENT); |
| 13728 sqlite3_free(zText); |
| 13729 } |
| 13730 } |
| 13731 |
| 13732 if( rc!=SQLITE_OK ){ |
| 13733 if( zErr ){ |
| 13734 sqlite3_result_error(pCtx, zErr, -1); |
| 13735 sqlite3_free(zErr); |
| 13736 }else{ |
| 13737 sqlite3_result_error_code(pCtx, rc); |
| 13738 } |
| 13739 } |
| 13740 sqlite3_free((void *)azConfig); |
| 13741 sqlite3Fts5ConfigFree(pConfig); |
| 13742 sqlite3Fts5ExprFree(pExpr); |
| 13743 } |
| 13744 |
| 13745 static void fts5ExprFunctionHr( |
| 13746 sqlite3_context *pCtx, /* Function call context */ |
| 13747 int nArg, /* Number of args */ |
| 13748 sqlite3_value **apVal /* Function arguments */ |
| 13749 ){ |
| 13750 fts5ExprFunction(pCtx, nArg, apVal, 0); |
| 13751 } |
| 13752 static void fts5ExprFunctionTcl( |
| 13753 sqlite3_context *pCtx, /* Function call context */ |
| 13754 int nArg, /* Number of args */ |
| 13755 sqlite3_value **apVal /* Function arguments */ |
| 13756 ){ |
| 13757 fts5ExprFunction(pCtx, nArg, apVal, 1); |
| 13758 } |
| 13759 |
| 13760 /* |
| 13761 ** The implementation of an SQLite user-defined-function that accepts a |
| 13762 ** single integer as an argument. If the integer is an alpha-numeric |
| 13763 ** unicode code point, 1 is returned. Otherwise 0. |
| 13764 */ |
| 13765 static void fts5ExprIsAlnum( |
| 13766 sqlite3_context *pCtx, /* Function call context */ |
| 13767 int nArg, /* Number of args */ |
| 13768 sqlite3_value **apVal /* Function arguments */ |
| 13769 ){ |
| 13770 int iCode; |
| 13771 if( nArg!=1 ){ |
| 13772 sqlite3_result_error(pCtx, |
| 13773 "wrong number of arguments to function fts5_isalnum", -1 |
| 13774 ); |
| 13775 return; |
| 13776 } |
| 13777 iCode = sqlite3_value_int(apVal[0]); |
| 13778 sqlite3_result_int(pCtx, sqlite3Fts5UnicodeIsalnum(iCode)); |
| 13779 } |
| 13780 |
| 13781 static void fts5ExprFold( |
| 13782 sqlite3_context *pCtx, /* Function call context */ |
| 13783 int nArg, /* Number of args */ |
| 13784 sqlite3_value **apVal /* Function arguments */ |
| 13785 ){ |
| 13786 if( nArg!=1 && nArg!=2 ){ |
| 13787 sqlite3_result_error(pCtx, |
| 13788 "wrong number of arguments to function fts5_fold", -1 |
| 13789 ); |
| 13790 }else{ |
| 13791 int iCode; |
| 13792 int bRemoveDiacritics = 0; |
| 13793 iCode = sqlite3_value_int(apVal[0]); |
| 13794 if( nArg==2 ) bRemoveDiacritics = sqlite3_value_int(apVal[1]); |
| 13795 sqlite3_result_int(pCtx, sqlite3Fts5UnicodeFold(iCode, bRemoveDiacritics)); |
| 13796 } |
| 13797 } |
| 13798 |
| 13799 /* |
| 13800 ** This is called during initialization to register the fts5_expr(), |
| 13801 ** fts5_expr_tcl(), fts5_isalnum() and fts5_fold() scalar UDFs with handle db. |
| 13802 */ |
| 13803 static int sqlite3Fts5ExprInit(Fts5Global *pGlobal, sqlite3 *db){ |
| 13804 struct Fts5ExprFunc { |
| 13805 const char *z; |
| 13806 void (*x)(sqlite3_context*,int,sqlite3_value**); |
| 13807 } aFunc[] = { |
| 13808 { "fts5_expr", fts5ExprFunctionHr }, |
| 13809 { "fts5_expr_tcl", fts5ExprFunctionTcl }, |
| 13810 { "fts5_isalnum", fts5ExprIsAlnum }, |
| 13811 { "fts5_fold", fts5ExprFold }, |
| 13812 }; |
| 13813 int i; |
| 13814 int rc = SQLITE_OK; |
| 13815 void *pCtx = (void*)pGlobal; |
| 13816 |
| 13817 for(i=0; rc==SQLITE_OK && i<(int)ArraySize(aFunc); i++){ |
| 13818 struct Fts5ExprFunc *p = &aFunc[i]; |
| 13819 rc = sqlite3_create_function(db, p->z, -1, SQLITE_UTF8, pCtx, p->x, 0, 0); |
| 13820 } |
| 13821 |
| 13822 /* Avoid a warning indicating that sqlite3Fts5ParserTrace() is unused */ |
| 13823 #ifndef NDEBUG |
| 13824 (void)sqlite3Fts5ParserTrace; |
| 13825 #endif |
| 13826 |
| 13827 return rc; |
| 13828 } |
| 13829 |
| 13830 /* |
| 13831 ** Return the number of phrases in expression pExpr. |
| 13832 */ |
| 13833 static int sqlite3Fts5ExprPhraseCount(Fts5Expr *pExpr){ |
| 13834 return (pExpr ? pExpr->nPhrase : 0); |
| 13835 } |
| 13836 |
| 13837 /* |
| 13838 ** Return the number of terms in the iPhrase'th phrase in pExpr. |
| 13839 */ |
| 13840 static int sqlite3Fts5ExprPhraseSize(Fts5Expr *pExpr, int iPhrase){ |
| 13841 if( iPhrase<0 || iPhrase>=pExpr->nPhrase ) return 0; |
| 13842 return pExpr->apExprPhrase[iPhrase]->nTerm; |
| 13843 } |
| 13844 |
| 13845 /* |
| 13846 ** This function is used to access the current position list for phrase |
| 13847 ** iPhrase. |
| 13848 */ |
| 13849 static int sqlite3Fts5ExprPoslist(Fts5Expr *pExpr, int iPhrase, const u8 **pa){ |
| 13850 int nRet; |
| 13851 Fts5ExprPhrase *pPhrase = pExpr->apExprPhrase[iPhrase]; |
| 13852 Fts5ExprNode *pNode = pPhrase->pNode; |
| 13853 if( pNode->bEof==0 && pNode->iRowid==pExpr->pRoot->iRowid ){ |
| 13854 *pa = pPhrase->poslist.p; |
| 13855 nRet = pPhrase->poslist.n; |
| 13856 }else{ |
| 13857 *pa = 0; |
| 13858 nRet = 0; |
| 13859 } |
| 13860 return nRet; |
| 13861 } |
| 13862 |
| 13863 /* |
| 13864 ** 2014 August 11 |
| 13865 ** |
| 13866 ** The author disclaims copyright to this source code. In place of |
| 13867 ** a legal notice, here is a blessing: |
| 13868 ** |
| 13869 ** May you do good and not evil. |
| 13870 ** May you find forgiveness for yourself and forgive others. |
| 13871 ** May you share freely, never taking more than you give. |
| 13872 ** |
| 13873 ****************************************************************************** |
| 13874 ** |
| 13875 */ |
| 13876 |
| 13877 |
| 13878 |
| 13879 /* #include "fts5Int.h" */ |
| 13880 |
| 13881 typedef struct Fts5HashEntry Fts5HashEntry; |
| 13882 |
| 13883 /* |
| 13884 ** This file contains the implementation of an in-memory hash table used |
| 13885 ** to accumulate "term -> doclist" content before it is flushed to a level-0 |
| 13886 ** segment. |
| 13887 */ |
| 13888 |
| 13889 |
| 13890 struct Fts5Hash { |
| 13891 int *pnByte; /* Pointer to bytes counter */ |
| 13892 int nEntry; /* Number of entries currently in hash */ |
| 13893 int nSlot; /* Size of aSlot[] array */ |
| 13894 Fts5HashEntry *pScan; /* Current ordered scan item */ |
| 13895 Fts5HashEntry **aSlot; /* Array of hash slots */ |
| 13896 }; |
| 13897 |
| 13898 /* |
| 13899 ** Each entry in the hash table is represented by an object of the |
| 13900 ** following type. Each object, its key (zKey[]) and its current data |
| 13901 ** are stored in a single memory allocation. The position list data |
| 13902 ** immediately follows the key data in memory. |
| 13903 ** |
| 13904 ** The data that follows the key is in a similar, but not identical format |
| 13905 ** to the doclist data stored in the database. It is: |
| 13906 ** |
| 13907 ** * Rowid, as a varint |
| 13908 ** * Position list, without 0x00 terminator. |
| 13909 ** * Size of previous position list and rowid, as a 4 byte |
| 13910 ** big-endian integer. |
| 13911 ** |
| 13912 ** iRowidOff: |
| 13913 ** Offset of last rowid written to data area. Relative to first byte of |
| 13914 ** structure. |
| 13915 ** |
| 13916 ** nData: |
| 13917 ** Bytes of data written since iRowidOff. |
| 13918 */ |
| 13919 struct Fts5HashEntry { |
| 13920 Fts5HashEntry *pHashNext; /* Next hash entry with same hash-key */ |
| 13921 Fts5HashEntry *pScanNext; /* Next entry in sorted order */ |
| 13922 |
| 13923 int nAlloc; /* Total size of allocation */ |
| 13924 int iSzPoslist; /* Offset of space for 4-byte poslist size */ |
| 13925 int nData; /* Total bytes of data (incl. structure) */ |
| 13926 u8 bDel; /* Set delete-flag @ iSzPoslist */ |
| 13927 |
| 13928 int iCol; /* Column of last value written */ |
| 13929 int iPos; /* Position of last value written */ |
| 13930 i64 iRowid; /* Rowid of last value written */ |
| 13931 char zKey[8]; /* Nul-terminated entry key */ |
| 13932 }; |
| 13933 |
| 13934 /* |
| 13935 ** Size of Fts5HashEntry without the zKey[] array. |
| 13936 */ |
| 13937 #define FTS5_HASHENTRYSIZE (sizeof(Fts5HashEntry)-8) |
| 13938 |
| 13939 |
| 13940 |
| 13941 /* |
| 13942 ** Allocate a new hash table. |
| 13943 */ |
| 13944 static int sqlite3Fts5HashNew(Fts5Hash **ppNew, int *pnByte){ |
| 13945 int rc = SQLITE_OK; |
| 13946 Fts5Hash *pNew; |
| 13947 |
| 13948 *ppNew = pNew = (Fts5Hash*)sqlite3_malloc(sizeof(Fts5Hash)); |
| 13949 if( pNew==0 ){ |
| 13950 rc = SQLITE_NOMEM; |
| 13951 }else{ |
| 13952 int nByte; |
| 13953 memset(pNew, 0, sizeof(Fts5Hash)); |
| 13954 pNew->pnByte = pnByte; |
| 13955 |
| 13956 pNew->nSlot = 1024; |
| 13957 nByte = sizeof(Fts5HashEntry*) * pNew->nSlot; |
| 13958 pNew->aSlot = (Fts5HashEntry**)sqlite3_malloc(nByte); |
| 13959 if( pNew->aSlot==0 ){ |
| 13960 sqlite3_free(pNew); |
| 13961 *ppNew = 0; |
| 13962 rc = SQLITE_NOMEM; |
| 13963 }else{ |
| 13964 memset(pNew->aSlot, 0, nByte); |
| 13965 } |
| 13966 } |
| 13967 return rc; |
| 13968 } |
| 13969 |
| 13970 /* |
| 13971 ** Free a hash table object. |
| 13972 */ |
| 13973 static void sqlite3Fts5HashFree(Fts5Hash *pHash){ |
| 13974 if( pHash ){ |
| 13975 sqlite3Fts5HashClear(pHash); |
| 13976 sqlite3_free(pHash->aSlot); |
| 13977 sqlite3_free(pHash); |
| 13978 } |
| 13979 } |
| 13980 |
| 13981 /* |
| 13982 ** Empty (but do not delete) a hash table. |
| 13983 */ |
| 13984 static void sqlite3Fts5HashClear(Fts5Hash *pHash){ |
| 13985 int i; |
| 13986 for(i=0; i<pHash->nSlot; i++){ |
| 13987 Fts5HashEntry *pNext; |
| 13988 Fts5HashEntry *pSlot; |
| 13989 for(pSlot=pHash->aSlot[i]; pSlot; pSlot=pNext){ |
| 13990 pNext = pSlot->pHashNext; |
| 13991 sqlite3_free(pSlot); |
| 13992 } |
| 13993 } |
| 13994 memset(pHash->aSlot, 0, pHash->nSlot * sizeof(Fts5HashEntry*)); |
| 13995 pHash->nEntry = 0; |
| 13996 } |
| 13997 |
| 13998 static unsigned int fts5HashKey(int nSlot, const u8 *p, int n){ |
| 13999 int i; |
| 14000 unsigned int h = 13; |
| 14001 for(i=n-1; i>=0; i--){ |
| 14002 h = (h << 3) ^ h ^ p[i]; |
| 14003 } |
| 14004 return (h % nSlot); |
| 14005 } |
| 14006 |
| 14007 static unsigned int fts5HashKey2(int nSlot, u8 b, const u8 *p, int n){ |
| 14008 int i; |
| 14009 unsigned int h = 13; |
| 14010 for(i=n-1; i>=0; i--){ |
| 14011 h = (h << 3) ^ h ^ p[i]; |
| 14012 } |
| 14013 h = (h << 3) ^ h ^ b; |
| 14014 return (h % nSlot); |
| 14015 } |
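The two hash functions above are meant to agree: hashing the bytes of a token and then the prefix byte `b` gives the same slot as hashing the stored key, which is `b` followed by the token. This follows from the loop running from the last byte to the first, so the byte at index 0 is mixed in last. A minimal standalone sketch of that equivalence follows; `hash_key`, `hash_key2` and `hashes_agree` are illustrative names (the first two copy the logic shown above), not part of the FTS5 source.

```c
#include <assert.h>
#include <string.h>

typedef unsigned char u8;

/* Same mixing step as fts5HashKey(): bytes are consumed last-to-first. */
static unsigned int hash_key(int nSlot, const u8 *p, int n){
  int i;
  unsigned int h = 13;
  for(i=n-1; i>=0; i--){
    h = (h << 3) ^ h ^ p[i];
  }
  return (h % nSlot);
}

/* Same as fts5HashKey2(): hash the token, then mix in prefix byte b. */
static unsigned int hash_key2(int nSlot, u8 b, const u8 *p, int n){
  int i;
  unsigned int h = 13;
  for(i=n-1; i>=0; i--){
    h = (h << 3) ^ h ^ p[i];
  }
  h = (h << 3) ^ h ^ b;
  return (h % nSlot);
}

/* Hypothetical check: build the stored key (b + token) and compare the
** two hash values. Returns 1 if they match. */
static int hashes_agree(int nSlot, char b, const char *zToken){
  u8 aKey[64];
  int nToken = (int)strlen(zToken);
  assert( nToken < (int)sizeof(aKey) );
  aKey[0] = (u8)b;
  memcpy(&aKey[1], zToken, (size_t)nToken);
  return hash_key(nSlot, aKey, nToken+1)
      == hash_key2(nSlot, (u8)b, (const u8*)zToken, nToken);
}
```

This equivalence is what lets fts5HashResize() rehash entries from `zKey` alone while sqlite3Fts5HashWrite() hashes the prefix byte and token separately.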
| 14016 |
| 14017 /* |
| 14018 ** Resize the hash table by doubling the number of slots. |
| 14019 */ |
| 14020 static int fts5HashResize(Fts5Hash *pHash){ |
| 14021 int nNew = pHash->nSlot*2; |
| 14022 int i; |
| 14023 Fts5HashEntry **apNew; |
| 14024 Fts5HashEntry **apOld = pHash->aSlot; |
| 14025 |
| 14026 apNew = (Fts5HashEntry**)sqlite3_malloc(nNew*sizeof(Fts5HashEntry*)); |
| 14027 if( !apNew ) return SQLITE_NOMEM; |
| 14028 memset(apNew, 0, nNew*sizeof(Fts5HashEntry*)); |
| 14029 |
| 14030 for(i=0; i<pHash->nSlot; i++){ |
| 14031 while( apOld[i] ){ |
| 14032 int iHash; |
| 14033 Fts5HashEntry *p = apOld[i]; |
| 14034 apOld[i] = p->pHashNext; |
| 14035 iHash = fts5HashKey(nNew, (u8*)p->zKey, (int)strlen(p->zKey)); |
| 14036 p->pHashNext = apNew[iHash]; |
| 14037 apNew[iHash] = p; |
| 14038 } |
| 14039 } |
| 14040 |
| 14041 sqlite3_free(apOld); |
| 14042 pHash->nSlot = nNew; |
| 14043 pHash->aSlot = apNew; |
| 14044 return SQLITE_OK; |
| 14045 } |
| 14046 |
| 14047 static void fts5HashAddPoslistSize(Fts5HashEntry *p){ |
| 14048 if( p->iSzPoslist ){ |
| 14049 u8 *pPtr = (u8*)p; |
| 14050 int nSz = (p->nData - p->iSzPoslist - 1); /* Size in bytes */ |
| 14051 int nPos = nSz*2 + p->bDel; /* Value of nPos field */ |
| 14052 |
| 14053 assert( p->bDel==0 || p->bDel==1 ); |
| 14054 if( nPos<=127 ){ |
| 14055 pPtr[p->iSzPoslist] = (u8)nPos; |
| 14056 }else{ |
| 14057 int nByte = sqlite3Fts5GetVarintLen((u32)nPos); |
| 14058 memmove(&pPtr[p->iSzPoslist + nByte], &pPtr[p->iSzPoslist + 1], nSz); |
| 14059 sqlite3Fts5PutVarint(&pPtr[p->iSzPoslist], nPos); |
| 14060 p->nData += (nByte-1); |
| 14061 } |
| 14062 p->bDel = 0; |
| 14063 p->iSzPoslist = 0; |
| 14064 } |
| 14065 } |
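The function above reserves a single byte for the poslist size while positions are being appended, then patches the real value in afterwards. The stored value is `nSz*2 + bDel`; if it does not fit in one varint byte, the poslist body is shifted right to make room. The sketch below reproduces that idea on a plain byte buffer. The helpers `varint_len`, `put_varint` and `patch_poslist_size` are hypothetical stand-ins written for this example (32-bit values only), not the FTS5 routines.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: bytes needed for the SQLite-style varint of v. */
static int varint_len(unsigned int v){
  int n = 1;
  while( v>=0x80 ){ v >>= 7; n++; }
  return n;
}

/* Hypothetical helper: write v as a big-endian base-128 varint. Every
** byte except the last has the 0x80 continuation bit set. */
static int put_varint(unsigned char *p, unsigned int v){
  int n = varint_len(v);
  int i;
  for(i=n-1; i>=0; i--){
    p[i] = (unsigned char)((v & 0x7f) | (i<n-1 ? 0x80 : 0x00));
    v >>= 7;
  }
  return n;
}

/*
** aBuf[iSz] is the one byte reserved for the size field and
** aBuf[iSz+1 .. nData-1] is the poslist body. Write the size field,
** growing it in place if required. Returns the new value of nData.
** The caller must guarantee a few spare bytes after aBuf[nData-1].
*/
static int patch_poslist_size(unsigned char *aBuf, int iSz, int nData, int bDel){
  int nSz = nData - iSz - 1;                      /* Body size in bytes */
  unsigned int nPos = (unsigned int)(nSz*2 + bDel);
  int nByte = varint_len(nPos);
  assert( bDel==0 || bDel==1 );
  if( nByte>1 ){
    /* Shift the body right by (nByte-1) bytes to widen the size field */
    memmove(&aBuf[iSz+nByte], &aBuf[iSz+1], (size_t)nSz);
  }
  put_varint(&aBuf[iSz], nPos);
  return nData + (nByte-1);
}
```

Reserving one byte is a bet that most poslists are shorter than 64 bytes, in which case no memmove is needed at all.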
| 14066 |
| 14067 static int sqlite3Fts5HashWrite( |
| 14068 Fts5Hash *pHash, |
| 14069 i64 iRowid, /* Rowid for this entry */ |
| 14070 int iCol, /* Column token appears in (-ve -> delete) */ |
| 14071 int iPos, /* Position of token within column */ |
| 14072 char bByte, /* First byte of token */ |
| 14073 const char *pToken, int nToken /* Token to add or remove to or from index */ |
| 14074 ){ |
| 14075 unsigned int iHash; |
| 14076 Fts5HashEntry *p; |
| 14077 u8 *pPtr; |
| 14078 int nIncr = 0; /* Amount to increment (*pHash->pnByte) by */ |
| 14079 |
| 14080 /* Attempt to locate an existing hash entry */ |
| 14081 iHash = fts5HashKey2(pHash->nSlot, (u8)bByte, (const u8*)pToken, nToken); |
| 14082 for(p=pHash->aSlot[iHash]; p; p=p->pHashNext){ |
| 14083 if( p->zKey[0]==bByte |
| 14084 && memcmp(&p->zKey[1], pToken, nToken)==0 |
| 14085 && p->zKey[nToken+1]==0 |
| 14086 ){ |
| 14087 break; |
| 14088 } |
| 14089 } |
| 14090 |
| 14091 /* If an existing hash entry cannot be found, create a new one. */ |
| 14092 if( p==0 ){ |
| 14093 int nByte = FTS5_HASHENTRYSIZE + (nToken+1) + 1 + 64; |
| 14094 if( nByte<128 ) nByte = 128; |
| 14095 |
| 14096 if( (pHash->nEntry*2)>=pHash->nSlot ){ |
| 14097 int rc = fts5HashResize(pHash); |
| 14098 if( rc!=SQLITE_OK ) return rc; |
| 14099 iHash = fts5HashKey2(pHash->nSlot, (u8)bByte, (const u8*)pToken, nToken); |
| 14100 } |
| 14101 |
| 14102 p = (Fts5HashEntry*)sqlite3_malloc(nByte); |
| 14103 if( !p ) return SQLITE_NOMEM; |
| 14104 memset(p, 0, FTS5_HASHENTRYSIZE); |
| 14105 p->nAlloc = nByte; |
| 14106 p->zKey[0] = bByte; |
| 14107 memcpy(&p->zKey[1], pToken, nToken); |
| 14108 assert( iHash==fts5HashKey(pHash->nSlot, (u8*)p->zKey, nToken+1) ); |
| 14109 p->zKey[nToken+1] = '\0'; |
| 14110 p->nData = nToken+1 + 1 + FTS5_HASHENTRYSIZE; |
| 14111 p->nData += sqlite3Fts5PutVarint(&((u8*)p)[p->nData], iRowid); |
| 14112 p->iSzPoslist = p->nData; |
| 14113 p->nData += 1; |
| 14114 p->iRowid = iRowid; |
| 14115 p->pHashNext = pHash->aSlot[iHash]; |
| 14116 pHash->aSlot[iHash] = p; |
| 14117 pHash->nEntry++; |
| 14118 nIncr += p->nData; |
| 14119 } |
| 14120 |
| 14121 /* Check there is enough space to append a new entry. Worst case scenario |
| 14122 ** is: |
| 14123 ** |
| 14124 ** + 9 bytes for a new rowid, |
| 14125 ** + 4 bytes reserved for the "poslist size" varint. |
| 14126 ** + 1 byte for a "new column" byte, |
| 14127 ** + 3 bytes for a new column number (16-bit max) as a varint, |
| 14128 ** + 5 bytes for the new position offset (32-bit max). |
| 14129 */ |
| 14130 if( (p->nAlloc - p->nData) < (9 + 4 + 1 + 3 + 5) ){ |
| 14131 int nNew = p->nAlloc * 2; |
| 14132 Fts5HashEntry *pNew; |
| 14133 Fts5HashEntry **pp; |
| 14134 pNew = (Fts5HashEntry*)sqlite3_realloc(p, nNew); |
| 14135 if( pNew==0 ) return SQLITE_NOMEM; |
| 14136 pNew->nAlloc = nNew; |
| 14137 for(pp=&pHash->aSlot[iHash]; *pp!=p; pp=&(*pp)->pHashNext); |
| 14138 *pp = pNew; |
| 14139 p = pNew; |
| 14140 } |
| 14141 pPtr = (u8*)p; |
| 14142 nIncr -= p->nData; |
| 14143 |
| 14144 /* If this is a new rowid, append the 4-byte size field for the previous |
| 14145 ** entry, and the new rowid for this entry. */ |
| 14146 if( iRowid!=p->iRowid ){ |
| 14147 fts5HashAddPoslistSize(p); |
| 14148 p->nData += sqlite3Fts5PutVarint(&pPtr[p->nData], iRowid - p->iRowid); |
| 14149 p->iSzPoslist = p->nData; |
| 14150 p->nData += 1; |
| 14151 p->iCol = 0; |
| 14152 p->iPos = 0; |
| 14153 p->iRowid = iRowid; |
| 14154 } |
| 14155 |
| 14156 if( iCol>=0 ){ |
| 14157 /* Append a new column value, if necessary */ |
| 14158 assert( iCol>=p->iCol ); |
| 14159 if( iCol!=p->iCol ){ |
| 14160 pPtr[p->nData++] = 0x01; |
| 14161 p->nData += sqlite3Fts5PutVarint(&pPtr[p->nData], iCol); |
| 14162 p->iCol = iCol; |
| 14163 p->iPos = 0; |
| 14164 } |
| 14165 |
| 14166 /* Append the new position offset */ |
| 14167 p->nData += sqlite3Fts5PutVarint(&pPtr[p->nData], iPos - p->iPos + 2); |
| 14168 p->iPos = iPos; |
| 14169 }else{ |
| 14170 /* This is a delete. Set the delete flag. */ |
| 14171 p->bDel = 1; |
| 14172 } |
| 14173 nIncr += p->nData; |
| 14174 |
| 14175 *pHash->pnByte += nIncr; |
| 14176 return SQLITE_OK; |
| 14177 } |
| 14178 |
| 14179 |
| 14180 /* |
| 14181 ** Arguments pLeft and pRight point to linked-lists of hash-entry objects, |
| 14182 ** each sorted in key order. This function merges the two lists into a |
| 14183 ** single list and returns a pointer to its first element. |
| 14184 */ |
| 14185 static Fts5HashEntry *fts5HashEntryMerge( |
| 14186 Fts5HashEntry *pLeft, |
| 14187 Fts5HashEntry *pRight |
| 14188 ){ |
| 14189 Fts5HashEntry *p1 = pLeft; |
| 14190 Fts5HashEntry *p2 = pRight; |
| 14191 Fts5HashEntry *pRet = 0; |
| 14192 Fts5HashEntry **ppOut = &pRet; |
| 14193 |
| 14194 while( p1 || p2 ){ |
| 14195 if( p1==0 ){ |
| 14196 *ppOut = p2; |
| 14197 p2 = 0; |
| 14198 }else if( p2==0 ){ |
| 14199 *ppOut = p1; |
| 14200 p1 = 0; |
| 14201 }else{ |
| 14202 int i = 0; |
| 14203 while( p1->zKey[i]==p2->zKey[i] ) i++; |
| 14204 |
| 14205 if( ((u8)p1->zKey[i])>((u8)p2->zKey[i]) ){ |
| 14206 /* p2 is smaller */ |
| 14207 *ppOut = p2; |
| 14208 ppOut = &p2->pScanNext; |
| 14209 p2 = p2->pScanNext; |
| 14210 }else{ |
| 14211 /* p1 is smaller */ |
| 14212 *ppOut = p1; |
| 14213 ppOut = &p1->pScanNext; |
| 14214 p1 = p1->pScanNext; |
| 14215 } |
| 14216 *ppOut = 0; |
| 14217 } |
| 14218 } |
| 14219 |
| 14220 return pRet; |
| 14221 } |
| 14222 |
| 14223 /* |
| 14224 ** Extract all tokens from hash table pHash and link them into a list |
| 14225 ** in sorted order. The hash table is cleared before returning. It is |
| 14226 ** the responsibility of the caller to free the elements of the returned |
| 14227 ** list. |
| 14228 */ |
| 14229 static int fts5HashEntrySort( |
| 14230 Fts5Hash *pHash, |
| 14231 const char *pTerm, int nTerm, /* Query prefix, if any */ |
| 14232 Fts5HashEntry **ppSorted |
| 14233 ){ |
| 14234 const int nMergeSlot = 32; |
| 14235 Fts5HashEntry **ap; |
| 14236 Fts5HashEntry *pList; |
| 14237 int iSlot; |
| 14238 int i; |
| 14239 |
| 14240 *ppSorted = 0; |
| 14241 ap = sqlite3_malloc(sizeof(Fts5HashEntry*) * nMergeSlot); |
| 14242 if( !ap ) return SQLITE_NOMEM; |
| 14243 memset(ap, 0, sizeof(Fts5HashEntry*) * nMergeSlot); |
| 14244 |
| 14245 for(iSlot=0; iSlot<pHash->nSlot; iSlot++){ |
| 14246 Fts5HashEntry *pIter; |
| 14247 for(pIter=pHash->aSlot[iSlot]; pIter; pIter=pIter->pHashNext){ |
| 14248 if( pTerm==0 || 0==memcmp(pIter->zKey, pTerm, nTerm) ){ |
| 14249 Fts5HashEntry *pEntry = pIter; |
| 14250 pEntry->pScanNext = 0; |
| 14251 for(i=0; ap[i]; i++){ |
| 14252 pEntry = fts5HashEntryMerge(pEntry, ap[i]); |
| 14253 ap[i] = 0; |
| 14254 } |
| 14255 ap[i] = pEntry; |
| 14256 } |
| 14257 } |
| 14258 } |
| 14259 |
| 14260 pList = 0; |
| 14261 for(i=0; i<nMergeSlot; i++){ |
| 14262 pList = fts5HashEntryMerge(pList, ap[i]); |
| 14263 } |
| 14264 |
| 14265 pHash->nEntry = 0; |
| 14266 sqlite3_free(ap); |
| 14267 *ppSorted = pList; |
| 14268 return SQLITE_OK; |
| 14269 } |
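The sort above is a bottom-up merge sort over linked lists: slot `i` of the `ap[]` array holds either nothing or a sorted list, and each new entry is merged upwards through occupied slots, much like carrying in a binary counter. A minimal standalone sketch of the same technique follows, using a simplified `Node` type and `strcmp()` ordering. All names here are illustrative assumptions, not FTS5 identifiers.

```c
#include <assert.h>
#include <string.h>

typedef struct Node Node;
struct Node {
  const char *zKey;     /* Nul-terminated sort key */
  Node *pNext;          /* Next node in sorted order */
};

#define N_MERGE_SLOT 32

/* Merge two sorted lists. On ties the node from p1 comes first. */
static Node *merge_lists(Node *p1, Node *p2){
  Node *pRet = 0;
  Node **ppOut = &pRet;
  while( p1 && p2 ){
    if( strcmp(p1->zKey, p2->zKey)>0 ){
      *ppOut = p2; ppOut = &p2->pNext; p2 = p2->pNext;
    }else{
      *ppOut = p1; ppOut = &p1->pNext; p1 = p1->pNext;
    }
  }
  *ppOut = (p1 ? p1 : p2);   /* Append whichever list has nodes left */
  return pRet;
}

/* Sort the nNode nodes in aNode[] and return the head of the list. */
static Node *sort_nodes(Node *aNode, int nNode){
  Node *ap[N_MERGE_SLOT];
  Node *pList = 0;
  int i, j;
  memset(ap, 0, sizeof(ap));
  for(j=0; j<nNode; j++){
    Node *pEntry = &aNode[j];
    pEntry->pNext = 0;
    /* A list in slot i was built from 2^i nodes, so with an int node
    ** count this loop cannot run past the end of the array. */
    for(i=0; ap[i]; i++){
      pEntry = merge_lists(pEntry, ap[i]);
      ap[i] = 0;
    }
    ap[i] = pEntry;
  }
  /* Fold the remaining partial lists into one. */
  for(i=0; i<N_MERGE_SLOT; i++){
    pList = merge_lists(pList, ap[i]);
  }
  return pList;
}
```

The approach needs no recursion and no list-length precomputation, which suits a hash table whose entries are only reachable by walking slot chains.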
| 14270 |
| 14271 /* |
| 14272 ** Query the hash table for a doclist associated with term pTerm/nTerm. |
| 14273 */ |
| 14274 static int sqlite3Fts5HashQuery( |
| 14275 Fts5Hash *pHash, /* Hash table to query */ |
| 14276 const char *pTerm, int nTerm, /* Query term */ |
| 14277 const u8 **ppDoclist, /* OUT: Pointer to doclist for pTerm */ |
| 14278 int *pnDoclist /* OUT: Size of doclist in bytes */ |
| 14279 ){ |
| 14280 unsigned int iHash = fts5HashKey(pHash->nSlot, (const u8*)pTerm, nTerm); |
| 14281 Fts5HashEntry *p; |
| 14282 |
| 14283 for(p=pHash->aSlot[iHash]; p; p=p->pHashNext){ |
| 14284 if( memcmp(p->zKey, pTerm, nTerm)==0 && p->zKey[nTerm]==0 ) break; |
| 14285 } |
| 14286 |
| 14287 if( p ){ |
| 14288 fts5HashAddPoslistSize(p); |
| 14289 *ppDoclist = (const u8*)&p->zKey[nTerm+1]; |
| 14290 *pnDoclist = p->nData - (FTS5_HASHENTRYSIZE + nTerm + 1); |
| 14291 }else{ |
| 14292 *ppDoclist = 0; |
| 14293 *pnDoclist = 0; |
| 14294 } |
| 14295 |
| 14296 return SQLITE_OK; |
| 14297 } |
| 14298 |
| 14299 static int sqlite3Fts5HashScanInit( |
| 14300 Fts5Hash *p, /* Hash table to query */ |
| 14301 const char *pTerm, int nTerm /* Query prefix */ |
| 14302 ){ |
| 14303 return fts5HashEntrySort(p, pTerm, nTerm, &p->pScan); |
| 14304 } |
| 14305 |
| 14306 static void sqlite3Fts5HashScanNext(Fts5Hash *p){ |
| 14307 assert( !sqlite3Fts5HashScanEof(p) ); |
| 14308 p->pScan = p->pScan->pScanNext; |
| 14309 } |
| 14310 |
| 14311 static int sqlite3Fts5HashScanEof(Fts5Hash *p){ |
| 14312 return (p->pScan==0); |
| 14313 } |
| 14314 |
| 14315 static void sqlite3Fts5HashScanEntry( |
| 14316 Fts5Hash *pHash, |
| 14317 const char **pzTerm, /* OUT: term (nul-terminated) */ |
| 14318 const u8 **ppDoclist, /* OUT: pointer to doclist */ |
| 14319 int *pnDoclist /* OUT: size of doclist in bytes */ |
| 14320 ){ |
| 14321 Fts5HashEntry *p; |
| 14322 if( (p = pHash->pScan) ){ |
| 14323 int nTerm = (int)strlen(p->zKey); |
| 14324 fts5HashAddPoslistSize(p); |
| 14325 *pzTerm = p->zKey; |
| 14326 *ppDoclist = (const u8*)&p->zKey[nTerm+1]; |
| 14327 *pnDoclist = p->nData - (FTS5_HASHENTRYSIZE + nTerm + 1); |
| 14328 }else{ |
| 14329 *pzTerm = 0; |
| 14330 *ppDoclist = 0; |
| 14331 *pnDoclist = 0; |
| 14332 } |
| 14333 } |
| 14334 |
| 14335 |
| 14336 /* |
| 14337 ** 2014 May 31 |
| 14338 ** |
| 14339 ** The author disclaims copyright to this source code. In place of |
| 14340 ** a legal notice, here is a blessing: |
| 14341 ** |
| 14342 ** May you do good and not evil. |
| 14343 ** May you find forgiveness for yourself and forgive others. |
| 14344 ** May you share freely, never taking more than you give. |
| 14345 ** |
| 14346 ****************************************************************************** |
| 14347 ** |
| 14348 ** Low level access to the FTS index stored in the database file. The |
| 14349 ** routines in this file implement all read and write access to the |
| 14350 ** %_data table. Other parts of the system access this functionality via |
| 14351 ** the interface defined in fts5Int.h. |
| 14352 */ |
| 14353 |
| 14354 |
| 14355 /* #include "fts5Int.h" */ |
| 14356 |
| 14357 /* |
| 14358 ** Overview: |
| 14359 ** |
| 14360 ** The %_data table contains all the FTS indexes for an FTS5 virtual table. |
| 14361 ** As well as the main term index, there may be up to 31 prefix indexes. |
| 14362 ** The format is similar to FTS3/4, except that: |
| 14363 ** |
| 14364 ** * all segment b-tree leaf data is stored in fixed size page records |
| 14365 ** (e.g. 1000 bytes). A single doclist may span multiple pages. Care is |
| 14366 ** taken to ensure it is possible to iterate in either direction through |
| 14367 ** the entries in a doclist, or to seek to a specific entry within a |
| 14368 ** doclist, without loading it into memory. |
| 14369 ** |
| 14370 ** * large doclists that span many pages have associated "doclist index" |
| 14371 ** records that contain a copy of the first rowid on each page spanned by |
| 14372 ** the doclist. This is used to speed up seek operations, and merges of |
| 14373 ** large doclists with very small doclists. |
| 14374 ** |
| 14375 ** * extra fields in the "structure record" record the state of ongoing |
| 14376 ** incremental merge operations. |
| 14377 ** |
| 14378 */ |
| 14379 |
| 14380 |
| 14381 #define FTS5_OPT_WORK_UNIT 1000 /* Number of leaf pages per optimize step */ |
| 14382 #define FTS5_WORK_UNIT 64 /* Number of leaf pages in unit of work */ |
| 14383 |
| 14384 #define FTS5_MIN_DLIDX_SIZE 4 /* Add dlidx if this many empty pages */ |
| 14385 |
| 14386 #define FTS5_MAIN_PREFIX '0' |
| 14387 |
| 14388 #if FTS5_MAX_PREFIX_INDEXES > 31 |
| 14389 # error "FTS5_MAX_PREFIX_INDEXES is too large" |
| 14390 #endif |
| 14391 |
| 14392 /* |
| 14393 ** Details: |
| 14394 ** |
| 14395 ** The %_data table managed by this module, |
| 14396 ** |
| 14397 ** CREATE TABLE %_data(id INTEGER PRIMARY KEY, block BLOB); |
| 14398 ** |
| 14399 ** , contains the following 5 types of records. See the comments surrounding |
| 14400 ** the FTS5_*_ROWID macros below for a description of how %_data rowids are |
| 14401 ** assigned to each of them. |
| 14402 ** |
| 14403 ** 1. Structure Records: |
| 14404 ** |
| 14405 ** The set of segments that make up an index - the index structure - are |
| 14406 ** recorded in a single record within the %_data table. The record consists |
| 14407 ** of a single 32-bit configuration cookie value followed by a list of |
| 14408 ** SQLite varints. If the FTS table features more than one index (because |
| 14409 ** there are one or more prefix indexes), it is guaranteed that all share |
| 14410 ** the same cookie value. |
| 14411 ** |
| 14412 ** Immediately following the configuration cookie, the record begins with |
| 14413 ** three varints: |
| 14414 ** |
| 14415 ** + number of levels, |
| 14416 ** + total number of segments on all levels, |
| 14417 ** + value of write counter. |
| 14418 ** |
| 14419 ** Then, for each level from 0 to nMax: |
| 14420 ** |
| 14421 ** + number of input segments in ongoing merge. |
| 14422 ** + total number of segments in level. |
| 14423 ** + for each segment from oldest to newest: |
| 14424 ** + segment id (always > 0) |
| 14425 ** + first leaf page number (often 1, always greater than 0) |
| 14426 ** + final leaf page number |
| 14427 ** |
| 14428 ** 2. The Averages Record: |
| 14429 ** |
| 14430 ** A single record within the %_data table. The data is a list of varints. |
| 14431 ** The first value is the number of rows in the index. Then, for each column |
| 14432 ** from left to right, the total number of tokens in the column for all |
| 14433 ** rows of the table. |
| 14434 ** |
| 14435 ** 3. Segment leaves: |
| 14436 ** |
| 14437 ** TERM/DOCLIST FORMAT: |
| 14438 ** |
| 14439 ** Most of each segment leaf is taken up by term/doclist data. The |
| 14440 ** general format of term/doclist, starting with the first term |
| 14441 ** on the leaf page, is: |
| 14442 ** |
| 14443 ** varint : size of first term |
| 14444 ** blob: first term data |
| 14445 ** doclist: first doclist |
| 14446 ** zero-or-more { |
| 14447 ** varint: number of bytes in common with previous term |
| 14448 ** varint: number of bytes of new term data (nNew) |
| 14449 ** blob: nNew bytes of new term data |
| 14450 ** doclist: next doclist |
| 14451 ** } |
| 14452 ** |
| 14453 ** doclist format: |
| 14454 ** |
| 14455 ** varint: first rowid |
| 14456 ** poslist: first poslist |
| 14457 ** zero-or-more { |
| 14458 ** varint: rowid delta (always > 0) |
| 14459 ** poslist: next poslist |
| 14460 ** } |
| 14461 ** |
| 14462 ** poslist format: |
| 14463 ** |
| 14464 ** varint: size of poslist in bytes multiplied by 2, not including |
| 14465 ** this field. Plus 1 if this entry carries the "delete" flag. |
| 14466 ** collist: collist for column 0 |
| 14467 ** zero-or-more { |
| 14468 ** 0x01 byte |
| 14469 ** varint: column number (I) |
| 14470 ** collist: collist for column I |
| 14471 ** } |
| 14472 ** |
| 14473 ** collist format: |
| 14474 ** |
| 14475 ** varint: first offset + 2 |
| 14476 ** zero-or-more { |
| 14477 ** varint: offset delta + 2 |
| 14478 ** } |
| 14479 ** |
| 14480 ** PAGE FORMAT |
| 14481 ** |
| 14482 ** Each leaf page begins with a 4-byte header containing 2 16-bit |
| 14483 ** unsigned integer fields in big-endian format. They are: |
| 14484 ** |
| 14485 ** * The byte offset of the first rowid on the page, if it exists |
| 14486 ** and occurs before the first term (otherwise 0). |
| 14487 ** |
| 14488 ** * The byte offset of the start of the page footer. If the page |
| 14489 ** footer is 0 bytes in size, then this field is the same as the |
| 14490 ** size of the leaf page in bytes. |
| 14491 ** |
| 14492 ** The page footer consists of a single varint for each term located |
| 14493 ** on the page. Each varint is the byte offset of the current term |
| 14494 ** within the page, delta-compressed against the previous value. In |
| 14495 ** other words, the first varint in the footer is the byte offset of |
| 14496 ** the first term, the second is the byte offset of the second less that |
| 14497 ** of the first, and so on. |
| 14498 ** |
| 14499 ** The term/doclist format described above is accurate if the entire |
| 14500 ** term/doclist data fits on a single leaf page. If this is not the case, |
| 14501 ** the format is changed in two ways: |
| 14502 ** |
| 14503 ** + if the first rowid on a page occurs before the first term, it |
| 14504 ** is stored as a literal value: |
| 14505 ** |
| 14506 ** varint: first rowid |
| 14507 ** |
| 14508 ** + the first term on each page is stored in the same way as the |
| 14509 ** very first term of the segment: |
| 14510 ** |
| 14511 ** varint : size of first term |
| 14512 ** blob: first term data |
| 14513 ** |
| 14514 ** 5. Segment doclist indexes: |
| 14515 ** |
| 14516 ** Doclist indexes are themselves b-trees, however they usually consist of |
| 14517 ** a single leaf record only. The format of each doclist index leaf page |
| 14518 ** is: |
| 14519 ** |
| 14520 ** * Flags byte. Bits are: |
| 14521 ** 0x01: Clear if leaf is also the root page, otherwise set. |
| 14522 ** |
| 14523 ** * Page number of fts index leaf page. As a varint. |
| 14524 ** |
| 14525 ** * First rowid on page indicated by previous field. As a varint. |
| 14526 ** |
| 14527 ** * A list of varints, one for each subsequent termless page. A |
| 14528 ** positive delta if the termless page contains at least one rowid, |
| 14529 ** or a 0x00 byte otherwise. |
| 14530 ** |
| 14531 ** Internal doclist index nodes are: |
| 14532 ** |
| 14533 ** * Flags byte. Bits are: |
| 14534 ** 0x01: Clear for root page, otherwise set. |
| 14535 ** |
| 14536 ** * Page number of first child page. As a varint. |
| 14537 ** |
| 14538 ** * Copy of first rowid on page indicated by previous field. As a varint. |
| 14539 ** |
| 14540 ** * A list of delta-encoded varints - the first rowid on each subsequent |
| 14541 ** child page. |
| 14542 ** |
| 14543 */ |
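The page-footer encoding described in the comment above (one delta-compressed varint per term) can be illustrated with a small decoder. This is a minimal sketch, not the FTS5 implementation: `get_varint32()` and `decode_footer()` are hypothetical helpers, the varint reader is simplified to values that fit in 32 bits, and the page is assumed to be well formed.

```c
#include <assert.h>

/* Simplified varint reader (assumption: the value fits in 32 bits).
** Each byte contributes its low 7 bits, most significant group first;
** a set high bit means another byte follows. Returns bytes consumed. */
static int get_varint32(const unsigned char *a, unsigned int *pVal){
  unsigned int v = 0;
  int i = 0;
  do{
    v = (v << 7) | (a[i] & 0x7F);
  }while( (a[i++] & 0x80) && i<5 );
  *pVal = v;
  return i;
}

/* Decode the page footer that occupies aPage[szLeaf..nPage-1]. Writes
** up to nMax absolute term offsets into aOff[] and returns the number
** of terms found. Assumes the footer is not truncated mid-varint. */
static int decode_footer(
  const unsigned char *aPage, int nPage, int szLeaf,
  int *aOff, int nMax
){
  int iOff = szLeaf;   /* Current read position within the footer */
  int iPrev = 0;       /* Previous absolute term offset */
  int nTerm = 0;
  assert( szLeaf<=nPage );
  while( iOff<nPage && nTerm<nMax ){
    unsigned int delta;
    iOff += get_varint32(&aPage[iOff], &delta);
    iPrev += (int)delta;        /* Undo the delta compression */
    aOff[nTerm++] = iPrev;
  }
  return nTerm;
}
```

For a footer holding the bytes 0x04, 0x02, 0x81, 0x00, the decoder yields the absolute offsets 4, 6 and 134: the first varint is taken literally and each later one is added to the running total.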
| 14544 |
| 14545 /* |
| 14546 ** Rowids for the averages and structure records in the %_data table. |
| 14547 */ |
| 14548 #define FTS5_AVERAGES_ROWID 1 /* Rowid used for the averages record */ |
| 14549 #define FTS5_STRUCTURE_ROWID 10 /* The structure record */ |
| 14550 |
| 14551 /* |
| 14552 ** Macros determining the rowids used by segment leaves and dlidx leaves |
| 14553 ** and nodes. All nodes and leaves are stored in the %_data table with large |
| 14554 ** positive rowids. |
| 14555 ** |
| 14556 ** Each segment has a unique non-zero 16-bit id. |
| 14557 ** |
| 14558 ** The rowid for each segment leaf is found by passing the segment id and |
| 14559 ** the leaf page number to the FTS5_SEGMENT_ROWID macro. Leaves are numbered |
| 14560 ** sequentially starting from 1. |
| 14561 */ |
| 14562 #define FTS5_DATA_ID_B 16 /* Max seg id number 65535 */ |
| 14563 #define FTS5_DATA_DLI_B 1 /* Doclist-index flag (1 bit) */ |
| 14564 #define FTS5_DATA_HEIGHT_B 5 /* Max dlidx tree height of 32 */ |
| 14565 #define FTS5_DATA_PAGE_B 31 /* Max page number of 2147483647 */ |
| 14566 |
| 14567 #define fts5_dri(segid, dlidx, height, pgno) ( \ |
| 14568 ((i64)(segid) << (FTS5_DATA_PAGE_B+FTS5_DATA_HEIGHT_B+FTS5_DATA_DLI_B)) + \ |
| 14569 ((i64)(dlidx) << (FTS5_DATA_PAGE_B + FTS5_DATA_HEIGHT_B)) + \ |
| 14570 ((i64)(height) << (FTS5_DATA_PAGE_B)) + \ |
| 14571 ((i64)(pgno)) \ |
| 14572 ) |
| 14573 |
| 14574 #define FTS5_SEGMENT_ROWID(segid, pgno) fts5_dri(segid, 0, 0, pgno) |
| 14575 #define FTS5_DLIDX_ROWID(segid, height, pgno) fts5_dri(segid, 1, height, pgno) |
| 14576 |
| 14577 /* |
| 14578 ** Maximum segments permitted in a single index |
| 14579 */ |
| 14580 #define FTS5_MAX_SEGMENT 2000 |
| 14581 |
| 14582 #ifdef SQLITE_DEBUG |
| 14583 static int sqlite3Fts5Corrupt(void) { return SQLITE_CORRUPT_VTAB; } |
| 14584 #endif |
| 14585 |
| 14586 |
| 14587 /* |
| 14588 ** Each time a blob is read from the %_data table, it is padded with this |
| 14589 ** many zero bytes. This makes it easier to decode the various record formats |
| 14590 ** without overreading if the records are corrupt. |
| 14591 */ |
| 14592 #define FTS5_DATA_ZERO_PADDING 8 |
| 14593 #define FTS5_DATA_PADDING 20 |
| 14594 |
| 14595 typedef struct Fts5Data Fts5Data; |
| 14596 typedef struct Fts5DlidxIter Fts5DlidxIter; |
| 14597 typedef struct Fts5DlidxLvl Fts5DlidxLvl; |
| 14598 typedef struct Fts5DlidxWriter Fts5DlidxWriter; |
| 14599 typedef struct Fts5PageWriter Fts5PageWriter; |
| 14600 typedef struct Fts5SegIter Fts5SegIter; |
| 14601 typedef struct Fts5DoclistIter Fts5DoclistIter; |
| 14602 typedef struct Fts5SegWriter Fts5SegWriter; |
| 14603 typedef struct Fts5Structure Fts5Structure; |
| 14604 typedef struct Fts5StructureLevel Fts5StructureLevel; |
| 14605 typedef struct Fts5StructureSegment Fts5StructureSegment; |
| 14606 |
| 14607 struct Fts5Data { |
| 14608 u8 *p; /* Pointer to buffer containing record */ |
| 14609 int nn; /* Size of record in bytes */ |
| 14610 int szLeaf; /* Size of leaf without page-index */ |
| 14611 }; |
| 14612 |
| 14613 /* |
| 14614 ** One object per %_data table. |
| 14615 */ |
| 14616 struct Fts5Index { |
| 14617 Fts5Config *pConfig; /* Virtual table configuration */ |
| 14618 char *zDataTbl; /* Name of %_data table */ |
| 14619 int nWorkUnit; /* Leaf pages in a "unit" of work */ |
| 14620 |
| 14621 /* |
| 14622 ** Variables related to the accumulation of tokens and doclists within the |
| 14623 ** in-memory hash tables before they are flushed to disk. |
| 14624 */ |
| 14625 Fts5Hash *pHash; /* Hash table for in-memory data */ |
| 14626 int nPendingData; /* Current bytes of pending data */ |
| 14627 i64 iWriteRowid; /* Rowid for current doc being written */ |
| 14628 int bDelete; /* Current write is a delete */ |
| 14629 |
| 14630 /* Error state. */ |
| 14631 int rc; /* Current error code */ |
| 14632 |
| 14633 /* State used by the fts5DataXXX() functions. */ |
| 14634 sqlite3_blob *pReader; /* RO incr-blob open on %_data table */ |
| 14635 sqlite3_stmt *pWriter; /* "INSERT ... %_data VALUES(?,?)" */ |
| 14636 sqlite3_stmt *pDeleter; /* "DELETE FROM %_data ... id>=? AND id<=?" */ |
| 14637 sqlite3_stmt *pIdxWriter; /* "INSERT ... %_idx VALUES(?,?,?,?)" */ |
| 14638 sqlite3_stmt *pIdxDeleter; /* "DELETE FROM %_idx WHERE segid=?" */ |
| 14639 sqlite3_stmt *pIdxSelect; |
| 14640 int nRead; /* Total number of blocks read */ |
| 14641 }; |
| 14642 |
| 14643 struct Fts5DoclistIter { |
| 14644 u8 *aEof; /* Pointer to 1 byte past end of doclist */ |
| 14645 |
| 14646 /* Output variables. aPoslist==0 at EOF */ |
| 14647 i64 iRowid; |
| 14648 u8 *aPoslist; |
| 14649 int nPoslist; |
| 14650 int nSize; |
| 14651 }; |
| 14652 |
| 14653 /* |
| 14654 ** The contents of the "structure" record for each index are represented |
| 14655 ** using an Fts5Structure record in memory, which uses instances of the |
| 14656 ** other Fts5StructureXXX types as components. |
| 14657 */ |
| 14658 struct Fts5StructureSegment { |
| 14659 int iSegid; /* Segment id */ |
| 14660 int pgnoFirst; /* First leaf page number in segment */ |
| 14661 int pgnoLast; /* Last leaf page number in segment */ |
| 14662 }; |
| 14663 struct Fts5StructureLevel { |
| 14664 int nMerge; /* Number of segments in incr-merge */ |
| 14665 int nSeg; /* Total number of segments on level */ |
| 14666 Fts5StructureSegment *aSeg; /* Array of segments. aSeg[0] is oldest. */ |
| 14667 }; |
| 14668 struct Fts5Structure { |
| 14669 int nRef; /* Object reference count */ |
| 14670 u64 nWriteCounter; /* Total leaves written to level 0 */ |
| 14671 int nSegment; /* Total segments in this structure */ |
| 14672 int nLevel; /* Number of levels in this index */ |
| 14673 Fts5StructureLevel aLevel[1]; /* Array of nLevel level objects */ |
| 14674 }; |
| 14675 |
| 14676 /* |
| 14677 ** An object of type Fts5SegWriter is used to write to segments. |
| 14678 */ |
| 14679 struct Fts5PageWriter { |
| 14680 int pgno; /* Page number for this page */ |
| 14681 int iPrevPgidx; /* Previous value written into pgidx */ |
| 14682 Fts5Buffer buf; /* Buffer containing leaf data */ |
| 14683 Fts5Buffer pgidx; /* Buffer containing page-index */ |
| 14684 Fts5Buffer term; /* Buffer containing previous term on page */ |
| 14685 }; |
| 14686 struct Fts5DlidxWriter { |
| 14687 int pgno; /* Page number for this page */ |
| 14688 int bPrevValid; /* True if iPrev is valid */ |
| 14689 i64 iPrev; /* Previous rowid value written to page */ |
| 14690 Fts5Buffer buf; /* Buffer containing page data */ |
| 14691 }; |
| 14692 struct Fts5SegWriter { |
| 14693 int iSegid; /* Segid to write to */ |
| 14694 Fts5PageWriter writer; /* PageWriter object */ |
| 14695 i64 iPrevRowid; /* Previous rowid written to current leaf */ |
| 14696 u8 bFirstRowidInDoclist; /* True if next rowid is first in doclist */ |
| 14697 u8 bFirstRowidInPage; /* True if next rowid is first in page */ |
| 14698 /* TODO1: Can use (writer.pgidx.n==0) instead of bFirstTermInPage */ |
| 14699 u8 bFirstTermInPage; /* True if next term will be first in leaf */ |
| 14700 int nLeafWritten; /* Number of leaf pages written */ |
| 14701 int nEmpty; /* Number of contiguous term-less nodes */ |
| 14702 |
| 14703 int nDlidx; /* Allocated size of aDlidx[] array */ |
| 14704 Fts5DlidxWriter *aDlidx; /* Array of Fts5DlidxWriter objects */ |
| 14705 |
| 14706 /* Values to insert into the %_idx table */ |
| 14707 Fts5Buffer btterm; /* Next term to insert into %_idx table */ |
| 14708 int iBtPage; /* Page number corresponding to btterm */ |
| 14709 }; |
| 14710 |
| 14711 typedef struct Fts5CResult Fts5CResult; |
| 14712 struct Fts5CResult { |
| 14713 u16 iFirst; /* aSeg[] index of the iterator that sorts first */ |
| 14714 u8 bTermEq; /* True if the terms are equal */ |
| 14715 }; |
| 14716 |
| 14717 /* |
| 14718 ** Object for iterating through a single segment, visiting each term/rowid |
| 14719 ** pair in the segment. |
| 14720 ** |
| 14721 ** pSeg: |
| 14722 ** The segment to iterate through. |
| 14723 ** |
| 14724 ** iLeafPgno: |
| 14725 ** Current leaf page number within segment. |
| 14726 ** |
| 14727 ** iLeafOffset: |
| 14728 ** Byte offset within the current leaf that is the first byte of the |
| 14729 ** position list data (one byte past the position-list size field). |
| 14730 ** rowid field of the current entry. Usually this is the size field of the |
| 14731 ** position list data. The exception is if the rowid for the current entry |
| 14732 ** is the last thing on the leaf page. |
| 14733 ** |
| 14734 ** pLeaf: |
| 14735 ** Buffer containing current leaf page data. Set to NULL at EOF. |
| 14736 ** |
| 14737 ** iTermLeafPgno, iTermLeafOffset: |
| 14738 ** Leaf page number containing the last term read from the segment, and |
| 14739 ** the offset immediately following the term data. |
| 14740 ** |
| 14741 ** flags: |
| 14742 ** Mask of FTS5_SEGITER_XXX values. Interpreted as follows: |
| 14743 ** |
| 14744 ** FTS5_SEGITER_ONETERM: |
| 14745 ** If set, set the iterator to point to EOF after the current doclist |
| 14746 ** has been exhausted. Do not proceed to the next term in the segment. |
| 14747 ** |
| 14748 ** FTS5_SEGITER_REVERSE: |
| 14749 ** This flag is only ever set if FTS5_SEGITER_ONETERM is also set. If |
| 14750 ** it is set, iterate through rowids in descending order instead of the |
| 14751 ** default ascending order. |
| 14752 ** |
| 14753 ** iRowidOffset/nRowidOffset/aRowidOffset: |
| 14754 ** These are used if the FTS5_SEGITER_REVERSE flag is set. |
| 14755 ** |
| 14756 ** For each rowid on the page corresponding to the current term, the |
| 14757 ** corresponding aRowidOffset[] entry is set to the byte offset of the |
| 14758 ** start of the "position-list-size" field within the page. |
| 14759 ** |
| 14760 ** iTermIdx: |
| 14761 ** Index of current term on iTermLeafPgno. |
| 14762 */ |
| 14763 struct Fts5SegIter { |
| 14764 Fts5StructureSegment *pSeg; /* Segment to iterate through */ |
| 14765 int flags; /* Mask of configuration flags */ |
| 14766 int iLeafPgno; /* Current leaf page number */ |
| 14767 Fts5Data *pLeaf; /* Current leaf data */ |
| 14768 Fts5Data *pNextLeaf; /* Leaf page (iLeafPgno+1) */ |
| 14769 int iLeafOffset; /* Byte offset within current leaf */ |
| 14770 |
| 14771 /* The page and offset from which the current term was read. The offset |
| 14772 ** is the offset of the first rowid in the current doclist. */ |
| 14773 int iTermLeafPgno; |
| 14774 int iTermLeafOffset; |
| 14775 |
| 14776 int iPgidxOff; /* Next offset in pgidx */ |
| 14777 int iEndofDoclist; |
| 14778 |
| 14779 /* The following are only used if the FTS5_SEGITER_REVERSE flag is set. */ |
| 14780 int iRowidOffset; /* Current entry in aRowidOffset[] */ |
| 14781 int nRowidOffset; /* Allocated size of aRowidOffset[] array */ |
| 14782 int *aRowidOffset; /* Array of offsets to rowid fields */ |
| 14783 |
| 14784 Fts5DlidxIter *pDlidx; /* If there is a doclist-index */ |
| 14785 |
| 14786 /* Variables populated based on current entry. */ |
| 14787 Fts5Buffer term; /* Current term */ |
| 14788 i64 iRowid; /* Current rowid */ |
| 14789 int nPos; /* Number of bytes in current position list */ |
| 14790 int bDel; /* True if the delete flag is set */ |
| 14791 }; |
| 14792 |
| 14793 /* |
| 14794 ** Argument is a pointer to an Fts5Data structure that contains a |
| 14795 ** leaf page. |
| 14796 */ |
| 14797 #define ASSERT_SZLEAF_OK(x) assert( \ |
| 14798 (x)->szLeaf==(x)->nn || (x)->szLeaf==fts5GetU16(&(x)->p[2]) \ |
| 14799 ) |
| 14800 |
| 14801 #define FTS5_SEGITER_ONETERM 0x01 |
| 14802 #define FTS5_SEGITER_REVERSE 0x02 |
| 14803 |
| 14804 |
| 14805 /* |
| 14806 ** Argument is a pointer to an Fts5Data structure that contains a leaf |
| 14807 ** page. This macro evaluates to true if the leaf contains no terms, or |
| 14808 ** false if it contains at least one term. |
| 14809 */ |
| 14810 #define fts5LeafIsTermless(x) ((x)->szLeaf >= (x)->nn) |
| 14811 |
| 14812 #define fts5LeafTermOff(x, i) (fts5GetU16(&(x)->p[(x)->szLeaf + (i)*2])) |
| 14813 |
| 14814 #define fts5LeafFirstRowidOff(x) (fts5GetU16((x)->p)) |
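The three leaf macros above all read from the same 4-byte page header: two big-endian 16-bit fields giving the offset of the first rowid and the offset of the page footer. A minimal sketch of reading that header is shown here; `LeafHeader`, `get_u16()` and `read_leaf_header()` are hypothetical names, and the page is assumed to be at least 4 bytes long.

```c
#include <assert.h>

/* Hypothetical decoded view of the 4-byte leaf header. */
typedef struct LeafHeader {
  int iFirstRowidOff;  /* Offset of first rowid, or 0 if none precedes a term */
  int szLeaf;          /* Offset at which the page footer starts */
  int bTermless;       /* True if the page contains no terms */
} LeafHeader;

/* Read a big-endian 16-bit value, as fts5GetU16() does. */
static int get_u16(const unsigned char *a){
  return (a[0] << 8) + a[1];
}

/* Decode the header of the nPage byte leaf page at aPage. */
static LeafHeader read_leaf_header(const unsigned char *aPage, int nPage){
  LeafHeader h;
  assert( nPage>=4 );
  h.iFirstRowidOff = get_u16(&aPage[0]);
  h.szLeaf = get_u16(&aPage[2]);
  /* An empty footer means szLeaf equals the page size: no terms. */
  h.bTermless = (h.szLeaf >= nPage);
  return h;
}
```

A page whose second header field equals its total size has a zero-length footer, which is exactly the condition fts5LeafIsTermless() tests.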
| 14815 |
| 14816 /* |
| 14817 ** Object for iterating through the merged results of one or more segments, |
| 14818 ** visiting each term/rowid pair in the merged data. |
| 14819 ** |
| 14820 ** nSeg is always a power of two greater than or equal to the number of |
| 14821 ** segments that this object is merging data from. Both the aSeg[] and |
| 14822 ** aFirst[] arrays are sized at nSeg entries. The aSeg[] array is padded |
| 14823 ** with zeroed objects - these are handled as if they were iterators opened |
| 14824 ** on empty segments. |
| 14825 ** |
| 14826 ** The result of comparing segments aSeg[N] and aSeg[N+1], where N is an |
| 14827 ** even number, is stored in aFirst[(nSeg+N)/2]. The "result" of the |
| 14828 ** comparison in this context is the index of the iterator that currently |
| 14829 ** points to the smaller term/rowid combination. Iterators at EOF are |
| 14830 ** considered to be greater than all other iterators. |
| 14831 ** |
| 14832 ** aFirst[1] contains the index in aSeg[] of the iterator that points to |
| 14833 ** the smallest key overall. aFirst[0] is unused. |
| 14834 ** |
| 14835 ** poslist: |
| 14836 ** Used by sqlite3Fts5IterPoslist() when the poslist needs to be buffered. |
| 14837 ** There is no way to tell if this is populated or not. |
| 14838 */ |
| 14839 struct Fts5IndexIter { |
| 14840 Fts5Index *pIndex; /* Index that owns this iterator */ |
| 14841 Fts5Structure *pStruct; /* Database structure for this iterator */ |
| 14842 Fts5Buffer poslist; /* Buffer containing current poslist */ |
| 14843 |
| 14844 int nSeg; /* Size of aSeg[] array */ |
| 14845 int bRev; /* True to iterate in reverse order */ |
| 14846 u8 bSkipEmpty; /* True to skip deleted entries */ |
| 14847 u8 bEof; /* True at EOF */ |
| 14848 u8 bFiltered; /* True if column-filter already applied */ |
| 14849 |
| 14850 i64 iSwitchRowid; /* First rowid among iterators other than aFirst[1] */ |
| 14851 Fts5CResult *aFirst; /* Current merge state (see above) */ |
| 14852 Fts5SegIter aSeg[1]; /* Array of segment iterators */ |
| 14853 }; |
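The aFirst[] layout described above is a tournament tree stored in an array. The sketch below builds one over plain integer keys instead of segment iterators. `build_first()` and `KEY_EOF` are hypothetical; the bottom-row indexing follows the comment, while the rule used for the upper rows (each slot holds the winner of its two child slots) is an assumption that is consistent with aFirst[1] holding the overall winner.

```c
#include <assert.h>
#include <limits.h>

#define KEY_EOF INT_MAX   /* Stand-in for an iterator at EOF: sorts last */

/* Fill aFirst[1..nSeg-1] for the nSeg keys in aKey[]. nSeg must be a
** power of two and at least 2. aFirst[0] is left untouched. */
static void build_first(const int *aKey, int nSeg, int *aFirst){
  int i;
  assert( nSeg>=2 && (nSeg & (nSeg-1))==0 );
  for(i=nSeg-1; i>=1; i--){
    int iLeft, iRight;
    if( i>=nSeg/2 ){
      /* Bottom row: compare inputs N and N+1, where i==(nSeg+N)/2 */
      iLeft = (i - nSeg/2) * 2;
      iRight = iLeft + 1;
    }else{
      /* Upper rows: compare the winners stored in the two child slots */
      iLeft = aFirst[i*2];
      iRight = aFirst[i*2+1];
    }
    /* Ties go to the left-hand (lower index) input */
    aFirst[i] = (aKey[iRight] < aKey[iLeft]) ? iRight : iLeft;
  }
}
```

With the keys {5, 3, 9, 1}, slot aFirst[2] records index 1, slot aFirst[3] records index 3, and aFirst[1] records index 3, the position of the smallest key. Padding entries can be given KEY_EOF so that they never win a comparison against a live input.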
| 14854 |
| 14855 |
| 14856 /* |
| 14857 ** An instance of the following type is used to iterate through the contents |
| 14858 ** of a doclist-index record. |
| 14859 ** |
| 14860 ** pData: |
| 14861 ** Record containing the doclist-index data. |
| 14862 ** |
| 14863 ** bEof: |
| 14864 ** Set to true once iterator has reached EOF. |
| 14865 ** |
| 14866 ** iOff: |
| 14867 ** Set to the current offset within record pData. |
| 14868 */ |
| 14869 struct Fts5DlidxLvl { |
| 14870 Fts5Data *pData; /* Data for current page of this level */ |
| 14871 int iOff; /* Current offset into pData */ |
| 14872 int bEof; /* At EOF already */ |
| 14873 int iFirstOff; /* Used by reverse iterators */ |
| 14874 |
| 14875 /* Output variables */ |
| 14876 int iLeafPgno; /* Page number of current leaf page */ |
| 14877 i64 iRowid; /* First rowid on leaf iLeafPgno */ |
| 14878 }; |
| 14879 struct Fts5DlidxIter { |
| 14880 int nLvl; |
| 14881 int iSegid; |
| 14882 Fts5DlidxLvl aLvl[1]; |
| 14883 }; |
| 14884 |
| 14885 static void fts5PutU16(u8 *aOut, u16 iVal){ |
| 14886 aOut[0] = (iVal>>8); |
| 14887 aOut[1] = (iVal&0xFF); |
| 14888 } |
| 14889 |
| 14890 static u16 fts5GetU16(const u8 *aIn){ |
| 14891 return ((u16)aIn[0] << 8) + aIn[1]; |
| 14892 } |
| 14893 |
| 14894 /* |
| 14895 ** Allocate and return a buffer at least nByte bytes in size. |
| 14896 ** |
| 14897 ** If an OOM error is encountered, return NULL and set the error code in |
| 14898 ** the Fts5Index handle passed as the first argument. |
| 14899 */ |
| 14900 static void *fts5IdxMalloc(Fts5Index *p, int nByte){ |
| 14901 return sqlite3Fts5MallocZero(&p->rc, nByte); |
| 14902 } |
| 14903 |
| 14904 /* |
| 14905 ** Compare the contents of the pLeft buffer with the pRight/nRight blob. |
| 14906 ** |
| 14907 ** Return -ve if pLeft is smaller than pRight, 0 if they are equal or |
| 14908 ** +ve if pRight is smaller than pLeft. In other words: |
| 14909 ** |
| 14910 ** res = *pLeft - *pRight |
| 14911 */ |
| 14912 #ifdef SQLITE_DEBUG |
| 14913 static int fts5BufferCompareBlob( |
| 14914 Fts5Buffer *pLeft, /* Left hand side of comparison */ |
| 14915 const u8 *pRight, int nRight /* Right hand side of comparison */ |
| 14916 ){ |
| 14917 int nCmp = MIN(pLeft->n, nRight); |
| 14918 int res = memcmp(pLeft->p, pRight, nCmp); |
| 14919 return (res==0 ? (pLeft->n - nRight) : res); |
| 14920 } |
| 14921 #endif |
| 14922 |
| 14923 /* |
| 14924 ** Compare the contents of the two buffers using memcmp(). If one buffer |
| 14925 ** is a prefix of the other, it is considered the lesser. |
| 14926 ** |
| 14927 ** Return -ve if pLeft is smaller than pRight, 0 if they are equal or |
| 14928 ** +ve if pRight is smaller than pLeft. In other words: |
| 14929 ** |
| 14930 ** res = *pLeft - *pRight |
| 14931 */ |
| 14932 static int fts5BufferCompare(Fts5Buffer *pLeft, Fts5Buffer *pRight){ |
| 14933 int nCmp = MIN(pLeft->n, pRight->n); |
| 14934 int res = memcmp(pLeft->p, pRight->p, nCmp); |
| 14935 return (res==0 ? (pLeft->n - pRight->n) : res); |
| 14936 } |
| 14937 |
| 14938 #ifdef SQLITE_DEBUG |
| 14939 static int fts5BlobCompare( |
| 14940 const u8 *pLeft, int nLeft, |
| 14941 const u8 *pRight, int nRight |
| 14942 ){ |
| 14943 int nCmp = MIN(nLeft, nRight); |
| 14944 int res = memcmp(pLeft, pRight, nCmp); |
| 14945 return (res==0 ? (nLeft - nRight) : res); |
| 14946 } |
| 14947 #endif |
| 14948 |
| 14949 static int fts5LeafFirstTermOff(Fts5Data *pLeaf){ |
| 14950 int ret; |
| 14951 fts5GetVarint32(&pLeaf->p[pLeaf->szLeaf], ret); |
| 14952 return ret; |
| 14953 } |
| 14954 |
| 14955 /* |
| 14956 ** Close the read-only blob handle, if it is open. |
| 14957 */ |
| 14958 static void fts5CloseReader(Fts5Index *p){ |
| 14959 if( p->pReader ){ |
| 14960 sqlite3_blob *pReader = p->pReader; |
| 14961 p->pReader = 0; |
| 14962 sqlite3_blob_close(pReader); |
| 14963 } |
| 14964 } |
| 14965 |
| 14966 |
| 14967 /* |
| 14968 ** Retrieve a record from the %_data table. |
| 14969 ** |
| 14970 ** If an error occurs, NULL is returned and an error left in the |
| 14971 ** Fts5Index object. |
| 14972 */ |
| 14973 static Fts5Data *fts5DataRead(Fts5Index *p, i64 iRowid){ |
| 14974 Fts5Data *pRet = 0; |
| 14975 if( p->rc==SQLITE_OK ){ |
| 14976 int rc = SQLITE_OK; |
| 14977 |
| 14978 if( p->pReader ){ |
| 14979 /* This call may return SQLITE_ABORT if there has been a savepoint |
| 14980 ** rollback since it was last used. In this case a new blob handle |
| 14981 ** is required. */ |
| 14982 sqlite3_blob *pBlob = p->pReader; |
| 14983 p->pReader = 0; |
| 14984 rc = sqlite3_blob_reopen(pBlob, iRowid); |
| 14985 assert( p->pReader==0 ); |
| 14986 p->pReader = pBlob; |
| 14987 if( rc!=SQLITE_OK ){ |
| 14988 fts5CloseReader(p); |
| 14989 } |
| 14990 if( rc==SQLITE_ABORT ) rc = SQLITE_OK; |
| 14991 } |
| 14992 |
| 14993 /* If the blob handle is not open at this point, open it and seek |
| 14994 ** to the requested entry. */ |
| 14995 if( p->pReader==0 && rc==SQLITE_OK ){ |
| 14996 Fts5Config *pConfig = p->pConfig; |
| 14997 rc = sqlite3_blob_open(pConfig->db, |
| 14998 pConfig->zDb, p->zDataTbl, "block", iRowid, 0, &p->pReader |
| 14999 ); |
| 15000 } |
| 15001 |
| 15002 /* If either of the sqlite3_blob_open() or sqlite3_blob_reopen() calls |
| 15003 ** above returned SQLITE_ERROR, return SQLITE_CORRUPT_VTAB instead. |
| 15004 ** All the reasons those functions might return SQLITE_ERROR - missing |
| 15005 ** table, missing row, non-blob/text in block column - indicate |
| 15006 ** backing store corruption. */ |
| 15007 if( rc==SQLITE_ERROR ) rc = FTS5_CORRUPT; |
| 15008 |
| 15009 if( rc==SQLITE_OK ){ |
| 15010 u8 *aOut = 0; /* Read blob data into this buffer */ |
| 15011 int nByte = sqlite3_blob_bytes(p->pReader); |
| 15012 int nAlloc = sizeof(Fts5Data) + nByte + FTS5_DATA_PADDING; |
| 15013 pRet = (Fts5Data*)sqlite3_malloc(nAlloc); |
| 15014 if( pRet ){ |
| 15015 pRet->nn = nByte; |
| 15016 aOut = pRet->p = (u8*)&pRet[1]; |
| 15017 }else{ |
| 15018 rc = SQLITE_NOMEM; |
| 15019 } |
| 15020 |
| 15021 if( rc==SQLITE_OK ){ |
| 15022 rc = sqlite3_blob_read(p->pReader, aOut, nByte, 0); |
| 15023 } |
| 15024 if( rc!=SQLITE_OK ){ |
| 15025 sqlite3_free(pRet); |
| 15026 pRet = 0; |
| 15027 }else{ |
| 15028 /* TODO1: Fix this */ |
| 15029 pRet->szLeaf = fts5GetU16(&pRet->p[2]); |
| 15030 } |
| 15031 } |
| 15032 p->rc = rc; |
| 15033 p->nRead++; |
| 15034 } |
| 15035 |
| 15036 assert( (pRet==0)==(p->rc!=SQLITE_OK) ); |
| 15037 return pRet; |
| 15038 } |
| 15039 |
| 15040 /* |
| 15041 ** Release a reference to data record returned by an earlier call to |
| 15042 ** fts5DataRead(). |
| 15043 */ |
| 15044 static void fts5DataRelease(Fts5Data *pData){ |
| 15045 sqlite3_free(pData); |
| 15046 } |
| 15047 |
| 15048 static int fts5IndexPrepareStmt( |
| 15049 Fts5Index *p, |
| 15050 sqlite3_stmt **ppStmt, |
| 15051 char *zSql |
| 15052 ){ |
| 15053 if( p->rc==SQLITE_OK ){ |
| 15054 if( zSql ){ |
| 15055 p->rc = sqlite3_prepare_v2(p->pConfig->db, zSql, -1, ppStmt, 0); |
| 15056 }else{ |
| 15057 p->rc = SQLITE_NOMEM; |
| 15058 } |
| 15059 } |
| 15060 sqlite3_free(zSql); |
| 15061 return p->rc; |
| 15062 } |
| 15063 |
| 15064 |
| 15065 /* |
| 15066 ** INSERT OR REPLACE a record into the %_data table. |
| 15067 */ |
| 15068 static void fts5DataWrite(Fts5Index *p, i64 iRowid, const u8 *pData, int nData){ |
| 15069 if( p->rc!=SQLITE_OK ) return; |
| 15070 |
| 15071 if( p->pWriter==0 ){ |
| 15072 Fts5Config *pConfig = p->pConfig; |
| 15073 fts5IndexPrepareStmt(p, &p->pWriter, sqlite3_mprintf( |
| 15074 "REPLACE INTO '%q'.'%q_data'(id, block) VALUES(?,?)", |
| 15075 pConfig->zDb, pConfig->zName |
| 15076 )); |
| 15077 if( p->rc ) return; |
| 15078 } |
| 15079 |
| 15080 sqlite3_bind_int64(p->pWriter, 1, iRowid); |
| 15081 sqlite3_bind_blob(p->pWriter, 2, pData, nData, SQLITE_STATIC); |
| 15082 sqlite3_step(p->pWriter); |
| 15083 p->rc = sqlite3_reset(p->pWriter); |
| 15084 } |
| 15085 |
| 15086 /* |
| 15087 ** Execute the following SQL: |
| 15088 ** |
| 15089 ** DELETE FROM %_data WHERE id BETWEEN $iFirst AND $iLast |
| 15090 */ |
| 15091 static void fts5DataDelete(Fts5Index *p, i64 iFirst, i64 iLast){ |
| 15092 if( p->rc!=SQLITE_OK ) return; |
| 15093 |
| 15094 if( p->pDeleter==0 ){ |
| 15095 int rc; |
| 15096 Fts5Config *pConfig = p->pConfig; |
| 15097 char *zSql = sqlite3_mprintf( |
| 15098 "DELETE FROM '%q'.'%q_data' WHERE id>=? AND id<=?", |
| 15099 pConfig->zDb, pConfig->zName |
| 15100 ); |
| 15101 if( zSql==0 ){ |
| 15102 rc = SQLITE_NOMEM; |
| 15103 }else{ |
| 15104 rc = sqlite3_prepare_v2(pConfig->db, zSql, -1, &p->pDeleter, 0); |
| 15105 sqlite3_free(zSql); |
| 15106 } |
| 15107 if( rc!=SQLITE_OK ){ |
| 15108 p->rc = rc; |
| 15109 return; |
| 15110 } |
| 15111 } |
| 15112 |
| 15113 sqlite3_bind_int64(p->pDeleter, 1, iFirst); |
| 15114 sqlite3_bind_int64(p->pDeleter, 2, iLast); |
| 15115 sqlite3_step(p->pDeleter); |
| 15116 p->rc = sqlite3_reset(p->pDeleter); |
| 15117 } |
| 15118 |
| 15119 /* |
| 15120 ** Remove all records associated with segment iSegid. |
| 15121 */ |
| 15122 static void fts5DataRemoveSegment(Fts5Index *p, int iSegid){ |
| 15123 i64 iFirst = FTS5_SEGMENT_ROWID(iSegid, 0); |
| 15124 i64 iLast = FTS5_SEGMENT_ROWID(iSegid+1, 0)-1; |
| 15125 fts5DataDelete(p, iFirst, iLast); |
| 15126 if( p->pIdxDeleter==0 ){ |
| 15127 Fts5Config *pConfig = p->pConfig; |
| 15128 fts5IndexPrepareStmt(p, &p->pIdxDeleter, sqlite3_mprintf( |
| 15129 "DELETE FROM '%q'.'%q_idx' WHERE segid=?", |
| 15130 pConfig->zDb, pConfig->zName |
| 15131 )); |
| 15132 } |
| 15133 if( p->rc==SQLITE_OK ){ |
| 15134 sqlite3_bind_int(p->pIdxDeleter, 1, iSegid); |
| 15135 sqlite3_step(p->pIdxDeleter); |
| 15136 p->rc = sqlite3_reset(p->pIdxDeleter); |
| 15137 } |
| 15138 } |
| 15139 |
| 15140 /* |
| 15141 ** Release a reference to an Fts5Structure object returned by an earlier |
| 15142 ** call to fts5StructureRead() or fts5StructureDecode(). |
| 15143 */ |
| 15144 static void fts5StructureRelease(Fts5Structure *pStruct){ |
| 15145 if( pStruct && 0>=(--pStruct->nRef) ){ |
| 15146 int i; |
| 15147 assert( pStruct->nRef==0 ); |
| 15148 for(i=0; i<pStruct->nLevel; i++){ |
| 15149 sqlite3_free(pStruct->aLevel[i].aSeg); |
| 15150 } |
| 15151 sqlite3_free(pStruct); |
| 15152 } |
| 15153 } |
| 15154 |
| 15155 static void fts5StructureRef(Fts5Structure *pStruct){ |
| 15156 pStruct->nRef++; |
| 15157 } |
| 15158 |
| 15159 /* |
| 15160 ** Deserialize and return the structure record currently stored in serialized |
| 15161 ** form within buffer pData/nData. |
| 15162 ** |
| 15163 ** The Fts5Structure.aLevel[] and each Fts5StructureLevel.aSeg[] array |
| 15164 ** are over-allocated by one slot. This allows the structure contents |
| 15165 ** to be more easily edited. |
| 15166 ** |
| 15167 ** If an error occurs, *ppOut is set to NULL and an SQLite error code |
| 15168 ** returned. Otherwise, *ppOut is set to point to the new object and |
| 15169 ** SQLITE_OK returned. |
| 15170 */ |
| 15171 static int fts5StructureDecode( |
| 15172 const u8 *pData, /* Buffer containing serialized structure */ |
| 15173 int nData, /* Size of buffer pData in bytes */ |
| 15174 int *piCookie, /* Configuration cookie value */ |
| 15175 Fts5Structure **ppOut /* OUT: Deserialized object */ |
| 15176 ){ |
| 15177 int rc = SQLITE_OK; |
| 15178 int i = 0; |
| 15179 int iLvl; |
| 15180 int nLevel = 0; |
| 15181 int nSegment = 0; |
| 15182 int nByte; /* Bytes of space to allocate at pRet */ |
| 15183 Fts5Structure *pRet = 0; /* Structure object to return */ |
| 15184 |
| 15185 /* Grab the cookie value */ |
| 15186 if( piCookie ) *piCookie = sqlite3Fts5Get32(pData); |
| 15187 i = 4; |
| 15188 |
| 15189 /* Read the total number of levels and segments from the start of the |
| 15190 ** structure record. */ |
| 15191 i += fts5GetVarint32(&pData[i], nLevel); |
| 15192 i += fts5GetVarint32(&pData[i], nSegment); |
| 15193 nByte = ( |
| 15194 sizeof(Fts5Structure) + /* Main structure */ |
| 15195 sizeof(Fts5StructureLevel) * (nLevel-1) /* aLevel[] array */ |
| 15196 ); |
| 15197 pRet = (Fts5Structure*)sqlite3Fts5MallocZero(&rc, nByte); |
| 15198 |
| 15199 if( pRet ){ |
| 15200 pRet->nRef = 1; |
| 15201 pRet->nLevel = nLevel; |
| 15202 pRet->nSegment = nSegment; |
| 15203 i += sqlite3Fts5GetVarint(&pData[i], &pRet->nWriteCounter); |
| 15204 |
| 15205 for(iLvl=0; rc==SQLITE_OK && iLvl<nLevel; iLvl++){ |
| 15206 Fts5StructureLevel *pLvl = &pRet->aLevel[iLvl]; |
| 15207 int nTotal; |
| 15208 int iSeg; |
| 15209 |
| 15210 i += fts5GetVarint32(&pData[i], pLvl->nMerge); |
| 15211 i += fts5GetVarint32(&pData[i], nTotal); |
| 15212 assert( nTotal>=pLvl->nMerge ); |
| 15213 pLvl->aSeg = (Fts5StructureSegment*)sqlite3Fts5MallocZero(&rc, |
| 15214 nTotal * sizeof(Fts5StructureSegment) |
| 15215 ); |
| 15216 |
| 15217 if( rc==SQLITE_OK ){ |
| 15218 pLvl->nSeg = nTotal; |
| 15219 for(iSeg=0; iSeg<nTotal; iSeg++){ |
| 15220 i += fts5GetVarint32(&pData[i], pLvl->aSeg[iSeg].iSegid); |
| 15221 i += fts5GetVarint32(&pData[i], pLvl->aSeg[iSeg].pgnoFirst); |
| 15222 i += fts5GetVarint32(&pData[i], pLvl->aSeg[iSeg].pgnoLast); |
| 15223 } |
| 15224 }else{ |
| 15225 fts5StructureRelease(pRet); |
| 15226 pRet = 0; |
| 15227 } |
| 15228 } |
| 15229 } |
| 15230 |
| 15231 *ppOut = pRet; |
| 15232 return rc; |
| 15233 } |
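The read order in fts5StructureDecode() implies the on-disk layout of the structure record: a 4-byte cookie, then varints for nLevel, nSegment and nWriteCounter, then for each level nMerge and nSeg followed by three varints per segment. The sketch below serializes a single-level structure in that order. `Seg`, `put_varint()` and `encode_structure()` are hypothetical; the varint writer is simplified to values below 2^28, and the cookie is assumed to be stored most significant byte first.

```c
#include <assert.h>

/* Simplified varint writer (assumption: v is below 2^28). Emits 7 bits
** per byte, most significant group first, with the high bit set on every
** byte except the last. Returns the number of bytes written. */
static int put_varint(unsigned char *a, unsigned int v){
  unsigned char tmp[4];
  int n = 0;
  int i;
  assert( v < (1u<<28) );
  do{
    tmp[n++] = (unsigned char)(v & 0x7F);
    v >>= 7;
  }while( v );
  for(i=0; i<n; i++){
    a[i] = (unsigned char)(tmp[n-1-i] | (i<n-1 ? 0x80 : 0x00));
  }
  return n;
}

/* Hypothetical flat description of one segment. */
typedef struct Seg { int iSegid; int pgnoFirst; int pgnoLast; } Seg;

/* Serialize a structure record with one level holding nSeg segments.
** aOut must be large enough. Returns the number of bytes written. */
static int encode_structure(
  unsigned char *aOut,
  unsigned int iCookie,
  unsigned int nWriteCounter,
  int nMerge,
  const Seg *aSeg, int nSeg
){
  int n = 0;
  int i;
  /* Cookie: 4 bytes, most significant byte first (assumed byte order) */
  aOut[n++] = (unsigned char)((iCookie >> 24) & 0xFF);
  aOut[n++] = (unsigned char)((iCookie >> 16) & 0xFF);
  aOut[n++] = (unsigned char)((iCookie >> 8) & 0xFF);
  aOut[n++] = (unsigned char)(iCookie & 0xFF);
  n += put_varint(&aOut[n], 1);                    /* nLevel */
  n += put_varint(&aOut[n], (unsigned int)nSeg);   /* nSegment */
  n += put_varint(&aOut[n], nWriteCounter);
  n += put_varint(&aOut[n], (unsigned int)nMerge); /* Level 0: nMerge */
  n += put_varint(&aOut[n], (unsigned int)nSeg);   /* Level 0: nSeg */
  for(i=0; i<nSeg; i++){
    n += put_varint(&aOut[n], (unsigned int)aSeg[i].iSegid);
    n += put_varint(&aOut[n], (unsigned int)aSeg[i].pgnoFirst);
    n += put_varint(&aOut[n], (unsigned int)aSeg[i].pgnoLast);
  }
  return n;
}
```

A structure with cookie 1, one level and one segment (segid 5, pages 1 through 200) serializes to 13 bytes; the page number 200 is the only value that needs a two-byte varint (0x81 0x48).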
| 15234 |
| 15235 /* |
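| 15236 ** Append one zeroed level to the aLevel[] array of *ppStruct, reallocating the object. On OOM, set *pRc to SQLITE_NOMEM and leave *ppStruct unchanged. This is a no-op if *pRc is already set to an error code. |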
| 15237 */ |
| 15238 static void fts5StructureAddLevel(int *pRc, Fts5Structure **ppStruct){ |
| 15239 if( *pRc==SQLITE_OK ){ |
| 15240 Fts5Structure *pStruct = *ppStruct; |
| 15241 int nLevel = pStruct->nLevel; |
| 15242 int nByte = ( |
| 15243 sizeof(Fts5Structure) + /* Main structure */ |
| 15244 sizeof(Fts5StructureLevel) * (nLevel+1) /* aLevel[] array */ |
| 15245 ); |
| 15246 |
| 15247 pStruct = sqlite3_realloc(pStruct, nByte); |
| 15248 if( pStruct ){ |
| 15249 memset(&pStruct->aLevel[nLevel], 0, sizeof(Fts5StructureLevel)); |
| 15250 pStruct->nLevel++; |
| 15251 *ppStruct = pStruct; |
| 15252 }else{ |
| 15253 *pRc = SQLITE_NOMEM; |
| 15254 } |
| 15255 } |
| 15256 } |
| 15257 |
| 15258 /* |
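| 15259 ** Extend level iLvl so that there is room for at least nExtra more |
| 15260 ** segments. If bInsert is true, the new zeroed slots are placed at the start of aSeg[] and the existing entries are shifted up; otherwise they are appended at the end. |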
| 15261 */ |
| 15262 static void fts5StructureExtendLevel( |
| 15263 int *pRc, |
| 15264 Fts5Structure *pStruct, |
| 15265 int iLvl, |
| 15266 int nExtra, |
| 15267 int bInsert |
| 15268 ){ |
| 15269 if( *pRc==SQLITE_OK ){ |
| 15270 Fts5StructureLevel *pLvl = &pStruct->aLevel[iLvl]; |
| 15271 Fts5StructureSegment *aNew; |
| 15272 int nByte; |
| 15273 |
| 15274 nByte = (pLvl->nSeg + nExtra) * sizeof(Fts5StructureSegment); |
| 15275 aNew = sqlite3_realloc(pLvl->aSeg, nByte); |
| 15276 if( aNew ){ |
| 15277 if( bInsert==0 ){ |
| 15278 memset(&aNew[pLvl->nSeg], 0, sizeof(Fts5StructureSegment) * nExtra); |
| 15279 }else{ |
| 15280 int nMove = pLvl->nSeg * sizeof(Fts5StructureSegment); |
| 15281 memmove(&aNew[nExtra], aNew, nMove); |
| 15282 memset(aNew, 0, sizeof(Fts5StructureSegment) * nExtra); |
| 15283 } |
| 15284 pLvl->aSeg = aNew; |
| 15285 }else{ |
| 15286 *pRc = SQLITE_NOMEM; |
| 15287 } |
| 15288 } |
| 15289 } |
| 15290 |
| 15291 /* |
| 15292 ** Read, deserialize and return the structure record. |
| 15293 ** |
| 15294 ** The Fts5Structure.aLevel[] and each Fts5StructureLevel.aSeg[] array |
| 15295 ** are over-allocated as described for function fts5StructureDecode() |
| 15296 ** above. |
| 15297 ** |
| 15298 ** If an error occurs, NULL is returned and an error code left in the |
| 15299 ** Fts5Index handle. If an error has already occurred when this function |
| 15300 ** is called, it is a no-op. |
| 15301 */ |
| 15302 static Fts5Structure *fts5StructureRead(Fts5Index *p){ |
| 15303 Fts5Config *pConfig = p->pConfig; |
| 15304 Fts5Structure *pRet = 0; /* Object to return */ |
| 15305 int iCookie; /* Configuration cookie */ |
| 15306 Fts5Data *pData; |
| 15307 |
| 15308 pData = fts5DataRead(p, FTS5_STRUCTURE_ROWID); |
| 15309 if( p->rc ) return 0; |
| 15310 /* TODO: Do we need this if the leaf-index is appended? Probably... */ |
| 15311 memset(&pData->p[pData->nn], 0, FTS5_DATA_PADDING); |
| 15312 p->rc = fts5StructureDecode(pData->p, pData->nn, &iCookie, &pRet); |
| 15313 if( p->rc==SQLITE_OK && pConfig->iCookie!=iCookie ){ |
| 15314 p->rc = sqlite3Fts5ConfigLoad(pConfig, iCookie); |
| 15315 } |
| 15316 |
| 15317 fts5DataRelease(pData); |
| 15318 if( p->rc!=SQLITE_OK ){ |
| 15319 fts5StructureRelease(pRet); |
| 15320 pRet = 0; |
| 15321 } |
| 15322 return pRet; |
| 15323 } |
| 15324 |
| 15325 /* |
| 15326 ** Return the total number of segments in index structure pStruct. This |
| 15327 ** function is only ever used as part of assert() conditions. |
| 15328 */ |
| 15329 #ifdef SQLITE_DEBUG |
| 15330 static int fts5StructureCountSegments(Fts5Structure *pStruct){ |
| 15331 int nSegment = 0; /* Total number of segments */ |
| 15332 if( pStruct ){ |
| 15333 int iLvl; /* Used to iterate through levels */ |
| 15334 for(iLvl=0; iLvl<pStruct->nLevel; iLvl++){ |
| 15335 nSegment += pStruct->aLevel[iLvl].nSeg; |
| 15336 } |
| 15337 } |
| 15338 |
| 15339 return nSegment; |
| 15340 } |
| 15341 #endif |
| 15342 |
| 15343 #define fts5BufferSafeAppendBlob(pBuf, pBlob, nBlob) { \ |
| 15344 assert( (pBuf)->nSpace>=((pBuf)->n+nBlob) ); \ |
| 15345 memcpy(&(pBuf)->p[(pBuf)->n], pBlob, nBlob); \ |
| 15346 (pBuf)->n += nBlob; \ |
| 15347 } |
| 15348 |
| 15349 #define fts5BufferSafeAppendVarint(pBuf, iVal) { \ |
| 15350 (pBuf)->n += sqlite3Fts5PutVarint(&(pBuf)->p[(pBuf)->n], (iVal)); \ |
| 15351 assert( (pBuf)->nSpace>=(pBuf)->n ); \ |
| 15352 } |
| 15353 |
| 15354 |
| 15355 /* |
| 15356 ** Serialize and store the "structure" record. |
| 15357 ** |
| 15358 ** If an error occurs, leave an error code in the Fts5Index object. If an |
| 15359 ** error has already occurred, this function is a no-op. |
| 15360 */ |
| 15361 static void fts5StructureWrite(Fts5Index *p, Fts5Structure *pStruct){ |
| 15362 if( p->rc==SQLITE_OK ){ |
| 15363 Fts5Buffer buf; /* Buffer to serialize record into */ |
| 15364 int iLvl; /* Used to iterate through levels */ |
| 15365 int iCookie; /* Cookie value to store */ |
| 15366 |
| 15367 assert( pStruct->nSegment==fts5StructureCountSegments(pStruct) ); |
| 15368 memset(&buf, 0, sizeof(Fts5Buffer)); |
| 15369 |
| 15370 /* Append the current configuration cookie */ |
| 15371 iCookie = p->pConfig->iCookie; |
| 15372 if( iCookie<0 ) iCookie = 0; |
| 15373 |
| 15374 if( 0==sqlite3Fts5BufferSize(&p->rc, &buf, 4+9+9+9) ){ |
| 15375 sqlite3Fts5Put32(buf.p, iCookie); |
| 15376 buf.n = 4; |
| 15377 fts5BufferSafeAppendVarint(&buf, pStruct->nLevel); |
| 15378 fts5BufferSafeAppendVarint(&buf, pStruct->nSegment); |
| 15379 fts5BufferSafeAppendVarint(&buf, (i64)pStruct->nWriteCounter); |
| 15380 } |
| 15381 |
| 15382 for(iLvl=0; iLvl<pStruct->nLevel; iLvl++){ |
| 15383 int iSeg; /* Used to iterate through segments */ |
| 15384 Fts5StructureLevel *pLvl = &pStruct->aLevel[iLvl]; |
| 15385 fts5BufferAppendVarint(&p->rc, &buf, pLvl->nMerge); |
| 15386 fts5BufferAppendVarint(&p->rc, &buf, pLvl->nSeg); |
| 15387 assert( pLvl->nMerge<=pLvl->nSeg ); |
| 15388 |
| 15389 for(iSeg=0; iSeg<pLvl->nSeg; iSeg++){ |
| 15390 fts5BufferAppendVarint(&p->rc, &buf, pLvl->aSeg[iSeg].iSegid); |
| 15391 fts5BufferAppendVarint(&p->rc, &buf, pLvl->aSeg[iSeg].pgnoFirst); |
| 15392 fts5BufferAppendVarint(&p->rc, &buf, pLvl->aSeg[iSeg].pgnoLast); |
| 15393 } |
| 15394 } |
| 15395 |
| 15396 fts5DataWrite(p, FTS5_STRUCTURE_ROWID, buf.p, buf.n); |
| 15397 fts5BufferFree(&buf); |
| 15398 } |
| 15399 } |
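The fixed header that fts5StructureWrite() emits can be illustrated with a small standalone sketch: a 4-byte cookie followed by three varints (nLevel, nSegment, nWriteCounter). The names `sketchPut32`, `sketchPutVarint` and `sketchStructureHeader` are hypothetical and not part of FTS5. The sketch assumes the cookie is stored most-significant byte first and only handles 32-bit varint values, so it is an approximation of the layout rather than the library's own encoder.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: store a 32-bit value most-significant byte first. */
static void sketchPut32(unsigned char *a, uint32_t v){
  a[0] = (unsigned char)(v >> 24);
  a[1] = (unsigned char)(v >> 16);
  a[2] = (unsigned char)(v >> 8);
  a[3] = (unsigned char)v;
}

/* Hypothetical helper: SQLite-style varint, restricted to 32-bit values.
** Seven bits per byte, most-significant group first, with 0x80 set on
** every byte except the last. Returns the number of bytes written (1..5). */
static int sketchPutVarint(unsigned char *a, uint32_t v){
  unsigned char tmp[5];
  int n = 0;
  int i;
  do{
    tmp[n++] = (unsigned char)(v & 0x7f);
    v >>= 7;
  }while( v );
  for(i=0; i<n; i++){
    a[i] = (unsigned char)(tmp[n-1-i] | (i<n-1 ? 0x80 : 0x00));
  }
  return n;
}

/* Write cookie, nLevel, nSegment and nWriteCounter into aOut, which must
** have room for at least 4+5+5+5 bytes. Returns the number of bytes used. */
static int sketchStructureHeader(
  unsigned char *aOut,
  uint32_t iCookie,
  uint32_t nLevel,
  uint32_t nSegment,
  uint32_t nWriteCounter
){
  int n = 4;
  sketchPut32(aOut, iCookie);
  n += sketchPutVarint(&aOut[n], nLevel);
  n += sketchPutVarint(&aOut[n], nSegment);
  n += sketchPutVarint(&aOut[n], nWriteCounter);
  return n;
}
```

The per-level and per-segment fields that follow the header in the real record are further varints appended in the same way.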
| 15400 |
| 15401 #if 0 |
| 15402 static void fts5DebugStructure(int*,Fts5Buffer*,Fts5Structure*); |
| 15403 static void fts5PrintStructure(const char *zCaption, Fts5Structure *pStruct){ |
| 15404 int rc = SQLITE_OK; |
| 15405 Fts5Buffer buf; |
| 15406 memset(&buf, 0, sizeof(buf)); |
| 15407 fts5DebugStructure(&rc, &buf, pStruct); |
| 15408 fprintf(stdout, "%s: %s\n", zCaption, buf.p); |
| 15409 fflush(stdout); |
| 15410 fts5BufferFree(&buf); |
| 15411 } |
| 15412 #else |
| 15413 # define fts5PrintStructure(x,y) |
| 15414 #endif |
| 15415 |
| 15416 static int fts5SegmentSize(Fts5StructureSegment *pSeg){ |
| 15417 return 1 + pSeg->pgnoLast - pSeg->pgnoFirst; |
| 15418 } |
| 15419 |
| 15420 /* |
| 15421 ** Promote as many segments as possible to level iPromote of index |
| 15422 ** structure pStruct, modifying pStruct in place. Only segments of size |
| 15423 ** szPromote or smaller are moved. If an OOM occurs, p->rc is set. |
| 15424 */ |
| 15425 static void fts5StructurePromoteTo( |
| 15426 Fts5Index *p, |
| 15427 int iPromote, |
| 15428 int szPromote, |
| 15429 Fts5Structure *pStruct |
| 15430 ){ |
| 15431 int il, is; |
| 15432 Fts5StructureLevel *pOut = &pStruct->aLevel[iPromote]; |
| 15433 |
| 15434 if( pOut->nMerge==0 ){ |
| 15435 for(il=iPromote+1; il<pStruct->nLevel; il++){ |
| 15436 Fts5StructureLevel *pLvl = &pStruct->aLevel[il]; |
| 15437 if( pLvl->nMerge ) return; |
| 15438 for(is=pLvl->nSeg-1; is>=0; is--){ |
| 15439 int sz = fts5SegmentSize(&pLvl->aSeg[is]); |
| 15440 if( sz>szPromote ) return; |
| 15441 fts5StructureExtendLevel(&p->rc, pStruct, iPromote, 1, 1); |
| 15442 if( p->rc ) return; |
| 15443 memcpy(pOut->aSeg, &pLvl->aSeg[is], sizeof(Fts5StructureSegment)); |
| 15444 pOut->nSeg++; |
| 15445 pLvl->nSeg--; |
| 15446 } |
| 15447 } |
| 15448 } |
| 15449 } |
| 15450 |
| 15451 /* |
| 15452 ** A new segment has just been written to level iLvl of index structure |
| 15453 ** pStruct. This function determines if any segments should be promoted |
| 15454 ** as a result. Segments are promoted in two scenarios: |
| 15455 ** |
| 15456 ** a) If the segment just written is smaller than one or more segments |
| 15457 ** within the previous populated level, it is promoted to the previous |
| 15458 ** populated level. |
| 15459 ** |
| 15460 ** b) If the segment just written is larger than the newest segment on |
| 15461 ** the next populated level, then that segment, and any other adjacent |
| 15462 ** segments that are also smaller than the one just written, are |
| 15463 ** promoted. |
| 15464 ** |
| 15465 ** If one or more segments are promoted, the structure object is updated |
| 15466 ** to reflect this. |
| 15467 */ |
| 15468 static void fts5StructurePromote( |
| 15469 Fts5Index *p, /* FTS5 backend object */ |
| 15470 int iLvl, /* Index level just updated */ |
| 15471 Fts5Structure *pStruct /* Index structure */ |
| 15472 ){ |
| 15473 if( p->rc==SQLITE_OK ){ |
| 15474 int iTst; |
| 15475 int iPromote = -1; |
| 15476 int szPromote = 0; /* Promote anything this size or smaller */ |
| 15477 Fts5StructureSegment *pSeg; /* Segment just written */ |
| 15478 int szSeg; /* Size of segment just written */ |
| 15479 int nSeg = pStruct->aLevel[iLvl].nSeg; |
| 15480 |
| 15481 if( nSeg==0 ) return; |
| 15482 pSeg = &pStruct->aLevel[iLvl].aSeg[pStruct->aLevel[iLvl].nSeg-1]; |
| 15483 szSeg = (1 + pSeg->pgnoLast - pSeg->pgnoFirst); |
| 15484 |
| 15485 /* Check for condition (a) */ |
| 15486 for(iTst=iLvl-1; iTst>=0 && pStruct->aLevel[iTst].nSeg==0; iTst--); |
| 15487 if( iTst>=0 ){ |
| 15488 int i; |
| 15489 int szMax = 0; |
| 15490 Fts5StructureLevel *pTst = &pStruct->aLevel[iTst]; |
| 15491 assert( pTst->nMerge==0 ); |
| 15492 for(i=0; i<pTst->nSeg; i++){ |
| 15493 int sz = pTst->aSeg[i].pgnoLast - pTst->aSeg[i].pgnoFirst + 1; |
| 15494 if( sz>szMax ) szMax = sz; |
| 15495 } |
| 15496 if( szMax>=szSeg ){ |
| 15497 /* Condition (a) is true. Promote the newest segment on level |
| 15498 ** iLvl to level iTst. */ |
| 15499 iPromote = iTst; |
| 15500 szPromote = szMax; |
| 15501 } |
| 15502 } |
| 15503 |
| 15504 /* If condition (a) is not met, assume (b) is true. StructurePromoteTo() |
| 15505 ** is a no-op if it is not. */ |
| 15506 if( iPromote<0 ){ |
| 15507 iPromote = iLvl; |
| 15508 szPromote = szSeg; |
| 15509 } |
| 15510 fts5StructurePromoteTo(p, iPromote, szPromote, pStruct); |
| 15511 } |
| 15512 } |
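The choice of promotion target made by fts5StructurePromote() can be sketched in isolation. In the sketch below each level is reduced to an array of segment sizes in pages; `SketchLevel` and `sketchPromoteTarget` are hypothetical names used only for illustration. It reproduces the test for condition (a) and the fall-back to condition (b), and does not perform the promotion itself.

```c
#include <assert.h>

/* Hypothetical, simplified level: segment sizes in pages, oldest first. */
typedef struct SketchLevel {
  int nSeg;              /* Number of segments on this level */
  const int *aSz;        /* Size of each segment, in pages */
} SketchLevel;

/* Return the level that segments should be promoted to after a segment
** was written to level iLvl, and store the size threshold in *pszPromote.
** Returns -1 if level iLvl is empty. */
static int sketchPromoteTarget(
  const SketchLevel *aLevel,
  int iLvl,
  int *pszPromote
){
  int szSeg;
  int iTst = iLvl-1;

  if( aLevel[iLvl].nSeg==0 ) return -1;
  szSeg = aLevel[iLvl].aSz[aLevel[iLvl].nSeg-1];

  /* Find the previous populated level, if any. */
  while( iTst>=0 && aLevel[iTst].nSeg==0 ) iTst--;

  if( iTst>=0 ){
    int i;
    int szMax = 0;
    for(i=0; i<aLevel[iTst].nSeg; i++){
      if( aLevel[iTst].aSz[i]>szMax ) szMax = aLevel[iTst].aSz[i];
    }
    if( szMax>=szSeg ){
      /* Condition (a): promote to the previous populated level. */
      *pszPromote = szMax;
      return iTst;
    }
  }

  /* Otherwise assume condition (b): promote to level iLvl itself. */
  *pszPromote = szSeg;
  return iLvl;
}
```

With levels holding sizes {10, 4}, {} and {3}, a write to level 2 selects level 0 with a threshold of 10 pages, because the largest segment on the previous populated level is at least as large as the new one.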
| 15513 |
| 15514 |
| 15515 /* |
| 15516 ** Advance the iterator passed as the only argument. If the end of the |
| 15517 ** doclist-index page is reached, return non-zero. |
| 15518 */ |
| 15519 static int fts5DlidxLvlNext(Fts5DlidxLvl *pLvl){ |
| 15520 Fts5Data *pData = pLvl->pData; |
| 15521 |
| 15522 if( pLvl->iOff==0 ){ |
| 15523 assert( pLvl->bEof==0 ); |
| 15524 pLvl->iOff = 1; |
| 15525 pLvl->iOff += fts5GetVarint32(&pData->p[1], pLvl->iLeafPgno); |
| 15526 pLvl->iOff += fts5GetVarint(&pData->p[pLvl->iOff], (u64*)&pLvl->iRowid); |
| 15527 pLvl->iFirstOff = pLvl->iOff; |
| 15528 }else{ |
| 15529 int iOff; |
| 15530 for(iOff=pLvl->iOff; iOff<pData->nn; iOff++){ |
| 15531 if( pData->p[iOff] ) break; |
| 15532 } |
| 15533 |
| 15534 if( iOff<pData->nn ){ |
| 15535 i64 iVal; |
| 15536 pLvl->iLeafPgno += (iOff - pLvl->iOff) + 1; |
| 15537 iOff += fts5GetVarint(&pData->p[iOff], (u64*)&iVal); |
| 15538 pLvl->iRowid += iVal; |
| 15539 pLvl->iOff = iOff; |
| 15540 }else{ |
| 15541 pLvl->bEof = 1; |
| 15542 } |
| 15543 } |
| 15544 |
| 15545 return pLvl->bEof; |
| 15546 } |
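The page layout that fts5DlidxLvlNext() walks can be illustrated with a standalone sketch: one flags byte, a varint leaf page number, a varint rowid, then a sequence in which each 0x00 byte stands for a leaf page with no rowid and each non-zero varint is a rowid delta for the next page. The names `SketchEntry`, `sketchGetVarint` and `sketchDlidxWalk` are hypothetical, and the sketch assumes a well-formed buffer with no bounds checking inside the varint decoder.

```c
#include <assert.h>
#include <stdint.h>

typedef struct SketchEntry {
  int iLeafPgno;         /* Leaf page number */
  int64_t iRowid;        /* First rowid on that leaf page */
} SketchEntry;

/* Hypothetical helper: decode an SQLite-style varint (1..9 bytes).
** Assumes the buffer holds a complete varint. Returns bytes consumed. */
static int sketchGetVarint(const unsigned char *a, uint64_t *pVal){
  uint64_t v = 0;
  int i;
  for(i=0; i<8; i++){
    v = (v<<7) | (uint64_t)(a[i] & 0x7f);
    if( (a[i] & 0x80)==0 ){
      *pVal = v;
      return i+1;
    }
  }
  v = (v<<8) | a[8];     /* The ninth byte contributes all 8 bits */
  *pVal = v;
  return 9;
}

/* Walk the n-byte doclist-index page a[], storing up to nMax entries in
** aOut[]. Returns the number of entries stored. */
static int sketchDlidxWalk(
  const unsigned char *a, int n,
  SketchEntry *aOut, int nMax
){
  int iOff = 1;          /* Skip the flags byte */
  int nOut = 0;
  int iPgno;
  int64_t iRowid;
  uint64_t v;

  iOff += sketchGetVarint(&a[iOff], &v);
  iPgno = (int)v;
  iOff += sketchGetVarint(&a[iOff], &v);
  iRowid = (int64_t)v;

  while( nOut<nMax ){
    int nZero = 0;
    aOut[nOut].iLeafPgno = iPgno;
    aOut[nOut].iRowid = iRowid;
    nOut++;

    /* Each 0x00 byte is a leaf page that contains no rowid. */
    while( iOff<n && a[iOff]==0 ){ iOff++; nZero++; }
    if( iOff>=n ) break;

    iPgno += nZero + 1;
    iOff += sketchGetVarint(&a[iOff], &v);
    iRowid += (int64_t)v;
  }
  return nOut;
}
```

For the page bytes 00 05 0A 03 00 00 07 the walk yields (page 5, rowid 10), (page 6, rowid 13) and (page 9, rowid 20): the two zero bytes skip pages 7 and 8.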
| 15547 |
| 15548 /* |
| 15549 ** Advance level iLvl of iterator pIter. Return non-zero at EOF. |
| 15550 */ |
| 15551 static int fts5DlidxIterNextR(Fts5Index *p, Fts5DlidxIter *pIter, int iLvl){ |
| 15552 Fts5DlidxLvl *pLvl = &pIter->aLvl[iLvl]; |
| 15553 |
| 15554 assert( iLvl<pIter->nLvl ); |
| 15555 if( fts5DlidxLvlNext(pLvl) ){ |
| 15556 if( (iLvl+1) < pIter->nLvl ){ |
| 15557 fts5DlidxIterNextR(p, pIter, iLvl+1); |
| 15558 if( pLvl[1].bEof==0 ){ |
| 15559 fts5DataRelease(pLvl->pData); |
| 15560 memset(pLvl, 0, sizeof(Fts5DlidxLvl)); |
| 15561 pLvl->pData = fts5DataRead(p, |
| 15562 FTS5_DLIDX_ROWID(pIter->iSegid, iLvl, pLvl[1].iLeafPgno) |
| 15563 ); |
| 15564 if( pLvl->pData ) fts5DlidxLvlNext(pLvl); |
| 15565 } |
| 15566 } |
| 15567 } |
| 15568 |
| 15569 return pIter->aLvl[0].bEof; |
| 15570 } |
| 15571 static int fts5DlidxIterNext(Fts5Index *p, Fts5DlidxIter *pIter){ |
| 15572 return fts5DlidxIterNextR(p, pIter, 0); |
| 15573 } |
| 15574 |
| 15575 /* |
| 15576 ** The iterator passed as the only argument already has the fields listed |
| 15577 ** below set. This function sets up the rest of the iterator so that it |
| 15578 ** points to the first rowid in the doclist-index. |
| 15579 ** |
| 15580 ** pData: |
| 15581 ** pointer to doclist-index record, |
| 15582 ** |
| 15583 ** When this function is called pIter->iLeafPgno is the page number the |
| 15584 ** doclist is associated with (the one featuring the term). |
| 15585 */ |
| 15586 static int fts5DlidxIterFirst(Fts5DlidxIter *pIter){ |
| 15587 int i; |
| 15588 for(i=0; i<pIter->nLvl; i++){ |
| 15589 fts5DlidxLvlNext(&pIter->aLvl[i]); |
| 15590 } |
| 15591 return pIter->aLvl[0].bEof; |
| 15592 } |
| 15593 |
| 15594 |
| 15595 static int fts5DlidxIterEof(Fts5Index *p, Fts5DlidxIter *pIter){ |
| 15596 return p->rc!=SQLITE_OK || pIter->aLvl[0].bEof; |
| 15597 } |
| 15598 |
| 15599 static void fts5DlidxIterLast(Fts5Index *p, Fts5DlidxIter *pIter){ |
| 15600 int i; |
| 15601 |
| 15602 /* Advance each level to the last entry on the last page */ |
| 15603 for(i=pIter->nLvl-1; p->rc==SQLITE_OK && i>=0; i--){ |
| 15604 Fts5DlidxLvl *pLvl = &pIter->aLvl[i]; |
| 15605 while( fts5DlidxLvlNext(pLvl)==0 ); |
| 15606 pLvl->bEof = 0; |
| 15607 |
| 15608 if( i>0 ){ |
| 15609 Fts5DlidxLvl *pChild = &pLvl[-1]; |
| 15610 fts5DataRelease(pChild->pData); |
| 15611 memset(pChild, 0, sizeof(Fts5DlidxLvl)); |
| 15612 pChild->pData = fts5DataRead(p, |
| 15613 FTS5_DLIDX_ROWID(pIter->iSegid, i-1, pLvl->iLeafPgno) |
| 15614 ); |
| 15615 } |
| 15616 } |
| 15617 } |
| 15618 |
| 15619 /* |
| 15620 ** Move the iterator passed as the only argument to the previous entry. |
| 15621 */ |
| 15622 static int fts5DlidxLvlPrev(Fts5DlidxLvl *pLvl){ |
| 15623 int iOff = pLvl->iOff; |
| 15624 |
| 15625 assert( pLvl->bEof==0 ); |
| 15626 if( iOff<=pLvl->iFirstOff ){ |
| 15627 pLvl->bEof = 1; |
| 15628 }else{ |
| 15629 u8 *a = pLvl->pData->p; |
| 15630 i64 iVal; |
| 15631 int iLimit; |
| 15632 int ii; |
| 15633 int nZero = 0; |
| 15634 |
| 15635 /* Currently iOff points to the first byte of a varint. This block |
| 15636 ** decrements iOff until it points to the first byte of the previous |
| 15637 ** varint, taking care not to read any memory locations that occur |
| 15638 ** before the buffer in memory. */ |
| 15639 iLimit = (iOff>9 ? iOff-9 : 0); |
| 15640 for(iOff--; iOff>iLimit; iOff--){ |
| 15641 if( (a[iOff-1] & 0x80)==0 ) break; |
| 15642 } |
| 15643 |
| 15644 fts5GetVarint(&a[iOff], (u64*)&iVal); |
| 15645 pLvl->iRowid -= iVal; |
| 15646 pLvl->iLeafPgno--; |
| 15647 |
| 15648 /* Skip backwards past any 0x00 varints. */ |
| 15649 for(ii=iOff-1; ii>=pLvl->iFirstOff && a[ii]==0x00; ii--){ |
| 15650 nZero++; |
| 15651 } |
| 15652 if( ii>=pLvl->iFirstOff && (a[ii] & 0x80) ){ |
| 15653 /* The byte immediately before the last 0x00 byte has the 0x80 bit |
| 15654 ** set. So the last 0x00 is only a varint 0 if there are 8 more 0x80 |
| 15655 ** bytes before a[ii]. */ |
| 15656 int bZero = 0; /* True if last 0x00 counts */ |
| 15657 if( (ii-8)>=pLvl->iFirstOff ){ |
| 15658 int j; |
| 15659 for(j=1; j<=8 && (a[ii-j] & 0x80); j++); |
| 15660 bZero = (j>8); |
| 15661 } |
| 15662 if( bZero==0 ) nZero--; |
| 15663 } |
| 15664 pLvl->iLeafPgno -= nZero; |
| 15665 pLvl->iOff = iOff - nZero; |
| 15666 } |
| 15667 |
| 15668 return pLvl->bEof; |
| 15669 } |
| 15670 |
| 15671 static int fts5DlidxIterPrevR(Fts5Index *p, Fts5DlidxIter *pIter, int iLvl){ |
| 15672 Fts5DlidxLvl *pLvl = &pIter->aLvl[iLvl]; |
| 15673 |
| 15674 assert( iLvl<pIter->nLvl ); |
| 15675 if( fts5DlidxLvlPrev(pLvl) ){ |
| 15676 if( (iLvl+1) < pIter->nLvl ){ |
| 15677 fts5DlidxIterPrevR(p, pIter, iLvl+1); |
| 15678 if( pLvl[1].bEof==0 ){ |
| 15679 fts5DataRelease(pLvl->pData); |
| 15680 memset(pLvl, 0, sizeof(Fts5DlidxLvl)); |
| 15681 pLvl->pData = fts5DataRead(p, |
| 15682 FTS5_DLIDX_ROWID(pIter->iSegid, iLvl, pLvl[1].iLeafPgno) |
| 15683 ); |
| 15684 if( pLvl->pData ){ |
| 15685 while( fts5DlidxLvlNext(pLvl)==0 ); |
| 15686 pLvl->bEof = 0; |
| 15687 } |
| 15688 } |
| 15689 } |
| 15690 } |
| 15691 |
| 15692 return pIter->aLvl[0].bEof; |
| 15693 } |
| 15694 static int fts5DlidxIterPrev(Fts5Index *p, Fts5DlidxIter *pIter){ |
| 15695 return fts5DlidxIterPrevR(p, pIter, 0); |
| 15696 } |
| 15697 |
| 15698 /* |
| 15699 ** Free a doclist-index iterator object allocated by fts5DlidxIterInit(). |
| 15700 */ |
| 15701 static void fts5DlidxIterFree(Fts5DlidxIter *pIter){ |
| 15702 if( pIter ){ |
| 15703 int i; |
| 15704 for(i=0; i<pIter->nLvl; i++){ |
| 15705 fts5DataRelease(pIter->aLvl[i].pData); |
| 15706 } |
| 15707 sqlite3_free(pIter); |
| 15708 } |
| 15709 } |
| 15710 |
| 15711 static Fts5DlidxIter *fts5DlidxIterInit( |
| 15712 Fts5Index *p, /* Fts5 Backend to iterate within */ |
| 15713 int bRev, /* True to start at the last entry */ |
| 15714 int iSegid, /* Segment id */ |
| 15715 int iLeafPg /* Leaf page number to load dlidx for */ |
| 15716 ){ |
| 15717 Fts5DlidxIter *pIter = 0; |
| 15718 int i; |
| 15719 int bDone = 0; |
| 15720 |
| 15721 for(i=0; p->rc==SQLITE_OK && bDone==0; i++){ |
| 15722 int nByte = sizeof(Fts5DlidxIter) + i * sizeof(Fts5DlidxLvl); |
| 15723 Fts5DlidxIter *pNew; |
| 15724 |
| 15725 pNew = (Fts5DlidxIter*)sqlite3_realloc(pIter, nByte); |
| 15726 if( pNew==0 ){ |
| 15727 p->rc = SQLITE_NOMEM; |
| 15728 }else{ |
| 15729 i64 iRowid = FTS5_DLIDX_ROWID(iSegid, i, iLeafPg); |
| 15730 Fts5DlidxLvl *pLvl = &pNew->aLvl[i]; |
| 15731 pIter = pNew; |
| 15732 memset(pLvl, 0, sizeof(Fts5DlidxLvl)); |
| 15733 pLvl->pData = fts5DataRead(p, iRowid); |
| 15734 if( pLvl->pData && (pLvl->pData->p[0] & 0x0001)==0 ){ |
| 15735 bDone = 1; |
| 15736 } |
| 15737 pIter->nLvl = i+1; |
| 15738 } |
| 15739 } |
| 15740 |
| 15741 if( p->rc==SQLITE_OK ){ |
| 15742 pIter->iSegid = iSegid; |
| 15743 if( bRev==0 ){ |
| 15744 fts5DlidxIterFirst(pIter); |
| 15745 }else{ |
| 15746 fts5DlidxIterLast(p, pIter); |
| 15747 } |
| 15748 } |
| 15749 |
| 15750 if( p->rc!=SQLITE_OK ){ |
| 15751 fts5DlidxIterFree(pIter); |
| 15752 pIter = 0; |
| 15753 } |
| 15754 |
| 15755 return pIter; |
| 15756 } |
| 15757 |
| 15758 static i64 fts5DlidxIterRowid(Fts5DlidxIter *pIter){ |
| 15759 return pIter->aLvl[0].iRowid; |
| 15760 } |
| 15761 static int fts5DlidxIterPgno(Fts5DlidxIter *pIter){ |
| 15762 return pIter->aLvl[0].iLeafPgno; |
| 15763 } |
| 15764 |
| 15765 /* |
| 15766 ** Load the next leaf page into the segment iterator. |
| 15767 */ |
| 15768 static void fts5SegIterNextPage( |
| 15769 Fts5Index *p, /* FTS5 backend object */ |
| 15770 Fts5SegIter *pIter /* Iterator to advance to next page */ |
| 15771 ){ |
| 15772 Fts5Data *pLeaf; |
| 15773 Fts5StructureSegment *pSeg = pIter->pSeg; |
| 15774 fts5DataRelease(pIter->pLeaf); |
| 15775 pIter->iLeafPgno++; |
| 15776 if( pIter->pNextLeaf ){ |
| 15777 pIter->pLeaf = pIter->pNextLeaf; |
| 15778 pIter->pNextLeaf = 0; |
| 15779 }else if( pIter->iLeafPgno<=pSeg->pgnoLast ){ |
| 15780 pIter->pLeaf = fts5DataRead(p, |
| 15781 FTS5_SEGMENT_ROWID(pSeg->iSegid, pIter->iLeafPgno) |
| 15782 ); |
| 15783 }else{ |
| 15784 pIter->pLeaf = 0; |
| 15785 } |
| 15786 pLeaf = pIter->pLeaf; |
| 15787 |
| 15788 if( pLeaf ){ |
| 15789 pIter->iPgidxOff = pLeaf->szLeaf; |
| 15790 if( fts5LeafIsTermless(pLeaf) ){ |
| 15791 pIter->iEndofDoclist = pLeaf->nn+1; |
| 15792 }else{ |
| 15793 pIter->iPgidxOff += fts5GetVarint32(&pLeaf->p[pIter->iPgidxOff], |
| 15794 pIter->iEndofDoclist |
| 15795 ); |
| 15796 } |
| 15797 } |
| 15798 } |
| 15799 |
| 15800 /* |
| 15801 ** Argument p points to a buffer containing a varint to be interpreted as a |
| 15802 ** position list size field. Read the varint and return the number of bytes |
| 15803 ** read. Before returning, set *pnSz to the number of bytes in the position |
| 15804 ** list, and *pbDel to true if the delete flag is set, or false otherwise. |
| 15805 */ |
| 15806 static int fts5GetPoslistSize(const u8 *p, int *pnSz, int *pbDel){ |
| 15807 int nSz; |
| 15808 int n = 0; |
| 15809 fts5FastGetVarint32(p, n, nSz); |
| 15810 assert_nc( nSz>=0 ); |
| 15811 *pnSz = nSz/2; |
| 15812 *pbDel = nSz & 0x0001; |
| 15813 return n; |
| 15814 } |
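The encoding read by fts5GetPoslistSize() packs two values into one varint: the low bit is the delete flag and the remaining bits are the position-list size in bytes. A minimal standalone sketch follows. The names `sketchGetVarint32` and `sketchPoslistSize` are hypothetical, and the varint decoder is a simplified stand-in for fts5FastGetVarint32 that assumes a well-formed value of at most five bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: decode an SQLite-style varint of at most 5 bytes.
** Returns the number of bytes consumed. */
static int sketchGetVarint32(const unsigned char *a, uint32_t *pVal){
  uint32_t v = 0;
  int i;
  for(i=0; i<5; i++){
    v = (v<<7) | (uint32_t)(a[i] & 0x7f);
    if( (a[i] & 0x80)==0 ) break;
  }
  *pVal = v;
  return i<5 ? i+1 : 5;
}

/* Decode a position-list size field. *pnSz receives the size of the
** position list in bytes and *pbDel the delete flag. Returns the number
** of bytes occupied by the size field itself. */
static int sketchPoslistSize(const unsigned char *a, int *pnSz, int *pbDel){
  uint32_t nSz;
  int n = sketchGetVarint32(a, &nSz);
  *pnSz = (int)(nSz / 2);
  *pbDel = (int)(nSz & 0x0001);
  return n;
}
```

A single byte 0x07 therefore describes a 3-byte position list with the delete flag set, while the two bytes 0x82 0x2C (the value 300) describe a 150-byte list with the flag clear.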
| 15815 |
| 15816 /* |
| 15817 ** Fts5SegIter.iLeafOffset currently points to the first byte of a |
| 15818 ** position-list size field. Read the value of the field and store it |
| 15819 ** in the following variables: |
| 15820 ** |
| 15821 ** Fts5SegIter.nPos |
| 15822 ** Fts5SegIter.bDel |
| 15823 ** |
| 15824 ** Leave Fts5SegIter.iLeafOffset pointing to the first byte of the |
| 15825 ** position list content (if any). |
| 15826 */ |
| 15827 static void fts5SegIterLoadNPos(Fts5Index *p, Fts5SegIter *pIter){ |
| 15828 if( p->rc==SQLITE_OK ){ |
| 15829 int iOff = pIter->iLeafOffset; /* Offset to read at */ |
| 15830 int nSz; |
| 15831 ASSERT_SZLEAF_OK(pIter->pLeaf); |
| 15832 fts5FastGetVarint32(pIter->pLeaf->p, iOff, nSz); |
| 15833 pIter->bDel = (nSz & 0x0001); |
| 15834 pIter->nPos = nSz>>1; |
| 15835 pIter->iLeafOffset = iOff; |
| 15836 assert_nc( pIter->nPos>=0 ); |
| 15837 } |
| 15838 } |
| 15839 |
| 15840 static void fts5SegIterLoadRowid(Fts5Index *p, Fts5SegIter *pIter){ |
| 15841 u8 *a = pIter->pLeaf->p; /* Buffer to read data from */ |
| 15842 int iOff = pIter->iLeafOffset; |
| 15843 |
| 15844 ASSERT_SZLEAF_OK(pIter->pLeaf); |
| 15845 if( iOff>=pIter->pLeaf->szLeaf ){ |
| 15846 fts5SegIterNextPage(p, pIter); |
| 15847 if( pIter->pLeaf==0 ){ |
| 15848 if( p->rc==SQLITE_OK ) p->rc = FTS5_CORRUPT; |
| 15849 return; |
| 15850 } |
| 15851 iOff = 4; |
| 15852 a = pIter->pLeaf->p; |
| 15853 } |
| 15854 iOff += sqlite3Fts5GetVarint(&a[iOff], (u64*)&pIter->iRowid); |
| 15855 pIter->iLeafOffset = iOff; |
| 15856 } |
| 15857 |
| 15858 /* |
| 15859 ** Fts5SegIter.iLeafOffset currently points to the first byte of the |
| 15860 ** "nSuffix" field of a term. Function parameter nKeep contains the value |
| 15861 ** of the "nPrefix" field (if there was one - it is passed 0 if this is |
| 15862 ** the first term in the segment). |
| 15863 ** |
| 15864 ** This function populates: |
| 15865 ** |
| 15866 ** Fts5SegIter.term |
| 15867 ** Fts5SegIter.iRowid |
| 15868 ** |
| 15869 ** accordingly and leaves Fts5SegIter.iLeafOffset pointing to the size |
| 15870 ** field of the first position list, which is the position list belonging |
| 15871 ** to document Fts5SegIter.iRowid. |
| 15872 */ |
| 15873 static void fts5SegIterLoadTerm(Fts5Index *p, Fts5SegIter *pIter, int nKeep){ |
| 15874 u8 *a = pIter->pLeaf->p; /* Buffer to read data from */ |
| 15875 int iOff = pIter->iLeafOffset; /* Offset to read at */ |
| 15876 int nNew; /* Bytes of new data */ |
| 15877 |
| 15878 iOff += fts5GetVarint32(&a[iOff], nNew); |
| 15879 pIter->term.n = nKeep; |
| 15880 fts5BufferAppendBlob(&p->rc, &pIter->term, nNew, &a[iOff]); |
| 15881 iOff += nNew; |
| 15882 pIter->iTermLeafOffset = iOff; |
| 15883 pIter->iTermLeafPgno = pIter->iLeafPgno; |
| 15884 pIter->iLeafOffset = iOff; |
| 15885 |
| 15886 if( pIter->iPgidxOff>=pIter->pLeaf->nn ){ |
| 15887 pIter->iEndofDoclist = pIter->pLeaf->nn+1; |
| 15888 }else{ |
| 15889 int nExtra; |
| 15890 pIter->iPgidxOff += fts5GetVarint32(&a[pIter->iPgidxOff], nExtra); |
| 15891 pIter->iEndofDoclist += nExtra; |
| 15892 } |
| 15893 |
| 15894 fts5SegIterLoadRowid(p, pIter); |
| 15895 } |
| 15896 |
| 15897 /* |
| 15898 ** Initialize the iterator object pIter to iterate through the entries in |
| 15899 ** segment pSeg. The iterator is left pointing to the first entry when |
| 15900 ** this function returns. |
| 15901 ** |
| 15902 ** If an error occurs, Fts5Index.rc is set to an appropriate error code. If |
| 15903 ** an error has already occurred when this function is called, it is a no-op. |
| 15904 */ |
| 15905 static void fts5SegIterInit( |
| 15906 Fts5Index *p, /* FTS index object */ |
| 15907 Fts5StructureSegment *pSeg, /* Description of segment */ |
| 15908 Fts5SegIter *pIter /* Object to populate */ |
| 15909 ){ |
| 15910 if( pSeg->pgnoFirst==0 ){ |
| 15911 /* This happens if the segment is being used as an input to an incremental |
| 15912 ** merge and all data has already been "trimmed". See function |
| 15913 ** fts5TrimSegments() for details. In this case leave the iterator empty. |
| 15914 ** The caller will see the (pIter->pLeaf==0) and assume the iterator is |
| 15915 ** at EOF already. */ |
| 15916 assert( pIter->pLeaf==0 ); |
| 15917 return; |
| 15918 } |
| 15919 |
| 15920 if( p->rc==SQLITE_OK ){ |
| 15921 memset(pIter, 0, sizeof(*pIter)); |
| 15922 pIter->pSeg = pSeg; |
| 15923 pIter->iLeafPgno = pSeg->pgnoFirst-1; |
| 15924 fts5SegIterNextPage(p, pIter); |
| 15925 } |
| 15926 |
| 15927 if( p->rc==SQLITE_OK ){ |
| 15928 pIter->iLeafOffset = 4; |
| 15929 assert_nc( pIter->pLeaf->nn>4 ); |
| 15930 assert( fts5LeafFirstTermOff(pIter->pLeaf)==4 ); |
| 15931 pIter->iPgidxOff = pIter->pLeaf->szLeaf+1; |
| 15932 fts5SegIterLoadTerm(p, pIter, 0); |
| 15933 fts5SegIterLoadNPos(p, pIter); |
| 15934 } |
| 15935 } |
| 15936 |
| 15937 /* |
| 15938 ** This function is only ever called on iterators created by calls to |
| 15939 ** Fts5IndexQuery() with the FTS5INDEX_QUERY_DESC flag set. |
| 15940 ** |
| 15941 ** The iterator is in an unusual state when this function is called: the |
| 15942 ** Fts5SegIter.iLeafOffset variable is set to the offset of the start of |
| 15943 ** the position-list size field for the first relevant rowid on the page. |
| 15944 ** Fts5SegIter.iRowid is set, but nPos and bDel are not. |
| 15945 ** |
| 15946 ** This function advances the iterator so that it points to the last |
| 15947 ** relevant rowid on the page and, if necessary, initializes the |
| 15948 ** aRowidOffset[] and iRowidOffset variables. At this point the iterator |
| 15949 ** is in its regular state - Fts5SegIter.iLeafOffset points to the first |
| 15950 ** byte of the position list content associated with said rowid. |
| 15951 */ |
| 15952 static void fts5SegIterReverseInitPage(Fts5Index *p, Fts5SegIter *pIter){ |
| 15953 int n = pIter->pLeaf->szLeaf; |
| 15954 int i = pIter->iLeafOffset; |
| 15955 u8 *a = pIter->pLeaf->p; |
| 15956 int iRowidOffset = 0; |
| 15957 |
| 15958 if( n>pIter->iEndofDoclist ){ |
| 15959 n = pIter->iEndofDoclist; |
| 15960 } |
| 15961 |
| 15962 ASSERT_SZLEAF_OK(pIter->pLeaf); |
| 15963 while( 1 ){ |
| 15964 i64 iDelta = 0; |
| 15965 int nPos; |
| 15966 int bDummy; |
| 15967 |
| 15968 i += fts5GetPoslistSize(&a[i], &nPos, &bDummy); |
| 15969 i += nPos; |
| 15970 if( i>=n ) break; |
| 15971 i += fts5GetVarint(&a[i], (u64*)&iDelta); |
| 15972 pIter->iRowid += iDelta; |
| 15973 |
| 15974 if( iRowidOffset>=pIter->nRowidOffset ){ |
| 15975 int nNew = pIter->nRowidOffset + 8; |
| 15976 int *aNew = (int*)sqlite3_realloc(pIter->aRowidOffset, nNew*sizeof(int)); |
| 15977 if( aNew==0 ){ |
| 15978 p->rc = SQLITE_NOMEM; |
| 15979 break; |
| 15980 } |
| 15981 pIter->aRowidOffset = aNew; |
| 15982 pIter->nRowidOffset = nNew; |
| 15983 } |
| 15984 |
| 15985 pIter->aRowidOffset[iRowidOffset++] = pIter->iLeafOffset; |
| 15986 pIter->iLeafOffset = i; |
| 15987 } |
| 15988 pIter->iRowidOffset = iRowidOffset; |
| 15989 fts5SegIterLoadNPos(p, pIter); |
| 15990 } |
| 15991 |
| 15992 /* |
| 15993 ** Step a reverse iterator back to the previous leaf with rowids, or EOF. |
| 15994 */ |
| 15995 static void fts5SegIterReverseNewPage(Fts5Index *p, Fts5SegIter *pIter){ |
| 15996 assert( pIter->flags & FTS5_SEGITER_REVERSE ); |
| 15997 assert( pIter->flags & FTS5_SEGITER_ONETERM ); |
| 15998 |
| 15999 fts5DataRelease(pIter->pLeaf); |
| 16000 pIter->pLeaf = 0; |
| 16001 while( p->rc==SQLITE_OK && pIter->iLeafPgno>pIter->iTermLeafPgno ){ |
| 16002 Fts5Data *pNew; |
| 16003 pIter->iLeafPgno--; |
| 16004 pNew = fts5DataRead(p, FTS5_SEGMENT_ROWID( |
| 16005 pIter->pSeg->iSegid, pIter->iLeafPgno |
| 16006 )); |
| 16007 if( pNew ){ |
| 16008 /* iTermLeafOffset may be equal to szLeaf if the term is the last |
| 16009 ** thing on the page - i.e. the first rowid is on the following page. |
| 16010 ** In this case leave pIter->pLeaf==0, this iterator is at EOF. */ |
| 16011 if( pIter->iLeafPgno==pIter->iTermLeafPgno ){ |
| 16012 assert( pIter->pLeaf==0 ); |
| 16013 if( pIter->iTermLeafOffset<pNew->szLeaf ){ |
| 16014 pIter->pLeaf = pNew; |
| 16015 pIter->iLeafOffset = pIter->iTermLeafOffset; |
| 16016 } |
| 16017 }else{ |
| 16018 int iRowidOff; |
| 16019 iRowidOff = fts5LeafFirstRowidOff(pNew); |
| 16020 if( iRowidOff ){ |
| 16021 pIter->pLeaf = pNew; |
| 16022 pIter->iLeafOffset = iRowidOff; |
| 16023 } |
| 16024 } |
| 16025 |
| 16026 if( pIter->pLeaf ){ |
| 16027 u8 *a = &pIter->pLeaf->p[pIter->iLeafOffset]; |
| 16028 pIter->iLeafOffset += fts5GetVarint(a, (u64*)&pIter->iRowid); |
| 16029 break; |
| 16030 }else{ |
| 16031 fts5DataRelease(pNew); |
| 16032 } |
| 16033 } |
| 16034 } |
| 16035 |
| 16036 if( pIter->pLeaf ){ |
| 16037 pIter->iEndofDoclist = pIter->pLeaf->nn+1; |
| 16038 fts5SegIterReverseInitPage(p, pIter); |
| 16039 } |
| 16040 } |
| 16041 |
| 16042 /* |
| 16043 ** Return true if the iterator passed as the second argument currently |
| 16044 ** points to a delete marker. A delete marker is an entry with a 0 byte |
| 16045 ** position-list. |
| 16046 */ |
| 16047 static int fts5MultiIterIsEmpty(Fts5Index *p, Fts5IndexIter *pIter){ |
| 16048 Fts5SegIter *pSeg = &pIter->aSeg[pIter->aFirst[1].iFirst]; |
| 16049 return (p->rc==SQLITE_OK && pSeg->pLeaf && pSeg->nPos==0); |
| 16050 } |
| 16051 |
| 16052 /* |
| 16053 ** Advance iterator pIter to the next entry. |
| 16054 ** |
| 16055 ** If an error occurs, Fts5Index.rc is set to an appropriate error code. It |
| 16056 ** is not considered an error if the iterator reaches EOF. If an error has |
| 16057 ** already occurred when this function is called, it is a no-op. |
| 16058 */ |
| 16059 static void fts5SegIterNext( |
| 16060 Fts5Index *p, /* FTS5 backend object */ |
| 16061 Fts5SegIter *pIter, /* Iterator to advance */ |
| 16062 int *pbNewTerm /* OUT: Set for new term */ |
| 16063 ){ |
| 16064 assert( pbNewTerm==0 || *pbNewTerm==0 ); |
| 16065 if( p->rc==SQLITE_OK ){ |
| 16066 if( pIter->flags & FTS5_SEGITER_REVERSE ){ |
| 16067 assert( pIter->pNextLeaf==0 ); |
| 16068 if( pIter->iRowidOffset>0 ){ |
| 16069 u8 *a = pIter->pLeaf->p; |
| 16070 int iOff; |
| 16071 int nPos; |
| 16072 int bDummy; |
| 16073 i64 iDelta; |
| 16074 |
| 16075 pIter->iRowidOffset--; |
| 16076 pIter->iLeafOffset = iOff = pIter->aRowidOffset[pIter->iRowidOffset]; |
| 16077 iOff += fts5GetPoslistSize(&a[iOff], &nPos, &bDummy); |
| 16078 iOff += nPos; |
| 16079 fts5GetVarint(&a[iOff], (u64*)&iDelta); |
| 16080 pIter->iRowid -= iDelta; |
| 16081 fts5SegIterLoadNPos(p, pIter); |
| 16082 }else{ |
| 16083 fts5SegIterReverseNewPage(p, pIter); |
| 16084 } |
| 16085 }else{ |
| 16086 Fts5Data *pLeaf = pIter->pLeaf; |
| 16087 int iOff; |
| 16088 int bNewTerm = 0; |
| 16089 int nKeep = 0; |
| 16090 |
| 16091 /* Search for the end of the position list within the current page. */ |
| 16092 u8 *a = pLeaf->p; |
| 16093 int n = pLeaf->szLeaf; |
| 16094 |
| 16095 ASSERT_SZLEAF_OK(pLeaf); |
| 16096 iOff = pIter->iLeafOffset + pIter->nPos; |
| 16097 |
| 16098 if( iOff<n ){ |
| 16099 /* The next entry is on the current page. */ |
| 16100 assert_nc( iOff<=pIter->iEndofDoclist ); |
| 16101 if( iOff>=pIter->iEndofDoclist ){ |
| 16102 bNewTerm = 1; |
| 16103 if( iOff!=fts5LeafFirstTermOff(pLeaf) ){ |
| 16104 iOff += fts5GetVarint32(&a[iOff], nKeep); |
| 16105 } |
| 16106 }else{ |
| 16107 u64 iDelta; |
| 16108 iOff += sqlite3Fts5GetVarint(&a[iOff], &iDelta); |
| 16109 pIter->iRowid += iDelta; |
| 16110 assert_nc( iDelta>0 ); |
| 16111 } |
| 16112 pIter->iLeafOffset = iOff; |
| 16113 |
| 16114 }else if( pIter->pSeg==0 ){ |
| 16115 const u8 *pList = 0; |
| 16116 const char *zTerm = 0; |
| 16117 int nList = 0; |
| 16118 assert( (pIter->flags & FTS5_SEGITER_ONETERM) || pbNewTerm ); |
| 16119 if( 0==(pIter->flags & FTS5_SEGITER_ONETERM) ){ |
| 16120 sqlite3Fts5HashScanNext(p->pHash); |
| 16121 sqlite3Fts5HashScanEntry(p->pHash, &zTerm, &pList, &nList); |
| 16122 } |
| 16123 if( pList==0 ){ |
| 16124 fts5DataRelease(pIter->pLeaf); |
| 16125 pIter->pLeaf = 0; |
| 16126 }else{ |
| 16127 pIter->pLeaf->p = (u8*)pList; |
| 16128 pIter->pLeaf->nn = nList; |
| 16129 pIter->pLeaf->szLeaf = nList; |
| 16130 pIter->iEndofDoclist = nList+1; |
| 16131 sqlite3Fts5BufferSet(&p->rc, &pIter->term, (int)strlen(zTerm), |
| 16132 (u8*)zTerm); |
| 16133 pIter->iLeafOffset = fts5GetVarint(pList, (u64*)&pIter->iRowid); |
| 16134 *pbNewTerm = 1; |
| 16135 } |
| 16136 }else{ |
| 16137 iOff = 0; |
| 16138 /* Next entry is not on the current page */ |
| 16139 while( iOff==0 ){ |
| 16140 fts5SegIterNextPage(p, pIter); |
| 16141 pLeaf = pIter->pLeaf; |
| 16142 if( pLeaf==0 ) break; |
| 16143 ASSERT_SZLEAF_OK(pLeaf); |
| 16144 if( (iOff = fts5LeafFirstRowidOff(pLeaf)) && iOff<pLeaf->szLeaf ){ |
| 16145 iOff += sqlite3Fts5GetVarint(&pLeaf->p[iOff], (u64*)&pIter->iRowid); |
| 16146 pIter->iLeafOffset = iOff; |
| 16147 |
| 16148 if( pLeaf->nn>pLeaf->szLeaf ){ |
| 16149 pIter->iPgidxOff = pLeaf->szLeaf + fts5GetVarint32( |
| 16150 &pLeaf->p[pLeaf->szLeaf], pIter->iEndofDoclist |
| 16151 ); |
| 16152 } |
| 16153 |
| 16154 } |
| 16155 else if( pLeaf->nn>pLeaf->szLeaf ){ |
| 16156 pIter->iPgidxOff = pLeaf->szLeaf + fts5GetVarint32( |
| 16157 &pLeaf->p[pLeaf->szLeaf], iOff |
| 16158 ); |
| 16159 pIter->iLeafOffset = iOff; |
| 16160 pIter->iEndofDoclist = iOff; |
| 16161 bNewTerm = 1; |
| 16162 } |
| 16163 if( iOff>=pLeaf->szLeaf ){ |
| 16164 p->rc = FTS5_CORRUPT; |
| 16165 return; |
| 16166 } |
| 16167 } |
| 16168 } |
| 16169 |
| 16170 /* Check if the iterator is now at EOF. If so, return early. */ |
| 16171 if( pIter->pLeaf ){ |
| 16172 if( bNewTerm ){ |
| 16173 if( pIter->flags & FTS5_SEGITER_ONETERM ){ |
| 16174 fts5DataRelease(pIter->pLeaf); |
| 16175 pIter->pLeaf = 0; |
| 16176 }else{ |
| 16177 fts5SegIterLoadTerm(p, pIter, nKeep); |
| 16178 fts5SegIterLoadNPos(p, pIter); |
| 16179 if( pbNewTerm ) *pbNewTerm = 1; |
| 16180 } |
| 16181 }else{ |
| 16182 /* The following could be done by calling fts5SegIterLoadNPos(). But |
| 16183 ** this block is particularly performance critical, so equivalent |
| 16184 ** code is inlined. */ |
| 16185 int nSz; |
| 16186 assert( p->rc==SQLITE_OK ); |
| 16187 fts5FastGetVarint32(pIter->pLeaf->p, pIter->iLeafOffset, nSz); |
| 16188 pIter->bDel = (nSz & 0x0001); |
| 16189 pIter->nPos = nSz>>1; |
| 16190 assert_nc( pIter->nPos>=0 ); |
| 16191 } |
| 16192 } |
| 16193 } |
| 16194 } |
| 16195 } |
| 16196 |
| 16197 #define SWAPVAL(T, a, b) { T tmp; tmp=a; a=b; b=tmp; } |
| 16198 |
| 16199 /* |
| 16200 ** Iterator pIter currently points to the first rowid in a doclist. This |
| 16201 ** function sets the iterator up so that it iterates in reverse order through |
| 16202 ** the doclist. |
| 16203 */ |
| 16204 static void fts5SegIterReverse(Fts5Index *p, Fts5SegIter *pIter){ |
| 16205 Fts5DlidxIter *pDlidx = pIter->pDlidx; |
| 16206 Fts5Data *pLast = 0; |
| 16207 int pgnoLast = 0; |
| 16208 |
| 16209 if( pDlidx ){ |
| 16210 int iSegid = pIter->pSeg->iSegid; |
| 16211 pgnoLast = fts5DlidxIterPgno(pDlidx); |
| 16212 pLast = fts5DataRead(p, FTS5_SEGMENT_ROWID(iSegid, pgnoLast)); |
| 16213 }else{ |
| 16214 Fts5Data *pLeaf = pIter->pLeaf; /* Current leaf data */ |
| 16215 |
| 16216 /* Currently, Fts5SegIter.iLeafOffset points to the first byte of |
| 16217 ** position-list content for the current rowid. Back it up so that it |
| 16218 ** points to the start of the position-list size field. */ |
| 16219 pIter->iLeafOffset -= sqlite3Fts5GetVarintLen(pIter->nPos*2+pIter->bDel); |
| 16220 |
| 16221 /* If this condition is true then the largest rowid for the current |
| 16222 ** term may not be stored on the current page. So search forward to |
| 16223 ** see where said rowid really is. */ |
| 16224 if( pIter->iEndofDoclist>=pLeaf->szLeaf ){ |
| 16225 int pgno; |
| 16226 Fts5StructureSegment *pSeg = pIter->pSeg; |
| 16227 |
| 16228 /* The last rowid in the doclist may not be on the current page. Search |
| 16229 ** forward to find the page containing the last rowid. */ |
| 16230 for(pgno=pIter->iLeafPgno+1; !p->rc && pgno<=pSeg->pgnoLast; pgno++){ |
| 16231 i64 iAbs = FTS5_SEGMENT_ROWID(pSeg->iSegid, pgno); |
| 16232 Fts5Data *pNew = fts5DataRead(p, iAbs); |
| 16233 if( pNew ){ |
| 16234 int iRowid, bTermless; |
| 16235 iRowid = fts5LeafFirstRowidOff(pNew); |
| 16236 bTermless = fts5LeafIsTermless(pNew); |
| 16237 if( iRowid ){ |
| 16238 SWAPVAL(Fts5Data*, pNew, pLast); |
| 16239 pgnoLast = pgno; |
| 16240 } |
| 16241 fts5DataRelease(pNew); |
| 16242 if( bTermless==0 ) break; |
| 16243 } |
| 16244 } |
| 16245 } |
| 16246 } |
| 16247 |
| 16248 /* If pLast is NULL at this point, then the last rowid for this doclist |
| 16249 ** lies on the page currently indicated by the iterator. In this case |
| 16250 ** pIter->iLeafOffset is already set to point to the position-list size |
| 16251 ** field associated with the first relevant rowid on the page. |
| 16252 ** |
| 16253 ** Or, if pLast is non-NULL, then it is the page that contains the last |
| 16254 ** rowid. In this case configure the iterator so that it points to the |
| 16255 ** first rowid on this page. |
| 16256 */ |
| 16257 if( pLast ){ |
| 16258 int iOff; |
| 16259 fts5DataRelease(pIter->pLeaf); |
| 16260 pIter->pLeaf = pLast; |
| 16261 pIter->iLeafPgno = pgnoLast; |
| 16262 iOff = fts5LeafFirstRowidOff(pLast); |
| 16263 iOff += fts5GetVarint(&pLast->p[iOff], (u64*)&pIter->iRowid); |
| 16264 pIter->iLeafOffset = iOff; |
| 16265 |
| 16266 if( fts5LeafIsTermless(pLast) ){ |
| 16267 pIter->iEndofDoclist = pLast->nn+1; |
| 16268 }else{ |
| 16269 pIter->iEndofDoclist = fts5LeafFirstTermOff(pLast); |
| 16270 } |
| 16271 |
| 16272 } |
| 16273 |
| 16274 fts5SegIterReverseInitPage(p, pIter); |
| 16275 } |
| 16276 |
| 16277 /* |
| 16278 ** Iterator pIter currently points to the first rowid of a doclist. |
| 16279 ** There is a doclist-index associated with the final term on the current |
| 16280 ** page. If the current term is the last term on the page, load the |
| 16281 ** doclist-index from disk and initialize an iterator at (pIter->pDlidx). |
| 16282 */ |
| 16283 static void fts5SegIterLoadDlidx(Fts5Index *p, Fts5SegIter *pIter){ |
| 16284 int iSeg = pIter->pSeg->iSegid; |
| 16285 int bRev = (pIter->flags & FTS5_SEGITER_REVERSE); |
| 16286 Fts5Data *pLeaf = pIter->pLeaf; /* Current leaf data */ |
| 16287 |
| 16288 assert( pIter->flags & FTS5_SEGITER_ONETERM ); |
| 16289 assert( pIter->pDlidx==0 ); |
| 16290 |
| 16291 /* Check if the current doclist ends on this page. If it does, return |
| 16292 ** early without loading the doclist-index (as it belongs to a different |
| 16293 ** term). */ |
| 16294 if( pIter->iTermLeafPgno==pIter->iLeafPgno |
| 16295 && pIter->iEndofDoclist<pLeaf->szLeaf |
| 16296 ){ |
| 16297 return; |
| 16298 } |
| 16299 |
| 16300 pIter->pDlidx = fts5DlidxIterInit(p, bRev, iSeg, pIter->iTermLeafPgno); |
| 16301 } |
| 16302 |
| 16303 #define fts5IndexSkipVarint(a, iOff) { \ |
| 16304 int iEnd = iOff+9; \ |
| 16305 while( (a[iOff++] & 0x80) && iOff<iEnd ); \ |
| 16306 } |
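
The fts5IndexSkipVarint() macro steps over one varint without decoding it: it advances while the high bit of each byte is set, and never consumes more than nine bytes. The same loop written as a function might look like the sketch below. The name skip_varint() is hypothetical, and the caller is assumed to guarantee that at least nine readable bytes follow iOff.

```c
#include <assert.h>

/* Hypothetical function form of the skip loop; not part of FTS5.
** Returns the offset of the first byte after the varint at a[iOff].
** Assumes at least nine readable bytes starting at a[iOff]. */
static int skip_varint(const unsigned char *a, int iOff){
  int iEnd = iOff + 9;
  assert( a!=0 && iOff>=0 );
  /* A set high bit means "more bytes follow"; stop after nine bytes. */
  while( (a[iOff++] & 0x80) && iOff<iEnd );
  return iOff;
}
```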
| 16307 |
| 16308 /* |
| 16309 ** The iterator object passed as the second argument currently contains |
| 16310 ** no valid values except for the Fts5SegIter.pLeaf member variable. This |
| 16311 ** function searches the leaf page for a term matching (pTerm/nTerm). |
| 16312 ** |
| 16313 ** If the specified term is found on the page, then the iterator is left |
| 16314 ** pointing to it. If argument bGe is zero and the term is not found, |
| 16315 ** the iterator is left pointing at EOF. |
| 16316 ** |
| 16317 ** If bGe is non-zero and the specified term is not found, then the |
| 16318 ** iterator is left pointing to the smallest term in the segment that |
| 16319 ** is larger than the specified term, even if this term is not on the |
| 16320 ** current page. |
| 16321 */ |
| 16322 static void fts5LeafSeek( |
| 16323 Fts5Index *p, /* Leave any error code here */ |
| 16324 int bGe, /* True for a >= search */ |
| 16325 Fts5SegIter *pIter, /* Iterator to seek */ |
| 16326 const u8 *pTerm, int nTerm /* Term to search for */ |
| 16327 ){ |
| 16328 int iOff; |
| 16329 const u8 *a = pIter->pLeaf->p; |
| 16330 int szLeaf = pIter->pLeaf->szLeaf; |
| 16331 int n = pIter->pLeaf->nn; |
| 16332 |
| 16333 int nMatch = 0; |
| 16334 int nKeep = 0; |
| 16335 int nNew = 0; |
| 16336 int iTermOff; |
| 16337 int iPgidx; /* Current offset in pgidx */ |
| 16338 int bEndOfPage = 0; |
| 16339 |
| 16340 assert( p->rc==SQLITE_OK ); |
| 16341 |
| 16342 iPgidx = szLeaf; |
| 16343 iPgidx += fts5GetVarint32(&a[iPgidx], iTermOff); |
| 16344 iOff = iTermOff; |
| 16345 |
| 16346 while( 1 ){ |
| 16347 |
| 16348 /* Figure out how many new bytes are in this term */ |
| 16349 fts5FastGetVarint32(a, iOff, nNew); |
| 16350 if( nKeep<nMatch ){ |
| 16351 goto search_failed; |
| 16352 } |
| 16353 |
| 16354 assert( nKeep>=nMatch ); |
| 16355 if( nKeep==nMatch ){ |
| 16356 int nCmp; |
| 16357 int i; |
| 16358 nCmp = MIN(nNew, nTerm-nMatch); |
| 16359 for(i=0; i<nCmp; i++){ |
| 16360 if( a[iOff+i]!=pTerm[nMatch+i] ) break; |
| 16361 } |
| 16362 nMatch += i; |
| 16363 |
| 16364 if( nTerm==nMatch ){ |
| 16365 if( i==nNew ){ |
| 16366 goto search_success; |
| 16367 }else{ |
| 16368 goto search_failed; |
| 16369 } |
| 16370 }else if( i<nNew && a[iOff+i]>pTerm[nMatch] ){ |
| 16371 goto search_failed; |
| 16372 } |
| 16373 } |
| 16374 |
| 16375 if( iPgidx>=n ){ |
| 16376 bEndOfPage = 1; |
| 16377 break; |
| 16378 } |
| 16379 |
| 16380 iPgidx += fts5GetVarint32(&a[iPgidx], nKeep); |
| 16381 iTermOff += nKeep; |
| 16382 iOff = iTermOff; |
| 16383 |
| 16384 /* Read the nKeep field of the next term. */ |
| 16385 fts5FastGetVarint32(a, iOff, nKeep); |
| 16386 } |
| 16387 |
| 16388 search_failed: |
| 16389 if( bGe==0 ){ |
| 16390 fts5DataRelease(pIter->pLeaf); |
| 16391 pIter->pLeaf = 0; |
| 16392 return; |
| 16393 }else if( bEndOfPage ){ |
| 16394 do { |
| 16395 fts5SegIterNextPage(p, pIter); |
| 16396 if( pIter->pLeaf==0 ) return; |
| 16397 a = pIter->pLeaf->p; |
| 16398 if( fts5LeafIsTermless(pIter->pLeaf)==0 ){ |
| 16399 iPgidx = pIter->pLeaf->szLeaf; |
| 16400 iPgidx += fts5GetVarint32(&pIter->pLeaf->p[iPgidx], iOff); |
| 16401 if( iOff<4 || iOff>=pIter->pLeaf->szLeaf ){ |
| 16402 p->rc = FTS5_CORRUPT; |
| 16403 }else{ |
| 16404 nKeep = 0; |
| 16405 iTermOff = iOff; |
| 16406 n = pIter->pLeaf->nn; |
| 16407 iOff += fts5GetVarint32(&a[iOff], nNew); |
| 16408 break; |
| 16409 } |
| 16410 } |
| 16411 }while( 1 ); |
| 16412 } |
| 16413 |
| 16414 search_success: |
| 16415 |
| 16416 pIter->iLeafOffset = iOff + nNew; |
| 16417 pIter->iTermLeafOffset = pIter->iLeafOffset; |
| 16418 pIter->iTermLeafPgno = pIter->iLeafPgno; |
| 16419 |
| 16420 fts5BufferSet(&p->rc, &pIter->term, nKeep, pTerm); |
| 16421 fts5BufferAppendBlob(&p->rc, &pIter->term, nNew, &a[iOff]); |
| 16422 |
| 16423 if( iPgidx>=n ){ |
| 16424 pIter->iEndofDoclist = pIter->pLeaf->nn+1; |
| 16425 }else{ |
| 16426 int nExtra; |
| 16427 iPgidx += fts5GetVarint32(&a[iPgidx], nExtra); |
| 16428 pIter->iEndofDoclist = iTermOff + nExtra; |
| 16429 } |
| 16430 pIter->iPgidxOff = iPgidx; |
| 16431 |
| 16432 fts5SegIterLoadRowid(p, pIter); |
| 16433 fts5SegIterLoadNPos(p, pIter); |
| 16434 } |
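
fts5LeafSeek() relies on terms within a leaf being prefix-compressed: each entry records how many leading bytes to keep from the previous term (nKeep) and how many new bytes follow (nNew). A minimal sketch of rebuilding a term from those two values is shown below. The function rebuild_term() and its fixed-size caller-supplied buffer are assumptions made for illustration; FTS5 itself uses its Fts5Buffer routines for this.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, for illustration only; not part of FTS5.
** zTerm holds the previous term. Keep its first nKeep bytes, then
** append nNew bytes from zSuffix. Returns the new term length.
** The caller must ensure zTerm has room for nKeep+nNew bytes. */
static int rebuild_term(char *zTerm, int nKeep, const char *zSuffix, int nNew){
  assert( nKeep>=0 && nNew>=0 );
  memcpy(&zTerm[nKeep], zSuffix, (size_t)nNew);
  return nKeep + nNew;
}
```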
| 16435 |
| 16436 /* |
| 16437 ** Initialize the object pIter to point to term pTerm/nTerm within segment |
| 16438 ** pSeg. If there is no such term in the index, the iterator is set to EOF. |
| 16439 ** |
| 16440 ** If an error occurs, Fts5Index.rc is set to an appropriate error code. If |
| 16441 ** an error has already occurred when this function is called, it is a no-op. |
| 16442 */ |
| 16443 static void fts5SegIterSeekInit( |
| 16444 Fts5Index *p, /* FTS5 backend */ |
| 16445 Fts5Buffer *pBuf, /* Buffer to use for loading pages */ |
| 16446 const u8 *pTerm, int nTerm, /* Term to seek to */ |
| 16447 int flags, /* Mask of FTS5INDEX_XXX flags */ |
| 16448 Fts5StructureSegment *pSeg, /* Description of segment */ |
| 16449 Fts5SegIter *pIter /* Object to populate */ |
| 16450 ){ |
| 16451 int iPg = 1; |
| 16452 int bGe = (flags & FTS5INDEX_QUERY_SCAN); |
| 16453 int bDlidx = 0; /* True if there is a doclist-index */ |
| 16454 |
| 16455 static int nCall = 0; |
| 16456 nCall++; |
| 16457 |
| 16458 assert( bGe==0 || (flags & FTS5INDEX_QUERY_DESC)==0 ); |
| 16459 assert( pTerm && nTerm ); |
| 16460 memset(pIter, 0, sizeof(*pIter)); |
| 16461 pIter->pSeg = pSeg; |
| 16462 |
| 16463 /* This block sets stack variable iPg to the leaf page number that may |
| 16464 ** contain term (pTerm/nTerm), if it is present in the segment. */ |
| 16465 if( p->pIdxSelect==0 ){ |
| 16466 Fts5Config *pConfig = p->pConfig; |
| 16467 fts5IndexPrepareStmt(p, &p->pIdxSelect, sqlite3_mprintf( |
| 16468 "SELECT pgno FROM '%q'.'%q_idx' WHERE " |
| 16469 "segid=? AND term<=? ORDER BY term DESC LIMIT 1", |
| 16470 pConfig->zDb, pConfig->zName |
| 16471 )); |
| 16472 } |
| 16473 if( p->rc ) return; |
| 16474 sqlite3_bind_int(p->pIdxSelect, 1, pSeg->iSegid); |
| 16475 sqlite3_bind_blob(p->pIdxSelect, 2, pTerm, nTerm, SQLITE_STATIC); |
| 16476 if( SQLITE_ROW==sqlite3_step(p->pIdxSelect) ){ |
| 16477 i64 val = sqlite3_column_int(p->pIdxSelect, 0); |
| 16478 iPg = (int)(val>>1); |
| 16479 bDlidx = (val & 0x0001); |
| 16480 } |
| 16481 p->rc = sqlite3_reset(p->pIdxSelect); |
| 16482 |
| 16483 if( iPg<pSeg->pgnoFirst ){ |
| 16484 iPg = pSeg->pgnoFirst; |
| 16485 bDlidx = 0; |
| 16486 } |
| 16487 |
| 16488 pIter->iLeafPgno = iPg - 1; |
| 16489 fts5SegIterNextPage(p, pIter); |
| 16490 |
| 16491 if( pIter->pLeaf ){ |
| 16492 fts5LeafSeek(p, bGe, pIter, pTerm, nTerm); |
| 16493 } |
| 16494 |
| 16495 if( p->rc==SQLITE_OK && bGe==0 ){ |
| 16496 pIter->flags |= FTS5_SEGITER_ONETERM; |
| 16497 if( pIter->pLeaf ){ |
| 16498 if( flags & FTS5INDEX_QUERY_DESC ){ |
| 16499 pIter->flags |= FTS5_SEGITER_REVERSE; |
| 16500 } |
| 16501 if( bDlidx ){ |
| 16502 fts5SegIterLoadDlidx(p, pIter); |
| 16503 } |
| 16504 if( flags & FTS5INDEX_QUERY_DESC ){ |
| 16505 fts5SegIterReverse(p, pIter); |
| 16506 } |
| 16507 } |
| 16508 } |
| 16509 |
| 16510 /* Either: |
| 16511 ** |
| 16512 ** 1) an error has occurred, or |
| 16513 ** 2) the iterator points to EOF, or |
| 16514 ** 3) the iterator points to an entry with term (pTerm/nTerm), or |
| 16515 ** 4) the FTS5INDEX_QUERY_SCAN flag was set and the iterator points |
| 16516 ** to an entry with a term greater than or equal to (pTerm/nTerm). |
| 16517 */ |
| 16518 assert( p->rc!=SQLITE_OK /* 1 */ |
| 16519 || pIter->pLeaf==0 /* 2 */ |
| 16520 || fts5BufferCompareBlob(&pIter->term, pTerm, nTerm)==0 /* 3 */ |
| 16521 || (bGe && fts5BufferCompareBlob(&pIter->term, pTerm, nTerm)>0) /* 4 */ |
| 16522 ); |
| 16523 } |
| 16524 |
| 16525 /* |
| 16526 ** Initialize the object pIter to point to term pTerm/nTerm within the |
| 16527 ** in-memory hash table. If there is no such term in the hash-table, the |
| 16528 ** iterator is set to EOF. |
| 16529 ** |
| 16530 ** If an error occurs, Fts5Index.rc is set to an appropriate error code. If |
| 16531 ** an error has already occurred when this function is called, it is a no-op. |
| 16532 */ |
| 16533 static void fts5SegIterHashInit( |
| 16534 Fts5Index *p, /* FTS5 backend */ |
| 16535 const u8 *pTerm, int nTerm, /* Term to seek to */ |
| 16536 int flags, /* Mask of FTS5INDEX_XXX flags */ |
| 16537 Fts5SegIter *pIter /* Object to populate */ |
| 16538 ){ |
| 16539 const u8 *pList = 0; |
| 16540 int nList = 0; |
| 16541 const u8 *z = 0; |
| 16542 int n = 0; |
| 16543 |
| 16544 assert( p->pHash ); |
| 16545 assert( p->rc==SQLITE_OK ); |
| 16546 |
| 16547 if( pTerm==0 || (flags & FTS5INDEX_QUERY_SCAN) ){ |
| 16548 p->rc = sqlite3Fts5HashScanInit(p->pHash, (const char*)pTerm, nTerm); |
| 16549 sqlite3Fts5HashScanEntry(p->pHash, (const char**)&z, &pList, &nList); |
| 16550 n = (z ? (int)strlen((const char*)z) : 0); |
| 16551 }else{ |
| 16552 pIter->flags |= FTS5_SEGITER_ONETERM; |
| 16553 sqlite3Fts5HashQuery(p->pHash, (const char*)pTerm, nTerm, &pList, &nList); |
| 16554 z = pTerm; |
| 16555 n = nTerm; |
| 16556 } |
| 16557 |
| 16558 if( pList ){ |
| 16559 Fts5Data *pLeaf; |
| 16560 sqlite3Fts5BufferSet(&p->rc, &pIter->term, n, z); |
| 16561 pLeaf = fts5IdxMalloc(p, sizeof(Fts5Data)); |
| 16562 if( pLeaf==0 ) return; |
| 16563 pLeaf->p = (u8*)pList; |
| 16564 pLeaf->nn = pLeaf->szLeaf = nList; |
| 16565 pIter->pLeaf = pLeaf; |
| 16566 pIter->iLeafOffset = fts5GetVarint(pLeaf->p, (u64*)&pIter->iRowid); |
| 16567 pIter->iEndofDoclist = pLeaf->nn+1; |
| 16568 |
| 16569 if( flags & FTS5INDEX_QUERY_DESC ){ |
| 16570 pIter->flags |= FTS5_SEGITER_REVERSE; |
| 16571 fts5SegIterReverseInitPage(p, pIter); |
| 16572 }else{ |
| 16573 fts5SegIterLoadNPos(p, pIter); |
| 16574 } |
| 16575 } |
| 16576 } |
| 16577 |
| 16578 /* |
| 16579 ** Zero the iterator passed as the only argument. |
| 16580 */ |
| 16581 static void fts5SegIterClear(Fts5SegIter *pIter){ |
| 16582 fts5BufferFree(&pIter->term); |
| 16583 fts5DataRelease(pIter->pLeaf); |
| 16584 fts5DataRelease(pIter->pNextLeaf); |
| 16585 fts5DlidxIterFree(pIter->pDlidx); |
| 16586 sqlite3_free(pIter->aRowidOffset); |
| 16587 memset(pIter, 0, sizeof(Fts5SegIter)); |
| 16588 } |
| 16589 |
| 16590 #ifdef SQLITE_DEBUG |
| 16591 |
| 16592 /* |
| 16593 ** This function is used as part of the big assert() procedure implemented by |
| 16594 ** fts5AssertMultiIterSetup(). It ensures that the result currently stored |
| 16595 ** in *pRes is the correct result of comparing the current positions of the |
| 16596 ** two iterators. |
| 16597 */ |
| 16598 static void fts5AssertComparisonResult( |
| 16599 Fts5IndexIter *pIter, |
| 16600 Fts5SegIter *p1, |
| 16601 Fts5SegIter *p2, |
| 16602 Fts5CResult *pRes |
| 16603 ){ |
| 16604 int i1 = p1 - pIter->aSeg; |
| 16605 int i2 = p2 - pIter->aSeg; |
| 16606 |
| 16607 if( p1->pLeaf || p2->pLeaf ){ |
| 16608 if( p1->pLeaf==0 ){ |
| 16609 assert( pRes->iFirst==i2 ); |
| 16610 }else if( p2->pLeaf==0 ){ |
| 16611 assert( pRes->iFirst==i1 ); |
| 16612 }else{ |
| 16613 int nMin = MIN(p1->term.n, p2->term.n); |
| 16614 int res = memcmp(p1->term.p, p2->term.p, nMin); |
| 16615 if( res==0 ) res = p1->term.n - p2->term.n; |
| 16616 |
| 16617 if( res==0 ){ |
| 16618 assert( pRes->bTermEq==1 ); |
| 16619 assert( p1->iRowid!=p2->iRowid ); |
| 16620 res = ((p1->iRowid > p2->iRowid)==pIter->bRev) ? -1 : 1; |
| 16621 }else{ |
| 16622 assert( pRes->bTermEq==0 ); |
| 16623 } |
| 16624 |
| 16625 if( res<0 ){ |
| 16626 assert( pRes->iFirst==i1 ); |
| 16627 }else{ |
| 16628 assert( pRes->iFirst==i2 ); |
| 16629 } |
| 16630 } |
| 16631 } |
| 16632 } |
| 16633 |
| 16634 /* |
| 16635 ** This function is a no-op unless SQLITE_DEBUG is defined when this module |
| 16636 ** is compiled. In that case, this function is essentially an assert() |
| 16637 ** statement used to verify that the contents of the pIter->aFirst[] array |
| 16638 ** are correct. |
| 16639 */ |
| 16640 static void fts5AssertMultiIterSetup(Fts5Index *p, Fts5IndexIter *pIter){ |
| 16641 if( p->rc==SQLITE_OK ){ |
| 16642 Fts5SegIter *pFirst = &pIter->aSeg[ pIter->aFirst[1].iFirst ]; |
| 16643 int i; |
| 16644 |
| 16645 assert( (pFirst->pLeaf==0)==pIter->bEof ); |
| 16646 |
| 16647 /* Check that pIter->iSwitchRowid is set correctly. */ |
| 16648 for(i=0; i<pIter->nSeg; i++){ |
| 16649 Fts5SegIter *p1 = &pIter->aSeg[i]; |
| 16650 assert( p1==pFirst |
| 16651 || p1->pLeaf==0 |
| 16652 || fts5BufferCompare(&pFirst->term, &p1->term) |
| 16653 || p1->iRowid==pIter->iSwitchRowid |
| 16654 || (p1->iRowid<pIter->iSwitchRowid)==pIter->bRev |
| 16655 ); |
| 16656 } |
| 16657 |
| 16658 for(i=0; i<pIter->nSeg; i+=2){ |
| 16659 Fts5SegIter *p1 = &pIter->aSeg[i]; |
| 16660 Fts5SegIter *p2 = &pIter->aSeg[i+1]; |
| 16661 Fts5CResult *pRes = &pIter->aFirst[(pIter->nSeg + i) / 2]; |
| 16662 fts5AssertComparisonResult(pIter, p1, p2, pRes); |
| 16663 } |
| 16664 |
| 16665 for(i=1; i<(pIter->nSeg / 2); i+=2){ |
| 16666 Fts5SegIter *p1 = &pIter->aSeg[ pIter->aFirst[i*2].iFirst ]; |
| 16667 Fts5SegIter *p2 = &pIter->aSeg[ pIter->aFirst[i*2+1].iFirst ]; |
| 16668 Fts5CResult *pRes = &pIter->aFirst[i]; |
| 16669 fts5AssertComparisonResult(pIter, p1, p2, pRes); |
| 16670 } |
| 16671 } |
| 16672 } |
| 16673 #else |
| 16674 # define fts5AssertMultiIterSetup(x,y) |
| 16675 #endif |
| 16676 |
| 16677 /* |
| 16678 ** Do the comparison necessary to populate pIter->aFirst[iOut]. |
| 16679 ** |
| 16680 ** If the returned value is non-zero, then it is the index of an entry |
| 16681 ** in the pIter->aSeg[] array that is (a) not at EOF, and (b) pointing |
| 16682 ** to a key that is a duplicate of another, higher priority, |
| 16683 ** segment-iterator in the pIter->aSeg[] array. |
| 16684 */ |
| 16685 static int fts5MultiIterDoCompare(Fts5IndexIter *pIter, int iOut){ |
| 16686 int i1; /* Index of left-hand Fts5SegIter */ |
| 16687 int i2; /* Index of right-hand Fts5SegIter */ |
| 16688 int iRes; |
| 16689 Fts5SegIter *p1; /* Left-hand Fts5SegIter */ |
| 16690 Fts5SegIter *p2; /* Right-hand Fts5SegIter */ |
| 16691 Fts5CResult *pRes = &pIter->aFirst[iOut]; |
| 16692 |
| 16693 assert( iOut<pIter->nSeg && iOut>0 ); |
| 16694 assert( pIter->bRev==0 || pIter->bRev==1 ); |
| 16695 |
| 16696 if( iOut>=(pIter->nSeg/2) ){ |
| 16697 i1 = (iOut - pIter->nSeg/2) * 2; |
| 16698 i2 = i1 + 1; |
| 16699 }else{ |
| 16700 i1 = pIter->aFirst[iOut*2].iFirst; |
| 16701 i2 = pIter->aFirst[iOut*2+1].iFirst; |
| 16702 } |
| 16703 p1 = &pIter->aSeg[i1]; |
| 16704 p2 = &pIter->aSeg[i2]; |
| 16705 |
| 16706 pRes->bTermEq = 0; |
| 16707 if( p1->pLeaf==0 ){ /* If p1 is at EOF */ |
| 16708 iRes = i2; |
| 16709 }else if( p2->pLeaf==0 ){ /* If p2 is at EOF */ |
| 16710 iRes = i1; |
| 16711 }else{ |
| 16712 int res = fts5BufferCompare(&p1->term, &p2->term); |
| 16713 if( res==0 ){ |
| 16714 assert( i2>i1 ); |
| 16715 assert( i2!=0 ); |
| 16716 pRes->bTermEq = 1; |
| 16717 if( p1->iRowid==p2->iRowid ){ |
| 16718 p1->bDel = p2->bDel; |
| 16719 return i2; |
| 16720 } |
| 16721 res = ((p1->iRowid > p2->iRowid)==pIter->bRev) ? -1 : +1; |
| 16722 } |
| 16723 assert( res!=0 ); |
| 16724 if( res<0 ){ |
| 16725 iRes = i1; |
| 16726 }else{ |
| 16727 iRes = i2; |
| 16728 } |
| 16729 } |
| 16730 |
| 16731 pRes->iFirst = (u16)iRes; |
| 16732 return 0; |
| 16733 } |
| 16734 |
| 16735 /* |
| 16736 ** Move the seg-iter so that it points to the first rowid on page iLeafPgno. |
| 16737 ** It is an error if leaf iLeafPgno does not exist or contains no rowids. |
| 16738 */ |
| 16739 static void fts5SegIterGotoPage( |
| 16740 Fts5Index *p, /* FTS5 backend object */ |
| 16741 Fts5SegIter *pIter, /* Iterator to advance */ |
| 16742 int iLeafPgno |
| 16743 ){ |
| 16744 assert( iLeafPgno>pIter->iLeafPgno ); |
| 16745 |
| 16746 if( iLeafPgno>pIter->pSeg->pgnoLast ){ |
| 16747 p->rc = FTS5_CORRUPT; |
| 16748 }else{ |
| 16749 fts5DataRelease(pIter->pNextLeaf); |
| 16750 pIter->pNextLeaf = 0; |
| 16751 pIter->iLeafPgno = iLeafPgno-1; |
| 16752 fts5SegIterNextPage(p, pIter); |
| 16753 assert( p->rc!=SQLITE_OK || pIter->iLeafPgno==iLeafPgno ); |
| 16754 |
| 16755 if( p->rc==SQLITE_OK ){ |
| 16756 int iOff; |
| 16757 u8 *a = pIter->pLeaf->p; |
| 16758 int n = pIter->pLeaf->szLeaf; |
| 16759 |
| 16760 iOff = fts5LeafFirstRowidOff(pIter->pLeaf); |
| 16761 if( iOff<4 || iOff>=n ){ |
| 16762 p->rc = FTS5_CORRUPT; |
| 16763 }else{ |
| 16764 iOff += fts5GetVarint(&a[iOff], (u64*)&pIter->iRowid); |
| 16765 pIter->iLeafOffset = iOff; |
| 16766 fts5SegIterLoadNPos(p, pIter); |
| 16767 } |
| 16768 } |
| 16769 } |
| 16770 } |
| 16771 |
| 16772 /* |
| 16773 ** Advance the iterator passed as the second argument until it is at or |
| 16774 ** past rowid iMatch. Regardless of the value of iMatch, the iterator is |
| 16775 ** always advanced at least once. |
| 16776 */ |
| 16777 static void fts5SegIterNextFrom( |
| 16778 Fts5Index *p, /* FTS5 backend object */ |
| 16779 Fts5SegIter *pIter, /* Iterator to advance */ |
| 16780 i64 iMatch /* Advance iterator at least this far */ |
| 16781 ){ |
| 16782 int bRev = (pIter->flags & FTS5_SEGITER_REVERSE); |
| 16783 Fts5DlidxIter *pDlidx = pIter->pDlidx; |
| 16784 int iLeafPgno = pIter->iLeafPgno; |
| 16785 int bMove = 1; |
| 16786 |
| 16787 assert( pIter->flags & FTS5_SEGITER_ONETERM ); |
| 16788 assert( pIter->pDlidx ); |
| 16789 assert( pIter->pLeaf ); |
| 16790 |
| 16791 if( bRev==0 ){ |
| 16792 while( !fts5DlidxIterEof(p, pDlidx) && iMatch>fts5DlidxIterRowid(pDlidx) ){ |
| 16793 iLeafPgno = fts5DlidxIterPgno(pDlidx); |
| 16794 fts5DlidxIterNext(p, pDlidx); |
| 16795 } |
| 16796 assert_nc( iLeafPgno>=pIter->iLeafPgno || p->rc ); |
| 16797 if( iLeafPgno>pIter->iLeafPgno ){ |
| 16798 fts5SegIterGotoPage(p, pIter, iLeafPgno); |
| 16799 bMove = 0; |
| 16800 } |
| 16801 }else{ |
| 16802 assert( pIter->pNextLeaf==0 ); |
| 16803 assert( iMatch<pIter->iRowid ); |
| 16804 while( !fts5DlidxIterEof(p, pDlidx) && iMatch<fts5DlidxIterRowid(pDlidx) ){ |
| 16805 fts5DlidxIterPrev(p, pDlidx); |
| 16806 } |
| 16807 iLeafPgno = fts5DlidxIterPgno(pDlidx); |
| 16808 |
| 16809 assert( fts5DlidxIterEof(p, pDlidx) || iLeafPgno<=pIter->iLeafPgno ); |
| 16810 |
| 16811 if( iLeafPgno<pIter->iLeafPgno ){ |
| 16812 pIter->iLeafPgno = iLeafPgno+1; |
| 16813 fts5SegIterReverseNewPage(p, pIter); |
| 16814 bMove = 0; |
| 16815 } |
| 16816 } |
| 16817 |
| 16818 do{ |
| 16819 if( bMove ) fts5SegIterNext(p, pIter, 0); |
| 16820 if( pIter->pLeaf==0 ) break; |
| 16821 if( bRev==0 && pIter->iRowid>=iMatch ) break; |
| 16822 if( bRev!=0 && pIter->iRowid<=iMatch ) break; |
| 16823 bMove = 1; |
| 16824 }while( p->rc==SQLITE_OK ); |
| 16825 } |
| 16826 |
| 16827 |
| 16828 /* |
| 16829 ** Free the iterator object passed as the second argument. |
| 16830 */ |
| 16831 static void fts5MultiIterFree(Fts5Index *p, Fts5IndexIter *pIter){ |
| 16832 if( pIter ){ |
| 16833 int i; |
| 16834 for(i=0; i<pIter->nSeg; i++){ |
| 16835 fts5SegIterClear(&pIter->aSeg[i]); |
| 16836 } |
| 16837 fts5StructureRelease(pIter->pStruct); |
| 16838 fts5BufferFree(&pIter->poslist); |
| 16839 sqlite3_free(pIter); |
| 16840 } |
| 16841 } |
| 16842 |
| 16843 static void fts5MultiIterAdvanced( |
| 16844 Fts5Index *p, /* FTS5 backend to iterate within */ |
| 16845 Fts5IndexIter *pIter, /* Iterator to update aFirst[] array for */ |
| 16846 int iChanged, /* Index of sub-iterator just advanced */ |
| 16847 int iMinset /* Minimum entry in aFirst[] to set */ |
| 16848 ){ |
| 16849 int i; |
| 16850 for(i=(pIter->nSeg+iChanged)/2; i>=iMinset && p->rc==SQLITE_OK; i=i/2){ |
| 16851 int iEq; |
| 16852 if( (iEq = fts5MultiIterDoCompare(pIter, i)) ){ |
| 16853 fts5SegIterNext(p, &pIter->aSeg[iEq], 0); |
| 16854 i = pIter->nSeg + iEq; |
| 16855 } |
| 16856 } |
| 16857 } |
| 16858 |
| 16859 /* |
| 16860 ** Sub-iterator iChanged of iterator pIter has just been advanced. It still |
| 16861 ** points to the same term though - just a different rowid. This function |
| 16862 ** attempts to update the contents of the pIter->aFirst[] accordingly. |
| 16863 ** If it does so successfully, 0 is returned. Otherwise 1. |
| 16864 ** |
| 16865 ** If non-zero is returned, the caller should call fts5MultiIterAdvanced() |
| 16866 ** on the iterator instead. That function does the same as this one, except |
| 16867 ** that it deals with more complicated cases as well. |
| 16868 */ |
| 16869 static int fts5MultiIterAdvanceRowid( |
| 16870 Fts5Index *p, /* FTS5 backend to iterate within */ |
| 16871 Fts5IndexIter *pIter, /* Iterator to update aFirst[] array for */ |
| 16872 int iChanged /* Index of sub-iterator just advanced */ |
| 16873 ){ |
| 16874 Fts5SegIter *pNew = &pIter->aSeg[iChanged]; |
| 16875 |
| 16876 if( pNew->iRowid==pIter->iSwitchRowid |
| 16877 || (pNew->iRowid<pIter->iSwitchRowid)==pIter->bRev |
| 16878 ){ |
| 16879 int i; |
| 16880 Fts5SegIter *pOther = &pIter->aSeg[iChanged ^ 0x0001]; |
| 16881 pIter->iSwitchRowid = pIter->bRev ? SMALLEST_INT64 : LARGEST_INT64; |
| 16882 for(i=(pIter->nSeg+iChanged)/2; 1; i=i/2){ |
| 16883 Fts5CResult *pRes = &pIter->aFirst[i]; |
| 16884 |
| 16885 assert( pNew->pLeaf ); |
| 16886 assert( pRes->bTermEq==0 || pOther->pLeaf ); |
| 16887 |
| 16888 if( pRes->bTermEq ){ |
| 16889 if( pNew->iRowid==pOther->iRowid ){ |
| 16890 return 1; |
| 16891 }else if( (pOther->iRowid>pNew->iRowid)==pIter->bRev ){ |
| 16892 pIter->iSwitchRowid = pOther->iRowid; |
| 16893 pNew = pOther; |
| 16894 }else if( (pOther->iRowid>pIter->iSwitchRowid)==pIter->bRev ){ |
| 16895 pIter->iSwitchRowid = pOther->iRowid; |
| 16896 } |
| 16897 } |
| 16898 pRes->iFirst = (u16)(pNew - pIter->aSeg); |
| 16899 if( i==1 ) break; |
| 16900 |
| 16901 pOther = &pIter->aSeg[ pIter->aFirst[i ^ 0x0001].iFirst ]; |
| 16902 } |
| 16903 } |
| 16904 |
| 16905 return 0; |
| 16906 } |
| 16907 |
| 16908 /* |
| 16909 ** Set the pIter->bEof variable based on the state of the sub-iterators. |
| 16910 */ |
| 16911 static void fts5MultiIterSetEof(Fts5IndexIter *pIter){ |
| 16912 Fts5SegIter *pSeg = &pIter->aSeg[ pIter->aFirst[1].iFirst ]; |
| 16913 pIter->bEof = pSeg->pLeaf==0; |
| 16914 pIter->iSwitchRowid = pSeg->iRowid; |
| 16915 } |
| 16916 |
| 16917 /* |
| 16918 ** Move the iterator to the next entry. |
| 16919 ** |
| 16920 ** If an error occurs, an error code is left in Fts5Index.rc. It is not |
| 16921 ** considered an error if the iterator reaches EOF, or if it is already at |
| 16922 ** EOF when this function is called. |
| 16923 */ |
| 16924 static void fts5MultiIterNext( |
| 16925 Fts5Index *p, |
| 16926 Fts5IndexIter *pIter, |
| 16927 int bFrom, /* True if argument iFrom is valid */ |
| 16928 i64 iFrom /* Advance at least as far as this */ |
| 16929 ){ |
| 16930 if( p->rc==SQLITE_OK ){ |
| 16931 int bUseFrom = bFrom; |
| 16932 do { |
| 16933 int iFirst = pIter->aFirst[1].iFirst; |
| 16934 int bNewTerm = 0; |
| 16935 Fts5SegIter *pSeg = &pIter->aSeg[iFirst]; |
| 16936 assert( p->rc==SQLITE_OK ); |
| 16937 if( bUseFrom && pSeg->pDlidx ){ |
| 16938 fts5SegIterNextFrom(p, pSeg, iFrom); |
| 16939 }else{ |
| 16940 fts5SegIterNext(p, pSeg, &bNewTerm); |
| 16941 } |
| 16942 |
| 16943 if( pSeg->pLeaf==0 || bNewTerm |
| 16944 || fts5MultiIterAdvanceRowid(p, pIter, iFirst) |
| 16945 ){ |
| 16946 fts5MultiIterAdvanced(p, pIter, iFirst, 1); |
| 16947 fts5MultiIterSetEof(pIter); |
| 16948 } |
| 16949 fts5AssertMultiIterSetup(p, pIter); |
| 16950 |
| 16951 bUseFrom = 0; |
| 16952 }while( pIter->bSkipEmpty && fts5MultiIterIsEmpty(p, pIter) ); |
| 16953 } |
| 16954 } |
| 16955 |
| 16956 static void fts5MultiIterNext2( |
| 16957 Fts5Index *p, |
| 16958 Fts5IndexIter *pIter, |
| 16959 int *pbNewTerm /* OUT: True if *might* be new term */ |
| 16960 ){ |
| 16961 assert( pIter->bSkipEmpty ); |
| 16962 if( p->rc==SQLITE_OK ){ |
| 16963 do { |
| 16964 int iFirst = pIter->aFirst[1].iFirst; |
| 16965 Fts5SegIter *pSeg = &pIter->aSeg[iFirst]; |
| 16966 int bNewTerm = 0; |
| 16967 |
| 16968 fts5SegIterNext(p, pSeg, &bNewTerm); |
| 16969 if( pSeg->pLeaf==0 || bNewTerm |
| 16970 || fts5MultiIterAdvanceRowid(p, pIter, iFirst) |
| 16971 ){ |
| 16972 fts5MultiIterAdvanced(p, pIter, iFirst, 1); |
| 16973 fts5MultiIterSetEof(pIter); |
| 16974 *pbNewTerm = 1; |
| 16975 }else{ |
| 16976 *pbNewTerm = 0; |
| 16977 } |
| 16978 fts5AssertMultiIterSetup(p, pIter); |
| 16979 |
| 16980 }while( fts5MultiIterIsEmpty(p, pIter) ); |
| 16981 } |
| 16982 } |
| 16983 |
| 16984 |
| 16985 static Fts5IndexIter *fts5MultiIterAlloc( |
| 16986 Fts5Index *p, /* FTS5 backend to iterate within */ |
| 16987 int nSeg |
| 16988 ){ |
| 16989 Fts5IndexIter *pNew; |
| 16990 int nSlot; /* Power of two >= nSeg */ |
| 16991 |
| 16992 for(nSlot=2; nSlot<nSeg; nSlot=nSlot*2); |
| 16993 pNew = fts5IdxMalloc(p, |
| 16994 sizeof(Fts5IndexIter) + /* pNew */ |
| 16995 sizeof(Fts5SegIter) * (nSlot-1) + /* pNew->aSeg[] */ |
| 16996 sizeof(Fts5CResult) * nSlot /* pNew->aFirst[] */ |
| 16997 ); |
| 16998 if( pNew ){ |
| 16999 pNew->nSeg = nSlot; |
| 17000 pNew->aFirst = (Fts5CResult*)&pNew->aSeg[nSlot]; |
| 17001 pNew->pIndex = p; |
| 17002 } |
| 17003 return pNew; |
| 17004 } |
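
fts5MultiIterAlloc() rounds the requested segment count up to a power of two, with a minimum of 2, so that the comparison slots can be indexed as a complete binary tree. The rounding loop in isolation, as a sketch (slot_count() is a hypothetical name, not part of FTS5):

```c
#include <assert.h>

/* Hypothetical helper, for illustration only; not part of FTS5.
** Returns the smallest power of two that is >= nSeg and >= 2. */
static int slot_count(int nSeg){
  int nSlot;
  assert( nSeg>=0 );
  for(nSlot=2; nSlot<nSeg; nSlot=nSlot*2);
  return nSlot;
}
```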
| 17005 |
| 17006 /* |
| 17007 ** Allocate a new Fts5IndexIter object. |
| 17008 ** |
| 17009 ** The new object will be used to iterate through data in structure pStruct. |
| 17010 ** If iLevel is -ve, then all data in all segments is merged. Or, if iLevel |
| 17011 ** is zero or greater, data from the first nSegment segments on level iLevel |
| 17012 ** is merged. |
| 17013 ** |
| 17014 ** The iterator initially points to the first term/rowid entry in the |
| 17015 ** iterated data. |
| 17016 */ |
| 17017 static void fts5MultiIterNew( |
| 17018 Fts5Index *p, /* FTS5 backend to iterate within */ |
| 17019 Fts5Structure *pStruct, /* Structure of specific index */ |
| 17020 int bSkipEmpty, /* True to ignore delete-keys */ |
| 17021 int flags, /* FTS5INDEX_QUERY_XXX flags */ |
| 17022 const u8 *pTerm, int nTerm, /* Term to seek to (or NULL/0) */ |
| 17023 int iLevel, /* Level to iterate (-1 for all) */ |
| 17024 int nSegment, /* Number of segments to merge (iLevel>=0) */ |
| 17025 Fts5IndexIter **ppOut /* New object */ |
| 17026 ){ |
| 17027 int nSeg = 0; /* Number of segment-iters in use */ |
| 17028 int iIter = 0; /* Next slot in pNew->aSeg[] to populate */ |
| 17029 int iSeg; /* Used to iterate through segments */ |
| 17030 Fts5Buffer buf = {0,0,0}; /* Buffer used by fts5SegIterSeekInit() */ |
| 17031 Fts5StructureLevel *pLvl; |
| 17032 Fts5IndexIter *pNew; |
| 17033 |
| 17034 assert( (pTerm==0 && nTerm==0) || iLevel<0 ); |
| 17035 |
| 17036 /* Allocate space for the new multi-seg-iterator. */ |
| 17037 if( p->rc==SQLITE_OK ){ |
| 17038 if( iLevel<0 ){ |
| 17039 assert( pStruct->nSegment==fts5StructureCountSegments(pStruct) ); |
| 17040 nSeg = pStruct->nSegment; |
| 17041 nSeg += (p->pHash ? 1 : 0); |
| 17042 }else{ |
| 17043 nSeg = MIN(pStruct->aLevel[iLevel].nSeg, nSegment); |
| 17044 } |
| 17045 } |
| 17046 *ppOut = pNew = fts5MultiIterAlloc(p, nSeg); |
| 17047 if( pNew==0 ) return; |
| 17048 pNew->bRev = (0!=(flags & FTS5INDEX_QUERY_DESC)); |
| 17049 pNew->bSkipEmpty = (u8)bSkipEmpty; |
| 17050 pNew->pStruct = pStruct; |
| 17051 fts5StructureRef(pStruct); |
| 17052 |
| 17053 /* Initialize each of the component segment iterators. */ |
| 17054 if( iLevel<0 ){ |
| 17055 Fts5StructureLevel *pEnd = &pStruct->aLevel[pStruct->nLevel]; |
| 17056 if( p->pHash ){ |
| 17057 /* Add a segment iterator for the current contents of the hash table. */ |
| 17058 Fts5SegIter *pIter = &pNew->aSeg[iIter++]; |
| 17059 fts5SegIterHashInit(p, pTerm, nTerm, flags, pIter); |
| 17060 } |
| 17061 for(pLvl=&pStruct->aLevel[0]; pLvl<pEnd; pLvl++){ |
| 17062 for(iSeg=pLvl->nSeg-1; iSeg>=0; iSeg--){ |
| 17063 Fts5StructureSegment *pSeg = &pLvl->aSeg[iSeg]; |
| 17064 Fts5SegIter *pIter = &pNew->aSeg[iIter++]; |
| 17065 if( pTerm==0 ){ |
| 17066 fts5SegIterInit(p, pSeg, pIter); |
| 17067 }else{ |
| 17068 fts5SegIterSeekInit(p, &buf, pTerm, nTerm, flags, pSeg, pIter); |
| 17069 } |
| 17070 } |
| 17071 } |
| 17072 }else{ |
| 17073 pLvl = &pStruct->aLevel[iLevel]; |
| 17074 for(iSeg=nSeg-1; iSeg>=0; iSeg--){ |
| 17075 fts5SegIterInit(p, &pLvl->aSeg[iSeg], &pNew->aSeg[iIter++]); |
| 17076 } |
| 17077 } |
| 17078 assert( iIter==nSeg ); |
| 17079 |
| 17080 /* If the above was successful, each component iterator now points |
| 17081 ** to the first entry in its segment. In this case initialize the |
| 17082 ** aFirst[] array. Or, if an error has occurred, free the iterator |
| 17083 ** object and set the output variable to NULL. */ |
| 17084 if( p->rc==SQLITE_OK ){ |
| 17085 for(iIter=pNew->nSeg-1; iIter>0; iIter--){ |
| 17086 int iEq; |
| 17087 if( (iEq = fts5MultiIterDoCompare(pNew, iIter)) ){ |
| 17088 fts5SegIterNext(p, &pNew->aSeg[iEq], 0); |
| 17089 fts5MultiIterAdvanced(p, pNew, iEq, iIter); |
| 17090 } |
| 17091 } |
| 17092 fts5MultiIterSetEof(pNew); |
| 17093 fts5AssertMultiIterSetup(p, pNew); |
| 17094 |
| 17095 if( pNew->bSkipEmpty && fts5MultiIterIsEmpty(p, pNew) ){ |
| 17096 fts5MultiIterNext(p, pNew, 0, 0); |
| 17097 } |
| 17098 }else{ |
| 17099 fts5MultiIterFree(p, pNew); |
| 17100 *ppOut = 0; |
| 17101 } |
| 17102 fts5BufferFree(&buf); |
| 17103 } |
| 17104 |
| 17105 /* |
| 17106 ** Create an Fts5IndexIter that iterates through the doclist provided |
| 17107 ** as the second argument. |
| 17108 */ |
| 17109 static void fts5MultiIterNew2( |
| 17110 Fts5Index *p, /* FTS5 backend to iterate within */ |
| 17111 Fts5Data *pData, /* Doclist to iterate through */ |
| 17112 int bDesc, /* True for descending rowid order */ |
| 17113 Fts5IndexIter **ppOut /* New object */ |
| 17114 ){ |
| 17115 Fts5IndexIter *pNew; |
| 17116 pNew = fts5MultiIterAlloc(p, 2); |
| 17117 if( pNew ){ |
| 17118 Fts5SegIter *pIter = &pNew->aSeg[1]; |
| 17119 |
| 17120 pNew->bFiltered = 1; |
| 17121 pIter->flags = FTS5_SEGITER_ONETERM; |
| 17122 if( pData->szLeaf>0 ){ |
| 17123 pIter->pLeaf = pData; |
| 17124 pIter->iLeafOffset = fts5GetVarint(pData->p, (u64*)&pIter->iRowid); |
| 17125 pIter->iEndofDoclist = pData->nn; |
| 17126 pNew->aFirst[1].iFirst = 1; |
| 17127 if( bDesc ){ |
| 17128 pNew->bRev = 1; |
| 17129 pIter->flags |= FTS5_SEGITER_REVERSE; |
| 17130 fts5SegIterReverseInitPage(p, pIter); |
| 17131 }else{ |
| 17132 fts5SegIterLoadNPos(p, pIter); |
| 17133 } |
| 17134 pData = 0; |
| 17135 }else{ |
| 17136 pNew->bEof = 1; |
| 17137 } |
| 17138 |
| 17139 *ppOut = pNew; |
| 17140 } |
| 17141 |
| 17142 fts5DataRelease(pData); |
| 17143 } |
| 17144 |
| 17145 /* |
| 17146 ** Return true if the iterator is at EOF or if an error has occurred. |
| 17147 ** False otherwise. |
| 17148 */ |
| 17149 static int fts5MultiIterEof(Fts5Index *p, Fts5IndexIter *pIter){ |
| 17150 assert( p->rc |
| 17151 || (pIter->aSeg[ pIter->aFirst[1].iFirst ].pLeaf==0)==pIter->bEof |
| 17152 ); |
| 17153 return (p->rc || pIter->bEof); |
| 17154 } |
| 17155 |
| 17156 /* |
| 17157 ** Return the rowid of the entry that the iterator currently points |
| 17158 ** to. If the iterator points to EOF when this function is called the |
| 17159 ** results are undefined. |
| 17160 */ |
| 17161 static i64 fts5MultiIterRowid(Fts5IndexIter *pIter){ |
| 17162 assert( pIter->aSeg[ pIter->aFirst[1].iFirst ].pLeaf ); |
| 17163 return pIter->aSeg[ pIter->aFirst[1].iFirst ].iRowid; |
| 17164 } |
| 17165 |
| 17166 /* |
| 17167 ** Move the iterator to the next entry at or following iMatch. |
| 17168 */ |
| 17169 static void fts5MultiIterNextFrom( |
| 17170 Fts5Index *p, |
| 17171 Fts5IndexIter *pIter, |
| 17172 i64 iMatch |
| 17173 ){ |
| 17174 while( 1 ){ |
| 17175 i64 iRowid; |
| 17176 fts5MultiIterNext(p, pIter, 1, iMatch); |
| 17177 if( fts5MultiIterEof(p, pIter) ) break; |
| 17178 iRowid = fts5MultiIterRowid(pIter); |
| 17179 if( pIter->bRev==0 && iRowid>=iMatch ) break; |
| 17180 if( pIter->bRev!=0 && iRowid<=iMatch ) break; |
| 17181 } |
| 17182 } |
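The seek loop in fts5MultiIterNextFrom() can be pictured over a plain array. The following is a minimal sketch, not the FTS5 implementation: `seek_next_from` and its array-based "iterator" are hypothetical, and the array is assumed to already be in iteration order (ascending when bRev==0, descending otherwise). Like the original, it always advances at least once before testing the current rowid.

```c
#include <stdint.h>

/* Hypothetical sketch: starting from position iCur, advance at least once
** and stop at the first entry at or past iMatch in iteration order.
** Returns n if no such entry exists (the analogue of EOF). */
int seek_next_from(const int64_t *a, int n, int iCur, int bRev, int64_t iMatch){
  int i;
  for(i=iCur+1; i<n; i++){
    if( bRev==0 && a[i]>=iMatch ) break;   /* ascending: first rowid >= iMatch */
    if( bRev!=0 && a[i]<=iMatch ) break;   /* descending: first rowid <= iMatch */
  }
  return i;
}
```

A return value equal to n plays the role of fts5MultiIterEof() returning true.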
| 17183 |
| 17184 /* |
| 17185 ** Return a pointer to a buffer containing the term associated with the |
| 17186 ** entry that the iterator currently points to. |
| 17187 */ |
| 17188 static const u8 *fts5MultiIterTerm(Fts5IndexIter *pIter, int *pn){ |
| 17189 Fts5SegIter *p = &pIter->aSeg[ pIter->aFirst[1].iFirst ]; |
| 17190 *pn = p->term.n; |
| 17191 return p->term.p; |
| 17192 } |
| 17193 |
| 17194 static void fts5ChunkIterate( |
| 17195 Fts5Index *p, /* Index object */ |
| 17196 Fts5SegIter *pSeg, /* Poslist of this iterator */ |
| 17197 void *pCtx, /* Context pointer for xChunk callback */ |
| 17198 void (*xChunk)(Fts5Index*, void*, const u8*, int) |
| 17199 ){ |
| 17200 int nRem = pSeg->nPos; /* Number of bytes still to come */ |
| 17201 Fts5Data *pData = 0; |
| 17202 u8 *pChunk = &pSeg->pLeaf->p[pSeg->iLeafOffset]; |
| 17203 int nChunk = MIN(nRem, pSeg->pLeaf->szLeaf - pSeg->iLeafOffset); |
| 17204 int pgno = pSeg->iLeafPgno; |
| 17205 int pgnoSave = 0; |
| 17206 |
| 17207 if( (pSeg->flags & FTS5_SEGITER_REVERSE)==0 ){ |
| 17208 pgnoSave = pgno+1; |
| 17209 } |
| 17210 |
| 17211 while( 1 ){ |
| 17212 xChunk(p, pCtx, pChunk, nChunk); |
| 17213 nRem -= nChunk; |
| 17214 fts5DataRelease(pData); |
| 17215 if( nRem<=0 ){ |
| 17216 break; |
| 17217 }else{ |
| 17218 pgno++; |
| 17219 pData = fts5DataRead(p, FTS5_SEGMENT_ROWID(pSeg->pSeg->iSegid, pgno)); |
| 17220 if( pData==0 ) break; |
| 17221 pChunk = &pData->p[4]; |
| 17222 nChunk = MIN(nRem, pData->szLeaf - 4); |
| 17223 if( pgno==pgnoSave ){ |
| 17224 assert( pSeg->pNextLeaf==0 ); |
| 17225 pSeg->pNextLeaf = pData; |
| 17226 pData = 0; |
| 17227 } |
| 17228 } |
| 17229 } |
| 17230 } |
| 17231 |
| 17232 |
| 17233 |
| 17234 /* |
| 17235 ** Allocate a new segment-id for the structure pStruct. The new segment |
| 17236 ** id must be between 1 and 65535 inclusive, and must not be used by |
| 17237 ** any currently existing segment. If a free segment id cannot be found, |
| 17238 ** Fts5Index.rc is set to SQLITE_FULL and 0 is returned. |
| 17239 ** |
| 17240 ** If an error has already occurred, this function is a no-op. 0 is |
| 17241 ** returned in this case. |
| 17242 */ |
| 17243 static int fts5AllocateSegid(Fts5Index *p, Fts5Structure *pStruct){ |
| 17244 int iSegid = 0; |
| 17245 |
| 17246 if( p->rc==SQLITE_OK ){ |
| 17247 if( pStruct->nSegment>=FTS5_MAX_SEGMENT ){ |
| 17248 p->rc = SQLITE_FULL; |
| 17249 }else{ |
| 17250 while( iSegid==0 ){ |
| 17251 int iLvl, iSeg; |
| 17252 sqlite3_randomness(sizeof(u32), (void*)&iSegid); |
| 17253 iSegid = iSegid & ((1 << FTS5_DATA_ID_B)-1); |
| 17254 for(iLvl=0; iLvl<pStruct->nLevel; iLvl++){ |
| 17255 for(iSeg=0; iSeg<pStruct->aLevel[iLvl].nSeg; iSeg++){ |
| 17256 if( iSegid==pStruct->aLevel[iLvl].aSeg[iSeg].iSegid ){ |
| 17257 iSegid = 0; |
| 17258 } |
| 17259 } |
| 17260 } |
| 17261 } |
| 17262 } |
| 17263 } |
| 17264 |
| 17265 return iSegid; |
| 17266 } |
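The allocation strategy above is rejection sampling: draw a random id, mask it to the id width, and retry while it is zero or already taken. Below is a minimal standalone sketch under stated assumptions: `alloc_id`, `demo_rand` and `ID_BITS` are hypothetical names, the 16-bit width is assumed from the 1..65535 range in the comment, and a small deterministic generator stands in for sqlite3_randomness().

```c
#include <stdint.h>

#define ID_BITS 16   /* Assumed id width, giving ids in the range 1..65535 */

/* Hypothetical stand-in for a random source: a 32-bit linear congruential
** generator. Not suitable for anything but illustration. */
uint32_t demo_rand(void){
  static uint32_t state = 12345u;
  state = state*1664525u + 1013904223u;
  return state>>8;
}

/* Hypothetical sketch: draw ids from xRand until one is non-zero and does
** not appear in aUsed[0..nUsed-1]. The caller must guarantee that at least
** one id is free, otherwise the loop never terminates. */
int alloc_id(const int *aUsed, int nUsed, uint32_t (*xRand)(void)){
  int id = 0;
  while( id==0 ){
    int i;
    id = (int)(xRand() & ((1u<<ID_BITS)-1));
    for(i=0; i<nUsed; i++){
      if( aUsed[i]==id ) id = 0;   /* Collision: force another draw */
    }
  }
  return id;
}
```

The original guards the loop by checking the segment count against FTS5_MAX_SEGMENT first, which keeps the id space sparse enough for the retry loop to finish quickly.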
| 17267 |
| 17268 /* |
| 17269 ** Discard all data currently cached in the hash-tables. |
| 17270 */ |
| 17271 static void fts5IndexDiscardData(Fts5Index *p){ |
| 17272 assert( p->pHash || p->nPendingData==0 ); |
| 17273 if( p->pHash ){ |
| 17274 sqlite3Fts5HashClear(p->pHash); |
| 17275 p->nPendingData = 0; |
| 17276 } |
| 17277 } |
| 17278 |
| 17279 /* |
| 17280 ** Return the size of the prefix, in bytes, that buffer (nNew/pNew) shares |
| 17281 ** with buffer (nOld/pOld). |
| 17282 */ |
| 17283 static int fts5PrefixCompress( |
| 17284 int nOld, const u8 *pOld, |
| 17285 int nNew, const u8 *pNew |
| 17286 ){ |
| 17287 int i; |
| 17288 assert( fts5BlobCompare(pOld, nOld, pNew, nNew)<0 ); |
| 17289 for(i=0; i<nOld; i++){ |
| 17290 if( pOld[i]!=pNew[i] ) break; |
| 17291 } |
| 17292 return i; |
| 17293 } |
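As an illustration of the prefix computation above, here is a minimal standalone sketch. The name `shared_prefix_len` is hypothetical. Unlike the original, which relies on the asserted precondition that the old term sorts before the new one, this version bounds the scan by the shorter of the two buffers.

```c
#include <assert.h>

/* Hypothetical helper: number of leading bytes that the two buffers share.
** The scan stops at the end of the shorter buffer, so neither is over-read. */
int shared_prefix_len(
  const unsigned char *pOld, int nOld,
  const unsigned char *pNew, int nNew
){
  int n = (nOld<nNew ? nOld : nNew);
  int i;
  assert( nOld>=0 && nNew>=0 );
  for(i=0; i<n; i++){
    if( pOld[i]!=pNew[i] ) break;
  }
  return i;
}
```

A writer would then store the shared length followed by only the bytes of the new term that lie past it, as fts5WriteAppendTerm() does with nPrefix.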
| 17294 |
| 17295 static void fts5WriteDlidxClear( |
| 17296 Fts5Index *p, |
| 17297 Fts5SegWriter *pWriter, |
| 17298 int bFlush /* If true, write dlidx to disk */ |
| 17299 ){ |
| 17300 int i; |
| 17301 assert( bFlush==0 || (pWriter->nDlidx>0 && pWriter->aDlidx[0].buf.n>0) ); |
| 17302 for(i=0; i<pWriter->nDlidx; i++){ |
| 17303 Fts5DlidxWriter *pDlidx = &pWriter->aDlidx[i]; |
| 17304 if( pDlidx->buf.n==0 ) break; |
| 17305 if( bFlush ){ |
| 17306 assert( pDlidx->pgno!=0 ); |
| 17307 fts5DataWrite(p, |
| 17308 FTS5_DLIDX_ROWID(pWriter->iSegid, i, pDlidx->pgno), |
| 17309 pDlidx->buf.p, pDlidx->buf.n |
| 17310 ); |
| 17311 } |
| 17312 sqlite3Fts5BufferZero(&pDlidx->buf); |
| 17313 pDlidx->bPrevValid = 0; |
| 17314 } |
| 17315 } |
| 17316 |
| 17317 /* |
| 17318 ** Grow the pWriter->aDlidx[] array to at least nLvl elements in size. |
| 17319 ** Any new array elements are zeroed before returning. |
| 17320 */ |
| 17321 static int fts5WriteDlidxGrow( |
| 17322 Fts5Index *p, |
| 17323 Fts5SegWriter *pWriter, |
| 17324 int nLvl |
| 17325 ){ |
| 17326 if( p->rc==SQLITE_OK && nLvl>=pWriter->nDlidx ){ |
| 17327 Fts5DlidxWriter *aDlidx = (Fts5DlidxWriter*)sqlite3_realloc( |
| 17328 pWriter->aDlidx, sizeof(Fts5DlidxWriter) * nLvl |
| 17329 ); |
| 17330 if( aDlidx==0 ){ |
| 17331 p->rc = SQLITE_NOMEM; |
| 17332 }else{ |
| 17333 int nByte = sizeof(Fts5DlidxWriter) * (nLvl - pWriter->nDlidx); |
| 17334 memset(&aDlidx[pWriter->nDlidx], 0, nByte); |
| 17335 pWriter->aDlidx = aDlidx; |
| 17336 pWriter->nDlidx = nLvl; |
| 17337 } |
| 17338 } |
| 17339 return p->rc; |
| 17340 } |
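The grow-and-zero pattern used by fts5WriteDlidxGrow() can be sketched with the standard allocator. This is an illustration only: `Slot` and `slots_grow` are hypothetical, realloc() stands in for sqlite3_realloc(), and the sketch grows only when the requested size is strictly larger than the current one, whereas the original tests `nLvl>=pWriter->nDlidx`.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical element type standing in for Fts5DlidxWriter. */
typedef struct Slot {
  int pgno;
  int bPrevValid;
} Slot;

/* Hypothetical sketch: grow *paSlot to nNew elements, zeroing the new tail.
** Returns 0 on success or -1 if the allocation fails, in which case the
** existing array and count are left untouched. */
int slots_grow(Slot **paSlot, int *pnSlot, int nNew){
  if( nNew>*pnSlot ){
    Slot *aNew = (Slot*)realloc(*paSlot, sizeof(Slot)*(size_t)nNew);
    if( aNew==0 ) return -1;
    memset(&aNew[*pnSlot], 0, sizeof(Slot)*(size_t)(nNew - *pnSlot));
    *paSlot = aNew;
    *pnSlot = nNew;
  }
  return 0;
}
```

Assigning the result of realloc() to a temporary first, as both versions do, avoids losing the old pointer when the allocation fails.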
| 17341 |
| 17342 /* |
| 17343 ** If the current doclist-index accumulating in pWriter->aDlidx[] is large |
| 17344 ** enough, flush it to disk and return 1. Otherwise discard it and return |
| 17345 ** zero. |
| 17346 */ |
| 17347 static int fts5WriteFlushDlidx(Fts5Index *p, Fts5SegWriter *pWriter){ |
| 17348 int bFlag = 0; |
| 17349 |
| 17350 /* If there were FTS5_MIN_DLIDX_SIZE or more empty leaf pages written |
| 17351 ** to the database, also write the doclist-index to disk. */ |
| 17352 if( pWriter->aDlidx[0].buf.n>0 && pWriter->nEmpty>=FTS5_MIN_DLIDX_SIZE ){ |
| 17353 bFlag = 1; |
| 17354 } |
| 17355 fts5WriteDlidxClear(p, pWriter, bFlag); |
| 17356 pWriter->nEmpty = 0; |
| 17357 return bFlag; |
| 17358 } |
| 17359 |
| 17360 /* |
| 17361 ** This function is called whenever processing of the doclist for the |
| 17362 ** last term on leaf page (pWriter->iBtPage) is completed. |
| 17363 ** |
| 17364 ** The doclist-index for that term is currently stored in-memory within the |
| 17365 ** Fts5SegWriter.aDlidx[] array. If it is large enough, this function |
| 17366 ** writes it out to disk. Or, if it is too small to bother with, discards |
| 17367 ** it. |
| 17368 ** |
| 17369 ** Fts5SegWriter.btterm currently contains the first term on page iBtPage. |
| 17370 */ |
| 17371 static void fts5WriteFlushBtree(Fts5Index *p, Fts5SegWriter *pWriter){ |
| 17372 int bFlag; |
| 17373 |
| 17374 assert( pWriter->iBtPage || pWriter->nEmpty==0 ); |
| 17375 if( pWriter->iBtPage==0 ) return; |
| 17376 bFlag = fts5WriteFlushDlidx(p, pWriter); |
| 17377 |
| 17378 if( p->rc==SQLITE_OK ){ |
| 17379 const char *z = (pWriter->btterm.n>0?(const char*)pWriter->btterm.p:""); |
| 17380 /* The following was already done in fts5WriteInit(): */ |
| 17381 /* sqlite3_bind_int(p->pIdxWriter, 1, pWriter->iSegid); */ |
| 17382 sqlite3_bind_blob(p->pIdxWriter, 2, z, pWriter->btterm.n, SQLITE_STATIC); |
| 17383 sqlite3_bind_int64(p->pIdxWriter, 3, bFlag + ((i64)pWriter->iBtPage<<1)); |
| 17384 sqlite3_step(p->pIdxWriter); |
| 17385 p->rc = sqlite3_reset(p->pIdxWriter); |
| 17386 } |
| 17387 pWriter->iBtPage = 0; |
| 17388 } |
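The value bound to the third parameter of the index writer packs two things into one integer: the leaf page number in the upper bits and the doclist-index flag in the lowest bit. A minimal sketch of that packing follows; the helper names are hypothetical and are not part of FTS5.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring "bFlag + ((i64)pWriter->iBtPage<<1)". */
int64_t idx_pgno_pack(int iBtPage, int bDlidx){
  assert( iBtPage>=0 && (bDlidx==0 || bDlidx==1) );
  return (int64_t)bDlidx + ((int64_t)iBtPage<<1);
}

/* Recover the page number from a packed value. */
int idx_pgno_page(int64_t v){
  return (int)(v>>1);
}

/* Recover the doclist-index flag from a packed value. */
int idx_pgno_has_dlidx(int64_t v){
  return (int)(v & 1);
}
```

Widening to 64 bits before the shift, as the original does with the (i64) cast, keeps large page numbers from overflowing a 32-bit int.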
| 17389 |
| 17390 /* |
| 17391 ** This is called once for each leaf page except the first that contains |
| 17392 ** at least one term. Argument (nTerm/pTerm) is the split-key - a term that |
| 17393 ** is larger than all terms written to earlier leaves, and equal to or |
| 17394 ** smaller than the first term on the new leaf. |
| 17395 ** |
| 17396 ** If an error occurs, an error code is left in Fts5Index.rc. If an error |
| 17397 ** has already occurred when this function is called, it is a no-op. |
| 17398 */ |
| 17399 static void fts5WriteBtreeTerm( |
| 17400 Fts5Index *p, /* FTS5 backend object */ |
| 17401 Fts5SegWriter *pWriter, /* Writer object */ |
| 17402 int nTerm, const u8 *pTerm /* First term on new page */ |
| 17403 ){ |
| 17404 fts5WriteFlushBtree(p, pWriter); |
| 17405 fts5BufferSet(&p->rc, &pWriter->btterm, nTerm, pTerm); |
| 17406 pWriter->iBtPage = pWriter->writer.pgno; |
| 17407 } |
| 17408 |
| 17409 /* |
| 17410 ** This function is called when flushing a leaf page that contains no |
| 17411 ** terms at all to disk. |
| 17412 */ |
| 17413 static void fts5WriteBtreeNoTerm( |
| 17414 Fts5Index *p, /* FTS5 backend object */ |
| 17415 Fts5SegWriter *pWriter /* Writer object */ |
| 17416 ){ |
| 17417 /* If there were no rowids on the leaf page either and the doclist-index |
| 17418 ** has already been started, append a 0x00 byte to it. */ |
| 17419 if( pWriter->bFirstRowidInPage && pWriter->aDlidx[0].buf.n>0 ){ |
| 17420 Fts5DlidxWriter *pDlidx = &pWriter->aDlidx[0]; |
| 17421 assert( pDlidx->bPrevValid ); |
| 17422 sqlite3Fts5BufferAppendVarint(&p->rc, &pDlidx->buf, 0); |
| 17423 } |
| 17424 |
| 17425 /* Increment the "number of sequential leaves without a term" counter. */ |
| 17426 pWriter->nEmpty++; |
| 17427 } |
| 17428 |
| 17429 static i64 fts5DlidxExtractFirstRowid(Fts5Buffer *pBuf){ |
| 17430 i64 iRowid; |
| 17431 int iOff; |
| 17432 |
| 17433 iOff = 1 + fts5GetVarint(&pBuf->p[1], (u64*)&iRowid); |
| 17434 fts5GetVarint(&pBuf->p[iOff], (u64*)&iRowid); |
| 17435 return iRowid; |
| 17436 } |
| 17437 |
| 17438 /* |
| 17439 ** Rowid iRowid has just been appended to the current leaf page. It is the |
| 17440 ** first on the page. This function appends an appropriate entry to the current |
| 17441 ** doclist-index. |
| 17442 */ |
| 17443 static void fts5WriteDlidxAppend( |
| 17444 Fts5Index *p, |
| 17445 Fts5SegWriter *pWriter, |
| 17446 i64 iRowid |
| 17447 ){ |
| 17448 int i; |
| 17449 int bDone = 0; |
| 17450 |
| 17451 for(i=0; p->rc==SQLITE_OK && bDone==0; i++){ |
| 17452 i64 iVal; |
| 17453 Fts5DlidxWriter *pDlidx = &pWriter->aDlidx[i]; |
| 17454 |
| 17455 if( pDlidx->buf.n>=p->pConfig->pgsz ){ |
| 17456 /* The current doclist-index page is full. Write it to disk and push |
| 17457 ** a copy of iRowid (which will become the first rowid on the next |
| 17458 ** doclist-index leaf page) up into the next level of the b-tree |
| 17459 ** hierarchy. If the node being flushed is currently the root node, |
| 17460 ** also push its first rowid upwards. */ |
| 17461 pDlidx->buf.p[0] = 0x01; /* Not the root node */ |
| 17462 fts5DataWrite(p, |
| 17463 FTS5_DLIDX_ROWID(pWriter->iSegid, i, pDlidx->pgno), |
| 17464 pDlidx->buf.p, pDlidx->buf.n |
| 17465 ); |
| 17466 fts5WriteDlidxGrow(p, pWriter, i+2); |
| 17467 pDlidx = &pWriter->aDlidx[i]; |
| 17468 if( p->rc==SQLITE_OK && pDlidx[1].buf.n==0 ){ |
| 17469 i64 iFirst = fts5DlidxExtractFirstRowid(&pDlidx->buf); |
| 17470 |
| 17471 /* This was the root node. Push its first rowid up to the new root. */ |
| 17472 pDlidx[1].pgno = pDlidx->pgno; |
| 17473 sqlite3Fts5BufferAppendVarint(&p->rc, &pDlidx[1].buf, 0); |
| 17474 sqlite3Fts5BufferAppendVarint(&p->rc, &pDlidx[1].buf, pDlidx->pgno); |
| 17475 sqlite3Fts5BufferAppendVarint(&p->rc, &pDlidx[1].buf, iFirst); |
| 17476 pDlidx[1].bPrevValid = 1; |
| 17477 pDlidx[1].iPrev = iFirst; |
| 17478 } |
| 17479 |
| 17480 sqlite3Fts5BufferZero(&pDlidx->buf); |
| 17481 pDlidx->bPrevValid = 0; |
| 17482 pDlidx->pgno++; |
| 17483 }else{ |
| 17484 bDone = 1; |
| 17485 } |
| 17486 |
| 17487 if( pDlidx->bPrevValid ){ |
| 17488 iVal = iRowid - pDlidx->iPrev; |
| 17489 }else{ |
| 17490 i64 iPgno = (i==0 ? pWriter->writer.pgno : pDlidx[-1].pgno); |
| 17491 assert( pDlidx->buf.n==0 ); |
| 17492 sqlite3Fts5BufferAppendVarint(&p->rc, &pDlidx->buf, !bDone); |
| 17493 sqlite3Fts5BufferAppendVarint(&p->rc, &pDlidx->buf, iPgno); |
| 17494 iVal = iRowid; |
| 17495 } |
| 17496 |
| 17497 sqlite3Fts5BufferAppendVarint(&p->rc, &pDlidx->buf, iVal); |
| 17498 pDlidx->bPrevValid = 1; |
| 17499 pDlidx->iPrev = iRowid; |
| 17500 } |
| 17501 } |
| 17502 |
| 17503 static void fts5WriteFlushLeaf(Fts5Index *p, Fts5SegWriter *pWriter){ |
| 17504 static const u8 zero[] = { 0x00, 0x00, 0x00, 0x00 }; |
| 17505 Fts5PageWriter *pPage = &pWriter->writer; |
| 17506 i64 iRowid; |
| 17507 |
| 17508 assert( (pPage->pgidx.n==0)==(pWriter->bFirstTermInPage) ); |
| 17509 |
| 17510 /* Set the szLeaf header field. */ |
| 17511 assert( 0==fts5GetU16(&pPage->buf.p[2]) ); |
| 17512 fts5PutU16(&pPage->buf.p[2], (u16)pPage->buf.n); |
| 17513 |
| 17514 if( pWriter->bFirstTermInPage ){ |
| 17515 /* No term was written to this page. */ |
| 17516 assert( pPage->pgidx.n==0 ); |
| 17517 fts5WriteBtreeNoTerm(p, pWriter); |
| 17518 }else{ |
| 17519 /* Append the pgidx to the page buffer (szLeaf was already set above). */ |
| 17520 fts5BufferAppendBlob(&p->rc, &pPage->buf, pPage->pgidx.n, pPage->pgidx.p); |
| 17521 } |
| 17522 |
| 17523 /* Write the page out to disk */ |
| 17524 iRowid = FTS5_SEGMENT_ROWID(pWriter->iSegid, pPage->pgno); |
| 17525 fts5DataWrite(p, iRowid, pPage->buf.p, pPage->buf.n); |
| 17526 |
| 17527 /* Initialize the next page. */ |
| 17528 fts5BufferZero(&pPage->buf); |
| 17529 fts5BufferZero(&pPage->pgidx); |
| 17530 fts5BufferAppendBlob(&p->rc, &pPage->buf, 4, zero); |
| 17531 pPage->iPrevPgidx = 0; |
| 17532 pPage->pgno++; |
| 17533 |
| 17534 /* Increase the leaves written counter */ |
| 17535 pWriter->nLeafWritten++; |
| 17536 |
| 17537 /* The new leaf holds no terms or rowids */ |
| 17538 pWriter->bFirstTermInPage = 1; |
| 17539 pWriter->bFirstRowidInPage = 1; |
| 17540 } |
| 17541 |
| 17542 /* |
| 17543 ** Append term pTerm/nTerm to the segment being written by the writer passed |
| 17544 ** as the second argument. |
| 17545 ** |
| 17546 ** If an error occurs, set the Fts5Index.rc error code. If an error has |
| 17547 ** already occurred, this function is a no-op. |
| 17548 */ |
| 17549 static void fts5WriteAppendTerm( |
| 17550 Fts5Index *p, |
| 17551 Fts5SegWriter *pWriter, |
| 17552 int nTerm, const u8 *pTerm |
| 17553 ){ |
| 17554 int nPrefix; /* Bytes of prefix compression for term */ |
| 17555 Fts5PageWriter *pPage = &pWriter->writer; |
| 17556 Fts5Buffer *pPgidx = &pWriter->writer.pgidx; |
| 17557 |
| 17558 assert( p->rc==SQLITE_OK ); |
| 17559 assert( pPage->buf.n>=4 ); |
| 17560 assert( pPage->buf.n>4 || pWriter->bFirstTermInPage ); |
| 17561 |
| 17562 /* If the current leaf page is full, flush it to disk. */ |
| 17563 if( (pPage->buf.n + pPgidx->n + nTerm + 2)>=p->pConfig->pgsz ){ |
| 17564 if( pPage->buf.n>4 ){ |
| 17565 fts5WriteFlushLeaf(p, pWriter); |
| 17566 } |
| 17567 fts5BufferGrow(&p->rc, &pPage->buf, nTerm+FTS5_DATA_PADDING); |
| 17568 } |
| 17569 |
| 17570 /* TODO1: Updating pgidx here. */ |
| 17571 pPgidx->n += sqlite3Fts5PutVarint( |
| 17572 &pPgidx->p[pPgidx->n], pPage->buf.n - pPage->iPrevPgidx |
| 17573 ); |
| 17574 pPage->iPrevPgidx = pPage->buf.n; |
| 17575 #if 0 |
| 17576 fts5PutU16(&pPgidx->p[pPgidx->n], pPage->buf.n); |
| 17577 pPgidx->n += 2; |
| 17578 #endif |
| 17579 |
| 17580 if( pWriter->bFirstTermInPage ){ |
| 17581 nPrefix = 0; |
| 17582 if( pPage->pgno!=1 ){ |
| 17583 /* This is the first term on a leaf that is not the leftmost leaf in |
| 17584 ** the segment b-tree. In this case it is necessary to add a term to |
| 17585 ** the b-tree hierarchy that is (a) larger than the largest term |
| 17586 ** already written to the segment and (b) smaller than or equal to |
| 17587 ** this term. In other words, a prefix of (pTerm/nTerm) that is one |
| 17588 ** byte longer than the longest prefix (pTerm/nTerm) shares with the |
| 17589 ** previous term. |
| 17590 ** |
| 17591 ** Usually, the previous term is available in pPage->term. The exception |
| 17592 ** is if this is the first term written in an incremental-merge step. |
| 17593 ** In this case the previous term is not available, so just write a |
| 17594 ** copy of (pTerm/nTerm) into the parent node. This is slightly |
| 17595 ** inefficient, but still correct. */ |
| 17596 int n = nTerm; |
| 17597 if( pPage->term.n ){ |
| 17598 n = 1 + fts5PrefixCompress(pPage->term.n, pPage->term.p, nTerm, pTerm); |
| 17599 } |
| 17600 fts5WriteBtreeTerm(p, pWriter, n, pTerm); |
| 17601 pPage = &pWriter->writer; |
| 17602 } |
| 17603 }else{ |
| 17604 nPrefix = fts5PrefixCompress(pPage->term.n, pPage->term.p, nTerm, pTerm); |
| 17605 fts5BufferAppendVarint(&p->rc, &pPage->buf, nPrefix); |
| 17606 } |
| 17607 |
| 17608 /* Append the number of bytes of new data, then the term data itself |
| 17609 ** to the page. */ |
| 17610 fts5BufferAppendVarint(&p->rc, &pPage->buf, nTerm - nPrefix); |
| 17611 fts5BufferAppendBlob(&p->rc, &pPage->buf, nTerm - nPrefix, &pTerm[nPrefix]); |
| 17612 |
| 17613 /* Update the Fts5PageWriter.term field. */ |
| 17614 fts5BufferSet(&p->rc, &pPage->term, nTerm, pTerm); |
| 17615 pWriter->bFirstTermInPage = 0; |
| 17616 |
| 17617 pWriter->bFirstRowidInPage = 0; |
| 17618 pWriter->bFirstRowidInDoclist = 1; |
| 17619 |
| 17620 assert( p->rc || (pWriter->nDlidx>0 && pWriter->aDlidx[0].buf.n==0) ); |
| 17621 pWriter->aDlidx[0].pgno = pPage->pgno; |
| 17622 } |
| 17623 |
| 17624 /* |
| 17625 ** Append a rowid and position-list size field to the writer's output. |
| 17626 */ |
| 17627 static void fts5WriteAppendRowid( |
| 17628 Fts5Index *p, |
| 17629 Fts5SegWriter *pWriter, |
| 17630 i64 iRowid, |
| 17631 int nPos |
| 17632 ){ |
| 17633 if( p->rc==SQLITE_OK ){ |
| 17634 Fts5PageWriter *pPage = &pWriter->writer; |
| 17635 |
| 17636 if( (pPage->buf.n + pPage->pgidx.n)>=p->pConfig->pgsz ){ |
| 17637 fts5WriteFlushLeaf(p, pWriter); |
| 17638 } |
| 17639 |
| 17640 /* If this is to be the first rowid written to the page, set the |
| 17641 ** rowid-pointer in the page-header. Also append a value to the dlidx |
| 17642 ** buffer, in case a doclist-index is required. */ |
| 17643 if( pWriter->bFirstRowidInPage ){ |
| 17644 fts5PutU16(pPage->buf.p, (u16)pPage->buf.n); |
| 17645 fts5WriteDlidxAppend(p, pWriter, iRowid); |
| 17646 } |
| 17647 |
| 17648 /* Write the rowid. */ |
| 17649 if( pWriter->bFirstRowidInDoclist || pWriter->bFirstRowidInPage ){ |
| 17650 fts5BufferAppendVarint(&p->rc, &pPage->buf, iRowid); |
| 17651 }else{ |
| 17652 assert( p->rc || iRowid>pWriter->iPrevRowid ); |
| 17653 fts5BufferAppendVarint(&p->rc, &pPage->buf, iRowid - pWriter->iPrevRowid); |
| 17654 } |
| 17655 pWriter->iPrevRowid = iRowid; |
| 17656 pWriter->bFirstRowidInDoclist = 0; |
| 17657 pWriter->bFirstRowidInPage = 0; |
| 17658 |
| 17659 fts5BufferAppendVarint(&p->rc, &pPage->buf, nPos); |
| 17660 } |
| 17661 } |
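Within a doclist, fts5WriteAppendRowid() stores the first rowid as an absolute value and each later rowid as the difference from its predecessor. The sketch below shows that delta scheme on plain arrays. It is a simplification under stated assumptions: the helper names are hypothetical, values are kept as 64-bit integers rather than varints, and the restart with an absolute rowid at the top of each new leaf page is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: aOut[0] receives the absolute first rowid and each
** later entry the difference from the previous rowid. Rowids must be in
** strictly ascending order, matching the assert() in the original. */
void rowid_delta_encode(const int64_t *aRowid, int n, int64_t *aOut){
  int i;
  int64_t iPrev = 0;
  for(i=0; i<n; i++){
    assert( i==0 || aRowid[i]>iPrev );
    aOut[i] = (i==0 ? aRowid[i] : aRowid[i]-iPrev);
    iPrev = aRowid[i];
  }
}

/* Inverse of rowid_delta_encode(): rebuild absolute rowids from deltas. */
void rowid_delta_decode(const int64_t *aDelta, int n, int64_t *aOut){
  int i;
  int64_t iPrev = 0;
  for(i=0; i<n; i++){
    iPrev = (i==0 ? aDelta[i] : iPrev + aDelta[i]);
    aOut[i] = iPrev;
  }
}
```

Small deltas are what make the varint encoding used by the real writer compact.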
| 17662 |
| 17663 static void fts5WriteAppendPoslistData( |
| 17664 Fts5Index *p, |
| 17665 Fts5SegWriter *pWriter, |
| 17666 const u8 *aData, |
| 17667 int nData |
| 17668 ){ |
| 17669 Fts5PageWriter *pPage = &pWriter->writer; |
| 17670 const u8 *a = aData; |
| 17671 int n = nData; |
| 17672 |
| 17673 assert( p->pConfig->pgsz>0 ); |
| 17674 while( p->rc==SQLITE_OK |
| 17675 && (pPage->buf.n + pPage->pgidx.n + n)>=p->pConfig->pgsz |
| 17676 ){ |
| 17677 int nReq = p->pConfig->pgsz - pPage->buf.n - pPage->pgidx.n; |
| 17678 int nCopy = 0; |
| 17679 while( nCopy<nReq ){ |
| 17680 i64 dummy; |
| 17681 nCopy += fts5GetVarint(&a[nCopy], (u64*)&dummy); |
| 17682 } |
| 17683 fts5BufferAppendBlob(&p->rc, &pPage->buf, nCopy, a); |
| 17684 a += nCopy; |
| 17685 n -= nCopy; |
| 17686 fts5WriteFlushLeaf(p, pWriter); |
| 17687 } |
| 17688 if( n>0 ){ |
| 17689 fts5BufferAppendBlob(&p->rc, &pPage->buf, n, a); |
| 17690 } |
| 17691 } |
| 17692 |
| 17693 /* |
| 17694 ** Flush any data cached by the writer object to the database. Free any |
| 17695 ** allocations associated with the writer. |
| 17696 */ |
| 17697 static void fts5WriteFinish( |
| 17698 Fts5Index *p, |
| 17699 Fts5SegWriter *pWriter, /* Writer object */ |
| 17700 int *pnLeaf /* OUT: Number of leaf pages in b-tree */ |
| 17701 ){ |
| 17702 int i; |
| 17703 Fts5PageWriter *pLeaf = &pWriter->writer; |
| 17704 if( p->rc==SQLITE_OK ){ |
| 17705 assert( pLeaf->pgno>=1 ); |
| 17706 if( pLeaf->buf.n>4 ){ |
| 17707 fts5WriteFlushLeaf(p, pWriter); |
| 17708 } |
| 17709 *pnLeaf = pLeaf->pgno-1; |
| 17710 fts5WriteFlushBtree(p, pWriter); |
| 17711 } |
| 17712 fts5BufferFree(&pLeaf->term); |
| 17713 fts5BufferFree(&pLeaf->buf); |
| 17714 fts5BufferFree(&pLeaf->pgidx); |
| 17715 fts5BufferFree(&pWriter->btterm); |
| 17716 |
| 17717 for(i=0; i<pWriter->nDlidx; i++){ |
| 17718 sqlite3Fts5BufferFree(&pWriter->aDlidx[i].buf); |
| 17719 } |
| 17720 sqlite3_free(pWriter->aDlidx); |
| 17721 } |
| 17722 |
| 17723 static void fts5WriteInit( |
| 17724 Fts5Index *p, |
| 17725 Fts5SegWriter *pWriter, |
| 17726 int iSegid |
| 17727 ){ |
| 17728 const int nBuffer = p->pConfig->pgsz + FTS5_DATA_PADDING; |
| 17729 |
| 17730 memset(pWriter, 0, sizeof(Fts5SegWriter)); |
| 17731 pWriter->iSegid = iSegid; |
| 17732 |
| 17733 fts5WriteDlidxGrow(p, pWriter, 1); |
| 17734 pWriter->writer.pgno = 1; |
| 17735 pWriter->bFirstTermInPage = 1; |
| 17736 pWriter->iBtPage = 1; |
| 17737 |
| 17738 assert( pWriter->writer.buf.n==0 ); |
| 17739 assert( pWriter->writer.pgidx.n==0 ); |
| 17740 |
| 17741 /* Grow the two buffers to pgsz + padding bytes in size. */ |
| 17742 sqlite3Fts5BufferSize(&p->rc, &pWriter->writer.pgidx, nBuffer); |
| 17743 sqlite3Fts5BufferSize(&p->rc, &pWriter->writer.buf, nBuffer); |
| 17744 |
| 17745 if( p->pIdxWriter==0 ){ |
| 17746 Fts5Config *pConfig = p->pConfig; |
| 17747 fts5IndexPrepareStmt(p, &p->pIdxWriter, sqlite3_mprintf( |
| 17748 "INSERT INTO '%q'.'%q_idx'(segid,term,pgno) VALUES(?,?,?)", |
| 17749 pConfig->zDb, pConfig->zName |
| 17750 )); |
| 17751 } |
| 17752 |
| 17753 if( p->rc==SQLITE_OK ){ |
| 17754 /* Initialize the 4-byte leaf-page header to 0x00. */ |
| 17755 memset(pWriter->writer.buf.p, 0, 4); |
| 17756 pWriter->writer.buf.n = 4; |
| 17757 |
| 17758 /* Bind the current output segment id to the index-writer. This is an |
| 17759 ** optimization over binding the same value over and over as rows are |
| 17760 ** inserted into %_idx by the current writer. */ |
| 17761 sqlite3_bind_int(p->pIdxWriter, 1, pWriter->iSegid); |
| 17762 } |
| 17763 } |
| 17764 |
| 17765 /* |
| 17766 ** Iterator pIter was used to iterate through the input segments of an |
| 17767 ** incremental merge operation. This function is called if the incremental |
| 17768 ** merge step has finished but the input has not been completely exhausted. |
| 17769 */ |
| 17770 static void fts5TrimSegments(Fts5Index *p, Fts5IndexIter *pIter){ |
| 17771 int i; |
| 17772 Fts5Buffer buf; |
| 17773 memset(&buf, 0, sizeof(Fts5Buffer)); |
| 17774 for(i=0; i<pIter->nSeg; i++){ |
| 17775 Fts5SegIter *pSeg = &pIter->aSeg[i]; |
| 17776 if( pSeg->pSeg==0 ){ |
| 17777 /* no-op */ |
| 17778 }else if( pSeg->pLeaf==0 ){ |
| 17779 /* All keys from this input segment have been transferred to the output. |
| 17780 ** Set both the first and last page-numbers to 0 to indicate that the |
| 17781 ** segment is now empty. */ |
| 17782 pSeg->pSeg->pgnoLast = 0; |
| 17783 pSeg->pSeg->pgnoFirst = 0; |
| 17784 }else{ |
| 17785 int iOff = pSeg->iTermLeafOffset; /* Offset on new first leaf page */ |
| 17786 i64 iLeafRowid; |
| 17787 Fts5Data *pData; |
| 17788 int iId = pSeg->pSeg->iSegid; |
| 17789 u8 aHdr[4] = {0x00, 0x00, 0x00, 0x00}; |
| 17790 |
| 17791 iLeafRowid = FTS5_SEGMENT_ROWID(iId, pSeg->iTermLeafPgno); |
| 17792 pData = fts5DataRead(p, iLeafRowid); |
| 17793 if( pData ){ |
| 17794 fts5BufferZero(&buf); |
| 17795 fts5BufferGrow(&p->rc, &buf, pData->nn); |
| 17796 fts5BufferAppendBlob(&p->rc, &buf, sizeof(aHdr), aHdr); |
| 17797 fts5BufferAppendVarint(&p->rc, &buf, pSeg->term.n); |
| 17798 fts5BufferAppendBlob(&p->rc, &buf, pSeg->term.n, pSeg->term.p); |
| 17799 fts5BufferAppendBlob(&p->rc, &buf, pData->szLeaf-iOff, &pData->p[iOff]); |
| 17800 if( p->rc==SQLITE_OK ){ |
| 17801 /* Set the szLeaf field */ |
| 17802 fts5PutU16(&buf.p[2], (u16)buf.n); |
| 17803 } |
| 17804 |
| 17805 /* Set up the new page-index array */ |
| 17806 fts5BufferAppendVarint(&p->rc, &buf, 4); |
| 17807 if( pSeg->iLeafPgno==pSeg->iTermLeafPgno |
| 17808 && pSeg->iEndofDoclist<pData->szLeaf |
| 17809 ){ |
| 17810 int nDiff = pData->szLeaf - pSeg->iEndofDoclist; |
| 17811 fts5BufferAppendVarint(&p->rc, &buf, buf.n - 1 - nDiff - 4); |
| 17812 fts5BufferAppendBlob(&p->rc, &buf, |
| 17813 pData->nn - pSeg->iPgidxOff, &pData->p[pSeg->iPgidxOff] |
| 17814 ); |
| 17815 } |
| 17816 |
| 17817 fts5DataRelease(pData); |
| 17818 pSeg->pSeg->pgnoFirst = pSeg->iTermLeafPgno; |
| 17819 fts5DataDelete(p, FTS5_SEGMENT_ROWID(iId, 1), iLeafRowid); |
| 17820 fts5DataWrite(p, iLeafRowid, buf.p, buf.n); |
| 17821 } |
| 17822 } |
| 17823 } |
| 17824 fts5BufferFree(&buf); |
| 17825 } |
| 17826 |
| 17827 static void fts5MergeChunkCallback( |
| 17828 Fts5Index *p, |
| 17829 void *pCtx, |
| 17830 const u8 *pChunk, int nChunk |
| 17831 ){ |
| 17832 Fts5SegWriter *pWriter = (Fts5SegWriter*)pCtx; |
| 17833 fts5WriteAppendPoslistData(p, pWriter, pChunk, nChunk); |
| 17834 } |
| 17835 |
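| 17836 /* |
| 17837 ** Merge segments on level iLvl into one output segment on level iLvl+1. If pnRem is not NULL, stop at the first new term after more than *pnRem leaves have been written. |
| 17838 */ |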
| 17839 static void fts5IndexMergeLevel( |
| 17840 Fts5Index *p, /* FTS5 backend object */ |
| 17841 Fts5Structure **ppStruct, /* IN/OUT: Structure of index */ |
| 17842 int iLvl, /* Level to read input from */ |
| 17843 int *pnRem /* Write up to this many output leaves */ |
| 17844 ){ |
| 17845 Fts5Structure *pStruct = *ppStruct; |
| 17846 Fts5StructureLevel *pLvl = &pStruct->aLevel[iLvl]; |
| 17847 Fts5StructureLevel *pLvlOut; |
| 17848 Fts5IndexIter *pIter = 0; /* Iterator to read input data */ |
| 17849 int nRem = pnRem ? *pnRem : 0; /* Output leaf pages left to write */ |
| 17850 int nInput; /* Number of input segments */ |
| 17851 Fts5SegWriter writer; /* Writer object */ |
| 17852 Fts5StructureSegment *pSeg; /* Output segment */ |
| 17853 Fts5Buffer term; |
| 17854 int bOldest; /* True if the output segment is the oldest */ |
| 17855 |
| 17856 assert( iLvl<pStruct->nLevel ); |
| 17857 assert( pLvl->nMerge<=pLvl->nSeg ); |
| 17858 |
| 17859 memset(&writer, 0, sizeof(Fts5SegWriter)); |
| 17860 memset(&term, 0, sizeof(Fts5Buffer)); |
| 17861 if( pLvl->nMerge ){ |
| 17862 pLvlOut = &pStruct->aLevel[iLvl+1]; |
| 17863 assert( pLvlOut->nSeg>0 ); |
| 17864 nInput = pLvl->nMerge; |
| 17865 pSeg = &pLvlOut->aSeg[pLvlOut->nSeg-1]; |
| 17866 |
| 17867 fts5WriteInit(p, &writer, pSeg->iSegid); |
| 17868 writer.writer.pgno = pSeg->pgnoLast+1; |
| 17869 writer.iBtPage = 0; |
| 17870 }else{ |
| 17871 int iSegid = fts5AllocateSegid(p, pStruct); |
| 17872 |
| 17873 /* Extend the Fts5Structure object as required to ensure the output |
| 17874 ** segment exists. */ |
| 17875 if( iLvl==pStruct->nLevel-1 ){ |
| 17876 fts5StructureAddLevel(&p->rc, ppStruct); |
| 17877 pStruct = *ppStruct; |
| 17878 } |
| 17879 fts5StructureExtendLevel(&p->rc, pStruct, iLvl+1, 1, 0); |
| 17880 if( p->rc ) return; |
| 17881 pLvl = &pStruct->aLevel[iLvl]; |
| 17882 pLvlOut = &pStruct->aLevel[iLvl+1]; |
| 17883 |
| 17884 fts5WriteInit(p, &writer, iSegid); |
| 17885 |
| 17886 /* Add the new segment to the output level */ |
| 17887 pSeg = &pLvlOut->aSeg[pLvlOut->nSeg]; |
| 17888 pLvlOut->nSeg++; |
| 17889 pSeg->pgnoFirst = 1; |
| 17890 pSeg->iSegid = iSegid; |
| 17891 pStruct->nSegment++; |
| 17892 |
| 17893 /* Read input from all segments in the input level */ |
| 17894 nInput = pLvl->nSeg; |
| 17895 } |
| 17896 bOldest = (pLvlOut->nSeg==1 && pStruct->nLevel==iLvl+2); |
| 17897 |
| 17898 assert( iLvl>=0 ); |
| 17899 for(fts5MultiIterNew(p, pStruct, 0, 0, 0, 0, iLvl, nInput, &pIter); |
| 17900 fts5MultiIterEof(p, pIter)==0; |
| 17901 fts5MultiIterNext(p, pIter, 0, 0) |
| 17902 ){ |
| 17903 Fts5SegIter *pSegIter = &pIter->aSeg[ pIter->aFirst[1].iFirst ]; |
| 17904 int nPos; /* position-list size field value */ |
| 17905 int nTerm; |
| 17906 const u8 *pTerm; |
| 17907 |
| 17908 /* Check for key annihilation. */ |
| 17909 if( pSegIter->nPos==0 && (bOldest || pSegIter->bDel==0) ) continue; |
| 17910 |
| 17911 pTerm = fts5MultiIterTerm(pIter, &nTerm); |
| 17912 if( nTerm!=term.n || memcmp(pTerm, term.p, nTerm) ){ |
| 17913 if( pnRem && writer.nLeafWritten>nRem ){ |
| 17914 break; |
| 17915 } |
| 17916 |
| 17917 /* This is a new term. Append a term to the output segment. */ |
| 17918 fts5WriteAppendTerm(p, &writer, nTerm, pTerm); |
| 17919 fts5BufferSet(&p->rc, &term, nTerm, pTerm); |
| 17920 } |
| 17921 |
| 17922 /* Append the rowid to the output */ |
| 17923 /* WRITEPOSLISTSIZE */ |
| 17924 nPos = pSegIter->nPos*2 + pSegIter->bDel; |
| 17925 fts5WriteAppendRowid(p, &writer, fts5MultiIterRowid(pIter), nPos); |
| 17926 |
| 17927 /* Append the position-list data to the output */ |
| 17928 fts5ChunkIterate(p, pSegIter, (void*)&writer, fts5MergeChunkCallback); |
| 17929 } |
| 17930 |
| 17931 /* Flush the last leaf page to disk. Set the output segment's last leaf |
| 17932 ** page number at the same time. */ |
| 17933 fts5WriteFinish(p, &writer, &pSeg->pgnoLast); |
| 17934 |
| 17935 if( fts5MultiIterEof(p, pIter) ){ |
| 17936 int i; |
| 17937 |
| 17938 /* Remove the redundant segments from the %_data table */ |
| 17939 for(i=0; i<nInput; i++){ |
| 17940 fts5DataRemoveSegment(p, pLvl->aSeg[i].iSegid); |
| 17941 } |
| 17942 |
| 17943 /* Remove the redundant segments from the input level */ |
| 17944 if( pLvl->nSeg!=nInput ){ |
| 17945 int nMove = (pLvl->nSeg - nInput) * sizeof(Fts5StructureSegment); |
| 17946 memmove(pLvl->aSeg, &pLvl->aSeg[nInput], nMove); |
| 17947 } |
| 17948 pStruct->nSegment -= nInput; |
| 17949 pLvl->nSeg -= nInput; |
| 17950 pLvl->nMerge = 0; |
| 17951 if( pSeg->pgnoLast==0 ){ |
| 17952 pLvlOut->nSeg--; |
| 17953 pStruct->nSegment--; |
| 17954 } |
| 17955 }else{ |
| 17956 assert( pSeg->pgnoLast>0 ); |
| 17957 fts5TrimSegments(p, pIter); |
| 17958 pLvl->nMerge = nInput; |
| 17959 } |
| 17960 |
| 17961 fts5MultiIterFree(p, pIter); |
| 17962 fts5BufferFree(&term); |
| 17963 if( pnRem ) *pnRem -= writer.nLeafWritten; |
| 17964 } |
| 17965 |
| 17966 /* |
| 17967 ** Do up to nPg pages of automerge work on the index. |
| 17968 */ |
| 17969 static void fts5IndexMerge( |
| 17970 Fts5Index *p, /* FTS5 backend object */ |
| 17971 Fts5Structure **ppStruct, /* IN/OUT: Current structure of index */ |
| 17972 int nPg /* Pages of work to do */ |
| 17973 ){ |
| 17974 int nRem = nPg; |
| 17975 Fts5Structure *pStruct = *ppStruct; |
| 17976 while( nRem>0 && p->rc==SQLITE_OK ){ |
| 17977 int iLvl; /* To iterate through levels */ |
| 17978 int iBestLvl = 0; /* Level offering the most input segments */ |
| 17979 int nBest = 0; /* Number of input segments on best level */ |
| 17980 |
| 17981 /* Set iBestLvl to the level to read input segments from. */ |
| 17982 assert( pStruct->nLevel>0 ); |
| 17983 for(iLvl=0; iLvl<pStruct->nLevel; iLvl++){ |
| 17984 Fts5StructureLevel *pLvl = &pStruct->aLevel[iLvl]; |
| 17985 if( pLvl->nMerge ){ |
| 17986 if( pLvl->nMerge>nBest ){ |
| 17987 iBestLvl = iLvl; |
| 17988 nBest = pLvl->nMerge; |
| 17989 } |
| 17990 break; |
| 17991 } |
| 17992 if( pLvl->nSeg>nBest ){ |
| 17993 nBest = pLvl->nSeg; |
| 17994 iBestLvl = iLvl; |
| 17995 } |
| 17996 } |
| 17997 |
| 17998 /* If nBest is still 0, then the index must be empty. */ |
| 17999 #ifdef SQLITE_DEBUG |
| 18000 for(iLvl=0; nBest==0 && iLvl<pStruct->nLevel; iLvl++){ |
| 18001 assert( pStruct->aLevel[iLvl].nSeg==0 ); |
| 18002 } |
| 18003 #endif |
| 18004 |
| 18005 if( nBest<p->pConfig->nAutomerge |
| 18006 && pStruct->aLevel[iBestLvl].nMerge==0 |
| 18007 ){ |
| 18008 break; |
| 18009 } |
| 18010 fts5IndexMergeLevel(p, &pStruct, iBestLvl, &nRem); |
| 18011 if( p->rc==SQLITE_OK && pStruct->aLevel[iBestLvl].nMerge==0 ){ |
| 18012 fts5StructurePromote(p, iBestLvl+1, pStruct); |
| 18013 } |
| 18014 } |
| 18015 *ppStruct = pStruct; |
| 18016 } |
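
The level-selection loop in fts5IndexMerge() above can be read in isolation. The following is a minimal sketch, not the FTS5 code itself: `Level` and `pick_merge_level()` are hypothetical stand-ins for Fts5StructureLevel and the inline loop, and they keep only the two fields the loop consults. The scan stops at the first level with a merge already in progress (non-zero nMerge) and otherwise prefers the level holding the most segments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for Fts5StructureLevel. */
typedef struct Level {
  int nSeg;    /* Number of segments on this level */
  int nMerge;  /* Non-zero if a merge from this level is in progress */
} Level;

/*
** Return the index of the level to read input segments from and store the
** number of input segments in *pnBest. Mirrors the loop in fts5IndexMerge():
** the scan ends at the first level with an ongoing merge.
*/
static int pick_merge_level(const Level *aLevel, int nLevel, int *pnBest){
  int iBest = 0;
  int nBest = 0;
  int i;
  assert( aLevel!=NULL && nLevel>0 && pnBest!=NULL );
  for(i=0; i<nLevel; i++){
    if( aLevel[i].nMerge ){
      if( aLevel[i].nMerge>nBest ){
        iBest = i;
        nBest = aLevel[i].nMerge;
      }
      break;                     /* Never look past an in-progress merge */
    }
    if( aLevel[i].nSeg>nBest ){
      nBest = aLevel[i].nSeg;
      iBest = i;
    }
  }
  *pnBest = nBest;
  return iBest;
}
```

Under these assumptions, a caller would compare the returned count against the automerge threshold before starting a new merge, as the code above does with p->pConfig->nAutomerge.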
| 18017 |
| 18018 /* |
| 18019 ** A total of nLeaf leaf pages of data has just been flushed to a level-0 |
| 18020 ** segment. This function updates the write-counter accordingly and, if |
| 18021 ** necessary, performs incremental merge work. |
| 18022 ** |
| 18023 ** If an error occurs, set the Fts5Index.rc error code. If an error has |
| 18024 ** already occurred, this function is a no-op. |
| 18025 */ |
| 18026 static void fts5IndexAutomerge( |
| 18027 Fts5Index *p, /* FTS5 backend object */ |
| 18028 Fts5Structure **ppStruct, /* IN/OUT: Current structure of index */ |
| 18029 int nLeaf /* Number of output leaves just written */ |
| 18030 ){ |
| 18031 if( p->rc==SQLITE_OK && p->pConfig->nAutomerge>0 ){ |
| 18032 Fts5Structure *pStruct = *ppStruct; |
| 18033 u64 nWrite; /* Initial value of write-counter */ |
| 18034 int nWork; /* Number of work-quanta to perform */ |
| 18035 int nRem; /* Number of leaf pages left to write */ |
| 18036 |
| 18037 /* Update the write-counter. While doing so, set nWork. */ |
| 18038 nWrite = pStruct->nWriteCounter; |
| 18039 nWork = (int)(((nWrite + nLeaf) / p->nWorkUnit) - (nWrite / p->nWorkUnit)); |
| 18040 pStruct->nWriteCounter += nLeaf; |
| 18041 nRem = (int)(p->nWorkUnit * nWork * pStruct->nLevel); |
| 18042 |
| 18043 fts5IndexMerge(p, ppStruct, nRem); |
| 18044 } |
| 18045 } |
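
The write-counter arithmetic in fts5IndexAutomerge() above can be checked with a small standalone sketch. The helper names `work_quanta()` and `merge_budget()` are hypothetical, and the work-unit value is passed in as a parameter rather than taken from FTS5_WORK_UNIT, whose definition lies outside this section. The idea is that one work quantum becomes due each time the running write counter crosses a multiple of the work unit.

```c
#include <assert.h>
#include <stdint.h>

/*
** Number of work-unit boundaries crossed when the write counter advances
** from nWrite to nWrite+nLeaf. Same expression as nWork in
** fts5IndexAutomerge().
*/
static int work_quanta(uint64_t nWrite, int nLeaf, int nWorkUnit){
  uint64_t unit = (uint64_t)nWorkUnit;
  assert( nLeaf>=0 && nWorkUnit>0 );
  return (int)(((nWrite + (uint64_t)nLeaf) / unit) - (nWrite / unit));
}

/*
** Page budget handed to the merge step: one work unit per quantum per
** level, matching the nRem computation in fts5IndexAutomerge().
*/
static int merge_budget(uint64_t nWrite, int nLeaf, int nWorkUnit, int nLevel){
  assert( nLevel>=0 );
  return nWorkUnit * work_quanta(nWrite, nLeaf, nWorkUnit) * nLevel;
}
```

For example, with an illustrative work unit of 64 pages, advancing the counter from 100 to 140 crosses one boundary (128), so a three-level index gets a budget of 192 pages. Advancing from 10 to 30 crosses none and schedules no merge work.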
| 18046 |
| 18047 static void fts5IndexCrisismerge( |
| 18048 Fts5Index *p, /* FTS5 backend object */ |
| 18049 Fts5Structure **ppStruct /* IN/OUT: Current structure of index */ |
| 18050 ){ |
| 18051 const int nCrisis = p->pConfig->nCrisisMerge; |
| 18052 Fts5Structure *pStruct = *ppStruct; |
| 18053 int iLvl = 0; |
| 18054 |
| 18055 assert( p->rc!=SQLITE_OK || pStruct->nLevel>0 ); |
| 18056 while( p->rc==SQLITE_OK && pStruct->aLevel[iLvl].nSeg>=nCrisis ){ |
| 18057 fts5IndexMergeLevel(p, &pStruct, iLvl, 0); |
| 18058 assert( p->rc!=SQLITE_OK || pStruct->nLevel>(iLvl+1) ); |
| 18059 fts5StructurePromote(p, iLvl+1, pStruct); |
| 18060 iLvl++; |
| 18061 } |
| 18062 *ppStruct = pStruct; |
| 18063 } |
| 18064 |
| 18065 static int fts5IndexReturn(Fts5Index *p){ |
| 18066 int rc = p->rc; |
| 18067 p->rc = SQLITE_OK; |
| 18068 return rc; |
| 18069 } |
| 18070 |
| 18071 typedef struct Fts5FlushCtx Fts5FlushCtx; |
| 18072 struct Fts5FlushCtx { |
| 18073 Fts5Index *pIdx; |
| 18074 Fts5SegWriter writer; |
| 18075 }; |
| 18076 |
| 18077 /* |
| 18078 ** Buffer aBuf[] contains a list of varints, all small enough to fit |
| 18079 ** in a 32-bit integer. Return the size in bytes of the largest prefix of |
| 18080 ** this list that is nMax bytes or less. The first varint is always counted. |
| 18081 */ |
| 18082 static int fts5PoslistPrefix(const u8 *aBuf, int nMax){ |
| 18083 int ret; |
| 18084 u32 dummy; |
| 18085 ret = fts5GetVarint32(aBuf, dummy); |
| 18086 if( ret<nMax ){ |
| 18087 while( 1 ){ |
| 18088 int i = fts5GetVarint32(&aBuf[ret], dummy); |
| 18089 if( (ret + i) > nMax ) break; |
| 18090 ret += i; |
| 18091 } |
| 18092 } |
| 18093 return ret; |
| 18094 } |
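
The prefix computation in fts5PoslistPrefix() above depends only on varint lengths, not on decoded values. The sketch below illustrates that under stated assumptions: `varint32_len()` is a hypothetical replacement for fts5GetVarint32() that only counts bytes (a set high bit means another byte follows, capped at 5 bytes for a 32-bit value), and `poslist_prefix()` repeats the loop from the function above. As in the original, the caller must guarantee that the buffer holds a complete varint starting at or before offset nMax.

```c
#include <assert.h>
#include <stddef.h>

/*
** Hypothetical helper: size in bytes of the varint starting at a[0].
** Every byte with the high bit set is followed by another byte. A value
** that fits in 32 bits occupies at most 5 bytes.
*/
static int varint32_len(const unsigned char *a){
  int n = 0;
  assert( a!=NULL );
  while( n<4 && (a[n] & 0x80) ) n++;
  return n+1;
}

/*
** Return the size of the largest prefix of the varint list aBuf[] that is
** nMax bytes or less, never splitting a varint. The first varint is always
** counted, even if it is larger than nMax on its own.
*/
static int poslist_prefix(const unsigned char *aBuf, int nMax){
  int ret = varint32_len(aBuf);
  if( ret<nMax ){
    for(;;){
      int i = varint32_len(&aBuf[ret]);
      if( (ret + i) > nMax ) break;
      ret += i;
    }
  }
  return ret;
}
```

This is why the flush loop can split an oversized position list across leaf pages while keeping each varint contiguous: every cut falls on a varint boundary.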
| 18095 |
| 18096 /* |
| 18097 ** Flush the contents of the in-memory hash table (p->pHash) to a new |
| 18098 ** level-0 segment on disk. Also update the corresponding structure record. |
| 18099 ** |
| 18100 ** If an error occurs, set the Fts5Index.rc error code. If an error has |
| 18101 ** already occurred, this function is a no-op. |
| 18102 */ |
| 18103 static void fts5FlushOneHash(Fts5Index *p){ |
| 18104 Fts5Hash *pHash = p->pHash; |
| 18105 Fts5Structure *pStruct; |
| 18106 int iSegid; |
| 18107 int pgnoLast = 0; /* Last leaf page number in segment */ |
| 18108 |
| 18109 /* Obtain a reference to the index structure and allocate a new segment-id |
| 18110 ** for the new level-0 segment. */ |
| 18111 pStruct = fts5StructureRead(p); |
| 18112 iSegid = fts5AllocateSegid(p, pStruct); |
| 18113 |
| 18114 if( iSegid ){ |
| 18115 const int pgsz = p->pConfig->pgsz; |
| 18116 |
| 18117 Fts5StructureSegment *pSeg; /* New segment within pStruct */ |
| 18118 Fts5Buffer *pBuf; /* Buffer in which to assemble leaf page */ |
| 18119 Fts5Buffer *pPgidx; /* Buffer in which to assemble pgidx */ |
| 18120 |
| 18121 Fts5SegWriter writer; |
| 18122 fts5WriteInit(p, &writer, iSegid); |
| 18123 |
| 18124 pBuf = &writer.writer.buf; |
| 18125 pPgidx = &writer.writer.pgidx; |
| 18126 |
| 18127 /* fts5WriteInit() should have initialized the buffers to (most likely) |
| 18128 ** the maximum space required. */ |
| 18129 assert( p->rc || pBuf->nSpace>=(pgsz + FTS5_DATA_PADDING) ); |
| 18130 assert( p->rc || pPgidx->nSpace>=(pgsz + FTS5_DATA_PADDING) ); |
| 18131 |
| 18132 /* Begin scanning through hash table entries. This loop runs once for each |
| 18133 ** term/doclist currently stored within the hash table. */ |
| 18134 if( p->rc==SQLITE_OK ){ |
| 18135 p->rc = sqlite3Fts5HashScanInit(pHash, 0, 0); |
| 18136 } |
| 18137 while( p->rc==SQLITE_OK && 0==sqlite3Fts5HashScanEof(pHash) ){ |
| 18138 const char *zTerm; /* Buffer containing term */ |
| 18139 const u8 *pDoclist; /* Pointer to doclist for this term */ |
| 18140 int nDoclist; /* Size of doclist in bytes */ |
| 18141 |
| 18142 /* Write the term for this entry to disk. */ |
| 18143 sqlite3Fts5HashScanEntry(pHash, &zTerm, &pDoclist, &nDoclist); |
| 18144 fts5WriteAppendTerm(p, &writer, (int)strlen(zTerm), (const u8*)zTerm); |
| 18145 |
| 18146 assert( writer.bFirstRowidInPage==0 ); |
| 18147 if( pgsz>=(pBuf->n + pPgidx->n + nDoclist + 1) ){ |
| 18148 /* The entire doclist will fit on the current leaf. */ |
| 18149 fts5BufferSafeAppendBlob(pBuf, pDoclist, nDoclist); |
| 18150 }else{ |
| 18151 i64 iRowid = 0; |
| 18152 i64 iDelta = 0; |
| 18153 int iOff = 0; |
| 18154 |
| 18155 /* The entire doclist will not fit on this leaf. The following |
| 18156 ** loop iterates through the poslists that make up the current |
| 18157 ** doclist. */ |
| 18158 while( p->rc==SQLITE_OK && iOff<nDoclist ){ |
| 18159 int nPos; |
| 18160 int nCopy; |
| 18161 int bDummy; |
| 18162 iOff += fts5GetVarint(&pDoclist[iOff], (u64*)&iDelta); |
| 18163 nCopy = fts5GetPoslistSize(&pDoclist[iOff], &nPos, &bDummy); |
| 18164 nCopy += nPos; |
| 18165 iRowid += iDelta; |
| 18166 |
| 18167 if( writer.bFirstRowidInPage ){ |
| 18168 fts5PutU16(&pBuf->p[0], (u16)pBuf->n); /* first rowid on page */ |
| 18169 pBuf->n += sqlite3Fts5PutVarint(&pBuf->p[pBuf->n], iRowid); |
| 18170 writer.bFirstRowidInPage = 0; |
| 18171 fts5WriteDlidxAppend(p, &writer, iRowid); |
| 18172 }else{ |
| 18173 pBuf->n += sqlite3Fts5PutVarint(&pBuf->p[pBuf->n], iDelta); |
| 18174 } |
| 18175 assert( pBuf->n<=pBuf->nSpace ); |
| 18176 |
| 18177 if( (pBuf->n + pPgidx->n + nCopy) <= pgsz ){ |
| 18178 /* The entire poslist will fit on the current leaf. So copy |
| 18179 ** it in one go. */ |
| 18180 fts5BufferSafeAppendBlob(pBuf, &pDoclist[iOff], nCopy); |
| 18181 }else{ |
| 18182 /* The entire poslist will not fit on this leaf. So it needs |
| 18183 ** to be broken into sections. The only qualification being |
| 18184 ** that each varint must be stored contiguously. */ |
| 18185 const u8 *pPoslist = &pDoclist[iOff]; |
| 18186 int iPos = 0; |
| 18187 while( p->rc==SQLITE_OK ){ |
| 18188 int nSpace = pgsz - pBuf->n - pPgidx->n; |
| 18189 int n = 0; |
| 18190 if( (nCopy - iPos)<=nSpace ){ |
| 18191 n = nCopy - iPos; |
| 18192 }else{ |
| 18193 n = fts5PoslistPrefix(&pPoslist[iPos], nSpace); |
| 18194 } |
| 18195 assert( n>0 ); |
| 18196 fts5BufferSafeAppendBlob(pBuf, &pPoslist[iPos], n); |
| 18197 iPos += n; |
| 18198 if( (pBuf->n + pPgidx->n)>=pgsz ){ |
| 18199 fts5WriteFlushLeaf(p, &writer); |
| 18200 } |
| 18201 if( iPos>=nCopy ) break; |
| 18202 } |
| 18203 } |
| 18204 iOff += nCopy; |
| 18205 } |
| 18206 } |
| 18207 |
| 18208 /* TODO2: Doclist terminator written here. */ |
| 18209 /* pBuf->p[pBuf->n++] = '\0'; */ |
| 18210 assert( pBuf->n<=pBuf->nSpace ); |
| 18211 sqlite3Fts5HashScanNext(pHash); |
| 18212 } |
| 18213 sqlite3Fts5HashClear(pHash); |
| 18214 fts5WriteFinish(p, &writer, &pgnoLast); |
| 18215 |
| 18216 /* Update the Fts5Structure. It is written back to the database by the |
| 18217 ** fts5StructureWrite() call below. */ |
| 18218 if( pStruct->nLevel==0 ){ |
| 18219 fts5StructureAddLevel(&p->rc, &pStruct); |
| 18220 } |
| 18221 fts5StructureExtendLevel(&p->rc, pStruct, 0, 1, 0); |
| 18222 if( p->rc==SQLITE_OK ){ |
| 18223 pSeg = &pStruct->aLevel[0].aSeg[ pStruct->aLevel[0].nSeg++ ]; |
| 18224 pSeg->iSegid = iSegid; |
| 18225 pSeg->pgnoFirst = 1; |
| 18226 pSeg->pgnoLast = pgnoLast; |
| 18227 pStruct->nSegment++; |
| 18228 } |
| 18229 fts5StructurePromote(p, 0, pStruct); |
| 18230 } |
| 18231 |
| 18232 fts5IndexAutomerge(p, &pStruct, pgnoLast); |
| 18233 fts5IndexCrisismerge(p, &pStruct); |
| 18234 fts5StructureWrite(p, pStruct); |
| 18235 fts5StructureRelease(pStruct); |
| 18236 } |
| 18237 |
| 18238 /* |
| 18239 ** Flush any data stored in the in-memory hash tables to the database. |
| 18240 */ |
| 18241 static void fts5IndexFlush(Fts5Index *p){ |
| 18242 /* Unless it is empty, flush the hash table to disk */ |
| 18243 if( p->nPendingData ){ |
| 18244 assert( p->pHash ); |
| 18245 p->nPendingData = 0; |
| 18246 fts5FlushOneHash(p); |
| 18247 } |
| 18248 } |
| 18249 |
| 18250 |
| 18251 static int sqlite3Fts5IndexOptimize(Fts5Index *p){ |
| 18252 Fts5Structure *pStruct; |
| 18253 Fts5Structure *pNew = 0; |
| 18254 int nSeg = 0; |
| 18255 |
| 18256 assert( p->rc==SQLITE_OK ); |
| 18257 fts5IndexFlush(p); |
| 18258 pStruct = fts5StructureRead(p); |
| 18259 |
| 18260 if( pStruct ){ |
| 18261 assert( pStruct->nSegment==fts5StructureCountSegments(pStruct) ); |
| 18262 nSeg = pStruct->nSegment; |
| 18263 if( nSeg>1 ){ |
| 18264 int nByte = sizeof(Fts5Structure); |
| 18265 nByte += (pStruct->nLevel+1) * sizeof(Fts5StructureLevel); |
| 18266 pNew = (Fts5Structure*)sqlite3Fts5MallocZero(&p->rc, nByte); |
| 18267 } |
| 18268 } |
| 18269 if( pNew ){ |
| 18270 Fts5StructureLevel *pLvl; |
| 18271 int nByte = nSeg * sizeof(Fts5StructureSegment); |
| 18272 pNew->nLevel = pStruct->nLevel+1; |
| 18273 pNew->nRef = 1; |
| 18274 pNew->nWriteCounter = pStruct->nWriteCounter; |
| 18275 pLvl = &pNew->aLevel[pStruct->nLevel]; |
| 18276 pLvl->aSeg = (Fts5StructureSegment*)sqlite3Fts5MallocZero(&p->rc, nByte); |
| 18277 if( pLvl->aSeg ){ |
| 18278 int iLvl, iSeg; |
| 18279 int iSegOut = 0; |
| 18280 for(iLvl=0; iLvl<pStruct->nLevel; iLvl++){ |
| 18281 for(iSeg=0; iSeg<pStruct->aLevel[iLvl].nSeg; iSeg++){ |
| 18282 pLvl->aSeg[iSegOut] = pStruct->aLevel[iLvl].aSeg[iSeg]; |
| 18283 iSegOut++; |
| 18284 } |
| 18285 } |
| 18286 pNew->nSegment = pLvl->nSeg = nSeg; |
| 18287 }else{ |
| 18288 sqlite3_free(pNew); |
| 18289 pNew = 0; |
| 18290 } |
| 18291 } |
| 18292 |
| 18293 if( pNew ){ |
| 18294 int iLvl = pNew->nLevel-1; |
| 18295 while( p->rc==SQLITE_OK && pNew->aLevel[iLvl].nSeg>0 ){ |
| 18296 int nRem = FTS5_OPT_WORK_UNIT; |
| 18297 fts5IndexMergeLevel(p, &pNew, iLvl, &nRem); |
| 18298 } |
| 18299 |
| 18300 fts5StructureWrite(p, pNew); |
| 18301 fts5StructureRelease(pNew); |
| 18302 } |
| 18303 |
| 18304 fts5StructureRelease(pStruct); |
| 18305 return fts5IndexReturn(p); |
| 18306 } |
| 18307 |
| 18308 static int sqlite3Fts5IndexMerge(Fts5Index *p, int nMerge){ |
| 18309 Fts5Structure *pStruct; |
| 18310 |
| 18311 pStruct = fts5StructureRead(p); |
| 18312 if( pStruct && pStruct->nLevel ){ |
| 18313 fts5IndexMerge(p, &pStruct, nMerge); |
| 18314 fts5StructureWrite(p, pStruct); |
| 18315 } |
| 18316 fts5StructureRelease(pStruct); |
| 18317 |
| 18318 return fts5IndexReturn(p); |
| 18319 } |
| 18320 |
| 18321 static void fts5PoslistCallback( |
| 18322 Fts5Index *p, |
| 18323 void *pContext, |
| 18324 const u8 *pChunk, int nChunk |
| 18325 ){ |
| 18326 assert_nc( nChunk>=0 ); |
| 18327 if( nChunk>0 ){ |
| 18328 fts5BufferSafeAppendBlob((Fts5Buffer*)pContext, pChunk, nChunk); |
| 18329 } |
| 18330 } |
| 18331 |
| 18332 typedef struct PoslistCallbackCtx PoslistCallbackCtx; |
| 18333 struct PoslistCallbackCtx { |
| 18334 Fts5Buffer *pBuf; /* Append to this buffer */ |
| 18335 Fts5Colset *pColset; /* Restrict matches to this column */ |
| 18336 int eState; /* 0==skipping column, 1==copying column, 2==column number pending */ |
| 18337 }; |
| 18338 |
| 18339 /* |
| 18340 ** TODO: Make this more efficient! |
| 18341 */ |
| 18342 static int fts5IndexColsetTest(Fts5Colset *pColset, int iCol){ |
| 18343 int i; |
| 18344 for(i=0; i<pColset->nCol; i++){ |
| 18345 if( pColset->aiCol[i]==iCol ) return 1; |
| 18346 } |
| 18347 return 0; |
| 18348 } |
| 18349 |
| 18350 static void fts5PoslistFilterCallback( |
| 18351 Fts5Index *p, |
| 18352 void *pContext, |
| 18353 const u8 *pChunk, int nChunk |
| 18354 ){ |
| 18355 PoslistCallbackCtx *pCtx = (PoslistCallbackCtx*)pContext; |
| 18356 assert_nc( nChunk>=0 ); |
| 18357 if( nChunk>0 ){ |
| 18358 /* Search through to find the first varint with value 1. This is the |
| 18359 ** start of the next column's hits. */ |
| 18360 int i = 0; |
| 18361 int iStart = 0; |
| 18362 |
| 18363 if( pCtx->eState==2 ){ |
| 18364 int iCol; |
| 18365 fts5FastGetVarint32(pChunk, i, iCol); |
| 18366 if( fts5IndexColsetTest(pCtx->pColset, iCol) ){ |
| 18367 pCtx->eState = 1; |
| 18368 fts5BufferSafeAppendVarint(pCtx->pBuf, 1); |
| 18369 }else{ |
| 18370 pCtx->eState = 0; |
| 18371 } |
| 18372 } |
| 18373 |
| 18374 do { |
| 18375 while( i<nChunk && pChunk[i]!=0x01 ){ |
| 18376 while( pChunk[i] & 0x80 ) i++; |
| 18377 i++; |
| 18378 } |
| 18379 if( pCtx->eState ){ |
| 18380 fts5BufferSafeAppendBlob(pCtx->pBuf, &pChunk[iStart], i-iStart); |
| 18381 } |
| 18382 if( i<nChunk ){ |
| 18383 int iCol; |
| 18384 iStart = i; |
| 18385 i++; |
| 18386 if( i>=nChunk ){ |
| 18387 pCtx->eState = 2; |
| 18388 }else{ |
| 18389 fts5FastGetVarint32(pChunk, i, iCol); |
| 18390 pCtx->eState = fts5IndexColsetTest(pCtx->pColset, iCol); |
| 18391 if( pCtx->eState ){ |
| 18392 fts5BufferSafeAppendBlob(pCtx->pBuf, &pChunk[iStart], i-iStart); |
| 18393 iStart = i; |
| 18394 } |
| 18395 } |
| 18396 } |
| 18397 }while( i<nChunk ); |
| 18398 } |
| 18399 } |
| 18400 |
| 18401 /* |
| 18402 ** Iterator pIter currently points to a valid entry (not EOF). This |
| 18403 ** function appends the position list data for the current entry to |
| 18404 ** buffer pBuf. It does not make a copy of the position-list size |
| 18405 ** field. |
| 18406 */ |
| 18407 static void fts5SegiterPoslist( |
| 18408 Fts5Index *p, |
| 18409 Fts5SegIter *pSeg, |
| 18410 Fts5Colset *pColset, |
| 18411 Fts5Buffer *pBuf |
| 18412 ){ |
| 18413 if( 0==fts5BufferGrow(&p->rc, pBuf, pSeg->nPos) ){ |
| 18414 if( pColset==0 ){ |
| 18415 fts5ChunkIterate(p, pSeg, (void*)pBuf, fts5PoslistCallback); |
| 18416 }else{ |
| 18417 PoslistCallbackCtx sCtx; |
| 18418 sCtx.pBuf = pBuf; |
| 18419 sCtx.pColset = pColset; |
| 18420 sCtx.eState = fts5IndexColsetTest(pColset, 0); |
| 18421 assert( sCtx.eState==0 || sCtx.eState==1 ); |
| 18422 fts5ChunkIterate(p, pSeg, (void*)&sCtx, fts5PoslistFilterCallback); |
| 18423 } |
| 18424 } |
| 18425 } |
| 18426 |
| 18427 /* |
| 18428 ** IN/OUT parameter (*pa) points to a position list n bytes in size. If |
| 18429 ** the position list contains entries for column iCol, then (*pa) is set |
| 18430 ** to point to the sub-position-list for that column and the number of |
| 18431 ** bytes in it returned. Or, if the argument position list does not |
| 18432 ** contain any entries for column iCol, return 0. |
| 18433 */ |
| 18434 static int fts5IndexExtractCol( |
| 18435 const u8 **pa, /* IN/OUT: Pointer to poslist */ |
| 18436 int n, /* IN: Size of poslist in bytes */ |
| 18437 int iCol /* Column to extract from poslist */ |
| 18438 ){ |
| 18439 int iCurrent = 0; /* Anything before the first 0x01 is col 0 */ |
| 18440 const u8 *p = *pa; |
| 18441 const u8 *pEnd = &p[n]; /* One byte past end of position list */ |
| 18442 u8 prev = 0; |
| 18443 |
| 18444 while( iCol>iCurrent ){ |
| 18445 /* Advance pointer p until it points to pEnd or a 0x01 byte that is |
| 18446 ** not part of a varint */ |
| 18447 while( (prev & 0x80) || *p!=0x01 ){ |
| 18448 prev = *p++; |
| 18449 if( p==pEnd ) return 0; |
| 18450 } |
| 18451 *pa = p++; |
| 18452 p += fts5GetVarint32(p, iCurrent); |
| 18453 } |
| 18454 if( iCol!=iCurrent ) return 0; |
| 18455 |
| 18456 /* Advance pointer p until it points to pEnd or a 0x01 byte that is |
| 18457 ** not part of a varint */ |
| 18458 assert( (prev & 0x80)==0 ); |
| 18459 while( p<pEnd && ((prev & 0x80) || *p!=0x01) ){ |
| 18460 prev = *p++; |
| 18461 } |
| 18462 return p - (*pa); |
| 18463 } |
| 18464 |
| 18465 |
| 18466 /* |
| 18467 ** Iterator pMulti currently points to a valid entry (not EOF). This |
| 18468 ** function appends the following to buffer pBuf: |
| 18469 ** |
| 18470 ** * The varint iDelta, and |
| 18471 ** * the position list that pMulti currently points to, including the size field. |
| 18472 ** |
| 18473 ** If argument pColset is not NULL, then the position list is filtered |
| 18474 ** according to pColset before being appended to the buffer. If this means |
| 18475 ** there are no entries in the position list, nothing is appended to the |
| 18476 ** buffer (not even iDelta) and 1 is returned. Otherwise, 0 is returned. |
| 18477 ** |
| 18478 ** If an error occurs, an error code is left in p->rc. |
| 18479 */ |
| 18480 static int fts5AppendPoslist( |
| 18481 Fts5Index *p, |
| 18482 i64 iDelta, |
| 18483 Fts5IndexIter *pMulti, |
| 18484 Fts5Colset *pColset, |
| 18485 Fts5Buffer *pBuf |
| 18486 ){ |
| 18487 if( p->rc==SQLITE_OK ){ |
| 18488 Fts5SegIter *pSeg = &pMulti->aSeg[ pMulti->aFirst[1].iFirst ]; |
| 18489 assert( fts5MultiIterEof(p, pMulti)==0 ); |
| 18490 assert( pSeg->nPos>0 ); |
| 18491 if( 0==fts5BufferGrow(&p->rc, pBuf, pSeg->nPos+9+9) ){ |
| 18492 |
| 18493 if( pSeg->iLeafOffset+pSeg->nPos<=pSeg->pLeaf->szLeaf |
| 18494 && (pColset==0 || pColset->nCol==1) |
| 18495 ){ |
| 18496 const u8 *pPos = &pSeg->pLeaf->p[pSeg->iLeafOffset]; |
| 18497 int nPos; |
| 18498 if( pColset ){ |
| 18499 nPos = fts5IndexExtractCol(&pPos, pSeg->nPos, pColset->aiCol[0]); |
| 18500 if( nPos==0 ) return 1; |
| 18501 }else{ |
| 18502 nPos = pSeg->nPos; |
| 18503 } |
| 18504 assert( nPos>0 ); |
| 18505 fts5BufferSafeAppendVarint(pBuf, iDelta); |
| 18506 fts5BufferSafeAppendVarint(pBuf, nPos*2); |
| 18507 fts5BufferSafeAppendBlob(pBuf, pPos, nPos); |
| 18508 }else{ |
| 18509 int iSv1; |
| 18510 int iSv2; |
| 18511 int iData; |
| 18512 |
| 18513 /* Append iDelta */ |
| 18514 iSv1 = pBuf->n; |
| 18515 fts5BufferSafeAppendVarint(pBuf, iDelta); |
| 18516 |
| 18517 /* WRITEPOSLISTSIZE */ |
| 18518 iSv2 = pBuf->n; |
| 18519 fts5BufferSafeAppendVarint(pBuf, pSeg->nPos*2); |
| 18520 iData = pBuf->n; |
| 18521 |
| 18522 fts5SegiterPoslist(p, pSeg, pColset, pBuf); |
| 18523 |
| 18524 if( pColset ){ |
| 18525 int nActual = pBuf->n - iData; |
| 18526 if( nActual!=pSeg->nPos ){ |
| 18527 if( nActual==0 ){ |
| 18528 pBuf->n = iSv1; |
| 18529 return 1; |
| 18530 }else{ |
| 18531 int nReq = sqlite3Fts5GetVarintLen((u32)(nActual*2)); |
| 18532 while( iSv2<(iData-nReq) ){ pBuf->p[iSv2++] = 0x80; } |
| 18533 sqlite3Fts5PutVarint(&pBuf->p[iSv2], nActual*2); |
| 18534 } |
| 18535 } |
| 18536 } |
| 18537 } |
| 18538 |
| 18539 } |
| 18540 } |
| 18541 |
| 18542 return 0; |
| 18543 } |
| 18544 |
| 18545 static void fts5DoclistIterNext(Fts5DoclistIter *pIter){ |
| 18546 u8 *p = pIter->aPoslist + pIter->nSize + pIter->nPoslist; |
| 18547 |
| 18548 assert( pIter->aPoslist ); |
| 18549 if( p>=pIter->aEof ){ |
| 18550 pIter->aPoslist = 0; |
| 18551 }else{ |
| 18552 i64 iDelta; |
| 18553 |
| 18554 p += fts5GetVarint(p, (u64*)&iDelta); |
| 18555 pIter->iRowid += iDelta; |
| 18556 |
| 18557 /* Read position list size */ |
| 18558 if( p[0] & 0x80 ){ |
| 18559 int nPos; |
| 18560 pIter->nSize = fts5GetVarint32(p, nPos); |
| 18561 pIter->nPoslist = (nPos>>1); |
| 18562 }else{ |
| 18563 pIter->nPoslist = ((int)(p[0])) >> 1; |
| 18564 pIter->nSize = 1; |
| 18565 } |
| 18566 |
| 18567 pIter->aPoslist = p; |
| 18568 } |
| 18569 } |
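
The position-list size field that fts5DoclistIterNext() reads above is written at the WRITEPOSLISTSIZE sites as `nPos*2 + bDel`, so the byte count sits in the upper bits and a delete flag in the lowest bit. The following sketch shows the packing and unpacking on plain integers. The helper names are hypothetical, and the varint encoding of the packed value is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/*
** Pack a position-list byte count and a delete flag into one size-field
** value, as done at the WRITEPOSLISTSIZE sites (nPos*2 + bDel).
*/
static int poslist_size_encode(int nPos, int bDel){
  assert( nPos>=0 );
  return nPos*2 + (bDel ? 1 : 0);
}

/*
** Reverse of poslist_size_encode(). The shift matches the ">>1" used by
** fts5DoclistIterNext(); the low bit carries the delete flag.
*/
static void poslist_size_decode(int field, int *pnPos, int *pbDel){
  assert( field>=0 && pnPos!=NULL && pbDel!=NULL );
  *pnPos = field >> 1;
  *pbDel = field & 0x01;
}
```

A field value of 15 therefore describes a 7-byte position list on an entry with the delete flag set, and a value of 1 describes an empty position list with the flag set.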
| 18570 |
| 18571 static void fts5DoclistIterInit( |
| 18572 Fts5Buffer *pBuf, |
| 18573 Fts5DoclistIter *pIter |
| 18574 ){ |
| 18575 memset(pIter, 0, sizeof(*pIter)); |
| 18576 pIter->aPoslist = pBuf->p; |
| 18577 pIter->aEof = &pBuf->p[pBuf->n]; |
| 18578 fts5DoclistIterNext(pIter); |
| 18579 } |
| 18580 |
| 18581 #if 0 |
| 18582 /* |
| 18583 ** Append a doclist to buffer pBuf. |
| 18584 ** |
| 18585 ** This function assumes that space within the buffer has already been |
| 18586 ** allocated. |
| 18587 */ |
| 18588 static void fts5MergeAppendDocid( |
| 18589 Fts5Buffer *pBuf, /* Buffer to write to */ |
| 18590 i64 *piLastRowid, /* IN/OUT: Previous rowid written (if any) */ |
| 18591 i64 iRowid /* Rowid to append */ |
| 18592 ){ |
| 18593 assert( pBuf->n!=0 || (*piLastRowid)==0 ); |
| 18594 fts5BufferSafeAppendVarint(pBuf, iRowid - *piLastRowid); |
| 18595 *piLastRowid = iRowid; |
| 18596 } |
| 18597 #endif |
| 18598 |
| 18599 #define fts5MergeAppendDocid(pBuf, iLastRowid, iRowid) { \ |
| 18600 assert( (pBuf)->n!=0 || (iLastRowid)==0 ); \ |
| 18601 fts5BufferSafeAppendVarint((pBuf), (iRowid) - (iLastRowid)); \ |
| 18602 (iLastRowid) = (iRowid); \ |
| 18603 } |
| 18604 |
| 18605 /* |
| 18606 ** Buffers p1 and p2 contain doclists. This function merges the content |
| 18607 ** of the two doclists together and sets buffer p1 to the result before |
| 18608 ** returning. |
| 18609 ** |
| 18610 ** If an error occurs, an error code is left in p->rc. If an error has |
| 18611 ** already occurred, this function is a no-op. |
| 18612 */ |
| 18613 static void fts5MergePrefixLists( |
| 18614 Fts5Index *p, /* FTS5 backend object */ |
| 18615 Fts5Buffer *p1, /* First list to merge */ |
| 18616 Fts5Buffer *p2 /* Second list to merge */ |
| 18617 ){ |
| 18618 if( p2->n ){ |
| 18619 i64 iLastRowid = 0; |
| 18620 Fts5DoclistIter i1; |
| 18621 Fts5DoclistIter i2; |
| 18622 Fts5Buffer out; |
| 18623 Fts5Buffer tmp; |
| 18624 memset(&out, 0, sizeof(out)); |
| 18625 memset(&tmp, 0, sizeof(tmp)); |
| 18626 |
| 18627 sqlite3Fts5BufferSize(&p->rc, &out, p1->n + p2->n); |
| 18628 fts5DoclistIterInit(p1, &i1); |
| 18629 fts5DoclistIterInit(p2, &i2); |
| 18630 while( p->rc==SQLITE_OK && (i1.aPoslist!=0 || i2.aPoslist!=0) ){ |
| 18631 if( i2.aPoslist==0 || (i1.aPoslist && i1.iRowid<i2.iRowid) ){ |
| 18632 /* Copy entry from i1 */ |
| 18633 fts5MergeAppendDocid(&out, iLastRowid, i1.iRowid); |
| 18634 fts5BufferSafeAppendBlob(&out, i1.aPoslist, i1.nPoslist+i1.nSize); |
| 18635 fts5DoclistIterNext(&i1); |
| 18636 } |
| 18637 else if( i1.aPoslist==0 || i2.iRowid!=i1.iRowid ){ |
| 18638 /* Copy entry from i2 */ |
| 18639 fts5MergeAppendDocid(&out, iLastRowid, i2.iRowid); |
| 18640 fts5BufferSafeAppendBlob(&out, i2.aPoslist, i2.nPoslist+i2.nSize); |
| 18641 fts5DoclistIterNext(&i2); |
| 18642 } |
| 18643 else{ |
| 18644 i64 iPos1 = 0; |
| 18645 i64 iPos2 = 0; |
| 18646 int iOff1 = 0; |
| 18647 int iOff2 = 0; |
| 18648 u8 *a1 = &i1.aPoslist[i1.nSize]; |
| 18649 u8 *a2 = &i2.aPoslist[i2.nSize]; |
| 18650 |
| 18651 Fts5PoslistWriter writer; |
| 18652 memset(&writer, 0, sizeof(writer)); |
| 18653 |
| 18654 /* Merge the two position lists. */ |
| 18655 fts5MergeAppendDocid(&out, iLastRowid, i2.iRowid); |
| 18656 fts5BufferZero(&tmp); |
| 18657 |
| 18658 sqlite3Fts5PoslistNext64(a1, i1.nPoslist, &iOff1, &iPos1); |
| 18659 sqlite3Fts5PoslistNext64(a2, i2.nPoslist, &iOff2, &iPos2); |
| 18660 |
| 18661 while( p->rc==SQLITE_OK && (iPos1>=0 || iPos2>=0) ){ |
| 18662 i64 iNew; |
| 18663 if( iPos2<0 || (iPos1>=0 && iPos1<iPos2) ){ |
| 18664 iNew = iPos1; |
| 18665 sqlite3Fts5PoslistNext64(a1, i1.nPoslist, &iOff1, &iPos1); |
| 18666 }else{ |
| 18667 iNew = iPos2; |
| 18668 sqlite3Fts5PoslistNext64(a2, i2.nPoslist, &iOff2, &iPos2); |
| 18669 if( iPos1==iPos2 ){ |
| 18670 sqlite3Fts5PoslistNext64(a1, i1.nPoslist, &iOff1,&iPos1); |
| 18671 } |
| 18672 } |
| 18673 p->rc = sqlite3Fts5PoslistWriterAppend(&tmp, &writer, iNew); |
| 18674 } |
| 18675 |
| 18676 /* WRITEPOSLISTSIZE */ |
| 18677 fts5BufferSafeAppendVarint(&out, tmp.n * 2); |
| 18678 fts5BufferSafeAppendBlob(&out, tmp.p, tmp.n); |
| 18679 fts5DoclistIterNext(&i1); |
| 18680 fts5DoclistIterNext(&i2); |
| 18681 } |
| 18682 } |
| 18683 |
| 18684 fts5BufferSet(&p->rc, p1, out.n, out.p); |
| 18685 fts5BufferFree(&tmp); |
| 18686 fts5BufferFree(&out); |
| 18687 } |
| 18688 } |
| 18689 |
| 18690 static void fts5BufferSwap(Fts5Buffer *p1, Fts5Buffer *p2){ |
| 18691 Fts5Buffer tmp = *p1; |
| 18692 *p1 = *p2; |
| 18693 *p2 = tmp; |
| 18694 } |
| 18695 |
| 18696 static void fts5SetupPrefixIter( |
| 18697 Fts5Index *p, /* Index to read from */ |
| 18698 int bDesc, /* True for "ORDER BY rowid DESC" */ |
| 18699 const u8 *pToken, /* Buffer containing prefix to match */ |
| 18700 int nToken, /* Size of buffer pToken in bytes */ |
| 18701 Fts5Colset *pColset, /* Restrict matches to these columns */ |
| 18702 Fts5IndexIter **ppIter /* OUT: New iterator */ |
| 18703 ){ |
| 18704 Fts5Structure *pStruct; |
| 18705 Fts5Buffer *aBuf; |
| 18706 const int nBuf = 32; |
| 18707 |
| 18708 aBuf = (Fts5Buffer*)fts5IdxMalloc(p, sizeof(Fts5Buffer)*nBuf); |
| 18709 pStruct = fts5StructureRead(p); |
| 18710 |
| 18711 if( aBuf && pStruct ){ |
| 18712 const int flags = FTS5INDEX_QUERY_SCAN; |
| 18713 int i; |
| 18714 i64 iLastRowid = 0; |
| 18715 Fts5IndexIter *p1 = 0; /* Iterator used to gather data from index */ |
| 18716 Fts5Data *pData; |
| 18717 Fts5Buffer doclist; |
| 18718 int bNewTerm = 1; |
| 18719 |
| 18720 memset(&doclist, 0, sizeof(doclist)); |
| 18721 for(fts5MultiIterNew(p, pStruct, 1, flags, pToken, nToken, -1, 0, &p1); |
| 18722 fts5MultiIterEof(p, p1)==0; |
| 18723 fts5MultiIterNext2(p, p1, &bNewTerm) |
| 18724 ){ |
| 18725 i64 iRowid = fts5MultiIterRowid(p1); |
| 18726 int nTerm; |
| 18727 const u8 *pTerm = fts5MultiIterTerm(p1, &nTerm); |
| 18728 assert_nc( memcmp(pToken, pTerm, MIN(nToken, nTerm))<=0 ); |
| 18729 if( bNewTerm ){ |
| 18730 if( nTerm<nToken || memcmp(pToken, pTerm, nToken) ) break; |
| 18731 } |
| 18732 |
| 18733 if( doclist.n>0 && iRowid<=iLastRowid ){ |
| 18734 for(i=0; p->rc==SQLITE_OK && doclist.n; i++){ |
| 18735 assert( i<nBuf ); |
| 18736 if( aBuf[i].n==0 ){ |
| 18737 fts5BufferSwap(&doclist, &aBuf[i]); |
| 18738 fts5BufferZero(&doclist); |
| 18739 }else{ |
| 18740 fts5MergePrefixLists(p, &doclist, &aBuf[i]); |
| 18741 fts5BufferZero(&aBuf[i]); |
| 18742 } |
| 18743 } |
| 18744 iLastRowid = 0; |
| 18745 } |
| 18746 |
| 18747 if( !fts5AppendPoslist(p, iRowid-iLastRowid, p1, pColset, &doclist) ){ |
| 18748 iLastRowid = iRowid; |
| 18749 } |
| 18750 } |
| 18751 |
| 18752 for(i=0; i<nBuf; i++){ |
| 18753 if( p->rc==SQLITE_OK ){ |
| 18754 fts5MergePrefixLists(p, &doclist, &aBuf[i]); |
| 18755 } |
| 18756 fts5BufferFree(&aBuf[i]); |
| 18757 } |
| 18758 fts5MultiIterFree(p, p1); |
| 18759 |
| 18760 pData = fts5IdxMalloc(p, sizeof(Fts5Data) + doclist.n); |
| 18761 if( pData ){ |
| 18762 pData->p = (u8*)&pData[1]; |
| 18763 pData->nn = pData->szLeaf = doclist.n; |
| 18764 memcpy(pData->p, doclist.p, doclist.n); |
| 18765 fts5MultiIterNew2(p, pData, bDesc, ppIter); |
| 18766 } |
| 18767 fts5BufferFree(&doclist); |
| 18768 } |
| 18769 |
| 18770 fts5StructureRelease(pStruct); |
| 18771 sqlite3_free(aBuf); |
| 18772 } |
| 18773 |
| 18774 |
| 18775 /* |
| 18776 ** Indicate that all subsequent calls to sqlite3Fts5IndexWrite() pertain |
| 18777 ** to the document with rowid iRowid. |
| 18778 */ |
| 18779 static int sqlite3Fts5IndexBeginWrite(Fts5Index *p, int bDelete, i64 iRowid){ |
| 18780 assert( p->rc==SQLITE_OK ); |
| 18781 |
| 18782 /* Allocate the hash table if it has not already been allocated */ |
| 18783 if( p->pHash==0 ){ |
| 18784 p->rc = sqlite3Fts5HashNew(&p->pHash, &p->nPendingData); |
| 18785 } |
| 18786 |
| 18787 /* Flush the hash table to disk if required */ |
| 18788 if( iRowid<p->iWriteRowid |
| 18789 || (iRowid==p->iWriteRowid && p->bDelete==0) |
| 18790 || (p->nPendingData > p->pConfig->nHashSize) |
| 18791 ){ |
| 18792 fts5IndexFlush(p); |
| 18793 } |
| 18794 |
| 18795 p->iWriteRowid = iRowid; |
| 18796 p->bDelete = bDelete; |
| 18797 return fts5IndexReturn(p); |
| 18798 } |
| 18799 |
| 18800 /* |
| 18801 ** Commit data to disk. |
| 18802 */ |
| 18803 static int sqlite3Fts5IndexSync(Fts5Index *p, int bCommit){ |
| 18804 assert( p->rc==SQLITE_OK ); |
| 18805 fts5IndexFlush(p); |
| 18806 if( bCommit ) fts5CloseReader(p); |
| 18807 return fts5IndexReturn(p); |
| 18808 } |
| 18809 |
| 18810 /* |
| 18811 ** Discard any data stored in the in-memory hash tables. Do not write it |
| 18812 ** to the database. Additionally, assume that the contents of the %_data |
| 18813 ** table may have changed on disk. So any in-memory caches of %_data |
| 18814 ** records must be invalidated. |
| 18815 */ |
| 18816 static int sqlite3Fts5IndexRollback(Fts5Index *p){ |
| 18817 fts5CloseReader(p); |
| 18818 fts5IndexDiscardData(p); |
| 18819 assert( p->rc==SQLITE_OK ); |
| 18820 return SQLITE_OK; |
| 18821 } |
| 18822 |
| 18823 /* |
| 18824 ** The %_data table is completely empty when this function is called. This |
| 18825 ** function populates it with the initial structure objects for each index, |
| 18826 ** and the initial version of the "averages" record (a zero-byte blob). |
| 18827 */ |
| 18828 static int sqlite3Fts5IndexReinit(Fts5Index *p){ |
| 18829 Fts5Structure s; |
| 18830 memset(&s, 0, sizeof(Fts5Structure)); |
| 18831 fts5DataWrite(p, FTS5_AVERAGES_ROWID, (const u8*)"", 0); |
| 18832 fts5StructureWrite(p, &s); |
| 18833 return fts5IndexReturn(p); |
| 18834 } |
| 18835 |
| 18836 /* |
| 18837 ** Open a new Fts5Index handle. If the bCreate argument is true, create |
| 18838 ** and initialize the underlying %_data table. |
| 18839 ** |
| 18840 ** If successful, set *pp to point to the new object and return SQLITE_OK. |
| 18841 ** Otherwise, set *pp to NULL and return an SQLite error code. |
| 18842 */ |
| 18843 static int sqlite3Fts5IndexOpen( |
| 18844 Fts5Config *pConfig, |
| 18845 int bCreate, |
| 18846 Fts5Index **pp, |
| 18847 char **pzErr |
| 18848 ){ |
| 18849 int rc = SQLITE_OK; |
| 18850 Fts5Index *p; /* New object */ |
| 18851 |
| 18852 *pp = p = (Fts5Index*)sqlite3Fts5MallocZero(&rc, sizeof(Fts5Index)); |
| 18853 if( rc==SQLITE_OK ){ |
| 18854 p->pConfig = pConfig; |
| 18855 p->nWorkUnit = FTS5_WORK_UNIT; |
| 18856 p->zDataTbl = sqlite3Fts5Mprintf(&rc, "%s_data", pConfig->zName); |
| 18857 if( p->zDataTbl && bCreate ){ |
| 18858 rc = sqlite3Fts5CreateTable( |
| 18859 pConfig, "data", "id INTEGER PRIMARY KEY, block BLOB", 0, pzErr |
| 18860 ); |
| 18861 if( rc==SQLITE_OK ){ |
| 18862 rc = sqlite3Fts5CreateTable(pConfig, "idx", |
| 18863 "segid, term, pgno, PRIMARY KEY(segid, term)", |
| 18864 1, pzErr |
| 18865 ); |
| 18866 } |
| 18867 if( rc==SQLITE_OK ){ |
| 18868 rc = sqlite3Fts5IndexReinit(p); |
| 18869 } |
| 18870 } |
| 18871 } |
| 18872 |
| 18873 assert( rc!=SQLITE_OK || p->rc==SQLITE_OK ); |
| 18874 if( rc ){ |
| 18875 sqlite3Fts5IndexClose(p); |
| 18876 *pp = 0; |
| 18877 } |
| 18878 return rc; |
| 18879 } |
| 18880 |
| 18881 /* |
| 18882 ** Close a handle opened by an earlier call to sqlite3Fts5IndexOpen(). |
| 18883 */ |
| 18884 static int sqlite3Fts5IndexClose(Fts5Index *p){ |
| 18885 int rc = SQLITE_OK; |
| 18886 if( p ){ |
| 18887 assert( p->pReader==0 ); |
| 18888 sqlite3_finalize(p->pWriter); |
| 18889 sqlite3_finalize(p->pDeleter); |
| 18890 sqlite3_finalize(p->pIdxWriter); |
| 18891 sqlite3_finalize(p->pIdxDeleter); |
| 18892 sqlite3_finalize(p->pIdxSelect); |
| 18893 sqlite3Fts5HashFree(p->pHash); |
| 18894 sqlite3_free(p->zDataTbl); |
| 18895 sqlite3_free(p); |
| 18896 } |
| 18897 return rc; |
| 18898 } |
| 18899 |
| 18900 /* |
| 18901 ** Argument p points to a buffer containing UTF-8 text that is nByte bytes in |
| 18902 ** size. Return the number of bytes in the nChar character prefix of the |
| 18903 ** buffer, or 0 if there are fewer than nChar characters in total. |
| 18904 */ |
| 18905 static int fts5IndexCharlenToBytelen(const char *p, int nByte, int nChar){ |
| 18906 int n = 0; |
| 18907 int i; |
| 18908 for(i=0; i<nChar; i++){ |
| 18909 if( n>=nByte ) return 0; /* Input contains fewer than nChar chars */ |
| 18910 if( (unsigned char)p[n++]>=0xc0 ){ |
| 18911 while( (p[n] & 0xc0)==0x80 ) n++; |
| 18912 } |
| 18913 } |
| 18914 return n; |
| 18915 } |
| 18916 |
| 18917 /* |
| 18918 ** pIn is a UTF-8 encoded string, nIn bytes in size. Return the number of |
| 18919 ** Unicode characters in the string. |
| 18920 */ |
| 18921 static int fts5IndexCharlen(const char *pIn, int nIn){ |
| 18922 int nChar = 0; |
| 18923 int i = 0; |
| 18924 while( i<nIn ){ |
| 18925 if( (unsigned char)pIn[i++]>=0xc0 ){ |
| 18926 while( i<nIn && (pIn[i] & 0xc0)==0x80 ) i++; |
| 18927 } |
| 18928 nChar++; |
| 18929 } |
| 18930 return nChar; |
| 18931 } |
| 18932 |
| 18933 /* |
| 18934 ** Insert or remove data to or from the index. Each time a document is |
| 18935 ** added to or removed from the index, this function is called one or more |
| 18936 ** times. |
| 18937 ** |
| 18938 ** For an insert, it must be called once for each token in the new document. |
| 18939 ** If the operation is a delete, it must be called (at least) once for each |
| 18940 ** unique token in the document with an iCol value less than zero. The iPos |
| 18941 ** argument is ignored for a delete. |
| 18942 */ |
| 18943 static int sqlite3Fts5IndexWrite( |
| 18944 Fts5Index *p, /* Index to write to */ |
| 18945 int iCol, /* Column token appears in (-ve -> delete) */ |
| 18946 int iPos, /* Position of token within column */ |
| 18947 const char *pToken, int nToken /* Token to add or remove to or from index */ |
| 18948 ){ |
| 18949 int i; /* Used to iterate through indexes */ |
| 18950 int rc = SQLITE_OK; /* Return code */ |
| 18951 Fts5Config *pConfig = p->pConfig; |
| 18952 |
| 18953 assert( p->rc==SQLITE_OK ); |
| 18954 assert( (iCol<0)==p->bDelete ); |
| 18955 |
| 18956 /* Add the entry to the main terms index. */ |
| 18957 rc = sqlite3Fts5HashWrite( |
| 18958 p->pHash, p->iWriteRowid, iCol, iPos, FTS5_MAIN_PREFIX, pToken, nToken |
| 18959 ); |
| 18960 |
| 18961 for(i=0; i<pConfig->nPrefix && rc==SQLITE_OK; i++){ |
| 18962 int nByte = fts5IndexCharlenToBytelen(pToken, nToken, pConfig->aPrefix[i]); |
| 18963 if( nByte ){ |
| 18964 rc = sqlite3Fts5HashWrite(p->pHash, |
| 18965 p->iWriteRowid, iCol, iPos, (char)(FTS5_MAIN_PREFIX+i+1), pToken, |
| 18966 nByte |
| 18967 ); |
| 18968 } |
| 18969 } |
| 18970 |
| 18971 return rc; |
| 18972 } |
| 18973 |
| 18974 /* |
| 18975 ** Open a new iterator to iterate through all rowids that match the |
| 18976 ** specified token or token prefix. |
| 18977 */ |
| 18978 static int sqlite3Fts5IndexQuery( |
| 18979 Fts5Index *p, /* FTS index to query */ |
| 18980 const char *pToken, int nToken, /* Token (or prefix) to query for */ |
| 18981 int flags, /* Mask of FTS5INDEX_QUERY_X flags */ |
| 18982 Fts5Colset *pColset, /* Match these columns only */ |
| 18983 Fts5IndexIter **ppIter /* OUT: New iterator object */ |
| 18984 ){ |
| 18985 Fts5Config *pConfig = p->pConfig; |
| 18986 Fts5IndexIter *pRet = 0; |
| 18987 int iIdx = 0; |
| 18988 Fts5Buffer buf = {0, 0, 0}; |
| 18989 |
| 18990 /* If the QUERY_SCAN flag is set, all other flags must be clear. */ |
| 18991 assert( (flags & FTS5INDEX_QUERY_SCAN)==0 || flags==FTS5INDEX_QUERY_SCAN ); |
| 18992 |
| 18993 if( sqlite3Fts5BufferSize(&p->rc, &buf, nToken+1)==0 ){ |
| 18994 memcpy(&buf.p[1], pToken, nToken); |
| 18995 |
| 18996 #ifdef SQLITE_DEBUG |
| 18997 /* If the QUERY_TEST_NOIDX flag was specified, then this must be a |
| 18998 ** prefix-query. Instead of using a prefix-index (if one exists), |
| 18999 ** evaluate the prefix query using the main FTS index. This is used |
| 19000 ** for internal sanity checking by the integrity-check in debug |
| 19001 ** mode only. */ |
| 19002 if( pConfig->bPrefixIndex==0 || (flags & FTS5INDEX_QUERY_TEST_NOIDX) ){ |
| 19003 assert( flags & FTS5INDEX_QUERY_PREFIX ); |
| 19004 iIdx = 1+pConfig->nPrefix; |
| 19005 }else |
| 19006 #endif |
| 19007 if( flags & FTS5INDEX_QUERY_PREFIX ){ |
| 19008 int nChar = fts5IndexCharlen(pToken, nToken); |
| 19009 for(iIdx=1; iIdx<=pConfig->nPrefix; iIdx++){ |
| 19010 if( pConfig->aPrefix[iIdx-1]==nChar ) break; |
| 19011 } |
| 19012 } |
| 19013 |
| 19014 if( iIdx<=pConfig->nPrefix ){ |
| 19015 Fts5Structure *pStruct = fts5StructureRead(p); |
| 19016 buf.p[0] = (u8)(FTS5_MAIN_PREFIX + iIdx); |
| 19017 if( pStruct ){ |
| 19018 fts5MultiIterNew(p, pStruct, 1, flags, buf.p, nToken+1, -1, 0, &pRet); |
| 19019 fts5StructureRelease(pStruct); |
| 19020 } |
| 19021 }else{ |
| 19022 int bDesc = (flags & FTS5INDEX_QUERY_DESC)!=0; |
| 19023 buf.p[0] = FTS5_MAIN_PREFIX; |
| 19024 fts5SetupPrefixIter(p, bDesc, buf.p, nToken+1, pColset, &pRet); |
| 19025 } |
| 19026 |
| 19027 if( p->rc ){ |
| 19028 sqlite3Fts5IterClose(pRet); |
| 19029 pRet = 0; |
| 19030 fts5CloseReader(p); |
| 19031 } |
| 19032 *ppIter = pRet; |
| 19033 sqlite3Fts5BufferFree(&buf); |
| 19034 } |
| 19035 return fts5IndexReturn(p); |
| 19036 } |
| 19037 |
| 19038 /* |
| 19039 ** Return true if the iterator passed as the only argument is at EOF. |
| 19040 */ |
| 19041 static int sqlite3Fts5IterEof(Fts5IndexIter *pIter){ |
| 19042 assert( pIter->pIndex->rc==SQLITE_OK ); |
| 19043 return pIter->bEof; |
| 19044 } |
| 19045 |
| 19046 /* |
| 19047 ** Move to the next matching rowid. |
| 19048 */ |
| 19049 static int sqlite3Fts5IterNext(Fts5IndexIter *pIter){ |
| 19050 assert( pIter->pIndex->rc==SQLITE_OK ); |
| 19051 fts5MultiIterNext(pIter->pIndex, pIter, 0, 0); |
| 19052 return fts5IndexReturn(pIter->pIndex); |
| 19053 } |
| 19054 |
| 19055 /* |
| 19056 ** Move to the next matching term/rowid. Used by the fts5vocab module. |
| 19057 */ |
| 19058 static int sqlite3Fts5IterNextScan(Fts5IndexIter *pIter){ |
| 19059 Fts5Index *p = pIter->pIndex; |
| 19060 |
| 19061 assert( pIter->pIndex->rc==SQLITE_OK ); |
| 19062 |
| 19063 fts5MultiIterNext(p, pIter, 0, 0); |
| 19064 if( p->rc==SQLITE_OK ){ |
| 19065 Fts5SegIter *pSeg = &pIter->aSeg[ pIter->aFirst[1].iFirst ]; |
| 19066 if( pSeg->pLeaf && pSeg->term.p[0]!=FTS5_MAIN_PREFIX ){ |
| 19067 fts5DataRelease(pSeg->pLeaf); |
| 19068 pSeg->pLeaf = 0; |
| 19069 pIter->bEof = 1; |
| 19070 } |
| 19071 } |
| 19072 |
| 19073 return fts5IndexReturn(pIter->pIndex); |
| 19074 } |
| 19075 |
| 19076 /* |
| 19077 ** Move to the next matching rowid that occurs at or after iMatch. The |
| 19078 ** definition of "at or after" depends on whether this iterator iterates |
| 19079 ** in ascending or descending rowid order. |
| 19080 */ |
| 19081 static int sqlite3Fts5IterNextFrom(Fts5IndexIter *pIter, i64 iMatch){ |
| 19082 fts5MultiIterNextFrom(pIter->pIndex, pIter, iMatch); |
| 19083 return fts5IndexReturn(pIter->pIndex); |
| 19084 } |
| 19085 |
| 19086 /* |
| 19087 ** Return the current rowid. |
| 19088 */ |
| 19089 static i64 sqlite3Fts5IterRowid(Fts5IndexIter *pIter){ |
| 19090 return fts5MultiIterRowid(pIter); |
| 19091 } |
| 19092 |
| 19093 /* |
| 19094 ** Return the current term. |
| 19095 */ |
| 19096 static const char *sqlite3Fts5IterTerm(Fts5IndexIter *pIter, int *pn){ |
| 19097 int n; |
| 19098 const char *z = (const char*)fts5MultiIterTerm(pIter, &n); |
| 19099 *pn = n-1; |
| 19100 return &z[1]; |
| 19101 } |
| 19102 |
| 19103 |
| 19104 static int fts5IndexExtractColset ( |
| 19105 Fts5Colset *pColset, /* Colset to filter on */ |
| 19106 const u8 *pPos, int nPos, /* Position list */ |
| 19107 Fts5Buffer *pBuf /* Output buffer */ |
| 19108 ){ |
| 19109 int rc = SQLITE_OK; |
| 19110 int i; |
| 19111 |
| 19112 fts5BufferZero(pBuf); |
| 19113 for(i=0; i<pColset->nCol; i++){ |
| 19114 const u8 *pSub = pPos; |
| 19115 int nSub = fts5IndexExtractCol(&pSub, nPos, pColset->aiCol[i]); |
| 19116 if( nSub ){ |
| 19117 fts5BufferAppendBlob(&rc, pBuf, nSub, pSub); |
| 19118 } |
| 19119 } |
| 19120 return rc; |
| 19121 } |
| 19122 |
| 19123 |
| 19124 /* |
| 19125 ** Return a pointer to a buffer containing a copy of the position list for |
| 19126 ** the current entry. Output variable *pn is set to the size of the buffer |
| 19127 ** in bytes before returning. |
| 19128 ** |
| 19129 ** The returned position list does not include the "number of bytes" varint |
| 19130 ** field that starts the position list on disk. |
| 19131 */ |
| 19132 static int sqlite3Fts5IterPoslist( |
| 19133 Fts5IndexIter *pIter, |
| 19134 Fts5Colset *pColset, /* Column filter (or NULL) */ |
| 19135 const u8 **pp, /* OUT: Pointer to position-list data */ |
| 19136 int *pn, /* OUT: Size of position-list in bytes */ |
| 19137 i64 *piRowid /* OUT: Current rowid */ |
| 19138 ){ |
| 19139 Fts5SegIter *pSeg = &pIter->aSeg[ pIter->aFirst[1].iFirst ]; |
| 19140 assert( pIter->pIndex->rc==SQLITE_OK ); |
| 19141 *piRowid = pSeg->iRowid; |
| 19142 if( pSeg->iLeafOffset+pSeg->nPos<=pSeg->pLeaf->szLeaf ){ |
| 19143 u8 *pPos = &pSeg->pLeaf->p[pSeg->iLeafOffset]; |
| 19144 if( pColset==0 || pIter->bFiltered ){ |
| 19145 *pn = pSeg->nPos; |
| 19146 *pp = pPos; |
| 19147 }else if( pColset->nCol==1 ){ |
| 19148 *pp = pPos; |
| 19149 *pn = fts5IndexExtractCol(pp, pSeg->nPos, pColset->aiCol[0]); |
| 19150 }else{ |
| 19151 fts5BufferZero(&pIter->poslist); |
| 19152 fts5IndexExtractColset(pColset, pPos, pSeg->nPos, &pIter->poslist); |
| 19153 *pp = pIter->poslist.p; |
| 19154 *pn = pIter->poslist.n; |
| 19155 } |
| 19156 }else{ |
| 19157 fts5BufferZero(&pIter->poslist); |
| 19158 fts5SegiterPoslist(pIter->pIndex, pSeg, pColset, &pIter->poslist); |
| 19159 *pp = pIter->poslist.p; |
| 19160 *pn = pIter->poslist.n; |
| 19161 } |
| 19162 return fts5IndexReturn(pIter->pIndex); |
| 19163 } |
| 19164 |
| 19165 /* |
| 19166 ** This function is similar to sqlite3Fts5IterPoslist(), except that it |
| 19167 ** copies the position list into the buffer supplied as the second |
| 19168 ** argument. |
| 19169 */ |
| 19170 static int sqlite3Fts5IterPoslistBuffer(Fts5IndexIter *pIter, Fts5Buffer *pBuf){ |
| 19171 Fts5Index *p = pIter->pIndex; |
| 19172 Fts5SegIter *pSeg = &pIter->aSeg[ pIter->aFirst[1].iFirst ]; |
| 19173 assert( p->rc==SQLITE_OK ); |
| 19174 fts5BufferZero(pBuf); |
| 19175 fts5SegiterPoslist(p, pSeg, 0, pBuf); |
| 19176 return fts5IndexReturn(p); |
| 19177 } |
| 19178 |
| 19179 /* |
| 19180 ** Close an iterator opened by an earlier call to sqlite3Fts5IndexQuery(). |
| 19181 */ |
| 19182 static void sqlite3Fts5IterClose(Fts5IndexIter *pIter){ |
| 19183 if( pIter ){ |
| 19184 Fts5Index *pIndex = pIter->pIndex; |
| 19185 fts5MultiIterFree(pIter->pIndex, pIter); |
| 19186 fts5CloseReader(pIndex); |
| 19187 } |
| 19188 } |
| 19189 |
| 19190 /* |
| 19191 ** Read and decode the "averages" record from the database. |
| 19192 ** |
| 19193 ** Parameter anSize must point to an array of size nCol, where nCol is |
| 19194 ** the number of user defined columns in the FTS table. |
| 19195 */ |
| 19196 static int sqlite3Fts5IndexGetAverages(Fts5Index *p, i64 *pnRow, i64 *anSize){ |
| 19197 int nCol = p->pConfig->nCol; |
| 19198 Fts5Data *pData; |
| 19199 |
| 19200 *pnRow = 0; |
| 19201 memset(anSize, 0, sizeof(i64) * nCol); |
| 19202 pData = fts5DataRead(p, FTS5_AVERAGES_ROWID); |
| 19203 if( p->rc==SQLITE_OK && pData->nn ){ |
| 19204 int i = 0; |
| 19205 int iCol; |
| 19206 i += fts5GetVarint(&pData->p[i], (u64*)pnRow); |
| 19207 for(iCol=0; i<pData->nn && iCol<nCol; iCol++){ |
| 19208 i += fts5GetVarint(&pData->p[i], (u64*)&anSize[iCol]); |
| 19209 } |
| 19210 } |
| 19211 |
| 19212 fts5DataRelease(pData); |
| 19213 return fts5IndexReturn(p); |
| 19214 } |
| 19215 |
| 19216 /* |
| 19217 ** Replace the current "averages" record with the contents of the buffer |
| 19218 ** supplied as the second argument. |
| 19219 */ |
| 19220 static int sqlite3Fts5IndexSetAverages(Fts5Index *p, const u8 *pData, int nData){ |
| 19221 assert( p->rc==SQLITE_OK ); |
| 19222 fts5DataWrite(p, FTS5_AVERAGES_ROWID, pData, nData); |
| 19223 return fts5IndexReturn(p); |
| 19224 } |
| 19225 |
| 19226 /* |
| 19227 ** Return the total number of blocks this module has read from the %_data |
| 19228 ** table since it was created. |
| 19229 */ |
| 19230 static int sqlite3Fts5IndexReads(Fts5Index *p){ |
| 19231 return p->nRead; |
| 19232 } |
| 19233 |
| 19234 /* |
| 19235 ** Set the 32-bit cookie value stored at the start of all structure |
| 19236 ** records to the value passed as the second argument. |
| 19237 ** |
| 19238 ** Return SQLITE_OK if successful, or an SQLite error code if an error |
| 19239 ** occurs. |
| 19240 */ |
| 19241 static int sqlite3Fts5IndexSetCookie(Fts5Index *p, int iNew){ |
| 19242 int rc; /* Return code */ |
| 19243 Fts5Config *pConfig = p->pConfig; /* Configuration object */ |
| 19244 u8 aCookie[4]; /* Binary representation of iNew */ |
| 19245 sqlite3_blob *pBlob = 0; |
| 19246 |
| 19247 assert( p->rc==SQLITE_OK ); |
| 19248 sqlite3Fts5Put32(aCookie, iNew); |
| 19249 |
| 19250 rc = sqlite3_blob_open(pConfig->db, pConfig->zDb, p->zDataTbl, |
| 19251 "block", FTS5_STRUCTURE_ROWID, 1, &pBlob |
| 19252 ); |
| 19253 if( rc==SQLITE_OK ){ |
| 19254 sqlite3_blob_write(pBlob, aCookie, 4, 0); |
| 19255 rc = sqlite3_blob_close(pBlob); |
| 19256 } |
| 19257 |
| 19258 return rc; |
| 19259 } |
| 19260 |
| 19261 static int sqlite3Fts5IndexLoadConfig(Fts5Index *p){ |
| 19262 Fts5Structure *pStruct; |
| 19263 pStruct = fts5StructureRead(p); |
| 19264 fts5StructureRelease(pStruct); |
| 19265 return fts5IndexReturn(p); |
| 19266 } |
| 19267 |
| 19268 |
| 19269 /************************************************************************* |
| 19270 ************************************************************************** |
| 19271 ** Below this point is the implementation of the integrity-check |
| 19272 ** functionality. |
| 19273 */ |
| 19274 |
| 19275 /* |
| 19276 ** Return a simple checksum value based on the arguments. |
| 19277 */ |
| 19278 static u64 fts5IndexEntryCksum( |
| 19279 i64 iRowid, |
| 19280 int iCol, |
| 19281 int iPos, |
| 19282 int iIdx, |
| 19283 const char *pTerm, |
| 19284 int nTerm |
| 19285 ){ |
| 19286 int i; |
| 19287 u64 ret = iRowid; |
| 19288 ret += (ret<<3) + iCol; |
| 19289 ret += (ret<<3) + iPos; |
| 19290 if( iIdx>=0 ) ret += (ret<<3) + (FTS5_MAIN_PREFIX + iIdx); |
| 19291 for(i=0; i<nTerm; i++) ret += (ret<<3) + pTerm[i]; |
| 19292 return ret; |
| 19293 } |
| 19294 |
| 19295 #ifdef SQLITE_DEBUG |
| 19296 /* |
| 19297 ** This function is purely an internal test. It does not contribute to |
| 19298 ** FTS functionality, or even the integrity-check, in any way. |
| 19299 ** |
| 19300 ** Instead, it tests that the same set of pgno/rowid combinations are |
| 19301 ** visited regardless of whether the doclist-index identified by parameters |
| 19302 ** iSegid/iLeaf is iterated in forwards or reverse order. |
| 19303 */ |
| 19304 static void fts5TestDlidxReverse( |
| 19305 Fts5Index *p, |
| 19306 int iSegid, /* Segment id to load from */ |
| 19307 int iLeaf /* Load doclist-index for this leaf */ |
| 19308 ){ |
| 19309 Fts5DlidxIter *pDlidx = 0; |
| 19310 u64 cksum1 = 13; |
| 19311 u64 cksum2 = 13; |
| 19312 |
| 19313 for(pDlidx=fts5DlidxIterInit(p, 0, iSegid, iLeaf); |
| 19314 fts5DlidxIterEof(p, pDlidx)==0; |
| 19315 fts5DlidxIterNext(p, pDlidx) |
| 19316 ){ |
| 19317 i64 iRowid = fts5DlidxIterRowid(pDlidx); |
| 19318 int pgno = fts5DlidxIterPgno(pDlidx); |
| 19319 assert( pgno>iLeaf ); |
| 19320 cksum1 += iRowid + ((i64)pgno<<32); |
| 19321 } |
| 19322 fts5DlidxIterFree(pDlidx); |
| 19323 pDlidx = 0; |
| 19324 |
| 19325 for(pDlidx=fts5DlidxIterInit(p, 1, iSegid, iLeaf); |
| 19326 fts5DlidxIterEof(p, pDlidx)==0; |
| 19327 fts5DlidxIterPrev(p, pDlidx) |
| 19328 ){ |
| 19329 i64 iRowid = fts5DlidxIterRowid(pDlidx); |
| 19330 int pgno = fts5DlidxIterPgno(pDlidx); |
| 19331 assert( fts5DlidxIterPgno(pDlidx)>iLeaf ); |
| 19332 cksum2 += iRowid + ((i64)pgno<<32); |
| 19333 } |
| 19334 fts5DlidxIterFree(pDlidx); |
| 19335 pDlidx = 0; |
| 19336 |
| 19337 if( p->rc==SQLITE_OK && cksum1!=cksum2 ) p->rc = FTS5_CORRUPT; |
| 19338 } |
| 19339 |
| 19340 static int fts5QueryCksum( |
| 19341 Fts5Index *p, /* Fts5 index object */ |
| 19342 int iIdx, |
| 19343 const char *z, /* Index key to query for */ |
| 19344 int n, /* Size of index key in bytes */ |
| 19345 int flags, /* Flags for Fts5IndexQuery */ |
| 19346 u64 *pCksum /* IN/OUT: Checksum value */ |
| 19347 ){ |
| 19348 u64 cksum = *pCksum; |
| 19349 Fts5IndexIter *pIdxIter = 0; |
| 19350 int rc = sqlite3Fts5IndexQuery(p, z, n, flags, 0, &pIdxIter); |
| 19351 |
| 19352 while( rc==SQLITE_OK && 0==sqlite3Fts5IterEof(pIdxIter) ){ |
| 19353 i64 dummy; |
| 19354 const u8 *pPos; |
| 19355 int nPos; |
| 19356 i64 rowid = sqlite3Fts5IterRowid(pIdxIter); |
| 19357 rc = sqlite3Fts5IterPoslist(pIdxIter, 0, &pPos, &nPos, &dummy); |
| 19358 if( rc==SQLITE_OK ){ |
| 19359 Fts5PoslistReader sReader; |
| 19360 for(sqlite3Fts5PoslistReaderInit(pPos, nPos, &sReader); |
| 19361 sReader.bEof==0; |
| 19362 sqlite3Fts5PoslistReaderNext(&sReader) |
| 19363 ){ |
| 19364 int iCol = FTS5_POS2COLUMN(sReader.iPos); |
| 19365 int iOff = FTS5_POS2OFFSET(sReader.iPos); |
| 19366 cksum ^= fts5IndexEntryCksum(rowid, iCol, iOff, iIdx, z, n); |
| 19367 } |
| 19368 rc = sqlite3Fts5IterNext(pIdxIter); |
| 19369 } |
| 19370 } |
| 19371 sqlite3Fts5IterClose(pIdxIter); |
| 19372 |
| 19373 *pCksum = cksum; |
| 19374 return rc; |
| 19375 } |
| 19376 |
| 19377 |
| 19378 /* |
| 19379 ** This function is also purely an internal test. It does not contribute to |
| 19380 ** FTS functionality, or even the integrity-check, in any way. |
| 19381 */ |
| 19382 static void fts5TestTerm( |
| 19383 Fts5Index *p, |
| 19384 Fts5Buffer *pPrev, /* Previous term */ |
| 19385 const char *z, int n, /* Possibly new term to test */ |
| 19386 u64 expected, |
| 19387 u64 *pCksum |
| 19388 ){ |
| 19389 int rc = p->rc; |
| 19390 if( pPrev->n==0 ){ |
| 19391 fts5BufferSet(&rc, pPrev, n, (const u8*)z); |
| 19392 }else |
| 19393 if( rc==SQLITE_OK && (pPrev->n!=n || memcmp(pPrev->p, z, n)) ){ |
| 19394 u64 cksum3 = *pCksum; |
| 19395 const char *zTerm = (const char*)&pPrev->p[1]; /* term sans prefix-byte */ |
| 19396 int nTerm = pPrev->n-1; /* Size of zTerm in bytes */ |
| 19397 int iIdx = (pPrev->p[0] - FTS5_MAIN_PREFIX); |
| 19398 int flags = (iIdx==0 ? 0 : FTS5INDEX_QUERY_PREFIX); |
| 19399 u64 ck1 = 0; |
| 19400 u64 ck2 = 0; |
| 19401 |
| 19402 /* Check that the results returned for ASC and DESC queries are |
| 19403 ** the same. If not, call this corruption. */ |
| 19404 rc = fts5QueryCksum(p, iIdx, zTerm, nTerm, flags, &ck1); |
| 19405 if( rc==SQLITE_OK ){ |
| 19406 int f = flags|FTS5INDEX_QUERY_DESC; |
| 19407 rc = fts5QueryCksum(p, iIdx, zTerm, nTerm, f, &ck2); |
| 19408 } |
| 19409 if( rc==SQLITE_OK && ck1!=ck2 ) rc = FTS5_CORRUPT; |
| 19410 |
| 19411 /* If this is a prefix query, check that the results returned when the |
| 19412 ** prefix index is disabled are the same, in both ASC and DESC order. |
| 19413 ** |
| 19414 ** This check may only be performed if the hash table is empty. This |
| 19415 ** is because the hash table only supports a single scan query at |
| 19416 ** a time, and the multi-iter loop from which this function is called |
| 19417 ** is already performing such a scan. */ |
| 19418 if( p->nPendingData==0 ){ |
| 19419 if( iIdx>0 && rc==SQLITE_OK ){ |
| 19420 int f = flags|FTS5INDEX_QUERY_TEST_NOIDX; |
| 19421 ck2 = 0; |
| 19422 rc = fts5QueryCksum(p, iIdx, zTerm, nTerm, f, &ck2); |
| 19423 if( rc==SQLITE_OK && ck1!=ck2 ) rc = FTS5_CORRUPT; |
| 19424 } |
| 19425 if( iIdx>0 && rc==SQLITE_OK ){ |
| 19426 int f = flags|FTS5INDEX_QUERY_TEST_NOIDX|FTS5INDEX_QUERY_DESC; |
| 19427 ck2 = 0; |
| 19428 rc = fts5QueryCksum(p, iIdx, zTerm, nTerm, f, &ck2); |
| 19429 if( rc==SQLITE_OK && ck1!=ck2 ) rc = FTS5_CORRUPT; |
| 19430 } |
| 19431 } |
| 19432 |
| 19433 cksum3 ^= ck1; |
| 19434 fts5BufferSet(&rc, pPrev, n, (const u8*)z); |
| 19435 |
| 19436 if( rc==SQLITE_OK && cksum3!=expected ){ |
| 19437 rc = FTS5_CORRUPT; |
| 19438 } |
| 19439 *pCksum = cksum3; |
| 19440 } |
| 19441 p->rc = rc; |
| 19442 } |
| 19443 |
| 19444 #else |
| 19445 # define fts5TestDlidxReverse(x,y,z) |
| 19446 # define fts5TestTerm(u,v,w,x,y,z) |
| 19447 #endif |
| 19448 |
| 19449 /* |
| 19450 ** Check that: |
| 19451 ** |
| 19452 ** 1) All leaves of pSeg between iFirst and iLast (inclusive) exist and |
| 19453 ** contain zero terms. |
| 19454 ** 2) All leaves of pSeg between iNoRowid and iLast (inclusive) exist and |
| 19455 ** contain zero rowids. |
| 19456 */ |
| 19457 static void fts5IndexIntegrityCheckEmpty( |
| 19458 Fts5Index *p, |
| 19459 Fts5StructureSegment *pSeg, /* Segment to check internal consistency */ |
| 19460 int iFirst, |
| 19461 int iNoRowid, |
| 19462 int iLast |
| 19463 ){ |
| 19464 int i; |
| 19465 |
| 19466 /* Check that each leaf from iFirst to iLast (a) exists and (b) contains |
| 19467 ** no terms, and that leaves at or after iNoRowid contain no rowids. */ |
| 19468 for(i=iFirst; p->rc==SQLITE_OK && i<=iLast; i++){ |
| 19469 Fts5Data *pLeaf = fts5DataRead(p, FTS5_SEGMENT_ROWID(pSeg->iSegid, i)); |
| 19470 if( pLeaf ){ |
| 19471 if( !fts5LeafIsTermless(pLeaf) ) p->rc = FTS5_CORRUPT; |
| 19472 if( i>=iNoRowid && 0!=fts5LeafFirstRowidOff(pLeaf) ) p->rc = FTS5_CORRUPT; |
| 19473 } |
| 19474 fts5DataRelease(pLeaf); |
| 19475 } |
| 19476 } |
| 19477 |
| 19478 static void fts5IntegrityCheckPgidx(Fts5Index *p, Fts5Data *pLeaf){ |
| 19479 int iTermOff = 0; |
| 19480 int ii; |
| 19481 |
| 19482 Fts5Buffer buf1 = {0,0,0}; |
| 19483 Fts5Buffer buf2 = {0,0,0}; |
| 19484 |
| 19485 ii = pLeaf->szLeaf; |
| 19486 while( ii<pLeaf->nn && p->rc==SQLITE_OK ){ |
| 19487 int res; |
| 19488 int iOff; |
| 19489 int nIncr; |
| 19490 |
| 19491 ii += fts5GetVarint32(&pLeaf->p[ii], nIncr); |
| 19492 iTermOff += nIncr; |
| 19493 iOff = iTermOff; |
| 19494 |
| 19495 if( iOff>=pLeaf->szLeaf ){ |
| 19496 p->rc = FTS5_CORRUPT; |
| 19497 }else if( iTermOff==nIncr ){ |
| 19498 int nByte; |
| 19499 iOff += fts5GetVarint32(&pLeaf->p[iOff], nByte); |
| 19500 if( (iOff+nByte)>pLeaf->szLeaf ){ |
| 19501 p->rc = FTS5_CORRUPT; |
| 19502 }else{ |
| 19503 fts5BufferSet(&p->rc, &buf1, nByte, &pLeaf->p[iOff]); |
| 19504 } |
| 19505 }else{ |
| 19506 int nKeep, nByte; |
| 19507 iOff += fts5GetVarint32(&pLeaf->p[iOff], nKeep); |
| 19508 iOff += fts5GetVarint32(&pLeaf->p[iOff], nByte); |
| 19509 if( nKeep>buf1.n || (iOff+nByte)>pLeaf->szLeaf ){ |
| 19510 p->rc = FTS5_CORRUPT; |
| 19511 }else{ |
| 19512 buf1.n = nKeep; |
| 19513 fts5BufferAppendBlob(&p->rc, &buf1, nByte, &pLeaf->p[iOff]); |
| 19514 } |
| 19515 |
| 19516 if( p->rc==SQLITE_OK ){ |
| 19517 res = fts5BufferCompare(&buf1, &buf2); |
| 19518 if( res<=0 ) p->rc = FTS5_CORRUPT; |
| 19519 } |
| 19520 } |
| 19521 fts5BufferSet(&p->rc, &buf2, buf1.n, buf1.p); |
| 19522 } |
| 19523 |
| 19524 fts5BufferFree(&buf1); |
| 19525 fts5BufferFree(&buf2); |
| 19526 } |
| 19527 |
| 19528 static void fts5IndexIntegrityCheckSegment( |
| 19529 Fts5Index *p, /* FTS5 backend object */ |
| 19530 Fts5StructureSegment *pSeg /* Segment to check internal consistency */ |
| 19531 ){ |
| 19532 Fts5Config *pConfig = p->pConfig; |
| 19533 sqlite3_stmt *pStmt = 0; |
| 19534 int rc2; |
| 19535 int iIdxPrevLeaf = pSeg->pgnoFirst-1; |
| 19536 int iDlidxPrevLeaf = pSeg->pgnoLast; |
| 19537 |
| 19538 if( pSeg->pgnoFirst==0 ) return; |
| 19539 |
| 19540 fts5IndexPrepareStmt(p, &pStmt, sqlite3_mprintf( |
| 19541 "SELECT segid, term, (pgno>>1), (pgno&1) FROM %Q.'%q_idx' WHERE segid=%d", |
| 19542 pConfig->zDb, pConfig->zName, pSeg->iSegid |
| 19543 )); |
| 19544 |
| 19545 /* Iterate through the b-tree hierarchy. */ |
| 19546 while( p->rc==SQLITE_OK && SQLITE_ROW==sqlite3_step(pStmt) ){ |
| 19547 i64 iRow; /* Rowid for this leaf */ |
| 19548 Fts5Data *pLeaf; /* Data for this leaf */ |
| 19549 |
| 19550 int nIdxTerm = sqlite3_column_bytes(pStmt, 1); |
| 19551 const char *zIdxTerm = (const char*)sqlite3_column_text(pStmt, 1); |
| 19552 int iIdxLeaf = sqlite3_column_int(pStmt, 2); |
| 19553 int bIdxDlidx = sqlite3_column_int(pStmt, 3); |
| 19554 |
| 19555 /* If the leaf in question has already been trimmed from the segment, |
| 19556 ** ignore this b-tree entry. Otherwise, load it into memory. */ |
| 19557 if( iIdxLeaf<pSeg->pgnoFirst ) continue; |
| 19558 iRow = FTS5_SEGMENT_ROWID(pSeg->iSegid, iIdxLeaf); |
| 19559 pLeaf = fts5DataRead(p, iRow); |
| 19560 if( pLeaf==0 ) break; |
| 19561 |
| 19562 /* Check that the leaf contains at least one term, and that it is equal |
| 19563 ** to or larger than the split-key in zIdxTerm. Also check that if there |
| 19564 ** is a rowid pointer within the leaf page header, it points to a |
| 19565 ** location before the term. */ |
| 19566 if( pLeaf->nn<=pLeaf->szLeaf ){ |
| 19567 p->rc = FTS5_CORRUPT; |
| 19568 }else{ |
| 19569 int iOff; /* Offset of first term on leaf */ |
| 19570 int iRowidOff; /* Offset of first rowid on leaf */ |
| 19571 int nTerm; /* Size of term on leaf in bytes */ |
| 19572 int res; /* Comparison of term and split-key */ |
| 19573 |
| 19574 iOff = fts5LeafFirstTermOff(pLeaf); |
| 19575 iRowidOff = fts5LeafFirstRowidOff(pLeaf); |
| 19576 if( iRowidOff>=iOff ){ |
| 19577 p->rc = FTS5_CORRUPT; |
| 19578 }else{ |
| 19579 iOff += fts5GetVarint32(&pLeaf->p[iOff], nTerm); |
| 19580 res = memcmp(&pLeaf->p[iOff], zIdxTerm, MIN(nTerm, nIdxTerm)); |
| 19581 if( res==0 ) res = nTerm - nIdxTerm; |
| 19582 if( res<0 ) p->rc = FTS5_CORRUPT; |
| 19583 } |
| 19584 |
| 19585 fts5IntegrityCheckPgidx(p, pLeaf); |
| 19586 } |
| 19587 fts5DataRelease(pLeaf); |
| 19588 if( p->rc ) break; |
| 19589 |
| 19590 /* Now check that the leaves between the previous b-tree entry and this |
| 19591 ** one (a) exist and (b) contain no terms. */ |
| 19592 fts5IndexIntegrityCheckEmpty( |
| 19593 p, pSeg, iIdxPrevLeaf+1, iDlidxPrevLeaf+1, iIdxLeaf-1 |
| 19594 ); |
| 19595 if( p->rc ) break; |
| 19596 |
| 19597 /* If there is a doclist-index, check that it looks right. */ |
| 19598 if( bIdxDlidx ){ |
| 19599 Fts5DlidxIter *pDlidx = 0; /* For iterating through doclist index */ |
| 19600 int iPrevLeaf = iIdxLeaf; |
| 19601 int iSegid = pSeg->iSegid; |
| 19602 int iPg = 0; |
| 19603 i64 iKey; |
| 19604 |
| 19605 for(pDlidx=fts5DlidxIterInit(p, 0, iSegid, iIdxLeaf); |
| 19606 fts5DlidxIterEof(p, pDlidx)==0; |
| 19607 fts5DlidxIterNext(p, pDlidx) |
| 19608 ){ |
| 19609 |
| 19610 /* Check any rowid-less pages that occur before the current leaf. */ |
| 19611 for(iPg=iPrevLeaf+1; iPg<fts5DlidxIterPgno(pDlidx); iPg++){ |
| 19612 iKey = FTS5_SEGMENT_ROWID(iSegid, iPg); |
| 19613 pLeaf = fts5DataRead(p, iKey); |
| 19614 if( pLeaf ){ |
| 19615 if( fts5LeafFirstRowidOff(pLeaf)!=0 ) p->rc = FTS5_CORRUPT; |
| 19616 fts5DataRelease(pLeaf); |
| 19617 } |
| 19618 } |
| 19619 iPrevLeaf = fts5DlidxIterPgno(pDlidx); |
| 19620 |
| 19621 /* Check that the leaf page indicated by the iterator really does |
| 19622 ** contain the rowid suggested by the same. */ |
| 19623 iKey = FTS5_SEGMENT_ROWID(iSegid, iPrevLeaf); |
| 19624 pLeaf = fts5DataRead(p, iKey); |
| 19625 if( pLeaf ){ |
| 19626 i64 iRowid; |
| 19627 int iRowidOff = fts5LeafFirstRowidOff(pLeaf); |
| 19628 ASSERT_SZLEAF_OK(pLeaf); |
| 19629 if( iRowidOff>=pLeaf->szLeaf ){ |
| 19630 p->rc = FTS5_CORRUPT; |
| 19631 }else{ |
| 19632 fts5GetVarint(&pLeaf->p[iRowidOff], (u64*)&iRowid); |
| 19633 if( iRowid!=fts5DlidxIterRowid(pDlidx) ) p->rc = FTS5_CORRUPT; |
| 19634 } |
| 19635 fts5DataRelease(pLeaf); |
| 19636 } |
| 19637 } |
| 19638 |
| 19639 iDlidxPrevLeaf = iPg; |
| 19640 fts5DlidxIterFree(pDlidx); |
| 19641 fts5TestDlidxReverse(p, iSegid, iIdxLeaf); |
| 19642 }else{ |
| 19643 iDlidxPrevLeaf = pSeg->pgnoLast; |
| 19644 /* TODO: Check there is no doclist index */ |
| 19645 } |
| 19646 |
| 19647 iIdxPrevLeaf = iIdxLeaf; |
| 19648 } |
| 19649 |
| 19650 rc2 = sqlite3_finalize(pStmt); |
| 19651 if( p->rc==SQLITE_OK ) p->rc = rc2; |
| 19652 |
| 19653 /* Page iter.iLeaf must now be the rightmost leaf-page in the segment */ |
| 19654 #if 0 |
| 19655 if( p->rc==SQLITE_OK && iter.iLeaf!=pSeg->pgnoLast ){ |
| 19656 p->rc = FTS5_CORRUPT; |
| 19657 } |
| 19658 #endif |
| 19659 } |
| 19660 |
| 19661 |
| 19662 /* |
| 19663 ** Run internal checks to ensure that the FTS index (a) is internally |
| 19664 ** consistent and (b) contains entries for which the XOR of the checksums |
| 19665 ** as calculated by fts5IndexEntryCksum() is cksum. |
| 19666 ** |
| 19667 ** Return SQLITE_CORRUPT if any of the internal checks fail, or if the |
| 19668 ** checksum does not match. Return SQLITE_OK if all checks pass without |
| 19669 ** error, or some other SQLite error code if another error (e.g. OOM) |
| 19670 ** occurs. |
| 19671 */ |
| 19672 static int sqlite3Fts5IndexIntegrityCheck(Fts5Index *p, u64 cksum){ |
| 19673 u64 cksum2 = 0; /* Checksum based on contents of indexes */ |
| 19674 Fts5Buffer poslist = {0,0,0}; /* Buffer used to hold a poslist */ |
| 19675 Fts5IndexIter *pIter; /* Used to iterate through entire index */ |
| 19676 Fts5Structure *pStruct; /* Index structure */ |
| 19677 |
| 19678 #ifdef SQLITE_DEBUG |
| 19679 /* Used by extra internal tests that only run if SQLITE_DEBUG is defined */ |
| 19680 u64 cksum3 = 0; /* Checksum based on contents of indexes */ |
| 19681 Fts5Buffer term = {0,0,0}; /* Buffer used to hold most recent term */ |
| 19682 #endif |
| 19683 |
| 19684 /* Load the FTS index structure */ |
| 19685 pStruct = fts5StructureRead(p); |
| 19686 |
| 19687 /* Check that the internal nodes of each segment match the leaves */ |
| 19688 if( pStruct ){ |
| 19689 int iLvl, iSeg; |
| 19690 for(iLvl=0; iLvl<pStruct->nLevel; iLvl++){ |
| 19691 for(iSeg=0; iSeg<pStruct->aLevel[iLvl].nSeg; iSeg++){ |
| 19692 Fts5StructureSegment *pSeg = &pStruct->aLevel[iLvl].aSeg[iSeg]; |
| 19693 fts5IndexIntegrityCheckSegment(p, pSeg); |
| 19694 } |
| 19695 } |
| 19696 } |
| 19697 |
| 19698 /* The cksum argument passed to this function is a checksum calculated |
| 19699 ** based on all expected entries in the FTS index (including prefix index |
| 19700 ** entries). This block checks that a checksum calculated based on the |
| 19701 ** actual contents of FTS index is identical. |
| 19702 ** |
| 19703 ** Two versions of the same checksum are calculated. The first (stack |
| 19704 ** variable cksum2) is based on entries extracted from the full-text index |
| 19705 ** while doing a linear scan of each individual index in turn. |
| 19706 ** |
| 19707 ** As each term is visited by the linear scans, a separate query for the |
| 19708 ** same term is performed. The second (cksum3) is calculated based on the |
| 19709 ** entries extracted by these queries. |
| 19710 */ |
| 19711 for(fts5MultiIterNew(p, pStruct, 0, 0, 0, 0, -1, 0, &pIter); |
| 19712 fts5MultiIterEof(p, pIter)==0; |
| 19713 fts5MultiIterNext(p, pIter, 0, 0) |
| 19714 ){ |
| 19715 int n; /* Size of term in bytes */ |
| 19716 i64 iPos = 0; /* Position read from poslist */ |
| 19717 int iOff = 0; /* Offset within poslist */ |
| 19718 i64 iRowid = fts5MultiIterRowid(pIter); |
| 19719 char *z = (char*)fts5MultiIterTerm(pIter, &n); |
| 19720 |
| 19721 /* If this is a new term, query for it. Update cksum3 with the results. */ |
| 19722 fts5TestTerm(p, &term, z, n, cksum2, &cksum3); |
| 19723 |
| 19724 poslist.n = 0; |
| 19725 fts5SegiterPoslist(p, &pIter->aSeg[pIter->aFirst[1].iFirst] , 0, &poslist); |
| 19726 while( 0==sqlite3Fts5PoslistNext64(poslist.p, poslist.n, &iOff, &iPos) ){ |
| 19727 int iCol = FTS5_POS2COLUMN(iPos); |
| 19728 int iTokOff = FTS5_POS2OFFSET(iPos); |
| 19729 cksum2 ^= fts5IndexEntryCksum(iRowid, iCol, iTokOff, -1, z, n); |
| 19730 } |
| 19731 } |
| 19732 fts5TestTerm(p, &term, 0, 0, cksum2, &cksum3); |
| 19733 |
| 19734 fts5MultiIterFree(p, pIter); |
| 19735 if( p->rc==SQLITE_OK && cksum!=cksum2 ) p->rc = FTS5_CORRUPT; |
| 19736 |
| 19737 fts5StructureRelease(pStruct); |
| 19738 #ifdef SQLITE_DEBUG |
| 19739 fts5BufferFree(&term); |
| 19740 #endif |
| 19741 fts5BufferFree(&poslist); |
| 19742 return fts5IndexReturn(p); |
| 19743 } |
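The check above relies on XOR being commutative and associative: the expected checksum and the one rebuilt from the index agree whenever both cover the same set of entries, regardless of the order in which the entries are visited. A minimal sketch of that property follows. The hash `demoEntryCksum()` and the `DemoEntry` type are hypothetical stand-ins for illustration; they are not `fts5IndexEntryCksum()` or any FTS5 structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical index entry: (rowid, column, position, term). */
typedef struct DemoEntry DemoEntry;
struct DemoEntry {
  int64_t iRowid;
  int iCol;
  int iPos;
  const char *zTerm;
  size_t nTerm;
};

/* Hypothetical per-entry hash (FNV-1a style). NOT fts5IndexEntryCksum(). */
static uint64_t demoEntryCksum(const DemoEntry *p){
  uint64_t h = 14695981039346656037ULL;
  size_t i;
  h = (h ^ (uint64_t)p->iRowid) * 1099511628211ULL;
  h = (h ^ (uint64_t)p->iCol) * 1099511628211ULL;
  h = (h ^ (uint64_t)p->iPos) * 1099511628211ULL;
  for(i=0; i<p->nTerm; i++){
    h = (h ^ (unsigned char)p->zTerm[i]) * 1099511628211ULL;
  }
  return h;
}

/* XOR together the checksums of a[0..n-1], visiting the entries in
** forward order, or in reverse order if bReverse is non-zero. */
static uint64_t demoXorAll(const DemoEntry *a, int n, int bReverse){
  uint64_t cksum = 0;
  int i;
  for(i=0; i<n; i++){
    cksum ^= demoEntryCksum(&a[bReverse ? (n-1-i) : i]);
  }
  return cksum;
}
```

Because the visiting order does not matter, the caller can accumulate the expected value while scanning documents and the integrity check can accumulate its value while scanning terms, and the two totals remain directly comparable.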
| 19744 |
| 19745 |
| 19746 /* |
| 19747 ** Calculate and return a checksum that is the XOR of the index entry |
| 19748 ** checksum of all entries that would be generated by the token specified |
| 19749 ** by the final 5 arguments. |
| 19750 */ |
| 19751 static u64 sqlite3Fts5IndexCksum( |
| 19752 Fts5Config *pConfig, /* Configuration object */ |
| 19753 i64 iRowid, /* Document term appears in */ |
| 19754 int iCol, /* Column term appears in */ |
| 19755 int iPos, /* Position term appears in */ |
| 19756 const char *pTerm, int nTerm /* Term at iPos */ |
| 19757 ){ |
| 19758 u64 ret = 0; /* Return value */ |
| 19759 int iIdx; /* For iterating through indexes */ |
| 19760 |
| 19761 ret = fts5IndexEntryCksum(iRowid, iCol, iPos, 0, pTerm, nTerm); |
| 19762 |
| 19763 for(iIdx=0; iIdx<pConfig->nPrefix; iIdx++){ |
| 19764 int nByte = fts5IndexCharlenToBytelen(pTerm, nTerm, pConfig->aPrefix[iIdx]); |
| 19765 if( nByte ){ |
| 19766 ret ^= fts5IndexEntryCksum(iRowid, iCol, iPos, iIdx+1, pTerm, nByte); |
| 19767 } |
| 19768 } |
| 19769 |
| 19770 return ret; |
| 19771 } |
| 19772 |
| 19773 /************************************************************************* |
| 19774 ************************************************************************** |
| 19775 ** Below this point is the implementation of the fts5_decode() scalar |
| 19776 ** function only. |
| 19777 */ |
| 19778 |
| 19779 /* |
| 19780 ** Decode a segment-data rowid from the %_data table. This function is |
| 19781 ** the opposite of macro FTS5_SEGMENT_ROWID(). |
| 19782 */ |
| 19783 static void fts5DecodeRowid( |
| 19784 i64 iRowid, /* Rowid from %_data table */ |
| 19785 int *piSegid, /* OUT: Segment id */ |
| 19786 int *pbDlidx, /* OUT: Dlidx flag */ |
| 19787 int *piHeight, /* OUT: Height */ |
| 19788 int *piPgno /* OUT: Page number */ |
| 19789 ){ |
| 19790 *piPgno = (int)(iRowid & (((i64)1 << FTS5_DATA_PAGE_B) - 1)); |
| 19791 iRowid >>= FTS5_DATA_PAGE_B; |
| 19792 |
| 19793 *piHeight = (int)(iRowid & (((i64)1 << FTS5_DATA_HEIGHT_B) - 1)); |
| 19794 iRowid >>= FTS5_DATA_HEIGHT_B; |
| 19795 |
| 19796 *pbDlidx = (int)(iRowid & 0x0001); |
| 19797 iRowid >>= FTS5_DATA_DLI_B; |
| 19798 |
| 19799 *piSegid = (int)(iRowid & (((i64)1 << FTS5_DATA_ID_B) - 1)); |
| 19800 } |
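To illustrate the bit layout that fts5DecodeRowid() unpacks, the sketch below packs and unpacks the four fields in the same order (segment id in the high bits, then the dlidx flag, height, and page number). The field widths used here are assumptions chosen for the example; the real widths are the FTS5_DATA_*_B macros, which are defined outside this section. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field widths, for illustration only. */
#define DEMO_ID_B     16
#define DEMO_DLI_B     1
#define DEMO_HEIGHT_B  5
#define DEMO_PAGE_B   31

/* Pack the four fields into a single 64-bit rowid. */
static int64_t demoPackRowid(int iSegid, int bDlidx, int iHeight, int iPgno){
  return ((int64_t)iSegid << (DEMO_PAGE_B + DEMO_HEIGHT_B + DEMO_DLI_B))
       + ((int64_t)bDlidx << (DEMO_PAGE_B + DEMO_HEIGHT_B))
       + ((int64_t)iHeight << DEMO_PAGE_B)
       + (int64_t)iPgno;
}

/* Reverse of demoPackRowid(): peel fields off the low end, one at a time. */
static void demoUnpackRowid(
  int64_t iRowid,
  int *piSegid, int *pbDlidx, int *piHeight, int *piPgno
){
  *piPgno = (int)(iRowid & (((int64_t)1 << DEMO_PAGE_B) - 1));
  iRowid >>= DEMO_PAGE_B;
  *piHeight = (int)(iRowid & (((int64_t)1 << DEMO_HEIGHT_B) - 1));
  iRowid >>= DEMO_HEIGHT_B;
  *pbDlidx = (int)(iRowid & 0x0001);
  iRowid >>= DEMO_DLI_B;
  *piSegid = (int)(iRowid & (((int64_t)1 << DEMO_ID_B) - 1));
}
```

With these assumed widths the packed value uses 53 bits, so it always fits in a non-negative 64-bit rowid.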
| 19801 |
| 19802 static void fts5DebugRowid(int *pRc, Fts5Buffer *pBuf, i64 iKey){ |
| 19803 int iSegid, iHeight, iPgno, bDlidx; /* Rowid components */ |
| 19804 fts5DecodeRowid(iKey, &iSegid, &bDlidx, &iHeight, &iPgno); |
| 19805 |
| 19806 if( iSegid==0 ){ |
| 19807 if( iKey==FTS5_AVERAGES_ROWID ){ |
| 19808 sqlite3Fts5BufferAppendPrintf(pRc, pBuf, "{averages} "); |
| 19809 }else{ |
| 19810 sqlite3Fts5BufferAppendPrintf(pRc, pBuf, "{structure}"); |
| 19811 } |
| 19812 } |
| 19813 else{ |
| 19814 sqlite3Fts5BufferAppendPrintf(pRc, pBuf, "{%ssegid=%d h=%d pgno=%d}", |
| 19815 bDlidx ? "dlidx " : "", iSegid, iHeight, iPgno |
| 19816 ); |
| 19817 } |
| 19818 } |
| 19819 |
| 19820 static void fts5DebugStructure( |
| 19821 int *pRc, /* IN/OUT: error code */ |
| 19822 Fts5Buffer *pBuf, |
| 19823 Fts5Structure *p |
| 19824 ){ |
| 19825 int iLvl, iSeg; /* Iterate through levels, segments */ |
| 19826 |
| 19827 for(iLvl=0; iLvl<p->nLevel; iLvl++){ |
| 19828 Fts5StructureLevel *pLvl = &p->aLevel[iLvl]; |
| 19829 sqlite3Fts5BufferAppendPrintf(pRc, pBuf, |
| 19830 " {lvl=%d nMerge=%d nSeg=%d", iLvl, pLvl->nMerge, pLvl->nSeg |
| 19831 ); |
| 19832 for(iSeg=0; iSeg<pLvl->nSeg; iSeg++){ |
| 19833 Fts5StructureSegment *pSeg = &pLvl->aSeg[iSeg]; |
| 19834 sqlite3Fts5BufferAppendPrintf(pRc, pBuf, " {id=%d leaves=%d..%d}", |
| 19835 pSeg->iSegid, pSeg->pgnoFirst, pSeg->pgnoLast |
| 19836 ); |
| 19837 } |
| 19838 sqlite3Fts5BufferAppendPrintf(pRc, pBuf, "}"); |
| 19839 } |
| 19840 } |
| 19841 |
| 19842 /* |
| 19843 ** This is part of the fts5_decode() debugging aid. |
| 19844 ** |
| 19845 ** Arguments pBlob/nBlob contain a serialized Fts5Structure object. This |
| 19846 ** function appends a human-readable representation of the same object |
| 19847 ** to the buffer passed as the second argument. |
| 19848 */ |
| 19849 static void fts5DecodeStructure( |
| 19850 int *pRc, /* IN/OUT: error code */ |
| 19851 Fts5Buffer *pBuf, |
| 19852 const u8 *pBlob, int nBlob |
| 19853 ){ |
| 19854 int rc; /* Return code */ |
| 19855 Fts5Structure *p = 0; /* Decoded structure object */ |
| 19856 |
| 19857 rc = fts5StructureDecode(pBlob, nBlob, 0, &p); |
| 19858 if( rc!=SQLITE_OK ){ |
| 19859 *pRc = rc; |
| 19860 return; |
| 19861 } |
| 19862 |
| 19863 fts5DebugStructure(pRc, pBuf, p); |
| 19864 fts5StructureRelease(p); |
| 19865 } |
| 19866 |
| 19867 /* |
| 19868 ** This is part of the fts5_decode() debugging aid. |
| 19869 ** |
| 19870 ** Arguments pBlob/nBlob contain an "averages" record. This function |
| 19871 ** appends a human-readable representation of the record to the buffer |
| 19872 ** passed as the second argument. |
| 19873 */ |
| 19874 static void fts5DecodeAverages( |
| 19875 int *pRc, /* IN/OUT: error code */ |
| 19876 Fts5Buffer *pBuf, |
| 19877 const u8 *pBlob, int nBlob |
| 19878 ){ |
| 19879 int i = 0; |
| 19880 const char *zSpace = ""; |
| 19881 |
| 19882 while( i<nBlob ){ |
| 19883 u64 iVal; |
| 19884 i += sqlite3Fts5GetVarint(&pBlob[i], &iVal); |
| 19885 sqlite3Fts5BufferAppendPrintf(pRc, pBuf, "%s%d", zSpace, (int)iVal); |
| 19886 zSpace = " "; |
| 19887 } |
| 19888 } |
| 19889 |
| 19890 /* |
| 19891 ** Buffer (a/n) is assumed to contain a list of serialized varints. Read |
| 19892 ** each varint and append its string representation to buffer pBuf. Return |
| 19893 ** once the input buffer is exhausted. |
| 19894 ** |
| 19895 ** The return value is the number of bytes read from the input buffer. |
| 19896 */ |
| 19897 static int fts5DecodePoslist(int *pRc, Fts5Buffer *pBuf, const u8 *a, int n){ |
| 19898 int iOff = 0; |
| 19899 while( iOff<n ){ |
| 19900 int iVal; |
| 19901 iOff += fts5GetVarint32(&a[iOff], iVal); |
| 19902 sqlite3Fts5BufferAppendPrintf(pRc, pBuf, " %d", iVal); |
| 19903 } |
| 19904 return iOff; |
| 19905 } |
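The decoders in this part of the file all walk buffers of varints. As a rough illustration of what reading one varint involves, here is a simplified, bounds-checked decoder for the SQLite varint layout (big-endian, seven payload bits in each of the first eight bytes, all eight bits of a ninth byte). The name `demoGetVarint()` is hypothetical; this is not sqlite3Fts5GetVarint() or fts5GetVarint32(), which have different signatures and rely on zero padding rather than a length argument.

```c
#include <assert.h>
#include <stdint.h>

/* Decode one varint from a[0..n-1] into *pVal. Return the number of bytes
** consumed, or 0 if the buffer ends before the varint is complete. */
static int demoGetVarint(const unsigned char *a, int n, uint64_t *pVal){
  uint64_t v = 0;
  int i;
  for(i=0; i<n && i<8; i++){
    v = (v<<7) | (a[i] & 0x7f);     /* low 7 bits are payload */
    if( (a[i] & 0x80)==0 ){         /* high bit clear: last byte */
      *pVal = v;
      return i+1;
    }
  }
  if( i<n ){                        /* i==8: ninth byte carries 8 bits */
    *pVal = (v<<8) | a[8];
    return 9;
  }
  return 0;                         /* truncated input */
}
```

A caller would advance its offset by the return value after each call, much as the loops in this section do with `iOff`.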
| 19906 |
| 19907 /* |
| 19908 ** The start of buffer (a/n) contains the start of a doclist. The doclist |
| 19909 ** may or may not finish within the buffer. This function appends a text |
| 19910 ** representation of the part of the doclist that is present to buffer |
| 19911 ** pBuf. |
| 19912 ** |
| 19913 ** The return value is the number of bytes read from the input buffer. |
| 19914 */ |
| 19915 static int fts5DecodeDoclist(int *pRc, Fts5Buffer *pBuf, const u8 *a, int n){ |
| 19916 i64 iDocid = 0; |
| 19917 int iOff = 0; |
| 19918 |
| 19919 if( n>0 ){ |
| 19920 iOff = sqlite3Fts5GetVarint(a, (u64*)&iDocid); |
| 19921 sqlite3Fts5BufferAppendPrintf(pRc, pBuf, " id=%lld", iDocid); |
| 19922 } |
| 19923 while( iOff<n ){ |
| 19924 int nPos; |
| 19925 int bDel; |
| 19926 iOff += fts5GetPoslistSize(&a[iOff], &nPos, &bDel); |
| 19927 sqlite3Fts5BufferAppendPrintf(pRc, pBuf, " nPos=%d%s", nPos, bDel?"*":""); |
| 19928 iOff += fts5DecodePoslist(pRc, pBuf, &a[iOff], MIN(n-iOff, nPos)); |
| 19929 if( iOff<n ){ |
| 19930 i64 iDelta; |
| 19931 iOff += sqlite3Fts5GetVarint(&a[iOff], (u64*)&iDelta); |
| 19932 iDocid += iDelta; |
| 19933 sqlite3Fts5BufferAppendPrintf(pRc, pBuf, " id=%lld", iDocid); |
| 19934 } |
| 19935 } |
| 19936 |
| 19937 return iOff; |
| 19938 } |
| 19939 |
| 19940 /* |
| 19941 ** The implementation of user-defined scalar function fts5_decode(). |
| 19942 */ |
| 19943 static void fts5DecodeFunction( |
| 19944 sqlite3_context *pCtx, /* Function call context */ |
| 19945 int nArg, /* Number of args (always 2) */ |
| 19946 sqlite3_value **apVal /* Function arguments */ |
| 19947 ){ |
| 19948 i64 iRowid; /* Rowid for record being decoded */ |
| 19949 int iSegid,iHeight,iPgno,bDlidx;/* Rowid components */ |
| 19950 const u8 *aBlob; int n; /* Record to decode */ |
| 19951 u8 *a = 0; |
| 19952 Fts5Buffer s; /* Build up text to return here */ |
| 19953 int rc = SQLITE_OK; /* Return code */ |
| 19954 int nSpace = 0; |
| 19955 |
| 19956 assert( nArg==2 ); |
| 19957 memset(&s, 0, sizeof(Fts5Buffer)); |
| 19958 iRowid = sqlite3_value_int64(apVal[0]); |
| 19959 |
| 19960 /* Make a copy of the second argument (a blob) in aBlob[]. The aBlob[] |
| 19961 ** copy is followed by FTS5_DATA_ZERO_PADDING 0x00 bytes, which prevents |
| 19962 ** buffer overreads even if the record is corrupt. */ |
| 19963 n = sqlite3_value_bytes(apVal[1]); |
| 19964 aBlob = sqlite3_value_blob(apVal[1]); |
| 19965 nSpace = n + FTS5_DATA_ZERO_PADDING; |
| 19966 a = (u8*)sqlite3Fts5MallocZero(&rc, nSpace); |
| 19967 if( a==0 ) goto decode_out; |
| 19968 memcpy(a, aBlob, n); |
| 19969 |
| 19970 |
| 19971 fts5DecodeRowid(iRowid, &iSegid, &bDlidx, &iHeight, &iPgno); |
| 19972 |
| 19973 fts5DebugRowid(&rc, &s, iRowid); |
| 19974 if( bDlidx ){ |
| 19975 Fts5Data dlidx; |
| 19976 Fts5DlidxLvl lvl; |
| 19977 |
| 19978 dlidx.p = a; |
| 19979 dlidx.nn = n; |
| 19980 |
| 19981 memset(&lvl, 0, sizeof(Fts5DlidxLvl)); |
| 19982 lvl.pData = &dlidx; |
| 19983 lvl.iLeafPgno = iPgno; |
| 19984 |
| 19985 for(fts5DlidxLvlNext(&lvl); lvl.bEof==0; fts5DlidxLvlNext(&lvl)){ |
| 19986 sqlite3Fts5BufferAppendPrintf(&rc, &s, |
| 19987 " %d(%lld)", lvl.iLeafPgno, lvl.iRowid |
| 19988 ); |
| 19989 } |
| 19990 }else if( iSegid==0 ){ |
| 19991 if( iRowid==FTS5_AVERAGES_ROWID ){ |
| 19992 fts5DecodeAverages(&rc, &s, a, n); |
| 19993 }else{ |
| 19994 fts5DecodeStructure(&rc, &s, a, n); |
| 19995 } |
| 19996 }else{ |
| 19997 Fts5Buffer term; /* Current term read from page */ |
| 19998 int szLeaf; /* Offset of pgidx in a[] */ |
| 19999 int iPgidxOff; |
| 20000 int iPgidxPrev = 0; /* Previous value read from pgidx */ |
| 20001 int iTermOff = 0; |
| 20002 int iRowidOff = 0; |
| 20003 int iOff; |
| 20004 int nDoclist; |
| 20005 |
| 20006 memset(&term, 0, sizeof(Fts5Buffer)); |
| 20007 |
| 20008 if( n<4 ){ |
| 20009 sqlite3Fts5BufferSet(&rc, &s, 7, (const u8*)"corrupt"); |
| 20010 goto decode_out; |
| 20011 }else{ |
| 20012 iRowidOff = fts5GetU16(&a[0]); |
| 20013 iPgidxOff = szLeaf = fts5GetU16(&a[2]); |
| 20014 if( iPgidxOff<n ){ |
| 20015 fts5GetVarint32(&a[iPgidxOff], iTermOff); |
| 20016 } |
| 20017 } |
| 20018 |
| 20019 /* Decode the position list tail at the start of the page */ |
| 20020 if( iRowidOff!=0 ){ |
| 20021 iOff = iRowidOff; |
| 20022 }else if( iTermOff!=0 ){ |
| 20023 iOff = iTermOff; |
| 20024 }else{ |
| 20025 iOff = szLeaf; |
| 20026 } |
| 20027 fts5DecodePoslist(&rc, &s, &a[4], iOff-4); |
| 20028 |
| 20029 /* Decode any more doclist data that appears on the page before the |
| 20030 ** first term. */ |
| 20031 nDoclist = (iTermOff ? iTermOff : szLeaf) - iOff; |
| 20032 fts5DecodeDoclist(&rc, &s, &a[iOff], nDoclist); |
| 20033 |
| 20034 while( iPgidxOff<n ){ |
| 20035 int bFirst = (iPgidxOff==szLeaf); /* True for first term on page */ |
| 20036 int nByte; /* Bytes of data */ |
| 20037 int iEnd; |
| 20038 |
| 20039 iPgidxOff += fts5GetVarint32(&a[iPgidxOff], nByte); |
| 20040 iPgidxPrev += nByte; |
| 20041 iOff = iPgidxPrev; |
| 20042 |
| 20043 if( iPgidxOff<n ){ |
| 20044 fts5GetVarint32(&a[iPgidxOff], nByte); |
| 20045 iEnd = iPgidxPrev + nByte; |
| 20046 }else{ |
| 20047 iEnd = szLeaf; |
| 20048 } |
| 20049 |
| 20050 if( bFirst==0 ){ |
| 20051 iOff += fts5GetVarint32(&a[iOff], nByte); |
| 20052 term.n = nByte; |
| 20053 } |
| 20054 iOff += fts5GetVarint32(&a[iOff], nByte); |
| 20055 fts5BufferAppendBlob(&rc, &term, nByte, &a[iOff]); |
| 20056 iOff += nByte; |
| 20057 |
| 20058 sqlite3Fts5BufferAppendPrintf( |
| 20059 &rc, &s, " term=%.*s", term.n, (const char*)term.p |
| 20060 ); |
| 20061 iOff += fts5DecodeDoclist(&rc, &s, &a[iOff], iEnd-iOff); |
| 20062 } |
| 20063 |
| 20064 fts5BufferFree(&term); |
| 20065 } |
| 20066 |
| 20067 decode_out: |
| 20068 sqlite3_free(a); |
| 20069 if( rc==SQLITE_OK ){ |
| 20070 sqlite3_result_text(pCtx, (const char*)s.p, s.n, SQLITE_TRANSIENT); |
| 20071 }else{ |
| 20072 sqlite3_result_error_code(pCtx, rc); |
| 20073 } |
| 20074 fts5BufferFree(&s); |
| 20075 } |
| 20076 |
| 20077 /* |
| 20078 ** The implementation of user-defined scalar function fts5_rowid(). |
| 20079 */ |
| 20080 static void fts5RowidFunction( |
| 20081 sqlite3_context *pCtx, /* Function call context */ |
| 20082 int nArg, /* Number of args (3 for 'segment') */ |
| 20083 sqlite3_value **apVal /* Function arguments */ |
| 20084 ){ |
| 20085 const char *zArg; |
| 20086 if( nArg==0 ){ |
| 20087 sqlite3_result_error(pCtx, "should be: fts5_rowid(subject, ....)", -1); |
| 20088 }else{ |
| 20089 zArg = (const char*)sqlite3_value_text(apVal[0]); |
| 20090 if( 0==sqlite3_stricmp(zArg, "segment") ){ |
| 20091 i64 iRowid; |
| 20092 int segid, pgno; |
| 20093 if( nArg!=3 ){ |
| 20094 sqlite3_result_error(pCtx, |
| 20095 "should be: fts5_rowid('segment', segid, pgno)", -1 |
| 20096 ); |
| 20097 }else{ |
| 20098 segid = sqlite3_value_int(apVal[1]); |
| 20099 pgno = sqlite3_value_int(apVal[2]); |
| 20100 iRowid = FTS5_SEGMENT_ROWID(segid, pgno); |
| 20101 sqlite3_result_int64(pCtx, iRowid); |
| 20102 } |
| 20103 }else{ |
| 20104 sqlite3_result_error(pCtx, |
| 20105 "first arg to fts5_rowid() must be 'segment'" , -1 |
| 20106 ); |
| 20107 } |
| 20108 } |
| 20109 } |
| 20110 |
| 20111 /* |
| 20112 ** This is called as part of registering the FTS5 module with database |
| 20113 ** connection db. It registers several user-defined scalar functions useful |
| 20114 ** with FTS5. |
| 20115 ** |
| 20116 ** If successful, SQLITE_OK is returned. If an error occurs, some other |
| 20117 ** SQLite error code is returned instead. |
| 20118 */ |
| 20119 static int sqlite3Fts5IndexInit(sqlite3 *db){ |
| 20120 int rc = sqlite3_create_function( |
| 20121 db, "fts5_decode", 2, SQLITE_UTF8, 0, fts5DecodeFunction, 0, 0 |
| 20122 ); |
| 20123 if( rc==SQLITE_OK ){ |
| 20124 rc = sqlite3_create_function( |
| 20125 db, "fts5_rowid", -1, SQLITE_UTF8, 0, fts5RowidFunction, 0, 0 |
| 20126 ); |
| 20127 } |
| 20128 return rc; |
| 20129 } |
| 20130 |
| 20131 |
| 20132 /* |
| 20133 ** 2014 Jun 09 |
| 20134 ** |
| 20135 ** The author disclaims copyright to this source code. In place of |
| 20136 ** a legal notice, here is a blessing: |
| 20137 ** |
| 20138 ** May you do good and not evil. |
| 20139 ** May you find forgiveness for yourself and forgive others. |
| 20140 ** May you share freely, never taking more than you give. |
| 20141 ** |
| 20142 ****************************************************************************** |
| 20143 ** |
| 20144 ** This is an SQLite module implementing full-text search. |
| 20145 */ |
| 20146 |
| 20147 |
| 20148 /* #include "fts5Int.h" */ |
| 20149 |
| 20150 /* |
| 20151 ** This variable is set to false when running tests for which the on disk |
| 20152 ** structures should not be corrupt. Otherwise, true. If it is false, extra |
| 20153 ** assert() conditions in the fts5 code are activated - conditions that are |
| 20154 ** only true if it is guaranteed that the fts5 database is not corrupt. |
| 20155 */ |
| 20156 SQLITE_API int sqlite3_fts5_may_be_corrupt = 1; |
| 20157 |
| 20158 |
| 20159 typedef struct Fts5Auxdata Fts5Auxdata; |
| 20160 typedef struct Fts5Auxiliary Fts5Auxiliary; |
| 20161 typedef struct Fts5Cursor Fts5Cursor; |
| 20162 typedef struct Fts5Sorter Fts5Sorter; |
| 20163 typedef struct Fts5Table Fts5Table; |
| 20164 typedef struct Fts5TokenizerModule Fts5TokenizerModule; |
| 20165 |
| 20166 /* |
| 20167 ** NOTES ON TRANSACTIONS: |
| 20168 ** |
| 20169 ** SQLite invokes the following virtual table methods as transactions are |
| 20170 ** opened and closed by the user: |
| 20171 ** |
| 20172 ** xBegin(): Start of a new transaction. |
| 20173 ** xSync(): Initial part of two-phase commit. |
| 20174 ** xCommit(): Final part of two-phase commit. |
| 20175 ** xRollback(): Rollback the transaction. |
| 20176 ** |
| 20177 ** Anything that is required as part of a commit that may fail is performed |
| 20178 ** in the xSync() callback. Current versions of SQLite ignore any errors |
| 20179 ** returned by xCommit(). |
| 20180 ** |
| 20181 ** And as sub-transactions are opened/closed: |
| 20182 ** |
| 20183 ** xSavepoint(int S): Open savepoint S. |
| 20184 ** xRelease(int S): Commit and close savepoint S. |
| 20185 ** xRollbackTo(int S): Rollback to start of savepoint S. |
| 20186 ** |
| 20187 ** During a write-transaction the fts5_index.c module may cache some data |
| 20188 ** in-memory. It is flushed to disk whenever xSync(), xRelease() or |
| 20189 ** xSavepoint() is called, and discarded whenever xRollback() or |
| 20190 ** xRollbackTo() is called. |
| 20191 ** |
| 20192 ** Additionally, if SQLITE_DEBUG is defined, an instance of the following |
| 20193 ** structure is used to record the current transaction state. This information |
| 20194 ** is not required, but it is used in the assert() statements executed by |
| 20195 ** function fts5CheckTransactionState() (see below). |
| 20196 */ |
| 20197 struct Fts5TransactionState { |
| 20198 int eState; /* 0==closed, 1==open, 2==synced */ |
| 20199 int iSavepoint; /* Number of open savepoints (0 -> none) */ |
| 20200 }; |
| 20201 |
| 20202 /* |
| 20203 ** A single object of this type is allocated when the FTS5 module is |
| 20204 ** registered with a database handle. It is used to store pointers to |
| 20205 ** all registered FTS5 extensions - tokenizers and auxiliary functions. |
| 20206 */ |
| 20207 struct Fts5Global { |
| 20208 fts5_api api; /* User visible part of object (see fts5.h) */ |
| 20209 sqlite3 *db; /* Associated database connection */ |
| 20210 i64 iNextId; /* Used to allocate unique cursor ids */ |
| 20211 Fts5Auxiliary *pAux; /* First in list of all aux. functions */ |
| 20212 Fts5TokenizerModule *pTok; /* First in list of all tokenizer modules */ |
| 20213 Fts5TokenizerModule *pDfltTok; /* Default tokenizer module */ |
| 20214 Fts5Cursor *pCsr; /* First in list of all open cursors */ |
| 20215 }; |
| 20216 |
| 20217 /* |
| 20218 ** Each auxiliary function registered with the FTS5 module is represented |
| 20219 ** by an object of the following type. All such objects are stored as part |
| 20220 ** of the Fts5Global.pAux list. |
| 20221 */ |
| 20222 struct Fts5Auxiliary { |
| 20223 Fts5Global *pGlobal; /* Global context for this function */ |
| 20224 char *zFunc; /* Function name (nul-terminated) */ |
| 20225 void *pUserData; /* User-data pointer */ |
| 20226 fts5_extension_function xFunc; /* Callback function */ |
| 20227 void (*xDestroy)(void*); /* Destructor function */ |
| 20228 Fts5Auxiliary *pNext; /* Next registered auxiliary function */ |
| 20229 }; |
| 20230 |
| 20231 /* |
| 20232 ** Each tokenizer module registered with the FTS5 module is represented |
| 20233 ** by an object of the following type. All such objects are stored as part |
| 20234 ** of the Fts5Global.pTok list. |
| 20235 */ |
| 20236 struct Fts5TokenizerModule { |
| 20237 char *zName; /* Name of tokenizer */ |
| 20238 void *pUserData; /* User pointer passed to xCreate() */ |
| 20239 fts5_tokenizer x; /* Tokenizer functions */ |
| 20240 void (*xDestroy)(void*); /* Destructor function */ |
| 20241 Fts5TokenizerModule *pNext; /* Next registered tokenizer module */ |
| 20242 }; |
| 20243 |
| 20244 /* |
| 20245 ** Virtual-table object. |
| 20246 */ |
| 20247 struct Fts5Table { |
| 20248 sqlite3_vtab base; /* Base class used by SQLite core */ |
| 20249 Fts5Config *pConfig; /* Virtual table configuration */ |
| 20250 Fts5Index *pIndex; /* Full-text index */ |
| 20251 Fts5Storage *pStorage; /* Document store */ |
| 20252 Fts5Global *pGlobal; /* Global (connection wide) data */ |
| 20253 Fts5Cursor *pSortCsr; /* Sort data from this cursor */ |
| 20254 #ifdef SQLITE_DEBUG |
| 20255 struct Fts5TransactionState ts; |
| 20256 #endif |
| 20257 }; |
| 20258 |
| 20259 struct Fts5MatchPhrase { |
| 20260 Fts5Buffer *pPoslist; /* Pointer to current poslist */ |
| 20261 int nTerm; /* Size of phrase in terms */ |
| 20262 }; |
| 20263 |
| 20264 /* |
| 20265 ** pStmt: |
| 20266 ** SELECT rowid, <fts> FROM <fts> ORDER BY +rank; |
| 20267 ** |
| 20268 ** aIdx[]: |
| 20269 ** There is one entry in the aIdx[] array for each phrase in the query, |
| 20270 ** the value of which is the offset within aPoslist[] following the last |
| 20271 ** byte of the position list for the corresponding phrase. |
| 20272 */ |
| 20273 struct Fts5Sorter { |
| 20274 sqlite3_stmt *pStmt; |
| 20275 i64 iRowid; /* Current rowid */ |
| 20276 const u8 *aPoslist; /* Position lists for current row */ |
| 20277 int nIdx; /* Number of entries in aIdx[] */ |
| 20278 int aIdx[1]; /* Offsets into aPoslist for current row */ |
| 20279 }; |
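The aIdx[] description above can be read as follows: each entry marks where one phrase's position list ends, so a phrase's list begins where the previous one ended. The sketch below computes the byte span for a given phrase under the assumption that the lists are stored back to back starting at offset 0 of aPoslist[]. The helper `demoPhraseSpan()` is hypothetical and not part of FTS5.

```c
#include <assert.h>

/* Given the end offsets in aIdx[], return the start offset and size in
** bytes of the position list for phrase iPhrase. Assumes the position
** lists are contiguous and that the first one starts at offset 0. */
static void demoPhraseSpan(
  const int *aIdx,       /* End offset of each phrase's position list */
  int iPhrase,           /* Phrase to look up */
  int *piStart,          /* OUT: offset of first byte */
  int *pnByte            /* OUT: size in bytes */
){
  int iStart = (iPhrase==0 ? 0 : aIdx[iPhrase-1]);
  *piStart = iStart;
  *pnByte = aIdx[iPhrase] - iStart;
}
```

Storing only end offsets keeps the array at one integer per phrase, and an empty position list shows up as two consecutive equal entries.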
| 20280 |
| 20281 |
| 20282 /* |
| 20283 ** Virtual-table cursor object. |
| 20284 ** |
| 20285 ** iSpecial: |
| 20286 ** If this is a 'special' query (refer to function fts5SpecialMatch()), |
| 20287 ** then this variable contains the result of the query. |
| 20288 ** |
| 20289 ** iFirstRowid, iLastRowid: |
| 20290 ** These variables are only used for FTS5_PLAN_MATCH cursors. Assuming the |
| 20291 ** cursor iterates in ascending order of rowids, iFirstRowid is the lower |
| 20292 ** limit of rowids to return, and iLastRowid the upper. In other words, the |
| 20293 ** WHERE clause in the user's query might have been: |
| 20294 ** |
| 20295 ** <tbl> MATCH <expr> AND rowid BETWEEN $iFirstRowid AND $iLastRowid |
| 20296 ** |
| 20297 ** If the cursor iterates in descending order of rowid, iFirstRowid |
| 20298 ** is the upper limit (i.e. the "first" rowid visited) and iLastRowid |
| 20299 ** the lower. |
| 20300 */ |
| 20301 struct Fts5Cursor { |
| 20302 sqlite3_vtab_cursor base; /* Base class used by SQLite core */ |
| 20303 Fts5Cursor *pNext; /* Next cursor in Fts5Global.pCsr list */ |
| 20304 int *aColumnSize; /* Values for xColumnSize() */ |
| 20305 i64 iCsrId; /* Cursor id */ |
| 20306 |
| 20307 /* Zero from this point onwards on cursor reset */ |
| 20308 int ePlan; /* FTS5_PLAN_XXX value */ |
| 20309 int bDesc; /* True for "ORDER BY rowid DESC" queries */ |
| 20310 i64 iFirstRowid; /* Return no rowids earlier than this */ |
| 20311 i64 iLastRowid; /* Return no rowids later than this */ |
| 20312 sqlite3_stmt *pStmt; /* Statement used to read %_content */ |
| 20313 Fts5Expr *pExpr; /* Expression for MATCH queries */ |
| 20314 Fts5Sorter *pSorter; /* Sorter for "ORDER BY rank" queries */ |
| 20315 int csrflags; /* Mask of cursor flags (see below) */ |
| 20316 i64 iSpecial; /* Result of special query */ |
| 20317 |
| 20318 /* "rank" function. Populated on demand from vtab.xColumn(). */ |
| 20319 char *zRank; /* Custom rank function */ |
| 20320 char *zRankArgs; /* Custom rank function args */ |
| 20321 Fts5Auxiliary *pRank; /* Rank callback (or NULL) */ |
| 20322 int nRankArg; /* Number of trailing arguments for rank() */ |
| 20323 sqlite3_value **apRankArg; /* Array of trailing arguments */ |
| 20324 sqlite3_stmt *pRankArgStmt; /* Origin of objects in apRankArg[] */ |
| 20325 |
| 20326 /* Auxiliary data storage */ |
| 20327 Fts5Auxiliary *pAux; /* Currently executing extension function */ |
| 20328 Fts5Auxdata *pAuxdata; /* First in linked list of saved aux-data */ |
| 20329 |
| 20330 /* Cache used by auxiliary functions xInst() and xInstCount() */ |
| 20331 Fts5PoslistReader *aInstIter; /* One for each phrase */ |
| 20332 int nInstAlloc; /* Size of aInst[] array (entries / 3) */ |
| 20333 int nInstCount; /* Number of phrase instances */ |
| 20334 int *aInst; /* 3 integers per phrase instance */ |
| 20335 }; |
| 20336 |
| 20337 /* |
| 20338 ** Bits that make up the "idxNum" parameter passed indirectly by |
| 20339 ** xBestIndex() to xFilter(). |
| 20340 */ |
| 20341 #define FTS5_BI_MATCH 0x0001 /* <tbl> MATCH ? */ |
| 20342 #define FTS5_BI_RANK 0x0002 /* rank MATCH ? */ |
| 20343 #define FTS5_BI_ROWID_EQ 0x0004 /* rowid == ? */ |
| 20344 #define FTS5_BI_ROWID_LE 0x0008 /* rowid <= ? */ |
| 20345 #define FTS5_BI_ROWID_GE 0x0010 /* rowid >= ? */ |
| 20346 |
| 20347 #define FTS5_BI_ORDER_RANK 0x0020 |
| 20348 #define FTS5_BI_ORDER_ROWID 0x0040 |
| 20349 #define FTS5_BI_ORDER_DESC 0x0080 |
| 20350 |
| 20351 /* |
| 20352 ** Values for Fts5Cursor.csrflags |
| 20353 */ |
| 20354 #define FTS5CSR_REQUIRE_CONTENT 0x01 |
| 20355 #define FTS5CSR_REQUIRE_DOCSIZE 0x02 |
| 20356 #define FTS5CSR_REQUIRE_INST 0x04 |
| 20357 #define FTS5CSR_EOF 0x08 |
| 20358 #define FTS5CSR_FREE_ZRANK 0x10 |
| 20359 #define FTS5CSR_REQUIRE_RESEEK 0x20 |
| 20360 |
| 20361 #define BitFlagAllTest(x,y) (((x) & (y))==(y)) |
| 20362 #define BitFlagTest(x,y) (((x) & (y))!=0) |
| 20363 |
| 20364 |
| 20365 /* |
| 20366 ** Macros to Set(), Clear() and Test() cursor flags. |
| 20367 */ |
| 20368 #define CsrFlagSet(pCsr, flag) ((pCsr)->csrflags |= (flag)) |
| 20369 #define CsrFlagClear(pCsr, flag) ((pCsr)->csrflags &= ~(flag)) |
| 20370 #define CsrFlagTest(pCsr, flag) ((pCsr)->csrflags & (flag)) |
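A short usage sketch for the flag macros above, showing the difference between BitFlagTest() (any of the bits set) and BitFlagAllTest() (all of the bits set). The macros and flag values are copied from this section; the `DemoCursor` type and `demoMarkStale()` helper are hypothetical and exist only so that the example compiles on its own.

```c
#include <assert.h>

#define FTS5CSR_REQUIRE_CONTENT 0x01
#define FTS5CSR_REQUIRE_DOCSIZE 0x02
#define FTS5CSR_REQUIRE_INST    0x04

#define BitFlagAllTest(x,y) (((x) & (y))==(y))
#define BitFlagTest(x,y)    (((x) & (y))!=0)

#define CsrFlagSet(pCsr, flag)   ((pCsr)->csrflags |= (flag))
#define CsrFlagClear(pCsr, flag) ((pCsr)->csrflags &= ~(flag))
#define CsrFlagTest(pCsr, flag)  ((pCsr)->csrflags & (flag))

/* Hypothetical stand-in for Fts5Cursor: only the flags field. */
typedef struct DemoCursor DemoCursor;
struct DemoCursor {
  int csrflags;
};

/* Mark every per-row cache as needing a reload. */
static void demoMarkStale(DemoCursor *pCsr){
  CsrFlagSet(pCsr,
      FTS5CSR_REQUIRE_CONTENT | FTS5CSR_REQUIRE_DOCSIZE | FTS5CSR_REQUIRE_INST
  );
}
```

Note that CsrFlagTest() returns the masked bits rather than 0 or 1, so its result should be compared against zero rather than against 1.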
| 20371 |
| 20372 struct Fts5Auxdata { |
| 20373 Fts5Auxiliary *pAux; /* Extension to which this belongs */ |
| 20374 void *pPtr; /* Pointer value */ |
| 20375 void(*xDelete)(void*); /* Destructor */ |
| 20376 Fts5Auxdata *pNext; /* Next object in linked list */ |
| 20377 }; |
| 20378 |
| 20379 #ifdef SQLITE_DEBUG |
| 20380 #define FTS5_BEGIN 1 |
| 20381 #define FTS5_SYNC 2 |
| 20382 #define FTS5_COMMIT 3 |
| 20383 #define FTS5_ROLLBACK 4 |
| 20384 #define FTS5_SAVEPOINT 5 |
| 20385 #define FTS5_RELEASE 6 |
| 20386 #define FTS5_ROLLBACKTO 7 |
| 20387 static void fts5CheckTransactionState(Fts5Table *p, int op, int iSavepoint){ |
| 20388 switch( op ){ |
| 20389 case FTS5_BEGIN: |
| 20390 assert( p->ts.eState==0 ); |
| 20391 p->ts.eState = 1; |
| 20392 p->ts.iSavepoint = -1; |
| 20393 break; |
| 20394 |
| 20395 case FTS5_SYNC: |
| 20396 assert( p->ts.eState==1 ); |
| 20397 p->ts.eState = 2; |
| 20398 break; |
| 20399 |
| 20400 case FTS5_COMMIT: |
| 20401 assert( p->ts.eState==2 ); |
| 20402 p->ts.eState = 0; |
| 20403 break; |
| 20404 |
| 20405 case FTS5_ROLLBACK: |
| 20406 assert( p->ts.eState==1 || p->ts.eState==2 || p->ts.eState==0 ); |
| 20407 p->ts.eState = 0; |
| 20408 break; |
| 20409 |
| 20410 case FTS5_SAVEPOINT: |
| 20411 assert( p->ts.eState==1 ); |
| 20412 assert( iSavepoint>=0 ); |
| 20413 assert( iSavepoint>p->ts.iSavepoint ); |
| 20414 p->ts.iSavepoint = iSavepoint; |
| 20415 break; |
| 20416 |
| 20417 case FTS5_RELEASE: |
| 20418 assert( p->ts.eState==1 ); |
| 20419 assert( iSavepoint>=0 ); |
| 20420 assert( iSavepoint<=p->ts.iSavepoint ); |
| 20421 p->ts.iSavepoint = iSavepoint-1; |
| 20422 break; |
| 20423 |
| 20424 case FTS5_ROLLBACKTO: |
| 20425 assert( p->ts.eState==1 ); |
| 20426 assert( iSavepoint>=0 ); |
| 20427 assert( iSavepoint<=p->ts.iSavepoint ); |
| 20428 p->ts.iSavepoint = iSavepoint; |
| 20429 break; |
| 20430 } |
| 20431 } |
| 20432 #else |
| 20433 # define fts5CheckTransactionState(x,y,z) |
| 20434 #endif |
| 20435 |
| 20436 /* |
| 20437 ** Return true if pTab is a contentless table. |
| 20438 */ |
| 20439 static int fts5IsContentless(Fts5Table *pTab){ |
| 20440 return pTab->pConfig->eContent==FTS5_CONTENT_NONE; |
| 20441 } |
| 20442 |
| 20443 /* |
| 20444 ** Delete a virtual table handle allocated by fts5InitVtab(). |
| 20445 */ |
| 20446 static void fts5FreeVtab(Fts5Table *pTab){ |
| 20447 if( pTab ){ |
| 20448 sqlite3Fts5IndexClose(pTab->pIndex); |
| 20449 sqlite3Fts5StorageClose(pTab->pStorage); |
| 20450 sqlite3Fts5ConfigFree(pTab->pConfig); |
| 20451 sqlite3_free(pTab); |
| 20452 } |
| 20453 } |
| 20454 |
| 20455 /* |
| 20456 ** The xDisconnect() virtual table method. |
| 20457 */ |
| 20458 static int fts5DisconnectMethod(sqlite3_vtab *pVtab){ |
| 20459 fts5FreeVtab((Fts5Table*)pVtab); |
| 20460 return SQLITE_OK; |
| 20461 } |
| 20462 |
| 20463 /* |
| 20464 ** The xDestroy() virtual table method. |
| 20465 */ |
| 20466 static int fts5DestroyMethod(sqlite3_vtab *pVtab){ |
| 20467 Fts5Table *pTab = (Fts5Table*)pVtab; |
| 20468 int rc = sqlite3Fts5DropAll(pTab->pConfig); |
| 20469 if( rc==SQLITE_OK ){ |
| 20470 fts5FreeVtab((Fts5Table*)pVtab); |
| 20471 } |
| 20472 return rc; |
| 20473 } |
| 20474 |
| 20475 /* |
| 20476 ** This function is the implementation of both the xConnect and xCreate |
| 20477 ** methods of the FTS5 virtual table. |
| 20478 ** |
| 20479 ** The argv[] array contains the following: |
| 20480 ** |
| 20481 ** argv[0] -> module name ("fts5") |
| 20482 ** argv[1] -> database name |
| 20483 ** argv[2] -> table name |
| 20484 ** argv[...] -> "column name" and other module argument fields. |
| 20485 */ |
| 20486 static int fts5InitVtab( |
| 20487 int bCreate, /* True for xCreate, false for xConnect */ |
| 20488 sqlite3 *db, /* The SQLite database connection */ |
| 20489 void *pAux, /* Fts5Global object for this connection */ |
| 20490 int argc, /* Number of elements in argv array */ |
| 20491 const char * const *argv, /* xCreate/xConnect argument array */ |
| 20492 sqlite3_vtab **ppVTab, /* Write the resulting vtab structure here */ |
| 20493 char **pzErr /* Write any error message here */ |
| 20494 ){ |
| 20495 Fts5Global *pGlobal = (Fts5Global*)pAux; |
| 20496 const char **azConfig = (const char**)argv; |
| 20497 int rc = SQLITE_OK; /* Return code */ |
| 20498 Fts5Config *pConfig = 0; /* Results of parsing argc/argv */ |
| 20499 Fts5Table *pTab = 0; /* New virtual table object */ |
| 20500 |
| 20501 /* Allocate the new vtab object and parse the configuration */ |
| 20502 pTab = (Fts5Table*)sqlite3Fts5MallocZero(&rc, sizeof(Fts5Table)); |
| 20503 if( rc==SQLITE_OK ){ |
| 20504 rc = sqlite3Fts5ConfigParse(pGlobal, db, argc, azConfig, &pConfig, pzErr); |
| 20505 assert( (rc==SQLITE_OK && *pzErr==0) || pConfig==0 ); |
| 20506 } |
| 20507 if( rc==SQLITE_OK ){ |
| 20508 pTab->pConfig = pConfig; |
| 20509 pTab->pGlobal = pGlobal; |
| 20510 } |
| 20511 |
| 20512 /* Open the index sub-system */ |
| 20513 if( rc==SQLITE_OK ){ |
| 20514 rc = sqlite3Fts5IndexOpen(pConfig, bCreate, &pTab->pIndex, pzErr); |
| 20515 } |
| 20516 |
| 20517 /* Open the storage sub-system */ |
| 20518 if( rc==SQLITE_OK ){ |
| 20519 rc = sqlite3Fts5StorageOpen( |
| 20520 pConfig, pTab->pIndex, bCreate, &pTab->pStorage, pzErr |
| 20521 ); |
| 20522 } |
| 20523 |
| 20524 /* Call sqlite3_declare_vtab() */ |
| 20525 if( rc==SQLITE_OK ){ |
| 20526 rc = sqlite3Fts5ConfigDeclareVtab(pConfig); |
| 20527 } |
| 20528 |
| 20529 /* Load the initial configuration */ |
| 20530 if( rc==SQLITE_OK ){ |
| 20531 assert( pConfig->pzErrmsg==0 ); |
| 20532 pConfig->pzErrmsg = pzErr; |
| 20533 rc = sqlite3Fts5IndexLoadConfig(pTab->pIndex); |
| 20534 sqlite3Fts5IndexRollback(pTab->pIndex); |
| 20535 pConfig->pzErrmsg = 0; |
| 20536 } |
| 20537 |
| 20538 if( rc!=SQLITE_OK ){ |
| 20539 fts5FreeVtab(pTab); |
| 20540 pTab = 0; |
| 20541 }else if( bCreate ){ |
| 20542 fts5CheckTransactionState(pTab, FTS5_BEGIN, 0); |
| 20543 } |
| 20544 *ppVTab = (sqlite3_vtab*)pTab; |
| 20545 return rc; |
| 20546 } |
| 20547 |
| 20548 /* |
| 20549 ** The xConnect() and xCreate() methods for the virtual table. All the |
| 20550 ** work is done in function fts5InitVtab(). |
| 20551 */ |
| 20552 static int fts5ConnectMethod( |
| 20553 sqlite3 *db, /* Database connection */ |
| 20554 void *pAux, /* Pointer to tokenizer hash table */ |
| 20555 int argc, /* Number of elements in argv array */ |
| 20556 const char * const *argv, /* xCreate/xConnect argument array */ |
| 20557 sqlite3_vtab **ppVtab, /* OUT: New sqlite3_vtab object */ |
| 20558 char **pzErr /* OUT: sqlite3_malloc'd error message */ |
| 20559 ){ |
| 20560 return fts5InitVtab(0, db, pAux, argc, argv, ppVtab, pzErr); |
| 20561 } |
| 20562 static int fts5CreateMethod( |
| 20563 sqlite3 *db, /* Database connection */ |
| 20564 void *pAux, /* Pointer to tokenizer hash table */ |
| 20565 int argc, /* Number of elements in argv array */ |
| 20566 const char * const *argv, /* xCreate/xConnect argument array */ |
| 20567 sqlite3_vtab **ppVtab, /* OUT: New sqlite3_vtab object */ |
| 20568 char **pzErr /* OUT: sqlite3_malloc'd error message */ |
| 20569 ){ |
| 20570 return fts5InitVtab(1, db, pAux, argc, argv, ppVtab, pzErr); |
| 20571 } |
| 20572 |
| 20573 /* |
| 20574 ** The different query plans. |
| 20575 */ |
| 20576 #define FTS5_PLAN_MATCH 1 /* (<tbl> MATCH ?) */ |
| 20577 #define FTS5_PLAN_SOURCE 2 /* A source cursor for SORTED_MATCH */ |
| 20578 #define FTS5_PLAN_SPECIAL 3 /* An internal query */ |
| 20579 #define FTS5_PLAN_SORTED_MATCH 4 /* (<tbl> MATCH ? ORDER BY rank) */ |
| 20580 #define FTS5_PLAN_SCAN 5 /* No usable constraint */ |
| 20581 #define FTS5_PLAN_ROWID 6 /* (rowid = ?) */ |
| 20582 |
| 20583 /* |
| 20584 ** Set the SQLITE_INDEX_SCAN_UNIQUE flag in pIdxInfo->idxFlags, unless this |
| 20585 ** extension is currently being used by a version of SQLite too old to |
| 20586 ** support index-info flags. In that case this function is a no-op. |
| 20587 */ |
| 20588 static void fts5SetUniqueFlag(sqlite3_index_info *pIdxInfo){ |
| 20589 #if SQLITE_VERSION_NUMBER>=3008012 |
| 20590 #ifndef SQLITE_CORE |
| 20591 if( sqlite3_libversion_number()>=3008012 ) |
| 20592 #endif |
| 20593 { |
| 20594 pIdxInfo->idxFlags |= SQLITE_INDEX_SCAN_UNIQUE; |
| 20595 } |
| 20596 #endif |
| 20597 } |
| 20598 |
| 20599 /* |
| 20600 ** Implementation of the xBestIndex method for FTS5 tables. Within the |
| 20601 ** WHERE constraint, it searches for the following: |
| 20602 ** |
| 20603 ** 1. A MATCH constraint against the special column. |
| 20604 ** 2. A MATCH constraint against the "rank" column. |
| 20605 ** 3. An == constraint against the rowid column. |
| 20606 ** 4. A < or <= constraint against the rowid column. |
| 20607 ** 5. A > or >= constraint against the rowid column. |
| 20608 ** |
| 20609 ** Within the ORDER BY, either: |
| 20610 ** |
| 20611 ** 6. ORDER BY rank [ASC|DESC] |
| 20612 ** 7. ORDER BY rowid [ASC|DESC] |
| 20613 ** |
| 20614 ** Costs are assigned as follows: |
| 20615 ** |
| 20616 ** a) If an unusable MATCH operator is present in the WHERE clause, the |
| 20617 ** cost is unconditionally set to 1e50 (a really big number). |
| 20618 ** |
| 20619 ** b) If a MATCH operator is present, the cost depends on the other |
| 20620 ** constraints that are also present, as follows: |
| 20621 ** |
| 20622 ** * No other constraints: cost=1000.0 |
| 20623 ** * One rowid range constraint: cost=750.0 |
| 20624 ** * Both rowid range constraints: cost=500.0 |
| 20625 ** * An == rowid constraint: cost=100.0 |
| 20626 ** |
| 20627 ** c) Otherwise, if there is no MATCH: |
| 20628 ** |
| 20629 ** * No other constraints: cost=1000000.0 |
| 20630 ** * One rowid range constraint: cost=750000.0 |
| 20631 ** * Both rowid range constraints: cost=250000.0 |
| 20632 ** * An == rowid constraint: cost=10.0 |
| 20633 ** |
| 20634 ** Costs are not modified by the ORDER BY clause. |
| 20635 */ |
| 20636 static int fts5BestIndexMethod(sqlite3_vtab *pVTab, sqlite3_index_info *pInfo){ |
| 20637 Fts5Table *pTab = (Fts5Table*)pVTab; |
| 20638 Fts5Config *pConfig = pTab->pConfig; |
| 20639 int idxFlags = 0; /* Parameter passed through to xFilter() */ |
| 20640 int bHasMatch; |
| 20641 int iNext; |
| 20642 int i; |
| 20643 |
| 20644 struct Constraint { |
| 20645 int op; /* Mask against sqlite3_index_constraint.op */ |
| 20646 int fts5op; /* FTS5 mask for idxFlags */ |
| 20647 int iCol; /* 0==rowid, 1==tbl, 2==rank */ |
| 20648 int omit; /* True to omit this if found */ |
| 20649 int iConsIndex; /* Index in pInfo->aConstraint[] */ |
| 20650 } aConstraint[] = { |
| 20651 {SQLITE_INDEX_CONSTRAINT_MATCH|SQLITE_INDEX_CONSTRAINT_EQ, |
| 20652 FTS5_BI_MATCH, 1, 1, -1}, |
| 20653 {SQLITE_INDEX_CONSTRAINT_MATCH|SQLITE_INDEX_CONSTRAINT_EQ, |
| 20654 FTS5_BI_RANK, 2, 1, -1}, |
| 20655 {SQLITE_INDEX_CONSTRAINT_EQ, FTS5_BI_ROWID_EQ, 0, 0, -1}, |
| 20656 {SQLITE_INDEX_CONSTRAINT_LT|SQLITE_INDEX_CONSTRAINT_LE, |
| 20657 FTS5_BI_ROWID_LE, 0, 0, -1}, |
| 20658 {SQLITE_INDEX_CONSTRAINT_GT|SQLITE_INDEX_CONSTRAINT_GE, |
| 20659 FTS5_BI_ROWID_GE, 0, 0, -1}, |
| 20660 }; |
| 20661 |
| 20662 int aColMap[3]; |
| 20663 aColMap[0] = -1; |
| 20664 aColMap[1] = pConfig->nCol; |
| 20665 aColMap[2] = pConfig->nCol+1; |
| 20666 |
| 20667 /* Set idxFlags flags for all WHERE clause terms that will be used. */ |
| 20668 for(i=0; i<pInfo->nConstraint; i++){ |
| 20669 struct sqlite3_index_constraint *p = &pInfo->aConstraint[i]; |
| 20670 int j; |
| 20671 for(j=0; j<(int)ArraySize(aConstraint); j++){ |
| 20672 struct Constraint *pC = &aConstraint[j]; |
| 20673 if( p->iColumn==aColMap[pC->iCol] && p->op & pC->op ){ |
| 20674 if( p->usable ){ |
| 20675 pC->iConsIndex = i; |
| 20676 idxFlags |= pC->fts5op; |
| 20677 }else if( j==0 ){ |
| 20678 /* As there exists an unusable MATCH constraint this is an |
| 20679 ** unusable plan. Set a prohibitively high cost. */ |
| 20680 pInfo->estimatedCost = 1e50; |
| 20681 return SQLITE_OK; |
| 20682 } |
| 20683 } |
| 20684 } |
| 20685 } |
| 20686 |
| 20687 /* Set idxFlags flags for the ORDER BY clause */ |
| 20688 if( pInfo->nOrderBy==1 ){ |
| 20689 int iSort = pInfo->aOrderBy[0].iColumn; |
| 20690 if( iSort==(pConfig->nCol+1) && BitFlagTest(idxFlags, FTS5_BI_MATCH) ){ |
| 20691 idxFlags |= FTS5_BI_ORDER_RANK; |
| 20692 }else if( iSort==-1 ){ |
| 20693 idxFlags |= FTS5_BI_ORDER_ROWID; |
| 20694 } |
| 20695 if( BitFlagTest(idxFlags, FTS5_BI_ORDER_RANK|FTS5_BI_ORDER_ROWID) ){ |
| 20696 pInfo->orderByConsumed = 1; |
| 20697 if( pInfo->aOrderBy[0].desc ){ |
| 20698 idxFlags |= FTS5_BI_ORDER_DESC; |
| 20699 } |
| 20700 } |
| 20701 } |
| 20702 |
| 20703 /* Calculate the estimated cost based on the flags set in idxFlags. */ |
| 20704 bHasMatch = BitFlagTest(idxFlags, FTS5_BI_MATCH); |
| 20705 if( BitFlagTest(idxFlags, FTS5_BI_ROWID_EQ) ){ |
| 20706 pInfo->estimatedCost = bHasMatch ? 100.0 : 10.0; |
| 20707 if( bHasMatch==0 ) fts5SetUniqueFlag(pInfo); |
| 20708 }else if( BitFlagAllTest(idxFlags, FTS5_BI_ROWID_LE|FTS5_BI_ROWID_GE) ){ |
| 20709 pInfo->estimatedCost = bHasMatch ? 500.0 : 250000.0; |
| 20710 }else if( BitFlagTest(idxFlags, FTS5_BI_ROWID_LE|FTS5_BI_ROWID_GE) ){ |
| 20711 pInfo->estimatedCost = bHasMatch ? 750.0 : 750000.0; |
| 20712 }else{ |
| 20713 pInfo->estimatedCost = bHasMatch ? 1000.0 : 1000000.0; |
| 20714 } |
| 20715 |
| 20716 /* Assign argvIndex values to each constraint in use. */ |
| 20717 iNext = 1; |
| 20718 for(i=0; i<(int)ArraySize(aConstraint); i++){ |
| 20719 struct Constraint *pC = &aConstraint[i]; |
| 20720 if( pC->iConsIndex>=0 ){ |
| 20721 pInfo->aConstraintUsage[pC->iConsIndex].argvIndex = iNext++; |
| 20722 pInfo->aConstraintUsage[pC->iConsIndex].omit = (unsigned char)pC->omit; |
| 20723 } |
| 20724 } |
| 20725 |
| 20726 pInfo->idxNum = idxFlags; |
| 20727 return SQLITE_OK; |
| 20728 } |
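The cost table in the comment above fts5BestIndexMethod() can be sketched as a standalone function. This is only an illustration: the `SK_*` flag values and the name `sketchEstimatedCost` are hypothetical stand-ins for the FTS5_BI_* flags, whose real definitions are not shown in this section.

```c
#include <assert.h>

/* Hypothetical flag values used by this sketch only. The real FTS5_BI_*
** constants are defined elsewhere in the source. */
#define SK_MATCH     0x01
#define SK_ROWID_EQ  0x04
#define SK_ROWID_LE  0x08
#define SK_ROWID_GE  0x10

/* Return the estimated cost for a combination of usable constraints,
** mirroring the if/else chain at the end of fts5BestIndexMethod(). */
static double sketchEstimatedCost(int flags){
  int bHasMatch = (flags & SK_MATCH)!=0;
  int range = flags & (SK_ROWID_LE|SK_ROWID_GE);
  assert( flags>=0 );
  if( flags & SK_ROWID_EQ ){
    /* An == rowid constraint dominates any range constraints */
    return bHasMatch ? 100.0 : 10.0;
  }else if( range==(SK_ROWID_LE|SK_ROWID_GE) ){
    /* Both an upper and a lower rowid bound */
    return bHasMatch ? 500.0 : 250000.0;
  }else if( range ){
    /* Exactly one rowid bound */
    return bHasMatch ? 750.0 : 750000.0;
  }
  return bHasMatch ? 1000.0 : 1000000.0;
}
```

Note that the order of the tests matters: the equality check comes first, so a query with both `rowid = ?` and a range constraint is costed as an equality lookup.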
| 20729 |
| 20730 /* |
| 20731 ** Implementation of xOpen method. |
| 20732 */ |
| 20733 static int fts5OpenMethod(sqlite3_vtab *pVTab, sqlite3_vtab_cursor **ppCsr){ |
| 20734 Fts5Table *pTab = (Fts5Table*)pVTab; |
| 20735 Fts5Config *pConfig = pTab->pConfig; |
| 20736 Fts5Cursor *pCsr; /* New cursor object */ |
| 20737 int nByte; /* Bytes of space to allocate */ |
| 20738 int rc = SQLITE_OK; /* Return code */ |
| 20739 |
| 20740 nByte = sizeof(Fts5Cursor) + pConfig->nCol * sizeof(int); |
| 20741 pCsr = (Fts5Cursor*)sqlite3_malloc(nByte); |
| 20742 if( pCsr ){ |
| 20743 Fts5Global *pGlobal = pTab->pGlobal; |
| 20744 memset(pCsr, 0, nByte); |
| 20745 pCsr->aColumnSize = (int*)&pCsr[1]; |
| 20746 pCsr->pNext = pGlobal->pCsr; |
| 20747 pGlobal->pCsr = pCsr; |
| 20748 pCsr->iCsrId = ++pGlobal->iNextId; |
| 20749 }else{ |
| 20750 rc = SQLITE_NOMEM; |
| 20751 } |
| 20752 *ppCsr = (sqlite3_vtab_cursor*)pCsr; |
| 20753 return rc; |
| 20754 } |
| 20755 |
| 20756 static int fts5StmtType(Fts5Cursor *pCsr){ |
| 20757 if( pCsr->ePlan==FTS5_PLAN_SCAN ){ |
| 20758 return (pCsr->bDesc) ? FTS5_STMT_SCAN_DESC : FTS5_STMT_SCAN_ASC; |
| 20759 } |
| 20760 return FTS5_STMT_LOOKUP; |
| 20761 } |
| 20762 |
| 20763 /* |
| 20764 ** This function is called after the cursor passed as the only argument |
| 20765 ** is moved to point at a different row. It clears all cached data |
| 20766 ** specific to the previous row stored by the cursor object. |
| 20767 */ |
| 20768 static void fts5CsrNewrow(Fts5Cursor *pCsr){ |
| 20769 CsrFlagSet(pCsr, |
| 20770 FTS5CSR_REQUIRE_CONTENT |
| 20771 | FTS5CSR_REQUIRE_DOCSIZE |
| 20772 | FTS5CSR_REQUIRE_INST |
| 20773 ); |
| 20774 } |
| 20775 |
| 20776 static void fts5FreeCursorComponents(Fts5Cursor *pCsr){ |
| 20777 Fts5Table *pTab = (Fts5Table*)(pCsr->base.pVtab); |
| 20778 Fts5Auxdata *pData; |
| 20779 Fts5Auxdata *pNext; |
| 20780 |
| 20781 sqlite3_free(pCsr->aInstIter); |
| 20782 sqlite3_free(pCsr->aInst); |
| 20783 if( pCsr->pStmt ){ |
| 20784 int eStmt = fts5StmtType(pCsr); |
| 20785 sqlite3Fts5StorageStmtRelease(pTab->pStorage, eStmt, pCsr->pStmt); |
| 20786 } |
| 20787 if( pCsr->pSorter ){ |
| 20788 Fts5Sorter *pSorter = pCsr->pSorter; |
| 20789 sqlite3_finalize(pSorter->pStmt); |
| 20790 sqlite3_free(pSorter); |
| 20791 } |
| 20792 |
| 20793 if( pCsr->ePlan!=FTS5_PLAN_SOURCE ){ |
| 20794 sqlite3Fts5ExprFree(pCsr->pExpr); |
| 20795 } |
| 20796 |
| 20797 for(pData=pCsr->pAuxdata; pData; pData=pNext){ |
| 20798 pNext = pData->pNext; |
| 20799 if( pData->xDelete ) pData->xDelete(pData->pPtr); |
| 20800 sqlite3_free(pData); |
| 20801 } |
| 20802 |
| 20803 sqlite3_finalize(pCsr->pRankArgStmt); |
| 20804 sqlite3_free(pCsr->apRankArg); |
| 20805 |
| 20806 if( CsrFlagTest(pCsr, FTS5CSR_FREE_ZRANK) ){ |
| 20807 sqlite3_free(pCsr->zRank); |
| 20808 sqlite3_free(pCsr->zRankArgs); |
| 20809 } |
| 20810 |
| 20811 memset(&pCsr->ePlan, 0, sizeof(Fts5Cursor) - ((u8*)&pCsr->ePlan - (u8*)pCsr)); |
| 20812 } |
| 20813 |
| 20814 |
| 20815 /* |
| 20816 ** Close the cursor. For additional information see the documentation |
| 20817 ** on the xClose method of the virtual table interface. |
| 20818 */ |
| 20819 static int fts5CloseMethod(sqlite3_vtab_cursor *pCursor){ |
| 20820 if( pCursor ){ |
| 20821 Fts5Table *pTab = (Fts5Table*)(pCursor->pVtab); |
| 20822 Fts5Cursor *pCsr = (Fts5Cursor*)pCursor; |
| 20823 Fts5Cursor **pp; |
| 20824 |
| 20825 fts5FreeCursorComponents(pCsr); |
| 20826 /* Remove the cursor from the Fts5Global.pCsr list */ |
| 20827 for(pp=&pTab->pGlobal->pCsr; (*pp)!=pCsr; pp=&(*pp)->pNext); |
| 20828 *pp = pCsr->pNext; |
| 20829 |
| 20830 sqlite3_free(pCsr); |
| 20831 } |
| 20832 return SQLITE_OK; |
| 20833 } |
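The list removal in fts5CloseMethod() uses the pointer-to-pointer idiom: the loop walks the addresses of the `pNext` links rather than the nodes, so unlinking the head needs no special case. A minimal sketch follows, using a hypothetical `SketchNode` type in place of Fts5Cursor. Unlike the loop above, which relies on the cursor always being on the list, this variant also stops at the end of the list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node standing in for Fts5Cursor in this sketch. */
typedef struct SketchNode SketchNode;
struct SketchNode {
  int id;
  SketchNode *pNext;
};

/* Unlink pDel from the singly-linked list rooted at *ppHead. If pDel is
** not on the list, the list is left unchanged. */
static void sketchUnlink(SketchNode **ppHead, SketchNode *pDel){
  SketchNode **pp;
  assert( ppHead!=NULL );
  /* pp always points at the link that would need rewriting */
  for(pp=ppHead; *pp && *pp!=pDel; pp=&(*pp)->pNext);
  if( *pp ) *pp = pDel->pNext;
}
```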
| 20834 |
| 20835 static int fts5SorterNext(Fts5Cursor *pCsr){ |
| 20836 Fts5Sorter *pSorter = pCsr->pSorter; |
| 20837 int rc; |
| 20838 |
| 20839 rc = sqlite3_step(pSorter->pStmt); |
| 20840 if( rc==SQLITE_DONE ){ |
| 20841 rc = SQLITE_OK; |
| 20842 CsrFlagSet(pCsr, FTS5CSR_EOF); |
| 20843 }else if( rc==SQLITE_ROW ){ |
| 20844 const u8 *a; |
| 20845 const u8 *aBlob; |
| 20846 int nBlob; |
| 20847 int i; |
| 20848 int iOff = 0; |
| 20849 rc = SQLITE_OK; |
| 20850 |
| 20851 pSorter->iRowid = sqlite3_column_int64(pSorter->pStmt, 0); |
| 20852 nBlob = sqlite3_column_bytes(pSorter->pStmt, 1); |
| 20853 aBlob = a = sqlite3_column_blob(pSorter->pStmt, 1); |
| 20854 |
| 20855 for(i=0; i<(pSorter->nIdx-1); i++){ |
| 20856 int iVal; |
| 20857 a += fts5GetVarint32(a, iVal); |
| 20858 iOff += iVal; |
| 20859 pSorter->aIdx[i] = iOff; |
| 20860 } |
| 20861 pSorter->aIdx[i] = &aBlob[nBlob] - a; |
| 20862 |
| 20863 pSorter->aPoslist = a; |
| 20864 fts5CsrNewrow(pCsr); |
| 20865 } |
| 20866 |
| 20867 return rc; |
| 20868 } |
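Judging from the loop in fts5SorterNext(), the blob in column 1 appears to start with (nIdx-1) varint sizes, followed by the concatenated position lists, and aIdx[] receives the running end offset of each list. The sketch below reproduces that decoding on a plain byte array. The names `sketchGetVarint32` and `sketchDecodeOffsets` are hypothetical, and the varint reader is a simplified stand-in (values of at most 28 bits) rather than the real fts5GetVarint32().

```c
#include <assert.h>
#include <stddef.h>

/* Simplified varint reader for this sketch: 7 bits per byte, most
** significant group first, high bit set on every byte except the last,
** at most four bytes. Returns the number of bytes consumed. */
static int sketchGetVarint32(const unsigned char *a, int *pVal){
  int v = 0;
  int n;
  for(n=0; n<4; n++){
    v = (v<<7) | (a[n] & 0x7f);
    if( (a[n] & 0x80)==0 ){ n++; break; }
  }
  *pVal = v;
  return n;
}

/* Decode nIdx end offsets from aBlob. aIdx[i] is set to the offset of the
** end of list i, relative to the first position-list byte. The return
** value points at that first position-list byte. */
static const unsigned char *sketchDecodeOffsets(
  const unsigned char *aBlob, int nBlob, int nIdx, int *aIdx
){
  const unsigned char *a = aBlob;
  int iOff = 0;
  int i;
  assert( nIdx>=1 );
  for(i=0; i<nIdx-1; i++){
    int iVal;
    a += sketchGetVarint32(a, &iVal);
    iOff += iVal;
    aIdx[i] = iOff;
  }
  /* The last list runs to the end of the blob, so its size is not stored */
  aIdx[i] = (int)(&aBlob[nBlob] - a);
  return a;
}
```

Storing sizes for all but the last list saves one varint per row, since the final end offset can be recovered from the blob length.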
| 20869 |
| 20870 |
| 20871 /* |
| 20872 ** Set the FTS5CSR_REQUIRE_RESEEK flag on all FTS5_PLAN_MATCH cursors |
| 20873 ** open on table pTab. |
| 20874 */ |
| 20875 static void fts5TripCursors(Fts5Table *pTab){ |
| 20876 Fts5Cursor *pCsr; |
| 20877 for(pCsr=pTab->pGlobal->pCsr; pCsr; pCsr=pCsr->pNext){ |
| 20878 if( pCsr->ePlan==FTS5_PLAN_MATCH |
| 20879 && pCsr->base.pVtab==(sqlite3_vtab*)pTab |
| 20880 ){ |
| 20881 CsrFlagSet(pCsr, FTS5CSR_REQUIRE_RESEEK); |
| 20882 } |
| 20883 } |
| 20884 } |
| 20885 |
| 20886 /* |
| 20887 ** If the REQUIRE_RESEEK flag is set on the cursor passed as the first |
| 20888 ** argument, close and reopen all Fts5IndexIter iterators that the cursor |
| 20889 ** is using. Then attempt to move the cursor to a rowid equal to or later |
| 20890 ** (in the cursor's sort order - ASC or DESC) than the current rowid. |
| 20891 ** |
| 20892 ** If the new rowid is not equal to the old, set output parameter *pbSkip |
| 20893 ** to 1 before returning. Otherwise, leave it unchanged. |
| 20894 ** |
| 20895 ** Return SQLITE_OK if successful or if no reseek was required, or an |
| 20896 ** error code if an error occurred. |
| 20897 */ |
| 20898 static int fts5CursorReseek(Fts5Cursor *pCsr, int *pbSkip){ |
| 20899 int rc = SQLITE_OK; |
| 20900 assert( *pbSkip==0 ); |
| 20901 if( CsrFlagTest(pCsr, FTS5CSR_REQUIRE_RESEEK) ){ |
| 20902 Fts5Table *pTab = (Fts5Table*)(pCsr->base.pVtab); |
| 20903 int bDesc = pCsr->bDesc; |
| 20904 i64 iRowid = sqlite3Fts5ExprRowid(pCsr->pExpr); |
| 20905 |
| 20906 rc = sqlite3Fts5ExprFirst(pCsr->pExpr, pTab->pIndex, iRowid, bDesc); |
| 20907 if( rc==SQLITE_OK && iRowid!=sqlite3Fts5ExprRowid(pCsr->pExpr) ){ |
| 20908 *pbSkip = 1; |
| 20909 } |
| 20910 |
| 20911 CsrFlagClear(pCsr, FTS5CSR_REQUIRE_RESEEK); |
| 20912 fts5CsrNewrow(pCsr); |
| 20913 if( sqlite3Fts5ExprEof(pCsr->pExpr) ){ |
| 20914 CsrFlagSet(pCsr, FTS5CSR_EOF); |
| 20915 } |
| 20916 } |
| 20917 return rc; |
| 20918 } |
| 20919 |
| 20920 |
| 20921 /* |
| 20922 ** Advance the cursor to the next row in the table that matches the |
| 20923 ** search criteria. |
| 20924 ** |
| 20925 ** Return SQLITE_OK if nothing goes wrong. SQLITE_OK is returned |
| 20926 ** even if we reach end-of-file. The fts5EofMethod() will be called |
| 20927 ** subsequently to determine whether or not an EOF was hit. |
| 20928 */ |
| 20929 static int fts5NextMethod(sqlite3_vtab_cursor *pCursor){ |
| 20930 Fts5Cursor *pCsr = (Fts5Cursor*)pCursor; |
| 20931 int rc = SQLITE_OK; |
| 20932 |
| 20933 assert( (pCsr->ePlan<3)== |
| 20934 (pCsr->ePlan==FTS5_PLAN_MATCH || pCsr->ePlan==FTS5_PLAN_SOURCE) |
| 20935 ); |
| 20936 |
| 20937 if( pCsr->ePlan<3 ){ |
| 20938 int bSkip = 0; |
| 20939 if( (rc = fts5CursorReseek(pCsr, &bSkip)) || bSkip ) return rc; |
| 20940 rc = sqlite3Fts5ExprNext(pCsr->pExpr, pCsr->iLastRowid); |
| 20941 if( sqlite3Fts5ExprEof(pCsr->pExpr) ){ |
| 20942 CsrFlagSet(pCsr, FTS5CSR_EOF); |
| 20943 } |
| 20944 fts5CsrNewrow(pCsr); |
| 20945 }else{ |
| 20946 switch( pCsr->ePlan ){ |
| 20947 case FTS5_PLAN_SPECIAL: { |
| 20948 CsrFlagSet(pCsr, FTS5CSR_EOF); |
| 20949 break; |
| 20950 } |
| 20951 |
| 20952 case FTS5_PLAN_SORTED_MATCH: { |
| 20953 rc = fts5SorterNext(pCsr); |
| 20954 break; |
| 20955 } |
| 20956 |
| 20957 default: |
| 20958 rc = sqlite3_step(pCsr->pStmt); |
| 20959 if( rc!=SQLITE_ROW ){ |
| 20960 CsrFlagSet(pCsr, FTS5CSR_EOF); |
| 20961 rc = sqlite3_reset(pCsr->pStmt); |
| 20962 }else{ |
| 20963 rc = SQLITE_OK; |
| 20964 } |
| 20965 break; |
| 20966 } |
| 20967 } |
| 20968 |
| 20969 return rc; |
| 20970 } |
| 20971 |
| 20972 |
| 20973 static sqlite3_stmt *fts5PrepareStatement( |
| 20974 int *pRc, |
| 20975 Fts5Config *pConfig, |
| 20976 const char *zFmt, |
| 20977 ... |
| 20978 ){ |
| 20979 sqlite3_stmt *pRet = 0; |
| 20980 va_list ap; |
| 20981 va_start(ap, zFmt); |
| 20982 |
| 20983 if( *pRc==SQLITE_OK ){ |
| 20984 int rc; |
| 20985 char *zSql = sqlite3_vmprintf(zFmt, ap); |
| 20986 if( zSql==0 ){ |
| 20987 rc = SQLITE_NOMEM; |
| 20988 }else{ |
| 20989 rc = sqlite3_prepare_v2(pConfig->db, zSql, -1, &pRet, 0); |
| 20990 if( rc!=SQLITE_OK ){ |
| 20991 *pConfig->pzErrmsg = sqlite3_mprintf("%s", sqlite3_errmsg(pConfig->db)); |
| 20992 } |
| 20993 sqlite3_free(zSql); |
| 20994 } |
| 20995 *pRc = rc; |
| 20996 } |
| 20997 |
| 20998 va_end(ap); |
| 20999 return pRet; |
| 21000 } |
| 21001 |
| 21002 static int fts5CursorFirstSorted(Fts5Table *pTab, Fts5Cursor *pCsr, int bDesc){ |
| 21003 Fts5Config *pConfig = pTab->pConfig; |
| 21004 Fts5Sorter *pSorter; |
| 21005 int nPhrase; |
| 21006 int nByte; |
| 21007 int rc = SQLITE_OK; |
| 21008 const char *zRank = pCsr->zRank; |
| 21009 const char *zRankArgs = pCsr->zRankArgs; |
| 21010 |
| 21011 nPhrase = sqlite3Fts5ExprPhraseCount(pCsr->pExpr); |
| 21012 nByte = sizeof(Fts5Sorter) + sizeof(int) * (nPhrase-1); |
| 21013 pSorter = (Fts5Sorter*)sqlite3_malloc(nByte); |
| 21014 if( pSorter==0 ) return SQLITE_NOMEM; |
| 21015 memset(pSorter, 0, nByte); |
| 21016 pSorter->nIdx = nPhrase; |
| 21017 |
| 21018 /* TODO: It would be better to have some system for reusing statement |
| 21019 ** handles here, rather than preparing a new one for each query. But that |
| 21020 ** is not possible as SQLite reference counts the virtual table objects. |
| 21021 ** And since the statement required here reads from this very virtual |
| 21022 ** table, saving it creates a circular reference. |
| 21023 ** |
| 21024 ** If SQLite had a built-in statement cache, this wouldn't be a problem. */ |
| 21025 pSorter->pStmt = fts5PrepareStatement(&rc, pConfig, |
| 21026 "SELECT rowid, rank FROM %Q.%Q ORDER BY %s(%s%s%s) %s", |
| 21027 pConfig->zDb, pConfig->zName, zRank, pConfig->zName, |
| 21028 (zRankArgs ? ", " : ""), |
| 21029 (zRankArgs ? zRankArgs : ""), |
| 21030 bDesc ? "DESC" : "ASC" |
| 21031 ); |
| 21032 |
| 21033 pCsr->pSorter = pSorter; |
| 21034 if( rc==SQLITE_OK ){ |
| 21035 assert( pTab->pSortCsr==0 ); |
| 21036 pTab->pSortCsr = pCsr; |
| 21037 rc = fts5SorterNext(pCsr); |
| 21038 pTab->pSortCsr = 0; |
| 21039 } |
| 21040 |
| 21041 if( rc!=SQLITE_OK ){ |
| 21042 sqlite3_finalize(pSorter->pStmt); |
| 21043 sqlite3_free(pSorter); |
| 21044 pCsr->pSorter = 0; |
| 21045 } |
| 21046 |
| 21047 return rc; |
| 21048 } |
| 21049 |
| 21050 static int fts5CursorFirst(Fts5Table *pTab, Fts5Cursor *pCsr, int bDesc){ |
| 21051 int rc; |
| 21052 Fts5Expr *pExpr = pCsr->pExpr; |
| 21053 rc = sqlite3Fts5ExprFirst(pExpr, pTab->pIndex, pCsr->iFirstRowid, bDesc); |
| 21054 if( sqlite3Fts5ExprEof(pExpr) ){ |
| 21055 CsrFlagSet(pCsr, FTS5CSR_EOF); |
| 21056 } |
| 21057 fts5CsrNewrow(pCsr); |
| 21058 return rc; |
| 21059 } |
| 21060 |
| 21061 /* |
| 21062 ** Process a "special" query. A special query is identified as one with a |
| 21063 ** MATCH expression that begins with a '*' character. The remainder of |
| 21064 ** the text passed to the MATCH operator is used as the special query |
| 21065 ** parameters. |
| 21066 */ |
| 21067 static int fts5SpecialMatch( |
| 21068 Fts5Table *pTab, |
| 21069 Fts5Cursor *pCsr, |
| 21070 const char *zQuery |
| 21071 ){ |
| 21072 int rc = SQLITE_OK; /* Return code */ |
| 21073 const char *z = zQuery; /* Special query text */ |
| 21074 int n; /* Number of bytes in text at z */ |
| 21075 |
| 21076 while( z[0]==' ' ) z++; |
| 21077 for(n=0; z[n] && z[n]!=' '; n++); |
| 21078 |
| 21079 assert( pTab->base.zErrMsg==0 ); |
| 21080 pCsr->ePlan = FTS5_PLAN_SPECIAL; |
| 21081 |
| 21082 if( 0==sqlite3_strnicmp("reads", z, n) ){ |
| 21083 pCsr->iSpecial = sqlite3Fts5IndexReads(pTab->pIndex); |
| 21084 } |
| 21085 else if( 0==sqlite3_strnicmp("id", z, n) ){ |
| 21086 pCsr->iSpecial = pCsr->iCsrId; |
| 21087 } |
| 21088 else{ |
| 21089 /* An unrecognized directive. Return an error message. */ |
| 21090 pTab->base.zErrMsg = sqlite3_mprintf("unknown special query: %.*s", n, z); |
| 21091 rc = SQLITE_ERROR; |
| 21092 } |
| 21093 |
| 21094 return rc; |
| 21095 } |
| 21096 |
| 21097 /* |
| 21098 ** Search for an auxiliary function named zName that can be used with table |
| 21099 ** pTab. If one is found, return a pointer to the corresponding Fts5Auxiliary |
| 21100 ** structure. Otherwise, if no such function exists, return NULL. |
| 21101 */ |
| 21102 static Fts5Auxiliary *fts5FindAuxiliary(Fts5Table *pTab, const char *zName){ |
| 21103 Fts5Auxiliary *pAux; |
| 21104 |
| 21105 for(pAux=pTab->pGlobal->pAux; pAux; pAux=pAux->pNext){ |
| 21106 if( sqlite3_stricmp(zName, pAux->zFunc)==0 ) return pAux; |
| 21107 } |
| 21108 |
| 21109 /* No function of the specified name was found. Return 0. */ |
| 21110 return 0; |
| 21111 } |
| 21112 |
| 21113 |
| 21114 static int fts5FindRankFunction(Fts5Cursor *pCsr){ |
| 21115 Fts5Table *pTab = (Fts5Table*)(pCsr->base.pVtab); |
| 21116 Fts5Config *pConfig = pTab->pConfig; |
| 21117 int rc = SQLITE_OK; |
| 21118 Fts5Auxiliary *pAux = 0; |
| 21119 const char *zRank = pCsr->zRank; |
| 21120 const char *zRankArgs = pCsr->zRankArgs; |
| 21121 |
| 21122 if( zRankArgs ){ |
| 21123 char *zSql = sqlite3Fts5Mprintf(&rc, "SELECT %s", zRankArgs); |
| 21124 if( zSql ){ |
| 21125 sqlite3_stmt *pStmt = 0; |
| 21126 rc = sqlite3_prepare_v2(pConfig->db, zSql, -1, &pStmt, 0); |
| 21127 sqlite3_free(zSql); |
| 21128 assert( rc==SQLITE_OK || pCsr->pRankArgStmt==0 ); |
| 21129 if( rc==SQLITE_OK ){ |
| 21130 if( SQLITE_ROW==sqlite3_step(pStmt) ){ |
| 21131 int nByte; |
| 21132 pCsr->nRankArg = sqlite3_column_count(pStmt); |
| 21133 nByte = sizeof(sqlite3_value*)*pCsr->nRankArg; |
| 21134 pCsr->apRankArg = (sqlite3_value**)sqlite3Fts5MallocZero(&rc, nByte); |
| 21135 if( rc==SQLITE_OK ){ |
| 21136 int i; |
| 21137 for(i=0; i<pCsr->nRankArg; i++){ |
| 21138 pCsr->apRankArg[i] = sqlite3_column_value(pStmt, i); |
| 21139 } |
| 21140 } |
| 21141 pCsr->pRankArgStmt = pStmt; |
| 21142 }else{ |
| 21143 rc = sqlite3_finalize(pStmt); |
| 21144 assert( rc!=SQLITE_OK ); |
| 21145 } |
| 21146 } |
| 21147 } |
| 21148 } |
| 21149 |
| 21150 if( rc==SQLITE_OK ){ |
| 21151 pAux = fts5FindAuxiliary(pTab, zRank); |
| 21152 if( pAux==0 ){ |
| 21153 assert( pTab->base.zErrMsg==0 ); |
| 21154 pTab->base.zErrMsg = sqlite3_mprintf("no such function: %s", zRank); |
| 21155 rc = SQLITE_ERROR; |
| 21156 } |
| 21157 } |
| 21158 |
| 21159 pCsr->pRank = pAux; |
| 21160 return rc; |
| 21161 } |
| 21162 |
| 21163 |
| 21164 static int fts5CursorParseRank( |
| 21165 Fts5Config *pConfig, |
| 21166 Fts5Cursor *pCsr, |
| 21167 sqlite3_value *pRank |
| 21168 ){ |
| 21169 int rc = SQLITE_OK; |
| 21170 if( pRank ){ |
| 21171 const char *z = (const char*)sqlite3_value_text(pRank); |
| 21172 char *zRank = 0; |
| 21173 char *zRankArgs = 0; |
| 21174 |
| 21175 if( z==0 ){ |
| 21176 if( sqlite3_value_type(pRank)==SQLITE_NULL ) rc = SQLITE_ERROR; |
| 21177 }else{ |
| 21178 rc = sqlite3Fts5ConfigParseRank(z, &zRank, &zRankArgs); |
| 21179 } |
| 21180 if( rc==SQLITE_OK ){ |
| 21181 pCsr->zRank = zRank; |
| 21182 pCsr->zRankArgs = zRankArgs; |
| 21183 CsrFlagSet(pCsr, FTS5CSR_FREE_ZRANK); |
| 21184 }else if( rc==SQLITE_ERROR ){ |
| 21185 pCsr->base.pVtab->zErrMsg = sqlite3_mprintf( |
| 21186 "parse error in rank function: %s", z |
| 21187 ); |
| 21188 } |
| 21189 }else{ |
| 21190 if( pConfig->zRank ){ |
| 21191 pCsr->zRank = (char*)pConfig->zRank; |
| 21192 pCsr->zRankArgs = (char*)pConfig->zRankArgs; |
| 21193 }else{ |
| 21194 pCsr->zRank = (char*)FTS5_DEFAULT_RANK; |
| 21195 pCsr->zRankArgs = 0; |
| 21196 } |
| 21197 } |
| 21198 return rc; |
| 21199 } |
| 21200 |
| 21201 static i64 fts5GetRowidLimit(sqlite3_value *pVal, i64 iDefault){ |
| 21202 if( pVal ){ |
| 21203 int eType = sqlite3_value_numeric_type(pVal); |
| 21204 if( eType==SQLITE_INTEGER ){ |
| 21205 return sqlite3_value_int64(pVal); |
| 21206 } |
| 21207 } |
| 21208 return iDefault; |
| 21209 } |
| 21210 |
| 21211 /* |
| 21212 ** This is the xFilter interface for the virtual table. See |
| 21213 ** the virtual table xFilter method documentation for additional |
| 21214 ** information. |
| 21215 ** |
| 21216 ** There are three possible query strategies: |
| 21217 ** |
| 21218 ** 1. Full-text search using a MATCH operator. |
| 21219 ** 2. A by-rowid lookup. |
| 21220 ** 3. A full-table scan. |
| 21221 */ |
| 21222 static int fts5FilterMethod( |
| 21223 sqlite3_vtab_cursor *pCursor, /* The cursor used for this query */ |
| 21224 int idxNum, /* Strategy index */ |
| 21225 const char *idxStr, /* Unused */ |
| 21226 int nVal, /* Number of elements in apVal */ |
| 21227 sqlite3_value **apVal /* Arguments for the indexing scheme */ |
| 21228 ){ |
| 21229 Fts5Table *pTab = (Fts5Table*)(pCursor->pVtab); |
| 21230 Fts5Config *pConfig = pTab->pConfig; |
| 21231 Fts5Cursor *pCsr = (Fts5Cursor*)pCursor; |
| 21232 int rc = SQLITE_OK; /* Error code */ |
| 21233 int iVal = 0; /* Counter for apVal[] */ |
| 21234 int bDesc; /* True if ORDER BY [rank|rowid] DESC */ |
| 21235 int bOrderByRank; /* True if ORDER BY rank */ |
| 21236 sqlite3_value *pMatch = 0; /* <tbl> MATCH ? expression (or NULL) */ |
| 21237 sqlite3_value *pRank = 0; /* rank MATCH ? expression (or NULL) */ |
| 21238 sqlite3_value *pRowidEq = 0; /* rowid = ? expression (or NULL) */ |
| 21239 sqlite3_value *pRowidLe = 0; /* rowid <= ? expression (or NULL) */ |
| 21240 sqlite3_value *pRowidGe = 0; /* rowid >= ? expression (or NULL) */ |
| 21241 char **pzErrmsg = pConfig->pzErrmsg; |
| 21242 |
| 21243 if( pCsr->ePlan ){ |
| 21244 fts5FreeCursorComponents(pCsr); |
| 21245 memset(&pCsr->ePlan, 0, sizeof(Fts5Cursor) - ((u8*)&pCsr->ePlan-(u8*)pCsr)); |
| 21246 } |
| 21247 |
| 21248 assert( pCsr->pStmt==0 ); |
| 21249 assert( pCsr->pExpr==0 ); |
| 21250 assert( pCsr->csrflags==0 ); |
| 21251 assert( pCsr->pRank==0 ); |
| 21252 assert( pCsr->zRank==0 ); |
| 21253 assert( pCsr->zRankArgs==0 ); |
| 21254 |
| 21255 assert( pzErrmsg==0 || pzErrmsg==&pTab->base.zErrMsg ); |
| 21256 pConfig->pzErrmsg = &pTab->base.zErrMsg; |
| 21257 |
| 21258 /* Decode the arguments passed through to this function. |
| 21259 ** |
| 21260 ** Note: The following set of if(...) statements must be in the same |
| 21261 ** order as the corresponding entries in the struct at the top of |
| 21262 ** fts5BestIndexMethod(). */ |
| 21263 if( BitFlagTest(idxNum, FTS5_BI_MATCH) ) pMatch = apVal[iVal++]; |
| 21264 if( BitFlagTest(idxNum, FTS5_BI_RANK) ) pRank = apVal[iVal++]; |
| 21265 if( BitFlagTest(idxNum, FTS5_BI_ROWID_EQ) ) pRowidEq = apVal[iVal++]; |
| 21266 if( BitFlagTest(idxNum, FTS5_BI_ROWID_LE) ) pRowidLe = apVal[iVal++]; |
| 21267 if( BitFlagTest(idxNum, FTS5_BI_ROWID_GE) ) pRowidGe = apVal[iVal++]; |
| 21268 assert( iVal==nVal ); |
| 21269 bOrderByRank = ((idxNum & FTS5_BI_ORDER_RANK) ? 1 : 0); |
| 21270 pCsr->bDesc = bDesc = ((idxNum & FTS5_BI_ORDER_DESC) ? 1 : 0); |
| 21271 |
| 21272 /* Set the cursor upper and lower rowid limits. Only some strategies |
| 21273 ** actually use them. This is ok, as the xBestIndex() method leaves the |
| 21274 ** sqlite3_index_constraint_usage.omit flag clear for range constraints |
| 21275 ** on the rowid field. */ |
| 21276 if( pRowidEq ){ |
| 21277 pRowidLe = pRowidGe = pRowidEq; |
| 21278 } |
| 21279 if( bDesc ){ |
| 21280 pCsr->iFirstRowid = fts5GetRowidLimit(pRowidLe, LARGEST_INT64); |
| 21281 pCsr->iLastRowid = fts5GetRowidLimit(pRowidGe, SMALLEST_INT64); |
| 21282 }else{ |
| 21283 pCsr->iLastRowid = fts5GetRowidLimit(pRowidLe, LARGEST_INT64); |
| 21284 pCsr->iFirstRowid = fts5GetRowidLimit(pRowidGe, SMALLEST_INT64); |
| 21285 } |
| 21286 |
| 21287 if( pTab->pSortCsr ){ |
| 21288 /* If pSortCsr is non-NULL, then this call is being made as part of |
| 21289 ** processing for a "... MATCH <expr> ORDER BY rank" query (ePlan is |
| 21290 ** set to FTS5_PLAN_SORTED_MATCH). pSortCsr is the cursor that will |
| 21291 ** return results to the user for this query. The current cursor |
| 21292 ** (pCursor) is used to execute the query issued by function |
| 21293 ** fts5CursorFirstSorted() above. */ |
| 21294 assert( pRowidEq==0 && pRowidLe==0 && pRowidGe==0 && pRank==0 ); |
| 21295 assert( nVal==0 && pMatch==0 && bOrderByRank==0 && bDesc==0 ); |
| 21296 assert( pCsr->iLastRowid==LARGEST_INT64 ); |
| 21297 assert( pCsr->iFirstRowid==SMALLEST_INT64 ); |
| 21298 pCsr->ePlan = FTS5_PLAN_SOURCE; |
| 21299 pCsr->pExpr = pTab->pSortCsr->pExpr; |
| 21300 rc = fts5CursorFirst(pTab, pCsr, bDesc); |
| 21301 }else if( pMatch ){ |
| 21302 const char *zExpr = (const char*)sqlite3_value_text(apVal[0]); |
| 21303 if( zExpr==0 ) zExpr = ""; |
| 21304 |
| 21305 rc = fts5CursorParseRank(pConfig, pCsr, pRank); |
| 21306 if( rc==SQLITE_OK ){ |
| 21307 if( zExpr[0]=='*' ){ |
| 21308 /* The user has issued a query of the form "MATCH '*...'". This |
| 21309 ** indicates that the MATCH expression is not a full text query, |
| 21310 ** but a request for an internal parameter. */ |
| 21311 rc = fts5SpecialMatch(pTab, pCsr, &zExpr[1]); |
| 21312 }else{ |
| 21313 char **pzErr = &pTab->base.zErrMsg; |
| 21314 rc = sqlite3Fts5ExprNew(pConfig, zExpr, &pCsr->pExpr, pzErr); |
| 21315 if( rc==SQLITE_OK ){ |
| 21316 if( bOrderByRank ){ |
| 21317 pCsr->ePlan = FTS5_PLAN_SORTED_MATCH; |
| 21318 rc = fts5CursorFirstSorted(pTab, pCsr, bDesc); |
| 21319 }else{ |
| 21320 pCsr->ePlan = FTS5_PLAN_MATCH; |
| 21321 rc = fts5CursorFirst(pTab, pCsr, bDesc); |
| 21322 } |
| 21323 } |
| 21324 } |
| 21325 } |
| 21326 }else if( pConfig->zContent==0 ){ |
| 21327 *pConfig->pzErrmsg = sqlite3_mprintf( |
| 21328 "%s: table does not support scanning", pConfig->zName |
| 21329 ); |
| 21330 rc = SQLITE_ERROR; |
| 21331 }else{ |
| 21332 /* This is either a full-table scan (ePlan==FTS5_PLAN_SCAN) or a lookup |
| 21333 ** by rowid (ePlan==FTS5_PLAN_ROWID). */ |
| 21334 pCsr->ePlan = (pRowidEq ? FTS5_PLAN_ROWID : FTS5_PLAN_SCAN); |
| 21335 rc = sqlite3Fts5StorageStmt( |
| 21336 pTab->pStorage, fts5StmtType(pCsr), &pCsr->pStmt, &pTab->base.zErrMsg |
| 21337 ); |
| 21338 if( rc==SQLITE_OK ){ |
| 21339 if( pCsr->ePlan==FTS5_PLAN_ROWID ){ |
| 21340 sqlite3_bind_value(pCsr->pStmt, 1, apVal[0]); |
| 21341 }else{ |
| 21342 sqlite3_bind_int64(pCsr->pStmt, 1, pCsr->iFirstRowid); |
| 21343 sqlite3_bind_int64(pCsr->pStmt, 2, pCsr->iLastRowid); |
| 21344 } |
| 21345 rc = fts5NextMethod(pCursor); |
| 21346 } |
| 21347 } |
| 21348 |
| 21349 pConfig->pzErrmsg = pzErrmsg; |
| 21350 return rc; |
| 21351 } |
| 21352 |
| 21353 /* |
| 21354 ** This is the xEof method of the virtual table. SQLite calls this |
| 21355 ** routine to find out if it has reached the end of a result set. |
| 21356 */ |
| 21357 static int fts5EofMethod(sqlite3_vtab_cursor *pCursor){ |
| 21358 Fts5Cursor *pCsr = (Fts5Cursor*)pCursor; |
| 21359 return (CsrFlagTest(pCsr, FTS5CSR_EOF) ? 1 : 0); |
| 21360 } |
| 21361 |
| 21362 /* |
| 21363 ** Return the rowid that the cursor currently points to. |
| 21364 */ |
| 21365 static i64 fts5CursorRowid(Fts5Cursor *pCsr){ |
| 21366 assert( pCsr->ePlan==FTS5_PLAN_MATCH |
| 21367 || pCsr->ePlan==FTS5_PLAN_SORTED_MATCH |
| 21368 || pCsr->ePlan==FTS5_PLAN_SOURCE |
| 21369 ); |
| 21370 if( pCsr->pSorter ){ |
| 21371 return pCsr->pSorter->iRowid; |
| 21372 }else{ |
| 21373 return sqlite3Fts5ExprRowid(pCsr->pExpr); |
| 21374 } |
| 21375 } |
| 21376 |
| 21377 /* |
| 21378 ** This is the xRowid method. The SQLite core calls this routine to |
| 21379 ** retrieve the rowid for the current row of the result set. fts5 |
| 21380 ** exposes %_content.rowid as the rowid for the virtual table. The |
| 21381 ** rowid should be written to *pRowid. |
| 21382 */ |
| 21383 static int fts5RowidMethod(sqlite3_vtab_cursor *pCursor, sqlite_int64 *pRowid){ |
| 21384 Fts5Cursor *pCsr = (Fts5Cursor*)pCursor; |
| 21385 int ePlan = pCsr->ePlan; |
| 21386 |
| 21387 assert( CsrFlagTest(pCsr, FTS5CSR_EOF)==0 ); |
| 21388 switch( ePlan ){ |
| 21389 case FTS5_PLAN_SPECIAL: |
| 21390 *pRowid = 0; |
| 21391 break; |
| 21392 |
| 21393 case FTS5_PLAN_SOURCE: |
| 21394 case FTS5_PLAN_MATCH: |
| 21395 case FTS5_PLAN_SORTED_MATCH: |
| 21396 *pRowid = fts5CursorRowid(pCsr); |
| 21397 break; |
| 21398 |
| 21399 default: |
| 21400 *pRowid = sqlite3_column_int64(pCsr->pStmt, 0); |
| 21401 break; |
| 21402 } |
| 21403 |
| 21404 return SQLITE_OK; |
| 21405 } |
| 21406 |
| 21407 /* |
| 21408 ** If the cursor requires seeking (FTS5CSR_REQUIRE_CONTENT is set), seek it. |
| 21409 ** Return SQLITE_OK if no error occurs, or an SQLite error code otherwise. |
| 21410 ** |
| 21411 ** If argument bErrormsg is true and an error occurs, an error message may |
| 21412 ** be left in sqlite3_vtab.zErrMsg. |
| 21413 */ |
| 21414 static int fts5SeekCursor(Fts5Cursor *pCsr, int bErrormsg){ |
| 21415 int rc = SQLITE_OK; |
| 21416 |
| 21417 /* If the cursor does not yet have a statement handle, obtain one now. */ |
| 21418 if( pCsr->pStmt==0 ){ |
| 21419 Fts5Table *pTab = (Fts5Table*)(pCsr->base.pVtab); |
| 21420 int eStmt = fts5StmtType(pCsr); |
| 21421 rc = sqlite3Fts5StorageStmt( |
| 21422 pTab->pStorage, eStmt, &pCsr->pStmt, (bErrormsg?&pTab->base.zErrMsg:0) |
| 21423 ); |
| 21424 assert( rc!=SQLITE_OK || pTab->base.zErrMsg==0 ); |
| 21425 assert( CsrFlagTest(pCsr, FTS5CSR_REQUIRE_CONTENT) ); |
| 21426 } |
| 21427 |
| 21428 if( rc==SQLITE_OK && CsrFlagTest(pCsr, FTS5CSR_REQUIRE_CONTENT) ){ |
| 21429 assert( pCsr->pExpr ); |
| 21430 sqlite3_reset(pCsr->pStmt); |
| 21431 sqlite3_bind_int64(pCsr->pStmt, 1, fts5CursorRowid(pCsr)); |
| 21432 rc = sqlite3_step(pCsr->pStmt); |
| 21433 if( rc==SQLITE_ROW ){ |
| 21434 rc = SQLITE_OK; |
| 21435 CsrFlagClear(pCsr, FTS5CSR_REQUIRE_CONTENT); |
| 21436 }else{ |
| 21437 rc = sqlite3_reset(pCsr->pStmt); |
| 21438 if( rc==SQLITE_OK ){ |
| 21439 rc = FTS5_CORRUPT; |
| 21440 } |
| 21441 } |
| 21442 } |
| 21443 return rc; |
| 21444 } |
| 21445 |
| 21446 static void fts5SetVtabError(Fts5Table *p, const char *zFormat, ...){ |
| 21447 va_list ap; /* ... printf arguments */ |
| 21448 va_start(ap, zFormat); |
| 21449 assert( p->base.zErrMsg==0 ); |
| 21450 p->base.zErrMsg = sqlite3_vmprintf(zFormat, ap); |
| 21451 va_end(ap); |
| 21452 } |
| 21453 |
| 21454 /* |
| 21455 ** This function is called to handle an FTS INSERT command. In other words, |
| 21456 ** an INSERT statement of the form: |
| 21457 ** |
| 21458 ** INSERT INTO fts(fts) VALUES($pCmd) |
| 21459 ** INSERT INTO fts(fts, rank) VALUES($pCmd, $pVal) |
| 21460 ** |
| 21461 ** Argument zCmd is the text assigned to column "fts" by the INSERT |
| 21462 ** statement and pVal the value assigned to column "rank". Returns |
| 21463 ** SQLITE_OK if successful, or an SQLite error code if an error occurs. |
| 21464 ** |
| 21465 ** The commands implemented by this function are documented in the "Special |
| 21466 ** INSERT Directives" section of the documentation. It should be updated if |
| 21467 ** more commands are added to this function. |
| 21468 */ |
| 21469 static int fts5SpecialInsert( |
| 21470 Fts5Table *pTab, /* Fts5 table object */ |
| 21471 const char *zCmd, /* Text inserted into table-name column */ |
| 21472 sqlite3_value *pVal /* Value inserted into rank column */ |
| 21473 ){ |
| 21474 Fts5Config *pConfig = pTab->pConfig; |
| 21475 int rc = SQLITE_OK; |
| 21476 int bError = 0; |
| 21477 |
| 21478 if( 0==sqlite3_stricmp("delete-all", zCmd) ){ |
| 21479 if( pConfig->eContent==FTS5_CONTENT_NORMAL ){ |
| 21480 fts5SetVtabError(pTab, |
| 21481 "'delete-all' may only be used with a " |
| 21482 "contentless or external content fts5 table" |
| 21483 ); |
| 21484 rc = SQLITE_ERROR; |
| 21485 }else{ |
| 21486 rc = sqlite3Fts5StorageDeleteAll(pTab->pStorage); |
| 21487 } |
| 21488 }else if( 0==sqlite3_stricmp("rebuild", zCmd) ){ |
| 21489 if( pConfig->eContent==FTS5_CONTENT_NONE ){ |
| 21490 fts5SetVtabError(pTab, |
| 21491 "'rebuild' may not be used with a contentless fts5 table" |
| 21492 ); |
| 21493 rc = SQLITE_ERROR; |
| 21494 }else{ |
| 21495 rc = sqlite3Fts5StorageRebuild(pTab->pStorage); |
| 21496 } |
| 21497 }else if( 0==sqlite3_stricmp("optimize", zCmd) ){ |
| 21498 rc = sqlite3Fts5StorageOptimize(pTab->pStorage); |
| 21499 }else if( 0==sqlite3_stricmp("merge", zCmd) ){ |
| 21500 int nMerge = sqlite3_value_int(pVal); |
| 21501 rc = sqlite3Fts5StorageMerge(pTab->pStorage, nMerge); |
| 21502 }else if( 0==sqlite3_stricmp("integrity-check", zCmd) ){ |
| 21503 rc = sqlite3Fts5StorageIntegrity(pTab->pStorage); |
| 21504 #ifdef SQLITE_DEBUG |
| 21505 }else if( 0==sqlite3_stricmp("prefix-index", zCmd) ){ |
| 21506 pConfig->bPrefixIndex = sqlite3_value_int(pVal); |
| 21507 #endif |
| 21508 }else{ |
| 21509 rc = sqlite3Fts5IndexLoadConfig(pTab->pIndex); |
| 21510 if( rc==SQLITE_OK ){ |
| 21511 rc = sqlite3Fts5ConfigSetValue(pTab->pConfig, zCmd, pVal, &bError); |
| 21512 } |
| 21513 if( rc==SQLITE_OK ){ |
| 21514 if( bError ){ |
| 21515 rc = SQLITE_ERROR; |
| 21516 }else{ |
| 21517 rc = sqlite3Fts5StorageConfigValue(pTab->pStorage, zCmd, pVal, 0); |
| 21518 } |
| 21519 } |
| 21520 } |
| 21521 return rc; |
| 21522 } |
| 21523 |
| 21524 static int fts5SpecialDelete( |
| 21525 Fts5Table *pTab, |
| 21526 sqlite3_value **apVal, |
| 21527 sqlite3_int64 *piRowid |
| 21528 ){ |
| 21529 int rc = SQLITE_OK; |
| 21530 int eType1 = sqlite3_value_type(apVal[1]); |
| 21531 if( eType1==SQLITE_INTEGER ){ |
| 21532 sqlite3_int64 iDel = sqlite3_value_int64(apVal[1]); |
| 21533 rc = sqlite3Fts5StorageSpecialDelete(pTab->pStorage, iDel, &apVal[2]); |
| 21534 } |
| 21535 return rc; |
| 21536 } |
| 21537 |
| 21538 static void fts5StorageInsert( |
| 21539 int *pRc, |
| 21540 Fts5Table *pTab, |
| 21541 sqlite3_value **apVal, |
| 21542 i64 *piRowid |
| 21543 ){ |
| 21544 int rc = *pRc; |
| 21545 if( rc==SQLITE_OK ){ |
| 21546 rc = sqlite3Fts5StorageContentInsert(pTab->pStorage, apVal, piRowid); |
| 21547 } |
| 21548 if( rc==SQLITE_OK ){ |
| 21549 rc = sqlite3Fts5StorageIndexInsert(pTab->pStorage, apVal, *piRowid); |
| 21550 } |
| 21551 *pRc = rc; |
| 21552 } |
| 21553 |
| 21554 /* |
| 21555 ** This function is the implementation of the xUpdate callback used by |
| 21556 ** FTS5 virtual tables. It is invoked by SQLite each time a row is to be |
| 21557 ** inserted, updated or deleted. |
| 21558 ** |
| 21559 ** A delete specifies a single argument - the rowid of the row to remove. |
| 21560 ** |
| 21561 ** Update and insert operations pass: |
| 21562 ** |
| 21563 ** 1. The "old" rowid, or NULL. |
| 21564 ** 2. The "new" rowid. |
| 21565 ** 3. Values for each of the nCol matchable columns. |
| 21566 ** 4. Values for the two hidden columns (<tablename> and "rank"). |
| 21567 */ |
| 21568 static int fts5UpdateMethod( |
| 21569 sqlite3_vtab *pVtab, /* Virtual table handle */ |
| 21570 int nArg, /* Size of argument array */ |
| 21571 sqlite3_value **apVal, /* Array of arguments */ |
| 21572 sqlite_int64 *pRowid /* OUT: The affected (or effected) rowid */ |
| 21573 ){ |
| 21574 Fts5Table *pTab = (Fts5Table*)pVtab; |
| 21575 Fts5Config *pConfig = pTab->pConfig; |
| 21576 int eType0; /* value_type() of apVal[0] */ |
| 21577 int rc = SQLITE_OK; /* Return code */ |
| 21578 |
| 21579 /* A transaction must be open when this is called. */ |
| 21580 assert( pTab->ts.eState==1 ); |
| 21581 |
| 21582 assert( pVtab->zErrMsg==0 ); |
| 21583 assert( nArg==1 || nArg==(2+pConfig->nCol+2) ); |
| 21584 assert( nArg==1 |
| 21585 || sqlite3_value_type(apVal[1])==SQLITE_INTEGER |
| 21586 || sqlite3_value_type(apVal[1])==SQLITE_NULL |
| 21587 ); |
| 21588 assert( pTab->pConfig->pzErrmsg==0 ); |
| 21589 pTab->pConfig->pzErrmsg = &pTab->base.zErrMsg; |
| 21590 |
| 21591 /* Put any active cursors into REQUIRE_SEEK state. */ |
| 21592 fts5TripCursors(pTab); |
| 21593 |
| 21594 eType0 = sqlite3_value_type(apVal[0]); |
| 21595 if( eType0==SQLITE_NULL |
| 21596 && sqlite3_value_type(apVal[2+pConfig->nCol])!=SQLITE_NULL |
| 21597 ){ |
| 21598 /* A "special" INSERT op. These are handled separately. */ |
| 21599 const char *z = (const char*)sqlite3_value_text(apVal[2+pConfig->nCol]); |
| 21600 if( pConfig->eContent!=FTS5_CONTENT_NORMAL |
| 21601 && 0==sqlite3_stricmp("delete", z) |
| 21602 ){ |
| 21603 rc = fts5SpecialDelete(pTab, apVal, pRowid); |
| 21604 }else{ |
| 21605 rc = fts5SpecialInsert(pTab, z, apVal[2 + pConfig->nCol + 1]); |
| 21606 } |
| 21607 }else{ |
| 21608 /* A regular INSERT, UPDATE or DELETE statement. The trick here is that |
| 21609 ** any conflict on the rowid value must be detected before any |
| 21610 ** modifications are made to the database file. There are 4 cases: |
| 21611 ** |
| 21612 ** 1) DELETE |
| 21613 ** 2) UPDATE (rowid not modified) |
| 21614 ** 3) UPDATE (rowid modified) |
| 21615 ** 4) INSERT |
| 21616 ** |
| 21617 ** Cases 3 and 4 may violate the rowid constraint. |
| 21618 */ |
| 21619 int eConflict = SQLITE_ABORT; |
| 21620 if( pConfig->eContent==FTS5_CONTENT_NORMAL ){ |
| 21621 eConflict = sqlite3_vtab_on_conflict(pConfig->db); |
| 21622 } |
| 21623 |
| 21624 assert( eType0==SQLITE_INTEGER || eType0==SQLITE_NULL ); |
| 21625 assert( nArg!=1 || eType0==SQLITE_INTEGER ); |
| 21626 |
| 21627 /* Filter out attempts to run UPDATE or DELETE on contentless tables. |
| 21628 ** This is not supported. */ |
| 21629 if( eType0==SQLITE_INTEGER && fts5IsContentless(pTab) ){ |
| 21630 pTab->base.zErrMsg = sqlite3_mprintf( |
| 21631 "cannot %s contentless fts5 table: %s", |
| 21632 (nArg>1 ? "UPDATE" : "DELETE from"), pConfig->zName |
| 21633 ); |
| 21634 rc = SQLITE_ERROR; |
| 21635 } |
| 21636 |
| 21637 /* Case 1: DELETE */ |
| 21638 else if( nArg==1 ){ |
| 21639 i64 iDel = sqlite3_value_int64(apVal[0]); /* Rowid to delete */ |
| 21640 rc = sqlite3Fts5StorageDelete(pTab->pStorage, iDel); |
| 21641 } |
| 21642 |
| 21643 /* Case 4: INSERT */ |
| 21644 else if( eType0!=SQLITE_INTEGER ){ |
| 21645 /* If this is a REPLACE, first remove the current entry (if any) */ |
| 21646 if( eConflict==SQLITE_REPLACE |
| 21647 && sqlite3_value_type(apVal[1])==SQLITE_INTEGER |
| 21648 ){ |
| 21649 i64 iNew = sqlite3_value_int64(apVal[1]); /* Rowid to delete */ |
| 21650 rc = sqlite3Fts5StorageDelete(pTab->pStorage, iNew); |
| 21651 } |
| 21652 fts5StorageInsert(&rc, pTab, apVal, pRowid); |
| 21653 } |
| 21654 |
| 21655 /* Cases 2 and 3: UPDATE */ |
| 21656 else{ |
| 21657 i64 iOld = sqlite3_value_int64(apVal[0]); /* Old rowid */ |
| 21658 i64 iNew = sqlite3_value_int64(apVal[1]); /* New rowid */ |
| 21659 if( iOld!=iNew ){ |
| 21660 if( eConflict==SQLITE_REPLACE ){ |
| 21661 rc = sqlite3Fts5StorageDelete(pTab->pStorage, iOld); |
| 21662 if( rc==SQLITE_OK ){ |
| 21663 rc = sqlite3Fts5StorageDelete(pTab->pStorage, iNew); |
| 21664 } |
| 21665 fts5StorageInsert(&rc, pTab, apVal, pRowid); |
| 21666 }else{ |
| 21667 rc = sqlite3Fts5StorageContentInsert(pTab->pStorage, apVal, pRowid); |
| 21668 if( rc==SQLITE_OK ){ |
| 21669 rc = sqlite3Fts5StorageDelete(pTab->pStorage, iOld); |
| 21670 } |
| 21671 if( rc==SQLITE_OK ){ |
| 21672 rc = sqlite3Fts5StorageIndexInsert(pTab->pStorage, apVal, *pRowid); |
| 21673 } |
| 21674 } |
| 21675 }else{ |
| 21676 rc = sqlite3Fts5StorageDelete(pTab->pStorage, iOld); |
| 21677 fts5StorageInsert(&rc, pTab, apVal, pRowid); |
| 21678 } |
| 21679 } |
| 21680 } |
| 21681 |
| 21682 pTab->pConfig->pzErrmsg = 0; |
| 21683 return rc; |
| 21684 } |
| 21685 |
| 21686 /* |
| 21687 ** Implementation of xSync() method. |
| 21688 */ |
| 21689 static int fts5SyncMethod(sqlite3_vtab *pVtab){ |
| 21690 int rc; |
| 21691 Fts5Table *pTab = (Fts5Table*)pVtab; |
| 21692 fts5CheckTransactionState(pTab, FTS5_SYNC, 0); |
| 21693 pTab->pConfig->pzErrmsg = &pTab->base.zErrMsg; |
| 21694 fts5TripCursors(pTab); |
| 21695 rc = sqlite3Fts5StorageSync(pTab->pStorage, 1); |
| 21696 pTab->pConfig->pzErrmsg = 0; |
| 21697 return rc; |
| 21698 } |
| 21699 |
| 21700 /* |
| 21701 ** Implementation of xBegin() method. |
| 21702 */ |
| 21703 static int fts5BeginMethod(sqlite3_vtab *pVtab){ |
| 21704 fts5CheckTransactionState((Fts5Table*)pVtab, FTS5_BEGIN, 0); |
| 21705 return SQLITE_OK; |
| 21706 } |
| 21707 |
| 21708 /* |
| 21709 ** Implementation of xCommit() method. This is a no-op. The contents of |
| 21710 ** the pending-terms hash-table have already been flushed into the database |
| 21711 ** by fts5SyncMethod(). |
| 21712 */ |
| 21713 static int fts5CommitMethod(sqlite3_vtab *pVtab){ |
| 21714 fts5CheckTransactionState((Fts5Table*)pVtab, FTS5_COMMIT, 0); |
| 21715 return SQLITE_OK; |
| 21716 } |
| 21717 |
| 21718 /* |
| 21719 ** Implementation of xRollback(). Discard the contents of the pending-terms |
| 21720 ** hash-table. Any changes made to the database are reverted by SQLite. |
| 21721 */ |
| 21722 static int fts5RollbackMethod(sqlite3_vtab *pVtab){ |
| 21723 int rc; |
| 21724 Fts5Table *pTab = (Fts5Table*)pVtab; |
| 21725 fts5CheckTransactionState(pTab, FTS5_ROLLBACK, 0); |
| 21726 rc = sqlite3Fts5StorageRollback(pTab->pStorage); |
| 21727 return rc; |
| 21728 } |
| 21729 |
| 21730 static void *fts5ApiUserData(Fts5Context *pCtx){ |
| 21731 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx; |
| 21732 return pCsr->pAux->pUserData; |
| 21733 } |
| 21734 |
| 21735 static int fts5ApiColumnCount(Fts5Context *pCtx){ |
| 21736 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx; |
| 21737 return ((Fts5Table*)(pCsr->base.pVtab))->pConfig->nCol; |
| 21738 } |
| 21739 |
| 21740 static int fts5ApiColumnTotalSize( |
| 21741 Fts5Context *pCtx, |
| 21742 int iCol, |
| 21743 sqlite3_int64 *pnToken |
| 21744 ){ |
| 21745 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx; |
| 21746 Fts5Table *pTab = (Fts5Table*)(pCsr->base.pVtab); |
| 21747 return sqlite3Fts5StorageSize(pTab->pStorage, iCol, pnToken); |
| 21748 } |
| 21749 |
| 21750 static int fts5ApiRowCount(Fts5Context *pCtx, i64 *pnRow){ |
| 21751 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx; |
| 21752 Fts5Table *pTab = (Fts5Table*)(pCsr->base.pVtab); |
| 21753 return sqlite3Fts5StorageRowCount(pTab->pStorage, pnRow); |
| 21754 } |
| 21755 |
| 21756 static int fts5ApiTokenize( |
| 21757 Fts5Context *pCtx, |
| 21758 const char *pText, int nText, |
| 21759 void *pUserData, |
| 21760 int (*xToken)(void*, int, const char*, int, int, int) |
| 21761 ){ |
| 21762 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx; |
| 21763 Fts5Table *pTab = (Fts5Table*)(pCsr->base.pVtab); |
| 21764 return sqlite3Fts5Tokenize( |
| 21765 pTab->pConfig, FTS5_TOKENIZE_AUX, pText, nText, pUserData, xToken |
| 21766 ); |
| 21767 } |
| 21768 |
| 21769 static int fts5ApiPhraseCount(Fts5Context *pCtx){ |
| 21770 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx; |
| 21771 return sqlite3Fts5ExprPhraseCount(pCsr->pExpr); |
| 21772 } |
| 21773 |
| 21774 static int fts5ApiPhraseSize(Fts5Context *pCtx, int iPhrase){ |
| 21775 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx; |
| 21776 return sqlite3Fts5ExprPhraseSize(pCsr->pExpr, iPhrase); |
| 21777 } |
| 21778 |
| 21779 static int fts5CsrPoslist(Fts5Cursor *pCsr, int iPhrase, const u8 **pa){ |
| 21780 int n; |
| 21781 if( pCsr->pSorter ){ |
| 21782 Fts5Sorter *pSorter = pCsr->pSorter; |
| 21783 int i1 = (iPhrase==0 ? 0 : pSorter->aIdx[iPhrase-1]); |
| 21784 n = pSorter->aIdx[iPhrase] - i1; |
| 21785 *pa = &pSorter->aPoslist[i1]; |
| 21786 }else{ |
| 21787 n = sqlite3Fts5ExprPoslist(pCsr->pExpr, iPhrase, pa); |
| 21788 } |
| 21789 return n; |
| 21790 } |
| 21791 |
| 21792 /* |
| 21793 ** Ensure that the Fts5Cursor.nInstCount and aInst[] variables are populated |
| 21794 ** correctly for the current view. Return SQLITE_OK if successful, or an |
| 21795 ** SQLite error code otherwise. |
| 21796 */ |
| 21797 static int fts5CacheInstArray(Fts5Cursor *pCsr){ |
| 21798 int rc = SQLITE_OK; |
| 21799 Fts5PoslistReader *aIter; /* One iterator for each phrase */ |
| 21800 int nIter; /* Number of iterators/phrases */ |
| 21801 |
| 21802 nIter = sqlite3Fts5ExprPhraseCount(pCsr->pExpr); |
| 21803 if( pCsr->aInstIter==0 ){ |
| 21804 int nByte = sizeof(Fts5PoslistReader) * nIter; |
| 21805 pCsr->aInstIter = (Fts5PoslistReader*)sqlite3Fts5MallocZero(&rc, nByte); |
| 21806 } |
| 21807 aIter = pCsr->aInstIter; |
| 21808 |
| 21809 if( aIter ){ |
| 21810 int nInst = 0; /* Number of instances seen so far */ |
| 21811 int i; |
| 21812 |
| 21813 /* Initialize all iterators */ |
| 21814 for(i=0; i<nIter; i++){ |
| 21815 const u8 *a; |
| 21816 int n = fts5CsrPoslist(pCsr, i, &a); |
| 21817 sqlite3Fts5PoslistReaderInit(a, n, &aIter[i]); |
| 21818 } |
| 21819 |
| 21820 while( 1 ){ |
| 21821 int *aInst; |
| 21822 int iBest = -1; |
| 21823 for(i=0; i<nIter; i++){ |
| 21824 if( (aIter[i].bEof==0) |
| 21825 && (iBest<0 || aIter[i].iPos<aIter[iBest].iPos) |
| 21826 ){ |
| 21827 iBest = i; |
| 21828 } |
| 21829 } |
| 21830 if( iBest<0 ) break; |
| 21831 |
| 21832 nInst++; |
| 21833 if( nInst>=pCsr->nInstAlloc ){ |
| 21834 pCsr->nInstAlloc = pCsr->nInstAlloc ? pCsr->nInstAlloc*2 : 32; |
| 21835 aInst = (int*)sqlite3_realloc( |
| 21836 pCsr->aInst, pCsr->nInstAlloc*sizeof(int)*3 |
| 21837 ); |
| 21838 if( aInst ){ |
| 21839 pCsr->aInst = aInst; |
| 21840 }else{ |
| 21841 rc = SQLITE_NOMEM; |
| 21842 break; |
| 21843 } |
| 21844 } |
| 21845 |
| 21846 aInst = &pCsr->aInst[3 * (nInst-1)]; |
| 21847 aInst[0] = iBest; |
| 21848 aInst[1] = FTS5_POS2COLUMN(aIter[iBest].iPos); |
| 21849 aInst[2] = FTS5_POS2OFFSET(aIter[iBest].iPos); |
| 21850 sqlite3Fts5PoslistReaderNext(&aIter[iBest]); |
| 21851 } |
| 21852 |
| 21853 pCsr->nInstCount = nInst; |
| 21854 CsrFlagClear(pCsr, FTS5CSR_REQUIRE_INST); |
| 21855 } |
| 21856 return rc; |
| 21857 } |
| 21858 |
| 21859 static int fts5ApiInstCount(Fts5Context *pCtx, int *pnInst){ |
| 21860 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx; |
| 21861 int rc = SQLITE_OK; |
| 21862 if( CsrFlagTest(pCsr, FTS5CSR_REQUIRE_INST)==0 |
| 21863 || SQLITE_OK==(rc = fts5CacheInstArray(pCsr)) ){ |
| 21864 *pnInst = pCsr->nInstCount; |
| 21865 } |
| 21866 return rc; |
| 21867 } |
| 21868 |
| 21869 static int fts5ApiInst( |
| 21870 Fts5Context *pCtx, |
| 21871 int iIdx, |
| 21872 int *piPhrase, |
| 21873 int *piCol, |
| 21874 int *piOff |
| 21875 ){ |
| 21876 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx; |
| 21877 int rc = SQLITE_OK; |
| 21878 if( CsrFlagTest(pCsr, FTS5CSR_REQUIRE_INST)==0 |
| 21879 || SQLITE_OK==(rc = fts5CacheInstArray(pCsr)) |
| 21880 ){ |
| 21881 if( iIdx<0 || iIdx>=pCsr->nInstCount ){ |
| 21882 rc = SQLITE_RANGE; |
| 21883 }else{ |
| 21884 *piPhrase = pCsr->aInst[iIdx*3]; |
| 21885 *piCol = pCsr->aInst[iIdx*3 + 1]; |
| 21886 *piOff = pCsr->aInst[iIdx*3 + 2]; |
| 21887 } |
| 21888 } |
| 21889 return rc; |
| 21890 } |
| 21891 |
| 21892 static sqlite3_int64 fts5ApiRowid(Fts5Context *pCtx){ |
| 21893 return fts5CursorRowid((Fts5Cursor*)pCtx); |
| 21894 } |
| 21895 |
| 21896 static int fts5ApiColumnText( |
| 21897 Fts5Context *pCtx, |
| 21898 int iCol, |
| 21899 const char **pz, |
| 21900 int *pn |
| 21901 ){ |
| 21902 int rc = SQLITE_OK; |
| 21903 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx; |
| 21904 if( fts5IsContentless((Fts5Table*)(pCsr->base.pVtab)) ){ |
| 21905 *pz = 0; |
| 21906 *pn = 0; |
| 21907 }else{ |
| 21908 rc = fts5SeekCursor(pCsr, 0); |
| 21909 if( rc==SQLITE_OK ){ |
| 21910 *pz = (const char*)sqlite3_column_text(pCsr->pStmt, iCol+1); |
| 21911 *pn = sqlite3_column_bytes(pCsr->pStmt, iCol+1); |
| 21912 } |
| 21913 } |
| 21914 return rc; |
| 21915 } |
| 21916 |
| 21917 static int fts5ColumnSizeCb( |
| 21918 void *pContext, /* Pointer to int */ |
| 21919 int tflags, |
| 21920 const char *pToken, /* Buffer containing token */ |
| 21921 int nToken, /* Size of token in bytes */ |
| 21922 int iStart, /* Start offset of token */ |
| 21923 int iEnd /* End offset of token */ |
| 21924 ){ |
| 21925 int *pCnt = (int*)pContext; |
| 21926 if( (tflags & FTS5_TOKEN_COLOCATED)==0 ){ |
| 21927 (*pCnt)++; |
| 21928 } |
| 21929 return SQLITE_OK; |
| 21930 } |
| 21931 |
| 21932 static int fts5ApiColumnSize(Fts5Context *pCtx, int iCol, int *pnToken){ |
| 21933 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx; |
| 21934 Fts5Table *pTab = (Fts5Table*)(pCsr->base.pVtab); |
| 21935 Fts5Config *pConfig = pTab->pConfig; |
| 21936 int rc = SQLITE_OK; |
| 21937 |
| 21938 if( CsrFlagTest(pCsr, FTS5CSR_REQUIRE_DOCSIZE) ){ |
| 21939 if( pConfig->bColumnsize ){ |
| 21940 i64 iRowid = fts5CursorRowid(pCsr); |
| 21941 rc = sqlite3Fts5StorageDocsize(pTab->pStorage, iRowid, pCsr->aColumnSize); |
| 21942 }else if( pConfig->zContent==0 ){ |
| 21943 int i; |
| 21944 for(i=0; i<pConfig->nCol; i++){ |
| 21945 if( pConfig->abUnindexed[i]==0 ){ |
| 21946 pCsr->aColumnSize[i] = -1; |
| 21947 } |
| 21948 } |
| 21949 }else{ |
| 21950 int i; |
| 21951 for(i=0; rc==SQLITE_OK && i<pConfig->nCol; i++){ |
| 21952 if( pConfig->abUnindexed[i]==0 ){ |
| 21953 const char *z; int n; |
| 21954 void *p = (void*)(&pCsr->aColumnSize[i]); |
| 21955 pCsr->aColumnSize[i] = 0; |
| 21956 rc = fts5ApiColumnText(pCtx, i, &z, &n); |
| 21957 if( rc==SQLITE_OK ){ |
| 21958 rc = sqlite3Fts5Tokenize( |
| 21959 pConfig, FTS5_TOKENIZE_AUX, z, n, p, fts5ColumnSizeCb |
| 21960 ); |
| 21961 } |
| 21962 } |
| 21963 } |
| 21964 } |
| 21965 CsrFlagClear(pCsr, FTS5CSR_REQUIRE_DOCSIZE); |
| 21966 } |
| 21967 if( iCol<0 ){ |
| 21968 int i; |
| 21969 *pnToken = 0; |
| 21970 for(i=0; i<pConfig->nCol; i++){ |
| 21971 *pnToken += pCsr->aColumnSize[i]; |
| 21972 } |
| 21973 }else if( iCol<pConfig->nCol ){ |
| 21974 *pnToken = pCsr->aColumnSize[iCol]; |
| 21975 }else{ |
| 21976 *pnToken = 0; |
| 21977 rc = SQLITE_RANGE; |
| 21978 } |
| 21979 return rc; |
| 21980 } |
| 21981 |
| 21982 /* |
| 21983 ** Implementation of the xSetAuxdata() method. |
| 21984 */ |
| 21985 static int fts5ApiSetAuxdata( |
| 21986 Fts5Context *pCtx, /* Fts5 context */ |
| 21987 void *pPtr, /* Pointer to save as auxdata */ |
| 21988 void(*xDelete)(void*) /* Destructor for pPtr (or NULL) */ |
| 21989 ){ |
| 21990 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx; |
| 21991 Fts5Auxdata *pData; |
| 21992 |
| 21993 /* Search through the cursor's list of Fts5Auxdata objects for one that |
| 21994 ** corresponds to the currently executing auxiliary function. */ |
| 21995 for(pData=pCsr->pAuxdata; pData; pData=pData->pNext){ |
| 21996 if( pData->pAux==pCsr->pAux ) break; |
| 21997 } |
| 21998 |
| 21999 if( pData ){ |
| 22000 if( pData->xDelete ){ |
| 22001 pData->xDelete(pData->pPtr); |
| 22002 } |
| 22003 }else{ |
| 22004 int rc = SQLITE_OK; |
| 22005 pData = (Fts5Auxdata*)sqlite3Fts5MallocZero(&rc, sizeof(Fts5Auxdata)); |
| 22006 if( pData==0 ){ |
| 22007 if( xDelete ) xDelete(pPtr); |
| 22008 return rc; |
| 22009 } |
| 22010 pData->pAux = pCsr->pAux; |
| 22011 pData->pNext = pCsr->pAuxdata; |
| 22012 pCsr->pAuxdata = pData; |
| 22013 } |
| 22014 |
| 22015 pData->xDelete = xDelete; |
| 22016 pData->pPtr = pPtr; |
| 22017 return SQLITE_OK; |
| 22018 } |
| 22019 |
| 22020 static void *fts5ApiGetAuxdata(Fts5Context *pCtx, int bClear){ |
| 22021 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx; |
| 22022 Fts5Auxdata *pData; |
| 22023 void *pRet = 0; |
| 22024 |
| 22025 for(pData=pCsr->pAuxdata; pData; pData=pData->pNext){ |
| 22026 if( pData->pAux==pCsr->pAux ) break; |
| 22027 } |
| 22028 |
| 22029 if( pData ){ |
| 22030 pRet = pData->pPtr; |
| 22031 if( bClear ){ |
| 22032 pData->pPtr = 0; |
| 22033 pData->xDelete = 0; |
| 22034 } |
| 22035 } |
| 22036 |
| 22037 return pRet; |
| 22038 } |
| 22039 |
| 22040 static void fts5ApiPhraseNext( |
| 22041 Fts5Context *pCtx, |
| 22042 Fts5PhraseIter *pIter, |
| 22043 int *piCol, int *piOff |
| 22044 ){ |
| 22045 if( pIter->a>=pIter->b ){ |
| 22046 *piCol = -1; |
| 22047 *piOff = -1; |
| 22048 }else{ |
| 22049 int iVal; |
| 22050 pIter->a += fts5GetVarint32(pIter->a, iVal); |
| 22051 if( iVal==1 ){ |
| 22052 pIter->a += fts5GetVarint32(pIter->a, iVal); |
| 22053 *piCol = iVal; |
| 22054 *piOff = 0; |
| 22055 pIter->a += fts5GetVarint32(pIter->a, iVal); |
| 22056 } |
| 22057 *piOff += (iVal-2); |
| 22058 } |
| 22059 } |
| 22060 |
| 22061 static void fts5ApiPhraseFirst( |
| 22062 Fts5Context *pCtx, |
| 22063 int iPhrase, |
| 22064 Fts5PhraseIter *pIter, |
| 22065 int *piCol, int *piOff |
| 22066 ){ |
| 22067 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx; |
| 22068 int n = fts5CsrPoslist(pCsr, iPhrase, &pIter->a); |
| 22069 pIter->b = &pIter->a[n]; |
| 22070 *piCol = 0; |
| 22071 *piOff = 0; |
| 22072 fts5ApiPhraseNext(pCtx, pIter, piCol, piOff); |
| 22073 } |
| 22074 |
| 22075 static int fts5ApiQueryPhrase(Fts5Context*, int, void*, |
| 22076 int(*)(const Fts5ExtensionApi*, Fts5Context*, void*) |
| 22077 ); |
| 22078 |
| 22079 static const Fts5ExtensionApi sFts5Api = { |
| 22080 2, /* iVersion */ |
| 22081 fts5ApiUserData, |
| 22082 fts5ApiColumnCount, |
| 22083 fts5ApiRowCount, |
| 22084 fts5ApiColumnTotalSize, |
| 22085 fts5ApiTokenize, |
| 22086 fts5ApiPhraseCount, |
| 22087 fts5ApiPhraseSize, |
| 22088 fts5ApiInstCount, |
| 22089 fts5ApiInst, |
| 22090 fts5ApiRowid, |
| 22091 fts5ApiColumnText, |
| 22092 fts5ApiColumnSize, |
| 22093 fts5ApiQueryPhrase, |
| 22094 fts5ApiSetAuxdata, |
| 22095 fts5ApiGetAuxdata, |
| 22096 fts5ApiPhraseFirst, |
| 22097 fts5ApiPhraseNext, |
| 22098 }; |
| 22099 |
| 22100 |
| 22101 /* |
| 22102 ** Implementation of API function xQueryPhrase(). |
| 22103 */ |
| 22104 static int fts5ApiQueryPhrase( |
| 22105 Fts5Context *pCtx, |
| 22106 int iPhrase, |
| 22107 void *pUserData, |
| 22108 int(*xCallback)(const Fts5ExtensionApi*, Fts5Context*, void*) |
| 22109 ){ |
| 22110 Fts5Cursor *pCsr = (Fts5Cursor*)pCtx; |
| 22111 Fts5Table *pTab = (Fts5Table*)(pCsr->base.pVtab); |
| 22112 int rc; |
| 22113 Fts5Cursor *pNew = 0; |
| 22114 |
| 22115 rc = fts5OpenMethod(pCsr->base.pVtab, (sqlite3_vtab_cursor**)&pNew); |
| 22116 if( rc==SQLITE_OK ){ |
| 22117 Fts5Config *pConf = pTab->pConfig; |
| 22118 pNew->ePlan = FTS5_PLAN_MATCH; |
| 22119 pNew->iFirstRowid = SMALLEST_INT64; |
| 22120 pNew->iLastRowid = LARGEST_INT64; |
| 22121 pNew->base.pVtab = (sqlite3_vtab*)pTab; |
| 22122 rc = sqlite3Fts5ExprClonePhrase(pConf, pCsr->pExpr, iPhrase, &pNew->pExpr); |
| 22123 } |
| 22124 |
| 22125 if( rc==SQLITE_OK ){ |
| 22126 for(rc = fts5CursorFirst(pTab, pNew, 0); |
| 22127 rc==SQLITE_OK && CsrFlagTest(pNew, FTS5CSR_EOF)==0; |
| 22128 rc = fts5NextMethod((sqlite3_vtab_cursor*)pNew) |
| 22129 ){ |
| 22130 rc = xCallback(&sFts5Api, (Fts5Context*)pNew, pUserData); |
| 22131 if( rc!=SQLITE_OK ){ |
| 22132 if( rc==SQLITE_DONE ) rc = SQLITE_OK; |
| 22133 break; |
| 22134 } |
| 22135 } |
| 22136 } |
| 22137 |
| 22138 fts5CloseMethod((sqlite3_vtab_cursor*)pNew); |
| 22139 return rc; |
| 22140 } |
| 22141 |
| 22142 static void fts5ApiInvoke( |
| 22143 Fts5Auxiliary *pAux, |
| 22144 Fts5Cursor *pCsr, |
| 22145 sqlite3_context *context, |
| 22146 int argc, |
| 22147 sqlite3_value **argv |
| 22148 ){ |
| 22149 assert( pCsr->pAux==0 ); |
| 22150 pCsr->pAux = pAux; |
| 22151 pAux->xFunc(&sFts5Api, (Fts5Context*)pCsr, context, argc, argv); |
| 22152 pCsr->pAux = 0; |
| 22153 } |
| 22154 |
| 22155 static Fts5Cursor *fts5CursorFromCsrid(Fts5Global *pGlobal, i64 iCsrId){ |
| 22156 Fts5Cursor *pCsr; |
| 22157 for(pCsr=pGlobal->pCsr; pCsr; pCsr=pCsr->pNext){ |
| 22158 if( pCsr->iCsrId==iCsrId ) break; |
| 22159 } |
| 22160 return pCsr; |
| 22161 } |
| 22162 |
| 22163 static void fts5ApiCallback( |
| 22164 sqlite3_context *context, |
| 22165 int argc, |
| 22166 sqlite3_value **argv |
| 22167 ){ |
| 22168 |
| 22169 Fts5Auxiliary *pAux; |
| 22170 Fts5Cursor *pCsr; |
| 22171 i64 iCsrId; |
| 22172 |
| 22173 assert( argc>=1 ); |
| 22174 pAux = (Fts5Auxiliary*)sqlite3_user_data(context); |
| 22175 iCsrId = sqlite3_value_int64(argv[0]); |
| 22176 |
| 22177 pCsr = fts5CursorFromCsrid(pAux->pGlobal, iCsrId); |
| 22178 if( pCsr==0 ){ |
| 22179 char *zErr = sqlite3_mprintf("no such cursor: %lld", iCsrId); |
| 22180 sqlite3_result_error(context, zErr, -1); |
| 22181 sqlite3_free(zErr); |
| 22182 }else{ |
| 22183 fts5ApiInvoke(pAux, pCsr, context, argc-1, &argv[1]); |
| 22184 } |
| 22185 } |
| 22186 |
| 22187 |
| 22188 /* |
| 22189 ** Given cursor id iId, return a pointer to the corresponding Fts5Index |
| 22190 ** object, or NULL if the cursor id does not exist. |
| 22191 ** |
| 22192 ** If successful, set *ppConfig to point to the associated config object |
| 22193 ** before returning. |
| 22194 */ |
| 22195 static Fts5Index *sqlite3Fts5IndexFromCsrid( |
| 22196 Fts5Global *pGlobal, /* FTS5 global context for db handle */ |
| 22197 i64 iCsrId, /* Id of cursor to find */ |
| 22198 Fts5Config **ppConfig /* OUT: Configuration object */ |
| 22199 ){ |
| 22200 Fts5Cursor *pCsr; |
| 22201 Fts5Table *pTab; |
| 22202 |
| 22203 pCsr = fts5CursorFromCsrid(pGlobal, iCsrId); |
| 22204 pTab = (Fts5Table*)pCsr->base.pVtab; |
| 22205 *ppConfig = pTab->pConfig; |
| 22206 |
| 22207 return pTab->pIndex; |
| 22208 } |
| 22209 |
| 22210 /* |
| 22211 ** Return a "position-list blob" corresponding to the current position of |
| 22212 ** cursor pCsr via sqlite3_result_blob(). A position-list blob contains |
| 22213 ** the current position-list for each phrase in the query associated with |
| 22214 ** cursor pCsr. |
| 22215 ** |
| 22216 ** A position-list blob begins with (nPhrase-1) varints, where nPhrase is |
| 22217 ** the number of phrases in the query. Following the varints are the |
| 22218 ** concatenated position lists for each phrase, in order. |
| 22219 ** |
| 22220 ** The first varint (if it exists) contains the size of the position list |
| 22221 ** for phrase 0. The second (same disclaimer) contains the size of position |
| 22222 ** list 1. And so on. There is no size field for the final position list, |
| 22223 ** as it can be derived from the total size of the blob. |
| 22224 */ |
| 22225 static int fts5PoslistBlob(sqlite3_context *pCtx, Fts5Cursor *pCsr){ |
| 22226 int i; |
| 22227 int rc = SQLITE_OK; |
| 22228 int nPhrase = sqlite3Fts5ExprPhraseCount(pCsr->pExpr); |
| 22229 Fts5Buffer val; |
| 22230 |
| 22231 memset(&val, 0, sizeof(Fts5Buffer)); |
| 22232 |
| 22233 /* Append the varints */ |
| 22234 for(i=0; i<(nPhrase-1); i++){ |
| 22235 const u8 *dummy; |
| 22236 int nByte = sqlite3Fts5ExprPoslist(pCsr->pExpr, i, &dummy); |
| 22237 sqlite3Fts5BufferAppendVarint(&rc, &val, nByte); |
| 22238 } |
| 22239 |
| 22240 /* Append the position lists */ |
| 22241 for(i=0; i<nPhrase; i++){ |
| 22242 const u8 *pPoslist; |
| 22243 int nPoslist; |
| 22244 nPoslist = sqlite3Fts5ExprPoslist(pCsr->pExpr, i, &pPoslist); |
| 22245 sqlite3Fts5BufferAppendBlob(&rc, &val, nPoslist, pPoslist); |
| 22246 } |
| 22247 |
| 22248 sqlite3_result_blob(pCtx, val.p, val.n, sqlite3_free); |
| 22249 return rc; |
| 22250 } |
| 22251 |
| 22252 /* |
| 22253 ** This is the xColumn method, called by SQLite to request a value from |
| 22254 ** the row that the supplied cursor currently points to. |
| 22255 */ |
| 22256 static int fts5ColumnMethod( |
| 22257 sqlite3_vtab_cursor *pCursor, /* Cursor to retrieve value from */ |
| 22258 sqlite3_context *pCtx, /* Context for sqlite3_result_xxx() calls */ |
| 22259 int iCol /* Index of column to read value from */ |
| 22260 ){ |
| 22261 Fts5Table *pTab = (Fts5Table*)(pCursor->pVtab); |
| 22262 Fts5Config *pConfig = pTab->pConfig; |
| 22263 Fts5Cursor *pCsr = (Fts5Cursor*)pCursor; |
| 22264 int rc = SQLITE_OK; |
| 22265 |
| 22266 assert( CsrFlagTest(pCsr, FTS5CSR_EOF)==0 ); |
| 22267 |
| 22268 if( pCsr->ePlan==FTS5_PLAN_SPECIAL ){ |
| 22269 if( iCol==pConfig->nCol ){ |
| 22270 sqlite3_result_int64(pCtx, pCsr->iSpecial); |
| 22271 } |
| 22272 }else |
| 22273 |
| 22274 if( iCol==pConfig->nCol ){ |
| 22275 /* User is requesting the value of the special column with the same name |
| 22276 ** as the table. Return the cursor integer id number. This value is only |
| 22277 ** useful in that it may be passed as the first argument to an FTS5 |
| 22278 ** auxiliary function. */ |
| 22279 sqlite3_result_int64(pCtx, pCsr->iCsrId); |
| 22280 }else if( iCol==pConfig->nCol+1 ){ |
| 22281 |
| 22282 /* The value of the "rank" column. */ |
| 22283 if( pCsr->ePlan==FTS5_PLAN_SOURCE ){ |
| 22284 fts5PoslistBlob(pCtx, pCsr); |
| 22285 }else if( |
| 22286 pCsr->ePlan==FTS5_PLAN_MATCH |
| 22287 || pCsr->ePlan==FTS5_PLAN_SORTED_MATCH |
| 22288 ){ |
| 22289 if( pCsr->pRank || SQLITE_OK==(rc = fts5FindRankFunction(pCsr)) ){ |
| 22290 fts5ApiInvoke(pCsr->pRank, pCsr, pCtx, pCsr->nRankArg, pCsr->apRankArg); |
| 22291 } |
| 22292 } |
| 22293 }else if( !fts5IsContentless(pTab) ){ |
| 22294 rc = fts5SeekCursor(pCsr, 1); |
| 22295 if( rc==SQLITE_OK ){ |
| 22296 sqlite3_result_value(pCtx, sqlite3_column_value(pCsr->pStmt, iCol+1)); |
| 22297 } |
| 22298 } |
| 22299 return rc; |
| 22300 } |
| 22301 |
| 22302 |
| 22303 /* |
| 22304 ** This routine implements the xFindFunction method for the FTS5 |
| 22305 ** virtual table. |
| 22306 */ |
| 22307 static int fts5FindFunctionMethod( |
| 22308 sqlite3_vtab *pVtab, /* Virtual table handle */ |
| 22309 int nArg, /* Number of SQL function arguments */ |
| 22310 const char *zName, /* Name of SQL function */ |
| 22311 void (**pxFunc)(sqlite3_context*,int,sqlite3_value**), /* OUT: Result */ |
| 22312 void **ppArg /* OUT: User data for *pxFunc */ |
| 22313 ){ |
| 22314 Fts5Table *pTab = (Fts5Table*)pVtab; |
| 22315 Fts5Auxiliary *pAux; |
| 22316 |
| 22317 pAux = fts5FindAuxiliary(pTab, zName); |
| 22318 if( pAux ){ |
| 22319 *pxFunc = fts5ApiCallback; |
| 22320 *ppArg = (void*)pAux; |
| 22321 return 1; |
| 22322 } |
| 22323 |
| 22324 /* No function of the specified name was found. Return 0. */ |
| 22325 return 0; |
| 22326 } |
| 22327 |
| 22328 /* |
| 22329 ** Implementation of FTS5 xRename method. Rename an fts5 table. |
| 22330 */ |
| 22331 static int fts5RenameMethod( |
| 22332 sqlite3_vtab *pVtab, /* Virtual table handle */ |
| 22333 const char *zName /* New name of table */ |
| 22334 ){ |
| 22335 Fts5Table *pTab = (Fts5Table*)pVtab; |
| 22336 return sqlite3Fts5StorageRename(pTab->pStorage, zName); |
| 22337 } |
| 22338 |
| 22339 /* |
| 22340 ** The xSavepoint() method. |
| 22341 ** |
| 22342 ** Flush the contents of the pending-terms table to disk. |
| 22343 */ |
| 22344 static int fts5SavepointMethod(sqlite3_vtab *pVtab, int iSavepoint){ |
| 22345 Fts5Table *pTab = (Fts5Table*)pVtab; |
| 22346 fts5CheckTransactionState(pTab, FTS5_SAVEPOINT, iSavepoint); |
| 22347 fts5TripCursors(pTab); |
| 22348 return sqlite3Fts5StorageSync(pTab->pStorage, 0); |
| 22349 } |
| 22350 |
| 22351 /* |
| 22352 ** The xRelease() method. |
| 22353 ** |
| 22354 ** Flush the contents of the pending-terms table to disk. |
| 22355 */ |
| 22356 static int fts5ReleaseMethod(sqlite3_vtab *pVtab, int iSavepoint){ |
| 22357 Fts5Table *pTab = (Fts5Table*)pVtab; |
| 22358 fts5CheckTransactionState(pTab, FTS5_RELEASE, iSavepoint); |
| 22359 fts5TripCursors(pTab); |
| 22360 return sqlite3Fts5StorageSync(pTab->pStorage, 0); |
| 22361 } |
| 22362 |
| 22363 /* |
| 22364 ** The xRollbackTo() method. |
| 22365 ** |
| 22366 ** Discard the contents of the pending terms table. |
| 22367 */ |
| 22368 static int fts5RollbackToMethod(sqlite3_vtab *pVtab, int iSavepoint){ |
| 22369 Fts5Table *pTab = (Fts5Table*)pVtab; |
| 22370 fts5CheckTransactionState(pTab, FTS5_ROLLBACKTO, iSavepoint); |
| 22371 fts5TripCursors(pTab); |
| 22372 return sqlite3Fts5StorageRollback(pTab->pStorage); |
| 22373 } |
| 22374 |
| 22375 /* |
| 22376 ** Register a new auxiliary function with global context pGlobal. |
| 22377 */ |
| 22378 static int fts5CreateAux( |
| 22379 fts5_api *pApi, /* Global context (one per db handle) */ |
| 22380 const char *zName, /* Name of new function */ |
| 22381 void *pUserData, /* User data for aux. function */ |
| 22382 fts5_extension_function xFunc, /* Aux. function implementation */ |
| 22383 void(*xDestroy)(void*) /* Destructor for pUserData */ |
| 22384 ){ |
| 22385 Fts5Global *pGlobal = (Fts5Global*)pApi; |
| 22386 int rc = sqlite3_overload_function(pGlobal->db, zName, -1); |
| 22387 if( rc==SQLITE_OK ){ |
| 22388 Fts5Auxiliary *pAux; |
| 22389 int nName; /* Size of zName in bytes, including \0 */ |
| 22390 int nByte; /* Bytes of space to allocate */ |
| 22391 |
| 22392 nName = (int)strlen(zName) + 1; |
| 22393 nByte = sizeof(Fts5Auxiliary) + nName; |
| 22394 pAux = (Fts5Auxiliary*)sqlite3_malloc(nByte); |
| 22395 if( pAux ){ |
| 22396 memset(pAux, 0, nByte); |
| 22397 pAux->zFunc = (char*)&pAux[1]; |
| 22398 memcpy(pAux->zFunc, zName, nName); |
| 22399 pAux->pGlobal = pGlobal; |
| 22400 pAux->pUserData = pUserData; |
| 22401 pAux->xFunc = xFunc; |
| 22402 pAux->xDestroy = xDestroy; |
| 22403 pAux->pNext = pGlobal->pAux; |
| 22404 pGlobal->pAux = pAux; |
| 22405 }else{ |
| 22406 rc = SQLITE_NOMEM; |
| 22407 } |
| 22408 } |
| 22409 |
| 22410 return rc; |
| 22411 } |
| 22412 |
| 22413 /* |
| 22414 ** Register a new tokenizer. This is the implementation of the |
| 22415 ** fts5_api.xCreateTokenizer() method. |
| 22416 */ |
| 22417 static int fts5CreateTokenizer( |
| 22418 fts5_api *pApi, /* Global context (one per db handle) */ |
| 22419 const char *zName, /* Name of new tokenizer */ |
| 22420 void *pUserData, /* User data for tokenizer */ |
| 22421 fts5_tokenizer *pTokenizer, /* Tokenizer implementation */ |
| 22422 void(*xDestroy)(void*) /* Destructor for pUserData */ |
| 22423 ){ |
| 22424 Fts5Global *pGlobal = (Fts5Global*)pApi; |
| 22425 Fts5TokenizerModule *pNew; |
| 22426 int nName; /* Size of zName and its \0 terminator */ |
| 22427 int nByte; /* Bytes of space to allocate */ |
| 22428 int rc = SQLITE_OK; |
| 22429 |
| 22430 nName = (int)strlen(zName) + 1; |
| 22431 nByte = sizeof(Fts5TokenizerModule) + nName; |
| 22432 pNew = (Fts5TokenizerModule*)sqlite3_malloc(nByte); |
| 22433 if( pNew ){ |
| 22434 memset(pNew, 0, nByte); |
| 22435 pNew->zName = (char*)&pNew[1]; |
| 22436 memcpy(pNew->zName, zName, nName); |
| 22437 pNew->pUserData = pUserData; |
| 22438 pNew->x = *pTokenizer; |
| 22439 pNew->xDestroy = xDestroy; |
| 22440 pNew->pNext = pGlobal->pTok; |
| 22441 pGlobal->pTok = pNew; |
| 22442 if( pNew->pNext==0 ){ |
| 22443 pGlobal->pDfltTok = pNew; |
| 22444 } |
| 22445 }else{ |
| 22446 rc = SQLITE_NOMEM; |
| 22447 } |
| 22448 |
| 22449 return rc; |
| 22450 } |
| 22451 |
| 22452 static Fts5TokenizerModule *fts5LocateTokenizer( |
| 22453 Fts5Global *pGlobal, |
| 22454 const char *zName |
| 22455 ){ |
| 22456 Fts5TokenizerModule *pMod = 0; |
| 22457 |
| 22458 if( zName==0 ){ |
| 22459 pMod = pGlobal->pDfltTok; |
| 22460 }else{ |
| 22461 for(pMod=pGlobal->pTok; pMod; pMod=pMod->pNext){ |
| 22462 if( sqlite3_stricmp(zName, pMod->zName)==0 ) break; |
| 22463 } |
| 22464 } |
| 22465 |
| 22466 return pMod; |
| 22467 } |
| 22468 |
| 22469 /* |
| 22470 ** Find a tokenizer. This is the implementation of the |
| 22471 ** fts5_api.xFindTokenizer() method. |
| 22472 */ |
| 22473 static int fts5FindTokenizer( |
| 22474 fts5_api *pApi, /* Global context (one per db handle) */ |
| 22475 const char *zName, /* Name of tokenizer to find */ |
| 22476 void **ppUserData, /* OUT: User data of the tokenizer found */ |
| 22477 fts5_tokenizer *pTokenizer /* Populate this object */ |
| 22478 ){ |
| 22479 int rc = SQLITE_OK; |
| 22480 Fts5TokenizerModule *pMod; |
| 22481 |
| 22482 pMod = fts5LocateTokenizer((Fts5Global*)pApi, zName); |
| 22483 if( pMod ){ |
| 22484 *pTokenizer = pMod->x; |
| 22485 *ppUserData = pMod->pUserData; |
| 22486 }else{ |
| 22487 memset(pTokenizer, 0, sizeof(fts5_tokenizer)); |
| 22488 rc = SQLITE_ERROR; |
| 22489 } |
| 22490 |
| 22491 return rc; |
| 22492 } |
| 22493 |
| 22494 static int sqlite3Fts5GetTokenizer( |
| 22495 Fts5Global *pGlobal, |
| 22496 const char **azArg, |
| 22497 int nArg, |
| 22498 Fts5Tokenizer **ppTok, |
| 22499 fts5_tokenizer **ppTokApi, |
| 22500 char **pzErr |
| 22501 ){ |
| 22502 Fts5TokenizerModule *pMod; |
| 22503 int rc = SQLITE_OK; |
| 22504 |
| 22505 pMod = fts5LocateTokenizer(pGlobal, nArg==0 ? 0 : azArg[0]); |
| 22506 if( pMod==0 ){ |
| 22507 assert( nArg>0 ); |
| 22508 rc = SQLITE_ERROR; |
| 22509 *pzErr = sqlite3_mprintf("no such tokenizer: %s", azArg[0]); |
| 22510 }else{ |
| 22511 rc = pMod->x.xCreate(pMod->pUserData, &azArg[1], (nArg?nArg-1:0), ppTok); |
| 22512 *ppTokApi = &pMod->x; |
| 22513 if( rc!=SQLITE_OK && pzErr ){ |
| 22514 *pzErr = sqlite3_mprintf("error in tokenizer constructor"); |
| 22515 } |
| 22516 } |
| 22517 |
| 22518 if( rc!=SQLITE_OK ){ |
| 22519 *ppTokApi = 0; |
| 22520 *ppTok = 0; |
| 22521 } |
| 22522 |
| 22523 return rc; |
| 22524 } |
| 22525 |
| 22526 static void fts5ModuleDestroy(void *pCtx){ |
| 22527 Fts5TokenizerModule *pTok, *pNextTok; |
| 22528 Fts5Auxiliary *pAux, *pNextAux; |
| 22529 Fts5Global *pGlobal = (Fts5Global*)pCtx; |
| 22530 |
| 22531 for(pAux=pGlobal->pAux; pAux; pAux=pNextAux){ |
| 22532 pNextAux = pAux->pNext; |
| 22533 if( pAux->xDestroy ) pAux->xDestroy(pAux->pUserData); |
| 22534 sqlite3_free(pAux); |
| 22535 } |
| 22536 |
| 22537 for(pTok=pGlobal->pTok; pTok; pTok=pNextTok){ |
| 22538 pNextTok = pTok->pNext; |
| 22539 if( pTok->xDestroy ) pTok->xDestroy(pTok->pUserData); |
| 22540 sqlite3_free(pTok); |
| 22541 } |
| 22542 |
| 22543 sqlite3_free(pGlobal); |
| 22544 } |
| 22545 |
| 22546 static void fts5Fts5Func( |
| 22547 sqlite3_context *pCtx, /* Function call context */ |
| 22548 int nArg, /* Number of args */ |
| 22549 sqlite3_value **apVal /* Function arguments */ |
| 22550 ){ |
| 22551 Fts5Global *pGlobal = (Fts5Global*)sqlite3_user_data(pCtx); |
| 22552 char buf[8]; |
| 22553 assert( nArg==0 ); |
| 22554 assert( sizeof(buf)>=sizeof(pGlobal) ); |
| 22555 memcpy(buf, (void*)&pGlobal, sizeof(pGlobal)); |
| 22556 sqlite3_result_blob(pCtx, buf, sizeof(pGlobal), SQLITE_TRANSIENT); |
| 22557 } |
| 22558 |
| 22559 /* |
| 22560 ** Implementation of fts5_source_id() function. |
| 22561 */ |
| 22562 static void fts5SourceIdFunc( |
| 22563 sqlite3_context *pCtx, /* Function call context */ |
| 22564 int nArg, /* Number of args */ |
| 22565 sqlite3_value **apVal /* Function arguments */ |
| 22566 ){ |
| 22567 assert( nArg==0 ); |
| 22568 sqlite3_result_text(pCtx, "fts5: 2016-01-20 15:27:19 17efb4209f97fb4971656086b138599a91a75ff9", -1, SQLITE_TRANSIENT); |
| 22569 } |
| 22570 |
| 22571 static int fts5Init(sqlite3 *db){ |
| 22572 static const sqlite3_module fts5Mod = { |
| 22573 /* iVersion */ 2, |
| 22574 /* xCreate */ fts5CreateMethod, |
| 22575 /* xConnect */ fts5ConnectMethod, |
| 22576 /* xBestIndex */ fts5BestIndexMethod, |
| 22577 /* xDisconnect */ fts5DisconnectMethod, |
| 22578 /* xDestroy */ fts5DestroyMethod, |
| 22579 /* xOpen */ fts5OpenMethod, |
| 22580 /* xClose */ fts5CloseMethod, |
| 22581 /* xFilter */ fts5FilterMethod, |
| 22582 /* xNext */ fts5NextMethod, |
| 22583 /* xEof */ fts5EofMethod, |
| 22584 /* xColumn */ fts5ColumnMethod, |
| 22585 /* xRowid */ fts5RowidMethod, |
| 22586 /* xUpdate */ fts5UpdateMethod, |
| 22587 /* xBegin */ fts5BeginMethod, |
| 22588 /* xSync */ fts5SyncMethod, |
| 22589 /* xCommit */ fts5CommitMethod, |
| 22590 /* xRollback */ fts5RollbackMethod, |
| 22591 /* xFindFunction */ fts5FindFunctionMethod, |
| 22592 /* xRename */ fts5RenameMethod, |
| 22593 /* xSavepoint */ fts5SavepointMethod, |
| 22594 /* xRelease */ fts5ReleaseMethod, |
| 22595 /* xRollbackTo */ fts5RollbackToMethod, |
| 22596 }; |
| 22597 |
| 22598 int rc; |
| 22599 Fts5Global *pGlobal = 0; |
| 22600 |
| 22601 pGlobal = (Fts5Global*)sqlite3_malloc(sizeof(Fts5Global)); |
| 22602 if( pGlobal==0 ){ |
| 22603 rc = SQLITE_NOMEM; |
| 22604 }else{ |
| 22605 void *p = (void*)pGlobal; |
| 22606 memset(pGlobal, 0, sizeof(Fts5Global)); |
| 22607 pGlobal->db = db; |
| 22608 pGlobal->api.iVersion = 2; |
| 22609 pGlobal->api.xCreateFunction = fts5CreateAux; |
| 22610 pGlobal->api.xCreateTokenizer = fts5CreateTokenizer; |
| 22611 pGlobal->api.xFindTokenizer = fts5FindTokenizer; |
| 22612 rc = sqlite3_create_module_v2(db, "fts5", &fts5Mod, p, fts5ModuleDestroy); |
| 22613 if( rc==SQLITE_OK ) rc = sqlite3Fts5IndexInit(db); |
| 22614 if( rc==SQLITE_OK ) rc = sqlite3Fts5ExprInit(pGlobal, db); |
| 22615 if( rc==SQLITE_OK ) rc = sqlite3Fts5AuxInit(&pGlobal->api); |
| 22616 if( rc==SQLITE_OK ) rc = sqlite3Fts5TokenizerInit(&pGlobal->api); |
| 22617 if( rc==SQLITE_OK ) rc = sqlite3Fts5VocabInit(pGlobal, db); |
| 22618 if( rc==SQLITE_OK ){ |
| 22619 rc = sqlite3_create_function( |
| 22620 db, "fts5", 0, SQLITE_UTF8, p, fts5Fts5Func, 0, 0 |
| 22621 ); |
| 22622 } |
| 22623 if( rc==SQLITE_OK ){ |
| 22624 rc = sqlite3_create_function( |
| 22625 db, "fts5_source_id", 0, SQLITE_UTF8, p, fts5SourceIdFunc, 0, 0 |
| 22626 ); |
| 22627 } |
| 22628 } |
| 22629 return rc; |
| 22630 } |
| 22631 |
| 22632 /* |
| 22633 ** The following functions are used to register the module with SQLite. If |
| 22634 ** this module is being built as part of the SQLite core (SQLITE_CORE is |
| 22635 ** defined), then sqlite3_open() will call sqlite3Fts5Init() directly. |
| 22636 ** |
| 22637 ** Or, if this module is being built as a loadable extension, |
| 22638 ** sqlite3Fts5Init() is omitted and the two standard entry points |
| 22639 ** sqlite3_fts_init() and sqlite3_fts5_init() are defined instead. |
| 22640 */ |
| 22641 #ifndef SQLITE_CORE |
| 22642 #ifdef _WIN32 |
| 22643 __declspec(dllexport) |
| 22644 #endif |
| 22645 SQLITE_API int SQLITE_STDCALL sqlite3_fts_init( |
| 22646 sqlite3 *db, |
| 22647 char **pzErrMsg, |
| 22648 const sqlite3_api_routines *pApi |
| 22649 ){ |
| 22650 SQLITE_EXTENSION_INIT2(pApi); |
| 22651 (void)pzErrMsg; /* Unused parameter */ |
| 22652 return fts5Init(db); |
| 22653 } |
| 22654 |
| 22655 #ifdef _WIN32 |
| 22656 __declspec(dllexport) |
| 22657 #endif |
| 22658 SQLITE_API int SQLITE_STDCALL sqlite3_fts5_init( |
| 22659 sqlite3 *db, |
| 22660 char **pzErrMsg, |
| 22661 const sqlite3_api_routines *pApi |
| 22662 ){ |
| 22663 SQLITE_EXTENSION_INIT2(pApi); |
| 22664 (void)pzErrMsg; /* Unused parameter */ |
| 22665 return fts5Init(db); |
| 22666 } |
| 22667 #else |
| 22668 SQLITE_PRIVATE int sqlite3Fts5Init(sqlite3 *db){ |
| 22669 return fts5Init(db); |
| 22670 } |
| 22671 #endif |
| 22672 |
| 22673 /* |
| 22674 ** 2014 May 31 |
| 22675 ** |
| 22676 ** The author disclaims copyright to this source code. In place of |
| 22677 ** a legal notice, here is a blessing: |
| 22678 ** |
| 22679 ** May you do good and not evil. |
| 22680 ** May you find forgiveness for yourself and forgive others. |
| 22681 ** May you share freely, never taking more than you give. |
| 22682 ** |
| 22683 ****************************************************************************** |
| 22684 ** |
| 22685 */ |
| 22686 |
| 22687 |
| 22688 |
| 22689 /* #include "fts5Int.h" */ |
| 22690 |
| 22691 struct Fts5Storage { |
| 22692 Fts5Config *pConfig; |
| 22693 Fts5Index *pIndex; |
| 22694 int bTotalsValid; /* True if nTotalRow/aTotalSize[] are valid */ |
| 22695 i64 nTotalRow; /* Total number of rows in FTS table */ |
| 22696 i64 *aTotalSize; /* Total sizes of each column */ |
| 22697 sqlite3_stmt *aStmt[11]; |
| 22698 }; |
| 22699 |
| 22700 |
| 22701 #if FTS5_STMT_SCAN_ASC!=0 |
| 22702 # error "FTS5_STMT_SCAN_ASC mismatch" |
| 22703 #endif |
| 22704 #if FTS5_STMT_SCAN_DESC!=1 |
| 22705 # error "FTS5_STMT_SCAN_DESC mismatch" |
| 22706 #endif |
| 22707 #if FTS5_STMT_LOOKUP!=2 |
| 22708 # error "FTS5_STMT_LOOKUP mismatch" |
| 22709 #endif |
| 22710 |
| 22711 #define FTS5_STMT_INSERT_CONTENT 3 |
| 22712 #define FTS5_STMT_REPLACE_CONTENT 4 |
| 22713 #define FTS5_STMT_DELETE_CONTENT 5 |
| 22714 #define FTS5_STMT_REPLACE_DOCSIZE 6 |
| 22715 #define FTS5_STMT_DELETE_DOCSIZE 7 |
| 22716 #define FTS5_STMT_LOOKUP_DOCSIZE 8 |
| 22717 #define FTS5_STMT_REPLACE_CONFIG 9 |
| 22718 #define FTS5_STMT_SCAN 10 |
| 22719 |
| 22720 /* |
| 22721 ** Prepare the statement identified by eStmt (an FTS5_STMT_XXX constant) |
| 22722 ** and cache it in Fts5Storage.aStmt[], if it has not already been prepared. |
| 22723 ** Return SQLITE_OK if successful, or an SQLite error code if an error |
| 22724 ** occurs. |
| 22725 */ |
| 22726 static int fts5StorageGetStmt( |
| 22727 Fts5Storage *p, /* Storage handle */ |
| 22728 int eStmt, /* FTS5_STMT_XXX constant */ |
| 22729 sqlite3_stmt **ppStmt, /* OUT: Prepared statement handle */ |
| 22730 char **pzErrMsg /* OUT: Error message (if any) */ |
| 22731 ){ |
| 22732 int rc = SQLITE_OK; |
| 22733 |
| 22734 /* If there is no %_docsize table, there should be no requests for |
| 22735 ** statements to operate on it. */ |
| 22736 assert( p->pConfig->bColumnsize || ( |
| 22737 eStmt!=FTS5_STMT_REPLACE_DOCSIZE |
| 22738 && eStmt!=FTS5_STMT_DELETE_DOCSIZE |
| 22739 && eStmt!=FTS5_STMT_LOOKUP_DOCSIZE |
| 22740 )); |
| 22741 |
| 22742 assert( eStmt>=0 && eStmt<ArraySize(p->aStmt) ); |
| 22743 if( p->aStmt[eStmt]==0 ){ |
| 22744 const char *azStmt[] = { |
| 22745 "SELECT %s FROM %s T WHERE T.%Q >= ? AND T.%Q <= ? ORDER BY T.%Q ASC", |
| 22746 "SELECT %s FROM %s T WHERE T.%Q <= ? AND T.%Q >= ? ORDER BY T.%Q DESC", |
| 22747 "SELECT %s FROM %s T WHERE T.%Q=?", /* LOOKUP */ |
| 22748 |
| 22749 "INSERT INTO %Q.'%q_content' VALUES(%s)", /* INSERT_CONTENT */ |
| 22750 "REPLACE INTO %Q.'%q_content' VALUES(%s)", /* REPLACE_CONTENT */ |
| 22751 "DELETE FROM %Q.'%q_content' WHERE id=?", /* DELETE_CONTENT */ |
| 22752 "REPLACE INTO %Q.'%q_docsize' VALUES(?,?)", /* REPLACE_DOCSIZE */ |
| 22753 "DELETE FROM %Q.'%q_docsize' WHERE id=?", /* DELETE_DOCSIZE */ |
| 22754 |
| 22755 "SELECT sz FROM %Q.'%q_docsize' WHERE id=?", /* LOOKUP_DOCSIZE */ |
| 22756 |
| 22757 "REPLACE INTO %Q.'%q_config' VALUES(?,?)", /* REPLACE_CONFIG */ |
| 22758 "SELECT %s FROM %s AS T", /* SCAN */ |
| 22759 }; |
| 22760 Fts5Config *pC = p->pConfig; |
| 22761 char *zSql = 0; |
| 22762 |
| 22763 switch( eStmt ){ |
| 22764 case FTS5_STMT_SCAN: |
| 22765 zSql = sqlite3_mprintf(azStmt[eStmt], |
| 22766 pC->zContentExprlist, pC->zContent |
| 22767 ); |
| 22768 break; |
| 22769 |
| 22770 case FTS5_STMT_SCAN_ASC: |
| 22771 case FTS5_STMT_SCAN_DESC: |
| 22772 zSql = sqlite3_mprintf(azStmt[eStmt], pC->zContentExprlist, |
| 22773 pC->zContent, pC->zContentRowid, pC->zContentRowid, |
| 22774 pC->zContentRowid |
| 22775 ); |
| 22776 break; |
| 22777 |
| 22778 case FTS5_STMT_LOOKUP: |
| 22779 zSql = sqlite3_mprintf(azStmt[eStmt], |
| 22780 pC->zContentExprlist, pC->zContent, pC->zContentRowid |
| 22781 ); |
| 22782 break; |
| 22783 |
| 22784 case FTS5_STMT_INSERT_CONTENT: |
| 22785 case FTS5_STMT_REPLACE_CONTENT: { |
| 22786 int nCol = pC->nCol + 1; |
| 22787 char *zBind; |
| 22788 int i; |
| 22789 |
| 22790 zBind = sqlite3_malloc(1 + nCol*2); |
| 22791 if( zBind ){ |
| 22792 for(i=0; i<nCol; i++){ |
| 22793 zBind[i*2] = '?'; |
| 22794 zBind[i*2 + 1] = ','; |
| 22795 } |
| 22796 zBind[i*2-1] = '\0'; |
| 22797 zSql = sqlite3_mprintf(azStmt[eStmt], pC->zDb, pC->zName, zBind); |
| 22798 sqlite3_free(zBind); |
| 22799 } |
| 22800 break; |
| 22801 } |
| 22802 |
| 22803 default: |
| 22804 zSql = sqlite3_mprintf(azStmt[eStmt], pC->zDb, pC->zName); |
| 22805 break; |
| 22806 } |
| 22807 |
| 22808 if( zSql==0 ){ |
| 22809 rc = SQLITE_NOMEM; |
| 22810 }else{ |
| 22811 rc = sqlite3_prepare_v2(pC->db, zSql, -1, &p->aStmt[eStmt], 0); |
| 22812 sqlite3_free(zSql); |
| 22813 if( rc!=SQLITE_OK && pzErrMsg ){ |
| 22814 *pzErrMsg = sqlite3_mprintf("%s", sqlite3_errmsg(pC->db)); |
| 22815 } |
| 22816 } |
| 22817 } |
| 22818 |
| 22819 *ppStmt = p->aStmt[eStmt]; |
| 22820 return rc; |
| 22821 } |
| 22822 |
| 22823 |
| 22824 static int fts5ExecPrintf( |
| 22825 sqlite3 *db, |
| 22826 char **pzErr, |
| 22827 const char *zFormat, |
| 22828 ... |
| 22829 ){ |
| 22830 int rc; |
| 22831 va_list ap; /* ... printf arguments */ |
| 22832 char *zSql; |
| 22833 |
| 22834 va_start(ap, zFormat); |
| 22835 zSql = sqlite3_vmprintf(zFormat, ap); |
| 22836 |
| 22837 if( zSql==0 ){ |
| 22838 rc = SQLITE_NOMEM; |
| 22839 }else{ |
| 22840 rc = sqlite3_exec(db, zSql, 0, 0, pzErr); |
| 22841 sqlite3_free(zSql); |
| 22842 } |
| 22843 |
| 22844 va_end(ap); |
| 22845 return rc; |
| 22846 } |
| 22847 |
| 22848 /* |
| 22849 ** Drop all shadow tables. Return SQLITE_OK if successful or an SQLite error |
| 22850 ** code otherwise. |
| 22851 */ |
| 22852 static int sqlite3Fts5DropAll(Fts5Config *pConfig){ |
| 22853 int rc = fts5ExecPrintf(pConfig->db, 0, |
| 22854 "DROP TABLE IF EXISTS %Q.'%q_data';" |
| 22855 "DROP TABLE IF EXISTS %Q.'%q_idx';" |
| 22856 "DROP TABLE IF EXISTS %Q.'%q_config';", |
| 22857 pConfig->zDb, pConfig->zName, |
| 22858 pConfig->zDb, pConfig->zName, |
| 22859 pConfig->zDb, pConfig->zName |
| 22860 ); |
| 22861 if( rc==SQLITE_OK && pConfig->bColumnsize ){ |
| 22862 rc = fts5ExecPrintf(pConfig->db, 0, |
| 22863 "DROP TABLE IF EXISTS %Q.'%q_docsize';", |
| 22864 pConfig->zDb, pConfig->zName |
| 22865 ); |
| 22866 } |
| 22867 if( rc==SQLITE_OK && pConfig->eContent==FTS5_CONTENT_NORMAL ){ |
| 22868 rc = fts5ExecPrintf(pConfig->db, 0, |
| 22869 "DROP TABLE IF EXISTS %Q.'%q_content';", |
| 22870 pConfig->zDb, pConfig->zName |
| 22871 ); |
| 22872 } |
| 22873 return rc; |
| 22874 } |
| 22875 |
| 22876 static void fts5StorageRenameOne( |
| 22877 Fts5Config *pConfig, /* Current FTS5 configuration */ |
| 22878 int *pRc, /* IN/OUT: Error code */ |
| 22879 const char *zTail, /* Tail of table name e.g. "data", "config" */ |
| 22880 const char *zName /* New name of FTS5 table */ |
| 22881 ){ |
| 22882 if( *pRc==SQLITE_OK ){ |
| 22883 *pRc = fts5ExecPrintf(pConfig->db, 0, |
| 22884 "ALTER TABLE %Q.'%q_%s' RENAME TO '%q_%s';", |
| 22885 pConfig->zDb, pConfig->zName, zTail, zName, zTail |
| 22886 ); |
| 22887 } |
| 22888 } |
| 22889 |
| 22890 static int sqlite3Fts5StorageRename(Fts5Storage *pStorage, const char *zName){ |
| 22891 Fts5Config *pConfig = pStorage->pConfig; |
| 22892 int rc = sqlite3Fts5StorageSync(pStorage, 1); |
| 22893 |
| 22894 fts5StorageRenameOne(pConfig, &rc, "data", zName); |
| 22895 fts5StorageRenameOne(pConfig, &rc, "idx", zName); |
| 22896 fts5StorageRenameOne(pConfig, &rc, "config", zName); |
| 22897 if( pConfig->bColumnsize ){ |
| 22898 fts5StorageRenameOne(pConfig, &rc, "docsize", zName); |
| 22899 } |
| 22900 if( pConfig->eContent==FTS5_CONTENT_NORMAL ){ |
| 22901 fts5StorageRenameOne(pConfig, &rc, "content", zName); |
| 22902 } |
| 22903 return rc; |
| 22904 } |
| 22905 |
| 22906 /* |
| 22907 ** Create the shadow table named zPost, with definition zDefn. Return |
| 22908 ** SQLITE_OK if successful, or an SQLite error code otherwise. |
| 22909 */ |
| 22910 static int sqlite3Fts5CreateTable( |
| 22911 Fts5Config *pConfig, /* FTS5 configuration */ |
| 22912 const char *zPost, /* Shadow table to create (e.g. "content") */ |
| 22913 const char *zDefn, /* Columns etc. for shadow table */ |
| 22914 int bWithout, /* True for without rowid */ |
| 22915 char **pzErr /* OUT: Error message */ |
| 22916 ){ |
| 22917 int rc; |
| 22918 char *zErr = 0; |
| 22919 |
| 22920 rc = fts5ExecPrintf(pConfig->db, &zErr, "CREATE TABLE %Q.'%q_%q'(%s)%s", |
| 22921 pConfig->zDb, pConfig->zName, zPost, zDefn, bWithout?" WITHOUT ROWID":"" |
| 22922 ); |
| 22923 if( zErr ){ |
| 22924 *pzErr = sqlite3_mprintf( |
| 22925 "fts5: error creating shadow table %q_%s: %s", |
| 22926 pConfig->zName, zPost, zErr |
| 22927 ); |
| 22928 sqlite3_free(zErr); |
| 22929 } |
| 22930 |
| 22931 return rc; |
| 22932 } |
| 22933 |
| 22934 /* |
| 22935 ** Open a new Fts5Storage handle. If the bCreate argument is true, create |
| 22936 ** and initialize the underlying tables. |
| 22937 ** |
| 22938 ** If successful, set *pp to point to the new object and return SQLITE_OK. |
| 22939 ** Otherwise, set *pp to NULL and return an SQLite error code. |
| 22940 */ |
| 22941 static int sqlite3Fts5StorageOpen( |
| 22942 Fts5Config *pConfig, |
| 22943 Fts5Index *pIndex, |
| 22944 int bCreate, |
| 22945 Fts5Storage **pp, |
| 22946 char **pzErr /* OUT: Error message */ |
| 22947 ){ |
| 22948 int rc = SQLITE_OK; |
| 22949 Fts5Storage *p; /* New object */ |
| 22950 int nByte; /* Bytes of space to allocate */ |
| 22951 |
| 22952 nByte = sizeof(Fts5Storage) /* Fts5Storage object */ |
| 22953 + pConfig->nCol * sizeof(i64); /* Fts5Storage.aTotalSize[] */ |
| 22954 *pp = p = (Fts5Storage*)sqlite3_malloc(nByte); |
| 22955 if( !p ) return SQLITE_NOMEM; |
| 22956 |
| 22957 memset(p, 0, nByte); |
| 22958 p->aTotalSize = (i64*)&p[1]; |
| 22959 p->pConfig = pConfig; |
| 22960 p->pIndex = pIndex; |
| 22961 |
| 22962 if( bCreate ){ |
| 22963 if( pConfig->eContent==FTS5_CONTENT_NORMAL ){ |
| 22964 int nDefn = 32 + pConfig->nCol*10; |
| 22965 char *zDefn = sqlite3_malloc(nDefn); |
| 22966 if( zDefn==0 ){ |
| 22967 rc = SQLITE_NOMEM; |
| 22968 }else{ |
| 22969 int i; |
| 22970 int iOff; |
| 22971 sqlite3_snprintf(nDefn, zDefn, "id INTEGER PRIMARY KEY"); |
| 22972 iOff = (int)strlen(zDefn); |
| 22973 for(i=0; i<pConfig->nCol; i++){ |
| 22974 sqlite3_snprintf(nDefn-iOff, &zDefn[iOff], ", c%d", i); |
| 22975 iOff += (int)strlen(&zDefn[iOff]); |
| 22976 } |
| 22977 rc = sqlite3Fts5CreateTable(pConfig, "content", zDefn, 0, pzErr); |
| 22978 } |
| 22979 sqlite3_free(zDefn); |
| 22980 } |
| 22981 |
| 22982 if( rc==SQLITE_OK && pConfig->bColumnsize ){ |
| 22983 rc = sqlite3Fts5CreateTable( |
| 22984 pConfig, "docsize", "id INTEGER PRIMARY KEY, sz BLOB", 0, pzErr |
| 22985 ); |
| 22986 } |
| 22987 if( rc==SQLITE_OK ){ |
| 22988 rc = sqlite3Fts5CreateTable( |
| 22989 pConfig, "config", "k PRIMARY KEY, v", 1, pzErr |
| 22990 ); |
| 22991 } |
| 22992 if( rc==SQLITE_OK ){ |
| 22993 rc = sqlite3Fts5StorageConfigValue(p, "version", 0, FTS5_CURRENT_VERSION); |
| 22994 } |
| 22995 } |
| 22996 |
| 22997 if( rc ){ |
| 22998 sqlite3Fts5StorageClose(p); |
| 22999 *pp = 0; |
| 23000 } |
| 23001 return rc; |
| 23002 } |
| 23003 |
| 23004 /* |
| 23005 ** Close a handle opened by an earlier call to sqlite3Fts5StorageOpen(). |
| 23006 */ |
| 23007 static int sqlite3Fts5StorageClose(Fts5Storage *p){ |
| 23008 int rc = SQLITE_OK; |
| 23009 if( p ){ |
| 23010 int i; |
| 23011 |
| 23012 /* Finalize all SQL statements */ |
| 23013 for(i=0; i<(int)ArraySize(p->aStmt); i++){ |
| 23014 sqlite3_finalize(p->aStmt[i]); |
| 23015 } |
| 23016 |
| 23017 sqlite3_free(p); |
| 23018 } |
| 23019 return rc; |
| 23020 } |
| 23021 |
| 23022 typedef struct Fts5InsertCtx Fts5InsertCtx; |
| 23023 struct Fts5InsertCtx { |
| 23024 Fts5Storage *pStorage; |
| 23025 int iCol; |
| 23026 int szCol; /* Size of column value in tokens */ |
| 23027 }; |
| 23028 |
| 23029 /* |
| 23030 ** Tokenization callback used when inserting tokens into the FTS index. |
| 23031 */ |
| 23032 static int fts5StorageInsertCallback( |
| 23033 void *pContext, /* Pointer to Fts5InsertCtx object */ |
| 23034 int tflags, |
| 23035 const char *pToken, /* Buffer containing token */ |
| 23036 int nToken, /* Size of token in bytes */ |
| 23037 int iStart, /* Start offset of token */ |
| 23038 int iEnd /* End offset of token */ |
| 23039 ){ |
| 23040 Fts5InsertCtx *pCtx = (Fts5InsertCtx*)pContext; |
| 23041 Fts5Index *pIdx = pCtx->pStorage->pIndex; |
| 23042 if( (tflags & FTS5_TOKEN_COLOCATED)==0 || pCtx->szCol==0 ){ |
| 23043 pCtx->szCol++; |
| 23044 } |
| 23045 return sqlite3Fts5IndexWrite(pIdx, pCtx->iCol, pCtx->szCol-1, pToken, nToken); |
| 23046 } |
| 23047 |
| 23048 /* |
| 23049 ** If a row with rowid iDel is present in the %_content table, add the |
| 23050 ** delete-markers to the FTS index necessary to delete it. Do not actually |
| 23051 ** remove the %_content row at this time though. |
| 23052 */ |
| 23053 static int fts5StorageDeleteFromIndex(Fts5Storage *p, i64 iDel){ |
| 23054 Fts5Config *pConfig = p->pConfig; |
| 23055 sqlite3_stmt *pSeek; /* SELECT to read row iDel from %_content */ |
| 23056 int rc; /* Return code */ |
| 23057 |
| 23058 rc = fts5StorageGetStmt(p, FTS5_STMT_LOOKUP, &pSeek, 0); |
| 23059 if( rc==SQLITE_OK ){ |
| 23060 int rc2; |
| 23061 sqlite3_bind_int64(pSeek, 1, iDel); |
| 23062 if( sqlite3_step(pSeek)==SQLITE_ROW ){ |
| 23063 int iCol; |
| 23064 Fts5InsertCtx ctx; |
| 23065 ctx.pStorage = p; |
| 23066 ctx.iCol = -1; |
| 23067 rc = sqlite3Fts5IndexBeginWrite(p->pIndex, 1, iDel); |
| 23068 for(iCol=1; rc==SQLITE_OK && iCol<=pConfig->nCol; iCol++){ |
| 23069 if( pConfig->abUnindexed[iCol-1] ) continue; |
| 23070 ctx.szCol = 0; |
| 23071 rc = sqlite3Fts5Tokenize(pConfig, |
| 23072 FTS5_TOKENIZE_DOCUMENT, |
| 23073 (const char*)sqlite3_column_text(pSeek, iCol), |
| 23074 sqlite3_column_bytes(pSeek, iCol), |
| 23075 (void*)&ctx, |
| 23076 fts5StorageInsertCallback |
| 23077 ); |
| 23078 p->aTotalSize[iCol-1] -= (i64)ctx.szCol; |
| 23079 } |
| 23080 p->nTotalRow--; |
| 23081 } |
| 23082 rc2 = sqlite3_reset(pSeek); |
| 23083 if( rc==SQLITE_OK ) rc = rc2; |
| 23084 } |
| 23085 |
| 23086 return rc; |
| 23087 } |
| 23088 |
| 23089 |
| 23090 /* |
| 23091 ** Insert a record into the %_docsize table. Specifically, do: |
| 23092 ** |
| 23093 ** INSERT OR REPLACE INTO %_docsize(id, sz) VALUES(iRowid, pBuf); |
| 23094 ** |
| 23095 ** If there is no %_docsize table (as happens if the columnsize=0 option |
| 23096 ** is specified when the FTS5 table is created), this function is a no-op. |
| 23097 */ |
| 23098 static int fts5StorageInsertDocsize( |
| 23099 Fts5Storage *p, /* Storage module to write to */ |
| 23100 i64 iRowid, /* id value */ |
| 23101 Fts5Buffer *pBuf /* sz value */ |
| 23102 ){ |
| 23103 int rc = SQLITE_OK; |
| 23104 if( p->pConfig->bColumnsize ){ |
| 23105 sqlite3_stmt *pReplace = 0; |
| 23106 rc = fts5StorageGetStmt(p, FTS5_STMT_REPLACE_DOCSIZE, &pReplace, 0); |
| 23107 if( rc==SQLITE_OK ){ |
| 23108 sqlite3_bind_int64(pReplace, 1, iRowid); |
| 23109 sqlite3_bind_blob(pReplace, 2, pBuf->p, pBuf->n, SQLITE_STATIC); |
| 23110 sqlite3_step(pReplace); |
| 23111 rc = sqlite3_reset(pReplace); |
| 23112 } |
| 23113 } |
| 23114 return rc; |
| 23115 } |
| 23116 |
| 23117 /* |
| 23118 ** Load the contents of the "averages" record from disk into the |
| 23119 ** p->nTotalRow and p->aTotalSize[] variables. If successful, and if |
| 23120 ** argument bCache is true, set the p->bTotalsValid flag to indicate |
| 23121 ** that the contents of aTotalSize[] and nTotalRow are valid until |
| 23122 ** further notice. |
| 23123 ** |
| 23124 ** Return SQLITE_OK if successful, or an SQLite error code if an error |
| 23125 ** occurs. |
| 23126 */ |
| 23127 static int fts5StorageLoadTotals(Fts5Storage *p, int bCache){ |
| 23128 int rc = SQLITE_OK; |
| 23129 if( p->bTotalsValid==0 ){ |
| 23130 rc = sqlite3Fts5IndexGetAverages(p->pIndex, &p->nTotalRow, p->aTotalSize); |
| 23131 p->bTotalsValid = bCache; |
| 23132 } |
| 23133 return rc; |
| 23134 } |
| 23135 |
| 23136 /* |
| 23137 ** Store the current contents of the p->nTotalRow and p->aTotalSize[] |
| 23138 ** variables in the "averages" record on disk. |
| 23139 ** |
| 23140 ** Return SQLITE_OK if successful, or an SQLite error code if an error |
| 23141 ** occurs. |
| 23142 */ |
| 23143 static int fts5StorageSaveTotals(Fts5Storage *p){ |
| 23144 int nCol = p->pConfig->nCol; |
| 23145 int i; |
| 23146 Fts5Buffer buf; |
| 23147 int rc = SQLITE_OK; |
| 23148 memset(&buf, 0, sizeof(buf)); |
| 23149 |
| 23150 sqlite3Fts5BufferAppendVarint(&rc, &buf, p->nTotalRow); |
| 23151 for(i=0; i<nCol; i++){ |
| 23152 sqlite3Fts5BufferAppendVarint(&rc, &buf, p->aTotalSize[i]); |
| 23153 } |
| 23154 if( rc==SQLITE_OK ){ |
| 23155 rc = sqlite3Fts5IndexSetAverages(p->pIndex, buf.p, buf.n); |
| 23156 } |
| 23157 sqlite3_free(buf.p); |
| 23158 |
| 23159 return rc; |
| 23160 } |
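The serialization performed by fts5StorageSaveTotals() can be sketched in isolation as follows. This is an illustrative approximation, not the FTS5 implementation: `sketchPutVarint` and `sketchEncodeTotals` are hypothetical names, the caller supplies a fixed buffer instead of an Fts5Buffer, and the varint writer assumes the usual SQLite layout (big-endian base-128, high bit set on every byte but the last) while leaving out the 9-byte form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write v as a big-endian base-128 varint with the
** high bit set on every byte except the last. Only values below 2^56
** are handled; the 9-byte form is deliberately left out of this sketch.
** Returns the number of bytes written (1..8). */
static int sketchPutVarint(unsigned char *p, uint64_t v){
  unsigned char tmp[8];
  int n = 0;
  int i;
  assert( v < ((uint64_t)1 << 56) );
  do{
    tmp[n++] = (unsigned char)((v & 0x7f) | 0x80);
    v >>= 7;
  }while( v!=0 );
  tmp[0] &= 0x7f;               /* The least-significant group is emitted last */
  for(i=0; i<n; i++){
    p[i] = tmp[n-1-i];
  }
  return n;
}

/* Hypothetical helper: serialize nRow followed by aSize[0..nCol-1], the
** same field order used for the "averages" record. The caller must
** supply at least 8*(nCol+1) bytes. Returns the number of bytes used. */
static int sketchEncodeTotals(
  unsigned char *aOut,          /* Output buffer */
  int64_t nRow,                 /* Total number of rows */
  const int64_t *aSize,         /* Total token count for each column */
  int nCol                      /* Number of columns */
){
  int n = sketchPutVarint(aOut, (uint64_t)nRow);
  int i;
  for(i=0; i<nCol; i++){
    n += sketchPutVarint(&aOut[n], (uint64_t)aSize[i]);
  }
  return n;
}
```

For example, 3 rows with column totals 5 and 300 encode to the four bytes 0x03 0x05 0x82 0x2c, since 300 = 2*128 + 44.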
| 23161 |
| 23162 /* |
| 23163 ** Remove a row from the FTS table. |
| 23164 */ |
| 23165 static int sqlite3Fts5StorageDelete(Fts5Storage *p, i64 iDel){ |
| 23166 Fts5Config *pConfig = p->pConfig; |
| 23167 int rc; |
| 23168 sqlite3_stmt *pDel = 0; |
| 23169 |
| 23170 rc = fts5StorageLoadTotals(p, 1); |
| 23171 |
| 23172 /* Delete the index records */ |
| 23173 if( rc==SQLITE_OK ){ |
| 23174 rc = fts5StorageDeleteFromIndex(p, iDel); |
| 23175 } |
| 23176 |
| 23177 /* Delete the %_docsize record */ |
| 23178 if( rc==SQLITE_OK && pConfig->bColumnsize ){ |
| 23179 rc = fts5StorageGetStmt(p, FTS5_STMT_DELETE_DOCSIZE, &pDel, 0); |
| 23180 if( rc==SQLITE_OK ){ |
| 23181 sqlite3_bind_int64(pDel, 1, iDel); |
| 23182 sqlite3_step(pDel); |
| 23183 rc = sqlite3_reset(pDel); |
| 23184 } |
| 23185 } |
| 23186 |
| 23187 /* Delete the %_content record */ |
| 23188 if( pConfig->eContent==FTS5_CONTENT_NORMAL ){ |
| 23189 if( rc==SQLITE_OK ){ |
| 23190 rc = fts5StorageGetStmt(p, FTS5_STMT_DELETE_CONTENT, &pDel, 0); |
| 23191 } |
| 23192 if( rc==SQLITE_OK ){ |
| 23193 sqlite3_bind_int64(pDel, 1, iDel); |
| 23194 sqlite3_step(pDel); |
| 23195 rc = sqlite3_reset(pDel); |
| 23196 } |
| 23197 } |
| 23198 |
| 23199 /* Write the averages record */ |
| 23200 if( rc==SQLITE_OK ){ |
| 23201 rc = fts5StorageSaveTotals(p); |
| 23202 } |
| 23203 |
| 23204 return rc; |
| 23205 } |
| 23206 |
| 23207 static int sqlite3Fts5StorageSpecialDelete( |
| 23208 Fts5Storage *p, |
| 23209 i64 iDel, |
| 23210 sqlite3_value **apVal |
| 23211 ){ |
| 23212 Fts5Config *pConfig = p->pConfig; |
| 23213 int rc; |
| 23214 sqlite3_stmt *pDel = 0; |
| 23215 |
| 23216 assert( pConfig->eContent!=FTS5_CONTENT_NORMAL ); |
| 23217 rc = fts5StorageLoadTotals(p, 1); |
| 23218 |
| 23219 /* Delete the index records */ |
| 23220 if( rc==SQLITE_OK ){ |
| 23221 int iCol; |
| 23222 Fts5InsertCtx ctx; |
| 23223 ctx.pStorage = p; |
| 23224 ctx.iCol = -1; |
| 23225 |
| 23226 rc = sqlite3Fts5IndexBeginWrite(p->pIndex, 1, iDel); |
| 23227 for(iCol=0; rc==SQLITE_OK && iCol<pConfig->nCol; iCol++){ |
| 23228 if( pConfig->abUnindexed[iCol] ) continue; |
| 23229 ctx.szCol = 0; |
| 23230 rc = sqlite3Fts5Tokenize(pConfig, |
| 23231 FTS5_TOKENIZE_DOCUMENT, |
| 23232 (const char*)sqlite3_value_text(apVal[iCol]), |
| 23233 sqlite3_value_bytes(apVal[iCol]), |
| 23234 (void*)&ctx, |
| 23235 fts5StorageInsertCallback |
| 23236 ); |
| 23237 p->aTotalSize[iCol] -= (i64)ctx.szCol; |
| 23238 } |
| 23239 p->nTotalRow--; |
| 23240 } |
| 23241 |
| 23242 /* Delete the %_docsize record */ |
| 23243 if( pConfig->bColumnsize ){ |
| 23244 if( rc==SQLITE_OK ){ |
| 23245 rc = fts5StorageGetStmt(p, FTS5_STMT_DELETE_DOCSIZE, &pDel, 0); |
| 23246 } |
| 23247 if( rc==SQLITE_OK ){ |
| 23248 sqlite3_bind_int64(pDel, 1, iDel); |
| 23249 sqlite3_step(pDel); |
| 23250 rc = sqlite3_reset(pDel); |
| 23251 } |
| 23252 } |
| 23253 |
| 23254 /* Write the averages record */ |
| 23255 if( rc==SQLITE_OK ){ |
| 23256 rc = fts5StorageSaveTotals(p); |
| 23257 } |
| 23258 |
| 23259 return rc; |
| 23260 } |
| 23261 |
| 23262 /* |
| 23263 ** Delete all entries in the FTS5 index. |
| 23264 */ |
| 23265 static int sqlite3Fts5StorageDeleteAll(Fts5Storage *p){ |
| 23266 Fts5Config *pConfig = p->pConfig; |
| 23267 int rc; |
| 23268 |
| 23269 /* Delete the contents of the %_data, %_idx and %_docsize tables. */ |
| 23270 rc = fts5ExecPrintf(pConfig->db, 0, |
| 23271 "DELETE FROM %Q.'%q_data';" |
| 23272 "DELETE FROM %Q.'%q_idx';", |
| 23273 pConfig->zDb, pConfig->zName, |
| 23274 pConfig->zDb, pConfig->zName |
| 23275 ); |
| 23276 if( rc==SQLITE_OK && pConfig->bColumnsize ){ |
| 23277 rc = fts5ExecPrintf(pConfig->db, 0, |
| 23278 "DELETE FROM %Q.'%q_docsize';", |
| 23279 pConfig->zDb, pConfig->zName |
| 23280 ); |
| 23281 } |
| 23282 |
| 23283 /* Reinitialize the %_data table. This call creates the initial structure |
| 23284 ** and averages records. */ |
| 23285 if( rc==SQLITE_OK ){ |
| 23286 rc = sqlite3Fts5IndexReinit(p->pIndex); |
| 23287 } |
| 23288 if( rc==SQLITE_OK ){ |
| 23289 rc = sqlite3Fts5StorageConfigValue(p, "version", 0, FTS5_CURRENT_VERSION); |
| 23290 } |
| 23291 return rc; |
| 23292 } |
| 23293 |
| 23294 static int sqlite3Fts5StorageRebuild(Fts5Storage *p){ |
| 23295 Fts5Buffer buf = {0,0,0}; |
| 23296 Fts5Config *pConfig = p->pConfig; |
| 23297 sqlite3_stmt *pScan = 0; |
| 23298 Fts5InsertCtx ctx; |
| 23299 int rc; |
| 23300 |
| 23301 memset(&ctx, 0, sizeof(Fts5InsertCtx)); |
| 23302 ctx.pStorage = p; |
| 23303 rc = sqlite3Fts5StorageDeleteAll(p); |
| 23304 if( rc==SQLITE_OK ){ |
| 23305 rc = fts5StorageLoadTotals(p, 1); |
| 23306 } |
| 23307 |
| 23308 if( rc==SQLITE_OK ){ |
| 23309 rc = fts5StorageGetStmt(p, FTS5_STMT_SCAN, &pScan, 0); |
| 23310 } |
| 23311 |
| 23312 while( rc==SQLITE_OK && SQLITE_ROW==sqlite3_step(pScan) ){ |
| 23313 i64 iRowid = sqlite3_column_int64(pScan, 0); |
| 23314 |
| 23315 sqlite3Fts5BufferZero(&buf); |
| 23316 rc = sqlite3Fts5IndexBeginWrite(p->pIndex, 0, iRowid); |
| 23317 for(ctx.iCol=0; rc==SQLITE_OK && ctx.iCol<pConfig->nCol; ctx.iCol++){ |
| 23318 ctx.szCol = 0; |
| 23319 if( pConfig->abUnindexed[ctx.iCol]==0 ){ |
| 23320 rc = sqlite3Fts5Tokenize(pConfig, |
| 23321 FTS5_TOKENIZE_DOCUMENT, |
| 23322 (const char*)sqlite3_column_text(pScan, ctx.iCol+1), |
| 23323 sqlite3_column_bytes(pScan, ctx.iCol+1), |
| 23324 (void*)&ctx, |
| 23325 fts5StorageInsertCallback |
| 23326 ); |
| 23327 } |
| 23328 sqlite3Fts5BufferAppendVarint(&rc, &buf, ctx.szCol); |
| 23329 p->aTotalSize[ctx.iCol] += (i64)ctx.szCol; |
| 23330 } |
| 23331 p->nTotalRow++; |
| 23332 |
| 23333 if( rc==SQLITE_OK ){ |
| 23334 rc = fts5StorageInsertDocsize(p, iRowid, &buf); |
| 23335 } |
| 23336 } |
| 23337 sqlite3_free(buf.p); |
| 23338 |
| 23339 /* Write the averages record */ |
| 23340 if( rc==SQLITE_OK ){ |
| 23341 rc = fts5StorageSaveTotals(p); |
| 23342 } |
| 23343 return rc; |
| 23344 } |
| 23345 |
| 23346 static int sqlite3Fts5StorageOptimize(Fts5Storage *p){ |
| 23347 return sqlite3Fts5IndexOptimize(p->pIndex); |
| 23348 } |
| 23349 |
| 23350 static int sqlite3Fts5StorageMerge(Fts5Storage *p, int nMerge){ |
| 23351 return sqlite3Fts5IndexMerge(p->pIndex, nMerge); |
| 23352 } |
| 23353 |
| 23354 /* |
| 23355 ** Allocate a new rowid. This is used for "external content" tables when |
| 23356 ** a NULL value is inserted into the rowid column. The new rowid is allocated |
| 23357 ** by inserting a dummy row into the %_docsize table. The dummy will be |
| 23358 ** overwritten later. |
| 23359 ** |
| 23360 ** If the %_docsize table does not exist, SQLITE_MISMATCH is returned. In |
| 23361 ** this case the user is required to provide a rowid explicitly. |
| 23362 */ |
| 23363 static int fts5StorageNewRowid(Fts5Storage *p, i64 *piRowid){ |
| 23364 int rc = SQLITE_MISMATCH; |
| 23365 if( p->pConfig->bColumnsize ){ |
| 23366 sqlite3_stmt *pReplace = 0; |
| 23367 rc = fts5StorageGetStmt(p, FTS5_STMT_REPLACE_DOCSIZE, &pReplace, 0); |
| 23368 if( rc==SQLITE_OK ){ |
| 23369 sqlite3_bind_null(pReplace, 1); |
| 23370 sqlite3_bind_null(pReplace, 2); |
| 23371 sqlite3_step(pReplace); |
| 23372 rc = sqlite3_reset(pReplace); |
| 23373 } |
| 23374 if( rc==SQLITE_OK ){ |
| 23375 *piRowid = sqlite3_last_insert_rowid(p->pConfig->db); |
| 23376 } |
| 23377 } |
| 23378 return rc; |
| 23379 } |
| 23380 |
| 23381 /* |
| 23382 ** Insert a new row into the FTS content table. |
| 23383 */ |
| 23384 static int sqlite3Fts5StorageContentInsert( |
| 23385 Fts5Storage *p, |
| 23386 sqlite3_value **apVal, |
| 23387 i64 *piRowid |
| 23388 ){ |
| 23389 Fts5Config *pConfig = p->pConfig; |
| 23390 int rc = SQLITE_OK; |
| 23391 |
| 23392 /* Insert the new row into the %_content table. */ |
| 23393 if( pConfig->eContent!=FTS5_CONTENT_NORMAL ){ |
| 23394 if( sqlite3_value_type(apVal[1])==SQLITE_INTEGER ){ |
| 23395 *piRowid = sqlite3_value_int64(apVal[1]); |
| 23396 }else{ |
| 23397 rc = fts5StorageNewRowid(p, piRowid); |
| 23398 } |
| 23399 }else{ |
| 23400 sqlite3_stmt *pInsert = 0; /* Statement to write %_content table */ |
| 23401 int i; /* Counter variable */ |
| 23402 rc = fts5StorageGetStmt(p, FTS5_STMT_INSERT_CONTENT, &pInsert, 0); |
| 23403 for(i=1; rc==SQLITE_OK && i<=pConfig->nCol+1; i++){ |
| 23404 rc = sqlite3_bind_value(pInsert, i, apVal[i]); |
| 23405 } |
| 23406 if( rc==SQLITE_OK ){ |
| 23407 sqlite3_step(pInsert); |
| 23408 rc = sqlite3_reset(pInsert); |
| 23409 } |
| 23410 *piRowid = sqlite3_last_insert_rowid(pConfig->db); |
| 23411 } |
| 23412 |
| 23413 return rc; |
| 23414 } |
| 23415 |
| 23416 /* |
| 23417 ** Insert new entries into the FTS index and %_docsize table. |
| 23418 */ |
| 23419 static int sqlite3Fts5StorageIndexInsert( |
| 23420 Fts5Storage *p, |
| 23421 sqlite3_value **apVal, |
| 23422 i64 iRowid |
| 23423 ){ |
| 23424 Fts5Config *pConfig = p->pConfig; |
| 23425 int rc = SQLITE_OK; /* Return code */ |
| 23426 Fts5InsertCtx ctx; /* Tokenization callback context object */ |
| 23427 Fts5Buffer buf; /* Buffer used to build up %_docsize blob */ |
| 23428 |
| 23429 memset(&buf, 0, sizeof(Fts5Buffer)); |
| 23430 ctx.pStorage = p; |
| 23431 rc = fts5StorageLoadTotals(p, 1); |
| 23432 |
| 23433 if( rc==SQLITE_OK ){ |
| 23434 rc = sqlite3Fts5IndexBeginWrite(p->pIndex, 0, iRowid); |
| 23435 } |
| 23436 for(ctx.iCol=0; rc==SQLITE_OK && ctx.iCol<pConfig->nCol; ctx.iCol++){ |
| 23437 ctx.szCol = 0; |
| 23438 if( pConfig->abUnindexed[ctx.iCol]==0 ){ |
| 23439 rc = sqlite3Fts5Tokenize(pConfig, |
| 23440 FTS5_TOKENIZE_DOCUMENT, |
| 23441 (const char*)sqlite3_value_text(apVal[ctx.iCol+2]), |
| 23442 sqlite3_value_bytes(apVal[ctx.iCol+2]), |
| 23443 (void*)&ctx, |
| 23444 fts5StorageInsertCallback |
| 23445 ); |
| 23446 } |
| 23447 sqlite3Fts5BufferAppendVarint(&rc, &buf, ctx.szCol); |
| 23448 p->aTotalSize[ctx.iCol] += (i64)ctx.szCol; |
| 23449 } |
| 23450 p->nTotalRow++; |
| 23451 |
| 23452 /* Write the %_docsize record */ |
| 23453 if( rc==SQLITE_OK ){ |
| 23454 rc = fts5StorageInsertDocsize(p, iRowid, &buf); |
| 23455 } |
| 23456 sqlite3_free(buf.p); |
| 23457 |
| 23458 /* Write the averages record */ |
| 23459 if( rc==SQLITE_OK ){ |
| 23460 rc = fts5StorageSaveTotals(p); |
| 23461 } |
| 23462 |
| 23463 return rc; |
| 23464 } |
| 23465 |
| 23466 static int fts5StorageCount(Fts5Storage *p, const char *zSuffix, i64 *pnRow){ |
| 23467 Fts5Config *pConfig = p->pConfig; |
| 23468 char *zSql; |
| 23469 int rc; |
| 23470 |
| 23471 zSql = sqlite3_mprintf("SELECT count(*) FROM %Q.'%q_%s'", |
| 23472 pConfig->zDb, pConfig->zName, zSuffix |
| 23473 ); |
| 23474 if( zSql==0 ){ |
| 23475 rc = SQLITE_NOMEM; |
| 23476 }else{ |
| 23477 sqlite3_stmt *pCnt = 0; |
| 23478 rc = sqlite3_prepare_v2(pConfig->db, zSql, -1, &pCnt, 0); |
| 23479 if( rc==SQLITE_OK ){ |
| 23480 if( SQLITE_ROW==sqlite3_step(pCnt) ){ |
| 23481 *pnRow = sqlite3_column_int64(pCnt, 0); |
| 23482 } |
| 23483 rc = sqlite3_finalize(pCnt); |
| 23484 } |
| 23485 } |
| 23486 |
| 23487 sqlite3_free(zSql); |
| 23488 return rc; |
| 23489 } |
| 23490 |
| 23491 /* |
| 23492 ** Context object used by sqlite3Fts5StorageIntegrity(). |
| 23493 */ |
| 23494 typedef struct Fts5IntegrityCtx Fts5IntegrityCtx; |
| 23495 struct Fts5IntegrityCtx { |
| 23496 i64 iRowid; |
| 23497 int iCol; |
| 23498 int szCol; |
| 23499 u64 cksum; |
| 23500 Fts5Config *pConfig; |
| 23501 }; |
| 23502 |
| 23503 /* |
| 23504 ** Tokenization callback used by integrity check. |
| 23505 */ |
| 23506 static int fts5StorageIntegrityCallback( |
| 23507 void *pContext, /* Pointer to Fts5IntegrityCtx object */ |
| 23508 int tflags, |
| 23509 const char *pToken, /* Buffer containing token */ |
| 23510 int nToken, /* Size of token in bytes */ |
| 23511 int iStart, /* Start offset of token */ |
| 23512 int iEnd /* End offset of token */ |
| 23513 ){ |
| 23514 Fts5IntegrityCtx *pCtx = (Fts5IntegrityCtx*)pContext; |
| 23515 if( (tflags & FTS5_TOKEN_COLOCATED)==0 || pCtx->szCol==0 ){ |
| 23516 pCtx->szCol++; |
| 23517 } |
| 23518 pCtx->cksum ^= sqlite3Fts5IndexCksum( |
| 23519 pCtx->pConfig, pCtx->iRowid, pCtx->iCol, pCtx->szCol-1, pToken, nToken |
| 23520 ); |
| 23521 return SQLITE_OK; |
| 23522 } |
| 23523 |
| 23524 /* |
| 23525 ** Check that the contents of the FTS index match that of the %_content |
| 23526 ** table. Return SQLITE_OK if they do, or SQLITE_CORRUPT if not. Return |
| 23527 ** some other SQLite error code if an error occurs while attempting to |
| 23528 ** determine this. |
| 23529 */ |
| 23530 static int sqlite3Fts5StorageIntegrity(Fts5Storage *p){ |
| 23531 Fts5Config *pConfig = p->pConfig; |
| 23532 int rc; /* Return code */ |
| 23533 int *aColSize; /* Array of size pConfig->nCol */ |
| 23534 i64 *aTotalSize; /* Array of size pConfig->nCol */ |
| 23535 Fts5IntegrityCtx ctx; |
| 23536 sqlite3_stmt *pScan; |
| 23537 |
| 23538 memset(&ctx, 0, sizeof(Fts5IntegrityCtx)); |
| 23539 ctx.pConfig = p->pConfig; |
| 23540 aTotalSize = (i64*)sqlite3_malloc(pConfig->nCol * (sizeof(int)+sizeof(i64))); |
| 23541 if( !aTotalSize ) return SQLITE_NOMEM; |
| 23542 aColSize = (int*)&aTotalSize[pConfig->nCol]; |
| 23543 memset(aTotalSize, 0, sizeof(i64) * pConfig->nCol); |
| 23544 |
| 23545 /* Generate the expected index checksum based on the contents of the |
| 23546 ** %_content table. This block stores the checksum in ctx.cksum. */ |
| 23547 rc = fts5StorageGetStmt(p, FTS5_STMT_SCAN, &pScan, 0); |
| 23548 if( rc==SQLITE_OK ){ |
| 23549 int rc2; |
| 23550 while( SQLITE_ROW==sqlite3_step(pScan) ){ |
| 23551 int i; |
| 23552 ctx.iRowid = sqlite3_column_int64(pScan, 0); |
| 23553 ctx.szCol = 0; |
| 23554 if( pConfig->bColumnsize ){ |
| 23555 rc = sqlite3Fts5StorageDocsize(p, ctx.iRowid, aColSize); |
| 23556 } |
| 23557 for(i=0; rc==SQLITE_OK && i<pConfig->nCol; i++){ |
| 23558 if( pConfig->abUnindexed[i] ) continue; |
| 23559 ctx.iCol = i; |
| 23560 ctx.szCol = 0; |
| 23561 rc = sqlite3Fts5Tokenize(pConfig, |
| 23562 FTS5_TOKENIZE_DOCUMENT, |
| 23563 (const char*)sqlite3_column_text(pScan, i+1), |
| 23564 sqlite3_column_bytes(pScan, i+1), |
| 23565 (void*)&ctx, |
| 23566 fts5StorageIntegrityCallback |
| 23567 ); |
| 23568 if( pConfig->bColumnsize && ctx.szCol!=aColSize[i] ){ |
| 23569 rc = FTS5_CORRUPT; |
| 23570 } |
| 23571 aTotalSize[i] += ctx.szCol; |
| 23572 } |
| 23573 if( rc!=SQLITE_OK ) break; |
| 23574 } |
| 23575 rc2 = sqlite3_reset(pScan); |
| 23576 if( rc==SQLITE_OK ) rc = rc2; |
| 23577 } |
| 23578 |
| 23579 /* Test that the "totals" (sometimes called "averages") record looks Ok */ |
| 23580 if( rc==SQLITE_OK ){ |
| 23581 int i; |
| 23582 rc = fts5StorageLoadTotals(p, 0); |
| 23583 for(i=0; rc==SQLITE_OK && i<pConfig->nCol; i++){ |
| 23584 if( p->aTotalSize[i]!=aTotalSize[i] ) rc = FTS5_CORRUPT; |
| 23585 } |
| 23586 } |
| 23587 |
| 23588 /* Check that the %_docsize and %_content tables contain the expected |
| 23589 ** number of rows. */ |
| 23590 if( rc==SQLITE_OK && pConfig->eContent==FTS5_CONTENT_NORMAL ){ |
| 23591 i64 nRow = 0; |
| 23592 rc = fts5StorageCount(p, "content", &nRow); |
| 23593 if( rc==SQLITE_OK && nRow!=p->nTotalRow ) rc = FTS5_CORRUPT; |
| 23594 } |
| 23595 if( rc==SQLITE_OK && pConfig->bColumnsize ){ |
| 23596 i64 nRow = 0; |
| 23597 rc = fts5StorageCount(p, "docsize", &nRow); |
| 23598 if( rc==SQLITE_OK && nRow!=p->nTotalRow ) rc = FTS5_CORRUPT; |
| 23599 } |
| 23600 |
| 23601 /* Pass the expected checksum down to the FTS index module. It will |
| 23602 ** verify, amongst other things, that it matches the checksum generated by |
| 23603 ** inspecting the index itself. */ |
| 23604 if( rc==SQLITE_OK ){ |
| 23605 rc = sqlite3Fts5IndexIntegrityCheck(p->pIndex, ctx.cksum); |
| 23606 } |
| 23607 |
| 23608 sqlite3_free(aTotalSize); |
| 23609 return rc; |
| 23610 } |
| 23611 |
| 23612 /* |
| 23613 ** Obtain an SQLite statement handle that may be used to read data from the |
| 23614 ** %_content table. |
| 23615 */ |
| 23616 static int sqlite3Fts5StorageStmt( |
| 23617 Fts5Storage *p, |
| 23618 int eStmt, |
| 23619 sqlite3_stmt **pp, |
| 23620 char **pzErrMsg |
| 23621 ){ |
| 23622 int rc; |
| 23623 assert( eStmt==FTS5_STMT_SCAN_ASC |
| 23624 || eStmt==FTS5_STMT_SCAN_DESC |
| 23625 || eStmt==FTS5_STMT_LOOKUP |
| 23626 ); |
| 23627 rc = fts5StorageGetStmt(p, eStmt, pp, pzErrMsg); |
| 23628 if( rc==SQLITE_OK ){ |
| 23629 assert( p->aStmt[eStmt]==*pp ); |
| 23630 p->aStmt[eStmt] = 0; |
| 23631 } |
| 23632 return rc; |
| 23633 } |
| 23634 |
| 23635 /* |
| 23636 ** Release an SQLite statement handle obtained via an earlier call to |
| 23637 ** sqlite3Fts5StorageStmt(). The eStmt parameter passed to this function |
| 23638 ** must match that passed to the sqlite3Fts5StorageStmt() call. |
| 23639 */ |
| 23640 static void sqlite3Fts5StorageStmtRelease( |
| 23641 Fts5Storage *p, |
| 23642 int eStmt, |
| 23643 sqlite3_stmt *pStmt |
| 23644 ){ |
| 23645 assert( eStmt==FTS5_STMT_SCAN_ASC |
| 23646 || eStmt==FTS5_STMT_SCAN_DESC |
| 23647 || eStmt==FTS5_STMT_LOOKUP |
| 23648 ); |
| 23649 if( p->aStmt[eStmt]==0 ){ |
| 23650 sqlite3_reset(pStmt); |
| 23651 p->aStmt[eStmt] = pStmt; |
| 23652 }else{ |
| 23653 sqlite3_finalize(pStmt); |
| 23654 } |
| 23655 } |
| 23656 |
| 23657 static int fts5StorageDecodeSizeArray( |
| 23658 int *aCol, int nCol, /* Array to populate */ |
| 23659 const u8 *aBlob, int nBlob /* Record to read varints from */ |
| 23660 ){ |
| 23661 int i; |
| 23662 int iOff = 0; |
| 23663 for(i=0; i<nCol; i++){ |
| 23664 if( iOff>=nBlob ) return 1; |
| 23665 iOff += fts5GetVarint32(&aBlob[iOff], aCol[i]); |
| 23666 } |
| 23667 return (iOff!=nBlob); |
| 23668 } |
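The strictness of fts5StorageDecodeSizeArray() (exactly nCol varints, no bytes left over) can be illustrated with a standalone sketch. `sketchGetVarint32` is a hypothetical stand-in for fts5GetVarint32() that takes an explicit length and reports truncation, which the real macro may not do; `sketchDecodeSizes` is likewise an assumed name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for fts5GetVarint32(): decode one varint of at
** most 5 bytes from a[0..n-1] into *pVal. Returns the number of bytes
** consumed, or 0 if no terminating byte is found within those limits. */
static int sketchGetVarint32(const unsigned char *a, int n, uint32_t *pVal){
  uint32_t v = 0;
  int i;
  for(i=0; i<n && i<5; i++){
    v = (v<<7) | (uint32_t)(a[i] & 0x7f);
    if( (a[i] & 0x80)==0 ){
      *pVal = v;
      return i+1;
    }
  }
  return 0;
}

/* Populate aCol[0..nCol-1] from the blob. Returns 0 on success, or 1 if
** the blob is too short, contains a truncated varint, or has trailing
** bytes after the last column. */
static int sketchDecodeSizes(
  uint32_t *aCol, int nCol,               /* Array to populate */
  const unsigned char *aBlob, int nBlob   /* Record to read varints from */
){
  int i;
  int iOff = 0;
  for(i=0; i<nCol; i++){
    int n;
    if( iOff>=nBlob ) return 1;
    n = sketchGetVarint32(&aBlob[iOff], nBlob-iOff, &aCol[i]);
    if( n==0 ) return 1;
    iOff += n;
  }
  return (iOff!=nBlob);
}
```

With the blob 0x03 0x82 0x2c, a two-column table decodes to sizes 3 and 300, while asking for one or three columns is reported as malformed.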
| 23669 |
| 23670 /* |
| 23671 ** Argument aCol points to an array of integers containing one entry for |
| 23672 ** each table column. This function reads the %_docsize record for the |
| 23673 ** specified rowid and populates aCol[] with the results. |
| 23674 ** |
| 23675 ** An SQLite error code is returned if an error occurs, or SQLITE_OK |
| 23676 ** otherwise. |
| 23677 */ |
| 23678 static int sqlite3Fts5StorageDocsize(Fts5Storage *p, i64 iRowid, int *aCol){ |
| 23679 int nCol = p->pConfig->nCol; /* Number of user columns in table */ |
| 23680 sqlite3_stmt *pLookup = 0; /* Statement to query %_docsize */ |
| 23681 int rc; /* Return Code */ |
| 23682 |
| 23683 assert( p->pConfig->bColumnsize ); |
| 23684 rc = fts5StorageGetStmt(p, FTS5_STMT_LOOKUP_DOCSIZE, &pLookup, 0); |
| 23685 if( rc==SQLITE_OK ){ |
| 23686 int bCorrupt = 1; |
| 23687 sqlite3_bind_int64(pLookup, 1, iRowid); |
| 23688 if( SQLITE_ROW==sqlite3_step(pLookup) ){ |
| 23689 const u8 *aBlob = sqlite3_column_blob(pLookup, 0); |
| 23690 int nBlob = sqlite3_column_bytes(pLookup, 0); |
| 23691 if( 0==fts5StorageDecodeSizeArray(aCol, nCol, aBlob, nBlob) ){ |
| 23692 bCorrupt = 0; |
| 23693 } |
| 23694 } |
| 23695 rc = sqlite3_reset(pLookup); |
| 23696 if( bCorrupt && rc==SQLITE_OK ){ |
| 23697 rc = FTS5_CORRUPT; |
| 23698 } |
| 23699 } |
| 23700 |
| 23701 return rc; |
| 23702 } |
| 23703 |
| 23704 static int sqlite3Fts5StorageSize(Fts5Storage *p, int iCol, i64 *pnToken){ |
| 23705 int rc = fts5StorageLoadTotals(p, 0); |
| 23706 if( rc==SQLITE_OK ){ |
| 23707 *pnToken = 0; |
| 23708 if( iCol<0 ){ |
| 23709 int i; |
| 23710 for(i=0; i<p->pConfig->nCol; i++){ |
| 23711 *pnToken += p->aTotalSize[i]; |
| 23712 } |
| 23713 }else if( iCol<p->pConfig->nCol ){ |
| 23714 *pnToken = p->aTotalSize[iCol]; |
| 23715 }else{ |
| 23716 rc = SQLITE_RANGE; |
| 23717 } |
| 23718 } |
| 23719 return rc; |
| 23720 } |
| 23721 |
| 23722 static int sqlite3Fts5StorageRowCount(Fts5Storage *p, i64 *pnRow){ |
| 23723 int rc = fts5StorageLoadTotals(p, 0); |
| 23724 if( rc==SQLITE_OK ){ |
| 23725 *pnRow = p->nTotalRow; |
| 23726 } |
| 23727 return rc; |
| 23728 } |
| 23729 |
| 23730 /* |
| 23731 ** Flush any data currently held in-memory to disk. |
| 23732 */ |
| 23733 static int sqlite3Fts5StorageSync(Fts5Storage *p, int bCommit){ |
| 23734 if( bCommit && p->bTotalsValid ){ |
| 23735 int rc = fts5StorageSaveTotals(p); |
| 23736 p->bTotalsValid = 0; |
| 23737 if( rc!=SQLITE_OK ) return rc; |
| 23738 } |
| 23739 return sqlite3Fts5IndexSync(p->pIndex, bCommit); |
| 23740 } |
| 23741 |
| 23742 static int sqlite3Fts5StorageRollback(Fts5Storage *p){ |
| 23743 p->bTotalsValid = 0; |
| 23744 return sqlite3Fts5IndexRollback(p->pIndex); |
| 23745 } |
| 23746 |
| 23747 static int sqlite3Fts5StorageConfigValue( |
| 23748 Fts5Storage *p, |
| 23749 const char *z, |
| 23750 sqlite3_value *pVal, |
| 23751 int iVal |
| 23752 ){ |
| 23753 sqlite3_stmt *pReplace = 0; |
| 23754 int rc = fts5StorageGetStmt(p, FTS5_STMT_REPLACE_CONFIG, &pReplace, 0); |
| 23755 if( rc==SQLITE_OK ){ |
| 23756 sqlite3_bind_text(pReplace, 1, z, -1, SQLITE_STATIC); |
| 23757 if( pVal ){ |
| 23758 sqlite3_bind_value(pReplace, 2, pVal); |
| 23759 }else{ |
| 23760 sqlite3_bind_int(pReplace, 2, iVal); |
| 23761 } |
| 23762 sqlite3_step(pReplace); |
| 23763 rc = sqlite3_reset(pReplace); |
| 23764 } |
| 23765 if( rc==SQLITE_OK && pVal ){ |
| 23766 int iNew = p->pConfig->iCookie + 1; |
| 23767 rc = sqlite3Fts5IndexSetCookie(p->pIndex, iNew); |
| 23768 if( rc==SQLITE_OK ){ |
| 23769 p->pConfig->iCookie = iNew; |
| 23770 } |
| 23771 } |
| 23772 return rc; |
| 23773 } |
| 23774 |
| 23775 |
| 23776 |
| 23777 /* |
| 23778 ** 2014 May 31 |
| 23779 ** |
| 23780 ** The author disclaims copyright to this source code. In place of |
| 23781 ** a legal notice, here is a blessing: |
| 23782 ** |
| 23783 ** May you do good and not evil. |
| 23784 ** May you find forgiveness for yourself and forgive others. |
| 23785 ** May you share freely, never taking more than you give. |
| 23786 ** |
| 23787 ****************************************************************************** |
| 23788 */ |
| 23789 |
| 23790 |
| 23791 /* #include "fts5Int.h" */ |
| 23792 |
| 23793 /************************************************************************** |
| 23794 ** Start of ascii tokenizer implementation. |
| 23795 */ |
| 23796 |
| 23797 /* |
| 23798 ** For tokenizers with no "unicode" modifier, the set of token characters |
| 23799 ** is the same as the set of ASCII range alphanumeric characters. |
| 23800 */ |
| 23801 static unsigned char aAsciiTokenChar[128] = { |
| 23802 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 0x00..0x0F */ |
| 23803 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 0x10..0x1F */ |
| 23804 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 0x20..0x2F */ |
| 23805 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, /* 0x30..0x3F */ |
| 23806 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 0x40..0x4F */ |
| 23807 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, /* 0x50..0x5F */ |
| 23808 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 0x60..0x6F */ |
| 23809 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, /* 0x70..0x7F */ |
| 23810 }; |
| 23811 |
| 23812 typedef struct AsciiTokenizer AsciiTokenizer; |
| 23813 struct AsciiTokenizer { |
| 23814 unsigned char aTokenChar[128]; |
| 23815 }; |
| 23816 |
| 23817 static void fts5AsciiAddExceptions( |
| 23818 AsciiTokenizer *p, |
| 23819 const char *zArg, |
| 23820 int bTokenChars |
| 23821 ){ |
| 23822 int i; |
| 23823 for(i=0; zArg[i]; i++){ |
| 23824 if( (zArg[i] & 0x80)==0 ){ |
| 23825 p->aTokenChar[(int)zArg[i]] = (unsigned char)bTokenChars; |
| 23826 } |
| 23827 } |
| 23828 } |
| 23829 |
| 23830 /* |
| 23831 ** Delete an "ascii" tokenizer. |
| 23832 */ |
| 23833 static void fts5AsciiDelete(Fts5Tokenizer *p){ |
| 23834 sqlite3_free(p); |
| 23835 } |
| 23836 |
| 23837 /* |
| 23838 ** Create an "ascii" tokenizer. |
| 23839 */ |
| 23840 static int fts5AsciiCreate( |
| 23841 void *pCtx, |
| 23842 const char **azArg, int nArg, |
| 23843 Fts5Tokenizer **ppOut |
| 23844 ){ |
| 23845 int rc = SQLITE_OK; |
| 23846 AsciiTokenizer *p = 0; |
| 23847 if( nArg%2 ){ |
| 23848 rc = SQLITE_ERROR; |
| 23849 }else{ |
| 23850 p = sqlite3_malloc(sizeof(AsciiTokenizer)); |
| 23851 if( p==0 ){ |
| 23852 rc = SQLITE_NOMEM; |
| 23853 }else{ |
| 23854 int i; |
| 23855 memset(p, 0, sizeof(AsciiTokenizer)); |
| 23856 memcpy(p->aTokenChar, aAsciiTokenChar, sizeof(aAsciiTokenChar)); |
| 23857 for(i=0; rc==SQLITE_OK && i<nArg; i+=2){ |
| 23858 const char *zArg = azArg[i+1]; |
| 23859 if( 0==sqlite3_stricmp(azArg[i], "tokenchars") ){ |
| 23860 fts5AsciiAddExceptions(p, zArg, 1); |
| 23861 }else |
| 23862 if( 0==sqlite3_stricmp(azArg[i], "separators") ){ |
| 23863 fts5AsciiAddExceptions(p, zArg, 0); |
| 23864 }else{ |
| 23865 rc = SQLITE_ERROR; |
| 23866 } |
| 23867 } |
| 23868 if( rc!=SQLITE_OK ){ |
| 23869 fts5AsciiDelete((Fts5Tokenizer*)p); |
| 23870 p = 0; |
| 23871 } |
| 23872 } |
| 23873 } |
| 23874 |
| 23875 *ppOut = (Fts5Tokenizer*)p; |
| 23876 return rc; |
| 23877 } |
| 23878 |
| 23879 |
| 23880 static void asciiFold(char *aOut, const char *aIn, int nByte){ |
| 23881 int i; |
| 23882 for(i=0; i<nByte; i++){ |
| 23883 char c = aIn[i]; |
| 23884 if( c>='A' && c<='Z' ) c += 32; |
| 23885 aOut[i] = c; |
| 23886 } |
| 23887 } |
| 23888 |
| 23889 /* |
| 23890 ** Tokenize some text using the ascii tokenizer. |
| 23891 */ |
| 23892 static int fts5AsciiTokenize( |
| 23893 Fts5Tokenizer *pTokenizer, |
| 23894 void *pCtx, |
| 23895 int flags, |
| 23896 const char *pText, int nText, |
| 23897 int (*xToken)(void*, int, const char*, int nToken, int iStart, int iEnd) |
| 23898 ){ |
| 23899 AsciiTokenizer *p = (AsciiTokenizer*)pTokenizer; |
| 23900 int rc = SQLITE_OK; |
| 23901 int ie; |
| 23902 int is = 0; |
| 23903 |
| 23904 char aFold[64]; |
| 23905 int nFold = sizeof(aFold); |
| 23906 char *pFold = aFold; |
| 23907 unsigned char *a = p->aTokenChar; |
| 23908 |
| 23909 while( is<nText && rc==SQLITE_OK ){ |
| 23910 int nByte; |
| 23911 |
| 23912 /* Skip any leading divider characters. */ |
| 23913 while( is<nText && ((pText[is]&0x80)==0 && a[(int)pText[is]]==0) ){ |
| 23914 is++; |
| 23915 } |
| 23916 if( is==nText ) break; |
| 23917 |
| 23918 /* Count the token characters */ |
| 23919 ie = is+1; |
| 23920 while( ie<nText && ((pText[ie]&0x80) || a[(int)pText[ie]] ) ){ |
| 23921 ie++; |
| 23922 } |
| 23923 |
| 23924 /* Fold to lower case */ |
| 23925 nByte = ie-is; |
| 23926 if( nByte>nFold ){ |
| 23927 if( pFold!=aFold ) sqlite3_free(pFold); |
| 23928 pFold = sqlite3_malloc(nByte*2); |
| 23929 if( pFold==0 ){ |
| 23930 rc = SQLITE_NOMEM; |
| 23931 break; |
| 23932 } |
| 23933 nFold = nByte*2; |
| 23934 } |
| 23935 asciiFold(pFold, &pText[is], nByte); |
| 23936 |
| 23937 /* Invoke the token callback */ |
| 23938 rc = xToken(pCtx, 0, pFold, nByte, is, ie); |
| 23939 is = ie+1; |
| 23940 } |
| 23941 |
| 23942 if( pFold!=aFold ) sqlite3_free(pFold); |
| 23943 if( rc==SQLITE_DONE ) rc = SQLITE_OK; |
| 23944 return rc; |
| 23945 } |
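The scan, fold and callback loop of fts5AsciiTokenize() can be tried outside SQLite with a reduced sketch. All names below (`sketch_token_cb`, `sketchAsciiTokenize`, `SketchTokenLog`, `sketchCollect`) are hypothetical, the callback signature drops the tflags argument, the token character set is fixed to ASCII alphanumerics plus bytes with the high bit set, and tokens longer than 64 bytes are truncated where the real code grows a heap buffer.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, reduced token callback: no tflags argument. */
typedef int (*sketch_token_cb)(
  void *pCtx, const char *pToken, int nToken, int iStart, int iEnd
);

/* Bytes with the high bit set are always token characters, as in the
** ascii tokenizer above. */
static int sketchIsTokenChar(unsigned char c){
  return c>=0x80
      || (c>='0' && c<='9')
      || (c>='A' && c<='Z')
      || (c>='a' && c<='z');
}

/* Split pText into tokens, fold A-Z to lower case and invoke xToken for
** each token. Returns 0, or the first non-zero value from xToken. */
static int sketchAsciiTokenize(
  const char *pText, int nText,
  void *pCtx,
  sketch_token_cb xToken
){
  int rc = 0;
  int is = 0;
  while( is<nText && rc==0 ){
    char aFold[64];
    int ie, nByte, i;

    /* Skip any leading divider characters. */
    while( is<nText && !sketchIsTokenChar((unsigned char)pText[is]) ) is++;
    if( is==nText ) break;

    /* Count the token characters */
    ie = is+1;
    while( ie<nText && sketchIsTokenChar((unsigned char)pText[ie]) ) ie++;

    /* Fold to lower case (truncating over-long tokens in this sketch) */
    nByte = ie-is;
    if( nByte>(int)sizeof(aFold) ) nByte = (int)sizeof(aFold);
    for(i=0; i<nByte; i++){
      char c = pText[is+i];
      if( c>='A' && c<='Z' ) c += 32;
      aFold[i] = c;
    }

    /* Invoke the token callback */
    rc = xToken(pCtx, aFold, nByte, is, ie);
    is = ie+1;
  }
  return rc;
}

/* Example callback: record up to four short tokens. */
typedef struct SketchTokenLog SketchTokenLog;
struct SketchTokenLog {
  int nToken;
  char aTok[4][16];
};

static int sketchCollect(
  void *pCtx, const char *pToken, int nToken, int iStart, int iEnd
){
  SketchTokenLog *p = (SketchTokenLog*)pCtx;
  (void)iStart;
  (void)iEnd;
  if( p->nToken<4 && nToken<16 ){
    memcpy(p->aTok[p->nToken], pToken, (size_t)nToken);
    p->aTok[p->nToken][nToken] = '\0';
    p->nToken++;
  }
  return 0;
}
```

Passing the text "Hello, FTS5 world!" with `sketchCollect` as the callback records the three tokens "hello", "fts5" and "world".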
| 23946 |
| 23947 /************************************************************************** |
| 23948 ** Start of unicode61 tokenizer implementation. |
| 23949 */ |
| 23950 |
| 23951 |
| 23952 /* |
| 23953 ** The following two macros - READ_UTF8 and WRITE_UTF8 - have been copied |
| 23954 ** from the sqlite3 source file utf.c. If this file is compiled as part |
| 23955 ** of the amalgamation, they are not required. |
| 23956 */ |
| 23957 #ifndef SQLITE_AMALGAMATION |
| 23958 |
| 23959 static const unsigned char sqlite3Utf8Trans1[] = { |
| 23960 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, |
| 23961 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, |
| 23962 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, |
| 23963 0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f, |
| 23964 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, |
| 23965 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, |
| 23966 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, |
| 23967 0x00, 0x01, 0x02, 0x03, 0x00, 0x01, 0x00, 0x00, |
| 23968 }; |
| 23969 |
| 23970 #define READ_UTF8(zIn, zTerm, c) \ |
| 23971 c = *(zIn++); \ |
| 23972 if( c>=0xc0 ){ \ |
| 23973 c = sqlite3Utf8Trans1[c-0xc0]; \ |
| 23974 while( zIn!=zTerm && (*zIn & 0xc0)==0x80 ){ \ |
| 23975 c = (c<<6) + (0x3f & *(zIn++)); \ |
| 23976 } \ |
| 23977 if( c<0x80 \ |
| 23978 || (c&0xFFFFF800)==0xD800 \ |
| 23979 || (c&0xFFFFFFFE)==0xFFFE ){ c = 0xFFFD; } \ |
| 23980 } |
| 23981 |
| 23982 |
| 23983 #define WRITE_UTF8(zOut, c) { \ |
| 23984 if( c<0x00080 ){ \ |
| 23985 *zOut++ = (unsigned char)(c&0xFF); \ |
| 23986 } \ |
| 23987 else if( c<0x00800 ){ \ |
| 23988 *zOut++ = 0xC0 + (unsigned char)((c>>6)&0x1F); \ |
| 23989 *zOut++ = 0x80 + (unsigned char)(c & 0x3F); \ |
| 23990 } \ |
| 23991 else if( c<0x10000 ){ \ |
| 23992 *zOut++ = 0xE0 + (unsigned char)((c>>12)&0x0F); \ |
| 23993 *zOut++ = 0x80 + (unsigned char)((c>>6) & 0x3F); \ |
| 23994 *zOut++ = 0x80 + (unsigned char)(c & 0x3F); \ |
| 23995 }else{ \ |
| 23996 *zOut++ = 0xF0 + (unsigned char)((c>>18) & 0x07); \ |
| 23997 *zOut++ = 0x80 + (unsigned char)((c>>12) & 0x3F); \ |
| 23998 *zOut++ = 0x80 + (unsigned char)((c>>6) & 0x3F); \ |
| 23999 *zOut++ = 0x80 + (unsigned char)(c & 0x3F); \ |
| 24000 } \ |
| 24001 } |
| 24002 |
| 24003 #endif /* ifndef SQLITE_AMALGAMATION */ |
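The decoding done by READ_UTF8 can be expressed as a function for experimentation. This sketch is an assumption-laden rewrite, not the macro itself: `sketchReadUtf8` is a hypothetical name, and the lead-byte payload is computed with bit masks that should be equivalent to the sqlite3Utf8Trans1[] lookup. As with the macro, the caller must guarantee that at least one byte is available.

```c
#include <assert.h>

/* Decode one codepoint starting at z (which must be less than zTerm).
** Stores the codepoint in *pc and returns a pointer to the next
** undecoded byte. Over-long encodings, surrogates and 0xFFFE/0xFFFF are
** replaced by U+FFFD, mirroring the checks in READ_UTF8. */
static const unsigned char *sketchReadUtf8(
  const unsigned char *z,
  const unsigned char *zTerm,
  unsigned int *pc
){
  unsigned int c = *z++;
  if( c>=0xc0 ){
    /* Keep only the payload bits of the lead byte */
    if( c<0xe0 )      c &= 0x1f;
    else if( c<0xf0 ) c &= 0x0f;
    else if( c<0xf8 ) c &= 0x07;
    else if( c<0xfc ) c &= 0x03;
    else if( c<0xfe ) c &= 0x01;
    else              c = 0;
    /* Append six bits from each continuation byte */
    while( z!=zTerm && (*z & 0xc0)==0x80 ){
      c = (c<<6) + (0x3fu & *z++);
    }
    if( c<0x80
     || (c&0xFFFFF800)==0xD800
     || (c&0xFFFFFFFE)==0xFFFE ){
      c = 0xFFFD;
    }
  }
  *pc = c;
  return z;
}
```

The bytes C3 A9 decode to 0xE9, E4 B8 AD decode to 0x4E2D, and the encoded surrogate ED A0 80 is mapped to 0xFFFD.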
| 24004 |
| 24005 typedef struct Unicode61Tokenizer Unicode61Tokenizer; |
| 24006 struct Unicode61Tokenizer { |
| 24007 unsigned char aTokenChar[128]; /* ASCII range token characters */ |
| 24008 char *aFold; /* Buffer to fold text into */ |
| 24009 int nFold; /* Size of aFold[] in bytes */ |
| 24010 int bRemoveDiacritic; /* True if remove_diacritics=1 is set */ |
| 24011 int nException; |
| 24012 int *aiException; |
| 24013 }; |
| 24014 |
| 24015 static int fts5UnicodeAddExceptions( |
| 24016 Unicode61Tokenizer *p, /* Tokenizer object */ |
| 24017 const char *z, /* Characters to treat as exceptions */ |
| 24018 int bTokenChars /* 1 for 'tokenchars', 0 for 'separators' */ |
| 24019 ){ |
| 24020 int rc = SQLITE_OK; |
| 24021 int n = (int)strlen(z); |
| 24022 int *aNew; |
| 24023 |
| 24024 if( n>0 ){ |
| 24025 aNew = (int*)sqlite3_realloc(p->aiException, (n+p->nException)*sizeof(int)); |
| 24026 if( aNew ){ |
| 24027 int nNew = p->nException; |
| 24028 const unsigned char *zCsr = (const unsigned char*)z; |
| 24029 const unsigned char *zTerm = (const unsigned char*)&z[n]; |
| 24030 while( zCsr<zTerm ){ |
| 24031 int iCode; |
| 24032 int bToken; |
| 24033 READ_UTF8(zCsr, zTerm, iCode); |
| 24034 if( iCode<128 ){ |
| 24035 p->aTokenChar[iCode] = (unsigned char)bTokenChars; |
| 24036 }else{ |
| 24037 bToken = sqlite3Fts5UnicodeIsalnum(iCode); |
| 24038 assert( (bToken==0 || bToken==1) ); |
| 24039 assert( (bTokenChars==0 || bTokenChars==1) ); |
| 24040 if( bToken!=bTokenChars && sqlite3Fts5UnicodeIsdiacritic(iCode)==0 ){ |
| 24041 int i; |
| 24042 for(i=0; i<nNew; i++){ |
| 24043 if( aNew[i]>iCode ) break; |
| 24044 } |
| 24045 memmove(&aNew[i+1], &aNew[i], (nNew-i)*sizeof(int)); |
| 24046 aNew[i] = iCode; |
| 24047 nNew++; |
| 24048 } |
| 24049 } |
| 24050 } |
| 24051 p->aiException = aNew; |
| 24052 p->nException = nNew; |
| 24053 }else{ |
| 24054 rc = SQLITE_NOMEM; |
| 24055 } |
| 24056 } |
| 24057 |
| 24058 return rc; |
| 24059 } |
| 24060 |
| 24061 /* |
| 24062 ** Return true if the p->aiException[] array contains the value iCode. |
| 24063 */ |
| 24064 static int fts5UnicodeIsException(Unicode61Tokenizer *p, int iCode){ |
| 24065 if( p->nException>0 ){ |
| 24066 int *a = p->aiException; |
| 24067 int iLo = 0; |
| 24068 int iHi = p->nException-1; |
| 24069 |
| 24070 while( iHi>=iLo ){ |
| 24071 int iTest = (iHi + iLo) / 2; |
| 24072 if( iCode==a[iTest] ){ |
| 24073 return 1; |
| 24074 }else if( iCode>a[iTest] ){ |
| 24075 iLo = iTest+1; |
| 24076 }else{ |
| 24077 iHi = iTest-1; |
| 24078 } |
| 24079 } |
| 24080 } |
| 24081 |
| 24082 return 0; |
| 24083 } |
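The exception list works because fts5UnicodeAddExceptions() keeps aiException[] sorted, which is what allows fts5UnicodeIsException() to use a binary search. A minimal standalone sketch of that pairing follows; `sketchInsertSorted` and `sketchContains` are hypothetical names and the caller is assumed to have reserved space for the new element.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Insert iCode into the sorted array a[0..*pn-1], keeping it sorted.
** The caller guarantees that a[] has room for one more element. */
static void sketchInsertSorted(int *a, int *pn, int iCode){
  int i;
  for(i=0; i<*pn; i++){
    if( a[i]>iCode ) break;
  }
  memmove(&a[i+1], &a[i], (size_t)(*pn-i)*sizeof(int));
  a[i] = iCode;
  (*pn)++;
}

/* Return 1 if the sorted array a[0..n-1] contains iCode, else 0. */
static int sketchContains(const int *a, int n, int iCode){
  int iLo = 0;
  int iHi = n-1;
  while( iHi>=iLo ){
    int iTest = iLo + (iHi-iLo)/2;
    if( iCode==a[iTest] ){
      return 1;
    }else if( iCode>a[iTest] ){
      iLo = iTest+1;
    }else{
      iHi = iTest-1;
    }
  }
  return 0;
}
```

Inserting 0x3B1, 0xE9 and 0x4E2D in that order leaves the array as {0xE9, 0x3B1, 0x4E2D}, after which lookups for any of the three succeed and a lookup for 0x3B2 fails.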
| 24084 |
| 24085 /* |
| 24086 ** Delete a "unicode61" tokenizer. |
| 24087 */ |
| 24088 static void fts5UnicodeDelete(Fts5Tokenizer *pTok){ |
| 24089 if( pTok ){ |
| 24090 Unicode61Tokenizer *p = (Unicode61Tokenizer*)pTok; |
| 24091 sqlite3_free(p->aiException); |
| 24092 sqlite3_free(p->aFold); |
| 24093 sqlite3_free(p); |
| 24094 } |
| 24095 return; |
| 24096 } |
| 24097 |
| 24098 /* |
| 24099 ** Create a "unicode61" tokenizer. |
| 24100 */ |
| 24101 static int fts5UnicodeCreate( |
| 24102 void *pCtx, |
| 24103 const char **azArg, int nArg, |
| 24104 Fts5Tokenizer **ppOut |
| 24105 ){ |
| 24106 int rc = SQLITE_OK; /* Return code */ |
| 24107 Unicode61Tokenizer *p = 0; /* New tokenizer object */ |
| 24108 |
| 24109 if( nArg%2 ){ |
| 24110 rc = SQLITE_ERROR; |
| 24111 }else{ |
| 24112 p = (Unicode61Tokenizer*)sqlite3_malloc(sizeof(Unicode61Tokenizer)); |
| 24113 if( p ){ |
| 24114 int i; |
| 24115 memset(p, 0, sizeof(Unicode61Tokenizer)); |
| 24116 memcpy(p->aTokenChar, aAsciiTokenChar, sizeof(aAsciiTokenChar)); |
| 24117 p->bRemoveDiacritic = 1; |
| 24118 p->nFold = 64; |
| 24119 p->aFold = sqlite3_malloc(p->nFold * sizeof(char)); |
| 24120 if( p->aFold==0 ){ |
| 24121 rc = SQLITE_NOMEM; |
| 24122 } |
| 24123 for(i=0; rc==SQLITE_OK && i<nArg; i+=2){ |
| 24124 const char *zArg = azArg[i+1]; |
| 24125 if( 0==sqlite3_stricmp(azArg[i], "remove_diacritics") ){ |
| 24126 if( (zArg[0]!='0' && zArg[0]!='1') || zArg[1] ){ |
| 24127 rc = SQLITE_ERROR; |
| 24128 } |
| 24129 p->bRemoveDiacritic = (zArg[0]=='1'); |
| 24130 }else |
| 24131 if( 0==sqlite3_stricmp(azArg[i], "tokenchars") ){ |
| 24132 rc = fts5UnicodeAddExceptions(p, zArg, 1); |
| 24133 }else |
| 24134 if( 0==sqlite3_stricmp(azArg[i], "separators") ){ |
| 24135 rc = fts5UnicodeAddExceptions(p, zArg, 0); |
| 24136 }else{ |
| 24137 rc = SQLITE_ERROR; |
| 24138 } |
| 24139 } |
| 24140 }else{ |
| 24141 rc = SQLITE_NOMEM; |
| 24142 } |
| 24143 if( rc!=SQLITE_OK ){ |
| 24144 fts5UnicodeDelete((Fts5Tokenizer*)p); |
| 24145 p = 0; |
| 24146 } |
| 24147 *ppOut = (Fts5Tokenizer*)p; |
| 24148 } |
| 24149 return rc; |
| 24150 } |
| 24151 |
| 24152 /* |
| 24153 ** Return true if, for the purposes of tokenizing with the tokenizer |
| 24154 ** passed as the first argument, codepoint iCode is considered a token |
| 24155 ** character (not a separator). |
| 24156 */ |
| 24157 static int fts5UnicodeIsAlnum(Unicode61Tokenizer *p, int iCode){ |
| 24158 assert( (sqlite3Fts5UnicodeIsalnum(iCode) & 0xFFFFFFFE)==0 ); |
| 24159 return sqlite3Fts5UnicodeIsalnum(iCode) ^ fts5UnicodeIsException(p, iCode); |
| 24160 } |
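A single exception array can serve both the tokenchars and separators options because fts5UnicodeAddExceptions records a codepoint only when its default class differs from the requested one. Membership in the array therefore means "flip the default", which is what the XOR expresses. The sketch below illustrates this with a hypothetical function `effective_is_token`; it is not part of the source.

```c
#include <assert.h>

/* Hypothetical stand-in for the XOR in fts5UnicodeIsAlnum.
**   base          1 if the codepoint is a letter or number by default
**   is_exception  1 if the codepoint appears in the exception array
** Returns 1 if the codepoint should be treated as a token character. */
static int effective_is_token(int base, int is_exception){
  assert( base==0 || base==1 );
  assert( is_exception==0 || is_exception==1 );
  return base ^ is_exception;
}
```

A separator listed in tokenchars (base 0, exception 1) becomes a token character, and a letter listed in separators (base 1, exception 1) becomes a separator. Codepoints not in the array keep their default class.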
| 24161 |
| 24162 static int fts5UnicodeTokenize( |
| 24163 Fts5Tokenizer *pTokenizer, |
| 24164 void *pCtx, |
| 24165 int flags, |
| 24166 const char *pText, int nText, |
| 24167 int (*xToken)(void*, int, const char*, int nToken, int iStart, int iEnd) |
| 24168 ){ |
| 24169 Unicode61Tokenizer *p = (Unicode61Tokenizer*)pTokenizer; |
| 24170 int rc = SQLITE_OK; |
| 24171 unsigned char *a = p->aTokenChar; |
| 24172 |
| 24173 unsigned char *zTerm = (unsigned char*)&pText[nText]; |
| 24174 unsigned char *zCsr = (unsigned char *)pText; |
| 24175 |
| 24176 /* Output buffer */ |
| 24177 char *aFold = p->aFold; |
| 24178 int nFold = p->nFold; |
| 24179 const char *pEnd = &aFold[nFold-6]; |
| 24180 |
| 24181 /* Each iteration of this loop gobbles up a contiguous run of separators, |
| 24182 ** then the next token. */ |
| 24183 while( rc==SQLITE_OK ){ |
| 24184 int iCode; /* non-ASCII codepoint read from input */ |
| 24185 char *zOut = aFold; |
| 24186 int is; |
| 24187 int ie; |
| 24188 |
| 24189 /* Skip any separator characters. */ |
| 24190 while( 1 ){ |
| 24191 if( zCsr>=zTerm ) goto tokenize_done; |
| 24192 if( *zCsr & 0x80 ) { |
| 24193 /* A character outside of the ascii range. Skip past it if it is |
| 24194 ** a separator character. Or break out of the loop if it is not. */ |
| 24195 is = zCsr - (unsigned char*)pText; |
| 24196 READ_UTF8(zCsr, zTerm, iCode); |
| 24197 if( fts5UnicodeIsAlnum(p, iCode) ){ |
| 24198 goto non_ascii_tokenchar; |
| 24199 } |
| 24200 }else{ |
| 24201 if( a[*zCsr] ){ |
| 24202 is = zCsr - (unsigned char*)pText; |
| 24203 goto ascii_tokenchar; |
| 24204 } |
| 24205 zCsr++; |
| 24206 } |
| 24207 } |
| 24208 |
| 24209 /* Run through the tokenchars. Fold them into the output buffer along |
| 24210 ** the way. */ |
| 24211 while( zCsr<zTerm ){ |
| 24212 |
| 24213 /* Grow the output buffer so that there is sufficient space to fit the |
| 24214 ** largest possible utf-8 character. */ |
| 24215 if( zOut>pEnd ){ |
| 24216 aFold = sqlite3_malloc(nFold*2); |
| 24217 if( aFold==0 ){ |
| 24218 rc = SQLITE_NOMEM; |
| 24219 goto tokenize_done; |
| 24220 } |
| 24221 zOut = &aFold[zOut - p->aFold]; |
| 24222 memcpy(aFold, p->aFold, nFold); |
| 24223 sqlite3_free(p->aFold); |
| 24224 p->aFold = aFold; |
| 24225 p->nFold = nFold = nFold*2; |
| 24226 pEnd = &aFold[nFold-6]; |
| 24227 } |
| 24228 |
| 24229 if( *zCsr & 0x80 ){ |
| 24230 /* A non-ascii-range character. Fold it into the output buffer if |
| 24231 ** it is a token character, or break out of the loop if it is not. */ |
| 24232 READ_UTF8(zCsr, zTerm, iCode); |
| 24233 if( fts5UnicodeIsAlnum(p,iCode)||sqlite3Fts5UnicodeIsdiacritic(iCode) ){ |
| 24234 non_ascii_tokenchar: |
| 24235 iCode = sqlite3Fts5UnicodeFold(iCode, p->bRemoveDiacritic); |
| 24236 if( iCode ) WRITE_UTF8(zOut, iCode); |
| 24237 }else{ |
| 24238 break; |
| 24239 } |
| 24240 }else if( a[*zCsr]==0 ){ |
| 24241 /* An ascii-range separator character. End of token. */ |
| 24242 break; |
| 24243 }else{ |
| 24244 ascii_tokenchar: |
| 24245 if( *zCsr>='A' && *zCsr<='Z' ){ |
| 24246 *zOut++ = *zCsr + 32; |
| 24247 }else{ |
| 24248 *zOut++ = *zCsr; |
| 24249 } |
| 24250 zCsr++; |
| 24251 } |
| 24252 ie = zCsr - (unsigned char*)pText; |
| 24253 } |
| 24254 |
| 24255 /* Invoke the token callback */ |
| 24256 rc = xToken(pCtx, 0, aFold, zOut-aFold, is, ie); |
| 24257 } |
| 24258 |
| 24259 tokenize_done: |
| 24260 if( rc==SQLITE_DONE ) rc = SQLITE_OK; |
| 24261 return rc; |
| 24262 } |
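The tokenizer above alternates between skipping separators and folding token characters into a buffer, then reports each token with its byte offsets through xToken. The following is a much-reduced, ASCII-only sketch of that loop. The names `token_cb`, `ascii_tokenize`, `collector` and `collect` are hypothetical, the callback omits the flags argument, and the fixed 64-byte buffer truncates long tokens where the original grows its buffer instead.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical callback type modelled on xToken, without the flags argument.
** A non-zero return value stops tokenization. */
typedef int (*token_cb)(void *ctx, const char *tok, int nTok, int iStart, int iEnd);

static int is_ascii_alnum(unsigned char c){
  return (c>='0' && c<='9') || (c>='a' && c<='z') || (c>='A' && c<='Z');
}

/* Split text[0..n-1] into runs of ASCII letters and digits, fold each run to
** lower case and pass it to cb along with its start and end byte offsets. */
static int ascii_tokenize(const char *text, int n, void *ctx, token_cb cb){
  char buf[64];
  int i = 0;
  int rc = 0;
  assert( n>=0 );
  while( rc==0 && i<n ){
    int start;
    int len = 0;
    /* Skip any separator characters. */
    while( i<n && !is_ascii_alnum((unsigned char)text[i]) ) i++;
    if( i>=n ) break;
    start = i;
    /* Fold the token characters into the output buffer. */
    while( i<n && is_ascii_alnum((unsigned char)text[i]) ){
      char c = text[i++];
      if( c>='A' && c<='Z' ) c = (char)(c + 32);
      if( len<(int)sizeof(buf) ) buf[len++] = c;   /* truncate if too long */
    }
    rc = cb(ctx, buf, len, start, i);
  }
  return rc;
}

/* Example callback: appends each token, followed by '|', to a string. */
typedef struct collector { char out[128]; int n; } collector;

static int collect(void *ctx, const char *tok, int nTok, int iStart, int iEnd){
  collector *c = (collector*)ctx;
  (void)iStart;
  (void)iEnd;
  if( c->n + nTok + 1 >= (int)sizeof(c->out) ) return 1;   /* out of space */
  memcpy(&c->out[c->n], tok, (size_t)nTok);
  c->n += nTok;
  c->out[c->n++] = '|';
  c->out[c->n] = '\0';
  return 0;
}
```

With a zeroed `collector`, tokenizing the 16-byte input "Hello, World 42!" leaves "hello|world|42|" in `out`.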
| 24263 |
| 24264 /************************************************************************** |
| 24265 ** Start of porter stemmer implementation. |
| 24266 */ |
| 24267 |
| 24268 /* Any tokens larger than this (in bytes) are passed through without |
| 24269 ** stemming. */ |
| 24270 #define FTS5_PORTER_MAX_TOKEN 64 |
| 24271 |
| 24272 typedef struct PorterTokenizer PorterTokenizer; |
| 24273 struct PorterTokenizer { |
| 24274 fts5_tokenizer tokenizer; /* Parent tokenizer module */ |
| 24275 Fts5Tokenizer *pTokenizer; /* Parent tokenizer instance */ |
| 24276 char aBuf[FTS5_PORTER_MAX_TOKEN + 64]; |
| 24277 }; |
| 24278 |
| 24279 /* |
| 24280 ** Delete a "porter" tokenizer. |
| 24281 */ |
| 24282 static void fts5PorterDelete(Fts5Tokenizer *pTok){ |
| 24283 if( pTok ){ |
| 24284 PorterTokenizer *p = (PorterTokenizer*)pTok; |
| 24285 if( p->pTokenizer ){ |
| 24286 p->tokenizer.xDelete(p->pTokenizer); |
| 24287 } |
| 24288 sqlite3_free(p); |
| 24289 } |
| 24290 } |
| 24291 |
| 24292 /* |
| 24293 ** Create a "porter" tokenizer. |
| 24294 */ |
| 24295 static int fts5PorterCreate( |
| 24296 void *pCtx, |
| 24297 const char **azArg, int nArg, |
| 24298 Fts5Tokenizer **ppOut |
| 24299 ){ |
| 24300 fts5_api *pApi = (fts5_api*)pCtx; |
| 24301 int rc = SQLITE_OK; |
| 24302 PorterTokenizer *pRet; |
| 24303 void *pUserdata = 0; |
| 24304 const char *zBase = "unicode61"; |
| 24305 |
| 24306 if( nArg>0 ){ |
| 24307 zBase = azArg[0]; |
| 24308 } |
| 24309 |
| 24310 pRet = (PorterTokenizer*)sqlite3_malloc(sizeof(PorterTokenizer)); |
| 24311 if( pRet ){ |
| 24312 memset(pRet, 0, sizeof(PorterTokenizer)); |
| 24313 rc = pApi->xFindTokenizer(pApi, zBase, &pUserdata, &pRet->tokenizer); |
| 24314 }else{ |
| 24315 rc = SQLITE_NOMEM; |
| 24316 } |
| 24317 if( rc==SQLITE_OK ){ |
| 24318 int nArg2 = (nArg>0 ? nArg-1 : 0); |
| 24319 const char **azArg2 = (nArg2 ? &azArg[1] : 0); |
| 24320 rc = pRet->tokenizer.xCreate(pUserdata, azArg2, nArg2, &pRet->pTokenizer); |
| 24321 } |
| 24322 |
| 24323 if( rc!=SQLITE_OK ){ |
| 24324 fts5PorterDelete((Fts5Tokenizer*)pRet); |
| 24325 pRet = 0; |
| 24326 } |
| 24327 *ppOut = (Fts5Tokenizer*)pRet; |
| 24328 return rc; |
| 24329 } |
| 24330 |
| 24331 typedef struct PorterContext PorterContext; |
| 24332 struct PorterContext { |
| 24333 void *pCtx; |
| 24334 int (*xToken)(void*, int, const char*, int, int, int); |
| 24335 char *aBuf; |
| 24336 }; |
| 24337 |
| 24338 typedef struct PorterRule PorterRule; |
| 24339 struct PorterRule { |
| 24340 const char *zSuffix; |
| 24341 int nSuffix; |
| 24342 int (*xCond)(char *zStem, int nStem); |
| 24343 const char *zOutput; |
| 24344 int nOutput; |
| 24345 }; |
| 24346 |
| 24347 #if 0 |
| 24348 static int fts5PorterApply(char *aBuf, int *pnBuf, PorterRule *aRule){ |
| 24349 int ret = -1; |
| 24350 int nBuf = *pnBuf; |
| 24351 PorterRule *p; |
| 24352 |
| 24353 for(p=aRule; p->zSuffix; p++){ |
| 24354 assert( strlen(p->zSuffix)==p->nSuffix ); |
| 24355 assert( strlen(p->zOutput)==p->nOutput ); |
| 24356 if( nBuf<p->nSuffix ) continue; |
| 24357 if( 0==memcmp(&aBuf[nBuf - p->nSuffix], p->zSuffix, p->nSuffix) ) break; |
| 24358 } |
| 24359 |
| 24360 if( p->zSuffix ){ |
| 24361 int nStem = nBuf - p->nSuffix; |
| 24362 if( p->xCond==0 || p->xCond(aBuf, nStem) ){ |
| 24363 memcpy(&aBuf[nStem], p->zOutput, p->nOutput); |
| 24364 *pnBuf = nStem + p->nOutput; |
| 24365 ret = p - aRule; |
| 24366 } |
| 24367 } |
| 24368 |
| 24369 return ret; |
| 24370 } |
| 24371 #endif |
| 24372 |
| 24373 static int fts5PorterIsVowel(char c, int bYIsVowel){ |
| 24374 return ( |
| 24375 c=='a' || c=='e' || c=='i' || c=='o' || c=='u' || (bYIsVowel && c=='y') |
| 24376 ); |
| 24377 } |
| 24378 |
| 24379 static int fts5PorterGobbleVC(char *zStem, int nStem, int bPrevCons){ |
| 24380 int i; |
| 24381 int bCons = bPrevCons; |
| 24382 |
| 24383 /* Scan for a vowel */ |
| 24384 for(i=0; i<nStem; i++){ |
| 24385 if( 0==(bCons = !fts5PorterIsVowel(zStem[i], bCons)) ) break; |
| 24386 } |
| 24387 |
| 24388 /* Scan for a consonant */ |
| 24389 for(i++; i<nStem; i++){ |
| 24390 if( (bCons = !fts5PorterIsVowel(zStem[i], bCons)) ) return i+1; |
| 24391 } |
| 24392 return 0; |
| 24393 } |
| 24394 |
| 24395 /* porter rule condition: (m > 0) */ |
| 24396 static int fts5Porter_MGt0(char *zStem, int nStem){ |
| 24397 return !!fts5PorterGobbleVC(zStem, nStem, 0); |
| 24398 } |
| 24399 |
| 24400 /* porter rule condition: (m > 1) */ |
| 24401 static int fts5Porter_MGt1(char *zStem, int nStem){ |
| 24402 int n; |
| 24403 n = fts5PorterGobbleVC(zStem, nStem, 0); |
| 24404 if( n && fts5PorterGobbleVC(&zStem[n], nStem-n, 1) ){ |
| 24405 return 1; |
| 24406 } |
| 24407 return 0; |
| 24408 } |
| 24409 |
| 24410 /* porter rule condition: (m = 1) */ |
| 24411 static int fts5Porter_MEq1(char *zStem, int nStem){ |
| 24412 int n; |
| 24413 n = fts5PorterGobbleVC(zStem, nStem, 0); |
| 24414 if( n && 0==fts5PorterGobbleVC(&zStem[n], nStem-n, 1) ){ |
| 24415 return 1; |
| 24416 } |
| 24417 return 0; |
| 24418 } |
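The three conditions above test the Porter "measure" m, the number of vowel-consonant sequences in a stem written as [C](VC)^m[V]. As an illustration, the sketch below computes m directly rather than probing it with repeated scans. The names `is_vowel` and `porter_measure` are hypothetical. The sketch assumes the same convention as fts5PorterIsVowel: 'y' counts as a vowel only when the preceding letter is a consonant.

```c
#include <assert.h>

/* Hypothetical helper: return 1 if c is a vowel. 'y' is treated as a vowel
** only when the previous letter was a consonant. */
static int is_vowel(char c, int prev_is_cons){
  return c=='a' || c=='e' || c=='i' || c=='o' || c=='u'
      || (prev_is_cons && c=='y');
}

/* Hypothetical helper: count the vowel-consonant sequences in z[0..n-1]. */
static int porter_measure(const char *z, int n){
  int m = 0;
  int prev_cons = 0;     /* 0 at the start, so a leading 'y' is a consonant */
  int prev_vowel = 0;
  int i;
  assert( n>=0 );
  for(i=0; i<n; i++){
    int cons = !is_vowel(z[i], prev_cons);
    if( cons && prev_vowel ) m++;   /* a vowel run just ended in a consonant */
    prev_vowel = !cons;
    prev_cons = cons;
  }
  return m;
}
```

Under this definition "tree" has m=0, "trouble" has m=1 and "private" has m=2, so (m > 0) corresponds to `porter_measure(z, n) > 0`, and similarly for (m > 1) and (m = 1).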
| 24419 |
| 24420 /* porter rule condition: (*o) */ |
| 24421 static int fts5Porter_Ostar(char *zStem, int nStem){ |
| 24422 if( zStem[nStem-1]=='w' || zStem[nStem-1]=='x' || zStem[nStem-1]=='y' ){ |
| 24423 return 0; |
| 24424 }else{ |
| 24425 int i; |
| 24426 int mask = 0; |
| 24427 int bCons = 0; |
| 24428 for(i=0; i<nStem; i++){ |
| 24429 bCons = !fts5PorterIsVowel(zStem[i], bCons); |
| 24430 assert( bCons==0 || bCons==1 ); |
| 24431 mask = (mask << 1) + bCons; |
| 24432 } |
| 24433 return ((mask & 0x0007)==0x0005); |
| 24434 } |
| 24435 } |
| 24436 |
| 24437 /* porter rule condition: (m > 1 and (*S or *T)) */ |
| 24438 static int fts5Porter_MGt1_and_S_or_T(char *zStem, int nStem){ |
| 24439 assert( nStem>0 ); |
| 24440 return (zStem[nStem-1]=='s' || zStem[nStem-1]=='t') |
| 24441 && fts5Porter_MGt1(zStem, nStem); |
| 24442 } |
| 24443 |
| 24444 /* porter rule condition: (*v*) */ |
| 24445 static int fts5Porter_Vowel(char *zStem, int nStem){ |
| 24446 int i; |
| 24447 for(i=0; i<nStem; i++){ |
| 24448 if( fts5PorterIsVowel(zStem[i], i>0) ){ |
| 24449 return 1; |
| 24450 } |
| 24451 } |
| 24452 return 0; |
| 24453 } |
| 24454 |
| 24455 |
| 24456 /************************************************************************** |
| 24457 *************************************************************************** |
| 24458 ** GENERATED CODE STARTS HERE (mkportersteps.tcl) |
| 24459 */ |
| 24460 |
| 24461 static int fts5PorterStep4(char *aBuf, int *pnBuf){ |
| 24462 int ret = 0; |
| 24463 int nBuf = *pnBuf; |
| 24464 switch( aBuf[nBuf-2] ){ |
| 24465 |
| 24466 case 'a': |
| 24467 if( nBuf>2 && 0==memcmp("al", &aBuf[nBuf-2], 2) ){ |
| 24468 if( fts5Porter_MGt1(aBuf, nBuf-2) ){ |
| 24469 *pnBuf = nBuf - 2; |
| 24470 } |
| 24471 } |
| 24472 break; |
| 24473 |
| 24474 case 'c': |
| 24475 if( nBuf>4 && 0==memcmp("ance", &aBuf[nBuf-4], 4) ){ |
| 24476 if( fts5Porter_MGt1(aBuf, nBuf-4) ){ |
| 24477 *pnBuf = nBuf - 4; |
| 24478 } |
| 24479 }else if( nBuf>4 && 0==memcmp("ence", &aBuf[nBuf-4], 4) ){ |
| 24480 if( fts5Porter_MGt1(aBuf, nBuf-4) ){ |
| 24481 *pnBuf = nBuf - 4; |
| 24482 } |
| 24483 } |
| 24484 break; |
| 24485 |
| 24486 case 'e': |
| 24487 if( nBuf>2 && 0==memcmp("er", &aBuf[nBuf-2], 2) ){ |
| 24488 if( fts5Porter_MGt1(aBuf, nBuf-2) ){ |
| 24489 *pnBuf = nBuf - 2; |
| 24490 } |
| 24491 } |
| 24492 break; |
| 24493 |
| 24494 case 'i': |
| 24495 if( nBuf>2 && 0==memcmp("ic", &aBuf[nBuf-2], 2) ){ |
| 24496 if( fts5Porter_MGt1(aBuf, nBuf-2) ){ |
| 24497 *pnBuf = nBuf - 2; |
| 24498 } |
| 24499 } |
| 24500 break; |
| 24501 |
| 24502 case 'l': |
| 24503 if( nBuf>4 && 0==memcmp("able", &aBuf[nBuf-4], 4) ){ |
| 24504 if( fts5Porter_MGt1(aBuf, nBuf-4) ){ |
| 24505 *pnBuf = nBuf - 4; |
| 24506 } |
| 24507 }else if( nBuf>4 && 0==memcmp("ible", &aBuf[nBuf-4], 4) ){ |
| 24508 if( fts5Porter_MGt1(aBuf, nBuf-4) ){ |
| 24509 *pnBuf = nBuf - 4; |
| 24510 } |
| 24511 } |
| 24512 break; |
| 24513 |
| 24514 case 'n': |
| 24515 if( nBuf>3 && 0==memcmp("ant", &aBuf[nBuf-3], 3) ){ |
| 24516 if( fts5Porter_MGt1(aBuf, nBuf-3) ){ |
| 24517 *pnBuf = nBuf - 3; |
| 24518 } |
| 24519 }else if( nBuf>5 && 0==memcmp("ement", &aBuf[nBuf-5], 5) ){ |
| 24520 if( fts5Porter_MGt1(aBuf, nBuf-5) ){ |
| 24521 *pnBuf = nBuf - 5; |
| 24522 } |
| 24523 }else if( nBuf>4 && 0==memcmp("ment", &aBuf[nBuf-4], 4) ){ |
| 24524 if( fts5Porter_MGt1(aBuf, nBuf-4) ){ |
| 24525 *pnBuf = nBuf - 4; |
| 24526 } |
| 24527 }else if( nBuf>3 && 0==memcmp("ent", &aBuf[nBuf-3], 3) ){ |
| 24528 if( fts5Porter_MGt1(aBuf, nBuf-3) ){ |
| 24529 *pnBuf = nBuf - 3; |
| 24530 } |
| 24531 } |
| 24532 break; |
| 24533 |
| 24534 case 'o': |
| 24535 if( nBuf>3 && 0==memcmp("ion", &aBuf[nBuf-3], 3) ){ |
| 24536 if( fts5Porter_MGt1_and_S_or_T(aBuf, nBuf-3) ){ |
| 24537 *pnBuf = nBuf - 3; |
| 24538 } |
| 24539 }else if( nBuf>2 && 0==memcmp("ou", &aBuf[nBuf-2], 2) ){ |
| 24540 if( fts5Porter_MGt1(aBuf, nBuf-2) ){ |
| 24541 *pnBuf = nBuf - 2; |
| 24542 } |
| 24543 } |
| 24544 break; |
| 24545 |
| 24546 case 's': |
| 24547 if( nBuf>3 && 0==memcmp("ism", &aBuf[nBuf-3], 3) ){ |
| 24548 if( fts5Porter_MGt1(aBuf, nBuf-3) ){ |
| 24549 *pnBuf = nBuf - 3; |
| 24550 } |
| 24551 } |
| 24552 break; |
| 24553 |
| 24554 case 't': |
| 24555 if( nBuf>3 && 0==memcmp("ate", &aBuf[nBuf-3], 3) ){ |
| 24556 if( fts5Porter_MGt1(aBuf, nBuf-3) ){ |
| 24557 *pnBuf = nBuf - 3; |
| 24558 } |
| 24559 }else if( nBuf>3 && 0==memcmp("iti", &aBuf[nBuf-3], 3) ){ |
| 24560 if( fts5Porter_MGt1(aBuf, nBuf-3) ){ |
| 24561 *pnBuf = nBuf - 3; |
| 24562 } |
| 24563 } |
| 24564 break; |
| 24565 |
| 24566 case 'u': |
| 24567 if( nBuf>3 && 0==memcmp("ous", &aBuf[nBuf-3], 3) ){ |
| 24568 if( fts5Porter_MGt1(aBuf, nBuf-3) ){ |
| 24569 *pnBuf = nBuf - 3; |
| 24570 } |
| 24571 } |
| 24572 break; |
| 24573 |
| 24574 case 'v': |
| 24575 if( nBuf>3 && 0==memcmp("ive", &aBuf[nBuf-3], 3) ){ |
| 24576 if( fts5Porter_MGt1(aBuf, nBuf-3) ){ |
| 24577 *pnBuf = nBuf - 3; |
| 24578 } |
| 24579 } |
| 24580 break; |
| 24581 |
| 24582 case 'z': |
| 24583 if( nBuf>3 && 0==memcmp("ize", &aBuf[nBuf-3], 3) ){ |
| 24584 if( fts5Porter_MGt1(aBuf, nBuf-3) ){ |
| 24585 *pnBuf = nBuf - 3; |
| 24586 } |
| 24587 } |
| 24588 break; |
| 24589 |
| 24590 } |
| 24591 return ret; |
| 24592 } |
| 24593 |
| 24594 |
| 24595 static int fts5PorterStep1B2(char *aBuf, int *pnBuf){ |
| 24596 int ret = 0; |
| 24597 int nBuf = *pnBuf; |
| 24598 switch( aBuf[nBuf-2] ){ |
| 24599 |
| 24600 case 'a': |
| 24601 if( nBuf>2 && 0==memcmp("at", &aBuf[nBuf-2], 2) ){ |
| 24602 memcpy(&aBuf[nBuf-2], "ate", 3); |
| 24603 *pnBuf = nBuf - 2 + 3; |
| 24604 ret = 1; |
| 24605 } |
| 24606 break; |
| 24607 |
| 24608 case 'b': |
| 24609 if( nBuf>2 && 0==memcmp("bl", &aBuf[nBuf-2], 2) ){ |
| 24610 memcpy(&aBuf[nBuf-2], "ble", 3); |
| 24611 *pnBuf = nBuf - 2 + 3; |
| 24612 ret = 1; |
| 24613 } |
| 24614 break; |
| 24615 |
| 24616 case 'i': |
| 24617 if( nBuf>2 && 0==memcmp("iz", &aBuf[nBuf-2], 2) ){ |
| 24618 memcpy(&aBuf[nBuf-2], "ize", 3); |
| 24619 *pnBuf = nBuf - 2 + 3; |
| 24620 ret = 1; |
| 24621 } |
| 24622 break; |
| 24623 |
| 24624 } |
| 24625 return ret; |
| 24626 } |
| 24627 |
| 24628 |
| 24629 static int fts5PorterStep2(char *aBuf, int *pnBuf){ |
| 24630 int ret = 0; |
| 24631 int nBuf = *pnBuf; |
| 24632 switch( aBuf[nBuf-2] ){ |
| 24633 |
| 24634 case 'a': |
| 24635 if( nBuf>7 && 0==memcmp("ational", &aBuf[nBuf-7], 7) ){ |
| 24636 if( fts5Porter_MGt0(aBuf, nBuf-7) ){ |
| 24637 memcpy(&aBuf[nBuf-7], "ate", 3); |
| 24638 *pnBuf = nBuf - 7 + 3; |
| 24639 } |
| 24640 }else if( nBuf>6 && 0==memcmp("tional", &aBuf[nBuf-6], 6) ){ |
| 24641 if( fts5Porter_MGt0(aBuf, nBuf-6) ){ |
| 24642 memcpy(&aBuf[nBuf-6], "tion", 4); |
| 24643 *pnBuf = nBuf - 6 + 4; |
| 24644 } |
| 24645 } |
| 24646 break; |
| 24647 |
| 24648 case 'c': |
| 24649 if( nBuf>4 && 0==memcmp("enci", &aBuf[nBuf-4], 4) ){ |
| 24650 if( fts5Porter_MGt0(aBuf, nBuf-4) ){ |
| 24651 memcpy(&aBuf[nBuf-4], "ence", 4); |
| 24652 *pnBuf = nBuf - 4 + 4; |
| 24653 } |
| 24654 }else if( nBuf>4 && 0==memcmp("anci", &aBuf[nBuf-4], 4) ){ |
| 24655 if( fts5Porter_MGt0(aBuf, nBuf-4) ){ |
| 24656 memcpy(&aBuf[nBuf-4], "ance", 4); |
| 24657 *pnBuf = nBuf - 4 + 4; |
| 24658 } |
| 24659 } |
| 24660 break; |
| 24661 |
| 24662 case 'e': |
| 24663 if( nBuf>4 && 0==memcmp("izer", &aBuf[nBuf-4], 4) ){ |
| 24664 if( fts5Porter_MGt0(aBuf, nBuf-4) ){ |
| 24665 memcpy(&aBuf[nBuf-4], "ize", 3); |
| 24666 *pnBuf = nBuf - 4 + 3; |
| 24667 } |
| 24668 } |
| 24669 break; |
| 24670 |
| 24671 case 'g': |
| 24672 if( nBuf>4 && 0==memcmp("logi", &aBuf[nBuf-4], 4) ){ |
| 24673 if( fts5Porter_MGt0(aBuf, nBuf-4) ){ |
| 24674 memcpy(&aBuf[nBuf-4], "log", 3); |
| 24675 *pnBuf = nBuf - 4 + 3; |
| 24676 } |
| 24677 } |
| 24678 break; |
| 24679 |
| 24680 case 'l': |
| 24681 if( nBuf>3 && 0==memcmp("bli", &aBuf[nBuf-3], 3) ){ |
| 24682 if( fts5Porter_MGt0(aBuf, nBuf-3) ){ |
| 24683 memcpy(&aBuf[nBuf-3], "ble", 3); |
| 24684 *pnBuf = nBuf - 3 + 3; |
| 24685 } |
| 24686 }else if( nBuf>4 && 0==memcmp("alli", &aBuf[nBuf-4], 4) ){ |
| 24687 if( fts5Porter_MGt0(aBuf, nBuf-4) ){ |
| 24688 memcpy(&aBuf[nBuf-4], "al", 2); |
| 24689 *pnBuf = nBuf - 4 + 2; |
| 24690 } |
| 24691 }else if( nBuf>5 && 0==memcmp("entli", &aBuf[nBuf-5], 5) ){ |
| 24692 if( fts5Porter_MGt0(aBuf, nBuf-5) ){ |
| 24693 memcpy(&aBuf[nBuf-5], "ent", 3); |
| 24694 *pnBuf = nBuf - 5 + 3; |
| 24695 } |
| 24696 }else if( nBuf>3 && 0==memcmp("eli", &aBuf[nBuf-3], 3) ){ |
| 24697 if( fts5Porter_MGt0(aBuf, nBuf-3) ){ |
| 24698 memcpy(&aBuf[nBuf-3], "e", 1); |
| 24699 *pnBuf = nBuf - 3 + 1; |
| 24700 } |
| 24701 }else if( nBuf>5 && 0==memcmp("ousli", &aBuf[nBuf-5], 5) ){ |
| 24702 if( fts5Porter_MGt0(aBuf, nBuf-5) ){ |
| 24703 memcpy(&aBuf[nBuf-5], "ous", 3); |
| 24704 *pnBuf = nBuf - 5 + 3; |
| 24705 } |
| 24706 } |
| 24707 break; |
| 24708 |
| 24709 case 'o': |
| 24710 if( nBuf>7 && 0==memcmp("ization", &aBuf[nBuf-7], 7) ){ |
| 24711 if( fts5Porter_MGt0(aBuf, nBuf-7) ){ |
| 24712 memcpy(&aBuf[nBuf-7], "ize", 3); |
| 24713 *pnBuf = nBuf - 7 + 3; |
| 24714 } |
| 24715 }else if( nBuf>5 && 0==memcmp("ation", &aBuf[nBuf-5], 5) ){ |
| 24716 if( fts5Porter_MGt0(aBuf, nBuf-5) ){ |
| 24717 memcpy(&aBuf[nBuf-5], "ate", 3); |
| 24718 *pnBuf = nBuf - 5 + 3; |
| 24719 } |
| 24720 }else if( nBuf>4 && 0==memcmp("ator", &aBuf[nBuf-4], 4) ){ |
| 24721 if( fts5Porter_MGt0(aBuf, nBuf-4) ){ |
| 24722 memcpy(&aBuf[nBuf-4], "ate", 3); |
| 24723 *pnBuf = nBuf - 4 + 3; |
| 24724 } |
| 24725 } |
| 24726 break; |
| 24727 |
| 24728 case 's': |
| 24729 if( nBuf>5 && 0==memcmp("alism", &aBuf[nBuf-5], 5) ){ |
| 24730 if( fts5Porter_MGt0(aBuf, nBuf-5) ){ |
| 24731 memcpy(&aBuf[nBuf-5], "al", 2); |
| 24732 *pnBuf = nBuf - 5 + 2; |
| 24733 } |
| 24734 }else if( nBuf>7 && 0==memcmp("iveness", &aBuf[nBuf-7], 7) ){ |
| 24735 if( fts5Porter_MGt0(aBuf, nBuf-7) ){ |
| 24736 memcpy(&aBuf[nBuf-7], "ive", 3); |
| 24737 *pnBuf = nBuf - 7 + 3; |
| 24738 } |
| 24739 }else if( nBuf>7 && 0==memcmp("fulness", &aBuf[nBuf-7], 7) ){ |
| 24740 if( fts5Porter_MGt0(aBuf, nBuf-7) ){ |
| 24741 memcpy(&aBuf[nBuf-7], "ful", 3); |
| 24742 *pnBuf = nBuf - 7 + 3; |
| 24743 } |
| 24744 }else if( nBuf>7 && 0==memcmp("ousness", &aBuf[nBuf-7], 7) ){ |
| 24745 if( fts5Porter_MGt0(aBuf, nBuf-7) ){ |
| 24746 memcpy(&aBuf[nBuf-7], "ous", 3); |
| 24747 *pnBuf = nBuf - 7 + 3; |
| 24748 } |
| 24749 } |
| 24750 break; |
| 24751 |
| 24752 case 't': |
| 24753 if( nBuf>5 && 0==memcmp("aliti", &aBuf[nBuf-5], 5) ){ |
| 24754 if( fts5Porter_MGt0(aBuf, nBuf-5) ){ |
| 24755 memcpy(&aBuf[nBuf-5], "al", 2); |
| 24756 *pnBuf = nBuf - 5 + 2; |
| 24757 } |
| 24758 }else if( nBuf>5 && 0==memcmp("iviti", &aBuf[nBuf-5], 5) ){ |
| 24759 if( fts5Porter_MGt0(aBuf, nBuf-5) ){ |
| 24760 memcpy(&aBuf[nBuf-5], "ive", 3); |
| 24761 *pnBuf = nBuf - 5 + 3; |
| 24762 } |
| 24763 }else if( nBuf>6 && 0==memcmp("biliti", &aBuf[nBuf-6], 6) ){ |
| 24764 if( fts5Porter_MGt0(aBuf, nBuf-6) ){ |
| 24765 memcpy(&aBuf[nBuf-6], "ble", 3); |
| 24766 *pnBuf = nBuf - 6 + 3; |
| 24767 } |
| 24768 } |
| 24769 break; |
| 24770 |
| 24771 } |
| 24772 return ret; |
| 24773 } |
| 24774 |
| 24775 |
| 24776 static int fts5PorterStep3(char *aBuf, int *pnBuf){ |
| 24777 int ret = 0; |
| 24778 int nBuf = *pnBuf; |
| 24779 switch( aBuf[nBuf-2] ){ |
| 24780 |
| 24781 case 'a': |
| 24782 if( nBuf>4 && 0==memcmp("ical", &aBuf[nBuf-4], 4) ){ |
| 24783 if( fts5Porter_MGt0(aBuf, nBuf-4) ){ |
| 24784 memcpy(&aBuf[nBuf-4], "ic", 2); |
| 24785 *pnBuf = nBuf - 4 + 2; |
| 24786 } |
| 24787 } |
| 24788 break; |
| 24789 |
| 24790 case 's': |
| 24791 if( nBuf>4 && 0==memcmp("ness", &aBuf[nBuf-4], 4) ){ |
| 24792 if( fts5Porter_MGt0(aBuf, nBuf-4) ){ |
| 24793 *pnBuf = nBuf - 4; |
| 24794 } |
| 24795 } |
| 24796 break; |
| 24797 |
| 24798 case 't': |
| 24799 if( nBuf>5 && 0==memcmp("icate", &aBuf[nBuf-5], 5) ){ |
| 24800 if( fts5Porter_MGt0(aBuf, nBuf-5) ){ |
| 24801 memcpy(&aBuf[nBuf-5], "ic", 2); |
| 24802 *pnBuf = nBuf - 5 + 2; |
| 24803 } |
| 24804 }else if( nBuf>5 && 0==memcmp("iciti", &aBuf[nBuf-5], 5) ){ |
| 24805 if( fts5Porter_MGt0(aBuf, nBuf-5) ){ |
| 24806 memcpy(&aBuf[nBuf-5], "ic", 2); |
| 24807 *pnBuf = nBuf - 5 + 2; |
| 24808 } |
| 24809 } |
| 24810 break; |
| 24811 |
| 24812 case 'u': |
| 24813 if( nBuf>3 && 0==memcmp("ful", &aBuf[nBuf-3], 3) ){ |
| 24814 if( fts5Porter_MGt0(aBuf, nBuf-3) ){ |
| 24815 *pnBuf = nBuf - 3; |
| 24816 } |
| 24817 } |
| 24818 break; |
| 24819 |
| 24820 case 'v': |
| 24821 if( nBuf>5 && 0==memcmp("ative", &aBuf[nBuf-5], 5) ){ |
| 24822 if( fts5Porter_MGt0(aBuf, nBuf-5) ){ |
| 24823 *pnBuf = nBuf - 5; |
| 24824 } |
| 24825 } |
| 24826 break; |
| 24827 |
| 24828 case 'z': |
| 24829 if( nBuf>5 && 0==memcmp("alize", &aBuf[nBuf-5], 5) ){ |
| 24830 if( fts5Porter_MGt0(aBuf, nBuf-5) ){ |
| 24831 memcpy(&aBuf[nBuf-5], "al", 2); |
| 24832 *pnBuf = nBuf - 5 + 2; |
| 24833 } |
| 24834 } |
| 24835 break; |
| 24836 |
| 24837 } |
| 24838 return ret; |
| 24839 } |
| 24840 |
| 24841 |
| 24842 static int fts5PorterStep1B(char *aBuf, int *pnBuf){ |
| 24843 int ret = 0; |
| 24844 int nBuf = *pnBuf; |
| 24845 switch( aBuf[nBuf-2] ){ |
| 24846 |
| 24847 case 'e': |
| 24848 if( nBuf>3 && 0==memcmp("eed", &aBuf[nBuf-3], 3) ){ |
| 24849 if( fts5Porter_MGt0(aBuf, nBuf-3) ){ |
| 24850 memcpy(&aBuf[nBuf-3], "ee", 2); |
| 24851 *pnBuf = nBuf - 3 + 2; |
| 24852 } |
| 24853 }else if( nBuf>2 && 0==memcmp("ed", &aBuf[nBuf-2], 2) ){ |
| 24854 if( fts5Porter_Vowel(aBuf, nBuf-2) ){ |
| 24855 *pnBuf = nBuf - 2; |
| 24856 ret = 1; |
| 24857 } |
| 24858 } |
| 24859 break; |
| 24860 |
| 24861 case 'n': |
| 24862 if( nBuf>3 && 0==memcmp("ing", &aBuf[nBuf-3], 3) ){ |
| 24863 if( fts5Porter_Vowel(aBuf, nBuf-3) ){ |
| 24864 *pnBuf = nBuf - 3; |
| 24865 ret = 1; |
| 24866 } |
| 24867 } |
| 24868 break; |
| 24869 |
| 24870 } |
| 24871 return ret; |
| 24872 } |
| 24873 |
| 24874 /* |
| 24875 ** GENERATED CODE ENDS HERE (mkportersteps.tcl) |
| 24876 *************************************************************************** |
| 24877 **************************************************************************/ |
| 24878 |
| 24879 static void fts5PorterStep1A(char *aBuf, int *pnBuf){ |
| 24880 int nBuf = *pnBuf; |
| 24881 if( aBuf[nBuf-1]=='s' ){ |
| 24882 if( aBuf[nBuf-2]=='e' ){ |
| 24883 if( (nBuf>4 && aBuf[nBuf-4]=='s' && aBuf[nBuf-3]=='s') |
| 24884 || (nBuf>3 && aBuf[nBuf-3]=='i' ) |
| 24885 ){ |
| 24886 *pnBuf = nBuf-2; |
| 24887 }else{ |
| 24888 *pnBuf = nBuf-1; |
| 24889 } |
| 24890 } |
| 24891 else if( aBuf[nBuf-2]!='s' ){ |
| 24892 *pnBuf = nBuf-1; |
| 24893 } |
| 24894 } |
| 24895 } |
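Step 1a of the Porter algorithm applies four suffix rules, longest match first: SSES becomes SS, IES becomes I, SS is left alone, and a lone S is removed. The sketch below states those textbook rules directly. The names `ends_with` and `porter_step1a` are hypothetical, and this is not a line-for-line copy of fts5PorterStep1A, so results for very short tokens may differ from the function above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: return 1 if z[0..n-1] ends with the given suffix. */
static int ends_with(const char *z, int n, const char *suffix){
  int k = (int)strlen(suffix);
  return n>=k && memcmp(&z[n-k], suffix, (size_t)k)==0;
}

/* Hypothetical helper: return the length of z[0..n-1] after applying the
** textbook step 1a rules. Every rule only shortens the word, so the caller
** can simply truncate the buffer to the returned length. */
static int porter_step1a(const char *z, int n){
  assert( n>=0 );
  if( ends_with(z, n, "sses") ) return n-2;   /* caresses -> caress */
  if( ends_with(z, n, "ies") )  return n-2;   /* ponies   -> poni   */
  if( ends_with(z, n, "ss") )   return n;     /* caress   -> caress */
  if( ends_with(z, n, "s") )    return n-1;   /* cats     -> cat    */
  return n;
}
```

Because the rules are tried in order and the first match wins, "caress" stops at the SS rule and never reaches the rule that strips a single S.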
| 24896 |
| 24897 static int fts5PorterCb( |
| 24898 void *pCtx, |
| 24899 int tflags, |
| 24900 const char *pToken, |
| 24901 int nToken, |
| 24902 int iStart, |
| 24903 int iEnd |
| 24904 ){ |
| 24905 PorterContext *p = (PorterContext*)pCtx; |
| 24906 |
| 24907 char *aBuf; |
| 24908 int nBuf; |
| 24909 |
| 24910 if( nToken>FTS5_PORTER_MAX_TOKEN || nToken<3 ) goto pass_through; |
| 24911 aBuf = p->aBuf; |
| 24912 nBuf = nToken; |
| 24913 memcpy(aBuf, pToken, nBuf); |
| 24914 |
| 24915 /* Step 1. */ |
| 24916 fts5PorterStep1A(aBuf, &nBuf); |
| 24917 if( fts5PorterStep1B(aBuf, &nBuf) ){ |
| 24918 if( fts5PorterStep1B2(aBuf, &nBuf)==0 ){ |
| 24919 char c = aBuf[nBuf-1]; |
| 24920 if( fts5PorterIsVowel(c, 0)==0 |
| 24921 && c!='l' && c!='s' && c!='z' && c==aBuf[nBuf-2] |
| 24922 ){ |
| 24923 nBuf--; |
| 24924 }else if( fts5Porter_MEq1(aBuf, nBuf) && fts5Porter_Ostar(aBuf, nBuf) ){ |
| 24925 aBuf[nBuf++] = 'e'; |
| 24926 } |
| 24927 } |
| 24928 } |
| 24929 |
| 24930 /* Step 1C. */ |
| 24931 if( aBuf[nBuf-1]=='y' && fts5Porter_Vowel(aBuf, nBuf-1) ){ |
| 24932 aBuf[nBuf-1] = 'i'; |
| 24933 } |
| 24934 |
| 24935 /* Steps 2 through 4. */ |
| 24936 fts5PorterStep2(aBuf, &nBuf); |
| 24937 fts5PorterStep3(aBuf, &nBuf); |
| 24938 fts5PorterStep4(aBuf, &nBuf); |
| 24939 |
| 24940 /* Step 5a. */ |
| 24941 assert( nBuf>0 ); |
| 24942 if( aBuf[nBuf-1]=='e' ){ |
| 24943 if( fts5Porter_MGt1(aBuf, nBuf-1) |
| 24944 || (fts5Porter_MEq1(aBuf, nBuf-1) && !fts5Porter_Ostar(aBuf, nBuf-1)) |
| 24945 ){ |
| 24946 nBuf--; |
| 24947 } |
| 24948 } |
| 24949 |
| 24950 /* Step 5b. */ |
| 24951 if( nBuf>1 && aBuf[nBuf-1]=='l' |
| 24952 && aBuf[nBuf-2]=='l' && fts5Porter_MGt1(aBuf, nBuf-1) |
| 24953 ){ |
| 24954 nBuf--; |
| 24955 } |
| 24956 |
| 24957 return p->xToken(p->pCtx, tflags, aBuf, nBuf, iStart, iEnd); |
| 24958 |
| 24959 pass_through: |
| 24960 return p->xToken(p->pCtx, tflags, pToken, nToken, iStart, iEnd); |
| 24961 } |
| 24962 |
| 24963 /* |
| 24964 ** Tokenize using the porter tokenizer. |
| 24965 */ |
| 24966 static int fts5PorterTokenize( |
| 24967 Fts5Tokenizer *pTokenizer, |
| 24968 void *pCtx, |
| 24969 int flags, |
| 24970 const char *pText, int nText, |
| 24971 int (*xToken)(void*, int, const char*, int nToken, int iStart, int iEnd) |
| 24972 ){ |
| 24973 PorterTokenizer *p = (PorterTokenizer*)pTokenizer; |
| 24974 PorterContext sCtx; |
| 24975 sCtx.xToken = xToken; |
| 24976 sCtx.pCtx = pCtx; |
| 24977 sCtx.aBuf = p->aBuf; |
| 24978 return p->tokenizer.xTokenize( |
| 24979 p->pTokenizer, (void*)&sCtx, flags, pText, nText, fts5PorterCb |
| 24980 ); |
| 24981 } |
| 24982 |
| 24983 /* |
| 24984 ** Register all built-in tokenizers with FTS5. |
| 24985 */ |
| 24986 static int sqlite3Fts5TokenizerInit(fts5_api *pApi){ |
| 24987 struct BuiltinTokenizer { |
| 24988 const char *zName; |
| 24989 fts5_tokenizer x; |
| 24990 } aBuiltin[] = { |
| 24991 { "unicode61", {fts5UnicodeCreate, fts5UnicodeDelete, fts5UnicodeTokenize}}, |
| 24992 { "ascii", {fts5AsciiCreate, fts5AsciiDelete, fts5AsciiTokenize }}, |
| 24993 { "porter", {fts5PorterCreate, fts5PorterDelete, fts5PorterTokenize }}, |
| 24994 }; |
| 24995 |
| 24996 int rc = SQLITE_OK; /* Return code */ |
| 24997 int i; /* To iterate through builtin tokenizers */ |
| 24998 |
| 24999 for(i=0; rc==SQLITE_OK && i<(int)ArraySize(aBuiltin); i++){ |
| 25000 rc = pApi->xCreateTokenizer(pApi, |
| 25001 aBuiltin[i].zName, |
| 25002 (void*)pApi, |
| 25003 &aBuiltin[i].x, |
| 25004 0 |
| 25005 ); |
| 25006 } |
| 25007 |
| 25008 return rc; |
| 25009 } |
| 25010 |
| 25011 |
| 25012 |
| 25013 /* |
| 25014 ** 2012 May 25 |
| 25015 ** |
| 25016 ** The author disclaims copyright to this source code. In place of |
| 25017 ** a legal notice, here is a blessing: |
| 25018 ** |
| 25019 ** May you do good and not evil. |
| 25020 ** May you find forgiveness for yourself and forgive others. |
| 25021 ** May you share freely, never taking more than you give. |
| 25022 ** |
| 25023 ****************************************************************************** |
| 25024 */ |
| 25025 |
| 25026 /* |
| 25027 ** DO NOT EDIT THIS MACHINE GENERATED FILE. |
| 25028 */ |
| 25029 |
| 25030 |
| 25031 /* #include <assert.h> */ |
| 25032 |
| 25033 /* |
| 25034 ** Return true if the argument corresponds to a unicode codepoint |
| 25035 ** classified as either a letter or a number. Otherwise false. |
| 25036 ** |
| 25037 ** The results are undefined if the value passed to this function |
| 25038 ** is less than zero. |
| 25039 */ |
| 25040 static int sqlite3Fts5UnicodeIsalnum(int c){ |
| 25041 /* Each unsigned integer in the following array corresponds to a contiguous |
| 25042 ** range of unicode codepoints that are neither letters nor numbers (i.e. |
| 25043 ** codepoints for which this function should return 0). |
| 25044 ** |
| 25045 ** The most significant 22 bits in each 32-bit value contain the first |
| 25046 ** codepoint in the range. The least significant 10 bits are used to store |
| 25047 ** the size of the range (always at least 1). In other words, the value |
| 25048 ** ((C<<10) + N) represents a range of N codepoints starting with codepoint |
| 25049 ** C. It is not possible to represent a range larger than 1023 codepoints |
| 25050 ** using this format. |
| 25051 */ |
| 25052 static const unsigned int aEntry[] = { |
| 25053 0x00000030, 0x0000E807, 0x00016C06, 0x0001EC2F, 0x0002AC07, |
| 25054 0x0002D001, 0x0002D803, 0x0002EC01, 0x0002FC01, 0x00035C01, |
| 25055 0x0003DC01, 0x000B0804, 0x000B480E, 0x000B9407, 0x000BB401, |
| 25056 0x000BBC81, 0x000DD401, 0x000DF801, 0x000E1002, 0x000E1C01, |
| 25057 0x000FD801, 0x00120808, 0x00156806, 0x00162402, 0x00163C01, |
| 25058 0x00164437, 0x0017CC02, 0x00180005, 0x00181816, 0x00187802, |
| 25059 0x00192C15, 0x0019A804, 0x0019C001, 0x001B5001, 0x001B580F, |
| 25060 0x001B9C07, 0x001BF402, 0x001C000E, 0x001C3C01, 0x001C4401, |
| 25061 0x001CC01B, 0x001E980B, 0x001FAC09, 0x001FD804, 0x00205804, |
| 25062 0x00206C09, 0x00209403, 0x0020A405, 0x0020C00F, 0x00216403, |
| 25063 0x00217801, 0x0023901B, 0x00240004, 0x0024E803, 0x0024F812, |
| 25064 0x00254407, 0x00258804, 0x0025C001, 0x00260403, 0x0026F001, |
| 25065 0x0026F807, 0x00271C02, 0x00272C03, 0x00275C01, 0x00278802, |
| 25066 0x0027C802, 0x0027E802, 0x00280403, 0x0028F001, 0x0028F805, |
| 25067 0x00291C02, 0x00292C03, 0x00294401, 0x0029C002, 0x0029D401, |
| 25068 0x002A0403, 0x002AF001, 0x002AF808, 0x002B1C03, 0x002B2C03, |
| 25069 0x002B8802, 0x002BC002, 0x002C0403, 0x002CF001, 0x002CF807, |
| 25070 0x002D1C02, 0x002D2C03, 0x002D5802, 0x002D8802, 0x002DC001, |
| 25071 0x002E0801, 0x002EF805, 0x002F1803, 0x002F2804, 0x002F5C01, |
| 25072 0x002FCC08, 0x00300403, 0x0030F807, 0x00311803, 0x00312804, |
| 25073 0x00315402, 0x00318802, 0x0031FC01, 0x00320802, 0x0032F001, |
| 25074 0x0032F807, 0x00331803, 0x00332804, 0x00335402, 0x00338802, |
| 25075 0x00340802, 0x0034F807, 0x00351803, 0x00352804, 0x00355C01, |
| 25076 0x00358802, 0x0035E401, 0x00360802, 0x00372801, 0x00373C06, |
| 25077 0x00375801, 0x00376008, 0x0037C803, 0x0038C401, 0x0038D007, |
| 25078 0x0038FC01, 0x00391C09, 0x00396802, 0x003AC401, 0x003AD006, |
| 25079 0x003AEC02, 0x003B2006, 0x003C041F, 0x003CD00C, 0x003DC417, |
| 25080 0x003E340B, 0x003E6424, 0x003EF80F, 0x003F380D, 0x0040AC14, |
| 25081 0x00412806, 0x00415804, 0x00417803, 0x00418803, 0x00419C07, |
| 25082 0x0041C404, 0x0042080C, 0x00423C01, 0x00426806, 0x0043EC01, |
| 25083 0x004D740C, 0x004E400A, 0x00500001, 0x0059B402, 0x005A0001, |
| 25084 0x005A6C02, 0x005BAC03, 0x005C4803, 0x005CC805, 0x005D4802, |
| 25085 0x005DC802, 0x005ED023, 0x005F6004, 0x005F7401, 0x0060000F, |
| 25086 0x0062A401, 0x0064800C, 0x0064C00C, 0x00650001, 0x00651002, |
| 25087 0x0066C011, 0x00672002, 0x00677822, 0x00685C05, 0x00687802, |
| 25088 0x0069540A, 0x0069801D, 0x0069FC01, 0x006A8007, 0x006AA006, |
| 25089 0x006C0005, 0x006CD011, 0x006D6823, 0x006E0003, 0x006E840D, |
| 25090 0x006F980E, 0x006FF004, 0x00709014, 0x0070EC05, 0x0071F802, |
| 25091 0x00730008, 0x00734019, 0x0073B401, 0x0073C803, 0x00770027, |
| 25092 0x0077F004, 0x007EF401, 0x007EFC03, 0x007F3403, 0x007F7403, |
| 25093 0x007FB403, 0x007FF402, 0x00800065, 0x0081A806, 0x0081E805, |
| 25094 0x00822805, 0x0082801A, 0x00834021, 0x00840002, 0x00840C04, |
| 25095 0x00842002, 0x00845001, 0x00845803, 0x00847806, 0x00849401, |
| 25096 0x00849C01, 0x0084A401, 0x0084B801, 0x0084E802, 0x00850005, |
| 25097 0x00852804, 0x00853C01, 0x00864264, 0x00900027, 0x0091000B, |
| 25098 0x0092704E, 0x00940200, 0x009C0475, 0x009E53B9, 0x00AD400A, |
| 25099 0x00B39406, 0x00B3BC03, 0x00B3E404, 0x00B3F802, 0x00B5C001, |
| 25100 0x00B5FC01, 0x00B7804F, 0x00B8C00C, 0x00BA001A, 0x00BA6C59, |
| 25101 0x00BC00D6, 0x00BFC00C, 0x00C00005, 0x00C02019, 0x00C0A807, |
| 25102 0x00C0D802, 0x00C0F403, 0x00C26404, 0x00C28001, 0x00C3EC01, |
| 25103 0x00C64002, 0x00C6580A, 0x00C70024, 0x00C8001F, 0x00C8A81E, |
| 25104 0x00C94001, 0x00C98020, 0x00CA2827, 0x00CB003F, 0x00CC0100, |
| 25105 0x01370040, 0x02924037, 0x0293F802, 0x02983403, 0x0299BC10, |
| 25106 0x029A7C01, 0x029BC008, 0x029C0017, 0x029C8002, 0x029E2402, |
| 25107 0x02A00801, 0x02A01801, 0x02A02C01, 0x02A08C09, 0x02A0D804, |
| 25108 0x02A1D004, 0x02A20002, 0x02A2D011, 0x02A33802, 0x02A38012, |
| 25109 0x02A3E003, 0x02A4980A, 0x02A51C0D, 0x02A57C01, 0x02A60004, |
| 25110 0x02A6CC1B, 0x02A77802, 0x02A8A40E, 0x02A90C01, 0x02A93002, |
| 25111 0x02A97004, 0x02A9DC03, 0x02A9EC01, 0x02AAC001, 0x02AAC803, |
| 25112 0x02AADC02, 0x02AAF802, 0x02AB0401, 0x02AB7802, 0x02ABAC07, |
| 25113 0x02ABD402, 0x02AF8C0B, 0x03600001, 0x036DFC02, 0x036FFC02, |
| 25114 0x037FFC01, 0x03EC7801, 0x03ECA401, 0x03EEC810, 0x03F4F802, |
| 25115 0x03F7F002, 0x03F8001A, 0x03F88007, 0x03F8C023, 0x03F95013, |
| 25116 0x03F9A004, 0x03FBFC01, 0x03FC040F, 0x03FC6807, 0x03FCEC06, |
| 25117 0x03FD6C0B, 0x03FF8007, 0x03FFA007, 0x03FFE405, 0x04040003, |
| 25118 0x0404DC09, 0x0405E411, 0x0406400C, 0x0407402E, 0x040E7C01, |
| 25119 0x040F4001, 0x04215C01, 0x04247C01, 0x0424FC01, 0x04280403, |
| 25120 0x04281402, 0x04283004, 0x0428E003, 0x0428FC01, 0x04294009, |
| 25121 0x0429FC01, 0x042CE407, 0x04400003, 0x0440E016, 0x04420003, |
| 25122 0x0442C012, 0x04440003, 0x04449C0E, 0x04450004, 0x04460003, |
| 25123 0x0446CC0E, 0x04471404, 0x045AAC0D, 0x0491C004, 0x05BD442E, |
| 25124 0x05BE3C04, 0x074000F6, 0x07440027, 0x0744A4B5, 0x07480046, |
| 25125 0x074C0057, 0x075B0401, 0x075B6C01, 0x075BEC01, 0x075C5401, |
| 25126 0x075CD401, 0x075D3C01, 0x075DBC01, 0x075E2401, 0x075EA401, |
| 25127 0x075F0C01, 0x07BBC002, 0x07C0002C, 0x07C0C064, 0x07C2800F, |
| 25128 0x07C2C40E, 0x07C3040F, 0x07C3440F, 0x07C4401F, 0x07C4C03C, |
| 25129 0x07C5C02B, 0x07C7981D, 0x07C8402B, 0x07C90009, 0x07C94002, |
| 25130 0x07CC0021, 0x07CCC006, 0x07CCDC46, 0x07CE0014, 0x07CE8025, |
| 25131 0x07CF1805, 0x07CF8011, 0x07D0003F, 0x07D10001, 0x07D108B6, |
| 25132 0x07D3E404, 0x07D4003E, 0x07D50004, 0x07D54018, 0x07D7EC46, |
| 25133 0x07D9140B, 0x07DA0046, 0x07DC0074, 0x38000401, 0x38008060, |
| 25134 0x380400F0, |
| 25135 }; |
| 25136 static const unsigned int aAscii[4] = { |
| 25137 0xFFFFFFFF, 0xFC00FFFF, 0xF8000001, 0xF8000001, |
| 25138 }; |
| 25139 |
| 25140 if( c<128 ){ |
| 25141 return ( (aAscii[c >> 5] & (1 << (c & 0x001F)))==0 ); |
| 25142 }else if( c<(1<<22) ){ |
| 25143 unsigned int key = (((unsigned int)c)<<10) | 0x000003FF; |
| 25144 int iRes = 0; |
| 25145 int iHi = sizeof(aEntry)/sizeof(aEntry[0]) - 1; |
| 25146 int iLo = 0; |
| 25147 while( iHi>=iLo ){ |
| 25148 int iTest = (iHi + iLo) / 2; |
| 25149 if( key >= aEntry[iTest] ){ |
| 25150 iRes = iTest; |
| 25151 iLo = iTest+1; |
| 25152 }else{ |
| 25153 iHi = iTest-1; |
| 25154 } |
| 25155 } |
| 25156 assert( aEntry[0]<key ); |
| 25157 assert( key>=aEntry[iRes] ); |
| 25158 return (((unsigned int)c) >= ((aEntry[iRes]>>10) + (aEntry[iRes]&0x3FF))); |
| 25159 } |
| 25160 return 1; |
| 25161 } |
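The packing used by the range table above (first codepoint in the upper 22 bits, range size in the lower 10 bits) can be sketched in isolation. This is only an illustration: `sketchRangeFirst`, `sketchRangeSize` and `sketchRangePack` are hypothetical names that do not exist in the SQLite source.

```c
#include <assert.h>

/* Hypothetical helpers (not part of SQLite) illustrating the 22/10 bit split. */

/* First codepoint of the range: the most significant 22 bits. */
static unsigned int sketchRangeFirst(unsigned int v){
  return v >> 10;
}

/* Number of codepoints in the range: the least significant 10 bits. */
static unsigned int sketchRangeSize(unsigned int v){
  return v & 0x3FF;
}

/* Pack a (first codepoint, size) pair; size must be in the range 1..1023. */
static unsigned int sketchRangePack(unsigned int iFirst, unsigned int nSize){
  assert( nSize>=1 && nSize<=1023 );
  return (iFirst << 10) | nSize;
}
```

Under this reading, the table entry 0x0000E807 describes the 7 codepoints starting at 58 (':' through '@'), and 0x00016C06 describes the 6 codepoints starting at 91 ('[' through '`').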
| 25162 |
| 25163 |
| 25164 /* |
| 25165 ** If the argument is a codepoint corresponding to a lowercase letter |
| 25166 ** in the ASCII range with a diacritic added, return the codepoint |
| 25167 ** of the ASCII letter only. For example, if passed 235 - "LATIN |
| 25168 ** SMALL LETTER E WITH DIAERESIS" - return 101 ("LATIN SMALL LETTER |
| 25169 ** E"). The results of passing a codepoint that corresponds to an |
| 25170 ** uppercase letter are undefined. |
| 25171 */ |
| 25172 static int fts5_remove_diacritic(int c){ |
| 25173 unsigned short aDia[] = { |
| 25174 0, 1797, 1848, 1859, 1891, 1928, 1940, 1995, |
| 25175 2024, 2040, 2060, 2110, 2168, 2206, 2264, 2286, |
| 25176 2344, 2383, 2472, 2488, 2516, 2596, 2668, 2732, |
| 25177 2782, 2842, 2894, 2954, 2984, 3000, 3028, 3336, |
| 25178 3456, 3696, 3712, 3728, 3744, 3896, 3912, 3928, |
| 25179 3968, 4008, 4040, 4106, 4138, 4170, 4202, 4234, |
| 25180 4266, 4296, 4312, 4344, 4408, 4424, 4472, 4504, |
| 25181 6148, 6198, 6264, 6280, 6360, 6429, 6505, 6529, |
| 25182 61448, 61468, 61534, 61592, 61642, 61688, 61704, 61726, |
| 25183 61784, 61800, 61836, 61880, 61914, 61948, 61998, 62122, |
| 25184 62154, 62200, 62218, 62302, 62364, 62442, 62478, 62536, |
| 25185 62554, 62584, 62604, 62640, 62648, 62656, 62664, 62730, |
| 25186 62924, 63050, 63082, 63274, 63390, |
| 25187 }; |
| 25188 char aChar[] = { |
| 25189 '\0', 'a', 'c', 'e', 'i', 'n', 'o', 'u', 'y', 'y', 'a', 'c', |
| 25190 'd', 'e', 'e', 'g', 'h', 'i', 'j', 'k', 'l', 'n', 'o', 'r', |
| 25191 's', 't', 'u', 'u', 'w', 'y', 'z', 'o', 'u', 'a', 'i', 'o', |
| 25192 'u', 'g', 'k', 'o', 'j', 'g', 'n', 'a', 'e', 'i', 'o', 'r', |
| 25193 'u', 's', 't', 'h', 'a', 'e', 'o', 'y', '\0', '\0', '\0', '\0', |
| 25194 '\0', '\0', '\0', '\0', 'a', 'b', 'd', 'd', 'e', 'f', 'g', 'h', |
| 25195 'h', 'i', 'k', 'l', 'l', 'm', 'n', 'p', 'r', 'r', 's', 't', |
| 25196 'u', 'v', 'w', 'w', 'x', 'y', 'z', 'h', 't', 'w', 'y', 'a', |
| 25197 'e', 'i', 'o', 'u', 'y', |
| 25198 }; |
| 25199 |
| 25200 unsigned int key = (((unsigned int)c)<<3) | 0x00000007; |
| 25201 int iRes = 0; |
| 25202 int iHi = sizeof(aDia)/sizeof(aDia[0]) - 1; |
| 25203 int iLo = 0; |
| 25204 while( iHi>=iLo ){ |
| 25205 int iTest = (iHi + iLo) / 2; |
| 25206 if( key >= aDia[iTest] ){ |
| 25207 iRes = iTest; |
| 25208 iLo = iTest+1; |
| 25209 }else{ |
| 25210 iHi = iTest-1; |
| 25211 } |
| 25212 } |
| 25213 assert( key>=aDia[iRes] ); |
| 25214 return ((c > (aDia[iRes]>>3) + (aDia[iRes]&0x07)) ? c : (int)aChar[iRes]); |
| 25215 } |
| 25216 |
| 25217 |
| 25218 /* |
| 25219 ** Return true if the argument interpreted as a unicode codepoint |
| 25220 ** is a diacritical modifier character. |
| 25221 */ |
| 25222 static int sqlite3Fts5UnicodeIsdiacritic(int c){ |
| 25223 unsigned int mask0 = 0x08029FDF; |
| 25224 unsigned int mask1 = 0x000361F8; |
| 25225 if( c<768 || c>817 ) return 0; |
| 25226 return (c < 768+32) ? |
| 25227 (mask0 & (1 << (c-768))) : |
| 25228 (mask1 & (1 << (c-768-32))); |
| 25229 } |
| 25230 |
| 25231 |
| 25232 /* |
| 25233 ** Interpret the argument as a unicode codepoint. If the codepoint |
| 25234 ** is an upper case character that has a lower case equivalent, |
| 25235 ** return the codepoint corresponding to the lower case version. |
| 25236 ** Otherwise, return a copy of the argument. |
| 25237 ** |
| 25238 ** The results are undefined if the value passed to this function |
| 25239 ** is less than zero. |
| 25240 */ |
| 25241 static int sqlite3Fts5UnicodeFold(int c, int bRemoveDiacritic){ |
| 25242 /* Each entry in the following array defines a rule for folding a range |
| 25243 ** of codepoints to lower case. The rule applies to a range of nRange |
| 25244 ** codepoints starting at codepoint iCode. |
| 25245 ** |
| 25246 ** If the least significant bit in flags is clear, then the rule applies |
| 25247 ** to all nRange codepoints (i.e. all nRange codepoints are upper case and |
| 25248 ** need to be folded). Or, if it is set, then the rule only applies to |
| 25249 ** every second codepoint in the range, starting with codepoint C. |
| 25250 ** |
| 25251 ** The 7 most significant bits in flags are an index into the aiOff[] |
| 25252 ** array. If a specific codepoint C does require folding, then its lower |
| 25253 ** case equivalent is ((C + aiOff[flags>>1]) & 0xFFFF). |
| 25254 ** |
| 25255 ** The contents of this array are generated by parsing the CaseFolding.txt |
| 25256 ** file distributed as part of the "Unicode Character Database". See |
| 25257 ** http://www.unicode.org for details. |
| 25258 */ |
| 25259 static const struct TableEntry { |
| 25260 unsigned short iCode; |
| 25261 unsigned char flags; |
| 25262 unsigned char nRange; |
| 25263 } aEntry[] = { |
| 25264 {65, 14, 26}, {181, 64, 1}, {192, 14, 23}, |
| 25265 {216, 14, 7}, {256, 1, 48}, {306, 1, 6}, |
| 25266 {313, 1, 16}, {330, 1, 46}, {376, 116, 1}, |
| 25267 {377, 1, 6}, {383, 104, 1}, {385, 50, 1}, |
| 25268 {386, 1, 4}, {390, 44, 1}, {391, 0, 1}, |
| 25269 {393, 42, 2}, {395, 0, 1}, {398, 32, 1}, |
| 25270 {399, 38, 1}, {400, 40, 1}, {401, 0, 1}, |
| 25271 {403, 42, 1}, {404, 46, 1}, {406, 52, 1}, |
| 25272 {407, 48, 1}, {408, 0, 1}, {412, 52, 1}, |
| 25273 {413, 54, 1}, {415, 56, 1}, {416, 1, 6}, |
| 25274 {422, 60, 1}, {423, 0, 1}, {425, 60, 1}, |
| 25275 {428, 0, 1}, {430, 60, 1}, {431, 0, 1}, |
| 25276 {433, 58, 2}, {435, 1, 4}, {439, 62, 1}, |
| 25277 {440, 0, 1}, {444, 0, 1}, {452, 2, 1}, |
| 25278 {453, 0, 1}, {455, 2, 1}, {456, 0, 1}, |
| 25279 {458, 2, 1}, {459, 1, 18}, {478, 1, 18}, |
| 25280 {497, 2, 1}, {498, 1, 4}, {502, 122, 1}, |
| 25281 {503, 134, 1}, {504, 1, 40}, {544, 110, 1}, |
| 25282 {546, 1, 18}, {570, 70, 1}, {571, 0, 1}, |
| 25283 {573, 108, 1}, {574, 68, 1}, {577, 0, 1}, |
| 25284 {579, 106, 1}, {580, 28, 1}, {581, 30, 1}, |
| 25285 {582, 1, 10}, {837, 36, 1}, {880, 1, 4}, |
| 25286 {886, 0, 1}, {902, 18, 1}, {904, 16, 3}, |
| 25287 {908, 26, 1}, {910, 24, 2}, {913, 14, 17}, |
| 25288 {931, 14, 9}, {962, 0, 1}, {975, 4, 1}, |
| 25289 {976, 140, 1}, {977, 142, 1}, {981, 146, 1}, |
| 25290 {982, 144, 1}, {984, 1, 24}, {1008, 136, 1}, |
| 25291 {1009, 138, 1}, {1012, 130, 1}, {1013, 128, 1}, |
| 25292 {1015, 0, 1}, {1017, 152, 1}, {1018, 0, 1}, |
| 25293 {1021, 110, 3}, {1024, 34, 16}, {1040, 14, 32}, |
| 25294 {1120, 1, 34}, {1162, 1, 54}, {1216, 6, 1}, |
| 25295 {1217, 1, 14}, {1232, 1, 88}, {1329, 22, 38}, |
| 25296 {4256, 66, 38}, {4295, 66, 1}, {4301, 66, 1}, |
| 25297 {7680, 1, 150}, {7835, 132, 1}, {7838, 96, 1}, |
| 25298 {7840, 1, 96}, {7944, 150, 8}, {7960, 150, 6}, |
| 25299 {7976, 150, 8}, {7992, 150, 8}, {8008, 150, 6}, |
| 25300 {8025, 151, 8}, {8040, 150, 8}, {8072, 150, 8}, |
| 25301 {8088, 150, 8}, {8104, 150, 8}, {8120, 150, 2}, |
| 25302 {8122, 126, 2}, {8124, 148, 1}, {8126, 100, 1}, |
| 25303 {8136, 124, 4}, {8140, 148, 1}, {8152, 150, 2}, |
| 25304 {8154, 120, 2}, {8168, 150, 2}, {8170, 118, 2}, |
| 25305 {8172, 152, 1}, {8184, 112, 2}, {8186, 114, 2}, |
| 25306 {8188, 148, 1}, {8486, 98, 1}, {8490, 92, 1}, |
| 25307 {8491, 94, 1}, {8498, 12, 1}, {8544, 8, 16}, |
| 25308 {8579, 0, 1}, {9398, 10, 26}, {11264, 22, 47}, |
| 25309 {11360, 0, 1}, {11362, 88, 1}, {11363, 102, 1}, |
| 25310 {11364, 90, 1}, {11367, 1, 6}, {11373, 84, 1}, |
| 25311 {11374, 86, 1}, {11375, 80, 1}, {11376, 82, 1}, |
| 25312 {11378, 0, 1}, {11381, 0, 1}, {11390, 78, 2}, |
| 25313 {11392, 1, 100}, {11499, 1, 4}, {11506, 0, 1}, |
| 25314 {42560, 1, 46}, {42624, 1, 24}, {42786, 1, 14}, |
| 25315 {42802, 1, 62}, {42873, 1, 4}, {42877, 76, 1}, |
| 25316 {42878, 1, 10}, {42891, 0, 1}, {42893, 74, 1}, |
| 25317 {42896, 1, 4}, {42912, 1, 10}, {42922, 72, 1}, |
| 25318 {65313, 14, 26}, |
| 25319 }; |
| 25320 static const unsigned short aiOff[] = { |
| 25321 1, 2, 8, 15, 16, 26, 28, 32, |
| 25322 37, 38, 40, 48, 63, 64, 69, 71, |
| 25323 79, 80, 116, 202, 203, 205, 206, 207, |
| 25324 209, 210, 211, 213, 214, 217, 218, 219, |
| 25325 775, 7264, 10792, 10795, 23228, 23256, 30204, 54721, |
| 25326 54753, 54754, 54756, 54787, 54793, 54809, 57153, 57274, |
| 25327 57921, 58019, 58363, 61722, 65268, 65341, 65373, 65406, |
| 25328 65408, 65410, 65415, 65424, 65436, 65439, 65450, 65462, |
| 25329 65472, 65476, 65478, 65480, 65482, 65488, 65506, 65511, |
| 25330 65514, 65521, 65527, 65528, 65529, |
| 25331 }; |
| 25332 |
| 25333 int ret = c; |
| 25334 |
| 25335 assert( sizeof(unsigned short)==2 && sizeof(unsigned char)==1 ); |
| 25336 |
| 25337 if( c<128 ){ |
| 25338 if( c>='A' && c<='Z' ) ret = c + ('a' - 'A'); |
| 25339 }else if( c<65536 ){ |
| 25340 const struct TableEntry *p; |
| 25341 int iHi = sizeof(aEntry)/sizeof(aEntry[0]) - 1; |
| 25342 int iLo = 0; |
| 25343 int iRes = -1; |
| 25344 |
| 25345 assert( c>aEntry[0].iCode ); |
| 25346 while( iHi>=iLo ){ |
| 25347 int iTest = (iHi + iLo) / 2; |
| 25348 int cmp = (c - aEntry[iTest].iCode); |
| 25349 if( cmp>=0 ){ |
| 25350 iRes = iTest; |
| 25351 iLo = iTest+1; |
| 25352 }else{ |
| 25353 iHi = iTest-1; |
| 25354 } |
| 25355 } |
| 25356 |
| 25357 assert( iRes>=0 && c>=aEntry[iRes].iCode ); |
| 25358 p = &aEntry[iRes]; |
| 25359 if( c<(p->iCode + p->nRange) && 0==(0x01 & p->flags & (p->iCode ^ c)) ){ |
| 25360 ret = (c + (aiOff[p->flags>>1])) & 0x0000FFFF; |
| 25361 assert( ret>0 ); |
| 25362 } |
| 25363 |
| 25364 if( bRemoveDiacritic ) ret = fts5_remove_diacritic(ret); |
| 25365 } |
| 25366 |
| 25367 else if( c>=66560 && c<66600 ){ |
| 25368 ret = c + 40; |
| 25369 } |
| 25370 |
| 25371 return ret; |
| 25372 } |
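The way a single folding rule is applied can be sketched as a standalone function. This is a simplified illustration, not the code used by FTS5: `sketchFoldWithRule` is a hypothetical name, the rule fields are passed as plain integers, and the offset (the `aiOff[flags>>1]` value) is assumed to have been looked up by the caller. Unlike the original, which relies on the binary search to guarantee `c>=iCode`, the sketch checks the lower bound explicitly.

```c
#include <assert.h>

/* Hypothetical helper (not part of SQLite). Apply one folding rule to
** codepoint c and return the folded value, or c itself if the rule does
** not apply.
**
**   iCode, nRange  - the rule covers codepoints iCode..iCode+nRange-1
**   flags          - if bit 0 is set, only every second codepoint
**                    (counting from iCode) is folded
**   iOff           - offset added modulo 65536 (assumed to be the
**                    aiOff[flags>>1] value chosen by the caller)
*/
static int sketchFoldWithRule(int iCode, int flags, int nRange, int iOff, int c){
  if( c>=iCode && c<(iCode + nRange) && 0==(0x01 & flags & (iCode ^ c)) ){
    return (c + iOff) & 0x0000FFFF;
  }
  return c;
}
```

For instance, with the rule {192, 14, 23} and offset 32, codepoint 192 maps to 224. With the rule {376, 116, 1} and offset 65415, codepoint 376 maps to 255, which shows how the 16-bit wraparound stands in for a negative offset.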
| 25373 |
| 25374 /* |
| 25375 ** 2015 May 30 |
| 25376 ** |
| 25377 ** The author disclaims copyright to this source code. In place of |
| 25378 ** a legal notice, here is a blessing: |
| 25379 ** |
| 25380 ** May you do good and not evil. |
| 25381 ** May you find forgiveness for yourself and forgive others. |
| 25382 ** May you share freely, never taking more than you give. |
| 25383 ** |
| 25384 ****************************************************************************** |
| 25385 ** |
| 25386 ** Routines for varint serialization and deserialization. |
| 25387 */ |
| 25388 |
| 25389 |
| 25390 /* #include "fts5Int.h" */ |
| 25391 |
| 25392 /* |
| 25393 ** This is a copy of the sqlite3GetVarint32() routine from the SQLite core. |
| 25394 ** Except, this version does handle the single byte case that the core |
| 25395 ** version depends on being handled before its function is called. |
| 25396 */ |
| 25397 static int sqlite3Fts5GetVarint32(const unsigned char *p, u32 *v){ |
| 25398 u32 a,b; |
| 25399 |
| 25400 /* The 1-byte case. Overwhelmingly the most common. */ |
| 25401 a = *p; |
| 25402 /* a: p0 (unmasked) */ |
| 25403 if (!(a&0x80)) |
| 25404 { |
| 25405 /* Values between 0 and 127 */ |
| 25406 *v = a; |
| 25407 return 1; |
| 25408 } |
| 25409 |
| 25410 /* The 2-byte case */ |
| 25411 p++; |
| 25412 b = *p; |
| 25413 /* b: p1 (unmasked) */ |
| 25414 if (!(b&0x80)) |
| 25415 { |
| 25416 /* Values between 128 and 16383 */ |
| 25417 a &= 0x7f; |
| 25418 a = a<<7; |
| 25419 *v = a | b; |
| 25420 return 2; |
| 25421 } |
| 25422 |
| 25423 /* The 3-byte case */ |
| 25424 p++; |
| 25425 a = a<<14; |
| 25426 a |= *p; |
| 25427 /* a: p0<<14 | p2 (unmasked) */ |
| 25428 if (!(a&0x80)) |
| 25429 { |
| 25430 /* Values between 16384 and 2097151 */ |
| 25431 a &= (0x7f<<14)|(0x7f); |
| 25432 b &= 0x7f; |
| 25433 b = b<<7; |
| 25434 *v = a | b; |
| 25435 return 3; |
| 25436 } |
| 25437 |
| 25438 /* A 32-bit varint is used to store size information in btrees. |
| 25439 ** Objects are rarely larger than the 2MiB limit of a 3-byte varint. |
| 25440 ** A 3-byte varint is sufficient, for example, to record the size |
| 25441 ** of a 1048569-byte BLOB or string. |
| 25442 ** |
| 25443 ** We only unroll the first 1-, 2-, and 3- byte cases. The very |
| 25444 ** rare larger cases can be handled by the slower 64-bit varint |
| 25445 ** routine. |
| 25446 */ |
| 25447 { |
| 25448 u64 v64; |
| 25449 u8 n; |
| 25450 p -= 2; |
| 25451 n = sqlite3Fts5GetVarint(p, &v64); |
| 25452 *v = (u32)v64; |
| 25453 assert( n>3 && n<=9 ); |
| 25454 return n; |
| 25455 } |
| 25456 } |
| 25457 |
| 25458 |
| 25459 /* |
| 25460 ** Bitmasks used by sqlite3Fts5GetVarint(). These precomputed constants |
| 25461 ** are defined here rather than simply putting the constant expressions |
| 25462 ** inline in order to work around bugs in the RVT compiler. |
| 25463 ** |
| 25464 ** SLOT_2_0 A mask for (0x7f<<14) | 0x7f |
| 25465 ** |
| 25466 ** SLOT_4_2_0 A mask for (0x7f<<28) | SLOT_2_0 |
| 25467 */ |
| 25468 #define SLOT_2_0 0x001fc07f |
| 25469 #define SLOT_4_2_0 0xf01fc07f |
| 25470 |
| 25471 /* |
| 25472 ** Read a 64-bit variable-length integer from memory starting at p[0]. |
| 25473 ** Return the number of bytes read. The value is stored in *v. |
| 25474 */ |
| 25475 static u8 sqlite3Fts5GetVarint(const unsigned char *p, u64 *v){ |
| 25476 u32 a,b,s; |
| 25477 |
| 25478 a = *p; |
| 25479 /* a: p0 (unmasked) */ |
| 25480 if (!(a&0x80)) |
| 25481 { |
| 25482 *v = a; |
| 25483 return 1; |
| 25484 } |
| 25485 |
| 25486 p++; |
| 25487 b = *p; |
| 25488 /* b: p1 (unmasked) */ |
| 25489 if (!(b&0x80)) |
| 25490 { |
| 25491 a &= 0x7f; |
| 25492 a = a<<7; |
| 25493 a |= b; |
| 25494 *v = a; |
| 25495 return 2; |
| 25496 } |
| 25497 |
| 25498 /* Verify that constants are precomputed correctly */ |
| 25499 assert( SLOT_2_0 == ((0x7f<<14) | (0x7f)) ); |
| 25500 assert( SLOT_4_2_0 == ((0xfU<<28) | (0x7f<<14) | (0x7f)) ); |
| 25501 |
| 25502 p++; |
| 25503 a = a<<14; |
| 25504 a |= *p; |
| 25505 /* a: p0<<14 | p2 (unmasked) */ |
| 25506 if (!(a&0x80)) |
| 25507 { |
| 25508 a &= SLOT_2_0; |
| 25509 b &= 0x7f; |
| 25510 b = b<<7; |
| 25511 a |= b; |
| 25512 *v = a; |
| 25513 return 3; |
| 25514 } |
| 25515 |
| 25516 /* CSE1 from below */ |
| 25517 a &= SLOT_2_0; |
| 25518 p++; |
| 25519 b = b<<14; |
| 25520 b |= *p; |
| 25521 /* b: p1<<14 | p3 (unmasked) */ |
| 25522 if (!(b&0x80)) |
| 25523 { |
| 25524 b &= SLOT_2_0; |
| 25525 /* moved CSE1 up */ |
| 25526 /* a &= (0x7f<<14)|(0x7f); */ |
| 25527 a = a<<7; |
| 25528 a |= b; |
| 25529 *v = a; |
| 25530 return 4; |
| 25531 } |
| 25532 |
| 25533 /* a: p0<<14 | p2 (masked) */ |
| 25534 /* b: p1<<14 | p3 (unmasked) */ |
| 25535 /* 1:save off p0<<21 | p1<<14 | p2<<7 | p3 (masked) */ |
| 25536 /* moved CSE1 up */ |
| 25537 /* a &= (0x7f<<14)|(0x7f); */ |
| 25538 b &= SLOT_2_0; |
| 25539 s = a; |
| 25540 /* s: p0<<14 | p2 (masked) */ |
| 25541 |
| 25542 p++; |
| 25543 a = a<<14; |
| 25544 a |= *p; |
| 25545 /* a: p0<<28 | p2<<14 | p4 (unmasked) */ |
| 25546 if (!(a&0x80)) |
| 25547 { |
| 25548 /* we can skip these cause they were (effectively) done above in calc'ing s */ |
| 25549 /* a &= (0x7f<<28)|(0x7f<<14)|(0x7f); */ |
| 25550 /* b &= (0x7f<<14)|(0x7f); */ |
| 25551 b = b<<7; |
| 25552 a |= b; |
| 25553 s = s>>18; |
| 25554 *v = ((u64)s)<<32 | a; |
| 25555 return 5; |
| 25556 } |
| 25557 |
| 25558 /* 2:save off p0<<21 | p1<<14 | p2<<7 | p3 (masked) */ |
| 25559 s = s<<7; |
| 25560 s |= b; |
| 25561 /* s: p0<<21 | p1<<14 | p2<<7 | p3 (masked) */ |
| 25562 |
| 25563 p++; |
| 25564 b = b<<14; |
| 25565 b |= *p; |
| 25566 /* b: p1<<28 | p3<<14 | p5 (unmasked) */ |
| 25567 if (!(b&0x80)) |
| 25568 { |
| 25569 /* we can skip this cause it was (effectively) done above in calc'ing s */ |
| 25570 /* b &= (0x7f<<28)|(0x7f<<14)|(0x7f); */ |
| 25571 a &= SLOT_2_0; |
| 25572 a = a<<7; |
| 25573 a |= b; |
| 25574 s = s>>18; |
| 25575 *v = ((u64)s)<<32 | a; |
| 25576 return 6; |
| 25577 } |
| 25578 |
| 25579 p++; |
| 25580 a = a<<14; |
| 25581 a |= *p; |
| 25582 /* a: p2<<28 | p4<<14 | p6 (unmasked) */ |
| 25583 if (!(a&0x80)) |
| 25584 { |
| 25585 a &= SLOT_4_2_0; |
| 25586 b &= SLOT_2_0; |
| 25587 b = b<<7; |
| 25588 a |= b; |
| 25589 s = s>>11; |
| 25590 *v = ((u64)s)<<32 | a; |
| 25591 return 7; |
| 25592 } |
| 25593 |
| 25594 /* CSE2 from below */ |
| 25595 a &= SLOT_2_0; |
| 25596 p++; |
| 25597 b = b<<14; |
| 25598 b |= *p; |
| 25599 /* b: p3<<28 | p5<<14 | p7 (unmasked) */ |
| 25600 if (!(b&0x80)) |
| 25601 { |
| 25602 b &= SLOT_4_2_0; |
| 25603 /* moved CSE2 up */ |
| 25604 /* a &= (0x7f<<14)|(0x7f); */ |
| 25605 a = a<<7; |
| 25606 a |= b; |
| 25607 s = s>>4; |
| 25608 *v = ((u64)s)<<32 | a; |
| 25609 return 8; |
| 25610 } |
| 25611 |
| 25612 p++; |
| 25613 a = a<<15; |
| 25614 a |= *p; |
| 25615 /* a: p4<<29 | p6<<15 | p8 (unmasked) */ |
| 25616 |
| 25617 /* moved CSE2 up */ |
| 25618 /* a &= (0x7f<<29)|(0x7f<<15)|(0xff); */ |
| 25619 b &= SLOT_2_0; |
| 25620 b = b<<8; |
| 25621 a |= b; |
| 25622 |
| 25623 s = s<<4; |
| 25624 b = p[-4]; |
| 25625 b &= 0x7f; |
| 25626 b = b>>3; |
| 25627 s |= b; |
| 25628 |
| 25629 *v = ((u64)s)<<32 | a; |
| 25630 |
| 25631 return 9; |
| 25632 } |
| 25633 |
| 25634 /* |
| 25635 ** The variable-length integer encoding is as follows: |
| 25636 ** |
| 25637 ** KEY: |
| 25638 ** A = 0xxxxxxx 7 bits of data and one flag bit |
| 25639 ** B = 1xxxxxxx 7 bits of data and one flag bit |
| 25640 ** C = xxxxxxxx 8 bits of data |
| 25641 ** |
| 25642 ** 7 bits - A |
| 25643 ** 14 bits - BA |
| 25644 ** 21 bits - BBA |
| 25645 ** 28 bits - BBBA |
| 25646 ** 35 bits - BBBBA |
| 25647 ** 42 bits - BBBBBA |
| 25648 ** 49 bits - BBBBBBA |
| 25649 ** 56 bits - BBBBBBBA |
| 25650 ** 64 bits - BBBBBBBBC |
| 25651 */ |
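The encoding table above can be exercised with a loop-based encoder and decoder. This is a minimal sketch for illustration only: it is not the unrolled implementation used by FTS5, and the names `sketchPutVarint`, `sketchGetVarint`, `sketchVarintLen` and `sketchRoundTrip` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoder (not part of SQLite): write v to p[] and return
** the number of bytes used (1..9). p must have room for 9 bytes. */
static int sketchPutVarint(unsigned char *p, uint64_t v){
  unsigned char buf[9];
  int n = 0, i;
  if( v >> 56 ){
    /* More than 56 bits: 8 flagged 7-bit bytes, then a full 8-bit byte. */
    p[8] = (unsigned char)v;
    v >>= 8;
    for(i=7; i>=0; i--){
      p[i] = (unsigned char)((v & 0x7f) | 0x80);
      v >>= 7;
    }
    return 9;
  }
  /* Collect 7-bit groups, least significant first, all flagged. */
  do{
    buf[n++] = (unsigned char)((v & 0x7f) | 0x80);
    v >>= 7;
  }while( v!=0 );
  buf[0] &= 0x7f;                /* the final output byte has its flag clear */
  for(i=0; i<n; i++) p[i] = buf[n-1-i];
  return n;
}

/* Hypothetical decoder: read a varint from p[], store it in *pv and
** return the number of bytes consumed (1..9). */
static int sketchGetVarint(const unsigned char *p, uint64_t *pv){
  uint64_t v = 0;
  int i;
  for(i=0; i<8; i++){
    v = (v << 7) | (uint64_t)(p[i] & 0x7f);
    if( (p[i] & 0x80)==0 ){
      *pv = v;
      return i+1;
    }
  }
  *pv = (v << 8) | p[8];         /* 9th byte carries 8 bits of data */
  return 9;
}

/* Number of bytes the sketch encoder uses for v. */
static int sketchVarintLen(uint64_t v){
  unsigned char a[9];
  return sketchPutVarint(a, v);
}

/* Encode then decode v, checking that both sides agree on the length. */
static uint64_t sketchRoundTrip(uint64_t v){
  unsigned char a[9];
  uint64_t out = 0;
  int nPut = sketchPutVarint(a, v);
  int nGet = sketchGetVarint(a, &out);
  assert( nPut==nGet );
  return out;
}
```

The byte-count boundaries match the comments in the decoder above: values up to 127 take one byte, up to 16383 two bytes, and up to 2097151 three bytes.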
| 25652 |
| 25653 #ifdef SQLITE_NOINLINE |
| 25654 # define FTS5_NOINLINE SQLITE_NOINLINE |
| 25655 #else |
| 25656 # define FTS5_NOINLINE |
| 25657 #endif |
| 25658 |
| 25659 /* |
| 25660 ** Write a 64-bit variable-length integer to memory starting at p[0]. |
| 25661 ** The length of the data written will be between 1 and 9 bytes. The number |
| 25662 ** of bytes written is returned. |
| 25663 ** |
| 25664 ** A variable-length integer consists of the lower 7 bits of each byte |
| 25665 ** for all bytes that have the 8th bit set and one byte with the 8th |
| 25666 ** bit clear. Except, if we get to the 9th byte, it stores the full |
| 25667 ** 8 bits and is the last byte. |
| 25668 */ |
| 25669 static int FTS5_NOINLINE fts5PutVarint64(unsigned char *p, u64 v){ |
| 25670 int i, j, n; |
| 25671 u8 buf[10]; |
| 25672 if( v & (((u64)0xff000000)<<32) ){ |
| 25673 p[8] = (u8)v; |
| 25674 v >>= 8; |
| 25675 for(i=7; i>=0; i--){ |
| 25676 p[i] = (u8)((v & 0x7f) | 0x80); |
| 25677 v >>= 7; |
| 25678 } |
| 25679 return 9; |
| 25680 } |
| 25681 n = 0; |
| 25682 do{ |
| 25683 buf[n++] = (u8)((v & 0x7f) | 0x80); |
| 25684 v >>= 7; |
| 25685 }while( v!=0 ); |
| 25686 buf[0] &= 0x7f; |
| 25687 assert( n<=9 ); |
| 25688 for(i=0, j=n-1; j>=0; j--, i++){ |
| 25689 p[i] = buf[j]; |
| 25690 } |
| 25691 return n; |
| 25692 } |
| 25693 |
| 25694 static int sqlite3Fts5PutVarint(unsigned char *p, u64 v){ |
| 25695 if( v<=0x7f ){ |
| 25696 p[0] = v&0x7f; |
| 25697 return 1; |
| 25698 } |
| 25699 if( v<=0x3fff ){ |
| 25700 p[0] = ((v>>7)&0x7f)|0x80; |
| 25701 p[1] = v&0x7f; |
| 25702 return 2; |
| 25703 } |
| 25704 return fts5PutVarint64(p,v); |
| 25705 } |
| 25706 |
| 25707 |
| 25708 static int sqlite3Fts5GetVarintLen(u32 iVal){ |
| 25709 if( iVal<(1 << 7 ) ) return 1; |
| 25710 if( iVal<(1 << 14) ) return 2; |
| 25711 if( iVal<(1 << 21) ) return 3; |
| 25712 if( iVal<(1 << 28) ) return 4; |
| 25713 return 5; |
| 25714 } |
| 25715 |
| 25716 |
| 25717 /* |
| 25718 ** 2015 May 08 |
| 25719 ** |
| 25720 ** The author disclaims copyright to this source code. In place of |
| 25721 ** a legal notice, here is a blessing: |
| 25722 ** |
| 25723 ** May you do good and not evil. |
| 25724 ** May you find forgiveness for yourself and forgive others. |
| 25725 ** May you share freely, never taking more than you give. |
| 25726 ** |
| 25727 ****************************************************************************** |
| 25728 ** |
| 25729 ** This is an SQLite virtual table module implementing direct access to an |
| 25730 ** existing FTS5 index. The module may create several different types of |
| 25731 ** tables: |
| 25732 ** |
| 25733 ** col: |
| 25734 ** CREATE TABLE vocab(term, col, doc, cnt, PRIMARY KEY(term, col)); |
| 25735 ** |
| 25736 ** One row for each term/column combination. The value of $doc is set to |
| 25737 ** the number of fts5 rows that contain at least one instance of term |
| 25738 ** $term within column $col. Field $cnt is set to the total number of |
| 25739 ** instances of term $term in column $col (in any row of the fts5 table). |
| 25740 ** |
| 25741 ** row: |
| 25742 ** CREATE TABLE vocab(term, doc, cnt, PRIMARY KEY(term)); |
| 25743 ** |
| 25744 ** One row for each term in the database. The value of $doc is set to |
| 25745 ** the number of fts5 rows that contain at least one instance of term |
| 25746 ** $term. Field $cnt is set to the total number of instances of term |
| 25747 ** $term in the database. |
| 25748 */ |
| 25749 |
| 25750 |
| 25751 /* #include "fts5Int.h" */ |
| 25752 |
| 25753 |
| 25754 typedef struct Fts5VocabTable Fts5VocabTable; |
| 25755 typedef struct Fts5VocabCursor Fts5VocabCursor; |
| 25756 |
| 25757 struct Fts5VocabTable { |
| 25758 sqlite3_vtab base; |
| 25759 char *zFts5Tbl; /* Name of fts5 table */ |
| 25760 char *zFts5Db; /* Db containing fts5 table */ |
| 25761 sqlite3 *db; /* Database handle */ |
| 25762 Fts5Global *pGlobal; /* FTS5 global object for this database */ |
| 25763 int eType; /* FTS5_VOCAB_COL or ROW */ |
| 25764 }; |
| 25765 |
| 25766 struct Fts5VocabCursor { |
| 25767 sqlite3_vtab_cursor base; |
| 25768 sqlite3_stmt *pStmt; /* Statement holding lock on pIndex */ |
| 25769 Fts5Index *pIndex; /* Associated FTS5 index */ |
| 25770 |
| 25771 int bEof; /* True if this cursor is at EOF */ |
| 25772 Fts5IndexIter *pIter; /* Term/rowid iterator object */ |
| 25773 |
| 25774 int nLeTerm; /* Size of zLeTerm in bytes */ |
| 25775 char *zLeTerm; /* (term <= $zLeTerm) parameter, or NULL */ |
| 25776 |
| 25777 /* These are used by 'col' tables only */ |
| 25778 Fts5Config *pConfig; /* Fts5 table configuration */ |
| 25779 int iCol; |
| 25780 i64 *aCnt; |
| 25781 i64 *aDoc; |
| 25782 |
| 25783 /* Output values used by 'row' and 'col' tables */ |
| 25784 i64 rowid; /* This table's current rowid value */ |
| 25785 Fts5Buffer term; /* Current value of 'term' column */ |
| 25786 }; |
| 25787 |
| 25788 #define FTS5_VOCAB_COL 0 |
| 25789 #define FTS5_VOCAB_ROW 1 |
| 25790 |
| 25791 #define FTS5_VOCAB_COL_SCHEMA "term, col, doc, cnt" |
| 25792 #define FTS5_VOCAB_ROW_SCHEMA "term, doc, cnt" |
| 25793 |
| 25794 /* |
| 25795 ** Bits for the mask used as the idxNum value by xBestIndex/xFilter. |
| 25796 */ |
| 25797 #define FTS5_VOCAB_TERM_EQ 0x01 |
| 25798 #define FTS5_VOCAB_TERM_GE 0x02 |
| 25799 #define FTS5_VOCAB_TERM_LE 0x04 |
| 25800 |
| 25801 |
| 25802 /* |
| 25803 ** Translate a string containing an fts5vocab table type to an |
| 25804 ** FTS5_VOCAB_XXX constant. If successful, set *peType to the output |
| 25805 ** value and return SQLITE_OK. Otherwise, set *pzErr to an error message |
| 25806 ** and return SQLITE_ERROR. |
| 25807 */ |
| 25808 static int fts5VocabTableType(const char *zType, char **pzErr, int *peType){ |
| 25809 int rc = SQLITE_OK; |
| 25810 char *zCopy = sqlite3Fts5Strndup(&rc, zType, -1); |
| 25811 if( rc==SQLITE_OK ){ |
| 25812 sqlite3Fts5Dequote(zCopy); |
| 25813 if( sqlite3_stricmp(zCopy, "col")==0 ){ |
| 25814 *peType = FTS5_VOCAB_COL; |
| 25815 }else |
| 25816 |
| 25817 if( sqlite3_stricmp(zCopy, "row")==0 ){ |
| 25818 *peType = FTS5_VOCAB_ROW; |
| 25819 }else |
| 25820 { |
| 25821 *pzErr = sqlite3_mprintf("fts5vocab: unknown table type: %Q", zCopy); |
| 25822 rc = SQLITE_ERROR; |
| 25823 } |
| 25824 sqlite3_free(zCopy); |
| 25825 } |
| 25826 |
| 25827 return rc; |
| 25828 } |
| 25829 |
| 25830 |
| 25831 /* |
| 25832 ** The xDisconnect() virtual table method. |
| 25833 */ |
| 25834 static int fts5VocabDisconnectMethod(sqlite3_vtab *pVtab){ |
| 25835 Fts5VocabTable *pTab = (Fts5VocabTable*)pVtab; |
| 25836 sqlite3_free(pTab); |
| 25837 return SQLITE_OK; |
| 25838 } |
| 25839 |
| 25840 /* |
| 25841 ** The xDestroy() virtual table method. |
| 25842 */ |
| 25843 static int fts5VocabDestroyMethod(sqlite3_vtab *pVtab){ |
| 25844 Fts5VocabTable *pTab = (Fts5VocabTable*)pVtab; |
| 25845 sqlite3_free(pTab); |
| 25846 return SQLITE_OK; |
| 25847 } |
| 25848 |
| 25849 /* |
| 25850 ** This function is the implementation of both the xConnect and xCreate |
| 25851 ** methods of the fts5vocab virtual table. |
| 25852 ** |
| 25853 ** The argv[] array contains the following: |
| 25854 ** |
| 25855 ** argv[0] -> module name ("fts5vocab") |
| 25856 ** argv[1] -> database name |
| 25857 ** argv[2] -> table name |
| 25858 ** |
| 25859 ** then: |
| 25860 ** |
| 25861 ** argv[3] -> name of fts5 table |
| 25862 ** argv[4] -> type of fts5vocab table |
| 25863 ** |
| 25864 ** or, for tables in the TEMP schema only. |
| 25865 ** |
| 25866 ** argv[3] -> name of fts5 tables database |
| 25867 ** argv[4] -> name of fts5 table |
| 25868 ** argv[5] -> type of fts5vocab table |
| 25869 */ |
| 25870 static int fts5VocabInitVtab( |
| 25871 sqlite3 *db, /* The SQLite database connection */ |
| 25872 void *pAux, /* Pointer to Fts5Global object */ |
| 25873 int argc, /* Number of elements in argv array */ |
| 25874 const char * const *argv, /* xCreate/xConnect argument array */ |
| 25875 sqlite3_vtab **ppVTab, /* Write the resulting vtab structure here */ |
| 25876 char **pzErr /* Write any error message here */ |
| 25877 ){ |
| 25878 const char *azSchema[] = { |
| 25879 "CREATE TABLE vocab(" FTS5_VOCAB_COL_SCHEMA ")", |
| 25880 "CREATE TABLE vocab(" FTS5_VOCAB_ROW_SCHEMA ")" |
| 25881 }; |
| 25882 |
| 25883 Fts5VocabTable *pRet = 0; |
| 25884 int rc = SQLITE_OK; /* Return code */ |
| 25885 int bDb; |
| 25886 |
| 25887 bDb = (argc==6 && strlen(argv[1])==4 && memcmp("temp", argv[1], 4)==0); |
| 25888 |
| 25889 if( argc!=5 && bDb==0 ){ |
| 25890 *pzErr = sqlite3_mprintf("wrong number of vtable arguments"); |
| 25891 rc = SQLITE_ERROR; |
| 25892 }else{ |
| 25893 int nByte; /* Bytes of space to allocate */ |
| 25894 const char *zDb = bDb ? argv[3] : argv[1]; |
| 25895 const char *zTab = bDb ? argv[4] : argv[3]; |
| 25896 const char *zType = bDb ? argv[5] : argv[4]; |
| 25897 int nDb = (int)strlen(zDb)+1; |
| 25898 int nTab = (int)strlen(zTab)+1; |
| 25899 int eType = 0; |
| 25900 |
| 25901 rc = fts5VocabTableType(zType, pzErr, &eType); |
| 25902 if( rc==SQLITE_OK ){ |
| 25903 assert( eType>=0 && eType<sizeof(azSchema)/sizeof(azSchema[0]) ); |
| 25904 rc = sqlite3_declare_vtab(db, azSchema[eType]); |
| 25905 } |
| 25906 |
| 25907 nByte = sizeof(Fts5VocabTable) + nDb + nTab; |
| 25908 pRet = sqlite3Fts5MallocZero(&rc, nByte); |
| 25909 if( pRet ){ |
| 25910 pRet->pGlobal = (Fts5Global*)pAux; |
| 25911 pRet->eType = eType; |
| 25912 pRet->db = db; |
| 25913 pRet->zFts5Tbl = (char*)&pRet[1]; |
| 25914 pRet->zFts5Db = &pRet->zFts5Tbl[nTab]; |
| 25915 memcpy(pRet->zFts5Tbl, zTab, nTab); |
| 25916 memcpy(pRet->zFts5Db, zDb, nDb); |
| 25917 sqlite3Fts5Dequote(pRet->zFts5Tbl); |
| 25918 sqlite3Fts5Dequote(pRet->zFts5Db); |
| 25919 } |
| 25920 } |
| 25921 |
| 25922 *ppVTab = (sqlite3_vtab*)pRet; |
| 25923 return rc; |
| 25924 } |
| 25925 |
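| /* |
| ** Illustration only (not part of the upstream source). Assuming an FTS5 |
| ** table with the hypothetical name "ft", the two argument layouts accepted |
| ** by fts5VocabInitVtab() correspond to statements such as: |
| ** |
| ** CREATE VIRTUAL TABLE ft_v USING fts5vocab(ft, row); |
| ** argc==5: argv[1]=database holding ft_v, argv[3]="ft", argv[4]="row" |
| ** |
| ** CREATE VIRTUAL TABLE temp.ft_v USING fts5vocab(main, ft, col); |
| ** argc==6: argv[1]="temp", argv[3]="main", argv[4]="ft", argv[5]="col" |
| ** |
| ** In the five-argument form zDb is argv[1], so the FTS5 table is looked up |
| ** in the same database as the vocab table itself. |
| */ |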
| 25926 |
| 25927 /* |
| 25928 ** The xConnect() and xCreate() methods for the virtual table. All the |
| 25929 ** work is done in function fts5VocabInitVtab(). |
| 25930 */ |
| 25931 static int fts5VocabConnectMethod( |
| 25932 sqlite3 *db, /* Database connection */ |
| 25933 void *pAux, /* Pointer to Fts5Global object */ |
| 25934 int argc, /* Number of elements in argv array */ |
| 25935 const char * const *argv, /* xCreate/xConnect argument array */ |
| 25936 sqlite3_vtab **ppVtab, /* OUT: New sqlite3_vtab object */ |
| 25937 char **pzErr /* OUT: sqlite3_malloc'd error message */ |
| 25938 ){ |
| 25939 return fts5VocabInitVtab(db, pAux, argc, argv, ppVtab, pzErr); |
| 25940 } |
| 25941 static int fts5VocabCreateMethod( |
| 25942 sqlite3 *db, /* Database connection */ |
| 25943 void *pAux, /* Pointer to Fts5Global object */ |
| 25944 int argc, /* Number of elements in argv array */ |
| 25945 const char * const *argv, /* xCreate/xConnect argument array */ |
| 25946 sqlite3_vtab **ppVtab, /* OUT: New sqlite3_vtab object */ |
| 25947 char **pzErr /* OUT: sqlite3_malloc'd error message */ |
| 25948 ){ |
| 25949 return fts5VocabInitVtab(db, pAux, argc, argv, ppVtab, pzErr); |
| 25950 } |
| 25951 |
| 25952 /* |
| 25953 ** Implementation of the xBestIndex method. |
| 25954 */ |
| 25955 static int fts5VocabBestIndexMethod( |
| 25956 sqlite3_vtab *pVTab, |
| 25957 sqlite3_index_info *pInfo |
| 25958 ){ |
| 25959 int i; |
| 25960 int iTermEq = -1; |
| 25961 int iTermGe = -1; |
| 25962 int iTermLe = -1; |
| 25963 int idxNum = 0; |
| 25964 int nArg = 0; |
| 25965 |
| 25966 for(i=0; i<pInfo->nConstraint; i++){ |
| 25967 struct sqlite3_index_constraint *p = &pInfo->aConstraint[i]; |
| 25968 if( p->usable==0 ) continue; |
| 25969 if( p->iColumn==0 ){ /* term column */ |
| 25970 if( p->op==SQLITE_INDEX_CONSTRAINT_EQ ) iTermEq = i; |
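| 25971 if( p->op==SQLITE_INDEX_CONSTRAINT_LE ) iTermLe = i; |
| 25972 if( p->op==SQLITE_INDEX_CONSTRAINT_LT ) iTermLe = i; /* treated as LE; omit is unset, so SQLite re-tests each row */ |
| 25973 if( p->op==SQLITE_INDEX_CONSTRAINT_GE ) iTermGe = i; |
| 25974 if( p->op==SQLITE_INDEX_CONSTRAINT_GT ) iTermGe = i; /* treated as GE; omit is unset, so SQLite re-tests each row */ |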
| 25975 } |
| 25976 } |
| 25977 |
| 25978 if( iTermEq>=0 ){ |
| 25979 idxNum |= FTS5_VOCAB_TERM_EQ; |
| 25980 pInfo->aConstraintUsage[iTermEq].argvIndex = ++nArg; |
| 25981 pInfo->estimatedCost = 100; |
| 25982 }else{ |
| 25983 pInfo->estimatedCost = 1000000; |
| 25984 if( iTermGe>=0 ){ |
| 25985 idxNum |= FTS5_VOCAB_TERM_GE; |
| 25986 pInfo->aConstraintUsage[iTermGe].argvIndex = ++nArg; |
| 25987 pInfo->estimatedCost = pInfo->estimatedCost / 2; |
| 25988 } |
| 25989 if( iTermLe>=0 ){ |
| 25990 idxNum |= FTS5_VOCAB_TERM_LE; |
| 25991 pInfo->aConstraintUsage[iTermLe].argvIndex = ++nArg; |
| 25992 pInfo->estimatedCost = pInfo->estimatedCost / 2; |
| 25993 } |
| 25994 } |
| 25995 |
| 25996 pInfo->idxNum = idxNum; |
| 25997 |
| 25998 return SQLITE_OK; |
| 25999 } |
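| |
| /* |
| ** Illustration only (not part of the upstream source). Assuming a vocab |
| ** table with the hypothetical name "v", the plan chosen by |
| ** fts5VocabBestIndexMethod() pairs with xFilter as follows: |
| ** |
| ** SELECT * FROM v WHERE term = 'abc'; |
| ** idxNum = FTS5_VOCAB_TERM_EQ, apVal[0] = 'abc' |
| ** |
| ** SELECT * FROM v WHERE term >= 'a' AND term < 'b'; |
| ** idxNum = FTS5_VOCAB_TERM_GE | FTS5_VOCAB_TERM_LE, |
| ** apVal[0] = 'a', apVal[1] = 'b' |
| ** |
| ** An equality constraint takes precedence over any range constraints. |
| ** Otherwise argvIndex is assigned to GE before LE, which is the order in |
| ** which fts5VocabFilterMethod() reads apVal[]. |
| */ |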
| 26000 |
| 26001 /* |
| 26002 ** Implementation of xOpen method. |
| 26003 */ |
| 26004 static int fts5VocabOpenMethod( |
| 26005 sqlite3_vtab *pVTab, |
| 26006 sqlite3_vtab_cursor **ppCsr |
| 26007 ){ |
| 26008 Fts5VocabTable *pTab = (Fts5VocabTable*)pVTab; |
| 26009 Fts5Index *pIndex = 0; |
| 26010 Fts5Config *pConfig = 0; |
| 26011 Fts5VocabCursor *pCsr = 0; |
| 26012 int rc = SQLITE_OK; |
| 26013 sqlite3_stmt *pStmt = 0; |
| 26014 char *zSql = 0; |
| 26015 |
| 26016 zSql = sqlite3Fts5Mprintf(&rc, |
| 26017 "SELECT t.%Q FROM %Q.%Q AS t WHERE t.%Q MATCH '*id'", |
| 26018 pTab->zFts5Tbl, pTab->zFts5Db, pTab->zFts5Tbl, pTab->zFts5Tbl |
| 26019 ); |
| 26020 if( zSql ){ |
| 26021 rc = sqlite3_prepare_v2(pTab->db, zSql, -1, &pStmt, 0); |
| 26022 } |
| 26023 sqlite3_free(zSql); |
| 26024 assert( rc==SQLITE_OK || pStmt==0 ); |
| 26025 if( rc==SQLITE_ERROR ) rc = SQLITE_OK; /* reported below as "no such fts5 table" */ |
| 26026 |
| 26027 if( pStmt && sqlite3_step(pStmt)==SQLITE_ROW ){ |
| 26028 i64 iId = sqlite3_column_int64(pStmt, 0); |
| 26029 pIndex = sqlite3Fts5IndexFromCsrid(pTab->pGlobal, iId, &pConfig); |
| 26030 } |
| 26031 |
| 26032 if( rc==SQLITE_OK && pIndex==0 ){ |
| 26033 rc = sqlite3_finalize(pStmt); |
| 26034 pStmt = 0; |
| 26035 if( rc==SQLITE_OK ){ |
| 26036 pVTab->zErrMsg = sqlite3_mprintf( |
| 26037 "no such fts5 table: %s.%s", pTab->zFts5Db, pTab->zFts5Tbl |
| 26038 ); |
| 26039 rc = SQLITE_ERROR; |
| 26040 } |
| 26041 } |
| 26042 |
| 26043 if( rc==SQLITE_OK ){ |
| 26044 int nByte = pConfig->nCol * sizeof(i64) * 2 + sizeof(Fts5VocabCursor); |
| 26045 pCsr = (Fts5VocabCursor*)sqlite3Fts5MallocZero(&rc, nByte); |
| 26046 } |
| 26047 |
| 26048 if( pCsr ){ |
| 26049 pCsr->pIndex = pIndex; |
| 26050 pCsr->pStmt = pStmt; |
| 26051 pCsr->pConfig = pConfig; |
| 26052 pCsr->aCnt = (i64*)&pCsr[1]; |
| 26053 pCsr->aDoc = &pCsr->aCnt[pConfig->nCol]; |
| 26054 }else{ |
| 26055 sqlite3_finalize(pStmt); |
| 26056 } |
| 26057 |
| 26058 *ppCsr = (sqlite3_vtab_cursor*)pCsr; |
| 26059 return rc; |
| 26060 } |
| 26061 |
| 26062 static void fts5VocabResetCursor(Fts5VocabCursor *pCsr){ |
| 26063 pCsr->rowid = 0; |
| 26064 sqlite3Fts5IterClose(pCsr->pIter); |
| 26065 pCsr->pIter = 0; |
| 26066 sqlite3_free(pCsr->zLeTerm); |
| 26067 pCsr->nLeTerm = -1; /* -1 means no upper bound on term */ |
| 26068 pCsr->zLeTerm = 0; |
| 26069 } |
| 26070 |
| 26071 /* |
| 26072 ** Close the cursor. For additional information see the documentation |
| 26073 ** on the xClose method of the virtual table interface. |
| 26074 */ |
| 26075 static int fts5VocabCloseMethod(sqlite3_vtab_cursor *pCursor){ |
| 26076 Fts5VocabCursor *pCsr = (Fts5VocabCursor*)pCursor; |
| 26077 fts5VocabResetCursor(pCsr); |
| 26078 sqlite3Fts5BufferFree(&pCsr->term); |
| 26079 sqlite3_finalize(pCsr->pStmt); |
| 26080 sqlite3_free(pCsr); |
| 26081 return SQLITE_OK; |
| 26082 } |
| 26083 |
| 26084 |
| 26085 /* |
| 26086 ** Advance the cursor to the next row in the table. |
| 26087 */ |
| 26088 static int fts5VocabNextMethod(sqlite3_vtab_cursor *pCursor){ |
| 26089 Fts5VocabCursor *pCsr = (Fts5VocabCursor*)pCursor; |
| 26090 Fts5VocabTable *pTab = (Fts5VocabTable*)pCursor->pVtab; |
| 26091 int rc = SQLITE_OK; |
| 26092 int nCol = pCsr->pConfig->nCol; |
| 26093 |
| 26094 pCsr->rowid++; |
| 26095 |
| 26096 if( pTab->eType==FTS5_VOCAB_COL ){ |
| 26097 for(pCsr->iCol++; pCsr->iCol<nCol; pCsr->iCol++){ |
| 26098 if( pCsr->aCnt[pCsr->iCol] ) break; |
| 26099 } |
| 26100 } |
| 26101 |
| 26102 if( pTab->eType==FTS5_VOCAB_ROW || pCsr->iCol>=nCol ){ |
| 26103 if( sqlite3Fts5IterEof(pCsr->pIter) ){ |
| 26104 pCsr->bEof = 1; |
| 26105 }else{ |
| 26106 const char *zTerm; |
| 26107 int nTerm; |
| 26108 |
| 26109 zTerm = sqlite3Fts5IterTerm(pCsr->pIter, &nTerm); |
| 26110 if( pCsr->nLeTerm>=0 ){ |
| 26111 int nCmp = MIN(nTerm, pCsr->nLeTerm); |
| 26112 int bCmp = memcmp(pCsr->zLeTerm, zTerm, nCmp); |
| 26113 if( bCmp<0 || (bCmp==0 && pCsr->nLeTerm<nTerm) ){ |
| 26114 pCsr->bEof = 1; |
| 26115 return SQLITE_OK; |
| 26116 } |
| 26117 } |
| 26118 |
| 26119 sqlite3Fts5BufferSet(&rc, &pCsr->term, nTerm, (const u8*)zTerm); |
| 26120 memset(pCsr->aCnt, 0, nCol * sizeof(i64)); |
| 26121 memset(pCsr->aDoc, 0, nCol * sizeof(i64)); |
| 26122 pCsr->iCol = 0; |
| 26123 |
| 26124 assert( pTab->eType==FTS5_VOCAB_COL || pTab->eType==FTS5_VOCAB_ROW ); |
| 26125 while( rc==SQLITE_OK ){ |
| 26126 i64 dummy; |
| 26127 const u8 *pPos; int nPos; /* Position list */ |
| 26128 i64 iPos = 0; /* 64-bit position read from poslist */ |
| 26129 int iOff = 0; /* Current offset within position list */ |
| 26130 |
| 26131 rc = sqlite3Fts5IterPoslist(pCsr->pIter, 0, &pPos, &nPos, &dummy); |
| 26132 if( rc==SQLITE_OK ){ |
| 26133 if( pTab->eType==FTS5_VOCAB_ROW ){ |
| 26134 while( 0==sqlite3Fts5PoslistNext64(pPos, nPos, &iOff, &iPos) ){ |
| 26135 pCsr->aCnt[0]++; |
| 26136 } |
| 26137 pCsr->aDoc[0]++; |
| 26138 }else{ |
| 26139 int iCol = -1; |
| 26140 while( 0==sqlite3Fts5PoslistNext64(pPos, nPos, &iOff, &iPos) ){ |
| 26141 int ii = FTS5_POS2COLUMN(iPos); |
| 26142 pCsr->aCnt[ii]++; |
| 26143 if( iCol!=ii ){ |
| 26144 pCsr->aDoc[ii]++; |
| 26145 iCol = ii; |
| 26146 } |
| 26147 } |
| 26148 } |
| 26149 rc = sqlite3Fts5IterNextScan(pCsr->pIter); |
| 26150 } |
| 26151 |
| 26152 if( rc==SQLITE_OK ){ |
| 26153 zTerm = sqlite3Fts5IterTerm(pCsr->pIter, &nTerm); |
| 26154 if( nTerm!=pCsr->term.n || memcmp(zTerm, pCsr->term.p, nTerm) ){ |
| 26155 break; |
| 26156 } |
| 26157 if( sqlite3Fts5IterEof(pCsr->pIter) ) break; |
| 26158 } |
| 26159 } |
| 26160 } |
| 26161 } |
| 26162 |
| 26163 if( pCsr->bEof==0 && pTab->eType==FTS5_VOCAB_COL ){ |
| 26164 while( pCsr->aCnt[pCsr->iCol]==0 ) pCsr->iCol++; |
| 26165 assert( pCsr->iCol<pCsr->pConfig->nCol ); |
| 26166 } |
| 26167 return rc; |
| 26168 } |
| 26169 |
| 26170 /* |
| 26171 ** This is the xFilter implementation for the virtual table. |
| 26172 */ |
| 26173 static int fts5VocabFilterMethod( |
| 26174 sqlite3_vtab_cursor *pCursor, /* The cursor used for this query */ |
| 26175 int idxNum, /* Strategy index */ |
| 26176 const char *idxStr, /* Unused */ |
| 26177 int nVal, /* Number of elements in apVal */ |
| 26178 sqlite3_value **apVal /* Arguments for the indexing scheme */ |
| 26179 ){ |
| 26180 Fts5VocabCursor *pCsr = (Fts5VocabCursor*)pCursor; |
| 26181 int rc = SQLITE_OK; |
| 26182 |
| 26183 int iVal = 0; |
| 26184 int f = FTS5INDEX_QUERY_SCAN; |
| 26185 const char *zTerm = 0; |
| 26186 int nTerm = 0; |
| 26187 |
| 26188 sqlite3_value *pEq = 0; |
| 26189 sqlite3_value *pGe = 0; |
| 26190 sqlite3_value *pLe = 0; |
| 26191 |
| 26192 fts5VocabResetCursor(pCsr); |
| 26193 if( idxNum & FTS5_VOCAB_TERM_EQ ) pEq = apVal[iVal++]; |
| 26194 if( idxNum & FTS5_VOCAB_TERM_GE ) pGe = apVal[iVal++]; |
| 26195 if( idxNum & FTS5_VOCAB_TERM_LE ) pLe = apVal[iVal++]; |
| 26196 |
| 26197 if( pEq ){ |
| 26198 zTerm = (const char *)sqlite3_value_text(pEq); |
| 26199 nTerm = sqlite3_value_bytes(pEq); |
| 26200 f = 0; /* exact-term lookup: clear FTS5INDEX_QUERY_SCAN */ |
| 26201 }else{ |
| 26202 if( pGe ){ |
| 26203 zTerm = (const char *)sqlite3_value_text(pGe); |
| 26204 nTerm = sqlite3_value_bytes(pGe); |
| 26205 } |
| 26206 if( pLe ){ |
| 26207 const char *zCopy = (const char *)sqlite3_value_text(pLe); |
| 26208 pCsr->nLeTerm = sqlite3_value_bytes(pLe); |
| 26209 pCsr->zLeTerm = sqlite3_malloc(pCsr->nLeTerm+1); |
| 26210 if( pCsr->zLeTerm==0 ){ |
| 26211 rc = SQLITE_NOMEM; |
| 26212 }else{ |
| 26213 memcpy(pCsr->zLeTerm, zCopy ? zCopy : "", pCsr->nLeTerm+1); /* zCopy is NULL for an SQL NULL bound */ |
| 26214 } |
| 26215 } |
| 26216 } |
| 26217 |
| 26218 |
| 26219 if( rc==SQLITE_OK ){ |
| 26220 rc = sqlite3Fts5IndexQuery(pCsr->pIndex, zTerm, nTerm, f, 0, &pCsr->pIter); |
| 26221 } |
| 26222 if( rc==SQLITE_OK ){ |
| 26223 rc = fts5VocabNextMethod(pCursor); |
| 26224 } |
| 26225 |
| 26226 return rc; |
| 26227 } |
| 26228 |
| 26229 /* |
| 26230 ** This is the xEof method of the virtual table. SQLite calls this |
| 26231 ** routine to find out if it has reached the end of a result set. |
| 26232 */ |
| 26233 static int fts5VocabEofMethod(sqlite3_vtab_cursor *pCursor){ |
| 26234 Fts5VocabCursor *pCsr = (Fts5VocabCursor*)pCursor; |
| 26235 return pCsr->bEof; |
| 26236 } |
| 26237 |
| 26238 static int fts5VocabColumnMethod( |
| 26239 sqlite3_vtab_cursor *pCursor, /* Cursor to retrieve value from */ |
| 26240 sqlite3_context *pCtx, /* Context for sqlite3_result_xxx() calls */ |
| 26241 int iCol /* Index of column to read value from */ |
| 26242 ){ |
| 26243 Fts5VocabCursor *pCsr = (Fts5VocabCursor*)pCursor; |
| 26244 |
| 26245 if( iCol==0 ){ |
| 26246 sqlite3_result_text( |
| 26247 pCtx, (const char*)pCsr->term.p, pCsr->term.n, SQLITE_TRANSIENT |
| 26248 ); |
| 26249 } |
| 26250 else if( ((Fts5VocabTable*)(pCursor->pVtab))->eType==FTS5_VOCAB_COL ){ |
| 26251 assert( iCol==1 || iCol==2 || iCol==3 ); |
| 26252 if( iCol==1 ){ |
| 26253 const char *z = pCsr->pConfig->azCol[pCsr->iCol]; |
| 26254 sqlite3_result_text(pCtx, z, -1, SQLITE_STATIC); |
| 26255 }else if( iCol==2 ){ |
| 26256 sqlite3_result_int64(pCtx, pCsr->aDoc[pCsr->iCol]); |
| 26257 }else{ |
| 26258 sqlite3_result_int64(pCtx, pCsr->aCnt[pCsr->iCol]); |
| 26259 } |
| 26260 }else{ |
| 26261 assert( iCol==1 || iCol==2 ); |
| 26262 if( iCol==1 ){ |
| 26263 sqlite3_result_int64(pCtx, pCsr->aDoc[0]); |
| 26264 }else{ |
| 26265 sqlite3_result_int64(pCtx, pCsr->aCnt[0]); |
| 26266 } |
| 26267 } |
| 26268 return SQLITE_OK; |
| 26269 } |
| 26270 |
| 26271 /* |
| 26272 ** This is the xRowid method. The SQLite core calls this routine to |
| 26273 ** retrieve the rowid for the current row of the result set. The |
| 26274 ** rowid should be written to *pRowid. |
| 26275 */ |
| 26276 static int fts5VocabRowidMethod( |
| 26277 sqlite3_vtab_cursor *pCursor, |
| 26278 sqlite_int64 *pRowid |
| 26279 ){ |
| 26280 Fts5VocabCursor *pCsr = (Fts5VocabCursor*)pCursor; |
| 26281 *pRowid = pCsr->rowid; |
| 26282 return SQLITE_OK; |
| 26283 } |
| 26284 |
| 26285 static int sqlite3Fts5VocabInit(Fts5Global *pGlobal, sqlite3 *db){ |
| 26286 static const sqlite3_module fts5Vocab = { |
| 26287 /* iVersion */ 2, |
| 26288 /* xCreate */ fts5VocabCreateMethod, |
| 26289 /* xConnect */ fts5VocabConnectMethod, |
| 26290 /* xBestIndex */ fts5VocabBestIndexMethod, |
| 26291 /* xDisconnect */ fts5VocabDisconnectMethod, |
| 26292 /* xDestroy */ fts5VocabDestroyMethod, |
| 26293 /* xOpen */ fts5VocabOpenMethod, |
| 26294 /* xClose */ fts5VocabCloseMethod, |
| 26295 /* xFilter */ fts5VocabFilterMethod, |
| 26296 /* xNext */ fts5VocabNextMethod, |
| 26297 /* xEof */ fts5VocabEofMethod, |
| 26298 /* xColumn */ fts5VocabColumnMethod, |
| 26299 /* xRowid */ fts5VocabRowidMethod, |
| 26300 /* xUpdate */ 0, |
| 26301 /* xBegin */ 0, |
| 26302 /* xSync */ 0, |
| 26303 /* xCommit */ 0, |
| 26304 /* xRollback */ 0, |
| 26305 /* xFindFunction */ 0, |
| 26306 /* xRename */ 0, |
| 26307 /* xSavepoint */ 0, |
| 26308 /* xRelease */ 0, |
| 26309 /* xRollbackTo */ 0, |
| 26310 }; |
| 26311 void *p = (void*)pGlobal; |
| 26312 |
| 26313 return sqlite3_create_module_v2(db, "fts5vocab", &fts5Vocab, p, 0); |
| 26314 } |
| 26315 |
| 26316 |
| 26317 |
| 26318 |
| 26319 |
| 26320 #endif /* !defined(SQLITE_CORE) || defined(SQLITE_ENABLE_FTS5) */ |
| 26321 |
| 26322 /************** End of fts5.c ************************************************/ |