OLD | NEW |
| (Empty) |
1 /* | |
2 ** 2014 August 30 | |
3 ** | |
4 ** The author disclaims copyright to this source code. In place of | |
5 ** a legal notice, here is a blessing: | |
6 ** | |
7 ** May you do good and not evil. | |
8 ** May you find forgiveness for yourself and forgive others. | |
9 ** May you share freely, never taking more than you give. | |
10 ** | |
11 ************************************************************************* | |
12 ** | |
13 ** This file contains the public interface for the RBU extension. | |
14 */ | |
15 | |
16 /* | |
17 ** SUMMARY | |
18 ** | |
19 ** Writing a transaction containing a large number of operations on | |
20 ** b-tree indexes that are collectively larger than the available cache | |
21 ** memory can be very inefficient. | |
22 ** | |
23 ** The problem is that in order to update a b-tree, the leaf page (at least) | |
24 ** containing the entry being inserted or deleted must be modified. If the | |
25 ** working set of leaves is larger than the available cache memory, then a | |
26 ** single leaf that is modified more than once as part of the transaction | |
27 ** may be loaded from or written to the persistent media multiple times. | |
28 ** Additionally, because the index updates are likely to be applied in | |
29 ** random order, access to pages within the database is also likely to be in | |
30 ** random order, which is itself quite inefficient. | |
31 ** | |
32 ** One way to improve the situation is to sort the operations on each index | |
33 ** by index key before applying them to the b-tree. This leads to an IO | |
34 ** pattern that resembles a single linear scan through the index b-tree, | |
35 ** and all but guarantees each modified leaf page is loaded and stored | |
36 ** exactly once. SQLite uses this trick to improve the performance of | |
37 ** CREATE INDEX commands. This extension allows it to be used to improve | |
38 ** the performance of large transactions on existing databases. | |
39 ** | |
40 ** Additionally, this extension allows the work involved in writing the | |
41 ** large transaction to be broken down into sub-transactions performed | |
42 ** sequentially by separate processes. This is useful if the system cannot | |
43 ** guarantee that a single update process will run for long enough to apply | |
44 ** the entire update, for example because the update is being applied on a | |
45 ** mobile device that is frequently rebooted. Even after the writer process | |
46 ** has committed one or more sub-transactions, other database clients continue | |
47 ** to read from the original database snapshot. In other words, partially | |
48 ** applied transactions are not visible to other clients. | |
49 ** | |
50 ** "RBU" stands for "Resumable Bulk Update". As in a large database update | |
51 ** transmitted via a wireless network to a mobile device. A transaction | |
52 ** applied using this extension is hence refered to as an "RBU update". | |
53 ** | |
54 ** | |
55 ** LIMITATIONS | |
56 ** | |
57 ** An "RBU update" transaction is subject to the following limitations: | |
58 ** | |
59 ** * The transaction must consist of INSERT, UPDATE and DELETE operations | |
60 ** only. | |
61 ** | |
62 ** * INSERT statements may not use any default values. | |
63 ** | |
64 ** * UPDATE and DELETE statements must identify their target rows by | |
65 ** non-NULL PRIMARY KEY values. Rows with NULL values stored in PRIMARY | |
66 ** KEY fields may not be updated or deleted. If the table being written | |
67 ** has no PRIMARY KEY, affected rows must be identified by rowid. | |
68 ** | |
69 ** * UPDATE statements may not modify PRIMARY KEY columns. | |
70 ** | |
71 ** * No triggers will be fired. | |
72 ** | |
73 ** * No foreign key violations are detected or reported. | |
74 ** | |
75 ** * CHECK constraints are not enforced. | |
76 ** | |
77 ** * No constraint handling mode except for "OR ROLLBACK" is supported. | |
78 ** | |
79 ** | |
80 ** PREPARATION | |
81 ** | |
82 ** An "RBU update" is stored as a separate SQLite database. A database | |
83 ** containing an RBU update is an "RBU database". For each table in the | |
84 ** target database to be updated, the RBU database should contain a table | |
85 ** named "data_<target name>" containing the same set of columns as the | |
86 ** target table, and one more - "rbu_control". The data_% table should | |
87 ** have no PRIMARY KEY or UNIQUE constraints, but each column should have | |
88 ** the same type as the corresponding column in the target database. | |
89 ** The "rbu_control" column should have no type at all. For example, if | |
90 ** the target database contains: | |
91 ** | |
92 ** CREATE TABLE t1(a INTEGER PRIMARY KEY, b TEXT, c UNIQUE); | |
93 ** | |
94 ** Then the RBU database should contain: | |
95 ** | |
96 ** CREATE TABLE data_t1(a INTEGER, b TEXT, c, rbu_control); | |
97 ** | |
98 ** The order of the columns in the data_% table does not matter. | |
99 ** | |
100 ** Instead of a regular table, the RBU database may also contain virtual | |
101 ** tables or view named using the data_<target> naming scheme. | |
102 ** | |
103 ** Instead of the plain data_<target> naming scheme, RBU database tables | |
104 ** may also be named data<integer>_<target>, where <integer> is any sequence | |
105 ** of zero or more numeric characters (0-9). This can be significant because | |
106 ** tables within the RBU database are always processed in order sorted by | |
107 ** name. By judicious selection of the the <integer> portion of the names | |
108 ** of the RBU tables the user can therefore control the order in which they | |
109 ** are processed. This can be useful, for example, to ensure that "external | |
110 ** content" FTS4 tables are updated before their underlying content tables. | |
111 ** | |
112 ** If the target database table is a virtual table or a table that has no | |
113 ** PRIMARY KEY declaration, the data_% table must also contain a column | |
114 ** named "rbu_rowid". This column is mapped to the tables implicit primary | |
115 ** key column - "rowid". Virtual tables for which the "rowid" column does | |
116 ** not function like a primary key value cannot be updated using RBU. For | |
117 ** example, if the target db contains either of the following: | |
118 ** | |
119 ** CREATE VIRTUAL TABLE x1 USING fts3(a, b); | |
120 ** CREATE TABLE x1(a, b) | |
121 ** | |
122 ** then the RBU database should contain: | |
123 ** | |
124 ** CREATE TABLE data_x1(a, b, rbu_rowid, rbu_control); | |
125 ** | |
126 ** All non-hidden columns (i.e. all columns matched by "SELECT *") of the | |
127 ** target table must be present in the input table. For virtual tables, | |
128 ** hidden columns are optional - they are updated by RBU if present in | |
129 ** the input table, or not otherwise. For example, to write to an fts4 | |
130 ** table with a hidden languageid column such as: | |
131 ** | |
132 ** CREATE VIRTUAL TABLE ft1 USING fts4(a, b, languageid='langid'); | |
133 ** | |
134 ** Either of the following input table schemas may be used: | |
135 ** | |
136 ** CREATE TABLE data_ft1(a, b, langid, rbu_rowid, rbu_control); | |
137 ** CREATE TABLE data_ft1(a, b, rbu_rowid, rbu_control); | |
138 ** | |
139 ** For each row to INSERT into the target database as part of the RBU | |
140 ** update, the corresponding data_% table should contain a single record | |
141 ** with the "rbu_control" column set to contain integer value 0. The | |
142 ** other columns should be set to the values that make up the new record | |
143 ** to insert. | |
144 ** | |
145 ** If the target database table has an INTEGER PRIMARY KEY, it is not | |
146 ** possible to insert a NULL value into the IPK column. Attempting to | |
147 ** do so results in an SQLITE_MISMATCH error. | |
148 ** | |
149 ** For each row to DELETE from the target database as part of the RBU | |
150 ** update, the corresponding data_% table should contain a single record | |
151 ** with the "rbu_control" column set to contain integer value 1. The | |
152 ** real primary key values of the row to delete should be stored in the | |
153 ** corresponding columns of the data_% table. The values stored in the | |
154 ** other columns are not used. | |
155 ** | |
156 ** For each row to UPDATE from the target database as part of the RBU | |
157 ** update, the corresponding data_% table should contain a single record | |
158 ** with the "rbu_control" column set to contain a value of type text. | |
159 ** The real primary key values identifying the row to update should be | |
160 ** stored in the corresponding columns of the data_% table row, as should | |
161 ** the new values of all columns being update. The text value in the | |
162 ** "rbu_control" column must contain the same number of characters as | |
163 ** there are columns in the target database table, and must consist entirely | |
164 ** of 'x' and '.' characters (or in some special cases 'd' - see below). For | |
165 ** each column that is being updated, the corresponding character is set to | |
166 ** 'x'. For those that remain as they are, the corresponding character of the | |
167 ** rbu_control value should be set to '.'. For example, given the tables | |
168 ** above, the update statement: | |
169 ** | |
170 ** UPDATE t1 SET c = 'usa' WHERE a = 4; | |
171 ** | |
172 ** is represented by the data_t1 row created by: | |
173 ** | |
174 ** INSERT INTO data_t1(a, b, c, rbu_control) VALUES(4, NULL, 'usa', '..x'); | |
175 ** | |
176 ** Instead of an 'x' character, characters of the rbu_control value specified | |
177 ** for UPDATEs may also be set to 'd'. In this case, instead of updating the | |
178 ** target table with the value stored in the corresponding data_% column, the | |
179 ** user-defined SQL function "rbu_delta()" is invoked and the result stored in | |
180 ** the target table column. rbu_delta() is invoked with two arguments - the | |
181 ** original value currently stored in the target table column and the | |
182 ** value specified in the data_xxx table. | |
183 ** | |
184 ** For example, this row: | |
185 ** | |
186 ** INSERT INTO data_t1(a, b, c, rbu_control) VALUES(4, NULL, 'usa', '..d'); | |
187 ** | |
188 ** is similar to an UPDATE statement such as: | |
189 ** | |
190 ** UPDATE t1 SET c = rbu_delta(c, 'usa') WHERE a = 4; | |
191 ** | |
192 ** Finally, if an 'f' character appears in place of a 'd' or 's' in an | |
193 ** ota_control string, the contents of the data_xxx table column is assumed | |
194 ** to be a "fossil delta" - a patch to be applied to a blob value in the | |
195 ** format used by the fossil source-code management system. In this case | |
196 ** the existing value within the target database table must be of type BLOB. | |
197 ** It is replaced by the result of applying the specified fossil delta to | |
198 ** itself. | |
199 ** | |
200 ** If the target database table is a virtual table or a table with no PRIMARY | |
201 ** KEY, the rbu_control value should not include a character corresponding | |
202 ** to the rbu_rowid value. For example, this: | |
203 ** | |
204 ** INSERT INTO data_ft1(a, b, rbu_rowid, rbu_control) | |
205 ** VALUES(NULL, 'usa', 12, '.x'); | |
206 ** | |
207 ** causes a result similar to: | |
208 ** | |
209 ** UPDATE ft1 SET b = 'usa' WHERE rowid = 12; | |
210 ** | |
211 ** The data_xxx tables themselves should have no PRIMARY KEY declarations. | |
212 ** However, RBU is more efficient if reading the rows in from each data_xxx | |
213 ** table in "rowid" order is roughly the same as reading them sorted by | |
214 ** the PRIMARY KEY of the corresponding target database table. In other | |
215 ** words, rows should be sorted using the destination table PRIMARY KEY | |
216 ** fields before they are inserted into the data_xxx tables. | |
217 ** | |
218 ** USAGE | |
219 ** | |
220 ** The API declared below allows an application to apply an RBU update | |
221 ** stored on disk to an existing target database. Essentially, the | |
222 ** application: | |
223 ** | |
224 ** 1) Opens an RBU handle using the sqlite3rbu_open() function. | |
225 ** | |
226 ** 2) Registers any required virtual table modules with the database | |
227 ** handle returned by sqlite3rbu_db(). Also, if required, register | |
228 ** the rbu_delta() implementation. | |
229 ** | |
230 ** 3) Calls the sqlite3rbu_step() function one or more times on | |
231 ** the new handle. Each call to sqlite3rbu_step() performs a single | |
232 ** b-tree operation, so thousands of calls may be required to apply | |
233 ** a complete update. | |
234 ** | |
235 ** 4) Calls sqlite3rbu_close() to close the RBU update handle. If | |
236 ** sqlite3rbu_step() has been called enough times to completely | |
237 ** apply the update to the target database, then the RBU database | |
238 ** is marked as fully applied. Otherwise, the state of the RBU | |
239 ** update application is saved in the RBU database for later | |
240 ** resumption. | |
241 ** | |
242 ** See comments below for more detail on APIs. | |
243 ** | |
244 ** If an update is only partially applied to the target database by the | |
245 ** time sqlite3rbu_close() is called, various state information is saved | |
246 ** within the RBU database. This allows subsequent processes to automatically | |
247 ** resume the RBU update from where it left off. | |
248 ** | |
249 ** To remove all RBU extension state information, returning an RBU database | |
250 ** to its original contents, it is sufficient to drop all tables that begin | |
251 ** with the prefix "rbu_" | |
252 ** | |
253 ** DATABASE LOCKING | |
254 ** | |
255 ** An RBU update may not be applied to a database in WAL mode. Attempting | |
256 ** to do so is an error (SQLITE_ERROR). | |
257 ** | |
258 ** While an RBU handle is open, a SHARED lock may be held on the target | |
259 ** database file. This means it is possible for other clients to read the | |
260 ** database, but not to write it. | |
261 ** | |
262 ** If an RBU update is started and then suspended before it is completed, | |
263 ** then an external client writes to the database, then attempting to resume | |
264 ** the suspended RBU update is also an error (SQLITE_BUSY). | |
265 */ | |
266 | |
267 #ifndef _SQLITE3RBU_H | |
268 #define _SQLITE3RBU_H | |
269 | |
270 #include "sqlite3.h" /* Required for error code definitions */ | |
271 | |
272 #ifdef __cplusplus | |
273 extern "C" { | |
274 #endif | |
275 | |
276 typedef struct sqlite3rbu sqlite3rbu; | |
277 | |
278 /* | |
279 ** Open an RBU handle. | |
280 ** | |
281 ** Argument zTarget is the path to the target database. Argument zRbu is | |
282 ** the path to the RBU database. Each call to this function must be matched | |
283 ** by a call to sqlite3rbu_close(). When opening the databases, RBU passes | |
284 ** the SQLITE_CONFIG_URI flag to sqlite3_open_v2(). So if either zTarget | |
285 ** or zRbu begin with "file:", it will be interpreted as an SQLite | |
286 ** database URI, not a regular file name. | |
287 ** | |
288 ** If the zState argument is passed a NULL value, the RBU extension stores | |
289 ** the current state of the update (how many rows have been updated, which | |
290 ** indexes are yet to be updated etc.) within the RBU database itself. This | |
291 ** can be convenient, as it means that the RBU application does not need to | |
292 ** organize removing a separate state file after the update is concluded. | |
293 ** Or, if zState is non-NULL, it must be a path to a database file in which | |
294 ** the RBU extension can store the state of the update. | |
295 ** | |
296 ** When resuming an RBU update, the zState argument must be passed the same | |
297 ** value as when the RBU update was started. | |
298 ** | |
299 ** Once the RBU update is finished, the RBU extension does not | |
300 ** automatically remove any zState database file, even if it created it. | |
301 ** | |
302 ** By default, RBU uses the default VFS to access the files on disk. To | |
303 ** use a VFS other than the default, an SQLite "file:" URI containing a | |
304 ** "vfs=..." option may be passed as the zTarget option. | |
305 ** | |
306 ** IMPORTANT NOTE FOR ZIPVFS USERS: The RBU extension works with all of | |
307 ** SQLite's built-in VFSs, including the multiplexor VFS. However it does | |
308 ** not work out of the box with zipvfs. Refer to the comment describing | |
309 ** the zipvfs_create_vfs() API below for details on using RBU with zipvfs. | |
310 */ | |
311 sqlite3rbu *sqlite3rbu_open( | |
312 const char *zTarget, | |
313 const char *zRbu, | |
314 const char *zState | |
315 ); | |
316 | |
317 /* | |
318 ** Internally, each RBU connection uses a separate SQLite database | |
319 ** connection to access the target and rbu update databases. This | |
320 ** API allows the application direct access to these database handles. | |
321 ** | |
322 ** The first argument passed to this function must be a valid, open, RBU | |
323 ** handle. The second argument should be passed zero to access the target | |
324 ** database handle, or non-zero to access the rbu update database handle. | |
325 ** Accessing the underlying database handles may be useful in the | |
326 ** following scenarios: | |
327 ** | |
328 ** * If any target tables are virtual tables, it may be necessary to | |
329 ** call sqlite3_create_module() on the target database handle to | |
330 ** register the required virtual table implementations. | |
331 ** | |
332 ** * If the data_xxx tables in the RBU source database are virtual | |
333 ** tables, the application may need to call sqlite3_create_module() on | |
334 ** the rbu update db handle to any required virtual table | |
335 ** implementations. | |
336 ** | |
337 ** * If the application uses the "rbu_delta()" feature described above, | |
338 ** it must use sqlite3_create_function() or similar to register the | |
339 ** rbu_delta() implementation with the target database handle. | |
340 ** | |
341 ** If an error has occurred, either while opening or stepping the RBU object, | |
342 ** this function may return NULL. The error code and message may be collected | |
343 ** when sqlite3rbu_close() is called. | |
344 ** | |
345 ** Database handles returned by this function remain valid until the next | |
346 ** call to any sqlite3rbu_xxx() function other than sqlite3rbu_db(). | |
347 */ | |
348 sqlite3 *sqlite3rbu_db(sqlite3rbu*, int bRbu); | |
349 | |
350 /* | |
351 ** Do some work towards applying the RBU update to the target db. | |
352 ** | |
353 ** Return SQLITE_DONE if the update has been completely applied, or | |
354 ** SQLITE_OK if no error occurs but there remains work to do to apply | |
355 ** the RBU update. If an error does occur, some other error code is | |
356 ** returned. | |
357 ** | |
358 ** Once a call to sqlite3rbu_step() has returned a value other than | |
359 ** SQLITE_OK, all subsequent calls on the same RBU handle are no-ops | |
360 ** that immediately return the same value. | |
361 */ | |
362 int sqlite3rbu_step(sqlite3rbu *pRbu); | |
363 | |
364 /* | |
365 ** Force RBU to save its state to disk. | |
366 ** | |
367 ** If a power failure or application crash occurs during an update, following | |
368 ** system recovery RBU may resume the update from the point at which the state | |
369 ** was last saved. In other words, from the most recent successful call to | |
370 ** sqlite3rbu_close() or this function. | |
371 ** | |
372 ** SQLITE_OK is returned if successful, or an SQLite error code otherwise. | |
373 */ | |
374 int sqlite3rbu_savestate(sqlite3rbu *pRbu); | |
375 | |
376 /* | |
377 ** Close an RBU handle. | |
378 ** | |
379 ** If the RBU update has been completely applied, mark the RBU database | |
380 ** as fully applied. Otherwise, assuming no error has occurred, save the | |
381 ** current state of the RBU update appliation to the RBU database. | |
382 ** | |
383 ** If an error has already occurred as part of an sqlite3rbu_step() | |
384 ** or sqlite3rbu_open() call, or if one occurs within this function, an | |
385 ** SQLite error code is returned. Additionally, *pzErrmsg may be set to | |
386 ** point to a buffer containing a utf-8 formatted English language error | |
387 ** message. It is the responsibility of the caller to eventually free any | |
388 ** such buffer using sqlite3_free(). | |
389 ** | |
390 ** Otherwise, if no error occurs, this function returns SQLITE_OK if the | |
391 ** update has been partially applied, or SQLITE_DONE if it has been | |
392 ** completely applied. | |
393 */ | |
394 int sqlite3rbu_close(sqlite3rbu *pRbu, char **pzErrmsg); | |
395 | |
396 /* | |
397 ** Return the total number of key-value operations (inserts, deletes or | |
398 ** updates) that have been performed on the target database since the | |
399 ** current RBU update was started. | |
400 */ | |
401 sqlite3_int64 sqlite3rbu_progress(sqlite3rbu *pRbu); | |
402 | |
403 /* | |
404 ** Create an RBU VFS named zName that accesses the underlying file-system | |
405 ** via existing VFS zParent. Or, if the zParent parameter is passed NULL, | |
406 ** then the new RBU VFS uses the default system VFS to access the file-system. | |
407 ** The new object is registered as a non-default VFS with SQLite before | |
408 ** returning. | |
409 ** | |
410 ** Part of the RBU implementation uses a custom VFS object. Usually, this | |
411 ** object is created and deleted automatically by RBU. | |
412 ** | |
413 ** The exception is for applications that also use zipvfs. In this case, | |
414 ** the custom VFS must be explicitly created by the user before the RBU | |
415 ** handle is opened. The RBU VFS should be installed so that the zipvfs | |
416 ** VFS uses the RBU VFS, which in turn uses any other VFS layers in use | |
417 ** (for example multiplexor) to access the file-system. For example, | |
418 ** to assemble an RBU enabled VFS stack that uses both zipvfs and | |
419 ** multiplexor (error checking omitted): | |
420 ** | |
421 ** // Create a VFS named "multiplex" (not the default). | |
422 ** sqlite3_multiplex_initialize(0, 0); | |
423 ** | |
424 ** // Create an rbu VFS named "rbu" that uses multiplexor. If the | |
425 ** // second argument were replaced with NULL, the "rbu" VFS would | |
426 ** // access the file-system via the system default VFS, bypassing the | |
427 ** // multiplexor. | |
428 ** sqlite3rbu_create_vfs("rbu", "multiplex"); | |
429 ** | |
430 ** // Create a zipvfs VFS named "zipvfs" that uses rbu. | |
431 ** zipvfs_create_vfs_v3("zipvfs", "rbu", 0, xCompressorAlgorithmDetector); | |
432 ** | |
433 ** // Make zipvfs the default VFS. | |
434 ** sqlite3_vfs_register(sqlite3_vfs_find("zipvfs"), 1); | |
435 ** | |
436 ** Because the default VFS created above includes a RBU functionality, it | |
437 ** may be used by RBU clients. Attempting to use RBU with a zipvfs VFS stack | |
438 ** that does not include the RBU layer results in an error. | |
439 ** | |
440 ** The overhead of adding the "rbu" VFS to the system is negligible for | |
441 ** non-RBU users. There is no harm in an application accessing the | |
442 ** file-system via "rbu" all the time, even if it only uses RBU functionality | |
443 ** occasionally. | |
444 */ | |
445 int sqlite3rbu_create_vfs(const char *zName, const char *zParent); | |
446 | |
447 /* | |
448 ** Deregister and destroy an RBU vfs created by an earlier call to | |
449 ** sqlite3rbu_create_vfs(). | |
450 ** | |
451 ** VFS objects are not reference counted. If a VFS object is destroyed | |
452 ** before all database handles that use it have been closed, the results | |
453 ** are undefined. | |
454 */ | |
455 void sqlite3rbu_destroy_vfs(const char *zName); | |
456 | |
457 #ifdef __cplusplus | |
458 } /* end of the 'extern "C"' block */ | |
459 #endif | |
460 | |
461 #endif /* _SQLITE3RBU_H */ | |
OLD | NEW |