third_party/sqlite/src/src/recover.c - Issue 694353003: Get `gn gen` to succeed on Windows

Side by Side Diff: third_party/sqlite/src/src/recover.c

Issue 694353003: Get `gn gen` to succeed on Windows (Closed) Base URL: https://github.com/domokit/mojo.git@master

Patch Set: remove GYP_DEFINES code Created 6 years, 1 month ago

Use n/p to move between diff chunks; N/P to move between comments. Draft comments are only viewable by you.

Jump to:

OLD	NEW
(Empty)
	1 /*

	2 ** 2012 Jan 11

	3 **

	4 ** The author disclaims copyright to this source code. In place of

	5 ** a legal notice, here is a blessing:

	6 **

	7 ** May you do good and not evil.

	8 ** May you find forgiveness for yourself and forgive others.

	9 ** May you share freely, never taking more than you give.

	10 */

	11 /* TODO(shess): THIS MODULE IS STILL EXPERIMENTAL. DO NOT USE IT. */

	12 /* Implements a virtual table "recover" which can be used to recover

	13 * data from a corrupt table. The table is walked manually, with

	14 * corrupt items skipped. Additionally, any errors while reading will

	15 * be skipped.

	16 *

	17 * Given a table with this definition:

	18 *

	19 * CREATE TABLE Stuff (

	20 * name TEXT PRIMARY KEY,

	21 * value TEXT NOT NULL

	22 * );

	23 *

	24 * to recover the data from teh table, you could do something like:

	25 *

	26 * -- Attach another database, the original is not trustworthy.

	27 * ATTACH DATABASE '/tmp/db.db' AS rdb;

	28 * -- Create a new version of the table.

	29 * CREATE TABLE rdb.Stuff (

	30 * name TEXT PRIMARY KEY,

	31 * value TEXT NOT NULL

	32 * );

	33 * -- This will read the original table's data.

	34 * CREATE VIRTUAL TABLE temp.recover_Stuff using recover(

	35 * main.Stuff,

	36 * name TEXT STRICT NOT NULL, -- only real TEXT data allowed

	37 * value TEXT STRICT NOT NULL

	38 * );

	39 * -- Corruption means the UNIQUE constraint may no longer hold for

	40 * -- Stuff, so either OR REPLACE or OR IGNORE must be used.

	41 * INSERT OR REPLACE INTO rdb.Stuff (rowid, name, value )

	42 * SELECT rowid, name, value FROM temp.recover_Stuff;

	43 * DROP TABLE temp.recover_Stuff;

	44 * DETACH DATABASE rdb;

	45 * -- Move db.db to replace original db in filesystem.

	46 *

	47 *

	48 * Usage

	49 *

	50 * Given the goal of dealing with corruption, it would not be safe to

	51 * create a recovery table in the database being recovered. So

	52 * recovery tables must be created in the temp database. They are not

	53 * appropriate to persist, in any case. [As a bonus, sqlite_master

	54 * tables can be recovered. Perhaps more cute than useful, though.]

	55 *

	56 * The parameters are a specifier for the table to read, and a column

	57 * definition for each bit of data stored in that table. The named

	58 * table must be convertable to a root page number by reading the

	59 * sqlite_master table. Bare table names are assumed to be in

	60 * database 0 ("main"), other databases can be specified in db.table

	61 * fashion.

	62 *

	63 * Column definitions are similar to BUT NOT THE SAME AS those

	64 * provided to CREATE statements:

	65 * column-def: column-name [type-name [STRICT] [NOT NULL]]

	66 * type-name: (ANY\|ROWID\|INTEGER\|FLOAT\|NUMERIC\|TEXT\|BLOB)

	67 *

	68 * Only those exact type names are accepted, there is no type

	69 * intuition. The only constraints accepted are STRICT (see below)

	70 * and NOT NULL. Anything unexpected will cause the create to fail.

	71 *

	72 * ANY is a convenience to indicate that manifest typing is desired.

	73 * It is equivalent to not specifying a type at all. The results for

	74 * such columns will have the type of the data's storage. The exposed

	75 * schema will contain no type for that column.

	76 *

	77 * ROWID is used for columns representing aliases to the rowid

	78 * (INTEGER PRIMARY KEY, with or without AUTOINCREMENT), to make the

	79 * concept explicit. Such columns are actually stored as NULL, so

	80 * they cannot be simply ignored. The exposed schema will be INTEGER

	81 * for that column.

	82 *

	83 * NOT NULL causes rows with a NULL in that column to be skipped. It

	84 * also adds NOT NULL to the column in the exposed schema. If the

	85 * table has ever had columns added using ALTER TABLE, then those

	86 * columns implicitly contain NULL for rows which have not been

	87 * updated. [Workaround using COALESCE() in your SELECT statement.]

	88 *

	89 * The created table is read-only, with no indices. Any SELECT will

	90 * be a full-table scan, returning each valid row read from the

	91 * storage of the backing table. The rowid will be the rowid of the

	92 * row from the backing table. "Valid" means:

	93 * - The cell metadata for the row is well-formed. Mainly this means that

	94 * the cell header info describes a payload of the size indicated by

	95 * the cell's payload size.

	96 * - The cell does not run off the page.

	97 * - The cell does not overlap any other cell on the page.

	98 * - The cell contains doesn't contain too many columns.

	99 * - The types of the serialized data match the indicated types (see below).

	100 *

	101 *

	102 * Type affinity versus type storage.

	103 *

	104 * http://www.sqlite.org/datatype3.html describes SQLite's type

	105 * affinity system. The system provides for automated coercion of

	106 * types in certain cases, transparently enough that many developers

	107 * do not realize that it is happening. Importantly, it implies that

	108 * the raw data stored in the database may not have the obvious type.

	109 *

	110 * Differences between the stored data types and the expected data

	111 * types may be a signal of corruption. This module makes some

	112 * allowances for automatic coercion. It is important to be concious

	113 * of the difference between the schema exposed by the module, and the

	114 * data types read from storage. The following table describes how

	115 * the module interprets things:

	116 *

	117 * type schema data STRICT

	118 * ---- ------ ---- ------

	119 * ANY <none> any any

	120 * ROWID INTEGER n/a n/a

	121 * INTEGER INTEGER integer integer

	122 * FLOAT FLOAT integer or float float

	123 * NUMERIC NUMERIC integer, float, or text integer or float

	124 * TEXT TEXT text or blob text

	125 * BLOB BLOB blob blob

	126 *

	127 * type is the type provided to the recover module, schema is the

	128 * schema exposed by the module, data is the acceptable types of data

	129 * decoded from storage, and STRICT is a modification of that.

	130 *

	131 * A very loose recovery system might use ANY for all columns, then

	132 * use the appropriate sqlite3_column_*() calls to coerce to expected

	133 * types. This doesn't provide much protection if a page from a

	134 * different table with the same column count is linked into an

	135 * inappropriate btree.

	136 *

	137 * A very tight recovery system might use STRICT to enforce typing on

	138 * all columns, preferring to skip rows which are valid at the storage

	139 * level but don't contain the right types. Note that FLOAT STRICT is

	140 * almost certainly not appropriate, since integral values are

	141 * transparently stored as integers, when that is more efficient.

	142 *

	143 * Another option is to use ANY for all columns and inspect each

	144 * result manually (using sqlite3_column_*). This should only be

	145 * necessary in cases where developers have used manifest typing (test

	146 * to make sure before you decide that you aren't using manifest

	147 * typing!).

	148 *

	149 *

	150 * Caveats

	151 *

	152 * Leaf pages not referenced by interior nodes will not be found.

	153 *

	154 * Leaf pages referenced from interior nodes of other tables will not

	155 * be resolved.

	156 *

	157 * Rows referencing invalid overflow pages will be skipped.

	158 *

	159 * SQlite rows have a header which describes how to interpret the rest

	160 * of the payload. The header can be valid in cases where the rest of

	161 * the record is actually corrupt (in the sense that the data is not

	162 * the intended data). This can especially happen WRT overflow pages,

	163 * as lack of atomic updates between pages is the primary form of

	164 * corruption I have seen in the wild.

	165 */

	166 /* The implementation is via a series of cursors. The cursor

	167 * implementations follow the pattern:

	168 *

	169 * // Creates the cursor using various initialization info.

	170 * int cursorCreate(...);

	171 *

	172 * // Returns 1 if there is no more data, 0 otherwise.

	173 * int cursorEOF(Cursor *pCursor);

	174 *

	175 * // Various accessors can be used if not at EOF.

	176 *

	177 * // Move to the next item.

	178 * int cursorNext(Cursor *pCursor);

	179 *

	180 * // Destroy the memory associated with the cursor.

	181 * void cursorDestroy(Cursor *pCursor);

	182 *

	183 * References in the following are to sections at

	184 * http://www.sqlite.org/fileformat2.html .

	185 *

	186 * RecoverLeafCursor iterates the records in a leaf table node

	187 * described in section 1.5 "B-tree Pages". When the node is

	188 * exhausted, an interior cursor is used to get the next leaf node,

	189 * and iteration continues there.

	190 *

	191 * RecoverInteriorCursor iterates the child pages in an interior table

	192 * node described in section 1.5 "B-tree Pages". When the node is

	193 * exhausted, a parent interior cursor is used to get the next

	194 * interior node at the same level, and iteration continues there.

	195 *

	196 * Together these record the path from the leaf level to the root of

	197 * the tree. Iteration happens from the leaves rather than the root

	198 * both for efficiency and putting the special case at the front of

	199 * the list is easier to implement.

	200 *

	201 * RecoverCursor uses a RecoverLeafCursor to iterate the rows of a

	202 * table, returning results via the SQLite virtual table interface.

	203 */

	204 /* TODO(shess): It might be useful to allow DEFAULT in types to

	205 * specify what to do for NULL when an ALTER TABLE case comes up.

	206 * Unfortunately, simply adding it to the exposed schema and using

	207 * sqlite3_result_null() does not cause the default to be generate.

	208 * Handling it ourselves seems hard, unfortunately.

	209 */

	210

	211 #include <assert.h>

	212 #include <ctype.h>

	213 #include <stdio.h>

	214 #include <string.h>

	215

	216 /* Internal SQLite things that are used:

	217 * u32, u64, i64 types.

	218 * Btree, Pager, and DbPage structs.

	219 * DbPage.pData, .pPager, and .pgno

	220 * sqlite3 struct.

	221 * sqlite3BtreePager() and sqlite3BtreeGetPageSize()

	222 * sqlite3PagerAcquire() and sqlite3PagerUnref()

	223 * getVarint().

	224 */

	225 #include "sqliteInt.h"

	226

	227 /* For debugging. */

	228 #if 0

	229 #define FNENTRY() fprintf(stderr, "In %s\n", __FUNCTION__)

	230 #else

	231 #define FNENTRY()

	232 #endif

	233

	234 /* Generic constants and helper functions. */

	235

	236 static const unsigned char kTableLeafPage = 0x0D;

	237 static const unsigned char kTableInteriorPage = 0x05;

	238

	239 /* From section 1.5. */

	240 static const unsigned kiPageTypeOffset = 0;

	241 static const unsigned kiPageFreeBlockOffset = 1;

	242 static const unsigned kiPageCellCountOffset = 3;

	243 static const unsigned kiPageCellContentOffset = 5;

	244 static const unsigned kiPageFragmentedBytesOffset = 7;

	245 static const unsigned knPageLeafHeaderBytes = 8;

	246 /* Interior pages contain an additional field. */

	247 static const unsigned kiPageRightChildOffset = 8;

	248 static const unsigned kiPageInteriorHeaderBytes = 12;

	249

	250 /* Accepted types are specified by a mask. */

	251 #define MASK_ROWID (1<<0)

	252 #define MASK_INTEGER (1<<1)

	253 #define MASK_FLOAT (1<<2)

	254 #define MASK_TEXT (1<<3)

	255 #define MASK_BLOB (1<<4)

	256 #define MASK_NULL (1<<5)

	257

	258 /* Helpers to decode fixed-size fields. */

	259 static u32 decodeUnsigned16(const unsigned char *pData){

	260 return (pData[0]<<8) + pData[1];

	261 }

	262 static u32 decodeUnsigned32(const unsigned char *pData){

	263 return (decodeUnsigned16(pData)<<16) + decodeUnsigned16(pData+2);

	264 }

	265 static i64 decodeSigned(const unsigned char *pData, unsigned nBytes){

	266 i64 r = (char)(*pData);

	267 while( --nBytes ){

	268 r <<= 8;

	269 r += *(++pData);

	270 }

	271 return r;

	272 }

	273 /* Derived from vdbeaux.c, sqlite3VdbeSerialGet(), case 7. */

	274 /* TODO(shess): Determine if swapMixedEndianFloat() applies. */

	275 static double decodeFloat64(const unsigned char *pData){

	276 #if !defined(NDEBUG)

	277 static const u64 t1 = ((u64)0x3ff00000)<<32;

	278 static const double r1 = 1.0;

	279 u64 t2 = t1;

	280 assert( sizeof(r1)==sizeof(t2) && memcmp(&r1, &t2, sizeof(r1))==0 );

	281 #endif

	282 i64 x = decodeSigned(pData, 8);

	283 double d;

	284 memcpy(&d, &x, sizeof(x));

	285 return d;

	286 }

	287

	288 /* Return true if a varint can safely be read from pData/nData. */

	289 /* TODO(shess): DbPage points into the middle of a buffer which

	290 * contains the page data before DbPage. So code should always be

	291 * able to read a small number of varints safely. Consider whether to

	292 * trust that or not.

	293 */

	294 static int checkVarint(const unsigned char *pData, unsigned nData){

	295 unsigned i;

	296

	297 /* In the worst case the decoder takes all 8 bits of the 9th byte. */

	298 if( nData>=9 ){

	299 return 1;

	300 }

	301

	302 /* Look for a high-bit-clear byte in what's left. */

	303 for( i=0; i<nData; ++i ){

	304 if( !(pData[i]&0x80) ){

	305 return 1;

	306 }

	307 }

	308

	309 /* Cannot decode in the space given. */

	310 return 0;

	311 }

	312

	313 /* Return 1 if n varints can be read from pData/nData. */

	314 static int checkVarints(const unsigned char *pData, unsigned nData,

	315 unsigned n){

	316 unsigned nCur = 0; /* Byte offset within current varint. */

	317 unsigned nFound = 0; /* Number of varints found. */

	318 unsigned i;

	319

	320 /* In the worst case the decoder takes all 8 bits of the 9th byte. */

	321 if( nData>=9*n ){

	322 return 1;

	323 }

	324

	325 for( i=0; nFound<n && i<nData; ++i ){

	326 nCur++;

	327 if( nCur==9 \|\| !(pData[i]&0x80) ){

	328 nFound++;

	329 nCur = 0;

	330 }

	331 }

	332

	333 return nFound==n;

	334 }

	335

	336 /* ctype and str[n]casecmp() can be affected by locale (eg, tr_TR).

	337 * These versions consider only the ASCII space.

	338 */

	339 /* TODO(shess): It may be reasonable to just remove the need for these

	340 * entirely. The module could require "TEXT STRICT NOT NULL", not

	341 * "Text Strict Not Null" or whatever the developer felt like typing

	342 * that day. Handling corrupt data is a PERFECT place to be pedantic.

	343 */

	344 static int ascii_isspace(char c){

	345 /* From fts3_expr.c */

	346 return c==' ' \|\| c=='\t' \|\| c=='\n' \|\| c=='\r' \|\| c=='\v' \|\| c=='\f';

	347 }

	348 static int ascii_isalnum(int x){

	349 /* From fts3_tokenizer1.c */

	350 return (x>='0' && x<='9') \|\| (x>='A' && x<='Z') \|\| (x>='a' && x<='z');

	351 }

	352 static int ascii_tolower(int x){

	353 /* From fts3_tokenizer1.c */

	354 return (x>='A' && x<='Z') ? x-'A'+'a' : x;

	355 }

	356 /* TODO(shess): Consider sqlite3_strnicmp() */

	357 static int ascii_strncasecmp(const char s1, const char s2, size_t n){

	358 const unsigned char us1 = (const unsigned char )s1;

	359 const unsigned char us2 = (const unsigned char )s2;

	360 while( us1 && us2 && n && ascii_tolower(us1)==ascii_tolower(us2) ){

	361 us1++, us2++, n--;

	362 }

	363 return n ? ascii_tolower(us1)-ascii_tolower(us2) : 0;

	364 }

	365 static int ascii_strcasecmp(const char s1, const char s2){

	366 /* If s2 is equal through strlen(s1), will exit while() due to s1's

	367 * trailing NUL, and return NUL-s2[strlen(s1)].

	368 */

	369 return ascii_strncasecmp(s1, s2, strlen(s1)+1);

	370 }

	371

	372 /* For some reason I kept making mistakes with offset calculations. */

	373 static const unsigned char PageData(DbPage pPage, unsigned iOffset){

	374 assert( iOffset<=pPage->nPageSize );

	375 return (unsigned char *)pPage->pData + iOffset;

	376 }

	377

	378 /* The first page in the file contains a file header in the first 100

	379 * bytes. The page's header information comes after that. Note that

	380 * the offsets in the page's header information are relative to the

	381 * beginning of the page, NOT the end of the page header.

	382 */

	383 static const unsigned char PageHeader(DbPage pPage){

	384 if( pPage->pgno==1 ){

	385 const unsigned nDatabaseHeader = 100;

	386 return PageData(pPage, nDatabaseHeader);

	387 }else{

	388 return PageData(pPage, 0);

	389 }

	390 }

	391

	392 /* Helper to fetch the pager and page size for the named database. */

	393 static int GetPager(sqlite3 db, const char zName,

	394 Pager *pPager, unsigned pnPageSize){

	395 Btree *pBt = NULL;

	396 int i;

	397 for( i=0; i<db->nDb; ++i ){

	398 if( ascii_strcasecmp(db->aDb[i].zName, zName)==0 ){

	399 pBt = db->aDb[i].pBt;

	400 break;

	401 }

	402 }

	403 if( !pBt ){

	404 return SQLITE_ERROR;

	405 }

	406

	407 *pPager = sqlite3BtreePager(pBt);

	408 *pnPageSize = sqlite3BtreeGetPageSize(pBt) - sqlite3BtreeGetReserve(pBt);

	409 return SQLITE_OK;

	410 }

	411

	412 /* iSerialType is a type read from a record header. See "2.1 Record Format".

	413 */

	414

	415 /* Storage size of iSerialType in bytes. My interpretation of SQLite

	416 * documentation is that text and blob fields can have 32-bit length.

	417 * Values past 2^31-12 will need more than 32 bits to encode, which is

	418 * why iSerialType is u64.

	419 */

	420 static u32 SerialTypeLength(u64 iSerialType){

	421 switch( iSerialType ){

	422 case 0 : return 0; /* NULL */

	423 case 1 : return 1; /* Various integers. */

	424 case 2 : return 2;

	425 case 3 : return 3;

	426 case 4 : return 4;

	427 case 5 : return 6;

	428 case 6 : return 8;

	429 case 7 : return 8; /* 64-bit float. */

	430 case 8 : return 0; /* Constant 0. */

	431 case 9 : return 0; /* Constant 1. */

	432 case 10 : case 11 : assert( !"RESERVED TYPE"); return 0;

	433 }

	434 return (u32)((iSerialType>>1) - 6);

	435 }

	436

	437 /* True if iSerialType refers to a blob. */

	438 static int SerialTypeIsBlob(u64 iSerialType){

	439 assert( iSerialType>=12 );

	440 return (iSerialType%2)==0;

	441 }

	442

	443 /* Returns true if the serialized type represented by iSerialType is

	444 * compatible with the given type mask.

	445 */

	446 static int SerialTypeIsCompatible(u64 iSerialType, unsigned char mask){

	447 switch( iSerialType ){

	448 case 0 : return (mask&MASK_NULL)!=0;

	449 case 1 : return (mask&MASK_INTEGER)!=0;

	450 case 2 : return (mask&MASK_INTEGER)!=0;

	451 case 3 : return (mask&MASK_INTEGER)!=0;

	452 case 4 : return (mask&MASK_INTEGER)!=0;

	453 case 5 : return (mask&MASK_INTEGER)!=0;

	454 case 6 : return (mask&MASK_INTEGER)!=0;

	455 case 7 : return (mask&MASK_FLOAT)!=0;

	456 case 8 : return (mask&MASK_INTEGER)!=0;

	457 case 9 : return (mask&MASK_INTEGER)!=0;

	458 case 10 : assert( !"RESERVED TYPE"); return 0;

	459 case 11 : assert( !"RESERVED TYPE"); return 0;

	460 }

	461 return (mask&(SerialTypeIsBlob(iSerialType) ? MASK_BLOB : MASK_TEXT));

	462 }

	463

	464 /* Versions of strdup() with return values appropriate for

	465 * sqlite3_free(). malloc.c has sqlite3DbStrDup()/NDup(), but those

	466 * need sqlite3DbFree(), which seems intrusive.

	467 */

	468 static char sqlite3_strndup(const char z, unsigned n){

	469 char *zNew;

	470

	471 if( z==NULL ){

	472 return NULL;

	473 }

	474

	475 zNew = sqlite3_malloc(n+1);

	476 if( zNew!=NULL ){

	477 memcpy(zNew, z, n);

	478 zNew[n] = '\0';

	479 }

	480 return zNew;

	481 }

	482 static char sqlite3_strdup(const char z){

	483 if( z==NULL ){

	484 return NULL;

	485 }

	486 return sqlite3_strndup(z, strlen(z));

	487 }

	488

	489 /* Fetch the page number of zTable in zDb from sqlite_master in zDb,

	490 * and put it in *piRootPage.

	491 */

	492 static int getRootPage(sqlite3 db, const char zDb, const char *zTable,

	493 u32 *piRootPage){

	494 char zSql; / SQL selecting root page of named element. */

	495 sqlite3_stmt *pStmt;

	496 int rc;

	497

	498 if( strcmp(zTable, "sqlite_master")==0 ){

	499 *piRootPage = 1;

	500 return SQLITE_OK;

	501 }

	502

	503 zSql = sqlite3_mprintf("SELECT rootpage FROM %s.sqlite_master "

	504 "WHERE type = 'table' AND tbl_name = %Q",

	505 zDb, zTable);

	506 if( !zSql ){

	507 return SQLITE_NOMEM;

	508 }

	509

	510 rc = sqlite3_prepare_v2(db, zSql, -1, &pStmt, 0);

	511 sqlite3_free(zSql);

	512 if( rc!=SQLITE_OK ){

	513 return rc;

	514 }

	515

	516 /* Require a result. */

	517 rc = sqlite3_step(pStmt);

	518 if( rc==SQLITE_DONE ){

	519 rc = SQLITE_CORRUPT;

	520 }else if( rc==SQLITE_ROW ){

	521 *piRootPage = sqlite3_column_int(pStmt, 0);

	522

	523 /* Require only one result. */

	524 rc = sqlite3_step(pStmt);

	525 if( rc==SQLITE_DONE ){

	526 rc = SQLITE_OK;

	527 }else if( rc==SQLITE_ROW ){

	528 rc = SQLITE_CORRUPT;

	529 }

	530 }

	531 sqlite3_finalize(pStmt);

	532 return rc;

	533 }

	534

	535 static int getEncoding(sqlite3 db, const char zDb, int* piEncoding){

	536 sqlite3_stmt *pStmt;

	537 int rc;

	538 char *zSql = sqlite3_mprintf("PRAGMA %s.encoding", zDb);

	539 if( !zSql ){

	540 return SQLITE_NOMEM;

	541 }

	542

	543 rc = sqlite3_prepare_v2(db, zSql, -1, &pStmt, 0);

	544 sqlite3_free(zSql);

	545 if( rc!=SQLITE_OK ){

	546 return rc;

	547 }

	548

	549 /* Require a result. */

	550 rc = sqlite3_step(pStmt);

	551 if( rc==SQLITE_DONE ){

	552 /* This case should not be possible. */

	553 rc = SQLITE_CORRUPT;

	554 }else if( rc==SQLITE_ROW ){

	555 if( sqlite3_column_type(pStmt, 0)==SQLITE_TEXT ){

	556 const char* z = (const char *)sqlite3_column_text(pStmt, 0);

	557 /* These strings match the literals in pragma.c. */

	558 if( !strcmp(z, "UTF-16le") ){

	559 *piEncoding = SQLITE_UTF16LE;

	560 }else if( !strcmp(z, "UTF-16be") ){

	561 *piEncoding = SQLITE_UTF16BE;

	562 }else if( !strcmp(z, "UTF-8") ){

	563 *piEncoding = SQLITE_UTF8;

	564 }else{

	565 /* This case should not be possible. */

	566 *piEncoding = SQLITE_UTF8;

	567 }

	568 }else{

	569 /* This case should not be possible. */

	570 *piEncoding = SQLITE_UTF8;

	571 }

	572

	573 /* Require only one result. */

	574 rc = sqlite3_step(pStmt);

	575 if( rc==SQLITE_DONE ){

	576 rc = SQLITE_OK;

	577 }else if( rc==SQLITE_ROW ){

	578 /* This case should not be possible. */

	579 rc = SQLITE_CORRUPT;

	580 }

	581 }

	582 sqlite3_finalize(pStmt);

	583 return rc;

	584 }

	585

	586 /* Cursor for iterating interior nodes. Interior page cells contain a

	587 * child page number and a rowid. The child page contains items left

	588 * of the rowid (less than). The rightmost page of the subtree is

	589 * stored in the page header.

	590 *

	591 * interiorCursorDestroy - release all resources associated with the

	592 * cursor and any parent cursors.

	593 * interiorCursorCreate - create a cursor with the given parent and page.

	594 * interiorCursorEOF - returns true if neither the cursor nor the

	595 * parent cursors can return any more data.

	596 * interiorCursorNextPage - fetch the next child page from the cursor.

	597 *

	598 * Logically, interiorCursorNextPage() returns the next child page

	599 * number from the page the cursor is currently reading, calling the

	600 * parent cursor as necessary to get new pages to read, until done.

	601 * SQLITE_ROW if a page is returned, SQLITE_DONE if out of pages,

	602 * error otherwise. Unfortunately, if the table is corrupted

	603 * unexpected pages can be returned. If any unexpected page is found,

	604 * leaf or otherwise, it is returned to the caller for processing,

	605 * with the interior cursor left empty. The next call to

	606 * interiorCursorNextPage() will recurse to the parent cursor until an

	607 * interior page to iterate is returned.

	608 *

	609 * Note that while interiorCursorNextPage() will refuse to follow

	610 * loops, it does not keep track of pages returned for purposes of

	611 * preventing duplication.

	612 *

	613 * Note that interiorCursorEOF() could return false (not at EOF), and

	614 * interiorCursorNextPage() could still return SQLITE_DONE. This

	615 * could happen if there are more cells to iterate in an interior

	616 * page, but those cells refer to invalid pages.

	617 */

	618 typedef struct RecoverInteriorCursor RecoverInteriorCursor;

	619 struct RecoverInteriorCursor {

	620 RecoverInteriorCursor pParent; / Parent node to this node. */

	621 DbPage pPage; / Reference to leaf page. */

	622 unsigned nPageSize; /* Size of page. */

	623 unsigned nChildren; /* Number of children on the page. */

	624 unsigned iChild; /* Index of next child to return. */

	625 };

	626

	627 static void interiorCursorDestroy(RecoverInteriorCursor *pCursor){

	628 /* Destroy all the cursors to the root. */

	629 while( pCursor ){

	630 RecoverInteriorCursor *p = pCursor;

	631 pCursor = pCursor->pParent;

	632

	633 if( p->pPage ){

	634 sqlite3PagerUnref(p->pPage);

	635 p->pPage = NULL;

	636 }

	637

	638 memset(p, 0xA5, sizeof(*p));

	639 sqlite3_free(p);

	640 }

	641 }

	642

	643 /* Internal helper. Reset storage in preparation for iterating pPage. */

	644 static void interiorCursorSetPage(RecoverInteriorCursor *pCursor,

	645 DbPage *pPage){

	646 const unsigned knMinCellLength = 2 + 4 + 1;

	647 unsigned nMaxChildren;

	648 assert( PageHeader(pPage)[kiPageTypeOffset]==kTableInteriorPage );

	649

	650 if( pCursor->pPage ){

	651 sqlite3PagerUnref(pCursor->pPage);

	652 pCursor->pPage = NULL;

	653 }

	654 pCursor->pPage = pPage;

	655 pCursor->iChild = 0;

	656

	657 /* A child for each cell, plus one in the header. */

	658 pCursor->nChildren = decodeUnsigned16(PageHeader(pPage) +

	659 kiPageCellCountOffset) + 1;

	660

	661 /* Each child requires a 16-bit offset from an array after the header,

	662 * and each child contains a 32-bit page number and at least a varint

	663 * (min size of one byte). The final child page is in the header. So

	664 * the maximum value for nChildren is:

	665 * (nPageSize - kiPageInteriorHeaderBytes) /

	666 * (sizeof(uint16) + sizeof(uint32) + 1) + 1

	667 */

	668 /* TODO(shess): This count is very unlikely to be corrupted in

	669 * isolation, so seeing this could signal to skip the page. OTOH, I

	670 * can't offhand think of how to get here unless this or the page-type

	671 * byte is corrupted. Could be an overflow page, but it would require

	672 * a very large database.

	673 */

	674 nMaxChildren =

	675 (pCursor->nPageSize - kiPageInteriorHeaderBytes) / knMinCellLength + 1;

	676 if (pCursor->nChildren > nMaxChildren) {

	677 pCursor->nChildren = nMaxChildren;

	678 }

	679 }

	680

	681 static int interiorCursorCreate(RecoverInteriorCursor *pParent,

	682 DbPage *pPage, int nPageSize,

	683 RecoverInteriorCursor **ppCursor){

	684 RecoverInteriorCursor *pCursor =

	685 sqlite3_malloc(sizeof(RecoverInteriorCursor));

	686 if( !pCursor ){

	687 return SQLITE_NOMEM;

	688 }

	689

	690 memset(pCursor, 0, sizeof(*pCursor));

	691 pCursor->pParent = pParent;

	692 pCursor->nPageSize = nPageSize;

	693 interiorCursorSetPage(pCursor, pPage);

	694 *ppCursor = pCursor;

	695 return SQLITE_OK;

	696 }

	697

	698 /* Internal helper. Return the child page number at iChild. */

	699 static unsigned interiorCursorChildPage(RecoverInteriorCursor *pCursor){

	700 const unsigned char pPageHeader; / Header of the current page. */

	701 const unsigned char pCellOffsets; / Offset to page's cell offsets. */

	702 unsigned iCellOffset; /* Offset of target cell. */

	703

	704 assert( pCursor->iChild<pCursor->nChildren );

	705

	706 /* Rightmost child is in the header. */

	707 pPageHeader = PageHeader(pCursor->pPage);

	708 if( pCursor->iChild==pCursor->nChildren-1 ){

	709 return decodeUnsigned32(pPageHeader + kiPageRightChildOffset);

	710 }

	711

	712 /* Each cell is a 4-byte integer page number and a varint rowid

	713 * which is greater than the rowid of items in that sub-tree (this

	714 * module ignores ordering). The offset is from the beginning of the

	715 * page, not from the page header.

	716 */

	717 pCellOffsets = pPageHeader + kiPageInteriorHeaderBytes;

	718 iCellOffset = decodeUnsigned16(pCellOffsets + pCursor->iChild*2);

	719 if( iCellOffset<=pCursor->nPageSize-4 ){

	720 return decodeUnsigned32(PageData(pCursor->pPage, iCellOffset));

	721 }

	722

	723 /* TODO(shess): Check for cell overlaps? Cells require 4 bytes plus

	724 * a varint. Check could be identical to leaf check (or even a

	725 * shared helper testing for "Cells starting in this range"?).

	726 */

	727

	728 /* If the offset is broken, return an invalid page number. */

	729 return 0;

	730 }

	731

	732 static int interiorCursorEOF(RecoverInteriorCursor *pCursor){

	733 /* Find a parent with remaining children. EOF if none found. */

	734 while( pCursor && pCursor->iChild>=pCursor->nChildren ){

	735 pCursor = pCursor->pParent;

	736 }

	737 return pCursor==NULL;

	738 }

	739

	740 /* Internal helper. Used to detect if iPage would cause a loop. */

	741 static int interiorCursorPageInUse(RecoverInteriorCursor *pCursor,

	742 unsigned iPage){

	743 /* Find any parent using the indicated page. */

	744 while( pCursor && pCursor->pPage->pgno!=iPage ){

	745 pCursor = pCursor->pParent;

	746 }

	747 return pCursor!=NULL;

	748 }

	749

	750 /* Get the next page from the interior cursor at *ppCursor. Returns

	751 * SQLITE_ROW with the page in *ppPage, or SQLITE_DONE if out of

	752 * pages, or the error SQLite returned.

	753 *

	754 * If the tree is uneven, then when the cursor attempts to get a new

	755 * interior page from the parent cursor, it may get a non-interior

	756 * page. In that case, the new page is returned, and *ppCursor is

	757 * updated to point to the parent cursor (this cursor is freed).

	758 */

	759 /* TODO(shess): I've tried to avoid recursion in most of this code,

	760 * but this case is more challenging because the recursive call is in

	761 * the middle of operation. One option for converting it without

	762 * adding memory management would be to retain the head pointer and

	763 * use a helper to "back up" as needed. Another option would be to

	764 * reverse the list during traversal.

	765 */

	766 static int interiorCursorNextPage(RecoverInteriorCursor **ppCursor,

	767 DbPage **ppPage){

	768 RecoverInteriorCursor pCursor = ppCursor;

	769 while( 1 ){

	770 int rc;

	771 const unsigned char pPageHeader; / Header of found page. */

	772

	773 /* Find a valid child page which isn't on the stack. */

	774 while( pCursor->iChild<pCursor->nChildren ){

	775 const unsigned iPage = interiorCursorChildPage(pCursor);

	776 pCursor->iChild++;

	777 if( interiorCursorPageInUse(pCursor, iPage) ){

	778 fprintf(stderr, "Loop detected at %d\n", iPage);

	779 }else{

	780 int rc = sqlite3PagerAcquire(pCursor->pPage->pPager, iPage, ppPage, 0);

	781 if( rc==SQLITE_OK ){

	782 return SQLITE_ROW;

	783 }

	784 }

	785 }

	786

	787 /* This page has no more children. Get next page from parent. */

	788 if( !pCursor->pParent ){

	789 return SQLITE_DONE;

	790 }

	791 rc = interiorCursorNextPage(&pCursor->pParent, ppPage);

	792 if( rc!=SQLITE_ROW ){

	793 return rc;

	794 }

	795

	796 /* If a non-interior page is received, that either means that the

	797 * tree is uneven, or that a child was re-used (say as an overflow

	798 * page). Remove this cursor and let the caller handle the page.

	799 */

	800 pPageHeader = PageHeader(*ppPage);

	801 if( pPageHeader[kiPageTypeOffset]!=kTableInteriorPage ){

	802 *ppCursor = pCursor->pParent;

	803 pCursor->pParent = NULL;

	804 interiorCursorDestroy(pCursor);

	805 return SQLITE_ROW;

	806 }

	807

	808 /* Iterate the new page. */

	809 interiorCursorSetPage(pCursor, *ppPage);

	810 *ppPage = NULL;

	811 }

	812

	813 assert(NULL); /* NOTREACHED() */

	814 return SQLITE_CORRUPT;

	815 }

	816

	817 /* Large rows are spilled to overflow pages. The row's main page

	818 * stores the overflow page number after the local payload, with a

	819 * linked list forward from there as necessary. overflowMaybeCreate()

	820 * and overflowGetSegment() provide an abstraction for accessing such

	821 * data while centralizing the code.

	822 *

	823 * overflowDestroy - releases all resources associated with the structure.

	824 * overflowMaybeCreate - create the overflow structure if it is needed

	825 * to represent the given record. See function comment.

	826 * overflowGetSegment - fetch a segment from the record, accounting

	827 * for overflow pages. Segments which are not

	828 * entirely contained with a page are constructed

	829 * into a buffer which is returned. See function comment.

	830 */

	831 typedef struct RecoverOverflow RecoverOverflow;

	832 struct RecoverOverflow {

	833 RecoverOverflow *pNextOverflow;

	834 DbPage *pPage;

	835 unsigned nPageSize;

	836 };

	837

	838 static void overflowDestroy(RecoverOverflow *pOverflow){

	839 while( pOverflow ){

	840 RecoverOverflow *p = pOverflow;

	841 pOverflow = p->pNextOverflow;

	842

	843 if( p->pPage ){

	844 sqlite3PagerUnref(p->pPage);

	845 p->pPage = NULL;

	846 }

	847

	848 memset(p, 0xA5, sizeof(*p));

	849 sqlite3_free(p);

	850 }

	851 }

	852

	853 /* Internal helper. Used to detect if iPage would cause a loop. */

	854 static int overflowPageInUse(RecoverOverflow *pOverflow, unsigned iPage){

	855 while( pOverflow && pOverflow->pPage->pgno!=iPage ){

	856 pOverflow = pOverflow->pNextOverflow;

	857 }

	858 return pOverflow!=NULL;

	859 }

	860

	861 /* Setup to access an nRecordBytes record beginning at iRecordOffset

	862 * in pPage. If nRecordBytes can be satisfied entirely from pPage,

	863 * then no overflow pages are needed an *pnLocalRecordBytes is set to

	864 * nRecordBytes. Otherwise, *ppOverflow is set to the head of a list

	865 * of overflow pages, and *pnLocalRecordBytes is set to the number of

	866 * bytes local to pPage.

	867 *

	868 * overflowGetSegment() will do the right thing regardless of whether

	869 * those values are set to be in-page or not.

	870 */

	871 static int overflowMaybeCreate(DbPage *pPage, unsigned nPageSize,

	872 unsigned iRecordOffset, unsigned nRecordBytes,

	873 unsigned *pnLocalRecordBytes,

	874 RecoverOverflow **ppOverflow){

	875 unsigned nLocalRecordBytes; /* Record bytes in the leaf page. */

	876 unsigned iNextPage; /* Next page number for record data. */

	877 unsigned nBytes; /* Maximum record bytes as of current page. */

	878 int rc;

	879 RecoverOverflow pFirstOverflow; / First in linked list of pages. */

	880 RecoverOverflow pLastOverflow; / End of linked list. */

	881

	882 /* Calculations from the "Table B-Tree Leaf Cell" part of section

	883 * 1.5 of http://www.sqlite.org/fileformat2.html . maxLocal and

	884 * minLocal to match naming in btree.c.

	885 */

	886 const unsigned maxLocal = nPageSize - 35;

	887 const unsigned minLocal = ((nPageSize-12)32/255)-23; / m */

	888

	889 /* Always fit anything smaller than maxLocal. */

	890 if( nRecordBytes<=maxLocal ){

	891 *pnLocalRecordBytes = nRecordBytes;

	892 *ppOverflow = NULL;

	893 return SQLITE_OK;

	894 }

	895

	896 /* Calculate the remainder after accounting for minLocal on the leaf

	897 * page and what packs evenly into overflow pages. If the remainder

	898 * does not fit into maxLocal, then a partially-full overflow page

	899 * will be required in any case, so store as little as possible locally.

	900 */

	901 nLocalRecordBytes = minLocal+((nRecordBytes-minLocal)%(nPageSize-4));

	902 if( maxLocal<nLocalRecordBytes ){

	903 nLocalRecordBytes = minLocal;

	904 }

	905

	906 /* Don't read off the end of the page. */

	907 if( iRecordOffset+nLocalRecordBytes+4>nPageSize ){

	908 return SQLITE_CORRUPT;

	909 }

	910

	911 /* First overflow page number is after the local bytes. */

	912 iNextPage =

	913 decodeUnsigned32(PageData(pPage, iRecordOffset + nLocalRecordBytes));

	914 nBytes = nLocalRecordBytes;

	915

	916 /* While there are more pages to read, and more bytes are needed,

	917 * get another page.

	918 */

	919 pFirstOverflow = pLastOverflow = NULL;

	920 rc = SQLITE_OK;

	921 while( iNextPage && nBytes<nRecordBytes ){

	922 RecoverOverflow pOverflow; / New overflow page for the list. */

	923

	924 rc = sqlite3PagerAcquire(pPage->pPager, iNextPage, &pPage, 0);

	925 if( rc!=SQLITE_OK ){

	926 break;

	927 }

	928

	929 pOverflow = sqlite3_malloc(sizeof(RecoverOverflow));

	930 if( !pOverflow ){

	931 sqlite3PagerUnref(pPage);

	932 rc = SQLITE_NOMEM;

	933 break;

	934 }

	935 memset(pOverflow, 0, sizeof(*pOverflow));

	936 pOverflow->pPage = pPage;

	937 pOverflow->nPageSize = nPageSize;

	938

	939 if( !pFirstOverflow ){

	940 pFirstOverflow = pOverflow;

	941 }else{

	942 pLastOverflow->pNextOverflow = pOverflow;

	943 }

	944 pLastOverflow = pOverflow;

	945

	946 iNextPage = decodeUnsigned32(pPage->pData);

	947 nBytes += nPageSize-4;

	948

	949 /* Avoid loops. */

	950 if( overflowPageInUse(pFirstOverflow, iNextPage) ){

	951 fprintf(stderr, "Overflow loop detected at %d\n", iNextPage);

	952 rc = SQLITE_CORRUPT;

	953 break;

	954 }

	955 }

	956

	957 /* If there were not enough pages, or too many, things are corrupt.

	958 * Not having enough pages is an obvious problem, all the data

	959 * cannot be read. Too many pages means that the contents of the

	960 * row between the main page and the overflow page(s) is

	961 * inconsistent (most likely one or more of the overflow pages does

	962 * not really belong to this row).

	963 */

	964 if( rc==SQLITE_OK && (nBytes<nRecordBytes \|\| iNextPage) ){

	965 rc = SQLITE_CORRUPT;

	966 }

	967

	968 if( rc==SQLITE_OK ){

	969 *ppOverflow = pFirstOverflow;

	970 *pnLocalRecordBytes = nLocalRecordBytes;

	971 }else if( pFirstOverflow ){

	972 overflowDestroy(pFirstOverflow);

	973 }

	974 return rc;

	975 }

	976

	977 /* Use in concert with overflowMaybeCreate() to efficiently read parts

	978 * of a potentially-overflowing record. pPage and iRecordOffset are

	979 * the values passed into overflowMaybeCreate(), nLocalRecordBytes and

	980 * pOverflow are the values returned by that call.

	981 *

	982 * On SQLITE_OK, *ppBase points to nRequestBytes of data at

	983 * iRequestOffset within the record. If the data exists contiguously

	984 * in a page, a direct pointer is returned, otherwise a buffer from

	985 * sqlite3_malloc() is returned with the data. *pbFree is set true if

	986 * sqlite3_free() should be called on *ppBase.

	987 */

	988 /* Operation of this function is subtle. At any time, pPage is the

	989 * current page, with iRecordOffset and nLocalRecordBytes being record

	990 * data within pPage, and pOverflow being the overflow page after

	991 * pPage. This allows the code to handle both the initial leaf page

	992 * and overflow pages consistently by adjusting the values

	993 * appropriately.

	994 */

	995 static int overflowGetSegment(DbPage *pPage, unsigned iRecordOffset,

	996 unsigned nLocalRecordBytes,

	997 RecoverOverflow *pOverflow,

	998 unsigned iRequestOffset, unsigned nRequestBytes,

	999 unsigned char *ppBase, int pbFree){

	1000 unsigned nBase; /* Amount of data currently collected. */

	1001 unsigned char pBase; / Buffer to collect record data into. */

	1002

	1003 /* Skip to the page containing the start of the data. */

	1004 while( iRequestOffset>=nLocalRecordBytes && pOverflow ){

	1005 /* Factor out current page's contribution. */

	1006 iRequestOffset -= nLocalRecordBytes;

	1007

	1008 /* Move forward to the next page in the list. */

	1009 pPage = pOverflow->pPage;

	1010 iRecordOffset = 4;

	1011 nLocalRecordBytes = pOverflow->nPageSize - iRecordOffset;

	1012 pOverflow = pOverflow->pNextOverflow;

	1013 }

	1014

	1015 /* If the requested data is entirely within this page, return a

	1016 * pointer into the page.

	1017 */

	1018 if( iRequestOffset+nRequestBytes<=nLocalRecordBytes ){

	1019 /* TODO(shess): "assignment discards qualifiers from pointer target type"

	1020 * Having ppBase be const makes sense, but sqlite3_free() takes non-const.

	1021 */

	1022 ppBase = (unsigned char )PageData(pPage, iRecordOffset + iRequestOffset);

	1023 *pbFree = 0;

	1024 return SQLITE_OK;

	1025 }

	1026

	1027 /* The data range would require additional pages. */

	1028 if( !pOverflow ){

	1029 /* Should never happen, the range is outside the nRecordBytes

	1030 * passed to overflowMaybeCreate().

	1031 */

	1032 assert(NULL); /* NOTREACHED */

	1033 return SQLITE_ERROR;

	1034 }

	1035

	1036 /* Get a buffer to construct into. */

	1037 nBase = 0;

	1038 pBase = sqlite3_malloc(nRequestBytes);

	1039 if( !pBase ){

	1040 return SQLITE_NOMEM;

	1041 }

	1042 while( nBase<nRequestBytes ){

	1043 /* Copy over data present on this page. */

	1044 unsigned nCopyBytes = nRequestBytes - nBase;

	1045 if( nLocalRecordBytes-iRequestOffset<nCopyBytes ){

	1046 nCopyBytes = nLocalRecordBytes - iRequestOffset;

	1047 }

	1048 memcpy(pBase + nBase, PageData(pPage, iRecordOffset + iRequestOffset),

	1049 nCopyBytes);

	1050 nBase += nCopyBytes;

	1051

	1052 if( pOverflow ){

	1053 /* Copy from start of record data in future pages. */

	1054 iRequestOffset = 0;

	1055

	1056 /* Move forward to the next page in the list. Should match

	1057 * first while() loop.

	1058 */

	1059 pPage = pOverflow->pPage;

	1060 iRecordOffset = 4;

	1061 nLocalRecordBytes = pOverflow->nPageSize - iRecordOffset;

	1062 pOverflow = pOverflow->pNextOverflow;

	1063 }else if( nBase<nRequestBytes ){

	1064 /* Ran out of overflow pages with data left to deliver. Not

	1065 * possible if the requested range fits within nRecordBytes

	1066 * passed to overflowMaybeCreate() when creating pOverflow.

	1067 */

	1068 assert(NULL); /* NOTREACHED */

	1069 sqlite3_free(pBase);

	1070 return SQLITE_ERROR;

	1071 }

	1072 }

	1073 assert( nBase==nRequestBytes );

	1074 *ppBase = pBase;

	1075 *pbFree = 1;

	1076 return SQLITE_OK;

	1077 }

	1078

	1079 /* Primary structure for iterating the contents of a table.

	1080 *

	1081 * leafCursorDestroy - release all resources associated with the cursor.

	1082 * leafCursorCreate - create a cursor to iterate items from tree at

	1083 * the provided root page.

	1084 * leafCursorNextValidCell - get the cursor ready to access data from

	1085 * the next valid cell in the table.

	1086 * leafCursorCellRowid - get the current cell's rowid.

	1087 * leafCursorCellColumns - get current cell's column count.

	1088 * leafCursorCellColInfo - get type and data for a column in current cell.

	1089 *

	1090 * leafCursorNextValidCell skips cells which fail simple integrity

	1091 * checks, such as overlapping other cells, or being located at

	1092 * impossible offsets, or where header data doesn't correctly describe

	1093 * payload data. Returns SQLITE_ROW if a valid cell is found,

	1094 * SQLITE_DONE if all pages in the tree were exhausted.

	1095 *

	1096 * leafCursorCellColInfo() accounts for overflow pages in the style of

	1097 * overflowGetSegment().

	1098 */

	1099 typedef struct RecoverLeafCursor RecoverLeafCursor;

	1100 struct RecoverLeafCursor {

	1101 RecoverInteriorCursor pParent; / Parent node to this node. */

	1102 DbPage pPage; / Reference to leaf page. */

	1103 unsigned nPageSize; /* Size of pPage. */

	1104 unsigned nCells; /* Number of cells in pPage. */

	1105 unsigned iCell; /* Current cell. */

	1106

	1107 /* Info parsed from data in iCell. */

	1108 i64 iRowid; /* rowid parsed. */

	1109 unsigned nRecordCols; /* how many items in the record. */

	1110 u64 iRecordOffset; /* offset to record data. */

	1111 /* TODO(shess): nRecordBytes and nRecordHeaderBytes are used in

	1112 * leafCursorCellColInfo() to prevent buffer overruns.

	1113 * leafCursorCellDecode() already verified that the cell is valid, so

	1114 * those checks should be redundant.

	1115 */

	1116 u64 nRecordBytes; /* Size of record data. */

	1117 unsigned nLocalRecordBytes; /* Amount of record data in-page. */

	1118 unsigned nRecordHeaderBytes; /* Size of record header data. */

	1119 unsigned char pRecordHeader; / Pointer to record header data. */

	1120 int bFreeRecordHeader; /* True if record header requires free. */

	1121 RecoverOverflow pOverflow; / Cell overflow info, if needed. */

	1122 };

	1123

	1124 /* Internal helper shared between next-page and create-cursor. If

	1125 * pPage is a leaf page, it will be stored in the cursor and state

	1126 * initialized for reading cells.

	1127 *

	1128 * If pPage is an interior page, a new parent cursor is created and

	1129 * injected on the stack. This is necessary to handle trees with

	1130 * uneven depth, but also is used during initial setup.

	1131 *

	1132 * If pPage is not a table page at all, it is discarded.

	1133 *

	1134 * If SQLITE_OK is returned, the caller no longer owns pPage,

	1135 * otherwise the caller is responsible for discarding it.

	1136 */

	1137 static int leafCursorLoadPage(RecoverLeafCursor pCursor, DbPage pPage){

	1138 const unsigned char pPageHeader; / Header of pPage /

	1139

	1140 /* Release the current page. */

	1141 if( pCursor->pPage ){

	1142 sqlite3PagerUnref(pCursor->pPage);

	1143 pCursor->pPage = NULL;

	1144 pCursor->iCell = pCursor->nCells = 0;

	1145 }

	1146

	1147 /* If the page is an unexpected interior node, inject a new stack

	1148 * layer and try again from there.

	1149 */

	1150 pPageHeader = PageHeader(pPage);

	1151 if( pPageHeader[kiPageTypeOffset]==kTableInteriorPage ){

	1152 RecoverInteriorCursor *pParent;

	1153 int rc = interiorCursorCreate(pCursor->pParent, pPage, pCursor->nPageSize,

	1154 &pParent);

	1155 if( rc!=SQLITE_OK ){

	1156 return rc;

	1157 }

	1158 pCursor->pParent = pParent;

	1159 return SQLITE_OK;

	1160 }

	1161

	1162 /* Not a leaf page, skip it. */

	1163 if( pPageHeader[kiPageTypeOffset]!=kTableLeafPage ){

	1164 sqlite3PagerUnref(pPage);

	1165 return SQLITE_OK;

	1166 }

	1167

	1168 /* Take ownership of the page and start decoding. */

	1169 pCursor->pPage = pPage;

	1170 pCursor->iCell = 0;

	1171 pCursor->nCells = decodeUnsigned16(pPageHeader + kiPageCellCountOffset);

	1172 return SQLITE_OK;

	1173 }

	1174

	1175 /* Get the next leaf-level page in the tree. Returns SQLITE_ROW when

	1176 * a leaf page is found, SQLITE_DONE when no more leaves exist, or any

	1177 * error which occurred.

	1178 */

	1179 static int leafCursorNextPage(RecoverLeafCursor *pCursor){

	1180 if( !pCursor->pParent ){

	1181 return SQLITE_DONE;

	1182 }

	1183

	1184 /* Repeatedly load the parent's next child page until a leaf is found. */

	1185 do {

	1186 DbPage *pNextPage;

	1187 int rc = interiorCursorNextPage(&pCursor->pParent, &pNextPage);

	1188 if( rc!=SQLITE_ROW ){

	1189 assert( rc==SQLITE_DONE );

	1190 return rc;

	1191 }

	1192

	1193 rc = leafCursorLoadPage(pCursor, pNextPage);

	1194 if( rc!=SQLITE_OK ){

	1195 sqlite3PagerUnref(pNextPage);

	1196 return rc;

	1197 }

	1198 } while( !pCursor->pPage );

	1199

	1200 return SQLITE_ROW;

	1201 }

	1202

	1203 static void leafCursorDestroyCellData(RecoverLeafCursor *pCursor){

	1204 if( pCursor->bFreeRecordHeader ){

	1205 sqlite3_free(pCursor->pRecordHeader);

	1206 }

	1207 pCursor->bFreeRecordHeader = 0;

	1208 pCursor->pRecordHeader = NULL;

	1209

	1210 if( pCursor->pOverflow ){

	1211 overflowDestroy(pCursor->pOverflow);

	1212 pCursor->pOverflow = NULL;

	1213 }

	1214 }

	1215

	1216 static void leafCursorDestroy(RecoverLeafCursor *pCursor){

	1217 leafCursorDestroyCellData(pCursor);

	1218

	1219 if( pCursor->pParent ){

	1220 interiorCursorDestroy(pCursor->pParent);

	1221 pCursor->pParent = NULL;

	1222 }

	1223

	1224 if( pCursor->pPage ){

	1225 sqlite3PagerUnref(pCursor->pPage);

	1226 pCursor->pPage = NULL;

	1227 }

	1228

	1229 memset(pCursor, 0xA5, sizeof(*pCursor));

	1230 sqlite3_free(pCursor);

	1231 }

	1232

	1233 /* Create a cursor to iterate the rows from the leaf pages of a table

	1234 * rooted at iRootPage.

	1235 */

	1236 /* TODO(shess): recoverOpen() calls this to setup the cursor, and I

	1237 * think that recoverFilter() may make a hard assumption that the

	1238 * cursor returned will turn up at least one valid cell.

	1239 *

	1240 * The cases I can think of which break this assumption are:

	1241 * - pPage is a valid leaf page with no valid cells.

	1242 * - pPage is a valid interior page with no valid leaves.

	1243 * - pPage is a valid interior page who's leaves contain no valid cells.

	1244 * - pPage is not a valid leaf or interior page.

	1245 */

	1246 static int leafCursorCreate(Pager *pPager, unsigned nPageSize,

	1247 u32 iRootPage, RecoverLeafCursor **ppCursor){

	1248 DbPage pPage; / Reference to page at iRootPage. */

	1249 RecoverLeafCursor pCursor; / Leaf cursor being constructed. */

	1250 int rc;

	1251

	1252 /* Start out with the root page. */

	1253 rc = sqlite3PagerAcquire(pPager, iRootPage, &pPage, 0);

	1254 if( rc!=SQLITE_OK ){

	1255 return rc;

	1256 }

	1257

	1258 pCursor = sqlite3_malloc(sizeof(RecoverLeafCursor));

	1259 if( !pCursor ){

	1260 sqlite3PagerUnref(pPage);

	1261 return SQLITE_NOMEM;

	1262 }

	1263 memset(pCursor, 0, sizeof(*pCursor));

	1264

	1265 pCursor->nPageSize = nPageSize;

	1266

	1267 rc = leafCursorLoadPage(pCursor, pPage);

	1268 if( rc!=SQLITE_OK ){

	1269 sqlite3PagerUnref(pPage);

	1270 leafCursorDestroy(pCursor);

	1271 return rc;

	1272 }

	1273

	1274 /* pPage wasn't a leaf page, find the next leaf page. */

	1275 if( !pCursor->pPage ){

	1276 rc = leafCursorNextPage(pCursor);

	1277 if( rc!=SQLITE_DONE && rc!=SQLITE_ROW ){

	1278 leafCursorDestroy(pCursor);

	1279 return rc;

	1280 }

	1281 }

	1282

	1283 *ppCursor = pCursor;

	1284 return SQLITE_OK;

	1285 }

	1286

	1287 /* Useful for setting breakpoints. */

	1288 static int ValidateError(){

	1289 return SQLITE_ERROR;

	1290 }

	1291

	1292 /* Setup the cursor for reading the information from cell iCell. */

	1293 static int leafCursorCellDecode(RecoverLeafCursor *pCursor){

	1294 const unsigned char pPageHeader; / Header of current page. */

	1295 const unsigned char pPageEnd; / Byte after end of current page. */

	1296 const unsigned char pCellOffsets; / Pointer to page's cell offsets. */

	1297 unsigned iCellOffset; /* Offset of current cell (iCell). */

	1298 const unsigned char pCell; / Pointer to data at iCellOffset. */

	1299 unsigned nCellMaxBytes; /* Maximum local size of iCell. */

	1300 unsigned iEndOffset; /* End of iCell's in-page data. */

	1301 u64 nRecordBytes; /* Expected size of cell, w/overflow. */

	1302 u64 iRowid; /* iCell's rowid (in table). */

	1303 unsigned nRead; /* Amount of cell read. */

	1304 unsigned nRecordHeaderRead; /* Header data read. */

	1305 u64 nRecordHeaderBytes; /* Header size expected. */

	1306 unsigned nRecordCols; /* Columns read from header. */

	1307 u64 nRecordColBytes; /* Bytes in payload for those columns. */

	1308 unsigned i;

	1309 int rc;

	1310

	1311 assert( pCursor->iCell<pCursor->nCells );

	1312

	1313 leafCursorDestroyCellData(pCursor);

	1314

	1315 /* Find the offset to the row. */

	1316 pPageHeader = PageHeader(pCursor->pPage);

	1317 pCellOffsets = pPageHeader + knPageLeafHeaderBytes;

	1318 pPageEnd = PageData(pCursor->pPage, pCursor->nPageSize);

	1319 if( pCellOffsets + pCursor->iCell*2 + 2 > pPageEnd ){

	1320 return ValidateError();

	1321 }

	1322 iCellOffset = decodeUnsigned16(pCellOffsets + pCursor->iCell*2);

	1323 if( iCellOffset>=pCursor->nPageSize ){

	1324 return ValidateError();

	1325 }

	1326

	1327 pCell = PageData(pCursor->pPage, iCellOffset);

	1328 nCellMaxBytes = pCursor->nPageSize - iCellOffset;

	1329

	1330 /* B-tree leaf cells lead with varint record size, varint rowid and

	1331 * varint header size.

	1332 */

	1333 /* TODO(shess): The smallest page size is 512 bytes, which has an m

	1334 * of 39. Three varints need at most 27 bytes to encode. I think.

	1335 */

	1336 if( !checkVarints(pCell, nCellMaxBytes, 3) ){

	1337 return ValidateError();

	1338 }

	1339

	1340 nRead = getVarint(pCell, &nRecordBytes);

	1341 assert( iCellOffset+nRead<=pCursor->nPageSize );

	1342 pCursor->nRecordBytes = nRecordBytes;

	1343

	1344 nRead += getVarint(pCell + nRead, &iRowid);

	1345 assert( iCellOffset+nRead<=pCursor->nPageSize );

	1346 pCursor->iRowid = (i64)iRowid;

	1347

	1348 pCursor->iRecordOffset = iCellOffset + nRead;

	1349

	1350 /* Start overflow setup here because nLocalRecordBytes is needed to

	1351 * check cell overlap.

	1352 */

	1353 rc = overflowMaybeCreate(pCursor->pPage, pCursor->nPageSize,

	1354 pCursor->iRecordOffset, pCursor->nRecordBytes,

	1355 &pCursor->nLocalRecordBytes,

	1356 &pCursor->pOverflow);

	1357 if( rc!=SQLITE_OK ){

	1358 return ValidateError();

	1359 }

	1360

	1361 /* Check that no other cell starts within this cell. */

	1362 iEndOffset = pCursor->iRecordOffset + pCursor->nLocalRecordBytes;

	1363 for( i=0; i<pCursor->nCells && pCellOffsets + i*2 + 2 <= pPageEnd; ++i ){

	1364 const unsigned iOtherOffset = decodeUnsigned16(pCellOffsets + i*2);

	1365 if( iOtherOffset>iCellOffset && iOtherOffset<iEndOffset ){

	1366 return ValidateError();

	1367 }

	1368 }

	1369

	1370 nRecordHeaderRead = getVarint(pCell + nRead, &nRecordHeaderBytes);

	1371 assert( nRecordHeaderBytes<=nRecordBytes );

	1372 pCursor->nRecordHeaderBytes = nRecordHeaderBytes;

	1373

	1374 /* Large headers could overflow if pages are small. */

	1375 rc = overflowGetSegment(pCursor->pPage,

	1376 pCursor->iRecordOffset, pCursor->nLocalRecordBytes,

	1377 pCursor->pOverflow, 0, nRecordHeaderBytes,

	1378 &pCursor->pRecordHeader, &pCursor->bFreeRecordHeader);

	1379 if( rc!=SQLITE_OK ){

	1380 return ValidateError();

	1381 }

	1382

	1383 /* Tally up the column count and size of data. */

	1384 nRecordCols = 0;

	1385 nRecordColBytes = 0;

	1386 while( nRecordHeaderRead<nRecordHeaderBytes ){

	1387 u64 iSerialType; /* Type descriptor for current column. */

	1388 if( !checkVarint(pCursor->pRecordHeader + nRecordHeaderRead,

	1389 nRecordHeaderBytes - nRecordHeaderRead) ){

	1390 return ValidateError();

	1391 }

	1392 nRecordHeaderRead += getVarint(pCursor->pRecordHeader + nRecordHeaderRead,

	1393 &iSerialType);

	1394 if( iSerialType==10 \|\| iSerialType==11 ){

	1395 return ValidateError();

	1396 }

	1397 nRecordColBytes += SerialTypeLength(iSerialType);

	1398 nRecordCols++;

	1399 }

	1400 pCursor->nRecordCols = nRecordCols;

	1401

	1402 /* Parsing the header used as many bytes as expected. */

	1403 if( nRecordHeaderRead!=nRecordHeaderBytes ){

	1404 return ValidateError();

	1405 }

	1406

	1407 /* Calculated record is size of expected record. */

	1408 if( nRecordHeaderBytes+nRecordColBytes!=nRecordBytes ){

	1409 return ValidateError();

	1410 }

	1411

	1412 return SQLITE_OK;

	1413 }

	1414

	1415 static i64 leafCursorCellRowid(RecoverLeafCursor *pCursor){

	1416 return pCursor->iRowid;

	1417 }

	1418

	1419 static unsigned leafCursorCellColumns(RecoverLeafCursor *pCursor){

	1420 return pCursor->nRecordCols;

	1421 }

	1422

	1423 /* Get the column info for the cell. Pass NULL for ppBase to prevent

	1424 * retrieving the data segment. If pbFree is true, ppBase must be

	1425 * freed by the caller using sqlite3_free().

	1426 */

	1427 static int leafCursorCellColInfo(RecoverLeafCursor *pCursor,

	1428 unsigned iCol, u64 *piColType,

	1429 unsigned char *ppBase, int pbFree){

	1430 const unsigned char pRecordHeader; / Current cell's header. */

	1431 u64 nRecordHeaderBytes; /* Bytes in pRecordHeader. */

	1432 unsigned nRead; /* Bytes read from header. */

	1433 u64 iColEndOffset; /* Offset to end of column in cell. */

	1434 unsigned nColsSkipped; /* Count columns as procesed. */

	1435 u64 iSerialType; /* Type descriptor for current column. */

	1436

	1437 /* Implicit NULL for columns past the end. This case happens when

	1438 * rows have not been updated since an ALTER TABLE added columns.

	1439 * It is more convenient to address here than in callers.

	1440 */

	1441 if( iCol>=pCursor->nRecordCols ){

	1442 *piColType = 0;

	1443 if( ppBase ){

	1444 *ppBase = 0;

	1445 *pbFree = 0;

	1446 }

	1447 return SQLITE_OK;

	1448 }

	1449

	1450 /* Must be able to decode header size. */

	1451 pRecordHeader = pCursor->pRecordHeader;

	1452 if( !checkVarint(pRecordHeader, pCursor->nRecordHeaderBytes) ){

	1453 return SQLITE_CORRUPT;

	1454 }

	1455

	1456 /* Rather than caching the header size and how many bytes it took,

	1457 * decode it every time.

	1458 */

	1459 nRead = getVarint(pRecordHeader, &nRecordHeaderBytes);

	1460 assert( nRecordHeaderBytes==pCursor->nRecordHeaderBytes );

	1461

	1462 /* Scan forward to the indicated column. Scans to _after_ column

	1463 * for later range checking.

	1464 */

	1465 /* TODO(shess): This could get expensive for very wide tables. An

	1466 * array of iSerialType could be built in leafCursorCellDecode(), but

	1467 * the number of columns is dynamic per row, so it would add memory

	1468 * management complexity. Enough info to efficiently forward

	1469 * iterate could be kept, if all clients forward iterate

	1470 * (recoverColumn() may not).

	1471 */

	1472 iColEndOffset = 0;

	1473 nColsSkipped = 0;

	1474 while( nColsSkipped<=iCol && nRead<nRecordHeaderBytes ){

	1475 if( !checkVarint(pRecordHeader + nRead, nRecordHeaderBytes - nRead) ){

	1476 return SQLITE_CORRUPT;

	1477 }

	1478 nRead += getVarint(pRecordHeader + nRead, &iSerialType);

	1479 iColEndOffset += SerialTypeLength(iSerialType);

	1480 nColsSkipped++;

	1481 }

	1482

	1483 /* Column's data extends past record's end. */

	1484 if( nRecordHeaderBytes+iColEndOffset>pCursor->nRecordBytes ){

	1485 return SQLITE_CORRUPT;

	1486 }

	1487

	1488 *piColType = iSerialType;

	1489 if( ppBase ){

	1490 const u32 nColBytes = SerialTypeLength(iSerialType);

	1491

	1492 /* Offset from start of record to beginning of column. */

	1493 const unsigned iColOffset = nRecordHeaderBytes+iColEndOffset-nColBytes;

	1494

	1495 return overflowGetSegment(pCursor->pPage, pCursor->iRecordOffset,

	1496 pCursor->nLocalRecordBytes, pCursor->pOverflow,

	1497 iColOffset, nColBytes, ppBase, pbFree);

	1498 }

	1499 return SQLITE_OK;

	1500 }

	1501

	1502 static int leafCursorNextValidCell(RecoverLeafCursor *pCursor){

	1503 while( 1 ){

	1504 int rc;

	1505

	1506 /* Move to the next cell. */

	1507 pCursor->iCell++;

	1508

	1509 /* No more cells, get the next leaf. */

	1510 if( pCursor->iCell>=pCursor->nCells ){

	1511 rc = leafCursorNextPage(pCursor);

	1512 if( rc!=SQLITE_ROW ){

	1513 return rc;

	1514 }

	1515 assert( pCursor->iCell==0 );

	1516 }

	1517

	1518 /* If the cell is valid, indicate that a row is available. */

	1519 rc = leafCursorCellDecode(pCursor);

	1520 if( rc==SQLITE_OK ){

	1521 return SQLITE_ROW;

	1522 }

	1523

	1524 /* Iterate until done or a valid row is found. */

	1525 /* TODO(shess): Remove debugging output. */

	1526 fprintf(stderr, "Skipping invalid cell\n");

	1527 }

	1528 return SQLITE_ERROR;

	1529 }

	1530

	1531 typedef struct Recover Recover;

	1532 struct Recover {

	1533 sqlite3_vtab base;

	1534 sqlite3 db; / Host database connection */

	1535 char zDb; / Database containing target table */

	1536 char zTable; / Target table */

	1537 unsigned nCols; /* Number of columns in target table */

	1538 unsigned char pTypes; / Types of columns in target table */

	1539 };

	1540

	1541 /* Internal helper for deleting the module. */

	1542 static void recoverRelease(Recover *pRecover){

	1543 sqlite3_free(pRecover->zDb);

	1544 sqlite3_free(pRecover->zTable);

	1545 sqlite3_free(pRecover->pTypes);

	1546 memset(pRecover, 0xA5, sizeof(*pRecover));

	1547 sqlite3_free(pRecover);

	1548 }

	1549

	1550 /* Helper function for initializing the module. Forward-declared so

	1551 * recoverCreate() and recoverConnect() can see it.

	1552 */

	1553 static int recoverInit(

	1554 sqlite3 , void , int, const char const, sqlite3_vtab , char

	1555 );

	1556

	1557 static int recoverCreate(

	1558 sqlite3 *db,

	1559 void *pAux,

	1560 int argc, const char constargv,

	1561 sqlite3_vtab **ppVtab,

	1562 char **pzErr

	1563 ){

	1564 FNENTRY();

	1565 return recoverInit(db, pAux, argc, argv, ppVtab, pzErr);

	1566 }

	1567

	1568 /* This should never be called. */

	1569 static int recoverConnect(

	1570 sqlite3 *db,

	1571 void *pAux,

	1572 int argc, const char constargv,

	1573 sqlite3_vtab **ppVtab,

	1574 char **pzErr

	1575 ){

	1576 FNENTRY();

	1577 return recoverInit(db, pAux, argc, argv, ppVtab, pzErr);

	1578 }

	1579

	1580 /* No indices supported. */

	1581 static int recoverBestIndex(sqlite3_vtab tab, sqlite3_index_info pIdxInfo){

	1582 FNENTRY();

	1583 return SQLITE_OK;

	1584 }

	1585

	1586 /* Logically, this should never be called. */

	1587 static int recoverDisconnect(sqlite3_vtab *pVtab){

	1588 FNENTRY();

	1589 recoverRelease((Recover*)pVtab);

	1590 return SQLITE_OK;

	1591 }

	1592

	1593 static int recoverDestroy(sqlite3_vtab *pVtab){

	1594 FNENTRY();

	1595 recoverRelease((Recover*)pVtab);

	1596 return SQLITE_OK;

	1597 }

	1598

	1599 typedef struct RecoverCursor RecoverCursor;

	1600 struct RecoverCursor {

	1601 sqlite3_vtab_cursor base;

	1602 RecoverLeafCursor *pLeafCursor;

	1603 int iEncoding;

	1604 int bEOF;

	1605 };

	1606

	1607 static int recoverOpen(sqlite3_vtab pVTab, sqlite3_vtab_cursor *ppCursor){

	1608 Recover pRecover = (Recover)pVTab;

	1609 u32 iRootPage; /* Root page of the backing table. */

	1610 int iEncoding; /* UTF encoding for backing database. */

	1611 unsigned nPageSize; /* Size of pages in backing database. */

	1612 Pager pPager; / Backing database pager. */

	1613 RecoverLeafCursor pLeafCursor; / Cursor to read table's leaf pages. */

	1614 RecoverCursor pCursor; / Cursor to read rows from leaves. */

	1615 int rc;

	1616

	1617 FNENTRY();

	1618

	1619 iRootPage = 0;

	1620 rc = getRootPage(pRecover->db, pRecover->zDb, pRecover->zTable,

	1621 &iRootPage);

	1622 if( rc!=SQLITE_OK ){

	1623 return rc;

	1624 }

	1625

	1626 iEncoding = 0;

	1627 rc = getEncoding(pRecover->db, pRecover->zDb, &iEncoding);

	1628 if( rc!=SQLITE_OK ){

	1629 return rc;

	1630 }

	1631

	1632 rc = GetPager(pRecover->db, pRecover->zDb, &pPager, &nPageSize);

	1633 if( rc!=SQLITE_OK ){

	1634 return rc;

	1635 }

	1636

	1637 rc = leafCursorCreate(pPager, nPageSize, iRootPage, &pLeafCursor);

	1638 if( rc!=SQLITE_OK ){

	1639 return rc;

	1640 }

	1641

	1642 pCursor = sqlite3_malloc(sizeof(RecoverCursor));

	1643 if( !pCursor ){

	1644 leafCursorDestroy(pLeafCursor);

	1645 return SQLITE_NOMEM;

	1646 }

	1647 memset(pCursor, 0, sizeof(*pCursor));

	1648 pCursor->base.pVtab = pVTab;

	1649 pCursor->pLeafCursor = pLeafCursor;

	1650 pCursor->iEncoding = iEncoding;

	1651

	1652 /* If no leaf pages were found, empty result set. */

	1653 /* TODO(shess): leafCursorNextValidCell() would return SQLITE_ROW or

	1654 * SQLITE_DONE to indicate whether there is further data to consider.

	1655 */

	1656 pCursor->bEOF = (pLeafCursor->pPage==NULL);

	1657

	1658 ppCursor = (sqlite3_vtab_cursor)pCursor;

	1659 return SQLITE_OK;

	1660 }

	1661

	1662 static int recoverClose(sqlite3_vtab_cursor *cur){

	1663 RecoverCursor pCursor = (RecoverCursor)cur;

	1664 FNENTRY();

	1665 if( pCursor->pLeafCursor ){

	1666 leafCursorDestroy(pCursor->pLeafCursor);

	1667 pCursor->pLeafCursor = NULL;

	1668 }

	1669 memset(pCursor, 0xA5, sizeof(*pCursor));

	1670 sqlite3_free(cur);

	1671 return SQLITE_OK;

	1672 }

	1673

	1674 /* Helpful place to set a breakpoint. */

	1675 static int RecoverInvalidCell(){

	1676 return SQLITE_ERROR;

	1677 }

	1678

	1679 /* Returns SQLITE_OK if the cell has an appropriate number of columns

	1680 * with the appropriate types of data.

	1681 */

	1682 static int recoverValidateLeafCell(Recover pRecover, RecoverCursor pCursor){

	1683 unsigned i;

	1684

	1685 /* If the row's storage has too many columns, skip it. */

	1686 if( leafCursorCellColumns(pCursor->pLeafCursor)>pRecover->nCols ){

	1687 return RecoverInvalidCell();

	1688 }

	1689

	1690 /* Skip rows with unexpected types. */

	1691 for( i=0; i<pRecover->nCols; ++i ){

	1692 u64 iType; /* Storage type of column i. */

	1693 int rc;

	1694

	1695 /* ROWID alias. */

	1696 if( (pRecover->pTypes[i]&MASK_ROWID) ){

	1697 continue;

	1698 }

	1699

	1700 rc = leafCursorCellColInfo(pCursor->pLeafCursor, i, &iType, NULL, NULL);

	1701 assert( rc==SQLITE_OK );

	1702 if( rc!=SQLITE_OK \|\| !SerialTypeIsCompatible(iType, pRecover->pTypes[i]) ){

	1703 return RecoverInvalidCell();

	1704 }

	1705 }

	1706

	1707 return SQLITE_OK;

	1708 }

	1709

	1710 static int recoverNext(sqlite3_vtab_cursor *pVtabCursor){

	1711 RecoverCursor pCursor = (RecoverCursor)pVtabCursor;

	1712 Recover pRecover = (Recover)pCursor->base.pVtab;

	1713 int rc;

	1714

	1715 FNENTRY();

	1716

	1717 /* Scan forward to the next cell with valid storage, then check that

	1718 * the stored data matches the schema.

	1719 */

	1720 while( (rc = leafCursorNextValidCell(pCursor->pLeafCursor))==SQLITE_ROW ){

	1721 if( recoverValidateLeafCell(pRecover, pCursor)==SQLITE_OK ){

	1722 return SQLITE_OK;

	1723 }

	1724 }

	1725

	1726 if( rc==SQLITE_DONE ){

	1727 pCursor->bEOF = 1;

	1728 return SQLITE_OK;

	1729 }

	1730

	1731 assert( rc!=SQLITE_OK );

	1732 return rc;

	1733 }

	1734

	1735 static int recoverFilter(

	1736 sqlite3_vtab_cursor *pVtabCursor,

	1737 int idxNum, const char *idxStr,

	1738 int argc, sqlite3_value **argv

	1739 ){

	1740 RecoverCursor pCursor = (RecoverCursor)pVtabCursor;

	1741 Recover pRecover = (Recover)pCursor->base.pVtab;

	1742 int rc;

	1743

	1744 FNENTRY();

	1745

	1746 /* There were no valid leaf pages in the table. */

	1747 if( pCursor->bEOF ){

	1748 return SQLITE_OK;

	1749 }

	1750

	1751 /* Load the first cell, and iterate forward if it's not valid. If no cells at

	1752 * all are valid, recoverNext() sets bEOF and returns appropriately.

	1753 */

	1754 rc = leafCursorCellDecode(pCursor->pLeafCursor);

	1755 if( rc!=SQLITE_OK \|\| recoverValidateLeafCell(pRecover, pCursor)!=SQLITE_OK ){

	1756 return recoverNext(pVtabCursor);

	1757 }

	1758

	1759 return SQLITE_OK;

	1760 }

	1761

	1762 static int recoverEof(sqlite3_vtab_cursor *pVtabCursor){

	1763 RecoverCursor pCursor = (RecoverCursor)pVtabCursor;

	1764 FNENTRY();

	1765 return pCursor->bEOF;

	1766 }

	1767

	1768 static int recoverColumn(sqlite3_vtab_cursor cur, sqlite3_context ctx, int i){

	1769 RecoverCursor pCursor = (RecoverCursor)cur;

	1770 Recover pRecover = (Recover)pCursor->base.pVtab;

	1771 u64 iColType; /* Storage type of column i. */

	1772 unsigned char pColData; / Column i's data. */

	1773 int shouldFree; /* Non-zero if pColData should be freed. */

	1774 int rc;

	1775

	1776 FNENTRY();

	1777

	1778 if( i>=pRecover->nCols ){

	1779 return SQLITE_ERROR;

	1780 }

	1781

	1782 /* ROWID alias. */

	1783 if( (pRecover->pTypes[i]&MASK_ROWID) ){

	1784 sqlite3_result_int64(ctx, leafCursorCellRowid(pCursor->pLeafCursor));

	1785 return SQLITE_OK;

	1786 }

	1787

	1788 pColData = NULL;

	1789 shouldFree = 0;

	1790 rc = leafCursorCellColInfo(pCursor->pLeafCursor, i, &iColType,

	1791 &pColData, &shouldFree);

	1792 if( rc!=SQLITE_OK ){

	1793 return rc;

	1794 }

	1795 /* recoverValidateLeafCell() should guarantee that this will never

	1796 * occur.

	1797 */

	1798 if( !SerialTypeIsCompatible(iColType, pRecover->pTypes[i]) ){

	1799 if( shouldFree ){

	1800 sqlite3_free(pColData);

	1801 }

	1802 return SQLITE_ERROR;

	1803 }

	1804

	1805 switch( iColType ){

	1806 case 0 : sqlite3_result_null(ctx); break;

	1807 case 1 : sqlite3_result_int64(ctx, decodeSigned(pColData, 1)); break;

	1808 case 2 : sqlite3_result_int64(ctx, decodeSigned(pColData, 2)); break;

	1809 case 3 : sqlite3_result_int64(ctx, decodeSigned(pColData, 3)); break;

	1810 case 4 : sqlite3_result_int64(ctx, decodeSigned(pColData, 4)); break;

	1811 case 5 : sqlite3_result_int64(ctx, decodeSigned(pColData, 6)); break;

	1812 case 6 : sqlite3_result_int64(ctx, decodeSigned(pColData, 8)); break;

	1813 case 7 : sqlite3_result_double(ctx, decodeFloat64(pColData)); break;

	1814 case 8 : sqlite3_result_int(ctx, 0); break;

	1815 case 9 : sqlite3_result_int(ctx, 1); break;

	1816 case 10 : assert( iColType!=10 ); break;

	1817 case 11 : assert( iColType!=11 ); break;

	1818

	1819 default : {

	1820 u32 l = SerialTypeLength(iColType);

	1821

	1822 /* If pColData was already allocated, arrange to pass ownership. */

	1823 sqlite3_destructor_type pFn = SQLITE_TRANSIENT;

	1824 if( shouldFree ){

	1825 pFn = sqlite3_free;

	1826 shouldFree = 0;

	1827 }

	1828

	1829 if( SerialTypeIsBlob(iColType) ){

	1830 sqlite3_result_blob(ctx, pColData, l, pFn);

	1831 }else{

	1832 if( pCursor->iEncoding==SQLITE_UTF16LE ){

	1833 sqlite3_result_text16le(ctx, (const void*)pColData, l, pFn);

	1834 }else if( pCursor->iEncoding==SQLITE_UTF16BE ){

	1835 sqlite3_result_text16be(ctx, (const void*)pColData, l, pFn);

	1836 }else{

	1837 sqlite3_result_text(ctx, (const char*)pColData, l, pFn);

	1838 }

	1839 }

	1840 } break;

	1841 }

	1842 if( shouldFree ){

	1843 sqlite3_free(pColData);

	1844 }

	1845 return SQLITE_OK;

	1846 }

	1847

	1848 static int recoverRowid(sqlite3_vtab_cursor pVtabCursor, sqlite_int64 pRowid){

	1849 RecoverCursor pCursor = (RecoverCursor)pVtabCursor;

	1850 FNENTRY();

	1851 *pRowid = leafCursorCellRowid(pCursor->pLeafCursor);

	1852 return SQLITE_OK;

	1853 }

	1854

	1855 static sqlite3_module recoverModule = {

	1856 0, /* iVersion */

	1857 recoverCreate, /* xCreate - create a table */

	1858 recoverConnect, /* xConnect - connect to an existing table */

	1859 recoverBestIndex, /* xBestIndex - Determine search strategy */

	1860 recoverDisconnect, /* xDisconnect - Disconnect from a table */

	1861 recoverDestroy, /* xDestroy - Drop a table */

	1862 recoverOpen, /* xOpen - open a cursor */

	1863 recoverClose, /* xClose - close a cursor */

	1864 recoverFilter, /* xFilter - configure scan constraints */

	1865 recoverNext, /* xNext - advance a cursor */

	1866 recoverEof, /* xEof */

	1867 recoverColumn, /* xColumn - read data */

	1868 recoverRowid, /* xRowid - read data */

	1869 0, /* xUpdate - write data */

	1870 0, /* xBegin - begin transaction */

	1871 0, /* xSync - sync transaction */

	1872 0, /* xCommit - commit transaction */

	1873 0, /* xRollback - rollback transaction */

	1874 0, /* xFindFunction - function overloading */

	1875 0, /* xRename - rename the table */

	1876 };

	1877

	1878 int recoverVtableInit(sqlite3 *db){

	1879 return sqlite3_create_module_v2(db, "recover", &recoverModule, NULL, 0);

	1880 }

	1881

	1882 /* This section of code is for parsing the create input and

	1883 * initializing the module.

	1884 */

	1885

	1886 /* Find the next word in zText and place the endpoints in pzWord*.

	1887 * Returns true if the word is non-empty. "Word" is defined as

	1888 * ASCII alphanumeric plus '_' at this time.

	1889 */

	1890 static int findWord(const char *zText,

	1891 const char pzWordStart, const char pzWordEnd){

	1892 int r;

	1893 while( ascii_isspace(*zText) ){

	1894 zText++;

	1895 }

	1896 *pzWordStart = zText;

	1897 while( ascii_isalnum(zText) \|\| zText=='_' ){

	1898 zText++;

	1899 }

	1900 r = zText>pzWordStart; / In case pzWordStart==pzWordEnd */

	1901 *pzWordEnd = zText;

	1902 return r;

	1903 }

	1904

	1905 /* Return true if the next word in zText is zWord, also setting

	1906 * *pzContinue to the character after the word.

	1907 */

	1908 static int expectWord(const char zText, const char zWord,

	1909 const char **pzContinue){

	1910 const char zWordStart, zWordEnd;

	1911 if( findWord(zText, &zWordStart, &zWordEnd) &&

	1912 ascii_strncasecmp(zWord, zWordStart, zWordEnd - zWordStart)==0 ){

	1913 *pzContinue = zWordEnd;

	1914 return 1;

	1915 }

	1916 return 0;

	1917 }

	1918

	1919 /* Parse the name and type information out of parameter. In case of

	1920 * success, *pzNameStart/End contain the name of the column,

	1921 * pzTypeStart/End contain the top-level type, and pTypeMask has the

	1922 * type mask to use for the column.

	1923 */

	1924 static int findNameAndType(const char *parameter,

	1925 const char pzNameStart, const char pzNameEnd,

	1926 const char pzTypeStart, const char pzTypeEnd,

	1927 unsigned char *pTypeMask){

	1928 unsigned nNameLen; /* Length of found name. */

	1929 const char zEnd; / Current end of parsed column information. */

	1930 int bNotNull; /* Non-zero if NULL is not allowed for name. */

	1931 int bStrict; /* Non-zero if column requires exact type match. */

	1932 const char zDummy; / Dummy parameter, result unused. */

	1933 unsigned i;

	1934

	1935 /* strictMask is used for STRICT, strictMask\|otherMask if STRICT is

	1936 * not supplied. zReplace provides an alternate type to expose to

	1937 * the caller.

	1938 */

	1939 static struct {

	1940 const char *zName;

	1941 unsigned char strictMask;

	1942 unsigned char otherMask;

	1943 const char *zReplace;

	1944 } kTypeInfo[] = {

	1945 { "ANY",

	1946 MASK_INTEGER \| MASK_FLOAT \| MASK_BLOB \| MASK_TEXT \| MASK_NULL,

	1947 0, "",

	1948 },

	1949 { "ROWID", MASK_INTEGER \| MASK_ROWID, 0, "INTEGER", },

	1950 { "INTEGER", MASK_INTEGER \| MASK_NULL, 0, NULL, },

	1951 { "FLOAT", MASK_FLOAT \| MASK_NULL, MASK_INTEGER, NULL, },

	1952 { "NUMERIC", MASK_INTEGER \| MASK_FLOAT \| MASK_NULL, MASK_TEXT, NULL, },

	1953 { "TEXT", MASK_TEXT \| MASK_NULL, MASK_BLOB, NULL, },

	1954 { "BLOB", MASK_BLOB \| MASK_NULL, 0, NULL, },

	1955 };

	1956

	1957 if( !findWord(parameter, pzNameStart, pzNameEnd) ){

	1958 return SQLITE_MISUSE;

	1959 }

	1960

	1961 /* Manifest typing, accept any storage type. */

	1962 if( !findWord(*pzNameEnd, pzTypeStart, pzTypeEnd) ){

	1963 pzTypeEnd = pzTypeStart = "";

	1964 *pTypeMask = MASK_INTEGER \| MASK_FLOAT \| MASK_BLOB \| MASK_TEXT \| MASK_NULL;

	1965 return SQLITE_OK;

	1966 }

	1967

	1968 nNameLen = pzTypeEnd - pzTypeStart;

	1969 for( i=0; i<ArraySize(kTypeInfo); ++i ){

	1970 if( ascii_strncasecmp(kTypeInfo[i].zName, *pzTypeStart, nNameLen)==0 ){

	1971 break;

	1972 }

	1973 }

	1974 if( i==ArraySize(kTypeInfo) ){

	1975 return SQLITE_MISUSE;

	1976 }

	1977

	1978 zEnd = *pzTypeEnd;

	1979 bStrict = 0;

	1980 if( expectWord(zEnd, "STRICT", &zEnd) ){

	1981 /* TODO(shess): Ick. But I don't want another single-purpose

	1982 * flag, either.

	1983 */

	1984 if( kTypeInfo[i].zReplace && !kTypeInfo[i].zReplace[0] ){

	1985 return SQLITE_MISUSE;

	1986 }

	1987 bStrict = 1;

	1988 }

	1989

	1990 bNotNull = 0;

	1991 if( expectWord(zEnd, "NOT", &zEnd) ){

	1992 if( expectWord(zEnd, "NULL", &zEnd) ){

	1993 bNotNull = 1;

	1994 }else{

	1995 /* Anything other than NULL after NOT is an error. */

	1996 return SQLITE_MISUSE;

	1997 }

	1998 }

	1999

	2000 /* Anything else is an error. */

	2001 if( findWord(zEnd, &zDummy, &zDummy) ){

	2002 return SQLITE_MISUSE;

	2003 }

	2004

	2005 *pTypeMask = kTypeInfo[i].strictMask;

	2006 if( !bStrict ){

	2007 *pTypeMask \|= kTypeInfo[i].otherMask;

	2008 }

	2009 if( bNotNull ){

	2010 *pTypeMask &= ~MASK_NULL;

	2011 }

	2012 if( kTypeInfo[i].zReplace ){

	2013 *pzTypeStart = kTypeInfo[i].zReplace;

	2014 pzTypeEnd = pzTypeStart + strlen(*pzTypeStart);

	2015 }

	2016 return SQLITE_OK;

	2017 }

	2018

	2019 /* Parse the arguments, placing type masks in *pTypes and the exposed

	2020 * schema in *pzCreateSql (for sqlite3_declare_vtab).

	2021 */

	2022 static int ParseColumnsAndGenerateCreate(unsigned nCols,

	2023 const char const pCols,

	2024 char **pzCreateSql,

	2025 unsigned char *pTypes,

	2026 char **pzErr){

	2027 unsigned i;

	2028 char *zCreateSql = sqlite3_mprintf("CREATE TABLE x(");

	2029 if( !zCreateSql ){

	2030 return SQLITE_NOMEM;

	2031 }

	2032

	2033 for( i=0; i<nCols; i++ ){

	2034 const char *zSep = (i < nCols - 1 ? ", " : ")");

	2035 const char *zNotNull = "";

	2036 const char zNameStart, zNameEnd;

	2037 const char zTypeStart, zTypeEnd;

	2038 int rc = findNameAndType(pCols[i],

	2039 &zNameStart, &zNameEnd,

	2040 &zTypeStart, &zTypeEnd,

	2041 &pTypes[i]);

	2042 if( rc!=SQLITE_OK ){

	2043 *pzErr = sqlite3_mprintf("unable to parse column %d", i);

	2044 sqlite3_free(zCreateSql);

	2045 return rc;

	2046 }

	2047

	2048 if( !(pTypes[i]&MASK_NULL) ){

	2049 zNotNull = " NOT NULL";

	2050 }

	2051

	2052 /* Add name and type to the create statement. */

	2053 zCreateSql = sqlite3_mprintf("%z%.s %.s%s%s",

	2054 zCreateSql,

	2055 zNameEnd - zNameStart, zNameStart,

	2056 zTypeEnd - zTypeStart, zTypeStart,

	2057 zNotNull, zSep);

	2058 if( !zCreateSql ){

	2059 return SQLITE_NOMEM;

	2060 }

	2061 }

	2062

	2063 *pzCreateSql = zCreateSql;

	2064 return SQLITE_OK;

	2065 }

	2066

	2067 /* Helper function for initializing the module. */

	2068 /* argv[0] module name

	2069 * argv[1] db name for virtual table

	2070 * argv[2] virtual table name

	2071 * argv[3] backing table name

	2072 * argv[4] columns

	2073 */

	2074 /* TODO(shess): Since connect isn't supported, could inline into

	2075 * recoverCreate().

	2076 */

	2077 /* TODO(shess): Explore cases where it would make sense to set pzErr. /

	2078 static int recoverInit(

	2079 sqlite3 db, / Database connection */

	2080 void pAux, / unused */

	2081 int argc, const char constargv, /* Parameters to CREATE TABLE statement */

	2082 sqlite3_vtab *ppVtab, / OUT: New virtual table */

	2083 char *pzErr / OUT: Error message, if any */

	2084 ){

	2085 const unsigned kTypeCol = 4; /* First argument with column type info. */

	2086 Recover pRecover; / Virtual table structure being created. */

	2087 char zDot; / Any dot found in "db.table" backing. */

	2088 u32 iRootPage; /* Root page of backing table. */

	2089 char zCreateSql; / Schema of created virtual table. */

	2090 int rc;

	2091

	2092 /* Require to be in the temp database. */

	2093 if( ascii_strcasecmp(argv[1], "temp")!=0 ){

	2094 *pzErr = sqlite3_mprintf("recover table must be in temp database");

	2095 return SQLITE_MISUSE;

	2096 }

	2097

	2098 /* Need the backing table and at least one column. */

	2099 if( argc<=kTypeCol ){

	2100 *pzErr = sqlite3_mprintf("no columns specified");

	2101 return SQLITE_MISUSE;

	2102 }

	2103

	2104 pRecover = sqlite3_malloc(sizeof(Recover));

	2105 if( !pRecover ){

	2106 return SQLITE_NOMEM;

	2107 }

	2108 memset(pRecover, 0, sizeof(*pRecover));

	2109 pRecover->base.pModule = &recoverModule;

	2110 pRecover->db = db;

	2111

	2112 /* Parse out db.table, assuming main if no dot. */

	2113 zDot = strchr(argv[3], '.');

	2114 if( !zDot ){

	2115 pRecover->zDb = sqlite3_strdup(db->aDb[0].zName);

	2116 pRecover->zTable = sqlite3_strdup(argv[3]);

	2117 }else if( zDot>argv[3] && zDot[1]!='\0' ){

	2118 pRecover->zDb = sqlite3_strndup(argv[3], zDot - argv[3]);

	2119 pRecover->zTable = sqlite3_strdup(zDot + 1);

	2120 }else{

	2121 /* ".table" or "db." not allowed. */

	2122 *pzErr = sqlite3_mprintf("ill-formed table specifier");

	2123 recoverRelease(pRecover);

	2124 return SQLITE_ERROR;

	2125 }

	2126

	2127 pRecover->nCols = argc - kTypeCol;

	2128 pRecover->pTypes = sqlite3_malloc(pRecover->nCols);

	2129 if( !pRecover->zDb \|\| !pRecover->zTable \|\| !pRecover->pTypes ){

	2130 recoverRelease(pRecover);

	2131 return SQLITE_NOMEM;

	2132 }

	2133

	2134 /* Require the backing table to exist. */

	2135 /* TODO(shess): Be more pedantic about the form of the descriptor

	2136 * string. This already fails for poorly-formed strings, simply

	2137 * because there won't be a root page, but it would make more sense

	2138 * to be explicit.

	2139 */

	2140 rc = getRootPage(pRecover->db, pRecover->zDb, pRecover->zTable, &iRootPage);

	2141 if( rc!=SQLITE_OK ){

	2142 *pzErr = sqlite3_mprintf("unable to find backing table");

	2143 recoverRelease(pRecover);

	2144 return rc;

	2145 }

	2146

	2147 /* Parse the column definitions. */

	2148 rc = ParseColumnsAndGenerateCreate(pRecover->nCols, argv + kTypeCol,

	2149 &zCreateSql, pRecover->pTypes, pzErr);

	2150 if( rc!=SQLITE_OK ){

	2151 recoverRelease(pRecover);

	2152 return rc;

	2153 }

	2154

	2155 rc = sqlite3_declare_vtab(db, zCreateSql);

	2156 sqlite3_free(zCreateSql);

	2157 if( rc!=SQLITE_OK ){

	2158 recoverRelease(pRecover);

	2159 return rc;

	2160 }

	2161

	2162 ppVtab = (sqlite3_vtab )pRecover;

	2163 return SQLITE_OK;

	2164 }

OLD	NEW

« no previous file with comments | « third_party/sqlite/src/src/random.c ('k') | third_party/sqlite/src/src/resolve.c » ('j') | no next file with comments »