Issue 9768006: Implement sql::Connection::Raze() in terms of sqlite3_backup API. (Closed)

Created:
8 years, 9 months ago by Scott Hess - ex-Googler

Modified:
8 years, 8 months ago

Reviewers:
Greg Billock

CC:
chromium-reviews

Base URL:
svn://svn.chromium.org/chrome/trunk/src

Visibility:
Public.

Description

Implement sql::Connection::Raze() in terms of sqlite3_backup API.

Wraps up the notion of resetting a database in a way which respects SQLite locking constraints and doesn't require closing the database. A similar outcome could be managed using filesystem operations, but that requires coordination between clients of the database to make sure that no corruption occurs due to incorrect handling of -journal files. Also, Windows pins files until the last handle closes, making that approach challenging in some cases.

BUG=none
TEST=none

ADDENDUM: I had to git cl dcommit, and things froze and looked weird. I thought maybe I needed to re-sync with trunk, so I hit Ctrl-C and resynced. The code already landed: http://src.chromium.org/viewvc/chrome?view=rev&revision=131167

Patch Set 1 #

Patch Set 2 : cleanup and moar testz. #

Total comments: 21

Patch Set 3 : gbillock comments. #

Total comments: 1

Created: 8 years, 8 months ago

Stats (+232 lines, -1 line)
M	sql/connection.h	1 chunk	+20 lines, -0 lines	0 comments
M	sql/connection.cc	1 chunk	+79 lines, -0 lines	1 comment
M	sql/connection_unittest.cc	3 chunks	+133 lines, -1 line	0 comments

Messages

Total messages: 15 (0 generated)

Scott Hess - ex-Googler

I also have a sqlite3_raze() implementation using btree.c internals, but this seemed safer. I plan ...

8 years, 9 months ago (2012-03-21 19:54:09 UTC) #1

Greg Billock

http://codereview.chromium.org/9768006/diff/5001/sql/connection.cc File sql/connection.cc (right): http://codereview.chromium.org/9768006/diff/5001/sql/connection.cc#newcode190 sql/connection.cc:190: // Propagate the page size to the new database. ...

8 years, 8 months ago (2012-03-31 01:23:50 UTC) #2

Scott Hess - ex-Googler

8 years, 8 months ago (2012-04-04 01:28:25 UTC) #3

Thanks for the review - I had entirely forgotten about this CL :-).

I'm still thinking about the "backup exactly one page" thing.  Backing up -1
pages might adapt reasonably if SQLite ever changes in a way which results in
more than 1 page in the empty database.  As indicated below, I am mixed on
that.

http://codereview.chromium.org/9768006/diff/5001/sql/connection.cc
File sql/connection.cc (right):

http://codereview.chromium.org/9768006/diff/5001/sql/connection.cc#newcode190
sql/connection.cc:190: // Propagate the page size to the new database.
On 2012/03/31 01:23:50, Greg Billock wrote:
> Maybe say that this gets the page_size from the current connection, and the
> next stanza propagates it.

OK.

> How likely is the DB to respond to this PRAGMA if it is totally goofed up?

AFAICT, 'PRAGMA page_size' only returns info from the connection structure.  So
this would imply that the database could not even be opened.  In that case, this
approach won't work and something else would need to happen.

http://codereview.chromium.org/9768006/diff/5001/sql/connection.cc#newcode200
sql/connection.cc:200: // one.  The backup API tracks the original schema from the
On 2012/03/31 01:23:50, Greg Billock wrote:
> so the new DB after the backup will have schema_version = 2?

The backup API restores the backup destination's schema_version and adds one.
This guarantees that other readers see a schema change.

I'll clarify the comment.  The code needs to force the null database to have a
page, otherwise it will back up zero pages, in which case it won't honor the
existing page_size.
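
For readers following along, the sequence discussed above (read the page size from the live connection, propagate it to an empty in-memory database, force that database to have a page, then back it up over the live one) can be sketched roughly as below. This is not the CL's code: it uses Python's standard sqlite3 module, whose Connection.backup() wraps the sqlite3_backup API, and the names raze and scalar are hypothetical. The "PRAGMA schema_version = 1" statement is an assumption inferred from this thread about how the null database is forced to have a page. As far as I can tell, Python's wrapper handles busy results on its own, so the sketch does not reproduce Raze() returning false on SQLITE_BUSY.

```python
import sqlite3


def scalar(conn, sql):
    """Run a single-value query and close the cursor so no statement stays active."""
    cur = conn.execute(sql)
    try:
        return cur.fetchone()[0]
    finally:
        cur.close()


def raze(db):
    """Replace the contents of `db` with an empty database of the same page size.

    Hypothetical helper; a sketch of the approach, not Chromium's implementation.
    """
    # Get the page size from the current connection.
    page_size = int(scalar(db, "PRAGMA page_size"))

    null_db = sqlite3.connect(":memory:")
    try:
        # Propagate the page size to the empty source database.
        null_db.execute("PRAGMA page_size = %d" % page_size)
        # Assumption: writing the schema version forces the empty database
        # to contain a page, so the backup has something to copy.
        null_db.execute("PRAGMA schema_version = 1")
        # pages=-1 asks the backup to copy all remaining pages in one step.
        null_db.backup(db, pages=-1)
    finally:
        null_db.close()
```

The destination connection must not be inside a transaction when raze() is called, so a caller would commit or roll back first.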

http://codereview.chromium.org/9768006/diff/5001/sql/connection.cc#newcode214
sql/connection.cc:214: busy_timeout.SetTimeout(base::TimeDelta::FromSeconds(1));
On 2012/03/31 01:23:50, Greg Billock wrote:
> Just rely on callers to use RazeWithTimeout here?

That is a really good suggestion.  I must have been testing or otherwise
distracted!

http://codereview.chromium.org/9768006/diff/5001/sql/connection.cc#newcode216
sql/connection.cc:216: int rc = sqlite3_backup_step(backup, 1);
On 2012/03/31 01:23:50, Greg Billock wrote:
> doc says pass -1 to copy all remaining pages. Would that be better?

Partly because I must have missed the -1 setting.  Even so, I'm mixed.  If more
than one page is backed up, I don't know what that would mean, so I am somewhat
inclined to be conservative.

http://codereview.chromium.org/9768006/diff/5001/sql/connection.cc#newcode220
sql/connection.cc:220: if (rc == SQLITE_BUSY) {
On 2012/03/31 01:23:50, Greg Billock wrote:
> Is there a way to force it open? Presumably something might be wedged hard
> when this method is called, right?

Unfortunately, there's no way to force it open without the possibility of the
other writer writing crap over it.  If this is a problem in practice, the
w/timeout version would be reasonable to try.  I think it might be best to try
it out, histogram the results, and see what happened.  Profile databases are
exclusive-access for the most part, and the caller should know that.

http://codereview.chromium.org/9768006/diff/5001/sql/connection.h
File sql/connection.h (right):

http://codereview.chromium.org/9768006/diff/5001/sql/connection.h#newcode187
sql/connection.h:187: // filesystem operations problematic).  Returns true if the database
On 2012/03/31 01:23:50, Greg Billock wrote:
> Does this mean it won't necessarily work? It seems like this would be used in
> extreme situations, so wouldn't that mean it should take steps to try to make
> sure that collisions get nuked as well?

My broad background with this change is:
A) corruption isn't that common in the first place, and has varied causes, which
makes this hard to test for real.
B) we have a baseline presumption that SQLite's core operations like opening
journal files and writing pages are appropriate.

Tracing through the SQLite code, there are places where this code can fall down,
and for the most part resolving the problem in those cases probably requires
changes in the caller, or additional gruntwork to track down what's really
happening (by adding histograms or DumpWithoutCrash() calls or something).  So
this CL aims to provide a best-effort, I-think-this-should-work-well starting
point, which can be built on.

My expectation is that we'll find that this doesn't work in a tiny number of
cases, and that they'll be "I have no idea what we can do about that" types of
cases.  Like "profile not writable" or "disk is horribly hosed".  If there's a
systemic case, like Database X turns out to have two writers instead of the
expected one, then we'll (hopefully) notice and drill down and fix the problem.

http://codereview.chromium.org/9768006/diff/5001/sql/connection.h#newcode191
sql/connection.h:191: // process.  RazeWithTimeout() may be used if appropriate.
On 2012/03/31 01:23:50, Greg Billock wrote:
> What's the use case for that kind of call?

Most (all?) profile databases are exclusive-access, so they should NEVER receive
SQLITE_BUSY and Raze() should work.  Web SQL databases can be opened in multiple
renderers, in which case SQLITE_BUSY must be expected and handled.
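
To make the SQLITE_BUSY distinction concrete, here is a small sketch of a second connection being refused while another connection holds the database, and succeeding once it is released. It uses Python's standard sqlite3 module rather than the sql::Connection wrapper; write_is_blocked and demo are hypothetical names, and the zero busy timeout is chosen so the refusal is immediate.

```python
import os
import sqlite3
import tempfile


def write_is_blocked(path, busy_timeout_s=0.0):
    """Return True if a write through a fresh connection fails because the db is busy."""
    other = sqlite3.connect(path, timeout=busy_timeout_s)
    try:
        other.execute("CREATE TABLE IF NOT EXISTS probe (x)")
        return False
    except sqlite3.OperationalError:
        # Raised as "database is locked" when SQLite reports SQLITE_BUSY.
        return True
    finally:
        other.close()


def demo():
    """Return (blocked while another connection holds the lock, blocked after release)."""
    path = os.path.join(tempfile.mkdtemp(), "shared.db")
    # isolation_level=None: autocommit mode, so BEGIN/COMMIT are issued explicitly.
    holder = sqlite3.connect(path, isolation_level=None)
    holder.execute("BEGIN EXCLUSIVE")
    blocked_while_locked = write_is_blocked(path)
    holder.execute("COMMIT")
    blocked_after_release = write_is_blocked(path)
    holder.close()
    return blocked_while_locked, blocked_after_release
```

With a single connection per database (the exclusive-access case), the first situation never arises; with several connections to one file, a non-zero busy timeout is the usual way to wait out a short-lived lock.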

http://codereview.chromium.org/9768006/diff/5001/sql/connection_unittest.cc
File sql/connection_unittest.cc (right):

http://codereview.chromium.org/9768006/diff/5001/sql/connection_unittest.cc#n...
sql/connection_unittest.cc:138: EXPECT_EQ(2, s.ColumnInt(0));
On 2012/03/31 01:23:50, Greg Billock wrote:
> is this impl-dependent? that is, won't the page count vary depending on sqlite
> version, build params, or whatever else?

Shouldn't ever vary.  There should be the special page 1 containing the file
header and sqlite_master, and page 2 as the root (and also child) page for table
foo.  It will contain a single one-byte value, so it should fit fine no matter
the page size.  I actually don't think it needs any data to have a root page.

If it ever does generate more than 2 total pages, then someone probably needs to
look at the code and make sure that the assumptions about how databases are
constructed still hold.
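
The two-page expectation can be checked outside the unit test with a few lines. This is a sketch using Python's standard sqlite3 module; the table mirrors the single-column foo table described above, the helper names are hypothetical, and auto_vacuum is switched off explicitly as an assumption so that a build defaulting to auto-vacuum does not add a pointer-map page.

```python
import sqlite3


def page_count(conn):
    """Return the database's current page count as reported by PRAGMA page_count."""
    cur = conn.execute("PRAGMA page_count")
    try:
        return cur.fetchone()[0]
    finally:
        cur.close()


def build_single_table_db():
    """Create an in-memory database holding one table with one small row."""
    conn = sqlite3.connect(":memory:")
    # Assumption: no auto-vacuum, so no extra pointer-map page is allocated.
    conn.execute("PRAGMA auto_vacuum = NONE")
    conn.execute("CREATE TABLE foo (value)")
    conn.execute("INSERT INTO foo VALUES (12)")
    conn.commit()
    return conn
```

Under the reasoning above, page_count(build_single_table_db()) should report page 1 (file header and sqlite_master) plus page 2 (the root page for foo).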

Scott Hess - ex-Googler

commentated and modified to use -1 and DCHECK on not-one-page case. http://codereview.chromium.org/9768006/diff/5001/sql/connection.cc File sql/connection.cc (right): ...

8 years, 8 months ago (2012-04-04 19:26:53 UTC) #4

Greg Billock

8 years, 8 months ago (2012-04-04 19:49:21 UTC) #5

lgtm

http://codereview.chromium.org/9768006/diff/5001/sql/connection.h
File sql/connection.h (right):

http://codereview.chromium.org/9768006/diff/5001/sql/connection.h#newcode187
sql/connection.h:187: // filesystem operations problematic).  Returns true if the database
Sounds good. I don't know if any of that's worth mentioning in the comment.
Since you've already written it, perhaps so...

On 2012/04/04 01:28:25, shess wrote:
> On 2012/03/31 01:23:50, Greg Billock wrote:
> > Does this mean it won't necessarily work? It seems like this would be used
> > in extreme situations, so wouldn't that mean it should take steps to try to
> > make sure that collisions get nuked as well?
>
> My broad background with this change is:
> A) corruption isn't that common in the first place, and has varied causes,
> which makes this hard to test for real.
> B) we have a baseline presumption that SQLite's core operations like opening
> journal files and writing pages are appropriate.
>
> Tracing through the SQLite code, there are places where this code can fall
> down, and for the most part resolving the problem in those cases probably
> requires changes in the caller, or additional gruntwork to track down what's
> really happening (by adding histograms or DumpWithoutCrash() calls or
> something).  So this CL aims to provide a best-effort,
> I-think-this-should-work-well starting point, which can be built on.
>
> My expectation is that we'll find that this doesn't work in a tiny number of
> cases, and that they'll be "I have no idea what we can do about that" types of
> cases.  Like "profile not writable" or "disk is horribly hosed".  If there's a
> systemic case, like Database X turns out to have two writers instead of the
> expected one, then we'll (hopefully) notice and drill down and fix the
> problem.

http://codereview.chromium.org/9768006/diff/5001/sql/connection.h#newcode191
sql/connection.h:191: // process.  RazeWithTimeout() may be used if appropriate.
Do you envision using this most typically for WebSQL disaster recovery? My
mental image was this was for profile DBs, but perhaps that's completely off.

On 2012/04/04 01:28:25, shess wrote:
> On 2012/03/31 01:23:50, Greg Billock wrote:
> > What's the use case for that kind of call?
>
> Most (all?) profile databases are exclusive-access, so they should NEVER
> receive SQLITE_BUSY and Raze() should work.  Web SQL databases can be opened in
> multiple renderers, in which case SQLITE_BUSY must be expected and handled.

http://codereview.chromium.org/9768006/diff/11001/sql/connection.cc
File sql/connection.cc (right):

http://codereview.chromium.org/9768006/diff/11001/sql/connection.cc#newcode234
sql/connection.cc:234: DCHECK_EQ(pages, 1);
Likes it. That should get the attention of a version roll or something that
breaks the assumption.

Scott Hess - ex-Googler

I'm going to stop adding comments, because I could spend eternity debating with myself about ...

8 years, 8 months ago (2012-04-04 20:28:52 UTC) #6