Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(1)

Side by Side Diff: content/browser/indexed_db/leveldb_coding_scheme.md

Issue 1682253004: Indexed DB: Pull leveldb coding scheme docs into a markdown file (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master
Patch Set: Add compile asserts, reformat docs Created 4 years, 10 months ago
Use n/p to move between diff chunks; N/P to move between comments. Draft comments are only viewable by you.
Jump to:
View unified diff | Download patch
OLDNEW
(Empty)
1 # LevelDB Coding Scheme
2
3 LevelDB stores key/value pairs. Keys and values are strings of bytes,
4 normally of type std::string.
5
6 The keys in the backing store are variable-length tuples with
7 different types of fields. Each key in the backing store starts with a
8 ternary prefix: <database id, object store id, index id>. For each, 0
9 is reserved for metadata.
10
11 > The prefix makes sure that data for a specific database, object store, and
cmumford 2016/02/11 00:48:32 Why did you decide to blockquote this paragraph?
jsbell 2016/02/11 23:26:55 The markdown formatter I was using rendered it mor
12 > index are grouped together. The locality is important for performance:
13 > common operations should only need a minimal number of seek operations. For
14 > example, all the metadata for a database is grouped together so that
15 > reading that metadata only requires one seek.
16
17 Each key type has a class (in square brackets below) which knows how
jsbell 2016/02/11 23:26:55 Note definition of "square brackets" here. Made it
18 to encode, decode, and compare that key type.
19
20 The term "user key" refers to an Indexed DB key value specified by
21 user code as opposed to the internal keys as described here.
22
23 Integer literals used below (e.g. <0, 0, 0, 201, ...>) are defined as
24 constants in `indexed_db_leveldb_coding.cc`
25
26 > Note: In order to be compatible with LevelDB's Bloom filter each bit of the
cmumford 2016/02/11 00:48:31 Consider using this markdown: *** note **Warning:
jsbell 2016/02/11 23:26:55 Done.
27 > encoded key needs to used and "not ignored" by the comparator.
28
29 ## Types
30
31 ### Primitive Types
32
33 Generic types which may appear as parts of keys or values are:
34
35 * **Byte** - what it says on the tin
36 * **Bool** - single byte, 0 = false, otherwise true
37 * **Int** - int64_t; 8 bytes in little-endian order
38 * **VarInt** - int64_t >= 0; variable-width, little endian, 7 bits per
39 byte with high bit set until last
40 * **String** - UTF-16BE (must be byte-swapped on x86/x64/ARM); length
41 must be implicit
42 * **StringWithLength** - VarInt prefix with length in code units (i.e.
43 bytes/2), followed by String
44 * **Binary** - VarInt prefix with length in bytes, followed by data
45 bytes
46 * **Double** - IEEE754 64-bit (double), in *host endianness*
47
48 ### IDBKey
49
50 First byte is the type, followed by type-specific serialization:
51
52 * Number = Byte(3), Double
53 * Date = Byte(2), Double
54 * String = Byte(1), StringWithLength
55 * Binary = Byte(6), Binary
56 * Array = Byte(4), VarInt(count), IDBKey...
57
58 ### IDBKeyPath
59
60 * Null: Byte(0), Byte(0), Byte(0)
61 * String: Byte(0), Byte(0), Byte(1), StringWithLength
62 * Array: Byte(0), Byte(0), Byte(2), VarInt(count), StringWithLength...
63
64 If length is < 3 or the first two bytes are not 0, 0 it is treated as
65 a String (backwards compatibility).
66
67 ## Key Prefix
68 [`KeyPrefix`]
cmumford 2016/02/11 00:48:31 I'm not sure what this means - to have it in brace
jsbell 2016/02/11 23:26:55 Defined above, but made more obvious.
69
70 Each key is prefixed with <database id, object store id, index id>
cmumford 2016/02/11 00:48:31 Hour <> is interpreted as a tag showing as: "Each
jsbell 2016/02/11 23:26:55 *sigh* worked in the markdown editor I was using.
71 with 0 reserved for metadata.
72
73 To save space, the prefix is not encoded with the usual types. The
74 first byte defines the lengths of the other fields:
75
76 * The top 3 bits are the length of the database id - 1 (in bytes)
77 * The next 3 bits are the length of the object store id - 1(in bytes)
78 * The bottom 2 bits are the length of the index id - 1 (in bytes)
79
80 This is followed by:
81
82 * the database id in little-endian order (1 - 8 bytes)
83 * the object store id in little-endian order (1 - 8 bytes)
84 * the index id in little-endian order (1 - 4 bytes)
85
86 ## Global metadata
87
88 The prefix is <0, 0, 0>, followed by a metadata type byte:
cmumford 2016/02/11 00:48:31 These braces also interpreted as html.
jsbell 2016/02/11 23:26:55 Done.
89
90 key | value
91 ------------------------------------|------
92 <0, 0, 0, 0> | backing store schema version [`SchemaVersi onKey`]
93 <0, 0, 0, 1> | maximum allocated database [`MaxDatabaseId Key`]
cmumford 2016/02/11 00:48:31 Keys in this table, and those below also interpret
jsbell 2016/02/11 23:26:55 Done.
94 <0, 0, 0, 2> | SerializedScriptValue version [`DataVersio nKey`]
95 <0, 0, 0, 3> | Blob journal [`BlobJournalKey`]
96 <0, 0, 0, 4> | Live blob journal; same format. [`LiveBlob JournalKey`]
97 <0, 0, 0, 100, VarInt(database id)> | Existence implies the database id is in th e free list [`DatabaseFreeListKey`] - _obsolete_
98 <0, 0, 0, 201, StringWithLength(origin), StringWithLength(database name)> | Data base id (Int) [`DatabaseNameKey`]
99
100
101 ### Blob Journal
102 The format of the blob journal value is:
103
104 `{database_id (var int), blobKey (var int)}*`
105
106 There is no length field; just read until you run out of data.
107
108 If the blobKey is `kAllBlobsKey`, the whole database should be
109 deleted.
110
111 ## Database metadata
112 [`DatabaseMetaDataKey`]
113
114 The prefix is <database id, 0, 0> followed by a metadata type Byte:
115
116 key | value
117 -----------------------|-------
118 <database id, 0, 0, 0> | origin name (String)
119 <database id, 0, 0, 1> | database name (String)
120 <database id, 0, 0, 2> | IDB string version data (String) - _obsolete_
121 <database id, 0, 0, 3> | maximum allocated object store id (Int)
122 <database id, 0, 0, 4> | IDB integer version (VarInt)
123 <database id, 0, 0, 5> | blob key generator current number (VarInt)
124
125
126 ## More database metadata
127
128 The prefix is <database id, 0, 0> followed by a type Byte. The object
129 store and index id are VarInt, the names are StringWithLength.
130
131 key | value
132 ------------------------------------------------------|-------
133 <database id, 0, 0, 150, object store id> | existence implies the ob ject store id is in the free list [`ObjectStoreFreeListKey`] - _obsolete_
134 <database id, 0, 0, 151, object store id, index id> | existence implies the in dex id is in the free list [`IndexFreeListKey`] - _obsolete_
135 <database id, 0, 0, 200, object store name> | object store id (Int) [` ObjectStoreNamesKey`]
136 <database id, 0, 0, 201, object store id, index name> | index id (Int) [`IndexNa mesKey`]
137
138 ## Object store metadata
139 [`ObjectStoreMetaDataKey`]
140
141 The prefix is <database id, 0, 0>, followed by a type Byte (50), then
142 the object store id (VarInt), then a metadata type Byte.
143
144 key | value
145 --------------------------------------------|-------
146 <database id, 0, 0, 50, object store id, 0> | object store name (String)
147 <database id, 0, 0, 50, object store id, 1> | key path (IDBKeyPath)
148 <database id, 0, 0, 50, object store id, 2> | auto increment flag (Bool)
149 <database id, 0, 0, 50, object store id, 3> | is evictable (Bool) - _obsolete_
150 <database id, 0, 0, 50, object store id, 4> | last _version_ number (Int)
cmumford 2016/02/11 00:48:32 I see this makes "version" italic, but no others.
jsbell 2016/02/11 23:26:55 It was originally quoted, which was weird. Agreed
151 <database id, 0, 0, 50, object store id, 5> | maximum allocated index id (Int)
152 <database id, 0, 0, 50, object store id, 6> | has key path flag (Bool) - _obsole te_
153 <database id, 0, 0, 50, object store id, 7> | key generator current number (Int)
154
155 The key path was originally just a string (#1) or null (identified by
156 flag, #6). To support null, string, or array the coding is now
157 identified by the leading bytes in #1 - see IDBKeyPath.
158
159 The _version_ field is used to weed out stale index data. Whenever new
160 object store data is inserted, it gets a new _version_ number, and new
161 index data is written with this number. When the index is used for
162 look-ups, entries are validated against the "exists" entries, and
163 records with old _version_ numbers are deleted when they are
164 encountered in `GetPrimaryKeyViaIndex`,
165 `IndexCursorImpl::LoadCurrentRow` and
166 `IndexKeyCursorImpl::LoadCurrentRow`.
167
168
169 ## Index metadata
170 [`IndexMetaDataKey`]
171
172 The prefix is <database id, 0, 0>, followed by a type Byte (100), then
173 the object store id (VarInt), then the index id (VarInt), then a
174 metadata type Byte.
175
176 key | value
177 -------------------------------------------------------|-------
178 <database id, 0, 0, 100, object store id, index id, 0> | index name (String)
179 <database id, 0, 0, 100, object store id, index id, 1> | unique flag (Bool)
180 <database id, 0, 0, 100, object store id, index id, 2> | key path (IDBKeyPath)
181 <database id, 0, 0, 100, object store id, index id, 3> | multi-entry flag (Bool)
182
183 ## Object store data
184 [`ObjectStoreDataKey`]
185
186 The reserved index id `1` is used in the prefix. The prefix is
187 followed the encoded IDB primary key (IDBKey). The data has a
188 _version_ prefix followed by the serialized script value.
189
190 key | value
191 ----------------------------------------------------|-------
192 <database id, object store id, 1, IDBKey(user key)> | _version_ (VarInt), serial ized script value
193
194
195 ## "Exists" entry
196 [`ExistsEntryKey`]
197
198 The reserved index id `2` is used in the prefix. The prefix is
199 followed the encoded IDB primary key (IDBKey).
200
201 key | value
202 ----------------------------------------------------|-------
203 <database id, object store id, 2, IDBKey(user key)> | _version_ (VarInt)
204
205
206 ## Blob entry table
207 [`BlobEntryKey`]
208
209 The reserved index id `3` is used in the prefix. The prefix is
210 followed the encoded IDB primary key.
211
212 key | value
213 ----------------------------------------------------|-------
214 <database id, object store id, 3, IDBKey(user key)> | BlobData
215
216 ### BlobData
217
218 { Bool(is_file),
219 VarInt(key),
220 StringWithLength(type), // may be empty
221 /*for Blobs only*/ VarInt(size),
222 /*for Files only*/ StringWithLength(filename)
223 } *
cmumford 2016/02/11 00:48:31 errant asterisk?
jsbell 2016/02/11 23:26:55 Means "zero or more of the preceding" as in BNF/re
224
225 There is no length field; just read until you run out of data.
226
227 ## Index data
228
229 The prefix is followed by a type byte, the encoded index key (IDBKey),
230 a _sequence number_ (VarInt), and the encoded primary key (IDBKey).
231
232 key | value
233 ----|-------
234 <database id, object store id, index id, IDBKey(index key), VarInt(sequence numb er), IDBKey(primary key)> | _version_ (VarInt), primary key (IDBKey) [`IndexDat aKey`]
235
236 The _sequence number_ is obsolete; it was used to allow two entries
237 with the same user (index) key in non-unique indexes prior to the
238 inclusion of the primary key in the data. `0` is always written now
239 (which, as a VarInt, takes a single byte)
OLDNEW

Powered by Google App Engine
This is Rietveld 408576698