In-memory appengine wrapper
===========================


Notes on the internal encodings
-------------------------------

All datatypes inside of the index Collections of the gkvlite Store are stored
in a manner which allows them to be compared entirely via bytes.Compare. All
types are prefixed by a sortable type byte which encodes the sort-order of types
according to the appengine SDK. Additionally, types have the following data
encoding:
  * ints
    * stored with the `funnybase` varint encoding
  * floats
    * http://stereopsis.com/radix.html
    * toBytes:
      ```
      b := math.Float64bits(f)
      return b ^ (-(b >> 63) | 0x8000000000000000)
      ```
    * fromBytes:
      ```
      return math.Float64frombits(b ^ (((b >> 63) - 1) | 0x8000000000000000))
      ```
  * string, []byte, BlobKey, ByteString
    * funnybase byte count
    * raw bytes
  * \*Key, GeoPoint
    * composite of above types
  * time.Time
    * composite of above types, stored with microsecond accuracy.
      * rounding to microseconds is a limitation of the real appengine.
    * toMicro: `return t.Unix()*1e6 + int64(t.Nanosecond()/1e3)`
    * fromMicro: `return time.Unix(t/1e6, (t%1e6)*1e3)`
  * nil, true, false
    * value is encoded directly in the type byte


Gkvlite Collection schema
-------------------------

In order to provide efficient result deduplication, the value of an index row
which indexes 1 or more properties is a concatenation of the previous values
which would show up in the same index. For example, if you have the property
list for the key K:

  bob: 1
  bob: 4
  bob: 7
  cat: "hello"
  cat: "world"

And the regular (non-ancestor) composite index was {bob, -cat}, you'd have the
rows in the index `idx:ns:kind|R|bob|-cat` (`|` in the row indicates
concatenation, each value has an implied type byte. `...` indicates that other
rows may be present):

  ...
  1|"world"|K = nil|nil
  ...
  1|"hello"|K = nil|"world"
  ...
  4|"world"|K = 1|nil
  ...
  4|"hello"|K = 1|"world"
  ...
  7|"world"|K = 4|nil
  ...
  7|"hello"|K = 4|"world"
  ...

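The rows above can be reproduced with a short sketch. `indexRows` is a
hypothetical helper (not the real implementation, which works on encoded
bytes): each property value is paired with the value that precedes it in that
property's sort order, and the cartesian product of those pairs yields one row
per combination, whose value is the concatenation of the "previous" entries:

```go
package main

import "fmt"

// indexRows sketches the {bob, -cat} index rows for a single key K. The
// inputs must already be in index sort order (bob ascending, cat descending).
func indexRows(bob, cat []string) []string {
	// prev returns the value preceding index i in this property's sort
	// order, or "nil" for the first value.
	prev := func(vals []string, i int) string {
		if i == 0 {
			return "nil"
		}
		return vals[i-1]
	}
	var rows []string
	for i, b := range bob {
		for j, c := range cat {
			rows = append(rows,
				fmt.Sprintf("%s|%s|K = %s|%s", b, c, prev(bob, i), prev(cat, j)))
		}
	}
	return rows
}

func main() {
	// cat is descending per the -cat index term, hence "world" before "hello".
	for _, r := range indexRows(
		[]string{"1", "4", "7"},
		[]string{`"world"`, `"hello"`},
	) {
		fmt.Println(r)
	}
}
```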
This allows us to start scanning at any point and still determine whether we've
already returned a given key (without storing all of the keys in memory
for the duration of the Query run). We can do this because we can see if the
value of an index row falls within the original query filter parameters. If it
does, then we must have already returned the Key, and can safely skip the index
row. AFAIK, the real datastore provides deduplication by keeping all of the
returned keys in memory as it runs the query, and doing a set-check.
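
As a sketch of that skip check (hypothetical names and plain ints standing in
for the encoded property bytes; this is an illustration of the rule, not the
real implementation): a row can be skipped whenever the "previous value"
stored in its index value also satisfies the query's filter range, because the
scan must already have visited the row for that previous value and emitted
this Key there:

```go
package main

import "fmt"

// indexRow pairs an indexed property value with the previous value that the
// same Key contributed to this index (nil means no previous value).
type indexRow struct {
	propVal int
	key     string
	prev    *int
}

func inRange(v, lo, hi int) bool { return lo <= v && v <= hi }

// emit reports whether a row scanned under the filter [lo, hi] should yield
// its key; rows whose previous value also matched the filter are duplicates.
func emit(r indexRow, lo, hi int) bool {
	return r.prev == nil || !inRange(*r.prev, lo, hi)
}

func main() {
	one, four := 1, 4
	// Key K has bob values 1, 4, 7; the filter is 2 <= bob <= 10.
	rows := []indexRow{
		{4, "K", &one},  // prev=1 is outside [2,10] -> emit K here
		{7, "K", &four}, // prev=4 is inside [2,10]  -> skip, K already emitted
	}
	for _, r := range rows {
		fmt.Println(r.propVal, emit(r, 2, 10))
	}
}
```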

The end-result is semantically equivalent, with the exception that Query Cursors
on the real datastore will potentially return the same Key in the first Cursor
use as well as on the 2nd (or Nth) cursor use, where this method will not.

collections
  ents:ns -> key -> value
             (rootkind, rootid, __entity_group__,1) -> {__version__: int}
             (rootkind, rootid, __entity_group_ids__,1) -> {__version__: int}
             (__entity_group_ids__,1) -> {__version__: int}
    // TODO(iannucci): Journal every entity write in a log with a globally
    // increasing version number (aka "timestamp").
    //
    // TODO(iannucci): Use the value in idx collection to indicate the last
    // global log version reflected in this index. Then index updates can happen
    // in parallel, in a truly eventually-consistent fashion (and completely
    // avoid holding the DB writelock while calculating index entries).
    // Unfortunately, copying new records (and removing old ones) into the DB
    // would still require holding the global writelock.
    //
    // TODO(iannucci): how do we garbage-collect the journal?
    //
    // TODO(iannucci): add the ability in gkvlite to 'swap' a collection with
    // another one, transactionally? Not sure if this is possible to do easily.
    // If we had this, then we could do all the index writes for a given index
    // on the side, and then do a quick swap into place with a writelock. As
    // long as every index only has a single goroutine writing it, then this
    // would enable maximum concurrency, since all indexes could update in
    // parallel and only synchronize for the duration of a single pointer swap.
  idx -> kind|A?|[-?prop]* = nil
  idx:ns:kind -> key = nil
  idx:ns:kind|prop -> propval|key = [prev val]
  idx:ns:kind|-prop -> -propval|key = [next val]
  idx:ns:kind|A|?prop|?prop -> A|propval|propval|key = [prev/next val]|[prev/next val]
  idx:ns:kind|?prop|?prop -> propval|propval|key = [prev/next val]|[prev/next val]