OLD | NEW |
(Empty) | |
| 1 In-memory appengine wrapper |
| 2 =========================== |
| 3 |
| 4 |
| 5 Notes on the internal encodings |
| 6 ------------------------------- |
| 7 |
| 8 All datatypes inside of the index Collections of the gkvlite Store are stored |
| 9 in a manner which allows them to be compared entirely via bytes.Compare. All |
| 10 types are prefixed by a sortable type byte which encodes the sort-order of types |
| 11 according to the appengine SDK. Additionally, types have the following data |
| 12 encoding: |
| 13 * ints |
| 14 * stored with the `funnybase` varint encoding |
| 15 * floats |
| 16 * http://stereopsis.com/radix.html |
| 17 * toBytes: |
| 18 ``` |
| 19 b := math.Float64bits(f) |
| 20 return b ^ (-(b >> 63) | 0x8000000000000000) |
| 21 ``` |
| 22 * fromBytes: |
| 23 ``` |
| 24 return math.Float64frombits(b ^ ((b >> 63) - 1) | 0x8000000000000000) |
| 25 ``` |
| 26 * string, []byte, BlobKey, ByteString |
| 27 * funnybase byte count |
| 28 * raw bytes |
| 29 * \*Key, GeoPoint |
| 30 * composite of above types |
| 31 * time.Time |
| 32 * composite of above types, stored with microsecond accuracy. |
| 33 * rounding to microseconds is a limitation of the real appengine. |
| 34 * toMicro: `return t.Unix()*1e6 + int64(t.Nanosecond()/1e3)` |
| 35 * fromMicro: `return time.Unix(t/1e6, (t%1e6)*1e3)` |
| 36 * nil, true, false |
| 37 * value is encoded directly in the type byte |
| 38 |
| 39 |
| 40 Gkvlite Collection schema |
| 41 ------------------------- |
| 42 |
| 43 In order to provide efficient result deduplication, the value of an index row |
| 44 which indexes 1 or more properties is a concatenation of the previous values |
| 45 which would show up in the same index. For example, if you have the property |
| 46 list for the key K: |
| 47 |
| 48 bob: 1 |
| 49 bob: 4 |
| 50 bob: 7 |
| 51 cat: "hello" |
| 52 cat: "world" |
| 53 |
| 54 And the regular (non-ancestor) composite index was {bob, -cat}, you'd have the |
| 55 rows in the index `idx:ns:kind|R|bob|-cat` (| in the row indicates |
| 56 concatenation, each value has an implied type byte. `...` indicates that other |
| 57 rows may be present): |
| 58 |
| 59 ... |
| 60 1|"world"|K = nil|nil |
| 61 ... |
| 62 1|"hello"|K = nil|"world" |
| 63 ... |
| 64 4|"world"|K = 1|nil |
| 65 ... |
| 66 4|"hello"|K = 1|"world" |
| 67 ... |
| 68 7|"world"|K = 4|nil |
| 69 ... |
| 70 7|"hello"|K = 4|"world" |
| 71 ... |
| 72 |
| 73 This allows us to, start scanning at any point and be able to determine if we've |
| 74 returned a given key already (without storing all of the keys in memory |
| 75 for the duration of the Query run). We can do this because we can see if the |
| 76 value of an index row falls within the original query filter parameters. If it |
| 77 does, then we must already have returned they Key, and can safely skip the index |
| 78 row. AFAIK, real-datastore provides deduplication by keeping all the returned |
| 79 keys in memory as it runs the query, and doing a set-check. |
| 80 |
| 81 The end-result is semantically equivalent, with the exception that Query Cursors |
| 82 on the real datastore will potentially return the same Key in the first Cursor |
| 83 use as well as on the 2nd (or Nth) cursor use, where this method will not. |
| 84 |
| 85 collections |
| 86 ents:ns -> key -> value |
| 87 (rootkind, rootid, __entity_group__,1) -> {_
_version__: int} |
| 88 (rootkind, rootid, __entity_group_ids__,1) -
> {__version__: int} |
| 89 (__entity_group_ids__,1) -> {__version__: in
t} |
| 90 idx:ns:kind -> key = nil |
| 91 idx:ns:kind|prop -> propval|key = [prev val] |
| 92 idx:ns:kind|-prop -> -propval|key = [next val] |
| 93 idx:ns:kind|A|?prop|?prop -> A|propval|propval|key = [prev/next val]|[pre
v/next val] |
| 94 idx:ns:kind|?prop|?prop -> propval|propval|key = [prev/next val]|[prev/
next val] |
| 95 |
| 96 // to add persistence later |
| 97 idx: -> kind,A?,[-?prop]* |
OLD | NEW |