Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(151)

Side by Side Diff: go/src/infra/gae/libs/wrapper/memory/README.md

Issue 1240573002: Reland: Refactor current GAE abstraction library to be free of the SDK* (Closed) Base URL: https://chromium.googlesource.com/infra/infra.git@master
Patch Set: expand coverage range to fit 32bit test expectations Created 5 years, 5 months ago
Use n/p to move between diff chunks; N/P to move between comments. Draft comments are only viewable by you.
Jump to:
View unified diff | Download patch
OLDNEW
(Empty)
1 In-memory appengine wrapper
2 ===========================
3
4
5 Notes on the internal encodings
6 -------------------------------
7
8 All datatypes inside of the index Collections of the gkvlite Store are stored
9 in a manner which allows them to be compared entirely via bytes.Compare. All
10 types are prefixed by a sortable type byte which encodes the sort-order of types
11 according to the appengine SDK. Additionally, types have the following data
12 encoding:
13 * ints
14 * stored with the `funnybase` varint encoding
15 * floats
16 * http://stereopsis.com/radix.html
17 * toBytes:
18 ```
19 b := math.Float64bits(f)
20 return b ^ (-(b >> 63) | 0x8000000000000000)
21 ```
22 * fromBytes:
23 ```
24 return math.Float64frombits(b ^ ((b >> 63) - 1) | 0x8000000000000000)
25 ```
26 * string, []byte, BlobKey, ByteString
27 * funnybase byte count
28 * raw bytes
29 * \*Key, GeoPoint
30 * composite of above types
31 * time.Time
32 * composite of above types, stored with microsecond accuracy.
33 * rounding to microseconds is a limitation of the real appengine.
34 * toMicro: `return t.Unix()*1e6 + int64(t.Nanosecond()/1e3)`
35 * fromMicro: `return time.Unix(t/1e6, (t%1e6)*1e3)`
36 * nil, true, false
37 * value is encoded directly in the type byte
38
39
40 Gkvlite Collection schema
41 -------------------------
42
43 In order to provide efficient result deduplication, the value of an index row
44 which indexes 1 or more properties is a concatenation of the previous values
45 which would show up in the same index. For example, if you have the property
46 list for the key K:
47
48 bob: 1
49 bob: 4
50 bob: 7
51 cat: "hello"
52 cat: "world"
53
54 And the regular (non-ancestor) composite index was {bob, -cat}, you'd have the
55 rows in the index `idx:ns:kind|R|bob|-cat` (| in the row indicates
56 concatenation, each value has an implied type byte. `...` indicates that other
57 rows may be present):
58
59 ...
60 1|"world"|K = nil|nil
61 ...
62 1|"hello"|K = nil|"world"
63 ...
64 4|"world"|K = 1|nil
65 ...
66 4|"hello"|K = 1|"world"
67 ...
68 7|"world"|K = 4|nil
69 ...
70 7|"hello"|K = 4|"world"
71 ...
72
73 This allows us to, start scanning at any point and be able to determine if we've
74 returned a given key already (without storing all of the keys in memory
75 for the duration of the Query run). We can do this because we can see if the
76 value of an index row falls within the original query filter parameters. If it
77 does, then we must already have returned they Key, and can safely skip the index
78 row. AFAIK, real-datastore provides deduplication by keeping all the returned
79 keys in memory as it runs the query, and doing a set-check.
80
81 The end-result is semantically equivalent, with the exception that Query Cursors
82 on the real datastore will potentially return the same Key in the first Cursor
83 use as well as on the 2nd (or Nth) cursor use, where this method will not.
84
85 collections
86 ents:ns -> key -> value
87 (rootkind, rootid, __entity_group__,1) -> {_ _version__: int}
88 (rootkind, rootid, __entity_group_ids__,1) - > {__version__: int}
89 (__entity_group_ids__,1) -> {__version__: in t}
90 // TODO(iannucci): Journal every entity write in a log with a globally
91 // increasing version number (aka "timestamp").
92 //
93 // TODO(iannucci): Use the value in idx collection to indicate the last
94 // global log version reflected in this index. Then index updates can happ en
95 // in parallel, in a truly eventually-consistent fashion (and completely
96 // avoid holding the DB writelock while calculating index entries).
97 // Unfortunately, copying new records (and removing old ones) into the DB
98 // would still require holding global writelock.
99 //
100 // TODO(iannucci): how do we garbage-collect the journal?
101 //
102 // TODO(iannucci): add the ability in gkvlite to 'swap' a collection with
103 // another one, transactionally? Not sure if this is possible to do easily .
104 // If we had this, then we could do all the index writes for a given index
105 // on the side, and then do a quick swap into place with a writelock. As
106 // long as every index only has a single goroutine writing it, then this
107 // would enable maximum concurrency, since all indexes could update in
108 // parallel and only synchronize for the duration of a single pointer swap .
109 idx -> kind|A?|[-?prop]* = nil
110 idx:ns:kind -> key = nil
111 idx:ns:kind|prop -> propval|key = [prev val]
112 idx:ns:kind|-prop -> -propval|key = [next val]
113 idx:ns:kind|A|?prop|?prop -> A|propval|propval|key = [prev/next val]|[pre v/next val]
114 idx:ns:kind|?prop|?prop -> propval|propval|key = [prev/next val]|[prev/ next val]
OLDNEW
« no previous file with comments | « go/src/infra/gae/libs/wrapper/memcache.go ('k') | go/src/infra/gae/libs/wrapper/memory/binutils.go » ('j') | no next file with comments »

Powered by Google App Engine
This is Rietveld 408576698