filter/dscache/doc.go - Issue 1269113005: A transparent cache for datastore, backed by memcache.

Side by Side Diff: filter/dscache/doc.go

Issue 1269113005: A transparent cache for datastore, backed by memcache. (Closed) Base URL: https://github.com/luci/gae.git@add_meta

Patch Set: add test for per-model expiration Created 5 years, 4 months ago

Use n/p to move between diff chunks; N/P to move between comments. Draft comments are only viewable by you.

Jump to:

View unified diff | Download patch

OLD	NEW
(Empty)
	1 // Copyright 2015 The Chromium Authors. All rights reserved.

	2 // Use of this source code is governed by a BSD-style license that can be

	3 // found in the LICENSE file.

	4

	5 // Package dscache provides a transparent cache for RawDatastore which is

	6 // backed by Memcache.

	7 //

	8 // Inspiration

	9 //

	10 // Although this is not a port of any particular implementation, it takes

	11 // inspiration from these fine libraries:

	12 // - https://cloud.google.com/appengine/docs/python/ndb/

	13 // - https://github.com/qedus/nds

	14 // - https://github.com/mjibson/goon

	15 //

	16 // Algorithm

	17 //

	18 // Memcache contains cache entries for single datastore entities. The memcache

	19 // key looks like

	20 //

	21 // "gae:" \| vers \| ":" \| shard \| ":" \| Base64_std_nopad(SHA1(datastore.Key))

	22 //

	23 // Where:

	24 // - vers is an ascii-hex-encoded number (currently 1).

	25 // - shard is a zero-based ascii-hex-encoded number (depends on dscache.shards ).

	26 // - SHA1 has been chosen as unlikely (p == 1e-18) to collide, given dedicated
	Vadim Sh. 2015/08/05 17:09:31 just curious: why not use the key directly? Too lo just curious: why not use the key directly? Too long? iannucci 2015/08/06 01:54:01 Yeah, there's a limit of 250 bytes for the memcach Show quoted text On 2015/08/05 17:09:31, Vadim Sh. wrote: > just curious: why not use the key directly? Too long? Yeah, there's a limit of 250 bytes for the memcache key (and it also may not contain null bytes, so I can't just gzip some encoding of it). Additionally, long keys eat into the available space for a memcache row (overhead + key + value). I figured this was a pretty good compromise.
	27 // memcache sizes of up to 170 Exabytes (assuming an average entry size of

	28 // 100KB including the memcache key). This is clearly overkill, but MD5

	29 // could start showing collisions at this probability in as small as a 26GB

	30 // cache (and also MD5 sucks).
	Vadim Sh. 2015/08/05 17:09:31 that's a good argument :D that's a good argument :D
	31 //

	32 // The memcache value is a compression byte, indicating the scheme (See

	33 // CompressionType), followed by the encoded (and possibly compressed) value.

	34 // Encoding is done with datastore.PropertyMap.Write(). The memcache value

	35 // may also be the empty byte sequence, indicating that this entity is deleted.

	36 //

	37 // The memcache entry may also have a 'flags' value set to one of the following:
	Vadim Sh. 2015/08/05 17:09:31 where 'flags' value is stored? where 'flags' value is stored? iannucci 2015/08/06 01:54:00 It's 4 of the bytes in the 'overhead' of the memca Show quoted text On 2015/08/05 17:09:31, Vadim Sh. wrote: > where 'flags' value is stored? It's 4 of the bytes in the 'overhead' of the memcache implementation. It's SetFlags(uint32) on memcache.Item.
	38 // - 0 "entity" (cached value)

	39 // - 1 "lock" (someone is mutating this entry)

	40 //

	41 // Algorithm - Put and Delete

	42 //

	43 // On a Put (or Delete), the memcache value written with a LockTimeSeconds
	Vadim Sh. 2015/08/05 17:09:31 unconditionally written? Overriding any existing s unconditionally written? Overriding any existing stuff? iannucci 2015/08/06 01:54:00 Yeah, it writes a lock (blank value + 30 second ti Show quoted text On 2015/08/05 17:09:31, Vadim Sh. wrote: > unconditionally written? Overriding any existing stuff? Yeah, it writes a lock (blank value + 30 second timeout) to exclude any Gets from trying to populate the memcache entry while the put is doing it's thing. Afterwards it's deleted. The 30s expiration is in the event of a crash.
	44 // expiration (default 31 seconds), and a memcache flag value of 0x1 (indicating

	45 // that it's a put-locked key).

	46 //

	47 // The datastore operation will then occur. Assuming success, Put will then

	48 // delete all of the memcache locks (Not using CompareAndSwap).
	Vadim Sh. 2015/08/05 17:09:31 what are "memcache locks"? what are "memcache locks"? iannucci 2015/08/06 01:54:00 Hopefully clarified this section Show quoted text On 2015/08/05 17:09:31, Vadim Sh. wrote: > what are "memcache locks"? Hopefully clarified this section
	49 //

	50 // Algorithm - Get

	51 //

	52 // On a Get, "Add" a lock for it (which only does something if there's no entry

	53 // in memcache yet) with a nonce value. We immediately Get the memcache entries

	54 // back (for CAS purposes later).

	55 //

	56 // If it doesn't exist (unlikely since we just Add'd it) or if it's flag is
	Vadim Sh. 2015/08/05 17:09:31 typo: its (not it's) typo: its (not it's) iannucci 2015/08/06 01:54:01 Done. Show quoted text On 2015/08/05 17:09:31, Vadim Sh. wrote: > typo: its (not it's) Done.
	57 // "lock" and the Value != the nonce we put there, go hit the datastore without

	58 // trying to update memcache.

	59 //

	60 // If its flag is "entity", decode the object and return it. If the Value is

	61 // the empty byte sequence, return ErrNoSuchEntity.

	62 //

	63 // If its flag is "lock" and the Value equals the nonce, go get it from the

	64 // datastore. If that's successful, then encode the value to bytes, and CAS

	65 // the object to memcache. The CAS will succeed if nothing else touched the

	66 // memcache in the meantime (like a Put, a memcache expiration/eviction, etc.).

	67 //

	68 // Algorithm - Transactions
	Vadim Sh. 2015/08/05 17:09:31 What about Gets in transaction? DS transactions u What about Gets in transaction? DS transactions use snapshot isolation (or whatever it is called). iannucci 2015/08/06 01:54:01 Gets pass through without any memcache interaction Show quoted text On 2015/08/05 17:09:31, Vadim Sh. wrote: > What about Gets in transaction? > > DS transactions use snapshot isolation (or whatever it is called). Gets pass through without any memcache interaction (reads or writes)
	69 //

	70 // In a transaction, all Put memcache operations are held until the very end of

	71 // the transaction. Right before the transaction is committed, all accumulated

	72 // Put keys are locked. If the transaction is sucessfully committed (err ==

	73 // nil), then all the locks will be deleted.

	74 //

	75 // The assumption here is that get operations apply all outstanding

	76 // transactions before they return data (https://cloud.google.com/appengine/docs /go/datastore/#Go_Datastore_writes_and_data_visibility),

	77 // and so it is safe to purge all the locks if the transaction is known-good.

	78 //

	79 // If the transaction succeeds, but RunInTransaction returns an error (which can

	80 // happen), or if the transaction fails, then the lock entries time out

	81 // naturally. This will mean 31-ish seconds of direct datastore access, but it's

	82 // the more-correct thing to do.

	83 //

	84 // Cache control

	85 //

	86 // An entity may expose the following metadata (see

	87 // datastore.PropertyLoadSaver.GetMeta) to control the behavior of its cache.

	88 //

	89 // - "dscache.enable,<true\|false>" - whether or not this entity should be
	Vadim Sh. 2015/08/05 17:09:31 isn't it $dscache.enable? isn't it $dscache.enable? iannucci 2015/08/06 01:54:01 clarified Show quoted text On 2015/08/05 17:09:31, Vadim Sh. wrote: > isn't it $dscache.enable? clarified
	90 // cached at all. If ommitted, dscache defaults to true.

	91 // - "dscache.expiration,#seconds" - the number of seconds of persistance to

	92 // use when this item is cached. 0 is infinite. If omitted, defaults to 0.

	93 //

	94 // In addition, the application may set a function ShardsForKey(key) which

	95 // returns the number of shards to use for a given datastore key.
	Vadim Sh. 2015/08/05 17:09:31 what does it mean to use sharding for a single dat what does it mean to use sharding for a single datastore key? iannucci 2015/08/06 01:54:00 clarified Show quoted text On 2015/08/05 17:09:31, Vadim Sh. wrote: > what does it mean to use sharding for a single datastore key? clarified
	96 //

	97 // Caveats

	98 //

	99 // A couple things to note that may differ from other appengine datastore

	100 // caching libraries (like goon, nds, or ndb).

	101 //

	102 // - It does NOT provide in-memory ("per-request") caching.
	Vadim Sh. 2015/08/05 17:09:31 good :) good :)
	103 // - It's tolerant of memcache failures (but will give potentially

	104 // inconsistent results if memcache is non-operational). Using a transaction

	105 // bypasses the cache logic, which will present a consistent view of the dat a.

	106 // - Queries do not interact with the cache at all.

	107 // - Negative lookups (e.g. ErrNoSuchEntity) are cached.

	108 // - Allows sharding hot memcache entries as recommended by

	109 // https://cloud.google.com/appengine/articles/best-practices-for-app-engine -memcache#distribute-load .

	110 package dscache

OLD	NEW

« filter/dscache/context.go ('K') | « filter/dscache/context.go ('k') | filter/dscache/ds.go » ('j') | filter/dscache/ds.go » ('J')