filter/dscache/doc.go - Issue 1269113005: A transparent cache for datastore, backed by memcache.

Unified Diff: filter/dscache/doc.go

Issue 1269113005: A transparent cache for datastore, backed by memcache. (Closed) Base URL: https://github.com/luci/gae.git@add_meta

Patch Set: Created 5 years, 4 months ago

Index: filter/dscache/doc.go

diff --git a/filter/dscache/doc.go b/filter/dscache/doc.go

new file mode 100644

index 0000000000000000000000000000000000000000..12222046313be64e07252948ac98a0b838b942b3

--- /dev/null

+++ b/filter/dscache/doc.go

@@ -0,0 +1,110 @@

+// Use of this source code is governed by a BSD-style license that can be

+// found in the LICENSE file.

+// Package dscache provides a transparent cache for RawDatastore which is

+// backed by Memcache.

+//

+// Inspiration

+//

+// Although this is not a port of any particular implementation, it takes

+// inspiration from these fine libraries:

+// - https://cloud.google.com/appengine/docs/python/ndb/

+// - https://github.com/qedus/nds

+// - https://github.com/mjibson/goon

+//

+// Algorithm

+//

+// Memcache contains cache entries for single datastore entities. The memcache

+// key looks like

+//

+// "gae:" | vers | ":" | shard | ":" | Base64_std_nopad(SHA1(rawdatastore.Key))

+//

+// Where:

+// - vers is an ascii-hex-encoded number (currently 1).

+// - shard is a zero-based ascii-hex-encoded number (depends on dscache.shards).

+// - SHA1 has been chosen as unlikely (p == 1e-18) to collide, given dedicated

+//   memcache sizes of up to 170 Exabytes (assuming an average entry size of

+//   100KB including the memcache key). This is clearly overkill, but MD5

+//   could start showing collisions at this probability in as small as a 26GB

+//   cache (and also MD5 sucks).

+//

+// The memcache value is a compression byte, indicating the scheme (See

+// CompressionType), followed by the encoded (and possibly compressed) value.

+// Encoding is done with rawdatastore.PropertyMap.Write(). The memcache value

+// may also be the empty byte sequence, indicating that this entity is deleted.

+//

+// The memcache entry may also have a 'flags' value set to one of the following:

+// - 0 "entity" (cached value)

+// - 1 "lock" (someone is mutating this entry)

+//

+// Algorithm - Put and Delete

+//

+// On a Put (or Delete), the memcache value is written with a LockTimeSeconds

+// expiration (default 31 seconds), and a memcache flag value of 0x1 (indicating

+// that it's a put-locked key).

+//

+// The datastore operation will then occur. Assuming success, Put will then

+// delete all of the memcache locks (Not using CompareAndSwap).

+//

+// Algorithm - Get

+//

+// On a Get, "Add" a lock for it (which only does something if there's no entry

+// in memcache yet) with a nonce value. We immediately Get the memcache entries

+// back (for CAS purposes later).

+//

+// If it doesn't exist (unlikely since we just Add'd it) or if its flag is

+// "lock" and the Value != the nonce we put there, go hit the datastore without

+// trying to update memcache.

+//

+// If its flag is "entity", decode the object and return it. If the Value is

+// the empty byte sequence, return ErrNoSuchEntity.

+//

+// If its flag is "lock" and the Value equals the nonce, go get it from the

+// datastore. If that's successful, then encode the value to bytes, and CAS

+// the object to memcache. The CAS will succeed if nothing else touched the

+// memcache in the meantime (like a Put, a memcache expiration/eviction, etc.).

+//

+// Algorithm - Transactions

+//

+// In a transaction, all Put memcache operations are held until the very end of

+// the transaction. Right before the transaction is committed, all accumulated

+// Put keys are locked. If the transaction is successfully committed (err ==

+// nil), then all the locks will be deleted.

+//

+// The assumption here is that get operations apply all outstanding

+// transactions before they return data (https://cloud.google.com/appengine/docs/go/datastore/#Go_Datastore_writes_and_data_visibility),

+// and so it is safe to purge all the locks if the transaction is known-good.

+//

+// If the transaction succeeds, but RunInTransaction returns an error (which can

+// happen), or if the transaction fails, then the lock entries time out

+// naturally. This will mean 31-ish seconds of direct datastore access, but it's

+// the more-correct thing to do.

+//

+// Cache control

+//

+// An entity may expose the following metadata (see

+// rawdatastore.PropertyLoadSaver.GetMeta) to control the behavior of its cache.

+//

+// - "dscache.enable,<true|false>" - whether or not this entity should be

+//   cached at all. If omitted, dscache defaults to true.

+// - "dscache.expiration,#seconds" - the number of seconds of persistence to

+//   use when this item is cached. 0 is infinite. If omitted, defaults to 0.

+//

+// In addition, the application may set a function ShardsForKey(key) which

+// returns the number of shards to use for a given datastore key.

+//

+// Caveats

+//

+// A couple of things to note that may differ from other appengine datastore

+// caching libraries (like goon, nds, or ndb).

+//

+// - It does NOT provide in-memory ("per-request") caching.

+// - It's tolerant of memcache failures (but will give potentially

+//   inconsistent results if memcache is non-operational). Using a transaction

+//   bypasses the cache logic, which will present a consistent view of the data.

+// - Queries do not interact with the cache at all.

+// - Negative lookups (e.g. ErrNoSuchEntity) are cached.

+// - Allows sharding hot memcache entries as recommended by

+// https://cloud.google.com/appengine/articles/best-practices-for-app-engine-memcache#distribute-load .

+package dscache
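The memcache key layout described in the "Algorithm" section can be sketched as follows. This is only an illustration of the documented format, not the package's code: `cacheKey` is a hypothetical name, and `serializedKey` stands in for whatever byte serialization of rawdatastore.Key the package hashes, which the doc comment does not specify.

```go
package main

import (
	"crypto/sha1"
	"encoding/base64"
	"fmt"
)

// cacheKey builds
//   "gae:" | vers | ":" | shard | ":" | Base64_std_nopad(SHA1(key))
// with vers and shard rendered as ASCII hex, as the doc comment describes.
// serializedKey is an assumed stand-in for the serialized datastore key.
func cacheKey(vers, shard int, serializedKey []byte) string {
	sum := sha1.Sum(serializedKey)
	// RawStdEncoding is the standard alphabet without '=' padding.
	hash := base64.RawStdEncoding.EncodeToString(sum[:])
	return fmt.Sprintf("gae:%x:%x:%s", vers, shard, hash)
}

func main() {
	// Hypothetical key bytes, version 1, shard 0.
	fmt.Println(cacheKey(1, 0, []byte("hypothetical-serialized-key")))
}
```

A SHA1 digest is 20 bytes, so the unpadded base64 suffix is always 27 characters, and the whole key stays well under memcache's key length limit regardless of how large the datastore key is.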

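The collision figures quoted for SHA1 and MD5 can be checked with the birthday approximation p ≈ n² / (2 · 2^bits), where n is the number of cached entries. The sketch below assumes that approximation is what the doc comment relies on; `entriesAtProbability` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"math"
)

// entriesAtProbability returns the approximate number of distinct entries n
// at which a hash of the given width reaches collision probability p, using
// the birthday approximation p ≈ n² / (2 · 2^bits).
func entriesAtProbability(bits int, p float64) float64 {
	return math.Sqrt(2 * p * math.Pow(2, float64(bits)))
}

func main() {
	const p = 1e-18
	const entrySize = 100e3 // bytes; the 100KB average from the doc comment

	hashes := []struct {
		name string
		bits int
	}{{"SHA1", 160}, {"MD5", 128}}

	for _, h := range hashes {
		n := entriesAtProbability(h.bits, p)
		fmt.Printf("%s: %.2e entries, %.2e bytes at 100KB each\n",
			h.name, n, n*entrySize)
	}
}
```

Under these assumptions SHA1 gives about 1.7e15 entries, which at 100KB each is about 1.7e20 bytes and matches the 170 Exabyte figure. For MD5 the same formula gives about 2.6e10 entries, which would be about 26GB at one byte per entry and about 2.6PB at 100KB per entry.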
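The Put and Get paths described in the doc comment can be modelled with a small in-memory sketch. Everything here is an assumption for illustration: `fakeMemcache`, `cache`, `get` and `put` are hypothetical names, the model is single-goroutine, expirations and the compression byte are omitted, and the value written for a Put lock (empty here) is not specified by the doc comment.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// Flag values from the doc comment.
const (
	flagEntity uint32 = 0 // cached value
	flagLock   uint32 = 1 // someone is mutating this entry
)

var errNoSuchEntity = errors.New("no such entity")

type item struct {
	value []byte
	flags uint32
	casID uint64
}

// fakeMemcache is a hypothetical stand-in for a memcache client.
type fakeMemcache struct {
	items map[string]item
	seq   uint64
}

func (m *fakeMemcache) set(k string, v []byte, flags uint32) {
	m.seq++
	m.items[k] = item{value: v, flags: flags, casID: m.seq}
}

// add stores the item only if the key is absent.
func (m *fakeMemcache) add(k string, v []byte, flags uint32) {
	if _, ok := m.items[k]; !ok {
		m.set(k, v, flags)
	}
}

// cas stores the item only if the entry is unchanged since old was read.
func (m *fakeMemcache) cas(k string, old item, v []byte, flags uint32) bool {
	if cur, ok := m.items[k]; !ok || cur.casID != old.casID {
		return false
	}
	m.set(k, v, flags)
	return true
}

type cache struct {
	mc *fakeMemcache
	ds map[string][]byte // stand-in for the datastore
}

func newCache() *cache {
	return &cache{mc: &fakeMemcache{items: map[string]item{}}, ds: map[string][]byte{}}
}

func (c *cache) load(k string) ([]byte, error) {
	if v, ok := c.ds[k]; ok {
		return v, nil
	}
	return nil, errNoSuchEntity
}

func (c *cache) get(k string, nonce []byte) ([]byte, error) {
	c.mc.add(k, nonce, flagLock) // no-op if an entry already exists
	it, ok := c.mc.items[k]      // read back for the later CAS
	switch {
	case ok && it.flags == flagEntity:
		if len(it.value) == 0 { // empty value: cached negative lookup
			return nil, errNoSuchEntity
		}
		return it.value, nil
	case ok && it.flags == flagLock && bytes.Equal(it.value, nonce):
		// Our own lock: read the datastore, then try to publish the result.
		v, err := c.load(k)
		c.mc.cas(k, it, v, flagEntity) // v is nil on a miss
		return v, err
	default:
		// Missing entry or someone else's lock: bypass memcache.
		return c.load(k)
	}
}

func (c *cache) put(k string, v []byte) {
	c.mc.set(k, nil, flagLock) // lock before mutating
	c.ds[k] = v                // the datastore operation
	delete(c.mc.items, k)      // on success, drop the lock (no CAS)
}

func main() {
	c := newCache()
	c.put("k", []byte("v1"))
	v, err := c.get("k", []byte("nonce-1"))
	fmt.Println(string(v), err)
}
```

In this model a Get that finds another writer's lock never writes to memcache, so a concurrent Put cannot be overwritten by a stale read. The CAS in the own-lock branch fails harmlessly if anything touched the entry in the meantime.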