DescriptionRemove SetFinalizer from impl/memory.
This patch attempts to addresses a memory leak that has been observed in
the wild associated with "txnbuf". Ultimately, "txnbuf" wraps an
"impl/memory" datastore instance, and the leak location has been
narrowed down to "impl/memory"'s use of SetFinalizer to free gkvlite
Store instances.
The drive for SetFinalizer is that nothing is tracking or assuming
ownership of snapshots, which are used liberally in "impl/memory".
Because snapshots own handles to the underlying gkvlite data store
instance, they must be closed when software is finished with them else
every snapshot's resources will accumulate in the gkvlite store.
Unfortunately, the observed state of things is that the finalizer is
never getting called and is, in fact, rendering those snapshots and
their associated data unable to be collected by GC even when the store
is finished. This is manifesting through txnbuf, which creates
"impl/memory" datastores as the in-memory component of its operation.
Possible reason for this is that circular references and SetFinalizer
don't work well together: https://golang.org/pkg/runtime/#SetFinalizer
This patch removes the SetFinalizer call altogether. It has the
immediate effect that, for the duration of an "impl/memory" datastore's
lifetime, none of its intermediate snapshots will be collected. However,
it seems like this isn't happening anyway, so this isn't a huge loss. On
the other hand, removing SetFinalizer opens those snapshots up to being
collected when the full datastore instance is no longer referenced
(i.e., when the Context that owns it becomes collectable). For tests,
this happens every Convey round, and in txnbuf this happens after each
transaction is committed, so this should actually be just fine.
A follow-on patch is planned to add a "Close" method to
"datastore.Testable"'s snapshots and "impl/memory"'s memStores, and then
explicitly closing those snapshots when we're done with them in
"impl/memory"'s datastore code. This is currently blocked by an observed
bug in gkvlite related to closing snapshots:
https://github.com/steveyen/gkvlite/issues/13
TL;DR: No worse than we currently are, and for actual production stuff,
way better. Can be even better if we're explicit, but we're blocked on a
separate bug.
Evidence:
- LogDog Coordinator GAE, running w/ txnbuf, crashes from memory leak
multiple times a minute.
- Removing txnbuf (https://codereview.chromium.org/2592753002), it has
been 1+ day without a crash.
- Putting a "fmt.Println" or "panic" in the finalizer yields no text
or panics respectively, indicating that it is never being called.
- With finalizer installed, the test shows:
Memory alloc difference is: 90634240
- Without the finalizer, the test shows:
Memory alloc difference is: 166184
BUG=chromium:675485
TEST=local
- go test ./impl/memory -test.memusage -run 'TestMemoryUsage' -v
Patch Set 1 #
Messages
Total messages: 10 (2 generated)
|