Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(2417)

Unified Diff: appengine/findit/libs/math/logarithms.py

Issue 2543333002: Added log-domain summation (Closed)
Patch Set: rebase Created 4 years ago
Use n/p to move between diff chunks; N/P to move between comments. Draft comments are only viewable by you.
Jump to:
View side-by-side diff with in-line comments
Download patch
Index: appengine/findit/libs/math/logarithms.py
diff --git a/appengine/findit/libs/math/logarithms.py b/appengine/findit/libs/math/logarithms.py
new file mode 100644
index 0000000000000000000000000000000000000000..82a2c1f64a66989d595dbaa78d5d42480eab6b7f
--- /dev/null
+++ b/appengine/findit/libs/math/logarithms.py
@@ -0,0 +1,44 @@
+# Copyright 2016 The Chromium Authors. All rights reserved.
+# Use of this source code is governed by a BSD-style license that can be
+# found in the LICENSE file.
+
+import math
+
+
+_LOG_ZERO = float('-inf')
+
+
+def logsumexp(xs):
+ """Efficiently and accurately compute a log-domain sum.
inferno 2016/12/04 23:51:24 Where is this idea based off on ? Can you add a wi
wrengr 2016/12/05 19:12:59 This is common knowledge in the machine learning c
+
+ While ``math.log(math.fsum(math.exp(x) for x in xs)))`` accomplishes
+ the same end, it requires an additional ``len(xs) - 2`` logarithms
+ and introduces more opportunities for over/underflow and rounding
+ errors. Thus, this function is both more efficient and more accurate.
+
+ Args:
+ xs (collection of float): the collection of log-domain numbers to be
+ summed. The order of the elements doesn't matter, nor does the
+ concrete type of the collection (it could be ``list``, ``set``,
+ ``frozenset``, etc); the collection just needs to support ``len``,
+ ``max``, and ``iter``. Because we need the maximum before we
+ can start the iteration, this function requires (in principle)
+ two passes over the data; therefore we cannot take an iterator nor
+ generator. The length should be computed in constant time (else it
+ will introduce an unnecessary third pass over the data). Ideally
+ the maximum can also be computed in constant time (in which case
+ this function only needs to pass over the data once), though none
+ of Python's builtin types support that.
+
+ Returns:
+ The log-domain sum of ``xs``.
+ """
+ if not xs:
+ return _LOG_ZERO
+
+ maximum = max(xs)
+ if math.isinf(maximum):
+ return maximum
+
+ total = math.fsum(math.expm1(x - maximum) for x in xs)
+ return maximum + math.log1p(total + float(len(xs) - 1))
« no previous file with comments | « no previous file | appengine/findit/libs/math/test/logarithms_test.py » ('j') | appengine/findit/libs/math/test/logarithms_test.py » ('J')

Powered by Google App Engine
This is Rietveld 408576698