Chromium Code Reviews| Index: appengine/findit/libs/math/logarithms.py |
| diff --git a/appengine/findit/libs/math/logarithms.py b/appengine/findit/libs/math/logarithms.py |
| new file mode 100644 |
| index 0000000000000000000000000000000000000000..82a2c1f64a66989d595dbaa78d5d42480eab6b7f |
| --- /dev/null |
| +++ b/appengine/findit/libs/math/logarithms.py |
| @@ -0,0 +1,44 @@ |
| +# Copyright 2016 The Chromium Authors. All rights reserved. |
| +# Use of this source code is governed by a BSD-style license that can be |
| +# found in the LICENSE file. |
| + |
| +import math |
| + |
| + |
| +_LOG_ZERO = float('-inf') |
| + |
| + |
| +def logsumexp(xs): |
| + """Efficiently and accurately compute a log-domain sum. |
|
inferno
2016/12/04 23:51:24
Where is this idea based off on ? Can you add a wi
wrengr
2016/12/05 19:12:59
This is common knowledge in the machine learning c
|
| + |
| + While ``math.log(math.fsum(math.exp(x) for x in xs)))`` accomplishes |
| + the same end, it requires an additional ``len(xs) - 2`` logarithms |
| + and introduces more opportunities for over/underflow and rounding |
| + errors. Thus, this function is both more efficient and more accurate. |
| + |
| + Args: |
| + xs (collection of float): the collection of log-domain numbers to be |
| + summed. The order of the elements doesn't matter, nor does the |
| + concrete type of the collection (it could be ``list``, ``set``, |
| + ``frozenset``, etc); the collection just needs to support ``len``, |
| + ``max``, and ``iter``. Because we need the maximum before we |
| + can start the iteration, this function requires (in principle) |
| + two passes over the data; therefore we cannot take an iterator nor |
| + generator. The length should be computed in constant time (else it |
| + will introduce an unnecessary third pass over the data). Ideally |
| + the maximum can also be computed in constant time (in which case |
| + this function only needs to pass over the data once), though none |
| + of Python's builtin types support that. |
| + |
| + Returns: |
| + The log-domain sum of ``xs``. |
| + """ |
| + if not xs: |
| + return _LOG_ZERO |
| + |
| + maximum = max(xs) |
| + if math.isinf(maximum): |
| + return maximum |
| + |
| + total = math.fsum(math.expm1(x - maximum) for x in xs) |
| + return maximum + math.log1p(total + float(len(xs) - 1)) |