Chromium Code Reviews

Side by Side Diff: appengine/findit/crash/changelist_features/min_distance.py

Issue 2517383005: Implementing loglinear classification (without training), for CL classification (Closed)
Patch Set: rebase Created 4 years ago
# Copyright 2016 The Chromium Authors. All rights reserved.
# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file.

from collections import namedtuple


# N.B., this must not be infinity, else we'll start getting NaN values
# from LinearMinDistanceFeature (and SquaredMinDistanceFeature).
DEFAULT_MAXIMUM = 50


# N.B., if we make this a namedtuple then it becomes horrible to inherit from.
class MinDistanceFeature(object):
  def __init__(self, maximum=None):
    """
    Args:
      maximum (float): An upper bound on the return result. This
        argument is optional and (effectively) defaults to infinity.
    """
    self._maximum = maximum

  def __call__(self, result):
    """Return the minimum ``AnalysisInfo.min_distance`` across all files.

    Although this method looks like it should be a method on the
    ``Result`` class, we have it live here in order to make coverage
    tests happy. The downside of this is that we now have to modify
    multiple files whenever the guts of ``Result`` change. The upside
    is the aforementioned coverage tests, and that it helps keep the
    ``Result`` class looking cleaner.

    Args:
      result (Result): the result to analyze.

    Returns:
      The minimum distance between (the code for) a stack frame in the
      ``Result`` and the CL in the ``Result`` as a ``float``. If no
      ``maximum`` is given, then we return that minimum directly. If a
      ``maximum`` is given, then we return the smaller of it and the
      found minimum distance.
    """
    if not result.file_to_analysis_info:
      return self._maximum

    minimum = min(analysis_info.min_distance
                  for analysis_info
                  in result.file_to_analysis_info.itervalues())
    if self._maximum is None:
      return minimum

    return min(float(self._maximum), float(minimum))


class LinearMinDistanceFeature(MinDistanceFeature):
  """Return the minimum distance scaled linearly between 0 and 1.

  That is, when the minimum distance is 0 we return 1; when it is greater
  than the ``maximum`` passed to the constructor, we return 0. And in
  between we return values linearly interpolated between those points.

  In principle this normalization isn't strictly required, as the weight
  of this feature can be scaled to account for the normalization.
  However, by normalizing things we ensure that the feature's weight is
  independent of ``maximum``, which helps training.
  """
  def __init__(self, maximum=None):
    """
    Args:
      maximum (float): An upper bound on the return result. This
        argument is optional and defaults to ``DEFAULT_MAXIMUM``.
    """
    if maximum is None:
      maximum = DEFAULT_MAXIMUM
inferno 2016/12/06 18:07:06 nit: you can just do this in constructor def __init
Sharu Jiang 2016/12/06 20:49:19 I remember in this way, pylint will complain.
    super(LinearMinDistanceFeature, self).__init__(maximum)

  def __call__(self, result):
    min_distance = super(LinearMinDistanceFeature, self).__call__(result)
    return (self._maximum - min_distance) / self._maximum


class SquaredMinDistanceFeature(LinearMinDistanceFeature):
  """Return the minimum distance scaled quadratically between 0 and 1.

  This feature together with ``LinearMinDistanceFeature`` (and a
  constant feature) allow us to capture any quadratic polynomial of the
  ``MinDistance``. That is, suppose we had a single feature ``c2*x**2 +
  c1*x + 1`` with weight ``w``. Rather than using that feature directly
  (which would require us to specify the hyperparameters ``c2`` and
  ``c1``) we can instead use three features: ``w2*(x**2) + w1*x + w0``;
  which enables us to avoid specifying the hyperparameters, by pushing
  them into the weight parameters instead.
  """
  def __call__(self, result):
    linear_min_distance = (
        super(SquaredMinDistanceFeature, self).__call__(result))
    return linear_min_distance * linear_min_distance
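
The scaling done by the three feature classes can be checked in isolation with a minimal sketch like the one below. It is not part of the patch: the function names (`clamped_min_distance`, `linear_score`, `squared_score`) are hypothetical, a plain list of distances stands in for `result.file_to_analysis_info`, and it is written for Python 3, whereas the patch targets Python 2. It also illustrates why the N.B. on `DEFAULT_MAXIMUM` warns against an infinite maximum.

```python
import math


def clamped_min_distance(distances, maximum=None):
  """Smallest distance, capped at ``maximum`` when one is given.

  ``distances`` is a stand-in (an assumption of this sketch) for the
  ``min_distance`` values the patch reads from
  ``result.file_to_analysis_info``.
  """
  if not distances:
    # Mirrors the patch: no files analyzed means "as far away as allowed".
    return maximum
  minimum = min(distances)
  if maximum is None:
    return minimum
  return min(float(maximum), float(minimum))


def linear_score(distances, maximum=50):
  """Map distance 0 to 1.0 and any distance >= ``maximum`` to 0.0."""
  min_distance = clamped_min_distance(distances, maximum)
  return (maximum - min_distance) / maximum


def squared_score(distances, maximum=50):
  """Square of the linear score, so it also stays within [0, 1]."""
  score = linear_score(distances, maximum)
  return score * score


if __name__ == '__main__':
  print(linear_score([0, 12]))    # closest frame touches the CL
  print(linear_score([10, 30]))   # partway between the endpoints
  print(linear_score([75]))       # beyond the maximum, so clamped
  print(squared_score([10, 30]))
  # With an infinite maximum the numerator and the denominator are both
  # infinite, and inf / inf is NaN under IEEE 754 arithmetic.
  print(math.isnan(linear_score([10], maximum=float('inf'))))
```

With the default maximum of 50, a minimum distance of 10 gives a linear score of 0.8 and a squared score of about 0.64, while an empty input scores 0 because the clamped distance falls back to the maximum itself.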