|
|
DescriptionRefactor TraceData to encapsulate internal traces' representation
Changes to TraceData's API:
+ TraceData.Serialize() is the only supported way to serialize the traces to a single
HTML trace file. The memory saving still comes from the fact that this uses the trace
file handle directly if the trace were collected using TraceFileHandle (in
chrome_inspector/tracing_backend.py).
+ TraceData.GetTracesFor only return traces in dictionary form, since one do not
needs to call this unless they want to have access to the trace events in Python.
Upon calling this, all the TraceFileHandle are also converted to the in-memory form
automatically since there is no use in saving memory once this is invoked.
As a result of this, no clients of TraceData need concern with whether the internal
representation of trace is TraceFileHandle or the in-memory dictionary form.
BUG=catapult:#3119
Review-Url: https://codereview.chromium.org/2661573003
Committed: https://chromium.googlesource.com/external/github.com/catapult-project/catapult/+/e1e303241a39f9ea96b99cfc6bb92b191c1f2b1e
Patch Set 1 #
Total comments: 6
Patch Set 2 : Style nits #
Total comments: 11
Patch Set 3 : Address Charlie's comments #Messages
Total messages: 42 (32 generated)
The CQ bit was checked by nednguyen@google.com to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: Catapult Presubmit on master.tryserver.client.catapult (JOB_FAILED, https://build.chromium.org/p/tryserver.client.catapult/builders/Catapult%20Pr...)
The CQ bit was checked by nednguyen@google.com to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Patchset #1 (id:1) has been deleted
Patchset #1 (id:20001) has been deleted
Description was changed from ========== WIP to refactor TraceData ========== to ========== Refactor TraceData to hide details about managing traces on disk Changes to TraceData's API: + TraceData.Serialize() is the only supported way to serialize the traces to a single HTML trace file. The memory saving still comes from the fact that this uses the trace file handle directly if the trace were collected using TraceFileHandle (in chrome_inspector/tracing_backend.py). + TraceData.GetTracesFor only return traces in dictionary form, since one do not needs to call this unless they want to have access to the trace events in Python. Upon calling this, all the TraceFileHandle are also converted to the in-memory form automatically since there is no use in saving memory once this is invoked. As a result of this, no clients of TraceData need to care about whether the internal representation of trace is TraceFileHandle or the in-memory dictionary form. ==========
nednguyen@google.com changed reviewers: + charliea@chromium.org, perezju@chromium.org, rnephew@chromium.org
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: Catapult Windows Tryserver on master.tryserver.client.catapult (JOB_FAILED, https://build.chromium.org/p/tryserver.client.catapult/builders/Catapult%20Wi...)
Patchset #1 (id:40001) has been deleted
The CQ bit was checked by nednguyen@google.com to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Description was changed from ========== Refactor TraceData to hide details about managing traces on disk Changes to TraceData's API: + TraceData.Serialize() is the only supported way to serialize the traces to a single HTML trace file. The memory saving still comes from the fact that this uses the trace file handle directly if the trace were collected using TraceFileHandle (in chrome_inspector/tracing_backend.py). + TraceData.GetTracesFor only return traces in dictionary form, since one do not needs to call this unless they want to have access to the trace events in Python. Upon calling this, all the TraceFileHandle are also converted to the in-memory form automatically since there is no use in saving memory once this is invoked. As a result of this, no clients of TraceData need to care about whether the internal representation of trace is TraceFileHandle or the in-memory dictionary form. ========== to ========== Refactor TraceData to hide details about managing traces on disk Changes to TraceData's API: + TraceData.Serialize() is the only supported way to serialize the traces to a single HTML trace file. The memory saving still comes from the fact that this uses the trace file handle directly if the trace were collected using TraceFileHandle (in chrome_inspector/tracing_backend.py). + TraceData.GetTracesFor only return traces in dictionary form, since one do not needs to call this unless they want to have access to the trace events in Python. Upon calling this, all the TraceFileHandle are also converted to the in-memory form automatically since there is no use in saving memory once this is invoked. As a result of this, no clients of TraceData need to care about whether the internal representation of trace is TraceFileHandle or the in-memory dictionary form. ==========
Description was changed from ========== Refactor TraceData to hide details about managing traces on disk Changes to TraceData's API: + TraceData.Serialize() is the only supported way to serialize the traces to a single HTML trace file. The memory saving still comes from the fact that this uses the trace file handle directly if the trace were collected using TraceFileHandle (in chrome_inspector/tracing_backend.py). + TraceData.GetTracesFor only return traces in dictionary form, since one do not needs to call this unless they want to have access to the trace events in Python. Upon calling this, all the TraceFileHandle are also converted to the in-memory form automatically since there is no use in saving memory once this is invoked. As a result of this, no clients of TraceData need to care about whether the internal representation of trace is TraceFileHandle or the in-memory dictionary form. ========== to ========== Refactor TraceData to hide details about managing traces on disk Changes to TraceData's API: + TraceData.Serialize() is the only supported way to serialize the traces to a single HTML trace file. The memory saving still comes from the fact that this uses the trace file handle directly if the trace were collected using TraceFileHandle (in chrome_inspector/tracing_backend.py). + TraceData.GetTracesFor only return traces in dictionary form, since one do not needs to call this unless they want to have access to the trace events in Python. Upon calling this, all the TraceFileHandle are also converted to the in-memory form automatically since there is no use in saving memory once this is invoked. As a result of this, no clients of TraceData need concern with whether the internal representation of trace is TraceFileHandle or the in-memory dictionary form. ==========
Description was changed from ========== Refactor TraceData to hide details about managing traces on disk Changes to TraceData's API: + TraceData.Serialize() is the only supported way to serialize the traces to a single HTML trace file. The memory saving still comes from the fact that this uses the trace file handle directly if the trace were collected using TraceFileHandle (in chrome_inspector/tracing_backend.py). + TraceData.GetTracesFor only return traces in dictionary form, since one do not needs to call this unless they want to have access to the trace events in Python. Upon calling this, all the TraceFileHandle are also converted to the in-memory form automatically since there is no use in saving memory once this is invoked. As a result of this, no clients of TraceData need concern with whether the internal representation of trace is TraceFileHandle or the in-memory dictionary form. ========== to ========== Refactor TraceData to encapsulate internal traces' representation Changes to TraceData's API: + TraceData.Serialize() is the only supported way to serialize the traces to a single HTML trace file. The memory saving still comes from the fact that this uses the trace file handle directly if the trace were collected using TraceFileHandle (in chrome_inspector/tracing_backend.py). + TraceData.GetTracesFor only return traces in dictionary form, since one do not needs to call this unless they want to have access to the trace events in Python. Upon calling this, all the TraceFileHandle are also converted to the in-memory form automatically since there is no use in saving memory once this is invoked. As a result of this, no clients of TraceData need concern with whether the internal representation of trace is TraceFileHandle or the in-memory dictionary form. ==========
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
lgtm w/some comments https://codereview.chromium.org/2661573003/diff/60001/telemetry/telemetry/int... File telemetry/telemetry/internal/platform/profiler/trace_profiler.py (right): https://codereview.chromium.org/2661573003/diff/60001/telemetry/telemetry/int... telemetry/telemetry/internal/platform/profiler/trace_profiler.py:49: trace_result.Serialize(trace_file, trace_title=title) no more zip'ing? could there be a `compress` option to Serialize? https://codereview.chromium.org/2661573003/diff/60001/telemetry/telemetry/tim... File telemetry/telemetry/timeline/trace_data.py (right): https://codereview.chromium.org/2661573003/diff/60001/telemetry/telemetry/tim... telemetry/telemetry/timeline/trace_data.py:129: for i in xrange(len(self._raw_data[part.raw_field_name])): nit: factor out the common part `self._raw_data[part.raw_field_name]` into a variable, e.g., `traces_list`? You could also `for i, data in enumerate(traces_list)`. That would improve legibility. https://codereview.chromium.org/2661573003/diff/60001/telemetry/telemetry/web... File telemetry/telemetry/web_perf/timeline_based_measurement.py (right): https://codereview.chromium.org/2661573003/diff/60001/telemetry/telemetry/web... telemetry/telemetry/web_perf/timeline_based_measurement.py:306: ' on July 17, 2016.') Date in the past. Maybe kill this functionality on a follow up CL?
https://codereview.chromium.org/2661573003/diff/60001/telemetry/telemetry/int... File telemetry/telemetry/internal/platform/profiler/trace_profiler.py (right): https://codereview.chromium.org/2661573003/diff/60001/telemetry/telemetry/int... telemetry/telemetry/internal/platform/profiler/trace_profiler.py:49: trace_result.Serialize(trace_file, trace_title=title) On 2017/02/02 09:25:56, perezju wrote: > no more zip'ing? could there be a `compress` option to Serialize? No need to, because trace2html already compress the trace data (https://github.com/catapult-project/catapult/blob/master/tracing/tracing_buil...) https://codereview.chromium.org/2661573003/diff/60001/telemetry/telemetry/tim... File telemetry/telemetry/timeline/trace_data.py (right): https://codereview.chromium.org/2661573003/diff/60001/telemetry/telemetry/tim... telemetry/telemetry/timeline/trace_data.py:129: for i in xrange(len(self._raw_data[part.raw_field_name])): On 2017/02/02 09:25:56, perezju wrote: > nit: factor out the common part `self._raw_data[part.raw_field_name]` into a > variable, e.g., `traces_list`? You could also `for i, data in > enumerate(traces_list)`. That would improve legibility. Done. https://codereview.chromium.org/2661573003/diff/60001/telemetry/telemetry/web... File telemetry/telemetry/web_perf/timeline_based_measurement.py (right): https://codereview.chromium.org/2661573003/diff/60001/telemetry/telemetry/web... telemetry/telemetry/web_perf/timeline_based_measurement.py:306: ' on July 17, 2016.') On 2017/02/02 09:25:56, perezju wrote: > Date in the past. Maybe kill this functionality on a follow up CL? Hmhh, I will ping Ethan on this.
The CQ bit was checked by nednguyen@google.com to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
On 2017/02/02 14:00:22, nednguyen wrote: > https://codereview.chromium.org/2661573003/diff/60001/telemetry/telemetry/int... > File telemetry/telemetry/internal/platform/profiler/trace_profiler.py (right): > > https://codereview.chromium.org/2661573003/diff/60001/telemetry/telemetry/int... > telemetry/telemetry/internal/platform/profiler/trace_profiler.py:49: > trace_result.Serialize(trace_file, trace_title=title) > On 2017/02/02 09:25:56, perezju wrote: > > no more zip'ing? could there be a `compress` option to Serialize? > > No need to, because trace2html already compress the trace data > (https://github.com/catapult-project/catapult/blob/master/tracing/tracing_buil...) > > https://codereview.chromium.org/2661573003/diff/60001/telemetry/telemetry/tim... > File telemetry/telemetry/timeline/trace_data.py (right): > > https://codereview.chromium.org/2661573003/diff/60001/telemetry/telemetry/tim... > telemetry/telemetry/timeline/trace_data.py:129: for i in > xrange(len(self._raw_data[part.raw_field_name])): > On 2017/02/02 09:25:56, perezju wrote: > > nit: factor out the common part `self._raw_data[part.raw_field_name]` into a > > variable, e.g., `traces_list`? You could also `for i, data in > > enumerate(traces_list)`. That would improve legibility. > > Done. > > https://codereview.chromium.org/2661573003/diff/60001/telemetry/telemetry/web... > File telemetry/telemetry/web_perf/timeline_based_measurement.py (right): > > https://codereview.chromium.org/2661573003/diff/60001/telemetry/telemetry/web... > telemetry/telemetry/web_perf/timeline_based_measurement.py:306: ' on July 17, > 2016.') > On 2017/02/02 09:25:56, perezju wrote: > > Date in the past. Maybe kill this functionality on a follow up CL? > > Hmhh, I will ping Ethan on this. Ping Charlie
https://codereview.chromium.org/2661573003/diff/80001/telemetry/telemetry/tim... File telemetry/telemetry/timeline/trace_data.py (right): https://codereview.chromium.org/2661573003/diff/80001/telemetry/telemetry/tim... telemetry/telemetry/timeline/trace_data.py:85: """ TraceData holds a collection of traces from multiple sources. IMO this should mention in the pydoc that, unless otherwise stated, all methods avoid loading the whole traces into memory. https://codereview.chromium.org/2661573003/diff/80001/telemetry/telemetry/tim... telemetry/telemetry/timeline/trace_data.py:121: def GetTracesFor(self, part): IMO, I think this should come with a big bold pydoc warning that, unlike all other functions, calling this immediately loads the traces into memory, which can blow the memory limit. https://codereview.chromium.org/2661573003/diff/80001/telemetry/telemetry/tim... telemetry/telemetry/timeline/trace_data.py:170: path = os.path.join(temp_dir, '%s.trace' % counter) It seems like path is used outside of the scope it's defined here. Isn't the scope the "if" statement or the "else" statement, and it's used outside of that scope? It's weird, beause I've definitely had pylint warn me about this before. My vote to resolve this is to create a small helper function in the file: def _TraceToFile(trace): if isinstance(trace, TraceFileHandle): return trace.file_path # Your other logic... return path IMO this is also good because this makes the function more readable, as well https://codereview.chromium.org/2661573003/diff/80001/telemetry/telemetry/tim... telemetry/telemetry/timeline/trace_data.py:175: counter += 1 Why do this instead of using temp files with delete=True? It seems like we want temp files, right? https://codereview.chromium.org/2661573003/diff/80001/telemetry/telemetry/tim... File telemetry/telemetry/timeline/trace_data_unittest.py (right): https://codereview.chromium.org/2661573003/diff/80001/telemetry/telemetry/tim... telemetry/telemetry/timeline/trace_data_unittest.py:17: test_temp_file = tempfile.NamedTemporaryFile(delete=False) The tempfile library explicitly discourages you from doing this (NamedTemporaryFile... close... retrieve path... then use that path) because it creates race conditions about other people trying to access the same file. This is the main reason that they deprecated the mktemp() function The recommended alternative is creating a temp directory with mkdtemp() and then creating well-named files (e.g. "example.trace") within that directory. If you create this directory as self._temp_dir in a setUp method and delete it in a tearDown method, then it's slightly better practice and also makes the test more readable because you avoid the awkward open up file => close file => try test => finally cleanup dance.
lgtm % my previous comments
https://codereview.chromium.org/2661573003/diff/80001/telemetry/telemetry/tim... File telemetry/telemetry/timeline/trace_data_unittest.py (right): https://codereview.chromium.org/2661573003/diff/80001/telemetry/telemetry/tim... telemetry/telemetry/timeline/trace_data_unittest.py:17: test_temp_file = tempfile.NamedTemporaryFile(delete=False) On 2017/02/04 00:43:29, charliea wrote: > The tempfile library explicitly discourages you from doing this > (NamedTemporaryFile... close... retrieve path... then use that path) because it > creates race conditions about other people trying to access the same file. This > is the main reason that they deprecated the mktemp() function > > The recommended alternative is creating a temp directory with mkdtemp() and then > creating well-named files (e.g. "example.trace") within that directory. If you > create this directory as self._temp_dir in a setUp method and delete it in a > tearDown method, then it's slightly better practice and also makes the test more > readable because you avoid the awkward open up file => close file => try test => > finally cleanup dance. +1!
Description was changed from ========== Refactor TraceData to encapsulate internal traces' representation Changes to TraceData's API: + TraceData.Serialize() is the only supported way to serialize the traces to a single HTML trace file. The memory saving still comes from the fact that this uses the trace file handle directly if the trace were collected using TraceFileHandle (in chrome_inspector/tracing_backend.py). + TraceData.GetTracesFor only return traces in dictionary form, since one do not needs to call this unless they want to have access to the trace events in Python. Upon calling this, all the TraceFileHandle are also converted to the in-memory form automatically since there is no use in saving memory once this is invoked. As a result of this, no clients of TraceData need concern with whether the internal representation of trace is TraceFileHandle or the in-memory dictionary form. ========== to ========== Refactor TraceData to encapsulate internal traces' representation Changes to TraceData's API: + TraceData.Serialize() is the only supported way to serialize the traces to a single HTML trace file. The memory saving still comes from the fact that this uses the trace file handle directly if the trace were collected using TraceFileHandle (in chrome_inspector/tracing_backend.py). + TraceData.GetTracesFor only return traces in dictionary form, since one do not needs to call this unless they want to have access to the trace events in Python. Upon calling this, all the TraceFileHandle are also converted to the in-memory form automatically since there is no use in saving memory once this is invoked. As a result of this, no clients of TraceData need concern with whether the internal representation of trace is TraceFileHandle or the in-memory dictionary form. BUG=catapult:#3119 ==========
The CQ bit was checked by nednguyen@google.com to run a CQ dry run
https://codereview.chromium.org/2661573003/diff/80001/telemetry/telemetry/tim... File telemetry/telemetry/timeline/trace_data.py (right): https://codereview.chromium.org/2661573003/diff/80001/telemetry/telemetry/tim... telemetry/telemetry/timeline/trace_data.py:85: """ TraceData holds a collection of traces from multiple sources. On 2017/02/04 00:43:29, charliea wrote: > IMO this should mention in the pydoc that, unless otherwise stated, all methods > avoid loading the whole traces into memory. This not true. Some method will load the whole traces into memory, and it also depends on how you collect the trace in the beginning. The memory saving mostly come from using the TraceFileHandle and not this container. https://codereview.chromium.org/2661573003/diff/80001/telemetry/telemetry/tim... telemetry/telemetry/timeline/trace_data.py:121: def GetTracesFor(self, part): On 2017/02/04 00:43:29, charliea wrote: > IMO, I think this should come with a big bold pydoc warning that, unlike all > other functions, calling this immediately loads the traces into memory, which > can blow the memory limit. Done. https://codereview.chromium.org/2661573003/diff/80001/telemetry/telemetry/tim... telemetry/telemetry/timeline/trace_data.py:170: path = os.path.join(temp_dir, '%s.trace' % counter) On 2017/02/04 00:43:29, charliea wrote: > It seems like path is used outside of the scope it's defined here. Isn't the > scope the "if" statement or the "else" statement, and it's used outside of that > scope? It's weird, beause I've definitely had pylint warn me about this before. > > My vote to resolve this is to create a small helper function in the file: > > def _TraceToFile(trace): > if isinstance(trace, TraceFileHandle): > return trace.file_path > > # Your other logic... > return path > > IMO this is also good because this makes the function more readable, as well Done. It's a bit awkward that sometimes it return a file in temp dir & sometimes not, but looks a bit cleaner. I added some documentation. https://codereview.chromium.org/2661573003/diff/80001/telemetry/telemetry/tim... telemetry/telemetry/timeline/trace_data.py:175: counter += 1 On 2017/02/04 00:43:29, charliea wrote: > Why do this instead of using temp files with delete=True? It seems like we want > temp files, right? Yes, but the reason is similar to what you commented in the unittest. It's better to create a temp dir, put files in them & remove the temp dir after usage rather than "NamedTemporaryFile... close... retrieve path... then use that path". I used NamedTemporaryFile with delete=False https://codereview.chromium.org/2661573003/diff/80001/telemetry/telemetry/tim... File telemetry/telemetry/timeline/trace_data_unittest.py (right): https://codereview.chromium.org/2661573003/diff/80001/telemetry/telemetry/tim... telemetry/telemetry/timeline/trace_data_unittest.py:17: test_temp_file = tempfile.NamedTemporaryFile(delete=False) On 2017/02/06 09:21:22, perezju wrote: > On 2017/02/04 00:43:29, charliea wrote: > > The tempfile library explicitly discourages you from doing this > > (NamedTemporaryFile... close... retrieve path... then use that path) because > it > > creates race conditions about other people trying to access the same file. > This > > is the main reason that they deprecated the mktemp() function > > > > The recommended alternative is creating a temp directory with mkdtemp() and > then > > creating well-named files (e.g. "example.trace") within that directory. If you > > create this directory as self._temp_dir in a setUp method and delete it in a > > tearDown method, then it's slightly better practice and also makes the test > more > > readable because you avoid the awkward open up file => close file => try test > => > > finally cleanup dance. > > +1! Cool tip! Done.
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by nednguyen@google.com
The CQ bit was checked by nednguyen@google.com
The patchset sent to the CQ was uploaded after l-g-t-m from perezju@chromium.org, charliea@chromium.org Link to the patchset: https://codereview.chromium.org/2661573003/#ps100001 (title: "Address Charlie's comments")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
CQ is committing da patch. Bot data: {"patchset_id": 100001, "attempt_start_ts": 1486394277450830, "parent_rev": "1e9198268428eaf0a987054dd0146d153a0682c8", "commit_rev": "e1e303241a39f9ea96b99cfc6bb92b191c1f2b1e"}
Message was sent while issue was closed.
Description was changed from ========== Refactor TraceData to encapsulate internal traces' representation Changes to TraceData's API: + TraceData.Serialize() is the only supported way to serialize the traces to a single HTML trace file. The memory saving still comes from the fact that this uses the trace file handle directly if the trace were collected using TraceFileHandle (in chrome_inspector/tracing_backend.py). + TraceData.GetTracesFor only return traces in dictionary form, since one do not needs to call this unless they want to have access to the trace events in Python. Upon calling this, all the TraceFileHandle are also converted to the in-memory form automatically since there is no use in saving memory once this is invoked. As a result of this, no clients of TraceData need concern with whether the internal representation of trace is TraceFileHandle or the in-memory dictionary form. BUG=catapult:#3119 ========== to ========== Refactor TraceData to encapsulate internal traces' representation Changes to TraceData's API: + TraceData.Serialize() is the only supported way to serialize the traces to a single HTML trace file. The memory saving still comes from the fact that this uses the trace file handle directly if the trace were collected using TraceFileHandle (in chrome_inspector/tracing_backend.py). + TraceData.GetTracesFor only return traces in dictionary form, since one do not needs to call this unless they want to have access to the trace events in Python. Upon calling this, all the TraceFileHandle are also converted to the in-memory form automatically since there is no use in saving memory once this is invoked. As a result of this, no clients of TraceData need concern with whether the internal representation of trace is TraceFileHandle or the in-memory dictionary form. BUG=catapult:#3119 Review-Url: https://codereview.chromium.org/2661573003 Committed: https://chromium.googlesource.com/external/github.com/catapult-project/catapu... ==========
Message was sent while issue was closed.
Committed patchset #3 (id:100001) as https://chromium.googlesource.com/external/github.com/catapult-project/catapu... |