|
|
Chromium Code Reviews
DescriptionMoving ScheduleNewAnalysis to break the cycle
BUG=659345
Committed: https://chromium.googlesource.com/infra/infra/+/b74c6a6cd22bfc0e9ed6baddd089ce394644a395
Patch Set 1 #
Total comments: 11
Patch Set 2 : rebase #Patch Set 3 : rebase & adding some tests #
Total comments: 18
Patch Set 4 : addressing nits #Patch Set 5 : rebase #
Total comments: 20
Messages
Total messages: 25 (4 generated)
Description was changed from ========== Moving ScheduleNewAnalysis to break the cycle BUG=659345 ========== to ========== Moving ScheduleNewAnalysis to break the cycle BUG=659345 ==========
wrengr@chromium.org changed reviewers: + katesonia@chromium.org, mbarbella@chromium.org, stgao@chromium.org
Broke the cycle for ScheduleNewAnalysis. PTAL
https://codereview.chromium.org/2455053004/diff/1/appengine/findit/crash/cras... File appengine/findit/crash/crash_pipeline.py (right): https://codereview.chromium.org/2455053004/diff/1/appengine/findit/crash/cras... appengine/findit/crash/crash_pipeline.py:107: def run(self): # pragma: no cover This is wrong... I didn't notice that in your last cl. As I mentioned in last cl, the run has to have the same parameter as its __init__, or we will have errors using yield pipeline(*_, **_) I deployed a test version and here is the error. https://screenshot.googleplex.com/Kf45ZmBh0FY.png https://codereview.chromium.org/2455053004/diff/1/appengine/findit/crash/cras... appengine/findit/crash/crash_pipeline.py:142: def run(self): # pragma: no cover We'd better avoid "pragma: no cover" big block of code. The TestRunningAnalysis should have covered some of the code? https://codereview.chromium.org/2455053004/diff/1/appengine/findit/crash/test... File appengine/findit/crash/test/crash_pipeline_test.py (left): https://codereview.chromium.org/2455053004/diff/1/appengine/findit/crash/test... appengine/findit/crash/test/crash_pipeline_test.py:85: def _TestRunningAnalysisForResult(self, analysis_result, analysis_tags): The purpose of this test is to test the process of pipeline run, I think it should be in crash_pipeline_test.py instead of in crash_handler_test.py, since the ScheduleNewAnalysis only determines whether we should start pipelines or not.
https://codereview.chromium.org/2455053004/diff/1/appengine/findit/crash/cras... File appengine/findit/crash/crash_pipeline.py (right): https://codereview.chromium.org/2455053004/diff/1/appengine/findit/crash/cras... appengine/findit/crash/crash_pipeline.py:107: def run(self): # pragma: no cover On 2016/10/27 23:55:33, sharu jiang wrote: > This is wrong... I didn't notice that in your last cl. > As I mentioned in last cl, the run has to have the same parameter as its > __init__, or we will have errors using yield pipeline(*_, **_) > > I deployed a test version and here is the error. > https://screenshot.googleplex.com/Kf45ZmBh0FY.png This is what I was hoping you'd've set up the mock testing to check for. https://codereview.chromium.org/2455053004/diff/1/appengine/findit/crash/cras... appengine/findit/crash/crash_pipeline.py:142: def run(self): # pragma: no cover On 2016/10/27 23:55:33, sharu jiang wrote: > We'd better avoid "pragma: no cover" big block of code. The TestRunningAnalysis > should have covered some of the code? I agree. Alas, this was necessary for CQ to accept the patch and land it. I can bump up the priority of the bug report. N.B., in many cases (including this one) the code is in fact covered by the unittest suite as a whole, it's just not covered by the subset of unittests which happen to live in the one foo_test.py file used for computing coverage. https://codereview.chromium.org/2455053004/diff/1/appengine/findit/crash/test... File appengine/findit/crash/test/crash_pipeline_test.py (left): https://codereview.chromium.org/2455053004/diff/1/appengine/findit/crash/test... appengine/findit/crash/test/crash_pipeline_test.py:85: def _TestRunningAnalysisForResult(self, analysis_result, analysis_tags): On 2016/10/27 23:55:33, sharu jiang wrote: > The purpose of this test is to test the process of pipeline run, I think it > should be in crash_pipeline_test.py instead of in crash_handler_test.py, since > the ScheduleNewAnalysis only determines whether we should start pipelines or > not. Then I need you to submit a CL with the correct mocking for testing things without calling ScheduleNewAnalysis.
https://codereview.chromium.org/2455053004/diff/1/appengine/findit/crash/cras... File appengine/findit/crash/crash_pipeline.py (right): https://codereview.chromium.org/2455053004/diff/1/appengine/findit/crash/cras... appengine/findit/crash/crash_pipeline.py:107: def run(self): # pragma: no cover On 2016/10/28 18:10:59, wrengr wrote: > On 2016/10/27 23:55:33, sharu jiang wrote: > > This is wrong... I didn't notice that in your last cl. > > As I mentioned in last cl, the run has to have the same parameter as its > > __init__, or we will have errors using yield pipeline(*_, **_) > > > > I deployed a test version and here is the error. > > https://screenshot.googleplex.com/Kf45ZmBh0FY.png > > This is what I was hoping you'd've set up the mock testing to check for. This has been walked around by calling pipeline.run() instead of yeild statement in unit tests... so it was not caught. https://codereview.chromium.org/2455053004/diff/1/appengine/findit/crash/cras... appengine/findit/crash/crash_pipeline.py:142: def run(self): # pragma: no cover On 2016/10/28 18:10:59, wrengr wrote: > On 2016/10/27 23:55:33, sharu jiang wrote: > > We'd better avoid "pragma: no cover" big block of code. The > TestRunningAnalysis > > should have covered some of the code? > > I agree. Alas, this was necessary for CQ to accept the patch and land it. I can > bump up the priority of the bug report. > > N.B., in many cases (including this one) the code is in fact covered by the > unittest suite as a whole, it's just not covered by the subset of unittests > which happen to live in the one foo_test.py file used for computing coverage. For a better testing, we should cover this code in it's own foo_test.py, and in other unittest suite, we can mock this function and do not do duplicate tests if possible. https://codereview.chromium.org/2455053004/diff/1/appengine/findit/crash/test... File appengine/findit/crash/test/crash_pipeline_test.py (left): https://codereview.chromium.org/2455053004/diff/1/appengine/findit/crash/test... appengine/findit/crash/test/crash_pipeline_test.py:85: def _TestRunningAnalysisForResult(self, analysis_result, analysis_tags): On 2016/10/28 18:10:59, wrengr wrote: > On 2016/10/27 23:55:33, sharu jiang wrote: > > The purpose of this test is to test the process of pipeline run, I think it > > should be in crash_pipeline_test.py instead of in crash_handler_test.py, since > > the ScheduleNewAnalysis only determines whether we should start pipelines or > > not. > > Then I need you to submit a CL with the correct mocking for testing things > without calling ScheduleNewAnalysis. You can just remove #154-#157 part, and call _MockPipeline(*_).start(). For the ScheduleNewAnalysis function, you can test in crash_handler_test.py instead. https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/crash/... File appengine/findit/crash/crash_pipeline.py (right): https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:157: def run(self, *_args, **_kwargs): # pragma: no cover This is a little hacky, we should at lease add doc saying we will throw out parameters because of the yield statement.
https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/crash/... File appengine/findit/crash/crash_pipeline.py (right): https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:68: # unclear what sort of terrible introspection AppEngine does in order to It is not AppEngine, but the AppEngine Pipeline framework https://github.com/GoogleCloudPlatform/appengine-pipelines/wiki/Python That's by design in the AppEngine Pipeline. One reason I could think of is that users of the framework don't have to create constructor unless other functions like ``finalized`` also need to access those parameters. https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:157: def run(self, *_args, **_kwargs): # pragma: no cover On 2016/11/01 23:03:24, sharu jiang wrote: > This is a little hacky, we should at lease add doc saying we will throw out > parameters because of the yield statement. I don't understand this. Sharu, do you mind elaborating a little bit? Which yield? Why the yield impacts here? https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/handle... File appengine/findit/handlers/crash/crash_handler.py (right): https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/handle... appengine/findit/handlers/crash/crash_handler.py:114: # TODO(http://crbug.com/659345): try to break the cycle re pipeline_cls. This seems invalid now. https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/handle... appengine/findit/handlers/crash/crash_handler.py:118: """Create a pipeline object to perform the analysis, and start it. nit: Create -> Creates. This is the convention in the infra repo, although we might still have some violations in the code. https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/handle... appengine/findit/handlers/crash/crash_handler.py:126: and so should surely be moved there instead of living here. invalid now. https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/lib/gi... File appengine/findit/lib/gitiles/change_log.py (right): https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/lib/gi... appengine/findit/lib/gitiles/change_log.py:25: # TODO(wrengr); convert this into a namedtuple. I like this idea. Maybe do the same for FileChangeInfo above too.
https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/crash/... File appengine/findit/crash/crash_pipeline.py (right): https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:68: # unclear what sort of terrible introspection AppEngine does in order to It is not AppEngine, but the AppEngine Pipeline framework https://github.com/GoogleCloudPlatform/appengine-pipelines/wiki/Python That's by design in the AppEngine Pipeline. One reason I could think of is that users of the framework don't have to create constructor unless other functions like ``finalized`` also need to access those parameters. https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:157: def run(self, *_args, **_kwargs): # pragma: no cover On 2016/11/01 23:03:24, sharu jiang wrote: > This is a little hacky, we should at lease add doc saying we will throw out > parameters because of the yield statement. I don't understand this. Sharu, do you mind elaborating a little bit? Which yield? Why the yield impacts here? https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/handle... File appengine/findit/handlers/crash/crash_handler.py (right): https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/handle... appengine/findit/handlers/crash/crash_handler.py:114: # TODO(http://crbug.com/659345): try to break the cycle re pipeline_cls. This seems invalid now. https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/handle... appengine/findit/handlers/crash/crash_handler.py:118: """Create a pipeline object to perform the analysis, and start it. nit: Create -> Creates. This is the convention in the infra repo, although we might still have some violations in the code. https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/handle... appengine/findit/handlers/crash/crash_handler.py:126: and so should surely be moved there instead of living here. invalid now. https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/lib/gi... File appengine/findit/lib/gitiles/change_log.py (right): https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/lib/gi... appengine/findit/lib/gitiles/change_log.py:25: # TODO(wrengr); convert this into a namedtuple. I like this idea. Maybe do the same for FileChangeInfo above too.
https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/crash/... File appengine/findit/crash/crash_pipeline.py (right): https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:157: def run(self, *_args, **_kwargs): # pragma: no cover On 2016/11/02 01:28:27, stgao (slow on Monday) wrote: > On 2016/11/01 23:03:24, sharu jiang wrote: > > This is a little hacky, we should at lease add doc saying we will throw out > > parameters because of the yield statement. > > I don't understand this. > Sharu, do you mind elaborating a little bit? Which yield? Why the yield impacts > here? like the `yield` in #190 and #193 What `yield` does is to construct a pipeline and pass whatever parameters of the construct function to run(*_args, **_kwargs). This way we can force the *_args, **_kwargs to be the same as in construct function. However, if we do: pipe = PublishResultPipeline(client1, crash_identifiers1) pipe.run(client2, crash_identifiers2, ...) The pipe will run using (client1, crash_identifiers1) instead of using the (client2, crash_identifiers2) passed into `run`. I think this may cause confusion, so if we are to do something like #189-#193, we should at lease add doc to notify users to use `yield` instead of creating a pipeline and call its run separately.
https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/crash/... File appengine/findit/crash/crash_pipeline.py (right): https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:157: def run(self, *_args, **_kwargs): # pragma: no cover On 2016/11/02 17:01:35, sharu jiang wrote: > On 2016/11/02 01:28:27, stgao (slow on Monday) wrote: > > On 2016/11/01 23:03:24, sharu jiang wrote: > > > This is a little hacky, we should at lease add doc saying we will throw out > > > parameters because of the yield statement. > > > > I don't understand this. > > Sharu, do you mind elaborating a little bit? Which yield? Why the yield > impacts > > here? > > like the `yield` in #190 and #193 > What `yield` does is to construct a pipeline and pass whatever parameters of the > construct function to run(*_args, **_kwargs). This way we can force the *_args, > **_kwargs to be the same as in construct function. > > However, if we do: > pipe = PublishResultPipeline(client1, crash_identifiers1) > pipe.run(client2, crash_identifiers2, ...) > > The pipe will run using (client1, crash_identifiers1) instead of using the > (client2, crash_identifiers2) passed into `run`. I think this may cause > confusion, so if we are to do something like #189-#193, we should at lease add > doc to notify users to use `yield` instead of creating a pipeline and call its > run separately. Would you mind pointing me to the doc for the usage of calling ``run`` directly? It's new to me and I'd like to learn more.
New version addresses nits. Anything else before cq? https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/crash/... File appengine/findit/crash/crash_pipeline.py (right): https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:68: # unclear what sort of terrible introspection AppEngine does in order to On 2016/11/02 01:28:27, stgao (slow on Monday) wrote: > It is not AppEngine, but the AppEngine Pipeline framework > https://github.com/GoogleCloudPlatform/appengine-pipelines/wiki/Python > > That's by design in the AppEngine Pipeline. One reason I could think of is that > users of the framework don't have to create constructor unless other functions > like ``finalized`` also need to access those parameters. Acknowledged. https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:157: def run(self, *_args, **_kwargs): # pragma: no cover On 2016/11/01 23:03:24, sharu jiang wrote: > This is a little hacky, we should at lease add doc saying we will throw out > parameters because of the yield statement. I agree it's a terrible hack, but it's forced on us by the structure of AppEngine pipelines. I'll add docstrings to reiterate the discussion from lines 57--81. https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/handle... File appengine/findit/handlers/crash/crash_handler.py (right): https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/handle... appengine/findit/handlers/crash/crash_handler.py:114: # TODO(http://crbug.com/659345): try to break the cycle re pipeline_cls. On 2016/11/02 01:28:27, stgao (slow on Monday) wrote: > This seems invalid now. Done. https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/handle... appengine/findit/handlers/crash/crash_handler.py:118: """Create a pipeline object to perform the analysis, and start it. On 2016/11/02 01:28:27, stgao (slow on Monday) wrote: > nit: Create -> Creates. > This is the convention in the infra repo, although we might still have some > violations in the code. Done. https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/handle... appengine/findit/handlers/crash/crash_handler.py:126: and so should surely be moved there instead of living here. On 2016/11/02 01:28:27, stgao (slow on Monday) wrote: > invalid now. Done. https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/lib/gi... File appengine/findit/lib/gitiles/change_log.py (right): https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/lib/gi... appengine/findit/lib/gitiles/change_log.py:25: # TODO(wrengr); convert this into a namedtuple. On 2016/11/02 01:28:27, stgao (slow on Monday) wrote: > I like this idea. Maybe do the same for FileChangeInfo above too. I filed a bug: http://crbug.com/661822. Feel free to add more things to the list there, and I'll get to them sometime next week.
lgtm for code. But I have comments on the docstrings regarding pipeline. Once fixed, please feel free to CQ the CL. https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/lib/gi... File appengine/findit/lib/gitiles/change_log.py (right): https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/lib/gi... appengine/findit/lib/gitiles/change_log.py:25: # TODO(wrengr); convert this into a namedtuple. On 2016/11/02 23:52:26, wrengr wrote: > On 2016/11/02 01:28:27, stgao (slow on Monday) wrote: > > I like this idea. Maybe do the same for FileChangeInfo above too. > > I filed a bug: http://crbug.com/661822. Feel free to add more things to the list > there, and I'll get to them sometime next week. Thanks! https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... File appengine/findit/crash/crash_pipeline.py (right): https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:60: # The pipeline library is designed in a strange way which requires that It might be just me, but "strange" seems a little strong in wording. IMHO, the design of the appengine pipeline looks good. It might just not expect users to create a constructor in subclasses like what we do here. The only reason we need to add the constructor here is because we want to update an NDB entity in the ``finalized`` function to deal with errors. https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:66: # ``run`` method itself. For more about all this, see: I don't understand why the yielded objects must take the same arguments as the run itself. We have examples that is against this statement. https://chromium.googlesource.com/infra/infra/+/master/appengine/findit/water... The official documentation also has similar example https://github.com/GoogleCloudPlatform/appengine-pipelines/wiki/Python#defini... https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:74: # be JSON-serializable. The pipeline class handles most of the details, For the yielded objects by ``CrashWrapperPipeline.run``, they just must be an instance of Pipeline; for the __return__ objects by the function, they must be JSON-serializable. https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:75: # so the force of this constraint is that whatever arguments we pass It is true that parameters passed over to the constructor of Pipeline must be JSON-serializable, but I don't see a cause-result relation between this and the constraint of returned values by Pipeline.run though. https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:84: # Moreover, the ``run`` and ``finalized`` methods are executed in separate Actually, this is one of the goodies from appengine pipeline. A pipeline is executed in a few phases so that we could do different things in different phases: in ``run``, we could just deal with the main logic; in ``finalized``, we could deal with the abortion due to too many retries or other errors. Of course, it doesn't introduce the problem as described here.
https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... File appengine/findit/crash/crash_pipeline.py (right): https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:60: # The pipeline library is designed in a strange way which requires that On 2016/11/03 04:48:10, stgao (slow on Monday) wrote: > It might be just me, but "strange" seems a little strong in wording. I've done a lot of software engineering on a lot of different projects in a lot of different programming languages, and this set of constraints is extremely peculiar. I can think of no other projects that have similar design restrictions. Given that this design is both counterintuitive and entirely unique to someone with over a decade of engineering experience, I think it's fair to remark on it being "strange". > IMHO, the design of the appengine pipeline looks good. It might just not expect > users to create a constructor in subclasses like what we do here. (a) Not allowing/expecting subclasses to define constructors is assuredly "strange"; indeed, it deserves far stronger words than that. (b) If they don't want us doing that, then maybe we shouldn't be doing it. https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:66: # ``run`` method itself. For more about all this, see: On 2016/11/03 04:48:10, stgao (slow on Monday) wrote: > I don't understand why the yielded objects must take the same arguments as the > run itself. > > We have examples that is against this statement. > https://chromium.googlesource.com/infra/infra/+/master/appengine/findit/water... > > The official documentation also has similar example > https://github.com/GoogleCloudPlatform/appengine-pipelines/wiki/Python#defini... I have no idea. This is what katesonia has said repeatedly, and she's showed me crash logs for the version where ``run`` accepts no arguments, so I believe her. https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:74: # be JSON-serializable. The pipeline class handles most of the details, On 2016/11/03 04:48:10, stgao (slow on Monday) wrote: > For the yielded objects by ``CrashWrapperPipeline.run``, they just must be an > instance of Pipeline; for the __return__ objects by the function, they must be > JSON-serializable. The bug I encountered previously about JSON-serializability was because of passing a Findit object to the CrashAnalysisPipeline and/or CrashPublishPipeline. Replacing that object with a string fixed the problem; what any of the methods return was unaltered. This also agrees with what you say in the start of the next comment... https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:75: # so the force of this constraint is that whatever arguments we pass On 2016/11/03 04:48:10, stgao (slow on Monday) wrote: > It is true that parameters passed over to the constructor of Pipeline must be > JSON-serializable, Which agrees with what I wrote... > but I don't see a cause-result relation between this and the > constraint of returned values by Pipeline.run though. I don't say anything about the return values. (Because CrashAnalysisPipeline.run and CrashPublishPipeline don't actually return anything, so it doesn't matter.) https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:84: # Moreover, the ``run`` and ``finalized`` methods are executed in separate On 2016/11/03 04:48:10, stgao (slow on Monday) wrote: > Of course, it doesn't introduce the problem as described here. Does it not? How do changes from inside ``run`` get propagated to the newly-constructed object on which ``finalized`` is called? If it does actually work I'll remove the comment.
The CQ bit was checked by wrengr@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Description was changed from ========== Moving ScheduleNewAnalysis to break the cycle BUG=659345 ========== to ========== Moving ScheduleNewAnalysis to break the cycle BUG=659345 Committed: https://chromium.googlesource.com/infra/infra/+/b74c6a6cd22bfc0e9ed6baddd089c... ==========
Message was sent while issue was closed.
Committed patchset #5 (id:80001) as https://chromium.googlesource.com/infra/infra/+/b74c6a6cd22bfc0e9ed6baddd089c...
Message was sent while issue was closed.
https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... File appengine/findit/crash/crash_pipeline.py (right): https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:66: # ``run`` method itself. For more about all this, see: On 2016/11/03 17:40:34, wrengr wrote: > On 2016/11/03 04:48:10, stgao (slow on Monday) wrote: > > I don't understand why the yielded objects must take the same arguments as the > > run itself. > > > > We have examples that is against this statement. > > > https://chromium.googlesource.com/infra/infra/+/master/appengine/findit/water... > > > > The official documentation also has similar example > > > https://github.com/GoogleCloudPlatform/appengine-pipelines/wiki/Python#defini... > > I have no idea. This is what katesonia has said repeatedly, and she's showed me > crash logs for the version where ``run`` accepts no arguments, so I believe her. Sharu, would you mind clarifying here? I'm confused. https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:84: # Moreover, the ``run`` and ``finalized`` methods are executed in separate On 2016/11/03 17:40:34, wrengr wrote: > On 2016/11/03 04:48:10, stgao (slow on Monday) wrote: > > Of course, it doesn't introduce the problem as described here. > > Does it not? How do changes from inside ``run`` get propagated to the > newly-constructed object on which ``finalized`` is called? If it does actually > work I'll remove the comment. Sorry for the typo. I meant to say it __does__ introduce the problem here.
Message was sent while issue was closed.
https://codereview.chromium.org/2455053004/diff/1/appengine/findit/crash/cras... File appengine/findit/crash/crash_pipeline.py (right): https://codereview.chromium.org/2455053004/diff/1/appengine/findit/crash/cras... appengine/findit/crash/crash_pipeline.py:107: def run(self): # pragma: no cover On 2016/11/01 23:03:24, sharu jiang wrote: > On 2016/10/28 18:10:59, wrengr wrote: > > On 2016/10/27 23:55:33, sharu jiang wrote: > > > This is wrong... I didn't notice that in your last cl. > > > As I mentioned in last cl, the run has to have the same parameter as its > > > __init__, or we will have errors using yield pipeline(*_, **_) > > > > > > I deployed a test version and here is the error. > > > https://screenshot.googleplex.com/Kf45ZmBh0FY.png > > > > This is what I was hoping you'd've set up the mock testing to check for. > > This has been walked around by calling pipeline.run() instead of yeild statement > in unit tests... so it was not caught. Right. Can you adjust the unittests so that they will catch this in the future? Filed http://crbug.com/662088 https://codereview.chromium.org/2455053004/diff/1/appengine/findit/crash/cras... appengine/findit/crash/crash_pipeline.py:142: def run(self): # pragma: no cover On 2016/11/01 23:03:24, sharu jiang wrote: > On 2016/10/28 18:10:59, wrengr wrote: > > On 2016/10/27 23:55:33, sharu jiang wrote: > > > We'd better avoid "pragma: no cover" big block of code. The > > TestRunningAnalysis > > > should have covered some of the code? > > > > I agree. Alas, this was necessary for CQ to accept the patch and land it. I > can > > bump up the priority of the bug report. > > > > N.B., in many cases (including this one) the code is in fact covered by the > > unittest suite as a whole, it's just not covered by the subset of unittests > > which happen to live in the one foo_test.py file used for computing coverage. > > For a better testing, we should cover this code in it's own foo_test.py, and in > other unittest suite, we can mock this function and do not do duplicate tests if > possible. I agree. You know more about how to mock pipeline stuff out so I think it'd be more effective if you could do this. Cf., http://crbug.com/659346 https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... File appengine/findit/crash/crash_pipeline.py (right): https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:84: # Moreover, the ``run`` and ``finalized`` methods are executed in separate On 2016/11/03 18:17:37, stgao (slow on Monday) wrote: > On 2016/11/03 17:40:34, wrengr wrote: > > On 2016/11/03 04:48:10, stgao (slow on Monday) wrote: > > > Of course, it doesn't introduce the problem as described here. > > > > Does it not? How do changes from inside ``run`` get propagated to the > > newly-constructed object on which ``finalized`` is called? If it does actually > > work I'll remove the comment. > > Sorry for the typo. I meant to say it __does__ introduce the problem here. Cool, I understood it right :)
Message was sent while issue was closed.
https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... File appengine/findit/crash/crash_pipeline.py (right): https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:60: # The pipeline library is designed in a strange way which requires that On 2016/11/03 17:40:34, wrengr wrote: > On 2016/11/03 04:48:10, stgao (slow on Monday) wrote: > > It might be just me, but "strange" seems a little strong in wording. > > I've done a lot of software engineering on a lot of different projects in a lot > of different programming languages, and this set of constraints is extremely > peculiar. I can think of no other projects that have similar design > restrictions. Given that this design is both counterintuitive and entirely > unique to someone with over a decade of engineering experience, I think it's > fair to remark on it being "strange". I would say "special" instead. But it's fine as is. :) > > > > IMHO, the design of the appengine pipeline looks good. It might just not > expect > > users to create a constructor in subclasses like what we do here. > > (a) Not allowing/expecting subclasses to define constructors is assuredly > "strange"; indeed, it deserves far stronger words than that. (b) If they don't > want us doing that, then maybe we shouldn't be doing it. Maybe we should look into alternative solution and avoid adding the constructor. https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:74: # be JSON-serializable. The pipeline class handles most of the details, On 2016/11/03 17:40:34, wrengr wrote: > On 2016/11/03 04:48:10, stgao (slow on Monday) wrote: > > For the yielded objects by ``CrashWrapperPipeline.run``, they just must be an > > instance of Pipeline; for the __return__ objects by the function, they must be > > JSON-serializable. > > The bug I encountered previously about JSON-serializability was because of > passing a Findit object to the CrashAnalysisPipeline and/or > CrashPublishPipeline. Replacing that object with a string fixed the problem; > what any of the methods return was unaltered. This also agrees with what you say > in the start of the next comment... This is true. But my comment above is for "In addition, whatever objects ``CrashWrapperPipeline.run`` yields must be JSON-serializable." which is misleading to me. https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:75: # so the force of this constraint is that whatever arguments we pass On 2016/11/03 17:40:34, wrengr wrote: > On 2016/11/03 04:48:10, stgao (slow on Monday) wrote: > > It is true that parameters passed over to the constructor of Pipeline must be > > JSON-serializable, > > Which agrees with what I wrote... > > > but I don't see a cause-result relation between this and the > > constraint of returned values by Pipeline.run though. > > I don't say anything about the return values. (Because CrashAnalysisPipeline.run > and CrashPublishPipeline don't actually return anything, so it doesn't matter.) Sorry for the typo again. I meant to say __yielded__ objects. In my understanding, the constraint for the parameters to the constructor and the constraints for the __yielded__ objects or __returned__ values by ``run`` are independent from each other, while the constraints are the same for parameters and __returned__ values -- must be JSON-serializable. (__yielded__ objects are different, they must be instances of Pipeline.) But from reading the words here, it's more like a cause-result relation.
Message was sent while issue was closed.
https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... File appengine/findit/crash/crash_pipeline.py (right): https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:60: # The pipeline library is designed in a strange way which requires that On 2016/11/03 18:48:30, stgao (slow on Monday) wrote: > Maybe we should look into alternative solution and avoid adding the constructor. filed http://crbug.com/662145 https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:74: # be JSON-serializable. The pipeline class handles most of the details, On 2016/11/03 18:48:30, stgao (slow on Monday) wrote: > But my comment above is for "In addition, whatever objects > ``CrashWrapperPipeline.run`` yields must be JSON-serializable." which is > misleading to me. The issue I encountered was that ``yield SomePipeline(stuff)`` will eventually lead to a crash if ``stuff`` is not JSON-serializable. Perhaps I've misdiagnosed why exactly that crashes things. Maybe the problem isn't passing ``stuff`` to ``SomePipeline.__init__`` but rather the fact that ``stuff`` will also be passed to ``SomePipeline.run`` because of what the coroutine does with the yielded SomePipeline; I dunno. If you know of a document that explains exactly what the constraints are and why that example crashes, I'll update the discussion here to point to them.
Message was sent while issue was closed.
https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... File appengine/findit/crash/crash_pipeline.py (right): https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:74: # be JSON-serializable. The pipeline class handles most of the details, On 2016/11/03 20:04:21, wrengr wrote: > On 2016/11/03 18:48:30, stgao (slow on Monday) wrote: > > But my comment above is for "In addition, whatever objects > > ``CrashWrapperPipeline.run`` yields must be JSON-serializable." which is > > misleading to me. > > The issue I encountered was that ``yield SomePipeline(stuff)`` will eventually > lead to a crash if ``stuff`` is not JSON-serializable. Perhaps I've misdiagnosed > why exactly that crashes things. Maybe the problem isn't passing ``stuff`` to > ``SomePipeline.__init__`` but rather the fact that ``stuff`` will also be passed > to ``SomePipeline.run`` because of what the coroutine does with the yielded > SomePipeline; I dunno. The problem most likely is due to passing ``stuff`` to ``SomePipeline.__init__``, and it could be verified by checking whether the crash occurs when ``SomePipelineInstance.start`` is called. Without a successful ``start`` (parameters are saved to NDB and a taskqueue task is queued), ``run`` won't be executed. However, it has nothing to do with __yield__. Both of the code below should fail: def some_caller_func_in_app_layer(...): instance = SomePipeline(not-JSON-serializable-parameter) instance.start() and class RootPipeline(pipeline.Pipeline): def run(...): yield SomePipeline(not-JSON-serializable-parameter) > > If you know of a document that explains exactly what the constraints are and why > that example crashes, I'll update the discussion here to point to them. I don't know the exact doc on the constraints, but https://github.com/GoogleCloudPlatform/appengine-pipelines/wiki/Python is the best doc to check.
Message was sent while issue was closed.
https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/crash/... File appengine/findit/crash/crash_pipeline.py (right): https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:157: def run(self, *_args, **_kwargs): # pragma: no cover On 2016/11/02 23:52:26, wrengr wrote: > On 2016/11/01 23:03:24, sharu jiang wrote: > > This is a little hacky, we should at lease add doc saying we will throw out > > parameters because of the yield statement. > > I agree it's a terrible hack, but it's forced on us by the structure of > AppEngine pipelines. I'll add docstrings to reiterate the discussion from lines > 57--81. Sorry for the late response, was busy working on my several pending cls, the doc is in here: https://sookocheff.com/post/appengine/pipelines/connecting-pipelines/ Saying: "The rule of thumb here is that anything you instantiate your pipeline with (and subsequently pass to the run method) is accessible within your pipeline. These are called immediate values and you can treat them as regular Python values."
Message was sent while issue was closed.
lgtm
Message was sent while issue was closed.
https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/crash/... File appengine/findit/crash/crash_pipeline.py (right): https://codereview.chromium.org/2455053004/diff/40001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:157: def run(self, *_args, **_kwargs): # pragma: no cover On 2016/11/03 23:03:25, Sharu Jiang wrote: > On 2016/11/02 23:52:26, wrengr wrote: > > On 2016/11/01 23:03:24, sharu jiang wrote: > > > This is a little hacky, we should at lease add doc saying we will throw out > > > parameters because of the yield statement. > > > > I agree it's a terrible hack, but it's forced on us by the structure of > > AppEngine pipelines. I'll add docstrings to reiterate the discussion from > lines > > 57--81. > > Sorry for the late response, was busy working on my several pending cls, the doc > is in here: > https://sookocheff.com/post/appengine/pipelines/connecting-pipelines/ > > Saying: "The rule of thumb here is that anything you instantiate your pipeline > with (and subsequently pass to the run method) is accessible within your > pipeline. These are called immediate values and you can treat them as regular > Python values." Calling ``run`` directly is not a correct usage for a pipeline ( thus, we cannot find an example in a doc), so that's why I was suggesting to add that note in the doc to avoid that. However, if that's very obvious to all pipeline users, then it's ok to not to bother adding that note in the doc :) https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... File appengine/findit/crash/crash_pipeline.py (right): https://codereview.chromium.org/2455053004/diff/80001/appengine/findit/crash/... appengine/findit/crash/crash_pipeline.py:66: # ``run`` method itself. For more about all this, see: On 2016/11/03 18:17:37, stgao (slow on Monday) wrote: > On 2016/11/03 17:40:34, wrengr wrote: > > On 2016/11/03 04:48:10, stgao (slow on Monday) wrote: > > > I don't understand why the yielded objects must take the same arguments as > the > > > run itself. > > > > > > We have examples that is against this statement. > > > > > > https://chromium.googlesource.com/infra/infra/+/master/appengine/findit/water... > > > > > > The official documentation also has similar example > > > > > > https://github.com/GoogleCloudPlatform/appengine-pipelines/wiki/Python#defini... > > > > I have no idea. This is what katesonia has said repeatedly, and she's showed > me > > crash logs for the version where ``run`` accepts no arguments, so I believe > her. > > Sharu, would you mind clarifying here? I'm confused. The comment "``CrashWrapperPipeline.run`` yields must take the same arguments as that ``run`` method itself is not correct and not what I meant, it should be updated to yield CrashWrapperPipeline(*args, **kwargs), *args and **kwargs should be the same as CrashWrapperPipeline.run", will update this in following which mocks CrashWrapperPipeline. |
