Created: 9 years, 4 months ago by dennis_jeffrey
Modified: 9 years, 4 months ago
CC: chromium-reviews, Nirnimesh, John Grabowski, anantha, dyu1, Paweł Hajdan Jr., dennis_jeffrey
Base URL: svn://svn.chromium.org/chrome/trunk/src
Visibility: Public
Description: PyAuto performance tests now output data to be graphed using autotest.
BUG=chromium-os:18185
TEST=None
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=98517
Patch Set 1 #
Total comments: 10
Patch Set 2 : Addressed review comments. #
Total comments: 4
Patch Set 3 : Addressed second round of review comments. #
Total comments: 17
Patch Set 4 : Addressed latest review comments. #
Patch Set 5 : Now computing sample standard deviation. #
Messages
Total messages: 11 (0 generated)
It'd be nicer if you could get rid of the temp file and directly interpret the output from the child process (perf.py).

http://codereview.chromium.org/7745007/diff/1/chrome/test/chromeos/autotest/f...
File chrome/test/chromeos/autotest/files/client/site_tests/desktopui_PyAutoPerfTests/desktopui_PyAutoPerfTests.py (right):

http://codereview.chromium.org/7745007/diff/1/chrome/test/chromeos/autotest/f...
chrome/test/chromeos/autotest/files/client/site_tests/desktopui_PyAutoPerfTests/desktopui_PyAutoPerfTests.py:19: _PERF_DATA_FILE = '/usr/local/tmp/perf_data.txt'
Use tempfile.gettempdir().

http://codereview.chromium.org/7745007/diff/1/chrome/test/chromeos/autotest/f...
chrome/test/chromeos/autotest/files/client/site_tests/desktopui_PyAutoPerfTests/desktopui_PyAutoPerfTests.py:77: try:
Do not try.

http://codereview.chromium.org/7745007/diff/1/chrome/test/chromeos/autotest/f...
chrome/test/chromeos/autotest/files/client/site_tests/desktopui_PyAutoPerfTests/desktopui_PyAutoPerfTests.py:92: try:
Do not try. Just do it. If something bad happens it'll print it, which is what you're doing in the exception handler anyway.

http://codereview.chromium.org/7745007/diff/1/chrome/test/functional/perf.py
File chrome/test/functional/perf.py (right):

http://codereview.chromium.org/7745007/diff/1/chrome/test/functional/perf.py#...
chrome/test/functional/perf.py:29: _CHROMEOS_OUTPUT_FILE = '/usr/local/tmp/perf_data.txt'
Do not hardcode the path. Use tempfile.gettempdir().

http://codereview.chromium.org/7745007/diff/1/chrome/test/functional/perf.py#...
chrome/test/functional/perf.py:86: try:
Why the try block?
I got rid of the temp file as suggested.

http://codereview.chromium.org/7745007/diff/1/chrome/test/chromeos/autotest/f...
File chrome/test/chromeos/autotest/files/client/site_tests/desktopui_PyAutoPerfTests/desktopui_PyAutoPerfTests.py (right):

http://codereview.chromium.org/7745007/diff/1/chrome/test/chromeos/autotest/f...
chrome/test/chromeos/autotest/files/client/site_tests/desktopui_PyAutoPerfTests/desktopui_PyAutoPerfTests.py:19: _PERF_DATA_FILE = '/usr/local/tmp/perf_data.txt'
On 2011/08/25 00:23:36, Nirnimesh wrote:
> Use tempfile.gettempdir().
I no longer use a temp file at all.

http://codereview.chromium.org/7745007/diff/1/chrome/test/chromeos/autotest/f...
chrome/test/chromeos/autotest/files/client/site_tests/desktopui_PyAutoPerfTests/desktopui_PyAutoPerfTests.py:77: try:
On 2011/08/25 00:23:36, Nirnimesh wrote:
> Do not try.
This code has been removed.

http://codereview.chromium.org/7745007/diff/1/chrome/test/chromeos/autotest/f...
chrome/test/chromeos/autotest/files/client/site_tests/desktopui_PyAutoPerfTests/desktopui_PyAutoPerfTests.py:92: try:
On 2011/08/25 00:23:36, Nirnimesh wrote:
> Do not try. Just do it. If something bad happens it'll print it, which is what you're doing in the exception handler anyway.
The try has since been removed.

http://codereview.chromium.org/7745007/diff/1/chrome/test/functional/perf.py
File chrome/test/functional/perf.py (right):

http://codereview.chromium.org/7745007/diff/1/chrome/test/functional/perf.py#...
chrome/test/functional/perf.py:29: _CHROMEOS_OUTPUT_FILE = '/usr/local/tmp/perf_data.txt'
On 2011/08/25 00:23:36, Nirnimesh wrote:
> Do not hardcode the path. Use tempfile.gettempdir().
I no longer use a temp file at all.

http://codereview.chromium.org/7745007/diff/1/chrome/test/functional/perf.py#...
chrome/test/functional/perf.py:86: try:
On 2011/08/25 00:23:36, Nirnimesh wrote:
> Why the try block?
The try has since been removed.
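For reference, a minimal sketch of the no-temp-file pattern adopted here: run the child test process and read its combined output straight from the pipe, so nothing ever touches the filesystem. The command line is a hypothetical stand-in, not the exact invocation in this CL.

```python
import subprocess

def run_child(cmd):
    """Runs cmd and returns (returncode, combined stdout/stderr)."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    output = proc.communicate()[0]  # read the pipe directly; no file on disk
    return proc.returncode, output

# Hypothetical invocation of the pyauto test runner:
returncode, output = run_child(['python', 'pyauto_functional.py'])
```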
http://codereview.chromium.org/7745007/diff/4/chrome/test/chromeos/autotest/f...
File chrome/test/chromeos/autotest/files/client/site_tests/desktopui_PyAutoPerfTests/desktopui_PyAutoPerfTests.py (right):

http://codereview.chromium.org/7745007/diff/4/chrome/test/chromeos/autotest/f...
chrome/test/chromeos/autotest/files/client/site_tests/desktopui_PyAutoPerfTests/desktopui_PyAutoPerfTests.py:82: stderr=subprocess.STDOUT).communicate()[0]
Check the process's return code and raise an error.TestFail() if perf.py failed.

http://codereview.chromium.org/7745007/diff/4/chrome/test/chromeos/autotest/f...
chrome/test/chromeos/autotest/files/client/site_tests/desktopui_PyAutoPerfTests/desktopui_PyAutoPerfTests.py:88: map(lambda x: x.split(self._PERF_MARKER_POST)[0].split(
Do a regex match. No split.
http://codereview.chromium.org/7745007/diff/4/chrome/test/chromeos/autotest/f...
File chrome/test/chromeos/autotest/files/client/site_tests/desktopui_PyAutoPerfTests/desktopui_PyAutoPerfTests.py (right):

http://codereview.chromium.org/7745007/diff/4/chrome/test/chromeos/autotest/f...
chrome/test/chromeos/autotest/files/client/site_tests/desktopui_PyAutoPerfTests/desktopui_PyAutoPerfTests.py:82: stderr=subprocess.STDOUT).communicate()[0]
On 2011/08/25 01:39:43, Nirnimesh wrote:
> Check the process's return code and raise an error.TestFail() if perf.py failed.
Done.

http://codereview.chromium.org/7745007/diff/4/chrome/test/chromeos/autotest/f...
chrome/test/chromeos/autotest/files/client/site_tests/desktopui_PyAutoPerfTests/desktopui_PyAutoPerfTests.py:88: map(lambda x: x.split(self._PERF_MARKER_POST)[0].split(
On 2011/08/25 01:39:43, Nirnimesh wrote:
> Do a regex match. No split.
Done.
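For reference, a hedged sketch of the two fixes marked "Done" above: check the child's return code and raise error.TestFail() on failure, then pull perf values out of the output with a compiled regex instead of str.split(). The marker format shown is an assumed illustration; the CL's actual _PERF_MARKER_* strings are not reproduced here.

```python
import re
from autotest_lib.client.common_lib import error  # autotest's error module

# Assumed marker format for illustration: "<perf>name: value</perf>"
_PERF_LINE_RE = re.compile(r'<perf>(\w+): ([-\d.]+)</perf>')

def parse_perf_output(returncode, output):
    """Fails the test on a bad exit code, else extracts perf name/value pairs."""
    if returncode != 0:
        raise error.TestFail(
            'pyauto_functional.py failed with code %d' % returncode)
    results = {}
    for line in output.splitlines():
        match = _PERF_LINE_RE.match(line)  # match(): marker must start the line
        if match:
            results[match.group(1)] = float(match.group(2))
    return results
```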
This is neat to see. One other thing: it's typical in Autotest to include a vanilla control file to demonstrate vanilla invocation.

http://codereview.chromium.org/7745007/diff/3002/chrome/test/chromeos/autotes...
File chrome/test/chromeos/autotest/files/client/site_tests/desktopui_PyAutoPerfTests/desktopui_PyAutoPerfTests.py (right):

http://codereview.chromium.org/7745007/diff/3002/chrome/test/chromeos/autotes...
chrome/test/chromeos/autotest/files/client/site_tests/desktopui_PyAutoPerfTests/desktopui_PyAutoPerfTests.py:88: 'pyauto_functional.py: %d' % proc.returncode)
You should consider whether you want to tag this TestFail with a CHROMEOS_PERF string to disambiguate it from other instances of pyauto_functional.py failures.

http://codereview.chromium.org/7745007/diff/3002/chrome/test/chromeos/autotes...
chrome/test/chromeos/autotest/files/client/site_tests/desktopui_PyAutoPerfTests/desktopui_PyAutoPerfTests.py:92: if re_compiled.search(line)]
Based on the way you wrote the perf value line creation, I'd use match(), not search().

http://codereview.chromium.org/7745007/diff/3002/chrome/test/functional/perf.py
File chrome/test/functional/perf.py (right):

http://codereview.chromium.org/7745007/diff/3002/chrome/test/functional/perf....
chrome/test/functional/perf.py:45: def _MeasureElapsedTime(self, python_command, num_invocations):
Have you considered that it would be more reusable to separate the common utility functions here into a base PerfTest class, and then specialize that class with, say, a TabPerfTest class containing the actual test code below?

http://codereview.chromium.org/7745007/diff/3002/chrome/test/functional/perf....
chrome/test/functional/perf.py:61: return (stop_time - start_time) * 1000  # Convert to milliseconds.
Just wondering: why not use timeit? There are issues, such as GC, that can affect pure Python timings, as discussed in the timeit documentation.

http://codereview.chromium.org/7745007/diff/3002/chrome/test/functional/perf....
chrome/test/functional/perf.py:74: std_dev = math.sqrt(sum(temp_vals) / len(temp_vals))
This illustrates why I use a Python stats lib instead of writing these. I think you want to float() the len(temp_vals); otherwise this could end up being int division, which would incur unexpected rounding. Also, you're calculating the population stddev as opposed to the sample stddev, and I think sample is the more common approach.

http://codereview.chromium.org/7745007/diff/3002/chrome/test/functional/perf....
chrome/test/functional/perf.py:109: self._OutputPerfGraphValue('%s_%s' % (units, description), avg)
FYI, it's typical for current ChromeOS perf tests to write all the keyvals to the Autotest db. That way the data is preserved in case a stat function is later proven improper.

http://codereview.chromium.org/7745007/diff/3002/chrome/test/functional/perf....
chrome/test/functional/perf.py:118: will open up a single tab.
I think you should explain this pattern of separating the initial sample from the remaining iterations. It's ambiguous what 1 iteration would mean, since iterations=1 would actually return 2 separate samples.
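For reference, a minimal sketch of the timeit suggestion above. timeit picks the highest-resolution clock available on the platform (timeit.default_timer) and disables garbage collection while timing, which addresses the pure-Python timing issues mentioned; the do_work() function here is a placeholder, not code from this CL.

```python
import timeit

def do_work():
    # Placeholder for the operation being measured (e.g., opening a tab).
    sum(range(100000))

# One-shot timing with the platform's best clock:
start = timeit.default_timer()
do_work()
elapsed_ms = (timeit.default_timer() - start) * 1000  # convert to milliseconds

# Or let timeit repeat the statement; GC is disabled during the run:
total_seconds = timeit.timeit('do_work()',
                              setup='from __main__ import do_work',
                              number=10)
print(elapsed_ms, total_seconds / 10)
```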
Code I commented on LGTM
Thank you both for looking over this code. Mike - I addressed all of your comments, but added a few TODOs for a couple of them - please let me know if you'd still prefer changes in the current CL. Also, there is indeed already a basic autotest control file in the same directory as the file desktopui_PyAutoPerfTests.py; it just wasn't modified in the current CL, so it's not part of it. Is that what you meant by a "vanilla control file" for autotest?

http://codereview.chromium.org/7745007/diff/3002/chrome/test/chromeos/autotes...
File chrome/test/chromeos/autotest/files/client/site_tests/desktopui_PyAutoPerfTests/desktopui_PyAutoPerfTests.py (right):

http://codereview.chromium.org/7745007/diff/3002/chrome/test/chromeos/autotes...
chrome/test/chromeos/autotest/files/client/site_tests/desktopui_PyAutoPerfTests/desktopui_PyAutoPerfTests.py:88: 'pyauto_functional.py: %d' % proc.returncode)
On 2011/08/25 04:25:01, truty wrote:
> You should consider whether you want to tag this TestFail with a CHROMEOS_PERF string to disambiguate it from other instances of pyauto_functional.py failures.
Good idea - I added CHROMEOS_PERF to the failure message to indicate that the failure here happened when running pyauto_functional with that particular suite.

http://codereview.chromium.org/7745007/diff/3002/chrome/test/chromeos/autotes...
chrome/test/chromeos/autotest/files/client/site_tests/desktopui_PyAutoPerfTests/desktopui_PyAutoPerfTests.py:92: if re_compiled.search(line)]
On 2011/08/25 04:25:01, truty wrote:
> Based on the way you wrote the perf value line creation, I'd use match(), not search().
That thought did cross my mind, but I decided to be paranoid :-). I changed it to use "match" instead. However, to appease my paranoia, I also added a newline right before the associated print statement in perf.py to ensure the perf value starts on a new line and is not appended at the end of some existing line (that probably wouldn't happen anyway, but I wanted to be sure).

http://codereview.chromium.org/7745007/diff/3002/chrome/test/functional/perf.py
File chrome/test/functional/perf.py (right):

http://codereview.chromium.org/7745007/diff/3002/chrome/test/functional/perf....
chrome/test/functional/perf.py:45: def _MeasureElapsedTime(self, python_command, num_invocations):
On 2011/08/25 04:25:01, truty wrote:
> Have you considered that it would be more reusable to separate the common utility functions here into a base PerfTest class, and then specialize that class with, say, a TabPerfTest class containing the actual test code below?
I think this is a great idea and would like to do it. But I'd like to address it in a subsequent CL if you don't mind, since it involves restructuring the code here and strays a bit from the point of the current CL. I added a TODO for that to make sure I don't forget about it; I'd like to take care of it in a later "cleanup" CL that involves reorganizing this code.

http://codereview.chromium.org/7745007/diff/3002/chrome/test/functional/perf....
chrome/test/functional/perf.py:61: return (stop_time - start_time) * 1000  # Convert to milliseconds.
On 2011/08/25 04:25:01, truty wrote:
> Just wondering: why not use timeit? There are issues, such as GC, that can affect pure Python timings, as discussed in the timeit documentation.
I didn't use it because I didn't know about it - that's awesome!

http://codereview.chromium.org/7745007/diff/3002/chrome/test/functional/perf....
chrome/test/functional/perf.py:74: std_dev = math.sqrt(sum(temp_vals) / len(temp_vals))
On 2011/08/25 04:25:01, truty wrote:
> This illustrates why I use a Python stats lib instead of writing these. I think you want to float() the len(temp_vals); otherwise this could end up being int division, which would incur unexpected rounding. Also, you're calculating the population stddev as opposed to the sample stddev, and I think sample is the more common approach.
Originally Nirnimesh had recommended I use the "numpy" module for computing avg/std dev, but we found out that it's a third-party library that we could not always count on to be present on all the platforms (e.g., it seemed to exist on Linux but not Windows). Since we'd like these perf tests to also run for Chrome on Windows, we decided to leave it like this.

That is a great point about being careful of integer division unexpectedly cropping up. However, I think it shouldn't happen here, unless I'm missing something (in which case I'd like to hear where I'm wrong!). The reason is that the division at line 74 above depends upon "sum(temp_vals)", which itself is the sum of a set of list elements computed at line 73. Each list element is computed by squaring a value 'x - avg', with 'avg' being computed at line 72. 'avg' should be a float because it is computed using "float(sum(values))". This means the expression 'x - avg' is also a float, making its square a float, making each element of the "temp_vals" list a float, making "sum(temp_vals)" at line 74 a float, preventing integer division at that line. I could be wrong here, so please let me know if I am - I just want to make sure I understand this properly.

You're right about the population vs. sample std dev point you bring up. I was reading a bit about the different types of std dev, and I see that "sample" is more commonly used (according to Wikipedia). The reason I originally went with "population" was that I figured I just want the std dev of the given set of measurements, which seems to form the "complete population" (but I am not sure whether that reasoning is correct). A benefit of using "sample" is that the standard deviation value would be larger, which seems like a more conservative value to report. Do you recommend that I change it to "sample"?

http://codereview.chromium.org/7745007/diff/3002/chrome/test/functional/perf....
chrome/test/functional/perf.py:109: self._OutputPerfGraphValue('%s_%s' % (units, description), avg)
On 2011/08/25 04:25:01, truty wrote:
> FYI, it's typical for current ChromeOS perf tests to write all the keyvals to the Autotest db. That way the data is preserved in case a stat function is later proven improper.
Is this something that happens automatically when values are written to the 'keyval' file using write_perf_keyval()? I saw a comment in platform_BootPerf.py saying that "Keyval attribute names go into a database that truncates after 30 characters", implying that those results are going into the database, but I don't see any code there that explicitly does that.

http://codereview.chromium.org/7745007/diff/3002/chrome/test/functional/perf....
chrome/test/functional/perf.py:118: will open up a single tab.
On 2011/08/25 04:25:01, truty wrote:
> I think you should explain this pattern of separating the initial sample from the remaining iterations. It's ambiguous what 1 iteration would mean, since iterations=1 would actually return 2 separate samples.
This is indeed something that was bugging me a little - actually, I think I'd like to remove the code that does the initial sample, since I'm not using it for anything right now. That way, the number of iterations specified would exactly match the number of samples actually taken. I added a TODO to take care of this when I clean up the organization of this file.
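For reference, a worked sketch of the statistics under discussion, in pure Python so it runs wherever these tests do (no numpy). It illustrates both points from the thread: the float() on the sum keeps the whole chain in float arithmetic, and the only difference between the population and sample standard deviation is the n vs. n - 1 denominator. The helper name and example data are illustrative only.

```python
import math

def mean_and_stddevs(values):
    # float() forces float division here; every later term stays a float.
    avg = float(sum(values)) / len(values)
    squared_diffs = [(x - avg) ** 2 for x in values]  # each element is a float
    population_sd = math.sqrt(sum(squared_diffs) / len(values))       # divide by n
    sample_sd = math.sqrt(sum(squared_diffs) / (len(values) - 1))     # divide by n - 1
    return avg, population_sd, sample_sd

# The sample stddev is always the larger, more conservative value:
print(mean_and_stddevs([10, 12, 23, 23, 16, 23, 21, 16]))
```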
All in all, I only have TODOs left, so LGTM. What throws me is that I expected to review a patch with the updates, and I don't see it. I'll LGTM this here.

http://codereview.chromium.org/7745007/diff/3002/chrome/test/functional/perf.py
File chrome/test/functional/perf.py (right):

http://codereview.chromium.org/7745007/diff/3002/chrome/test/functional/perf....
chrome/test/functional/perf.py:74: std_dev = math.sqrt(sum(temp_vals) / len(temp_vals))
On 2011/08/25 22:33:05, dennis_jeffrey wrote:
> [snip: quoted discussion of numpy availability, float division, and population vs. sample std dev, reproduced in full above]

- I use a Python stats library that is open source - we should discuss it this week.
- You're right about the float. I'm just paranoid, I guess - sorry.
- I prefer the sample stddev - please check with jrbarnette for a final preference, though.

http://codereview.chromium.org/7745007/diff/3002/chrome/test/functional/perf....
chrome/test/functional/perf.py:109: self._OutputPerfGraphValue('%s_%s' % (units, description), avg)
On 2011/08/25 22:33:05, dennis_jeffrey wrote:
> On 2011/08/25 04:25:01, truty wrote:
> > FYI, it's typical for current ChromeOS perf tests to write all the keyvals to the Autotest db. That way the data is preserved in case a stat function is later proven improper.
>
> Is this something that happens automatically when values are written to the 'keyval' file using write_perf_keyval()? I saw a comment in platform_BootPerf.py saying that "Keyval attribute names go into a database that truncates after 30 characters", implying that those results are going into the database, but I don't see any code there that explicitly does that.

Yes. Using write_perf_keyval() causes the entire dict to get written to the db.
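For reference, a minimal sketch of the keyval pattern confirmed above, assuming the standard autotest client-test layout; the class name and key names here are hypothetical, not the ones this CL emits.

```python
from autotest_lib.client.bin import test

class desktopui_ExamplePerf(test.test):
    # Hypothetical autotest client test illustrating write_perf_keyval().
    version = 1

    def run_once(self):
        # Keyval attribute names truncate at 30 characters in the results db.
        perf_results = {
            'milliseconds_NewTabPage_avg': 123.4,  # illustrative values
            'milliseconds_NewTabPage_sd': 5.6,
        }
        # Writing the dict persists every key/value to the Autotest db.
        self.write_perf_keyval(perf_results)
```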
OK, I see the updates: LGTM.
Submitting this now - thank you!

http://codereview.chromium.org/7745007/diff/3002/chrome/test/functional/perf.py
File chrome/test/functional/perf.py (right):

http://codereview.chromium.org/7745007/diff/3002/chrome/test/functional/perf....
chrome/test/functional/perf.py:74: std_dev = math.sqrt(sum(temp_vals) / len(temp_vals))
On 2011/08/26 20:33:00, truty wrote:
> [snip: quoted discussion of numpy availability, float division, and population vs. sample std dev, reproduced in full above]
>
> - I use a Python stats library that is open source - we should discuss it this week.
> - You're right about the float. I'm just paranoid, I guess - sorry.
> - I prefer the sample stddev - please check with jrbarnette for a final preference, though.

I changed the computation to the "sample" std dev, based on the suggestion from jrbarnette.