|
|
Created:
4 years, 10 months ago by Stefano Sanfilippo Modified:
4 years, 10 months ago Reviewers:
Michael Achenbach CC:
v8-reviews_googlegroups.com, tandrii(chromium) Base URL:
https://chromium.googlesource.com/v8/v8.git@master Target Ref:
refs/pending/heads/master Project:
v8 Visibility:
Public. |
Description[Swarming] work around slow calls in archive.py
Apparently, the tarfile Python module spends a lot of time in
grp.getgrid for retrieving a piece information (the name of the
primary group) which we don't need anyway. There is no
proper way to disable these slow calls, but there's a workaround
which relies on the way in which grp (and pwd) is used.
In fact, pwd and grp are imported in this fashion:
try:
import grp, pwd
except ImportError:
grp = pwd = None
and then used with the following pattern [2]:
if grp:
try:
tarinfo.gname = grp.getgrgid(tarinfo.gid)[0]
except KeyError:
pass
By setting grp and pwd to None, thus skipping the calls, I was
able to achieve a 35x speedup on my workstation.
The user and group names are set to test262 when building the tar.
The downside to this approach is that we are relying on an
implementation detail, which is not in the public API.
However, the blamelist shows that the relevant bits of the module
have not been updated since 2003 [3], so we might as well assume
that the workaround will keep working, on cPython 2.x at least.
---
[1] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l56
[2] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l1933
[3] https://hg.python.org/cpython/rev/f9a5ed092660
BUG=chromium:535160
LOG=N
Committed: https://crrev.com/1c1b70c98d9231bf1606127796a57459aebba481
Cr-Commit-Position: refs/heads/master@{#34245}
Patch Set 1 #
Messages
Total messages: 23 (11 generated)
ssanfilippo@chromium.org changed reviewers: + machenbach@chromium.org
Here is the fix we discussed by e-mail. It's a workaround, so I understand if you rather decline it. PTAL.
Description was changed from ========== [Swarming] work around slow calls in archive.py Apparently, the tarfile Python module spends a lot of time in grp.getgrid for retrieving a piece information (the name of the primary group) which we don't need anyway. There is no proper way to disable these slow calls, but there's a simple workaround due to the way in which modules grp and pwd are used. In fact, pwd and grp are imported in this fashion: try: import grp, pwd except ImportError: grp = pwd = None and then used with the following pattern [2]: if grp: try: tarinfo.gname = grp.getgrgid(tarinfo.gid)[0] except KeyError: pass By setting grp and pwd to None, thus skipping the calls, I was able to achieve a 35x speedup on my workstation. The downside to this approach is that we are relying on an implementation detail, which is not in the a public API. However, the blamelist shows that the relevant bits of the module have not been updated since 2003 [3], so we might as well assume that the workaround will keep working, on cPython 2.x at least. --- [1] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l56 [2] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l1933 [3] https://hg.python.org/cpython/rev/f9a5ed092660 BUG=chromium:535160 LOG=N ========== to ========== [Swarming] work around slow calls in archive.py Apparently, the tarfile Python module spends a lot of time in grp.getgrid for retrieving a piece information (the name of the primary group) which we don't need anyway. There is no proper way to disable these slow calls, but there's a simple workaround due to the way in which modules grp and pwd are used. In fact, pwd and grp are imported in this fashion: try: import grp, pwd except ImportError: grp = pwd = None and then used with the following pattern [2]: if grp: try: tarinfo.gname = grp.getgrgid(tarinfo.gid)[0] except KeyError: pass By setting grp and pwd to None, thus skipping the calls, I was able to achieve a 35x speedup on my workstation. The downside to this approach is that we are relying on an implementation detail, which is not in the public API. However, the blamelist shows that the relevant bits of the module have not been updated since 2003 [3], so we might as well assume that the workaround will keep working, on cPython 2.x at least. --- [1] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l56 [2] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l1933 [3] https://hg.python.org/cpython/rev/f9a5ed092660 BUG=chromium:535160 LOG=N ==========
Description was changed from ========== [Swarming] work around slow calls in archive.py Apparently, the tarfile Python module spends a lot of time in grp.getgrid for retrieving a piece information (the name of the primary group) which we don't need anyway. There is no proper way to disable these slow calls, but there's a simple workaround due to the way in which modules grp and pwd are used. In fact, pwd and grp are imported in this fashion: try: import grp, pwd except ImportError: grp = pwd = None and then used with the following pattern [2]: if grp: try: tarinfo.gname = grp.getgrgid(tarinfo.gid)[0] except KeyError: pass By setting grp and pwd to None, thus skipping the calls, I was able to achieve a 35x speedup on my workstation. The downside to this approach is that we are relying on an implementation detail, which is not in the public API. However, the blamelist shows that the relevant bits of the module have not been updated since 2003 [3], so we might as well assume that the workaround will keep working, on cPython 2.x at least. --- [1] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l56 [2] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l1933 [3] https://hg.python.org/cpython/rev/f9a5ed092660 BUG=chromium:535160 LOG=N ========== to ========== [Swarming] work around slow calls in archive.py Apparently, the tarfile Python module spends a lot of time in grp.getgrid for retrieving a piece information (the name of the primary group) which we don't need anyway. There is no proper way to disable these slow calls, but there's a workaround which relies on the way in which modules grp and pwd are used. In fact, pwd and grp are imported in this fashion: try: import grp, pwd except ImportError: grp = pwd = None and then used with the following pattern [2]: if grp: try: tarinfo.gname = grp.getgrgid(tarinfo.gid)[0] except KeyError: pass By setting grp and pwd to None, thus skipping the calls, I was able to achieve a 35x speedup on my workstation. The downside to this approach is that we are relying on an implementation detail, which is not in the public API. However, the blamelist shows that the relevant bits of the module have not been updated since 2003 [3], so we might as well assume that the workaround will keep working, on cPython 2.x at least. --- [1] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l56 [2] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l1933 [3] https://hg.python.org/cpython/rev/f9a5ed092660 BUG=chromium:535160 LOG=N ==========
Description was changed from ========== [Swarming] work around slow calls in archive.py Apparently, the tarfile Python module spends a lot of time in grp.getgrid for retrieving a piece information (the name of the primary group) which we don't need anyway. There is no proper way to disable these slow calls, but there's a workaround which relies on the way in which modules grp and pwd are used. In fact, pwd and grp are imported in this fashion: try: import grp, pwd except ImportError: grp = pwd = None and then used with the following pattern [2]: if grp: try: tarinfo.gname = grp.getgrgid(tarinfo.gid)[0] except KeyError: pass By setting grp and pwd to None, thus skipping the calls, I was able to achieve a 35x speedup on my workstation. The downside to this approach is that we are relying on an implementation detail, which is not in the public API. However, the blamelist shows that the relevant bits of the module have not been updated since 2003 [3], so we might as well assume that the workaround will keep working, on cPython 2.x at least. --- [1] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l56 [2] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l1933 [3] https://hg.python.org/cpython/rev/f9a5ed092660 BUG=chromium:535160 LOG=N ========== to ========== [Swarming] work around slow calls in archive.py Apparently, the tarfile Python module spends a lot of time in grp.getgrid for retrieving a piece information (the name of the primary group) which we don't need anyway. There is no proper way to disable these slow calls, but there's a workaround which relies on the way in which grp (and pwd) is used. In fact, pwd and grp are imported in this fashion: try: import grp, pwd except ImportError: grp = pwd = None and then used with the following pattern [2]: if grp: try: tarinfo.gname = grp.getgrgid(tarinfo.gid)[0] except KeyError: pass By setting grp and pwd to None, thus skipping the calls, I was able to achieve a 35x speedup on my workstation. The downside to this approach is that we are relying on an implementation detail, which is not in the public API. However, the blamelist shows that the relevant bits of the module have not been updated since 2003 [3], so we might as well assume that the workaround will keep working, on cPython 2.x at least. --- [1] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l56 [2] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l1933 [3] https://hg.python.org/cpython/rev/f9a5ed092660 BUG=chromium:535160 LOG=N ==========
Description was changed from ========== [Swarming] work around slow calls in archive.py Apparently, the tarfile Python module spends a lot of time in grp.getgrid for retrieving a piece information (the name of the primary group) which we don't need anyway. There is no proper way to disable these slow calls, but there's a workaround which relies on the way in which grp (and pwd) is used. In fact, pwd and grp are imported in this fashion: try: import grp, pwd except ImportError: grp = pwd = None and then used with the following pattern [2]: if grp: try: tarinfo.gname = grp.getgrgid(tarinfo.gid)[0] except KeyError: pass By setting grp and pwd to None, thus skipping the calls, I was able to achieve a 35x speedup on my workstation. The downside to this approach is that we are relying on an implementation detail, which is not in the public API. However, the blamelist shows that the relevant bits of the module have not been updated since 2003 [3], so we might as well assume that the workaround will keep working, on cPython 2.x at least. --- [1] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l56 [2] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l1933 [3] https://hg.python.org/cpython/rev/f9a5ed092660 BUG=chromium:535160 LOG=N ========== to ========== [Swarming] work around slow calls in archive.py Apparently, the tarfile Python module spends a lot of time in grp.getgrid for retrieving a piece information (the name of the primary group) which we don't need anyway. There is no proper way to disable these slow calls, but there's a workaround which relies on the way in which grp (and pwd) is used. In fact, pwd and grp are imported in this fashion: try: import grp, pwd except ImportError: grp = pwd = None and then used with the following pattern [2]: if grp: try: tarinfo.gname = grp.getgrgid(tarinfo.gid)[0] except KeyError: pass By setting grp and pwd to None, thus skipping the calls, I was able to achieve a 35x speedup on my workstation. The user and group names are then set to test262 when building the tar. The downside to this approach is that we are relying on an implementation detail, which is not in the public API. However, the blamelist shows that the relevant bits of the module have not been updated since 2003 [3], so we might as well assume that the workaround will keep working, on cPython 2.x at least. --- [1] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l56 [2] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l1933 [3] https://hg.python.org/cpython/rev/f9a5ed092660 BUG=chromium:535160 LOG=N ==========
Description was changed from ========== [Swarming] work around slow calls in archive.py Apparently, the tarfile Python module spends a lot of time in grp.getgrid for retrieving a piece information (the name of the primary group) which we don't need anyway. There is no proper way to disable these slow calls, but there's a workaround which relies on the way in which grp (and pwd) is used. In fact, pwd and grp are imported in this fashion: try: import grp, pwd except ImportError: grp = pwd = None and then used with the following pattern [2]: if grp: try: tarinfo.gname = grp.getgrgid(tarinfo.gid)[0] except KeyError: pass By setting grp and pwd to None, thus skipping the calls, I was able to achieve a 35x speedup on my workstation. The user and group names are then set to test262 when building the tar. The downside to this approach is that we are relying on an implementation detail, which is not in the public API. However, the blamelist shows that the relevant bits of the module have not been updated since 2003 [3], so we might as well assume that the workaround will keep working, on cPython 2.x at least. --- [1] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l56 [2] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l1933 [3] https://hg.python.org/cpython/rev/f9a5ed092660 BUG=chromium:535160 LOG=N ========== to ========== [Swarming] work around slow calls in archive.py Apparently, the tarfile Python module spends a lot of time in grp.getgrid for retrieving a piece information (the name of the primary group) which we don't need anyway. There is no proper way to disable these slow calls, but there's a workaround which relies on the way in which grp (and pwd) is used. In fact, pwd and grp are imported in this fashion: try: import grp, pwd except ImportError: grp = pwd = None and then used with the following pattern [2]: if grp: try: tarinfo.gname = grp.getgrgid(tarinfo.gid)[0] except KeyError: pass By setting grp and pwd to None, thus skipping the calls, I was able to achieve a 35x speedup on my workstation. The user and group names are set to test262 when building the tar. The downside to this approach is that we are relying on an implementation detail, which is not in the public API. However, the blamelist shows that the relevant bits of the module have not been updated since 2003 [3], so we might as well assume that the workaround will keep working, on cPython 2.x at least. --- [1] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l56 [2] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l1933 [3] https://hg.python.org/cpython/rev/f9a5ed092660 BUG=chromium:535160 LOG=N ==========
The CQ bit was checked by machenbach@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1727773002/1 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1727773002/1
I love work-arounds! Lets see if it works...
On 2016/02/23 19:47:16, Michael Achenbach wrote: > I love work-arounds! Lets see if it works... To be fair: My CL introducing archive.py is already a workaround ;)
Wonder if this has the desired effect. According to the compiler stats, taring still takes 33 seconds: http://chromium-build-stats.appspot.com/ninja_log/2016/02/23/slave404-c4/ninj...
On 2016/02/23 19:58:24, Michael Achenbach wrote: > Wonder if this has the desired effect. According to the compiler stats, taring > still takes 33 seconds: > http://chromium-build-stats.appspot.com/ninja_log/2016/02/23/slave404-c4/ninj... Or 54 seconds: http://chromium-build-stats.appspot.com/ninja_log/2016/02/23/slave587-c4/ninj...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
On 2016/02/23 20:01:18, Michael Achenbach wrote: > On 2016/02/23 19:58:24, Michael Achenbach wrote: > > Wonder if this has the desired effect. According to the compiler stats, taring > > still takes 33 seconds: > > > http://chromium-build-stats.appspot.com/ninja_log/2016/02/23/slave404-c4/ninj... > > Or 54 seconds: > http://chromium-build-stats.appspot.com/ninja_log/2016/02/23/slave587-c4/ninj... I am not sure I'm reading the numbers correctly, but the build time of all targets seem to be in the same ballpark and suspiciously long. Maybe the bots were under heavy load? For instance, obj/test/cctest/cctest.test-asm-validator.o takes 12s on my laptop and 2m19s on the first buildbot.
On 2016/02/24 01:30:42, ssanfilippo wrote: > On 2016/02/23 20:01:18, Michael Achenbach wrote: > > On 2016/02/23 19:58:24, Michael Achenbach wrote: > > > Wonder if this has the desired effect. According to the compiler stats, > taring > > > still takes 33 seconds: > > > > > > http://chromium-build-stats.appspot.com/ninja_log/2016/02/23/slave404-c4/ninj... > > > > Or 54 seconds: > > > http://chromium-build-stats.appspot.com/ninja_log/2016/02/23/slave587-c4/ninj... > > I am not sure I'm reading the numbers correctly, but the build time of all > targets seem to be in the same ballpark and suspiciously long. > Maybe the bots were under heavy load? > > For instance, obj/test/cctest/cctest.test-asm-validator.o takes 12s on my laptop > and 2m19s on the first buildbot. lgtm if it works for you Goma might have been under load, while the tar job was executed locally. But I won't block this as it seems not to do any harm. The tar is still made with everything that should be in it. At least on the tested platforms. We'll see after landing if there's any catch on other platforms.
The CQ bit was checked by ssanfilippo@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1727773002/1 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1727773002/1
Message was sent while issue was closed.
Description was changed from ========== [Swarming] work around slow calls in archive.py Apparently, the tarfile Python module spends a lot of time in grp.getgrid for retrieving a piece information (the name of the primary group) which we don't need anyway. There is no proper way to disable these slow calls, but there's a workaround which relies on the way in which grp (and pwd) is used. In fact, pwd and grp are imported in this fashion: try: import grp, pwd except ImportError: grp = pwd = None and then used with the following pattern [2]: if grp: try: tarinfo.gname = grp.getgrgid(tarinfo.gid)[0] except KeyError: pass By setting grp and pwd to None, thus skipping the calls, I was able to achieve a 35x speedup on my workstation. The user and group names are set to test262 when building the tar. The downside to this approach is that we are relying on an implementation detail, which is not in the public API. However, the blamelist shows that the relevant bits of the module have not been updated since 2003 [3], so we might as well assume that the workaround will keep working, on cPython 2.x at least. --- [1] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l56 [2] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l1933 [3] https://hg.python.org/cpython/rev/f9a5ed092660 BUG=chromium:535160 LOG=N ========== to ========== [Swarming] work around slow calls in archive.py Apparently, the tarfile Python module spends a lot of time in grp.getgrid for retrieving a piece information (the name of the primary group) which we don't need anyway. There is no proper way to disable these slow calls, but there's a workaround which relies on the way in which grp (and pwd) is used. In fact, pwd and grp are imported in this fashion: try: import grp, pwd except ImportError: grp = pwd = None and then used with the following pattern [2]: if grp: try: tarinfo.gname = grp.getgrgid(tarinfo.gid)[0] except KeyError: pass By setting grp and pwd to None, thus skipping the calls, I was able to achieve a 35x speedup on my workstation. The user and group names are set to test262 when building the tar. The downside to this approach is that we are relying on an implementation detail, which is not in the public API. However, the blamelist shows that the relevant bits of the module have not been updated since 2003 [3], so we might as well assume that the workaround will keep working, on cPython 2.x at least. --- [1] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l56 [2] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l1933 [3] https://hg.python.org/cpython/rev/f9a5ed092660 BUG=chromium:535160 LOG=N ==========
Message was sent while issue was closed.
Committed patchset #1 (id:1)
Message was sent while issue was closed.
Description was changed from ========== [Swarming] work around slow calls in archive.py Apparently, the tarfile Python module spends a lot of time in grp.getgrid for retrieving a piece information (the name of the primary group) which we don't need anyway. There is no proper way to disable these slow calls, but there's a workaround which relies on the way in which grp (and pwd) is used. In fact, pwd and grp are imported in this fashion: try: import grp, pwd except ImportError: grp = pwd = None and then used with the following pattern [2]: if grp: try: tarinfo.gname = grp.getgrgid(tarinfo.gid)[0] except KeyError: pass By setting grp and pwd to None, thus skipping the calls, I was able to achieve a 35x speedup on my workstation. The user and group names are set to test262 when building the tar. The downside to this approach is that we are relying on an implementation detail, which is not in the public API. However, the blamelist shows that the relevant bits of the module have not been updated since 2003 [3], so we might as well assume that the workaround will keep working, on cPython 2.x at least. --- [1] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l56 [2] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l1933 [3] https://hg.python.org/cpython/rev/f9a5ed092660 BUG=chromium:535160 LOG=N ========== to ========== [Swarming] work around slow calls in archive.py Apparently, the tarfile Python module spends a lot of time in grp.getgrid for retrieving a piece information (the name of the primary group) which we don't need anyway. There is no proper way to disable these slow calls, but there's a workaround which relies on the way in which grp (and pwd) is used. In fact, pwd and grp are imported in this fashion: try: import grp, pwd except ImportError: grp = pwd = None and then used with the following pattern [2]: if grp: try: tarinfo.gname = grp.getgrgid(tarinfo.gid)[0] except KeyError: pass By setting grp and pwd to None, thus skipping the calls, I was able to achieve a 35x speedup on my workstation. The user and group names are set to test262 when building the tar. The downside to this approach is that we are relying on an implementation detail, which is not in the public API. However, the blamelist shows that the relevant bits of the module have not been updated since 2003 [3], so we might as well assume that the workaround will keep working, on cPython 2.x at least. --- [1] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l56 [2] https://hg.python.org/cpython/file/2.7/Lib/tarfile.py#l1933 [3] https://hg.python.org/cpython/rev/f9a5ed092660 BUG=chromium:535160 LOG=N Committed: https://crrev.com/1c1b70c98d9231bf1606127796a57459aebba481 Cr-Commit-Position: refs/heads/master@{#34245} ==========
Message was sent while issue was closed.
Patchset 1 (id:??) landed as https://crrev.com/1c1b70c98d9231bf1606127796a57459aebba481 Cr-Commit-Position: refs/heads/master@{#34245} |