|
|
DescriptionUse GUID to generate unique temp file names and retire GetTempFileName
We use GetTempFileName Windows API to generate an unique temp file name
in some scenarios like downloading files.
It is mentioned on MSDN for this API that "Due to the algorithm used to
generate file names, GetTempFileName Windows API can perform poorly when
creating a large number of files with the same prefix. In such cases, it
is recommended that you construct unique file names based on GUIDs." The
reasons why it's doing poorly include it takes a long time to find an
unique name when a lot of names are already in use. This has been proved
by the fact we found this API call takes minutes for some users who have
lots of temp files in their default download directory. Note that the
temp files in the default download directory can be generated by any
software other than Chrome.
This CL retires GetTempFileName and uses GUID to generate unique temp
file names. With GUID, it is almost guaranteed that an unique temp file
name can be generated with a single attempt.
BUG=103737
Review-Url: https://codereview.chromium.org/2788483005
Cr-Commit-Position: refs/heads/master@{#462662}
Committed: https://chromium.googlesource.com/chromium/src/+/170f1cf7d23ce3e3fb99085042bf68b8772825a1
Patch Set 1 #
Total comments: 12
Patch Set 2 : Address comments. #
Total comments: 8
Patch Set 3 : Address early comments from grt@ #
Total comments: 4
Patch Set 4 : Using GUID to generate new temp file names. #
Total comments: 11
Patch Set 5 : Address comments. #Patch Set 6 : Revert some variable names. #
Total comments: 4
Patch Set 7 : Add comments to the code, fix nits. #
Total comments: 12
Patch Set 8 : Address comments from gab@ #
Total comments: 2
Patch Set 9 : Add a loop to make sure an unique name is generated. #
Total comments: 2
Messages
Total messages: 99 (72 generated)
The CQ bit was checked by chengx@chromium.org to run a CQ dry run
Description was changed from ========== Supply random prefix to GetTempFileName. BUG=103737 ========== to ========== Supply random prefix to GetTempFileName BUG=103737 ==========
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was checked by chengx@chromium.org to run a CQ dry run
Patchset #1 (id:1) has been deleted
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Description was changed from ========== Supply random prefix to GetTempFileName BUG=103737 ========== to ========== Supply random prefix to GetTempFileName We use GetTempFileName Windows API to generate an unique temp file name in some scenarios like downloading files. It is mentioned on MSDN for this API that "Due to the algorithm used to generate file names, GetTempFileName Windows API can perform poorly when creating a large number of files with the same prefix." The reasons why is is doing poorly include that it takes a long time to find an unique name when a lot of names are already in use. This has been proved by the fact we found this API runs for ages for some users who have lots of temp files in their default download directory. Note that the temp files in the default download directory can be generated any software other than Chrome. This CL supplies a random 3-character prefix to GetTempFileName API. Now the chance of generating an unique file name is way more higher than supplying no prefix. However, it still takes a long time to read all the existing file names from the disk to make sure the generated name is unique. BUG=103737 ==========
Description was changed from ========== Supply random prefix to GetTempFileName We use GetTempFileName Windows API to generate an unique temp file name in some scenarios like downloading files. It is mentioned on MSDN for this API that "Due to the algorithm used to generate file names, GetTempFileName Windows API can perform poorly when creating a large number of files with the same prefix." The reasons why is is doing poorly include that it takes a long time to find an unique name when a lot of names are already in use. This has been proved by the fact we found this API runs for ages for some users who have lots of temp files in their default download directory. Note that the temp files in the default download directory can be generated any software other than Chrome. This CL supplies a random 3-character prefix to GetTempFileName API. Now the chance of generating an unique file name is way more higher than supplying no prefix. However, it still takes a long time to read all the existing file names from the disk to make sure the generated name is unique. BUG=103737 ========== to ========== Supply random prefix to GetTempFileName We use GetTempFileName Windows API to generate an unique temp file name in some scenarios like downloading files. It is mentioned on MSDN for this API that "Due to the algorithm used to generate file names, GetTempFileName Windows API can perform poorly when creating a large number of files with the same prefix." The reasons why it's doing poorly include it takes a long time to find an unique name when a lot of names are already in use. This has been proved by the fact we found this API runs for ages for some users who have lots of temp files in their default download directory. Note that the temp files in the default download directory can be generated any software other than Chrome. This CL supplies a random 3-character prefix to GetTempFileName API. Now the chance of successfully generating an unique file name is way more higher than before when no prefix is supplied. However, it still takes a long time to read all the existing file names from the disk to make sure the generated name is unique. BUG=103737 ==========
Description was changed from ========== Supply random prefix to GetTempFileName We use GetTempFileName Windows API to generate an unique temp file name in some scenarios like downloading files. It is mentioned on MSDN for this API that "Due to the algorithm used to generate file names, GetTempFileName Windows API can perform poorly when creating a large number of files with the same prefix." The reasons why it's doing poorly include it takes a long time to find an unique name when a lot of names are already in use. This has been proved by the fact we found this API runs for ages for some users who have lots of temp files in their default download directory. Note that the temp files in the default download directory can be generated any software other than Chrome. This CL supplies a random 3-character prefix to GetTempFileName API. Now the chance of successfully generating an unique file name is way more higher than before when no prefix is supplied. However, it still takes a long time to read all the existing file names from the disk to make sure the generated name is unique. BUG=103737 ========== to ========== Supply random prefix to GetTempFileName We use GetTempFileName Windows API to generate an unique temp file name in some scenarios like downloading files. It is mentioned on MSDN for this API that "Due to the algorithm used to generate file names, GetTempFileName Windows API can perform poorly when creating a large number of files with the same prefix." The reasons why it's doing poorly include it takes a long time to find an unique name when a lot of names are already in use. This has been proved by the fact we found this API runs for ages for some users who have lots of temp files in their default download directory. Note that the temp files in the default download directory can be generated by any software other than Chrome. This CL supplies a random 3-character prefix to GetTempFileName API. Now the chance of successfully generating an unique file name is way more higher than before when no prefix is supplied. However, it still takes a long time to read all the existing file names from the disk to make sure the generated name is unique. BUG=103737 ==========
Description was changed from ========== Supply random prefix to GetTempFileName We use GetTempFileName Windows API to generate an unique temp file name in some scenarios like downloading files. It is mentioned on MSDN for this API that "Due to the algorithm used to generate file names, GetTempFileName Windows API can perform poorly when creating a large number of files with the same prefix." The reasons why it's doing poorly include it takes a long time to find an unique name when a lot of names are already in use. This has been proved by the fact we found this API runs for ages for some users who have lots of temp files in their default download directory. Note that the temp files in the default download directory can be generated by any software other than Chrome. This CL supplies a random 3-character prefix to GetTempFileName API. Now the chance of successfully generating an unique file name is way more higher than before when no prefix is supplied. However, it still takes a long time to read all the existing file names from the disk to make sure the generated name is unique. BUG=103737 ========== to ========== Supply random prefix to GetTempFileName We use GetTempFileName Windows API to generate an unique temp file name in some scenarios like downloading files. It is mentioned on MSDN for this API that "Due to the algorithm used to generate file names, GetTempFileName Windows API can perform poorly when creating a large number of files with the same prefix." The reasons why it's doing poorly include it takes a long time to find an unique name when a lot of names are already in use. This has been proved by the fact we found this API runs for ages for some users who have lots of temp files in their default download directory. Note that the temp files in the default download directory can be generated by any software other than Chrome. This CL supplies a random 3-character prefix to GetTempFileName API. Now the chance of successfully generating an unique file name is way more higher than before when no prefix is supplied. However, it still takes a long time to read all the existing file names from the disk to make sure the generated name is unique. BUG=103737 ==========
Description was changed from ========== Supply random prefix to GetTempFileName We use GetTempFileName Windows API to generate an unique temp file name in some scenarios like downloading files. It is mentioned on MSDN for this API that "Due to the algorithm used to generate file names, GetTempFileName Windows API can perform poorly when creating a large number of files with the same prefix." The reasons why it's doing poorly include it takes a long time to find an unique name when a lot of names are already in use. This has been proved by the fact we found this API runs for ages for some users who have lots of temp files in their default download directory. Note that the temp files in the default download directory can be generated by any software other than Chrome. This CL supplies a random 3-character prefix to GetTempFileName API. Now the chance of successfully generating an unique file name is way more higher than before when no prefix is supplied. However, it still takes a long time to read all the existing file names from the disk to make sure the generated name is unique. BUG=103737 ========== to ========== Supply random prefix to GetTempFileName We use GetTempFileName Windows API to generate an unique temp file name in some scenarios like downloading files. It is mentioned on MSDN for this API that "Due to the algorithm used to generate file names, GetTempFileName Windows API can perform poorly when creating a large number of files with the same prefix." The reasons why it's doing poorly include it takes a long time to find an unique name when a lot of names are already in use. This has been proved by the fact we found this API runs for ages for some users who have lots of temp files in their default download directory. Note that the temp files in the default download directory can be generated by any software other than Chrome. This CL supplies a random 3-character prefix to GetTempFileName API. Now the chance of successfully generating an unique file name is way more higher than before when no prefix is supplied. BUG=103737 ==========
Description was changed from ========== Supply random prefix to GetTempFileName We use GetTempFileName Windows API to generate an unique temp file name in some scenarios like downloading files. It is mentioned on MSDN for this API that "Due to the algorithm used to generate file names, GetTempFileName Windows API can perform poorly when creating a large number of files with the same prefix." The reasons why it's doing poorly include it takes a long time to find an unique name when a lot of names are already in use. This has been proved by the fact we found this API runs for ages for some users who have lots of temp files in their default download directory. Note that the temp files in the default download directory can be generated by any software other than Chrome. This CL supplies a random 3-character prefix to GetTempFileName API. Now the chance of successfully generating an unique file name is way more higher than before when no prefix is supplied. BUG=103737 ========== to ========== Supply random prefix to GetTempFileName We use GetTempFileName Windows API to generate an unique temp file name in some scenarios like downloading files. It is mentioned on MSDN for this API that "Due to the algorithm used to generate file names, GetTempFileName Windows API can perform poorly when creating a large number of files with the same prefix." The reasons why it's doing poorly include it takes a long time to find an unique name when a lot of names are already in use. This has been proved by the fact we found this API call takes minutes for some users who have lots of temp files in their default download directory. Note that the temp files in the default download directory can be generated by any software other than Chrome. This CL supplies a random 3-character prefix to GetTempFileName API. Now the chance of successfully generating an unique file name is way more higher than before when no prefix is supplied. BUG=103737 ==========
chengx@chromium.org changed reviewers: + gab@chromium.org
PTAL~
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by chengx@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Patchset #1 (id:20001) has been deleted
gab@chromium.org changed reviewers: + grt@chromium.org
Meta-comment: it's kind of weird to seed a random-filename-generator with a random prefix and expect it would do much better (i.e. why can't *it* do this if it's so trivial..?). The comment on MSDN about this suggests using GUID-based filenames if this is a problem, not randomizing our prefix. How many files do *very large* icon dirs gather? 36^3 = 46656 prefixes in our random set. Maybe that's enough but it's also fairly simple to use base::GenerateGUID() instead of ::GetTempFileName() altogether? +grt for his thoughts https://codereview.chromium.org/2788483005/diff/40001/base/files/file_util_wi... File base/files/file_util_win.cc (right): https://codereview.chromium.org/2788483005/diff/40001/base/files/file_util_wi... base/files/file_util_win.cc:45: const char kAlphaNumberCharSet[] = constexpr https://codereview.chromium.org/2788483005/diff/40001/base/files/file_util_wi... base/files/file_util_win.cc:45: const char kAlphaNumberCharSet[] = I think you'll need to use FILE_PATH_LITERAL to get the char type that matches FilePath::StringType on each platform. https://codereview.chromium.org/2788483005/diff/40001/base/files/file_util_wi... base/files/file_util_win.cc:46: "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"; Filenames on Windows are case insensitive so don't include upper-case characters here. https://codereview.chromium.org/2788483005/diff/40001/base/files/file_util_wi... base/files/file_util_win.cc:54: const int kPrefixLength = 3; constexpr https://codereview.chromium.org/2788483005/diff/40001/base/files/file_util_wi... base/files/file_util_win.cc:62: size_t char_set_length = strlen(kAlphaNumberCharSet); inline arraysize(kAlphaNumberCharSet) instead (it's a constexpr) https://codereview.chromium.org/2788483005/diff/40001/base/files/file_util_wi... base/files/file_util_win.cc:66: temp_file_name_prefix[i] = kAlphaNumberCharSet[rand() % char_set_length]; Use base::RandGenerator(arraysize(kAlphaNumberCharSet)); (don't think you need to seed, would assume Chrome has to seed its generators when it starts already, though I'm not sure how it does it)
The CQ bit was checked by chengx@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
On 2017/04/03 16:40:50, gab wrote: > Meta-comment: > > it's kind of weird to seed a random-filename-generator with a random prefix and > expect it would do much better (i.e. why can't *it* do this if it's so > trivial..?). The comment on MSDN about this suggests using GUID-based filenames > if this is a problem, not randomizing our prefix. > > How many files do *very large* icon dirs gather? 36^3 = 46656 prefixes in our > random set. Maybe that's enough but it's also fairly simple to use > base::GenerateGUID() instead of ::GetTempFileName() altogether? Thanks for the comments. I know MSDN suggests to use GUID, which has a very very high probability in generating an unique name. However, I think the bottleneck here is that the program needs to make sure the name is unique before creating the temp file with the newly generated name. To make sure the name is unique, the program needs to read all the filenames in the target directory (e.g, the Downloads directory), which can be very slow when there are already a lot of temp files. As these temp files can be generated by any program other than Chrome, it should be the users' responsibility to clean them although they may not. For GetTempFileName API, without prefix it can generate at most 16^4=65536 unique file names. When there are already thousands of temp files in the target directory, the chance of failing to generate an unique filename increases. Adding the random 3-digit prefix increases the maximum unique filenames allowed to 36*36*36*65536, which is a lot bigger. In addition, GetTempFileName API makes sure the generated filename is unique and creates the file if an unique name is found by itself. For GUID based method, we need to do the uniqueness check (using a for loop maybe) and create the file by ourselves. In short, I think both GetTempFileName with random prefix and GUID based method should be able to generate an unique name with one shot even though there are lots of temp files in the folder already. However, the uniqueness check would still take a while in this case, and I don't know how to improve it. https://codereview.chromium.org/2788483005/diff/40001/base/files/file_util_wi... File base/files/file_util_win.cc (right): https://codereview.chromium.org/2788483005/diff/40001/base/files/file_util_wi... base/files/file_util_win.cc:45: const char kAlphaNumberCharSet[] = On 2017/04/03 16:40:49, gab wrote: > I think you'll need to use FILE_PATH_LITERAL to get the char type that matches > FilePath::StringType on each platform. Done. https://codereview.chromium.org/2788483005/diff/40001/base/files/file_util_wi... base/files/file_util_win.cc:45: const char kAlphaNumberCharSet[] = On 2017/04/03 16:40:49, gab wrote: > I think you'll need to use FILE_PATH_LITERAL to get the char type that matches > FilePath::StringType on each platform. Done. https://codereview.chromium.org/2788483005/diff/40001/base/files/file_util_wi... base/files/file_util_win.cc:46: "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"; On 2017/04/03 16:40:49, gab wrote: > Filenames on Windows are case insensitive so don't include upper-case characters > here. Done. https://codereview.chromium.org/2788483005/diff/40001/base/files/file_util_wi... base/files/file_util_win.cc:54: const int kPrefixLength = 3; On 2017/04/03 16:40:50, gab wrote: > constexpr Done. https://codereview.chromium.org/2788483005/diff/40001/base/files/file_util_wi... base/files/file_util_win.cc:62: size_t char_set_length = strlen(kAlphaNumberCharSet); On 2017/04/03 16:40:50, gab wrote: > inline arraysize(kAlphaNumberCharSet) instead (it's a constexpr) Done. https://codereview.chromium.org/2788483005/diff/40001/base/files/file_util_wi... base/files/file_util_win.cc:66: temp_file_name_prefix[i] = kAlphaNumberCharSet[rand() % char_set_length]; On 2017/04/03 16:40:49, gab wrote: > Use base::RandGenerator(arraysize(kAlphaNumberCharSet)); > > (don't think you need to seed, would assume Chrome has to seed its generators > when it starts already, though I'm not sure how it does it) Yes, no need to seed here if using base::RandGenerator.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
a few early comments. i haven't thought about the overall strategy yet. https://codereview.chromium.org/2788483005/diff/60001/base/files/file_util_wi... File base/files/file_util_win.cc (right): https://codereview.chromium.org/2788483005/diff/60001/base/files/file_util_wi... base/files/file_util_win.cc:59: FilePath::StringType temp_file_name_prefix; always prefer constructing to the desired size over constructing then resizing: FilePath::StringType temp_file_name_prefix(kPrefixLength, 0); https://codereview.chromium.org/2788483005/diff/60001/base/files/file_util_wi... base/files/file_util_win.cc:62: for (size_t i = 0; i < kPrefixLength; i++) nit: add braces. alternatively, use std::generate to fill temp_file_name_prefix https://codereview.chromium.org/2788483005/diff/60001/base/files/file_util_wi... base/files/file_util_win.cc:64: arraysize(kAlphaNumberCharSet))]; arraysize - 1 to omit the trailing string terminator https://codereview.chromium.org/2788483005/diff/60001/base/files/file_util_wi... base/files/file_util_win.cc:373: // prefix. Therefore, making the prefix random should boost its performance. re: "should": have you verified that filling a directory with a bajillion FOO* files and then creating a temp file with the prefix FOO is slower than creating a file with the prefix BAR? in other words, is the problem that there are many files with the same prefix as the desired temp file, or is it that there are many files period?
The CQ bit was checked by chengx@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
On 2017/04/03 21:11:16, grt (UTC plus 2) wrote: > a few early comments. i haven't thought about the overall strategy yet. Thanks Greg! Please see my inline replies to your comments. https://codereview.chromium.org/2788483005/diff/60001/base/files/file_util_wi... File base/files/file_util_win.cc (right): https://codereview.chromium.org/2788483005/diff/60001/base/files/file_util_wi... base/files/file_util_win.cc:59: FilePath::StringType temp_file_name_prefix; On 2017/04/03 21:11:15, grt (UTC plus 2) wrote: > always prefer constructing to the desired size over constructing then resizing: > FilePath::StringType temp_file_name_prefix(kPrefixLength, 0); Done. https://codereview.chromium.org/2788483005/diff/60001/base/files/file_util_wi... base/files/file_util_win.cc:62: for (size_t i = 0; i < kPrefixLength; i++) On 2017/04/03 21:11:15, grt (UTC plus 2) wrote: > nit: add braces. alternatively, use std::generate to fill temp_file_name_prefix Done. https://codereview.chromium.org/2788483005/diff/60001/base/files/file_util_wi... base/files/file_util_win.cc:64: arraysize(kAlphaNumberCharSet))]; On 2017/04/03 21:11:15, grt (UTC plus 2) wrote: > arraysize - 1 to omit the trailing string terminator Done. https://codereview.chromium.org/2788483005/diff/60001/base/files/file_util_wi... base/files/file_util_win.cc:373: // prefix. Therefore, making the prefix random should boost its performance. On 2017/04/03 21:11:15, grt (UTC plus 2) wrote: > re: "should": have you verified that filling a directory with a bajillion FOO* > files and then creating a temp file with the prefix FOO is slower than creating > a file with the prefix BAR? in other words, is the problem that there are many > files with the same prefix as the desired temp file, or is it that there are > many files period? I should remove "should" probably. The main issue is that there are simply many files there in the target Downloads folder. In this case, checking the uniqueness of a newly generated name is expensive, because it needs to loop over all the files. There is nothing we can do here. The users need to delete them all. The problem of GetTempFileName is that it can only generate 16^4=65536 files, and this limit is not that hard to hit. When the temp files in the folder accumulates, the chance of failing to generate an unique name increases and the uniqueness check needs to be done for more than once, maybe a few times. This makes things worse. I think this CL can fix this issue, as now the file limit has been increased to 36^3 times larger. Using GUID is another option to fix this issue.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
I think I'm about to repeat Gab: why generate three random chars and then ask the platform to add a few more onto it? Why not generate 11 random chars (8.3 for those folks who just love FAT) and probe in a loop to make a random filename? If the platform's GetTempFileName function isn't good enough, let's go while hog and roll our own. As for your sense that the platform enumerates all files, do you have data for this? I can dream up several bad ways of implementing this function that are all better than that. :-) https://codereview.chromium.org/2788483005/diff/80001/base/files/file_util_wi... File base/files/file_util_win.cc (right): https://codereview.chromium.org/2788483005/diff/80001/base/files/file_util_wi... base/files/file_util_win.cc:45: constexpr base::FilePath::CharType kAlphaNumberCharSet[] = nit: move these constants into the function as static constexpr values. https://codereview.chromium.org/2788483005/diff/80001/base/files/file_util_wi... base/files/file_util_win.cc:60: for (size_t i = 0; i < kPrefixLength; i++) { nit: prefer pre-increment; see style guide.
The CQ bit was checked by chengx@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: win_chromium_x64_rel_ng on master.tryserver.chromium.win (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_x64_...)
The CQ bit was checked by chengx@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Description was changed from ========== Supply random prefix to GetTempFileName We use GetTempFileName Windows API to generate an unique temp file name in some scenarios like downloading files. It is mentioned on MSDN for this API that "Due to the algorithm used to generate file names, GetTempFileName Windows API can perform poorly when creating a large number of files with the same prefix." The reasons why it's doing poorly include it takes a long time to find an unique name when a lot of names are already in use. This has been proved by the fact we found this API call takes minutes for some users who have lots of temp files in their default download directory. Note that the temp files in the default download directory can be generated by any software other than Chrome. This CL supplies a random 3-character prefix to GetTempFileName API. Now the chance of successfully generating an unique file name is way more higher than before when no prefix is supplied. BUG=103737 ========== to ========== Use GUID to generate temp file names and retire GetTempFileName We use GetTempFileName Windows API to generate an unique temp file name in some scenarios like downloading files. It is mentioned on MSDN for this API that "Due to the algorithm used to generate file names, GetTempFileName Windows API can perform poorly when creating a large number of files with the same prefix. In such cases, it is recommended that you construct unique file names based on GUIDs." The reasons why it's doing poorly include it takes a long time to find an unique name when a lot of names are already in use. This has been proved by the fact we found this API call takes minutes for some users who have lots of temp files in their default download directory. Note that the temp files in the default download directory can be generated by any software other than Chrome. This CL retires GetTempFileName and uses GUID to generate unique temp file names. With GUID, it is almost guaranteed that an unique temp file name can be generated with a single attempt. BUG=103737 ==========
Description was changed from ========== Use GUID to generate temp file names and retire GetTempFileName We use GetTempFileName Windows API to generate an unique temp file name in some scenarios like downloading files. It is mentioned on MSDN for this API that "Due to the algorithm used to generate file names, GetTempFileName Windows API can perform poorly when creating a large number of files with the same prefix. In such cases, it is recommended that you construct unique file names based on GUIDs." The reasons why it's doing poorly include it takes a long time to find an unique name when a lot of names are already in use. This has been proved by the fact we found this API call takes minutes for some users who have lots of temp files in their default download directory. Note that the temp files in the default download directory can be generated by any software other than Chrome. This CL retires GetTempFileName and uses GUID to generate unique temp file names. With GUID, it is almost guaranteed that an unique temp file name can be generated with a single attempt. BUG=103737 ========== to ========== Use GUID to generate unique temp file names and retire GetTempFileName We use GetTempFileName Windows API to generate an unique temp file name in some scenarios like downloading files. It is mentioned on MSDN for this API that "Due to the algorithm used to generate file names, GetTempFileName Windows API can perform poorly when creating a large number of files with the same prefix. In such cases, it is recommended that you construct unique file names based on GUIDs." The reasons why it's doing poorly include it takes a long time to find an unique name when a lot of names are already in use. This has been proved by the fact we found this API call takes minutes for some users who have lots of temp files in their default download directory. Note that the temp files in the default download directory can be generated by any software other than Chrome. This CL retires GetTempFileName and uses GUID to generate unique temp file names. With GUID, it is almost guaranteed that an unique temp file name can be generated with a single attempt. BUG=103737 ==========
Patchset #4 (id:100001) has been deleted
The CQ bit was checked by chengx@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: win_chromium_x64_rel_ng on master.tryserver.chromium.win (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_x64_...)
The CQ bit was checked by chengx@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Patchset #5 (id:140001) has been deleted
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Patchset #4 (id:120001) has been deleted
On 2017/04/04 12:00:43, grt (UTC plus 2) wrote: > I think I'm about to repeat Gab: why generate three random chars and then ask > the platform to add a few more onto it? Why not generate 11 random chars (8.3 > for those folks who just love FAT) and probe in a loop to make a random > filename? If the platform's GetTempFileName function isn't good enough, let's go > while hog and roll our own. I have retired GetTempFileName Windows API and switched to GUID to generate unique file names. GUID is suggested in many places for this purpose, including MSDN if you search for GUID in the webpage below. https://msdn.microsoft.com/en-us/library/windows/desktop/aa364991(v=vs.85).aspx As the probability of generating an unique name using GUID is extremely high, as in http://stackoverflow.com/questions/39771/is-a-guid-unique-100-of-the-time I don't feel like we need a for-loop to make a few attempts in case the first attempt fails. In CreateFile Windows API call in the new patch set, I used CREATE_ALWAYS flag which means that if the file already exists, it will be overwritten. I think this is fine as it is just a .tmp file. > As for your sense that the platform enumerates all files, do you have data for > this? I can dream up several bad ways of implementing this function that are all > better than that. :-) It is my intuition that all file names will be enumerated (at least be read) for the purpose of checking if the newly generated file already exists or not. However, I think it is true that having many files in the folder slows down the file operation. There is a lot of discussion online, e.g., https://superuser.com/questions/623965/can-file-system-performance-decrease-i... Also, users have reported that removing the large amount of files in their Downloads directories fixed the Download hang issue for them. In short, I think GetTempFileName should be replaced with GUID based method anyway. https://codereview.chromium.org/2788483005/diff/80001/base/files/file_util_wi... File base/files/file_util_win.cc (right): https://codereview.chromium.org/2788483005/diff/80001/base/files/file_util_wi... base/files/file_util_win.cc:45: constexpr base::FilePath::CharType kAlphaNumberCharSet[] = On 2017/04/04 12:00:43, grt (UTC plus 2) wrote: > nit: move these constants into the function as static constexpr values. This constant has been removed in the new patch set. https://codereview.chromium.org/2788483005/diff/80001/base/files/file_util_wi... base/files/file_util_win.cc:60: for (size_t i = 0; i < kPrefixLength; i++) { On 2017/04/04 12:00:43, grt (UTC plus 2) wrote: > nit: prefer pre-increment; see style guide. This for-loop has been removed in the new patch set.
On 2017/04/05 05:39:51, chengx wrote: > On 2017/04/04 12:00:43, grt (UTC plus 2) wrote: > > I think I'm about to repeat Gab: why generate three random chars and then ask > > the platform to add a few more onto it? Why not generate 11 random chars (8.3 > > for those folks who just love FAT) and probe in a loop to make a random > > filename? If the platform's GetTempFileName function isn't good enough, let's > go > > while hog and roll our own. > > I have retired GetTempFileName Windows API and switched to GUID to generate > unique file names. GUID is suggested in many places for this purpose, including > MSDN if you search for GUID in the webpage below. > https://msdn.microsoft.com/en-us/library/windows/desktop/aa364991(v=vs.85).aspx > > As the probability of generating an unique name using GUID is extremely high, as > in http://stackoverflow.com/questions/39771/is-a-guid-unique-100-of-the-time > I don't feel like we need a for-loop to make a few attempts in case the first > attempt fails. In CreateFile Windows API call in the new patch set, I used > CREATE_ALWAYS flag which means that if the file already exists, it will be > overwritten. I think this is fine as it is just a .tmp file. I think this is the wrong choice -- it's not up to Chrome to stomp over files in TMP. > > As for your sense that the platform enumerates all files, do you have data for > > this? I can dream up several bad ways of implementing this function that are > all > > better than that. :-) > > It is my intuition that all file names will be enumerated (at least be read) for > the purpose of checking if the newly generated file already exists or not. If I understand your thinking, I would guess that this isn't how it works. At least, I hope it isn't. Enumerating all entries in the directory and then picking a "random" filename has needless memory overhead and introduces a race that is easily avoided -- another thread in the same or other proc could create the chosen file between the "enumerate and pick" and "create file" steps. I suspect that the implementation actually loops like this: - generate a random filename - try to create the file - break out of loop if created We could make some guesses about the cost of the middle step there -- log(N) if we imagine that the OS uses a balanced binary tree for a directory's entries, or O(1) if we imagine that it uses something that smells more like a hash table. This operation will be expensive on large directories since there could be many iterations if there are a lot of existing files that could collide with the randomly-named file, and it's possible that the complexity of creating a new file is somehow related to the number of entries in the directory. ... > In short, I think GetTempFileName should be replaced with GUID based method > anyway. I agree.
https://codereview.chromium.org/2788483005/diff/160001/base/files/file_util_w... File base/files/file_util_win.cc (right): https://codereview.chromium.org/2788483005/diff/160001/base/files/file_util_w... base/files/file_util_win.cc:344: FilePath::StringType temp_name = UTF8ToUTF16(base::GenerateGUID() + ".tmp"); ASCIIToUTF16 since you know the GUID is ASCII https://codereview.chromium.org/2788483005/diff/160001/base/files/file_util_w... base/files/file_util_win.cc:346: HANDLE file_handle = either use base::win::ScopedHandle or base::File. the latter is preferred if the flags you need are supported. https://codereview.chromium.org/2788483005/diff/160001/base/files/file_util_w... base/files/file_util_win.cc:348: FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, CREATE_ALWAYS, regarding CREATE_ALWAYS: we have no guarantee that the existing GUID-named file was created by Chrome or for what purpose it was created. blindly clobbering someone else's file (even if it is in tmp) doesn't seem nice to me. i think CREATE_NEW is the right thing here. https://codereview.chromium.org/2788483005/diff/160001/base/files/file_util_w... base/files/file_util_win.cc:348: FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, CREATE_ALWAYS, nit: NULL -> nullptr https://codereview.chromium.org/2788483005/diff/160001/base/files/file_util_w... base/files/file_util_win.cc:359: GetLongPathName(temp_path_name.value().c_str(), full_path_name, MAX_PATH); could you dig through the blamelist to figure out why this is here? i don't get it. it's a waste of IO as far as i can tell, and can fail for various reasons. my first guess was "maybe this is some trick to get a full path", but it isn't. i don't see why this function doesn't simply return temp_path_name in call cases.
The CQ bit was checked by chengx@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
On 2017/04/05 11:26:21, grt (UTC plus 2) wrote: > On 2017/04/05 05:39:51, chengx wrote: > > On 2017/04/04 12:00:43, grt (UTC plus 2) wrote: > > > I think I'm about to repeat Gab: why generate three random chars and then > ask > > > the platform to add a few more onto it? Why not generate 11 random chars > (8.3 > > > for those folks who just love FAT) and probe in a loop to make a random > > > filename? If the platform's GetTempFileName function isn't good enough, > let's > > go > > > while hog and roll our own. > > > > I have retired GetTempFileName Windows API and switched to GUID to generate > > unique file names. GUID is suggested in many places for this purpose, > including > > MSDN if you search for GUID in the webpage below. > > > https://msdn.microsoft.com/en-us/library/windows/desktop/aa364991(v=vs.85).aspx > > > > As the probability of generating an unique name using GUID is extremely high, > as > > in http://stackoverflow.com/questions/39771/is-a-guid-unique-100-of-the-time > > I don't feel like we need a for-loop to make a few attempts in case the first > > attempt fails. In CreateFile Windows API call in the new patch set, I used > > CREATE_ALWAYS flag which means that if the file already exists, it will be > > overwritten. I think this is fine as it is just a .tmp file. > > I think this is the wrong choice -- it's not up to Chrome to stomp over files in > TMP. I have changed CREATE_ALWAYS to CREATE_NEW, so that old files won't be overwritten. > > > As for your sense that the platform enumerates all files, do you have data > for > > > this? I can dream up several bad ways of implementing this function that are > > all > > > better than that. :-) > > > > It is my intuition that all file names will be enumerated (at least be read) > for > > the purpose of checking if the newly generated file already exists or not. > > If I understand your thinking, I would guess that this isn't how it works. At > least, I hope it isn't. Enumerating all entries in the directory and then > picking a "random" filename has needless memory overhead and introduces a race > that is easily avoided -- another thread in the same or other proc could create > the chosen file between the "enumerate and pick" and "create file" steps. I > suspect that the implementation actually loops like this: > > - generate a random filename > - try to create the file > - break out of loop if created > > We could make some guesses about the cost of the middle step there -- log(N) if > we imagine that the OS uses a balanced binary tree for a directory's entries, or > O(1) if we imagine that it uses something that smells more like a hash table. > This operation will be expensive on large directories since there could be many > iterations if there are a lot of existing files that could collide with the > randomly-named file, and it's possible that the complexity of creating a new > file is somehow related to the number of entries in the directory. Thanks for the explanation! If this is the case, then I think using GUID based method would help a lot here since it can create an unique name with "just" one attempt even there are already thousands of temp files in the directory. For GetTempFileName, I think there could be many iterations when there are many temp files already, as the maximum files allowed is 65536.
PTAL~ https://codereview.chromium.org/2788483005/diff/160001/base/files/file_util_w... File base/files/file_util_win.cc (right): https://codereview.chromium.org/2788483005/diff/160001/base/files/file_util_w... base/files/file_util_win.cc:344: FilePath::StringType temp_name = UTF8ToUTF16(base::GenerateGUID() + ".tmp"); On 2017/04/05 11:26:34, grt (UTC plus 2) wrote: > ASCIIToUTF16 since you know the GUID is ASCII Done. https://codereview.chromium.org/2788483005/diff/160001/base/files/file_util_w... base/files/file_util_win.cc:346: HANDLE file_handle = On 2017/04/05 11:26:34, grt (UTC plus 2) wrote: > either use base::win::ScopedHandle or base::File. the latter is preferred if the > flags you need are supported. Done. Changed to base::File. https://codereview.chromium.org/2788483005/diff/160001/base/files/file_util_w... base/files/file_util_win.cc:348: FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, CREATE_ALWAYS, On 2017/04/05 11:26:34, grt (UTC plus 2) wrote: > regarding CREATE_ALWAYS: we have no guarantee that the existing GUID-named file > was created by Chrome or for what purpose it was created. blindly clobbering > someone else's file (even if it is in tmp) doesn't seem nice to me. i think > CREATE_NEW is the right thing here. I agree. Changed CREATE_ALWAYS to CREATE_NEW. https://codereview.chromium.org/2788483005/diff/160001/base/files/file_util_w... base/files/file_util_win.cc:348: FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, CREATE_ALWAYS, On 2017/04/05 11:26:34, grt (UTC plus 2) wrote: > nit: NULL -> nullptr Done. I know nullptr is preferred. However, "NULL" is used for all ohter CreatFile in this file, so I used NULL instead of nullptr for consistency. Maybe I should change all other NULL's to nullptr's in another CL, or simply ignore them? Any suggestion please? https://codereview.chromium.org/2788483005/diff/160001/base/files/file_util_w... base/files/file_util_win.cc:359: GetLongPathName(temp_path_name.value().c_str(), full_path_name, MAX_PATH); On 2017/04/05 11:26:34, grt (UTC plus 2) wrote: > could you dig through the blamelist to figure out why this is here? i don't get > it. it's a waste of IO as far as i can tell, and can fail for various reasons. > my first guess was "maybe this is some trick to get a full path", but it isn't. > i don't see why this function doesn't simply return temp_path_name in call > cases. I had the same doubts here too. So I made some changes here which failed TEST_F(FileUtilTest, CreateTemporaryFileInDirLongPathTest). In more details, the assertion failures were https://cs.chromium.org/chromium/src/base/files/file_util_unittest.cc?q=file_... and https://cs.chromium.org/chromium/src/base/files/file_util_unittest.cc?q=file_... I think the reason why this API call is needed is documented here. https://cs.chromium.org/chromium/src/base/files/file_util_unittest.cc?q=file_... Also, as mentioned in the accepted answer below: http://stackoverflow.com/questions/31249019/what-is-the-difference-between-ge... "GetLongPathName requires disk access, so a relative path will be probably be resolved by using the current working directory, too." In short, I think one reason is that this API call tells if the users have full access to the higher level directories. There may be other reasons I am not aware of yet.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
> We could make some guesses about the cost of the middle step there Is it worth doing some measurements? It shouldn't take long to create a program that creates a directory and then creates 100,000 GUID-named files in that directory. Record the time (with QueryPerformanceCounter) for each Create call and then graph it. There will be some spikes due to noise in the system but the trend should be pretty clear - O(1), O(log(N)) and O(N) look *very* different. The results will be skewed by the fact that such a test keeps the cache warm and because we have SSDs, but it could be interesting and might better inform the discussion.
On 2017/04/05 21:34:25, brucedawson wrote: > > We could make some guesses about the cost of the middle step there > > Is it worth doing some measurements? It shouldn't take long to create a program > that creates a directory and then creates 100,000 GUID-named files in that > directory. Record the time (with QueryPerformanceCounter) for each Create call > and then graph it. There will be some spikes due to noise in the system but the > trend should be pretty clear - O(1), O(log(N)) and O(N) look *very* different. > > The results will be skewed by the fact that such a test keeps the cache warm and > because we have SSDs, but it could be interesting and might better inform the > discussion. I'll do the experiments.
The CQ bit was checked by chengx@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
https://codereview.chromium.org/2788483005/diff/160001/base/files/file_util_w... File base/files/file_util_win.cc (right): https://codereview.chromium.org/2788483005/diff/160001/base/files/file_util_w... base/files/file_util_win.cc:359: GetLongPathName(temp_path_name.value().c_str(), full_path_name, MAX_PATH); On 2017/04/05 18:22:34, chengx wrote: > On 2017/04/05 11:26:34, grt (UTC plus 2) wrote: > > could you dig through the blamelist to figure out why this is here? i don't > get > > it. it's a waste of IO as far as i can tell, and can fail for various reasons. > > my first guess was "maybe this is some trick to get a full path", but it > isn't. > > i don't see why this function doesn't simply return temp_path_name in call > > cases. > > I had the same doubts here too. So I made some changes here which failed > TEST_F(FileUtilTest, CreateTemporaryFileInDirLongPathTest). In more details, the > assertion failures were > https://cs.chromium.org/chromium/src/base/files/file_util_unittest.cc?q=file_... > and > https://cs.chromium.org/chromium/src/base/files/file_util_unittest.cc?q=file_... > > I think the reason why this API call is needed is documented here. > https://cs.chromium.org/chromium/src/base/files/file_util_unittest.cc?q=file_... > > Also, as mentioned in the accepted answer below: > http://stackoverflow.com/questions/31249019/what-is-the-difference-between-ge... > "GetLongPathName requires disk access, so a relative path will be probably be > resolved by using the current working directory, too." > > In short, I think one reason is that this API call tells if the users have full > access to the higher level directories. There may be other reasons I am not > aware of yet. Yes, GetLongPathName will fail if the user doesn't have accesses to some parent dirs. It's good that we have test coverage for this. I'm not sure why CreateTemporaryFileInDir tries to figure out the long path name, though. Why does it matter? If the caller passes in a valid path to a directory and this fn is able to create a temp file in it, isn't its job done? Why does it also try to get the long path name of the given directory? I can't think of a reason why one would expect that CreateTemporaryFileInDir("C:\SHORT~1\PATH~1") would return "C:\Short but really long\Path where you want a file\foo.tmp". Can you? I think changing this is out of scope for the current CL. In your experimentation, I think it's worth checking to see if there's a performance impact of GetLongPathName when the dir has many files. Perhaps that will be motivation to remove this in a separate change. https://codereview.chromium.org/2788483005/diff/200001/base/files/file_util_w... File base/files/file_util_win.cc (right): https://codereview.chromium.org/2788483005/diff/200001/base/files/file_util_w... base/files/file_util_win.cc:344: FilePath::StringType temp_file_name = please leave a comment explaining why GetTempFileName isn't used -- we wouldn't want someone to come along a year from now and switch back without realizing why we're switching away. :-) https://codereview.chromium.org/2788483005/diff/200001/base/files/file_util_w... base/files/file_util_win.cc:347: File file(::CreateFile(temp_name.value().c_str(), can you not use the (const FilePath& path, uint32_t flags) ctor of base::File?
The CQ bit was checked by chengx@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
On 2017/04/06 10:35:02, grt (no reviews Apr 7-17) wrote: > https://codereview.chromium.org/2788483005/diff/160001/base/files/file_util_w... > File base/files/file_util_win.cc (right): > > https://codereview.chromium.org/2788483005/diff/160001/base/files/file_util_w... > base/files/file_util_win.cc:359: GetLongPathName(temp_path_name.value().c_str(), > full_path_name, MAX_PATH); > On 2017/04/05 18:22:34, chengx wrote: > > On 2017/04/05 11:26:34, grt (UTC plus 2) wrote: > > > could you dig through the blamelist to figure out why this is here? i don't > > get > > > it. it's a waste of IO as far as i can tell, and can fail for various > reasons. > > > my first guess was "maybe this is some trick to get a full path", but it > > isn't. > > > i don't see why this function doesn't simply return temp_path_name in call > > > cases. > > > > I had the same doubts here too. So I made some changes here which failed > > TEST_F(FileUtilTest, CreateTemporaryFileInDirLongPathTest). In more details, > the > > assertion failures were > > > https://cs.chromium.org/chromium/src/base/files/file_util_unittest.cc?q=file_... > > and > > > https://cs.chromium.org/chromium/src/base/files/file_util_unittest.cc?q=file_... > > > > I think the reason why this API call is needed is documented here. > > > https://cs.chromium.org/chromium/src/base/files/file_util_unittest.cc?q=file_... > > > > Also, as mentioned in the accepted answer below: > > > http://stackoverflow.com/questions/31249019/what-is-the-difference-between-ge... > > "GetLongPathName requires disk access, so a relative path will be probably be > > resolved by using the current working directory, too." > > > > In short, I think one reason is that this API call tells if the users have > full > > access to the higher level directories. There may be other reasons I am not > > aware of yet. > > Yes, GetLongPathName will fail if the user doesn't have accesses to some parent > dirs. It's good that we have test coverage for this. I'm not sure why > CreateTemporaryFileInDir tries to figure out the long path name, though. Why > does it matter? If the caller passes in a valid path to a directory and this fn > is able to create a temp file in it, isn't its job done? Why does it also try to > get the long path name of the given directory? I can't think of a reason why one > would expect that CreateTemporaryFileInDir("C:\SHORT~1\PATH~1") would return > "C:\Short but really long\Path where you want a file\foo.tmp". Can you? > I think changing this is out of scope for the current CL. In your > experimentation, I think it's worth checking to see if there's a performance > impact of GetLongPathName when the dir has many files. Perhaps that will be > motivation to remove this in a separate change. Sure, I will investigate more. Honestly, I am not sure why it's needed even though it's covered by unittests. https://codereview.chromium.org/2788483005/diff/200001/base/files/file_util_w... File base/files/file_util_win.cc (right): https://codereview.chromium.org/2788483005/diff/200001/base/files/file_util_w... base/files/file_util_win.cc:344: FilePath::StringType temp_file_name = On 2017/04/06 10:35:02, grt (no reviews Apr 7-17) wrote: > please leave a comment explaining why GetTempFileName isn't used -- we wouldn't > want someone to come along a year from now and switch back without realizing why > we're switching away. :-) Comments added. https://codereview.chromium.org/2788483005/diff/200001/base/files/file_util_w... base/files/file_util_win.cc:347: File file(::CreateFile(temp_name.value().c_str(), On 2017/04/06 10:35:02, grt (no reviews Apr 7-17) wrote: > can you not use the (const FilePath& path, uint32_t flags) ctor of base::File? Done. Changed to the (const FilePath& path, uint32_t flags) ctor.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
> I think I'm about to repeat Gab: why generate three random chars and then ask > the platform to add a few more onto it? Why not generate 11 random chars (8.3 > for those folks who just love FAT) and probe in a loop to make a random > filename? If the platform's GetTempFileName function isn't good enough, let's go > while hog and roll our own. > > As for your sense that the platform enumerates all files, do you have data for > this? I can dream up several bad ways of implementing this function that are all > better than that. :-) Wow, I just found this in the documentation (we did set that param to zero so it did try filename one by one from the current time -- assume super slow if creating lots of files within same timestamp...): uUnique [in] An unsigned integer to be used in creating the temporary file name. For more information, see Remarks. If uUnique is zero, the function attempts to form a unique file name using the current system time. If the file already exists, the number is increased by one and the functions tests if this file already exists. This continues until a unique filename is found; the function creates a file by that name and closes it. Note that the function does not attempt to verify the uniqueness of the file name when uUnique is nonzero. So maybe a random prefix would actually have helped... but at that point might as well roll our own if it's that stupid... (looking at CL now)
lgtm w/ nits, thanks! https://codereview.chromium.org/2788483005/diff/220001/base/files/file_util_w... File base/files/file_util_win.cc (right): https://codereview.chromium.org/2788483005/diff/220001/base/files/file_util_w... base/files/file_util_win.cc:344: // Use GUID instead of GetTempFileName API to generate unique file names. s/GetTempFileName API/::GetTempFileName()/ https://codereview.chromium.org/2788483005/diff/220001/base/files/file_util_w... base/files/file_util_win.cc:346: // API can perform poorly when creating a large number of files with the same Since this is a quote, remove "Windows API" IMO https://codereview.chromium.org/2788483005/diff/220001/base/files/file_util_w... base/files/file_util_win.cc:351: ASCIIToUTF16(base::GenerateGUID() + ".tmp"); Add L".tmp" outside ASCIIToUTF16(base::GenerateGUID()) (1 temporary string instead of 2) https://codereview.chromium.org/2788483005/diff/220001/base/files/file_util_w... base/files/file_util_win.cc:353: FilePath temp_name = dir.Append(temp_file_name); inline |temp_file_name| in here https://codereview.chromium.org/2788483005/diff/220001/base/files/file_util_w... base/files/file_util_win.cc:368: *temp_file = temp_name; std::move(temp_name) https://codereview.chromium.org/2788483005/diff/220001/base/files/file_util_w... base/files/file_util_win.cc:374: *temp_file = FilePath(long_temp_name_str); std::move(long_temp_name_str)
The CQ bit was checked by chengx@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
On 2017/04/06 19:03:40, gab (behind) wrote: > > I think I'm about to repeat Gab: why generate three random chars and then ask > > the platform to add a few more onto it? Why not generate 11 random chars (8.3 > > for those folks who just love FAT) and probe in a loop to make a random > > filename? If the platform's GetTempFileName function isn't good enough, let's > go > > while hog and roll our own. > > > > As for your sense that the platform enumerates all files, do you have data for > > this? I can dream up several bad ways of implementing this function that are > all > > better than that. :-) > > Wow, I just found this in the documentation (we did set that param to zero so it > did try filename one by one from the current time -- assume super slow if > creating lots of files within same timestamp...): > > uUnique [in] > An unsigned integer to be used in creating the temporary file name. For more > information, see Remarks. > If uUnique is zero, the function attempts to form a unique file name using the > current system time. If the file already exists, the number is increased by one > and the functions tests if this file already exists. This continues until a > unique filename is found; the function creates a file by that name and closes > it. Note that the function does not attempt to verify the uniqueness of the file > name when uUnique is nonzero. > > So maybe a random prefix would actually have helped... but at that point might > as well roll our own if it's that stupid... > > (looking at CL now) Exactly. A random prefix can relieve the uniqueness issue by 36*36*36 times. Anyway, now GUID is used to generate unique file names, and the uniqueness issue should be "ALMOST" gone if we cannot say 100% gone.
https://codereview.chromium.org/2788483005/diff/240001/base/files/file_util_w... File base/files/file_util_win.cc (right): https://codereview.chromium.org/2788483005/diff/240001/base/files/file_util_w... base/files/file_util_win.cc:350: FilePath temp_name = dir.Append(ASCIIToUTF16(base::GenerateGUID()) + L".tmp"); Actually, we should probably loop this while it exists, right?
On 2017/04/06 20:19:12, gab (behind) wrote: > https://codereview.chromium.org/2788483005/diff/240001/base/files/file_util_w... > File base/files/file_util_win.cc (right): > > https://codereview.chromium.org/2788483005/diff/240001/base/files/file_util_w... > base/files/file_util_win.cc:350: FilePath temp_name = > dir.Append(ASCIIToUTF16(base::GenerateGUID()) + L".tmp"); > Actually, we should probably loop this while it exists, right? Theoretically, probably we should. However, I don't see the need because the probability of generating an unique file name using GUID is "very^very" high, so it's nearly impossible to get a duplicate. Please see the discussion in http://stackoverflow.com/questions/39771/is-a-guid-unique-100-of-the-time
On 2017/04/06 20:27:49, chengx wrote: > On 2017/04/06 20:19:12, gab (behind) wrote: > > > https://codereview.chromium.org/2788483005/diff/240001/base/files/file_util_w... > > File base/files/file_util_win.cc (right): > > > > > https://codereview.chromium.org/2788483005/diff/240001/base/files/file_util_w... > > base/files/file_util_win.cc:350: FilePath temp_name = > > dir.Append(ASCIIToUTF16(base::GenerateGUID()) + L".tmp"); > > Actually, we should probably loop this while it exists, right? > > Theoretically, probably we should. However, I don't see the need because the > probability of generating an unique file name using GUID is "very^very" high, so > it's nearly impossible to get a duplicate. Please see the discussion in > http://stackoverflow.com/questions/39771/is-a-guid-unique-100-of-the-time Sure, but why not basically. This method guarantees a new file, why risk the odd case... it's not like the extra file stat from PathExists() matters for performance as we already create the file and do GetLongPathName() on it.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by chengx@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
On 2017/04/06 20:35:58, gab (behind) wrote: > On 2017/04/06 20:27:49, chengx wrote: > > On 2017/04/06 20:19:12, gab (behind) wrote: > > > > > > https://codereview.chromium.org/2788483005/diff/240001/base/files/file_util_w... > > > File base/files/file_util_win.cc (right): > > > > > > > > > https://codereview.chromium.org/2788483005/diff/240001/base/files/file_util_w... > > > base/files/file_util_win.cc:350: FilePath temp_name = > > > dir.Append(ASCIIToUTF16(base::GenerateGUID()) + L".tmp"); > > > Actually, we should probably loop this while it exists, right? > > > > Theoretically, probably we should. However, I don't see the need because the > > probability of generating an unique file name using GUID is "very^very" high, > so > > it's nearly impossible to get a duplicate. Please see the discussion in > > http://stackoverflow.com/questions/39771/is-a-guid-unique-100-of-the-time > > Sure, but why not basically. This method guarantees a new file, why risk the odd > case... it's not like the extra file stat from PathExists() matters for > performance as we already create the file and do GetLongPathName() on it. Alright. I added a for-loop with maximum 100 attempts in the new patch set. I don't want to add an infinite loop and use some other conditions to break the loop though. I think this should be more than enough to guarantee uniqueness.
Hi gab@, all comments addressed. Thanks! https://codereview.chromium.org/2788483005/diff/220001/base/files/file_util_w... File base/files/file_util_win.cc (right): https://codereview.chromium.org/2788483005/diff/220001/base/files/file_util_w... base/files/file_util_win.cc:344: // Use GUID instead of GetTempFileName API to generate unique file names. On 2017/04/06 19:09:37, gab (behind) wrote: > s/GetTempFileName API/::GetTempFileName()/ Done. https://codereview.chromium.org/2788483005/diff/220001/base/files/file_util_w... base/files/file_util_win.cc:346: // API can perform poorly when creating a large number of files with the same On 2017/04/06 19:09:37, gab (behind) wrote: > Since this is a quote, remove "Windows API" IMO Done. https://codereview.chromium.org/2788483005/diff/220001/base/files/file_util_w... base/files/file_util_win.cc:351: ASCIIToUTF16(base::GenerateGUID() + ".tmp"); On 2017/04/06 19:09:37, gab (behind) wrote: > Add L".tmp" outside ASCIIToUTF16(base::GenerateGUID()) (1 temporary string > instead of 2) Done. https://codereview.chromium.org/2788483005/diff/220001/base/files/file_util_w... base/files/file_util_win.cc:353: FilePath temp_name = dir.Append(temp_file_name); On 2017/04/06 19:09:37, gab (behind) wrote: > inline |temp_file_name| in here Done. https://codereview.chromium.org/2788483005/diff/220001/base/files/file_util_w... base/files/file_util_win.cc:368: *temp_file = temp_name; On 2017/04/06 19:09:36, gab (behind) wrote: > std::move(temp_name) Done. https://codereview.chromium.org/2788483005/diff/220001/base/files/file_util_w... base/files/file_util_win.cc:374: *temp_file = FilePath(long_temp_name_str); On 2017/04/06 19:09:37, gab (behind) wrote: > std::move(long_temp_name_str) Done. https://codereview.chromium.org/2788483005/diff/240001/base/files/file_util_w... File base/files/file_util_win.cc (right): https://codereview.chromium.org/2788483005/diff/240001/base/files/file_util_w... base/files/file_util_win.cc:350: FilePath temp_name = dir.Append(ASCIIToUTF16(base::GenerateGUID()) + L".tmp"); On 2017/04/06 20:19:12, gab (behind) wrote: > Actually, we should probably loop this while it exists, right? Acknowledged.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by chengx@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from gab@chromium.org Link to the patchset: https://codereview.chromium.org/2788483005/#ps260001 (title: "Add a loop to make sure an unique name is generated.")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
CQ is committing da patch. Bot data: {"patchset_id": 260001, "attempt_start_ts": 1491516981829820, "parent_rev": "3d8e30ac4d15e9f85e57423bc38eaafd762dd821", "commit_rev": "170f1cf7d23ce3e3fb99085042bf68b8772825a1"}
Message was sent while issue was closed.
Description was changed from ========== Use GUID to generate unique temp file names and retire GetTempFileName We use GetTempFileName Windows API to generate an unique temp file name in some scenarios like downloading files. It is mentioned on MSDN for this API that "Due to the algorithm used to generate file names, GetTempFileName Windows API can perform poorly when creating a large number of files with the same prefix. In such cases, it is recommended that you construct unique file names based on GUIDs." The reasons why it's doing poorly include it takes a long time to find an unique name when a lot of names are already in use. This has been proved by the fact we found this API call takes minutes for some users who have lots of temp files in their default download directory. Note that the temp files in the default download directory can be generated by any software other than Chrome. This CL retires GetTempFileName and uses GUID to generate unique temp file names. With GUID, it is almost guaranteed that an unique temp file name can be generated with a single attempt. BUG=103737 ========== to ========== Use GUID to generate unique temp file names and retire GetTempFileName We use GetTempFileName Windows API to generate an unique temp file name in some scenarios like downloading files. It is mentioned on MSDN for this API that "Due to the algorithm used to generate file names, GetTempFileName Windows API can perform poorly when creating a large number of files with the same prefix. In such cases, it is recommended that you construct unique file names based on GUIDs." The reasons why it's doing poorly include it takes a long time to find an unique name when a lot of names are already in use. This has been proved by the fact we found this API call takes minutes for some users who have lots of temp files in their default download directory. Note that the temp files in the default download directory can be generated by any software other than Chrome. This CL retires GetTempFileName and uses GUID to generate unique temp file names. With GUID, it is almost guaranteed that an unique temp file name can be generated with a single attempt. BUG=103737 Review-Url: https://codereview.chromium.org/2788483005 Cr-Commit-Position: refs/heads/master@{#462662} Committed: https://chromium.googlesource.com/chromium/src/+/170f1cf7d23ce3e3fb99085042bf... ==========
Message was sent while issue was closed.
Committed patchset #9 (id:260001) as https://chromium.googlesource.com/chromium/src/+/170f1cf7d23ce3e3fb99085042bf...
Message was sent while issue was closed.
Thanks, post-commit nit https://codereview.chromium.org/2788483005/diff/260001/base/files/file_util_w... File base/files/file_util_win.cc (right): https://codereview.chromium.org/2788483005/diff/260001/base/files/file_util_w... base/files/file_util_win.cc:367: // Exist early if we can't create an unique name. Remove this comment, it merely states what the self-documenting code below does.
Message was sent while issue was closed.
https://codereview.chromium.org/2788483005/diff/260001/base/files/file_util_w... File base/files/file_util_win.cc (right): https://codereview.chromium.org/2788483005/diff/260001/base/files/file_util_w... base/files/file_util_win.cc:367: // Exist early if we can't create an unique name. On 2017/04/07 17:27:15, gab wrote: > Remove this comment, it merely states what the self-documenting code below does. Sure, will remove it in another CL. |