Chromium Code Reviews

Side by Side Diff: third_party/gsutil/gslib/addlhelp/wildcards.py

Issue 2280023003: depot_tools: Remove third_party/gsutil (Closed)
Patch Set: Created 4 years, 3 months ago
# Copyright 2012 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from gslib.help_provider import HELP_NAME
from gslib.help_provider import HELP_NAME_ALIASES
from gslib.help_provider import HELP_ONE_LINE_SUMMARY
from gslib.help_provider import HelpProvider
from gslib.help_provider import HELP_TEXT
from gslib.help_provider import HelpType
from gslib.help_provider import HELP_TYPE

_detailed_help_text = ("""
<B>DESCRIPTION</B>
  gsutil supports URI wildcards. For example, the command:

    gsutil cp gs://bucket/data/abc* .

  will copy all objects that start with gs://bucket/data/abc followed by any
  number of characters within that subdirectory.


<B>DIRECTORY BY DIRECTORY VS RECURSIVE WILDCARDS</B>
  The "*" wildcard only matches up to the end of a path within
  a subdirectory. For example, if bucket contains objects
  named gs://bucket/data/abcd, gs://bucket/data/abcdef,
  and gs://bucket/data/abcxyx, as well as an object in a sub-directory
  (gs://bucket/data/abc/def), the above gsutil cp command would match the
  first 3 object names but not the last one.

  If you want matches to span directory boundaries, use a '**' wildcard:

    gsutil cp gs://bucket/data/abc** .

  will match all four objects above.

  Note that gsutil supports the same wildcards for both objects and file names.
  Thus, for example:

    gsutil cp data/abc* gs://bucket

  will match all local file names that start with data/abc. Most command
  shells also support wildcarding, so if you run the above command, your shell
  is probably expanding the matches before running gsutil. However, most
  shells do not support recursive wildcards ('**'). To have gsutil perform the
  wildcard expansion in such shells, single-quote the arguments so that the
  shell does not interpret them before passing them to gsutil:

    gsutil cp 'data/abc**' gs://bucket


<B>BUCKET WILDCARDS</B>
  You can specify wildcards for bucket names. For example:

    gsutil ls gs://data*.example.com

  will list the contents of all buckets whose name starts with "data" and
  ends with ".example.com".

  You can also combine bucket and object name wildcards. For example, this
  command will remove all ".txt" files in any of your Google Cloud Storage
  buckets:

    gsutil rm gs://*/**.txt


<B>OTHER WILDCARD CHARACTERS</B>
  In addition to '*', you can use these wildcards:

    ?             Matches a single character. For example,
                  "gs://bucket/??.txt" only matches objects whose names
                  are two characters followed by .txt.

    [chars]       Matches any one of the specified characters. For example,
                  "gs://bucket/[aeiou].txt" matches objects whose names
                  are a single vowel followed by .txt.

    [char range]  Matches any one character in the given range. For example,
                  "gs://bucket/[a-m].txt" matches objects whose names are
                  a single letter from a through m followed by .txt.

  You can combine wildcards to provide more powerful matches, for example:
    gs://bucket/[a-m]??.j*g


<B>EFFICIENCY CONSIDERATION: USING WILDCARDS OVER MANY OBJECTS</B>
  It is more efficient, faster, and less network traffic-intensive
  to use wildcards that have a non-wildcard object-name prefix, like:

    gs://bucket/abc*.txt

  than it is to use wildcards as the first part of the object name, like:

    gs://bucket/*abc.txt

  This is because the request for "gs://bucket/abc*.txt" asks the server
  to send back the subset of results whose object names start with "abc",
  and then gsutil filters the result list for objects whose name ends with
  ".txt". In contrast, "gs://bucket/*abc.txt" asks the server for the complete
  list of objects in the bucket and then filters for those objects whose name
  ends with "abc.txt". This efficiency consideration becomes increasingly
  noticeable when you use buckets containing thousands of objects or more. It
  is sometimes possible to name your objects to fit the wildcard patterns you
  expect to use, to take advantage of the efficiency of server-side prefix
  requests. See "gsutil help prod" for a concrete use case.


<B>EFFICIENCY CONSIDERATION: USING MID-PATH WILDCARDS</B>
  Suppose you have a bucket with these objects:
    gs://bucket/obj1
    gs://bucket/obj2
    gs://bucket/obj3
    gs://bucket/obj4
    gs://bucket/dir1/obj5
    gs://bucket/dir2/obj6

  If you run the command:
    gsutil ls gs://bucket/*/obj5
  gsutil will perform a /-delimited top-level bucket listing and then one bucket
  listing for each subdirectory, for a total of 3 bucket listings:
    GET /bucket/?delimiter=/
    GET /bucket/?prefix=dir1/obj5&delimiter=/
    GET /bucket/?prefix=dir2/obj5&delimiter=/

  The more bucket listings your wildcard requires, the slower and more expensive
  it will be. The number of bucket listings required grows as:
    - the number of wildcard components (e.g., "gs://bucket/a??b/c*/*/d"
      has 3 wildcard components);
    - the number of subdirectories that match each component; and
    - the number of results (pagination is implemented using one GET
      request per 1000 results, specifying markers for each).

  If you want to use a mid-path wildcard, you might try instead using a
  recursive wildcard, for example:

    gsutil ls gs://bucket/**/obj5

  This will match more objects than gs://bucket/*/obj5 (since it spans
  directories), but it is implemented using a delimiter-less bucket listing
  request. That means fewer bucket requests, although gsutil will list the
  entire bucket and filter locally, which could require a non-trivial amount
  of network traffic.
""")


class CommandOptions(HelpProvider):
  """Additional help about wildcards."""

  help_spec = {
    # Name of command or auxiliary help info for which this help applies.
    HELP_NAME: 'wildcards',
    # List of help name aliases.
    HELP_NAME_ALIASES: ['wildcard', '*', '**'],
    # Type of help:
    HELP_TYPE: HelpType.ADDITIONAL_HELP,
    # One line summary of this help.
    HELP_ONE_LINE_SUMMARY: 'Wildcard support',
    # The full help text.
    HELP_TEXT: _detailed_help_text,
  }
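
The matching rules described in the help text above can be sketched in a few lines of Python. This is an illustration only, not gsutil's actual implementation: the names wildcard_to_regex and listing_prefix are hypothetical, the sketch assumes that '?' (like '*') does not match '/', and it passes [chars] and [char range] through as regular-expression character classes without handling negation or escaping inside the brackets.

```python
import re

WILDCARD_CHARS = '*?['


def wildcard_to_regex(pattern):
  """Translate a gsutil-style wildcard into a compiled regex (sketch only)."""
  out = []
  i = 0
  while i < len(pattern):
    ch = pattern[i]
    if pattern.startswith('**', i):
      out.append('.*')       # '**' spans directory boundaries.
      i += 2
    elif ch == '*':
      out.append('[^/]*')    # '*' stops at the next '/'.
      i += 1
    elif ch == '?':
      out.append('[^/]')     # Assumption: '?' does not match '/'.
      i += 1
    elif ch == '[':
      end = pattern.find(']', i + 1)
      if end == -1:
        out.append(re.escape(ch))   # Unterminated '[' is treated literally.
        i += 1
      else:
        out.append(pattern[i:end + 1])  # Reuse as a regex character class.
        i = end + 1
    else:
      out.append(re.escape(ch))
      i += 1
  return re.compile(''.join(out) + r'\Z')


def listing_prefix(object_pattern):
  """Return the literal text before the first wildcard character.

  A non-empty result is what a server-side prefix request could use;
  an empty result means the whole bucket would have to be listed.
  """
  for i, ch in enumerate(object_pattern):
    if ch in WILDCARD_CHARS:
      return object_pattern[:i]
  return object_pattern


names = ['data/abcd', 'data/abcdef', 'data/abcxyx', 'data/abc/def']
single = wildcard_to_regex('data/abc*')
recursive = wildcard_to_regex('data/abc**')
print([n for n in names if single.match(n)])
print([n for n in names if recursive.match(n)])
print(repr(listing_prefix('abc*.txt')), repr(listing_prefix('*abc.txt')))
```

With the four object names from the help text, the single-star pattern keeps the first three names and the double-star pattern keeps all four. listing_prefix shows why "abc*.txt" is cheaper than "*abc.txt": only the former leaves a non-empty prefix to send to the server.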