Side by Side Diff: trace/DESIGN.md

Issue 1411663004: Create gRPC client and server, traceservice, that stores trace data in a BoltDB backend. (Closed) Base URL: https://skia.googlesource.com/buildbot@master
Patch Set: fix vet Created 5 years, 2 months ago
tracedb
=======

The tracedb package is designed to replace tiles, the current storage system
for traces, with a new backend that is much more flexible and can store more
data. The new system needs to support both branches and trybots (note that in
the future there may be no difference between the two), while still
supporting the current capability of looking at master.

The current structure for a Tile looks like:

    type GoldenTrace struct {
        Params_ map[string]string
        Values  []string
    }

    type PerfTrace struct {
        Values  []float64         `json:"values"`
        Params_ map[string]string `json:"params"`
    }

    // Commit is information about each Git commit.
    type Commit struct {
        CommitTime int64  `json:"commit_time" bq:"timestamp" db:"ts"`
        Hash       string `json:"hash" bq:"gitHash" db:"githash"`
        Author     string `json:"author" db:"author"`
    }

    // Tile is a config.TILE_SIZE commit slice of data.
    //
    // The length of the Commits array is the same length as all of the Values
    // arrays in all of the Traces.
    type Tile struct {
        Traces   map[string]Trace    `json:"traces"`
        ParamSet map[string][]string `json:"param_set"`
        Commits  []*Commit           `json:"commits"`

        // What is the scale of this Tile, i.e. it contains every Nth point, where
        // N=const.TILE_SCALE^Scale.
        Scale     int `json:"scale"`
        TileIndex int `json:"tileIndex"`
    }

Where `PerfTrace` and `GoldenTrace` implement the `Trace` interface.

Requirements
============

In the following list you may substitute 'branch' for 'trybot'.

1. Build a Tile of the last N commits from master. (Our only usage today.)
2. Build a Tile for a trybot.
3. Build a Tile for a single trybot result vs a specific commit.
4. Build a Tile for all commits to master in a given time range. (Be able to go back in time for either Gold or Perf.)
5. Build a Tile for all commits to all branches in a given time range. (Show how all branches compare against master.)
6. Build a Tile for all commits to master and a given branch for a given time range. (See how a single branch compares to master.)

Assumptions
===========

1. We will use queries to the interface to build in-memory Tiles.
2. We can extract a timestamp from Rietveld for each patch.

Design
======

The design has two layers: tracedb.DB, which is the Go interface for talking
to the data store, and a separate service that implements a gRPC interface
and stores the data in BoltDB.

    +-------------+
    | tracedb.DB  |
    | interface   |
    +-------------+
           |
           |
           |
    +------v------+
    | gRPC Server |
    | BoltDB      |
    +-------------+

tracedb.DB Interface
--------------------

This is the Go interface to the storage for traces. The interface to tracedb looks like:

    // DB represents the interface to any datastore for perf and gold results.
    //
    // Notes:
    //   1. The Commits in the Tile will only contain the commit id and
    //      the timestamp, the Author will not be populated.
    //   2. The Tile's Scale and TileIndex will be set to 0.
    //
    type DB interface {
        // Add new information to the datastore.
        //
        // values maps a trace id to an Entry.
        //
        // Note that only allowing adding data for a single commit at a time
        // should work well with ingestion while still breaking up writes into
        // shorter actions.
        Add(commitID *CommitID, values map[string]*Entry) error

        // Remove the given commit from the datastore.
        Remove(commitID *CommitID) error

        // List returns all the CommitIDs between begin and end.
        List(begin, end time.Time) ([]*CommitID, error)

        // Create a Tile for the given commit ids. Will build the Tile using the
        // commits in the order they are provided.
        //
        // Note that the Commits in the Tile will only contain the commit id and
        // the timestamp, the Author will not be populated.
        TileFromCommits(commitIDs []*CommitID) (*tiling.Tile, error)

        // Close the datastore.
        Close() error
    }

The above interface depends on the CommitID struct, which is:

    // CommitID represents the time of a particular commit, where a commit could either be
    // a real commit into the repo, or an event like running a trybot.
    type CommitID struct {
        Timestamp time.Time
        ID        string // Normally a git hash, but could also be Rietveld patch id.
        Source    string // The branch name, e.g. "master", or the Rietveld issue id.
    }

And Entry, which is:

    // Entry holds the params and a value for a single measurement.
    type Entry struct {
        Params map[string]string

        // Value is the value of the measurement.
        //
        // It should be the digest string converted to a []byte, or a float64
        // converted to a little endian []byte. I.e. tiling.Trace.SetAt
        // should be able to consume this value.
        Value []byte
    }
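
The two encodings described for `Entry.Value` can be sketched as follows. This is a minimal illustration, not the package's actual code: the helper names `perfValue`, `goldValue`, and `decodePerfValue` are hypothetical, and the 8-byte width assumes the float64 is stored as its IEEE 754 bit pattern.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// Entry mirrors the struct from the design above.
type Entry struct {
	Params map[string]string
	Value  []byte
}

// perfValue (hypothetical helper) encodes a float64 measurement as
// 8 little-endian bytes holding its IEEE 754 bit pattern.
func perfValue(f float64) []byte {
	b := make([]byte, 8)
	binary.LittleEndian.PutUint64(b, math.Float64bits(f))
	return b
}

// goldValue (hypothetical helper) encodes a digest string as its raw bytes.
func goldValue(digest string) []byte {
	return []byte(digest)
}

// decodePerfValue reverses perfValue and rejects input of the wrong length.
func decodePerfValue(b []byte) (float64, error) {
	if len(b) != 8 {
		return 0, fmt.Errorf("want 8 bytes, got %d", len(b))
	}
	return math.Float64frombits(binary.LittleEndian.Uint64(b)), nil
}

func main() {
	e := &Entry{
		Params: map[string]string{"config": "8888"},
		Value:  perfValue(1.25),
	}
	f, err := decodePerfValue(e.Value)
	fmt.Println(f, err)
	fmt.Println(string(goldValue("abc123")))
}
```

Keeping `Value` as an opaque `[]byte` lets the same `Add` call carry both Perf and Gold data, at the cost of pushing the type decision to whoever decodes it.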

Note that this will require adding a new method to the Trace interface:

    // Sets the value of the measurement at index.
    //
    // Each specialization will convert []byte to the correct type.
    SetAt(index int, value []byte) error
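
As a rough sketch of how each specialization might implement `SetAt`, the example below uses two stand-in types, `perfTrace` and `goldenTrace`. Both are hypothetical and hold only a `Values` slice. The bounds and length checks are assumptions about reasonable behavior, not something the design specifies.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// perfTrace is a hypothetical stand-in for PerfTrace.
type perfTrace struct {
	Values []float64
}

// SetAt decodes value as a little-endian float64 and stores it at index.
func (t *perfTrace) SetAt(index int, value []byte) error {
	if index < 0 || index >= len(t.Values) {
		return fmt.Errorf("index %d out of range [0, %d)", index, len(t.Values))
	}
	if len(value) != 8 {
		return fmt.Errorf("want 8 bytes, got %d", len(value))
	}
	t.Values[index] = math.Float64frombits(binary.LittleEndian.Uint64(value))
	return nil
}

// goldenTrace is a hypothetical stand-in for GoldenTrace.
type goldenTrace struct {
	Values []string
}

// SetAt stores value at index as a digest string.
func (t *goldenTrace) SetAt(index int, value []byte) error {
	if index < 0 || index >= len(t.Values) {
		return fmt.Errorf("index %d out of range [0, %d)", index, len(t.Values))
	}
	t.Values[index] = string(value)
	return nil
}

func main() {
	raw := make([]byte, 8)
	binary.LittleEndian.PutUint64(raw, math.Float64bits(3.5))

	p := &perfTrace{Values: make([]float64, 2)}
	fmt.Println(p.SetAt(1, raw), p.Values)

	g := &goldenTrace{Values: make([]string, 2)}
	fmt.Println(g.SetAt(0, []byte("abc123")), g.Values)
}
```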


BoltDB Implementation
=====================

For local testing the Go interface above will be implemented in terms of the
gRPC interface defined below, backed by a BoltDB store. I.e. there will be a
standalone server that implements the following gRPC interface.

The gRPC interface is similar to the Go interface, with Add and List operating
exactly the same. The main difference is in retrieving data: TileFromCommits
is broken down into two different calls, GetValues and GetParams, whose
results the caller can use to build a Tile.

    // TraceDB stores trace information for both Gold and Perf.
    service TraceDB {
        // Returns a list of traceids that don't have Params stored in the datastore.
        rpc MissingParams(MissingParamsRequest) returns (MissingParamsResponse) {}

        // Adds Params for a set of traceids.
        rpc AddParams(AddParamsRequest) returns (EmptyResponse) {}

        // Adds data for a set of traces for a particular commitid.
        rpc Add(AddRequest) returns (AddResponse) {}

        // Removes data for a particular commitid.
        rpc Remove(RemoveRequest) returns (EmptyResponse) {}

        // List returns all the CommitIDs that exist in the given time range.
        rpc List(ListRequest) returns (ListResponse) {}

        // GetValues returns all the trace values stored for the given CommitID.
        rpc GetValues(GetValuesRequest) returns (GetValuesResponse) {}

        // GetParams returns the Params for all of the given traces.
        rpc GetParams(GetParamsRequest) returns (GetParamsResponse) {}
    }

See `go/tracedb/proto/tracestore.proto` for more details.
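
To illustrate how a caller might turn per-commit results into traces, here is a small sketch. It assumes each GetValues call yields a map from traceid to raw value for one commit, and uses Gold-style string values with an empty string for missing data. The function name `buildTraces` and the missing-value convention are assumptions for the example, not part of the proto.

```go
package main

import "fmt"

// buildTraces pivots per-commit values (one map per commit, in commit order)
// into per-trace slices that have one slot per commit.
func buildTraces(perCommit []map[string][]byte) map[string][]string {
	traces := map[string][]string{}
	for i, values := range perCommit {
		for traceid, v := range values {
			tr, ok := traces[traceid]
			if !ok {
				// Assumption: "" marks a commit with no value for this trace.
				tr = make([]string, len(perCommit))
				traces[traceid] = tr
			}
			tr[i] = string(v)
		}
	}
	return traces
}

func main() {
	perCommit := []map[string][]byte{
		{"trace-a": []byte("digest1")},
		{"trace-a": []byte("digest2"), "trace-b": []byte("digest3")},
	}
	for id, values := range buildTraces(perCommit) {
		fmt.Printf("%s: %q\n", id, values)
	}
}
```

The params for each resulting trace would then come from a single GetParams call over the collected traceids.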

To actually handle this in BoltDB we will need to create three buckets: one
for the per-commit values in each trace, one for the trace-level information,
such as the params for each trace, and one for mapping traceids to much
shorter 64-bit integer values.

traceid bucket
--------------

To reduce the amount of data stored, we'll map traceids to 64 bit ints
and use the 64 bit ints as the keys to the maps stored in the commit
bucket. The traceid bucket maps traceids to trace64ids, and vice versa.

A special key, "the largest trace64id", which isn't a valid traceid, holds
the largest trace64id seen so far; it defaults to 0 if not set.
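
The allocation scheme can be sketched with plain maps standing in for the bucket. The type `traceIDs` and its methods are hypothetical, and the sketch assumes new ids are handed out by incrementing the stored largest value, so the first id issued is 1.

```go
package main

import "fmt"

// traceIDs is an in-memory stand-in for the traceid bucket.
type traceIDs struct {
	toShort map[string]uint64 // traceid -> trace64id
	toLong  map[uint64]string // trace64id -> traceid
	largest uint64            // largest trace64id issued so far; 0 if none
}

func newTraceIDs() *traceIDs {
	return &traceIDs{
		toShort: map[string]uint64{},
		toLong:  map[uint64]string{},
	}
}

// shortID returns the trace64id for traceid, allocating the next id
// when the traceid has not been seen before.
func (t *traceIDs) shortID(traceid string) uint64 {
	if id, ok := t.toShort[traceid]; ok {
		return id
	}
	t.largest++
	t.toShort[traceid] = t.largest
	t.toLong[t.largest] = traceid
	return t.largest
}

// longID returns the traceid for a trace64id, if one has been issued.
func (t *traceIDs) longID(id uint64) (string, bool) {
	s, ok := t.toLong[id]
	return s, ok
}

func main() {
	ids := newTraceIDs()
	a := ids.shortID("x86:8888:desk_chalkboard.skp")
	b := ids.shortID("x86:565:desk_chalkboard.skp")
	fmt.Println(a, b, ids.shortID("x86:8888:desk_chalkboard.skp"))
	fmt.Println(ids.longID(b))
}
```

In a real store the lookup, increment, and both writes would presumably need to happen inside a single read-write transaction so that two writers cannot be handed the same trace64id.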

commit bucket
-------------

The keys for the commit bucket are structured as:

    [timestamp]![git hash]![branch name]

Each key maps to the serialized values for that commit, keyed by trace64id.
I.e. a serialized map[uint64][]byte, where the uint64 is the trace64id.
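
One way to build and parse such keys is sketched below. The timestamp encoding is an assumption: the design does not say how `[timestamp]` is serialized, so this example uses RFC 3339 in UTC, which is fixed width and sorts in time order when keys are compared as bytes. `commitKey` and `parseCommitKey` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// CommitID mirrors the struct from the design above.
type CommitID struct {
	Timestamp time.Time
	ID        string
	Source    string
}

// commitKey builds "[timestamp]![git hash]![branch name]".
// Assumption: the timestamp is RFC 3339 in UTC with second precision.
func commitKey(c *CommitID) []byte {
	return []byte(c.Timestamp.UTC().Format(time.RFC3339) + "!" + c.ID + "!" + c.Source)
}

// parseCommitKey reverses commitKey. Splitting into at most three parts
// keeps any "!" inside the branch name intact.
func parseCommitKey(key []byte) (*CommitID, error) {
	parts := strings.SplitN(string(key), "!", 3)
	if len(parts) != 3 {
		return nil, fmt.Errorf("malformed commit key %q", key)
	}
	ts, err := time.Parse(time.RFC3339, parts[0])
	if err != nil {
		return nil, err
	}
	return &CommitID{Timestamp: ts, ID: parts[1], Source: parts[2]}, nil
}

func main() {
	c := &CommitID{
		Timestamp: time.Date(2015, 10, 20, 12, 0, 0, 0, time.UTC),
		ID:        "abc123",
		Source:    "master",
	}
	key := commitKey(c)
	fmt.Println(string(key))
	back, err := parseCommitKey(key)
	fmt.Println(back.ID, back.Source, err)
}
```

Putting the timestamp first means a List over a time range can be served by scanning a contiguous range of keys, since BoltDB keeps keys in byte-sorted order within a bucket.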

trace bucket
------------

The keys for the trace bucket are traceids.

    [traceid]

The values are structs, serialized as Protocol Buffers, that contain the
params for each trace and the original traceid.

constructor
-----------

    func NewTraceStoreDB(conn *grpc.ClientConn, tb tiling.TraceBuilder) (DB, error)

Usage
=====

Here is how the single TileFromCommits can be used to satisfy all the above requirements:

1. Build a Tile of the last N commits from master.
   * Find the last N commits via gitinfo, construct CommitIDs for each one, then call:

         TileFromCommits(commits)

2. Build a Tile for a trybot.
   * Find the Rietveld issue id and created time of each patchset. Use the
     patchset ids and created timestamps to create a slice of CommitIDs to use
     in:

         TileFromCommits(commits)

3. Build a Tile for a single trybot result vs a specific commit.
   * Find the Rietveld issue id and created time of the patchset. Find the
     commitid of the target commit:

         TileFromCommits([]*CommitID{trybot, commit})

4. Build a Tile for all commits to master in a given time range. (Be able to go back in time for either Gold or Perf.)
   * Given the time range, build CommitIDs from gitinfo, then call:

         TileFromCommits(commits)

5. Build a Tile for all commits to all branches in a given time range. (Show how all branches compare against master.)
   * Given the time range, call List, then TileFromCommits:

         commits, err := List(beginTimestamp, endTimestamp)
         TileFromCommits(commits)

6. Build a Tile for all commits to master and a given branch for a given time range. (See how a single branch compares to master.)
   * Find the ~Nth commit via gitinfo. Then call List, filter the results, then call TileFromCommits.

         commits, err := List(beginTimestamp, endTimestamp)
         // Filter commits to only include values from the desired branches.
         TileFromCommits(commits)
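
The filtering step in item 6 could look roughly like the following. `filterBySource` is a hypothetical helper, and the sketch assumes the branch is identified by the `Source` field of each CommitID and that the input order should be preserved.

```go
package main

import (
	"fmt"
	"time"
)

// CommitID mirrors the struct from the design above.
type CommitID struct {
	Timestamp time.Time
	ID        string
	Source    string
}

// filterBySource keeps only the commits whose Source is one of sources,
// preserving the order of the input slice.
func filterBySource(commits []*CommitID, sources ...string) []*CommitID {
	want := make(map[string]bool, len(sources))
	for _, s := range sources {
		want[s] = true
	}
	var out []*CommitID
	for _, c := range commits {
		if want[c.Source] {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	t0 := time.Date(2015, 10, 20, 12, 0, 0, 0, time.UTC)
	commits := []*CommitID{
		{Timestamp: t0, ID: "aaa", Source: "master"},
		{Timestamp: t0.Add(time.Hour), ID: "bbb", Source: "m47"},
		{Timestamp: t0.Add(2 * time.Hour), ID: "ccc", Source: "experimental"},
	}
	for _, c := range filterBySource(commits, "master", "m47") {
		fmt.Println(c.ID, c.Source)
	}
}
```

The filtered slice can be passed straight to TileFromCommits, which builds the Tile in the order given.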