Side by Side Diff: go/tracedb/DESIGN.md

Issue 1411663004: Create gRPC client and server, traceservice, that stores trace data in a BoltDB backend. (Closed) Base URL: https://skia.googlesource.com/buildbot@master
Patch Set: fix vet Created 5 years, 2 months ago
1 tracedb
2 =======
3
4 The tracedb package is designed to replace the current storage system for
5 traces, tiles, with a new BoltDB backend that allows for much more flexibility
6 and an increase in the size of data that can be stored. The new system needs
7 to support both branches and trybots (note that in the future there may be no
8 difference between the two), while still supporting the current capabilities
9 of looking at master.
10
11 The current structure for a Tile looks like:
12
13     type GoldenTrace struct {
14         Params_ map[string]string
15         Values  []string
16     }
17
18     type PerfTrace struct {
19         Values  []float64         `json:"values"`
20         Params_ map[string]string `json:"params"`
21     }
22
23     // Commit is information about each Git commit.
24     type Commit struct {
25         CommitTime int64  `json:"commit_time" bq:"timestamp" db:"ts"`
26         Hash       string `json:"hash" bq:"gitHash" db:"githash"`
27         Author     string `json:"author" db:"author"`
28     }
29
30     // Tile is a config.TILE_SIZE commit slice of data.
31     //
32     // The length of the Commits array is the same length as all of the Values
33     // arrays in all of the Traces.
34     type Tile struct {
35         Traces   map[string]Trace    `json:"traces"`
36         ParamSet map[string][]string `json:"param_set"`
37         Commits  []*Commit           `json:"commits"`
38
39         // What is the scale of this Tile, i.e. it contains every Nth point, where
40         // N=const.TILE_SCALE^Scale.
41         Scale     int `json:"scale"`
42         TileIndex int `json:"tileIndex"`
43     }
44
45 Where `PerfTrace` and `GoldenTrace` implement the `Trace` interface.
46
47 Requirements
48 ============
49
50 In the following list you may substitute 'branch' for 'trybot'.
51
52 1. Build a tile of the last N commits from master. (Our only usage today.)
53 2. Build a Tile for a trybot.
54 3. Build a Tile for a single trybot result vs a specific commit.
55 4. Build a Tile for all commits to master in a given time range. (Be able to go back in time for either Gold or Perf.)
56 5. Build a Tile for all commits to all branches in a given time range. (Show how all branches compare against main.)
57 6. Build a Tile for all commits to main and a given branch for a given time range. (See how a single branch compares to main.)
58
59 Assumptions
60 ===========
61
62 1. We will use queries to the BoltDB to build in-memory Tiles.
63 2. We can extract a timestamp from Rietveld for each patch.
64
65 Design
66 ======
67
68 To actually handle this in BoltDB we will need to create two buckets, one
69 for the per-commit values in each trace, and another for the trace-level
70 information, such as the params for each trace.
71
72 commit bucket
73 -------------
74
75 The keys for the commit bucket are structured as:
76
77     [timestamp]:[git hash]:[branch name]:[trace_key]
78
79 and each key maps to a single []byte value that is either the Gold digest or
80 the Perf float64 measurement value.
81
82 Note that to search through a time range for a specific branch name we'll need
83 to do the filtering inside the closure we pass to BoltDB.
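
The key layout and the branch filter described above could be sketched as follows. This is only an illustration under stated assumptions, not the package's implementation: `commitKey`, `parseCommitKey`, and `filterByBranch` are hypothetical names, the timestamp is assumed to be encoded as zero-padded Unix seconds so that keys sort chronologically, branch names are assumed to contain no `:`, and a sorted slice of keys stands in for a BoltDB bucket.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
	"time"
)

// commitKey builds a key in the documented layout:
// [timestamp]:[git hash]:[branch name]:[trace_key].
// Assumption: the timestamp is zero-padded Unix seconds, so that the
// lexicographic order of keys matches time order.
func commitKey(ts time.Time, hash, branch, traceKey string) string {
	return fmt.Sprintf("%012d:%s:%s:%s", ts.Unix(), hash, branch, traceKey)
}

// parseCommitKey splits a key into its four parts. SplitN leaves any ':'
// characters inside the trace key untouched.
func parseCommitKey(key string) (time.Time, string, string, string, error) {
	parts := strings.SplitN(key, ":", 4)
	if len(parts) != 4 {
		return time.Time{}, "", "", "", fmt.Errorf("malformed key %q", key)
	}
	secs, err := strconv.ParseInt(parts[0], 10, 64)
	if err != nil {
		return time.Time{}, "", "", "", err
	}
	return time.Unix(secs, 0), parts[1], parts[2], parts[3], nil
}

// filterByBranch walks the sorted keys whose timestamps fall in [begin, end)
// and keeps only those for the given branch. With BoltDB, the same loop
// would run inside the closure passed to the database.
func filterByBranch(sortedKeys []string, begin, end time.Time, branch string) []string {
	lo := fmt.Sprintf("%012d:", begin.Unix())
	hi := fmt.Sprintf("%012d:", end.Unix())
	var out []string
	for _, k := range sortedKeys[sort.SearchStrings(sortedKeys, lo):] {
		if k >= hi {
			break // Past the end of the time range.
		}
		if _, _, b, _, err := parseCommitKey(k); err == nil && b == branch {
			out = append(out, k)
		}
	}
	return out
}

func main() {
	base := time.Unix(1446000000, 0)
	keys := []string{
		commitKey(base, "aaa111", "master", "x86:GPU:bench_a"),
		commitKey(base.Add(time.Minute), "bbb222", "my-branch", "x86:GPU:bench_a"),
		commitKey(base.Add(2*time.Minute), "ccc333", "master", "x86:GPU:bench_a"),
	}
	sort.Strings(keys)
	for _, k := range filterByBranch(keys, base, base.Add(time.Hour), "master") {
		fmt.Println(k)
	}
}
```

Because the timestamp leads the key, the time range narrows the scan to a contiguous run of keys, while the branch name can only be checked key by key, which is why the filtering has to happen inside the loop.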
84
85 trace bucket
86 ------------
87
88 The keys for the trace bucket are just the trace keys.
89
90     [trace_key]
91
92 The values are structs serialized as JSON that contain the params for each
93 trace. We are using JSON over GOB since these are relatively small structs.
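
A value in the trace bucket could be serialized roughly as shown here. The `traceInfo` struct, its `params` JSON field name, and the two helper functions are hypothetical; the design only states that the value is a JSON-serialized struct containing the params.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// traceInfo is a hypothetical value type for the trace bucket. The design
// only says the value is a JSON struct that holds the params.
type traceInfo struct {
	Params map[string]string `json:"params"`
}

// encodeTraceInfo produces the bytes that would be stored under [trace_key].
func encodeTraceInfo(params map[string]string) ([]byte, error) {
	return json.Marshal(traceInfo{Params: params})
}

// decodeTraceInfo recovers the params from a stored value.
func decodeTraceInfo(b []byte) (map[string]string, error) {
	var ti traceInfo
	if err := json.Unmarshal(b, &ti); err != nil {
		return nil, err
	}
	return ti.Params, nil
}

func main() {
	value, err := encodeTraceInfo(map[string]string{"config": "8888", "os": "Ubuntu"})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Printf("%s\n", value)

	params, err := decodeTraceInfo(value)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(params["config"], params["os"])
}
```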
94
95 Interface
96 ---------
97
98 The interface to tracedb looks like:
99
100     // DB represents the interface to any datastore for perf and gold results.
101     //
102     // Notes:
103     //   1. If 'sources' is an empty slice it will match all sources.
104     //   2. The Commits in the Tile will only contain the commit id and
105     //      the timestamp; the Author will not be populated.
106     //   3. The Tile's Scale and TileIndex will be set to 0.
107     //
108     type DB interface {
109         // Add new information to the datastore.
110         //
111         //   source - Either a branch name or a Rietveld issue id.
112         //   values - maps the trace id to a DBEntry.
113         //
114         // Note that only allowing data to be added for a single commit at a
115         // time should work well with ingestion while still breaking up writes
116         // into shorter actions.
117         Add(commitID *CommitID, source string, values map[string]*DBEntry) error
118
119         // Create a Tile based on the given query parameters.
120         //
121         // If 'sources' is an empty slice it will match all sources.
122         //
123         // Note that the Commits in the Tile will only contain the commit id and
124         // the timestamp; the Author will not be populated.
125         TileFromRangeAndSources(begin, end time.Time, sources []string) (*tiling.Tile, error)
126
127         // Create a Tile for the given commit ids. Commits should be provided in
128         // time order.
129         //
130         // Note that the Commits in the Tile will only contain the commit id and
131         // the timestamp; the Author will not be populated.
132         TileFromCommits(commitIDs []*CommitID) (*tiling.Tile, error)
133     }
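
The rule that an empty 'sources' slice matches all sources can be sketched as a small predicate. `matchesSource` is a hypothetical helper name, not part of the interface above.

```go
package main

import "fmt"

// matchesSource reports whether a stored source (a branch name or a Rietveld
// issue id) is selected by the query. An empty or nil query matches all
// sources, as the interface notes specify.
func matchesSource(sources []string, source string) bool {
	if len(sources) == 0 {
		return true
	}
	for _, s := range sources {
		if s == source {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(matchesSource(nil, "master"))                    // empty query: matches
	fmt.Println(matchesSource([]string{"master"}, "master"))    // listed: matches
	fmt.Println(matchesSource([]string{"master"}, "my-branch")) // not listed: no match
}
```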
134
135 The above interface depends on the CommitID struct, which is:
136
137     // CommitID represents the time of a particular commit, where a commit could either be
138     // a real commit into the repo, or an event like running a trybot.
139     type CommitID struct {
140         Timestamp time.Time
141         ID        string // Normally a git hash, but could also be Rietveld issue id + patch id.
142     }
143
144     func (c *CommitID) String() string {
145         return fmt.Sprintf("%s%s", c.Timestamp.Format(time.RFC3339), c.ID)
146     }
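
Since TileFromCommits expects its commits in time order, a caller that gathers CommitIDs from several places may need to sort them first. A minimal sketch, assuming a hypothetical helper named `sortByTime` and made-up IDs:

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// CommitID mirrors the struct from the design.
type CommitID struct {
	Timestamp time.Time
	ID        string
}

// sortByTime is a hypothetical helper that puts CommitIDs into the time
// order TileFromCommits expects. The sort is stable, so entries with equal
// timestamps keep their relative order.
func sortByTime(ids []*CommitID) {
	sort.SliceStable(ids, func(i, j int) bool {
		return ids[i].Timestamp.Before(ids[j].Timestamp)
	})
}

func main() {
	base := time.Date(2015, time.October, 28, 12, 0, 0, 0, time.UTC)
	// The IDs below are invented for illustration only.
	ids := []*CommitID{
		{Timestamp: base.Add(2 * time.Hour), ID: "ccc333"},
		{Timestamp: base, ID: "aaa111"},
		{Timestamp: base.Add(time.Hour), ID: "trybot-patchset-1"},
	}
	sortByTime(ids)
	for _, id := range ids {
		fmt.Println(id.Timestamp.Format(time.RFC3339), id.ID)
	}
}
```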
147
148 And DBEntry, which is:
149
150     // DBEntry holds the params and a value for a single measurement.
151     type DBEntry struct {
152         Params map[string]string
153
154         // Value is the value of the measurement.
155         //
156         // It should be the digest string converted to a []byte, or a float64
157         // converted to a little endian []byte. I.e. tiling.Trace.SetAt
158         // should be able to consume this value.
159         Value []byte
160     }
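
The two encodings of DBEntry.Value might be produced at ingestion time as in the sketch below. `perfValue` and `goldValue` are hypothetical helper names; the float64 is assumed to be stored as its 8-byte IEEE 754 bit pattern in little-endian order.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// perfValue converts a Perf measurement to an 8-byte little-endian []byte.
// Assumption: the bytes are the IEEE 754 bit pattern of the float64.
func perfValue(f float64) []byte {
	b := make([]byte, 8)
	binary.LittleEndian.PutUint64(b, math.Float64bits(f))
	return b
}

// goldValue converts a Gold digest string to a []byte.
func goldValue(digest string) []byte {
	return []byte(digest)
}

func main() {
	fmt.Printf("% x\n", perfValue(1.0))
	fmt.Printf("%s\n", goldValue("d41d8cd98f00b204e9800998ecf8427e"))
}
```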
161
162 Note that this will require adding a new method to the Trace interface:
163
164     // Sets the value of the measurement at index.
165     //
166     // Each specialization will convert []byte to the correct type.
167     SetAt(index int, value []byte) error
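
One way the two specializations of SetAt could look is sketched here. The lowercase `perfTrace` and `goldenTrace` types are simplified stand-ins for the real trace types, and the bounds and length checks are assumptions about reasonable error handling rather than behavior stated in the design.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// perfTrace is a simplified stand-in for PerfTrace.
type perfTrace struct {
	Values []float64
}

// SetAt decodes an 8-byte little-endian float64 and stores it at index.
func (p *perfTrace) SetAt(index int, value []byte) error {
	if index < 0 || index >= len(p.Values) {
		return fmt.Errorf("index %d out of range [0, %d)", index, len(p.Values))
	}
	if len(value) != 8 {
		return fmt.Errorf("expected 8 bytes, got %d", len(value))
	}
	p.Values[index] = math.Float64frombits(binary.LittleEndian.Uint64(value))
	return nil
}

// goldenTrace is a simplified stand-in for GoldenTrace.
type goldenTrace struct {
	Values []string
}

// SetAt stores the digest bytes as a string at index.
func (g *goldenTrace) SetAt(index int, value []byte) error {
	if index < 0 || index >= len(g.Values) {
		return fmt.Errorf("index %d out of range [0, %d)", index, len(g.Values))
	}
	g.Values[index] = string(value)
	return nil
}

func main() {
	raw := make([]byte, 8)
	binary.LittleEndian.PutUint64(raw, math.Float64bits(2.5))

	pt := &perfTrace{Values: make([]float64, 3)}
	if err := pt.SetAt(1, raw); err != nil {
		fmt.Println("perf SetAt failed:", err)
	}
	gt := &goldenTrace{Values: make([]string, 3)}
	if err := gt.SetAt(0, []byte("d41d8cd98f00b204e9800998ecf8427e")); err != nil {
		fmt.Println("gold SetAt failed:", err)
	}
	fmt.Println(pt.Values, gt.Values)
}
```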
168
169 Usage
170 =====
171
172 Here is how TileFromRangeAndSources and TileFromCommits can be used to satisfy all of the above requirements:
173
174 1. Build a tile of the last N commits from master.
175    * Find the ~Nth commit via gitinfo, along with its timestamp. Then call:
176
177          TileFromRangeAndSources(nth.Timestamp, head.Timestamp, []string{"master"})
178
179 2. Build a Tile for a trybot.
180    * Find the Rietveld issue id and created time of each patchset. Use the
181      patchset ids and created timestamps to create a slice of CommitIDs to use
182      in:
183
184          TileFromCommits(commits)
185
186      or if you know the timestamp when the issue was created:
187
188          TileFromRangeAndSources(created.Timestamp, time.Now(), []string{"[codereview id]"})
189
190 3. Build a Tile for a single trybot result vs a specific commit.
191    * Find the Rietveld issue id and created time of the patchset. Find the
192      commitid of the target commit:
193
194          TileFromCommits([]*CommitID{trybot, commit})
195
196 4. Build a Tile for all commits to master in a given time range. (Be able to go back in time for either Gold or Perf.)
197    * Given the time range:
198
199          TileFromRangeAndSources(beginTimestamp, endTimestamp, []string{"master"})
200
201 5. Build a Tile for all commits to all branches in a given time range. (Show how all branches compare against main.)
202    * Given the time range, the empty slice for sources means include all sources:
203
204          TileFromRangeAndSources(beginTimestamp, endTimestamp, []string{})
205
206 6. Build a Tile for all commits to main and a given branch for a given time range. (See how a single branch compares to main.)
207    * Find the ~Nth commit via gitinfo. Then call:
208
209          TileFromRangeAndSources(nth.Timestamp, head.Timestamp, []string{"master", "[codereview id]"})
210
211      Note that this might return multiple tries, i.e. one for each patchset.
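
For the first usage above, the time range for the last N commits could be derived as in this sketch. The `commit` type and `lastNRange` helper are hypothetical stand-ins for whatever gitinfo provides, and the commits are assumed to be ordered oldest first.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// commit is a hypothetical stand-in for the commit data gitinfo returns.
type commit struct {
	Hash      string
	Timestamp time.Time
}

// lastNRange returns the timestamps of the Nth-from-last commit and of the
// head commit, suitable as the begin and end arguments of
// TileFromRangeAndSources. commits must be ordered oldest first. If fewer
// than n commits exist, the range starts at the oldest one.
func lastNRange(commits []commit, n int) (begin, end time.Time, err error) {
	if len(commits) == 0 || n <= 0 {
		return time.Time{}, time.Time{}, errors.New("need at least one commit and n > 0")
	}
	if n > len(commits) {
		n = len(commits)
	}
	return commits[len(commits)-n].Timestamp, commits[len(commits)-1].Timestamp, nil
}

func main() {
	base := time.Date(2015, time.October, 28, 12, 0, 0, 0, time.UTC)
	commits := []commit{
		{"aaa111", base},
		{"bbb222", base.Add(time.Hour)},
		{"ccc333", base.Add(2 * time.Hour)},
	}
	begin, end, err := lastNRange(commits, 2)
	if err != nil {
		fmt.Println("lastNRange failed:", err)
		return
	}
	fmt.Println(begin.Format(time.RFC3339), end.Format(time.RFC3339))
}
```

The design does not say whether the end of the range is inclusive, so a caller may need to adjust the end timestamp to be sure the head commit is included.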