| OLD | NEW |
| (Empty) |
| 1 tracedb | |
| 2 ======= | |
| 3 | |
| 4 The tracedb package is designed to replace the current storage system for | |
| 5 traces, tiles, with a new BoltDB backend that allows for much more flexibility | |
| 6 and an increase in the size of data that can be stored. The new system needs | |
| 7 to support both branches and trybots (note that in the future there may be no | |
| 8 difference between the two), while still supporting the current capabilities | |
| 9 of looking at master. | |
| 10 | |
| 11 The current structure for a Tile looks like: | |
| 12 | |
| 13 type GoldenTrace struct { | |
| 14 Params_ map[string]string | |
| 15 Values []string | |
| 16 } | |
| 17 | |
| 18 type PerfTrace struct { | |
| 19 Values []float64 `json:"values"` | |
| 20 Params_ map[string]string `json:"params"` | |
| 21 } | |
| 22 | |
| 23 // Commit is information about each Git commit. | |
| 24 type Commit struct { | |
| 25 CommitTime int64 `json:"commit_time" bq:"timestamp" db:"ts"` | |
| 26 Hash string `json:"hash" bq:"gitHash" db:"githash"` | |
| 27 Author string `json:"author" db:"author"` | |
| 28 } | |
| 29 | |
| 30 // Tile is a config.TILE_SIZE commit slice of data. | |
| 31 // | |
| 32 // The length of the Commits array is the same length as all of the Values | |
| 33 // arrays in all of the Traces. | |
| 34 type Tile struct { | |
| 35 Traces map[string]Trace `json:"traces"` | |
| 36 ParamSet map[string][]string `json:"param_set"` | |
| 37 Commits []*Commit `json:"commits"` | |
| 38 | |
| 39 // What is the scale of this Tile, i.e. it contains every Nth point, where | |
| 40 // N=const.TILE_SCALE^Scale. | |
| 41 Scale int `json:"scale"` | |
| 42 TileIndex int `json:"tileIndex"` | |
| 43 } | |
| 44 | |
| 45 Where `PerfTrace` and `GoldenTrace` implement the `Trace` interface. | |
| 46 | |
| 47 Requirements | |
| 48 ============ | |
| 49 | |
| 50 In the following list you may substitute 'branch' for 'trybot'. | |
| 51 | |
| 52 1. Build a tile of the last N commits from master. (Our only usage today.) | |
| 53 2. Build a Tile for a trybot. | |
| 54 3. Build a Tile for a single trybot result vs a specific commit. | |
| 55 4. Build a Tile for all commits to master in a given time range. (Be able to go back in time for either Gold or Perf.) | |
| 56 5. Build a Tile for all commits to all branches in a given time range. (Show how all branches compare against main.) | |
| 57 6. Build a Tile for all commits to main and a given branch for a given time range. (See how a single branch compares to main.) | |
| 58 | |
| 59 Assumptions | |
| 60 =========== | |
| 61 | |
| 62 1. We will use queries to the BoltDB to build in-memory Tiles. | |
| 63 2. We can extract a timestamp from Rietveld for each patch. | |
| 64 | |
| 65 Design | |
| 66 ====== | |
| 67 | |
| 68 To actually handle this in BoltDB we will need to create two buckets, one | |
| 69 for the per-commit values in each trace, and another for the trace-level | |
| 70 information, such as the params for each trace. | |
| 71 | |
| 72 commit bucket | |
| 73 ------------- | |
| 74 | |
| 75 The keys for the commit bucket are structured as: | |
| 76 | |
| 77 [timestamp]:[git hash]:[branch name]:[trace_key] | |
| 78 | |
| 79 and each key maps to a single []byte value, which is either the Gold digest or | |
| 80 the Perf float64 measurement value. | |
| 81 | |
| 82 Note that to search through a time range for a specific branch name we'll need | |
| 83 to do the filtering inside the closure we pass to BoltDB. | |
| 84 | |
| 85 trace bucket | |
| 86 ------------ | |
| 87 | |
| 88 The keys for the trace bucket are just the trace keys. | |
| 89 | |
| 90 [trace_key] | |
| 91 | |
| 92 The values are structs serialized as JSON that contain the params for each | |
| 93 trace. We are using JSON over GOB since these are relatively small structs. | |
| 94 | |
| 95 Interface | |
| 96 --------- | |
| 97 | |
| 98 The interface to tracedb looks like: | |
| 99 | |
| 100 // DB represents the interface to any datastore for perf and gold results. | |
| 101 // | |
| 102 // Notes: | |
| 103 // 1. If 'sources' is an empty slice it will match all sources. | |
| 104 // 2. The Commits in the Tile will only contain the commit id and | |
| 105 // the timestamp, the Author will not be populated. | |
| 106 // 3. The Tile's Scale and TileIndex will be set to 0. | |
| 107 // | |
| 108 type DB interface { | |
| 109 // Add new information to the datastore. | |
| 110 // | |
| 111 // source - Either a branch name or a Rietveld issue id. | |
| 112 // values - maps the trace id to a DBEntry. | |
| 113 // | |
| 114 // Note that allowing data to be added for only a single commit at a | |
| 115 // time should work well with ingestion while still breaking up writes | |
| 116 // into shorter actions. | |
| 117 Add(commitID *CommitID, source string, values map[string]*DBEntry) error | |
| 118 | |
| 119 // Create a Tile based on the given query parameters. | |
| 120 // | |
| 121 // If 'sources' is an empty slice it will match all sources. | |
| 122 // | |
| 123 // Note that the Commits in the Tile will only contain the commit id and | |
| 124 // the timestamp, the Author will not be populated. | |
| 125 TileFromRangeAndSources(begin, end time.Time, sources []string) (*tiling.Tile, error) | |
| 126 | |
| 127 // Create a Tile for the given commit ids. Commits should be provided in | |
| 128 // time order. | |
| 129 // | |
| 130 // Note that the Commits in the Tile will only contain the commit id and | |
| 131 // the timestamp, the Author will not be populated. | |
| 132 TileFromCommits(commitIDs []*CommitID) (*tiling.Tile, error) | |
| 133 } | |
| 134 | |
| 135 The above interface depends on the CommitID struct, which is: | |
| 136 | |
| 137 // CommitID represents the time of a particular commit, where a commit could either be | |
| 138 // a real commit into the repo, or an event like running a trybot. | |
| 139 type CommitID struct { | |
| 140 Timestamp time.Time | |
| 141 ID string // Normally a git hash, but could also be Rietveld issue id + patch id. | |
| 142 } | |
| 143 | |
| 144 func (c *CommitID) String() string { | |
| 145 return fmt.Sprintf("%s%s", c.Timestamp.Format(time.RFC3339), c.ID) | |
| 146 } | |
| 147 | |
| 148 And DBEntry, which is: | |
| 149 | |
| 150 // DBEntry holds the params and a value for single measurement. | |
| 151 type DBEntry struct { | |
| 152 Params map[string]string | |
| 153 | |
| 154 // Value is the value of the measurement. | |
| 155 // | |
| 156 // It should be the digest string converted to a []byte, or a float64 | |
| 157 // converted to a little endian []byte. I.e. tiling.Trace.SetAt | |
| 158 // should be able to consume this value. | |
| 159 Value []byte | |
| 160 } | |
| 161 | |
| 162 Note that this will require adding a new method to the Trace interface: | |
| 163 | |
| 164 // Sets the value of the measurement at index. | |
| 165 // | |
| 166 // Each specialization will convert []byte to the correct type. | |
| 167 SetAt(index int, value []byte) error | |
| 168 | |
| 169 Usage | |
| 170 ===== | |
| 171 | |
| 172 Here is how TileFromRangeAndSources and TileFromCommits can be used to satisfy all the above requirements: | |
| 173 | |
| 174 1. Build a tile of the last N commits from master. | |
| 175 * Find the ~Nth commit via gitinfo, along with its timestamp. Then call | |
| 176 | |
| 177 TileFromRangeAndSources(nth.Timestamp, head.Timestamp, []string{"master"}) | |
| 178 | |
| 179 2. Build a Tile for a trybot. | |
| 180 * Find the Rietveld issue id and created time of each patchset. Use the | |
| 181 patchset ids and created timestamps to create a slice of CommitID's to use | |
| 182 in: | |
| 183 | |
| 184 TileFromCommits(commits) | |
| 185 | |
| 186 or if you know the timestamp when the issue was created: | |
| 187 | |
| 188 TileFromRangeAndSources(created.Timestamp, time.Now(), []string{"[codereview id]"}) | |
| 189 | |
| 190 3. Build a Tile for a single trybot result vs a specific commit. | |
| 191 * Find the Rietveld issue id and created time of the patchset. Find the | |
| 192 commitid of the target commit: | |
| 193 | |
| 194 TileFromCommits([]*CommitID{trybot, commit}) | |
| 195 | |
| 196 4. Build a Tile for all commits to master in a given time range. (Be able to go back in time for either Gold or Perf). | |
| 197 * Given the time range: | |
| 198 | |
| 199 TileFromRangeAndSources(beginTimestamp, endTimestamp, []string{"master"}) | |
| 200 | |
| 201 5. Build a Tile for all commits to all branches in a given time range. (Show how all branches compare against main). | |
| 202 * Given the time range, the empty slice for source means include all sources: | |
| 203 | |
| 204 TileFromRangeAndSources(beginTimestamp, endTimestamp, []string{}) | |
| 205 | |
| 206 6. Build a Tile for all commits to main and a given branch for a given time range. (See how a single branch compares to main). | |
| 207 * Find the ~Nth commit via gitinfo. Then call: | |
| 208 | |
| 209 TileFromRangeAndSources(nth.Timestamp, head.Timestamp, []string{"master", "[codereview id]"}) | |
| 210 | |
| 211 Note that this might return multiple tries, i.e. one for each patchset. | |