LogDog
======

LogDog is a high-performance log collection and dissemination platform. It is
designed to collect log data from a large number of cooperative individual
sources and make it available to users and systems for consumption. It is
composed of several services and tools that work cooperatively to provide a
reliable log streaming platform.

Like other LUCI components, LogDog primarily aims to be useful to the
[Chromium](https://www.chromium.org/) project.
LogDog offers several useful features:

* Log data is streamed, and is consequently available the moment that it is
  ingested in the system.
* Flexible hierarchical log namespaces for organization and navigation.
* Recognition of different projects, and application of different ACLs for each
  project.
* Able to stream text, binary data, or records.
* Long-term (possibly indefinite) log data storage and access.
* Log data is sourced from read-forward streams (think files, sockets, etc.).
* Leverages the LUCI Configuration Service for configuration and management.
* Log data is implemented as [protobufs](api/logpb/log.proto).
* The entire platform is written in Go.
* Rich metadata is collected and stored alongside log records.
* Built entirely on scalable platform technologies, targeting Google Cloud
  Platform.
* Resource requirements scale linearly with log volume.


## APIs

Most applications will interact with a LogDog Coordinator instance via its
[Coordinator Logs API](api/endpoints/coordinator/logs/v1).

## Life of a Log Stream

Log streams pass through several layers and states on their path from
generation through archival:

1. **Streaming**: A log stream is being emitted by a **Butler** instance and
   pushed through the **Transport Layer** to the **Collector**.
1. **Pre-Registration**: The log stream hasn't been observed by a
   **Collector** instance yet, and exists only in the mind of the **Butler**
   and the **Transport** layer.
1. **Registered**: The log stream has been observed by a **Collector**
   instance and successfully registered with the **Coordinator**. At this
   point, it becomes queryable, listable, and the records that have been
   loaded into **Intermediate Storage** are streamable.
1. **ArchivePending**: One of the following events causes the log stream to be
   recognized as finished and an archival request to be dispatched to the
   **Archivist** cluster:
    * The log stream's terminal entry is collected, and the terminal index is
      successfully registered with the **Coordinator**.
    * A sufficient amount of time has elapsed since the log stream's
      registration.
1. **Archived**: An **Archivist** instance has received an archival request for
   the log stream, successfully executed the request according to its
   parameters, and updated the log stream's state with the **Coordinator**.


Most of the lifecycle is hidden from the Logs API endpoint by design. The user
need not distinguish between a stream that is streaming, has archival pending,
or has been archived. They will issue the same `Get` requests and receive the
same log stream data.

A user may differentiate between a streaming and a complete log by observing its
terminal index, which will be `< 0` if the log stream is still streaming.
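
The streaming-versus-complete distinction can be sketched as a small helper.
The `LogStreamState` struct below is hypothetical (the actual Coordinator types
live in the repository and carry much more metadata); only the terminal-index
semantics come from the text above:

```go
package main

import "fmt"

// LogStreamState is a hypothetical, simplified view of the stream state
// exposed by the Coordinator's Logs API.
type LogStreamState struct {
	// TerminalIndex is the index of the stream's last log entry, or a
	// negative value if the terminal entry has not been observed yet.
	TerminalIndex int64
}

// Complete reports whether the log stream has finished streaming: a
// negative terminal index means the stream is still streaming.
func (s LogStreamState) Complete() bool {
	return s.TerminalIndex >= 0
}

func main() {
	streaming := LogStreamState{TerminalIndex: -1}
	finished := LogStreamState{TerminalIndex: 1024}
	fmt.Println(streaming.Complete()) // false: still streaming
	fmt.Println(finished.Complete())  // true: terminal entry registered
}
```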


## Components

The LogDog platform consists of several components:

* [Coordinator](appengine/coordinator), a hosted service which serves log data
  to users and manages the log stream lifecycle.
* [Butler](client/cmd/logdog_butler), which runs on each log-producing system
  and serves log data to the Collector for consumption.
* [Collector](server/cmd/logdog_collector), a microservice which takes log
  stream data and ingests it into intermediate storage for streaming and
  archival.
* [Archivist](server/cmd/logdog_archivist), a microservice which compacts
  completed log streams and prepares them for long-term storage.

LogDog offers several log stream clients to query and consume log data:

* [LogDog Cat](client/cmd/logdog_cat), a CLI to query and view log streams.
* [Web App](/web/apps/logdog-app), a full-featured log stream navigation
  application built in [Polymer](https://www.polymer-project.org).
* [Web Viewer](/web/apps/logdog-view), a lightweight log stream viewer built in
  [Polymer](https://www.polymer-project.org).

Additionally, LogDog is built on several abstract middleware technologies,
including:

* A **Transport**, a layer for the **Butler** to send data to the **Collector**.
* An **Intermediate Storage**, a fast, highly-accessible layer which stores log
  data as soon as it is ingested by the **Collector**, until it can be archived.
* An **Archival Storage**, for cheap long-term file storage.

Log data is sent from the **Butler** through **Transport** to the **Collector**,
which stages it in **Intermediate Storage**. Once the log stream is complete
(or expired), the **Archivist** moves the data from **Intermediate Storage** to
**Archival Storage**, where it will permanently reside.
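
This data flow can be traced with illustrative Go interfaces. Everything below
is a hypothetical in-memory sketch of the three middleware layers; the real
abstractions live in the LogDog source tree and are considerably richer:

```go
package main

import "fmt"

// Transport carries log entries from the Butler to the Collector.
type Transport interface {
	Send(entry string)
	Recv() (entry string, ok bool)
}

// IntermediateStorage holds ingested entries until they are archived.
type IntermediateStorage interface {
	Put(stream, entry string)
	Get(stream string) []string
	Delete(stream string)
}

// ArchivalStorage is cheap, long-term storage for completed streams.
type ArchivalStorage interface {
	Archive(stream string, entries []string)
	Read(stream string) []string
}

// In-memory implementations, sufficient to trace the data flow.
type chanTransport struct{ ch chan string }

func (t *chanTransport) Send(e string)        { t.ch <- e }
func (t *chanTransport) Recv() (string, bool) { e, ok := <-t.ch; return e, ok }

type memStore map[string][]string

func (s memStore) Put(st, e string)       { s[st] = append(s[st], e) }
func (s memStore) Get(st string) []string { return s[st] }
func (s memStore) Delete(st string)       { delete(s, st) }

type memArchive map[string][]string

func (a memArchive) Archive(st string, es []string) { a[st] = es }
func (a memArchive) Read(st string) []string        { return a[st] }

// runPipeline pushes two Butler entries through the layers and returns
// what ends up in archival storage.
func runPipeline() []string {
	t := &chanTransport{ch: make(chan string, 16)}
	var inter IntermediateStorage = memStore{}
	var arch ArchivalStorage = memArchive{}

	// Butler: emit log entries into the transport.
	t.Send("line 1")
	t.Send("line 2")
	close(t.ch)

	// Collector: drain the transport, staging entries in intermediate storage.
	for {
		e, ok := t.Recv()
		if !ok {
			break
		}
		inter.Put("proj/stream", e)
	}

	// Archivist: move the completed stream to archival storage.
	arch.Archive("proj/stream", inter.Get("proj/stream"))
	inter.Delete("proj/stream")
	return arch.Read("proj/stream")
}

func main() {
	fmt.Println(runPipeline())
}
```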

The Chromium-deployed LogDog service uses
[Google Cloud Platform](https://cloud.google.com/) for several of the middleware
layers:

* [Google AppEngine](https://cloud.google.com/appengine), a scaling application
  hosting service. This is used to host the **Coordinator**.
* [Cloud Datastore](https://cloud.google.com/datastore/), a transactional NoSQL
  structured data storage system. This is used by the Coordinator to store log
  stream state.
* [Cloud Pub/Sub](https://cloud.google.com/pubsub/), a publish/subscribe
  transport layer. This is used to ferry log data from **Butler** instances to
  **Collector** instances for ingest.
* [Cloud BigTable](https://cloud.google.com/bigtable/), an unstructured
  key/value storage system. This is used as **Intermediate Storage** for log
  stream data.
* [Cloud Storage](https://cloud.google.com/storage/), used for long-term log
  stream archival storage.
* [Container Engine](https://cloud.google.com/container-engine/), which manages
  Kubernetes clusters. This is used to host the **Collector** and **Archivist**
  microservices.

Additionally, other LUCI services are used, including:

* [Auth Service](https://github.com/luci/luci-py/tree/master/appengine/auth_service),
  a configurable hosted access control system.
* [Configuration Service](https://github.com/luci/luci-py/tree/master/appengine/config_service),
  a simple repository-based configuration service.

## Instantiation

To instantiate your own LogDog instance, you will need the following
prerequisites:

* A **Configuration Service** instance.
* A Google Cloud Platform project configured with:
    * Datastore.
    * A Pub/Sub topic (Butler) and subscription (Collector) for log streaming.
    * A Pub/Sub topic (Coordinator) and subscription (Archivist) for archival
      coordination.
    * A Container Engine instance for microservice hosting.
    * A BigTable cluster.
    * A Cloud Storage bucket for archival staging and storage.

Optional compatible components include:

* An **Auth Service** instance to manage authentication. This is necessary if
  something stricter than public read/write access is desired.

### Config

The **Configuration Service** must have a valid service entry text protobuf for
this LogDog service (defined in
[svcconfig/config.proto](api/config/svcconfig/config.proto)).

### Coordinator

After deploying the Coordinator to a suitable cloud project, several
configuration parameters must be defined. Visit its settings page at
`https://<your-app>/admin/settings` and configure:

* Point the "Configuration Service Settings" at the **Configuration Service**
  instance.
* Update the "Tumble Settings" appropriately (see [tumble docs](/tumble)).
* If using timeseries monitoring, update the "Time Series Monitoring Settings".
* If using **Auth Service**, set the "Authorization Settings".

If you are using a BigTable instance outside of your cloud project (e.g.,
staging, dev), you will need to add your BigTable service account JSON to the
service's settings. Currently this can only be done with a command-line tool.
Hopefully a proper settings page will be added to enable this, or alternatively
Cloud BigTable will be updated to support IAM.

### Microservices

Microservices are hosted in Google Container Engine, and use Google Compute
Engine metadata for configuration.

The following metadata parameter **must** be set for deployed microservices
to work:

* `logdog_coordinator_host`, the host name of the Coordinator service.

All deployed microservices use the following optional metadata parameters for
configuration:

* `logdog_storage_auth_json`, an optional file containing the authentication
  credentials for intermediate storage (i.e., BigTable). This isn't necessary
  if the BigTable node is hosted in the same cloud project as the microservice
  and the microservice's container has BigTable read/write permissions.
* `tsmon_endpoint`, an optional endpoint for timeseries monitoring data.
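
On a Compute Engine instance, such attributes can be read from the standard GCE
metadata server. A minimal sketch (the attribute name comes from the list
above; error handling is abbreviated, and this only works when actually running
on GCE or Container Engine):

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

// attributeURL builds the metadata-server URL for an instance attribute
// such as `logdog_coordinator_host`.
func attributeURL(attr string) string {
	return "http://metadata.google.internal/computeMetadata/v1/instance/attributes/" + attr
}

// fetchAttribute reads one instance attribute. The metadata server requires
// the `Metadata-Flavor: Google` header on every request.
func fetchAttribute(attr string) (string, error) {
	req, err := http.NewRequest("GET", attributeURL(attr), nil)
	if err != nil {
		return "", err
	}
	req.Header.Set("Metadata-Flavor", "Google")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("metadata: HTTP %d for %q", resp.StatusCode, attr)
	}
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	host, err := fetchAttribute("logdog_coordinator_host")
	if err != nil {
		fmt.Println("not on GCE, or attribute unset:", err)
		return
	}
	fmt.Println("Coordinator host:", host)
}
```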

#### Collector

The Collector is configured entirely via its command line. Its [entry point
script](server/cmd/logdog_collector/run.sh) uses Google Compute Engine metadata
to populate the command line in a production environment:

* `logdog_collector_log_level`, an optional `-log-level` flag value.

#### Archivist

The Archivist is configured entirely via its command line. Its [entry point
script](server/cmd/logdog_archivist/run.sh) uses Google Compute Engine metadata
to populate the command line in a production environment:

* `logdog_archivist_log_level`, an optional `-log-level` flag value.