The Apache Beam project maintains a set of performance tests that run on a regular basis on Jenkins (see: Load Tests, Nexmark and Performance Tests). Their results are sent to a time series database (InfluxDB) and displayed using Grafana. Dashboards are available at: http://metrics.beam.apache.org.
Resources
- Grafana dashboards
- Design docs:
- Source code: .test-infra/metrics
Local development and deployment are managed by docker-compose, while the production environment is deployed via Kubernetes.
Instructions for developing and making updates to dashboards are checked in at README.md. Instructions for deploying the stack to a production environment are below.
Kubernetes deployment
First Time Prep
Configure gcloud & kubectl: https://cloud.google.com/kubernetes-engine/docs/quickstart
- In particular, either set the default project and compute zone to match the apache-beam-testing project and the metrics cluster, or remember to pass those as parameters in the steps below.
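As a sketch, the one-time setup above might look like the following (the zone is an assumption based on the SSH example later in this page; verify the actual cluster location first):

```shell
# Set the default project so later gcloud/kubectl commands target it.
gcloud config set project apache-beam-testing
# Fetch kubectl credentials for the metrics cluster. The zone is an
# assumption -- confirm it with `gcloud container clusters list`.
gcloud container clusters get-credentials metrics --zone us-central1-a
```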
Deploy the infrastructure
InfluxDB
Before deploying the infrastructure, make sure the cluster has storage-rw access scope. This scope grants write access to all Cloud Storage resources and is required for automatic backups to work properly.
Execute the following from the .test-infra/metrics directory of the Apache Beam repository:
- Add secrets for InfluxDB:
kubectl create secret generic influxdb-creds --from-literal=INFLUXDB_USER=<user> --from-literal=INFLUXDB_USER_PASSWORD=<pwd> --from-literal=INFLUXDB_ADMIN_USER=<user> --from-literal=INFLUXDB_ADMIN_PASSWORD=<pwd>
Important: keep the user name and password in sync with the Jenkins credentials; otherwise, sending test results may fail.
- Create persistent volume claims:
kubectl create -f beam-influxdb-storage-persistentvolumeclaim.yaml
kubectl create -f beam-influxdb-backups-persistentvolumeclaim.yaml
- Create deployment and services:
kubectl create -f beam-influxdb.yaml
- Create cron job for automated backups:
kubectl create -f beam-influxdb-autobackup.yaml
The administrator and user accounts, as well as a new database called beam-test-metrics, will be created automatically.
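To sanity-check the bootstrap, you can list the databases from outside the pod. The pod name and credentials below are placeholders: find the real pod name with `kubectl get pods`, and the admin credentials in the influxdb-creds secret (see the Maintenance section).

```shell
# List databases inside the InfluxDB pod; the output should include
# beam-test-metrics once the bootstrap has completed.
kubectl exec -it <influxdb-pod> -- influx -username <admin-user> -password '<pwd>' -execute 'SHOW DATABASES'
```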
Grafana
Refer to Community Metrics for details on how to deploy Grafana.
Restore from a backup
A full database backup is made once a day and written to the GCS bucket gs://apache-beam-testing-metrics/. The bucket has the following policies enabled:
- object versioning,
- a lifecycle rule that keeps only the 14 most recent versions of the backup.
The bucket is readable by everyone on the Internet.
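Since the bucket is publicly readable, the available backups and their retained versions can be inspected with gsutil, for example:

```shell
# List the current backup objects in the bucket.
gsutil ls gs://apache-beam-testing-metrics/
# With -a, also list noncurrent object versions kept by the
# 14-version lifecycle rule.
gsutil ls -a gs://apache-beam-testing-metrics/
```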
If something goes wrong, it is possible to restore the database from a backup. Before getting started, make sure the pod with InfluxDB is operational.
- Start InfluxDB CLI and authenticate yourself. Instructions are below, in the Maintenance section.
- Check that the database to be restored from the backup does not exist by executing the show databases command.
- Leave the CLI and run the following command from within the pod:
influxd restore -portable /path/to/backup
Do note that this will not restore users or their permissions! In case of disaster, these must be recreated manually.
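Recreating the accounts might look like the following InfluxQL sketch, run at the InfluxDB CLI prompt after authenticating as admin. All names and passwords are placeholders; keep them in sync with the influxdb-creds secret and the Jenkins credentials.

```sql
-- Placeholders: substitute the real credentials from influxdb-creds.
CREATE USER <admin-user> WITH PASSWORD '<pwd>' WITH ALL PRIVILEGES
CREATE USER <user> WITH PASSWORD '<pwd>'
GRANT ALL ON "beam-test-metrics" TO <user>
```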
Maintenance
Normally, when developing new Grafana dashboards or testing SQL queries, you don't need to connect to the production instance of InfluxDB. These tips are intended only for maintenance work.
Before getting started, make sure that you have gcloud and kubectl configured. Appropriate permissions for the apache-beam-testing project will also be needed.
How can I use InfluxDB command line interface?
First, obtain the name of the admin user and the password:
kubectl get secret influxdb-creds -o yaml
YAML values of secret data are encoded as base64 strings. Use the following command to decode data:
echo '<secret-data>' | base64 --decode
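For example, decoding a sample value (not a real credential):

```shell
# "YWRtaW4=" is the base64 encoding of the string "admin".
echo 'YWRtaW4=' | base64 --decode
# prints: admin
```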
The next step is to run an interactive bash session in the pod. Find the name of the pod that runs InfluxDB:
kubectl get pods
The name of the pod starts with "influxdb". Use the full name to execute a command:
kubectl exec -it influxdb-67c6bcf57b-w672w -- bash
Then start the CLI with the command influx. The CLI indicates readiness to accept commands with a ">" prompt. Execute the auth command to authenticate yourself with the admin credentials.
For information on how to use InfluxDB command line interface, see: https://docs.influxdata.com/influxdb/v1.8/tools/shell/.
How can I send HTTP requests to InfluxDB?
We are using Internal Load Balancing to make InfluxDB accessible outside of the Kubernetes cluster. Internal Load Balancing creates an internal IP address that receives traffic from clients in the same VPC network and compute region. This means that you cannot connect to InfluxDB from your workstation unless you set up port forwarding over SSH.
To learn how to use port forwarding, see https://cloud.google.com/solutions/connecting-securely#port-forwarding-over-ssh.
Find the IP address by executing the following command:
kubectl get service influxdb
The IP address and the port can be found in the EXTERNAL-IP and PORT(S) columns, respectively.
With all the information gathered, run the following command to set up port forwarding:
gcloud compute ssh example-instance --project apache-beam-testing --zone us-central1-a -- -L 8086:<EXTERNAL-IP>:<PORT>
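Once the tunnel is up, requests to localhost:8086 reach InfluxDB. As a quick check, the InfluxDB 1.x /ping endpoint returns HTTP 204 when the server is healthy:

```shell
# Query InfluxDB's ping endpoint through the SSH tunnel; expect
# an HTTP 204 No Content response if everything is wired up.
curl -i http://localhost:8086/ping
```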
How can I increase storage of InfluxDB?
Open a cloud-shell session and execute:
kubectl edit pvc influxdb-storage
Then increase the resources.requests.storage field, save, and exit; the change should take effect a moment later.
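Afterwards, you can confirm that the resize has been applied:

```shell
# The CAPACITY column should show the new size once the
# underlying volume has been expanded.
kubectl get pvc influxdb-storage
```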