The Apache Beam project tracks a set of community and project health metrics, with targets to ensure a healthy, sustainable community (e.g., test timing and reliability, pull request latency). The dashboards are available at https://s.apache.org/beam-community-metrics.
Resources
- s.apache.org/beam-community-metrics
- Design docs:
- Source code: .test-infra/metrics
- Feature Requests: JIRA label community-metrics
- Monitoring: Jenkins beam_Prober_CommunityMetrics
Updating and Deploying
Instructions for developing and updating the metrics pipelines and dashboards are checked in; see README.md.
Local development and deployment are managed by docker-compose, while the production environment is deployed via Kubernetes and Cloud SQL on Google Cloud.
Analytics Database
Metric data is imported to and queried from a PostgreSQL analytics database. Our database is deployed on Google Cloud SQL, under project apache-beam-testing, instance beammetrics. The database tables are automatically maintained by the import scripts.
Migrating Jenkins Job History
In some cases, it will be necessary to migrate historical data. For example, jobs can be renamed in Jenkins. Jenkins will correctly migrate history to refer to the new job name, but already imported job runs need to be migrated in the postgres database. Follow the steps below in order to migrate data:
- Back up the database. Log in to the Database Backups page for beammetrics and create a backup in case anything goes wrong.
- Connect to the database. From a console, log in using the gcloud command:
gcloud beta sql connect beammetrics --database=beammetrics --user=<your_username>
- Verify the data to migrate. Craft a SELECT query for the data you plan to update. For example:
SELECT *
FROM jenkins_builds jb
WHERE jb.job_name = 'beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle'
AND NOT EXISTS (
SELECT *
FROM jenkins_builds
WHERE job_name = 'beam_PostCommit_Java_ValidatesRunner_Gearpump'
AND build_id = jb.build_id);
Note that the import scripts will import recent history with the migrated job name from Jenkins; the NOT EXISTS clause above ensures that the duplicate history is not migrated from the previous name (it will be removed later). The UPDATE command will fail without this condition.
- Update the data. After validating that the query targets the intended rows, modify it to update the necessary fields:
UPDATE jenkins_builds jb
SET job_name = 'beam_PostCommit_Java_ValidatesRunner_Gearpump'
WHERE <query-from-above>;
- Remove duplicate history. The import script will import recent history with the migrated job name from Jenkins. The final step is to remove the redundant history with the old job name:
DELETE FROM jenkins_builds
WHERE job_name = 'beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle';
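The update and delete steps above can be run as a single transaction, so that a failure part-way through leaves the table unchanged. The following is a minimal sketch rather than an established procedure: print_migration_sql is a hypothetical helper (it is not part of the Beam repository), it only prints SQL and does not connect to the database, and it assumes job names contain no single quotes.

```shell
# Hypothetical helper: prints the migration SQL for one renamed Jenkins job.
# Usage: print_migration_sql <old_job_name> <new_job_name>
# Assumption: neither name contains a single quote (no escaping is done).
print_migration_sql() {
  old_name="$1"
  new_name="$2"
  cat <<SQL
BEGIN;
-- Move rows that do not already exist under the new name.
UPDATE jenkins_builds jb
SET job_name = '${new_name}'
WHERE jb.job_name = '${old_name}'
  AND NOT EXISTS (
    SELECT 1
    FROM jenkins_builds
    WHERE job_name = '${new_name}'
      AND build_id = jb.build_id);
-- Whatever is left under the old name is duplicate history.
DELETE FROM jenkins_builds
WHERE job_name = '${old_name}';
COMMIT;
SQL
}
```

The printed statements can be reviewed and then pasted into the database session opened in the connect step. Running the SELECT from the verification step first is still advisable.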
Kubernetes Cluster
A Kubernetes cluster hosts the data ingestion scripts and the Grafana frontend. Our cluster is deployed on Google Cloud Platform, under project apache-beam-testing, cluster metrics, workload beamgrafana. The following sections cover instructions for deploying and updating the Kubernetes cluster. Only accounts with the container.deployments.update and container.deployments.create permissions are able to do this. If you need to deploy or update the cluster, request that someone with the appropriate permissions follow these instructions.
First Time Prep
If you haven't worked on the apache-beam-testing GCP project before, there is some configuration work required first.
- Configure gcloud & kubectl: https://cloud.google.com/kubernetes-engine/docs/quickstart
- In particular, either set the default project and compute zone to match apache-beam-testing and the metrics cluster, or remember to use those as parameters in the steps below.
- Configure PostgreSQL
- See this link to configure the connection from Kubernetes to PostgreSQL: https://cloud.google.com/sql/docs/postgres/connect-kubernetes-engine
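Setting the gcloud defaults described above might look like the following sketch. configure_metrics_cluster_access is a hypothetical helper name, and the compute zone is a required argument because this page does not state the zone of the metrics cluster; the gcloud subcommands themselves are standard.

```shell
# Hypothetical helper: points gcloud and kubectl at the Beam metrics cluster.
# Usage: configure_metrics_cluster_access <compute_zone>
# The zone is not given on this page, so it must be supplied by the caller.
configure_metrics_cluster_access() {
  zone="$1"
  if [ -z "$zone" ]; then
    echo "usage: configure_metrics_cluster_access <compute_zone>" >&2
    return 2
  fi
  # Default project and zone for later gcloud commands.
  gcloud config set project apache-beam-testing &&
  gcloud config set compute/zone "$zone" &&
  # Write kubectl credentials for the 'metrics' cluster.
  gcloud container clusters get-credentials metrics --zone "$zone"
}
```

After it succeeds, kubectl get pods should list the pods of the beamgrafana workload.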
Deploy the Cluster
Execute the following from the .test-infra/metrics directory of the Apache Beam repository:
- Add secrets for grafana:
kubectl create secret generic grafana-admin-pwd --from-literal=grafana_admin_password=<pwd>
- Create persistent volume claims:
kubectl create -f beam-grafana-etcdata-persistentvolumeclaim.yaml
kubectl create -f beam-grafana-libdata-persistentvolumeclaim.yaml
kubectl create -f beam-grafana-logdata-persistentvolumeclaim.yaml
- Build and publish sync containers:
./build_and_publish_containers.sh
- Create deployment
kubectl create -f beamgrafana-deploy.yaml
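Once the deployment has been created, it may be worth confirming that it actually became ready. A minimal sketch, assuming the deployment is named beamgrafana as in the commands on this page; check_beamgrafana_rollout and its default timeout are illustrative choices, not part of the Beam tooling.

```shell
# Hypothetical helper: waits for the beamgrafana deployment to become ready,
# then lists the pods. The 120s default timeout is an arbitrary assumption.
# Usage: check_beamgrafana_rollout [timeout]
check_beamgrafana_rollout() {
  timeout="${1:-120s}"
  kubectl rollout status deployment/beamgrafana --timeout="$timeout" &&
  kubectl get pods
}
```

A non-zero exit status means the rollout did not complete within the timeout; kubectl describe pod <pod_name> is the usual next step.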
Update the Cluster
See https://kubernetes.io/docs/concepts/workloads/controllers/deployment/ for background on Kubernetes deployments.
## Update deployment from yaml file
## Update yaml file and build containers
./build_and_publish_containers.sh
## Deploy new version
kubectl replace -f beamgrafana-deploy.yaml
## Manual partial update
# Build and publish sync containers
cd sync/jenkins
docker build -t gcr.io/${PROJECT_ID}/beammetricssyncjenkins:v1 .
docker push gcr.io/${PROJECT_ID}/beammetricssyncjenkins:v1
# If needed check current pod status
kubectl get pods
kubectl describe pod <pod_id>
# Update container image via one of the following.
## update image for container
kubectl set image deployment/beamgrafana container=<new_image_name>
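If an update leaves the workload unhealthy, Kubernetes can revert a deployment to its previous revision. The sketch below is one possible way to do that and is not taken from the Beam instructions; rollback_beamgrafana is a hypothetical helper, and it assumes the deployment still has an earlier revision in its rollout history.

```shell
# Hypothetical helper: reverts beamgrafana to its previous revision.
# Assumes an earlier revision exists in the deployment's rollout history.
rollback_beamgrafana() {
  # Show the recorded revisions before changing anything.
  kubectl rollout history deployment/beamgrafana &&
  kubectl rollout undo deployment/beamgrafana &&
  # Wait until the reverted revision is fully rolled out.
  kubectl rollout status deployment/beamgrafana
}
```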
Grafana UI
The Grafana dashboarding frontend is deployed as part of the Kubernetes cluster. The website is publicly available at http://metrics.beam.apache.org/.
When you deploy a new Grafana instance, there is some one-time setup:
- Log in at http://localhost:3000 with username admin and the value specified for GF_SECURITY_ADMIN_PASSWORD in docker-compose.yml.
- Add Postgres as a data source:
- Click the 'Add data source' button.
- Fill out the following config:
- Name: BeamPSQL
- Type: PostgreSQL
- Host: beampostgresql:5432
- Database: beam_metrics
- User: admin
- Password: POSTGRES_PASSWORD in docker-compose.yml
- SSL Mode: Disable
- Change default organization name to "Beam" to let anonymous users browse dashboards without logging in. This must be done manually, since it's impossible to do it via configuration (see issue: https://github.com/grafana/grafana/issues/2908).
- Log in as an administrator
- Go to 'Server Admin' / 'Orgs'
- There should be one organization called 'Main Org.' Click the name and change it to "Beam"
- Change home dashboard. Just like in the previous step, this must be done manually, due to https://github.com/grafana/grafana/issues/10266.
- Log in as an administrator
- Click at top-left on "Home". Find a dashboard called "Home / Getting Started" and mark it as favourite (give it a star)
- Go to 'Configuration' / 'Preferences'
- Choose "Home / Getting Started" from a drop-down list next to the 'Home Dashboard' label
To configure the InfluxDB Data Source:
- Log in at http://localhost:3000 as specified above.
- Add InfluxDB as a data source:
- Click the 'Add data source' button
- Fill out the following config:
- Name: BeamInfluxDB
- URL: http://localhost:8086
- Access: Browser
- Database: beam_test_metrics
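Both setup lists above begin by logging in at http://localhost:3000, so a quick reachability check can save some confusion when the containers are still starting. This is a minimal sketch: grafana_is_up is a hypothetical helper, and it relies on Grafana's /api/health endpoint, which needs no login.

```shell
# Hypothetical helper: succeeds if Grafana answers its health endpoint.
# Usage: grafana_is_up [base_url]   (defaults to the local docker-compose URL)
grafana_is_up() {
  base_url="${1:-http://localhost:3000}"
  # --fail makes curl return non-zero on HTTP errors such as 404 or 503.
  curl --silent --fail "$base_url/api/health" >/dev/null
}
```

For example, `grafana_is_up && echo "Grafana is reachable"` can be run before opening the login page.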
Updating Dashboards
The Grafana dashboards are exported as JSON files in the codebase. Dashboards can be easily exported from the UI.
- Run a local version of Grafana using docker-compose. Refer to README.md for more information.
- Once dashboards are manually updated via the UI, re-export every modified or new dashboard to a JSON file:
- Click at top-right on "Share dashboard" button.
- A modal dialog will appear. Go to "Export" tab and click "Save to file" button.
- Save/update file in .test-infra/metrics/grafana/dashboards.
- Create a Pull Request. New dashboards will be automatically deployed to Grafana after merge.
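Before opening the pull request, it can be useful to check that every exported dashboard is still well-formed JSON, since a truncated export is easy to miss in review. The sketch below is an optional extra and not part of the Beam tooling: validate_dashboards is a hypothetical helper, and it assumes python3 is installed and that the default path is resolved from the repository root.

```shell
# Hypothetical helper: checks that each *.json file in a directory parses.
# Usage: validate_dashboards [dir]
# Assumes python3 is on PATH; the default dir is relative to the repo root.
validate_dashboards() {
  dir="${1:-.test-infra/metrics/grafana/dashboards}"
  status=0
  for f in "$dir"/*.json; do
    [ -e "$f" ] || continue   # no matching files: nothing to check
    if ! python3 -m json.tool "$f" >/dev/null 2>&1; then
      echo "invalid JSON: $f" >&2
      status=1
    fi
  done
  return "$status"
}
```

This only checks JSON syntax; it does not verify that Grafana accepts the dashboard.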
Appendix
Useful Kubernetes commands and hints
# Get pods
kubectl get pods
# Get detailed status
kubectl describe pod <pod_name>
# Get logs
kubectl logs <PodName> -c <ContainerName>
# Set kubectl logging level: -v [1..10]
# https://github.com/kubernetes/kubernetes/issues/35054
# Update kubernetes secret
kubectl create secret generic production-tls --from-file=./tls.key --from-file=./tls.crt --dry-run -o yaml | kubectl apply -f -
Useful docker commands and hints
- Connect from one container to another
curl <containername>:<port>
- Remove all containers/images/volumes
sudo docker rm $(sudo docker ps -a -q)
sudo docker rmi $(sudo docker images -q)
sudo docker volume prune
Enabling the legend on dashboards
If a graph has no legend, you can enable it by clicking on the graph name → More → Toggle legend.