Instrumenting applications is something we are used to at Bithaus. Having the code report on itself delivers great information, and everyone involved in the system, from developers to operations to managers, appreciates it.
But you cannot always change the code to get metrics, or at least not as fast as you would like.
An excellent way to gather business metrics from your application is to look at the tables in your database. Read on for a step-by-step guide on how to create a custom metric based on a SQL query, so we can display it on a dashboard and create monitors to check its value.
Your table, your metrics
In our case, we needed to measure the performance and progress of an ETL process in a data migration project. We had a statistics table that looked something like this:
example_database=> \d source_stats
                View "public.source_stats"
        Column         |              Type              | Modifiers
-----------------------+--------------------------------+-----------
 source_id             | integer                        |
 data_type_id          | smallint                       |
 creation_ts           | timestamp without time zone    |
 update_ts             | timestamp without time zone    |
 last_inserted_data_ts | timestamp(0) without time zone |
 entry_count           | bigint                         |
Rows here are inserted or updated as the ETL works on a "source", identified by source_id. So we know when the ETL started working on a source, when its stats were last updated, and the total number of entries processed.
Based on this information, we created a view called etl_stats that summarises the overall process. Our source_stats table is updated on a regular basis, allowing the view to show real-time status.
 count | avgtime | avg_entry_count
-------+---------+-----------------
   156 |      50 |         1092847
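For reference, here is a minimal sketch of how such a view could be built on top of source_stats. The avgtime-in-seconds calculation and the "finished source" filter are assumptions for illustration, not our actual view definition; adapt them to your own schema.

-- Hypothetical etl_stats view; a source is assumed "finished" once it has
-- inserted data at least once.
CREATE VIEW etl_stats AS
SELECT count(*) AS count,
       avg(EXTRACT(EPOCH FROM update_ts - creation_ts))::int AS avgtime,
       avg(entry_count)::bigint AS avg_entry_count
  FROM source_stats
 WHERE last_inserted_data_ts IS NOT NULL;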
Configuring the Datadog PostgreSQL collector
If you haven't already enabled the PostgreSQL integration of the Datadog agent, do so first from the Integrations page of your Datadog account.
Go to the Datadog checks configuration directory, which is /etc/dd-agent/conf.d on Ubuntu installations, and edit the postgres.yaml file.
init_config:

instances:
  - host: localhost
    port: 5432
    username: datadog
    password: YOUR_PASSWORD
    ssl: True
    tags:
      - psql:example_instance
    dbname: example_database
    custom_metrics:
      - # ETL Stats
        descriptors: []
        metrics:
          count: [my.etl.sources_finished, GAUGE]
          avgtime: [my.etl.avg_process_time, GAUGE]
          avg_entry_count: [my.etl.avg_entry_count, GAUGE]
        query: SELECT %s from etl_stats;
        relation: false
This is the simplest setup. The logic is that every metric defined in the "metrics" property corresponds to a column returned by the "query" statement. In this case, our view has 3 columns, count, avgtime and avg_entry_count, that are mapped to 3 metrics; "descriptors" can map additional columns to tags, but we don't need any here. Internally, the agent builds the query from the metric definitions at runtime, as you can see in collector.log when debugging:
2016-02-22 13:13:51 CLT | DEBUG | dd.collector | checks.postgres(postgres.py:446) | Running query: SELECT count, avgtime, avg_entry_count from etl_stats;
The column-to-metric mapping. This is the magic right here!
COLUMN_NAME : [METRIC_NAME, METRIC_TYPE]
In our example we created a metric "my.etl.sources_finished" of type "gauge" that reports the value of the "count" column of our summary view.
count : [my.etl.sources_finished, GAUGE]
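Although we don't use it here, the descriptors list is how you would map columns of your query to tags instead of metrics. A sketch under that assumption; the per-type view etl_stats_by_type, its data_type column and the etl_data_type tag name are hypothetical:

custom_metrics:
  - # hypothetical per-data-type breakdown, tagging each point with etl_data_type
    descriptors:
      - [data_type, etl_data_type]
    metrics:
      entry_count: [my.etl.entry_count, GAUGE]
    query: SELECT data_type, %s from etl_stats_by_type;
    relation: false

Note that the descriptor columns go before the %s placeholder in the query, in the same order as they are declared.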
Define as many metrics as you need, depending on your query. The Datadog collector will run these queries when collecting the standard postgres metrics. If anything goes wrong, check the collector log (Ubuntu: /var/log/datadog/collector.log).
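Before restarting the agent, it is also worth making sure the datadog role can actually read the view. A quick sanity check from psql; the GRANT is only needed if the role was not already given access when you set up the integration:

-- run as a superuser; the datadog role is the one from the integration setup
GRANT SELECT ON etl_stats TO datadog;
-- then verify the exact query the agent will build:
SET ROLE datadog;
SELECT count, avgtime, avg_entry_count from etl_stats;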
Restart the agent and that's it (Ubuntu: service datadog-agent restart).
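To run just the postgres check once in the foreground and see the metrics it would submit, the agent 5 CLI also has a check subcommand; a quick way to catch configuration mistakes without digging through the logs:

sudo -u dd-agent dd-agent check postgres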
You may check your configuration using service datadog-agent checkconfig and check the collection status with service datadog-agent info.
===================
Collector (v 5.6.1)
===================
Status date: 2016-02-22 18:21:10 (7s ago)
Pid: 17902
Platform: Linux-3.2.0-23-generic-x86_64-with-debian-wheezy-sid
Python Version: 2.7.10
Logs: , /var/log/datadog/collector.log, syslog:/dev/log
Clocks
======
NTP offset: 0.0053 s
System UTC time: 2016-02-22 21:21:18.485238
Paths
=====
conf.d: /etc/dd-agent/conf.d
checks.d: /opt/datadog-agent/agent/checks.d
Hostnames
=========
socket-hostname: ubuntu-vm
ec2-hostname: ubuntu-vm.bithaus.cl
hostname: ubuntu-vm.bithaus.cl
socket-fqdn: ubuntu-vm.bithaus.cl
Checks
======
ntp
---
- Collected 0 metrics, 0 events & 1 service check
disk
----
- instance #0 [OK]
- Collected 48 metrics, 0 events & 1 service check
postgres
--------
- instance #0 [OK]
- Collected 45 metrics, 0 events & 2 service checks
Using your custom metric on Datadog
Go into your Datadog account and open Metrics > Explorer. In the Graph field, type the name of one of the metrics you defined in postgres.yaml.
If everything went OK, magic!
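If you prefer the command line, you can also verify that datapoints are arriving through the Datadog query API. A sketch, assuming you have an API key and an application key at hand; the metric name and tag are the ones from our example:

# query the last hour of my.etl.sources_finished for our tagged instance
curl -G "https://app.datadoghq.com/api/v1/query" \
  --data-urlencode "query=avg:my.etl.sources_finished{psql:example_instance}" \
  -d "from=$(($(date +%s) - 3600))" \
  -d "to=$(date +%s)" \
  -d "api_key=YOUR_API_KEY" \
  -d "application_key=YOUR_APP_KEY"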
And then you can go all crazy with dashboards B-)
Enjoy!
Datadog helps teams across different areas of an organisation gain insight into, and control over, the IT products and services they are involved with. Metrics & monitoring over OS + databases + services + apps + business; all on one TV screen.
Bithaus Software – Datadog partner from Chile.