MySQL: How to monitor MySQL Replication Lag in MilliSeconds With PMM and pt-heartbeat
There could be various requirements at application end which need realtime slaves. Let me introduce a solution to a problem we used to face how can we monitor whether our slaves are real time or they are lagging in Milliseconds. Unfortunately there is no built in feature in MySQL to get Replication Lag in MilliSeconds.
Perhaps there is a tool provisioned in pt-toolkit named as pt-heartbeat. It generates heartbeat events on master and monitoring system can monitor time difference on slave to calculate lag.
Step 1: Download pt-heartbeat using below command.
wget http://percona.com/get/pt-heartbeat -P /usr/bin/ && chmod +x /usr/bin/pt-heartbeat
Step 2: It requires a database where it can create a table. Let's create it on master.mysql -uUser -pPassword -e "create database if not exists percona;"
Step 3: Create heartbeat table in percona db.pt-heartbeat -D percona --create-table --check -uUser -pPassword --master-server-id ServerId
Step 4: Start pt-heartbeat daemonpt-heartbeat -D percona --daemonize --update h=127.0.0.1,u=User,p=Password --utc
Step 5: Changes in pmm client to monitor lag in ms.pmm-admin remove mysql:metrics
pmm-admin add mysql:metrics --user=monitor --password=xxxx -- --collect.heartbeat.database=percona --collect.heartbeat=true
Step 6: Use Below grafana panel json to add new graph on dashboardadd panel > Graph > Panel Title > More > Panel Json > Update
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "Prometheus",
"description": "Requires pt-heartbeat\nThis graph requires mysqld_exporter with -collect.heartbeat=true and a pt-heartbeat daemon running on MySQL masters",
"fill": 2,
"gridPos": {
"h": 7,
"w": 12,
"x": 12,
"y": 3
},
"id": 41,
"legend": {
"alignAsTable": true,
"avg": true,
"current": false,
"max": true,
"min": true,
"show": true,
"total": false,
"values": true
},
"lines": true,
"linewidth": 2,
"links": [],
"nullPointMode": "null",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
{
"alias": "Lag",
"color": "#bf1b00"
}
],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "max_over_time(mysql_heartbeat_now_timestamp_seconds{instance=\"$host\"}[$interval]) - \nmax_over_time(mysql_heartbeat_stored_timestamp_seconds{instance=\"$host\"}[$interval]) or \nmax_over_time(mysql_heartbeat_now_timestamp_seconds{instance=\"$host\"}[5m]) - \nmax_over_time(mysql_heartbeat_stored_timestamp_seconds{instance=\"$host\"}[5m])",
"format": "time_series",
"interval": "$interval",
"intervalFactor": 1,
"legendFormat": "Lag for server-id={{server_id}}",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeShift": null,
"title": "MySQL Heartbeat Replication Delay",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": "0",
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
}
I believe this feature is available in PMM 2.0 which is still in beta release. Can you please confirm which version of PMM have been used for the above ?
ReplyDeleteThis works well with PMM 1.17
DeleteDoes this support group replication ? like innodb cluster ?
ReplyDelete