Grafana using Prometheus as data source

Overview:

Prometheus: An open-source monitoring and alerting toolkit designed to collect and store metrics as time series data for reliability and scalability.

Grafana: An open-source analytics and monitoring platform that visualizes data from various sources, creating interactive dashboards and alerts.

Node Exporter: A Prometheus exporter that exposes hardware and OS metrics from *nix systems, enabling detailed monitoring of system performance.

Alertmanager: A component of Prometheus that handles alerts by deduplicating, grouping, and routing them to various notification channels.

The Docker Compose file below deploys a monitoring stack consisting of Prometheus, Grafana, Node Exporter, and Alertmanager.

Step 1: Create a file named docker-compose.yml using an editor such as vim or nano.

    sudo nano docker-compose.yml

Paste the following complete Compose file into it and save.

version: '3'

services:
 prometheus:
   image: prom/prometheus
   ports:
     - "9090:9090"
   volumes:
     - ./prometheus:/etc/prometheus
   command:
     - '--config.file=/etc/prometheus/prometheus.yml'
   networks:
     - monitoring

 grafana:
   image: grafana/grafana
   ports:
     - "3000:3000"
   environment:
     - GF_SECURITY_ADMIN_PASSWORD=P@ssw0rd
   networks:
     - monitoring

 node_exporter:
   image: prom/node-exporter
   ports:
     - "9101:9100"
   networks:
     - monitoring

 alertmanager:
   image: prom/alertmanager
   ports:
     - "9093:9093"
   volumes:
     - ./alertmanager:/etc/alertmanager
   command:
     - '--config.file=/etc/alertmanager/alertmanager.yml'
   networks:
     - monitoring

networks:
 monitoring:
   driver: bridge

The table below summarizes the settings of each service.

 

           Grafana            Prometheus         Node Exporter       Alertmanager
Image      grafana/grafana    prom/prometheus    prom/node-exporter  prom/alertmanager
Port       3000:3000          9090:9090          9101:9100           9093:9093
Volume     -                  ./prometheus       -                   ./alertmanager
Network    monitoring/bridge  monitoring/bridge  monitoring/bridge   monitoring/bridge

 

Note: The Node Exporter container's port 9100 is published on host port 9101 instead of the default 9100. If Node Exporter is also installed directly on the local host, it already occupies port 9100, and publishing the container on the same port would fail with a "port is already in use" error.
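
The host mapping only matters for access from outside Docker. As a minimal sketch, assuming Prometheus should also scrape the Node Exporter container started by this Compose file, a scrape job in prometheus.yml (created in Step 5) would address the container by its service name and its internal port 9100, not by the published port 9101. The job name node_exporter_container is a hypothetical label chosen only for illustration.

```yaml
scrape_configs:
  # Hypothetical job name, used here only for illustration.
  - job_name: 'node_exporter_container'
    static_configs:
      # Service name and container port on the shared "monitoring" network;
      # the published host port 9101 is not used between containers.
      - targets: ['node_exporter:9100']
```

From the host itself, the same metrics would be reachable at localhost:9101.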

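Step 2: Start the services in detached mode. Prometheus and Alertmanager exit at startup if prometheus.yml and alertmanager.yml are missing, so run this command after creating the files in Steps 3 to 7, or run it again once they exist.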

sudo docker-compose up -d

Step 3: Ensure the directories ./prometheus and ./alertmanager exist next to docker-compose.yml and contain the configuration files prometheus.yml and alertmanager.yml respectively.

Step 4: Go to the prometheus directory and create the alert.yml file.

sudo nano alert.yml

Paste the rules below. They trigger an alert when an instance goes down or when memory or CPU usage rises above 80%, and they cover both Linux and Windows servers.

groups:
  - name: alert.rules
    rules:
      # Any scrape target that has been unreachable for 1 minute.
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Endpoint {{ $labels.instance }} down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute."

      # Linux (Node Exporter): less than 20% of memory available.
      - alert: HostOutOfMemory
        expr: node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes * 100 < 20
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Host out of memory (instance {{ $labels.instance }})"
          description: "Node memory is filling up (< 20% left)\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"

      # Windows: less than 20% of physical memory free.
      - alert: HostOutOfMemory
        expr: windows_os_physical_memory_free_bytes / windows_cs_physical_memory_bytes * 100 < 20
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Host out of memory (instance {{ $labels.instance }})"
          description: "Node memory is filling up (< 20% left)\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"

      # Linux: CPU usage (100 minus the idle percentage) above 80%.
      # node_cpu_seconds_total is the metric name in Node Exporter 0.16 and later (formerly node_cpu).
      # The job label matches the job_name used in prometheus.yml.
      - alert: HostHighCpuLoad
        expr: 100 - (avg by (instance) (irate(node_cpu_seconds_total{job="node_exporter",mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Host high CPU load (instance {{ $labels.instance }})"
          description: "CPU load is > 80%\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"

      # Windows: same calculation. Grafana variables such as $server are not available in rule files.
      - alert: HostHighCpuLoad
        expr: 100 - (avg by (instance) (irate(windows_cpu_time_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Host high CPU load (instance {{ $labels.instance }})"
          description: "CPU load is > 80%\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"

Step 5:

Now create the prometheus.yml file inside the prometheus directory and list the servers to be monitored as scrape targets.

sudo nano prometheus.yml

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            # Compose service name; inside the Prometheus container, localhost is the container itself.
            - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - alert.yml  # the rules file created in Step 4
  # - "second_rules.yml"

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']  # Prometheus itself

  - job_name: 'node_exporter'
    static_configs:
      # Internal servers running Node Exporter (port 9100) or the Windows exporter (port 9182)
      - targets: ['xxxxx:9100', 'yyyyyyyyy:9182']
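
The comment above rule_files refers to a global evaluation_interval, which this file does not set, so Prometheus falls back to its defaults of one minute for both scraping and rule evaluation. A minimal sketch of an explicit global block, placed at the top of prometheus.yml, could look like the following. The 15s values are illustrative assumptions, not settings taken from this setup.

```yaml
global:
  scrape_interval: 15s      # how often targets are scraped (assumed value)
  evaluation_interval: 15s  # how often alert rules are evaluated (assumed value)
```

Shorter intervals give faster alert detection at the cost of more stored samples, so the values are worth tuning to the number of monitored servers.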

Step 6:

Make sure that Node Exporter (Linux) or the Windows exporter is installed on every server to be monitored.
To check, open the following address in a browser on that server:
Linux: localhost:9100
Windows: localhost:9182

Step 7:

Now create the mail configuration. Go to the alertmanager directory and create the alertmanager.yml file.

sudo nano alertmanager.yml

global:
 resolve_timeout: 5m

route:
 receiver: 'alert'
 group_by: ['instance', 'alertname']  # 'alertname' is the label Prometheus sets from each rule's alert name
 group_wait: 30s
 group_interval: 5m
 repeat_interval: 3h

receivers:
 - name: 'alert'
   email_configs:
      - to: '<recipient email address>'    # replace the placeholders with your own addresses
        from: '<sender email address>'
        smarthost: 'smtp.office365.com:587'
        auth_username: '<SMTP login address>'
       auth_password: 'xxxxxxxxxx'
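
The rules in alert.yml label InstanceDown as critical and the other alerts as warning, and the route can make use of that label. The following is a minimal sketch, assuming a recent Alertmanager version that supports the matchers syntax, in which critical alerts are repeated more often. The 1h value is an illustrative assumption.

```yaml
route:
  receiver: 'alert'
  group_by: ['instance', 'alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 3h
  routes:
    # Child route: alerts labelled severity="critical" are re-sent
    # every hour (assumed value) instead of every three hours.
    # The receiver is inherited from the parent route.
    - matchers:
        - severity="critical"
      repeat_interval: 1h
```

Alerts that do not match the child route keep the parent's settings.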

Step 8:

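Open Grafana on port 3000 and add Prometheus as a data source from the menu:

 Home > Connections > Data sources > Prometheus

Enter the Prometheus URL, then click the Save & test option to verify the connection. Because both containers share the monitoring network, the URL can be http://prometheus:9090.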
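
The same data source can also be declared in a file instead of through the UI. The following is a minimal sketch of a Grafana provisioning file, assuming it is mounted into the Grafana container under /etc/grafana/provisioning/datasources/ (this mount is not part of the compose file above) and that Prometheus is reachable under its Compose service name.

```yaml
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy                # Grafana's backend sends the queries
    url: http://prometheus:9090  # service name on the "monitoring" network
    isDefault: true
```

Grafana reads provisioning files at startup, so the container would need a restart after the file is added.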

Step 9:

Go to New dashboard and choose how to build it. In this example we use the Import dashboard option, which loads a ready-made dashboard.

Enter the dashboard ID to load it:

For Windows: 10467

For Linux: 11074

 

Step 10:

Because the compose file mounts the configuration through volumes, you do not need to log in to the Docker container to add or remove a server. Instead, edit prometheus.yml in the prometheus folder on the host, outside the container.
Once done, restart the container.
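
As an alternative to a full restart, Prometheus can reload its configuration over HTTP when started with the --web.enable-lifecycle flag. The fragment below is a sketch of the only change this would need in docker-compose.yml, assuming the rest of the prometheus service stays as defined in Step 1.

```yaml
services:
  prometheus:
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--web.enable-lifecycle'  # enables the POST /-/reload endpoint
```

With this flag set, an HTTP POST request to http://localhost:9090/-/reload makes Prometheus re-read prometheus.yml and the rule files without stopping the container.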

Registered Office

CHG IT CONSULTANCY PVT LTD

STPI Technology Incubation Centre,
2nd Floor, No.5, Rajiv Gandhi Salai,
Taramani, Chennai – 600113,
Tamil Nadu, INDIA

Parent Office

CIC Corporation

2-16-4 Dogenzaka, Shibuya-ku,
Nomura Real Estate,
Shibuya Dogenzaka Building,
Tokyo 150-0043, JAPAN

  +81 03-3496-1571
About Us

CHG IT Consultancy Pvt. Ltd. is a subsidiary of CIC Holdings Co. Ltd. Japan. Our company is focused on IT related solutions to reap the benefits of global popularity of Software Industry.