Prometheus
Alert Manager & Long-term Storage

Lee Calcote

Innovate Summit 2017

Lee Calcote

clouds, containers, functions, applications and their management

Show of Hands

AlertManager

Prometheus

Alertmanager is an alert...

Purpose

ingester

grouper

de-duplicator

silencer

throttler

notifier

\ˈnō-mən-ˌklā-chər

a brief Prometheus AlertManager construct review

  • Routes - match alerts to their receiver and determine how often to notify.

  • Receivers - define where and how to send alerts.

  • Silencers - match alerts with specific labels and prevent them from being included in notifications.

 

  • Inhibitors - suppress specific notifications when other specific alerts are already firing.

 

  • Grouping - categorizes alerts of similar nature into a single notification.
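The silencing and inhibition rules above can be sketched in a few lines of Python. This is an illustrative model only, not AlertManager's implementation: it assumes exact-equality label matching (no regex matchers), and the function names `matches`, `is_silenced`, and `is_inhibited` are hypothetical. The rule keys `source_match`, `target_match`, and `equal` are borrowed from the AlertManager inhibit rule configuration.

```python
def matches(labels, matchers):
    """Return True when the alert carries every matcher label with an equal value."""
    return all(labels.get(name) == value for name, value in matchers.items())


def is_silenced(labels, silences):
    """Muting: an alert is silenced when any silence matches all of its matchers."""
    return any(matches(labels, silence) for silence in silences)


def is_inhibited(labels, firing_alerts, inhibit_rules):
    """Suppressing: a target alert is inhibited while a matching source alert fires.

    Assumption: each rule is a dict with `source_match`, `target_match`, and
    `equal` (label names whose values must agree between source and target).
    """
    for rule in inhibit_rules:
        if not matches(labels, rule["target_match"]):
            continue
        for source in firing_alerts:
            same_scope = all(source.get(n) == labels.get(n) for n in rule["equal"])
            if matches(source, rule["source_match"]) and same_scope:
                return True
    return False


# Hypothetical example data, for illustration only.
silences = [{"alertname": "DiskFull", "instance": "db1"}]
inhibit_rules = [{
    "source_match": {"severity": "critical"},
    "target_match": {"severity": "warning"},
    "equal": ["cluster"],
}]
firing = [{"alertname": "ClusterDown", "severity": "critical", "cluster": "a"}]
```

With this data, a warning in cluster "a" would be suppressed while `ClusterDown` fires, and a warning in any other cluster would still notify.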


Muting

Suppressing

Correlating

group_wait: 30s

group_by: ['alertname', 'cluster']

group_interval: 5m
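As a rough illustration of what `group_by` does, the sketch below buckets alerts by the values of the grouping labels, so that each bucket would become a single notification. In AlertManager, `group_wait` is the initial wait before the first notification for a new group and `group_interval` is the wait before notifying about new alerts added to an existing group; the timing is not modeled here. The function name `group_alerts` and the alert dictionaries are assumptions made for this example.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Bucket alerts by the values of the `group_by` labels.

    Assumption: each alert is a dict with a "labels" dict. A missing label
    is treated as an empty string so such alerts still share a bucket.
    """
    groups = defaultdict(list)
    for alert in alerts:
        key = tuple(alert["labels"].get(name, "") for name in group_by)
        groups[key].append(alert)
    return dict(groups)


# Hypothetical alerts, for illustration only.
alerts = [
    {"labels": {"alertname": "HighLatency", "cluster": "a", "instance": "web1"}},
    {"labels": {"alertname": "HighLatency", "cluster": "a", "instance": "web2"}},
    {"labels": {"alertname": "HighLatency", "cluster": "b", "instance": "web7"}},
]
groups = group_alerts(alerts, ["alertname", "cluster"])
```

Here the two cluster "a" alerts share one group and the cluster "b" alert forms its own, so three alerts would produce two notifications.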

Multiple approaches to suppression

per route vs global vs via UI / API

Alerts

ALERT <alert name>
  IF <PromQL vector expression>
  FOR <duration>
  LABELS { ... }
  ANNOTATIONS { ... }

Supports clients other than Prometheus

AlertManager is notified when alerts transition state

a shared construct

Prometheus

AlertManager

inactive → pending → firing

state transition

inactive

firing

notifications
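The inactive, pending, and firing states can be modeled as a small state function driven by the rule's FOR duration. This is a simplified sketch, not Prometheus source code: `next_state` is a hypothetical helper, times are plain seconds, and it is called once per rule evaluation.

```python
def next_state(condition_true, active_since, now, for_duration):
    """Return (state, active_since) after one rule evaluation.

    condition_true: whether the IF expression returned a result this cycle.
    active_since:   time the condition first became true, or None.
    for_duration:   the FOR clause in seconds (0 means fire immediately).
    """
    if not condition_true:
        # The expression no longer matches: the alert resets.
        return "inactive", None
    if active_since is None:
        active_since = now
    if now - active_since >= for_duration:
        return "firing", active_since
    # The condition holds but has not yet lasted for the FOR duration.
    return "pending", active_since
```

An alert whose condition clears while still pending goes back to inactive without ever firing, which is how the FOR clause filters out brief spikes.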


Notification Integrations

Notifying to Multiple Destinations

Use a receiver that goes to both destinations.

route:
  receiver: email_webhook

receivers:
  - name: email_webhook
    email_configs:
      - to: 'lee@example.io'
    webhook_configs:
      - url: <webhook url here>

Use continue to advance to the next receiver.

route:
  receiver: ops-team-all  # default
  routes:
    - match:
        severity: page
      receiver: ops-team-b
      continue: true
    - match:
        severity: critical
      receiver: ops-team-a

receivers:
  - name: ops-team-all
    email_configs:
      - to: ops-team-all@example.io
  - name: ops-team-a
    email_configs:
      - to: ops-team-a@example.io
  - name: ops-team-b
    email_configs:
      - to: ops-team-b@example.io
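To make the effect of `continue` concrete, here is a minimal Python sketch of routing tree traversal. It is a simplification under stated assumptions: matching is exact equality on labels, every route names its own receiver, and `route_alert` is a hypothetical function name rather than an AlertManager API. The `route` dictionary mirrors the configuration above.

```python
def route_alert(labels, route):
    """Return the receiver names an alert with these labels would reach."""
    receivers = []
    for child in route.get("routes", []):
        if all(labels.get(k) == v for k, v in child.get("match", {}).items()):
            receivers.extend(route_alert(labels, child))
            # Without `continue: true`, the first matching child ends the search.
            if not child.get("continue", False):
                break
    if not receivers:
        # No child matched, so the alert stays with this route's receiver.
        receivers.append(route["receiver"])
    return receivers


route = {
    "receiver": "ops-team-all",
    "routes": [
        {"match": {"severity": "page"}, "receiver": "ops-team-b", "continue": True},
        {"match": {"severity": "critical"}, "receiver": "ops-team-a"},
    ],
}
```

In this sketch, `route_alert({"severity": "page"}, route)` returns `["ops-team-b"]` and an alert with no matching severity falls back to `["ops-team-all"]`. Because an alert carries a single `severity` value, `continue` only yields a second receiver when a later sibling route matches on labels the alert also carries.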

or

Non-HA AlertManager Architecture

[Diagram labels: API, UI, Alert Provider, Silence Provider, Notify Provider, Dispatcher, Router, Inhibitor, Silencer, Maintenance Script; flows: alerts, store, subscribe, batched alerts, de-duplication, notification pipeline, checks for previously sent notifications, Retry.]

Dispatcher sorts incoming alerts into aggregation groups and assigns the correct notifiers to each.

High Availability

being introduced in 0.5

I ♥ gossip protocols.

built atop Weave Mesh

With HA, you no longer have to monitor the monitor.

 

Designed for an alert to be sent to all instances in the cluster.

 

All Prometheus instances send alerts to all Alertmanager instances.

 

Guarantees notifications to be sent at least once.
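The at-least-once guarantee can be illustrated with a toy model in Python. Every instance receives every alert, and each keeps a log of notifications already sent that is shared with its peers. This sketch assumes a set of group keys as the log and a `gossip` function that merges logs instantly; the class and function names are hypothetical and the real gossip protocol is not modeled.

```python
class Instance:
    """A toy AlertManager instance with its own notification log."""

    def __init__(self, name):
        self.name = name
        self.log = set()   # group keys this instance believes were notified
        self.sent = []     # notifications this instance actually sent

    def maybe_notify(self, group_key):
        """Send unless the log says a peer (or this instance) already did."""
        if group_key in self.log:
            return False
        self.sent.append(group_key)
        self.log.add(group_key)
        return True


def gossip(instances):
    """Assumption: one gossip round leaves every instance with the merged log."""
    merged = set().union(*(inst.log for inst in instances))
    for inst in instances:
        inst.log = set(merged)


a, b = Instance("am-1"), Instance("am-2")
first = a.maybe_notify("group-1")    # am-1 sends
gossip([a, b])                       # the log entry reaches am-2
second = b.maybe_notify("group-1")   # am-2 skips the duplicate
```

If both instances act before a gossip round completes, both send. That duplicate is the cost of "at least once": a notification may arrive twice, but a single failed instance does not cause it to be lost.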

Understanding and Extending

Prometheus AlertManager

KubeCon EU 2017

Storage Adapters

Prometheus

“an ephemeral sliding window of recent data”

Prometheus stores sample data on disk in a highly optimized custom format based on flat files.

 

  • efficient

  • tunable

  • NOT scalable

  • NOT durable

./data/01BKGV7JBM69T2G1BGBGM6KB12
./data/01BKGV7JBM69T2G1BGBGM6KB12/meta.json
./data/01BKGV7JBM69T2G1BGBGM6KB12/wal
./data/01BKGV7JBM69T2G1BGBGM6KB12/wal/000002
./data/01BKGV7JBM69T2G1BGBGM6KB12/wal/000001
./data/01BKGTZQ1SYQJTR4PB43C8PD98
./data/01BKGTZQ1SYQJTR4PB43C8PD98/meta.json
./data/01BKGTZQ1SYQJTR4PB43C8PD98/index
./data/01BKGTZQ1SYQJTR4PB43C8PD98/chunks
./data/01BKGTZQ1SYQJTR4PB43C8PD98/chunks/000001
./data/01BKGTZQ1SYQJTR4PB43C8PD98/tombstones
./data/01BKGTZQ1HHWHV8FBJXW1Y3W0K
./data/01BKGTZQ1HHWHV8FBJXW1Y3W0K/meta.json
./data/01BKGTZQ1HHWHV8FBJXW1Y3W0K/wal
./data/01BKGTZQ1HHWHV8FBJXW1Y3W0K/wal/000001
./data/01BKGV7JC0RY8A6MACW02A2PJD
./data/01BKGV7JC0RY8A6MACW02A2PJD/meta.json
./data/01BKGV7JC0RY8A6MACW02A2PJD/index
./data/01BKGV7JC0RY8A6MACW02A2PJD/chunks
./data/01BKGV7JC0RY8A6MACW02A2PJD/chunks/000001
./data/01BKGV7JC0RY8A6MACW02A2PJD/tombstones

Prometheus Local Storage

 

source: prometheus.io

Prometheus Remote Storage

The solution is a remote storage adapter.

[Diagram: Prometheus, Adapter, 3rd Party Storage. Labels: custom protocol; write samples; request: matchers + time ranges; receive: series + samples.]

Adapter: an application that can receive batches of samples from Prometheus over HTTP and send them to some backend.

AppOptics Remote Storage Adapter

Send Prometheus data to AppOptics

 

prometheus2appoptics

  • batching
  • translation

POST /v1/measurements

POST /receive

 

1000 Samples per POST
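The batching and translation steps could look roughly like the Python sketch below. It is not the prometheus2appoptics implementation. The input sample shape, the function names, and the exact measurement payload fields (`name`, `value`, `time`, `tags`) are assumptions made for illustration; only the 1000 samples per POST limit comes from the slide. No HTTP request is made here.

```python
import json

BATCH_SIZE = 1000  # samples per POST, as stated above


def to_measurement(sample):
    """Translate one Prometheus-style sample into a measurement dict.

    Assumption: a sample is {"labels": {...}, "value": float, "timestamp_ms": int}
    and the metric name travels in the `__name__` label.
    """
    tags = dict(sample["labels"])
    name = tags.pop("__name__")
    return {
        "name": name,
        "value": sample["value"],
        "time": sample["timestamp_ms"] // 1000,  # milliseconds to seconds
        "tags": tags,
    }


def build_payloads(samples, batch_size=BATCH_SIZE):
    """Return one JSON body per batch, each holding at most `batch_size` samples."""
    payloads = []
    for start in range(0, len(samples), batch_size):
        chunk = samples[start:start + batch_size]
        body = {"measurements": [to_measurement(s) for s in chunk]}
        payloads.append(json.dumps(body))
    return payloads
```

Each returned string would be the body of one POST to the measurements endpoint, so 2,500 incoming samples would become three requests.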

app optics

Prometheus

AppOptics Prometheus Storage Integration

Resources

IRC: #prometheus on irc.freenode.net 

 

Mailing lists:

 

 

@PrometheusIO

 

 

Prometheus repositories to file bugs and feature requests


Lee Calcote

Thank you. Questions?

clouds, containers, infrastructure,

applications and their management

yes, we're hiring