Build a Centralized Log Management System
Centralized logging is essential for security monitoring, troubleshooting, and compliance. This project uses Grafana Loki - a cost-effective, scalable log aggregation system.
Project Overview
What you'll build:
- Centralized log collection infrastructure
- Log aggregation with Grafana Loki
- Log shipping with Promtail agents
- Dashboards and alerting in Grafana
Time to complete: 3-5 hours
Why Loki?
Compared to alternatives like Elasticsearch:
- Lower resource requirements - Indexes only metadata, not full text
- Cost-effective storage - Uses object storage efficiently
- Native Grafana integration - Single pane of glass
- Familiar query language - LogQL similar to PromQL
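To make the first point concrete: Loki indexes only a small set of labels per stream, not the log text itself. Queries first select streams by label, then scan their contents. An illustrative LogQL query (the label names here are examples):

```logql
{host="web01", job="nginx"} |= "timeout"
```

The label matcher narrows the index lookup; the `|=` filter is a brute-force scan over only the matching streams, which is what keeps Loki's index small and its storage cheap.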
Architecture
```
┌────────────────────────────────────────────────────────────┐
│                    Log Management System                   │
│                                                            │
│  ┌─────────────┐   ┌─────────────┐   ┌─────────────┐       │
│  │   Server    │   │   Server    │   │  Container  │       │
│  │  Promtail   │   │  Promtail   │   │  Promtail   │       │
│  └──────┬──────┘   └──────┬──────┘   └──────┬──────┘       │
│         │                 │                 │              │
│         └─────────────────┼─────────────────┘              │
│                           ▼                                │
│                   ┌───────────────┐                        │
│                   │     Loki      │                        │
│                   │   (Storage)   │                        │
│                   └───────┬───────┘                        │
│                           │                                │
│                           ▼                                │
│                   ┌───────────────┐                        │
│                   │    Grafana    │                        │
│                   │  (Dashboard)  │                        │
│                   └───────────────┘                        │
└────────────────────────────────────────────────────────────┘
```
Prerequisites
- Docker and Docker Compose
- Linux server(s) for log sources
- 4GB+ RAM for Loki server
- Storage for log retention
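A quick sanity check of these prerequisites can be scripted. This sketch assumes a Linux host and only reports findings rather than enforcing them:

```shell
# Report whether the Docker tooling is on PATH
for tool in docker docker-compose; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: ok"
  else
    echo "$tool: missing"
  fi
done

# Report total RAM (the Loki server should have 4GB+)
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
echo "RAM: $((mem_kb / 1024)) MB"
```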
Part 1: Deploy Loki Stack
Create Project Structure
```shell
mkdir -p /opt/loki/{config,data}
cd /opt/loki
```
Loki Configuration
Create config/loki-config.yaml:
```yaml
auth_enabled: false

server:
  http_listen_port: 3100
  grpc_listen_port: 9096

common:
  instance_addr: 127.0.0.1
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

query_range:
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 100

schema_config:
  configs:
    - from: 2020-10-24
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

ruler:
  alertmanager_url: http://alertmanager:9093

limits_config:
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  ingestion_rate_mb: 16
  ingestion_burst_size_mb: 24
```
Promtail Configuration
Create config/promtail-config.yaml:
```yaml
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
  # System logs
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: varlogs
          __path__: /var/log/*log

  # Auth logs
  - job_name: auth
    static_configs:
      - targets:
          - localhost
        labels:
          job: auth
          __path__: /var/log/auth.log

  # Journal logs
  - job_name: journal
    journal:
      max_age: 12h
      labels:
        job: systemd-journal
    relabel_configs:
      - source_labels: ['__journal__systemd_unit']
        target_label: 'unit'
```
Docker Compose
Create docker-compose.yml:
```yaml
version: '3.8'

services:
  loki:
    image: grafana/loki:2.9.0
    container_name: loki
    ports:
      - "127.0.0.1:3100:3100"
    volumes:
      - ./config/loki-config.yaml:/etc/loki/local-config.yaml
      - ./data:/loki
    command: -config.file=/etc/loki/local-config.yaml
    restart: unless-stopped

  promtail:
    image: grafana/promtail:2.9.0
    container_name: promtail
    volumes:
      - ./config/promtail-config.yaml:/etc/promtail/config.yml
      - /var/log:/var/log:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
    command: -config.file=/etc/promtail/config.yml
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    ports:
      - "3000:3000"
    volumes:
      - grafana-data:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=changeme
      - GF_USERS_ALLOW_SIGN_UP=false
    restart: unless-stopped

volumes:
  grafana-data:
```
Start Services
```shell
docker-compose up -d
```
Part 2: Configure Grafana
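Before adding the data source, it is worth confirming that Loki is up and accepting writes. A minimal smoke test, assuming the compose stack above is running on the same host (the `smoke-test` job name is arbitrary):

```shell
# Loki answers "ready" on this endpoint once startup completes:
#   curl -s http://127.0.0.1:3100/ready

# Build a log entry in the push-API format: JSON streams holding
# [nanosecond-timestamp, line] pairs.
ts="$(date +%s%N)"
payload=$(printf '{"streams":[{"stream":{"job":"smoke-test"},"values":[["%s","hello from loki"]]}]}' "$ts")
echo "$payload"

# Send it, then query it back:
#   curl -s -H 'Content-Type: application/json' -X POST -d "$payload" http://127.0.0.1:3100/loki/api/v1/push
#   curl -s -G http://127.0.0.1:3100/loki/api/v1/query_range --data-urlencode 'query={job="smoke-test"}'
```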
Add Loki Data Source
- Login to Grafana (http://localhost:3000)
- Go to Configuration > Data Sources
- Add data source > Loki
- URL: http://loki:3100
- Click Save & Test
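The UI steps above can also be automated with Grafana's datasource provisioning. A sketch, assuming the file is mounted into the container at /etc/grafana/provisioning/datasources/loki.yaml:

```yaml
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100
    isDefault: true
```

Grafana loads provisioning files at startup, so the data source survives container rebuilds without any manual clicks.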
Create Log Dashboard
Import or create dashboard with panels:
```json
{
  "panels": [
    {
      "title": "Log Volume",
      "type": "timeseries",
      "targets": [
        {
          "expr": "sum(rate({job=~\".+\"}[5m])) by (job)",
          "legendFormat": "{{job}}"
        }
      ]
    },
    {
      "title": "Error Logs",
      "type": "logs",
      "targets": [
        {
          "expr": "{job=~\".+\"} |~ \"(?i)error|fail|critical\""
        }
      ]
    }
  ]
}
```
Part 3: Deploy Remote Promtail Agents
Standalone Promtail Installation
On each server to monitor:
```shell
# Download Promtail
wget https://github.com/grafana/loki/releases/download/v2.9.0/promtail-linux-amd64.zip
unzip promtail-linux-amd64.zip
sudo mv promtail-linux-amd64 /usr/local/bin/promtail
sudo chmod a+x /usr/local/bin/promtail
```
Remote Promtail Config
```yaml
# /etc/promtail/config.yaml
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /var/lib/promtail/positions.yaml

clients:
  - url: http://loki-server:3100/loki/api/v1/push

scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: varlogs
          host: ${HOSTNAME}  # requires running promtail with -config.expand-env=true
          __path__: /var/log/*.log

  - job_name: nginx
    static_configs:
      - targets:
          - localhost
        labels:
          job: nginx
          host: ${HOSTNAME}
          __path__: /var/log/nginx/*.log
    pipeline_stages:
      - regex:
          expression: '^(?P<remote_addr>[\w\.]+) - (?P<remote_user>[^ ]*) \[(?P<time_local>[^\]]*)\] "(?P<request>[^"]*)" (?P<status>\d+) (?P<body_bytes_sent>\d+)'
      - labels:
          status:
          remote_addr:
```

Note: the `labels:` stage turns `status` and `remote_addr` into index labels. Client IPs are high-cardinality, so on busy sites consider dropping `remote_addr` from the labels and extracting it at query time instead.

Systemd Service
```ini
# /etc/systemd/system/promtail.service
[Unit]
Description=Promtail
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/promtail -config.file=/etc/promtail/config.yaml
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```

```shell
sudo systemctl daemon-reload
sudo systemctl enable --now promtail
```
Part 4: LogQL Queries
Basic Queries
```logql
# All logs from a job
{job="varlogs"}

# Filter by text
{job="auth"} |= "Failed password"

# Regex filter
{job="nginx"} |~ "status=(4|5)\\d\\d"

# Exclude pattern
{job="varlogs"} != "DEBUG"

# Multiple filters
{job="auth"} |= "sshd" |= "Failed" != "invalid user"
```
Parsing and Extraction
```logql
# JSON parsing
{job="app"} | json | status >= 400

# Regex extraction
{job="nginx"} | regexp `(?P<ip>\d+\.\d+\.\d+\.\d+)` | ip != ""

# Pattern parsing
{job="auth"} | pattern `<_> <_> <_> <host> <_>: <message>` | host="server01"
```
Metrics from Logs
```logql
# Rate of errors
rate({job="app"} |= "error" [5m])

# Count by label
sum by (status) (count_over_time({job="nginx"} | json [1h]))

# Top IPs
topk(10, sum by (ip) (count_over_time({job="nginx"} | pattern `<ip> - -` [1h])))
```
Part 5: Security Monitoring
Alert Rules
Create alert rules for the Loki ruler. With the filesystem setup from Part 1, rules live under the mounted rules directory, e.g. ./data/rules/fake/rules.yaml on the host (Loki uses the tenant ID fake when auth is disabled):

```yaml
groups:
  - name: security
    rules:
      - alert: BruteForceAttempt
        expr: |
          sum(rate({job="auth"} |= "Failed password" [5m])) > 5
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "Possible brute force attack detected"

      - alert: PrivilegeEscalation
        expr: |
          count_over_time({job="auth"} |= "sudo" |= "COMMAND" [5m]) > 20
        for: 1m
        labels:
          severity: high
        annotations:
          summary: "Unusual sudo activity detected"
```
Security Dashboard Queries
```logql
# Failed SSH attempts
{job="auth"} |= "sshd" |= "Failed"

# Successful SSH logins
{job="auth"} |= "sshd" |= "Accepted"

# Sudo commands
{job="auth"} |= "sudo" |= "COMMAND"

# Web application errors
{job="nginx"} |~ "status=\"5\\d\\d\""
```
Part 6: Retention and Performance
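Retention in Loki is enforced by the compactor. Besides a single global period, limits_config also accepts per-stream overrides, which is useful when some logs (auth, audit) must outlive the rest. A sketch; the selector and periods are examples to adjust:

```yaml
limits_config:
  retention_period: 744h        # global default: 31 days
  retention_stream:
    - selector: '{job="auth"}'  # keep auth logs longer for compliance
      priority: 1
      period: 2160h             # 90 days
```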
Configure Retention
```yaml
# In loki-config.yaml
limits_config:
  retention_period: 744h  # 31 days

compactor:
  working_directory: /loki/compactor
  retention_enabled: true
  retention_delete_delay: 2h
  retention_delete_worker_count: 150
```
Performance Tuning
```yaml
# For high-volume environments
ingester:
  chunk_idle_period: 30m
  max_chunk_age: 1h
  chunk_target_size: 1572864

querier:
  max_concurrent: 20

query_scheduler:
  max_outstanding_requests_per_tenant: 2048
```
Troubleshooting
Common Issues
Promtail not shipping logs:
```shell
# Check Promtail status
curl http://localhost:9080/ready

# View discovered scrape targets
curl http://localhost:9080/targets

# Confirm Loki itself is accepting traffic
curl http://localhost:3100/ready
```
Query performance:
- Use specific label filters
- Limit time ranges
- Use line filters before parsing
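As a concrete example of those three rules, the queries below return the same lines but differ sharply in cost (label values are illustrative):

```logql
# Slow: matches every stream, parses every line before filtering
{job=~".+"} | json | level="error"

# Faster: narrow by label, apply a cheap line filter first, then parse
{job="app"} |= "error" | json | level="error"
```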
Storage growing:
- Enable retention
- Run compaction
- Archive old logs
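The archive step can be a simple cron job. A sketch assuming the filesystem layout from Part 1; DATA_DIR and the 31-day cutoff are assumptions to adjust:

```shell
# Archive chunk files untouched for 31+ days, then delete the originals.
DATA_DIR="${DATA_DIR:-/opt/loki/data}"
archive="/tmp/loki-archive-$(date +%F).tar.gz"

# Collect candidates first so an empty match is handled gracefully.
old_files=$(find "$DATA_DIR/chunks" -type f -mtime +31 2>/dev/null)
if [ -n "$old_files" ]; then
  printf '%s\n' "$old_files" | tar -czf "$archive" --files-from=- --remove-files
  echo "archived to $archive"
else
  echo "no chunk files older than 31 days"
fi
```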
Checklist
- Loki deployed and healthy
- Promtail agents on all servers
- Grafana connected to Loki
- Dashboards created
- Alert rules configured
- Retention policy set
- Backup strategy defined
Conclusion
A centralized log management system provides essential visibility for security and operations. Grafana Loki offers a cost-effective, scalable solution that integrates seamlessly with existing Grafana deployments.
Last updated: January 2026