Distributed log processing systems are essential for modern application monitoring and analysis. A common approach involves using Filebeat for log collection, Kafka as a message buffer, Logstash for transformation, Elasticsearch for storage, and Kibana for visualization. Grafana can also integrate with Elasticsearch for real-time monitoring dashboards.
Filebeat is deployed on application servers because it is lightweight, handling only log reading and forwarding with minimal resource contention. Logstash, Elasticsearch, and Kibana typically run on dedicated servers; Logstash's filter stage is CPU-intensive, so filter configurations should be kept efficient.
Common Architecture Patterns
Direct Filebeat to Elasticsearch Integration
Filebeat sends logs directly to Elasticsearch, with Kibana providing search and visualization capabilities.
Buffered Pipeline with Kafka
Multiple Filebeat instances forward logs to a Kafka cluster. One to three Logstash nodes consume from Kafka and output to an Elasticsearch cluster. This design ensures data persistence; if Logstash fails, logs remain in Kafka until processing resumes. Kibana serves as the front-end for log exploration.
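The buffered pipeline can be sketched with the following configuration fragments. The broker addresses, topic name, and consumer group ID are placeholders for illustration, not values from a specific deployment:

```yaml
# filebeat.yml -- ship logs to Kafka instead of Elasticsearch
# (broker hosts and topic name are illustrative)
output.kafka:
  hosts: ["kafka-1:9092", "kafka-2:9092", "kafka-3:9092"]
  topic: "app-logs"
  required_acks: 1
  compression: gzip
```

```
# logstash.conf -- consume from Kafka and index into Elasticsearch
input {
  kafka {
    bootstrap_servers => "kafka-1:9092,kafka-2:9092,kafka-3:9092"
    topics => ["app-logs"]
    group_id => "logstash-consumers"
    codec => "json"
  }
}
output {
  elasticsearch {
    hosts => ["elasticsearch-host:9200"]
    index => "app-logs-%{+YYYY.MM.dd}"
  }
}
```

Because the consumer group's offsets are tracked in Kafka, a restarted Logstash resumes from where it left off, which is what makes the buffer safe against processing outages.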
Filebeat Configuration and Deployment
Docker Deployment for Elasticsearch Output
Ensure that the Filebeat and Elasticsearch versions match, and configure logback.xml so the application writes its logs as JSON.
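One common way to emit JSON from Logback is the logstash-logback-encoder library. A minimal logback.xml sketch, assuming that dependency is on the classpath; the appender name and file paths are illustrative:

```xml
<!-- logback.xml: write logs as JSON lines using logstash-logback-encoder -->
<configuration>
  <appender name="JSON_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>/app/logs/service-a/app.log</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
      <fileNamePattern>/app/logs/service-a/app.%d{yyyy-MM-dd}.log</fileNamePattern>
      <maxHistory>7</maxHistory>
    </rollingPolicy>
    <!-- each log event becomes one JSON object per line -->
    <encoder class="net.logstash.logback.encoder.LogstashEncoder"/>
  </appender>
  <root level="INFO">
    <appender-ref ref="JSON_FILE"/>
  </root>
</configuration>
```

With one JSON object per line, Filebeat's json options or the decode_json_fields processor can lift the log fields into the event without extra Logstash grok parsing.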
docker run --privileged --name filebeat --net=host -d -m 1000M \
--log-driver json-file --log-opt max-size=1024m \
-v /config/filebeat.yml:/usr/share/filebeat/filebeat.yml \
-v /local/logs:/app/logs \
-v /filebeat/data:/data \
registry.example.com/filebeat:7.10.0
Example filebeat.yml configuration:
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /app/logs/service-a/*.log
    - /app/logs/service-b/*.log
  ignore_older: 12h
  clean_inactive: 14h
  tags: ["primary-logs"]
- type: log
  enabled: true
  paths:
    - /app/logs/service-c/*.log
  ignore_older: 12h
  clean_inactive: 14h
  tags: ["secondary-logs"]
  json.keys_under_root: true
  json.overwrite_keys: true
setup.ilm.enabled: false
setup.template.name: "app-logs"
setup.template.pattern: "app-logs-*"
setup.template.enabled: false
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 2
  index.number_of_replicas: 1
  index.codec: best_compression

output.elasticsearch:
  hosts: ["elasticsearch-host:9200"]
  indices:
    - index: "app-logs-primary-%{+yyyy.MM.dd}"
      when.contains:
        tags: "primary-logs"
    - index: "app-logs-secondary-%{+yyyy.MM.dd}"
      when.contains:
        tags: "secondary-logs"

processors:
  - decode_json_fields:
      fields: ["message"]
      target: ""
      overwrite_keys: true
  - rename:
      fields:
        - from: "exception"
          to: "app_exception"
  - drop_fields:
      fields: ["beat", "host", "input", "agent"]
Alternative Elasticsearch output configuration. Note that when the default index name is overridden, setup.template.name and setup.template.pattern must also be configured:
output.elasticsearch:
  hosts: ["es-node:9200"]
  # Fallback index for events matching no condition below;
  # a write index name must not contain a wildcard.
  index: "logs-%{[fields.log_source]}-%{+yyyy.MM.dd}"
  indices:
    - index: "logs-web-%{+yyyy.MM.dd}"
      when.equals:
        fields.log_source: "web_server"
    - index: "logs-app-%{+yyyy.MM.dd}"
      when.equals:
        fields.log_source: "application"
Multiline Log Aggregation
To handle Java exception stack traces, add a multiline configuration to the relevant input, so that continuation lines (indented "at ..." frames, "... N more", and "Caused by:" lines) are appended to the preceding event instead of becoming separate events:
multiline:
  pattern: '^[[:space:]]+(at|\.{3})\b|^Caused by:'
  negate: false
  match: after
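In context, the multiline options sit at the same level as the input's other settings. A sketch, using an illustrative path:

```yaml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /app/logs/service-a/*.log   # illustrative path
  multiline:
    pattern: '^[[:space:]]+(at|\.{3})\b|^Caused by:'
    negate: false
    match: after
```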
Custom Elasticsearch Index Templates
Enable custom field mappings:
setup.template.json.enabled: true
setup.template.json.path: "/usr/share/filebeat/index_template.json"
setup.template.json.name: "custom_template"
Add volume mounts for template files:
-v /config/fields.yml:/usr/share/filebeat/fields.yml
-v /config/index_template.json:/usr/share/filebeat/index_template.json
Example fields.yml:
- key: app-logs
  title: Application Logs
  description: "Custom log schema for application monitoring"
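The referenced index_template.json carries the actual settings and mappings. A minimal sketch in the legacy template format that Filebeat 7.x loads; the field names below are illustrative, not a required schema:

```json
{
  "index_patterns": ["app-logs-*"],
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1,
    "index.codec": "best_compression"
  },
  "mappings": {
    "properties": {
      "@timestamp":    { "type": "date" },
      "level":         { "type": "keyword" },
      "app_exception": { "type": "text" },
      "message":       { "type": "text" }
    }
  }
}
```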