Prometheus has become the de facto standard for metrics collection in modern Java-based microservices. Its pull-based architecture, dimensional data model, and expressive query language make it ideal for observability in dynamic, containerized environments.
Core Architecture Overview
Prometheus operates by periodically scraping HTTP endpoints exposed by instrumented applications. Unlike push-based systems, this design simplifies service discovery and avoids reliance on intermediary agents or collectors.
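To make the pull model concrete, the following is a minimal sketch of what a single scrape amounts to: an HTTP GET against the metrics endpoint, followed by parsing the plain-text response into samples. The class name ScrapeSketch and the default URL are illustrative assumptions, and the parser deliberately ignores optional timestamps; the real Prometheus server does considerably more.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of any Prometheus library.
final class ScrapeSketch {

    // Parses lines of the form "<series> <value>" into a map.
    // Comment lines (# HELP, # TYPE) and blank lines are skipped.
    // Assumption: samples carry no trailing timestamp.
    static Map<String, Double> parse(String body) {
        Map<String, Double> samples = new LinkedHashMap<>();
        for (String raw : body.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            // The value follows the last space; label values may contain spaces.
            int split = line.lastIndexOf(' ');
            if (split < 0) {
                continue;
            }
            samples.put(line.substring(0, split), parseValue(line.substring(split + 1)));
        }
        return samples;
    }

    // The text format writes infinities as +Inf and -Inf, which Double.parseDouble rejects.
    static double parseValue(String text) {
        switch (text) {
            case "+Inf": return Double.POSITIVE_INFINITY;
            case "-Inf": return Double.NEGATIVE_INFINITY;
            default: return Double.parseDouble(text);
        }
    }

    // Performs one scrape: a plain HTTP GET that returns the response body.
    static String fetch(String url) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        // Assumed default target; pass another URL as the first argument.
        String url = args.length > 0 ? args[0] : "http://localhost:8080/metrics";
        parse(fetch(url)).forEach((series, value) -> System.out.println(series + " = " + value));
    }
}
```

Because the application only has to answer GET requests, it needs no knowledge of where the Prometheus server runs.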
Minimal Setup with Official Client Libraries
Begin by adding the Prometheus Java client (simpleclient) dependencies. The examples in this guide use version 0.16.0; the later 1.x client is published under different artifact names with a different API, so it is not a drop-in upgrade for these snippets:
<dependency>
    <groupId>io.prometheus</groupId>
    <artifactId>simpleclient</artifactId>
    <version>0.16.0</version>
</dependency>
<dependency>
    <groupId>io.prometheus</groupId>
    <artifactId>simpleclient_httpserver</artifactId>
    <version>0.16.0</version>
</dependency>
<dependency>
    <groupId>io.prometheus</groupId>
    <artifactId>simpleclient_hotspot</artifactId>
    <version>0.16.0</version>
</dependency>
Exposing Metrics via Embedded HTTP Server
Initialize a lightweight metrics endpoint using HTTPServer. It is built on the JDK's own HTTP server, so no servlet container or web framework is required:
import io.prometheus.client.exporter.HTTPServer;
import io.prometheus.client.hotspot.DefaultExports;
import java.io.IOException;

public class MetricsEndpoint {
    public static HTTPServer start() throws IOException {
        // Register JVM-level metrics (GC, memory, threads) with the default registry
        DefaultExports.initialize();
        // Serve the default registry on port 8080. Metrics built with .register()
        // join the same registry, so they are exposed here as well.
        // The returned server can be closed on shutdown.
        return new HTTPServer(8080);
    }
}
Implementing Application-Specific Metrics
Define purpose-built metrics using semantic naming and appropriate types. For example, track request volume and latency separately:
import io.prometheus.client.Counter;
import io.prometheus.client.Histogram;

public class ServiceMetrics {
    // Counts processed requests, partitioned by endpoint and status code
    private static final Counter REQUEST_COUNT = Counter.build()
        .name("service_request_total")
        .help("Total number of processed requests")
        .labelNames("endpoint", "status")
        .register();

    // A Histogram exports the _bucket series that histogram_quantile() relies on
    private static final Histogram REQUEST_DURATION = Histogram.build()
        .name("service_request_duration_seconds")
        .help("Latency distribution of requests in seconds")
        .labelNames("endpoint", "status")
        .register();

    public static void recordRequest(String endpoint, String status, long durationNanos) {
        REQUEST_COUNT.labels(endpoint, status).inc();
        // Convert nanoseconds to seconds, the base unit Prometheus expects
        REQUEST_DURATION.labels(endpoint, status).observe(durationNanos / 1_000_000_000.0);
    }
}
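One way to feed recordRequest is to wrap each handler call in a small timing helper. The sketch below is an assumption about how a caller might do this, not part of the client library: TimedCall and Recorder are hypothetical names, and mapping any exception to status "500" is a simplification.

```java
import java.util.function.Supplier;

// Hypothetical timing wrapper for illustration.
final class TimedCall {

    // Callback shape chosen to match ServiceMetrics.recordRequest(String, String, long).
    interface Recorder {
        void record(String endpoint, String status, long durationNanos);
    }

    // Runs the work, measures elapsed time with the monotonic clock,
    // and reports the outcome even when the work throws.
    static <T> T timed(String endpoint, Recorder recorder, Supplier<T> work) {
        long start = System.nanoTime();
        String status = "500"; // assumed outcome until the work completes normally
        try {
            T result = work.get();
            status = "200";
            return result;
        } finally {
            recorder.record(endpoint, status, System.nanoTime() - start);
        }
    }
}
```

A call site could then look like TimedCall.timed("/orders", ServiceMetrics::recordRequest, () -> handle(request)), where handle is a placeholder for the application's own logic. System.nanoTime is used rather than wall-clock time because it is unaffected by clock adjustments.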
Scraping Configuration Example
In your prometheus.yml, set the scrape interval and list the targets to scrape. The job name is attached to every scraped series as the job label, which queries can later filter and aggregate on:
global:
  scrape_interval: 10s
scrape_configs:
  - job_name: 'java-service'
    metrics_path: '/metrics'
    static_configs:
      - targets: ['host.docker.internal:8080']
Effective PromQL Queries for Troubleshooting
Leverage PromQL to derive actionable insights. Common patterns include:
- Per-endpoint error rate (share of requests answered with a 5xx status):
  sum by (endpoint) (rate(service_request_total{status=~"5.."}[5m])) / sum by (endpoint) (rate(service_request_total[5m]))
- 95th percentile latency per endpoint:
  histogram_quantile(0.95, sum(rate(service_request_duration_seconds_bucket[1h])) by (le, endpoint))
- Targets that are currently down:
  up{job="java-service"} == 0
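The percentile query is an estimate, not an exact measurement: histogram_quantile locates the bucket that contains the requested rank and interpolates linearly inside it. The sketch below reproduces that arithmetic under simplifying assumptions (0 < q < 1, buckets sorted by upper bound with +Inf last, cumulative counts); QuantileSketch is a hypothetical name and the bucket values in the usage note are made up for illustration.

```java
// Hypothetical re-implementation of the interpolation idea, for illustration only.
final class QuantileSketch {

    // upperBounds: bucket upper bounds in ascending order, the last one being +Inf.
    // cumulativeCounts: observations less than or equal to each bound.
    static double quantile(double q, double[] upperBounds, double[] cumulativeCounts) {
        if (upperBounds.length < 2) {
            return Double.NaN; // interpolation needs at least one finite bucket
        }
        int last = upperBounds.length - 1;
        double total = cumulativeCounts[last];
        if (total == 0) {
            return Double.NaN; // no observations
        }
        double rank = q * total;

        // Find the first bucket whose cumulative count reaches the rank.
        int b = 0;
        while (b < last && cumulativeCounts[b] < rank) {
            b++;
        }
        if (b == last) {
            // The rank falls into the +Inf bucket: report the highest finite bound.
            return upperBounds[last - 1];
        }

        double lower = (b == 0) ? 0.0 : upperBounds[b - 1];
        double below = (b == 0) ? 0.0 : cumulativeCounts[b - 1];
        double inBucket = cumulativeCounts[b] - below;
        // Assume observations are spread evenly within the bucket.
        return lower + (upperBounds[b] - lower) * ((rank - below) / inBucket);
    }
}
```

With bounds {0.1, 0.5, 1.0, +Inf} and cumulative counts {50, 90, 99, 100}, the 95th percentile has rank 95, which lands in the (0.5, 1.0] bucket holding 9 observations; the estimate is 0.5 + 0.5 * (5 / 9), roughly 0.78 seconds. The accuracy therefore depends on how closely the bucket boundaries bracket the latencies of interest.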
Alerting Best Practices
Define alerts based on service-level objectives rather than infrastructure thresholds. Example alert rule for sustained high latency:
groups:
  - name: service-alerts
    rules:
      - alert: LatencySpike
        expr: histogram_quantile(0.99, sum(rate(service_request_duration_seconds_bucket[10m])) by (le, endpoint)) > 2.0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High latency detected on {{ $labels.endpoint }}"
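The for clause is what turns a momentary spike into a sustained condition: the alert stays pending while the expression holds and fires only once it has held continuously for the full duration. The sketch below models that behavior as a small state machine; ForClauseSketch and its method names are hypothetical, and it is a simplified model rather than Prometheus's actual rule evaluator.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical model of the inactive -> pending -> firing transitions.
final class ForClauseSketch {

    enum State { INACTIVE, PENDING, FIRING }

    private final Duration holdFor;
    private Instant activeSince; // null while the condition is false

    ForClauseSketch(Duration holdFor) {
        this.holdFor = holdFor;
    }

    // Called once per rule evaluation with the current result of the alert expression.
    State evaluate(boolean conditionTrue, Instant now) {
        if (!conditionTrue) {
            activeSince = null; // any false evaluation resets the timer
            return State.INACTIVE;
        }
        if (activeSince == null) {
            activeSince = now;
        }
        boolean heldLongEnough = Duration.between(activeSince, now).compareTo(holdFor) >= 0;
        return heldLongEnough ? State.FIRING : State.PENDING;
    }
}
```

With a hold duration of five minutes, a latency breach that clears after four minutes never leaves the pending state, which is why a longer for value trades notification speed for fewer false alarms.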
Grafana Dashboard Integration
Use Grafana’s built-in Prometheus data source to build dashboards. Key visualizations include:
- Time-series graphs showing rate(service_request_total[1m]) by status code
- Heatmaps of service_request_duration_seconds_bucket for latency distribution
- Single-stat panels for uptime (up) and JVM heap usage (jvm_memory_bytes_used{area="heap"})