Metrics
Metrics provide quantitative measurements of your AI applications’ performance and behavior.
Overview
The Metrics collection in Observability provides a way to collect, store, and analyze metrics from your AI applications. Metrics can:
- Measure application performance and health
- Track resource usage and capacity
- Monitor business KPIs
- Support alerting and dashboards
Key Features
- Collection: Collect metrics from various sources
- Storage: Store metrics efficiently
- Visualization: Visualize metrics in charts and dashboards
- Alerting: Set up alerts based on metric thresholds
Metric Types
Observability.do supports various metric types:
Counter
A cumulative metric that represents a single monotonically increasing counter:
// Example counter metric
{
name: 'api_requests_total',
description: 'Total number of API requests',
type: 'counter',
unit: 'requests',
labels: ['service', 'endpoint', 'method', 'status_code']
}
Gauge
A metric that represents a single numerical value that can arbitrarily go up and down:
// Example gauge metric
{
name: 'memory_usage',
description: 'Current memory usage',
type: 'gauge',
unit: 'bytes',
labels: ['service', 'instance']
}
Histogram
A metric that samples observations and counts them in configurable buckets:
// Example histogram metric
{
name: 'request_duration_seconds',
description: 'Request duration in seconds',
type: 'histogram',
unit: 'seconds',
labels: ['service', 'endpoint', 'method'],
buckets: [0.01, 0.05, 0.1, 0.5, 1, 5, 10]
}
Summary
Similar to a histogram, a summary samples observations and provides quantile statistics:
// Example summary metric
{
name: 'request_latency_seconds',
description: 'Request latency in seconds',
type: 'summary',
unit: 'seconds',
labels: ['service', 'endpoint', 'method'],
quantiles: [0.5, 0.9, 0.95, 0.99]
}
Metric Categories
Observability.do supports various metric categories:
System Metrics
Measure system resources and performance:
// Example system metrics
const systemMetrics = [
{
name: 'cpu_usage_percent',
description: 'CPU usage percentage',
type: 'gauge',
unit: 'percent',
labels: ['service', 'instance'],
},
{
name: 'memory_usage_bytes',
description: 'Memory usage in bytes',
type: 'gauge',
unit: 'bytes',
labels: ['service', 'instance'],
},
{
name: 'disk_usage_bytes',
description: 'Disk usage in bytes',
type: 'gauge',
unit: 'bytes',
labels: ['service', 'instance', 'device'],
},
{
name: 'network_received_bytes',
description: 'Network bytes received',
type: 'counter',
unit: 'bytes',
labels: ['service', 'instance', 'interface'],
},
{
name: 'network_sent_bytes',
description: 'Network bytes sent',
type: 'counter',
unit: 'bytes',
labels: ['service', 'instance', 'interface'],
},
]
Application Metrics
Measure application performance and behavior:
// Example application metrics
const applicationMetrics = [
{
name: 'http_requests_total',
description: 'Total number of HTTP requests',
type: 'counter',
unit: 'requests',
labels: ['service', 'endpoint', 'method', 'status_code'],
},
{
name: 'http_request_duration_seconds',
description: 'HTTP request duration in seconds',
type: 'histogram',
unit: 'seconds',
labels: ['service', 'endpoint', 'method'],
buckets: [0.01, 0.05, 0.1, 0.5, 1, 5, 10],
},
{
name: 'active_users',
description: 'Number of active users',
type: 'gauge',
unit: 'users',
labels: ['service'],
},
{
name: 'error_rate',
description: 'Rate of errors',
type: 'gauge',
unit: 'errors/second',
labels: ['service', 'endpoint'],
},
]
Business Metrics
Measure business KPIs:
// Example business metrics
const businessMetrics = [
{
name: 'conversion_rate',
description: 'Conversion rate',
type: 'gauge',
unit: 'percent',
labels: ['funnel', 'step'],
},
{
name: 'revenue',
description: 'Revenue',
type: 'counter',
unit: 'USD',
labels: ['product', 'region'],
},
{
name: 'active_subscriptions',
description: 'Number of active subscriptions',
type: 'gauge',
unit: 'subscriptions',
labels: ['plan', 'region'],
},
{
name: 'customer_satisfaction',
description: 'Customer satisfaction score',
type: 'gauge',
unit: 'score',
labels: ['product', 'region'],
},
]
AI-Specific Metrics
Measure AI model performance and usage:
// Example AI-specific metrics
const aiMetrics = [
{
name: 'model_inference_duration_seconds',
description: 'Model inference duration in seconds',
type: 'histogram',
unit: 'seconds',
labels: ['model', 'version', 'endpoint'],
buckets: [0.01, 0.05, 0.1, 0.5, 1, 5, 10],
},
{
name: 'model_token_usage',
description: 'Number of tokens used',
type: 'counter',
unit: 'tokens',
labels: ['model', 'type'],
},
{
name: 'model_accuracy',
description: 'Model accuracy',
type: 'gauge',
unit: 'percent',
labels: ['model', 'version', 'dataset'],
},
{
name: 'model_calls_total',
description: 'Total number of model calls',
type: 'counter',
unit: 'calls',
labels: ['model', 'version', 'endpoint'],
},
]
Collecting Metrics
Collect metrics using the Observability.do API:
// Import the metrics client
import { metrics } from '@drivly/observability'
// Configure the metrics client
metrics.configure({
service: 'user-service',
environment: 'production',
})
// Increment a counter
metrics.counter('api_requests_total').inc({
labels: {
endpoint: '/api/users',
method: 'GET',
status_code: '200',
},
})
// Set a gauge value
metrics.gauge('memory_usage').set(1024 * 1024 * 100, {
labels: {
instance: 'server-1',
},
})
// Record a histogram value
metrics.histogram('request_duration_seconds').observe(0.25, {
labels: {
endpoint: '/api/users',
method: 'GET',
},
})
// Use a timer
const timer = metrics.histogram('request_duration_seconds').startTimer({
labels: {
endpoint: '/api/users',
method: 'GET',
},
})
// ... perform the operation
timer.end()
// Track function execution time
const result = await metrics.withTimer(
async () => {
// Function implementation...
return { success: true }
},
{
metric: 'function_duration_seconds',
labels: {
function: 'processUserRequest',
},
},
)
// Use middleware for HTTP requests
app.use(metrics.middleware.http())
// Use middleware for database queries
const db = metrics.wrapDatabase(database)
Querying Metrics
Query metrics using the Observability.do API:
// Query a metric
const data = await observability.metrics.query({
name: 'http_request_duration_seconds',
aggregation: 'avg',
timeRange: {
start: '2023-06-01T00:00:00Z',
end: '2023-06-30T23:59:59Z',
},
step: '1h',
filters: {
service: 'user-service',
endpoint: '/api/users',
},
})
// Query multiple metrics
const multiData = await observability.metrics.queryBatch(
[
{
name: 'http_request_duration_seconds',
aggregation: 'avg',
filters: {
service: 'user-service',
},
},
{
name: 'error_rate',
filters: {
service: 'user-service',
},
},
],
{
timeRange: {
start: '2023-06-01T00:00:00Z',
end: '2023-06-30T23:59:59Z',
},
step: '1h',
},
)
// Calculate derived metrics
const derivedData = await observability.metrics.calculate({
expression: 'http_requests_total{status_code="500"} / http_requests_total * 100',
timeRange: {
start: '2023-06-01T00:00:00Z',
end: '2023-06-30T23:59:59Z',
},
step: '1h',
})
Metric Visualization
Visualize metrics using the Observability.do dashboard:
// Create a chart
const chart = await observability.metrics.createChart({
title: 'HTTP Request Duration',
description: 'Average HTTP request duration over time',
type: 'line',
metrics: [
{
name: 'http_request_duration_seconds',
aggregation: 'avg',
filters: {
service: 'user-service',
},
groupBy: ['endpoint'],
},
],
timeRange: {
relative: '7d',
},
})
// Create a dashboard
const dashboard = await observability.metrics.createDashboard({
title: 'User Service Dashboard',
description: 'Dashboard for the User Service',
charts: [
{
id: 'request-duration',
position: { x: 0, y: 0, w: 6, h: 4 },
},
{
id: 'error-rate',
position: { x: 6, y: 0, w: 6, h: 4 },
},
{
id: 'active-users',
position: { x: 0, y: 4, w: 12, h: 4 },
},
],
})
Metric Management
Manage your metrics through the dashboard or API:
// Register a metric
await observability.metrics.register({
name: 'new_metric',
description: 'A new metric',
type: 'counter',
unit: 'count',
labels: ['service', 'endpoint'],
})
// Update a metric
await observability.metrics.update('new_metric', {
description: 'Updated description',
labels: ['service', 'endpoint', 'method'],
})
// Configure metric retention
await observability.metrics.configureRetention({
default: {
duration: '90d',
resolution: {
raw: '24h',
'1m': '7d',
'5m': '30d',
'1h': '90d',
},
},
metrics: {
http_requests_total: {
duration: '365d',
resolution: {
raw: '24h',
'1m': '7d',
'5m': '30d',
'1h': '90d',
'1d': '365d',
},
},
},
})