Metrics

Metrics provide quantitative measurements of your AI applications’ performance and behavior.

Overview

The Metrics collection in Observability.do lets you collect, store, and analyze metrics from your AI applications. Use metrics to:

Measure application performance and health
Track resource usage and capacity
Monitor business KPIs
Support alerting and dashboards

Key Features

Collection: Collect metrics from various sources
Storage: Store metrics efficiently
Visualization: Visualize metrics in charts and dashboards
Alerting: Set up alerts based on metric thresholds

Metric Types

Observability.do supports four metric types: counter, gauge, histogram, and summary.

Counter

A cumulative metric whose value only ever increases, such as a running total of requests:


// Example counter metric
{
  name: 'api_requests_total',
  description: 'Total number of API requests',
  type: 'counter',
  unit: 'requests',
  labels: ['service', 'endpoint', 'method', 'status_code']
}
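
To make the "only ever increases" behavior concrete, here is a minimal in-memory sketch of a counter. The SketchCounter class and its inc and value methods are hypothetical names used for illustration; they are not the Observability.do implementation.

```typescript
// Hypothetical in-memory counter, for illustration only.
type Labels = Record<string, string>

class SketchCounter {
  private values: Record<string, number> = {}

  // Build a stable key so that label order does not matter.
  private key(labels: Labels): string {
    return Object.keys(labels)
      .sort()
      .map((k) => k + '=' + labels[k])
      .join(',')
  }

  // Counters only go up, so negative increments are rejected.
  inc(labels: Labels = {}, amount = 1): void {
    if (amount < 0) throw new RangeError('counter increments must be non-negative')
    const k = this.key(labels)
    this.values[k] = (this.values[k] || 0) + amount
  }

  // Current total for one label combination (0 if never incremented).
  value(labels: Labels = {}): number {
    return this.values[this.key(labels)] || 0
  }
}

const requests = new SketchCounter()
requests.inc({ method: 'GET', endpoint: '/api/users' })
requests.inc({ endpoint: '/api/users', method: 'GET' }, 2)
```

Both calls update the same series because the labels are sorted before the key is built, so requests.value({ method: 'GET', endpoint: '/api/users' }) returns 3.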

Gauge

A metric that represents a single numerical value that can arbitrarily go up and down:


// Example gauge metric
{
  name: 'memory_usage',
  description: 'Current memory usage',
  type: 'gauge',
  unit: 'bytes',
  labels: ['service', 'instance']
}

Histogram

A metric that samples observations and counts them in configurable buckets:


// Example histogram metric
{
  name: 'request_duration_seconds',
  description: 'Request duration in seconds',
  type: 'histogram',
  unit: 'seconds',
  labels: ['service', 'endpoint', 'method'],
  buckets: [0.01, 0.05, 0.1, 0.5, 1, 5, 10]
}
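
As a rough sketch of how bucketing works, the helper below counts observations into cumulative upper-bound buckets, the convention used by Prometheus-style histograms. Whether Observability.do stores its buckets cumulatively is an assumption, and bucketCounts is a hypothetical helper.

```typescript
// Hypothetical helper: count observations into cumulative "less than or equal" buckets.
// Assumes Prometheus-style semantics, where each bucket includes all smaller ones.
function bucketCounts(observations: number[], bounds: number[]): Record<string, number> {
  const sorted = bounds.slice().sort((a, b) => a - b)
  const counts: Record<string, number> = {}
  for (const bound of sorted) {
    counts[String(bound)] = observations.filter((v) => v <= bound).length
  }
  // The implicit +Inf bucket always equals the total number of observations.
  counts['+Inf'] = observations.length
  return counts
}

const bucketBounds = [0.01, 0.05, 0.1, 0.5, 1, 5, 10]
const observedDurations = [0.004, 0.03, 0.25, 0.25, 0.8, 12]
const histogramCounts = bucketCounts(observedDurations, bucketBounds)
```

With these six observations, the 0.5 bucket holds 4, the 10 bucket holds 5, and only the +Inf bucket captures the 12-second outlier. This is why the largest bound should sit above the slowest request you expect to see.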

Summary

Like a histogram, a summary samples observations, but it reports configurable quantiles (such as the median or the 99th percentile) rather than bucket counts:


// Example summary metric
{
  name: 'request_latency_seconds',
  description: 'Request latency in seconds',
  type: 'summary',
  unit: 'seconds',
  labels: ['service', 'endpoint', 'method'],
  quantiles: [0.5, 0.9, 0.95, 0.99]
}
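
To illustrate what the quantiles array asks for, the sketch below computes exact nearest-rank quantiles over a small in-memory sample. Production summaries usually rely on streaming estimators over a time window, so treat this as an explanation of the output rather than of how Observability.do computes it; nearestRankQuantile is a hypothetical helper.

```typescript
// Hypothetical helper: exact nearest-rank quantile over in-memory samples.
function nearestRankQuantile(samples: number[], q: number): number {
  if (samples.length === 0) throw new RangeError('no samples')
  if (q < 0 || q > 1) throw new RangeError('q must be in [0, 1]')
  const sorted = samples.slice().sort((a, b) => a - b)
  // Nearest rank: the smallest value with at least q * n samples at or below it.
  const rank = Math.max(1, Math.ceil(q * sorted.length))
  return sorted[rank - 1]
}

const latencySamples = [0.12, 0.08, 0.31, 0.09, 0.95, 0.11, 0.1, 0.2, 0.15, 0.4]
const latencySummary = [0.5, 0.9, 0.95, 0.99].map((q) => ({
  q,
  value: nearestRankQuantile(latencySamples, q),
}))
```

For this sample the median is 0.12 seconds while the 99th percentile is 0.95 seconds, which shows how a single slow request dominates the upper quantiles.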

Metric Categories

The examples below group metrics into four categories: system, application, business, and AI-specific.

System Metrics

Measure system resources and performance:


// Example system metrics
const systemMetrics = [
  {
    name: 'cpu_usage_percent',
    description: 'CPU usage percentage',
    type: 'gauge',
    unit: 'percent',
    labels: ['service', 'instance'],
  },
  {
    name: 'memory_usage_bytes',
    description: 'Memory usage in bytes',
    type: 'gauge',
    unit: 'bytes',
    labels: ['service', 'instance'],
  },
  {
    name: 'disk_usage_bytes',
    description: 'Disk usage in bytes',
    type: 'gauge',
    unit: 'bytes',
    labels: ['service', 'instance', 'device'],
  },
  {
    name: 'network_received_bytes',
    description: 'Network bytes received',
    type: 'counter',
    unit: 'bytes',
    labels: ['service', 'instance', 'interface'],
  },
  {
    name: 'network_sent_bytes',
    description: 'Network bytes sent',
    type: 'counter',
    unit: 'bytes',
    labels: ['service', 'instance', 'interface'],
  },
]

Application Metrics

Measure application performance and behavior:


// Example application metrics
const applicationMetrics = [
  {
    name: 'http_requests_total',
    description: 'Total number of HTTP requests',
    type: 'counter',
    unit: 'requests',
    labels: ['service', 'endpoint', 'method', 'status_code'],
  },
  {
    name: 'http_request_duration_seconds',
    description: 'HTTP request duration in seconds',
    type: 'histogram',
    unit: 'seconds',
    labels: ['service', 'endpoint', 'method'],
    buckets: [0.01, 0.05, 0.1, 0.5, 1, 5, 10],
  },
  {
    name: 'active_users',
    description: 'Number of active users',
    type: 'gauge',
    unit: 'users',
    labels: ['service'],
  },
  {
    name: 'error_rate',
    description: 'Rate of errors',
    type: 'gauge',
    unit: 'errors/second',
    labels: ['service', 'endpoint'],
  },
]

Business Metrics

Measure business KPIs:


// Example business metrics
const businessMetrics = [
  {
    name: 'conversion_rate',
    description: 'Conversion rate',
    type: 'gauge',
    unit: 'percent',
    labels: ['funnel', 'step'],
  },
  {
    name: 'revenue',
    description: 'Revenue',
    type: 'counter',
    unit: 'USD',
    labels: ['product', 'region'],
  },
  {
    name: 'active_subscriptions',
    description: 'Number of active subscriptions',
    type: 'gauge',
    unit: 'subscriptions',
    labels: ['plan', 'region'],
  },
  {
    name: 'customer_satisfaction',
    description: 'Customer satisfaction score',
    type: 'gauge',
    unit: 'score',
    labels: ['product', 'region'],
  },
]

AI-Specific Metrics

Measure AI model performance and usage:


// Example AI-specific metrics
const aiMetrics = [
  {
    name: 'model_inference_duration_seconds',
    description: 'Model inference duration in seconds',
    type: 'histogram',
    unit: 'seconds',
    labels: ['model', 'version', 'endpoint'],
    buckets: [0.01, 0.05, 0.1, 0.5, 1, 5, 10],
  },
  {
    name: 'model_token_usage',
    description: 'Number of tokens used',
    type: 'counter',
    unit: 'tokens',
    labels: ['model', 'type'],
  },
  {
    name: 'model_accuracy',
    description: 'Model accuracy',
    type: 'gauge',
    unit: 'percent',
    labels: ['model', 'version', 'dataset'],
  },
  {
    name: 'model_calls_total',
    description: 'Total number of model calls',
    type: 'counter',
    unit: 'calls',
    labels: ['model', 'version', 'endpoint'],
  },
]

Collecting Metrics

Collect metrics with the metrics client from the @drivly/observability package:


// Import the metrics client
import { metrics } from '@drivly/observability'
 
// Configure the metrics client
metrics.configure({
  service: 'user-service',
  environment: 'production',
})
 
// Increment a counter
metrics.counter('api_requests_total').inc({
  labels: {
    endpoint: '/api/users',
    method: 'GET',
    status_code: '200',
  },
})
 
// Set a gauge value
metrics.gauge('memory_usage').set(1024 * 1024 * 100, {
  labels: {
    instance: 'server-1',
  },
})
 
// Record a histogram value
metrics.histogram('request_duration_seconds').observe(0.25, {
  labels: {
    endpoint: '/api/users',
    method: 'GET',
  },
})
 
// Use a timer
const timer = metrics.histogram('request_duration_seconds').startTimer({
  labels: {
    endpoint: '/api/users',
    method: 'GET',
  },
})
// ... perform the operation
timer.end()
 
// Track function execution time
const result = await metrics.withTimer(
  async () => {
    // Function implementation...
    return { success: true }
  },
  {
    metric: 'function_duration_seconds',
    labels: {
      function: 'processUserRequest',
    },
  },
)
 
// Use middleware for HTTP requests (assumes app is an Express-style application)
app.use(metrics.middleware.http())

// Wrap a database client to instrument its queries (database is your existing client)
const db = metrics.wrapDatabase(database)
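
The timer pattern above can be pictured with a small standalone sketch. The startSketchTimer function, its observe callback, and the injectable clock are all assumptions made for illustration; the real startTimer may behave differently, for example in how it handles repeated end() calls.

```typescript
// Hypothetical sketch of a histogram timer; names and behavior are assumptions.
type Observe = (seconds: number) => void

function startSketchTimer(observe: Observe, now: () => number = () => Date.now()) {
  const startedAt = now()
  let ended = false
  return {
    // Record the elapsed time once, in seconds, and return it.
    end(): number {
      if (ended) throw new Error('timer already ended')
      ended = true
      const seconds = (now() - startedAt) / 1000
      observe(seconds)
      return seconds
    },
  }
}

// Usage with a fake clock so the result is deterministic.
const recordedSeconds: number[] = []
let fakeNow = 1000
const sketchTimer = startSketchTimer(
  (seconds) => {
    recordedSeconds.push(seconds)
  },
  () => fakeNow,
)
fakeNow += 250 // pretend the operation took 250 ms
const elapsedSeconds = sketchTimer.end()
```

Converting to seconds at the point of recording keeps the value consistent with the _seconds suffix and the bucket boundaries used in the histogram definitions.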

Querying Metrics

Query metrics using the Observability.do API:


// Query a metric (assumes observability is an already-initialized Observability.do client)
const data = await observability.metrics.query({
  name: 'http_request_duration_seconds',
  aggregation: 'avg',
  timeRange: {
    start: '2023-06-01T00:00:00Z',
    end: '2023-06-30T23:59:59Z',
  },
  step: '1h',
  filters: {
    service: 'user-service',
    endpoint: '/api/users',
  },
})
 
// Query multiple metrics
const multiData = await observability.metrics.queryBatch(
  [
    {
      name: 'http_request_duration_seconds',
      aggregation: 'avg',
      filters: {
        service: 'user-service',
      },
    },
    {
      name: 'error_rate',
      filters: {
        service: 'user-service',
      },
    },
  ],
  {
    timeRange: {
      start: '2023-06-01T00:00:00Z',
      end: '2023-06-30T23:59:59Z',
    },
    step: '1h',
  },
)
 
// Calculate derived metrics
const derivedData = await observability.metrics.calculate({
  expression: 'http_requests_total{status_code="500"} / http_requests_total * 100',
  timeRange: {
    start: '2023-06-01T00:00:00Z',
    end: '2023-06-30T23:59:59Z',
  },
  step: '1h',
})
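
Because counters are cumulative, dividing one raw total by another gives an all-time ratio rather than a per-interval one. How Observability.do evaluates the expression above is not specified here, so the following is only a sketch of one way to derive a per-step error percentage from two cumulative series with aligned timestamps. The Sample shape and the errorPercentPerStep helper are hypothetical.

```typescript
// Hypothetical sketch: per-step error percentage from two cumulative counter series.
// Assumes both series share the same timestamps and never reset.
interface Sample {
  timestamp: string
  value: number
}

interface PercentPoint {
  timestamp: string
  percent: number | null
}

function errorPercentPerStep(errors: Sample[], total: Sample[]): PercentPoint[] {
  const points: PercentPoint[] = []
  const n = Math.min(errors.length, total.length)
  for (let i = 1; i < n; i++) {
    const errorDelta = errors[i].value - errors[i - 1].value
    const totalDelta = total[i].value - total[i - 1].value
    points.push({
      timestamp: total[i].timestamp,
      // No traffic in the step means the percentage is undefined, not zero.
      percent: totalDelta > 0 ? (errorDelta * 100) / totalDelta : null,
    })
  }
  return points
}

const hours = ['2023-06-01T00:00:00Z', '2023-06-01T01:00:00Z', '2023-06-01T02:00:00Z', '2023-06-01T03:00:00Z']
const totalSeries = [100, 300, 300, 500].map((value, i) => ({ timestamp: hours[i], value }))
const errorSeries = [1, 5, 5, 15].map((value, i) => ({ timestamp: hours[i], value }))
const errorPercent = errorPercentPerStep(errorSeries, totalSeries)
```

Here the first step yields 2 percent, the second yields null because no requests arrived, and the third yields 5 percent.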

Metric Visualization

Create charts and dashboards for your metrics through the Observability.do API:


// Create a chart
const chart = await observability.metrics.createChart({
  title: 'HTTP Request Duration',
  description: 'Average HTTP request duration over time',
  type: 'line',
  metrics: [
    {
      name: 'http_request_duration_seconds',
      aggregation: 'avg',
      filters: {
        service: 'user-service',
      },
      groupBy: ['endpoint'],
    },
  ],
  timeRange: {
    relative: '7d',
  },
})
 
// Create a dashboard that lays out charts, referenced by ID
const dashboard = await observability.metrics.createDashboard({
  title: 'User Service Dashboard',
  description: 'Dashboard for the User Service',
  charts: [
    {
      id: 'request-duration',
      position: { x: 0, y: 0, w: 6, h: 4 },
    },
    {
      id: 'error-rate',
      position: { x: 6, y: 0, w: 6, h: 4 },
    },
    {
      id: 'active-users',
      position: { x: 0, y: 4, w: 12, h: 4 },
    },
  ],
})

Metric Management

Manage your metrics through the dashboard or API:


// Register a metric
await observability.metrics.register({
  name: 'new_metric',
  description: 'A new metric',
  type: 'counter',
  unit: 'count',
  labels: ['service', 'endpoint'],
})
 
// Update a metric
await observability.metrics.update('new_metric', {
  description: 'Updated description',
  labels: ['service', 'endpoint', 'method'],
})
 
// Configure metric retention
await observability.metrics.configureRetention({
  default: {
    duration: '90d',
    resolution: {
      raw: '24h',
      '1m': '7d',
      '5m': '30d',
      '1h': '90d',
    },
  },
  metrics: {
    http_requests_total: {
      duration: '365d',
      resolution: {
        raw: '24h',
        '1m': '7d',
        '5m': '30d',
        '1h': '90d',
        '1d': '365d',
      },
    },
  },
})
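
One way to read the resolution map is that each entry states how long data at that granularity is kept. Under that reading, which is an assumption rather than documented behavior, the sketch below picks the finest resolution still available for data of a given age. The parseDuration and resolutionForAge helpers are hypothetical, and the duration format is inferred from the example above.

```typescript
// Hypothetical helpers; the duration format ('24h', '7d') is inferred from the example above.
const UNIT_SECONDS: Record<string, number> = { m: 60, h: 3600, d: 86400 }

function parseDuration(text: string): number {
  const match = /^(\d+)([mhd])$/.exec(text)
  if (!match) throw new Error('unsupported duration: ' + text)
  return Number(match[1]) * UNIT_SECONDS[match[2]]
}

// Return the finest resolution that still retains data of the given age, or null if expired.
function resolutionForAge(resolution: Record<string, string>, ageSeconds: number): string | null {
  const candidates = Object.keys(resolution)
    .map((name) => ({
      name,
      step: name === 'raw' ? 0 : parseDuration(name), // raw data is the finest
      keep: parseDuration(resolution[name]),
    }))
    .filter((c) => ageSeconds <= c.keep)
    .sort((a, b) => a.step - b.step)
  return candidates.length > 0 ? candidates[0].name : null
}

const defaultResolution = { raw: '24h', '1m': '7d', '5m': '30d', '1h': '90d' }
const DAY = 86400
const threeDaysOld = resolutionForAge(defaultResolution, 3 * DAY)
```

With the default policy, three-day-old data is served at 1m resolution, 60-day-old data at 1h, and anything older than 90 days is gone.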