Honeycomb

Integrate Arxignis with Honeycomb for distributed tracing and observability using OpenTelemetry spans.

Overview

Honeycomb provides distributed tracing and observability for Arxignis deployments. Using OpenTelemetry spans, you can follow request flows, identify performance bottlenecks, and correlate security events across your system.

Key Features

🔍 Distributed Tracing

End-to-end request tracing across services
Automatic instrumentation of security checks
Custom span creation for threat analysis
Trace correlation with security events

📊 Performance Monitoring

Real-time performance spans
Custom instrumentation points
Response time analysis
Error rate tracking

🎯 Custom Spans

Security event tracking
Threat intelligence data
Custom business spans
Anomaly detection

📈 Advanced Analytics

Query-based analysis with the Honeycomb Query Builder
Custom dashboards and visualizations
SLO/SLI monitoring
Trend analysis and alerting

Setup and Configuration

Prerequisites

Honeycomb Account: Active Honeycomb account with API access
API Key: Honeycomb API key with write permissions
Team: Configured team for OpenTelemetry spans

Installation

Get your Honeycomb credentials:

# Your Honeycomb API key
HONEYCOMB_API_KEY="your-honeycomb-api-key"

# Your Honeycomb team
HONEYCOMB_TEAM="your-team"

Configure Arxignis integration:

{
  "observability": {
    "honeycomb": {
      "enabled": true,
      "api_key": "your-honeycomb-api-key",
      "team": "your-team",
      "sample_rate": 1.0
    }
  }
}

Enable tracing:

# Set environment variables
export HONEYCOMB_ENABLED=true
export HONEYCOMB_API_KEY="your-honeycomb-api-key"
export HONEYCOMB_TEAM="your-team"
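
As a rough sketch of how the environment variables above could map onto the JSON configuration block, the helper below builds the same structure and rejects obviously invalid values. The function name `loadHoneycombConfig` and the `HONEYCOMB_SAMPLE_RATE` variable are assumptions made for illustration; they are not documented Arxignis settings, and the sample rate is assumed to be a fraction between 0 and 1 as in the example config.

```javascript
// Hypothetical helper: builds the "observability.honeycomb" block from
// environment variables. HONEYCOMB_SAMPLE_RATE is an assumed variable name.
function loadHoneycombConfig(env = process.env) {
  const enabled = env.HONEYCOMB_ENABLED === "true";
  const sampleRate = env.HONEYCOMB_SAMPLE_RATE === undefined
    ? 1.0
    : Number(env.HONEYCOMB_SAMPLE_RATE);

  // Fail early instead of silently sending nothing.
  if (enabled && !env.HONEYCOMB_API_KEY) {
    throw new Error("HONEYCOMB_API_KEY is required when HONEYCOMB_ENABLED=true");
  }
  // Also rejects NaN, because every comparison with NaN is false.
  if (!(sampleRate > 0 && sampleRate <= 1)) {
    throw new Error("sample rate must be in the range (0, 1]");
  }

  return {
    observability: {
      honeycomb: {
        enabled,
        api_key: env.HONEYCOMB_API_KEY || "",
        team: env.HONEYCOMB_TEAM || "",
        sample_rate: sampleRate,
      },
    },
  };
}
```

Reading the key from the environment at startup keeps it out of committed configuration files.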

OpenTelemetry Spans

Security Event Spans

{
  "name": "security_check",
  "service_name": "arxignis-proxy",
  "trace_id": "5f3b3b3b3b3b3b3b3b3b3b3b3b3b3b3b",
  "span_id": "3b3b3b3b3b3b3b3b",
  "parent_id": "3b3b3b3b3b3b3b3a",
  "timestamp": "2024-01-15T10:30:00Z",
  "duration_ms": 45,
  "attributes": {
    "source_ip": "192.168.1.100",
    "threat_detected": true,
    "threat_type": "malware",
    "severity": "high",
    "action": "blocked",
    "user_agent": "Mozilla/5.0...",
    "request_path": "/api/v1/data",
    "response_code": 403,
    "geo_country": "US",
    "geo_city": "New York"
  }
}

Custom Spans

Create custom spans for detailed security analysis:

// Threat intelligence check span
const threatSpan = honeycomb.startSpan({
  name: "threat_intelligence_check",
  service_name: "arxignis-proxy",
  attributes: {
    "ip_address": sourceIP,
    "check_type": "reputation",
    "data_source": "arxignis_api"
  }
});

try {
  // Perform threat check
  const threatResult = await checkThreatIntelligence(sourceIP);

  // Add results to span
  threatSpan.addField("threat_score", threatResult.score);
  threatSpan.addField("threat_categories", threatResult.categories);
  threatSpan.addField("confidence_level", threatResult.confidence);
} finally {
  // End the span even if the threat check throws, so it is always reported
  threatSpan.end();
}
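
The example above starts a span, records fields, and ends it. When that pattern repeats across many checks, a small wrapper can help ensure every span is ended and that failures are recorded. The sketch below is illustrative only: `withSpan` and `createRecordingTracer` are hypothetical names, and the tracer is assumed to expose the same `startSpan`, `addField`, and `end` shape used in the example above.

```javascript
// Hypothetical helper: runs `fn` inside a span and always ends the span.
// `tracer` is assumed to expose startSpan(options) returning a span with
// addField(key, value) and end(), as in the example above.
async function withSpan(tracer, options, fn) {
  const span = tracer.startSpan(options);
  try {
    return await fn(span);
  } catch (err) {
    // Record the failure on the span, then let the caller handle it.
    span.addField("error", true);
    span.addField("error_message", String(err && err.message ? err.message : err));
    throw err;
  } finally {
    span.end();
  }
}

// In-memory stand-in for a tracer, intended for local experiments only.
function createRecordingTracer() {
  const finished = [];
  return {
    finished,
    startSpan(options) {
      const fields = { ...(options.attributes || {}) };
      return {
        addField(key, value) { fields[key] = value; },
        end() { finished.push({ name: options.name, fields }); },
      };
    },
  };
}
```

With this wrapper, a threat check becomes a single call such as `withSpan(tracer, { name: "threat_intelligence_check" }, async (span) => { ... })`, and the span is ended on both the success and the error path.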

Query Examples

Security Event Analysis

SELECT
  COUNT(*) as event_count,
  AVG(duration_ms) as avg_duration,
  P95(duration_ms) as p95_duration
FROM honeycomb
WHERE timestamp > now() - 1h
  AND name = "security_check"
  AND attributes.threat_detected = true
GROUP BY attributes.threat_type, attributes.severity
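
To make the aggregation in the query above concrete, the sketch below computes the same three values (count, average, and 95th percentile of `duration_ms`) over an in-memory array of spans, grouped by threat type and severity. It is an approximation for understanding only: the function names are hypothetical, the time filter is omitted, and the nearest-rank percentile used here may differ slightly from the estimate Honeycomb reports.

```javascript
// Nearest-rank percentile over an array of numbers (p between 0 and 100).
function percentile(values, p) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Groups security_check spans with a detected threat by type and severity.
function summarizeThreatSpans(spans) {
  const groups = new Map();
  for (const span of spans) {
    const attrs = span.attributes || {};
    if (span.name !== "security_check" || attrs.threat_detected !== true) continue;
    const key = `${attrs.threat_type}|${attrs.severity}`;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(span.duration_ms);
  }
  return [...groups].map(([key, durations]) => {
    const [threat_type, severity] = key.split("|");
    return {
      threat_type,
      severity,
      event_count: durations.length,
      avg_duration: durations.reduce((sum, d) => sum + d, 0) / durations.length,
      p95_duration: percentile(durations, 95),
    };
  });
}
```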

Performance Monitoring

SELECT
  name,
  COUNT(*) as request_count,
  AVG(duration_ms) as avg_response_time,
  P99(duration_ms) as p99_response_time,
  COUNT_WHERE(attributes.response_code >= 400) as error_count
FROM honeycomb
WHERE timestamp > now() - 1h
  AND service_name = "arxignis-proxy"
GROUP BY name
ORDER BY avg_response_time DESC

Geographic Threat Analysis

SELECT
  attributes.geo_country,
  attributes.geo_city,
  COUNT(*) as threat_count,
  COUNT_WHERE(attributes.severity = "critical") as critical_threats
FROM honeycomb
WHERE timestamp > now() - 24h
  AND attributes.threat_detected = true
GROUP BY attributes.geo_country, attributes.geo_city
ORDER BY threat_count DESC
LIMIT 10

Dashboard Configuration

Security Overview Dashboard

Create comprehensive security dashboards with these panels:

Threat Detection Overview
- Total threats detected (last 24h)
- Threats by severity and type
- Geographic distribution of threats

Performance Metrics
- Response time percentiles
- Error rates by endpoint
- Throughput analysis

Trace Analysis
- Slowest security checks
- Dependency performance
- Error correlation

SLO/SLI Configuration

Set up Service Level Objectives for security operations:

# Security Check SLO
name: "Security Check Response Time"
target: 95% of requests under 100ms
query: |
  SELECT
    COUNT_WHERE(duration_ms < 100) / COUNT(*) * 100 as slo_percentage
  FROM honeycomb
  WHERE timestamp > now() - 1h
    AND name = "security_check"
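
The arithmetic behind this SLO can be sketched in a few lines: count the checks that finished under the threshold, divide by the total, and compare against the target. The function name `evaluateLatencySlo` and its return shape are assumptions made for illustration, not part of any Honeycomb or Arxignis API.

```javascript
// Hypothetical helper: share of checks faster than the threshold,
// compared with the SLO target (defaults mirror the SLO above).
function evaluateLatencySlo(durationsMs, thresholdMs = 100, targetPercent = 95) {
  if (durationsMs.length === 0) {
    // No data in the window: report "unknown" rather than pass or fail.
    return { sloPercentage: null, met: null };
  }
  // Strictly less than the threshold, matching duration_ms < 100.
  const fast = durationsMs.filter((d) => d < thresholdMs).length;
  const sloPercentage = (fast / durationsMs.length) * 100;
  return { sloPercentage, met: sloPercentage >= targetPercent };
}
```

For example, with durations of 45, 60, 120, and 80 ms, three of four checks are under 100 ms, so the result is 75%, which misses a 95% target.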

Best Practices

Sampling Strategy

Use adaptive sampling for high-volume endpoints
Sample 100% of security events
Implement custom sampling rules for different event types
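
One possible shape for such a rule is sketched below: security events are always kept, and routine spans are sampled deterministically by hashing the trace ID, so that all spans from one trace share the same decision. The function name `shouldKeepSpan`, the default rate, and the field names (taken from the sample span earlier on this page) are assumptions; this is not the sampling logic Arxignis or Honeycomb ships.

```javascript
const crypto = require("crypto");

// Hypothetical sampling rule: keep every security event, and keep a fixed
// fraction of routine spans based on a hash of the trace ID.
function shouldKeepSpan(span, routineSampleRate = 0.1) {
  const attrs = span.attributes || {};
  if (span.name === "security_check" || attrs.threat_detected === true) {
    return true; // security events are sampled at 100%
  }
  // Map the trace ID to a stable number in [0, 1).
  const digest = crypto.createHash("sha256").update(String(span.trace_id)).digest();
  const bucket = digest.readUInt32BE(0) / 0x100000000;
  return bucket < routineSampleRate;
}
```

Hashing the trace ID rather than calling a random number generator avoids traces where some spans were kept and others dropped.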

Instrumentation

Add custom attributes for business context
Use consistent naming conventions
Include relevant metadata in spans

Performance Optimization

Monitor trace volume and adjust sampling
Use efficient queries with proper time ranges
Implement trace batching for high-throughput scenarios
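
Batching can be illustrated with a small buffer that hands spans to a sender in groups instead of one at a time. The `SpanBatcher` class below is a hypothetical sketch; in a real deployment the OpenTelemetry SDK's batch span processor would normally take this role, and a timer-based flush would be added so spans do not wait indefinitely.

```javascript
// Hypothetical batcher: buffers spans and passes them to `send` in groups.
class SpanBatcher {
  constructor(send, maxBatchSize = 100) {
    this.send = send;               // callback that receives an array of spans
    this.maxBatchSize = maxBatchSize;
    this.buffer = [];
  }

  add(span) {
    this.buffer.push(span);
    if (this.buffer.length >= this.maxBatchSize) {
      this.flush();
    }
  }

  // Sends whatever is buffered; call this on shutdown to avoid losing spans.
  flush() {
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    this.send(batch);
  }
}
```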

Security

Store API keys securely
Apply least-privilege access
Rotate keys regularly
Monitor access patterns
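
One small habit that supports secure key handling is never writing the full API key to logs. The helper below is a hypothetical sketch (the name `redactApiKey` is an assumption) that masks everything except the last few characters.

```javascript
// Hypothetical helper: masks an API key for log output, keeping only the
// last `visible` characters so operators can still tell keys apart.
function redactApiKey(key, visible = 4) {
  if (typeof key !== "string" || key.length <= visible) {
    return "****"; // too short to reveal any part safely
  }
  return "*".repeat(key.length - visible) + key.slice(-visible);
}
```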

Troubleshooting

Common Issues

Spans Not Appearing

Verify API key permissions
Check team configuration
Ensure OpenTelemetry tracing is enabled
Validate network connectivity

High Latency

Check sampling configuration
Monitor network performance
Verify Honeycomb service status
Optimize query time ranges

Data Loss

Verify sampling rates
Check API rate limits
Monitor error logs
Validate span format
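
A basic format check can catch malformed spans before they are sent. The sketch below validates only the top-level fields shown in the sample security event span on this page; the function name `validateSpan` and the exact rules are assumptions, not a Honeycomb or OpenTelemetry schema.

```javascript
// Hypothetical validator: returns a list of problems (empty when the span
// looks well-formed). Field names follow the sample span on this page.
function validateSpan(span) {
  const problems = [];
  if (typeof span.name !== "string" || span.name.length === 0) {
    problems.push("name must be a non-empty string");
  }
  if (typeof span.service_name !== "string" || span.service_name.length === 0) {
    problems.push("service_name must be a non-empty string");
  }
  if (typeof span.duration_ms !== "number" || !Number.isFinite(span.duration_ms) || span.duration_ms < 0) {
    problems.push("duration_ms must be a non-negative number");
  }
  if (Number.isNaN(Date.parse(span.timestamp))) {
    problems.push("timestamp must be a parseable date string");
  }
  const attrs = span.attributes;
  if (attrs !== undefined && (typeof attrs !== "object" || attrs === null || Array.isArray(attrs))) {
    problems.push("attributes must be an object when present");
  }
  return problems;
}
```

Logging the returned problems next to the span name makes it easier to tell dropped spans from malformed ones.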

Getting Help

Honeycomb Documentation: docs.honeycomb.io
Support: Contact Honeycomb support for platform issues
Community: Join our Discord community

FAQ

What is the difference between traces and spans?

Traces are collections of related spans that show the flow of requests through your system. Spans are individual timing units within a trace. Arxignis sends OpenTelemetry spans to provide comprehensive trace-based observability.

How should I configure sampling?

Configure sampling rates based on span importance. Security events should be sampled at 100%, while routine requests can use lower sampling rates to manage costs.

Can I correlate security events with application performance?

Yes, Honeycomb's distributed tracing allows you to correlate security check spans with application performance, helping identify security-related performance impacts.

How can I optimize costs?

Use adaptive sampling for spans, implement efficient queries, monitor data volume, and set up appropriate retention policies to optimize costs while maintaining observability.

What custom attributes should I add to spans?

Add business-relevant attributes like user_id, organization, threat_type, severity, geographic location, and custom security context to enhance span analysis.

For more information, visit honeycomb.io or join our Discord community.
