
distributed-tracing.md

layout: post
title: Distributed Tracing
permalink: /docs/distributed-tracing
redirect_from:
  - /distributed-tracing.md/
  - /docs/distributed-tracing.md/

AIStore supports distributed tracing via OpenTelemetry (OTEL), complementing its existing extensive metrics and logging features. Distributed tracing tracks client requests across AIStore's proxy and target daemons, providing better visibility into the request flow and offering valuable performance insights.

For more details:

WARNING: Enabling distributed tracing introduces slight overhead in AIStore's critical data path. Enable this feature only after carefully considering its performance impact and ensuring that the benefits of enhanced observability justify the potential trade-offs.


Getting Started

In this section, we use AIStore Local Playground and a local Jaeger instance. This setup is purely for demonstration purposes: it is easy to use and easy to reproduce.

Pre-Requisite

  • Docker
  1. Local Jaeger setup

    docker run -d --name jaeger \
    -e COLLECTOR_OTLP_ENABLED=true \
    -p 16686:16686 \
    -p 4317:4317 \
    -p 4318:4318 \
    jaegertracing/all-in-one:latest
  2. Optionally, shut down and clean up Local Playground:

    make kill clean
  3. Deploy the cluster with tracing enabled:

    AIS_TRACING_ENDPOINT="localhost:4317" make deploy

    This starts an AIStore cluster with distributed tracing enabled.
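
If no traces show up later, a first thing to rule out is an unreachable exporter endpoint. The following is a minimal sketch, assuming the port mappings from the `docker run` command above; the helper name `port_open` is hypothetical and not part of AIStore or Jaeger.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Ports published by the Jaeger container above: OTLP gRPC, OTLP HTTP, web UI.
    for name, port in [("OTLP gRPC", 4317), ("OTLP HTTP", 4318), ("Jaeger UI", 16686)]:
        state = "reachable" if port_open("localhost", port) else "NOT reachable"
        print(f"{name:10s} localhost:{port} {state}")
```

A successful TCP connect only shows that something is listening; it does not prove that the listener speaks OTLP.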

Example operations

ais bucket create ais://nnn
ais put README.md ais://nnn
ais get ais://nnn/README.md /dev/null

View traces at: http://localhost:16686
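
The same check can be scripted instead of opening the UI. The sketch below assumes that Jaeger's query service exposes the JSON endpoint `/api/services` used by its own UI (an internal, unversioned API, so the response shape is an assumption) and that AIStore services are reported under the default `aistore` prefix. Both function names are hypothetical.

```python
import json
from urllib.request import urlopen


def filter_services(payload: dict, prefix: str = "aistore") -> list:
    """Pick service names starting with prefix from a Jaeger-style payload."""
    # Assumed response shape: {"data": ["service-a", "service-b", ...], ...}
    names = payload.get("data") or []
    return sorted(n for n in names if n.startswith(prefix))


def list_traced_services(base_url: str = "http://localhost:16686") -> list:
    """Fetch service names from Jaeger and keep the ones with the AIStore prefix."""
    with urlopen(f"{base_url}/api/services", timeout=5) as resp:
        return filter_services(json.load(resp))


if __name__ == "__main__":
    try:
        print(list_traced_services())
    except OSError as err:  # Jaeger not running or not reachable
        print(f"could not query Jaeger: {err}")
```

An empty list after running the example operations would suggest that the cluster is not exporting, or that a non-default `tracing.service_name_prefix` is in use.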

Configuration

The tracing configuration is cluster-wide. For the full list of AIStore config options, refer to configuration.md.

| Option name | Default value | Description |
| --- | --- | --- |
| `tracing.enabled` | `false` | If true, enables distributed tracing |
| `tracing.exporter_endpoint` | `''` | OTEL exporter gRPC endpoint |
| `tracing.service_name_prefix` | `aistore` | Prefix added to the OTEL service name reported by the exporter |
| `tracing.attributes` | `{}` | Extra attributes to be added to the traces |
| `tracing.sampler_probability` | `1` (export all traces) | Fraction of traces to sample, in the range [0, 1] |
| `tracing.skip_verify` | `false` | Allow insecure (TLS) exporter gRPC connection |
| `tracing.exporter_auth.token_header` | `''` | Request header used for the exporter auth token |
| `tracing.exporter_auth.token_file` | `''` | File path from which to read the exporter auth token |
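
To make the sampling probability concrete: a probability sampler keeps roughly that fraction of traces, and ratio-based samplers in OpenTelemetry SDKs typically derive the decision from the trace ID, so that every daemon handling the same request reaches the same verdict. The following is an illustrative sketch of that idea, not AIStore's or the OTEL SDK's actual code; `should_sample` is a hypothetical name.

```python
import random

TRACE_ID_MASK = (1 << 64) - 1  # look only at the low 64 bits of a 128-bit trace ID


def should_sample(trace_id: int, probability: float) -> bool:
    """Deterministic ratio sampling: the same trace ID always gets the same answer."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be within [0, 1]")
    # IDs whose low 64 bits fall below this bound are kept.
    bound = round(probability * (TRACE_ID_MASK + 1))
    return (trace_id & TRACE_ID_MASK) < bound


if __name__ == "__main__":
    rng = random.Random(0)
    ids = [rng.getrandbits(128) for _ in range(10_000)]
    for p in (0.0, 0.1, 1.0):
        kept = sum(should_sample(t, p) for t in ids)
        print(f"probability={p}: kept {kept} of {len(ids)} traces")
```

With a probability of 1 every trace is exported, which matches the default; lowering the value trades trace coverage for less export traffic.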

Sample AIStore cluster configuration:

{
    ...
    "tracing": {
        "enabled": true,
        "exporter_endpoint": "localhost:4317",
        "skip_verify": true,
        "service_name_prefix": "aistore",
        "sampler_probability": "1.0"
    },
    ...
}
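
A small sanity check over the `tracing` section can catch typos before the config is applied. This is a sketch under assumptions: it uses the `sampler_probability` spelling and the numeric-string value from the sample above, takes the remaining key names from the options listed earlier, and adds a plausibility rule of its own (an enabled tracer needs an endpoint). `check_tracing` is a hypothetical helper, not an AIStore API.

```python
import json

KNOWN_KEYS = {
    "enabled", "exporter_endpoint", "service_name_prefix", "attributes",
    "sampler_probability", "skip_verify", "exporter_auth",
}


def check_tracing(section: dict) -> list:
    """Return a list of human-readable problems found in a tracing section."""
    problems = [f"unknown key: {k}" for k in sorted(set(section) - KNOWN_KEYS)]
    if section.get("enabled") and not section.get("exporter_endpoint"):
        problems.append("enabled is true but exporter_endpoint is empty")
    try:
        prob = float(section.get("sampler_probability", "1"))
        if not 0.0 <= prob <= 1.0:
            problems.append("sampler_probability must be within [0, 1]")
    except (TypeError, ValueError):
        problems.append("sampler_probability is not a number")
    return problems


if __name__ == "__main__":
    sample = json.loads("""
    {"tracing": {"enabled": true, "exporter_endpoint": "localhost:4317",
                 "skip_verify": true, "service_name_prefix": "aistore",
                 "sampler_probability": "1.0"}}
    """)
    print(check_tracing(sample["tracing"]) or "tracing section looks consistent")
```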

Build AIStore with tracing

Distributed tracing is a build-time option controlled by the oteltracing build tag.

When the aisnode binary is built without this build tag, the tracing configuration is ignored and the entire tracing functionality becomes a no-op.

# build with tracing support
TAGS=oteltracing make node

# build without tracing support
make node