@mashraf-222

Add OpenTelemetry Integration

This PR adds OpenTelemetry support for distributed tracing across the codebase. The implementation includes automatic instrumentation for NumPy and Pandas operations, plus a decorator-based approach for manual instrumentation of key functions.

What's Added

  • Telemetry module (src/telemetry/) with configuration, setup, and decorators
  • Automatic instrumentation for NumPy and Pandas (when available)
  • Manual instrumentation via @trace_function decorator applied to critical functions in:
    • numerical/optimization.py - gradient descent
    • algorithms/dynamic_programming.py - fibonacci, matrix operations, knapsack
    • algorithms/graph.py - graph traversal, clustering, path finding
    • data_processing/dataframe.py - filtering, grouping, merging
    • statistics/descriptive.py - statistical operations
  • Environment configuration via env.example with all OpenTelemetry variables documented

How It Works

Telemetry configuration is managed by TelemetryConfig, which reads environment variables and falls back to built-in defaults when a variable is unset. The same settings can also be supplied as parameters to setup_telemetry().
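As a rough illustration of the environment-plus-defaults pattern described above, a minimal sketch might look like the following. It is not the PR's actual TelemetryConfig: the field names, the `from_env` classmethod, and the rule that explicit overrides win over environment values are all assumptions made for the example. The variable names and default values are the ones listed under Current Implementation.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class TelemetryConfig:
    # Hypothetical field names; defaults mirror the values listed in this PR.
    enabled: bool = True
    service_name: str = "optimize-me"
    exporter_type: str = "console"
    otlp_endpoint: str = "http://localhost:4318"

    @classmethod
    def from_env(cls, environ=None, **overrides):
        """Build a config from environment variables, then apply overrides.

        `environ` defaults to os.environ; passing a plain dict makes the
        function easy to exercise without touching the real environment.
        """
        env = os.environ if environ is None else environ
        values = {
            # OTEL_SDK_DISABLED is a boolean flag: only "true" disables telemetry.
            "enabled": env.get("OTEL_SDK_DISABLED", "false").strip().lower() != "true",
            "service_name": env.get("OTEL_SERVICE_NAME", cls.service_name),
            "exporter_type": env.get("OTEL_EXPORTER_TYPE", cls.exporter_type),
            "otlp_endpoint": env.get("OTEL_EXPORTER_OTLP_ENDPOINT", cls.otlp_endpoint),
        }
        # Assumption: explicitly passed (non-None) parameters take precedence.
        values.update({k: v for k, v in overrides.items() if v is not None})
        return cls(**values)


default_config = TelemetryConfig.from_env({})
custom_config = TelemetryConfig.from_env(
    {"OTEL_SDK_DISABLED": "TRUE", "OTEL_EXPORTER_TYPE": "otlp"},
    service_name="demo",
)
```

Keeping the environment lookup in one classmethod means the rest of the telemetry code only ever sees a fully resolved, immutable config object.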

Current Implementation:

  1. Configuration defaults (can be overridden via environment variables):

    • OTEL_SDK_DISABLED=false - Telemetry enabled by default
    • OTEL_SERVICE_NAME=optimize-me - Default service name
    • OTEL_EXPORTER_TYPE=console - Console exporter for development
    • OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 - OTLP endpoint when using otlp exporter
  2. Initialization: Call setup_telemetry() once at application startup (as shown in examples/run_with_telemetry.py):

    from src.telemetry import setup_telemetry
    
    setup_telemetry(
        service_name="optimize-me",
        service_version="0.1.0",
        exporter_type="console",
    )
  3. Instrumentation already in place: Critical functions carry the @trace_function decorator, so they are traced without any call-site changes:

    • gradient_descent captures iterations and learning_rate arguments
    • Other critical functions (graph traversal, clustering, statistical operations) are traced the same way
    • NumPy and Pandas operations are automatically instrumented when available
  4. Environment variable overrides: Set variables from env.example to customize behavior without code changes.
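To make step 3 more concrete, here is one way a decorator like @trace_function could be written. This is a sketch under assumptions, not the code in src/telemetry/: the `span_name` and `capture_args` parameters, the `select_arguments` helper, and the `arg.` attribute prefix are hypothetical, and the toy `gradient_descent` body exists only to have something to decorate. The OpenTelemetry calls used (`trace.get_tracer`, `start_as_current_span`, `set_attribute`) are part of the opentelemetry-api package; if that package is missing, the sketch falls back to calling the function untraced.

```python
import functools
import inspect

try:
    from opentelemetry import trace  # provided by opentelemetry-api
except ImportError:  # without the package the decorator becomes a no-op
    trace = None


def select_arguments(func, args, kwargs, names):
    """Return {name: value} for the requested parameters of one call."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()
    return {n: bound.arguments[n] for n in names if n in bound.arguments}


def trace_function(func=None, *, span_name=None, capture_args=()):
    """Wrap a function in a span; usable bare or with keyword options."""

    def decorator(fn):
        name = span_name or fn.__qualname__

        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if trace is None:
                return fn(*args, **kwargs)
            tracer = trace.get_tracer(fn.__module__)
            with tracer.start_as_current_span(name) as span:
                captured = select_arguments(fn, args, kwargs, capture_args)
                for key, value in captured.items():
                    # Hypothetical naming scheme for argument attributes.
                    span.set_attribute(f"arg.{key}", value)
                return fn(*args, **kwargs)

        return wrapper

    return decorator(func) if func is not None else decorator


@trace_function(capture_args=("iterations", "learning_rate"))
def gradient_descent(start, iterations=100, learning_rate=0.1):
    """Toy stand-in: minimise f(x) = x**2 from a starting point."""
    x = start
    for _ in range(iterations):
        x -= learning_rate * 2 * x
    return x


result = gradient_descent(1.0)
```

Only scalar arguments such as `iterations` and `learning_rate` are recorded here, since span attributes are limited to primitive values and sequences of them; large arrays or DataFrames would need to be summarised first.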

Testing

Run the example script to see telemetry in action:

python examples/run_with_telemetry.py

This will output trace spans to the console showing execution of instrumented functions including gradient descent, graph traversal, statistical operations, and clustering.

For production, set OTEL_EXPORTER_TYPE=otlp and point OTEL_EXPORTER_OTLP_ENDPOINT at an OTLP-compatible collector or backend (e.g., Jaeger, Tempo).
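For readers wondering how the console/otlp switch might be wired up, the sketch below shows one possible shape for it. It is an assumption about the internals, not the PR's implementation: `traces_url` and `build_exporter` are hypothetical helpers, and the choice of the OTLP/HTTP exporter is inferred from the default port 4318. It relies on the opentelemetry-sdk and opentelemetry-exporter-otlp-proto-http packages, imported lazily so the console path does not need the OTLP exporter installed.

```python
def traces_url(base_endpoint):
    """Append the OTLP/HTTP traces path to a base endpoint if it is missing."""
    base = base_endpoint.rstrip("/")
    return base if base.endswith("/v1/traces") else base + "/v1/traces"


def build_exporter(exporter_type="console", endpoint="http://localhost:4318"):
    """Return a span exporter for the given type (hypothetical helper)."""
    if exporter_type == "console":
        from opentelemetry.sdk.trace.export import ConsoleSpanExporter
        return ConsoleSpanExporter()
    if exporter_type == "otlp":
        from opentelemetry.exporter.otlp.proto.http.trace_exporter import (
            OTLPSpanExporter,
        )
        # An endpoint passed in code is used as-is, so add the traces path here.
        return OTLPSpanExporter(endpoint=traces_url(endpoint))
    raise ValueError(f"unsupported exporter type: {exporter_type!r}")


def setup_telemetry(service_name="optimize-me", service_version="0.1.0",
                    exporter_type="console", endpoint="http://localhost:4318"):
    """Install a global tracer provider; call once at application startup."""
    exporter = build_exporter(exporter_type, endpoint)

    from opentelemetry import trace
    from opentelemetry.sdk.resources import Resource
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor

    resource = Resource.create(
        {"service.name": service_name, "service.version": service_version}
    )
    provider = TracerProvider(resource=resource)
    # Batch spans in the background instead of exporting on every span end.
    provider.add_span_processor(BatchSpanProcessor(exporter))
    trace.set_tracer_provider(provider)
    return provider
```

One detail worth checking against the real code: when the endpoint is passed as a constructor argument, the Python OTLP/HTTP exporter sends to that URL unchanged, whereas the OTEL_EXPORTER_OTLP_ENDPOINT environment variable is treated as a base URL to which v1/traces is appended. The `traces_url` helper in this sketch papers over that difference.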

@Saga4

Saga4 commented Nov 5, 2025

This repo is hardly used in any real production environment. @KRRT7 We are looking to collect real logs and signals so that we can generate live, production-style test cases for users and later track benchmark performance over time.
This repo does not even have E2E tests.

@KRRT7
Collaborator

KRRT7 commented Nov 5, 2025

@Saga4 we need to know what it'd look like as a basic POC before being able to implement it in a real codebase environment
