Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Code Generation with Weaver - Example of Code Generation for SemConv Attribute Registry and SemConv Metrics #136

Merged
merged 47 commits into from
May 8, 2024

Conversation

lquerel
Copy link
Contributor

@lquerel lquerel commented Apr 26, 2024

IMPORTANT NOTES:

  • THIS PR SHOWS HOW TO USE WEAVER TO GENERATE CODE FOR SEMANTIC CONVENTIONS.
  • THE GENERATED RUST CODE IS JUST AN EXAMPLE, NOT AN OFFICIAL RUST SEMCONV VERSION.

Guidance to review this PR

The purpose of this PR is to allow various maintainers of API/SDK clients to provide feedback on Weaver in the context of code generation. The generation of Rust code for SemConv is taken as an example.

File structure, under crates/weaver_codegen_test/:

  • semconv_registry/: Contains the SemConv registry used in this example.
  • templates/registry/rust/: Holds the templates for this PR.
  • templates/registry/rust/weaver.xml: Defines the JQ expression for applying each template to the resolved registry.
  • generated_code/: Location where the Rust SemConv code is generated.

The following command can be used to reproduce this code generation:

cargo run -- registry generate \
-t crates/weaver_codegen_test/templates/ \
-r crates/weaver_codegen_test/semconv_registry \
rust crates/weaver_codegen_test/generated_code/

To the attention of API/SDK client maintainers,

Weaver uses a Jinja2-compatible engine to render the templates. The following custom filters are available:

  • lower_case: Converts a string to lowercase.
  • upper_case: Converts a string to UPPERCASE.
  • title_case: Converts a string to TitleCase.
  • pascal_case: Converts a string to PascalCase.
  • camel_case: Converts a string to camelCase.
  • snake_case: Converts a string to snake_case.
  • screaming_snake_case: Converts a string to SCREAMING_SNAKE_CASE.
  • kebab_case: Converts a string to kebab-case.
  • screaming_kebab_case: Converts a string to SCREAMING-KEBAB-CASE.
  • acronym: Replaces acronyms in the input string with the full name defined in the acronyms section of the weaver.yaml configuration file.
  • split_ids: Splits a string by '.' creating a list of nested ids.
  • type_mapping: Converts a semantic convention type to a target type (see weaver.yaml section type_mapping).
  • comment_with_prefix(prefix): Outputs a multiline comment with the given prefix.
  • flatten: Converts a List of Lists into a single list with all elements.
    e.g. [[a,b],[c]] => [a,b,c]
  • attribute_sort: Sorts a list of Attributes by requirement level, then name.
  • metric_namespace: Converts registry.{namespace}.{other}.{components} to {namespace}.
  • attribute_registry_file: Converts registry.{namespace}.{other}.{components} to attributes-registry/{namespace}.md (kebab-case namespace).
  • attribute_registry_title: Converts registry.{namespace}.{other}.{components} to {Namespace} (title case the namespace).
  • attribute_registry_namespace: Converts metric.{namespace}.{other}.{components} to {namespace}.
  • attribute_namespace: Converts {namespace}.{attribute_id} to {namespace}.

Note 1: Most of the standard Jinja2 filters are also supported.

Note 2: Some filters present in the build-tools have not yet been implemented in Weaver. It would be interesting to get feedback from the maintainers to identify which are missing.

The following custom testers are available:

  • stable: Tests if an Attribute is stable.
  • experimental: Tests if an Attribute is experimental.
  • deprecated: Tests if an Attribute is deprecated.

To the attention of the maintainers of opentelemetry-rust

@jtescher, @TommyCpp , @cijothomas , @lalitb , @djc (and any other opentelemetry-rust maintainers/contributors), please review this PR and provide feedback on both the Weaver tool and the Rust code generated from a few Jinja2 templates that I assembled specifically for this exercise.

The generated code adheres to the recommendations defined in this PR by @lmolkova, along with a few other suggestions of my own:

  • One Rust file per group namespace.
  • Every SemConv experimental attribute is annotated with #[cfg(feature = "semconv_experimental")]
  • Every SemConv deprecated attribute is annotated with #[deprecated(note="{{ attribute.deprecated }}")].
  • Every SemConv enumeration is translated into a non-exhaustive Rust enum. Each of these enums implement the Display trait and expose a method as_str returning the string value for each variant. A _Custom variant is also added to let the users define their own variant.
  • Every attribute of the registry is declared as a typed AttributeKey constant. A method is exposed to generate a standard KeyValue from a properly typed given value.
  • For the metrics, this example generates an API similar to the one used by the Python SDK, as well as a type-safe API (more details below).

Below is an example showcasing the creation of metrics using the type-safe API generated by the templates. The goal is to guide our users through the smart completion offered by most IDEs (including documentation), and also to leverage Rust's type system and compiler to guide the user and prevent any omissions, errors, and misuse.

    // ==== A TYPE-SAFE HISTOGRAM API ====
    // Create a u64 http.server.request.duration metric (as defined in the OpenTelemetry HTTP
    // semantic conventions)
    let http_request_duration = HttpServerRequestDuration::<u64>::new(&meter);

    // Records a new data point and provide the required and some optional attributes
    http_request_duration.record(100, &HttpServerRequestDurationReqAttributes {
        http_request_method: HttpRequestMethod::Connect,
        url_scheme: "http".to_owned(),
    }, Some(&HttpServerRequestDurationOptAttributes {
        http_response_status_code: Some(200),
        ..Default::default()
    }));

    // ==== A TYPE-SAFE UP-DOWN-COUNTER API ====
    // Create a f64 http.server.request.duration metric (as defined in the OpenTelemetry HTTP
    // semantic conventions)
    let http_client_active_requests = HttpClientActiveRequests::<f64>::new(&meter);

    // Adds a new data point and provide the required attributes. Optional attributes are not
    // provided in this example.
    http_client_active_requests.add(10.0, &HttpClientActiveRequestsReqAttributes {
        server_address: "10.0.0.1".to_owned(),
        server_port: 8080,
    }, None);

    // ==== A TYPE-SAFE COUNTER API ====
    // Create a f64 system.cpu.time metric (as defined in the OpenTelemetry System semantic
    // conventions)
    let system_cpu_time = SystemCpuTime::<f64>::new(&meter);

    // Adds a new data point and provide some optional attributes.
    // Note: In the method signature, there is no required attribute.
    system_cpu_time.add(10.0, Some(&SystemCpuTimeOptAttributes {
        system_cpu_logical_number: Some(0),
        system_cpu_state: Some(SystemCpuState::Idle)
    }));
    // Adds a new data point with a custom CPU state.
    system_cpu_time.add(20.0, Some(&SystemCpuTimeOptAttributes {
        system_cpu_logical_number: Some(0),
        system_cpu_state: Some(SystemCpuState::_Custom("custom".to_owned()))
    }));

    // ==== A TYPE-SAFE GAUGE API ====
    // Create a i64 system.cpu.utilization metric (as defined in the OpenTelemetry System semantic
    // conventions)
    let system_cpu_utilization = SystemCpuUtilization::<i64>::new(&meter);

    // Adds a new data point with no optional attributes.
    system_cpu_utilization.record(-5, None);
    // Adds a new data point with some optional attributes.
    system_cpu_utilization.record(10, Some(&SystemCpuUtilizationOptAttributes {
        system_cpu_logical_number: Some(0),
        system_cpu_state: Some(SystemCpuState::Idle)
    }));

Note 1: Similarly to other Client SDKs, the individual SemConv attributes and metrics are accessible via the generated code without using the type-safe API (as seen in the Python or Java SDKs). We could offer a configuration mechanism allowing users to choose between permissive (but error-prone) APIs or more rigid, type-safe APIs.

Note 2: A similar approach could be used for Span, ...

Discussion on performance:

  • The current experimentation generates code relying on the opentelemetry and opentelemetry-sdk crates. The entirety of the existing pipeline is reused. The type-safe API generated on top of the SDK is relatively light but adds a small overhead.
  • It is possible in the long run to completely eliminate the numerous layers of abstraction by directly generating the code that will interface this type-safe API with the opentelemetry-otlp crate (or any other target exporter). With this approach, many optimizations are possible: Direct conversion of Rust structures representing required and optional attributes to corresponding protobuf messages. Aggregation relies on computing a hash combining all key+value attributes sorted by key; it is possible to completely eliminate this sorting as well as half of the overhead of the hash (hashing of values only). Many other optimizations are possible with this type of approach.

Status

  • Implement the Display trait for every enum.
  • Expose the method as_str for every enum.
  • Generate deprecated and experimental attributes.
  • Use the first segment in the attribute id instead of the prefix of groups.
  • Generate Metrics
  • Test code generate (see crate weaver_codegen_test).
  • More documentation on JQ syntax.

Work for future PR: Generate type-safe API for spans.

Copy link

codecov bot commented Apr 26, 2024

Codecov Report

Attention: Patch coverage is 17.39130% with 76 lines in your changes are missing coverage. Please review.

❗ No coverage uploaded for pull request base (main@d9f1838). Click here to learn what that means.

Files Patch % Lines
crates/weaver_common/src/in_memory.rs 0.0% 37 Missing ⚠️
crates/weaver_forge/src/lib.rs 38.4% 24 Missing ⚠️
crates/weaver_forge/src/debug.rs 7.6% 12 Missing ⚠️
crates/weaver_codegen_test/build.rs 0.0% 3 Missing ⚠️
Additional details and impacted files
@@          Coverage Diff           @@
##             main    #136   +/-   ##
======================================
  Coverage        ?   73.4%           
======================================
  Files           ?      44           
  Lines           ?    2504           
  Branches        ?       0           
======================================
  Hits            ?    1840           
  Misses          ?     664           
  Partials        ?       0           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@jsuereth jsuereth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the code in weaver lgtm.

Regarding rust templates - I think those either belong in an examples directory or a separate or for otel-rust. At a minimum there should be a test on them if they live in weaver.

@lquerel
Copy link
Contributor Author

lquerel commented Apr 29, 2024

@jsuereth yes this PR is still a draft I will add some tests today and I will move the Rust templates into an examples directory.

@lquerel lquerel changed the title Using Weaver to Generate Semantic Convention Attributes Code [WIP] Using Weaver to Generate Semantic Convention Attributes Code Apr 29, 2024
/// TRACE method.
Trace,
/// Any HTTP method that the instrumentation has no prior knowledge of.
Other,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those should have string values matching ones defined in the spec, i.e. CONNECT, _OTHER - is it possible in Rust?

Also, enums are notoriously difficult in other languages for their back-compat issues (calling Enum.NewValueAddedInVNext but in runtime you deal with vOld of the enum vlass). I assume it's not an issue in Rust?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there are ways to create this mapping, either with a macro or directly via the implementation of the Display trait. I will add that support.

To handle the evolution of enum definitions, I used the #[non_exhaustive] macro attribute, which indicates that a type or variant may have more fields or variants added in the future (more details here).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The string values for the enums are now supported. See 9483d63

@lquerel lquerel self-assigned this May 4, 2024
@lquerel lquerel changed the title [WIP] Using Weaver to Generate Semantic Convention Attributes Code [WIP] Code Generation with Weaver - Example of Code Generation for SemConv Attribute Registry and SemConv Metrics May 4, 2024
Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clippy found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

@lquerel lquerel marked this pull request as ready for review May 7, 2024 06:46
use weaver_common::Logger;

/// Return a nice summary of the error.
/// Return a nice summary of the error including the chain of causes.
/// Only the last error in the chain is displayed with a full stack trace.
pub(crate) fn error_summary(error: minijinja::Error) -> String {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this does what I think it does, maybe it's time to move the Docker image from debug=>release?

I can start a tracker but it'd be nice to rely on release builds (~100MB vs. 400MB somehow).

Copy link
Contributor Author

@lquerel lquerel May 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, but I don't precisely remember the exact problem you observed with the release version.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there were two issues:

  1. The error output was truncated to the point of being useless, whereas in debug build you got line numbers and pointers at problematic templates.
  2. The {{ debug(var) }} trick for templates stopped rendering.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will check that

Copy link
Contributor

@jsuereth jsuereth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, pending a windows build fix

if parent.is_empty() {
vec![]
} else {
parent.split('/').map(|s| s.to_owned()).collect::<Vec<_>>()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line may not be windows friendly. I think you should be able to split using Path in some fashion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Path.components() is the platform independent solution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well not completely...

crates/weaver_forge/src/lib.rs Fixed Show fixed Hide fixed
crates/weaver_forge/src/lib.rs Fixed Show fixed Hide fixed
crates/weaver_forge/src/lib.rs Fixed Show fixed Hide fixed
crates/weaver_forge/src/lib.rs Fixed Show fixed Hide fixed
crates/weaver_forge/src/lib.rs Fixed Show fixed Hide fixed
crates/weaver_forge/src/lib.rs Fixed Show fixed Hide fixed
crates/weaver_forge/src/lib.rs Fixed Show fixed Hide fixed
crates/weaver_forge/src/lib.rs Fixed Show fixed Hide fixed
crates/weaver_forge/src/lib.rs Fixed Show fixed Hide fixed
crates/weaver_forge/src/lib.rs Fixed Show fixed Hide fixed
crates/weaver_forge/src/lib.rs Fixed Show fixed Hide fixed
crates/weaver_forge/src/lib.rs Fixed Show fixed Hide fixed
crates/weaver_forge/src/lib.rs Fixed Show fixed Hide fixed
crates/weaver_forge/src/lib.rs Fixed Show fixed Hide fixed
@lquerel lquerel merged commit f4e1ff5 into open-telemetry:main May 8, 2024
20 checks passed
@lquerel lquerel deleted the example-codegen-semconv-rust branch May 29, 2024 00:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation enhancement New feature or request help wanted Extra attention is needed
Projects
Status: Done
4 participants