Releases: DataDog/datadog-agent

7.44.1

16 May 12:35
299bdcd

Prelude

Release on: 2023-05-16

Enhancement Notes

  • Agents are now built with Go 1.19.8.
  • Added optional config flag process_config.cache_lookupid to cache calls to user.LookupId in the process Agent. Use it to minimize the number of calls to user.LookupId and avoid a potential leak.

Bug Fixes

  • Fixes the inclusion of the security-agent.yaml file in the flare.

7.44.0

27 Apr 13:36
09a59ab

Agent

Prelude

Release on: 2023-04-27

New Features

  • Added HTTP/2 parsing logic to Universal Service Monitoring.
  • Adding Universal Service Monitoring to the Agent status check. Now Datadog has visibility into the status of Universal Service Monitoring. Startup failures appear in the status check.
  • In the agent.log, a DEBUG, WARN, and ERROR log have been added to report how many file handles the core Agent process has open. The DEBUG log reports the info, the WARN log appears when the core Agent is over 90% of the OS file limit, and the ERROR log appears when the core Agent has reached 100% of the OS file limit. In the Agent status command, fields CoreAgentProcessOpenFiles and OSFileLimit have been added to the Logs Agent section. This feature is currently for Linux only.
  • APM: Collect trace agent startup errors and successes using instrumentation-telemetry "apm-onboarding-event" messages.
  • APM OTLP: Introduce OTLP Ingest probabilistic sampling, configurable via otlp_config.traces.probabilistic_sampler.sampling_percentage.
  • The Datadog Admission Controller can inject the .NET APM library into Kubernetes containers for auto-instrumentation.
  • Enable CWS Security Profiles by default.
  • Support the config additional_endpoints for Data Streams monitoring.
  • Added support for collecting container image metadata when using Docker.
  • Added Kafka parsing logic to system-probe
  • Allow writing SECL rules against container creation time through the new container.created_at field, similar to the existing process.created_at field. The container creation time is also reported in the sent events.
  • [experimental] CWS generates an SBOM for any running workload on the machine.
  • [experimental] CWS events are enriched with SBOM data.
  • [experimental] CWS activity dumps are enriched with SBOM data.
  • Enable OTLP endpoint for receiving traces in the Datadog Lambda Extension.
  • On Windows, when service inference is enabled, process_context tags can now be populated by the service name in the SCM. This feature can be controlled by either the service_monitoring_config.process_service_inference.enabled config setting in the user's datadog.yaml config file, or it can be configured via the DD_SYSTEM_PROBE_PROCESS_SERVICE_INFERENCE_USE_WINDOWS_SERVICE_NAME environment variable. This setting is enabled by default.
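
The file-handle logging described in the list above comes down to a threshold check. The following Python sketch is illustrative only: the function name and the returned level strings are assumptions, and the Agent's own implementation may differ. It returns the most severe level that applies.

```python
def file_handle_log_level(open_files: int, os_file_limit: int) -> str:
    """Return the most severe log level implied by the documented thresholds."""
    if os_file_limit <= 0:
        raise ValueError("os_file_limit must be positive")
    # Integer arithmetic avoids floating-point edge cases at exactly 90%.
    if open_files >= os_file_limit:
        return "ERROR"  # reached 100% of the OS file limit
    if open_files * 10 > os_file_limit * 9:
        return "WARN"   # over 90% of the OS file limit
    return "DEBUG"      # informational report only


print(file_handle_log_level(950, 1000))
```

With 950 open files against a limit of 1000, the sketch reports the WARN level; exactly 900 of 1000 is not "over 90%" and stays at DEBUG.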

Enhancement Notes

  • Added kubernetes_state.hpa.status_target_metric and kubernetes_state.deployment.replicas_ready metrics as part of the kubernetes_state_core check.

  • The status page now includes a Status render errors section to highlight errors that occurred while rendering it.

  • APM:

    • Run the /debug/* endpoints in a separate server which uses port 5012 by default and only listens on 127.0.0.1. The port is configurable through apm_config.debug.port and DD_APM_DEBUG_PORT; set it to 0 to disable the server.
    • Scrub the content served by the expvar endpoint.
  • APM: apm_config.features is now configurable from the Agent configuration file. It was previously only configurable via DD_APM_FEATURES.

  • Agents are now built with Go 1.19.7.

  • The OTLP ingest endpoint now supports the same settings and protocol as the OpenTelemetry Collector OTLP receiver v0.71.0.

  • Collect Kubernetes Pod conditions.

  • Added the "availability-zone" tag to the Fargate integration. This matches the tag emitted by other AWS infrastructure integrations.

  • Allow reporting all gathered data when container metrics retrieval partially fails.

  • Upgraded JMXFetch to 0.47.8 which has improvements aimed to help large metric collections drop fewer payloads.

  • JMXFetch upgraded to 0.47.5 which now supports pulling metrics from javax.management.openmbean.TabularDataSupport. Also contains a fix for pulling metrics from javax.management.openmbean.TabularDataSupport when no tags are specified.

  • Updated chunking util and use cases to use generics. No behavior change.

  • [corechecks/snmp] Add interface_configs to override interface speed.

  • No longer increments TCP retransmit count when the retransmit fails.

  • The OTLP ingestion endpoint now supports the same settings and protocols as the OpenTelemetry Collector OTLP receiver v0.70.0.

  • Changes the retry mechanism of starting workloadmeta collectors so that instead of retrying every 30 seconds, it retries following an exponential backoff with an initial interval of 1s and a maximum of 30s. In general, this should help collectors that failed on the first try start sooner.

  • Added the "pull_duration" metric in the workloadmeta telemetry. It measures the time that it takes to pull from the collectors.
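
The workloadmeta retry change in the list above can be pictured with a short Python sketch. The initial interval of 1s and the maximum of 30s come from the release note; the doubling multiplier and the absence of jitter are assumptions made for illustration, and the function name is hypothetical.

```python
from itertools import islice
from typing import Iterator


def backoff_intervals(initial: float = 1.0, maximum: float = 30.0,
                      multiplier: float = 2.0) -> Iterator[float]:
    """Yield retry delays in seconds, growing geometrically up to a cap."""
    interval = initial
    while True:
        yield interval
        # Assumption: the delay doubles each time; the note does not say so.
        interval = min(interval * multiplier, maximum)


print(list(islice(backoff_intervals(), 7)))
# → [1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0]
```

Compared with a fixed 30-second retry, a collector that fails once is retried after 1 second under this schedule, and the delay only reaches 30 seconds after several consecutive failures.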

Deprecation Notes

  • Marked the "availability_zone" tag as deprecated for the Fargate integration, in favor of "availability-zone".
  • Configuration enable_sketch_stream_payload_serialization is now deprecated.

Security Notes

  • The Agent now checks containerd containers Spec size before parsing it. Any Spec exceeding 2MB will not be parsed and a warning will be emitted. This impacts the container_env_as_tags feature and %%hostname%% variable resolution for environments based on containerd outside of Kubernetes.
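
As a rough illustration of the size check described above, the Python sketch below refuses to parse an oversized payload and logs a warning instead. It assumes the Spec arrives as JSON bytes and reads "2MB" as 2 * 1024 * 1024 bytes; both are assumptions, and the function and constant names are hypothetical.

```python
import json
import logging
from typing import Optional

MAX_SPEC_BYTES = 2 * 1024 * 1024  # assumption: "2MB" interpreted as 2 MiB


def parse_spec(raw: bytes, max_bytes: int = MAX_SPEC_BYTES) -> Optional[dict]:
    """Parse a JSON Spec only if it is within the size limit."""
    if len(raw) > max_bytes:
        # Check the size first so that an oversized payload is never decoded.
        logging.warning("container Spec is %d bytes (limit %d); not parsing",
                        len(raw), max_bytes)
        return None
    return json.loads(raw)
```

Callers of such a function have to handle the None case, which is why features that depend on the parsed Spec (such as environment variables as tags) are affected for oversized Specs.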

Bug Fixes

  • APM: Fix issue where dogstatsd proxy would not work when bind address was set to localhost on MacOS.
  • APM: Fix issue where setting bind_host to "::1" would break runtime metrics for the trace-agent.
  • APM: Trace Agent not printing critical init errors.
  • Fixes a bug where ignored container files (that were not tailed) were incorrectly counted against the total open files.
  • Fixes the configuration parsing of the "container_lifecycle" check. Custom config values were not being applied.
  • Corrects dogstatsd metric message validation to support all current (and some future) dogstatsd features
  • Avoid panic in kubernetes_state_core check with specific Ingress objects configuration.
  • Fixes a divide-by-zero panic when sketch serialization fails on the last metric of a given batch
  • Fix issue introduced in 7.43 that prevents the Datadog Agent Manager application from executing from the checkbox at the end of the Datadog Agent installation when the installer is run by a non-elevated administrator user.
  • Fixes a problem with USM and IIS on Windows Server 2022 due to a change in the way Microsoft reports IIS connections.
  • Fixes the labelsAsTags parameter of the kube-state metrics core check. Tags were not properly formatted when they came from a label on one resource type (for example, namespace) and turned into a tag on another resource type (for example, pod).
  • The OTLP ingest endpoint does not report the first cumulative monotonic sum value if the start timestamp of the timeseries matches its timestamp.
  • Prevent disallowlisting on an empty command line for processes in the Process Agent when encountering a failure to parse; use the exe value instead.
  • Make SNMP Listener support all authProtocol.
  • Fix an issue where agent status would show incorrect system-probe status for 15 seconds as the system-probe started up.
  • Fix partial loss of NAT info in system-probe for pre-existing connections.
  • Replace ; with & in the URL to open GUI to follow golang.org/issue/25192.
  • Workloadmeta now avoids concurrent pulls from the same collector. This bug could lead to incorrect or missing data when the collectors were too slow pulling data.
  • Fixes a bug that prevents the containerd workloadmeta collector from starting sometimes when container_image_collection.metadata.enabled is set to true.
  • Fixed a bug in the SBOM collection feature. In certain cases, some SBOMs were not collected.
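
The GUI URL fix in the list above switches the query separator from ; to &. The linked Go issue concerns semicolons as query separators, which newer Go releases no longer accept. The Python sketch below shows the same convention: urlencode joins parameters with & and parse_qs reads them back. The URL path and the parameter names are hypothetical placeholders, not taken from the Agent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def gui_url(base: str, params: dict) -> str:
    """Build a URL whose query parameters are separated by '&'."""
    return f"{base}?{urlencode(params)}"


# Hypothetical endpoint and parameter names, for illustration only.
url = gui_url("http://127.0.0.1:5002/authenticate",
              {"authToken": "abc", "csrf": "xyz"})
print(url)
print(parse_qs(urlsplit(url).query))
```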

Other Notes

  • The logs_config.cca_in_ad has been removed.

Datadog Cluster Agent

New Features

  • Add conditions to Vertical Pod Autoscalers
  • Experimental: Support Ruby library injection through the Admission Controller on Kubernetes.

Enhancement Notes

  • Add new metrics for the KSM Core check for extended resources:
    • Pod requests and limits of the network bandwidth extended resource: kubernetes_state.container.network_bandwidth_limit, kubernetes_state.container.network_bandwidth_requested
    • The capacity and allocatable network bandwidth extended resource of a node: kubernetes_state.node.network_bandwidth_allocatable, kubernetes_state.node.network_bandwidth_capacity
  • Admission Controller: Add telemetry around auto-instrumentation via remote config.
  • The UDS socket volume when using the Admission Controller is now mounted in readOnly mode.

7.43.2

20 Apr 19:15
26255a9

Prelude

Release on: 2023-04-20

Enhancement Notes

  • Upgraded JMXFetch to 0.47.8 which has improvements aimed to help large metric collections drop fewer payloads.

lambda-extension-41

22 Mar 19:11
arn:aws:lambda:<AWS_REGION>:464622532012:layer:Datadog-Extension:41
arn:aws:lambda:<AWS_REGION>:464622532012:layer:Datadog-Extension-ARM:41
arn:aws-us-gov:lambda:us-gov-<AWS_REGION>:002406178527:layer:Datadog-Extension:41
arn:aws-us-gov:lambda:us-gov-<AWS_REGION>:002406178527:layer:Datadog-Extension-ARM:41

What's Changed

  • Default DD_TRACE_MANAGED_SERVICES to true #16176
  • Ensure we filter the serverless span correctly #16240
  • Fix panic when running the extension without appsec enabled #16054

The extension is now built with the otlp build tag which enables opentelemetry.

7.43.1

07 Mar 17:47
9e9c790

Prelude

Release on: 2023-03-07

Enhancement Notes

  • Agents are now built with Go 1.19.6.

7.43.0

23 Feb 12:33
e3c1ac4

Agent

Prelude

Release on: 2023-02-23

Upgrade Notes

  • The command line arguments to the Datadog Agent Manager for Windows ddtray.exe have changed from single-dash arguments to double-dash arguments. For example, -launch-gui must now be provided as --launch-gui. The start menu shortcut created by the installer will be automatically updated. Any custom scripts or shortcuts that launch ddtray.exe with arguments must be updated manually.
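
For custom scripts that build a ddtray.exe argument list, the change can be applied mechanically. The helper below is a hypothetical Python sketch, not part of the Agent: it rewrites single-dash arguments such as -launch-gui to their double-dash form and leaves everything else alone.

```python
from typing import List


def to_double_dash(args: List[str]) -> List[str]:
    """Rewrite single-dash long arguments to double-dash form."""
    updated = []
    for arg in args:
        # Leave a lone "-" and arguments that already start with "--" alone.
        # Caveat: a value such as "-1" would also be rewritten, so apply this
        # only to lists that contain flags, not flag values.
        if arg.startswith("-") and not arg.startswith("--") and len(arg) > 1:
            updated.append("-" + arg)
        else:
            updated.append(arg)
    return updated


print(to_double_dash(["-launch-gui"]))
# → ['--launch-gui']
```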

New Features

  • NDM: Add snmp.device.reachable/unreachable metrics to all monitored devices.

  • Add a new container_image long running check to collect information about container images.

  • Enable orchestrator manifest collection by default.

  • Add a new sbom core check to collect the software bill of materials of containers.

  • The Agent now leverages DMI (Desktop Management Interface) information on Unix to get the instance ID on Amazon EC2 when the metadata endpoint fails or is not accessible. The instance ID is exposed through DMI only on AWS Nitro instances. This will not change the hostname of the Agent upon upgrading, but will add it to the list of host aliases.

  • Adds the option to collect and store in workloadmeta the software bill of materials (SBOM) of containerd images using Trivy. This feature is disabled by default. It can be enabled by setting container_image_collection.sbom.enabled to true. Note: This feature is CPU and IO intensive.

Enhancement Notes

  • Adds a new snmp.interface_status metric reflecting the same status as within NDM.
  • APM: Ported a faster implementation of NormalizeTag with a fast-path for already normalized ASCII tags. Should marginally improve CPU usage of the trace-agent.
  • The external metrics server now automatically adjusts the query time window based on the Datadog metrics MaxAge attribute.
  • Added parity to Unix-based permissions.log Flare file on Windows. The permissions.log file lists the original rights/ACL of the files copied into an Agent flare. This will ease troubleshooting permissions issues.
  • [corechecks/snmp] Add id and source_type to NDM Topology Links
  • Add an --instance-filter option to the Agent check command.
  • APM: Disable max_memory and max_cpu_percent by default in containerized environments (Docker-only, ECS and CI). Users rely on the orchestrator / container runtime to set resource limits. Note: max_memory and max_cpu_percent have been disabled by default in Kubernetes environments since Agent 7.18.0.
  • Agents are now built with Go 1.19.5.
  • To reduce "cluster-agent" memory consumption when the cluster_agent.collect_kubernetes_tags option is enabled, the new cluster_agent.kubernetes_resources_collection.pod_annotations_exclude option excludes Pod annotations from the extracted Pod metadata.
  • Introduce a new option enabled_rfc1123_compliant_cluster_name_tag that enforces the kube_cluster_name tag value to be an RFC1123 compliant cluster name. It can be disabled by setting this new option to false.
  • Allows profiling for the Process Agent to be dynamically enabled from the CLI with process-agent config set internal_profiling. Optionally, once profiling is enabled, block, mutex, and goroutine profiling can also be enabled with process-agent config set runtime_block_profile_rate, process-agent config set runtime_mutex_profile_fraction, and process-agent config set internal_profiling_goroutines.
  • Adds a new process discovery hint in the process agent when the regular process and container checks run.
  • Added new telemetry metrics (pymem.*) to track Python heap usage.
  • There are two default config files. Optionally, you can provide override config files. The change in this release is that for both sets, if the first config is inaccessible, the security agent startup process fails. Previously, the security agent would continue to attempt to start up even if the first config file is inaccessible. To illustrate this, in the default case, the config files are datadog.yaml and security-agent.yaml, and in that order. If datadog.yaml is inaccessible, the security agent fails immediately. If you provide overrides, like foo.yaml and bar.yaml, the security agent fails immediately if foo.yaml is inaccessible. In both sets, if any additional config files are missing, the security agent continues to attempt to start up, with a log message about an inaccessible config file. This is not a change from previous behavior.
  • [corechecks/snmp] Add IP Addresses to NDM Metadata interfaces
  • [corechecks/snmp] Add LLDP remote device IP address.
  • prometheus_scrape: Adds support for tag_by_endpoint and collect_counters_with_distributions in the prometheus_scrape.checks[].configurations[] items.
  • The OTLP ingest endpoint now supports the same settings and protocols as the OpenTelemetry Collector OTLP receiver v0.68.0.
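
The security agent config-file rule described in the list above (the first file must be accessible; later files may be missing) can be sketched as follows. This is a Python illustration under stated assumptions: the function, the exception class, and the injectable accessibility check are hypothetical names, and the real startup code may be organized differently.

```python
import os
from typing import Callable, List, Sequence


class ConfigError(Exception):
    """Raised when the first config file cannot be accessed."""


def readable(path: str) -> bool:
    return os.access(path, os.R_OK)


def select_config_files(paths: Sequence[str],
                        is_accessible: Callable[[str], bool] = readable,
                        warn: Callable[[str], None] = print) -> List[str]:
    """Return the usable config files, failing fast on the first one."""
    if not paths:
        raise ConfigError("no config files given")
    first, *rest = paths
    if not is_accessible(first):
        # New behavior: startup fails if the first file is inaccessible.
        raise ConfigError(f"cannot access {first}")
    usable = [first]
    for path in rest:
        if is_accessible(path):
            usable.append(path)
        else:
            # Unchanged behavior: log and keep starting up.
            warn(f"config file {path} is inaccessible; continuing")
    return usable
```

With the defaults from the note, select_config_files(["datadog.yaml", "security-agent.yaml"]) raises if datadog.yaml is inaccessible, and only warns if security-agent.yaml is.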

Deprecation Notes

  • The command line arguments to the Datadog Agent Manager for Windows ddtray.exe have changed from single-dash arguments to double-dash arguments. For example, -launch-gui must now be provided as --launch-gui.
  • system_probe_config.enable_go_tls_support is deprecated and replaced by service_monitoring_config.enable_go_tls_support.

Security Notes

  • Some HTTP requests sent by the Datadog Agent to Datadog endpoints were including the Datadog API key in the query parameters (in the URL). This meant that the keys could potentially have been logged in various locations, for example, in the logs of a forward or reverse proxy server the Agent connected to. We have updated all requests to not send the API key as a query parameter. Anyone who uses a proxy to connect the Agent to Datadog endpoints should make sure their proxy forwards all Datadog headers (particularly DD-Api-Key). Failure to forward all Datadog headers could cause payloads to be rejected by our endpoints.
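
To make the difference concrete, the Python sketch below moves a key from the query string into the DD-Api-Key header, which keeps it out of URL-based logs. The query parameter name api_key and the example host are assumptions used for illustration; the helper is hypothetical and is not Agent code.

```python
from typing import Dict, Tuple
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def move_api_key_to_header(url: str,
                           headers: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Strip an api_key query parameter and carry it in a header instead."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    keys = [value for name, value in pairs if name == "api_key"]
    remaining = [(name, value) for name, value in pairs if name != "api_key"]
    new_headers = dict(headers)
    if keys:
        new_headers["DD-Api-Key"] = keys[-1]
    clean_url = urlunsplit(parts._replace(query=urlencode(remaining)))
    return clean_url, new_headers


clean, hdrs = move_api_key_to_header(
    "https://intake.example.invalid/api/v1/series?api_key=abc123&foo=bar",
    {"Content-Type": "application/json"},
)
print(clean)
```

A proxy that logs request URLs would then record only the cleaned URL, but it must pass the header through unchanged for the request to be accepted.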

Bug Fixes

  • The secret command now correctly displays the ACL on a path with spaces.
  • APM: Lower default incoming trace payload limit to 25MB. This more closely aligns with the backend limit. Some users may see traces rejected by the Agent that the Agent would have previously accepted, but would have subsequently been rejected by the trace intake. The Agent limit can still be configured via apm_config.max_payload_size.
  • APM: Fix the trace-agent -info command when remote configuration is enabled.
  • APM: Fix parsing of SQL Server identifiers enclosed in square brackets.
  • Remove files created by system-probe at uninstall time.
  • Fix the kubernetes_state_core check so that the host alias name creation uses a normalized (RFC1123 compliant) cluster name.
  • Fix an issue in Autodiscovery that could prevent Cluster Checks containing secrets (ENC[] syntax) from being unscheduled properly.
  • Fix panic due to uninitialized Obfuscator logger
  • On Windows, fixes bug in which HTTP connections were not properly accounted for when the client and server were the same host (loopback).
  • The Openmetrics check is no longer scheduled for Kubernetes headless services.

Other Notes

  • Upgrade of the cgosymbolizer dependency to use github.com/ianlancetaylor/cgosymbolizer.
  • The Datadog Agent Manager ddtray.exe now requires admin to launch.

Datadog Cluster Agent

New Features

  • Starts collecting Vertical Pod Autoscalers within Kubernetes clusters.
  • Enable orchestrator manifest collection by default

Bug Fixes

  • Make the cluster-agent admission controller able to inject libraries for several languages in a single pod.

7.42.2

16 Feb 19:36
373d0f8

Prelude

Release on: 2023-02-16

7.42.1

02 Feb 22:53
40fce81

Prelude

Release on: 2023-02-02

7.42.0

23 Jan 14:26

Agent

Prelude

Release on: 2023-01-23

Upgrade Notes

  • Downloading and installing official checks with agent integration install is no longer supported for Agent installations that do not include an embedded python3.

New Features

  • Adding the kube_api_version tag to all orchestrator resources.

  • Kubernetes Pod events generated by the kubernetes_apiserver can now benefit from the new cluster-tagger component in the Cluster-Agent.

  • APM OTLP: Added compatibility for the OpenTelemetry Collector's datadogprocessor to the OTLP Ingest.

  • The CWS agent now supports rules on mount events.

  • Adding a configuration option, exclude_ec2_tags, to exclude EC2 instance tags from being converted into host tags.

  • Adds detection for a process being executed directly from memory without the binary present on disk.

  • Introducing agent sampling rates remote configuration.

  • Adds support for secret_backend_command_sha256 SHA for the secret_backend_command executable. If secret_backend_command_sha256 is used, the following restrictions are in place:

    • Value specified in the secret_backend_command setting must be an absolute path.

    • Permissions for the datadog.yaml config file must disallow write access by users other than ddagentuser or Administrators on Windows or the user running the Agent on Linux and macOS.

    The agent will refuse to start if the actual SHA256 of the secret_backend_command executable is different from the one specified by secret_backend_command_sha256. The secret_backend_command file is locked during verification of SHA256 and subsequent run of the secret backend executable.

  • Collect network devices topology metadata.

  • Add support for AWS Lambda Telemetry API

  • Adds three new metrics collected by the Lambda Extension

    `aws.lambda.enhanced.response_latency`: Measures the elapsed time in milliseconds from when the invocation request is received to when the first byte of response is sent to the client.

    `aws.lambda.enhanced.response_duration`: Measures the elapsed time in milliseconds between sending the first byte of the response to the client and sending the last byte of the response to the client.

    `aws.lambda.enhanced.produced_bytes`: Measures the number of bytes returned by a function.

  • Create cold start span representing time and duration of initialization of an AWS Lambda function.
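
The secret_backend_command_sha256 check listed above can be approximated in a few lines. This Python sketch covers only two of the stated rules, the absolute-path requirement and the digest comparison; it does not model file locking or the datadog.yaml permission check, and the function names are hypothetical.

```python
import hashlib
import os


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_secret_backend(command_path: str, expected_sha256: str) -> None:
    """Raise if the executable does not satisfy the two checks sketched here."""
    if not os.path.isabs(command_path):
        raise ValueError("secret_backend_command must be an absolute path")
    actual = sha256_of_file(command_path)
    if actual != expected_sha256.strip().lower():
        # Mirrors the documented outcome: the Agent refuses to start.
        raise RuntimeError(
            f"SHA256 mismatch for {command_path}: got {actual}")
```

On most systems the expected value can be produced ahead of time with a standard tool such as sha256sum and then pasted into the configuration.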

Enhancement Notes

  • Adds both the StartTime and ScheduledTime properties in the collector for Kubernetes pods.
  • Add an option (hostname_trust_uts_namespace) to force the Agent to trust the hostname value retrieved from non-root UTS namespaces (Linux only).
  • Metrics from Giant Swarm pause containers are now excluded by default.
  • Events emitted by the Helm check now have "Error" status when the release fails.
  • Add an annotations_as_tags parameter to the kubernetes_state_core check to allow attaching Kubernetes annotations as Datadog tags in a similar way that the labels_as_tags parameter does.
  • Adds the windows_counter_init_failure_limit option. This option limits the number of times a check will attempt to initialize a performance counter before ceasing attempts to initialize the counter.
  • [netflow] Expose collector metrics (from goflow) as Datadog metrics
  • [netflow] Add prometheus listener to expose goflow telemetry
  • OTLP ingest now uses the minimum and maximum fields from delta OTLP Histograms and OTLP ExponentialHistograms when available.
  • The OTLP ingest endpoint now reports the first cumulative monotonic sum value if the timeseries started after the Datadog Agent process started.
  • Added the workload-list command to the process agent. It lists the entities stored in workloadmeta.
  • Allows running secrets in the Process Agent on Windows by sandboxing secret_backend_command execution to the ddagentuser account used by the Core Agent service.
  • Add process_context tag extraction based on a process's command line arguments for service monitoring. This feature is configured in the system-probe.yaml with the following configuration: service_monitoring_config.process_service_inference.enabled.
  • Reduce the overhead of using Windows Performance Counters / PDH in checks.
  • The OTLP ingest endpoint now supports the same settings and protocol as the OpenTelemetry Collector OTLP receiver v0.64.1
  • The OTLP ingest endpoint now supports the same settings and protocols as the OpenTelemetry Collector OTLP receiver v0.66.0.
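
The note above about the first cumulative monotonic sum value can be read as a rule inside a cumulative-to-delta conversion. The Python sketch below is one possible reading, not the Agent's implementation: the class and its fields are hypothetical, and counter resets are deliberately not handled. The first point of a series has no previous value to subtract, so it is reported only when the series started after the Agent process did.

```python
from typing import Dict, Optional


class CumulativeToDelta:
    """Turn cumulative monotonic sums into per-point deltas."""

    def __init__(self, process_start_ts: float) -> None:
        self.process_start_ts = process_start_ts
        self._last: Dict[str, float] = {}

    def delta(self, series: str, start_ts: float, value: float) -> Optional[float]:
        previous = self._last.get(series)
        self._last[series] = value
        if previous is not None:
            return value - previous
        if start_ts > self.process_start_ts:
            # The series began after this process started, so the whole
            # first value was accumulated while it was being observed.
            return value
        # Otherwise part of the value predates the process; skip the point.
        return None


conv = CumulativeToDelta(process_start_ts=100.0)
print(conv.delta("requests", start_ts=150.0, value=5.0))
# → 5.0
```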

Deprecation Notes

  • Removes the install-service Windows agent command.
  • Removes the remove-service Windows agent command.

Security Notes

  • Upgrade the wheel package to 0.37.1 for Python 2.
  • Upgrade the wheel package to 0.38.4 for Python 3.

Bug Fixes

  • APM: Fix an issue where container tags weren't working because of overwriting an essential tag on spans.
  • APM OTLP: Fix an issue where a span's local "peer.service" attribute would not override a resource attribute-level service.
  • On Windows, fixes a bug in the NPM network driver which could cause a system crash (BSOD).
  • Create only endpoints check from prometheus scrape configuration when prometheus_scrape.service.endpoint option is enabled.
  • Fix how Kubernetes events forwarding detects the Node/Host.
    • Previously Nodes' events were not always attached to the correct host.
    • Pods' events from "custom" controllers might still not be attached to a host if the controller doesn't set the host in the source.host event's field.
  • APM: Fix SQL parsing of negative numbers and improve error message.
  • Fix a potential panic when df outputs warnings or errors among its standard output.
  • Fix a bug where a misconfig error does not show when hidepid=invisible
  • The agent no longer wrongly resolves its hostname on ECS Fargate when requests to the Fargate API time out.
  • Metrics reported through OTLP ingest now have the interval property unset.
  • Fix a PDH query handle leak that occurred when a counter failed to add to a query.
  • Remove unused environment variables DD_AGENT_PY and DD_AGENT_PY_ENV from known environment variables in flare command.
  • APM: Fix SQL obfuscator parsing of identifiers containing dollar signs.

Other Notes

  • JMXFetch upgraded to 0.47.2
  • Bump embedded Python3 to 3.8.16.

Datadog Cluster Agent

New Features

  • Supports the collection of custom resource definition and custom resource manifests for the orchestrator explorer.

Enhancement Notes

  • Collects Unified Service Tags for the orchestrator explorer product.

7.41.1

21 Dec 15:51
4f39b9e

Prelude

Release on: 2022-12-21

Enhancement Notes

  • Agents are now built with Go 1.18.9.