Releases: DataDog/datadog-agent
6.5.1
Docker, Windows, Linux
Changes
Prelude
- Please refer to the 6.5.1 tag on integrations-core for the list of changes on the Core Checks.
- Please refer to the 6.5.1 tag on trace-agent for the list of changes on the Trace Agent.
- Please refer to the 6.5.1 tag on process-agent for the list of changes on the Process Agent.
Bug Fixes
- Fix possible deadlocks that could occur when new docker sources and services are pushed and:
  - The docker socket is closed at agent setup
  - The docker socket is not mounted
  - The kubernetes integration is enabled
- Fix a deadlock that could occur when the logs-agent is enabled and the configuration parameter logs_config.container_collect_all or the environment variable DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL is set to true.
6.5.0
Please note that a critical bug identified in this release, affecting container log collection when the container_collect_all option was set, would lead to an agent deadlock. The severity of the issue has led us to remove the packages for the affected platforms (Linux and Docker). If you have upgraded to this version on Linux or Docker, we recommend you downgrade to 6.4.2.
Prelude
- Please refer to the 6.5.0 tag on integrations-core for the list of changes on the Core Checks.
- Please refer to the 6.5.0 tag on trace-agent for the list of changes on the Trace Agent.
- Please refer to the 6.5.0 tag on process-agent for the list of changes on the Process Agent.
New Features
- Autodiscovery: the docker and kubelet listeners will retry on error, to support starting the agent before your container runtime (host install)
- Bump the default number of check runners to 4. This has some concurrency implications as we will now run multiple checks in parallel.
- Kubernetes: to avoid hostname collisions between clusters, a new cluster_name option is available. It will be added as a suffix to the host alias detected from the kubelet in order to make these aliases unique across different clusters.
- Docker image: handle docker/kubernetes secret files with a helper script.
- The Node Agent can rely on the Datadog Cluster Agent to collect Node Labels.
- Improved ECS fargate tagging:
  - Honor the docker_labels_as_tags option to extract custom tags
  - Make the cluster_name tag shorter
  - Add the short_image and container_id tags
  - Remove some noisy tags
  - Fix a lifecycle issue that caused missing tags
- The live containers view can now retrieve containers directly from the kubelet, in order to support containerd and crio
- Kubernetes events: setting event host tags to the related hosts, instead of the host collecting the events.
- Added dedicated configuration parameters to send logs to a proxy by TCP. Note that logs_config.dd_url, logs_config.dd_port and logs_config.dev_mode_no_ssl are deprecated and will be unavailable soon; use the new parameters logs_config.logs_dd_url and logs_config.logs_no_ssl instead.
- Added the possibility to send logs to Datadog using the port 443.
Enhancement Notes
- Add more environment variables to the flare whitelist
- When dd_url is set to app.datadoghq.eu, the infra Agent also sends data to versioned endpoints (similar to app.datadoghq.com)
- Make all numbers on the status page more human readable (using unit and SI prefix when appropriate)
- Display hostname provider and errors on the status page
- Kubelet Autodiscovery: reduce logging when no change is detected
- On Windows, the hostname_fqdn flag will now be honored, and the host reported by Datadog will be the fully qualified hostname.
- Enable all configuration options to be set with env vars
- Tags generated from GCE metadata may now be omitted by using the collect_gce_tags configuration option.
- Introduction of a new bucketed scheduler to enable multiple check workers to increase concurrency while spreading the load over the collection interval.
- The 'status' command and 'status' page (in the GUI) now display errors raised by the '__init__' method of a Python check.
- Exclude the rancher pause container in the agent
- On status page, allow users to know which instance of a check matches which yaml instance in configcheck page
- The file_handle check reports 4 new metrics for feature parity with agent 5
- The ntp check will now query multiple servers by default to be more resilient to servers returning wrong offsets. A new config option hosts is now available in the ntp check configuration file to allow users to change the list of ntp servers.
- Tags and sources in the tagger-list command are now sorted to ease troubleshooting.
- To allow concurrent execution of subprocess calls from python, we now save the thread state and release the GIL to unblock the interpreter. We reacquire the GIL and restore the thread state when the subprocess call returns.
- Add a new configuration option, named tag_value_split_separator, allowing the specified list of raw tags to have its value split by a given separator. Only applies to host tags, tags coming from container integrations. Does not apply to tags on dogstatsd metrics, and tags collected by other integrations.
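As a rough illustration of what splitting a tag value by a separator means, here is a minimal Python sketch. It is not the Agent's implementation: the function name `split_tag_values` and the assumption that the option maps a tag name to its separator are hypothetical, chosen only to make the behavior concrete.

```python
def split_tag_values(tags, separators):
    """Expand tags whose key has a configured separator.

    `separators` is a hypothetical mapping of tag key -> separator string.
    Tags whose key is not listed are passed through unchanged.
    """
    result = []
    for tag in tags:
        key, colon, value = tag.partition(":")
        separator = separators.get(key)
        if not colon or not separator:
            # No value part, or no separator configured for this key.
            result.append(tag)
            continue
        # Emit one tag per non-empty piece of the original value.
        result.extend(f"{key}:{part}" for part in value.split(separator) if part)
    return result
```

Under these assumptions, `split_tag_values(["team:web,api", "env:prod"], {"team": ","})` yields one `team` tag per value and leaves `env:prod` untouched.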
Upgrade Notes
- Autodiscovery now enforces the ac_exclude and ac_include filtering options for all listeners. Please double-check your exclusion patterns before upgrading and add inclusion patterns if some autodiscovered containers match these.
- The introduction of multiple runners for checks implies check instances may now run concurrently. This should help the agent make better use of resources; in particular, it will help prevent or reduce the side-effects of slow checks delaying the execution of all other checks.
The change will affect custom checks that do not enforce thread safety: depending on the schedule, they may access unsynchronized structures concurrently, resulting in data races. If you wish to run checks in a fully sequential fashion, you may set the check_runners option in your datadog.yaml config, or the DD_CHECK_RUNNERS environment variable, to 1. Also, please feel free to reach out to us if you need more information or help with the new multiple runner/concurrency model.
For more details please read the technical note in the datadog.yaml.
- Prometheus custom checks are now limited to 2000 metrics by default to provide users control over the maximum number of custom metrics sent in the case of configuration errors or input changes. This limit can be changed with the max_returned_metrics option in the check configuration.
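The thread-safety concern raised in the multiple-runners note above can be sketched in Python as follows. This is a simplified stand-in, not Agent code: `CounterCheck` and `run_concurrently` are hypothetical names, and a thread pool of 4 workers is used only to mimic the new default number of check runners.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class CounterCheck:
    """Hypothetical custom check that keeps state shared across runs."""

    def __init__(self):
        self._lock = threading.Lock()
        self.runs = 0

    def check(self, instance):
        # Guard the read-modify-write so concurrent runs cannot lose updates.
        with self._lock:
            self.runs += 1


def run_concurrently(check, instances, runners=4):
    # The pool size mirrors the default of 4 check runners described above.
    with ThreadPoolExecutor(max_workers=runners) as pool:
        list(pool.map(check.check, instances))
    return check.runs
```

Any structure that several check runs may touch at once (counters, caches, connection pools) deserves the same treatment; setting check_runners to 1 remains the fallback when a check cannot be made thread safe.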
Bug Fixes
- All Autodiscovery listeners now enforce the ac_exclude and ac_include filtering options, as described in the documentation.
- Fixed "logs_config.frame_size" override that would not be taken into account.
- collect io metrics for drives with path only (like: C:C0) on Windows
- Fix API_KEY validation for 'additional_endpoints' by using their respective endpoint instead of the main one all the time.
- Fix port ordering for the %%port_%% Autodiscovery tag on the docker listener
- Fix missing ECS tags under some conditions
- Change the name of the agent expvar from aggregator/ServiceCheckFlushed) to aggregator/ServiceCheckFlushed
- Fix an issue where logs wouldn't be ingested if the API key contains a trailing new line
- Setting the log level of the check subcommand using the -l flag was not setting the log level of python integrations.
- Display embedded Python version in the status page instead of the version from the system Python.
- Fixes a bug causing kube_service tags to be missing when kubernetes_map_services_on_ip is false.
- The ntp check now handles negative offsets if the host time is in the future.
- Fix a possible index out of range panic in Dogstatsd origin detection
- Fix a verbose debug log caused by rescheduling services with no checks associated with them.
Other Notes
- JMXFetch upgraded to 0.20.2; ships updated FasterXML.
- Remove noisy and useless debug log line from contextResolver
6.4.2
Docker, Windows, Linux
Changes
Prelude
Release on: 2018-08-13
- Please refer to the 6.4.2 tag on integrations-core for the list of changes on the Core Checks.
Enhancement Notes
- The flare command does not collect the agent container's environment
variables anymore
Bug Fixes
- Fixes an issue with docker tailing on restart of monitored
containers. Previously, at each container restart the agent would
resubmit all logs. Now, on restart we use tracked offsets properly,
and as a result submit only new logs
6.4.1 / 2018-08-01
Docker, Windows, Linux
Changes
Prelude
Release on: 2018-08-01
- Please refer to the 6.4.1 tag on integrations-core for the list of changes on the Core Checks.
- Please refer to the 6.4.1 tag on trace-agent for the list of changes on the Trace Agent.
- Please refer to the 6.4.1 tag on process-agent for the list of changes on the Process Agent.
New Features
- Create packaging for google cloud launcher integration.
- Add options to exclude specific payloads from being sent to Datadog.
In some environments, some of the gathered information is considered
too sensitive to be sent to Datadog (e.g. IP addresses in events or
service checks). This feature adds the option to exclude specific
payload types from being sent to the backend.
- Collect container disk metrics less often in the docker check,
decreasing its effect on performance when enabled.
- Autodiscovery now supports the %%hostname%% tag on the docker
listener. This tag will resolve to the container's hostname value if
present in the container inspect. It is useful if the container IP
is not available or erroneous.
- Dogstatsd origin detection now supports container tagging for
Kubernetes clusters running containerd or cri-o, in addition to the
existing docker support
- This release ships full support of Kubernetes 1.3+
- OpenShift ClusterResourceQuotas metrics are now collected by the
kube_apiserver check, under the openshift.clusterquota.* and
openshift.appliedclusterquota.* names.
- Display the version for Python checks on the status page.
Enhancement Notes
- Adding DD_EXPVAR_PORT to the configuration environment variables.
- On Windows, specifically log to both the log file and the event
viewer what initiated an agent shutdown. Also logs specific startup
errors to both the log file and event viewer.
- The embedded Python has been bumped from 2.7.14 to 2.7.15
- Agent expvar metrics now have default values. Metrics like the
number of packets dropped by the agent or errors were previously not
reported until a first event occurred. This should make it easier to
use the expvar configuration agent_stats.yaml.
- Proxy settings can be configured through the environment variables
DD_PROXY_HTTP, DD_PROXY_HTTPS and DD_PROXY_NO_PROXY. These
environment variables take precedence over the proxy options
configured in datadog.yaml, and behave exactly the same way as
these options. The standard HTTP_PROXY, HTTPS_PROXY and
NO_PROXY are still honored but have known side effects on
integrations; for simplicity we recommend using the new
environment variables. For more information, please refer to our
proxy docs
- Update to distribution metrics algorithm with improved accuracy
- Added ECS pause containers to the default docker exclusion list
- Adding logging for when the agent fails to detect the origin of a
packet in dogstatsd socket mode because of namespace issues.
- The skip_ssl_validation configuration option can now be set
through the related DD_SKIP_SSL_VALIDATION env var
- The Agent will log failed healthchecks on query and during exit
- On Windows, provides installation parameter to set the cmd_port,
the port on which the agent command interface runs. To be used if
the default (5001) is already used by another program.
- The kube_service tag is now collected on Kubernetes 1.3.x versions.
The matching uses a new logic. If it were to fail, reverting to the
previous logic is possible by setting the
kubernetes_map_services_on_ip option to true.
- The Kubernetes event collection timeout is now configurable
- Logs Agent: Added SOCKS5 proxy support. Use
logs_config: socks5_proxy_address: fqdn.example.com:port to set
the proxy.
- The diagnose output is now sorted by the diagnosis name
- Adding the status of the DCA (If enabled) in the Agent status
command.
Upgrade Notes
- If the environment variables that can be used to configure a proxy
(DD_PROXY_HTTP, DD_PROXY_HTTPS, DD_PROXY_NO_PROXY,
HTTP_PROXY, HTTPS_PROXY and NO_PROXY) are present with an
empty value (e.g. HTTP_PROXY=""), the Agent now uses this empty
value instead of ignoring it and using lower-precedence options.
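The difference between "present but empty" and "absent" can be illustrated with a short Python sketch. It is only a model of the behavior described above, not the Agent's code: `resolve_http_proxy` is a hypothetical helper, and the relative order of DD_PROXY_HTTP and HTTP_PROXY used here is an assumption (the notes only state that both override datadog.yaml).

```python
def resolve_http_proxy(environ, config_proxy):
    """Return the HTTP proxy URL to use.

    Assumed precedence (highest first): DD_PROXY_HTTP, HTTP_PROXY, then the
    proxy value from datadog.yaml. A variable that is present but empty wins
    over lower-precedence sources, which means no proxy is used.
    """
    for name in ("DD_PROXY_HTTP", "HTTP_PROXY"):
        if name in environ:  # presence check, not truthiness
            return environ[name]
    return config_proxy
```

With this model, `resolve_http_proxy({"HTTP_PROXY": ""}, "http://proxy:3128")` returns the empty string rather than falling back to the configured proxy, which is the upgrade-relevant change.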
Deprecation Notes
- Begin deprecating "Agent start" command. It is being replaced by
"run". The "start" command will continue to function, with a
deprecation notice
Security Issues
- 'app_key' value from the configuration is now redacted when
creating a flare with the agent.
Bug Fixes
- Fixes presence of invalid UTF-8 characters when docker log message
is greater than 16Kb
- Fix a possible agent crash due to a race condition in
Autodiscovery.
- Fixed an issue with jmxfetch not being killed on agent exit.
- Errors logged before the agent initialized the log module are now
printed on STDERR instead of being silenced.
- Detect and handle Docker messages without header.
- Fixes installation, packaging scripts for OpenSUSE LEAP and greater.
- In the event of being unable to lock the dd-agent user (e.g. dd-agent
is an LDAP user) during installation, do not fail; print relevant
warning.
- The leader election process is now restarted if the leader stops
leading.
- Avoid Linux package installation failures when both the initctl
and systemctl commands are present but upstart is used as the init
system
Other Notes
- The system information collected from gohai no longer includes
network information when the agent is running in a container since
the network information is for the container and not the host
itself.
- The ntp check now runs every 15 minutes by default to avoid
over-loading the NTP server pools
- Added new command "run" to the agent. This command replaces the
"start" command, to reduce ambiguity with the service lifecycle
commands
6.3.3
Docker, Windows, Linux
Changes
Prelude
Release on: 2018-07-16
- Please refer to the 6.3.3 tag on integrations-core for the list of changes on the Core Checks.
- Please refer to the 6.3.3 tag on trace-agent for the list of changes on the Trace Agent.
- Please refer to the 6.3.3 tag on process-agent for the list of changes on the Process Agent.
Enhancements
- Add 'system.mem.buffered' metric on linux system.
Bug Fixes
- Fix the IO check behavior on unix based on 'iostat' tool:
  - Most metrics are an average time, so we don't need to divide again by
    'delta' (ex: number of read/time doing read operations)
  - time is based on the millisecond and not the second
- Kubernetes API Server's polling frequency is now customisable.
- Use as expected the configuration value of kubernetes_metadata_tag_update_freq,
introduce a kubernetes_apiserver_client_timeout configuration option.
- Fix a bug that led the agent to panic in some cases if
the log_level configuration option was set to error.
6.3.2
Docker, Windows, Linux
Changes
Prelude
Released on: 2018-07-04
- Please refer to the 6.3.2 tag on integrations-core for the list of changes on the Core Checks.
Bug Fixes
- The service mapper now groups the mappings of pods to services by
namespace. This prevents kube_service tags from being erroneously
applied to metrics for a pod that is not targeted by a service but has
the same name as a pod in a different namespace targeted by that
service.
- Fix a bug in dogstatsd metrics parsing where the Agent would leave
the host tag empty instead of applying its hostname on metrics with
a tag metadata field but no tags (i.e. the tags field is only one #
character). Regression introduced in 6.3.0
- Replace invalid utf-8 characters by the standard replacement char.
6.3.1
Docker, Windows, Linux
Changes
Prelude
Release on: 2018-06-27
- Please refer to the 6.3.1 tag on integrations-core for the list of changes on the Core Checks.
- Please refer to the 6.3.1 tag on trace-agent for the list of changes on the Trace Agent.
- Please refer to the 6.3.1 tag on process-agent for the list of changes on the Process Agent.
Upgrade Notes
- JMXFetch upgraded to 0.20.1; ships tagging bugfixes.
Bug Fixes
- Fixes panic when the agent receives an unsupported pattern in a log processing rule
- Fixes problem in 6.3.0 in which agent wouldn't start on Windows
Server 2008r2.
- Provide the actual JMX check name as check_name in configurations
provided to JMXFetch via the agent API. This addresses a regression
in 6.3.0 that broke the instance: tag.
Due to the nature of the regression, and the fix, this will cause
churn on the tag potentially affecting dashboards and monitors.
6.3.0
Docker, Windows, Linux
Changes
Prelude
Release on: 2018-06-20
- Please refer to the 6.3.0 tag on integrations-core for the list of changes on the Core Checks.
- Please refer to the 6.3.0 tag on trace-agent for the list of changes on the Trace Agent.
- Please refer to the 6.3.0 tag on process-agent for the list of changes on the Process Agent.
New Features
- Add docker memory soft limit metric.
- Added a host tag for docker swarm node role.
- The import command now supports multiple dd_url and API keys.
- Add an option to set the read buffer size for dogstatsd socket on
POSIX system (SO_RCVBUF).
- Add support for port names in template vars for autodiscovery.
- Add a new "tagger-list" command that outputs the tagger content of a
running agent.
- Adding Azure pause containers to the default image exclusion list
- Add flag histogram_copy_to_distribution to send histogram metric
values as distributions automatically. Note that the distributions
feature is in beta. An additional flag
histogram_copy_to_distribution_prefix modifies the existing
histogram metric name by adding a prefix, e.g. dist., to better
distinguish between these values.
- Add docker & swarm information to host metadata
- [BETA] Encrypted passwords in configurations can now be fetched
from a secrets manager.
- Add docker ps -a output to the flare.
- Introduces a new redacting writer that will make sure anything
written into the flare is scrubbed from credentials and sensitive
information.
- The agent now supports setting/overriding proxy URLs through
environment variables (HTTP_PROXY, HTTPS_PROXY and NO_PROXY).
- Created a new journald integration to collect logs from systemd.
It's only available on debian distributions for now.
- Add kubelet version to container metadata.
- Adds support for windows event logs collection
- Allow overriding procfs path. Should allow to collect relevant host
metrics in containerized environments. The override will affect
python checks and will result in psutil using the overriding path.
- The forwarder will now spawn specific workers per domain to avoid
slowdowns when one domain is down.
- ALPHA - Adding new tooling to securely upgrade integration
packages/wheels from our private TUF repository. Please note any
third party dependencies will still be downloaded from PyPI with no
additional security validation.
Upgrade Notes
- If your Agent is configured to use a web proxy through the proxy
config option or one of the *_PROXY environment variables, and the
configured proxy URL starts with the https:// scheme, the Agent
will now attempt to connect to your proxy using HTTPS, whereas it
would previously connect to your proxy using HTTP. If you have a
working proxy configuration, please make sure your proxy URL(s)
start with the http:// scheme before upgrading to v6.3+. This has
no impact on the security of the data sent to Datadog, since the
payloads are always secured with HTTPS between your Agents and
Datadog whatever proxy configuration you may use.
- Docker image: we moved the default configuration from the docker
image's default environment variables to the datadog-*.yaml files.
This allows users to easily mount a custom datadog.yaml
configuration file to set all options. If you already did so, you
will need to update your datadog.yaml to include these new defaults.
If you only used envvars, no change is needed.
- The agent now supports the environment variables "HTTP_PROXY",
"HTTPS_PROXY" and "NO_PROXY". If set, these variables will override
the setting in datadog.yaml.
- Moves away from the community library for the kubernetes client in
favor of the official one.
Deprecation Notes
- The core Agent check Python code is no longer duplicated here and is
instead pulled from integrations-core. The code now resides in the
datadog_checks namespace, though the old checks, utils, etc. paths
are still supported. Please update your custom checks accordingly.
For more information, see
https://github.com/DataDog/datadog-agent/blob/master/docs/agent/changes.md#python-modules
Bug Fixes
- Default config agent_stats.yaml used to collect go_expvar metrics
from the Agent has been updated.
- Take into account empty hosts on metrics coming from dogstatsd,
instead of ignoring them and applying the Agent's hostname.
- Decrease epsilon and increase incoming buffer size for improved
accuracy of distribution metrics.
- Better handling of docker return values to avoid errors
- Fix log format when no log file is specified, which caused the log
date to not be correctly displayed.
- Configurations of unscheduled checks are now properly removed from
the configcheck command display.
- The agent would send the source twice when protobuf enabled
(default), once in the source field and once in tags. As a result,
we would see the source twice in the app. This PR fixes it, by
sending it only in the source field.
- Fix a bug on windows where the io check was reporting metrics for
the C: drive only.
- Multiple config files can now be used for the same JMX based
integration
- The auto-discovery mechanism can now properly discover multiple
configs for one JMX based integration
- The JMXFetch process is now managed properly when JMXFetch configs
are unscheduled through auto-discovery
- Fix a possible panic in the kubernetes event watcher.
- Fix panics within the agent when using non thread safe method from
Viper library (Unmarshall).
- On RHEL/SUSE, stop the Agent properly in the pre-install RPM script
on systems where /lib is not a symlink to /usr/lib.
- To match the behavior of Agent 5, a flag has been introduced to make
the agent use hostname -f on unix-based systems before trying
os.Hostname(). This flag is turned off by default for 6.3 and will
be enabled by default in 6.4. The import command used to upgrade
from the Agent5 to the Agent6 will enable this flag in the config.
- Align docker agent's kubernetes liveness probe timeout with docker
healthcheck (5s) to avoid too many container restarts.
- Fix kube_service tagging of kubernetes network metrics
- Fixed parsing issue with logs processing rules in autodiscovery.
- Prevent logs agent from submitting protocol buffer payloads with
invalid UTF-8.
- Fixes JMXFetch on Windows when the custom_jar_paths and/or
tools_jar_path options are set, by using a semicolon as the path
separator on Windows.
- Prevent an empty response body from being marked as a "successfull
call to the GCE metadata api". Fixes a bug where hostnames became an
empty string when using docker swarm and a non GCE environment.
- Config option specified in syslog_pem if syslog logging is enabled
with TLS should be a path to the certificate, not a textual
certificate in the configuration.
- Changes the hostname used for Docker events to be the hostname of
the agent.
- Removes use of gopsutil on Windows. Gopsutil relies heavily on WMI;
because the go runtime doesn't lock goroutines to system threads,
the COM layer can have difficulties initializing. Solves the problem
where metadata and various system checks can't initialize properly
Other Notes
- The agent is now compiled with Go 1.10.2
- The datadog/agent docker image now runs two collector runners by
default
- The DEB and RPM packages now create the dd-agent user with no
login shell (/sbin/nologin or /usr/sbin/nologin). The packages
do not modify the login shell of the dd-agent user if it already
exists.
- The scripts of the Linux packages now don't exit with errors when no
supported init system is detected, and only print warnings instead
- On the status and check command outputs, rename checks' Metrics
to Metric Samples to reflect that the number represents the number of
samples submitted by the check, not the number of metrics after
aggregation.
- Scrub all logging output from credentials. Should prevent leakage of
credentials in logs from 3rd-party code or code added in the future.
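To make the idea of scrubbing all logging output more concrete, here is a minimal Python sketch of a logging filter that redacts credentials before a record is emitted. It is an illustration under stated assumptions, not the Agent's scrubber: the pattern, the `scrub` function and the `ScrubbingFilter` class are all hypothetical, and only an `api_key` style field is handled.

```python
import logging
import re

# Hypothetical pattern: matches "api_key=<value>" or "api_key: <value>".
_API_KEY = re.compile(r"(api_key\s*[=:]\s*)\S+", re.IGNORECASE)


def scrub(text: str) -> str:
    # Keep the field name, replace the value with a fixed mask.
    return _API_KEY.sub(r"\1********", text)


class ScrubbingFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so values passed as args are scrubbed too,
        # then clear args so the handler does not format it a second time.
        record.msg = scrub(record.getMessage())
        record.args = ()
        return True
```

Attaching such a filter to a handler applies the scrubbing to every record that reaches it, including records produced by third-party code, which is the property the note above is after.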
6.2.1 / 2018-05-23
Prelude
- Please refer to the 6.2.1 tag on integrations-core for the list of changes on the Core Checks.
- Please refer to the 6.2.1 tag on trace-agent for the list of changes on the Trace Agent.
- Please refer to the 6.2.1 tag on process-agent for the list of changes on the Process Agent.
Known Issues
- If the kubelet is not configured with TLS auth, the agent will fail
to communicate with the API when it should still try HTTP.
Bug Fixes
- Fix collection of host tags pulled from GCP project (project: and
numeric_project_id: tags) and GCP instance attributes.
- A bug was preventing some jmx configuration options to be set from
the jmx checks configs.
- The RPM packages now write systemd service files to
/usr/lib/systemd/system/ (recommended path on RHEL/SUSE) instead of
/lib/systemd/system/
6.2.0 / 2018-05-11
Docker, Windows, Linux, macOS
Changes
Prelude
- Please refer to the 6.2.0 tag on integrations-core for the list of changes on the Core Checks.
- Please refer to the 6.2.0 tag on trace-agent for the list of changes on the Trace Agent.
- Please refer to the 6.2.0 tag on process-agent for the list of changes on the Process Agent.
Enhancements
- Introduce new docker cpu shares gauge.
- Add ability to configure the namespace in which the resources related to the kubernetes check are created.
- The kubelet check now honors container filtering options
- Adding Datadog Cluster Agent client in Node Agent. Adding support for TLS in the Datadog Cluster Agent API.
- Docker: set a default 5 seconds timeout on all docker requests to mitigate possible docker daemon freezes
- Connection to the ECS agent should be more resilient
- Add agent5-like JMXFetch helper commands to help with JMXFetch troubleshooting.
- The agent has been tested on Kubernetes 1.4 & OpenShift 3.4. Refer to https://github.com/DataDog/datadog-agent/blob/master/Dockerfiles/agent/README.md for installation instructions
- Extract creator tags from kubernetes legacy created-by annotation if the new ownerReferences field is not found
- The agent import command now handles converting options from the legacy kubernetes.yaml file, for agents running on the host
- The memory corecheck sends 2 new metrics on Linux: system.mem.commit_limit and system.mem.committed_as
- Added the possibility to filter docker containers by name for log collection.
- Added a support for docker labels to enrich logs metadata.
- Logs Agent: add a filename tag to messages with the name of the file being tailed.
- Shipping protobuf C++ implementation for the protobuf package, this should help us be more performant when parsing larger/binary protobuf messages in relevant integrations.
- Allow setting collect_ec2_tags from the environment variable DD_COLLECT_EC2_TAGS
- The configcheck command now displays checks in alphabetical order; they are no longer grouped by configuration provider
- Add average check run time to datadog-agent status and to the GUI.
- Consider every configuration having autodiscovery identifier a template
- Implement a circuit breaker and use jittered, truncated exponential backoff for network error retries.
- Change logs agent configuration to use protocol buffers encoding and endpoint by default.
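The jittered, truncated exponential backoff mentioned in the list above can be sketched in a few lines of Python. This shows the general technique only: the function name `backoff_delay`, the base of 1 second, the 64 second cap and the "full jitter" variant are assumptions, not values taken from the Agent, and the circuit breaker is not modeled.

```python
import random


def backoff_delay(attempt, base=1.0, cap=64.0, rng=random.random):
    """Delay in seconds before retry number `attempt` (0-based).

    The exponential term is truncated at `cap`; "full jitter" then picks a
    random fraction of that value so clients do not all retry in lockstep.
    """
    truncated = min(cap, base * (2 ** attempt))
    return truncated * rng()
```

Passing a deterministic `rng` (for example `lambda: 1.0`) exposes the upper bound for a given attempt, which is convenient when checking the truncation behavior.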
Known Issues
- Kubernetes 1.3 & OpenShift 3.3 are currently not fully supported: docker and kubelet integrations work OK, but apiserver communication (event collection, kube_service tagging) is not implemented
Deprecation Notes
- Removing python PDH code bundled with the agent in favor of code already included in the integrations-core repository and bundled with datadog_checks_base wheel. This provides a single source of truth for the python PDH logic.
Bug Fixes
- Fix a possible race condition in AutoDiscovery where configuration is identical on container churn and considered as duplicate before being de-scheduled.
- It is now possible to save logs only configuration in the GUI without getting an error message.
- Docker network metrics are now tagged by interface name as a fallback if a docker network name cannot be determined (affects some Swarm stack deployments)
- Dogstatsd now supports listening on an IPv6 address when using the bind_host config option.
- The agent now fetches a hostname alias from kubernetes when possible. It fixes some duplicated host issues that could happen when metrics were using kubernetes host names, as the kubernetes_state integration does
- Fix case issues in tag extraction for docker/kubernetes container tags and kubernetes host tags
- Fixes initialization of performance counter (Windows) to be able to better cope with missing counter strings, and non-english locales
- Bind the kubelet_tls_verify as an environment variable.
- Docker image: fix entrypoint bug causing the kubernetes_apiserver check to not be enabled
- Fixed an issue with collecting logs bigger than 4096 chars on windows.
- Fixes a misleading log line on windows for logs file tailing
- Fixed a concurrent issue in the logs auditor causing the agent to crash.
- Fix an issue for docker image name filtering when images contain a tag.
- On Windows, changes the configuration for Process Agent and Trace Agent services to be manual-start. There is no impact if the services are configured to be active; however, if they're disabled, this stops the behavior where they were briefly started then stopped, which created excessive Windows service alerts.
- API key validation logic was ignoring proxy settings, leading to situations where the agent reported that it was "Unable to validate API key" in the GUI.
- Fix EC2 tags collection when multiple marketplaces are set.
- Fixes collection of host tags from GCE metadata
- Fix Go checks errors not being displayed in the status page.
- Sanitize logged Datadog URLs when proxies are configured.
- Fix a race condition in the kubernetes service tagging logic
- Fix a possible panic when docker cannot inspect a container
Other Notes
- In the metrics aggregator, log readable context information (metric name, host, tags) instead of the raw context key to help troubleshooting
- Remove executable permission bits from systemd/upstart/launchd service definition files.
- Improved the flare credential removing logic to work in a few edge cases that were not accounted for previously.
- Make file tailing a little less verbose. We avoid logging at every iteration the different issues we encountered, instead we log them at first run only. The status command shows the up-to-date information, and can be used at anytime to troubleshoot such issues
- Adds collection of PDH counter information to the flare; saves the step of always asking the customer for this information.
- Improve logging for the metamap, avoid spammy error when no cluster level metadata is found.