Add database statement-level metrics utils #7837
Conversation
Force-pushed from c64e631 to 4ac1552
@@ -30,6 +30,7 @@ ldap3==2.5
lxml==4.5.0
lz4==2.2.1
meld3==1.0.2
mmh3==2.5.1
Earlier discussions in Slack suggested using pyfasthash instead of mmh3. We have a requirement that the hashes computed here match the hashes produced by our Go backend. I haven't been able to produce a matching hash with the pyhash API, but I have with mmh3. Also, the murmur3 code in pyfasthash hasn't been updated in 7 years (versus 3 years for mmh3).
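For context, here is a minimal sketch of deriving a query signature with mmh3. The function name compute_sql_signature and the exact truncation/encoding are illustrative only; the real scheme has to mirror whatever the Go backend does so the values match.

# Illustrative only: derive a hex signature from a normalized query with mmh3.
# The seed, which 64-bit half is kept, and the hex width are assumptions here.
import mmh3


def compute_sql_signature(normalized_query):
    # hash64 returns a pair of signed 64-bit ints; keep the first half,
    # reinterpret it as unsigned, and render it as fixed-width hex.
    h = mmh3.hash64(normalized_query)[0] & 0xFFFFFFFFFFFFFFFF
    return format(h, '016x')


print(compute_sql_signature('SELECT * FROM users WHERE id = ?'))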
We'll use this once explosion/murmurhash#14 is fixed.
logger = logging.getLogger(__name__)


class StatementMetrics:
Note that this class will be used in both Postgres and MySQL integrations (as well as SQL Server, Oracle, etc. when we support those).
Flake8 fixes
Fix dependency sync
Sync deps
Fix flake8
Force-pushed from 3662e6d to 471d592
This PR should be split in two: one that only affects checks base and another one for postgres. Once the checks base one is merged, checks base should be released and the base package dependency in postgres' setup.py needs to be bumped to the new version containing the changes.
Both PRs need to have tests and need CI passing (at the moment style is broken; it can be fixed by running ddev test -fs datadog_checks_base or ddev test -fs postgres).
Added some comments to the PR. We'll need to split this PR in two:
- One updating datadog_checks_base, for which we'll release a new version X.
- One updating postgres, where we can set the requirement to datadog_checks_base==X (see the sketch after this list).
Also, both PRs will need tests.
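For illustration, the postgres-side change would amount to pinning the new base release in setup.py. The version number and package name below are placeholders, not the actual release.

# Hypothetical excerpt of postgres/setup.py after the split: pin the
# datadog_checks_base release ("X") that ships the new statement-metrics utils.
from setuptools import setup

CHECKS_BASE_REQ = 'datadog-checks-base>=15.7.0'  # placeholder for version X

setup(
    name='datadog-postgres',
    install_requires=[CHECKS_BASE_REQ],
    # ... remaining arguments unchanged ...
)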
Thanks! Left a few comments and ideas.
Curious about discussing https://github.com/DataDog/integrations-core/pull/7837/files#r510744048: at first glance, it seems like a lot of code could be stripped by relying on the Agent's existing monotonic_count functionality, rather than computing and submitting diffs ourselves.
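To make the idea concrete, here is a rough sketch (not code from this PR) of what leaning on the Agent's monotonic_count could look like in a check. The helper _query_statement_stats and the column names are made up.

# Sketch of the alternative raised above: submit the raw, monotonically
# increasing counters from the database's statistics table and let the Agent
# derive the per-interval change, instead of diffing rows in the check itself.
from datadog_checks.base import AgentCheck


class StatementsCheck(AgentCheck):
    def check(self, instance):
        for row in self._query_statement_stats():  # hypothetical helper
            tags = ['query_signature:{}'.format(row['query_signature'])]
            # total_time only ever grows for a given query family, so it can be
            # reported as a monotonic count rather than a precomputed diff.
            self.monotonic_count('postgresql.queries.time', row['total_time'], tags=tags)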
    def __init__(self):
        self.previous_statements = dict()

    def compute_derivative_rows(self, rows, metrics, key):
Note: we support comment-based type hints and running mypy in the integrations-core code base, which would allow you to write…

def compute_derivative_rows(self, rows, metrics, key):
    # type: (List[dict], List[dict], Callable) -> List[dict]

(Provided from typing import Callable, List.)
More info: https://datadoghq.dev/integrations-core/guidelines/style/#mypy
Not saying you should change anything for this PR (I understand we're trying to get these changes in before the freeze), but just flagging it as a possibility, in case that's something you'd like to explore in the future.
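Separately, a rough sketch of how a check might drive compute_derivative_rows across runs. The import path follows the file added in this PR; the sample columns, the assumption that the metrics argument lists the counter columns, and the first-run behavior described in the comments are assumptions rather than guarantees from this PR.

# Hypothetical driver for the class above: keep one StatementMetrics instance
# per check and feed it the raw per-query rows on every run.
from datadog_checks.base.utils.db.statement_metrics import StatementMetrics

state = StatementMetrics()

rows = [
    {'query_signature': 'abc123', 'calls': 10, 'total_time': 52.0},
    {'query_signature': 'def456', 'calls': 3, 'total_time': 7.5},
]

# Assumed behavior: the first call only records a baseline; subsequent calls
# return rows holding each metric column's increase since the previous call.
diffs = state.compute_derivative_rows(rows, ['calls', 'total_time'], key=lambda row: row['query_signature'])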
Looking forward to the statistics this will report on!
Curious about the performance impact of all of this: it will be different for each database, so how should a customer tune their configuration to find the optimal settings?
Co-authored-by: Florian Veaux <[email protected]>
…om:DataDog/integrations-core into justiniso/dbm-statement-metrics-postgresql
Co-authored-by: Florian Veaux <[email protected]>
This reverts commit c1bb075.
This reverts commit d5b9c78.
…e explicit comments and motivations
Force-pushed from e5574ed to c9ea711
Superb! We can merge with the mmh3 failures, knowing these should go away with the next Agent build.
* Add shared class for statement metrics
* Add function for computing sql signature
* Update config for deep database monitoring options
* Add statement metrics
Co-authored-by: Florian Veaux <[email protected]>
f48d972
What does this PR do?
This PR adds utility functions to be used in the MySQL and Postgres integrations for collecting per-statement metrics via aggregate statistics tables. All relational database integrations (MySQL, Postgres, SQL Server, Oracle, etc.) will work the same way, by polling aggregate monotonic stat tables that track metrics per normalized query family.
(Note: a previous version of this PR contained the changes for Postgres as well, but those were split to a separate PR)
Motivation
Query-level metrics are one of the features of Deep Database Monitoring. This change will give customers metrics like postgresql.queries.time, tagged by query and query_signature, to report the time spent executing a particular normalized query.
Additional Notes
Review checklist (to be filled by reviewers)
- changelog/ and integration/ labels attached