Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add database statement-level metrics utils #7837

Merged
merged 65 commits into from
Oct 27, 2020

Conversation

justiniso
Copy link
Contributor

@justiniso justiniso commented Oct 23, 2020

What does this PR do?

This PR adds utility functions to be used in MySQL and Postgres integrations for collecting per-statement metrics via aggregate statistics tables. All relational database integrations (MySQL, Postgres, SQL Server, Oracle, etc.) will work the same way by polling aggregate monotonic stat tables which track metrics per-normalized query family.

(Note: a previous version of this PR contained the changes for Postgres as well, but those were split to a separate PR)

Motivation

Query-level metrics is one of the features of Deep Database Monitoring. This change will give customers metrics like postgresql.queries.time tagged by query and query_signature to report the time spent executing a particular normalized query.

Additional Notes

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • PR title must be written as a CHANGELOG entry (see why)
  • Files changes must correspond to the primary purpose of the PR as described in the title (small unrelated changes should have their own PR)
  • PR must have changelog/ and integration/ labels attached

@codecov
Copy link

codecov bot commented Oct 23, 2020

@justiniso justiniso force-pushed the justiniso/dbm-statement-metrics-postgresql branch from c64e631 to 4ac1552 Compare October 23, 2020 03:30
@ghost ghost added the dependencies label Oct 23, 2020
@@ -30,6 +30,7 @@ ldap3==2.5
lxml==4.5.0
lz4==2.2.1
meld3==1.0.2
mmh3==2.5.1
Copy link
Contributor Author

@justiniso justiniso Oct 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Earlier discussions in Slack suggested using pyfasthash instead of mmh3. We have a requirement that the hashes computed here match the hashes produced by our go backend. I haven't been able to produce a matching hash with the pyhash API, but I have with mmh3. Also the murmur 3 code in pyfasthash hasn't been updated for 7 years (compared to 3 years ago with mmh3).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll use this after it's fixed explosion/murmurhash#14

logger = logging.getLogger(__name__)


class StatementMetrics:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this class will be used in both Postgres and MySQL integrations (as well as SQL Server, Oracle, etc. when we support those).

Flake8 fixes

Fix dependency sync

Sync deps

Fix flake8
@justiniso justiniso force-pushed the justiniso/dbm-statement-metrics-postgresql branch from 3662e6d to 471d592 Compare October 23, 2020 04:30
Copy link
Contributor

@hithwen hithwen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR should be split in two, one that only affects checks base and another one for postgres. Once the check's base one is merged checks base should be released and the base package dependency in postgres' setup.py needs to be bumped to the new version containing changes.
Both PRs need to have tests and need to have CI passing (at the moment style is broken, it can be fixed running ddev test -fs datadog_checks_base (or postgres)

Copy link
Member

@FlorianVeaux FlorianVeaux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some comments to the PR. We'll need to split this PR in two:

  • One updating datadog_checks_base, and we'll release a new version X.
  • One updating postgres where we can set the requirement to datadog_checks_base==X

Also the PRs will need tests

postgres/datadog_checks/postgres/config.py Outdated Show resolved Hide resolved
postgres/datadog_checks/postgres/statements.py Outdated Show resolved Hide resolved
postgres/datadog_checks/postgres/statements.py Outdated Show resolved Hide resolved
postgres/datadog_checks/postgres/statements.py Outdated Show resolved Hide resolved
postgres/datadog_checks/postgres/statements.py Outdated Show resolved Hide resolved
Copy link
Contributor

@florimondmanca florimondmanca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Left a few comments and ideas.

Curious about discussing https://github.com/DataDog/integrations-core/pull/7837/files#r510744048 — at first glance, it seems like a lot of code could be stripped by relying on the Agent's existing monotonic_count functionality, rather than computing and submitting diffs ourselves.

def __init__(self):
self.previous_statements = dict()

def compute_derivative_rows(self, rows, metrics, key):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: we support comment-based type hints and running mypy in the integrations-core code base, which would allow you to write…

Suggested change
def compute_derivative_rows(self, rows, metrics, key):
def compute_derivative_rows(self, rows, metrics, key):
# type: (List[dict], List[dict], Callable) -> List[dict]

(Provided from typing import Callable, List.)

More info: https://datadoghq.dev/integrations-core/guidelines/style/#mypy

Not saying you should change anything for this PR (I understand we're trying to get these changes in before the freeze), but just flagging it as a possibility, in case that's something you'd like to explore in the future.

postgres/datadog_checks/postgres/postgres.py Outdated Show resolved Hide resolved
postgres/datadog_checks/postgres/statements.py Outdated Show resolved Hide resolved
postgres/datadog_checks/postgres/statements.py Outdated Show resolved Hide resolved
Copy link
Member

@mgarabed mgarabed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking forward to the statistics this will report on!

Curious about the performance impact of all of this - it will be different for each database, but how should a customer tune their configuration to know what the optimal config should be?

postgres/requirements.in Outdated Show resolved Hide resolved
postgres/datadog_checks/postgres/statements.py Outdated Show resolved Hide resolved
postgres/datadog_checks/postgres/postgres.py Outdated Show resolved Hide resolved
postgres/datadog_checks/postgres/statements.py Outdated Show resolved Hide resolved
justiniso and others added 3 commits October 23, 2020 09:58
Co-authored-by: Florian Veaux <[email protected]>
…om:DataDog/integrations-core into justiniso/dbm-statement-metrics-postgresql
@justiniso justiniso force-pushed the justiniso/dbm-statement-metrics-postgresql branch from e5574ed to c9ea711 Compare October 27, 2020 14:09
Copy link
Contributor

@florimondmanca florimondmanca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Superb! We can merge with the mmh3 failures, knowing these should go away with the next Agent build.

@hithwen hithwen merged commit f48d972 into master Oct 27, 2020
@hithwen hithwen deleted the justiniso/dbm-statement-metrics-postgresql branch October 27, 2020 16:49
github-actions bot pushed a commit that referenced this pull request Oct 27, 2020
* Add shared class for statement metrics

* Add function for computing sql signature

* Update config for deep database monitoring options

* Add statement metrics

Co-authored-by: Florian Veaux <[email protected]> f48d972
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants