Add database statement-level metrics utils #7837
Conversation
Force-pushed from c64e631 to 4ac1552
@@ -30,6 +30,7 @@ ldap3==2.5
lxml==4.5.0
lz4==2.2.1
meld3==1.0.2
mmh3==2.5.1
Earlier discussions in Slack suggested using pyfasthash instead of mmh3. We have a requirement that the hashes computed here match the hashes produced by our Go backend. I haven't been able to produce a matching hash with the pyhash API, but I have with mmh3. Also, the murmur3 code in pyfasthash hasn't been updated in 7 years (versus 3 years for mmh3).
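For context, here is a minimal sketch of deriving a query signature with mmh3. The function name compute_sql_signature and the exact truncation/encoding are illustrative only; the real scheme has to mirror whatever the Go backend does so the values match.

# Illustrative only: derive a hex signature from a normalized query with mmh3.
# The seed, which 64-bit half is kept, and the hex width are assumptions here.
import mmh3


def compute_sql_signature(normalized_query):
    # hash64 returns a pair of signed 64-bit ints; keep the first half,
    # reinterpret it as unsigned, and render it as fixed-width hex.
    h = mmh3.hash64(normalized_query)[0] & 0xFFFFFFFFFFFFFFFF
    return format(h, '016x')


print(compute_sql_signature('SELECT * FROM users WHERE id = ?'))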
We'll use this once explosion/murmurhash#14 is fixed.
logger = logging.getLogger(__name__)


class StatementMetrics:
Note that this class will be used in both Postgres and MySQL integrations (as well as SQL Server, Oracle, etc. when we support those).
Flake8 fixes
Fix dependency sync
Sync deps
Fix flake8
Force-pushed from 3662e6d to 471d592
This PR should be split in two: one that only affects checks base and another one for postgres. Once the checks base one is merged, checks base should be released and the base package dependency in postgres' setup.py needs to be bumped to the new version containing the changes.
Both PRs need to have tests and need CI passing (at the moment style is broken; it can be fixed by running ddev test -fs datadog_checks_base or ddev test -fs postgres).
Added some comments to the PR. We'll need to split this PR in two:
- One updating datadog_checks_base, for which we'll release a new version X.
- One updating postgres, where we can set the requirement to datadog_checks_base==X (see the sketch after this list).
Also, both PRs will need tests.
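For illustration, the postgres-side change would amount to pinning the new base release in setup.py. The version number and package name below are placeholders, not the actual release.

# Hypothetical excerpt of postgres/setup.py after the split: pin the
# datadog_checks_base release ("X") that ships the new statement-metrics utils.
from setuptools import setup

CHECKS_BASE_REQ = 'datadog-checks-base>=15.7.0'  # placeholder for version X

setup(
    name='datadog-postgres',
    install_requires=[CHECKS_BASE_REQ],
    # ... remaining arguments unchanged ...
)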
Thanks! Left a few comments and ideas.
Curious about discussing https://github.com/DataDog/integrations-core/pull/7837/files#r510744048: at first glance, it seems like a lot of code could be stripped by relying on the Agent's existing monotonic_count functionality, rather than computing and submitting diffs ourselves.
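To make the idea concrete, here is a rough sketch (not code from this PR) of what leaning on the Agent's monotonic_count could look like in a check. The helper _query_statement_stats and the column names are made up.

# Sketch of the alternative raised above: submit the raw, monotonically
# increasing counters from the database's statistics table and let the Agent
# derive the per-interval change, instead of diffing rows in the check itself.
from datadog_checks.base import AgentCheck


class StatementsCheck(AgentCheck):
    def check(self, instance):
        for row in self._query_statement_stats():  # hypothetical helper
            tags = ['query_signature:{}'.format(row['query_signature'])]
            # total_time only ever grows for a given query family, so it can be
            # reported as a monotonic count rather than a precomputed diff.
            self.monotonic_count('postgresql.queries.time', row['total_time'], tags=tags)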
    def __init__(self):
        self.previous_statements = dict()

    def compute_derivative_rows(self, rows, metrics, key):
Note: we support comment-based type hints and running mypy in the integrations-core code base, which would allow you to write…

def compute_derivative_rows(self, rows, metrics, key):
    # type: (List[dict], List[dict], Callable) -> List[dict]

(Provided from typing import Callable, List.)
More info: https://datadoghq.dev/integrations-core/guidelines/style/#mypy
Not saying you should change anything for this PR (I understand we're trying to get these changes in before the freeze), but just flagging it as a possibility, in case that's something you'd like to explore in the future.
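Separately, a rough sketch of how a check might drive compute_derivative_rows across runs. The import path follows the file added in this PR; the sample columns, the assumption that the metrics argument lists the counter columns, and the first-run behavior described in the comments are assumptions rather than guarantees from this PR.

# Hypothetical driver for the class above: keep one StatementMetrics instance
# per check and feed it the raw per-query rows on every run.
from datadog_checks.base.utils.db.statement_metrics import StatementMetrics

state = StatementMetrics()

rows = [
    {'query_signature': 'abc123', 'calls': 10, 'total_time': 52.0},
    {'query_signature': 'def456', 'calls': 3, 'total_time': 7.5},
]

# Assumed behavior: the first call only records a baseline; subsequent calls
# return rows holding each metric column's increase since the previous call.
diffs = state.compute_derivative_rows(rows, ['calls', 'total_time'], key=lambda row: row['query_signature'])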
Looking forward to the statistics this will report on!
Curious about the performance impact of all of this: it will be different for each database, so how should a customer tune their configuration to find the optimal settings?
Co-authored-by: Florian Veaux <[email protected]>
…om:DataDog/integrations-core into justiniso/dbm-statement-metrics-postgresql
Co-authored-by: Florian Veaux <[email protected]>
This reverts commit c1bb075.
This reverts commit d5b9c78.
…e explicit comments and motivations
Force-pushed from e5574ed to c9ea711
Superb! We can merge with the mmh3 failures, knowing these should go away with the next Agent build.
* Add shared class for statement metrics
* Add function for computing sql signature
* Update config for deep database monitoring options
* Add statement metrics
Co-authored-by: Florian Veaux <[email protected]>
f48d972
What does this PR do?
This PR adds utility functions to be used in the MySQL and Postgres integrations for collecting per-statement metrics via aggregate statistics tables. All relational database integrations (MySQL, Postgres, SQL Server, Oracle, etc.) will work the same way, by polling aggregate monotonic stat tables that track metrics per normalized query family.
(Note: a previous version of this PR contained the changes for Postgres as well, but those were split to a separate PR)
Motivation
Query-level metrics are one of the features of Deep Database Monitoring. This change will give customers metrics like postgresql.queries.time, tagged by query and query_signature, to report the time spent executing a particular normalized query.
Additional Notes
Review checklist (to be filled by reviewers)
- changelog/ and integration/ labels attached