Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

analyze table returns ok but does not generate stats #8326

Open
timsehn opened this issue Sep 4, 2024 · 2 comments
Open

analyze table returns ok but does not generate stats #8326

timsehn opened this issue Sep 4, 2024 · 2 comments
Labels
bug Something isn't working performance sql Issue with SQL

Comments

@timsehn
Copy link
Contributor

timsehn commented Sep 4, 2024

Potentially related to: #8324

$ du -h                                               
4.0K	./.doltcfg
479G	./.dolt/noms/oldgen
479G	./.dolt/noms
  0B	./.dolt/temptf
  0B	./.dolt/stats/.dolt/noms/oldgen
1.0M	./.dolt/stats/.dolt/noms
  0B	./.dolt/stats/.dolt/temptf
1.0M	./.dolt/stats/.dolt
1.0M	./.dolt/stats
479G	./.dolt
654G	.
$ dolt sql -q "analyze table actor, revision, comment"
+----------+---------+----------+----------+
| Table    | Op      | Msg_type | Msg_text |
+----------+---------+----------+----------+
| actor    | analyze | status   | OK       |
| revision | analyze | status   | OK       |
| comment  | analyze | status   | OK       |
+----------+---------+----------+----------+

$ du -h                                               
4.0K	./.doltcfg
479G	./.dolt/noms/oldgen
479G	./.dolt/noms
  0B	./.dolt/temptf
  0B	./.dolt/stats/.dolt/noms/oldgen
1.0M	./.dolt/stats/.dolt/noms
  0B	./.dolt/stats/.dolt/temptf
1.0M	./.dolt/stats/.dolt
1.0M	./.dolt/stats
479G	./.dolt
654G	.
$ dolt sql -q "select count(*) from dolt_statistics;"
+----------+
| count(*) |
+----------+
| 0        |
+----------+

These are large tables:

$ dolt sql -q "select count(*) from actor"
+----------+
| count(*) |
+----------+
| 290054   |
+----------+

$ dolt sql -q "select count(*) from revision"
+----------+
| count(*) |
+----------+
| 4013041  |
+----------+

$ dolt sql -q "select count(*) from comment" 
+----------+
| count(*) |
+----------+
| 1409589  |
+----------+

Seems like something should have gotten written to disk.

@timsehn timsehn added bug Something isn't working sql Issue with SQL performance labels Sep 4, 2024
@timsehn
Copy link
Contributor Author

timsehn commented Sep 4, 2024

Note, when stats were generated using call dolt_stats_restart() the query I was trying to change the plan for that used these three tables started to execute fast.

Here's the query:

$ dolt sql -q " SELECT rev_id,rev_page,rev_timestamp,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,actor_rev_user.actor_user AS \`rev_user\`,actor_rev_user.actor_name AS \`rev_user_text\`,rev_actor,comment_rev_comment.comment_text AS \`rev_comment_text\`,comment_rev_comment.comment_data AS \`rev_comment_data\`,comment_rev_comment.comment_id AS \`rev_comment_cid\` FROM \`revision\` JOIN \`actor\` \`actor_rev_user\` ON ((actor_rev_user.actor_id = rev_actor)) JOIN \`comment\` \`comment_rev_comment\` ON ((comment_rev_comment.comment_id = rev_comment_id)) WHERE rev_page = 21990 AND (rev_timestamp <= '20040219154224') ORDER BY rev_timestamp DESC,rev_id DESC LIMIT 1"

Here is the old plan:

$ dolt sql -q "explain SELECT rev_id,rev_page,rev_timestamp,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,actor_rev_user.actor_user AS \`rev_user\`,actor_rev_user.actor_name AS \`rev_user_text\`,rev_actor,comment_rev_comment.comment_text AS \`rev_comment_text\`,comment_rev_comment.comment_data AS \`rev_comment_data\`,comment_rev_comment.comment_id AS \`rev_comment_cid\` FROM \`revision\` JOIN \`actor\` \`actor_rev_user\` ON ((actor_rev_user.actor_id = rev_actor)) JOIN \`comment\` \`comment_rev_comment\` ON ((comment_rev_comment.comment_id = rev_comment_id)) WHERE rev_page = 21990 AND (rev_timestamp <= '20040219154224') ORDER BY rev_timestamp DESC,rev_id DESC LIMIT 1"
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Limit(1)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
|  └─ Project                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
|      ├─ columns: [revision.rev_id, revision.rev_page, revision.rev_timestamp, revision.rev_minor_edit, revision.rev_deleted, revision.rev_len, revision.rev_parent_id, revision.rev_sha1, actor_rev_user.actor_user as rev_user, actor_rev_user.actor_name as rev_user_text, revision.rev_actor, comment_rev_comment.comment_text as rev_comment_text, comment_rev_comment.comment_data as rev_comment_data, comment_rev_comment.comment_id as rev_comment_cid]                                                                                                                                                                                                                                                       |
|      └─ Sort(revision.rev_timestamp DESC, revision.rev_id DESC)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
|          └─ Project                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
|              ├─ columns: [revision.rev_id, revision.rev_page, revision.rev_comment_id, revision.rev_actor, revision.rev_timestamp, revision.rev_minor_edit, revision.rev_deleted, revision.rev_len, revision.rev_parent_id, revision.rev_sha1, actor_rev_user.actor_id, actor_rev_user.actor_user, actor_rev_user.actor_name, comment_rev_comment.comment_id, comment_rev_comment.comment_hash, comment_rev_comment.comment_text, comment_rev_comment.comment_data, actor_rev_user.actor_user as rev_user, actor_rev_user.actor_name as rev_user_text, comment_rev_comment.comment_text as rev_comment_text, comment_rev_comment.comment_data as rev_comment_data, comment_rev_comment.comment_id as rev_comment_cid] |
|              └─ HashJoin                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
|                  ├─ (actor_rev_user.actor_id = revision.rev_actor)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
|                  ├─ LookupJoin                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
|                  │   ├─ (comment_rev_comment.comment_id = revision.rev_comment_id)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
|                  │   ├─ TableAlias(comment_rev_comment)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
|                  │   │   └─ Table                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
|                  │   │       ├─ name: comment                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
|                  │   │       └─ columns: [comment_id comment_hash comment_text comment_data]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
|                  │   └─ Filter                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
|                  │       ├─ ((revision.rev_page = 21990) AND (revision.rev_timestamp <= '20040219154224'))                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
|                  │       └─ IndexedTableAccess(revision)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
|                  │           ├─ index: [revision.rev_page,revision.rev_timestamp]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
|                  │           ├─ columns: [rev_id rev_page rev_comment_id rev_actor rev_timestamp rev_minor_edit rev_deleted rev_len rev_parent_id rev_sha1]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
|                  │           └─ keys: 21990                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
|                  └─ HashLookup                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
|                      ├─ left-key: (revision.rev_actor)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
|                      ├─ right-key: (actor_rev_user.actor_id)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
|                      └─ TableAlias(actor_rev_user)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
|                          └─ Table                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
|                              ├─ name: actor                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
|                              └─ columns: [actor_id actor_user actor_name]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

And here is the improved plan after stats were collected:

$ dolt sql -q "explain SELECT rev_id,rev_page,rev_timestamp,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,actor_rev_user.actor_user AS \`rev_user\`,actor_rev_user.actor_name AS \`rev_user_text\`,rev_actor,comment_rev_comment.comment_text AS \`rev_comment_text\`,comment_rev_comment.comment_data AS \`rev_comment_data\`,comment_rev_comment.comment_id AS \`rev_comment_cid\` FROM \`revision\` JOIN \`actor\` \`actor_rev_user\` ON ((actor_rev_user.actor_id = rev_actor)) JOIN \`comment\` \`comment_rev_comment\` ON ((comment_rev_comment.comment_id = rev_comment_id)) WHERE rev_page = 21990 AND (rev_timestamp <= '20040219154224') ORDER BY rev_timestamp DESC,rev_id DESC LIMIT 1"
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Limit(1)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
|  └─ Project                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
|      ├─ columns: [revision.rev_id, revision.rev_page, revision.rev_timestamp, revision.rev_minor_edit, revision.rev_deleted, revision.rev_len, revision.rev_parent_id, revision.rev_sha1, actor_rev_user.actor_user as rev_user, actor_rev_user.actor_name as rev_user_text, revision.rev_actor, comment_rev_comment.comment_text as rev_comment_text, comment_rev_comment.comment_data as rev_comment_data, comment_rev_comment.comment_id as rev_comment_cid]                                                                                                                                                                                                                                                       |
|      └─ Sort(revision.rev_timestamp DESC, revision.rev_id DESC)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
|          └─ Project                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
|              ├─ columns: [revision.rev_id, revision.rev_page, revision.rev_comment_id, revision.rev_actor, revision.rev_timestamp, revision.rev_minor_edit, revision.rev_deleted, revision.rev_len, revision.rev_parent_id, revision.rev_sha1, actor_rev_user.actor_id, actor_rev_user.actor_user, actor_rev_user.actor_name, comment_rev_comment.comment_id, comment_rev_comment.comment_hash, comment_rev_comment.comment_text, comment_rev_comment.comment_data, actor_rev_user.actor_user as rev_user, actor_rev_user.actor_name as rev_user_text, comment_rev_comment.comment_text as rev_comment_text, comment_rev_comment.comment_data as rev_comment_data, comment_rev_comment.comment_id as rev_comment_cid] |
|              └─ LookupJoin                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
|                  ├─ LookupJoin                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
|                  │   ├─ IndexedTableAccess(revision)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
|                  │   │   ├─ index: [revision.rev_page,revision.rev_actor,revision.rev_timestamp]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
|                  │   │   ├─ filters: [{[21990, 21990], [NULL, ∞), (NULL, [50 48 48 52 48 50 49 57 49 53 52 50 50 52]]}]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
|                  │   │   └─ columns: [rev_id rev_page rev_comment_id rev_actor rev_timestamp rev_minor_edit rev_deleted rev_len rev_parent_id rev_sha1]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
|                  │   └─ TableAlias(actor_rev_user)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
|                  │       └─ IndexedTableAccess(actor)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
|                  │           ├─ index: [actor.actor_id]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
|                  │           ├─ columns: [actor_id actor_user actor_name]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
|                  │           └─ keys: revision.rev_actor                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
|                  └─ TableAlias(comment_rev_comment)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
|                      └─ IndexedTableAccess(comment)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
|                          ├─ index: [comment.comment_id]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
|                          ├─ columns: [comment_id comment_hash comment_text comment_data]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
|                          └─ keys: revision.rev_comment_id          

@max-hoffman
Copy link
Contributor

I have not been able to repro (1) with/without fixing dolt_stats_restart() db location, (2) while a server is/is not running. I did find and fixed some dropped errors on the read/write paths.

If we catch this happening again the thing we want to do is zip and save .dolt/stats/.dolt/noms. If a write fails silently we will notice the journal missing data. Otherwise it's some in-process specific tracking issue, like branch/database name/database location. All three databases returning OK but no stats makes me think there was some lock conflict that rejected the write, like a LOCK file left behind or a disk permissions issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working performance sql Issue with SQL
Projects
None yet
Development

No branches or pull requests

2 participants