pageserver: startup: ensure local disk state is durable #8835

Merged: 1 commit into main from problame/sync-on-startup on Aug 26, 2024

@problame (Contributor) commented Aug 26, 2024

refs #6989

Problem

After an unclean shutdown, we get restarted, start reading the local filesystem,
and make decisions based on those reads. However, some of the data might not
yet have been fsynced when the unclean shutdown happened.

Durability matters even though Pageservers are conceptually just a cache
of state in S3. For example:

  • the cloud control plane is not a control loop, so pageserver responses
    to tenant attachment, etc., need to be durable.
    • the storage controller does not rely on this (as much?)
  • we don't have layer file checksumming, so layer files that were
    downloaded and renamed but not fsynced are technically not to be trusted
    • #2683
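
To illustrate the second point, here is a minimal sketch of what a fully durable write-then-rename sequence looks like. It is not the pageserver's download code; the helper name `durable_write` and the `.tmp` naming are hypothetical. A file that went through only the write and rename steps, without the two fsync calls, is the kind of file that cannot be trusted after a crash.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: write `data` to `dest` such that, once this returns
/// Ok, both the file contents and the directory entry have been fsynced.
fn durable_write(dest: &Path, data: &[u8]) -> io::Result<()> {
    // Fall back to the current directory if `dest` has no directory component.
    let parent = match dest.parent() {
        Some(p) if !p.as_os_str().is_empty() => p,
        _ => Path::new("."),
    };
    // Hypothetical temp-file naming, for illustration only.
    let tmp = dest.with_extension("tmp");

    // 1. Write the contents to a temporary file and fsync them.
    let mut file = File::create(&tmp)?;
    file.write_all(data)?;
    file.sync_all()?;
    drop(file);

    // 2. Atomically move the temporary file into place.
    fs::rename(&tmp, dest)?;

    // 3. fsync the parent directory so the rename itself is durable.
    File::open(parent)?.sync_all()?;
    Ok(())
}
```

Skipping step 1's `sync_all` can leave a correctly named file with missing or partial contents after a crash; skipping step 3 can lose the rename.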

Solution

syncfs the tenants directory during startup, before we start reading from it.

This is a bit overkill because we do remove some temp files (InMemoryLayer!)
later during startup. Further, these temp files are particularly likely to
be dirty in the kernel page cache. However, we don't want to refactor that
cleanup code right now, and the amount of dirty data on pageservers is
generally not that high. Lastly, with direct IO (#8130) we're going to have
near-zero kernel page cache anyway quite soon.
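
For readers unfamiliar with `syncfs`, the following is a minimal sketch of a startup sync in Rust, assuming Linux. It is not the code from this PR: the helper name `syncfs_dir` is hypothetical, and the FFI declaration binds directly to the C library's `syncfs(2)` so that no extra crates are needed.

```rust
use std::fs::File;
use std::io;
use std::os::raw::c_int;
use std::os::unix::io::AsRawFd;
use std::path::Path;

// Binding to the Linux-specific syncfs(2) from the C library.
// On the Rust 2024 edition this block must be written `unsafe extern "C"`.
extern "C" {
    fn syncfs(fd: c_int) -> c_int;
}

/// Hypothetical helper: flush all dirty data of the filesystem that
/// contains `dir` to disk. Returns the OS error if the call fails.
fn syncfs_dir(dir: &Path) -> io::Result<()> {
    // Opening a directory read-only is enough to get a file descriptor
    // that identifies the filesystem.
    let handle = File::open(dir)?;
    // SAFETY: `handle` keeps the file descriptor open for the whole call.
    let rc = unsafe { syncfs(handle.as_raw_fd()) };
    if rc == 0 {
        Ok(())
    } else {
        Err(io::Error::last_os_error())
    }
}
```

A caller would run `syncfs_dir` on the tenants directory before reading anything from it. Note that `syncfs` flushes the entire filesystem containing the directory, not just that subtree, which is why the soon-to-be-deleted temp files get written out as well.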

@problame problame requested a review from a team as a code owner August 26, 2024 14:59
@problame problame requested review from arssher and koivunej August 26, 2024 14:59

3766 tests run: 3660 passed, 0 failed, 106 skipped (full report)


Code coverage* (full report)

  • functions: 32.2% (7255 of 22565 functions)
  • lines: 50.3% (58802 of 116957 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
307ace2 at 2024-08-26T15:56:21.461Z :recycle:

@problame problame changed the title from "pageserver: ensure local disk state is durable during startup" to "pageserver: startup: ensure local disk state is durable" Aug 26, 2024
@problame problame merged commit 9724177 into main Aug 26, 2024
69 checks passed
@problame problame deleted the problame/sync-on-startup branch August 26, 2024 16:07