SA: Implement schema and methods for (account, hostname) pausing #7490

beautifulentropy · 2024-05-16T15:34:17Z

Add the storage implementation for our new (account, hostname) pair pausing feature.

Add schema and model for the new paused table
Add getters and setters for interacting with the paused schema
Add exported SA methods for interacting with the getters and setters

This PR adds only the storage functionality; the logic that uses it will be implemented in the WFE and RA, respectively.

Part of #7406
Part of #7475

github-actions · 2024-05-16T15:43:05Z

@beautifulentropy, this PR appears to contain configuration and/or SQL schema changes. Please ensure that a corresponding deployment ticket has been filed with the new values.

sa/model.go

aarongable

A few high-level suggestions to simplify the new gRPC API methods:

Pause and Repause can be the same method, since I don't think the RA needs to know that a pair has been previously paused. The SA can determine which case it is in inside a transaction, and increment the relevant metric.
CheckPair[s]Paused can be a single method, no need for separate singular/plural implementations (when would we be calling the singular one?). And rather than taking a single identifier type and many values, it should take a repeated message Identifier { string type = 1; string value = 2 }. Let the SA handle the need for multiple separate db queries if necessary; no need to expose that deficiency of our database schema to the RA.
I'm not clear on why we need both UnpausePair and UnpauseAccount, since the current design doc only calls for ever unpausing whole accounts at a time.

Doing all three of these would reduce the new API surface to just three methods: CheckPairsPaused, PausePair, UnpauseAccount.

There's one more method which might be necessary: GetPausedIdentifiersForAccount, to be used by the self-service unpausing page, to populate it with a (truncated) list of identifiers that will be unpaused.
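To illustrate the suggestion that the SA accept a repeated Identifier message and hide any per-type queries from the RA, here is a minimal sketch. The `Identifier` struct mirrors the message proposed above; `groupByType` is a hypothetical helper name, not something from this PR.

```go
package main

import "fmt"

// Identifier mirrors the proposed protobuf message
// `Identifier { string type = 1; string value = 2 }`.
type Identifier struct {
	Type  string
	Value string
}

// groupByType is a hypothetical helper: it buckets identifier values by
// their type so that a storage layer could issue one query per type
// without the caller ever needing to know about it.
func groupByType(ids []Identifier) map[string][]string {
	grouped := make(map[string][]string)
	for _, id := range ids {
		grouped[id.Type] = append(grouped[id.Type], id.Value)
	}
	return grouped
}

func main() {
	ids := []Identifier{
		{Type: "dns", Value: "example.com"},
		{Type: "dns", Value: "www.example.com"},
		{Type: "ip", Value: "192.0.2.1"},
	}
	// Map iteration order is not deterministic, so the groups may print
	// in either order.
	for typ, values := range groupByType(ids) {
		fmt.Printf("%s: %v\n", typ, values)
	}
}
```

With this shape, the caller passes one mixed list and the per-type fan-out stays an internal detail of the storage layer.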

sa/model.go

beautifulentropy · 2024-05-21T16:17:17Z

> Pause and Repause can be the same method, since I don't think the RA needs to know that a pair has been previously paused. The SA can determine which case it is in inside a transaction, and increment the relevant metric.

On its face I think this is a good idea. However, it was implemented this way because I had always assumed that these observations would be confined to the RA. I'm a little uncomfortable with the idea of emitting metrics dependent on business logic inside of the storage layer.

> CheckPair[s]Paused can be a single method, no need for separate singular/plural implementations (when would we be calling the singular one?). And rather than taking a single identifier type and many values, it should take a repeated message Identifier { string type = 1; string value = 2 }. Let the SA handle the need for multiple separate db queries if necessary; no need to expose that deficiency of our database schema to the RA.

Fair. I'll fix this.

> I'm not clear on why we need both UnpausePair and UnpauseAccount, since the current design doc only calls for ever unpausing whole accounts at a time.

You're right, UnpausePair was an oversight on my part.

> There's one more method which might be necessary: GetPausedIdentifiersForAccount, to be used by the self-service unpausing page, to populate it with a (truncated) list of identifiers that will be unpaused.

Thanks, I'll add this.

aarongable · 2024-05-21T16:33:58Z

> On its face I think this is a good idea. However, it was implemented this way because I had always assumed that these observations would be confined to the RA. I'm a little uncomfortable with the idea of emitting metrics dependent on business logic inside of the storage layer.

I totally agree with this line of thinking. But for me it's a balancing act. I think emitting the "repaused" metric from the SA is slightly unfortunate, because it does feel like an RA sort of thing to be measuring. But I think that exposing twice as many methods from the SA and having strict calling conventions for those methods that will fail if the RA is ever wrong (or races against another RA!) is significantly more unfortunate. Requiring every potential SA client to have identical "call CheckPairsPaused, then depending on its answer, either call Pause or Repause, and gracefully handle the error in case the situation has changed between those two calls" logic feels much worse than allowing SA clients to just say "hey, pause this one" and letting the SA handle the intricacies of making that idempotent inside a database transaction.

beautifulentropy · 2024-05-21T16:42:29Z

> > On its face I think this is a good idea. However, it was implemented this way because I had always assumed that these observations would be confined to the RA. I'm a little uncomfortable with the idea of emitting metrics dependent on business logic inside of the storage layer.

> I totally agree with this line of thinking. But for me it's a balancing act. I think emitting the "repaused" metric from the SA is slightly unfortunate, because it does feel like an RA sort of thing to be measuring. But I think that exposing twice as many methods from the SA and having strict calling conventions for those methods that will fail if the RA is ever wrong (or races against another RA!) is significantly more unfortunate. Requiring every potential SA client to have identical "call CheckPairsPaused, then depending on its answer, either call Pause or Repause, and gracefully handle the error in case the situation has changed between those two calls" logic feels much worse than allowing SA clients to just say "hey, pause this one" and letting the SA handle the intricacies of making that idempotent inside a database transaction.

I agree and as long as I've got support from you I'm happy to implement this as described above.
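For readers following the thread, the single idempotent pause method agreed on above might be sketched roughly as follows. This is not the PR's implementation: an in-memory map guarded by a mutex stands in for the database transaction, and every name here (`pauseStore`, `pause`, `unpauseAccount`, the `outcome` values) is hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// pairKey identifies an (account, identifier) pair.
type pairKey struct {
	regID int64
	value string
}

// pauseRecord is a hypothetical stand-in for a row in the paused table.
type pauseRecord struct {
	pausedAt   time.Time
	unpausedAt *time.Time // nil while the pair is paused
}

// pauseStore is an in-memory stand-in for the database. The mutex plays the
// role of the transaction: the read and the write happen atomically.
type pauseStore struct {
	mu   sync.Mutex
	rows map[pairKey]*pauseRecord
}

func newPauseStore() *pauseStore {
	return &pauseStore{rows: make(map[pairKey]*pauseRecord)}
}

// outcome tells the caller which case was hit, so that the matching metric
// can be incremented in one place.
type outcome string

const (
	paused        outcome = "paused"
	repaused      outcome = "repaused"
	alreadyPaused outcome = "already-paused" // assumption: treated as a no-op
)

// pause is safe to call any number of times for the same pair.
func (s *pauseStore) pause(regID int64, value string) outcome {
	s.mu.Lock()
	defer s.mu.Unlock()

	key := pairKey{regID, value}
	row, ok := s.rows[key]
	switch {
	case !ok:
		s.rows[key] = &pauseRecord{pausedAt: time.Now()}
		return paused
	case row.unpausedAt != nil:
		row.pausedAt = time.Now()
		row.unpausedAt = nil
		return repaused
	default:
		return alreadyPaused
	}
}

// unpauseAccount marks every currently paused pair of the account as unpaused.
func (s *pauseStore) unpauseAccount(regID int64) {
	s.mu.Lock()
	defer s.mu.Unlock()

	now := time.Now()
	for key, row := range s.rows {
		if key.regID == regID && row.unpausedAt == nil {
			row.unpausedAt = &now
		}
	}
}

func main() {
	s := newPauseStore()
	fmt.Println(s.pause(1, "example.com")) // paused
	fmt.Println(s.pause(1, "example.com")) // already-paused
	s.unpauseAccount(1)
	fmt.Println(s.pause(1, "example.com")) // repaused
}
```

The point of the sketch is that the caller never needs a check-then-act sequence: it asks for a pause and the store decides, atomically, which of the cases applies.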

pgporada · 2024-05-29T18:17:58Z

sa/model.go

+// provided account. If no matches are found, an empty slice is returned. If a
+// non-DNS identifier is provided, an error is returned.
+func checkIdentifiersPaused(ctx context.Context, dbs db.Selector, regID int64, ids pauseIds) ([]string, error) {
+ if len(ids) == 0 {


ids is a []pauseId so this could just be a nil check because a slice zero value is nil. @aarongable brought up a good point in a different PR that this catches the case where the []pauseids slice exists but has an obviously-invalid (zero) length.

Meta question: what is the safest and most correct way of checking? https://go.dev/play/p/m4QpBnr6DP5

The idiomatic thing to do for slices is to check their length, because it is a single highly-readable check that works on both nil and non-nil-but-empty slices.
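A small sketch of the point about length checks, using a plain `[]string` in place of the PR's `pauseIds` type; `isEmpty` is a hypothetical helper name used only for illustration.

```go
package main

import "fmt"

// isEmpty reports whether there is nothing to check. len works the same on
// a nil slice and on a non-nil slice of length zero.
func isEmpty(ids []string) bool {
	return len(ids) == 0
}

func main() {
	var nilSlice []string     // zero value: nil
	emptySlice := []string{}  // non-nil, length zero

	fmt.Println(nilSlice == nil, isEmpty(nilSlice))     // true true
	fmt.Println(emptySlice == nil, isEmpty(emptySlice)) // false true
	fmt.Println(isEmpty([]string{"example.com"}))       // false
}
```

A bare `ids == nil` check would let the non-nil, zero-length slice through, which is the case the length check is meant to catch.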

pgporada · 2024-05-29T18:18:12Z

sa/model.go

+func checkIdentifiersPaused(ctx context.Context, dbs db.Selector, regID int64, ids pauseIds) ([]string, error) {
+ if len(ids) == 0 {
+ // No identifier values to check.
+ return []string{}, nil


You could return nil, nil here, since the function returns []string anyway and you avoid allocating a slice containing nothing.
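As a rough illustration of what changes when a function returns `nil` instead of `[]string{}`: callers that only use `len` or `range` cannot tell the two apart, while callers that compare against `nil` or JSON-encode the result can. Whether that matters for this function's callers is not something the thread establishes; `pausedNil`, `pausedEmpty`, and `encode` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pausedNil and pausedEmpty model the two candidate return values.
func pausedNil() []string   { return nil }
func pausedEmpty() []string { return []string{} }

// encode returns the JSON encoding of v as a string.
func encode(v []string) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	for _, got := range [][]string{pausedNil(), pausedEmpty()} {
		// Both have length zero and range over nothing.
		fmt.Println(len(got), got == nil, encode(got))
	}
	// Output:
	// 0 true null
	// 0 false []
}
```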

pgporada · 2024-05-29T19:53:22Z

sa/model.go

+
+// pausedModel represents a row in the paused table. The pausedAt and unpausedAt
+// fields are pointers because they are NULL-able columns. At no point should
+// both pausedAt and unpausedAt be non-NULL. Valid states are:


> At no point should both pausedAt and unpausedAt be non-NULL.

How does this square with the example directly below it?

// - identifier unpaused: pausedAt is non-NULL, unpausedAt is non-NULL

aarongable · 2024-05-29T16:37:22Z

sa/model.go

+
+// pausedModel represents a row in the paused table. The pausedAt and unpausedAt
+// fields are pointers because they are NULL-able columns. At no point should
+// both pausedAt and unpausedAt be non-NULL. Valid states are:


Suggested change

-// both pausedAt and unpausedAt be non-NULL. Valid states are:
+// both pausedAt and unpausedAt be NULL. Valid states are:
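The corrected invariant can be made concrete with a small sketch. `pauseRow` is a hypothetical, trimmed-down stand-in for `pausedModel`, and `newRow` exists only to build examples. The "unpaused" state (both fields non-NULL) comes from the comment quoted in this thread; the "paused" state and the treatment of a NULL pausedAt as invalid are assumptions.

```go
package main

import (
	"fmt"
	"time"
)

// pauseRow is a hypothetical stand-in for pausedModel. The fields are
// pointers because the underlying columns are NULL-able.
type pauseRow struct {
	PausedAt   *time.Time
	UnpausedAt *time.Time
}

// state classifies a row.
// Assumption: any row whose PausedAt is NULL is considered invalid.
func (r pauseRow) state() string {
	switch {
	case r.PausedAt != nil && r.UnpausedAt == nil:
		return "paused"
	case r.PausedAt != nil && r.UnpausedAt != nil:
		return "unpaused"
	default:
		return "invalid"
	}
}

// newRow builds a row with the requested columns set to the current time.
func newRow(paused, unpaused bool) pauseRow {
	now := time.Now()
	var r pauseRow
	if paused {
		r.PausedAt = &now
	}
	if unpaused {
		r.UnpausedAt = &now
	}
	return r
}

func main() {
	fmt.Println(newRow(true, false).state())  // paused
	fmt.Println(newRow(true, true).state())   // unpaused
	fmt.Println(newRow(false, false).state()) // invalid
}
```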

aarongable · 2024-05-29T16:40:44Z

sa/model.go

+// checkIdentifiersPaused takes a slice of identifiers of type "dns" and returns
+// a slice of the first 15 identifier values which are currently paused for the
+// provided account. If no matches are found, an empty slice is returned. If a
+// non-DNS identifier is provided, an error is returned.


I think, for future readers, we should note that this decision to reject non-DNS identifiers is not about the existence of non-DNS identifiers themselves, but rather about the fact that we don't (yet) want this function to eat the complexity of supporting more than one identifier type at a time.
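One way to read this comment: the rejection is a guard on the function's current scope, not a statement about which identifier types exist. A minimal sketch, with the hypothetical names `dnsValues` and `errUnsupportedType`:

```go
package main

import (
	"errors"
	"fmt"
)

// Identifier is a (type, value) pair, as in the proposed proto message.
type Identifier struct {
	Type  string
	Value string
}

// errUnsupportedType signals a limitation of this function only. Other
// identifier types may well exist; this code path just does not handle
// more than one type at a time yet.
var errUnsupportedType = errors.New(`only identifiers of type "dns" are supported here`)

// dnsValues returns the values of ids, or an error as soon as it meets an
// identifier whose type is not "dns".
func dnsValues(ids []Identifier) ([]string, error) {
	values := make([]string, 0, len(ids))
	for _, id := range ids {
		if id.Type != "dns" {
			return nil, fmt.Errorf("identifier %q has type %q: %w", id.Value, id.Type, errUnsupportedType)
		}
		values = append(values, id.Value)
	}
	return values, nil
}

func main() {
	vals, err := dnsValues([]Identifier{{Type: "dns", Value: "example.com"}})
	fmt.Println(vals, err)

	_, err = dnsValues([]Identifier{{Type: "ip", Value: "192.0.2.1"}})
	fmt.Println(errors.Is(err, errUnsupportedType))
}
```

Wrapping a sentinel error keeps the limitation explicit and lets callers test for it, which would make it easier to lift the restriction later.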

aarongable · 2024-05-29T18:31:16Z

sa/proto/sa.proto

+ rpc CheckIdentifiersPaused (CheckIdentifiersPausedRequest) returns (Hostnames) {}
+ rpc GetPausedIdentifiersForAccount (RegistrationID) returns (Hostnames) {}


Why not have these return Identifiers? We're being flexible enough to support taking different kinds of identifiers as input, we should do the same for results.

aarongable · 2024-05-29T20:27:34Z

sa/proto/sa.proto

@@ -96,6 +100,8 @@ service StorageAuthority {
 rpc UpdateRevokedCertificate(RevokeCertificateRequest) returns (google.protobuf.Empty) {}
 rpc LeaseCRLShard(LeaseCRLShardRequest) returns (LeaseCRLShardResponse) {}
 rpc UpdateCRLShard(UpdateCRLShardRequest) returns (google.protobuf.Empty) {}
+ rpc PauseIdentifier(PauseIdentifierRequest) returns (PauseIdentifierResponse) {}


Truly a question: would it make sense for this to take a plural set of identifiers to pause? If the RA detects that they've hit the limit for many names at the same time (which wouldn't surprise me, in the median case of requesting foo.co and www.foo.co) then we could pause all of them with a single request.

aarongable · 2024-05-29T21:19:37Z

sa/model.go

+// provided account. If no matches are found, an empty slice is returned. If a
+// non-DNS identifier is provided, an error is returned.
+func checkIdentifiersPaused(ctx context.Context, dbs db.Selector, regID int64, ids pauseIds) ([]string, error) {
+ if len(ids) == 0 {


The idiomatic thing to do for slices is to check their length, because it is a single highly-readable check that works on both nil and non-nil-but-empty slices.

aarongable · 2024-05-29T21:24:08Z

sa/saro.go

+func (ssa *SQLStorageAuthority) GetSerialsByAccount(req *sapb.RegistrationID, stream sapb.StorageAuthority_GetSerialsByAccountServer) error {
+ return ssa.SQLStorageAuthorityRO.GetSerialsByAccount(req, stream)
+}
+


This wrapper has been erroneously re-added, and the similar wrappers below are no longer necessary, thanks to #7501

aarongable · 2024-05-29T21:27:07Z

sa/model.go

+// a slice of the first 15 identifier values which are currently paused for the
+// provided account. If no matches are found, an empty slice is returned. If a
+// non-DNS identifier is provided, an error is returned.
+func checkIdentifiersPaused(ctx context.Context, dbs db.Selector, regID int64, ids pauseIds) ([]string, error) {


Why is this function (and the similar complex helper functions below) in this file? These are each called in only one place, and they have no direct tests (all of the tests operate at the SA level). Why not inline their logic directly into the SA methods? This would also prevent having to deal with the annoyance of ensuring the executor they're called with is actually a transaction.

SA: Implement schema and methods for (account, hostname) pausing

661be55

beautifulentropy marked this pull request as ready for review May 16, 2024 15:42

beautifulentropy requested a review from a team as a code owner May 16, 2024 15:42

beautifulentropy requested a review from aarongable May 16, 2024 15:42

beautifulentropy marked this pull request as draft May 16, 2024 16:53

Add UnpauseAccount method.

60c5f23

beautifulentropy commented May 16, 2024

beautifulentropy marked this pull request as ready for review May 16, 2024 17:45

letsencrypt deleted a comment from github-actions bot May 16, 2024

aarongable reviewed May 20, 2024

pgporada self-requested a review May 22, 2024 14:19

beautifulentropy added 3 commits May 24, 2024 17:21

Addressing comments.

bed01cf

Merge branch 'main' into fail3pause-part-1

56c5316

Fixing a possible panic.

588a35d

beautifulentropy force-pushed the fail3pause-part-1 branch from ec34ece to 588a35d Compare May 24, 2024 21:49

beautifulentropy requested a review from aarongable May 24, 2024 21:51

Use db.IsNoRows() instead of errors.Is()

61bfbde

beautifulentropy force-pushed the fail3pause-part-1 branch from 8918fe4 to 8e95c04 Compare May 28, 2024 20:12

Add new getters to saro proto

7adf408

beautifulentropy force-pushed the fail3pause-part-1 branch from 8e95c04 to 7adf408 Compare May 28, 2024 20:25

pgporada reviewed May 29, 2024

aarongable reviewed May 29, 2024
