# Add validate_record API to upgraded contexts #1218

**Merged** · 2 commits into private-attribution:main · Aug 12, 2024

## Conversation

**akoshelev** (Collaborator) commented:

This API serves multiple purposes:

## We could never make the existing API work and kept discovering bugs associated with it.

Previously, the validation process was separate, detached from the actual protocol, and validated everything at once. We saw multiple occurrences where reveal was called before sharings were validated.

## Previous approach did not integrate smoothly with vectorization

Vectorization causes data to be processed in chunks. Moving between different chunk sizes (one for data, one for validation) has proven challenging, and the code written for it was hard to read.

## Validate record API

The core of this proposal is to add a `validate_record` API to `UpgradedContext` that takes a `record_id` and blocks execution until that record (and others in the same batch) has been validated. FWIW, this is exactly how the ZKP validator works now.

This API brings MAC and ZKP validation closer together.

In addition to introducing this API, this change updates all uses of MAC validators and contexts to use it.

The pros include:
* `validate_record` must now be called on a per-record basis, making protocols easier to review: one can see that the validate call is there, right before the reveal.
* Reveal can gain the special power to abort if the record being revealed hasn't been validated yet. Because `UpgradedContext` can now keep track of what has been validated, we can add this functionality later.
* No chunk conversion is required on the protocol side; protocols can be written simply, without magic conversions.
* Validation can now be done in batches, transparently to the code calling `validate_record`, and it integrates smoothly with `seq_join` (no need for a special `validated_seq_join`).

Downsides:

* Tracking total records is difficult. We still don't have a good solution for protocols where we need different total records per step, created inside `UpgradedContext`.
* `record_id` appears more and more in the API. Traits like `ShareKnownValue` must be updated to support it.
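To make the blocking behavior concrete, here is a minimal, self-contained sketch of the batching idea behind `validate_record`. This is not IPA code and all names are hypothetical; a `Condvar` stands in for the async notification machinery, and filling a batch stands in for the actual MAC/ZKP check. Callers block until the batch containing their record has been validated, and the arrival of the last record in a batch releases everyone waiting on it.

```rust
use std::sync::{Arc, Condvar, Mutex};

/// Hypothetical stand-in for the batcher behind `UpgradedContext::validate_record`.
struct BatchValidator {
    state: Mutex<State>,
    cv: Condvar,
}

struct State {
    batch_size: usize,
    pending: usize,           // records that have arrived in the current batch
    validated_through: usize, // record ids below this value are validated
}

impl BatchValidator {
    fn new(batch_size: usize) -> Arc<Self> {
        Arc::new(Self {
            state: Mutex::new(State { batch_size, pending: 0, validated_through: 0 }),
            cv: Condvar::new(),
        })
    }

    /// Blocks until the batch containing `record_id` has been validated.
    /// The validation itself is elided; filling a batch stands in for it.
    fn validate_record(&self, record_id: usize) {
        let mut s = self.state.lock().unwrap();
        s.pending += 1;
        if s.pending == s.batch_size {
            // Last arrival "validates" the whole batch and wakes the waiters.
            s.validated_through += s.batch_size;
            s.pending = 0;
            self.cv.notify_all();
        }
        while record_id >= s.validated_through {
            s = self.cv.wait(s).unwrap();
        }
    }
}

fn main() {
    let v = BatchValidator::new(4);
    let handles: Vec<_> = (0..4)
        .map(|i| {
            let v = Arc::clone(&v);
            std::thread::spawn(move || v.validate_record(i))
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    println!("all 4 records validated");
}
```

The key property this models is the one the PR describes: each record's future parks until its whole batch is checked, so the "validate, then reveal" ordering is enforced per record rather than at the end of the protocol.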
**@andyleiserson** (Collaborator) left a comment:
I captured a couple things that we may want to reconcile with the DZKP APIs in #1219, feel free to add to the list.

```rust
}

fn r_share(&self, record_id: RecordId) -> Replicated<F::ExtendedField> {
    // it's unfortunate, but carrying references across mutex boundaries is not possible
```
**andyleiserson** (Collaborator):
I don't understand this comment. Is it saying that it's unfortunate the with_batch call is necessary here?

In the old code there was a comment about r_share being intentionally private, I think it might be worth maintaining something like it in the new version.

**akoshelev** (Collaborator, Author):

Yeah, before we could return a reference; now we must clone each time we call `r_share`. That's unfortunate, but there is no other way.
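A minimal illustration (not IPA code; the types are made up) of the constraint being described: a getter on data protected by a `Mutex` cannot hand out a reference, because the reference would outlive the `MutexGuard` that is dropped when the getter returns, so every call must copy or clone the value instead.

```rust
use std::sync::Mutex;

struct Shares {
    // Stand-in for validator state protected by a mutex.
    inner: Mutex<Vec<u64>>,
}

impl Shares {
    // Returning `&u64` here would not compile: the borrow would have to
    // outlive the MutexGuard, which is dropped when this function returns.
    fn get(&self, i: usize) -> u64 {
        let guard = self.inner.lock().unwrap();
        guard[i] // u64 is Copy; a non-Copy share type would need .clone()
    }
}
```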

```rust
pub(super) batch: B,
pub(super) notify: Arc<Notify>,
records_per_batch: usize,
records: AtomicUsize,
```
**andyleiserson** (Collaborator):

I think that this atomic ended up being unnecessary, or at least not useful, given that the batches are behind a mutex anyway. I'm fine if we merge this as is and clean it up later, though. (Besides cleanup of a possibly unnecessary atomic, there's also a question of resolving code duplication with the DZKP validator version of this code.)

**akoshelev** (Collaborator, Author):

Yeah, to keep others informed: I think this interface ended up not working properly for MAC validation, at least. There is no mechanism for records failing validation to abort the pending futures; in that case they get resolved and execution resumes for them. If they are behind `seq_join` then they get cancelled, but the damage may already be done.

So this requires a major overhaul, I agree. Let's clean this up in one PR if that's possible.

```diff
 // send our `u_i+1` value to the helper on the right
 let (u_share, w_share) = self.propagate_u_and_w().await?;

 // This should probably be done in parallel with the futures above
 let narrow_ctx = self
     .validate_ctx
     .narrow(&ValidateStep::RevealR)
-    .set_total_records(TotalRecords::ONE);
+    .set_total_records(TotalRecords::Indeterminate);
```
**andyleiserson** (Collaborator):

What would it take to specify total records here? (From a performance perspective it probably doesn't matter, but it would be nice to eliminate uses of `TotalRecords::Indeterminate`.)

**akoshelev** (Collaborator, Author):

I can't see how we can make this work with the current limitation (one active batch). The problem is that we require data to be communicated right away: even if we could set the total records correctly using `total_records / batch_size`, we can't make progress until the first batch is validated.

We could probably narrow to a unique step per batch. That will likely stress our compact gate system, as we would require ~50k unique steps to validate 50M records.

If we had multi-batch support, then I guess we could set up total records for `RevealR` and validate all batches at the same time, but that would likely mean too many pending futures at validation time.

The most reasonable approach to eliminating `Indeterminate` records would be to support a custom batch size per step in the compact gate macro. That way, we can set the total records correctly and the batch size to 1 for the `RevealR` and `CheckZero` steps.

**codecov bot** commented Aug 12, 2024:

Codecov Report

Attention: Patch coverage is 94.72141% with 18 lines in your changes missing coverage. Please review.

Project coverage is 92.77%. Comparing base (468e1e3) to head (322e377).
Report is 15 commits behind head on main.

| File | Patch % | Lines missing |
|---|---|---|
| ipa-core/src/protocol/context/batcher.rs | 86.11% | 10 Missing ⚠️ |
| ipa-core/src/protocol/context/malicious.rs | 93.65% | 4 Missing ⚠️ |
| ipa-core/src/protocol/context/semi_honest.rs | 0.00% | 3 Missing ⚠️ |
| ipa-core/src/protocol/context/validator.rs | 99.05% | 1 Missing ⚠️ |
Additional details and impacted files
```
@@            Coverage Diff             @@
##             main    #1218      +/-   ##
==========================================
+ Coverage   92.69%   92.77%   +0.08%
==========================================
  Files         196      198       +2
  Lines       30503    30781     +278
==========================================
+ Hits        28274    28558     +284
+ Misses       2229     2223       -6
```


**akoshelev** merged commit 04cf79a into private-attribution:main on Aug 12, 2024. 12 checks passed.