Add `validate_record` API to upgraded contexts #1218
Conversation
This API serves multiple purposes:

## We could never make the existing API work and kept discovering bugs associated with it

Previously, the validation process was separate, detached from the actual protocol, and validated everything at once. We saw multiple occurrences where reveal was called before sharings were validated.

## The previous approach did not integrate smoothly with vectorization

Vectorization makes data processed in chunks. Going between different chunk sizes (one for data, one for validation) has proven to be challenging, and the code written for it was hard to read.

## Validate record API

The core of this proposal is to put a `validate_record` API on the `UpgradedContext` that takes a `record_id` and blocks execution until this record (and others in the same batch) has been validated. FWIW, this is exactly how the ZKP validator works now. This API brings MAC and ZKP validation closer together.

In addition to introducing this API, this change also updates all uses of MAC validators and contexts to use it; a sketch of the resulting call pattern follows the lists below.

The pros include:

* `validate_record` must now be called on a per-record basis, making protocols easier to review. One can see that the validate call is there and that it sits right before the reveal.
* Reveal can have special power and abort if the record being revealed hasn't been validated yet. Because `UpgradedContext` can now keep track of what has been validated, we can add this functionality later.
* No chunk conversion is required on the protocol side. Protocols can simply be written without doing magic conversions.
* Validation can now be done in batches, transparently to the code calling `validate_record`, and it integrates smoothly with `seq_join` (no need for a special `validated_seq_join`).

Downsides:

* Tracking total records is difficult. We still don't have a good solution for some of our protocols where we need a different total records count per step, created inside `UpgradedContext`.
* `record_id` appears more and more in the API. Traits like `ShareKnownValue` must be updated to support it.
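To make the order of operations concrete, here is a minimal sketch of a protocol using the new API. The `multiply` and `reveal` helpers, the associated field type, and the exact signatures are placeholders for illustration, not the actual code in this PR:

```rust
// Sketch only: `multiply` and `reveal` are hypothetical helpers standing in
// for the real protocol plumbing; the point is the call order.
async fn multiply_and_reveal<C: UpgradedContext>(
    ctx: C,
    record_id: RecordId,
    a: &Replicated<C::Field>,
    b: &Replicated<C::Field>,
) -> Result<C::Field, Error> {
    // Run the multiplication for this record.
    let product = multiply(ctx.clone(), record_id, a, b).await?;

    // Block until the batch containing `record_id` has been MAC-validated.
    // Batching happens transparently inside the upgraded context.
    ctx.validate_record(record_id).await?;

    // Only after validation is it safe to open the value.
    reveal(ctx, record_id, &product).await
}
```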
I captured a couple things that we may want to reconcile with the DZKP APIs in #1219, feel free to add to the list.
```rust
fn r_share(&self, record_id: RecordId) -> Replicated<F::ExtendedField> {
    // it's unfortunate, but carrying references across mutex boundaries is not possible
```
I don't understand this comment. Is it saying that it's unfortunate the `with_batch` call is necessary here?

In the old code there was a comment about `r_share` being intentionally private; I think it might be worth maintaining something like it in the new version.
Yeah, before we could return a reference; now we must clone each time we call `r_share`. That's unfortunate, but there is no other way.
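For readers less familiar with the borrow-checker issue, here is a standalone, simplified illustration of why the clone is unavoidable (hypothetical types; `u64` stands in for `Replicated<F::ExtendedField>`):

```rust
use std::sync::Mutex;

struct Batch {
    r_shares: Vec<u64>, // stand-in for the real share type
}

struct Validator {
    batch: Mutex<Batch>,
}

impl Validator {
    fn r_share(&self, record_id: usize) -> u64 {
        // Returning `&u64` would not compile: the reference would borrow
        // from the MutexGuard, which is dropped when this function returns.
        // Copying (or, for non-Copy share types, cloning) the value is the
        // only safe option.
        self.batch.lock().unwrap().r_shares[record_id]
    }
}
```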
```rust
pub(super) batch: B,
pub(super) notify: Arc<Notify>,
records_per_batch: usize,
records: AtomicUsize,
```
I think that this atomic ended up being unnecessary, or at least not useful, given that the batches are behind a mutex anyway. I'm fine if we merge this as is and clean it up later, though. (Besides cleanup of a possibly unnecessary atomic, there's also the question of resolving code duplication with the DZKP validator version of this code.)
Yeah, to keep others informed: I think this interface ended up not working properly, for MAC validation at least. There is no mechanism for records failing validation to abort the pending futures; in this case they get resolved and execution resumes for them. If those futures are behind `seq_join`, then they get cancelled, but the damage may already be done.

So this requires a major overhaul, I agree. Let's clean this up in one PR if that's possible.
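For context, here is a simplified, assumed reconstruction of how the fields above (`notify`, `records_per_batch`, `records`) might cooperate, using tokio's `Notify`. It also shows the failure mode described above: `notify_waiters` wakes every pending `validate_record` call regardless of whether validation succeeded.

```rust
use std::pin::pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

use tokio::sync::Notify;

struct BatchState {
    records_per_batch: usize,
    records: AtomicUsize,
    notify: Arc<Notify>,
}

impl BatchState {
    /// Each record calls this; the future resolves once the whole batch
    /// has been processed.
    async fn validate_record(&self) {
        // Register as a waiter *before* incrementing the counter, so a
        // concurrent `notify_waiters` call cannot be missed.
        let mut notified = pin!(self.notify.notified());
        notified.as_mut().enable();
        let seen = self.records.fetch_add(1, Ordering::AcqRel) + 1;
        if seen == self.records_per_batch {
            // Last record of the batch: validation would run here, then
            // every waiter is woken. Note the problem discussed above:
            // waiters resume execution even if validation failed.
            self.notify.notify_waiters();
        } else {
            notified.await;
        }
    }
}
```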
```diff
 // send our `u_i+1` value to the helper on the right
 let (u_share, w_share) = self.propagate_u_and_w().await?;

 // This should probably be done in parallel with the futures above
 let narrow_ctx = self
     .validate_ctx
     .narrow(&ValidateStep::RevealR)
-    .set_total_records(TotalRecords::ONE);
+    .set_total_records(TotalRecords::Indeterminate);
```
What would it take to specify total records here? (From a performance perspective it probably doesn't matter, but it would be nice to eliminate uses of `TotalRecords::Indeterminate`.)
I can't see how we can make this work with the current limitation (one active batch). The problem here is that we require data to be communicated right away: even if we could set the total records correctly using `total_records / batch_size`, we can't make progress until the first batch is validated.

We could probably narrow to a unique step per batch (sketched below). That will likely stress our compact gate system, as we would require ~50k unique steps to validate 50M records.

If we have multi-batch support, then I guess we can set up `total_records` for `RevealR` and validate all batches at the same time, but that will likely mean too many pending futures at the time of validation.

The most reasonable approach to eliminating `Indeterminate` records would be to support a custom batch size per step in the compact gate macro. That way, we can set the total records correctly and use a batch size of 1 for the `RevealR` and `CheckZero` steps.
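A hypothetical sketch of the "unique step per batch" idea from above; the parameterized `ValidateStep::RevealR(usize)` variant does not exist today and is shown only to illustrate the shape of the change:

```rust
// Hypothetical: narrow to a per-batch step so each batch can use a concrete
// record count instead of TotalRecords::Indeterminate.
let narrow_ctx = self
    .validate_ctx
    .narrow(&ValidateStep::RevealR(batch_index)) // one unique step per batch
    .set_total_records(TotalRecords::ONE);
```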
Codecov Report

Attention: Patch coverage is …

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main    #1218      +/-   ##
==========================================
+ Coverage   92.69%   92.77%   +0.08%
==========================================
  Files         196      198       +2
  Lines       30503    30781     +278
==========================================
+ Hits        28274    28558     +284
+ Misses       2229     2223       -6
```

☔ View full report in Codecov by Sentry.