Proposal for Message Pickup v4 #110

Open · wants to merge 13 commits into main

Conversation

@JamesKEbert (Collaborator) commented Jun 4, 2024:

This is a proposal for message-pickup v4, motivated primarily by ambiguity in live mode as to whether messages must still be delivered via delivery and messages-received. It was determined that clarifying that they are required would be considered a breaking change (#105).
This is also an issue in DIDComm v1, which uses pickup v2, so this protocol introduces examples for both DIDComm v1 and DIDComm v2.
This proposal also attempts to address other minor inconsistencies and issues that have been identified.

Open questions:
Should a delivery-request that has 0 queued messages be answered with a status message or an empty delivery message? I lean towards the empty delivery message, as it reduces complexity (if a status message is sent in response to a delivery-request due to an empty queue, and the delivery-request specified a recipient_did filter, does the status also honor that filter?). The empty delivery message was the consensus during DIDComm UG discussion.
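
For illustration, an empty delivery message under that consensus might look like the following sketch (the type URI is inferred from the piuri below; the exact body and envelope fields are assumptions, not settled spec text):

```json
{
  "id": "123456781",
  "type": "https://didcomm.org/message-pickup/4.0/delivery",
  "body": {},
  "attachments": []
}
```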

See related issues:
#105
hyperledger/aries-rfcs#760
#87

publisher: JamesKEbert
license: MIT
piuri: https://didcomm.org/message-pickup/4.0
status: Production
Contributor:
I feel we might need a somewhat more sophisticated process to advance didcomm protocols.

Because protocols are immediately versioned 1.0 (or another major version), every slight change that needs to happen based on implementation experience is a new version.

It would make sense IMO if protocols stayed in "draft" mode for a few months (e.g. 0.1 or 1.0-draft) where breaking changes can be made.

This way we don't end up with a lot of pickup protocol versions, each slightly different from the previous one.

We need some playtime with a protocol before it's "locked"

@JamesKEbert (Collaborator, Author):

I do tend to agree. I'd opt for introducing RFCs at a 0.1 version, as it makes a lot of sense to update from 0.1 to 1.0 once the RFC is finalized. Theoretically the RFC flow found in the Aries RFCs could also handle this via the status changes (PROPOSED -> DEMONSTRATED -> ACCEPTED -> ADOPTED -> RETIRED), wherein breaking changes could be made prior to moving to ACCEPTED. I don't think that's worked in practice, though.

On a related note, I find it troubling that we made new major versions of protocols that only supported DIDComm v2, as it essentially made any breaking adjustments/fixes to DIDComm v1 protocols impossible (it's really odd to make a v2 for DIDComm v1, a v3 for DIDComm v2, and then a v4 for DIDComm v1). I think it might be wise to have protocols support both (as proposed in this RFC, or as seen in https://didcomm.org/media-sharing/1.0/), or to have a mechanism in the protocol URI to indicate which DIDComm version is in use.

In this instance though, Pickup v2 was introduced in early 2022 and implemented in AFJ/Credo in mid-2022, with similar timing for support in ACA-Py. So, given that it's been around (and used in production deployments) for that long, a new version seems appropriate to me here. And Pickup v3 is basically just a reskin of Pickup v2 for DIDComm v2.

@JamesKEbert (Collaborator, Author):

We actually discussed this a good amount on one of the recent DIDComm UG calls. We agreed that starting a protocol at 0.1 would be wise, but there were fewer immediate solutions available for subsequent versions (like in this case with v3 -> v4). One option discussed was adopting semver more closely, as it allows for something along the lines of 4.0.0-draft or 4.0.0-draft.2, etc. This would be ideal, but it is a very major breaking change for protocol handler implementations.

@JamesKEbert (Collaborator, Author) commented:

As discussed in the DIDComm UG calls, I've updated the RFC with a changelog and made the adjustment to the behavior of handling a delivery-request when no pending messages are in the queue.
Any feedback welcome. @dbluhm, your input would be particularly valuable here. cc: @TelegramSam

site/content/protocols/messagepickup/4.0/readme.md (outdated, resolved)
## Roles
There are two roles in this protocol:

- `mediator`: The agent that has messages waiting for pickup by the `recipient`.
Contributor:

I guess there is a specific reason for it, but I don't see why the 'message holder' role was renamed to 'mediator' in v2. Is the Message Pickup protocol now only supported for a mediator/mediatee relationship? If that's the case, are only forwarded messages supposed to be queued and packed in this protocol, or does this also apply to any message sent from the mediator to the mediatee (e.g. a keylist update response)?

Some time ago I had a use case where mobile agents without a mediator would get credentials from an issuer: when connecting through regular HTTP, we were using Message Pickup in polling mode to check for credential issuance and other messages until the flow was finished. Do you think this is not an intended usage for this protocol?

@JamesKEbert (Collaborator, Author):

> Some time ago I had a use case where mobile agents without a mediator would get credentials from an issuer: when connecting through regular HTTP, we were using Message Pickup in polling mode to check for credential issuance and other messages until the flow was finished. Do you think this is not an intended usage for this protocol?

My current answer is that I think I would consider that to be outside the scope/intentions of this protocol. Given the already fairly involved nature of this protocol, my initial response is that I'd rather have a separate protocol dedicated to that particular use case so it can be more specific in its definitions and clarifications (especially given the troublesome, hard-to-troubleshoot issues observed with mobile message pickup over the years).

Also, some context there according to my memory: Pickup v2 was a completely fresh draft of the protocol that did not draw heavily from the v1 version (it was based more on the implicit mechanisms being used with ACA-Py), but we thought it would be ideal for it to replace the initial version.

site/content/protocols/messagepickup/4.0/readme.md (outdated, resolved)
site/content/protocols/messagepickup/4.0/readme.md (outdated, resolved)
}
```

`limit` is a **REQUIRED** attribute, and after receipt of this message, the `mediator` **SHOULD** deliver up to the limit indicated.
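
For context (the full example is truncated in the diff above), a delivery-request under this proposal might look roughly like this sketch (type URI inferred from the piuri; the optional recipient_did filter and the return_route header are assumed to follow pickup v3 conventions):

```json
{
  "id": "123456780",
  "type": "https://didcomm.org/message-pickup/4.0/delivery-request",
  "body": {
    "limit": 10,
    "recipient_did": "did:peer:2..."
  },
  "return_route": "all"
}
```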
@genaris (Contributor):

Since there is some message exchange overhead for every loop (delivery-request -> delivery -> messages-received -> status), I think it is important to keep the number of loops as minimal as possible. Is a limit by message count actually the best approach, or should we also consider the transport capabilities of the recipient (i.e. maximum single message size)?

A recipient would likely want to receive all messages at once, as long as they fit into a single message. So I'm wondering whether we could combine this specified limit with a max_receive_bytes constraint, in such a way that the mediator should consider both when constructing each delivery message.

I think this could work nicely unless a certain queued message exceeds the recipient's capabilities: in such a case, the message would be stuck in the queue indefinitely. Shall the mediator reject any inbound message for a recipient if it exceeds its reported max_receive_bytes value?

@JamesKEbert (Collaborator, Author):

> A recipient would likely want to receive all messages at once, as long as they fit into a single message. So I'm wondering whether we could combine this specified limit with a max_receive_bytes constraint, in such a way that the mediator should consider both when constructing each delivery message.

My only concern here is the additional processing power/resources required on the mediator side to determine the number of messages attachable. Is max_receive_bytes measured against the raw message size, or the size after packing for delivery (inside a delivery wrapper, and then inside a forward wrapper)?

> I think this could work nicely unless a certain queued message exceeds the recipient's capabilities: in such a case, the message would be stuck in the queue indefinitely. Shall the mediator reject any inbound message for a recipient if it exceeds its reported max_receive_bytes value?

Perhaps that should actually be covered in the Mediation Coordination protocol, since it sounds like it should be set up once and forgotten? But I do agree that having some mechanism for indicating what is too large a message is probably ideal.

@genaris (Contributor):

Well, yes, certainly that will need more processing on the mediator side. But I would expect the mediator to be more powerful than most clients, which may be running on resource-constrained devices.

I do, however, understand your point, since working with byte counts will require the mediator to query its database in a way that it can sum up each message's size, and that of course will be a lot less efficient than simply doing a `select * from messages where recipient_did = ? order by date limit 10`.

I agree that all of this can be coordinated in another protocol (or simply queried with Discover Features), and that the mediator could take this constraint into account before actually packing and sending the messages, even if the requested limit is more than the total message count it is going to pack into the delivery message. This is the solution we are following right now in our mediator and it is working quite well.

Unfortunately, this solution is not 100% suitable in this protocol as it is written right now: it will require the client to always run an extra status-request loop 'just in case' (see my other comment).

@JamesKEbert (Collaborator, Author):

I think you may be right @genaris. @TheTechmage do you know if this would be a lot of work to add to existing mediator implementations? I lean towards having it be an optional field (vs required) and then having the number of messages delivered be bounded by whichever limit is hit first (limit or max_receive_bytes).
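
A minimal sketch of that dual bound, with hypothetical names (select_for_delivery, message["packed"]) and deliberately ignoring the delivery/forward wrapper overhead raised earlier in this thread:

```python
def select_for_delivery(queued_messages, limit, max_receive_bytes=None):
    """Pick queued messages for a single delivery message.

    Stops at the recipient's requested count limit, or earlier if adding
    the next message would push the running byte total past the optional
    max_receive_bytes. Sizes are those of the stored packed messages;
    envelope overhead is ignored in this sketch.
    """
    batch, total_bytes = [], 0
    for message in queued_messages[:limit]:
        size = len(message["packed"])
        if max_receive_bytes is not None and total_bytes + size > max_receive_bytes:
            break
        batch.append(message)
        total_bytes += size
    return batch
```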


`message_id_list` is a list of `ids` of each message received. The `id` of each message is present in the attachment descriptor of each attached message of a delivery message.

Upon receipt of this message, the `mediator` knows which messages have been received, and can remove them from the collection of queued messages with confidence.
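
As a reference point, a messages-received message under this proposal might look like the following sketch (type URI inferred from the piuri; ids illustrative):

```json
{
  "id": "123456782",
  "type": "https://didcomm.org/message-pickup/4.0/messages-received",
  "body": {
    "message_id_list": ["123", "456"]
  }
}
```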
@genaris (Contributor):

I noticed that this section does not say anything about an updated status message being sent as a response to this message. This was explicitly mentioned in previous versions, so I'm wondering whether this update intends to reflect that no further response is expected, or whether it's just to avoid being too redundant (the Basic Walkthrough already states it is a response).

@JamesKEbert (Collaborator, Author):

Good catch -- that should definitely go in the changelog.

I believe this is motivated by the same complexity/ambiguity that led us to remove the status message as a follow-up to a delivery-request when no messages are queued for delivery: it is unclear whether it should be filtered by recipient_did or not (and actually can't be).
For example, in the following flow: status-request (restricted by recipient_did) -> status -> delivery -> messages-received -> status (NOT restricted by recipient_did, as there's no way to know that the messages-received was part of a filtered flow).
This may give the incorrect view that a given recipient_did has additional messages awaiting pickup, when in reality they are for a different recipient_did.

If I'm missing something here, please say so.
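
To make the ambiguity concrete, a status message scoped to a recipient_did might look like this sketch (field names assumed to follow pickup v2/v3 conventions); a status emitted after messages-received has no way to know whether such a scope applied, so its message_count can mislead a recipient polling for a single recipient_did:

```json
{
  "id": "123456783",
  "type": "https://didcomm.org/message-pickup/4.0/status",
  "body": {
    "recipient_did": "did:peer:2...",
    "message_count": 7
  }
}
```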

@genaris (Contributor):

I think it is fine to avoid sending a status as a response to messages-received. But at the beginning of the Basic Walkthrough section I see:

> This protocol consists of four different message requests from the recipient that should be replied to by the mediator:
> ...
> 3. Message Received -> Status

Which seems confusing.

A minor downside of this spec change is that it will force the recipient to always start an extra loop to determine that there are no further messages: for instance, if I have 10 messages and my limit is 10, I'll need to do a status-request -> status -> delivery-request -> delivery -> messages-received followed by a status-request -> status. But it should not matter in the vast majority of cases.

@JamesKEbert (Collaborator, Author):

I've somewhat fixed the walkthrough section (cause yeah, that's confusing -- nice catch).

> if I have 10 messages and my limit is 10, I'll need to do a status-request -> status -> delivery-request -> delivery -> messages-received followed by a status-request -> status.

That's only if you assume that you could have received a new message in the 200ms (or however long) it takes to receive the 10 messages (because the first status message informed us that we had 10 at that point in time). You could in fact receive a new message right after sending the second status-request and think you have no messages queued. I think it's an inherent problem with polling strategies.

@TheTechmage (Collaborator) commented:

Just double checking, is there any update on this?

@JamesKEbert (Collaborator, Author) commented:

@TheTechmage I think we should move forward on this. I'm going to see if I can get it discussed on today's DIDComm UG call so we can get it merged in.

@genaris (Contributor) left a review:

I have some extra comments and corrections/suggestions (I won't be offended if not taken into account), but overall LGTM.


`longest_waited_seconds` is the age, in seconds, of the oldest message in the queue.

`newest_received_time` and `oldest_received_time` are expressed in UTC Epoch Seconds (seconds since 1970-01-01T00:00:00Z) as an integer.
@genaris (Contributor):

In DIDComm v1 we have a best practice indicating that fields whose suffix is _time are strings in ISO 8601 format. Since this protocol has examples in both v1 and v2, I want to point this out because it can trigger inconsistencies with other protocols in the ecosystem.

This is probably not really relevant for this protocol, because I think a resolution in seconds is enough (although it wouldn't hurt to change the suffix to _t or _timestamp). But to me it is a big problem of DIDComm v2 compared to v1, since in some cases it is very important to distinguish the order of messages sent within the same second, and ISO 8601 provides means to do so. If anybody knows the reasons behind the decision to switch to integers (other than saving some bytes), please let me know!

@JamesKEbert (Collaborator, Author):

We discussed this in the DIDComm UG on 12-9-24. We determined that we want to discuss with additional folks in the community to get their takes on a general best practice for DIDComm, and also to bring it up for discussion in the WG spec meeting.
Specifically for this protocol, we have determined we want additional specificity, but the debate is whether to use ISO 8601 or UTC Epoch Milliseconds. I now lean towards the UTC Epoch approach since it's represented as an integer instead of a string.
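
For comparison, the same instant in the candidate representations (Python sketch; ISO 8601 preserves sub-second ordering as a string, epoch milliseconds preserve it as an integer, and epoch seconds lose it):

```python
from datetime import datetime, timezone

instant = datetime(2024, 6, 4, 12, 0, 0, 500000, tzinfo=timezone.utc)

iso_8601 = instant.isoformat()              # '2024-06-04T12:00:00.500000+00:00'
epoch_ms = int(instant.timestamp() * 1000)  # 1717502400500
epoch_s = int(instant.timestamp())          # 1717502400 (sub-second order lost)
```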
