Receiving events late/out of order #57

Open
ScottG489 opened this issue Apr 22, 2022 · 9 comments

Comments

@ScottG489

Version of the custom_component

v0.8.3

Configuration

N/A

Describe the bug

I sometimes receive events from Zoom quite some time after they occur. I'm not sure if this is a problem with Zoom or this integration, but I thought I'd get a discussion going here to see if anyone else is having this problem.

I think if this is a problem on Zoom's side, there's still potentially something we could do. Events come with a timestamp of when they occurred (distinct from when HA receives them), so we could verify that events attempting to change the current state are not older than whatever event set the current state.

An example of this problem I've run into personally has been where I'm in a meeting and my status in HA is set appropriately to on, but then an old (sometimes by 1+ hours) event comes in marking me as available and my state is updated to off.

Thoughts?

Debug log

Example of a log line from an old event (all times are UTC):

2022-04-22 20:43:28 DEBUG (MainThread) [custom_components.zoom.common] Received event: {'event': 'user.presence_status_updated', 'payload': {'account_id': '<redacted>', 'object': {'date_time': '2022-04-22T19:15:10Z', 'email': '<redacted>', 'id': '<redacted>', 'presence_status': 'Available'}}, 'event_ts': 1650654910524}

As you can see, HA receives the event at 20:43:28, but the event happened at 19:15:10.
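To make the gap concrete, here is a minimal sketch that computes the delivery delay for the log line above. It assumes `event_ts` is in milliseconds since the Unix epoch, which is consistent with the `date_time` field in this event; `event_time` is a hypothetical helper, not something from the integration.

```python
from datetime import datetime, timezone

# Values copied from the log line above (redacted fields omitted).
received_at = datetime(2022, 4, 22, 20, 43, 28, tzinfo=timezone.utc)
event = {
    "event": "user.presence_status_updated",
    "payload": {
        "object": {
            "date_time": "2022-04-22T19:15:10Z",
            "presence_status": "Available",
        }
    },
    "event_ts": 1650654910524,
}


def event_time(evt):
    """Return when the event occurred (assumes event_ts is in milliseconds)."""
    return datetime.fromtimestamp(evt["event_ts"] / 1000, tz=timezone.utc)


delay = received_at - event_time(event)
delay_minutes = int(delay.total_seconds() // 60)
print(f"Event arrived {delay_minutes} minutes after it occurred")
```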

@raman325
Owner

Thanks for the investigation and for providing all of these details. That's interesting.

I can't think of why this would be a problem within the integration, so my best guess is that this is an issue on the Zoom side with their APIs. Given that webhooks are a free feature, I wouldn't be surprised if they've had scaling issues and are deprioritizing them in favor of other, more mission-critical APIs/webhooks.

In any case, I am not quite sure how to fix this without adding polling all the time, which doesn't work for all accounts due to permissions. I could certainly discard old events but that could leave you in an odd state until the next valid timely event comes in. I need some time to think on this some more but if you have suggestions I am all ears.

@ScottG489
Author

Have you had any issues like this? I wonder if it's affecting other people.

I can't think of why it would be a problem with the integration either. The only potentially weird thing with my setup is what I discussed in my other issue (#56) where I'm using my Nabu Casa external URL instead of the one I have set up myself, but even with that I'm not exactly sure how it would be affecting things.

I wasn't suggesting we'd discard events that are "old", just events that are older than the most recent event of the same type. Correct me if I'm wrong, but I think for the purpose of setting the state, it's safe to say we'd always want to do this, correct?

I'm not familiar with the current implementation, but one potential complexity there would be that we'd need to store event times when we receive the event (for comparison later). Could be as simple as storing them as an attribute?
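A rough sketch of that idea, to show what I mean. This is not the integration's code: `LatestEventFilter` and `should_apply` are hypothetical names, the timestamps are kept in memory rather than in an attribute, and the `In_Meeting` status value and the first timestamp are assumed for illustration.

```python
class LatestEventFilter:
    """Remember the newest event_ts per event type and reject older events."""

    def __init__(self):
        # event type -> newest event_ts (milliseconds) applied so far
        self._latest_ts = {}

    def should_apply(self, evt):
        event_type = evt["event"]
        ts = evt["event_ts"]
        if ts < self._latest_ts.get(event_type, 0):
            # Older than the event that set the current state: ignore it.
            return False
        self._latest_ts[event_type] = ts
        return True


event_filter = LatestEventFilter()

# A timely "in meeting" event (illustrative timestamp) arrives first...
in_meeting = {
    "event": "user.presence_status_updated",
    "payload": {"object": {"presence_status": "In_Meeting"}},
    "event_ts": 1650659000000,
}
# ...and then the delayed "Available" event from the log above shows up.
late_available = {
    "event": "user.presence_status_updated",
    "payload": {"object": {"presence_status": "Available"}},
    "event_ts": 1650654910524,
}

applied_first = event_filter.should_apply(in_meeting)
applied_late = event_filter.should_apply(late_available)
print(applied_first, applied_late)
```

Since the dictionary here lives only in memory, it would be lost on restart; storing the last applied `event_ts` as an attribute would be one way around that.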

@raman325
Owner

It's certainly possible that the Nabu Casa reverse proxy is not forwarding the events in a timely manner, but I have a hard time believing that. I know that they recently had to address some performance issues, but an hour seems unreasonable as it would basically break all HA instances using Nabu Casa.

I see what you mean - I don't think I fully understood that idea in the original issue comment, but that makes sense! I can work on this.

I don't use Zoom anymore. I'd love to add another contributor to this repo who actually uses Zoom, because my ability to test and see issues is limited.

@ScottG489
Author

Did a little Googling and came across this:
https://devsupport.zoom.us/hc/en-us/articles/360059893812-Webhook-delays

Not sure if it's applicable here, but these delayed events are getting quite frustrating :/

@raman325
Owner

raman325 commented May 4, 2022

I don't think it's related, looking at your timeline above of an 88-minute delay. I feel like something else must be going on that's causing this issue, although I can't say what it is.

@ScottG489
Author

I found a page that shows API usage for your apps. It doesn't provide too much useful information, but does show a graph of API usage and it says there have been 0 failures.

I wonder if there are performance issues because the app is in "development" mode since it's never published. So Zoom may have it in a less reliable environment.

I'm still wondering why no one else using this integration is experiencing this issue, though.

@ScottG489
Author

I've been able to mitigate incorrect state with some custom logic (e.g. Node-RED). I am listening for other zoom_webhook events and updating my state accordingly.

The delivery of events is really unreliable, so sometimes even though I don't get the user.presence_status_updated event, I will get other events such as meeting.ended (which I've set up in my Zoom app dev settings to be sent) or meeting.participant_left, and I check that the user leaving is me. Same with meeting started or participant joined events. I'll update the Zoom binary_sensor accordingly.

Again, this really wouldn't be necessary if Zoom were reliably sending events. So I'm wondering how widespread an issue this is or if it's just me. But I thought I'd share.
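My actual logic lives in Node-RED, but the gist of it looks roughly like the Python sketch below. The function name `fallback_state` is made up for illustration, and the payload path `payload.object.participant.email` is an assumption about the participant events, so check it against the events you actually receive.

```python
def fallback_state(evt, my_email, current):
    """Return the in-meeting state implied by a meeting event.

    Falls back to `current` when the event says nothing about me.
    Assumes participant events carry payload.object.participant.email.
    """
    name = evt.get("event")
    obj = evt.get("payload", {}).get("object", {})
    participant_email = obj.get("participant", {}).get("email")

    if name == "meeting.started":
        return True
    if name == "meeting.ended":
        return False
    if name == "meeting.participant_joined" and participant_email == my_email:
        return True
    if name == "meeting.participant_left" and participant_email == my_email:
        return False
    return current


# Someone else leaving should not change my state; me leaving should.
me = "me@example.com"  # hypothetical address
other_left = {
    "event": "meeting.participant_left",
    "payload": {"object": {"participant": {"email": "other@example.com"}}},
}
i_left = {
    "event": "meeting.participant_left",
    "payload": {"object": {"participant": {"email": me}}},
}
state = True
state = fallback_state(other_left, me, state)
state_after_other = state
state = fallback_state(i_left, me, state)
print(state_after_other, state)
```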

I still do suffer from out of order events and haven't yet set up anything custom for that which I think would help a lot. A somewhat naive approach I have here is to ignore events that are older than X minutes.
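Something like the following is what I have in mind for the age check. It's only a sketch: `is_too_old` is a hypothetical helper, the 5-minute threshold is an arbitrary stand-in for "X", and it assumes `event_ts` is in milliseconds as in the log line from my original report.

```python
from datetime import datetime, timedelta, timezone

# The "X minutes" threshold; 5 is an arbitrary choice for illustration.
MAX_EVENT_AGE = timedelta(minutes=5)


def is_too_old(evt, now=None, max_age=MAX_EVENT_AGE):
    """Return True if the event occurred more than max_age before `now`."""
    if now is None:
        now = datetime.now(timezone.utc)
    occurred = datetime.fromtimestamp(evt["event_ts"] / 1000, tz=timezone.utc)
    return now - occurred > max_age


# The delayed event from the original report, checked at two receive times.
late_event = {"event": "user.presence_status_updated", "event_ts": 1650654910524}
received_late = datetime(2022, 4, 22, 20, 43, 28, tzinfo=timezone.utc)
received_promptly = datetime(2022, 4, 22, 19, 15, 30, tzinfo=timezone.utc)

print(is_too_old(late_event, now=received_late))
print(is_too_old(late_event, now=received_promptly))
```

The downside, as mentioned earlier in the thread, is that dropping an old event can leave the sensor in an odd state until the next timely event comes in.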

@raman325
Owner

Wow, that sounds pretty awful. I have personally never experienced this, although truth be told, I am no longer a Zoom user, so it's possible that the behavior changed and I just didn't notice. I would welcome a fix and would be happy to review any PRs, but I don't have the bandwidth to troubleshoot this at the moment.

@ScottG489
Author

I don't know if it's due to my custom logic in Node-RED or what, but for the past 1-2 months things have been better.

However, I just received an event that was about 6 hours old. I did a little googling around and found this:

https://devsupport.zoom.us/hc/en-us/articles/360059893812-Webhook-delays

It doesn't add up though since they claim they would stop after 60 minutes. In any case, thought I'd leave it here.


2 participants