Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Notifo stops processing events? #225

Open
benjyiw opened this issue Mar 19, 2024 · 6 comments
Open

Notifo stops processing events? #225

benjyiw opened this issue Mar 19, 2024 · 6 comments

Comments

@benjyiw
Copy link

benjyiw commented Mar 19, 2024

Hi again,

I have ran into a bug and unfortunately I don’t have many details to provide. I figure I would just open the issue now to see if that sparked any ideas from your end. I intend to provide more information if/when it happens again, so that I can drill down and provide more logs and perhaps poke around in the DB.

At some point in the past 90 days, Notifo seemingly stopped processing events. I am running Notifo in k8s, so my first hail mary attempt at a resolution was to kill the Notifo pod, and unsatisfyingly that fixed it. This happened for two separate environments, so this is a real bug, but unfortunately I have nothing that I think would be very helpful to provide as I can’t cause the bug to reoccur in order to get more information. I'll provide what I do have below.


Prior to restart of the pod the only logs I saw were simple HTTP access logs.

During pod shutdown, Notifo logged the following errors, I don't know if this is actually problematic:

2024-03-19 11:05:32	
{"logLevel":"Error","stoppedSystemFailed":"Messaging.Consumer(user-events)","timestamp":"2024-03-19T17:05:32Z","severity":"ERROR","exception":{"type":"System.NullReferenceException","message":"Object reference not set to an instance of an object.","stackTrace":"   at RabbitMQ.Client.Impl.ModelBase.BasicCancel(String consumerTag)\n   at RabbitMQ.Client.Impl.AutorecoveringModel.BasicCancel(String consumerTag)\n   at Squidex.Messaging.RabbitMq.RabbitMqSubscription.DisposeAsync()\n   at Squidex.Messaging.Implementation.DelegatingConsumer.StopAsync(CancellationToken ct)\n   at Squidex.Hosting.BackgroundHost.StopAsync(CancellationToken cancellationToken)"}}

2024-03-19 11:05:32	
{"logLevel":"Error","stoppedSystemFailed":"Messaging.Consumer(events)","timestamp":"2024-03-19T17:05:32Z","severity":"ERROR","exception":{"type":"System.NullReferenceException","message":"Object reference not set to an instance of an object.","stackTrace":"   at RabbitMQ.Client.Impl.ModelBase.BasicCancel(String consumerTag)\n   at RabbitMQ.Client.Impl.AutorecoveringModel.BasicCancel(String consumerTag)\n   at Squidex.Messaging.RabbitMq.RabbitMqSubscription.DisposeAsync()\n   at Squidex.Messaging.Implementation.DelegatingConsumer.StopAsync(CancellationToken ct)\n   at Squidex.Hosting.BackgroundHost.StopAsync(CancellationToken cancellationToken)"}}

2024-03-19 11:05:32	
{"logLevel":"Error","stoppedSystemFailed":"Messaging.Consumer(confirms)","timestamp":"2024-03-19T17:05:32Z","severity":"ERROR","exception":{"type":"System.NullReferenceException","message":"Object reference not set to an instance of an object.","stackTrace":"   at RabbitMQ.Client.Impl.ModelBase.BasicCancel(String consumerTag)\n   at RabbitMQ.Client.Impl.AutorecoveringModel.BasicCancel(String consumerTag)\n   at Squidex.Messaging.RabbitMq.RabbitMqSubscription.DisposeAsync()\n   at Squidex.Messaging.Implementation.DelegatingConsumer.StopAsync(CancellationToken ct)\n   at Squidex.Hosting.BackgroundHost.StopAsync(CancellationToken cancellationToken)"}}

After I restarted the application, I saw dozens of logs around new events being processed, here's one example:

{"logLevel":"Information","message":"Received event for app 026fa665-d935-4b55-a99b-edd47647039a with ID 84e0ae4b-7c86-4991-adf8-57860c858fa7 to topic users/9739ce9f-bc50-4c2b-abeb-1ace7da20ab7.","appId":"026fa665-d935-4b55-a99b-edd47647039a","id":"84e0ae4b-7c86-4991-adf8-57860c858fa7","topic":"users/9739ce9f-bc50-4c2b-abeb-1ace7da20ab7","timestamp":"2024-03-19T17:06:03Z","severity":"INFO","category":"Notifo.Domain.UserEvents.Pipeline.UserEventPublisher"}
@SebastianStehle
Copy link
Collaborator

So you are using RabbitMQ as a backbone for events?

@benjyiw
Copy link
Author

benjyiw commented Mar 19, 2024

So you are using RabbitMQ as a backbone for events?

Yep, that's what we are currently using.

@SebastianStehle
Copy link
Collaborator

I had a look to the code and I think right here a log statement could help: https://github.com/Squidex/libs/blob/main/messaging/Squidex.Messaging.RabbitMq/RabbitMqSubscription.cs#L51

I will provide that asap. Perhaps it helps. We could also think about adding a method to get the number of pending messages and add that to the health check endpoint.,

@SebastianStehle
Copy link
Collaborator

I have pushed a new version with the improved logging.

@benjyiw
Copy link
Author

benjyiw commented Mar 22, 2024

Thank you! I found the squidex/notifo-dev image and saw that it has a new tag on it. I'll try that out and update this issue if I learn anything more.

@benjyiw
Copy link
Author

benjyiw commented Mar 22, 2024

I was curious, so I checked the most recent commits, and I don't actually see the logging changes to the RabbitMQ messaging implementation. Am I missing something?

Ah, never mind, I see those changes in the lib repo. I had assumed that the link before was in this repo and didn't catch that it was actually elsewhere.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants