-
Notifications
You must be signed in to change notification settings - Fork 88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Storage calls seem to result in main thread deadlocks #277
Comments
Thanks @tristan-warner-smith, I'll have a look at this and see if I can reproduce. If you could give me an example of the track calls your sending, that would be helpful. There were some instances where large IO on the main thread sent to the storage class could result in something that takes long enough for the OS to kill the app. You might pull However, your current work around will almost certainly result in corrupt data given time. Those sync's are there on purpose to synchronize writes to a single file from potentially multiple threads. |
Our users depend on us for immediate relief from pain, stress etc so we can't wait for perfect in this case. If you can think of any better workarounds @bsneed that don't jeopardise our app's usability, we'd really appreciate it. These are similar events emitted roughly on app start: Name: abcd_abcdefghijk_abcdefj
Parameters: [
"userId": "abcdefghijklmnopqrstuvwxyz01",
"is_first_session": true,
"platform": "iOS",
"trialStart": false,
"clicked_branch_link": false,
"appVersion": "3.20.0"
]
Name: "abcdefghij_abcde"
Parameters: [
"trialStart": false,
"platform": "iOS",
"appVersion": "3.20.0",
"userId": "abcdefghijklmnopqrstuvwxyz01"
]
Name: "abcdefghijklmno_abcde"
Parameters: [
"userId": "abcdefghijklmnopqrstuvwxyz01",
"variant_id": "baseline",
"appVersion": "3.20.0",
"event_timestamp": "2023-11-23T22:49:46.669Z",
"trialStart": false,
"experiment_id": "ab_abcdef_abcdefgh_abc",
"platform": "iOS"
]
Name: "abcdefghijklmno_abcde"
Parameters: [
"userId": "abcdefghijklmnopqrstuvwxyz01",
"experiment_id": "ab_abcde_ab_abcd",
"appVersion": "3.20.0",
"event_timestamp": "2023-11-23T22:49:46.669Z",
"trialStart": false,
"variant_id": "baseline",
"platform": "iOS"
]
Name: "abcdefghijklmno_abcde"
Parameters: [
"userId": "abcdefghijklmnopqrstuvwxyz01",
"experiment_id": "ab_abcd_abcdefghijkl_abc_ab",
"platform": "iOS",
"trialStart": false,
"variant_id": "baseline",
"event_timestamp": "2023-11-23T22:49:46.669Z",
"appVersion": "3.20.0"
] |
Thanks @tristan-warner-smith! To clarify, I was just sharing information with you. You should always do (as you are already) what's best for you and your customers and I would never suggest otherwise. We'll dive into this on Monday, but since you have a repro scenario it'd be useful to check out |
Could you outline a potential scenario that represents "However, your current work around will almost certainly result in corrupt data given time." Ie: is it about ordering of events/losing events |
@johncDepop it's about 2 threads trying to write to a json file at the same time potentially. ie: thread1 is mid-way through writing an event, thread2 writes an event in the middle, you might end up with |
Hi @bsneed, just to add to the context @tristan-warner-smith posted earlier, I thought it might be useful to add the following screenshots of the app hanging when we background and foreground the app. We've updated our fork to be in sync with |
It appears I was incorrect regarding the sync->async thing causing issues. I've built some tests and it's working as it should despite the move to async. Change coming soon. I do think the deadlock there is a symptom of something else, such as a large amount of data being written, etc. |
Ok, I stand corrected. In theory it'd work on a serial queue when mixing sync/async. Once I make the change, the thread sanitizer identifies a half dozen deadlocks. I won't be able to do anything about this until I have a reproduction scenario unfortunately. Given that it's all around fileIO, you might look into the content of your events. |
I shared the exact amount of data we were passing through Segment above (subjectively it doesn't seem like a lot of event data), we've subsequently reduced it down to just 2 events tracked initially and had the same result. Note that the only related change we made to our codebase was upgrading from the previous iOS SDK to this one. Beyond that we're using a SwiftUI-based project and initialising a Segment singleton lazily with the first track call. We'll attempt to create a repro scenario, when we can prioritise the time for it, but the sample project you linked seemed to only have pre-app start synchronous event dispatch (in the app delegate), no post app-start events and no async scenarios from what I saw so it doesn't seem like it would be comprehensive enough to represent the real-world. We mostly interact with the Segment SDK via unstructured If you can try to replicate the environment described above you may see the issue. Let us know if you make any progress, we appreciate all the help you've given us so far to help us onboard the new product. It's really key for the stability of our users. |
@bsneed I'd appreciate you revisiting this bug report, we tried updating to the latest 1.55 release, given the mention of concurrency fixes to Sovran-Swift but the issue we mentioned above still manifests. That of the main thread deadlocking due here where the On 1.55 we're now exclusively seeing this issue after the app has been backgrounded for ~2 minutes and brought back to the foreground. |
I've got a PR coming in the next week or so that removes this whole storage mechanism and replaces it with something quite a bit simpler and less error prone. I'll reopen this until that lands and we can revisit. |
Closing this unless something else comes up. The connected PR went out with 1.5.6. |
We've been able to migrate back to |
That's great to hear! Looking into other one in tandem. |
Describe the bug
Analytics iOS
toAnalytics Swift
and we're experiencing regular app hangsStorage
functions that use the blockingsyncQueue.sync
.Storage.write
as a result of a track call where we're making a call toAnalytics.track(name:properties:)
from the main threadStorage.remove
fromHttpClient.startBatchUpload
>SegmentDestination.flush
>Storage.remove
Storage.write
as a result of another track call, this one from thesync.segment.com
serial queueSee attached images.
To Reproduce
Our initialisation is:
Initialisation is lazy on the first track event we fire.
Expected behavior
Screenshots
Platform (please complete the following information):
main
commitc4bb71dea0b38c179b2d87b56ee096a06ce2ea86
Additional context
Storage.write
callsyncQueue.async
rather thansyncQueue.sync
. This is obviously not ideal, we aren't domain experts in your codebase, but in our testing this change removes all the main-thread blocking issues we encountered and still seems to send Segment events as expected.The text was updated successfully, but these errors were encountered: