Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race-condition in socket close #40

Open
hlef opened this issue Oct 9, 2024 · 0 comments
Open

Race-condition in socket close #40

hlef opened this issue Oct 9, 2024 · 0 comments

Comments

@hlef
Copy link
Collaborator

hlef commented Oct 9, 2024

Unfortunately, our socket close is racy:

  • We shut down the socket here, but the TCP FIN message is not sent immediately because that is done by the TCP/IP thread which is not running.

  • Then we remove the firewall endpoint here. This is done immediately.

  • By the time the TCP/IP thread gets a chance to send the FIN message, the firewall blocks the outgoing packet.

We may not have detected this earlier because of the type of workloads we were running, but this is super obvious in the server API case I am about to PR.

Notes on how to fix this

Unfortunately, even if we add a quick trylock on the ipThreadLockState (which is there precisely to wake up the IP thread for that reason), it only allows the IP thread to send out the FIN but not to ACK the FIN ACK sent by the other party. Then we get a whole lot of retransmissions.

We cannot rely on a FreeRTOS+TCP callback (such as FREERTOS_SO_TCP_CONN_HANDLER) to tackle this. This is because FreeRTOS+TCP does not do a three-way handshake close when the socket is closed through FreeRTOS_socketclose(). Instead, FreeRTOS+TCP sends a FIN to the peer and destroys the socket immediately. Any further packet from the peer is answered with a RST (as a spurious packet). Because the socket is destroyed in a state when the TCP connection has not been properly closed (it is just in the "FIN sent" stage, i.e., FIN Wait-1), the callback is not called.

The only viable way to fix this seems to be adding a "shutting down list" to the firewall that either lets packets through or replies with a RST, until a timer expires.

@hlef hlef mentioned this issue Oct 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant