Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop goroutines when closing a connection #39

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

Stop goroutines when closing a connection #39

wants to merge 5 commits into from

Conversation

stolyaroleh
Copy link

Hello, @BurntSushi. I would like to use xgb in one of my projects. It's a service that runs in background and talks to the X server on behalf of all currently logged in users. Since it maintains a pool of X11 connections that can die at any moment (for example, when an interactive session stops on logout), I wanted to make sure that xgb will not panic/leave garbage behind when this happens.

This pull request attempts to fix #32, by introducing a sync.WaitGroup and blocking in Close() until all 4 goroutines, spawned by NewConnNet die. I tested it by creating and closing 100 000 connections - decreased RAM usage from at least 700 Mb to ~25 Mb.

Additionally, I use xgb.Conn.done (previously xgb.Conn.closing, a chan struct{}) to broadcast that all goroutines need to stop (receive on closed channel does not block).

Finally, I broadcastDone on unrecoverable read errors in readResponses instead of xgb.Conn.Close(). Since readResponses contains an infinite loop, and previously would just continue after Close() (line 406, xgb.go), it would attempt to Close() again in the next iteration, causing a panic (closing an already closed channel).

I don't have much experience with xgb or X11 in general and would greatly appreciate if you had a look at this, in case I missed something.


func (c *Conn) broadcastDone() {
select {
case <-c.done:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What sends on this channel?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When readResponses hits an unrecoverable read error on line 435 and 463, it closes this channel by calling broadcastDone, thus signaling to all goroutines that they should shutdown. Since receive on a closed channel does not block, having:

select {
case <-c.done:
   cleanup()
   return
case ...:
   ...
}

in a goroutine will shut it down after broadcastDone.

I am doing a nonblocking select here to prevent c.done from being closed twice, since that will trigger a panic.

xgb.go Outdated
@@ -314,6 +333,15 @@ type request struct {
// In all likelihood, you should be able to copy and paste with some minor
// edits the generated code for the request you want to issue.
func (c *Conn) NewRequest(buf []byte, cookie *Cookie) {
select {
case <-c.done:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain this select statement in more detail?

Copy link
Author

@stolyaroleh stolyaroleh Aug 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NewRequest waits until sendRequests processes current request. However, if an unrecoverable error happens, all goroutines, including sendRequests are stopped, and NewRequest will block forever. This select statement checks whether this connection was closed.

...however, now that you mention it, I can see a race condition here. The select statement should be:

select {
case c.reqChan<-&request:
case <-c.done:
    return
case <-seq:
    return
}

@@ -366,9 +398,8 @@ func (c *Conn) writeBuffer(buf []byte) error {
if _, err := c.conn.Write(buf); err != nil {
Logger.Printf("A write error is unrecoverable: %s", err)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if broadcastDone should be called here, too. writeBuffer returns an error which is ignored. Sending another request will likely fail.

This was referenced Sep 20, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Memory leak?
2 participants