Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CachedThreadPool does not spin down idle threads #1075

Open
SamSaffron opened this issue Dec 19, 2024 · 8 comments
Open

CachedThreadPool does not spin down idle threads #1075

SamSaffron opened this issue Dec 19, 2024 · 8 comments

Comments

@SamSaffron
Copy link

SamSaffron commented Dec 19, 2024

In the Ruby implementation:

require 'concurrent'

puts "Thread Count: #{Thread.list.count}"

pool = Concurrent::CachedThreadPool.new( min_threads: 0, max_threads: 5, idletime: 1 )

puts "Thread Count: #{Thread.list.count}"
# we have 1 thread for now

5.times do
  pool.post do
    puts "Hello from thread #{Thread.current.object_id}"
  end
end
# we now have 5

sleep 2
puts "Thread Count: #{Thread.list.count}"
# we still have 5

pool.post do
  puts "Hello from thread #{Thread.current.object_id}"
end

sleep 1

puts "Thread Count: #{Thread.list.count}"
# we just issued a prune as a side effect so we have 2

pool.prune_pool
sleep 2
# we need to sleep cause prune_pool is async

puts "Thread Count: #{Thread.list.count}"
# we now have 1 thread

CachedThreadPool only spins down threads when you queue work, so if you are done queuing work you need to remember to wait for it to be done and then issue a #prune_pool call

This is technically very tricky cause you need to carry an extra thread for that. The Ruby design should change so threads just wait idle_time for new work and spin down if no new work arrives.

SamSaffron added a commit to discourse/discourse that referenced this issue Dec 19, 2024
Following: ruby-concurrency/concurrent-ruby#1075

Current implementation in Ruby Concurrent is not spinning down
threads

New implementation is simpler and as a bonus respects Discourse database conventions
so it is multisite safe


(discourse fonts is weird - not sure what is up with that)
@bensheldon
Copy link
Contributor

What's the cost/downside you're seeing with a sleeping thread on Queue#pop?

To prune precisely I think we have a couple of options:

  • another thread to sleep and prune. Downside as you note is another thread.
  • replace the Queue object with a queue-with-wait implementation (I don't think Concurrent Ruby has one, though TimerSet is close it) so the thread could prune itself.

@SamSaffron
Copy link
Author

@bensheldon mainly complexity, it is complexity we do not want to carry.

I wrote a queue with a conditional var here:

discourse/discourse#30364

I am more than happy for this to be lifted into concurrent ruby, it has a few tiny discoursisms (multisite support and logging patterns) but is safe.

For us this is perfect, I have not benched it against the current implementation but for our usage we always tend to make DB calls so we are not optimising for microsecond execution times shortest would be usually 10s of milliseconds.

@eregon
Copy link
Collaborator

eregon commented Dec 19, 2024

I wrote a queue with a conditional var here:

FWIW I also recall writing one in https://github.com/ruby/timeout/blob/607d8c6fbe4d86db1cf22846e89198b47cec7161/lib/timeout.rb#L105-L109
And since Ruby 3.2 there is Queue#pop(timeout:).

@SamSaffron
Copy link
Author

@eregon thanks for the tip, I ended up following up with a hybrid approach

https://github.com/discourse/discourse/pull/30392/files

There is so much detail in such a small amount of code. being able to efficiently scale up the queue is very tough, coordinating work between all the critical sections is tough.

I am not even sure if this is my last PR refining this ...

Would love it if you can have a look and let me know if this is something we should port to concurrent, having a building block like this is super important, I want it to work for everyone

SamSaffron added a commit to discourse/discourse-ai that referenced this issue Dec 20, 2024
Previous version was prone to the bug:

ruby-concurrency/concurrent-ruby#1075

This is particularly bad cause we could have a DB connection
attached to the thread and we never clear it up, so after N hours
this could start exhibiting weird connection issues.
SamSaffron added a commit to discourse/discourse-ai that referenced this issue Dec 20, 2024
Previous version was prone to the bug:

ruby-concurrency/concurrent-ruby#1075

This is particularly bad cause we could have a DB connection
attached to the thread and we never clear it up, so after N hours
this could start exhibiting weird connection issues.
@eregon
Copy link
Collaborator

eregon commented Dec 26, 2024

I think having the improvement to automatically spin down idle threads in CachedThreadPool would make a lot of sense.
The only thing is I don't have much time to review, but other maintainers might have more.

It seems the Java version does that:
https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/Executors.html#newCachedThreadPool--

Threads that have not been used for sixty seconds are terminated and removed from the cache.

So it would be good for consistency too.

@joshuay03
Copy link

joshuay03 commented Jan 17, 2025

Hey folks, I was pointed this way after opening a PR to fix a similar issue with ThreadPoolExecutor. I ended up adding another thread, however I'm open to other ideas. This seems worth exploring 🤔:

replace the Queue object with a queue-with-wait implementation (I don't think Concurrent Ruby has one, though TimerSet is close it) so the thread could prune itself.

Context:

@bensheldon
Copy link
Contributor

TIL about Queue#pop(timeout:) after Ruby 3.2. That is soo nice!

...but also presents the challenge that it becomes simple (imo) to implement spinning down idle threads in Ruby 3.2+... and somewhat complex to do so in older versions of Ruby (assuming that a timeoutable-queue-pop is the strategy).

I sorta see two paths, and I'd love feedback about which to pick:

  • Have idle thread spin-down be a Ruby 3.2+ only feature
  • Have a Concurrent::QueueWithPopTimeout that implements the feature on older Ruby and inherits from Queue for Ruby 3.2+

@SamSaffron
Copy link
Author

SamSaffron commented Jan 19, 2025

There is a 100% working implementation using ConditionVariable.new right here:

https://github.com/discourse/discourse/blob/main/lib/scheduler/thread_pool.rb

No background thread needed, Queue timeout is awesome, but this achieves the exact same goal in a backwards compatible way.

The challenge here imo is not the implementation, it is more the architecture of Concurrent rb. Given its Java roots there are lots of abstractions at play and conditional Java code baked in. It makes it very complicated to port in a solution like this despite it being pretty much ready.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants