ServerErrorHandler: Poco::Exception. Code: 1000, e.code() = 107, e.displayText() = Net Exception: Socket is not connected, Stack trace (when copying this message, always include the lines below): clickhouse-1 | #5707

Open
mahesh1b opened this issue Mar 25, 2024 · 19 comments

@mahesh1b

Self-Hosted Version

24.4.0.dev

CPU Architecture

x86_64

Docker Version

26.0.0

Docker Compose Version

2.25.0

Steps to Reproduce

  • Clone the project from GitHub: https://github.com/getsentry
  • Run ./install.sh in the cloned folder
  • Once the containers are up, check the clickhouse container logs

Expected Result

The clickhouse container should work without throwing any errors in the logs and CPU consumption should be normal.

Actual Result

clickhouse-1  | 2024.03.25 15:38:16.970267 [ 46 ] {} <Error> ServerErrorHandler: Poco::Exception. Code: 1000, e.code() = 107, e.displayText() = Net Exception: Socket is not connected, Stack trace (when copying this message, always include the lines below):
clickhouse-1  |
clickhouse-1  | 0. Poco::Net::SocketImpl::error(int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) @ 0x13c4ee8e in /usr/bin/clickhouse
clickhouse-1  | 1. Poco::Net::SocketImpl::peerAddress() @ 0x13c510d6 in /usr/bin/clickhouse
clickhouse-1  | 2. DB::ReadBufferFromPocoSocket::ReadBufferFromPocoSocket(Poco::Net::Socket&, unsigned long) @ 0x101540cd in /usr/bin/clickhouse
clickhouse-1  | 3. DB::HTTPServerRequest::HTTPServerRequest(std::__1::shared_ptr<DB::Context const>, DB::HTTPServerResponse&, Poco::Net::HTTPServerSession&) @ 0x110e6fd5 in /usr/bin/clickhouse
clickhouse-1  | 4. DB::HTTPServerConnection::run() @ 0x110e5d6e in /usr/bin/clickhouse
clickhouse-1  | 5. Poco::Net::TCPServerConnection::start() @ 0x13c5614f in /usr/bin/clickhouse
clickhouse-1  | 6. Poco::Net::TCPServerDispatcher::run() @ 0x13c57bda in /usr/bin/clickhouse
clickhouse-1  | 7. Poco::PooledThread::run() @ 0x13d89e59 in /usr/bin/clickhouse
clickhouse-1  | 8. Poco::ThreadImpl::runnableEntry(void*) @ 0x13d860ea in /usr/bin/clickhouse
clickhouse-1  | 9. start_thread @ 0x9609 in /usr/lib/x86_64-linux-gnu/libpthread-2.31.so
clickhouse-1  | 10. clone @ 0x122293 in /usr/lib/x86_64-linux-gnu/libc-2.31.so
clickhouse-1  |  (version 21.8.13.1.altinitystable (altinity build))
clickhouse-1  | 2024.03.25 15:38:17.081968 [ 513 ] {} <Error> ServerErrorHandler: Poco::Exception. Code: 1000, e.code() = 107, e.displayText() = Net Exception: Socket is not connected, Stack trace (when copying this message, always include the lines below):
clickhouse-1  |
clickhouse-1  | 0. Poco::Net::SocketImpl::error(int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) @ 0x13c4ee8e in /usr/bin/clickhouse
clickhouse-1  | 1. Poco::Net::SocketImpl::peerAddress() @ 0x13c510d6 in /usr/bin/clickhouse
clickhouse-1  | 2. DB::HTTPServerRequest::HTTPServerRequest(std::__1::shared_ptr<DB::Context const>, DB::HTTPServerResponse&, Poco::Net::HTTPServerSession&) @ 0x110e6f0b in /usr/bin/clickhouse
clickhouse-1  | 3. DB::HTTPServerConnection::run() @ 0x110e5d6e in /usr/bin/clickhouse
clickhouse-1  | 4. Poco::Net::TCPServerConnection::start() @ 0x13c5614f in /usr/bin/clickhouse
clickhouse-1  | 5. Poco::Net::TCPServerDispatcher::run() @ 0x13c57bda in /usr/bin/clickhouse
clickhouse-1  | 6. Poco::PooledThread::run() @ 0x13d89e59 in /usr/bin/clickhouse
clickhouse-1  | 7. Poco::ThreadImpl::runnableEntry(void*) @ 0x13d860ea in /usr/bin/clickhouse
clickhouse-1  | 8. start_thread @ 0x9609 in /usr/lib/x86_64-linux-gnu/libpthread-2.31.so
clickhouse-1  | 9. clone @ 0x122293 in /usr/lib/x86_64-linux-gnu/libc-2.31.so
clickhouse-1  |  (version 21.8.13.1.altinitystable (altinity build))
clickhouse-1  | 2024.03.25 15:38:17.749096 [ 513 ] {} <Error> ServerErrorHandler: Poco::Exception. Code: 1000, e.code() = 107, e.displayText() = Net Exception: Socket is not connected, Stack trace (when copying this message, always include the lines below):
clickhouse-1  |
clickhouse-1  | 0. Poco::Net::SocketImpl::error(int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) @ 0x13c4ee8e in /usr/bin/clickhouse
clickhouse-1  | 1. Poco::Net::SocketImpl::peerAddress() @ 0x13c510d6 in /usr/bin/clickhouse
clickhouse-1  | 2. DB::HTTPServerRequest::HTTPServerRequest(std::__1::shared_ptr<DB::Context const>, DB::HTTPServerResponse&, Poco::Net::HTTPServerSession&) @ 0x110e6f0b in /usr/bin/clickhouse
clickhouse-1  | 3. DB::HTTPServerConnection::run() @ 0x110e5d6e in /usr/bin/clickhouse
clickhouse-1  | 4. Poco::Net::TCPServerConnection::start() @ 0x13c5614f in /usr/bin/clickhouse
clickhouse-1  | 5. Poco::Net::TCPServerDispatcher::run() @ 0x13c57bda in /usr/bin/clickhouse
clickhouse-1  | 6. Poco::PooledThread::run() @ 0x13d89e59 in /usr/bin/clickhouse
clickhouse-1  | 7. Poco::ThreadImpl::runnableEntry(void*) @ 0x13d860ea in /usr/bin/clickhouse
clickhouse-1  | 8. start_thread @ 0x9609 in /usr/lib/x86_64-linux-gnu/libpthread-2.31.so
clickhouse-1  | 9. clone @ 0x122293 in /usr/lib/x86_64-linux-gnu/libc-2.31.so
clickhouse-1  |  (version 21.8.13.1.altinitystable (altinity build))

Event ID

No response
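
To gauge how often the error fires, rather than eyeballing the scroll, something like the following could be run over saved log output. This is only a sketch: it assumes the log lines look like the excerpt above (a container prefix added by docker compose, then the ClickHouse timestamp, thread id, query id, level and logger name), and `count_socket_errors` is a hypothetical helper name, not part of Sentry or ClickHouse.

```python
import re
from collections import Counter

# Header of a ClickHouse log entry as shown in the excerpt above, e.g.
# "clickhouse-1  | 2024.03.25 15:38:16.970267 [ 46 ] {} <Error> ServerErrorHandler: ..."
LOG_HEADER = re.compile(
    r"^\S+\s*\|\s*"                                            # container prefix
    r"(?P<ts>\d{4}\.\d{2}\.\d{2} \d{2}:\d{2}:\d{2})\.\d+\s+"   # timestamp, to the second
    r"\[\s*(?P<thread>\d+)\s*\]\s+\{[^}]*\}\s+"                # thread id and query id
    r"<(?P<level>\w+)>\s+(?P<logger>[^:]+):\s+(?P<message>.*)$"
)


def count_socket_errors(lines):
    """Count 'Socket is not connected' error entries per second.

    Stack-trace lines and empty lines do not match the header pattern
    and are skipped.
    """
    per_second = Counter()
    for line in lines:
        m = LOG_HEADER.match(line)
        if m and m["level"] == "Error" and "Socket is not connected" in m["message"]:
            per_second[m["ts"]] += 1
    return per_second


# Shortened sample lines in the same shape as the excerpt above.
SAMPLE = [
    "clickhouse-1  | 2024.03.25 15:38:16.970267 [ 46 ] {} <Error> ServerErrorHandler: "
    "Poco::Exception. Code: 1000, e.code() = 107, e.displayText() = Net Exception: Socket is not connected",
    "clickhouse-1  |",
    "clickhouse-1  | 1. Poco::Net::SocketImpl::peerAddress() @ 0x13c510d6 in /usr/bin/clickhouse",
    "clickhouse-1  | 2024.03.25 15:38:17.081968 [ 513 ] {} <Error> ServerErrorHandler: "
    "Poco::Exception. Code: 1000, e.code() = 107, e.displayText() = Net Exception: Socket is not connected",
]

for second, count in sorted(count_socket_errors(SAMPLE).items()):
    print(second, count)
```

A steady count per second would suggest a client that connects and disconnects on a timer, which may help narrow down which service is responsible.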

@csvan

csvan commented Mar 26, 2024

Seeing the same thing on 24.3.0. It is unclear when and how it started, but it was not there when we set up the instance initially, nor after upgrading to 24.3.0.

It is also unclear if it has any actual impact on functionality.

@mahesh1b
Author

@csvan I suspect that clickhouse is causing spikes in CPU usage; the CPU usage of the server has not been stable.
[Attached image]

@csvan

csvan commented Mar 26, 2024

Looking at our internal graphs, I have not noticed any significant deviations in CPU usage.

@mahesh1b
Author

Do you think having too many projects can cause CPU spikes? I have a total of 67 projects on this Sentry instance, and 23 of them are actively used for monitoring.

@jap

jap commented Mar 26, 2024

I also came across this, but also saw this in the logs early on while booting:

clickhouse-1                                    | 2024.03.26 13:59:42.424894 [ 44 ] {} <Warning> Application: Listen [::]:8123 failed: Poco::Exception. Code: 1000, e.code() = 0, e.displayText() = DNS error: EAI: Address family for hostname not supported (version 21.8.13.1.altinitystable (altinity build)). If it is an IPv6 or IPv4 address and your host has disabled IPv6 or IPv4, then consider to specify not disabled IPv4 or IPv6 address to listen in <listen_host> element of configuration file. Example for disabled IPv6: <listen_host>0.0.0.0</listen_host> . Example for disabled IPv4: <listen_host>::</listen_host>

which makes sense as there is no IPv6 in our docker.

I've added a <listen_host>0.0.0.0</listen_host> to clickhouse/config.xml and rebuilt things.

I've now gotten another error from clickhouse (which has scrolled out of my terminal's history unfortunately) about not being able to bind to several ports, but some prodding with nsenter and ss tells me that it was able to bind, and tcpdumping confirms that requests are being made and processed.

Note that I'm not seeing any CPU spikes either.

(All of this on 24.3.0.)
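
For anyone wanting to confirm whether IPv6 is actually usable before changing `<listen_host>`, a quick check along these lines may help. It is a sketch using only the Python standard library; `ipv6_usable` and `suggested_listen_host` are hypothetical helper names, and the result is only meaningful if it runs in the same network namespace as the clickhouse container (which assumes a Python interpreter is available there).

```python
import socket


def ipv6_usable() -> bool:
    """Return True if a TCP socket can be bound to the IPv6 wildcard address '::'."""
    if not socket.has_ipv6:
        # Python itself was built without IPv6 support.
        return False
    try:
        with socket.socket(socket.AF_INET6, socket.SOCK_STREAM) as s:
            # Port 0 lets the OS pick any free port; only the address family matters here.
            s.bind(("::", 0))
        return True
    except OSError:
        # Raised, for example, when IPv6 is disabled in the kernel or the namespace.
        return False


def suggested_listen_host() -> str:
    """Pick the value the ClickHouse warning itself suggests for <listen_host>."""
    return "::" if ipv6_usable() else "0.0.0.0"


print("IPv6 usable:", ipv6_usable())
print("listen_host:", suggested_listen_host())
```

If this reports that IPv6 is not usable, that would be consistent with the `Listen [::]:8123 failed` warning above and with the `0.0.0.0` workaround.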

@azaslavsky

A duplicate of this error is at getsentry/self-hosted#2876. Have you tried updating to a nightly build past the PR listed there?

@mahesh1b
Author

@azaslavsky For now I have just rolled back to 24.1.0 and clickhouse stopped throwing the error, but I am still seeing a lot of CPU spikes on the server.

@aldy505

aldy505 commented Mar 27, 2024

> Do you think having too many projects can cause CPU spikes? I have a total of 67 projects on this Sentry instance, and 23 of them are actively used for monitoring.

@mahesh1b to answer this: No, having too many projects doesn't cause CPU spikes. I have 100+ projects on an 8-core CPU, and the average CPU usage is around 19% - 24%.

> I also came across this, but also saw this in the logs early on while booting:
>
> clickhouse-1                                    | 2024.03.26 13:59:42.424894 [ 44 ] {} <Warning> Application: Listen [::]:8123 failed: Poco::Exception. Code: 1000, e.code() = 0, e.displayText() = DNS error: EAI: Address family for hostname not supported (version 21.8.13.1.altinitystable (altinity build)). If it is an IPv6 or IPv4 address and your host has disabled IPv6 or IPv4, then consider to specify not disabled IPv4 or IPv6 address to listen in <listen_host> element of configuration file. Example for disabled IPv6: <listen_host>0.0.0.0</listen_host> . Example for disabled IPv4: <listen_host>::</listen_host>
>
> which makes sense as there is no IPv6 in our docker.
>
> I've added a <listen_host>0.0.0.0</listen_host> to clickhouse/config.xml and rebuilt things.

@jap There is IPv6 support in Docker, though it's not enabled by default yet: https://docs.docker.com/config/daemon/ipv6/

@aldy505

aldy505 commented Mar 27, 2024

Wild guess, but try changing every rust-consumer entry in the docker-compose.yml file to just consumer, and see if the problem is solved.
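
The substitution suggested above could be scripted roughly as follows. This is a sketch, not an official migration step: the sample service names and flags are illustrative only, `replace_rust_consumer` is a hypothetical helper, and the real docker-compose.yml may need further changes that this does not cover.

```python
import re

# Match `rust-consumer` only as a standalone token, so that a name which merely
# contains it (for example a hypothetical service called `my-rust-consumer-x`)
# is left untouched.
RUST_CONSUMER = re.compile(r"(?<![\w-])rust-consumer(?![\w-])")


def replace_rust_consumer(compose_text: str):
    """Return (new_text, number_of_replacements) with rust-consumer -> consumer."""
    return RUST_CONSUMER.subn("consumer", compose_text)


# Illustrative fragment; the service names and flags are assumptions.
SAMPLE = """\
services:
  snuba-errors-consumer:
    command: rust-consumer --storage errors
  snuba-replacer:
    command: replacer --storage errors
"""

new_text, replaced = replace_rust_consumer(SAMPLE)
print(f"replaced {replaced} occurrence(s)")
print(new_text)
```

To apply it to a real file, read docker-compose.yml into a string, pass it through the function, and write the result back after keeping a backup copy. Reviewing the diff before restarting the containers is advisable.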

@mahesh1b
Author

@aldy505 For now I have rolled back to version 24.1.0, as it was production. Should I try this on version 23.4.0?

@aldy505

aldy505 commented Mar 27, 2024

> @aldy505 For now I have rolled back to version 24.1.0, as it was production. Should I try this on version 23.4.0?

It's up to you. Using consumer works, though, as it's not a deprecated command. But if you don't face any issues when using consumer instead of rust-consumer, we might need to reconsider some things about the usage of the Rust consumers.

@mahesh1b
Author

mahesh1b commented Mar 27, 2024

@aldy505 I have set up a new Sentry server with version 23.4.0. I will try it and let you know.
I am a bit confused about whether changing the consumer will resolve the clickhouse error or the CPU usage.

Thanks.

@mmerickel

Just upgraded from 23.9.1 to 24.3.0 and am seeing this connection error. Also events are not being processed by the instance - it seems very broken. I followed the instructions in getsentry/self-hosted#2876 (comment) to stop using the rust-consumer and add the billing worker and it seems to have fixed the issues for now.

@mahesh1b
Author

Replacing rust-consumer with consumer in the docker-compose.yml file resolved the errors; I no longer see the error in the clickhouse container.
But I am still not sure why the CPU usage is so unstable. I am using a t3a.2xlarge instance.
[Attached image]

@csvan

csvan commented Mar 27, 2024

I wouldn't say your CPU usage curve looks unusual, tbh. There is a lot going on in Sentry and a straight line is not to be expected.

[Attached screenshot: 2024-03-27 at 13 08 48]

@mahesh1b
Author

I understand, @csvan, but right now I am using a t3a.2xlarge instance, which I feel is more than enough to run things smoothly. We had an old Sentry server on version 23.2.0 with 4 vCPUs and 16 GB of memory, and we didn't have any CPU issues with it, so I am a bit doubtful. @csvan, do you think any feature in Sentry might be causing it?
Thank you everyone for all the help; I really appreciate all the comments.

@saibotk

saibotk commented Mar 28, 2024

We also faced the main error in this issue and resolved it by going back to the non-Rust consumers and switching to nightly.
Without the switch to nightly, ClickHouse still complained.

@lraphael

I was able to solve the problem by switching to the non-rust consumer. I am using version 24.3.0.

@azaslavsky

Transferring this bug to the snuba team, since it seems like a rust consumer issue.
