Broken Pipe Error while writing data using clickhouse-spark-connector #365

Open
ukm21 opened this issue Nov 5, 2024 · 0 comments
ukm21 commented Nov 5, 2024

clickhouse 22.10.2.11
spark 3.3.2
spark-clickhouse-connector 0.8.0
clickhouse-jdbc 0.6.3

The job fails with the exception below when the batch size is 1000.

The job succeeds when the batch size is around 300.
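For reference, a minimal sketch of how the write is wired up and where the batch size could be lowered. This assumes the `spark.clickhouse.write.batchSize` option and the `com.clickhouse.spark.ClickHouseCatalog` class of spark-clickhouse-connector 0.8.0; the host, port, table, and DataFrame are placeholders matching the logs, so verify the exact keys against the connector's configuration docs before relying on them.

```scala
// Sketch only: reproduces the failing write path with a smaller batch size.
// Option names and catalog class are assumptions for connector 0.8.0.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("clickhouse-write")
  // Catalog wiring as in the failing job (host/port are placeholders).
  .config("spark.sql.catalog.clickhouse", "com.clickhouse.spark.ClickHouseCatalog")
  .config("spark.sql.catalog.clickhouse.host", "node1")
  .config("spark.sql.catalog.clickhouse.http_port", "8123")
  // The job survives at ~300 rows per batch; start there and tune upward.
  .config("spark.clickhouse.write.batchSize", "300")
  .getOrCreate()

val df = spark.range(1000).toDF("id") // stand-in for the real DataFrame
df.writeTo("clickhouse.default.my_table").append()
```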

Caused by: com.clickhouse.spark.exception.CHServerException: [HTTP]default@node1:8123}/default [210] Broken pipe (Write failed)
at com.clickhouse.spark.client.NodeClient.syncInsert(NodeClient.scala:146)
at com.clickhouse.spark.client.NodeClient.syncInsertOutputJSONEachRow(NodeClient.scala:111)
at com.clickhouse.spark.write.ClickHouseWriter.$anonfun$doFlush$1(ClickHouseWriter.scala:228)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at scala.util.Try$.apply(Try.scala:213)
at com.clickhouse.spark.Utils$.retry(Utils.scala:99)
at com.clickhouse.spark.write.ClickHouseWriter.doFlush(ClickHouseWriter.scala:226)
at com.clickhouse.spark.write.ClickHouseWriter.flush(ClickHouseWriter.scala:216)
at com.clickhouse.spark.write.ClickHouseWriter.write(ClickHouseWriter.scala:188)
at com.clickhouse.spark.write.ClickHouseWriter.write(ClickHouseWriter.scala:37)
at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.$anonfun$run$1(WriteToDataSourceV2Exec.scala:445)
at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1539)
at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:483)
at org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.$anonfun$writeWithV2$2(WriteToDataSourceV2Exec.scala:384)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:136)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:551)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1505)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:554)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: com.clickhouse.client.ClickHouseException: Broken pipe (Write failed)
at com.clickhouse.client.ClickHouseException.of(ClickHouseException.java:149)
at com.clickhouse.client.AbstractClient.lambda$execute$0(AbstractClient.java:275)
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
... 3 more
Caused by: java.net.ConnectException: Broken pipe (Write failed)
at com.clickhouse.client.http.ApacheHttpConnectionImpl.post(ApacheHttpConnectionImpl.java:276)
at com.clickhouse.client.http.ClickHouseHttpClient.send(ClickHouseHttpClient.java:195)
at com.clickhouse.client.AbstractClient.sendAsync(AbstractClient.java:161)
at com.clickhouse.client.AbstractClient.lambda$execute$0(AbstractClient.java:273)
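The flush in the trace above already goes through `com.clickhouse.spark.Utils$.retry`, so one hedged workaround is to widen the connector's retry policy so that error 210 (the network error in the exception) is retried. The option names below (`spark.clickhouse.write.maxRetry`, `spark.clickhouse.write.retryableErrorCodes`) are assumptions taken from the connector's configuration reference and should be verified for 0.8.0.

```scala
// Hypothetical tuning: retry transient network failures during flush.
// Error code 210 appears in the exception above; whether it is retryable
// by default depends on the connector's defaults, so this adds it explicitly.
spark.conf.set("spark.clickhouse.write.maxRetry", "5")
spark.conf.set("spark.clickhouse.write.retryableErrorCodes", "210,241")
```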

ClickHouse server logs

2024.11.04 07:32:07.360221 [ 1401433 ] {5c51d481-968f-4af3-b604-a4953032d213} MemoryTracker: Peak memory usage (for query): 344.60 MiB.
2024.11.04 07:32:07.360249 [ 1401433 ] {} HTTP-Session: 216485aa-f0c1-4587-a8ba-948eee65689d Logout, user_id: 94309d50-4f52-5250-31bd-74fecac179db
2024.11.04 07:32:09.680679 [ 1401433 ] {} HTTP-Session: 5dbc57f3-e720-45c2-8916-6a2ef6a2c9fd Authenticating user 'default' from x.x.x.x:57180
2024.11.04 07:32:09.680783 [ 1401433 ] {} HTTP-Session: 5dbc57f3-e720-45c2-8916-6a2ef6a2c9fd Authenticated with global context as user 94309d50-4f52-5250-31bd-74fecac179db
2024.11.04 07:32:09.680800 [ 1401433 ] {} HTTP-Session: 5dbc57f3-e720-45c2-8916-6a2ef6a2c9fd Creating session context with user_id: 94309d50-4f52-5250-31bd-74fecac179db
2024.11.04 07:32:09.682980 [ 1401433 ] {72900589-cbde-4ee7-b391-9bcf13838a6e} executeQuery: (from x.x.x.x:57180) INSERT INTO default.my_table FORMAT ArrowStream (stage: Complete)
