[SPARK-45593][BUILD] Building a runnable distribution from master code running spark-sql raise error

### What changes were proposed in this pull request?

Fix a build issue: in a runnable distribution built from master code, running spark-sql raises the following error:
```
Caused by: java.lang.ClassNotFoundException: org.sparkproject.guava.util.concurrent.internal.InternalFutureFailureAccess
	at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
	at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:520)
	... 58 more
```
The problem is caused by a Guava dependency in the spark-connect-common POM that **conflicts** with the shade plugin configuration of the parent POM.

- spark-connect-common bundles Guava at version `connect.guava.version`, and it is relocated to `${spark.shade.packageName}.guava`, not to `${spark.shade.packageName}.connect.guava`;
- spark-network-common also bundles Guava classes, likewise relocated to `${spark.shade.packageName}.guava`, but at version `${guava.version}`;
- As a result, different versions of the `org.sparkproject.guava.xx` classes are present on the classpath.

In addition, after investigation, it seems that the spark-connect-common module itself is not related to Guava, so the Guava dependency can be removed from spark-connect-common.
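The collision described above can be illustrated with a small Python sketch. It is not part of this patch and does not invoke the shade plugin: the `relocate` helper is hypothetical and only approximates prefix-based package relocation, and `org.sparkproject` is assumed to be the value of `${spark.shade.packageName}`, based on the package seen in the stack trace.

```python
def relocate(class_name, pattern, shaded_pattern):
    """Approximate a shade-style relocation (hypothetical helper).

    Rewrites class_name when its package starts with `pattern`;
    other names are returned unchanged.
    """
    if class_name == pattern or class_name.startswith(pattern + "."):
        return shaded_pattern + class_name[len(pattern):]
    return class_name


SHADE_PKG = "org.sparkproject"  # assumed value of ${spark.shade.packageName}
GUAVA_CLASS = "com.google.common.cache.LocalCache"

# Before the fix: both modules relocate Guava into the same package,
# even though they bundle different Guava versions.
network_common = relocate(GUAVA_CLASS, "com.google.common", SHADE_PKG + ".guava")
connect_common_before = relocate(GUAVA_CLASS, "com.google.common", SHADE_PKG + ".guava")

# With a dedicated relocation target, the class names no longer overlap.
connect_common_after = relocate(
    GUAVA_CLASS, "com.google.common", SHADE_PKG + ".connect.guava"
)

print(network_common == connect_common_before)  # True: same name, two versions
print(network_common == connect_common_after)   # False: distinct packages
```

When two jars provide the same fully qualified class name, which copy gets loaded depends on classpath order, which is consistent with a class from one Guava version failing to find a companion class from the other.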

### Why are the changes needed?

A distribution built from master code is not runnable.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

I ran the build command manually to produce a runnable distribution package and tested it.

Build command:
```
./dev/make-distribution.sh --name ui --pip --tgz  -Phive -Phive-thriftserver -Pyarn -Pconnect
```

Test result:
<img width="1276" alt="image" src="https://github.com/apache/spark/assets/51110188/aefbc433-ea5c-4287-8ebd-367806043ac8">

I also checked which jars in the jars directory contain `org.sparkproject.guava.cache.LocalCache`.
Before:
```
➜  jars grep -lr 'org.sparkproject.guava.cache.LocalCache' ./
.//spark-connect_2.13-4.0.0-SNAPSHOT.jar
.//spark-network-common_2.13-4.0.0-SNAPSHOT.jar
.//spark-connect-common_2.13-4.0.0-SNAPSHOT.jar
```

Now:
```
➜  jars grep -lr 'org.sparkproject.guava.cache.LocalCache' ./
.//spark-network-common_2.13-4.0.0-SNAPSHOT.jar
```
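The same check can be scripted. The following is a minimal sketch, not part of this patch: the `jars_containing` function is hypothetical, and unlike `grep -lr`, which matches the string anywhere in the file bytes, it only looks for an exact `.class` entry inside each jar.

```python
import zipfile
from pathlib import Path


def jars_containing(jars_dir, class_name):
    """Return the sorted names of jars in jars_dir that contain class_name.

    class_name is a fully qualified name such as
    "org.sparkproject.guava.cache.LocalCache".
    """
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(jars_dir).glob("*.jar")):
        with zipfile.ZipFile(jar) as zf:
            if entry in zf.namelist():
                hits.append(jar.name)
    return hits
```

For example, calling `jars_containing("jars", "org.sparkproject.guava.cache.LocalCache")` from the distribution root would be expected to list a single jar once the relocation targets no longer overlap.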

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#43436 from Yikf/SPARK-45593.

Authored-by: yikaifei <[email protected]>
Signed-off-by: yangjie01 <[email protected]>
yikf authored and okumin committed Mar 31, 2024
1 parent fd86f85 commit b883a4c
Showing 4 changed files with 41 additions and 32 deletions.
6 changes: 6 additions & 0 deletions assembly/pom.xml
@@ -159,6 +159,12 @@
<groupId>org.apache.spark</groupId>
<artifactId>spark-connect_${scala.binary.version}</artifactId>
<version>${project.version}</version>
<exclusions>
<exclusion>
<groupId>org.apache.spark</groupId>
<artifactId>spark-connect-common_${scala.binary.version}</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
8 changes: 1 addition & 7 deletions connector/connect/client/jvm/pom.xml
@@ -51,15 +51,9 @@
<version>${project.version}</version>
</dependency>
<!--
We need to define guava and protobuf here because we need to change the scope of both from
We need to define protobuf here because we need to change the scope of both from
provided to compile. If we don't do this we can't shade these libraries.
-->
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>${connect.guava.version}</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>com.google.protobuf</groupId>
<artifactId>protobuf-java</artifactId>
34 changes: 34 additions & 0 deletions connector/connect/common/pom.xml
@@ -47,6 +47,11 @@
<groupId>com.google.protobuf</groupId>
<artifactId>protobuf-java</artifactId>
</dependency>
<!--
SPARK-45593: spark connect relies on a specific version of Guava, We perform shading
of the Guava library within the connect-common module to ensure both connect-server and
connect-client modules maintain consistent and accurate Guava dependencies.
-->
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
@@ -152,6 +157,35 @@
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<configuration>
<shadedArtifactAttached>false</shadedArtifactAttached>
<artifactSet>
<includes>
<include>org.spark-project.spark:unused</include>
<include>com.google.guava:guava</include>
<include>com.google.guava:failureaccess</include>
<include>org.apache.tomcat:annotations-api</include>
</includes>
</artifactSet>
<relocations>
<relocation>
<pattern>com.google.common</pattern>
<shadedPattern>${spark.shade.packageName}.connect.guava</shadedPattern>
</relocation>
</relocations>
</configuration>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
<profiles>
25 changes: 0 additions & 25 deletions connector/connect/server/pom.xml
@@ -51,12 +51,6 @@
<groupId>org.apache.spark</groupId>
<artifactId>spark-connect-common_${scala.binary.version}</artifactId>
<version>${project.version}</version>
<exclusions>
<exclusion>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
@@ -158,17 +152,6 @@
<artifactId>scala-parallel-collections_${scala.binary.version}</artifactId>
</dependency>
--><!-- #endif scala-2.13 -->
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>${connect.guava.version}</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>failureaccess</artifactId>
<version>${guava.failureaccess.version}</version>
</dependency>
<dependency>
<groupId>com.google.protobuf</groupId>
<artifactId>protobuf-java</artifactId>
@@ -289,7 +272,6 @@
<shadedArtifactAttached>false</shadedArtifactAttached>
<artifactSet>
<includes>
<include>com.google.guava:*</include>
<include>io.grpc:*:</include>
<include>com.google.protobuf:*</include>

@@ -309,13 +291,6 @@
</includes>
</artifactSet>
<relocations>
<relocation>
<pattern>com.google.common</pattern>
<shadedPattern>${spark.shade.packageName}.connect.guava</shadedPattern>
<includes>
<include>com.google.common.**</include>
</includes>
</relocation>
<relocation>
<pattern>com.google.thirdparty</pattern>
<shadedPattern>${spark.shade.packageName}.connect.guava</shadedPattern>