diff --git a/docs/Getting started.md b/docs/Getting started.md
index 64e878246..a06278e1c 100644
--- a/docs/Getting started.md
+++ b/docs/Getting started.md
@@ -78,11 +78,11 @@ If you use Linux, Florian Kellner also kindly contributed an [installation scrip
 
 RumbleDB requires an Apache Spark installation on Linux, Mac or Windows.
 
-It is straightforward to directly [download it](https://spark.apache.org/downloads.html), unpack it and put it at a location of your choosing. We recommend to pick Spark 3.2.2. Let us call this location SPARK_HOME (it is a good idea, in fact to also define an environment variable SPARK_HOME pointing to the absolute path of this location).
+It is straightforward to directly [download it](https://spark.apache.org/downloads.html), unpack it and put it at a location of your choosing. We recommend to pick Spark 3.4.3. Let us call this location SPARK_HOME (it is a good idea, in fact to also define an environment variable SPARK_HOME pointing to the absolute path of this location).
 
 What you need to do then is to add the subdirectory "bin" within the unpacked directory to the PATH variable. On macOS this is done by adding
 
-    export SPARK_HOME=/path/to/spark-3.2.2-bin-hadoop3.2
+    export SPARK_HOME=/path/to/spark-3.4.3-bin-hadoop3.2
     export PATH=$SPARK_HOME/bin:$PATH
 
 (with SPARK_HOME appropriately set to match your unzipped Spark directory) to the file .zshrc in your home directory, then making sure to force the change with
@@ -111,9 +111,11 @@ Like Spark, RumbleDB is just a download and no installation is required.
 
 In order to run RumbleDB, you simply need to download one of the small .jar files from the [download page](https://github.com/RumbleDB/rumble/releases) and put it in a directory of your choice, for example, right besides your data.
 
-If you use Spark 3.2+, use rumbledb-1.22.0-for-spark-3.2.jar.
+If you use Spark 3.4+, use rumbledb-1.22.0-for-spark-3.4.jar.
 
-If you use Spark 3.3+, use rumbledb-1.22.0-for-spark-3.3.jar.
+If you use Spark 3.5+, use rumbledb-1.22.0-for-spark-3.5.jar.
+
+If you use Spark 4.0+ (preview), use rumbledb-1.22.0-for-spark-4.0.jar.
 
 These jars do not embed Spark, since you chose to set it up separately. They will work with your Spark installation with the spark-submit command.
 
diff --git a/docs/install.md b/docs/install.md
index d3bc0bebd..98f158a46 100644
--- a/docs/install.md
+++ b/docs/install.md
@@ -7,9 +7,9 @@ We show here how to install RumbleDB from the github repository if you wish to d
 The following software is required:
 
 - [Java SE](http://www.oracle.com/technetwork/java/javase/downloads/index.html) 8 (last tested on OpenJDK 8u251). The version of Java is important, as Spark only works with Java 8 or java 11.
-- [Spark](https://spark.apache.org/), version 3.1.2 (for example)
+- [Spark](https://spark.apache.org/), version 3.4.3 (for example)
 - [Ant](http://www.ant.org/), version 1.11.1
-- [ANTLR](http://www.ant.org/), version 4.8 (supplied in our repository)
+- [ANTLR](http://www.ant.org/), version 4.9.3 (supplied in our repository)
 - [Maven](https://maven.apache.org/) 3.6.0
 
 Important: the ANTLR version varies with the Spark version, because Spark is also shipped with an ANTLR runtime (example: Spark 3.0 and 3.1 is with ANTLR 4.7, Spark 3.2 with ANTLR 4.8). The ANTLR runtime MUST match the ANTLR generator used to generate the RumbleDB jar file.
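
Reviewer note: the jar-selection rule introduced in `docs/Getting started.md` can be sketched as a small lookup. This is only an illustration, not part of the patch. The function name `pick_jar` is hypothetical, and the sketch assumes that "Spark 3.4+" means "any 3.4.x release" (and likewise for 3.5 and 4.0), which the text does not state explicitly. The jar names are copied from the added lines of the diff.

```python
import re

# Jar names exactly as listed in the updated "Getting started" text.
JARS = {
    (3, 4): "rumbledb-1.22.0-for-spark-3.4.jar",
    (3, 5): "rumbledb-1.22.0-for-spark-3.5.jar",
    (4, 0): "rumbledb-1.22.0-for-spark-4.0.jar",
}


def pick_jar(spark_version: str) -> str:
    """Return the jar name for a Spark version string such as '3.4.3'.

    Assumption: only the major.minor part decides which jar to use.
    """
    match = re.match(r"(\d+)\.(\d+)", spark_version)
    if match is None:
        raise ValueError(f"unrecognized Spark version: {spark_version!r}")
    key = (int(match.group(1)), int(match.group(2)))
    if key not in JARS:
        raise ValueError(f"no jar listed for Spark {key[0]}.{key[1]}")
    return JARS[key]
```

Under that assumption, `pick_jar("3.4.3")` selects the 3.4 jar, while a version the updated text no longer lists (for example 3.2.2) raises a `ValueError` instead of silently falling back to another jar.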