
Public runnable examples of using John Snow Labs' NLP for Apache Spark.


JackChou1996/spark-nlp-workshop

 
 


Spark NLP Workshop


Notebooks and code samples showing how to use Spark NLP in Python and Scala.


Python Setup

$ java -version
# should be Java 8 (Oracle or OpenJDK)
$ conda create -n sparknlp python=3.6 -y
$ conda activate sparknlp
$ pip install spark-nlp pyspark==2.4.4
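
The Java 8 requirement above can also be checked from Python before starting Spark. The following is a minimal sketch, not part of Spark NLP: the helper names `java_major_version` and `installed_java_major` are hypothetical, and the parsing assumes the usual `version "1.8.0_..."` banner that `java -version` prints.

```python
import re
import subprocess
from typing import Optional


def java_major_version(version_output: str) -> Optional[int]:
    """Extract the major Java version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Releases up to Java 8 report themselves as 1.x (e.g. "1.8.0_292").
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def installed_java_major() -> Optional[int]:
    """Run `java -version` and parse it; returns None if java is not on PATH."""
    try:
        result = subprocess.run(
            ["java", "-version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,  # works on Python 3.6, unlike text=True
        )
    except FileNotFoundError:
        return None
    # The version banner normally goes to stderr; check both streams anyway.
    return java_major_version(result.stderr + result.stdout)
```

Calling `installed_java_major()` inside the activated `sparknlp` environment should return `8` on a machine set up as described above; any other value (or `None`) suggests a different JDK is first on the PATH.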

Docker setup

To try Spark NLP and run the Jupyter examples without installing Spark NLP locally, you can use our Docker image:

1- Get the Docker image for spark-nlp-workshop:

docker pull johnsnowlabs/spark-nlp-workshop

2- Run the image locally with port binding:

docker run -it --rm -p 8888:8888 -p 4040:4040 johnsnowlabs/spark-nlp-workshop

3- Open the Jupyter notebooks in your browser, using the token printed on the console:

http://localhost:8888/
  • The password for the Jupyter notebook is sparknlp.
  • The image grows every time you download a pretrained model or pipeline. You can clean up ~/cache_pretrained if you no longer need them.
  • This Docker image is meant only for testing and learning and should not be used in production environments. For production, please install Spark NLP natively.
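
Before cleaning up the cache mentioned above, it can help to see how much space it occupies. This is a rough sketch using only the standard library; the function name `cache_size_bytes` is hypothetical, and it assumes the cache lives at `~/cache_pretrained` for the user running the notebook.

```python
from pathlib import Path


def cache_size_bytes(cache_dir: Path) -> int:
    """Total size in bytes of all regular files under cache_dir (0 if absent)."""
    if not cache_dir.is_dir():
        return 0
    return sum(p.stat().st_size for p in cache_dir.rglob("*") if p.is_file())
```

For example, `cache_size_bytes(Path.home() / "cache_pretrained") / 1e6` gives the cache size in megabytes from a notebook cell.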

Main repository

https://github.com/JohnSnowLabs/spark-nlp

Project's website

For user documentation and examples, see the official Spark NLP page: http://nlp.johnsnowlabs.com/

Slack community channel

Join Slack

Contributing

If you find an example that no longer works, please create an issue.

License

Apache License 2.0
