Skip to content

A sample project to structure your pyspark project, use pytest to run unit test in pyspark and a demonstration of using Single SparkSession for all your test case pytest's conftest feature.

Notifications You must be signed in to change notification settings

zakkyakhtar/pytest-pyspark

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

pytest-pyspark

A sample project to organise your pyspark project. It demonstrates the use of pytest to unit test PySpark methods. It also demonstrates the use of pytest's conftest.py feature which can be used for dependency injection. In this case SparkSession is being injected to the test cases.

How to use it

  • The files containing main methods should be kept outside the package
  • Create an egg file by running python setup.py bdist_egg
  • Submit your spark application: spark-submit --py-files=pytest-pyspark-1.0.egg main.py

How to Run Test cases

  • Create a virtual environment: virtualenv pyspark-venv
  • Activate the virtual environment: source pyspark-ven/bin/activate
  • Install the test's dependency: pip install -r test_requirements.txt
  • Run the test: pytest In order to generate the unit test report in junit xml and coverage report in Cobetura xml format: py.test --cov-report xml:./reports/coverage.xml ----junitxml=./reports/unit-test.xml

About

A sample project to structure your pyspark project, use pytest to run unit test in pyspark and a demonstration of using Single SparkSession for all your test case pytest's conftest feature.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages