NaturalSym, proposed in "Natural Symbolic Execution-based Testing for Big Data Analytics" (FSE 2024), is a symbolic execution-based tool that generates natural tests for big data analytics. To replicate our evaluation, see https://zenodo.org/records/11090237.

Installation

Users can download NaturalSym from GitHub. We recommend Ubuntu >= 20.04 with Python >= 3.12.

> apt-get install python3.12 python3-pip ant -y
> pip3 install numpy scipy
> git clone https://github.com/UCLA-SEAL/NaturalSym.git
> cd NaturalSym

Run an example

NaturalSym is a test generator for Scala-based Apache Spark programs. To use NaturalSym, users provide a Scala program under test and an input annotation file, e.g. grades.scala and grades.config.

grades.scala contains a method def execute(input1: RDD[String], input2: RDD[String]). It takes two input tables, which hold students' math and physics scores respectively, and filters out students whose total score is a failing one.
grades.config contains user knowledge about the program input. Each line in grades.config corresponds to one input table in grades.scala, e.g. input1 := Discrete("alice","bob") | scipy.binom(100, 0.1). This annotation says that the user gives two examples for the first column (student name), "alice" and "bob", and that the second column (math score) should follow a binomial distribution binom(n=100, p=0.1).
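To make the annotation above concrete, the following Python sketch computes how likely a few math scores are under binom(n=100, p=0.1). It is an illustration only and not part of NaturalSym: the helper name binom_pmf is ours, and it uses only the standard library (scipy.stats.binom(100, 0.1).pmf(k) should give the same values). Scores close to the mean n*p = 10 come out far more probable than a score such as 60.

```python
from math import comb

def binom_pmf(k: int, n: int = 100, p: float = 0.1) -> float:
    """Probability of exactly k successes in n trials (binomial distribution)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Column 1 of input1: the two names given via Discrete("alice","bob").
names = ["alice", "bob"]

# Column 2 of input1: compare the probability of a few candidate math scores.
for score in (8, 10, 60):
    print(f"P(math score = {score}) = {binom_pmf(score):.3g}")
```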

To run NaturalSym, execute > ./naturalsym.sh grades.scala grades.config. The generated tests appear both in the console and in grades.tests. The snippet below shows the test generated for the first path condition: a student's math and physics grades are both found in the two tables, and their sum is smaller than 60.

Generated tests for Path1 in Rundir/1.smt2
input1.csv
alice,8
input2.csv
alice,34
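As a sanity check on this output, the plain-Python sketch below restates the path condition described above (a name present in both tables, with math + physics below 60) and confirms that the generated rows satisfy it. It is an illustration only: the function names parse_table and satisfies_path1 are hypothetical, the threshold of 60 is taken from the description above, and the actual logic lives in grades.scala, which is not reproduced here.

```python
import csv
import io

def parse_table(text: str) -> dict[str, int]:
    """Parse 'name,score' CSV lines into a {name: score} mapping."""
    return {name: int(score) for name, score in csv.reader(io.StringIO(text))}

def satisfies_path1(math_csv: str, physics_csv: str, threshold: int = 60) -> bool:
    """True if some student appears in both tables with a total below threshold."""
    math_scores = parse_table(math_csv)
    physics_scores = parse_table(physics_csv)
    shared = math_scores.keys() & physics_scores.keys()
    return any(math_scores[n] + physics_scores[n] < threshold for n in shared)

input1 = "alice,8\n"   # contents of input1.csv from the output above
input2 = "alice,34\n"  # contents of input2.csv from the output above
print(satisfies_path1(input1, input2))  # 8 + 34 = 42, which is below 60
```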

Try it yourself!

From the root folder, execute > ./naturalsym.sh <target.scala> <target.config>. The output appears both in the console and in <target.tests>.

Because of limitations in the back-end symbolic execution engine, the target method must be named execute and its input arguments must be of type RDD[String]. Please see our template in template.scala.
The configuration file format is shown below. Each line of the configuration file describes one input table, and declares the user's knowledge about each of its columns, delimited by "|". The knowledge for a column can be none, a list of examples, a Gaussian distribution, a uniform distribution, or any distribution supported by the Python scipy library.

<config>     :=     ""    |   <config> "\n" <tab>
<tab>        :=     <tab-name> ":=" <cols>
<cols>       :=     ""    |   <cols> "|" <col>
<col>        :=     <none> | <examples> | <uniform> | <gaussian> | <trunc-gaussian> | <scipy-distr>
<none>       :=     ""
<examples>   :=     Discrete(<examples-delimited-by-comma>)
<uniform>    :=     Uniform(<l-bound>,<r-bound>)
<gaussian>   :=     Gaussian(<mu>,<sigma>)
<trunc-gaussian>:=  <l-bound><=Gaussian(<mu>,<sigma>)<=<r-bound>
<scipy-distr>:=     scipy.<distr-name>(<parameter-list>)
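The grammar can be read as "one table per line, columns separated by |". The sketch below is a minimal Python reader for a single <tab> line, written only to illustrate the grammar. It is not NaturalSym's parser: the names classify_col and parse_tab are hypothetical, and it assumes that "|" never appears inside an example string.

```python
import re

def classify_col(col: str) -> str:
    """Return the name of the grammar alternative that a column annotation matches."""
    col = col.strip()
    if col == "":
        return "none"
    if col.startswith("Discrete("):
        return "examples"
    if col.startswith("Uniform("):
        return "uniform"
    if col.startswith("Gaussian("):
        return "gaussian"
    # <l-bound><=Gaussian(<mu>,<sigma>)<=<r-bound>
    if re.fullmatch(r".+<=\s*Gaussian\(.*\)\s*<=.+", col):
        return "trunc-gaussian"
    if col.startswith("scipy."):
        return "scipy-distr"
    raise ValueError(f"unrecognized column annotation: {col!r}")

def parse_tab(line: str) -> tuple[str, list[str]]:
    """Split '<tab-name> := <cols>' into the table name and its column kinds."""
    name, _, cols = line.partition(":=")
    return name.strip(), [classify_col(c) for c in cols.split("|")]

print(parse_tab('input1 := Discrete("alice","bob") | scipy.binom(100, 0.1)'))
```

For the grades.config line used earlier, this classifies the first column as an example list and the second as a scipy distribution.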

Run our benchmark (Optional)

Users can run > ./NaturalSym/scripts/run1.sh <bench> to run a subject program from our benchmark suite. <bench> should be one of airport, movie1, usedcars, transit, credit, Q1, Q3, Q6, Q7, Q12, Q15, Q19, Q20.

For example, > ./NaturalSym/scripts/run1.sh airport runs NaturalSym on NaturalSym/newbench/src/airport/airport.scala with the configuration file NaturalSym/newbench/config/airport.config. The generated tests are placed under NaturalSym/newbench/geninputs/airport.
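The small Python helper below spells out where the files for a given benchmark would be found. It is a sketch under one assumption: that every benchmark follows the same directory layout as the airport example above, which this README states only for airport. The function name benchmark_paths is hypothetical and the helper does not run anything.

```python
from pathlib import PurePosixPath

# Benchmark names accepted by run1.sh, as listed above.
BENCHMARKS = ["airport", "movie1", "usedcars", "transit", "credit",
              "Q1", "Q3", "Q6", "Q7", "Q12", "Q15", "Q19", "Q20"]

def benchmark_paths(bench: str, root: str = "NaturalSym") -> dict[str, PurePosixPath]:
    """Return the program, config, and generated-test paths for a benchmark.

    Assumes the layout shown for 'airport' applies to every benchmark.
    """
    if bench not in BENCHMARKS:
        raise ValueError(f"unknown benchmark: {bench!r}")
    base = PurePosixPath(root) / "newbench"
    return {
        "program": base / "src" / bench / f"{bench}.scala",
        "config": base / "config" / f"{bench}.config",
        "tests": base / "geninputs" / bench,
    }

for kind, path in benchmark_paths("airport").items():
    print(f"{kind}: {path}")
```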
