For now we have a data production and consumption demo. It can be run with `docker-compose up --build`. The `sensor_heater` service produces updates for the simulated appliance; these are passed to Kafka and then consumed by the `db_interface` and the `streaming_worker`.
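For reference, the sketch below shows roughly how such a producer could push simulated readings to Kafka using `kafka-python`. The broker address, topic name, and message fields are illustrative assumptions, not the project's actual values.

```python
# Minimal sketch of a sensor_heater-style producer (assumed kafka-python client;
# broker address, topic name, and schema are hypothetical).
import json
import random
import time

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="kafka:9092",  # assumed broker address from docker-compose
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

while True:
    # One simulated heater reading for a household appliance.
    reading = {
        "appliance": "heater",
        "temperature": round(random.uniform(40.0, 80.0), 2),
        "timestamp": time.time(),
    }
    producer.send("heater_readings", reading)  # hypothetical topic name
    time.sleep(1)
```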
After the infrastructure is set up, you can access a container called `spark` from which we submit our Spark jobs. Copy the relevant files over to the container with `docker cp path/to/script spark:/opt/bitnami/spark`.
- Copy the file `spark/streaming_worker/streaming_worker.py` to the `spark` container. Run `docker exec -it spark bash` to get into the container, then `spark-submit --master spark://spark-master:7077 --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.5 --py-files streaming_worker.py streaming_worker.py` to submit (see the first sketch after this list).
- Copy the file `spark/ml/regression.py` to the `spark` container. Run `docker exec -it spark bash` to get into the container, then `spark-submit --packages com.datastax.spark:spark-cassandra-connector_2.11:2.4.2 --py-files regression.py regression.py` to submit to Spark (see the second sketch after this list).
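Below is a minimal sketch of the kind of Structured Streaming job that `streaming_worker.py` submits. The broker address, topic name, and console sink are assumptions used for illustration, not the project's actual logic.

```python
# Sketch of a Structured Streaming job reading sensor updates from Kafka
# (hypothetical broker/topic names; a real job would parse and aggregate).
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("streaming_worker_sketch").getOrCreate()

# Read the sensor updates from Kafka as a streaming DataFrame.
readings = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "kafka:9092")  # assumed broker address
    .option("subscribe", "heater_readings")           # hypothetical topic name
    .load()
    .select(col("value").cast("string").alias("raw_json"))
)

# Write raw messages to the console sink for demonstration purposes.
query = (
    readings.writeStream
    .format("console")
    .outputMode("append")
    .start()
)
query.awaitTermination()
```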
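And a minimal sketch of a batch job of the kind `regression.py` submits, reading training data through the Cassandra connector. The connection host, keyspace, table, and column names are hypothetical and only show the shape of such a job.

```python
# Sketch of a Spark ML regression job over data stored in Cassandra
# (keyspace/table/columns are assumptions for illustration).
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.regression import LinearRegression

spark = (
    SparkSession.builder
    .appName("regression_sketch")
    .config("spark.cassandra.connection.host", "cassandra")  # assumed host name
    .getOrCreate()
)

# Load historical sensor readings from Cassandra (hypothetical keyspace/table).
df = (
    spark.read
    .format("org.apache.spark.sql.cassandra")
    .options(keyspace="sensors", table="heater_readings")
    .load()
)

# Assemble feature columns and fit a simple linear regression on the readings.
assembler = VectorAssembler(inputCols=["usage_hours"], outputCol="features")
train = assembler.transform(df).select("features", df["temperature"].alias("label"))
model = LinearRegression().fit(train)

print("coefficients:", model.coefficients, "intercept:", model.intercept)
```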
Please find the Google Slides describing our system architecture here.
- Interactive map with pins that simulate households. Clicking a pin presents options: view data (historical query) or request the lifetime of the household's appliances (Leaflet for Vue).
- View data as interactive/dynamic graphs.
- If there is no time for Leaflet, a list of all households would suffice.