This project uses reinforcement learning to mutate a partially correct or incomplete piece of coding homework into a complete, highly scored submission (e.g. one for which all test inputs give the correct output). Our domain is code written in the functional programming language OCaml. The project extracts and preprocesses data from a large database of homework submissions, transforming them into abstract syntax trees (ASTs) and passing them through a graph neural network (GNN).
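As a rough illustration of the AST-to-graph step, the sketch below flattens a toy tree into the node labels and (parent, child) edge list that a GNN library typically consumes. The tuple-based tree format, the label strings, and the function name `ast_to_graph` are all hypothetical; the project's real ASTs are produced from OCaml sources and may be encoded quite differently.

```python
# Hypothetical toy AST: each node is (label, [children]).
# This is NOT the project's actual representation, only an illustration.

def ast_to_graph(root):
    """Flatten a tree into node labels and (parent, child) index pairs."""
    labels = []
    edges = []
    stack = [(root, -1)]  # (node, index of parent); -1 marks the root
    while stack:
        (label, children), parent = stack.pop()
        index = len(labels)
        labels.append(label)
        if parent >= 0:
            edges.append((parent, index))
        # Push children in reverse so they are visited left to right.
        for child in reversed(children):
            stack.append((child, index))
    return labels, edges


# Toy tree standing in for the OCaml expression `let x = 1 + 2`.
example = ("let", [("var:x", []), ("apply:+", [("const:1", []), ("const:2", [])])])
labels, edges = ast_to_graph(example)
print(labels)
print(edges)
```

Node indices are assigned in pre-order, so every edge points from a lower index to a higher one. That ordering is a design choice of this sketch, not something the project is known to rely on.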
To set up via Docker, follow these steps:
- Install Docker.
- Set up run-logger.
- Configure your .env file so that the environment variable GRAPHQL_ENDPOINT points to the server you have set up. Start direnv by running direnv allow. For example:
GRAPHQL_ENDPOINT=http://server.com:1200/v1/graphql
- Create a docker volume called rl_checkpoint by using the command:
docker volume create rl_checkpoint
- Now you can build the project with Docker by running the following command in the terminal:
bash run.sh <DOCKER_IMAGE_NAME> <DOCKER_VOLUME_MOUNT_DIR> <DESCRIPTION_ON_LOGGER>
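Before launching the build, it can help to confirm that GRAPHQL_ENDPOINT from the .env step is actually visible and looks like a URL. The following is a minimal sketch using only the Python standard library; the helper name `check_graphql_endpoint` is hypothetical and not part of this repository, and the check only validates the URL's shape, not whether the server is reachable.

```python
import os
from urllib.parse import urlparse


def check_graphql_endpoint(env=None):
    """Return the endpoint URL, or raise ValueError if it is missing or malformed."""
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    url = env.get("GRAPHQL_ENDPOINT", "")
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(
            f"GRAPHQL_ENDPOINT is missing or not an http(s) URL: {url!r}"
        )
    return url


# Example using the placeholder value from the setup steps above.
print(check_graphql_endpoint({"GRAPHQL_ENDPOINT": "http://server.com:1200/v1/graphql"}))
```

Calling `check_graphql_endpoint()` with no argument reads the current environment, which is what direnv populates once direnv allow has been run in the project directory.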
If you want to work on this project on a local machine, you need to install Poetry and opam. You can then run make deps to install all the dependencies needed.
To visualize the actions that your agent is taking, you can run visualize.sh. This requires a saved model in your docker volume. If you already have one, run:
bash visualize.sh <DOCKER_IMAGE_NAME> <DOCKER_VOLUME_MOUNT_DIR> <LOG_NAME> <RUN_ID>
The directories have the following functions:
- agent/: This directory includes the code for our reinforcement learning agent.
- clib/: This directory includes the C code for our project. The C code is used for communicating between our Python and OCaml code.
- envs/: This directory includes the Python code for our environment. The environment that we are using is in envs/ast_env.py.
- ocamllib/: This directory includes the OCaml code for our environment.
- If you suddenly hit an error such as a child node not being found, check whether max_num_nodes is large enough for the problem.
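One way to catch this early is to count the nodes of an AST and compare the count against max_num_nodes before training. The sketch below assumes that max_num_nodes caps the number of AST nodes, which is an interpretation of the note above rather than something stated in this README; the tuple-based tree format and both helper names are hypothetical.

```python
# Hypothetical toy AST: each node is (label, [children]).
# Assumption: max_num_nodes is an upper bound on the number of AST nodes.

def count_nodes(tree):
    """Count every node in the tree, including the root."""
    _label, children = tree
    return 1 + sum(count_nodes(child) for child in children)


def check_fits(tree, max_num_nodes):
    """Return the node count, or raise ValueError if it exceeds max_num_nodes."""
    n = count_nodes(tree)
    if n > max_num_nodes:
        raise ValueError(
            f"AST has {n} nodes but max_num_nodes is {max_num_nodes}"
        )
    return n


# Toy tree standing in for the OCaml expression `let x = 1 + 2`.
example = ("let", [("var:x", []), ("apply:+", [("const:1", []), ("const:2", [])])])
print(check_fits(example, max_num_nodes=5))
```

An explicit size check like this fails with a clear message, instead of surfacing later as a missing child.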