Whiteboard Vision

Description

This project aims to detect and transcribe text written on a whiteboard in a classroom environment. Handwritten mathematical equations and drawn diagrams are not targeted for transcription.

The project uses a client-server architecture. Docker is used to host the server components and deliver the client-side assets to a user's web browser. Text detection and text recognition are performed by two neural networks running on the server.

Server Requirements

Please Note: PyTorch is used to load the neural network models and weights in Docker.

(If using a CPU to run the models) Equivalent to a 7th Generation Intel i7 or greater.
(If using a GPU to run the models) Equivalent to a GTX 1060 or greater.
At least 6GB of RAM assigned to Docker
- Please Note: The servers themselves will function on less than 6GB of RAM. However, the API server will then fail to load the PyTorch models or fail to perform recognition/detection on a large image.
Windows/MacOS/Linux with Docker installed
Docker >=v2.1.0.5
Docker Compose >=v1.25.4
Approx 4GB of space
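
Some of these requirements can be checked before starting. The following is a minimal Python sketch and is not part of this repository: the `preflight` helper and its threshold constant are assumptions for illustration, based on the list above. It only checks that the `docker` and `docker-compose` commands are on the PATH and that roughly 4GB of disk space is free. It does not check CPU/GPU capability or the RAM assigned to Docker.

```python
import shutil

REQUIRED_FREE_GB = 4  # "Approx 4GB of space" from the requirements list


def preflight(path=".", required_gb=REQUIRED_FREE_GB):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []

    # Docker and Docker Compose must be callable for the later steps.
    for tool in ("docker", "docker-compose"):
        if shutil.which(tool) is None:
            problems.append(f"{tool} was not found on PATH")

    # Compare free space on the drive holding `path` with the requirement.
    free_gb = shutil.disk_usage(path).free / 1024**3
    if free_gb < required_gb:
        problems.append(f"only {free_gb:.1f} GB free, need about {required_gb} GB")

    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARNING:", problem)
```

Running the script from the directory where the repository will be cloned prints one warning per failed check and nothing if both checks pass.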

Project Setup

Clone this repository to a suitable location on the server:

git clone https://www.github.com/Brikwerk/whiteboard-vision

For evaluation purposes, pre-trained weights are available for usage with the CRAFT and DTR neural networks inside this project. These weights, however, need to be downloaded.

The weights required for CRAFT to function can be downloaded HERE (Main Weights) and HERE (Refiner Weights).

The weights required for DTR to function can be downloaded HERE.

Both CRAFT and DTR, along with their respective pre-trained models, are available thanks to Clova AI's research. For those interested: CRAFT and DTR.
Once the above weights are downloaded, you should have three files:
- craft_mlt_25k.pth
- craft_refiner_CTW1500.pth
- TPS-ResNet-BiLSTM-Attn.pth
Inside the whiteboard-vision project, place the weights "craft_mlt_25k.pth" and "craft_refiner_CTW1500.pth" in the location "api_server/whiteboard_vision/clova_detection/weights" and the weight "TPS-ResNet-BiLSTM-Attn.pth" in the location "api_server/whiteboard_vision/clova_recognition/weights".
Make a copy of the "example_config.py" file within the "frontend_server" directory and rename it to "config.py".
Make a copy of the "example_config.py" file within the "api_server" directory and rename it to "config.py".
Make a copy of the "example_nginx_http.conf" file within the "conf" directory and rename it to "nginx.conf".
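
The setup steps above can also be scripted. The following is a minimal Python sketch, assuming it is run against the root of the cloned repository. The paths and file names are taken from the steps above, while the helper names `missing_weights` and `copy_configs` are hypothetical and not part of this project.

```python
import shutil
from pathlib import Path

# Weight file name -> folder it belongs in, relative to the repository root.
WEIGHTS = {
    "craft_mlt_25k.pth": "api_server/whiteboard_vision/clova_detection/weights",
    "craft_refiner_CTW1500.pth": "api_server/whiteboard_vision/clova_detection/weights",
    "TPS-ResNet-BiLSTM-Attn.pth": "api_server/whiteboard_vision/clova_recognition/weights",
}

# (example file, file to create from it), relative to the repository root.
CONFIGS = [
    ("frontend_server/example_config.py", "frontend_server/config.py"),
    ("api_server/example_config.py", "api_server/config.py"),
    ("conf/example_nginx_http.conf", "conf/nginx.conf"),
]


def missing_weights(root):
    """Return the names of weight files that are not yet in place."""
    root = Path(root)
    return [name for name, folder in WEIGHTS.items()
            if not (root / folder / name).is_file()]


def copy_configs(root):
    """Copy each example config to its target unless the target already exists."""
    root = Path(root)
    created = []
    for src, dst in CONFIGS:
        if (root / src).is_file() and not (root / dst).exists():
            shutil.copyfile(root / src, root / dst)
            created.append(dst)
    return created


if __name__ == "__main__":
    print("Created:", copy_configs("."))
    print("Missing weights:", missing_weights("."))
```

Existing "config.py" and "nginx.conf" files are left untouched, so the script can be re-run without overwriting local edits.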

Running a Testing Server

Note: Make sure to complete the "Project Setup" section before looking at this section.

If you wish to run a development instance of this project with debug features enabled, this section is for you.

Edit both "config.py" files within the "api_server" and "frontend_server" directories so that the DEBUG, ENV, and TESTING variables match the following:
```
DEBUG = True
ENV = 'development'
TESTING = True
```
Note: Do not delete the API_ENDPOINT variable in the "frontend_server" config.
With a console at the root of this project, run the following command:
```
docker-compose up
```
This will start NGINX, the API server, the frontend server, and Certbot (which can be ignored) in Docker with an overview of what's going on. Please Note: The build process for the API server can take some time, depending on system/internet speed.
The frontend client should now be accessible in a web browser at http://localhost

Changes to the Python files in either project will appropriately reload the respective servers.
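
Because the API server build can take some time, it may be convenient to wait for the frontend from a script rather than refreshing the browser. The following is a minimal Python sketch using only the standard library. The `wait_for_frontend` helper, its retry count, and its delay are assumptions for illustration and not part of this project.

```python
import time
import urllib.error
import urllib.request


def wait_for_frontend(url="http://localhost", attempts=30, delay=2.0,
                      opener=urllib.request.urlopen):
    """Poll `url` until it answers with HTTP 200, or give up after `attempts` tries."""
    for _ in range(attempts):
        try:
            with opener(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Not ready yet: connection refused, or an error status from NGINX.
            pass
        time.sleep(delay)
    return False


if __name__ == "__main__":
    print("frontend is up" if wait_for_frontend() else "frontend did not respond")
```

The `opener` parameter exists only so the function can be exercised without a running server; in normal use the default `urllib.request.urlopen` is sufficient.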

Running a Production Server

Note: Make sure to complete the "Project Setup" section before looking at this section.

If you wish to run an online, production instance of this server, this section is for you. An assumption is made going forward that a URL is available for usage on the server.

Open up the "nginx.conf" file under the "conf" directory and replace any instance of "localhost" (quotes excluded) with the URL you will be hosting the server under. An example is provided below:
```
server {
  listen 80;
  server_name INSERT.URL.HERE;

...
```
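
The replacement can also be done programmatically. The following is a minimal Python sketch, assuming the config contains the literal text "localhost" wherever the domain belongs. The helper names `set_server_name` and `update_nginx_conf` are hypothetical, and "example.org" stands in for your own domain.

```python
from pathlib import Path


def set_server_name(conf_text, domain):
    """Return the config text with every 'localhost' replaced by `domain`."""
    return conf_text.replace("localhost", domain)


def update_nginx_conf(path, domain):
    """Rewrite the config file at `path` in place."""
    conf = Path(path)
    conf.write_text(set_server_name(conf.read_text(), domain))


if __name__ == "__main__":
    # Run from the root of the project; replace example.org with your domain.
    update_nginx_conf("conf/nginx.conf", "example.org")
```

A plain string replacement is used on purpose, since the instruction is to replace any instance of "localhost". Review the resulting file before starting the server.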
To enable access to a user's webcam for usage in the client, an SSL certificate is required. To get an SSL certificate through Let's Encrypt, edit the certbot script located at the root of the project with your information and run it.

If you are on Linux or Mac, edit the "certbot.sh" script. If you are on Windows, edit the "certbot.bat" script. An example of a filled-in Linux/Mac script is provided below:
```
docker-compose run --rm --entrypoint \
"certbot certonly --webroot -w /var/www/certbot \
--email [email protected] \
-d example.org \
--rsa-key-size 4096 \
--agree-tos \
--no-eff-email \
--force-renewal" certbot

docker stop nginx
```
After you have finished filling out the script, run it with the respective command:

Linux/Mac:
```
./certbot.sh
```
Windows:
```
certbot.bat
```
The script will run the Certbot Docker image and attempt to obtain an SSL certificate for the specified domain. On success, a "Congratulations" notice will be listed first under the "Important Notes" section in your terminal after the script has finished running.
Delete the "nginx.conf" file located in the "conf" directory. Next, make a copy of the "example_nginx_https.conf" file and rename it to "nginx.conf". As in the first step of this section, replace any instances of "localhost" (quotes excluded) with your domain name.
Run the following to boot the server:
```
docker-compose up -d
```
The server should now be running securely under the specified domain.

Future Improvements

Improved word grouping and sentence grouping
- Word grouping can fail in adverse scenarios (words forgotten). Inclusion of words needs to be looked into if sentence gathering fails.
- Sentence grouping can fail when words are further apart. This can be solved by tweaking the config values within CRAFT.
API server scaling needs to be improved/investigated
- A 502 error can still occur in periods of heavy image submission. This needs to be investigated and proper solutions applied. A solution could involve usage of Celery to queue backlogged requests and uWSGI, instead of Gunicorn, for better scaling.
