A Python script to convert any imported text into a dictionary of word-binary pairs.

tutmoses/text2array

text2array

This program converts a folder of PDF documents into a dictionary of 100-element arrays containing binary strings, which can be fed into a neural network.
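The README does not spell out the encoding, so the following is only a minimal sketch of what a dictionary of 100-element arrays of binary strings might look like. It assumes one 8-bit string per UTF-8 byte of each word, with zero padding up to 100 elements. The names `word_to_array` and `text_to_dictionary` are hypothetical and are not taken from the repository, and plain Python lists stand in for NumPy arrays to keep the example dependency-free.

```python
from typing import Dict, List

ARRAY_LENGTH = 100  # array size stated in the README
BITS_PER_BYTE = 8   # assumption: one 8-bit binary string per UTF-8 byte


def word_to_array(word: str) -> List[str]:
    """Encode a word as ARRAY_LENGTH binary strings (hypothetical scheme).

    Words longer than ARRAY_LENGTH bytes are truncated; shorter words are
    padded with all-zero strings.
    """
    data = word.encode("utf-8")[:ARRAY_LENGTH]
    codes = [format(byte, "08b") for byte in data]
    padding = ["0" * BITS_PER_BYTE] * (ARRAY_LENGTH - len(codes))
    return codes + padding


def text_to_dictionary(text: str) -> Dict[str, List[str]]:
    """Map each distinct lowercase word in the text to its binary array."""
    return {word: word_to_array(word) for word in text.lower().split()}
```

Calling `text_to_dictionary` on text extracted from a PDF would give one fixed-length entry per distinct word, which is the shape a fixed-width neural network input layer would expect.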

DEPENDENCIES

Install the following from your terminal. You will need the package managers Homebrew and pip. Homebrew installation instructions can be found at https://brew.sh/, and pip3 is bundled with the Homebrew Python installation.

  • brew install python
  • brew install pkg-config poppler
  • pip3 install glob2
  • pip3 install pdftotext
  • pip3 install numpy
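After running the commands above, a quick check that the three Python packages are importable can save a confusing failure later. The sketch below is not part of the repository. It assumes the import names match the pip package names (`glob2`, `pdftotext`, `numpy`), and `check_dependencies` is a hypothetical helper.

```python
from importlib.util import find_spec

# Assumption: import names are the same as the pip package names.
REQUIRED_MODULES = ("glob2", "pdftotext", "numpy")


def check_dependencies(modules=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if it can be found."""
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in check_dependencies().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Any module reported as missing can be reinstalled with the matching `pip3 install` line from the list.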

Run in Atom using the Script plugin.
