Theano implementation of the paper "Natural Language Processing (Almost) from Scratch"
There are still quite a few important items to finish, but the model does seem to be learning the embedding.
ATIS data: 46635 sentences, with a vocabulary of 572 words.
A word embedding for each word in the vocabulary.
python train.py
You need Theano 0.7 or later and NumPy installed to run the program.
- cPickle the embedding at the end of each epoch.
- Use a validation set to avoid overfitting.
- Hyperparameter tuning.
- GPU support, perhaps, because training is really slow right now.
- Normalizing the embedding?
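
Two of the items above (pickling the embedding after each epoch and normalizing it) could look roughly like the sketch below. This is not code from train.py: the function names, the file naming scheme and the choice of row-wise L2 normalization are all assumptions made for illustration. It also assumes the embedding is available as a 2-D NumPy array with one row per word, for example the value read out of a Theano shared variable with `get_value()`.

```python
import os
import pickle  # on Python 2, cPickle offers the same dump/load interface

import numpy as np


def save_embedding(embedding, epoch, out_dir="."):
    """Pickle the embedding matrix for one epoch and return the file path.

    The file name pattern is hypothetical, not something train.py defines.
    """
    path = os.path.join(out_dir, "embedding_epoch_%d.pkl" % epoch)
    with open(path, "wb") as f:
        pickle.dump(embedding, f, protocol=pickle.HIGHEST_PROTOCOL)
    return path


def load_embedding(path):
    """Load an embedding matrix written by save_embedding."""
    with open(path, "rb") as f:
        return pickle.load(f)


def normalize_rows(embedding, eps=1e-8):
    """Scale each row (one word vector) to unit L2 norm.

    Rows whose norm is below eps are divided by eps instead, so an
    all-zero row stays all-zero rather than producing NaNs.
    """
    norms = np.sqrt((embedding ** 2).sum(axis=1, keepdims=True))
    return embedding / np.maximum(norms, eps)
```

In a training loop, `save_embedding(matrix, epoch)` would be called once at the end of every epoch. Whether normalization helps here is still an open question, so `normalize_rows` is best tried on a saved copy first, not on the parameters being trained.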