Summarization (sentence extraction) module based on Hidden Markov Models. Written mostly in summer 2013.
- Evaluation (forward algorithm)
- Decoding (Viterbi algorithm)
- Learning (counting)
- Implementation of Hidden Markov Models in C# link.
- Presentation about Hidden Markov Models link to PDF.
- Redis client for Go link.
- Multiple features for emissions.
- Integration with Redis DB (model parameters).
- Go concurrency features for better performance.
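
The evaluation and decoding steps listed above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the `HMM` struct, its field names, the `exampleModel` helper, and the two-state, two-symbol toy parameters are all assumptions made for the example. It works with plain probabilities for readability; a real implementation would likely use log-probabilities or scaling to avoid underflow on long texts.

```go
package main

import "fmt"

// HMM holds the parameters of a discrete hidden Markov model.
// The slice-based layout is an assumption made for this sketch.
type HMM struct {
	Initial    []float64   // Initial[i]: probability of starting in state i
	Transition [][]float64 // Transition[i][j]: probability of moving from state i to j
	Emission   [][]float64 // Emission[i][o]: probability of state i emitting symbol o
}

// Forward returns the probability of the observation sequence (evaluation).
func (m HMM) Forward(obs []int) float64 {
	if len(obs) == 0 {
		return 1
	}
	n := len(m.Initial)
	alpha := make([]float64, n)
	for i := range alpha {
		alpha[i] = m.Initial[i] * m.Emission[i][obs[0]]
	}
	for _, o := range obs[1:] {
		next := make([]float64, n)
		for j := 0; j < n; j++ {
			var sum float64
			for i := 0; i < n; i++ {
				sum += alpha[i] * m.Transition[i][j]
			}
			next[j] = sum * m.Emission[j][o]
		}
		alpha = next
	}
	var total float64
	for _, a := range alpha {
		total += a
	}
	return total
}

// Viterbi returns the most probable state sequence and its probability (decoding).
func (m HMM) Viterbi(obs []int) ([]int, float64) {
	if len(obs) == 0 {
		return nil, 1
	}
	n := len(m.Initial)
	delta := make([]float64, n)
	for i := range delta {
		delta[i] = m.Initial[i] * m.Emission[i][obs[0]]
	}
	back := make([][]int, len(obs)) // back[t][j]: best predecessor of state j at step t
	for t := 1; t < len(obs); t++ {
		back[t] = make([]int, n)
		next := make([]float64, n)
		for j := 0; j < n; j++ {
			best, arg := -1.0, 0
			for i := 0; i < n; i++ {
				if p := delta[i] * m.Transition[i][j]; p > best {
					best, arg = p, i
				}
			}
			next[j] = best * m.Emission[j][obs[t]]
			back[t][j] = arg
		}
		delta = next
	}
	last, bestP := 0, delta[0]
	for i, p := range delta {
		if p > bestP {
			last, bestP = i, p
		}
	}
	// Follow the back-pointers from the best final state.
	path := make([]int, len(obs))
	path[len(obs)-1] = last
	for t := len(obs) - 1; t > 0; t-- {
		path[t-1] = back[t][path[t]]
	}
	return path, bestP
}

// exampleModel builds a hypothetical toy model with two states and two symbols.
func exampleModel() HMM {
	return HMM{
		Initial:    []float64{0.6, 0.4},
		Transition: [][]float64{{0.7, 0.3}, {0.4, 0.6}},
		Emission:   [][]float64{{0.9, 0.1}, {0.2, 0.8}},
	}
}

func main() {
	m := exampleModel()
	obs := []int{0, 1}
	path, p := m.Viterbi(obs)
	fmt.Printf("likelihood=%.4f best path=%v (p=%.4f)\n", m.Forward(obs), path, p)
}
```

For sentence extraction, one possible reading is that each observation is a feature symbol for a sentence and the decoded state says whether that sentence belongs in the summary; the README does not spell out the state layout, so treat that mapping as an assumption.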
Running learning process:
$ ./summer <path to full text files> <path to text summaries>
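
Learning by counting can be illustrated with a small sketch. It assumes, hypothetically, that each training text has already been turned into a sequence of state labels (for example 1 when a sentence also appears in the summary, 0 otherwise) and estimates transition probabilities as relative frequencies. The function name `CountTransitions` and the labeling scheme are assumptions for illustration, not the module's API.

```go
package main

import "fmt"

// CountTransitions estimates transition probabilities from labeled state
// sequences by counting state pairs and normalizing each row.
func CountTransitions(sequences [][]int, numStates int) [][]float64 {
	counts := make([][]float64, numStates)
	for i := range counts {
		counts[i] = make([]float64, numStates)
	}
	for _, seq := range sequences {
		for t := 1; t < len(seq); t++ {
			counts[seq[t-1]][seq[t]]++
		}
	}
	for i := range counts {
		var total float64
		for _, c := range counts[i] {
			total += c
		}
		if total == 0 {
			continue // state never seen as a source; its row stays at zero
		}
		for j := range counts[i] {
			counts[i][j] /= total
		}
	}
	return counts
}

func main() {
	// Hypothetical labels for two training texts.
	labels := [][]int{{0, 1, 0, 0}, {1, 1, 0}}
	fmt.Println(CountTransitions(labels, 2))
}
```

Emission probabilities could be estimated the same way by counting (state, feature symbol) pairs; smoothing for unseen pairs is left out of this sketch.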
Summarizing a text:
$ ./summer <path to text file>
- Updating the model with unsupervised learning (Baum-Welch algorithm)
- Estimating the emission distribution (functions instead of slices)