QA-rankit

Rank candidate answers for a given question.

Models

Unsupervised

We first tried several unsupervised models. Although they are simple and straightforward, they work well.

Word Overlap Count
IDF weighted
Q-A distance
...
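
As a rough illustration of the first two metrics, here is a minimal sketch, not the code used in main.ipynb. The function names (tokenize, overlap_count, build_idf, idf_overlap, rank), the whitespace tokenizer, and the smoothed IDF formula are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the repository may use NLTK instead.
    return text.lower().split()


def overlap_count(question, answer):
    """Number of distinct words shared by the question and the answer."""
    return len(set(tokenize(question)) & set(tokenize(answer)))


def build_idf(documents):
    """Smoothed inverse document frequency for every word in the corpus."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(tokenize(doc)))
    return {w: math.log((1.0 + n_docs) / (1.0 + df)) + 1.0
            for w, df in doc_freq.items()}


def idf_overlap(question, answer, idf):
    """Sum of the IDF weights of the shared words, so rare words count more."""
    shared = set(tokenize(question)) & set(tokenize(answer))
    return sum(idf.get(w, 0.0) for w in shared)


def rank(question, candidates, idf):
    """Return the candidates sorted from best to worst by IDF-weighted overlap."""
    return sorted(candidates,
                  key=lambda a: idf_overlap(question, a, idf),
                  reverse=True)
```

Because no training is involved, the same scores can later be reused as input features for the supervised models.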

Supervised

The metrics calculated by the unsupervised models can be used as features for supervised models. In addition, other models such as CNN and LSTM can be employed to extract more features.

In this program, we tried the following models:

Random Forest
Logistic Regression
Mixed CNN
Mixed LSTM

Among these models, the mixed LSTM achieved the best performance.

Source code

main.ipynb

Main code, edited using Jupyter Notebook.

This file is best opened with Jupyter Notebook.

If you don't have Jupyter Notebook installed on your computer, please try main.py.

What main.ipynb does:

read data
preprocess data
extract features
fit models (models are implemented in other source files)
evaluate models
predict on test dataset
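
The evaluation step can be illustrated with a small sketch. This README does not name the metric used, so Mean Reciprocal Rank (MRR) is assumed here only because it is a common choice for answer ranking; the function names are hypothetical and this is not the content of eval.py.

```python
def reciprocal_rank(scores, labels):
    """1 / rank of the first correct answer after sorting by score, or 0.0 if none is correct."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    for position, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(questions):
    """Average reciprocal rank over (scores, labels) pairs, one pair per question."""
    if not questions:
        return 0.0
    return sum(reciprocal_rank(s, l) for s, l in questions) / len(questions)
```

Each question contributes one list of model scores and one list of 0/1 labels, both in the same candidate order.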

main.py

.py version of main.ipynb.

MyLSTM.py

Implementation of an adapted LSTM model using Keras.

MyCNN.py

Implementation of an adapted CNN model using Keras.

MyGA.py, ParamGA.py

Implementation of an adapted Genetic Algorithm using DEAP.

This can be used to find good parameters for sklearn models such as RandomForestClassifier.
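
To show the idea of a genetic parameter search, here is a minimal sketch written in plain Python. It deliberately does not use DEAP and is not the code in MyGA.py or ParamGA.py; the function genetic_search, its arguments, and the selection, crossover, and mutation choices are all assumptions made for illustration.

```python
import random


def genetic_search(fitness, bounds, pop_size=20, generations=30,
                   mutation_rate=0.2, seed=0):
    """Maximize fitness(params) over integer parameters limited by bounds."""
    rng = random.Random(seed)
    names = sorted(bounds)

    def random_individual():
        return {n: rng.randint(bounds[n][0], bounds[n][1]) for n in names}

    def crossover(a, b):
        # Uniform crossover: each parameter comes from one of the two parents.
        return {n: (a[n] if rng.random() < 0.5 else b[n]) for n in names}

    def mutate(ind):
        # Resample each parameter with probability mutation_rate.
        return {n: (rng.randint(bounds[n][0], bounds[n][1])
                    if rng.random() < mutation_rate else ind[n])
                for n in names}

    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:pop_size // 2]  # truncation selection keeps the fitter half
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        population = parents + children  # parents survive, so the best score never drops
    return max(population, key=fitness)
```

In practice, fitness would be something like the cross-validated score of a model built from the given parameters, and bounds would map each parameter name to a (low, high) pair, for example {"n_estimators": (10, 200), "max_depth": (2, 20)}.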

PairWiseRanker.py

Implementation of pairwise ranking algorithm.
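
The general idea behind pairwise ranking can be sketched as follows. This is an illustration under assumptions, not the algorithm in PairWiseRanker.py: candidates of one question are turned into feature-difference pairs, and a simple perceptron (chosen here only for brevity) learns weights that order each pair correctly. All function names are hypothetical.

```python
def pairwise_examples(features, labels):
    """Build (difference vector, +1/-1) examples from the candidates of one question."""
    examples = []
    for i, (xi, yi) in enumerate(zip(features, labels)):
        for xj, yj in zip(features[i + 1:], labels[i + 1:]):
            if yi == yj:
                continue  # pairs with equal relevance carry no ordering information
            diff = [a - b for a, b in zip(xi, xj)]
            examples.append((diff, 1 if yi > yj else -1))
    return examples


def train_perceptron(examples, n_features, epochs=20):
    """Learn weights w so that w . (x_i - x_j) has the sign of the pair label."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for diff, sign in examples:
            margin = sign * sum(wk * dk for wk, dk in zip(w, diff))
            if margin <= 0:  # misordered pair: move w toward the correct ordering
                w = [wk + sign * dk for wk, dk in zip(w, diff)]
    return w


def score(w, x):
    """Ranking score of one candidate; higher means ranked earlier."""
    return sum(wk * xk for wk, xk in zip(w, x))
```

At prediction time, the candidates of a question are sorted by score(w, x) in descending order.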

util.py, word2vec_util.py

Utility functions.

Running environment settings

This program was developed under Python 2.7.

Packages that this program uses include:

Pandas
Numpy
DEAP
NLTK
Keras

This program also uses learning2rank. The original repository of learning2rank is https://github.com/shiba24/learning2rank. I forked it and made some modifications; the fork is at https://github.com/betterenvi/learning2rank. If you want to use learning2rank, the modified fork is the better choice.

learning2rank has its own package dependencies; please install them if you want to use learning2rank.

I may have missed some packages that this program actually uses, so I generated a requirements.txt file with the following command:

$ pip freeze > requirements.txt

You can then install all of the listed packages with the following command:

$ pip install -r requirements.txt

Many of the packages listed in requirements.txt are already included in Anaconda.
