Attention over Attention

This is adapted from another GitHub user's code; I only modified a small part of it so that the code supports TensorFlow 1.0 and above.
The original code is available at: https://github.com/OlavHN/attention-over-attention
To run the code:
1. Download the dataset.
2. Run reader.py to convert the raw dataset into .tfrecords files for efficient reading (a sketch of the TFRecord-writing pattern follows this list).
3. Run model.py to train the model.
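
The conversion in step 2 follows the standard TF 1.x TFRecord pattern. Below is a minimal sketch of that pattern, assuming (document, query, answer) triples already tokenized and mapped to word ids; the feature names and the write_examples helper are illustrative, not the actual schema used by reader.py.

```python
import tensorflow as tf

def write_examples(examples, path):
    """Write (doc_ids, query_ids, answer_id) triples as one tf.train.Example each.

    Illustrative only -- the field names below are assumptions, not the
    repository's actual .tfrecords schema.
    """
    with tf.python_io.TFRecordWriter(path) as writer:
        for doc_ids, query_ids, answer_id in examples:
            features = tf.train.Features(feature={
                'document': tf.train.Feature(
                    int64_list=tf.train.Int64List(value=doc_ids)),
                'query': tf.train.Feature(
                    int64_list=tf.train.Int64List(value=query_ids)),
                'answer': tf.train.Feature(
                    int64_list=tf.train.Int64List(value=[answer_id])),
            })
            writer.write(tf.train.Example(features=features).SerializeToString())

# Toy usage: two short "documents" already mapped to word ids.
write_examples([([1, 2, 3, 4], [2, 0, 4], 3),
                ([5, 6, 7], [6, 0], 7)], 'train.tfrecords')
```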

The README from the original repository follows.
Implementation of the paper Attention-over-Attention Neural Networks for Reading Comprehension in TensorFlow

Some context on my blog

Cloze-style reading comprehension removes a word from an article summary; the task is then to read the article and infer the missing word. This example works on the CNN news dataset.
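
As a toy illustration of the cloze setup (the strings below are made up, but follow the anonymized @entityN / @placeholder convention of the CNN dataset):

```python
# One training example: a blanked-out summary sentence (the query) must be
# completed with an entity from the article.
article = ("@entity1 announced a new factory in @entity2 on monday , "
           "creating hundreds of jobs in the region .")
query = "@placeholder announced a new factory in @entity2"  # word removed
answer = "@entity1"  # the missing word to infer
```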

With the same hyperparameters as reported in the paper, this implementation got an accuracy of 74.3% on both the validation and test set, compared with 73.1% and 74.4% reported by the author.

To train a new model: python model.py --training=True --name=my_model

To test accuracy: python model.py --training=False --name=my_model --epochs=1 --dropout_keep_prob=1

Note that the .tfrecords and model files are stored with Git LFS.

Raw data for use with reader.py to produce .tfrecords files was downloaded from http://cs.nyu.edu/~kcho/DMQA/

Interesting parts

  • Masked softmax implementation (a sketch of the general pattern follows this list)
  • Example of batched sparse tensors with correct mask handling
  • Example of pointer style attention
  • Test/validation split part of the tf-graph
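
The masked softmax mentioned above is what keeps padded positions in a batch of variable-length documents from receiving attention weight. A minimal sketch of the idea, assuming TF 1.x; this shows the general pattern rather than the repository's exact implementation:

```python
import tensorflow as tf

def masked_softmax(logits, lengths):
    """Softmax over the last axis that assigns zero weight to padded positions.

    logits:  [batch, max_len] attention scores.
    lengths: [batch] true sequence lengths.
    """
    mask = tf.sequence_mask(lengths, maxlen=tf.shape(logits)[1], dtype=tf.float32)
    # Subtract the row max for numerical stability, exponentiate, zero out the
    # padded positions, then renormalize so each row sums to one.
    exp = tf.exp(logits - tf.reduce_max(logits, axis=1, keep_dims=True)) * mask
    return exp / tf.reduce_sum(exp, axis=1, keep_dims=True)
```

In the AoA model, softmaxes of this kind are applied column-wise and row-wise to the document-query matching matrix to obtain the two attention distributions that are then combined (the "attention over attention" step).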
