CDSSM model doesnt work #800

tramya28 · 2019-11-06T03:14:30Z

Describe the Question

I tried to run the CDSSM model on the toy dataset provided and I get the following error.

Also, the word hashing (preprocessing) consumes a lot of memory. Is there a workaround that keeps word hashing but uses less memory?

I followed the MatchZoo tutorials and used the CDSSM code from wikiqa (https://github.com/NTMC-Community/MatchZoo/tree/master/tutorials/wikiqa).

P.S. I tried other models such as knrm, convknrm, dssm, arcII, duet and mvlstm, following their tutorials as well, and they all worked. I only have an issue with the CDSSM model.

uduse · 2019-11-06T15:19:41Z

#481 may help with your memory issue.

tramya28 · 2019-11-06T17:31:14Z

I figured out that I can use "with_word_hashing=False" for large datasets, but if I want to use word hashing, then it is not possible with large datasets, right? Please confirm.
Also, the CDSSM model doesn't work even with word hashing set to false. Please let me know if there is a fix!
Thanks

uduse · 2019-11-07T16:23:38Z

Word hashing takes a lot of space. Unless your machine has a huge amount of memory, it's not feasible to do it as part of preprocessing.
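To make the memory cost concrete, here is a rough sketch of letter-trigram word hashing and a back-of-the-envelope size estimate. It assumes every token is stored as a dense vector over the trigram vocabulary; the function names and the example numbers (20,000 texts, 10,000 distinct trigrams) are hypothetical and are not taken from MatchZoo itself.

```python
from typing import List


def letter_trigrams(word: str) -> List[str]:
    """Split a word into letter trigrams, using '#' as the word-boundary marker."""
    padded = f"#{word}#"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def dense_size_bytes(num_texts: int, fixed_length: int,
                     num_trigrams: int, bytes_per_value: int = 4) -> int:
    """Bytes needed if every token is a dense vector over the trigram vocabulary."""
    return num_texts * fixed_length * num_trigrams * bytes_per_value


print(letter_trigrams("good"))
# Hypothetical sizes: 20,000 texts, 10 tokens each, 10,000 distinct trigrams, float32.
print(dense_size_bytes(20_000, 10, 10_000) / 1e9, "GB")
```

Under these assumptions the size grows with texts × tokens × trigram vocabulary, which is why hashing a whole large dataset up front can exhaust memory.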

What do you mean by "CDSSM doesn't work"?

tramya28 · 2019-11-07T16:30:39Z

This is the error I get when I use the same code and the same dataset provided in the tutorial.

matthew-z · 2019-11-07T19:23:38Z

Did you run the wikiqa/cdssm ipynb?

I could not reproduce this error with MatchZoo 2.2.

Did you modify the notebook?

I think this error means that the labels are not in the correct form.

matthew-z · 2019-11-07T19:30:58Z

ranking_task = mz.tasks.Classification(num_classes=2)

This is why it failed: you loaded the ranking dataset but set the task to classification.
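For comparison, a minimal sketch of a task definition that matches data loaded with task='ranking', along the lines of the wikiqa tutorials. The helper name build_ranking_task is hypothetical, and the choice of loss and metric is an assumption rather than the only valid setup.

```python
def build_ranking_task():
    """Return a ranking task to pair with data loaded via task='ranking'."""
    # Imported inside the function so that defining it does not require MatchZoo.
    import matchzoo as mz

    ranking_task = mz.tasks.Ranking(loss=mz.losses.RankHingeLoss())
    ranking_task.metrics = [mz.metrics.MeanAveragePrecision()]
    return ranking_task


# Usage (assumes MatchZoo 2.x is installed):
#   model = mz.models.CDSSM()
#   model.params['task'] = build_ranking_task()
```

The point is only that the task type and the task argument passed to load_data should agree.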

tramya28 · 2019-11-07T19:33:23Z

This is the code that I used for the CDSSM model.

import keras
import pandas as pd
import numpy as np
import matchzoo as mz
import json
print('matchzoo version', mz.__version__)
print()

print('data loading ...')
train_pack_raw = mz.datasets.wiki_qa.load_data('train', task='ranking')
dev_pack_raw = mz.datasets.wiki_qa.load_data('dev', task='ranking', filtered=True)
test_pack_raw = mz.datasets.wiki_qa.load_data('test', task='ranking', filtered=True)
print('data loaded as train_pack_raw dev_pack_raw test_pack_raw')

preprocessor = mz.preprocessors.CDSSMPreprocessor(fixed_length_left=10, fixed_length_right=10)

train_processed = preprocessor.fit_transform(train_pack_raw)
valid_processed = preprocessor.transform(dev_pack_raw)
test_processed = preprocessor.transform(test_pack_raw)

model = mz.models.CDSSM()
model.params['input_shapes'] = preprocessor.context['input_shapes']
model.params['filters'] = 64
model.params['kernel_size'] = 3
model.params['strides'] = 1
model.params['padding'] = 'same'
model.params['conv_activation_func'] = 'tanh'
model.params['w_initializer'] = 'glorot_normal'
model.params['b_initializer'] = 'zeros'
model.params['mlp_num_layers'] = 1
model.params['mlp_num_units'] = 64
model.params['mlp_num_fan_out'] = 64
model.params['mlp_activation_func'] = 'tanh'
model.params['dropout_rate'] = 0.8
model.params['optimizer'] = 'adadelta'
model.guess_and_fill_missing_params()
model.build()
model.compile()
model.backend.summary()

pred_x, pred_y = train_processed[:].unpack()
evaluate = mz.callbacks.EvaluateAllMetrics(model, x=pred_x, y=pred_y, batch_size=len(pred_x))
train_generator = mz.DataGenerator(train_processed, batch_size=20, mode='pair', num_dup=2, num_neg=1)
print('num batches:', len(train_generator))

history = model.fit_generator(train_generator, epochs=20, callbacks=[evaluate], workers=1, use_multiprocessing=False)

print("done")

And I get this error.

matthew-z · 2019-11-07T20:07:07Z

Your MatchZoo is outdated. Please upgrade it.

MatchZoo 2.1 does not work with Keras 2.3.1.
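A quick way to see which versions are actually installed is sketched below. It uses only the standard library (Python 3.8+); the helper names are hypothetical, and the check encodes only the one combination reported in this thread, not a full compatibility matrix.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major_minor(version):
    """Turn '2.3.1' into (2, 3); assumes the first two components are numeric."""
    parts = (version.split(".") + ["0"])[:2]
    return int(parts[0]), int(parts[1])


def reported_incompatible(matchzoo_version, keras_version):
    """True only for the pair reported here: MatchZoo 2.1.x with Keras 2.3.1."""
    return major_minor(matchzoo_version) == (2, 1) and keras_version == "2.3.1"


for name in ("matchzoo", "keras"):
    print(name, installed_version(name))
```

If the output shows MatchZoo 2.1.x next to Keras 2.3.1, upgrading MatchZoo is the first thing to try.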

tramya28 · 2019-11-07T20:09:30Z

Can I clone the latest version from GitHub? When I do a pip install, it installs an older version of MatchZoo. Please let me know.
Thanks

matthew-z · 2019-11-08T05:28:11Z

Did you install MatchZoo with this command?
pip install -U matchzoo

Could you show us the log?

tramya28 added the question label Nov 6, 2019
