
Getting gibberish predictions when using recurrent LSTM and arrays of strings as output training data #799

Open
chrisvel opened this issue May 20, 2022 · 2 comments

@chrisvel

I have a bunch of sentences and I want to generate tags for each one. My data is simple:

[
  { input: 'Buy tickets for opera', output: ['errands', 'orders'] },
  { input: 'Clean garage', output: ['errands', 'home'] },
  ....
]

I am training the model simply with:

const network = new brain.recurrent.LSTM();

network.train(trainingData, {
  log: (error) => console.log(error),
  iterations: 1000,
});

When I run:

network.run('Some random text');

sometimes I get a correct tag, but other times it returns gibberish with random characters, or the tags joined together into one string. For example, the sentence "Service my XBox dvd drive" returns this output:

["sco comhermer fililys.AAouto afamily.shopping"] 

I read somewhere that an LSTM cannot do classification, so I am OK with that, but what do you suggest?

Would something like a matching table between numbers and tag words work? Something like:

1: orders
2: errands
3: family 
4: personal 
.....

and then using those numbers as the outputs in my training data?

Is this simply not expected to work the way I am hoping, or is it a bug?
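The matching table idea above can be sketched in plain JavaScript. This is a minimal sketch; the helper names `buildTagIndex` and `encodeTags` are hypothetical, not part of brain.js, and a feed-forward network would still need the input sentence encoded as numbers too (e.g. bag-of-words):

```javascript
// Build a lookup table over all tags seen in the training data,
// e.g. { errands: 1, orders: 2, home: 3 }.
function buildTagIndex(trainingData) {
  const index = {};
  let next = 1;
  for (const { output } of trainingData) {
    for (const tag of output) {
      if (!(tag in index)) index[tag] = next++;
    }
  }
  return index;
}

// Encode one example's tags as numbers using the table.
function encodeTags(tags, index) {
  return tags.map((tag) => index[tag]);
}

const trainingData = [
  { input: 'Buy tickets for opera', output: ['errands', 'orders'] },
  { input: 'Clean garage', output: ['errands', 'home'] },
];

const index = buildTagIndex(trainingData);
console.log(index);                           // { errands: 1, orders: 2, home: 3 }
console.log(encodeTags(['errands', 'home'], index)); // [ 1, 3 ]
```

Note that mapping tags to arbitrary integers makes the network treat them as ordered quantities; for classification, a one-hot style output (one 0/1 value per tag) is usually a better fit.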

@nilooy

nilooy commented Jun 11, 2022

Same for me. I'm passing strings in both the input and the output objects, but when I run it, it gives mixed/gibberish output.

@purplnay

purplnay commented Jan 2, 2023

The gibberish is probably due to two things: the model learns to read and write character by character, not word by word, and 1,000 iterations seems quite low for training.
For now I would recommend splitting your input into words, and doing the same for your output: join the tags with a space so they don't collapse into "errandsorders".
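The space-joining suggestion above can be sketched as a small pre/post-processing step (a sketch, assuming the tags are joined into one space-separated string for training and split back apart after `run`; the helper names are made up):

```javascript
// Join each example's tags with spaces so the LSTM learns
// "errands orders" rather than the collapsed "errandsorders".
function toStringOutput(example) {
  return { input: example.input, output: example.output.join(' ') };
}

// Split the model's raw string prediction back into tags.
function toTags(prediction) {
  return prediction.trim().split(/\s+/);
}

const example = { input: 'Buy tickets for opera', output: ['errands', 'orders'] };
console.log(toStringOutput(example).output); // 'errands orders'
console.log(toTags('errands orders'));       // [ 'errands', 'orders' ]

// Usage with brain.js (untested sketch, more iterations than the original):
// const net = new brain.recurrent.LSTM();
// net.train(trainingData.map(toStringOutput), { iterations: 5000 });
// const tags = toTags(net.run('Some random text'));
```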

I've filed a feature request here, #871, since I think this is a recurring issue that is not well documented.

@robertleeplummerjr robertleeplummerjr self-assigned this Apr 25, 2023