It's a chatbot written using TensorFlow and Python.
I used scripts from this tutorial. Thanks, TensorFlow!
- Clone or download this project.
- Export your Telegram data to json format.
- Check config.json and choose your preferences.
- Use pipenv to install dependencies:
  > pipenv sync
  It's about 1.5 GB so... be prepared.
  You can install them locally, inside the project. Just create a dir named .venv here before running pipenv sync.
- Now you can run the script using
  > pipenv shell
  > python Trainer.py
  I recommend running all 15 epochs.
- After training, use your model to predict some messages.
  Results can be... disappointing.
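To get a feel for what the exported json contains before training, the following is a minimal sketch of pulling plain text messages out of it. It is not the code used by this project. The layout it expects ("messages" at the top level for a single-chat export, or "chats" -> "list" for a full export, with "text" being either a string or a list of strings and dicts) is an assumption about the Telegram export format, and the helper names flatten_text, extract_messages and load_messages are made up for illustration. Measuring max_message_length in characters is also an assumption.

```python
import json
from pathlib import Path


def flatten_text(text):
    """Join a "text" field into one string.

    Assumption: the field is either a plain string or a list mixing
    strings and dicts (used for links, mentions and similar entities).
    """
    if isinstance(text, str):
        return text
    parts = []
    for piece in text:
        if isinstance(piece, str):
            parts.append(piece)
        elif isinstance(piece, dict):
            parts.append(piece.get("text", ""))
    return "".join(parts)


def extract_messages(export, max_message_length=None):
    """Return non-empty text messages from a parsed export (assumed layout)."""
    # Assumption: a full export nests chats under "chats" -> "list",
    # while a single-chat export has "messages" at the top level.
    if "chats" in export:
        chats = export["chats"].get("list", [])
    else:
        chats = [export]

    messages = []
    for chat in chats:
        for msg in chat.get("messages", []):
            if msg.get("type") != "message":
                continue  # skip service entries such as joins or calls
            text = flatten_text(msg.get("text", "")).strip()
            if not text:
                continue  # skip stickers, photos without captions, etc.
            if max_message_length is not None and len(text) > max_message_length:
                continue  # longer messages are omitted (length in characters)
            messages.append(text)
    return messages


def load_messages(path, max_message_length=None):
    """Read an export file from disk and extract its text messages."""
    with Path(path).open(encoding="utf-8") as handle:
        return extract_messages(json.load(handle), max_message_length)
```

Calling load_messages("result.json", max_message_length=100) would then return a list of strings, assuming your export follows the layout described above.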
You can find a recommended config.json in this repo.
- "telegram_export_path": string, path to your result.json file.
- "max_data_size": int, max size of your dataset. More == better model, but it all depends on your RAM or VRAM. If you exceed your RAM, change batch_size to a smaller value.
- "batch_size": int, bigger batch == faster training, but watch out for RAM.
- "epochs": int, how many epochs should be run in one python Train.py. More == better, but you can overtrain the model. Watch the loss value (less == better).
- "embedding_dims": int, dimension of the embedding layers. Recommended value is 256, but you can experiment with it.
- "rnn_units": int, number of units in the recurrent layers. Recommended value is 1024, but you can experiment with it.
- "dense_units": int, number of units in the fully-connected layers. Recommended value is 1024, but you can experiment with it.
- "enable_special_char": bool, if set to true, emojis will be included in the dataset. Set to false to remove emojis.
- "max_message_length": int, max length of a single message. Longer messages will be omitted.
- "checkpoint_dir": string, path where your training checkpoints are saved. They can take a lot of space (probably over 2 GB).
- "save_checkpoint": bool, if true, train.py will save a checkpoint from time to time. true is recommended.
- "save_checkpoint_for_epoch": int, how often train.py will save checkpoints, starting from epoch 1. For example, if the value is 2, it will save a checkpoint at epochs 1, 3, 5...
- "test_every_epoch": bool, if set to true, it will test the model at the end of every epoch with a random message from examples.
- "examples": array[string], array of messages used to test the model at the end of every epoch.
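The options above can be sketched as a Python dict plus a small type check. This is only an illustration, not the recommended config.json from this repo. The values 256, 1024, 1024 and 15 come from the recommendations in this README, and 40000 and 64 from the timing example in the FAQ. Every other value (paths, max_message_length, the example messages) is a placeholder chosen for the sketch. The helpers validate_config and checkpoint_epochs are hypothetical, and checkpoint_epochs only mirrors the "1, 3, 5..." schedule as described, not the actual training script.

```python
import json

# Expected Python type for every key described in the list above.
CONFIG_TYPES = {
    "telegram_export_path": str,
    "max_data_size": int,
    "batch_size": int,
    "epochs": int,
    "embedding_dims": int,
    "rnn_units": int,
    "dense_units": int,
    "enable_special_char": bool,
    "max_message_length": int,
    "checkpoint_dir": str,
    "save_checkpoint": bool,
    "save_checkpoint_for_epoch": int,
    "test_every_epoch": bool,
    "examples": list,
}

# Illustrative values only; see the lead-in for which ones are placeholders.
EXAMPLE_CONFIG = {
    "telegram_export_path": "result.json",
    "max_data_size": 40000,
    "batch_size": 64,
    "epochs": 15,
    "embedding_dims": 256,
    "rnn_units": 1024,
    "dense_units": 1024,
    "enable_special_char": False,
    "max_message_length": 100,
    "checkpoint_dir": "./checkpoints",
    "save_checkpoint": True,
    "save_checkpoint_for_epoch": 2,
    "test_every_epoch": True,
    "examples": ["Hi", "How are you?"],
}


def validate_config(config):
    """Return a list of problems; an empty list means the types look right."""
    problems = []
    for key, expected in CONFIG_TYPES.items():
        if key not in config:
            problems.append(f"missing key: {key}")
            continue
        value = config[key]
        # bool is a subclass of int in Python, so reject it for int fields.
        if expected is int and isinstance(value, bool):
            matches = False
        else:
            matches = isinstance(value, expected)
        if not matches:
            problems.append(f"{key}: expected {expected.__name__}")
    examples = config.get("examples")
    if isinstance(examples, list) and not all(isinstance(e, str) for e in examples):
        problems.append("examples: expected a list of strings")
    return problems


def checkpoint_epochs(total_epochs, every):
    """Epochs at which a checkpoint would be saved, counting from epoch 1."""
    return list(range(1, total_epochs + 1, every))


if __name__ == "__main__":
    print(json.dumps(EXAMPLE_CONFIG, indent=2))
    print(validate_config(EXAMPLE_CONFIG))
    print(checkpoint_epochs(6, 2))
```

Running a check like this before training can catch a mistyped value, such as "batch_size": "64", before it costs you an epoch.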
- What is your setup?
  I'm using an Nvidia GTX 1070 to train my model.
- How much time will an epoch take?
  For 40 000 messages and batch_size 64, one epoch took me 420 s.
- How many messages should I include?
  Basically, the more the better.
- Okay. I've trained it. Now what?
  Now you can use it in an app or for a bot. See Chat.py for an example.
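The timing figure from the FAQ (40 000 messages, batch_size 64, 420 s per epoch on a GTX 1070) can be turned into a rough estimate for a full run. This is a back-of-the-envelope sketch: it assumes each message is one training example and that epoch time stays constant, neither of which is guaranteed by the training script. The function names are made up for illustration.

```python
import math


def steps_per_epoch(num_messages, batch_size):
    """Number of batches per epoch, assuming one example per message."""
    return math.ceil(num_messages / batch_size)


def total_training_seconds(seconds_per_epoch, epochs):
    """Total time, assuming every epoch takes the same time."""
    return seconds_per_epoch * epochs


if __name__ == "__main__":
    steps = steps_per_epoch(40_000, 64)       # 625 batches per epoch
    total = total_training_seconds(420, 15)   # 6300 s for 15 epochs
    print(f"{steps} steps per epoch, about {420 / steps:.2f} s per step")
    print(f"15 epochs: about {total / 60:.0f} minutes")
```

On the setup above this works out to roughly 105 minutes for the recommended 15 epochs. Expect different numbers on other hardware or with a different batch_size.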
If you want help with using the model in your own app or bot, open an issue, but keep in mind that I've just started learning ML and stuff.
Do you have a suggestion to improve this? Great. I'll be very happy to collaborate with you. Let's start by opening a new issue.