[Feature request] Could a Keras train_on_batch API be added? Large datasets cannot be fully loaded into memory #336

Open
miaohancheng opened this issue Mar 9, 2020 · 5 comments
Labels: enhancement (New feature or request), pinned

Comments

@miaohancheng

miaohancheng commented Mar 9, 2020

https://keras-cn.readthedocs.io/en/latest/models/model/
Under model I only found fit and fit_without_generator; I did not see fit_generator or train_on_batch. Looking more closely, fit is implemented on top of fit_generator, but how can data be passed to fit in chunks?

@miaohancheng miaohancheng added the enhancement New feature or request label Mar 9, 2020
@BrikerMan
Owner

BrikerMan commented Mar 10, 2020

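Earlier versions did not take this case into account; I am working out how to implement it.
The difficulty is that I actually need to iterate over this generator twice.
The first pass builds the vocabulary and the label table, and the second pass does the training. Training already uses fit_generator, but vocabulary building does not yet, so the whole dataset has to be loaded.

If you have a good approach, suggestions are welcome~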
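
One possible shape for the two-pass idea described above, as a minimal sketch rather than Kashgari's actual implementation. It assumes the corpus can be re-opened through a zero-argument factory (here the hypothetical `toy_corpus`, standing in for reading a large file line by line), so neither pass holds the full dataset in memory. The names `build_vocab` and `batch_generator` are illustrative only, and padding and conversion to arrays are left out.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Sample = Tuple[List[str], str]  # (tokens, label)


def build_vocab(samples: Iterable[Sample],
                min_count: int = 1) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Pass 1: stream over the corpus once, keeping only counts in memory."""
    token_counts: Counter = Counter()
    labels = set()
    for tokens, label in samples:
        token_counts.update(tokens)
        labels.add(label)
    # Reserve index 0 for padding and 1 for unknown tokens.
    token2idx = {"<PAD>": 0, "<UNK>": 1}
    for token, count in token_counts.most_common():
        if count >= min_count:
            token2idx[token] = len(token2idx)
    label2idx = {label: i for i, label in enumerate(sorted(labels))}
    return token2idx, label2idx


def batch_generator(make_samples: Callable[[], Iterable[Sample]],
                    token2idx: Dict[str, int],
                    label2idx: Dict[str, int],
                    batch_size: int) -> Iterator[Tuple[List[List[int]], List[int]]]:
    """Pass 2: yield (x, y) batches forever, re-opening the corpus each epoch."""
    unk = token2idx["<UNK>"]
    while True:
        xs, ys = [], []
        for tokens, label in make_samples():
            xs.append([token2idx.get(t, unk) for t in tokens])
            ys.append(label2idx[label])
            if len(xs) == batch_size:
                yield xs, ys
                xs, ys = [], []
        if xs:  # last, possibly smaller, batch of the epoch
            yield xs, ys


def toy_corpus() -> Iterator[Sample]:
    """Hypothetical stand-in for lazily reading a large file."""
    yield ["hello", "world"], "greet"
    yield ["good", "bye", "world"], "part"
    yield ["hello", "again"], "greet"
```

The factory is called once for `build_vocab(toy_corpus())` and again at the start of every epoch inside `batch_generator`, which is what lets a one-shot generator be consumed twice. Before handing the batches to fit_generator they would still need padding to a common length and conversion to arrays.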

@miaohancheng
Author

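For our earlier text-semantics models, when the files were too large we trained word2vec with Spark MLlib, but that clearly will not work in this scenario.
My idea is to first build the word2vec model with the Keras fit_generator API, and then call fit_generator again for training.
This adds a word2vec model as an intermediate output of the training steps, which also makes troubleshooting easier.
See https://github.com/bojone/tf_word2vec/blob/master/word2vec_keras.py for a reference project.
That is my thinking; I am not sure whether it has problems, but I offer it for your reference.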
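
To make the word2vec step a bit more concrete, here is a minimal sketch of how skip-gram training pairs could be streamed from a corpus in batches. It is not taken from the linked project; the function names `skipgram_pairs` and `pair_batches` and the `window` parameter are assumptions for illustration, and negative sampling and the Keras model itself are omitted.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def skipgram_pairs(sentences: Iterable[List[str]],
                   token2idx: Dict[str, int],
                   window: int = 2) -> Iterator[Tuple[int, int]]:
    """Yield (center, context) index pairs one sentence at a time."""
    for tokens in sentences:
        # Drop out-of-vocabulary tokens before forming pairs.
        ids = [token2idx[t] for t in tokens if t in token2idx]
        for i, center in enumerate(ids):
            lo = max(0, i - window)
            hi = min(len(ids), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    yield center, ids[j]


def pair_batches(pairs: Iterable[Tuple[int, int]],
                 batch_size: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Group streamed pairs into (centers, contexts) batches."""
    centers, contexts = [], []
    for center, context in pairs:
        centers.append(center)
        contexts.append(context)
        if len(centers) == batch_size:
            yield centers, contexts
            centers, contexts = [], []
    if centers:  # flush the final partial batch
        yield centers, contexts
```

Because both functions are generators, only one sentence and one batch are held in memory at a time, so the same streamed corpus could feed the word2vec step and then the downstream training step.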

@BrikerMan
Owner

I have pushed a TF 2.0-based implementation to the kashgari2 branch. So far it only includes one classification model and a W2V embedding; feel free to test it.

@stale

stale bot commented Apr 23, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix This will not be worked on label Apr 23, 2020
@BrikerMan BrikerMan added pinned and removed wontfix This will not be worked on labels Apr 23, 2020
@BrikerMan BrikerMan added this to the Tensorflow 2.0 milestone Jun 18, 2020
@luozhouyang

I recommend building the data input pipeline with the tf.data API; it is powerful and convenient.


3 participants