
Problem with the loss calculation #1

Open
entalent opened this issue Feb 17, 2018 · 2 comments

Comments


entalent commented Feb 17, 2018

In tf_policy_value_net.py, line 54 defines the per-action probabilities as
self.action_probs = tf.nn.softmax(policy_net_out, name="policy_net_proba"),
and line 90 defines the loss as
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=self.action_probs, labels=self.mcts_probs).
However, the TensorFlow documentation carries a warning for tf.nn.softmax_cross_entropy_with_logits: do not feed it the output of a softmax, because the function applies softmax to its logits internally. So using the loss as written here seems problematic.
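To see why the warning matters, here is a minimal NumPy sketch (the numbers are made up for illustration): because `softmax_cross_entropy_with_logits` softmaxes its `logits` argument internally, passing `self.action_probs` applies softmax twice. The second softmax flattens the distribution, so the computed loss no longer matches the intended cross entropy.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_entropy(labels, probs):
    """H(labels, probs) = -sum(labels * log(probs))."""
    return -np.sum(labels * np.log(probs))

policy_net_out = np.array([2.0, 1.0, 0.1])  # hypothetical raw network output
mcts_probs = np.array([0.7, 0.2, 0.1])      # hypothetical MCTS visit-count targets

probs = softmax(policy_net_out)

# Intended loss: softmax applied exactly once to the raw output.
intended = cross_entropy(mcts_probs, probs)

# What the buggy call effectively computes: the op softmaxes its input
# internally, so feeding `probs` means softmax is applied a second time.
buggy = cross_entropy(mcts_probs, softmax(probs))

print(intended, buggy)  # the two losses differ
```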

Also, were the models in the model folder trained from scratch with this code, or obtained by directly converting the already-trained theano/pytorch models to TensorFlow?

zouyih (owner) commented Feb 17, 2018

Thanks for finding this bug for me! This needs to be fixed.
The models in the folder were obtained by converting the parameters trained with https://github.com/junxiaosong/AlphaZero_Gomoku to TensorFlow. Training 8*8 Gomoku (five in a row) directly is a bit slow; you can try a 6*6 board with four in a row to see the training behave. I suspect that making self.buffer_size = 10000 smaller would let it converge faster.
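The fix itself is to pass the raw, pre-softmax policy_net_out as the logits and keep self.action_probs only for action selection. A minimal sketch of the corrected wiring, mirroring the TF op's semantics in NumPy (variable names follow the issue; the values are illustrative, not from the repo):

```python
import numpy as np

def softmax_cross_entropy_with_logits(labels, logits):
    # Mirrors tf.nn.softmax_cross_entropy_with_logits: a numerically
    # stable log-softmax is applied to `logits` internally.
    z = logits - logits.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    return -(labels * log_probs).sum(axis=-1)

policy_net_out = np.array([[2.0, 1.0, 0.1]])  # raw network output, no softmax
mcts_probs = np.array([[0.7, 0.2, 0.1]])      # MCTS visit-count targets

# Fixed loss: feed the raw output, not softmax(policy_net_out).
cross_entropy = softmax_cross_entropy_with_logits(labels=mcts_probs,
                                                  logits=policy_net_out)
```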


Dave-he commented Apr 21, 2018

[screenshot of the error]
Likewise, this comes up for me when I run it.
