In calculating gradients, the gradient of the softmax function is not computed using the formula derived in the lecture notes. It seems this step is skipped in the code, and only the gradient of the cost function with respect to yhat (the 'd3' variable) is used. Am I missing something here?
I found nothing wrong with the code in Backprop. When the cross-entropy loss is combined with softmax, the gradient of the loss with respect to the pre-softmax scores simplifies to yhat - labels, so there is no need to apply the softmax Jacobian separately. You can find more details at https://deepnotes.io/softmax-crossentropy
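A quick way to convince yourself is to compare the simplified gradient against a numerical gradient check. The sketch below is illustrative only and does not use the repo's actual variable names ('d3', etc.); `z` stands for the pre-softmax scores and `labels` for one-hot targets:

```python
import numpy as np

def softmax(z):
    # Subtract the row max for numerical stability before exponentiating.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(yhat, labels):
    # Mean negative log-likelihood of the correct classes (labels are one-hot).
    return -np.mean(np.sum(labels * np.log(yhat), axis=1))

rng = np.random.default_rng(0)
z = rng.normal(size=(4, 3))                     # hypothetical pre-softmax scores
labels = np.eye(3)[rng.integers(0, 3, size=4)]  # hypothetical one-hot targets

# Analytic gradient of the mean loss w.r.t. z: (yhat - labels) / batch_size
yhat = softmax(z)
analytic = (yhat - labels) / z.shape[0]

# Numerical gradient via central differences, for comparison
eps = 1e-6
numeric = np.zeros_like(z)
for i in range(z.shape[0]):
    for j in range(z.shape[1]):
        zp, zm = z.copy(), z.copy()
        zp[i, j] += eps
        zm[i, j] -= eps
        numeric[i, j] = (cross_entropy(softmax(zp), labels) -
                         cross_entropy(softmax(zm), labels)) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-6))  # True
```

The two gradients agree, which is why the backprop code can use yhat - labels directly instead of multiplying the cost gradient by the softmax Jacobian.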