Training does not converge #59

Open
SmartMachineBay opened this issue Mar 9, 2018 · 3 comments
Comments

@SmartMachineBay

Hi,

When I change the group convolution to a depthwise convolution, the network training does not converge. What could be the reason? Thanks!

Modified layer:
layer {
  name: "conv2_1/dpwise"
  type: "DepthwiseConvolution"
  bottom: "conv2_1/expand/bn"
  top: "conv2_1/dwise"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  convolution_param {
    num_output: 32
    bias_term: false
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "msra"
    }
  }
}
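For comparison, the grouped-convolution form this replaced would typically look like the sketch below. This is a guess based on common MobileNet-v2 Caffe prototxts, not the exact original (fields such as `stride` may differ in the real model); the key line is `group: 32`, which makes a standard `Convolution` layer depthwise by giving each output channel its own filter group. Note that some custom depthwise implementations also still require the `group` field, which is absent in the modified layer above.

```protobuf
layer {
  name: "conv2_1/dwise"
  type: "Convolution"
  bottom: "conv2_1/expand/bn"
  top: "conv2_1/dwise"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  convolution_param {
    num_output: 32
    bias_term: false
    pad: 1
    kernel_size: 3
    group: 32   # group == num_output => one filter per input channel (depthwise)
    weight_filler {
      type: "msra"
    }
  }
}
```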
Training log:
I0309 13:11:24.893823 18939 solver.cpp:272] Solving MOBILENET_V2
I0309 13:11:24.893851 18939 solver.cpp:273] Learning Rate Policy: poly
I0309 13:11:25.859395 18939 solver.cpp:218] Iteration 0 (0 iter/s, 0.915814s/20 iters), loss = 7.08397
I0309 13:11:25.859428 18939 solver.cpp:237] Train net output #0: loss = 7.08397 (* 1 = 7.08397 loss)
I0309 13:11:25.859437 18939 sgd_solver.cpp:105] Iteration 0, lr = 0.045
I0309 13:11:36.243708 18939 solver.cpp:218] Iteration 20 (1.92605 iter/s, 10.3839s/20 iters), loss = 6.98624
I0309 13:11:36.243757 18939 solver.cpp:237] Train net output #0: loss = 6.98624 (* 1 = 6.98624 loss)
I0309 13:11:36.243767 18939 sgd_solver.cpp:105] Iteration 20, lr = 0.0449991
I0309 13:11:46.530542 18939 solver.cpp:218] Iteration 40 (1.94431 iter/s, 10.2864s/20 iters), loss = 6.92067
I0309 13:11:46.530589 18939 solver.cpp:237] Train net output #0: loss = 6.92067 (* 1 = 6.92067 loss)
I0309 13:11:46.530599 18939 sgd_solver.cpp:105] Iteration 40, lr = 0.0449982
I0309 13:11:56.811890 18939 solver.cpp:218] Iteration 60 (1.94534 iter/s, 10.281s/20 iters), loss = 6.92625
I0309 13:11:56.812000 18939 solver.cpp:237] Train net output #0: loss = 6.92625 (* 1 = 6.92625 loss)
I0309 13:11:56.812011 18939 sgd_solver.cpp:105] Iteration 60, lr = 0.0449973
I0309 13:12:07.103955 18939 solver.cpp:218] Iteration 80 (1.94333 iter/s, 10.2916s/20 iters), loss = 6.91425
I0309 13:12:07.104001 18939 solver.cpp:237] Train net output #0: loss = 6.91425 (* 1 = 6.91425 loss)
I0309 13:12:07.104009 18939 sgd_solver.cpp:105] Iteration 80, lr = 0.0449964
I0309 13:12:17.393060 18939 solver.cpp:218] Iteration 100 (1.94388 iter/s, 10.2887s/20 iters), loss = 6.91382
I0309 13:12:17.393095 18939 solver.cpp:237] Train net output #0: loss = 6.91382 (* 1 = 6.91382 loss)
I0309 13:12:17.393105 18939 sgd_solver.cpp:105] Iteration 100, lr = 0.0449955
I0309 13:12:27.754611 18939 solver.cpp:218] Iteration 120 (1.93029 iter/s, 10.3611s/20 iters), loss = 87.3365
I0309 13:12:27.754704 18939 solver.cpp:237] Train net output #0: loss = 87.3365 (* 1 = 87.3365 loss)
I0309 13:12:27.754716 18939 sgd_solver.cpp:105] Iteration 120, lr = 0.0449946
I0309 13:12:38.106243 18939 solver.cpp:218] Iteration 140 (1.93215 iter/s, 10.3512s/20 iters), loss = 87.3365
I0309 13:12:38.106288 18939 solver.cpp:237] Train net output #0: loss = 87.3365 (* 1 = 87.3365 loss)
I0309 13:12:38.106298 18939 sgd_solver.cpp:105] Iteration 140, lr = 0.0449937
I0309 13:12:48.467030 18939 solver.cpp:218] Iteration 160 (1.93043 iter/s, 10.3604s/20 iters), loss = 87.3365
I0309 13:12:48.467075 18939 solver.cpp:237] Train net output #0: loss = 87.3365 (* 1 = 87.3365 loss)
I0309 13:12:48.467085 18939 sgd_solver.cpp:105] Iteration 160, lr = 0.0449928
I0309 13:12:58.827340 18939 solver.cpp:218] Iteration 180 (1.93052 iter/s, 10.3599s/20 iters), loss = 87.3365
I0309 13:12:58.827447 18939 solver.cpp:237] Train net output #0: loss = 87.3365 (* 1 = 87.3365 loss)
I0309 13:12:58.827459 18939 sgd_solver.cpp:105] Iteration 180, lr = 0.0449919
I0309 13:13:09.212028 18939 solver.cpp:218] Iteration 200 (1.926 iter/s, 10.3842s/20 iters), loss = 87.3365
I0309 13:13:09.212060 18939 solver.cpp:237] Train net output #0: loss = 87.3365 (* 1 = 87.3365 loss)
I0309 13:13:09.212086 18939 sgd_solver.cpp:105] Iteration 200, lr = 0.044991
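The constant loss of 87.3365 in the log above is itself diagnostic: Caffe's `SoftmaxWithLoss` clamps the predicted probability at `FLT_MIN` before taking the log, so a network whose outputs have diverged (the true-class probability underflows to zero) reports exactly `-log(FLT_MIN)`. A quick sanity check:

```python
import math

# Caffe computes loss as -log(max(prob, FLT_MIN)); once the net diverges,
# prob underflows to 0 and every iteration reports the same clamped value.
FLT_MIN = 1.17549435e-38  # smallest normalized single-precision float
loss = -math.log(FLT_MIN)
print(round(loss, 4))  # → 87.3365
```

So the jump from ~6.9 to a frozen 87.3365 at iteration 120 means the activations blew up (NaN/inf), not that the loss merely plateaued.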

@beniz

beniz commented May 1, 2018

Remove the `use_global_stats: true` line in each of the BatchNorm layers.
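Concretely, this means changing each BatchNorm layer from something like the sketch below (layer names are illustrative, not from the original prototxt). With `use_global_stats: true`, BatchNorm normalizes with the stored running mean/variance instead of the current batch statistics; when training from scratch those stored statistics are uninitialized, so the loss can explode exactly as in the log above.

```protobuf
layer {
  name: "conv2_1/expand/bn"
  type: "BatchNorm"
  bottom: "conv2_1/expand"
  top: "conv2_1/expand/bn"
  batch_norm_param {
    use_global_stats: true   # remove this line (or set it to false) for training
  }
}
```

Leaving the field unset lets Caffe pick the right behavior automatically: batch statistics in the TRAIN phase, accumulated global statistics in the TEST phase.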

@qinxianyuzi

Does changing the group convolution to a depthwise convolution reduce performance?

@feymanpriv

@SmartMachineBay I get the same loss value as you when I fine-tune the network on my own dataset. Do you have any ideas?
