This repository has been archived by the owner on Sep 18, 2020. It is now read-only.

segvault when updating / stuck updating #184

Open

johanneswuerbach opened this issue Aug 27, 2018 · 2 comments

Comments

@johanneswuerbach

Recently CLUO (v0.7.0) seems to have gotten stuck and continuously tried to update the same node.

CoreOS: CoreOS 1800.5.0
Kubernetes: v1.9.9
Cloud: AWS us-east-1, kops 1.10

Agent logs:

I0827 23:00:28.134375       1 agent.go:184] Node drained, rebooting
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x1238536]
 goroutine 33 [running]:
github.com/coreos/container-linux-update-operator/pkg/updateengine.(*Client).ReceiveStatuses(0xc4203c6660, 0xc420052480, 0xc420052300)
	/go/src/github.com/coreos/container-linux-update-operator/pkg/updateengine/client.go:99 +0x186
created by github.com/coreos/container-linux-update-operator/pkg/agent.(*Klocksmith).watchUpdateStatus
	/go/src/github.com/coreos/container-linux-update-operator/pkg/agent/agent.go:251 +0x102

Controller logs:

I0827 23:03:26.319148       1 operator.go:449] Found node "ip-10-100-24-49.ec2.internal" still rebooting, waiting
I0827 23:03:26.319172       1 operator.go:451] Found 1 (of max 1) rebooting nodes; waiting for completion
I0827 23:03:59.455065       1 operator.go:507] Found 0 rebooted nodes
I0827 23:03:59.719801       1 operator.go:449] Found node "ip-10-100-24-49.ec2.internal" still rebooting, waiting
I0827 23:03:59.720003       1 operator.go:451] Found 1 (of max 1) rebooting nodes; waiting for completion
I0827 23:04:32.719449       1 operator.go:507] Found 0 rebooted nodes
I0827 23:04:33.119047       1 operator.go:449] Found node "ip-10-100-24-49.ec2.internal" still rebooting, waiting
I0827 23:04:33.119072       1 operator.go:451] Found 1 (of max 1) rebooting nodes; waiting for completion
I0827 23:05:06.520970       1 operator.go:507] Found 0 rebooted nodes
I0827 23:05:06.918956       1 operator.go:449] Found node "ip-10-100-24-49.ec2.internal" still rebooting, waiting
I0827 23:05:06.918976       1 operator.go:451] Found 1 (of max 1) rebooting nodes; waiting for completion
I0827 23:05:39.920518       1 operator.go:507] Found 0 rebooted nodes
I0827 23:05:40.320071       1 operator.go:449] Found node "ip-10-100-24-49.ec2.internal" still rebooting, waiting
I0827 23:05:40.320094       1 operator.go:451] Found 1 (of max 1) rebooting nodes; waiting for completion
I0827 23:06:13.719273       1 operator.go:507] Found 1 rebooted nodes
I0827 23:06:14.519760       1 operator.go:449] Found node "ip-10-100-24-49.ec2.internal" still rebooting, waiting
I0827 23:06:14.519909       1 operator.go:451] Found 1 (of max 1) rebooting nodes; waiting for completion
johanneswuerbach changed the title from "segvault when updating" to "segvault when updating / stuck updating" on Aug 28, 2018
@johanneswuerbach
Author

Looks like downgrading to v0.6.0 has solved the issue for us.

@sdemos
Contributor

sdemos commented Aug 29, 2018

The panic you are running into looks the same as the one in #93, which is odd because, according to that issue, it should have been fixed in v0.7.0. I'll have to try to reproduce it; I don't remember much about it.

As for the stuck updates, the panic shouldn't have anything to do with them; the panic happens because the D-Bus channel gets closed underneath the watch function when the system goes down for the reboot.
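
For context, here's a minimal sketch of that failure mode. It is not the actual ReceiveStatuses code and the names are made up; it just shows how a signal channel that dbus closes during shutdown produces the nil pointer dereference in the trace above, and the kind of nil check that would turn it into a clean return instead:

```go
package updatewatch

import "github.com/godbus/dbus"

// watchSignals is an illustrative sketch, not the CLUO implementation.
// When the host shuts down for the reboot, dbus closes the signal channel
// out from under the watcher. A receive on a closed channel yields the zero
// value, here a nil *dbus.Signal, and dereferencing signal.Body is the
// SIGSEGV seen in the agent logs.
func watchSignals(signalCh <-chan *dbus.Signal, stop <-chan struct{}) {
	for {
		select {
		case <-stop:
			return
		case signal := <-signalCh:
			if signal == nil {
				// Channel closed underneath us; the reboot is already in progress.
				return
			}
			_ = signal.Body // the real code builds an update status from signal.Body
		}
	}
}
```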

Can you post the operator deployment for the failing one? Do you have a reboot window or any pre- or post-reboot hooks configured? It might also be helpful to get some of the debug logs, which you can do by adding the flag -v 4 to the operator deployment.
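
Roughly, assuming the stock update-operator Deployment layout (the container name, image, and command here are illustrative and may differ in your manifest), that flag change would look something like:

```yaml
# Fragment of the operator Deployment's pod spec, for illustration only.
# The relevant change is passing "-v" "4" to the operator binary.
containers:
  - name: update-operator
    image: quay.io/coreos/container-linux-update-operator:v0.7.0
    command:
      - /bin/update-operator
      - -v
      - "4"
```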
