https://riju.statuspage.io/incidents/xc559lskkttw was caused by an error which, for some reason, did not show up in the container logs, but was visible when I connected to the EC2 instance and tried to start a session manually:
```
admin@ip-172-31-1-13:~$ sudo docker exec -it riju-app-green bash
riju@93ea824572b0:/src$ make sandbox L=python
L=python node backend/sandbox.js
Starting session with UUID 3f13a0f56a4844d1b8972c0a2aed3102
docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: process_linux.go:508: setting cgroup config for procHooks process caused: failed to write "100000": write /sys/fs/cgroup/cpu,cpuacct/riju.slice/docker-fb919378f50b91e7e4e6e070b853342a3b4dbbb468dc5bfc4487264f0286050f.scope/cpu.cfs_quota_us: invalid argument: unknown.
ERRO[0000] error waiting for container: context canceled
container did not come up within 10 seconds (errno 17)
```
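The failing operation is the runtime writing a CFS quota into the new container's cgroup. As a rough way to check whether that write fails outside Docker too, here is a sketch that repeats the write by hand on the host, assuming the `riju.slice` path from the error message and a cgroup v1 hierarchy (the `quota-test` child cgroup is made up for this test):

```bash
# Inspect the CFS period/quota on the parent slice (path taken from the error).
cat /sys/fs/cgroup/cpu,cpuacct/riju.slice/cpu.cfs_period_us
cat /sys/fs/cgroup/cpu,cpuacct/riju.slice/cpu.cfs_quota_us

# Create a throwaway child cgroup and attempt the same write Docker made.
sudo mkdir /sys/fs/cgroup/cpu,cpuacct/riju.slice/quota-test
echo 100000 | sudo tee /sys/fs/cgroup/cpu,cpuacct/riju.slice/quota-test/cpu.cfs_quota_us

# If this also fails with "invalid argument", the problem is in the kernel or
# cgroup state rather than in Docker/the OCI runtime.
sudo rmdir /sys/fs/cgroup/cpu,cpuacct/riju.slice/quota-test
```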
For some reason, applying 0d92a77 to the production server appeared to trigger the issue above, and reverting those changes made it go away. However, additional testing left me uncertain whether those changes were actually the cause.
The issue may be due to kubernetes/kubernetes#72878, which points to a kernel bug that was patched some time ago. We would need to verify that the patch is included in the kernel version we are running on EC2.
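A minimal check, assuming a Debian-based image on the EC2 instance (package and changelog paths may differ on other distros), would be to look at the running kernel version and search its changelog for the CFS quota fix referenced from kubernetes/kubernetes#72878:

```bash
# Kernel version the EC2 instance is actually running.
uname -r

# On Debian-based images, the kernel package changelog can be searched for the
# relevant CFS/cgroup fix.
zless /usr/share/doc/linux-image-$(uname -r)/changelog.Debian.gz
```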