TEST-46-HOMED is flaky in GitHub Actions #32727

Open
DaanDeMeyer opened this issue May 9, 2024 · 4 comments · Fixed by #32728
Labels
bug 🐛 (Programming errors, that need preferential fixing), tests

Comments

@DaanDeMeyer (Contributor)

systemd version the issue has been seen with

main

Used distribution

No response

Linux kernel version used

No response

CPU architectures issue was seen on

None

Component

No response

Expected behaviour you didn't see

https://github.com/systemd/systemd/actions/runs/9013386982/job/24764128257

gh run download 9013386982 --name ci-mkosi-9013386982-1-centos-9-failed-test-journals -D ci/ci-mkosi-9013386982-1-centos-9-failed-test-journals && journalctl --file ci/ci-mkosi-9013386982-1-centos-9-failed-test-journals/test/journal/TEST-46-HOMED.journal --no-hostname -o short-monotonic -u testsuite-46.service -p info

Unexpected behaviour you saw

No response

Steps to reproduce the problem

No response

Additional program output to the terminal or log subsystem illustrating the issue

29/57 systemd:integration-tests / TEST-46-HOMED                 FAIL             50.54s   exit status 1
>>> MALLOC_PERTURB_=46 SYSTEMD_INTEGRATION_TESTS=1 /home/runner/work/systemd/systemd/test/integration-test-wrapper.py --meson-source-dir /home/runner/work/systemd/systemd --meson-build-dir /mnt/mkosi/build --test-name TEST-46-HOMED --test-number 46 --storage volatile --firmware linux --
――――――――――――――――――――――――――――――――――――― ✀  ―――――――――――――――――――――――――――――――――――――
Listing only the last 100 lines from a long log.
[   44.876944] testsuite-46.sh[523]: + homectl activate blob-user
[   45.276489] testsuite-46.sh[523]: + inspect blob-user
[   45.278862] testsuite-46.sh[523]: + local USERNAME=blob-user
[   45.281470] testsuite-46.sh[996]: + tee /tmp/a
[   45.284098] testsuite-46.sh[995]: + homectl inspect blob-user
[   45.296852] testsuite-46.sh[996]:    User name: blob-user
[   45.298966] testsuite-46.sh[996]:        State: active
[   45.301266] testsuite-46.sh[996]:  Disposition: regular
[   45.303798] testsuite-46.sh[996]:  Last Change: Thu 2024-05-09 06:58:32 UTC
[   45.306503] testsuite-46.sh[996]:     Login OK: yes
[   45.308731] testsuite-46.sh[996]:  Password OK: yes
[   45.310887] testsuite-46.sh[996]:          UID: 12345
[   45.313118] testsuite-46.sh[996]:          GID: 12345 (blob-user)
[   45.315397] testsuite-46.sh[996]:    Directory: /home/blob-user
[   45.317440] testsuite-46.sh[996]:    Blob Dir.: /var/cache/systemd/home/blob-user
[   45.320045] testsuite-46.sh[996]:               ├─ test1 (436a101d6264b1daa5c5e2ba362407e1fc743126a7a18c09b381bc431ab8c08b)
[   45.323868] testsuite-46.sh[996]:               └─ test2 (5a43615af72cda81a2d8dc79ac115e2b3d85b027fd6c7373eb8cb8617be5978a)
[   45.328015] testsuite-46.sh[996]:      Storage: luks (strong encryption)
[   45.330650] testsuite-46.sh[996]:   Image Path: /home/blob-user.home
[   45.333148] testsuite-46.sh[996]:    Removable: no
[   45.335192] testsuite-46.sh[996]:        Shell: /bin/bash
[   45.337314] testsuite-46.sh[996]:  Access Mode: 0700
[   45.339652] testsuite-46.sh[996]: LUKS Discard: online=yes offline=yes
[   45.342060] testsuite-46.sh[996]:    LUKS UUID: 21b0c4d3-9c58-476d-a18d-4cf49ea5e44d
[   45.344932] testsuite-46.sh[996]:    Part UUID: 0d7cb051-5866-43d3-9d83-e184ae936a3f
[   45.347783] testsuite-46.sh[996]:      FS UUID: 3aa67bea-1be4-490f-bffa-4b049ec69f93
[   45.350621] testsuite-46.sh[996]:  File System: ext4
[   45.352942] testsuite-46.sh[996]:  LUKS Cipher: aes
[   45.355010] testsuite-46.sh[996]:  Cipher Mode: xts-plain64
[   45.359219] testsuite-46.sh[996]:   Volume Key: 256bit
[   45.362395] testsuite-46.sh[996]:   PBKDF Type: pbkdf2
[   45.365998] testsuite-46.sh[996]:   PBKDF Time: 1ms
[   45.369142] testsuite-46.sh[996]:  Mount Flags: nosuid nodev exec
[   45.372982] testsuite-46.sh[996]:    Disk Size: 19.0M
[   45.374956] testsuite-46.sh[996]:   Disk Usage: 16.1M (= 84.9%)
[   45.377205] testsuite-46.sh[996]:    Disk Free: 908.0K (= 4.6%)
[   45.379445] testsuite-46.sh[996]:   Disk Floor: 19.0M
[   45.381610] testsuite-46.sh[996]: Disk Ceiling: 290.0M
[   45.383722] testsuite-46.sh[996]:   Good Auth.: 1
[   45.385665] testsuite-46.sh[996]:    Last Good: Thu 2024-05-09 06:58:35 UTC
[   45.388203] testsuite-46.sh[996]:     Next Try: anytime
[   45.390254] testsuite-46.sh[996]:  Auth. Limit: 1000 attempts per 1s
[   45.392702] testsuite-46.sh[996]:    Rebalance: off
[   45.394722] testsuite-46.sh[996]:    Passwords: 1
[   45.396708] testsuite-46.sh[996]:   Local Sig.: yes
[   45.398712] testsuite-46.sh[996]:      Service: io.systemd.Home
[   45.401957] testsuite-46.sh[997]: + userdbctl user blob-user
[   45.404084] testsuite-46.sh[998]: + tee /tmp/b
[   45.405936] testsuite-46.sh[998]:    User name: blob-user
[   45.407906] testsuite-46.sh[998]:        State: active
[   45.412148] testsuite-46.sh[998]:  Disposition: regular
[   45.413948] testsuite-46.sh[998]:  Last Change: Thu 2024-05-09 06:58:32 UTC
[   45.416300] testsuite-46.sh[998]:     Login OK: yes
[   45.417694] testsuite-46.sh[998]:  Password OK: yes
[   45.419619] testsuite-46.sh[998]:          UID: 12345
[   45.421391] testsuite-46.sh[998]:          GID: 12345 (blob-user)
[   45.423524] testsuite-46.sh[998]:    Directory: /home/blob-user
[   45.425629] testsuite-46.sh[998]:    Blob Dir.: /var/cache/systemd/home/blob-user
[   45.428062] testsuite-46.sh[998]:               ├─ test1 (436a101d6264b1daa5c5e2ba362407e1fc743126a7a18c09b381bc431ab8c08b)
[   45.431734] testsuite-46.sh[998]:               └─ test2 (5a43615af72cda81a2d8dc79ac115e2b3d85b027fd6c7373eb8cb8617be5978a)
[   45.435484] testsuite-46.sh[998]:      Storage: luks (strong encryption)
[   45.437768] testsuite-46.sh[998]:   Image Path: /home/blob-user.home
[   45.439870] testsuite-46.sh[998]:    Removable: no
[   45.441576] testsuite-46.sh[998]:        Shell: /bin/bash
[   45.443428] testsuite-46.sh[998]:  Access Mode: 0700
[   45.445202] testsuite-46.sh[998]: LUKS Discard: online=yes offline=yes
[   45.447306] testsuite-46.sh[998]:    LUKS UUID: 21b0c4d3-9c58-476d-a18d-4cf49ea5e44d
[   45.449840] testsuite-46.sh[998]:    Part UUID: 0d7cb051-5866-43d3-9d83-e184ae936a3f
[   45.452412] testsuite-46.sh[998]:      FS UUID: 3aa67bea-1be4-490f-bffa-4b049ec69f93
[   45.454414] testsuite-46.sh[998]:  File System: ext4
[   45.456150] testsuite-46.sh[998]:  LUKS Cipher: aes
[   45.457609] testsuite-46.sh[998]:  Cipher Mode: xts-plain64
[   45.459308] testsuite-46.sh[998]:   Volume Key: 256bit
[   45.460954] testsuite-46.sh[998]:   PBKDF Type: pbkdf2
[   45.462881] testsuite-46.sh[998]:   PBKDF Time: 1ms
[   45.465578] testsuite-46.sh[998]:  Mount Flags: nosuid nodev exec
[   45.467671] testsuite-46.sh[998]:    Disk Size: 19.0M
[   45.469364] testsuite-46.sh[998]:   Disk Usage: 16.1M (= 85.0%)
[   45.471383] testsuite-46.sh[998]:    Disk Free: 908.0K (= 4.6%)
[   45.473321] testsuite-46.sh[998]:   Disk Floor: 19.0M
[   45.475155] testsuite-46.sh[998]: Disk Ceiling: 290.0M
[   45.476660] testsuite-46.sh[998]:   Good Auth.: 1
[   45.478473] testsuite-46.sh[998]:    Last Good: Thu 2024-05-09 06:58:35 UTC
[   45.481020] testsuite-46.sh[998]:     Next Try: anytime
[   45.483053] testsuite-46.sh[998]:  Auth. Limit: 1000 attempts per 1s
[   45.485666] testsuite-46.sh[998]:    Rebalance: off
[   45.487488] testsuite-46.sh[998]:    Passwords: 1
[   45.491721] testsuite-46.sh[998]:   Local Sig.: yes
[   45.495830] testsuite-46.sh[998]:      Service: io.systemd.Home
[   45.502158] testsuite-46.sh[523]: + diff -I '^\s*Disk \(Size\|Free\|Floor\|Ceiling\):' /tmp/a /tmp/b
[   45.511180] testsuite-46.sh[999]: 30c30
[   45.514136] testsuite-46.sh[999]: <   Disk Usage: 16.1M (= 84.9%)
[   45.517486] testsuite-46.sh[999]: ---
[   45.519209] testsuite-46.sh[999]: >   Disk Usage: 16.1M (= 85.0%)
[   46.386634] reboot: Power down
DaanDeMeyer added the bug 🐛 and tests labels on May 9, 2024
DaanDeMeyer added a commit to DaanDeMeyer/systemd that referenced this issue May 9, 2024
This can change between the call to homectl inspect and userdbctl
user so let's ignore it along with the other disk fields.

Fixes systemd#32727
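
A minimal sketch of what that fix amounts to, assuming the diff invocation shown in the log above (illustrative only; the exact change is in #32728):

# Extend the -I ignore pattern so the "Disk Usage:" line is skipped too,
# since usage can drift slightly between the homectl inspect and
# userdbctl user snapshots.
diff -I '^\s*Disk \(Size\|Usage\|Free\|Floor\|Ceiling\):' /tmp/a /tmp/b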
DaanDeMeyer (Contributor, Author) commented May 9, 2024

Another different failure on Debian in https://github.com/systemd/systemd/actions/runs/9011374536/job/24758890861:

gh run download 9013386982 --name ci-mkosi-9013386982-1-centos-9-failed-test-journals -D ci/ci-mkosi-9013386982-1-centos-9-failed-test-journals && journalctl --file ci/ci-mkosi-9013386982-1-centos-9-failed-test-journals/test/journal/TEST-46-HOMED.journal --no-hostname -o short-monotonic -u testsuite-46.service -p info

This one is a kernel crash:

[   10.834840] kernel: VFS: Busy inodes after unmount of dm-0 (btrfs)
[   10.834861] kernel: ------------[ cut here ]------------
[   10.836255] kernel: kernel BUG at fs/super.c:698!
[   10.837314] kernel: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[   10.838764] kernel: CPU: 0 PID: 472 Comm: systemd-homewor Not tainted 6.7.12-cloud-amd64 #1  Debian 6.7.12-1
[   10.840816] kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
[   10.850226] kernel: RIP: 0010:generic_shutdown_super+0x10e/0x110
[   10.851756] kernel: Code: 91 00 e8 95 67 f3 ff 48 8b bb 00 01 00 00 eb db 48 8b 43 28 48 8d b3 b8 03 00 00 48 c7 c7 68 69 a6 84 48 8b 10 e8 d2 a5 d5 ff <0f> 0b 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa
[   10.856569] kernel: RSP: 0018:ffffbde3009c3db0 EFLAGS: 00010246
[   10.857842] kernel: RAX: 000000000000002e RBX: ffff99c10c7eb000 RCX: 0000000000000027
[   10.859651] kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff99c179c21400
[   10.861386] kernel: RBP: ffffffffc08458c0 R08: 0000000000000000 R09: ffffbde3009c3c58
[   10.862984] kernel: R10: ffffbde3009c3c50 R11: 0000000000000003 R12: ffff99c10d405914
[   10.864559] kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[   10.866223] kernel: FS:  00007f03cb1779c0(0000) GS:ffff99c179c00000(0000) knlGS:0000000000000000
[   10.868342] kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   10.869884] kernel: CR2: 000055e709133000 CR3: 000000000d5d0005 CR4: 0000000000370ef0
[   10.871806] kernel: Call Trace:
[   10.872499] kernel:  <TASK>
[   10.873102] kernel:  ? die+0x36/0x90
[   10.873914] kernel:  ? do_trap+0xda/0x100
[   10.874845] kernel:  ? generic_shutdown_super+0x10e/0x110
[   10.876050] kernel:  ? do_error_trap+0x6a/0x90
[   10.876934] kernel:  ? generic_shutdown_super+0x10e/0x110
[   10.878091] kernel:  ? exc_invalid_op+0x50/0x70
[   10.879095] kernel:  ? generic_shutdown_super+0x10e/0x110
[   10.880230] kernel:  ? asm_exc_invalid_op+0x1a/0x20
[   10.881337] kernel:  ? generic_shutdown_super+0x10e/0x110
[   10.882646] kernel:  kill_anon_super+0x16/0x40
[   10.883649] kernel:  btrfs_kill_super+0x16/0x20 [btrfs]
[   10.884902] kernel:  deactivate_locked_super+0x33/0xb0
[   10.886089] kernel:  cleanup_mnt+0xba/0x150
[   10.886999] kernel:  task_work_run+0x5d/0x90
[   10.887985] kernel:  exit_to_user_mode_prepare+0x1dd/0x1e0
[   10.889279] kernel:  syscall_exit_to_user_mode+0x1b/0x40
[   10.890540] kernel:  do_syscall_64+0x70/0xe0
[   10.891492] kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
[   10.892755] kernel:  ? exit_to_user_mode_prepare+0x40/0x1e0
[   10.894044] kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
[   10.895224] kernel:  ? syscall_exit_to_user_mode+0x22/0x40
[   10.896257] kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
[   10.897402] kernel:  ? do_syscall_64+0x70/0xe0
[   10.898282] kernel:  entry_SYSCALL_64_after_hwframe+0x73/0x7b
[   10.899386] kernel: RIP: 0033:0x7f03cab272f7
[   10.900216] kernel: Code: 0c 00 f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 f9 9a 0c 00 f7 d8 64 89 02 b8
[   10.904541] kernel: RSP: 002b:00007fffec601aa8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a6
[   10.906315] kernel: RAX: 0000000000000000 RBX: 000055e70912f660 RCX: 00007f03cab272f7
[   10.908024] kernel: RDX: 0000000000000000 RSI: 0000000000000008 RDI: 000055e70912f660
[   10.909691] kernel: RBP: 00007fffec601b10 R08: 0000000000000000 R09: 0000000000000000
[   10.911364] kernel: R10: 0000000000000007 R11: 0000000000000202 R12: 0000000000000010
[   10.912996] kernel: R13: 0000000000000000 R14: 000055e7078991ba R15: 0000000000000000
[   10.914603] kernel:  </TASK>
[   10.915123] kernel: Modules linked in: dm_crypt nls_ascii nls_cp437 vfat fat binfmt_misc btrfs intel_rapl_msr intel_rapl_common blake2b_generic iosf_mbi xor kvm_amd raid6_pq libcrc32c kvm sd_mod sg irqbypass crct10dif_pclmul crc32_p>
[   10.934080] kernel: ---[ end trace 0000000000000000 ]---

DaanDeMeyer added a commit that referenced this issue May 9, 2024
This can change between the call to homectl inspect and userdbctl
user so let's ignore it along with the other disk fields.

Fixes #32727
DaanDeMeyer reopened this May 9, 2024
yuwata (Member) commented May 17, 2024

> Another different failure on Debian in https://github.com/systemd/systemd/actions/runs/9011374536/job/24758890861:
>
> gh run download 9013386982 --name ci-mkosi-9013386982-1-centos-9-failed-test-journals -D ci/ci-mkosi-9013386982-1-centos-9-failed-test-journals && journalctl --file ci/ci-mkosi-9013386982-1-centos-9-failed-test-journals/test/journal/TEST-46-HOMED.journal --no-hostname -o short-monotonic -u testsuite-46.service -p info

Ugh, the journal is already removed? Please provide another one.

@DaanDeMeyer (Contributor, Author)

> > Another different failure on Debian in https://github.com/systemd/systemd/actions/runs/9011374536/job/24758890861:
> >
> > gh run download 9013386982 --name ci-mkosi-9013386982-1-centos-9-failed-test-journals -D ci/ci-mkosi-9013386982-1-centos-9-failed-test-journals && journalctl --file ci/ci-mkosi-9013386982-1-centos-9-failed-test-journals/test/journal/TEST-46-HOMED.journal --no-hostname -o short-monotonic -u testsuite-46.service -p info
>
> Ugh, the journal is already removed? Please provide another one.

I'll also increase the retention for failed journals to a month or so.
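
As an aside, one way to check whether a run's failed-test artifacts are still available (and when they expire) before trying to download them; this is an illustrative command assuming an authenticated gh CLI, with the run ID taken from the link above:

# List the artifacts of a workflow run together with their expiry state.
gh api repos/systemd/systemd/actions/runs/9011374536/artifacts --jq '.artifacts[] | {name: .name, expired: .expired, expires_at: .expires_at}'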

@DaanDeMeyer (Contributor, Author)

@yuwata another journal:

gh run download 9126934864 --name ci-mkosi-9126934864-1-arch-rolling-failed-test-journals -D ci/ci-mkosi-9126934864-1-arch-rolling-failed-test-journals && journalctl --file ci/ci-mkosi-9126934864-1-arch-rolling-failed-test-journals/test/journal/TEST-46-HOMED.journal --no-hostname -o short-monotonic -u TEST-46-HOMED.service -p info
