Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nhrpd: fixes core dump on shutdown #15879

Merged
merged 2 commits into from
May 30, 2024

Conversation

dleroy
Copy link
Contributor

@dleroy dleroy commented Apr 29, 2024

When nhrpd is shutdown via nhrp_request_stop() the shutdown sequence was not handling the case where there are active shortcut routes installed. The zebra client and shortcut rib were being cleaned up before vrf_terminate() had an opportunity to delete the active routes.

core.txt

@louberger
Copy link
Member

would be nice to have a topotest for this

@ton31337
Copy link
Member

Having a coredump inside the commit would be helpful too.

@dleroy
Copy link
Contributor Author

dleroy commented Apr 29, 2024

Having a coredump inside the commit would be helpful too.

core.txt link has the backtrace. Next time I'll include in line.

@frrbot frrbot bot added the tests Topotests, make check, etc label Apr 29, 2024
@github-actions github-actions bot added size/XXL and removed size/XS labels Apr 29, 2024
@ton31337
Copy link
Member

Having a coredump inside the commit would be helpful too.

core.txt link has the backtrace. Next time I'll include in line.

Why not this time?

@ton31337
Copy link
Member

@Mergifyio backport stable/10.0 stable/9.1 stable/9.0 stable/8.5

Copy link

mergify bot commented Apr 30, 2024

backport stable/10.0 stable/9.1 stable/9.0 stable/8.5

✅ Backports have been created

@dleroy
Copy link
Contributor Author

dleroy commented Apr 30, 2024

Having a coredump inside the commit would be helpful too.

core.txt link has the backtrace. Next time I'll include in line.

Why not this time?

Hi Donatas, I'd be happy to add coredump and associated nhrpd. I looked for a previous pull request example of best practices how to do that and couldn't find another PR with a coredump. Can you suggest how FRR community would like that done. Thanks.

@dleroy dleroy force-pushed the dleroy/nhrpd-shutdown-fix branch 2 times, most recently from cb75c32 to a63f6a9 Compare May 1, 2024 23:17
@eqvinox
Copy link
Contributor

eqvinox commented May 2, 2024

The zebra client and shortcut rib were being cleaned up before vrf_terminate() had an opportunity to delete the active routes.

(This is non-blocking re. this PR) — we may have a more general problem here with ordering shutdown operations. vrf_init is called before any NHRP initialization (not counting nhrp_error_init), and normally the shutdown calls should be in reverse order to init calls. If we need to clear out routes first, don't do that, and that causes problems… something is wrong with our setup of operations.

@dleroy
Copy link
Contributor Author

dleroy commented May 6, 2024

The topotest added by this commit is failing on Ubuntu 18.04 arm8 because 'iptables' is not installed on the image. The test leverages 'iptables' in order to be able to test nhrp shortcuts

2024/05/02-00:05:51 2024-05-02 00:05:51,225 INFO: topo: input: iptables -A FORWARD -i r1-gre0 -o r1-gre0 -m hashlimit --hashlimit-upto 4/minute --hashlimit-burst 1 --hashlimit-mode srcip,dstip --hashlimit-srcmask 24 --hashlimit-dstmask 24 --hashlimit-name loglimit-0 -j NFLOG --nflog-group 1 --nflog-range 128
2024/05/02-00:05:51 2024-05-02 00:05:51,237 WARNING: r1: Router(r1): proc failed: rc 127 pid 51424
2024/05/02-00:05:51 args: /usr/bin/nsenter --mount=/proc/50823/ns/mnt --net=/proc/50823/ns/net --uts=/proc/50823/ns/uts -F /bin/bash -c iptables -A FORWARD -i r1-gre0 -o r1-gre0 -m hashlimit --hashlimit-upto 4/minute --hashlimit-burst 1 --hashlimit-mode srcip,dstip --hashlimit-srcmask 24 --hashlimit-dstmask 24 --hashlimit-name loglimit-0 -j NFLOG --nflog-group 1 --nflog-range 128
2024/05/02-00:05:51 stdout: /bin/bash: iptables: command not found
2024/05/02-00:05:51 stderr: empty
2024/05/02-00:05:51 2024-05-02 00:05:51,238 INFO: topo: output: /bin/bash: iptables: command not found

@dleroy dleroy force-pushed the dleroy/nhrpd-shutdown-fix branch from a63f6a9 to 4f6e5a6 Compare May 6, 2024 21:36
@github-actions github-actions bot added the rebase PR needs rebase label May 6, 2024
@dleroy dleroy force-pushed the dleroy/nhrpd-shutdown-fix branch 4 times, most recently from 9188309 to 94d381d Compare May 16, 2024 01:54
@donaldsharp
Copy link
Member

If you are adding usage of iptables and it is not present you should skip the test on that platform. See the bgp_as_allow_in/test_bgp_as_allow_in.py script and look at setup_module and add a similiar test to the kernel version and pytest.skip(...) if its not available.

@dleroy
Copy link
Contributor Author

dleroy commented May 17, 2024

If you are adding usage of iptables and it is not present you should skip the test on that platform. See the bgp_as_allow_in/test_bgp_as_allow_in.py script and look at setup_module and add a similiar test to the kernel version and pytest.skip(...) if its not available.

Thanks Donald. I pushed the fix to ignore unsupported platforms already. Now just trying to get all the unrelated basics tests to pass. i.e.

🚫 ospf6_ecmp_inter_area.test_ospf6_ecmp_inter_area test_ecmp_inter_area

E AssertionError: 'r1' wrong number of route nexthops
assert [1, 1, 1, 1, 1, 1, 2, 2, 2] == [1, 1, 1, 1, 1, 2, 2, 2, 2]
At index 5 diff: 1 != 2
Full diff:
- [1, 1, 1, 1, 1, 2, 2, 2, 2]
? ---
+ [1, 1, 1, 1, 1, 1, 2, 2, 2]
? +++

@dleroy dleroy force-pushed the dleroy/nhrpd-shutdown-fix branch 2 times, most recently from e39f017 to be936d4 Compare May 23, 2024 14:43
@dleroy dleroy force-pushed the dleroy/nhrpd-shutdown-fix branch from be936d4 to 104410d Compare May 30, 2024 13:51
When nhrpd is shutdown via nhrp_request_stop() the shutdown
sequence was not handling the case where there are active
shortcut routes installed. The zebra client and shortcut rib
were being cleaned up before vrf_terminate() had an opportunity
to delete the active routes.

Signed-off-by: dleroy <[email protected]>
…ology

Contains 2 testcases. The first does a basic configuration/connectivity.
The second testcase initiates a shortcut through the primary NHS,
verifies shortcut routes are installed. Primary NHS interface brought
down and verify that the shortcut is not impacted. Finally verify that
after the shortcut expires, it is able to be re-established via a backup
NHS.

Signed-off-by: dleroy <[email protected]>
@dleroy dleroy force-pushed the dleroy/nhrpd-shutdown-fix branch from 104410d to a7037ab Compare May 30, 2024 19:26
@Jafaral Jafaral merged commit 8e7bc85 into FRRouting:master May 30, 2024
8 checks passed
@Jafaral
Copy link
Member

Jafaral commented May 30, 2024

@Mergifyio backport stable/8.4

Copy link

mergify bot commented May 30, 2024

backport stable/8.4

✅ Backports have been created

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport master nhrp rebase PR needs rebase size/XXL tests Topotests, make check, etc
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants