Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

arm64/vmm: Preserve PSR_C64 when injecting an exception #2255

Open
wants to merge 1 commit into
base: dev
Choose a base branch
from

Conversation

markjdb
Copy link
Contributor

@markjdb markjdb commented Nov 27, 2024

I'm not sure if this might be simplistic, but it resolves a problem I see with breakpoint injection from bhyve's gdb stub. This arises when the debugger has installed a breakpoint, and the guest triggers a breakpoint exception some other way, e.g., a dtrace FBT probe.

sys/arm64/vmm/vmm_arm64.c Outdated Show resolved Hide resolved
@jrtc27
Copy link
Member

jrtc27 commented Nov 27, 2024

This looks to be the bug @kwitaszczyk was running into, and the cause of the problem aligns with what @bsdjhb and I had managed to ascertain at the PI meeting.

@jrtc27
Copy link
Member

jrtc27 commented Nov 27, 2024

The handling of vbar_el1 also looks a bit dodgy. If CPACR_EL1.CEN[0] is 0 then CVBAR_EL1 is just interpreted as VBAR_EL1 by the architecture when trapping to EL1 (setting PCC's address to it), which means we need to derive a capability for tf_elr from elr_el1.

@jrtc27
Copy link
Member

jrtc27 commented Nov 27, 2024

The handling of vbar_el1 also looks a bit dodgy. If CPACR_EL1.CEN[0] is 0 then CVBAR_EL1 is just interpreted as VBAR_EL1 by the architecture when trapping to EL1 (setting PCC's address to it), which means we need to derive a capability for tf_elr from elr_el1.

This should be testable by running GDB against a plain FreeBSD VM, setting a breakpoint from GDB and triggering a breakpoint from within the VM, just as for the SPSR C64 issue except with a FreeBSD guest. I'm not sure how exactly I expect it to break, whether it'll get stuck in a trap loop or end up in bhyve. Hopefully at least it doesn't wedge the host, which should be true as long as some of this code is preemptible... otherwise there are other ways a malicious guest could trigger the same kinds of issues even with correct handling here.

@kwitaszczyk
Copy link
Member

This looks to be the bug @kwitaszczyk was running into, and the cause of the problem aligns with what @bsdjhb and I had managed to ascertain at the PI meeting.

Unfortunately, it doesn't seem the host kernel enters the block of if (hypctx->has_exception) in my case. It does enter it but once the guest kernel ends up in kdb_enter() after the panic.

@bsdjhb
Copy link
Collaborator

bsdjhb commented Nov 29, 2024

Can you add a trace to see if vmmops_setreg is ever called to write a value to VM_REG_GUEST_CPSR?

@kwitaszczyk
Copy link
Member

kwitaszczyk commented Dec 2, 2024

Can you add a trace to see if vmmops_setreg is ever called to write a value to VM_REG_GUEST_CPSR?

It doesn't seem vmmops_setreg() is ever called for this purpose. I've added

diff --git a/sys/arm64/vmm/vmm_arm64.c b/sys/arm64/vmm/vmm_arm64.c
index a66d5ed1ba97..23efd3e4887d 100644
--- a/sys/arm64/vmm/vmm_arm64.c
+++ b/sys/arm64/vmm/vmm_arm64.c
@@ -1447,6 +1447,12 @@ vmmops_setreg(void *vcpui, int reg, uintcap_t val)
 #endif
                *(uintcap_t *)regp = val;
                break;
+       case VM_REG_GUEST_CPSR:
+               printf("%s:%d\nval=%lu\nelr_el1=%#lp\ntf_elr=%#lp\nspsr=0x%lx\n",
+                   __func__, __LINE__, (uint64_t)val, (void *)hypctx->elr_el1,
+                   (void *)hypctx->tf.tf_elr, hypctx->tf.tf_spsr);
+               *(uint64_t *)regp = (uint64_t)val;
+               break;
        default:
                *(uint64_t *)regp = (uint64_t)val;
                break;

and I don't get anything in the serial console.

Copy link
Collaborator

@bsdjhb bsdjhb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think there are other things to check (CEN) that Jess noted, but this is certainly an improvement over what is there now.

@markjdb
Copy link
Contributor Author

markjdb commented Dec 18, 2024

I do think there are other things to check (CEN) that Jess noted, but this is certainly an improvement over what is there now.

I have a patch to address that comment, but had been holding off on pushing it until I could test with hybrid kernels. Now I'm looking at an apparent regression with the VHE merge after I rebased onto dev; hopefully it won't take too long to fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants