TCTI state transitions are inconsistent and unpredictable #2674

Open
fergus-dall opened this issue Aug 4, 2023 · 2 comments

@fergus-dall

Looking through the TCTI code, I have noticed a couple of inconsistencies.

Cancellation

The tcti-mssim and tcti-pcap TCTIs transition directly to the TCTI_STATE_TRANSMIT state after a successful call to cancel. This contradicts the specification, which states that another call to receive is required before another command can be transmitted. As a result, the caller has no way to get the response buffer for the cancelled command, which is required to tell whether the command actually executed. Higher layers in the stack will likely also hit unexpected errors if a command using these TCTIs is cancelled. For example, I think the ESYS library will enter an unrecoverable internal error state if the caller does:

Esys_{Command}_Async(context, ...);
Esys_GetTcti(context, &tcti);
Tss2_Tcti_Cancel(tcti);
Esys_{Command}_Finish(context, ...);

which I believe is the intended pattern for cancelling a command. Note that skipping the Esys_{Command}_Finish call leaves the ESYS context in the _ESYS_STATE_SENT state where new commands cannot be issued, which is effectively the same as the unrecoverable error state for this purpose.
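
To make the difference concrete, here is a minimal toy model of the two cancel behaviours. It is only a sketch: every name in it (model_tcti, model_cancel, and so on) is hypothetical and none of it is taken from tpm2-tss. It also assumes that receive rejects calls made outside the receive state with a bad-sequence error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of a TCTI state machine, not tpm2-tss code. */
typedef enum { MODEL_STATE_TRANSMIT, MODEL_STATE_RECEIVE } model_state;
typedef enum { MODEL_RC_SUCCESS, MODEL_RC_BAD_SEQUENCE } model_rc;

typedef struct {
    model_state state;
    /* true models the tcti-mssim / tcti-pcap behaviour described above */
    bool cancel_jumps_to_transmit;
} model_tcti;

/* Sending a command is only allowed in the transmit state. */
static model_rc model_transmit(model_tcti *t)
{
    assert(t != NULL);
    if (t->state != MODEL_STATE_TRANSMIT)
        return MODEL_RC_BAD_SEQUENCE;
    t->state = MODEL_STATE_RECEIVE;
    return MODEL_RC_SUCCESS;
}

/* Cancel is only meaningful while a response is outstanding. */
static model_rc model_cancel(model_tcti *t)
{
    assert(t != NULL);
    if (t->state != MODEL_STATE_RECEIVE)
        return MODEL_RC_BAD_SEQUENCE;
    if (t->cancel_jumps_to_transmit)
        t->state = MODEL_STATE_TRANSMIT; /* the behaviour reported in this issue */
    return MODEL_RC_SUCCESS;
}

/* Assumption: receive outside the receive state is a sequence error. */
static model_rc model_receive(model_tcti *t)
{
    assert(t != NULL);
    if (t->state != MODEL_STATE_RECEIVE)
        return MODEL_RC_BAD_SEQUENCE;
    t->state = MODEL_STATE_TRANSMIT;
    return MODEL_RC_SUCCESS;
}
```

With cancel_jumps_to_transmit set to false, the sequence transmit, cancel, receive succeeds and the caller can still collect the response. With it set to true, the final receive fails in this model, which is the situation the Finish call above would run into.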

All other TCTIs in this repo appear to act correctly in this regard. This looks straightforward to fix, but I don't know much about the lower-level interface they're using. Given the above, I would guess no one is relying on this behaviour.

Receive errors

Some TCTIs also transition to the TCTI_STATE_TRANSMIT state on certain errors in the receive call. The spec doesn't explicitly comment on this either way, and neither do the man pages or installed headers. However, a comment in the internal tcti-common.h header claims that this occurs for all return codes other than TRY_AGAIN, INSUFFICIENT_BUFFER, BAD_CONTEXT, BAD_REFERENCE, BAD_VALUE, and BAD_SEQUENCE. The behaviour of some TCTIs diverges from this. For example:

  • tcti-cmd transitions when returning INSUFFICIENT_BUFFER.
  • tcti-device does not transition when returning IO_ERROR. It is also inconsistent when returning unmarshaling errors and GENERAL_FAILURE, transitioning on some code paths but not others.
  • tcti-i2c-helper only transitions when returning SUCCESS, even though there are paths that return IO_ERROR and paths that pass through lower-layer errors, which don't document any restrictions on the errors they may return.
  • tcti-pcap doesn't appear to interpret error codes from the underlying TCTI to decide whether to change its own state. If the two get out of sync, the TCTI would become unusable.
  • tcti-libtpms and tcti-mssim appear to behave correctly, i.e. the same as the comment in tcti-common.h.
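
For reference, the rule from the tcti-common.h comment, as paraphrased above, can be written as a single predicate. This is a sketch only: the enum is a hypothetical stand-in for the TSS2_TCTI_RC_* codes, and the function name is made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the return codes named above. */
typedef enum {
    SKETCH_RC_SUCCESS,
    SKETCH_RC_TRY_AGAIN,
    SKETCH_RC_INSUFFICIENT_BUFFER,
    SKETCH_RC_BAD_CONTEXT,
    SKETCH_RC_BAD_REFERENCE,
    SKETCH_RC_BAD_VALUE,
    SKETCH_RC_BAD_SEQUENCE,
    SKETCH_RC_IO_ERROR,
    SKETCH_RC_GENERAL_FAILURE
} sketch_rc;

/*
 * Returns true if, per the comment's rule, a receive call returning rc
 * should leave the context in the transmit state.
 */
static bool receive_returns_to_transmit(sketch_rc rc)
{
    switch (rc) {
    case SKETCH_RC_TRY_AGAIN:
    case SKETCH_RC_INSUFFICIENT_BUFFER:
    case SKETCH_RC_BAD_CONTEXT:
    case SKETCH_RC_BAD_REFERENCE:
    case SKETCH_RC_BAD_VALUE:
    case SKETCH_RC_BAD_SEQUENCE:
        return false; /* stay in the receive state */
    default:
        return true;  /* every other code, including success */
    }
}
```

Measured against this predicate, the tcti-cmd case corresponds to transitioning on SKETCH_RC_INSUFFICIENT_BUFFER, and the tcti-device case to not transitioning on SKETCH_RC_IO_ERROR.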

This causes problems very similar to those of the cancellation issue: higher layers of the stack are entirely unprepared for it.

My theory is that this behaviour is intended to distinguish three types of errors through different error codes:

  • Transient errors that mean the caller should retry receive.
  • Fatal errors that mean the caller should give up on using this TCTI context entirely.
  • Errors that mean the caller should give up on getting a response for this command, but can still try to send more commands in the future.

Considering the widely varying existing behaviour, and the common pattern of returning error codes from lower layers without inspection, I think trying to signal the presence or absence of a state transition through the specific error code returned was a mistake.

The most straightforward fix within the existing interface would be to say that error returns never indicate a transition out of the TCTI_STATE_RECEIVE state. Equivalently, transmit can only be called after initialization or a successful receive call. Transient errors then get retried by the caller as appropriate, and a fatal error is represented by simply never being able to perform a successful receive. Higher layers of software generally seem to assume this already, but it's more likely that someone would be broken by this than by the proposed cancellation fix above.
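
As a rough illustration of the proposed rule, the state update after receive would depend only on whether the call succeeded. The names below are hypothetical and the code is a sketch of the idea, not a patch against any TCTI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names, for illustration only. */
typedef enum { FIX_STATE_TRANSMIT, FIX_STATE_RECEIVE } fix_state;
typedef enum { FIX_RC_SUCCESS, FIX_RC_TRY_AGAIN, FIX_RC_IO_ERROR } fix_rc;

/* Proposed rule: only a successful receive leaves the receive state. */
static fix_state state_after_receive(fix_state current, fix_rc rc)
{
    if (current == FIX_STATE_RECEIVE && rc == FIX_RC_SUCCESS)
        return FIX_STATE_TRANSMIT;
    return current; /* every error keeps the state unchanged */
}

/*
 * Replays a scripted sequence of receive results, starting with a command
 * outstanding, and reports whether transmit would be allowed afterwards.
 */
static bool may_transmit_after(const fix_rc *results, size_t count)
{
    assert(results != NULL || count == 0);
    fix_state state = FIX_STATE_RECEIVE;
    for (size_t i = 0; i < count; i++)
        state = state_after_receive(state, results[i]);
    return state == FIX_STATE_TRANSMIT;
}
```

Under this rule a caller that sees TRY_AGAIN followed by success may transmit again, while a caller that only ever sees errors never leaves the receive state. That is how a fatal error would be represented.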

@tomoveu
Contributor

tomoveu commented Sep 5, 2023

I am currently experiencing issues running the latest tpm2-tss with ibmswtpm2.

Could this bug be the reason? And which tpm2-tss version has a stable mssim TCTI?

[1] 89
root@0a72a66cbc25:/ibmswtpm2/src# LIBRARY_COMPATIBILITY_CHECK is ON
Manufacturing NV state...
Size of OBJECT = 1732
Size of components in TPMT_SENSITIVE = 1096
    TPMI_ALG_PUBLIC                 2
    TPM2B_AUTH                      66
    TPM2B_DIGEST                    66
    TPMU_SENSITIVE_COMPOSITE        962
MAX_CONTEXT_SIZE can be reduced to 1808 (2680)
Starting ACT thread...
TPM command server listening on port 2321
Platform server listening on port 2322

root@0a72a66cbc25:/ibmswtpm2/src# tpm2_startup -T mssim:host=localhost,port=2321
Command IPv4 client accepted
Platform IPv4 client accepted
WARNING:esys:src/tss2-esys/api/Esys_Startup.c:212:Esys_Startup_Finish() Received TPM Error 
ERROR:esys:src/tss2-esys/api/Esys_Startup.c:78:Esys_Startup() Esys Finish ErrorCode (0x000001c4) 
ERROR: Esys_Startup(0x1C4) - tpm:parameter(1):value is out of range or is not correct for the context
ERROR: Unable to run tpm2_startup
Platform server listening on port 2322
TPM command server listening on port 2321
root@0a72a66cbc25:/ibmswtpm2/src# tpm2_getrandom 8 -T mssim:host=localhost,port=2321
Command IPv4 client accepted
Platform IPv4 client accepted
WARNING:esys:src/tss2-esys/api/Esys_GetCapability.c:301:Esys_GetCapability_Finish() Received TPM Error 
ERROR:esys:src/tss2-esys/api/Esys_GetCapability.c:106:Esys_GetCapability() Esys Finish ErrorCode (0x00000100) 
ERROR: Esys_GetCapability(0x100) - tpm:error(2.0): TPM not initialized by TPM2_Startup or already initialized
ERROR: Unable to run tpm2_getrandom
Platform server listening on port 2322
TPM command server listening on port 2321
root@0a72a66cbc25:/ibmswtpm2/src# 

@JuergenReppSIT
Member

With tpm2_startup -c -T mssim:host=localhost,port=2321 it worked for me.
