Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: read error leads to message skip for blocking API #292

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

kusstas
Copy link

@kusstas kusstas commented Dec 19, 2024

ISSUE: Read error leads to message skip. It happens due consuming STX byte of message, even if message parsing isn't completed yet.

GOAL: I have custom readers that could return WouldBlock IO error at any time and code shall interrupt message parsing and return to this later.

@kusstas kusstas changed the title Fix read error leads to message skip for blocking API Fix: read error leads to message skip for blocking API Dec 19, 2024
@pv42
Copy link
Contributor

pv42 commented Dec 21, 2024

This seems to be an issue worth fixing. But I think this does not seem to compleatly fix the issue at least as I understand it.
I wrote the following test that uses a reader that alternates beween returning would block and reading a single byte:

    struct BlockyReader {
        block_next_read: bool,
        index: usize,
    }
    impl Read for BlockyReader {
        fn read(&mut self, buf: &mut [u8]) -> Result<usize> {
            if self.block_next_read {
                self.block_next_read = false;
                Result::Err(Error::new(ErrorKind::WouldBlock, "Test Block"))
            } else {
                let read = HEARTBEAT_V2.get(self.index).ok_or(Error::new(
                    ErrorKind::UnexpectedEof,
                    "EOF",
                ));
                buf[0] = *read?;
                self.index += 1;
                self.block_next_read = true;
                Ok(1)
            }
        }
    }

    #[test]
    fn test_read_error() {
        let mut reader = PeekReader::new(BlockyReader {
            block_next_read: true,
            index: 0,
        });
        loop {
            match read_v2_msg::<mavlink::common::MavMessage, _>(&mut reader) {
                Ok((header, _)) => {
                    assert_eq!(header, crate::test_shared::COMMON_MSG_HEADER);
                    break;
                },
                Err(MessageReadError::Io(err)) if err.kind() == ErrorKind::WouldBlock => (),
                Err(err) => panic!("{err}"),
            }
        }
    }

which fails. The problem seems to be that PeekReader::fetch calls io::Read::read_exact() which then calls read() multiple times discarding all previously read data in case of an error.
This PR still fixes the issue when the block occures directly after the STX byte, as in this reader:

impl Read for BlockyReader {
        fn read(&mut self, buf: &mut [u8]) -> Result<usize> {
            if self.block_next_read {
                self.block_next_read = false;
                Result::Err(Error::new(ErrorKind::WouldBlock, "Test Block"))
            } else {
                let read = HEARTBEAT_V2.get(self.index).ok_or(Error::new(
                    ErrorKind::UnexpectedEof,
                    "EOF",
                ));
                buf[0] = *read?;
                self.index += 1;
                if self.index <= 1 {
                    self.block_next_read = true;
                }
                Ok(1)
            }
        }
    }

but if at any other position the message still gets trown out.
Regardless if this is the intended fix or not there should probably be a test case for it.

@kusstas
Copy link
Author

kusstas commented Dec 22, 2024

Hi, thank you for response. Yeah, I've not tested case where reader can read less data than expect. I've modified fetch function to use read instead of read_exact to store all data that read to buffer and added test cases for each version of mavlink read.

@pv42
Copy link
Contributor

pv42 commented Dec 22, 2024

f801bea seems to have broken the process_log_files test, it does not terminate anymore (at least for me).
This seems to be happening because std::io::default_read_exact() that performs the work for read_exact() has a special abort condition where it returns an error when 0 bytes where read:

while !buf.is_empty() {
    match this.read(buf) {
        Ok(0) => break,
        ...
    }
}
if !buf.is_empty() { Err(Error::READ_EXACT_EOF) } else { Ok(()) }

@kusstas
Copy link
Author

kusstas commented Dec 22, 2024

Yeah, I've added case for zero check on read, and now seems tests with features listed in workflow work well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants