-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Verifier integration #114
Verifier integration #114
Conversation
WalkthroughThe changes in this pull request involve the removal of the entire Changes
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 8
🧹 Outside diff range and nitpick comments (9)
libfuzzer/CMakeLists.txt (1)
21-21
: New include directories look goodThe addition of GSL and ebpf-verifier include directories aligns with the PR objectives. However, for better readability and maintainability, consider grouping related paths together.
You might want to reorder the include directories like this:
target_include_directories("ubpf_fuzzer" PRIVATE "${CMAKE_SOURCE_DIR}/vm" "${CMAKE_BINARY_DIR}/vm" "${CMAKE_SOURCE_DIR}/vm/inc" "${CMAKE_BINARY_DIR}/vm/inc" "${CMAKE_SOURCE_DIR}/ubpf_plugin" "${CMAKE_BINARY_DIR}/_deps/gsl-src/include" "${CMAKE_SOURCE_DIR}/external/ebpf-verifier/src" "${CMAKE_SOURCE_DIR}/external/ebpf-verifier/src/crab" "${CMAKE_SOURCE_DIR}/external/ebpf-verifier/src/crab_utils" )Also applies to: 25-27
custom_tests/srcs/ubpf_test_debug_function.cc (2)
25-27
: LGTM! Consider future use ofregister_mask
.The addition of the
register_mask
parameter to thedebug_callout
function signature aligns with the PR objectives. The use ofUNREFERENCED_PARAMETER
is a good practice to suppress compiler warnings.However, the
register_mask
parameter is currently unused in the function body. Consider documenting the intended future use of this parameter or implement its functionality if it's meant to be used in this PR.
Line range hint
1-114
: Summary of changes and next steps.The changes in this file align with the PR objectives, primarily updating the
debug_callout
function to include the newregister_mask
parameter. However, there are a few points to consider:
The
register_mask
parameter is currently unused in thedebug_callout
function. Consider documenting its intended use or implementing its functionality.The
ubpf_register_debug_fn
call in themain
function doesn't pass the newregister_mask
parameter. Verify that theubpf_register_debug_fn
function signature inubpf.h
has been updated accordingly.If the
ubpf_register_debug_fn
signature has been updated, modify the call in themain
function to include theregister_mask
parameter.Review any other files that might be affected by this change, such as test files or other parts of the codebase that use the debug function.
These steps will ensure consistency across the codebase and complete the integration of the new
register_mask
parameter.vm/inc/ubpf.h (1)
Line range hint
1-581
: Consider updating related functions and documentationWhile the change to
ubpf_debug_fn
is the only visible modification, it's important to consider its broader impact on the uBPF library:
- Update the documentation for
ubpf_register_debug_fn
to reflect the newregister_mask
parameter.- Review any implementations of
ubpf_debug_fn
in the codebase to ensure they handle the new parameter correctly.- Consider adding a brief explanation of the
register_mask
parameter's purpose and usage in a comment near the typedef.- Verify that this change doesn't break any existing code that uses the debug function.
To maintain consistency and clarity throughout the project, consider the following:
- Update the project's changelog to document this API change.
- If there are any example implementations or usage of
ubpf_debug_fn
, update them to include the new parameter.- Review and update any relevant test cases to cover the new functionality provided by the
register_mask
parameter.vm/ubpf_vm.c (1)
682-682
: Overall enhancement: Introduction of shadow register tracking.The changes introduce a comprehensive mechanism for tracking register initialization through shadow registers. This feature significantly enhances the uBPF VM's ability to detect and prevent bugs related to uninitialized register usage.
While this is a valuable addition for debugging and ensuring correct program behavior, consider the following architectural advice:
Performance Impact: The additional checks in the main execution loop could potentially impact performance. Consider implementing a compile-time flag to enable/disable this feature, allowing users to optimize for performance in production environments when necessary.
Configurability: Provide options to configure the level of checking (e.g., only check certain critical instructions) to balance between thoroughness and performance.
Error Reporting: Enhance the error reporting mechanism to provide detailed information about uninitialized register usage, including the instruction context and call stack if possible.
Documentation: Ensure that this new feature is well-documented, including its purpose, how to use it effectively, and any performance considerations.
Testing: Develop a comprehensive test suite that covers various scenarios of register usage to ensure the shadow register tracking is working correctly across all instruction types and edge cases.
These enhancements will make the shadow register feature more robust, flexible, and user-friendly while maintaining the overall integrity and performance of the uBPF VM.
libfuzzer/libfuzz_harness.cc (4)
13-15
: Verify Necessity of Included HeadersThe headers
"asm_unmarshal.hpp"
,"crab_verifier.hpp"
, and"platform.hpp"
are included. Please ensure that these headers are necessary for this file. Removing unused headers can reduce compilation time and improve code clarity.
245-305
: Clarify Disabled Debug CodeIn
ubpf_debug_function
, the debug implementation is enclosed within#if 1 ... #else
blocks, effectively disabling it. If the debug functionality is intentionally disabled, consider adding a comment to explain why. Alternatively, remove the unused code to clean up the codebase.
468-472
: Provide Feedback on Verification FailureWhen
verify_bpf_byte_code(program)
returnsfalse
, the function exits silently. Consider logging an error message or returning a non-zero value to facilitate debugging and provide feedback on why the execution was aborted.
Line range hint
510-514
: Ensure Resource Cleanup on ExceptionIn the case where
interpreter_result != jit_result
, an exception is thrown. Verify that all allocated resources are properly cleaned up to prevent memory leaks, especially since exceptions alter the normal control flow.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (9)
- .github/workflows/fuzzing.yml (1 hunks)
- .gitmodules (1 hunks)
- CMakeLists.txt (1 hunks)
- custom_tests/srcs/ubpf_test_debug_function.cc (1 hunks)
- external/ebpf-verifier (1 hunks)
- libfuzzer/CMakeLists.txt (1 hunks)
- libfuzzer/libfuzz_harness.cc (7 hunks)
- vm/inc/ubpf.h (1 hunks)
- vm/ubpf_vm.c (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- external/ebpf-verifier
🧰 Additional context used
🔇 Additional comments (11)
.gitmodules (1)
7-9
: LGTM: New submodule correctly added.The new submodule "external/ebpf-verifier" has been properly added to the
.gitmodules
file. This addition aligns with the PR objectives of integrating theebpf-verifier
submodule into the project. The path and URL are correctly formatted and consistent with the submodule name.libfuzzer/CMakeLists.txt (1)
34-34
: ```shell
#!/bin/bashSearch for any reference to 'ebpfverifier' in the codebase
rg 'ebpfverifier' .
</blockquote></details> <details> <summary>CMakeLists.txt (1)</summary><blockquote> `46-46`: **LGTM. Consider adding a clarifying comment.** The addition of the "external/ebpf-verifier" subdirectory is correct and aligns with the PR objectives. It's appropriately conditioned on the UBPF_ENABLE_LIBFUZZER option, which is good for modularity. Consider adding a brief comment explaining why the ebpf-verifier is only included when UBPF_ENABLE_LIBFUZZER is set. This would improve code readability and maintainability. For example: ```cmake if (UBPF_ENABLE_LIBFUZZER) add_subdirectory("libfuzzer") # Include ebpf-verifier for enhanced bytecode verification in libfuzzer add_subdirectory("external/ebpf-verifier") endif()
To ensure the "external/ebpf-verifier" directory exists, we can run the following check:
✅ Verification successful
LGTM. The "external/ebpf-verifier" directory exists and is correctly referenced in
.gitmodules
.
Consider adding a brief comment explaining why
ebpf-verifier
is only included whenUBPF_ENABLE_LIBFUZZER
is set. This would improve code readability and maintainability. For example:if (UBPF_ENABLE_LIBFUZZER) add_subdirectory("libfuzzer") # Include ebpf-verifier for enhanced bytecode verification in libfuzzer add_subdirectory("external/ebpf-verifier") endif()🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Verify the existence of the external/ebpf-verifier directory if [ -d "external/ebpf-verifier" ]; then echo "external/ebpf-verifier directory exists." else echo "Warning: external/ebpf-verifier directory not found." fi # Check if it's mentioned in .gitmodules if grep -q "external/ebpf-verifier" .gitmodules; then echo "external/ebpf-verifier is mentioned in .gitmodules" else echo "Warning: external/ebpf-verifier not found in .gitmodules" fiLength of output: 282
custom_tests/srcs/ubpf_test_debug_function.cc (1)
Line range hint
82-82
: Verifyubpf_register_debug_fn
signature inubpf.h
.The
ubpf_register_debug_fn
call doesn't pass the newregister_mask
parameter todebug_callout
. This might indicate that theubpf_register_debug_fn
function signature in theubpf.h
header hasn't been updated to accommodate the new parameter.Please verify that the
ubpf_register_debug_fn
function signature inubpf.h
has been updated to match the newdebug_callout
signature. If it hasn't been updated, ensure that it's modified to include theregister_mask
parameter..github/workflows/fuzzing.yml (1)
100-100
: Verify test coverage for disabled verifier testsThe addition of
-DVERIFIER_ENABLE_TESTS=false
aligns with the PR objective of integrating the ebpf-verifier submodule. While this may optimize the build process, it's crucial to ensure that these tests are executed elsewhere in the CI/CD pipeline to maintain code quality and catch potential issues early.To confirm that verifier tests are covered in other parts of the CI/CD pipeline, please run the following script:
vm/inc/ubpf.h (2)
Line range hint
1-581
: Summary and RecommendationThe change to add a
register_mask
parameter to theubpf_debug_fn
typedef is a valuable enhancement to the uBPF library's debugging capabilities. It allows for more fine-grained control over register inspection during debugging.Recommendations:
- Proceed with the change, as it improves the library's functionality.
- Ensure all related code, documentation, and tests are updated to reflect this change.
- Verify that the change doesn't introduce any breaking changes for existing users of the library.
- Consider adding a brief explanation of the
register_mask
parameter's purpose and usage in the header file.Overall, this change is approved, subject to the implementation of the suggested verifications and updates.
580-581
: Update toubpf_debug_fn
typedef: Newregister_mask
parameter addedThe
ubpf_debug_fn
typedef has been updated to include a newuint64_t register_mask
parameter. This change enhances the debugging capabilities of the uBPF VM by allowing more granular control over which registers are of interest during debugging.To ensure this change is consistently applied throughout the codebase, please run the following script:
If the script returns any results, those locations need to be updated to match the new signature.
vm/ubpf_vm.c (2)
682-682
: Comprehensive shadow register validation function added.The
ubpf_validate_shadow_register
function has been introduced to handle shadow register operations. It checks if registers being accessed by an instruction are initialized and updates the shadow register state based on the instruction being executed.Consider adding detailed comments explaining the logic for each instruction type, especially for complex cases. This will improve maintainability and make it easier for other developers to understand and modify the code in the future.
Let's check if all instruction types are handled:
#!/bin/bash # List all unique instruction types in the code rg --type c "case EBPF_OP_" vm/ubpf_vm.c | awk '{print $2}' | sort | uniq # Compare with the cases in ubpf_validate_shadow_register rg --type c -A 50 "static inline bool\nubpf_validate_shadow_register" vm/ubpf_vm.c | rg "case EBPF_" | awk '{print $2}' | sort | uniqCompare these outputs to ensure all instruction types are accounted for in the shadow register validation.
682-682
: Update to debug function call includes new shadow_registers parameter.The debug function call has been updated to include a new
shadow_registers
parameter. This change allows for more detailed debugging information about register initialization states.To ensure this change is consistently applied, let's check for any other debug function calls that might need updating:
✅ Verification successful
Verified: No other debug function calls need updating.
All debug function calls in
vm/ubpf_vm.c
are consistent with the introduction of theshadow_registers
parameter.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for other debug function calls that might need updating rg --type c "vm->debug_function\s*\(" vm/ubpf_vm.cLength of output: 170
libfuzzer/libfuzz_harness.cc (2)
28-35
: Ensure Proper Initialization ofubpf_context_t
FieldsThe
ubpf_context_t
structure defines the execution context for uBPF programs. Make sure all instances of this structure are properly initialized before use to prevent undefined behavior due to uninitialized data.
338-339
: Verify Debug Function RegistrationThe call to
ubpf_register_debug_fn
registersubpf_debug_function
with the VM. Ensure thatubpf_debug_function
is compatible with the expected signature and correctly handles the context provided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
- libfuzzer/CMakeLists.txt (1 hunks)
- libfuzzer/libfuzz_harness.cc (7 hunks)
- vm/ubpf_vm.c (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- libfuzzer/CMakeLists.txt
- vm/ubpf_vm.c
🧰 Additional context used
📓 Learnings (1)
libfuzzer/libfuzz_harness.cc (1)
Learnt from: Alan-Jowett PR: Alan-Jowett/ubpf#114 File: libfuzzer/libfuzz_harness.cc:137-175 Timestamp: 2024-10-16T00:06:09.352Z Learning: In `verify_bpf_byte_code`, the use of exceptions for control flow is intentional because the caller only cares about whether verification succeeded or failed, not the reason why.
🔇 Additional comments (2)
libfuzzer/libfuzz_harness.cc (2)
13-34
: LGTM: New includes and context structureThe new includes for the eBPF verifier and the
ubpf_context_t
structure definition are well-structured and necessary for integrating the verifier functionality. The context structure appropriately encapsulates the required fields for uBPF program execution.
Line range hint
1-524
: Overall assessment: Significant improvements with some areas for enhancementThe changes to
libfuzzer/libfuzz_harness.cc
significantly enhance the uBPF fuzzer by integrating eBPF verification and improving debugging capabilities. The addition of theubpf_context_t
structure and related functions provides a solid foundation for uBPF program execution and analysis.Main points of improvement:
- Implement or gracefully handle currently unimplemented functions.
- Refactor error handling in the
verify_bpf_byte_code
function to avoid using exceptions for control flow.- Enhance pointer arithmetic safety in the
ubpf_classify_address
function.- Implement a more flexible approach for enabling/disabling the debug function.
- Add error handling for context creation in
call_ubpf_interpreter
andcall_ubpf_jit
.- Improve error logging and reporting in the
LLVMFuzzerTestOneInput
function, especially for bytecode verification failures.Addressing these points will further improve the robustness, safety, and usability of the uBPF fuzzer. Overall, the changes represent a significant step forward in the project's capabilities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (2)
libfuzzer/libfuzz_harness.cc (2)
344-465
: Well-implemented address classification; consider documenting debug function plansThe
ubpf_classify_address
function is well-implemented and provides useful information for debugging. It correctly classifies memory addresses based on their location in the uBPF context.The
ubpf_debug_function
is comprehensive but currently disabled. While the comment indicates it relies on a feature not present in the main branch of the verifier, it would be helpful to add a TODO comment or issue tracking when this function is expected to be enabled.Consider adding a comment like:
// TODO: Enable this debug function once the required verifier feature is merged. // Track progress in issue #<issue_number>This will help other developers understand the status and future plans for this function.
Line range hint
474-552
: Consider adding error handling for context creationThe
ubpf_context_from
function is well-implemented and correctly sets up the context structure. Thecall_ubpf_interpreter
andcall_ubpf_jit
functions are properly updated to use the new context. However, there's no error handling for potential failures in context creation.Consider adding checks to ensure that the memory and stack vectors are not empty and that their data pointers are valid before creating the context. For example:
ubpf_context_t ubpf_context_from(std::vector<uint8_t>& memory, std::vector<uint8_t>& ubpf_stack) { if (memory.empty() || ubpf_stack.empty()) { throw std::runtime_error("Memory or stack is empty"); } ubpf_context_t context; context.data = reinterpret_cast<uint64_t>(memory.data()); context.data_end = context.data + memory.size(); context.stack_start = reinterpret_cast<uint64_t>(ubpf_stack.data()); context.stack_end = context.stack_start + ubpf_stack.size(); if (context.data == 0 || context.stack_start == 0) { throw std::runtime_error("Invalid memory or stack pointer"); } return context; }Then, in
call_ubpf_interpreter
andcall_ubpf_jit
, you can catch these exceptions and handle them appropriately.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (4)
- custom_tests/srcs/ubpf_test_debug_function.cc (1 hunks)
- libfuzzer/libfuzz_harness.cc (7 hunks)
- vm/inc/ubpf.h (1 hunks)
- vm/ubpf_vm.c (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- custom_tests/srcs/ubpf_test_debug_function.cc
- vm/inc/ubpf.h
🧰 Additional context used
📓 Learnings (1)
libfuzzer/libfuzz_harness.cc (2)
Learnt from: Alan-Jowett PR: Alan-Jowett/ubpf#114 File: libfuzzer/libfuzz_harness.cc:627-631 Timestamp: 2024-10-16T00:34:06.526Z Learning: In `verify_bpf_byte_code`, logging is avoided because the function is invoked tens of thousands of times per second during fuzzing, and emitting text would impair performance.
Learnt from: Alan-Jowett PR: Alan-Jowett/ubpf#114 File: libfuzzer/libfuzz_harness.cc:137-175 Timestamp: 2024-10-16T00:06:09.352Z Learning: In `verify_bpf_byte_code`, the use of exceptions for control flow is intentional because the caller only cares about whether verification succeeded or failed, not the reason why.
🔇 Additional comments (3)
libfuzzer/libfuzz_harness.cc (2)
13-60
: LGTM: New includes and structure definitionsThe new includes and structure definitions are well-organized and appropriate for integrating the eBPF verifier. The
ubpf_context_t
structure mirrors the XDP context, which is a good design choice. The global variablesg_ebpf_context_descriptor_ubpf
andg_ubpf_program_type
provide necessary information for the verifier.
Line range hint
554-615
: LGTM: Well-implemented input splitting functionThe
split_input
function is well-implemented with proper checks for input validity. It correctly handles the separation of program and memory data from the input buffer. The function includes appropriate checks for:
- Minimum input size
- Program length validity
- Proper alignment of program length
- Handling of empty memory
This implementation helps ensure that only valid inputs are processed, which is crucial for the fuzzer's effectiveness and efficiency.
vm/ubpf_vm.c (1)
685-696
: Enhancement of Debug Function InvocationThe addition of
shadow_registers
andshadow_stack
parameters tovm->debug_function
improves the debugging capabilities by providing extra context about the register and stack states. The implementation correctly checks if the debug function is enabled before invoking it, and the parameters are properly passed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (9)
libfuzzer/CMakeLists.txt (2)
11-14
: LGTM! Consider adding a comment for clarity.The addition of the
configure_file
command is a good practice for generating configuration files. It allows for dynamic configuration based on CMake variables.Consider adding a brief comment explaining the purpose of this generated file:
# Generate configuration file for libfuzzer configure_file( libfuzzer_config.h.inc "${CMAKE_CURRENT_BINARY_DIR}/libfuzzer_config.h" )
Line range hint
1-39
: Consider adding version checks for new dependencies.The changes to integrate GSL and ebpf-verifier are well-structured. To improve robustness, consider adding version checks for these new dependencies.
You could add something like this at the beginning of the file:
# Check for required dependencies find_package(GSL 2.0 REQUIRED) find_package(ebpfverifier REQUIRED) if(NOT GSL_FOUND) message(FATAL_ERROR "GSL not found. Please install GSL 2.0 or later.") endif() if(NOT ebpfverifier_FOUND) message(FATAL_ERROR "ebpfverifier not found. Please ensure it's properly installed.") endif()Adjust the version numbers and package names as appropriate for your project requirements.
cmake/options.cmake (1)
13-13
: LGTM! Consider adding a default value for consistency.The new option
UBPF_ENABLE_LIBFUZZER_CONSTRAINT_CHECK
is well-placed and follows the naming convention of other options. Its purpose is clear from the description.For consistency with other options in this file, consider specifying a default value (likely OFF). You could modify the line as follows:
option(UBPF_ENABLE_LIBFUZZER_CONSTRAINT_CHECK "Set to true to enable the libfuzzer constraint check" OFF)This makes it explicit that the option is disabled by default, which is typically the case for such feature flags.
libfuzzer/libfuzz_harness.cc (6)
41-62
: Approve: Global variables for context descriptor and program typeThe global variables
g_ebpf_context_descriptor_ubpf
andg_ubpf_program_type
are well-defined and serve important purposes:
- They provide essential information for the verifier to understand the context structure and program type.
- The use of global variables is appropriate as they represent constant configuration data.
These definitions enhance the flexibility and configurability of the uBPF environment.
Consider adding a brief comment above
g_ubpf_program_type
explaining why thesection_prefixes
field is empty, as this might not be immediately clear to other developers.
64-126
: Approve: Placeholder functions for program and map type handlingThe new functions
ubpf_get_program_type
,ubpf_get_map_type
,ubpf_get_helper_prototype
, andubpf_is_helper_usable
are well-structured placeholders:
- They provide a clear interface for future implementation of program type, map type, and helper function handling.
- The current minimal implementations are appropriate given that the fuzzer doesn't yet support maps or helper functions.
To improve maintainability:
- Consider adding TODO comments in each function body to remind developers to implement the full functionality when maps and helper functions are supported.
- For
ubpf_get_program_type
, consider adding a comment explaining why it always returnsg_ubpf_program_type
regardless of the input parameters.
186-200
: Approve: Platform abstraction for the verifierThe
g_ebpf_platform_ubpf_fuzzer
structure is well-defined and serves as an excellent abstraction layer:
- It provides a clean interface for the verifier to interact with the uBPF fuzzer platform.
- The use of function pointers allows for easy extension or modification of platform-specific behavior.
This abstraction enhances the modularity and flexibility of the system.
Consider adding a brief comment above the structure explaining its purpose and how it's used by the verifier. This would improve the code's self-documentation and help other developers understand the system's architecture more quickly.
259-312
: Approve: BPF bytecode verification functionThe
verify_bpf_byte_code
function is a crucial addition that enhances the safety of the fuzzer:
- It properly integrates the eBPF verifier to check the safety of the input programs.
- The function handles exceptions well, returning false for any verification failure.
- The use of
std::ostringstream
for error messages allows for detailed error information capture.Consider adding a logging mechanism to capture the error messages when verification fails. This could be valuable for debugging and improving the fuzzer. For example:
} catch (const std::exception& ex) { // Log the error message for debugging purposes std::cerr << "Verification failed: " << ex.what() << std::endl; return false; }However, ensure that this logging doesn't significantly impact the fuzzer's performance, especially if it's called frequently.
426-504
: Approve: Debug function for program executionThe
ubpf_debug_function
is a well-implemented debugging tool:
- It provides detailed information about register values and memory classifications during program execution.
- The use of the
UBPF_ENABLE_LIBFUZZER_CONSTRAINT_CHECK
macro allows for easy enabling/disabling of the debug functionality, which is crucial for performance in non-debug builds.To potentially improve performance when debugging is enabled:
- Consider using a more efficient data structure than
std::set
for storing constraints, such asstd::vector
orstd::array
, if the order of constraints is not important.- If possible, pre-allocate the
constraints
container to avoid frequent reallocations.Example:
std::vector<std::string> constraints; constraints.reserve(20); // Reserve space for an estimated number of constraintsThis could help reduce the overhead of the debug function, especially if it's called frequently during execution.
Line range hint
670-724
: Approve: Enhanced main fuzzer function with verificationThe updates to
LLVMFuzzerTestOneInput
significantly improve the fuzzer's robustness and error handling:
- The addition of the
verify_bpf_byte_code
step filters out invalid programs early, potentially saving execution time.- The new error handling provides more detailed information about failures, which is valuable for debugging.
These changes enhance the overall effectiveness and reliability of the fuzzer.
To further improve error reporting and debugging:
- Consider adding more context to the error messages. For example, include the size of the program or other relevant metadata when throwing exceptions.
- You might want to log the program code (or a hash of it) when a discrepancy between interpreter and JIT results is found. This could help reproduce and debug issues more easily.
Example:
if (interpreter_result != jit_result) { std::cerr << "Discrepancy found in program of size " << program.size() << " bytes" << std::endl; // Log program hash or first few bytes for identification throw std::runtime_error("interpreter_result != jit_result"); }These additions could provide valuable context for debugging without significantly impacting performance.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (6)
- .github/workflows/fuzzing.yml (3 hunks)
- .github/workflows/main.yml (1 hunks)
- cmake/options.cmake (1 hunks)
- libfuzzer/CMakeLists.txt (2 hunks)
- libfuzzer/libfuzz_harness.cc (10 hunks)
- libfuzzer/libfuzzer_config.h.inc (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- libfuzzer/libfuzzer_config.h.inc
🚧 Files skipped from review as they are similar to previous changes (1)
- .github/workflows/fuzzing.yml
🧰 Additional context used
📓 Learnings (1)
libfuzzer/libfuzz_harness.cc (2)
Learnt from: Alan-Jowett PR: Alan-Jowett/ubpf#114 File: libfuzzer/libfuzz_harness.cc:627-631 Timestamp: 2024-10-16T00:34:06.526Z Learning: In `verify_bpf_byte_code`, logging is avoided because the function is invoked tens of thousands of times per second during fuzzing, and emitting text would impair performance.
Learnt from: Alan-Jowett PR: Alan-Jowett/ubpf#114 File: libfuzzer/libfuzz_harness.cc:137-175 Timestamp: 2024-10-16T00:06:09.352Z Learning: In `verify_bpf_byte_code`, the use of exceptions for control flow is intentional because the caller only cares about whether verification succeeded or failed, not the reason why.
🔇 Additional comments (5)
libfuzzer/CMakeLists.txt (2)
25-25
: LGTM! New include directories properly added.The addition of new include directories for GSL, ebpf-verifier, and the current binary directory is correct and necessary for the integration of these components.
Note: The GSL include path now matches the one used in the
target_include_directories
command, addressing the inconsistency mentioned in a previous review comment.Also applies to: 29-32
39-39
: LGTM! ebpfverifier library correctly linked.The addition of the
ebpfverifier
library to the link libraries is consistent with the integration of the ebpf-verifier component and the previously added include directories.cmake/options.cmake (1)
13-13
: Verify usage of the new option in the build system and source code.The addition of
UBPF_ENABLE_LIBFUZZER_CONSTRAINT_CHECK
is a good enhancement for configuring the build process. To ensure its effectiveness:
- Please confirm that this option is properly utilized in the relevant CMake files (e.g., in conditional statements to control compilation or linking).
- Verify that the corresponding functionality is implemented in the source code when this option is enabled.
Run the following script to check for the usage of this new option:
This will help ensure that the new option is properly integrated into the project.
✅ Verification successful
UBPF_ENABLE_LIBFUZZER_CONSTRAINT_CHECK is properly integrated into the build system and source code.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check for usage of UBPF_ENABLE_LIBFUZZER_CONSTRAINT_CHECK in CMake files and source code # Search in CMake files echo "Searching for UBPF_ENABLE_LIBFUZZER_CONSTRAINT_CHECK in CMake files:" rg --type cmake "UBPF_ENABLE_LIBFUZZER_CONSTRAINT_CHECK" -A 5 # Search in C/C++ source files echo "Searching for references to libfuzzer constraint check in C/C++ files:" rg --type cpp "libfuzzer.*constraint.*check" -i -A 5Length of output: 1728
libfuzzer/libfuzz_harness.cc (2)
13-39
: Approve: New includes and context structureThe new includes and the
ubpf_context_t
structure are well-defined and serve clear purposes:
- The additional includes integrate the eBPF verifier and platform-specific configurations.
- The
ubpf_context_t
structure effectively models the context passed to BPF programs, similar to XDP contexts, which is crucial for proper program execution and verification.These additions lay a solid foundation for the enhanced functionality of the uBPF fuzzer.
Line range hint
1-724
: Summary: Significant enhancements to uBPF fuzzer harnessThis update to
libfuzzer/libfuzz_harness.cc
brings substantial improvements to the uBPF fuzzer:
- Integration of the eBPF verifier, enhancing program safety checks.
- Addition of new debugging capabilities, allowing for more detailed analysis during execution.
- Implementation of placeholder functions for future map and helper function support.
- Updates to the main fuzzer function to incorporate verification steps.
These changes significantly enhance the robustness, flexibility, and debugging capabilities of the fuzzer. The code is generally well-structured and documented, with only minor suggestions for improvements in error handling and documentation.
Overall, this update represents a substantial step forward in the capabilities of the uBPF fuzzer.
04f28e2
to
09829b7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (3)
libfuzzer/libfuzz_harness.cc (3)
Line range hint
591-603
: Prevent memory leak by freeingerror_message
after compilation failureIn
call_ubpf_jit
, ifubpf_compile_ex
fails, theerror_message
allocated by the function is not freed, leading to a memory leak.Apply this diff to free
error_message
after use:591 ubpf_context_t context = ubpf_context_from(memory, ubpf_stack); 592 char* error_message = nullptr; if (vm == nullptr) { // VM creation failed. return false; } auto fn = ubpf_compile_ex(vm.get(), &error_message, JitMode::ExtendedJitMode); if (fn == nullptr) { + std::string error_str = error_message; + free(error_message); throw std::runtime_error("Failed to compile program with error: " + error_str); } jit_result = fn(&context, 0, ubpf_stack.data(), ubpf_stack.size());
703-705
: Correct the comment to reflect interpreter executionThe comment in
LLVMFuzzerTestOneInput
incorrectly references the JIT when it should refer to the interpreter.Update the comment for clarity:
703 if (!call_ubpf_interpreter(program, memory, ubpf_stack, interpreter_result)) { -704 // Failed to load or execute the program in the JIT. +704 // Failed to load or execute the program in the interpreter. // This is not interesting, as the fuzzer input is invalid. return 0; }
Line range hint
628-630
: Avoid potential unaligned access when readingprogram_length
Casting
data
toconst uint32_t*
may cause unaligned memory access on some platforms. This can lead to undefined behavior.Modify the code to safely read
program_length
without risking unaligned access:628 uint32_t program_length; -629 program_length = *reinterpret_cast<const uint32_t*>(data); +629 std::memcpy(&program_length, data, sizeof(uint32_t));If endianness could be an issue, consider handling it explicitly using
ntohl
or similar functions.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
- .github/workflows/fuzzing.yml (4 hunks)
- libfuzzer/libfuzz_harness.cc (10 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- .github/workflows/fuzzing.yml
🧰 Additional context used
📓 Learnings (1)
libfuzzer/libfuzz_harness.cc (2)
Learnt from: Alan-Jowett PR: Alan-Jowett/ubpf#114 File: libfuzzer/libfuzz_harness.cc:627-631 Timestamp: 2024-10-16T00:34:06.526Z Learning: In `verify_bpf_byte_code`, logging is avoided because the function is invoked tens of thousands of times per second during fuzzing, and emitting text would impair performance.
Learnt from: Alan-Jowett PR: Alan-Jowett/ubpf#114 File: libfuzzer/libfuzz_harness.cc:137-175 Timestamp: 2024-10-16T00:06:09.352Z Learning: In `verify_bpf_byte_code`, the use of exceptions for control flow is intentional because the caller only cares about whether verification succeeded or failed, not the reason why.
b2316bd
to
0e15c81
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 8
🧹 Outside diff range and nitpick comments (7)
libfuzzer/libfuzz_harness.cc (7)
21-24
: Consider alternative to macro renaming.While the current approach of renaming
ebpf_inst
toebpf_inst_ubpf
works, it might make the code harder to follow. Consider using namespaces or more distinct naming conventions to avoid conflicts without relying on macro redefinition.
30-62
: Well-structured context and program type definitions.The
ubpf_context_t
structure and the global descriptorsg_ebpf_context_descriptor_ubpf
andg_ubpf_program_type
are well-defined and properly documented. They provide necessary context for eBPF program verification.Consider adding
const
to the global variables to ensure they're not accidentally modified:const ebpf_context_descriptor_t g_ebpf_context_descriptor_ubpf = { ... }; const EbpfProgramType g_ubpf_program_type = { ... };
64-93
: Program and map type retrieval functions look good.The
ubpf_get_program_type
function is correctly implemented. Theubpf_get_map_type
function is appropriately left as a placeholder for future implementation.Consider adding a TODO comment in
ubpf_get_map_type
to make it clear that this needs to be implemented when map support is added:EbpfMapType ubpf_get_map_type(uint32_t platform_specific_type) { // TODO: Implement this function when map support is added to the fuzzer. // It should return metadata about the map, primarily the key and value size. UNREFERENCED_PARAMETER(platform_specific_type); return {}; }
95-126
: Helper function handling placeholders look good.Both
ubpf_get_helper_prototype
andubpf_is_helper_usable
are appropriately implemented as placeholders. The current implementation ofubpf_is_helper_usable
always returning false is a safe default.Consider adding TODO comments to both functions to make it clear they need to be implemented when helper function support is added:
EbpfHelperPrototype ubpf_get_helper_prototype(int32_t n) { // TODO: Implement this function when helper function support is added to the fuzzer. // It should return metadata about the helper function. UNREFERENCED_PARAMETER(n); return {}; } bool ubpf_is_helper_usable(int32_t n) { // TODO: Implement this function when helper function support is added to the fuzzer. // It should return whether the helper function is usable. UNREFERENCED_PARAMETER(n); return false; }
186-200
: Well-structured platform abstraction.The
g_ebpf_platform_ubpf_fuzzer
global variable provides a clean interface for the platform-specific functions. It's well-structured and correctly uses the previously defined functions.Consider adding
const
to ensure this global variable isn't accidentally modified:const ebpf_platform_t g_ebpf_platform_ubpf_fuzzer = { // ... existing initializers ... };
Line range hint
582-606
: Add error handling for context creation incall_ubpf_jit
.The modifications to
call_ubpf_jit
appropriately integrate the newubpf_context_t
structure. However, similar to the interpreter function, there's no error handling for potential failures in context creation.Consider adding error checking after creating the context:
auto context = ubpf_context_from(memory, ubpf_stack); if (!context) { // Handle error, e.g., return false or throw an exception return false; } auto fn = ubpf_compile_ex(vm.get(), &error_message, JitMode::ExtendedJitMode); // ... (rest of the function)This addition would improve the robustness of the function by handling potential context creation failures.
Line range hint
677-699
: Consider enhancing error handling for verification failures.The addition of the verification step before execution is a good improvement. However, the current implementation simply returns 0 when verification fails, which might miss opportunities to gather useful information about the failure.
Consider the following improvements:
- Log the reason for verification failure (if available from the
verify_bpf_byte_code
function).- Optionally, add a counter or metric to track the number of programs that fail verification.
Example:
if (!verify_bpf_byte_code(program)) { // Log verification failure reason if available // std::cerr << "Verification failed: " << get_verification_error() << std::endl; // Optionally, increment a counter // verification_failure_count++; return 0; }This would provide more insight into why programs are failing verification, which could be valuable for improving the fuzzer or identifying patterns in invalid programs.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (12)
- .github/workflows/fuzzing.yml (6 hunks)
- .github/workflows/main.yml (2 hunks)
- .gitmodules (1 hunks)
- CMakeLists.txt (1 hunks)
- cmake/options.cmake (1 hunks)
- custom_tests/srcs/ubpf_test_debug_function.cc (1 hunks)
- external/ebpf-verifier (1 hunks)
- libfuzzer/CMakeLists.txt (2 hunks)
- libfuzzer/libfuzz_harness.cc (10 hunks)
- libfuzzer/libfuzzer_config.h.inc (1 hunks)
- vm/inc/ubpf.h (1 hunks)
- vm/ubpf_vm.c (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (10)
- .github/workflows/fuzzing.yml
- .github/workflows/main.yml
- .gitmodules
- CMakeLists.txt
- cmake/options.cmake
- custom_tests/srcs/ubpf_test_debug_function.cc
- external/ebpf-verifier
- libfuzzer/CMakeLists.txt
- libfuzzer/libfuzzer_config.h.inc
- vm/ubpf_vm.c
🧰 Additional context used
📓 Learnings (1)
libfuzzer/libfuzz_harness.cc (2)
Learnt from: Alan-Jowett PR: Alan-Jowett/ubpf#114 File: libfuzzer/libfuzz_harness.cc:627-631 Timestamp: 2024-10-16T00:34:06.526Z Learning: In `verify_bpf_byte_code`, logging is avoided because the function is invoked tens of thousands of times per second during fuzzing, and emitting text would impair performance.
Learnt from: Alan-Jowett PR: Alan-Jowett/ubpf#114 File: libfuzzer/libfuzz_harness.cc:137-175 Timestamp: 2024-10-16T00:06:09.352Z Learning: In `verify_bpf_byte_code`, the use of exceptions for control flow is intentional because the caller only cares about whether verification succeeded or failed, not the reason why.
🔇 Additional comments (3)
vm/inc/ubpf.h (1)
Line range hint
570-576
: Documentation updates forubpf_debug_fn
are clear and consistentThe documentation for the
ubpf_debug_fn
typedef has been appropriately updated to include descriptions for the newregister_mask
andstack_mask_start
parameters. These changes are consistent with the typedef modifications and provide clear explanations of the new parameters' purposes.libfuzzer/libfuzz_harness.cc (2)
13-18
: New includes look good.The added includes for libfuzzer configuration and eBPF verification components are appropriate for the new functionality being introduced.
Line range hint
1-735
: Overall, good improvements with some areas for enhancement.The changes to
libfuzzer/libfuzz_harness.cc
significantly enhance the eBPF program fuzzing capabilities by introducing verification, improved context handling, and debugging support. The integration with the ebpf-verifier library is well-implemented, and the new structures and functions provide a solid foundation for future development.Main points for improvement:
- Enhance error handling in placeholder functions for map support.
- Improve safety in pointer arithmetic and address classification.
- Consider more flexible approaches for enabling/disabling debug functions.
- Add more robust error handling for context creation and verification failures.
- Fix the error message concatenation in the
capture_printf
function.These enhancements will improve the overall robustness and maintainability of the code. Great work on the implementation, and addressing these points will further strengthen the fuzzer's capabilities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
🧹 Outside diff range and nitpick comments (8)
.github/workflows/update-docs.yml (1)
Line range hint
28-28
: Follow up on the TODO comment foregress-policy
.The comment suggests changing the
egress-policy
from 'audit' to 'block' after a couple of runs. It's important to follow up on this TODO to enhance the security of the workflow.Consider updating the
egress-policy
to 'block' if sufficient runs have been completed without issues. If you decide to make this change, update the comment as well..github/workflows/scorecards.yml (1)
Line range hint
39-74
: Summary of GitHub Actions version updatesThis pull request updates three GitHub Actions to their latest versions:
actions/checkout
: v4.2.0 → v4.2.1 (patch update)actions/upload-artifact
: v3.1.3 → v4.4.3 (major update)github/codeql-action/upload-sarif
: v3.26.12 → v3.26.13 (patch update)These updates are generally beneficial for security and stability. However, the major version update for
actions/upload-artifact
requires additional attention to ensure compatibility with your workflow.To maintain the stability and security of your CI/CD pipeline:
- Regularly review and update GitHub Actions versions.
- For major version updates, always check the changelog and test the workflow thoroughly before merging.
- Consider implementing a process for automated updates of patch and minor versions, with manual review for major version changes.
.github/workflows/windows.yml (5)
54-54
: Approved: actions/checkout version updateUpdating the actions/checkout version to v4.2.1 is a good practice. It ensures you're using the latest features and security improvements.
Consider using the major version tag (v4) instead of the specific patch version. This allows for automatic minor updates while maintaining compatibility:
- uses: actions/checkout@v4
60-60
: Approved: github/codeql-action/init version updateUpdating the github/codeql-action/init to a specific commit hash ensures reproducibility of your workflow.
Consider using a tagged version instead of a specific commit hash. This approach balances reproducibility with easier maintenance:
uses: github/codeql-action/init@v2This way, you'll automatically get compatible updates within the major version.
111-111
: Approved: actions/upload-artifact version updateUpdating the actions/upload-artifact to a specific commit hash ensures reproducibility of your workflow.
Consider using a tagged version instead of a specific commit hash for easier maintenance:
uses: actions/upload-artifact@v3This approach allows you to automatically receive compatible updates within the major version while maintaining stability.
119-119
: Approved: github/codeql-action/analyze version updateUpdating the github/codeql-action/analyze to a specific commit hash ensures reproducibility of your workflow.
For consistency with the github/codeql-action/init action, consider using the same version approach:
uses: github/codeql-action/analyze@v2This ensures that both CodeQL actions are using compatible versions and simplifies future updates.
Line range hint
1-119
: Overall improvements to the Windows workflowThe changes made to this workflow file significantly enhance its flexibility and maintainability:
- Input parameters are now optional with default values, allowing for more versatile workflow execution.
- Action versions have been updated, improving security and potentially adding new features.
These modifications align well with the PR objectives of improving the build configuration and updating dependencies. The workflow is now more robust and easier to maintain.
To further improve the workflow's maintainability:
- Consider using major version tags for actions where possible (e.g.,
v4
instead ofv4.2.1
).- Ensure consistency in versioning approach across related actions (e.g., CodeQL init and analyze).
- Regularly review and update action versions to benefit from the latest improvements and security patches.
vm/inc/ubpf.h (1)
582-593
: Clarify the documentation forstack_mask_start
parameterIn the
ubpf_debug_fn
typedef, thestack_mask_start
parameter is described as a bitmask representing modified bytes of the stack. To improve clarity:
- Specify the size of the stack mask and how it relates to
stack_length
.- Clarify whether
stack_mask_start
points to a bitmask array or a specific memory location.- Consider renaming
stack_mask_start
tostack_mask
for consistency, unless the suffix_start
is necessary to indicate the beginning of a buffer.This will help users implement the debug function correctly and prevent potential misunderstandings.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (9)
- .github/workflows/dependency-review.yml (1 hunks)
- .github/workflows/fuzzing.yml (10 hunks)
- .github/workflows/main.yml (3 hunks)
- .github/workflows/posix.yml (5 hunks)
- .github/workflows/scorecards.yml (2 hunks)
- .github/workflows/update-docs.yml (1 hunks)
- .github/workflows/windows.yml (3 hunks)
- libfuzzer/libfuzz_harness.cc (10 hunks)
- vm/inc/ubpf.h (2 hunks)
✅ Files skipped from review due to trivial changes (1)
- .github/workflows/posix.yml
🚧 Files skipped from review as they are similar to previous changes (2)
- .github/workflows/fuzzing.yml
- .github/workflows/main.yml
🧰 Additional context used
📓 Learnings (1)
libfuzzer/libfuzz_harness.cc (2)
Learnt from: Alan-Jowett PR: Alan-Jowett/ubpf#114 File: libfuzzer/libfuzz_harness.cc:627-631 Timestamp: 2024-10-16T00:34:06.526Z Learning: In `verify_bpf_byte_code`, logging is avoided because the function is invoked tens of thousands of times per second during fuzzing, and emitting text would impair performance.
Learnt from: Alan-Jowett PR: Alan-Jowett/ubpf#114 File: libfuzzer/libfuzz_harness.cc:137-175 Timestamp: 2024-10-16T00:06:09.352Z Learning: In `verify_bpf_byte_code`, the use of exceptions for control flow is intentional because the caller only cares about whether verification succeeded or failed, not the reason why.
🔇 Additional comments (14)
.github/workflows/dependency-review.yml (1)
25-25
: LGTM! Updating to the latest patch version is a good practice.The update of
actions/checkout
from v4.2.0 to v4.2.1 is a positive change. It's always recommended to keep actions up-to-date to benefit from the latest bug fixes and security improvements.Let's verify if this is indeed the latest version of the
actions/checkout
action:This script will help us ensure that the project is using the most recent version of the action.
✅ Verification successful
Verification Successful: The workflow is using the latest version of
actions/checkout
.No further action is needed.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check if v4.2.1 is the latest version of actions/checkout # Fetch the latest release version latest_version=$(gh api repos/actions/checkout/releases/latest --jq .tag_name) echo "Latest version of actions/checkout: $latest_version" echo "Version used in the workflow: v4.2.1" if [ "$latest_version" = "v4.2.1" ]; then echo "The workflow is using the latest version." else echo "A newer version is available. Consider updating to $latest_version." fiLength of output: 384
.github/workflows/scorecards.yml (3)
39-39
: LGTM: Updated actions/checkout to the latest patch version.Updating the
actions/checkout
action from v4.2.0 to v4.2.1 is a good practice. This minor version update likely includes bug fixes or small improvements, enhancing the stability and security of your workflow.
74-74
: LGTM: Updated github/codeql-action/upload-sarif to the latest patch version.Updating the
github/codeql-action/upload-sarif
action from v3.26.12 to v3.26.13 is a good practice. This patch version update likely includes bug fixes or small improvements, enhancing the stability and security of your workflow.
66-66
: Verify compatibility with the updated actions/upload-artifact version.The update from
actions/[email protected]
tov4.4.3
is a major version change. While keeping actions up-to-date is crucial for security and feature improvements, major version updates may introduce breaking changes.Please ensure that:
- The new version is compatible with your workflow.
- Any new features or changes in behavior are understood and desired.
- The changelog has been reviewed for any breaking changes or deprecations.
To verify the impact of this change, you can run the following command:
✅ Verification successful
All instances of
actions/upload-artifact
have been successfully updated to versionv4.4.3
across all workflow files.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check for any changes in the artifact upload step that might be affected by the new version. # Test: Look for uses of upload-artifact in all workflow files rg --type yaml 'uses:\s*actions/upload-artifact@' .github/workflowsLength of output: 1054
.github/workflows/windows.yml (1)
23-25
: Improved flexibility in workflow executionThe changes to the input parameters
upload_packages
,build_codeql
, anddisable_retpolines
are well-implemented. By making these parameters optional with default values set tofalse
, the workflow becomes more flexible and easier to use in different scenarios. This aligns well with the PR objectives of improving the build configuration.Also applies to: 29-30, 34-35
libfuzzer/libfuzz_harness.cc (8)
13-18
: LGTM: New includes added for necessary functionality.The new includes seem appropriate for the added functionality. The
libfuzzer_config.h
likely contains configuration specific to this fuzzer, while the other includes are related to the eBPF verifier and platform.
30-62
: LGTM: Well-structured context and program type definitions.The
ubpf_context_t
structure andg_ebpf_context_descriptor_ubpf
descriptor are well-defined and documented. Theg_ubpf_program_type
provides a clear description of the program type for the verifier.
64-93
: LGTM: Placeholder functions for program and map type retrieval.The
ubpf_get_program_type
andubpf_get_map_type
functions are implemented as placeholders. They correctly useUNREFERENCED_PARAMETER
to avoid compiler warnings for unused parameters.
95-126
: LGTM: Placeholder functions for helper function handling.The
ubpf_get_helper_prototype
andubpf_is_helper_usable
functions are implemented as placeholders. They correctly useUNREFERENCED_PARAMETER
to avoid compiler warnings for unused parameters.
186-200
: LGTM: Well-defined platform abstraction for the verifier.The
g_ebpf_platform_ubpf_fuzzer
structure is well-defined and includes all necessary function pointers for the verifier to interact with the uBPF fuzzer platform.
428-517
: LGTM: Well-implemented debug function with conditional compilation.The
ubpf_debug_function
is well-implemented and includes detailed constraint checking whenUBPF_ENABLE_LIBFUZZER_CONSTRAINT_CHECK
is defined. The use of conditional compilation allows for easy enabling/disabling of the debug functionality.
Line range hint
611-635
: LGTM: JIT compilation and execution well-implemented.The
call_ubpf_jit
function correctly handles JIT compilation and execution of the eBPF program. Error handling for compilation failures is appropriately implemented.
724-728
: LGTM: Verification step added before execution.The addition of the
verify_bpf_byte_code
step before execution is a good improvement. It ensures that only valid programs are executed, enhancing the security and reliability of the fuzzer.vm/inc/ubpf.h (1)
510-517
: Approve the addition ofubpf_bounds_check
typedefThe new function type
ubpf_bounds_check
and its accompanying documentation are correctly defined and align well with the existing codebase. This addition enhances the VM's capability to perform bounds checking on memory accesses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (12)
vm/inc/ubpf.h (4)
583-588
: Enhanced debugging capabilities with register and stack modification trackingThe
ubpf_debug_fn
typedef has been updated to include two new parameters:
register_mask
: A bitmask indicating which registers have been modified.stack_mask_start
: A bitmask showing which parts of the stack have been modified.These additions provide more detailed information during debugging, allowing for better tracking of program state changes. However, this is a breaking change that will require updates to any existing code using this typedef.
Consider providing a backward-compatible version of the debug function registration to ease migration for existing users.
605-606
: Improved error handling inubpf_register_debug_fn
The
ubpf_register_debug_fn
function now returns anint
instead ofvoid
, allowing for error handling:
- Returns 0 on success
- Returns -1 on failure
This change improves the API's robustness by enabling callers to check if the debug function was registered successfully.
Consider updating the function's documentation to explain the possible return values and their meanings.
510-518
: Approved: New memory bounds checking functionalityThe addition of the
ubpf_bounds_check
typedef andubpf_register_data_bounds_check
function enhances the library's security and flexibility:
ubpf_bounds_check
typedef allows for custom memory access validation.ubpf_register_data_bounds_check
function enables registering a custom bounds check function.These additions are well-designed and consistent with the existing API structure.
Consider adding more detailed documentation for the
ubpf_register_data_bounds_check
function, including:
- Examples of when and how to use this feature
- Best practices for implementing a bounds check function
- Any performance implications of using custom bounds checking
Also applies to: 525-533
359-359
: Improved consistency in return value documentationThe return value documentation for several functions has been updated to use
@retval
instead of@return
. This change:
- Improves consistency in the documentation style across the header file.
- Makes the documentation clearer and easier to parse for documentation generation tools.
- Allows for more precise documentation of different return values and their meanings.
Consider reviewing the entire header file to ensure all functions use the
@retval
style for return value documentation, maintaining consistency throughout the API.Also applies to: 378-378, 490-490, 505-505, 556-557, 569-570
libfuzzer/libfuzz_harness.cc (6)
33-65
: LGTM: New context structure and descriptors addedThe new
ubpf_context_t
structure and related descriptors (g_ebpf_context_descriptor_ubpf
andg_ubpf_program_type
) are well-implemented and crucial for the verifier's understanding of memory layout and program type.Consider adding a brief comment explaining the significance of the
meta
field ing_ebpf_context_descriptor_ubpf
being set to -1.
75-129
: Placeholder functions added for program and map type handlingThe new functions (
ubpf_get_program_type
,ubpf_get_map_type
,ubpf_get_helper_prototype
,ubpf_is_helper_usable
) are well-documented placeholders for future implementation.Consider adding TODO comments in these functions to remind about future implementation needs. For example:
EbpfMapType ubpf_get_map_type(uint32_t platform_specific_type) { // TODO: Implement this function to return metadata about the map, primarily the key and value size. UNREFERENCED_PARAMETER(platform_specific_type); return {}; }
141-187
: Placeholder functions added for map handlingThe new functions (
ubpf_parse_maps_section
,ubpf_resolve_inner_map_references
,ubpf_get_map_descriptor
) are correctly implemented to throw runtime errors, which is consistent with the goal of failing verification if a map is present.Consider enhancing the error messages to provide more context. For example:
throw std::runtime_error("ubpf_parse_maps_section not implemented: map support not yet available in the fuzzer");This would make it clearer why these functions are not implemented and help with future debugging or development.
273-330
: LGTM: BPF bytecode verification function addedThe
verify_bpf_byte_code
function is a crucial addition for program validation. It correctly uses the ebpf-verifier library and handles exceptions appropriately.Consider logging the exception message when verification fails due to an exception. This could provide valuable information for debugging:
} catch (const std::exception& ex) { // Log the exception message if needed // std::cerr << "Verification failed with exception: " << ex.what() << std::endl; return false; }However, only uncomment this if it doesn't impact the performance of the fuzzer, as per the previous discussions about avoiding excessive logging.
436-529
: LGTM: Debug function added with constraint checkingThe
ubpf_debug_function
is a well-implemented debugging tool that provides valuable information about register values and their classifications. It's correctly guarded by theUBPF_ENABLE_LIBFUZZER_CONSTRAINT_CHECK
flag.Consider extracting the constraint building logic into a separate function to improve readability and maintainability. For example:
std::set<std::string> build_constraints(const ubpf_context_t* context, const uint64_t registers[16], uint64_t register_mask) { std::set<std::string> constraints; // ... (constraint building logic) ... return constraints; } // In ubpf_debug_function: std::set<std::string> constraints = build_constraints(ubpf_context, registers, register_mask);This would make the
ubpf_debug_function
more focused and easier to understand.
Line range hint
590-667
: LGTM: Improved interpreter and JIT execution functionsThe modifications to
call_ubpf_interpreter
andcall_ubpf_jit
functions enhance their robustness by using the newubpf_context_t
structure and improving error handling. The addition of the bounds check function is a valuable safety feature.For consistency, consider using the same error handling approach in both functions. For example, in
call_ubpf_jit
, you could use a similar pattern as incall_ubpf_interpreter
:if (fn == nullptr) { // Check if the error is being suppressed by one of the known error messages regex. for (const auto& error_message : g_error_message_to_ignore) { if (std::regex_search(g_error_message, std::regex(error_message))) { return false; } } throw std::runtime_error("Failed to compile program with error: " + std::string(error_message)); }This would provide a consistent approach to error handling across both functions.
vm/ubpf_vm.c (2)
695-707
: Enhanced debugging capabilities with additional context.The addition of the debug function call with extended parameters significantly improves the VM's debugging capabilities. By providing information about shadow registers and the shadow stack, it allows for more comprehensive tracking of the VM's state during execution.
This change is well-implemented and will be valuable for debugging complex eBPF programs.
Consider adding a brief comment explaining the purpose of each parameter in the debug function call, especially for
shadow_registers
andshadow_stack
. This would improve code readability and make it easier for other developers to understand the debug function's capabilities.
565-570
: Improved error reporting with consistent method and additional context.The changes to error reporting in
ubpf_exec_ex
are beneficial:
- Use of
vm->error_printf
instead of directfprintf
calls ensures consistency and allows for potential customization of error output.- Inclusion of the program counter (pc) in error messages provides valuable context for debugging.
These improvements will make it easier to identify and diagnose issues in eBPF programs.
For consistency, consider updating all error messages in this function to include the pc value. For example, the message at line 670-671 could be modified to include the pc:
vm->error_printf(stderr, "Error: Instruction limit exceeded at PC %d.\n", cur_pc);This would provide consistent context across all error messages in the function.
Also applies to: 594-596, 670-671
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (4)
- .gitignore (1 hunks)
- libfuzzer/libfuzz_harness.cc (9 hunks)
- vm/inc/ubpf.h (9 hunks)
- vm/ubpf_vm.c (7 hunks)
✅ Files skipped from review due to trivial changes (1)
- .gitignore
🧰 Additional context used
📓 Learnings (1)
libfuzzer/libfuzz_harness.cc (1)
Learnt from: Alan-Jowett PR: Alan-Jowett/ubpf#114 File: libfuzzer/libfuzz_harness.cc:573-594 Timestamp: 2024-10-16T21:10:37.964Z Learning: In this codebase, if execution of the program fails, unhandled exceptions are thrown to indicate fatal bugs. Therefore, explicit error handling for context creation in `call_ubpf_interpreter` is not necessary.
🔇 Additional comments (7)
vm/inc/ubpf.h (1)
595-597
: Approved: Logical parameter ordering inubpf_debug_fn
The changes to the
ubpf_debug_fn
function parameters are well-structured:
- The
stack_length
parameter has been moved to group it with other stack-related parameters.- New parameters (
register_mask
andstack_mask_start
) are added at the end, maintaining compatibility with the C calling convention.This ordering is logical and makes the function signature easier to understand and use.
libfuzzer/libfuzz_harness.cc (5)
10-32
: LGTM: New includes and global variables addedThe new includes and global variables appear to be necessary for the enhanced functionality of handling eBPF programs. The additions seem appropriate and well-placed.
189-202
: LGTM: Platform abstraction structure addedThe
g_ebpf_platform_ubpf_fuzzer
structure is well-implemented and correctly initialized with the previously defined functions. This provides a clear interface for the verifier to interact with the uBPF fuzzer platform.
252-271
: Bug fix: Correct implementation of capture_printfThe modification to
capture_printf
fixes a critical bug where the format string was being appended directly instead of the formatted output. The new implementation correctly usesvsnprintf
to capture the formatted output.Great job on catching and fixing this bug. This will ensure that error messages are correctly formatted and captured.
Line range hint
733-789
: LGTM: Enhanced fuzzing process with verification and improved executionThe modifications to
LLVMFuzzerTestOneInput
significantly improve the robustness of the fuzzing process:
- The addition of the
verify_bpf_byte_code
call before execution enhances safety.- Using separate memory for interpreter and JIT execution ensures consistency.
- Improved error handling provides better debugging capabilities.
These changes will lead to more effective and reliable fuzzing of uBPF programs.
Line range hint
1-789
: Overall: Significant enhancements to uBPF handling and verificationThis pull request introduces substantial improvements to the
libfuzzer/libfuzz_harness.cc
file, enhancing its capabilities in handling and verifying uBPF programs. Key improvements include:
- Introduction of a new context structure (
ubpf_context_t
) and related descriptors for better memory management.- Addition of a robust verification step (
verify_bpf_byte_code
) before program execution.- Enhanced error handling and debugging capabilities.
- Improved consistency between interpreter and JIT execution.
- Placeholder functions for future map support, correctly failing verification when maps are present.
These changes significantly boost the robustness and effectiveness of the uBPF fuzzer. The code is well-structured, properly documented, and shows careful consideration of error cases and edge conditions.
Great work on these enhancements! They will undoubtedly lead to more effective and reliable fuzzing of uBPF programs.
vm/ubpf_vm.c (1)
Line range hint
473-533
: Improved register initialization tracking and error reporting.The changes to
ubpf_validate_shadow_register
enhance the function's ability to track uninitialized registers and provide more informative error messages by including the program counter (pc) in the output. This is a valuable improvement for debugging.However, there's a notable change in the EBPF_CLS_STX case:
src_register_required = (inst.dst != BPF_REG_10); // Writing uninitialized memory to the stack is allowed.This allows writing uninitialized memory to the stack (when dst is BPF_REG_10). While this might be intentional, it could potentially lead to issues if not handled carefully in other parts of the code.
To ensure this change doesn't introduce unintended consequences, please run the following verification:
✅ Verification successful
Verified: The usage of
BPF_REG_10
is consistent and controlled across the codebase. Allowing writes to it in theEBPF_CLS_STX
case aligns with its role as the stack pointer and does not introduce unintended consequences.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Check for any other occurrences of BPF_REG_10 usage rg "BPF_REG_10" --type cLength of output: 659
330ae5e
to
72eed31
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (3)
libfuzzer/libfuzz_harness.cc (2)
Line range hint
650-659
: Avoid memory leak by freeingerror_message
after JIT compilation failureIn
call_ubpf_jit
, ifubpf_compile_ex
fails, theerror_message
allocated by the uBPF library is not freed, leading to a memory leak.Modify the code to free
error_message
before throwing the exception:if (fn == nullptr) { std::string error_str = error_message; + free(error_message); throw std::runtime_error("Failed to compile program with error: " + error_str); }
743-743
: Initialize the stack to prevent undefined behaviorThe
ubpf_stack
is allocated but not initialized, which may contain indeterminate values. This could lead to unpredictable behavior during program execution.Initialize the stack with zeros:
std::vector<uint8_t> program; std::vector<uint8_t> memory; std::vector<uint8_t> ubpf_stack(3 * 4096); +std::fill(ubpf_stack.begin(), ubpf_stack.end(), 0); g_error_message = "";
vm/ubpf_vm.c (1)
670-670
: Improve Error Message for Instruction Limit ExceededThe error message for exceeding the instruction limit could be more informative by including the actual limit value. This would aid in debugging and understanding the context of the error.
Apply this diff to enhance the error message:
- vm->error_printf(stderr, "Error: Instruction limit exceeded.\n"); + vm->error_printf(stderr, "Error: Instruction limit of %u exceeded.\n", vm->instruction_limit);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
- .gitignore (0 hunks)
- libfuzzer/libfuzz_harness.cc (9 hunks)
- vm/ubpf_vm.c (7 hunks)
💤 Files with no reviewable changes (1)
- .gitignore
🧰 Additional context used
📓 Learnings (1)
libfuzzer/libfuzz_harness.cc (1)
Learnt from: Alan-Jowett PR: Alan-Jowett/ubpf#114 File: libfuzzer/libfuzz_harness.cc:573-594 Timestamp: 2024-10-16T21:10:37.964Z Learning: In this codebase, if execution of the program fails, unhandled exceptions are thrown to indicate fatal bugs. Therefore, explicit error handling for context creation in `call_ubpf_interpreter` is not necessary.
🔇 Additional comments (4)
vm/ubpf_vm.c (4)
690-692
: Consistent Invocation ofubpf_validate_shadow_register
inubpf_exec_ex
The function
ubpf_validate_shadow_register
is now being called before each instruction execution, ensuring that register initialization is validated consistently throughout the execution loop. This enhancement addresses previous concerns about potential use of uninitialized registers and improves the robustness of the virtual machine.
500-503
: Correct Handling of Uninitialized Memory Writes to StackIn the handling of store indirect instructions (
EBPF_CLS_STX
), the code correctly allows writing uninitialized memory to the stack by not requiring the source register to be initialized when the destination register isBPF_REG_10
(the stack pointer). This behavior is acceptable as writing uninitialized data to the stack is permissible in this context.
524-533
: Accurate Register Initialization inMOV
InstructionThe logic for the
MOV
instruction correctly updates the initialization state of the destination register based on the source register's state. If the source register is not initialized, the destination register is also marked as uninitialized, preventing the use of uninitialized registers in subsequent operations.
695-706
: Enhanced Debug Function Invocation for Improved State InspectionThe
vm->debug_function
is now invoked with additional parameters, includingshadow_registers
andshadow_stack
, providing more detailed state information during execution. This enhancement facilitates better debugging and analysis of the VM's behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (8)
libfuzzer/CMakeLists.txt (3)
25-30
: Consider removing commented-out codeThese lines of commented-out code appear to be obsolete, as they've been replaced by a more specific check later in the file. To maintain code cleanliness and prevent confusion, it's generally better to remove unused code rather than commenting it out.
Consider removing these commented lines entirely:
-# include(CheckCXXSymbolExists) - -set(CMAKE_REQUIRED_INCLUDES ${UBPF_FUZZER_INCLUDES}) - -# check_cxx_symbol_exists(ebpf_verifier_options_t::store_pre_invariants "config.hpp" HAVE_EBPF_CHECK_CONSTRAINTS_AT_LABEL)
31-42
: LGTM: Improved check for ebpf_verifier_options_t::store_pre_invariantsThe addition of this specific C++ source compilation check is a good improvement over the previous commented-out version. It provides a more robust way to verify the presence of
ebpf_verifier_options_t::store_pre_invariants
.For consistency with the commented-out version, consider renaming the result variable:
-check_cxx_source_compiles("${CHECK_CONFIG_STORE_PRE_INVARIANTS}" HAVE_EBPF_VERIFIER_CHECK_CONSTRAINTS_AT_LABEL) +check_cxx_source_compiles("${CHECK_CONFIG_STORE_PRE_INVARIANTS}" HAVE_EBPF_CHECK_CONSTRAINTS_AT_LABEL)This change would align the variable name with the previously used naming convention.
46-49
: LGTM: Configuration file generation addedThe addition of the
configure_file
command to generatelibfuzzer_config.h
is a good practice for managing configuration files. This aligns well with the PR objectives of improving build configuration.For improved clarity, consider adding a comment explaining the purpose of this configuration file:
# Generate configuration file with build-specific settings configure_file( libfuzzer_config.h.inc "${CMAKE_CURRENT_BINARY_DIR}/libfuzzer_config.h" )libfuzzer/libfuzz_harness.cc (1)
Line range hint
742-764
: Handle Potential Unaligned Memory Access insplit_input
The use of
reinterpret_cast<const uint32_t*>(data)
may cause unaligned memory access ifdata
is not properly aligned. Unaligned access can lead to undefined behavior on architectures that do not support it.Consider using
memcpy
to safely readprogram_length
:uint32_t program_length; std::memcpy(&program_length, data, sizeof(uint32_t));This approach ensures portability and avoids potential issues with unaligned memory access.
vm/ubpf_vm.c (4)
716-726
: Optimize Debug Function InvocationThe debug function is conditionally invoked within the execution loop. To optimize performance, especially in tight loops, consider moving the
vm->debug_function
check outside the loop if possible or using likely/unlikely macros to aid branch prediction.Example optimization:
- if (vm->debug_function) { + if (unlikely(vm->debug_function)) { vm->debug_function( vm->debug_function_context, cur_pc, reg, stack_start, stack_length, shadow_registers, (uint8_t*)shadow_stack ); }Note: Use of
unlikely
macro (if available) indicates to the compiler that the condition is rarely true.
Line range hint
690-725
: Avoid Potential Memory Leak ofshadow_stack
In the
ubpf_exec_ex
function, ifshadow_stack
allocation fails,return_value
is set to-1
, and the function jumps tocleanup
. However, ifshadow_stack
is non-null, it is freed incleanup
. To prevent potential issues, ensureshadow_stack
is properly handled in all code paths.Apply this diff to initialize
shadow_stack
toNULL
and adjust the cleanup logic:void* shadow_stack = NULL; if (vm->undefined_behavior_check_enabled) { shadow_stack = calloc(stack_length / 8, 1); if (!shadow_stack) { return_value = -1; goto cleanup; } } // ... cleanup: - if (shadow_stack) { + if (vm->undefined_behavior_check_enabled && shadow_stack) { free(shadow_stack); } return return_value;
Line range hint
620-626
: Correct Register Mask When Unwinding StackLines 620-626 handle register initialization when an external function is called. Ensure that the
shadow_registers
bitmask correctly reflects the initialization state of registersr1
tor5
after the call, as they may be modified by the external function.Consider updating the
shadow_registers
to markr1
tor5
as potentially uninitialized or initialized, depending on the expected behavior.
Line range hint
891-911
: Validate Bounds Check Function Pointer Before InvocationIn the
bounds_check
function (lines 891-911), before invokingvm->bounds_check_function
, ensure that the function pointer is valid and properly handles any potential null pointer dereferences.Include a null check before calling
vm->bounds_check_function
:if (vm->bounds_check_function != NULL && + vm->bounds_check_function(vm->bounds_check_user_data, access_start, size)) { return true; }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (5)
- external/ebpf-verifier (1 hunks)
- libfuzzer/CMakeLists.txt (1 hunks)
- libfuzzer/libfuzz_harness.cc (10 hunks)
- libfuzzer/libfuzzer_config.h.inc (1 hunks)
- vm/ubpf_vm.c (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- external/ebpf-verifier
- libfuzzer/libfuzzer_config.h.inc
🧰 Additional context used
📓 Learnings (1)
libfuzzer/libfuzz_harness.cc (1)
Learnt from: Alan-Jowett PR: Alan-Jowett/ubpf#114 File: libfuzzer/libfuzz_harness.cc:573-594 Timestamp: 2024-10-16T21:10:37.964Z Learning: In this codebase, if execution of the program fails, unhandled exceptions are thrown to indicate fatal bugs. Therefore, explicit error handling for context creation in `call_ubpf_interpreter` is not necessary.
🔇 Additional comments (8)
libfuzzer/CMakeLists.txt (4)
9-18
: LGTM: Include directories properly updatedThe UBPF_FUZZER_INCLUDES variable has been correctly updated to include the necessary directories for GSL and ebpf-verifier. This change aligns with the PR objectives of integrating the ebpf-verifier submodule.
Note: The previously reported inconsistency with the GSL include path has been resolved in this version.
20-23
: LGTM: Library dependencies updatedThe addition of the 'ebpfverifier' library to UBPF_FUZZER_LIBS is consistent with the integration of the ebpf-verifier submodule. This change aligns well with the new include directories and the overall objectives of the PR.
44-44
: LGTM: C++20 standard setSetting CMAKE_CXX_STANDARD to 20 ensures that C++20 features are available for the project. This is a good practice for maintaining consistency across the codebase and leveraging modern C++ features.
51-58
: LGTM: ubpf_fuzzer executable properly configuredThe addition of the
ubpf_fuzzer
executable is well-structured and consistent with the PR objectives. The use of the previously definedUBPF_FUZZER_INCLUDES
andUBPF_FUZZER_LIBS
variables for include directories and linked libraries, respectively, improves maintainability and reduces the chance of errors.vm/ubpf_vm.c (4)
614-616
: Check Initialization of Return Value Register BeforeEXIT
InstructionIn
ubpf_validate_shadow_register
, when handling theEBPF_OP_EXIT
instruction (lines 614-616), the code checks ifr0
is initialized. Ensure that this check accounts for all possible code paths leading toEXIT
to prevent false positives or negatives in uninitialized register detection.Consider analyzing the control flow to confirm that
r0
is initialized before allEXIT
instructions.
690-690
: Consistent Error Printing for Instruction Limit ExceededAt line 690, the error message for exceeding the instruction limit uses
vm->error_printf
. Ensure all error messages throughout the file consistently usevm->error_printf
for uniform error handling.Run the following script to check for consistent usage:
#!/bin/bash # Description: Verify all error messages use vm->error_printf. # Search for error messages not using vm->error_printf. rg --type c --no-heading --line-number "(fprintf|printf|perror)\(" vm/
Line range hint
473-726
: Include Program Counter in Validation Error MessagesThe
ubpf_validate_shadow_register
function now accepts thepc
(program counter) parameter, enabling more informative error messages that include the exact instruction where an uninitialized register is detected. This enhances debugging capabilities.Ensure that all calls to
ubpf_validate_shadow_register
are updated to pass the newpc
parameter to prevent potential mismatches or compilation issues.You can verify that all invocations have been updated by running:
✅ Verification successful
All calls to
ubpf_validate_shadow_register
correctly include thepc
parameter. Verification successful.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify that all calls to ubpf_validate_shadow_register include the `pc` parameter. # Search for calls to ubpf_validate_shadow_register with the correct number of arguments. rg --type c --no-heading --line-number "ubpf_validate_shadow_register\([^,]+,\s*[^,]+,\s*[^,]+,\s*[^,]+\)" vm/Length of output: 349
Line range hint
479-726
: Ensure Correct Validation Logic inubpf_validate_shadow_register
The validation logic within
ubpf_validate_shadow_register
has been significantly restructured. While the enhancements improve the detection of uninitialized registers, please review the logic for handling different instruction classes to ensure all cases are correctly addressed.Specifically, verify:
- Correct handling of source and destination registers for each instruction type.
- Proper updating of the
shadow_registers
bitmask after each instruction.- Accurate error messages for uninitialized register usage.
Consider using the following script to analyze the control flow and validation cases:
✅ Verification successful
Validation Logic in
ubpf_validate_shadow_register
is CorrectThe validation logic within
ubpf_validate_shadow_register
correctly handles source and destination registers for each instruction type, updates theshadow_registers
bitmask appropriately after each instruction, and provides accurate error messages for uninitialized register usage.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Extract and review the validation logic for different instruction classes. # Extract the switch cases for instruction classes. rg --type c --no-heading --line-number "switch \(inst.opcode & EBPF_CLS_MASK\) \{.*?\}" -A 50 vm/ubpf_vm.cLength of output: 5512
Script:
#!/bin/bash # Description: Extract and review the validation logic for different instruction classes. # Corrected regex pattern to match the switch statement for instruction classes rg --type c --no-heading --line-number "switch\s*\(inst\.opcode\s*&\s*EBPF_CLS_MASK\)\s*\{" -A 50 vm/ubpf_vm.cLength of output: 2737
Script:
#!/bin/bash # Description: Extract and review the complete validation logic for different instruction classes up to line 726. # Extract the switch statement for instruction classes along with additional lines up to line 726 rg --type c --no-heading --line-number "switch\s*\(inst\.opcode\s*&\s*EBPF_CLS_MASK\)\s*\{" -A 300 vm/ubpf_vm.cLength of output: 13132
Signed-off-by: Alan Jowett <[email protected]>
fe76a8d
to
79ab7b9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caution
Inline review comments failed to post
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (4)
libfuzzer/libfuzz_harness.cc (2)
113-119
: Implementubpf_get_program_type
functionThe
ubpf_get_program_type
function is currently a placeholder that always returnsg_ubpf_program_type
. Consider implementing logic to determine the program type based on thesection
andpath
parameters if different program types are expected in the future.
128-134
: Implement placeholder functions for map and helper function supportThe functions
ubpf_get_map_type
,ubpf_get_helper_prototype
, andubpf_is_helper_usable
are currently placeholders. While this is acceptable for now, it's important to implement these functions when map and helper function support is added to the fuzzer.Also applies to: 143-150, 160-167
vm/ubpf_vm.c (2)
Line range hint
473-619
: Potential Issue with Uninitialized Registers in ALU OperationsIn the
ubpf_validate_shadow_register
function, the current logic for ALU operations allows operations on uninitialized registers and sets the destination register's validity based on the source register's initialization state. This could lead to scenarios where uninitialized values are propagated without detection.Specifically, for binary ALU operations, if the source or destination register is uninitialized, performing operations on them should be flagged as an error to prevent undefined behavior.
Consider updating the validation logic to enforce that both source and destination registers must be initialized before performing binary ALU operations. If either register is uninitialized, an error should be reported. Here's a proposed change:
case EBPF_CLS_ALU: case EBPF_CLS_ALU64: switch (inst.opcode & EBPF_ALU_OP_MASK) { // Binary ops. case 0x00: // EBPF_OP_ADD case 0x10: // EBPF_OP_SUB case 0x20: // EBPF_OP_MUL case 0x30: // EBPF_OP_DIV case 0x40: // EBPF_OP_OR case 0x50: // EBPF_OP_AND case 0x60: // EBPF_OP_LSH case 0x70: // EBPF_OP_RSH case 0x90: // EBPF_OP_MOD case 0xa0: // EBPF_OP_XOR case 0xc0: // EBPF_OP_ARSH case 0xb0: // EBPF_OP_MOV + // Check if destination register is initialized + if (!destination_register_valid_before_instruction) { + vm->error_printf(stderr, "Error: %d: Destination register r%d is not initialized.\n", pc, inst.dst); + return false; + } + // If using source register, check if it's initialized + if ((inst.opcode & EBPF_SRC_REG) && !source_register_valid_before_instruction) { + vm->error_printf(stderr, "Error: %d: Source register r%d is not initialized.\n", pc, inst.src); + return false; + } destination_register_valid_after_instruction = true; break; // Unary ops case 0x80: // EBPF_OP_NEG case 0xd0: // EBPF_OP_LE // Doesn't change the initialized state of the registers. break; default: vm->error_printf(stderr, "Error: %d: Unknown ALU opcode %x.\n", pc, inst.opcode); return false; } break;By enforcing the initialization checks, we can catch potential bugs related to uninitialized register usage during ALU operations.
715-715
: Ensure Consistent Error Handling on Validation FailureIn the
ubpf_exec_ex
function at line 715, whenubpf_validate_shadow_register
returnsfalse
, the execution jumps tocleanup
without an explicit error message at this point. Sinceubpf_validate_shadow_register
already outputs detailed error messages, this is functionally adequate.However, for consistency and clarity in the execution flow, consider adding a debug message or comment indicating that a validation failure occurred. This can aid in tracing execution paths during debugging.
🛑 Comments failed to post (2)
libfuzzer/libfuzz_harness.cc (1)
733-735:
⚠️ Potential issueCheck for
nullptr
before usingerror_message
In
call_ubpf_jit
, whenubpf_compile_ex
fails,error_message
might benullptr
. Passing anullptr
tostd::string
can result in undefined behavior.Modify the code to handle a potential
nullptr
:if (fn == nullptr) { std::string error = error_message ? std::string(error_message) : "Unknown error"; free(error_message); // Free if allocated throw std::runtime_error("Failed to compile program with error: " + error); }This ensures that your code safely handles the error message and prevents potential crashes.
vm/ubpf_vm.c (1)
695-695:
⚠️ Potential issuePrevent Potential Off-by-One Error in Instruction Limit Check
At line 695, the instruction limit is checked using
if (vm->instruction_limit && instruction_limit-- <= 0)
. This post-decrement may lead to an off-by-one error, where the last instruction exceeds the limit.Consider using pre-decrement to ensure accurate instruction counting:
- if (vm->instruction_limit && instruction_limit-- <= 0) { + if (vm->instruction_limit && --instruction_limit < 0) { return_value = -1; vm->error_printf(stderr, "Error: Instruction limit exceeded.\n"); goto cleanup; }This adjustment ensures that the VM enforces the instruction limit correctly.
Committable suggestion was skipped due to low confidence.
Signed-off-by: Alan Jowett <[email protected]>
Signed-off-by: Alan Jowett <[email protected]>
Signed-off-by: Alan Jowett <[email protected]>
This pull request includes updates to the fuzzing workflow, integration of a new eBPF verifier submodule, and various enhancements and fixes to the uBPF virtual machine. The most important changes include adding a new submodule, modifying the fuzzing workflow, and updating the uBPF VM headers and functions for better clarity and functionality.
Workflow and Configuration Updates:
.github/workflows/fuzzing.yml
: Addedworkflow_call
trigger, updated platform toubuntu-24.04
, addedlibyaml-cpp-dev
dependency, and changed build type toRelWithDebInfo
. [1] [2] [3] [4] [5].github/workflows/main.yml
: Updated permissions towrite
for contents and integrated the fuzzing workflow. [1] [2]Submodule Integration:
.gitmodules
: Addedexternal/ebpf-verifier
submodule.CMakeLists.txt
: Integratedexternal/ebpf-verifier
submodule underUBPF_ENABLE_LIBFUZZER
.uBPF VM Enhancements:
vm/inc/ubpf.h
: Updated function documentation to use@retval
for return descriptions and added new parameters forubpf_debug_fn
. [1] [2] [3] [4] [5] [6] [7] [8] [9]custom_tests/srcs/ubpf_test_debug_function.cc
: Added new parameters todebug_callout
function.Build System Enhancements:
libfuzzer/CMakeLists.txt
: Refactored to include new verifier headers and libraries, added checks for verifier constraints.libfuzzer/libfuzzer_config.h.inc
: Added configuration header for libfuzzer.Code Quality and Bug Fixes:
vm/ubpf_vm.c
: Improved validation logic for shadow registers inubpf_validate_shadow_register
.Summary by CodeRabbit
New Features
Bug Fixes
Chores
.gitignore
file, affecting ignored patterns in the repository.