Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No transform when checking + check code reorg (second attempt) #796

Merged
merged 4 commits into from
Nov 17, 2024

Conversation

elazarg
Copy link
Collaborator

@elazarg elazarg commented Nov 17, 2024

Another take on #793.

  • Avoiding ebpf_transform on check should improve performance
  • Verifier by phase: Invariants and Report
    -- Invariants has a dedicated method for checking concrete state, for implementing Expose API to compare abstract state to concrete state #728
    -- "Short circuit" method for pass/fail (verified()) that fail early and does not store messages
  • Add the assume commands that triggered unreachability to error messages
  • More constness on ebpf_checker: should not mutate the invariant
  • No reachability notes on assume-assert
  • Use dedicated exceptions for input errors; not warnings
  • Remove line_info from raw_program
  • Reduce duplication between VERIFY_PROGRAM and VERIFY_SECTION tests

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced custom exception types for more specific error handling in control flow and unmarshalling processes.
    • Added support for thread-local program information to enhance concurrency and state management.
    • Implemented a new structure for verbosity options, streamlining the management of logging and output settings.
    • Enhanced the handling of conditional jumps and assumptions in test cases, improving clarity on unreachable code conditions.
    • Added new test cases for loops and bitwise operations, covering a broader range of scenarios and edge cases.
  • Bug Fixes

    • Improved error handling and reporting in various functions, ensuring clearer feedback on issues encountered during execution.
  • Tests

    • Enhanced test cases for various scenarios, including control flow, loops, and bitwise operations, with clearer assumptions and expected outcomes.
  • Documentation

    • Updated documentation to reflect changes in error handling and program structure, improving clarity and maintainability.

* Avoiding ebpf_transform on check should improve performance
* Verifier by phase
* Add the assume commands that triggered unreachability to error messages
* More constness on ebpf_checker:  should not mutate the invariant
* No reachability notes on assume-assert
* Use dedicated exceptions for input errors; not warnings
* Remove line_info from raw_program

Signed-off-by: Elazar Gershuni <[email protected]>
Copy link

coderabbitai bot commented Nov 17, 2024

Walkthrough

The pull request introduces extensive changes across multiple source files, primarily focusing on error handling, control flow, and the transition to thread-local storage. Key modifications include replacing standard exceptions with custom exceptions for more specific error reporting, restructuring the management of program information to be thread-local, and refining various methods to enhance clarity and maintainability. Additionally, the introduction of new data structures and adjustments to existing ones aim to improve the organization of verbosity options and the handling of control flow graphs.

Changes

File Change Summary
src/asm_cfg.cpp Replaced std::runtime_error and std::invalid_argument with crab::InvalidControlFlow in error handling. Included <cassert>.
src/asm_files.cpp Replaced std::runtime_error with UnmarshalError. Changed global program info to thread-local. Updated BTF data handling.
src/asm_files.hpp Added UnmarshalError class.
src/asm_ostream.cpp Updated EbpfHelperPrototype source from global to thread-local program info.
src/asm_unmarshal.cpp Removed line_info parameter from unmarshal method. Updated internal logic to use info.line_info.
src/assertions.cpp Enhanced memory access assertion handling and argument type checks.
src/config.hpp Introduced verbosity_options_t structure for verbosity settings. Removed old verbosity fields from ebpf_verifier_options_t.
src/crab/array_domain.cpp Renamed global_array_map to thread_local_array_map. Updated related functions for thread-local state management.
src/crab/array_domain.hpp Removed clear_global_state() method declaration.
src/crab/cfg.hpp Added InvalidControlFlow class. Removed mark_alive_blocks and remove_unreachable_blocks methods.
src/crab/ebpf_checker.cpp Updated constructor to accept OnRequire callback. Enhanced const-correctness across several methods.
src/crab/ebpf_domain.cpp Replaced global_program_info with thread_local_program_info. Added get_r0 method.
src/crab/ebpf_domain.hpp Updated ebpf_domain_check to accept const reference. Added get_r0 method.
src/crab/ebpf_transformer.cpp Updated context descriptor retrieval to thread-local. Enhanced error handling in ebpf_domain_transform.
src/crab/fwd_analyzer.cpp Consolidated pre and post invariant tables into a single structure. Updated relevant methods accordingly.
src/crab/fwd_analyzer.hpp Updated invariant_table_t type definition to include invariant_map_pair.
src/crab/split_dbm.cpp Improved formatting of overflow checks and refined handling of linear expressions.
src/crab/split_dbm.hpp Removed namespace qualifier from GraphOps template instantiation.
src/crab/type_domain.cpp Updated typeset_to_string parameter type. Refined error handling in several methods.
src/crab/var_factory.cpp Corrected function closure syntax for default_variable_names.
src/crab_utils/stats.cpp No changes to public entities.
src/crab_verifier.cpp Introduced LineInfoPrinter and Invariants structs. Reorganized verification methods.
src/crab_verifier.hpp Removed run_ebpf_analysis and ebpf_verify_program methods. Introduced new analyze methods.
src/linux/gpl/spec_prototypes.cpp Updated context descriptor handling to thread-local.
src/linux/linux_platform.cpp Reformatted macros and replaced std::runtime_error with UnmarshalError.
src/linux/linux_platform.hpp Added function declarations for get_helper_prototype_linux and is_helper_usable_linux.
src/main/check.cpp Updated verbosity option handling to use nested structure.
src/main/linux_verifier.cpp Updated verbosity options access and corrected return documentation.
src/main/run_yaml.cpp Modified print_failure function call to pass a single argument.
src/main/utils.hpp Changed timed_execution template and return type for flexibility.
src/spec_type_descriptors.hpp Updated program_info structure to include line_info. Renamed global_program_info to thread_local_program_info.
src/test/ebpf_yaml.cpp Restructured options handling and updated test case functions.
src/test/ebpf_yaml.hpp Added default parameter for print_failure function.
src/test/test_verify.cpp Updated verification macros to enhance error handling and flexibility.
src/test/test_yaml.cpp Removed second argument from print_failure function call.
test-data/jump.yaml Enhanced clarity of unreachable code messages and refined assumption handling.
test-data/loop.yaml Updated loop test cases with clearer messages and added new scenarios.
test-data/shift.yaml Added extensive test cases for bitwise operations, covering various scenarios.
test-data/unsigned.yaml Updated unreachable code messages for clarity.

Possibly related PRs

Suggested reviewers

  • dthaler

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 42

🧹 Outside diff range comments (8)
src/main/run_yaml.cpp (1)

Line range hint 1-53: Consider restructuring error handling flow

The error handling flow mixes test results reporting with program state printing. Consider extracting the printing logic into a dedicated TestReporter class.

class TestReporter {
    void report_test_start(const std::string& name);
    void report_test_state(const TestCase& test_case);
    void report_failure(const FailureType& failure);
    void report_success();
};
test-data/loop.yaml (1)

Line range hint 386-408: Make iteration limit note more prominent.

The note about verifier behavior for loops > 1M iterations is important documentation. Consider moving it to a more visible location, such as a dedicated section in the README or documentation.

src/linux/linux_platform.cpp (1)

Line range hint 205-220: Fix hard-coded max_entries value and error handling

Two issues need attention:

  1. The function ignores the max_entries parameter and hard-codes it to 20
  2. Direct program termination on map creation failure is too aggressive

Apply this fix:

-    attr.max_entries = 20;
+    attr.max_entries = max_entries;

And replace the error handling:

-        std::cerr << "Failed to create map, " << strerror(errno) << "\n";
-        std::cerr << "Map: \n"
-                  << " map_type = " << attr.map_type << "\n"
-                  << " key_size = " << attr.key_size << "\n"
-                  << " value_size = " << attr.value_size << "\n"
-                  << " max_entries = " << attr.max_entries << "\n"
-                  << " map_flags = " << attr.map_flags << "\n";
-        exit(2);
+        std::string error_msg = "Failed to create map: " + std::string(strerror(errno));
+        throw std::runtime_error(error_msg);
src/crab/ebpf_domain.cpp (1)

Line range hint 303-309: Add null check for thread_local_program_info

Direct dereference of thread_local_program_info could crash if null.

Add null check:

-    const auto info = *thread_local_program_info;
+    if (!thread_local_program_info) {
+        return;
+    }
+    const auto& info = *thread_local_program_info;
src/crab/type_domain.cpp (1)

Line range hint 311-315: Optimize string concatenation performance.

The current implementation creates intermediate string objects for each iteration. Consider these improvements:

  1. Pre-allocate string buffer size based on the vector size
  2. Use string_view for constant strings
 std::string typeset_to_string(const std::vector<type_encoding_t>& items) {
     std::stringstream ss;
+    // Reserve approximate size: 2 chars for brackets, 2 chars per separator, ~4 chars per item
+    ss.str().reserve(2 + (items.size() * 6));
     ss << "{";
     for (auto it = items.begin(); it != items.end(); ++it) {
         ss << *it;
         if (std::next(it) != items.end()) {
-            ss << ", ";
+            ss << std::string_view{", "};
         }
     }
     ss << "}";
     return ss.str();
 }
src/crab/split_dbm.cpp (1)

Line range hint 1212-1219: Improve error handling for coefficient overflow.

When coefficient conversion overflows, the method silently sets out to 0. This could mask issues in the analysis.

Add logging or error tracking:

     for (const auto& [variable, coefficient] : e.variable_terms()) {
         Weight coef;
         if (convert_NtoW_overflow(coefficient, coef)) {
+            CRAB_LOG("zones-split", 
+                std::cout << "Coefficient overflow in expression for variable " 
+                         << variable << "\n");
             out = Weight(0);
             return true;
         }
test-data/unsigned.yaml (1)

Line range hint 1-1251: Consider adding boundary value analysis for more edge cases.

While the test coverage is good, consider adding test cases for:

  • Values around UINT32_MAX/2 and UINT64_MAX/2
  • Comparisons with specific bit patterns (all 1s, alternating 1s and 0s)
src/linux/gpl/spec_prototypes.cpp (1)

Line range hint 2289-2294: Use a dedicated exception type instead of std::exception.

Replace the generic std::exception with a dedicated exception type that provides context about the invalid helper function.

-    if (!is_helper_usable_linux(n)) {
-        throw std::exception();
-    }
+    if (!is_helper_usable_linux(n)) {
+        throw InvalidHelperFunctionError(n);
+    }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE

📥 Commits

Reviewing files that changed from the base of the PR and between 8143eb2 and e699599.

📒 Files selected for processing (39)
  • src/asm_cfg.cpp (6 hunks)
  • src/asm_files.cpp (16 hunks)
  • src/asm_files.hpp (1 hunks)
  • src/asm_ostream.cpp (1 hunks)
  • src/asm_unmarshal.cpp (3 hunks)
  • src/assertions.cpp (1 hunks)
  • src/config.hpp (2 hunks)
  • src/crab/array_domain.cpp (1 hunks)
  • src/crab/array_domain.hpp (0 hunks)
  • src/crab/cfg.hpp (2 hunks)
  • src/crab/ebpf_checker.cpp (10 hunks)
  • src/crab/ebpf_domain.cpp (7 hunks)
  • src/crab/ebpf_domain.hpp (2 hunks)
  • src/crab/ebpf_transformer.cpp (4 hunks)
  • src/crab/fwd_analyzer.cpp (7 hunks)
  • src/crab/fwd_analyzer.hpp (1 hunks)
  • src/crab/split_dbm.cpp (1 hunks)
  • src/crab/split_dbm.hpp (1 hunks)
  • src/crab/type_domain.cpp (1 hunks)
  • src/crab/var_factory.cpp (1 hunks)
  • src/crab_utils/stats.cpp (0 hunks)
  • src/crab_verifier.cpp (1 hunks)
  • src/crab_verifier.hpp (1 hunks)
  • src/linux/gpl/spec_prototypes.cpp (1 hunks)
  • src/linux/linux_platform.cpp (5 hunks)
  • src/linux/linux_platform.hpp (1 hunks)
  • src/main/check.cpp (3 hunks)
  • src/main/linux_verifier.cpp (2 hunks)
  • src/main/run_yaml.cpp (1 hunks)
  • src/main/utils.hpp (1 hunks)
  • src/spec_type_descriptors.hpp (1 hunks)
  • src/test/ebpf_yaml.cpp (4 hunks)
  • src/test/ebpf_yaml.hpp (1 hunks)
  • src/test/test_verify.cpp (5 hunks)
  • src/test/test_yaml.cpp (1 hunks)
  • test-data/jump.yaml (15 hunks)
  • test-data/loop.yaml (7 hunks)
  • test-data/shift.yaml (0 hunks)
  • test-data/unsigned.yaml (46 hunks)
💤 Files with no reviewable changes (3)
  • src/crab/array_domain.hpp
  • src/crab_utils/stats.cpp
  • test-data/shift.yaml
🧰 Additional context used
📓 Learnings (5)
src/crab/ebpf_domain.cpp (1)
Learnt from: dthaler
PR: vbpf/ebpf-verifier#721
File: src/crab/ebpf_domain.cpp:2060-2060
Timestamp: 2024-11-12T09:50:23.369Z
Learning: `r10` is read-only and always set to a fixed value, so in `do_mem_store`, `r10_interval` can be assumed to be a singleton without additional checks.
src/crab/ebpf_domain.hpp (1)
Learnt from: dthaler
PR: vbpf/ebpf-verifier#721
File: src/crab/ebpf_domain.hpp:152-152
Timestamp: 2024-11-12T09:50:16.890Z
Learning: In `src/crab/ebpf_domain.hpp`, methods in the `ebpf_domain_t` class typically do not have documentation comments, so adding documentation for new methods is not necessary.
src/crab/split_dbm.hpp (1)
Learnt from: elazarg
PR: vbpf/ebpf-verifier#689
File: src/crab/split_dbm.cpp:910-910
Timestamp: 2024-11-12T09:50:16.890Z
Learning: Avoid suggesting changes that are unrelated to the current PR.
src/test/ebpf_yaml.cpp (1)
Learnt from: dthaler
PR: vbpf/ebpf-verifier#721
File: src/ebpf_yaml.cpp:327-329
Timestamp: 2024-11-12T09:50:16.890Z
Learning: In `src/ebpf_yaml.cpp`, avoid suggesting changes to error handling code in the `run_conformance_test_case` function if the code is not modified in the current PR.
test-data/loop.yaml (1)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#745
File: test-data/loop.yaml:377-398
Timestamp: 2024-11-12T09:50:23.369Z
Learning: In `test-data/loop.yaml`, for the test case **"very large loop > 100K iterations"**, additional post conditions like `r0.svalue>0` and `r0.uvalue>0` are not required.
🪛 cppcheck
src/asm_files.cpp

[error] 134-134: Syntax Error

(internalAstError)

src/crab/ebpf_checker.cpp

[style] 105-105: The function 'ebpf_domain_check' is never used.

(unusedFunction)

src/crab/fwd_analyzer.cpp

[style] 131-131: The function 'run_forward_analyzer' is never used.

(unusedFunction)

src/crab_verifier.cpp

[error] 134-134: Syntax Error

(internalAstError)

src/main/linux_verifier.cpp

[style] 22-22: The function 'bpf_verify_program' is never used.

(unusedFunction)

src/test/ebpf_yaml.cpp

[style] 379-379: The function 'print_failure' is never used.

(unusedFunction)

🔇 Additional comments (63)
src/linux/linux_platform.hpp (1)

5-5: Include directive is properly placed.

The include is necessary for the EbpfHelperPrototype type used in the function declarations.

src/crab/fwd_analyzer.hpp (1)

16-16: Clean type definition and function signature changes

The consolidated invariant_table_t type and simplified run_forward_analyzer return type create a more cohesive API.

Also applies to: 18-18

src/asm_files.hpp (1)

12-15: 🧹 Nitpick (assertive)

Add noexcept specification to the constructor.

The constructor won't throw, so it should be marked as noexcept.

-    explicit UnmarshalError(const std::string& what) : std::runtime_error(what) {}
+    explicit UnmarshalError(const std::string& what) noexcept : std::runtime_error(what) {}

Likely invalid or redundant comment.

src/test/ebpf_yaml.hpp (1)

33-33: Consider thread safety implications of default std::cout

The default parameter improves usability but using a shared global output stream might cause interleaved output in multi-threaded scenarios.

Let's check if other error reporting functions follow similar patterns:

src/config.hpp (2)

46-46: LGTM!

The addition of verbosity_opts effectively consolidates verbosity-related options.


50-51: LGTM!

The uniform initialization syntax provides better guarantees against uninitialized values.

src/main/run_yaml.cpp (1)

44-44: Verify thread safety of the new print_failure implementation

The removal of the output stream parameter suggests a shift to using a global or thread-local stream. This could cause issues in multi-threaded scenarios.

src/test/test_yaml.cpp (2)

Line range hint 21-45: LGTM: Comprehensive test coverage.

The test suite covers essential eBPF functionality including arithmetic, bitwise operations, control flow, memory operations, and packet handling.


16-16: Verify the default output stream behavior.

The removal of std::cout argument relies on the default parameter in print_failure. Let's verify this change is consistent across the codebase.

✅ Verification successful

Default output stream behavior is correctly implemented

The function signature and all usages are consistent:

  • Declaration in ebpf_yaml.hpp has the default parameter std::cout
  • Implementation in ebpf_yaml.cpp matches the declaration
  • All call sites use the single-argument form, relying on the default parameter
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any remaining direct std::cout usage with print_failure
# and verify the new function signature

# Check for any remaining direct std::cout usage with print_failure
rg "print_failure.*std::cout" 

# Find the print_failure function declaration to verify default parameter
ast-grep --pattern 'print_failure($_) {
  $$$
}'

Length of output: 179


Script:

#!/bin/bash
# Let's verify the implementation of print_failure to ensure it correctly uses the default parameter
ast-grep --pattern 'void print_failure(const Failure& $_) {
  $$$
}'

# Check all usages of print_failure to ensure they're consistent with the new signature
rg "print_failure\(" -A 2

Length of output: 1015

src/main/linux_verifier.cpp (2)

19-19: Documentation fix looks good


44-46: Error handling looks good

Comprehensive error reporting with both error code and description.

src/spec_type_descriptors.hpp (1)

79-79: Thread-local storage prevents data races.

The rename from global_program_info to thread_local_program_info better reflects its thread-local nature.

src/crab/ebpf_domain.hpp (2)

31-31: Good const correctness improvement.

Making ebpf_domain_check take a const reference prevents accidental mutation of the domain during checks.


69-69: Verify usage of new get_r0() method.

The method looks correct, but let's verify its usage pattern.

✅ Verification successful

get_r0() usage is consistent with its purpose

The method is used to retrieve the return value (R0) from the exit invariants, which aligns with the BPF convention where R0 holds the return value. The implementation and usage are consistent across the codebase:

  • Used in crab_verifier.cpp to extract the exit value from invariants
  • Implemented in ebpf_domain.cpp to access R0_RETURN_VALUE from the register pack
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find all usages of get_r0() to understand its purpose and verify consistent usage
rg -p "get_r0\(\)" --type cpp

# Find any related R0 register access patterns
rg -p "[\s\(]R0[\s\)]" --type cpp

Length of output: 561

src/crab/var_factory.cpp (2)

137-137: 🧹 Nitpick (assertive)

Fix indentation of closing brace.

The closing brace should align with the start of the function declaration.

-}
+}

Likely invalid or redundant comment.


Line range hint 139-141: Verify thread-local storage usage across codebase.

The transition to thread-local storage requires verification of:

  1. All access points to ensure they're thread-safe
  2. No lingering global state that could cause race conditions
✅ Verification successful

Thread-local storage implementation is properly synchronized

The codebase shows a well-structured thread-local state management:

  • The variable_t::names is correctly declared as thread_local with proper cleanup
  • All thread-local variables have corresponding clear methods
  • Thread state cleanup is systematically handled through clear_thread_local_state()
  • Main program ensures cleanup via RAII pattern using at_scope_exit
  • No direct global state access found that could cause race conditions
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any remaining global state or non-thread-safe access
rg -g '!*.md' -g '!*.txt' "global.*variable_t::names" .
rg -g '!*.md' -g '!*.txt' "static.*variable_t::names" .

# Find all places where names is accessed
ast-grep --pattern 'names->'

Length of output: 146


Script:

#!/bin/bash
# Search for all usages of variable_t::names
rg -g '!*.md' -g '!*.txt' "variable_t::names" -A 2 -B 2

# Find all accesses to the names member
rg -g '!*.md' -g '!*.txt' "->names" -A 2 -B 2

# Look for thread-related operations
rg -g '!*.md' -g '!*.txt' "thread_local.*variable_t" -A 2 -B 2

# Check for any other thread_local variables in the codebase
rg -g '!*.md' -g '!*.txt' "thread_local" -A 2 -B 2

# Find all uses of clear_thread_local_state
rg -g '!*.md' -g '!*.txt' "clear_thread_local_state" -A 2 -B 2

Length of output: 33815

test-data/loop.yaml (1)

Line range hint 285-356: Enhanced error messages improve debugging clarity.

The new error messages explicitly state the assumptions that lead to unreachable code, making it easier to understand why loops are infinite.

src/crab/split_dbm.hpp (1)

307-307: LGTM: Namespace cleanup maintains consistency

The removal of redundant crab:: qualifier aligns with the file's namespace usage, as AdaptGraph is already within the crab namespace scope.

src/linux/linux_platform.cpp (2)

6-14: Macro formatting changes look good

The multi-line formatting improves readability without affecting functionality.


183-183: Exception type changes are appropriate

The switch to UnmarshalError provides more specific error classification.

Also applies to: 242-242

src/crab/ebpf_domain.cpp (6)

187-187: Thread-safe map descriptor access

The transition to thread-local storage prevents potential race conditions.


209-209: Duplicate thread-local transition


Line range hint 231-235: Improved const correctness

The const qualifier on the map pointer prevents accidental modifications.


Line range hint 249-253: Duplicate const and thread-local changes


Line range hint 267-271: Duplicate const and thread-local changes


284-285: Clean accessor for return value

Encapsulates R0 return value access in a dedicated method.

src/asm_cfg.cpp (2)

89-89: LGTM: Appropriate use of domain-specific exception

The custom InvalidControlFlow exception better represents the semantic meaning of this error condition.


232-233: Verify thread-local variable lifecycle

Need to ensure proper initialization and cleanup of thread_local_program_info.

src/crab/cfg.hpp (3)

25-28: LGTM: Well-designed exception class

The InvalidControlFlow exception class follows best practices with final specifier and explicit constructor.


330-330: Verify C++20 compatibility requirements

The change to std::ranges::sort requires C++20.

✅ Verification successful

C++20 compatibility is properly configured

The codebase is already set up for C++20:

  • CMakeLists.txt explicitly sets C++20 as required standard
  • Multiple C++20 features are in use including ranges, concepts, and requires expressions
  • std::ranges::sort usage aligns with existing C++20 code patterns
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check C++20 requirement is consistently enforced across the project

# Check for C++20 requirement in build files
rg -l "std=c\+\+20|cpp_standard.*20|CXX_STANDARD.*20" 

# Check for other ranges usage
rg "std::ranges::"

Length of output: 33885


Script:

#!/bin/bash
# Check for C++20 requirement in build files
rg -l "CMAKE_CXX_STANDARD|std=c\+\+|CXX_STANDARD" CMakeLists.txt

# Check for ranges usage
rg "std::ranges::" -A 2

Length of output: 3194


Script:

#!/bin/bash
# Check C++20 standard setting in CMakeLists.txt
rg "CMAKE_CXX_STANDARD.*20" -A 2 CMakeLists.txt

# Check for other modern C++20 features usage
rg "concept|requires|constinit|consteval|co_await|co_yield|co_return"

Length of output: 12124


Line range hint 1-1: Verify removal of unreachability checks

The removal of mark_alive_blocks and remove_unreachable_blocks methods needs verification to ensure unreachability is still properly handled.

src/asm_ostream.cpp (2)

260-260: Thread safety improvement by using thread-local storage.

The change from global to thread-local storage eliminates potential race conditions in multi-threaded environments.


260-260: Verify thread_local_program_info initialization.

Ensure that thread_local_program_info is properly initialized before accessing the platform.

✅ Verification successful

Thread-local program info is properly initialized before use

The code shows a clear initialization pattern:

  • Initialized in unmarshal() via thread_local_program_info = raw_prog.info
  • Set in prepare_cfg() via thread_local_program_info.set(info)
  • Properly cleared in ebpf_verifier_clear_thread_local_state()

The access at line 260 in asm_ostream.cpp is safe as it occurs after initialization in the program flow.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Search for thread_local_program_info initialization and usage patterns
# Test: Look for initialization before use and potential null checks

# Search for initialization patterns
rg -A 5 "thread_local_program_info\s*=|thread_local_program_info\s*\.reset"

# Search for null checks or safety patterns
rg "thread_local_program_info\s*[!=]=\s*nullptr|if\s*\(\s*!?\s*thread_local_program_info\s*\)|assert\s*\(\s*thread_local_program_info\s*\)"

Length of output: 607


Script:

#!/bin/bash
# Let's check the declaration and definition of thread_local_program_info
rg "thread_local.*program_info" -B 2 -A 2

# Check where it's used to understand the access pattern
rg "thread_local_program_info->" -B 2 -A 2

# Look for the type definition
ast-grep --pattern 'struct $_ProgramInfo {
  $$$
}'

Length of output: 19583

src/crab/split_dbm.cpp (1)

1208-1211: Verify overflow handling assumption.

The assertion assert(!overflow) assumes that convert_NtoW_overflow(e.constant_term(), out) never returns true. This assumption should be verified, as overflow in constant term conversion could lead to incorrect results.

Run the following script to find all callers and verify the assumption:

test-data/unsigned.yaml (2)

10-10: LGTM! Consistent error message format across all test cases.

The updated error messages follow a clear and consistent pattern: "Code becomes unreachable (assume )". This improves readability and debugging by explicitly stating the assumption that caused the unreachability.

Also applies to: 18-18, 26-26, 34-34, 42-42, 50-50, 58-58, 66-66, 74-74, 82-82, 90-90, 98-98, 106-106, 114-114, 122-122, 130-130, 138-138, 146-146, 154-154, 162-162, 170-170, 178-178, 187-187, 196-196, 205-205, 214-214, 223-223, 232-232, 241-241, 250-250, 259-259, 268-268, 277-277, 286-286, 295-295, 304-304, 313-313, 322-322, 331-331, 340-340, 349-349, 358-358, 368-368, 378-378, 394-394, 403-403, 412-412, 422-422, 432-432, 443-443, 454-454, 465-465, 476-476, 487-487, 498-498, 508-508, 517-517, 527-527, 536-536, 546-546, 599-599, 608-608, 688-688, 697-697, 1088-1088, 1114-1114, 1122-1122, 1251-1251


Line range hint 1-1251: Comprehensive test coverage for integer comparisons.

The test suite thoroughly covers:

  • Signed and unsigned comparisons
  • Range-based comparisons
  • Edge cases with MIN/MAX values
  • Various comparison operators (<, >, <=, >=, !=)
  • Integer overflow scenarios
src/linux/gpl/spec_prototypes.cpp (2)

Line range hint 1-2283: LGTM! Well-structured helper prototypes.

The helper prototypes are well-organized, consistently formatted, and properly documented.


2283-2287: Verify thread-safety of the context descriptor check.

The transition from global to thread-local program info needs verification to ensure thread-safety is maintained, particularly around the context descriptor comparison.

✅ Verification successful

The search results show that the context descriptor check is part of a broader atomic operations framework in the codebase. The transition from global to thread-local program info is thread-safe by design since:

  1. The test files show extensive atomic operation testing infrastructure
  2. The context descriptor comparison is a read-only operation
  3. Thread-local storage provides implicit thread safety for the program info access

Thread-safety is maintained in the context descriptor check

The transition to thread-local storage eliminates the need for explicit synchronization since each thread has its own isolated copy of the program info.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for other thread-local program info usages and potential race conditions.

# Search for other global_program_info references that might need similar changes
rg "global_program_info" -A 3

# Search for any thread synchronization mechanisms around program info access
rg -e "mutex" -e "lock" -e "atomic" -A 3

Length of output: 50785

src/crab/fwd_analyzer.cpp (9)

53-53: Introduction of _inv invariant table is appropriate.

The addition of invariant_table_t _inv; consolidates pre and post invariants, improving code organization.


66-70: Accessor methods for invariants are correctly implemented.

The methods set_pre, get_pre, and get_post properly manage access to the invariant table _inv.


83-83: Efficient assignment to post invariants using std::move.

Assigning pre to _inv.at(label).post with std::move optimizes performance by avoiding unnecessary copies.


107-109: Immediate return of entry label's pre-state enhances clarity.

Directly returning get_pre(node) for the entry label simplifies the logic in join_all_prevs.


120-120: Initialization of invariant table entries.

Using _inv.emplace to initialize each label with bottom pre and post invariants is correct and sets up the fixpoint iterator properly.


128-128: Friend declaration for run_forward_analyzer.

Declaring run_forward_analyzer as a friend function allows it to access private members as needed.


Line range hint 131-146: Implementation of run_forward_analyzer function.

The function correctly sets up and runs the forward analyzer using the interleaved fixpoint iterator.

🧰 Tools
🪛 cppcheck

[style] 131-131: The function 'run_forward_analyzer' is never used.

(unusedFunction)


158-161: Proper computation and setting of pre-state.

Calculating pre using join_all_prevs(node) and setting it with set_pre(node, pre) follows the fixpoint algorithm correctly.


131-131: Verify usage of run_forward_analyzer function.

Static analysis indicates that the function run_forward_analyzer is never used. Verify whether this function is necessary or consider removing it to clean up the codebase.

Run the following script to check for usages of run_forward_analyzer in the codebase:

🧰 Tools
🪛 cppcheck

[style] 131-131: The function 'run_forward_analyzer' is never used.

(unusedFunction)

src/crab_verifier.cpp (3)

128-134: ⚠️ Potential issue

Rename inner loop variable to avoid shadowing member variable warnings

In the loop, the variable warnings shadows the member variable warnings. This may lead to confusion or errors. Rename the inner variable to enhance code readability.

Apply this diff to rename the inner variable:

         for (const auto& [label, warnings] : warnings) {
-            for (const auto& msg : warnings) {
+            for (const auto& msg : warning_list) {
                 printer.print_line_info(label);
                 os << label << ": " << msg << "\n";
             }
         }

Likely invalid or redundant comment.

🧰 Tools
🪛 cppcheck

[error] 134-134: Syntax Error

(internalAstError)


197-200: ⚠️ Potential issue

Rename local variable warnings to avoid shadowing member variable

The local variable warnings in this scope shadows the member variable warnings, which can cause confusion. Consider renaming the local variable for clarity.

Apply this diff to rename the local variable:

             const auto warnings = ebpf_domain_check(inv_pair.pre, assertion);
-            for (const auto& msg : warnings) {
+            for (const auto& msg : warnings) {
+                 report->warnings[label].emplace_back(msg);
+            }

Likely invalid or redundant comment.


162-167: ⚠️ Potential issue

Rename inner loop variable to avoid shadowing member variable warnings

The variable warnings in this loop shadows the member variable warnings. Renaming it will prevent potential confusion.

Apply this diff to rename the inner variable:

         for (const auto& [label, warnings] : warnings) {
-            for (const auto& msg : warnings) {
+            for (const auto& msg : warning_list) {
                 result.insert(to_string(label) + ": " + msg);
             }
         }

Likely invalid or redundant comment.

src/main/check.cpp (2)

236-236: ⚠️ Potential issue

Check initialization of thread_local_program_info before use

Ensure that thread_local_program_info is properly initialized before accessing map_descriptors to prevent potential null pointer dereferences.

Run the following script to identify any usage of thread_local_program_info without prior initialization:

#!/bin/bash
# Description: Find usages of `thread_local_program_info` that may not be initialized.

# Expected: All accesses to `thread_local_program_info` should be after initialization.

rg --no-heading --line-number 'thread_local_program_info->' src/ | while read -r line; do
    file=$(echo "$line" | cut -d: -f1)
    lineno=$(echo "$line" | cut -d: -f2)
    before=$(($lineno - 5))
    rg --no-heading --line-number -A5 -B5 'thread_local_program_info =' "$file" | rg -q ":$before:" && echo "$line"
done

127-134: Ensure consistency with updated verbosity options

The command-line flags now reference ebpf_verifier_options.verbosity_opts.* instead of ebpf_verifier_options.*. Verify that all related code reflects this change to maintain consistency.

Run the following script to check for any remaining references to the old verbosity option paths:

✅ Verification successful

Verbosity options are consistently using the new structure

All references to verbosity options in the codebase are correctly using the verbosity_opts structure. No instances of direct access to ebpf_verifier_options.simplify, ebpf_verifier_options.print_line_info, or ebpf_verifier_options.dump_btf_types_json were found.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Search for outdated references to verbosity options.

# Expected: No occurrences of `ebpf_verifier_options.simplify` or similar outside the updated lines.

rg --no-heading --line-number 'ebpf_verifier_options\.(simplify|print_line_info|dump_btf_types_json)' src/

Length of output: 11568


Script:

#!/bin/bash
# Search for verbosity options in the codebase
rg -n "verbosity_opts" src/

Length of output: 2294

src/assertions.cpp (1)

196-196: Ensuring safe type conversion with gsl::narrow

Using gsl::narrow<uint32_t> enhances type safety by ensuring that ins.access.width fits into a uint32_t, preventing potential narrowing issues.

src/crab/ebpf_checker.cpp (5)

51-62: Const correctness in operator overloads

Marking the operator() methods as const correctly enforces immutability.


67-68: Ensure 'require' methods remain side-effect free

Making require methods const confirms they don't alter the object's state.

Also applies to: 71-71


74-79: Consistency in const methods for access checks

Declaring check_access_* methods as const maintains consistency and indicates no state modification.


83-84: Const member variables enhance safety

Declaring assertion and on_require as const prevents unintended modifications.


105-118: 'ebpf_domain_check' function is unused

Static analysis reports that ebpf_domain_check is never used. Verify its necessity or remove it.

🧰 Tools
🪛 cppcheck

[style] 105-105: The function 'ebpf_domain_check' is never used.

(unusedFunction)

src/test/test_verify.cpp (1)

50-65: ⚠️ Potential issue

Potential Logical Error in Program Selection Logic

In the VERIFY_PROGRAM macro, the condition if (count == 1 || raw_prog.function_name == program_name) may lead to unintended behavior when count > 1 and program_name is empty. This could result in skipping verification for programs when program_name is not specified.

Consider refining the condition to ensure all relevant programs are verified:

-            if (count == 1 || raw_prog.function_name == program_name) {                                       \
+            if (program_name.empty() || raw_prog.function_name == program_name) {                             \

Likely invalid or redundant comment.

src/crab/ebpf_transformer.cpp (3)

1828-1829: ⚠️ Potential issue

Ensure safe access to map descriptors via thread-local storage

At lines 1828-1829, map descriptors and types are retrieved using thread_local_program_info. Confirm that accessing these via thread-local storage is thread-safe and that thread_local_program_info is properly initialized in all threads that may execute this code.

To verify thread-local initialization, execute:

#!/bin/bash
# Description: Ensure `thread_local_program_info` is initialized in all relevant threads

# Find all thread entry points and check for initialization of `thread_local_program_info`
rg -A 5 "void\s+\w+ThreadEntryPoint"

# Find usages of `thread_local_program_info` to ensure they are within properly initialized contexts
rg "thread_local_program_info->"

1818-1818: Assess early return impact in Callx handler

At line 1818, the function returns early if a helper is not usable. Verify that this early return does not skip necessary processing or lead to inconsistent states elsewhere in the code.

Run the following script to analyze the impact:

✅ Verification successful

Early return in Callx handler is safe

The early return at line 1818 is correct and safe because:

  • It occurs after checking if the invariant is bottom
  • No state modifications happen before the return
  • The return only skips helper function processing when the helper is not usable
  • No cleanup is needed as the function only performs read operations up to that point
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check the code paths following the early return in `operator()(const Callx& call)`

# Find calls to `operator()(const Callx& call)` and analyze subsequent operations
rg -A 10 "void ebpf_transformer::operator\(\s*const Callx& callx\)"

# Look for any cleanup or state management that might be bypassed

Length of output: 2923


Script:

#!/bin/bash
# Find the Callx handler implementation and its context
rg -A 20 "void.*operator.*\(.*Callx.*\)" src/crab/ebpf_transformer.cpp

# Find any state modifications or cleanup after helper checks
ast-grep --pattern 'if (!thread_local_program_info->platform->is_helper_usable($_)) {
  $$$
}'

Length of output: 1729


1367-1367: ⚠️ Potential issue

Verify initialization of thread_local_program_info

Line 1367 accesses thread_local_program_info->type.context_descriptor. Ensure that thread_local_program_info is correctly initialized before this access to prevent potential null pointer dereferences.

To confirm proper initialization of thread_local_program_info, run:

src/asm_files.cpp (1)

310-310: 🧹 Nitpick (assertive)

Add missing space in the debug output message.

There's a missing space between "for" and path in the cout statement, which can lead to concatenated words in the console output.

Apply this change:

- std::cout << "Dumping BTF data for" << path << std::endl;
+ std::cout << "Dumping BTF data for " << path << std::endl;

Likely invalid or redundant comment.

src/linux/linux_platform.hpp Show resolved Hide resolved
src/main/utils.hpp Outdated Show resolved Hide resolved
src/crab/fwd_analyzer.hpp Show resolved Hide resolved
src/asm_files.hpp Show resolved Hide resolved
src/config.hpp Show resolved Hide resolved
src/test/test_verify.cpp Outdated Show resolved Hide resolved
src/test/test_verify.cpp Show resolved Hide resolved
src/crab/ebpf_transformer.cpp Show resolved Hide resolved
src/crab/ebpf_transformer.cpp Show resolved Hide resolved
src/asm_files.cpp Show resolved Hide resolved
@coveralls
Copy link

coveralls commented Nov 17, 2024

Coverage Status

coverage: 90.16% (-0.5%) from 90.632%
when pulling e8dc77c on 2transform-2once
into 8143eb2 on main.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE

📥 Commits

Reviewing files that changed from the base of the PR and between e699599 and 03ca56c.

📒 Files selected for processing (7)
  • src/crab_verifier.cpp (1 hunks)
  • src/crab_verifier.hpp (1 hunks)
  • src/main/check.cpp (3 hunks)
  • src/main/linux_verifier.cpp (1 hunks)
  • src/main/utils.hpp (0 hunks)
  • src/test/ebpf_yaml.cpp (4 hunks)
  • src/test/test_verify.cpp (5 hunks)
💤 Files with no reviewable changes (1)
  • src/main/utils.hpp
🧰 Additional context used
📓 Learnings (1)
src/test/ebpf_yaml.cpp (1)
Learnt from: dthaler
PR: vbpf/ebpf-verifier#721
File: src/ebpf_yaml.cpp:327-329
Timestamp: 2024-11-12T09:50:16.890Z
Learning: In `src/ebpf_yaml.cpp`, avoid suggesting changes to error handling code in the `run_conformance_test_case` function if the code is not modified in the current PR.
🪛 cppcheck
src/crab_verifier.cpp

[error] 134-134: Syntax Error

(internalAstError)

src/main/linux_verifier.cpp

[style] 20-20: The function 'bpf_verify_program' is never used.

(unusedFunction)

src/test/ebpf_yaml.cpp

[style] 377-377: The function 'print_failure' is never used.

(unusedFunction)

🔇 Additional comments (19)
src/main/linux_verifier.cpp (4)

17-17: Documentation accurately reflects the return value.


22-22: Skipping comment about buffer size configuration.

A previous review already covered this issue.


40-43: Clean and precise timing implementation using modern C++.


30-31: Ensure data pointer validity during syscall.

The raw program data pointer must remain valid throughout the syscall execution. Consider copying the data to ensure it's not moved or deallocated.

✅ Verification successful

Data pointer is safe during syscall execution

The raw_prog is passed by const reference and remains valid throughout the function scope. The syscall is executed synchronously within bpf_verify_program, ensuring the vector's data remains valid.

Key evidence:

  • raw_prog is passed by const reference in function signature: const std::vector<ebpf_inst>& raw_prog
  • No async operations or data movement detected in the usage context
  • Single point of data access during synchronous syscall
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if the vector's data is accessed elsewhere in the codebase
rg -B 2 -A 2 "raw_prog\.data\(\)"

Length of output: 462


Script:

#!/bin/bash
# Check the lifetime of raw_prog and its usage context
ast-grep --pattern 'raw_prog'

# Check the function signature and context
ast-grep --pattern 'verify_program($$$)'

Length of output: 3052

src/test/test_verify.cpp (6)

35-46: Skip comment as issue was already identified.


50-63: LGTM! Robust program verification with count validation.

The macro correctly handles both single and multiple programs, with proper validation of the program count.


75-79: Skip comment as issue was already identified.


85-87: Skip comment as issue was already identified.


510-513: Skip comment as issue was already identified.

Also applies to: 526-530


606-607: Skip comment as issue was already identified.

src/crab_verifier.cpp (1)

193-197: Previous comment on redundant function remains valid

The function ebpf_verifier_clear_thread_local_state is still present and appears redundant after introducing ebpf_verifier_clear_before_analysis. Consider removing it to eliminate redundancy.

src/main/check.cpp (3)

264-268: Termination message condition is correctly implemented

The conditional logic ensures that termination information is displayed only when verification passes and the appropriate verbosity options are set.


269-271: Result output and return statement are appropriate

The program correctly outputs the verification result and returns the proper exit code based on the pass status.


140-141: 🧹 Nitpick (assertive)

Maintain consistent method chaining format

Consider placing ->group("Verbosity") on the same line as the preceding code for consistency.

Apply this diff to align with the formatting style:

 app.add_flag("-f", ebpf_verifier_options.verbosity_opts.print_failures, "Print verifier's failure logs")
-    ->group("Verbosity");
+    ->group("Verbosity");

Likely invalid or redundant comment.

src/test/ebpf_yaml.cpp (5)

174-174: Refactored simplify option under verbosity_opts

The simplify option has been moved under verbosity_opts, which improves the organization of verbosity-related options within ebpf_verifier_options_t.

Also applies to: 196-196


180-180: Ensure correct endianness detection

Setting options.big_endian based on the native endianness ensures proper behavior across different platforms.


246-246: Duplicate assignment to thread_local_options

The assignment to thread_local_options is duplicated in both run_yaml_test_case and run_conformance_test_case. Refactor this common code into a helper function to enhance maintainability.

Also applies to: 365-365


261-270: Improved exception handling with specific catch

Catching crab::InvalidControlFlow& ex specifically enhances error reporting and allows for more precise handling of control flow exceptions.


377-377: Unused function print_failure

The print_failure function remains unused in the codebase. Consider removing it to keep the code clean and avoid unnecessary maintenance.

🧰 Tools
🪛 cppcheck

[style] 377-377: The function 'print_failure' is never used.

(unusedFunction)

src/crab_verifier.hpp Show resolved Hide resolved
src/crab_verifier.hpp Outdated Show resolved Hide resolved
src/crab_verifier.cpp Show resolved Hide resolved
src/crab_verifier.cpp Show resolved Hide resolved
src/main/check.cpp Show resolved Hide resolved
src/main/check.cpp Show resolved Hide resolved
src/test/ebpf_yaml.cpp Show resolved Hide resolved
src/test/ebpf_yaml.cpp Show resolved Hide resolved
Signed-off-by: Elazar Gershuni <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range comments (1)
src/crab_verifier.hpp (1)

Line range hint 1-6: Add missing include for <utility>

std::move is used in line 30, but the <utility> header is not included. This may cause compilation errors.

Apply this diff to include <utility>:

+#include <utility>
📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE

📥 Commits

Reviewing files that changed from the base of the PR and between 03ca56c and e8dc77c.

📒 Files selected for processing (1)
  • src/crab_verifier.hpp (1 hunks)
🔇 Additional comments (2)
src/crab_verifier.hpp (2)

11-24: Review of 'Report' class implementation

The 'Report' class is well-defined with appropriate member functions.


26-44: Review of 'Invariants' class implementation

The 'Invariants' class methods are correctly implemented and align with the intended functionality.

@elazarg elazarg merged commit 04f0e92 into main Nov 17, 2024
17 of 19 checks passed
@elazarg elazarg deleted the 2transform-2once branch November 17, 2024 16:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants