No transform when checking + check code reorg (second attempt) #796

elazarg · 2024-11-17T14:04:53Z

Another take on #793.

Avoiding ebpf_transform on check should improve performance
Verifier by phase: Invariants and Report
-- Invariants has a dedicated method for checking concrete state, for implementing Expose API to compare abstract state to concrete state #728
-- "Short circuit" method for pass/fail (verified()) that fail early and does not store messages
Add the assume commands that triggered unreachability to error messages
More constness on ebpf_checker: should not mutate the invariant
No reachability notes on assume-assert
Use dedicated exceptions for input errors; not warnings
Remove line_info from raw_program
Reduce duplication between VERIFY_PROGRAM and VERIFY_SECTION tests

Summary by CodeRabbit

Release Notes

New Features
- Introduced custom exception types for more specific error handling in control flow and unmarshalling processes.
- Added support for thread-local program information to enhance concurrency and state management.
- Implemented a new structure for verbosity options, streamlining the management of logging and output settings.
- Enhanced the handling of conditional jumps and assumptions in test cases, improving clarity on unreachable code conditions.
- Added new test cases for loops and bitwise operations, covering a broader range of scenarios and edge cases.
Bug Fixes
- Improved error handling and reporting in various functions, ensuring clearer feedback on issues encountered during execution.
Tests
- Enhanced test cases for various scenarios, including control flow, loops, and bitwise operations, with clearer assumptions and expected outcomes.
Documentation
- Updated documentation to reflect changes in error handling and program structure, improving clarity and maintainability.

* Avoiding ebpf_transform on check should improve performance * Verifier by phase * Add the assume commands that triggered unreachability to error messages * More constness on ebpf_checker: should not mutate the invariant * No reachability notes on assume-assert * Use dedicated exceptions for input errors; not warnings * Remove line_info from raw_program Signed-off-by: Elazar Gershuni <[email protected]>

coderabbitai · 2024-11-17T14:05:01Z

Walkthrough

The pull request introduces extensive changes across multiple source files, primarily focusing on error handling, control flow, and the transition to thread-local storage. Key modifications include replacing standard exceptions with custom exceptions for more specific error reporting, restructuring the management of program information to be thread-local, and refining various methods to enhance clarity and maintainability. Additionally, the introduction of new data structures and adjustments to existing ones aim to improve the organization of verbosity options and the handling of control flow graphs.

Changes

File	Change Summary
src/asm_cfg.cpp	Replaced `std::runtime_error` and `std::invalid_argument` with `crab::InvalidControlFlow` in error handling. Included `<cassert>`.
src/asm_files.cpp	Replaced `std::runtime_error` with `UnmarshalError`. Changed global program info to thread-local. Updated BTF data handling.
src/asm_files.hpp	Added `UnmarshalError` class.
src/asm_ostream.cpp	Updated `EbpfHelperPrototype` source from global to thread-local program info.
src/asm_unmarshal.cpp	Removed `line_info` parameter from `unmarshal` method. Updated internal logic to use `info.line_info`.
src/assertions.cpp	Enhanced memory access assertion handling and argument type checks.
src/config.hpp	Introduced `verbosity_options_t` structure for verbosity settings. Removed old verbosity fields from `ebpf_verifier_options_t`.
src/crab/array_domain.cpp	Renamed `global_array_map` to `thread_local_array_map`. Updated related functions for thread-local state management.
src/crab/array_domain.hpp	Removed `clear_global_state()` method declaration.
src/crab/cfg.hpp	Added `InvalidControlFlow` class. Removed `mark_alive_blocks` and `remove_unreachable_blocks` methods.
src/crab/ebpf_checker.cpp	Updated constructor to accept `OnRequire` callback. Enhanced const-correctness across several methods.
src/crab/ebpf_domain.cpp	Replaced `global_program_info` with `thread_local_program_info`. Added `get_r0` method.
src/crab/ebpf_domain.hpp	Updated `ebpf_domain_check` to accept const reference. Added `get_r0` method.
src/crab/ebpf_transformer.cpp	Updated context descriptor retrieval to thread-local. Enhanced error handling in `ebpf_domain_transform`.
src/crab/fwd_analyzer.cpp	Consolidated pre and post invariant tables into a single structure. Updated relevant methods accordingly.
src/crab/fwd_analyzer.hpp	Updated `invariant_table_t` type definition to include `invariant_map_pair`.
src/crab/split_dbm.cpp	Improved formatting of overflow checks and refined handling of linear expressions.
src/crab/split_dbm.hpp	Removed namespace qualifier from `GraphOps` template instantiation.
src/crab/type_domain.cpp	Updated `typeset_to_string` parameter type. Refined error handling in several methods.
src/crab/var_factory.cpp	Corrected function closure syntax for `default_variable_names`.
src/crab_utils/stats.cpp	No changes to public entities.
src/crab_verifier.cpp	Introduced `LineInfoPrinter` and `Invariants` structs. Reorganized verification methods.
src/crab_verifier.hpp	Removed `run_ebpf_analysis` and `ebpf_verify_program` methods. Introduced new `analyze` methods.
src/linux/gpl/spec_prototypes.cpp	Updated context descriptor handling to thread-local.
src/linux/linux_platform.cpp	Reformatted macros and replaced `std::runtime_error` with `UnmarshalError`.
src/linux/linux_platform.hpp	Added function declarations for `get_helper_prototype_linux` and `is_helper_usable_linux`.
src/main/check.cpp	Updated verbosity option handling to use nested structure.
src/main/linux_verifier.cpp	Updated verbosity options access and corrected return documentation.
src/main/run_yaml.cpp	Modified `print_failure` function call to pass a single argument.
src/main/utils.hpp	Changed `timed_execution` template and return type for flexibility.
src/spec_type_descriptors.hpp	Updated `program_info` structure to include `line_info`. Renamed `global_program_info` to `thread_local_program_info`.
src/test/ebpf_yaml.cpp	Restructured options handling and updated test case functions.
src/test/ebpf_yaml.hpp	Added default parameter for `print_failure` function.
src/test/test_verify.cpp	Updated verification macros to enhance error handling and flexibility.
src/test/test_yaml.cpp	Removed second argument from `print_failure` function call.
test-data/jump.yaml	Enhanced clarity of unreachable code messages and refined assumption handling.
test-data/loop.yaml	Updated loop test cases with clearer messages and added new scenarios.
test-data/shift.yaml	Added extensive test cases for bitwise operations, covering various scenarios.
test-data/unsigned.yaml	Updated unreachable code messages for clarity.

Possibly related PRs

Cleanup interval bound handling #692: Modifications in src/crab/ebpf_domain.cpp focusing on interval evaluations and error handling, related to changes in src/asm_cfg.cpp.
Use gsl::narrow in asm files #710: This PR updates the src/asm_files.cpp file to enhance type safety and error handling, which aligns with the main PR's focus on improving error handling mechanisms.
Simplify configuration options #775: This PR simplifies configuration options in the src/asm_cfg.cpp file, which is related to the main PR's changes in error handling and control flow within the same file.
Fix overflow in conversion from uvalue to svalue #769: The changes in this PR enhance arithmetic operations and overflow handling in src/crab/ebpf_domain.cpp, which is relevant to the main PR's focus on error handling and control flow.
Fail hard when a non-Assume instruction turns an invariant into bottom #791: This PR introduces error handling logic in the ebpf_domain_transform function, which is directly related to the main PR's focus on improving error handling in the context of control flow management.

Suggested reviewers

dthaler

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 42

🧹 Outside diff range comments (8)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE

📥 Commits

Reviewing files that changed from the base of the PR and between 8143eb2 and e699599.

📒 Files selected for processing (39)

src/asm_cfg.cpp (6 hunks)
src/asm_files.cpp (16 hunks)
src/asm_files.hpp (1 hunks)
src/asm_ostream.cpp (1 hunks)
src/asm_unmarshal.cpp (3 hunks)
src/assertions.cpp (1 hunks)
src/config.hpp (2 hunks)
src/crab/array_domain.cpp (1 hunks)
src/crab/array_domain.hpp (0 hunks)
src/crab/cfg.hpp (2 hunks)
src/crab/ebpf_checker.cpp (10 hunks)
src/crab/ebpf_domain.cpp (7 hunks)
src/crab/ebpf_domain.hpp (2 hunks)
src/crab/ebpf_transformer.cpp (4 hunks)
src/crab/fwd_analyzer.cpp (7 hunks)
src/crab/fwd_analyzer.hpp (1 hunks)
src/crab/split_dbm.cpp (1 hunks)
src/crab/split_dbm.hpp (1 hunks)
src/crab/type_domain.cpp (1 hunks)
src/crab/var_factory.cpp (1 hunks)
src/crab_utils/stats.cpp (0 hunks)
src/crab_verifier.cpp (1 hunks)
src/crab_verifier.hpp (1 hunks)
src/linux/gpl/spec_prototypes.cpp (1 hunks)
src/linux/linux_platform.cpp (5 hunks)
src/linux/linux_platform.hpp (1 hunks)
src/main/check.cpp (3 hunks)
src/main/linux_verifier.cpp (2 hunks)
src/main/run_yaml.cpp (1 hunks)
src/main/utils.hpp (1 hunks)
src/spec_type_descriptors.hpp (1 hunks)
src/test/ebpf_yaml.cpp (4 hunks)
src/test/ebpf_yaml.hpp (1 hunks)
src/test/test_verify.cpp (5 hunks)
src/test/test_yaml.cpp (1 hunks)
test-data/jump.yaml (15 hunks)
test-data/loop.yaml (7 hunks)
test-data/shift.yaml (0 hunks)
test-data/unsigned.yaml (46 hunks)

💤 Files with no reviewable changes (3)

src/crab/array_domain.hpp
src/crab_utils/stats.cpp
test-data/shift.yaml

🧰 Additional context used

📓 Learnings (5)

src/crab/ebpf_domain.cpp (1)

Learnt from: dthaler
PR: vbpf/ebpf-verifier#721
File: src/crab/ebpf_domain.cpp:2060-2060
Timestamp: 2024-11-12T09:50:23.369Z
Learning: `r10` is read-only and always set to a fixed value, so in `do_mem_store`, `r10_interval` can be assumed to be a singleton without additional checks.

src/crab/ebpf_domain.hpp (1)

Learnt from: dthaler
PR: vbpf/ebpf-verifier#721
File: src/crab/ebpf_domain.hpp:152-152
Timestamp: 2024-11-12T09:50:16.890Z
Learning: In `src/crab/ebpf_domain.hpp`, methods in the `ebpf_domain_t` class typically do not have documentation comments, so adding documentation for new methods is not necessary.

src/crab/split_dbm.hpp (1)

Learnt from: elazarg
PR: vbpf/ebpf-verifier#689
File: src/crab/split_dbm.cpp:910-910
Timestamp: 2024-11-12T09:50:16.890Z
Learning: Avoid suggesting changes that are unrelated to the current PR.

src/test/ebpf_yaml.cpp (1)

Learnt from: dthaler
PR: vbpf/ebpf-verifier#721
File: src/ebpf_yaml.cpp:327-329
Timestamp: 2024-11-12T09:50:16.890Z
Learning: In `src/ebpf_yaml.cpp`, avoid suggesting changes to error handling code in the `run_conformance_test_case` function if the code is not modified in the current PR.

test-data/loop.yaml (1)

Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#745
File: test-data/loop.yaml:377-398
Timestamp: 2024-11-12T09:50:23.369Z
Learning: In `test-data/loop.yaml`, for the test case **"very large loop > 100K iterations"**, additional post conditions like `r0.svalue>0` and `r0.uvalue>0` are not required.

🪛 cppcheck

src/asm_files.cpp

[error] 134-134: Syntax Error

(internalAstError)

src/crab/ebpf_checker.cpp

[style] 105-105: The function 'ebpf_domain_check' is never used.

(unusedFunction)

src/crab/fwd_analyzer.cpp

[style] 131-131: The function 'run_forward_analyzer' is never used.

(unusedFunction)

src/crab_verifier.cpp

[error] 134-134: Syntax Error

(internalAstError)

src/main/linux_verifier.cpp

[style] 22-22: The function 'bpf_verify_program' is never used.

(unusedFunction)

src/test/ebpf_yaml.cpp

[style] 379-379: The function 'print_failure' is never used.

(unusedFunction)

🔇 Additional comments (63)

src/linux/linux_platform.hpp (1)

5-5: Include directive is properly placed.

The include is necessary for the EbpfHelperPrototype type used in the function declarations.

src/crab/fwd_analyzer.hpp (1)

16-16: Clean type definition and function signature changes

The consolidated invariant_table_t type and simplified run_forward_analyzer return type create a more cohesive API.

Also applies to: 18-18

src/asm_files.hpp (1)

12-15: 🧹 Nitpick (assertive)

Add noexcept specification to the constructor.

The constructor won't throw, so it should be marked as noexcept.

-    explicit UnmarshalError(const std::string& what) : std::runtime_error(what) {}
+    explicit UnmarshalError(const std::string& what) noexcept : std::runtime_error(what) {}

src/test/ebpf_yaml.hpp (1)

33-33: Consider thread safety implications of default std::cout

The default parameter improves usability but using a shared global output stream might cause interleaved output in multi-threaded scenarios.

Let's check if other error reporting functions follow similar patterns:

src/config.hpp (2)

46-46: LGTM!

The addition of verbosity_opts effectively consolidates verbosity-related options.

50-51: LGTM!

The uniform initialization syntax provides better guarantees against uninitialized values.

src/main/run_yaml.cpp (1)

44-44: Verify thread safety of the new print_failure implementation

The removal of the output stream parameter suggests a shift to using a global or thread-local stream. This could cause issues in multi-threaded scenarios.

src/test/test_yaml.cpp (2)

Line range hint 21-45: LGTM: Comprehensive test coverage.

The test suite covers essential eBPF functionality including arithmetic, bitwise operations, control flow, memory operations, and packet handling.

16-16: Verify the default output stream behavior.

The removal of std::cout argument relies on the default parameter in print_failure. Let's verify this change is consistent across the codebase.

✅ Verification successful

Default output stream behavior is correctly implemented

The function signature and all usages are consistent:

Declaration in ebpf_yaml.hpp has the default parameter std::cout
Implementation in ebpf_yaml.cpp matches the declaration
All call sites use the single-argument form, relying on the default parameter

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any remaining direct std::cout usage with print_failure
# and verify the new function signature

# Check for any remaining direct std::cout usage with print_failure
rg "print_failure.*std::cout" 

# Find the print_failure function declaration to verify default parameter
ast-grep --pattern 'print_failure($_) {
  $$$
}'

Length of output: 179

Script:

#!/bin/bash
# Let's verify the implementation of print_failure to ensure it correctly uses the default parameter
ast-grep --pattern 'void print_failure(const Failure& $_) {
  $$$
}'

# Check all usages of print_failure to ensure they're consistent with the new signature
rg "print_failure\(" -A 2

Length of output: 1015

src/main/linux_verifier.cpp (2)

19-19: Documentation fix looks good

44-46: Error handling looks good

Comprehensive error reporting with both error code and description.

src/spec_type_descriptors.hpp (1)

79-79: Thread-local storage prevents data races.

The rename from global_program_info to thread_local_program_info better reflects its thread-local nature.

src/crab/ebpf_domain.hpp (2)

31-31: Good const correctness improvement.

Making ebpf_domain_check take a const reference prevents accidental mutation of the domain during checks.

69-69: Verify usage of new get_r0() method.

The method looks correct, but let's verify its usage pattern.

✅ Verification successful

get_r0() usage is consistent with its purpose

The method is used to retrieve the return value (R0) from the exit invariants, which aligns with the BPF convention where R0 holds the return value. The implementation and usage are consistent across the codebase:

Used in crab_verifier.cpp to extract the exit value from invariants
Implemented in ebpf_domain.cpp to access R0_RETURN_VALUE from the register pack

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find all usages of get_r0() to understand its purpose and verify consistent usage
rg -p "get_r0\(\)" --type cpp

# Find any related R0 register access patterns
rg -p "[\s\(]R0[\s\)]" --type cpp

Length of output: 561

src/crab/var_factory.cpp (2)

137-137: 🧹 Nitpick (assertive)

Fix indentation of closing brace.

The closing brace should align with the start of the function declaration.

-}
+}

Line range hint 139-141: Verify thread-local storage usage across codebase.

The transition to thread-local storage requires verification of:

All access points to ensure they're thread-safe
No lingering global state that could cause race conditions

✅ Verification successful

Thread-local storage implementation is properly synchronized

The codebase shows a well-structured thread-local state management:

The variable_t::names is correctly declared as thread_local with proper cleanup
All thread-local variables have corresponding clear methods
Thread state cleanup is systematically handled through clear_thread_local_state()
Main program ensures cleanup via RAII pattern using at_scope_exit
No direct global state access found that could cause race conditions

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any remaining global state or non-thread-safe access
rg -g '!*.md' -g '!*.txt' "global.*variable_t::names" .
rg -g '!*.md' -g '!*.txt' "static.*variable_t::names" .

# Find all places where names is accessed
ast-grep --pattern 'names->'

Length of output: 146

Script:

#!/bin/bash
# Search for all usages of variable_t::names
rg -g '!*.md' -g '!*.txt' "variable_t::names" -A 2 -B 2

# Find all accesses to the names member
rg -g '!*.md' -g '!*.txt' "->names" -A 2 -B 2

# Look for thread-related operations
rg -g '!*.md' -g '!*.txt' "thread_local.*variable_t" -A 2 -B 2

# Check for any other thread_local variables in the codebase
rg -g '!*.md' -g '!*.txt' "thread_local" -A 2 -B 2

# Find all uses of clear_thread_local_state
rg -g '!*.md' -g '!*.txt' "clear_thread_local_state" -A 2 -B 2

Length of output: 33815

test-data/loop.yaml (1)

Line range hint 285-356: Enhanced error messages improve debugging clarity.

The new error messages explicitly state the assumptions that lead to unreachable code, making it easier to understand why loops are infinite.

src/crab/split_dbm.hpp (1)

307-307: LGTM: Namespace cleanup maintains consistency

The removal of redundant crab:: qualifier aligns with the file's namespace usage, as AdaptGraph is already within the crab namespace scope.

src/linux/linux_platform.cpp (2)

6-14: Macro formatting changes look good

The multi-line formatting improves readability without affecting functionality.

183-183: Exception type changes are appropriate

The switch to UnmarshalError provides more specific error classification.

Also applies to: 242-242

src/crab/ebpf_domain.cpp (6)

187-187: Thread-safe map descriptor access

The transition to thread-local storage prevents potential race conditions.

209-209: Duplicate thread-local transition

Line range hint 231-235: Improved const correctness

The const qualifier on the map pointer prevents accidental modifications.

Line range hint 249-253: Duplicate const and thread-local changes

Line range hint 267-271: Duplicate const and thread-local changes

284-285: Clean accessor for return value

Encapsulates R0 return value access in a dedicated method.

src/asm_cfg.cpp (2)

89-89: LGTM: Appropriate use of domain-specific exception

The custom InvalidControlFlow exception better represents the semantic meaning of this error condition.

232-233: Verify thread-local variable lifecycle

Need to ensure proper initialization and cleanup of thread_local_program_info.

src/crab/cfg.hpp (3)

25-28: LGTM: Well-designed exception class

The InvalidControlFlow exception class follows best practices with final specifier and explicit constructor.

330-330: Verify C++20 compatibility requirements

The change to std::ranges::sort requires C++20.

✅ Verification successful

C++20 compatibility is properly configured

The codebase is already set up for C++20:

CMakeLists.txt explicitly sets C++20 as required standard
Multiple C++20 features are in use including ranges, concepts, and requires expressions
std::ranges::sort usage aligns with existing C++20 code patterns

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check C++20 requirement is consistently enforced across the project

# Check for C++20 requirement in build files
rg -l "std=c\+\+20|cpp_standard.*20|CXX_STANDARD.*20" 

# Check for other ranges usage
rg "std::ranges::"

Length of output: 33885

Script:

#!/bin/bash
# Check for C++20 requirement in build files
rg -l "CMAKE_CXX_STANDARD|std=c\+\+|CXX_STANDARD" CMakeLists.txt

# Check for ranges usage
rg "std::ranges::" -A 2

Length of output: 3194

Script:

#!/bin/bash
# Check C++20 standard setting in CMakeLists.txt
rg "CMAKE_CXX_STANDARD.*20" -A 2 CMakeLists.txt

# Check for other modern C++20 features usage
rg "concept|requires|constinit|consteval|co_await|co_yield|co_return"

Length of output: 12124

Line range hint 1-1: Verify removal of unreachability checks

The removal of mark_alive_blocks and remove_unreachable_blocks methods needs verification to ensure unreachability is still properly handled.

src/asm_ostream.cpp (2)

260-260: Thread safety improvement by using thread-local storage.

The change from global to thread-local storage eliminates potential race conditions in multi-threaded environments.

260-260: Verify thread_local_program_info initialization.

Ensure that thread_local_program_info is properly initialized before accessing the platform.

✅ Verification successful

Thread-local program info is properly initialized before use

The code shows a clear initialization pattern:

Initialized in unmarshal() via thread_local_program_info = raw_prog.info
Set in prepare_cfg() via thread_local_program_info.set(info)
Properly cleared in ebpf_verifier_clear_thread_local_state()

The access at line 260 in asm_ostream.cpp is safe as it occurs after initialization in the program flow.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Search for thread_local_program_info initialization and usage patterns
# Test: Look for initialization before use and potential null checks

# Search for initialization patterns
rg -A 5 "thread_local_program_info\s*=|thread_local_program_info\s*\.reset"

# Search for null checks or safety patterns
rg "thread_local_program_info\s*[!=]=\s*nullptr|if\s*\(\s*!?\s*thread_local_program_info\s*\)|assert\s*\(\s*thread_local_program_info\s*\)"

Length of output: 607

Script:

#!/bin/bash
# Let's check the declaration and definition of thread_local_program_info
rg "thread_local.*program_info" -B 2 -A 2

# Check where it's used to understand the access pattern
rg "thread_local_program_info->" -B 2 -A 2

# Look for the type definition
ast-grep --pattern 'struct $_ProgramInfo {
  $$$
}'

Length of output: 19583

src/crab/split_dbm.cpp (1)

1208-1211: Verify overflow handling assumption.

The assertion assert(!overflow) assumes that convert_NtoW_overflow(e.constant_term(), out) never returns true. This assumption should be verified, as overflow in constant term conversion could lead to incorrect results.

Run the following script to find all callers and verify the assumption:

test-data/unsigned.yaml (2)

10-10: LGTM! Consistent error message format across all test cases.

The updated error messages follow a clear and consistent pattern: "Code becomes unreachable (assume )". This improves readability and debugging by explicitly stating the assumption that caused the unreachability.

Also applies to: 18-18, 26-26, 34-34, 42-42, 50-50, 58-58, 66-66, 74-74, 82-82, 90-90, 98-98, 106-106, 114-114, 122-122, 130-130, 138-138, 146-146, 154-154, 162-162, 170-170, 178-178, 187-187, 196-196, 205-205, 214-214, 223-223, 232-232, 241-241, 250-250, 259-259, 268-268, 277-277, 286-286, 295-295, 304-304, 313-313, 322-322, 331-331, 340-340, 349-349, 358-358, 368-368, 378-378, 394-394, 403-403, 412-412, 422-422, 432-432, 443-443, 454-454, 465-465, 476-476, 487-487, 498-498, 508-508, 517-517, 527-527, 536-536, 546-546, 599-599, 608-608, 688-688, 697-697, 1088-1088, 1114-1114, 1122-1122, 1251-1251

Line range hint 1-1251: Comprehensive test coverage for integer comparisons.

The test suite thoroughly covers:

Signed and unsigned comparisons
Range-based comparisons
Edge cases with MIN/MAX values
Various comparison operators (<, >, <=, >=, !=)
Integer overflow scenarios

src/linux/gpl/spec_prototypes.cpp (2)

Line range hint 1-2283: LGTM! Well-structured helper prototypes.

The helper prototypes are well-organized, consistently formatted, and properly documented.

2283-2287: Verify thread-safety of the context descriptor check.

The transition from global to thread-local program info needs verification to ensure thread-safety is maintained, particularly around the context descriptor comparison.

✅ Verification successful

The search results show that the context descriptor check is part of a broader atomic operations framework in the codebase. The transition from global to thread-local program info is thread-safe by design since:

The test files show extensive atomic operation testing infrastructure
The context descriptor comparison is a read-only operation
Thread-local storage provides implicit thread safety for the program info access

Thread-safety is maintained in the context descriptor check

The transition to thread-local storage eliminates the need for explicit synchronization since each thread has its own isolated copy of the program info.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for other thread-local program info usages and potential race conditions.

# Search for other global_program_info references that might need similar changes
rg "global_program_info" -A 3

# Search for any thread synchronization mechanisms around program info access
rg -e "mutex" -e "lock" -e "atomic" -A 3

Length of output: 50785

src/crab/fwd_analyzer.cpp (9)

53-53: Introduction of _inv invariant table is appropriate.

The addition of invariant_table_t _inv; consolidates pre and post invariants, improving code organization.

66-70: Accessor methods for invariants are correctly implemented.

The methods set_pre, get_pre, and get_post properly manage access to the invariant table _inv.

83-83: Efficient assignment to post invariants using std::move.

Assigning pre to _inv.at(label).post with std::move optimizes performance by avoiding unnecessary copies.

107-109: Immediate return of entry label's pre-state enhances clarity.

Directly returning get_pre(node) for the entry label simplifies the logic in join_all_prevs.

120-120: Initialization of invariant table entries.

Using _inv.emplace to initialize each label with bottom pre and post invariants is correct and sets up the fixpoint iterator properly.

128-128: Friend declaration for run_forward_analyzer.

Declaring run_forward_analyzer as a friend function allows it to access private members as needed.

Line range hint 131-146: Implementation of run_forward_analyzer function.

The function correctly sets up and runs the forward analyzer using the interleaved fixpoint iterator.

🧰 Tools

🪛 cppcheck

[style] 131-131: The function 'run_forward_analyzer' is never used.

(unusedFunction)

158-161: Proper computation and setting of pre-state.

Calculating pre using join_all_prevs(node) and setting it with set_pre(node, pre) follows the fixpoint algorithm correctly.

131-131: Verify usage of run_forward_analyzer function.

Static analysis indicates that the function run_forward_analyzer is never used. Verify whether this function is necessary or consider removing it to clean up the codebase.

Run the following script to check for usages of run_forward_analyzer in the codebase:

🧰 Tools

🪛 cppcheck

[style] 131-131: The function 'run_forward_analyzer' is never used.

(unusedFunction)

src/crab_verifier.cpp (3)

128-134: ⚠️ Potential issue

Rename inner loop variable to avoid shadowing member variable warnings

In the loop, the variable warnings shadows the member variable warnings. This may lead to confusion or errors. Rename the inner variable to enhance code readability.

Apply this diff to rename the inner variable:

         for (const auto& [label, warnings] : warnings) {
-            for (const auto& msg : warnings) {
+            for (const auto& msg : warning_list) {
                 printer.print_line_info(label);
                 os << label << ": " << msg << "\n";
             }
         }

🧰 Tools

🪛 cppcheck

[error] 134-134: Syntax Error

(internalAstError)

197-200: ⚠️ Potential issue

Rename local variable warnings to avoid shadowing member variable

The local variable warnings in this scope shadows the member variable warnings, which can cause confusion. Consider renaming the local variable for clarity.

Apply this diff to rename the local variable:

             const auto warnings = ebpf_domain_check(inv_pair.pre, assertion);
-            for (const auto& msg : warnings) {
+            for (const auto& msg : warnings) {
+                 report->warnings[label].emplace_back(msg);
+            }

162-167: ⚠️ Potential issue

Rename inner loop variable to avoid shadowing member variable warnings

The variable warnings in this loop shadows the member variable warnings. Renaming it will prevent potential confusion.

Apply this diff to rename the inner variable:

         for (const auto& [label, warnings] : warnings) {
-            for (const auto& msg : warnings) {
+            for (const auto& msg : warning_list) {
                 result.insert(to_string(label) + ": " + msg);
             }
         }

src/main/check.cpp (2)

236-236: ⚠️ Potential issue

Check initialization of thread_local_program_info before use

Ensure that thread_local_program_info is properly initialized before accessing map_descriptors to prevent potential null pointer dereferences.

Run the following script to identify any usage of thread_local_program_info without prior initialization:

#!/bin/bash
# Description: Find usages of `thread_local_program_info` that may not be initialized.

# Expected: All accesses to `thread_local_program_info` should be after initialization.

rg --no-heading --line-number 'thread_local_program_info->' src/ | while read -r line; do
    file=$(echo "$line" | cut -d: -f1)
    lineno=$(echo "$line" | cut -d: -f2)
    before=$(($lineno - 5))
    rg --no-heading --line-number -A5 -B5 'thread_local_program_info =' "$file" | rg -q ":$before:" && echo "$line"
done

127-134: Ensure consistency with updated verbosity options

The command-line flags now reference ebpf_verifier_options.verbosity_opts.* instead of ebpf_verifier_options.*. Verify that all related code reflects this change to maintain consistency.

Run the following script to check for any remaining references to the old verbosity option paths:

✅ Verification successful

Verbosity options are consistently using the new structure

All references to verbosity options in the codebase are correctly using the verbosity_opts structure. No instances of direct access to ebpf_verifier_options.simplify, ebpf_verifier_options.print_line_info, or ebpf_verifier_options.dump_btf_types_json were found.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Search for outdated references to verbosity options.

# Expected: No occurrences of `ebpf_verifier_options.simplify` or similar outside the updated lines.

rg --no-heading --line-number 'ebpf_verifier_options\.(simplify|print_line_info|dump_btf_types_json)' src/

Length of output: 11568

Script:

#!/bin/bash
# Search for verbosity options in the codebase
rg -n "verbosity_opts" src/

Length of output: 2294

src/assertions.cpp (1)

196-196: Ensuring safe type conversion with gsl::narrow

Using gsl::narrow<uint32_t> enhances type safety by ensuring that ins.access.width fits into a uint32_t, preventing potential narrowing issues.

src/crab/ebpf_checker.cpp (5)

51-62: Const correctness in operator overloads

Marking the operator() methods as const correctly enforces immutability.

67-68: Ensure 'require' methods remain side-effect free

Making require methods const confirms they don't alter the object's state.

Also applies to: 71-71

74-79: Consistency in const methods for access checks

Declaring check_access_* methods as const maintains consistency and indicates no state modification.

83-84: Const member variables enhance safety

Declaring assertion and on_require as const prevents unintended modifications.

105-118: 'ebpf_domain_check' function is unused

Static analysis reports that ebpf_domain_check is never used. Verify its necessity or remove it.

🧰 Tools

🪛 cppcheck

[style] 105-105: The function 'ebpf_domain_check' is never used.

(unusedFunction)

src/test/test_verify.cpp (1)

50-65: ⚠️ Potential issue

Potential Logical Error in Program Selection Logic

In the VERIFY_PROGRAM macro, the condition if (count == 1 || raw_prog.function_name == program_name) may lead to unintended behavior when count > 1 and program_name is empty. This could result in skipping verification for programs when program_name is not specified.

Consider refining the condition to ensure all relevant programs are verified:

-            if (count == 1 || raw_prog.function_name == program_name) {                                       \
+            if (program_name.empty() || raw_prog.function_name == program_name) {                             \

src/crab/ebpf_transformer.cpp (3)

1828-1829: ⚠️ Potential issue

Ensure safe access to map descriptors via thread-local storage

At lines 1828-1829, map descriptors and types are retrieved using thread_local_program_info. Confirm that accessing these via thread-local storage is thread-safe and that thread_local_program_info is properly initialized in all threads that may execute this code.

To verify thread-local initialization, execute:

#!/bin/bash
# Description: Ensure `thread_local_program_info` is initialized in all relevant threads

# Find all thread entry points and check for initialization of `thread_local_program_info`
rg -A 5 "void\s+\w+ThreadEntryPoint"

# Find usages of `thread_local_program_info` to ensure they are within properly initialized contexts
rg "thread_local_program_info->"

1818-1818: Assess early return impact in Callx handler

At line 1818, the function returns early if a helper is not usable. Verify that this early return does not skip necessary processing or lead to inconsistent states elsewhere in the code.

Run the following script to analyze the impact:

✅ Verification successful

Early return in Callx handler is safe

The early return at line 1818 is correct and safe because:

It occurs after checking if the invariant is bottom
No state modifications happen before the return
The return only skips helper function processing when the helper is not usable
No cleanup is needed as the function only performs read operations up to that point

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check the code paths following the early return in `operator()(const Callx& call)`

# Find calls to `operator()(const Callx& call)` and analyze subsequent operations
rg -A 10 "void ebpf_transformer::operator\(\s*const Callx& callx\)"

# Look for any cleanup or state management that might be bypassed

Length of output: 2923

Script:

#!/bin/bash
# Find the Callx handler implementation and its context
rg -A 20 "void.*operator.*\(.*Callx.*\)" src/crab/ebpf_transformer.cpp

# Find any state modifications or cleanup after helper checks
ast-grep --pattern 'if (!thread_local_program_info->platform->is_helper_usable($_)) {
  $$$
}'

Length of output: 1729

1367-1367: ⚠️ Potential issue

Verify initialization of thread_local_program_info

Line 1367 accesses thread_local_program_info->type.context_descriptor. Ensure that thread_local_program_info is correctly initialized before this access to prevent potential null pointer dereferences.

To confirm proper initialization of thread_local_program_info, run:

src/asm_files.cpp (1)

310-310: 🧹 Nitpick (assertive)

Add missing space in the debug output message.

There's a missing space between "for" and path in the cout statement, which can lead to concatenated words in the console output.

Apply this change:

- std::cout << "Dumping BTF data for" << path << std::endl;
+ std::cout << "Dumping BTF data for " << path << std::endl;

src/linux/linux_platform.hpp

src/main/utils.hpp

src/crab/fwd_analyzer.hpp

src/asm_files.hpp

src/config.hpp

src/test/test_verify.cpp

src/crab/ebpf_transformer.cpp

src/asm_files.cpp

coveralls · 2024-11-17T14:29:30Z

coverage: 90.16% (-0.5%) from 90.632%
when pulling e8dc77c on 2transform-2once
into 8143eb2 on main.

Signed-off-by: Elazar Gershuni <[email protected]>

coderabbitai

Actionable comments posted: 8

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE

📥 Commits

Reviewing files that changed from the base of the PR and between e699599 and 03ca56c.

📒 Files selected for processing (7)

src/crab_verifier.cpp (1 hunks)
src/crab_verifier.hpp (1 hunks)
src/main/check.cpp (3 hunks)
src/main/linux_verifier.cpp (1 hunks)
src/main/utils.hpp (0 hunks)
src/test/ebpf_yaml.cpp (4 hunks)
src/test/test_verify.cpp (5 hunks)

💤 Files with no reviewable changes (1)

src/main/utils.hpp

🧰 Additional context used

📓 Learnings (1)

src/test/ebpf_yaml.cpp (1)

Learnt from: dthaler
PR: vbpf/ebpf-verifier#721
File: src/ebpf_yaml.cpp:327-329
Timestamp: 2024-11-12T09:50:16.890Z
Learning: In `src/ebpf_yaml.cpp`, avoid suggesting changes to error handling code in the `run_conformance_test_case` function if the code is not modified in the current PR.

🪛 cppcheck

src/crab_verifier.cpp

[error] 134-134: Syntax Error

(internalAstError)

src/main/linux_verifier.cpp

[style] 20-20: The function 'bpf_verify_program' is never used.

(unusedFunction)

src/test/ebpf_yaml.cpp

[style] 377-377: The function 'print_failure' is never used.

(unusedFunction)

🔇 Additional comments (19)

src/main/linux_verifier.cpp (4)

17-17: Documentation accurately reflects the return value.

22-22: Skipping comment about buffer size configuration.

A previous review already covered this issue.

40-43: Clean and precise timing implementation using modern C++.

30-31: Ensure data pointer validity during syscall.

The raw program data pointer must remain valid throughout the syscall execution. Consider copying the data to ensure it's not moved or deallocated.

✅ Verification successful

Data pointer is safe during syscall execution

The raw_prog is passed by const reference and remains valid throughout the function scope. The syscall is executed synchronously within bpf_verify_program, ensuring the vector's data remains valid.

Key evidence:

raw_prog is passed by const reference in function signature: const std::vector<ebpf_inst>& raw_prog
No async operations or data movement detected in the usage context
Single point of data access during synchronous syscall

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if the vector's data is accessed elsewhere in the codebase
rg -B 2 -A 2 "raw_prog\.data\(\)"

Length of output: 462

Script:

#!/bin/bash
# Check the lifetime of raw_prog and its usage context
ast-grep --pattern 'raw_prog'

# Check the function signature and context
ast-grep --pattern 'verify_program($$$)'

Length of output: 3052

src/test/test_verify.cpp (6)

35-46: Skip comment as issue was already identified.

50-63: LGTM! Robust program verification with count validation.

The macro correctly handles both single and multiple programs, with proper validation of the program count.

75-79: Skip comment as issue was already identified.

85-87: Skip comment as issue was already identified.

510-513: Skip comment as issue was already identified.

Also applies to: 526-530

606-607: Skip comment as issue was already identified.

src/crab_verifier.cpp (1)

193-197: Previous comment on redundant function remains valid

The function ebpf_verifier_clear_thread_local_state is still present and appears redundant after introducing ebpf_verifier_clear_before_analysis. Consider removing it to eliminate redundancy.

src/main/check.cpp (3)

264-268: Termination message condition is correctly implemented

The conditional logic ensures that termination information is displayed only when verification passes and the appropriate verbosity options are set.

269-271: Result output and return statement are appropriate

The program correctly outputs the verification result and returns the proper exit code based on the pass status.

140-141: 🧹 Nitpick (assertive)

Maintain consistent method chaining format

Consider placing ->group("Verbosity") on the same line as the preceding code for consistency.

Apply this diff to align with the formatting style:

 app.add_flag("-f", ebpf_verifier_options.verbosity_opts.print_failures, "Print verifier's failure logs")
-    ->group("Verbosity");
+    ->group("Verbosity");

src/test/ebpf_yaml.cpp (5)

174-174: Refactored simplify option under verbosity_opts

The simplify option has been moved under verbosity_opts, which improves the organization of verbosity-related options within ebpf_verifier_options_t.

Also applies to: 196-196

180-180: Ensure correct endianness detection

Setting options.big_endian based on the native endianness ensures proper behavior across different platforms.

246-246: Duplicate assignment to thread_local_options

The assignment to thread_local_options is duplicated in both run_yaml_test_case and run_conformance_test_case. Refactor this common code into a helper function to enhance maintainability.

Also applies to: 365-365

261-270: Improved exception handling with specific catch

Catching crab::InvalidControlFlow& ex specifically enhances error reporting and allows for more precise handling of control flow exceptions.

377-377: Unused function print_failure

The print_failure function remains unused in the codebase. Consider removing it to keep the code clean and avoid unnecessary maintenance.

🧰 Tools

🪛 cppcheck

[style] 377-377: The function 'print_failure' is never used.

(unusedFunction)

src/crab_verifier.hpp

src/crab_verifier.cpp

src/main/check.cpp

src/test/ebpf_yaml.cpp

Signed-off-by: Elazar Gershuni <[email protected]>

coderabbitai

Actionable comments posted: 0

🧹 Outside diff range comments (1)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE

📥 Commits

Reviewing files that changed from the base of the PR and between 03ca56c and e8dc77c.

📒 Files selected for processing (1)

src/crab_verifier.hpp (1 hunks)

🔇 Additional comments (2)

src/crab_verifier.hpp (2)

11-24: Review of 'Report' class implementation

The 'Report' class is well-defined with appropriate member functions.

26-44: Review of 'Invariants' class implementation

The 'Invariants' class methods are correctly implemented and align with the intended functionality.

coderabbitai bot reviewed Nov 17, 2024

View reviewed changes

elazarg added 2 commits November 17, 2024 16:38

no abstract classes and no unique_ptr

835e315

Signed-off-by: Elazar Gershuni <[email protected]>

no higher-order timed_execution()

03ca56c

Signed-off-by: Elazar Gershuni <[email protected]>

coderabbitai bot reviewed Nov 17, 2024

View reviewed changes

CR comment

e8dc77c

Signed-off-by: Elazar Gershuni <[email protected]>

coderabbitai bot reviewed Nov 17, 2024

View reviewed changes

elazarg merged commit 04f0e92 into main Nov 17, 2024
17 of 19 checks passed

elazarg deleted the 2transform-2once branch November 17, 2024 16:06

elazarg mentioned this pull request Nov 17, 2024

Do not transform when checking + check code reorg #793

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

No transform when checking + check code reorg (second attempt) #796

No transform when checking + check code reorg (second attempt) #796

elazarg commented Nov 17, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Nov 17, 2024 •

edited

Loading

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

coderabbitai bot left a comment

coveralls commented Nov 17, 2024 •

edited

Loading

coderabbitai bot left a comment

coderabbitai bot left a comment

No transform when checking + check code reorg (second attempt) #796

No transform when checking + check code reorg (second attempt) #796

Conversation

elazarg commented Nov 17, 2024 • edited by coderabbitai bot Loading

Summary by CodeRabbit

Release Notes

coderabbitai bot commented Nov 17, 2024 • edited Loading

Walkthrough

Changes

Possibly related PRs

Suggested reviewers

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

coveralls commented Nov 17, 2024 • edited Loading

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

elazarg commented Nov 17, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Nov 17, 2024 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)

coveralls commented Nov 17, 2024 •

edited

Loading