Reduce lock hold time in the ringbuffer implementation #835

Alan-Jowett · 2024-11-23T22:52:16Z

Description

Describe the purpose of and changes within this Pull Request.

Testing

Do any existing tests cover this change? Are new tests needed?

Documentation

Is there any documentation impact for this change?

Installation

Is there any installer impact for this change?

Summary by CodeRabbit

Release Notes

New Features
- Enhanced asynchronous handling for ring buffer operations, improving data integrity during writes.
- Introduced a new function for dynamic data writing to the ring buffer.
Bug Fixes
- Improved error handling for memory allocation failures in ring buffer operations.
Tests
- Added new stress test for concurrent producer-consumer scenarios.
- Updated existing tests for better accuracy and clarity in ring buffer functionality.

coderabbitai · 2024-11-23T22:52:22Z

Walkthrough

The changes in this pull request enhance the eBPF API's handling of asynchronous operations and ring buffer management. Key modifications include improved concurrency control through the introduction of spin locks, refined error handling for asynchronous IOCTL operations, and a new inline function for dynamic data writing to the ring buffer. Additionally, the testing framework for the ring buffer has been expanded with new test cases, including stress tests for concurrent operations. Overall, these updates aim to improve the robustness and performance of the eBPF API.

Changes

File Path	Change Summary
`libs/api/ebpf_api.cpp`	Updated functions for asynchronous handling; refined error handling; improved subscription lifecycle management; enhanced code readability and maintainability.
`libs/runtime/ebpf_ring_buffer.c`	Modified `ebpf_ring_buffer_t` structure for better concurrency; replaced locking mechanisms with spin locks; refactored buffer management logic; removed several inline functions.
`libs/runtime/ebpf_ring_buffer.h`	Added new inline function `ebpf_ring_buffer_output` for dynamic data writing; existing functions remain unchanged.
`libs/runtime/unit/platform_unit_test.cpp`	Enhanced tests for ring buffer output, reserve, and discard; added new stress test case; improved overall test structure and error handling.
`libs/shared/ebpf_ring_buffer_record.h`	Modified `ebpf_ring_buffer_record_t` structure; added macros and inline functions for better status flag management and record size retrieval.
`external/usersim`	Updated subproject commit reference to a newer version.

Poem

In the burrow deep, where the buffers flow,
A rabbit hops high, with a gleeful glow.
Spin locks are spinning, errors now tamed,
With tests that are stronger, our records are claimed.
So here’s to the changes, both swift and profound,
In the world of eBPF, new joys abound! 🐇✨

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 2c1d84d and d54a129.

📒 Files selected for processing (1)

external/usersim (1 hunks)

✅ Files skipped from review due to trivial changes (1)

external/usersim

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 6

🧹 Outside diff range and nitpick comments (6)

libs/shared/ebpf_ring_buffer_record.h (3)
8-11: Consider using size_t for flag definitions

The flag values are cast to long, but since they're used with size values, consider using size_t instead for better type consistency and platform independence.
-#define EBPF_RING_BUFFER_RECORD_FLAG_LOCKED (long)(0x1ul << EBPF_RING_BUFFER_RECORD_FLAG_LOCKED_OFFSET)
-#define EBPF_RING_BUFFER_RECORD_FLAG_DISCARDED (long)(0x1ul << EBPF_RING_BUFFER_RECORD_FLAG_DISCARDED_OFFSET)
+#define EBPF_RING_BUFFER_RECORD_FLAG_LOCKED ((size_t)1 << EBPF_RING_BUFFER_RECORD_FLAG_LOCKED_OFFSET)
+#define EBPF_RING_BUFFER_RECORD_FLAG_DISCARDED ((size_t)1 << EBPF_RING_BUFFER_RECORD_FLAG_DISCARDED_OFFSET)
15-17: Improve documentation clarity and verify padding assumption

The comment about padding needs to be more explicit about alignment requirements and the actual padding calculation.
-    long size; ///< Size of the record in bytes. The lower 30 bits are the size, the 31st bit is the locked flag, and
-               ///< the 32nd bit is the discarded flag. Next record starts at this + size + sizeof(size) + padding (to
-               ///< 8).
+    long size; ///< Size of the record in bytes:
+               ///<   - Bits 0-29: Record size (max ~1GB)
+               ///<   - Bit 30: Discarded flag
+               ///<   - Bit 31: Locked flag
+               ///< The next record starts at an 8-byte aligned offset:
+               ///<   current_offset + sizeof(size) + size + padding_to_8_bytes
42-59: LGTM! Consider documenting memory ordering requirements

The implementation correctly uses atomic reads and bit manipulation. The functions are well-designed for the lock-free approach, which aligns with the PR's objective of reducing lock hold time.

Consider adding documentation about memory ordering requirements:
+/**
+ * @brief Check if a record has been marked as discarded.
+ * @param[in] record Pointer to the ring buffer record.
+ * @return true if the record is discarded, false otherwise.
+ * @note Uses relaxed memory ordering (ReadNoFence).
+ */
 inline const bool
 ebpf_ring_buffer_record_is_discarded(_In_ const ebpf_ring_buffer_record_t* record)
libs/runtime/ebpf_ring_buffer.h (1)
41-43: Document locking behavior in function comments

Consider enhancing the function documentation to describe its locking behavior and any assumptions about concurrent access. This helps users understand the performance characteristics and thread-safety guarantees.

Add the following to the function documentation:
 /**
  * @brief Write out a variable sized record to the ring buffer.
  *
+ * @details This function minimizes lock hold time by using a two-phase
+ * approach: first reserving space, then copying data. The actual data
+ * copy occurs outside the critical section.
+ *
  * @param[in, out] ring_buffer Ring buffer to write to.
  * @param[in] data Data to copy into record.
  * @param[in] length Length of data to copy.
  * @retval EBPF_SUCCESS Successfully wrote record ring buffer.
  * @retval EBPF_OUT_OF_SPACE Unable to output to ring buffer due to inadequate space.
  */
libs/runtime/unit/platform_unit_test.cpp (1)
1149-1248: LGTM! Comprehensive stress test for concurrent ring buffer operations.

The new test effectively validates the ring buffer's thread safety and data integrity under load by:

Running multiple producer threads with different data patterns

Verifying data integrity in the consumer thread

Using proper synchronization primitives

Running for a sufficient duration (10 seconds)

A few suggestions to make the test even more robust:

Consider parameterizing the test duration to allow for longer stress testing when needed

Add metrics for tracking the number of successful vs failed operations

Consider adding varying data sizes to test different allocation patterns
-    std::this_thread::sleep_for(std::chrono::seconds(10));
+    const auto test_duration = std::chrono::seconds(10);
+    std::atomic<size_t> failed_operations = 0;
+    
+    std::this_thread::sleep_for(test_duration);
+    
+    // Report metrics
+    std::cout << "Test ran for " << test_duration.count() << " seconds\n"
+              << "Total records: " << (a_records + b_records) << "\n"
+              << "Failed operations: " << failed_operations << "\n";
libs/api/ebpf_api.cpp (1)
4278-4278: Simplify cast to avoid unnecessary const_cast

The expression const_cast<void*>(reinterpret_cast<const void*>(record->data)) is redundant. You can simplify it to record->data since it's already a void* pointer.

Apply this diff to simplify the cast:
-        const_cast<void*>(reinterpret_cast<const void*>(record->data)),
+        record->data,

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 5f5c451 and d1c0713.

📒 Files selected for processing (6)

external/usersim (1 hunks)
libs/api/ebpf_api.cpp (1 hunks)
libs/runtime/ebpf_ring_buffer.c (4 hunks)
libs/runtime/ebpf_ring_buffer.h (1 hunks)
libs/runtime/unit/platform_unit_test.cpp (3 hunks)
libs/shared/ebpf_ring_buffer_record.h (2 hunks)

✅ Files skipped from review due to trivial changes (1)

external/usersim

🔇 Additional comments (9)

libs/shared/ebpf_ring_buffer_record.h (1)

14-19: Verify atomic operations in all consumers

The change from bit-fields to a single atomic field is a good improvement for reducing lock hold time. Let's verify that all consumers properly handle the atomic operations.

✅ Verification successful

Let me gather more information about the usage patterns to ensure completeness of the verification.

All consumers properly handle atomic operations and memory barriers

The verification shows that all accesses to the ring buffer record's size field are properly handled:

Helper functions in ebpf_ring_buffer_record.h consistently use ReadNoFence for flag checks and size reads
Consumer code in ebpf_api.cpp uses these helper functions exclusively for accessing the record
Test code in platform_unit_test.cpp correctly uses ReadAcquire when needed for synchronization
No direct, non-atomic accesses to the size field were found in the codebase

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for direct accesses to the size field that might not use atomic operations
rg -A 2 "record->size" --type cpp

# Search for potential consumers of the ring buffer record
ast-grep --pattern 'ebpf_ring_buffer_record_t* $var = $_'

Length of output: 1490

Script:

#!/bin/bash
# Search for ring buffer record access patterns
rg -A 5 "ebpf_ring_buffer_record" --type cpp

# Look for atomic operations and memory barriers
rg -A 2 "(ReadNoFence|ReadAcquire|WriteNoFence|WriteRelease)" --type cpp

Length of output: 7659

libs/runtime/ebpf_ring_buffer.h (1)

41-43: Verify thread-safety of the new implementation

Since this PR aims to reduce lock hold time, we should verify that the new implementation maintains thread-safety while improving performance.

✅ Verification successful

Let me gather more context about the implementation of ebpf_ring_buffer_output to verify its thread-safety.

Based on the gathered information, I can now provide the final response regarding thread-safety:

Thread-safety is properly implemented with appropriate locking mechanisms

The implementation shows a well-designed thread-safe approach:

Uses separate spin locks for producers and consumers (producer_lock and consumer_lock)
The ebpf_ring_buffer_output function internally uses ebpf_ring_buffer_reserve which handles proper locking
Memory operations use ReadNoFence64 for safe concurrent access
Lock acquisition and release are properly paired with IRQL management

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Search for lock usage in the ring buffer implementation
# to verify thread-safety.

# Look for lock-related patterns in the implementation
rg -A 5 "cxplat_spin_lock|_lock_|mutex|critical_section" libs/runtime/ebpf_ring_buffer.c

# Look for concurrent access patterns
ast-grep --pattern 'ebpf_ring_buffer_output($$$)'

Length of output: 1817

Script:

#!/bin/bash
# Search for the implementation of ebpf_ring_buffer_output
ast-grep --pattern 'ebpf_ring_buffer_output($$$) {
  $$$
}'

# Look for related reserve/commit functions that might be used
rg -A 10 "ebpf_ring_buffer_(reserve|commit)" libs/runtime/ebpf_ring_buffer.c

Length of output: 910

libs/runtime/unit/platform_unit_test.cpp (2)

1060-1060: LGTM! The changes improve atomic operations safety.

The modifications properly use EBPF_PAD_8 for size calculations and switch to direct size access instead of going through the header structure. This aligns with the PR objective of reducing lock hold time by:

Ensuring proper 8-byte alignment for atomic operations
Simplifying the record structure to reduce lock contention

Also applies to: 1065-1065, 1067-1067, 1073-1074

1137-1137: LGTM! More accurate error code.

The change from EBPF_INVALID_ARGUMENT to EBPF_NO_MEMORY is more semantically accurate for the case of attempting to reserve more memory than available in the ring buffer.

libs/runtime/ebpf_ring_buffer.c (5)

39-39: Initialize the mask field correctly

Setting local_ring_buffer->mask = capacity - 1; correctly initializes the mask field for efficient indexing in the ring buffer when the capacity is a power of two. This is a good practice for ring buffers that use bitwise operations.

75-84: Refactored ebpf_ring_buffer_output enhances performance and clarity

The updated implementation of ebpf_ring_buffer_output utilizes ebpf_ring_buffer_reserve and ebpf_ring_buffer_submit, removing the need for explicit locking within this function. This refactoring simplifies the code and can improve performance by leveraging the underlying reservation and submission mechanisms.

218-228: Ensure proper flag handling in ebpf_ring_buffer_submit

The function correctly verifies that the EBPF_RING_BUFFER_RECORD_FLAG_LOCKED flag is set before submitting the record and then clears the flag. This ensures that only properly reserved records are submitted.

234-244: Correctly handle record discard in ebpf_ring_buffer_discard

The implementation properly sets the EBPF_RING_BUFFER_RECORD_FLAG_DISCARDED flag when a record is discarded, ensuring that the consumer can identify and skip over discarded records.

168-211: Verify that producer_offset and consumer_offset maintain correct order

In the calculation of remaining_space, if producer_offset is less than consumer_offset, it could result in incorrect remaining space due to underflow. Please verify that producer_offset is always greater than or equal to consumer_offset to prevent negative remaining_space values.

Run the following script to check for any instances where producer_offset might be less than consumer_offset:

✅ Verification successful

Based on the gathered information, I can now provide a final response:

The ring buffer implementation correctly maintains producer and consumer offset ordering

The implementation is safe from underflow in remaining_space calculation for these reasons:

The ring buffer structure shows both offsets are initialized to 0 (implied by C semantics)
Producer offset only increases by effective_length during writes
Consumer offset only increases by size during reads
The code has proper space validation: remaining_space = ring->length - (producer_offset - consumer_offset)
Writes are protected by producer_lock and reads by consumer_lock
The implementation uses proper memory barriers via ReadNoFence64 and WriteNoFence64

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that producer_offset is always greater than or equal to consumer_offset.

# Search for code paths where producer_offset might be less than consumer_offset.
rg -A 5 -B 5 'producer_offset.*<.*consumer_offset' libs/runtime/

Length of output: 66

Script:

#!/bin/bash
# Let's analyze the ring buffer implementation more thoroughly

# 1. Find all places where producer_offset is modified
rg -A 5 "WriteNoFence64.*producer_offset" libs/runtime/

# 2. Find all places where consumer_offset is modified
rg -A 5 "WriteNoFence64.*consumer_offset" libs/runtime/

# 3. Look for ring buffer initialization to understand initial state
ast-grep --pattern 'ebpf_ring_buffer_create'

# 4. Check for any arithmetic operations involving these offsets
rg -A 3 "(producer_offset|consumer_offset).*[+\-*/]" libs/runtime/

Length of output: 4660

Script:

#!/bin/bash
# Let's examine the ring buffer initialization and structure

# 1. Look for ring buffer structure definition
rg -A 10 "struct.*ebpf_ring_buffer" libs/runtime/

# 2. Check the initialization code
rg -A 15 "ebpf_ring_buffer_create.*{" libs/runtime/

# 3. Look for any reset or clear operations
rg -A 5 "ring.*init|ring.*reset|ring.*clear" libs/runtime/

Length of output: 2611

coderabbitai · 2024-11-23T22:56:17Z

libs/runtime/ebpf_ring_buffer.c

+    int64_t length;
+    int64_t mask;
+    cxplat_spin_lock_t producer_lock;
+    cxplat_spin_lock_t consumer_lock;
+    volatile int64_t consumer_offset;
+    volatile int64_t producer_offset;


⚠️ Potential issue

Ensure proper use of signed integers for buffer sizes and offsets

The change from size_t to int64_t for length, consumer_offset, and producer_offset introduces signed integers for values that should always be non-negative. Using signed integers may lead to underflow or incorrect calculations if negative values are inadvertently assigned. Consider using uint64_t instead to represent these values and prevent potential issues with negative numbers.

Apply this diff to change the types to unsigned integers:

- int64_t length; - int64_t mask; - volatile int64_t consumer_offset; - volatile int64_t producer_offset; + uint64_t length; + uint64_t mask; + volatile uint64_t consumer_offset; + volatile uint64_t producer_offset;

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

int64_t length;

int64_t mask;

cxplat_spin_lock_t producer_lock;

cxplat_spin_lock_t consumer_lock;

volatile int64_t consumer_offset;

volatile int64_t producer_offset;

uint64_t length;

uint64_t mask;

cxplat_spin_lock_t producer_lock;

cxplat_spin_lock_t consumer_lock;

volatile uint64_t consumer_offset;

volatile uint64_t producer_offset;

coderabbitai · 2024-11-23T22:56:17Z

libs/runtime/ebpf_ring_buffer.c

+    if (remaining_space > effective_length) {
        result = EBPF_INVALID_ARGUMENT;
-        goto Done;
+        goto Exit;


⚠️ Potential issue

Fix the logic error in remaining_space validation

The current condition may lead to accepting invalid states. The logic should ensure that there is sufficient data to process:

- if (remaining_space > effective_length) { + if (remaining_space < effective_length) { result = EBPF_INVALID_ARGUMENT; goto Exit; }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if (remaining_space > effective_length) {

result = EBPF_INVALID_ARGUMENT;

goto Done;

goto Exit;

if (remaining_space < effective_length) {

result = EBPF_INVALID_ARGUMENT;

goto Exit;

coderabbitai · 2024-11-23T22:56:17Z

libs/runtime/ebpf_ring_buffer.c

+        KeLowerIrql(old_irql);
+    }
+
+    EBPF_RETURN_RESULT(EBPF_SUCCESS);


⚠️ Potential issue

Return the correct result after processing

Ensure that the function returns the appropriate result code by changing:

- EBPF_RETURN_RESULT(EBPF_SUCCESS); + EBPF_RETURN_RESULT(result);

Committable suggestion skipped: line range outside the PR's diff.

coderabbitai · 2024-11-23T22:56:17Z

libs/runtime/ebpf_ring_buffer.c

+    *consumer = (size_t)ReadAcquire64(&ring->consumer_offset);
+    *producer = (size_t)ReadAcquire64(&ring->producer_offset);


⚠️ Potential issue

Avoid casting int64_t to size_t to prevent potential data loss

Casting int64_t to size_t can lead to unexpected behavior if the int64_t values are negative, as size_t is an unsigned type. To ensure data integrity, consider changing the types of consumer_offset and producer_offset to uint64_t, or add checks to confirm that the values are non-negative before casting.

Apply this diff to change the cast and variable types:

- *consumer = (size_t)ReadAcquire64(&ring->consumer_offset); - *producer = (size_t)ReadAcquire64(&ring->producer_offset); + *consumer = (uint64_t)ReadAcquire64(&ring->consumer_offset); + *producer = (uint64_t)ReadAcquire64(&ring->producer_offset);

And ensure the offsets are of type uint64_t as suggested earlier.

Committable suggestion skipped: line range outside the PR's diff.

coderabbitai · 2024-11-23T22:56:17Z

libs/runtime/ebpf_ring_buffer.c

+    KIRQL old_irql = KeGetCurrentIrql();
+    if (old_irql < DISPATCH_LEVEL) {
+        KeRaiseIrqlToDpcLevel();
    }

-    // Verify count.
-    while (local_length != 0) {
-        ebpf_ring_buffer_record_t* record = _ring_record_at_offset(ring, offset);
-        if (local_length < record->header.length) {
-            break;
-        }
-        offset += record->header.length;
-        local_length -= record->header.length;
+    cxplat_acquire_spin_lock_at_dpc_level(&ring->consumer_lock);
+
+    int64_t consumer_offset = ReadNoFence64(&ring->consumer_offset);
+    int64_t producer_offset = ReadNoFence64(&ring->producer_offset);
+    int64_t effective_length = EBPF_PAD_8(length + EBPF_OFFSET_OF(ebpf_ring_buffer_record_t, data));
+
+    if (consumer_offset == producer_offset) {
+        result = EBPF_INVALID_ARGUMENT;
+        goto Exit;
    }
-    // Did it end on a record boundary?
-    if (local_length != 0) {
-        EBPF_LOG_MESSAGE_UINT64(
-            EBPF_TRACELOG_LEVEL_ERROR,
-            EBPF_TRACELOG_KEYWORD_MAP,
-            "ebpf_ring_buffer_return: Invalid buffer length",
-            local_length);
+
+    int64_t remaining_space = producer_offset - consumer_offset;
+
+    if (remaining_space > effective_length) {
        result = EBPF_INVALID_ARGUMENT;
-        goto Done;
+        goto Exit;
    }

-    _ring_advance_consumer_offset(ring, length);
+    remaining_space = effective_length;
+
+    while (remaining_space > 0) {
+        ebpf_ring_buffer_record_t* record =
+            (ebpf_ring_buffer_record_t*)(ring->shared_buffer + (consumer_offset & ring->mask));
+
+        long size = ReadNoFence(&record->size);
+        size += EBPF_OFFSET_OF(ebpf_ring_buffer_record_t, data);
+        size = EBPF_PAD_8(size);
+
+        consumer_offset += size;
+        remaining_space -= size;
+
+        record->size = 0;
+    }
+
+    WriteNoFence64(&ring->consumer_offset, consumer_offset);
+
    result = EBPF_SUCCESS;

-Done:
-    ebpf_lock_unlock(&ring->lock, state);
-    EBPF_RETURN_RESULT(result);
+Exit:
+    cxplat_release_spin_lock_from_dpc_level(&ring->consumer_lock);
+
+    if (old_irql < DISPATCH_LEVEL) {
+        KeLowerIrql(old_irql);
+    }
+
+    EBPF_RETURN_RESULT(EBPF_SUCCESS);


⚠️ Potential issue

Correct logical conditions and ensure proper error handling in ebpf_ring_buffer_return

Incorrect condition in remaining space check:

The condition if (remaining_space > effective_length) is likely incorrect. If remaining_space is greater than effective_length, there should be enough data to proceed. The condition should be:

- if (remaining_space > effective_length) { + if (remaining_space < effective_length) { result = EBPF_INVALID_ARGUMENT; goto Exit; }

Incorrect return value ignores error conditions:

At the end of the function, EBPF_RETURN_RESULT(EBPF_SUCCESS); is called regardless of any errors that may have occurred. This can mask errors and cause unexpected behavior. Update the return statement to use the result variable:

- EBPF_RETURN_RESULT(EBPF_SUCCESS); + EBPF_RETURN_RESULT(result);

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

KIRQL old_irql = KeGetCurrentIrql();

if (old_irql < DISPATCH_LEVEL) {

KeRaiseIrqlToDpcLevel();

}

// Verify count.

while (local_length != 0) {

ebpf_ring_buffer_record_t* record = _ring_record_at_offset(ring, offset);

if (local_length < record->header.length) {

break;

}

offset += record->header.length;

local_length -= record->header.length;

cxplat_acquire_spin_lock_at_dpc_level(&ring->consumer_lock);

int64_t consumer_offset = ReadNoFence64(&ring->consumer_offset);

int64_t producer_offset = ReadNoFence64(&ring->producer_offset);

int64_t effective_length = EBPF_PAD_8(length + EBPF_OFFSET_OF(ebpf_ring_buffer_record_t, data));

if (consumer_offset == producer_offset) {

result = EBPF_INVALID_ARGUMENT;

goto Exit;

}

// Did it end on a record boundary?

if (local_length != 0) {

EBPF_LOG_MESSAGE_UINT64(

EBPF_TRACELOG_LEVEL_ERROR,

EBPF_TRACELOG_KEYWORD_MAP,

"ebpf_ring_buffer_return: Invalid buffer length",

local_length);

int64_t remaining_space = producer_offset - consumer_offset;

if (remaining_space > effective_length) {

result = EBPF_INVALID_ARGUMENT;

goto Done;

goto Exit;

}

_ring_advance_consumer_offset(ring, length);

remaining_space = effective_length;

while (remaining_space > 0) {

ebpf_ring_buffer_record_t* record =

(ebpf_ring_buffer_record_t*)(ring->shared_buffer + (consumer_offset & ring->mask));

long size = ReadNoFence(&record->size);

size += EBPF_OFFSET_OF(ebpf_ring_buffer_record_t, data);

size = EBPF_PAD_8(size);

consumer_offset += size;

remaining_space -= size;

record->size = 0;

}

WriteNoFence64(&ring->consumer_offset, consumer_offset);

result = EBPF_SUCCESS;

Done:

ebpf_lock_unlock(&ring->lock, state);

EBPF_RETURN_RESULT(result);

Exit:

cxplat_release_spin_lock_from_dpc_level(&ring->consumer_lock);

if (old_irql < DISPATCH_LEVEL) {

KeLowerIrql(old_irql);

}

EBPF_RETURN_RESULT(EBPF_SUCCESS);

KIRQL old_irql = KeGetCurrentIrql();

if (old_irql < DISPATCH_LEVEL) {

KeRaiseIrqlToDpcLevel();

}

cxplat_acquire_spin_lock_at_dpc_level(&ring->consumer_lock);

int64_t consumer_offset = ReadNoFence64(&ring->consumer_offset);

int64_t producer_offset = ReadNoFence64(&ring->producer_offset);

int64_t effective_length = EBPF_PAD_8(length + EBPF_OFFSET_OF(ebpf_ring_buffer_record_t, data));

if (consumer_offset == producer_offset) {

result = EBPF_INVALID_ARGUMENT;

goto Exit;

}

int64_t remaining_space = producer_offset - consumer_offset;

if (remaining_space < effective_length) {

result = EBPF_INVALID_ARGUMENT;

goto Exit;

}

remaining_space = effective_length;

while (remaining_space > 0) {

ebpf_ring_buffer_record_t* record =

(ebpf_ring_buffer_record_t*)(ring->shared_buffer + (consumer_offset & ring->mask));

long size = ReadNoFence(&record->size);

size += EBPF_OFFSET_OF(ebpf_ring_buffer_record_t, data);

size = EBPF_PAD_8(size);

consumer_offset += size;

remaining_space -= size;

record->size = 0;

}

WriteNoFence64(&ring->consumer_offset, consumer_offset);

result = EBPF_SUCCESS;

Exit:

cxplat_release_spin_lock_from_dpc_level(&ring->consumer_lock);

if (old_irql < DISPATCH_LEVEL) {

KeLowerIrql(old_irql);

}

EBPF_RETURN_RESULT(result);

coderabbitai · 2024-11-23T22:56:17Z

libs/api/ebpf_api.cpp

+            while (ebpf_ring_buffer_record_is_locked(record)) {
+                // The record is being written to by the producer.
+                // Wait for the producer to finish writing.
+                YieldProcessor();


🛠️ Refactor suggestion

Potential performance impact due to busy-wait loop

The spin-wait loop using YieldProcessor() may cause high CPU utilization if the producer holds the lock for an extended period. Consider using a more efficient synchronization mechanism, such as an event or condition variable, to avoid busy-waiting and reduce CPU consumption during the wait.

coderabbitai

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (1)

libs/runtime/unit/platform_unit_test.cpp (1)
1149-1248: LGTM! Comprehensive multi-threaded stress test for the ring buffer.

The test effectively validates:

Concurrent producer-consumer operations

Data integrity under load

Proper record size handling

Memory barriers and synchronization

Consider these improvements:

Make the test duration configurable via a parameter

Enhance error reporting by capturing which thread encountered the bad record and the actual vs expected values

Add statistics about the distribution of 'a' and 'b' records to verify fair scheduling

Example enhancement for error reporting:
-                        if (memcmp(record->data, data1.data(), data1.size()) != 0) {
-                            bad_record = true;
-                        }
+                        if (memcmp(record->data, data1.data(), data1.size()) != 0) {
+                            bad_record = true;
+                            printf("Data corruption in record type 'a': size=%d\n", actual_size);
+                            printf("Expected: %.*s\n", (int)data1.size(), (char*)data1.data());
+                            printf("Actual: %.*s\n", (int)data1.size(), (char*)record->data);
+                        }

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between d1c0713 and a5fe86d.

📒 Files selected for processing (8)

external/ebpf-verifier (1 hunks)
external/ubpf (1 hunks)
external/usersim (1 hunks)
libs/api/ebpf_api.cpp (1 hunks)
libs/runtime/ebpf_ring_buffer.c (4 hunks)
libs/runtime/ebpf_ring_buffer.h (1 hunks)
libs/runtime/unit/platform_unit_test.cpp (3 hunks)
libs/shared/ebpf_ring_buffer_record.h (2 hunks)

✅ Files skipped from review due to trivial changes (2)

external/ebpf-verifier
external/ubpf

🚧 Files skipped from review as they are similar to previous changes (3)

external/usersim
libs/runtime/ebpf_ring_buffer.h
libs/shared/ebpf_ring_buffer_record.h

🔇 Additional comments (10)

libs/runtime/ebpf_ring_buffer.c (5)

13-14: Good improvement: Split locks for better concurrency

The separation of producer and consumer locks is a good improvement that can reduce contention by allowing concurrent producer and consumer operations.

75-84: Well-structured implementation with minimal lock contention

The implementation correctly uses the reserve/submit pattern, which minimizes the lock hold time by separating the buffer allocation from the data copy operation.

169-176: Good optimization: Pre-checking before lock acquisition

The implementation efficiently checks for available space before acquiring the lock, which helps avoid unnecessary lock acquisitions when the buffer is full.

218-228: LGTM: Proper synchronization in submit operation

The implementation correctly uses memory barriers and proper flag handling for synchronization between producers and consumers.

234-245: LGTM: Proper synchronization in discard operation

The implementation correctly handles the record state transition with appropriate memory barriers.

libs/runtime/unit/platform_unit_test.cpp (2)

1060-1060: LGTM! The ring buffer alignment and position tracking look correct.

The test properly verifies:

Producer position advances by the padded record size
Record size is correctly read
Consumer position matches producer after consuming the record

Also applies to: 1065-1065, 1067-1067, 1073-1074

1137-1137: LGTM! Good negative test case for memory allocation.

The test properly verifies that attempting to reserve more memory than the ring buffer size returns EBPF_NO_MEMORY.

libs/api/ebpf_api.cpp (3)

4269-4272: Busy-wait loop detected

4275-4283: LGTM!

The ring buffer record processing logic is correct and safe:

Properly skips discarded records
Safely invokes user callback after record is unlocked
Allows user to stop processing by returning non-zero

4284-4284: LGTM!

The consumer offset update correctly advances by the full record size.

coderabbitai · 2024-11-23T22:58:59Z

libs/runtime/ebpf_ring_buffer.c

+    WriteNoFence(&record->size, (long)length | EBPF_RING_BUFFER_RECORD_FLAG_LOCKED);
    *data = record->data;
+
+    WriteNoFence64(&ring->producer_offset, producer_offset + effective_length);


⚠️ Potential issue

Ensure proper memory ordering between record initialization and offset update

There's a potential race condition where consumers might see the updated producer offset before the record is fully initialized. Consider using appropriate memory barriers or Release semantics for the record size write.

- WriteNoFence(&record->size, (long)length | EBPF_RING_BUFFER_RECORD_FLAG_LOCKED); + WriteRelease(&record->size, (long)length | EBPF_RING_BUFFER_RECORD_FLAG_LOCKED); *data = record->data; - WriteNoFence64(&ring->producer_offset, producer_offset + effective_length); + WriteRelease64(&ring->producer_offset, producer_offset + effective_length);

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

WriteNoFence(&record->size, (long)length | EBPF_RING_BUFFER_RECORD_FLAG_LOCKED);

*data = record->data;

WriteNoFence64(&ring->producer_offset, producer_offset + effective_length);

WriteRelease(&record->size, (long)length | EBPF_RING_BUFFER_RECORD_FLAG_LOCKED);

*data = record->data;

WriteRelease64(&ring->producer_offset, producer_offset + effective_length);

coderabbitai · 2024-11-23T22:58:59Z

libs/runtime/ebpf_ring_buffer.c

+    cxplat_acquire_spin_lock_at_dpc_level(&ring->consumer_lock);
+
+    int64_t consumer_offset = ReadNoFence64(&ring->consumer_offset);
+    int64_t producer_offset = ReadNoFence64(&ring->producer_offset);
+    int64_t effective_length = EBPF_PAD_8(length + EBPF_OFFSET_OF(ebpf_ring_buffer_record_t, data));
+
+    if (consumer_offset == producer_offset) {
+        result = EBPF_INVALID_ARGUMENT;
+        goto Exit;
    }
-    // Did it end on a record boundary?
-    if (local_length != 0) {
-        EBPF_LOG_MESSAGE_UINT64(
-            EBPF_TRACELOG_LEVEL_ERROR,
-            EBPF_TRACELOG_KEYWORD_MAP,
-            "ebpf_ring_buffer_return: Invalid buffer length",
-            local_length);
+
+    int64_t remaining_space = producer_offset - consumer_offset;
+
+    if (remaining_space > effective_length) {
        result = EBPF_INVALID_ARGUMENT;
-        goto Done;
+        goto Exit;
    }

-    _ring_advance_consumer_offset(ring, length);
+    remaining_space = effective_length;
+
+    while (remaining_space > 0) {
+        ebpf_ring_buffer_record_t* record =
+            (ebpf_ring_buffer_record_t*)(ring->shared_buffer + (consumer_offset & ring->mask));
+
+        long size = ReadNoFence(&record->size);
+        size += EBPF_OFFSET_OF(ebpf_ring_buffer_record_t, data);
+        size = EBPF_PAD_8(size);
+
+        consumer_offset += size;
+        remaining_space -= size;
+
+        record->size = 0;
+    }
+
+    WriteNoFence64(&ring->consumer_offset, consumer_offset);


🛠️ Refactor suggestion

Consider reducing lock hold time in the processing loop

The consumer lock is held during the entire record processing loop. For large numbers of records, this could lead to increased contention. Consider processing records in smaller batches or implementing a lock-free record processing mechanism.

cxplat_acquire_spin_lock_at_dpc_level(&ring->consumer_lock); + // Read values under lock + int64_t initial_consumer_offset = ReadNoFence64(&ring->consumer_offset); + int64_t final_consumer_offset = initial_consumer_offset; + cxplat_release_spin_lock_from_dpc_level(&ring->consumer_lock); + + // Process records without holding the lock + while (remaining_space > 0) { + ebpf_ring_buffer_record_t* record = + (ebpf_ring_buffer_record_t*)(ring->shared_buffer + (final_consumer_offset & ring->mask)); + + long size = ReadNoFence(&record->size); + size += EBPF_OFFSET_OF(ebpf_ring_buffer_record_t, data); + size = EBPF_PAD_8(size); + + final_consumer_offset += size; + remaining_space -= size; + + record->size = 0; + } + + // Re-acquire lock only for the final update + cxplat_acquire_spin_lock_at_dpc_level(&ring->consumer_lock); + WriteNoFence64(&ring->consumer_offset, final_consumer_offset); - // ... existing implementation ...

Committable suggestion skipped: line range outside the PR's diff.

Signed-off-by: Alan Jowett <[email protected]>

coderabbitai

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (2)

libs/runtime/ebpf_ring_buffer.c (1)

13-14: Good design choice with separate producer and consumer locks!

The separation of producer and consumer locks is an excellent improvement that can reduce lock contention by allowing concurrent producer and consumer operations. This aligns well with the PR's objective of reducing lock hold time.

Consider documenting the locking order to prevent potential deadlocks if any operation needs to acquire both locks.
libs/runtime/unit/platform_unit_test.cpp (1)
1149-1248: LGTM! Comprehensive stress test for concurrent ring buffer operations.

The test effectively verifies:

Data integrity under concurrent access

Lock handling

Producer backpressure

Consumer processing

Consider enhancing the test with additional verifications.

Consider adding:

Memory leak detection by tracking total allocated vs freed memory

Producer fairness verification by ensuring both record types have similar counts
 TEST_CASE("ring_buffer_stress", "[platform]")
 {
     // ... existing setup ...
+    std::atomic<size_t> total_allocated{0};
+    std::atomic<size_t> total_freed{0};
     
     // ... in consumer ...
     if (ebpf_ring_buffer_return(ring_buffer, actual_size) != EBPF_SUCCESS) {
         bad_record = true;
         break;
     }
+    total_freed += actual_size;
     
     // ... after stop ...
     REQUIRE(!bad_record);
     REQUIRE((a_records + b_records) > 0);
+    // Verify no memory leaks
+    REQUIRE(total_freed == total_allocated);
+    // Verify producer fairness (allow 20% difference)
+    size_t diff = (a_records > b_records) ? 
+                  (a_records - b_records) : 
+                  (b_records - a_records);
+    REQUIRE(diff <= (a_records + b_records) * 0.2);

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between a5fe86d and 2c1d84d.

📒 Files selected for processing (5)

libs/api/ebpf_api.cpp (1 hunks)
libs/runtime/ebpf_ring_buffer.c (4 hunks)
libs/runtime/ebpf_ring_buffer.h (1 hunks)
libs/runtime/unit/platform_unit_test.cpp (3 hunks)
libs/shared/ebpf_ring_buffer_record.h (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

libs/runtime/ebpf_ring_buffer.h
libs/shared/ebpf_ring_buffer_record.h

🔇 Additional comments (12)

libs/runtime/ebpf_ring_buffer.c (7)

75-84: Excellent refactoring to reduce lock hold time!

The new implementation properly separates buffer reservation from data copying, ensuring the lock is held only during the critical section in reserve. This is a significant performance improvement as the memory copy operation now occurs outside the lock.

169-176: Great optimization with pre-lock availability check!

The initial check for available space before acquiring the lock is an excellent optimization. It prevents unnecessary lock acquisition when the buffer is full.

218-228: LGTM! Proper memory ordering and state validation

The implementation correctly uses memory barriers and validates the record state before submission.

234-245: LGTM! Consistent implementation with submit function

The discard implementation follows the same pattern as submit with proper memory ordering and state validation.

198-201: ⚠️ Potential issue

Strengthen memory ordering for producer offset updates

The current use of WriteNoFence for both record size and producer offset could lead to memory ordering issues. Consumers might see the updated producer offset before the record is fully initialized.

Apply this fix:

-    WriteNoFence(&record->size, (long)length | EBPF_RING_BUFFER_RECORD_FLAG_LOCKED);
+    WriteRelease(&record->size, (long)length | EBPF_RING_BUFFER_RECORD_FLAG_LOCKED);
     *data = record->data;

-    WriteNoFence64(&ring->producer_offset, producer_offset + effective_length);
+    WriteRelease64(&ring->producer_offset, producer_offset + effective_length);

Likely invalid or redundant comment.

99-149: ⚠️ Potential issue

Critical issues still present in return function

While the IRQL handling and spin lock usage is correct, there are two critical issues that need to be addressed:

The remaining space check at line 117 has incorrect logic
The function always returns EBPF_SUCCESS at line 149, ignoring the actual result

Apply these fixes:

-    if (remaining_space > effective_length) {
+    if (remaining_space < effective_length) {
         result = EBPF_INVALID_ARGUMENT;
         goto Exit;
     }

     // ... at end of function ...
-    EBPF_RETURN_RESULT(EBPF_SUCCESS);
+    EBPF_RETURN_RESULT(result);

Likely invalid or redundant comment.

Line range hint 11-245: Verify performance improvement with the new locking scheme

While the changes theoretically reduce lock hold time, it would be valuable to verify the actual performance improvement.

Consider adding performance benchmarks that measure:

Producer/consumer throughput under concurrent load
Lock contention metrics
Average lock hold time

libs/runtime/unit/platform_unit_test.cpp (2)

1060-1060: LGTM! The ring buffer position and record size checks are accurate.

The test correctly verifies:

8-byte alignment of producer position
Record size field correctness
Consumer-producer position synchronization after record consumption

Also applies to: 1065-1065, 1067-1067, 1073-1074

1137-1137: LGTM! Good boundary testing for ring buffer reservation.

The test correctly verifies that attempting to reserve more memory than the ring buffer size returns EBPF_NO_MEMORY.

libs/api/ebpf_api.cpp (3)

4269-4272: Busy-wait loop can cause high CPU utilization

4275-4283: LGTM! Well-structured error handling and callback processing

The code properly checks for discarded records and respects the callback's return value for early termination.

4284-4284: LGTM! Correct ring buffer offset advancement

The consumer offset is properly incremented by the record size.

Signed-off-by: Alan Jowett <[email protected]>

coderabbitai bot reviewed Nov 23, 2024

View reviewed changes

Alan-Jowett force-pushed the rb_perf branch from d1c0713 to a5fe86d Compare November 23, 2024 22:56

coderabbitai bot reviewed Nov 23, 2024

View reviewed changes

Reduce lock hold time in the ringbuffer implementation

2c1d84d

Signed-off-by: Alan Jowett <[email protected]>

Alan-Jowett force-pushed the rb_perf branch from a5fe86d to 2c1d84d Compare November 23, 2024 23:05

coderabbitai bot reviewed Nov 23, 2024

View reviewed changes

Update usersim

d54a129

Signed-off-by: Alan Jowett <[email protected]>

Alan-Jowett closed this Nov 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reduce lock hold time in the ringbuffer implementation #835

Reduce lock hold time in the ringbuffer implementation #835

Alan-Jowett commented Nov 23, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Nov 23, 2024 •

edited

Loading

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

coderabbitai bot left a comment

coderabbitai bot Nov 23, 2024

coderabbitai bot Nov 23, 2024

coderabbitai bot Nov 23, 2024

coderabbitai bot Nov 23, 2024

coderabbitai bot Nov 23, 2024

coderabbitai bot Nov 23, 2024

coderabbitai bot left a comment

coderabbitai bot Nov 23, 2024

coderabbitai bot Nov 23, 2024

coderabbitai bot left a comment

		*consumer = (size_t)ReadAcquire64(&ring->consumer_offset);
		*producer = (size_t)ReadAcquire64(&ring->producer_offset);

Reduce lock hold time in the ringbuffer implementation #835

Reduce lock hold time in the ringbuffer implementation #835

Conversation

Alan-Jowett commented Nov 23, 2024 • edited by coderabbitai bot Loading

Description

Testing

Documentation

Installation

Summary by CodeRabbit

Release Notes

coderabbitai bot commented Nov 23, 2024 • edited Loading

Walkthrough

Changes

Poem

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Nov 23, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 23, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 23, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 23, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 23, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 23, 2024

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Nov 23, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 23, 2024

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

Alan-Jowett commented Nov 23, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Nov 23, 2024 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)