Reduce lock hold time in the ringbuffer implementation #835
Conversation
Walkthrough

The changes in this pull request enhance the eBPF API's handling of asynchronous operations and ring buffer management. Key modifications include improved concurrency control through the introduction of spin locks, refined error handling for asynchronous IOCTL operations, and a new inline function for dynamic data writing to the ring buffer. Additionally, the testing framework for the ring buffer has been expanded with new test cases, including stress tests for concurrent operations. Overall, these updates aim to improve the robustness and performance of the eBPF API.
Actionable comments posted: 6
🧹 Outside diff range and nitpick comments (6)
libs/shared/ebpf_ring_buffer_record.h (3)

Lines 8-11: Consider using size_t for flag definitions

The flag values are cast to `long`, but since they're used with size values, consider using `size_t` instead for better type consistency and platform independence.

```diff
-#define EBPF_RING_BUFFER_RECORD_FLAG_LOCKED (long)(0x1ul << EBPF_RING_BUFFER_RECORD_FLAG_LOCKED_OFFSET)
-#define EBPF_RING_BUFFER_RECORD_FLAG_DISCARDED (long)(0x1ul << EBPF_RING_BUFFER_RECORD_FLAG_DISCARDED_OFFSET)
+#define EBPF_RING_BUFFER_RECORD_FLAG_LOCKED ((size_t)1 << EBPF_RING_BUFFER_RECORD_FLAG_LOCKED_OFFSET)
+#define EBPF_RING_BUFFER_RECORD_FLAG_DISCARDED ((size_t)1 << EBPF_RING_BUFFER_RECORD_FLAG_DISCARDED_OFFSET)
```
Lines 15-17: Improve documentation clarity and verify padding assumption

The comment about padding needs to be more explicit about alignment requirements and the actual padding calculation.

```diff
- long size; ///< Size of the record in bytes. The lower 30 bits are the size, the 31st bit is the locked flag, and
-            ///< the 32nd bit is the discarded flag. Next record starts at this + size + sizeof(size) + padding (to
-            ///< 8).
+ long size; ///< Size of the record in bytes:
+            ///< - Bits 0-29: Record size (max ~1GB)
+            ///< - Bit 30: Discarded flag
+            ///< - Bit 31: Locked flag
+            ///< The next record starts at an 8-byte aligned offset:
+            ///< current_offset + sizeof(size) + size + padding_to_8_bytes
```
Lines 42-59: LGTM! Consider documenting memory ordering requirements

The implementation correctly uses atomic reads and bit manipulation. The functions are well-designed for the lock-free approach, which aligns with the PR's objective of reducing lock hold time.

Consider adding documentation about memory ordering requirements:

```diff
+/**
+ * @brief Check if a record has been marked as discarded.
+ * @param[in] record Pointer to the ring buffer record.
+ * @return true if the record is discarded, false otherwise.
+ * @note Uses relaxed memory ordering (ReadNoFence).
+ */
 inline const bool ebpf_ring_buffer_record_is_discarded(_In_ const ebpf_ring_buffer_record_t* record)
```

libs/runtime/ebpf_ring_buffer.h (1)
Lines 41-43: Document locking behavior in function comments

Consider enhancing the function documentation to describe its locking behavior and any assumptions about concurrent access. This helps users understand the performance characteristics and thread-safety guarantees.

Add the following to the function documentation:

```diff
 /**
  * @brief Write out a variable sized record to the ring buffer.
  *
+ * @details This function minimizes lock hold time by using a two-phase
+ * approach: first reserving space, then copying data. The actual data
+ * copy occurs outside the critical section.
+ *
  * @param[in, out] ring_buffer Ring buffer to write to.
  * @param[in] data Data to copy into record.
  * @param[in] length Length of data to copy.
  * @retval EBPF_SUCCESS Successfully wrote record ring buffer.
  * @retval EBPF_OUT_OF_SPACE Unable to output to ring buffer due to inadequate space.
  */
```

libs/runtime/unit/platform_unit_test.cpp (1)
Lines 1149-1248: LGTM! Comprehensive stress test for concurrent ring buffer operations.

The new test effectively validates the ring buffer's thread safety and data integrity under load by:
- Running multiple producer threads with different data patterns
- Verifying data integrity in the consumer thread
- Using proper synchronization primitives
- Running for a sufficient duration (10 seconds)

A few suggestions to make the test even more robust:
- Consider parameterizing the test duration to allow for longer stress testing when needed
- Add metrics for tracking the number of successful vs failed operations
- Consider adding varying data sizes to test different allocation patterns

```diff
- std::this_thread::sleep_for(std::chrono::seconds(10));
+ const auto test_duration = std::chrono::seconds(10);
+ std::atomic<size_t> failed_operations = 0;
+
+ std::this_thread::sleep_for(test_duration);
+
+ // Report metrics
+ std::cout << "Test ran for " << test_duration.count() << " seconds\n"
+           << "Total records: " << (a_records + b_records) << "\n"
+           << "Failed operations: " << failed_operations << "\n";
```

libs/api/ebpf_api.cpp (1)
Line 4278: Simplify cast to avoid unnecessary const_cast

The expression `const_cast<void*>(reinterpret_cast<const void*>(record->data))` is redundant. You can simplify it to `record->data` since it's already a `void*` pointer.

Apply this diff to simplify the cast:

```diff
- const_cast<void*>(reinterpret_cast<const void*>(record->data)),
+ record->data,
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (6)

- external/usersim (1 hunks)
- libs/api/ebpf_api.cpp (1 hunks)
- libs/runtime/ebpf_ring_buffer.c (4 hunks)
- libs/runtime/ebpf_ring_buffer.h (1 hunks)
- libs/runtime/unit/platform_unit_test.cpp (3 hunks)
- libs/shared/ebpf_ring_buffer_record.h (2 hunks)

✅ Files skipped from review due to trivial changes (1)

- external/usersim
🔇 Additional comments (9)
libs/shared/ebpf_ring_buffer_record.h (1)
Lines 14-19: Verify atomic operations in all consumers

The change from bit-fields to a single atomic field is a good improvement for reducing lock hold time. Let's verify that all consumers properly handle the atomic operations.

✅ Verification successful

Let me gather more information about the usage patterns to ensure completeness of the verification.

All consumers properly handle atomic operations and memory barriers

The verification shows that all accesses to the ring buffer record's size field are properly handled:
- Helper functions in `ebpf_ring_buffer_record.h` consistently use `ReadNoFence` for flag checks and size reads
- Consumer code in `ebpf_api.cpp` uses these helper functions exclusively for accessing the record
- Test code in `platform_unit_test.cpp` correctly uses `ReadAcquire` when needed for synchronization
- No direct, non-atomic accesses to the size field were found in the codebase
🏁 Scripts executed

The following scripts were executed for the analysis:

```shell
#!/bin/bash
# Search for direct accesses to the size field that might not use atomic operations
rg -A 2 "record->size" --type cpp
# Search for potential consumers of the ring buffer record
ast-grep --pattern 'ebpf_ring_buffer_record_t* $var = $_'
```

Length of output: 1490

```shell
#!/bin/bash
# Search for ring buffer record access patterns
rg -A 5 "ebpf_ring_buffer_record" --type cpp
# Look for atomic operations and memory barriers
rg -A 2 "(ReadNoFence|ReadAcquire|WriteNoFence|WriteRelease)" --type cpp
```

Length of output: 7659
libs/runtime/ebpf_ring_buffer.h (1)
Lines 41-43: Verify thread-safety of the new implementation

Since this PR aims to reduce lock hold time, we should verify that the new implementation maintains thread-safety while improving performance.

✅ Verification successful

Let me gather more context about the implementation of `ebpf_ring_buffer_output` to verify its thread-safety.

Based on the gathered information, I can now provide the final response regarding thread-safety:

Thread-safety is properly implemented with appropriate locking mechanisms

The implementation shows a well-designed thread-safe approach:
- Uses separate spin locks for producers and consumers (`producer_lock` and `consumer_lock`)
- The `ebpf_ring_buffer_output` function internally uses `ebpf_ring_buffer_reserve`, which handles proper locking
- Memory operations use `ReadNoFence64` for safe concurrent access
- Lock acquisition and release are properly paired with IRQL management
🏁 Scripts executed

The following scripts were executed for the analysis:

```shell
#!/bin/bash
# Description: Search for lock usage in the ring buffer implementation
# to verify thread-safety.
# Look for lock-related patterns in the implementation
rg -A 5 "cxplat_spin_lock|_lock_|mutex|critical_section" libs/runtime/ebpf_ring_buffer.c
# Look for concurrent access patterns
ast-grep --pattern 'ebpf_ring_buffer_output($$$)'
```

Length of output: 1817

```shell
#!/bin/bash
# Search for the implementation of ebpf_ring_buffer_output
ast-grep --pattern 'ebpf_ring_buffer_output($$$) {
  $$$
}'
# Look for related reserve/commit functions that might be used
rg -A 10 "ebpf_ring_buffer_(reserve|commit)" libs/runtime/ebpf_ring_buffer.c
```

Length of output: 910
libs/runtime/unit/platform_unit_test.cpp (2)

Line 1060: LGTM! The changes improve atomic operations safety.

The modifications properly use EBPF_PAD_8 for size calculations and switch to direct size access instead of going through the header structure. This aligns with the PR objective of reducing lock hold time by:
- Ensuring proper 8-byte alignment for atomic operations
- Simplifying the record structure to reduce lock contention

Also applies to: 1065, 1067, 1073-1074

Line 1137: LGTM! More accurate error code.

The change from EBPF_INVALID_ARGUMENT to EBPF_NO_MEMORY is more semantically accurate for the case of attempting to reserve more memory than available in the ring buffer.
libs/runtime/ebpf_ring_buffer.c (5)

Line 39: Initialize the `mask` field correctly

Setting `local_ring_buffer->mask = capacity - 1;` correctly initializes the `mask` field for efficient indexing in the ring buffer when the capacity is a power of two. This is a good practice for ring buffers that use bitwise operations.

Lines 75-84: Refactored `ebpf_ring_buffer_output` enhances performance and clarity

The updated implementation of `ebpf_ring_buffer_output` utilizes `ebpf_ring_buffer_reserve` and `ebpf_ring_buffer_submit`, removing the need for explicit locking within this function. This refactoring simplifies the code and can improve performance by leveraging the underlying reservation and submission mechanisms.

Lines 218-228: Ensure proper flag handling in `ebpf_ring_buffer_submit`

The function correctly verifies that the `EBPF_RING_BUFFER_RECORD_FLAG_LOCKED` flag is set before submitting the record and then clears the flag. This ensures that only properly reserved records are submitted.

Lines 234-244: Correctly handle record discard in `ebpf_ring_buffer_discard`

The implementation properly sets the `EBPF_RING_BUFFER_RECORD_FLAG_DISCARDED` flag when a record is discarded, ensuring that the consumer can identify and skip over discarded records.
Lines 168-211: Verify that `producer_offset` and `consumer_offset` maintain correct order

In the calculation of `remaining_space`, if `producer_offset` is less than `consumer_offset`, it could result in incorrect remaining space due to underflow. Please verify that `producer_offset` is always greater than or equal to `consumer_offset` to prevent negative `remaining_space` values.

Run the following script to check for any instances where `producer_offset` might be less than `consumer_offset`:

✅ Verification successful

Based on the gathered information, I can now provide a final response:

The ring buffer implementation correctly maintains producer and consumer offset ordering

The implementation is safe from underflow in the `remaining_space` calculation for these reasons:
- The ring buffer structure shows both offsets are initialized to 0 (implied by C semantics)
- Producer offset only increases by `effective_length` during writes
- Consumer offset only increases by `size` during reads
- The code has proper space validation: `remaining_space = ring->length - (producer_offset - consumer_offset)`
- Writes are protected by `producer_lock` and reads by `consumer_lock`
- The implementation uses proper memory barriers via `ReadNoFence64` and `WriteNoFence64`
🏁 Scripts executed

The following scripts were executed for the analysis:

```shell
#!/bin/bash
# Description: Verify that producer_offset is always greater than or equal to consumer_offset.
# Search for code paths where producer_offset might be less than consumer_offset.
rg -A 5 -B 5 'producer_offset.*<.*consumer_offset' libs/runtime/
```

Length of output: 66

```shell
#!/bin/bash
# Let's analyze the ring buffer implementation more thoroughly
# 1. Find all places where producer_offset is modified
rg -A 5 "WriteNoFence64.*producer_offset" libs/runtime/
# 2. Find all places where consumer_offset is modified
rg -A 5 "WriteNoFence64.*consumer_offset" libs/runtime/
# 3. Look for ring buffer initialization to understand initial state
ast-grep --pattern 'ebpf_ring_buffer_create'
# 4. Check for any arithmetic operations involving these offsets
rg -A 3 "(producer_offset|consumer_offset).*[+\-*/]" libs/runtime/
```

Length of output: 4660

```shell
#!/bin/bash
# Let's examine the ring buffer initialization and structure
# 1. Look for ring buffer structure definition
rg -A 10 "struct.*ebpf_ring_buffer" libs/runtime/
# 2. Check the initialization code
rg -A 15 "ebpf_ring_buffer_create.*{" libs/runtime/
# 3. Look for any reset or clear operations
rg -A 5 "ring.*init|ring.*reset|ring.*clear" libs/runtime/
```

Length of output: 2611
```c
    int64_t length;
    int64_t mask;
    cxplat_spin_lock_t producer_lock;
    cxplat_spin_lock_t consumer_lock;
    volatile int64_t consumer_offset;
    volatile int64_t producer_offset;
```
Ensure proper use of signed integers for buffer sizes and offsets

The change from `size_t` to `int64_t` for `length`, `consumer_offset`, and `producer_offset` introduces signed integers for values that should always be non-negative. Using signed integers may lead to underflow or incorrect calculations if negative values are inadvertently assigned. Consider using `uint64_t` instead to represent these values and prevent potential issues with negative numbers.

Apply this diff to change the types to unsigned integers:

```diff
- int64_t length;
- int64_t mask;
- volatile int64_t consumer_offset;
- volatile int64_t producer_offset;
+ uint64_t length;
+ uint64_t mask;
+ volatile uint64_t consumer_offset;
+ volatile uint64_t producer_offset;
```
```diff
 if (remaining_space > effective_length) {
     result = EBPF_INVALID_ARGUMENT;
-    goto Done;
+    goto Exit;
```
Fix the logic error in `remaining_space` validation

The current condition may lead to accepting invalid states. The logic should ensure that there is sufficient data to process:

```diff
- if (remaining_space > effective_length) {
+ if (remaining_space < effective_length) {
      result = EBPF_INVALID_ARGUMENT;
      goto Exit;
  }
```
```c
    KeLowerIrql(old_irql);
}

EBPF_RETURN_RESULT(EBPF_SUCCESS);
```
Return the correct result after processing

Ensure that the function returns the appropriate result code by changing:

```diff
- EBPF_RETURN_RESULT(EBPF_SUCCESS);
+ EBPF_RETURN_RESULT(result);
```

Committable suggestion skipped: line range outside the PR's diff.
```c
*consumer = (size_t)ReadAcquire64(&ring->consumer_offset);
*producer = (size_t)ReadAcquire64(&ring->producer_offset);
```
Avoid casting `int64_t` to `size_t` to prevent potential data loss

Casting `int64_t` to `size_t` can lead to unexpected behavior if the `int64_t` values are negative, as `size_t` is an unsigned type. To ensure data integrity, consider changing the types of `consumer_offset` and `producer_offset` to `uint64_t`, or add checks to confirm that the values are non-negative before casting.

Apply this diff to change the cast and variable types:

```diff
- *consumer = (size_t)ReadAcquire64(&ring->consumer_offset);
- *producer = (size_t)ReadAcquire64(&ring->producer_offset);
+ *consumer = (uint64_t)ReadAcquire64(&ring->consumer_offset);
+ *producer = (uint64_t)ReadAcquire64(&ring->producer_offset);
```

And ensure the offsets are of type `uint64_t` as suggested earlier.

Committable suggestion skipped: line range outside the PR's diff.
```c
    KIRQL old_irql = KeGetCurrentIrql();
    if (old_irql < DISPATCH_LEVEL) {
        KeRaiseIrqlToDpcLevel();
    }

    // Verify count.
    while (local_length != 0) {
        ebpf_ring_buffer_record_t* record = _ring_record_at_offset(ring, offset);
        if (local_length < record->header.length) {
            break;
        }
        offset += record->header.length;
        local_length -= record->header.length;
    cxplat_acquire_spin_lock_at_dpc_level(&ring->consumer_lock);

    int64_t consumer_offset = ReadNoFence64(&ring->consumer_offset);
    int64_t producer_offset = ReadNoFence64(&ring->producer_offset);
    int64_t effective_length = EBPF_PAD_8(length + EBPF_OFFSET_OF(ebpf_ring_buffer_record_t, data));

    if (consumer_offset == producer_offset) {
        result = EBPF_INVALID_ARGUMENT;
        goto Exit;
    }
    // Did it end on a record boundary?
    if (local_length != 0) {
        EBPF_LOG_MESSAGE_UINT64(
            EBPF_TRACELOG_LEVEL_ERROR,
            EBPF_TRACELOG_KEYWORD_MAP,
            "ebpf_ring_buffer_return: Invalid buffer length",
            local_length);

    int64_t remaining_space = producer_offset - consumer_offset;

    if (remaining_space > effective_length) {
        result = EBPF_INVALID_ARGUMENT;
        goto Done;
        goto Exit;
    }

    _ring_advance_consumer_offset(ring, length);
    remaining_space = effective_length;

    while (remaining_space > 0) {
        ebpf_ring_buffer_record_t* record =
            (ebpf_ring_buffer_record_t*)(ring->shared_buffer + (consumer_offset & ring->mask));

        long size = ReadNoFence(&record->size);
        size += EBPF_OFFSET_OF(ebpf_ring_buffer_record_t, data);
        size = EBPF_PAD_8(size);

        consumer_offset += size;
        remaining_space -= size;

        record->size = 0;
    }

    WriteNoFence64(&ring->consumer_offset, consumer_offset);

    result = EBPF_SUCCESS;

Done:
    ebpf_lock_unlock(&ring->lock, state);
    EBPF_RETURN_RESULT(result);
Exit:
    cxplat_release_spin_lock_from_dpc_level(&ring->consumer_lock);

    if (old_irql < DISPATCH_LEVEL) {
        KeLowerIrql(old_irql);
    }

    EBPF_RETURN_RESULT(EBPF_SUCCESS);
```
Correct logical conditions and ensure proper error handling in `ebpf_ring_buffer_return`

1. Incorrect condition in remaining space check:

   The condition `if (remaining_space > effective_length)` is likely incorrect. If `remaining_space` is greater than `effective_length`, there should be enough data to proceed. The condition should be:

   ```diff
   - if (remaining_space > effective_length) {
   + if (remaining_space < effective_length) {
         result = EBPF_INVALID_ARGUMENT;
         goto Exit;
     }
   ```

2. Incorrect return value ignores error conditions:

   At the end of the function, `EBPF_RETURN_RESULT(EBPF_SUCCESS);` is called regardless of any errors that may have occurred. This can mask errors and cause unexpected behavior. Update the return statement to use the `result` variable:

   ```diff
   - EBPF_RETURN_RESULT(EBPF_SUCCESS);
   + EBPF_RETURN_RESULT(result);
   ```
```cpp
while (ebpf_ring_buffer_record_is_locked(record)) {
    // The record is being written to by the producer.
    // Wait for the producer to finish writing.
    YieldProcessor();
```
🛠️ Refactor suggestion

Potential performance impact due to busy-wait loop

The spin-wait loop using `YieldProcessor()` may cause high CPU utilization if the producer holds the lock for an extended period. Consider using a more efficient synchronization mechanism, such as an event or condition variable, to avoid busy-waiting and reduce CPU consumption during the wait.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (1)

libs/runtime/unit/platform_unit_test.cpp (1)

Lines 1149-1248: LGTM! Comprehensive multi-threaded stress test for the ring buffer.

The test effectively validates:
- Concurrent producer-consumer operations
- Data integrity under load
- Proper record size handling
- Memory barriers and synchronization

Consider these improvements:
- Make the test duration configurable via a parameter
- Enhance error reporting by capturing which thread encountered the bad record and the actual vs expected values
- Add statistics about the distribution of 'a' and 'b' records to verify fair scheduling

Example enhancement for error reporting:

```diff
- if (memcmp(record->data, data1.data(), data1.size()) != 0) {
-     bad_record = true;
- }
+ if (memcmp(record->data, data1.data(), data1.size()) != 0) {
+     bad_record = true;
+     printf("Data corruption in record type 'a': size=%d\n", actual_size);
+     printf("Expected: %.*s\n", (int)data1.size(), (char*)data1.data());
+     printf("Actual:   %.*s\n", (int)data1.size(), (char*)record->data);
+ }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (8)

- external/ebpf-verifier (1 hunks)
- external/ubpf (1 hunks)
- external/usersim (1 hunks)
- libs/api/ebpf_api.cpp (1 hunks)
- libs/runtime/ebpf_ring_buffer.c (4 hunks)
- libs/runtime/ebpf_ring_buffer.h (1 hunks)
- libs/runtime/unit/platform_unit_test.cpp (3 hunks)
- libs/shared/ebpf_ring_buffer_record.h (2 hunks)

✅ Files skipped from review due to trivial changes (2)

- external/ebpf-verifier
- external/ubpf

🚧 Files skipped from review as they are similar to previous changes (3)

- external/usersim
- libs/runtime/ebpf_ring_buffer.h
- libs/shared/ebpf_ring_buffer_record.h
🔇 Additional comments (10)
libs/runtime/ebpf_ring_buffer.c (5)

Lines 13-14: Good improvement: split locks for better concurrency

The separation of producer and consumer locks is a good improvement that can reduce contention by allowing concurrent producer and consumer operations.

Lines 75-84: Well-structured implementation with minimal lock contention

The implementation correctly uses the reserve/submit pattern, which minimizes the lock hold time by separating the buffer allocation from the data copy operation.

Lines 169-176: Good optimization: pre-checking before lock acquisition

The implementation efficiently checks for available space before acquiring the lock, which helps avoid unnecessary lock acquisitions when the buffer is full.

Lines 218-228: LGTM: Proper synchronization in submit operation

The implementation correctly uses memory barriers and proper flag handling for synchronization between producers and consumers.

Lines 234-245: LGTM: Proper synchronization in discard operation

The implementation correctly handles the record state transition with appropriate memory barriers.
libs/runtime/unit/platform_unit_test.cpp (2)

Line 1060: LGTM! The ring buffer alignment and position tracking look correct.

The test properly verifies:
- Producer position advances by the padded record size
- Record size is correctly read
- Consumer position matches producer after consuming the record

Also applies to: 1065, 1067, 1073-1074

Line 1137: LGTM! Good negative test case for memory allocation.

The test properly verifies that attempting to reserve more memory than the ring buffer size returns EBPF_NO_MEMORY.
libs/api/ebpf_api.cpp (3)

Lines 4269-4272: Busy-wait loop detected

Lines 4275-4283: LGTM!

The ring buffer record processing logic is correct and safe:
- Properly skips discarded records
- Safely invokes user callback after record is unlocked
- Allows user to stop processing by returning non-zero

Line 4284: LGTM!

The consumer offset update correctly advances by the full record size.
```c
WriteNoFence(&record->size, (long)length | EBPF_RING_BUFFER_RECORD_FLAG_LOCKED);
*data = record->data;

WriteNoFence64(&ring->producer_offset, producer_offset + effective_length);
```
Ensure proper memory ordering between record initialization and offset update

There's a potential race condition where consumers might see the updated producer offset before the record is fully initialized. Consider using appropriate memory barriers or Release semantics for the record size write.

```diff
- WriteNoFence(&record->size, (long)length | EBPF_RING_BUFFER_RECORD_FLAG_LOCKED);
+ WriteRelease(&record->size, (long)length | EBPF_RING_BUFFER_RECORD_FLAG_LOCKED);
  *data = record->data;
- WriteNoFence64(&ring->producer_offset, producer_offset + effective_length);
+ WriteRelease64(&ring->producer_offset, producer_offset + effective_length);
```
```diff
+    cxplat_acquire_spin_lock_at_dpc_level(&ring->consumer_lock);
+
+    int64_t consumer_offset = ReadNoFence64(&ring->consumer_offset);
+    int64_t producer_offset = ReadNoFence64(&ring->producer_offset);
+    int64_t effective_length = EBPF_PAD_8(length + EBPF_OFFSET_OF(ebpf_ring_buffer_record_t, data));
+
+    if (consumer_offset == producer_offset) {
+        result = EBPF_INVALID_ARGUMENT;
+        goto Exit;
+    }
-    // Did it end on a record boundary?
-    if (local_length != 0) {
-        EBPF_LOG_MESSAGE_UINT64(
-            EBPF_TRACELOG_LEVEL_ERROR,
-            EBPF_TRACELOG_KEYWORD_MAP,
-            "ebpf_ring_buffer_return: Invalid buffer length",
-            local_length);
+    int64_t remaining_space = producer_offset - consumer_offset;
+
+    if (remaining_space > effective_length) {
         result = EBPF_INVALID_ARGUMENT;
-        goto Done;
+        goto Exit;
     }
-    _ring_advance_consumer_offset(ring, length);
+    remaining_space = effective_length;
+
+    while (remaining_space > 0) {
+        ebpf_ring_buffer_record_t* record =
+            (ebpf_ring_buffer_record_t*)(ring->shared_buffer + (consumer_offset & ring->mask));
+
+        long size = ReadNoFence(&record->size);
+        size += EBPF_OFFSET_OF(ebpf_ring_buffer_record_t, data);
+        size = EBPF_PAD_8(size);
+
+        consumer_offset += size;
+        remaining_space -= size;
+
+        record->size = 0;
+    }
+
+    WriteNoFence64(&ring->consumer_offset, consumer_offset);
```
🛠️ Refactor suggestion
Consider reducing lock hold time in the processing loop
The consumer lock is held during the entire record processing loop. For large numbers of records, this could lead to increased contention. Consider processing records in smaller batches or implementing a lock-free record processing mechanism.
```diff
     cxplat_acquire_spin_lock_at_dpc_level(&ring->consumer_lock);
+    // Read values under lock
+    int64_t initial_consumer_offset = ReadNoFence64(&ring->consumer_offset);
+    int64_t final_consumer_offset = initial_consumer_offset;
+    cxplat_release_spin_lock_from_dpc_level(&ring->consumer_lock);
+
+    // Process records without holding the lock
+    while (remaining_space > 0) {
+        ebpf_ring_buffer_record_t* record =
+            (ebpf_ring_buffer_record_t*)(ring->shared_buffer + (final_consumer_offset & ring->mask));
+
+        long size = ReadNoFence(&record->size);
+        size += EBPF_OFFSET_OF(ebpf_ring_buffer_record_t, data);
+        size = EBPF_PAD_8(size);
+
+        final_consumer_offset += size;
+        remaining_space -= size;
+
+        record->size = 0;
+    }
+
+    // Re-acquire lock only for the final update
+    cxplat_acquire_spin_lock_at_dpc_level(&ring->consumer_lock);
+    WriteNoFence64(&ring->consumer_offset, final_consumer_offset);
-    // ... existing implementation ...
```
Committable suggestion skipped: line range outside the PR's diff.
Signed-off-by: Alan Jowett <[email protected]>
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
libs/runtime/ebpf_ring_buffer.c (1)
13-14: Good design choice with separate producer and consumer locks!
The separation of producer and consumer locks is an excellent improvement that can reduce lock contention by allowing concurrent producer and consumer operations. This aligns well with the PR's objective of reducing lock hold time.
Consider documenting the locking order to prevent potential deadlocks if any operation needs to acquire both locks.
libs/runtime/unit/platform_unit_test.cpp (1)
1149-1248: LGTM! Comprehensive stress test for concurrent ring buffer operations. The test effectively verifies:
- Data integrity under concurrent access
- Lock handling
- Producer backpressure
- Consumer processing
Consider enhancing the test with additional verifications.
Consider adding:
- Memory leak detection by tracking total allocated vs freed memory
- Producer fairness verification by ensuring both record types have similar counts
```diff
 TEST_CASE("ring_buffer_stress", "[platform]") {
     // ... existing setup ...
+    std::atomic<size_t> total_allocated{0};
+    std::atomic<size_t> total_freed{0};
     // ... in consumer ...
     if (ebpf_ring_buffer_return(ring_buffer, actual_size) != EBPF_SUCCESS) {
         bad_record = true;
         break;
     }
+    total_freed += actual_size;
     // ... after stop ...
     REQUIRE(!bad_record);
     REQUIRE((a_records + b_records) > 0);
+    // Verify no memory leaks
+    REQUIRE(total_freed == total_allocated);
+    // Verify producer fairness (allow 20% difference)
+    size_t diff = (a_records > b_records) ?
+        (a_records - b_records) :
+        (b_records - a_records);
+    REQUIRE(diff <= (a_records + b_records) * 0.2);
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (5)
- libs/api/ebpf_api.cpp (1 hunks)
- libs/runtime/ebpf_ring_buffer.c (4 hunks)
- libs/runtime/ebpf_ring_buffer.h (1 hunks)
- libs/runtime/unit/platform_unit_test.cpp (3 hunks)
- libs/shared/ebpf_ring_buffer_record.h (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- libs/runtime/ebpf_ring_buffer.h
- libs/shared/ebpf_ring_buffer_record.h
🔇 Additional comments (12)
libs/runtime/ebpf_ring_buffer.c (7)
75-84: Excellent refactoring to reduce lock hold time!
The new implementation properly separates buffer reservation from data copying, ensuring the lock is held only during the critical section in reserve. This is a significant performance improvement as the memory copy operation now occurs outside the lock.
169-176: Great optimization with pre-lock availability check!
The initial check for available space before acquiring the lock is an excellent optimization. It prevents unnecessary lock acquisition when the buffer is full.
218-228: LGTM! Proper memory ordering and state validation
The implementation correctly uses memory barriers and validates the record state before submission.
234-245: LGTM! Consistent implementation with submit function
The discard implementation follows the same pattern as submit with proper memory ordering and state validation.
198-201: Strengthen memory ordering for producer offset updates
The current use of WriteNoFence for both record size and producer offset could lead to memory ordering issues. Consumers might see the updated producer offset before the record is fully initialized.
Apply this fix:
```diff
- WriteNoFence(&record->size, (long)length | EBPF_RING_BUFFER_RECORD_FLAG_LOCKED);
+ WriteRelease(&record->size, (long)length | EBPF_RING_BUFFER_RECORD_FLAG_LOCKED);
  *data = record->data;
- WriteNoFence64(&ring->producer_offset, producer_offset + effective_length);
+ WriteRelease64(&ring->producer_offset, producer_offset + effective_length);
```
Likely invalid or redundant comment.
99-149: Critical issues still present in return function
While the IRQL handling and spin lock usage are correct, there are two critical issues that need to be addressed:
- The remaining space check at line 117 has incorrect logic
- The function always returns EBPF_SUCCESS at line 149, ignoring the actual result
Apply these fixes:
```diff
-    if (remaining_space > effective_length) {
+    if (remaining_space < effective_length) {
         result = EBPF_INVALID_ARGUMENT;
         goto Exit;
     }
     // ... at end of function ...
-    EBPF_RETURN_RESULT(EBPF_SUCCESS);
+    EBPF_RETURN_RESULT(result);
```
Likely invalid or redundant comment.
Line range hint 11-245: Verify performance improvement with the new locking scheme
While the changes theoretically reduce lock hold time, it would be valuable to verify the actual performance improvement.
Consider adding performance benchmarks that measure:
- Producer/consumer throughput under concurrent load
- Lock contention metrics
- Average lock hold time
libs/runtime/unit/platform_unit_test.cpp (2)
1060-1060: LGTM! The ring buffer position and record size checks are accurate. The test correctly verifies:
The test correctly verifies:
- 8-byte alignment of producer position
- Record size field correctness
- Consumer-producer position synchronization after record consumption
Also applies to: 1065-1065, 1067-1067, 1073-1074
1137-1137: LGTM! Good boundary testing for ring buffer reservation.
The test correctly verifies that attempting to reserve more memory than the ring buffer size returns EBPF_NO_MEMORY.
libs/api/ebpf_api.cpp (3)
4269-4272: Busy-wait loop can cause high CPU utilization
4275-4283: LGTM! Well-structured error handling and callback processing
The code properly checks for discarded records and respects the callback's return value for early termination.
4284-4284: LGTM! Correct ring buffer offset advancement
The consumer offset is properly incremented by the record size.
Signed-off-by: Alan Jowett <[email protected]>
Description
Describe the purpose of and changes within this Pull Request.
Testing
Do any existing tests cover this change? Are new tests needed?
Documentation
Is there any documentation impact for this change?
Installation
Is there any installer impact for this change?
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Tests