Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8331865: Consolidate size and alignment checks in LayoutPath #19251

Closed

Conversation

mcimadamore
Copy link
Contributor

@mcimadamore mcimadamore commented May 15, 2024

When creating a nested memory access var handle, we ensure that the segment is accessed at the correct alignment for the root layout being accessed. But we do not ensure that the segment has at least the size of the accessed root layout. Example:

MemoryLayout LAYOUT = sequenceLayout(2, structLayout(JAVA_INT.withName("x"), JAVA_INT.withName("y")));
VarHandle X_HANDLE = LAYOUT.varHandle(PathElement.sequenceElement(), PathElement.groupElement("x"));

If I have a memory segment whose size is 12, I can successfully call X_HANDLE on it with index 1, like so:

MemorySegment segment = Arena.ofAuto().allocate(12);
int x = (int)X_HANDLE.get(segment, 0, 1);

This seems incorrect: after all, the provided segment doesn't have enough space to fit two elements of the nested structs.

This PR adds checks to detect this kind of scenario earlier, thus improving usability. To achieve this we could, in principle, just tweak LayoutPath::checkEnclosingLayout and add the required size check.

But doing so will add yet another redundant check on top of the other already added by pull/19124. Instead, this PR rethinks how memory segment var handles are created, in order to eliminate redundant checks.

The main observation is that size and alignment checks depend on an enclosing layout. This layout might be the very same value layout being accessed (this is the case when e.g. using JAVA_INT.varHandle()). But, in the general case, the accessed value layout and the enclosing layout might differ (e.g. think about accessing a struct field).

Furthermore, the enclosing layout check depends on the base offset at which the segment is accessed, prior to any index computation that occurs if the accessed layout path has any open elements. It is important to notice that this base offset is only available when looking at the var handle that is returned to the user. For instance, an indexed var handle with coordinates:

(MemorySegment, long /* base */, long /* index 1 */, long /* index 2 */, long /* index 3 */)

Is obtained through adaptation, by taking a more basic var handle of the form:

(MemorySegment, long /* offset */)

And then injecting the result of the index multiplication into offset. As such, we can't add an enclosing layout check inside the var handle returned by VarHandles::memorySegmentViewHandle, as doing so will end up seeing the wrong offset (e.g. an offset in which the index part has already been mixed in).

The only possibility then, is to remove size and alignment checks from the raw var handles returned by VarHandles::memorySegmentViewHandle, and perform such checks outside (e.g. in LayoutPath::dereferenceHandle). The only checks left in the raw var handles are:

  • an alignment check to make sure that the access mode selected by the client is really available - this alignment check is against the alignment of the value layout being selected, and not the enclosing layout alignment (which might be stricter)
  • a read-only check, to make sure that write access modes are blocked if the accessed segment is read-only
  • liveness/confinement checks, as mandated by ScopedMemoryAccess

Since these check depends on the particular access mode selected by the client, we can't move these checks away from the raw var handle.

These changes come with some consequences:

  • Now it is always necessary to adapt a raw var handle, and to insert appropriate size and alignment checks (otherwise OOB access might be possible). As such, ValueLayouts also need to call the path layout variant of MemoryLayout::varHandle, to make sure that the raw var handle is adapted accordingly, before it is cached in its stable field.
  • The var handle cache has been moved from Utils to ValueLayouts::varHandle. The cache is used (a) to reduce the number of var handle instances created and (b) to make sure that the cached var handle in the ValueLayout has stable identity. With respect to (a), while it makes sense to cache "toplevel" var handles (e.g. JAVA_INT.varHandle()) the cost/benefit ratio for caching nested var handles seem more unfavourable, as the same var handle can be reused with different enclosing layouts, leading to different checks. Ultimately, all nested var handles will require some kind of adaptation, so it doesn't seem too crucial to have a deeper level of caching here.
  • The order in which exceptions are thrown might be slightly different. This is because the size/alignment checks now take place before the raw var handle is even called. This caused a bunch of small updates in code and tests.
  • It used to be possible to create a sequence layout with maximal size, like MemoryLayout.sequenceLayout(Long.MAX_VALUE, JAVA_LONG), and derive an indexed var handle from that layout. Since there used to be no enclosing layout size check, access to the sequence element was allowed, as long as the index computation did not result in an offset outside the boundary of the accessed memory segment. This is now no longer the case: when selecting an element from the above layout, the implementation will make sure that the accessed segment indeed has the size of that sequence layout (which will probably lead to a IndexOutOfBoundException). To do indexed access on an unbounded sequence, the MemoryLayout::arrayElementVarHandle should be used instead (but this is the norm anyway for such cases).

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Change requires CSR request JDK-8332685 to be approved
  • Commit message must refer to an issue

Issues

  • JDK-8331865: Consolidate size and alignment checks in LayoutPath (Enhancement - P3)
  • JDK-8332685: Consolidate size and alignment checks in LayoutPath (CSR)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/19251/head:pull/19251
$ git checkout pull/19251

Update a local copy of the PR:
$ git checkout pull/19251
$ git pull https://git.openjdk.org/jdk.git pull/19251/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 19251

View PR using the GUI difftool:
$ git pr show -t 19251

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/19251.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented May 15, 2024

👋 Welcome back mcimadamore! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented May 15, 2024

@mcimadamore This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8331865: Consolidate size and alignment checks in LayoutPath

Reviewed-by: psandoz, jvernee

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 162 new commits pushed to the master branch:

  • da6aa2a: 8332849: Update doc/testing.{md,html} (spelling and stale information)
  • b8f2ec9: 8195675: Call to insertText with single character from custom Input Method ignored
  • 0f3e2cc: 8331670: Deprecate the Memory-Access Methods in sun.misc.Unsafe for Removal
  • 51ae08f: 8333093: Incorrect comment in zAddress_aarch64.cpp
  • 4754f05: 8333035: Parallel: Remove ParMarkBitMap::IterationStatus
  • 87a06b6: 8325805: Compiler Implementation for Flexible Constructor Bodies (Second Preview)
  • e708d13: 8332064: Implementation of Structured Concurrency (Third Preview)
  • 7b52d0a: 8332265: RISC-V: Materialize pointers faster by using a temp register
  • aa4c83a: 8332505: JEP 457: ClassRemapper forgets to remap bootstrap method references
  • cabe337: 8331921: Hotspot assembler files should use common logic to setup exported functions
  • ... and 152 more: https://git.openjdk.org/jdk/compare/61aff6db15d5bdda77427af5ce34d0fe43373197...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk
Copy link

openjdk bot commented May 15, 2024

@mcimadamore
The core-libs label was successfully added.

@mcimadamore
Copy link
Contributor Author

/csr

@openjdk openjdk bot added the csr Pull request needs approved CSR before integration label May 16, 2024
@openjdk
Copy link

openjdk bot commented May 16, 2024

@mcimadamore has indicated that a compatibility and specification (CSR) request is needed for this pull request.

@mcimadamore please create a CSR request for issue JDK-8331865 with the correct fix version. This pull request cannot be integrated until the CSR request is approved.

@@ -823,6 +828,7 @@ public MemorySegment get(AddressLayout layout, long offset) {
@ForceInline
@Override
public void set(AddressLayout layout, long offset, MemorySegment value) {
Objects.requireNonNull(value);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been added, otherwise it would be possible to pass a null as the value of value, and not get an NPE, in case e.g. the alignment of the segment is incorrect (because we now check that before we even try to perform the memory access).

assertTrue(compatible ||
(layout instanceof GroupLayout && segment.maxByteAlignment() < layout.byteAlignment()));
// access is unaligned
assertTrue(segment.maxByteAlignment() < layout.byteAlignment());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this change is required because, before this PR, we used to issue UOE for a bad access mode regardless of the alignment with which we accessed the segment (well, only for toplevel var handles). Now we uniformly check alignment before access mode, for both toplevel and nested var handles, so this assertion needed to be tweaked.

@@ -97,9 +97,9 @@ final class VarHandleSegmentAs$Type$s extends VarHandleSegmentViewBase {
#end[floatingPoint]

@ForceInline
static AbstractMemorySegmentImpl checkAddress(Object obb, long offset, long length, boolean ro) {
static AbstractMemorySegmentImpl checkReadOnly(Object obb, boolean ro) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and the following methods are the bulk of the changes in this template. That is, we no longer check size and alignment of the accessed segment. Every other change in this template is needed to get rid of fields and parameters that are no longer used.

Copy link
Contributor Author

@mcimadamore mcimadamore May 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: we don't even check for overflows, e.g. offset < 0. The reasoning here is that all layouts check for overflow on construction, so it's never possible to construct a layout whose size is greater than Long.MAX_VALUE. This constraint also affects computation for indexed var handles, since all the long indices corresponding to open path elements are checked against their bounds (and their bounds must be small enough so that the enclosing layout has a size smaller than Long.MAX_VALUE).

@mcimadamore mcimadamore marked this pull request as ready for review May 16, 2024 10:59
@openjdk openjdk bot added the rfr Pull request is ready for review label May 16, 2024
@mlbridge
Copy link

mlbridge bot commented May 16, 2024

Webrevs

@minborg
Copy link
Contributor

minborg commented May 16, 2024

Do we have any performance tests available to see if there are any potential impacts?

/** alignment constraint (in bytes, expressed as a bit mask) **/
final long alignmentMask;

VarHandleSegmentViewBase(VarForm form, boolean be, long length, long alignmentMask, boolean exact) {
VarHandleSegmentViewBase(VarForm form, boolean be, long alignmentMask, boolean exact) {
Copy link
Contributor

@minborg minborg May 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Copyright year. This also applies to some other files in this PR.

@mcimadamore
Copy link
Contributor Author

Do we have any performance tests available to see if there are any potential impacts?

I've run all micro benchmarks whose name match LoopOver*. No regression was found. Few benchmarks seems a tad faster, but the percentages are so tiny that I'm sure it has nothing to do with the patch. I compared results before/after using JMH visualizer.

@mcimadamore
Copy link
Contributor Author

Do we have any performance tests available to see if there are any potential impacts?

I've run all micro benchmarks whose name match LoopOver*. No regression was found. Few benchmarks seems a tad faster, but the percentages are so tiny that I'm sure it has nothing to do with the patch. I compared results before/after using JMH visualizer.

Results here: https://cr.openjdk.org/~mcimadamore/jdk/8331865/
Unfortunately I can't seem to be able to upload the results in the visualizer (which would be handy, so I could share a link to the results). Not sure if it's an issue in the visualizer, or in the cr server itself. But, it is possible to at least download the above results locally, and then upload them to the visualizer tool.

@@ -45,7 +45,7 @@ public class TestHeapAlignment {
public void testHeapAlignment(MemorySegment segment, int align, Object val, Object arr, ValueLayout layout, Function<Object, MemorySegment> segmentFactory) {
assertAligned(align, layout, () -> layout.varHandle().get(segment, 0L));
assertAligned(align, layout, () -> layout.varHandle().set(segment, 0L, val));
MemoryLayout seq = MemoryLayout.sequenceLayout(10, layout);
MemoryLayout seq = MemoryLayout.sequenceLayout(1, layout);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change was an actual issue in the test, which was made manifest by the new checks

* accessed memory segment.</li>
* <li>The access operation must fall inside the spatial bounds of the accessed
* memory segment, or an {@link IndexOutOfBoundsException} is thrown. This is the case
* when {@code B + A <= S}, where {@code O} is the base offset (defined above),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean {@code O + A <= S}?
(Still working my way through the changes...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, apologies

Copy link
Member

@PaulSandoz PaulSandoz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took me a few passes to work it all out. Looks good. All the bounds checking action now consistently passes through the call to checkEnclosingLayout in adaption of the raw (and unsafe) segment accessing VarHandle.

Separately, we might be missing a few long argument accepting guard methods for simpler cases as I suspect they are still focused more on int index types.

@mcimadamore
Copy link
Contributor Author

Separately, we might be missing a few long argument accepting guard methods for simpler cases as I suspect they are still focused more on int index types.

Not sure I understand what guard methods you are referring to here?

@mcimadamore
Copy link
Contributor Author

Separately, we might be missing a few long argument accepting guard methods for simpler cases as I suspect they are still focused more on int index types.

Not sure I understand what guard methods you are referring to here?

Ah, got it. You mean more support in VarHandleGuards. Yes, I have a separate patch for that (actually had that for quite a while), but we're not super sure how to evaluate what impact it has :-)

@PaulSandoz
Copy link
Member

Ah, got it. You mean more support in VarHandleGuards. Yes, I have a separate patch for that (actually had that for quite a while), but we're not super sure how to evaluate what impact it has :-)

Ah, i did not realize that. Yes its tricky, it was designed for VHs to fields/arrays, to really minimize their overhead. With segments there is already some additional overhead e.g.,

        if (derefAdapters.length == 0) {
            // insert align check for the root layout on the initial MS + offset
            List<Class<?>> coordinateTypes = handle.coordinateTypes();
            MethodHandle alignCheck = MethodHandles.insertArguments(MH_CHECK_ENCL_LAYOUT, 2, rootLayout());
            handle = MethodHandles.collectCoordinates(handle, 0, alignCheck);
            int[] reorder = IntStream.concat(IntStream.of(0, 1), IntStream.range(0, coordinateTypes.size())).toArray();
            handle = MethodHandles.permuteCoordinates(handle, coordinateTypes, reorder);
        }

So perhaps it does not make much difference.

@mcimadamore
Copy link
Contributor Author

some additional overhead e.g.,

        if (derefAdapters.length == 0) {
            // insert align check for the root layout on the initial MS + offset
            List<Class<?>> coordinateTypes = handle.coordinateTypes();
            MethodHandle alignCheck = MethodHandles.insertArguments(MH_CHECK_ENCL_LAYOUT, 2, rootLayout());
            handle = MethodHandles.collectCoordinates(handle, 0, alignCheck);
            int[] reorder = IntStream.concat(IntStream.of(0, 1), IntStream.range(0, coordinateTypes.size())).toArray();
            handle = MethodHandles.permuteCoordinates(handle, coordinateTypes, reorder);
        }

True, the chain for segment var handle is quite long (and that is not a result of this patch, it has always been more or less like that). Funnily, I was staring at this piece of code the other day, and I think this can be dealt with morally with a foldArguments, but we don't have the equivalent of that in the VH/coordinates world. Maybe we should add that (or at least implement internally) as that would simplify the adaptation, avoiding the permute step (which is typically rather heavy).

}

@ForceInline
static $type$ get(VarHandle ob, Object obb, long base) {
VarHandleSegmentViewBase handle = (VarHandleSegmentViewBase)ob;
AbstractMemorySegmentImpl bb = checkAddress(obb, base, handle.length, true);
AbstractMemorySegmentImpl bb = checkReadOnly(obb, true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For getter methods, which pass a constant true here, checkReadOnly essentially just does a null check and cast on the segment. Not sure if it's worth simplifying... (I'm happy if you want to leave it like this as well)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll leave it for now. There's always a trade-off with these generated templates...

@openjdk openjdk bot added ready Pull request is ready to be integrated and removed csr Pull request needs approved CSR before integration labels May 28, 2024
@mcimadamore
Copy link
Contributor Author

/integrate

@openjdk
Copy link

openjdk bot commented May 29, 2024

Going to push as commit c003c12.
Since your change was applied there have been 173 commits pushed to the master branch:

  • 6d718ae: 8324341: Remove redundant preprocessor #if's checks
  • 9b64ece: 8332904: ubsan ppc64le: c1_LIRGenerator_ppc.cpp:581:21: runtime error: signed integer overflow: 9223372036854775807 + 1 cannot be represented in type 'long int'
  • 3d4eb15: 8302744: Refactor Hotspot container detection code
  • 2cca83b: 8332880: JFR GCHelper class recognizes "Archive" regions as valid
  • b8ae11e: 8332960: ubsan: classListParser.hpp:159:12: runtime error: load of value 2101478704, which is not a valid value for type 'ParseMode'
  • 9a83dfe: 8332431: NullPointerException in JTable of SwingSet2
  • 01060ad: 8325083: jdk/incubator/vector/Double512VectorTests.java crashes in Assembler::vex_prefix_and_encode
  • 673f767: 8285506: Unify os::vsnprintf implementations
  • 91ab088: 8333116: test/jdk/tools/jpackage/share/ServiceTest.java test fails
  • 9ac8d05: 8332228: TypePollution.java: Unrecognized VM option 'UseSecondarySuperCache'
  • ... and 163 more: https://git.openjdk.org/jdk/compare/61aff6db15d5bdda77427af5ce34d0fe43373197...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label May 29, 2024
@openjdk openjdk bot closed this May 29, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels May 29, 2024
@openjdk
Copy link

openjdk bot commented May 29, 2024

@mcimadamore Pushed as commit c003c12.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@mcimadamore
Copy link
Contributor Author

Do we have any performance tests available to see if there are any potential impacts?

I've run all micro benchmarks whose name match LoopOver*. No regression was found. Few benchmarks seems a tad faster, but the percentages are so tiny that I'm sure it has nothing to do with the patch. I compared results before/after using JMH visualizer.

Results here: https://cr.openjdk.org/~mcimadamore/jdk/8331865/ Unfortunately I can't seem to be able to upload the results in the visualizer (which would be handy, so I could share a link to the results). Not sure if it's an issue in the visualizer, or in the cr server itself. But, it is possible to at least download the above results locally, and then upload them to the visualizer tool.

Here's a link to the results:
https://jmh.morethan.io/?sources=https://corsproxy.io/?https://cr.openjdk.org/~mcimadamore/jdk/8331865/loop_over_00_baseline.json,https://corsproxy.io/?https://cr.openjdk.org/~mcimadamore/jdk/8331865/loop_over_01_8331865.json

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core-libs [email protected] integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

4 participants