distinct 'static' items never overlap #1657

Open
wants to merge 3 commits into
base: master

Conversation

RalfJung
Member

@RalfJung RalfJung commented Oct 20, 2024

It seems like so far we did not actually guarantee this.

While we are at it, also clarify that static initializers can read even mutable statics, and what happens in that case.

@workingjubilee
Member

I don't think you understood my example? I was positing that this seems to be a legal interpretation of the reference's current text:

static ZEE: u8 = 0;
static ZED: u8 = 0;
assert_eq!(&raw const ZEE, &raw const ZEE);
assert_eq!(&raw const ZED, &raw const ZED);
assert_eq!(&raw const ZEE, &raw const ZED);

@workingjubilee
Member

Because this

I would say if a static has a wibbly wobbly address not equal to itself, that's not a "precise memory location". We don't even have addresses that are not equal to themselves.

cannot be a response to what I actually said if it is said with an understanding of what I was trying to say. :/

@RalfJung
Member Author

RalfJung commented Oct 20, 2024

I think I understood the example? I don't understand how that can be a valid interpretation of the text. A static has a location, &raw const gives you a pointer pointing there. Different locations compare unequal.

@RalfJung
Member Author

RalfJung commented Oct 20, 2024

Oh, maybe I didn't quite understand the example.

But statics are certainly intended to be unique and disjoint. That's their point -- they describe a place, distinct from all other places.

More specifically, statics form their own allocated objects that don't overlap with any other allocated object. So in fact ZST statics are not quite unique -- but statics of type i32 are guaranteed to be at least 4 apart.
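To make the claim above concrete, here is a minimal sketch, assuming nothing beyond the standard library. The static names and helper functions are made up for illustration. It uses `addr_of!`, which should behave like `&raw const` for this purpose. The distance check on the two `i32` statics is what the proposed wording is meant to guarantee; the comparison of the zero-sized statics is only observed, since their addresses may or may not coincide.

```rust
use std::ptr::addr_of;

// Hypothetical statics, named only for this sketch.
static A: i32 = 1;
static B: i32 = 2;
static UNIT_A: () = ();
static UNIT_B: () = ();

/// Distance in bytes between the two `i32` statics.
/// Disjoint 4-byte allocated objects must start at least 4 bytes apart.
fn i32_static_distance() -> usize {
    let a = addr_of!(A) as usize;
    let b = addr_of!(B) as usize;
    a.abs_diff(b)
}

/// Whether the two zero-sized statics happen to share an address.
/// Either answer is plausible, so callers should not rely on it.
fn unit_statics_coincide() -> bool {
    addr_of!(UNIT_A) == addr_of!(UNIT_B)
}
```

Only the addresses are inspected here; nothing is read through the pointers.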

@RalfJung
Member Author

@rustbot label +T-opsem

@rustbot rustbot added the T-opsem Team: opsem label Oct 20, 2024
@RalfJung
Member Author

@rfcbot merge
since so far it seems like we haven't actually documented "different statics are disjoint".
Cc @rust-lang/lang

@rfcbot

rfcbot commented Oct 20, 2024

Team member @RalfJung has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns.
See this document for info about what commands tagged team members can give me.

@RalfJung RalfJung changed the title from "attempt to clarify 'static' unique address guarantees" to "distinct 'static' items never overlap" Oct 20, 2024
@JakobDegen

Lgtm, with the note that it's important this language remains restricted to static items, and not other language constructs that produce statics - const promotion, vtables, functions, etc. Will check my box when I'm off mobile, as apparently the GH app doesn't let you edit things anymore (or someone can do it for me :) )

@RalfJung
Member Author

You probably don't have edit rights here. I don't.

You can use @rfcbot reviewed to check your box.

with the note that it's important this language remains restricted to static items, and not other language constructs that produce statics - const promotion, vtables, functions, etc.

I would argue those aren't statics, they are other kinds of global allocations -- exactly because of this fundamental difference.

@JakobDegen

@rfcbot reviewed

@workingjubilee
Member

workingjubilee commented Oct 20, 2024

hmm. how must the addressing work for this, then?

static BLAH: &str = "blah";
static ALSO_BLAH: &str = "blah";

Are these potentially two different pointers to the same string literal?

@RalfJung
Member Author

Are these potentially two different pointers to the same string literal?

Yes.
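A small sketch of what that answer means, using the two statics from the question. The helper function names are hypothetical. The statics themselves are distinct, non-zero-sized allocated objects, so their own addresses should differ; the string data they point to is a separate matter, and the two literals may or may not end up at the same address.

```rust
static BLAH: &str = "blah";
static ALSO_BLAH: &str = "blah";

/// Compares the addresses of the two statics themselves
/// (each one stores a pointer and a length).
fn statics_are_distinct() -> bool {
    !std::ptr::eq(&BLAH, &ALSO_BLAH)
}

/// Compares the addresses of the string data the statics point to.
/// The result is not guaranteed either way.
fn literals_share_address() -> bool {
    std::ptr::eq(BLAH.as_ptr(), ALSO_BLAH.as_ptr())
}
```

Code that needs a stable identity should therefore compare the statics, not the literals behind them.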

@saethlin
Member

@rfcbot reviewed

@rfcbot

rfcbot commented Oct 20, 2024

🔔 This is now entering its final comment period, as per the review above. 🔔

psst @RalfJung, I wasn't able to add the final-comment-period label, please do so.

@digama0

digama0 commented Oct 20, 2024

@rfcbot reviewed

Comment on lines 12 to 13
program that is initialized with the initializer expression. This allocated object is disjoint from
all other allocated objects. All references and raw pointers to the static refer to the same
Contributor

Do we want to guarantee that they are disjoint from other allocated objects? Or just other statics. IE. if I have:

static FOO: i32 = 0;
const BAR: i32 = 0;

fn foo(){
    assert!(!core::ptr::eq(&FOO, &BAR));
}

Should we guarantee the assertion will always pass?

Contributor

I don't have concern permissions, but I think this should be addressed by T-opsem before the end of the FCP.

Member Author

Yeah I think we should guarantee that, and it's what the text already says, isn't it?

Member

That prevents emitting unnamed_addr on const items.

Member

unnamed_addr specifically does coalescing of non-significant-address-items into each other, and potentially into a significant-address item.

Member

The LangRef says:

Note that a constant with significant address can be merged with a unnamed_addr constant, the result being a constant whose address is significant.

Member Author

Ah, that is unfortunate.

Contributor

Yeah I think we should guarantee that, and it's what the text already says, isn't it?

That's what the proposed text says, my question is whether that's what we want it to say.

Member

@workingjubilee workingjubilee Oct 21, 2024

I... honestly don't think that's surprising at all?

That's exactly what I would expect in a case of

const BIG_CONST: BigFrozen = BigFrozen::big_init();
static BIG_STATIC: BigFrozen = BIG_CONST;

That's the precise situation where the const will be unified "into" the static, and where it would be a clearly beneficial optimization.
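For readers who want to try this, here is a self-contained version of that snippet. `BigFrozen` is not defined anywhere in the thread, so the struct, its size, and `big_init` below are assumptions made purely for illustration. The address comparison is deliberately left unasserted: whether the promoted const value is merged into the static is up to the compiler and backend.

```rust
// Hypothetical definition: a large type with no interior mutability.
#[derive(PartialEq)]
struct BigFrozen([u8; 1024]);

impl BigFrozen {
    const fn big_init() -> Self {
        BigFrozen([7; 1024])
    }
}

const BIG_CONST: BigFrozen = BigFrozen::big_init();
static BIG_STATIC: BigFrozen = BIG_CONST;

/// Whether a reference to the const's value lands on the static's address.
/// Either outcome is possible; this only observes what happened.
fn const_shares_static_address() -> bool {
    std::ptr::eq(&BIG_CONST, &BIG_STATIC)
}

/// The contents are equal regardless of where the bytes live.
fn contents_match() -> bool {
    BIG_CONST == BIG_STATIC
}
```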

Member Author

I... honestly don't think that's surprising at all?

I guess we have different intuitions then.

But reality clearly works one way here, so I have updated the text to match reality. @chorman0773 please have a look.

@CAD97

CAD97 commented Oct 20, 2024

@rfcbot reviewed

@workingjubilee
Member

I have a concern: this proposed definition prevents emitting optimization annotations on non-static items that we may wish to have coalesced via optimizations that exploit the fact that const items have a non-significant address. Even if we wish to guarantee static unicity, it seems pointlessly penalizing to make this guarantee affect the ability to optimize, quite literally, the items that don't have unique addresses.

@saethlin
Member

@rfcbot concern clarify-optimization-of-consts

I think the proposed text and the previous text have the same meaning, which is unfortunate because I think that we'd already specified that consts are not merged into statics. But given the participation on this PR, and its title which seems to be about overlap of statics, and the fact that I think rustc currently does not implement what is documented here, I would like the implications for consts to be spelled out clearly.

@RalfJung
Member Author

RalfJung commented Oct 25, 2024

But given the participation on this PR, and its title which seems to be about overlap of statics, and the fact that I think rustc currently does not implement what is documented here, I would like the implications for consts to be spelled out clearly.

I gave that a shot, please have a look.

@saethlin
Member

@rustbot resolve clarify-optimization-of-consts

Contributor

@chorman0773 chorman0773 left a comment

Given the concern is now resolved, I see no further issues with this.

@digama0

digama0 commented Nov 8, 2024

No, I don't think that has to be the usual definition of overlap of a range. Ranges are not sets.

I don't know which definition is more widely used in mathematics. @chorman0773 said my definition is used there; do you have a source for that? @digama0 do you have any idea?

I think that the definition of overlap of sets is clearly that they have nonempty intersection, but I think your definition makes more sense in this context. The difference of course is whether we want to say that [0, 2) overlaps [1, 1), which is false for sets because the latter set is empty, but is true under the endpoint-based definition. Off the top of my head I don't know a formalization of this interval order; I think it's not likely to come up except in CS-like contexts, and in that case the definitions are usually tailored to the application anyway.

I recall a related version of this issue: We currently allow ZST's to be magicked up anywhere, such that <*const ()>::dangling(n) produces a valid pointer for any n. But if so, then that implies that we can also create ZST allocations in the middle of other allocations, so it would not be the case that allocations must be disjoint in the interval sense (although they are still disjoint in the set-of-bytes sense).
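The two notions of overlap contrasted in this comment can be sketched as follows. This is one possible reading, not wording from the PR: the function names are made up, ranges are half-open byte ranges, and the endpoint-based check is the usual interval comparison applied without regard to emptiness.

```rust
use std::ops::Range;

/// Set-of-bytes view: two ranges overlap only if they share at least one byte.
/// An empty range contains no bytes, so it never overlaps anything.
fn overlaps_as_sets(a: &Range<usize>, b: &Range<usize>) -> bool {
    !a.is_empty() && !b.is_empty() && a.start < b.end && b.start < a.end
}

/// Endpoint view: compare the endpoints only, ignoring emptiness.
/// An empty range strictly inside another range counts as overlapping it.
fn overlaps_by_endpoints(a: &Range<usize>, b: &Range<usize>) -> bool {
    a.start < b.end && b.start < a.end
}
```

With these definitions, `0..2` and `1..1` do not overlap as sets but do overlap by endpoints, while an empty range sitting exactly at the start or end of `0..2` overlaps under neither definition.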

@RalfJung
Member Author

RalfJung commented Nov 8, 2024

I recall a related version of this issue: We currently allow ZST's to be magicked up anywhere, such that <*const ()>::dangling(n) produces a valid pointer for any n. But if so, then that implies that we can also create ZST allocations in the middle of other allocations,

As mentioned above, "magicking up" a ZST like that does not create a ZST allocation. It just creates a pointer/reference without provenance. Those are not the same thing.
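A rough illustration of that distinction, with hypothetical function names. The first function reads a `()` through `NonNull::dangling()`, a pointer that belongs to no allocation. The second takes an address in the middle of a local array and reads a `()` there; the array remains the only allocation involved, and no bytes of it are touched by the read.

```rust
use std::ptr::NonNull;

/// Reads a zero-sized value through a dangling pointer.
fn read_unit_through_dangling() {
    let p: NonNull<()> = NonNull::dangling();
    // SAFETY: a zero-sized read accesses no memory; the pointer is
    // non-null and aligned, which is all that is required here.
    unsafe { p.as_ptr().read() }
}

/// Reads a zero-sized value at an address inside an array and returns
/// the offset of that address from the start of the array.
fn unit_pointer_offset_in_array() -> usize {
    let bytes = [0u8; 8];
    let base = bytes.as_ptr();
    let unit_ptr = base.wrapping_add(4) as *const ();
    // SAFETY: zero-sized read, non-null and aligned pointer.
    unsafe { unit_ptr.read() };
    unit_ptr as usize - base as usize
}
```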


I wonder if we are constrained by LLVM here... @nikic do you know if there will be trouble with LLVM when we have a zero-sized static located "inside" another static (or inside a stack/heap allocation)? When I do x = malloc(10), is LLVM then allowed to optimize static_addr <= x || static_addr >= x+10 to true?

@workingjubilee
Member

I went surveying various embedded software implementations using Rust today. Many of them do in fact use patterns that would prefer to be able to place static ZSTs at fairly arbitrary locations (including inside another static) and use those markers to generate slices in the ways we've discussed, because they need to somehow reason about fixed ranges of memory in slice-like manners. Ideally we would have some sort of pattern that enables writing this, and if not by using the setoid-based rule here, then we'd want something else.

One or two do in fact do everything in the most tedious way (entirely asm! and linker script), but not many seem to tolerate slogging through that mire (...they still gotta suffer through the linker script though).

@RalfJung
Member Author

RalfJung commented Nov 8, 2024

Note that using the addresses of those statics as markers can be fine, but under no circumstances is it fine to access the same underlying global memory with pointers derived from different static or extern static declarations (except, probably, those with the same link_name).

@nikic
Contributor

nikic commented Nov 8, 2024

I went surveying various embedded software implementations using Rust today. Many of them do in fact use patterns that would prefer to be able to place static ZSTs at fairly arbitrary locations (including inside another static) and use those markers to generate slices in the ways we've discussed, because they need to somehow reason about fixed ranges of memory in slice-like manners. Ideally we would have some sort of pattern that enables writing this, and if not by using the setoid-based rule here, then we'd want something else.

One or two do in fact do everything in the most tedious way (entirely asm! and linker script), but not many seem to tolerate slogging through that mire (...they still gotta suffer through the linker script though).

Do the embedded use-cases just need ZST markers that are directly adjacent to other statics (i.e. they share a start/end), or are they nested strictly inside them? If the latter, do you have any example to share?

@nikic
Contributor

nikic commented Nov 8, 2024

I wonder if we are constrained by LLVM here... @nikic do you know if there will be trouble with LLVM when we have a zero-sized static located "inside" another static (or inside a stack/heap allocation)? When I do x = malloc(10), is LLVM then allowed to optimize static_addr <= x || static_addr >= x+10 to true?

After some experimentation, I got this somewhat concerning result: https://llvm.godbolt.org/z/6qTvx3qxd

I guess test1 answers your question in terms of what LLVM currently assumes at least, while test0 looks like an outright miscompile to me. (Alive seems to be struggling with zero-sized globals and thinks that everything is UB: https://alive2.llvm.org/ce/z/qFySyd Filed as AliveToolkit/alive2#1109.)

@workingjubilee
Member

workingjubilee commented Nov 8, 2024

I can go into some more detail after I get some sleep but a simplified example would look like this, so yes, strict inclusion: https://godbolt.org/z/eE6Wdjvnb

Most of the ones I looked at were from fairly diligent and clued-in folks, so even the cases that wrote code that is probably still UB tbh went to fairly extreme lengths to write contorted code that they clearly are hoping will avoid the notice of an optimizer. And each was contorted in its own unique way. I am offering this simpler example not because it's an exact replica of what their code is like, but an example of what I think they could be doing if we can support this. (...also, because it doesn't involve everyone learning to read linker script so they can discuss it.)

...but uh, that's fairly alarming and we should probably bring this up to Alive.

@RalfJung
Member Author

RalfJung commented Nov 8, 2024

After some experimentation, I got this somewhat concerning result: https://llvm.godbolt.org/z/6qTvx3qxd

Could you add some comments that aid in interpreting the results? :)

@nikic
Contributor

nikic commented Nov 8, 2024

After some experimentation, I got this somewhat concerning result: https://llvm.godbolt.org/z/6qTvx3qxd

Could you add some comments that aid in interpreting the results? :)

test0: zero-size global == start of alloca -> false
test1: zero-size global == middle of alloca -> false
test2: zero-size global == end of alloca -> unknown

test0 is a miscompile, and test1 depends on whether a zero-size global can be in the middle of an alloca or not. LLVM currently assumes it can't, but given that it also assumes it can't be at the start of one, it's hard to distinguish whether that's a bug or a feature :)

I think if there is a consensus that we really want to allow zero-size statics that arbitrarily overlap with other allocations, we probably could get through a LangRef change to that effect and adjust InstSimplify accordingly.

Though I think in this context, it's also important to distinguish between a) what addresses an extern static may be placed at (e.g. via linker script) and b) what addresses Rust itself can place a static. Even if we allow extern statics to overlap other statics, if you define a static directly in Rust, is Rust allowed to place it at an overlapping address?

@RalfJung
Member Author

RalfJung commented Nov 8, 2024

That helps, thanks!

I don't have a strong opinion on whether we should allow zero-size statics that arbitrarily overlap with other allocations. If it can be done in LLVM without impacting relevant optimizations, I would generally err on the side of having less UB.

@workingjubilee
Member

workingjubilee commented Nov 8, 2024

Though I think in this context, it's also important to distinguish between a) what addresses an extern static may be placed at (e.g. via linker script) and b) what addresses Rust itself can place a static. Even if we allow extern statics to overlap other statics, if you define a static directly in Rust, is Rust allowed to place it at an overlapping address?

My impression is that, from the perspective of embedded developers, the artificial distinction we draw between extern "C" static and "native" Rust static is not a strong one. Many do avoid it, but especially in code that I see when someone is asking for help debugging stuff, they don't bother with the extern "C" linkage. Partly because it works: as the example I show demonstrates, all you need is #[link_section] (or #[no_mangle], with slightly different tricks) and then you can start picking addresses.

And as I've mentioned, what's more important is having a blessed pattern that we can recommend to this community, rather than telling them what not to do, which mostly leads to them producing obfuscated code:

Ideally we would have some sort of pattern that enables writing this, and if not by using the setoid-based rule here, then we'd want something else.

@scottmcm
Member

scottmcm commented Nov 8, 2024

Or... do we actually have codegen synthesize addresses like this now?

Yes:

https://github.com/rust-lang/rust/blob/209799f3b910c64c8bd5001c0a8a55e03e7c2614/compiler/rustc_codegen_llvm/src/common.rs#L272-L282

@workingjubilee
Member

I have opened rust-lang/unsafe-code-guidelines#546 to try to capture some of the discussion from here about the ZST issue and will try to expand on the embedded use-cases in that issue or the also-relevant rust-lang/unsafe-code-guidelines#545

@RalfJung
Member Author

RalfJung commented Nov 9, 2024

@scottmcm okay, so that would place ZST constant allocations at NonNull::dangling addresses. That's already covered by the wording of this PR even without the ZST exception.

Do zero-sized local variables get an alloca, or what do we do about them?

@joshtriplett
Member

What I was responding to was specifically @joshtriplett's comment:

In particular, I don't think it should be possible for a ZST to be in the middle of an unrelated object (e.g. an unrelated array of non-ZSTs).
@nikomatsakis proposed defining overlap as "the start of one object lies between the start and end of the other, exclusive". That would prevent that property.

Which doesn't seem to have a clear motivation behind why preventing that property is desirable, and it sounded pretty definite, at least? Perhaps I am attaching too much finality to "I don't think it should be possible"?

I'm aware I can come across as needling, and I don't particularly enjoy it either. I feel I get into situations like the one that started with this PR often enough, where everyone signs off on something within 2 hours and then suddenly raising my concern and making sure people understand it is immediately on a timer. Then I feel rushed and have to fumble together an explanation before people have Made Up Their Minds.

That comment, written during the meeting in the course of discussion, was very much meant to be an early indicator of "this raised some eyebrows in a meeting around what seemed like an unaddressed corner case; here were some various thoughts that came up". It was not meant to convey any kind of definitive conclusion on What Behavior The Team Wants, just to start a discussion. That said, I could absolutely have spelled that out more explicitly. Sorry if it came across as a Decision Being Made rather than a Discussion Being Had.

@digama0

digama0 commented Nov 10, 2024

For what it's worth, my answer above was given without having read too much of the backscroll, and now that I have I'm leaning more toward allowing ZST allocations (or "allocations") to be anywhere, including strictly inside another allocation, consistently for all purposes. But for the purpose of this RFC, I'm also fine with just punting on the question for now.

@scottmcm
Member

Do zero-sized local variables get an alloca, or what do we do about them?

Today they do, though I in fact have an open exploratory draft PR considering changing that:

https://github.com/rust-lang/rust/pull/132387/files#diff-160634de1c336f2cf325ff95b312777326f1ab29fec9b9b21d5ee9aae215ecf5R57-R67

@nikomatsakis
Contributor

nikomatsakis commented Nov 14, 2024 via email

@nikic
Contributor

nikic commented Nov 14, 2024

The LLVM issue is fixed by llvm/llvm-project#115728.

@traviscross
Contributor

Given the change in 8dde3eb that excludes the challenging bit under discussion here with respect to ZSTs, let's...

@rfcbot fcp merge

@rfcbot

rfcbot commented Nov 20, 2024

Team member @traviscross has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns.
See this document for info about what commands tagged team members can give me.

@traviscross
Contributor

Checking off the boxes for members of T-opsem as these were checked off in the earlier FCP.

@tmandry
Member

tmandry commented Nov 20, 2024

I think the clarification proposed by @joshtriplett (or something more explicit) should be included, but otherwise I'm happy to go ahead with this.

@rfcbot reviewed

Co-authored-by: Josh Triplett <[email protected]>