Removed MaxStackSize setter. #892

kekyo · 2023-01-19T04:46:23Z

When generating assemblies using cecil, I manually specified .maxstack for MethodBody, but it is not reflected. It is always overwritten with the calculated result. This problem has been fixed in this PR.

Example:

var method = new MethodDefinition(...);
var body = method.Body;

body.MaxStackSize = 100;

// ...

module.Write("foo.dll");

ltrzesniewski · 2023-01-19T17:02:33Z

Mono.Cecil.Cil/CodeWriter.cs

+ if (body.max_stack_size == 0) {
+ body.max_stack_size = max_stack;
+ }


The CodeReader sets max_stack_size to a nonzero value for methods with a fat header:

cecil/Mono.Cecil.Cil/CodeReader.cs

Line 118 in 870ce3e

body.max_stack_size = ReadUInt16 ();

This will certainly break workflows such as loading an assembly, adding instructions which increase the max stack size, then writing the assembly. The writer would assume the max stack size was set explicitly by the user.

The following would be a safer alternative:

Suggested change

if (body.max_stack_size == 0) {

body.max_stack_size = max_stack;

}

body.max_stack_size = Math.Max(body.max_stack_size, max_stack);

But then, changes that decrease the max stack size wouldn't be taken into account automatically. So maybe you need a "maxstack is set explicitly" flag.

Also, note this only applies to fat headers. Cecil will always write a tiny header if possible. I'm actually curious why you need this change in the first place.

I'm writing an assembler tooling and using cecil for output. One time I noticed that the value of .maxstack output by cecil sometimes differed from the expected value.

As it turns out, the problem was a mistake in my assembler code, but it made me think a bit about calculating the value of maxstack. For example, which is a rather aggressive:

int32 foo(int32 count) { .locals ([0] int32 i) ldarg.0 ; i = count stloc.0 L1: ldc.i4.1 ; [increase ev stack] ldloc.0 ; i-- ldc.i4.1 sub stloc.0 ldloc.0 ; if (i != 0) goto L1 ldc.i4.0 ceq brfalse L1 ; (snip: decrease ev stack by count) ret }

Even if a static execution flow analysis is performed, complete maxstack size cannot be calculated in advance. Therefore, the person writing the assembly code must manually set an appropriate size for maxstack.

Of course in most cases, the size should be set to the size obtained from the static execution flow analysis, so I believe we only need to deal with this exceptional situation. So, I explicitly specified a value for the MaxStackSize property only if a size was specified, but this was not reflected.

This will certainly break workflows such as loading an assembly, adding instructions which increase the max stack size, then writing the assembly. The writer would assume the max stack size was set explicitly by the user.

I completely agree and it was my mistake that the existing implementation was affected. May I make additional changes? I will use the "maxstack is set explicitly" flag as suggested.

complete maxstack size cannot be calculated in advance

Oh, you're misunderstanding what maxstack means. It's not the max size of the execution stack, but the max size of the IL evaluation stack. The IL evaluation stack size is constant for each IL instruction.

See ECMA-335 III.1.7.4:

[Note: Maxstack is related to analysis of the program, not to the size of the stack at runtime. It does not specify the maximum size in bytes of a stack frame, but rather the number of items that
shall be tracked by an analysis tool. end note]

[Rationale: By analyzing the CIL stream for any method, it is easy to determine how many items
will be pushed on the CIL Evaluation stack. However, specifying that maximum number ahead
of time helps a CIL-to-native-code compiler (especially a simple one that does only a single pass
through the CIL stream) in allocating internal data structures that model the stack and/or
verification algorithm. end rationale]

TL;DR: It can always be calculated statically, and Cecil does the right thing. 🙂

May I make additional changes?

I'm not the maintainer, just a guy on the internet trying to help.

I'm not the maintainer, just a guy on the internet trying to help.

Thanks for reviewing my code! I await comments from the maintainer.

I understand the difference between the runtime stack (as indicated by rsp and rbp in x64) and the evaluation stack. And I also understand that the topic about maxstack is about the evaluation stack.

The sample code indicates that "If the max stack size of the evaluation stack changes at "run-time", max stack size cannot be calculated by static analysis."

I could not read the meaning from the description in the ECMA specification that the evaluation stack consumption is always statically determined. In fact, if we write code like the sample code above, we will see that it is not possible to estimate it in advance. And it causes InvalidProgramException when maxstack is insufficient.

I thought there might be a need to explicitly specify maxstack when outputting assemblies.

Ok, I see. If you're trying to write custom tooling (something that replaces the .NET JIT), then this makes sense. Your code could accept a variable evaluation stack depth.

Just be aware this is not correct ECMA-335 IL though, and is not accepted by the .NET JIT. You could support that as a custom extension to the standard.

Here's an example project: ConsoleApp.zip, it does an inifite loop while pushing an item to the stack. The .NET JIT immediately rejects it with an InvalidProgramException.

See the III.1.7.5 paragraph from ECMA-335 which defines the related correctness rule (emphasis mine):

It shall be possible, with a single forward-pass through the CIL instruction stream for any
method, to infer the exact state of the evaluation stack at every instruction (where by “state” we
mean the number and type of each item on the evaluation stack).

In particular, if that single-pass analysis arrives at an instruction, call it location X, that
immediately follows an unconditional branch, and where X is not the target of an earlier branch
instruction, then the state of the evaluation stack at X, clearly, cannot be derived from existing
information. In this case, the CLI demands that the evaluation stack at X be empty.

Following on from this rule, it would clearly be invalid CIL if a later branch instruction to X
were to have a non-empty evaluation stack

[Rationale: This constraint ensures that CIL code can be processed by a simple CIL-to-nativecode compiler. It ensures that the state of the evaluation stack at the beginning of each CIL can
be inferred from a single, forward-pass analysis of the instruction stream. end rationale]

[Note: the stack state at location X in the above can be inferred by various means: from a
previous forward branch to X; because X marks the start of an exception handler, etc. end note]

The rules for merging evaluation stack frames described in III.1.8.1.3 could also be relevant, even though they determine verifiability (and not correctness).

Thanks for showing the pointer. I referred to the text of ECMA, as it seems I only looked at the quoted part and misinterpreted it.

And I understood that in III.1.7.4 and III.1.7.5, the statically analyzed maxstack must be set and its value must be correct.
I now understand that this means that an instruction stream like the sample code I showed can be written, but it is not guaranteed (nor should it be) that the CLR can execute it. Your point is absolutely correct!!! 😄 If someone asks me, I can now explain to them that they should not write such CIL code.

So, back to cecil, I always feel that the MaxStackSize property should not have a setter, since the maxstack can be calculated by static analysis. The CoreReader stores the value directly in the backing store of the MaxStackSize property. So without a setter, cecil users will be able to treat MaxStackSize as a view in the true sense.

I pushed the code modified to do so. Thank you explain it real issue again!

kekyo · 2023-01-20T02:53:27Z

I wasn't sure if the setter for the MaxStackSize property should check to see if the value has been changed, but I determined that the intention was to change it when the setter was called. How about this?

ltrzesniewski reviewed Jan 19, 2023

View reviewed changes

kekyo force-pushed the fix-maxstack-ignored branch from 40c55c1 to 5debb46 Compare January 20, 2023 02:49

Removed MaxStackSize setter.

e7e5bb8

kekyo force-pushed the fix-maxstack-ignored branch from 5debb46 to e7e5bb8 Compare January 21, 2023 02:54

kekyo changed the title ~~Fixed explicitly applied maxstack being ignored.~~ Removed MaxStackSize setter. Jan 21, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Removed MaxStackSize setter. #892

Removed MaxStackSize setter. #892

kekyo commented Jan 19, 2023 •

edited

ltrzesniewski Jan 19, 2023 •

edited

kekyo Jan 20, 2023

ltrzesniewski Jan 20, 2023

kekyo Jan 20, 2023

ltrzesniewski Jan 20, 2023 •

edited

kekyo Jan 21, 2023

kekyo commented Jan 20, 2023 •

edited

Removed MaxStackSize setter. #892

Are you sure you want to change the base?

Removed MaxStackSize setter. #892

Conversation

kekyo commented Jan 19, 2023 • edited

ltrzesniewski Jan 19, 2023 • edited

Choose a reason for hiding this comment

kekyo Jan 20, 2023

Choose a reason for hiding this comment

ltrzesniewski Jan 20, 2023

Choose a reason for hiding this comment

kekyo Jan 20, 2023

Choose a reason for hiding this comment

ltrzesniewski Jan 20, 2023 • edited

Choose a reason for hiding this comment

kekyo Jan 21, 2023

Choose a reason for hiding this comment

kekyo commented Jan 20, 2023 • edited

kekyo commented Jan 19, 2023 •

edited

ltrzesniewski Jan 19, 2023 •

edited

ltrzesniewski Jan 20, 2023 •

edited

kekyo commented Jan 20, 2023 •

edited