-
Notifications
You must be signed in to change notification settings - Fork 435
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: MVKCommandBuffer: Optimize management of command storage. #1678
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So much boilerplate... I'm amazed you had the patience to slog through all those code changes!
Allocator alc; | ||
|
||
|
||
public: | ||
Allocator alc; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if you shouldn't add a constructor to MVKSmallVector<T>
and its base to initialize the allocator. Then you wouldn't have to reach into the allocator in order to set the command buffer for the bump allocator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe, to me that seems to add complication for no clear reason, thus I'm leaving it for others to decide. Are you thinking doing something of the sort:
_bindings = MVKCommandVector(MVKCommandVectorAllocator<BlahBlah>(cmdBuff));
in setContent
functions, or passing the command buffer to all MVKCommand constructors and putting
_bindings(MVKCommandVector(MVKCommandVectorAllocator<BlahBlah>(cmdBuff)))
in the inializer list, or something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cdavis5e, @billhollings, any preferences on this? I'm largely indifferent, other than doing the first is less intrusive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@js6i , I highly discourage this change to MVKSmallVector. The allocator should not be exposed at all and the overall design wasn't ment to be used with a specific constructor as well. The allocator type is a pure template argument.
I think the problem is, that MVKCommandVectorAllocator is already bad by design, it should not have the member cmdBuffer and it should not call mvkPushCommandMemory directly.
There are multiple ways of redesigning this, but since I am not really familiar with MoltenVK, I can't suggest a good solution right now, since I am afraid I will brake other things.
But please do not change MVKSmallVector with the above change, it will introduce all kind of issues in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aerofly, I don't see what's the argument you're making, other than saying it's bad. Setting the allocator with accompanying data for an expandable array is a very simple thing to do, and I don't understand how that's going to be a source of issues.
One thing that I'd probably be in favor of changing is removing the allocator as a template argument, and making a common interface for it, i.e. some sort of struct Allocator
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I didn't mean to be rude here. What I tried to say is, that exposing the allocator and actually accessing allocator elements from another class is basically mixing separate concepts. The allocator as well as the actual vector should not be intertwined like this.
I have added MVKSmallVector to MoltenVK to have a std::vector compatible container that allows the easy use of pre allocated stack elements to be used, to avoid new/delete's.
The way it is used now would not be STL compatible.
Now for this special case, I recommend to write a special class that encapsulates everything that is intended here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the first point, I still don't get it. If you want to use some allocator for an array or whatever, you gotta set it and its things. I don't see how that's mixing concepts. If I put the pointer to command buffer directly into the vector class, then I'd agree it would be mixing unrelated things.
I don't know if we need STL compatibility for anything. If we do, that's probably a valid argument.
That said, I also don't mind having dedicated code for this.
Thanks for submitting this. I'll review this next week, after the SDK release this weekend. |
Bad news: compiled libMoltenVK.dylib from this branch via |
What is the game? You can save logs in CrossOver with the "run with options" dialog, which I think should include MoltenVK errors by default, but you can also set MVK_CONFIG_LOG_LEVEL=3 environment variable. It may be useful to also set METAL_DEVICE_WRAPPER_TYPE=1 to enable Metal API validation. |
@js6i it worked, i was able to enable the logs, thanks. please find the log attached: |
@italomandara Do you use any workarounds to launch this game with CrossOver? I'm getting an error saying "Unhandled Exception: mono-io-layer-error" on startup. |
ok to install Elite dangerous you need to install microsoft .net framework 4.8 |
This WIP also crash Guild Wars 2. The easiest way to reproduce is fast travel. Upon appear on new spot, loading textures and assets will make game crash or freeze entirely. |
@italomandara, how to trigger this freeze? Can I expect to hit it by playing for a few minutes through the tutorial? |
you have to unlock fast travel point and travel from somewhere to it that's it for me triggers every time |
@js6i the freeze is random in Elite dangerous odyssey, it frequently happens in the main menu, but also while playing, I'm using M1Max, can't confirm you get the same on x86 machines and discrete GPUs |
I am also with m1 max |
@italomandara I think I'll need more info about your setup. On my M1 Air Elite Dangerous seems to work fine with this change. Which version of CrossOver are you using? Do you use the version of DXVK that ships with it? Did you change any potentially important registry settings or environment variables? Esync? High res/retina mode? Fullscreen, windowed fullscreen, other graphics settings in game? Unfortunately I don't have direct access to M1 Max specifically.. |
05965c7
to
daf1323
Compare
I didn't reproduce the freeze in GW2 either, but I spotted a possible problem. This PR is now updated and also rebased against current master. Feel free to retest.. |
yes now it will not crash, also seems that helps on fps in group events. I have played with patch like hour - hour and a half and traveled several times in very populated zones where before will hang and is ok |
@js6i It works now i also experimented with |
Sadly after prolonged gaming Guild Wars 2 hangs again, last time it stall whole laptop. I don't have such problems with current master. edit: mistake is mine because I have forced MTLEvents for Rosetta if use single queue every think is working as expected EDIT 2: It seems that problem is triggered if I use async version of DXVK with |
When I tested it all went well, i turned on async in dxvk but never forced MTLEvents, so this might narrow down the issue to MTLEvents? |
It happens with single queue also when async is enabled |
daf1323
to
4042693
Compare
Rebased on current master and some minor clean ups. @phush0, are you getting any error messages from DXVK or MoltenVK? What about Metal runtime? (run with METAL_DEVICE_WRAPPER_TYPE=1) Isn't dxvk-async just compiling shaders in a thread? That in principle shouldn't care about how commands are stored in buffers. |
Will check and will report back. Basically display just hangs. Actually async will skip draw at first and introduce it later as it is compiled in background, so it is more a hack than solution. May be this skip will lead to crash I am not entirely sure. |
I’d applied the first version of this on 1.11.1 and haven’t seen any issue with Witcher3. Tested in the training grounds where you’d usually get a lockup when hit (need to have MVK_CONFIG_RESUME_LOST_DEVICE=1 set) completed it without problem. edit: Also tested on Lego Starwars the skywalker saga still no issues both Witcher3 and this have async enabled. |
f9674d3
to
b272684
Compare
@js6i any reason to still consider this WIP? |
I tested this on my entire game library and it works |
with latest version I don't have hangs also |
While it may be working for everyone it’s possible that @js6i still considering this wip for some issue they’ve encountered, that’s why I was asking. |
sadly seems will miss the next Vulkan SDK feature cutoff (#1736) |
@oscarbg agreed but we don’t know if there’s a reason it’s still marked as WIP Once the next release is marked I’ll provide the usual stock package along with another that includes this PR and a coupe of other interesting patches/hacks. |
@Gcenx nice.. it's one of these additional patches the UE4 black screen fix: https://github.com/nastys/MoltenVK/releases/tag/ue4-workaround-1? |
I still have hangs in Guild Wars 2 even with the last patch. This is what I have in log as errors
after several such hangs of the app I reach the point where whole laptop freeze and reboots itself everything other is info - create this create that nothing more, no errors logged or smth |
If your using CrossOver-22.0.1 did you use the shipped DXVK or replace it with a third-party build? |
I am using latest build from your page and yes it is CrossOver-22.0.1 |
The latest DXVK-macOS build would be 1.10.3 that includes DXVK-CX hacks (that are still relevant) along with DXVK-async that’s only auto enabled on select titles. |
I don't use async for guild wars. Easiest way to reproduce e port to big city from somewhere. Loading objects will lead to hang. |
That seems a dxvk error not a moltenvk one... still strange that you don't get errors with the stock MoltenVK tho... |
this don't change the fact that Guild Wars hangs with this pr and not without it |
This is what I found when game hang. I can upload detailed log just it is 170 MB last shader before hang:
|
@phush0 I noticed at the end of the log |
yes no change same assert and hang |
@js6i could you please rebase this |
Took the liberty of rebasing here https://github.com/marzent/MoltenVK/tree/cmdbuffer-storage2 if anyone needs it |
The main motivation behind this commit is to reduce the time needed to free or reset a command buffer. With the current scheme, we need to loop over all recorded commands and return them to their pools. This can take a significant amount of time in games, of the order of a millisecond. With this change, instead of pooling individual commands by their type, we instead keep chunks of memory in each command buffer, from which we bump allocate memory for the commands to be recorded. MVKCommandVector is introduced to use the same memory pool for dynamic arrays of data associated with the commands. Pointers to commands that actually have non-trivial destructors (i.e. they keep references to other objects) are stored separately, so that we can call their destructors when reseting the command buffer. All the memory simply gets discarded by setting the current chunk's index and offset to zero. As a side note, specific applications might opt to not retain pipeline layouts in bind/push descriptor commands (introduced by 30113fc) if it's known not to be needed, to further reduce time spent resetting command buffers.
b2edd30
to
1ba86d9
Compare
@js6i this is not from your code changes but maybe your changes revealed some existing issues |
No. |
right it's even done in the same order in the apple docs, sorry. |
@js6i is there and issues you’ve encountered when using this that’s caused it to be still marked as WIP? |
@js6i could you rebase this PR to master, also is there any reason this is sitting in limbo? |
I'll most likely rebase on a newer version soon, the problem is that I haven't been able to reproduce the issue @phush0 was getting, and understandably @billhollings is not eager to merge regressive changes. Perhaps it's something that will go away on its own after rebasing. |
@js6i here is an easy example of a game crashing quickly with this applied. Jedi: Fallen Order, fails to load a save without the game will load the save though the game will eventually crash out when remembering an ability (first one being wall running) when using DXVK not the case when using WineD3D Vulkan though it’s unplayable with that. The issue is probably with spirv-cross. Though the above may have been resolved via 75b4d26 as something fails to compile causing the pipeline to get locked up. I suspect this PR was just uncovering these issues much sooner than without. |
Just rebase will test again |
Note: it's a large-ish change and probably would require some more clean up. For example, I think it no longer makes sense to have a separate MVKCmdRenderPassBase. Still, I'm submitting for overall review and suggestions.
The main motivation behind this commit is to reduce the time needed to free or
reset a command buffer. With the current scheme, we need to loop over all
recorded commands and return them to their pools. This can take a significant
amount of time in games, of the order of a millisecond.
With this change, instead of pooling individual commands by their type, we
instead keep chunks of memory in each command buffer, from which we bump
allocate memory for the commands to be recorded. MVKCommandVector is introduced
to use the same memory pool for dynamic arrays of data associated with the
commands.
Pointers to commands that actually have non-trivial destructors (i.e. they keep
references to other objects) are stored separately, so that we can call their
destructors when reseting the command buffer. All the memory simply gets
discarded by setting the current chunk's index and offset to zero.
As a side note, specific applications might opt to not retain pipeline layouts
in bind/push descriptor commands (introduced by 30113fc)
if it's known not to be needed, to further reduce time spent resetting command buffers.