-
Notifications
You must be signed in to change notification settings - Fork 63
Implement staging buffer and use it for the material system #1675
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
96d8c0d
to
88c7d08
Compare
The CI error is caused by some unrelated bullshit. |
That sounds related |
Oh right, it's because of the moronic behaviour with static const in class declarations. |
Well no, it's gotta be caused by something else too, because even with |
Yeah I don't think |
Yeah, it's defaulting to C++14 apparently. |
It'll just have to do with the old redundant way. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get crashes in the Nvidia driver if I make the buffer small enough that it can't hold all the data at once. For example SIZE = 20 << 20
for Thunder or SIZE = 2 << 20
for plat23.
So don't do that? |
That means all the code to handle |
e9827af
to
bf91241
Compare
Fixed. |
I still get crashes. For example |
I don't get any crashes with that. I don't get why you keep trying to use it with a different size buffer that it's not supposed to work with, either. I don't care if it breaks due to someone randomly changing the code. |
Even though what I say here might be censored again, let me bother this one time.
Am I understanding this correctly: you change a constant in the source code (to yield a buffer with less than 4MB), and observe a crash? And you believe the source code has to be crafted such that no changes in any compile time constant may result in a crash? |
To test all the code paths, since I don't know how to use over 128MB of buffers with existing assets. If the code path for flushing out data to make room for incoming data is not intended to work, we can replace it with I tried on AMD+Linux with the same |
I tested with a lower size (not 3.6mb) and the wrap-around does work correctly.
That is not enough to fit all the data being mapped at the end of map loading. |
Could the maximum allocation be different based on some settings? I logged the sizes and the maximum one for me on plat23 had a count between 800k and 900k. If I put SIZE smaller than the largest single allocation, it calls |
Why?
The point of a staging buffer is to transfer and allocate memory on the GPU efficiently, it's not some memory store that you reallocate randomly. |
I thought maybe you are implying that 900000 was not enough for the largest single mapping with the comment, "That is not enough to fit all the data being mapped at the end of map loading." If I understood the API design of the staging buffer correctly: after allocating a block of memory, you must write its contents and request copying to another buffer before you allocate the next block. So only one allocation is available to use at a time. |
Not quite: it's two mappings. It wouldn't make sense to fill up those allocations separately, and doing so would be slower. But since they both can't fit a 900000 size buffer at the same time, the first allocation gets flushed before any data is written to it, which is why there's no geometry visible in that case (I cannot reproduce a crash there though).
It does work with multiple allocations, that's the reason (or part of the reason, anyway) that it's 128mb in size. |
If I remove the geometry cache commit, I still get the crashes. The geometry cache allocations aren't actually the biggest ones anyway, at least on plat23. I don't get crashes with the first commit alone. Introducing the staging buffer plus the "cleanup" commit is necessary. If I put back the code that zeroes the |
I mean the
It's most likely just a conesquence of |
Thanks, now I get it. Writes to The bug here seems fairly possible to hit in real life by making a huge map. I think we should make sure to prevent it since it has the potential to bring down the user's whole system. How about writing the surface commands to a temporary CPU buffer before writing them to the GL buffer? Besides fixing the bug, that would stop the following code from having to read back data from the staging buffer. I think reading back those buffers is something we should try to avoid?
|
That is correct.
This would require approximately 2e6 surfaces or ~4.2e7 triangles. I believe daemon would fail elsewhere long before that point.
Yes, but it's not a big deal on map load. I also have some plans for changes that will slightly change the pipeline and eliminate the need for interleaving these writes. |
I don't know what the relation with triangle counts is but Thunder uses 35 MB for the pair of allocations so it would need to be just 3.7x bigger, doesn't seem out of the realm of possibility. |
There is a cvar that affects the buffer sizes. |
Adds `GLStagingBuffer` and `GLBufferCopy()`. `GLStagingBuffer` allows setting buffer storage and map flags to 0 on the destination buffers, which pretty much guarantees that they will go into VRAM. The data is uploaded through mapping `stagingBuffer` and doing a copy of the data on the GPU.
The fact that it's directly correlated with the amount of surfaces in material system. Anyway, I've changed it up a bit, so now it's only uploading and mapping ~3.13mb for these 2 buffers on |
Adds
GLStagingBuffer
andGLBufferCopy()
.GLStagingBuffer
allows setting buffer storage and map flags to 0 on the destination buffers, which pretty much guarantees that they will go into VRAM. The data is uploaded through mappingstagingBuffer
and doing a copy of the data on the GPU.The
stagingBuffer
itself should (still depends on the driver) go into the BAR memory (on Nvidia this shows up asDMA_CACHED
when usingr_glDebugProfile on; r_glDebugMode 7
).Slightly improves material system performance. Currently this is used for all material system buffers except the portal one (since it requires reading from the GPU), and for the 2 geometry cache buffers that are currently used.
Also fixed
GLBuffer.mapped
state being retained on map change/load.