Implement staging buffer and use it for the material system #1675

VReaperV · 2025-06-08T08:22:47Z

Adds GLStagingBuffer and GLBufferCopy(). GLStagingBuffer allows setting buffer storage and map flags to 0 on the destination buffers, which pretty much guarantees that they will go into VRAM. The data is uploaded through mapping stagingBuffer and doing a copy of the data on the GPU.

The stagingBuffer itself should (still depends on the driver) go into the BAR memory (on Nvidia this shows up as DMA_CACHED when using r_glDebugProfile on; r_glDebugMode 7).

Slightly improves material system performance. Currently this is used for all material system buffers except the portal one (since it requires reading from the GPU), and for the 2 geometry cache buffers that are currently used.

Also fixed GLBuffer.mapped state being retained on map change/load.

VReaperV · 2025-06-08T09:05:27Z

The CI error is caused by some unrelated bullshit.

slipher · 2025-06-08T09:47:50Z

FAILED: daemon 
: && /usr/bin/g++ -std=gnu++14 -fvisibility=hidden -fno-strict-aliasing -fno-gnu-unique -pthread -fPIC -march=x86-64 -mtune=generic -mcx16 -O3 -DNDEBUG -D_FORTIFY_SOURCE=2 -Wl,-O1 -Wl,--sort-common -Wl,--as-needed -Wl,--no-undefined -Wl,-z,relro -Wl,-z,now CMakeFiles/client-objects.dir/GeneratedSource/DaemonBuildInfo/Engine.cpp.o CMakeFiles/client-objects.dir/src/engine/qcommon/cmd.cpp.o CMakeFiles/client-objects.dir/src/engine/qcommon/common.cpp.o CMakeFiles/client-objects.dir/src/engine/qcommon/crypto.cpp.o CMakeFiles/client-objects.dir/src/engine/qcommon/cvar.cpp.o CMakeFiles/client-objects.dir/src/engine/qcommon/files.cpp.o CMakeFiles/client-objects.dir/src/engine/qcommon/huffman.cpp.o CMakeFiles/client-objects.dir/src/engine/qcommon/msg.cpp.o CMakeFiles/client-objects.dir/src/engine/qcommon/net_chan.cpp.o CMakeFiles/client-objects.dir/src/engine/qcommon/net_ip.cpp.o CMakeFiles/client-objects.dir/src/engine/qcommon/translation.cpp.o CMakeFiles/client-objects.dir/src/engine/server/sv_bot.cpp.o CMakeFiles/client-objects.dir/src/engine/server/sv_ccmds.cpp.o CMakeFiles/client-objects.dir/src/engine/server/sv_client.cpp.o CMakeFiles/client-objects.dir/src/engine/server/sv_init.cpp.o CMakeFiles/client-objects.dir/src/engine/server/sv_main.cpp.o CMakeFiles/client-objects.dir/src/engine/server/sv_net_chan.cpp.o CMakeFiles/client-objects.dir/src/engine/server/sv_sgame.cpp.o CMakeFiles/client-objects.dir/src/engine/server/sv_snapshot.cpp.o CMakeFiles/client-objects.dir/src/engine/server/CryptoChallenge.cpp.o CMakeFiles/client-objects.dir/src/engine/client/cl_avi.cpp.o CMakeFiles/client-objects.dir/src/engine/client/cl_cgame.cpp.o CMakeFiles/client-objects.dir/src/engine/client/cl_console.cpp.o CMakeFiles/client-objects.dir/src/engine/client/cl_download.cpp.o CMakeFiles/client-objects.dir/src/engine/client/cl_input.cpp.o CMakeFiles/client-objects.dir/src/engine/client/cl_main.cpp.o CMakeFiles/client-objects.dir/src/engine/client/cl_parse.cpp.o CMakeFiles/client-objects.dir/src/engine/client/cl_scrn.cpp.o CMakeFiles/client-objects.dir/src/engine/client/cl_serverlist.cpp.o CMakeFiles/client-objects.dir/src/engine/client/cl_serverstatus.cpp.o CMakeFiles/client-objects.dir/src/engine/client/dl_main.cpp.o CMakeFiles/client-objects.dir/src/engine/client/hunk_allocator.cpp.o CMakeFiles/client-objects.dir/src/engine/audio/ALObjects.cpp.o CMakeFiles/client-objects.dir/src/engine/audio/Audio.cpp.o CMakeFiles/client-objects.dir/src/engine/audio/Emitter.cpp.o CMakeFiles/client-objects.dir/src/engine/audio/OggCodec.cpp.o CMakeFiles/client-objects.dir/src/engine/audio/OpusCodec.cpp.o CMakeFiles/client-objects.dir/src/engine/audio/Sample.cpp.o CMakeFiles/client-objects.dir/src/engine/audio/Sound.cpp.o CMakeFiles/client-objects.dir/src/engine/audio/SoundCodec.cpp.o CMakeFiles/client-objects.dir/src/engine/audio/WavCodec.cpp.o CMakeFiles/client-objects.dir/src/engine/client/cl_keys.cpp.o CMakeFiles/client-objects.dir/src/engine/client/key_binding.cpp.o CMakeFiles/client-objects.dir/src/engine/client/key_identification.cpp.o CMakeFiles/client-objects.dir/src/engine/sys/sdl_input.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/DetectGLVendors.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/gl_shader.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/shaders.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_animation.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_backend.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_bsp.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_cmds.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_curve.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_fbo.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_font.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/GeometryCache.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/GeometryOptimiser.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/GLMemory.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/InternalImage.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/Material.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/TextureManager.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_image.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_image_crn.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_image_dds.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_image_jpg.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_image_ktx.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_image_png.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_image_tga.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_image_webp.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_init.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_light.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_main.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_marks.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_mesh.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_model.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_model_iqm.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_model_md3.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_model_md5.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_model_skel.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_noise.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_scene.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_shade.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_shader.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_shade_calc.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_skin.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_sky.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_surface.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_vbo.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_video.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_world.cpp.o CMakeFiles/client-objects.dir/src/engine/sys/sdl_glimp.cpp.o CMakeFiles/client.dir/src/engine/client/ClientApplication.cpp.o -o daemon  libengine-lib.a  -lncursesw  /usr/lib/x86_64-linux-gnu/libformw.so  /usr/lib/x86_64-linux-gnu/libGL.so  /usr/lib/x86_64-linux-gnu/libSDL2main.a  /usr/lib/x86_64-linux-gnu/libSDL2.so  /usr/lib/x86_64-linux-gnu/libogg.so  /usr/lib/x86_64-linux-gnu/libvorbisfile.so  /usr/lib/x86_64-linux-gnu/libvorbis.so  /usr/lib/libopusfile.so  /usr/lib/x86_64-linux-gnu/libopus.so  /usr/lib/x86_64-linux-gnu/libogg.so  /usr/lib/x86_64-linux-gnu/libvorbisfile.so  /usr/lib/x86_64-linux-gnu/libvorbis.so  /usr/lib/libopusfile.so  /usr/lib/x86_64-linux-gnu/libopus.so  /usr/lib/x86_64-linux-gnu/libwebp.so  /usr/lib/x86_64-linux-gnu/libjpeg.so  /usr/lib/x86_64-linux-gnu/libpng.so  /usr/lib/x86_64-linux-gnu/libfreetype.so  /usr/lib/x86_64-linux-gnu/libGLEW.so  /usr/lib/x86_64-linux-gnu/libopenal.so  /usr/lib/x86_64-linux-gnu/libcurl.so  libsrclibs-mumblelink.a  libsrclibs-findlocale.a  /usr/lib/x86_64-linux-gnu/libhogweed.so  /usr/lib/x86_64-linux-gnu/libnettle.so  /usr/lib/x86_64-linux-gnu/libgmp.so  libsrclibs-nacl-native.a  -lm  /usr/lib/x86_64-linux-gnu/librt.a  -ldl  /usr/lib/x86_64-linux-gnu/libz.so  libsrclibs-minizip.a && :
/usr/bin/ld: CMakeFiles/client-objects.dir/src/engine/renderer/GLMemory.cpp.o: in function `GLStagingBuffer::MapBuffer(long)':
GLMemory.cpp:(.text+0x555): undefined reference to `GLStagingBuffer::SIZE'

That sounds related

VReaperV · 2025-06-08T09:59:10Z

FAILED: daemon 
: && /usr/bin/g++ -std=gnu++14 -fvisibility=hidden -fno-strict-aliasing -fno-gnu-unique -pthread -fPIC -march=x86-64 -mtune=generic -mcx16 -O3 -DNDEBUG -D_FORTIFY_SOURCE=2 -Wl,-O1 -Wl,--sort-common -Wl,--as-needed -Wl,--no-undefined -Wl,-z,relro -Wl,-z,now CMakeFiles/client-objects.dir/GeneratedSource/DaemonBuildInfo/Engine.cpp.o CMakeFiles/client-objects.dir/src/engine/qcommon/cmd.cpp.o CMakeFiles/client-objects.dir/src/engine/qcommon/common.cpp.o CMakeFiles/client-objects.dir/src/engine/qcommon/crypto.cpp.o CMakeFiles/client-objects.dir/src/engine/qcommon/cvar.cpp.o CMakeFiles/client-objects.dir/src/engine/qcommon/files.cpp.o CMakeFiles/client-objects.dir/src/engine/qcommon/huffman.cpp.o CMakeFiles/client-objects.dir/src/engine/qcommon/msg.cpp.o CMakeFiles/client-objects.dir/src/engine/qcommon/net_chan.cpp.o CMakeFiles/client-objects.dir/src/engine/qcommon/net_ip.cpp.o CMakeFiles/client-objects.dir/src/engine/qcommon/translation.cpp.o CMakeFiles/client-objects.dir/src/engine/server/sv_bot.cpp.o CMakeFiles/client-objects.dir/src/engine/server/sv_ccmds.cpp.o CMakeFiles/client-objects.dir/src/engine/server/sv_client.cpp.o CMakeFiles/client-objects.dir/src/engine/server/sv_init.cpp.o CMakeFiles/client-objects.dir/src/engine/server/sv_main.cpp.o CMakeFiles/client-objects.dir/src/engine/server/sv_net_chan.cpp.o CMakeFiles/client-objects.dir/src/engine/server/sv_sgame.cpp.o CMakeFiles/client-objects.dir/src/engine/server/sv_snapshot.cpp.o CMakeFiles/client-objects.dir/src/engine/server/CryptoChallenge.cpp.o CMakeFiles/client-objects.dir/src/engine/client/cl_avi.cpp.o CMakeFiles/client-objects.dir/src/engine/client/cl_cgame.cpp.o CMakeFiles/client-objects.dir/src/engine/client/cl_console.cpp.o CMakeFiles/client-objects.dir/src/engine/client/cl_download.cpp.o CMakeFiles/client-objects.dir/src/engine/client/cl_input.cpp.o CMakeFiles/client-objects.dir/src/engine/client/cl_main.cpp.o CMakeFiles/client-objects.dir/src/engine/client/cl_parse.cpp.o CMakeFiles/client-objects.dir/src/engine/client/cl_scrn.cpp.o CMakeFiles/client-objects.dir/src/engine/client/cl_serverlist.cpp.o CMakeFiles/client-objects.dir/src/engine/client/cl_serverstatus.cpp.o CMakeFiles/client-objects.dir/src/engine/client/dl_main.cpp.o CMakeFiles/client-objects.dir/src/engine/client/hunk_allocator.cpp.o CMakeFiles/client-objects.dir/src/engine/audio/ALObjects.cpp.o CMakeFiles/client-objects.dir/src/engine/audio/Audio.cpp.o CMakeFiles/client-objects.dir/src/engine/audio/Emitter.cpp.o CMakeFiles/client-objects.dir/src/engine/audio/OggCodec.cpp.o CMakeFiles/client-objects.dir/src/engine/audio/OpusCodec.cpp.o CMakeFiles/client-objects.dir/src/engine/audio/Sample.cpp.o CMakeFiles/client-objects.dir/src/engine/audio/Sound.cpp.o CMakeFiles/client-objects.dir/src/engine/audio/SoundCodec.cpp.o CMakeFiles/client-objects.dir/src/engine/audio/WavCodec.cpp.o CMakeFiles/client-objects.dir/src/engine/client/cl_keys.cpp.o CMakeFiles/client-objects.dir/src/engine/client/key_binding.cpp.o CMakeFiles/client-objects.dir/src/engine/client/key_identification.cpp.o CMakeFiles/client-objects.dir/src/engine/sys/sdl_input.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/DetectGLVendors.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/gl_shader.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/shaders.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_animation.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_backend.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_bsp.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_cmds.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_curve.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_fbo.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_font.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/GeometryCache.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/GeometryOptimiser.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/GLMemory.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/InternalImage.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/Material.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/TextureManager.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_image.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_image_crn.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_image_dds.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_image_jpg.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_image_ktx.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_image_png.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_image_tga.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_image_webp.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_init.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_light.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_main.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_marks.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_mesh.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_model.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_model_iqm.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_model_md3.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_model_md5.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_model_skel.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_noise.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_scene.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_shade.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_shader.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_shade_calc.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_skin.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_sky.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_surface.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_vbo.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_video.cpp.o CMakeFiles/client-objects.dir/src/engine/renderer/tr_world.cpp.o CMakeFiles/client-objects.dir/src/engine/sys/sdl_glimp.cpp.o CMakeFiles/client.dir/src/engine/client/ClientApplication.cpp.o -o daemon  libengine-lib.a  -lncursesw  /usr/lib/x86_64-linux-gnu/libformw.so  /usr/lib/x86_64-linux-gnu/libGL.so  /usr/lib/x86_64-linux-gnu/libSDL2main.a  /usr/lib/x86_64-linux-gnu/libSDL2.so  /usr/lib/x86_64-linux-gnu/libogg.so  /usr/lib/x86_64-linux-gnu/libvorbisfile.so  /usr/lib/x86_64-linux-gnu/libvorbis.so  /usr/lib/libopusfile.so  /usr/lib/x86_64-linux-gnu/libopus.so  /usr/lib/x86_64-linux-gnu/libogg.so  /usr/lib/x86_64-linux-gnu/libvorbisfile.so  /usr/lib/x86_64-linux-gnu/libvorbis.so  /usr/lib/libopusfile.so  /usr/lib/x86_64-linux-gnu/libopus.so  /usr/lib/x86_64-linux-gnu/libwebp.so  /usr/lib/x86_64-linux-gnu/libjpeg.so  /usr/lib/x86_64-linux-gnu/libpng.so  /usr/lib/x86_64-linux-gnu/libfreetype.so  /usr/lib/x86_64-linux-gnu/libGLEW.so  /usr/lib/x86_64-linux-gnu/libopenal.so  /usr/lib/x86_64-linux-gnu/libcurl.so  libsrclibs-mumblelink.a  libsrclibs-findlocale.a  /usr/lib/x86_64-linux-gnu/libhogweed.so  /usr/lib/x86_64-linux-gnu/libnettle.so  /usr/lib/x86_64-linux-gnu/libgmp.so  libsrclibs-nacl-native.a  -lm  /usr/lib/x86_64-linux-gnu/librt.a  -ldl  /usr/lib/x86_64-linux-gnu/libz.so  libsrclibs-minizip.a && :
/usr/bin/ld: CMakeFiles/client-objects.dir/src/engine/renderer/GLMemory.cpp.o: in function `GLStagingBuffer::MapBuffer(long)':
GLMemory.cpp:(.text+0x555): undefined reference to `GLStagingBuffer::SIZE'

That sounds related

Oh right, it's because of the moronic behaviour with static const in class declarations.

VReaperV · 2025-06-08T10:18:03Z

Well no, it's gotta be caused by something else too, because even with constexpr it still fails.

src/engine/renderer/Material.cpp

slipher · 2025-06-08T11:01:23Z

Yeah I don't think constexpr fixed anything for that case; it just behaves the same as const. In general you need C++17 inline constexpr for constants that truly behave as they ought to

VReaperV · 2025-06-08T12:04:01Z

Yeah I don't think constexpr fixed anything for that case; it just behaves the same as const. In general you need C++17 inline constexpr for constants that truly behave as they ought to

Yeah, it's defaulting to C++14 apparently.

VReaperV · 2025-06-08T12:13:04Z

It'll just have to do with the old redundant way.

slipher

I get crashes in the Nvidia driver if I make the buffer small enough that it can't hold all the data at once. For example SIZE = 20 << 20 for Thunder or SIZE = 2 << 20 for plat23.

src/engine/renderer/GLMemory.cpp

VReaperV · 2025-06-10T11:14:02Z

I get crashes in the Nvidia driver if I make the buffer small enough that it can't hold all the data at once. For example SIZE = 20 << 20 for Thunder or SIZE = 2 << 20 for plat23.

So don't do that?

slipher · 2025-06-10T11:15:55Z

That means all the code to handle pointer + size > SIZE etc. doesn't work.

VReaperV · 2025-06-10T18:17:16Z

Fixed.

slipher · 2025-06-11T07:19:32Z

I still get crashes. For example SIZE = 900000 and load plat23

VReaperV · 2025-06-11T07:37:03Z

I still get crashes. For example SIZE = 900000 and load plat23

I don't get any crashes with that.

I don't get why you keep trying to use it with a different size buffer that it's not supposed to work with, either. I don't care if it breaks due to someone randomly changing the code.

sweet235 · 2025-06-11T18:17:36Z

Even though what I say here might be censored again, let me bother this one time.

I still get crashes. For example SIZE = 900000 and load plat23

Am I understanding this correctly: you change a constant in the source code (to yield a buffer with less than 4MB), and observe a crash? And you believe the source code has to be crafted such that no changes in any compile time constant may result in a crash?

slipher · 2025-06-11T18:40:37Z

I don't get why you keep trying to use it with a different size buffer that it's not supposed to work with, either.

To test all the code paths, since I don't know how to use over 128MB of buffers with existing assets. If the code path for flushing out data to make room for incoming data is not intended to work, we can replace it with Sys::Drop instead.

I tried on AMD+Linux with the same SIZE = 900 * 1000 loading plat23 and it's broken there too. Locked up the system for several minutes and the OOM killer zapped most of the running applications. As on Nvidia (in the brief moment before the latter crashes) the map geometry is not rendered.

VReaperV · 2025-06-11T18:55:03Z

To test all the code paths, since I don't know how to use over 128MB of buffers with existing assets. If the code path for flushing out data to make room for incoming data is not intended to work, we can replace it with Sys::Drop instead.

I tested with a lower size (not 3.6mb) and the wrap-around does work correctly.

I tried on AMD+Linux with the same SIZE = 900 * 1000 loading plat23 and it's broken there too. Locked up the system for several minutes and the OOM killer zapped most of the running applications. As on Nvidia (in the brief moment before the latter crashes) the map geometry is not rendered.

That is not enough to fit all the data being mapped at the end of map loading.

slipher · 2025-06-11T19:01:12Z

Could the maximum allocation be different based on some settings? I logged the sizes and the maximum one for me on plat23 had a count between 800k and 900k. If I put SIZE smaller than the largest single allocation, it calls Sys::Drop as expected. If I make it just slightly larger than the largest single allocation, the map finishes loading but things are broken.

VReaperV · 2025-06-11T20:19:28Z

Could the maximum allocation be different based on some settings?

Why?

I logged the sizes and the maximum one for me on plat23 had a count between 800k and 900k. If I put SIZE smaller than the largest single allocation, it calls Sys::Drop as expected. If I make it just slightly larger than the largest single allocation, the map finishes loading but things are broken.

The point of a staging buffer is to transfer and allocate memory on the GPU efficiently, it's not some memory store that you reallocate randomly.

slipher · 2025-06-12T13:49:35Z

Could the maximum allocation be different based on some settings?

Why?

I thought maybe you are implying that 900000 was not enough for the largest single mapping with the comment, "That is not enough to fit all the data being mapped at the end of map loading."

If I understood the API design of the staging buffer correctly: after allocating a block of memory, you must write its contents and request copying to another buffer before you allocate the next block. So only one allocation is available to use at a time.

VReaperV · 2025-06-12T14:17:06Z

Could the maximum allocation be different based on some settings?

Why?

I thought maybe you are implying that 900000 was not enough for the largest single mapping with the comment, "That is not enough to fit all the data being mapped at the end of map loading."

Not quite: it's two mappings. It wouldn't make sense to fill up those allocations separately, and doing so would be slower. But since they both can't fit a 900000 size buffer at the same time, the first allocation gets flushed before any data is written to it, which is why there's no geometry visible in that case (I cannot reproduce a crash there though).

If I understood the API design of the staging buffer correctly: after allocating a block of memory, you must write its contents and request copying to another buffer before you allocate the next block. So only one allocation is available to use at a time.

It does work with multiple allocations, that's the reason (or part of the reason, anyway) that it's 128mb in size.

slipher · 2025-06-12T17:02:53Z

If I remove the geometry cache commit, I still get the crashes. The geometry cache allocations aren't actually the biggest ones anyway, at least on plat23.

I don't get crashes with the first commit alone. Introducing the staging buffer plus the "cleanup" commit is necessary. If I put back the code that zeroes the culledCommandsSSBO which was removed in the "cleanup" commit, it works. Could filling with zeroes be important? Of course I may just be studying noise which depends on the allocation pattern.

VReaperV · 2025-06-12T18:51:31Z

If I remove the geometry cache commit, I still get the crashes. The geometry cache allocations aren't actually the biggest ones anyway, at least on plat23.

I mean the surfaceDescriptorsSSBO and surfaceCommandsSSBO.

I don't get crashes with the first commit alone. Introducing the staging buffer plus the "cleanup" commit is necessary. If I put back the code that zeroes the culledCommandsSSBO which was removed in the "cleanup" commit, it works. Could filling with zeroes be important? Of course I may just be studying noise which depends on the allocation pattern.

It's most likely just a conesquence of surfaceDescriptorsSSBO/surfaceCommandsSSBO not being filled out correctly. culledCommandsSSBO is only actually written on the GPU, and none of the rendering commands access parts of the buffer that weren't written to. Since you're giving garbage data as input in surfaceDescriptorsSSBO/surfaceCommandsSSBO, it will access garbage in culledCommandsSSBO.

slipher · 2025-06-12T20:17:30Z

Thanks, now I get it. Writes to surfaceDescriptorsSSBO and surfaceCommandsSSBO are interleaved; the initial memset of surfaceCommands is not the final say on its contents. The interleaving causes potential problems since the staging buffer API only guarantees the most recent mapping to stay mapped.

The bug here seems fairly possible to hit in real life by making a huge map. I think we should make sure to prevent it since it has the potential to bring down the user's whole system. How about writing the surface commands to a temporary CPU buffer before writing them to the GL buffer? Besides fixing the bug, that would stop the following code from having to read back data from the staging buffer. I think reading back those buffers is something we should try to avoid?

for ( int i = 0; i < MAX_VIEWFRAMES; i++ ) {
	memcpy( surfaceCommands + surfaceCommandsCount * i, surfaceCommands, surfaceCommandsCount * sizeof( SurfaceCommand ) );
}

VReaperV · 2025-06-12T20:35:12Z

Thanks, now I get it. Writes to surfaceDescriptorsSSBO and surfaceCommandsSSBO are interleaved; the initial memset of surfaceCommands is not the final say on its contents. The interleaving causes potential problems since the staging buffer API only guarantees the most recent mapping to stay mapped.

That is correct.

The bug here seems fairly possible to hit in real life by making a huge map.

This would require approximately 2e6 surfaces or ~4.2e7 triangles. I believe daemon would fail elsewhere long before that point.

Besides fixing the bug, that would stop the following code from having to read back data from the staging buffer. I think reading back those buffers is something we should try to avoid?

Yes, but it's not a big deal on map load. I also have some plans for changes that will slightly change the pipeline and eliminate the need for interleaving these writes.

slipher · 2025-06-12T21:32:04Z

I don't know what the relation with triangle counts is but Thunder uses 35 MB for the pair of allocations so it would need to be just 3.7x bigger, doesn't seem out of the realm of possibility.

slipher · 2025-06-12T21:46:45Z

There is a cvar that affects the buffer sizes. r_subdivisions changes the sizes for the surface stuff and geometry cache. Could affect reproducibility.

Adds `GLStagingBuffer` and `GLBufferCopy()`. `GLStagingBuffer` allows setting buffer storage and map flags to 0 on the destination buffers, which pretty much guarantees that they will go into VRAM. The data is uploaded through mapping `stagingBuffer` and doing a copy of the data on the GPU.

VReaperV · 2025-06-12T23:38:05Z

I don't know what the relation with triangle counts is but Thunder uses 35 MB for the pair of allocations so it would need to be just 3.7x bigger, doesn't seem out of the realm of possibility.

The fact that it's directly correlated with the amount of surfaces in material system.

Anyway, I've changed it up a bit, so now it's only uploading and mapping ~3.13mb for these 2 buffers on thunder.

src/engine/renderer/Material.cpp

VReaperV added this to Material system Jun 8, 2025

VReaperV added T-Bug T-Improvement Improvement for an existing feature A-Renderer T-Performance labels Jun 8, 2025

VReaperV moved this to In Progress in Material system Jun 8, 2025

VReaperV force-pushed the staging-buffer branch 3 times, most recently from 96d8c0d to 88c7d08 Compare June 8, 2025 09:03

VReaperV force-pushed the staging-buffer branch from 88c7d08 to 070c4a2 Compare June 8, 2025 09:10

VReaperV force-pushed the staging-buffer branch from 070c4a2 to 5fc89f2 Compare June 8, 2025 09:58

VReaperV force-pushed the staging-buffer branch from 5fc89f2 to 7f31ee4 Compare June 8, 2025 10:19

github-advanced-security bot found potential problems Jun 8, 2025

View reviewed changes

src/engine/renderer/Material.cpp Dismissed Show dismissed Hide dismissed

src/engine/renderer/Material.cpp Dismissed Show dismissed Hide dismissed

slipher reviewed Jun 10, 2025

View reviewed changes

src/engine/renderer/GLMemory.cpp Outdated Show resolved Hide resolved

VReaperV force-pushed the staging-buffer branch 2 times, most recently from e9827af to bf91241 Compare June 10, 2025 18:17

VReaperV added 4 commits June 13, 2025 01:34

Use staging buffer for the geometry cache

cfc723a

Fix missing static

0f9e3dc

Material system clean-up

e14f912

VReaperV force-pushed the staging-buffer branch from bf91241 to e14f912 Compare June 12, 2025 23:36

github-advanced-security bot found potential problems Jun 12, 2025

View reviewed changes

src/engine/renderer/Material.cpp Dismissed Show dismissed Hide dismissed

Implement staging buffer and use it for the material system #1675

Are you sure you want to change the base?

Implement staging buffer and use it for the material system #1675

Uh oh!

Conversation

VReaperV commented Jun 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

VReaperV commented Jun 8, 2025

Uh oh!

slipher commented Jun 8, 2025

Uh oh!

VReaperV commented Jun 8, 2025

Uh oh!

VReaperV commented Jun 8, 2025

Uh oh!

Uh oh!

Uh oh!

slipher commented Jun 8, 2025

Uh oh!

VReaperV commented Jun 8, 2025

Uh oh!

VReaperV commented Jun 8, 2025

Uh oh!

slipher left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

VReaperV commented Jun 10, 2025

Uh oh!

slipher commented Jun 10, 2025

Uh oh!

VReaperV commented Jun 10, 2025

Uh oh!

slipher commented Jun 11, 2025

Uh oh!

VReaperV commented Jun 11, 2025

Uh oh!

sweet235 commented Jun 11, 2025

Uh oh!

slipher commented Jun 11, 2025

Uh oh!

VReaperV commented Jun 11, 2025

Uh oh!

slipher commented Jun 11, 2025

Uh oh!

VReaperV commented Jun 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

slipher commented Jun 12, 2025

Uh oh!

VReaperV commented Jun 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

slipher commented Jun 12, 2025

Uh oh!

VReaperV commented Jun 12, 2025

Uh oh!

slipher commented Jun 12, 2025

Uh oh!

VReaperV commented Jun 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

slipher commented Jun 12, 2025

Uh oh!

slipher commented Jun 12, 2025

Uh oh!

VReaperV commented Jun 12, 2025

Uh oh!

Uh oh!

Uh oh!

VReaperV commented Jun 8, 2025 •

edited

Loading

VReaperV commented Jun 11, 2025 •

edited

Loading

VReaperV commented Jun 12, 2025 •

edited

Loading

VReaperV commented Jun 12, 2025 •

edited

Loading