working hires sprites on 2GB VRAM
Posted: Tue May 01, 2018 10:34 am
Hey,
I'm able to run hires sprites on my flaky Radeon R9 270X with 2GB VRAM. The game shows a warning when you select hires sprites, but with the right settings it runs at 60 FPS. I don't know how it behaves with the NVIDIA drivers, or on Linux. I expect the Linux Mesa drivers to perform better than Windows with its binary Radeon driver.
I hope this post is useful to someone. See the end of the post for the exact settings. Here's my attempt at an explanation for each option. Sadly it's technical. I'm going to play with atlas sizing some more to squeeze more out of the available VRAM.
0. After VRAM is exhausted, the GPU will release whole atlases, not parts of them. So it's better to make smaller atlases, but more of them. A re-upload costs precious latency. It's similar to downloading from GPU memory, which forces a long wait if done synchronously (just like an upload has to be in this case). PCIe bandwidth is high, but so is its latency.
1. My CPU usage on a modern Core i5 is 60% of one core. Options that cost CPU time through GPU driver overhead won't leave the game starved for CPU. Any Haswell or newer CPU should handle it.
2. Create specialized atlases: See (0).
3. Optimize atlas sprite packing: a) drawing in retained mode isn't all that expensive, b) visually inspecting the atlases via the debug key shows a major difference in their sizes.
4. Low VRAM mode: it's actually very sensible to lazy-load sprites. Loading each and every one at startup will choke without a large amount of VRAM. A decent share of them may never be used, especially with mods.
5. Texture compression: From Wikipedia: "There are five variations of the S3TC algorithm [...] resulting in compression ratios of 6:1 with 24-bit RGB input data or 4:1 with 32-bit RGBA input data." I recommend it under all circumstances. GPUs decode S3TC directly in hardware.
6. Atlas texture size: See point (0).
7. Sprite resolution: High. That's what we came for!
8. Render threads: Setting it to the max value strangely helps a lot. The game's programmers seem to be very competent, given the difficulties of multicore rendering.
9. Video memory usage: I don't understand why it has to be set like that. I thought "High" scales with max VRAM, and that it should be the optimal value. But it lowers FPS to ~40.
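To put rough numbers on points (0), (5) and (6), here's a back-of-the-envelope sketch in Python. It's only arithmetic, not anything taken from the game. The atlas side lengths are examples I picked, and the function names are made up. I also don't know which S3TC variant the game actually uses, so I show both the 4 bits/pixel (DXT1) and the 8 bits/pixel (DXT3/DXT5) cases. Mipmaps and everything else living in VRAM are ignored, so the real counts will be lower.

```python
MIB = 1024 * 1024

# Bits per pixel for each storage format.
# S3TC stores fixed-size 4x4 pixel blocks: 8 bytes (DXT1) or 16 bytes (DXT3/DXT5).
BPP = {
    "RGBA8 (uncompressed)": 32,
    "S3TC DXT1": 4,
    "S3TC DXT5": 8,
}


def atlas_bytes(side_px: int, bits_per_pixel: int) -> int:
    """VRAM footprint of one square atlas, without mipmaps."""
    return side_px * side_px * bits_per_pixel // 8


def atlases_in_budget(budget_bytes: int, side_px: int, bits_per_pixel: int) -> int:
    """How many whole atlases of this size fit into the given budget."""
    return budget_bytes // atlas_bytes(side_px, bits_per_pixel)


if __name__ == "__main__":
    # Assumption: the full 2 GiB is available for atlases (it never is in practice).
    budget = 2048 * MIB
    for side in (2048, 4096, 8192):
        for name, bpp in BPP.items():
            size_mib = atlas_bytes(side, bpp) / MIB
            count = atlases_in_budget(budget, side, bpp)
            print(f"{side}x{side} {name}: {size_mib:g} MiB each, {count} fit")
```

A 4096x4096 atlas is 64 MiB uncompressed but 16 MiB at 8 bits/pixel, which is the 4:1 ratio from the Wikipedia quote. It also shows why atlas size matters for point (0): if whole atlases get thrown out, losing one 8192x8192 atlas means re-uploading four times as much data as losing one 4096x4096 atlas.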