`zstd -1` on atlas caches could double even SSD read speed
It occurred to me to check how zstd would do on the atlas caches; the vanilla one's about 3GB. `zstd -1` compresses at ~350MB/s on my 5-year-old midrange box to 22% of the original size, ~3GB to ~0.66GB, which is enough to help even SSDs some when writing. But the decompression performance is ridiculous: I get ~1GB/s, so it takes less than three seconds to decompress with hot caches and less than five seconds straight off my 150MB/s HDD. That'd cut cold-start time noticeably even on SSDs; on HDDs the difference would be really gratifying.
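A quick back-of-envelope check of those figures, as a sketch only. It assumes reading and decompression are streamed so the slower stage sets the total time; the function name and the rounded 3000 MB size are illustrative, not anything from the game or from zstd.

```python
def cold_load_seconds(size_mb, disk_mb_s, ratio=1.0, decompress_mb_s=None):
    """Estimate seconds to load a cache file from disk.

    ratio is compressed size / original size (1.0 means uncompressed).
    Assumption: read and decompress overlap (streaming), so the slower
    of the two stages determines the total time.
    """
    read_time = size_mb * ratio / disk_mb_s
    if decompress_mb_s is None:
        return read_time
    decompress_time = size_mb / decompress_mb_s
    return max(read_time, decompress_time)

SIZE_MB = 3000    # ~3GB vanilla atlas cache, rounded
HDD_MB_S = 150    # HDD read speed quoted in the post

raw = cold_load_seconds(SIZE_MB, HDD_MB_S)
packed = cold_load_seconds(SIZE_MB, HDD_MB_S, ratio=0.22, decompress_mb_s=1000)
print(f"uncompressed: {raw:.1f}s, zstd -1: {packed:.1f}s")
```

Under these assumptions the HDD stays the bottleneck even after compression, which is consistent with the "less than five seconds" observation.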
Re: `zstd -1` on atlas caches could double even SSD read speed
In 0.17.56, there will be a config.ini setting "compress-sprite-atlas-cache". At the moment it uses compression level 1; we may change it to -1 in the future. Also, it doesn't use parallel compression/decompression, so I am not sure it'll reach 350 MB/s and 1 GB/s respectively on your PC, but it seemed pretty fast on mine.
Re: `zstd -1` on atlas caches could double even SSD read speed
The `zstd -1` was meant to be a command quote; that gets the zstd level 1 compression you implemented. _Thanks_! (And also, btw, thanks for whatever magic you worked that made loading the atlas from a hot OS buffer cache so close to instantaneous; that still gets a little internal smile of gratification when it hits.)
Re: `zstd -1` on atlas caches could double even SSD read speed
The game now loads very quickly. Thanks!
Re: `zstd -1` on atlas caches could double even SSD read speed
Lol just checked out the data rates on the new PCIe SSDs, they're stupid fast. Thanks for indulging us older-hw users on this.
Re: `zstd -1` on atlas caches could double even SSD read speed
You are not wrong, but... to quote from Latency at a human scale -- which scales one CPU cycle to one second, and reflects modern system latency:
L3 cache is about one minute away from the CPU, main memory is about four minutes, and the super-fast Optane stuff? 15 minutes for persistent memory, 7 hours for the DC SSD version. 17-ish hours for the average NVMe SSD.
CPU power is vastly, vastly cheaper than any sort of I/O, to the point that as long as you don't need a huge dictionary, compressed data is almost always a performance win even in main memory, so long as you don't need random access inside pages.
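To put the quoted human-scale figures back into machine terms, here is a small sketch. One human-scale second stands for one CPU cycle, as the quote says; the 3 GHz clock used to convert cycles into real time is purely an assumption for illustration.

```python
# Human-scale durations from the quote, where 1 CPU cycle is drawn as 1 second.
HUMAN_SCALE_SECONDS = {
    "L3 cache": 60,                        # about one minute
    "main memory": 4 * 60,                 # about four minutes
    "Optane persistent memory": 15 * 60,   # 15 minutes
    "Optane DC SSD": 7 * 3600,             # 7 hours
    "average NVMe SSD": 17 * 3600,         # 17-ish hours
}

ASSUMED_CLOCK_GHZ = 3.0  # assumption, not from the quote

def to_cycles(human_seconds):
    # One human-scale second represents one CPU cycle.
    return human_seconds

def to_nanoseconds(human_seconds, clock_ghz=ASSUMED_CLOCK_GHZ):
    # A clock of N GHz executes N cycles per nanosecond.
    return to_cycles(human_seconds) / clock_ghz

for name, secs in HUMAN_SCALE_SECONDS.items():
    micros = to_nanoseconds(secs) / 1000
    print(f"{name:26s} {to_cycles(secs):>6d} cycles  ~{micros:.2f} us")
```

On this scale an NVMe access costs about 255 main-memory accesses, which is the gap that leaves so much room for spending CPU cycles on decompression.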
Re: `zstd -1` on atlas caches could double even SSD read speed
slippycheeze wrote: ↑Wed Jul 31, 2019 11:09 pm
You are not wrong, but... to quote from Latency at a human scale -- which scales one CPU cycle to one second, and reflects modern system latency:
L3 cache is about one minute away from the CPU, main memory is about four minutes, and the super-fast Optane stuff? 15 minutes for persistent memory, 7 hours for the DC SSD version. 17-ish hours for the average NVMe SSD.
CPU power is vastly, vastly cheaper than any sort of I/O, to the point that as long as you don't need a huge dictionary, compressed data is almost always a performance win even in main memory, so long as you don't need random access inside pages.
That's actually a really cool and useful way to try and explain the gap between the different components in a human-readable way. Thank you for providing that.
Re: `zstd -1` on atlas caches could double even SSD read speed
posila wrote: ↑Wed Jul 10, 2019 11:40 am
In 0.17.56, there will be config.ini setting "compress-sprite-atlas-cache". At the moment it uses compression level 1, we may change it to -1 in the future. Also it doesn't utilize parallel compression/decompression, so I am not sure if it'll make 350 MB/s resp. 1 GB/s on your PC, but it seemed pretty fast on mine.
A big property of zstd is that decompression speed is practically independent of compression level. With single-threaded zstd on my machine (Haswell, 4.2 GHz), that looks like:
$ zstd -b1 -e10 atlas-cache.dat
1#atlas-cache.dat : 992084233 -> 276101141 (3.593), 579.7 MB/s ,1467.1 MB/s
2#atlas-cache.dat : 992084233 -> 271624627 (3.652), 454.6 MB/s ,1433.9 MB/s
3#atlas-cache.dat : 992084233 -> 267384342 (3.710), 300.7 MB/s ,1381.9 MB/s
4#atlas-cache.dat : 992084233 -> 265075857 (3.743), 228.0 MB/s ,1352.3 MB/s
5#atlas-cache.dat : 992084233 -> 260563757 (3.807), 88.0 MB/s ,1327.7 MB/s
6#atlas-cache.dat : 992084233 -> 259328406 (3.826), 70.0 MB/s ,1299.0 MB/s
7#atlas-cache.dat : 992084233 -> 256378987 (3.870), 55.4 MB/s ,1435.8 MB/s
8#atlas-cache.dat : 992084233 -> 254566280 (3.897), 54.9 MB/s ,1411.8 MB/s
9#atlas-cache.dat : 992084233 -> 253687259 (3.911), 41.4 MB/s ,1442.4 MB/s
10#atlas-cache.dat : 992084233 -> 249708698 (3.973), 34.7 MB/s ,1396.7 MB/s
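Decompression speed divided by compression ratio for levels 1 to 10, i.e. the rate (MB/s) at which compressed data is consumed while decompressing:
408.32
392.63
372.48
361.29
348.75
339.52
371.01
362.28
368.81
351.55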
With the usual assumption about reading the cache being much more frequent than writing it, it seems like you could go to arbitrarily high compression level and still benefit HDD users. Probably shouldn't go higher than 3 or 4, to avoid irritating people with fast SSDs too much at cache creation time.
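The trade-off can be sketched from the benchmark lines themselves. The snippet below parses three of the `zstd -b` result lines above, computes decompression speed divided by ratio (the disk read rate needed to keep the decompressor busy), and estimates load time for a given disk. The helper names are made up for illustration, and two things are assumptions: read and decompress overlap so the slower stage dominates, and MB means 10^6 bytes.

```python
import re

BENCH = """\
1#atlas-cache.dat : 992084233 -> 276101141 (3.593), 579.7 MB/s ,1467.1 MB/s
3#atlas-cache.dat : 992084233 -> 267384342 (3.710), 300.7 MB/s ,1381.9 MB/s
10#atlas-cache.dat : 992084233 -> 249708698 (3.973), 34.7 MB/s ,1396.7 MB/s
"""

LINE = re.compile(
    r"\s*(\d+)#\S+\s*:\s*(\d+)\s*->\s*(\d+)\s*\(([\d.]+)\),"
    r"\s*([\d.]+) MB/s\s*,\s*([\d.]+) MB/s"
)

def parse(text):
    """Turn benchmark result lines into dicts, skipping anything else."""
    rows = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            rows.append({
                "level": int(m[1]), "orig": int(m[2]), "comp": int(m[3]),
                "ratio": float(m[4]), "c_mb_s": float(m[5]), "d_mb_s": float(m[6]),
            })
    return rows

def needed_read_mb_s(row):
    # Compressed bytes consumed per second at full decompression speed.
    return row["d_mb_s"] / row["ratio"]

def load_seconds(row, disk_mb_s):
    # Assumption: streaming, so the slower of read and decompress wins.
    read = row["comp"] / 1e6 / disk_mb_s
    decompress = row["orig"] / 1e6 / row["d_mb_s"]
    return max(read, decompress)

rows = parse(BENCH)
for row in rows:
    print(f"level {row['level']:>2}: needs {needed_read_mb_s(row):.2f} MB/s, "
          f"HDD@150: {load_seconds(row, 150):.2f}s, "
          f"SSD@500: {load_seconds(row, 500):.2f}s")
```

Any disk slower than the "needs" figure is the bottleneck, so on such a disk a higher level only shortens the load; the cost shows up solely in the compression speed at cache creation time.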