`zstd -1` on atlas caches could double even SSD read speed

Suggestions that have been added to the game.

Moderator: ickputzdirwech

quyxkh
Smart Inserter
Posts: 1027
Joined: Sun May 08, 2016 9:01 am

Post by quyxkh »

It occurred to me to check how zstd would do on the atlas caches; the vanilla one is about 3GB. On my 5-year-old midrange box, `zstd -1` compresses at ~350MB/s to 22% of the original size (~3GB down to ~0.66GB), which is enough to help even SSDs somewhat when writing. The decompression performance is what's ridiculous, though: I get ~1GB/s, so the cache decompresses in less than three seconds with hot caches and in less than five seconds straight off my 150MB/s HDD. That would cut cold-start time noticeably even on SSDs, and on HDDs the difference would be really gratifying.
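A rough way to sanity-check those figures is to model load time as the slower of the disk read and the decompression. This is only a sketch: the function name, the overlapped/sequential models, and the 500 MB/s SATA SSD figure are assumptions for illustration, while the other constants are the ones quoted in the post.

```python
# Back-of-the-envelope cold-start estimate.
# Sizes are in MB, speeds in MB/s. The load-time model itself is an assumption.

def load_seconds(size_mb, ratio, disk_mb_s, decompress_mb_s=None, overlapped=True):
    """Estimate the time to get `size_mb` of atlas data into memory.

    ratio: compressed size / original size (1.0 for an uncompressed file).
    decompress_mb_s: decompressor output rate, or None if nothing is compressed.
    overlapped: True if reading and decompressing run concurrently.
    """
    read_s = size_mb * ratio / disk_mb_s
    if decompress_mb_s is None:
        return read_s
    decompress_s = size_mb / decompress_mb_s
    return max(read_s, decompress_s) if overlapped else read_s + decompress_s

ATLAS_MB = 3000.0     # "about 3GB"
RATIO = 0.22          # "22% of the original size"
DECOMP_MB_S = 1000.0  # "~1GB/s decompression"
HDD_MB_S = 150.0      # "my 150MB/s HDD"
SSD_MB_S = 500.0      # assumption: a typical SATA SSD, not from the post

raw_hdd = load_seconds(ATLAS_MB, 1.0, HDD_MB_S)
zstd_hdd = load_seconds(ATLAS_MB, RATIO, HDD_MB_S, DECOMP_MB_S)
raw_ssd = load_seconds(ATLAS_MB, 1.0, SSD_MB_S)
zstd_ssd = load_seconds(ATLAS_MB, RATIO, SSD_MB_S, DECOMP_MB_S)

print(f"HDD: {raw_hdd:.1f} s uncompressed, {zstd_hdd:.1f} s with zstd -1")
print(f"SSD: {raw_ssd:.1f} s uncompressed, {zstd_ssd:.1f} s with zstd -1")
```

Under these assumptions the HDD case is limited by the disk and the SSD case by the decompressor, which is where the "double even SSD read speed" estimate comes from.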

posila
Factorio Staff
Posts: 5201
Joined: Thu Jun 11, 2015 1:35 pm

Post by posila »

In 0.17.56, there will be a config.ini setting, "compress-sprite-atlas-cache". At the moment it uses compression level 1; we may change it to -1 in the future. It also doesn't use parallel compression or decompression, so I am not sure it will reach 350 MB/s for compression and 1 GB/s for decompression on your PC, but it seemed pretty fast on mine.

quyxkh
Smart Inserter
Posts: 1027
Joined: Sun May 08, 2016 9:01 am

Post by quyxkh »

The `zstd -1` was meant as a command-line quote; it gets the zstd level 1 compression you implemented. _Thanks_! (And also, btw, thanks for whatever magic you worked that made loading the atlas from a hot OS buffer cache so close to instantaneous; that still gets a little internal smile of gratification when it hits.)

dimm
Inserter
Posts: 23
Joined: Tue Jun 06, 2017 12:04 pm

Post by dimm »

The game now loads very quickly. Thanks!

quyxkh
Smart Inserter
Posts: 1027
Joined: Sun May 08, 2016 9:01 am

Post by quyxkh »

Lol just checked out the data rates on the new PCIe SSDs, they're stupid fast. Thanks for indulging us older-hw users on this.

slippycheeze
Filter Inserter
Posts: 587
Joined: Sun Jun 09, 2019 10:40 pm

Post by slippycheeze »

quyxkh wrote:
Fri Jul 26, 2019 2:04 pm
Lol just checked out the data rates on the new PCIe SSDs, they're stupid fast. Thanks for indulging us older-hw users on this.
You are not wrong, but... to quote from Latency at a human scale -- which scales one CPU cycle to one second, and reflects modern system latency:

L3 cache is about one minute away from the CPU, main memory is about four minutes, and the super-fast Optane stuff? 15 minutes for persistent memory, 7 hours for the DC SSD version. 17-ish hours for the average NVMe SSD.

CPU power is vastly, vastly cheaper than any sort of I/O. Provided you don't need a huge dictionary or random access inside pages, compressed data is almost always a performance win, even in main memory.
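To turn the human-scale figures back into real latencies, divide by a clock rate, since one scaled second stands for one CPU cycle. The sketch below assumes a 3 GHz core, which is not stated in the post; the durations are the ones quoted above, and the helper name is made up for illustration.

```python
# Convert "human scale" durations back to real time.
# Scale: one CPU cycle = one second. CLOCK_HZ is an assumed clock rate.

CLOCK_HZ = 3.0e9  # assumption: 3 GHz core, not stated in the post

HUMAN_SCALE_SECONDS = {
    "L3 cache": 60,
    "main memory": 4 * 60,
    "Optane persistent memory": 15 * 60,
    "Optane DC SSD": 7 * 3600,
    "average NVMe SSD": 17 * 3600,
}

def real_nanoseconds(human_seconds, clock_hz=CLOCK_HZ):
    """Each human-scale second is one cycle, so cycles / Hz is real seconds."""
    return human_seconds / clock_hz * 1e9

for name, secs in HUMAN_SCALE_SECONDS.items():
    print(f"{name:26s} {secs:>6d} cycles  ~{real_nanoseconds(secs):>9.1f} ns")
```

At that assumed clock rate the gap between main memory and an average NVMe SSD is a factor of about 255, which is the room compression has to work with.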

Darinth
Filter Inserter
Posts: 323
Joined: Wed Oct 17, 2018 12:17 pm

Post by Darinth »

That's actually a really cool and useful way to explain the gap between the different components in human-readable terms. Thank you for providing that.

Vegemeister
Long Handed Inserter
Posts: 85
Joined: Sun Dec 04, 2016 9:18 pm

Post by Vegemeister »

posila wrote:
Wed Jul 10, 2019 11:40 am
In 0.17.56, there will be a config.ini setting, "compress-sprite-atlas-cache". At the moment it uses compression level 1; we may change it to -1 in the future. It also doesn't use parallel compression or decompression, so I am not sure it will reach 350 MB/s for compression and 1 GB/s for decompression on your PC, but it seemed pretty fast on mine.
A key property of zstd is that decompression speed is practically independent of compression level. With single-threaded zstd on my machine (Haswell, 4.2 GHz), that looks like:

$ zstd -b1 -e10 atlas-cache.dat
 1#atlas-cache.dat   : 992084233 -> 276101141 (3.593), 579.7 MB/s ,1467.1 MB/s 
 2#atlas-cache.dat   : 992084233 -> 271624627 (3.652), 454.6 MB/s ,1433.9 MB/s 
 3#atlas-cache.dat   : 992084233 -> 267384342 (3.710), 300.7 MB/s ,1381.9 MB/s 
 4#atlas-cache.dat   : 992084233 -> 265075857 (3.743), 228.0 MB/s ,1352.3 MB/s 
 5#atlas-cache.dat   : 992084233 -> 260563757 (3.807),  88.0 MB/s ,1327.7 MB/s 
 6#atlas-cache.dat   : 992084233 -> 259328406 (3.826),  70.0 MB/s ,1299.0 MB/s 
 7#atlas-cache.dat   : 992084233 -> 256378987 (3.870),  55.4 MB/s ,1435.8 MB/s 
 8#atlas-cache.dat   : 992084233 -> 254566280 (3.897),  54.9 MB/s ,1411.8 MB/s 
 9#atlas-cache.dat   : 992084233 -> 253687259 (3.911),  41.4 MB/s ,1442.4 MB/s 
10#atlas-cache.dat   : 992084233 -> 249708698 (3.973),  34.7 MB/s ,1396.7 MB/s 
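Dividing the decompression speed by the compression ratio gives the disk read speed needed to keep the CPU fed. For levels 1 through 10, I get:

Level  1: 408.32 MB/s
Level  2: 392.63 MB/s
Level  3: 372.48 MB/s
Level  4: 361.29 MB/s
Level  5: 348.75 MB/s
Level  6: 339.52 MB/s
Level  7: 371.01 MB/s
Level  8: 362.28 MB/s
Level  9: 368.81 MB/s
Level 10: 351.55 MB/s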
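For anyone who wants to repeat the division on their own benchmark output, here is a minimal sketch. The ratios and decompression speeds are copied from the `zstd -b1 -e10` run above; the function name is made up for illustration.

```python
# (level, compression ratio, decompression MB/s) copied from the benchmark output.
BENCH = [
    (1, 3.593, 1467.1),
    (2, 3.652, 1433.9),
    (3, 3.710, 1381.9),
    (4, 3.743, 1352.3),
    (5, 3.807, 1327.7),
    (6, 3.826, 1299.0),
    (7, 3.870, 1435.8),
    (8, 3.897, 1411.8),
    (9, 3.911, 1442.4),
    (10, 3.973, 1396.7),
]

def required_disk_mb_s(ratio, decompress_mb_s):
    """Compressed-input rate the disk must sustain so the decompressor never waits."""
    return decompress_mb_s / ratio

for level, ratio, decompress_mb_s in BENCH:
    print(f"level {level:2d}: {required_disk_mb_s(ratio, decompress_mb_s):7.2f} MB/s")
```

Any disk slower than the printed figure is the bottleneck at that level, so a higher compression ratio translates directly into a shorter read.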

Under the usual assumption that the cache is read much more often than it is written, it seems you could go to an arbitrarily high compression level and still benefit HDD users. You probably shouldn't go higher than 3 or 4, though, to avoid irritating people with fast SSDs too much at cache creation time.
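The read-versus-write trade-off can be put into rough numbers. The sketch below is a simplified model, not a measurement: it assumes the 150 MB/s HDD from the first post, assumes the load is purely disk-bound, treats the benchmark's MB as 10^6 bytes, and uses helper names invented for illustration. Sizes and compression speeds come from the benchmark output above.

```python
SRC_BYTES = 992_084_233  # uncompressed atlas-cache.dat size from the benchmark
HDD_MB_S = 150.0         # assumption: HDD speed quoted in the first post

# level: (compressed bytes, compression speed in MB/s), from the benchmark output
LEVELS = {
    1: (276_101_141, 579.7),
    3: (267_384_342, 300.7),
    10: (249_708_698, 34.7),
}

def compress_seconds(level):
    """One-off cost of writing the cache at this level (CPU time only)."""
    return SRC_BYTES / 1e6 / LEVELS[level][1]

def hdd_read_seconds(level, disk_mb_s=HDD_MB_S):
    """Per-start cost of reading the compressed cache, assuming a disk-bound load."""
    return LEVELS[level][0] / 1e6 / disk_mb_s

def break_even_loads(level, baseline=1):
    """Cold starts needed before the extra compression time is repaid."""
    extra_write = compress_seconds(level) - compress_seconds(baseline)
    saved_per_load = hdd_read_seconds(baseline) - hdd_read_seconds(level)
    return extra_write / saved_per_load

for level in (3, 10):
    print(f"level {level}: break-even after ~{break_even_loads(level):.0f} cold starts")
```

In this model, level 3 pays for itself after roughly 27 cold starts on the HDD and level 10 after roughly 150. A fast SSD sees no read benefit at all, because decompression is the bottleneck there.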
