better compression algorithm for saves and mods
using XZ compression and default settings from GNU tar, recompressing all elements in a desync report cut the size in half which is the difference between being able to share it with the developers, or being unable to.
my LTE connection is throttled to 64KiB/s once I hit my bandwidth limit, and after that it's a pain to connect to the server because of the save file size, which can grow to ~74MiB (the largest I have produced, personally). I've seen other players with even larger, 115MiB save files - and this is without replay.dat enabled, too.
using XZ compression can reduce a 60MiB save file down to just 12MiB.
there are better compression algorithms out there. for example, zstd supports custom dictionaries: we could certainly build a dictionary for zstd that is tailored to factorio data, and this would help reduce save file size even further.
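To put the numbers above in perspective, here is a rough sketch of the download time at the throttled rate. It only uses the figures quoted in the post (64KiB/s, 60MiB, 12MiB); the function name and labels are made up for illustration, and it ignores protocol overhead and catch-up traffic.

```python
KIB = 1024
MIB = 1024 * KIB


def transfer_seconds(size_bytes: int, rate_bytes_per_s: int) -> float:
    """Time to move size_bytes over a link that sustains rate_bytes_per_s."""
    return size_bytes / rate_bytes_per_s


throttled_rate = 64 * KIB  # throttled LTE rate quoted in the post

for label, size in [("60MiB save", 60 * MIB), ("12MiB save after xz", 12 * MIB)]:
    secs = transfer_seconds(size, throttled_rate)
    print(f"{label}: {secs:.0f} s (~{secs / 60:.1f} min)")
```

At that rate the 60MiB file takes 960 seconds (16 minutes) and the 12MiB file takes 192 seconds (a bit over 3 minutes).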
Re: better compression algorithm for saves and mods
So just re-compress it?
I won’t be surprised if the default compression algorithm and strength is chosen more for its speed (when saving) than its effectiveness and it’s really more of a way to package multiple files into a single file than about saving space.
Re: better compression algorithm for saves and mods
Nemo4809 wrote: ↑Thu Mar 12, 2020 8:14 pm "So just re-compress it? I won’t be surprised if the default compression algorithm and strength is chosen more for its speed (when saving) than its effectiveness and it’s really more of a way to package multiple files into a single file than about saving space."
well, have you ever looked inside a save file or desync report? there's some dat files that are probably 3-4x the size of the ZIP file and they do compress nicely, but we can do better.
the referenced compression algorithms are competitive for speed and multi-core capabilities. much more so than zip.
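Speed and ratio claims like these are easy to check on your own data. Below is a minimal sketch using only Python's standard library, so it compares deflate (the algorithm zip normally uses) against xz/LZMA; zstd is not included because it needs a third-party package. The sample payload is a made-up stand-in, not real Factorio data, so swap in the bytes of an actual .dat file before drawing conclusions.

```python
import lzma
import time
import zlib


def measure(name, compress, data):
    """Return (name, compressed size in bytes, seconds taken) for one compressor."""
    start = time.perf_counter()
    packed = compress(data)
    elapsed = time.perf_counter() - start
    return name, len(packed), elapsed


def compare(data: bytes):
    """Compress the same payload with deflate and xz at mid-range settings."""
    return [
        measure("deflate (zlib level 6)", lambda d: zlib.compress(d, 6), data),
        measure("xz (lzma preset 6)", lambda d: lzma.compress(d, preset=6), data),
    ]


# Hypothetical stand-in payload; replace with the contents of a real save file.
sample = b"".join(
    b"entity %d at (%d, %d)\n" % (i % 50, i, (i * 7) % 1000) for i in range(50_000)
)

for name, size, secs in compare(sample):
    print(f"{name}: {len(sample)} -> {size} bytes in {secs:.3f} s")
```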
Re: better compression algorithm for saves and mods
ptx0 wrote: ↑Thu Mar 12, 2020 8:38 pm "well, have you ever looked inside a save file or desync report? there's some dat files that are probably 3-4x the size of the ZIP file and they do compress nicely, but we can do better. the referenced compression algorithms are competitive for speed and multi-core capabilities. much more so than zip."
Oh. Thought he was manually sending stuff.
Well, I'm guessing then the developers went with zip for speed - I looked up xz; it uses LZMA compression which is slower than zip - and figured the compression ratio was good enough vs processing time required. Maybe they can add a separate compressor for things that need to be transferred over the internet as an option - for people with really tight bandwidth limitations/data caps.
Re: better compression algorithm for saves and mods
Do keep in mind that tar xz makes a solid archive. The individual files contained within are not compressed separately, and to read any individual file, all preceding files may need to be decompressed (or at least the decompressor must process the data for them to be in the appropriate state). It would be interesting to know what size the data would be if you put it in a tar without compression then in a zip file.
That is to say it may not be the algorithm as much as how it is applied.
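The experiment suggested above can be sketched with Python's standard library. This builds the same set of files three ways: a normal zip (each member deflated separately), an uncompressed tar stored inside a zip (solid, but still deflate), and a tar.xz. The file names and contents are invented placeholders, so the printed sizes say nothing about real saves until you feed it actual save contents.

```python
import io
import lzma
import tarfile
import zipfile


def make_files():
    """Hypothetical stand-ins for the files inside a save."""
    return {
        f"level-{i}.dat": (b"chunk %d: " % i) + b"iron-plate copper-plate " * 2000
        for i in range(20)
    }


def zip_per_file(files) -> int:
    """Regular zip: every member is deflated on its own."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return len(buf.getvalue())


def plain_tar(files) -> bytes:
    """Uncompressed tar containing all members back to back."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tf:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def tar_inside_zip(files) -> int:
    """Solid archive, same deflate algorithm: one tar as the only zip member."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("save.tar", plain_tar(files))
    return len(buf.getvalue())


def tar_xz(files) -> int:
    """Solid archive with xz, roughly what tar with xz compression produces."""
    return len(lzma.compress(plain_tar(files)))


files = make_files()
print("zip, per-file deflate:", zip_per_file(files))
print("tar inside zip       :", tar_inside_zip(files))
print("tar.xz               :", tar_xz(files))
```

If the tar-inside-zip size lands close to the tar.xz size on real data, most of the gain comes from solid archiving rather than from the algorithm; if it stays close to the regular zip, the algorithm is doing the work.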
Re: better compression algorithm for saves and mods
so i was playing space exploration and the 300MiB map really makes multiplayer difficult, but XZ compression - though slower - reduces the save file and makes it playable. but sharing singleplayer maps with friends for remote play is getting old, can we have an option to use a different algorithm? I suppose it doesn't even need to be the ones I've proposed. and it needn't be default option, much like the unsupported "non-blocking save", if the performance is such a concern - but it's not a major performance difference, in my testing.
Re: better compression algorithm for saves and mods
can we get some kind of official feedback on this idea? i could even submit the PR myself.
Re: better compression algorithm for saves and mods
Just out of curiosity, what/are there any compressed file formats other than zip Windows/MacOS can open natively? (Not saying that everyone shouldn't have 7-zip or something installed....) It's nice to be able to just open them natively on the OS level (although granted the people who will be manually opening Factorio saves are probably the same group of people who know how to use 7-zip). Probably not a good idea to use a different format for local saving vs. multiplayer and desyncs.
There are 10 types of people: those who get this joke and those who don't.
Re: better compression algorithm for saves and mods
Jap2.0 wrote: ↑Sun May 03, 2020 6:17 pm "Just out of curiosity, what/are there any compressed file formats other than zip Windows/MacOS can open natively? (Not saying that everyone shouldn't have 7-zip or something installed....) It's nice to be able to just open them natively on the OS level (although granted the people who will be manually opening Factorio saves are probably the same group of people who know how to use 7-zip). Probably not a good idea to use a different format for local saving vs. multiplayer and desyncs."
the native zip/cab browser/archiver isn't even recommended for production use due to numerous issues with it: https://superuser.com/questions/476740/ ... 366#481366
https://superuser.com/questions/566123/ ... 207#566207
Re: better compression algorithm for saves and mods
I just did a quick check on a 200+MB save, a .tar.xz wound up only 25% smaller than the .zip. A desync report is basically just two saves and some notes, right? Doesn't look like the astoundingly-slow compression will save enough to get many reports across any particular feasibility barrier. xz produces <1MB/s of compressed data here, multiplayer compression is counterproductive if you can't keep the pipes full. I'd call it "not competitive here".
Factorio maps look like one of the cases where zstd is only marginally better at compressing: it'll do somewhat worse a lot faster, or about the same a little faster. So it's better on this workload by pretty much any metric, but not by a whole lot; going on these quick checks I wouldn't see much value in switching.
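The "keep the pipes full" point can be put as a simple model: if compressed data is streamed to the client as it is produced, the slower of the compressor and the link sets the pace. The sketch below uses the figures from this post (a 200MB zip, a .tar.xz 25% smaller, xz emitting about 1MB/s); the 20MB/s zip output rate and both link speeds are assumptions picked purely for illustration.

```python
MB = 1_000_000


def streamed_join_seconds(compressed_bytes, compressor_out_rate, link_rate):
    """Time to deliver a save when compression output is streamed to the link.

    The slower stage (compressor output or network link) is the bottleneck.
    """
    return compressed_bytes / min(compressor_out_rate, link_rate)


zip_size, xz_size = 200 * MB, 150 * MB  # xz result is 25% smaller, as measured above
zip_rate = 20 * MB                      # assumed zip output rate, not measured
xz_rate = 1 * MB                        # "<1MB/s of compressed data" from the post

for link_name, link_rate in [("slow link, 64 kB/s", 64_000), ("fast link, 10 MB/s", 10 * MB)]:
    t_zip = streamed_join_seconds(zip_size, zip_rate, link_rate)
    t_xz = streamed_join_seconds(xz_size, xz_rate, link_rate)
    print(f"{link_name}: zip {t_zip:.0f} s, xz {t_xz:.0f} s")
```

Under these assumptions xz wins on the slow link, where the network is the bottleneck either way, and loses on the fast link, where the compressor cannot feed the connection quickly enough.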
Re: better compression algorithm for saves and mods
quyxkh wrote: ↑Mon May 04, 2020 1:48 am "I just did a quick check on a 200+MB save, a .tar.xz wound up only 25% smaller than the .zip. A desync report is basically just two saves and some notes, right? Doesn't look like the astoundingly-slow compression will save enough to get many reports across any particular feasibility barrier. xz produces <1MB/s of compressed data here, multiplayer compression is counterproductive if you can't keep the pipes full. I'd call it "not competitive here"."
saving can be set up to only occur on the server, it can even be non-blocking save. throughput is a non-issue.
Re: better compression algorithm for saves and mods
+1.
I can definitely see a benefit of better compression for multiplayer in heavily modded games (the question though is "what is better?"). Space Exploration games easily go over 100 MB, and sometimes reach even 300MB. It is very painful to connect to them, and I saw many people stopping playing only because they either could not connect or it was taking too much time.
Re: better compression algorithm for saves and mods
i like this idea
i have a friend with bad internet and he can't play big maps with me and our other friends (he left our SE game when the save was about 70+ MB; by the end we had about 150 MB), so if i could set a server option for better compression, that would be nice
Re: better compression algorithm for saves and mods
I vote for compressing saves only for multiplayer games.
In my single player games just store the uncompressed data to accelerate saving.
Re: better compression algorithm for saves and mods
writing hundreds of MiB/s to disk can be a big problem. compression means writing less to disk, so it can be faster. serializing is a b i g issue with save time, it's the huge pause before the save progress bar begins to fill. of course an option to disable compression would be useful, though.
Re: better compression algorithm for saves and mods
ptx0 wrote: ↑Fri May 08, 2020 7:16 pm "writing hundreds of MiB/s to disk can be a big problem. compression means writing less to disk, so it can be faster. serializing is a b i g issue with save time, it's the huge pause before the save progress bar begins to fill. of course an option to disable compression would be useful, though."
The pause before the bar begins to fill is saving force data, logistic network data, path-finder data, and active entity order on chunks. It's all memory/latency bound and not related to compression or disk access speeds.
If you want to get ahold of me I'm almost always on Discord.
Re: better compression algorithm for saves and mods
* No non-blocking saving on Windows last I checked
* Throughput might not be an issue for non-blocking autosaves, but potentially multiple minutes is unacceptable if it's blocking.
* How long the saving takes directly impacts how long it will take to join a server - both for the initial wait, and the amount of catching up necessary.
Re: better compression algorithm for saves and mods
Jap2.0 wrote: ↑Fri May 08, 2020 11:20 pm
* No non-blocking saving on Windows last I checked
* Throughput might not be an issue for non-blocking autosaves, but potentially multiple minutes is unacceptable if it's blocking.
* How long the saving takes directly impacts how long it will take to join a server - both for the initial wait, and the amount of catching up necessary.
well it doesn't exist on windows and yet the option exists for people who need it, just like compression could be. and in several situations, CPU time and wall clock time are cheaper than network bandwidth. I have data caps.
Re: better compression algorithm for saves and mods
ptx0 wrote: ↑Sat May 09, 2020 1:22 am "well it doesn't exist on windows and yet the option exists for people who need it, just like compression could be."
So can you tell me exactly when this is going to be default, optional, and not available? (Obviously decompression should be available everywhere.)
ptx0 wrote: ↑Sat May 09, 2020 1:22 am "and in several situations, CPU time and wall clock time are cheaper than network bandwidth. I have data caps."
It's tradeoffs all the way down. How much time is it worth? A minute? Five? Ten? An hour? There's no way to say that one setting will be better or worse for everyone. Keep in mind that the longer you wait to load the save, the more catch-up data that has to be sent over the network.