Friday Facts #201 - 0.15 Stable, but not really

Regular reports on Factorio development.
cpy
Filter Inserter
Posts: 839
Joined: Thu Jul 31, 2014 5:34 am

Re: Save in the background

Post by cpy »

fechnert wrote:
To not talk about the implementation, but the idea about a quicksave that won't interrupt the game while playing would be very nice
Why not?

ratchetfreak
Filter Inserter
Posts: 950
Joined: Sat May 23, 2015 12:10 pm

Re: Save in the background

Post by ratchetfreak »

mrvn wrote:I'm always annoyed when trying to do something and the saving dialog pops up. And the more you play the larger the problem becomes because saving takes longer on larger factories.

Why not save in the background in a separate thread and let the game keep running?

"All" this would need to do is to make a snapshot of the game state and then save that while the game continues on the main data. Under Linux at least this is simple since you can fork() to get a snapshot of the process's memory. This will affect performance and memory footprint somewhat during save but not stop the game. So UPS might drop while saving but the game won't stop dead.
It'll suddenly double memory consumption as the original program continues updating the world, and then you have to account for all the duplicated file descriptors and make sure that nothing funny happens with them.

cpy
Filter Inserter
Posts: 839
Joined: Thu Jul 31, 2014 5:34 am

Re: Save in the background

Post by cpy »

ratchetfreak wrote:
It'll suddenly double memory consumption as the original program continues updating the world, and then you have to account for all the duplicated file descriptors and make sure that nothing funny happens with them.
Well, it could be an option in the settings for MP server saves only. When I host a Factorio game I have a machine with 32GB RAM, so uninterrupted play with saving in the background would be very nice even with the additional RAM cost.

Jap2.0
Smart Inserter
Posts: 2339
Joined: Tue Jun 20, 2017 12:02 am

Re: Friday Facts #201 - 0.15 Stable, but not really

Post by Jap2.0 »

cpy wrote:
Well, it could be an option in the settings for MP server saves only. When I host a Factorio game I have a machine with 32GB RAM, so uninterrupted play with saving in the background would be very nice even with the additional RAM cost.
See this thread, "Use fork() on *nix systems for doing save game". The developers basically say that copying the entirety of the memory would make it very complicated to save, which I'm assuming would mean a much longer period of UPS drops instead of a very short freeze. Additionally, I assume that it would be very complicated to program.
There are 10 types of people: those who get this joke and those who don't.

mrvn
Smart Inserter
Posts: 5682
Joined: Mon Sep 05, 2016 9:10 am

Re: Friday Facts #201 - 0.15 Stable, but not really

Post by mrvn »

Jap2.0 wrote:
See this thread, "Use fork() on *nix systems for doing save game". The developers basically say that copying the entirety of the memory would make it very complicated to save, which I'm assuming would mean a much longer period of UPS drops instead of a very short freeze. Additionally, I assume that it would be very complicated to program.
There is a lot of misinformation in that thread.

All that garbage about synchronizing is exactly what fork() solves. At the point where the save function is called now, you fork(). At that point the game is in a consistent state for saving and all game data is unlocked. Then in the child you save and exit. In the parent you simply return and keep playing. The only new bit is that you need a back channel to report errors and have to reap the child. It is easy enough to call waitpid() once a frame while saving is ongoing. You can add a pipe to report save progress and error strings, which is also easy to do and to check once a frame. Or have a thread that does that in a blocking manner.
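As a rough illustration of the fork-then-reap flow described above, here is a minimal POSIX sketch. The names `write_save`, `start_background_save`, and `poll_background_save` are hypothetical, the "game state" is just a string, and the pipe for progress reporting is left out. It also ignores the complications a multi-threaded program faces after fork(), where only the calling thread exists in the child.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cstdio>
#include <string>

// Hypothetical stand-in for the real serializer: writes `state` to `path`.
static bool write_save(const std::string& state, const char* path) {
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    const bool ok = std::fwrite(state.data(), 1, state.size(), f) == state.size();
    return (std::fclose(f) == 0) && ok;
}

// Forks a child that saves its own copy-on-write view of `state`.
// Returns the child's pid in the parent, or -1 if fork() failed.
pid_t start_background_save(const std::string& state, const char* path) {
    const pid_t pid = fork();
    if (pid == 0) {
        // Child: writes the parent makes after fork() are not visible here.
        const bool ok = write_save(state, path);
        _exit(ok ? 0 : 1);  // _exit() skips atexit handlers inherited from the parent
    }
    return pid;
}

enum class SaveStatus { Running, Succeeded, Failed };

// Non-blocking check that also reaps the child; meant to be called once per frame.
SaveStatus poll_background_save(pid_t pid) {
    int status = 0;
    const pid_t r = waitpid(pid, &status, WNOHANG);
    if (r == 0) return SaveStatus::Running;
    if (r == pid && WIFEXITED(status) && WEXITSTATUS(status) == 0) {
        return SaveStatus::Succeeded;
    }
    return SaveStatus::Failed;
}
```

The parent would call `poll_background_save` each frame until it stops returning `Running`, and must not poll again afterwards, because the child has already been reaped at that point.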

As for the memory overhead, I can't say much. But since fork() is copy-on-write, it only affects pages that are changed while the save runs. I would assume a good portion of the memory used by Factorio doesn't change every tick.

Jap2.0
Smart Inserter
Posts: 2339
Joined: Tue Jun 20, 2017 12:02 am

Re: Friday Facts #201 - 0.15 Stable, but not really

Post by Jap2.0 »

I won't claim to know much about fork(), so I might be misunderstanding it, and I won't claim to know how much of that thread is misinformation, but I would consider Rseding91 a reliable source and the following quote appears to be relevant:
Rseding91 wrote:
SyncViews wrote:
Rseding91 wrote:
  • 85% copying memory
  • 10% compressing the save data
  • 5% writing to disk
Something is wrong there, on an AWS T2 micro instance I get over 5GB/s on memcpy. There must be a lot of stuff going on around it, or operations not playing well with the cache and memory subsystem for Factorio to spend so much time "copying memory".
5 GB/s copying raw concurrent memory around sure. But that's not how actual programs are laid out in memory and we don't want to write out the entire contents of the process's memory to disk.
If you don't think that's correct, feel free to let him know. Teaching me won't improve the game.

pleegwat
Filter Inserter
Posts: 255
Joined: Fri May 19, 2017 7:31 pm

Re: Save in the background

Post by pleegwat »

mrvn wrote:I'm always annoyed when trying to do something and the saving dialog pops up. And the more you play the larger the problem becomes because saving takes longer on larger factories.

Why not save in the background in a separate thread and let the game keep running?

"All" this would need to do is to make a snapshot of the game state and then save that while the game continues on the main data. Under Linux at least this is simple since you can fork() to get a snapshot of the process's memory. This will affect performance and memory footprint somewhat during save but not stop the game. So UPS might drop while saving but the game won't stop dead.

fork() is not free (you do need to copy page tables). I've seen it take over a second, though that process was much larger than a Factorio server would be. Then after that you quickly need to duplicate a lot of the memory structures, as they are continuously modified in an active factory. And then you've got the save itself running.

And even if all that is not prohibitive, I believe Windows doesn't have a fork() equivalent at all.

kovarex
Factorio Staff
Posts: 8078
Joined: Wed Feb 06, 2013 12:00 am

Re: Save in the background

Post by kovarex »

pleegwat wrote:And even if all that is not prohibitive, I believe Windows doesn't have a fork() equivalent at all.
That is the main problem.

NotABiter
Fast Inserter
Posts: 124
Joined: Fri Nov 14, 2014 9:05 am

Re: Friday Facts #201 - 0.15 Stable, but not really

Post by NotABiter »

Jap2.0 wrote:
Rseding91 wrote:5 GB/s copying raw concurrent memory around sure. But that's not how actual programs are laid out in memory and we don't want to write out the entire contents of the process's memory to disk.
If you don't think that's correct, feel free to let him know.
Rseding presents a false dichotomy - i.e. that the choice is either copy individual objects in the pause phase (long pause due to inefficient copy) or write the whole process memory to disk. Another option is an efficient copy in the pause phase, and then do copy #2 (serialization) in a different thread (or process) after copy #1 is done (and do compression and disk writing from that other thread as well).

Q: How do you do an efficient copy when you have all of these individual objects/allocations?
A: Any performant program that needs to quickly copy a significant portion of its sizeable state should properly manage its heaps. I.e. it doesn't matter if individual objects/allocations are in essentially random positions within a heap, because you don't copy the individual objects. Instead (right from their onset - at object creation/allocation) you put those game-state objects that will ultimately need to be copied+saved into one heap (or some known set of heaps that you keep all/most other data out of) and when it comes time to kick off a save you copy the whole heap. The heap should itself of course consist of whole pages (or "superpages" - e.g., 16KB, 32KB, or 64KB chunks), and whole pages can be copied extremely efficiently. (You also have to ensure the heap does not get too sparse over time or efficiency goes down from copying too much unused memory, but I imagine Factorio is essentially ideal in this respect since in normal play its heap requirements really only ever go one way - up.) Doing the copy this way (and then letting another thread finish the save work using that copy) should result in the pause portion of the save being *many* times faster than present - no special OS support required.

This does require two copy operations instead of one (fast/efficient one during pause, slow serializing one after pause), but the second copy operation essentially doesn't matter. It's in another thread so the player doesn't have to wait for it. And it's not going to suck up much memory bandwidth because it's serializing Factorio's "haphazardly placed in memory" objects so it will end up memory-latency-bound long before it eats up any significant fraction of the raw memory bandwidth.

This heap scheme can of course be combined with COW techniques if desired (using COW to do the initial copy "as needed"), but I don't know that COW is necessarily a good idea. Even with COW all of the copy work still ends up being eaten by the active game thread (since it's the one doing writes), it potentially provides much worse copy performance (because now *ONLY* one thread can be used to do the copy), and the player is potentially going to be hit with a sustained period of stuttering rather than just one nice clean copy-pause. Also, I have significant doubts about how many of the pages can really avoid being copied -- it only takes one tiny active entity in a 4KB page to cause the whole page to get copied. Unless Factorio actively segregates active and inactive entities into different pages (which itself I do not see as being a "sure thing" performance win - it could end up making game saving more efficient at the cost of making normal tick updates less efficient) I'd think most pages containing game-state entities are likely going to be modified (and therefore copied) in any game with a reasonably active factory.
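The doubt about how many pages can avoid being copied can be put into rough numbers. The sketch below assumes, purely for illustration, that entities are spread over pages at random and that each one is active independently with the same probability; neither assumption comes from the thread, and the 128-byte entity size in the note after the code is made up.

```cpp
#include <cmath>

// Fraction of pages expected to be written (and so copied under COW) during a save,
// assuming each of the `entities_per_page` entities on a page is independently
// active with probability `active_fraction`.
double dirty_page_fraction(double active_fraction, int entities_per_page) {
    return 1.0 - std::pow(1.0 - active_fraction, entities_per_page);
}
```

With hypothetical 128-byte entities, a 4KB page holds 32 of them, and `dirty_page_fraction(0.10, 32)` comes out at about 0.97. Under these assumptions, even a factory where only one entity in ten is active would end up copying nearly every page.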

torne
Filter Inserter
Posts: 341
Joined: Sun Jan 01, 2017 11:54 am

Re: Friday Facts #201 - 0.15 Stable, but not really

Post by torne »

NotABiter wrote: Q: How do you do an efficient copy when you have all of these individual objects/allocations?
A: Any performant program that needs to quickly copy a significant portion of its sizeable state should properly manage its heaps. I.e. it doesn't matter if individual objects/allocations are in essentially random positions within a heap, because you don't copy the individual objects. Instead (right from their onset - at object creation/allocation) you put those game-state objects that will ultimately need to be copied+saved into one heap (or some known set of heaps that you keep all/most other data out of) and when it comes time to kick off a save you copy the whole heap. The heap should itself of course consist of whole pages (or "superpages" - e.g., 16KB, 32KB, or 64KB chunks), and whole pages can be copied extremely efficiently. (You also have to ensure the heap does not get too sparse over time or efficiency goes down from copying too much unused memory, but I imagine Factorio is essentially ideal in this respect since in normal play its heap requirements really only ever go one way - up.) Doing the copy this way (and then letting another thread finish the save work using that copy) should result in the pause portion of the save being *many* times faster than present - no special OS support required.
This introduces a different problem, though, which is that if you copy the heap to a different location all the pointers in it become invalid (since they still point to the old copy, which might be changing), and the serialisation code needs to traverse the pointers to do its job. So, you can no longer write "normal" C++ code that accesses things directly, but must fix up all the addresses as you go along by the offset between the two copies of the heap - the code has to look pretty different.

soryu2
Burner Inserter
Posts: 6
Joined: Wed Jul 06, 2016 8:50 pm

Re: Friday Facts #201 - 0.15 Stable, but not really

Post by soryu2 »

eX_ploit wrote:
On this new computer with normal graphics quality Factorio takes 9.84 seconds to reach the main menu. I think that's pretty good for a game these days
Actually it's not good. Seems like you are loading all of the assets before showing main menu, while only a small minority of those assets are needed in main menu. You can just load those assets and then load everything else in background while player chooses what he's gonna play.
+1. I came here to suggest the exact same thing.

Also, benchmarking on the fastest computer you can buy is a nice exercise, but not practical. You probably have some stats about your users' systems; a slow or average machine seems like a much better target for such optimization tests.

Anyway. I love the game and all the peeps working on it! Cheers!

Jap2.0
Smart Inserter
Posts: 2339
Joined: Tue Jun 20, 2017 12:02 am

Re: Friday Facts #201 - 0.15 Stable, but not really

Post by Jap2.0 »

I don't feel like finding the exact quote, but someone said that wouldn't make sense because it takes about 2 seconds to start your game. If you have 15 seconds of loading left to do, then you either have to not let the user load their game or give them a 15 second loading screen. Personally, I think it makes more sense to just get it over with in the loading screen.

NotABiter
Fast Inserter
Posts: 124
Joined: Fri Nov 14, 2014 9:05 am

Re: Friday Facts #201 - 0.15 Stable, but not really

Post by NotABiter »

torne wrote:This introduces a different problem, though, which is that if you copy the heap to a different location all the pointers in it become invalid
I was aware of this issue but considered it not even worth mentioning in my post. It isn't really a "problem". The pointers don't become "invalid" except in some uselessly narrow idea of validity. It only takes some very simple pointer math (just a fixed/constant offset assuming you manage your memory correctly) to properly adjust the pointers for their new location, and that pointer math will have essentially zero impact on performance (which is going to be dominated by memory latency).
torne wrote:the code has to look pretty different.
A simple offset adjustment (likely implemented as a locally-defined short-named template function to make invocations relatively painless and unobtrusive) doesn't qualify as "has to look pretty different". The code would be identical except for additions of invocations of said function.

Rseding91
Factorio Staff
Posts: 13175
Joined: Wed Jun 11, 2014 5:23 am

Re: Friday Facts #201 - 0.15 Stable, but not really

Post by Rseding91 »

NotABiter wrote:
Rseding presents a false dichotomy - i.e. that the choice is either copy individual objects in the pause phase (long pause due to inefficient copy) or write the whole process memory to disk. Another option is an efficient copy in the pause phase, and then do copy #2 (serialization) in a different thread (or process) after copy #1 is done (and do compression and disk writing from that other thread as well).
It already works this way. As the game state is copied out it's written to disk in a different thread. By the time the copy finishes the compression and writing to disk has already finished as well.
If you want to get ahold of me I'm almost always on Discord.

featherwinglove
Filter Inserter
Posts: 579
Joined: Sat Jun 25, 2016 6:14 am

Re: Friday Facts #201 - 0.15 Stable, but not really

Post by featherwinglove »

I'm khan-fewzed. If Copy #2 (compression & writing to disk) needs Copy #1 (copying game state to other memory) to finish before it can begin, how can Copy #2 be done by the time Copy #1 ends?

Rseding91
Factorio Staff
Posts: 13175
Joined: Wed Jun 11, 2014 5:23 am

Re: Friday Facts #201 - 0.15 Stable, but not really

Post by Rseding91 »

featherwinglove wrote:I'm khan-fewzed. If Copy #2 (compression & writing to disk) needs Copy #1 (copying game state to other memory) to finish before it can begin, how can Copy #2 be done by the time Copy #1 ends?
Because the copy finishes before the full save process finishes. There are additional things that happen after the data copy finishes to complete the save; meanwhile, the thread is able to write out the remaining data.

featherwinglove
Filter Inserter
Posts: 579
Joined: Sat Jun 25, 2016 6:14 am

Re: Friday Facts #201 - 0.15 Stable, but not really

Post by featherwinglove »

Okay, that makes sense. The way your previous comment is worded strongly implies (if not outright states) that the "additional things" finish before the "game state is copied out". Try reading it again:
Rseding91 wrote:
It already works this way. As the game state is copied out it's written to disk in a different thread. By the time the copy finishes the compression and writing to disk has already finished as well.

NotABiter
Fast Inserter
Posts: 124
Joined: Fri Nov 14, 2014 9:05 am

Re: Friday Facts #201 - 0.15 Stable, but not really

Post by NotABiter »

Rseding91 wrote:It already works this way.
According to what you yourself have said, no, it doesn't work the way I described - Factorio doesn't do the most important part. That is, during the pause what Factorio does is a bazillion inefficient tiny (object-level) copies when what it should be doing is a small number of large (heap-level) copies. The pause time would be many times shorter if it did the large copies rather than all of those little copies.

kreatious
Inserter
Posts: 20
Joined: Sat Jul 15, 2017 1:59 am

Re: Friday Facts #201 - 0.15 Stable, but not really

Post by kreatious »

kovarex wrote:
pleegwat wrote: And even if all that is not prohibitive, I believe Windows doesn't have a fork() equivalent at all.
That is the main problem.
Windows has a VirtualProtect() API that allows you to mark pages as copy on write (PAGE_WRITECOPY). I've used it before. Works nice; certainly a lot faster than fork() which has to copy the page table (vs. setting a flag on a single entry in the process's Virtual Address Descriptor tree). Mac & Linux also have an equivalent API: mmap() - and the flags are extremely similar, since virtual memory is hardware accelerated and has been for a long time. If you're lucky, there's a _mmap() API you can call in Windows as well; but I haven't looked into that.
torne wrote:This introduces a different problem, though, which is that if you copy the heap to a different location all the pointers in it become invalid (since they still point to the old copy, which might be changing), and the serialisation code needs to traverse the pointers to do its job. So, you can no longer write "normal" C++ code that accesses things directly, but must fix up all the addresses as you go along by the offset between the two copies of the heap - the code has to look pretty different.
I suggest implementing a custom pointer wrapper template class that uses the address where the pointer is stored (or uses thread local storage) to determine what offset to add when dereferencing. The code doesn't even need to look different - just a quick change of the types of any pointers in headers. Kind of like STL's smart pointer classes. Overhead's minimal since you're going to choke on memory access latency anyway.

Since Factorio is 64-bit, you can reserve two sections of the address space that differ only by a bit and use a bitmask when doing the pointer calculations.

Desync bugs from race conditions due to forgetting to use the pointer class can be caught by deleting the main thread's copy of the game state (and suspending that thread as well) - any segmentation faults will quickly isolate the bug. The game state can then be restored from the save file.

IronCartographer
Filter Inserter
Posts: 454
Joined: Tue Jun 28, 2016 2:07 pm

Re: Friday Facts #201 - 0.15 Stable, but not really

Post by IronCartographer »

NotABiter wrote:
Rseding91 wrote:It already works this way.
According to what you yourself have said, no, it doesn't work the way I described - Factorio doesn't do the most important part. That is, during the pause what Factorio does is a bazillion inefficient tiny (object-level) copies when what it should be doing is a small number of large (heap-level) copies. The pause time would be many times shorter if it did the large copies rather than all of those little copies.
Even if you managed to get a magically instant copy of the gamestate running in a separate thread/process which would allow the game to continue, Factorio's hunger for memory bandwidth would mean you've just created a new problem: UPS drops for a time while the save runs in the background. Since this magic free copying ability doesn't even exist, you would actually be adding more work so saving would be more painful overall.
