[1.0.0] Server creates corrupt autosave file and dies

Bugs that are actually features.
DomNomNom
Burner Inserter
Posts: 7
Joined: Sat Apr 11, 2020 11:48 am

Post by DomNomNom »

Hi, I was playing alone on an unmodded multiplayer 1.0 server, adding more solar to my late-game base, when the server unexpectedly created a bad autosave file.
Because it could not complete the save, the server quit.
When I attempt to restart the server with --start-server-load-latest, it tries to load _autosave6.tmp.zip, which is invalid, and quits.
I was able to mitigate the problem by deleting the failed autosave and losing 5 minutes of progress.
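
For anyone who wants to check for this before restarting, here is a minimal sketch that lists save archives which cannot be read as valid zip files. It is not part of Factorio; the function name `find_corrupt_saves` and the `saves` directory are assumptions for illustration, and it only checks zip integrity, not whether the game would accept the save.

```python
import zipfile
import zlib
from pathlib import Path


def find_corrupt_saves(saves_dir):
    """Return the .zip files in saves_dir that cannot be read as valid archives."""
    corrupt = []
    for path in sorted(Path(saves_dir).glob("*.zip")):
        try:
            with zipfile.ZipFile(path) as archive:
                # testzip() returns the first member with a bad CRC, or None.
                if archive.testzip() is not None:
                    corrupt.append(path)
        except (zipfile.BadZipFile, OSError, EOFError, zlib.error):
            # Truncated or half-written archives usually fail here.
            corrupt.append(path)
    return corrupt


if __name__ == "__main__":
    # Hypothetical saves directory; adjust to your server's layout.
    for bad in find_corrupt_saves("saves"):
        print(f"unreadable save: {bad}")
```

Moving a flagged file aside instead of deleting it keeps it available for attaching to a report like this one.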

Potentially related: I remember a similar interruption earlier that day, where the server stopped responding for a couple of seconds but managed to recover.
Attachments
_autosave6.tmp.zip (13.28 MiB): the save where the server died
_autosave5.zip (52.66 MiB): the save before the crash
factorio-server.log (207.4 KiB): server log
factorio-client.log (37.97 KiB): client log

Rseding91
Factorio Staff
Posts: 13209
Joined: Wed Jun 11, 2014 5:23 am

Post by Rseding91 »

The log file shows this:
810302.403 Info AppManager.cpp:278: Saving to _autosave6 (non-blocking).
810302.473 Info AsyncScenarioSaver.cpp:144: Saving process PID: 26043
810313.608 Error ChildProcessAgent.cpp:62: Child 26043 was terminated by signal 9
810313.626 Error Util.cpp:83: Attempting to create notice box in headless mode. Message: 'Saving process crashed.'
From a quick search, signal 9 (SIGKILL) means something outside of Factorio's control killed the auto-save fork process.
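
To illustrate what that log line reflects, here is a minimal sketch (not Factorio's code) of a parent process forking a child, the child being killed with SIGKILL from outside, and the parent reading the terminating signal from the wait status. The function name `run_child_and_kill` is made up for this example, and it assumes a POSIX system where `os.fork` is available.

```python
import os
import signal
import time


def run_child_and_kill():
    """Fork a child, kill it with SIGKILL, and return the signal the parent observes."""
    pid = os.fork()
    if pid == 0:
        # Child: stand in for a long-running save process.
        time.sleep(60)
        os._exit(0)
    # Parent: simulate an outside actor (such as the kernel OOM killer).
    os.kill(pid, signal.SIGKILL)
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return os.WTERMSIG(status)
    return None


if __name__ == "__main__":
    sig = run_child_and_kill()
    print(f"Child was terminated by signal {sig}")
```

SIGKILL cannot be caught or ignored by the child, so the child gets no chance to finish or clean up whatever it was writing.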
If you want to get ahold of me I'm almost always on Discord.

Post by DomNomNom »

Looking at my server graphs, it probably was an out-of-memory kill. (see screenshot)
However, after restarting, the server seems to use only half as much RAM as before the quit. I'm not sure whether that indicates a memory leak or is expected.
Attachments
factorio_crash_memory_graph.PNG (20.31 KiB)

Post by Rseding91 »

DomNomNom wrote:
Tue Aug 25, 2020 3:58 pm
Looking at my server graphs, it probably was an out-of-memory kill. (see screenshot)
However after restarting, it seems to take up only half as much RAM as before the quit, not sure if that's indicative of a memory leak or expected.
On Linux, that's kind of expected. Factorio will free memory, but the OS doesn't have to reclaim it until the process exits (if I'm remembering correctly; I've seen this same kind of thing in the past).

I don't understand why Linux would kill a process when it runs out of memory. On Windows, when memory runs out it just overflows to the page file, and if that runs out too, the OS fails allocations. It's then up to whatever program wants more memory to handle the allocation failure, or to not handle it and kill itself.

Almost seems like you have a virus on your PC just killing random processes :P Not what I would want to happen... (I know it's not a virus but it seems like one to me)

Post by DomNomNom »

Thanks! Have a nice day.

blahfasel2000
Inserter
Posts: 49
Joined: Sat Mar 28, 2020 2:10 pm

Post by blahfasel2000 »

Rseding91 wrote:
Tue Aug 25, 2020 6:46 pm
I don't understand why linux would kill a process when it runs out of memory... on Windows when memory runs out it just overflows to the page file and if that runs out the OS fails allocations and it's up to what ever program is wanting more memory to handle the allocation failure or to just not and kill itself.
It's because of memory overcommit. The out-of-memory condition doesn't happen at the point where the application is actually making a memory allocation, where an error could be reported back. Instead it happens at an arbitrary later point, when a shared page needs to be copied because of copy-on-write logic and there's no more physical storage (RAM or swap) available. The process triggering it is just executing a CPU instruction that happens to write to a shared page, so there's really no way the OS could report an out-of-memory condition back to it. Killing a process to free up some memory is really the only thing the OS can do. A bunch of heuristics select which process to kill, and it doesn't necessarily have to be the one that triggered the OOM condition.
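
As a rough way to look at the result of those heuristics, Linux exposes a per-process score in `/proc/<pid>/oom_score` (higher means more likely to be picked) and a user-settable adjustment in `/proc/<pid>/oom_score_adj` (from -1000 to 1000). The sketch below just reads both; the helper names `read_proc_int` and `oom_report` are made up for this example, and it returns `None` where procfs is not available.

```python
from pathlib import Path


def read_proc_int(pid, name):
    """Read an integer file such as /proc/<pid>/oom_score; None if it cannot be read."""
    try:
        return int(Path(f"/proc/{pid}/{name}").read_text().strip())
    except (OSError, ValueError):
        return None


def oom_report(pid="self"):
    """Return the kernel's OOM score and the user-set adjustment for one process."""
    return {
        "oom_score": read_proc_int(pid, "oom_score"),
        "oom_score_adj": read_proc_int(pid, "oom_score_adj"),
    }


if __name__ == "__main__":
    # "self" refers to the current process; pass a numeric PID to inspect another one.
    print(oom_report())
```

Comparing the score of the server process with that of its forked save process would show which one the kernel currently considers the better candidate.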

You can disable memory overcommit through kernel settings, but if you do, be prepared to suddenly be able to run a lot less in parallel on your system, as you lose the memory-saving benefits of copy on write. The OS still does copy on write, but it fully accounts for every possible future copy right from the start, which inflates the amount of memory it considers "allocated" even if a page never actually gets copied because it never gets written to. This includes losing the benefits of shared libraries, as the shared library architecture is designed with copy on write and memory overcommit in mind.
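
The kernel setting in question is `vm.overcommit_memory`, readable at `/proc/sys/vm/overcommit_memory`. Here is a small sketch that reads the current mode and describes it; the function names and description strings are my own, and only the three documented values 0, 1 and 2 are assumed.

```python
from pathlib import Path

# Modes documented for the Linux vm.overcommit_memory sysctl.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit beyond the commit limit",
}


def describe_overcommit(raw_value):
    """Translate the raw sysctl text (for example "0\n") into a description."""
    mode = int(raw_value.strip())
    return OVERCOMMIT_MODES.get(mode, f"unknown mode {mode}")


def current_overcommit(path="/proc/sys/vm/overcommit_memory"):
    """Describe the running kernel's mode, or return None if it cannot be read."""
    try:
        return describe_overcommit(Path(path).read_text())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print(current_overcommit())
```

Mode 2 is the "disabled" case described above; changing the mode requires root and affects every process on the machine, so this sketch only reads it.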
