[2.0.23] Game crashes during headless non-blocking autosave
Posted: Sat Dec 07, 2024 4:24 pm
We are running Factorio on a headless server on Linux. We enabled non-blocking saving and set an autosave interval of 30 minutes. We are aware that this is an experimental feature. Everything has run fine from the day of the Space Age release until now, through all the updates up to 2.0.23. We have not yet updated to the latest 2.0.24.
Today I was alone on the server and got a "Server not responding" message. I let the progress bar reach the end and then tried to save the game locally. This caused my client to crash. See file client.log. I can't remember doing anything special at the time of the crash.
I checked the server logs and the reason for the crash server-side was the same as my client-side crash. The file server.log has three sections:
- The initial start of the server a few days ago
- Normal gameplay from 9:30 - 11:32
- The crash
We suspected that the server had hit some memory limit, which might have caused the forking to fail, so I checked our server monitoring. During normal operation there are CPU spikes every 30 minutes but no spikes in memory usage. This can be seen in server_normal_operation.png. At the moment of the crash, however, memory usage increased by a fair bit. This can be seen in server_crash.png. I'm not sure whether that is relevant or just an expected side effect of the dumping after a crash.
I have attached the latest autosave before the event (_autosave418.zip) and the temporary failed attempt of _autosave419.tmp.zip. I have loaded the _autosave418.zip on the server again and waited 30 minutes without doing anything. Then the autosave worked fine again.
So all I lost was 30 minutes of gameplay, and now I can continue again. But maybe this report helps to fix a sporadically occurring bug.