[Oxyd] [1.1.1] Asynchronous save process stuck, crash when quitting game

This subforum contains all the issues which we already resolved.
Post Reply
selplacei
Burner Inserter
Burner Inserter
Posts: 17
Joined: Wed Aug 05, 2020 9:52 am
Contact:

[Oxyd] [1.1.1] Asynchronous save process stuck, crash when quitting game

Post by selplacei »

Save files: https://drive.google.com/drive/folders/ ... sp=sharing
Included are the original save (suffering..zip), the autosaves as of right after the crash, and both of the .tmp.zip files that were present.
Log file, as of right after the crash (also attached to this post): https://pastebin.com/RiiRQcVD

System info: Arch Linux 5.9.9-zen 64-bit, Factorio 1.1.1 on the native Steam client installed from multilib/steam, with multilib/steam-native-runtime installed as well.

What happened:
- I played on a different modded save for a bit (the only present mod was Editor Extensions)
- I manually saved the game, then selected the main save ("suffering.") and synced the mods, disabling all except "base"
- The game restarted and loaded the save as normal. I played for a couple hours, with async autosaves happening without issues
- At some point, an autosave stopped with an in-game message saying "Saving process crashed.", which I confirmed and just continued playing (can't remember if I manually saved right after it - hopefully the log can clarify that)
- After a bit of playing (with more autosaves happening as normal), a save started (I literally can't remember if it was an autosave or a manual save - I'm sorry) and the game immediately took up 100% of the CPU and an abnormal amount of memory, nearly crashing. Both the CPU and RAM returned to normal after several seconds
- The save message in the top left remained there for several minutes, being stuck at 100% (screenshot attached)
- When I decided it wasn't going to go away, I tried to quit the game as normal (press Esc, click "Quit Game")
- The "Game is not saved. Quit anyway?" box showed up, and as soon as I clicked "Quit", the game hung up for several seconds, followed by an error message (screenshot attached)
- The game process exited when I clicked "Yes" (didn't have to kill it with a task manager)

After the crash:
- The game opens normally
- The main menu "Continue" button shows "suffering..tmp" as the save name
- When I clicked "Load game" (not "Continue"), I got the following message: "Opening zip /home/selplacei/.factorio/saves/suffering..tmp.zip failed: Bad zip file"
- I could load _autosave2 (the most recent save) normally, and manually saved it without problem

Not sure if this is a game bug or caused by my own system - posting here just in case.
Attachments
screenshot-factorio-stuck-autosave.png
screenshot-factorio-stuck-autosave.png (45.55 KiB) Viewed 5942 times
screenshot-factorio-error-message.png
screenshot-factorio-error-message.png (2.33 MiB) Viewed 5942 times
factorio-current.log
(33.92 KiB) Downloaded 122 times

User avatar
valneq
Smart Inserter
Smart Inserter
Posts: 1150
Joined: Fri Jul 12, 2019 7:43 am
Contact:

Re: [1.1.1] Asynchronous save process stuck, crash when quitting game

Post by valneq »

I thought the devs removed the non-blocking save. Is it still available in the settings?

selplacei
Burner Inserter
Burner Inserter
Posts: 17
Joined: Wed Aug 05, 2020 9:52 am
Contact:

Re: [1.1.1] Asynchronous save process stuck, crash when quitting game

Post by selplacei »

valneq wrote:
Thu Nov 26, 2020 7:36 am
I thought the devs removed the non-blocking save. Is it still available in the settings?
It is available via the config file on Linux. Based on the comment ("Does nothing on Windows") it appears to be supported. It has worked perfectly fine until 1.1 (I got the "Saving process crashed" message a total of 3 times since upgrading).

Squelch
Filter Inserter
Filter Inserter
Posts: 346
Joined: Sat Apr 23, 2016 5:31 pm
Contact:

Re: [1.1.1] Asynchronous save process stuck, crash when quitting game

Post by Squelch »

valneq wrote:
Thu Nov 26, 2020 7:36 am
I thought the devs removed the non-blocking save. Is it still available in the settings?
It was kept after several requests so we could attempt to identify the cause.

see [Oxyd] [Linux/Mac] non-blocking save crashes

From the log, and what seems to have initiated the crash is the generic error (return value 1) for autosave1 followed by a successful save to autosave2, but failed to save the blueprint storage:

Code: Select all

8363.920 Info AppManager.cpp:285: Saving to _autosave1 (non-blocking).
8364.072 Info AsyncScenarioSaver.cpp:144: Saving process PID: 1724870
8366.131 [1724870] Info BlueprintShelf.cpp:720: Saving blueprint storage.
8366.671 Info ChildProcessAgent.cpp:60: Child 1724870 exited with return value 1
8544.637 Info AppManager.cpp:285: Saving to _autosave2 (non-blocking).
8544.777 Info AsyncScenarioSaver.cpp:144: Saving process PID: 1726490
8547.438 [1726490] Info BlueprintShelf.cpp:720: Saving blueprint storage.
8547.530 [1726490] Error BlueprintShelf.cpp:897: Failed to save blueprint storage to cloud. Size: 467699, Cloud Storage: Total=0, Available=0
8547.789 Info ChildProcessAgent.cpp:60: Child 1726490 exited with return value 0
8547.789 Info AppManager.cpp:286: Saving finished
8567.214 Info AppManagerStates.cpp:1829: Saving game as /home/selplacei/.factorio/saves/suffering. (non-blocking)
8567.347 Info AsyncScenarioSaver.cpp:144: Saving process PID: 1726684
8569.386 [1726684] Info BlueprintShelf.cpp:720: Saving blueprint storage.
8740.021 Warning AppManager.cpp:301: Not saving _autosave3: Previous saving process hasn't finished yet
8922.703 Warning AppManager.cpp:301: Not saving _autosave1: Previous saving process hasn't finished yet
Saves to this path were previously successful.

I still haven't been able to reproduce this, and wish I could so I could try and chase it down.

strace might reveal something if a reliable repro can be found. So far, I haven't been able to identify anything savefile related from previous reports, and I can only assume it is something particular to some system environments.

posila
Factorio Staff
Factorio Staff
Posts: 5202
Joined: Thu Jun 11, 2015 1:35 pm
Contact:

Re: [1.1.1] Asynchronous save process stuck, crash when quitting game

Post by posila »

In the OPs stacktrace, last line that makes sense is

Code: Select all

#32 0x00007ffddf8ac1f8 in SteamContext::getStorageQuota() const at /tmp/factorio-build-7k9uW8/src/SteamIntegration/SteamContext.cpp:182
from there up it is nonsense, so it's probably inside Steam API call and the address symbolization returned nonsense for those calls.
I am not even sure if Steam API is supposed to work from forked process.

Squelch
Filter Inserter
Filter Inserter
Posts: 346
Joined: Sat Apr 23, 2016 5:31 pm
Contact:

Re: [1.1.1] Asynchronous save process stuck, crash when quitting game

Post by Squelch »

posila wrote:
Thu Nov 26, 2020 12:42 pm
In the OPs stacktrace, last line that makes sense is

Code: Select all

#32 0x00007ffddf8ac1f8 in SteamContext::getStorageQuota() const at /tmp/factorio-build-7k9uW8/src/SteamIntegration/SteamContext.cpp:182
from there up it is nonsense, so it's probably inside Steam API call and the address symbolization returned nonsense for those calls.
I am not even sure if Steam API is supposed to work from forked process.
Steam doesn't appear to be a common denominator in the reported crashes although I suspect you are correct with your observation, and it may be a contributing factor, or just another vector to the defect.

Code: Select all

8366.671 Info ChildProcessAgent.cpp:60: Child 1724870 exited with return value 1
Does appear regularly in the log files for previous reports, but it's as descriptive as saying "I haz a bug" unfortunately.

Squelch
Filter Inserter
Filter Inserter
Posts: 346
Joined: Sat Apr 23, 2016 5:31 pm
Contact:

Re: [1.1.1] Asynchronous save process stuck, crash when quitting game

Post by Squelch »

selplacei wrote:
Thu Nov 26, 2020 6:39 am
System info: Arch Linux 5.9.9-zen 64-bit, Factorio 1.1.1 on the native Steam client installed from multilib/steam, with multilib/steam-native-runtime installed as well.
System info: [CPU: Intel(R) Core(TM) i5-4460 CPU @ 3.20GHz, 4 cores, RAM: 7896 MB]
Out of interest @selplacei, what swap space do you have, and how is it allocated?

selplacei
Burner Inserter
Burner Inserter
Posts: 17
Joined: Wed Aug 05, 2020 9:52 am
Contact:

Re: [1.1.1] Asynchronous save process stuck, crash when quitting game

Post by selplacei »

Squelch wrote:
Thu Nov 26, 2020 1:10 pm
selplacei wrote:
Thu Nov 26, 2020 6:39 am
System info: Arch Linux 5.9.9-zen 64-bit, Factorio 1.1.1 on the native Steam client installed from multilib/steam, with multilib/steam-native-runtime installed as well.
System info: [CPU: Intel(R) Core(TM) i5-4460 CPU @ 3.20GHz, 4 cores, RAM: 7896 MB]
Out of interest @selplacei, what swap space do you have, and how is it allocated?
I have a 16 GB swap partition on an SSD with swappiness set to 60. Everything other than that I haven't touched.

Squelch
Filter Inserter
Filter Inserter
Posts: 346
Joined: Sat Apr 23, 2016 5:31 pm
Contact:

Re: [1.1.1] Asynchronous save process stuck, crash when quitting game

Post by Squelch »

selplacei wrote:
Thu Nov 26, 2020 1:39 pm
I have a 16 GB swap partition on an SSD with swappiness set to 60. Everything other than that I haven't touched.
Thanks, that helps to rule out that aspect.

selplacei
Burner Inserter
Burner Inserter
Posts: 17
Joined: Wed Aug 05, 2020 9:52 am
Contact:

Re: [1.1.1] Asynchronous save process stuck, crash when quitting game

Post by selplacei »

I'm currently running

Code: Select all

strace -f
on Factorio on a clean save, hoping that at least the "Save process crashed" thing happens again. Will post if anything comes up. The useful info should be easy to separate since the saving process has its own PID

Squelch
Filter Inserter
Filter Inserter
Posts: 346
Joined: Sat Apr 23, 2016 5:31 pm
Contact:

Re: [1.1.1] Asynchronous save process stuck, crash when quitting game

Post by Squelch »

selplacei wrote:
Thu Nov 26, 2020 2:03 pm
I'm currently running

Code: Select all

strace -f
on Factorio on a clean save, hoping that at least the "Save process crashed" thing happens again. Will post if anything comes up. The useful info should be easy to separate since the saving process has its own PID
Good call, and thanks for your efforts.

The following might also be useful

Code: Select all

strace -ff --output ~/.factorio/AsychMonitor /path/to/factorio
It will create a new AsynchMonitor.[PID] for each process instead of all in one.

If only I was able to recreate the crashes on any of my systems, I'd be doing this myself. I look forward to seeing what you find.

Squelch
Filter Inserter
Filter Inserter
Posts: 346
Joined: Sat Apr 23, 2016 5:31 pm
Contact:

Re: [1.1.1] Asynchronous save process stuck, crash when quitting game

Post by Squelch »

posila wrote:
Thu Nov 26, 2020 12:42 pm
In the OPs stacktrace, last line that makes sense is

Code: Select all

#32 0x00007ffddf8ac1f8 in SteamContext::getStorageQuota() const at /tmp/factorio-build-7k9uW8/src/SteamIntegration/SteamContext.cpp:182
from there up it is nonsense, so it's probably inside Steam API call and the address symbolization returned nonsense for those calls.
I am not even sure if Steam API is supposed to work from forked process.
Squelch wrote:
Thu Nov 26, 2020 1:03 pm
Steam doesn't appear to be a common denominator in the reported crashes although I suspect you are correct with your observation, and it may be a contributing factor, or just another vector to the defect.

Code: Select all

8366.671 Info ChildProcessAgent.cpp:60: Child 1724870 exited with return value 1
Does appear regularly in the log files for previous reports, but it's as descriptive as saying "I haz a bug" unfortunately.
I'll retract the first part of my assertion. Steam does indeed appear to be a common factor. I've just gone over my collated reports again, and the crashes that at first appeared to be attributed to the child process hanging do in fact appear to have other causes/exit values. This might explain why myself and some others haven't been able to reproduce the problem. In my case, I've tried everything else except running via Steam.

The second half of my assertion still stands, with exit value 1 for an autosave attempt. The pattern seems to go:
  1. autosave attempt PID[1]
  2. Blueprint storage (cloud) success
  3. PID[1] Failed with exit value 1
  4. Another autosave attempt PID[2]
  5. Blueprint storage (cloud) fail
  6. PID[2] Success with exit value 0
  7. Next save attempt
  8. PID[1], PID[2]... "Previous saving process hasn't finished yet"
  9. Crash
  10. Garbage stacktrace
I will try to reproduce via Steam, and also see what strace returns if any from @selplacei.

[Conjecture]
Steam cloud might be seeing two negotiations for storage from each half of the fork. It accepts and acknowledges the original request, but because the second request comes from the same host it never responds, or it's assumed to be the same, therefore leaving it hanging and a possible zombie. The next request to sync storage fails because the fork is still busy. Previous successful saves might just be luck in timing.
[/conjecture]

Oxyd
Former Staff
Former Staff
Posts: 1428
Joined: Thu May 07, 2015 8:42 am
Contact:

Re: [1.1.1] Asynchronous save process stuck, crash when quitting game

Post by Oxyd »

This being connected to Steam storage would make sense to me. I also haven't really tried the Steam version, so I'll see if I can reproduce it this way.

selplacei
Burner Inserter
Burner Inserter
Posts: 17
Joined: Wed Aug 05, 2020 9:52 am
Contact:

Re: [Oxyd] [1.1.1] Asynchronous save process stuck, crash when quitting game

Post by selplacei »

I haven't been able to reproduce neither the original issue nor the "Save process crashed" message so far. It is interesting that saving on that one save still takes much more CPU and RAM than normal (noticeably more than on 1.1.0) on my machine. I'll play as normal with strace enabled and will post if a crash happens again, but don't expect it.

Also, it might be obvious by now, but just before the crash I was doing a lot of putting new blueprints into the global blueprint library as part of normal gameplay.

Squelch
Filter Inserter
Filter Inserter
Posts: 346
Joined: Sat Apr 23, 2016 5:31 pm
Contact:

Re: [Oxyd] [1.1.1] Asynchronous save process stuck, crash when quitting game

Post by Squelch »

selplacei wrote:
Thu Nov 26, 2020 8:05 pm
I haven't been able to reproduce neither the original issue nor the "Save process crashed" message so far. It is interesting that saving on that one save still takes much more CPU and RAM than normal (noticeably more than on 1.1.0) on my machine. I'll play as normal with strace enabled and will post if a crash happens again, but don't expect it.
I'm still poking at it, and nothing yet.
I have set strace with individual files for each process to a folder named session. This is cleared on every restart (strace generates a lot of verbage) Matching up the PIDs from Factorio log should be easier than one flat file.
Also, it might be obvious by now, but just before the crash I was doing a lot of putting new blueprints into the global blueprint library as part of normal gameplay.
Yes, that is the line of thought I was taking, and have enabled "Blueprint library cloud sync" along with creating test blueprints as I go in a new game.

Thanks for participating in this. I'm sure something will come of this.

selplacei
Burner Inserter
Burner Inserter
Posts: 17
Joined: Wed Aug 05, 2020 9:52 am
Contact:

Re: [Oxyd] [1.1.1] Asynchronous save process stuck, crash when quitting game

Post by selplacei »

I've managed to get the bug again, this time with strace enabled! Also, I wasn't doing anything blueprint-related this time (at least since 30 minutes before the crash), and there weren't any failed autosaves before the crash. Like last time, the progress bar was stuck at 100%, and when I tried to quit the game, it crashed.

The log is attached. As for the strace, it probably contains sensitive information, so I'll kindly ask a developer to DM me and I'll provide a link to it - the whole thing weighs 8 GiB since I've been playing for a while, but I'll also provide the last 10,000 and 1,000,000 lines as separate files. All PIDs are in the same file.
Attachments
factorio-current.log
(72.47 KiB) Downloaded 100 times

Oxyd
Former Staff
Former Staff
Posts: 1428
Joined: Thu May 07, 2015 8:42 am
Contact:

Re: [Oxyd] [1.1.1] Asynchronous save process stuck, crash when quitting game

Post by Oxyd »

So, with Steam enabled I've seen the hang and subsequent crash myself, but also couldn't reproduce it reliably. Nevertheless, when I tried saving a file to Steam Cloud from both the parent and child processes, both processes hung immediately. So whilst I can't be 100% sure, I think the cause of this was that the child tried to push files to Steam.

In 1.1.4, only the parent process will talk to Steam, which should hopefully fix this issue.

Squelch
Filter Inserter
Filter Inserter
Posts: 346
Joined: Sat Apr 23, 2016 5:31 pm
Contact:

Re: [Oxyd] [1.1.1] Asynchronous save process stuck, crash when quitting game

Post by Squelch »

Oxyd wrote:
Tue Dec 01, 2020 4:51 pm
So, with Steam enabled I've seen the hang and subsequent crash myself, but also couldn't reproduce it reliably. Nevertheless, when I tried saving a file to Steam Cloud from both the parent and child processes, both processes hung immediately. So whilst I can't be 100% sure, I think the cause of this was that the child tried to push files to Steam.

In 1.1.4, only the parent process will talk to Steam, which should hopefully fix this issue.
That seems to align with my conjecture, and I'm so pleased that you may have nailed this @Oxyd.
My testing time has been rather limited of late, so I wasn't able to reproduce the problem either. Let's hope that is finally resolved.

Post Reply

Return to “Resolved Problems and Bugs”