
[Oxyd] [Linux/Mac] non-blocking save crashes

Posted: Tue Dec 24, 2019 3:26 pm
by Cromefire_
I've enabled async saves. If I save the game manually while it's autosaving (by accident, or because the autosave is stuck, which is a separate problem), the game crashes because only one instance of the async save class is allowed.
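
The failure mode described here (a second save request arriving while one is already in flight) can be handled without crashing by queuing the second request. A minimal Python sketch follows. `SaveCoordinator` and its methods are hypothetical names for illustration and are not Factorio's actual code.

```python
class SaveCoordinator:
    """Hypothetical guard that allows at most one background save in flight."""

    def __init__(self):
        self.active = None   # name of the save currently running, if any
        self.pending = None  # at most one queued request

    def request_save(self, name):
        """Start the save if idle; otherwise queue it instead of failing."""
        if self.active is None:
            self.active = name
            return "started"
        self.pending = name  # the most recent request replaces any older queued one
        return "queued"

    def on_save_finished(self):
        """Call when the background save ends.

        Promotes the queued request (if any) to active and returns its name,
        or returns None when nothing was waiting.
        """
        self.active, self.pending = self.pending, None
        return self.active
```

With this shape, a manual save during an autosave returns "queued" and starts once the autosave reports completion. Rejecting the request with a message would be an equally valid policy.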

[0.18.0] Crash to Desktop (save process stalled with non-blocking save feature)

Posted: Thu Jan 23, 2020 5:25 pm
by triffid_hunter
Hi Wube team,

Just had an interesting and rare occurrence - factorio crashed!

I think this is my first crash since I bought the game circa 0.14, so you're doing an excellent job regarding stability!

I was poking around in single player with a few minor mods (LTN, Whistle Stop, Mining Drones, Squeakthrough, Beautiful Rail Bridges, Lighted power poles, Noxys waterfill, couple others perhaps - see the attached saves).

It tried to autosave as it has done successfully many many times (including in this session and on this map), but this specific time the dialog remained visible for several minutes with the progress bar at 100%.

I've no idea why it stalled this time, I had 8G of RAM available (from 16G total), and 35G free on the disk where I have my factorio saves.

The game kept running for a while (I have non-blocking save enabled) then crashed.

There's nothing interesting in the log, but I've attached it anyway since I know you've plenty of reason to not believe folks when they say there's nothing interesting in there.

Nothing interesting in dmesg either, last message there is from several hours ago.

I've also attached the save file it was *trying* to write, as well as the previous autosave which wrote successfully. They're time-stamped 5 minutes apart.

My guess is that some unknown and unlogged issue caused the save process to stall, then (as a result of non-blocking save) it tried to autosave *again*, and barfed on the previous autosave process still running.

If that's an accurate assessment, the main problem is that autosave stalled somehow.

Suggestion: if non-blocking autosave stalls, perhaps detect this and at least offer to foreground-save before dying?
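
One way the suggestion above could look, as a minimal Python sketch for a POSIX system: run the save in a forked child, poll it with a deadline, and fall back to a blocking save if the child overruns or fails. The function name, its parameters, and the timeout policy are assumptions made for illustration, not anything the game is known to do.

```python
import os
import signal
import time

def save_with_watchdog(background_save, foreground_save, timeout_s, poll_s=0.05):
    """Run background_save in a forked child; fall back if it overruns.

    Returns "background" if the child exited with status 0 in time.
    Otherwise the child is killed (if still running) and reaped,
    foreground_save() runs in the calling process, and "foreground" is returned.
    """
    pid = os.fork()
    if pid == 0:
        # Child: do the work, then leave without running parent cleanup code.
        try:
            background_save()
            os._exit(0)
        except BaseException:
            os._exit(1)

    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        reaped, status = os.waitpid(pid, os.WNOHANG)  # never blocks
        if reaped == pid:
            if os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0:
                return "background"
            break  # child failed and is already reaped; use the blocking save
        time.sleep(poll_s)
    else:
        # Deadline passed with the child still running: treat it as stalled.
        os.kill(pid, signal.SIGKILL)
        os.waitpid(pid, 0)

    foreground_save()
    return "foreground"
```

A real game loop would check the child once per tick instead of sleeping, but the decision logic would be the same.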

System specifics:

Linux Mint 18.3 (Sylvia) (essentially Ubuntu xenial), kernel 4.15.0-64-generic #73~16.04.1-Ubuntu
AMD Ryzen 5 1500X CPU
nVidia GTX1050Ti, 396.54.09 driver
16GB DDR4-2400 RAM (12G free after factorio crashed, ~8G free while the autosave was stalling with each factorio process taking ~2G)
240G SSD (35G free) + 1TB spinning rust (not used for factorio)

PS: I love the non-blocking save feature, it's a truly genius move to offload game state snapshotting to the kernel's CoW fork mechanism! :D
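
For readers unfamiliar with the trick: after fork(), the child process has a copy-on-write view of the parent's memory as it was at that instant, so it can write out a consistent snapshot while the parent keeps changing the live state. A minimal Python sketch, assuming a POSIX system. `snapshot_in_child`, `demo`, and the list-of-strings "state" are made up for illustration and say nothing about how Factorio actually serializes a map.

```python
import os
import tempfile

def snapshot_in_child(state, path):
    """Fork; the child writes its frozen copy of `state` to `path`.

    Returns the child's PID. The parent may keep mutating `state` right away.
    """
    pid = os.fork()
    if pid == 0:
        # Child: sees `state` exactly as it was at the moment of fork().
        try:
            with open(path, "w") as f:
                f.write("\n".join(state))
            os._exit(0)
        except BaseException:
            os._exit(1)
    return pid

def demo():
    state = ["iron", "copper"]
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "snapshot.txt")
        pid = snapshot_in_child(state, path)
        state.append("uranium")         # the parent keeps "playing"
        _, status = os.waitpid(pid, 0)  # reap the child so it does not linger
        with open(path) as f:
            saved = f.read().split("\n")
    return saved, state, status
```

In `demo`, the saved file contains only the two items that existed at fork time, while the parent's list already has three.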

Re: [0.18.0] Crash to Desktop (save process stalled with non-blocking save feature)

Posted: Wed Jan 29, 2020 5:01 pm
by TruePikachu

Code:

41635.350 Info AppManager.cpp:268: Saving to _autosave1 (non-blocking).
41635.591 Info AsyncScenarioSaver.cpp:144: Saving process PID: 27878
41636.135 [27878] Info BlueprintShelf.cpp:691: Saving blueprint storage.
[NB: EOF]
Is there a core dump available?

Re: [0.18.0] Crash to Desktop (save process stalled with non-blocking save feature)

Posted: Fri Feb 14, 2020 5:53 pm
by movax20h
I can't reproduce a crash with no mods. Non-blocking autosaving works flawlessly for me. Could you share your mods, as a single zip with all required mods?

Can you reproduce it in any way, even if only sometimes?

[0.18.15] crash when saving while ending game; when using non-blocking-saving

Posted: Wed Mar 25, 2020 11:35 am
by ssilk
Hi, this crash happened when I quit a game at the same moment the game was saving. The game is still saved, but afterwards it crashes. This is only possible with non-blocking saving turned on. I would understand if this is not fixed, because it is an experimental setting that is normally off.

Code:

3857.338 Info AppManager.cpp:278: Saving finished
4013.722 Info AppManagerStates.cpp:1747: Saving game as /Users/ssilk/Library/Application Support/factorio/saves/World50.030 (non-blocking)
4014.398 Info AsyncScenarioSaver.cpp:144: Saving process PID: 16188
4014.400 [16188] Verbose Scenario.cpp:835: Saving game as /Users/ssilk/Library/Application Support/factorio/saves/World50.030
4058.735 [16188] Verbose Scenario.cpp:951: Time to save game: 44.3352
4059.057 Info ChildProcessAgent.cpp:60: Child 16188 exited with return value 0
4059.057 Info AppManagerStates.cpp:1748: Saving finished
4059.077 Time travel logging:
  23.007 Popped blueprint record (player-index: 0, ID: 6) from book (player-index: 0, ID: 8)
  23.007 Popped blueprint record (player-index: 0, ID: 21) from book (player-index: 0, ID: 31)
  23.007 Popped blueprint record (player-index: 0, ID: 30) from book (player-index: 0, ID: 31)
Factorio crashed. Generating symbolized stacktrace, please wait ...
#1  0x0000000105bcd442 in Logger::logStacktrace(StackTraceInfo*) + 0x12
#2  0x00000001051e8589 in CrashHandler::writeStackTrace(CrashHandler::CrashReason) + 0xb9
#3  0x0000000105bb1744 in CrashHandler::commonSignalHandler(int) + 0x74
#4  0x0000000105bb0c39 in CrashHandler::SignalHandler(int) + 0x9
#5  0x00007fff6a92f42d in _sigtramp + 0x1d
Stack trace logging done
4059.197 Error Util.cpp:97: Unexpected error occurred. If you're running the latest version of the game you can help us solve the problem by posting the contents of the log file on the Factorio forums.
Please also include the save file(s), any mods you may be using, and any steps you know of to reproduce the crash.
4074.889 Uploading log file
4074.953 Info SystemUtil.cpp:547: Started /Applications/factorio.app/Contents/MacOS/factorio; trampoline PID: 16199
4074.953 Error CrashHandler.cpp:591: Unhandled exception type: NSt3__117bad_function_callE
4074.954 Error CrashHandler.cpp:598: Unhandled exception: std::exception

Re: [0.18.15] crash when saving while ending game

Posted: Wed Mar 25, 2020 11:43 am
by Klonan
Can you provide the save game?

Re: [0.18.15] crash when saving while ending game

Posted: Sat Mar 28, 2020 9:23 am
by ssilk
No (as I said, the save worked fine, but it has since been overwritten), but I’ll try to reproduce it with the current version.

Maybe this helps: I think I was already in the menu and didn’t notice that the game had begun to save (it takes about 20 seconds on my map) when I pressed quit.

Re: [0.18.15] crash when saving while ending game

Posted: Sat Mar 28, 2020 1:17 pm
by ssilk
Simple to reproduce:
In game, open the menu and save (note again: non-blocking saving is on!). While it is saving, press quit. I know that this must have been introduced sometime before 0.18.15, because I remember trying this in January/February.

Re: [0.18.0] Crash to Desktop (save process stalled with non-blocking save feature)

Posted: Tue Apr 07, 2020 5:33 pm
by triffid_hunter
Sorry, haven't been able to reproduce, and unless factorio saves core dumps by default somewhere that I'm not aware of I didn't get one.

[1.0.0] non-blocking save on Linux crashes or locks up

Posted: Sun Aug 16, 2020 8:24 am
by sthalik
The first time, autosave produced a message that the save process had crashed. The second time, it locked up in the background, and I was unable to save the game manually. There was a message in syslog and a bunch of files left in /tmp/dumps:

Code:

Aug 16 10:07:29 burzum.local crash_20200816100729_2.dmp[8108]: Uploading dump (out-of-process)
                                                               /tmp/dumps/crash_20200816100729_2.dmp
Aug 16 10:07:29 burzum.local assert_20200816100729_4.dmp[8112]: Uploading dump (out-of-process)
                                                                /tmp/dumps/assert_20200816100729_4.dmp
Aug 16 10:07:30 burzum.local crash_20200816100729_2.dmp[8108]: Finished uploading minidump (out-of-process): success = yes
Aug 16 10:07:30 burzum.local crash_20200816100729_2.dmp[8108]: response: Discarded=1
Aug 16 10:07:30 burzum.local crash_20200816100729_2.dmp[8108]: file ''/tmp/dumps/crash_20200816100729_2.dmp'', upload yes: ''Discarded=1''
Aug 16 10:07:30 burzum.local assert_20200816100729_4.dmp[8112]: Finished uploading minidump (out-of-process): success = yes
Aug 16 10:07:30 burzum.local assert_20200816100729_4.dmp[8112]: response: Discarded=1
Aug 16 10:07:30 burzum.local assert_20200816100729_4.dmp[8112]: file ''/tmp/dumps/assert_20200816100729_4.dmp'', upload yes: ''Discarded=1''

Code:

-rw-r----- 1 sthalik wheel  213 Aug 16 10:07 sthalik_log.txt
/tmp/dumps % file *
assert_20200816100729_4.dmp: Mini DuMP crash report, 14 streams, Sun Aug 16 08:07:29 2020, 0x0 type
crash_20200816100729_2.dmp:  Mini DuMP crash report, 14 streams, Sun Aug 16 08:07:29 2020, 0x0 type
sthalik_log.txt:             ASCII text
/tmp/dumps % cat sthalik_log.txt 
Sun Aug 16 08:07:30 2020 GMT: file ''/tmp/dumps/crash_20200816100729_2.dmp'', upload yes: ''Discarded=1''
Sun Aug 16 08:07:30 2020 GMT: file ''/tmp/dumps/assert_20200816100729_4.dmp'', upload yes: ''Discarded=1''
Attaching the directory contents.

NOTE: the real extension is .tar.xz.
factorio-logs.zip (20.16 KiB)

Re: [1.0.0] non-blocking save on Linux crashes or locks up

Posted: Mon Aug 17, 2020 4:56 pm
by movax20h
You need to provide the factorio.log, the save file, and, if you use mods, a directory with all the mods.

I never had issues with non-blocking saves, but some people said it could crash when using some mods.

Re: [1.0.0] non-blocking save on Linux crashes or locks up

Posted: Mon Aug 17, 2020 5:58 pm
by sthalik
factorio-previous.log (33.03 KiB)
mods.tar (8.14 MiB)

NOTE: I've truncated the Alien Biomes zip files due to attachment limit.

Re: [1.0.0] non-blocking save on Linux crashes or locks up

Posted: Wed Aug 19, 2020 8:59 am
by kovarex
Honestly, I would just remove it. It is marked as an experimental feature, and the only thing we gain from it is having to deal with bug reports when it doesn't work.

Re: [1.0.0] non-blocking save on Linux crashes or locks up

Posted: Wed Aug 19, 2020 10:04 am
by sthalik
That's a shame. It's interesting that the feature is causing so many problems to begin with.

Re: [1.0.0] non-blocking save on Linux crashes or locks up

Posted: Wed Aug 19, 2020 4:06 pm
by Rseding91
I'm merging the 3 reports around non-blocking saving and assigning it to Oxyd. If he wants to work on it then he will but otherwise it's labeled experimental for a reason.

Re: [Oxyd] [Linux/Mac] non-blocking save crashes

Posted: Fri Sep 04, 2020 3:19 am
by ssilk
Please don’t remove this feature!

Let me put it this way: this experimental setting is such a relief! I turned it on some time ago, and after half an hour I noticed that I wasn’t being interrupted by saving. It’s a “game-changer”. :)

And as I see it, it’s two bugs: a small one (solution: just block “quit” while it saves), and one that is (to me) unclear.

Re: [Oxyd] [Linux/Mac] non-blocking save crashes

Posted: Sun Sep 13, 2020 7:13 am
by rafasc
ssilk wrote:
Fri Sep 04, 2020 3:19 am
Please don’t remove this feature!
<snip>
And as I see it, it’s two bugs: a small one (solution: just block “quit” while it saves), and one that is (to me) unclear.

This is a super nice feature, but it's one of those where you only value it when it stops working. :)

Please do not discontinue it.

I've been playing with it enabled, hoping to find a way to reproduce the crash. After seeing blueprints syncing to the cloud in the log, I disabled blueprint cloud sync and Steam cloud saves, and after two relatively long sessions I experienced no crashes.

So I suspect this may be something like a race condition, or an unreleased lock from the cloud save logic that prevents the forked process from completing.
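
The general hazard behind this hypothesis can be shown in a few lines: fork() copies only the calling thread, so a lock that some other thread held at fork time stays locked forever in the child. The Python sketch below (POSIX only) demonstrates just that mechanism. The "holder" thread merely stands in for something like a sync thread, and nothing here is taken from Factorio's code or confirms that this is the actual cause.

```python
import os
import threading
import warnings

def child_can_take_lock_after_fork(timeout_s=0.5):
    """Fork while another thread holds a lock; report whether the child got it."""
    lock = threading.Lock()
    held = threading.Event()
    release = threading.Event()

    def holder():
        with lock:           # hypothetical background work in progress
            held.set()
            release.wait()

    t = threading.Thread(target=holder)
    t.start()
    held.wait()              # fork only once the lock is definitely held

    with warnings.catch_warnings():
        # Newer Python versions warn about fork() in a multi-threaded process.
        warnings.simplefilter("ignore", DeprecationWarning)
        pid = os.fork()
    if pid == 0:
        # Child: the holder thread was not copied, so nobody will ever unlock.
        got_it = lock.acquire(timeout=timeout_s)
        os._exit(0 if got_it else 3)

    _, status = os.waitpid(pid, 0)
    release.set()            # let the parent's holder thread finish
    t.join()
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```

The function returns False: the child times out waiting for a lock whose owner does not exist in its process. Without the timeout, the child would hang indefinitely.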

[1.0.0] Non blocking save hangs

Posted: Thu Sep 24, 2020 7:14 am
by ferromagus
Hello,

I'm running Factorio 1.0.0 on a dedicated Debian 10 machine, and the server seems to be stuck during or after saving the world. I should add that I'm using the experimental non_blocking_saving feature. I'm fully aware that experimental means I can't expect support for problems it causes, but I want to provide all the information needed to investigate the server hang and fix it if desired. I personally think that leveraging Linux's CoW features to parallelize world saving is a rather smart thing, and I would love to continue using it. I have a rough idea of how it works, since I first came across the technique in a blog post describing how the Redis in-memory database stores its dataset on disk, and I was well impressed.

The forked process to save the world is in a defunct/zombie state according to the ps output:

Code:

factorio 23322 15.7 36.1 1945168 1462900 pts/1 Ssl+ Sep15 1956:43      \_ /srv/factorio/active/server/current/bin/x64/factorio -c /srv/factorio/active/data/config.ini --start-server /srv/factorio/active/data/saves/world.zip
factorio  9078  0.0  0.0      0     0 pts/1    Z+   06:00   0:04          \_ [factorio] <defunct>
which means that the process has exited (probably successfully), but the parent process has not yet collected the child process's exit status.
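
For context, a `<defunct>` entry goes away as soon as the parent calls one of the wait() functions on it. The Python sketch below (POSIX only) shows a non-blocking check that a main loop could run every tick. `reap_if_finished` and `demo` are illustrative names, not the game's API.

```python
import os
import time

def reap_if_finished(pid):
    """Non-blocking check on a child process.

    Returns None while the child is still running. Once it has ended, reaps it
    and returns its exit code (or the negated signal number if it was killed).
    """
    reaped, status = os.waitpid(pid, os.WNOHANG)
    if reaped == 0:
        return None
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)

def demo():
    pid = os.fork()
    if pid == 0:
        os._exit(7)            # child "finishes saving" right away
    code = None
    while code is None:        # a real main loop would check once per tick
        code = reap_if_finished(pid)
        time.sleep(0.01)
    try:
        os.waitpid(pid, os.WNOHANG)
        still_in_table = True
    except ChildProcessError:
        still_in_table = False  # already reaped: the zombie entry is gone
    return code, still_in_table
```

Until `reap_if_finished` returns a code, an exited child shows up in ps exactly like PID 9078 above.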

The last entry in the log files is

Code:

735548.655 Info AppManager.cpp:404: Saving game as /srv/factorio/active/data/saves/world.zip
735548.659 Info AsyncScenarioSaver.cpp:144: Saving process PID: 9078
I dumped the core of the process as well. Here is an excerpt of the stacktrace of all threads stored in that coredump:

Code:

(gdb) thread apply all where

Thread 10 (Thread 0x7ff11d124700 (LWP 23343)):
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0x7fffba9cf260) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7fffba9cf1f0, cond=0x7fffba9cf238) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7fffba9cf238, mutex=0x7fffba9cf1f0) at pthread_cond_wait.c:655
#3  0x0000000001e3828c in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>)
    at /home/build/gcc-9.2-source/gcc-9.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:865
#4  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:53
#5  0x0000000000cc9654 in WorkerThread::loop () at /tmp/factorio-build-e5O6Xt/src/Util/WorkerThread.cpp:43
#6  0x0000000001ea6730 in std::execute_native_thread_routine (__p=0xb111d30) at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#7  0x00007ff122ab9fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#8  0x00007ff1228624cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 9 (Thread 0x7ff12052b700 (LWP 23342)):
#0  0x00007ff12285a037 in __GI___select (nfds=1, readfds=0x7ff1205296c0, writefds=0x0, exceptfds=0x0, timeout=0x7ff1205296b0) at ../sysdeps/unix/sysv/linux/select.c:41
#1  0x0000000000cdb508 in InterruptibleStdioStream::read () at /tmp/factorio-build-e5O6Xt/src/Util/Streams/InterruptibleStdioStream.cpp:36
#2  0x0000000000cdb6bf in RemoteCommandProcessor::StdStreamInterface::update () at /tmp/factorio-build-e5O6Xt/src/RemoteCommandProcessor.cpp:75
#3  0x0000000001ea6730 in std::execute_native_thread_routine (__p=0xb115ae0) at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#4  0x00007ff122ab9fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#5  0x00007ff1228624cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 8 (Thread 0x7ff11d925700 (LWP 23340)):
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0xa3a7144) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0xa3a70f0, cond=0xa3a7118) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0xa3a7118, mutex=0xa3a70f0) at pthread_cond_wait.c:655
#3  0x0000000001e3828c in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>)
    at /home/build/gcc-9.2-source/gcc-9.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:865
#4  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:53
#5  0x0000000000cf23b9 in TransferSource::sendDataLoop () at /tmp/factorio-build-e5O6Xt/src/Net/TransferSource.cpp:138
#6  0x0000000001ea6730 in std::execute_native_thread_routine (__p=0xb0dc760) at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#7  0x00007ff122ab9fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#8  0x00007ff1228624cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 7 (Thread 0x7ff11f929700 (LWP 23339)):
#0  0x00007ff12285a037 in __GI___select (nfds=10, readfds=0x7ff11f927490, writefds=0x0, exceptfds=0x0, timeout=0x7ff11f927440) at ../sysdeps/unix/sysv/linux/select.c:41
#1  0x0000000000d53897 in UDPSocket::recvfrom () at /tmp/factorio-build-e5O6Xt/src/Net/UDPSocket.cpp:433
#2  0x0000000000d62c65 in TransmissionControlHelper::receive () at /tmp/factorio-build-e5O6Xt/src/Net/TransmissionControlHelper.cpp:108
#3  0x0000000000d632a2 in RouterBase::readPacketsLoop () at /tmp/factorio-build-e5O6Xt/src/Net/RouterBase.cpp:65
#4  0x0000000001ea6730 in std::execute_native_thread_routine (__p=0xb0dcdc0) at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#5  0x00007ff122ab9fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#6  0x00007ff1228624cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 6 (Thread 0x7ff11e126700 (LWP 23337)):
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0x7ff118012180) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7ff118012130, cond=0x7ff118012158) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7ff118012158, mutex=0x7ff118012130) at pthread_cond_wait.c:655
#3  0x0000000001e3828c in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>)
    at /home/build/gcc-9.2-source/gcc-9.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:865
#4  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:53
#5  0x0000000000916d48 in MapGenerationHelper::consumeTasks () at /tmp/factorio-build-e5O6Xt/src/Map/MapGenerationHelper.cpp:149
#6  0x0000000001ea6730 in std::execute_native_thread_routine (__p=0x7ff114c19e30) at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#7  0x00007ff122ab9fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#8  0x00007ff1228624cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 5 (Thread 0x7ff11e927700 (LWP 23336)):
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0x7ff118000dd0) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7ff118000d60, cond=0x7ff118000da8) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7ff118000da8, mutex=0x7ff118000d60) at pthread_cond_wait.c:655
#3  0x0000000001e3828c in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>) at /home/build/gcc-9.2-source/gcc-9.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:865
#4  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:53
#5  0x0000000000cc9654 in WorkerThread::loop () at /tmp/factorio-build-e5O6Xt/src/Util/WorkerThread.cpp:43
#6  0x0000000001ea6730 in std::execute_native_thread_routine (__p=0x7ff118008920) at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#7  0x00007ff122ab9fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#8  0x00007ff1228624cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 4 (Thread 0x7ff11f128700 (LWP 23335)):
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0x7ff118000d00) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7ff118000c90, cond=0x7ff118000cd8) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7ff118000cd8, mutex=0x7ff118000c90) at pthread_cond_wait.c:655
#3  0x0000000001e3828c in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>) at /home/build/gcc-9.2-source/gcc-9.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:865
#4  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:53
#5  0x0000000000cc9654 in WorkerThread::loop () at /tmp/factorio-build-e5O6Xt/src/Util/WorkerThread.cpp:43
#6  0x0000000001ea6730 in std::execute_native_thread_routine (__p=0x7ff118018100) at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#7  0x00007ff122ab9fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#8  0x00007ff1228624cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 3 (Thread 0x7ff121f2e700 (LWP 23328)):
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0x47389b0) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x4738940, cond=0x4738988) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x4738988, mutex=0x4738940) at pthread_cond_wait.c:655
#3  0x0000000001e3828c in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>) at /home/build/gcc-9.2-source/gcc-9.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:865
#4  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:53
#5  0x0000000000cc9654 in WorkerThread::loop () at /tmp/factorio-build-e5O6Xt/src/Util/WorkerThread.cpp:43
#6  0x0000000001ea6730 in std::execute_native_thread_routine (__p=0x4750560) at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#7  0x00007ff122ab9fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#8  0x00007ff1228624cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 2 (Thread 0x7ff122762700 (LWP 23327)):
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0x46df360) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x46df310, cond=0x46df338) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x46df338, mutex=0x46df310) at pthread_cond_wait.c:655
#3  0x0000000001e3828c in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>) at /home/build/gcc-9.2-source/gcc-9.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:865
#4  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:53
#5  0x0000000000845d73 in TaskManager::run () at /tmp/factorio-build-e5O6Xt/src/Util/TaskManager.cpp:65
#6  0x0000000001ea6730 in std::execute_native_thread_routine (__p=0x46eec10) at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#7  0x00007ff122ab9fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#8  0x00007ff1228624cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7ff122764bc0 (LWP 23322)):
#0  __libc_read (nbytes=4, buf=0x7fffba9cef08, fd=11) at ../sysdeps/unix/sysv/linux/read.c:26
#1  __libc_read (fd=11, buf=0x7fffba9cef08, nbytes=4) at ../sysdeps/unix/sysv/linux/read.c:24
#2  0x0000000000b1ce3d in ChildProcessAgent::readPipe () at /tmp/factorio-build-e5O6Xt/src/ChildProcessAgent.cpp:195
#3  0x0000000000b1d27b in ChildProcessAgent::getError[abi:cxx11](int) () at /tmp/factorio-build-e5O6Xt/src/ChildProcessAgent.cpp:155
#4  0x0000000000cd755d in AsyncScenarioSaver::update () at /tmp/factorio-build-e5O6Xt/src/Scenario/AsyncScenarioSaver.cpp:177
#5  0x000000000125292e in MainLoop::gameUpdateLoop () at /tmp/factorio-build-e5O6Xt/src/MainLoop.cpp:1037
#6  0x0000000001276429 in MainLoop::mainLoopStepHeadless () at /tmp/factorio-build-e5O6Xt/src/MainLoop.cpp:566
#7  0x0000000001276f93 in MainLoop::run(Filesystem::Path const&, Filesystem::Path const&, bool, bool, std::function<void ()>, Filesystem::Path const&, MainLoop::HeavyMode) () at /tmp/factorio-build-e5O6Xt/src/MainLoop.cpp:374
#8  0x0000000001278df0 in hostMultiplayerGameInternal () at /tmp/factorio-build-e5O6Xt/src/CommandLineMultiplayer.cpp:284
#9  0x0000000001279d0f in CommandLineMultiplayer::hostCommandLineMultiplayerGame () at /tmp/factorio-build-e5O6Xt/src/CommandLineMultiplayer.cpp:340
#10 0x00000000005bcf1d in main () at /tmp/factorio-build-e5O6Xt/src/Main.cpp:639
I'm not going to post the core dump publicly, since a core dump can contain all sorts of data, but if the game developers would like it in order to investigate the hang further, I am more than willing to provide it through a private channel.
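
Thread 1 in the trace above is sitting in a blocking read() of 4 bytes on fd 11, called from ChildProcessAgent::readPipe inside the game update loop. As a general technique, and not a claim about what the actual fix should be, a loop can avoid blocking on a pipe by asking select() whether the descriptor is readable first. A minimal Python sketch with made-up function names:

```python
import os
import select

def read_pipe_nonblocking(fd, nbytes, timeout_s=0.0):
    """Return up to nbytes from fd if data or EOF is ready within timeout_s.

    Returns None when nothing is ready yet, and b"" once every write end of
    the pipe has been closed (end of file).
    """
    ready, _, _ = select.select([fd], [], [], timeout_s)
    if not ready:
        return None
    return os.read(fd, nbytes)

def demo():
    r, w = os.pipe()
    first = read_pipe_nonblocking(r, 4)   # nothing written yet
    os.write(w, b"\x00\x00\x00\x2a")
    second = read_pipe_nonblocking(r, 4)  # the four bytes just written
    os.close(w)
    third = read_pipe_nonblocking(r, 4)   # all writers closed: EOF
    os.close(r)
    return first, second, third
```

With a zero timeout, the call returns immediately in all three cases, so the caller can simply try again on the next tick.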

Re: [1.0.0] Non blocking save hangs

Posted: Thu Sep 24, 2020 1:02 pm
by kovarex
We will fix it by removing the non-blocking save feature. It is just trouble, works only on Linux, and is not worth it.

Re: [Oxyd] [Linux/Mac] non-blocking save crashes

Posted: Fri Sep 25, 2020 5:30 am
by ferromagus
kovarex wrote:
Thu Sep 24, 2020 1:02 pm
We will fix it by removing the non-blocking save feature. It is just trouble, works only on Linux, and is not worth it.

That's a big shame. I really loved that feature for multiple reasons. It allows saving the game without interrupting the player, so it doesn't interfere with playability on the server. As a software engineer myself, I also find it really cool from a technical perspective. I think we need to look into more ways in which copy-on-write can be useful for solving tasks. File systems are just one field, and persisting state in an application might be another. Like I said before, Redis takes a very similar approach to persisting the hot data set without blocking the server's main loop <https://github.com/redis/redis/blob/323 ... db.c#L1382>. Are you guys sure you can't revise the implementation? It works rather reliably for Redis as well.

Sorry for nagging so much. If that feature is too much of a maintenance burden, then it can't be helped, I guess. I really love the game and I loved every minute I spent on it. So thank you guys for allowing me to have this kind of fun. :D