[Oxyd] [Linux/Mac] non-blocking save crashes

Bugs that we were not able to reproduce, and/or are waiting for more detailed info.
Cromefire_
Manual Inserter
Posts: 1
Joined: Tue Dec 24, 2019 3:17 pm

[Oxyd] [Linux/Mac] non-blocking save crashes

Post by Cromefire_ »

I've enabled async (non-blocking) saves. If I save the game manually while it's autosaving, whether by accident or because the autosave is stuck (which is a separate issue), the game crashes, because only one instance of the async save class is allowed.
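
A minimal Python sketch of the kind of guard that would turn the second save request into a rejection (or a queued request) rather than a crash. This is not Factorio's code: the class name `AsyncSaveGuard` and its methods are hypothetical, and the only assumption taken from the report is that at most one background save may run at a time.

```python
import threading


class AsyncSaveGuard:
    """Hypothetical guard that allows at most one background save at a time."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = None  # name of the save currently running, if any

    def try_begin(self, name):
        """Mark `name` as running and return True, or return False if a save is active."""
        with self._lock:
            if self._active is not None:
                return False
            self._active = name
            return True

    def finish(self):
        """Mark the running save as finished so the next request can start."""
        with self._lock:
            self._active = None

    @property
    def active(self):
        """Name of the save in flight, or None."""
        with self._lock:
            return self._active
```

A manual save that arrives while `try_begin("_autosave1")` is still active would get `False` back and could show a "save already in progress" message.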
Attachments
factorio-current.log
(69.12 KiB) Downloaded 166 times

triffid_hunter
Inserter
Posts: 41
Joined: Wed Dec 14, 2016 7:33 am

[0.18.0] Crash to Desktop (save process stalled with non-blocking save feature)

Post by triffid_hunter »

Hi Wube team,

Just had an interesting and rare occurrence - factorio crashed!

I think this is my first crash since I bought the game circa 0.14, so you're doing an excellent job regarding stability!

I was poking around in single player with a few minor mods (LTN, Whistle Stop, Mining Drones, Squeakthrough, Beautiful Rail Bridges, Lighted power poles, Noxys waterfill, couple others perhaps - see the attached saves).

It tried to autosave as it has done successfully many many times (including in this session and on this map), but this specific time the dialog remained visible for several minutes with the progress bar at 100%.

I've no idea why it stalled this time, I had 8G of RAM available (from 16G total), and 35G free on the disk where I have my factorio saves.

The game kept running for a while (I have non-blocking save enabled) then crashed.

There's nothing interesting in the log, but I've attached it anyway since I know you've plenty of reason to not believe folks when they say there's nothing interesting in there.

Nothing interesting in dmesg either, last message there is from several hours ago.

I've also attached the save file it was *trying* to write, as well as the previous autosave which wrote successfully. They're time-stamped 5 minutes apart.

My guess is that some unknown and unlogged issue caused the save process to stall, then (as a result of non-blocking save) it tried to autosave *again*, and barfed on the previous autosave process still running.

If that's an accurate assessment, the main problem is that autosave stalled somehow.

Suggestion: if non-blocking autosave stalls, perhaps detect this and at least offer to foreground-save before dying?
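
The suggestion above (detect a stalled background save and fall back) could be sketched in Python roughly as follows. This is an illustration, not Factorio's implementation: `spawn_sleeper` is a hypothetical stand-in for the forked save process, the timeout values are arbitrary, and the code assumes a POSIX system with Python 3.9+ (for `os.waitstatus_to_exitcode`).

```python
import os
import signal
import time


def spawn_sleeper(seconds):
    """Hypothetical stand-in for the forked save process: sleeps, then exits 0."""
    pid = os.fork()
    if pid == 0:
        time.sleep(seconds)
        os._exit(0)
    return pid


def wait_for_child(pid, timeout_s, poll_s=0.05):
    """Poll a child until it exits or `timeout_s` elapses.

    Returns the exit code, or None if the child is still running at the deadline.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        reaped, status = os.waitpid(pid, os.WNOHANG)  # never blocks
        if reaped == pid:
            return os.waitstatus_to_exitcode(status)
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_s)


def abandon_child(pid):
    """Kill a stalled child and reap it so it does not linger as a zombie."""
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)
```

When `wait_for_child` returns `None`, the caller could call `abandon_child` and then offer an ordinary blocking save.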

System specifics:

Linux Mint 18.3 (Sylvia) (essentially Ubuntu xenial), kernel 4.15.0-64-generic #73~16.04.1-Ubuntu
AMD 1500X Ryzen 3 CPU
nVidia GTX1050Ti, 396.54.09 driver
16GB DDR4-2400 RAM (12G free after factorio crashed, ~8G free while the autosave was stalling with each factorio process taking ~2G)
240G SSD (35G free) + 1TB spinning rust (not used for factorio)

PS: I love the non-blocking save feature, it's a truly genius move to offload game state snapshotting to the kernel's CoW fork mechanism! :D
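
For readers unfamiliar with the CoW fork trick mentioned in the PS, here is a minimal Python sketch of the general idea on a POSIX system. It is not how Factorio serializes its map: `snapshot_save`, the JSON format, and the file names are assumptions made for illustration.

```python
import json
import os
import tempfile


def snapshot_save(state, path):
    """Fork; the child writes its copy-on-write view of `state` to `path`.

    Returns the child's PID. The parent can keep mutating `state` right away.
    """
    pid = os.fork()
    if pid == 0:
        # Child: sees memory exactly as it was at fork time.
        code = 1
        try:
            tmp = path + ".tmp"
            with open(tmp, "w") as f:
                json.dump(state, f)
            os.replace(tmp, path)  # publish only once the file is complete
            code = 0
        finally:
            os._exit(code)  # never return into the parent's code path
    return pid


def demo():
    """Save at tick 100 while the parent advances to tick 101."""
    state = {"tick": 100}
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "autosave.json")
        pid = snapshot_save(state, path)
        state["tick"] = 101   # the game keeps running in the parent
        os.waitpid(pid, 0)    # reap the child; a real game loop would poll instead
        with open(path) as f:
            return json.load(f)["tick"], state["tick"]
```

`demo()` returns `(100, 101)`: the file holds the state as of the fork, while the parent has already moved on.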
Attachments
_autosave3.zip
This is the second-to-last save in the same session and map, which had zero issues.
(8.15 MiB) Downloaded 145 times
_autosave1.tmp.zip
This is the latest save, I guess it was in the process of writing it when it crashed. unzip says it's corrupt or incomplete.
(8.13 MiB) Downloaded 148 times
factorio-current.log
log
(59.74 KiB) Downloaded 176 times

TruePikachu
Filter Inserter
Posts: 978
Joined: Sat Apr 09, 2016 8:39 pm

Re: [0.18.0] Crash to Desktop (save process stalled with non-blocking save feature)

Post by TruePikachu »

41635.350 Info AppManager.cpp:268: Saving to _autosave1 (non-blocking).
41635.591 Info AsyncScenarioSaver.cpp:144: Saving process PID: 27878
41636.135 [27878] Info BlueprintShelf.cpp:691: Saving blueprint storage.
[NB: EOF]

Is there a core dump available?

movax20h
Fast Inserter
Posts: 164
Joined: Fri Mar 08, 2019 7:07 pm

Re: [0.18.0] Crash to Desktop (save process stalled with non-blocking save feature)

Post by movax20h »

I can't reproduce a crash with no mods. Non-blocking autosaving works flawlessly for me. Could you share your mods, as a single zip with all the required mods?

Could you reproduce it in any way, even only just sometimes?

ssilk
Global Moderator
Posts: 12889
Joined: Tue Apr 16, 2013 10:35 pm

[0.18.15] crash when saving while ending game; when using non-blocking-saving

Post by ssilk »

Hi, this crash happened when I quit a game at the same moment the game was saving. The game is still saved, but afterwards it crashes. This is only possible with non-blocking saving turned on. I would understand if this is not fixed, because it is an experimental setting that is normally off.

3857.338 Info AppManager.cpp:278: Saving finished
4013.722 Info AppManagerStates.cpp:1747: Saving game as /Users/ssilk/Library/Application Support/factorio/saves/World50.030 (non-blocking)
4014.398 Info AsyncScenarioSaver.cpp:144: Saving process PID: 16188
4014.400 [16188] Verbose Scenario.cpp:835: Saving game as /Users/ssilk/Library/Application Support/factorio/saves/World50.030
4058.735 [16188] Verbose Scenario.cpp:951: Time to save game: 44.3352
4059.057 Info ChildProcessAgent.cpp:60: Child 16188 exited with return value 0
4059.057 Info AppManagerStates.cpp:1748: Saving finished
4059.077 Time travel logging:
  23.007 Popped blueprint record (player-index: 0, ID: 6) from book (player-index: 0, ID: 8)
  23.007 Popped blueprint record (player-index: 0, ID: 21) from book (player-index: 0, ID: 31)
  23.007 Popped blueprint record (player-index: 0, ID: 30) from book (player-index: 0, ID: 31)
Factorio crashed. Generating symbolized stacktrace, please wait ...
#1  0x0000000105bcd442 in Logger::logStacktrace(StackTraceInfo*) + 0x12
#2  0x00000001051e8589 in CrashHandler::writeStackTrace(CrashHandler::CrashReason) + 0xb9
#3  0x0000000105bb1744 in CrashHandler::commonSignalHandler(int) + 0x74
#4  0x0000000105bb0c39 in CrashHandler::SignalHandler(int) + 0x9
#5  0x00007fff6a92f42d in _sigtramp + 0x1d
Stack trace logging done
4059.197 Error Util.cpp:97: Unexpected error occurred. If you're running the latest version of the game you can help us solve the problem by posting the contents of the log file on the Factorio forums.
Please also include the save file(s), any mods you may be using, and any steps you know of to reproduce the crash.
4074.889 Uploading log file
4074.953 Info SystemUtil.cpp:547: Started /Applications/factorio.app/Contents/MacOS/factorio; trampoline PID: 16199
4074.953 Error CrashHandler.cpp:591: Unhandled exception type: NSt3__117bad_function_callE
4074.954 Error CrashHandler.cpp:598: Unhandled exception: std::exception
Cool suggestion: Eatable MOUSE-pointers.
Have you used the Advanced Search today?
Need help, question? FAQ - Wiki - Forum help
I still like small signatures...

Klonan
Factorio Staff
Posts: 5246
Joined: Sun Jan 11, 2015 2:09 pm

Re: [0.18.15] crash when saving while ending game

Post by Klonan »

Can you provide the save game?

ssilk
Global Moderator
Posts: 12889
Joined: Tue Apr 16, 2013 10:35 pm

Re: [0.18.15] crash when saving while ending game

Post by ssilk »

No (and as I said, the save itself worked fine, but it has since been overwritten), but I’ll try to reproduce it with the current version.

Maybe this helps: I think I was already in the menu and didn’t notice that the game had begun to save (it takes about 20 seconds on my map) when I pressed quit.

ssilk
Global Moderator
Posts: 12889
Joined: Tue Apr 16, 2013 10:35 pm

Re: [0.18.15] crash when saving while ending game

Post by ssilk »

Simple to reproduce:
In game, go into the menu and save (note again: non-blocking saving is on!). While it is saving, press quit. I know that this must have been introduced somewhere before 0.18.15, because I remember that I tried that in January/February.

triffid_hunter
Inserter
Posts: 41
Joined: Wed Dec 14, 2016 7:33 am

Re: [0.18.0] Crash to Desktop (save process stalled with non-blocking save feature)

Post by triffid_hunter »

Sorry, I haven't been able to reproduce it, and unless Factorio saves core dumps by default somewhere that I'm not aware of, I didn't get one.

sthalik
Long Handed Inserter
Posts: 56
Joined: Tue May 01, 2018 9:32 am

[1.0.0] non-blocking save on Linux crashes or locks up

Post by sthalik »

The first time, autosave produced a message that the save process crashed. The second time, it locked up in the background and I was unable to save the game manually. There was a message in syslog and a bunch of files left in /tmp/dumps:

Aug 16 10:07:29 burzum.local crash_20200816100729_2.dmp[8108]: Uploading dump (out-of-process)
                                                               /tmp/dumps/crash_20200816100729_2.dmp
Aug 16 10:07:29 burzum.local assert_20200816100729_4.dmp[8112]: Uploading dump (out-of-process)
                                                                /tmp/dumps/assert_20200816100729_4.dmp
Aug 16 10:07:30 burzum.local crash_20200816100729_2.dmp[8108]: Finished uploading minidump (out-of-process): success = yes
Aug 16 10:07:30 burzum.local crash_20200816100729_2.dmp[8108]: response: Discarded=1
Aug 16 10:07:30 burzum.local crash_20200816100729_2.dmp[8108]: file ''/tmp/dumps/crash_20200816100729_2.dmp'', upload yes: ''Discarded=1''
Aug 16 10:07:30 burzum.local assert_20200816100729_4.dmp[8112]: Finished uploading minidump (out-of-process): success = yes
Aug 16 10:07:30 burzum.local assert_20200816100729_4.dmp[8112]: response: Discarded=1
Aug 16 10:07:30 burzum.local assert_20200816100729_4.dmp[8112]: file ''/tmp/dumps/assert_20200816100729_4.dmp'', upload yes: ''Discarded=1''

-rw-r----- 1 sthalik wheel  213 Aug 16 10:07 sthalik_log.txt
/tmp/dumps % file *
assert_20200816100729_4.dmp: Mini DuMP crash report, 14 streams, Sun Aug 16 08:07:29 2020, 0x0 type
crash_20200816100729_2.dmp:  Mini DuMP crash report, 14 streams, Sun Aug 16 08:07:29 2020, 0x0 type
sthalik_log.txt:             ASCII text
/tmp/dumps % cat sthalik_log.txt 
Sun Aug 16 08:07:30 2020 GMT: file ''/tmp/dumps/crash_20200816100729_2.dmp'', upload yes: ''Discarded=1''
Sun Aug 16 08:07:30 2020 GMT: file ''/tmp/dumps/assert_20200816100729_4.dmp'', upload yes: ''Discarded=1''
Attaching the directory contents.

NOTE: the real extension is .tar.xz.
factorio-logs.zip
(20.16 KiB) Downloaded 140 times

movax20h
Fast Inserter
Posts: 164
Joined: Fri Mar 08, 2019 7:07 pm

Re: [1.0.0] non-blocking save on Linux crashes or locks up

Post by movax20h »

You need to provide the factorio.log, save file and if you use mods, a directory with all the mods.

I never had issues with non-blocking saves, but some people said it could crash when using some mods.

sthalik
Long Handed Inserter
Posts: 56
Joined: Tue May 01, 2018 9:32 am

Re: [1.0.0] non-blocking save on Linux crashes or locks up

Post by sthalik »

factorio-previous.log
(33.03 KiB) Downloaded 151 times
mods.tar
(8.14 MiB) Downloaded 135 times

NOTE: I've truncated the Alien Biomes zip files due to attachment limit.

kovarex
Factorio Staff
Posts: 8195
Joined: Wed Feb 06, 2013 12:00 am

Re: [1.0.0] non-blocking save on Linux crashes or locks up

Post by kovarex »

Honestly, I would just remove it. It is marked as an experimental feature, and all we gain from it is having to deal with bug reports when it doesn't work.

sthalik
Long Handed Inserter
Posts: 56
Joined: Tue May 01, 2018 9:32 am

Re: [1.0.0] non-blocking save on Linux crashes or locks up

Post by sthalik »

That's a shame. It's interesting that the feature is causing so many problems to begin with.

Rseding91
Factorio Staff
Posts: 14152
Joined: Wed Jun 11, 2014 5:23 am

Re: [1.0.0] non-blocking save on Linux crashes or locks up

Post by Rseding91 »

I'm merging the 3 reports around non-blocking saving and assigning it to Oxyd. If he wants to work on it then he will but otherwise it's labeled experimental for a reason.
If you want to get ahold of me I'm almost always on Discord.

ssilk
Global Moderator
Posts: 12889
Joined: Tue Apr 16, 2013 10:35 pm

Re: [Oxyd] [Linux/Mac] non-blocking save crashes

Post by ssilk »

Please don’t remove this feature!

Let me put it this way: this experimental setting is such a relief! I turned it on some time ago, and after half an hour I realized that I wasn’t being interrupted by saving. It’s a “game-changer”. :)

And as I see it, it’s two bugs: a small one (solution: just block “quit” while it saves), and one that is (to me) unclear.

rafasc
Manual Inserter
Posts: 4
Joined: Tue Aug 25, 2020 4:47 pm

Re: [Oxyd] [Linux/Mac] non-blocking save crashes

Post by rafasc »

ssilk wrote:
Fri Sep 04, 2020 3:19 am
Please don’t remove this feature!
<snip>
And as I see it, it’s two bugs: a small one (Solution: just block “quit” while it saves), and an (for me) unclear one.
This is a super nice feature, but it's one of those where you only value it when it stops working. :)

Please do not discontinue it.

I've been playing with it enabled, hoping to find a way to reproduce it. After seeing blueprints syncing to the cloud in the log, I disabled blueprint cloud sync and Steam cloud saves, and after two relatively long sessions I experienced no crashes.

So I suspect this may be something like a race condition, or an unreleased lock from the cloud save logic that prevents the forked process from completing.
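
The general hazard behind that suspicion can be shown with a small Python sketch: `fork()` copies only the calling thread, so a lock held by any other thread at fork time stays locked in the child with nobody left to release it. Nothing here is taken from Factorio; the `holder` thread merely stands in for some background worker (for example a sync thread), and recent Python versions may warn about forking a multi-threaded process for exactly this reason.

```python
import os
import threading
import time


def child_can_take_lock(hold_s=0.5, wait_s=0.2):
    """Fork while a helper thread holds a lock; report whether the child can acquire it."""
    lock = threading.Lock()
    held = threading.Event()

    def holder():
        with lock:
            held.set()
            time.sleep(hold_s)  # stands in for work done while holding the lock

    t = threading.Thread(target=holder)
    t.start()
    held.wait()  # make sure the lock is held before forking

    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread exists here, so the copied lock is
        # never released. A timeout avoids blocking forever in this demo.
        ok = lock.acquire(timeout=wait_s)
        os._exit(0 if ok else 1)

    _, status = os.waitpid(pid, 0)
    t.join()
    return os.waitstatus_to_exitcode(status) == 0
```

`child_can_take_lock()` returns `False`. Without the timeout, the child would simply hang, which is consistent with (but does not prove) a save process that never finishes.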

ferromagus
Manual Inserter
Posts: 3
Joined: Thu Sep 24, 2020 6:55 am

[1.0.0] Non blocking save hangs

Post by ferromagus »

Hello,

I'm running Factorio 1.0.0 on a dedicated Debian 10 machine, and the server seems to be stuck during or after saving the world. I should add that I'm using the experimental non_blocking_saving feature. I'm fully aware that experimental means I can't expect any support for problems it causes, but I want to provide all the information needed to investigate the server hang and fix it, if desired. I personally think that leveraging Linux's CoW features to parallelize world-saving is a rather smart thing, and I would love to continue using it. I roughly know how this works because I came across the technique in a blog post describing how the Redis in-memory database stores its dataset on disk, and I was well impressed.

The forked process to save the world is in a defunct/zombie state according to the ps output:

factorio 23322 15.7 36.1 1945168 1462900 pts/1 Ssl+ Sep15 1956:43      \_ /srv/factorio/active/server/current/bin/x64/factorio -c /srv/factorio/active/data/config.ini --start-server /srv/factorio/active/data/saves/world.zip
factorio  9078  0.0  0.0      0     0 pts/1    Z+   06:00   0:04          \_ [factorio] <defunct>
which probably means that the process has exited but the parent process never collected (reaped) the child's exit status.
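
The defunct state can be reproduced with a few lines of Python on Linux. This is only a sketch of the general mechanism: `proc_state` and `zombie_then_reap` are hypothetical helper names, and reading `/proc/<pid>/stat` is Linux-specific.

```python
import os
import time


def proc_state(pid):
    """Return the one-letter process state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        # The state letter is the first field after the parenthesised command name.
        return f.read().rsplit(")", 1)[1].split()[0]


def zombie_then_reap():
    """Show that an exited child stays a zombie until the parent waits for it."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # the child finishes immediately

    # Until the parent waits, the exited child lingers in state 'Z'.
    deadline = time.monotonic() + 5.0
    while proc_state(pid) != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
    state_before = proc_state(pid)

    _, status = os.waitpid(pid, 0)  # reaping removes the process entry
    still_listed = os.path.exists(f"/proc/{pid}")
    return state_before, os.waitstatus_to_exitcode(status), still_listed
```

`zombie_then_reap()` returns `("Z", 0, False)`: the child shows up as a zombie until the `waitpid` call, which is the step the parent in the `ps` output above apparently never reached.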

The last entry in the log files is

735548.655 Info AppManager.cpp:404: Saving game as /srv/factorio/active/data/saves/world.zip
735548.659 Info AsyncScenarioSaver.cpp:144: Saving process PID: 9078
I dumped the core of the process as well. Here is an excerpt of the stacktrace of all threads stored in that coredump:

(gdb) thread apply all where

Thread 10 (Thread 0x7ff11d124700 (LWP 23343)):
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0x7fffba9cf260) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7fffba9cf1f0, cond=0x7fffba9cf238) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7fffba9cf238, mutex=0x7fffba9cf1f0) at pthread_cond_wait.c:655
#3  0x0000000001e3828c in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>)
    at /home/build/gcc-9.2-source/gcc-9.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:865
#4  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:53
#5  0x0000000000cc9654 in WorkerThread::loop () at /tmp/factorio-build-e5O6Xt/src/Util/WorkerThread.cpp:43
#6  0x0000000001ea6730 in std::execute_native_thread_routine (__p=0xb111d30) at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#7  0x00007ff122ab9fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#8  0x00007ff1228624cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 9 (Thread 0x7ff12052b700 (LWP 23342)):
#0  0x00007ff12285a037 in __GI___select (nfds=1, readfds=0x7ff1205296c0, writefds=0x0, exceptfds=0x0, timeout=0x7ff1205296b0) at ../sysdeps/unix/sysv/linux/select.c:41
#1  0x0000000000cdb508 in InterruptibleStdioStream::read () at /tmp/factorio-build-e5O6Xt/src/Util/Streams/InterruptibleStdioStream.cpp:36
#2  0x0000000000cdb6bf in RemoteCommandProcessor::StdStreamInterface::update () at /tmp/factorio-build-e5O6Xt/src/RemoteCommandProcessor.cpp:75
#3  0x0000000001ea6730 in std::execute_native_thread_routine (__p=0xb115ae0) at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#4  0x00007ff122ab9fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#5  0x00007ff1228624cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 8 (Thread 0x7ff11d925700 (LWP 23340)):
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0xa3a7144) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0xa3a70f0, cond=0xa3a7118) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0xa3a7118, mutex=0xa3a70f0) at pthread_cond_wait.c:655
#3  0x0000000001e3828c in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>)
    at /home/build/gcc-9.2-source/gcc-9.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:865
#4  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:53
#5  0x0000000000cf23b9 in TransferSource::sendDataLoop () at /tmp/factorio-build-e5O6Xt/src/Net/TransferSource.cpp:138
#6  0x0000000001ea6730 in std::execute_native_thread_routine (__p=0xb0dc760) at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#7  0x00007ff122ab9fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#8  0x00007ff1228624cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 7 (Thread 0x7ff11f929700 (LWP 23339)):
#0  0x00007ff12285a037 in __GI___select (nfds=10, readfds=0x7ff11f927490, writefds=0x0, exceptfds=0x0, timeout=0x7ff11f927440) at ../sysdeps/unix/sysv/linux/select.c:41
#1  0x0000000000d53897 in UDPSocket::recvfrom () at /tmp/factorio-build-e5O6Xt/src/Net/UDPSocket.cpp:433
#2  0x0000000000d62c65 in TransmissionControlHelper::receive () at /tmp/factorio-build-e5O6Xt/src/Net/TransmissionControlHelper.cpp:108
#3  0x0000000000d632a2 in RouterBase::readPacketsLoop () at /tmp/factorio-build-e5O6Xt/src/Net/RouterBase.cpp:65
#4  0x0000000001ea6730 in std::execute_native_thread_routine (__p=0xb0dcdc0) at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#5  0x00007ff122ab9fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#6  0x00007ff1228624cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 6 (Thread 0x7ff11e126700 (LWP 23337)):
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0x7ff118012180) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7ff118012130, cond=0x7ff118012158) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7ff118012158, mutex=0x7ff118012130) at pthread_cond_wait.c:655
#3  0x0000000001e3828c in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>)
    at /home/build/gcc-9.2-source/gcc-9.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:865
#4  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:53
#5  0x0000000000916d48 in MapGenerationHelper::consumeTasks () at /tmp/factorio-build-e5O6Xt/src/Map/MapGenerationHelper.cpp:149
#6  0x0000000001ea6730 in std::execute_native_thread_routine (__p=0x7ff114c19e30) at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#7  0x00007ff122ab9fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#8  0x00007ff1228624cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 5 (Thread 0x7ff11e927700 (LWP 23336)):
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0x7ff118000dd0) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7ff118000d60, cond=0x7ff118000da8) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7ff118000da8, mutex=0x7ff118000d60) at pthread_cond_wait.c:655
#3  0x0000000001e3828c in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>) at /home/build/gcc-9.2-source/gcc-9.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:865
#4  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:53
#5  0x0000000000cc9654 in WorkerThread::loop () at /tmp/factorio-build-e5O6Xt/src/Util/WorkerThread.cpp:43
#6  0x0000000001ea6730 in std::execute_native_thread_routine (__p=0x7ff118008920) at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#7  0x00007ff122ab9fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#8  0x00007ff1228624cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 4 (Thread 0x7ff11f128700 (LWP 23335)):
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0x7ff118000d00) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7ff118000c90, cond=0x7ff118000cd8) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7ff118000cd8, mutex=0x7ff118000c90) at pthread_cond_wait.c:655
#3  0x0000000001e3828c in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>) at /home/build/gcc-9.2-source/gcc-9.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:865
#4  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:53
#5  0x0000000000cc9654 in WorkerThread::loop () at /tmp/factorio-build-e5O6Xt/src/Util/WorkerThread.cpp:43
#6  0x0000000001ea6730 in std::execute_native_thread_routine (__p=0x7ff118018100) at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#7  0x00007ff122ab9fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#8  0x00007ff1228624cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 3 (Thread 0x7ff121f2e700 (LWP 23328)):
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0x47389b0) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x4738940, cond=0x4738988) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x4738988, mutex=0x4738940) at pthread_cond_wait.c:655
#3  0x0000000001e3828c in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>) at /home/build/gcc-9.2-source/gcc-9.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:865
#4  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:53
#5  0x0000000000cc9654 in WorkerThread::loop () at /tmp/factorio-build-e5O6Xt/src/Util/WorkerThread.cpp:43
#6  0x0000000001ea6730 in std::execute_native_thread_routine (__p=0x4750560) at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#7  0x00007ff122ab9fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#8  0x00007ff1228624cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 2 (Thread 0x7ff122762700 (LWP 23327)):
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0x46df360) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x46df310, cond=0x46df338) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x46df338, mutex=0x46df310) at pthread_cond_wait.c:655
#3  0x0000000001e3828c in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>) at /home/build/gcc-9.2-source/gcc-9.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:865
#4  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:53
#5  0x0000000000845d73 in TaskManager::run () at /tmp/factorio-build-e5O6Xt/src/Util/TaskManager.cpp:65
#6  0x0000000001ea6730 in std::execute_native_thread_routine (__p=0x46eec10) at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#7  0x00007ff122ab9fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#8  0x00007ff1228624cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7ff122764bc0 (LWP 23322)):
#0  __libc_read (nbytes=4, buf=0x7fffba9cef08, fd=11) at ../sysdeps/unix/sysv/linux/read.c:26
#1  __libc_read (fd=11, buf=0x7fffba9cef08, nbytes=4) at ../sysdeps/unix/sysv/linux/read.c:24
#2  0x0000000000b1ce3d in ChildProcessAgent::readPipe () at /tmp/factorio-build-e5O6Xt/src/ChildProcessAgent.cpp:195
#3  0x0000000000b1d27b in ChildProcessAgent::getError[abi:cxx11](int) () at /tmp/factorio-build-e5O6Xt/src/ChildProcessAgent.cpp:155
#4  0x0000000000cd755d in AsyncScenarioSaver::update () at /tmp/factorio-build-e5O6Xt/src/Scenario/AsyncScenarioSaver.cpp:177
#5  0x000000000125292e in MainLoop::gameUpdateLoop () at /tmp/factorio-build-e5O6Xt/src/MainLoop.cpp:1037
#6  0x0000000001276429 in MainLoop::mainLoopStepHeadless () at /tmp/factorio-build-e5O6Xt/src/MainLoop.cpp:566
#7  0x0000000001276f93 in MainLoop::run(Filesystem::Path const&, Filesystem::Path const&, bool, bool, std::function<void ()>, Filesystem::Path const&, MainLoop::HeavyMode) () at /tmp/factorio-build-e5O6Xt/src/MainLoop.cpp:374
#8  0x0000000001278df0 in hostMultiplayerGameInternal () at /tmp/factorio-build-e5O6Xt/src/CommandLineMultiplayer.cpp:284
#9  0x0000000001279d0f in CommandLineMultiplayer::hostCommandLineMultiplayerGame () at /tmp/factorio-build-e5O6Xt/src/CommandLineMultiplayer.cpp:340
#10 0x00000000005bcf1d in main () at /tmp/factorio-build-e5O6Xt/src/Main.cpp:639
I'm not going to post the core dump publicly, since a core dump can contain all sorts of sensitive data, but if the game developers would like to receive it to investigate the hang further, I would be more than willing to provide it through a private channel.
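
In the trace, Thread 1 (the main loop) is sitting in `__libc_read` with `nbytes=4` on `fd=11`, reached from `ChildProcessAgent::readPipe` via `AsyncScenarioSaver::update`. That is consistent with a blocking read on a pipe that never delivers data or EOF, although the trace alone does not confirm which pipe fd 11 is. One common way to keep a main loop responsive in such a situation is to poll the descriptor with a timeout before reading. The Python sketch below illustrates that pattern only; `read_with_timeout` is a hypothetical helper, not part of Factorio.

```python
import os
import select


def read_with_timeout(fd, nbytes, timeout_s):
    """Read up to `nbytes` from `fd`, or return None if nothing arrives in time."""
    ready, _, _ = select.select([fd], [], [], timeout_s)
    if not ready:
        return None  # caller can go back to its main loop and retry later
    return os.read(fd, nbytes)  # b"" means every writer has closed the pipe
```

A return value of `None` lets the caller distinguish "child still busy" from `b""` (pipe closed), so a stalled child delays the error report instead of freezing the server.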

kovarex
Factorio Staff
Posts: 8195
Joined: Wed Feb 06, 2013 12:00 am

Re: [1.0.0] Non blocking save hangs

Post by kovarex »

We will fix it by removing the non-blocking save feature, it is just trouble that works only on linux and is not worth it.

ferromagus
Manual Inserter
Posts: 3
Joined: Thu Sep 24, 2020 6:55 am

Re: [Oxyd] [Linux/Mac] non-blocking save crashes

Post by ferromagus »

kovarex wrote:
Thu Sep 24, 2020 1:02 pm
We will fix it by removing the non-blocking save feature, it is just trouble that works only on linux and is not worth it.
That's a big shame. I really loved that feature, for multiple reasons. It allows saving the game without interrupting the player, so it doesn't interfere with playability on the server. It's also really cool from a technical perspective, in my opinion as a software engineer myself. I think we need to look into more ways Copy-On-Write can be useful in solving tasks. File systems are just one field, and persisting application state might be another. Like I said before, Redis uses a very similar approach to persist the hot data set without blocking the server's main loop <https://github.com/redis/redis/blob/323 ... db.c#L1382>. Are you guys sure you can't revise the implementation? It works rather reliably for Redis as well.

Sorry for nagging so much. If that feature is too much of a maintenance burden, it can't be helped, I guess. I really love the game and I loved every minute spent on it. So thank you guys for allowing me to have this kind of fun. :D
