Crash on linux

This subforum contains all the issues which we already resolved.
FilipForFico
Burner Inserter
Posts: 13
Joined: Sun Oct 07, 2018 12:56 pm

Crash on linux

Post by FilipForFico »

Version: 0.17.41
OS: Linux, Fedora 29
RAM: 16GB

The game crashes on startup. Loading reaches 63%, stops, and then an error message tells me that an unexpected error occurred and that I should contact the devs on the forums.
I would like this fixed because I cannot play. I had mods turned on, so I disabled them by changing true to false inside mod-list.json.

Log file attached.
Attachments
factorio-current.log
The latest log file
(4.1 KiB)

posila
Factorio Staff
Posts: 5073
Joined: Thu Jun 11, 2015 1:35 pm

Re: Crash on linux

Post by posila »

Hello, try verifying the integrity of the game files via Factorio's properties in the Steam Library.

Re: Crash on linux

Post by FilipForFico »

I verified the game files but it still crashed. :/

Re: Crash on linux

Post by FilipForFico »

I restarted Steam and it worked. Thank you for the help.

Re: Crash on linux

Post by posila »

Thanks for letting us know. I assumed the crash was due to corrupted game files, but it might be something else.

Symbolized stacktrace:

Code:

0x0000000000ae881b: Logger::writeStacktrace(FileWriteStream*, StackTraceInfo*) at /tmp/factorio-build-WghqE4/src/Util/Logger.cpp:448
0x0000000000c95cdd: std::__uniq_ptr_impl<LoggerFileWriteStream, std::default_delete<LoggerFileWriteStream> >::_M_ptr() const at /usr/include/c++/8/bits/unique_ptr.h:150
0x0000000000b76935: GlobalContext::getMap() at /tmp/factorio-build-WghqE4/src/GlobalContext.cpp:1807
0x0000000000b76c58: CrashHandler::commonSignalHandler(int) at /tmp/factorio-build-WghqE4/src/Util/CrashHandler.cpp:578
0x0000000000b76cb9: CrashHandler::SignalHandler(int) at /tmp/factorio-build-WghqE4/src/Util/CrashHandler.cpp:592
0x00000000000385c0: ?? ??:0
0x0000000000000000: ?? ??:0
0x000000000000000b: ?? ??:0
0x0000000000515cbd: png_safe_error at /tmp/factorio-build-WghqE4/libraries/png/pngerror.c:916
0x000000000156f0f9: png_default_error at /tmp/factorio-build-WghqE4/libraries/png/pngerror.c:754
0x000000000156f12a: png_error at /tmp/factorio-build-WghqE4/libraries/png/pngerror.c:88
0x000000000156f51f: png_error at /tmp/factorio-build-WghqE4/libraries/png/pngerror.c:88
0x00000000015865ab: png_read_IDAT_data at /tmp/factorio-build-WghqE4/libraries/png/pngrutil.c:4120
0x0000000001571eb8: png_read_row at /tmp/factorio-build-WghqE4/libraries/png/pngread.c:539
0x00000000015767f7: png_read_image at /tmp/factorio-build-WghqE4/libraries/png/pngread.c:745 (discriminator 3)
0x0000000000629671: preloadPng(unsigned int&, unsigned int&, MemoryBitmapData&, unsigned char*, unsigned int, bool) at /tmp/factorio-build-WghqE4/src/Graphics/PngLoad.cpp:100
0x00000000006297d2: std::__atomic_base<bool>::store(bool, std::memory_order) at /usr/include/c++/8/bits/atomic_base.h:374
0x0000000000f86bc2: std::__atomic_base<unsigned int>::operator++() at /usr/include/c++/8/bits/atomic_base.h:296
0x0000000001838fbf: execute_native_thread_routine at blake2s.c:?
0x000000000000858e: ?? ??:0
0x0000000000000000: ?? ??:0

zebediah49
Fast Inserter
Posts: 119
Joined: Fri Jun 17, 2016 8:17 pm

Re: Crash on linux

Post by zebediah49 »

I'm going to join in on this one, with a bit more detail:

- Factorio 0.17.58
- Ubuntu 16.04 / 40GB mem

For me, it crashes on load approximately 30% of the time, at varying points during "loading sprites". 57% seems to be a favorite, but I've seen numbers in the 60s and 70s. Given that, I suspect some kind of race condition in the parallel loading?

If there's a debug option that would be useful, I can flip it and restart the game a few dozen times to see what happens.

Code:

  20.344 Texture processor created (2048). GPU accelerated compression Supported: yes, Enabled: yes/yes. Test passed. YCoCgDXT PSNR: 35.83, BC3 PSNR: 33.82
  21.807 Parallel Sprite Loader initialized (threads: 7)
Factorio crashed. Generating symbolized stacktrace, please wait ...
Raw stacktrace: 0xb0f8b8, 0xcd3a3d, 0xb9eede, 0xb9f228, 0xb9f289, 0x354b0, 0, 0xb, 0x51df8b, 0x16450a9, 0x16450da, 0x16454cf, 0x165c58b, 0x1647e68, 0x164c7a7, 0x633a91, 0x633bf2, 0xfcb8f2, 0x190ffbf, 0x76ba, 0
  34.978 Warning Logger.cpp:518: Symbols.size() == 36, usedSize == 20
#0  0x0000000000cd3a3d in std::__uniq_ptr_impl<LoggerFileWriteStream, std::default_delete<LoggerFileWriteStream> >::_M_ptr() const at /usr/include/c++/8/bits/unique_ptr.h:150
#1  0x0000000000b9eede in std::unique_ptr<LoggerFileWriteStream, std::default_delete<LoggerFileWriteStream> >::get() const at /usr/include/c++/8/bits/unique_ptr.h:343
#2  0x0000000000b9f228 in std::unique_ptr<LoggerFileWriteStream, std::default_delete<LoggerFileWriteStream> >::operator->() const at /usr/include/c++/8/bits/unique_ptr.h:337
#3  0x0000000000b9f289 in Logger::flush() at /tmp/factorio-build-NKXqGz/src/Util/Logger.cpp:558
#4  0x00000000000354b0 in Logger::logStacktrace(StackTraceInfo*) at /tmp/factorio-build-NKXqGz/src/Util/Logger.cpp:544
#5  (nil) in GlobalContext::getMap() at /tmp/factorio-build-NKXqGz/src/GlobalContext.cpp:1827
#6  0x000000000000000b in CrashHandler::writeStackTrace(CrashHandler::CrashReason) at /tmp/factorio-build-NKXqGz/src/Util/CrashHandler.cpp:185
#7  0x000000000051df8b in CrashHandler::commonSignalHandler(int) at /tmp/factorio-build-NKXqGz/src/Util/CrashHandler.cpp:595
#8  0x00000000016450a9 in CrashHandler::SignalHandler(int) at /tmp/factorio-build-NKXqGz/src/Util/CrashHandler.cpp:609
#9  0x00000000016450da in ?? at ??:0
#10 0x00000000016454cf in ?? at ??:0
#11 0x000000000165c58b in ?? at ??:0
#12 0x0000000001647e68 in png_safe_error at /tmp/factorio-build-NKXqGz/libraries/png/pngerror.c:916
#13 0x000000000164c7a7 in png_default_error at /tmp/factorio-build-NKXqGz/libraries/png/pngerror.c:754
#14 0x0000000000633a91 in png_error at /tmp/factorio-build-NKXqGz/libraries/png/pngerror.c:88
#15 0x0000000000633bf2 in png_error at /tmp/factorio-build-NKXqGz/libraries/png/pngerror.c:88
#16 0x0000000000fcb8f2 in png_chunk_error at /tmp/factorio-build-NKXqGz/libraries/png/pngerror.c:485
#17 0x000000000190ffbf in png_read_IDAT_data at /tmp/factorio-build-NKXqGz/libraries/png/pngrutil.c:4120
#18 0x00000000000076ba in png_read_row at /tmp/factorio-build-NKXqGz/libraries/png/pngread.c:539
#19 (nil) in png_read_image at /tmp/factorio-build-NKXqGz/libraries/png/pngread.c:745 (discriminator 3)
#20 (nil) in preloadPng(unsigned int&, unsigned int&, MemoryBitmapData&, unsigned char*, unsigned int, bool) at /tmp/factorio-build-NKXqGz/src/Graphics/PngLoad.cpp:100
#21 (nil) in std::__atomic_base<bool>::store(bool, std::memory_order) at /usr/include/c++/8/bits/atomic_base.h:374
#22 (nil) in std::__atomic_base<bool>::operator=(bool) at /usr/include/c++/8/bits/atomic_base.h:267
#23 (nil) in std::atomic<bool>::operator=(bool) at /usr/include/c++/8/atomic:79
#24 (nil) in SpriteLoaders::CrossPlatformImageLoader::preload() at /tmp/factorio-build-NKXqGz/src/Graphics/CrossPlatformImageLoader.cpp:14
#25 (nil) in std::__atomic_base<unsigned int>::operator++() at /usr/include/c++/8/bits/atomic_base.h:296
#26 (nil) in PreloadWorker::preload() at /tmp/factorio-build-NKXqGz/src/Graphics/ParallelSpriteLoader.cpp:61
#27 (nil) in PreloadWorker::compute()::{lambda()#1}::operator()() const at /tmp/factorio-build-NKXqGz/src/Graphics/ParallelSpriteLoader.cpp:40
#28 (nil) in void std::__invoke_impl<void, PreloadWorker::compute()::{lambda()#1}>(std::__invoke_other, PreloadWorker::compute()::{lambda()#1}&&) at /usr/include/c++/8/bits/invoke.h:60
#29 (nil) in std::__invoke_result<PreloadWorker::compute()::{lambda()#1}>::type std::__invoke<PreloadWorker::compute()::{lambda()#1}>(std::__invoke_result&&, (PreloadWorker::compute()::{lambda()#1}&&)...) at /usr/include/c++/8/bits/invoke.h:95
#30 (nil) in decltype (__invoke((_S_declval<0ul>)())) std::thread::_Invoker<std::tuple<PreloadWorker::compute()::{lambda()#1}> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) at /usr/include/c++/8/thread:234
#31 (nil) in std::thread::_Invoker<std::tuple<PreloadWorker::compute()::{lambda()#1}> >::operator()() at /usr/include/c++/8/thread:243
#32 (nil) in std::thread::_State_impl<std::thread::_Invoker<std::tuple<PreloadWorker::compute()::{lambda()#1}> > >::_M_run() at /usr/include/c++/8/thread:186
#33 (nil) in execute_native_thread_routine at blake2s.c:?
#34 (nil) in ?? at ??:0
#35 (nil) in ?? at ??:0
Stack trace logging done
  41.135 Warning Logger.cpp:518: Symbols.size() == 31, usedSize == 19
  41.135 Error Util.cpp:97: Unexpected error occurred. If you're running the latest version of the game you can help us solve the problem by posting the contents of the log file on the Factorio forums.
Please also include the save file(s), any mods you may be using, and any steps you know of to reproduce the crash.
  42.467 Error CrashHandler.cpp:592: Received SIGSEGV
Factorio crashed. Generating symbolized stacktrace, please wait ...
Raw stacktrace: 0xb0f8b8, 0xcd3a3d, 0xb9eede, 0xb9f228, 0xb9f289, 0x354b0, 0x11beffc, 0xb9e02d, 0xb9e2c7, 0xb9eaa3, 0xb9ec77, 0xb9ed8c, 0xb9ef55, 0xb9f228, 0xb9f289, 0x354b0, 0, 0xb, 0x51df8b, 0x16450a9, 0x16450da, 0x16454cf, 0x165c58b, 0x1647e68, 0x164c7a7, 0x633a91, 0x633bf2, 0xfcb8f2, 0x190ffbf, 0x76ba, 0
  48.937 Warning Logger.cpp:518: Symbols.size() == 57, usedSize == 30
#0  0x0000000000cd3a3d in std::__uniq_ptr_impl<LoggerFileWriteStream, std::default_delete<LoggerFileWriteStream> >::_M_ptr() const at /usr/include/c++/8/bits/unique_ptr.h:150
#1  0x0000000000b9eede in std::unique_ptr<LoggerFileWriteStream, std::default_delete<LoggerFileWriteStream> >::get() const at /usr/include/c++/8/bits/unique_ptr.h:343
#2  0x0000000000b9f228 in std::unique_ptr<LoggerFileWriteStream, std::default_delete<LoggerFileWriteStream> >::operator->() const at /usr/include/c++/8/bits/unique_ptr.h:337
#3  0x0000000000b9f289 in Logger::flush() at /tmp/factorio-build-NKXqGz/src/Util/Logger.cpp:558
#4  0x00000000000354b0 in Logger::logStacktrace(StackTraceInfo*) at /tmp/factorio-build-NKXqGz/src/Util/Logger.cpp:544
#5  0x00000000011beffc in GlobalContext::getMap() at /tmp/factorio-build-NKXqGz/src/GlobalContext.cpp:1827
#6  0x0000000000b9e02d in CrashHandler::writeStackTrace(CrashHandler::CrashReason) at /tmp/factorio-build-NKXqGz/src/Util/CrashHandler.cpp:185
#7  0x0000000000b9e2c7 in CrashHandler::commonSignalHandler(int) at /tmp/factorio-build-NKXqGz/src/Util/CrashHandler.cpp:595
#8  0x0000000000b9eaa3 in CrashHandler::SignalHandler(int) at /tmp/factorio-build-NKXqGz/src/Util/CrashHandler.cpp:609
#9  0x0000000000b9ec77 in ?? at ??:0
#10 0x0000000000b9ed8c in std::_Rb_tree<int, std::pair<int const, ChildProcessAgent::ChildRecord>, std::_Select1st<std::pair<int const, ChildProcessAgent::ChildRecord> >, std::less<int>, std::allocator<std::pair<int const, ChildProcessAgent::ChildRecord> > >::lower_bound(int const&) at /usr/include/c++/8/bits/stl_tree.h:1203
#11 0x0000000000b9ef55 in std::map<int, ChildProcessAgent::ChildRecord, std::less<int>, std::allocator<std::pair<int const, ChildProcessAgent::ChildRecord> > >::lower_bound(int const&) at /usr/include/c++/8/bits/stl_map.h:1240
#12 0x0000000000b9f228 in std::map<int, ChildProcessAgent::ChildRecord, std::less<int>, std::allocator<std::pair<int const, ChildProcessAgent::ChildRecord> > >::operator[](int const&) at /usr/include/c++/8/bits/stl_map.h:495
#13 0x0000000000b9f289 in std::_Function_base::_Function_base() at /usr/include/c++/8/bits/std_function.h:252
#14 0x00000000000354b0 in std::function<void (int, ChildProcessAgent::ProcessStatus)>::function(std::function<void (int, ChildProcessAgent::ProcessStatus)> const&) at /usr/include/c++/8/bits/std_function.h:654
#15 (nil) in std::function<void (int, ChildProcessAgent::ProcessStatus)>::operator=(std::function<void (int, ChildProcessAgent::ProcessStatus)> const&) at /usr/include/c++/8/bits/std_function.h:463
#16 0x000000000000000b in ChildProcessAgent::ChildRecord::operator=(ChildProcessAgent::ChildRecord const&) at /tmp/factorio-build-NKXqGz/src/ChildProcessAgent.hpp:33
#17 0x000000000051df8b in ChildProcessAgent::fork(std::function<void (int)> const&, std::function<void (int, ChildProcessAgent::ProcessStatus)> const&) at /tmp/factorio-build-NKXqGz/src/ChildProcessAgent.cpp:109
#18 0x00000000016450a9 in SystemUtil::runProcessDontWait(std::string const&, std::vector<std::string, std::allocator<std::string> > const&) at /tmp/factorio-build-NKXqGz/src/Util/SystemUtil.cpp:519
#19 0x00000000016450da in std::string::_M_rep() const at /usr/include/c++/8/bits/basic_string.h:3303
#20 0x00000000016454cf in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() at /usr/include/c++/8/bits/basic_string.h:3621
#21 0x000000000165c58b in SystemUtil::openFileManagerAndSelectFile(Filesystem::Path const&) at /tmp/factorio-build-NKXqGz/src/Util/SystemUtil.cpp:568
#22 0x0000000001647e68 in Util::showCrashedError(std::string const&, std::string const&) at /tmp/factorio-build-NKXqGz/src/Util/Util.cpp:126
#23 0x000000000164c7a7 in std::string::_M_rep() const at /usr/include/c++/8/bits/basic_string.h:3303
#24 0x0000000000633a91 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() at /usr/include/c++/8/bits/basic_string.h:3621
#25 0x0000000000633bf2 in Util::showCrashedError(std::string const&) at /tmp/factorio-build-NKXqGz/src/Util/Util.cpp:91
#26 0x0000000000fcb8f2 in CrashHandler::showCrashedMessage() at /tmp/factorio-build-NKXqGz/src/Util/CrashHandler.cpp:240
#27 0x000000000190ffbf in CrashHandler::writeStackTrace(CrashHandler::CrashReason) at /tmp/factorio-build-NKXqGz/src/Util/CrashHandler.cpp:227
#28 0x00000000000076ba in CrashHandler::commonSignalHandler(int) at /tmp/factorio-build-NKXqGz/src/Util/CrashHandler.cpp:595
#29 (nil) in CrashHandler::SignalHandler(int) at /tmp/factorio-build-NKXqGz/src/Util/CrashHandler.cpp:609
#30 (nil) in ?? at ??:0
#31 0x00007f92e731e550 in ?? at ??:0
#32 0x000000000211c320 in ?? at ??:0
#33 0x00007f92e731e510 in png_safe_error at /tmp/factorio-build-NKXqGz/libraries/png/pngerror.c:916
#34 0x00007f92e731df30 in png_default_error at /tmp/factorio-build-NKXqGz/libraries/png/pngerror.c:754
#35 0x00007f92e731e550 in png_error at /tmp/factorio-build-NKXqGz/libraries/png/pngerror.c:88
#36 0x000000000211c320 in png_error at /tmp/factorio-build-NKXqGz/libraries/png/pngerror.c:88
#37 0x00007f92e731e510 in png_chunk_error at /tmp/factorio-build-NKXqGz/libraries/png/pngerror.c:485
#38 0x00007f92e731e530 in png_read_IDAT_data at /tmp/factorio-build-NKXqGz/libraries/png/pngrutil.c:4120
#39 0x00007f92e731def0 in png_read_row at /tmp/factorio-build-NKXqGz/libraries/png/pngread.c:539
#40 0x00007f93a10f253c in png_read_image at /tmp/factorio-build-NKXqGz/libraries/png/pngread.c:745 (discriminator 3)
#41 0x00007f92e731df98 in preloadPng(unsigned int&, unsigned int&, MemoryBitmapData&, unsigned char*, unsigned int, bool) at /tmp/factorio-build-NKXqGz/src/Graphics/PngLoad.cpp:100
#42 0x00007f92e731e550 in std::__atomic_base<bool>::store(bool, std::memory_order) at /usr/include/c++/8/bits/atomic_base.h:374
#43 0x000000000211c320 in std::__atomic_base<bool>::operator=(bool) at /usr/include/c++/8/bits/atomic_base.h:267
#44 0x0000000001938e81 in std::atomic<bool>::operator=(bool) at /usr/include/c++/8/atomic:79
#45 0x00007f92e731df30 in SpriteLoaders::CrossPlatformImageLoader::preload() at /tmp/factorio-build-NKXqGz/src/Graphics/CrossPlatformImageLoader.cpp:14
#46 0x00007f92e731df30 in std::__atomic_base<unsigned int>::operator++() at /usr/include/c++/8/bits/atomic_base.h:296
#47 0x00007f92e731e550 in PreloadWorker::preload() at /tmp/factorio-build-NKXqGz/src/Graphics/ParallelSpriteLoader.cpp:61
#48 0x000000000211c320 in PreloadWorker::compute()::{lambda()#1}::operator()() const at /tmp/factorio-build-NKXqGz/src/Graphics/ParallelSpriteLoader.cpp:40
#49 0x00007f92e731e510 in void std::__invoke_impl<void, PreloadWorker::compute()::{lambda()#1}>(std::__invoke_other, PreloadWorker::compute()::{lambda()#1}&&) at /usr/include/c++/8/bits/invoke.h:60
#50 0x00000000007fca62 in std::__invoke_result<PreloadWorker::compute()::{lambda()#1}>::type std::__invoke<PreloadWorker::compute()::{lambda()#1}>(std::__invoke_result&&, (PreloadWorker::compute()::{lambda()#1}&&)...) at /usr/include/c++/8/bits/invoke.h:95
#51 0x00007f92b2768858 in decltype (__invoke((_S_declval<0ul>)())) std::thread::_Invoker<std::tuple<PreloadWorker::compute()::{lambda()#1}> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) at /usr/include/c++/8/thread:234
#52 0x00007f92b28fe808 in std::thread::_Invoker<std::tuple<PreloadWorker::compute()::{lambda()#1}> >::operator()() at /usr/include/c++/8/thread:243
#53 0x00007f92b28fe808 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<PreloadWorker::compute()::{lambda()#1}> > >::_M_run() at /usr/include/c++/8/thread:186
#54 0x00000000020f7298 in execute_native_thread_routine at blake2s.c:?
#55 0x00000000020f7298 in ?? at ??:0
#56 0x00007f9200000000 in ?? at ??:0
Stack trace logging done

slippycheeze
Filter Inserter
Posts: 587
Joined: Sun Jun 09, 2019 10:40 pm

Re: Crash on linux

Post by slippycheeze »

zebediah49 wrote:
Sat Jul 20, 2019 11:45 pm
For me, it crashes on load approximately 30% of the time, at varying points during "loading sprites". 57% seems to be a favorite, but I've seen numbers in the 60s and 70s. Given that, I suspect some kind of race condition in the parallel loading?
You have a PNG somewhere that has a corrupted chunk, I think, which triggers the libpng error handling path that involves setjmp/longjmp. It might be helpful to figure out exactly which file it is and see if the problem is easier to isolate that way. http://www.libpng.org/pub/png/apps/pngcheck.html is likely the best choice, since it verifies chunk CRCs and can decompress the image data, so it should catch the same class of error.

I'd guess the "random" part is that sprite loading is threaded, so the order in which files get pulled off the queue is semi-random, but that the trigger is consistently the same file, or at least something related to it. Guesses, obviously.
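
To make the "figure out which file" step concrete, here is a minimal sketch, in C++ using only the standard library, of a chunk-level PNG integrity check. It is neither pngcheck nor Factorio's loader: the names checkPngBytes and checkPngFile are made up for illustration, and the check only covers the PNG signature and each chunk's CRC-32. It does not inflate the IDAT stream, so a file can pass this check and still fail to decompress.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Bitwise CRC-32 (reflected polynomial 0xEDB88320), the checksum PNG uses per chunk.
static std::uint32_t crc32(const std::uint8_t* data, std::size_t len) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

// PNG stores chunk lengths and CRCs as big-endian 32-bit integers.
static std::uint32_t readBigEndian32(const std::uint8_t* p) {
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
}

// Hypothetical helper: returns an empty string if the signature and every
// chunk CRC are valid, otherwise a short description of the first problem.
std::string checkPngBytes(const std::vector<std::uint8_t>& bytes) {
    static const std::uint8_t signature[8] = {0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n'};
    if (bytes.size() < 8 || !std::equal(signature, signature + 8, bytes.begin()))
        return "bad PNG signature";

    std::size_t pos = 8;
    while (pos < bytes.size()) {
        // Each chunk is: length (4) + type (4) + data (length) + CRC (4).
        if (bytes.size() - pos < 12) return "truncated chunk header";
        const std::uint32_t length = readBigEndian32(&bytes[pos]);
        if (length > bytes.size() - pos - 12) return "chunk length exceeds file size";
        assert(pos + 12 + length <= bytes.size());

        const std::string type(reinterpret_cast<const char*>(&bytes[pos + 4]), 4);
        // The stored CRC covers the type field and the data, not the length.
        const std::uint32_t stored = readBigEndian32(&bytes[pos + 8 + length]);
        const std::uint32_t computed = crc32(&bytes[pos + 4], 4 + std::size_t(length));
        if (stored != computed) return "CRC mismatch in chunk " + type;
        if (type == "IEND") return "";
        pos += 12 + std::size_t(length);
    }
    return "missing IEND chunk";
}

// Hypothetical helper: reads a whole file and runs the check above on it.
std::string checkPngFile(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return "cannot open file";
    std::vector<std::uint8_t> bytes((std::istreambuf_iterator<char>(in)),
                                    std::istreambuf_iterator<char>());
    return checkPngBytes(bytes);
}
```

Calling checkPngFile on every .png under the game's data directory and printing the non-empty results would show whether the same file is flagged on every run, which would separate "one bad file" from "random corruption while reading".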

Re: Crash on linux

Post by posila »

Can you post the full log, please?

What is weird is that it happens only 30% of the time. That might mean libpng is not thread-safe, or we use it in a non-thread-safe way; the file is sometimes read from disk incorrectly; RAM sometimes gets corrupted; the CPU bugs out in a hot loop; or something else that I haven't thought of.

The weirdest thing is: why does it crash in the first place? A corrupted PNG file should fail to decompress, but it shouldn't crash the application.

Try setting max-sprite-loading-threads to 1 (add max-sprite-loading-threads=1 at the end of Factorio's config.ini).

Re: Crash on linux

Post by slippycheeze »

posila wrote:
Mon Jul 22, 2019 9:09 am
What is weird is that it happens only 30% of the time. That might mean libpng is not thread-safe, or we use it in a non-thread-safe way; the file is sometimes read from disk incorrectly; RAM sometimes gets corrupted; the CPU bugs out in a hot loop; or something else that I haven't thought of.

The weirdest thing is: why does it crash in the first place? A corrupted PNG file should fail to decompress, but it shouldn't crash the application.
I don't know what libpng error handling you use, but it has (and get ready to scream) setjmp/longjmp as part of the process to implement exception throwing on chunk decompression error. The specific line of code, too, is a failure of the zlib (de)compression of the chunk, where no output was returned, so that ... could be involved too, but should be thread-safe unless you do something odd with memory allocation - as in, provide your own routines built over C++ new or something.
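
For anyone who has not met this pattern, here is a minimal sketch of setjmp/longjmp error handling in the style described above. MockDecoder, mockError, mockReadRow and decodeWithSetjmp are hypothetical stand-ins and not libpng's API. The point is only the control flow: the caller records a jump target, and the error path jumps straight back to it instead of returning.

```cpp
#include <cassert>
#include <csetjmp>

// Hypothetical decoder state. Each decode owns its own jump buffer, and a
// longjmp must happen on the same thread that called setjmp.
struct MockDecoder {
    std::jmp_buf errorJump;
};

// Error path: never returns to its caller, it unwinds to the setjmp site.
// longjmp does not run C++ destructors, so the frames it skips must hold
// only trivially destructible objects.
[[noreturn]] void mockError(MockDecoder& decoder) {
    std::longjmp(decoder.errorJump, 1);
}

// Stand-in for a row reader that hits corrupted data.
void mockReadRow(MockDecoder& decoder, bool corrupted) {
    if (corrupted) mockError(decoder);
}

// Returns true on success, false if the error path was taken.
bool decodeWithSetjmp(bool corrupted) {
    MockDecoder decoder{};
    if (setjmp(decoder.errorJump)) {
        // We arrive here a second time, via longjmp, when decoding failed.
        return false;
    }
    mockReadRow(decoder, corrupted);
    return true;
}
```

Jumping to a buffer that was never filled in by setjmp, or whose function has already returned, is undefined behaviour, which is one way a path like this can end in a crash rather than a clean error.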

Both libpng, and zlib used by it, should be totally thread safe as long as you don't share any of their data structures, and you restrict touching them to a single thread forever. So, uh, for limited values of thread safe that I'd call more "thread compatible".

I'd casually guess that if you ran with LLVM ASAN / MSAN, and especially if you use TSAN annotations, you would quickly detect any cross-thread use of these things that might be causing the crash. So if that is possible I'd certainly guess starting there might be helpful to you....

Good luck. I'd hope that if the OP can confirm there are no actually corrupted chunks you can more quickly narrow this down to a bitflip vs badness in libpng. As long as you avoid the default error handling, though, I'd guess you were pretty safe there...

Re: Crash on linux

Post by posila »

slippycheeze wrote:
Mon Jul 22, 2019 5:11 pm
I don't know what libpng error handling you use, but it has (and get ready to scream) setjmp/longjmp as part of the process to implement exception throwing on chunk decompression error. The specific line of code, too, is a failure of the zlib (de)compression of the chunk, where no output was returned, so that ... could be involved too, but should be thread-safe unless you do something odd with memory allocation - as in, provide your own routines built over C++ new or something.

Both libpng, and zlib used by it, should be totally thread safe as long as you don't share any of their data structures, and you restrict touching them to a single thread forever. So, uh, for limited values of thread safe that I'd call more "thread compatible".

I'd casually guess that if you ran with LLVM ASAN / MSAN, and especially if you use TSAN annotations, you would quickly detect any cross-thread use of these things that might be causing the crash. So if that is possible I'd certainly guess starting there might be helpful to you....

Good luck. I'd hope that if the OP can confirm there are no actually corrupted chunks you can more quickly narrow this down to a bitflip vs badness in libpng. As long as you avoid the default error handling, though, I'd guess you were pretty safe there...
Thanks for sharing your knowledge. It made me look into how libpng error handling is supposed to work. We had it set up to longjmp to an error handler; instead, I now provide a user error callback and throw an exception in it (the callback is not supposed to return). So the crash should be fixed in 0.17.61, and the game should fall back to single-threaded loading on error (failure to allocate more memory is the most common reason parallel sprite loading fails on Windows, so that might be the case here too). My original assumption was that libpng was trying to longjmp into its own error handler and that this failed due to some problem inside libpng.

We do run ASAN and TSAN in CI, but those run only headless tests (no spritesheets are loaded). Linux devs occasionally use Valgrind to troubleshoot weird issues.
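
The approach described in this post can be sketched roughly as follows. This is not Factorio's code: decodeSprite is a hypothetical stand-in for a libpng-style decoder that reports errors through a callback that must not return, and loadSprites, LoadResult and DecodeError are invented names. The sketch assumes an empty buffer stands for a corrupted file. The callback throws, each worker thread catches the exception itself, and any failure triggers a single-threaded retry.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <thread>
#include <vector>

struct DecodeError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

using ErrorCallback = void (*)(const std::string& message);

// User error callback: it must not return, so it throws instead.
[[noreturn]] void throwingErrorCallback(const std::string& message) {
    throw DecodeError(message);
}

// Hypothetical decoder: an empty buffer plays the role of a corrupted PNG.
void decodeSprite(const std::vector<unsigned char>& data, ErrorCallback onError) {
    if (data.empty()) onError("Not enough image data");
}

struct LoadResult {
    bool usedFallback;                       // true if the parallel pass failed
    std::vector<std::size_t> failedSprites;  // indices that failed even sequentially
};

LoadResult loadSprites(const std::vector<std::vector<unsigned char>>& sprites,
                       unsigned threadCount) {
    assert(threadCount >= 1 && "at least one worker thread is required");
    std::atomic<std::size_t> next{0};
    std::atomic<bool> parallelFailed{false};

    auto worker = [&] {
        try {
            // Workers claim sprite indices from a shared atomic counter.
            for (std::size_t i = next++; i < sprites.size() && !parallelFailed; i = next++)
                decodeSprite(sprites[i], throwingErrorCallback);
        } catch (const DecodeError&) {
            parallelFailed = true;  // the exception never leaves this thread
        }
    };

    std::vector<std::thread> threads;
    for (unsigned t = 0; t < threadCount; ++t) threads.emplace_back(worker);
    for (auto& thread : threads) thread.join();

    LoadResult result{parallelFailed.load(), {}};
    if (!result.usedFallback) return result;

    // Fallback: redo the whole load on one thread and record what still fails.
    for (std::size_t i = 0; i < sprites.size(); ++i) {
        try {
            decodeSprite(sprites[i], throwingErrorCallback);
        } catch (const DecodeError&) {
            result.failedSprites.push_back(i);
        }
    }
    return result;
}
```

With a real C library in the call chain, throwing from the callback only works if that library was built so that exceptions can unwind through its frames; the sketch sidesteps that by keeping everything in C++.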

Duplicate: 73814

Re: Crash on linux

Post by slippycheeze »

posila wrote:
Tue Jul 30, 2019 3:22 pm
Thanks for sharing your knowledge. It made me look into how libpng error handling is supposed to work. We had it set up to longjmp to an error handler; instead, I have provided a user error callback and throw an exception in it (the callback is not supposed to return).
I'm glad it helped. I assume that y'all can ignore it when it doesn't, so for things like MSAN etc., better said twice than not said. Also, there are a billion twisty little bits of broken everything, and none of us knows all of them.

I'll also take "just shut up" as an answer, should it come up. :)
