[Dominik][0.17.23] Frequent, spurious crashes (FluidBox::PipeConnection::getOpposite)

Place for things which are bugs but we have no idea how to solve them. Things related to hardware, libraries, strange setups, etc.
Synchrotron
Manual Inserter
Posts: 4
Joined: Tue Apr 02, 2019 4:11 am


Post by Synchrotron »

OS: Linux Ubuntu 18.04
Mods: None, completely Vanilla
Game version: 0.17.23

I started a new save with 0.17 on Linux/x86_64 and have about 120 hours on it now. The base is large, but not excessively large in my opinion. Lately I'm getting a TON of crashes, sometimes literally every few minutes (I have the autosave interval set to 1 minute). Since I believe the underlying issue is the same, I'm attaching all four crash logs here. They were recorded within about half an hour. There were actually six crashes in that half hour, but in one case Factorio hung instead of raising SIGSEGV, so no crash report was generated, and I accidentally deleted the other report.

The crashes seem to happen most often when I'm doing something resource intensive (e.g., building a large number of solar panels), but that is only a feeling. They also happen, seemingly less often, when I'm doing pretty much nothing except watching my factory.

Crash reports:
https://pastebin.com/JkMkRXqX
https://pastebin.com/XCRykzBT
https://pastebin.com/V2F0ZAyb
https://pastebin.com/f0Y3rTu7

Savegame file https://mega.nz/#!R8YnzABS!z0EQ3RAGhCG5 ... kajbE7wXLU
Last edited by Synchrotron on Tue Apr 02, 2019 4:58 am, edited 2 times in total.

Dominik
Former Staff
Posts: 658
Joined: Sat Oct 12, 2013 9:08 am

Re: [Dominik][0.17.23] Frequent, spurious crashes (FluidBox::PipeConnection::getOpposite)

Post by Dominik »

Hi, thanks for the report. The save looks OK from what I can see, and in the 4 reports I see 3 different, barely related issues. So I think the cause is corrupted memory. Try running a memory check.

Synchrotron
Manual Inserter
Posts: 4
Joined: Tue Apr 02, 2019 4:11 am

Re: [Dominik][0.17.23] Frequent, spurious crashes (FluidBox::PipeConnection::getOpposite)

Post by Synchrotron »

Yes, I also noticed the different execution paths, but I thought some race condition in Factorio was causing the memory corruption. However, I tested with Memtest86+ 5.0.1: it ran for hours and found nothing. I did notice, though, that the CPU ran significantly cooler under Memtest86+ than when playing Factorio. As a matter of fact, I played a bit until it crashed, then rebooted immediately into memtest and could watch the temperature slowly drop, with no errors found.

Getting the temperature up depends on two factors: frequency scaling (unsure if memtest does any, but I highly doubt it) and SMP (which memtest doesn't do). So I sat down and wrote a quick & dirty "user mode memory testing" program that essentially mallocs a few gigs of memory and then runs the tests that are also included in memtest, except in parallel using pthreads. And holy guacamole:


Testing 21340 MiB of memory.
Test: Word offset test
Test: Word address test
Test failed: test_word_address at word 1317497328 (10051 MiB). Thread 0 of 8. Expected: 00007f77df7c3f90   Read: 00697f77df7c3f90   XOR:   69000000000000
Test failed: test_word_address at word 1317497330 (10051 MiB). Thread 2 of 8. Expected: 00007f77df7c3fa0   Read: 00e97f77df7c3fa0   XOR:   e9000000000000
Test failed: test_word_address at word 1317497329 (10051 MiB). Thread 1 of 8. Expected: 00007f77df7c3f98   Read: 00e97f77df7c3f98   XOR:   e9000000000000
Test failed: test_word_address at word 1317497326 (10051 MiB). Thread 6 of 8. Expected: 00007f77df7c3f80   Read: 00697f77df7c3f80   XOR:   69000000000000
Test failed: test_word_address at word 1317497332 (10051 MiB). Thread 4 of 8. Expected: 00007f77df7c3fb0   Read: 00eb7f77df7c3fb0   XOR:   eb000000000000
Test FAILED: Word address test
Test: Word inversion test
Test failed: test_word_inversions at word 1030801375 (7864 MiB). Thread 7 of 8. Expected: fffffffffffffffe   Read: ff70fffffffffffe   XOR:   8f000000000000
Test failed: test_word_inversions at word 1030801374 (7864 MiB). Thread 6 of 8. Expected: fffffffffffffffe   Read: ff72fffffffffffe   XOR:   8d000000000000
Test failed: test_word_inversions at word 1030801381 (7864 MiB). Thread 5 of 8. Expected: fffffffffffffffe   Read: ff63fffffffffffe   XOR:   9c000000000000
Test failed: test_word_inversions at word 1030801378 (7864 MiB). Thread 2 of 8. Expected: fffffffffffffffe   Read: ff68fffffffffffe   XOR:   97000000000000
Test failed: test_word_inversions at word 1030801377 (7864 MiB). Thread 1 of 8. Expected: fffffffffffffffe   Read: ff68fffffffffffe   XOR:   97000000000000
Test failed: test_word_inversions at word 1030801380 (7864 MiB). Thread 4 of 8. Expected: fffffffffffffffe   Read: ff63fffffffffffe   XOR:   9c000000000000
Test failed: test_word_inversions at word 1030801376 (7864 MiB). Thread 0 of 8. Expected: fffffffffffffffe   Read: ff68fffffffffffe   XOR:   97000000000000
Test failed: test_word_inversions at word 1030801379 (7864 MiB). Thread 3 of 8. Expected: fffffffffffffffe   Read: ff6bfffffffffffe   XOR:   94000000000000
Test FAILED: Word inversion test
Yeah, that explains it completely. Geez, Factorio is also a HW burn-in test, I never knew :-)

Maybe my memtester is of use to someone; I'll clean it up and push it to my GitHub in a few days. Thanks again for responding so quickly, Dominik!

Dominik
Former Staff
Posts: 658
Joined: Sat Oct 12, 2013 9:08 am

Re: [Dominik][0.17.23] Frequent, spurious crashes (FluidBox::PipeConnection::getOpposite)

Post by Dominik »

Haha, a very sophisticated user :D
It always feels like trying to offload work when I blame bad memory, so I am glad I was right, for my sake. Sorry about your PC though.
