OS: Linux Ubuntu 18.04
Mods: None, completely Vanilla
Game version: 0.17.23
I started a new save with 0.17 on Linux/x86_64 and have about 120 hours on it now. The base is large, but not excessively large in my opinion. Lately I'm getting a TON of crashes, sometimes literally every few minutes, so I now have the autosave interval at 1 minute. Since I believe the underlying issue to be the same, I'm attaching all four crash logs here. They were recorded within about half an hour (there were actually six crashes in that half hour, but in one of them Factorio hung without a SIGSEGV and so generated no crash report, and I accidentally deleted the other crash report).
The crashes usually happen when I'm doing something resource-intensive (e.g., building a large number of solar panels), though that is only my impression of when they appear most often. They also happen when I'm doing pretty much nothing except watching my factory, just seemingly less often.
Crash reports:
https://pastebin.com/JkMkRXqX
https://pastebin.com/XCRykzBT
https://pastebin.com/V2F0ZAyb
https://pastebin.com/f0Y3rTu7
Savegame file https://mega.nz/#!R8YnzABS!z0EQ3RAGhCG5 ... kajbE7wXLU
[Dominik][0.17.23] Frequent, spurious crashes (FluidBox::PipeConnection::getOpposite)
-
- Burner Inserter
- Posts: 7
- Joined: Tue Apr 02, 2019 4:11 am
- Contact:
Last edited by Synchrotron on Tue Apr 02, 2019 4:58 am, edited 2 times in total.
Re: [Dominik][0.17.23] Frequent, spurious crashes (FluidBox::PipeConnection::getOpposite)
Hi, thanks for the report. The save looks OK from what I can see, and across the four reports I see three different, largely unrelated issues. So I think the cause is corrupted memory. Try running a memory check.
Re: [Dominik][0.17.23] Frequent, spurious crashes (FluidBox::PipeConnection::getOpposite)
Yes, I also noticed the different execution paths, but I thought some racy thing in the Factorio software had led to the memory corruption. However, I tested using Memtest86+ 5.0.1. It ran for hours and found nothing. I did notice, though, that the CPU ran significantly cooler when running Memtest86+ than when playing Factorio. As a matter of fact, I played a bit until it crashed, then rebooted immediately into memtest and could watch the temperature slowly drop. No errors found.
Getting the temperature up depends on two factors: frequency scaling (unsure if memtest does any, but I highly doubt it) and SMP (which memtest doesn't do). So I sat down and wrote a quick & dirty "user mode memory testing" program that essentially mallocs a few gigs of memory and then runs the tests that are also included in memtest, except in parallel using pthreads. And holy guacamole:
Code:
Testing 21340 MiB of memory.
Test: Word offset test
Test: Word address test
Test failed: test_word_address at word 1317497328 (10051 MiB). Thread 0 of 8. Expected: 00007f77df7c3f90 Read: 00697f77df7c3f90 XOR: 69000000000000
Test failed: test_word_address at word 1317497330 (10051 MiB). Thread 2 of 8. Expected: 00007f77df7c3fa0 Read: 00e97f77df7c3fa0 XOR: e9000000000000
Test failed: test_word_address at word 1317497329 (10051 MiB). Thread 1 of 8. Expected: 00007f77df7c3f98 Read: 00e97f77df7c3f98 XOR: e9000000000000
Test failed: test_word_address at word 1317497326 (10051 MiB). Thread 6 of 8. Expected: 00007f77df7c3f80 Read: 00697f77df7c3f80 XOR: 69000000000000
Test failed: test_word_address at word 1317497332 (10051 MiB). Thread 4 of 8. Expected: 00007f77df7c3fb0 Read: 00eb7f77df7c3fb0 XOR: eb000000000000
Test FAILED: Word address test
Test: Word inversion test
Test failed: test_word_inversions at word 1030801375 (7864 MiB). Thread 7 of 8. Expected: fffffffffffffffe Read: ff70fffffffffffe XOR: 8f000000000000
Test failed: test_word_inversions at word 1030801374 (7864 MiB). Thread 6 of 8. Expected: fffffffffffffffe Read: ff72fffffffffffe XOR: 8d000000000000
Test failed: test_word_inversions at word 1030801381 (7864 MiB). Thread 5 of 8. Expected: fffffffffffffffe Read: ff63fffffffffffe XOR: 9c000000000000
Test failed: test_word_inversions at word 1030801378 (7864 MiB). Thread 2 of 8. Expected: fffffffffffffffe Read: ff68fffffffffffe XOR: 97000000000000
Test failed: test_word_inversions at word 1030801377 (7864 MiB). Thread 1 of 8. Expected: fffffffffffffffe Read: ff68fffffffffffe XOR: 97000000000000
Test failed: test_word_inversions at word 1030801380 (7864 MiB). Thread 4 of 8. Expected: fffffffffffffffe Read: ff63fffffffffffe XOR: 9c000000000000
Test failed: test_word_inversions at word 1030801376 (7864 MiB). Thread 0 of 8. Expected: fffffffffffffffe Read: ff68fffffffffffe XOR: 97000000000000
Test failed: test_word_inversions at word 1030801379 (7864 MiB). Thread 3 of 8. Expected: fffffffffffffffe Read: ff6bfffffffffffe XOR: 94000000000000
Test FAILED: Word inversion test
Yeah, that explains it completely. Geez, Factorio is also a HW burn-in test, I never knew.
Maybe my memtester is of use to someone. I'll clean it up and push it to my GitHub in a few days. Thanks again for responding so quickly, Dominik!
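For readers wondering what such a user-mode test might look like, here is a minimal sketch of a multithreaded word address test. It is not the poster's actual program, which is not shown in this thread. It assumes that each 64-bit word is filled with its own address and that the worker threads interleave over consecutive words, which matches the log above (adjacent failing words expect addresses 8 bytes apart, and the word index modulo 8 equals the reporting thread), but that is an inference. All names (`worker_t`, `fill_word_addresses`, `check_word_addresses`, `MAX_THREADS`, `MEMTEST_DEMO_MAIN`) are made up for illustration.

```c
#include <assert.h>
#include <inttypes.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_THREADS 64  /* hypothetical upper bound for this sketch */

typedef struct {
    volatile uint64_t *mem;  /* shared buffer under test */
    size_t words;            /* buffer length in 64-bit words */
    size_t first, stride;    /* worker handles first, first+stride, ... */
    int write_phase;         /* nonzero: write pattern; zero: verify it */
    size_t errors;           /* mismatches seen by this worker */
} worker_t;

static void *worker(void *arg) {
    worker_t *w = arg;
    for (size_t i = w->first; i < w->words; i += w->stride) {
        /* The pattern for each word is that word's own address. */
        uint64_t expected = (uint64_t)(uintptr_t)&w->mem[i];
        if (w->write_phase) {
            w->mem[i] = expected;
            continue;
        }
        uint64_t read = w->mem[i];
        if (read != expected) {
            w->errors++;
            /* expected ^ read has a 1 in every bit position that flipped. */
            printf("Test failed: word %zu. Thread %zu of %zu. Expected: %016" PRIx64
                   " Read: %016" PRIx64 " XOR: %016" PRIx64 "\n",
                   i, w->first, w->stride, expected, read, expected ^ read);
        }
    }
    return NULL;
}

/* Runs one phase on nthreads workers and returns the total mismatch count. */
static size_t run_phase(uint64_t *mem, size_t words, size_t nthreads, int write_phase) {
    pthread_t tid[MAX_THREADS];
    worker_t w[MAX_THREADS];
    size_t errors = 0;
    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    for (size_t t = 0; t < nthreads; t++) {
        w[t] = (worker_t){ mem, words, t, nthreads, write_phase, 0 };
        if (pthread_create(&tid[t], NULL, worker, &w[t]) != 0) {
            fprintf(stderr, "pthread_create failed\n");
            exit(EXIT_FAILURE);
        }
    }
    for (size_t t = 0; t < nthreads; t++) {
        pthread_join(tid[t], NULL);
        errors += w[t].errors;
    }
    return errors;
}

void fill_word_addresses(uint64_t *mem, size_t words, size_t nthreads) {
    run_phase(mem, words, nthreads, 1);
}

size_t check_word_addresses(uint64_t *mem, size_t words, size_t nthreads) {
    return run_phase(mem, words, nthreads, 0);
}

#ifdef MEMTEST_DEMO_MAIN
int main(void) {
    size_t words = (256u << 20) / sizeof(uint64_t);  /* 256 MiB for the demo */
    uint64_t *mem = malloc(words * sizeof *mem);
    if (!mem) { perror("malloc"); return EXIT_FAILURE; }
    fill_word_addresses(mem, words, 8);
    size_t errors = check_word_addresses(mem, words, 8);
    puts(errors ? "Test FAILED: Word address test" : "Test passed: Word address test");
    free(mem);
    return errors != 0;
}
#endif
```

Building with something like `cc -O2 -pthread -DMEMTEST_DEMO_MAIN` enables the demo `main`. A clean run of a sketch like this proves very little: it makes a single pass over a small buffer, and the reads may be served from CPU caches rather than DRAM. The point is only to show how keeping all cores busy and printing the XOR of expected and read values makes flipped bits easy to spot.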
Re: [Dominik][0.17.23] Frequent, spurious crashes (FluidBox::PipeConnection::getOpposite)
Haha, a very sophisticated user
It always feels like trying to offload work when pointing to bad memory, so I am glad I was right, for my own sake. Sorry about your PC though.