[0.18.4] Crash: HeatBuffer::HeatBuffer
Overall history: We are running a headless server for our weeklong+ MP session. The game was originally started in version 0.18.3. It is in megafactory state and is very big and extremely messy. Twice we've run makeshift console commands to purge the visible map of all enemy entities to ease UPS issues, using the small command script found on the wiki. We also routinely adjust game speed for clients with lower-end hardware.
Crash: The game crashed shortly after one of the players had ordered several thousand additional heat pipes via blueprints to be placed on active nuclear power plants (already heated up; most of the planned heat piping would go adjacent to existing heat pipes), so the bots had already been working on that for the past minute or so. Said player was also fiddling with modules on the heat pipe manufacturing assembler (replacing efficiency modules with speed modules) to speed up the process of replenishing heat pipes, right as the crash happened.
The save file is corrupted and fails to load, with the error message 'trying to load already loaded heat buffer'. We have save files from both server side and client side (this made me wonder whether the crash is actually related to the autosave procedure). They seem to be almost the same, and both fail to load with the same error message. I'm providing both just in case there's a single-tick difference that might give some clue as to what happened.
A freak memory error is not entirely out of the question, although the current server is extremely stable and has proven itself hardy against any hardware-related faults over multiple years now (but as it is running 'out of spec', no guarantees are given).
Providing links to the save files (they're well in excess of 50MB by now):
Server-side save
Client-side save
Attached should be the .log and .dmp files from the server.
- Attachments
- factorio-dump-current.dmp (625.52 KiB)
- factorio-current.log (615.65 KiB)
Re: [0.18.4] Crash: HeatBuffer::HeatBuffer
Thanks for the report. Do you have any way to reproduce the issue? The save file is in a corrupt state (as the game detects while loading). Looking at the crash logs, this is the only report I can see with this specific crash, which makes me think it might not be reproducible (maybe a random bit flip?).
If you want to get ahold of me I'm almost always on Discord.
Re: [0.18.4] Crash: HeatBuffer::HeatBuffer
No known reproduction steps. I tried replicating the heat pipe additions in single player, to no effect. Our most recent autosave on the MP server was almost 15 (real-time) minutes before the crash, and replicating the game into the exact same state is of course practically impossible, considering multiple players were doing multiple things at once during that timeframe.
It is as I feared then: most likely a hardware error, if no software-related quirk is in sight. (The only thing I could think of from my vague understanding of the crash log was a potential multithreading sync error or race condition anyway; there was a reason I wrote the line about a random memory error in the original bug report.)
Re: [0.18.4] Crash: HeatBuffer::HeatBuffer
Update: I managed to reproduce this in single player, using the latest autosave of the multiplayer game, which is 15 (real-time) minutes or so old.
Reproduction steps cannot be given quite exactly, but I'll do my best:
I queued thousands of extra heat pipe additions to already existing, heated-up nuclear power plants, set the autosave interval to 2 minutes, set game.speed=1.5 and let the bots work their magic. After the autosave, the game crashes to desktop with the same HeatBuffer problem. The resulting autosave is corrupted in a similar way, but with different details (different index numbers).
I will work to see if I can get this reproduced consistently. It might not be a hardware problem after all. (It would have had to be a pretty deep one to have a random bit flip corrupt the game state on both client and server side: something about CPU caches and/or speculative execution surely, as running client and server on the same machine in tandem practically replicates all instructions with a gap of a few milliseconds. And they definitely should use different chunks of RAM.)
-
- Burner Inserter
- Posts: 6
- Joined: Tue Feb 11, 2020 5:31 pm
- Contact:
Re: [0.18.4] Crash: HeatBuffer::HeatBuffer
I was also able to reproduce the error.
In my save I was just laying a lot of concrete and working on my power plant on the side.
When I hovered the mouse over a not-yet-built heat pipe, the crash occurred (see screenshot).
Download link: http://cyberkeks.net/Factorio/_autosave2.zip
- Attachments
- factorio-dump-current.dmp (749.37 KiB)
- factorio-current.log (10.27 KiB)
- Factorio Crash.png (4.8 MiB)
Re: [0.18.4] Crash: HeatBuffer::HeatBuffer
Reproducible. Just mark that pipe (left side of the base) and wait ~5 minutes.
- Attachments
- factorio-dump-current.dmp (729.99 KiB)
- factorio-current.log (9.08 KiB)
Re: [0.18.4] Crash: HeatBuffer::HeatBuffer
Confirming. Loading masterofavenger's save _autosave2.zip, highlighting the pipe piece mentioned and waiting for a few minutes does indeed crash the game with a HeatBuffer error.
I am unable to consistently get my own game to crash, however (probably because I don't actually have a save state where the bug would have happened, and am relying on ghosting thousands of heat pipes and hoping for a lucky break). I am relieved to see it is probably not a hardware error after all, though.
Edit: highlighting a heat pipe piece is not necessary; the crash will happen regardless. It seems obvious to me that the problem lies in the ghosting of heat pipes and the heat manager. masterofavenger's save gets corrupted into a state that tries to load a heat buffer for a nuclear reactor which has already been allocated to a heat pipe.
By setting autosaves to 1 minute, I have now managed to capture masterofavenger's game in a save that will reliably crash the game about 1 minute from load. It is still unclear whether the crash happens due to the autosave process or whether it just coincides with something else happening (such as the reactor getting fed with fuel and starting to heat up). I will continue to investigate.
I am unable to figure out what exactly the cause may be. At least the crash reports indicate the entity location indices to look at for the troublemakers. I have not managed to obtain a better save than the 1 minute one. Saving even 10 seconds after that will corrupt the save file; however, this one gives a different error, "Corrupt map: invalid heat buffer index: 264 >= 264." Saving 10 seconds after that will already name nuclear reactor and heat pipe entities in the corrupted map message (presumably both items will have been placed by robots by that time).
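The two load errors quoted in this thread ('trying to load already loaded heat buffer' and 'invalid heat buffer index: 264 >= 264') read like consistency checks run while a save is loaded. Below is a minimal Python sketch of what such checks might look like, assuming each heat entity in the save refers to its own slot in a table of heat buffers by a non-negative index. The names `load_heat_buffers` and `CorruptMapError` and the data layout are hypothetical; this is not Factorio's actual loading code.

```python
class CorruptMapError(Exception):
    """Raised when the (hypothetical) save data fails a consistency check."""


def load_heat_buffers(buffer_count, entity_buffer_indices):
    """Validate heat buffer references while loading.

    buffer_count: number of heat buffers stored in the save.
    entity_buffer_indices: iterable of (entity_name, buffer_index) pairs,
        assumed to be non-negative integers, one pair per heat entity.
    Returns the set of buffer indices that were claimed.
    """
    loaded = set()
    for entity_name, index in entity_buffer_indices:
        # Valid indices run from 0 to buffer_count - 1.
        if index >= buffer_count:
            raise CorruptMapError(
                f"invalid heat buffer index: {index} >= {buffer_count}"
            )
        # Assumption: a buffer belongs to exactly one entity, so a second
        # claim on the same index means the save is inconsistent.
        if index in loaded:
            raise CorruptMapError(
                f"trying to load already loaded heat buffer "
                f"(entity {entity_name}, index {index})"
            )
        loaded.add(index)
    return loaded
```

Under this assumption, a save with 264 buffers has valid indices 0 to 263, so an index of 264 points one slot past the end of the table, which would match the "264 >= 264" wording. A reactor and a heat pipe both claiming the same index would trip the second check.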
Here is a link to the "1 minute save" (which can be loaded, and reliably crashes about 1 minute 35 seconds into the game). Hopefully figuring out what exactly happens within the next 10 seconds from that game load will prove fruitful. Have a productive Wednesday.
Re: [0.18.4] Crash: HeatBuffer::HeatBuffer
Noticed 0.18.5 came out. I updated and decided to see if the save still reliably crashes. It does *not*. I am unable to confirm whether the potential bug has actually been fixed in 0.18.5, or whether the savegame update script manages to work some magic on the save file to fix the soon-to-become-corrupt state.
0.18.5 also seemingly has no trouble loading the save files which *were* considered corrupt by 0.18.4.
Reverting to 0.18.4 continues to exhibit savegame corruption, and the 1 minute save reliably crashes after 1 minute 35 seconds. [well, at least that worked as expected!]
Re: [0.18.4] Crash: HeatBuffer::HeatBuffer
I'll take a look and see what I can find. There's a chance that the save(s) you're using are already corrupt and the crash just happens at that point in the logic. The latest release changed some prototype values, which causes the heat system to re-initialize, and that might fix whatever was wrong.
Re: [0.18.4] Crash: HeatBuffer::HeatBuffer
Considering this bug was triggered in both our instances by having active heat systems (heated-up reactors and pipes) and ordering thousands of new heat pipes (and/or reactors) adjacent to them, I do think a software issue was to blame, not game states corrupted by random chance. I would wager some hard-to-reproduce race condition was involved. From the latest dev reply I gather some deep under-the-hood tweaking has been done to the heat manager, which might easily have hidden or fixed the culprit entirely.
In any case, 0.18.5+ seems to have fixed the problem, as I've been unable to reproduce it even once over half a dozen large-scale tries of ordering massive amounts of heat pipes next to active heat systems. Granted, it was already hard to reproduce on 0.18.4, so this is not quite conclusive evidence yet.
The feeling when one accidentally fixes bugs one didn't even know existed?
Re: [0.18.4] Crash: HeatBuffer::HeatBuffer
Confirmed. Bug fixed in latest (experimental) version.
Re: [0.18.4] Crash: HeatBuffer::HeatBuffer
I spent all day today looking over this and found the issue. If a heat pipe system was re-merged in the same tick in which a newly created pipe merged 2 different sets of pipes, it would lead to this corrupt state.
It's now fixed for the next release. The "works in latest experimental" is just a bandaid due to the system(s) getting reset when prototype data changes.
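To illustrate the general kind of bug described above, here is a toy Python model. It is not Factorio's actual code: it simply assumes that sets of connected pipes are kept in a compact list and referred to by index, so that removing one set shifts the indices of the sets after it. `HeatGroups`, `merge_with_stale_indices` and `merge_with_fresh_lookup` are hypothetical names. When two merges happen in one tick and the second uses an index captured before the first, that index can be out of range (much like the "invalid heat buffer index: 264 >= 264" message reported earlier) or can silently point at the wrong set.

```python
class HeatGroups:
    """Toy model: connected heat pipes share a group, referenced by list index."""

    def __init__(self):
        self.groups = []          # each entry is a set of pipe ids
        self.pipe_to_group = {}   # pipe id -> index into self.groups

    def add_group(self, pipes):
        self.groups.append(set(pipes))
        self._reindex()
        return len(self.groups) - 1

    def merge(self, keep, drop):
        """Merge group `drop` into group `keep`, then compact the list."""
        for index in (keep, drop):
            if not 0 <= index < len(self.groups):
                raise IndexError(
                    f"invalid group index: {index} (group count {len(self.groups)})"
                )
        if keep == drop:
            return keep
        self.groups[keep] |= self.groups[drop]
        del self.groups[drop]     # every index above `drop` shifts down by one
        self._reindex()
        return keep - 1 if drop < keep else keep

    def _reindex(self):
        # Re-point every pipe at the current position of its group.
        for index, pipes in enumerate(self.groups):
            for pipe in pipes:
                self.pipe_to_group[pipe] = index


def merge_with_stale_indices(groups, pipe_a, pipe_b, pipe_c):
    """Buggy pattern: all indices are captured before the first merge."""
    a = groups.pipe_to_group[pipe_a]
    b = groups.pipe_to_group[pipe_b]
    c = groups.pipe_to_group[pipe_c]   # may be stale after the first merge
    merged = groups.merge(a, b)
    return groups.merge(merged, c)


def merge_with_fresh_lookup(groups, pipe_a, pipe_b, pipe_c):
    """Safer pattern: look each index up again right before it is used."""
    groups.merge(groups.pipe_to_group[pipe_a], groups.pipe_to_group[pipe_b])
    return groups.merge(groups.pipe_to_group[pipe_a], groups.pipe_to_group[pipe_c])
```

With three groups `["a1", "a2"]`, `["b1"]` and `["c1"]`, the stale variant raises `IndexError` on its second merge because group `c1` has moved from index 2 to index 1, while the fresh-lookup variant ends with a single group containing all four pipes. With a fourth group present, the stale index would stay in range and merge the wrong group instead, which is the harder case to detect.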