Heavy mode determinism vs Factorissimo3

Post by archont »

While debugging a multiplayer desync I used /heavy-mode to isolate the cause. After reducing the setup to a new game with no mods except Factorissimo3, I noticed that placing a warehouse, and the resulting creation of the factory floor surface, immediately makes heavy mode flag errors on every single tick, 100% of the time. No interactions, no items: just an empty warehouse placed in /editor mode on a fresh world with no other mods.
106.525 Loading level.dat: 981340 bytes.
106.525 Info Scenario.cpp:154: Map version 2.0.72-0
106.880 Checksum for script __level__/control.lua: 2722821277
106.890 Script @__factorissimo-2-notnotmelon__/lib/events.lua:52: Finalized 58 events for factorissimo-2-notnotmelon
106.890 Checksum for script __factorissimo-2-notnotmelon__/control.lua: 2946075274
124.581 Script @__factorissimo-2-notnotmelon__/lib/events.lua:52: Finalized 58 events for factorissimo-2-notnotmelon
126.977 Error MainLoop.cpp:1575: Heavy mode check before-update-999 failed.
126.977 Heavy mode - tick 999 finished
129.369 Error MainLoop.cpp:1575: Heavy mode check after-gameactionhandler-update-999 failed.
131.764 Error MainLoop.cpp:1575: Heavy mode check after-scenario-update-1000 failed.
132.195 Script @__factorissimo-2-notnotmelon__/lib/events.lua:52: Finalized 58 events for factorissimo-2-notnotmelon
134.640 Error MainLoop.cpp:1575: Heavy mode check before-update-1000 failed.
134.640 Heavy mode - tick 1000 finished
137.121 Error MainLoop.cpp:1575: Heavy mode check after-gameactionhandler-update-1000 failed.
139.588 Error MainLoop.cpp:1575: Heavy mode check after-scenario-update-1001 failed.
140.001 Script @__factorissimo-2-notnotmelon__/lib/events.lua:52: Finalized 58 events for factorissimo-2-notnotmelon
142.465 Error MainLoop.cpp:1575: Heavy mode check before-update-1001 failed.
142.465 Heavy mode - tick 1001 finished
144.922 Error MainLoop.cpp:1575: Heavy mode check after-gameactionhandler-update-1001 failed.
147.458 Error MainLoop.cpp:1575: Heavy mode check after-scenario-update-1002 failed.
154.104 Script @__factorissimo-2-notnotmelon__/lib/events.lua:52: Finalized 58 events for factorissimo-2-notnotmelon
156.565 Error MainLoop.cpp:1575: Heavy mode check before-update-1002 failed.
156.565 Heavy mode - tick 1002 finished
159.017 Error MainLoop.cpp:1575: Heavy mode check after-heavy-latency-1002 failed.
161.503 Error MainLoop.cpp:1575: Heavy mode check after-gameactionhandler-update-1002 failed.

Counterintuitively, however, although heavy mode suggests the game state should diverge immediately, multiplayer works fine for days, until a secondary cause, which may or may not be Factorissimo3 related, triggers an actual desync.

Since notnotmelon isn't presently around to help with the issue, I'm asking the community for some thoughts:
1) Factorio relies on replaying inputs to all players and producing a deterministic outcome on all machines, and heavy mode presumably reruns the game state locally and checksums it every tick to check for consistency of outcome. But Factorissimo3 causes heavy mode detection to immediately go absolutely berserk, without actually causing a multiplayer desync. At least, not for days.
What is the difference between the condition for detecting a multiplayer desync and a heavy mode checksum mismatch? What is the gap between heavy mode and multiplayer desync detection that Factorissimo3 skirts?

2) Am I understanding correctly that /heavy-mode is useless for debugging desync issues on this world, because Factorissimo3 false positives drown out any actual reproduction?

3) Is there some sort of "heavy mode plus" script or doodad that I can use to pinpoint the cause of the divergence? Do I diff the desynced game state save files? If yes, are there tools that tell me what entity/data is at a specific offset? Is there a /ultra-heavy-mode that spits out a stack trace of which Lua function executed differently?
boskid
Factorio Staff

Post by boskid »

Heavy mode is the ultimate tool for finding desyncs caused by game bugs or scripting bugs. You may notice it is unplayably slow: it saves the game, loads that save into a second game instance, and then updates both the original instance and the second instance, doing more saves in between to rule out all possible places where state could diverge.
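
The save/load/update/compare cycle described above can be modelled in a few lines of Python. This is only a toy sketch, not how the game implements it: the `Game` class, the `unsaved_counter` field and the `spawn_x` value are all hypothetical, and the "bug" shown (saved state derived from data that does not survive a save/load) is just one assumed way a script could trip the check.

```python
import json
import zlib


def crc(state: dict) -> int:
    """Checksum of the fully serialized state (stand-in for saving the whole game)."""
    return zlib.crc32(json.dumps(state, sort_keys=True).encode())


class Game:
    def __init__(self, state: dict, unsaved_counter: int = 0):
        self.state = state                      # everything that goes into the save
        self.unsaved_counter = unsaved_counter  # hypothetical mod variable that is NOT saved

    def save(self) -> str:
        return json.dumps(self.state, sort_keys=True)

    @classmethod
    def load(cls, blob: str) -> "Game":
        # A loaded instance only knows what was in the save; unsaved data resets.
        return cls(json.loads(blob))

    def update(self, buggy: bool) -> None:
        self.state["tick"] += 1
        self.unsaved_counter += 1
        if buggy:
            # Bug: saved state is derived from data that does not survive save/load.
            self.state["spawn_x"] = self.unsaved_counter


def heavy_mode_tick(game: Game, buggy: bool) -> bool:
    """Return True if the original and the reloaded copy agree after one update."""
    replica = Game.load(game.save())
    game.update(buggy)
    replica.update(buggy)
    return crc(game.state) == crc(replica.state)


def run(buggy: bool, ticks: int = 3) -> list:
    game = Game({"tick": 0, "spawn_x": 0})
    return [heavy_mode_tick(game, buggy) for _ in range(ticks)]


print(run(buggy=False))  # [True, True, True]
print(run(buggy=True))   # [True, False, False]
```

In this model the buggy run passes the first tick only because the unsaved counter still happens to equal its reset value; from the second tick on, every check fails, much like the per-tick failures in the log.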

When in MP you may run `/c game.force_crc()` to make the server and all clients do a full comparison (for one tick only) to see whether they are diverging. This causes a noticeable slowdown because the entire game state is saved to create a CRC that is then compared with the server's. This is also a full check, similar to heavy mode.

At runtime, it is not possible to do a full heavy mode check because that would make the game completely unplayable. Instead there is a small subset of variables that are used as indicators of a desync happening, like the last samples of item production statistics, amounts of energy transferred by the electric network, positions of some players, some number allocators etc. Those values are part of the "heuristic" check, called that because it only covers some of the values that should be equal between server and clients. Usually when a desync happens in one place, it propagates rapidly onto everything else and almost always makes one of the variables covered by the heuristic check diverge as well, so the desync gets detected. In rare situations a script may cause a desync on a variable that is not covered by the heuristic, and the difference does not cascade onto any covered variable. Such a desync may survive unnoticed for a long time, but it is still a desync state: it still counts as a bug regardless of not tripping the check, and it needs to be fixed.
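
As a rough illustration of the difference between the two checks, here is a small Python model. The field names and the `HEURISTIC_KEYS` list are made up for the example (the real set of indicator variables is internal to the game); the point is only that a checksum over a subset cannot see a difference outside that subset.

```python
import json
import zlib

# Hypothetical set of "indicator" fields; the real list is internal to the game.
HEURISTIC_KEYS = ("production_sample", "energy_transferred", "player_position")


def full_crc(state: dict) -> int:
    """Checksum over everything, like a full save comparison."""
    return zlib.crc32(json.dumps(state, sort_keys=True).encode())


def heuristic_crc(state: dict) -> int:
    """Checksum over the indicator subset only."""
    subset = {key: state[key] for key in HEURISTIC_KEYS}
    return zlib.crc32(json.dumps(subset, sort_keys=True).encode())


def make_state(mod_variable: int) -> dict:
    return {
        "production_sample": 120,
        "energy_transferred": 4500,
        "player_position": [10.0, -3.5],
        "mod_variable": mod_variable,  # not covered by the heuristic
    }


server = make_state(mod_variable=1)
client = make_state(mod_variable=2)

heuristic_agrees = heuristic_crc(server) == heuristic_crc(client)
full_agrees = full_crc(server) == full_crc(client)
print(heuristic_agrees, full_agrees)  # True False
```

The server and client here already disagree, but only the full check notices.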

For example, there could be a mod that develops a desync on one of its variables. A player joins, and the mod uses the faulty variable to create an entity, but the entity is created at position {0.5, 0.5} on the server and at {3.5, 0.5} on the client. This will most likely not cause any desync to be reported, but if a character starts moving and collides with this entity, the entity's position will start cascading onto the character's position, which is covered by the heuristic, and a desync will be detected.
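
The cascade in that example can be sketched as a one-dimensional toy model in Python. Everything here is an assumption made for illustration (the movement rule, the 0.5 stop distance, the function names); it only shows that the differing entity position stays invisible until a heuristic-covered value, the character position, is affected by it.

```python
def step(character_x: float, entity_x: float, speed: float) -> float:
    """Move the character right; stop just before the entity if it is in the way."""
    target = character_x + speed
    if character_x < entity_x <= target:
        return entity_x - 0.5  # collision: the character cannot pass the entity
    return target


def first_detected_tick(server_entity_x, client_entity_x,
                        start_x=-2.0, speed=1.0, ticks=10):
    """Return the first tick at which the character positions differ, or None."""
    server_x = client_x = start_x
    for tick in range(1, ticks + 1):
        server_x = step(server_x, server_entity_x, speed)
        client_x = step(client_x, client_entity_x, speed)
        if server_x != client_x:  # stand-in for the heuristic comparison
            return tick
    return None


# Entity at x=0.5 on the server, x=3.5 on the client.
print(first_detected_tick(0.5, 3.5))             # 3: the character hits the server's entity
print(first_detected_tick(0.5, 3.5, speed=0.0))  # None: nobody moves, nothing is detected
```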

Because of this, your statement in point 2 about a "false positive" is just wrong. It is not a false positive. It is a real desync.

Post by archont »

Fascinating! Thank you very much!

Note that placing a warehouse from Factorissimo3 immediately and always causes failure of /heavy-mode - so it does cause a de facto desync state per your definition. As part of the normal "happy path" behavior of the mod.

But the nature of the desync it causes somehow does not result in a cascading effect. At least - not under normal circumstances, not until much later - dozens of hours. Then a special event occurs - some sort of rare, unaccounted for interaction, causing a cascade that eventually triggers the canary variables?
Rseding91
Factorio Staff

Post by Rseding91 »

It looks like the desync it's seeing is related to the roboport base_animation not existing and the logic not handling that. I've fixed the error for 2.1. Looking at the logic, it will never actually cause a real game desync and only gets detected because of how /toggle-heavy-mode works (it saves the complete game to compute a CRC and compares that with the other instance). The value is only ever used for drawing and is included in the save just for visual consistency's sake.

If you define a base_animation of anything (1 frame) it should fix the "false-positive" desync you're seeing.
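
To illustrate why a value that is saved but only used for drawing can fail a full-save comparison without affecting the simulation, here is a toy Python model. It is not the actual game logic: the `draw_frame` field, the reset in `load`, and the update rule are all hypothetical stand-ins for "a draw-only value that the loaded instance ends up with differently".

```python
import json
import zlib


def crc(data: dict) -> int:
    return zlib.crc32(json.dumps(data, sort_keys=True).encode())


def load(saved: dict) -> dict:
    restored = dict(saved)
    # Hypothetical: with no animation defined, the loader resets the draw-only frame.
    restored["draw_frame"] = 0
    return restored


def update(state: dict) -> dict:
    new = dict(state)
    new["items_produced"] += 2  # simulation: never reads draw_frame
    new["draw_frame"] += 1      # rendering bookkeeping only
    return new


def compare(ticks: int = 3) -> list:
    """Per tick: (does the full save match?, does the simulation state match?)."""
    original = {"items_produced": 0, "draw_frame": 5}
    results = []
    for _ in range(ticks):
        replica = update(load(original))
        original = update(original)
        save_matches = crc(original) == crc(replica)
        sim_matches = original["items_produced"] == replica["items_produced"]
        results.append((save_matches, sim_matches))
    return results


print(compare())  # [(False, True), (False, True), (False, True)]
```

In this model the full-save check fails every tick while the simulation values stay identical, which matches the description of a mismatch that heavy mode reports but that never turns into a gameplay divergence.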