Desync resolver: reverse parallel execution
Posted: Mon Oct 14, 2019 9:43 am
I am not a Factorio coder but I understand programming in general.
Apparently a "game save" contains the entire deterministic state needed to continue running from the moment of the save.
And apparently a desync report contains the deterministic state of both the client and the server at the exact tick where the desync was detected.
If the game engine is truly deterministic, then it seems it should be possible to walk backwards through the program, turning the tick counter back and running the code in reverse, to find where the desync actually began.
In order to wind the tick counter backwards it is necessary to know what choices the emulated lockstep processor made, and what it was last executing when the desync was detected. The reverse process then runs the game backwards through the code, doing the opposite of whatever was programmed, to undo game events.
To make this easier, it may be necessary to log subroutine call events as a sort of "COME FROM" list, so that jumps into code that can be reached from multiple call sites can be undone.
It would also be necessary to log the nondeterministic player action data coming in from the server to the client, so that it can be used to undo player-triggered activity.
But it should not be necessary to log everything the processor did, such as math operations and if/then statements, since much of that can be undone by simply running the actual game code in reverse. Only the significant choice events made outside the direct linear calculations need to be logged.
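To show what I mean, here is a toy sketch in Python. None of this is Factorio code: `World`, `step`, `unstep` and `CHEST_CAPACITY` are names I made up, and the "game" is just a chest that players put items into. The forward step logs the player action plus the one value the clamp throws away, and the reverse step uses that record to restore the previous tick.

```python
from dataclasses import dataclass, field

CHEST_CAPACITY = 10  # made-up limit for this toy example


@dataclass
class World:
    tick: int = 0
    items: int = 0
    undo_log: list = field(default_factory=list)  # one record per executed tick


def step(world: World, inserted: int) -> None:
    """Advance one tick; `inserted` is the player action received for this tick."""
    total = world.items + inserted
    lost = max(0, total - CHEST_CAPACITY)  # overflow that the clamp throws away
    # Log only what cannot be recomputed from the resulting state.
    world.undo_log.append((inserted, lost))
    world.items = total - lost
    world.tick += 1


def unstep(world: World) -> None:
    """Undo the most recent tick using its log record."""
    inserted, lost = world.undo_log.pop()
    world.tick -= 1
    world.items = world.items + lost - inserted


world = World()
snapshots = [(world.tick, world.items)]
for action in [5, 8, 0, 3]:
    step(world, action)
    snapshots.append((world.tick, world.items))

# Walk back to tick 0, checking each restored state against the forward run.
for expected in reversed(snapshots[:-1]):
    unstep(world)
    assert (world.tick, world.items) == expected
```

Note that even in this tiny example the clamp at the capacity limit destroys information, so the discarded amount has to go into the log next to the player action. I would guess the real engine has many more spots like that.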
The person doing the debugging needs to decide how much memory to use to store this state tracking data, which sets a limit on how far back the code can be unwound.
The two saved states from the client and server are wound backwards together until the desynced elements reconverge and the two states are identical again.
The processor clock can then be wound forward and backward repeatedly across the problem code to find the source of the desync.
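Here is a rough Python sketch of the winding-back part, under big assumptions. Stored per-tick snapshots stand in for real reverse execution, `run` is a made-up stand-in for the game loop, and `HISTORY_TICKS` is the memory budget the debugger picks. `rewind_until_equal` steps both histories back together and returns the last tick where client and server still agree, or `None` if the budget did not reach back far enough.

```python
from collections import deque

HISTORY_TICKS = 8  # memory budget: how many ticks each side can unwind


def run(ticks, buggy_tick=None):
    """Toy deterministic simulation; returns only the most recent states."""
    history = deque(maxlen=HISTORY_TICKS)  # older entries fall off the front
    value = 0
    for tick in range(1, ticks + 1):
        value = (value * 31 + tick) % 1000003
        if tick == buggy_tick:
            value += 1  # hypothetical bug that hits only one machine
        history.append((tick, value))
    return history


def rewind_until_equal(server_history, client_history):
    """Step both sides back together until their states match."""
    while server_history and client_history:
        server_tick, server_state = server_history[-1]
        client_tick, client_state = client_history[-1]
        assert server_tick == client_tick, "histories must stay in lockstep"
        if server_state == client_state:
            return server_tick  # last tick where both sides agree
        server_history.pop()
        client_history.pop()
    return None  # ran out of stored history before reconverging


last_good = rewind_until_equal(run(20), run(20, buggy_tick=15))
print("states agree at tick", last_good, "so the desync starts at", last_good + 1)
```

With the bug placed at tick 15 this reports tick 14 as the last good tick. If the bug is older than the stored history (for example `buggy_tick=5` here), the function returns `None`, which is the memory limit mentioned above showing up in practice.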
This would apparently need help from the core development team, as only they know the specific details of how the lockstep processor emulation works.