Friday Facts #340 - Deep desyncs

Regular reports on Factorio development.
raidho36
Long Handed Inserter
Long Handed Inserter
Posts: 93
Joined: Wed Jun 01, 2016 2:08 pm
Contact:

Re: Friday Facts #340 - Deep desyncs

Post by raidho36 »

posila wrote: Sun Mar 29, 2020 2:43 pm you are just making broad assumptions
My bad. When people have problems like this, you almost always discover a whole slew of deeper rooted problems once you start shoveling through all of it. Call it PTSD if you will. It seemed obvious to me that if you have sync problems, it's because your gamestate data wasn't shared over the net correctly, not because your custom mod save file failed to generate properly. Maybe you could elaborate on exactly how a corrupted save (which I assume is common for all players) causes synchronization issues.

I don't believe the example is a valid behavior for a serializer. It would first create a table and only then fill it up, at which point it already has this very table created so it can insert it as a self-reference. Surely it's meant to be a circular reference example, or more generally an example where the referenced table isn't yet created. But I would also argue that it should create a blank table immediately and fill it up eventually. Or, the same data serializer could be used as the one for networking - on assumption that it serializes everything and doesn't leaves out custom script data. I will still berate you for using a serializer that just has bare "loadstring" on its deserialization function, stuff like that is just not acceptable. The array length operator instability is generally a noob trap, I don't use it unless it's something not performance critical and dead simple (not to mention in LuaJIT doing your own book keeping on array length is a lot faster).

When rookie makes a rookie mistake it's par for the course, it would be odd if they didn't. When a professional makes a rookie mistake - not so much. You see my point?
Rseding91 wrote: Sun Mar 29, 2020 2:44 pm I don't see how feeding the trolls is helpful here. If someone wants to actually be helpful they would be helpful. The internet is full of "internet programmers" who like to act like they are smarter than everyone else and yet they aren't the ones who are making Factorio: we (Wube) are.
"Everyone who has (strong) criticisms is a troll" is a convenient position to have, because you can deflect any and all criticism and still feel good about yourself. Well, the "it just needs to function" also means that no amount of questionable programming can ever prevent your game from being a success - Notch will vouch for that. So you can just keep doing what you're doing no matter what, I'd wish you luck but clearly you don't need it as you're making mad profits as it is.
Last edited by raidho36 on Sun Mar 29, 2020 4:11 pm, edited 1 time in total.
User avatar
ptx0
Smart Inserter
Smart Inserter
Posts: 1507
Joined: Wed Jan 01, 2020 7:16 pm
Contact:

Re: Friday Facts #340 - Deep desyncs

Post by ptx0 »

Oktokolo wrote: Sun Mar 29, 2020 1:36 pm
raidho36 wrote: Sun Mar 29, 2020 6:03 am Wow. You guys just keep making rookie's mistakes when it comes to programming, it's embarrassing.
ptx0 wrote: Sun Mar 29, 2020 6:33 am yeah, they had a quick release recently because looking at the world in map view caused it to crash almost immediately.
Wube really should have outsourced making the engine to you two. Obviously, having it done by developers who never let any testable bug slip though is better, than letting actual humans write the code. Humans err, but you two obviously don't.
i can't speak for that guy but yeah, you should ACTUALLY BE WRITING TEST CASES when you're putting a code change into the source tree.

code coverage should INCREASE with each successive commit.

tests should pass.

event handlers should thus ALWAYS WORK.

this is what regression testing is. you fix a bug, you make a reproducible test case, you add it to the test suite so that it NEVER HAPPENS AGAIN.

but i've seen several regressions in .18.
User avatar
ptx0
Smart Inserter
Smart Inserter
Posts: 1507
Joined: Wed Jan 01, 2020 7:16 pm
Contact:

Re: Friday Facts #340 - Deep desyncs

Post by ptx0 »

Rseding91 wrote: Sun Mar 29, 2020 2:44 pm I don't see how feeding the trolls is helpful here. If someone wants to actually be helpful they would be helpful. The internet is full of "internet programmers" who like to act like they are smarter than everyone else and yet they aren't the ones who are making Factorio: we (Wube) are.
can you please elaborate on your thoughts of how we could be helpful? other than the two desyncs that I brought forward in a reproducible way that were then resolved by boskid and discussed in this FFF?

i don't see why you'd put ""s around "internet programmers", or even why you'd add the word "internet" except as a sleight toward anyone who tries to give technical advice. doesn't that place you squarely on the level ground with a troll? don't see anyone here acting like they're smarter than anyone else, just a few of you at Wube.

it's the benefit we have of the internet to be able to communicate like this and give others ideas and feedback on something as complicated as programming a game. don't throw that away by calling us "trolls".

btw, it looked like Bilka and posila were doing just fine on their own without you coming in to show them they aren't as smart as you are for "feeding the trolls".
User avatar
boskid
Factorio Staff
Factorio Staff
Posts: 4262
Joined: Thu Dec 14, 2017 6:56 pm
Contact:

Re: Friday Facts #340 - Deep desyncs

Post by boskid »

ptx0 wrote: Sun Mar 29, 2020 4:07 pm i can't speak for that guy but yeah, you should ACTUALLY BE WRITING TEST CASES when you're putting a code change into the source tree.
There is a lot of tests, i think they are even shipped with portable package in directory `tests`, so you can run them locally. Tests for my change of serpent are under Lua/SerpentDumpIterationOrder. Tests for unit group max speed are under UnitGroup/UnitGroupMaxSpeedWhenChangingUnitSpeed and UnitGroup/UnitGroupMaxSpeedWhenUnitsAffectedByTiles.
User avatar
LotA
Fast Inserter
Fast Inserter
Posts: 117
Joined: Fri Oct 10, 2014 11:41 am
Contact:

Re: Friday Facts #340 - Deep desyncs

Post by LotA »

Well I'm a developer myself and I stay on my position : I love you Wube. Don't ever let those internet "smart-ass behind my desk" make you feel bad

I love you for the inner, in-depth stories & adventures about the dev
I love you for having the ball to admit when you failed on something and actually correcting the problem

But most of all,
I love you for taking such a good care of your engine. I mean most dev team would just sweep under the carpet defects in the engine and just try to hide it. You don't and that is priceless <3

Factorio has flaws of course, nothing a human makes can be perfect. It's just philosophical.
Yet, Factorio is to my knowledge one of the most beautiful baby, one of the most cherished project a developer can make.
User avatar
ptx0
Smart Inserter
Smart Inserter
Posts: 1507
Joined: Wed Jan 01, 2020 7:16 pm
Contact:

Re: Friday Facts #340 - Deep desyncs

Post by ptx0 »

boskid wrote: Sun Mar 29, 2020 4:27 pm
ptx0 wrote: Sun Mar 29, 2020 4:07 pm i can't speak for that guy but yeah, you should ACTUALLY BE WRITING TEST CASES when you're putting a code change into the source tree.
There is a lot of tests, i think they are even shipped with portable package in directory `tests`, so you can run them locally. Tests for my change of serpent are under Lua/SerpentDumpIterationOrder. Tests for unit group max speed are under UnitGroup/UnitGroupMaxSpeedWhenChangingUnitSpeed and UnitGroup/UnitGroupMaxSpeedWhenUnitsAffectedByTiles.
i know you did, because you and i discussed that briefly the other day before the fix was released.

but the other releases where things have broken indicate zero test coverage for those changes.
User avatar
ptx0
Smart Inserter
Smart Inserter
Posts: 1507
Joined: Wed Jan 01, 2020 7:16 pm
Contact:

Re: Friday Facts #340 - Deep desyncs

Post by ptx0 »

boskid wrote: Sun Mar 29, 2020 4:27 pm There is a lot of tests, i think they are even shipped with portable package in directory `tests`, so you can run them locally.

Code: Select all

% ls factorio-beta/spaceexploration 
achievements.dat         archive  blueprint-storage.dat  config-path.cfg  data      dumps                 factorio-previous.log  player-data.json  scenarios      temp
achievements-modded.dat  bin      config                 crop-cache.dat   doc-html  factorio-current.log  mods                   saves             script-output
there is no such thing on my local install. am I supposed to be looking somewhere specifically in here?
User avatar
boskid
Factorio Staff
Factorio Staff
Posts: 4262
Joined: Thu Dec 14, 2017 6:56 pm
Contact:

Re: Friday Facts #340 - Deep desyncs

Post by boskid »

ptx0 wrote: Sun Mar 29, 2020 4:35 pm there is no such thing on my local install. am I supposed to be looking somewhere specifically in here?
Maybe it is only in portable for windows, i am never looking into portable for other OSes
User avatar
MakeItGraphic
Fast Inserter
Fast Inserter
Posts: 237
Joined: Sat Jan 06, 2018 7:53 am
Contact:

Re: Friday Facts #340 - Deep desyncs

Post by MakeItGraphic »

I have the test folder, portable install.
testfolder.PNG
testfolder.PNG (95.14 KiB) Viewed 9788 times
User avatar
BattleFluffy
Fast Inserter
Fast Inserter
Posts: 215
Joined: Sun Mar 31, 2019 4:58 pm
Contact:

Re: Friday Facts #340 - Deep desyncs

Post by BattleFluffy »

Rseding91 wrote: Sun Mar 29, 2020 2:44 pm I don't see how feeding the trolls is helpful here. If someone wants to actually be helpful they would be helpful. The internet is full of "internet programmers" who like to act like they are smarter than everyone else and yet they aren't the ones who are making Factorio: we (Wube) are.
Well said !
Whether this guy has any merit to his claims is besides the point - he is being rude, going around calling people rookies - even if he were the most knowledgable and talented programmer in the world, he'd still be a jerk and nobody would want to work with him. It's his own fault that nobody wants to listen to his poorly worded explanations. I constructively criticise Factorio constantly but I am always careful to be respectful and balanced about it.
raidho36
Long Handed Inserter
Long Handed Inserter
Posts: 93
Joined: Wed Jun 01, 2016 2:08 pm
Contact:

Re: Friday Facts #340 - Deep desyncs

Post by raidho36 »

BattleFluffy wrote: Mon Mar 30, 2020 2:51 am It's his own fault that nobody wants to listen to his poorly worded explanations. I constructively criticise Factorio constantly but I am always careful to be respectful and balanced about it.
Good thing I'm not trying to be constructive, I'm just shaming the devs for making mistakes far below their skill level. The skill level expected of them anyway. Not to make them mad mind you, only to point out that they weren't supposed to make mistakes like this.
posila
Former Staff
Former Staff
Posts: 5448
Joined: Thu Jun 11, 2015 1:35 pm
Contact:

Re: Friday Facts #340 - Deep desyncs

Post by posila »

raidho36 wrote: Sun Mar 29, 2020 3:28 pm It seemed obvious to me that if you have sync problems, it's because your gamestate data wasn't shared over the net correctly, not because your custom mod save file failed to generate properly. Maybe you could elaborate on exactly how a corrupted save (which I assume is common for all players) causes synchronization issues.
I don't know how much back have your read FFF, so I just throw in link to this ancient one: https://www.factorio.com/blog/post/fff-76, on high level, MP still works the same, except for it not being peer-to-peer anymore.
Corrupted save to me is save that doesn't load, but I assume you mean save that deserializes to different state than it was serialized from (we say such saves are not save-load stable). In general (so this does not apply to just script state serialization), server is running and updating it's state, then client wants to connect, so server saves out it's state (makes a save file) and transfers it to client - which then loads it, but due to a bug the game state of client does not end up being exactly same as state on the server - so following updates will start off from different initial state. Had server (and all already connected clients) reloaded the map when new client connects, this would not be a problem, but that would just hide the issue.

I acknowledge that's not only possible way how to make the lua serialization/deserialization, but Serpent works well enough for us for the past 7-8 years, so there was no need to replace it. Which I could not say about many other things in the game.
dee-
Filter Inserter
Filter Inserter
Posts: 416
Joined: Mon Jan 19, 2015 9:21 am
Contact:

Re: Friday Facts #340 - Deep desyncs

Post by dee- »

posila wrote: Mon Mar 30, 2020 12:45 pm
raidho36 wrote: Sun Mar 29, 2020 3:28 pm It seemed obvious to me that if you have sync problems, it's because your gamestate data wasn't shared over the net correctly, not because your custom mod save file failed to generate properly. Maybe you could elaborate on exactly how a corrupted save (which I assume is common for all players) causes synchronization issues.
I don't know how much back have your read FFF, so I just throw in link to this ancient one: https://www.factorio.com/blog/post/fff-76, on high level, MP still works the same, except for it not being peer-to-peer anymore.
Corrupted save to me is save that doesn't load, but I assume you mean save that deserializes to different state than it was serialized from (we say such saves are not save-load stable). In general (so this does not apply to just script state serialization), server is running and updating it's state, then client wants to connect, so server saves out it's state (makes a save file) and transfers it to client - which then loads it, but due to a bug the game state of client does not end up being exactly same as state on the server - so following updates will start off from different initial state. Had server (and all already connected clients) reloaded the map when new client connects, this would not be a problem, but that would just hide the issue.

I acknowledge that's not only possible way how to make the lua serialization/deserialization, but Serpent works well enough for us for the past 7-8 years, so there was no need to replace it. Which I could not say about many other things in the game.
Would checksumming help here?
1. On save: serialize memory -> disk, create checksum from disk-file: ChkA
2a. Load game: deserialize disk -> memory
2b. After loading immediately serialize again to a temp-file: memory -> disk, create checksum from disk-file: ChkB
3. Compare ChkA with ChkB to see if the deserialization rpdued the exact same output (so loading probably restored the previous memory situation)
Its not 100% proof as the deserialization could be broken and create the same checksum even when memory on save and after load differ but maybe a bit better
User avatar
Klonan
Factorio Staff
Factorio Staff
Posts: 5423
Joined: Sun Jan 11, 2015 2:09 pm
Contact:

Re: Friday Facts #340 - Deep desyncs

Post by Klonan »

dee- wrote: Mon Mar 30, 2020 12:51 pm Would checksumming help here?
We already do checksums to see if the game state has desynced, but it doesn't help other than telling us there is a problem.

We have runtime mode 'heavy-mode' which saves and loads the game each tick and compares the checksum, and we run tests in both normal and heavy mode, which catches a quite a few determinism issues.

Also when we detect the checksums are different, and the game has desynced, we dump the save game with extra tags so we can see where the differences are,
From there we investigate how the game code has resulted in these differences
posila
Former Staff
Former Staff
Posts: 5448
Joined: Thu Jun 11, 2015 1:35 pm
Contact:

Re: Friday Facts #340 - Deep desyncs

Post by posila »

dee- wrote: Mon Mar 30, 2020 12:51 pmWould checksumming help here?
Yes, that's how part of what happens during heavy mode testing (which can be toggled on on server during runtime also) ... https://www.factorio.com/blog/post/fff-63

But even this step is little bit too ... performance heavy ... to do it just preemptively whenever someone connects.
User avatar
ptx0
Smart Inserter
Smart Inserter
Posts: 1507
Joined: Wed Jan 01, 2020 7:16 pm
Contact:

Re: Friday Facts #340 - Deep desyncs

Post by ptx0 »

posila wrote: Mon Mar 30, 2020 1:23 pm
dee- wrote: Mon Mar 30, 2020 12:51 pmWould checksumming help here?
Yes, that's how part of what happens during heavy mode testing (which can be toggled on on server during runtime also) ... https://www.factorio.com/blog/post/fff-63

But even this step is little bit too ... performance heavy ... to do it just preemptively whenever someone connects.
there's also game.force_crc() which does the same thing but not every tick and can be used to trigger desync in multiplayer where heavy mode is not that useful. you can't even connect to a server that has heavy mode enabled.
dee-
Filter Inserter
Filter Inserter
Posts: 416
Joined: Mon Jan 19, 2015 9:21 am
Contact:

Re: Friday Facts #340 - Deep desyncs

Post by dee- »

posila wrote: Mon Mar 30, 2020 1:23 pm
dee- wrote: Mon Mar 30, 2020 12:51 pmWould checksumming help here?
Yes, that's how part of what happens during heavy mode testing (which can be toggled on on server during runtime also) ... https://www.factorio.com/blog/post/fff-63

But even this step is little bit too ... performance heavy ... to do it just preemptively whenever someone connects.
Wow, great. So I am really behind the times, FFF-63 :D
User avatar
BattleFluffy
Fast Inserter
Fast Inserter
Posts: 215
Joined: Sun Mar 31, 2019 4:58 pm
Contact:

Re: Friday Facts #340 - Deep desyncs

Post by BattleFluffy »

raidho36 wrote: Mon Mar 30, 2020 6:11 am Good thing I'm not trying to be constructive, I'm just shaming the devs for making mistakes far below their skill level. The skill level expected of them anyway.
And I'm shaming you for being a jerk. Learn some manners.
raidho36
Long Handed Inserter
Long Handed Inserter
Posts: 93
Joined: Wed Jun 01, 2016 2:08 pm
Contact:

Re: Friday Facts #340 - Deep desyncs

Post by raidho36 »

BattleFluffy wrote: Mon Mar 30, 2020 10:37 pm And I'm shaming you for being a jerk. Learn some manners.
Fair enough. I subscribe to Louis Rossmann school of criticism, not so much by choice as because I am who I am, and it coincides nicely.
posila wrote: Mon Mar 30, 2020 12:45 pm server is running and updating it's state, then client wants to connect, so server saves out it's state (makes a save file) and transfers it to client
Yes I was aware how clients connect to server. I assumed that the error sometimes occurred mid-game thus being an annoyance, as opposed to every time on connect thus being game-breaking. Mostly because the blog post text downplayed the severity of the bug. The save is corrupted as if doesn't restore the gamestate into identical state, the layman rater than technical meaning of the term.

At this point it's moot but an alternative connection strategy might have worked better - same as in "regular" MMO games - when the client connects into a blank world and then asks the server to give it sync packets about everything, one thing at a time, closest first. It would take about the same amount of time to load the map, but it will be playable much sooner to a new client, plus there wouldn't be the "catching up" problem to deal with. That's of course assuming that the server is only sharing minimal amount of information with clients, instead of sending them literally everything all the time, by extension that the client game doesn't breaks if it's an incomplete gamestate. If the entire map needs to be loaded for each client before anything can work then it doesn't matter, you might as well just pass a map save file.
User avatar
5thHorseman
Smart Inserter
Smart Inserter
Posts: 1193
Joined: Fri Jun 10, 2016 11:21 pm
Contact:

Re: Friday Facts #340 - Deep desyncs

Post by 5thHorseman »

raidho36 wrote: Mon Mar 30, 2020 6:11 am Good thing I'm not trying to be constructive, I'm just shaming the devs for making mistakes far below their skill level. The skill level expected of them anyway. Not to make them mad mind you, only to point out that they weren't supposed to make mistakes like this.
You failed. The only one who should be ashamed is you, and not even for your childish antics. In a few years when you actually write something of any worth (if that ever happens) you'll understand how wrong you are.
Post Reply

Return to “News”