Friday Facts #340 - Deep desyncs

Regular reports on Factorio development.
ptx0
Smart Inserter
Posts: 1507
Joined: Wed Jan 01, 2020 7:16 pm

Re: Friday Facts #340 - Deep desyncs

Post by ptx0 »

5thHorseman wrote: ↑
Tue Mar 31, 2020 1:48 am
raidho36 wrote: ↑
Mon Mar 30, 2020 6:11 am
Good thing I'm not trying to be constructive, I'm just shaming the devs for making mistakes far below their skill level. The skill level expected of them anyway. Not to make them mad mind you, only to point out that they weren't supposed to make mistakes like this.
You failed. The only one who should be ashamed is you, and not even for your childish antics. In a few years when you actually write something of any worth (if that ever happens) you'll understand how wrong you are.
don't think those words you spent so much time writing had any effect that you desired other than to inflate your own ego, much like mine here.

mmmPI
Smart Inserter
Posts: 2673
Joined: Mon Jun 20, 2016 6:10 pm

Re: Friday Facts #340 - Deep desyncs

Post by mmmPI »

That's the only FFF I remember that doesn't end with "let us know what you think on the forum", maybe because it mentioned a sensitive topic; hopefully it didn't seem to prevent anyone from expressing their finest thoughts :twisted:

I think it's cool that the devs can still work on the game during the virus, and also if you know you are generally annoying with other people it might be a good time to chill a bit.

Desyncs didn't feel like a major bug because they occurred so rarely; reading that some were fragile and hard to track is no surprise. There used to be more, and hopefully the trend continues.

Koub
Global Moderator
Posts: 7175
Joined: Fri May 30, 2014 8:54 am

Re: Friday Facts #340 - Deep desyncs

Post by Koub »

[Koub] Please calm down and behave. I'm giving people as much freedom of speech as I can, but the limit of what's acceptable is getting dangerously close.
Koub - Please consider English is not my native language.

Klonan
Factorio Staff
Posts: 5148
Joined: Sun Jan 11, 2015 2:09 pm

Re: Friday Facts #340 - Deep desyncs

Post by Klonan »

mmmPI wrote: ↑
Tue Mar 31, 2020 4:27 am
That's the only FFF I remember that doesn't end with "let us know what you think on the forum", maybe because it mentioned a sensitive topic; hopefully it didn't seem to prevent anyone from expressing their finest thoughts :twisted:
Since we added the 'discuss on reddit' and 'discuss on forum' buttons, I have felt that the "As always, let us know what you think..." is a bit redundant, so I stopped adding it

posila
Factorio Staff
Posts: 5201
Joined: Thu Jun 11, 2015 1:35 pm

Re: Friday Facts #340 - Deep desyncs

Post by posila »

raidho36 wrote: ↑
Tue Mar 31, 2020 12:37 am
BattleFluffy wrote: ↑
Mon Mar 30, 2020 10:37 pm
And I'm shaming you for being a jerk. Learn some manners.
Fair enough. I subscribe to Louis Rossmann school of criticism, not so much by choice as because I am who I am, and it coincides nicely.
I see, but you have failed the classes and are going to take them again, right? Louis Rossmann: A word on criticism

wheybags
Former Staff
Posts: 328
Joined: Fri Jun 02, 2017 1:50 pm

Re: Friday Facts #340 - Deep desyncs

Post by wheybags »

raidho36 wrote: ↑
Tue Mar 31, 2020 12:37 am
At this point it's moot but an alternative connection strategy might have worked better - same as in "regular" MMO games - when the client connects into a blank world and then asks the server to give it sync packets about everything, one thing at a time, closest first.
So, I get that you're kind of an angry internet person, and anything said will be received badly, but if you do work on games, for fun or profit, then it's worth learning how to do multiplayer sync, and the different options available. What you're describing here is one option, but is definitely not the only, or best option for every case.

Glenn Fiedler has some great articles on this topic, here's a blog post chain where he covers various methods of synchronising a physics simulation, including lock-step: https://gafferongames.com/post/introduc ... d_physics/

That series is great and detailed, but if you want a quicker overview, there's this article too: https://gafferongames.com/post/what_eve ... etworking/

Glenn was responsible for designing the netcode for a whole bunch of AAA games, and now runs a games networking company.
Some choice quotes from those articles re lockstep: "it's exceptionally difficult to ensure that a game is completely deterministic", and "The reason [for using lockstep] being that in RTS games the game state consists of many thousands of units and is simply too large to exchange between players"

lethern
Manual Inserter
Posts: 3
Joined: Sat Mar 21, 2020 7:43 pm

Re: Friday Facts #340 - Deep desyncs

Post by lethern »

Dumb person: I know everything
Smart person: I know I don't know anything

go figure

raidho36
Long Handed Inserter
Posts: 93
Joined: Wed Jun 01, 2016 2:08 pm

Re: Friday Facts #340 - Deep desyncs

Post by raidho36 »

wheybags wrote: ↑
Tue Mar 31, 2020 12:33 pm
So, I get that you're kind of an angry internet person, and anything said will be received badly, but if you do work on games, for fun or profit, then it's worth learning how to do multiplayer sync, and the different options available. What you're describing here is one option, but is definitely not the only, or best option for every case.

...
That's an interesting read, unfortunately I didn't learn anything because none of that is new to me. I have been programming for 15 years. You seem to have thought that I don't understand multiplayer networking very much - that's a fair thing to assume about people you don't know, but immediately dropping an introductory article series is a bit condescending I find. But I'm not the one to talk.

Maybe there was a miscommunication. Maybe I worded my idea poorly. I tried to talk about a game join strategy, not game update - that's not relevant. Factorio's game join strategy - as far as I'm told - is to save the game and share the save file with new clients, so that when they actually join they immediately have the full current gamestate (there's also the "catching up" to prevent server gigastutter every time someone joins). An MMO game such as Minecraft would put you into the game immediately, and then the completely blank world gradually gets filled up with relevant gamestate items such as nearby terrain, players, objects and monsters as the client receives sync packets about them, and it's usually the closest-first basis. Only a limited amount of objects get synced with the player, normally based on physical distance. Of course the player also receives sync packets about all the objects that are already loaded. That would be an alternate strategy to the current one and it might have worked better, but again at this stage it's not important anymore, the game's pretty much done and works OK as it is, so nobody's gonna bother with things like that because non broken things don't need fixing.
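The closest-first streaming join described above can be sketched roughly as follows. This is only an illustration of the idea, not how Minecraft or any real server does it: `sync_order`, the entity names, and the 32-unit radius are all hypothetical.

```python
import math


def sync_order(player_pos, entities, radius):
    """Return the names of entities within `radius` of the player, nearest first.

    `entities` maps a (hypothetical) entity name to its (x, y) position.
    Anything farther away than `radius` is not synced at all.
    """
    in_range = []
    for name, pos in entities.items():
        distance = math.dist(player_pos, pos)
        if distance <= radius:
            in_range.append((distance, name))
    # Sorting by distance gives the order in which sync packets would be sent.
    return [name for _, name in sorted(in_range)]


entities = {
    "tree": (1.0, 0.0),
    "zombie": (3.0, 4.0),
    "village": (60.0, 80.0),
    "chest": (0.0, 2.0),
}
print(sync_order((0.0, 0.0), entities, radius=32.0))
```

The village is 100 units away, so with this radius it is never sent; the client starts from a blank world and only ever holds a partial copy of the game state.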

MakeItGraphic
Fast Inserter
Posts: 237
Joined: Sat Jan 06, 2018 7:53 am

Re: Friday Facts #340 - Deep desyncs

Post by MakeItGraphic »

raidho36 wrote: ↑
Wed Apr 01, 2020 7:24 am
wheybags wrote: ↑
Tue Mar 31, 2020 12:33 pm
So, I get that you're kind of an angry internet person, and anything said will be received badly, but if you do work on games, for fun or profit, then it's worth learning how to do multiplayer sync, and the different options available. What you're describing here is one option, but is definitely not the only, or best option for every case.

...
That's an interesting read, unfortunately I didn't learn anything because none of that is new to me. I have been programming for 15 years. You seem to have thought that I don't understand multiplayer networking very much - that's a fair thing to assume about people you don't know, but immediately dropping an introductory article series is a bit condescending I find. But I'm not the one to talk.

Maybe there was a miscommunication. Maybe I worded my idea poorly. I tried to talk about a game join strategy, not game update - that's not relevant. Factorio's game join strategy - as far as I'm told - is to save the game and share the save file with new clients, so that when they actually join they immediately have the full current gamestate (there's also the "catching up" to prevent server gigastutter every time someone joins). An MMO game such as Minecraft would put you into the game immediately, and then the completely blank world gradually gets filled up with relevant gamestate items such as nearby terrain, players, objects and monsters as the client receives sync packets about them, and it's usually the closest-first basis. Only a limited amount of objects get synced with the player, normally based on physical distance. Of course the player also receives sync packets about all the objects that are already loaded. That would be an alternate strategy to the current one and it might have worked better, but again at this stage it's not important anymore, the game's pretty much done and works OK as it is, so nobody's gonna bother with things like that because non broken things don't need fixing.
I hate to add to this discussion but I think you added something VERY important to your post.

"nobody's gonna bother with things like that because non broken things don't need fixing."

The devs are so close to a final product that I think they've reached their limit in the quest for perfection. I have seen numerous posts saying, to paraphrase, "fixing this will cause a cascade of other issues to fix"; it has been said numerous times, in numerous ways.

Introducing anything new into such a refined product this late in development, I can completely understand the hesitation, especially if the gain is minimal. It's not feasible anymore. I've been biased in this perception as well. The way certain devs respond to things can seem like they're brushing stuff off, or just don't care. But I don't believe that is the case.

Simply put, it is no longer feasible to make such small changes when they result in a chain reaction of bug reports and further refinement, all for minimal gain. All that will do is further bloat the source. And the more lines of code are written, the more convoluted things become. You eventually reach a point of futility that results in a cycle of negative returns.

wheybags
Former Staff
Posts: 328
Joined: Fri Jun 02, 2017 1:50 pm

Re: Friday Facts #340 - Deep desyncs

Post by wheybags »

raidho36 wrote: ↑
Wed Apr 01, 2020 7:24 am
none of that is new to me. I have been programming for 15 years. You seem to have thought that I don't understand multiplayer networking very much - that's a fair thing to assume about people you don't know, but immediately dropping an introductory article series is a bit condescending I find.

...

Maybe there was a miscommunication. Maybe I worded my idea poorly. I tried to talk about a game join strategy, not game update - that's not relevant. Factorio's game join strategy - as far as I'm told - is to save the game and share the save file with new clients, so that when they actually join they immediately have the full current gamestate
So, the reason I posted an introductory article like that is that I'm not sure you understand how lock-step multiplayer works (and it's a pretty cool concept, I would recommend those articles to any programmer to read just for fun). Sorry if this was an incorrect assumption.
The thing about lock-step is that the game join strategy is inseparable from the update strategy. If your game is lock-step, then you have to use the game join strategy we use (a less optimal version, where you just freeze the game once someone starts joining, is also possible). Starting to update the game while the world state is still streaming in is just not something that is possible with lock-step, because running an update with some entities missing is very likely to cause a behaviour change, and thus a desync.

This is why I thought you didn't understand lock-step, maybe I am just mis-reading though, and you are actually proposing something else. It reads to me that you are proposing lock-step updates with a streamed in game world, which is not feasible. Maybe this would work if entities could not interact across chunk boundaries, but that wouldn't be much fun really. My point was that factorio is a natural fit for lock-step, because of the massive amount of independently updating entities present in a normal game, and implicitly that comes with the connection strategy we use.
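A toy illustration of why a partially streamed world breaks lock-step. This is a sketch under made-up assumptions, not Factorio code: the chests, the inserter tuples, and the `step`, `simulate` and `checksum` helpers are all hypothetical.

```python
import zlib


def step(chests, inserters):
    """Advance one tick: each inserter moves one item, always in the same order."""
    for source, target in inserters:
        if chests[source] > 0:
            chests[source] -= 1
            chests[target] += 1


def simulate(inserters, ticks):
    """Run the toy world for `ticks` updates from a fixed initial state."""
    chests = {"a": 10, "b": 0, "c": 0}
    for _ in range(ticks):
        step(chests, inserters)
    return chests


def checksum(chests):
    """Hash of the state, which peers could compare each tick to detect a desync."""
    return zlib.crc32(repr(sorted(chests.items())).encode())


all_inserters = [("a", "b"), ("b", "c")]

server = simulate(all_inserters, ticks=5)
synced_client = simulate(all_inserters, ticks=5)
# This client started updating before the second inserter had streamed in.
partial_client = simulate([("a", "b")], ticks=5)

print(server, checksum(server) == checksum(synced_client))
print(partial_client)
```

The fully synced client ends up with the same state as the server, while the client that is missing one inserter ends up with the items stuck in chest "b". No packet was lost and no input differed; the missing entity alone changed the outcome.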

I'm not trying to be condescending, what I'm doing is pretending to myself that you're being polite, and trying to respond as I would in a normal conversation, by posting relevant and (hopefully) interesting information.

raidho36
Long Handed Inserter
Posts: 93
Joined: Wed Jun 01, 2016 2:08 pm

Re: Friday Facts #340 - Deep desyncs

Post by raidho36 »

wheybags wrote: ↑
Wed Apr 01, 2020 11:23 am
The thing about lock-step is that the game join strategy is inseparable from the update strategy. If your game is lock-step, then you have to use the game join strategy we use (a less optimal version, where you just freeze the game once someone starts joining, is also possible). Starting to update the game while the world state is still streaming in is just not something that is possible with lock-step, because running an update with some entities missing is very likely to cause a behaviour change, and thus a desync.

This is why I thought you didn't understand lock-step, maybe I am just mis-reading though, and you are actually proposing something else. It reads to me that you are proposing lock-step updates with a streamed in game world, which is not feasible. Maybe this would work if entities could not interact across chunk boundaries, but that wouldn't be much fun really. My point was that factorio is a natural fit for lock-step, because of the massive amount of independently updating entities present in a normal game, and implicitly that comes with the connection strategy we use.
I didn't make any assumption about what sync method Factorio uses; I did however mention that it requires certain concessions for "streaming join" to work, which may or may not be the case.

Lockstep is a "natural fit" for any game whatsoever as long as the environment is in its most basic configuration: fully deterministic, fully loaded and fully reliable. In practice this means that in order to even use lockstep you have to make workarounds, a lot of them. At some point you may introduce a workaround to deal with not-yet-loaded entities. Or you might acknowledge that maybe it's not the best solution. You've been reducing game logic complexity from frame-by-frame simulation to basic state changes that last a number of frames with no work being actually done the whole time. By extension this has vastly reduced the amount of network traffic that would've been necessary to communicate the changes - for a typical game entity it's a few bytes every few seconds and that's it. Communicating only the changes in immediate proximity to the player makes the traffic burden lesser yet. Maybe lockstep was indeed the best solution with the original brute force design. And now that it's more sophisticated, a more practical method can be used. It won't be, of course, because it works as it is, but in principle.

EDIT: haha wow that went off tangent, I take all of that back.

Klonan
Factorio Staff
Posts: 5148
Joined: Sun Jan 11, 2015 2:09 pm

Re: Friday Facts #340 - Deep desyncs

Post by Klonan »

raidho36 wrote: ↑
Wed Apr 01, 2020 12:26 pm
for a typical game entity it's a few bytes every few seconds and that's it.
And with 100 players running around a factory with 10,000 active entities, the network bandwidth adds up quick.
raidho36 wrote: ↑
Wed Apr 01, 2020 12:26 pm
Maybe lockstep was indeed the best solution with the original brute force design.
I don't see how it's still not the best solution. Once you've synced the game state, everything runs perfectly smoothly and scales really well with more players.
If you jump on a train and zoom off to an outpost, you don't have to wait for the outpost to load in, you don't get bitten by biters you can't see, you don't rubber band or clip around as entities pop into existence.

Also the determinism is super useful in many other ways. For one, replays would not be possible if the results changed each time. Tests and bug reports are perfectly reproducible from the initial conditions.
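The replay point can be shown with a tiny integer-only simulation. This is a sketch, not the game's replay format: the `run` function, its update rule, and the recorded inputs are all hypothetical.

```python
def run(initial_state, inputs, ticks):
    """Replay a toy simulation from its initial state and a log of player inputs.

    `inputs` is a list of (tick, value) pairs. Only integer arithmetic is used,
    so the result depends on nothing but the arguments.
    """
    pending = dict(inputs)
    state = initial_state
    for tick in range(ticks):
        # Fixed update rule (a simple linear congruential step).
        state = (state * 1103515245 + 12345) % 2**31
        # Player input for this tick, if any, is mixed into the state.
        state ^= pending.get(tick, 0)
    return state


recorded_inputs = [(3, 7), (10, 42)]

live_game = run(1, recorded_inputs, ticks=100)
replay = run(1, recorded_inputs, ticks=100)
tampered = run(1, [(3, 8), (10, 42)], ticks=100)

print(live_game == replay)    # same initial state and inputs, same result
print(live_game == tampered)  # one changed input gives a different result
```

Storing only the initial state and the input log is enough to reproduce the whole session, which is also what makes a bug report reproducible.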

orzelek
Smart Inserter
Posts: 3911
Joined: Fri Apr 03, 2015 10:20 am

Re: Friday Facts #340 - Deep desyncs

Post by orzelek »

Klonan wrote: ↑
Wed Apr 01, 2020 2:05 pm
Also the determinism is super useful in many other ways. For one, replays would not be possible if the results changed each time. Tests and bug reports are perfectly reproducible from the initial conditions.
You do know how to make a software engineer laugh :D

bobucles
Smart Inserter
Posts: 1669
Joined: Wed Jun 10, 2015 10:37 pm

Re: Friday Facts #340 - Deep desyncs

Post by bobucles »

Every design choice is always a tradeoff. Factorio uses a very hard deterministic structure where all players run 100% simulation of the game at the same time. One of the nice things is that it's very hard to cheat in such a system. A player who provides bad data will simply desync and can't corrupt the game state for others. That's pretty good for a game where random people can join or leave at any time. The downside is that everyone has to stay perfectly synced, and weak computers drop out.

raidho36
Long Handed Inserter
Posts: 93
Joined: Wed Jun 01, 2016 2:08 pm

Re: Friday Facts #340 - Deep desyncs

Post by raidho36 »

Klonan wrote: ↑
Wed Apr 01, 2020 2:05 pm
...
Determinism is great, unfortunately it's not always there, and when it is it doesn't always stay that way*. Using determinism for synchronization is an easy way to provide yourself with an endless supply of headache. I don't lean on determinism and I advise others against shooting themselves in the foot like this as well.

Lockstep is easily the least bandwidth intensive solution, and that would be true for most games. It is also the least robust of all because nothing whatsoever is allowed to go wrong at any point. If I understand it correctly, the client has to run the entire simulation and that's obviously not great for large worlds and/or weak machines, at some point it becomes straight up unplayable even if the server can handle it fine. Now I'll argue that the bandwidth isn't a huge problem; how much data do entities generate, assuming it's generated properly and not in naive fashion? 32 bytes per second or so? And how many active entities can there be in immediate vicinity? A couple of thousand? That's 500 kilobits per second of traffic per player, without further compression, a rather modest amount compared to Minecraft for example, and that's a pretty crowded scenario - you need a 40x50 array of fast inserters shuffling coal around to really pull off such high density high update rate layout. Bots are high density low update rate, belts are low density low update rate, and other structures will go even lower with exception of green circuit manufacturing cluster which is low density high update rate. Last I checked biters don't even update until they start attacking and even then it's once in a while (maybe I misread that but if I did - it should be like this, they don't need updating every frame). Bandwidth will go up if you take a high speed train, but not by a large factor, maybe 2 or 3 depending on how much data needs to be sent to convey basic state of the entity.
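The back-of-the-envelope figure above works out as follows, using only the numbers given in the paragraph (32 bytes per second per entity, a 40x50 array of inserters). The constant and function names are just for illustration, and the 100-player line borrows the player count from the earlier reply as an assumption.

```python
BYTES_PER_ENTITY_PER_SECOND = 32   # estimate from the post, not a measurement
ACTIVE_ENTITIES = 40 * 50          # the 40x50 inserter array, i.e. 2000 entities


def per_player_kilobits(bytes_per_entity, entities):
    """Uncompressed per-player traffic in kilobits per second."""
    return bytes_per_entity * entities * 8 / 1000


per_player_kbit = per_player_kilobits(BYTES_PER_ENTITY_PER_SECOND, ACTIVE_ENTITIES)
print(per_player_kbit)  # 512.0, i.e. roughly the "500 kilobits per second" quoted

# Assumption: 100 players, each in an equally crowded area.
server_mbit = per_player_kbit * 100 / 1000
print(server_mbit)      # 51.2 megabits per second of outgoing traffic
```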

If you made this game in 1998 and it had to be played over dialup then sure, lockstep would be the way to go. You wouldn't be able to join a running session but if you were there it would run fine. Nowadays you really have to be wasteful to actually encounter problems, be it bandwidth limit or data cap. Datacenters will have no problem running a 100 megabit churning server and the client bandwidth is low enough to be played off cellular network in the middle of nowhere (not in America of course but in general).

* Particularly when any amount of floating point math is involved. If you're trying to pull off floating point math determinism you might as well save yourself the trouble and just admit yourself into mental ward. The stories are countless and I've heard one noteworthy tale of a game becoming nondeterministic when played on two different iOS builds. And if you're into LuaJIT you might have dealt with jitting instability induced nondeterminism, where ambient fluctuations from run to run - such as OS scheduler timings - can eventually cause JIT to produce equivalent but slightly different math assembly for the same equation, and these minute differences can butterfly their way into radically different outcomes. It doesn't usually happen but when it does, it's ugly.
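A small sketch of how "equivalent but slightly different" floating point code can diverge. Python itself evaluates each expression below the same way every time; the point is only that two mathematically equal orderings, such as a compiler or JIT might emit, round differently. The threshold and the `inserter_fires` helper are hypothetical.

```python
# Two orderings of the same sum, equal on paper.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(left_to_right)   # 0.6000000000000001
print(right_to_left)   # 0.6

THRESHOLD = 0.6  # hypothetical game rule: act once the value exceeds 0.6


def inserter_fires(charge):
    """A branch turns a last-bit difference into a different game event."""
    return charge > THRESHOLD


print(inserter_fires(left_to_right))   # True
print(inserter_fires(right_to_left))   # False
```

The two results differ only in the last bit, but once a comparison depends on them the simulations take different branches, and from there the states drift apart.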

Klonan
Factorio Staff
Posts: 5148
Joined: Sun Jan 11, 2015 2:09 pm

Re: Friday Facts #340 - Deep desyncs

Post by Klonan »

raidho36 wrote: ↑
Wed Apr 01, 2020 10:10 pm
Using determinism for synchronization is an easy way to provide yourself with an endless supply of headache. I don't lean on determinism and I advise others against shooting themselves in the foot like this as well.
...
It is also the least robust of all because nothing whatsoever is allowed to go wrong at any point.
...
If you're trying to pull off floating point math determinism you might as well save yourself the trouble and just admit yourself into mental ward.
Well, we're already 8 years and 2,000,000 sales in, so it's a bit late for us to change gears.

Our base game/engine determinism is working alright for us, there was a server with 500 players, across different CPUs, OS (Windows 7, 10, Mac, Linux), continents, and there was no desynchronisation, which I'd say is a pretty good test.

The specific issue in the FFF was related to cyclic reference in mod scripting data, and a small save/load instability with modded units/tiles, which in the scheme of things are not really issues that break the whole concept of using our deterministic lock-step.
We fixed the bugs and moved on.

raidho36
Long Handed Inserter
Posts: 93
Joined: Wed Jun 01, 2016 2:08 pm

Re: Friday Facts #340 - Deep desyncs

Post by raidho36 »

Klonan wrote: ↑
Wed Apr 01, 2020 10:16 pm
Well, we're already 8 years and 2,000,000 sales in, so it's a bit late for us to change gears.

Our base game/engine determinism is working alright for us, there was a server with 500 players, across different CPUs, OS (Windows 7, 10, Mac, Linux), continents, and there was no desynchronisation, which I'd say is a pretty good test.
I didn't suggest that you change it. It's not so much about the sunk cost as about it not being an upgrade at this stage. Maybe it will become an upgrade some time in the future, but probably not.

Hope you didn't use trigonometric floating point functions. ;)

Jap2.0
Smart Inserter
Posts: 2339
Joined: Tue Jun 20, 2017 12:02 am

Re: Friday Facts #340 - Deep desyncs

Post by Jap2.0 »

raidho36 wrote: ↑
Wed Apr 01, 2020 10:38 pm
Hope you didn't use trigonometric floating point functions. ;)
6 years too late there.
There are 10 types of people: those who get this joke and those who don't.

raidho36
Long Handed Inserter
Posts: 93
Joined: Wed Jun 01, 2016 2:08 pm

Re: Friday Facts #340 - Deep desyncs

Post by raidho36 »

Jap2.0 wrote: ↑
Thu Apr 02, 2020 12:25 am
6 years too late there.
Oh jeez. :lol: And then people say that I can't be calling this "rookie mistakes". Oh well, eventually they got all of it working anyway.

Oktokolo
Filter Inserter
Posts: 883
Joined: Wed Jul 12, 2017 5:45 pm

Re: Friday Facts #340 - Deep desyncs

Post by Oktokolo »

raidho36 wrote: ↑
Thu Apr 02, 2020 7:07 am
Oh jeez. :lol: And then people say that I can't be calling this "rookie mistakes". Oh well, eventually they got all of it working anyway.
Too bad they did not know you from the start. They could just have licensed your game engine and spared themselves all the trouble.
