Parallel Computing and Factorio
Posted: Sun Jun 04, 2017 6:04 pm
Parallel processing is difficult for many reasons. To help with understanding, I'll call multithreaded processes "players". The basic idea is that if you think of separate threads the same way as two or three computers playing over the internet, it helps clarify the sort of problems you will run into. Namely, I will assume that communication between processes is much like communication between computers on a network. If you want data from a different process you must ASK for it, and if you want to make changes to another process's data you must WAIT for confirmation. You have to deal with lag and all sorts of handshaking to make sure there is no desync between separate threads. Every handshake wastes time and precious CPU, and it is a potential coding nightmare. So to give ourselves the least trouble, we want our threads, i.e. players, to do as LITTLE interacting as possible.
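To make the ASK/WAIT cost a bit more concrete, here is a minimal Python sketch of one thread asking another for items. Everything in it (the `joe_thread` function, the `ask` helper, the message tuples, the belt contents) is made up for illustration. It is not how Factorio works internally, just the network analogy written down.

```python
import queue
import threading

def joe_thread(inbox: queue.Queue) -> None:
    """Joe owns his belt; nobody else may read or write it directly."""
    belt = ["iron-plate", "iron-plate"]
    while True:
        request, reply_box = inbox.get()
        if request == "stop":
            break
        if request == "take" and belt:
            reply_box.put(belt.pop())   # confirm by handing over one item
        else:
            reply_box.put(None)         # nothing to give

def ask(inbox: queue.Queue, request: str):
    """Send a request to Joe and block until he answers (the handshake)."""
    reply_box: queue.Queue = queue.Queue(maxsize=1)
    inbox.put((request, reply_box))
    return reply_box.get(timeout=5)

joe_inbox: queue.Queue = queue.Queue()
joe = threading.Thread(target=joe_thread, args=(joe_inbox,))
joe.start()

# Bill (here: the main thread) has to stop and wait on every single request.
taken = [ask(joe_inbox, "take") for _ in range(3)]

joe_inbox.put(("stop", None))
joe.join()
print(taken)
```

Every call to `ask` is a full round trip where Bill does nothing but wait. That waiting is the cost the rest of this post tries to avoid.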
With that in mind here are some simple rules for separate players. Everything owned by a player only ever interacts with that player. Inserters will only pull from player owned belts, will only put into player owned assemblers, and will only share logistics with player owned networks. Belts will stop cold when hitting another player's belt and will not exchange contents in any way. Pipes are the same and will not interact between players in any way. Logistic bots and networks will only interact with structures associated with the same player. For all intents and purposes the factory of one player is completely blind and oblivious to any other player. We don't want Bill's inserters to be asking Joe "hey what's on your belt" and be waiting for Joe's belt to tell us "I have iron plates" and then Bill asks "I'm taking this iron plate plz?" and waiting for Joe to say "yeah sure I don't want it" and then Bill says "Thanks" and Joe says "NP". It is easier for Bill to say "Oh my inserter doesn't connect to anything I own. I can't do anything". This isolation of data lets our processes run freely independent of each other to reduce our headaches and maximize our parallel processing potential.
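As a rough sketch of that rule in Python: tag every entity with an owner and filter on it, so a foreign belt is simply invisible. The `Entity` class and `reachable` function are hypothetical names for illustration, not anything from the game's code.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entity:
    kind: str    # e.g. "inserter", "belt", "assembler"
    owner: str   # the "player" (thread) that owns this entity

def reachable(inserter: Entity, neighbours: list[Entity]) -> list[Entity]:
    """Return only the neighbours the inserter is allowed to touch.

    Anything owned by another player is skipped: no asking, no waiting.
    """
    return [n for n in neighbours if n.owner == inserter.owner]

bills_inserter = Entity("inserter", "Bill")
nearby = [Entity("belt", "Joe"), Entity("assembler", "Bill")]
print(reachable(bills_inserter, nearby))
```

The point of the design is that the check only reads data Bill already owns, so it never has to talk to Joe's thread.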
What remains are several ways separate processes may still be forced to interact:
1) Between players and terrain
2) Between players and power network
3) Between players and trains
4) Between players and combat/aliens
5) Between players and avatars
6) Between dedicated "bridges" that pass items from one player to another
7) Between players and the video renderer
Each interaction is its own problem that has to be solved in some way.
== Terrain ==
We don't want two players occupying the same space in the game world. It's not actually dangerous for the code, because each player is functionally a parallel universe. Nothing from one universe ever sees or directly touches another. But it IS ugly to have factories stacked on top of each other. Some mediating mechanism is required to make sure that one player's assets do not occupy the physical map space of another.
There may also be objects on the world surface that a player interacts with. This part could get very tricky to solve. An inserter may throw an item on the ground for another inserter to pick up, and we don't know which player the item on the ground belongs to. Player1 may drop gears on the ground and player2 might build belts over them. Normally the items would continue down the belt, but player2 is blind to player1's assets so it wouldn't even know the gears are there. Should the items be given to the "world"? That would increase cross talk and we don't want that. A simple answer is to say "woah woah woah. Player1 has something, ANYTHING here. I can't touch this tile."
For the most part there is no reason for player1's factory to interact or weave into player2's factory. We generally want all parts of a spaghetti fortress to have access to itself and the game code is efficient enough to allow this. If we think of player2 as an "outpost" player, it's easier to understand that an outpost is physically separate from the main base. We can then create a physical map exclusion zone so that our players don't even interact through the world's data. Player1's chunks belong to player1 and everything on the ground belongs to him. Player2 can't see it, touch it or even place an inserter anywhere near it. Fewer interactions make parallel processing easier.
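Here is one way the exclusion zone could look, as a small Python sketch. The `ChunkRegistry` class and the one-chunk `margin` are my own assumptions for illustration. The only real fact used is that Factorio chunks are 32x32 tiles. In a threaded game this registry would be the "mediating" piece, so claims would have to go through a single place.

```python
CHUNK_SIZE = 32  # Factorio chunks are 32x32 tiles

class ChunkRegistry:
    """Tracks which player owns which chunk (hypothetical helper)."""

    def __init__(self, margin: int = 1):
        self.margin = margin   # empty chunks to keep between different players
        self.owner_of = {}     # (chunk_x, chunk_y) -> player name

    @staticmethod
    def chunk_of(x: int, y: int) -> tuple[int, int]:
        # Floor division also maps negative tile coordinates correctly.
        return (x // CHUNK_SIZE, y // CHUNK_SIZE)

    def can_build(self, player: str, x: int, y: int) -> bool:
        """True if no OTHER player owns this chunk or one within the margin."""
        cx, cy = self.chunk_of(x, y)
        m = self.margin
        for dx in range(-m, m + 1):
            for dy in range(-m, m + 1):
                owner = self.owner_of.get((cx + dx, cy + dy))
                if owner is not None and owner != player:
                    return False
        return True

    def build(self, player: str, x: int, y: int) -> bool:
        """Claim the chunk for the player if the exclusion rule allows it."""
        if not self.can_build(player, x, y):
            return False
        self.owner_of[self.chunk_of(x, y)] = player
        return True

reg = ChunkRegistry()
print(reg.build("player1", 0, 0))    # player1 claims chunk (0, 0)
print(reg.build("player2", 40, 0))   # chunk (1, 0) is next to player1's chunk
```

With the margin in place, two players can never have entities in adjacent chunks, which is what lets each thread ignore the other's ground items entirely.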
== Power Network ==
Linking power networks between players isn't too bad as there's only one single "resource" being coordinated between separate threads. The simple solution is to not allow separate player power networks to interact in any way. But we may want a huge central nuclear plant that feeds everything regardless of player count. In order for this to work let's use a separate "energy broker".
Each player calculates the power it has available and reports it to the broker. It then requests power from the broker. The broker decides how much power each player is either exporting or importing. Making the power transfer fully fluid and two-way will involve some coordination between threads to sync power storage and consumers. I won't claim how easy or hard that will be. For the most part players will be satisfied with a large central power source that only exports to other players. For example player1 has a huge nuke plant and is exporting power to all of player2's outposts. That's not a huge burden to code.
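A minimal sketch of what the broker's decision could look like for one tick, in Python. The function name `broker_power`, the report format, and the proportional split are all assumptions on my part. The post only says that players report, request, and get told how much they export or import.

```python
def broker_power(reports: dict) -> dict:
    """Decide per-player transfers for one tick.

    reports maps player -> (produced, wanted).
    Returns player -> transfer, where positive means import
    and negative means export.
    """
    surplus = {p: max(prod - want, 0.0) for p, (prod, want) in reports.items()}
    deficit = {p: max(want - prod, 0.0) for p, (prod, want) in reports.items()}
    total_surplus = sum(surplus.values())
    total_deficit = sum(deficit.values())
    moved = min(total_surplus, total_deficit)  # cannot move more than either side has

    transfers = {}
    for p in reports:
        if surplus[p] > 0:
            # Exporters give in proportion to their share of the surplus.
            transfers[p] = -moved * surplus[p] / total_surplus
        elif deficit[p] > 0:
            # Importers receive in proportion to their share of the demand.
            transfers[p] = moved * deficit[p] / total_deficit
        else:
            transfers[p] = 0.0
    return transfers

reports = {
    "player1": (100.0, 20.0),  # big central plant
    "player2": (0.0, 30.0),    # outpost with no generators
    "player3": (0.0, 10.0),
}
print(broker_power(reports))
```

Each player only sends one small report and gets one number back per tick, so the cross-thread traffic stays tiny no matter how many poles and machines sit behind it.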
== Trains ==
If you can't trade resources between players, what's the point of running parallel processes? We need a way to move items between players. Trains move lots of stuff, they can't interact with anything while on the move, and they travel large physical distances that we can use to separate our players. They're perfect. Keeping data synced is as simple as ensuring a train only obeys one player at a time. The train goes to player2's ore outpost, is loaded by player2's inserters, travels to player1's base and unloads into player1's chests. Our player interaction is a single bulk transaction as the train changes ownership from player2 to player1.
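The ownership handoff above can be sketched in a few lines of Python. The `Train` and `Station` classes and the `arrive`, `load`, and `unload` functions are hypothetical names for illustration. In a real threaded game the `arrive` step is where the actual synchronization would have to happen.

```python
from dataclasses import dataclass, field

@dataclass
class Station:
    owner: str
    chest: dict = field(default_factory=dict)   # item name -> count

@dataclass
class Train:
    owner: str
    cargo: dict = field(default_factory=dict)   # item name -> count

def arrive(train: Train, station: Station) -> None:
    """The only cross-player step: the train changes hands in one go."""
    train.owner = station.owner

def _check(train: Train, station: Station) -> None:
    if train.owner != station.owner:
        raise PermissionError("train and station belong to different players")

def load(train: Train, station: Station, item: str, count: int) -> int:
    """Move up to `count` items from the station's chest onto the train."""
    _check(train, station)
    moved = min(count, station.chest.get(item, 0))
    station.chest[item] = station.chest.get(item, 0) - moved
    train.cargo[item] = train.cargo.get(item, 0) + moved
    return moved

def unload(train: Train, station: Station, item: str) -> int:
    """Move all of one item from the train into the station's chest."""
    _check(train, station)
    moved = train.cargo.pop(item, 0)
    station.chest[item] = station.chest.get(item, 0) + moved
    return moved

outpost = Station("player2", {"iron-ore": 100})
base = Station("player1")
train = Train("player1")

arrive(train, outpost)                 # train now obeys player2
load(train, outpost, "iron-ore", 80)
arrive(train, base)                    # train now obeys player1 again
unload(train, base, "iron-ore")
print(base.chest, outpost.chest)
```

Loading and unloading only ever touch data belonging to the train's current owner, so the whole cargo crosses the player boundary in the single `arrive` call.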
A train needs access to all the rail network data in order to do its job properly. It needs to know where all the stations are and how to get between them. A train also needs to transfer data (items) between players. In that respect a train can't function very well if it has to ask other players where all its things are. It may be fruitful to have a "train player" that owns and manages all the train assets. A train player has very few interactions for its assorted assets. It merely needs to secure territory for its stuff (rails, signals) and know if a station is open or closed. Everything else important happens with the train itself.
There will be issues with combinators and circuit networks that interact with train signals. For example a train crosswalk uses gates and combinators to turn rail signals on and off. This sort of behavior can't be isolated in player data and needs to be directly communicated to trains. It may be an easy problem or it may be a hard one. I don't have any answers here, sorry.
== Combat/Turrets/Aliens ==
Players need access to all the data involving where biters are and what they are doing. Generally, as long as one player's gun range can touch another player's gun range, they HAVE to interact in some way. There is a potentially huge issue of mixed player turrets coordinating their attacks, and of making sure biters have all the data they need to engage targets from different players. If we use exclusion zones around a player's stuff, most of these problems disappear. Turret lines won't be mixed because players can't build in each other's zones, so there is no need for cross talk between players. Players close to each other may still cause a desync if both turret lines fire on the same target. Stuff like that. It is also possible to force all combat and associated assets to obey player1, but that would screw player2 outposts that can no longer interact with their guns. There are some issues involved and I don't claim to know what they all are or all the ways to address them.
== Between players and avatars ==
Yes, I use the word "players" to describe a parallel process, but this time I'm talking about the guy behind the keyboard. The user avatar must be able to interact with all the stuff in the world. I suspect this isn't very different from hosting your own network game and then connecting to it. Most interactions are going to involve grabbing something or placing something, and every one of those requests will have to be synced with whichever factory the player talks to. The number of requests is capped by the player's APM, so it won't cause crazy overhead.
The personal roboport will have to be sync'd so it doesn't blow up between different factories, and logistic networks may do funny things if the player crosses boundaries. There may be more things to worry about, of course.
== Dedicated bridges ==
Sometimes we want players to interact directly with each other. Maybe your rocket factory is SO big that you have no choice but to cut it down the middle. Maybe you want to run a massive pipe or belt from a distant player2 outpost. A dedicated bridge to connect two players is needed. You only need three types of bridges: a belt for solid items, a pipe for fluids, and a pole for electricity. In effect the bridge consumes items from one player's side and spawns them on the other.
Why a bridge item? As I said at the very top, every exchange between two processes is expensive on CPU. You don't want that code running for every single thing in the game. Chances are you only need a handful of bridges to make things work. Take a 16 wide belt for example, you want to split 8 belts for one factory and 8 belts for another factory. That's only 8 conveyor bridges to feed the second factory.
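The "consume on one side, spawn on the other" idea could be sketched like this in Python. The `BeltBridge` class and its double-buffer design are my own assumption, not something from the game. The idea is that the sending player only writes one list and the receiving player only reads the other, with a single `swap` between ticks while neither thread is running.

```python
class BeltBridge:
    """One-way item bridge between two players (hypothetical sketch)."""

    def __init__(self, capacity: int = 8):
        self.capacity = capacity
        self._outgoing = []   # written only by the source player during a tick
        self._incoming = []   # read only by the destination player during a tick

    def push(self, item: str) -> bool:
        """Source side: consume an item from the belt if there is room."""
        if len(self._outgoing) >= self.capacity:
            return False      # bridge is backed up, the belt just stops
        self._outgoing.append(item)
        return True

    def swap(self) -> None:
        """Called once between ticks, when neither player is running."""
        room = self.capacity - len(self._incoming)
        self._incoming.extend(self._outgoing[:room])
        del self._outgoing[:room]

    def pull(self):
        """Destination side: spawn the next item, or None if nothing arrived."""
        return self._incoming.pop(0) if self._incoming else None

bridge = BeltBridge(capacity=2)
bridge.push("gear")
bridge.push("gear")
print(bridge.pull())   # nothing has crossed yet, the swap has not happened
bridge.swap()
print(bridge.pull())
```

The cost is one tick of delay on the bridge, and in exchange there is exactly one sync point per bridge per tick instead of a handshake per item.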
The train problem might be solved by using dedicated train bridges. Player1's train enters on player1's tracks, goes through a checkpoint, and is now player2's train on player2's tracks. Nothing else on the train network interacts between players. The checkpoint serves to connect train station information between the two players and also to hide any lag related glitches the train might have switching sides.
== Video stuff ==
You got me. I don't know how that stuff works. I only know that many games dump video rendering on separate threads. Can multiple sets of data talk to the renderer, or will we be stuck with weird things like player1's stuff vanishes when player2 moves into view? I dunno. It requires study.
== To Infinity and Beyond ==
You've already gone through all this effort to break factories down into units that can be parallelized. What if players were actual players, each running a factory from their home computer? The host may still carry the bulk of the shared work, handling things like pollution, trains, and biters. But other players might join in, build their mega factory, and try to spaghetti it together with another mega factory. Who knows what the possibilities might be?
== The end ==
If you made it this far, thanks for reading! If you skipped directly to the end thanks for TL;DRing! I don't actually know anything about parallel computing and basically pulled everything here out of my ass. If I'm right or wrong or made grievous errors that will plague humanity for years to come, sorry! I didn't mean it. But I hope I gave you guys some ideas about what might work and maybe get some serious discussion on how a parallel Factorio could work.