Using the graphics card to help the CPU out…

User avatar
ssilk
Global Moderator
Posts: 12888
Joined: Tue Apr 16, 2013 10:35 pm
Contact:

Using the graphics card to help the CPU out…

Post by ssilk »

Lately I watched some videos of how people used the graphics card to speed up some complicated calculations. This changed their apps from practically unusable to wow. :)

So I imagine how cool it would be to use a graphics card for physic-simulations of the Factorio world:
- realistic simulation of the fluid system/heat-pipes
- real-time power-loss over distance
- calculating the positions of bots and items on belts in real-time
- calculating inserter positions/state of recipe assembling depending on power-supply
- extremely fast pathfinding
- …

I don’t know much about what would be needed for Factorio to use this magic power, or where the problems lie, but I’m interested to know more.
Cool suggestion: Eatable MOUSE-pointers.
Have you used the Advanced Search today?
Need help, question? FAQ - Wiki - Forum help
I still like small signatures...

SoShootMe
Filter Inserter
Posts: 472
Joined: Mon Aug 03, 2020 4:16 pm
Contact:

Re: Using the graphics card to help the CPU out…

Post by SoShootMe »

ssilk wrote:
Tue Sep 21, 2021 8:04 am
Lately I watched some videos of how people used the graphics card to speed up some complicated calculations.
Videos such as?

I'm also interested. I am by no means an expert, but in the context of Factorio my gut instinct is that there is a deep mismatch between the massive parallelism GPUs offer (which, as I understand it, is the main source of the performance benefit) and the determinism Factorio is built on. At a basic level I think this boils down to there being many dependencies between calculations, which limits parallelism. The game as it stands can't keep many CPU cores busy: I feel that should be an easier problem to solve (in relative terms), yet it isn't, and I doubt that is for lack of trying.

I'd love to be proved wrong (that utilising the GPU could offer an improvement), since I'd learn something in the process!

netmand
Filter Inserter
Posts: 302
Joined: Wed Feb 22, 2017 1:20 am
Contact:

Re: Using the graphics card to help the CPU out…

Post by netmand »

There's plenty of discussion out there on this if you relate the GPU to CPU cores, i.e. multi-threading Factorio...

quinor
Filter Inserter
Posts: 404
Joined: Thu Mar 07, 2013 3:07 pm
Contact:

Re: Using the graphics card to help the CPU out…

Post by quinor »

GPU engineer here. I will omit all of the issues connected with implementing this stuff on GPUs, and the portability of such solutions, and skip directly to the other difficulties.


Basically, a GPU is a bunch (hundreds to thousands) of very small and relatively dumb CPU-like cores. It is also often the case that all (or big groups) of those cores have to run the same code. Technicalities aside, if something can be sped up by using a GPU, it will be much, much easier to speed it up with multithreaded CPU code first. Since Factorio already struggles to utilize multiple CPU threads effectively, it will be even harder to use a GPU. The game would basically have to be redesigned from scratch to allow for that, and while I would love to see that project happen (in my opinion it is certainly possible, even if very hard), it's not a very reasonable thing to do.

Don't hesitate to ping me here or on Discord for more details :)

quyxkh
Smart Inserter
Posts: 1027
Joined: Sun May 08, 2016 9:01 am
Contact:

Re: Using the graphics card to help the CPU out…

Post by quyxkh »

This is doing N-body gravity propagation with Vulkan compute + graphics. On 20000 bodies. 400M acceleration calcs per frame. On, as I recall, a GTX 770. edit: no, I see it now, it says right there: GTX 760.

Not real sure you could get the determinism working acceptably, but it'd sure be sweet to do fluids and heat that way.

User avatar
ssilk
Global Moderator
Posts: 12888
Joined: Tue Apr 16, 2013 10:35 pm
Contact:

Re: Using the graphics card to help the CPU out…

Post by ssilk »

Not everything in my click-stream was interesting, but one article brought me to the idea:

https://www.digitalengineering247.com/a ... imulation/

“We found that almost all types of simulation can be accelerated by the GPU,”

Factorio is nothing but a simulation. Some parts - not all - could be offloaded to the GPU. That has nothing to do with the problems of threading. Let me explain why:

The current problem with parallelism in Factorio is that you cannot simply calculate all the movements on the belts while at the same moment you calculate the inserters and the power supply. Everything depends on the right order to keep determinism intact.

See also here: https://www.worldscientific.com/doi/10. ... 4116500404
“ 1) the states of 32/64 cells in 32/64-bit words (integers) and the next states are computed by the Bitwise Parallel Bulk Computation (BPBC) technique”
and
“ The experimental results show that, the performance of our GPU implementation using GeForce GTX TITAN X is 1350×10⁹ updates per second for 16K-step simulation of 512K × 512K cells stored in the SSD. Since Intel Core i7 CPU using the same technique performs 13.4×10⁹ updates per second, our GPU implementation for the Game of Life achieves a speedup factor of 100.”

Factor 10 for Factorio would be a big success!

What I mean is to use the GPU as a faster CPU. E.g. the robots are quite simple units. The GPU can read all the positions and states of the bots from memory, do one tick/simulation step and store the result back in memory. Same with more complex stuff like belts or inserters. One step after the other, so that determinism is not affected.

The real question is: would that actually be faster? The logic of an inserter is not trivial - much more complicated than the Game of Life. The GPU program needs to do dozens of checks before it even begins to grab one item from a belt. Perhaps you need several implementations of the inserter simulation on the GPU, depending on whether it grabs from a belt, a chest, a wagon, a vehicle, the ground, …
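To illustrate what I mean (just a rough sketch in Python, all names invented - nothing here is how Factorio actually works): you would probably first sort the inserters by what they grab from, so that each bucket can be handled by one specialised, branch-free GPU kernel instead of one kernel full of if-chains.

from collections import defaultdict

def bucket_inserters(inserters):
    # inserters: list of dicts like {"id": 7, "source": "belt", ...}
    buckets = defaultdict(list)
    for ins in inserters:
        buckets[ins["source"]].append(ins)   # group by what the inserter grabs from
    return buckets

def update_all(inserters, kernels):
    # kernels: {"belt": fn, "chest": fn, ...}; each fn updates one whole batch at once
    for source, batch in bucket_inserters(inserters).items():
        kernels[source](batch)

update_all(
    [{"id": 1, "source": "belt"}, {"id": 2, "source": "chest"}],
    {"belt": lambda batch: None, "chest": lambda batch: None},
)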
Cool suggestion: Eatable MOUSE-pointers.
Have you used the Advanced Search today?
Need help, question? FAQ - Wiki - Forum help
I still like small signatures...

mrvn
Smart Inserter
Posts: 5682
Joined: Mon Sep 05, 2016 9:10 am
Contact:

Re: Using the graphics card to help the CPU out…

Post by mrvn »

ssilk wrote:
Wed Sep 22, 2021 1:40 am
Not everything in my click-stream was interesting, but one article brought me to the idea:

https://www.digitalengineering247.com/a ... imulation/

“We found that almost all types of simulation can be accelerated by the GPU,”

Factorio is nothing but a simulation. Some parts - not all - could be offloaded to the GPU. That has nothing to do with the problems of threading. Let me explain why:

The current problem with parallelism in Factorio is that you cannot simply calculate all the movements on the belts while at the same moment you calculate the inserters and the power supply. Everything depends on the right order to keep determinism intact.

See also here: https://www.worldscientific.com/doi/10. ... 4116500404
“ 1) the states of 32/64 cells in 32/64-bit words (integers) and the next states are computed by the Bitwise Parallel Bulk Computation (BPBC) technique”
and
“ The experimental results show that, the performance of our GPU implementation using GeForce GTX TITAN X is 1350×10⁹ updates per second for 16K-step simulation of 512K × 512K cells stored in the SSD. Since Intel Core i7 CPU using the same technique performs 13.4×10⁹ updates per second, our GPU implementation for the Game of Life achieves a speedup factor of 100.”

Factor 10 for Factorio would be a big success!
But the Game of Life is the perfect simulation for a GPU. The state of a cell depends solely on a 3x3 region centered on the cell. There is absolutely no dependency on the order of computation. In fact you have to do all the computations and then update all cells in a second step (or actually have two states and switch between the two every tick). The good algorithms for the CPU already handle cells in parallel in bitfields, so the move to the GPU just makes the bitfields you work with that much larger.
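For illustration, a minimal double-buffered Life step in Python/numpy (not the bit-parallel BPBC technique from the paper, just the principle): the new grid is computed entirely from the old one and then swapped in one go.

import numpy as np

def life_step(grid):
    # Count the 8 neighbours of every cell by summing shifted copies of the old grid.
    neighbours = sum(
        np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1) for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    # Become alive with exactly 3 neighbours, or stay alive with 2; everything else dies.
    return ((neighbours == 3) | ((grid == 1) & (neighbours == 2))).astype(np.uint8)

grid = (np.random.default_rng(0).random((512, 512)) < 0.2).astype(np.uint8)
grid = life_step(grid)  # the old state is only read; the new state replaces it at once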
ssilk wrote:
Wed Sep 22, 2021 1:40 am
What I mean is to use the GPU as a faster CPU. E.g. the robots are quite simple units. The GPU can read all the positions and states of the bots from memory, do one tick/simulation step and store the result back in memory. Same with more complex stuff like belts or inserters. One step after the other, so that determinism is not affected.
I thought robots were made to be so simple that they don't need to do any work per tick. They only do work when damaged, when they reach their target or when they run out of energy (although I don't know how you compute the energy, since you are averse to checking whether the energy is sufficient at the start of the journey). Do you decrement the energy every tick?

I think there would be one thing that should be trivial to move to the GPU: Pollution. That's just something like below, right?

pollution_next = pollution_now * (1 - neighbours * X) + sum([neighbour[n].pollution_now * X for n in neighbours]) - absorption_in_chunk + output_from_buildings

where e.g. X = 0.01 determines how fast pollution spreads.

The absorption_in_chunk and output_from_buildings you calculate per chunk as you process the trees or buildings. And then the rest you would run as a big matrix on the GPU. Every chunk takes a bit from its neighbours, gives a bit to its neighbours and some is absorbed. No branches, no dependencies. And isn't that part now done for only a few chunks every tick, because updating all chunks every tick was too expensive?
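Something like this numpy sketch of the formula above (X, the 4-neighbour assumption and the wrap-around at the map edge are all just simplifications for illustration):

import numpy as np

def pollution_step(p, absorption, output, X=0.01):
    # Sum of the four orthogonal neighbours, all read from the previous tick's state.
    neighbour_sum = (np.roll(p, 1, 0) + np.roll(p, -1, 0) +
                     np.roll(p, 1, 1) + np.roll(p, -1, 1))
    p_next = p * (1 - 4 * X) + neighbour_sum * X - absorption + output
    return np.maximum(p_next, 0.0)   # pollution never goes negative

chunks = np.zeros((64, 64))
chunks[32, 32] = 1000.0              # one polluting chunk in the middle
chunks = pollution_step(chunks, absorption=0.0, output=0.0)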

Talking about trees: the effect of pollution on trees could probably be done by the GPU too.

User avatar
ssilk
Global Moderator
Posts: 12888
Joined: Tue Apr 16, 2013 10:35 pm
Contact:

Re: Using the graphics card to help the CPU out…

Post by ssilk »

mrvn wrote:
Wed Sep 22, 2021 3:00 am

I thought robots were made to be so simple that they don't need to do any work per tick. They only do work when damaged, when they reach their target or when they run out of energy
Yeah, but at least you need to calculate their current position every tick while they are visible on screen, so you can draw them.

So if a 🤖 can move in one dimension, the current position p depends on the start position plus its speed * the time since start.

P = p0 + v * (t - t0)

Same with energy:

E = e0 - EnergyPerTick * (t - t0)

Every time something changes for the bot, p0, e0 and t0 need to be recalculated. You also need to calculate t1, the time when it arrives at the target; tLowBat, the time when the energy falls below 20% (? not documented); and tDrain, when the energy is empty. The lowest of these is the time when a recalculation needs to be done.
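As a rough sketch of that bookkeeping (Python, one dimension; the 20% threshold and all the names are just my assumptions from above, not confirmed game values):

from dataclasses import dataclass

@dataclass
class Bot1D:
    p0: float      # position at the last recalculation
    v: float       # speed in tiles per tick (assumed > 0)
    e0: float      # energy at the last recalculation
    e_max: float   # a full charge
    drain: float   # energy used per tick
    t0: float      # tick of the last recalculation
    target: float  # target position

    def position(self, t):
        return self.p0 + self.v * (t - self.t0)

    def energy(self, t):
        return self.e0 - self.drain * (t - self.t0)

    def next_event_tick(self):
        t_arrive = self.t0 + (self.target - self.p0) / self.v
        t_low_bat = self.t0 + (self.e0 - 0.2 * self.e_max) / self.drain
        t_drain = self.t0 + self.e0 / self.drain
        return min(t_arrive, t_low_bat, t_drain)   # next tick a recalculation is due

bot = Bot1D(p0=0.0, v=0.05, e0=1.0, e_max=1.5, drain=0.001, t0=0.0, target=10.0)
print(bot.next_event_tick())   # here: arrival at tick 200, before either battery event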

In two dimensions the formulas get a bit longer - some square roots and squaring…

And you won’t do it every tick for all bots, because then the memory throughput can grow significantly; you want to reduce that. To calculate whether a bot is visible on screen you need to know at least which chunk it is on. So perhaps it’s a good idea to calculate only a path from chunk to chunk. Now you need to calculate not only t1 etc. but also tChunk: the time when a bot leaves one chunk.

And so on… you can do several optimizations here, it gets more and more complex, and the more complex it gets, the better it is to let the CPU do all of it.

What if we do it the hard way instead and just put the source position, target position and energy on the graphics card and let the GPU do the rest each tick?

My thoughts go in that direction: instead of optimizing, make things so simple that the GPU can do it with pure computing power.

In the end this becomes quite theoretical; we could write about this for ages, but it needs to be proven. And I’m just a web-application developer - my closest approach to GPU programming is CSS. ;)

But I’m just thinking out loud. Yes, pollution should be a perfect example for GPU calculation. But I’m also thinking of the fluid network calculations. They are only one-pass, and we know all the problems that causes. With a GPU it might be possible to do all the calculations in two passes faster than the CPU can do one.
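As a toy example of what I mean by two passes (pure Python, nothing like the real fluid system): pass one computes a flow for every connection from the previous tick’s levels only, pass two applies all the flows at once, so the result no longer depends on the update order.

def fluid_step(levels, connections, k=0.25):
    # levels: fluid amount per pipe segment; connections: (a, b) index pairs of joined segments
    flows = [k * (levels[a] - levels[b]) for a, b in connections]   # pass 1: read old state only
    new_levels = list(levels)
    for (a, b), f in zip(connections, flows):                       # pass 2: apply all deltas
        new_levels[a] -= f
        new_levels[b] += f
    return new_levels

print(fluid_step([100.0, 0.0, 0.0], [(0, 1), (1, 2)]))   # -> [75.0, 25.0, 0.0]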
Cool suggestion: Eatable MOUSE-pointers.
Have you used the Advanced Search today?
Need help, question? FAQ - Wiki - Forum help
I still like small signatures...

quinor
Filter Inserter
Posts: 404
Joined: Thu Mar 07, 2013 3:07 pm
Contact:

Re: Using the graphics card to help the CPU out…

Post by quinor »

A big issue with GPUs is that moving data to/from the GPU is expensive. It's bad enough that, for example, moving the data can take 10x more time than the simple computation you have to do on it. It only makes sense if you have a big enough computation to do, or if you don't have to move the data off the GPU at all.

Also, Factorio is a very complex simulation with hundreds or thousands of different "relatively small" computations going on, and GPUs like BIG computations. For example, doing X 100k times can be a very small job for a GPU - ideally it would be 10M times, at least.

mrvn
Smart Inserter
Posts: 5682
Joined: Mon Sep 05, 2016 9:10 am
Contact:

Re: Using the graphics card to help the CPU out…

Post by mrvn »

ssilk wrote:
Wed Sep 22, 2021 7:16 am
mrvn wrote:
Wed Sep 22, 2021 3:00 am

I thought robots were made to be so simple that they don't need to do any work per tick. They only do work when damaged, when they reach their target or when they run out of energy
Yeah, but at least you need to calculate their current position every tick while they are visible on screen, so you can draw them.

So if a 🤖 can move in one dimension, the current position p depends on the start position plus its speed * the time since start.

P = p0 + v * (t - t0)
With all the power the GPU has, can't you calculate that as part of the render pipeline on the fly, without ever storing it? Moving data to/from the GPU is expensive. So if a bot just has to move its p0, v and t0 values to the GPU, and the GPU handles the rendering from there, then you save a lot of CPU time.
ssilk wrote:
Wed Sep 22, 2021 7:16 am
Same with energy:

E = e0 - EnergyPerTick * (t - t0)

Every time something changes for the bot, p0, e0 and t0 need to be recalculated. You also need to calculate t1, the time when it arrives at the target; tLowBat, the time when the energy falls below 20% (? not documented); and tDrain, when the energy is empty. The lowest of these is the time when a recalculation needs to be done.
Unless the mouse hovers over the bot and its energy appears in the tooltip, why would you ever need to compute E each tick? You do need the t* times, but those are determined at the start of the journey.

It actually still baffles me that you let bots travel until they run out of energy before finding a charging pad. If and when that happens is known when they start the journey; where it happens is the trivial formula from above. So if the bot knows it can't reach its goal with its charge, it can calculate where that would happen, find the nearest charging port now (instead of finding it when it arrives at that point) and then go directly there without a detour. And if the nearest charging spot is where the bot already is, then you can show an "unreachable destination" alert.

All at no additional computation cost (just re-arranging when you compute stuff), but saving CPU time because bot performance would increase.
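A rough sketch of that idea (Python, straight-line flight in 2D, all names made up):

import math

def plan_leg(pos, target, energy, drain_per_tile, roboports):
    dist = math.dist(pos, target)
    range_left = energy / drain_per_tile          # tiles the bot can still fly
    if range_left >= dist:
        return target                             # charge is enough, fly directly
    # Point on the straight path where the battery would run dry.
    f = range_left / dist
    dry = (pos[0] + f * (target[0] - pos[0]), pos[1] + f * (target[1] - pos[1]))
    # Head for the roboport closest to that point now, instead of flying until empty.
    return min(roboports, key=lambda rp: math.dist(dry, rp), default=None)

print(plan_leg((0, 0), (100, 0), energy=30.0, drain_per_tile=1.0,
               roboports=[(20, 5), (60, -3)]))    # -> (20, 5)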
ssilk wrote:
Wed Sep 22, 2021 7:16 am
In two dimensions the formulas get a bit longer - some square roots and squaring…

And you won’t do it every tick for all bots, because then the memory throughput can grow significantly; you want to reduce that. To calculate whether a bot is visible on screen you need to know at least which chunk it is on. So perhaps it’s a good idea to calculate only a path from chunk to chunk. Now you need to calculate not only t1 etc. but also tChunk: the time when a bot leaves one chunk.

And so on… you can do several optimizations here, it gets more and more complex, and the more complex it gets, the better it is to let the CPU do all of it.

What if we do it the hard way instead and just put the source position, target position and energy on the graphics card and let the GPU do the rest each tick?

My thoughts go in that direction: instead of optimizing, make things so simple that the GPU can do it with pure computing power.

In the end this becomes quite theoretical; we could write about this for ages, but it needs to be proven. And I’m just a web-application developer - my closest approach to GPU programming is CSS. ;)

But I’m just thinking out loud. Yes, pollution should be a perfect example for GPU calculation. But I’m also thinking of the fluid network calculations. They are only one-pass, and we know all the problems that causes. With a GPU it might be possible to do all the calculations in two passes faster than the CPU can do one.
I wonder how many bots you have to have before it becomes necessary to sort them by chunk on the CPU instead of letting the GPU clip them on the fly every frame. Calculating P from above for every bot in parallel is really in the GPU's court; then compare against the visible map region and have it render what's inside. I could imagine the number being in the thousands.

It should also be easy to have the GPU calculate all the bots that will intersect the current map view whenever the view changes, or for new bots. Or at least calculate "tChunk" for a much larger region than a chunk - endgame bots cross chunk boundaries rather quickly. Think about what it would mean for a bot that moves back and forth between two tiles on different chunks. Splitting the bots into a spatial tree would be much better: you really don't care about bots crossing chunk boundaries a long way away from the currently visible region. So maybe a tree where regions increase in size with distance from the view, and you split or merge regions as the view changes.
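In numpy terms, the "compute every position in parallel, then clip to the view" part looks roughly like this (on the GPU it would be one small kernel, or simply the vertex stage; the array names are made up):

import numpy as np

def visible_bots(p0, v, t0, t, view_min, view_max):
    # p0, v: (N, 2) start positions and velocities; t0: (N,) ticks of the last update.
    pos = p0 + v * (t - t0)[:, None]              # all N positions in one shot
    inside = np.all((pos >= view_min) & (pos <= view_max), axis=1)
    return pos[inside]                            # only the bots worth drawing

rng = np.random.default_rng(1)
p0 = rng.uniform(-1000, 1000, (100_000, 2))
v = rng.uniform(-0.1, 0.1, (100_000, 2))
t0 = np.zeros(100_000)
print(len(visible_bots(p0, v, t0, t=600, view_min=(-30, -30), view_max=(30, 30))))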



If you want to see the power of the GPU, maybe look at https://www.kaggle.com/jlesuffleur/cuda ... -with-gpu . There are similar examples in C/C++ for Mandelbrot computations with similar speed improvements. I remember the initial Mandelbrot set taking 2+ hours on my C64; now it's done in 2 ms.

Being Python, it might also be a great platform to test out algorithms on the fly without having to write a lot of framework first. You could write a visualization of a pipe network in a screenful of code and then try out different algorithms for pipes in no time, with real-time rendering.
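In the same spirit, a minimal numba-CUDA sketch of a Mandelbrot kernel (my own toy version, not the code from that notebook; it needs an NVIDIA GPU and numba installed): every pixel is computed by its own GPU thread, completely independently.

import numpy as np
from numba import cuda

@cuda.jit
def mandelbrot_kernel(image, max_iter):
    x, y = cuda.grid(2)                      # absolute pixel coordinates of this thread
    h, w = image.shape
    if x < w and y < h:
        cr = -2.0 + 3.0 * x / w              # map the pixel into the complex plane
        ci = -1.5 + 3.0 * y / h
        zr = 0.0
        zi = 0.0
        count = 0
        while zr * zr + zi * zi <= 4.0 and count < max_iter:
            zr, zi = zr * zr - zi * zi + cr, 2.0 * zr * zi + ci
            count += 1
        image[y, x] = count

image = np.zeros((1024, 1024), dtype=np.int32)
threads = (16, 16)
blocks = ((image.shape[1] + 15) // 16, (image.shape[0] + 15) // 16)
mandelbrot_kernel[blocks, threads](image, 200)   # numba copies the array to and from the GPU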

Nidan
Fast Inserter
Posts: 225
Joined: Sat Nov 21, 2015 1:40 am
Contact:

Re: Using the graphics card to help the CPU out…

Post by Nidan »

Most things turn out to be more complicated than they seem at first glance; this includes bots.
mrvn wrote:
Wed Sep 22, 2021 7:27 pm
It actually still baffles me that you let bots travel until they run out of energy before finding a charging pad. If and when that happens is known when they start the journey; where it happens is the trivial formula from above. So if the bot knows it can't reach its goal with its charge, it can calculate where that would happen, find the nearest charging port now (instead of finding it when it arrives at that point) and then go directly there without a detour. And if the nearest charging spot is where the bot already is, then you can show an "unreachable destination" alert.
And predict where new roboports will be constructed, which roboports will be destroyed, run out of power or have too many bots queueing for charging? All of these may force the bot to pick another roboport.
All at no additional computation cost (just re-arranging when you compute stuff), but saving CPU time because bot performance would increase.
The CPU still needs to know the exact location of the bots every tick for any of the unpredictable interactions like biter attacks, fire/acid, player selection/destruction. Or at least a general idea (e.g. which chunk) to narrow down which bots currently qualify for these interactions.
If you want to see the power of the GPU, maybe look at https://www.kaggle.com/jlesuffleur/cuda ... -with-gpu . There are similar examples in C/C++ for Mandelbrot computations with similar speed improvements. I remember the initial Mandelbrot set taking 2+ hours on my C64; now it's done in 2 ms.

Being Python, it might also be a great platform to test out algorithms on the fly without having to write a lot of framework first. You could write a visualization of a pipe network in a screenful of code and then try out different algorithms for pipes in no time, with real-time rendering.
Mandelbrot is like the Game of Life: all cells/pixels are completely independent of each other. No surprise a GPU performs well.

mrvn
Smart Inserter
Posts: 5682
Joined: Mon Sep 05, 2016 9:10 am
Contact:

Re: Using the graphics card to help the CPU out…

Post by mrvn »

Nidan wrote:
Thu Sep 23, 2021 12:34 am
Most things turn out to be more complicated than they seem at first glance; this includes bots.
mrvn wrote:
Wed Sep 22, 2021 7:27 pm
It actually still baffles me that you let bots travel until they run out of energy before finding a charging pad. If and when that happens is known when they start the journey; where it happens is the trivial formula from above. So if the bot knows it can't reach its goal with its charge, it can calculate where that would happen, find the nearest charging port now (instead of finding it when it arrives at that point) and then go directly there without a detour. And if the nearest charging spot is where the bot already is, then you can show an "unreachable destination" alert.
And predict where new roboports will be constructed, which roboports will be destroyed, run out of power or have too many bots queueing for charging? All of these may force the bot to pick another roboport.
Sure, that's always possible. But is the bot's expectation really that it can fly towards its goal because you will be building a roboport right where it would run out of energy?

Anyway, none of that is anything new that would be introduced by bots aiming directly for the next recharge instead of waiting until they run out of energy. All that code is already there for when bots do run out of energy and go for a recharge.
Nidan wrote:
Thu Sep 23, 2021 12:34 am
All at no additional computation cost (just re-arranging when you compute stuff), but saving CPU time because bot performance would increase.
The CPU still needs to know the exact location of the bots every tick for any of the unpredictable interactions like biter attacks, fire/acid, player selection/destruction. Or at least a general idea (e.g. which chunk) to narrow down which bots currently qualify for these interactions.
If you want to see the power of the GPU, maybe look at https://www.kaggle.com/jlesuffleur/cuda ... -with-gpu . There are similar examples in C/C++ for Mandelbrot computations with similar speed improvements. I remember the initial Mandelbrot set taking 2+ hours on my C64; now it's done in 2 ms.

Being Python, it might also be a great platform to test out algorithms on the fly without having to write a lot of framework first. You could write a visualization of a pipe network in a screenful of code and then try out different algorithms for pipes in no time, with real-time rendering.
Mandelbrot is like the Game of Life: all cells/pixels are completely independent of each other. No surprise a GPU performs well.
Sure, but that would also be the case for pollution, or the goal for a parallel fluid simulation.

After all, what would be the point of shoving something onto the GPU that you don't expect to perform well there?

User avatar
ssilk
Global Moderator
Posts: 12888
Joined: Tue Apr 16, 2013 10:35 pm
Contact:

Re: Using the graphics card to help the CPU out…

Post by ssilk »

mrvn wrote:
Wed Sep 22, 2021 7:27 pm
It actually still baffles me that you let bots travel until they run out of energy before finding a charging pad. If and when that happens is known when they start the journey.
I wrote about that later in the same post:
You also need to calculate t1, the time when it arrives at the target; tLowBat, the time when the energy falls below 20% (? not documented); and tDrain, when the energy is empty. The lowest of these is the time when a recalculation needs to be done.


What I take away from the current discussion:
- it’s bad to move data between CPU and GPU memory
- it’s much better to have parts of the simulation run completely on the GPU
- as undisturbed as possible
- any change from “outside” causes problems

With this knowledge I think we get to the main problem with this idea: we need isolated sub-problems that can run completely on the GPU and are not disturbed too much from outside.
Hufff, that’s really not as simple as I thought. Inserters, for example, don’t fall into that category. Maybe fluid networks? But every tick something might be filled into or taken out of them.
Same with pollution: the pollution input and output can change every tick. Everything hangs together with everything else. :)
Cool suggestion: Eatable MOUSE-pointers.
Have you used the Advanced Search today?
Need help, question? FAQ - Wiki - Forum help
I still like small signatures...

posila
Factorio Staff
Posts: 5201
Joined: Thu Jun 11, 2015 1:35 pm
Contact:

Re: Using the graphics card to help the CPU out…

Post by posila »

quinor wrote:
Tue Sep 21, 2021 3:12 pm
GPU engineer here. I will omit all of the issues connected with implementing this stuff on GPUs, and the portability of such solutions, and skip directly to the other difficulties. ...
quinor wrote:
Wed Sep 22, 2021 11:16 am
A big issue with GPUs is that moving data to/from the GPU is expensive. ...
Thanks, I appreciate that you wrote these (so I didn't have to :D).
ssilk wrote:
Thu Sep 23, 2021 8:04 am
- any change from “outside” makes problems
That goes both ways ... if the CPU needs the result of a computation made by the GPU, that may create a bottleneck too. For example, when we query GPU timestamps to measure how long the GPU spent rendering something for debug timings, we read them back 4 frames later, in order not to create a "CPU-GPU sync point".
mrvn wrote:
Wed Sep 22, 2021 7:27 pm
If you want to see the power of the GPU, maybe look at https://www.kaggle.com/jlesuffleur/cuda ... -with-gpu . There are similar examples in C/C++ for Mandelbrot computations with similar speed improvements. I remember the initial Mandelbrot set taking 2+ hours on my C64; now it's done in 2 ms.
I hope nobody is surprised by the fact that "figuring out what the color of each pixel in an image should be, based on a simple computation" is exactly the kind of problem in which GPUs dominate over CPUs :D

Sorry to spoil it for you, but the only non-rendering/non-graphics feature Factorio currently has that would make sense to accelerate on the GPU is generating the map preview :P

PS: @ssilk, as far as I remember, you are a Mac user. And since compute shaders are not a thing in OpenGL on macOS (and never will be), you wouldn't be able to enjoy such acceleration anyway, unless we also add a Metal rendering backend first. So you should lobby for that first :D

quinor
Filter Inserter
Posts: 404
Joined: Thu Mar 07, 2013 3:07 pm
Contact:

Re: Using the graphics card to help the CPU out…

Post by quinor »

posila wrote:
Thu Sep 23, 2021 4:01 pm
Sorry to spoil it for you, but the only non-rendering/non-graphics feature Factorio currently has that would make sense to accelerate on the GPU is generating the map preview :P
I wish I could disagree, but it's most likely true - at least in the current game architecture.

Just so you know: the Factorio development team is the only gamedev team I genuinely trust when they say they have optimized their game engine. I write high-performance code for a living, and I don't think I could speed up the Factorio engine significantly without essentially rewriting all of it from scratch and redesigning it in a performance-friendly way.

@posila: if you guys ever need any help with high-performance code, I can donate any reasonable number of hours and consult for the dev team!

Zavian
Smart Inserter
Posts: 1641
Joined: Thu Mar 02, 2017 2:57 am
Contact:

Re: Using the graphics card to help the CPU out…

Post by Zavian »

ssilk wrote:
Wed Sep 22, 2021 7:16 am
mrvn wrote:
Wed Sep 22, 2021 3:00 am

I thought robots were made to be so simple that they don't need to do any work per tick. They only do work when damaged, when they reach their target or when they run out of energy
Yeah, but at least you need to calculate their current position every tick while they are visible on screen, so you can draw them.

So if a 🤖 can move in one dimension, the current position p depends on the start position plus its speed * the time since start.

P = p0 + v * (t - t0)

Same with energy:

E = e0 - EnergyPerTick * (t - t0)

Every time something changes for the bot, p0, e0 and t0 need to be recalculated. You also need to calculate t1, the time when it arrives at the target; tLowBat, the time when the energy falls below 20% (? not documented); and tDrain, when the energy is empty. The lowest of these is the time when a recalculation needs to be done.

In two dimensions the formulas get a bit longer - some square roots and squaring…
Don't forget that logistic and construction bots can be targeting a moving object (e.g. the player). The object they are targeting can also be mined or destroyed. That means their movement direction could change every tick.

User avatar
ptx0
Smart Inserter
Posts: 1507
Joined: Wed Jan 01, 2020 7:16 pm
Contact:

Re: Using the graphics card to help the CPU out…

Post by ptx0 »

glad someone else mentioned memory bandwidth so i don't have to ;)

whole lot of misunderstandings about hardware in this thread

User avatar
ptx0
Smart Inserter
Smart Inserter
Posts: 1507
Joined: Wed Jan 01, 2020 7:16 pm
Contact:

Re: Using the graphics card to help the CPU out…

Post by ptx0 »

quinor wrote:
Thu Sep 23, 2021 5:47 pm
I write high-performance code for a living, and I don't think I could speed up the Factorio engine significantly without essentially rewriting all of it from scratch and redesigning it in a performance-friendly way.

@posila: if you guys ever need any help with high-performance code, I can donate any reasonable number of hours and consult for the dev team!
for whatever reason, they no longer seem interested in providing source access to contributors.

and i'm interested in how you came to the first conclusion without having seen the source to begin with. you should elaborate on your redesign for performance, as i'm sure they haven't heard of all your ideas yet. keep in mind that a lot of the game's chosen behaviours are a part of why it hurts to do certain things. you would likely have to change or eliminate some features or subtle behaviours if you wanted to improve things much more.

quinor
Filter Inserter
Posts: 404
Joined: Thu Mar 07, 2013 3:07 pm
Contact:

Re: Using the graphics card to help the CPU out…

Post by quinor »

ptx0 wrote:
Fri Sep 24, 2021 4:39 am
and i'm interested in how you came to the first conclusion without having seen the source to begin with.
I've seen enough FFF posts to build a reasonable amount of trust in the guys. They have described enough very non-trivial optimizations that I can say they probably dug deep into all the hotspots and optimized, to a reasonable extent, everything there was to optimize. I've also talked to some of the guys on the forums/IRC/Discord over the years.

ptx0 wrote:
Fri Sep 24, 2021 4:39 am
you should elaborate on your redesign for performance, as i'm sure they haven't heard of all your ideas yet. keep in mind that a lot of the game's chosen behaviours are a part of why it hurts to do certain things. you would likely have to change or eliminate some features or subtle behaviours if you wanted to improve things much more.
Yeah, that is part of the issue. To really get a lot of performance out of Factorio, the engine would have to be designed from scratch in an extremely parallelizable way: kind of cellular-automaton style, where the state of each object depends only on the previous state of itself and a limited number of neighbours. That also means removing all of the annoying race conditions, like two inserters trying to pick up the same object / from the same container, and all other things that depend on entity update order. It is possible, but it would likely result in a game that "feels" different in a number of small ways - and it could be that some content would have to be modified or cut.
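To give a flavour of what I mean (a toy Python sketch, nothing to do with Factorio's real engine): every entity reads only the previous snapshot, writes only its own next state, and contested items are resolved by a fixed rule, so the result is identical regardless of the order - or the number of threads - the entities are processed with.

def tick(entities_prev):
    # entities_prev: {id: {"wants": item or None, "holding": item or None}}
    # First decide every contested item, using the old state alone (lowest id wins).
    claims = {}
    for eid, e in entities_prev.items():
        if e["wants"] is not None:
            claims.setdefault(e["wants"], []).append(eid)
    winners = {item: min(claimants) for item, claimants in claims.items()}

    # Then build the new snapshot; this loop could run in any order, or in parallel.
    entities_next = {}
    for eid, e in entities_prev.items():
        got = e["wants"] is not None and winners[e["wants"]] == eid
        entities_next[eid] = {
            "wants": None if got else e["wants"],
            "holding": e["wants"] if got else e["holding"],
        }
    return entities_next

print(tick({1: {"wants": "iron", "holding": None},
            2: {"wants": "iron", "holding": None}}))   # only entity 1 gets the item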

But I'm pretty sure they have thought of most of that already - as I said, I have a great amount of respect for the team and trust that they are very capable.

User avatar
ptx0
Smart Inserter
Smart Inserter
Posts: 1507
Joined: Wed Jan 01, 2020 7:16 pm
Contact:

Re: Using the graphics card to help the CPU out…

Post by ptx0 »

quinor wrote:
Fri Sep 24, 2021 9:16 pm
Yeah, that is part of the issue. To really get a lot of performance out of Factorio, the engine would have to be designed from scratch in an extremely parallelizable way: kind of cellular-automaton style, where the state of each object depends only on the previous state of itself and a limited number of neighbours. That also means removing all of the annoying race conditions, like two inserters trying to pick up the same object / from the same container, and all other things that depend on entity update order. It is possible, but it would likely result in a game that "feels" different in a number of small ways - and it could be that some content would have to be modified or cut.

But I'm pretty sure they have thought of most of that already - as I said, I have a great amount of respect for the team and trust that they are very capable.
it'd be interesting to have the game's entity update loop work on chunk groups that behave as a dependency graph.

chunk groups would be designated at building placement time, based upon a grouping of all the other entities this one must be able to interact with, and placed into a thread group.

but if your factory is all interconnected, that's not much help.
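
a quick python sketch of what i mean (my interpretation only, not an actual engine design): merge chunks whose entities can interact with a union-find, and every resulting group could be updated on its own thread. as said, one fully interconnected base collapses into a single group.

class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]   # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

# Each pair is two chunk coordinates that one entity touches
# (e.g. an inserter reaching across a chunk border at placement time).
interactions = [((0, 0), (0, 1)), ((0, 1), (1, 1)), ((5, 5), (5, 6))]
uf = UnionFind()
for a, b in interactions:
    uf.union(a, b)

groups = {}
for chunk in {c for pair in interactions for c in pair}:
    groups.setdefault(uf.find(chunk), []).append(chunk)
print(list(groups.values()))   # two independent groups -> could run on two threads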
