Add multithreading/parallelization options

ratchetfreak · Post by **ratchetfreak** » Sun Aug 09, 2015 2:53 pm

One of the things that people like to shout for is adding multithreading as if it's the silver bullet.

There are 2 main problems with it:

multithreading is hard™ and prone to errors that are hard to debug.
Determinism: multithreaded code is very non-deterministic which creates problems with desyncs. There are ways around it though.

So how would we give modders a way to use multithreading without creating a ton of desyncs in the process?

We need to let portions of code (tasks) get executed in arbitrary order without that order affecting the outcome of any task.

One way to let that happen is let the tasks only read data from the world. Useful for a parallel map and reduce. (like summing the items in all chests in a logistics system).

It would also be useful to update an entity that way. So my idea is to let each task write to/update a single entity and have the game engine group all tasks affecting that entity and execute them sequentially on a single thread in a deterministic order based on the mod that submitted the task and the order in which it was submitted (while letting tasks affecting other entities run on another thread). The state of the other entities that the task will read will be the old state; the updates will only be visible to subsequent tasks in the entity's group and not visible until after all tasks are done and the next tick begins.

This would result in the game engine flip-flopping between serial and parallel once every tick. In the parallel phase each entity queries the state around itself and updates its own state accordingly like a combinator reading the circuit network input (or the combinators that output to that network) and updating its own outputs. In the serial phase everything that needs to update multiple entities at once will run.

One of the additional things that those tasks could do is create more tasks for other entities. This would allow communication between entities where they both need to update.

Afforess · Post by **Afforess** » Fri Aug 21, 2015 2:21 pm

The way I've seen other mod API's deal with this, is allow mods to schedule async tasks, with a certain number of milliseconds/game-ticks before it executes, and the number of times the task should repeat. If async tasks were scheduled in advance, they would not take up time on the main thread, but also still be deterministic in multiplayer.

ratchetfreak · Post by **ratchetfreak** » Fri Aug 21, 2015 2:57 pm

Afforess wrote:The way I've seen other mod API's deal with this, is allow mods to schedule async tasks, with a certain number of milliseconds/game-ticks before it executes, and the number of times the task should repeat. If async tasks were scheduled in advance, they would not take up time on the main thread, but also still be deterministic in multiplayer.

The issue is not that tasks have to be scheduled equally; it's that tasks can't be allowed to modify the game state without special care.

One of the ways is collecting the results for use in the main thread. The other is independent entity updates.

BicycleEater · Post by **BicycleEater** » Thu Jul 29, 2021 4:48 pm

OK, I know I'm digging this up from the grave, after 6 years, in a Won't implement post, but still, here goes:
I would love if there was some (even basic) way of implementing multithreading in the lua code. Even if it is de-sync-city, it would allow for more aggressive mods, running more stuff in lua.
I have a mod which can run for about 10 minutes on a single tick, at 100% CPU, processing loads of independent changes across literally the entire map. I don't really care what it takes for me to implement multithreading, this is a perfect use case. I know this is an obscure use-case, but loads of mods (read: at least one) have to do labour intensive things in their lua scripts, but can separate these into very geographically diverse areas (e.g. doing upkeep on every machine). In this case multithreading would be a great help, and would offset the fact that mods can only be written in lua.
I know that writing mods in Factorio is absurdly easy, and will always be appreciative of how much the dev team cares, but there are no options for really high performance mods, as it all runs through lua (and I understand why, as allowing people to run random code through the mod interface is... problematic).
My request is this:
Can we have a
script.on_multithread() and script.start_multithread(data)
or something, giving data you define (only done during the scripting phase of the relevant mod). This could be relatively simple to do for the dev team, as it would be entirely up to modders to make it work (although I know lua is somewhat difficult to multithread, but things like LuaLanes/llthreads exist, so you could use those - they are both MIT licensed...). Then the game could allow mods to start multithreading, even if it breaks things. I know it would cause de-syncs everywhere, but you could add something to the error message for a de-sync to say "mod ... was using multithreading, so probably caused the de-sync, please contact the mod author", and then the problem is entirely the mod writer to solve, no extra difficulty for you. You don't need inter-thread communication, and could make it clear that the implementation is only there to 'make it possible', the new thread doesn't need to global table, only access to some game functions, which could be synchronised using locking.
Yes this would all make it easier to de-sync the game, but if a modder wants to crash the game using a mod, there are plenty of options anyway.

I hope my proposal minimises the amount the devs have to do, while still allowing multithreading, in some capacity (I really like when anything is possible, even if it is difficult).
I already know this probably won't be implemented, as it adds a huge amount of game complexity for very few use-cases, but would love to hear a response anyway.

DarkShadow44 · Post by **DarkShadow44** » Thu Jul 29, 2021 7:50 pm

BicycleEater wrote: Thu Jul 29, 2021 4:48 pm I have a mod which can run for about 10 minutes on a single tick, at 100% CPU, processing loads of independent changes across literally the entire map. I don't really care what it takes for me to implement multithreading, this is a perfect use case.

May I ask what you are doing that long?

I wouldn't want to wait 10 minutes for something to happen... IMHO, long tasks should span multiple ticks.

BicycleEater · Post by **BicycleEater** » Thu Jul 29, 2021 8:56 pm

That's the biggest nuke detonation for
https://mods.factorio.com/mod/True-Nukes
the megaton nuke has to load in loads of map, and damage loads of entities (like every entity there is), and change huge numbers of tiles. I also want it for the following ~30-40 mins of blast simulation (all at about 1-10 fps)... I really do need loads of performance.
I might have been exaggerating on the 10mins, but certainly 2-5.
To be fair, if I had multithreading, I'd just add an even bigger nuke, but still...

I also have to spend a while processing all the smaller, more practical nukes, as they need loads of computation...

Post by **boskid** » Thu Jul 29, 2021 9:52 pm

I was already thinking about something simillar that would enable modders to do some computation heavy tasks in parallel with rest of the code, however when i was thinking about it more there were many limitations that would make it useless, at least from my point of view.

Primary concern is to prevent desyncs. If doing wrong thing would be easy (aka desyncs) and correct thing would be hard it would be pretty much useless. So there would be multiple limitations imposed on such lua worker system: when a control lua would request a job it would have to specify a tick at which the result is to be expected. If the result would be early it would be buffered waiting. If the result would not be ready yet, the main loop would have to wait. Data would be sent to the worker only at the point of job creation and the data would be only received from the worker at the declared deadline tick.

Lua worker would be completly isolated from the outside world. It would have no access to the entities as unsynchronized accesses would cause desyncs (best case) or crashes due to transient inconsistent states (worst case). It would be only allowed to take data from job description and could send data only with the job result.

Maybe it would be possible to have some persistent storage on the lua worker side, something like a `global` but separate for the worker only. That would unfortunately mean that the work jobs would have to be executed in the exactly same order as requested so if there would be 2 jobs: first declared to end at tick X+100 and second declared to end at tick X+1, result of the second one would be internally delayed to the X+100'th tick.

There would be also save-game concerns. In order to get hands on the lua worker state at a consistent state, all the jobs would have to be already finished meaning that a save process would have to wait until all lua workers are done even if a game would be playable at a given point in time. In case a lua worker would not have any persistent storage, it would be possible to save only the job description and restart the job itself after loading but that would move a joining client really close to the job deadline causing huge freeze waiting for the job to be recomputed from scratch.

I think that this type of lua workers would be mostly useless. You would need to gather all required data in the control lua and send them to the worker lua, i have no idea what this worker would have to do with this and then it would send some result that would be provided at the declared tick and then control lua would have to touch all the relevant entities again. I think it could be useful only with certain "custom" mechanics, where you need to simulate some large scale process (lets say a custom fluid system) but at the end of tick you would only need to update only a few "outputs" (pipes that are directly next to consumers)

BicycleEater · Post by **BicycleEater** » Fri Jul 30, 2021 1:52 pm

I was mostly figuring that it would all happen at the same time as all the other lua activity - so instead of just waiting for the one lua thread to complete, you wait for all of them. This still allows ~8x more computation (on an 8 core machine) so long as the lua is the slowest part. This would allow a mod to divide the world up, and do things in different places at different times, or split off a thread to do something self-contained and slow.
This would make save-games no harder, as you can save the game at the point in a tick where all the threads have stopped.

This would make the majority of the game time spent single-threaded, so would only matter when a mod has to do a lot of computation very often... not as much of a performance boost as running it in parallel with the rest of the game, but close in the really slow case.

This would make the case I'm looking at much easier, as I could multithread the initial explosion (and maybe even the blast wave), as they are single ticks with >100ms of computation each. It wouldn't allow background computation, but that would (as you said) be very difficult.

The big difficulty would be allowing different threads to use things like game.surface.set_tiles, and have it not cause a conflict... this could be achieved by using a lock on all the game interactions (like a volatile bool in the interface, and you block until it is false, set to true on entry)... this would mean that if multiple threads were accessing things a lot at the same time, it wouldn't be as much of a performance gain, but this wouldn't be too bad, as interactions like this are relatively rare, and the c++ is much faster than the lua - if my mod becomes limited by c++ speed, I'm in a much better place.

You can also make it so that any tables given to the thread are copied instead of given by reference. This makes it a bad idea to pass a reference to global, but fine if you just want to give, e.g. position and size, to your thread.
I don't mind the idea that making multithreading work is hard for the modder, it always will be, I also don't mind if the multithreading is only a time-save in really slow bits of the mod.

(Edit, some time later, after re-reading your answer)
I was planning on allowing multiple threads to 'touch' entities, etc. so that they could be more useful, but this is more possible, as they would all be running during a 'lua stage' where nothing else is happening. Then use locking to prevent conflicts on game-native stuff (like entities)... this would mean you could have two threads running, both having got a table of entities they want to edit, then commit those changes alternately, so one changes an entity, while the other scans the next entity to change, then changes the entity they have found when the first one is looking for the next entity (blocking to wait if necessary). This would work if they were looking in different areas of the map, as they couldn't affect each other (if written carefully), which would make them not de-synch... provided that enough space is left between them so that they can't change the same entity. Obviously this would be vulnerable to 'sufficiently large' modded entities, but still, it would work in most cases.
This would also mean if a mod needed to do tile changes as it did entity changes, it could look for the tiles and build a table of changes with one thread, while changing entities with another, provided they cannot effect each other's operation, no de-syncs can occur.

BicycleEater · Post by **BicycleEater** » Wed Aug 11, 2021 12:10 pm

Sorry to bump this, just wanted to make sure you really aren't going to implement it, in some form or another... I think the way I described it would be doable, but I don't know the internals of the game... (I would love this to be implemented, as I have said, my mod would really benefit from anything like this).

chaoscode · Post by **chaoscode** » Tue Jan 16, 2024 6:29 pm

I am sorry to bring this up again.

deterministic multithreading is a pain but fortunatelly I enjoy the suffering of parallel computing

.

I think we have to differentiate 2 Problems:
self containing threads and world interacting threads.

self containing threads (easy):
As boskid mentioned there would be problems in how to sync the threads in a way that they will always reproduce the same output so no desync happens.
Or he has an idea how to do it but fears the lag during saving/loading a savefile.

Instead of starting a background job with a state and wait till it is done to get an output i suggest multiple fixed short runs of the same job with interaction possibilities.
The background job should be started & created, the only thing it has is a entry point and a message queue.
It's only possibilitie to interact with other Jobs are messages (Objects) which will be sent and received at the end of every run.
During a run there is no possibilitie to interact with the outside.
Runs could be limited to 10 ticks? or 30?
So the Backgroundjob would be forced to actively yield every once in a while and each yield would give the game a possibilitie to acquire a deterministic state. The Backround job would keep it's own private state inbetween the runs. After a Backgroundjob is done it can be deleted to clean up.

For Interjob communication it would be important that the messages have a determinisctic order. The solution would be that Background jobs can only be created by the main thread, and every job has an increasing ID. Before a Message exchange can happen, all Backgroundjobs finishing in this tick need to be done. After this has happened the messages can be exchanged. The recipent will receive only messages of BG-Jobs which have yielded before this tick and it would receive them sorted by tick and senderID.

If the background jobs need access to the world, the mainthread has to send the data in one tick and apply it after the background jobs sent messages back after it's run.
It would be an idea to give 1 Tick Backgroundjobs a read only access to the world. But while the job(s) run no other modifying background tasks could run.
So there would be a 1 Tick delay minimum before a BG-Job can return it's data and a PingPong between 2 Jobs would be minimum 2 ticks.

This would help calculation intensive Problems (e.g. Computer mods), but not interactive intensive problems. So no bigger nukes with this one.

e.g.:

Code: Select all

Mainthread once:

	JobID1= createBGJob("job.lua");
	JobID2= createBGJob("job.lua");

	SendMsg(JobID1, JobID2 ); // send other JobID so they can communicate
	SendMsg(JobID2, JobID1 ); // Msg has been sent before run so msg buffer is already filled on first execution.

Mainthread every loop:
	enqueueRun(JobID1, 3);
	enqueueRun(JobID2, 5);

Code: Select all

job.lua:

entry()
{	
	partner = receive();
	SendMsg(partner , "Start " + partner);
	int counter=0;
	while(true)
	{
		counter++;
		while(msg = receive()) // while having messages
			printToConsole(msg); //as Messages, get sorted by JobID
			SendMsg(partner, "Hello from " + Me.ID + " to " + partner + " in loop no " + counter);
		
		yield(); //have a consistent state, return control to factorio, messages get sent and received now
	}
}

expected output:
cycle 1:
"Start 1"
"Start 2"
Cycle 3:
"Hello from 2 to 1 in loop 1"
cycle 5:
"Hello from 1 to 2 in loop 1"
...

world interacting threads:

How to do this? this is a hard one and needs detailed knowledge about the game internals.
I can only takes guesses, but a suggestion would be to go away from the classical thread model and apply a shader model.
(I am thinking of OpenCL right now)

Instead of several Background jobs which will interact with the world it would be better to give factorio a list of
* independent units (e.g. chunks, or a List of (x,y) which chunks factoio should apply this function at)
* a function
* and an object which will be available to the function as ReadOnly.
* also to each function it's individual x_coord, y_coord

This function has a limited set of functions how it can interact with the world. The functions effects need to be contained to the unit of seperation.
If you go by chunks: the function is only allowed to access whats in the chunk. Change the Tiles from Gras to nuked? No Problem. Change a Powernetwork? Thats a No No. Remove 5 points from health? Ok! create/delete entities? Maybe?

Those subthreads even could send messages, (the recipent needs to receive them ordered of course), but not receive them.

A drawback would be that this would not be a true background job, to keep the game synced the Mainthread would need to be stoped.
But at least it would be parallel.

Factorio Forums

Add multithreading/parallelization options

Add multithreading/parallelization options

Re: Add multithreading/parallelization options

Re: Add multithreading/parallelization options

Re: Add multithreading/parallelization options

Re: Add multithreading/parallelization options

Re: Add multithreading/parallelization options

Re: Add multithreading/parallelization options

Re: Add multithreading/parallelization options

Re: Add multithreading/parallelization options

Re: Add multithreading/parallelization options