Train pathfinder in-depth

aaargha · Post by **aaargha** » Thu Sep 14, 2017 6:38 pm

Warning!
This post contains graph theory, algorithms, and mentions of/references to code.
Viewer discretion is advised.

...
Ahem
...

This post is based on analysis of the source code of what I am assuming is the 0.15.34 version of the train pathfinder. This code was shared by Rseding91 in response to my question on his Q&A.

TL;DR
Programmer TL;DR: It's Dijkstra's algorithm with segments as nodes and a complicated cost function.
Non programmer TL;DR: It's witchcraft, we can only hope that Wube won't use its dark magic for anything sinister

The too long part
The pathfinder tries to find as cheap a path as possible to its destination. To do that is uses a well known path-finding algorithm known as Dijkstra's algorithm. As the algorithm has been around for over 60 years I won't get in to the details of how it works, there are plenty of resources available online that explains it better than I can anyway. Instead I will be focusing on describing the cost function as that is where most of the interesting things happen, at least with respect to how the trains behave.

The current implementation uses segments as nodes.

When referring to the code I will use @XXc for lines in TrainPathFinder.cpp, and @XXh for lines in TrainPathFinder.hpp. Example: The definition of the findPath function starts @35c.

Glossary
Some concepts that are needed for this to make sense:
Segments, F4 -> show_rail_segments
A segment is a contiguous stretch of rails that is bounded by either a signal, a station or a rail split/merge. If a train in automatic has entered a segment in one end it should always be able to reach the other. It's basically the simplest construct that is not just one piece of track

Blocks, F4 -> show_rail_blocks
See here, part 1 from here, or here for explanations of blocks.
Tile
The cost of a path is measured in tiles. The length of a tile is the same a one piece of belt, an inserter, or a small power pole. Straight rail pieces are 2 tiles long, curved are probably somewhere around 8.

findPath() overview
Here is a short summary of what some different parts of the code does:
@35c - Start of findPath()
@38c - Special case: End is in same segment as train
@53c - Construct start node for travelling forward if possible
@72c - Construct start node for travelling backward if possible
@91c - Start of main search loop
@94c - Mark node as visited
@97c - Check if we're at the goal
@113c - Start of loop for assigning cost to neighbours
@115c, @117c - Check if neighbour is reachable (one way track)
@120c - @141c - Handle seen neighbours
@143c - @193c - Calculate tentative cost of neighbour
@195c - Handle updating tentative cost of unvisited neighbour
@206c - Add neighbour to list of reachable unvisited nodes
@215c - End of search loop
@218c - Signal that no path was found

Calculating the cost of an edge
The penalties to cost calculation can be divided into two groups. First are the penalties that are always checked for every segment @143c - @156c:

The base cost of a path is its length. This can be seen @143c where the length of the current segment is added to the cost of the next segment.
Any rail signal that is reserved by circuits adds a 1000 tiles cost to the edge, @148c
If the block the segment is in is occupied/reserved(?) an additional cost of (2 * 'length of neighbour' / (1 + 'blocks crossed by path')) tiles, @152c. This is probably so that far off occupied blocks should not matter as much. An interesting side effect is that a path that is strictly longer can be cheaper if it has more blocks than the shorter path.
If the segment contains a train stop add an additional cost of 2000 tiles, @155c

Second are the penalties that are only applied once per block, @159c - @193c. These all relate to occupied blocks and apply different penalties depending on which state the train is in, does not apply for the train that is searching (@163c).

If the train is waiting at a station (@166c) we add a cost of 100 tiles + ~~an additional 1000 tiles if the train does not use a timer based wait condition (@168c)~~ a penalty that is over 200k if the train does not use a timer based wait condition (I can't pinpoint this in the source but it is intended)+ the penalty used for trains braking for a train stop (@173c).
If the train is braking for approaching a train stop we add a cost of (100 + 'ticks it should wait at the station' / 10) tiles. I assume 'ticks it should wait at the station' is 0 for trains that do not use a timer based schedule.
If the train is in manual mode and not moving (@177c) a penalty of 2000 tiles is added. If the train is also empty an additional cost of 5000 tiles is added (@180c).
If the train is approaching a red signal a cost of 200 tiles is added (@185c)
If the train is waiting at a signal a cost of (100 + 'ticks it has waited' / 10) tiles (@188c)

What is important to note about this second group is that they can be applied multiple times for each train, once for each block the train occupies that we attempt to also path through.

Closing words
A big thank you to Rseding91 for sharing the code, it's been really interesting to see some of the actual working parts of this great game!

Hopefully this can help demystify the train pathfinder a bit, at least when someone shares it in a more readable format

If I've made a mistake some where, or missed something, please let me know and I'll try to fix it ASAP.

Don't hesitate to ask if anything is unclear, I'll do my best to explain.

FrodoOf9Fingers · Post by **FrodoOf9Fingers** » Fri Sep 15, 2017 1:02 am

Good find to know! I guess the A* theory might be coming from the A* comment in the header file >.>

I think the biggest general takeaway is that each rail segment is a node (probably shoulda known that already), and if you want to keep ups high, don't spam segments.

I wonder if there are any easy ways to give hints to the algorithm. I could see intentionally making a path very expensive when two lines run in parallel, the algorithm searches one path much farther faster. But that might make the other lane completely useless too.

Hrm....

FrodoOf9Fingers · Post by **FrodoOf9Fingers** » Fri Sep 15, 2017 1:34 am

Oh, and here is an example of what the units are:

"Train station adds 2000 tiles penalty when path finding, so trains try to avoid stations not related to their path." 15.0 patch notes

aaargha · Post by **aaargha** » Fri Sep 15, 2017 5:54 pm

FrodoOf9Fingers wrote:Good find to know! I guess the A* theory might be coming from the A* comment in the header file >.>

I think the biggest general takeaway is that each rail segment is a node (probably shoulda known that already), and if you want to keep ups high, don't spam segments.

I wonder if there are any easy ways to give hints to the algorithm. I could see intentionally making a path very expensive when two lines run in parallel, the algorithm searches one path much farther faster. But that might make the other lane completely useless too.

Hrm....

You should be able to use train stops and circuit controlled rail signals to control which path trains take. Train stops are probably easiest to use as trains won't get stuck on them if they're set up incorrectly. Circuit controlled signals are more dynamic but you need some kind of mechanism to detect if a train is still trying to use that path to prevent them from deadlocking.

FrodoOf9Fingers wrote:Oh, and here is an example of what the units are:

"Train station adds 2000 tiles penalty when path finding, so trains try to avoid stations not related to their path." 15.0 patch notes

That seems correct, the experiment I had initially set up to verify that utilized a train waiting at a station but that gave nothing useful. It seems that the penalty for waiting trains may be bugged in some cases, a station with a penalty of over 200k (due to circuit closed signals) was still seen as cheaper than the one nearby with a waiting train.

Anyway, trying it with just a circuit closed signal or a train stop verifies that the cost is measured in tiles, a straight rail piece bing two tiles long. I guess the next question is how long a curvy rail piece is.

Thanks for the info, I'll update the OP.

TheTom · Post by **TheTom** » Mon Sep 18, 2017 11:50 am

What about reducing the impact of a train IF IT IS MOVING. I mean, a train moving at full speed will not be there once the pathfinding train arrives.

On top, I would ignore the impact of trains on the route after a certain distance. This is what my navi does - yes, traffic situation is important, but a traffic jam 400km down the trip will not be there once you arrive. Likely.

This really needs adjustment. A train bypassing another moving train makes no sense.

FrodoOf9Fingers · Post by **FrodoOf9Fingers** » Mon Sep 18, 2017 2:04 pm

There is a tradeoff between smarts and efficiency. Large train worlds don't benefit much from having smarter Pathfinding I'd thier ups crashes.

Additionally, distance is factored in for trains, that train 400km away would have very, very little weight. Finally, there is a differnce between moving trains and stopped trains.

Factorio Forums

Train pathfinder in-depth

Train pathfinder in-depth

Re: Train pathfinder in-depth

Re: Train pathfinder in-depth

Re: Train pathfinder in-depth

Re: Train pathfinder in-depth

Re: Train pathfinder in-depth