This post contains graph theory, algorithms, and mentions of/references to code.
Viewer discretion is advised.
...
Ahem
...
This post is based on analysis of the source code of what I am assuming is the 0.15.34 version of the train pathfinder. This code was shared by Rseding91 in response to my question on his Q&A.
TL;DR
Programmer TL;DR: It's Dijkstra's algorithm with segments as nodes and a complicated cost function.
Non programmer TL;DR: It's witchcraft, we can only hope that Wube won't use its dark magic for anything sinister
The too long part
The pathfinder tries to find as cheap a path as possible to its destination. To do that is uses a well known path-finding algorithm known as Dijkstra's algorithm. As the algorithm has been around for over 60 years I won't get in to the details of how it works, there are plenty of resources available online that explains it better than I can anyway. Instead I will be focusing on describing the cost function as that is where most of the interesting things happen, at least with respect to how the trains behave.
The current implementation uses segments as nodes.
When referring to the code I will use @XXc for lines in TrainPathFinder.cpp, and @XXh for lines in TrainPathFinder.hpp. Example: The definition of the findPath function starts @35c.
Glossary
Some concepts that are needed for this to make sense:
Segments, F4 -> show_rail_segments
A segment is a contiguous stretch of rails that is bounded by either a signal, a station or a rail split/merge. If a train in automatic has entered a segment in one end it should always be able to reach the other. It's basically the simplest construct that is not just one piece of track
Blocks, F4 -> show_rail_blocks
See here, part 1 from here, or here for explanations of blocks.
Tile
The cost of a path is measured in tiles. The length of a tile is the same a one piece of belt, an inserter, or a small power pole. Straight rail pieces are 2 tiles long, curved are probably somewhere around 8.
findPath() overview
Here is a short summary of what some different parts of the code does:
@35c - Start of findPath()
@38c - Special case: End is in same segment as train
@53c - Construct start node for travelling forward if possible
@72c - Construct start node for travelling backward if possible
@91c - Start of main search loop
@94c - Mark node as visited
@97c - Check if we're at the goal
@113c - Start of loop for assigning cost to neighbours
@115c, @117c - Check if neighbour is reachable (one way track)
@120c - @141c - Handle seen neighbours
@143c - @193c - Calculate tentative cost of neighbour
@195c - Handle updating tentative cost of unvisited neighbour
@206c - Add neighbour to list of reachable unvisited nodes
@215c - End of search loop
@218c - Signal that no path was found
Calculating the cost of an edge
The penalties to cost calculation can be divided into two groups. First are the penalties that are always checked for every segment @143c - @156c:
- The base cost of a path is its length. This can be seen @143c where the length of the current segment is added to the cost of the next segment.
- Any rail signal that is reserved by circuits adds a 1000 tiles cost to the edge, @148c
- If the block the segment is in is occupied/reserved(?) an additional cost of (2 * 'length of neighbour' / (1 + 'blocks crossed by path')) tiles, @152c. This is probably so that far off occupied blocks should not matter as much. An interesting side effect is that a path that is strictly longer can be cheaper if it has more blocks than the shorter path.
- If the segment contains a train stop add an additional cost of 2000 tiles, @155c
- If the train is waiting at a station (@166c) we add a cost of 100 tiles +
an additional 1000 tiles if the train does not use a timer based wait condition (@168c)a penalty that is over 200k if the train does not use a timer based wait condition (I can't pinpoint this in the source but it is intended)+ the penalty used for trains braking for a train stop (@173c). - If the train is braking for approaching a train stop we add a cost of (100 + 'ticks it should wait at the station' / 10) tiles. I assume 'ticks it should wait at the station' is 0 for trains that do not use a timer based schedule.
- If the train is in manual mode and not moving (@177c) a penalty of 2000 tiles is added. If the train is also empty an additional cost of 5000 tiles is added (@180c).
- If the train is approaching a red signal a cost of 200 tiles is added (@185c)
- If the train is waiting at a signal a cost of (100 + 'ticks it has waited' / 10) tiles (@188c)
Closing words
A big thank you to Rseding91 for sharing the code, it's been really interesting to see some of the actual working parts of this great game!
Hopefully this can help demystify the train pathfinder a bit, at least when someone shares it in a more readable format
If I've made a mistake some where, or missed something, please let me know and I'll try to fix it ASAP.
Don't hesitate to ask if anything is unclear, I'll do my best to explain.