Until here you are correct.
No they are not. Assuming we are using a sane heuristic(in this case that would be euclidean distance to goal) the paths that deviate the least from the direct line between the start and the goal are explored first.
Until here you are correct.
No they are not. Assuming we are using a sane heuristic(in this case that would be euclidean distance to goal) the paths that deviate the least from the direct line between the start and the goal are explored first.
OHGODS. It's a djikstra. The only thing djikstra has over A* is that it is easier to implement. No wonder big railnetworks with lots of intersections are slow.
If you say so. The only time I've ever seen rail path finding being "slow" is on rail networks like this: https://www.factorio.com/blog/post/fff-296 and even then it's not actually slow relative to the rest of the simulation (still runs at 60 UPS) - it just takes enough time that I can measure it. All of the time ends up being memory latency at that point.
Which is why I mentioned that djikstra is easier to implement. I guess there is no point in spending development time making something 10 times faster if it's taking less than 10% of the simulation time. But the point of "all the time being memory latency" A* would be faster than djikstra even more as it needs to access a fraction of the nodes that djikstra does. If for some reason the train pathfinding needs optimizing in the future the first thing I would do is implement A*. Even tough it is more complicated than djikstra it is not a hard to implement algorithm. And its performance is wastly superior to djikstra.Rseding91 wrote: βFri Oct 11, 2019 6:52 pmIf you say so. The only time I've ever seen rail path finding being "slow" is on rail networks like this: https://www.factorio.com/blog/post/fff-296 and even then it's not actually slow relative to the rest of the simulation (still runs at 60 UPS) - it just takes enough time that I can measure it. All of the time ends up being memory latency at that point.
Well, do you have a reliable source for it being djikstra? I cannot find an FFF that says whether it is A* or djikstra. The only thing I can find is viewtopic.php?p=293877#p293877 which says A*. If I am to believe the comments in the original pathfinding commit from 0.5.0, it's also A*, but that was 6 years ago.PunPun wrote: βSat Oct 12, 2019 1:25 pm Also someone should correct the wiki https://wiki.factorio.com/Railway/Train_path_finding. It claims A* is used when it is actually djikstra. And those two are infact different algorithms.
Well the sourcecode that is in the See also part of the page is dijkstra.Bilka wrote: βSat Oct 12, 2019 1:33 pmWell, do you have a reliable source for it being djikstra? I cannot find an FFF that says whether it is A* or djikstra. The only thing I can find is viewtopic.php?p=293877#p293877 which says A*. If I am to believe the comments in the original pathfinding commit from 0.5.0, it's also A*, but that was 6 years ago.PunPun wrote: βSat Oct 12, 2019 1:25 pm Also someone should correct the wiki https://wiki.factorio.com/Railway/Train_path_finding. It claims A* is used when it is actually djikstra. And those two are infact different algorithms.
Edit: just realized it is spelled dijkstra not djikstra.
He also thought that he was correct with the A*, aka not a reliable source. I guess I shall trust aaargha's judgement of the source code, afaik they know trains pretty well.
In that case I shall try to catch you on discord once I have cought some sleep. Optimizing pathfinding depends a lot on what data is already there from other gameplay and how it is structured so trying to figure this out in a forum correspondance would be annoying and slow.Rseding91 wrote: βSat Oct 12, 2019 6:15 pmIf I did say it was using A* in the past and it's not then I was wrong. Regardless of what algorithm it's using I have implemented a few optimizations on top of the code to improve performance so if someone thinks they might have ideas about how to make it even faster I'd love to hear them.
Sure. But in a grid the distance is more a manhatten metric. So going NWNWNWNW is longer than the euclidean distance predicted and then paths more distance from the directed line are predicted to be shorter. The longer the path the wider the area A* explores becomes.
This.mrvn wrote: βMon Oct 14, 2019 9:47 amSure. But in a grid the distance is more a manhatten metric. So going NWNWNWNW is longer than the euclidean distance predicted and then paths more distance from the directed line are predicted to be shorter. The longer the path the wider the area A* explores becomes.
Apart from all that A* has to work around other trains as well. There are many paths in a grid and it will try to avoid blocked paths. Even take longer paths if they have less obstacles. No matter how good A* is in trying the best paths first the fact that there are so many possibilities costs time.
Looks like this and is also what factorios current pathfinding does because it is not using A* but dijkstra -
I understand. Pathfinding and especially A* is a subject that is not obvious without visual examples. Also if you google A* stuff the internet is full of misinformation posted by people who do not understand how A* actually works. So it is really common for people who have not spent 100+h coding A* implementations to have the wrong idea about it. Also the biggest misconception about A* is that it does not support multiple goals. When all you have to do is search the path backwards. You can have multiple start nodes and it works extremely well. In a grid it most of the time only expands from the closest start node only.
Indeed and it gets even worse when the start is not near an edge of the grid. As dijkstra expands away from the goal at the same rate as it expands towards the goal.
There will always be worst cases.PunPun wrote: βMon Oct 14, 2019 4:27 pm There is however a situation where A* will be much much worse and you'd want to do a special case to detect it before starting A*. If a train is pointing towards a dead end. Because we want to support N stations in a single search we are searching from stations to the train. And if the train is unreachable because it is pointing towards a dead end A* will have to search all the nodes in the entire network. Ofcourse this is just as bad as having a train pointing towards the network trying to path to an unreachable station in the current implementation.
You can also do A* towards multiple goals. The problem there is the heuristic function. You have to estimate the distance to the goal, which comes down to distance to the closest station. If that station turns out not to be the goal the heuristic will greatly underestimate the cost to reach the actual best station. Which makes A* perform less than optimal. As it expands it will constantly turn back to reach the closest station until it gets far enough from one station so that another is the closest. Then it will race towards that station and spread around it again. For that scenario though you would have to have tracks signaled bidirectional with stations on the wrong side (for the train) and no easy way to turn the train around.PunPun wrote: βMon Oct 14, 2019 4:27 pm Maybe a bidirectional algorithm where we expand from the train with dijkstra and expand from the stations with A* would work the best. If either exhausts all nodes it can terminate early as there is no path. At worst case this approach would be just as slow as plain dijkstra and at best it would be less than twice the nodes as the best case A* when a path exists.
Edit: I tought up a situation where the bidirectional algorithm would net twice the nodes. Situation being two big networks train in one and station in the other. But if we think about realistic gameplay situations the bidirectional dijkstra/A* hybrid or just regular A* would be the fastest on avarage.
For double headed trains you search with A* starting at the destinations. Then you expand the paths from each station backwards till one of them hits the segment the train is in. Why would you need to search twice?
Because the ends are not neccessarily in the same segment. If you have a 100+ car long train the ends can be really far from eachother.
I think you misunderstand how a bidirectional search works. When one search reaches a node that the other search has already seen the algorithm terminates and combines the two searches to make the path. It is a regularly used optimization for many cases. It would never search the network more than once.
No you can't just use the distance to the closest goal. If you do that you break the algorithm and are no longer guaranteed to get the best path to a goal.mrvn wrote: βThu Oct 17, 2019 10:36 am You can also do A* towards multiple goals. The problem there is the heuristic function. You have to estimate the distance to the goal, which comes down to distance to the closest station. If that station turns out not to be the goal the heuristic will greatly underestimate the cost to reach the actual best station. Which makes A* perform less than optimal.
It does not take just a bit more memory. It also takes N+1 times more time. We are trying to do less work not more. And a bidirectional search is not the same as two searches in parallel.
Right. The distance heuristic would be from the nose/tail of the train and differ quite a bit for a 100+ car train.
No, I understand that part. But with there being no path the two searches would never hit the same segment going in the same direction. They can reach every single segment and always be in opposing driving directions.PunPun wrote: βSat Oct 19, 2019 6:21 pmI think you misunderstand how a bidirectional search works. When one search reaches a node that the other search has already seen the algorithm terminates and combines the two searches to make the path. It is a regularly used optimization for many cases. It would never search the network more than once.
For A* to work the heuristic must never report a distance larger than what it actually is. As long as that remains true A* will find the shortest path. The better the heuristic the faster it finds it usually. A heuristic of H() = min([<distance to each goal>]) fits the requirement.PunPun wrote: βSat Oct 19, 2019 6:21 pmNo you can't just use the distance to the closest goal. If you do that you break the algorithm and are no longer guaranteed to get the best path to a goal.mrvn wrote: βThu Oct 17, 2019 10:36 am You can also do A* towards multiple goals. The problem there is the heuristic function. You have to estimate the distance to the goal, which comes down to distance to the closest station. If that station turns out not to be the goal the heuristic will greatly underestimate the cost to reach the actual best station. Which makes A* perform less than optimal.
Is there a way to record all the train path finding the game does for say 10 minutes in a megabase? Might be helpful to run the same set of path finding through the various improvements suggested and see how many nodes in the graph each one touches.
I was too tired. You are correct it does find the shortest path. But if you have two lists of points and you need to connect any one point from one list to any one point on the other list with the shortest path. You explore less nodes if you choose the bigger list as the startnodes and the smaller list as goals.mrvn wrote: βMon Oct 21, 2019 9:51 amFor A* to work the heuristic must never report a distance larger than what it actually is. As long as that remains true A* will find the shortest path. The better the heuristic the faster it finds it usually. A heuristic of H() = min([<distance to each goal>]) fits the requirement.