Performance of LuaTransportLine

Posted: Fri Sep 23, 2016 4:59 pm
by doktorstick
Howdy. First some context for this post. I've a mod called Hacked Splitters that changes the behavior of splitters from one that maintains line affinity to one that balances outputs regardless of inputs. To do this, I have to manage eight lines (four input, four output) per splitter. The APIs I use are LuaEntity.get_transport_line(), LuaTransportLine.get_contents(), LuaTransportLine.insert_at(), and LuaTransportLine.remove_item(). I achieve this by removing items from the input lines and placing them on the output lines before the default splitter logic can take over.

It's my hope that the performance of these APIs can be improved or, better yet, that a version of this splitter gets included natively :D.

I've observed that with around 1400 fully active hacked splitters (11200 transport lines), the script time approaches 12ms on my system (Win10, i7-5820K @ 3.30GHz). To debug this, the first thing I did was stub the Factorio APIs with fake results to roughly measure my own code. With the stubs in place, my script time dropped to 3.5ms. I then turned to profiling the Factorio APIs. At the bottom of this post is my raw data and the code I used in the on_tick handler to generate it. The log API was used for the timestamps, and hopefully the high loop count pushes the formatting time into the noise. (Profiler timestamp function please!)
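For anyone who wants to repeat the measurement, the approach described above can be sketched roughly like this. It is not the code from the collapsed sample; `profile`, `calls_per_ms` and the `log_fn` parameter are hypothetical names, and the sketch assumes the two timestamps are read off the log lines by hand, in seconds.

```lua
-- Hypothetical helper: bracket `count` calls of `fn` with two log lines.
-- `log_fn` defaults to Factorio's global log(); pass your own outside the game.
local function profile(label, count, fn, log_fn)
  log_fn = log_fn or log
  log_fn(label .. " start")
  for _ = 1, count do
    fn()
  end
  log_fn(label .. " end (" .. count .. " calls)")
end

-- Turn the two timestamps (assumed to be in seconds) into calls per millisecond.
local function calls_per_ms(count, start_s, end_s)
  return count / ((end_s - start_s) * 1000)
end
```

A high `count` matters here because the two log calls are themselves part of what gets measured.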

LuaEntity.get_transport_line(): 460 calls/ms (/8 = 57.5 splitters/ms)
I was able to trade memory for performance on this one by caching the eight LuaTransportLine objects in my splitter object within global.
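A minimal sketch of that kind of cache, assuming a splitter record shaped like `{ entity = <LuaEntity> }`. The record layout and the name `get_lines` are made up for illustration and are not taken from the mod.

```lua
-- Four input + four output lines per splitter, as described in the post.
local LINE_COUNT = 8

-- Hypothetical accessor: fetch the lines once, then reuse the stored table.
local function get_lines(splitter)
  if not splitter.lines then
    local lines = {}
    for i = 1, LINE_COUNT do
      -- Factorio API methods are called with a dot, not a colon.
      lines[i] = splitter.entity.get_transport_line(i)
    end
    splitter.lines = lines
  end
  return splitter.lines
end
```

The cached objects presumably go stale if the entity is destroyed or replaced, so something still has to drop `splitter.lines` when that happens.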

LuaTransportLine.get_contents(): 620 calls/ms (/ 8 = 155 splitters/ms)

LuaTransportLine.insert_at() + remove_item(): 342.1 call pairs/ms (/ 4 = 85.5 splitters/ms)
I combined these for two reasons: 1) that's how they are used (insert to the output and, if successful, remove from the input); and 2) after two inserts the line would have been full, so I had to do it this way :). In hindsight, I should also have tested failed inserts into a full line, because that does happen.
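The insert-then-remove pairing looks roughly like the sketch below. `move_one` is a hypothetical name, the position value is left to the caller, and it assumes insert_at() returns true on success, which is how I understand the API.

```lua
-- Hypothetical helper: move a single item from an input line to an output line.
-- The item is only removed from the input if the output accepted it.
local function move_one(input, output, item_name, position)
  local stack = { name = item_name, count = 1 }
  if output.insert_at(position, stack) then
    input.remove_item(stack)
    return true
  end
  return false -- output line was full (or otherwise refused the item)
end
```

The false branch is the failed-insert case mentioned above, which the timing numbers do not cover.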

Suffice it to say, this doesn't scale at all. I do predictive analysis to reduce the number of calls to get_contents and insert_at, and there's undoubtedly room for improvement on my side of things.

As an aside, checking LuaEntity.valid 1000 times per tick adds 0.8ms to the on_tick handler. I have mitigated this as well, except when I can't tell which version of upgrade-planner is in use (an older version invalidated the entities when it did the upgrade).
Raw Data
Code Sample

Re: Performance of LuaTransportLine

Posted: Fri Sep 23, 2016 5:51 pm
by Loewchen
Moved to suggestions.

Re: Performance of LuaTransportLine

Posted: Fri Sep 23, 2016 11:07 pm
by ssilk
It looks like the devs plan to rework the splitter completely. Whether that's really true, and when it would happen, is unknown.

Until then: viewtopic.php?f=80&t=518 Smart Splitter (-Suggestions will never End)
and you may want to search for "smart splitter" in this board.

Re: Performance of LuaTransportLine

Posted: Fri Sep 23, 2016 11:12 pm
by doktorstick
Not sure why this was moved to suggestions. The OP is about the performance of the APIs with the context of where and how it's important.

Re: Performance of LuaTransportLine

Posted: Fri Sep 23, 2016 11:22 pm
by ssilk
Because a lack of performance is not a bug? It's possibly an issue.

Re: Performance of LuaTransportLine

Posted: Sat Sep 24, 2016 12:53 pm
by Rseding91
Those methods use the exact same logic that normal belt flow does - they're literally wrappers around preexisting logic that belts had and I simply added the API calls.

All of the slow parts of them are simply Lua and having to go from Lua to C++ and then back. It's the nature of a scripting language that's interpreted at runtime, with every single call and every single function having error checking and error safety built in. Lua is simply incredibly slow.

Re: Performance of LuaTransportLine

Posted: Sat Sep 24, 2016 1:20 pm
by ssilk
I think there are some possibilities to increase performance. Think, for example, of the conversion that is needed for the x/y position: it has to convert a fixed-point value to a floating-point value and back.
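To illustrate the kind of round trip I mean, here is a generic sketch. The scale of 256 (8 fractional bits) is only an assumption for the example, and I don't know how the game actually stores or converts these values.

```lua
-- Assumed scale for illustration only: 8 fractional bits.
local SCALE = 256

-- Floating-point value -> fixed-point integer (rounded to nearest step).
local function to_fixed(x)
  return math.floor(x * SCALE + 0.5)
end

-- Fixed-point integer -> floating-point value.
local function to_float(f)
  return f / SCALE
end
```

Each API call that takes or returns a position would pay for one such conversion in each direction, on top of the Lua call itself.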