for i=1,#table a lot slower than for k,v in pairs (table)?

Place to get help with not working mods / modding interface.
User avatar
Optera
Smart Inserter
Smart Inserter
Posts: 2920
Joined: Sat Jun 11, 2016 6:41 am
Contact:

for i=1,#table a lot slower than for k,v in pairs (table)?

Post by Optera »

The behaviour of this function to merge red and green wires is confusing me.
This is the version I was using up until now.

Code: Select all

function GetCircuitValues(entity) 
  local greenWire = entity.get_circuit_network(defines.wire_type.green)
  local redWire =  entity.get_circuit_network(defines.wire_type.red)
  local items = {} 
  if greenWire then
    for _, v in pairs (greenWire.signals) do
      items[v.signal.type..","..v.signal.name] = v.count
    end
  end
  if redWire then
    for _, v in pairs (redWire.signals) do 
      if items[v.signal.type..","..v.signal.name] ~= nil then
        items[v.signal.type..","..v.signal.name] = items[v.signal.type..","..v.signal.name] + v.count
      else
        items[v.signal.type..","..v.signal.name] = v.count
      end
    end
  end
  return items
end
According to any lua performance guide using for i, #table do should be several times faster, but when I use tried to optimzie the function like the following version it became nearly 10 times slower.

Code: Select all

function GetCircuitValues(entity) 
  local greenWire = entity.get_circuit_network(defines.wire_type.green)
  local redWire =  entity.get_circuit_network(defines.wire_type.red)
  local items = {}
  if greenWire then
    for i=1, #greenWire.signals do
      local type = greenWire.signals[i].signal.type
      local name = greenWire.signals[i].signal.name
      local count = greenWire.signals[i].count
      
      items[type..","..name] = count
    end
  end
  if redWire then
    for i=1, #redWire.signals do
      local type = redWire.signals[i].signal.type
      local name = redWire.signals[i].signal.name
      local count = redWire.signals[i].count
      
      if items[type..","..name] ~= nil then
        items[type..","..name] = items[type..","..name] + count
      else
        items[type..","..name] = count
      end
    end
  end
  return items
end
User avatar
MeduSalem
Smart Inserter
Smart Inserter
Posts: 1686
Joined: Sun Jun 08, 2014 8:13 pm
Contact:

Re: for i=1,#table a lot slower than for k,v in pairs (table)?

Post by MeduSalem »

Which performance guides do you refer to?



Because once I last read the official Lua manual about a year ago the # operation might turn out to be really ugly because of how it works internally. As far as I know rather than caching the table size the operation instead iterates through all items of the table to actually find out the table's length during run-time because Lua doesn't really keep track of the actual table length so it has to determine it explicitly every time you call the # operation... which is urgh, especially for very long tables.

On the other hand the pairs(table) function is based upon the next(table) function which proceeds with the next item in the table until it hits an "End of Table" or so (so it actually doesn't know the table length until it hits a definitive end), which is why pairs(table) turns out to be a lot faster in cases where you want to do something with each table entry anyways.

So basically with # you are doing the loop twice... first with the # operation internally to determine the table's actual length and then again with the for-loop where you actually process each item pair.

My suggestion is use the pairs(table) (if order doesn't matter) or ipairs(table) (if order matters) if possible and avoid the stupid # operation completely if you can.

The # operation is only good for situations when you want to know the table size but don't want to actually do anything with the items (Which kinda defeats its own purpose). And then there would still be better methods than using the # operation, like for example storing the table size manually somewhere yourself to avoid that the Lua runtime has to go unecessary extra rounds.



In general I hate the # operation for its godforsakenly dumb implementation (whoever came up with how it works should have been withdrawn the programmer license because it can turn out to be quite a performance killer) and I hate how they also in turn deprecated the table.setn() and table.getn() functions at the same time. The later were a lot better in handling a table size (especially with huge sparse tables). Now you either have to use the slow # operation OR store the table size elsewhere and/or create a meta-table and update the size yourself whenever it changes if you want it to be performant.

Really... the # operation is the devil's work. It's like "so here you have a slow-ass automatic method of getting the table size, while we removed an easy way to manually update and attach a table-size to a specific table yourself, now see how you deal with that, nyahahahaha!"
User avatar
Optera
Smart Inserter
Smart Inserter
Posts: 2920
Joined: Sat Jun 11, 2016 6:41 am
Contact:

Re: for i=1,#table a lot slower than for k,v in pairs (table)?

Post by Optera »

This tests and my previous tests in factorio heavily favored # loops over pairs, let alone ipairs.

I agree with you that # operand is a hack and should be replaced by tables that hold their amount of entries like in C++ or C#.
User avatar
MeduSalem
Smart Inserter
Smart Inserter
Posts: 1686
Joined: Sun Jun 08, 2014 8:13 pm
Contact:

Re: for i=1,#table a lot slower than for k,v in pairs (table)?

Post by MeduSalem »

Weird. Personally, I never modded for Factorio so I can't say anything about that in particular...

I rather did some modding for Prison Architect, which also uses Lua, but only up to Lua version 5.1. There every pairs/ipairs configuration outperformed the usage of the #-operation, which was according to the Lua manual back then to a certain degree.

Maybe it also depends on how the underlying data structure is laid out or how well they incorporated the Lua runtime into their API. (I'm pretty sure the implementation of Prison Architect sucks in comparison to Factorio's because of how Wube is in general working in a much more organized and cleaned-up fashion judging from their development philosophy)


Also I'd imagine Factorio uses a newer iteration of Lua altogether so maybe the internal # behaviour may have changed since I last read into the manual.

But that might still not explain how for i, #table is faster in some things and but becomes slow with other things.

Maybe it has something to do with the # operation working on a multi-dimensional array instead of a classic array. Who knows what kind of black magic is happening deep within the Lua runtime.
Last edited by MeduSalem on Mon Dec 26, 2016 12:27 pm, edited 1 time in total.
User avatar
Klonan
Factorio Staff
Factorio Staff
Posts: 5290
Joined: Sun Jan 11, 2015 2:09 pm
Contact:

Re: for i=1,#table a lot slower than for k,v in pairs (table)?

Post by Klonan »

Optera wrote:The behaviour of this function to merge red and green wires is confusing me.
This is the version I was using up until now.

Code: Select all

function GetCircuitValues(entity) 
  local greenWire = entity.get_circuit_network(defines.wire_type.green)
  local redWire =  entity.get_circuit_network(defines.wire_type.red)
  local items = {} 
  if greenWire then
    for _, v in pairs (greenWire.signals) do
      items[v.signal.type..","..v.signal.name] = v.count
    end
  end
  if redWire then
    for _, v in pairs (redWire.signals) do 
      if items[v.signal.type..","..v.signal.name] ~= nil then
        items[v.signal.type..","..v.signal.name] = items[v.signal.type..","..v.signal.name] + v.count
      else
        items[v.signal.type..","..v.signal.name] = v.count
      end
    end
  end
  return items
end
According to any lua performance guide using for i, #table do should be several times faster, but when I use tried to optimzie the function like the following version it became nearly 10 times slower.

Code: Select all

function GetCircuitValues(entity) 
  local greenWire = entity.get_circuit_network(defines.wire_type.green)
  local redWire =  entity.get_circuit_network(defines.wire_type.red)
  local items = {}
  if greenWire then
    for i=1, #greenWire.signals do
      local type = greenWire.signals[i].signal.type
      local name = greenWire.signals[i].signal.name
      local count = greenWire.signals[i].count
      
      items[type..","..name] = count
    end
  end
  if redWire then
    for i=1, #redWire.signals do
      local type = redWire.signals[i].signal.type
      local name = redWire.signals[i].signal.name
      local count = redWire.signals[i].count
      
      if items[type..","..name] ~= nil then
        items[type..","..name] = items[type..","..name] + count
      else
        items[type..","..name] = count
      end
    end
  end
  return items
end
What i see is that your optimization was on the wrong part,
For a table as short as the signals, the time taken to loop through is completely insignificant,

Whether using pairs or for i = 1, #table, with a table this short it really won't make a difference
The reason the 1st is faster, is because the for loop calls each object in the table as a local variable,
Whereas with the other one, you are referencing the table for each value of i

This would be similar to the how the for loop works:

Code: Select all

  if greenWire then
    local signals = greenWire.signals
    for i=1, #signals do
      local v = signals[i]
      items[v.signal.type..","..v.signal.name] = v.count
    end
  end
But really, the choice of loop is making almost no difference in this case
User avatar
Optera
Smart Inserter
Smart Inserter
Posts: 2920
Joined: Sat Jun 11, 2016 6:41 am
Contact:

Re: for i=1,#table a lot slower than for k,v in pairs (table)?

Post by Optera »

Klonan wrote:But really, the choice of loop is making almost no difference in this case
It should indeed make no noticeable difference.
But changing for pairs in this function to for # jumps ups from 0.4 to 15.2.
User avatar
cpeosphoros
Inserter
Inserter
Posts: 40
Joined: Fri Dec 23, 2016 10:57 pm
Contact:

Re: for i=1,#table a lot slower than for k,v in pairs (table)?

Post by cpeosphoros »

TL;DR: "#" is fastest, but it's only relevant with really huge tables.

Here is what I got using my profile tool (viewtopic.php?f=135&t=38996):

control.lua:

Code: Select all

require "stdlib.log.logger"

LOGGER = Logger.new("LoopProfiler", "loop", true, {log_ticks=true} )


local done

function testTable(tableSize)

	LOGGER.log("Table Size: " .. tableSize)
	local largeTable = {}
	for i = 1, tableSize do
		table.insert(largeTable, i)
	end

	LOGGER.log("Creating second table")
	local largeTable2 = {}
	for i = 1, tableSize do
		largeTable2[i] = i
	end

	LOGGER.log("Entering Loops")

	for i = 1, tableSize do
		largeTable[i] = largeTable[i] + 1
	end
	LOGGER.log("Finished tableSize Loop")

	for i = 1, #largeTable do
		largeTable[i] = largeTable[i] + 1
	end
	LOGGER.log("Finished # Loop")

	for i, entry in pairs(largeTable) do
		largeTable[i] = largeTable[i] + 1
	end
	LOGGER.log("Finished first pairs Loop")

	for i, entry in pairs(largeTable) do
		largeTable[i] = entry + 1
	end
	LOGGER.log("Finished second pairs Loop")

	for i, entry in ipairs(largeTable) do
		largeTable[i] = largeTable[i] + 1
	end
	LOGGER.log("Finished first ipairs Loop")

	for i, entry in ipairs(largeTable) do
		largeTable[i] = entry + 1
	end
	LOGGER.log("Finished second ipairs Loop")

end

local function onTick(event)

	if not done then

		testTable(10000)
		testTable(100000)
		testTable(1000000)
		testTable(10000000)

		done = true
	end
end

script.on_event(defines.events.on_tick, onTick)
Results from profiler.log (slightly edited for readability:

Code: Select all

           0: 89:01:52.39: Table Size: 10000
           5: 89:01:52.39: Creating second table
           2: 89:01:52.39: Entering Loops
           0: 89:01:52.39: Finished tableSize Loop
           0: 89:01:52.39: Finished # Loop
           1: 89:01:52.39: Finished first pairs Loop
           0: 89:01:52.39: Finished second pairs Loop
           1: 89:01:52.39: Finished first ipairs Loop
           0: 89:01:52.39: Finished second ipairs Loop

           0: 89:01:52.39: Table Size: 100000
          51: 89:01:52.39: Creating second table
          14: 89:01:52.39: Entering Loops
           8: 89:01:52.39: Finished tableSize Loop
           7: 89:01:52.39: Finished # Loop
          11: 89:01:52.39: Finished first pairs Loop
           8: 89:01:52.39: Finished second pairs Loop
          11: 89:01:52.39: Finished first ipairs Loop
           8: 89:01:52.39: Finished second ipairs Loop

           0: 89:01:52.39: Table Size: 1000000
         840: 89:01:52.39: Creating second table
         292: 89:01:52.39: Entering Loops
         161: 89:01:52.39: Finished tableSize Loop
         155: 89:01:52.39: Finished # Loop
         196: 89:01:52.39: Finished first pairs Loop
         172: 89:01:52.39: Finished second pairs Loop
         189: 89:01:52.39: Finished first ipairs Loop
         167: 89:01:52.39: Finished second ipairs Loop

           0: 89:01:52.39: Table Size: 10000000
       51416: 89:01:52.39: Creating second table
       23523: 89:01:52.39: Entering Loops
        1293: 89:01:52.39: Finished tableSize Loop
        1281: 89:01:52.39: Finished # Loop
        1645: 89:01:52.39: Finished first pairs Loop
        1438: 89:01:52.39: Finished second pairs Loop
        1659: 89:01:52.39: Finished first ipairs Loop
        1380: 89:01:52.39: Finished second ipairs Loop
(The first value in each column is the time in milliseconds this step took since the finishing of the previous, so

Code: Select all

           0: 89:01:52.39: Table Size: 1000000
         840: 89:01:52.39: Creating second table
         292: 89:01:52.39: Entering Loops
means it took 840 ms to create the first table and 292 to create the second.

Analysis:

- Populating Tables with t = value is between 2 to 3 times faster than table.insert(t, i)
- Populating a Table is about a magnitude order above using it.
- Externally storing tableSize is almost exactly as fast as using #, with # being slightly faster and also the fastest loop method
- pairs() is slightly slower than ipairs(), which is slightly slower than # or tableSize
- t = value + 1 is fastest than t = t + 1, where i, value is the pair returned by pairs() or ipairs()
- it took a table size of 1.000.000 so the loop overhead would start making any noticeable difference on performance. I don't know of any table relevant for modding Factorio that is as large as that.
Choumiko
Smart Inserter
Smart Inserter
Posts: 1352
Joined: Fri Mar 21, 2014 10:51 pm
Contact:

Re: for i=1,#table a lot slower than for k,v in pairs (table)?

Post by Choumiko »

Optera wrote:This tests and my previous tests in factorio heavily favored # loops over pairs, let alone ipairs.

I agree with you that # operand is a hack and should be replaced by tables that hold their amount of entries like in C++ or C#.
MeduSalem wrote:But that might still not explain how for i, #table is faster in some things and but becomes slow with other things.

Maybe it has something to do with the # operation working on a multi-dimensional array instead of a classic array. Who knows what kind of black magic is happening deep within the Lua runtime.
It could be that's because the table Optera is iterating over isn't a "real" lua table. The # operator didn't work on tables/references (whatever it is really) you got directly through the API a few versions ago, so i guess it's a custom implementation that isn't that optimized (yet?)
cpeosphoros test's seem to indicate something like that.
cpeosphoros wrote:Here is what I got using my profile tool (viewtopic.php?f=135&t=38996):
Nice, how did i miss that? Gonna use the hell out of it once my playing addiction changes to modding addiction again :D
cpeosphoros wrote: - Populating Tables with t = value is between 2 to 3 times faster than table.insert(t, i)
- Populating a Table is about a magnitude order above using it.
- Externally storing tableSize is almost exactly as fast as using #, with # being slightly faster and also the fastest loop method
- pairs() is slightly slower than ipairs(), which is slightly slower than # or tableSize
- t = value + 1 is fastest than t = t + 1, where i, value is the pair returned by pairs() or ipairs()
- it took a table size of 1.000.000 so the loop overhead would start making any noticeable difference on performance. I don't know of any table relevant for modding Factorio that is as large as that.


That seems to be what i remember from the different sources i remember reading.
A few notes: I believe ipairs doesn't really work on tables you get through the API. Factorios pairs() might be customized to guarantee determinism. I mostly just go with pairs().
User avatar
cpeosphoros
Inserter
Inserter
Posts: 40
Joined: Fri Dec 23, 2016 10:57 pm
Contact:

Re: for i=1,#table a lot slower than for k,v in pairs (table)?

Post by cpeosphoros »

Choumiko wrote:It could be that's because the table Optera is iterating over isn't a "real" lua table. The # operator didn't work on tables/references (whatever it is really) you got directly through the API a few versions ago, so i guess it's a custom implementation that isn't that optimized (yet?)
cpeosphoros test's seem to indicate something like that.
Choumiko wrote:
cpeosphoros wrote:Here is what I got using my profile tool (viewtopic.php?f=135&t=38996):
Nice, how did i miss that? Gonna use the hell out of it once my playing addiction changes to modding addiction again :D
Thanks. Please PM me if you find any improvement/bug/etc
A few notes: I believe ipairs doesn't really work on tables you get through the API. Factorios pairs() might be customized to guarantee determinism. I mostly just go with pairs().
From lua documentation, pairs() will work with any table whatsoever, always returning all k,v pairs in that table.

ipairs() will only work on what the documentation calls "the array part of a table", which is defined as all pairs 1, v; 2, v; ...; n, v where n is the last continuous integer key indexing a non-nil value.

So, if you have: t = {name = "myName", 1 = "one", 2 = "two", 3 = "three", 5 = "five"}, then pairs() will return all 5 pairs, while ipairs() will return only 1,"one", 2,"two" and 3,"three" (it will not return 5, "five", because of the gap at 4).

Just as a side note, if you have: t = {1 = "one", 2 = "two", 3 = "three", 4 = "four", 5 = "five"}, then table.remove(t, 4) will result in t = {1 = "one", 2 = "two", 3 = "three", 4 = "five"}. If you wanted the result to be t = {1 = "one", 2 = "two", 3 = "three", 5 = "five"}, then the correct command should be t[4] = nil.

It took me a while to figure it out my first time around.

I always use pairs(). The performance hit is negligible.
User avatar
Optera
Smart Inserter
Smart Inserter
Posts: 2920
Joined: Sat Jun 11, 2016 6:41 am
Contact:

Re: for i=1,#table a lot slower than for k,v in pairs (table)?

Post by Optera »

This all is fine, but does not explain the odd increase in from 0.4 ups with for pairs to 5.1ups with for #table of my op functions.
User avatar
cpeosphoros
Inserter
Inserter
Posts: 40
Joined: Fri Dec 23, 2016 10:57 pm
Contact:

Re: for i=1,#table a lot slower than for k,v in pairs (table)?

Post by cpeosphoros »

Optera wrote:This all is fine, but does not explain the odd increase in from 0.4 ups with for pairs to 5.1ups with for #table of my op functions.
Without actually profiling it, I'd guess the problem is not directly with the kind of loop you are using, but:

Code: Select all

local type = greenWire.signals[i].signal.type
local name = greenWire.signals[i].signal.name
local count = greenWire.signals[i].count
      
items[type..","..name] = count
should be slower than

Code: Select all

items[v.signal.type..","..v.signal.name] = v.count
due to the extra indexing for each "greenWire.signals" call.

It's basically the difference between each version of pairs/ipairs loops I used on my profiling test.

I say the problem is not directly with the loop kind, but it's related to it, since when you use # (or tableSize, anyway), you will not have the indexed value directly available.

This should work a bit faster with a # loop, though:

Code: Select all

local signals = greenWire.signals[i]
local type = signals.signal.type
local name = signals.signal.name
local count = signals.count
      
items[type..","..name] = count
User avatar
aubergine18
Smart Inserter
Smart Inserter
Posts: 1264
Joined: Fri Jul 22, 2016 8:51 pm
Contact:

Re: for i=1,#table a lot slower than for k,v in pairs (table)?

Post by aubergine18 »

Just adding to the point made in some posts above about the tables not being normal Lua tables - this affects all aspects of the LuaCustomTable concept, including #, pairs, etc.

The values in the table are only converted from C to Lua at time of access, and as far as I know aren't cached. So wherever possible make a local reference to a value in Lua if you are accessing it more than once (to avoid multiple converts between C and Lua).

http://lua-api.factorio.com/latest/LuaCustomTable.html
Better forum search for modders: Enclose your search term in quotes, eg. "font_color" or "custom-input" - it prevents the forum search from splitting on hypens and underscores, resulting in much more accurate results.
User avatar
Optera
Smart Inserter
Smart Inserter
Posts: 2920
Joined: Sat Jun 11, 2016 6:41 am
Contact:

Re: for i=1,#table a lot slower than for k,v in pairs (table)?

Post by Optera »

Klonan wrote: This would be similar to the how the for loop works:

Code: Select all

  if greenWire then
    local signals = greenWire.signals
    for i=1, #signals do
      local v = signals[i]
      items[v.signal.type..","..v.signal.name] = v.count
    end
  end
But really, the choice of loop is making almost no difference in this case
cpeosphoros wrote: This should work a bit faster with a # loop, though:

Code: Select all

local signals = greenWire.signals[i]
local type = signals.signal.type
local name = signals.signal.name
local count = signals.count
      
items[type..","..name] = count
I just ran another test on a different map between the two loops:
  • for pairs:
    0.05 - 0.09
  • for # with only signals localized as Klonan suggested:
    0.21 - 0.40
  • for # with localized signals and localized name,Type, count as cpeosphoros suggested:
    0.20 - 0.39
  • for # as in OP:
    0.52 - 1.01

Localizing signals is making it nearly 3x faster, but it's still ~4x slower than for pairs.
I still can't wrap my head around the ~4x slower execution of for # in this particular case. As Klonan said the difference should be hardly noticeable.

aubergine18 wrote:Just adding to the point made in some posts above about the tables not being normal Lua tables - this affects all aspects of the LuaCustomTable concept, including #, pairs, etc.

The values in the table are only converted from C to Lua at time of access, and as far as I know aren't cached. So wherever possible make a local reference to a value in Lua if you are accessing it more than once (to avoid multiple converts between C and Lua).

http://lua-api.factorio.com/latest/LuaCustomTable.html

Thank's I've only seen LuaCustomTable in game.script so far.
LuaCircuitNetwork.signals for example states it's an array of Signal. So i presume it's not a LuaCustomTable.
Post Reply

Return to “Modding help”