[1.1.91] Lua sequences with 1-2 million elements excessively slow (integer hash collisions)

We are aware of them, but they have low priority. We have more important things to do. They go here in order not to take space in the main bug thread list.
Post Reply
mwls
Burner Inserter
Burner Inserter
Posts: 10
Joined: Wed Jan 18, 2023 6:00 am
Contact:

[1.1.91] Lua sequences with 1-2 million elements excessively slow (integer hash collisions)

Post by mwls »

Problem
I have been working on a mod that renders a bitmap image in-memory based on the game map to a table before writing it out to a file. While performance was slow but acceptable with smaller image (and therefore table) sizes, I found that after adding 1049600 entries to a table, the next entry took over 11 seconds to write.

Code: Select all

local p = game.create_profiler()
local t = {}
for i = 1, 1049599 do t[i] = 0 end
game.print{"", "1-", #t, ": ", p}
for i = #t+1, #t+4 do
    p.reset()
    t[i] = 0
    game.print{"", i, ": ", p}
end
 
-- output:
1-1049599: Elapsed: 556.496840ms
1049600: Elapsed: 0.007879ms
1049601: Elapsed: 11443.948614ms
1049602: Elapsed: 0.008234ms
1049603: Elapsed: 0.075207ms
Digging further, I also found that, surprisingly, tables got faster again after crossing approximately 2.1 million items.

Code: Select all

local p = game.create_profiler()
local t = {}
for i = 1, 5000000 do
    t[i] = 0
    if i % 100000 == 0 then
        game.print{"", i, ": ", p}
        p.reset()
    end
end

-- output:
100000: Elapsed: 46.207088ms
200000: Elapsed: 53.280971ms
300000: Elapsed: 91.903078ms
400000: Elapsed: 30.148098ms
500000: Elapsed: 35.163378ms
600000: Elapsed: 189.806780ms
700000: Elapsed: 34.790679ms
800000: Elapsed: 34.939359ms
900000: Elapsed: 36.173256ms
1000000: Elapsed: 38.712967ms
1100000: Elapsed: 12990.752463ms
1200000: Elapsed: 200.527076ms
1300000: Elapsed: 568.851723ms
1400000: Elapsed: 819.362545ms
1500000: Elapsed: 1064.082434ms
1600000: Elapsed: 1302.363555ms
1700000: Elapsed: 1572.051218ms
1800000: Elapsed: 1840.763217ms
1900000: Elapsed: 2099.557564ms
2000000: Elapsed: 2371.360799ms
2100000: Elapsed: 3335.998881ms
2200000: Elapsed: 39.499897ms
2300000: Elapsed: 41.462934ms
2400000: Elapsed: 40.821955ms
2500000: Elapsed: 43.024659ms
2600000: Elapsed: 42.432301ms
2700000: Elapsed: 42.355921ms
2800000: Elapsed: 45.198566ms
2900000: Elapsed: 44.559136ms
3000000: Elapsed: 48.675417ms
3100000: Elapsed: 47.267981ms
3200000: Elapsed: 48.517513ms
3300000: Elapsed: 48.437369ms
3400000: Elapsed: 51.772120ms
3500000: Elapsed: 49.738693ms
3600000: Elapsed: 53.296874ms
3700000: Elapsed: 55.244591ms
3800000: Elapsed: 54.951138ms
3900000: Elapsed: 54.545961ms
4000000: Elapsed: 55.934428ms
4100000: Elapsed: 58.980624ms
4200000: Elapsed: 1936.416306ms
4300000: Elapsed: 29.680170ms
4400000: Elapsed: 30.193556ms
4500000: Elapsed: 29.431090ms
4600000: Elapsed: 29.327514ms
4700000: Elapsed: 33.734657ms
4800000: Elapsed: 29.790708ms
4900000: Elapsed: 29.234059ms
5000000: Elapsed: 30.076491ms
It's not just growing the table that's slow, in that range, but all operations. For example, it's approximately 30-50x faster to do a read-only iteration over the entire contents of a table with 2.2 million entries than a table with 2 million:

Code: Select all

local function iterate(t)
    local p = game.create_profiler()
    local sum = 0
    for i = 1, #t do sum = sum + t[i] end --[[ Using pairs/ipairs for iteration have similar performance ]]
    p.stop()
    game.print{"", "Iterated ", #t, " items: ", p}
end
local t = {}
for i = 1, 2000000 do t[i] = 1 end; iterate(t)
for i = 2000001, 2200000 do t[i] = 1 end; iterate(t)

--output:
Iterated 2000000 items: Duration: 26773.726862ms
Iterated 2200000 items: Duration: 544.336383ms
  • What I expected is that there shouldn't be such a severe performance degradation in the 1-2.1 million size range when using tables with sequential integer keys.
  • This is completely reproducible in a fresh savefile with no mods.

Investigation/Explanation
(Caveat lector: I'm not super-experienced with this and some or all of this explanation may be incorrect)

I grabbed a copy of the Factorio Lua fork from https://github.com/Rseding91/Factorio-Lua/ and compiled it and was able to reproduce the slow performance there. After some profiling and adding some debug logging, I found that when the table reaches the problem size, almost all the time is spent following long node chains in the hash table.

Some background: Standard Lua tables are implemented internally using a combination of an array and a hash table. Sequential integer keys starting from 1 are stored in the array part, and other types of keys, potentially including larger integers when representing sparse arrays, are stored in the hash part. This behavior is described in more detail in the Performance Tips chapter of Lua Programming Gems.

In Factorio's fork of Lua, integer keys of 1024 or less are always stored in the array part of the table, even if sparse, and integers greater than 1024 are always stored in the hash part. This makes storing large sequences in tables inherently a little slower in Factorio Lua than in standard Lua, since almost all keys need to go through a hash table lookup, but it ought to be a fairly constant slow-down.

Lua also has two different approaches for hashing numbers that it uses depending on the platform: one based on re-interpreting a double's underlying storage as integers ("cast" hash), and one based on the frexp() function. Factorio uses the "frexp" hash on all platforms, except for Nintendo Switch. My understanding is that these differences are intended to make Lua script execution deterministic across multiple platforms, which is necessary for Factorio's multiplayer and replay functionality.

The problem is that the "frexp" hash function distributes integers across hash table buckets poorly compared to other hash functions, including the "cast" hash, and when the hash table size is 2^21 specifically, almost all keys end up in a very small number of buckets. I haven't dug into the math behind the hash function enough to understand why that's the case, but I have written a small program to evaluate the different hash functions at various table sizes to experimentally determine how effective they are. Note the 21st row in particular:

Code: Select all

frexp hash                        ||  cast hash
==================================++==================================
bits  in-pos  max    avg  it-ops  ||  bits  in-pos  max    avg  it-ops
   1   50.0%    2    2.0    1.50  ||     1   50.0%    2    2.0    1.50
   2   50.0%    2    2.0    1.50  ||     2   75.0%    2    1.3    1.25
   3   25.0%    4    4.0    2.50  ||     3   87.5%    2    1.1    1.12
   4   50.0%    2    2.0    1.50  ||     4   93.8%    2    1.1    1.06
   5   50.0%    2    2.0    1.50  ||     5   96.9%    2    1.0    1.03
   6   28.1%    4    3.6    2.31  ||     6   98.4%    2    1.0    1.02
   7    1.6%   64   64.0   32.50  ||     7   99.2%    2    1.0    1.01
   8   85.2%    2    1.2    1.15  ||     8   99.6%    2    1.0    1.00
   9   28.5%    4    3.5    2.29  ||     9   99.8%    2    1.0    1.00
  10   50.0%    2    2.0    1.50  ||    10   99.9%    2    1.0    1.00
  11   75.0%    2    1.3    1.25  ||    11   75.0%    2    1.3    1.25
  12   76.4%    3    1.3    1.27  ||    12   81.2%    2    1.2    1.19
  13   52.4%    4    1.9    1.62  ||    13   78.1%    3    1.3    1.23
  14   34.0%    5    2.9    2.26  ||    14   82.4%    3    1.2    1.18
  15   81.1%    3    1.2    1.19  ||    15   81.8%    3    1.2    1.19
  16   49.2%    5    2.0    1.82  ||    16   81.5%    3    1.2    1.19
  17   47.2%    7    2.1    2.04  ||    17   81.4%    3    1.2    1.19
  18   43.9%   10    2.3    2.21  ||    18   81.7%    3    1.2    1.18
  19   37.6%   13    2.7    2.51  ||    19   81.7%    3    1.2    1.18
  20   25.1%   21    4.0    3.48  ||    20   81.7%    3    1.2    1.18
  21    0.3%  512  341.6  235.04  ||    21   81.6%    3    1.2    1.18
  22   50.3%   12    2.0    1.99  ||    22   70.0%    4    1.4    1.32
  23   73.7%    5    1.4    1.31  ||    23   55.9%    4    1.8    1.51
  24   79.3%    4    1.3    1.21  ||    24   50.0%    6    2.0    1.88
  25   66.9%    3    1.5    1.34  ||    25   31.2%   12    3.2    3.19
  26   54.3%    5    1.8    1.54  ||    26   18.7%   24    5.3    5.84
  27   54.8%    6    1.8    1.74  ||    27   12.5%   16    8.0    7.17
  28   57.4%    4    1.7    1.62  ||    28   14.8%    8    6.7    4.34
  29   64.6%    4    1.5    1.39  ||    29   25.4%    7    3.9    2.50
  30   81.7%    3    1.2    1.18  ||    30   37.4%    6    2.7    2.01
The program simulates adding sequential integers, starting from 1024, to various size hash tables, and counts the number of keys that end up in the same bucket.
  • bits: size of hashtable = 2^bits
  • in-pos: what percentage of table entries are are stored at their "main" position in the hash table and require no Node chain traversal to get (ideally should be about 63% for large tables with a uniform hash)
  • max: the longest node chain, or the most entries that collide to the same hash bucket
  • avg: the average length of all Node chains in the table, ignoring buckets with no nodes
  • it-ops: the average number of operations/comparisons needed to find an entry in the table (ideally should be about 1.5 for large tables with a uniform hash). This is the closest to a "score" for how good the hash function is.
So to come full circle, the reason that the 1049601st entry is so slow to add to the hash table is because it's the point at which the hash table fills up its 10^20 entries (plus 1024 for the array part), allocates a new 10^21-sized hash table, and copies all the entries to the new one, but most re-insertions end up needing to traverse a chain of hundreds of colliding nodes instead of the expected 1 or 2.

This issue doesn't affect standard Lua when using tables with sequential integer keys because it is able to store all of the table entries internally in the array part of the table, so it doesn't hash the keys at all and doesn't encounter the collisions at this particular table size.
Workarounds
Obviously, avoiding creating such large tables if at all possible is a good start. If it's unavoidable, the simplest workaround I've found is to multiply all table indexes by a medium-sized prime, such as 997, to better spread entries out over different hash buckets. This requires keeping track of the largest key separately instead of being able to use #table, and slows most accesses down a bit, but it's far faster when the hash table capacity is (or will be) 2^21.
Last edited by mwls on Mon Oct 02, 2023 8:12 am, edited 2 times in total.

mwls
Burner Inserter
Burner Inserter
Posts: 10
Joined: Wed Jan 18, 2023 6:00 am
Contact:

Re: [1.1.91] Lua sequences with 1-2 million elements excessively slow (integer hashing)

Post by mwls »

I don't know if this is a known issue but I couldn't find anything about it. I'm not expecting this to be a very high-priority issue to address given how many mods there are that apparently haven't encountered it, but I thought it was interesting to investigate, and I wanted to write up my findings in case anyone else stumbles over it in the future.

Changing the hash function would be one obvious fix, but I'm not sure whether or not that's viable for Factorio (potential compatibility/performance concerns?). I am also a little curious why the better Lua hash function, or a variant of it, isn't being used; given that table iteration happens in insertion order, it's not immediately obvious to me how the hash function could affect execution determinism even if it did differ between platforms.
Last edited by mwls on Mon Oct 02, 2023 8:20 am, edited 1 time in total.

Bilka
Factorio Staff
Factorio Staff
Posts: 3139
Joined: Sat Aug 13, 2016 9:20 am
Contact:

Re: [1.1.91] Lua sequences with 1-2 million elements excessively slow (integer hashing)

Post by Bilka »

Not sure what the meaning of this is, but you are linking to two different hash functions. The one you linked for original Lua is inside a LUA_IEEE754TRICK define. Here is the same one as linked for Factorio-Lua: https://github.com/lua/lua/blob/v5.2.1/ ... #L284-L286

Edit: The removal of that LUA_IEEE754TRICK define from the Factorio-Lua is related to 70571.
I'm an admin over at https://wiki.factorio.com. Feel free to contact me if there's anything wrong (or right) with it.

mwls
Burner Inserter
Burner Inserter
Posts: 10
Joined: Wed Jan 18, 2023 6:00 am
Contact:

Re: [1.1.91] Lua sequences with 1-2 million elements excessively slow (integer hashing)

Post by mwls »

Bilka wrote:
Mon Oct 02, 2023 7:15 am
Not sure what the meaning of this is, but you are linking to two different hash functions. The one you linked for original Lua is inside a LUA_IEEE754TRICK define. Here is the same one as linked for Factorio-Lua: https://github.com/lua/lua/blob/v5.2.1/ ... #L284-L286

Edit: The removal of that LUA_IEEE754TRICK define from the Factorio-Lua is related to 70571.
Thank you for pointing this out. In my alternating between the two Lua forks, I missed that the frexp-based hash is also used in standard Lua, and I misread a different file that made me think that LUA_IEEE754TRICK is always defined. My platform also compiles with the "trick" hash by default.

So the "weak" hash does come from standard Lua, but it's not really* an issue there because its tables' arrays aren't size-limited.

I've updated my original post to make the explanation about the two hashes more accurate.



* I was able to get similar slow-table behavior in standard Lua by compiling for the "ansi" target and using negative table indexes to force the use of the hash table.

Rseding91
Factorio Staff
Factorio Staff
Posts: 13209
Joined: Wed Jun 11, 2014 5:23 am
Contact:

Re: [1.1.91] Lua sequences with 1-2 million elements excessively slow (integer hash collisions)

Post by Rseding91 »

Thanks for the report. I'm going to move this to minor issues until such a time where another developer wants to poke at the lua internals. We've had too many issues with Lua in the past to make me comfortable changing anything around it without extensive testing and time for us to use it before releasing.
If you want to get ahold of me I'm almost always on Discord.

Post Reply

Return to “Minor issues”