[Rseding91] [0.17.74] mass UnitGroup creation and deletion causes desync if stored in global table using #table

Things that we don't consider worth fixing at this moment.
User avatar
Muppet9010
Filter Inserter
Filter Inserter
Posts: 279
Joined: Sat Dec 09, 2017 6:01 pm
Contact:

[Rseding91] [0.17.74] mass UnitGroup creation and deletion causes desync if stored in global table using #table

Post by Muppet9010 »

If you create and destroy a lot of unit groups this can cause a desync in heavy mode.

In a clean 0.17.74 vanilla game.
Enable heavy mode (toggle-heavy-mode)
Run the below script a few times (up to 5) and a desync will be reported.

Code: Select all

/c
global.groups = global.groups or {}
local surface = game.player.surface
local force = game.player.force
for i = 1, 100 do
    local unit = surface.create_entity{name="small-biter", position={10,10}, force=force}
    local group = surface.create_unit_group{position={10,10}, force=force}
    group.add_member(unit)
    global.groups[#global.groups+1] = group
end
game.print("#global.groups: " .. #global.groups)

local toDestroy = math.random(10,30)
for i=1, toDestroy do
    local groupIndex = math.random(1, #global.groups)
    local group = global.groups[groupIndex]
    if group ~= nil and group.valid then
        group.destroy()
        global.groups[groupIndex] = nil
    end
end

This issue and their identification of the Lua code triggering the desync was pointed out to me by a fan of Comfy.
If the line "global.groups[groupIndex] = nil" isn't present the desync doesn't occur. So I assume its some internal lua/c++ game engine thing?
Last edited by Muppet9010 on Thu Oct 31, 2019 9:53 am, edited 1 time in total.
Rseding91
Factorio Staff
Factorio Staff
Posts: 14619
Joined: Wed Jun 11, 2014 5:23 am
Contact:

Re: [Rseding91] [0.17.74] mass UnitGroup creation and deletion causes desync

Post by Rseding91 »

The issue is coming from directly assigning nil to table values when the entire table is numeric-keyed. Lua in its ever-smartness decides (seemingly randomly) to convert the table between a hash table and an array depending on how many nils are in the table and where they're located.

I don't have a solution to this at the moment and likely won't have one for some time. The "#" operator on sparse tables gives you back garbage values (as you're finding out) and just doesn't work on hash tables. Basically don't use the "#" operator - it's just trouble.
If you want to get ahold of me I'm almost always on Discord.
User avatar
BlueTemplar
Smart Inserter
Smart Inserter
Posts: 3234
Joined: Fri Jun 08, 2018 2:16 pm
Contact:

Re: [Rseding91] [0.17.74] mass UnitGroup creation and deletion causes desync

Post by BlueTemplar »

Rseding91 wrote: Wed Oct 30, 2019 3:14 pm Basically don't use the "#" operator - it's just trouble.
FFS !

(Is it at least safe to use it during the Factorio-loading stages ?)
BobDiggity (mod-scenario-pack)
User avatar
boskid
Factorio Staff
Factorio Staff
Posts: 3290
Joined: Thu Dec 14, 2017 6:56 pm
Contact:

Re: [Rseding91] [0.17.74] mass UnitGroup creation and deletion causes desync

Post by boskid »

For now it looks that length operator (#) on its own works ok (deterministic), but issue (desync) comes from cases where global variable is save-loaded: globals are serialised using serpent.dump on server and then loaded in client but here when loading, lua may choose different internal format than server's and so length operator may give different value.
This means that length operator on tables that were assigned to global in previous ticks (or in same tick in different update phase) may desync when lua chooses hash instead of array or array instead of hash.

I am not sure if it is worth fixing because of performance of all table operations.

This two reproductions are interesting since first will give second's initialiser and second will give first's initialiser:

Code: Select all

/c global.t = {0,0,0,nil,nil,nil,nil,nil,nil,nil,nil,nil,nil,nil,nil,nil,nil,nil,nil,nil,nil,0,0,0,0,0,0,0,0,0,0,0,0}
/c game.print(serpent.dump(global.t))
and

Code: Select all

/c global.t = {[1]=0,[2]=0,[3]=0,[22]=0,[23]=0,[24]=0,[25]=0,[26]=0,[27]=0,[28]=0,[29]=0,[30]=0,[31]=0,[32]=0,[33]=0}
/c game.print(serpent.dump(global.t))
Bilka
Factorio Staff
Factorio Staff
Posts: 3310
Joined: Sat Aug 13, 2016 9:20 am
Contact:

Re: [Rseding91] [0.17.74] mass UnitGroup creation and deletion causes desync

Post by Bilka »

In that case, see 74166
I'm an admin over at https://wiki.factorio.com. Feel free to contact me if there's anything wrong (or right) with it.
Rseding91
Factorio Staff
Factorio Staff
Posts: 14619
Joined: Wed Jun 11, 2014 5:23 am
Contact:

Re: [Rseding91] [0.17.74] mass UnitGroup creation and deletion causes desync if stored in global table using #table

Post by Rseding91 »

Boskid looked at this more and the fix would make every single usage of # from O(logN) to O(N) which I consider far worse than the rare issue you're running into.
If you want to get ahold of me I'm almost always on Discord.
User avatar
boskid
Factorio Staff
Factorio Staff
Posts: 3290
Joined: Thu Dec 14, 2017 6:56 pm
Contact:

Re: [Rseding91] [0.17.74] mass UnitGroup creation and deletion causes desync if stored in global table using #table

Post by boskid »

Simply speaking: length operator # uses binary search (O(logN)) to quickly compute amount of non-nil contiguous elements with numeric indexes. It should be deterministic when there are no gaps. When there are gaps, it may desync since it uses sizearray as starting point and so middle points may or may not hit a gap making result belong to longer or shorter part of array. table_size goes over every element and is O(N), but ignores sizearray and should always give the same value.

Code: Select all

/c
  local a = {}
  for i = 1,1023 do
    a[i] = 1
  end
  a[512]=nil
  a[256]=nil
  a[128]=nil
  a[64]=nil
  a[32]=nil
  a[16]=nil
  a[8]=nil
  a[4]=nil
  a[2]=nil
  game.print(#a)
above example most likely will print "1" because binary search hits all the gaps.

From the opposite end, there is this example:

Code: Select all

/c
  local a = {}
  a[512] = 1
  a[768] = 1
  a[896] = 1
  a[960] = 1
  a[992] = 1
  a[1008] = 1
  a[1016] = 1
  a[1020] = 1
  a[1022] = 1
  a[1023] = 1
  game.print(#a)
where size will be 1023 because binary search hits only the non-nil elements.
User avatar
Muppet9010
Filter Inserter
Filter Inserter
Posts: 279
Joined: Sat Dec 09, 2017 6:01 pm
Contact:

Re: [Rseding91] [0.17.74] mass UnitGroup creation and deletion causes desync if stored in global table using #table

Post by Muppet9010 »

Yea i think the fact # ever found entries above the first nil is what surprised me and I assume lead to the author of this code reaching this situation where it generally "seemed" to work.
As you said its a Lua "feature" of the # operator, with your example code reproducing the same oddness in a standalone Lua demo site. Confirmed by if I lower the size to the table index to 1020, the # operator then behaves as expected, stopping at first nil value.

I was looking at it seeing if I thought it would be helpful to modders to have it documented, but as its just another odd Lua feature that would seem like the start of a long list.
Post Reply

Return to “Won't fix.”