[raiguard][2.0.45] No multiplayer connectivity after 620 distinct mods (inside podman container)
I found that if more than 620 distinct mods are present in the mods directory, the game won't connect at all to remote servers with mods.
I have reproduced this with 2.0.43 and 2.0.45, both with Space Age, and remote servers are headless.
UDP traffic still goes out and the server replies but it looks like the game drops the communication when this happens.
Last edited by Evio on Mon Apr 28, 2025 4:06 pm, edited 1 time in total.
Re: [2.0.45] No multiplayer connectivity after 620 distinct mods
Could you zip and upload your mods directory somewhere so that I could attempt to reproduce this? I do not know of any such game limitation around mods-on-disk and don't look forward to trying to click download on 620 mods on the mod portal.
If you want to get ahold of me I'm almost always on Discord.
Re: [2.0.45] No multiplayer connectivity after 620 distinct mods
From previous work reverse-engineering the Multiplayer protocol I remember that the Mods/Version portion of the handshake had what appeared to be a "Length" field to indicate how many frames were to follow? I was not sure if that was part of the UDP system headers or was added by Factorio itself, but changing it caused the game to Reject the Client with a Mod error. It may simply be that 620 Mods (640 is 80 bytes; minus a bit of padding) is exceeding the max length of a Struct in the network code somewhere.
Re: [2.0.45] No multiplayer connectivity after 620 distinct mods
The mods count length field supports up to 4'294'967'295 mods so I doubt that's the issue.
Re: [2.0.45] No multiplayer connectivity after 620 distinct mods (inside podman container)
I confirm that this happens when running Factorio inside a container (podman) with mods/ as an overlay mount. It happens regardless of whether the mods are present in the upper or lower layer of the overlay.
I have 7.0G of mods and a mobile internet connection, so it's not possible for me to share the mods. I could share a script that I drafted that downloads all mods from a plain text list, but that's still overkill: the mere presence of 620+ mods with a valid info.json inside is enough to trigger the bug, so generating fake mods with a script works:
Code: Select all
#!/bin/bash
# bash is required: the {0..620} brace expansion is not available in plain POSIX sh.
mkdir empty
for num in {0..620}; do
  # Each fake mod only needs a distinct name in a valid info.json.
  echo '{"name": "empty'"$num"'", "version": "0.1.0", "title": "Empty mod '"$num"'", "author": "nobody"}' >empty/info.json
  zip -r "empty$num"_0.1.0.zip empty
done
rm empty/info.json
rmdir empty
Place that as populate.sh (or run it directly) inside the lower or upper dir of the mount and run it to create 621 empty mods.
Other files have no effect; only zip files that were parsed by Factorio count, and there must be more than 620 distinct mods. Older versions of the same mods have no effect on this.
The overlay mount is:
Code: Select all
-v ./common/mods/:/mnt/server/test/mods:O,upperdir="server/test/mods",workdir="server/test/mods.workdir"
I just checked ulimit to make sure that it's not a ulimit issue, but all limits appear as unlimited both inside and outside the container.
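Independently of what ulimit reports, it can help to look at how many descriptors a process actually holds and how high their numbers go. The following is a minimal sketch, assuming Linux with /proc mounted and Python 3; the name fd_report is made up for this example and nothing here comes from Factorio itself.

```python
import os
import resource


def fd_report(pid="self"):
    """Summarise the open file descriptors of a process (Linux /proc only)."""
    # Each entry in /proc/<pid>/fd is named after one open descriptor number.
    fds = sorted(int(name) for name in os.listdir(f"/proc/{pid}/fd"))
    # Note: these are the limits of the *calling* process, not of `pid`.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return {
        "open_count": len(fds),
        "highest_fd": fds[-1] if fds else None,
        "soft_limit": soft,
        "hard_limit": hard,
    }


if __name__ == "__main__":
    print(fd_report())
```

To inspect a running game instead of the script itself, pass its process id as the argument. A highest_fd at or above 1024 is the range that matters for the select() limitation discussed later in the thread.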
Re: [2.0.45] No multiplayer connectivity after 620 distinct mods (inside podman container)
So, I'm doing the following (on windows):
* I have 632 (zipped) mods in the mods folder - none of them are active.
* The in-game mods manager shows all of them as valid but inactive
* Join game by browse-LAN games -> join -> works fine
* Join game by connect to IP -> works fine
* Join game by browse public games -> join a game with some amount of mods -> after syncing and installing the mods -> works fine
My conclusion is you're hitting the ulimit for open file handles even though we explicitly ask the OS via setrlimit to increase it to 12544.
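For readers unfamiliar with the call: setrlimit changes a process's resource limits, and RLIMIT_NOFILE is the limit that governs open file descriptors. Below is a minimal Python sketch of raising the soft limit toward 12544, the figure mentioned above. The name raise_nofile_limit is hypothetical, and this only illustrates the system call; it is not Factorio's actual code.

```python
import resource


def raise_nofile_limit(target=12544):
    """Try to raise the soft RLIMIT_NOFILE to `target`; return the resulting soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unprivileged process cannot raise the soft limit above the hard limit.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    # Nothing to do if the soft limit is already unlimited or high enough.
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("soft RLIMIT_NOFILE is now", raise_nofile_limit())
```

Raising this limit only controls how many descriptors may be open; it does not change which descriptor numbers a given system call is able to handle.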
Re: [2.0.45] No multiplayer connectivity after 620 distinct mods (inside podman container)
Can you post a log file from a failed multiplayer attempt?
Re: [raiguard][2.0.45] No multiplayer connectivity after 620 distinct mods (inside podman container)
Here are the two log files, I removed the leading line timestamp for easy diff.
In the “work” log the server immediately responded with mod mismatch because that run had no mods.
The “fail” log is a run with 621 fake mods present, this one ended with connection timeout.
During the failed run the server still responded, though: with iptraf I could see UDP packets going to the server and the server responding, around 3 times per second, until the timeout. Mod downloading and updating still work fine; only game connectivity is affected.
I can make a minimal script to reproduce this in a container if that helps.
Attachments:
- evio.pod.work.sed.log (8.63 KiB)
- evio.pod.fail.sed.log (8.89 KiB)
Re: [raiguard][2.0.45] No multiplayer connectivity after 620 distinct mods (inside podman container)
Evio wrote: Tue Apr 29, 2025 3:51 pm
Here are the two log files, I removed the leading line timestamp for easy diff.
In the “work” log the server immediately responded with mod mismatch because that run had no mods.
The “fail” log is a run with 621 fake mods present, this one ended with connection timeout.
(Maybe post the unaltered log too, because the timestamp will show the delay between each line. And logs from both server and client if you have them.)
My mods: Multiple Unit Train Control, Smart Artillery Wagons
Maintainer of Vehicle Wagon 2, Cargo Ships, Honk
Re: [raiguard][2.0.45] No multiplayer connectivity after 620 distinct mods (inside podman container)
Raiguard looked into this and it seems it is due to file handle limits on Linux, except in this case it’s due to us using the select() function for network sockets and it failing when having > 1024 handles open (files, sockets, other, I guess). Switching it to poll() resolves the issue.
Windows doesn’t seem to share the same issue/its limits on handles are larger.
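The difference between the two calls can be reproduced outside the game. The following is a minimal sketch, assuming Linux and Python 3; demo_high_fd is a hypothetical helper and this is not how Factorio's network code is written. It duplicates a datagram socket onto a descriptor number at or above FD_SETSIZE and then waits on it with both calls. Note that CPython rejects such a descriptor in select.select() with a ValueError rather than passing it on to the C function.

```python
import fcntl
import os
import resource
import select
import socket

FD_SETSIZE = 1024  # value given for Linux in the select(2) man page


def demo_high_fd():
    """Wait on a descriptor numbered >= FD_SETSIZE with select() and with poll().

    Returns (select_ok, poll_ok), or None if the descriptor limit of this
    environment is too low to create such a descriptor.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    wanted = FD_SETSIZE + 16
    if soft != resource.RLIM_INFINITY and soft < wanted:
        if hard != resource.RLIM_INFINITY and hard < wanted:
            return None
        # Raises the soft limit for the rest of this process.
        resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))

    # A local datagram socket pair stands in for the game's UDP socket.
    tx, rx = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    # F_DUPFD returns the lowest free descriptor number >= FD_SETSIZE.
    high_fd = fcntl.fcntl(rx.fileno(), fcntl.F_DUPFD, FD_SETSIZE)
    try:
        tx.send(b"ping")  # make the receiving side readable

        try:
            readable, _, _ = select.select([high_fd], [], [], 1.0)
            select_ok = high_fd in readable
        except ValueError:
            # CPython: "filedescriptor out of range in select()"
            select_ok = False

        poller = select.poll()
        poller.register(high_fd, select.POLLIN)
        events = poller.poll(1000)  # timeout in milliseconds
        poll_ok = any(fd == high_fd and ev & select.POLLIN for fd, ev in events)
        return select_ok, poll_ok
    finally:
        os.close(high_fd)
        tx.close()
        rx.close()


if __name__ == "__main__":
    print(demo_high_fd())
```

In this sketch the datagram is delivered in both cases; only the select() path is unable to wait for it, which is consistent with the server replies being visible on the wire while the client times out.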
Re: [raiguard][2.0.45] No multiplayer connectivity after 620 distinct mods (inside podman container)
Was this problem actually reproduced? This is an issue with running inside OCI containers, such as the ones made with podman/docker. The problem doesn't happen if I run Factorio as a normal program on Linux; it only happens inside containers.
As for the other logs and timestamps: I don't have those logs now, but there was no unusual delay between actions and the server didn't acknowledge any connection attempts in the logs.
From: https://www.man7.org/linux/man-pages/man2/select.2.html
WARNING: select() can monitor only file descriptors numbers that
are less than FD_SETSIZE (1024)—an unreasonably low limit for many
modern applications—and this limitation will not change. All
modern applications should instead use poll(2) or epoll(7), which
do not suffer this limitation.
Makes sense, although TCP connections still work when the issue is triggered.
Last edited by Evio on Tue Apr 29, 2025 7:53 pm, edited 1 time in total.
Re: [raiguard][2.0.45] No multiplayer connectivity after 620 distinct mods (inside podman container)
I was told it was reproduced.
Re: [raiguard][2.0.45] No multiplayer connectivity after 620 distinct mods (inside podman container)
If the change makes it to a new Factorio version I'll try to replicate to confirm then. Thanks for the help.
Re: [raiguard][2.0.45] No multiplayer connectivity after 620 distinct mods (inside podman container)
Yes, I reproduced it on my machine (Fedora Linux, i9-10900k), but I had to up the number of mods to around 650. LAN games would not show up whatsoever and connecting to public games would fail (even though I could see them). Changing all usages of select() to poll() resolved the issue.
Don't forget, you're here forever.
Re: [raiguard][2.0.45] No multiplayer connectivity after 620 distinct mods (inside podman container)
Thanks for the report, this has been fixed for 2.0.48.