[0.17.56] Performance issue with large rcon commands

veladon
Manual Inserter
Posts: 4
Joined: Sat Jun 27, 2015 3:36 pm

Post by veladon »

Hi

We have hit a performance issue in Clusterio when sending large RCON commands. The use case is syncing user inventories across servers; these include serialised blueprint strings held in players' inventories, so a single command can reach 250k characters. Sending such a 250k-character RCON request can take around a minute.

We have diagnosed that this issue is only seen if there is at least one player on the server - if there are no players on the server, the request is processed instantly (about 20ms).

I then ran some tests to see how long increasingly large commands take. With no players, every command takes roughly 20 ms (26 ms at most), tested up to 500k characters (duration in ms):

request size: 1000. Duration: 23
request size: 2000. Duration: 17
request size: 3000. Duration: 18
request size: 4000. Duration: 17
request size: 5000. Duration: 17

request size: 100000. Duration: 26
request size: 200000. Duration: 19
request size: 300000. Duration: 20
request size: 400000. Duration: 22
request size: 500000. Duration: 22
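
A timing test along these lines can be sketched as follows. This is not the actual test code mentioned in this thread. It assumes the server speaks the Source RCON packet format, and `padded_command` is a hypothetical helper that pads a Lua comment to the requested length so the command does nothing in game.

```python
import socket
import struct
import time

# Packet types from the Source RCON protocol.
SERVERDATA_AUTH = 3
SERVERDATA_AUTH_RESPONSE = 2
SERVERDATA_EXECCOMMAND = 2


def build_packet(request_id: int, packet_type: int, body: str) -> bytes:
    """Frame one packet: size, id, type, body, then two NUL terminators."""
    payload = (struct.pack("<ii", request_id, packet_type)
               + body.encode("utf-8") + b"\x00\x00")
    return struct.pack("<i", len(payload)) + payload


def recv_exact(sock: socket.socket, count: int) -> bytes:
    """Read exactly `count` bytes or raise if the connection closes."""
    buf = b""
    while len(buf) < count:
        chunk = sock.recv(count - len(buf))
        if not chunk:
            raise ConnectionError("RCON connection closed")
        buf += chunk
    return buf


def read_packet(sock: socket.socket):
    """Return (request_id, packet_type, body_bytes) for one packet."""
    (size,) = struct.unpack("<i", recv_exact(sock, 4))
    data = recv_exact(sock, size)
    request_id, packet_type = struct.unpack("<ii", data[:8])
    return request_id, packet_type, data[8:-2]


def padded_command(size: int) -> str:
    """Hypothetical test command: a Lua comment padded to `size` characters."""
    prefix = "/silent-command --"
    return prefix + "x" * max(0, size - len(prefix))


def time_commands(host: str, port: int, password: str, sizes):
    """Send one padded command per size; return (size, duration in ms) pairs."""
    results = []
    with socket.create_connection((host, port)) as sock:
        sock.sendall(build_packet(1, SERVERDATA_AUTH, password))
        # Skip anything the server sends before the auth response.
        while True:
            request_id, packet_type, _ = read_packet(sock)
            if packet_type == SERVERDATA_AUTH_RESPONSE:
                break
        if request_id == -1:
            raise PermissionError("RCON authentication failed")
        for request_id, size in enumerate(sizes, start=2):
            started = time.perf_counter()
            sock.sendall(build_packet(request_id, SERVERDATA_EXECCOMMAND,
                                      padded_command(size)))
            read_packet(sock)  # wait for the (empty) response
            elapsed_ms = round((time.perf_counter() - started) * 1000)
            results.append((size, elapsed_ms))
    return results
```

Calling `time_commands("127.0.0.1", 27015, "secret", [1000, 2000, 3000])` would produce rows like the ones listed here; the host, port and password are placeholders.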

With just me logged on, I saw the following (duration in ms):

request size: 1000. Duration: 177
request size: 2000. Duration: 350
request size: 3000. Duration: 514
request size: 4000. Duration: 696
request size: 5000. Duration: 843
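
The durations above grow close to linearly with request size. As a rough check, a least-squares fit over these five measurements can be extrapolated to a 250k-character command. This is only a sketch: it assumes the linear trend continues far beyond the measured range, which these numbers alone do not prove.

```python
# Measurements from the post above: (request size in characters, duration in ms)
samples = [(1000, 177), (2000, 350), (3000, 514), (4000, 696), (5000, 843)]


def fit_line(points):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(samples)
ms_per_1000 = slope * 1000
# Extrapolation far outside the measured range, so treat it as an estimate.
predicted_250k_s = (slope * 250_000 + intercept) / 1000

print(f"{ms_per_1000:.1f} ms per 1000 characters")
print(f"estimated time for 250k characters: {predicted_250k_s:.1f} s")
```

The fit gives roughly 168 ms per 1000 characters and an estimate of about 42 seconds for 250k characters, which is in the same range as the "around a minute" reported at the top of this post.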


From a Clusterio team member's conversations with Oxyd, it seems that this is due to having to synchronise with clients. Is there a way to run RCON commands that do not interact with clients without being subject to this restriction?


Thanks
veladon

psihius
Fast Inserter
Posts: 192
Joined: Mon Dec 15, 2014 12:47 am

Re: [0.17.56] Performance issue with large rcon commands

Post by psihius »

Also, this starts to rear its ugly head under high load, i.e. at or close to the 100% CPU utilisation wall. On our production cluster we start to have issues with 10 users going about their business, even though inventory sync does not happen often (most of the time it just polls for an empty response, and even those polls get heavily backlogged). On our test cluster, however, with 40 users sync dropping to a server, there were no issues syncing everything within 5 seconds of joining, and that's a serious amount of data being pushed through, probably over 1MB/sec for a few seconds.

If you need more data, we have web-based consoles, logging and all kinds of goodies you can come and see in person :)
@veladon also made some test code to reproduce the issue :)

Very pretty please, send help!

kovarex
Factorio Staff
Posts: 8078
Joined: Wed Feb 06, 2013 12:00 am

Re: [0.17.56] Performance issue with large rcon commands

Post by kovarex »

veladon wrote:
Mon Jul 15, 2019 9:42 am
From clusterio team member's conversations with Oxyd, it seems that this is due to having to synchronise with clients. Is there a way to allow us to run rcon commands that do not interact with clients outside of this restriction?
No, there is obviously not, as the command affects the gameplay. Fragmenting the command is a wanted feature, to avoid bursts of data having to be sent to all the clients.

I would propose trying to minimize the amount of data you send; a 250k command is just huge.
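
To illustrate why fragmenting a command spreads its delivery over time, here is a toy model. The thread does not state the real segment size or how many segments go out per tick, so the values below (one 100-byte segment per tick) are assumptions picked for illustration only, not Factorio's actual settings.

```python
import math

TICKS_PER_SECOND = 60        # Factorio's default game update rate
ASSUMED_SEGMENT_BYTES = 100  # hypothetical; the real value is not given in this thread


def split_into_segments(payload: bytes, segment_size: int) -> list[bytes]:
    """Cut a payload into consecutive segments of at most `segment_size` bytes."""
    return [payload[i:i + segment_size]
            for i in range(0, len(payload), segment_size)]


def delivery_seconds(payload_len: int,
                     segment_size: int = ASSUMED_SEGMENT_BYTES,
                     ticks_per_second: int = TICKS_PER_SECOND) -> float:
    """Time to deliver a payload if exactly one segment is sent per tick (assumed)."""
    ticks = math.ceil(payload_len / segment_size)
    return ticks / ticks_per_second


segments = split_into_segments(b"x" * 1000, ASSUMED_SEGMENT_BYTES)
print(len(segments), "segments for a 1000-byte command")
print(f"{delivery_seconds(250_000):.1f} s for a 250k-byte command")
```

Under these assumed values the model lands in the same range as the durations measured earlier in the thread, but that does not confirm the real segment size. The point is only that delivery time scales with payload size divided by segment size, so sending less data or using larger segments both shorten it.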

psihius
Fast Inserter
Posts: 192
Joined: Mon Dec 15, 2014 12:47 am

Re: [0.17.56] Performance issue with large rcon commands

Post by psihius »

kovarex wrote:
Mon Jul 15, 2019 10:34 am
No, there is obviously not, as the command affects the gameplay. Fragmenting the command is a wanted feature, to avoid bursts of data having to be sent to all the clients.

I would propose trying to minimize the amount of data you send; a 250k command is just huge.
We had a long conversation about this whole thing with Oxyd on Discord, and the situation is quite a bit more nuanced than Yes/No :)
I will leave it to him to relay the info in a proper format with an explanation. There are also Discord channel logs to read if needed ;)

We will attempt to minimise the data, but Factorio data structures are, well, verbose when serialised (the same issue JSON has: named indexes repeated 100 times blow up the "structure" to "data" ratio to almost 1 to 1 or even worse). So there are actually 2 things we came up with that can and will help with this. We don't need the whole 250k command to go in one tick, but we do need a certain "minimal" rate of data flow and a max cap so it does not go into the ridiculous territory of abuse :)
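
The "structure" to "data" overhead of repeated named keys can be measured with a small sketch like the one below. The inventory layout and the field names `name` and `count` are hypothetical, not Clusterio's actual format, and the compact form shown is just one possible way to state the keys once.

```python
import json

# Hypothetical inventory: 100 slots, each repeating the same two key names.
inventory = [{"name": f"item-{i}", "count": i} for i in range(100)]

# Verbose form: every slot carries its own copy of the key names.
verbose = json.dumps(inventory, separators=(",", ":"))

# Compact form: key names stated once, slots stored as plain rows.
fields = ["name", "count"]
rows = [[slot[f] for f in fields] for slot in inventory]
compact = json.dumps({"fields": fields, "rows": rows}, separators=(",", ":"))

# Characters that are actual values, as opposed to keys and punctuation.
data_chars = sum(len(slot["name"]) + len(str(slot["count"])) for slot in inventory)
structure_chars = len(verbose) - data_chars

# The compact form decodes back to the same inventory.
decoded = json.loads(compact)
restored = [dict(zip(decoded["fields"], row)) for row in decoded["rows"]]

print("verbose:", len(verbose), "compact:", len(compact))
print("structure chars:", structure_chars, "data chars:", data_chars)
```

In this toy case the key names and punctuation take more characters than the values themselves, and stating the keys once removes most of that overhead without losing information.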

Oxyd
Former Staff
Posts: 1428
Joined: Thu May 07, 2015 8:42 am

Re: [0.17.56] Performance issue with large rcon commands

Post by Oxyd »

psihius wrote:
Mon Jul 15, 2019 10:48 am
We had a long conversation about this whole thing with Oxyd on Discord, and the situation is quite a bit more nuanced than Yes/No :)
I will leave it to him to relay the info in a proper format with an explanation. There are also Discord channel logs to read if needed ;)

We will attempt to minimise the data, but Factorio data structures are, well, verbose when serialised (the same issue JSON has: named indexes repeated 100 times blow up the "structure" to "data" ratio to almost 1 to 1 or even worse). So there are actually 2 things we came up with that can and will help with this. We don't need the whole 250k command to go in one tick, but we do need a certain "minimal" rate of data flow and a max cap so it does not go into the ridiculous territory of abuse :)
I was going to move it to Not a Bug myself when I noticed Kovarex was faster. Because, really, it isn't one – it's working as designed.

Starting with 0.17.57, RCON commands are compressed internally, which should speed this up. You will also be able to tune the segment size in the next release, so you can play with that.

But the bottom line is that our multiplayer wasn't designed for this, so you'll always have to accept some sort of compromise.
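
As a rough illustration of why compression helps with this kind of payload, the sketch below compresses a repetitive serialised inventory. The thread does not say which algorithm Factorio uses internally, so `zlib` is only a stand-in here, and the inventory layout is hypothetical.

```python
import json
import zlib

# Hypothetical, highly repetitive payload similar in spirit to a serialised inventory.
slots = [{"name": "iron-plate", "count": 100, "slot": i} for i in range(2000)]
payload = json.dumps(slots).encode("utf-8")

# zlib is a stand-in; the actual internal compression is not named in this thread.
compressed = zlib.compress(payload)

# Compression is lossless: the original bytes come back unchanged.
restored = zlib.decompress(compressed)

print("raw bytes:", len(payload))
print("compressed bytes:", len(compressed))
print("round trip ok:", restored == payload)
```

Repeated key names and values compress very well, so fewer bytes have to be fragmented and sent to clients. Data that is already compressed would be expected to shrink much less.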

psihius
Fast Inserter
Posts: 192
Joined: Mon Dec 15, 2014 12:47 am

Re: [0.17.56] Performance issue with large rcon commands

Post by psihius »

Oxyd wrote:
Tue Jul 16, 2019 9:22 pm
I was going to move it to Not a Bug myself when I noticed Kovarex was faster. Because, really, it isn't one – it's working as designed.

Starting with 0.17.57, RCON commands are compressed internally, which should speed this up. You will also be able to tune the segment size in the next release, so you can play with that.

But the bottom line is that our multiplayer wasn't designed for this, so you'll always have to accept some sort of compromise.
Thank you, compromise is usually the best middle ground. I do believe this will be sufficient for whatever we come up with in the future, and compression already helps a lot.

We will work on optimising our side too: there are some things we can do as well (and have been doing for the past few days) ;)
