[Oxyd] [1.1.53/Linux] RCON crash after repeated connections

gadgeteering
Manual Inserter
Posts: 3
Joined: Mon Feb 07, 2022 2:00 am

Post by gadgeteering »

I think that I have managed to reliably crash my Factorio server by steadily making RCON calls.

Platform: Linux (Headless Server)
Version: 1.1.53
Mods: None

Server command:

Code:

/home/redacted/.local/factorio-server/bin/x64/factorio --start-server crashtest.zip --rcon-port 5000 --rcon-password foobar
I wanted to have an accurate game-time tally that I could display outside of Factorio. To do this, I created a script that would connect to a headless server via RCON and get the output of the `/time` command. The plan on my end was to be able to query for this data via websocket in order to power an OBS overlay.

I first hit the crash on a map that was created under 1.1.50 and later run under 1.1.53. I then reproduced it on a fresh map generated under 1.1.53. My testing suggests to me that the map isn't a factor, but you might find something different.

The crash seems to happen after approximately 535 successful RCON calls against localhost for me. Test results:

Code:

* Test 1: Crash
  * Old Map
  * 533 successful RCON calls
  * Interval between RCON calls: 1s
* Test 2: Crash
  * New Map
  * 535 successful RCON calls
  * Interval between RCON calls: 1s
* Test 3: Crash
  * New Map
    * All tests from here on out are on the New Map
  * 535 successful RCON calls
  * Interval between RCON calls: 1s
  * RCON calls started a few minutes after the server
* Test 4: Crash
  * 535 successful RCON calls
  * Interval between RCON calls: 0.5s
  * Problem accelerated with smaller interval, useful for faster testing
* Test 5: Crash
  * 535 successful RCON calls
  * Interval between RCON calls: 2s
  * Problem slowed down with greater interval
* Test 6: Crash
  * 535 successful RCON calls
  * Interval between RCON calls: 0.25s  
  * No Factorio client connected to server, useful for faster testing
* Test 7: Crash
  * 735 successful RCON calls (100 + 100 + 535)
  * Interval between RCON calls: 0.25s
  * RCON test-script interrupted at 100 calls, restarted, interrupted again at 100 calls (200 total) and then restarted after 10 minutes
* Test 8: Crash
  * 635 successful RCON calls (100 + 535)
  * Interval between RCON calls: 0.25s
  * RCON test-script stopped after 100 calls, then immediately restarted
    * I tested 2 changes with Test 7, so this test was run to address that
* Test 9: Crash
  * 535 RCON calls
    * Using an incorrect RCON password, so no 'success'.
  * Interval between RCON calls: 0.25s
* Test 10: No Crash
  * 535 RCON calls exactly. Not attempting a 536th call
  * Interval between RCON calls: 0.25s
* Test 11: Crash
  * 535 successful RCON calls (266+269)
  * 2 client script instances
  * Interval between RCON calls per script: 0.25s
  * The first run of this test crashed at 110 successful calls (53 + 57), but since this was from the same server run as Test 10 I'm labelling that run as a mistrial.
* Test 12: No Crash
  * Client script performs 100 calls before exiting. The script is immediately called again in a loop up to 100 times or until the server crashes. If successful, there will have been 10,000 RCON commands.
  * Interval between RCON calls: 0.25s
* Test 13: Crash
  * 509 successful RCON calls
  * Instead of connecting to a loopback address as I had been up to now, I wanted to test with a remote connection. Since the only machines available for running my script or the Factorio server are currently on my local network, I used a separate script to relay TCP connections back to myself from a further-away machine.
  * Interval between RCON calls per script: 0.25s
When I discovered that interrupting my script would refresh my quota of calls, it started to look more like a problem with the RCON client library that I was using. The library makes a new TCP connection with each call, and I'm not familiar enough with RCON to know if this is standard. However, I'm not sure what to make of the problem after Test 11 showed that two clients could reach the crashing point together.
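For anyone equally unfamiliar with RCON: going by Valve's Source RCON Protocol page (the same page linked in the fixed script further down this thread), each request is a small length-prefixed packet sent over an open TCP stream. Below is a minimal sketch of that framing, assuming Factorio follows it. `encode_packet` and `decode_packet` are hypothetical helper names for illustration, not functions from the rcon library.

```python
import struct

# Packet type values as listed on Valve's Source RCON Protocol page.
SERVERDATA_AUTH = 3
SERVERDATA_AUTH_RESPONSE = 2
SERVERDATA_EXECCOMMAND = 2
SERVERDATA_RESPONSE_VALUE = 0


def encode_packet(request_id: int, packet_type: int, body: str,
                  encoding: str = "utf-8") -> bytes:
    """Frame one packet: size, id, type, body, then two null bytes."""
    payload = (
        struct.pack("<ii", request_id, packet_type)  # little-endian int32s
        + body.encode(encoding)
        + b"\x00\x00"  # body terminator plus the empty trailing string
    )
    # The size field counts everything after itself.
    return struct.pack("<i", len(payload)) + payload


def decode_packet(data: bytes, encoding: str = "utf-8"):
    """Inverse of encode_packet for one complete packet."""
    (size,) = struct.unpack_from("<i", data, 0)
    request_id, packet_type = struct.unpack_from("<ii", data, 4)
    body = data[12:4 + size - 2].decode(encoding)  # drop the two null bytes
    return request_id, packet_type, body


packet = encode_packet(1, SERVERDATA_EXECCOMMAND, "/time")
print(len(packet), decode_packet(packet))
```

Nothing in this framing ties a command to a fresh connection, so whether a client reconnects for every call is a choice made by the client library.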

Common stack trace (this would often be printed multiple times per crash when the client would try again after a timeout):

Code:

 540.495 Info RemoteCommandProcessor.cpp:241: New RCON connection from IP ADDR:({127.0.0.1:55420})
 540.495 Error CrashHandler.cpp:633: Received SIGSEGV
Factorio crashed. Generating symbolized stacktrace, please wait ...
Raw stacktrace: 0xa10257, 0xa1092d, 0xd6f7c5, 0xd70261, 0xd70319, 0x3da70, 0xe9f660, 0xed4c7d, 0x20b65d0, 0x93f9, 0
 541.496 Info RemoteCommandProcessor.cpp:241: New RCON connection from IP ADDR:({127.0.0.1:55422})
 541.496 Error CrashHandler.cpp:633: Received SIGSEGV
Factorio crashed. Generating symbolized stacktrace, please wait ...
Raw stacktrace: 0xa10257, 0xa1092d, 0xd6f7c5, 0xd70261, 0xd70319, 0x3da70, 0xe9f660, 0xed4c7d, 0x20b65d0, 0x93f9, 0
 542.496 Info RemoteCommandProcessor.cpp:241: New RCON connection from IP ADDR:({127.0.0.1:55424})
 542.496 Error CrashHandler.cpp:633: Received SIGSEGV
Factorio crashed. Generating symbolized stacktrace, please wait ...
Raw stacktrace: 0xa10257, 0xa1092d, 0xd6f7c5, 0xd70261, 0xd70319, 0x3da70, 0xe9f660, 0xed4c7d, 0x20b65d0, 0x93f9, 0
 543.497 Info RemoteCommandProcessor.cpp:241: New RCON connection from IP ADDR:({127.0.0.1:55426})
 543.497 Error CrashHandler.cpp:633: Received SIGSEGV
Factorio crashed. Generating symbolized stacktrace, please wait ...
Raw stacktrace: 0xa10257, 0xa1092d, 0xd6f7c5, 0xd70261, 0xd70319, 0x3da70, 0xe9f660, 0xed4c7d, 0x20b65d0, 0x93f9, 0
 544.498 Info RemoteCommandProcessor.cpp:241: New RCON connection from IP ADDR:({127.0.0.1:55428})
 544.498 Error CrashHandler.cpp:633: Received SIGSEGV
Factorio crashed. Generating symbolized stacktrace, please wait ...
Raw stacktrace: 0xa10257, 0xa1092d, 0xd6f7c5, 0xd70261, 0xd70319, 0x3da70, 0xe9f660, 0xed4c7d, 0x20b65d0, 0x93f9, 0
 545.499 Info RemoteCommandProcessor.cpp:241: New RCON connection from IP ADDR:({127.0.0.1:55430})
 545.499 Error CrashHandler.cpp:633: Received SIGSEGV
Factorio crashed. Generating symbolized stacktrace, please wait ...
Raw stacktrace: 0xa10257, 0xa1092d, 0xd6f7c5, 0xd70261, 0xd70319, 0x3da70, 0xe9f660, 0xed4c7d, 0x20b65d0, 0x93f9, 0
 546.499 Info RemoteCommandProcessor.cpp:241: New RCON connection from IP ADDR:({127.0.0.1:55432})
 546.499 Error CrashHandler.cpp:633: Received SIGSEGV
Factorio crashed. Generating symbolized stacktrace, please wait ...
Raw stacktrace: 0xa10257, 0xa1092d, 0xd6f7c5, 0xd70261, 0xd70319, 0x3da70, 0xe9f660, 0xed4c7d, 0x20b65d0, 0x93f9, 0
 547.500 Info RemoteCommandProcessor.cpp:241: New RCON connection from IP ADDR:({127.0.0.1:55434})
 547.500 Error CrashHandler.cpp:633: Received SIGSEGV
Factorio crashed. Generating symbolized stacktrace, please wait ...
Raw stacktrace: 0xa10257, 0xa1092d, 0xd6f7c5, 0xd70261, 0xd70319, 0x3da70, 0xe9f660, 0xed4c7d, 0x20b65d0, 0x93f9, 0
 548.064 Warning Logger.cpp:526: Symbols.size() == 17, usedSize == 10
#0  0x0000000000a1092d in std::__uniq_ptr_impl<LoggerFileWriteStream, std::default_delete<LoggerFileWriteStream> >::_M_ptr() const at /home/build/gcc-9.2/include/c++/9.2.0/bits/unique_ptr.h:154
#1  0x0000000000d6f7c5 in std::unique_ptr<LoggerFileWriteStream, std::default_delete<LoggerFileWriteStream> >::get() const at /home/build/gcc-9.2/include/c++/9.2.0/bits/unique_ptr.h:353
#2  0x0000000000d70261 in std::unique_ptr<LoggerFileWriteStream, std::default_delete<LoggerFileWriteStream> >::operator->() const at /home/build/gcc-9.2/include/c++/9.2.0/bits/unique_ptr.h:347
#3  0x0000000000d70319 in Logger::flush() at /tmp/factorio-build-0NpO1H/src/Util/Logger.cpp:566
#4  0x000000000003da70 in Logger::logStacktrace(StackTraceInfo*) at /tmp/factorio-build-0NpO1H/src/Util/Logger.cpp:552
#5  0x0000000000e9f660 in GlobalContext::getMap() at /tmp/factorio-build-0NpO1H/src/GlobalContext.cpp:2052
#6  0x0000000000ed4c7d in CrashHandler::writeStackTrace(CrashHandler::CrashReason) at /tmp/factorio-build-0NpO1H/src/Util/CrashHandler.cpp:188
#7  0x00000000020b65d0 in CrashHandler::commonSignalHandler(int) at /tmp/factorio-build-0NpO1H/src/Util/CrashHandler.cpp:635
#8  0x00000000000093f9 in CrashHandler::SignalHandler(int) at /tmp/factorio-build-0NpO1H/src/Util/CrashHandler.cpp:650
#9  (nil) in ?? at ??:0
#10 (nil) in TCPSocket::recv(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) at /tmp/factorio-build-0NpO1H/src/Net/TCPSocket.cpp:137
#11 (nil) in RemoteCommandProcessor::RconInterface::updateClient(RemoteCommandProcessor::RconInterface::Client&) at /tmp/factorio-build-0NpO1H/src/RemoteCommandProcessor.cpp:263
#12 (nil) in std::default_delete<std::thread::_State>::operator()(std::thread::_State*) const at /home/build/gcc-9.2-source/gcc-9.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/unique_ptr.h:81
#13 (nil) in std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >::~unique_ptr() at /home/build/gcc-9.2-source/gcc-9.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/unique_ptr.h:284
#14 (nil) in execute_native_thread_routine at /home/build/gcc-9.2-source/gcc-9.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/src/c++11/../../../../../libstdc++-v3/src/c++11/thread.cc:79
#15 (nil) in ?? at ??:0
#16 (nil) in ?? at ??:0
The stack trace from Test 13 printed slightly differently, though it still seems to hit similar beats. Note that 10.20.30.40 is a redacted address:

Code:

296.520 Info RemoteCommandProcessor.cpp:241: New RCON connection from IP ADDR:({10.20.30.40:50378})
terminate called after throwing an instance of 'RuntimeError'
  what():  select failed: Bad file descriptor
 296.525 Info RemoteCommandProcessor.cpp:241: New RCON connection from IP ADDR:({10.20.30.40:50380})
< ~25 more 'New RCON connection' lines >
 303.541 Info RemoteCommandProcessor.cpp:241: New RCON connection from IP ADDR:({10.20.30.40:50438})
303.577 Warning Logger.cpp:526: Symbols.size() == 24, usedSize == 17
Factorio crashed. Generating symbolized stacktrace, please wait ...
Raw stacktrace: 0x9c74c5, 0xd6f7c5, 0xd70261, 0xd70319, 0x3da70, 0, 0xb, 0x5cc977, 0x203b176, 0x203b1c1, 0x203b2f4, 0x43d4be, 0xe9f646, 0xed4c7d, 0x20b65d0, 0x93f9, 0
 303.790 Info RemoteCommandProcessor.cpp:241: New RCON connection from IP ADDR:({10.20.30.40:50440})
< ~25 more 'New RCON connection' lines >
 310.562 Info RemoteCommandProcessor.cpp:241: New RCON connection from IP ADDR:({10.20.30.40:50494})
 310.619 Warning Logger.cpp:526: Symbols.size() == 19, usedSize == 16
#0  0x0000000000d6f7c5 in GlobalContext::getMap() at /tmp/factorio-build-0NpO1H/src/GlobalContext.cpp:2052
#1  0x0000000000d70261 in CrashHandler::writeStackTrace(CrashHandler::CrashReason) at /tmp/factorio-build-0NpO1H/src/Util/CrashHandler.cpp:188
#2  0x0000000000d70319 in CrashHandler::commonSignalHandler(int) at /tmp/factorio-build-0NpO1H/src/Util/CrashHandler.cpp:635
#3  0x000000000003da70 in CrashHandler::SignalHandler(int) at /tmp/factorio-build-0NpO1H/src/Util/CrashHandler.cpp:650
#4  (nil) in ?? at ??:0
#5  0x000000000000000b in ?? at ??:0
#6  0x00000000005cc977 in ?? at ??:0
#7  0x000000000203b176 in __gnu_cxx::__verbose_terminate_handler() at /home/build/gcc-9.2-source/gcc-9.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/../../../../libstdc++-v3/libsupc++/vterminate.cc:75
#8  0x000000000203b1c1 in __cxxabiv1::__terminate(void (*)()) at /home/build/gcc-9.2-source/gcc-9.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/../../../../libstdc++-v3/libsupc++/eh_terminate.cc:47
#9  0x000000000203b2f4 in std::terminate() at ??:?
#10 0x000000000043d4be in __cxa_throw at ??:?
#11 0x0000000000e9f646 in TCPSocket::wait() at /tmp/factorio-build-0NpO1H/src/Net/TCPSocket.cpp:258
#12 0x0000000000ed4c7d in TCPSocket::recv(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) at /tmp/factorio-build-0NpO1H/src/Net/TCPSocket.cpp:131
#13 0x00000000020b65d0 in RemoteCommandProcessor::RconInterface::updateClient(RemoteCommandProcessor::RconInterface::Client&) at /tmp/factorio-build-0NpO1H/src/RemoteCommandProcessor.cpp:263
#14 0x00000000000093f9 in std::default_delete<std::thread::_State>::operator()(std::thread::_State*) const at /home/build/gcc-9.2-source/gcc-9.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/unique_ptr.h:81
#15 (nil) in std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >::~unique_ptr() at /home/build/gcc-9.2-source/gcc-9.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/unique_ptr.h:284
#16 (nil) in execute_native_thread_routine at /home/build/gcc-9.2-source/gcc-9.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/src/c++11/../../../../../libstdc++-v3/src/c++11/thread.cc:79
#17 (nil) in ?? at ??:0
#18 (nil) in ?? at ??:0
Stack trace logging done
 310.619 Error CrashHandler.cpp:189: Map tick at moment of crash: 10610
 310.619 Error Util.cpp:97: Unexpected error occurred. If you're running the latest version of the game you can help us solve the problem by posting the contents of the log file on the Factorio forums.
Please also include the save file(s), any mods you may be using, and any steps you know of to reproduce the crash.
 310.619 Uploading log file
 310.625 Info SystemUtil.cpp:554: Started /home/redacted/.local/factorio-server/bin/x64/factorio; trampoline PID: 3454492
 310.625 Error CrashHandler.cpp:605: Unhandled exception type: 12RuntimeError
 310.625 Error CrashHandler.cpp:612: Unhandled exception: select failed: Bad file descriptor
Below is a version of my script that I made in order to explore this issue.

Code:

#!/usr/bin/env python3

from rcon import rcon
from rcon.exceptions import WrongPassword
from sys import argv
from time import perf_counter

import asyncio


def hexit(exit_code):
    print('./crash_demo.py host port password [interval] [max_successes]')
    print('default interval: 1s')
    exit(exit_code)


async def main(args):

    # Quick arg parsing
    ##
    if len(args) < 3:
        hexit(1)

    host = args[0]
    try:
        port = int(args[1])
    except ValueError:
        print('Port could not be converted to int')
        hexit(1)
    passwd = args[2]

    interval = 1
    if len(args) > 3:
        try:
            interval = float(args[3])
            if interval <= 0:
                raise ValueError()
        except ValueError:
            print('invalid interval')
            hexit(1)

    max_successes = 0
    if len(args) > 4:
        try:
            max_successes = int(args[4])
            if max_successes < 0:
                raise ValueError()
        except ValueError:
            print('invalid max successes')
            hexit(1)

    # Summary
    ##
    settings = (
        f'getting `/time` output from {host}:{port} (interval: {interval} seconds)'
    )
    settings_successes = f'quitting after {max_successes} successes'

    def print_settings():
        print(settings)
        if max_successes:
            print(settings_successes)

    print_settings()

    async def do_update():
        # Inner command to actually connect to the server

        # In this demo, just return
        return await rcon('/time', host=host, port=port, passwd=passwd)

    # Wait for at most 1 second
    timeout = 1.0

    # Track successful connections
    successes = 0
    # Track consecutive failures (after initial success)
    failures = 0
    # Special tracking
    failures_passwords = 0

    exit_code = 0
    while not max_successes or successes < max_successes:
        # Try eternally or until we reach our target number of successes.
        # If we have 3 consecutive failures, then the server probably
        #  isn't coming back

        time_start = perf_counter()

        try:
            data = await asyncio.wait_for(do_update(), timeout=timeout)
            successes += 1

            # Reset failure tally
            failures = 0
        except WrongPassword:
            failures_passwords += 1
            data = f'WRONG PASSWORD ({failures_passwords} so far)'

            # I don't want to count a WrongPassword as failure.
            # Added accounting for WrongPassword in order to see
            #   if the server could be brought down by too many requests
            #   with bad passwords (it could).
        except asyncio.TimeoutError:
            data = 'TIMEOUT'

            failures += 1

        except ConnectionRefusedError:
            # Server not running, splattered against a closed port
            data = 'NOT RUNNING (connection refused)'

            failures += 1

        except Exception as e:
            data = f'Unexpected problem ({type(e)}): {e}'
            # count as a failure regardless of success tally
            failures += 1

        print(f'({successes} successes): {data}')

        time_end = perf_counter()

        if failures >= 3:
            exit_code = 1
            break

        # Sleep before next loop
        await asyncio.sleep(max(0, interval - (time_end - time_start)))

    # Print settings again for the sake of staggered tests
    print_settings()
    print('END')

    exit(exit_code)


if __name__ == '__main__':
    try:
        asyncio.run(main(argv[1:]))
    except KeyboardInterrupt:
        exit(127)
Usage:

Code:

./crash_demo.py host port password [interval] [max_successes]
For the moment:
  • This demonstrates that RCON access should be restricted through firewall rules to the addresses that legitimate commands are expected to come from, even if a password is in place. A malicious user could easily crash any server that they could reach via RCON.
  • I'm going to try adjusting my main script to invoke another Python script that will do the actual connection. 100% a cheat, but if it works then it keeps my plans on track.
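The separate-invocation workaround can be sketched roughly as follows. This is only an illustration of the idea, not the script that was actually used: `run_once_in_child` is a hypothetical helper, and the demo child is a stand-in one-liner rather than a real RCON query. The point is that when a child process exits, the operating system closes every socket it opened, whether or not the library inside it did.

```python
import subprocess
import sys


def run_once_in_child(child_args, timeout=5.0):
    """Run one short-lived Python child process and return its stdout.

    All sockets opened by the child are released by the OS when it exits.
    """
    result = subprocess.run(
        [sys.executable, *child_args],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()


# Stand-in child for the demo. A real one would be a script (hypothetical
# name: query_time.py) that makes a single RCON call and prints the reply.
print(run_once_in_child(["-c", "print('reply from child')"]))
```

The trade-off is that every call pays for a fresh interpreter start-up, which is probably acceptable at one call per second but less so at much shorter intervals.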
Attachments
factorio-crashlog-established-game.log
Oxyd
Former Staff
Posts: 1428
Joined: Thu May 07, 2015 8:42 am

Re: [Oxyd] [1.1.53/Linux] RCON crash after repeated connections

Post by Oxyd »

gadgeteering wrote: Mon Feb 07, 2022 7:19 am The library makes a new TCP connection with each call, and I'm not familiar enough with RCON to know if this is standard.
That's fine. But the library never closes those connections, so you keep making new ones until the game breaks at around 500 simultaneous RCON connections.

I added a limit of 128 simultaneous RCON connections because I don't think there's any reasonable use case for more than that.
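The actual fix lives inside Factorio and its code isn't shown in this thread, so the following is only a toy illustration of the general idea of capping simultaneous connections, written as a small asyncio server. The number 128 comes from the post above; `CappedServer`, the OK/BUSY greeting, and the demo are all made up for the example.

```python
import asyncio

MAX_RCON_CONNECTIONS = 128  # the limit quoted in the post


class CappedServer:
    """Toy TCP server that refuses clients beyond a fixed number."""

    def __init__(self, limit: int = MAX_RCON_CONNECTIONS) -> None:
        self.limit = limit
        self.active = 0
        self.rejected = 0

    async def handle(self, reader, writer) -> None:
        if self.active >= self.limit:
            # Over the cap: tell the client and hang up straight away.
            self.rejected += 1
            writer.write(b"BUSY\n")
            await writer.drain()
            writer.close()
            await writer.wait_closed()
            return
        self.active += 1
        try:
            writer.write(b"OK\n")
            await writer.drain()
            # Hold the slot until the client disconnects.
            while await reader.read(1024):
                pass
        finally:
            self.active -= 1
            writer.close()
            await writer.wait_closed()


async def demo(limit: int = 3, attempts: int = 5):
    """Open `attempts` connections without closing any, like the leaky client."""
    capped = CappedServer(limit)
    server = await asyncio.start_server(capped.handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    writers, greetings = [], []
    for _ in range(attempts):
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        writers.append(writer)  # deliberately kept open during the loop
        greetings.append(await reader.readline())
    for writer in writers:  # clean up only once the demo is over
        writer.close()
        await writer.wait_closed()
    server.close()
    await server.wait_closed()
    return greetings.count(b"OK\n"), capped.rejected


print(asyncio.run(demo()))
```

With a limit of 3 and 5 attempts, the demo should report 3 accepted and 2 rejected connections. A client that never closes its sockets then gets refused instead of taking the server down.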
gadgeteering

Re: [Oxyd] [1.1.53/Linux] RCON crash after repeated connections

Post by gadgeteering »

Excellent, thanks!

I should have figured that I was over-complicating it in pondering a solution. I even had a watch-tally of relevant netstat stuff, but "just way too many open connections" didn't register with me.

On my end, it sounds like I'll still need to double-check available options for the library. Non-closures happened with two different examples straight out of their documentation. If that doesn't pan out, I'll need to find a new library or stick with the separate invocation strategy to force it closed.
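A netstat tally of the kind mentioned above could be sketched like this: count TCP connections per state for one port from `netstat -tn` output. `count_states` is a hypothetical helper, and the sample text is illustrative rather than captured from a real server.

```python
from collections import Counter


def count_states(netstat_output: str, port: int) -> Counter:
    """Count TCP connection states for rows that involve the given port."""
    states = Counter()
    suffix = f":{port}"
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows look like: Proto Recv-Q Send-Q Local Foreign State
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # skip the two header lines and anything malformed
        local, foreign, state = fields[3], fields[4], fields[5]
        if local.endswith(suffix) or foreign.endswith(suffix):
            states[state] += 1
    return states


# Illustrative sample in the layout that `netstat -tn` prints on Linux.
SAMPLE = """\
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 127.0.0.1:5000          127.0.0.1:55420         ESTABLISHED
tcp        0      0 127.0.0.1:55420         127.0.0.1:5000          ESTABLISHED
tcp        0      0 127.0.0.1:55418         127.0.0.1:5000          TIME_WAIT
tcp        0      0 127.0.0.1:22            127.0.0.1:40022         ESTABLISHED
"""

print(count_states(SAMPLE, 5000))
```

For live data, the same function could be fed the output of `subprocess.run(["netstat", "-tn"], capture_output=True, text=True).stdout`. An ESTABLISHED count that keeps climbing while the client runs is the sign of connections never being closed.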
gadgeteering

Re: [Oxyd] [1.1.53/Linux] RCON crash after repeated connections

Post by gadgeteering »

I've created a fixed version of my script that makes sure to close the connection. I'm posting it here in case anyone else runs into trouble using Python's rcon library with Factorio, before or after the new release, or finds this thread while searching for Python+RCON problems.

The script uses an adjusted copy of the rcon library's rcon function that makes sure to close the connection each time it's used. I'll be posting an issue on the GitHub project sometime today, so this change will hopefully be unnecessary in the future.
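As a general pattern for making sure a connection is closed each time, here is a minimal sketch using an async context manager around `asyncio.open_connection`. It is not the rcon library's code and not the exact approach in the script below: `open_stream` and the small echo server in the demo are hypothetical, there only to show that the socket gets closed even when the body raises.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def open_stream(host: str, port: int):
    """Open a TCP stream and guarantee it is closed on the way out."""
    reader, writer = await asyncio.open_connection(host, port)
    try:
        yield reader, writer
    finally:
        # Runs on normal exit, on exceptions and on cancellation alike.
        writer.close()
        await writer.wait_closed()


async def demo() -> bool:
    async def handle(reader, writer):
        # One-shot echo server: send back a single line, then hang up.
        writer.write(await reader.readline())
        await writer.drain()
        writer.close()
        await writer.wait_closed()

    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    try:
        async with open_stream("127.0.0.1", port) as (reader, writer):
            writer.write(b"hello\n")
            await writer.drain()
            reply = await reader.readline()
            raise RuntimeError("simulated failure after the reply")
    except RuntimeError:
        pass  # the context manager has already closed the stream
    server.close()
    await server.wait_closed()
    return reply == b"hello\n" and writer.is_closing()


print(asyncio.run(demo()))
```

The same guarantee can be had with a plain `try`/`finally` around the request; the context manager just keeps the cleanup in one place.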

Usage of "noncrash_demo.py" is the same as my original post's "crash_demo.py". At the time of my final edit it has sent over 6400 messages without issue, and the number of sockets described by "netstat -tn | grep -c 5000" is stable (a decent number of client-side TIME_WAITs, but they clean themselves up given time and aren't Factorio's problem).

Code:

#!/usr/bin/env python3

from sys import argv
from time import perf_counter
from rcon.exceptions import RequestIdMismatch, WrongPassword
from rcon.proto import Packet, Type

import asyncio


async def close_connection(writer: asyncio.StreamWriter) -> None:
    """Close the stream and wait until the socket is actually released."""
    writer.close()
    await writer.wait_closed()


async def communicate(
    reader: asyncio.StreamReader,
    writer: asyncio.StreamWriter,
    packet: Packet,
) -> Packet:
    """Send one packet and read one packet in reply."""

    writer.write(bytes(packet))
    await writer.drain()
    return await Packet.aread(reader)


async def rcon(
    command: str,
    *arguments: str,
    host: str,
    port: int,
    passwd: str,
    encoding: str = 'utf-8',
) -> str:
    """Runs a command asynchronously, closing the connection afterwards."""

    reader, writer = await asyncio.open_connection(host, port)
    try:
        login = Packet.make_login(passwd, encoding=encoding)
        response = await communicate(reader, writer, login)

        # Wait for SERVERDATA_AUTH_RESPONSE according to:
        # https://developer.valvesoftware.com/wiki/Source_RCON_Protocol
        while response.type != Type.SERVERDATA_AUTH_RESPONSE:
            response = await Packet.aread(reader)

        if response.id == -1:
            raise WrongPassword()

        request = Packet.make_command(command, *arguments, encoding=encoding)
        response = await communicate(reader, writer, request)
    finally:
        # Always close, including on a wrong password, an error or a
        # timeout, so that no connection is left open on the server.
        await close_connection(writer)

    if response.id != request.id:
        raise RequestIdMismatch(request.id, response.id)

    return response.payload.decode(encoding)


def hexit(exit_code):
    print('./noncrash_demo.py host port password [interval] [max_successes]')
    print('default interval: 1s')
    exit(exit_code)


async def main(args):

    # Quick arg parsing
    ##
    if len(args) < 3:
        hexit(1)

    host = args[0]
    try:
        port = int(args[1])
    except ValueError:
        print('Port could not be converted to int')
        hexit(1)
    passwd = args[2]

    interval = 1
    if len(args) > 3:
        try:
            interval = float(args[3])
            if interval <= 0:
                raise ValueError()
        except ValueError:
            print('invalid interval')
            hexit(1)

    max_successes = 0
    if len(args) > 4:
        try:
            max_successes = int(args[4])
            if max_successes < 0:
                raise ValueError()
        except ValueError:
            print('invalid max successes')
            hexit(1)

    # Summary
    ##
    settings = (
        f'getting `/time` output from {host}:{port} (interval: {interval} seconds)'
    )
    settings_successes = f'quitting after {max_successes} successes'

    def print_settings():
        print(settings)
        if max_successes:
            print(settings_successes)

    print_settings()

    async def do_update():
        # Inner command to actually connect to the server

        # In this demo, just return
        return await rcon('/time', host=host, port=port, passwd=passwd)

    # Wait for at most 1 second
    timeout = 1.0

    # Track successful connections
    successes = 0
    # Track consecutive failures (after initial success)
    failures = 0
    # Special tracking
    failures_passwords = 0

    exit_code = 0
    while not max_successes or successes < max_successes:
        # Try eternally or until we reach our target number of successes.
        # If we have 3 consecutive failures, then the server probably
        #  isn't coming back

        time_start = perf_counter()

        try:
            data = await asyncio.wait_for(do_update(), timeout=timeout)
            successes += 1

            # Reset failure tally
            failures = 0
        except WrongPassword:
            failures_passwords += 1
            data = f'WRONG PASSWORD ({failures_passwords} so far)'

            # I don't want to count a WrongPassword as failure.
            # Added accounting for WrongPassword in order to see
            #   if the server could be brought down by too many requests
            #   with bad passwords (it could).
        except asyncio.TimeoutError:
            data = 'TIMEOUT'

            failures += 1

        except ConnectionRefusedError:
            # Server not running (distinguish from the default message)
            data = 'NOT RUNNING (connection refused)'

            failures += 1

        except Exception as e:
            data = f'Unexpected problem ({type(e)}): {e}'
            # count as a failure regardless of success tally
            failures += 1

        print(f'({successes} successes): {data}')

        time_end = perf_counter()

        if failures >= 3:
            exit_code = 1
            break

        # Sleep before next loop
        await asyncio.sleep(max(0, interval - (time_end - time_start)))

    # Print settings again for the sake of staggered tests
    print_settings()
    print('END')

    exit(exit_code)


if __name__ == '__main__':
    try:
        asyncio.run(main(argv[1:]))
    except KeyboardInterrupt:
        exit(127)
