Network error messages are too vague to be useful.
Posted: Thu Jun 28, 2018 1:11 pm
When trying to connect to a friend's server, I got this error: "couldn't establish network communication with server"
That is far too vague to be helpful. What exactly does it mean? I can think of many things it might mean:
1. DNS look-up failure: Maybe I typed the domain name wrong. Maybe I should just try an IP address instead.
2. ENETUNREACH - Probably my Ethernet cable came unplugged or my Wi-Fi connection died.
3. EHOSTUNREACH - Maybe I typo'd the IP address.
4. ETIMEDOUT - The computer the server runs on is probably turned off.
5. ECONNREFUSED - Either the server software isn't running, or my friend didn't forward the port properly, or Windows firewall is blocking the connection.
6. Factorio protocol negotiation failures - Maybe we're running different versions of Factorio. Maybe this version of Factorio is just broken.
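To illustrate how little it takes to tell these cases apart, here is a rough Python sketch that turns the error the OS already reports into a specific message. This is not how Factorio does it (I have no idea what its networking code looks like); the function name describe_failure, the HINTS table and the hint wording are all made up for the example.

```python
import errno
import socket

# Hypothetical hint table: the wording is illustrative, not Factorio's.
HINTS = {
    errno.ENETUNREACH: "network unreachable: check your cable or Wi-Fi connection",
    errno.EHOSTUNREACH: "host unreachable: check the IP address for typos",
    errno.ETIMEDOUT: "timed out: the server machine may be turned off",
    errno.ECONNREFUSED: "connection refused: server not running, port not "
                        "forwarded, or a firewall is blocking it",
}


def describe_failure(exc: OSError) -> str:
    """Turn a socket-level exception into a specific, searchable message."""
    # socket.gaierror is raised when name resolution (DNS) fails.
    if isinstance(exc, socket.gaierror):
        return f"DNS lookup failed ({exc.strerror}): check the domain name or try an IP address"
    code = exc.errno
    # A Python-level socket timeout carries no errno, so map it by hand.
    if code is None and isinstance(exc, socket.timeout):
        code = errno.ETIMEDOUT
    name = errno.errorcode.get(code, "unknown error")
    hint = HINTS.get(code)
    if hint is None:
        # Fall back to whatever text the OS supplied.
        return f"{name}: {exc.strerror}"
    return f"{name}: {hint}"
```

A caller would wrap its connection attempt in try/except OSError and show describe_failure(exc) to the user instead of one generic string. Even just the errno name on its own would give people something to search for.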
I never figured out the cause. My friend checked his port forwarding and it seemed to be correct. We tried connecting the other way (him connecting to my computer) to verify that the game wasn't just broken, and that worked fine. So then I had a look at the console hoping to find a more specific error message, but all I found was this:
181.743 Info ClientMultiplayerManager.cpp:573: MapTick(4294967295) changing state from(Ready) to(Connecting)
183.574 Warning TransmissionControlHelper.cpp:176: Fragment 0000 failed too many times
184.424 Error ClientMultiplayerManager.cpp:95: MultiplayerManager failed: multiplayer.not-received-connection-accept-reply
However, "not-received-connection-accept-reply" is also an error message that could mean many things. It could mean ECONNREFUSED, or it could mean the connection was accepted but the server then failed to respond to some Factorio-protocol request to join the game. In particular, "Fragment 0000 failed too many times" sounds like maybe it was connected and trying to transfer game data? I don't know, as a "fragment" could be almost anything too.
All I knew was that it didn't seem to be ECONNREFUSED, as my friend said he could see activity on his end when I tried to connect, even though my own game gave no indication that it had made any contact with the server at all. So I thought maybe the game uses both TCP and UDP and he had only forwarded one, but he checked that and said both were forwarded.
So I began to use Linux's "strace" program (which displays the system calls a program makes, their results, and the data transferred) to determine what the actual problem was, since Factorio clearly wasn't going to tell me. But then the problem spontaneously went away, so I never found out what it was.
Anyway, regardless of what the actual problem was, users shouldn't have to resort to strace to figure out which part of connecting to a multiplayer game has failed. Factorio knows exactly what went wrong when it was trying to establish network communication with the server. If it just said what that was, users could use that information to narrow down the list of possible fixes. A specific error message is one that users can Google to find specific steps that might actually resolve their problem. Vague error messages just lead to long lists of completely random suggestions, most of which have nothing to do with the actual problem.