[1.1.72] Multiplayer join fails when SRV request is caught by a CNAME record

Post by rhyser9 »

Factorio 1.1.72 (build 60222, win64, steam)

I'm attempting to host a Factorio server accessed via direct connection, rather than advertised via the public server list. For testing, I'm using the domain zone2.rbarrie.us, which has an A record pointing to my home IP, and a wildcard CNAME record aliasing *.zone2.rbarrie.us to zone2.rbarrie.us. The server is intended to be accessed via factorio.zone2.rbarrie.us.

@               IN      A       98.209.243.3
*               IN      CNAME   zone2.rbarrie.us.
I've found that when the Factorio client attempts to look up the SRV record for _factorio._udp.factorio.zone2.rbarrie.us, since there is no valid SRV record defined, the CNAME record gets returned instead. This appears to be valid behavior per RFC 4592 section 4.5. "SRV RRSet at a Wildcard Domain Name" (https://www.ietf.org/rfc/rfc4592.txt). However, something about this response causes the Factorio client to attempt to connect to the multiplayer server on port 0, rather than falling back to the default port 34197.

1375.748 Info SocketAddress.cpp:175: DNS SRV lookup returned zone2.rbarrie.us:0 for factorio.zone2.rbarrie.us
1375.792 Joining game IP ADDR:({98.209.243.3:0})
1375.792 Info UDPSocket.cpp:33: Opening socket
1375.792 Info ClientMultiplayerManager.cpp:610: UpdateTick(4294967295) changing state from(Ready) to(Connecting)
1375.792 Verbose RouterBase.cpp:60: Started router thread.
1380.875 Verbose TransmissionControlHelper.cpp:170: Fragment 0000 failed too many times
1381.726 Info ClientMultiplayerManager.cpp:152: Disconnecting multiplayer connection. Reason: Quit.
1381.726 Info ClientMultiplayerManager.cpp:610: UpdateTick(4294967295) changing state from(Connecting) to(Disconnected)
1381.730 Verbose RouterBase.cpp:82: Finishing router thread.
1381.730 Info UDPSocket.cpp:218: Closing socket
1381.730 Info UDPSocket.cpp:248: Socket closed

Domain Name System (response)
    Transaction ID: 0xc595
    Flags: 0x8180 Standard query response, No error
    Questions: 1
    Answer RRs: 1
    Authority RRs: 1
    Additional RRs: 0
    Queries
        _factorio._udp.factorio.zone2.rbarrie.us: type SRV, class IN
            Name: _factorio._udp.factorio.zone2.rbarrie.us
            [Name Length: 40]
            [Label Count: 6]
            Type: SRV (Server Selection) (33)
            Class: IN (0x0001)
    Answers
        _factorio._udp.factorio.zone2.rbarrie.us: type CNAME, class IN, cname zone2.rbarrie.us
            Name: _factorio._udp.factorio.zone2.rbarrie.us
            Type: CNAME (Canonical NAME for an alias) (5)
            Class: IN (0x0001)
            Time to live: 60 (1 minute)
            Data length: 2
            CNAME: zone2.rbarrie.us

Domain Name System (response)
    Transaction ID: 0xa1c0
    Flags: 0x8180 Standard query response, No error
    Questions: 1
    Answer RRs: 1
    Authority RRs: 0
    Additional RRs: 0
    Queries
        zone2.rbarrie.us: type A, class IN
            Name: zone2.rbarrie.us
            [Name Length: 16]
            [Label Count: 3]
            Type: A (Host Address) (1)
            Class: IN (0x0001)
    Answers
        zone2.rbarrie.us: type A, class IN, addr 98.209.243.3
            Name: zone2.rbarrie.us
            Type: A (Host Address) (1)
            Class: IN (0x0001)
            Time to live: 60 (1 minute)
            Data length: 4
            Address: 98.209.243.3
By contrast, if I have zone1.rbarrie.us in which the wildcard CNAME is replaced by an explicit CNAME for factorio.zone1.rbarrie.us, the Factorio client correctly identifies that there are no valid SRV records, and defaults to port 34197.

@               IN      A       98.209.243.3
factorio        IN      CNAME   zone1.rbarrie.us.

1370.377 Verbose SocketAddress.cpp:177: DNS SRV lookup for _factorio._udp.factorio.zone1.rbarrie.us didn't return any usable records
1370.391 Joining game IP ADDR:({98.209.243.3:34197})
1370.391 Info UDPSocket.cpp:33: Opening socket
1370.392 Info ClientMultiplayerManager.cpp:610: UpdateTick(4294967295) changing state from(Ready) to(Connecting)
1370.392 Verbose RouterBase.cpp:60: Started router thread.
1370.408 Info ClientMultiplayerManager.cpp:211: Quitting multiplayer connection.
1370.408 Info ClientMultiplayerManager.cpp:610: UpdateTick(4294967295) changing state from(Connecting) to(Disconnected)
1370.412 Verbose RouterBase.cpp:82: Finishing router thread.
1370.412 Info UDPSocket.cpp:218: Closing socket
1370.412 Info UDPSocket.cpp:248: Socket closed

Domain Name System (response)
    Transaction ID: 0x928a
    Flags: 0x8183 Standard query response, No such name
    Questions: 1
    Answer RRs: 0
    Authority RRs: 1
    Additional RRs: 0
    Queries
        _factorio._udp.factorio.zone1.rbarrie.us: type SRV, class IN
            Name: _factorio._udp.factorio.zone1.rbarrie.us
            [Name Length: 40]
            [Label Count: 6]
            Type: SRV (Server Selection) (33)
            Class: IN (0x0001)

Domain Name System (response)
    Transaction ID: 0xdee3
    Flags: 0x8180 Standard query response, No error
    Questions: 1
    Answer RRs: 2
    Authority RRs: 0
    Additional RRs: 0
    Queries
        factorio.zone1.rbarrie.us: type A, class IN
            Name: factorio.zone1.rbarrie.us
            [Name Length: 25]
            [Label Count: 4]
            Type: A (Host Address) (1)
            Class: IN (0x0001)
    Answers
        factorio.zone1.rbarrie.us: type CNAME, class IN, cname zone1.rbarrie.us
            Name: factorio.zone1.rbarrie.us
            Type: CNAME (Canonical NAME for an alias) (5)
            Class: IN (0x0001)
            Time to live: 40 (40 seconds)
            Data length: 2
            CNAME: zone1.rbarrie.us
        zone1.rbarrie.us: type A, class IN, addr 98.209.243.3
            Name: zone1.rbarrie.us
            Type: A (Host Address) (1)
            Class: IN (0x0001)
            Time to live: 40 (40 seconds)
            Data length: 4
            Address: 98.209.243.3
The expected behavior is: if the Factorio client makes a SRV request, and receives a response that does not contain any SRV records, then it should identify that the "DNS SRV lookup for [xyz] didn't return any usable records" and fall back to the default port 34197.
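
The fallback described above can be sketched in a few lines of Python. This is only an illustration of the expected behavior, not Factorio's actual code: the `Record` type and the `choose_endpoint` function are hypothetical names, and SRV priority/weight handling is left out. Only the default port 34197 and the zone2 record data come from the captures above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

DEFAULT_PORT = 34197  # default multiplayer port, as seen in the zone1 log


@dataclass
class Record:
    """Hypothetical, simplified view of one record in the answer section."""
    rtype: str                  # "SRV", "CNAME", "A", ...
    target: str                 # SRV target or CNAME target
    port: Optional[int] = None  # only meaningful for SRV records


def choose_endpoint(host: str, answers: List[Record]) -> Tuple[str, int]:
    """Return (host, port) to connect to after an SRV query for `host`."""
    # Keep only SRV records that carry a non-zero port.
    srv = [r for r in answers if r.rtype == "SRV" and r.port]
    if not srv:
        # No usable SRV record (for example, only a synthesised CNAME came
        # back): keep the original host name and use the default port.
        return host, DEFAULT_PORT
    # Priority/weight selection is omitted; take the first SRV record.
    return srv[0].target, srv[0].port


# zone2: the SRV query is answered with a CNAME only.
zone2 = [Record("CNAME", "zone2.rbarrie.us")]
print(choose_endpoint("factorio.zone2.rbarrie.us", zone2))

# A made-up SRV answer with an explicit port is honoured.
with_srv = [Record("SRV", "host.example", 40001)]
print(choose_endpoint("factorio.example", with_srv))
```

With the zone2 answer this yields the original host name and port 34197 rather than port 0.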

Full factorio-current.log file showing connection attempts to both factorio.zone1.rbarrie.us and factorio.zone2.rbarrie.us:
factorio-current.log

Post by SoShootMe »

rhyser9 wrote: Sun Nov 13, 2022 9:03 pm The expected behavior is: if the Factorio client makes a SRV request, and receives a response that does not contain any SRV records, then it should identify that the "DNS SRV lookup for [xyz] didn't return any usable records" and fall back to the default port 34197.
Ideally, the client would follow the recommended procedure in RFC2782 (page 6), in particular:
...
If the reply is NOERROR, ANCOUNT>0 and there is at least one
SRV RR which specifies the requested Service and Protocol in
the reply:
...
The difference between the two cases in your description is that in the first one, zone2.rbarrie.us (from the synthesised record _factorio._udp.factorio.zone2.rbarrie.us. IN CNAME zone2.rbarrie.us.) exists but has no SRV record (no error, but obviously no answer of the query type), while in the second, _factorio._udp.factorio.zone1.rbarrie.us. does not exist (non-existent domain). The key point is that "no error" does not necessarily imply there is an answer of the query type, which is in part what leads to the conditions I quoted above.
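
To make the distinction concrete, here is a small Python sketch that applies the quoted condition to the header flags and answer types from the two captures in the first post. The function names are made up for illustration, and the check is simplified: it does not verify that the SRV record matches the requested Service and Protocol.

```python
from typing import Iterable

RCODE_NOERROR = 0
RCODE_NXDOMAIN = 3
TYPE_CNAME = 5   # record type numbers as shown in the packet captures
TYPE_SRV = 33


def rcode(flags: int) -> int:
    """The response code is the low 4 bits of the 16-bit DNS header flags."""
    return flags & 0x000F


def has_usable_srv(flags: int, answer_types: Iterable[int]) -> bool:
    """Simplified RFC 2782 test: NOERROR, ANCOUNT > 0, at least one SRV RR."""
    types = list(answer_types)
    return (
        rcode(flags) == RCODE_NOERROR
        and len(types) > 0
        and TYPE_SRV in types
    )


# zone2: flags 0x8180 (No error), one answer, of type CNAME.
print(rcode(0x8180), has_usable_srv(0x8180, [TYPE_CNAME]))  # 0 False
# zone1: flags 0x8183 (No such name), no answers.
print(rcode(0x8183), has_usable_srv(0x8183, []))            # 3 False
# A response that really carries an SRV record.
print(rcode(0x8180), has_usable_srv(0x8180, [TYPE_SRV]))    # 0 True
```

Both of the reported cases fail the test, so both should end up on the default port. Only the response code differs between them.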

Post by vinzenz »

Are the DNS records still up? Can't reproduce on linux.

Post by SoShootMe »

vinzenz wrote: Wed Nov 16, 2022 11:03 am Are the DNS records still up? Can't reproduce on linux.
Could it be platform-specific? I can reproduce on 1.1.70 win64, using (a subdomain of) a domain under my control:

$ORIGIN example.
something IN A x.x.x.x
*.something IN CNAME something
; no factorio.something IN CNAME record, which would cause zero answers in the response to the SRV query (a possible workaround for the issue)
; no factorio.something IN A record(s), which would cause zero answers in the response to the SRV query (a possible workaround for the issue)
; no something IN SRV record(s), which would be provided in the response to the SRV query (not useful in this kind of setup anyway)
; no _factorio._udp.factorio.something IN SRV record(s), which would be provided in the response to the SRV query (a possible workaround for the issue)
Multiplayer -> Connect to address -> IP address and port: factorio.something.example

The effect is as in the OP; port 0 is apparently "selected" instead of recognising the absence of any SRV record in the response to the SRV query.

Post by rhyser9 »

@vinzenz I republished the zones and added a few more for testing a couple of the options @SoShootMe described.

factorio.zone1.rbarrie.us
  • factorio IN CNAME zone1.rbarrie.us.
  • returns no results for SRV query
  • client connects to default port
factorio.zone2.rbarrie.us
  • * IN CNAME zone2.rbarrie.us.
  • returns CNAME for SRV query
  • client connects to port 0
factorio.zone3.rbarrie.us
  • * IN CNAME zone3.rbarrie.us.
  • factorio IN CNAME zone3.rbarrie.us.
  • returns no results for SRV query (interesting... didn't expect this)
  • client connects to default port
factorio.zone4.rbarrie.us
  • * IN CNAME zone4.rbarrie.us.
  • factorio IN A 98.209.243.3
  • returns no results for SRV query (same as zone3)
  • client connects to default port
factorio.zone5.rbarrie.us
  • * IN CNAME zone5.rbarrie.us.
  • _factorio._udp.factorio IN SRV 0 0 40001 factorio.zone5.rbarrie.us.
  • this SRV record is invalid per RFC, as a SRV record cannot point to a CNAME
  • returns SRV record and correctly identifies port
  • fails to resolve factorio.zone5.rbarrie.us CNAME/A record
factorio.zone6.rbarrie.us
  • * IN CNAME zone6.rbarrie.us.
  • factorio IN SRV 0 0 40001 factorio.zone6.rbarrie.us.
  • this SRV record is also invalid per RFC, as a SRV record cannot point to a CNAME. no idea if just declaring "factorio" is valid or not
  • not sure if this is what @SoShootMe meant by "no something IN SRV record(s), which would be provided in the response to the SRV query"
  • returns CNAME for SRV query
  • client connects to port 0
Removing the wildcard CNAME record fixes things (as expected), but in this situation that requires explicitly defining DNS records for all other services on this same endpoint, which isn't feasible/maintainable.

Adding the full SRV record (properly pointing to an A record NOT a CNAME) works, but I wouldn't expect many people to be aware that this is required. The only reason I had any idea Factorio could use SRV records is due to the single forum thread which requested the feature (viewtopic.php?p=572281#p572281).

Explicitly defining the factorio CNAME (or A record) to override the wildcard CNAME works fine. I didn't expect this; I would probably have assumed that the SRV query would still be caught by the wildcard, but that's correct behavior now that I look. This is a valid workaround for my use case.

Post by SoShootMe »

rhyser9 wrote: Wed Nov 16, 2022 6:01 pm factorio.zone3.rbarrie.us
  • * IN CNAME zone3.rbarrie.us.
  • factorio IN CNAME zone3.rbarrie.us.
  • returns no results for SRV query (interesting... didn't expect this)
  • client connects to default port
Because the (conceptual) search in the authoritative server uses the name, in order of increasing levels; query type is secondary. factorio.zone3.rbarrie.us. exists but nothing under it does, so given a query for any type at _factorio._udp.factorio.zone3.rbarrie.us. the search stops at that point and results in NXDOMAIN. The presence of the wildcard doesn't have any effect. DNS is fun! :p
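
The search described above can be modelled roughly in Python. This is a toy model of an authoritative lookup with wildcard synthesis, not any real server's code: the `answer` function and the dictionary zone layout are invented for illustration, names are written without the trailing dot, and CNAME chasing is left out.

```python
from typing import Dict, Optional, Set, Tuple

Zone = Dict[str, Dict[str, str]]  # owner name -> {record type: data}


def existing_names(zone: Zone) -> Set[str]:
    """Owner names plus all their ancestors (empty non-terminals exist too)."""
    names: Set[str] = set()
    for owner in zone:
        labels = owner.split(".")
        for i in range(len(labels)):
            names.add(".".join(labels[i:]))
    return names


def answer(zone: Zone, qname: str, qtype: str) -> Tuple[str, str, str]:
    """Return (rcode, record type, data); the name is matched before the type."""
    exists = existing_names(zone)
    source = qname
    if qname not in exists:
        # Walk up to the closest encloser: the longest existing ancestor.
        labels = qname.split(".")
        encloser: Optional[str] = next(
            (".".join(labels[i:]) for i in range(1, len(labels))
             if ".".join(labels[i:]) in exists),
            None,
        )
        # Only a wildcard directly below the closest encloser can match.
        if encloser is None or "*." + encloser not in zone:
            return ("NXDOMAIN", "", "")
        source = "*." + encloser
    records = zone.get(source, {})
    if "CNAME" in records and qtype != "CNAME":
        return ("NOERROR", "CNAME", records["CNAME"])
    if qtype in records:
        return ("NOERROR", qtype, records[qtype])
    return ("NOERROR", "", "")  # name exists, but no data of this type


ADDR = {"A": "98.209.243.3"}
zone2 = {"zone2.rbarrie.us": ADDR,
         "*.zone2.rbarrie.us": {"CNAME": "zone2.rbarrie.us"}}
zone3 = {"zone3.rbarrie.us": ADDR,
         "*.zone3.rbarrie.us": {"CNAME": "zone3.rbarrie.us"},
         "factorio.zone3.rbarrie.us": {"CNAME": "zone3.rbarrie.us"}}

# zone2: closest encloser is the apex, so the wildcard CNAME is synthesised.
print(answer(zone2, "_factorio._udp.factorio.zone2.rbarrie.us", "SRV"))
# zone3: closest encloser is factorio.zone3.rbarrie.us, which has no
# wildcard below it, so the result is NXDOMAIN despite the apex wildcard.
print(answer(zone3, "_factorio._udp.factorio.zone3.rbarrie.us", "SRV"))
```

In this model, adding the explicit factorio record moves the closest encloser one level down, which is why the wildcard at the apex stops matching names under factorio.
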
factorio.zone6.rbarrie.us
  • * IN CNAME zone6.rbarrie.us.
  • factorio IN SRV 0 0 40001 factorio.zone6.rbarrie.us.
  • this SRV record is also invalid per RFC, as a SRV record cannot point to a CNAME. no idea if just declaring "factorio" is valid or not
  • not sure if this is what @SoShootMe meant by "no something IN SRV record(s), which would be provided in the response to the SRV query"
  • returns CNAME for SRV query
  • client connects to port 0
For the sake of clarity, what I meant was if you have:

$ORIGIN zone6.rbarrie.us.
@ IN A x.x.x.x
@ IN SRV ...
* IN CNAME @
A query for _factorio._udp.factorio.zone6.rbarrie.us IN SRV will give an answer with a synthesised CNAME record and the SRV record. Such a SRV record isn't useful in combination with the other records because the same SRV will be in the answer for _service._proto.anything.zone6.rbarrie.us.

Post by vinzenz »

So how should Factorio behave in each of these cases? In my testing on linux it didn't detect a SRV record for each of the zones. For zone1 and zone2 it tried to connect to 98.209.243.3:34197, for zone3-6 it failed with "$domain is not a valid address (Name or service not known)".
SoShootMe wrote: Wed Nov 16, 2022 7:41 pm Adding the full SRV record (properly pointing to an A record NOT a CNAME) works, but I wouldn't expect many people to be aware that this is required. The only reason I had any idea Factorio could use SRV records is due to the single forum thread which requested the feature (viewtopic.php?p=572281#p572281).
I'm planning to document SRV records, but I want more community testing first ;) SRV record pointing to a CNAME seems to work for bigcommunitygames.com

Post by SoShootMe »

vinzenz wrote: Thu Nov 17, 2022 7:22 am So how should Factorio behave in each of these cases? In my testing on linux it didn't detect a SRV record for each of the zones. For zone1 and zone2 it tried to connect to 98.209.243.3:34197, for zone3-6 it failed with "$domain is not a valid address (Name or service not known)".
The only problem I have seen is with factorio.zone2.rbarrie.us (as in the OP and equivalent to what I generalised in an earlier post) and I replicated it a few minutes ago in 1.1.70 win64. Port 0 is used, as reported with 1.1.72 win64.

Assuming nothing changed since your earlier testing (likely), if you're not seeing it in Linux, it sounds like the problem is platform-specific.
SoShootMe wrote: Wed Nov 16, 2022 7:41 pm Adding the full SRV record (properly pointing to an A record NOT a CNAME) works, but I wouldn't expect many people to be aware that this is required. The only reason I had any idea Factorio could use SRV records is due to the single forum thread which requested the feature (viewtopic.php?p=572281#p572281).
These are not my words :).
I'm planning to document SRV records, but I want more community testing first ;) SRV record pointing to a CNAME seems to work for bigcommunitygames.com
RFC2782 p4 states:
Target
The domain name of the target host. There MUST be one or more
address records for this name, the name MUST NOT be an alias (in
the sense of RFC 1034 or RFC 2181).
The ideal is for this misconfiguration to be tolerated, but that is separate to the issue at hand.

Post by vinzenz »

SoShootMe wrote: Thu Nov 17, 2022 11:54 am The only problem I have seen is with factorio.zone2.rbarrie.us (as in the OP and equivalent to what I generalised in an earlier post) and I replicated it a few minutes ago in 1.1.70 win64. Port 0 is used, as reported with 1.1.72 win64.

Assuming nothing changed since your earlier testing (likely), if you're not seeing it in Linux, it sounds like the problem is platform-specific.
Yeah I already have an idea of what's going wrong.
SoShootMe wrote: Thu Nov 17, 2022 11:54 am These are not my words :).
haha, looks like I've confused the phpBB quote button