
View Issue Details
ID: 0004064
Project: Zandronum
Category: [All Projects] Bug
View Status: public
Date Submitted: 2022-12-22 22:25
Last Update: 2023-07-27 01:53
Reporter: Zalewa
Assigned To: DrinkyBird
Priority: normal
Severity: minor
Reproducibility: sometimes
Status: resolved
Resolution: fixed
Platform: PC
OS: Windows
OS Version: 11
Product Version: 3.2
Target Version: 3.2
Fixed in Version: 3.2
Summary: 0004064: Trouble with big UDP packets in launcher query responses from game servers
Description: The issue affects game servers, not the master server.

With more and more information being transmitted in the launcher query, the UDP packet that is sent as the reply can be big enough to cause transmission problems.

As an example, there's currently a TSPG server at 104.128.58.120:10711 that is running almost 30 WADs. Moreover, up to 32 players may be connected to this server. When you also consider that Zandronum is sending a checksum for each WAD, the number of bytes quickly grows. I checked this server and it's sending a 2KB UDP packet in the response.

A user (King Dumb) has reported problems querying this server when using Windows 11. It would always reply with the "Refreshed too fast" status. The reason for this is that the query packet from his launcher reaches the server, but the 2KB reply from the server is dropped somewhere in transmission, possibly at the network stack of his OS.

Now, I remember that the problem of oversized packets was already foreseen in the master server query. The master server sends the query response split into multiple packets of smaller size. But the game servers don't do that. I think it may be necessary to introduce the same split into this protocol too.
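
For illustration, application-level splitting of this kind could look roughly like the Python sketch below. The two-byte header (segment index, segment count), the 1400-byte datagram budget, and the function names are assumptions made for this example; they are not the master server's or Zandronum's actual wire format.

```python
import struct

# Hypothetical layout and limits, chosen for illustration only.
MAX_DATAGRAM = 1400                # stay below a typical 1500-byte MTU
HEADER = struct.Struct("<BB")      # segment index, segment count
CHUNK = MAX_DATAGRAM - HEADER.size  # payload bytes carried per segment


def split_response(payload: bytes) -> list[bytes]:
    """Split one large reply into datagrams no larger than MAX_DATAGRAM."""
    chunks = [payload[i:i + CHUNK] for i in range(0, len(payload), CHUNK)] or [b""]
    if len(chunks) > 255:
        raise ValueError("payload needs more segments than one byte can count")
    return [HEADER.pack(i, len(chunks)) + c for i, c in enumerate(chunks)]


def join_response(datagrams: list[bytes]) -> bytes:
    """Reassemble segments, which may arrive in any order."""
    parts = {}
    count = 0
    for d in datagrams:
        index, count = HEADER.unpack_from(d)
        parts[index] = d[HEADER.size:]
    if count == 0 or set(parts) != set(range(count)):
        raise ValueError("missing or unexpected segments")
    return b"".join(parts[i] for i in range(count))
```

With these assumed numbers, a 2KB reply like the one in the example would leave the server as two datagrams instead of one oversized one.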
Additional Information: I cannot replicate this problem myself, on either Windows 10 or Ubuntu 22.04. The server from the example responds just fine for me, whereas for King Dumb it was impossible to get a proper response from this server.

Discord message where I diagnose the problem: https://discord.com/channels/297616756636254218/479693063128875008/1055582870229491834

MS forums where people discuss a similar issue with UDP and big packets: https://social.technet.microsoft.com/Forums/en-US/965e107e-d9b0-4240-ac3f-74797c91b476/unable-to-send-udp-packets-larger-than-the-mtu-with-windows-build-1809-using-c-udpclient?forum=win10itpronetworking

- Relationships
related to 0004130 (needs testing, Zalewa): Doomseeker - Doomseeker requests with flag SQF2_PWAD_HASHES even if flag 'check the integrity of local wads' is disabled
related to 0004142 (resolved, Zalewa): Doomseeker - Support the Zandronum's segmented server query response

-  Notes
(0022675)
duke (reporter)
2023-01-05 19:23

I hope the server protocol doesn't become more complicated with multi-packet responses just because of some Windows bug. Microsoft may fix their software eventually but added protocol complexity will stay forever.

A possible workaround for the people impacted by this problem would be to use https://doomlist.net/ to avoid the need to fully query servers from their end. Being able to join games directly from the website would be a lot easier for people if my proposed change in issue 0004015 was merged.

(0022743)
Zalewa (developer)
2023-01-26 15:29
edited on: 2023-01-26 15:31

Quote from duke
I hope the server protocol doesn't become more complicated with multi-packet responses just because of some Windows bug. Microsoft may fix their software eventually but added protocol complexity will stay forever.

These are just two sentences but there are numerous problems with them:

1. Even though the problem is described on Microsoft's page and linked there to a certain Windows build, the issue with UDP "jumbo" packets isn't limited to just Windows. We can't say this is "just a Windows bug" when other networking equipment or even the ISP may be at fault.
2. What makes you think that Microsoft will "fix their software"?
3. Why are you not accounting for "their software" that Microsoft might not fix and people may still use? Doomseeker 1.4 still runs on Windows XP.
4. Why are you so concerned about the complexity of the protocol? Isn't it the job of the software developers to manage the complexity? What makes you equate complexity with complications, especially since the master server is already segmenting its packets and there are no problems with it?

Quote from duke
A possible workaround for the people impacted by this problem would be to use https://doomlist.net/ to avoid the need to fully query servers from their end. Being able to join games directly from the website would be a lot easier for people if my proposed change in issue 0004015 was merged.

This is incorrect, because Doomseeker still needs to query the server to learn about its WADs - this is mandatory. We can reduce the packet size here, because in such a scenario Doomseeker is asking for many things it doesn't need, but a server that hosts a sufficiently large number of WADs will still trigger the problem.

(0022750)
duke (reporter)
2023-01-26 18:51
edited on: 2023-01-26 22:08

I take back my workaround suggestion; you are right that a server with a large number of wads may go over the packet size limit and prevent the affected people from joining even when using the website links. My mistake.

All I meant to say is, let's not rush into complicating the protocol before we know there isn't a better solution. I don't think we should just accept that some network stack decided that a 2KB packet is "too big" and devs need to work around that.

To answer your points:
1) Indeed we don't have enough information to tell if this is just a Windows bug or not, though it seems likely to me. We need reports from the affected people about their configuration to investigate more.

2) I said Microsoft MAY fix it, I don't know if they will.

3) I don't want to leave people behind either. Hopefully as we get more understanding of the problem we will know if there are any fixes or viable workarounds.

4) More complexity means more potential bugs, more work for people developing the protocol or tools using that protocol. Hopefully the protocol change would be done in a backwards-compatible way, but still. The job of software developers is to manage the *unavoidable* complexity that is inherent in the problems we are trying to solve, while avoiding or minimizing any unnecessary complexity.
Of course the devs are free to disregard my opinion and do whatever they think is best.

(0022752)
duke (reporter)
2023-01-27 04:18
edited on: 2023-01-27 04:22

I did some investigation and I must say I was probably also wrong thinking this is just some obscure Windows bug.

While the theoretical maximum for UDP packet size is around 65KB, packets above some threshold (in practice 1500+ bytes, the most common MTU setting) get fragmented on the IP layer and some systems and ISPs have issues with fragmented packets regardless of OS.
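
As a rough worked example of that threshold: with a 1500-byte MTU, an IPv4 header without options takes 20 bytes and the UDP header takes 8, which leaves 1472 bytes of UDP payload before the IP layer has to fragment. The Python sketch below only does this arithmetic; the function names are made up for illustration, and it assumes IPv4 with no IP options and a known path MTU.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
UDP_HEADER = 8    # bytes


def max_unfragmented_payload(mtu: int = 1500) -> int:
    """Largest UDP payload that fits in a single IPv4 packet of size `mtu`."""
    return mtu - IPV4_HEADER - UDP_HEADER


def needs_ip_fragmentation(payload_len: int, mtu: int = 1500) -> bool:
    """True if a UDP payload of this size would be fragmented at the IP layer."""
    return payload_len > max_unfragmented_payload(mtu)
```

Under these assumptions, the 2KB reply from the original report would be fragmented, while anything up to 1472 bytes would not.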

This article has a lot of detail and ways to test large packet delivery: https://blog.cloudflare.com/ip-fragmentation-is-broken/
It cites a paper from 2012 saying that around 6% of hosts block inbound fragmented datagrams.

Just tonight there was a MegaMan server up with 20+ wads and 25+ players going over the 1500 byte threshold, and I found out that even the Oracle Cloud instance hosting https://doomlist.net seems to be dropping UDP packets larger than 1500 bytes and was failing to query that server while many players were connected. That's embarrassing.

I suppose that's why the Steam Server Query protocol (https://developer.valvesoftware.com/wiki/Server_queries), a UDP-based protocol similar to the Zandronum launcher protocol, does fragmentation at the application level, like Zalewa proposed. Not fun to implement but maybe a good idea after all.

To keep compatibility with older launchers I think servers would need to keep sending big packets by default and allow clients to request fragmentation with a flag.
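
A minimal sketch of that opt-in idea, in Python: the server keeps sending one big datagram unless the request carries a flag asking for segments. The flag name, its value, and the 1400-byte budget are hypothetical and are not constants of the real launcher protocol.

```python
WANT_SEGMENTED = 0x01  # hypothetical request flag, not a real protocol constant
SEGMENT_SIZE = 1400    # hypothetical per-datagram payload budget in bytes


def build_reply(payload: bytes, request_flags: int) -> list[bytes]:
    """Return the datagrams to send for one query, honouring the opt-in flag."""
    if not request_flags & WANT_SEGMENTED:
        # Older launcher: unchanged behaviour, one datagram of any size.
        return [payload]
    # Newer launcher: split so that no datagram exceeds SEGMENT_SIZE.
    chunks = [payload[i:i + SEGMENT_SIZE] for i in range(0, len(payload), SEGMENT_SIZE)]
    return chunks or [payload]
```

The point of the design is that a launcher which never sets the flag sees exactly the replies it sees today.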

Sorry for muddying the ticket with my previous uninformed opinions.

(0022814)
DrinkyBird (developer)
2023-03-19 21:50

I've been working on this.
Some initial refactoring was required: https://hg.osdn.net/view/zandronum/zandronum-stable/rev/805780cfc641
The actual segmented protocol changes are still being worked on.
(0022843)
DrinkyBird (developer)
2023-05-07 21:13

https://hg.osdn.net/view/zandronum/zandronum-stable/rev/0e8360876fdf

Now I need to clean up and update the documentation...
(0022844)
DrinkyBird (developer)
2023-05-07 21:31

Docs updated: https://wiki.zandronum.com/Launcher_protocol

As a bonus here's the tool I used to test this with: https://github.com/DrinkyBird/zanquerytest/blob/master/index.js
(0022891)
Zalewa (developer)
2023-07-22 22:27

One question about the new segmented protocol: what happens when an SQF or SQF2 field extends beyond the MTU limit?

I created 100 empty WADs, called them empty_<n>.wad, loaded them up on a server, and the server sent me 4796 bytes in a single, unsegmented response. Now, if we go into the segmented responses: each WAD checksum in the SQF2_PWAD_HASHES segment is 33 bytes, so we reach the 1472-byte MTU for this segment with 44 WADs.

What happens when a server tries to host more WADs than that?

1. Will the SQF2_PWAD_HASHES be sent in a single segment, breaching the 1472-byte MTU?
2. Will I receive two segments, both with SQF2_PWAD_HASHES, and I'll have to resume the parsing in the 2nd segment where I left off in the 1st one?

I can't check that myself, just yet.
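
The arithmetic behind the 44-WAD figure can be checked with a few lines of Python. The sketch takes the 33 bytes per checksum and the 1472-byte limit from the note above; it ignores any segment header or field overhead, so the real number of hashes per segment would be somewhat lower. Whether the server actually splits one field across segments is the open question here, and the code does not answer it.

```python
import math

HASH_ENTRY = 33       # bytes per WAD checksum, as stated in the note
SEGMENT_LIMIT = 1472  # bytes available per segment, as stated in the note


def hashes_per_segment(limit: int = SEGMENT_LIMIT, entry: int = HASH_ENTRY) -> int:
    """How many whole checksum entries fit in one segment (overhead ignored)."""
    return limit // entry


def segments_needed(wad_count: int, limit: int = SEGMENT_LIMIT, entry: int = HASH_ENTRY) -> int:
    """Segments required if the hash list is split only on entry boundaries."""
    return math.ceil(wad_count / hashes_per_segment(limit, entry))
```

With these numbers, 44 entries take 1452 bytes and a 45th would not fit, so the 100-WAD test server would need at least three segments for its hashes alone if the list is split on entry boundaries.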
(0022897)
DrinkyBird (developer)
2023-07-27 01:49
edited on: 2023-07-27 01:53

Marking this as resolved: a Zan beta with this is now out, Doomseeker now supports it (0004142), and with it, servers with huge responses are now showing up consistently for people affected by this problem.


Issue Community Support
Supporters: WaTaKiD
Opponents: No one explicitly opposes this issue yet.

- Issue History
Date Modified Username Field Change
2022-12-22 22:25 Zalewa New Issue
2022-12-22 22:39 Kaminsky Status new => acknowledged
2022-12-31 00:57 Kaminsky Target Version => 3.2
2023-01-05 19:23 duke Note Added: 0022675
2023-01-26 15:29 Zalewa Note Added: 0022743
2023-01-26 15:31 Zalewa Note Edited: 0022743
2023-01-26 18:51 duke Note Added: 0022750
2023-01-26 22:08 duke Note Edited: 0022750
2023-01-27 04:18 duke Note Added: 0022752
2023-01-27 04:19 duke Note Edited: 0022752
2023-01-27 04:20 duke Note Edited: 0022752
2023-01-27 04:22 duke Note Edited: 0022752
2023-03-19 21:48 DrinkyBird Assigned To => DrinkyBird
2023-03-19 21:48 DrinkyBird Status acknowledged => assigned
2023-03-19 21:50 DrinkyBird Note Added: 0022814
2023-04-30 16:57 WaTaKiD Relationship added related to 0004130
2023-05-07 21:13 DrinkyBird Note Added: 0022843
2023-05-07 21:13 DrinkyBird Status assigned => needs testing
2023-05-07 21:31 DrinkyBird Note Added: 0022844
2023-06-25 13:43 Zalewa Relationship added related to 0004142
2023-07-22 22:27 Zalewa Note Added: 0022891
2023-07-27 01:49 DrinkyBird Note Added: 0022897
2023-07-27 01:49 DrinkyBird Status needs testing => resolved
2023-07-27 01:49 DrinkyBird Fixed in Version => 3.2
2023-07-27 01:49 DrinkyBird Resolution open => fixed
2023-07-27 01:53 DrinkyBird Note Edited: 0022897





