ID: 0003334
Project: Zandronum
Category: [All Projects] Bug
View Status: public
Date Submitted: 2017-11-08 00:52
Last Update: 2024-04-06 10:00
Reporter: Leonard
Assigned To: Kaminsky
Priority: high
Severity: major
Reproducibility: always
Status: assigned
Resolution: open
Platform:
OS:
OS Version:
Product Version: 3.0
Target Version: 3.2
Fixed in Version:
Summary: 0003334: Tickrate discrepancies between clients/servers
Description:
This issue was first described by unknownna here: 0002859:0015907.
All of the issues in that summary are addressed by 0003314, 0003317 and 0003316, except one:
Quote from unknownna
* Desyncs a little consistently every 24-25 seconds.

This turns out to be slightly off, though: the exact period is 28 seconds.

When a client starts, on either Linux or Windows, a timing method is selected: either Zandronum creates and uses a timing event fired by the OS, or it uses time polling instead.
Unfortunately, as described in this thread from the ZDoom forums, Windows timing events are only precise to the millisecond, so due to rounding only a delay of 28ms can be achieved.
Zandronum servers, on the other hand, always use time polling and, as a way of dealing with this issue, use a tickrate fixed at 35.75hz:
lNewTics = static_cast<LONG> ( lNowTime / (( 1.0 / (double)35.75 ) * 1000.0 ) );

Can you spot the problem here?
As described in the linked thread, a time step of exactly 28ms would theoretically result in a frequency of (1000 / 28) = 35.714285714...hz.
Using a tickrate of 35.75 instead results in a time step of (1000 / 35.75) = 27.972027972ms.
Taking the difference between the two time steps and dividing the client time step by it (i.e. magnifying that difference until it equals one time step), we get, you guessed it, 28 seconds:
(28 / (28 - (1000 / 35.75))) = 1001 tics, or about 28 seconds (1001 * 28ms).
This means that after 28 seconds, the server will essentially be ahead of us by 1 tic.
The prediction will account for this, so clients won't notice a thing, but outside spectators sure will.
Attached is a demo of the desync occurring with cl_predict_players false (28ms.cld).
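
The arithmetic above can be written as a small helper. This is only an illustration of the calculation: `ticsUntilOneTicDrift` and `secondsUntilOneTicDrift` are names made up for this example and are not part of Zandronum.

```cpp
#include <cmath>

// Number of client tics that elapse before two loops with slightly different
// time steps drift apart by one full client tic.
// Hypothetical helper for illustration; not Zandronum code.
double ticsUntilOneTicDrift(double clientStepMs, double serverStepMs)
{
    const double driftPerTicMs = std::fabs(clientStepMs - serverStepMs);
    return clientStepMs / driftPerTicMs;
}

// The same period expressed in seconds of client time.
double secondsUntilOneTicDrift(double clientStepMs, double serverStepMs)
{
    return ticsUntilOneTicDrift(clientStepMs, serverStepMs) * clientStepMs / 1000.0;
}
```

With a 28ms client step and a server step of 1000.0 / 35.75 ms, this gives roughly 1001 tics, i.e. about 28 seconds.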

By making the server loop use a proper time step of 28ms, the issue is fixed:
lNewTics = static_cast<LONG> ( lNowTime / 28.0 );

However, a problem still remains: the Windows timing method is the ONLY one that has a time step of 28ms.
If for some reason the creation of a timer failed at start, Windows clients will use time polling, which will obviously be a proper 35hz.
Linux timer events are precise to the microsecond, as reported in the thread, and will give a proper 35hz for both polling and timing events.
As such, even with a fixed time step of 28ms on the server, clients using a proper 35hz tickrate will experience a desync too, and a much worse one at that.
Calculating the difference again:
((1000 / 35) / ((1000 / 35) - 28)) = 50 tics.
After only 50 tics (about 1.4 seconds), these clients will appear to jitter to outside observers.
I will attach a demo of this with cl_predict_players false as well (35hz.cld).
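
The 50-tic figure can also be checked without the closed-form division, by stepping both tic counters millisecond by millisecond. A minimal sketch, assuming both sides start at time zero and derive their tic count from elapsed whole milliseconds; all three function names are hypothetical.

```cpp
#include <cstdint>

// Server tic count with a fixed 28ms time step.
std::int64_t serverTics28ms(std::int64_t elapsedMs) { return elapsedMs / 28; }

// Client tic count with a proper 35hz rate (time polling).
std::int64_t clientTics35hz(std::int64_t elapsedMs) { return elapsedMs * 35 / 1000; }

// First millisecond at which the server leads the client by at least `lead`
// tics. Returns -1 if that does not happen within limitMs.
std::int64_t firstMsServerAhead(std::int64_t lead, std::int64_t limitMs)
{
    for (std::int64_t t = 0; t <= limitMs; ++t)
        if (serverTics28ms(t) - clientTics35hz(t) >= lead)
            return t;
    return -1;
}
```

In this model a lead of one tic shows up right away at 28ms, simply because 28ms elapses before 28.57ms. The lead first reaches two tics at 1428ms, right around the client's 50th tic (50 * 1000 / 35 is about 1428.6ms).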

The solution is obviously to enforce a consistent tickrate in every case, but which one?
Should we force the time step to be 28ms globally, or should we simply disable Windows timer events, use 35hz on the servers and be done with it?
In my opinion, we should simply disable Windows timer events, and here are my arguments why:
* The clients never slept to begin with if cl_capfps is off.
* As per 0001633, the timer events are already disabled on Linux for clients due to being faulty.
* Proper 35hz.
  I'm not sure, but doesn't the fact that the tickrate is not a proper 35hz mean that mods relying on it being correct experience time drift?
  For example, if a mod uses a timer in ACS that's supposed to count time at 35hz, does it keep track of the time correctly?
  I haven't really checked this, so I'm not sure, but that could be a thing.
* If we want to look at some sort of standard, Quake 3 uses time polling for both its clients and servers and, just like Zandronum, only the servers sleep.
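
For reference, time polling at a proper 35hz can be sketched with `std::chrono`. This is an illustration under assumptions, not Zandronum's actual timer code: `kTicRate`, `ticsFromMicros` and `currentTic` are hypothetical names.

```cpp
#include <chrono>
#include <cstdint>

constexpr std::int64_t kTicRate = 35; // tics per second (hypothetical constant)

// Tics elapsed after `elapsedUs` microseconds at exactly 35hz.
// Multiplying before dividing keeps the rate exact instead of rounding
// the time step to a whole number of milliseconds.
constexpr std::int64_t ticsFromMicros(std::int64_t elapsedUs)
{
    return elapsedUs * kTicRate / 1000000;
}

// Polled tic counter: it only reads the clock, so it can be called every frame.
std::int64_t currentTic(std::chrono::steady_clock::time_point start)
{
    const auto elapsed = std::chrono::steady_clock::now() - start;
    const auto us = std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
    return ticsFromMicros(us);
}
```

In this scheme a server would sleep until the next tic boundary, while a client with cl_capfps off would just poll `currentTic` once per frame.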
Steps To Reproduce:
Using the test map referenced in 0002859:0018627:
- host a server on MAP01
- join a client with cl_predict_players 0
- wait at least 28 seconds
- you will notice stuttering for about a second or so; this is what outside observers actually see
You may also use the debug output given by the linked commit; it will show the desync happening in real time (the server messages arriving at different times and the prediction suddenly having to do 2 ticks even though this is a local connection).
Additional Information:
The demos were both recorded on the same test map described in the steps to reproduce.
Attached Files:
28ms.cld (86,629 bytes) 2017-11-08 00:52
35hz.cld (112,668 bytes) 2017-11-08 00:57
jack.cld (62,779 bytes) 2018-02-11 19:45
jitter.cld (78,600 bytes) 2018-02-11 19:45
unlagged_debug_03.wad (13,507 bytes) 2024-03-09 02:49

Relationships
related to   0001633   resolved        Leonard   [Linux x86_64] Multiplayer game completely broken
parent of    0002859   needs testing   Leonard   Gametic-based unlagged seemingly goes out of sync often compared to ping-based unlagged
related to   0002491   needs testing   Leonard   Screenshot exploit
related to   0003418   resolved        Leonard   (3.1 alpha) Stuttering ingame
Not all the children of this issue are yet resolved or closed.

-  Notes
User avatar (0018816)
Edward-san (developer)
2017-11-08 10:54

Just for curiosity: did you check also if GZDoom multiplayer has this desync problem?

AFAIR some years ago, when I, on Ubuntu 64, tested with a Windows user, there weren't issues like that.
User avatar (0018818)
Leonard (developer)
2017-11-08 17:26

No, I didn't check GZDoom.
Quote from Edward-san
there weren't issues like that.

I'm assuming you're also talking about GZDoom here?
My guess is that, given the nature of GZDoom's multiplayer, it literally doesn't matter: everyone connected experiences the worst latency, so even on a LAN a different tickrate would just mean a slight latency (0.5ms) for the Windows user who is ticking faster.
If you're talking about Zandronum, then bear in mind that the client does not see this for itself, and even if you hosted on Linux, the server still uses the same loop, so you would still only notice the other user's jitter every 28 seconds.
User avatar (0018820)
Edward-san (developer)
2017-11-08 20:25

Quote
I'm assuming you're also talking about GZDoom here?


Correct.
User avatar (0018850)
Ru5tK1ng (updater)
2017-11-10 02:26

Disabling Windows timer events seems to make more sense and to be more consistent. I would say just begin working to implement 35hz.
User avatar (0018851)
Leonard (developer)
2017-11-10 18:33

PR.
This is for proper 35hz.
If for some reason we end up going for 28ms though I could always edit it later.
User avatar (0018852)
Blzut3 (administrator)
2017-11-10 20:17
edited on: 2017-11-10 21:11

I'm confused on "The clients never slept to begin with, why bother with a timer?" The difference between the polled and event timer should be whether Zandronum uses 100% CPU or not with cl_capfps on (I think vid_vsync would be affected too but not certain). If disabling the event timer does indeed cause 100% CPU usage expect a lot of angry laptop users.

On a related note, I believe 3.0 got the updated Linux timer and should be fine now modulo the issue you're trying to solve.

Edit: Regarding the ACS timing point. It has been too long but the last time I looked at the 35.75Hz issue I did notice some mods were correcting for the round off. So some mods are going to be wrong with either choice, but it really doesn't matter unless your hobby is watching timers and looking for drift. :P
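
To put a rough number on the kind of drift discussed here: a script that assumes 35 tics per second will run fast if the game actually ticks every 28ms. A small sketch of that arithmetic, with `modSecondsAfter` as a hypothetical name (this is not ACS and not Zandronum code):

```cpp
// Seconds counted by a mod that assumes 35 tics per second, after
// `realSeconds` of wall-clock time with an actual time step of `stepMs`.
double modSecondsAfter(double realSeconds, double stepMs)
{
    const double ticsElapsed = realSeconds * 1000.0 / stepMs;
    return ticsElapsed / 35.0;
}
```

With a 28ms step, one real hour is counted as about 3673.5 seconds, so the mod's timer ends up roughly 73 seconds fast. With a step of 1000.0 / 35.0 ms there is no drift.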

User avatar (0018853)
Leonard (developer)
2017-11-11 01:19

Oh, you're right, I didn't check the I_WaitForTic functions.
Then I guess we lose sleeping with cl_capfps on for Windows.
User avatar (0019006)
Ru5tK1ng (updater)
2018-01-23 01:47

Is the loss of sleep for windows the reason the patch wasn't pulled?
User avatar (0019011)
Leonard (developer)
2018-01-23 20:04

The loss of sleep issue was solved. The reason this hasn't been addressed yet is that having consistent ticrates (both 35hz and 28ms) revealed a much bigger problem that needs to be addressed first.
User avatar (0019012)
Ru5tK1ng (updater)
2018-01-24 01:40
edited on: 2018-01-24 01:40

For those following this ticket or interested parties, what bigger issue was revealed? I'm assuming 1633 is part of it.

User avatar (0019031)
Leonard (developer)
2018-02-11 19:46

Sorry for the extremely late reply, I wanted to respond once I had the basic implementation up and running and time played against me again and I kind of forgot to reply.

Quote from Ru5tK1ng
I'm assuming 1633 is part of it.

No.

Quote from Ru5tK1ng
what bigger issue was revealed?

I will try my best to explain this.
In this ticket it was found that the ticrates of clients and servers differ.
This means that inevitably, over time, one is going to end up behind the other: after 28 seconds the Windows clients are so far behind that they lose one tic, and around the moment this occurs, the client's and the server's ticking get closer and closer together, which is what causes the "desync".
At this point, when the client's and the server's ticking happen almost at the same time, the server seems to alternate between receiving the client's movement commands before and after ticking.
This produces what Alex called the "jackhammering effect", where in the worst case the server consistently processes 2 movement commands (the previous one that was received AFTER ticking and the current one) and then waits for the next tic.
Obviously this makes the player extremely jittery.
The problem is that once the ticrates were actually fixed and made consistent across clients/servers, on rare occasions a connecting client would get the jackhammering effect, except this time permanently.
This is presumably because the clients/servers no longer lag behind relative to each other, which implies that a client that ticks almost exactly at the same time as its server by "chance" will do so permanently.
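
The effect described above can be reproduced with a toy model. This is a sketch under stated assumptions, not Zandronum code: the client sends one command per tic, the network delay is folded into `offsetUs`, arrival times are alternately pushed late and early by `jitterUs`, and the server processes every command that arrived since its previous tic. The function name is hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Counts how many movement commands the server processes on each of its tics.
// Command k arrives at k*stepUs + offsetUs, shifted by +jitterUs (even k) or
// -jitterUs (odd k). The server ticks at n*stepUs.
// Assumes offsetUs >= 0 and 0 <= jitterUs < stepUs.
std::vector<int> commandsPerServerTic(std::int64_t stepUs, std::int64_t offsetUs,
                                      std::int64_t jitterUs, int numTics)
{
    std::vector<int> counts(numTics, 0);
    for (int k = 0; k < numTics; ++k)
    {
        const std::int64_t jitter = (k % 2 == 0) ? jitterUs : -jitterUs;
        const std::int64_t arrival = k * stepUs + offsetUs + jitter;
        // Index of the first server tic at or after the arrival time.
        const std::int64_t tic = (arrival + stepUs - 1) / stepUs;
        if (tic >= 0 && tic < numTics)
            ++counts[tic];
    }
    return counts;
}
```

With a 28000us step, an offset of 0 and 500us of jitter, the server alternates between processing 0 and 2 commands per tic, which is the jackhammering pattern. With an offset of half a step (14000us), the same jitter yields exactly one command per tic after the first.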

I'm not sure what causes it to happen in the first place. Searching for such a problem doesn't turn up anything useful, and apparently this problem never occurred in any other game engine, which made me doubt what was even happening.
Even then, I tried to debug further but couldn't find anything more of use on this issue.

After that I decided to work on smoothing the ticbuffer which would not only fix (or kludge if it's not something that is supposed to happen) the jackhammering problem but also benefit online play in general.
In particular, I expect this will come close to completely solving 0002491.
The way this works is simple: detect whether movement command "drops" occur frequently and, if so, apply a "safe latency" to the processing so that the movement ends up completely smooth.
I think this is the most logical way to solve such a thing: the jittery player shouldn't even notice anything, as both prediction and unlagged should account for any extra latency, but instead of being very hard to hit, that player will now look much, much smoother to outside observers.
I attached a demo (jack.cld) in which a player experiences the jackhammering effect, followed by the new ticbuffer "regulator" being turned on.
This means that players experiencing this problem will be completely smooth to outside observers regardless, but at the expense of having one extra tic of latency.
I also attached a much worse example using gamer's proxy to simulate a jittery player (jitter.cld).
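
The general idea of such a regulator can be sketched as follows. This is a heavily simplified, hypothetical model and not the actual ticbuffer code: the class name, `dropThreshold` and the use of plain `int` commands are all assumptions. It counts empty tics in total rather than over a recent window, and it never turns itself off again.

```cpp
#include <cstddef>
#include <deque>

// Toy "regulator" for a per-player movement command buffer.
class TicBufferRegulator
{
public:
    explicit TicBufferRegulator(int dropThreshold) : dropThreshold_(dropThreshold) {}

    // Called whenever a movement command arrives from the client.
    void receive(int command) { pending_.push_back(command); }

    // Called once per server tic; returns how many commands were processed.
    std::size_t tick()
    {
        if (!regulating_)
        {
            // Unregulated: process everything that arrived since the last tic.
            const std::size_t processed = pending_.size();
            pending_.clear();
            // An empty tic is a "drop"; enough of them turns regulation on.
            if (processed == 0 && ++drops_ >= dropThreshold_)
                regulating_ = true;
            return processed;
        }
        // Regulated: at most one command per tic. Extra commands wait in the
        // buffer, which costs latency but keeps the movement evenly paced.
        if (pending_.empty())
            return 0;
        pending_.pop_front();
        return 1;
    }

    bool regulating() const { return regulating_; }

private:
    std::deque<int> pending_;
    int drops_ = 0;
    int dropThreshold_;
    bool regulating_ = false;
};
```

Fed with the jackhammering pattern (two commands on one tic, none on the next) and a threshold of 3, this model processes 2, 0, 2, 0, 2, 0 commands and then settles at exactly one command per tic, with one command always waiting in the buffer.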
User avatar (0019158)
Leonard (developer)
2018-03-25 15:19

Since the ticbuffer smoothing is ready to be reviewed, this needs to be reviewed as well.
User avatar (0019177)
Leonard (developer)
2018-04-30 15:47

This was pulled.
User avatar (0019211)
StrikerMan780 (reporter)
2018-05-07 02:07
edited on: 2018-05-07 05:58

There seems to be an issue with this in the latest 3.1 build. Player and actor movement is no longer interpolated, it looks like it's running at 35fps, even with uncapped framerate. Both online and offline.

Everything else seems fine however.

User avatar (0019215)
Leonard (developer)
2018-05-07 09:46

That was reported in 0003418.
User avatar (0019267)
StrikerMan780 (reporter)
2018-06-03 22:39
edited on: 2018-06-03 22:39

https://zandronum.com/tracker/view.php?id=3314#c19255

Linking to this comment, since it may have more to do with this than the ticket I posted in. Not sure though.

User avatar (0023304)
unknownna (updater)
2024-03-06 12:27
edited on: 2024-03-12 11:49

Quote from Leonard
on rare occasions a connecting client would get the jackhammering effect except this time permanently.


Hey, was this ever added to 3.1, or looked into any further?

This issue is still present in 3.1. It's especially noticeable if you connect with some latency, for instance with 300 ping emulated.

If you're unlucky and connect to the server at the wrong tick, all the other players will permanently jitter/jackhammer around needlessly. Their player body movements aren't as smooth when you circle-strafe around each other while fighting. It's actually rather bad when it happens.

When the bug occurs, the impression from your POV is that the jittering players move slightly faster all the time due to all the micro-skipping they do.
Additionally, other clients see you as skipping as well. Glitched clients affect both themselves and others on the server.

The only way to fix it is to quit and restart the program and hope that you get better luck upon the next instance of starting the program and joining the server.

To sum it up so far:

* Clients connected with a bad tick sync will visually jitter/jackhammer around to everyone else, and they in turn will perceive everyone else as jittering as well.
* Clients connected with a good tick sync will see other clients with good tick syncs moving smoothly.
* To fix it, you have to quit and restart the program and hope for the best upon connecting to the server again.
* Because of this desync, you effectively have 2 different positions of the other clients flickering back and forth that you can hit successfully with a hitscan weapon.

And to add to the final point, because of this permanent desync and flickering of 2 different positions rapidly, it could explain this issue:

Quote from unknownna
The unlagged seems to act odd when firing immediately upon passing and bumping into another player close by, allowing you to seemingly hit behind the player in thin air and still hit.


Another effect of this is that shots with a wide spread like the SSG aimed at the center of your target are more likely to miss some pellets the further away the target is, because the target is rapidly bouncing back and forth between 2 points. Players are thus potentially harder to hit the further away they are from you in Zandronum.

I updated Leonard's example wad to easily reproduce the issue and also gave it Quake 3 hit sounds for target checks and made the hitbox sprites more intuitive. There are 2 pistols, one fast like the chaingun, and one slow like the shotgun. Connect 2 clients to the server and use an emulated ping of 300 on one client to easily see the issue when it occurs. You can 100% hit the flickering position that is incorrect, and you even have a dead zone in the middle of the 2 positions where no pellets hit. It's most definitely bugged.

I tested various timings to use when joining the server immediately after starting the server exe.
Joining after 6-7 seconds seems to not be enough to trigger the issue. But any later than that and it almost seems to have something to do with the time being an even or odd number, like there's some rounding error that causes the player positions to fluctuate between 2 tics permanently. It's 1 tic off when it glitches, I'd say.

https://web.archive.org/web/20181107022429/http://www.mine-control.com:80/zack/timesync/timesync.html

Maybe this link has some information that can help? I can't really do anything more from here until someone looks into it further.

User avatar (0023377)
Kaminsky (developer)
2024-03-12 15:13

It might also be worth backporting this commit from Q-Zandronum: https://github.com/IgeNiaI/Q-Zandronum/commit/d024154462267f10f65e4c6f78bf836764295e01 to help fix the permanent time desync issue.
User avatar (0023381)
unknownna (updater)
2024-03-13 14:08
edited on: 2024-04-03 10:01

Thanks for that! I tested the latest Q-Zandronum 1.4.11, which I assume has the fix, and it's much worse there, and the unlagged is extremely broken. The client-side movement prediction of your own player also jitters a lot. Whatever it's doing is not the correct way at the moment.

Back to regular Zandronum, turning the tic buffer off doesn't improve the issue, so it seems to be something more fundamental going on.

With 2 clients standing still on the scrolling floor, with one client using 300 emulated ping, older versions of Zandronum (1.0, 1.1, 1.3, 1.2.2, 2.1.2 and 3.0) seem to be very smooth until 16 seconds pass, at which it starts to jitter for 13-14 seconds before correcting itself and normalizing again, and it loops like that over and over again. This time loop is 100% consistent.

The permanent time desync jitter starts to appear in 3.1 after the fixes.

So it's a time drift issue after all, but Q-Zandronum's fix is not the correct one since it glitches it out even further.

Quote from unknownna
older versions of Zandronum (1.0, 1.1, 1.3, 1.2.2, 2.1.2 and 3.0) seem to be very smooth until 16 seconds pass, at which it starts to jitter for 13-14 seconds before correcting itself and normalizing again, and it loops like that over and over again. This time loop is 100% consistent.

After testing this further, it means that prior to 3.1, the unlagged position of your target would be shifted forward every 16 seconds, and flicker rapidly between 2 points for 13-14 seconds until it normalized. Because of this, you could hit even further ahead of the player than before when these 16 seconds passed. And due to it already being 1 tic off because of a separate issue, it meant aiming extra far ahead to land a hit. The unlagged would break every 16 seconds in older versions of Zandronum.

The difference is that it's now permanent, and it's very likely to break often for incoming clients. It's not too fun to play the game in its current state, just saying.

User avatar (0023521)
Kaminsky (developer)
2024-04-06 00:22

Thanks a lot for your detailed analysis, and for testing the aforementioned commit from Q-Zandronum! I was hoping that Q-Zandronum's approach was good enough and we could easily backport it.

I recently started working on adding time synchronization using the information in the link you posted in https://zandronum.com/tracker/view.php?id=3334#c23304. It's not 100% complete, and I haven't had an opportunity to thoroughly test it to see if it fixes the issues you mentioned. I'm afraid that I won't have the fix ready for the next 3.2 beta that might come in a few days, but I'll be happy to share test builds with you afterwards.
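
For readers unfamiliar with this kind of time synchronization, a generic round-trip clock offset estimate can be sketched as below. This is not the implementation being worked on for this ticket, and it simplifies the filtering step: it assumes a symmetric route and just averages the lowest-latency samples. `SyncSample` and all function names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// One request/reply exchange, all times in milliseconds.
struct SyncSample
{
    double clientSendMs; // client clock when the request was sent
    double serverTimeMs; // server clock stamped into the reply
    double clientRecvMs; // client clock when the reply arrived
};

// Estimated one-way latency, assuming a symmetric route.
double latencyMs(const SyncSample& s) { return (s.clientRecvMs - s.clientSendMs) / 2.0; }

// Estimated (server clock - client clock) for one sample.
double offsetMs(const SyncSample& s) { return s.serverTimeMs + latencyMs(s) - s.clientRecvMs; }

// Averages the offsets of the samples whose latency is at or below the median,
// so that exchanges delayed by a latency spike do not skew the estimate.
double estimateClockOffsetMs(std::vector<SyncSample> samples)
{
    if (samples.empty())
        return 0.0;
    std::sort(samples.begin(), samples.end(),
              [](const SyncSample& a, const SyncSample& b) { return latencyMs(a) < latencyMs(b); });
    const std::size_t keep = samples.size() / 2 + 1; // lowest-latency samples up to the median
    double sum = 0.0;
    for (std::size_t i = 0; i < keep; ++i)
        sum += offsetMs(samples[i]);
    return sum / static_cast<double>(keep);
}
```

Once a client has such an offset, it could in principle align its tic boundaries relative to the server's clock instead of relying on whatever phase it happened to connect with.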
User avatar (0023523)
unknownna (updater)
2024-04-06 10:00

Incredible, looking forward to it!

Issue Community Support
Supporters: Combinebobnt unknownna
Opponents: No one explicitly opposes this issue yet.

- Issue History
Date Modified Username Field Change
2017-11-08 00:52 Leonard New Issue
2017-11-08 00:52 Leonard Status new => assigned
2017-11-08 00:52 Leonard Assigned To => Leonard
2017-11-08 00:52 Leonard File Added: 28ms.cld
2017-11-08 00:53 Leonard Description Updated View Revisions
2017-11-08 00:57 Leonard File Added: 35hz.cld
2017-11-08 00:59 Leonard Relationship added parent of 0002859
2017-11-08 10:54 Edward-san Note Added: 0018816
2017-11-08 17:26 Leonard Note Added: 0018818
2017-11-08 20:25 Edward-san Note Added: 0018820
2017-11-10 02:26 Ru5tK1ng Note Added: 0018850
2017-11-10 18:33 Leonard Note Added: 0018851
2017-11-10 18:33 Leonard Status assigned => needs review
2017-11-10 20:17 Blzut3 Note Added: 0018852
2017-11-10 21:11 Blzut3 Note Edited: 0018852 View Revisions
2017-11-11 01:19 Leonard Note Added: 0018853
2017-11-11 01:19 Leonard Description Updated View Revisions
2017-11-13 09:16 Leonard Relationship added related to 0001633
2017-11-13 14:08 Leonard Status needs review => assigned
2018-01-23 01:47 Ru5tK1ng Note Added: 0019006
2018-01-23 20:04 Leonard Note Added: 0019011
2018-01-24 01:40 Ru5tK1ng Note Added: 0019012
2018-01-24 01:40 Ru5tK1ng Note Edited: 0019012 View Revisions
2018-02-11 19:45 Leonard File Added: jack.cld
2018-02-11 19:45 Leonard File Added: jitter.cld
2018-02-11 19:46 Leonard Note Added: 0019031
2018-03-25 15:19 Leonard Note Added: 0019158
2018-03-25 15:19 Leonard Status assigned => needs review
2018-03-25 15:22 Leonard Relationship added related to 0002491
2018-04-30 15:47 Leonard Note Added: 0019177
2018-04-30 15:47 Leonard Status needs review => needs testing
2018-05-02 16:05 Leonard Relationship added related to 0003418
2018-05-07 02:07 StrikerMan780 Note Added: 0019211
2018-05-07 02:08 StrikerMan780 Note Edited: 0019211 View Revisions
2018-05-07 03:09 StrikerMan780 Note Edited: 0019211 View Revisions
2018-05-07 05:58 StrikerMan780 Note Edited: 0019211 View Revisions
2018-05-07 05:58 StrikerMan780 Note Edited: 0019211 View Revisions
2018-05-07 09:46 Leonard Note Added: 0019215
2018-06-03 22:39 StrikerMan780 Note Added: 0019267
2018-06-03 22:39 StrikerMan780 Note Edited: 0019267 View Revisions
2024-03-06 12:27 unknownna Note Added: 0023304
2024-03-06 12:58 unknownna Note Edited: 0023304 View Revisions
2024-03-06 16:05 unknownna Note Edited: 0023304 View Revisions
2024-03-07 09:40 unknownna Note Edited: 0023304 View Revisions
2024-03-07 22:38 unknownna Note Edited: 0023304 View Revisions
2024-03-07 22:39 unknownna Status needs testing => feedback
2024-03-09 02:49 unknownna Note Edited: 0023304 View Revisions
2024-03-09 02:49 unknownna File Added: unlagged_debug_03.wad
2024-03-12 05:17 unknownna Priority normal => high
2024-03-12 11:49 unknownna Note Edited: 0023304 View Revisions
2024-03-12 15:13 Kaminsky Note Added: 0023377
2024-03-13 14:08 unknownna Note Added: 0023381
2024-03-14 06:22 unknownna Note Edited: 0023381 View Revisions
2024-03-14 08:19 unknownna Note Edited: 0023381 View Revisions
2024-03-14 08:24 unknownna Note Edited: 0023381 View Revisions
2024-03-14 08:26 unknownna Note Edited: 0023381 View Revisions
2024-04-03 10:01 unknownna Note Edited: 0023381 View Revisions
2024-04-06 00:22 Kaminsky Note Added: 0023521
2024-04-06 00:22 Kaminsky Assigned To Leonard => Kaminsky
2024-04-06 00:22 Kaminsky Status feedback => assigned
2024-04-06 00:22 Kaminsky Target Version 3.1 => 3.2
2024-04-06 10:00 unknownna Note Added: 0023523