ID: 0000822
Project: Zandronum
Category: [All Projects] Bug
View Status: public
Date Submitted: 2012-04-29 02:42
Last Update: 2018-01-21 20:26
Reporter: AlexMax
Assigned To:
Priority: normal
Severity: minor
Reproducibility: always
Status: confirmed
Resolution: open
Platform: Linux
OS: Ubuntu
OS Version: 10.04 x86-64
Product Version: 98d
Target Version:
Fixed in Version:
Summary: 0000822: Servers freeze after 24 days uptime
Description: Pretty simple. As soon as my Skulltag servers had reached 25 days uptime, people started asking me where they were. I took a look at supervisorctl, and _all_ of my servers were unresponsive except for the ones that were restarted earlier. This is output from supervisor:

st-duel32-duel RUNNING pid 1943, uptime 25 days, 8:51:01
st-duel32-duel2 RUNNING pid 1936, uptime 25 days, 8:51:01
st-idl2012-atf RUNNING pid 3459, uptime 17 days, 23:51:05
st-idl2012-ctf RUNNING pid 1937, uptime 25 days, 8:51:01
st-idl2012-privctf RUNNING pid 1944, uptime 25 days, 8:51:01
st-idl2012-scrimctf RUNNING pid 1942, uptime 25 days, 8:51:01

Of the six servers, the only one that was up was my "Attack the Flag" test, and it was only up for 17 days. This isn't the first time this has happened either.

-  Notes
(0003487)
AlexMax (developer)
2012-04-29 02:46

Attack the Flag is running 98e. Will let you know if it crashes too.
(0003491)
DevilHunter (reporter)
2012-04-29 11:17

I think Silvertear has the same issue with his servers

[10:40:11] [@silvertear] yeah they stop working after 25 days or so
[10:40:18] [@silvertear] happens to alexmax's server as well

After he restarted them, they worked just fine.

Haven't noticed this on Armada, but then again, when have any of those servers been on for at least a month lol
(0006410)
jwaffe (reporter)
2013-06-09 20:44
edited on: 2013-06-09 20:45

I can confirm this behavior on my servers ([IFOC]). All of them go down at the same time, though I figured it was closer to 28 days. This has been happening as long as I have been hosting, even on Skulltag 98D.

When this happens to my servers, they disappear from the master server. I can connect to them using -connect, but I get a bright yellow HOM screen. Killing and restarting the servers fixes all problems for another interval of around 28 days.

My server is running the 64-bit Linux build on Ubuntu Server.

(0006478)
Konar6 (reporter)
2013-06-22 21:55
edited on: 2013-06-22 21:56

This problem is caused by overflow in SERVER_Tick().
The variables which hold the millisecond timers work with the LONG datatype, and thus overflow when the server has been running for 2,147,483,647 msec, or 24 days and ~ 20 hours.
The timer is, however, provided by the SDL function SDL_GetTicks(), which itself only returns a Uint32 anyway, so I guess the fix won't be as easy as switching our variables to a 64-bit datatype (and switching to ULONG would only postpone the problem to 49 days?)
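
The arithmetic behind the 24-day and 49-day figures can be checked with a minimal sketch. It assumes LONG is a 32-bit signed integer, which is what the 2,147,483,647 msec figure implies. `Uptime`, `ToUptime`, `StoreAsSigned` and `DemonstrateWrapPoints` are hypothetical names made up for this illustration; none of them come from the Zandronum source.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type for this illustration only.
struct Uptime {
    std::uint64_t days;
    std::uint64_t hours;
};

// Converts a millisecond count into whole days plus remaining whole hours.
Uptime ToUptime(std::uint64_t ms) {
    const std::uint64_t totalHours = ms / (1000ULL * 60ULL * 60ULL);
    return Uptime{totalHours / 24, totalHours % 24};
}

// Stores an unsigned 32-bit tick count in a signed 32-bit variable, standing in
// for a Uint32 timer value assigned to a 32-bit LONG (an assumption about LONG).
// Before C++20 the result for out-of-range values is implementation-defined;
// mainstream compilers wrap around using two's complement.
std::int32_t StoreAsSigned(std::uint32_t ticks) {
    return static_cast<std::int32_t>(ticks);
}

void DemonstrateWrapPoints() {
    // A signed 32-bit millisecond counter runs out after 24 days, 20 hours.
    const Uptime signedLimit = ToUptime(INT32_MAX);
    assert(signedLimit.days == 24 && signedLimit.hours == 20);

    // An unsigned 32-bit millisecond counter runs out after 49 days, 17 hours.
    const Uptime unsignedLimit = ToUptime(UINT32_MAX);
    assert(unsignedLimit.days == 49 && unsignedLimit.hours == 17);

    // One millisecond past the signed limit, the stored value turns negative.
    assert(StoreAsSigned(2147483647u) > 0);
    assert(StoreAsSigned(2147483648u) < 0);
}
```

Whether a negative timer value is what actually makes the server stop responding is not established in this report; the sketch only reproduces the numbers quoted above.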

According to Firestone, he doesn't suffer from this problem on his Windows servers.

(0006480)
Torr Samaho (administrator)
2013-06-23 11:02

If it's just an overflow problem in SERVER_Tick(), why aren't the Windows servers affected?
(0006489)
Edward-san (developer)
2013-06-23 16:33

... it should affect Windows servers too, but only after 49 days, which is, coincidentally, the time after which SDL_GetTicks wraps around.
(0006490)
jwaffe (reporter)
2013-06-23 17:25

Ah... the timer rolls over and the server doesn't know what to do with packets from 49 days ago, or something similar I imagine.

It might not be easy to go from using the absolute value of the counter to using a relative value. I'm not sure how deeply it's coded to use the return value instead of the return value minus a point in time to measure against.
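
For illustration, the "relative value" approach could look like the sketch below. It is not taken from the Zandronum code: `ElapsedMs`, `HasElapsed` and `DemonstrateRollover` are hypothetical names, and the tick values are assumed to be unsigned 32-bit millisecond counts like the ones SDL_GetTicks() returns. Unsigned subtraction is modular, so the difference stays correct across one rollover as long as the real interval is shorter than about 49.7 days.

```cpp
#include <cassert>
#include <cstdint>

// Milliseconds elapsed between two 32-bit tick readings.
// The cast keeps the result modulo 2^32 even if integer promotion widened
// the operands, so a single rollover between the readings is harmless.
std::uint32_t ElapsedMs(std::uint32_t start, std::uint32_t now) {
    return static_cast<std::uint32_t>(now - start);
}

// True once at least intervalMs milliseconds have passed since start.
bool HasElapsed(std::uint32_t start, std::uint32_t now, std::uint32_t intervalMs) {
    return ElapsedMs(start, now) >= intervalMs;
}

void DemonstrateRollover() {
    const std::uint32_t start = 0xFFFFFF00u;  // 256 ms before the counter wraps
    const std::uint32_t now   = 0x00000100u;  // 256 ms after the counter wrapped

    // Comparing absolute values goes wrong: "now" looks earlier than "start".
    assert(now < start);

    // Comparing the difference still gives the true interval of 512 ms.
    assert(ElapsedMs(start, now) == 512u);
    assert(HasElapsed(start, now, 500u));
    assert(!HasElapsed(start, now, 1000u));
}
```

The catch jwaffe points out still applies: every place that compares absolute timer values would have to be changed to compare differences.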
(0006493)
Torr Samaho (administrator)
2013-06-23 19:48

Quote from Edward-san

... it should affect windows servers too, but only after 49 days

SERVER_Tick stores the return value of I_MSTime as LONG. So this should also overflow under Windows after 24 days.
(0017808)
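
One possible direction, sketched here purely as an illustration and not as the project's fix, is to widen a wrapping 32-bit millisecond counter to 64 bits by accumulating the differences between successive readings. `TickExtender` and `DemonstrateExtension` are hypothetical names. The sketch assumes the counter is read at least once per wrap period (roughly 49.7 days), and that the caller keeps the result in a 64-bit variable rather than a 32-bit LONG.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: turns successive readings of a wrapping 32-bit
// millisecond counter into a 64-bit count that keeps increasing.
class TickExtender {
public:
    explicit TickExtender(std::uint32_t initial)
        : last_(initial), total_(initial) {}

    // Feed the latest 32-bit reading; returns the extended 64-bit value.
    // Must be called at least once per 2^32 ms for the result to be correct.
    std::uint64_t Update(std::uint32_t now) {
        // Modular difference, so a rollover between readings is absorbed.
        total_ += static_cast<std::uint32_t>(now - last_);
        last_ = now;
        return total_;
    }

private:
    std::uint32_t last_;   // most recent raw 32-bit reading
    std::uint64_t total_;  // accumulated 64-bit millisecond count
};

void DemonstrateExtension() {
    TickExtender extender(4294967000u);  // 296 ms before the 32-bit wrap

    // The raw counter has wrapped and now reads 704; exactly 1000 ms passed.
    const std::uint64_t extended = extender.Update(704u);
    assert(extended == 4294968000ULL);

    // Later readings keep accumulating on top of the extended value.
    assert(extender.Update(1704u) == 4294969000ULL);
}
```

A 64-bit millisecond count does not wrap for hundreds of millions of years, but the change only helps if every variable the value passes through is also 64 bits wide.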
Ru5tK1ng (updater)
2017-06-07 03:32

Is there any reason SERVER_Tick doesn't use ULong?
(0019005)
Torr Samaho (administrator)
2018-01-21 20:26

The following could help with debugging the issue:

[21:16:19] <Dusk> [20:07:51] <AlexMax> [18:05:29] https://idea.popcount.org/2013-07-19-how-to-sleep-a-million-years/
[21:16:19] <Dusk> [20:07:51] <AlexMax> [18:05:36] this might be useful for trying to fix the 24 day bug

Issue Community Support
Supporters: Combinebobnt jwaffe DevilHunter Chronos Ouroboros unknownna WaTaKiD President People Monsterovich StrikerMan780 Korshun Konda
Opponents: No one explicitly opposes this issue yet.

- Issue History
Date Modified Username Field Change
2012-04-29 02:42 AlexMax New Issue
2012-04-29 02:46 AlexMax Note Added: 0003487
2012-04-29 11:17 DevilHunter Note Added: 0003491
2012-06-09 13:22 Torr Samaho Category General => Bug
2013-06-09 20:44 jwaffe Note Added: 0006410
2013-06-09 20:45 jwaffe Note Edited: 0006410
2013-06-09 20:45 jwaffe Note Edited: 0006410
2013-06-22 21:55 Konar6 Note Added: 0006478
2013-06-22 21:56 Konar6 Note Edited: 0006478
2013-06-23 00:48 Dusk Status new => confirmed
2013-06-23 11:02 Torr Samaho Note Added: 0006480
2013-06-23 16:33 Edward-san Note Added: 0006489
2013-06-23 17:25 jwaffe Note Added: 0006490
2013-06-23 19:48 Torr Samaho Note Added: 0006493
2017-06-07 03:32 Ru5tK1ng Note Added: 0017808
2018-01-21 20:26 Torr Samaho Note Added: 0019005





