MantisBT - Zandronum |
View Issue Details |
|
ID | Project | Category | View Status | Date Submitted | Last Update |
0000822 | Zandronum | [All Projects] Bug | public | 2012-04-29 02:42 | 2024-07-17 17:08 |
|
Reporter | AlexMax | |
Assigned To | Kaminsky | |
Priority | normal | Severity | minor | Reproducibility | always |
Status | resolved | Resolution | fixed | |
Platform | Linux | OS | Ubuntu | OS Version | 10.04 x86-64 |
Product Version | 98d | |
Target Version | 3.2 | Fixed in Version | 3.2 | |
|
Summary | 0000822: Servers freeze after 24 days uptime |
Description | Pretty simple. As soon as my Skulltag servers had reached 25 days uptime, people started asking me where they were. I took a look at supervisorctl, and _all_ of my servers were unresponsive except for the ones that were restarted earlier. This is output from supervisor:
st-duel32-duel RUNNING pid 1943, uptime 25 days, 8:51:01
st-duel32-duel2 RUNNING pid 1936, uptime 25 days, 8:51:01
st-idl2012-atf RUNNING pid 3459, uptime 17 days, 23:51:05
st-idl2012-ctf RUNNING pid 1937, uptime 25 days, 8:51:01
st-idl2012-privctf RUNNING pid 1944, uptime 25 days, 8:51:01
st-idl2012-scrimctf RUNNING pid 1942, uptime 25 days, 8:51:01
Of the six servers, the only one that was up was my "Attack the Flag" test, and it was only up for 17 days. This isn't the first time this has happened either. |
Steps To Reproduce | |
Additional Information | |
Tags | No tags attached. |
Relationships | |
Attached Files | |
|
Issue History |
Date Modified | Username | Field | Change |
2012-04-29 02:42 | AlexMax | New Issue | |
2012-04-29 02:46 | AlexMax | Note Added: 0003487 | |
2012-04-29 11:17 | DevilHunter | Note Added: 0003491 | |
2012-06-09 13:22 | Torr Samaho | Category | General => Bug |
2013-06-09 20:44 | jwaffe | Note Added: 0006410 | |
2013-06-09 20:45 | jwaffe | Note Edited: 0006410 | bug_revision_view_page.php?bugnote_id=6410#r3526 |
2013-06-09 20:45 | jwaffe | Note Edited: 0006410 | bug_revision_view_page.php?bugnote_id=6410#r3527 |
2013-06-22 21:55 | Konar6 | Note Added: 0006478 | |
2013-06-22 21:56 | Konar6 | Note Edited: 0006478 | bug_revision_view_page.php?bugnote_id=6478#r3566 |
2013-06-23 00:48 | Dusk | Status | new => confirmed |
2013-06-23 11:02 | Torr Samaho | Note Added: 0006480 | |
2013-06-23 16:33 | Edward-san | Note Added: 0006489 | |
2013-06-23 17:25 | jwaffe | Note Added: 0006490 | |
2013-06-23 19:48 | Torr Samaho | Note Added: 0006493 | |
2017-06-07 03:32 | Ru5tK1ng | Note Added: 0017808 | |
2018-01-21 20:26 | Torr Samaho | Note Added: 0019005 | |
2024-03-01 16:42 | Kaminsky | Note Added: 0023175 | |
2024-03-01 16:42 | Kaminsky | Assigned To | => Kaminsky |
2024-03-01 16:42 | Kaminsky | Status | confirmed => needs review |
2024-03-01 16:42 | Kaminsky | Target Version | => 3.2 |
2024-03-03 21:09 | Kaminsky | Note Added: 0023295 | |
2024-03-03 21:09 | Kaminsky | Status | needs review => needs testing |
2024-07-17 17:08 | Kaminsky | Note Added: 0023797 | |
2024-07-17 17:08 | Kaminsky | Status | needs testing => resolved |
2024-07-17 17:08 | Kaminsky | Fixed in Version | => 3.2 |
2024-07-17 17:08 | Kaminsky | Resolution | open => fixed |
Notes |
|
|
Attack the Flag is running 98e. Will let you know if it crashes too. |
|
|
|
I think Silvertear has the same issue with his servers
[10:40:11] [@silvertear] yeah they stop working after 25 days or so
[10:40:18] [@silvertear] happens to alexmax's server as well
After he restarted them, they worked just fine.
Haven't noticed this on Armada, but then again, when have any of those servers been up for at least a month lol |
|
|
(0006410)
|
jwaffe
|
2013-06-09 20:44
(edited on: 2013-06-09 20:45) |
|
I can confirm this behavior on my servers ([IFOC]). All of them go down at the same time, though I figured it was closer to 28 days. This has been happening as long as I have been hosting, even on Skulltag 98D.
When this happens to my servers, they disappear from the master server. I can still connect to them using -connect, but I get a bright yellow HOM screen. Killing and restarting the servers fixes all problems for another interval of around 28 days.
My server is running the 64-bit Linux build on Ubuntu Server.
|
|
|
(0006478)
|
Konar6
|
2013-06-22 21:55
(edited on: 2013-06-22 21:56) |
|
This problem is caused by an overflow in SERVER_Tick().
The variables that hold the millisecond timers use the LONG datatype, and thus overflow when the server has been running for 2,147,483,647 msec, or 24 days and ~20 hours.
However, the timer is provided by the SDL function SDL_GetTicks(), which itself only returns a Uint32, so I guess the fix won't be as easy as switching our variables to a 64-bit datatype (and switching to ULONG would only postpone the problem to 49 days?)
According to Firestone, he doesn't suffer from this problem on his Windows servers.
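The figures in this note can be checked with a little arithmetic. The following is only a sketch: it assumes LONG is a signed 32-bit integer (which the 2,147,483,647 figure implies), and the names DaysUntilWrap, kSignedWrapDays, kUnsignedWrapDays and CheckWrapEstimates are made up for illustration and are not taken from the Zandronum source.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

constexpr double kMsPerDay = 1000.0 * 60.0 * 60.0 * 24.0;

// Days of uptime before a millisecond counter with this maximum value wraps.
constexpr double DaysUntilWrap(double maxValueMs) {
    return maxValueMs / kMsPerDay;
}

// Assumption: LONG is a signed 32-bit integer, as the 2,147,483,647 figure implies.
constexpr double kSignedWrapDays =
    DaysUntilWrap(static_cast<double>(std::numeric_limits<std::int32_t>::max()));

// An unsigned 32-bit counter (SDL's Uint32, or a 32-bit ULONG) has twice the range.
constexpr double kUnsignedWrapDays =
    DaysUntilWrap(static_cast<double>(std::numeric_limits<std::uint32_t>::max()));

// Runtime sanity check of the two estimates.
inline void CheckWrapEstimates() {
    assert(kSignedWrapDays > 24.85 && kSignedWrapDays < 24.86);    // ~24 days, 20.5 hours
    assert(kUnsignedWrapDays > 49.71 && kUnsignedWrapDays < 49.72); // ~49 days, 17 hours
}
```

So a signed 32-bit millisecond timer wraps just short of 25 days, and an unsigned one just short of 50 days.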
|
|
|
|
If it's just an overflow problem in SERVER_Tick(), why aren't the Windows servers affected? |
|
|
|
|
|
(0006490)
|
jwaffe
|
2013-06-23 17:25
|
|
Ah... the timer rolls over and the server doesn't know what to do with packets from 49 days ago, or something similar I imagine.
It might not be easy to go from using the absolute value of the counter to using a relative value. I'm not sure how deeply the code depends on the raw return value rather than on the return value minus a reference point in time. |
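The "relative value" idea could look something like the sketch below. It is not the fix that was merged for this issue; ElapsedMs, IntervalElapsed and DemoWrap are hypothetical names. It relies only on the fact that unsigned subtraction in C++ is performed modulo 2^32, so the difference between two readings stays correct across a single rollover of the counter.

```cpp
#include <cassert>
#include <cstdint>

// Elapsed milliseconds between two readings of a wrapping 32-bit tick counter.
// Unsigned subtraction is modulo 2^32, so the result is correct across one wrap.
constexpr std::uint32_t ElapsedMs(std::uint32_t nowMs, std::uint32_t thenMs) {
    return nowMs - thenMs;
}

// Hypothetical helper: true once intervalMs has passed since startMs.
constexpr bool IntervalElapsed(std::uint32_t nowMs, std::uint32_t startMs,
                               std::uint32_t intervalMs) {
    return ElapsedMs(nowMs, startMs) >= intervalMs;
}

// Demo: one reading taken just before the wrap and one just after it.
inline void DemoWrap() {
    const std::uint32_t before = 0xFFFFFFFBu; // 5 ms before the counter wraps
    const std::uint32_t after = 5u;           // 5 ms after the wrap
    assert(ElapsedMs(after, before) == 10u);
    assert(IntervalElapsed(after, before, 10u));
}
```

The comparison only works for intervals shorter than one full counter period (about 49.7 days), and only if the stored timestamps are kept unsigned rather than converted to a signed type.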
|
|
|
Quote from Edward-san
... it should affect windows servers too, but only after 49 days
SERVER_Tick stores the return value of I_MSTime as LONG. So this should also overflow under Windows after 24 days. |
|
|
|
Is there any reason SERVER_Tick doesn't use ULONG? |
|
|
|
|
|
|
|
|
|
The merge request above got pushed into the default branch of the repository. |
|
|
|
[12:52 PM] Sean: most of my servers have been running for at least 93 days straight now
[12:54 PM] Sean: my 3.1 servers where I backported that fix are at 135 days
Since Sean's Blue Firestick servers, as of writing this, have been running for this long and still work fine, I'll mark this issue as resolved. |
|