
View Issue Details
ID: 0001268
Project: Zandronum
Category: [All Projects] Suggestion
View Status: public
Date Submitted: 2013-01-29 15:52
Last Update: 2016-04-10 19:27
Reporter: Watermelon
Assigned To:
Priority: none
Severity: tweak
Reproducibility: N/A
Status: new
Resolution: open
Platform:
OS:
OS Version:
Product Version:
Target Version:
Fixed in Version:
Summary: 0001268: Huffman optimization
Description: I've heard that the current Huffman table is not as optimized as it could be. (Note: someone should upload huffman.zip somewhere on Zandronum's main site, because it's currently linked from skulltag.com and no one knows how stable that site will be.)
Since Huffman is used to compress packets sent to the clients, optimizing this table may help in coop/major monster games (like ZDW), where people tend to time out due to bandwidth issues.

If the idea is considered, then to allow for backwards compatibility there could be an sv_optimizedhuffman cvar that tells the server to transmit data based on a new Huffman table (if one is made). As a basic brainstorm, a new table could be built by monitoring the raw bytes before Huffman encoding, storing their counts, and generating a new frequency table from that. It would be interesting to see whether there is a significant difference or not.

Another even further idea would be to allow for dynamic huffman tables, not that I expect that to actually be implemented.
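For illustration only, the brainstorm above (count raw pre-encoding bytes, then derive a table from the observed frequencies) could be sketched roughly like this; the sample bytes and function name are invented, and a real table would be built from actual captured traffic:

```python
import heapq
from collections import Counter

def huffman_code_lengths(freq):
    """Derive Huffman code lengths (bits per symbol) from a byte-frequency table."""
    # Heap entries: (weight, tiebreaker, symbols in this subtree).
    heap = [(w, i, [s]) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    depth = {s: 0 for s in freq}
    tie = len(heap)
    while len(heap) > 1:
        w1, _, syms1 = heapq.heappop(heap)
        w2, _, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:
            depth[s] += 1  # every merge pushes these symbols one level deeper
        heapq.heappush(heap, (w1 + w2, tie, syms1 + syms2))
        tie += 1
    return depth  # symbol -> code length in bits

# Hypothetical capture of raw bytes seen before Huffman encoding.
sample = b"example traffic captured before huffman encoding"
lengths = huffman_code_lengths(Counter(sample))
```

Frequent bytes end up with short codes and rare bytes with long ones; a table generated this way from real traffic could then be compared against the current static table.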
Attached Files

- Relationships
related to 0000803 (closed, TIHan): Reducing Monster Bandwidth Patch
related to 0000757 (new): Monster Bandwidth Mitigation

- Notes
(0005862)
Watermelon (developer)
2013-01-29 17:16
edited on: 2013-01-29 17:23

Sorry I made a typo: it says "... transmit the new huffman table", I meant "to transmit data based on a new huffman table"
[Fixed typo -Dusk]

(0009267)
Watermelon (developer)
2014-06-14 01:59

Code review on Sunday for this
(0009339)
Watermelon (developer)
2014-06-14 18:59
edited on: 2014-06-14 19:01

My brain is fried... what the hell?



1) Client sends request to server:

MY HUFFMAN:
Preparing to send [unencoded]: 0 50 46 48 45 97 108 112 104 97 0 112 97 115 115 119 111 114 100 0 5 191 48 50 98 51 99 102 56 53 97 49 97 101 100 57 55 57 52 55 101 53 49 98 52 54 100 54 98 102 53 49 56 99 0
Sending (out 56) [encoded]: 255 0 50 46 48 45 97 108 112 104 97 0 112 97 115 115 119 111 114 100 0 5 191 48 50 98 51 99 102 56 53 97 49 97 101 100 57 55 57 52 55 101 53 49 98 52 54 100 54 98 102 53 49 56 99 0


CURRENT HUFFMAN:
Preparing to send [unencoded]: 0 50 46 48 45 97 108 112 104 97 0 112 97 115 115 119 111 114 100 0 5 58 48 50 98 51 99 102 56 53 97 49 97 101 100 57 55 57 52 55 101 53 49 98 52 54 100 54 98 102 53 49 56 99 0
Sending (out 56) [encoded]: 255 0 50 46 48 45 97 108 112 104 97 0 112 97 115 115 119 111 114 100 0 5 58 48 50 98 51 99 102 56 53 97 49 97 101 100 57 55 57 52 55 101 53 49 98 52 54 100 54 98 102 53 49 56 99 0

=================================================================================================
**ALL GOOD SO FAR**
The unencoded has some slightly different bytes, unknown how significant this is, will test later
=================================================================================================


2) Server responds

MY HUFFMAN:
Server to client response
Reading (10 bytes) [encoded]: 7 35 73 242 96 238 161 180 100 0
Reading [decoded]: 3 0 0 0 0 0 77 65 80 48 49 0


CURRENT HUFFMAN:
Server to client response
Reading (10 bytes) [encoded]: 7 68 146 132 139 19 73 79 151 0
Reading [decoded]: 3 0 0 0 0 0 77 65 80 48 49 0

====================
**ALL GOOD SO FAR**
Both say in the console:
"Connected!"
"Authenticating level..."
====================



3) Client responds again:

MY HUFFMAN:
CLIENT:
Preparing to send [unencoded]: 1 0 97 102 54 50 48 101 53 100 54 51 50 54 100 97 100 56 54 99 56 55 54 97 49 99 98 97 101 99 49 57 97 100 0 53 54 56 48 48 54 50 51 49 98 51 50 100 52 48 50 57 52 49 100 50 50 100 54 99 100 48 56 52 51 51 54 0 57 54 49 97 97 102 52 48 55 49 57 101 55 48 53 48 56 57 98 102 102 55 101 100 49 48 98 101 52 57 100 101 0 97 51 99 98 98 102 48 99 56 100 52 48 48 53 57 56 50 50 50 49 100 102 52 102 97 102 53 49 53 50 52 55 0 0
Sending (out 136) [encoded]: 255 1 0 97 102 54 50 48 101 53 100 54 51 50 54 100 97 100 56 54 99 56 55 54 97 49 99 98 97 101 99 49 57 97 100 0 53 54 56 48 48 54 50 51 49 98 51 50 100 52 48 50 57 52 49 100 50 50 100 54 99 100 48 56 52 51 51 54 0 57 54 49 97 97 102 52 48 55 49 57 101 55 48 53 48 56 57 98 102 102 55 101 100 49 48 98 101 52 57 100 101 0 97 51 99 98 98 102 48 99 56 100 52 48 48 53 57 56 50 50 50 49 100 102 52 102 97 102 53 49 53 50 52 55 0 0

Reading (5 bytes) [encoded]: 1 227 148 228 20
/* BOOM client error */
// Seems to crash when this is decoded
// Decoding analysis on its own yields that my decoder translates it to: 3 1 0 0 0 1 0
// This is what is expected, so it seems to crash with the same code that it gets from the current huffman... which makes no sense

// Apparently sends out something here for some reason? Is this a disconnect?
Preparing to send [unencoded]: 4
Sending (out 2) [encoded]: 255 4



SERVER:
Preparing to send [unencoded]: 3 1 0 0 0 1 0
Sending (out 5) [encoded]: 1 227 148 228 20
// Shows that there is proof of it being sent compressed properly

Reading (2 bytes) [encoded]: 255 4
Reading [decoded]: 4
127.0.0.1:10667 disconnected.
// This is probably the quit message


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~




CURRENT HUFFMAN:
CLIENT:
Preparing to send [unencoded]: 1 0 97 102 54 50 48 101 53 100 54 51 50 54 100 97 100 56 54 99 56 55 54 97 49 99 98 97 101 99 49 57 97 100 0 53 54 56 48 48 54 50 51 49 98 51 50 100 52 48 50 57 52 49 100 50 50 100 54 99 100 48 56 52 51 51 54 0 57 54 49 97 97 102 52 48 55 49 57 101 55 48 53 48 56 57 98 102 102 55 101 100 49 48 98 101 52 57 100 101 0 97 51 99 98 98 102 48 99 56 100 52 48 48 53 57 56 50 50 50 49 100 102 52 102 97 102 53 49 53 50 52 55 0 0
Sending (out 136) [encoded]: 255 1 0 97 102 54 50 48 101 53 100 54 51 50 54 100 97 100 56 54 99 56 55 54 97 49 99 98 97 101 99 49 57 97 100 0 53 54 56 48 48 54 50 51 49 98 51 50 100 52 48 50 57 52 49 100 50 50 100 54 99 100 48 56 52 51 51 54 0 57 54 49 97 97 102 52 48 55 49 57 101 55 48 53 48 56 57 98 102 102 55 101 100 49 48 98 101 52 57 100 101 0 97 51 99 98 98 102 48 99 56 100 52 48 48 53 57 56 50 50 50 49 100 102 52 102 97 102 53 49 53 50 52 55 0 0

Reading (5 bytes) [encoded]: 3 100 151 180 11
Reading [decoded]: 3 1 0 0 0 1 0
Level authenticated!



SERVER:
Preparing to send [unencoded]: 3 1 0 0 0 1 0
Sending (out 5) [encoded]: 3 100 151 180 11


To be speculated later what is going on

(0009973)
AlexMax (developer)
2014-07-15 22:39

Instead of messing with huffman tables, why not simply use compression? I would try lz4: https://code.google.com/p/lz4/

Keep the existing huffman table stuff for backwards compatibility, and pick another magic number (0xfe?) to prefix lz4 compressed data with.

It would be interesting to see what kind of bandwidth savings it can result in.
(0010416)
TIHan (reporter)
2014-10-09 09:00

Wanted to chip in here. A while ago I swapped out Huffman for LZ4 compression in Zandronum to see what would happen. It doesn't help much at all.
(0010428)
Blzut3 (administrator)
2014-10-09 16:19

TIHan, I suppose the more important question that has yet to be answered is "does it hurt?" From a quick test that AlexMax did it looks like Zandronum may have some other issues preventing lz4 from providing the maximum benefit as is. For example, there is a reliable packet buffer which could probably be streamed rather than compressed individually. Surely lz4 should win out over huffman if that can work.
(0010429)
Watermelon (developer)
2014-10-09 16:24

Quote
For example, there is a reliable packet buffer which could probably be streamed rather than compressed individually. Surely lz4 should win out over huffman if that can work.


Could be wrong, but doesn't Zandronum do this? Like it waits until it reaches its max and then compresses and sends it?

Hopefully it doesn't compress each command it sends... that'd be bad.
(0010442)
Blzut3 (administrator)
2014-10-09 23:59

Watermelon, "streaming" means leaving the lz4 stream open instead of starting and closing for each packet. The advantage is that lz4 can look back at past packets and compress based on them. This obviously can't be done for unreliable packets as a stream requires that packets be read in order.
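To illustrate the difference Blzut3 describes, here is a sketch using zlib from the Python standard library as a stand-in (the same principle applies to an lz4 stream); the packet contents are invented:

```python
import zlib

# Hypothetical repetitive game packets; reliable-stream traffic is highly redundant.
packets = [b"SVC_MOVEPLAYER player=1 x=100 y=200" for _ in range(50)]

# Per-packet compression: each packet starts from an empty history window,
# so the compressor can never exploit redundancy across packets.
per_packet = sum(len(zlib.compress(p)) for p in packets)

# Streamed compression: one compressor kept open across packets, so later
# packets can back-reference earlier ones. Z_SYNC_FLUSH emits a complete,
# independently sendable chunk without closing the stream.
comp = zlib.compressobj()
streamed = 0
for p in packets:
    streamed += len(comp.compress(p) + comp.flush(zlib.Z_SYNC_FLUSH))

print(per_packet, streamed)  # the streamed total is far smaller
```

The catch, as noted below, is that a stream only decompresses correctly if the receiver processes the chunks in the order they were produced, which is why this only fits the reliable packet buffer.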
(0010455)
Torr Samaho (administrator)
2014-10-10 16:55

Using streaming is an awesome idea. The reliable packet buffer is responsible for most of the traffic anyway, so we should definitely look into this.
(0010457)
Watermelon (developer)
2014-10-10 17:23

I fully understand now.

I'm going to try to get results from this and see where we can go with it.
(0010858)
Watermelon (developer)
2014-11-11 03:10

If we do LZ4 and it works, do we even need the Huffman table? What about removing it from the launcher protocol?
(0010867)
Blzut3 (administrator)
2014-11-11 15:33

We would want the huffman code during a transition period. At least the decode part (encode can be bypassed by sending 0xFF as the first byte).
(0011321)
Dusk (developer)
2015-01-07 19:57

So I've taken a look into lz4 recently. It appears that in order to make it actually effective at optimizing network traffic, streams need to be used, and this requires that packets be decompressed in the same sequence the server sends them.

This in turn means that only the reliable stream can be stream-optimized, and packets must first be compressed and then numbered for the client packet buffer. The client must then first deal with the packet numbering and then decompress the packets. This is getting a little bit complicated.
(0011330)
Dusk (developer)
2015-01-08 11:01

I think lz4 is absolutely unfeasible to implement.

The streaming mode not only requires very carefully implemented decompression ordering, which already results in the code devolving into an undecipherable mess, but it also apparently requires that "Previous data blocks are assumed to still be present at their previous location." This means data needs to be cached, and we would also need to change packet-loss handling to store compressed data, etc.

AND EVEN WITH ALL THIS BUFFER MADNESS IT STILL REFUSES TO WORK.

I think we should look into some alternative compression methods that preferably would work without needing previous data.
(0012177)
Watermelon (developer)
2015-04-27 17:03
edited on: 2015-04-27 17:23

I've created a new Huffman encoder and tested it. It meets all the criteria from the dev channel discussion (heavy documentation, dynamic, very easy to set up, etc.), and it's smaller: this encoder is 2 files and around 400 source LOC, while the current encoder is 10 files and around 600 source LOC.

New huffman coder
1000 attempts at compressing 8192 byte packets took:
0.108723 sec encoding
0.001443 sec decoding
0.110166 seconds total

Old huffman coder:
1000 attempts at compressing 8192 byte packets took:
0.128971 sec encoding
0.124076 sec decoding
0.253047 seconds total

It appears that on average, my new encoder encodes about 15% faster and decodes roughly 99% faster (0.0014 s vs. 0.1241 s).


Now, if I put in full bounds checking for safety (if we ever want to support a limit past carnevil's massive 131072-byte buffer allocation), my decoding will be slightly slower (maybe 3-5%) -- however, I don't think the current implementation can support decompressing anything over 131072 bytes, since it assumes the buffer is always that size and will never change.


Therefore, we have a question:
- Do we want to assume we'll never decode over 131072 bytes to get a super blazing fast decompression?
- Do we never want to assume that and play it safe and keep approximately the same decompression speed we currently have (while allowing us an almost unlimited decompression size)?

EDIT: Just realized I might be able to improve the compression speed too by not worrying about buffer bounds, since I'm sure carn did the same thing with the encoding...

EDIT2: With my new method, if anyone sends corrupt packets, it will error out safely and we can check whether the packet data being sent from the client (or server) is corrupt. This shouldn't matter much in practice, though, since the chance of a corrupt packet still passing the CRC check is extremely low.

EDIT3: After a discussion with Blzut, the max compression we can get is 8:1, therefore our theoretical cap should be 65507 * 8, which we should never have to worry about since sending packets that big likely will have problems/fragmentation/who-knows-what when sending.
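The 8:1 figure follows because the shortest possible Huffman code is 1 bit, replacing an 8-bit input byte; so in the worst case every compressed bit decodes back to a full byte. A quick sketch of the resulting bound (65507 is the usual maximum UDP payload over IPv4):

```python
MAX_UDP_PAYLOAD = 65507  # bytes in a maximum-size UDP datagram over IPv4
MIN_CODE_BITS = 1        # shortest possible Huffman code
BITS_PER_BYTE = 8

# Worst case: every compressed bit expands back to one full byte on decode.
max_decoded = MAX_UDP_PAYLOAD * BITS_PER_BYTE // MIN_CODE_BITS
print(max_decoded)  # 524056
```

So any decode buffer of at least 524056 bytes would cover the theoretical cap, well above the current 131072-byte assumption.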

(0012181)
Torr Samaho (administrator)
2015-04-29 06:03

Is your implementation using the same table as the old one? If not, this would mean that servers using the new implementation (i.e. 3.x+) cannot tell clients using the old implementation (2.0) that their version is outdated, right?

If this is the case, we first have to decide on a new network header. We should only completely break backwards compatibility once. The new header surely needs to contain the packet number. From what I understood, this is a requirement to make an lz4 stream possible.
(0012185)
Watermelon (developer)
2015-04-29 15:23
edited on: 2015-04-29 15:29

Quote
Is your implementation using the same table as the old one? If not, this would mean that servers using the new implementation (i.e. 3.x+) cannot tell clients clients using the old implementation (2.0) that their version is outdated, right?

If needed, I can re-create the old huffman table. If we realize it's an old client, we can do the 0xFF + data trick so that even old clients will be able to decode the data -- which would allow us to bypass support for the old huffman. With the new protocol, I've so far set it up so that 0x80 is the marker that there is no compression, but I have to investigate whether the current huffman requires 0xFF.

We should in either case still be able to have 100% backwards compatibility if needed.

Quote
If this is the case, we first have to decide on a new network header. We should only completely break backwards compatibility once. The new header surely needs to contain the packet number. From what I understood, this is a requirement to make an lz4 stream possible.


Do you think it's worth still pursuing lz4? Dusk tried to get it working, and given the info in his post it looks like it would require such a huge overhaul of everything that it might not be worth it unless someone has an insane amount of time... plus the end result is unknown: would lz4 even give significant compression? It's possible, but I'm unsure if the gains would be worth tearing up the whole stable system we currently have in place.

===============================

Also, any suggestions on what you had in mind for the header?
Ex: Single byte?
Bitfields?

(0012193)
Torr Samaho (administrator)
2015-05-01 15:30

Quote from Watermelon
If we realize its an old client

How can we find out that it's an old client?
Quote from Watermelon
Do you think its worth pursuing lz4 still?

Yes, the stream should automatically adjust the compression to the data. Thus, it should be much more efficient than any fixed table and we don't have to reinvent the wheel, since lz4 is designed to compress network traffic.

If I understood Dusk correctly when we talked about lz4 last, one of the main problems was that we didn't have a proper network header that shows the packet number before decompressing the packet.

Quote from Watermelon

Also, any suggestions on what you had in mind for the header?
Ex: Single byte?
Bitfields?

Mandatory contents of the header are the compression type and the packet number (to keep track of lost packets). How exactly they are encoded and if additional things are needed is up for discussion.
(0012194)
Watermelon (developer)
2015-05-01 18:47
edited on: 2015-05-01 18:58

Quote
How can we find out that it's an old client?


From what I've seen, the current Huffman coding uses 0xFF for no encoding, and 0x00 through 0x07 as the residual number of bits in the header.

Therefore, if we get any of the combinations above, we know it's an old client and can decompress the old way (or send a version-out-of-date packet, etc.). Replying should be easy too: once we see it's an old client, we send the data with 0xFF as the header.

Any new clients will have bits 0x10, 0x20, or 0x40 set. This allows us to quickly tell if it's a new client or not by doing something like:

bool isNewClient = (header != 0xFF) && (!!(header & 0x70));
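Sketched out in full (the function name is invented; this just folds the old-protocol cases described above into one check):

```python
def classify(header: int) -> str:
    """Classify a packet by its first byte, per the header scheme above."""
    # Old protocol: 0xFF marks unencoded data, and 0x00-0x07 carry the
    # residual bit count of an old-Huffman-compressed packet.
    if header == 0xFF or header <= 0x07:
        return "old"
    # Proposed new protocol: at least one of bits 0x10, 0x20, 0x40 is set.
    if header & 0x70:
        return "new"
    return "unknown"

print(classify(0xFF), classify(0x03), classify(0x25))  # old old new
```

Anything that falls into neither bucket (e.g. 0x08) would be treated as corrupt or as a future protocol extension.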



Quote
Yes, the stream should automatically adjust the compression to the data. Thus, it should be much more efficient than any fixed table and we don't have to reinvent the wheel, since lz4 is designed to compress network traffic.

If I understood Dusk correctly when we talked about lz4 last, one of the main problems was that we didn't have a proper network header that shows the packet number before decompressing the packet.

Mandatory contents of the header are the compression type and the packet number (to keep track of lost packets). How exactly they are encoded and if additional things are needed is up for discussion.


Based on the above (since I'll be dealing only with the compression/decompression part until something is done on the lz4 end) it looks like the current method may work:


Old huffman support:
11111111 - Old huffman uncompressed
0000xxxx - Old huffman compressed
001xxxxx - New huffman compressed/uncompressed (uncompressed is based on 0x10)
01xxxxxx - Can indicate new compression types (ex: xxxxx could be 1 for LZ4, 2 for method A, 3 for method B...etc)

The logic would look something like this pseudocode:


// Let header = data[0]
bool oldClient = false; // Assume by default it's a new client.

if (header == 0xFF)
{
    oldClient = true;
    // Raw data follows: bytes 1 .. n are unencoded.
}
else if (header <= 7)
{
    oldClient = true;
    // oldHuffmanDecode(...)
}
else if (header & 0x20)
{
    // newHuffmanDecode(...)
}
else if (header & 0x40)
{
    switch (header & 0x3F)
    {
    case 0:  /* LZ4 */      break;
    case 1:  /* method X */ break;
    case 2:  /* method Y */ break;
    default: /* corrupt data detected */ break;
    }
}
else
{
    // corrupt data detected
}

NOTE: The above would need more sanity checks for corrupt packets, but it should give a general idea of where I'm going with this.


With this, we'd be able to keep everything in one header (though I dislike how messy it sort of looks, but we can thank carnevil for this), and we can support old and new clients. In addition, this easily allows support for many types of compression if in the future we find one that is better.

There is also one more bit left over in the event we need to somehow use two headers (ex: if only 0x80 is set, it can indicate that we need some newer protocol, and older implementations can error out if they only see 0x80).

(0012209)
Torr Samaho (administrator)
2015-05-07 20:07

Sounds like a good plan! FYI, I didn't check the header implementation of the old Huffman code, so I'll just assume that your analysis of it is right.

BTW: Sorry for the delayed answer, but for some reason I'm not getting email notification from the tracker anymore.

Issue Community Support
Supporters: Frits, Combinebobnt
Opponents: No one explicitly opposes this issue yet.

- Issue History
Date Modified Username Field Change
2013-01-29 15:52 Watermelon New Issue
2013-01-29 17:16 Watermelon Note Added: 0005862
2013-01-29 17:23 Dusk Description Updated
2013-01-29 17:23 Dusk Note Edited: 0005862
2014-06-13 16:03 Watermelon Assigned To => Watermelon
2014-06-13 16:03 Watermelon Status new => assigned
2014-06-13 18:05 Watermelon Relationship added related to 0000757
2014-06-13 18:05 Watermelon Relationship added related to 0000803
2014-06-14 01:59 Watermelon Note Added: 0009267
2014-06-14 01:59 Watermelon Status assigned => needs review
2014-06-14 17:22 Watermelon Status needs review => assigned
2014-06-14 18:59 Watermelon Note Added: 0009339
2014-06-14 19:01 Watermelon Note Edited: 0009339
2014-07-15 22:39 AlexMax Note Added: 0009973
2014-10-09 09:00 TIHan Note Added: 0010416
2014-10-09 16:19 Blzut3 Note Added: 0010428
2014-10-09 16:24 Watermelon Note Added: 0010429
2014-10-09 23:59 Blzut3 Note Added: 0010442
2014-10-10 16:55 Torr Samaho Note Added: 0010455
2014-10-10 17:23 Watermelon Note Added: 0010457
2014-11-11 03:10 Watermelon Note Added: 0010858
2014-11-11 15:33 Blzut3 Note Added: 0010867
2015-01-07 19:57 Dusk Note Added: 0011321
2015-01-07 19:57 Dusk Assigned To Watermelon => Dusk
2015-01-08 11:01 Dusk Note Added: 0011330
2015-01-08 11:01 Dusk Assigned To Dusk =>
2015-01-08 11:01 Dusk Status assigned => new
2015-04-27 17:03 Watermelon Note Added: 0012177
2015-04-27 17:03 Watermelon Assigned To => Watermelon
2015-04-27 17:03 Watermelon Status new => needs review
2015-04-27 17:06 Watermelon Note Edited: 0012177
2015-04-27 17:06 Watermelon Note Edited: 0012177
2015-04-27 17:07 Watermelon Note Edited: 0012177
2015-04-27 17:07 Watermelon Note Edited: 0012177
2015-04-27 17:10 Watermelon Note Edited: 0012177
2015-04-27 17:13 Watermelon Note Edited: 0012177
2015-04-27 17:15 Watermelon Note Edited: 0012177
2015-04-27 17:23 Watermelon Note Edited: 0012177
2015-04-29 06:03 Torr Samaho Note Added: 0012181
2015-04-29 15:23 Watermelon Note Added: 0012185
2015-04-29 15:29 Watermelon Note Edited: 0012185
2015-05-01 15:30 Torr Samaho Note Added: 0012193
2015-05-01 15:32 Torr Samaho Status needs review => feedback
2015-05-01 18:47 Watermelon Note Added: 0012194
2015-05-01 18:47 Watermelon Status feedback => assigned
2015-05-01 18:47 Watermelon Status assigned => needs review
2015-05-01 18:48 Watermelon Note Edited: 0012194
2015-05-01 18:51 Watermelon Note Edited: 0012194
2015-05-01 18:52 Watermelon Note Edited: 0012194
2015-05-01 18:54 Watermelon Note Edited: 0012194
2015-05-01 18:54 Watermelon Note Edited: 0012194
2015-05-01 18:57 Watermelon Note Edited: 0012194
2015-05-01 18:58 Watermelon Note Edited: 0012194
2015-05-07 20:07 Torr Samaho Note Added: 0012209
2015-05-07 20:07 Torr Samaho Status needs review => feedback
2016-04-10 19:25 Dusk Status feedback => new
2016-04-10 19:27 Dusk Assigned To Watermelon =>





