Versioning

killall -9

Moderator: Developers

User avatar
AlexMax
Forum Regular
Posts: 244
Joined: Tue May 29, 2012 10:14 pm
Contact:

Versioning

#1

Post by AlexMax » Tue Jun 05, 2012 2:39 am

I think Zandronum's versioning should be pretty simple. Feature releases should increase the major version number and put the minor version at 0, bugfix releases should increase the minor version number. What do you think?
The only limit to my freedom is the inevitable closure of the
universe, as inevitable as your own last breath. And yet,
there remains time to create, to create, and escape.

Escape will make me God.

User avatar
The Toxic Avenger
Forum Staff
Posts: 1536
Joined: Fri May 25, 2012 1:12 am
Location: New Jersey
Clan: ???
Clan Tag: [???]
Contact:

RE: Versioning

#2

Post by The Toxic Avenger » Tue Jun 05, 2012 2:40 am

I like it. Having 1.0, 1.1, 1.2..., then the next major release be 2.0 sounds good.

Vordenko
 
Posts: 47
Joined: Mon Jun 04, 2012 5:16 am
Location: The Limbo

RE: Versioning

#3

Post by Vordenko » Tue Jun 05, 2012 3:30 am

Since it's still an alpha (is it?), I think it should start from 0.1, with 1.0 being the first beta.
Last edited by Vordenko on Tue Jun 05, 2012 3:32 am, edited 1 time in total.
InGame: <*>Sky<-=DriveR=-
Captain Ventris wrote: I shorten it to Zan. This means that if I ever for some reason host a ported Zan Zan, I can call it a Zan Zan Zan server.
Image

Blzut3
Developer
Posts: 308
Joined: Thu May 24, 2012 9:37 pm

RE: Versioning

#4

Post by Blzut3 » Tue Jun 05, 2012 4:14 am

AlexMax wrote: I think Zandronum's versioning should be pretty simple. Feature releases should increase the major version number and put the minor version at 0, bugfix releases should increase the minor version number. What do you think?
That was the plan. Major version would increase on ZDoom base upgrades and the minor version was what used to be the letter.

I kind of wonder if we should call the first release 1.4 (accounting for 98a-d) so that should we upgrade the ZDoom version after this release it doesn't seem like we're on an extremely rapid release.
Last edited by Blzut3 on Tue Jun 05, 2012 4:16 am, edited 1 time in total.

User avatar
Torr Samaho
Lead Developer
Posts: 1543
Joined: Fri May 25, 2012 6:03 pm
Location: Germany

RE: Versioning

#5

Post by Torr Samaho » Tue Jun 05, 2012 7:21 pm

I'm not completely sure when we should increase the major and when the minor version number. If we upgrade our base from ZDoom 2.3.1 (which is approximately what we have now) to 2.4.1 (which is a conservative but realistic goal for the version following Zandronum 1.0), increase the major version number for this and follow this strategy, we may end up having a much higher version number than ZDoom in the not too distant future. We could make it more like ZDoom and use two minor numbers.

Nubcakes
 
Posts: 29
Joined: Tue Jun 05, 2012 4:24 pm
Location: Outside your window with duct tape stretched out!

RE: Versioning

#6

Post by Nubcakes » Tue Jun 05, 2012 7:29 pm

Why not just call it Alpha RC1 and increase the number after RC for each update? Don't use decimals, just integers. Do the same for the beta and its release.

Examples: Zandronum Alpha RC5, Zandronum Beta RC31, Zandronum RC666

Xaser
 
Posts: 34
Joined: Mon Jun 04, 2012 11:19 pm
Location: 25 Miles From The Outskirts of Space-Dallas
Contact:

RE: Versioning

#7

Post by Xaser » Tue Jun 05, 2012 7:36 pm

Personal opinion, but I find the practice of sub-minor numbers a bit extraneous. It's like there's a phobia about incrementing major numbers, though I can see how a newcomer could perhaps try and compare the version numbers between Zan and ZDoom and conclude that one is more advanced than the other.

I'd file it under the "not really a big deal" category, though.

Guess I should ask, though: Is the official plan to do a major release for every big bump of ZDoom? That is, go from 2.3.1->2.4.1->2.5.0->eventual 2.6.0 and do a release each time? I'm being overly optimistic here, but once the engine is stable enough to allow porting from ZDoom, I'd hope that the process of porting features will speed up considerably, perhaps so that multiple ZDoom releases can be spanned in a single Zan release. Even with something like 2.3.1->2.4.1->2.6.0, that's one less major version bump to have to "worry" about... even though at that point, Zan would be 3.0. So whelp. :P

Ignore this if my lack of knowledge about the code is laughable. Thought dump presently ceases.

Blzut3
Developer
Posts: 308
Joined: Thu May 24, 2012 9:37 pm

RE: Versioning

#8

Post by Blzut3 » Tue Jun 05, 2012 8:52 pm

Torr Samaho wrote: I'm not completely sure when we should increase the major and when the minor version number. If we upgrade our base from ZDoom 2.3.1 (which is approximately what we have now) to 2.4.1 (which is a conservative but realistic goal for the version following Zandronum 1.0), increase the major version number for this and follow this strategy, we may end up having a much higher version number than ZDoom in the not too distant future. We could make it more like ZDoom and use two minor numbers.
The thing is, as many open source projects have come to realize (including the Linux kernel), the major version number, in a traditional sense, is no longer significant. ZDoom, and thus Zandronum, do not intend to ever break the API, so taking the traditional route of incrementing on API breakage would mean we'd be on 1.x forever. The solution is to drop the major version or use the date as the version number. However, if you're going to use the "rapid release" scheme then you need to use it now. Otherwise you end up like Firefox, where people think you did it purely for marketing reasons.

In terms of passing the ZDoom version number: we could avoid incrementing the version number on minor ZDoom upgrades (bug fixes, small random features). In general this would mean we would remain 2 behind ZDoom, unless we upgrade to midpoints between ZDoom releases, but honestly ZDoom needs to update more often anyway.

In fact, for Skulltag we pretty much, in practice, switched to the rapid release scheme. The website calls the latest version 98d and I don't even recall the last person to call it "0.98d".

Twister
Under Moderation
Posts: 49
Joined: Mon Jun 04, 2012 1:48 pm
Location: Next door
Contact:

RE: Versioning

#9

Post by Twister » Tue Jun 05, 2012 9:34 pm

I'd recommend we start at 99a, and once we are truly diverged from skulltag, we start at 100!

User avatar
Torr Samaho
Lead Developer
Posts: 1543
Joined: Fri May 25, 2012 6:03 pm
Location: Germany

RE: Versioning

#10

Post by Torr Samaho » Wed Jun 06, 2012 6:45 pm

Blzut3 wrote: The thing is, as many open source projects have come to realize (including the Linux kernel), the major version number, in a traditional sense, is no longer significant. ZDoom and thus Zandronum do not intend to ever break the API so if you take the traditional route of increment on API breakage would mean we'd be on 1.x forever.
Good argument. So far I've disliked the rapid release version numbering model Firefox and Chrome are using, most likely because I'm used to the more traditional model and the new one feels so inflationary. But your argument makes sense; there is no real need to fix the major revision for ages. So let's give the modern system a try.

VortexCortex
 
Posts: 27
Joined: Mon Jun 11, 2012 1:43 am
Location: Houston
Contact:

RE: Versioning

#11

Post by VortexCortex » Sat Jun 30, 2012 6:35 pm

There is a versioning problem in the network protocol. Specifically: The launcher protocol lacks proper versioning. This deficit prevents launchers from distinguishing between Skulltag and Zandronum packets. Currently, users must decide between the two...

Fortunately, backward compatible versioning can be added to the protocol by leveraging the nature of the Huffman compression algorithm and data format.

First, a little info about the current Huffman data format:
When data is Huffman compressed, the resulting bit string may not align on an 8-bit byte (octet) boundary. However, current computers operate atomically on octets, so some number of bits must be appended to the bit string to fill out any partial final octet. Any padding bits must be ignored by the decompression algorithm, otherwise additional unwanted data will be appended to the return buffer. Thus, a single octet is prefixed to the bit string which indicates the number of padding bits that have been appended to the bit string.

This means: the first octet of payload data in all of your packets (prior to decompression) is always one of: 0, 1, 2, 3, 4, 5, 6, 7 (zero through seven inclusive), or 255 (-1 as a signed char).

To add versioning to the protocol: always Huffman compress the data with newer implementations (any size increase due to encoding is of minimal impact). After Huffman compressing the data, increase the first octet's value by 8 (OR the first byte with 8). Additionally, append a single zeroed octet to the end of the data buffer.

The legacy Huffman decompression algorithms will read in the increased value and ignore the eight additional zeroed bits. New Zandronum aware programs can then check the first byte of data: If the first byte is less than 8 or equal to 255, then it's a legacy Skulltag packet, otherwise it's a Zandronum packet.

Since this change is backwards compatible with the current launcher protocols, Master, Servers, and Clients it can be rolled out incrementally i.e. not everything has to be updated all at once.

This method can be used to shoehorn up to 248 additional bits (31 octets) of metadata into the current network protocol. I advise that only one additional padding octet be used at this time, and that it must be zeroed; This is because both the extra "padding" bits and the upper bits in the prefix byte could be used in the future for additional versioning and/or protocol signaling as compression mode and handshaking meta-data; Especially if a new authentication system were needed...

I don't know of any plans you may already have to alleviate the issue, nor have I searched for them (sort of busy ATM). Apologies, if the issue has already been corrected. I (or someone else) can open a ticket, if the proposed solution sounds acceptable.
Last edited by VortexCortex on Sun Jul 01, 2012 7:41 am, edited 1 time in total.

User avatar
Torr Samaho
Lead Developer
Posts: 1543
Joined: Fri May 25, 2012 6:03 pm
Location: Germany

RE: Versioning

#12

Post by Torr Samaho » Sat Jun 30, 2012 8:08 pm

That sounds like an interesting idea (if I understood it correctly the appended zeroed octet would increase the size of each packet by 1 byte though, right?). Can you give some examples of things made possible by the added versioning? Currently we can distinguish between Skulltag and Zandronum servers by their version string and we could distinguish old launcher from Zandronum aware launchers by the MASTER_SERVER_VERSION part of the master-launcher protocol. We are not making use of the latter for this yet though.

VortexCortex
 
Posts: 27
Joined: Mon Jun 11, 2012 1:43 am
Location: Houston
Contact:

RE: Versioning

#13

Post by VortexCortex » Sun Jul 01, 2012 4:50 am

First off, in haste I failed to mention 255 as one of the possible first-byte values (which is why I even began listing them all out, d'oh! I've corrected the post). Also, I agree Doomseeker and other launchers could use the server version string to pick the appropriate binary to execute; however, it seems like a single-plugin solution would be required that handles both Zandronum and Skulltag servers, since the Master cannot currently provide versioning data to the launchers along with the server IP list. Say you put the same Master in both ST and Zand plugins: you'll get ST servers in your Zand list, and Zand servers in your ST listing... Is this the root of the current problem? (I haven't yet had the time to dissect it myself.) It would be optimal to have two completely separate plugins with no crosstalk; however, the reality is that the master server handles both, so at some level launchers will have to as well -- the earlier, the better.
Spoiler: For great justice. (Open)
The additional octet does increase the packet length, but it's inconsequential to the point of being statistically uninteresting, and it's a very small price to pay for backwards compatibility.

For the small launcher query and response packets, one octet makes no difference in terms of speed or packet splitting (frame fragmentation), and little difference in terms of bandwidth. For the larger response packets it only makes a difference in an extreme edge case where it would bump the packet above the MTU (maximum transmission unit) of an intervening link, causing a packet split; considering that such reply packets contain multiple variable-length string fields which already cause packet lengths to fluctuate in increments much larger than a single octet, the addition is statistically unnoticeable, since no attempt is made to determine the MTU of the intervening routers anyway...

Additionally, the extra bits can be excluded if desired once the endpoints have established they are both not using the legacy protocol -- Though it could be leveraged for much more, the octet is included at this stage only to provide full backwards compatibility in the event endpoint versions are not known ahead of time (which is the current situation) and to allow Launchers to quickly discern server type. I would advocate its inclusion into Zandronum even if the other more drastic changes are not yet implemented if only to differentiate Zandronum traffic from Skulltag's in a reliable way. The changes would be fairly minimal. If needed, I could schedule the time to create and test the necessary patches.. (..and compatibility with IDE... in a VM, Still closed source, ya?)

While it's true that there is versioning in the data format, this version data can only be read after decompressing the data and parsing the launcher protocol. Because of the lack of versioning in the transport encoding layer, the network protocol is needlessly married to the antiquated Huffman algorithm. Consider the above technique as also enabling encoding/compression versioning. This would allow you to change your compression/encoding to a more efficient algorithm, or to selectively drop it entirely if desired, and also provides backwards compatibility by detecting older implementations prior to decompression and parsing of the packet payload.

The Master does not supply any versioning information when it replies to a Launcher challenge with a server list... In other words: Doomseeker could know which plugin to roll the packet data over to by examining only a single byte. This would allow both Skulltag and Zandronum servers to continue to use the same Master server while opening the possibility for drastic changes to the packet encoding and internal data formats used (e.g. a different master server list reply format; IPv6 perhaps?). A different challenge value could provide such an option as well; however, due to the simplicity of the challenge ID system, anyone pointing the new Launcher at an older master wouldn't get a reply...

Take note: An extensible system would use the Challenge ID to indicate What you want, and a separate value to indicate the Version you want it in (hence my claim that there's really no versioning in the Launcher protocol). Asking for the same thing by using different challenges is like speaking a different language. Versioning exists to allow both old and new systems to speak the same base language. If a new system talks to a legacy system it shouldn't say: "Ba7wgMtypfb.yXuJ!!1!", instead it should say: "Hey, I'd like to talk, preferably in Zandronic, but Skulltagian is fine". Conversely, a new system replying to an older system should say, "Let me dumb this down for you: ..." Anyhow, it's a little late to re-engineer the base language to provide proper versioning fields, so we make do with what we've got, eh?

For instance, let's say you wanted to allow launchers to drop the Huffman encoding completely (a request I've seen a few times). Setting the bits of the first octet of data to ones (0xff) currently enables Launchers to send an uncompressed query to Masters and Servers; It signals no compression is in use, which is provided in case the Huffman encoding increased the length of the data. However, the Launchers must still implement Huffman decompression to read replies. Setting all but the least significant bit of the first octet to ones (0xfe) could signal to newer Masters and Servers that the Launcher only understands uncompressed responses. An initial octet of 0xfe tells the current Huffman decoding procedure: Subtract 254 bits from the length of the Huffman bit string; They are considered padding which is to be ignored. As long as the initial Launcher Challenge is kept below 31 bytes in length the legacy Huffman decompression implementations will return Zero length payloads, and thus will ignore Launchers that indicate they don't understand the Huffman compression.

The relevant code is:

Code: Select all

void HUFFMAN_Decode( unsigned char *in, unsigned char *out, int inlen, int *outlen )
{
	int bits,tbits;
	huffnode_t *tmp;	
	if (*in==0xff)
	{
		memcpy(out,in+1,inlen-1);
		*outlen=inlen-1;
		return;
	}
	tbits=(inlen-1)*8-*in;
	bits=0;
	*outlen=0;
	while (bits<tbits)
	{
		... // Decode the Huffman bit string.
Where "in" is a pointer into the start of the input char buffer, "tbits" is the number of Huffman compressed bits in the bit string, "inlen" is the input char buffer length in bytes, and "bits" is the number of bits decompressed:

Thus, if it's not a 0xff (plain) encoded packet, tbits is decreased by the value of the first byte, and can actually end up being zero or less, the while loop is skipped in this case, and "outlen" remains zero.


Currently, Launcher Challenge is a 32 bit value of 199 ordered with least significant byte first. There is a lack of versioning here. The same challenge ID is sent to Masters and Servers alike. Consider a decentralized system whereby Clients and Servers were also Launchers and Masters as well: If a Server wanted to query another Server or Client for its known list of Servers, it would need to signal: Hey, I want your "Master" list by sending it... what? The Launcher Challenge? The challenged server thus would not know whether or not they were expected to return a Master Server List, or the Server's own Information.

The additional protocol versioning at the top-most encoding level could be used to supply expected "Role" signals:
  • Consider this packet as if you were a Master Server.
  • Consider this packet as if you were a Game Server.
  • Consider this packet as if you understood XML.
  • Consider this packet as if you were an Authentication Server...
  • etc.
Since the Master does not currently provide the Server Version along with its IP addresses in the master server list the Launchers do not know what version of server they are querying ahead of time. With the above mentioned encoding layer versioning other types of signals could be encoded in the appended data. The additional data would be considered as padding and ignored by the legacy implementations, so it would allow Launchers to query Servers with a single packet in a backwards and forwards compatible fashion, while alleviating the burden of versioning from the Master Server List.

For instance, a new Launcher protocol could specify that Zandronum specific query data is appended to the Huffman encoded data, and the lower 3 bits of the first octet treated as the actual "padding" length. Within the suffixed data could exist encoding and/or authentication protocol negotiation data, which the legacy servers would ignore. These older implementations would not be confused by the additional data, and instead they would reply with the legacy launcher protocol; Newer implementations could reply using an updated protocol, different compression (that the launcher indicated they understood), a new and radically different data format including new data fields, or even different encoding types such as XML, JSON, etc. Newer Servers could use the lack of such versioning data to indicate the Launcher is a legacy launcher, and still respond in the older protocol (providing legacy support).

These are some of the versioning features that come off the top of my head -- I remember having spent some time engineering the initial byte values of the MASSTER auth system in such a way as to naturally take advantage of this "ignore the extra bits" system, and to enable alternate Huffman trees optimized for various game modes -- Invasion, Deathmatch, etc. -- as well as different compression trees/dictionaries on a per-WAD basis. In fact, I found a few full network logs and profiling stat tables I made back then for various game modes, but the exact protocol itself resides unsafely locked away in Carney's forum data... Though unlikely, I wonder if he could be reasoned with to give us a SQL dump of our data?

It's mainly the backwards compatibility angle that requires the mildly hackish approach (though I've seen much worse); we're lucky such a system is possible -- most protocols lacking pre-payload versioning have far less wiggle room.

User avatar
Wartorn
Retired Staff / Community Team Member
Posts: 254
Joined: Thu May 24, 2012 9:28 pm

RE: Versioning

#14

Post by Wartorn » Sun Jul 01, 2012 4:54 am

VortexCortex wrote: In fact, I found a few full network logs and profiling stat tables I made back then for various game modes, but the exact protocol itself resides unsafely locked away in Carney's forum data... Though unlikely, I wonder if he could be reasoned with to give us a SQL dump of our data?
I have a copy of the database, although it saw some damage from the post disappearance incident a few months ago, so there is a chance it could be lost. Tell me what to look for and I'll retrieve your protocol.
Last edited by Wartorn on Sun Jul 01, 2012 4:56 am, edited 1 time in total.

VortexCortex
 
Posts: 27
Joined: Mon Jun 11, 2012 1:43 am
Location: Houston
Contact:

RE: Versioning

#15

Post by VortexCortex » Sun Jul 01, 2012 7:14 am

Well, knowing me, if it's got bit fields (which I'm sure it did) then it's got a few ASCII "art" patterns like this:

Code: Select all

| | | |
Spoiler: E.G. (Open)

Code: Select all

REMAINING TREE DATA:

Decimal:        38,       155,       136,       176
Binary:  0010:0110, 1001:1011, 1000:1000, 1011:0000

28 bits of Huffman tree data:  0 0 1 001 1 010 0 1 101 1 100 0 1 000 1 011
                               | | | |_| | |_| | | |_| | |_| | | |_| | |_|
    [Branch]       B0 ---------' | |  |  |  |  | |  |  |  |  | |  |  |  |
    [Branch]       B1 -----------' |  |  |  |  | |  |  |  |  | |  |  |  |
    [Leaf]         L0 -------------'  |  |  |  | |  |  |  |  | |  |  |  |
    [3 bits]    ( 1 ) ----------------'  |  |  | |  |  |  |  | |  |  |  |
    [Leaf]         L1 -------------------'  |  | |  |  |  |  | |  |  |  |
    [3 bits]    ( 2 ) ----------------------'  | |  |  |  |  | |  |  |  |
    [Branch]       B2 -------------------------' |  |  |  |  | |  |  |  |
    [Leaf]         L2 ---------------------------'  |  |  |  | |  |  |  |
    [3 bits]    ( 5 ) ------------------------------'  |  |  | |  |  |  |
    [Leaf]         L3 ---------------------------------'  |  | |  |  |  |
    [3 bits]    ( 4 ) ------------------------------------'  | |  |  |  |
    [Branch]       B3 ---------------------------------------' |  |  |  |
    [Leaf]         L4 -----------------------------------------'  |  |  |
    [3 bits]    ( 0 ) --------------------------------------------'  |  |
    [Leaf]         L5 -----------------------------------------------'  |
    [3 bits]    ( 3 ) --------------------------------------------------'
or

Image

Those exact things won't be in the SQL dump, but something similar would be.
Cross referencing posts containing 'Huffman' and/or 'SkullCrusher' with 'VortexCortex' as the author may also yield the post(s).
Last edited by VortexCortex on Sun Jul 01, 2012 7:33 am, edited 1 time in total.

User avatar
Torr Samaho
Lead Developer
Posts: 1543
Joined: Fri May 25, 2012 6:03 pm
Location: Germany

RE: Versioning

#16

Post by Torr Samaho » Sun Jul 01, 2012 7:52 am

VortexCortex wrote: If needed, I could schedule the time to create and test the necessary patches.. (..and compatibility with IDE... in a VM, Still closed source, ya?)
The more you tell about it, the more useful your proposed versioning sounds :). If you have the time to create patches, that would be great!
Wartorn wrote: I have a copy of the database, although it has seen its damage from the post disappearance incident that happened a few months ago so there is a likely chance it could be lost. Tell me what to look for and I'll retrieve your protocol.
I think Eruanna still has a copy of the .net forum database. So if the post we are looking for was written on skulltag.net, it should be in there.

User avatar
Wartorn
Retired Staff / Community Team Member
Posts: 254
Joined: Thu May 24, 2012 9:28 pm

RE: Versioning

#17

Post by Wartorn » Sun Jul 01, 2012 8:53 pm

It's possible that the SQL dump wasn't correctly exported after some trial and error (I'm an idiot and I probably need help -- the database is fine), so as a result a lot of the example code might be damaged on the remote server where I installed the database to make my hunt easier. I'll keep tinkering with it to see if I can get the thing intact, but in the event I can't, I'll need someone's help to get this encoding thing working. In addition, there was an attachment named "Hardened Huffman", but dumps don't include that kind of thing. But is this what you're looking for?

[spoiler]I've been working on a replacement for the current Huffman encoding protocol.
My goal is to extend the current Huffman encoding protocol to include multiple encoding modes such as better compression and a non encoded mode while providing backwards compatibility.

I've named the encoding protocol "Skulltag Online Network Encoding Protocol" or "STONE Protocol" or StEP for short.
This corresponds to my implementation of the protocol: "Skulltag Online Network Encoding Layer 2" or "STONE Layer 2" or StL2.

Think of STONE Protocol as the underlying foundation that all other Skulltag network protocols are stacked on top of. The name is subject to change if anyone has a better or existing name to use.

Similar to posting an RFC: I'm posting the protocol here to get ideas, comments, and approval before finalizing the implementation details. I also wanted to demonstrate how the protocol achieves backwards compatibility with the current Huffman protocol.

Apologies in advance -- I have a tendency to be long winded, but did not spoiler anything in the interest of readability.
(No apologies to Rivecoder. He knew what he was getting into and should be used to this by now :P)

I wasn't sure if this was the correct place to post... Skulltag Protocol thread and Source Discussion seemed likely candidates as well. Please move if needed, thanks.


Legacy Encoding Protocol
It's important to note that the method of Huffman compression currently in use is actually a Data Encoding Protocol. The first byte of data is a signal that indicates how the following data bytes should be decoded.

A first byte of 0xFF, 255 (or -1 as a signed char) means the following bytes are not encoded.

Any other value signifies Huffman encoded data follows.
When Huffman encoding is used, the first byte value is equal to the number of bits that were appended to the last byte of data in order to pad the Huffman data out to a full byte.

For example, the number 5 would mean to disregard the last 5 bits of data in the last byte of the following data. Since a byte is 8 bits, there can logically only exist eight possible padding values for the first signal byte: 0, 1, 2, 3, 4, 5, 6, 7.

Current Huffman implementations operate on the signal byte and determine how many bits of data to read from the remaining data bytes via the following algorithm:

Code: Select all

prefix_byte_value = ... // get the first byte of packet data.
number_of_data_bytes = number_of_packet_bytes - 1; // Account for the protocol's "signal/padding" byte.
if ( prefix_byte_value == 0xFF )
{
    ... // Return the data bytes unchanged (not encoded mode)
}
total_data_bits = number_of_data_bytes * 8;
number_of_huffman_encoded_bits = total_data_bits - prefix_byte_value;
... // Huffman decode the number_of_huffman_encoded_bits from the data bytes.
This legacy implementation detail is crucial to providing backwards compatibility when using certain STONE Protocol features.
The actual code used in the older HUFFMAN_Decode function will be reviewed later.



STONE Protocol
The new protocol is based on the above protocol with backwards compatibility and minimal header (meta data) size being of utmost importance.

In the interest of backwards compatibility StEP sometimes stores meta data at the end of the packet as well as the beginning.

In the following tables the bit values and ranges inside of bytes are given as their "unshifted" values.

The Signal Byte
Similar to the Legacy Encoding, the first byte of a StEP packet holds special "signal" data.

In STONE Protocol the first byte is a bit field containing 3 independent pieces of data.
The bit positions and values within the signal byte have been carefully chosen to inherently provide backwards compatibility.

Code: Select all

Signal byte bit Values.  7 = Most significant bit
.--------------------------------------
|
|    7 6 5 4 3 2 1 0  
|    | | | | | | | |  
|    | | | | | x x x = Numeric value [0 to 7]
|    | | | | |         Number of Padding Bits (used with variable bit length encodings)
|    | | | | |         or Message Flags Field (used with Unencoded mode)
|    | | | | |
|    | | | | x ----- = Encoding Negotiation Flag
|    | | | |           1: An encoding negotiation byte has been appended after the last byte of data.
|    | | | |           0: No negotiation byte exists.
|    | | | |
|    x x x x ------- = Numeric value [0 to 15]
|                      Encoding Identification Signal
|
`---------------------------------------
Padding Bits Value
StEP uses the Padding Bits value in the same way that the Legacy Encoding protocol / implementation does.
The last byte of encoded data contains N padding bits which should be ignored.
Future encodings may have different usages for the padding bits.

Message Identification Flags
When composing a STONE Protocol "Unencoded" packet, "Padding Bits" are unnecessary.
Instead of letting those bits go to waste a Message Identification bit field (MID) is used.

Code: Select all

Message Identification Signal.  7 = Most significant bit
.--------------------------------------
|
|    7 6 5 4 3 2 1 0  
|    | | | | | | | |  
|    | | | | | | x x = Numeric Value [0 to 3]
|    | | | | | |       Error Code
|    | | | | | |       0: No error, a normal regular unencoded data message.
|    | | | | | |       1: Unsupported Encoding
|    | | | | | |          The previously received data encoding is not supported.
|    | | | | | |       2: Corrupt Data
|    | | | | | |          The previously received data did not decode properly.
|    | | | | | |       3: Connection Refused
|    | | | | | |          For some reason the connection was refused, and no further
|    | | | | | |          data should be sent (at least for a while)
|    | | | | | |
|    | | | | | x---- = Request for Protocol Negotiation
|    | | | | |         1: Request that a reply contain Encoding Negotiation.
|    | | | | |         0: Encoding negotiation not requested from the other party.
|    | | | | |
|    x x x x x------ = These bits are beyond the MID scope.
|
`---------------------------------------
Regardless of Error Code, it is beyond STONE Protocol's scope to determine the contents of the unencoded data.

Error code meta data can be used by higher level protocols to provide extended error messages,
re-request corrupted data or perform a negotiation handshake.

When the "Connection Refused" error code is received, it is up to the application to check this value and take action (or inaction).
StL2 provides getter methods for these fields.

It is forbidden to set the Request for Protocol Negotiation flag when the "Connection Refused" Error Code is in use.
(This would make no sense, and would also prevent Legacy Encoding Mode detection.)

In the STONE Layer 2 implementation, if a Request for Protocol Negotiation flag is received then the next reply packet will automatically be configured for Protocol Negotiation. This behavior can be disabled.
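To make the MID layout concrete, here is an illustrative decoding sketch; the enum and helper names are invented for this example (real StL2 getters may differ). It also checks the forbidden combination noted above:

```cpp
#include <cstdint>

// Illustrative MID decoding (enum and helper names invented here).
enum StepError : uint8_t {
    STEP_OK                   = 0, // normal unencoded data message
    STEP_UNSUPPORTED_ENCODING = 1,
    STEP_CORRUPT_DATA         = 2,
    STEP_CONNECTION_REFUSED   = 3
};

inline StepError midErrorCode(uint8_t sig)        { return StepError(sig & 0x03); }
inline bool      midWantsNegotiation(uint8_t sig) { return (sig & 0x04) != 0; }

// The forbidden combination: Connection Refused + negotiation request.
inline bool midIsValid(uint8_t sig)
{
    return !(midErrorCode(sig) == STEP_CONNECTION_REFUSED &&
             midWantsNegotiation(sig));
}
```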


Encoding Identification Signal
16 encoding values are possible.

Code: Select all

.------------------------------------------------------------------------------------------.
|     Encoding ID Value    |                 Description of Encoding Mode                  |
| Hex .  Binary  . Decimal |                                                               |
|-----|----------|---------|---------------------------------------------------------------|
|  0  |   0000   |     0   |  Legacy Huffman: Data is encoded using a Huffman Tree         |
|     |          |         |      that is compatible with the original huffman.cpp         |
|-----|----------|---------|---------------------------------------------------------------|
|  1  |   0001   |     1   |  Improved Huffman: Data is encoded using a newly calculated   |
|     |          |         |      Huffman tree, and bytes do not have reversed bit order   |
|-----|----------|---------|---------------------------------------------------------------|
|  2  |   0010   |     2   |  Skull Crusher: Data is compressed using a custom algorithm   |
|     |          |         |      that utilizes two Huffman trees and a sliding "window"   |
|-----|----------|---------|---------------------------------------------------------------|
| 0x3 |   0011   |     3   |                                                               |
|  to |    to    |   to    |     Available for future expansion of the STONE Protocol      |
| 0xE |   1110   |    14   |                                                               |
|-----|----------|---------|---------------------------------------------------------------|
|  F  |   1111   |    15   |  Unencoded Mode: data bytes use the 1:1 (identity) encoding,  |
|     |          |         |     a "no op" raw data mode also used for error code replies  |
`------------------------------------------------------------------------------------------'

Due to the backwards compatibility requirements the Signal byte's Encoding Identification Signal values do not match exactly to the Negotiation byte's Encoding Level values. However, conversion between the two values is possible.

Encoding Negotiation
If the Encoding Negotiation flag is set then the last byte of the packet contains a special bit field.
When Legacy mode is enabled the Encoding Negotiation flag is treated as if it were false.

The "Negotiation" bit field contains two independent pieces of data.

Code: Select all

Encoding Negotiation byte bit Values.  7 = Most significant bit
.--------------------------------------
|
|    7 6 5 4 3 2 1 0  
|    | | | | | | | |  
|    | | | | x x x x = Numeric value [0 to 15]
|    | | | |           Minimum Encoding Level value
|    | | | |  
|    x x x x ------- = Numeric value [0 to 15]
|                      Maximum Encoding Level value
|
`---------------------------------------
Having both minimum and maximum encoding levels instead of just a maximum supported is important for future encodings that may include encryption. In the event of encrypted protocol implementation an application may decide to specify a minimum encoding level that disallows all unencrypted encodings.
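A minimal sketch of packing and unpacking the Negotiation byte, with invented helper names:

```cpp
#include <cstdint>

// Illustrative Negotiation byte helpers (names invented here).
inline uint8_t packNegotiation(uint8_t minLevel, uint8_t maxLevel)
{
    return uint8_t(((maxLevel & 0x0F) << 4) | (minLevel & 0x0F));
}

inline uint8_t negotiationMin(uint8_t b) { return b & 0x0F; }
inline uint8_t negotiationMax(uint8_t b) { return (b >> 4) & 0x0F; }

// An offered Encoding Level is acceptable when it falls in [min, max];
// e.g. an encryption-aware peer could raise min to reject plaintext modes.
inline bool levelAcceptable(uint8_t b, uint8_t level)
{
    return level >= negotiationMin(b) && level <= negotiationMax(b);
}
```

packNegotiation(0, 2) produces the 0x20 byte used in the launcher example later in this post -- minimum Unencoded, maximum Improved Huffman.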


Encoding Level Values
16 encoding values are possible.

Code: Select all

.------------------------------------------------------------------------------------------.
|   Encoding Level Value   |                 Description of Encoding Mode                  |
| Hex .  Binary  . Decimal |                                                               |
|-----|----------|---------|---------------------------------------------------------------|
|  0  |   0000   |     0   |  Unencoded Mode: data bytes use the 1:1 (identity) encoding,  |
|     |          |         |     a "no op" raw data mode also used for error code replies  |
|-----|----------|---------|---------------------------------------------------------------|
|  1  |   0001   |     1   |  Legacy Huffman: Data is encoded using a Huffman Tree         |
|     |          |         |      that is compatible with the original huffman.cpp         |
|-----|----------|---------|---------------------------------------------------------------|
|  2  |   0010   |     2   |  Improved Huffman: Data is encoded using a newly calculated   |
|     |          |         |      Huffman tree, and bytes do not have reversed bit order   |
|-----|----------|---------|---------------------------------------------------------------|
|  3  |   0011   |     3   |  Skull Crusher: Data is compressed using a custom algorithm   |
|     |          |         |      that utilizes two Huffman trees and a sliding "window"   |
|-----|----------|---------|---------------------------------------------------------------|
| 0x4 |   0100   |     4   |                                                               |
|  to |    to    |   to    |     Available for future expansion of the STONE Protocol      |
| 0xF |   1111   |    15   |                                                               |
`------------------------------------------------------------------------------------------'
In the above table the encodings are listed in order of complexity.
This allows for simple comparisons in source code instead of using a complex look up table or switch statements.
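For instance, because the levels are ordinal, a single integer comparison can answer questions like "does this level compress at all?" (the enum names below are illustrative, with values taken from the table above):

```cpp
// Illustrative Encoding Level constants ordered by complexity
// (names invented here; values are from the table above).
enum EncodingLevel : unsigned {
    EL_UNENCODED        = 0,
    EL_LEGACY_HUFFMAN   = 1,
    EL_IMPROVED_HUFFMAN = 2,
    EL_SKULL_CRUSHER    = 3
};

// Because levels are ordinal, no lookup table or switch is needed:
inline bool isCompressed(unsigned level)
{
    return level >= EL_LEGACY_HUFFMAN; // anything above Unencoded compresses
}
```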

There should be a 1 to 1 mapping between Encoding ID and Encoding Level values.
Conversion between the Encoding ID (EID) and Encoding Level (EL) is possible.

Code: Select all

// From EID to EL
EL = (EID + 1) & 0xF;

// From EL to EID (avoids underflow for cross language independence)
EID = (EL == 0) ? 15 : EL - 1;

Receiving Legacy Encoded Data

In the STONE Protocol a Signal byte value of [0 to 7] will correspond to Legacy Huffman Encoding, and backwards compatibility is achieved inherently.

However, if a STONE Layer 2 implementation receives a packet of Legacy "Non Encoded" data, its Signal byte of 0xFF (255, or -1 signed) would correctly yield a StEP EID of "Unencoded Mode", but would incorrectly indicate that the Encoding Negotiation Flag is set and specify a "padding bits" value of 7.

Fortunately I have carefully designed the new protocol to never use a signal byte of 0xFF (255 or -1), therefore if such a signal byte is encountered Legacy Compatibility Mode is enabled.

When LCM is enabled STONE Layer 2 behaves exactly like the older protocol's huffman.cpp functions.
The Encoding Negotiation Flag is ignored, and response packets will be encoded via Legacy Unencoded mode or Legacy Huffman encoding.



Transmitting to Legacy Encoding Implementations
Simply select the "Legacy Huffman" encoding mode and send data.

Since the current Master / Server protocol only provides the IP:port of the servers there is no way to tell if you'll be talking to a STONE Protocol aware server or Legacy Huffman server.

For now a fail safe approach is to select "Legacy Huffman" encoding mode, set the Encoding Negotiation Flag, and append the Encoding Negotiation byte indicating your maximum and minimum desired encoding levels. (These details are gracefully handled by getter / setter method calls when using STONE Layer 2)

For example, let's say we're sending such a launcher query to a Skulltag server that's using Legacy Huffman encoding (but we don't know that yet...).

A Legacy launcher request may look something like this (in hex):
03 b8491a9c8b1506 - 8 bytes of data.
The 03 would indicate Huffman encoding with 3 bits of extra padding at the end.

Our compatible STONE Protocol launcher request would look like this:
0b b8491a9c8b1506 20 - 9 bytes of data.
Legacy Huffman mode = 0, along with Bit #3 set (0x08 = Negotiation Flag) and with 3 padding bits yields a Signal byte of 0x0b.
Our maximum desired encoding is 2 - Improved Huffman, and minimum understood encoding is 0 - Unencoded mode.

(Note: Encoding negotiation only adds 1 byte!)
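Putting the pieces together, here is a sketch of composing that 9-byte fail-safe request (buildSignal and buildRequest are hypothetical helper names, not StL2 API):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Illustrative composition of the 9-byte fail-safe request shown above.
// buildSignal / buildRequest are hypothetical names, not StL2 API.
inline uint8_t buildSignal(uint8_t eid, bool negotiate, uint8_t padBits)
{
    return uint8_t(((eid & 0x0F) << 4) |
                   (negotiate ? 0x08 : 0x00) |
                   (padBits & 0x07));
}

inline size_t buildRequest(const uint8_t *payload, size_t len,
                           uint8_t padBits, uint8_t *out)
{
    out[0] = buildSignal(0 /* Legacy Huffman EID */, true, padBits);
    std::memcpy(out + 1, payload, len);
    out[1 + len] = 0x20; // negotiation byte: max level 2, min level 0
    return len + 2;      // 1 signal byte + payload + 1 negotiation byte
}
```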

Now let's look at some of the actual code used in the old HUFFMAN_Decode to see how this will play out.
(I've been in here once or twice before :) )

Code: Select all

void HUFFMAN_Decode( unsigned char *in, unsigned char *out, int inlen, int *outlen )
{

    // sanity check for *outlen.
    if ( (*outlen < 1) ){
        *outlen = 0; // zero it in case it's some wild negative.
        return;
    }

    int bits,tbits,maxBits;
    huffnode_t *tmp;
    if (*in==0xff) // The first byte is 0x0b (11), so this won't happen.
    {
        // This block is only executed if StEP sends 255 when LCM is enabled.
        memcpy(...); 
        return;
    }

    // Ah... I just love the next error prone line of code.
    // I immediately realized it as "Magic", and left it just the way it is :)
    tbits=(inlen-1)*8-*in;

    // Here's what happens:
    // tbits = (9 - 1) * 8 - 11;
    //   inlen--^             ^--- this value is exactly 8 bits more than the 3 padding bits expected...
    // Remember that extra negotiation byte (8 bits) we appended?  It's now fully ignored!

    bits=0;
    // Calculate the max bits from output size.
    maxBits = *outlen << 3; // shift 3 faster than * 8 :)
    *outlen=0;

    while ((bits<tbits) && (bits<maxBits)) // added check.
    {
        // Omitted: decode bytes from the input bits.
        // This section gets skipped when StEP unencoded mode launcher queries arrive :)
    }
}
If you're wondering why this implementation looks different from the one referred to in the current Launcher protocol -- it's because the old one is insecure.

STONE Layer 1.2 or a hardened version of huffman.cpp should be used instead.
huffman.zip
As you can see, applications using the old huffman.cpp or STONE Layer 1.x code are compatible with this "fail safe" approach :nod:

STONE Protocol aware applications can easily take advantage of StEP's unencoded mode without even having a Huffman implementation.

Launcher Protocol queries are small enough that it is safe for StEP unencoded mode to be used with legacy implementations.
The older HUFFMAN_Decode() function will attempt to Huffman decode the packet (0xF0 != 0xFF), and in the process ignore up to 240 bits (30 bytes) worth of data, leaving tbits negative ( e.g. tbits = (8 - 1) * 8 - 240 = -184 for an 8-byte query ), so the decode loop never runs.

To older implementations it would be as if they had received a zero-length packet. A novice implementation using only unencoded mode wouldn't be able to decode a Huffman encoded reply from an older server anyway, so it's OK that older servers will not reply to the novice.

Newer servers using StEP / StL2 would correctly reply with unencoded data to the novice.

Important Note
Legacy Huffman allows for switching between unencoded and Huffman encodings at will. STONE Protocol does not!

In STONE Protocol it is unsafe to reply using a different encoding unless an Encoding Negotiation byte has been received and specifies that it is acceptable to switch protocols (or Legacy Compatibility Mode is enabled). Higher level protocols are free to define default minimum and maximum encoding levels that both ends must support, however STONE Protocol makes no assumptions.

It is currently beyond the scope of the STONE Protocol / STONE Layer 2 implementation to decide whether the Encoding Negotiation byte must be sent more than once per "session" since "session" state is maintained within higher level protocols.

It is also beyond the scope of STONE Protocol to determine the "default" encoding to use when sending data to an unknown server. A higher level protocol can use empty packets and STONE Protocol's encoding negotiation bytes to handshake and establish a common encoding mode if needed.

I think I just broke a record for the longest explanation of how 1 or 2 bytes of data can be used :-D[/spoiler]
Last edited by Blzut3 on Mon Jul 02, 2012 2:59 am, edited 1 time in total.

VortexCortex
 
Posts: 27
Joined: Mon Jun 11, 2012 1:43 am
Location: Houston
Contact:

RE: Versioning

#18

Post by VortexCortex » Sat Jul 07, 2012 3:30 pm

Ah! Yes, that's exactly it! Thanks so much!

I had forgotten all about the unencoded mode error code replies. Lots of the terse comments in the initial Perl implementation of the STONE Layer v2 code make sense now. :) Now I should be able to continue building the encoding & compression suite -- The aim was to create the transport encoding code licensed such that Doomseeker and others may also utilize it. Hmm, I guess another backronym is needed now...

[spoiler]Security Note:
The hardened huffman.cpp isn't needed so long as the buffer sizes are large enough to prevent the buffer overflow bug -- which is the fix Torr adopted, IIRC. The other option was to indicate the max buffer size by passing the value by pointer reference in the output buffer length parameter, and to modify the Huffman_Decode function to properly check buffer bounds. I haven't ever checked IDE for the exploit, but I know for sure that older clients/servers released prior to addressing this buffer overflow shouldn't be left Internet-facing, due to the DoS or remote code execution exploit the overflow enables (depending on OS and DEP). This exploit was addressed some years ago; I doubt any but extreme diehards are using versions that are vulnerable. Doomseeker is unaffected, IIRC.

In retrospect, this protocol versioning / negotiation may sound a bit more complex than some would like, but that's mostly just verbosity: it boils down to a prefix byte and an optional suffix byte, and enables implementations to choose the level of encoding to support -- i.e., launcher authors could opt out of compression initially, and adopt better compression as they got around to it. There's a problem with using a High / Low value for the Encoding IDs in that implementations would need to support intervening protocols. Perhaps a set of suffixed bit fields would be better... In light of "Roles" for decentralization support, the protocol would have to be modified anyhow. Another issue is that it requires some additional connection state to be maintained -- that is: most of the protocol can be handled via drop-in replacement of the Huffman_Xxx() functions, but some features, like negotiation, require pairing a few vars with each "connection" or IP/port pair. Shouldn't be a problem, just that a full implementation of the protocol touches several source files -- it would be best to proceed with a limited subset at first, IMHO.

The SkullCrusher algorithm mentioned relies on a small lump (~400b up to 64Kb) that contains a "dictionary" along with its two Huffman trees, to serve as a starting point for the sliding window compression. To build a good dictionary requires a network data profiler (which isn't yet complete) to be included in some Zandronum source. It looks like I can resurrect most of the progress. The profiler is needed to avoid having to capture full netlogs when generating dictionary lumps, and to benchmark the compression efficiency. The algorithm was designed to solve the issues inherent in using Gzip or other stream compression algorithms over UDP: the pre-defined dictionaries and trees reduce compression overhead, data can be received out of sequence, lost packets are tolerated, and the simple algorithm achieves decent compression on a per packet basis with minimal CPU usage over the current Huffman implementation.

The improved Huffman encoding in that protocol could probably be dropped; it's only about 5% to 10% better than the current Huffman at most things, and a few times faster too, but that's not the bottleneck -- mainly, the flexible Huffman tree implementation was a requirement for building the other compression modes.

For optimal compression, a different dictionary could be used for different default game modes (a dedicated Encoding ID each for Launchers, Invasion/Coop, CTF/Deathmatch, etc.) -- this is because some game modes have similar network data, but others have very different network data sets. An Encoding ID could also be designated to provide a general purpose "WAD Defined Compression". Modders would enable a "profiler" mode, then play online to generate the compression dictionary .LUMP -- I wasn't sure how that option would be presented, probably a new command line switch or cvar. Games played with WADs having the new lump present would enable the negotiation capability.

This lightweight per WAD defined optimal compression is the main reason why the above encoding layer versioning and negotiating protocol was designed -- Launchers could talk with a known default encoding (or none), and Clients / Servers could start off with a known encoding, then negotiate up to a more optimal compression mode for successive packets.[/spoiler]
I'll open a tracker ticket when next I have time so we can discuss any required changes, and to enable any desired future options -- to ensure extensibility, i.e. so we don't get painted into a corner.

Thanks again Wartorn!

User avatar
Wartorn
Retired Staff / Community Team Member
Posts: 254
Joined: Thu May 24, 2012 9:28 pm

RE: Versioning

#19

Post by Wartorn » Sat Jul 07, 2012 6:03 pm

Neato, glad we can make progress on this now :)

User avatar
Empyre
Zandrone
Posts: 1316
Joined: Sun Jul 08, 2012 6:41 am
Location: Garland, TX, USA

RE: Versioning

#20

Post by Empyre » Sun Jul 08, 2012 7:33 am

My thoughts on the version numbers: we should avoid using two minor version numbers, specifically to avoid making our version numbers look like ZDoom's, and the potential confusion that could result if our version number is someday higher than ZDoom's. Since Zandronum is a continuation of Skulltag, maybe the version should also continue the Skulltag numbering scheme, except perhaps dropping the "0." in front, so 98e would be followed by 99 plus a letter, then 100, 101, etc. The number would increment either when the ZDoom source is updated, or when many features are added, like when Skulltag's 0.97e got its name changed to 0.98a.
"For the world is hollow, and I have touched the sky."

Post Reply