Versioning
I think Zandronum's versioning should be pretty simple. Feature releases should increase the major version number and reset the minor version to 0; bugfix releases should increase the minor version number. What do you think?
			
									
									The only limit to my freedom is the inevitable closure of the
universe, as inevitable as your own last breath. And yet,
there remains time to create, to create, and escape.
Escape will make me God.
- The Toxic Avenger
- Forum Staff
- Posts: 1536
- Joined: Fri May 25, 2012 1:12 am
- Location: New Jersey
- Clan: ???
- Clan Tag: [???]
- Contact:
RE: Versioning
I like it. Having 1.0, 1.1, 1.2..., then the next major release be 2.0 sounds good.
			
									
									
						RE: Versioning
Since it's still an alpha (is it?), I think it should start from 0.1, with 1.0 being the first beta.
			
													
					Last edited by Vordenko on Tue Jun 05, 2012 3:32 am, edited 1 time in total.
									
			
									InGame: <*>Sky<-=DriveR=-

						Captain Ventris wrote: I shorten it to Zan. This means that if I ever for some reason host a ported Zan Zan, I can call it a Zan Zan Zan server.
RE: Versioning
AlexMax wrote: I think Zandronum's versioning should be pretty simple. Feature releases should increase the major version number and put the minor version at 0, bugfix releases should increase the minor version number. What do you think?
That was the plan. The major version would increase on ZDoom base upgrades, and the minor version would be what used to be the letter.
I kind of wonder if we should call the first release 1.4 (accounting for 98a-d) so that, should we upgrade the ZDoom version after this release, it doesn't seem like we're on an extremely rapid release schedule.
					Last edited by Blzut3 on Tue Jun 05, 2012 4:16 am, edited 1 time in total.
									
			
									
						- Torr Samaho
- Lead Developer
- Posts: 1543
- Joined: Fri May 25, 2012 6:03 pm
- Location: Germany
RE: Versioning
I'm not completely sure when we should increase the major and when the minor version number. If we upgrade our base from ZDoom 2.3.1 (which is approximately what we have now) to 2.4.1 (which is a conservative but realistic goal for the version following Zandronum 1.0), increase the major version number for this and follow this strategy, we may end up having a much higher version number than ZDoom in the not too distant future. We could make it more like ZDoom and use two minor numbers.
			
									
									
- Nubcakes
- Posts: 29
- Joined: Tue Jun 05, 2012 4:24 pm
- Location: Outside your window with duct tape stretched out!
RE: Versioning
Why not just call it Alpha RC1 and increase the number after RC for each update? Don't use decimals, just integers. Do the same for the beta and the final release.
Examples: Zandronum Alpha RC5, Zandronum Beta RC31, Zandronum RC666
			
									
									
- Xaser
- Posts: 34
- Joined: Mon Jun 04, 2012 11:19 pm
- Location: 25 Miles From The Outskirts of Space-Dallas
- Contact:
RE: Versioning
Personal opinion, but I find the practice of sub-minor numbers a bit extraneous. It's like there's a phobia about incrementing major numbers, though I can see how a newcomer could perhaps try and compare the version numbers between Zan and ZDoom and conclude that one is more advanced than the other.
I'd file it under the "not really a big deal" category, though.
Guess I should ask, though: Is the official plan to do a major release for every big bump of ZDoom? That is, go from 2.3.1->2.4.1->2.5.0->eventual 2.6.0 and do a release each time? I'm being overly optimistic here, but once the engine is stable enough to allow porting from ZDoom, I'd hope that the process of porting features will speed up considerably, perhaps so that multiple ZDoom releases can be spanned in a single Zan release. Even with something like 2.3.1->2.4.1->2.6.0, that's one less major version bump to have to "worry" about... even though at that point, Zan would be 3.0. So whelp. :P
Ignore this if my lack of knowledge about the code is laughable. Thought dump presently ceases.
			
									
									
RE: Versioning
Torr Samaho wrote: I'm not completely sure when we should increase the major and when the minor version number. If we upgrade our base from ZDoom 2.3.1 (which is approximately what we have now) to 2.4.1 (which is a conservative but realistic goal for the version following Zandronum 1.0), increase the major version number for this and follow this strategy, we may end up having a much higher version number than ZDoom in the not too distant future. We could make it more like ZDoom and use two minor numbers.
The thing is, as many open source projects have come to realize (including the Linux kernel), the major version number, in a traditional sense, is no longer significant. ZDoom and thus Zandronum do not intend to ever break the API, so taking the traditional route of incrementing on API breakage would mean we'd be on 1.x forever. The solution is to drop the major version or use the date as the version number. However, if you're going to use the "rapid release" scheme then you need to use it now. Otherwise you end up like Firefox, where people think you did it purely for marketing reasons.
In terms of passing the ZDoom version number: we could choose not to increment the version number on minor ZDoom upgrades (bug fixes, small random features). In general this would mean we would remain 2 behind ZDoom, unless we upgrade to midpoints between ZDoom releases, but honestly ZDoom needs to update more anyways.
In fact, for Skulltag we pretty much, in practice, switched to the rapid release scheme. The website calls the latest version 98d and I don't even recall the last person to call it "0.98d".
RE: Versioning
I'd recommend we start at 99a, and once we have truly diverged from Skulltag, we start at 100!
			
									
									
						- Torr Samaho
- Lead Developer
- Posts: 1543
- Joined: Fri May 25, 2012 6:03 pm
- Location: Germany
RE: Versioning
Blzut3 wrote: The thing is, as many open source projects have come to realize (including the Linux kernel), the major version number, in a traditional sense, is no longer significant. ZDoom and thus Zandronum do not intend to ever break the API so if you take the traditional route of increment on API breakage would mean we'd be on 1.x forever.
Good argument. So far I have disliked the rapid release version numbering model Firefox and Chrome are using, most likely because I'm used to the more traditional model and the new one feels so inflationary. But your argument makes sense; there is no real need to keep the major revision fixed for ages. So let's give the modern system a try.
- VortexCortex
- Posts: 27
- Joined: Mon Jun 11, 2012 1:43 am
- Location: Houston
- Contact:
RE: Versioning
There is a versioning problem in the network protocol.  Specifically: The launcher protocol lacks proper versioning.  This deficit prevents launchers from distinguishing between Skulltag and Zandronum packets.  Currently, users must decide between the two...  
Fortunately, backward compatible versioning can be added to the protocol by leveraging the nature of the Huffman compression algorithm and data format.
First, a little info about the current Huffman data format:
When data is Huffman compressed, the resulting bit string may not align on an 8-bit byte (octet) boundary. However, current computers operate atomically on octets, so some number of bits must be appended to the bit string to fill out any partial final octet. Any padding bits must be ignored by the decompression algorithm; otherwise additional unwanted data will be appended to the return buffer. Thus, a single octet is prefixed to the bit string which indicates the number of padding bits that have been appended to the bit string.
This means: The first octet of payload data in all of your packets (prior to decompression) is always one of: 0, 1, 2, 3, 4, 5, 6, 7, or 255 (zero through seven inclusive, and 255, or -1 as a signed char).
To add versioning to the protocol: With newer implementations, always Huffman compress the data; any size increase due to encoding is of minimal impact. After Huffman compressing the data, increase the first octet's value by 8 (OR the first byte with 8). Additionally, append a single zeroed octet to the end of the data buffer.
The legacy Huffman decompression algorithms will read in the increased value and ignore the eight additional zeroed bits. New Zandronum aware programs can then check the first byte of data: If the first byte is less than 8 or equal to 255, then it's a legacy Skulltag packet, otherwise it's a Zandronum packet.
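A minimal C sketch of the tagging and detection steps described above. It assumes the packet has already been Huffman compressed, so its first octet is in the range 0 to 7. The names `tag_packet`, `classify_packet` and `packet_origin` are made up for illustration and do not come from the Zandronum or Skulltag source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum packet_origin { ORIGIN_LEGACY, ORIGIN_ZANDRONUM };

/* Tag an already Huffman-compressed packet in place.
 * buf holds len bytes and has room for cap bytes in total.
 * Returns the new length, or 0 if the packet cannot be tagged. */
size_t tag_packet(uint8_t *buf, size_t len, size_t cap)
{
    if (len == 0 || len + 1 > cap)
        return 0;
    if (buf[0] > 7)
        return 0;      /* only Huffman-coded packets (prefix 0..7) are tagged */
    buf[0] |= 8;       /* same as adding 8, because the prefix is below 8 */
    buf[len] = 0;      /* the extra zeroed octet the legacy decoder skips */
    return len + 1;
}

/* Apply the rule from the post: a first byte below 8 or equal to 255
 * is a legacy Skulltag packet, anything else is a Zandronum packet. */
enum packet_origin classify_packet(const uint8_t *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    uint8_t first = buf[0];
    return (first < 8 || first == 255) ? ORIGIN_LEGACY : ORIGIN_ZANDRONUM;
}
```

For a payload of n data bytes with p padding bits, a legacy decoder that subtracts the first octet from the total bit count sees (n + 1) * 8 - (p + 8) = n * 8 - p bits after tagging, which is the same payload as before.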
Since this change is backwards compatible with the current launcher protocols, Master, Servers, and Clients, it can be rolled out incrementally, i.e. not everything has to be updated all at once.
This method can be used to shoehorn up to 248 additional bits (31 octets) of metadata into the current network protocol. I advise that only one additional padding octet be used at this time, and that it must be zeroed. This is because both the extra "padding" bits and the upper bits in the prefix byte could be used in the future for additional versioning and/or protocol signaling such as compression mode and handshaking meta-data, especially if a new authentication system were needed...
I don't know of any plans you may already have to alleviate the issue, nor have I searched for them (sort of busy ATM). Apologies, if the issue has already been corrected. I (or someone else) can open a ticket, if the proposed solution sounds acceptable.
			
					Last edited by VortexCortex on Sun Jul 01, 2012 7:41 am, edited 1 time in total.
									
			
									
						- Torr Samaho
- Lead Developer
- Posts: 1543
- Joined: Fri May 25, 2012 6:03 pm
- Location: Germany
RE: Versioning
That sounds like an interesting idea (if I understood it correctly, the appended zeroed octet would increase the size of each packet by 1 byte though, right?). Can you give some examples of things made possible by the added versioning? Currently we can distinguish between Skulltag and Zandronum servers by their version string, and we could distinguish old launchers from Zandronum-aware launchers by the MASTER_SERVER_VERSION part of the master-launcher protocol. We are not making use of the latter for this yet though.
			
									
									
- VortexCortex
- Posts: 27
- Joined: Mon Jun 11, 2012 1:43 am
- Location: Houston
- Contact:
RE: Versioning
First off, in haste I failed to mention 255 as one of the possible first byte values (which is why I even began listing them all out, doh! I've corrected the post). Also, I agree Doomseeker and other launchers could use the server version string to pick the appropriate binary to execute. However, it seems like a single plugin solution would be required that handles both Zandronum and Skulltag servers, since the Master cannot currently provide versioning data to the Launchers along with the server IP list. Say you put the same Master in both ST & Zand plugins: you'll get ST servers in your Zand list, and Zand servers in your ST listing... Is this the root of the current problem? (I haven't yet had the time to dissect it myself.) It would be optimal to have two completely separate plugins with no crosstalk. However, the reality is that the master server handles both, so at some level Launchers will have to as well. The earlier, the better.
Spoiler: For great justice.
These are some of the versioning features that come off the top of my head. I remember having spent some time engineering the initial byte values of the MASSTER auth system in such a way as to naturally take advantage of this "ignore the extra bits" system, and enable alternate Huffman trees optimized for various game modes (Invasion, Deathmatch, etc.), as well as different compression trees/dictionaries on a per-WAD basis. In fact, I found a few full network logs and profiling stat tables I made back then for various game modes, but the exact protocol itself resides unsafely locked away in Carney's forum data... Though unlikely, I wonder if he could be reasoned with to give us a SQL dump of our data?
It's mainly the backwards compatibility angle that requires the mildly hackish approach (though I've seen much worse). We're lucky such a system is possible; most protocols lacking pre-payload versioning have far less wiggle room.
RE: Versioning
VortexCortex wrote: In fact, I found a few full network logs and profiling stat tables I made back then for various game modes, but the exact protocol itself resides unsafely locked away in Carney's forum data... Though unlikely, I wonder if he could be reasoned with to give us a SQL dump of our data?
I have a copy of the database, although it has seen its damage from the post disappearance incident that happened a few months ago, so there is a fair chance it has been lost. Tell me what to look for and I'll retrieve your protocol.
					Last edited by Wartorn on Sun Jul 01, 2012 4:56 am, edited 1 time in total.
									
			
									
- VortexCortex
- Posts: 27
- Joined: Mon Jun 11, 2012 1:43 am
- Location: Houston
- Contact:
RE: Versioning
Well, knowing me, if it's got bit fields (which I'm sure it did) then it's got a few ASCII "art" patterns like this:
			
Code: Select all
| | | |
Spoiler: E.G.
Cross referencing posts containing 'Huffman' and/or 'SkullCrusher' with 'VortexCortex' as the author may also yield the post(s).
					Last edited by VortexCortex on Sun Jul 01, 2012 7:33 am, edited 1 time in total.
									
			
									
						- Torr Samaho
- Lead Developer
- Posts: 1543
- Joined: Fri May 25, 2012 6:03 pm
- Location: Germany
RE: Versioning
VortexCortex wrote: If needed, I could schedule the time to create and test the necessary patches.. (..and compatibility with IDE... in a VM, Still closed source, ya?)
The more you tell about it, the more useful your proposed versioning sounds :). If you have the time to create patches, that would be great!
Wartorn wrote: I have a copy of the database, although it has seen its damage from the post disappearance incident that happened a few months ago so there is a likely chance it could be lost. Tell me what to look for and I'll retrieve your protocol.
I think Eruanna still has a copy of the .net forum database. So if the post we are looking for was written on skulltag.net, it should be in there.
RE: Versioning
It's possible that the SQL dump wasn't correctly exported. After some trial and error: I'm an idiot and I probably need help (the database is fine). As a result, a lot of the example code might be damaged on the remote server where I installed the database to make my hunt easier. I'll keep tinkering with it to see if I can get the thing intact, but in the event I can't, I'll need someone's help to get this encoding thing working. In addition, there was an attachment named "Hardened Huffman", but dumps don't include that kind of thing. But is this what you're looking for?
[spoiler]I've been working on a replacement for the current Huffman encoding protocol.
My goal is to extend the current Huffman encoding protocol to include multiple encoding modes such as better compression and a non encoded mode while providing backwards compatibility.
I've named the encoding protocol "Skulltag Online Network Encoding Protocol" or "STONE Protocol" or StEP for short.
This corresponds to my implementation of the protocol: "Skulltag Online Network Encoding Layer 2" or "STONE Layer 2" or StL2.
Think of STONE Protocol as the underlying foundation that all other Skulltag network protocols are stacked on top of. The name is subject to change if anyone has a better or existing name to use.
Similar to posting an RFC: I'm posting the protocol here to get ideas, comments, and approval before finalizing the implementation details. I also wanted to demonstrate how the protocol achieves backwards compatibility with the current Huffman protocol
Apologies in advance -- I have a tendency to be long winded, but did not spoiler anything in the interest of readability.
(No apologies to Rivecoder. He knew what he was getting into and should be used to this by now :P)
I wasn't sure if this was the correct place to post... Skulltag Protocol thread and Source Discussion seemed likely candidates as well. Please move if needed, thanks.
Legacy Encoding Protocol
It's important to note that the method of Huffman compression currently in use is actually a Data Encoding Protocol. The first byte of data is a signal that indicates how the following data bytes should be decoded.
A first byte of 0xFF, 255 (or -1 as a signed char) means the following bytes are not encoded.
Any other value signifies Huffman encoded data follows.
When Huffman encoding is used, the first byte's value is equal to the number of bits that were appended to the last byte of data in order to pad the Huffman data out to a full byte.
For example, the number 5 would mean to disregard the last 5 bits of data in the last byte of the following data. Since a byte is 8 bits, there can logically only exist eight possible padding values for the first signal byte: 0, 1, 2, 3, 4, 5, 6, 7.
Current Huffman implementations operate on the signal byte and determine how many bits of data to read from the remaining data bytes via the following algorithm:
This legacy implementation detail is crucial to providing backwards compatibility when using certain STONE Protocol features.
The actual code used in the older HUFFMAN_Decode function will be reviewed later.
STONE Protocol
The new protocol is based on the above protocol with backwards compatibility and minimal header (meta data) size being of utmost importance.
In the interest of backwards compatibility StEP sometimes stores meta data at the end of the packet as well as the beginning.
In the following tables the bit values and ranges inside of bytes are given as their "unshifted" values.
The Signal Byte
Similar to the Legacy Encoding, the first byte of a StEP packet holds special "signal" data.
In STONE Protocol the first byte is a bit field containing 3 independent pieces of data.
The bit positions and values within the signal byte have been carefully chosen to inherently provide backwards compatibility.
Padding Bits Value
StEP uses the Padding Bits value in the same way that the Legacy Encoding protocol / implementation does.
The last byte of encoded data contains N padding bits which should be ignored.
Future encodings may have different usages for the padding bits.
Message Identification Flags
When composing a STONE Protocol "Unencoded" packet, "Padding Bits" are unnecessary.
Instead of letting those bits go to waste a Message Identification bit field (MID) is used.
Regardless of Error Code, it is beyond STONE Protocol's scope to determine the contents of the unencoded data.
Error code meta data can be used by higher level protocols to provide extended error messages, re-request corrupted data or perform a negotiation handshake.
When the "Connection Refused" error code is received, it is up to the application to check this value and take action (or inaction).
StL2 provides getter methods for these fields.
It is forbidden to use the Request for Protocol Negotiation when the "Connection Refused" Error Code is in use.
(This would make no sense, and also prevent Legacy Encoding Mode detection)
In the STONE Layer 2 implementation, if a Request for Protocol Negotiation flag is received then the next reply packet will automatically be configured for Protocol Negotiation. This behavior can be disabled.
Encoding Identification Signal
16 encoding values are possible.
Due to the backwards compatibility requirements, the Signal byte's Encoding Identification Signal values do not match exactly to the Negotiation byte's Encoding Level values. However, conversion between the two values is possible.
Encoding Negotiation
If the Encoding Negotiation flag is set then the last byte of the packet contains a special bit field.
When Legacy mode is enabled the Encoding Negotiation flag is treated as if it were false.
The "Negotiation" bit field contains two independent pieces of data.
Having both minimum and maximum encoding levels instead of just a maximum supported is important for future encodings that may include encryption. In the event of encrypted protocol implementation an application may decide to specify a minimum encoding level that disallows all unencrypted encodings.
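One way the two ranges could be reconciled, sketched in C. The post does not say how peers settle on a level; picking the highest level that lies inside both ranges is an assumption made here, and `level_range` and `pick_common_level` are hypothetical names, not part of StL2.

```c
#include <assert.h>

/* Range of encoding levels one peer accepts (both ends inclusive). */
struct level_range {
    int min_level;
    int max_level;
};

/* Returns the highest level both peers accept,
 * or -1 when the two ranges do not overlap.
 * ASSUMPTION: "highest common level wins" is not stated in the post. */
int pick_common_level(struct level_range a, struct level_range b)
{
    assert(a.min_level <= a.max_level && b.min_level <= b.max_level);
    int lo = a.min_level > b.min_level ? a.min_level : b.min_level;
    int hi = a.max_level < b.max_level ? a.max_level : b.max_level;
    return lo <= hi ? hi : -1;
}
```

A peer that only accepts encrypted encodings would raise its `min_level`, so a peer whose `max_level` lies below that minimum gets -1 instead of a silent downgrade.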
Encoding Level Values
16 encoding values are possible.
In the above table the encodings are listed in order of complexity.
This allows for simple comparisons in source code instead of using a complex look up table or switch statements.
There should be a 1 to 1 mapping between Encoding ID and Encoding Level values.
Conversion between the Encoding ID (EID) and Encoding Level (EL) is possible.
Receiving Legacy Encoded Data
In the STONE Protocol a Signal byte value of [0 to 7] will correspond to Legacy Huffman Encoding, and backwards compatibility is achieved inherently.
However, if a STONE Layer 2 implementation receives a packet of Legacy "Non Encoded" data, a Signal Byte of 0xff (-1 or 255) would correctly supply a StEP EID of "Unencoded Mode", but incorrectly specify the Encoding Negotiation Flag being set, and also use a "padding bits" value of 7...
Fortunately I have carefully designed the new protocol to never use a signal byte of 0xFF (255 or -1), therefore if such a signal byte is encountered Legacy Compatibility Mode is enabled.
When LCM is enabled STONE Layer 2 behaves exactly like the older protocol's huffman.cpp functions.
The Encoding Negotiation Flag is ignored and response packets will be encoded via Legacy Unencoded mode or Legacy Huffman encoding.
Transmitting to Legacy Encoding Implementations
Simply select the "Legacy Huffman" encoding mode and send data.
Since the current Master / Server protocol only provides the IP:port of the servers there is no way to tell if you'll be talking to a STONE Protocol aware server or Legacy Huffman server.
For now a fail safe approach is to select "Legacy Huffman" encoding mode, set the Encoding Negotiation Flag, and append the Encoding Negotiation byte indicating your maximum and minimum desired encoding levels. (These details are gracefully handled by getter / setter method calls when using STONE Layer 2)
For example. Let's say we're sending such a launcher query to a Skulltag server that's using Legacy Huffman encoding (but we don't know that yet...)
A Legacy launcher request may look something like this (in hex):
03 b8491a9c8b1506 - 8 bytes of data.
The 03 would indicate Huffman encoding with 3 bits of extra padding at the end.
Our compatible STONE Protocol launcher request would look like this:
0b b8491a9c8b1506 20 - 9 bytes of data.
Legacy Huffman mode = 0, along with Bit #3 set (0x08 = Negotiation Flag) and with 3 padding bits yields a Signal byte of 0x0b.
Our maximum desired encoding is 2 - Improved Huffman, and minimum understood encoding is 0 - Unencoded mode.
(Note: Encoding negotiation only adds 1 byte!)
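The two extra bytes in this example can be reproduced with a few shifts. This is only a sketch: the signal byte layout follows the Signal byte description in this post (encoding ID in the upper four bits, Negotiation Flag 0x08, padding count in the lower three bits), while the nibble order of the negotiation byte (maximum level high, minimum level low) is inferred from the single 0x20 example and may not match the real StL2 code. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define STEP_NEGOTIATION_FLAG 0x08u /* bit 3 of the signal byte */

/* Build a signal byte from its three fields. */
uint8_t step_signal_byte(unsigned encoding_id, int negotiate,
                         unsigned padding_bits)
{
    assert(encoding_id <= 15 && padding_bits <= 7);
    return (uint8_t)((encoding_id << 4)
                     | (negotiate ? STEP_NEGOTIATION_FLAG : 0u)
                     | padding_bits);
}

/* Build the trailing negotiation byte.
 * ASSUMPTION: maximum level in the high nibble, minimum in the low nibble,
 * inferred from "max 2, min 0" being sent as 0x20. */
uint8_t step_negotiation_byte(unsigned max_level, unsigned min_level)
{
    assert(max_level <= 15 && min_level <= 15);
    return (uint8_t)((max_level << 4) | min_level);
}
```

With Legacy Huffman mode (0), the flag set and 3 padding bits, `step_signal_byte(0, 1, 3)` gives 0x0b, and `step_negotiation_byte(2, 0)` gives the trailing 0x20 from the example.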
Now let's look at some of the actual code used in the old HUFFMAN_Decode to see how this will play out.
(I've been in here once or twice before :) )
If you're wondering why this implementation looks different than the one referred to in the current Launcher protocol -- It's because the old one is insecure.
STONE Layer 1.2 or a hardened version of huffman.cpp should be used instead. As you can see, applications using the old huffman.cpp or STONE Layer 1.x code are compatible with this "fail safe" approach :nod:
STONE Protocol aware applications can easily take advantage of StEP's unencoded mode without even having a Huffman implementation.
Launcher Protocol queries are small enough that it is safe for StEP unencoded mode to be used with legacy implementations.
The older HUFFMAN_Decode() function will attempt to Huffman decode the packet (0xF0 != 0xFF), and in the process ignore up to 240 bits (30 bytes) worth of data. ( tbits = 9 - 240 = -231; )
To older implementations it would be as if they received a Zero length packet. A novice user using only unencoded mode wouldn't be able to decode a Huffman encoded reply from the older server, so it's OK that older servers will not reply to the novice.
Newer servers using StEP / StL2 would correctly reply with unencoded data to the novice.
Important Note
Legacy Huffman allows for switching between unencoded and Huffman encodings at will. STONE Protocol does not!
In STONE Protocol it is unsafe to reply using a different encoding unless an Encoding Negotiation byte has been received and specifies that it is acceptable to switch protocols (or Legacy Compatibility Mode is enabled). Higher level protocols are free to define default minimum and maximum encoding levels that both ends must support, however STONE Protocol makes no assumptions.
It is currently beyond the scope of the STONE Protocol / STONE Layer 2 implementation to decide whether the Encoding Negotiation byte must be sent more than once per "session" since "session" state is maintained within higher level protocols.
It is also beyond the scope of STONE Protocol to determine the "default" encoding to use when sending data to an unknown server. A higher level protocol can use empty packets and STONE Protocol's encoding negotiation bytes to handshake and establish a common encoding mode if needed.
I think I just broke a record for the longest explanation of how 1 or 2 bytes of data can be used :-D[/spoiler]
			
													[spoiler]I've been working on a replacement for the current Huffman encoding protocol.
My goal is to extend the current Huffman encoding protocol to include multiple encoding modes such as better compression and a non encoded mode while providing backwards compatibility.
I've named the encoding protocol "Skulltag Online Network Encoding Protocol" or "STONE Protocol" or StEP for short.
This corresponds to my implementation of the protocol: "Skulltag Online Network Encoding Layer 2" or "STONE Layer 2" or StL2.
Think of STONE Protocol as the underlying foundation that all other Skulltag network protocols are stacked on top of. The name is subject to change if anyone has a better or existing name to use.
Similar to posting an RFC: I'm posting the protocol here to get ideas, comments, and approval before finalizing the implementation details. I also wanted to demonstrate how the protocol achieves backwards compatibility with the current Huffman protocol
Apologies in advance -- I have a tendency to be long winded, but did not spoiler anything in the interest of readability.
(No apologies to Rivecoder. He knew what he was getting into and should be used to this by now :P)
I wasn't sure if this was the correct place to post... Skulltag Protocol thread and Source Discussion seemed likely candidates as well. Please move if needed, thanks.
Legacy Encoding Protocol
It's important to note that the method of Huffman compression currently in use is actually a Data Encoding Protocol. The first byte of data is a signal that indicates how the following data bytes should be decoded.
A first byte of 0xFF, 255 (or -1 as a signed char) means the following bytes are not encoded.
Any other value signifies Huffman encoded data follows.
When Huffman encoding is used, the first byte's value is equal to the number of bits that were appended to the last byte of data in order to pad the Huffman data out to a full byte.
For example, the number 5 would mean to disregard the last 5 bits of data in the last byte of the following data. Since a byte is 8 bits, there can logically only exist eight possible padding values for the first signal byte: 0, 1, 2, 3, 4, 5, 6, 7.
Current Huffman implementations operate on the signal byte and determine how many bits of data to read from the remaining data bytes via the following algorithm:
Code: Select all
prefix_byte_value = ... // get the first byte of packet data.
number_of_data_bytes = number_of_packet_bytes - 1; // Account for the protocol's "signal/padding" byte.
if ( prefix_byte_value == 0xFF )
{
    ... // Return the data bytes unchanged (not encoded mode)
}
total_data_bits = number_of_data_bytes * 8;
number_of_huffman_encoded_bits = total_data_bits - prefix_byte_value;
... // Huffman decode the number_of_huffman_encoded_bits from the data bytes.
The actual code used in the older HUFFMAN_Decode function will be reviewed later.
STONE Protocol
The new protocol is based on the above protocol with backwards compatibility and minimal header (meta data) size being of utmost importance.
In the interest of backwards compatibility StEP sometimes stores meta data at the end of the packet as well as the beginning.
In the following tables the bit values and ranges inside of bytes are given as their "unshifted" values.
The Signal Byte
Similar to the Legacy Encoding, the first byte of a StEP packet holds special "signal" data.
In STONE Protocol the first byte is a bit field containing 3 independent pieces of data.
The bit positions and values within the signal byte have been carefully chosen to inherently provide backwards compatibility.
Code: Select all
Signal byte bit Values.  7 = Most significant bit
.--------------------------------------
|
|    7 6 5 4 3 2 1 0  
|    | | | | | | | |  
|    | | | | | x x x = Numeric value [0 to 7]
|    | | | | |         Number of Padding Bits (used with variable bit length encodings)
|    | | | | |         or Message Flags Field (used with Unencoded mode)
|    | | | | |
|    | | | | x ----- = Encoding Negotiation Flag
|    | | | |           1: An encoding negotiation byte has been appended after the last byte of data.
|    | | | |           0: No negotiation byte exists.
|    | | | |
|    x x x x ------- = Numeric value [0 to 15]
|                      Encoding Identification Signal
|
`---------------------------------------
StEP uses the Padding Bits value in the same way that the Legacy Encoding protocol / implementation does.
The last byte of encoded data contains N padding bits which should be ignored.
Future encodings may have different usages for the padding bits.
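The signal byte layout above can be sketched as a pack / unpack pair. This is only a sketch under assumptions: the names `SignalByte`, `packSignal` and `unpackSignal` are made up for illustration and are not taken from STONE Layer 2. Only the bit positions come from the diagram.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names; only the bit layout comes from the signal byte diagram.
struct SignalByte {
    std::uint8_t encodingId;   // bits 7..4: Encoding Identification Signal, 0 to 15
    bool         negotiation;  // bit 3: Encoding Negotiation Flag
    std::uint8_t low3;         // bits 2..0: padding bits, or message flags in Unencoded mode
};

// Combine the three fields into one signal byte.
inline std::uint8_t packSignal(const SignalByte &s) {
    assert(s.encodingId <= 15 && s.low3 <= 7);
    return static_cast<std::uint8_t>((s.encodingId << 4) |
                                     (s.negotiation ? 0x08 : 0x00) |
                                     s.low3);
}

// Split a received signal byte back into its three fields.
inline SignalByte unpackSignal(std::uint8_t b) {
    return SignalByte{static_cast<std::uint8_t>(b >> 4),
                      (b & 0x08) != 0,
                      static_cast<std::uint8_t>(b & 0x07)};
}
```

With this layout, Legacy Huffman (ID 0) with the negotiation flag set and 3 padding bits packs to 0x0B.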
Message Identification Flags
When composing a STONE Protocol "Unencoded" packet, "Padding Bits" are unnecessary.
Instead of letting those bits go to waste a Message Identification bit field (MID) is used.
Code: Select all
Message Identification Signal.  7 = Most significant bit
.--------------------------------------
|
|    7 6 5 4 3 2 1 0  
|    | | | | | | | |  
|    | | | | | | x x = Numeric Value [0 to 3]
|    | | | | | |       Error Code
|    | | | | | |       0: No error, a normal regular unencoded data message.
|    | | | | | |       1: Unsupported Encoding
|    | | | | | |          The previously received data encoding is not supported.
|    | | | | | |       2: Corrupt Data
|    | | | | | |          The previously received data did not decode properly.
|    | | | | | |       3: Connection Refused
|    | | | | | |          For some reason the connection was refused, and no further
|    | | | | | |          data should be sent (at least for a while)
|    | | | | | |
|    | | | | | x---- = Request for Protocol Negotiation
|    | | | | |         1: Request that a reply contain Encoding Negotiation.
|    | | | | |         0: Encoding negotiation not requested from the other party.
|    | | | | |
|    x x x x x------ = These bits are beyond the MID scope.
|
`---------------------------------------
Error code meta data can be used by higher level protocols to provide extended error messages,
re-request corrupted data or perform a negotiation handshake.
When the "Connection Refused" error code is received, it is up to the application to check this value and take action (or inaction).
StL2 provides getter methods for these fields.
It is Forbidden to use the Request for Protocol Negotiation when the "Connection Refused" Error Code is in use.
(This would make no sense, and also prevent Legacy Encoding Mode detection)
In the STONE Layer 2 implementation, if a Request for Protocol Negotiation flag is received then the next reply packet will automatically be configured for Protocol Negotiation. This behavior can be disabled.
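As a rough illustration of the MID field, the following sketch decodes the error code and the request flag from an Unencoded mode signal byte and checks the forbidden combination. The names `StepError`, `MessageFlags`, `decodeMessageFlags` and `isValidMessageFlags` are hypothetical and are not the StL2 getters. Note that Connection Refused (3) plus the request flag (4) fills all three low bits; if the Encoding Negotiation Flag were also set, the signal byte would come out as 0xFF, which appears to be the collision the rule guards against.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names; the values follow the Message Identification diagram.
enum class StepError : std::uint8_t {
    None                = 0,
    UnsupportedEncoding = 1,
    CorruptData         = 2,
    ConnectionRefused   = 3
};

struct MessageFlags {
    StepError error;               // bits 1..0
    bool      requestNegotiation;  // bit 2
};

// Only meaningful when the Encoding ID (high nibble) is 0xF, Unencoded Mode.
inline MessageFlags decodeMessageFlags(std::uint8_t signal) {
    assert((signal >> 4) == 0x0F);
    return MessageFlags{static_cast<StepError>(signal & 0x03),
                        (signal & 0x04) != 0};
}

// "Connection Refused" must not be combined with a negotiation request.
inline bool isValidMessageFlags(const MessageFlags &f) {
    return !(f.error == StepError::ConnectionRefused && f.requestNegotiation);
}
```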
Encoding Identification Signal
16 encoding values are possible.
Code: Select all
.------------------------------------------------------------------------------------------.
|     Encoding ID Value    |                 Description of Encoding Mode                  |
| Hex .  Binary  . Decimal |                                                               |
|-----|----------|---------|---------------------------------------------------------------|
|  0  |   0000   |     0   |  Legacy Huffman: Data is encoded using a Huffman Tree         |
|     |          |         |      that is compatible with the original huffman.cpp         |
|-----|----------|---------|---------------------------------------------------------------|
|  1  |   0001   |     1   |  Improved Huffman: Data is encoded using a newly calculated   |
|     |          |         |      Huffman tree, and bytes do not have reversed bit order   |
|-----|----------|---------|---------------------------------------------------------------|
|  2  |   0010   |     2   |  Skull Crusher: Data is compressed using a custom algorithm   |
|     |          |         |      that utilizes two Huffman trees and a sliding "window"   |
|-----|----------|---------|---------------------------------------------------------------|
| 0x3 |   0011   |     3   |                                                               |
|  to |    to    |   to    |     Available for future expansion of the STONE Protocol      |
| 0xE |   1110   |    14   |                                                               |
|-----|----------|---------|---------------------------------------------------------------|
|  F  |   1111   |    15   |  Unencoded Mode: data bytes use the 1:1 (identity) encoding,  |
|     |          |         |     a "no op" raw data mode also used for error code replies  |
`------------------------------------------------------------------------------------------'
Encoding Negotiation
If the Encoding Negotiation flag is set then the last byte of the packet contains a special bit field.
When Legacy mode is enabled the Encoding Negotiation flag is treated as if it were false.
The "Negotiation" bit field contains two independent pieces of data.
Code: Select all
Encoding Negotiation byte bit Values.  7 = Most significant bit
.--------------------------------------
|
|    7 6 5 4 3 2 1 0  
|    | | | | | | | |  
|    | | | | x x x x = Numeric value [0 to 15]
|    | | | |           Minimum Encoding Level value
|    | | | |  
|    x x x x ------- = Numeric value [0 to 15]
|                      Maximum Encoding Level value
|
`---------------------------------------
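The negotiation byte can be handled the same way as the signal byte. A minimal sketch, assuming the hypothetical names `EncodingRange`, `packNegotiation` and `unpackNegotiation` (only the nibble positions come from the diagram):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names; maximum level in the high nibble, minimum in the low nibble.
struct EncodingRange {
    std::uint8_t minLevel;  // bits 3..0
    std::uint8_t maxLevel;  // bits 7..4
};

// Build the byte that is appended after the last data byte.
inline std::uint8_t packNegotiation(const EncodingRange &r) {
    assert(r.minLevel <= 15 && r.maxLevel <= 15);
    return static_cast<std::uint8_t>((r.maxLevel << 4) | r.minLevel);
}

// Read the range back out of a received negotiation byte.
inline EncodingRange unpackNegotiation(std::uint8_t b) {
    return EncodingRange{static_cast<std::uint8_t>(b & 0x0F),
                         static_cast<std::uint8_t>(b >> 4)};
}
```

A minimum of 0 and a maximum of 2 packs to the single byte 0x20.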
Encoding Level Values
16 encoding values are possible.
Code: Select all
.------------------------------------------------------------------------------------------.
|   Encoding Level Value   |                 Description of Encoding Mode                  |
| Hex .  Binary  . Decimal |                                                               |
|-----|----------|---------|---------------------------------------------------------------|
|  0  |   0000   |     0   |  Unencoded Mode: data bytes use the 1:1 (identity) encoding,  |
|     |          |         |     a "no op" raw data mode also used for error code replies  |
|-----|----------|---------|---------------------------------------------------------------|
|  1  |   0001   |     1   |  Legacy Huffman: Data is encoded using a Huffman Tree         |
|     |          |         |      that is compatible with the original huffman.cpp         |
|-----|----------|---------|---------------------------------------------------------------|
|  2  |   0010   |     2   |  Improved Huffman: Data is encoded using a newly calculated   |
|     |          |         |      Huffman tree, and bytes do not have reversed bit order   |
|-----|----------|---------|---------------------------------------------------------------|
|  3  |   0011   |     3   |  Skull Crusher: Data is compressed using a custom algorithm   |
|     |          |         |      that utilizes two Huffman trees and a sliding "window"   |
|-----|----------|---------|---------------------------------------------------------------|
| 0x4 |   0100   |     4   |                                                               |
|  to |    to    |   to    |     Available for future expansion of the STONE Protocol      |
| 0xF |   1111   |    15   |                                                               |
`------------------------------------------------------------------------------------------'
This allows for simple comparisons in source code instead of using a complex look up table or switch statements.
There should be a 1 to 1 mapping between Encoding ID and Encoding Level values.
Conversion between the Encoding ID (EID) and Encoding Level (EL) is possible.
Code: Select all
// From EID to EL
EL = (EID + 1) & 0xF; // parentheses only for readability; + already binds tighter than &
// From EL to EID (avoids underflow for cross language independence)
EID = (EL == 0) ? 15 : EL - 1;
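To show the "simple comparisons" the level ordering is meant to allow, here is a small sketch that wraps the two conversions in functions and adds a range check. The function names are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>

// Encoding ID -> Encoding Level: Unencoded (ID 15) wraps around to level 0.
inline std::uint8_t levelFromId(std::uint8_t eid) {
    assert(eid <= 15);
    return static_cast<std::uint8_t>((eid + 1) & 0x0F);
}

// Encoding Level -> Encoding ID, written without relying on unsigned underflow.
inline std::uint8_t idFromLevel(std::uint8_t el) {
    assert(el <= 15);
    return static_cast<std::uint8_t>(el == 0 ? 15 : el - 1);
}

// Because levels are ordered, "is this encoding acceptable?" is two comparisons.
inline bool levelInRange(std::uint8_t level, std::uint8_t minLevel, std::uint8_t maxLevel) {
    return level >= minLevel && level <= maxLevel;
}
```

For example, Improved Huffman (ID 1, level 2) falls inside a negotiated range of 0 to 2, while Skull Crusher (ID 2, level 3) does not.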
Receiving Legacy Encoded Data
In the STONE Protocol a Signal byte value of [0 to 7] will correspond to Legacy Huffman Encoding, and backwards compatibility is achieved inherently.
However, if a STONE Layer 2 implementation receives a packet of Legacy "Non Encoded" data, its Signal Byte of 0xFF (255, or -1 as a signed char) would correctly yield a StEP EID of "Unencoded Mode", but would also incorrectly indicate that the Encoding Negotiation Flag is set and that the "padding bits" value is 7...
Fortunately I have carefully designed the new protocol to never use a signal byte of 0xFF (255 or -1), therefore if such a signal byte is encountered Legacy Compatibility Mode is enabled.
When LCM is enabled STONE Layer 2 behaves exactly like the older protocol's huffman.cpp functions.
The Encoding Negotiation Flag is ignored and response packets will be encoded via Legacy Unencoded mode or Legacy Huffman encoding.
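The receive-side decision described above might look roughly like this. It is a sketch only; `WireMode`, `classifySignal` and `legacyPaddingBits` are hypothetical names, and the real STONE Layer 2 code may organise this differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification of an incoming packet by its first byte.
enum class WireMode {
    LegacyUnencoded,  // 0xFF: never produced by StEP, so Legacy Compatibility Mode is enabled
    LegacyHuffman,    // 0x00..0x07: the same bytes mean the same thing in both protocols
    Step              // anything else is interpreted as a StEP signal byte
};

inline WireMode classifySignal(std::uint8_t signal) {
    if (signal == 0xFF) return WireMode::LegacyUnencoded;
    if (signal <= 0x07) return WireMode::LegacyHuffman;
    return WireMode::Step;
}

// For a Legacy Huffman packet the signal byte is simply the padding bit count.
inline std::uint8_t legacyPaddingBits(std::uint8_t signal) {
    assert(signal <= 0x07);
    return signal;
}
```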
Transmitting to Legacy Encoding Implementations
Simply select the "Legacy Huffman" encoding mode and send data.
Since the current Master / Server protocol only provides the IP:port of the servers there is no way to tell if you'll be talking to a STONE Protocol aware server or Legacy Huffman server.
For now a fail safe approach is to select "Legacy Huffman" encoding mode, set the Encoding Negotiation Flag, and append the Encoding Negotiation byte indicating your maximum and minimum desired encoding levels. (These details are gracefully handled by getter / setter method calls when using STONE Layer 2)
For example, let's say we're sending such a launcher query to a Skulltag server that's using Legacy Huffman encoding (but we don't know that yet...)
A Legacy launcher request may look something like this (in hex):
03 b8491a9c8b1506 - 8 bytes of data.
The 03 would indicate Huffman encoding with 3 bits of extra padding at the end.
Our compatible STONE Protocol launcher request would look like this:
0b b8491a9c8b1506 20 - 9 bytes of data.
Legacy Huffman mode = 0, along with Bit #3 set (0x08 = Negotiation Flag) and with 3 padding bits yields a Signal byte of 0x0b.
Our maximum desired encoding is 2 - Improved Huffman, and minimum understood encoding is 0 - Unencoded mode.
(Note: Encoding negotiation only adds 1 byte!)
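The worked example can be reproduced with a few lines of code. This is a sketch under assumptions: `wrapForUnknownPeer` and `legacyBitCount` are names invented here, and the Huffman encoded payload is taken as given. The second function applies the legacy decoder's bit count formula, (packet length - 1) * 8 - first byte, to show that both packets yield the same number of Huffman bits.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: prefix a Legacy Huffman payload with a StEP signal byte
// (Encoding ID 0, negotiation flag set) and append the negotiation byte.
std::vector<std::uint8_t> wrapForUnknownPeer(const std::vector<std::uint8_t> &huffmanData,
                                             std::uint8_t paddingBits,
                                             std::uint8_t minLevel,
                                             std::uint8_t maxLevel) {
    assert(paddingBits <= 7 && minLevel <= 15 && maxLevel <= 15);
    std::vector<std::uint8_t> packet;
    packet.reserve(huffmanData.size() + 2);
    packet.push_back(static_cast<std::uint8_t>(0x08 | paddingBits));          // signal byte
    packet.insert(packet.end(), huffmanData.begin(), huffmanData.end());      // payload
    packet.push_back(static_cast<std::uint8_t>((maxLevel << 4) | minLevel));  // negotiation byte
    return packet;
}

// Number of Huffman bits a legacy decoder reads from a packet of inlen bytes.
inline int legacyBitCount(int inlen, std::uint8_t firstByte) {
    return (inlen - 1) * 8 - firstByte;
}
```

For the payload b8 49 1a 9c 8b 15 06 with 3 padding bits, the 9 byte StEP packet starts with 0x0b and ends with 0x20, and a legacy decoder computes 53 bits for it, the same as for the 8 byte legacy packet starting with 0x03.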
Now let's look at some of the actual code used in the old HUFFMAN_Decode to see how this will play out.
(I've been in here once or twice before :) )
Code: Select all
void HUFFMAN_Decode( unsigned char *in, unsigned char *out, int inlen, int *outlen )
{
    // sanity check for *outlen.
    if ( (*outlen < 1) ){
        *outlen = 0; // zero it incase it's some wild negative.
        return;
    }
    int bits,tbits,maxBits;
    huffnode_t *tmp;
    if (*in==0xff) // The first byte is 11, so this won't happen.
    {
        // This block is only executed if StEP sends 255 when LCM is enabled.
        memcpy(...); 
        return;
    }
    // Ah... I just love the next error prone line of code.
    // I immediately realized it as "Magic", and left it just the way it is :)
    tbits=(inlen-1)*8-*in;
    // Here's what happens:
    // tbits = (9 - 1) * 8 - 11;
    //   inlen--^             ^--- this value is exactly 8 bits more than the 3 padding bits expected...
    // Remember that extra negotiation byte (8 bits) we appended?  It's now fully ignored!
    bits=0;
    // Calculate the max bits from output size.
    maxBits = *outlen << 3; // shift 3 faster than * 8 :)
    *outlen=0;
    while ((bits<tbits) && (bits<maxBits)) // added check.
    {
        // Omitted: decode bytes from the input bits.
        // This section gets skipped when StEP unencoded mode launcher queries arrive :)
    }
}
STONE Layer 1.2 or a hardened version of huffman.cpp should be used instead. As you can see, applications using the old huffman.cpp or STONE Layer 1.x code are compatible with this "fail safe" approach :nod:
STONE Protocol aware applications can easily take advantage of StEP's unencoded mode without even having a Huffman implementation.
Launcher Protocol queries are small enough that it is safe for StEP unencoded mode to be used with legacy implementations.
The older HUFFMAN_Decode() function will attempt to Huffman decode the packet (0xF0 != 0xFF), and in the process ignore up to 240 bits (30 bytes) worth of data. ( For a 9 byte packet: tbits = (9 - 1) * 8 - 240 = -176; )
To older implementations it would be as if they received a Zero length packet. A novice user using only unencoded mode wouldn't be able to decode a Huffman encoded reply from the older server, so it's OK that older servers will not reply to the novice.
Newer servers using StEP / StL2 would correctly reply with unencoded data to the novice.
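To make the "up to 240 bits" limit concrete, this sketch applies the legacy bit count formula to a StEP unencoded packet (signal byte 0xF0) and reports whether the old decode loop would run at all. The function names are made up for this example; the formula is the `tbits` line from HUFFMAN_Decode.

```cpp
#include <cassert>
#include <cstdint>

// tbits as computed by the legacy HUFFMAN_Decode for a packet of inlen bytes.
inline int legacyBitCount(int inlen, std::uint8_t firstByte) {
    assert(inlen >= 1);
    return (inlen - 1) * 8 - firstByte;
}

// The legacy loop runs while bits < tbits with bits starting at 0, so a packet
// whose first byte is not 0xFF and whose tbits is <= 0 decodes to nothing.
inline bool legacyDecoderSkipsPacket(int inlen, std::uint8_t firstByte) {
    return firstByte != 0xFF && legacyBitCount(inlen, firstByte) <= 0;
}
```

With a signal byte of 0xF0, any packet carrying 30 or fewer bytes after the signal byte is skipped; at 31 bytes the legacy decoder would start reading the data as Huffman bits, so this trick is only safe for small packets.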
Important Note
Legacy Huffman allows for switching between unencoded and Huffman encodings at will. STONE Protocol does not!
In STONE Protocol it is unsafe to reply using a different encoding unless an Encoding Negotiation byte has been received and specifies that it is acceptable to switch protocols (or Legacy Compatibility Mode is enabled). Higher level protocols are free to define default minimum and maximum encoding levels that both ends must support, however STONE Protocol makes no assumptions.
It is currently beyond the scope of the STONE Protocol / STONE Layer 2 implementation to decide whether the Encoding Negotiation byte must be sent more than once per "session" since "session" state is maintained within higher level protocols.
It is also beyond the scope of STONE Protocol to determine the "default" encoding to use when sending data to an unknown server. A higher level protocol can use empty packets and STONE Protocol's encoding negotiation bytes to handshake and establish a common encoding mode if needed.
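One way a higher level protocol could apply the "no switching without negotiation" rule is sketched below. Everything here is an assumption for illustration: the `PeerEncodingState` struct, the idea of tracking the last received level per peer, and clamping to the advertised range (which presumes every level inside the range is supported and that the minimum does not exceed the maximum). Legacy Compatibility Mode is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical per-peer state kept by the application or a higher level protocol.
struct PeerEncodingState {
    bool         haveRange = false;      // true once a negotiation byte was received
    std::uint8_t minLevel = 0;           // from the negotiation byte, low nibble
    std::uint8_t maxLevel = 0;           // from the negotiation byte, high nibble
    std::uint8_t lastReceivedLevel = 0;  // Encoding Level of the peer's latest packet
};

// Pick the Encoding Level for a reply.
inline std::uint8_t chooseReplyLevel(const PeerEncodingState &peer,
                                     std::uint8_t preferredLevel) {
    assert(preferredLevel <= 15);
    if (!peer.haveRange) {
        // Without a negotiation byte it is unsafe to switch: echo the peer's encoding.
        return peer.lastReceivedLevel;
    }
    // Otherwise move as close to the preferred level as the advertised range allows.
    return std::min(std::max(preferredLevel, peer.minLevel), peer.maxLevel);
}
```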
I think I just broke a record for the longest explanation of how 1 or 2 bytes of data can be used :-D[/spoiler]
					Last edited by Blzut3 on Mon Jul 02, 2012 2:59 am, edited 1 time in total.
									
			
									
						- 
				VortexCortex
- Posts: 27
- Joined: Mon Jun 11, 2012 1:43 am
- Location: Houston
- Contact:
RE: Versioning
Ah!  Yes, that's exactly it!  Thanks so much!
I had forgotten all about the unencoded mode error code replies. Lots of the terse comments in the initial Perl implementation of the STONE Layer v2 code make sense now. :) Now I should be able to continue building the encoding & compression suite -- The aim was to create the transport encoding code licensed such that Doomseeker and others may also utilize it. Hmm, I guess another backronym is needed now...
[spoiler]Security Note:
The hardened huffman.cpp isn't needed so long as the buffer sizes are large enough to prevent the buffer overflow bug -- which is the fix Torr adopted, IIRC. The other option was to indicate the max buffer size by passing the value by pointer reference in the output buffer length parameter, and to modify the Huffman_Decode function to properly check buffer bounds. I haven't ever checked IDE for the exploit, but I know for sure that older clients/servers released prior to addressing this buffer overflow shouldn't be used as openly Internet facing, due to the DoS or remote code execution exploit the buffer overflow enables (depending on OS and DEP). This exploit was addressed some years ago, and I doubt any but extreme diehards are using versions that are vulnerable. Doomseeker is unaffected, IIRC.
In retrospect, this protocol versioning / negotiation may sound a bit more complex than some would like, but that's mostly just verbosity: it boils down to a prefix and optional suffix byte pair, and enables implementations to choose the level of encoding to support -- i.e., Launcher authors could opt out of compression initially, and include better compression as they got around to adopting it. There's a problem with using a High / Low value for the Encoding IDs in that implementations would need to support intervening protocols. Perhaps a set of suffixed bitfields would be better... In light of "Roles" for decentralization support, the protocol would have to be modified anyhow. Another issue is that it requires some additional connection state to be maintained. That is: most of the protocol can be handled via drop in replacement of the Huffman_Xxx() functions, but some features, like negotiation, require pairing a few vars with each "connection" or IP/Port pair. That shouldn't be a problem, it just means that a full implementation of the protocol touches several source files -- it would be best to proceed with a limited subset at first, IMHO.
The SkullCrusher algorithm mentioned relies on a small lump (~400b up to 64Kb) that contains a "dictionary" along with its two Huffman trees, to serve as a starting point for the sliding window compression. Building a good dictionary requires a network data profiler (which isn't yet complete) to be included in some Zandronum source. It looks like I can resurrect most of the progress. The profiler is needed to avoid having to capture full netlogs when generating dictionary lumps, and to benchmark the compression efficiency. The algorithm was designed to solve the issues inherent with using Gzip or other stream compression algorithms over UDP: the pre-defined dictionaries and trees reduce compression overhead, data can be received out of sequence, lost packets are allowed, and the simple algorithm achieves decent compression on a per packet basis with minimal CPU usage over the current Huffman implementation.
The improved Huffman encoding version in that protocol could probably be dropped; it's only about 5% to 10% better than the current Huffman at most things. It's a few times faster too, but that's not the bottleneck -- mainly, the flexible Huffman tree implementation was a requirement to build the other compression modes.
For optimal compression, a different dictionary could be used for different default game modes (A dedicated Encoding ID each for Launchers, Invasion/Coop, CTF/Deathmatch, etc) -- This is because some game modes have similar network data, but others have very different network data sets. An Encoding ID could also be designated to provide the general purpose "WAD Defined Compression". Modders would enable a "profiler" mode then play online to generate the compression dictionary .LUMP -- I wasn't sure how that option would be presented, probably a new command line switch or cvar. Games played with WADS having the new Lump present would enable the negotiation capability.
This lightweight per WAD defined optimal compression is the main reason why the above encoding layer versioning and negotiating protocol was designed -- Launchers could talk with a known default encoding (or none), and Clients / Servers could start off with a known encoding, then negotiate up to a more optimal compression mode for successive packets.[/spoiler]
I'll open a tracker ticket when I next have time so we can discuss any required changes and enable any desired future options -- to ensure extensibility, i.e. so we don't get painted into a corner.
Thanks again Wartorn!
			
									
									
RE: Versioning
Neato, glad we can make progress on this now :)
			
									
									
						RE: Versioning
My thoughts on the version numbers: We should avoid using 2 minor version numbers specifically to avoid making the version numbers look like ZDoom's, to avoid the potential confusion resulting from the possibility that our version number may someday be higher than ZDoom's. Since Zandronum is a continuation of Skulltag, maybe the version should also be continuing the Skulltag numbering scheme, except maybe dropping the "0." in front, so 98e would be followed by 99 and a letter then 100, 101, etc. The number would increment either when the ZDoom source is updated, or when many features are added, like when Skulltag's 0.97e got its name changed to 0.98a.
			
									
									"For the world is hollow, and I have touched the sky."
						


