It's possible that the SQL dump wasn't exported correctly. After some trial and error (I'm an idiot) I probably need help. The database itself is fine, but a lot of the example code might be damaged on the remote server where I installed the database to make my hunt easier. I'll keep tinkering with it to see if I can get the thing intact, but in the event I can't, I'll need someone's help to get this encoding thing working. In addition, there was an attachment named "Hardened Huffman", but dumps don't include that kind of thing. Is this what you're looking for?
[spoiler]I've been working on a replacement for the current Huffman encoding protocol.
My goal is to extend the current Huffman encoding protocol to include multiple encoding modes such as better compression and a non encoded mode while providing backwards compatibility.
I've named the encoding protocol "Skulltag Online Network Encoding Protocol" or "STONE Protocol" or StEP for short.
This corresponds to my implementation of the protocol: "Skulltag Online Network Encoding Layer 2" or "STONE Layer 2" or StL2.
Think of STONE Protocol as the underlying foundation that all other Skulltag network protocols are stacked on top of. The name is subject to change if anyone has a better or existing name to use.
Similar to posting an RFC: I'm posting the protocol here to get ideas, comments, and approval before finalizing the implementation details. I also wanted to demonstrate how the protocol achieves backwards compatibility with the current Huffman protocol.
Apologies in advance -- I have a tendency to be long winded, but did not spoiler anything in the interest of readability.
(No apologies to Rivecoder. He knew what he was getting into and should be used to this by now :P)
I wasn't sure if this was the correct place to post... Skulltag Protocol thread and Source Discussion seemed likely candidates as well. Please move if needed, thanks.
Legacy Encoding Protocol
It's important to note that the method of Huffman compression currently in use is actually a Data Encoding Protocol. The first byte of data is a signal that indicates how the following data bytes should be decoded.
A first byte of 0xFF, 255 (or -1 as a signed char) means the following bytes are not encoded.
Any other value signifies Huffman encoded data follows.
When Huffman encoding is used, the value of the first byte equals the number of bits that were appended to the last byte of data in order to pad the Huffman data out to a full byte.
For example, the number 5 would mean to disregard the last 5 bits of data in the last byte of the following data. Since a byte is 8 bits, there can logically only exist eight possible padding values for the first signal byte: 0, 1, 2, 3, 4, 5, 6, 7.
Current Huffman implementations operate on the signal byte and determine how many bits of data to read from the remaining data bytes via the following algorithm:
Code:
prefix_byte_value = ... // get the first byte of packet data.
number_of_data_bytes = number_of_packet_bytes - 1; // Account for the protocol's "signal/padding" byte.
if ( prefix_byte_value == 0xFF )
{
    ... // Return the data bytes unchanged (not encoded mode)
}
total_data_bits = number_of_data_bytes * 8;
number_of_huffman_encoded_bits = total_data_bits - prefix_byte_value;
... // Huffman decode the number_of_huffman_encoded_bits from the data bytes.
This legacy implementation detail is crucial to providing backwards compatibility when using certain STONE Protocol features.
The actual code used in the older HUFFMAN_Decode function will be reviewed later.
STONE Protocol
The new protocol is based on the above protocol with backwards compatibility and minimal header (meta data) size being of utmost importance.
In the interest of backwards compatibility StEP sometimes stores meta data at the end of the packet as well as the beginning.
In the following tables the bit values and ranges inside of bytes are given as their "unshifted" values.
The Signal Byte
Similar to the Legacy Encoding, the first byte of a StEP packet holds special "signal" data.
In STONE Protocol the first byte is a bit field containing 3 independent pieces of data.
The bit positions and values within the signal byte have been carefully chosen to inherently provide backwards compatibility.
Code:
Signal byte bit values.  7 = Most significant bit
.--------------------------------------
|
|    7 6 5 4 3 2 1 0
|    | | | | | | | |
|    | | | | | x x x = Numeric value [0 to 7]
|    | | | | |         Number of Padding Bits (used with variable bit length encodings)
|    | | | | |         or Message Flags Field (used with Unencoded mode)
|    | | | | |
|    | | | | x ----- = Encoding Negotiation Flag
|    | | | |           1: An encoding negotiation byte has been appended after the last byte of data.
|    | | | |           0: No negotiation byte exists.
|    | | | |
|    x x x x ------- = Numeric value [0 to 15]
|                      Encoding Identification Signal
|
`---------------------------------------
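To make the bit layout above concrete, here is a minimal sketch of packing and unpacking a signal byte. The struct and function names are made up for illustration and are not taken from StL2; only the bit positions come from the diagram.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical representation of the three signal byte fields.
struct SignalByte {
    std::uint8_t encoding_id;  // bits 7..4, Encoding Identification Signal [0 to 15]
    bool         negotiation;  // bit 3, Encoding Negotiation Flag
    std::uint8_t low_bits;     // bits 2..0, padding bits or message flags [0 to 7]
};

// Combine the three fields into one byte.
inline std::uint8_t pack_signal(const SignalByte &s) {
    assert(s.encoding_id <= 0x0F && s.low_bits <= 0x07);
    return static_cast<std::uint8_t>((s.encoding_id << 4) |
                                     (s.negotiation ? 0x08 : 0x00) |
                                     s.low_bits);
}

// Split a received byte back into its three fields.
inline SignalByte unpack_signal(std::uint8_t b) {
    return SignalByte{ static_cast<std::uint8_t>(b >> 4),
                       (b & 0x08) != 0,
                       static_cast<std::uint8_t>(b & 0x07) };
}
```

With this sketch, Legacy Huffman (ID 0) with the negotiation flag set and 3 padding bits packs to 0x0B.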
Padding Bits Value
StEP uses the Padding Bits value in the same way that the Legacy Encoding protocol / implementation does.
The last byte of encoded data contains N padding bits which should be ignored.
Future encodings may have different usages for the padding bits.
Message Identification Flags
When composing a STONE Protocol "Unencoded" packet, "Padding Bits" are unnecessary.
Instead of letting those bits go to waste, a Message Identification bit field (MID) is used.
Code:
Message Identification Signal.  7 = Most significant bit
.--------------------------------------
|
|    7 6 5 4 3 2 1 0
|    | | | | | | | |
|    | | | | | | x x = Numeric Value [0 to 3]
|    | | | | | |       Error Code
|    | | | | | |       0: No error, a normal regular unencoded data message.
|    | | | | | |       1: Unsupported Encoding
|    | | | | | |          The previously received data encoding is not supported.
|    | | | | | |       2: Corrupt Data
|    | | | | | |          The previously received data did not decode properly.
|    | | | | | |       3: Connection Refused
|    | | | | | |          For some reason the connection was refused, and no further
|    | | | | | |          data should be sent (at least for a while)
|    | | | | | |
|    | | | | | x---- = Request for Protocol Negotiation
|    | | | | |         1: Request that a reply contain Encoding Negotiation.
|    | | | | |         0: Encoding negotiation not requested from the other party.
|    | | | | |
|    x x x x x------ = These bits are beyond the MID scope.
|
`---------------------------------------
Regardless of Error Code, it is beyond STONE Protocol's scope to determine the contents of the unencoded data.
Error code meta data can be used by higher level protocols to provide extended error messages,
re-request corrupted data or perform a negotiation handshake.
When the "Connection Refused" error code is received, it is up to the application to check this value and take action (or inaction).
StL2 provides getter methods for these fields.
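It is forbidden to set the Request for Protocol Negotiation flag when the "Connection Refused" Error Code is in use.
(This would make no sense, and it would also prevent Legacy Encoding Mode detection: all three MID bits would be set, so an Unencoded packet that also carries the Encoding Negotiation Flag would have a signal byte of 0xFF.)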
In the STONE Layer 2 implementation, if a Request for Protocol Negotiation flag is received then the next reply packet will automatically be configured for Protocol Negotiation. This behavior can be disabled.
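As a rough illustration of the MID fields and the forbidden combination, the sketch below decodes the low three bits of an Unencoded signal byte. The type and function names are hypothetical and are not the StL2 getter methods, which are not shown in this post.

```cpp
#include <cassert>
#include <cstdint>

// Error codes as listed in the MID diagram.
enum class MidError : std::uint8_t {
    None = 0,
    UnsupportedEncoding = 1,
    CorruptData = 2,
    ConnectionRefused = 3
};

// Hypothetical holder for the two MID fields.
struct MidFlags {
    MidError error;                // bits 1..0
    bool     request_negotiation;  // bit 2
};

// Read the MID fields out of a signal byte (higher bits are ignored).
inline MidFlags unpack_mid(std::uint8_t signal) {
    return MidFlags{ static_cast<MidError>(signal & 0x03), (signal & 0x04) != 0 };
}

// "Connection Refused" together with a negotiation request is forbidden.
inline bool mid_is_valid(const MidFlags &m) {
    return !(m.error == MidError::ConnectionRefused && m.request_negotiation);
}

// Build the low three bits of the signal byte from valid MID fields.
inline std::uint8_t pack_mid(const MidFlags &m) {
    assert(mid_is_valid(m));
    return static_cast<std::uint8_t>(static_cast<std::uint8_t>(m.error) |
                                     (m.request_negotiation ? 0x04 : 0x00));
}
```

Rejecting the forbidden pair on the sending side means the low three bits of an Unencoded signal byte are never all set.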
Encoding Identification Signal
16 encoding values are possible.
Code:
.------------------------------------------------------------------------------------------.
|     Encoding ID Value    |                 Description of Encoding Mode                  |
| Hex .  Binary  . Decimal |                                                               |
|-----|----------|---------|---------------------------------------------------------------|
|  0  |   0000   |     0   |  Legacy Huffman: Data is encoded using a Huffman Tree         |
|     |          |         |      that is compatible with the original huffman.cpp         |
|-----|----------|---------|---------------------------------------------------------------|
|  1  |   0001   |     1   |  Improved Huffman: Data is encoded using a newly calculated   |
|     |          |         |      Huffman tree, and bytes do not have reversed bit order   |
|-----|----------|---------|---------------------------------------------------------------|
|  2  |   0010   |     2   |  Skull Crusher: Data is compressed using a custom algorithm   |
|     |          |         |      that utilizes two Huffman trees and a sliding "window"   |
|-----|----------|---------|---------------------------------------------------------------|
| 0x3 |   0011   |     3   |                                                               |
|  to |    to    |   to    |     Available for future expansion of the STONE Protocol      |
| 0xE |   1110   |    14   |                                                               |
|-----|----------|---------|---------------------------------------------------------------|
|  F  |   1111   |    15   |  Unencoded Mode: data bytes use the 1:1 (identity) encoding,  |
|     |          |         |     a "no op" raw data mode also used for error code replies  |
`------------------------------------------------------------------------------------------'
Due to the backwards compatibility requirements the Signal byte's Encoding Identification Signal values do not match exactly to the Negotiation byte's Encoding Level values. However, conversion between the two values is possible.
Encoding Negotiation
If the Encoding Negotiation flag is set then the last byte of the packet contains a special bit field.
When Legacy mode is enabled the Encoding Negotiation flag is treated as if it were false.
The "Negotiation" bit field contains two independent pieces of data.
Code:
Encoding Negotiation byte bit values.  7 = Most significant bit
.--------------------------------------
|
|    7 6 5 4 3 2 1 0
|    | | | | | | | |
|    | | | | x x x x = Numeric value [0 to 15]
|    | | | |           Minimum Encoding Level value
|    | | | |
|    x x x x ------- = Numeric value [0 to 15]
|                      Maximum Encoding Level value
|
`---------------------------------------
Having both minimum and maximum encoding levels instead of just a maximum supported is important for future encodings that may include encryption. In the event of encrypted protocol implementation an application may decide to specify a minimum encoding level that disallows all unencrypted encodings.
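A small sketch of how the negotiation byte could be packed and unpacked, assuming nothing beyond the nibble layout in the diagram. The names are placeholders rather than StL2 API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical holder for the two Encoding Level values.
struct Negotiation {
    std::uint8_t min_level;  // bits 3..0, Minimum Encoding Level [0 to 15]
    std::uint8_t max_level;  // bits 7..4, Maximum Encoding Level [0 to 15]
};

// Maximum goes in the high nibble, minimum in the low nibble.
inline std::uint8_t pack_negotiation(const Negotiation &n) {
    assert(n.min_level <= 0x0F && n.max_level <= 0x0F);
    return static_cast<std::uint8_t>((n.max_level << 4) | n.min_level);
}

inline Negotiation unpack_negotiation(std::uint8_t b) {
    return Negotiation{ static_cast<std::uint8_t>(b & 0x0F),
                        static_cast<std::uint8_t>(b >> 4) };
}
```

For instance, a minimum level of 0 and a maximum level of 2 pack to the single byte 0x20.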
Encoding Level Values
16 encoding values are possible.
Code:
.------------------------------------------------------------------------------------------.
|   Encoding Level Value   |                 Description of Encoding Mode                  |
| Hex .  Binary  . Decimal |                                                               |
|-----|----------|---------|---------------------------------------------------------------|
|  0  |   0000   |     0   |  Unencoded Mode: data bytes use the 1:1 (identity) encoding,  |
|     |          |         |     a "no op" raw data mode also used for error code replies  |
|-----|----------|---------|---------------------------------------------------------------|
|  1  |   0001   |     1   |  Legacy Huffman: Data is encoded using a Huffman Tree         |
|     |          |         |      that is compatible with the original huffman.cpp         |
|-----|----------|---------|---------------------------------------------------------------|
|  2  |   0010   |     2   |  Improved Huffman: Data is encoded using a newly calculated   |
|     |          |         |      Huffman tree, and bytes do not have reversed bit order   |
|-----|----------|---------|---------------------------------------------------------------|
|  3  |   0011   |     3   |  Skull Crusher: Data is compressed using a custom algorithm   |
|     |          |         |      that utilizes two Huffman trees and a sliding "window"   |
|-----|----------|---------|---------------------------------------------------------------|
| 0x4 |   0100   |     4   |                                                               |
|  to |    to    |   to    |     Available for future expansion of the STONE Protocol      |
| 0xF |   1111   |    15   |                                                               |
`------------------------------------------------------------------------------------------'
In the above table the encodings are listed in order of complexity.
This allows for simple comparisons in source code instead of using a complex look up table or switch statements.
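Here is a sketch of the kind of comparison this ordering allows. Note that picking the highest level both sides accept is purely an assumption made for this example; the protocol description does not say which level inside the overlap should be chosen. All names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical range of acceptable Encoding Levels for one party.
struct LevelRange {
    int min_level;
    int max_level;
};

// Because levels are ordered, an acceptance check is two comparisons.
inline bool level_allowed(int level, const LevelRange &r) {
    return level >= r.min_level && level <= r.max_level;
}

// Assumed policy: choose the highest level inside both ranges.
// Returns -1 when the ranges do not overlap.
inline int pick_common_level(const LevelRange &a, const LevelRange &b) {
    assert(a.min_level <= a.max_level && b.min_level <= b.max_level);
    const int lo = std::max(a.min_level, b.min_level);
    const int hi = std::min(a.max_level, b.max_level);
    return (lo <= hi) ? hi : -1;
}
```

No look up table is needed: two ranges either overlap or they don't, and the overlap is found with one max and one min.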
There should be a 1 to 1 mapping between Encoding ID and Encoding Level values.
Conversion between the Encoding ID (EID) and Encoding Level (EL) is possible.
Code:
// From EID to EL (parentheses added for clarity; 15 wraps around to 0)
EL = (EID + 1) & 0xF;
// From EL to EID (avoids underflow for cross language independence)
EID = (EL == 0) ? 15 : EL - 1;
Receiving Legacy Encoded Data
In the STONE Protocol a Signal byte value of [0 to 7] will correspond to Legacy Huffman Encoding, and backwards compatibility is achieved inherently.
However, if a STONE Layer 2 implementation receives a packet of Legacy "Non Encoded" data, the Signal Byte of 0xFF (-1 or 255) would correctly supply a StEP EID of "Unencoded Mode", but it would incorrectly indicate that the Encoding Negotiation Flag is set and would also carry a "padding bits" value of 7.
Fortunately, I have carefully designed the new protocol to never use a signal byte of 0xFF (255 or -1); therefore, if such a signal byte is encountered, Legacy Compatibility Mode (LCM) is enabled.
When LCM is enabled STONE Layer 2 behaves exactly like the older protocol's huffman.cpp functions.
The Encoding Negotiation Flag is ignored and response packets will be encoded via Legacy Unencoded mode or Legacy Huffman encoding.
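The detection rule described here could be sketched roughly as follows. The enum and function names are invented for the example, and the three-way split is just one way to express the rule, not how StL2 is actually structured.

```cpp
#include <cstdint>

// Hypothetical classification of an incoming packet by its first byte.
enum class PacketKind {
    LegacyUnencoded,  // 0xFF: enable Legacy Compatibility Mode
    LegacyHuffman,    // 0x00..0x07: identical in the legacy protocol and StEP
    StoneOnly         // anything else: only meaningful to StEP aware code
};

inline PacketKind classify_signal(std::uint8_t signal) {
    if (signal == 0xFF) {
        return PacketKind::LegacyUnencoded;  // StEP never sends this value itself
    }
    if (signal <= 0x07) {
        return PacketKind::LegacyHuffman;    // EID 0, no negotiation flag, padding 0..7
    }
    return PacketKind::StoneOnly;
}
```

A receiver would branch on the result before looking at any of the StEP bit fields.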
Transmitting to Legacy Encoding Implementations
Simply select the "Legacy Huffman" encoding mode and send data.
Since the current Master / Server protocol only provides the IP:port of the servers there is no way to tell if you'll be talking to a STONE Protocol aware server or Legacy Huffman server.
For now a fail safe approach is to select "Legacy Huffman" encoding mode, set the Encoding Negotiation Flag, and append the Encoding Negotiation byte indicating your maximum and minimum desired encoding levels. (These details are gracefully handled by getter / setter method calls when using STONE Layer 2)
For example, let's say we're sending such a launcher query to a Skulltag server that's using Legacy Huffman encoding (but we don't know that yet...)
A Legacy launcher request may look something like this (in hex):
03 b8491a9c8b1506 - 8 bytes of data.
The 03 would indicate Huffman encoding with 3 bits of extra padding at the end.
Our compatible STONE Protocol launcher request would look like this:
0b b8491a9c8b1506 20 - 9 bytes of data.
Legacy Huffman mode = 0, along with Bit #3 set (0x08 = Negotiation Flag) and with 3 padding bits yields a Signal byte of 0x0b.
Our maximum desired encoding is 2 - Improved Huffman, and minimum understood encoding is 0 - Unencoded mode.
(Note: Encoding negotiation only adds 1 byte!)
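The byte-level transformation in this example can be sketched as below. This is only an illustration that assumes the packet is already Legacy Huffman encoded; the function name is hypothetical, and StL2 hides these steps behind its own getter / setter calls.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Turn a Legacy Huffman packet into the "fail safe" StEP form:
// set the Encoding Negotiation Flag and append the negotiation byte.
inline std::vector<std::uint8_t> add_negotiation(std::vector<std::uint8_t> packet,
                                                 unsigned min_level,
                                                 unsigned max_level) {
    // A Legacy Huffman signal byte only holds the padding count (0..7).
    assert(!packet.empty() && packet[0] <= 0x07);
    assert(min_level <= 15 && max_level <= 15);
    packet[0] |= 0x08;  // bit 3 = Encoding Negotiation Flag
    packet.push_back(static_cast<std::uint8_t>((max_level << 4) | min_level));
    return packet;
}
```

Feeding it the 8-byte legacy request with a minimum of 0 and a maximum of 2 gives the 9-byte packet shown in the example: the first byte becomes 0x0b and 0x20 is appended.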
Now let's look at some of the actual code used in the old HUFFMAN_Decode to see how this will play out.
(I've been in here once or twice before :) )
Code:
void HUFFMAN_Decode( unsigned char *in, unsigned char *out, int inlen, int *outlen )
{
    // sanity check for *outlen.
    if ( (*outlen < 1) ){
        *outlen = 0; // zero it in case it's some wild negative.
        return;
    }
    int bits,tbits,maxBits;
    huffnode_t *tmp;
    if (*in==0xff) // The first byte is 11, so this won't happen.
    {
        // This block is only executed if StEP sends 255 when LCM is enabled.
        memcpy(...);
        return;
    }
    // Ah... I just love the next error prone line of code.
    // I immediately realized it as "Magic", and left it just the way it is :)
    tbits=(inlen-1)*8-*in;
    // Here's what happens:
    // tbits = (9 - 1) * 8 - 11;
    //   inlen--^             ^--- this value is exactly 8 bits more than the 3 padding bits expected...
    // Remember that extra negotiation byte (8 bits) we appended?  It's now fully ignored!
    bits=0;
    // Calculate the max bits from output size.
    maxBits = *outlen << 3; // shift 3 faster than * 8 :)
    *outlen=0;
    while ((bits<tbits) && (bits<maxBits)) // added check.
    {
        // Omitted: decode bytes from the input bits.
        // This section gets skipped when StEP unencoded mode launcher queries arrive :)
    }
}
If you're wondering why this implementation looks different from the one referred to in the current Launcher protocol -- it's because the old one is insecure.
STONE Layer 1.2 or a hardened version of huffman.cpp should be used instead.
huffman.zip
As you can see, applications using the old huffman.cpp or STONE Layer 1.x code are compatible with this "fail safe" approach :nod:
STONE Protocol aware applications can easily take advantage of StEP's unencoded mode without even having a Huffman implementation.
Launcher Protocol queries are small enough that it is safe for StEP unencoded mode to be used with legacy implementations.
The older HUFFMAN_Decode() function will attempt to Huffman decode the packet (0xF0 != 0xFF), and in the process ignore up to 240 bits (30 bytes) worth of data. (For a 9-byte packet: tbits = (9 - 1) * 8 - 240 = -176;)
To older implementations it would be as if they received a Zero length packet. A novice user using only unencoded mode wouldn't be able to decode a Huffman encoded reply from the older server, so it's OK that older servers will not reply to the novice.
Newer servers using StEP / StL2 would correctly reply with unencoded data to the novice.
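The bit arithmetic behind both compatibility cases can be checked with a tiny helper. This is a sketch that only mirrors the tbits line of HUFFMAN_Decode shown earlier; the function name is made up.

```cpp
#include <cstdint>

// Same arithmetic as "tbits=(inlen-1)*8-*in;" in the old decoder.
inline int legacy_tbits(int inlen, std::uint8_t signal) {
    return (inlen - 1) * 8 - signal;
}

// Legacy request:   legacy_tbits(8, 0x03) -> 7 * 8 - 3  = 53 bits
// StEP fail safe:   legacy_tbits(9, 0x0B) -> 8 * 8 - 11 = 53 bits (same payload)
// StEP unencoded:   legacy_tbits(9, 0xF0) -> 8 * 8 - 240 = -176 (decode loop never runs)
```

The first two cases yield the same bit count, which is why the appended negotiation byte is invisible to the old decoder. In the third case the count is negative, so the while loop is skipped and the old decoder produces zero bytes of output.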
Important Note
Legacy Huffman allows for switching between unencoded and Huffman encodings at will. STONE Protocol does not!
In STONE Protocol it is unsafe to reply using a different encoding unless an Encoding Negotiation byte has been received and specifies that it is acceptable to switch protocols (or Legacy Compatibility Mode is enabled). Higher level protocols are free to define default minimum and maximum encoding levels that both ends must support; however, STONE Protocol makes no assumptions.
It is currently beyond the scope of the STONE Protocol / STONE Layer 2 implementation to decide whether the Encoding Negotiation byte must be sent more than once per "session" since "session" state is maintained within higher level protocols.
It is also beyond the scope of STONE Protocol to determine the "default" encoding to use when sending data to an unknown server. A higher level protocol can use empty packets and STONE Protocol's encoding negotiation bytes to handshake and establish a common encoding mode if needed.
I think I just broke a record for the longest explanation of how 1 or 2 bytes of data can be used :-D[/spoiler]