Galaxy Level Decompression Failure (Resolved)

Tools, assembly, and file formats.
Post Reply
User avatar
Fleexy
Site Admin
Posts: 490
Joined: Fri Dec 12, 2008 1:33 am
Location: Bloogton Tower

Galaxy Level Decompression Failure (Resolved)

Post by Fleexy »

EDIT: I resolved it! Apparently, I'm supposed to multiply the word after xA8 by 2 when de-Carmackizing. The original text follows.
I wrote:I'm trying to read Galaxy levels, but I'm having some trouble with decompression. (It doesn't help that there is no consistent information on the web, except the uberobscure mobydoc.txt.)

I know I need to de-Carmackize first, and I get a chunk of length 6536 for Keen 4's game maps. That length checks out with what's reported in the file. However, when I then un-RLEW it (with the flag provided in MAPHEAD, $CD $AB = 43981), I get a massive chunk of length 106618, far larger than the prescribed size of 8052. My RLEW decompressor works with Vorticons levels, but what it produces here is insanely incorrect.

When I resave the mal-loaded levels, TOM can open them and not Keen. When TOM opens them, it's OK for a while and then starts looking like this:

Image

Notice that the infoplane is OK, probably because it's sparse. This makes me suspect that badness only starts happening after a Carmack far pointer.
This is my fixed Carmack decompressor:

Code: Select all

Sub DecompressCarmack(Data As IO.Stream, Output As IO.Stream)
        Dim opos As UInteger = Output.Position
        Do Until Data.Position = Data.Length
            Dim w As UShort = ReadUShort(Data)
            Dim f As Byte() = UShortToByte(w)
            If f(0) = 0 And (f(1) = 167 Or f(1) = 168) Then
                Output.Write({f(1), Data.ReadByte}, 0, 2)
            Else
                If f(1) = 167 Then
                    Dim o As Byte = ReadByte(Data)
                    Dim cpos As UInteger = Output.Position
                    Output.Position -= (2 * o)
                    Dim buf((2 * f(0)) - 1) As Byte
                    Output.Read(buf, 0, 2 * f(0))
                    Output.Position = cpos
                    Output.Write(buf, 0, 2 * f(0))
                ElseIf f(1) = 168 Then
                    Dim l As UShort = ReadUShort(Data)
                    Dim cpos As UInteger = Output.Position
                    Output.Position = opos + (2 * l)
                    Dim buf((2 * f(0)) - 1) As Byte
                    Output.Read(buf, 0, 2 * f(0))
                    Output.Position = cpos
                    Output.Write(buf, 0, 2 * f(0))
                Else
                    WriteUShort(Output, w)
                End If
            End If
        Loop
    End Sub
And this is my RLEW decompressor, which was right originally:

Code: Select all

Sub DecompressRLEW(Data As IO.Stream, Flag As UShort, Output As IO.Stream)
        Dim current As UShort
        Dim count As UShort
        Do
            current = ReadUShort(Data)
            If current = Flag Then
                count = ReadUShort(Data)
                current = ReadUShort(Data)
                For z = 1 To count
                    WriteUShort(Output, current)
                Next
            Else
                WriteUShort(Output, current)
            End If
        Loop Until Data.Position = Data.Length
    End Sub
I wrote:Any help is appreciated, and I release these (probably flawed) implementations into the public domain.
Fleexy
Last edited by Fleexy on Wed Jan 01, 2014 3:13 am, edited 2 times in total.
levellass
Posts: 3001
Joined: Wed Oct 11, 2006 12:03 pm
Location: Ngaruawahia New Zealand

Post by levellass »

Here; this is decarmackized output of Keen 6 fom TED5 (The temp files are RLE'd only.) The first level I tweaked: https://dl.dropboxusercontent.com/u/3940020/MAPTEMP.zip

If after decarmackizzing you get data about the same size as the maptemp file then that's ok. However as you note it's probably Carmack related. I'm thinking it can't handle words with a high byte of $A7 or $A8: http://www.shikadi.net/moddingwiki/Carmack_compression


As a side note once ruined my levels by putting in a door value that was the same as a Carmack far pointer. TOM doesn't compress that right. Be aware!
User avatar
Fleexy
Site Admin
Posts: 490
Joined: Fri Dec 12, 2008 1:33 am
Location: Bloogton Tower

Post by Fleexy »

Ah, thanks. I actually figured it out and came back here to edit/delete the post. Turns out, I forgot to multiply the word after xA8 by 2, which was not clear at all in any of the documentation I read. I will update the OP accordingly.

Also, the information on ModdingWiki about storing Huffman trees is incredibly confusing. I understand that the bottom row (leaf nodes) are stored first, noted by $00. However, it is massively unclear as to how branch nodes reference other branch nodes. Can you shed light here? ;)
levellass
Posts: 3001
Joined: Wed Oct 11, 2006 12:03 pm
Location: Ngaruawahia New Zealand

Post by levellass »

Quite possibly you should write up an explanation (Here or on moddingwiki) so that others have a clearer path to follow (The values in the compression are for words, not bytes was the issue I believe? The units of the compression have always been byte based.)



In essence the tree is stored 'upside down' if you go to the end of the DICT chunk you will find the root node.

Starting at the root node we can build the tree as follows:

1.) The first two bytes are an 0, the second two a 1. We'll start with the first two bytes of any node first (And 0)

2a.) For the node check the second byte. if this is an 0 then this branch is a leaf node and the 1st byte is the character for that node.

2b.) If the second byte is an 0 then it is a branch node and the character is for what node in the dict to go to next (Its address will be character * 4 from the start of the dict chunk)

3.) Once you have done the first two bytes of a node do the second two remembering that these are a 1.

4.) When you reach a leaf node go back up a level.


Doing this you should generate a series of bits for each node in a kind of 'numeric' order (The ones with the most leading 0s going first) e.g.:

00000
00001
0001
001000
001001....

For Keen this means you will finish with the node for character 0 (represented by the single bit '1') as it is usually right next to the root node on the right side.


As a further test example this EGADICT file generated by Keengraph produces a tree that doesn't compress at all; the path to each node is the reverse of the character for that node. (So the first node you'll get following the above method is node 0, then 128, then 64...) https://dl.dropboxusercontent.com/u/3940020/EGADICT.CK6

This is discussed along with an example of how to find the node for character $80 here: http://www.shikadi.net/moddingwiki/Huff ... #Example_2
User avatar
Fleexy
Site Admin
Posts: 490
Joined: Fri Dec 12, 2008 1:33 am
Location: Bloogton Tower

Post by Fleexy »

Alright, with that and Napalm's source code I got a reader for the dictionary that gets the dictionary you described from KeenGraph-imported graphics. However, it appears that the EGAGRAPH is uncompressed already, not even with these reversed bits. This is the code I'm using to read EGADICT. Keen and KG understand these files, but my reader doesn't. Trying to use that dictionary on the EGAGRAPH, as expected, flips the bits, but they were already in a usable order.
levellass
Posts: 3001
Joined: Wed Oct 11, 2006 12:03 pm
Location: Ngaruawahia New Zealand

Post by levellass »

Indeed, Keengraph creates a 'trivial' tree, one that does not compress the data at all. This was done out of laziness and a like of speed in importing since the space concerns that required Huffman compression are no longer relevant.

Therefore passing the data through that tree shouldn't change it at all.
User avatar
Fleexy
Site Admin
Posts: 490
Joined: Fri Dec 12, 2008 1:33 am
Location: Bloogton Tower

Post by Fleexy »

Yeah, but it does. It reverses the bits... I'm so confused! Am I supposed to flip the bits of the dictionary after reading it in or am I reading the entire EGADICT totally wrong?
levellass
Posts: 3001
Joined: Wed Oct 11, 2006 12:03 pm
Location: Ngaruawahia New Zealand

Post by levellass »

*Tries to conceptualize*

*Looks at code*

*Looks at EGADICT*


Ugh. Right. I forgot the most annoying thing about the implementation; each byte of data is read backwards bitwise. So yes, for each byte reverse the bits while reading each bunch of bytes in the normal fashion. (This means the end bytes will be padded at their high end not their low too.)

This will be noted on the Huffman page.
Post Reply