File formats

Tools, assembly, and file formats.
Tulip
Posts: 394
Joined: Mon Jun 16, 2008 2:40 pm
Location: Heidelberg, Germany

Post by Tulip »

Something like that. A blocking sprite for level 1 is 32769, and for level 4, for example, it's 32772.
CommanderSpleen
Posts: 1017
Joined: Sun Aug 31, 2003 12:11 pm
Location: The Land of Sparkly Things

Post by CommanderSpleen »

So it's 0x8000 + level number.
Fleexy
Site Admin
Posts: 490
Joined: Fri Dec 12, 2008 1:33 am
Location: Bloogton Tower

Post by Fleexy »

CommanderSpleen wrote:So it's 0x8000 + level number.
Or 32768 + L#. 8)
levellass
Posts: 3001
Joined: Wed Oct 11, 2006 12:03 pm
Location: Ngaruawahia New Zealand

Post by levellass »

EDIT: In fact, the level file documentation overall is pretty vague. It mentions the structure of the file, but not important things like what tiles are what, etc. I am possibly just so incredibly stupid that these are common knowledge, but I doubt it.
This information is how levels WORK, which is governed by the executable, not their format, which is what the documentation covers. Some basic points are as follows:


* The game specifies various special levels, which can be patched. These are the menu level (90), the map level (80) and the finale level (81). They are identical in format to the other levels.

* Tiles: All tiles are loaded for each level. Tile properties (whether they kill, whether they block, ice, etc.) are stored in the Keen tileinfo file in the executable. Look there for details.

* Sprites: Which value is which sprite is controlled by the executable; it can be patched and is not fixed. Most Keen level editors have these hard coded, so I shan't list them here.

Sprite values are words (two bytes long). On the maps, positive values are level entrances (but these only work up to 32) and negative values are blocks for that level. (As noted above, in a word a negative value is stored as 0x8000 + level number.) Done markers appear on the upper-leftmost level entrance and/or block. Blocks are either 1x1 or 4x4 in size.
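As a rough illustration of the map values described above, here is a minimal Python sketch. The function names are made up for this example, and treating 0 as "empty" and positive values above 32 as "other" are assumptions, since the thread only says that entrances work up to 32.

```python
BLOCK_FLAG = 0x8000  # high bit of the word; set when the value reads as negative


def classify_map_sprite(word):
    """Return (kind, level) for one 16-bit sprite value on the world map."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("sprite values are 16-bit words")
    if word == 0:
        return ("empty", None)             # assumption: 0 means no sprite
    if word & BLOCK_FLAG:
        return ("block", word & 0x7FFF)    # 0x8000 + level number
    if word <= 32:
        return ("entrance", word)          # entrances only work up to 32
    return ("other", word)                 # assumption: not covered by the thread


def block_value(level):
    """Build the blocking value for a level, i.e. 0x8000 + level number."""
    return BLOCK_FLAG + level
```

With this, 32769 comes out as the block for level 1 and 32772 as the block for level 4, matching the numbers given earlier in the thread.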

In normal levels, bridge and teleport values are the x,y location RELATIVE to the bridge/switch, written in hexadecimal. Negative values here are stored as 65536 minus the x,y value. (Map teleporters are hard coded and can be patched.)
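The "65536 minus the value" rule looks like ordinary 16-bit two's complement, so a conversion could be sketched as below. This is only an assumption about the encoding; how x and y are packed into the value is not covered in this thread, so the sketch converts a single word and nothing more. Both function names are hypothetical.

```python
def to_signed16(word):
    """Read a stored 16-bit word as a signed offset (assumes two's complement)."""
    return word - 0x10000 if word & 0x8000 else word


def to_unsigned16(offset):
    """Store a signed offset as a 16-bit word; negatives become 65536 - n."""
    return offset & 0xFFFF
```

For example, an offset of -3 would be stored as 65533 (0xFFFD) under this reading.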

Keen 3's map has special values to guide Messie and make her stop.

Some Keen engine games also have doors and exit values.


Did I miss anything?
billyblaze
Posts: 3
Joined: Wed Jul 15, 2009 1:32 pm

Post by billyblaze »

Hello, I'm interested in the CK4, CK5 and CK6 formats. I've seen some text files provided by levellass, but these documents don't describe the formats in detail.

I want to implement a routine in managed code (C#) to extract the tiles/sprites from egagraphics.ck#.

Can someone support me?
levellass
Posts: 3001
Joined: Wed Oct 11, 2006 12:03 pm
Location: Ngaruawahia New Zealand

Post by levellass »

First up, you want the EGAGRAPH format, not the CK4 etc format (That is just the file extension.) The reason my texts aren't that accurate is that frankly, they're terribly outdated. I recently updated the patch index, but not with much new information about the formats.

You may want Andy for this, since his ModKeen seamlessly extracts (Most) of the graphics files, though I know the rough outline of things easily enough.


The GRAPH format is basically a collection of files that may or may not be compressed using Huffman compression. The location of each graphic in the file is given in an internal header. Each type of graphic has its own 'sub format' etc.; I'm sure you know all of that. (If there are some more specific questions you want answered I can discuss them at more length, either here or on MSN.)
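To make the "header of locations" idea concrete, here is a small Python sketch that splits a GRAPH-style file into its entries once the header is in hand. It assumes 4-byte little-endian entries, as in the QBASIC audio code in this thread; the graphics header may well use a different entry width, so `entry_size` is left as a parameter. The function names are invented for the example, and each returned chunk may still be Huffman-compressed.

```python
def read_offsets(header, entry_size=4):
    """Decode a header into a list of little-endian offsets."""
    count = len(header) // entry_size
    return [
        int.from_bytes(header[i * entry_size:(i + 1) * entry_size], "little")
        for i in range(count)
    ]


def split_chunks(data, offsets):
    """Slice the data file into chunks; each chunk ends where the next begins.

    The last header entry is taken to be the file length, so it only
    marks the end of the final chunk.
    """
    return [data[start:end] for start, end in zip(offsets, offsets[1:])]
```

This ignores any special marker values a real header might contain for missing entries; ModKeen's source is the place to check those details.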

The method I use to get graphics out is this:

* Check if there is an external header.
  -> If not, search the file for a word of the same value as the length of the EGAGRAPH; this will be the last entry in the header.
  -> Work back from this until we find an entry with value 0; this is the header start.
  -> Extract.
* Check for an external Huffman dictionary.
  -> If not, search for the string *; this is found in nearly all dictionaries, and the graphics one is always the first in the file.
  -> Dictionaries are 1020 bytes long; extract.
* Decompress each entry.
* Depending on what entry we have, convert it in various ways.
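The header search in the steps above could be sketched in Python roughly as follows. This is not the actual extraction code, just one reading of the description: find the entry equal to the data file's length, then step back one entry at a time to the nearest zero entry. The 4-byte entry width and the 256-entry cap are assumptions borrowed from the QBASIC audio routine, and `find_header` is a made-up name.

```python
def find_header(exe, data_len, entry_size=4, max_entries=256):
    """Return the embedded header bytes, or None if nothing plausible is found."""
    last = data_len.to_bytes(entry_size, "little")  # final entry = file length
    zero = bytes(entry_size)                        # first entry = offset 0
    pos = exe.find(last)
    while pos != -1:
        # Walk backwards in whole entries, looking for the zero entry.
        for n in range(1, max_entries):
            start = pos - n * entry_size
            if start < 0:
                break
            if exe[start:start + entry_size] == zero:
                return exe[start:pos + entry_size]
        pos = exe.find(last, pos + 1)               # try the next candidate
    return None
```

A search like this can be fooled by a stray value that happens to equal the file length, so the result is worth sanity-checking (for instance, that the offsets increase).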

Some actual code (In QBASIC) I use for extracting audio (It's shorter than the graphics code.) is as follows:

Code:

'_________________________________________________
SUB EXTAUDIO ' Extract the HED and DCT from an executable
'_________________________________________________
PRINT "AUDIOHED not found, extracting from "; exeq; "...";
x4 = MKL$(LOF(3)) ' Size of audio file
OPEN folder + exeq + ".EXE" FOR BINARY AS #7
x0 = SPACE$(1000) ' Scan the EXE for value
FOR l20 = 1 TO LOF(7) \ 2000
GET #7, (LOF(7) \ 2) + (l20 * 996), x0
FOR l21 = 1 TO LEN(x0)
IF MID$(x0, l21, 4) = x4 THEN
 x = (LOF(7) \ 2) + (l20 * 996) + l21 - 997 ' Save header end
 GOTO 3
END IF
NEXT l21, l20
PRINT "I couldn't find the AUDIOHED in this file! Are you sure it is UNLZEXE'd and"
INPUT "The right file"; x0
END

3 PRINT "extracting from"; x; "...";
GET #7, x, x0
x4 = MKL$(0) ' Now look for word of value 0, working back from x
FOR l1 = 255 TO 0 STEP -1 ' HED never larger than 255 entries
IF MID$(x0, (l1 * 4) + 1, 4) = x4 THEN
 y0 = RIGHT$(x0, 1000 - (l1 * 4))
 PUT #2, 1, y0
 EXIT FOR
END IF
NEXT l1
PRINT "done."
IF LOF(2) = 0 THEN END ' Error: no header was written
OPEN folder + "AUDIODCT" + extq FOR BINARY AS #1
IF LOF(1) > 1019 THEN
 CLOSE #1
 EXIT SUB
END IF

PRINT "AUDIODCT not found, extracting from "; exeq; "...";
x6 = MKL$(509) + MKI$(0) ' Root node of nearly ALL DCTs
FOR l20 = 1 TO LOF(7) \ 2000
GET #7, (LOF(7) \ 2) + (l20 * 996), x0
FOR l21 = 1 TO LEN(x0)
IF MID$(x0, l21, 6) = x6 THEN
 x = (LOF(7) \ 2) + (l20 * 996) + l21 - 1019 ' DCT starts 1020 bytes before root node
 GOTO 1
END IF
NEXT l21, l20
PRINT "I couldn't find the AUDIODCT in this file! Are you sure it is UNLZEXE'd and"
INPUT "The right file"; x0
END

1 PRINT "extracting from"; x; "...";
u0 = u0 + MKL$(x - 1)
x0 = SPACE$(1020)
GET #7, x, x0
PUT #1, 1, x0
CLOSE #7
CLOSE #1
PRINT "done."
END SUB
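
The routine above only pulls out the header and the dictionary; the "decompress each entry" step is not shown in this thread. For what it's worth, here is a minimal Python sketch of a decoder. The node layout it assumes (255 nodes of two little-endian 16-bit words, root at node 254, values below 256 being literal bytes and higher values meaning node = value - 256, bits read lowest bit first) is not stated anywhere in this thread; it is the layout commonly described for id Software's Huffman tables, so check it against ModKeen's huff.c before relying on it. The caller must supply the decompressed length.

```python
import struct

ROOT = 254  # assumed index of the root node


def huff_expand(data, dictionary, out_len):
    """Decode `data` with a 1020-byte dictionary until `out_len` bytes exist."""
    if len(dictionary) < 1020:
        raise ValueError("dictionary must be at least 1020 bytes")
    if out_len <= 0:
        return b""
    words = struct.unpack("<510H", dictionary[:1020])  # 255 nodes x 2 words
    out = bytearray()
    node = ROOT
    for byte in data:
        for bit in range(8):                  # lowest bit first
            value = words[node * 2 + ((byte >> bit) & 1)]
            if value < 256:                   # leaf: emit a literal byte
                out.append(value)
                if len(out) == out_len:
                    return bytes(out)
                node = ROOT
            else:                             # branch: follow to another node
                node = value - 256
    return bytes(out)                         # input ran out early
```

Note that under this layout a root node whose second word is 509 simply points at node 253, which fits the search value used in the QBASIC code.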

I hope this was helpful-ish. What will this project be on? I am interested.
adurdin
Site Founder
Posts: 549
Joined: Fri Aug 29, 2003 11:27 pm
Location: Edinburgh, Scotland

Post by adurdin »

levellass wrote:You may want Andy for this
The mere act of pronouncing my name is enough to summon me into existence!

I'll just point you to some extra resources that may help:

http://files.keenmodding.org/mobydoc.txt - A short text file with an overview of the Keen 4, 5, and 6 maps and graphics file formats.

http://files.keenmodding.org/modkeen2.zip - A tool that can import and export the graphics from Keen 1,2,3,4,5,6. Full source code (in C) is included, which will give you specifics on the graphics formats.

There are also several other bits and pieces related to formats/data info (tileinfo, ck456tli, keenwright) most of which have source code also on http://files.keenmodding.org/

And with that, he pops back into the magic lamp.
billyblaze
Posts: 3
Joined: Wed Jul 15, 2009 1:32 pm

Post by billyblaze »

Thank you, both levellass and adurdin. I'll check out the resources and get in touch with you shortly. Maybe with more questions...
billyblaze
Posts: 3
Joined: Wed Jul 15, 2009 1:32 pm

Post by billyblaze »

Hello Andy,

I've studied the ModKeen source code (well implemented; thanks a lot).

But I have a question regarding the hard-coded offsets in the EpisodeInfo array. I have tested ModKeen with Keen 4 (version 1.4), but in my version the file length differs from the hard-coded value in the keen456.c file. I guess the offsets may differ, too.

I have tried to port some of the code (utils.c, huff.c and keen456.c) to C#. Reading the header, the EGA dictionary and the EGA offsets is already working (I have tested the code with a keen4e.exe from: http://files.keenmodding.org/4keen14.zip).

Is there a chance to obtain the offsets dynamically?

Best regards,
Matze
adurdin
Site Founder
Posts: 549
Joined: Fri Aug 29, 2003 11:27 pm
Location: Edinburgh, Scotland

Post by adurdin »

billyblaze wrote:I have tested ModKeen with Keen 4 (version 1.4), but in my version the file length differs from the hard-coded value in the keen456.c file. I guess the offsets may differ, too. … Is there a chance to obtain the offsets dynamically?
The sizes and offsets refer not to the file on disk, but to the image of the executable, i.e. ignoring the header. The header size is conveniently stored in the EXE header as a count of 16-byte paragraphs; see http://www.delorie.com/djgpp/doc/exe/ for details. The main reason for this is that when the EXE is unpacked, different unpacking tools put differently sized headers on it, but in each case the image size remains the same. The ModKeen code does these calculations as well, so you can see it happening in the code.
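
To illustrate the header calculation, here is a short Python sketch (not ModKeen's actual code, which is in C). It reads the header size from the standard MZ header field at byte offset 8, a 16-bit count of 16-byte paragraphs, and uses it to turn an image-relative offset into a file offset. The function names are invented for this example.

```python
import struct


def exe_header_size(exe):
    """Return the size in bytes of a DOS MZ executable's header."""
    if exe[:2] not in (b"MZ", b"ZM"):
        raise ValueError("not a DOS MZ executable")
    (paragraphs,) = struct.unpack_from("<H", exe, 8)  # header size in paragraphs
    return paragraphs * 16


def image_to_file_offset(exe, image_offset):
    """Convert an offset within the executable image to an offset in the file."""
    return exe_header_size(exe) + image_offset
```

Because the header size is read from the file each time, the same image offsets should keep working whichever unpacker produced the EXE.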

As for getting the offsets dynamically, that's only kind of possible. The EXE itself won't contain anything definite for finding the offsets, but you could write heuristics to try to find the Huffman dictionaries and other parts. However, writing reliable heuristics is a lot more work than locating the offsets manually (as I and others did) for a handful of executables.
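
As an example of what such a heuristic might look like, here is a Python sketch modelled on the dictionary search in the QBASIC routine posted earlier in this thread. It looks for the same six bytes (the 32-bit value 509 followed by a 16-bit 0) and backs up by the same distance that routine does, which puts the pattern 1018 bytes after the dictionary start. Whether that pattern appears in every executable is not guaranteed; the QBASIC comment itself only says "nearly ALL". The names here are made up.

```python
ROOT_PATTERN = (509).to_bytes(4, "little") + (0).to_bytes(2, "little")
PATTERN_OFFSET = 1018   # where the pattern sits relative to the dictionary start
DICT_SIZE = 1020        # dictionaries are 1020 bytes long


def find_dictionary_offsets(exe):
    """Return file offsets of every candidate Huffman dictionary, in file order."""
    found = []
    pos = exe.find(ROOT_PATTERN)
    while pos != -1:
        start = pos - PATTERN_OFFSET
        if start >= 0:
            found.append(start)
        pos = exe.find(ROOT_PATTERN, pos + 1)
    return found
```

A candidate's bytes are then `exe[start:start + DICT_SIZE]`. Going by the earlier post, the graphics dictionary should be the first hit, but a false match elsewhere in the file is possible, which is part of why hand-located offsets are more dependable.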
levellass
Posts: 3001
Joined: Wed Oct 11, 2006 12:03 pm
Location: Ngaruawahia New Zealand

Post by levellass »

Now see, this is why I'd like to talk to you, Andy. I DID write heuristics, because I knew of no other way of doing things. I'm always doing things the long way, because nobody can explain the short way in a manner simple enough to understand.