$48 $69 $20 $54 $68 $65 $72 $65 $21
(Hi There!)
I think I've got hex figured out now!
What I need to figure out now is how exactly Keen:Vorticons is built in hex.
What language is it written in? BASIC? C? Or one of those dedicated languages that's built from source code and takes ages and ages to learn and- [/panic]
Could someone shed some light on this?
It should be noted that I possess binary dumps of Keens 1-6, courtesy of Levellass.
Heeeeex!
Weeeell, according to the various developers of Keen, it was originally programmed in a combination of C and assembly. The dumps you have are in assembly (as converting programs back to high-level source code such as C is nigh impossible - you can translate it backwards, but the effect is rather like translating from English to Japanese and back again, i.e. not pretty). It doesn't matter what language the program was originally in - once it is compiled down to an .exe file, it's machine code, and assembly is just the readable spelling of that. I have no clue as to what the assembly commands are in hex (not a huge assembly fan), but it also seems you are slightly misunderstanding what hex is.
Hex (short for hexadecimal) is merely a numbering system, specifically base 16. (Decimal, our standard numbering system, is base 10; binary, a computer's native numbering system, is base 2.) Because a computer can only work in ones and zeros (zeroes?), all of its numbers are in base 2. People, having been taught base 10 from as soon as they could count, have a little difficulty reading binary. 16, being a power of two, converts easily to and from binary (one hex digit is exactly four binary digits), and is slightly easier on the human eye, too.
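To make the "power of two" point concrete, here is a small Python sketch. The helper name `hex_to_bits` is made up for this example; the rest is plain built-in Python.

```python
def hex_to_bits(hex_text):
    """Expand each hex digit into its own group of four binary digits."""
    return " ".join(format(int(digit, 16), "04b") for digit in hex_text)

print(hex_to_bits("5A"))   # 0101 1010
print(int("5A", 16))       # 90, the same number written in base 10
print(format(90, "X"))     # 5A, and back to hex again
```

Each hex digit maps onto its four bits independently of its neighbours, which is why hex is so much easier to convert than decimal.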
All computers can understand is numbers. Therefore, in a computer, everything is represented by numbers. Letters are numbers (as demonstrated by the handful of ASCII codes you displayed at the top of your post), graphics are numbers, program instructions are numbers. What you want to learn is not so much the numbering system (although you seem to have done so quite well, and it is useful anyways) as the program instructions, i.e. assembly.
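For instance, the line of hex at the top of this thread can be turned back into letters with a few lines of Python. This is only a sketch, and `decode_hex_text` is a name invented for it.

```python
def decode_hex_text(dump):
    """Turn '$48 $69 ...' into the characters those byte values stand for."""
    values = [int(token.lstrip("$"), 16) for token in dump.split()]
    return bytes(values).decode("ascii")

print(decode_hex_text("$48 $69 $20 $54 $68 $65 $72 $65 $21"))  # Hi There!
```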
You may or may not have known most of that already, but anyways. I will now hand over to someone a bit more qualified in this area, like lemm or levellass.
Hello there. I'm slightly more qualified.
Keen itself says it was written in C+ (At the start of its text segment) but as it's been said, in the executable it's in assembly. Assembly is actually pretty easy to learn, once you get used to it. The problem is getting used to it, as it's all numbers.
First up, we'll need to be practical here: there is NO way you're going to be able to write out the whole executable with things like '$3DEC: Here's where Keen checks for door tiles' and whatnot. There's just too much data and it's too complicated.
There are programs that can 'decompile' (strictly speaking, disassemble) the whole thing automatically, but what comes out is little better than what goes in. You may have seen some of this stuff where Spleen was trying to work out the robot shot. It'll tell you what each byte of assembly is doing (and helpfully work out where jumps, calls, etc. go), but it doesn't do much to help you.
As I've said before, you're better off asking for something, then seeing how people solve the problem. (Many a patcher has started off trying to make a Keen 2/3 patch from a Keen 1 one) As an example, here is how I dissected the sprite spawning code of the Butler Bot found at $1777 in Keen 1:
Code:
55 #Start something big
8B #Start
EC 56
E8 B7 11 #Jump to 'spawn something' code
8B #Start
F0
C7 #Set...
04 #Variable 4...
0F 00 #To 15
8B #Start
46 04 99 B1 0C
E8 C4 C9 #Jump to something or other
89 44 04 89 54 06
8B #Start
46 06 99 B1 0C
E8 B5 C9 #Jump to same thing as before
89 44 08 89 54 0A
C7 #Set...
44
20 #Variable 32... (Initial speed)
5A 00 #To 90
8B #Start
44 06
8B #Start
54 04 3B 06 E0 6E 7F 10 7C 06 3B 16 DE 6E 73 08
8B #Start
44 20 F7 D8 89 44 20
C7 #Set...
44
32 #Variable 50...
C7 1D #To $1DC7 (this C7 is part of the value, not another 'Set')
C7 #Set...
44
34 #Variable 52... (Sprite behavior)
94 1E #to Butler Bot walk (At $1E94)
C7 #Set...
44
28 #Variable 40... (Sprite to show)
60 00 #to 96
5E #Close 1
5D #Close 2
C3 #Close big
Now as you can see, this is total gibberish, mostly because I didn't fill everything in, but also because I didn't use the proper wording. But *I* can understand it; and by comparing it to other bits of code, can make patches I might want. I am sure even you can see a few things to look for (For example C7 44 28 will be just before the initial sprite to show for all spawned sprites) and you can gradually work your way up from there.
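The "look for C7 44 28" trick can be automated. Below is a rough Python sketch that scans a run of bytes for `C7 44 <variable>` and reads the two-byte value after it (low byte first). The function name `find_word_sets` is invented, the sample bytes are just the tail of the listing above, and the offsets it prints are positions inside that sample, not addresses in the executable. In a full dump the same three bytes could also turn up as data, so treat every hit as a candidate to check by eye.

```python
# Bytes copied from the tail of the Butler Bot listing above.
SAMPLE = bytes.fromhex("C7 44 32 C7 1D C7 44 34 94 1E C7 44 28 60 00 5E 5D C3")

def find_word_sets(data, variable):
    """Yield (position, value) for every 'C7 44 <variable> <low> <high>' in data."""
    pattern = bytes([0xC7, 0x44, variable])
    start = data.find(pattern)
    while start != -1 and start + 5 <= len(data):
        # The value is stored little-endian: low byte first, then high byte.
        value = int.from_bytes(data[start + 3:start + 5], "little")
        yield start, value
        start = data.find(pattern, start + 1)

for position, value in find_word_sets(SAMPLE, 0x28):
    print(f"offset {position}: sprite set to {value} (${value:X})")
```

Swapping `0x28` for `0x20` or `0x34` finds the other "Set..." lines the same way.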
I now turn you over to Lemm, who has actually studied the proper mouth words for things rather than bumbling along.
This is the output from the disassembler, which takes the .exe and turns the machine instructions (numbers) into readable mnemonics. For example, in levellass' post, the first number is $55. This is the opcode for "push bp," the first instruction in the disassembly.
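At its very simplest, that number-to-name step is a table lookup. Here is a toy Python sketch covering only the one-byte opcodes that show up in this routine. The table and the `name_opcode` helper are made up for illustration; a real disassembler has to cope with instructions of varying length, which this does not attempt.

```python
# One-byte 8086 opcodes that appear in the Butler Bot routine.
ONE_BYTE_OPCODES = {
    0x55: "push bp",
    0x56: "push si",
    0x5D: "pop bp",
    0x5E: "pop si",
    0x99: "cwd",
    0xC3: "retn",
}

def name_opcode(byte):
    """Return the mnemonic for a known one-byte opcode, else a placeholder."""
    return ONE_BYTE_OPCODES.get(byte, f"(not in table: ${byte:02X})")

for byte in bytes.fromhex("55 5E 5D C3"):
    print(f"${byte:02X} -> {name_opcode(byte)}")
```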
I have done some commenting to name the variables. The seg000:xxxx is the location in the executable (in bytes, written in hex) where the instruction resides. The CPU goes through the instructions one by one, moving numbers back and forth between memory and the CPU, and sometimes performing operations in the CPU. The two-letter words like ax, dx, etc. are called registers. These are 16-bit locations in the CPU that hold numbers temporarily so that the CPU can do operations on them. The bracketed regions (e.g. [bp+4] or [si+sprite.type]) refer to locations in memory.
You can see that most of the operations (in the second column) are:
mov (move bytes between registers and memory, or load a fixed value),
call (call a subroutine),
cmp, jl/jg (compare two values, then jump to another instruction if less or if greater).
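One thing worth knowing about call and the short jumps: the bytes after the opcode are not the destination itself but a distance, counted from the start of the next instruction. A hedged Python sketch of that arithmetic, using byte values from levellass' listing and the seg000 offsets from the disassembly (the two function names are invented for this example):

```python
def near_call_target(address, displacement_bytes):
    """Target of a 3-byte 'E8 lo hi' near call located at `address`."""
    displacement = int.from_bytes(displacement_bytes, "little")
    next_instruction = address + 3
    return (next_instruction + displacement) & 0xFFFF  # offsets wrap at 16 bits

def short_jump_target(address, displacement_byte):
    """Target of a 2-byte short jump (e.g. '7F 10') located at `address`."""
    displacement = int.from_bytes(bytes([displacement_byte]), "little", signed=True)
    return (address + 2 + displacement) & 0xFFFF

print(f"${near_call_target(0x177B, bytes.fromhex('B711')):04X}")  # $2935
print(f"${near_call_target(0x178A, bytes.fromhex('C4C9')):04X}")  # $E151
print(f"${near_call_target(0x1799, bytes.fromhex('B5C9')):04X}")  # $E151
print(f"${short_jump_target(0x17B1, 0x10):04X}")                  # $17C3
```

The two calls with different bytes (C4 C9 and B5 C9) land on the same offset, which matches the "same thing as before" note in the hex listing.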
After you learn the syntax, you can write your own little routines to make new sprite behaviours and things. There are only about 20 commands that you will use 95% of the time, so it isn't hard to learn. "The Art of Assembly Language Programming" is what I read. It's free online.
Code:
seg000:1777 ; =============== S U B R O U T I N E =======================================
seg000:1777
seg000:1777 ; Attributes: bp-based frame
seg000:1777
seg000:1777 add_monster_2_butler proc near ; CODE XREF: sub_115EB+8Dp
seg000:1777
seg000:1777 tile_x_spawn = word ptr 4
seg000:1777 tile_y_spawn = word ptr 6
seg000:1777
seg000:1777 push bp ; preserve callers stack frame
seg000:1778 mov bp, sp
seg000:177A push si
seg000:177B call add_monster ; allocate memory for new sprite
seg000:177E mov si, ax ; add_monster returns pointer to memory in ax
seg000:177E ; save this pointer (memory location) in si
seg000:1780 mov [si+sprite.type], 5 ; store 5 in memory there, because butler bots are type = 5 in game
seg000:1784 mov ax, [bp+tile_x_spawn] ; load x spawn coord in TILES to ax
seg000:1787 cwd ; manipulate...
seg000:1788 mov cl, 0Ch
seg000:178A call near ptr H_LLSH
seg000:178D mov word ptr [si+sprite.pos_x], ax ; and store the resulting coordinate in 256ths of a pixel in sprite memory
seg000:1790 mov word ptr [si+(sprite.pos_x+2)], dx
seg000:1793 mov ax, [bp+tile_y_spawn] ; do the same for y
seg000:1796 cwd
seg000:1797 mov cl, 0Ch
seg000:1799 call near ptr H_LLSH
seg000:179C mov word ptr [si+sprite.pos_y], ax
seg000:179F mov word ptr [si+(sprite.pos_y+2)], dx
seg000:17A2 mov [si+sprite.vel_x], 5Ah ; 'Z' ; store velocity of $5A in memory
seg000:17A7 mov ax, [si+6] ; next few instructions compare butler bot position to keen position
seg000:17AA mov dx, [si+4]
seg000:17AD cmp ax, word ptr sprite_array.pos_x+2
seg000:17B1 jg short functions
seg000:17B3 jl short startsright
seg000:17B5 cmp dx, word ptr sprite_array.pos_x
seg000:17B9 jnb short functions
seg000:17BB
seg000:17BB startsright: ; CODE XREF: add_monster_2_butler+3Cj
seg000:17BB mov ax, [si+20h] ; if keen starts to the left, then NEGate velocity
seg000:17BB ; so the butler bot heads left
seg000:17BE neg ax
seg000:17C0 mov [si+20h], ax
seg000:17C3
seg000:17C3 functions: ; CODE XREF: add_monster_2_butler+3Aj
seg000:17C3 ; add_monster_2_butler+42j
seg000:17C3 mov word ptr [si+32h], offset think_2_butler ; store pointers to behaviour...
seg000:17C8 mov word ptr [si+34h], offset contact_2_butler ; ... and to contact function
seg000:17CD mov word ptr [si+28h], 60h ; '`' ; animation frame 60h (96)
seg000:17D2 pop si
seg000:17D3 pop bp ; restore stack frame
seg000:17D4 retn
seg000:17D4 add_monster_2_butler endp ; sp = 4
seg000:17D4
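The coordinate maths in the listing can be mimicked in Python. This is only a sketch under two assumptions: that H_LLSH shifts the 32-bit dx:ax pair left by cl bits (which is what the comments above suggest), and that the tile coordinate is not negative. A tile is 16 pixels and a pixel is 256 units, so one tile is 16 * 256 = 4096 = 2^12 units, hence the shift by 0Ch. Both function names are made up for this example.

```python
def tiles_to_position(tile):
    """Mimic 'cwd / mov cl, 0Ch / call H_LLSH': shift a tile coordinate left by 12.

    Returns (ax, dx), the low and high 16-bit words of the 32-bit result.
    """
    position = (tile << 12) & 0xFFFFFFFF
    return position & 0xFFFF, position >> 16

def neg16(value):
    """16-bit two's-complement negation, like 'neg ax'."""
    return (-value) & 0xFFFF

ax, dx = tiles_to_position(20)
print(f"ax=${ax:04X} dx=${dx:04X}")   # ax=$4000 dx=$0001
print(f"${neg16(0x5A):04X}")          # $FFA6, i.e. -90 as a 16-bit number
```

The two halves are why the listing stores ax and dx separately at pos_x and pos_x+2.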