Har, a new toy!

Jul 19, 2008 at 8:17 PM
The Bartender
"All your forum are belong to us!"
Join Date: Jun 18, 2006
Location: Montreal, Canada
Posts: 581
Age: 40
Discrete, the assembly pseudo-compiler I wrote a good... what, 2 years back? is in bad need of an update. It doesn't support every instruction and it's getting increasingly difficult to work on my hack because of this. As it stands, I can't even complete the title screen (even though I have the code ready - Discrete can't calculate far jumps.)

In fact, Discrete was meant to be distributed along with an assembly primer I never got around to writing, as a means of visual support for the examples in it. So it never was intended for hacking in the first place - it just evolved.



So I've started work on an updated version. This is going to be a complete rewrite with a much more solid parser. It will also check remaining space before injecting code into an executable so you can avoid overwriting existing (and useful) code accidentally.
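A rough sketch of what such a space check could look like, in Python rather than the tool's own language. The function names, the idea of a fixed "region length", and the NOP padding are all assumptions made for illustration; the thread does not say how the tool actually does it.

```python
def check_fits(code: bytes, offset: int, region_length: int, file_size: int) -> None:
    """Raise ValueError if `code` cannot be written safely into the region."""
    if offset < 0 or offset + region_length > file_size:
        raise ValueError("region lies outside the executable")
    if len(code) > region_length:
        overflow = len(code) - region_length
        raise ValueError(f"code is {overflow} byte(s) too long for the region")


def inject(image: bytearray, code: bytes, offset: int, region_length: int) -> None:
    """Write `code` at `offset`; the unused tail of the region is filled with NOPs.

    Padding with 0x90 (nop) is an assumption, not something the thread specifies.
    """
    check_fits(code, offset, region_length, len(image))
    padded = code + b"\x90" * (region_length - len(code))
    image[offset:offset + region_length] = padded
```

The point is only that the length comparison happens before any byte of the executable is touched, so a failed check leaves the file as it was.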

Additional features will be added after the basics are done.

Since this takes priority over my hack, expect a release very soon, perhaps a demo or alpha release by the end of the weekend if all goes well.
 
Jul 19, 2008 at 9:42 PM
Hoxtilicious
"Life begins and ends with Nu."
Join Date: Dec 30, 2005
Location: Germany
Posts: 3218
Age: 33
Pronouns: No homie
Very nice ;)
I'm really in need of this :D
For the jumps for example.
 
Jul 19, 2008 at 9:52 PM
The Bartender
That's exactly why I'm rewriting it. With a segment of code as large as my hack's title screen, it's simply impossible to get by on short jumps anymore. :x
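For anyone wondering why short jumps run out: a short `jmp` (opcode EB) stores its target as a signed 8-bit displacement, so it can only reach about 128 bytes in either direction. The longer form (opcode E9 with a 32-bit displacement in 32-bit code) is presumably what is meant by "far jumps" here. A minimal Python sketch of picking between the two; `encode_jmp` is a made-up helper name, not part of Discrete.

```python
import struct


def encode_jmp(source: int, target: int) -> bytes:
    """Encode an unconditional jmp located at `source` that lands on `target`.

    Displacements are relative to the address of the *next* instruction,
    so the instruction's own length (2 or 5 bytes) is part of the math.
    """
    rel8 = target - (source + 2)          # short form: EB rel8
    if -128 <= rel8 <= 127:
        return b"\xEB" + struct.pack("<b", rel8)
    rel32 = target - (source + 5)         # long form: E9 rel32 (32-bit mode)
    return b"\xE9" + struct.pack("<i", rel32)
```

For example, `encode_jmp(0x1000, 0x1010)` fits in the short form, while `encode_jmp(0x1000, 0x2000)` has to fall back to the 5-byte encoding.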

I'm also going to make the formatting looser. No more "two spaces here, a semi-colon there, three letters, five commas, eight numbers, a dead hamster, and three more letters" formatting. And since this code is relying on my newer GUI libraries, it'll perform way better and be MUCH easier to update.

I can add new functions at the drop of a hat, in fact. Seeing as I'm doing some heavy decompilation-related work right now, being able to add buttons and functions to it quickly could come in handy...
 
Jul 20, 2008 at 1:13 AM
Hoxtilicious
Ahh, that's cool man ;)
It's easier to write the code then, and for improvements you could edit the assembly code :D
 
Jul 20, 2008 at 3:55 AM
The Bartender
S. P. Gardebiter said:
Ahh, that's cool man ;)
It's easier to write the code then, and for improvements you could edit the assembly code :D
Well... usually, when you write and compile a program, the source code doesn't vanish... :p Makes it that much easier to add functionality afterwards.
 
Jul 20, 2008 at 11:02 PM
The Bartender
Hehe. Behold, some unrelated assembly work!

Code:
Before

00001220 push  ebp
         mov   ebp, esp
         push  ecx
         mov   [ebp - 04], 00000000
0000122B cmp   [ebp - 04], 08
         jnl   0000124F
         mov   eax, [ebp - 04]
         imul  eax, eax, 14
         mov   ecx, [eax + 00499BC8]
         cmp   ecx, [ebp + 08]
         jne   00001244
         jmp   0000124F

00001244 mov   edx, [ebp - 04]
         add   edx, 01
         mov   [ebp - 04], edx
         jmp   0000122B

0000124F cmp   [ebp - 04], 08
         jne   00001259
         xor   eax, eax
         jmp   000012C3

00001259 mov   eax, [ebp - 04]
         imul  eax, eax, 14
         mov   [eax + 00499BCC], 00000001
         mov   ecx, [ebp - 04]
         imul  ecx, ecx, 14
         mov   edx, [ebp + 0C]
         mov   [ecx + 00499BC8], edx
         mov   eax, [ebp - 04]
         imul  eax, eax, 14
         mov   ecx, [eax + 00499BD4]
         add   ecx, [ebp + 10]
         mov   edx, [ebp - 04]
         imul  edx, edx, 14
         mov   [edx + 00499BD4], ecx
         mov   eax, [ebp - 04]
         imul  eax, eax, 14
         mov   ecx, [eax + 00499BD8]
         add   ecx, [ebp + 10]
         mov   edx, [ebp - 04]
         imul  edx, edx, 14
         mov   [edx + 00499BD8], ecx
         mov   eax, [ebp - 04]
         imul  eax, eax, 14
         mov   [eax + 00499BD0], 00000000
         mov   eax, 00000001
000012C3 mov   esp, ebp
         pop   ebp
         ret

Ahh, assembly... the best damned thing since sliced bread. Only it's not very readable at a glance.

Code:
After

00001220 push ebp
         ebp = esp
         push ecx
         [Local_04] = 00000000
0000122B If [Local_04] < 08
	 {
         	If [([Local_04] * 14) + 00499BC8] == [Param_08]
		 {
         		jump 0000124F
		 }
00001244 	[Local_04] = [Local_04] + 01
         	jump 0000122B
	 }
0000124F If [Local_04] == 08
	 {
         	eax = 0
         	jump 000012C3
	 }
00001259 [([Local_04] * 14) + 00499BCC] = 00000001
         [([Local_04] * 14) + 00499BC8] = [Param_0C]
         [([Local_04] * 14) + 00499BD4] = [([Local_04] * 14) + 00499BD4] + [Param_10]
         [([Local_04] * 14) + 00499BD8] = [([Local_04] * 14) + 00499BD8] + [Param_10]
         [([Local_04] * 14) + 00499BD0] = 00000000
         eax = 00000001
000012C3 esp = ebp
         pop ebp
         return

Wooooah! That's way better. And it's all automatically generated with what I was talking about earlier. This is one of the things I will be adding to the mini-compiler after I get it working.
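One small piece of that conversion is easy to illustrate: turning `[ebp - NN]` into `[Local_NN]` and `[ebp + NN]` into `[Param_NN]`, as in the "After" listing. The Python sketch below only does that renaming, under the assumption of a standard `ebp`-based stack frame; it is not the author's script, and the control-flow reconstruction is a much harder problem it does not attempt.

```python
import re

# Matches memory operands of the form [ebp - 04] or [ebp + 0C].
_EBP_REF = re.compile(r"\[ebp\s*([+-])\s*([0-9A-Fa-f]+)\]")


def name_stack_refs(line: str) -> str:
    """Rename ebp-relative operands: negative offsets are locals, positive are params."""
    def replace(match: re.Match) -> str:
        sign, offset = match.group(1), match.group(2).upper()
        prefix = "Local" if sign == "-" else "Param"
        return f"[{prefix}_{offset}]"

    return _EBP_REF.sub(replace, line)
```

Operands that do not involve `ebp`, such as `[eax + 00499BC8]`, pass through unchanged.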

Before anyone gets excited (*crickets* :) ) this code will never reparse to assembly again. It could be recompiled, but a lot of metadata (inferred information, such as intention, e.g. why you're doing something that looks useless at first glance) is destroyed in converting assembly to code and back and the resulting assembly would be very different from the original.

Also, this is not written in C++ - I'm using a scripting language to test out a few ideas - so implementing it will still take a while. I'll have to convert this code into C++ to, well, use it in C++. Obviously. :) Still gotta finish it though, and that won't happen for a little while.

Most likely, this will be used to view a more readable version of the code to get a sense of what's supposed to happen, or to check if the code you're writing hasn't gone completely awry despite making perfect sense in your head.

Edit:

Just gotta code the compiler now (in other words, the only part that doesn't take 5 minutes to set up...) I'm going to add the ability to save and load scripts after that, and even to specify the offset/length directly in the script file, allowing for one script file to contain a large number of different hacks.
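To make the "one script file, many hacks" idea concrete, here is a hedged Python sketch of splitting a script into separate patches. The `#offset` and `#length` directive names, their hexadecimal arguments, and `;` comments are assumptions made for the example; the real syntax may well differ.

```python
def split_patches(script: str) -> list:
    """Group script lines into patches, one per `#offset` directive.

    Returns a list of dicts with keys "offset", "length" and "code".
    """
    patches = []
    current = None
    for raw in script.splitlines():
        line = raw.split(";", 1)[0].strip()      # drop comments and whitespace
        if not line:
            continue
        tokens = line.split()
        if tokens[0] == "#offset":
            current = {"offset": int(tokens[1], 16), "length": None, "code": []}
            patches.append(current)
        elif current is None:
            raise ValueError(f"'{line}' appears before any #offset")
        elif tokens[0] == "#length":
            current["length"] = int(tokens[1], 16)
        else:
            current["code"].append(line)
    return patches
```

Each patch can then be assembled and injected on its own, which is what lets a single file carry several unrelated hacks.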

All in due time though. As soon as this can compile stuff, I'm putting my hack's title screen code through it, patching the exe, making a release (both of the hack and this thingie), and then I'm gonna start adding fun tools to it... This might evolve into a huge-ass assembly-hacking toolkit over time...

Edit 2: Alright! I can support a few push instructions and the operandless version of ret. Nice start. Being able to format the code to convert with spaces and tabs is a very welcome change from the old version, too...
 
Jul 23, 2008 at 12:24 AM
The Bartender
This is coming along great. So far, it supports the following mnemonics entirely.

Code:
aaa
cbw
cdq
clc
cld
cli
clts
cmc
cmpsb
cmpsd
cmpsw
cpuid
cwd
cwde
daa
das
emms
f2xm1
fabs
fchs
fclex
fcompp
fcos
fdecstp
fdisi
feni
fincstp
finit
fld1
fldl2e
fldl2t
fldlg2
fldln2
fldpi
fldz
fnclex
fndisi
fneni
fninit
fnop
fpatan
fprem
fprem1
fptan
frndint
fscale
fsetpm
fsin
fsincos
fsqrt
ftst
fucompp
fxam
fxtract
fyl2x
fyl2xp1
hlt
icebp
insb
insd
insw
int1
int3
into
invd
iret
lahf
leave
loadall
loadall286
lodsb
lodsd
lodsw
movsb
movsd
movsw
nop
outsb
outsd
outsw
popa
popf
push
pusha
pushf
rdmsr
rdpmc
rdtsc
ret
rsm
sahf
salc
scasb
scasd
scasw
smi
stc
std
sti
stosb
stosd
stosw
wait
wbinvd
wrmsr
xlatb

I'm planning to support all instruction sets, including MMX and the FPU ops. :confused: That's right, floating point operations in your Cave Story hacks coming soon. ...The 0.52 kind of numbers as opposed to 25, 113, 7... y'know? fsin, fcos, fsqrt? ...anyways.
 
Jul 23, 2008 at 1:40 AM
Hoxtilicious
Oh cool D:
So a beta coming out next weekend if we're lucky? :confused:
 
Jul 23, 2008 at 6:38 AM
The Bartender
Unlikely. I'm filling a pineapple with rum and putting a faucet on it this weekend, and heading up north with it.

Edit: Screenshot!

 
Jul 24, 2008 at 6:28 AM
The Bartender
Annnnd a preview!

Wait wtf, this isn't a screenshot? OMG IT'S A RELEASE? O_O

Actually, the compiler part doesn't work yet (some of it does, but I'm writing support for different parsers - just supporting one chipset (80x86) will be made TONS easier once I do this.) This is a bit of a preview to get a feel for what this is gonna look like and how it's going to feel.

Enjoy. :D
 
Jul 24, 2008 at 7:07 PM
Hoxtilicious
Oh cool *downloads* :o
I will test it, maybe I can report some bugs.
 
Jul 24, 2008 at 11:20 PM
The Bartender
There isn't much that can go wrong right now (it being mostly interface at the moment) but if you spot anything, let me know. Ignore any parser-related bugs as that part is totally mid-dev and not intended for use yet (although if the app crashes, that's a different matter...)

Once this thing is complete, it'll also be possible to use it to hack other platforms. I've also had the 65816 chipset (SNES) in mind while working things out, but it could decompile/inject code from/to any kind of executable. The app will use syntax files (plain text files, or maybe XML, which can be tweaked by anyone with a text editor) to define the instruction set, so it could even be used to hack, say, FFV's enemy AI scripts.

That part'll make it REALLY easy for me to implement 80x86 assembly (the main goal, after all.) Since I'll be using some generic code to load and parse instructions, I won't have to code a separate case for each of the 300-some instructions. Just one little definition file and I'm done.

There will also be a decompiler (still having some SLIGHT problems figuring out how to work out addressing modes and how to manage CPU flags; if I can't do that, it'll just be a disassembler, OR I'll just pick a different approach.) You'll be able to decompile the code from the offset you're hacking to see what's there and to make your own little tweaks with very little effort. And I eventually intend to include a way to define offsets so instead of writing "call 00481100" you could write "call gain_level" or instead of "mov eax, [00492AC0]", "mov eax, [money]".
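The named-offset idea is essentially a symbol table applied before assembly. A minimal Python sketch, using only the two example names and addresses from the post; the `resolve_symbols` helper and the table layout are hypothetical.

```python
import re

# Example symbol table built from the two names mentioned in the post.
SYMBOLS = {
    "gain_level": 0x00481100,
    "money": 0x00492AC0,
}


def resolve_symbols(line: str, symbols: dict) -> str:
    """Replace every known identifier in `line` with its 8-digit hex address."""
    def replace(match: re.Match) -> str:
        name = match.group(0)
        if name in symbols:
            return f"{symbols[name]:08X}"
        return name                      # mnemonics, registers, unknown names

    # \b keeps the pattern from matching letters inside hex literals.
    return re.sub(r"\b[A-Za-z_]\w*\b", replace, line)
```

A real implementation would also have to reject symbol names that collide with registers or mnemonics, which this sketch does not check.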

That's as close to transforming it into script as I can make it. :o
 
Jul 24, 2008 at 11:57 PM
Hoxtilicious
Very nice actually, thanks Rune! :o

There is a bug in it though.
I just clicked "Check Syntax":

[image: diph.php]
 
Jul 25, 2008 at 12:43 AM
The Bartender
Yeah, that's the script parser. That's the part that's still in dev and not ready for use. :o Don't worry about parser-related problems.

You basically told it to parse an instruction that expects operands but has none. Since the parser is being changed to use external files, I never got around to finishing it up. In this case, the parser simply assumed the instruction's operands were there, but didn't know what to do anymore once it tried to use them.
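The usual fix for this kind of failure is to validate the operand count before anything tries to use the operands. A small Python sketch of the idea; the `EXPECTED_OPERANDS` table is a tiny hypothetical stand-in for whatever the external parser files end up providing.

```python
class ParseError(Exception):
    """Raised when a source line cannot be turned into an instruction."""


# Hypothetical table: mnemonic -> allowed operand counts.
EXPECTED_OPERANDS = {"ret": (0, 1), "push": (1,), "mov": (2,)}


def parse_instruction(line: str):
    """Split a line into (mnemonic, operands) and check the operand count."""
    parts = line.strip().split(None, 1)
    if not parts:
        raise ParseError("empty line")
    mnemonic = parts[0].lower()
    operands = [op.strip() for op in parts[1].split(",")] if len(parts) > 1 else []
    if "" in operands:
        raise ParseError(f"'{mnemonic}' has a missing operand")
    allowed = EXPECTED_OPERANDS.get(mnemonic)
    if allowed is None:
        raise ParseError(f"unknown mnemonic '{mnemonic}'")
    if len(operands) not in allowed:
        raise ParseError(
            f"'{mnemonic}' expects {' or '.join(map(str, allowed))} operand(s), "
            f"got {len(operands)}"
        )
    return mnemonic, operands
```

With a check like this, a bare `push` produces a readable error message instead of a crash further down the line.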

Edit: Here's an early look at a parser file...

Code:
;	register, groupID, registerID

.registers
; group 0: 8bit registers
al  0 0 ; accumulator (low byte)
cl  0 1 ; counter (low byte)
dl  0 2 ; data register (low byte)
bl  0 3 ; base register (low byte)
ah  0 4 ; accumulator (high byte)
ch  0 5 ; counter (high byte)
dh  0 6 ; data register (high byte)
bh  0 7 ; base register (high byte)

; group 1: 16bit registers
ax  1 0 ; accumulator (16 bit)
cx  1 1 ; counter (16 bit)
dx  1 2 ; data register (16 bit)
bx  1 3 ; base register (16 bit)
sp  1 4 ; stack pointer (16 bit)
bp  1 5 ; base pointer (16 bit)
si  1 6 ; source index (16 bit)
di  1 7 ; destination index (16 bit)

; group 2: 32bit registers
eax 2 0 ; accumulator (32 bit)
ecx 2 1 ; counter (32 bit)
edx 2 2 ; data register (32 bit)
ebx 2 3 ; base register (32 bit)
esp 2 4 ; stack pointer (32 bit)
ebp 2 5 ; base pointer (32 bit)
esi 2 6 ; source index (32 bit)
edi 2 7 ; destination index (32 bit)

; group 3: segment registers
es  3 0 ; extra segment 1
cs  3 1 ; code segment
ss  3 2 ; stack segment
ds  3 3 ; data segment
fs  3 4 ; extra segment 2
gs  3 5 ; extra segment 3

; group 4: floating-point registers
st0 4 0
st1 4 1
st2 4 2
st3 4 3
st4 4 4
st5 4 5
st6 4 6
st7 4 7

; group 5: 64-bit MMX registers
mm0 5 0
mm1 5 1
mm2 5 2
mm3 5 3
mm4 5 4
mm5 5 5
mm6 5 6
mm7 5 7

; group 6: control registers
cr0 6 0
cr2 6 2
cr3 6 3
cr4 6 4

; group 7: debug registers
dr0 7 0
dr1 7 1
dr2 7 2
dr3 7 3
dr6 7 6
dr7 7 7

; group 8: test registers
tr3 8 3
tr4 8 4
tr5 8 5
tr6 8 6
tr7 8 7



;	suffix, groupID, value
.suffixes
; jump suffixes (jcc)
o   0  0 ; overflow
no  0  1 ; not overflow
b   0  2 ; below
c   0  2 ; carry flag set
nae 0  2 ; not above or equal
ae  0  3 ; above or equal
nb  0  3 ; not below
nc  0  3 ; not carry flag set (carry unset)
e   0  4 ; equal
z   0  4 ; zero flag set
ne  0  5 ; not equal
nz  0  5 ; not zero flag set (zero flag unset)
be  0  6 ; below or equal
na  0  6 ; not above
a   0  7 ; above
nbe 0  7 ; not below or equal
s   0  8 ; sign flag set
ns  0  9 ; not sign flag set (sign flag unset)
p   0 10 ; parity flag set
pe  0 10 ; parity is equal
np  0 11 ; not parity flag set (parity flag unset)
po  0 11 ; parity is opposite
l   0 12 ; less
nge 0 12 ; not greater or equal
ge  0 13 ; greater or equal
nl  0 13 ; not less
le  0 14 ; less or equal
ng  0 14 ; not greater than
g   0 15 ; greater than
nle 0 15 ; not less or equal



;	mnemonic | operands | output | full name
;	if operands is "SYSTEM" then the third param is a command ID for the compiler, followed by possible parameters...

.mnemonics

; system commands
#32 SYSTEM mode_32_bit
#16 SYSTEM mode_16_bit

; mnemonics [8086]
aaa |             |     37       | Ascii Adjustment (Add)
aas |             |     3F       | Ascii Adjustment (Subtract)
aad |             |     D5 0A    | Ascii Adjustment (Divide)
aad | imm         |     D5    ib | Ascii Adjustment (Divide)
aam |             |     D4 0A    | Ascii Adjustment (Multiply)
aam | imm         |     D4    ib | Ascii Adjustment (Multiply)
adc | r/m8  reg8  |     10 /r    | ADd with Carry
adc | r/m16 reg16 | o16 11 /r    | ADd with Carry
adc | reg8  r/m8  |     12 /r    | ADd with Carry
adc | reg16 r/m16 | o16 13 /r    | ADd with Carry
adc | r/m8  imm8  |     80 /2 ib | ADd with Carry
adc | r/m16 imm16 | o16 81 /2 iw | ADd with Carry
adc | r/m16 imm8  | o16 83 /2 ib | ADd with Carry
adc | al    imm8  |     14    ib | ADd with Carry
adc | ax    imm16 | o16 15    iw | ADd with Carry

; mnemonics [186]
bound | reg16 mem | o16 62 /r | check array index against BOUNDs

; mnemonics [286 PRIVILEGED MODE]
arpl | r/m16 reg16 | 63 /r | Adjust RPL field of selector
clts |             | 0F 06 | CLear Task-Switched flag

; mnemonics [386]
adc | r/m32 reg32 | o32 11 /r    | ADd with Carry
adc | reg32 r/m32 | o32 13 /r    | ADd with Carry
adc | r/m32 imm32 | o32 81 /2 id | ADd with Carry
adc | r/m32 imm8  | o32 83 /2 ib | ADd with Carry
adc | eax   imm32 | o32 15    id | ADd with Carry

; mnemonics [486]
bswap | reg32 | o32 0F C8+r | Byte SWAP

; mnemonics [486 UNDOCUMENTED]
cmpxchg486 | r/m8  reg8  |     0F A6 /r | CoMPare and eXCHanGe
cmpxchg486 | r/m16 reg16 | o16 0F A7 /r | CoMPare and eXCHanGe
cmpxchg486 | r/m32 reg32 | o32 0F A7 /r | CoMPare and eXCHanGe

That's right. Undocumented commands and privileged-mode commands are also supported. Don't fuck up your computer now. :p
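Since each mnemonic line follows the `mnemonic | operands | output | full name` layout described in the file's own comments, loading it generically is straightforward. A Python sketch of reading one such line; the `Definition` class and `parse_definition` function are illustrative names, not the tool's actual code.

```python
from dataclasses import dataclass


@dataclass
class Definition:
    mnemonic: str
    operands: list     # e.g. ["r/m8", "imm8"]
    output: list       # e.g. ["80", "/2", "ib"]
    full_name: str


def parse_definition(line: str):
    """Parse one `mnemonic | operands | output | full name` line.

    Returns None for blank lines and `;` comment lines.
    """
    line = line.split(";", 1)[0].strip()
    if not line:
        return None
    fields = [field.strip() for field in line.split("|")]
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    mnemonic, operands, output, full_name = fields
    return Definition(mnemonic, operands.split(), output.split(), full_name)
```

Every instruction then becomes a row of data, which is what removes the need for a separate hand-written case per mnemonic.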

I will eventually also add support for different parser file formats, so this could also edit, for instance, Cave Story event scripts. But that's not for a little while.
 
Jul 28, 2008 at 12:40 AM
The Bartender
Ok, the parser file format is ready! There's still something missing (support for r/m8, r/m16, and r/m32 operands) but the basic format is now set in stone.

Here is the 80x86 compiler script file, minus the r/m operand instructions and the MMX/FPU/privileged/undocumented instructions.

Code:
.mode asm

;
;	REGISTERS
;
.registers

;	Register Group 00: 8bit Registers
;	------------ 
al  00 00 ; | [Low  Byte] Accumulator
cl  00 01 ; | [Low  Byte] Counter
dl  00 02 ; | [Low  Byte] Data
bl  00 03 ; | [Low  Byte] Base
ah  00 04 ; | [High Byte] Accumulator
ch  00 05 ; | [High Byte] Counter
dh  00 06 ; | [High Byte] Data
bh  00 07 ; | [High Byte] Base

;	Register Group 01: 16bit Registers
;	------------ 
ax  01 00 ; | [16 bit] Accumulator
cx  01 01 ; | [16 bit] Counter
dx  01 02 ; | [16 bit] Data
bx  01 03 ; | [16 bit] Base
sp  01 04 ; | [16 bit] Stack Pointer
bp  01 05 ; | [16 bit] Base Pointer
si  01 06 ; | [16 bit] Source Index
di  01 07 ; | [16 bit] Destination Index

;	Register Group 02: 32bit Registers
;	------------ 
eax 02 00 ; | [32 bit] Accumulator
ecx 02 01 ; | [32 bit] Counter
edx 02 02 ; | [32 bit] Data
ebx 02 03 ; | [32 bit] Base
esp 02 04 ; | [32 bit] Stack Pointer
ebp 02 05 ; | [32 bit] Base Pointer
esi 02 06 ; | [32 bit] Source Index
edi 02 07 ; | [32 bit] Destination Index

;	Register Group 03: Segment Registers
;	------------ 
es  03 00 ; | Extra Segment 1
cs  03 01 ; | Code  Segment
ss  03 02 ; | Stack Segment
ds  03 03 ; | Data  Segment
fs  03 04 ; | Extra Segment 2
gs  03 05 ; | Extra Segment 3



;
;	SUFFIXES
;
.suffixes

;	Suffix Group 00: Jump Condition Suffixes
;	------------
o   00 00 ; |     Overflow
no  00 01 ; | Not Overflow
b   00 02 ; |     Below
c   00 02 ; |     Carry Flag Set
nae 00 02 ; | Not Above or Equal
ae  00 03 ; |     Above or Equal
nb  00 03 ; | Not Below
nc  00 03 ; |     Carry Flag Unset
e   00 04 ; |     Equal
z   00 04 ; |     Zero Flag Set
ne  00 05 ; | Not Equal
nz  00 05 ; |     Zero Flag Unset
be  00 06 ; |     Below or Equal
na  00 06 ; | Not Above
a   00 07 ; |     Above
nbe 00 07 ; | Not Below or Equal
s   00 08 ; |     Sign Flag Set
ns  00 09 ; |     Sign Flag Unset
p   00 0A ; |     Parity Flag Set
pe  00 0A ; |     Parity is Equal
np  00 0B ; |     Parity Flag Unset
po  00 0B ; |     Parity is Opposite
l   00 0C ; |     Less
nge 00 0C ; | Not Greater or Equal
ge  00 0D ; |     Greater or Equal
nl  00 0D ; | Not Less
le  00 0E ; |     Less or Equal
ng  00 0E ; | Not Greater Than
g   00 0F ; |     Greater Than
nle 00 0F ; | Not Less or Equal



;
;	OPERANDS
;
.operands
1         value 1
al        register al
ax        register ax
cl        register cl
cs        register cs
cx        register cx
ds        register ds
dx        register dx
eax       register eax
ecx       register ecx
es        register es
fs        register fs
gs        register gs
ss        register ss
reg8      register_group 0
reg16     register_group 1
reg32     register_group 2
segreg    register_group 3
imm8      immediate 8
imm16     immediate 16
imm32     immediate 32
imm??     immediate_by_mode 16 32
mem8      memory 8
mem16     memory 16
mem32     memory 32



;
;	SYMBOLS
;
.symbols
##    number
jj    jump_number
00+cc add_suffix 00
00+r  add_register
a16   address_16_bit 67
a32   address_32_bit 67
o16   operand_16_bit 66
o32   operand_32_bit 66



;
;	MNEMONICS
;
.mnemonics

;	--- System Commands
#32bit  SYSTEM mode_32_bit
#16bit  SYSTEM mode_16_bit
#offset SYSTEM set_offset
#length SYSTEM set_length
#db     SYSTEM write_bytes

;	--- Assignment Commands
mov        | reg16 imm16 | o16 B8+r ##
mov        | reg32 imm32 | o32 B8+r ## 
mov        | reg8  imm8  |     B0+r ##
movsb      |             |     A4
movsd      |             | o32 A5  
movsw      |             | o16 A5
xchg       | ax    reg16 | o16 90+r
xchg       | eax   reg32 | o32 90+r
xchg       | reg16 ax    | o16 90+r
xchg       | reg32 eax   | o32 90+r

;	--- Arithmetic Commands
adc        | al    imm8  |     14 ##
adc        | ax    imm16 | o16 15 ##
adc        | eax   imm32 | o32 15 ##
add        | al    imm8  |     04 ##
add        | ax    imm16 | o16 05 ##
add        | eax   imm32 | o32 05 ##
dec        | reg16       | o16 48+r
dec        | reg32       | o32 48+r
inc        | reg16       | o16 40+r
inc        | reg32       | o32 40+r
sbb        | al    imm8  |     1C ##
sbb        | ax    imm16 | o16 1D ##
sbb        | eax   imm32 | o32 1D ##
sub        | al    imm8  |     2C ##
sub        | ax    imm16 | o16 2D ##
sub        | eax   imm32 | o32 2D ##

;	--- Binary Operator Commands
and        | al    imm8  |     24 ##
and        | ax    imm16 | o16 25 ##
and        | eax   imm32 | o32 25 ##
or         | al    imm8  |     0C ##
or         | ax    imm16 | o16 0D ##
or         | eax   imm32 | o32 0D ##
xor        | al    imm8  |     34 ##
xor        | ax    imm16 | o16 35 ##
xor        | eax   imm32 | o32 35 ##

;	--- Call Commands
call       | imm??       |     E8 ##
call       | imm16 imm16 | o16 9A ## ##
call       | imm16 imm32 | o32 9A ## ##

;	--- Jump Commands
j@0        | imm8        |        70+cc jj
jcc        | imm??       |     0F 80+cc jj
jcxz       | imm8        | o16 E3       jj
jecxz      | imm8        | o32 E3       jj
jmp        | imm??       |     E9       jj
jmp        | imm16 imm16 | o16 EA       ## ##
jmp        | imm16 imm32 | o32 EA       ## ##
jmp        | imm8        |     EB       jj

;	--- Return commands
iret       |             |     CF
iretd      |             | o32 CF  
iretw      |             | o16 CF
ret        |             |     C3
ret        | imm16       |     C2 ##
retf       |             |     CB
retf       | imm16       |     CA ##
retn       |             |     C3
retn       | imm16       |     C2 ##

;	--- Stack Commands
enter      | imm16 imm8  |     C8 ## ##
leave      |             |     C9
pop        | ds          |     1F
pop        | es          |     07
pop        | fs          |     0F A1   
pop        | gs          |     0F A9   
pop        | reg16       | o16 58+r
pop        | reg32       | o32 58+r
pop        | ss          |     17
popa       |             |     61
popad      |             | o32 61
popaw      |             | o16 61
popf       |             |     9D
popfd      |             | o32 9D
popfw      |             | o16 9D
push       | cs          |     0E
push       | ds          |     1E
push       | es          |     06
push       | fs          |     0F A0   
push       | gs          |     0F A8   
push       | imm16       | o16 68 ##
push       | imm32       | o32 68 ##
push       | imm8        |     6A ##
push       | reg16       | o16 50+r
push       | reg32       | o32 50+r
push       | ss          |     16
pusha      |             |     60
pushad     |             | o32 60
pushaw     |             | o16 60
pushf      |             |     9C
pushfd     |             | o32 9C
pushfw     |             | o16 9C

;	--- Test/Compare Commands
cmp        | al    imm8  |     3C ##
cmp        | ax    imm16 | o16 3D ##
cmp        | eax   imm32 | o32 3D ##
cmpsb      |             |     A6
cmpsd      |             | o32 A7
cmpsw      |             | o16 A7
test       | al    imm8  |     A8 ##
test       | ax    imm16 | o16 A9 ##
test       | eax   imm32 | o32 A9 ##

;	---	Loop Commands
loop       | imm8        |     E2 jj
loop       | imm8  cx    | a16 E2 jj
loop       | imm8  ecx   | a32 E2 jj
loope      | imm8        |     E1 jj
loope      | imm8  cx    | a16 E1 jj
loope      | imm8  ecx   | a32 E1 jj
loopne     | imm8        |     E0 jj
loopne     | imm8  cx    | a16 E0 jj
loopne     | imm8  ecx   | a32 E0 jj
loopnz     | imm8        |     E0 jj
loopnz     | imm8  cx    | a16 E0 jj
loopnz     | imm8  ecx   | a32 E0 jj
loopz      | imm8        |     E1 jj
loopz      | imm8  cx    | a16 E1 jj
loopz      | imm8  ecx   | a32 E1 jj

;	--- String Commands
insb       |             |     6C
insd       |             | o32 6D
insw       |             | o16 6D
lodsb      |             |     AC
lodsd      |             | o32 AD  
lodsw      |             | o16 AD
outsb      |             |     6E
outsd      |             | o32 6F
outsw      |             | o16 6F
scasb      |             |     AE
scasd      |             | o32 AF  
scasw      |             | o16 AF
stosb      |             |     AA
stosd      |             | o32 AB  
stosw      |             | o16 AB

;	---	I/O Commands
in         | al    dx    |     EC
in         | al    imm8  |     E4 ##
in         | ax    dx    | o16 ED
in         | ax    imm8  | o16 E5 ##
in         | eax   dx    | o32 ED  
in         | eax   imm8  | o32 E5 ##
out        | dx    al    |     EE
out        | dx    ax    | o16 EF
out        | dx    eax   | o32 EF  
out        | imm8  al    |     E6 ##
out        | imm8  ax    | o16 E7 ##
out        | imm8  eax   | o32 E7 ##

;	--- Conversion Commands
aaa        |             |     37
aad        |             |     D5 0A
aad        | imm8        |     D5 ##
aam        |             |     D4 0A
aam        | imm8        |     D4 ##
aas        |             |     3F
bswap      | reg32       | o32 0F C8+r
cbw        |             | o16 98
cdq        |             | o32 99
cwd        |             | o16 99
cwde       |             | o32 98
daa        |             |     27
das        |             |     2F
xlatb      |             |     D7

;	--- Flag Commands
clc        |             | F8
cld        |             | FC
cli        |             | FA
cmc        |             | F5
lahf       |             | 9F
sahf       |             | 9E
stc        |             | F9
std        |             | FD
sti        |             | FB

;	--- Oddball Assembly Commands
cpuid      |             | 0F A2
hlt        |             | F4
icebp      |             | F1
int        | imm8        | CD ##
int01      |             | F1
int1       |             | F1
int3       |             | CC
into       |             | CE
invd       |             | 0F 08
nop        |             | 90
rdmsr      |             | 0F 32
rdpmc      |             | 0F 33
rdtsc      |             | 0F 31
rsm        |             | 0F AA
wait       |             | 9B
wbinvd     |             | 0F 09
wrmsr      |             | 0F 30

Take note of the system commands under mnemonics. It's now going to be possible to specify multiple offsets within the same script (so instead of rewriting the entire code, you can just pick out which offsets to target directly).
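As a side note on the file above, the `B8+r` and `70+cc` entries mean "add the register ID (or condition suffix value) to the base opcode". The Python sketch below works through two of them by hand, reusing the IDs from the register and suffix tables. It assumes 32-bit mode, where the `o32` marker presumably emits no prefix byte; the helper names are invented for the example.

```python
import struct

# Register IDs from "Register Group 02" in the definition file.
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

# A few values from "Suffix Group 00" (jump condition suffixes).
CONDITIONS = {"o": 0x0, "no": 0x1, "b": 0x2, "ae": 0x3, "e": 0x4, "z": 0x4,
              "ne": 0x5, "nz": 0x5, "l": 0xC, "ge": 0xD, "nl": 0xD}


def mov_reg32_imm32(reg: str, value: int) -> bytes:
    """mov reg32, imm32 -> B8+r followed by the little-endian immediate."""
    return bytes([0xB8 + REG32[reg]]) + struct.pack("<I", value & 0xFFFFFFFF)


def short_jcc(suffix: str, rel8: int) -> bytes:
    """Short conditional jump -> 70+cc followed by a signed 8-bit displacement."""
    return bytes([0x70 + CONDITIONS[suffix]]) + struct.pack("<b", rel8)
```

So `mov ecx, 12345678` comes out as `B9 78 56 34 12`, and a `jne` that skips five bytes forward as `75 05`.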

Edit

New screenshot!



Parsers can be swapped from script to script, so a project could contain changes to, say, scripts, as well as some assembly changes.
 
Jul 28, 2008 at 1:16 AM
Hoxtilicious
Ahh, seems nice.
Can you give an example of an edit of the format?
 
Jul 28, 2008 at 1:27 AM
The Bartender
What do you mean? Isn't that what I just posted in the code box...?

If you mean the format of the code that goes in the script window, well... it's assembly. There is no "format." It's a full-blown compiler so it doesn't take whitespace into account. ;) Anyway there's some code in the screenshot.
 
Jul 28, 2008 at 1:44 AM
Hoxtilicious
You don't know what I mean ;)

RuneLancer said:
Ok, the parser file format is ready! There's still something missing (support for r/m8, r/m16, and r/m32 operands) but the basic format is now set in stone.

Here is the 80x86 compiler script file, minus the r/m operand instructions and the MMX/FPU/privileged/undocumented instructions.

Edited scriptfile.
 
Jul 28, 2008 at 2:10 AM
The Bartender
Yes, that IS what I posted in the code box. ;)

I've also given different formats some thought. Here's a decompiler script file for Final Fantasy V's character startup stats.

PHP:
;    ****************************************************************
;    *                                                              *
;    *    Final Fantasy V                                           *
;    *    Base Stat Decompiler Script                               *
;    *    by RuneLancer                                             *
;    *                                                              *
;    *    Extracts the startup stats each character has at level 1. *
;    *                                                              *
;    ****************************************************************

.mode    data
.order   reverse

;    ****************************************************************
;    *                                                              *
;    *    HEADER                                                    *
;    *                                                              *
;    ****************************************************************
header
{
base_offset        0011571E
entry_count        5
entry_size         4
entry_label        char_names
}

;    ****************************************************************
;    *                                                              *
;    *    ENTRY LABELS                                              *
;    *                                                              *
;    ****************************************************************
entry_labels
{
char_names
{
"Butz"    ; 00
"Lenna"   ; 01
"Galuf"   ; 02
"Faris"   ; 03
"Cara"    ; 04
}

stat_level
{
"Terrible" ; 00
"Poor"     ; 01
"Average"  ; 02
"High"     ; 03
"Great"    ; 04
}
}

;    ****************************************************************
;    *                                                              *
;    *    EXTRACT                                                   *
;    *                                                              *
;    ****************************************************************
extract
{
; Get the stats.
#str  : read_byte(0)
#agi  : read_byte(1)
#vit  : read_byte(2)
#mpw  : read_byte(3)
#total: #str + #agi + #vit + #mpw

; Create the labels.
#str_level: label(stat_level, #str)
#agi_level: label(stat_level, #agi)
#vit_level: label(stat_level, #vit)
#mpw_level: label(stat_level, #mpw)

; Define the ranking.
; This is just an analysis value to test conditions.
if #total is  7 then #total_rank: "lowest"
if #total is  8 then #total_rank: "second lowest"
if #total is  9 then #total_rank: "second best"
if #total is 10 then #total_rank: "best"

; And this is what we output.    
define value "Strength"    #str   #str_level
define value "Agility"     #agi   #agi_level
define value "Vitality"    #vit   #vit_level
define value "Magic Power" #mpw   #mpw_level
define value "Total"       #total #total_rank
}

Bwahaha, it even color-coded it for me. Nice board, niiiice board... :)
 
Jul 30, 2008 at 2:00 AM
The Bartender
"All your forum are belong to us!"
Join Date: Jun 18, 2006
Location: Montreal, Canada
Posts: 581
Age: 40
Ok, the data script format is finalized and I've merged compilation and decompilation scripts. Enjoy. :)

PHP:
############################################
#  Final Fantasy V                         #
#  Base Stat Script                        #
#  by RuneLancer                           #
############################################

.mode  data
.order  reverse

############################################
#                                          #
#  HEADER                                  #
#                                          #
############################################
header
{
base_offset   0011571E
entry_count   5
entry_size    4
entry_label   char_names
}

############################################
#                                          #
#  LABELS                                  #
#                                          #
############################################
labels
{
char_names
{
"Butz"    # 00
"Lenna"   # 01
"Galuf"   # 02
"Faris"   # 03
"Cara"    # 04
}
}

############################################
#                                          #
#  FORMAT                                  #
#                                          #
############################################
format
{
byte strength
byte agility
byte vitality
byte magic_pow
}

############################################
#                                          #
#  LANGUAGE                                #
#                                          #
############################################
language
{
comment_char      ";"
ignore_whitespace 1
mode              root

root
{
# Read the next identifier.
$command: get_token();

# Handle the commands.
if $command = "Strength" then &mode: strength ;
if $command = "Agility"  then &mode: agility  ;
if $command = "Vitality" then &mode: vitality ;
if $command = "MagicPow" then &mode: magic_pow;

# If we're still in root, there's a problem.
if &mode = root then error("Unrecognized token: " $command);
}

strength  { $param: get_token(); set(strength  $param); }
agility   { $param: get_token(); set(agility   $param); }
vitality  { $param: get_token(); set(vitality  $param); }
magic_pow { $param: get_token(); set(magic_pow $param); }
}

############################################
#                                          #
#  EXTRACT                                 #
#                                          #
############################################
extract
{
# Get the stats.
*strength : read_byte(0);
*agility  : read_byte(1);
*vitality : read_byte(2);
*magic_pow: read_byte(3);
}

############################################
#                                          #
#  DISPLAY                                 #
#                                          #
############################################
display
{
$name: label(char_names &current_entry);

output(&comment_char " ID " &current_entry ": " $name "\n");
output("Strength " *strength  "\n");
output("Agility  " *agility   "\n");
output("Vitality " *vitality  "\n");
output("MagicPow " *magic_pow "\n");
}

############################################
#                                          #
#  INJECT                                  #
#                                          #
############################################
inject
{
# Just write everything.
write_all();
}

A little breakdown.

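Header contains... well... the header. Basic info about this thing: where the data starts in the file (base_offset), how many entries there are (entry_count), how many bytes each one takes (entry_size), and which label set names the entries (entry_label).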

Labels are mappings between a value and some text. For instance, you can create labels so that weapon IDs show up as the weapon's actual name.
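
As a rough illustration (in Python, not the tool's own script language), a label set boils down to a lookup from a numeric value to a string. The `LABELS` table and the `label` helper below are hypothetical stand-ins for the `labels` block and the script's `label(...)` call. The hex fallback for unknown values is an assumption, since nothing above says what happens in that case.

```python
# Hypothetical stand-in for the "labels" block of the script above.
LABELS = {
    "char_names": ["Butz", "Lenna", "Galuf", "Faris", "Cara"],
}


def label(table, value):
    """Return the text mapped to `value` in the named label set."""
    names = LABELS[table]
    if 0 <= value < len(names):
        return names[value]
    # Assumption: values without a label fall back to a two-digit hex ID.
    return f"{value:02X}"
```

So `label("char_names", 3)` gives back "Faris", the same way the display block looks up the name of the current entry.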

Format is the data format: the list of fields that make up one entry. These are like global variables (they're kept in memory and are accessible from any part of the script).

Language defines the script format. A script is matched against this, so you just have to write plain English instead of working out the details. :o
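
To show what the language block above describes, here is a minimal Python sketch of the same idea: read a command token, then read its parameter and store it in the matching field. `COMMANDS` and `parse_script` are made-up names, and treating every parameter as a decimal integer is an assumption.

```python
# Command words from the "root" block, mapped to the fields they set.
COMMANDS = {
    "Strength": "strength",
    "Agility": "agility",
    "Vitality": "vitality",
    "MagicPow": "magic_pow",
}


def parse_script(text, comment_char=";"):
    """Parse 'Command value' pairs into a dict of field values."""
    tokens = []
    for line in text.splitlines():
        # Drop everything after the comment character, then split on whitespace.
        tokens.extend(line.split(comment_char, 1)[0].split())

    values = {}
    token_iter = iter(tokens)
    for command in token_iter:
        if command not in COMMANDS:
            raise ValueError("Unrecognized token: " + command)
        try:
            param = next(token_iter)
        except StopIteration:
            raise ValueError("Missing value after " + command) from None
        # Assumption: parameters are plain decimal numbers.
        values[COMMANDS[command]] = int(param)
    return values
```

Feeding it the kind of text the display block prints (a `;` comment line followed by four "Name value" lines) should give back one value per field.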

Extract is run when attempting to retrieve data from the EXE. It fills in the fields declared in "Format".
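
For the script above, extraction presumably amounts to something like the Python sketch below: entry N starts at base_offset + N * entry_size, and the byte at each position goes into one field. The function name `extract_entries` is hypothetical, and the assumption that entries sit back to back from base_offset is inferred from the header fields, not confirmed.

```python
BASE_OFFSET = 0x0011571E  # base_offset from the header block
ENTRY_COUNT = 5           # entry_count
ENTRY_SIZE = 4            # entry_size, in bytes
FIELDS = ("strength", "agility", "vitality", "magic_pow")  # from the format block


def extract_entries(data, base_offset=BASE_OFFSET,
                    entry_count=ENTRY_COUNT, entry_size=ENTRY_SIZE):
    """Return one dict per entry, mirroring the read_byte(0..3) calls."""
    entries = []
    for index in range(entry_count):
        # Assumption: entries are stored contiguously starting at base_offset.
        start = base_offset + index * entry_size
        chunk = data[start:start + entry_size]
        if len(chunk) != entry_size:
            raise ValueError(f"entry {index} runs past the end of the data")
        # Iterating over bytes yields integers, one per field.
        entries.append(dict(zip(FIELDS, chunk)))
    return entries
```

To try it without the real file, pass a small `bytes` object and override `base_offset` and `entry_count` to match.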

Display is used to format the data and display it. Usually this would take what's in "Format" and find some way of displaying it.

Inject, finally, dumps what's in "Format" back into the EXE. In the example up here, I just write everything without doing any special formatting.
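
A hedged Python sketch of what "just write everything" could look like for one entry follows. `inject_entry` is a made-up name, not the tool's `write_all()`, and the range check is an addition of this sketch. It validates every field before touching the buffer, so a bad value never leaves an entry half written.

```python
FIELDS = ("strength", "agility", "vitality", "magic_pow")  # from the format block


def inject_entry(data, index, values, base_offset, entry_size=4):
    """Write one entry's fields back into a mutable byte buffer."""
    start = base_offset + index * entry_size
    if start < 0 or start + entry_size > len(data):
        raise ValueError(f"entry {index} does not fit in the data")

    # Validate first so nothing is written if any field is out of range.
    raw = []
    for name in FIELDS:
        value = values[name]
        if not 0 <= value <= 0xFF:
            raise ValueError(f"{name}={value} does not fit in one byte")
        raw.append(value)

    data[start:start + len(raw)] = bytes(raw)
```

It expects a `bytearray` (or anything else that supports slice assignment), since the point is to patch the data in place.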

This file, by the way, is more or less how you'd create an editor. This one's an editor for the startup stats of the five characters in Final Fantasy V, for instance. You do not deal with funky shit like this inside the application. Most of you, in fact, will probably never touch this part.

This format may eventually extend to the script and assembly parts of this. Anyways. Time to start coding the parser. :D
 