(The general-purpose registers can be "split". For example, AX consists of AH and AL: AH contains the high byte of AX and AL contains the low byte. You also have BH, BL, CH, CL, DH and DL. So if, e.g., DX contains the value 1234h, DH would be 12h and DL would be 34h.)
There is also a 16-bit FLAGS register. All "flags" (see below) are stored here. Of its 16 bits, 9 are used. These bits are called flags because they can be either SET (1) or NOT SET (0). Each flag has a name and a purpose.
| Flag | Name | Bit | Meaning |
|---|---|---|---|
| ZF | Zero Flag | 6 | if set, the result of the last calculation was zero |
| AF | Auxiliary Carry | 4 | a second carry flag: set on a carry out of bit 3 (the low nibble), used for BCD arithmetic |
| PF | Parity Flag | 2 | set if the low byte of the result has an even number of 1 bits |
| CF | Carry Flag | 0 | set if the calculation carried out of (or borrowed into) the left-most bit |
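Splitting a register and testing a flag are both plain bit arithmetic. Here is a small sketch in Python (not assembly, only so you can run it anywhere). The helper names split_register and flag_is_set are made up for this illustration, and the FLAGS value 0045h is just an example value, not something debug will necessarily show you.

```python
# Bit positions of the four flags listed in the table above.
FLAG_BITS = {"ZF": 6, "AF": 4, "PF": 2, "CF": 0}

def split_register(value16):
    """Return (high byte, low byte) of a 16-bit register value."""
    return (value16 >> 8) & 0xFF, value16 & 0xFF

def flag_is_set(flags, name):
    """True if the named flag bit is 1 in the 16-bit FLAGS value."""
    return (flags >> FLAG_BITS[name]) & 1 == 1

dh, dl = split_register(0x1234)       # DX = 1234h
print(f"DH={dh:02X}h DL={dl:02X}h")   # DH=12h DL=34h

flags = 0x0045                        # example value: bits 6, 2 and 0 are set
for name in FLAG_BITS:
    print(name, "SET" if flag_is_set(flags, name) else "NOT SET")
```

With this example value ZF, PF and CF come out SET and AF comes out NOT SET.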
Test it!
If you want to see all these registers and flags, you can go to DOS and start "debug" (just type debug). When you're in debug, type "r" and you'll see all the registers and some abbreviations for the flags. Type "q" to quit again. We won't use debug to program in this tutorial; we'll use a real assembler. I use TASM 3.2, but MASM or any other assembler works just fine too.
So if the memory looks like this: 78h 56h, and you get a word from memory, you'll get the value 5678h: the low byte is stored first. (Note: I use the "h" after a number to indicate it's hexadecimal.) However, if you just get a byte from memory, it goes this way: memory 78h 56h -----> the first byte you get is 78h. Okay, pretty clear huh?
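If you want to check the byte order yourself, here is a tiny Python sketch of the same two reads. It only models the memory as two bytes; the variable names are mine.

```python
memory = bytes([0x78, 0x56])                  # memory as in the example: 78h 56h

word = int.from_bytes(memory[0:2], "little")  # a word: low byte comes first
first_byte = memory[0]                        # a single byte: just the first one

print(f"word = {word:04X}h")                  # word = 5678h
print(f"first byte = {first_byte:02X}h")      # first byte = 78h
```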
Now let's talk about segments. The 8086 divides its memory into segments. Segments are (standard in DOS) 64 KB big and have a number. These numbers are stored in the segment registers (see above). The three main segments are the code, data and stack segments. Segments overlap each other almost completely. If you start debug again and type "d", you can see some addresses at the left of the screen. The format is like this: 4576:0100. That's a memory address. The first number is the segment number and the second number is the offset within the segment. So FFFF:FFF0 means: segment FFFFh, and FFF0h bytes from the beginning of the segment.
As I said before, segments overlap. The address 0000:0010 is EXACTLY
the same address as 0001:0000. That means that segments begin at paragraph
boundaries (a paragraph = 16 bytes, so a segment starts at an address
divisible by 16). Now you can start calculating REAL addresses in memory.
An example: 0000:0010 means segment 0000h, offset 10h. We multiply the
segment number by 16 and add the offset.
Note that the offset 10h means the value 16 in decimal:
    0000h * 16 + 16 = 0 + 16 = 16
Next, the other address, 0001:0000:
    0001h * 16 + 0 = 16 + 0 = 16
Both give the same real address, 16 (10h).
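The same calculation as a few lines of Python, in case you want to try other addresses. The function name physical_address is my own; it just does "segment times 16 plus offset" as described above.

```python
def physical_address(segment, offset):
    """Real address of segment:offset: segment * 16 + offset."""
    return segment * 16 + offset

a = physical_address(0x0000, 0x0010)   # 0000:0010
b = physical_address(0x0001, 0x0000)   # 0001:0000
print(a, b, a == b)                    # 16 16 True

# The address from the debug example, 4576:0100
print(f"{physical_address(0x4576, 0x0100):X}h")   # 45860h
```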
By the way, this segmentation of memory is built into the processor; DOS only decides where in memory your program's segments end up when it loads the program. On a 286 or higher, you have something called real mode and protected mode. This segment explanation is based on real mode. In protected mode it's way different, but don't bother with that: it's really complicated stuff you don't need to know right now. Just assume that what I explained about segments is ALWAYS true. But remember in the back of your head that there's more.... Trust me...... I know what I'm talking about.
    .model small
    .stack
    .data
    message db "Hello world, I'm learning Assembly !!!", "$"

    .code
    main proc
        mov ax,seg message
        mov ds,ax

        mov ah,09
        lea dx,message
        int 21h

        mov ax,4c00h
        int 21h
    main endp
    end main
You can assemble this by typing "tasm first [enter] tlink first [enter]", or something like "masm first [enter] link first [enter]". You must have an assembler and the link/tlink program. I'll explain the code now.
.model small : Lines that start with a "." are used to
provide the assembler with information. The word(s) behind it say what kind
of info. In this case it just tells the assembler the program is small
and doesn't need a lot of memory. I'll get back to this later.
.stack : Another line with info. This one tells the assembler
that the "stack" segment starts here. The stack is used to store temporary
data. Our program doesn't use it explicitly, but it must be there, because we
make an .EXE file and these files MUST have a stack.
.data : indicates that the data segment starts here and
that the stack segment ends there.
.code : indicates that the code segment starts there
and the data segment ends there.
main proc : Code is normally organized in procedures, just like in C or any other language. This indicates that a procedure called main starts here. main endp states that the procedure is finished. A procedure MUST have a start and an end.
end main : tells the assembler that the program is finished. It also tells the assembler where to start the program: at the procedure called main in this case.
message db "xxxx" : DB means Define Byte and so
it does. In the data-segment it defines a couple of bytes. These bytes
contain the information between the quotes. "Message" is a name to identify
this byte-string. It's called an "identifier".
mov ax, seg message : AX is a register. You use registers
all the time, so that's why you had to know about them before I could explain
this. MOV is an instruction that moves data. It can have a few "operands"
(don't worry, I'll explain these names later). Here the operands are AX
and seg message. Seg message can be seen as a number. It's the number of
the segment "message" is in (the data-segment). We have to know this number,
so we can load the DS register with it. Otherwise we can't get to the
byte-string in memory. We need to know WHERE the byte-string is located in
memory. The number is loaded into the AX register. MOV always moves data to
the operand left of the comma and from the operand right of the comma.
mov ds,ax : The MOV instruction again. Here it moves the number in the AX register (the number of the data segment) into the DS register. We have to load the DS register this way (with two instructions). Just typing "mov ds,seg message" isn't possible.
mov ah, 09 : MOV again. This time it loads the AH register with the constant value nine.
lea dx, message : LEA means Load Effective Address. This instruction stores the offset of the byte-string message within the data-segment into the DX register. This offset is the second thing we need to know when we want to know where "message" is in memory. So now we have DS:DX. See the segment explanation above.
int 21h : This instruction causes an interrupt. The processor calls a routine somewhere in memory. 21h tells the processor what kind of routine, in this case a DOS routine. INTs are very important and I'll explain more about them later, since they're also very, very complex. However, for now assume that it just calls a procedure from DOS. The procedure looks at the AH register to find out what it has to do. In this example the value 9 in the AH register indicates that the procedure should write the byte-string at DS:DX to the screen, up to (but not including) the "$" that ends it.
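To see why the message ends with a "$", here is a rough Python model of what the DOS routine does with the string. This is only a sketch: print_dollar_string is a made-up name, the data segment is modelled as a plain bytes object, and real DOS writes to the screen instead of returning text.

```python
def print_dollar_string(data_segment, offset):
    """Model of DOS function 09h: take the characters from offset up to '$'."""
    end = data_segment.index(b"$", offset)        # the '$' marks the end
    return data_segment[offset:end].decode("ascii")

# The data segment of our program: the message followed by "$".
data_segment = b"Hello world, I'm learning Assembly !!!$"

text = print_dollar_string(data_segment, 0)       # DX = offset 0 in this model
print(text)
```

The "$" itself is not printed; it only tells the routine where to stop.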
mov ax, 4c00h : Load the AX register with the constant
value 4c00h.
int 21h : The same INT again. But this time the AH register
contains the value 4ch (AX=4c00h), and to the DOS procedure that means "exit
program". The value of AL is used as an "exit-code"; 00h means "No error".
That's it!!! You now fully understand this program (I hope)
    0F77:0000 B8790F  MOV AX,0F79
    0F77:0003 8ED8    MOV DS,AX
    0F77:0005 B409    MOV AH,09
First, 0F77:0000 is the segment number and offset. B8790F is the machine code of the mov ax,0f79 instruction. B8 means "mov ax," and 790F is the number (in reversed byte order). Note that the instruction in the source was mov ax,seg message: seg message has been replaced with the actual segment number, 0F79h.
The other instruction
Now let's calculate another address for the data, 0F79:0000. Subtract
2 from the segment number. That gives you 0F77 (the code segment).
0002:0000 --> 2*16+0=32. Two segments further means 32 bytes further, and
that means an offset of 32 (20h).
So the data is also at this location: 0F77:0020. Check by typing "d 0f77:0020".
Please note that it's the SAME data. We can see it at multiple addresses
only because the segments overlap! But in the program we said the data
had to be in a data-segment. Remember the .data directive? Well, it
IS in a data-segment; the data is just stored directly behind the code,
but that doesn't matter. I mean, we can address the data with a segment
number and an offset of zero.
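Here is the same conversion as a short Python sketch, so you can try other segment numbers. Both function names are made up for this illustration.

```python
def physical_address(segment, offset):
    """Real address of segment:offset: segment * 16 + offset."""
    return segment * 16 + offset

def offset_in_segment(segment, offset, other_segment):
    """Offset of the same byte when it is addressed through other_segment."""
    return physical_address(segment, offset) - other_segment * 16

off = offset_in_segment(0x0F79, 0x0000, 0x0F77)
print(f"0F79:0000 = 0F77:{off:04X}")   # 0F79:0000 = 0F77:0020
```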
Also note that the data doesn't start immediately after the int 21h instruction that ends the program; first there are some undefined bytes (probably zero). That's because segments start at paragraph boundaries. The data-segment couldn't start at 0F77:0010 anymore, because there is code there. If there wasn't any code there, the data-segment would have been 0F78. So the data-segment has to be 0F79 (the closest match), and some bytes after the code and before the data just take up space. But that doesn't matter. Please remember that the assembler doesn't care about the order of the segments in the .ASM file. In this example we declared the data-segment first, but the assembler puts it last in memory.
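The rounding up to the next paragraph can be sketched in Python like this. Note the assumptions: next_segment is a made-up helper, and the code length of 17 bytes is only an example of code that spills just past 0F77:0010; check the real length with debug.

```python
PARAGRAPH = 16   # a paragraph is 16 bytes

def next_segment(code_segment, code_length):
    """First segment number that starts at or after the end of the code."""
    paragraphs = (code_length + PARAGRAPH - 1) // PARAGRAPH   # round up
    return code_segment + paragraphs

# Assumption: 17 bytes of code, so there is code at 0F77:0010.
print(f"{next_segment(0x0F77, 17):04X}")   # 0F79

# With 16 bytes or less, nothing would be at 0F77:0010.
print(f"{next_segment(0x0F77, 16):04X}")   # 0F78
```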
    MOV AX,1234H
The final value of AX will be 1234h. First we load 1234h into AX, then we push that value onto the stack. We now store 9 in AH, so AX will be 0934h, and execute an INT. Then we pop the AX register: we retrieve the pushed value from the stack. So AX contains 1234h again.
Another example:
    MOV AX,1234H
    MOV BX,5678H
    PUSH AX
    PUSH BX
    POP AX
    POP BX
The final values will be: AX=5678h BX=1234h. First the value 1234h was pushed, and after that the value 5678h was pushed onto the stack. According to LIFO, 5678h comes off first, so AX will pop that value and BX will pop the next.
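If LIFO is still a bit fuzzy, here is the second example replayed in Python with a plain list as the stack. It's only a model: a Python list is not the real stack segment, but append and pop behave like PUSH and POP here.

```python
stack = []                 # an empty stack

ax, bx = 0x1234, 0x5678    # MOV AX,1234H / MOV BX,5678H
stack.append(ax)           # PUSH AX
stack.append(bx)           # PUSH BX
ax = stack.pop()           # POP AX  (gets 5678h, the last value pushed)
bx = stack.pop()           # POP BX  (gets 1234h)

print(f"AX={ax:04X}h BX={bx:04X}h")   # AX=5678h BX=1234h
```

So pushing two registers and popping them in the same order swaps their values.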
| Identifiers | An identifier is a name you apply to items in your program. The two types of identifiers are "name", which refers to the address of a data item, and "label", which refers to the address of an instruction. The same rules apply to names and labels. |
| Statements | A program is made of a set of statements. There are two types of statements: "instructions", such as MOV and LEA, and "directives", which tell the assembler to perform a specific action, like ".model small". |
Here's the general format of a statement:
identifier - operation - operand(s) - comment
The identifier is the name as explained above.
The operation is an instruction like MOV.
The operands provide information for the Operation to act on.
Like MOV (operation) AX,BX (operands).
The comment is a line of text you can add as a comment, everything
the assembler sees after a ";" is ignored.
So a complete instruction looks like this:
    MOVINSTRUCTION: MOV AX,BX ;this is a MOV instruction
The label and the comment are optional. In fact I already explained directives, but, okay, I'll do it again. Directives provide the assembler with information on how to assemble a .ASM file. .MODEL SMALL and .CODE are, for example, directives.
In Part 2 I'll explain some more instructions and I'll explain
how to address data yourself.
I'll also explain the Interrupts and interrupt table.