sections in this module | City
College of San Francisco - CS270 Computer Architecture Module: MIPS-I |
module list |
Basic
MIPS Instructions
MIPS was revolutionary for its time. It was probably the chip that settled the longstanding competition between RISC chips and the then-dominant CISC chips. All new microprocessor chips made today are RISC. The only surviving CISC chip in use is the Intel family, and it could be argued that it is only Intel's Herculean effort and resources (and their large installed base) that has kept this technology in the forefront.
The MIPS instruction set is easier to program, use, and implement for several reasons
MIPS future was in question several times over its history. As
recently as ten years ago it seemed destined for limited use in
embedded devices. Since then, however, a huge investment has been made
in enhancing and licensing the technology from various sources,
culminating in the purchase of the company itself by a Chinese
consortium a few years ago. This was preceded by many years of
independent development of MIPS-based processors (named Gobson and
Loongson. It was at that time on the short list of processors being
considered to be chosen for a national Chinese architecture. It is
curious that the company was purchased (thus alleviating any
outstanding patent issues) soon afterwards. The winner of the national
architecture was never announced in the media. You can read a lot of
details about the status of the various Chinese MIPS processors on
wikipedia here.
MIPS parameters
All MIPS instructions are 32-bits. To ease the writing of
assembly instructions given this restriction, the MIPS assembler
allows pseudoinstructions,
which appear to be simple instructions, but, instead generate a
sequence of up to three actual MIPS instructions. To facilitate the use
of pseudoinstructions, register 1 ($1 or $at) is reserved as the assembler temporary.
All operations are performed in registers on 32-bit quantities.
The only instructions that access memory are loads and stores, and
these are the only instructions that indicate data size. Loads and
stores must be properly aligned. Thus a load of a four-byte
quantity must be done from an address that is a multiple of four.
Loads of small sizes (halfword or byte) have signed and unsigned
versions.
Register classes
As indicated, MIPS has a convention for register usage. Registers are divided into register classes:
Basic instruction encoding.
MIPS instructions are divided into fields, just like our Simple Machine. The fields indicate the operation, and the operands. Let's take this a step at a time
Let's see how this works:
Arithmetic Instructions
Arithmetic instructions are encoded as R-type instructions. They have the following parts
They are used, of course, for arithmetic operations such as add,
shift, and, and or. (Note that the use of the registers changes in
the shift instructions. See below.)
Examples:
add
rd,rs,rt # rd = rs + rt (signed addition. overflow is
possible)
addu rd,rs,rt # rd = rs + rt (unsigned addition)
and rd,rs,rt # rd = rs AND rt
sll rd,rt, N # rd = rt << N (0<N<31)
If the integers a, b, and c are in registers $t0, $t1 and $t2 respectively, write code for the following:
c = a-b+c
First, lets rewrite this as
c += a-b
One way of coding this would be
add
$t2,$t2,$t0 # c += a
sub $t2,$t2,$t1 # c -= b
An alternate way would use a temporary register
sub
$t3, $t0, $t1 # t3 = (a-b)
add
$t2, $t2, $t3 # c += t3
add vs addu
Bitwise addition proceeds the same whether the operands are
signed or unsigned. Thus, there does not seem to be any reason for
add and addu instructions.
Instead of indicating any difference in how the basic instruction
operates, the difference between add and addu has to do with
overflow.
If you recall, overflow is indicated by comparing the signs of the operands to that of the result. The only question when using addition is do you want to know if there was an overflow?
The add instruction should be used when you want to know about
overflow. The addu instruction should be used when you dont want
to know about overflow. Generally, when dealing with signed
numbers, you want to use add, and when dealing with unsigned
numbers, you want to use addu.
Note that both addi and addiu
(where one operand is a constant) use sign-extension when the
constant is converted to 32-bits.
Using constants in arithmetic
instructions
Besides the shift operators, some arithmetic instructions have
forms that allow the second source operand to be a small constant.
These include add, as well as the bitwise operators and, or and
xor. This form uses the lower 16-bits of the instruction to hold a
16-bit constant. For example
addi
rt,rs,imm e.g. addi $t0,$t2,1024
You should note for later that the destination register in this instruction is rt(!)
This form of the instruction uses a different encoding, called I-type. We will discuss the encodings in a later section. The I-type encoding is important, however, as this is the encoding used for load and store operations.
Load/Store Operations
As mentioned previously, all operations in MIPS code is done in registers. Thus, before an operation can be performed, data must be loaded into registers. After the operation is complete, the data must be stored from the register(s) to its permanent location in memory.
Although all operations are done on the entire register (i.e., 32-bits), load and store operations must deal with the three different sizes available - bytes, halfwords (shorts) and words (ints). This is indicated by the size letter in the load or store operation:
lw - load 32-bits
lb - load 8-bits
Loading and storing 32-bits is simple. For the store instruction, storing smaller quantities (bytes or halfwords) simply ignores the most-significant bits of the register. When loading smaller quantities, we must do something about the other bits in the register.
If the datum being loaded is a signed quantity, the small datum is sign-extended to fill the entire register. This is the default of the load instructions, thus, lb is inherently signed.
If the datum being loaded is an unsigned quantity, the
most-significant bits of the register should be zeroed. These load
instructions have the additional character u for unsigned. Thus, lbu is load byte unsigned. It is
inherently unsigned.
If you load the upper (most-significant) part of the instruction (with lui, as we shall see) the lower 16 bits are zeroed.
The load and store instructions have a register (the destination for load instructions, and the source for store instructions) followed by a memory specification. Without getting hung up on the memory specifier right now, let's look at how loading a datum affects the register. If the datum being loaded has the value -2, here is what the register looks like after the load instruction:
instruction |
lh
reg,mem |
lhu reg,mem | lb reg,mem | lbu reg,mem |
register contents |
0xfffffffe |
0x0000fffe |
0xfffffffe |
0x000000fe |
The Memory Operand
There is a single way to address memory in instructions, by using the addressing mode base + displacement. This means a register holds a memory address, and the effective memory address (the one loaded from or stored to) is that address plus a constant. The constant, which is a 16-bit signed constant, is specified in the instruction.
For example, if you want to load a 32-bit word from an address contained in register $t0, placing the word in register $t1, the instruction would be
lw $t1,0($t0)
Thus, 0 is added to the contents of $t0 to get the address of the 32-bit quantity to load. If $t0 is the base of an array, and the element of the array you want has a constant index like 1, you could access the element (array[1]) by adjusting the constant. But there is a trick
the constant must be multiplied by the size of the array element!
Thus, if the array is an integer array, and you want element #1 (array[1]), the instruction would be
lw $t1,4($t0)
If the array is a character array and you want element #1, the instruction is
lw $t1,1($t0)
Suppose you are operating on an array and you need to access array[i], where the array is an integer array, the address of the start of the array (the base of the array) is in $t0, and the value of i is in $t2. Here is what you do:
sll $t3,$t2,2 # multiply $t2 by 4 (the size of an element)
lw $t1,0($t3)
Note that storing to memory takes the same format, and the memory
address is still the right-hand operand. Thus, the equivalent
store operation to that above is
sw $t1,0($t3)
Loads and stores must be aligned.
This means that you can only load an N-byte quantity (N=1,2,4,...)
from an address that is a multiple of N bytes. Thus, while it is
legal to load a byte
from any memory address, it is only legal to load a word from an address that is
a multiple of 4.
Loading an initial address
Loading the base address of the array into the $t0 register in the example above is a bit tricky. Remember, an instruction only has room for a 16-bit constant, and addresses are 32-bits! This means we must initialize $t0 in two steps:
If the desired address is 0x01004008:
lui $t0, 0x100 # $t0 now has 0x01000000
addi $t0,$t0,0x4008 # $t0 now has 0x01004008
Note that the add instruction has a 16-bit signed constant. If the address wanted was 0x01008008, we could not use an add instruction! This is what would happen:
lui $t0, 0x100 # $t0 now has 0x01000000
addi $t0,$t0,0x8008 # $t0 now has 0x00ff8008
There are two solutions to this. First, we could simply add one to the constant used in the lui instruction
lui $t0, 0x101 # $t0 now has 0x01010000
addi $t0,$t0,0x8008 # $t0 now has 0x01008008
an alternative solution is to use an ori instruction instead of an add. (The constant in andi and ori is unsigned)lui $t0, 0x100 # $t0 now has 0x01000000
ori $t0,$t0,0x8008 # $t0 now has 0x01008008
This operation (loading a 32-bit address into a register) is so common that it has pseudoinstruction in MIPS assembler: la (for load address). This means the instruction la can be used in MIPS but it is translated to the appropriate lui/ori combination. In our case this would be
la $t0, 0x1008008
The global pointer
As you see, loads from 32-bit addresses ("static" data) are
cumbersome. To streamline such loads, MIPS programs are laid
out in memory so that "static data" (as opposed to data on the
stack) is in blocks of 64kB. A register is then initialized
with an address that is in the middle of the current block of
data. Later, that register can be used as the base register, and
loads and stores can reference memory using a signed offset from
it. If there is more than one 64kB block of data for a program,
the global pointer can be initialized each time a function is
entered. If a function's data is larger than 64kB, the global
pointer will have to be manipulated more often.
In our programs, data will be very small. We could simply use a global pointer that points to the beginning of our memory area. That way offsets are always positive.
Pseudoinstructions
The MIPS assembler(s) (including MARS) relax the strict MIPS assembly language specification by letting us be sloppy about the exact operand in some cases, as well as adding complex instructions that must be implemented as a sequence of instructions. Let me give you a few examples:
For convenience, several uses of add appear as pseudoinstructions. For example
addi $t0,$t1,-16 can be coded as the pseudoinstruction subi $t0,$t1,16
add $t0,$zero,$t1 can be coded as the pseudoinstruction move $t0,$t1
addi $t0,$zero,1024 can be coded as the pseudoinstruction li $t0,1024
In most cases, an instruction that allows a 16-bit immediate operand has a corresponding pseudoinstruction for the form with a 32-bit immediate operand. Thus
lw $t0,0x1000($s0) is a real instruction but
lw
$t0,0x10000($s0) is a pseudoinstruction
Last, the assembler is forgiving about the exact instruction ending. Thus it will allow the instruction
add $t0,$t1,16
when the instruction is actually addi.
A complete list of pseudoinstructions implemented in MARS is
available on its help page.
Prev | This page was made entirely
with free software on linux: the Mozilla Project and Openoffice.org |
Next |