City College of San Francisco - CS270 Computer Architecture Module: MIPS-I

Bit Instructions and Instruction Encoding

Bit instructions are used to manipulate data at the bit level. Although not common in high-level code, they are quite common in the instructions a compiler generates.

The shift instructions

Consider a number 2^N where 31 > N > 0. This number is represented in binary on our machine by a word with a single bit set, bit #N. Thus 2^5 has bit 5 set (counting from 0) or 100000 in binary. In decimal, of course, this is 32.

If we move this bit to the left one position, it becomes 2^6, or 64. Moving left one bit position multiplies the number by 2. This is a left-shift operation. A left-shift by P positions multiplies the value by 2^P.

Similarly, moving the bit one position to the right divides the number by 2. Thus, 2^6 (decimal 64) becomes 2^5 (decimal 32) when right-shifted one position.

Of course, integer division is not exact. Any bits lost (shifted off the right side) mean the result is inexact. We know this from integer division in general: 3/2 is 1 in integer division. This is because the binary number 011 (decimal 3), when right-shifted one position becomes 001, and the previous 2^0 bit is lost.
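As a quick check of the arithmetic above, here is a minimal sketch in Python, used here only as a calculator. The helper names and the 32-bit word size are assumptions made for illustration; they are not part of MIPS or MARS.

```python
WORD_MASK = 0xFFFFFFFF  # assumption: a 32-bit machine word


def shift_left(value, positions):
    """Left shift on a 32-bit word: bits pushed past bit 31 are dropped."""
    return (value << positions) & WORD_MASK


def shift_right(value, positions):
    """Right shift of an unsigned 32-bit word: bits pushed off the right are lost."""
    return (value & WORD_MASK) >> positions


print(shift_left(2**5, 1))   # 64: one position left multiplies by 2
print(shift_right(2**6, 1))  # 32: one position right divides by 2
print(shift_right(3, 1))     # 1: the old 2^0 bit is lost, so 3/2 gives 1
```

A left shift by P positions gives the same value as multiplying by 2^P, as long as no set bit is pushed off the left end of the word.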

Right-shifting has another problem: what do we do when we right-shift a number that has its most-significant bit set? If we shift zero bits in from the left, the sign bit is no longer set! This would be correct if the original number was unsigned. Let's look at an example using four-bit numbers:

1100 in a four-bit unsigned number is decimal 12. If we right-shift this one position and set the leftmost position to 0, the result is 0110, or 6 base 10. This is the correct answer. However, if the original number was interpreted as signed, its original value would have been -4. When we right-shift -4 by 1 position it shouldn't become 6!  Instead we want to set the [shifted-in] most-significant bit to 1. This would produce 1110, which, when interpreted as a signed four-bit number is -2.

These two types of right shifts are called logical (when we treat the number as unsigned, and shift 0 bits in) and arithmetic (when we treat the number as signed, and replicate the sign bit in the bits shifted in). Since there is only one type of left-shift and it shifts 0 bits in, it is also called logical.
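The four-bit example can be replayed with a small model of the two right shifts. This is only a sketch: the function names borrow the MIPS mnemonics srl and sra, but the `bits` parameter and the `to_signed` helper are assumptions added so the four-bit case can be tried directly.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):      # sign bit set: the value is negative
        return value - (1 << bits)
    return value


def srl(value, positions, bits=32):
    """Logical right shift: zero bits are shifted in from the left."""
    return (value & ((1 << bits) - 1)) >> positions


def sra(value, positions, bits=32):
    """Arithmetic right shift: copies of the sign bit are shifted in."""
    # Python's >> on a negative integer replicates the sign, so convert first,
    # shift, then cut the result back down to `bits` bits.
    return (to_signed(value, bits) >> positions) & ((1 << bits) - 1)


print(format(srl(0b1100, 1, bits=4), "04b"))  # 0110 (12 becomes 6)
print(format(sra(0b1100, 1, bits=4), "04b"))  # 1110 (-4 becomes -2)
```

For a number whose most-significant bit is clear, both functions give the same answer, which is why the distinction only matters for values with the top bit set.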

Example:

Implement the following operation in MIPS code:

int index;
index /= 2;

Choose the correct code sequence from the following:

(a)
 lw  \$t0,index
 sll \$t1,\$t0,1
 sw  \$t1,index

(b)
 lw  \$t0,index
 srl \$t1,\$t0,1
 sw  \$t1,index

(c)
 lw  \$t0,index
 sra \$t1,\$t0,1
 sw  \$t1,index

AND and OR operators

I hope we don't need to discuss what AND and OR mean for a pair of bits. Given two words of the same size, x AND y (written xy; there could be a dot between them, but the dot won't display correctly here) is simply the AND of each pair of corresponding bits. Similarly, the OR of x and y (written x + y) is the OR of each pair of corresponding bits. Resorting to our four-bit numbers, if x is 0110 and y is 1100:

• what is xy ? (answer 0100)
• what is x + y ? (answer 1110)

That's how easy AND and OR are. The amazing thing is that these three primitives (and, or, and shift) form the basis of all arithmetic operations!
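To see that a word-sized AND or OR really is just the single-bit operation applied at each position, here is a small sketch. The helper `combine_bit_by_bit` is a made-up name for illustration; Python's built-in `&` and `|` operators do the same thing in one step.

```python
def combine_bit_by_bit(x, y, bit_op, bits=4):
    """Apply a single-bit operation to each pair of corresponding bits."""
    result = 0
    for i in range(bits):
        x_bit = (x >> i) & 1           # bit i of x
        y_bit = (y >> i) & 1           # bit i of y
        result |= bit_op(x_bit, y_bit) << i
    return result


x = 0b0110
y = 0b1100

x_and_y = combine_bit_by_bit(x, y, lambda a, b: a & b)
x_or_y = combine_bit_by_bit(x, y, lambda a, b: a | b)

print(format(x_and_y, "04b"))  # 0100, same as x & y
print(format(x_or_y, "04b"))   # 1110, same as x | y
```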

Example:

Given two 16-bit unsigned non-zero numbers in \$t0 and \$t1, set \$t2 so that \$t2 has the number in \$t1 as its most significant 16 bits and the number in \$t0 as its least significant 16 bits.

Which code sequence is correct:

(a)
 and  \$t2,\$t0,\$t1

(b)
 sll  \$t2,\$t1,16
 or   \$t2,\$t2,\$t0

(c)
 sll  \$t2,\$t1,16
 and  \$t2,\$t2,\$t0

(d)
 sll  \$t2,\$t0,16
 sra  \$t2,\$t2,16
 sll  \$t1,\$t1,16
 or   \$t2,\$t1,\$t2

(One of the answers is correct all of the time and one is correct some of the time.)

Other bitwise operators

Besides AND and OR, there are two operators that are useful: NOR and XOR. For those who are unfamiliar with these operators, here are the basics:

Given two single-bit numbers X and Y, X NOR Y is 1 only if BOTH X and Y are 0.

Similarly X XOR Y is 1 only if exactly one of X or Y is 1.

Here is a truth-table:

 X  Y  X AND Y  X OR Y  X NOR Y  X XOR Y
 0  0     0       0        1        0
 0  1     0       1        0        1
 1  0     0       1        0        1
 1  1     1       1        0        0
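The truth table can be regenerated from the two definitions above. This is a minimal sketch; `nor_bit`, `xor_bit` and `truth_table` are names chosen here for illustration.

```python
def nor_bit(x, y):
    """1 only if BOTH x and y are 0."""
    return 1 - (x | y)


def xor_bit(x, y):
    """1 only if exactly one of x and y is 1."""
    return x ^ y


def truth_table():
    """Rows of (X, Y, X AND Y, X OR Y, X NOR Y, X XOR Y)."""
    rows = []
    for x in (0, 1):
        for y in (0, 1):
            rows.append((x, y, x & y, x | y, nor_bit(x, y), xor_bit(x, y)))
    return rows


for row in truth_table():
    print(*row)
```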

MIPS Instructions

All of and, or, xor and nor have R-type MIPS instructions where three registers are used:

op  rd, rs, rt    # rd = rs op rt   for op=and,or,xor,nor

All of these except nor also have immediate counterparts where the 16-bit immediate value is treated as unsigned (not sign-extended) when the operation is performed. These are useful for creating 16-bit ANDs, ORs and XORs.
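The difference between treating the immediate as unsigned and sign-extending it can be modelled as follows. This is a sketch on 32-bit values; the function names mirror the MIPS mnemonics, but the functions themselves are only an illustrative model written for these notes.

```python
WORD_MASK = 0xFFFFFFFF  # assumption: 32-bit registers


def zero_extend16(imm):
    """Upper 16 bits become 0 (what andi, ori and xori do with the immediate)."""
    return imm & 0xFFFF


def sign_extend16(imm):
    """Upper 16 bits become copies of bit 15 (what the add immediates do)."""
    imm &= 0xFFFF
    return imm | 0xFFFF0000 if imm & 0x8000 else imm


def andi(rs, imm):
    return rs & zero_extend16(imm)


def ori(rs, imm):
    return (rs | zero_extend16(imm)) & WORD_MASK


def xori(rs, imm):
    return (rs ^ zero_extend16(imm)) & WORD_MASK


def nor(rs, rt):
    return ~(rs | rt) & WORD_MASK


print(hex(ori(0, 0xF800)))        # 0xf800
print(hex(sign_extend16(0xF800))) # 0xfffff800
```

With an immediate below 0x8000 the two extensions agree, so the distinction only shows up when bit 15 of the immediate is set.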

Extracting values stored in a bit-field

A common operation in low-level computer software requires the extraction of a particular sequence of bits and their interpretation as a value. A simple example of this is the basic permissions on Unix.

Without getting too much into Unix-land, the first 16-bits of the master data structure stored for every data object on Unix, called an inode, comprise the object's mode. The mode includes both the filetype and the basic permissions. Within these 16 bits, bit #8 (counting from 0) indicates whether the owner of the file can read it. Thus, our data looks like

-------B--------

where B is the bit we want, and - indicates each bit that is 'in the way'.

Problem: if a file's mode is in \$t0, set \$t1 to 1 if the file's owner can read it, 0 otherwise.

There are several ways to solve this problem. Let's look at this graphically again. Here is the operation we are interested in:

-------B-------- ->   000000000000000B

You may already see a simple solution to this problem, but we are going to take the long way around and discuss the generally-useful idea of a mask.

A mask is a special bit pattern that is constructed so that an AND or OR operation can be applied to a selected sequence of bits to isolate it. In our case, we want to construct a mask so that we can extract our single bit, i.e., so that the operation

-------B-------- ->  0000000B00000000

can be performed. Of course, if we use an AND operator, the bit pattern is all zeros except in the position of B, where we have a 1:

-------B-------- AND  0000000100000000  ->  0000000B00000000

Once this operation is performed, we can simply shift B to the correct position:

0000000B00000000  ->  000000000000000B

In our case, where the file's mode is in \$t0, the following sequence would be used

andi  \$t2, \$t0, 0x100    # here 0x0100 is our 'mask'
srl   \$t1,\$t2,8

We indicated earlier that this may not be the easiest solution, though it is the most general. A simpler solution would probably be

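sll \$t2,\$t0,23   # move bit 8 up to bit 31, discarding the bits above it
srl \$t1,\$t2,31   # move bit 31 down to bit 0, shifting in zeros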
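Both approaches (mask then shift, and shift left by 23 then right by 31) can be compared in a short model. This is a sketch on 32-bit values; the function names are made up, and the sample mode values 0o100644 and 0o100244 are illustrative assumptions (a regular file with and without owner read permission).

```python
WORD_MASK = 0xFFFFFFFF  # assumption: 32-bit registers


def owner_can_read_mask(mode):
    """Mask with 0x0100 to isolate bit 8, then shift it down to bit 0."""
    return (mode & 0x0100) >> 8


def owner_can_read_shifts(mode):
    """Shift bit 8 up to bit 31, then logically shift it down to bit 0."""
    moved_to_top = (mode << 23) & WORD_MASK  # everything above bit 8 falls off
    return moved_to_top >> 31                # zeros fill in from the left


print(owner_can_read_mask(0o100644), owner_can_read_shifts(0o100644))  # 1 1
print(owner_can_read_mask(0o100244), owner_can_read_shifts(0o100244))  # 0 0
```

The second shift must be logical (srl); an arithmetic shift would fill the word with copies of the bit instead of leaving a single 0 or 1.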

The concept of a mask is very useful and can be used to isolate and extract any data value. We will see it again later. As one further example, suppose we wanted to isolate all the permissions bits in the word in \$t0, placing the isolated bits in \$t2. Since the permissions take up a total of 12 bits, we would use

andi \$t2,\$t0,0x0fff
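The mask idea generalizes to a field of any width at any position. The sketch below builds the mask from a width and a position; `make_mask` and `extract_field` are hypothetical helper names, and 0x81A4 is just a sample mode value chosen for illustration.

```python
def make_mask(width, position=0):
    """A run of `width` one-bits whose lowest bit sits at `position`."""
    return ((1 << width) - 1) << position


def extract_field(word, position, width):
    """AND with the mask to isolate the field, then shift it down to bit 0."""
    return (word & make_mask(width, position)) >> position


mode = 0x81A4  # sample value, assumed for illustration

print(hex(make_mask(1, 8)))                # 0x100, the owner-read mask
print(hex(make_mask(12)))                  # 0xfff, the permissions mask
print(extract_field(mode, 8, 1))           # 1
print(hex(extract_field(mode, 0, 12)))     # 0x1a4
```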

Encoding MIPS instructions

As you might expect, instruction encoding for MIPS is significantly more complicated than it was for the Simple Machine. Consider two standard (but different) R-type instructions:

add \$t0, \$t1, \$t2
sll \$t0, \$t1, 4

As discussed in an earlier section (and as you can see from the two examples above), R-type instructions must have room in the instruction encoding for the following parts:

• an opcode field
• two source registers (rs and rt)
• a destination register (rd)
• a shift amount (for shift instructions)

Since there are 32 registers, each register field must have 5 bits. Similarly, since the maximum shift amount is 31, the shift amount field must have 5 bits. The three register fields and the shift amount use 20 of the 32 bits, which leaves 12 bits for the opcode. To make the encoding for different instruction types more compatible, the opcode field was broken into two 6-bit fields, called opcode and function. For R-type instructions, the function (funct) field indicates the instruction, and the opcode (op) field (which is 0 for the R-type instructions used here) indicates to look in the funct field for the operation code. This allows the fields to be laid out so the instruction is symmetrical around the midpoint:

R-type Instruction Encoding

 field   op      rs      rt      rd      shamt   funct
 bits    31-26   25-21   20-16   15-11   10-6    5-0
 hex     31-28   27-24   23-20   19-16   15-12   11-8   7-4   3-0

(The last row in the table above is for encoding the result in hexadecimal: each group of four bits gives one hexadecimal digit.)

Let's try encoding our example instructions above:

add \$t0, \$t1, \$t2
sll \$t0, \$t1, 4

We need the numeric values. The values for the instructions are easiest to see in section A.10 of your text, as are the instruction formats. The register values, however, must be obtained from the Green Sheet inside the front cover of your textbook. (The Green Sheet is also available from a PDF link on my website's Syllabus for the class.)

• for each of our registers: \$t0 is 8, so \$t1 is 9 and \$t2 is 10.
• for each of add and sll, the op field is 0. For add, funct is 32. For sll, funct is 0.

The register positions in the add instruction are add rd,rs,rt. Now we can encode the first instruction: add \$t0, \$t1, \$t2

 field    op (0)   rs (9)   rt (10)   rd (8)   shamt (0)   funct (32)
 binary   000000   01001    01010     01000    00000       100000
 hex      0x012A4020  (0000 0001 0010 1010 0100 0000 0010 0000)

Checking on MARS, 0x012A4020 is indeed correct.

Let's try the second instruction: sll \$t0, \$t1, 4

(The register positions in the shift instructions are sll  rd, rt, shamt ).

 field    op (0)   rs (0)   rt (9)   rd (8)   shamt (4)   funct (0)
 binary   000000   00000    01001    01000    00100       000000
 hex      0x00094100  (0000 0000 0000 1001 0100 0001 0000 0000)

Again, MARS agrees with 0x00094100 for this shift instruction.
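The two hand encodings can be double-checked by packing the six fields into a word. This is a minimal sketch: `encode_r_type` is a name chosen here, not something provided by MARS, and the field values are the ones worked out above.

```python
def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack the six R-type fields into a 32-bit instruction word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $t1, $t2  is  add rd, rs, rt
add_word = encode_r_type(op=0, rs=9, rt=10, rd=8, shamt=0, funct=32)

# sll $t0, $t1, 4  is  sll rd, rt, shamt  (rs is unused and left 0)
sll_word = encode_r_type(op=0, rs=0, rt=9, rd=8, shamt=4, funct=0)

print(f"{add_word:#010x}")  # 0x012a4020
print(f"{sll_word:#010x}")  # 0x00094100
```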

That's all there is to encoding R-type instructions. Let's move on to I-type:

The I-type instructions keep the same fields in the most-significant 16 bits of the instruction word, but merge the least-significant 16-bits into the 16-bit signed immediate (imm) field:

 field   op      rs      rt      imm
 bits    31-26   25-21   20-16   15-0

The normal assembly format for I-type instructions is op rt, address, where address is imm(rs). Let's try a couple of I-type instructions:

lw  \$s1, 4(\$sp)

lui  \$at, 0x0400

Here are the values we need:

lw: opc=35, \$s1=17, \$sp=29

For lw  \$s1, 4(\$sp), here is the encoding

 field    op (35)   rs (29)   rt (17)   imm (4)
 binary   100011    11101     10001     0000000000000100
 hex      0x8FB10004  (1000 1111 1011 0001 0000 0000 0000 0100)

lui: opc=15, \$at=1

For lui  \$at, 0x0400

 field    op (15)   rs (0)   rt (1)   imm (0x400)
 binary   001111    00000    00001    0000010000000000
 hex      0x3C010400  (0011 1100 0000 0001 0000 0100 0000 0000)

Both of these check out using MARS.
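The same kind of check works for the I-type layout. In this sketch, `encode_i_type` is a hypothetical helper; it keeps only the low 16 bits of the immediate, so a negative offset is stored in its 16-bit two's-complement form.

```python
def encode_i_type(op, rs, rt, imm):
    """Pack the I-type fields into a 32-bit instruction word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


# lw $s1, 4($sp): op=35, rs=$sp (29), rt=$s1 (17), imm=4
lw_word = encode_i_type(op=35, rs=29, rt=17, imm=4)

# lui $at, 0x0400: op=15, rs unused (0), rt=$at (1), imm=0x0400
lui_word = encode_i_type(op=15, rs=0, rt=1, imm=0x0400)

print(f"{lw_word:#010x}")   # 0x8fb10004
print(f"{lui_word:#010x}")  # 0x3c010400
```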

Encoding Instructions using Instructions

Using our example instruction of  add \$t0, \$t1, \$t2 we will use our bitwise instructions to accomplish two tasks:

1. Create the instruction word from its parts (this is, after all, what MARS must do when it assembles the instruction!)
2. Take an existing R-type instruction and modify the rd field, setting it to the value currently in \$t7. Leave the remainder of the fields intact.

Both of these tasks will allow us to practice with masks and our bitwise operators.

Problem: Encode the   add \$t0, \$t1, \$t2  instruction, placing the result in \$t0

If you remember from our earlier discussion, we had values for these constants:

• \$t0 is 8, so \$t1 is 9 and \$t2 is 10.
• for add, the op field is 0 and the funct field is 32

Let's assume we have the following register assignments already. (We are doing the general solution here rather than optimizing for 0-valued fields.)

li  \$t1,0   # opcode
li  \$t2,9   # rs
li  \$t3,10  # rt
li  \$t4,8   # rd
li  \$t5,0   # shamt
li  \$t6,32  # funct

Notice that each of these load immediate instructions (li is a pseudoinstruction) is implemented on MARS by an add instruction. For example, li \$t2,9 becomes addiu \$t2,\$zero,9. (addiu still sign-extends the immediate value, but the value is positive and less than 0x8000, so sign-extension does not change it.)

Here is the easiest way.

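move \$t0,\$t1     # opcode
sll \$t0,\$t0,5    # make room for rs
or  \$t0,\$t0,\$t2  # or in rs
sll \$t0,\$t0,5    # shift to make room for rt
or  \$t0,\$t0,\$t3  # or in rt
sll \$t0,\$t0,5    # shift to make room for rd
or  \$t0,\$t0,\$t4  # or in rd
sll \$t0,\$t0,5    # shift to make room for shamt
or  \$t0,\$t0,\$t5  # or in shamt
sll \$t0,\$t0,6    # shift to make room for funct
or  \$t0,\$t0,\$t6  # or in funct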

The alternate way is to OR in each piece. Let's do this for practice.

sll \$t0,\$t1,26   # put op in place
sll \$t2,\$t2,21   # shift rs to align
or  \$t0,\$t0,\$t2  # or in rs
sll \$t3,\$t3,16   # shift rt to align
or  \$t0,\$t0,\$t3  # or in rt
sll \$t4,\$t4,11   # shift rd to align
or  \$t0,\$t0,\$t4  # or in rd
sll \$t5,\$t5,6    # shift shamt to align
or  \$t0,\$t0,\$t5  # or in shamt
or  \$t0,\$t0,\$t6  # or in funct

Interestingly, this took one less instruction.
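The two strategies (shift the accumulated word left to make room and OR in the next field, versus align each field first and OR it into place) can be compared in a short model. This is only a sketch with made-up function names; the field values are those of add \$t0, \$t1, \$t2.

```python
def assemble_shift_then_or(op, rs, rt, rd, shamt, funct):
    """Start with op, then repeatedly make room for the next field and OR it in."""
    word = op
    for value, width in ((rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)):
        word = (word << width) | value
    return word


def assemble_align_then_or(op, rs, rt, rd, shamt, funct):
    """Shift each field to its final position, then OR it into the word."""
    word = op << 26
    word |= rs << 21
    word |= rt << 16
    word |= rd << 11
    word |= shamt << 6
    word |= funct        # funct is already aligned at bit 0
    return word


fields = dict(op=0, rs=9, rt=10, rd=8, shamt=0, funct=32)
print(f"{assemble_shift_then_or(**fields):#010x}")  # 0x012a4020
print(f"{assemble_align_then_or(**fields):#010x}")  # 0x012a4020
```

The second version saves an instruction in MIPS because funct needs no shift, while the first version has to shift once for every field it adds, including the last.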

Problem: Take an existing R-type instruction and modify the rd field, setting it to the value in \$t7. Leave the remainder of the fields intact.

This involves OR-ing in the new value of rd. But first, the current instruction's rd field must be zeroed. This is a common use of masks. Here are the steps involved:

1. Create a mask that has 1's everywhere EXCEPT the rd field, where there are 0's
2. AND the existing instruction with the mask. This zeroes the rd field, leaving the remainder of the instruction intact.
3. Align the new value of rd so that it is in the correct position
4. OR the new value of rd into the existing instruction

Assuming our existing instruction is in \$t0 and the value of the new rd field is in \$t7, here is the code

sll \$t7,\$t7,11        # shift rd to align
ori  \$t8,\$zero,0xF800 # see note below
nor \$t8,\$t8,\$zero     # complement of \$t8 to form correct mask
and \$t0,\$t0,\$t8       # and mask with existing instruction to zero rd
or \$t0,\$t0,\$t7        # or in the new rd field

Note: We first create the complement of the mask we want (1's in the rd field only), then complement it to get the correct mask (this is the nor instruction with \$zero). We could have used a load immediate instruction: given the hexadecimal constant 0xF800, MARS realizes that it cannot use an addi, since that would sign-extend the value, and substitutes an ori instruction instead. We simply write the native ori instruction to show what it looks like. \$zero comes in very handy here.
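The four steps (build the mask, clear the field, align the new value, OR it in) can be tried out on the add instruction encoded earlier. This is a sketch on 32-bit values; `replace_rd` is a hypothetical helper, and the extra `& 0x1F` guard on the new value is an assumption added here that the MIPS sequence above does not perform.

```python
WORD_MASK = 0xFFFFFFFF   # assumption: 32-bit registers
RD_FIELD = 0x0000F800    # 1's in bits 15..11, the rd field


def replace_rd(instruction, new_rd):
    """Return the instruction word with its rd field replaced by new_rd."""
    keep_mask = ~RD_FIELD & WORD_MASK        # 0xFFFF07FF: 1's everywhere except rd
    cleared = instruction & keep_mask        # zero the old rd, keep everything else
    aligned = (new_rd & 0x1F) << 11          # guard to 5 bits, then align with rd
    return cleared | aligned                 # OR the new rd into place


old_word = 0x012A4020                        # add $t0, $t1, $t2 (rd = 8)
new_word = replace_rd(old_word, 12)          # rd becomes 12

print(f"{new_word:#010x}")                   # 0x012a6020
```

Only the five rd bits change; every other field of the original word comes through the AND untouched.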

(The code for these examples is in the file online/mipsI/bitexample.s in the public work area on hills.)