Object-oriented Programming

City College of San Francisco - CS270
Computer Architecture
Module: MIPS-V

Object-oriented programming, as implemented in a high-level language such as C++ or Java, is an extremely useful way to bridge our more abstract thinking and the cold, strict reality of the machine. At the assembly level, of course, objects do not exist. There is only data (including instructions) and the addresses of that data.

In this section we will briefly explore how this 'trick' of mapping abstract objects to machine structures is accomplished. Although most of us are more familiar with Java, we will use C++ as our example, since it is a compiled language and we can actually test our ideas.

The basics

As we all know, objects are instances of classes, and classes contain members: both data and functions. At the assembly level, these members must be mapped to data and code, and an address for each item must be generated and managed. This process is handled differently for the data and function members of a class:

The data members of a class are placed together in a compound data structure similar to a C struct. (This is what an 'object' becomes at the assembly level.) The struct has a constant size: the sum of the sizes of its data members, plus any padding the compiler adds for alignment. Each data member is then assigned a constant offset from the start of the data structure (object), which gives its location relative to the address of the object. For example, given the class

class xyz {
    int a,b;
public:
    int sum(void) { return(a+b); }
    int sum(int i)  { return(a+b+i); }
    xyz (int mya, int myb) { a = mya; b=myb; }
};

and the initialization of an object of that class

xyz myxyz(1,2);

the size of myxyz would be eight bytes (assuming four-byte ints). Given the address of myxyz, the address of member a would be the same address (offset 0), and the address of member b would be the address of myxyz plus 4.

Although they appear in the class declaration, the function members of a class are, of course, separate from the class object. As we know from our studies, each function exists at a constant address which we can access by a global symbol or label. Since the member function can be called from any module that has included its class definition, and exactly one global label is attached to the function, a mechanism must exist for generating the function label from the class definition. This must take into account two issues:

the name of a member function need only be unique within its class (each class can have its own "add" function, for example).
member functions may be overloaded, so several functions with the same name but different argument lists may appear in a single class.

This is accomplished by name mangling, a convention that creates a standard label format encoding the class, the function name, and the argument types into the label itself. Further, it should not be possible for the user to generate the same label. This can be accomplished by adding a character to the label that is illegal at the source code level (but legal at the assembler or object level), or by a recognized naming convention to which the user must adhere, such as a rule that user-defined names cannot begin with an underscore.

As an example, the labels for the overloaded functions sum (void) and sum (int) in our simple class xyz above might be generated simply by using the class name xyz, an illegal character at the user level such as $, the function name, sum, and an abbreviation for the argument types in order. Thus we would arrive at the label xyz$sum$v for sum(void) and xyz$sum$i for sum(int). These labels could be correctly generated from any module which had access to the class definition. The constructor could be given an empty name - in our example xyz$$ii, since the single constructor takes two integer arguments.

(Note: these rules are much simpler than the actual rules governing name mangling, but the simpler example is illustrative, nonetheless.)
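The simplified convention described above can be sketched in a few lines of C++. This is only an illustration of the toy scheme in this section: the function name mangle and the type codes it accepts are assumptions made for the example, and no real compiler mangles names this way.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy mangler for the simplified scheme in the text:
//   <class>$<function>$<argument codes>
// Hypothetical helper, not the scheme of any real compiler.
// Only "int" (code 'i') and an empty argument list (code 'v') are handled;
// any other type is outside this sketch and is encoded as '?'.
std::string mangle(const std::string& class_name,
                   const std::string& function_name,  // empty for a constructor
                   const std::vector<std::string>& arg_types) {
    assert(!class_name.empty());
    std::string label = class_name + "$" + function_name + "$";
    if (arg_types.empty()) {
        label += "v";                 // no arguments: encoded as void
    }
    for (const std::string& type : arg_types) {
        label += (type == "int") ? "i" : "?";
    }
    return label;
}
```

Under these assumptions, mangle("xyz", "sum", {}) gives xyz$sum$v, mangle("xyz", "sum", {"int"}) gives xyz$sum$i, and mangle("xyz", "", {"int", "int"}) gives xyz$$ii, matching the labels worked out above.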

Now that we have a mechanism for generating addresses of data members and labels of function members of our objects, we have one more problem: when a member function is called, it must know where the object it is operating on is located! (Remember, member functions are not in the object, even though they appear to be. They are completely separate!) The address of the object being operated on (appropriately called this) is passed as an additional first argument to each member function. Thus, the function call (from the example above)

int thesum = myxyz.sum(4);

would actually do a jump-and-link to the label xyz$sum$i, passing it two arguments: &myxyz and 4 (in $a0 and $a1, respectively).
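To make the hidden argument visible, here is a rough C++ sketch of what the member functions of xyz look like once they are pulled out of the class and given an explicit object pointer. The names xyz_data, xyz_ctor_ii, xyz_sum_v and xyz_sum_i are hypothetical stand-ins for the object layout and for the labels xyz$$ii, xyz$sum$v and xyz$sum$i; the $ character is avoided because it is not portable in C++ identifiers.

```cpp
#include <cassert>

// Hypothetical lowering of class xyz: the object is just its data.
struct xyz_data {
    int a;
    int b;
};

// Stand-in for xyz$$ii (the constructor). 'self' plays the role of 'this'.
void xyz_ctor_ii(xyz_data* self, int mya, int myb) {
    assert(self != nullptr);
    self->a = mya;
    self->b = myb;
}

// Stand-in for xyz$sum$v.
int xyz_sum_v(const xyz_data* self) {
    assert(self != nullptr);
    return self->a + self->b;
}

// Stand-in for xyz$sum$i: the object address comes first, then i.
int xyz_sum_i(const xyz_data* self, int i) {
    assert(self != nullptr);
    return self->a + self->b + i;
}
```

With an xyz_data object initialized by xyz_ctor_ii(&myxyz, 1, 2), the call xyz_sum_i(&myxyz, 4) returns 7, which is what myxyz.sum(4) computes in the original class.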

Access restrictions

The access specifiers public, private and protected are enforced by the compiler only. They do not appear at the assembly level. The const specifier is also checked by the compiler, but constant data is often additionally placed in a separate data section which is read-only. This data section (.rodata on MIPS, but not recognized by Mars) also contains constant strings.
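One way to convince yourself that private leaves no trace in the object is to copy the raw bytes of an object out and look at them. The sketch below does this for xyz. The helper peek is a name invented for this example, and the sketch assumes the two ints are stored back to back with no padding (checked by the static_assert).

```cpp
#include <cassert>
#include <cstring>      // std::memcpy
#include <type_traits>  // std::is_trivially_copyable

class xyz {
    int a, b;           // private: the compiler rejects obj.a outside the class
public:
    xyz(int mya, int myb) { a = mya; b = myb; }
    int sum(void) { return a + b; }
    int sum(int i) { return a + b + i; }
};

static_assert(std::is_trivially_copyable<xyz>::value,
              "byte-wise copy is only valid for trivially copyable types");
static_assert(sizeof(xyz) == 2 * sizeof(int),
              "this sketch assumes two ints with no padding");

// Hypothetical helper: copies the bytes of obj into out[0] and out[1].
// Nothing in those bytes marks a and b as private.
void peek(const xyz& obj, int out[2]) {
    assert(out != nullptr);
    std::memcpy(out, &obj, sizeof(xyz));
}
```

For an object built as xyz(2, 4), peek fills the array with 2 and 4: the same words the assembly code reads at offsets 0 and 4 from the object's address.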

A complete example

Let's return to our simple class definition

class xyz {
    int a,b;
public:
    int sum(void) {return (a+b);}
    int sum(int i) {return(a+b+i);}
    xyz (int mya, int myb) { a = mya; b=myb; }
};

and add a simple main program

int main() {
    int result; // kept in $v0
    xyz myxyz (2,4);
    result = myxyz.sum();
    PrintInteger(result);
    result = myxyz.sum(1);
    PrintInteger(result);
}

Here, a variable myxyz of size 8 bytes will be allocated on the stack and initialized using the constructor. Then each of the two member functions will be called and their return values printed.
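The 8-byte size and the member offsets can be checked from C++ itself. The sketch below uses a hypothetical struct, xyz_layout, with the same data members as xyz but public, so that offsetof can name them. It assumes 4-byte ints and no padding, which holds on common platforms but is not guaranteed by the language. The helper address_of_b is likewise invented for this example.

```cpp
#include <cassert>
#include <cstddef>   // offsetof, std::size_t

// Hypothetical stand-in for xyz: same data members in the same order,
// but public so that offsetof can refer to them.
struct xyz_layout {
    int a;
    int b;
};

// Constant offsets of the members from the start of the object.
constexpr std::size_t offset_of_a = offsetof(xyz_layout, a);
constexpr std::size_t offset_of_b = offsetof(xyz_layout, b);

// Computes the address of member b by hand, the way assembly code does:
// object address plus a constant offset.
inline int* address_of_b(xyz_layout* obj) {
    assert(obj != nullptr);
    return reinterpret_cast<int*>(reinterpret_cast<char*>(obj) + offset_of_b);
}
```

On a platform with 4-byte ints, offset_of_a is 0, offset_of_b is 4, and sizeof(xyz_layout) is 8. These are the constants that show up as 0($a0) and 4($a0) in assembly code that loads a and b.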

Let's see what the code looks like using our aforementioned conventions:

(myxyz.cpp (which has its own PrintInteger function) and myxyz.s are in the online/mipsV directory in our public work area)

#class xyz {
#    int a,b;
#public:
#    int sum(void) {return (a+b);}
#    int sum(int i) {return(a+b+i);}
#    xyz (int mya, int myb) { a = mya; b=myb; }
#};
    .globl xyz$sum$v
xyz$sum$v:
    lw      $t0,0($a0) # a
    lw      $t1,4($a0) # b
    add     $v0,$t0,$t1
    jr      $ra

    .globl xyz$sum$i
xyz$sum$i:
    lw      $t0,0($a0) # a
    lw      $t1,4($a0) # b
    add     $v0,$t0,$t1
    add     $v0,$v0,$a1
    jr      $ra

    # constructor
    .globl xyz$$ii
xyz$$ii:
    sw      $a1,0($a0)
    sw      $a2,4($a0)
    jr      $ra

# void PrintInteger(int i) --- part of util.s

#int main() {
#    int result; // kept in $v0
#    xyz myxyz (2,4);
#    result = myxyz.sum();
#    PrintInteger(result);
#    result = myxyz.sum(1);
#    PrintInteger(result);
#}
#

#int main() {
    .globl main
main:
    # need std stack frame, plus one s-reg (for address of myxyz) plus
    # 8 bytes for myxyz
    addiu   $sp,$sp,-32
    sw      $ra,28($sp)
    sw      $s0,24($sp)
    # myxyz is at 16($sp)
    la      $s0,16($sp)
#    int result; // kept in $v0
#    xyz myxyz (2,4);
    move    $a0,$s0
    li      $a1,2
    li      $a2,4
    jal     xyz$$ii

#    result = myxyz.sum();
    move    $a0,$s0
    jal     xyz$sum$v
#    PrintInteger(result);
    move    $a0,$v0
    jal     PrintInteger

#    result = myxyz.sum(1);
    move    $a0,$s0
    li      $a1,1
    jal     xyz$sum$i
#    PrintInteger(result);
    move    $a0,$v0
    jal     PrintInteger
#}
    lw      $s0,24($sp)
    lw      $ra,28($sp)
    addiu   $sp,$sp,32
    jr      $ra

#
.include "/pub/cs/gboyd/cs270/util.s"
