Arrays
Introduction
Let's say we have 10 students and we need to store their grades as integers. We could define 10 variables, say:
grade_student1, grade_student2, grade_student3 ...
and store our values in these variables. However, it is messy to create separate variables to hold 10 similar values that are used together. As an example, to find the average of the grades we have to write:
int average = ( grade_student1 + grade_student2 + grade_student3 + ... + grade_student10 ) / 10 ;
A cleaner option is to use the array data structure. We can declare an array as:
int studentGrades[10] ;
This states that "studentGrades" can hold 10 integer values. How can we store values in the array? We can use the array subscript notation. Ex:
studentGrades[0] = 50 ;
studentGrades[1] = 66 ;
studentGrades[9] = 86 ;
Now the average can be coded as:
int average = 0 ;
int total = 0 ;
for( int i1=0 ; i1< 10 ; i1++ )
    total += studentGrades[i1] ;
average = total / 10 ;
Because "total" and 10 are both integers, this division discards any fractional part. The full program below uses a "double" total to avoid that.
File: arr1.cpp
#include <iostream>
using namespace std ;

int main()
{
    int grade_student1= 10 , grade_student2= 20 , grade_student3= 30 ,
        grade_student4 = 40 , grade_student5= 50 , grade_student6= 60 ,
        grade_student7= 70 , grade_student8=80, grade_student9=90 ,
        grade_student10 =100 ;
    double total = 0 ;
    total = grade_student1 + grade_student2 + grade_student3 +
            grade_student4 + grade_student5 + grade_student6 +
            grade_student7 + grade_student8 + grade_student9 +
            grade_student10 ;
    cout << "Average is : " << total / 10.0 << endl ;

    int studentGrades[] = { 10 , 20 , 30 , 40 , 50 , 60 , 70, 80 , 90, 100 } ;
    total = 0 ;
    for( int i1=0 ; i1<10 ; i1++ )
    {
        total = total + studentGrades[i1] ;
    }
    cout << "Average is : " << total / 10.0 << endl ;
}

$ ./a.exe
Average is : 55
Average is : 55
Using Functions
We can see how the array fits what we want to do in a much cleaner fashion. This also becomes important when we write functions and pass arrays as arguments. Let us compare the two approaches:
File: arr2.cpp
#include <stdio.h>
#include <iostream>
using namespace std ;

// One parameter per grade: the signature grows with every student.
double average( int grade_student1 , int grade_student2 , int grade_student3 ,
                int grade_student4 , int grade_student5 , int grade_student6 ,
                int grade_student7 , int grade_student8 , int grade_student9 ,
                int grade_student10 )
{
    double total = 0 ;
    total = grade_student1 + grade_student2 + grade_student3 +
            grade_student4 + grade_student5 + grade_student6 +
            grade_student7 + grade_student8 + grade_student9 +
            grade_student10 ;
    return ( total / 10.0 ) ;
}

// A single array parameter; this function assumes exactly 10 elements.
double averageArray( int array1[] )
{
    double total = 0 ;
    for( int i1=0 ; i1<10 ; i1++ )
    {
        total = total + array1[i1] ;
    }
    return ( total / 10.0 ) ;
}

int main()
{
    int grade_student1= 10 , grade_student2= 20 , grade_student3= 30 ,
        grade_student4 = 40 , grade_student5= 50 , grade_student6= 60 ,
        grade_student7= 70 , grade_student8=80, grade_student9=90 ,
        grade_student10 =100 ;
    cout << "Average is : " << average( grade_student1 , grade_student2 ,
        grade_student3 , grade_student4 , grade_student5 , grade_student6 ,
        grade_student7 , grade_student8 , grade_student9 , grade_student10 )
        << endl ;

    int studentGrades[] = { 10 , 20 , 30 , 40 , 50 , 60 , 70, 80 , 90, 100 } ;
    cout << "Average is : " << averageArray( studentGrades ) << endl ;
}

$ g++ arr2.cpp ; ./a.exe
Average is : 55
Average is : 55
Declaration and Indexing
In the above we declared the array as:
int studentGrades[] = { 10 , 20 , 30 , 40 , 50 , 60 , 70, 80 , 90, 100 } ;
We are declaring the array as "studentGrades[]" and assigning values at the same time. When an initializer list is given, the array can be declared with empty square brackets:
studentGrades[]
or with a size between the square brackets:
studentGrades[10]
In our case we did not use the "[10]" notation because we are supplying the values, and the C++ compiler can count them to work out the size of the array. Without an initializer list the size must be given. Also notice how we run the "for" loop.
for( int i1=0 ; i1<10 ; i1++ )
{
    total = total + studentGrades[i1] ;
}
We can access the elements in the array using the notation [i1]. Notice that i1 starts at 0. In C++ indexing starts with 0, so "[0]" gives us the first element of the array and "[9]" gives us the last of our 10 elements. The array can be thought of as a consecutive block of 10 integers in this case. Let us look at a program illustrating a different declaration and assignment style.
File: decl1.cpp
#include <stdio.h>
#include <iostream>
using namespace std ;

int global1[5] ;

int main()
{
    int arr1[5] ;
    arr1[0] = 10 ;
    arr1[1] = 20 ;
    global1[0] = 10 ;
    global1[1] = 20 ;

    for( int i1 = 0 ; i1<5 ; i1++ )
        cout << "arr1[i1]:" << arr1[i1] << endl ;
    for( int i1 = 0 ; i1<5 ; i1++ )
        cout << "global1[i1]:" << global1[i1] << endl ;
}

$ g++ decl1.cpp ; ./a.exe
arr1[i1]:10
arr1[i1]:20
arr1[i1]:-13184
arr1[i1]:7
arr1[i1]:-12816
global1[i1]:10
global1[i1]:20
global1[i1]:0
global1[i1]:0
global1[i1]:0

One interesting thing to note in the above is the printout of the default values. For the "arr1" array defined locally, the elements we did not assign are never initialized, so they hold whatever happened to be in that memory; the values printed will differ between runs and machines. For the global array "global1" the default values are zero. As a general rule it is good practice to initialize something explicitly rather than relying on the system default values. It makes the code more readable. The intent is not just to write something that works but to write it in a clear, understandable manner.
Local variables ( variables defined in a function ) are stored on the stack. Variables that are dynamically created are created on the heap. Dynamic creation means the memory is allocated at run time, while the program is running. In C++ this is done with the "new" operator ( or the C library function "malloc" ). The size of dynamically allocated memory does not have to be known at compile time. However, in standard C++ the size of an array allocated on the stack must be known at compile time. Thus the following is not valid.
File: stack1.cpp
#include <iostream>
#include <fstream>
#include <stdio.h>
using namespace std;

int main()
{
    int size = 100 ;
    //Compiler error
    int myArray[size] ;
    return 0;
} //main

We can correct the above program by applying the "const" modifier to the "size" variable. Because "size" is then a constant initialized with a literal, the compiler knows its value at compile time.
File: stack2.cpp
#include <iostream>
using namespace std;

int main()
{
    const int size = 100 ;
    //Gets rid of Compiler error
    int myArray[size] ;
    return 0;
} //main

A small note here. Variable length arrays are not part of the C++ standard ( C++11 included ). They come from the C99 standard, and g++ accepts them in C++ code as a compiler extension, which is why stack1.cpp may compile without complaint under g++. We can ask the compiler to reject them with an option:
g++ -Werror=vla stack1.cpp
to produce the error:
stack1.cpp: In function ‘int main()’:
stack1.cpp:13:20: error: variable length array ‘myArray’ is used [-Werror=vla]
     int myArray[size] ;
                    ^
cc1plus: some warnings being treated as errors
Accessing array elements
File: grade1.cpp
We can write another example with a user inputting the grades in the array and then we print the array out.
#include <iostream>
using namespace std ;

int main()
{
    const int NUM = 5;   // Number of students
    float grades[NUM];
    int count;

    // Input the grades.
    for (count = 0; count < NUM ; count++)
    {
        cout << "Enter the grade for student" << (count+1) << ":" ;
        cin >> grades[count];
        cout << endl;
    }

    // Display the contents of the array.
    cout << "The grades you entered are:";
    for (count = 0; count < NUM ; count++)
        cout << " " << grades[count];
    cout << endl;
    return 0;
}

Output:
./a.out
Enter the grade for student1:30
Enter the grade for student2:49
Enter the grade for student3:30
Enter the grade for student4:80
Enter the grade for student5:90
The grades you entered are: 30 49 30 80 90

We ask the user to input the grades and run a loop to enter them into the array. The index in a C++ array starts at 0, which is why the prompt prints "count+1" to number the students from 1. In the example above we are using the variable "count" as the index. The index could also be an expression such as "count+1", as long as its value stays within the range 0 to NUM-1.
Storage
An array variable can be thought of as holding the address of where the block is stored in RAM. It can be pictured as:
[Figure: the array variable "Array1" holding the address of its block in RAM]
In the above diagram the array variable "Array1" holds the address of where the block for the array is created in the RAM. Once the variable gets assigned an address, that value cannot be changed. C++ only stores the bare minimum information for an array. It stores the address where the memory block is created and nothing else. The size of the block is determined by the number of elements and the type. If the array is of size 10 and the size of int on this machine is 4 bytes then 40 bytes will be allocated.
Programmers used to arrays from other languages such as Java or C# will find this minimalist approach very different. In C++ the length is not stored with the array at run time: once an array has been passed to a function, there is no way to get its size ( length ) from the array itself.
Even though an array variable only contains the address, we cannot change this value. Once an array has been declared, its value is fixed.
int arr1[2] ;
int arr2[2] ;
arr1 = arr2 ; //compiler error
File: size1.cpp
#include <iostream>
using namespace std ;

void function1( int arr[] )
{
    cout << "Size of arr:" << sizeof( arr ) << endl ;
}

int main()
{
    int arr1[5] ;
    arr1[0] = 10 ;
    arr1[1] = 20 ;
    //Seems to work here
    cout << "Size of arr:" << sizeof( arr1 ) << endl ;
    function1( arr1 ) ;
}

$ g++ size1.cpp ; ./a.exe
size1.cpp: In function ‘void function1(int*)’:
size1.cpp:6:36: warning: ‘sizeof’ on array function parameter ‘arr’ will return size of ‘int*’ [-Wsizeof-array-argument]
    6 |   cout << "Size of arr:" << sizeof( arr ) << endl ;
      |                                   ~~^~~~~
size1.cpp:4:21: note: declared here
    4 | void function1( int arr[] )
      |                 ~~~~^~~~~
Size of arr:20
Size of arr:8

The array has 5 elements. In "main" the C++ "sizeof" operator gives 20, which is the size of the whole block in bytes ( 5 elements of 4 bytes each ) and not the number of elements. Inside "function1" we get the value 8. Where is this 8 coming from? It is the size of the address that "arr" holds. This was compiled on a 64 bit machine and the addresses are 64 bits ( 8 bytes ). A C style string has a similar storage scheme, but every string is terminated by the null character. There is no such terminator for arrays ( and there couldn't be, because 0 is a valid value for an array element ). That is the reason many functions that take an array argument also take the "size" as an additional parameter, so that the function knows the size of the array.
There is no bounds checking in arrays in C++ . The following code will compile:
File: bounds1.cpp
#include <iostream>
#include <fstream>
#include <stdio.h>
using namespace std;

int main()
{
    int array1[] = { 1, 2 } ;
    int array2[] = {2, 4} ;
    cout << array1[100] << endl ;
    // What's the problem here ?
    // cout << array1[900000] << endl ;
    return(0) ;
}

$ g++ bounds1.cpp ; ./a.exe
0 0

The array "array1" is of size 2 and we are accessing the element at index 100. What happens at run time? We don't know where that position will end up in the RAM. If it falls in what the operating system considers to be an invalid memory location then we will get a segmentation fault, and if the address falls in the RAM block assigned to the program then it's possible we will not get a crash but rather some garbage value. Running the above program produces a garbage integer value. Now let's test the case of accessing "array1" at position 900000.
Taking out the comments:
// What's the problem here ?
cout << array1[900000] << endl ;
In the above case the address pointed to an invalid location, and the hardware and operating system were able to catch that. The absence of bounds checking does have a purpose in C++. The focus of C++ is on efficiency and speed, and it stores the minimum information needed for that. It is up to the programmer to make sure the indices do not fall out of range. This is vastly different from other languages like Java that will throw an exception at run time.
What are the problems in the following code ?
File: arr3.cpp
#include <iostream>
#include <fstream>
#include <stdio.h>
using namespace std;

int global[5] ;

int[] function1()
{
    return global ;
}

int main()
{
    int array1[] = { 1, 2 } ;
    int array2[] = {2, 4} ;
    array1 = array2 ;
    int array3[] ;
    return(0) ;
}

The above example has a couple of problems. A function cannot return an array: "int[] function1()" is not valid C++, and a returned array would have to be assigned to an array variable, which means changing the value of an existing array. Also the statement:
array1 = array2 ;
is invalid because "array1" cannot be changed. The statement
int array3[] ;
is not valid because the size of the array is not specified and there is no initializer list to deduce it from.
Initializing an array
File: arr4.cpp
#include <iostream>
using namespace std;

int main()
{
    int array1[4] = { 6, 1, 2 , 4} ;
    for( int i1=0 ; i1 < 4 ; i1++ )
        cout << array1[i1] << " " ;
    cout << endl ;
    return(0) ;
}

Output:
[amittal@hills Chapter7]$ ./a.out
6 1 2 4

We declared an array "array1" of size 4 and initialized it with 4 elements. We could have chosen not to specify the size of the array. The C++ compiler will figure out what the size is.
File: arr5.cpp
#include <iostream>
using namespace std;

int main()
{
    int array1[] = { 6, 1, 2 , 4} ;
    for( int i1=0 ; i1 < 4 ; i1++ )
        cout << array1[i1] << " " ;
    cout << endl ;
    return(0) ;
}

We can specify the size and choose to only initialize some of the elements.
File: arr6.cpp
#include <iostream>
using namespace std;

int main()
{
    int array1[4] = { 6, 1 } ;
    for( int i1=0 ; i1 < 4 ; i1++ )
        cout << array1[i1] << " " ;
    cout << endl ;
    return(0) ;
}

Output:
[amittal@hills Chapter7]$ ./a.out
6 1 0 0

The elements that are not listed are set to 0. We cannot, however, skip an element in the middle. What that means is we cannot do:
int array1[4] = { 6, , 1} ;
The end result that we are trying to obtain is:
{ 6, 0 , 1, 0 }
Initializers are applied from left to right without gaps, so only elements at the end can be left out to take the default value. To get the result above, the middle 0 has to be written explicitly: { 6, 0, 1 }.
Range based loop
File: range1.cpp
#include <iostream>
#include <fstream>
#include <stdio.h>
using namespace std;

int main()
{
    int array1[] = { 10, 20, 30 , 40, 50 } ;

    //Range based loop: x1 is a copy of each element
    for( int x1: array1 )
        x1++ ;

    //Range based loop: x1 is a reference to each element
    for( int& x1: array1 )
        x1++ ;

    //Range based loop
    for( int x1: array1 )
        cout << x1 << " " ;

    return ( 0 ) ;
}

$ g++ range1.cpp ; ./a.exe
11 21 31 41 51

In the first loop we increment x1 but that does not change anything in the array. All we have is a variable that is assigned a copy of a value from the array. The second range loop has a variable that is declared as a reference, and this does end up changing the value of the element in the array. It changes the original array to 11, 21, 31, 41, 51 and that's what the last range loop prints out.
Two dimensional arrays
A two dimensional array can be defined as:
int array1[2][3] = { {1, 2,3} , {4, 5,6} } ;
This states that the array consists of 2 rows and 3 columns. Let us look at the following code:
File: 2dim1.cpp
#include <iostream>
#include <fstream>
using namespace std;

void function2( int arr1[] )
{
}

void function1( int arr1[][3] )
{
    cout << arr1[1][0] ;
}

int main()
{
    int array1[2][3] = { {1, 2,3} , {4, 5,6} } ;
    int array2[3] = { 10,20,30 } ;
    //cout << array1[1][0] ;
    function1( array1 ) ;
    function2( array2 ) ;
    return(0) ;
}

The above code defines a 2 dimensional array with 2 rows and 3 columns. When an array is stored in RAM only the address of the array is known and that is stored in the array variable. Information such as the number of rows and columns is not kept. In RAM the array looks like:

RAM
Address   Contents
0
1         5    ( array1 )
2
3
4
5         1
6         2
7         3
8         4
9         5
10        6

The above is a simplified diagram. Actual memory address numbers will of course be different. The 2 dimensional array is stored in 6 consecutive locations. This creates a problem if we try to pass it to a function. Suppose we tried to write:
void function1( int arr1[][] )
{
    cout << arr1[1][0] ;
}
In order to print the value of "arr1[1][0]", "function1" has to know the number of columns so that it can find the first element of the second row. With both brackets empty the compiler does not have that information, and it rejects the declaration. We can fix that by specifying the number of columns in the declaration.
void function1( int arr1[][3] )
{
    cout << arr1[1][0] ;
}
Now the compiler knows that there are 3 columns, so it knows to skip 3 elements in order to get to the first element of the second row.
Exercises
1) What is the output of the following code:
File: ex1.cpp
#include <stdio.h>
#include <iostream>
using namespace std ;

int main()
{
    int arr1[] = { 1 , 2, 3, 4 , 5 } ;
    int arr2[5] = { 1 , 2 } ;
    for( int i1=0 ; i1 < 5 ; i1++ )
    {
        arr2[i1] = arr1[i1] + arr2[i1] ;
        arr2[i1] += arr1[i1] ;
        cout << arr2[i1] << " " ;
    } //for
    cout << endl ;
    return(0) ;
}
2) Search for an element in the array.
Solutions
1)
$ g++ ex1.cpp ; ./a.exe
3 6 6 8 10
2)