Archive for the ‘Stanford CS107’ Category
Programming Paradigms - Lecture 6 - CS107 - Stanford University
The following is a brief overview of Lecture 6 of the ‘Programming Paradigms,” a programming class taught at Standford University. Screenshots and Code are included from the lecture. The ‘Programming Paradigms’ lectures can be found in iTunes. The link is on the Open Standford Website. The following lecture covers Implementing an Int Specific and Generic Stack in C/C++.
Int Specific Stack
stack.h typedef struct{ int * elems; int loglength; int alloclength; } stack; void StackNew(stack * s) void StackDispose(stack * s) void StackPush(stack * s, int value) int StackPop(stack * s)
The last lecture was left off with header file for a int specific stack being defined which includes a strcut called 'stack' and four functions. This lecture begins with defining StackNew and StackDispose for an int specific stack.
void StackNew(stack * s) { s->loglength = 0; s->alloclength = 4 s->elems = malloc(4 * sizeof(int)) assert(s->elems != NULL); }
When a new stack is allocated, the length is set originally to zero (because the stack is empty). Space is made for 4 elements, then malloc is used to allocate memory for those four elements. Malloc goes to the heap and finds 16 bytes of space, puts a "little halo" around the space, and then returns the base address of the memory. Assert confirms that elem is not equal to NULL just incase malloc fails for some reason. Asserts should be used to you clearly know where the program failed.
void StackDispose(stack * s) { free(s); }
StackDispose is trivial. The opposite of malloc, free is used. This gives the memory back to the heap.
void StackPush(stack * s, int value) { if(s->loglength == s->alloclength){ s->alloclength *= 2; s->elems = realloc(s->elems, s->alloclength * sizeof(int)); assert(s->elems != NULL); } s->elemns[s->loglength] = value; s->loglength++; }
In the function StackPush, you first want to check if you are out of memory to actually push value onto the Stack. If the current length of the stack is already equal to the allocated length, more memory will have to be allocated for the stack, else bad things will happen.
Actually, we want to go ahead and re-allocate the array. The previously allocated amount of memory is doubled. Then realloc is called. Realloc only exists in C, it doesn't exist in C++. Jerry says that he'll explain in a few weeks why realloc doesn't exist in C++. Realloc first sees if the memory can be allocated in place, meaning that realloc checks to see if the memory that comes after the previously allocated space in the heap is in use. If it isn't, it uses that memory. If it can't resize it in place, it copies all the elements over to a different block of memory, frees the old memory, and returns the address of the new memory.
If realloc is called with the first parameter being equal to NULL, it functions the same as a malloc command. This is useful if you don't want to call malloc once, then realloc every single time after that.
The doubling strategy is really popular because it only comes up once every 2^n times. 512 bytes of memory not big enough? Ok, let's allocate 1024. And so on.
int StackPop(stack *s) { assert(s->loglength> 0); s->loglength--; return s->elems[s->loglength]; }
First, make sure that their is at least one object on the stack. Subtract one for loglength (the length of the stack), then return the new address of the stack which simulates the discarding of the last element on the stack.
Stack Using Generics
Next, the exact same stack functions will be implemented generically, not just for stack integers.
stack.h typdef struct { void * elems; int elemSize; int loglength; int alloclength; ?? - Suspense Element } stack; void StackNew(stack * s, int elemSize); void StackDispose(stack * s); void StackPush(stack * s, void * elemAddr); void StackPop(stack * s, void * elemAddr);
The above is the setup for the stack functions that will be implemented generically. Note that not all the information has yet been given as suspense element will be added later to the stack struct. 70% of the code for the generic stack functions is the same as the int specific stack functions.
void StackNew(stack * s, int elemSize) { assert(s->elemSize > 0); s->length = 0; s->alloclength = 4; s->elemSize = elemSize; s->elems = malloc(4 * elemSize); assert(s->elems != NULL); }
The second parameter (elemSize) tells the length of a single element, so malloc is used to pre-allocate 4 elements on the stack of size elemSize. Assert is used to make your life easier. It checks to make sure actual memory was allocated and to make sure elemSize was not accidentally set to zero.
void StackDispose(stack * s) { free(s->elems); }
StackDispose is again trivial.
void StackPush(stack *s, void * elemAddr) { if(s->loglength == s->alloclength) StackGrow(s); void * target = (char *) s->elems + s->loglength * s->elemSize; memcpy(target, elemAddr, s->elemSize); s->loglength++; }
In StackPush, if the stack needs more memory allocated, StackGrow() is called to allocate that memory. Target points to the block of memory already pre-allocated in the stack where the element will be copied.. Memcpy is used to do the actual copying of the element to that address. Remember to cast s->elems as a char * to trick the compiler because it cannot cast as a void *.
The word static has like 80 or 85 meanings in C++. Here is one more. When you see the word static decorating the prototype of a c or c++ function such as:
stack void StackGrow(stack *s)
This means that it is a private function that should not be advertised outside this file. In companies like Microsoft which deal with programs with tens of thousands of files, many times function names will conflict with each other if not made static.
static void StackGrow(stack *s) { s->alloclength *=2; s->elems = realloc(s->elems, s->alloclength * s->elemSize); }
StackGrow re-allocates double the amount of memory just like the int specific version.
void StackPop(stack *s, void * elemAddr) { void * source = (char*) s->elems + (s->loglength - 1) * s->elemSize; memcpy(elemAddr, source, s->elemSize) s->loglength--; }
In StackPop(), source points to the last chunk of data in memory on the stack. Memcpy copies that memory to the memory address, then the length of the stack is decremented thus "popping" off the memory from the stack.
When dealing with generics, it is very easy to get the program to compile because the compiler looks at all the void * declarations and stamps them as "good." However, when running a program implementing generics, it is very likely to crash.
Programming Paradigms - Lecture 5 - CS107 Stanford University
Linear Search with Generics
void * lserach(void * key, void * base, int n, int elemsize, int(*cmpfn)(void*, void*) ) { for(int i = 0; i < n; i++) { void * elemAddr = (char *) base + i * elemSize; if( cmpfn(key, elemAddr) == 0 ) return elemAddr; } return NULL; }
The linear serach takes five parameters. Key is a pointer to a searched for item, base is a pointer to the base of the array, n is the number of elements in the array, elemsize is the size of an element in the array, and cmpfn is a function pointer to a compare function. We loop through the array one element at a time. The cmpfn returns zero if the elements are equal, and lsearch returns a pointer to that element. If nothing is found, NULL is returned.
Implementing lsearch with ints
int array {} = {4, 2, 3, 7, 11, 7, 6}; int size = 6; int number = 7; int * found = lserach(&number, array, 6, sizeof(int), intcmp); if(found == NULL) :( else :)
When calling lsearch above, the address of 'array' doesn't have to be passed because when one sends in 'array,' it implicitly passes in the address of the zero(th) element. The 'intcmp' function that is passed into lsearch needs to be coded for the above to work.
int IntCmp(void * elem1, void * elem2) { int * ip1 = elem1; int * ip2 = elem2; return *ip1 - *ip2; }
The compare function returns a negative value if ip1 is less than ip2, zero if the elements are equal, or a positive value if ip1 is greater than ip2. Because it is known that the elemnts we wish to search for are of type int, we can make the compare function passed to lserach an specific type of compare ala int.
There are definetely better ways to implement generics in other languages. These languages have learned from the mistakes in C. C has been in existence for 35 years which is pretty old in computer time.
Implementing lsearch with CStrings
On this example, an array of CStrings is searched. The array is searched for the string 'EFlat'. All the character elements are NULL terminated ('\0′). This example is tricky because we are working with type char star star (char **) or double pointers.
char * notes[] = { "AFlat", "FSharp", "B", "GFlat", "D" }; char * favoriteNote = "EFlat" char ** found = lsearch(&favoriteNote, notes, 5, sizeof(char *), StrCmp); int StrCmp(void * vp1, void * vp2) { char * s1 = * (char **) vp1; char * s2 = * (char **) vp2; return strcmp(s1, s2); }
Vp1 and vp2 are cast to the types they really are which are double pointers. After that, the value of vp1 and Vp2 are derefereced to s1 and s2. Vp1 is of type void * so it can't be deferenced as a void * so it must be cast as what it is which is a char **. A C standard library function can then be used to compare s1 and s2 which takes two char pointers.
The difference between functions and methods
The difference between functions and methods. Funtions are independent of Objects. Methods are functions that exist inside Objects. Methods have invisible (this) pointers lying around, where functions actually have to have everything passed to it.
Generic Data Structures (Stacks)
We are going to implement a stack using ints first before a generic version will be attempted.
stack.h
typedef struct { int * elems; int logicallen; int alloclenth; } stack; void StackNew(stack * s); void StackDispose(stack * s); void StackPush(stack * s, int value); void StackPop(stack * s); stack s;
This allocates a block of memory for stack, but doesn't actually initialize the data.
StackNew(&s);
A call to StackNew receives the address of the previously allocated stack s. StackNew initializes the actual elements in stack s.
for(int i = 0; i < 5; i++) { StackPush(&s, i); }
The above call pushes the numbers zero through 4 onto the stack.
StackDispose(&s);
And finally, the stack is disposed of. Other details need to be implemented as well, but those details were not provided in this lecture. For example, initially the elems pointer only pointed to an array of size 4. This array, however, gets filled up after the number '3′ is pushed onto the stack. More memory then has to be allocated to fit the number 4.
void StackNew(stack * s) { s->logicallen = 0; s->;alloclen = 4; s->elems = malloc(4 * sizeof(int)); assert(s->elems != NULL); }
In C, malloc goes out to the heap, finds a block of memory that is 4 * sizeof(int) wide (or 16 bytes) and returns the address of that block of memory to elems. The assert function, if evaluated to false, declares an assertion at that call (ends the program) and tells you at what line the program ended. If malloc isn't able to find the memory on the heap, it will return NULL and the assert will be triggered. If not caught, a segmentation fault is called, but you don't know where it was called.
Standford University - Programming Paradigms - Lecture 4
The following is a brief overview of Lecture 4 of the ‘Programming Paradigms,” a programming class taught at Standford University. Screenshots and Code are included from the lecture. The ‘Programming Paradigms’ lectures can be found in iTunes. The link is on the Open Standford Website. The following lecture covers implementation of Swap using Generics.
Implementation of Swap using Templates
void swap( void * vp1, void * vp2){ void temp = *vp1; *vp1 = *vp2; *vp2 = temp; }
The above Swap function takes a generic pointer type (void * ). This means that it points to something that it doesn't have any type information about, so Vp1 and Vp2 are pointing to addresses in memory that it knows nothing about. Furthermore, there is a problem with the above implementation. One cannot declare temp or a variable to be of type void. Also, one isn't allowed to dereference a void * variable because it doesn't know how to go out and embrace an unknown number of bytes.
void swap (void * vp1, void * vp2, int size){ char buffer[size]; memcpy(buffer, vp1, size); memcpy(vp1, vp2, size); memcpy(vp1, buffer, size); }
int x = 17, y = 37; swap(&x, &y, sizeof(int));
The offical way to perform a swap using generics is to expect a third argument to the swap function, size, where size is the explicit size of the bytes being swapped. A character buffer of a dynamic size is first declared. Memcpy is like strcpy, except it doesn't stop copying at '\0′ One has to specifically tell it how many bytes to copy. Although the above example works, true ASCI C doesn't allow you to dynamically allocate an array.
double d = ∏, e = e; swap(&d, &e, sizeof(double))
The problem with using generics, however, is that the compiler does very little checking. You can pass in anything you want for a void * and the compiler will not yell at you. It might not work very nicely, but it will compile. When you use generics, you are making the compiler do less work for you, but you run more risk when you actually run the program.
int i = 44; short s = 5; swap(&i, &s, sizeof(short));
Say you need to swap an int and a short, but you accidentally pass the sizeof(short) to the swap function. Vp1 and Vp2 just have the address itself, they don't know how big the variables i and s are that they point to. They only access the first two bytes of each address because it is specifically told to Vp1 and Vp2 to access a short. Five will be written in the first two bytes of the int. The first two bytes of i will be written with 5. So zero will be written to s, 5 and 44 to i.
char * husband = strdup("Fred"); char * wife = strdup("Wilma"); swap(&husband, &wife, sizeof(char *));
In the above code, the husband and wife char pointers above are going to be switched. Husband will point to Wilma and Wife will point to Fred. We want to exchange the two things are are held by the husband and the wife variables. If you want to swap char *, that is, the actual pointer boxes, you pass sizeof(char *). Vp1 and Vp2 point to the pointers husband and wife. Fred and Wilma actually stay put, the pointers are just switched. However if the following is passed to swap:
swap(husband, wife, sizeof(char *));
This will still compile and run. Vp1 and Vp2 this time point to the raw text "Fred\0″ and "Wilma\0." The swap function now has two address and thinks that it is supposed to swap 4 byte figures (char *'s are four bytes). So it copies "Wilm" from Wilma and puts it in Fred's place "Wilm\0″ and copies "Fred" and adds it to the first four letters of "Wilma" for "Freda\0″
Implementation of LSearch in Generics
First the non-generic linear search.
int lsearch(int key, int array[], int size){ for (int i 0; i < size; i++) { if(array[i] == key) return i; } }
The lsearch seraches from the front to back of an array of a certain size looking for a key passed into the function. The function will return index of the array of where the key was found. The generic version is best understood as a generic blob of memory. In order to advance from element zero to element one, the size information will have to be passed into the function. A comparison function will also have to be passed in so we know how to compare the key to to i(th) element in the array because double =='s can't be used very easily. So the address of the key is passed in, the address of the beginning of the blob of memory, the width of each element in the blog of memory, the number of elements, as well as a comparison function.
void * lsearch (void * key, void * base, int n, int elem_size){ for(int i = 0; i < n; i++) { void * elemAddress = (char *) base + i * elemSize; if(memcmp(key, elemAddress, elemSize) == 0) return elemAddress; } return NULL; } void * elemAddress = (char *) base + i * elemSize;
This lsearch that results uses the char * casting hack. Because base is (void *), the compiler can't do pointer arithmetic on a (void *) element, therefore, casting fools it into thinking that its pointing to one-byte characters. The above function works in a few cases, but doesn't work in cases like struct. A function pointer to a comparison function must be used in that case as seen below.
void * lsearch (void * key, void * base, int n, int elem_size, int (*cmpfn)(void *, void *))
This function will be discussed in the next class.
Standford University - Programming Paradigms - Lecture 3
The following is a brief overview of Lecture 3 of the 'Programming Paradigms," a programming class taught at Standford University. Screenshots and Code are included from the lecture. The 'Programming Paradigms' lectures can be found in iTunes. The link is on the Open Standford Website. The following lecture covers Casting Types, Structs, Arrays, Arrays in Structs, and the beginning of programming generics in C.
Casting Types in C++
The first two examples deal with casting in C.
double d = 3.1416 char ch = * (char *) &d; cout << ch << endl;
A double of 3.14 or pi is declared then cast as a character ch. &d is a pointer to the first byte address in d. (char *) says, "Oh, it's not a double it's a character." The final * "de-references" that value which means it copies the value to ch not the address.
short s = 45; double d = *(double *) &s
A two byte figure that has 4 and 5 in it. &s points to the first byte address of s. (double *) is a brute force interpretation of &s. The code is essentially saying, "Oh, that never pointed to a short. It always pointed to an eight byte double." The first two bytes of the double "d" will be 4 and 5 and the rest will be blank.
A student asks a question about Endian(ness). When a "1″ is represented as two byte short in Big Endian, the right most bit in the right most byte is a one. Most Linux Machines store bytes in the reverse order or "Little Endian." When a "1″ is represented as a two byte short in Little Endian, the right most bit is set in the left most byte.
Structs in C++
struct fraction { int num; int denum; }; fraction pi; pi.num = 22; //sets num to 22 pi.denum = 7; // sets denum to 7
When a stuct is declared with two four byte ints, they are stacked like dominos in memory. The num variable is on the bottom. The denum is stacked on top of the num. Setting the two fields is easy enough. One sets the num and denum fields with the (.) dot operator.
Now things become a little crazier.
((fraction *) &(pi.denum))->num = 12;
Let's break it down.
&(pi.denum)
points to the top address of the struct — the denum address.
(fraction *) &(pi.denum)
Casts the top address as a struct with two int variables. Pi.denum becomes the bottom address (or num address).
((fraction *) &(pi.denum))->num = 12;
Sets the previous pi.denum, which is now pi.num, to "12."
Tricky Tricky.
((fraction *) &(pi.denum))->denum = 33.
Sets the space above the original pi.denum to 33 even though the address isn't owned by the struct. There is no legal way to get to this and print it out, however, the memory is set anyway.
Arrays in C++
int array[10];
This call allocates 40 bytes of memory. 4 bytes for each of the 10 blocks in the array.
array[0] = 44;
Puts 44 in the first four bytes of memory, or the zero place in the array.
array[9] = 100;
Puts 100 in the last four bytes of memory, or the 9th place of the array.
Therom: array ≡ &array[0]
The array is completely synonymous with a pointer to the first address of the zero(th) entry. When one passes an array to a function, you aren't passing the array itself (and all the memory). You are passing the address to the zero(th) entry and from there you can jump to whatever place you want in the array.
array[10] = 1;
If for some reason you mess up and try to assign the number one to the 11th entry in the array. The computer will try to place a one in that location even though that memory isn't preallocated.
array[-4] = 77;
C++ even tolerates brute force. You can assign a number to a negative offset in memory. There is no boundary checking in C and C++.
Theorem: array + k ≡ &array[k]
This is pointer arithmetic. K is automatically scaled based on the type system. In the case above, k is an int, therefore k is scaled in 4 byte increments.
Theorem: *array ≡ array[0]
(array + k) ≡ array[k]
When array is de-referenced, the above is equal to the value stored in the array — not the address.
int arr[5]; arr[3] = 128; ((short *) arr[6] = 2; cout<< arr[3] << endl; 640
Each int is compsed of four bytes. Each short is composed of two bytes. The second and third statement above actually set the same bytes in memory. arr[3] sets the memory 12 bytes over (3 * 4 (size of int) = 12). ((short *)arr[6]) (6 * 2 (size of short) = 12) also sets the memory six bytes over. Although, 128 fits in the 15th and 16th byte, while 2 is placed in the 13th byte. When arr[3] is printed to the screen, the value printed out is 128 + 512 = 640 ( 2^9 + 2^7 = 640).
Arrays in Structs
struct student { char * name; char suid[8]; int numUnits; }; student pupils[4];
pupils[0].numUnits = 21;
Sets numUnits in the first instance of pupil to 21
pupils[2].name = strdup("Adam");
Creates memory on the heap with "Adam�" name points to this memory on the heap.
pupils[3].name = pupils[0].suid + 6;
The name character pointer in the fourth instance of pupil points at the address starting at the sixth character in the first instance of the struct pupil
strcpy(pupils[1].suid, "40415xx");
Copies "40415xx" to the suid character array starting at the first character in the second instance of pupil.
stcpy(pupils[3].name, "123456");
Since pupil[3].name points at the address starting at the sixth character in the first instance of the struct pupil, starting from that location "123456″ is copied. This writes over the 21 which was previously assigned to pupils[0].numUnits.
pupils[7].suid[11] = 'A';
Even though an eighth pupil isn't properly allocated, the computer still tries to write 'A' to the 11th character into the suid character array. Will it succeed? Maybe.
How to Write Generics In C
int x = 7; int y = 117; swap(&x, &y) void swap(int *ap, int *bp) { int temp = *ap; *ap = *bp; *bp = temp; }
Two addresses are passed to the swap function. Ap and bp point to x and y respectively. The first line in the swap funtion copies 7 (the value of what ap points to (x)) to temp — so temp is set to 7. The second line, copies the 117 (the value of what bp points to (y)) to whatever ap points to (x) - so seven is replaced by 117. Temp, whose value is seven, is then copied to y or what bp points to.



















