We are still actively working on the spam issue.

Difference between revisions of "Computer science"

From InstallGentoo Wiki
Jump to: navigation, search
(Created page with "200px|thumb|right|The principal programming paradigms '''You do need to know how to code before starting with this.''' If you...")
 
m (Candidate for moving + link to /sci/'s page on comp sci)
 
(19 intermediate revisions by 8 users not shown)
Line 1: Line 1:
[[File:The principal programming paradigms.png|200px|thumb|right|The principal programming paradigms]]
+
{{move|Algorithms|Computer science is a broad area and the information on this page only relates to algorithms and data types.}}
'''You do need to know how to code before starting with this.'''
+
{{note|Check out [https://sites.google.com/site/scienceandmathguide/subjects/computer-science this page on the /sci/ wiki]. It does a much better job than this page (currently)}}
  
If you are reading this you are probably a neet. Computer Science is something really nice, but has little practical applications unless you have finished college and started doing something really interesting. If you want to learn how to do something useful, you have to learn how to program, not how to do Computer Science. How to be useful is not discussed here; however, if you are [http://codeforces.com/profile/tourist autistically good] at it, you can be instantly hired by Google, Microsoft or facebook.
+
What is Computer Science? As Harold Abelson from the [[Structure and Interpretation of Computer Programs|SICP]] lectures says, the name Computer Science is in fact misleading.  
 +
<blockquote>
 +
"Computer science is not about computers, in the same sense that physics is not really about particle accelerators, and biology is not about microscopes and Petri dishes."
 +
</blockquote>
 +
Thus, "computer" is not needed in the title. Secondly, a lot of Computer Science is mathematics and theory. There is very little actual scientific method like one would find in Physics, Chemistry, or Biology, so in this sense, it's not really a science as much as it is a form of applied mathematics or engineering. So Computer Science should be replaced with a much more informative name.
  
There are books about computer science at http://books.gentoomen.org/. Buy phisical ones from http://abebooks.com
+
So what is Computer Science then? CS consists of algorithms for solving problems in the most mathematically convenient or efficient way. Take for example the problem of finding the fastest route between several locations in a city; this is solved using a graph searching algorithm. There are also data structures designed for storing and indexing huge amounts of data in a database, protocols for making sure that data transmission is reliable, statistical learning methods behind AI, and so on.
  
 +
===Getting started===
 +
The most recommended Computer Science books are Structure and Interpretations of Computer Programs (SICP), and what is considered the holy grail of CS literature: The Art of Computer Programming (TAOCP) by Donald Knuth.
 +
 +
You'll find some eBooks at http://books.gentoomen.org/ and you can buy physical ones for cheap from http://abebooks.com
 +
 +
A good place to start would be the SICP lectures on youtube, found [https://www.youtube.com/watch?v=2Op3QLzMgSY here], and Computerphile's videos, found [https://www.youtube.com/user/Computerphile/videos here].
  
 
== Algorithms ==
 
== Algorithms ==
Line 17: Line 27:
 
There is also a good introduction to many concepts explained [http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=alg_index here].
 
There is also a good introduction to many concepts explained [http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=alg_index here].
  
==Competitive Programming==
+
==Sorting Algorithms==
 +
 
 +
A big topic in computer applications is the quick and efficient sorting of data. There are many different sorting techniques, as a programmer one has to balance the algorithm's ease of reading vs the efficiency of said code. Many standard libraries/APIs already have sorting functions already written, however it is still good practice to know some of the theory behind how the different algorithms work.
 +
 
 +
Check out [https://www.youtube.com/watch?v=kgBjXUE_Nwc this video]
 +
 
 +
===Bubble Sort===
 +
 
 +
The bubble sort is a very simple to read algorithm that is the standard first sorting method that most programmers learn. The bubble sort gets its name because values are swapped among each other, gradually moving to the top of the list.
 +
 
 +
<nowiki>//bubble sort example
 +
#include <stdio.h>
 +
#define SIZE 10
 +
int main( void ){
 +
    int a[SIZE] = {2,6,9,1,4,15,7,2}; //array to be sorted
 +
    int count; //to be used to count passes
 +
    int i; //comparisons counter
 +
    int hold; //temp variable to hold numbers
 +
 
 +
    //start of bubble sort
 +
    for(count=0; count<SIZE-1; count++){//loop to control number of passes
 +
    //size is decreased by 1 to prevent array out of bounds error
 +
 
 +
    for(i=0; i<SIZE-1; i++){//comparison loop
 +
        if(a[i]>a[i+1]){//if a[i] bigger than a[i+1], swap values
 +
        hold = a[i];
 +
        a[i] = a[i+1];
 +
        a[i+1] = hold;
 +
        }//end if
 +
    }//end inner for
 +
 
 +
    }//end outer for
 +
   
 +
    //print array
 +
    for(i=0; i<SIZE; i++)
 +
    printf("%4d", a[i]);
 +
    return 0;
 +
}</nowiki>
 +
 
 +
===Selection Sort===
 +
A bit further up on the food chain is the selection sort. The selection sort goes through each element position and finds the value which should occupy the next position in the sorted array. When it finds the appropriate element, the algorithm exchanges it with the valuewhich previously occupied the desired position. On the second pass it looks for the second smallest element and swaps it with the second position and so on.
 +
 
 +
<nowiki>//Selection sort example
 +
//finds smallest element between left hand and right hand
 +
//moves element to correct position
 +
#include <stdio.h>
 +
void selectionSort(int array[], int n){
 +
    //n is length
 +
    int lh, rh, i, temp;
 +
       
 +
    for (lh=0; lh < n; lh++){//everything left of lh is considered sorted
 +
        rh = lh;    //reset right hand
 +
        //runs through lh-rh bounds of array       
 +
        for (i = lh + 1; i<n; i++){
 +
            //if lowest element found, copy index to right hand
 +
            if (array[i] < array[rh]) rh = i;
 +
        }
 +
        //move lowest element to correct position
 +
        temp = array[lh];
 +
        array[lh] = array[rh];
 +
        array[rh] = temp;
 +
        }   
 +
    }
 +
int main( void ){
 +
    int len = 8;   
 +
    int a[8] = {2,6,9,1,4,15,7,2}; //array to be sorted
 +
    int i;   
 +
 
 +
    selectionSort(a, len);
 +
 
 +
    //print array
 +
    for(i=0; i<len; i++)
 +
    printf("%4d", a[i]);
 +
    return 0;
 +
}</nowiki>
 +
 
 +
===Merge sort===
 +
    // This is an iterative version of merge sort.
 +
    // It merges pairs of two consecutive lists one after another.
 +
    // After scanning the whole array to do the above job,
 +
    // it goes to the next stage. Variable k controls the size
 +
    // of lists to be merged. k doubles each time the main loop
 +
    // is executed.
 +
    #include <stdio.h>
 +
    #include <math.h>
 +
    int i,n,t,k;
 +
    int a[100000],b[100000];
 +
    int merge(l,r,u)
 +
      int l,r,u;
 +
    { int i,j,k;
 +
      i=l; j=r; k=l;
 +
      while (i<r && j<u) {
 +
        if (a[i]<=a[j]) {b[k]=a[i]; i++;}
 +
        else {b[k]=a[j]; j++;}
 +
        k++;
 +
      }
 +
      while (i<r) {
 +
        b[k]=a[i]; i++; k++;
 +
      }
 +
      while (j<u) {
 +
        b[k]=a[j]; j++; k++;
 +
      }
 +
      for (k=l; k<u; k++) {
 +
        a[k]=b[k];
 +
      }
 +
    }
 +
    sort()
 +
    { int k,u;
 +
      k=1;
 +
      while (k<n) {
 +
        i=1;
 +
        while (i+k<=n) {
 +
          u=i+k*2;
 +
          if (u>n) u=n+1;
 +
          merge(i,i+k,u);
 +
          i=i+k*2;
 +
        }
 +
        k=k*2;
 +
      }
 +
    }
 +
    main()
 +
    { printf("input size \n");
 +
      scanf("%d",&n);
 +
    /*  for (i=1;i<=n;i++) scanf("%d",&a[i]); */
 +
      for (i=1;i<=n;i++) a[i]=random()%1000;
 +
      t=clock();
 +
      sort();
 +
      for (i=1;i<=10;i++) printf("%d ",a[i]);
 +
      printf("\n");
 +
      printf("time= %d millisec\n",(clock()-t)/1000);
 +
    }
 +
 
 +
===Quick Sort===
 +
Watch [https://www.youtube.com/watch?v=XE4VP_8Y0BU this video]
 +
===Performance===
 +
 
 +
==Search Algorithms==
 +
Think of an ordered phone book. Consider two ways to search it for a name: linear search and binary search.
 +
 
 +
Linear search consists of starting at page 1, at the first name, and checking every name, one at a time, in order, until you find the name. Considering a phone book of one million names, this is very useful if you're searching for the number of Aaron A. Aardvark, but could be tedious if looking for Matthew Maloney, or downright awful if looking for a person named Z. Zerthis.
 +
 
 +
Binary search consists of opening the book up to half way, and comparing the name you are searching for with the items on either side. If the name you are looking for is earlier on, then you'll want the left half. If it is later on, you'll want the right half. What you have done here, is effectively cut the amount of names to search in half. You then go to the middle of whichever half you chose, and repeat the process until you've found what you're looking for. This is what we humans naturally do in a general sense.
 +
 
 +
To really get a hold on how much better Binary search is compared to linear search, consider a name that is roughly 3 quarters to the end of the phone book. With 1 million names, a linear search would take you roughly 750,000 comparisons before you found that name. With binary search, it would take roughly 3 or 4 comparisons.
 +
 
 +
There are also two string searching/pattern matching algorithms known as the Knuth-Morris-Pratt (KMP) and the Boyer-Moore (BM) algorithms.
 +
 
 +
Computers use graph searching algorithms called the Floyd-Warshall algorithm, and Dijkstra's (pronounced Daik-stra) algorithm, to find the optimal path between two or more nodes connected by pathes on a graph.
 +
 
 +
==Recursion==
 +
Recursion is the concept of something entering inside itself, or rather a new ''instance'' of itself. Functional programming languages such as Lisp dialects are known to employ this often.
 +
 
 +
Recursion can be tough to get your head around at first, because for a lot of people it is a new concept. However, once understood, it's quite simple. The main things to remember are the '''base case''', the '''terminating case''', and the concept of an '''accumulator'''.
 +
 
 +
Base case: This is the starting step of recursion.
 +
 
 +
Termination case: This tells the recursion when to stop. Without this, it would simply recur forever, or at least until the machine runs out of memory.
 +
 
 +
Accumulator: This is how you can pass data through a function recursively. It's called an accumulator because it ''accumulates'' data, or collects it as it goes. Here's an example of a program in Python that recursively adds numbers from 0 to 10 into a list, and then returns it.
 +
 
 +
The base case will be starting at 0. The terminating case is at 10. We'll use a list as an accumulator.
 +
 
 +
Lets define our function:
 +
 
 +
    def one_to_ten(number, accumulator):
 +
        if number > 10:
 +
            return accumulator
 +
        else:
 +
            accumulator.append(number)
 +
            return one_to_ten(number + 1, accumulator)
 +
   
 +
    empty_list = []
 +
    base_case = 0
 +
   
 +
    result = one_to_ten(base_case, empty_list)
 +
   
 +
    print(result) # result is [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
 +
 
 +
Going through it step by step:
 +
 
 +
  1. Our function takes a number and the accumulator
 +
  2. Check that the termination case hasn't been met
 +
  3. If it has, don't call the function anymore. Just return the accumulator.
 +
  4. Otherwise, put the number in the accumulator
 +
  5. Call the function from within itself, but with number + 1, so the next recursion of it
 +
    will have the next number. Note that we're also passing in the newly updated accumulator.
 +
 
 +
So, the function enters a second instance of itself, but with number plus one, and it does this 10 times, until number is 10 (which will be 10 levels deep). However, when number becomes 11, it will not recur anymore, and it will return the accumulator back to the 9th function, which will also finish and return to the 8th function, which returns back to the 7th function, and so on until the first call of the function has returned. You could say it naturally ''unrolls'' itself.
 +
 
 +
==Linked Lists==
 +
===Structure===
 +
A linked list consists of a series of nodes. Each node contains data, and a
 +
pointer to the next node. A visual representation of a linked list might
 +
look something like this:
 +
 
 +
  Node1          Node2          Node3
 +
  /------\      /------\      /------\
 +
  | Data |  ---->| Data |  ---->| Data |  ---->
 +
  |------|  |    |------|  |    |------|  |
 +
  | Next ----    | Next ----    | Next ----
 +
  \------/      \------/      \------/
 +
 
 +
As the diagram shows, each node has a '''Data''' field, and a '''Next''' field. Each node's Next points to another node, except the last node, which points to nothing. If you consider these three nodes as a single, connected entity, then you have a linked list. One of the interesting things about a linked list is that, unlike an ''array'', where all the elements are grouped next to each other in memory, the individual nodes can be anywhere in memory, with their Next field serving to point each to its successor.
 +
 
 +
There are different types of linked lists. The diagram depicts a '''singly''' linked list, meaning each node contains a single link, which points to the next node.
 +
 
 +
Linked lists can also be '''doubly''' or '''circularly''' linked. Both are
 +
similarly structured to a singly linked list, except that a doubly linked
 +
list also contains a Previous field which points to the node before it,
 +
and the last node in a circularly linked list points back to the first node
 +
in the list.
 +
 
 +
===Implementation===
 +
To represent a node in Pascal, we will use a '''record'''.
 +
 
 +
First we must define a pointer to a node, which can be thought of as one of the arrows in the diagram.
 +
 
 +
  type
 +
    PNode = ^TNode;
 +
 
 +
Now we can define the actual node type itself
 +
 
 +
    TNode = record
 +
      Data: Integer;
 +
      Next: PNode;
 +
    end;
 +
 
 +
Like the diagram, our node has a Data field, and a Next field. The Data field will contain an Integer, and the Next field will point to another node.
 +
 
 +
With a node type defined, we can define our linked list type. The linked list itself only contains a
 +
pointer to the first node in the list, which we will call its Head. We will also define a pointer to a linked list, which will come in handy later when we want to pass our lists between different functions to modify it.
 +
 
 +
    PLinkedList = ^TLinkedList;
 +
    TLinkedList = record
 +
      Head: PNode;
 +
    end;
 +
 
 +
===Usage===
 +
 
 +
We now have all the structure we need for a proper linked list. But, lists and nodes by themselves aren't
 +
very interesting. We should also create some functions that will let us create and delete lists, add and
 +
remove nodes, and visualise our lists by printing them to the terminal screen.
 +
 
 +
We'll start with creating a new list. For the most part, we'll be working with pointers to lists rather
 +
than lists themselves, so we will define a function that will create a new list and return a pointer to it
 +
that we can use
 +
 
 +
  function NewList: PLinkedList;
 +
  begin
 +
    New(NewList);
 +
    NewList^.Head := Nil
 +
  end;
 +
 
 +
We assign '''Nil''' to Head to indicate that it does not point to any node, like the arrow that points to
 +
nothing in the diagram. This is also useful for testing if any node has another node attached to it: if Next = Nil, then that node is not followed by any other node, and it is at the end of the list.
 +
 
 +
We also need a function to delete the lists we create when we are done with them, along with all of the
 +
nodes we've added to it. To do this, we must step over the list one node at a time, deleting each node until we reach the end.
 +
 
 +
  procedure DisposeList(List: PLinkedList);
 +
  var
 +
    RemovedNode, CurrentNode: PNode;
 +
  begin
 +
    CurrentNode := List^.Head;
 +
    while CurrentNode <> Nil do
 +
    begin
 +
      RemovedNode := CurrentNode;
 +
      CurrentNode := CurrentNode^.Next;
 +
      Dispose(RemovedNode)
 +
    end
 +
  end;
  
'''You are in the jungle. You have a pocket-knife.''' Someone asks you to kill a mountain lion.  
+
Now we want to be able to add nodes, to the front or back of the list. Since we mainly care about the
 +
data each node contains, these functions will create the new nodes for us, and assign the data we want to
 +
add to their Data field.
  
Anyone but a programmer would be asking ''"WTF is a MOUNTAIN lion doing in a JUNGLE?!"'', but that's not what you have been trained to do as a programmer. You are here to solve problems, not to question them.
+
Adding to the front of the list is easiest, since we already know exactly where the first node is: the
 +
list's Head field points to it. To place it in front, we first place the node Head points to after our
 +
new node (by pointing the new node's Next field to it), then we reassign the list's Head field to point
 +
to our new node:
  
Years of training has taught you well. You use your knife to sharpen a stick. You cut vines to lash sharp stones on one end. Maybe you're from a top university, and you've learned to extract essential ingredients from plant and insect life around you to fashion a poison to tip your weapon with.
+
  procedure AddToFront(List: PLinkedList; Data: Integer);
 +
  var
 +
    NewNode: PNode;
 +
  begin
 +
    New(NewNode);
 +
    NewNode^.Data := Data;
 +
    NewNode^.Next := List^.Head;
 +
    List^.Head := NewNode
 +
  end;
  
Convinced that you have an effective and efficient way to kill the lion, you set forth to accomplish your task. Maybe your stick is too short, or your poisons don't work. It's okay - you live to refine your method and try again another day.
+
Adding to the back of the list is a little more complicated, because first we need to find the last node
 +
in the list. There are a couple of cases we must consider:
 +
# The list is empty (Head = Nil).
 +
# The list has one or more nodes.
 +
In the first case, we can simply assign Head to point at our new node. In the second case, we must step
 +
over the list, one node at a time, until we reach a node that has no node following it (Next = Nil).
  
Then someone figures out a way to fashion a low-grade explosive from harvesting chemicals in the jungle. Your method of fashioning a spear to kill the lion is now far from the best way to accomplish your task. Nevertheless, it's still a simple way, and will continue to be taught in schools. Every lion-killer will be taught how to build his tools from scratch.
+
  procedure AddToBack(List: PLinkedList; Data: Integer);
 +
  var
 +
    NewNode, CurrentNode: PNode;
 +
  begin
 +
    New(NewNode);
 +
    NewNode^.Data := Data;
 +
    NewNode^.Next := Nil;
 +
    if List^.Head = Nil then
 +
    begin
 +
      List^.Head := NewNode
 +
    end else
 +
    begin
 +
      CurrentNode := List^.Head;
 +
      while CurrentNode^.Next <> Nil do
 +
      begin
 +
        CurrentNode := CurrentNode^.Next
 +
      end;
 +
      CurrentNode^.Next := NewNode
 +
    end
 +
  end;
  
That's "real-life" programming.
+
We also want a way to remove nodes from the front and back of the list.
  
In competitive programming, you start out with the same resources (a pocket-knife), except '''you have 2 minutes to kill the lion.'''
+
Like adding, removing from the front is relatively simple, we just assign Head to the node pointed to by
 +
the Head node's Next field, and dispose of the old Head node:
  
As a beginner, you will stare at the lion and do nothing.
+
  procedure RemoveFromFront(List: PLinkedList);
 +
  var
 +
    RemovedNode: PNode;
 +
  begin
 +
    RemovedNode := List^.Head;
 +
    List^.Head := List^.Head^.Next;
 +
    Dispose(RemovedNode)
 +
  end;
  
Soon, you learn that if you kill a squirrel, sometimes the judge thinks it's a lion and you're good to go.
+
And also like adding, removing from the back involves a number of cases:
 +
# The list is empty.
 +
# The list contains one node.
 +
# The list contains two or more nodes.
 +
In the first case, we don't have to do anything, since there are no nodes to remove. For the second case,
 +
we must remove the node that the list's Head field points to. For the third case, we must step over the
 +
list until we reach the second to last node, and then remove the node that node's Next field points to.
  
A more experienced programmer just keeps stabbing the lion and hopes that the lion dies in time. Soon, you learn that there are certain spots on a lion that are damage immune. You learn to not even bother stabbing those spots. Sometimes, the lion doesn't expose those spots, so you get really good at killing squirrels.
+
  procedure RemoveFromBack(List: PLinkedList);
 +
  var
 +
    CurrentNode: PNode;
 +
  begin
 +
    if List^.Head <> Nil then
 +
    begin
 +
      if List^.Head^.Next = Nil then
 +
      begin
 +
        Dispose(List^.Head);
 +
        List^.Head := Nil
 +
      end else
 +
      begin
 +
        CurrentNode := List^.Head;
 +
        while CurrentNode^.Next^.Next <> Nil do
 +
        begin
 +
          CurrentNode := CurrentNode^.Next
 +
        end;
 +
        Dispose(CurrentNode^.Next);
 +
        CurrentNode^.Next := Nil
 +
      end
 +
    end
 +
  end;
  
And then, to be a great competitive programmer, you need to be able to do two things.
+
The final function we will create is much simpler. This function will print out a visual representation
 +
of the data in a list to the terminal screen:
  
First, you must learn how to find the lion's critical point and kill it in one swift stroke.
+
  procedure PrintList(List: PLinkedList);
 +
  var
 +
    CurrentNode: PNode;
 +
  begin
 +
    CurrentNode := List^.Head;
 +
    while CurrentNode <> Nil do
 +
    begin
 +
      Write(CurrentNode^.Data);
 +
      if CurrentNode^.Next <> Nil then
 +
      begin
 +
        Write(' => ')
 +
      end;
 +
      CurrentNode := CurrentNode^.Next
 +
    end;
 +
    Writeln
 +
  end;
  
Second, you must learn how to be so handy with your knife that you can fashion a sharp stick in 1 minute, and spend the next minute stabbing the lion to death.
+
===Testing===
  
But never ever will you be able to have enough time to fashion an explosive to take the lion out.
+
Now we can test our linked list structure and functions:
  
===Material===
+
  var
 +
    List: PLinkedList;
 +
  begin
 +
    List := NewList;
 +
    AddToFront(List, 3);
 +
    AddToFront(List, 2);
 +
    AddToFront(List, 4);
 +
    PrintList(List);
 +
    AddToBack(List, 1);
 +
    AddToBack(List, 0);
 +
    PrintList(List);
 +
    RemoveFromFront(List);
 +
    RemoveFromBack(List);
 +
    PrintList(List);
 +
    DisposeList(List)
 +
  end.
  
====Books====
+
This will result in the following output:
  
There are books that teach programming competition. They are a very good resource to learn computer Science
+
  4 => 2 => 3
 +
  4 => 2 => 3 => 1 => 0
 +
  2 => 3 => 1
  
* http://www.comp.nus.edu.sg/~stevenha/myteaching/competitive_programming/cp1.pdf
+
Full source: [https://paste.installgentoo.com/view/39114a9a linkedlistexample.pas]
** https://sites.google.com/site/stevenhalim/
 
* (A less good one)
 
** http://acm.cs.buap.mx/downloads/Programming_Challenges.pdf
 
  
====Compete====
+
==Binary Search Trees==
  
There are many pages to compete in online programming competitions. They often offer tutorials and many other more:
+
A binary search tree is a way to very quickly search sorted data. It is simply constructing a tree by following a binary search.
  
* [http://tocpoder.com/tc TopCoder]
+
    /* This program repeatedly inserts n random numbers into
* [http://codeforces.com/ Codeforces]
+
      a binary search tree. Then it traverse the tree
* [http://www.codechef.com/ Codechef]
+
      in in-order to store the keys into array a.
* [https://code.google.com/codejam/ Google CodeJam]
+
      After the traversal, sorted numbers are in array a. */
* [https://www.facebook.com/hackercup Facebook HackerCup] (I do not reccomend this one)
+
    #include <stdio.h>
* [http://icpc.baylor.edu/worldfinals/results ICPC] (Might be available in your college)
+
    #include <math.h>
* [https://en.wikipedia.org/wiki/International_Olympiad_in_Informatics IOI] If you are underage and autistic, look for the local version in your country of this one.
+
    #include <stdlib.h>
 +
    #define nil 0
 +
    int k=1;
 +
    struct item *proot;
 +
    struct item{
 +
      int key;
 +
      struct item *adr;
 +
      struct item *left;
 +
      struct item *right;
 +
    };
 +
    struct item a[1000000];
 +
    void insert(int x){
 +
      struct item *p, *q;
 +
      p=proot;
 +
      do{
 +
        q=p;
 +
        if (x<=(*p).key)p=(*p).left; else p=(*p).right;
 +
      }
 +
      while (p!=nil);
 +
      p=malloc(sizeof(struct item));
 +
      (*p).key=x; (*p).left=nil; (*p).right=nil;
 +
      (*p).adr=p;
 +
      if (x<=(*q).key) (*q).left=p; else (*q).right=p;
 +
    }
 +
    traverse(struct item *proot)
 +
    { struct item *p;
 +
      p=proot; 
 +
      if (p!=nil){
 +
        printf("(");
 +
        traverse((*p).left);
 +
        printf("%d", (*p).key);
 +
        a[k]=(*p); k++;
 +
        traverse((*p).right);
 +
        printf(")");
 +
      }
 +
    }
 +
    main()
 +
    {
 +
      int i,n,t;
 +
      scanf("%d",&n);
 +
      proot=malloc(sizeof(struct item));
 +
      (*proot).key=99999;
 +
      (*proot).left=nil;
 +
      (*proot).right=nil;
 +
      for(i=1;i<=n;i++) a[i].key=random()%100000;
 +
    // for (i=1;i<=n;i++)printf("%d ",a[i].key);
 +
      printf("\n");
 +
      t=clock();
 +
      for (i=1;i<=n;i++)insert(a[i].key);
 +
      traverse(proot->left);
 +
      printf("\n");
 +
      printf("time= %d millisec\n\n", (clock()-t)/1000);
 +
    /*
 +
      for (i=1;i<=n;i++)
 +
        printf("%3d %3d %3d %3d\n\n",(int)(a[i].adr) % 1000, a[i].key,
 +
                                (int)(a[i].left) % 1000, (int)(a[i].right) % 1000);
 +
    */
 +
    }
  
====Training sites====
+
==Hash Tables==
  
If you want to train:
 
  
* [http://uva.onlinejudge.org/ UVA]
 
* [http://usaco.org/ USACO]
 
* [http://ahmed-aly.com/ Ahmed Aly's Webpage]
 
  
 
[[Category:Programming]]
 
[[Category:Programming]]
 +
[[Category:Terms]]
 +
[[Category:HowTo]]

Latest revision as of 05:34, 15 March 2020

Imbox move.png
Imbox move.png
MOVE CANDIDATE
This page is being proposed to be moved to somewhere else. Relevant discussion may be found on the talk page. Reason: Computer science is a broad area and the information on this page only relates to algorithms and data types.
Note: Check out this page on the /sci/ wiki. It does a much better job than this page (currently)

What is Computer Science? As Harold Abelson from the SICP lectures says, the name Computer Science is in fact misleading.

"Computer science is not about computers, in the same sense that physics is not really about particle accelerators, and biology is not about microscopes and Petri dishes."

Thus, "computer" is not needed in the title. Secondly, a lot of Computer Science is mathematics and theory. There is very little actual scientific method like one would find in Physics, Chemistry, or Biology, so in this sense, it's not really a science as much as it is a form of applied mathematics or engineering. So Computer Science should be replaced with a much more informative name.

So what is Computer Science then? CS consists of algorithms for solving problems in the most mathematically convenient or efficient way. Take for example the problem of finding the fastest route between several locations in a city; this is solved using a graph searching algorithm. There are also data structures designed for storing and indexing huge amounts of data in a database, protocols for making sure that data transmission is reliable, statistical learning methods behind AI, and so on.

Getting started

The most recommended Computer Science books are Structure and Interpretations of Computer Programs (SICP), and what is considered the holy grail of CS literature: The Art of Computer Programming (TAOCP) by Donald Knuth.

You'll find some eBooks at http://books.gentoomen.org/ and you can buy physical ones for cheap from http://abebooks.com

A good place to start would be the SICP lectures on youtube, found here, and Computerphile's videos, found here.

Algorithms

An algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values as output. It is the part that does the magic.

There are a few programming concepts discussed in the Programming concepts page.

For the rest, check Introduction to Algorithms (by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein), and if you have the balls and the time, check The Art of Computer Programming (by Donald D. Knuth).

There is also a good introduction to many concepts explained here.

Sorting Algorithms

A big topic in computer applications is the quick and efficient sorting of data. There are many different sorting techniques, as a programmer one has to balance the algorithm's ease of reading vs the efficiency of said code. Many standard libraries/APIs already have sorting functions already written, however it is still good practice to know some of the theory behind how the different algorithms work.

Check out this video

Bubble Sort

The bubble sort is a very simple to read algorithm that is the standard first sorting method that most programmers learn. The bubble sort gets its name because values are swapped among each other, gradually moving to the top of the list.

//bubble sort example
#include <stdio.h>
#define SIZE 10
int main( void ){
    int a[SIZE] = {2,6,9,1,4,15,7,2}; //array to be sorted
    int count; //to be used to count passes
    int i; //comparisons counter
    int hold; //temp variable to hold numbers

    //start of bubble sort
    for(count=0; count<SIZE-1; count++){//loop to control number of passes
    //size is decreased by 1 to prevent array out of bounds error

    for(i=0; i<SIZE-1; i++){//comparison loop
        if(a[i]>a[i+1]){//if a[i] bigger than a[i+1], swap values
        hold = a[i];
        a[i] = a[i+1];
        a[i+1] = hold;
        }//end if
    }//end inner for

    }//end outer for
    
    //print array
    for(i=0; i<SIZE; i++)
    printf("%4d", a[i]);
    return 0;
}

Selection Sort

A bit further up on the food chain is the selection sort. The selection sort goes through each element position and finds the value which should occupy the next position in the sorted array. When it finds the appropriate element, the algorithm exchanges it with the valuewhich previously occupied the desired position. On the second pass it looks for the second smallest element and swaps it with the second position and so on.

//Selection sort example
//finds smallest element between left hand and right hand
//moves element to correct position
#include <stdio.h>
void selectionSort(int array[], int n){
    //n is length
    int lh, rh, i, temp;
        
    for (lh=0; lh < n; lh++){//everything left of lh is considered sorted
        rh = lh;    //reset right hand
        //runs through lh-rh bounds of array        
        for (i = lh + 1; i<n; i++){
            //if lowest element found, copy index to right hand
            if (array[i] < array[rh]) rh = i;
        }
        //move lowest element to correct position
        temp = array[lh];
        array[lh] = array[rh];
        array[rh] = temp;
        }    
    }
int main( void ){
    int len = 8;    
    int a[8] = {2,6,9,1,4,15,7,2}; //array to be sorted
    int i;    

    selectionSort(a, len);
   
    //print array
    for(i=0; i<len; i++)
    printf("%4d", a[i]);
    return 0;
}

Merge sort

   // This is an iterative version of merge sort.
   // It merges pairs of two consecutive lists one after another.
   // After scanning the whole array to do the above job,
   // it goes to the next stage. Variable k controls the size
   // of lists to be merged. k doubles each time the main loop
   // is executed.
   #include <stdio.h>
   #include <math.h>
   int i,n,t,k;
   int a[100000],b[100000];
   int merge(l,r,u)
     int l,r,u;
   { int i,j,k;
     i=l; j=r; k=l;
     while (i<r && j<u) { 
       if (a[i]<=a[j]) {b[k]=a[i]; i++;} 
       else {b[k]=a[j]; j++;}
       k++;
     }
     while (i<r) { 
       b[k]=a[i]; i++; k++;
     }
     while (j<u) { 
       b[k]=a[j]; j++; k++;
     }
     for (k=l; k<u; k++) { 
       a[k]=b[k]; 
     }
   }
   sort()
   { int k,u;
     k=1;
     while (k<n) {
       i=1;
       while (i+k<=n) {
         u=i+k*2;
         if (u>n) u=n+1;
         merge(i,i+k,u);
         i=i+k*2;
       }
       k=k*2;
     }
   }
   main()
   { printf("input size \n");
     scanf("%d",&n); 
   /*  for (i=1;i<=n;i++) scanf("%d",&a[i]); */
     for (i=1;i<=n;i++) a[i]=random()%1000;
     t=clock();
     sort();
     for (i=1;i<=10;i++) printf("%d ",a[i]);
     printf("\n");
     printf("time= %d millisec\n",(clock()-t)/1000);
   }

Quick Sort

Watch this video

Performance

Search Algorithms

Think of an ordered phone book. Consider two ways to search it for a name: linear search and binary search.

Linear search consists of starting at page 1, at the first name, and checking every name, one at a time, in order, until you find the name. Considering a phone book of one million names, this is very useful if you're searching for the number of Aaron A. Aardvark, but could be tedious if looking for Matthew Maloney, or downright awful if looking for a person named Z. Zerthis.

Binary search consists of opening the book up to half way, and comparing the name you are searching for with the items on either side. If the name you are looking for is earlier on, then you'll want the left half. If it is later on, you'll want the right half. What you have done here, is effectively cut the amount of names to search in half. You then go to the middle of whichever half you chose, and repeat the process until you've found what you're looking for. This is what we humans naturally do in a general sense.

To really get a hold on how much better Binary search is compared to linear search, consider a name that is roughly 3 quarters to the end of the phone book. With 1 million names, a linear search would take you roughly 750,000 comparisons before you found that name. With binary search, it would take roughly 3 or 4 comparisons.

There are also two string searching/pattern matching algorithms known as the Knuth-Morris-Pratt (KMP) and the Boyer-Moore (BM) algorithms.

Computers use graph searching algorithms called the Floyd-Warshall algorithm, and Dijkstra's (pronounced Daik-stra) algorithm, to find the optimal path between two or more nodes connected by pathes on a graph.

Recursion

Recursion is the concept of something entering inside itself, or rather a new instance of itself. Functional programming languages such as Lisp dialects are known to employ this often.

Recursion can be tough to get your head around at first, because for a lot of people it is a new concept. However, once understood, it's quite simple. The main things to remember are the base case, the terminating case, and the concept of an accumulator.

Base case: This is the starting step of recursion.

Termination case: This tells the recursion when to stop. Without this, it would simply recur forever, or at least until the machine runs out of memory.

Accumulator: This is how you can pass data through a function recursively. It's called an accumulator because it accumulates data, or collects it as it goes. Here's an example of a program in Python that recursively adds numbers from 0 to 10 into a list, and then returns it.

The base case will be starting at 0. The terminating case is at 10. We'll use a list as an accumulator.

Lets define our function:

   def one_to_ten(number, accumulator):
       if number > 10:
           return accumulator
       else:
           accumulator.append(number)
           return one_to_ten(number + 1, accumulator)
   
   empty_list = []
   base_case = 0
   
   result = one_to_ten(base_case, empty_list)
   
   print(result) # result is [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Going through it step by step:

 1. Our function takes a number and the accumulator
 2. Check that the termination case hasn't been met
 3. If it has, don't call the function anymore. Just return the accumulator.
 4. Otherwise, put the number in the accumulator
 5. Call the function from within itself, but with number + 1, so the next recursion of it
    will have the next number. Note that we're also passing in the newly updated accumulator.

So, the function enters a second instance of itself, but with number plus one, and it does this 10 times, until number is 10 (which will be 10 levels deep). However, when number becomes 11, it will not recur anymore, and it will return the accumulator back to the 9th function, which will also finish and return to the 8th function, which returns back to the 7th function, and so on until the first call of the function has returned. You could say it naturally unrolls itself.

Linked Lists

Structure

A linked list consists of a series of nodes. Each node contains data, and a pointer to the next node. A visual representation of a linked list might look something like this:

  Node1          Node2          Node3
 /------\       /------\       /------\
 | Data |  ---->| Data |  ---->| Data |  ---->
 |------|  |    |------|  |    |------|  |
 | Next ----    | Next ----    | Next ----
 \------/       \------/       \------/

As the diagram shows, each node has a Data field, and a Next field. Each node's Next points to another node, except the last node, which points to nothing. If you consider these three nodes as a single, connected entity, then you have a linked list. One of the interesting things about a linked list is that, unlike an array, where all the elements are grouped next to each other in memory, the individual nodes can be anywhere in memory, with their Next field serving to point each to its successor.

There are different types of linked lists. The diagram depicts a singly linked list, meaning each node contains a single link, which points to the next node.

Linked lists can also be doubly or circularly linked. Both are similarly structured to a singly linked list, except that a doubly linked list also contains a Previous field which points to the node before it, and the last node in a circularly linked list points back to the first node in the list.

Implementation

To represent a node in Pascal, we will use a record.

First we must define a pointer to a node, which can be thought of as one of the arrows in the diagram.

 type
   PNode = ^TNode;

Now we can define the actual node type itself

   TNode = record
     Data: Integer;
     Next: PNode;
   end;

Like the diagram, our node has a Data field, and a Next field. The Data field will contain an Integer, and the Next field will point to another node.

With a node type defined, we can define our linked list type. The linked list itself only contains a pointer to the first node in the list, which we will call its Head. We will also define a pointer to a linked list, which will come in handy later when we want to pass our lists between different functions to modify it.

   PLinkedList = ^TLinkedList;
   TLinkedList = record
     Head: PNode;
   end;

Usage

We now have all the structure we need for a proper linked list. But, lists and nodes by themselves aren't very interesting. We should also create some functions that will let us create and delete lists, add and remove nodes, and visualise our lists by printing them to the terminal screen.

We'll start with creating a new list. For the most part, we'll be working with pointers to lists rather than lists themselves, so we will define a function that will create a new list and return a pointer to it that we can use

 function NewList: PLinkedList;
 begin
   New(NewList);
   NewList^.Head := Nil
 end;

We assign Nil to Head to indicate that it does not point to any node, like the arrow that points to nothing in the diagram. This is also useful for testing if any node has another node attached to it: if Next = Nil, then that node is not followed by any other node, and it is at the end of the list.

We also need a function to delete the lists we create when we are done with them, along with all of the nodes we've added to it. To do this, we must step over the list one node at a time, deleting each node until we reach the end.

 procedure DisposeList(List: PLinkedList);
 var
   RemovedNode, CurrentNode: PNode;
 begin
   CurrentNode := List^.Head;
   while CurrentNode <> Nil do
   begin
     RemovedNode := CurrentNode;
     CurrentNode := CurrentNode^.Next;
     Dispose(RemovedNode)
   end
 end;

Now we want to be able to add nodes, to the front or back of the list. Since we mainly care about the data each node contains, these functions will create the new nodes for us, and assign the data we want to add to their Data field.

Adding to the front of the list is easiest, since we already know exactly where the first node is: the list's Head field points to it. To place it in front, we first place the node Head points to after our new node (by pointing the new node's Next field to it), then we reassign the list's Head field to point to our new node:

 procedure AddToFront(List: PLinkedList; Data: Integer);
 var
   NewNode: PNode;
 begin
   New(NewNode);
   NewNode^.Data := Data;
   NewNode^.Next := List^.Head;
   List^.Head := NewNode
 end;

Adding to the back of the list is a little more complicated, because first we need to find the last node in the list. There are a couple of cases we must consider:

  1. The list is empty (Head = Nil).
  2. The list has one or more nodes.

In the first case, we can simply assign Head to point at our new node. In the second case, we must step over the list, one node at a time, until we reach a node that has no node following it (Next = Nil).

 procedure AddToBack(List: PLinkedList; Data: Integer);
 var
   NewNode, CurrentNode: PNode;
 begin
   New(NewNode);
   NewNode^.Data := Data;
   NewNode^.Next := Nil;
   if List^.Head = Nil then
   begin
     List^.Head := NewNode
   end else
   begin
     CurrentNode := List^.Head;
     while CurrentNode^.Next <> Nil do
     begin
       CurrentNode := CurrentNode^.Next
     end;
     CurrentNode^.Next := NewNode
   end
 end;

We also want a way to remove nodes from the front and back of the list.

Like adding, removing from the front is relatively simple, we just assign Head to the node pointed to by the Head node's Next field, and dispose of the old Head node:

 procedure RemoveFromFront(List: PLinkedList);
 var
   RemovedNode: PNode;
 begin
   RemovedNode := List^.Head;
   List^.Head := List^.Head^.Next;
   Dispose(RemovedNode)
 end;

And also like adding, removing from the back involves a number of cases:

  1. The list is empty.
  2. The list contains one node.
  3. The list contains two or more nodes.

In the first case, we don't have to do anything, since there are no nodes to remove. For the second case, we must remove the node that the list's Head field points to. For the third case, we must step over the list until we reach the second to last node, and then remove the node that node's Next field points to.

 procedure RemoveFromBack(List: PLinkedList);
 var
   CurrentNode: PNode;
 begin
   if List^.Head <> Nil then
   begin
     if List^.Head^.Next = Nil then
     begin
       Dispose(List^.Head);
       List^.Head := Nil
     end else
     begin
       CurrentNode := List^.Head;
       while CurrentNode^.Next^.Next <> Nil do
       begin
         CurrentNode := CurrentNode^.Next
       end;
       Dispose(CurrentNode^.Next);
       CurrentNode^.Next := Nil
     end
   end
 end;

The final function we will create is much simpler. This function will print out a visual representation of the data in a list to the terminal screen:

 procedure PrintList(List: PLinkedList);
 var
   CurrentNode: PNode;
 begin
   CurrentNode := List^.Head;
   while CurrentNode <> Nil do
   begin
     Write(CurrentNode^.Data);
     if CurrentNode^.Next <> Nil then
     begin
       Write(' => ')
     end;
     CurrentNode := CurrentNode^.Next
   end;
   Writeln
 end;

Testing

Now we can test our linked list structure and functions:

 var
   List: PLinkedList;
 begin
   List := NewList;
   AddToFront(List, 3);
   AddToFront(List, 2);
   AddToFront(List, 4);
   PrintList(List);
   AddToBack(List, 1);
   AddToBack(List, 0);
   PrintList(List);
   RemoveFromFront(List);
   RemoveFromBack(List);
   PrintList(List);
   DisposeList(List)
 end.

This will result in the following output:

 4 => 2 => 3
 4 => 2 => 3 => 1 => 0
 2 => 3 => 1

Full source: linkedlistexample.pas

Binary Search Trees

A binary search tree is a way to very quickly search sorted data. It is simply constructing a tree by following a binary search.

   /* This program repeatedly inserts n random numbers into
      a binary search tree. Then it traverse the tree
      in in-order to store the keys into array a.
      After the traversal, sorted numbers are in array a. */
   #include <stdio.h>
   #include <math.h>
   #include <stdlib.h>
   #define nil 0
   int k=1;
   struct item *proot;
   struct item{
     int key;
     struct item *adr;
     struct item *left;
     struct item *right;
   };
   struct item a[1000000];
   void insert(int x){
     struct item *p, *q;
     p=proot;
     do{
       q=p;
       if (x<=(*p).key)p=(*p).left; else p=(*p).right;
     }
     while (p!=nil);
     p=malloc(sizeof(struct item)); 
     (*p).key=x; (*p).left=nil; (*p).right=nil;
     (*p).adr=p;
     if (x<=(*q).key) (*q).left=p; else (*q).right=p;
   }
   traverse(struct item *proot)
   { struct item *p;
     p=proot;  
     if (p!=nil){
       printf("("); 
       traverse((*p).left);
       printf("%d", (*p).key);
       a[k]=(*p); k++;
       traverse((*p).right);
       printf(")"); 
     }
   }
   main()
   { 
     int i,n,t;
     scanf("%d",&n);
     proot=malloc(sizeof(struct item));
     (*proot).key=99999;
     (*proot).left=nil;
     (*proot).right=nil;
     for(i=1;i<=n;i++) a[i].key=random()%100000;
   //  for (i=1;i<=n;i++)printf("%d ",a[i].key);
     printf("\n");
     t=clock();
     for (i=1;i<=n;i++)insert(a[i].key);
     traverse(proot->left);
     printf("\n");
     printf("time= %d millisec\n\n", (clock()-t)/1000);
   /*
     for (i=1;i<=n;i++)
       printf("%3d %3d %3d %3d\n\n",(int)(a[i].adr) % 1000, a[i].key, 
                                (int)(a[i].left) % 1000, (int)(a[i].right) % 1000);
   */
   }

Hash Tables