Classic data structures

Data Structures

http://cseweb.ucsd.edu/~kube/cls/100/Lectures

Terminology: Child, Descendant, Parent, Ancestor, Node, Root, Leaf, sub-tree, Depth, Height.

Best/Worst/Average performance, and memory cost estimations:

Big-O: g(N) = O( f(N)) if g(n>n0) <= k f(n>n0).
Big-Omega: f(N) = Omega(g(N))
Big-Theta : if both above are true.

sorting heuristics:
http://www.sorting-algorithms.com/

interview questions:
http://www.geeksforgeeks.org/write-a-c-program-to-print-all-permutations-of-a-given-string/
(follow links on right)

C++ STL containers:

  • vector
  • deque
  • list linked list
  • stack
  • queue
  • priority_queue
  • set BST
  • multiset
  • map
  • multimap
  • bitset

Trees

  • Linked lists: A->B->C
    • Skip lists A->B->C->D, and A->D accelerator
      If node 2^n skips ahead to 2^n+1 then same as a binary search tree. But overkill.
      node-level = # of forward ptr. Node level of node inserted is chosen randomly.
      Insert/Delete/Search O(log N)
    • Double-linked list
  • Binary Trees:  c<-a->b
    Each node holds a key to manipulate. Height = log2(nodes) => H = O(Log N).

    • BST Binary Search Tree: keys k are ordered, all keys left are smaller, right are bigger. A node holds (*p, *l, *r, key/data, property)
      • Find: compare key vs k.
      • Insert: find till NULL (no dupes)
      • Delete:
      • Find next key: may be right child, or may be a parent linked via a left link.
    • Balanced BST
      • AVL tree: A balanced BST with depth off by no more than 1.
        keep sub-trees balanced with a rotation of the nodes: Swap a child and its parent & the middle sub tree changes parent for l-l and r-r childs. If it’s a l-r or r-l do a double swap to pull up the node. Keep sub-tree height in each node.
      • Red-black tree: O(log N) worst case for insert, find, delete. More lax rule than AVL: Root is black. If parent is red, children are black. All paths from a node to NULL must have same number of black.
    • Splay tree: A BST,but on a find, move the node found as root, to improve speed of next search, using splaying.
    • K-D K-dimensional trees: Branch logic at level L is done with key L mod k. Used to store nodes with k-dimension coordinates, to allow sorting and region partitions.
    • Heap: A partially sorted binary tree or equivalent, where A is parent of B if A is greater (maxHeap), or lesser (minHeap). It’s an efficient implementation of a priority queue. It allows fast find/delete of max or min.
      Often implemented with an array, putting the child of n at 2n+1,2n+2, or parent at (i-1)/2, and swaps to move nodes up/down. Insertion is done at 1st free spot and fixing the heap with swaps.
    • Treaps: It is a BST for the keys, but node also has a priority, where parent has priority >= all children. In other words, instead of AVL, nodes are swapped based on priority. Inserts are O(Height of tree). Used for split/join operations. To split at key K, insert node (K, infinity). To join, make a dummy root with both trees, and remove it.
      • RST: Randomized search tree: Use a priority that is random at insert. Depth = O(long N) = cost of  find, delete, insert, split, join…
  • B-tree balanced trees: Nodes have variable m to M children so all leaves are at same level. B-tree node fits in a 4KB block, high branch-out limit number of disk IO.
    • B+ tree:
    • 2-3 tree: M=L=3, a node has 2 or 3 child.
  • Tries (or radix tree):
    Finds a key, held in the leaf node by traversing the tree, of which each node represent one letter/bit of the word/number. The digit or letter used is the radix, and tells you how many edges are out of a node (R=2^edges). If an element has at most B digits -> B comparisons/height.

    • Alphabet trie: {a..z} -> {a…z} (represent word dictionaries, 26 edges)
    • Ternary trie: Each node is a digit, and move down middle on a match and do next digit, else left/right.
    • Bitwise tree (represent numbers, 2 edges/binary)
      Note the shortcut is to have a root with 32 edges, and encoding the leading zeroes in it to reduce the search depth.
    • (DST (digital search trees): use the ith bit to choose the ith vertex, each node holds a key)

Graphs

node is a vertex, connected by edge

  • DAG Directed Acyclic Graph
    edges are directed, and the flow does not loop. Circular graphs are often broken into DAGs in compilers.
  • Dense graph: V^2 edges
    adjacency matrix: boolean 2D array saying if 2 vertices are connected by an edge.
  • Sparse graph: ~ V edges
    adjacency list: each vertex has a list of other adjecent vertices.
  • Weighted graph: edge has weight

Shortest path in a graph G(V,E) from a node to any other.

  • depth first
    pre-order traversal of a tree. You need to back track using a stack. If you find a shorter path to a node, you need to retry the paths from that node
  • breadth first O(E + V)
    In a tree it means scan all the same level first. It is better for shortest path detection in a graph. Visit all nodes adjacent using a queue. Once visited don’t revisit a node.
  • best first (for weighted shortest path)

Shortest path in weighted graph G(V,E)

Dijkstra’s greedy algorithm extends the best path found so far. It uses a priority queue<pointer to vertex, cost>, and marks a vertex with current best cost/distance, previous node on that best path and done if solved.

Hash Tables & Maps

A map is called table, dictionary or associative memory. Hold <key, value> pairs.

Searching an index is O(1), a sorted list is O(log N), O(N) if not. To implement create B buckets slightly more than the number of keys stored. On collision use a linked list or store in an unused bucket.

load factor lambda: number of entries n / number of buckets m l =n/m.

hash function: must be deterministic, fast, uniform distribution, consistent with equal.

Hash(string) = sum of ascii value *31^pos // use horner’s rule to reduce math work

Searching:

  • binary search of sorted data: O(log N)
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s