
Reflection in Transition

10 Some Simple Algorithms and Data Structures

John Qu / 2020-06-10


Why does the teacher talk about, and why should we care so much about, the efficiency of programs?

The goal of this chapter is:

By the time you get through this chapter you should understand:

What was the point of showing the brute-force exhaustive enumeration algorithm?

Modern computers are so fast that employing clever algorithms is often a waste of time. Writing code that is simple and obviously correct is often the right way to go.

Which is the key to efficiency, coding tricks or good algorithms?

There are problems where the search space is too large to make brute force practical. This led us to consider more efficient algorithms such as bisection search and Newton-Raphson. The major point of introducing these two algorithms was that the key to efficiency is a good algorithm, not clever coding tricks.
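
For a concrete reminder, here is a minimal sketch of bisection search applied to approximating a square root (the function name and tolerance are illustrative, not taken verbatim from Chapter 3):

def sqrtBisection(x, epsilon = 0.01):
    """Assumes x >= 0 and epsilon > 0.
       Returns y such that y*y is within epsilon of x."""
    low, high = 0, max(1, x)
    guess = (low + high)/2
    # Each iteration halves the interval, so the search is logarithmic in
    # the size of the interval rather than linear as in brute force.
    while abs(guess**2 - x) >= epsilon:
        if guess**2 < x:
            low = guess
        else:
            high = guess
        guess = (low + high)/2
    return guess

print(sqrtBisection(25)) # prints a value close to 5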

What is a practical attitude toward implementing an efficient algorithm?

In the sciences (physical, life, and social), programmers often start by quickly coding up a simple algorithm to test the plausibility1 of a hypothesis about a data set, and then run it on a small amount of data. If this yields encouraging results, the hard work of producing an implementation that can be run (perhaps over and over again) on large data sets begins. Such implementations need to be based on efficient algorithms.

When learning about existing algorithms, should our goal be to invent novel ones?

Efficient algorithms are hard to invent. Successful professional computer scientists might invent one algorithm during their whole career—if they are lucky. Most of us never invent a novel algorithm. What we do instead is learn to reduce the most complex aspects of the problems we are faced with to previously solved problems.

More specifically, we develop an understanding of the inherent complexity of a problem, think about how to break it down into subproblems, and relate those subproblems to other problems for which efficient algorithms already exist.

What is a good attitude towards finding the most efficient algorithm for everything in a program?

Keep in mind that the most efficient algorithm is not always the algorithm of choice. A program that does everything in the most efficient possible way is often needlessly difficult to understand. It is often a good strategy to start by solving the problem at hand in the most straightforward manner possible, instrument2 it to find any computational bottlenecks, and then look for ways to improve the computational complexity of those parts of the program contributing to the bottlenecks.
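
For instance, one simple way to instrument a Python program is the standard library's cProfile module (slowFunction here is just a hypothetical stand-in for a suspected bottleneck):

import cProfile

def slowFunction():
    # A stand-in for some computation we suspect is a bottleneck.
    return sum(i*i for i in range(10**6))

# Prints how many times each function was called and how much time it took,
# which helps locate the parts of the program worth optimizing.
cProfile.run('slowFunction()')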

10.1 Search Algorithms

What is a search algorithm?

A search algorithm is a method for finding an item or group of items with specific properties within a collection of items.

We refer to the collection of items as a search space. The search space might be something concrete, such as a set of electronic medical records, or something abstract, such as the set of all integers.

Are search algorithms useful in solving problems?

Many of the algorithms presented earlier in this book can be viewed as search algorithms. In Chapter 3, we formulated finding an approximation to the roots of a polynomial as a search problem, and looked at three algorithms (exhaustive enumeration, bisection search, and Newton-Raphson) for searching the space of possible answers.

What is the specification of search(L, e)?

def search(L, e): 
    """
    Assumes L is a list.
    Returns True if e is in L and False otherwise
    """

In the following code, why is the complexity “at best” linear in the length of L when the element e is not in the list?

for i in range(len(L)):
    if L[i] == e:
        return True
return False
  • It will be linear only if each operation inside the loop can be done in constant time.
  • That raises the question of whether Python retrieves the ith element of a list in constant time.
  • Since our model of computation assumes that fetching the contents of an address is a constant-time operation, the question becomes whether we can compute the address of the ith element of a list in constant time.

In Python, what structure is a list represented as?

A length (the number of objects in the list) and a sequence of fixed-size pointers to objects.

[Figure: a list represented as a length field followed by a sequence of fixed-size pointers to objects]
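
A rough sketch of the constant-time address arithmetic this representation allows (the pointer size and function below are illustrative, not CPython's actual internals):

POINTER_SIZE = 4 # illustrative size, in units of memory, of one fixed-size pointer

def addressOfIthPointer(start, i):
    # Because every pointer occupies the same amount of space, the address
    # of the i-th pointer can be computed with one multiplication and one
    # addition, independent of i and of the length of the list.
    return start + i*POINTER_SIZE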

What is the “indirection” implementation technique?

Generally speaking, indirection involves accessing something by first accessing something else that contains a reference to the thing initially sought. This is what happens each time we use a variable to refer to the object to which that variable is bound. When we use a variable to access a list and then a reference stored in that list to access another object, we are going through two levels of indirection. It has often been said that “any problem in computing can be solved by adding another level of indirection.” 【Really, or just kidding?】
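
A minimal illustration of two levels of indirection in Python:

x = 'abc'         # the variable x refers to a str object
L = [x, 'def']    # the list stores references to objects, not the objects themselves
print(L[0])       # one level of indirection to reach the list, a second to reach 'abc'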

Getting back to the problem of implementing search(L, e), is O(len(L)) the best we can do?

Yes, if we know nothing about the relationship of the values of the elements in the list and the order in which they are stored. In the worst case, we have to look at each element in L to determine whether L contains e.

【The answer means: “No, if we know something about the relationship of the values of the elements in the list and the order in which they are stored.”】

def search(L, e):
    """Assumes L is a list, the elements of which are in
    ascending order.
    Returns True if e is in L and False otherwise"""
    for i in range(len(L)):
        if L[i] == e:
            return True
        if L[i] > e:
            return False
    return False

This would improve the average running time. However, it would not change the worst-case complexity of the algorithm, since in the worst case each element of L is examined.

If we can assume the list has an ordering, can we get a considerable improvement in the worst-case complexity?

def search(L, e):
    """
    Assumes L is a list, the elements of which are in
    ascending order.
    Returns True if e is in L and False otherwise
    """

    def bSearch(L, e, low, high): 
        # The decrementing function is high - low; it shrinks on each recursive call
        if high == low:
            return L[low] == e 
        mid = (low + high)//2 
        if L[mid] == e:
            return True 
        elif L[mid] > e:
            if low == mid: #nothing left to search 
                return False
            else:
                return bSearch(L, e, low, mid - 1)
        else:
            return bSearch(L, e, mid + 1, high)

    if len(L) == 0: 
        return False
    else:
        return bSearch(L, e, 0, len(L) - 1)

The idea is simple:

  1. Pick an index, i, that divides the list L roughly in half.
  2. Ask if L[i] == e.
  3. If not, ask whether L[i] is larger or smaller than e.
  4. Depending upon the answer, search either the left or right half of L for e.
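
A quick check of the binary search version of search defined above:

L = [1, 3, 5, 7, 9, 11]
print(search(L, 7))  # True
print(search(L, 8))  # False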

Should search be modified to check that the assumption is satisfied?

This might eliminate a source of errors, but it would defeat the purpose of using binary search, since checking the assumption would itself take O(len(L)) time.

Why does the code use mid+1 rather than mid in the second recursive call?

mid is computed as mid = (low + high)//2, which rounds down, so mid can equal low. If the second recursive call passed mid, high rather than mid + 1, high, then whenever high is exactly one bigger than mid the recursive call would be made with exactly the same arguments and would compute the same mid again. The decrementing function high - low would be no smaller than it was on entry to the instance of the function making the call, so the recursion would never terminate.

For example, trace a variant in which the ±1 adjustments are dropped, i.e., the recursive calls pass (low, mid) and (mid, high):

L = [0, 1, 2, 3, 4, 5]
e = 2.5

Call 1:

l = 0, h = 5
m = (0 + 5)//2 = 2
L[m=2] = 2 < e = 2.5

Call 2:

l = 2, h = 5
m = 3
L[m=3] = 3 > e = 2.5

Call 3:

l = 2, h = 3
m = 2
L[m=2] = 2 < e = 2.5

Call 4:

l = 2, h = 3
m = 2
L[m=2] = 2 < e = 2.5

Call 4 is identical to call 3: high - low never decreases, so the recursion never terminates.
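
A sketch of the hypothetical buggy variant (only the second recursive call changed from the book's code), showing why it never terminates once it reaches the state of call 3 above:

def bSearchBuggy(L, e, low, high):
    # Identical to bSearch except that the last recursive call passes mid
    # instead of mid + 1.
    if high == low:
        return L[low] == e
    mid = (low + high)//2
    if L[mid] == e:
        return True
    elif L[mid] > e:
        if low == mid: # nothing left to search
            return False
        else:
            return bSearchBuggy(L, e, low, mid - 1)
    else:
        # When high == mid + 1, this call repeats the same arguments forever.
        return bSearchBuggy(L, e, mid, high)

# bSearchBuggy([0, 1, 2, 3, 4, 5], 2.5, 2, 3) keeps calling itself with
# (low, high) = (2, 3) and eventually raises RecursionError.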

10.2 Sorting Algorithms

How can we use induction to reason about the loop invariant in selection sort?

def selSort(L):
    """
    Assumes that L is a list of elements that can be compared using >.
    Sorts L in ascending order.
    """
    suffixStart = 0
    while suffixStart != len(L):
        # Look at each element in the suffix.
        for i in range(suffixStart, len(L)):
            if L[i] < L[suffixStart]:
                # Swap the positions of the two elements.
                L[suffixStart], L[i] = L[i], L[suffixStart]
        suffixStart += 1

The loop invariant in selection sort: given a partitioning of the list into a prefix (L[0:i]) and a suffix (L[i:len(L)]), the prefix is sorted and no element in the prefix is larger than the smallest element in the suffix.

  • Base case: At the start of the first iteration, the prefix is empty, i.e., the suffix is the entire list. Therefore, the invariant is (trivially) true.
  • Induction step: At each step of the algorithm, we move one element from the suffix to the prefix. We do this by appending a minimum element of the suffix to the end of the prefix. Because the invariant held before we moved the element, we know that after we append the element the prefix is still sorted. We also know that since we removed the smallest element in the suffix, no element in the prefix is larger than the smallest element in the suffix.
  • Termination: When the loop is exited, the prefix includes the entire list, and the suffix is empty. Therefore, the entire list is now sorted in ascending order.
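
A quick check of selSort as defined above (it sorts the list in place and returns None):

L = [3, 5, 2, 1, 4]
selSort(L)
print(L)  # [1, 2, 3, 4, 5]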

What are the characteristics of a divide-and-conquer algorithm?

  1. A threshold input size, below which the problem is not subdivided,
  2. The size and number of sub-instances into which an instance is split, and
  3. The algorithm used to combine sub-solutions.

The threshold is sometimes called the recursive base. For item 2 it is usual to consider the ratio of initial problem size to the sub-instance size. In most of the examples we’ve seen so far, the ratio was 2.

How can merge sort be described recursively as a divide-and-conquer algorithm?

  1. If the list is of length 0 or 1, it is already sorted.
  2. If the list has more than one element, split the list into two lists, and use merge sort to sort each of them.
  3. Merge the results.

What is the key observation made by von Neumann in inventing merge sort?

Two sorted lists can be efficiently merged into a single sorted list. The merge process involves only two constant-time operations:

  1. Comparing the values of elements; the number of comparisons is O(len(L)), where L is the longer of the two lists.
  2. Copying elements from one list to another; the number of copy operations is O(len(L1) + len(L2)).

Remaining in list 1   Remaining in list 2   Result
[1,5,12,18,19,20]     [2,3,4,17]            []
[5,12,18,19,20]       [2,3,4,17]            [1]
[5,12,18,19,20]       [3,4,17]              [1,2]
[5,12,18,19,20]       [4,17]                [1,2,3]
[5,12,18,19,20]       [17]                  [1,2,3,4]
[12,18,19,20]         [17]                  [1,2,3,4,5]
[18,19,20]            [17]                  [1,2,3,4,5,12]
[18,19,20]            []                    [1,2,3,4,5,12,17]
[]                    []                    [1,2,3,4,5,12,17,18,19,20]

What is the implementation of merge sort?

def merge(left, right, compare):
    """Assumes left and right are sorted lists and
       compare defines an ordering on the elements.
       Returns a new sorted (by compare) list containing the
       same elements as (left + right) would contain."""
    result = []
    i, j = 0, 0
    while i < len(left) and j < len(right):
        if compare(left[i], right[j]):
            result.append(left[i])
            i += 1 # the element is not removed; the index just moves forward
        else:
            result.append(right[j])
            j += 1
    while i < len(left):
        result.append(left[i])
        i += 1
    while j < len(right):
        result.append(right[j])
        j += 1
    return result

def mergeSort(L, compare = lambda x, y: x < y):
    """Assumes L is a list, compare defines an ordering
       on elements of L.
       Returns a new sorted list with the same elements as L."""
    if len(L) < 2:
        return L[:]
    else:
        middle = len(L)//2
        left = mergeSort(L[:middle], compare)
        right = mergeSort(L[middle:], compare)
        return merge(left, right, compare)

What is the benefit of making the comparison operator a parameter of the mergeSort function?

It gives the caller control over how elements are compared. Suppose we want to sort a list of names written as firstName lastName, e.g., the list ['Chris Terman', 'Tom Brady', 'Eric Grimson', 'Gisele Bundchen']. We can use the functions in the following code to sort the list in two different ways.

def lastNameFirstName(name1, name2):
    arg1 = name1.split(' ')
    arg2 = name2.split(' ')
    if arg1[1] != arg2[1]:
        return arg1[1] < arg2[1]
    else: #last names the same, sort by first name
        return arg1[0] < arg2[0]

def firstNameLastName(name1, name2):
    arg1 = name1.split(' ')
    arg2 = name2.split(' ')
    if arg1[0] != arg2[0]:
        return arg1[0] < arg2[0]
    else: #first names the same, sort by last name
        return arg1[1] < arg2[1]

L = ['Tom Brady', 'Eric Grimson', 'Gisele Bundchen']
newL1 = mergeSort(L, lastNameFirstName)
print('Sorted by last name =', newL1)
## Sorted by last name = ['Tom Brady', 'Gisele Bundchen', 'Eric Grimson']
newL2 = mergeSort(L, firstNameLastName)
print('Sorted by first name =', newL2)
## Sorted by first name = ['Eric Grimson', 'Gisele Bundchen', 'Tom Brady']

10.3 Hash Tables

Is logarithmic the best that we can do for search when we are willing to do some preprocessing?

If we put merge sort together with binary search, we have a nice way to search lists. We use merge sort to preprocess the list in O(n*log(n)) time, and then we use binary search to test whether elements are in the list in O(log(n)) time. If we search the list k times, the overall time complexity is O(n*log(n) + k*log(n)). For comparison, searching an unsorted list k times with linear search would take O(k*n), so once k is large the one-time cost of sorting is amortized over the searches.

Dictionaries use a technique called hashing to do the lookup in time that is nearly independent of the size of the dictionary.

What is the basic idea behind a hash table?

We convert the key to an integer, and then use that integer to index into a list, which can be done in constant time.

In principle, values of any type can be easily converted to an integer. After all, we know that the internal representation of each object is a sequence of bits, and any sequence of bits can be viewed as representing an integer.
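
As a small illustration, Python's built-in hash function already maps objects of many types to integers (note that hash values for strings vary between runs of the interpreter):

print(hash(42))          # for small ints the hash is the int itself
print(hash('cat'))       # some (run-dependent) integer
print(hash((1, 2, 3)))   # tuples of hashable values are hashable too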

What is the function of a hash function?

A hash function maps a large space of inputs (e.g., all natural numbers) to a smaller space of outputs (e.g., the natural numbers between 0 and 5000). Hash functions can be used to convert a large space of keys to a smaller space of integer indices.

What are the characteristics of a good hash function?

Since the space of possible outputs is smaller than the space of possible inputs, a hash function is a many-to-one mapping, i.e., multiple different inputs may be mapped to the same output. When two inputs are mapped to the same output, it is called a collision. A good hash function produces a uniform distribution; i.e., every output in the range is equally probable, which minimizes the probability of collisions.
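
A tiny sketch of such a many-to-one mapping, using a modulus as the hash function (the choice of 11 buckets is arbitrary):

numBuckets = 11
for key in [3, 14, 25, 19]:
    print(key, '->', key % numBuckets)
# 3, 14, and 25 all map to bucket 3 (collisions); 19 maps to bucket 8.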

Can you create a concrete example to illustrate the inner mechanism of hash buckets?

class intDict(object):
    """A dictionary with integer keys"""

    def __init__(self, numBuckets: int):
        """
        Create an empty dictionary as a list of lists of key-value pairs.
        :param numBuckets: int, number of hash buckets. It should be
         related to the number of dict items: not so small that it causes
         many collisions, and not so large that it wastes space.
        :return none: initializes self.buckets as a list of numBuckets empty lists
        """
        self.buckets = []
        self.numBuckets = numBuckets
        # Note: we cannot build the buckets with list repetition.
        # [] * numBuckets is just [], and [[]] * numBuckets would create
        # numBuckets references to the *same* list, so we append a fresh
        # empty list for each bucket.
        for i in range(numBuckets):
            self.buckets.append([])

    def addEntry(self, key, dictVal):
        """
        Add an entry for the (key, dictVal) pair to the buckets, or update it.
        :param key: int, the dict's key.
        :param dictVal: any type, the value associated with the key.
        :return none: ensures the appropriate bucket contains an entry
         for the (key, dictVal) pair.
        """
        # Recall that i%j returns the remainder when the integer i is
        # divided by j, so it is an integer in range [0, j)
        hashBucket = self.buckets[key % self.numBuckets]
        # Update the (key, dictVal) pair if the key is already present.
        for i in range(len(hashBucket)):
            if hashBucket[i][0] == key:
                hashBucket[i] = (key, dictVal)
                return
        # Otherwise append the (key, dictVal) pair, since the key is new.
        hashBucket.append((key, dictVal))

    def getValue(self, key):
        """
        Get the value associated with key from this dict.
        :param key: int, the dict's key.
        :return: the value associated with key, or None if the key does not exist.
        """
        hashBucket = self.buckets[key % self.numBuckets]
        # Search this bucket for an entry with the key.
        for e in hashBucket:
            if e[0] == key:
                return e[1]
        return None

    def __str__(self):
        """Construct the literal representation of the dict."""
        result = '{'
        for b in self.buckets:
            for e in b:
                result = result + str(e[0]) + ':' + str(e[1]) + ', '
        return result[:-2] + '}' # result[:-2] omits the trailing comma and space
import random
D = intDict(17)
for i in range(20):
    #choose a random int in the range 0 to 10**5 - 1
    key = random.choice(range(10**5))
    D.addEntry(key, i)
print('The value of the intDict is:')
## The value of the intDict is:
print(D)
## {35633:14, 40445:7, 88692:6, 41806:13, 11208:0, 35875:8, 32255:16, 74485:10, 95260:17, 45825:5, 57657:15, 40249:18, 2765:3, 32838:11, 72482:19, 20396:1, 74933:4, 25090:9, 48958:12, 40408:2}
print('\n', 'The buckets are:')
## 
##  The buckets are:
for hashBucket in D.buckets: #violates abstraction barrier
    print(' ', hashBucket)
##   []
##   [(35633, 14)]
##   [(40445, 7)]
##   [(88692, 6), (41806, 13)]
##   []
##   [(11208, 0), (35875, 8)]
##   [(32255, 16)]
##   []
##   [(74485, 10)]
##   [(95260, 17)]
##   [(45825, 5), (57657, 15), (40249, 18)]
##   [(2765, 3), (32838, 11), (72482, 19)]
##   []
##   [(20396, 1)]
##   [(74933, 4)]
##   [(25090, 9), (48958, 12)]
##   [(40408, 2)]

Is a list an efficient way to handle collisions?

There are many other ways to handle collisions, some considerably more efficient than using lists. But this is probably the simplest mechanism, and it works fine if the hash table is large enough relative to the number of elements stored in it, and the hash function provides a good enough approximation to a uniform distribution.

What is the complexity of getValue?

If there were no collisions it would be O(1), because each hash bucket would be of length 0 or 1. But, of course, there might be collisions. If everything hashed to the same bucket, it would be O(n) where n is the number of entries in the dictionary, because the code would perform a linear search on that hash bucket. By making the hash table large enough, we can reduce the number of collisions sufficiently to allow us to treat the complexity as O(1). That is, we can trade space for time.
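
A rough way to see this trade-off with the intDict class above: compare the average bucket length for a small table and a much larger one holding the same number of entries (output varies from run to run because the keys are random):

import random

def avgBucketLen(numBuckets, numEntries = 20):
    D = intDict(numBuckets)
    for i in range(numEntries):
        D.addEntry(random.choice(range(10**5)), i)
    nonEmpty = [b for b in D.buckets if b != []]  # violates the abstraction barrier
    return sum(len(b) for b in nonEmpty)/len(nonEmpty)

print(avgBucketLen(17), avgBucketLen(10000))
# The larger table typically has buckets of length 1, i.e., O(1) lookups.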

What is the tradeoff?

To answer this question, one needs to know a tiny bit of probability; see Chapter 15.

2020-06-11


  1. plausibility: The quality of being plausible; speciousness. To give any plausibility to a scheme. - De Quincey. Syn. – Plausible, Specious. Plausible denotes that which seems reasonable, yet leaves distrust in the judgment. Specious describes that which presents a fair appearance to the view and yet covers something false. Specious refers more definitely to the act or purpose of false representation; plausible has more reference to the effect on the beholder or hearer. An argument may be specious when it is not plausible because its sophistry is so easily discovered.

  2. instrument, v. t. In this context: to add measurement code (e.g., timing or profiling hooks) to a program so that one can observe where it spends its time.