Selection Sort

The Cloistered Monkey

2021-11-23 15:17

Umm… Selection Sort?

Yes, this continues the look at Brute-Force Sorting, begun with the The Bubble Sort. With the selection sort you repeatedly select the smallest item in the unsorted section of the data and place it at the end of the sorted data until you have sorted all the items. How To Think About Algorithms classifies it as a More of the Output algorithm, one of the types of Iterative Algorithms, while Introduction to the Design & Analysis of Algorithms (Levitin) classifies it as a Brute Force algorithm.

Imports

# python
from functools import partial
from random import shuffle

# pypi
from numba import njit
from expects import contain_exactly, equal, expect
from joblib import Parallel, delayed
from numpy.random import default_rng

import altair
import numpy
import pandas

# this project
from bowling.sort.bubble.bubble import bubble
from bowling.sort.selection import selection_counter
from bowling.sort.selection.selection import selection_swaps

# my stuff
from graeae.visualization.altair_helpers import output_path, save_chart
from graeae import Timer

Set Up

random = default_rng(2021)
TIMER = Timer()

SLUG = "selection-sort"
OUTPUT_PATH = output_path(SLUG)
save_it = partial(save_chart, output_path=OUTPUT_PATH)

Parts of the Algorithm

Item	Answer
Specification	This is a Sorting Problem.
Basic Steps	We're going to repeatedly select the smallest item in the unselected.
Measure Progress	We'll know we're progressing by the number of elements we've selected (k).
Loop Invariant	None of the selected elements is larger than any element in the unselected and the selected items are in sorted order. The unselected items contain the largest elements in the list.
What are the Main Steps?	1. Find the smallest unsorted element
	2. Put the selected element at the end of the previously selected element(s)
Making Progress	The number of selected items (k) always goes up after each loop.
Loop Invariant Maintenance	- The previous Loop Invariant tells us that all the unselected items were at least as large or larger than the largest of the selected items.
	- We chose the largest item of the unselected so it's as big or bigger than the biggest selected item.
	- Putting the selected item at the end of the previously selected items maintains the Loop Invariant.
Establish the Loop Invariant	At the start, no items have been selected so the Loop Invariant is vacuously true.
Exit-Condition	We stop when all the items have been selected.
Ending	- Exit-Condition: All items have been selected
	- Loop Invariant: All selected are sorted.
	- So our output has all the original items and they are now sorted, satisfying the Postconditions.

Running Time

$\begin{align} C(n) &= \sum_{i=0}^{n-2} \sum_{j=i+1}^{n-1} 1\\ &= \frac{n(n-1)}{2} \in \Theta{n^2} \end{align}$

The Pseudocode

Algorithm 1 SelectionSort

Input: An array of orderable items

Output: The array sorted in ascending order

procedure SelectionSort($A$)

for $i \gets 0$ to $n - 2$ do

$min \gets i$

for $j \gets i + 1$ to $n - 1$ do

if $A[j] < A[min]$ then

$min \gets j$

end if

end for

Swap $A[i]$ and $A[min]$

end for

end procedure

The outer loop (where i is set) tracks the number of items selected so far (which is the same thing as the size of the sorted section at the beginning of the list). The inner loop goes over the remaining, previously unselected, items in the list and looks for the index of the smallest item, it then puts that smallest item at the beginning of the unselected items and moves what was at the beginning to where the smallest item was. Once the swap is done, the sub-list on the left-side of the list is now bigger by one so the outer loop increments and then we search for the next smallest item, and so on until all the items have been selected (except for the last item) and the list is sorted.

The Implementations

Selection Sort

This will be a straight translation of the pseudocode (or straight-ish).

from collections.abc import MutableSequence
from collections import namedtuple
from typing import Any, Dict

SelectionOutput = namedtuple("SelectionOutput",
                             ["element_count",
                              "comparisons",
                              "swaps",
                              "elements"])
Swaps = Dict[int, list[int]]
Sortable = MutableSequence[Any]

def selection_counter(elements: Sortable) -> SelectionOutput:
    """Does the selection sort on the elements

    Args:
     elements: list of orderable objects

    Returns:
     (number of elements, comparisons, swaps)
    """
    number_of_elements = len(elements)
    comparisons = swaps = 0

    for start_of_unselected in range(number_of_elements - 1):
        smallest_unselected = start_of_unselected
        for next_unselected in range(start_of_unselected + 1,
                                     number_of_elements):
            comparisons += 1
            if elements[next_unselected] < elements[smallest_unselected]:
                smallest_unselected = next_unselected
        swaps += 1
        elements[start_of_unselected], elements[smallest_unselected] = (
            elements[smallest_unselected], elements[start_of_unselected]
        )
    return SelectionOutput(element_count=number_of_elements,
                           comparisons=comparisons,
                           swaps=swaps,
                           elements=elements)

Some Checks

def check(collection: list, n: int, comparisons: int, swaps: int) -> None:
    """Check that the sort worked

    Args:
     collection: the sorted collection
     n: number of elements in the collection
     comparisons: number of comparisons made
     swaps: number of swaps made

    Raises:
     AssertionError: some check didn't match
    """
    expect(n).to(equal(len(collection)))
    runtime = (n * (n - 1))/2
    expect(comparisons).to(equal(runtime))
    expect(swaps).to(equal(n - 1))
    expect(list(collection)).to(contain_exactly(*list(sorted(collection))))
    return

test = [1, 2, 3]
n, comparisons, swaps, _ = selection_counter(test)
check(test, n, comparisons, swaps)

test = [4, 3, 2, 1]
n, comparisons, swaps, _ = selection_counter(test)
check(test, n, comparisons, swaps)

COUNT = 1000
test = random.integers(low=0, high=COUNT, size=COUNT)
n, comparisons, swaps, _ = selection_counter(test)
check(test, n, comparisons, swaps)

Run It

So, let's see how it does. We'll set the selection sort up as a numba function and set up the things to sort so that we can compare it to the bubble sort.

numba_selection = njit(selection_counter)
things_to_sort = [random.integers(low=0, high=count, size=count)
                  for count in range(1, 10**5 + 1, 1000)]

with TIMER:
    elements_comparisons_and_swaps = Parallel(n_jobs=-1)(
        delayed(numba_selection)(thing_to_sort)
        for thing_to_sort in things_to_sort)

Started: 2022-01-11 00:05:00.866956
Ended: 2022-01-11 00:05:41.085584
Elapsed: 0:00:40.218628

Let's plot the comparisons and swaps.

SIZE, COMPARISONS, SWAPS = 0, 1, 2
unzipped = list(zip(*elements_comparisons_and_swaps))
count_frame = pandas.DataFrame({"Elements": unzipped[SIZE],
                                "Selection Comparisons": unzipped[COMPARISONS],
                                "Selection Swaps": unzipped[SWAPS]})

base = altair.Chart(count_frame).mark_point().encode(
    x = "Elements",
)

comparisons = base.encode(
    y="Selection Comparisons",
    tooltip=[altair.Tooltip("Elements", format=","),
             altair.Tooltip("Selection Comparisons", format=","),
             altair.Tooltip("Selection Swaps", format=",")]
).properties(title="Selection Sort Comparisons", width=800, height=250)

swaps = base.mark_point(color="DarkRed").encode(
    x="Elements",
    y="Selection Swaps",
    tooltip=[altair.Tooltip("Elements", format=","),
             altair.Tooltip("Selection Comparisons", format=","),
             altair.Tooltip("Selection Swaps", format=",")]
).properties(title="Selection Sort Swaps", width=800, height=250)

chart = (comparisons & swaps)

save_it(chart, "selection-sort-comparisons")

It's important to note the scale of the y-axes here - when I tried putting the comparisons and swaps on the same plot it was pretty much impossible to see the slope of the swaps, even when zoomed way in. Unlike the Bubble Sort, the Selection Sort's swaps have a linear growth instead of a quadratic growth.

Looking at the Swaps

Here's where it might be a little more interesting. We can do the same exercise we did with the bubble sort and plot the actual swaps to see if we can see the sorting in action.

def selection_swaps(elements: Sortable) -> Swaps:
    """Keeps track of the element indexes as they are swapped

    Args:
     elements: list of orderable elements

    Returns:
     dict mapping element to list of indices where it was in the elements list
    """
    swaps = {element: [index] for index, element in enumerate(elements)}

    number_of_elements = len(elements)

    for start_of_unselected in range(number_of_elements - 1):
        smallest_unselected = start_of_unselected

        for next_unselected in range(start_of_unselected + 1,
                                     number_of_elements):
            if elements[next_unselected] < elements[smallest_unselected]:
                smallest_unselected = next_unselected

        elements[start_of_unselected], elements[smallest_unselected] = (
            elements[smallest_unselected], elements[start_of_unselected]
        )

        # record the location of the elements
        for index, element in enumerate(elements):
            swaps[element].append(index)
    return swaps

Because we're tracking the swaps with a dict there can't be any repetitions in the inputs, so I'll use python instead of numpy to make the randomized input since it seems clearer to me.

COUNT = 50

inputs = list(range(COUNT))
shuffle(inputs)
swaps = selection_swaps(inputs)

track_frame = pandas.DataFrame(swaps)
re_indexed = track_frame.reset_index().rename(columns={"index": "Swap"})
melted = re_indexed.melt(var_name="Value To Sort",
                         value_name="Location In Array", id_vars="Swap")

chart = altair.Chart(melted).mark_line().encode(
    x="Swap",
    y="Location In Array",
    color="Value To Sort:O",
    tooltip=["Swap", "Location In Array", "Value To Sort"]
).properties(
    title="Selection Sort Swaps",
    width=800,
    height=525,
).interactive()

save_it(chart, "selection-sort-swaps")

Since I put in more inputs than I did with the Bubble Sort, the actual swaps aren't so easy to see, here, but the point of this plot is to show the (imaginary) diagonal line running from the bottom left corner up to the upper right. This shows why it's called a "More of the Output" algorithm - with each loop (represented by a "Swap" on the X-axis) one more sorted item is added to the beginning of the list (the bottom of the chart) from the unsorted part (the section above the imaginary diagonal of the chart) until you end up with a sorted list.

Since the Selection Sort always checks all the items in the "unsorted" section, the lengths of the lines above the diagonal as they move up and down don't really have a significance, they're just single swaps. What matters more is that the section below the diagonal is "sorted" and when picking the next item the algorithm has to check every element above the diagonal to find the next smallest item.

Worst Case

COUNT = 50

inputs = list(reversed(range(COUNT)))
swaps = selection_swaps(inputs)

track_frame = pandas.DataFrame(swaps)
re_indexed = track_frame.reset_index().rename(columns={"index": "Swap"})
melted = re_indexed.melt(var_name="Value To Sort",
                         value_name="Location In Array", id_vars="Swap")

chart = altair.Chart(melted).mark_line().encode(
    x="Swap",
    y="Location In Array",
    color="Value To Sort:O",
    tooltip=["Swap", "Location In Array", "Value To Sort"]
).properties(
    title="Selection Sort Locations (Worst Case)",
    width=800,
    height=525,
).interactive()

save_it(chart, "selection-sort-locations-worst-case")

This produces kind of an interesting figure… Since the input list is in exactly the backwards order (from largest element to smallest), and the selection sort traverses the list from start to end and swaps out each element with the smallest element to the right of the current element, we are always swapping out the largest element remaining for the smallest remaining element, so although the algorithm doesn't specifically try to do it, it's putting both the smallest unsorted element in place and the largest unsorted element in place with each swap, so it's actually done sorting at the halfway point. We could, then create a short-circuit version of this as well, to make it quit if there's no element in the unsorted smaller than the elements sorted so far, but as with the bubble sort, this would most-likely only help in outside cases like these.

Sources

Bubble-Sort Revisited

The Cloistered Monkey

2021-11-19 17:14

This is a continuation of a series on the Bubble Sort.

In the Beginning

In the previous bubble sort post I implemented the basic bubble sort, but there is a variation of it where you short-circuit the search if the collection is already sorted which is what I'll look at here. The short-circuiting means that the bubble sort can sometimes do better than other methods, but only when the data is mostly sorted so it's still not a practical algorithm. Why do this, then? Eh, why not?

Some Setup

Imports

# python
from functools import partial

# pypi
# from bokeh.models import HoverTool
from expects import be_below, contain_exactly, equal, expect
from joblib import Parallel, delayed
from numba import njit
from numba.typed import List
from numpy.random import default_rng

import altair
import numpy
import pandas

# this project
from graeae.visualization.altair_helpers import output_path, save_chart
from bowling.sort.bubble import bubble, BubbleOutput

# my other stuff
from graeae import Timer

Stuff For Later

TIMER = Timer()
SLUG = "double-bubble-sort"
OUTPUT_PATH = output_path(SLUG)
random = default_rng(2021)

The Sorters

Bubba Sort

This will be the short-circuiting version of the bubble sort.

Bubba

This is the short-circuiter. It works pretty much the same way with just an extra if-statement to break out of the loop if nothing was swapped.

def bubba(elements) -> BubbleOutput:
    """Sorts the list in place

    Args:
     elements: list of (in-place) sortable elements

    Returns:
     number of elements, count of comparisons
    """
    all_but_one = len(elements) - 1
    comparisons = swaps = 0
    for items_sorted in range(all_but_one):
        swapped_at_least_once = False
        for in_front_of_us in range(all_but_one - items_sorted):
            comparisons += 1
            to_the_right = in_front_of_us + 1
            if elements[to_the_right] < elements[in_front_of_us]:
                (elements[in_front_of_us],
                 elements[to_the_right]) = (elements[to_the_right],
                                          elements[in_front_of_us])
                swapped_at_least_once = True
                swaps += 1
        if not swapped_at_least_once:
            break
    return BubbleOutput(len(elements), comparisons, swaps, elements)

Bubble

I showed how I implement the bubble sort in the previous bubble sort post so, since I imported it up above, I'm just going to run it.

Test It Out

Once again, let's just make sure everything works.

inputs = [1, 2, 3, 4, 7, 6, 5]

expected = list(sorted(inputs))

test_1, test_2 = inputs.copy(), inputs.copy()

bubba_output = bubba(test_1)
original_output = bubble(test_2)

expect(test_1).to(contain_exactly(*expected))
expect(test_2).to(contain_exactly(*expected))

n = len(inputs)
worst = (n * (n - 1))/2
expect(bubba_output.comparisons).to(be_below(worst))
expect(bubba_output.comparisons).to(equal(15))

expect(bubba_output.swaps).to(equal(original_output.swaps))
expect(original_output.comparisons).to(equal(worst))

# try a bigger input
inputs = random.choice(list(range(100)), size=100)
expected = list(sorted(inputs))

test_1, test_2 = inputs.copy(), inputs.copy()

bubba_output = bubba(test_1)
original_output = bubble(test_2)

expect(list(test_1)).to(contain_exactly(*expected))
expect(list(test_2)).to(contain_exactly(*expected))

n = len(inputs)
worst = (n * (n - 1))/2
expect(bubba_output.comparisons).to(be_below(worst))
expect(original_output.comparisons).to(equal(worst))

expect(bubba_output.swaps).to(equal(original_output.swaps))
expect(original_output.comparisons).to(equal(worst))

Counting Comparisons

Since the actual method of sorting is the same the swaps should be the same in either case (the new version doesn't quit until there's no more swaps to be done) so I'll just look at the comparisons and see if it made any real difference.

Set Up Numba

bubba = njit(bubba, nogil=True)
bubble = njit(bubble, nogil=True)

Run The Bubble Counter

things_to_sort = [random.choice(list(range(count)), size=count)
                  for count in range(1, 10**5+ 1, 1000)]

with TIMER:
    bubbles_counts_and_comparisons = Parallel(n_jobs=-1)(
        delayed(bubble)(thing_to_sort)
        for thing_to_sort in things_to_sort)

Started: 2022-01-12 23:40:18.730737
Ended: 2022-01-12 23:41:51.143272
Elapsed: 0:01:32.412535

Run the Bubba Counter

with TIMER:
    bubbas_counts_and_comparisons = Parallel(n_jobs=-2)(
        delayed(bubba)(thing_to_sort)
        for thing_to_sort in things_to_sort)

Started: 2022-01-12 23:41:58.766180
Ended: 2022-01-12 23:43:38.475355
Elapsed: 0:01:39.709175

Time-wise it seems to have done about the same as the original bubble sort.

SIZE, COMPARISONS = 0, 1
bubble_unzipped = list(zip(*bubbles_counts_and_comparisons))
bubba_unzipped = list(zip(*bubbas_counts_and_comparisons))
bubba_frame = pandas.DataFrame({"Elements": bubble_unzipped[SIZE],
                                "Bubble": bubble_unzipped[COMPARISONS],
                                "Bubba": bubba_unzipped[COMPARISONS]})

melted = bubba_frame.melt(id_vars=["Elements"],
                          var_name="Sorter",
                          value_name="Comparisons")

chart = altair.Chart(melted).mark_point().encode(
    x="Elements",
    y="Comparisons",
    color="Sorter",
    tooltip=[altair.Tooltip("Elements", format=","), "Sorter",
             altair.Tooltip("Comparisons", format=",")]
).properties(
    title="Bubble Sort Comparisons (with and without short-circuiting)",
    width=900,
).interactive()

save_chart(chart, "bubble_comparisons", output_path=OUTPUT_PATH, height=400)

At first glance it looks like there's only one grey set of points, but if you zoom in (using the mouse's scroll wheel) you'll see that the grey points are actually (sometimes) created by adjacent points with the short-circuiting version's point (the blue point) slightly lower than the non-short-ciruiting orange point. So there is a difference, but it's small enough that it isn't easy to see.

Let's see if it becomes more obvious using the difference between the points instead.

bubba_frame["Difference"] = (bubba_frame["Bubble"] -
                             bubba_frame["Bubba"])

chart = altair.Chart(bubba_frame).mark_trail().encode(
    x="Elements:O",
    y="Difference:Q",
    size="Difference:Q",
    tooltip=[altair.Tooltip("Elements", format=","),
             altair.Tooltip("Difference", format=","),
             altair.Tooltip("Bubble", format=","),
             altair.Tooltip("Bubba", format=",")]
).properties(title="Difference In Comparisons Between the Bubble Sorts",
             height=500,
             width=900).interactive()

save_chart(chart, "bubble_differences", output_path=OUTPUT_PATH, height=600)

So, it looks like there are sometimes significant differences between the two sorts, but how do they appear given the total number of comparisons made?

bubba_frame["Difference/Comparisons"] = bubba_frame.Difference/bubba_frame.Bubba
sub_bubba = bubba_frame[["Elements", "Difference/Comparisons"]]
chart = altair.Chart(sub_bubba).mark_trail().encode(
    x="Elements:O",
    y="Difference/Comparisons:Q",
    size="Difference/Comparisons:Q",
    tooltip=[altair.Tooltip("Elements", format=","),
             altair.Tooltip("Difference/Comparisons", format=".4f")]
).properties(title="Proportion of Difference Between the Bubble Sorts",
             height=500,
             width=900).interactive()

save_chart(chart, "bubble_difference_proportions", output_path=OUTPUT_PATH,
           height=600)

It looks like as the amount of sorting that's needed goes up, the difference made by the short-circuiting becomes smaller in comparison to the total number of comparisons made, but even the biggest effect proportional to the amount of comparisons made by the short-circuiting sorter only comes to the difference being 0.2% of the total comparisons.

Best Case, Worst Case

Now that we know that the short-circuiting isn't that big a deal I'll add some plotting showing how our random cases compare to the best-case and worst case sorting inputs.

best_things = [numpy.arange(count, dtype=int) for count in range(1, 10**5+ 1, 1000)]
worst_things = [numpy.flip(things) for things in best_things]

with TIMER:
    worst_output = Parallel(n_jobs=-1)(
        delayed(bubba)(thing) for thing in worst_things
)

Started: 2022-01-13 00:01:27.932132
Ended: 2022-01-13 00:02:08.108278
Elapsed: 0:00:40.176146

with TIMER:
    best_output = Parallel(n_jobs=-1)(
        delayed(bubba)(thing) for thing in best_things
)

Started: 2022-01-13 00:08:59.231664
Ended: 2022-01-13 00:09:00.409103
Elapsed: 0:00:01.177439

worst_unzipped = list(zip(*worst_output))
best_unzipped = list(zip(*best_output))

del bubba_frame["Bubble"]
del bubba_frame["Difference"]
del bubba_frame["Difference/Comparisons"]
bubba_frame = bubba_frame.rename(columns={"Bubba": "Random"})

bubba_frame["Worst Case"] = worst_unzipped[COMPARISONS]
bubba_frame["Best Case"] = best_unzipped[COMPARISONS]

melted = bubba_frame.melt(id_vars=["Elements"],
                          var_name="Input",
                          value_name="Comparisons")

chart = altair.Chart(melted).mark_point().encode(
    x="Elements",
    y="Comparisons",
    color="Input"
).properties(
    width=800,
    height=550,
    title="Bubble Sort Best, Worst, and Random Input Comparisons"
).interactive()

save_chart(chart, "best-worst-random", output_path=OUTPUT_PATH, height=600)

Once again we run into the problem/case that it looks like the Random input is missing, but if you zoom way in you'll see it does slightly better than the worst-case, but the difference is so small that when you zoom back out they look like they did exactly the same. The best-case does much better, since it only has to pass through the data once to see that it's already sorted and then quit (which isn't really reflected in the plot because our increment of comparisons only happens if the inner loop is entered), but except for this unusual case, random inputs don't do enough better than the worst-case examples to make short-circuiting a noteworthy improvement.

The End, The End

Well, I think I've beaten this dead horse enough for now. The overview for all the posts is here, if you didn't get enough of it. Up next: Selection Sort.

Bubble Sort: An Empirical Look

The Cloistered Monkey

2021-11-17 10:57

This is a continuation of a series on the Bubble Sort. In the previous post I made an implementation of the Bubble Sort in python. In this post I'll run it with inputs of varying sizes to see how its runtime changes as well as do a plot to get a sense of how it rearranges the items its sorting as it works.

An Empirical Look

Using math to figure out the number of comparisons is nice and all, but let's do some counting using real code. I'm going to re-implement the bubble sort but this time keeping track of the comparisons and swaps.

Some Setup

# python
from collections.abc import MutableSequence
from collections import namedtuple
from functools import partial

import random
from numpy.random import default_rng

# pypi
from bokeh.models import HoverTool
from expects import be_above_or_equal, contain_only, equal, expect
from joblib import Parallel, delayed
from numba import njit

import hvplot.pandas
import pandas

# my stuff
from graeae import EmbedHoloviews, Timer

Stuff For Later

TIMER = Timer()
SLUG = "bubble-sort-an-empirical-look"
PATH = f"files/posts/{SLUG}/"
Embed = partial(EmbedHoloviews, folder_path=PATH)

Numba Random

numba_random = default_rng(2021)

A Counter

This is going to be a reimplementation of the Bubble Sort function but with variables added to keep track of the counts. I originally did this class-based so I could re-use the code but once I switched from pypy to numba that started to get problematic, and the functions are short anyway so I guess it's all the same.

Some Types

ElementCount = int
ComparisonCount = int
SwapCount = int
SortedElements = MutableSequence
Counts = tuple[ElementCount, ComparisonCount, SwapCount, SortedElements]

BubbleOutput = namedtuple("BubbleOutput",
                          [
                              "element_count", "comparisons",
                              "swaps",
                              "elements"])

The Sorting Function (With Counts)

def bubble(elements: MutableSequence) -> Counts:
    """Sorts the list in place

    Args:
     elements: list of (in-place) sortable elements

    Returns:
     number of elements, count of comparisons, count of swaps, sorted elements
    """
    all_but_one = len(elements) - 1
    comparisons = swaps = 0
    for items_sorted in range(all_but_one):
        for in_front_of_us in range(all_but_one - items_sorted):
            comparisons += 1
            to_the_right = in_front_of_us + 1
            if elements[to_the_right] < elements[in_front_of_us]:
                (elements[in_front_of_us],
                 elements[to_the_right]) = (elements[to_the_right],
                                          elements[in_front_of_us])
                swaps += 1
    return BubbleOutput(len(elements), comparisons, swaps, elements)

One thing to note is that in the algorithm the outer loop stops at $n-2$ , which it also does here, but since the range function doesn't include the last number our argument is only $n-1$ , which might look confusing at first, and might sound even more confusing when you read this, but the point is they're the same even though they might not look exactly the same.

Testing the Counter

def check_non_decreasing(elements: list) -> None:
    """Checks that all the elements in the list are non-decreasing

    Args:
     elements: list of comparable sorted items

    Raises:
     AssertionError if something is out of order
    """
    last_one = elements[0]
    for index in range(1, n):
        this_one = elements[index]
        expect(this_one).to(be_above_or_equal(last_one))
        last_one = this_one
    return

A Random Input

Because we aren't using any kind of short-circuiting to stop once the data is sorted, it will always loop $\Omega(n^2)$ , but the number of swaps will depend on the input.

n = 1000
inputs = random.choices(list(range(n)), k=n)

output = bubble(inputs)

sorted_elements = output.elements
check_non_decreasing(sorted_elements)

expected_runtime = expected_swaps = (n * (n - 1))/2
expect(output.comparisons).to(equal(expected_runtime))

The Worst Case

In the worst case where it's in the exact opposite sorted order (non-increasing instead of non-decreasing) the number of comparisons should equaly the number of swaps.

output = bubble(list(reversed(list(range(n)))))
check_non_decreasing(output.elements)
expect(output.comparisons).to(equal(expected_runtime))
expect(output.swaps).to(equal(expected_swaps))

The Tracker

To visualize what the sort is doing I'm going to update the bubble sort to keep track of the order of the items as it's being sorted. I'm using a dictionary to map the list values to their locations so we can't use inputs where there are repeated elements or more than one list of locations will be mapped to a element value.

IndexHistory = list[int]
ElementValue = int
Swaps = dict[ElementValue, IndexHistory]

def swap_tracker(elements: MutableSequence) -> Swaps:
    """Does the bubble-sort and tracks the locations

    Args:
     elements: list of orderable items

    Returns:
     dict of element value: list of indices it was at during sort
    """
    all_but_one = len(elements) - 1

    swaps = {element: [index] for index, element in enumerate(elements)}

    for items_sorted in range(all_but_one):
        for in_front_of_us in range(all_but_one - items_sorted):
            to_the_right = in_front_of_us + 1
            if elements[to_the_right] < elements[in_front_of_us]:
                (elements[in_front_of_us],
                 elements[to_the_right]) = (elements[to_the_right],
                                          elements[in_front_of_us])
                for index, element in enumerate(elements):
                    swaps[element].append(index)
    return swaps

First a little sanity check just to make sure it still works.

inputs = [6, 3, 4, 1]

swaps = swap_tracker(inputs)
expect(len(swaps)).to(equal(len(inputs)))

n = 100
inputs = list(range(n))
random.shuffle(inputs)

swaps = swap_tracker(inputs)
expect(swaps.keys()).to(contain_only(*inputs))

check_non_decreasing(inputs)

Try Them Out

Comparisons

Let's look at how the comparisons and swaps change as the input gets bigger. To speed this up I'm going to run the sort in numba.

numba_bubble = njit(bubble)

runs = {}
things_to_sort = [numba_random.integers(low=0, high=count, size=count)
                  for count in range(1, 10**5+ 1, 1000)]

with TIMER:
    comparisons_and_swaps = Parallel(n_jobs=-1)(
        delayed(numba_bubble)(thing_to_sort)
        for thing_to_sort in things_to_sort)

Started: 2021-12-22 02:49:12.003456
Ended: 2021-12-22 02:50:44.485043
Elapsed: 0:01:32.481587

Now we'll plot it.

SIZE, COMPARISONS, SWAPS, ELEMENTS = 0, 1, 2, 3
unzipped = list(zip(*comparisons_and_swaps))
bubba_frame = pandas.DataFrame({"Elements": unzipped[SIZE],
                                "Comparisons": unzipped[COMPARISONS],
                                "Swaps": unzipped[SWAPS]})
bubba_frame["n^2"] = bubba_frame["Elements"]**2
tooltips_comparisons = [
    ("Elements", "@Elements{0,}"),
    ("Comparisons", "@Comparisons{0,}")
]

tooltips_swaps = [
    ("Elements", "@Elements{0,}"),
    ("Swaps", "@Swaps{0,}")
]

tooltips_n2 = [
    ("Elements", "@Elements{0,}"),
    ("n^2", "@{n^2}{0,}")
]

hover_comparisons = HoverTool(tooltips=tooltips_comparisons)
hover_swaps = HoverTool(tooltips=tooltips_swaps)
hover_n2 = HoverTool(tooltips=tooltips_n2)

swap_plots = bubba_frame.hvplot(x="Elements", y="Swaps").opts(
    tools=[hover_swaps])
comparison_plots = bubba_frame.hvplot(x="Elements", y="Comparisons").opts(
    tools=[hover_comparisons])
n_squared_plot = bubba_frame.hvplot(x="Elements", y="n^2").opts(
    tools=[hover_n2])

plot = (swap_plots * comparison_plots * n_squared_plot).opts(
    title="Comparisons, Swaps and n-squared Counts",
    height=700, width=800)
output = Embed(plot=plot, file_name="bubble_sort_comparisons")()

The top line (yellow) is the square of the size of the inputs, the middle (red) is the number of comparisons, and the bottom line (blue) is the number of swaps. If you hover over the lines you can see that each line is roughly double the one below it - there are around twice as many comparisons as swaps for a given input and $n^2$ is around twice as big as the comparison count for a given input.

Swaps

The comparisons and swaps are pretty much what we expected to see, they just confirm the theoretical assessment, but now let's look at plotting the swaps as they occur to see if we can understand what the bubble sort is doing.

COUNT = 20
inputs = random.sample(list(range(COUNT)), k=COUNT)
swaps = swap_tracker(inputs)

# swaps = {str(key): value for key, value in tracker.swaps.items()}
track_frame = pandas.DataFrame(swaps)
re_indexed = track_frame.reset_index().rename(columns={"index": "Swap"})
melted = re_indexed.melt(var_name="Value To Sort", value_name="Location In Array", id_vars="Swap")

tooltips = [
    ("Item to Sort", "@{Value To Sort}"),
    ("Swap", "@{Swap}"),
    ("Current Location", "@{Location In Array}")
]

hover = HoverTool(tooltips=tooltips)

ticks = [(index, index) for index in range(COUNT)]
plot = melted.hvplot(x="Swap", y="Location In Array",
                     by="Value To Sort").opts(tools=[hover],
                                              show_legend=False,
                     width=800, height=700, yticks=ticks,
                            title="Bubble Sort Swaps",)


output = Embed(plot=plot, file_name="bubble_sort_swaps")()

Aside:

HoloViews seems to not let you set the Tooltips if you use multiple columns, which is why I went through all the rigamarole of melting it. If you just plot it as the DataFrame with each column being one of the tracked locations for a sort value (e.g. the column name is '1' and the rows are the positions in the array at each swap) then the plot comes out okay, but the labels are kind of confusing.

The plot is hopefully a useful way to figure out what's going on. If you look at the unsorted values you can see that once they are the largest of the unsorted values, they "bubble up" in a diagonal but straight line. Before this plot I would have said that the largest elements are the ones that get sorted first, but if you look at the plot (assuming I don't re-run it and change the arrangements) and in particular you look at the least-valued elements (1 and 2) you can see that they reach their final position fairly early, just by virtue of being in a position to get pushed down.

Worst Case

The random-input gives an interesting view of how the algorithm might work in practice, but let's look at the worst-case input where the values are in the opposite of the sorted order.

COUNT = 20
inputs = list(reversed(range(COUNT)))
swaps = swap_tracker(inputs)

track_frame = pandas.DataFrame(swaps)
re_indexed = track_frame.reset_index().rename(columns={"index": "Swap"})
melted = re_indexed.melt(var_name="Value To Sort", value_name="Location In Array", id_vars="Swap")


ticks = [(index, index) for index in range(COUNT)]
plot = melted.hvplot(x="Swap", y="Location In Array", cmap="blues",
                     by="Value To Sort").opts(show_legend=False,
                     width=800, height=700, yticks=ticks,
                            title="Bubble Sort Swaps (Worst Case)",)


output = Embed(plot=plot, file_name="bubble_sort_worst_swaps")()

This image gives an even better sense of the way that the bubble sort works. Since the Bubble Sort uses a left-to-right traversal and swapping as you go, the largest values shoot up to their final positions in straight lines, while the lesser values get pushed down a little with each traversal until they reach the correct position.

Onward

The final post in this series (maybe) will look at a variation on the Bubble Sort that can improve the performance in special cases.

Bubble Sort: The Implementation

The Cloistered Monkey

2021-11-17 05:06

This is part of a series on Bubble Sort that began with a look at the Bubble Sort Algorithm and continues now with a python translation of the algorithm.

Some Setup

Imports

# python
from random import choices

# pypi
from expects import contain_exactly, equal, expect

Bubble Sort

The Implementation

Here's a straight-forward translation of the pseudocode to python.

def bubble_sort(unsorted: list) -> list:
    """Sorts the unsorted

    Args:
     unsorted: mutable collect of orderable items

    Returns:
     original list sorted in ascending order
    """
    all_but_one = len(unsorted) - 1
    for items_bubbled_up in range(all_but_one):
        for left_hand in range(all_but_one - items_bubbled_up):
            right_hand = left_hand + 1
            if unsorted[right_hand] < unsorted[left_hand]:
                (unsorted[left_hand],
                 unsorted[right_hand]) = (unsorted[right_hand],
                                          unsorted[left_hand])
    return unsorted

Try It Out

inputs = [6, 3, 4, 5, 0]
expect(bubble_sort(inputs)).to(contain_exactly(0, 3, 4, 5, 6))

population = list(range(1000))
scrambled = choices(expected, k=1000)
expect(bubble_sort(scrambled)).to(contain_exactly(*list(sorted(scrambled))))

Well, that wasn't so exciting, but that's because I'm going to look more at how the sort performs in the next post.

Up Next

In the next post I'll run the function over some inputs to see how it performs.

Bubble Sort: Runtime Explained

The Cloistered Monkey

2021-11-17 03:45

What This Is

The number of comparisons for the Bubble Sort is equal to the number of times the loops run, so we get $\Theta(n^2)$ . I originally only did a truncated version of showing how you get $n^2$ but then I had to struggle a little to remember what was going on when I looked back at it so I'm going to break it down more here.

Start With the Loops

We have an outer loop that runs from $0\ldots n-2$ and an inner loop that goes from $0 \dots n - 2 - i$ giving us a summation that looks like this

$C(n) = \sum_{i=0}^{n-2} \sum_{j=0}^{n - 2 - i} 1$

It's summing $1$ because there's one comparison in each of the loops.

Decompose the Inner Sum

So first we have to recognize that since we're summing $1$ (the comparison) what we're doing is adding $1$ for every item in the summation which is the equivalent of counting how many numbers there are from $0$ to $n - 2 - i$ . When you have a sequence of consecutive integers (say 5, 6, 7, 8) the way to count the number there are in the sequence is $\textit{last} - \textit{first} + 1$ . So for 5, 6, 7, 8 we have $8 - 5 + 1 = 4$ so there are four numbers in the sequence. If we apply that to the sequence created by the inner summation we get this

$\begin{align} C(n) &= \sum_{i=0}^{n-2} \sum_{j=0}^{n - 2 - i} 1\\ &= \sum_{i=0}^{n-2} (n - 2 - i) - 0 + 1\\ &= \sum_{i=0}^{n-2} n - 1 - i\\ \end{align}$

Break Apart the Polynomial

With summations, if you are summing a polynomial, then you can break additions and subtractions apart into separate summations. Additionally, any multiplications can be moved outside of the summation. So now we get

$\begin{align} C(n) &= \sum_{i=0}^{n-2} n - 1 - i\\ &= n \sum_{i=0}^{n-2} 1 - \sum_{i=0}^{n-2} 1 - \sum_{i=0}^{n-2} i\\ \end{align}$

That Last Term

If you squint at the first two terms of that last line you can see that we're summing up $1$ just like we did with the inner loop so it's going to be a similar outcome just with a different ending point (and we'll also multiply by n for the first term). But that last one is a little trickier. It's a summation of i rather than $1$ so it'll be $0 + 1 + \cdots + (n-1) + (n-2)$ . This is a sequence that happens often enough that if you do this sort of thing a lot you'll just remember it, but I don't do it a lot and I'm rather of a forgetful bent anyway so I'll show how I remember to get it. First you have to remember that the sum of terms is the same no matter the order you put the terms in, so the summation comes out the same even when it's reversed. What we'll do is add the sequence with its reverse term by term we to get something like this

$\begin{array}{ccccccccc} & 0 & + & 1 & + & \cdots & + & (n - 3) & + & (n - 2) \\ + & (n - 2) & + & (n - 3) & + & \cdots & + & 1 & + & 0 \\ \hline & (n - 2) & + & (n - 2) & + & \cdots & + & (n - 2) & + & (n - 2)\\ \end{array}$

And using our counting equation we have

$\begin{align} end - start + 1 &= n - 2 - 0 + 1 \\ &= n - 1 \end{align}$

So we have $n - 1$ terms in that sum we got by adding the reverse (each term being $n - 2$ ), but since we added the sequence with its reverse it's now twice as big as it should be so we need to halve it, and the third term of our summation becomes

$\frac{(n - 1)(n - 2)}{2}$

Put the Three Terms Back Together

Now, going back to where we were in the summation - if do the expansion of the first two terms using the counting equation and add the third term from the previous section, we get this

$\begin{align} C(n) &= n \sum_{i=0}^{n-2} 1 - \sum_{i=0}^{n-2} 1 - \sum_{i=0}^{n-2} i\\ &= n(n - 2 - 0 + 1) - (n - 2 - 0 + 1) - \frac{(n - 1)(n - 2)}{2}\\ &= n(n - 1) - (n - 1)- \frac{n^2 -2n - n + 2}{2}\\ &= (n^2 - n) - (n - 1) - \frac{n^2 - 3n + 2}{2}\\ &= \frac{2(n^2 - n)}{2} - \frac{2(n - 1)}{2} - \frac{n^2 - 3n + 2}{2}\\ &= \frac{2 n^2 - 2n - 2n + 2 - n^2 + 3n -2}{2}\\ &= \frac{n^2 - n}{2} \in \Theta(n^2) \end{align}$

Oy. Now don't forget next time.

Re-Orienting

This is part of a series on Bubble Sort. More specifically, it expands on the post looking at the Bubble Sort Algorithms.

Bubble Sort: The Algorithm

The Cloistered Monkey

2021-11-17 02:45

This is part of a series on the Bubble Sort. In this post we'll look at the algorithm without worrying about the implementation details.

$\def\sc#1{\dosc#1\csod} \def\dosc#1#2\csod{{\rm #1{\small #2}}}$

The Parts

Specification: This is an instance of The Sorting Problem.
Basic Steps: Swap adjacent elements that are out of order.
Measure of Progress: The number of items sorted so far.
The Loop Invariant: The number of times the loop has been performed (k) is the number of elements at the end of the items which are the largest elements in sorted order.
Main Steps: Move from the start of the list swapping adjacent items that are out of order until the largest previously unsorted item is at the beginning of the sorted items.
Make Progress: The main step reduces the unsorted by 1.
Maintain the Loop Invariant:
- $\langle Loop-Invariant \rangle$ : k items at the end of the list are sorted.
- $\lnot \langle Exit-Condition \rangle$ : There is more than one item that is unsorted.
- $\textsc{CodeLoop}$ : The k+1 item was the largest of the unsorted items and is now at the beginning of the sorted items.
Establish the Loop Invariant: Initially no items are swapped.
Exit Condition: Stop when there is only one item in the unsorted items.
Ending:
- $\langle Exit-Condition \rangle$ : All but the smallest item are in the upper list.
- $\langle Loop-Invariant \rangle$ : All the items in the upper list are sorted in non-decreasing order.
- $\langle Post-Condition \rangle$ : The list is sorted in non-decreasing order.

The Algorithm

Pseudocode

The Bubble Sort algorithm is a Brute-Force sort which works by repeatedly traversing the input array, checking adjacent values and swapping them if they are out of order. This has the effect of "bubbling-up" the largest unsorted value, thus the name, probably (see Astrachan, 2003 for some history on it). Here's the algorithm in pseudocode.

Algorithm 2 BubbleSort

Input: An array of orderable items

Output: The array sorted in ascending order

procedure BubbleSort($A$)

for $i \gets 0$ to $n - 2$ do

for $j \gets 0$ to $n - 2 - i$ do

if $A[j+1] < A[j]$ then

swap $A[j]$ and $A[j+1]$

end if

end for

end procedure

If you squint a little you'll see that there's only really four lines of code, two for-loops, an if-then conditional, and a "swap", the rest is noise from the algorithmic formatting.

In translation, the $i$ variable holds the amount of items at the end of the list that have been sorted so far, and the $j$ variable moves from the start of the list up to the end of the unsorted (since the outer loop goes up by one each time and the inner loop is subtracting the outer loop's value when finding where to stop it goes up one less entry with each loop). Then the conditional says that if the item to the right of where we currently are is smaller than the item in front of us, swap the two items so that the bigger item is on the right.

Diagram

nil

The three colors in the diagram represent:

Blue: The outer for-loop
Red: The inner for-loop
Green: The transitioning to and from the inner loop to the outer loop.

A Simple Example

Let's step through how the Bubble Sort works with five items to sort.

The Array To Sort: A = [4, 5, 2, 3, 1]
n = 5
n - 1 = 4
n - 2 = 3

As a Table

# Sorted	All Sorted?	Last Unsorted			Exit Inner Loop?				Swap?
i	i == n - 1	n - 2 - i		j	j > n - 2 - i	A[j]		A[j + 1]	A[j] > A[j + 1]	A
0	$\ne n - 1$	3	$\nless$	0		4	<	5		4, 5, 2, 3, 1
			$\nless$	1		5	>	2	Swap	4, 2, 5, 3, 1
			$\nless$	2		5	>	3	Swap	4, 2, 3, 5, 1
			$\nless$	3		5	>	4	Swap	4, 2, 3, 1, 5
			<	4	Exit Loop
1	$\ne n - 1$	2	$\nless$	0		4	>	2	Swap	2, 4, 3, 1, 5
			$\nless$	1		4	>	3	Swap	2, 3, 4, 1, 5
			$\nless$	2		4	>	1	Swap	2, 3, 1, 4, 5
			<	3	Exit Loop
2	$\ne n - 1$	1	$\nless$	0		2	<	3		2, 3, 1, 4, 5
			$\nless$	1		3	>	1	Swap	2, 1, 3, 4, 5
			<	2	Exit Loop
3	$\ne n - 1$	0	$\nless$	0		2	>	1	Swap	1, 2, 3, 4, 5
			<	1	Exit Loop
4	$== n - 1$ (Done)

Some Column Notes:

i is the count of items in the sorted section at the end of the array
i == n - 1: When there's only one item left in the unsorted section there's nothing to swap so the array is sorted.
n - 2 - i is the index of the item at the end of the unsorted section right before the sorted section of the array.
j > n - 2 - i: When j moves into the already sorted section restart the inner loop.
A[j] > A[j + 1]: If the item to the right of j is smaller than the item at j then swap them.

Assessing the Damage

Although it's nice to know that the sort works we're really not as concerned about how correct it is, as much as we are interested in how it performs. There's two things we can count:

The Number of comparisons
The Number of swaps.

The fact that you have those two loops makes it pretty likely that it's going to be $\Theta\left(n^2\right)$ but since Bubble Sort is mostly an academic example let's work it out.

Comparisons

The number of comparisons is equal to the number of times the loops run, so we get $\Theta\left(n^2\right)$ . I made some notes on how I got that in this post.

Swaps

The number of swaps will depend on how the inputs are arranged, but in the worst case where the array is sorted backwards, every comparison will produce a swap so you'll end up with the same bounds as the comparisons.

$\begin{align} S_{worst-case} &= C(n)\\ &= \frac{n^2 - n}{2} \in \Theta(n^2) \end{align}$

Onward

The next post will look at translating the algorithm to python.

Sources

(Levitin, 2007) Introduction to the Design & Analysis of Algorithms (Levitin)
(Astrachan, 2003) Bubble Sort: An Archeological Algorithmic Analysis

How To Think About Algorithms

The Cloistered Monkey

2021-10-21 13:19

Introduction

Part One: Iterative Algorithms and Loop Invariants

Iterative Algorithms: Measures of Progress and Loop Invariants
Examples Using More-of-the-Input Loop Invariants
Abstract Data Types
Narrowing the Search Space: Binary Search
Iterative Sorting Algorithms
Euclid's GCD Algorithm
The Loop Invariant for Lower Bounds

Part Two: Recursion

Abstractions, Technique, and Theory
Some Simple Examples of Recursive Algorithms
Recursion on Trees
Recursive Images
Parsing with Context-Free Grammars

Part Three: Optimization Problems

Definition of Optimization Problems
Graph Search Algorithms
Network Flows and Linear Programming
Greedy Algorithms
Recursive Backtracking
Dynamic Programming Algorithms
Examples of Dynamic Programming
Reductions and NP-Completeness
Randomized Algorithms

Part Four: Appendix

Existential and Universal Quantifiers
Time Complexity
Logarithms and Exponentials
Asymptotic Growth
Adding-Made-Easy Approximations
Recurrence Relations
A Formal Proof of Correctness

Part Five: Exercise Solutions

Conclusion

Citation

[HTTAA] Edmonds J. How to think about algorithms. Cambridge ; New York: Cambridge University Press; 2008. 448 p.

Introduction to the Design & Analysis of Algorithms (Levitin)

The Cloistered Monkey

2021-10-20 16:24

Levitin A. 2007. Introduction to the design & analysis of algorithms. 2nd ed. Boston: Pearson Addison-Wesley.

Main Chapters

Introduction
Fundamentals of the Analysis of Algorithm Efficiency
Brute Force
Divide-and-Conquer
Decrease-and-Conquer
Transform-and-Conquer
Space and Time Tradeoffs
Dynamic Programming
Greedy Technique
Iterative Improvement
Limitations of Algorithm Power
Coping with the Limitations of Algorithm Power
Useful Formulas for the Analysis of Algorithms
Short Tutorial on Recurrence Relations

Essential Algorithms

The Cloistered Monkey

2021-08-12 12:39

Citation

Stephens R. Essential algorithms: a practical approach to computer algorithms using Python and C#. 2nd ed. Indianapolis: John Wiley; 2019.

Introduction To Algorithms (CLRS)

The Cloistered Monkey

2021-08-12 12:39

Chapters

Foundations

The Role of Algorithms in Computing

Algorithms
Algorithms as a technology

Getting Started

Insertion Sort
Analyzing algorithms
Designing algorithms

Growth of Functions

Asymptotic notation
Standard notations and common functions

Divide-and-Conquer

The maximum-subarray problem
Strassen's algorithm for matrix multiplication
The substitution method for solving recurrences
The recursion-tree method for solving recurrences
The master method for solving recurrences
Proof of the master theorem

Probabilistic Analysis and Randomization Algorithms

The hiring problem
Indicator Random Variables
Randomized Algorithms
Probabilistic analysis and further uses of indicator random variables

Sorting and Order Statistics

Heapsort

Heaps
Maintaining the heap property
Building a heap
The heapsort algorithm
Priority queues

Quicksort

Description of quicksort
Performance of quicksort
A randomized version of quicksort
Analysis of quicksort

Sorting in Linear Time

Lower bounds for sorting
Counting sort
Radix sort
Bucket sort

Medians and Order Statistics

Minimum and maximum
Selection in expected linear time
Selection in worst-case linear time

Data Structures

Elementary Data Structures

Stacks and Queues
Linked Lists
Implementing pointers and objects
Representing rooted trees

Hash Tables

Direct-address tables
Hash tables
Hash functions
Open addressing
Perfect hashing

Binary Search Trees

What is a binary search tree?
Querying a binary search tree
Insertion and deletion
Randomly built binary search trees

Red-Black Trees

Properties of red-black trees
Rotations
Insertion
Deletion

Augmenting Data Structures

Dynamic order statistics
How to augment a ddata structure

Advanced Design and Analysis Techniques

Dynamic Programming

Rod cutting
Matrix-chain multiplication
Elements of dynamic programming
Longest common subsequence
Optimal binary search trees

Greedy Algorithms

An activity-selection problem
Elements of the greedy strategy
Huffman codes
Matroids and greedy methods
A task-scheduling problem as a matroid

Amortized Analysis

Aggregate analysis
The accounting method
The potential method
Dynamic tables

Advanced Data Structures

B-Trees

Definition of B-trees
Basic operations on B-trees
Deleting a key from a B-tree

Fibonacci Heaps

Structure of Fibonacci heaps
Mergeable-heap operations
Decreasing a key and deleting a node
Bounding the maximum degree

van Emde Boas Trees

Preliminary approaches
A recursive structure
The van Emde Boas Tree

Data Structures for Disjoint Sets

Disjoint-set operations
Linked-list representations of disjoint sets
Disjoint-set forests
Analysis of uninon by rank with path compression

Graph Algorithms

Elementary Graph Algorithms

Representations of graphs
Breadth-first search
Depth-first search
Topological sort
Strongly connected components

Minimum Spanning Trees

Growing a minimum spanning tree
The algorithms of Kruskal and Prim

Single-Source Shortest Paths

The Bellman-Ford algorithm
Single-source shortest paths in directed acyclic graphs
Dijkstra's algorithm
Difference constraints and shortest paths
Proofs of shortest-paths properties

All-Pairs Shortest Paths

Shortest paths and matrix multiplication
The Floyd-Warshall algorithm
Johnson's algorithm for sparse graphs

Maximum Flow

Flow networks
The Ford-Fulkerson method
Maximum bipartite matching
Push-relabel algorithms
The relabel-to-front algorithm

Selected Topics

Multithreaded Algorithms

The basics of dynamic multithreading
Multithreaded matrix multiplication
Multithreaded merge sort

Matrix Operations

Solving systems of linear equations
Inverting matrices
Symmetric positive-definite matrices and least-squares approximation

Linear Programming

Standard and slack forms
Formulating problems as linear programs
The simplex algorithm
Duality
The initial basic feasible solution

Polynomials and the FFT

Representing polynomials
The DFT and FFT
Efficient FFT Implementations

Number-Theoretic Algorithms

Elementary number-theoretic notions
Greatest common divisor
Modular arithmetic
Solving modular linear equations
The Chinese remainder theorem
Powers of an element
The RSA public-key cryptosystem
Primality testing
Integer factorization

String Matching

The naive string-matching algorithm
The Rabin-Karp algorithm
String matching with finite automata
The Knuth-Morris-Pratt algorithm

Computational Geometry

Line-segment properties
Determining whether any pair of segments intersects
Finding the convex hull
Finding the closest pair of points

NP-Completeness

Polynomial time
Polynomial-time verification
NP-completeness and reducibility
NP-completeness proofs
NP-complete problems

Approximation Algorithms

The vertex-coverc problem
The traveling-salesman problem
The set-covering problem
Randomization and linear programming
The subset-sum problem

Appendix: Mathematical Background

Summations

Summation formulas and properties
Bounding summations

Sets, Etc.

Sets
Relations
Functions
Graphs
Trees

Counting and Probability

Counting
Probability
Discrcete random variables
The geometric and binomial distributions
The tails of the binomial distribution

Matrices

Matrices and matrix operations
Basic matrix properties

Citation

Cormen TH, editor. Introduction to algorithms. 3rd ed. Cambridge, Mass: MIT Press; 2009. 1292 p.

Table of Contents

Umm… Selection Sort?

Imports

Set Up

Parts of the Algorithm

The Pseudocode

The Implementations

Selection Sort

Some Checks

Run It

Looking at the Swaps

Worst Case

Sources

Table of Contents

In the Beginning

Some Setup

Imports

Stuff For Later

The Sorters

Bubba Sort

Bubba

Bubble

Test It Out

Counting Comparisons

Set Up Numba

Run The Bubble Counter

Run the Bubba Counter

Best Case, Worst Case

The End, The End

Table of Contents

An Empirical Look

Some Setup

Stuff For Later

Numba Random

A Counter

Some Types

The Sorting Function (With Counts)

Testing the Counter

A Random Input

The Worst Case

The Tracker

Try Them Out

Comparisons

Swaps

Worst Case

Onward

Table of Contents

Some Setup

Imports

Bubble Sort

The Implementation

Try It Out

Up Next

Table of Contents

What This Is

Start With the Loops

Decompose the Inner Sum

Break Apart the Polynomial

That Last Term

Put the Three Terms Back Together

Re-Orienting

Table of Contents

The Parts

The Algorithm

Pseudocode

Diagram

A Simple Example

As a Table

Assessing the Damage

Comparisons

Swaps

Onward

Sources

Contents

Part One: Iterative Algorithms and Loop Invariants

Part Two: Recursion

Part Three: Optimization Problems

Part Four: Appendix

Part Five: Exercise Solutions

Conclusion