Asymptotic Analysis and Comparison of Sorting Algorithms

Mary Rachael Koenke
11 min read · Jan 3, 2021

What is a Sorting Algorithm? A Sorting Algorithm is used to rearrange a given array or list of elements, usually according to a comparison operator on the elements. The comparison operator is used to decide the new order of elements in the respective data structure.

The efficiency of an algorithm is mainly defined by two factors, space and time. An efficient algorithm is one that takes less time and uses less space, but it is not always possible to achieve both at once. There is often a trade-off between time and space: if you want to reduce the time, the space required might increase, and if you want to reduce the space, the time may increase. You may have to compromise and evaluate the tradeoffs.

Asymptotic Analysis

Asymptotic analysis of an algorithm refers to defining the mathematical framing of its run-time performance. Space and time complexity are generally difficult to compute exactly, and the running time for small inputs is usually inconsequential. Due to this fact, we pay attention to the behavior of the complexity when the input size increases — that is, the asymptotic behavior of the complexity. Using asymptotic analysis, we can conclude the best case, average case, and worst case scenario of an algorithm. The notation Ο(n), called Big O Notation, is the formal way to express the upper bound of an algorithm’s running time.

Big O Notation

What is the upper bound of an algorithm? In terms of time, an algorithm cannot take more time than its upper bound as the input size tends toward a particular value or infinity. Very simply, we can say that big O notation denotes the maximum time (or space) taken by an algorithm, that is, the worst-case time (or space) complexity of an algorithm. This is why big O notation is the most commonly used notation for the time complexity of an algorithm.

Big O notation describes how many steps an algorithm takes based on the number of elements that are acted upon. For example, constant notation shows an algorithm will always take the same number of steps regardless of how much data is passed in. Logarithmic notation shows the number of operations increases by one each time the data is doubled. Quadratic notation illustrates that if the input array has 1 element it will do 1 operation, and if it has 10 elements it will do 100 operations; the number of operations grows with the square of the input size.

The comparison below shows how the number of operations grows for common big O notations.
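
Here is a small, illustrative sketch in JavaScript that prints approximate step counts for a few common complexity classes as the input size grows (the numbers are rough approximations, not measurements):

```javascript
// Approximate step counts for common big O classes at a few input sizes.
// These are illustrative figures only.
const sizes = [1, 10, 100, 1000];

for (const n of sizes) {
  const logN = Math.max(1, Math.ceil(Math.log2(n)));
  console.log(
    `n=${n}: O(1)=1, O(log n)=${logN}, O(n)=${n}, ` +
    `O(n log n)=${n * logN}, O(n^2)=${n * n}`
  );
}
```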

Space Complexity

Space complexity is used to evaluate the use of memory, or data storage. Algorithms require memory to store program instructions, to execute (e.g., the call stack for function calls), and to store data. While these are all contributing factors, the stored variable data is often the primary consideration.

Space complexity is equal to the sum of two components:

  1. A fixed amount of space that is required to store certain data such as variables, constants, program size, etc. and is not dependent on the size of the problem
  2. A variable amount of space that is required by variables and depends entirely on the size of the problem (e.g. recursion stack space and dynamic memory allocation); see the sketch below
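
As a minimal sketch of these two components (hypothetical helper functions, not from the original post), compare an iterative sum, which needs a fixed amount of extra space, with a recursive sum, whose call stack grows with the input:

```javascript
// Fixed space: one accumulator variable, no matter how long the array is (O(1) extra space).
function sumIterative(arr) {
  let total = 0;
  for (const x of arr) total += x;
  return total;
}

// Variable space: each recursive call adds a frame to the call stack (O(n) extra space).
function sumRecursive(arr, i = 0) {
  if (i >= arr.length) return 0;
  return arr[i] + sumRecursive(arr, i + 1);
}

console.log(sumIterative([1, 2, 3])); // 6
console.log(sumRecursive([1, 2, 3])); // 6
```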

Space complexity is usually represented with big O notation, which we know considers the worst-case scenario. What you create takes up space, and we need to be prepared for the worst!

Time Complexity

Time complexity is a computational representation that describes the amount of time it takes for an algorithm to run to completion. Since an algorithm’s running time may vary among different inputs of the same size, we need to consider worst-case time complexity, or the maximum amount of time required for inputs of a given size.

Like space complexity, the actual running time is also affected by factors such as the operating system and hardware, which are completely independent of the algorithm itself.

Time complexity is commonly estimated by counting the number of elementary operations performed by the algorithm, supposing that each elementary operation takes a fixed amount of time to perform. Therefore, the amount of time taken and the number of elementary operations performed by the algorithm are assumed to differ by at most a constant factor, and the algorithm that performs a task in the smallest number of operations is considered the most efficient one.

As with space complexity, time complexity is also usually represented with big O notation, considering the worst-case scenario. What you create takes up time, and we need to be prepared for the worst!

Tradeoffs in Time and Space

The best, or most efficient, algorithm, and hence the best method to solve a given problem, is one that requires less space in memory and takes less time to execute its instructions and generate an output. In practice, it is not always possible to achieve both of these objectives. Realistically, there may be more than one approach to solve the same problem, sometimes three or four solutions. One approach may require more space but take less time to complete its execution, or vice versa. Therefore, we may have to sacrifice one at the cost of the other. This is why it is said that there are time and space tradeoffs among algorithms. There is no rule for which route is “better.” It all depends on which is the priority in the given situation.

Bubble Sort

Bubble Sort is the simplest sorting algorithm. It works by looping over the array, comparing each pair of adjacent elements and swapping them if they are in the wrong order. This process is repeated until the entire array is sorted.

*See information on Stability at the end of the blog
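
A minimal bubble sort sketch in JavaScript (illustrative, not an optimized implementation):

```javascript
// Bubble sort: repeatedly compare adjacent elements and swap them if they
// are out of order, until a full pass makes no swaps.
function bubbleSort(arr) {
  const a = [...arr];                        // sort a copy
  for (let end = a.length - 1; end > 0; end--) {
    let swapped = false;
    for (let i = 0; i < end; i++) {
      if (a[i] > a[i + 1]) {
        [a[i], a[i + 1]] = [a[i + 1], a[i]]; // swap the adjacent pair
        swapped = true;
      }
    }
    if (!swapped) break;                     // no swaps means already sorted
  }
  return a;
}

console.log(bubbleSort([5, 1, 4, 2, 8])); // [1, 2, 4, 5, 8]
```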

Selection Sort

The selection sort algorithm sorts an array by repeatedly finding the minimum element in the unsorted portion of the array and putting it at the beginning of that portion. It does this by assuming the first unsorted element is the smallest one and looping through to compare that value to every other value in the unsorted portion. If it reaches a value smaller than the current minimum, that element is recorded as the new smallest value. When the pass reaches the end, the smallest value found is swapped into place. The loop then starts all over again, treating the second element of the array as the beginning of the unsorted portion. This continues until the entire array is sorted.
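
A minimal selection sort sketch in JavaScript along the same lines:

```javascript
// Selection sort: find the minimum of the unsorted portion and swap it
// to the front, then shrink the unsorted portion by one.
function selectionSort(arr) {
  const a = [...arr];
  for (let start = 0; start < a.length - 1; start++) {
    let minIndex = start;                   // assume first unsorted element is smallest
    for (let i = start + 1; i < a.length; i++) {
      if (a[i] < a[minIndex]) minIndex = i; // record a new smallest value
    }
    if (minIndex !== start) {
      [a[start], a[minIndex]] = [a[minIndex], a[start]];
    }
  }
  return a;
}

console.log(selectionSort([64, 25, 12, 22, 11])); // [11, 12, 22, 25, 64]
```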

Insertion Sort

Insertion sort is a simple sorting algorithm that works similarly to the way you sort playing cards in your hands. The array is virtually split into a sorted and an unsorted part. Values from the unsorted part are picked and placed at the correct position in the sorted part. The algorithm’s name comes from the process of picking an element, inserting it in its correct place, and then repeating this for all elements.

It works by assuming the first element in the array is already sorted. It then selects the next element and compares it to the elements in the sorted portion. Every element in the sorted portion that is greater than the selected element is shifted one position to the right, and the selected element is inserted into the resulting gap. This process is repeated until the array is sorted.
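
A minimal insertion sort sketch in JavaScript:

```javascript
// Insertion sort: grow a sorted prefix by inserting each new element
// into its correct position among the already-sorted elements.
function insertionSort(arr) {
  const a = [...arr];
  for (let i = 1; i < a.length; i++) {
    const current = a[i];
    let j = i - 1;
    while (j >= 0 && a[j] > current) {
      a[j + 1] = a[j];   // shift larger elements one slot to the right
      j--;
    }
    a[j + 1] = current;  // insert into the gap
  }
  return a;
}

console.log(insertionSort([12, 11, 13, 5, 6])); // [5, 6, 11, 12, 13]
```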

Quick Sort

Quick sort is one of the most efficient ways of sorting elements in computer systems. It is known as a divide and conquer algorithm. The term “divide and conquer” means we divide one large problem into several smaller problems and solve the smaller problems to ultimately solve the large problem.

In quick sort, we choose a pivot in the array against which all other elements are compared. A pivot can be selected in different ways: as the first, last, random, or median element. If an element is smaller than the pivot, it is moved before the pivot. If an element is greater than the pivot, it is moved after the pivot. This can be done simultaneously from both ends of the array, with a left pointer (first element) and a right pointer (last element) moving toward each other and swapping values that are on the wrong side. Once this is complete, the process continues recursively on the sub-arrays created by splitting the array around the pivot (partitioning), until the entire array is sorted.
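
A minimal quick sort sketch in JavaScript. Note that it uses the last element as the pivot and a single scanning pointer (Lomuto-style partitioning), which is one of several possible variants of the partitioning scheme described above:

```javascript
// Quick sort: partition around a pivot, then recursively sort each side.
function quickSort(a, lo = 0, hi = a.length - 1) {
  if (lo >= hi) return a;
  const pivot = a[hi];
  let boundary = lo;                           // elements before boundary are < pivot
  for (let i = lo; i < hi; i++) {
    if (a[i] < pivot) {
      [a[i], a[boundary]] = [a[boundary], a[i]];
      boundary++;
    }
  }
  [a[hi], a[boundary]] = [a[boundary], a[hi]]; // move pivot into its final place
  quickSort(a, lo, boundary - 1);
  quickSort(a, boundary + 1, hi);
  return a;
}

console.log(quickSort([10, 7, 8, 9, 1, 5])); // [1, 5, 7, 8, 9, 10]
```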

Merge Sort

Merge sort is also a divide and conquer algorithm. It divides the input array into two halves, calls itself recursively on the two halves, and then merges the two sorted halves.

In merge sort we divide the array into n arrays until each of the arrays contains only one element. The smaller arrays are then merged in sorted order to produce new arrays. This is repeated until only one array remains: the original array in its sorted state.
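
A minimal merge sort sketch in JavaScript:

```javascript
// Merge sort: split the array in half, sort each half recursively,
// then merge the two sorted halves back together.
function mergeSort(arr) {
  if (arr.length <= 1) return arr;
  const mid = Math.floor(arr.length / 2);
  const left = mergeSort(arr.slice(0, mid));
  const right = mergeSort(arr.slice(mid));
  return merge(left, right);
}

function merge(left, right) {
  const result = [];
  let i = 0, j = 0;
  while (i < left.length && j < right.length) {
    // Using <= keeps equal elements in their original order (stable).
    if (left[i] <= right[j]) result.push(left[i++]);
    else result.push(right[j++]);
  }
  return result.concat(left.slice(i)).concat(right.slice(j));
}

console.log(mergeSort([38, 27, 43, 3, 9, 82, 10])); // [3, 9, 10, 27, 38, 43, 82]
```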

Heap Sort

Heap sorting is a way of sorting elements by using the “Heap” data structure. It is actually quite similar to the selection sort technique. In order to understand heap sort, you must first understand what heaps are and how they are defined.

Basically, a heap is a binary tree with some added rules. The first rule is that the binary tree must be complete. This means that every level must be filled, from left to right, before nodes are added to the next level. Secondly, there must be a defined parent/child relationship between the element values of the heap. In a min heap, the value of the parent must be smaller than the values of its children. In a max heap, the value of the parent must be greater than the values of its children.

In heap sort, we build a max heap, which ensures the highest value element is at the top. That top element is then swapped with the last element of the heap, removed from the heap, and treated as part of the sorted portion of the array. This process is repeated until only one element remains in the heap.

One important point is that heaps are not natively supported in JavaScript, therefore we implement heaps using arrays. Although heap sort is complex, its space complexity is excellent.
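
A minimal heap sort sketch in JavaScript, representing the heap as an array as described above:

```javascript
// Heap sort: build a max heap in the array, then repeatedly swap the
// largest element (the root) to the end and shrink the heap by one.
function heapSort(a) {
  const n = a.length;
  // Build the max heap by sifting down every non-leaf node.
  for (let i = Math.floor(n / 2) - 1; i >= 0; i--) siftDown(a, i, n);
  for (let end = n - 1; end > 0; end--) {
    [a[0], a[end]] = [a[end], a[0]]; // move the current max into the sorted tail
    siftDown(a, 0, end);             // restore the heap property in the shrunken heap
  }
  return a;
}

function siftDown(a, i, size) {
  while (true) {
    const left = 2 * i + 1, right = 2 * i + 2;
    let largest = i;
    if (left < size && a[left] > a[largest]) largest = left;
    if (right < size && a[right] > a[largest]) largest = right;
    if (largest === i) return;
    [a[i], a[largest]] = [a[largest], a[i]];
    i = largest;
  }
}

console.log(heapSort([12, 11, 13, 5, 6, 7])); // [5, 6, 7, 11, 12, 13]
```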

Radix Sort

The best time complexity for comparison-based sorting algorithms (all of the algorithms we have discussed thus far) is O(n log n). So how do we do better?

Radix sort is the answer. Radix sort is a non-comparison based sorting algorithm. Instead of comparing elements, the radix sort algorithm distributes integers into buckets based on their individual digits (in a given base, or radix).

The idea of radix sort is to sort digit by digit, starting from the least significant digit and moving to the most significant digit. In other words, the ones place digits are sorted into buckets first, then the tens place, then the hundreds place, and so on. Radix sort uses another sorting method, called counting sort, as a subroutine to complete its sort.

*n is the number of elements and k is the number of bits of memory required to represent the largest element in the array
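
A minimal radix sort sketch in JavaScript, assuming non-negative integers. For simplicity it groups numbers into digit buckets, which serves the same role as the stable counting sort pass mentioned above:

```javascript
// Radix sort (least significant digit first): sort by the ones digit,
// then the tens digit, and so on, with a stable pass per digit.
function radixSort(arr) {
  let a = [...arr];
  const max = Math.max(...a, 0);
  for (let exp = 1; Math.floor(max / exp) > 0; exp *= 10) {
    const buckets = Array.from({ length: 10 }, () => []);
    for (const x of a) {
      buckets[Math.floor(x / exp) % 10].push(x); // bucket by the current digit
    }
    a = buckets.flat();                          // stable: order within buckets is preserved
  }
  return a;
}

console.log(radixSort([170, 45, 75, 90, 802, 24, 2, 66]));
// [2, 24, 45, 66, 75, 90, 170, 802]
```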

A Note on Stability

Stability is mainly important when we have key-value pairs with duplicate keys possible and we wish to sort these objects by their keys. A sorting algorithm is said to be stable if two objects with equal keys appear in the same order in the sorted output as they appear in the input array before sorting. Informally, stability means that equivalent elements retain their relative positions after sorting. When equal elements are indistinguishable, such as with integers, or more generally, any data where the entire element is the key, stability is not an issue. Stability is also not an issue if all keys are different.
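
For example, sorting objects by a duplicate key shows stability in action (Array.prototype.sort is required to be stable in modern JavaScript engines, per the ES2019 specification):

```javascript
// A stable sort keeps "Ann" before "Bob": they compare as equal by age
// and appeared in that order in the input.
const people = [
  { name: "Ann", age: 30 },
  { name: "Bob", age: 30 },
  { name: "Cal", age: 25 },
];

const byAge = [...people].sort((a, b) => a.age - b.age);
console.log(byAge.map(p => p.name)); // ["Cal", "Ann", "Bob"]
```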

Sorting algorithms are not all created equal. Each has strengths and drawbacks, and many of them can be used to solve the same problem. Therefore, make sure you do your research and choose wisely! Happy Coding!
