I recently came across a coding challenge that required writing some code and then making it run more efficiently. The question was as follows:
Given an array of numbers, find the largest difference between an element and any larger element that occurs later in the array.
For example, given [5, 3, 2, 6, 8, 1, 4], the largest difference would be between 2 and 8, or 6. (The difference between 8 and 1 doesn’t qualify, because 1 is smaller than 8.)
One way to find this would be through brute force: compare all possible pairs of numbers, and return the largest difference. In other words, compare:
5-3, 5-2, 5-6, 5-8, 5-1, 5-4,
3-2, 3-6, 3-8, 3-1, 3-4,
2-6, 2-8, 2-1, 2-4,
6-8, 6-1, 6-4,
Here’s one way to write this in Python:
def max_difference(array): differences =  for i in range(len(array)): for j in range (i+1, len(array)): differences.append(array[j] - array[i]) return max(differences)
While this would find the correct solution, it’s not the fastest. Furthermore, the longer the array, the exponentially longer this code takes to run. The runtime of this code, expressed in Big O notation, is O(n2). This is usually the case when code contains nested loops, as above.
Here’s a way to visualize this: the array above contains 7 elements. The pyramid above is about half of a 7×6 square, so it contains (7×6)/2 pairs. So for an array of length n, this code needs to compare (n*n-1)/2 pairs, or approximately n2/2 pairs. Constants aren’t counted when determining runtime, so this simplifies to n2. In other words, the runtime is O(n2).
Is there a way to reduce the runtime to O(n)? In other words, can we code this solution so that adding elements to the array increases the runtime at a slower rate – say, linearly instead of exponentially? Yes. If we could write code that would traverse through the array just once, instead of continually backtracking to compare new pairs, that would do it:
def max_difference(array): current_min = array current_max_diff = 0 for i in range(len(array) - 1): current_min = array[i] if array[i] < current_min else current_min current_max_diff = (array[i+1] - current_min) if (array[i+1] - current_min) > current_max_diff else current_max_diff return current_max_diff
While this code looks longer, it actually runs more quickly, especially for larger n.
This code keeps track of the current minimum value and the current maximum difference. It sets the initial minimum value to the first element of the array and the initial max difference to 0, If it encounters a smaller minimum value or a larger difference, it sets those as the new bases for comparison. Using our example array, [5, 3, 2, 6, 8, 1, 4], it starts with these values:
current_min: 5, current_max_diff: 0
It then starts the iteration. The current_min is 5. The difference between 5 and the next element (3) is -2, which is less than 0, so current_max_diff stays 0.
In the next iteration, it compares 3 to 5; since 3 is smaller, it sets current_min to 3. The difference between 3 and 2 is -1, so current_max_diff still stays 0:
current_min: 3, current_max_diff: 0
In the next iteration, it compares 2 to current_min (3) and sets current_min to 2. The difference between 2 and 6 is 4, so current_max_diff gets set to 4:
current_min: 2, current_max_diff: 4
Next, it compares 6 to current_min (2), and it compares the difference between 2 and 8 (6) to current_max_diff (4), resulting in:
current_min: 2, current_max_diff: 6
These values stay the same until near the end, when current_min becomes 1. Since 4-1 (3) is still less than the current_max_diff of 6, the code ends with current_max_diff equal to 6.
Why does this code run more quickly, and how do we know? Because instead of nesting a loop inside another loop for each element, it uses only only loop per element. It runs n loops, while the original code runs n*(n-1) loops, which is many more loops.
This new code has a runtime of O(n), which is much faster than O(n2), especially as the array increases in size.
This makes for more efficient, faster code, which makes for happier users.