I began holding weekly student workshops at Imperial College London and I find some of the material presented useful for the Codeforces community, so I will also post it here. So, enjoy!
Quicksort
We have already discussed some cool sorts, such as mergesort, which works in O(N log N). We want you to learn not "a sorting algorithm", but the ideas behind some of them.
A technique that often works is finding redundancies in problems. We will illustrate this by trying to build quicksort out of insertion sort. Insertion sort takes each element and puts it in the right place. Take the following example (the first array is the sorted one being constructed and the second holds the remaining elements of the initial array):

~~~~~
[]                    [ 2 , 3 , 1 , 2 , 5 ] →
[ 2 ]                 [ 3 , 1 , 2 , 5 ]     →
[ 2 , 3 ]             [ 1 , 2 , 5 ]         →
[ 1 , 2 , 3 ]         [ 2 , 5 ]             →
[ 1 , 2 , 2 , 3 ]     [ 5 ]                 →
[ 1 , 2 , 2 , 3 , 5 ] [ ]
~~~~~
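As a minimal sketch of this process in Haskell (the quicksort further below is written in the same style; the names `insert` and `insertionSort` are mine, not part of the workshop code):

~~~~~
-- Insert one element into an already sorted list.
insert :: (Ord a) => a -> [a] -> [a]
insert y []     = [y]
insert y (x:xs)
  | y <= x    = y : x : xs
  | otherwise = x : insert y xs

-- Take the elements left to right and insert each one into the
-- sorted prefix, exactly as in the example above.
insertionSort :: (Ord a) => [a] -> [a]
insertionSort = foldl (flip insert) []
~~~~~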
This seems to work fine, but the array [ 5 , 4 , 3 , 2 , 1 ] would be much harder to sort, as we would make 1 + 2 + 3 + 4 = 10 steps. Therefore the complexity is O(N^2). We observe a redundant part in our algorithm: the fact that we have to move a lot of values at each step.
A better approach has to move fewer values at one time, right? Well, at least fewer values in total. The idea is that we do not want to process those long left segments each time, so we want to make them as small as possible.
Instead of moving each element to the "local" right place (that is, the place it is supposed to occupy in the processed array – look at the example before), we will put it in its "global" right place. And that gives us the standard quicksort algorithm with the pivot on the leftmost position.
~~~~~
-- Quicksort with the leftmost element as the pivot.
quicksort :: (Ord a) => [a] -> [a]
quicksort []     = []
quicksort (x:xs) = smaller ++ [x] ++ bigger
  where
    smaller = quicksort [a | a <- xs, a <= x]  -- elements at most the pivot
    bigger  = quicksort [a | a <- xs, a > x]   -- elements greater than the pivot
~~~~~
The algorithm works as follows:

- take the first element in the array (call it the pivot – "x" in the code above)
- split all other elements into two arrays: elements bigger than the pivot and the rest
- recurse with the same algorithm over both arrays
- make the new array out of the two arrays and the pivot
Is this better than insertion sort? Yes, but the answer is not straightforward. The "local" and "global" analogy suggests a better result. But in the case of a descending array it still works in O(N^2).
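For example, with the leftmost pivot on [ 5 , 4 , 3 , 2 , 1 ] every split is maximally unbalanced:

~~~~~
pivot 5: smaller = [ 4 , 3 , 2 , 1 ], bigger = []
pivot 4: smaller = [ 3 , 2 , 1 ],     bigger = []
pivot 3: smaller = [ 2 , 1 ],         bigger = []
pivot 2: smaller = [ 1 ],             bigger = []
~~~~~

so the total work is again (n-1) + (n-2) + … + 1 = O(N^2) steps.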
We achieved a better result, but we found another redundancy: the dependency of the complexity on the ordering of the array. We depend a lot on the first element at each step. What if we take another element, say the second one? That does not really help: we will just depend on the second element at each step instead. We have to take different elements at different steps, therefore we will choose the pivot randomly.
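A sketch of this in the same Haskell style, assuming the `random` package's `System.Random.randomRIO` as the source of randomness (the name `quicksortR` is mine):

~~~~~
import System.Random (randomRIO)

-- Quicksort with a uniformly random pivot. Drawing the pivot forces
-- the function into IO; a pure version could thread a generator instead.
quicksortR :: (Ord a) => [a] -> IO [a]
quicksortR [] = return []
quicksortR xs = do
  i <- randomRIO (0, length xs - 1)          -- random pivot position
  let pivot = xs !! i
      rest  = take i xs ++ drop (i + 1) xs   -- everything except the pivot
  smaller <- quicksortR [a | a <- rest, a <= pivot]
  bigger  <- quicksortR [a | a <- rest, a > pivot]
  return (smaller ++ [pivot] ++ bigger)
~~~~~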
This seems to have no obvious redundancy. Let's analyze the complexity.
Each element is equally likely to be chosen as the pivot. Therefore, the lengths of the two arrays might be (0, n-1), (1, n-2), … , (n-1, 0), each with equal probability. Informally, we expect the cut to be made in the middle, so the expected time will be T(n) = O(n) + 2 * T(n/2). This is the same recurrence as mergesort, giving O(N log N).
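To see informally why this recurrence gives the bound, unroll it for a few levels (a standard expansion, taking T(n) = n + 2 * T(n/2) exactly):

~~~~~
T(n) = n + 2 * T(n/2) = 2n + 4 * T(n/4) = … = k * n + 2^k * T(n / 2^k)
~~~~~

After k = log N levels the subproblems have size 1, so the total is O(N log N).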
Formally, one uses the connection with binary search trees and probabilities to prove the complexity; we will not cover this here.
To summarize, here is a general(ish) approach for solving similar problems:

1. Start from a brute force
2. Find some redundancy
3. Try to eliminate it
4. Analyze the new complexity
5. Stop, or go back to step 2
Kth Element
This is a more specific problem related to quicksort. We are given an array and we want to find its K-th smallest element.
We will do the same steps as in quicksort, with the difference that we recurse into only one segment. We take a random pivot and count the smaller elements. If the number of smaller elements is at least k, we search in the left side; if it is exactly k - 1, the pivot itself is the answer; otherwise we search in the right side.
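A sketch of this selection in the same Haskell style (leftmost pivot for brevity; a random pivot would be drawn as in `quicksortR` above; `kthSmallest` is my name for it, and k is assumed to satisfy 1 <= k <= length of the array):

~~~~~
-- K-th smallest element (1-indexed): recurse into only the side
-- that must contain the answer.
kthSmallest :: (Ord a) => Int -> [a] -> a
kthSmallest k (x:xs)
  | k <= ns     = kthSmallest k smaller            -- answer is among the smaller ones
  | k == ns + 1 = x                                -- the pivot is the k-th smallest
  | otherwise   = kthSmallest (k - ns - 1) bigger  -- skip the smaller ones and the pivot
  where
    smaller = [a | a <- xs, a <= x]
    bigger  = [a | a <- xs, a > x]
    ns      = length smaller
~~~~~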
The complexity we expect with a random pivot is O(N), because we expect a cut at N/2 at the first step, then at N/4, and so on. As (1 + 1/2 + 1/4 + …) * N = 2 * N, the complexity is O(N).