Binary Search Algorithm
Binary search trees can be wonderfully efficient when it comes to finding a specific value. Arrays, objects, and many others potentially require iterating over each value. Binary search trees, however, are able to reduce the remaining elements left to search by a factor of 2 each time.
This speedy search comes with trade-offs. For one, the data must follow some sort of logical sequence. We must be able to sort the data in some way, where each item in our tree is "less" or "greater" than another, by some common rule. Additionally, our binary search tree isn't able to insert values as quickly as we can in an array or object.
Despite these limitations, the binary search tree still proves to be an incredibly powerful way to organize data.
We can note now that the efficiency of this data structure lies not in its tree-like structure, but in the "binary search" aspect. That is, whenever we have sorted data, we can in theory use binary search.
How would we apply this to an array?
Let's say we have an array of 10 items. For simplicity, we'll make the values equal to the indices.
Our array looks like this:
var array = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Now, let's say we want to get the index where array[index] === 9
. (Conveniently, this is at the index 9 as well! What a coincidence.).
One way we could do this would be to iterate over the array. We would check each element to see if we find the target value. If we do, we'll return the index. If we don't, we'll keep searching. Maybe we'd use a for loop, like this:
for (var i = 0; i < array.length; i++) {
if (array[i] === 7) {
return i
}
}
In this case, if the value we want is the first one in the array, then we'd only have to check one item. How lucky! But what if our value is the last one? We'd have to check the entire array. We can say that in the worst case, we'd check n items, where n is the length of the array. On average, we'd check n/2 times.
Now what if we use a binary search?
Let's first take the midpoint of the data. Here, we have an even number of items, and can either use 4 or 5. For consistency, let's say we'll always use the lower number when we have an even number. We can use the average of our start and end point to find our midpoint. Here, with our lastIndex being 9 and our firstIndex being 0, our midpoint is 4.
Now, let's check if our midpoint is equal to the value we're looking for. If it were, we'd be done! In this case, it isn't, and a simple comparison tells us our midpoint is too low.
Now comes the beauty of binary search. What information has our last comparison given us? By knowing that our midpoint was too low, we now know that any index lower than ours will also be too low. Now we've saved ourselves the time of checking any of those values! (Remember, this isn't just because our values happen to match our indices here! This will apply as long as data is sorted).
This time, let's only check the values between 5 and 9. Let's set our midpoint to the halfway point between that, using the same method again. By using 5 as our start point and 9 as our end point, we get 7 as our new midpoint.
Once again, let's check if our midpoint matches. It seems like 7 is still too low, so let's do this again with the remaining values, 8 and 9. (Notice this time we've ruled out 5 and 6).
When we take 8 as our start point and 9 as our end point, we get 8 as our midpoint. We're still too low! Let's take the remaining values. But wait! It would seem there is only one value left.
This time, our firstIndex and lastIndex are the same - both 9. Although we could repeat the process of checking the midpoint and produce the same result (the average of 9 and 9 is 9), let's take the easy route and just check our last remaining value. As it turns out, this is the value we were looking for! (What if it weren't? That would mean the value doesn't exist in the collection, and we can handle this accordingly).
Notice in this case, we searched until there was only one remaining value - at no earlier point did we get lucky and hit the target with our midpoint. We can say that this is binary search's version of the worst case. And yet, we only took 4 steps to find our value. This is much better than having to look through 10 items.
Let's dig a little deeper, to prove that our code in fact does check each item. After all, this all falls apart if an item gets overlooked. We need to be sure we've done a complete search.
What if 4 had been too high? We would then check the values between the indices 0 and 3. The midpoint here is 1 (1.5 rounded down). If 1 is too high, we would check 0 - the final value. If 1 is too low, we would check 2 - the final value as well. It would seem that we in fact do check each value.
You may hypothesize that starting with an odd number of items in our array may make a difference. What if we had started with 5 items?
We would check the index of 3. If this were too low, we would check the items between 4 and 5. If it were too high, we would check the items between 0 and 2. We quickly see that the starting number of items makes to difference, as our problem quickly reduces to a similar pattern.
This final conclusion here - that our problem quickly reduces in to a smaller version of the same problem - makes it the perfect candidate for recursion. Let's try implementing this into a recursive function, and seeing how it works.
First, we know that our function will need to take an array, which we will search through, and a target, which we will search for. Let's add these as arguments.
Next, we know that we will continuously look through smaller portions of the array. We may want to specify the start and end points of section of the array that we'll search, so let's add those as arguments too.
var binarySearch = function (array, target, start, end) {
}
Next, we know that when the start and end indices are the same, we only have 1 item left to check. In this case, we don't need to recurse any further, so let's set this as our base case. Here, we'll check if the value is our target. If it is, we'll return our index. If not, we'll return -1, to denote that our value wasn't found.
var binarySearch = function (array, target, start, end) {
if (start === end) {
if (array[start] === target) {
return start;
} else {
return -1;
}
}
}
Now, let's tackle the heart of the problem - the recursive case. We know that the first thing we want to do is find the midpoint, so let's do that. Remember, we're going to take the average of the start and end points, and round it down using Math.floor to get a whole number. We'll then check if the midpoint of the array is the target, and return the index if it is.
var binarySearch = function (array, target, start, end) {
if (start === end) {
if (array[start] === target) {
return start;
} else {
return -1;
}
}
var midpoint = Math.floor((start + end)/2)
if (array[midpoint] === target) {
return midpoint;
}
}
If our midpoint isn't the target, we'll now need to check the rest of our array. Remember, whether we check the left or right of the array depends on whether our midpoint is too high, or too low. This means we'll first need to make the comparison. Then, let's set up an if/else statement to tackle the two scenarios accordingly.
(Note: the condition of the else if
is technically unnecessary - if we have done everything correctly, an else
statement should suffice. However, we include this for clarity.)
var binarySearch = function (array, target, start, end) {
if (start === end) {
if (array[start] === target) {
return start;
} else {
return -1;
}
}
var midpoint = Math.floor((start + end)/2)
if (array[midpoint] === target) {
return midpoint;
} else {
if (array[midpoint] > target) {
} else if (array[midpoint] < target) {
}
}
}
The final step is to decide what to do in each case. We know that in both cases, we want to check a smaller portion of the array. Perhaps a recursive call to the same function can take care of this for us.
Let's add recursive function calls, adjusting the start and end points accordingly. If our midpoint was too high, we will want to check the lower half of the array, starting from our last start point, up to (but not including) the midpoint. If our midpoint was too low, we will want to check the higher half of the array, starting from after the midpoint to our last end point.
Lastly, we'll want to make sure that the first time the function is run, we use the entire array. Let's add in default values, so we don't need to explicitly pass them in each time.
var binarySearch = function (array, target, start, end) {
start = start || 0;
end = end || array.length-1;
if (start === end) {
if (array[start] === target) {
return start;
} else {
return -1;
}
}
var midpoint = Math.floor((start + end)/2)
if (array[midpoint] === target) {
return midpoint;
} else {
if (array[midpoint] > target) {
return binarySearch(array, target, start, midpoint-1)
} else if (array[midpoint] < target) {
return binarySearch(array, target, midpoint+1, end)
}
}
}
All done! Now, each time our function recursively calls itself, we cut down on the elements we still need to check by a factor of 2. We will do this until we either have found our target, or have one element left.
Let's run our function with some arrays to test it out!
binarySearch([1, 3, 5, 6, 8], 3); // 1
binarySearch([1, 3, 5, 6, 8], 8); // 4
binarySearch([23, 24, 27, 31, 35, 39], 23); // 0
binarySearch([23, 24, 27, 31, 35, 39], 31); // 3
Success!