Computer Science 1¶
Overview¶
Runtime Complexity
Data Structures
Introduction to Runtime Complexity¶
We Want To Scale¶
Given numbers in a loop, we want to return those numbers, reversed.

```javascript
let nums = []
for (let i = 0; i < 10000; i++) {
  nums.push(i)
}
nums = nums.reverse()
```

```javascript
let nums = []
for (let i = 0; i < 10000; i++) {
  nums.unshift(i)
}
```

The unshift version is hundreds of times slower!
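You can see the gap yourself with a rough timing sketch (the helper names here are ours, and absolute times will vary by machine):

```javascript
// Rough benchmark sketch: build the same reversed list two ways and time each.
function timeIt(fn) {
  const start = Date.now()
  fn()
  return Date.now() - start
}

function buildWithPush(n) {
  let nums = []
  for (let i = 0; i < n; i++) nums.push(i) // append at the end: cheap
  return nums.reverse()
}

function buildWithUnshift(n) {
  let nums = []
  for (let i = 0; i < n; i++) nums.unshift(i) // insert at the front: shifts everything
  return nums
}

const n = 100000
console.log("push + reverse:", timeIt(() => buildWithPush(n)), "ms")
console.log("unshift:       ", timeIt(() => buildWithUnshift(n)), "ms")
```

Both functions produce the same list; only the time to build it differs.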
Speed¶
Premature optimization is the root of all evil.
—Donald Knuth
Clear, simpler programs are often better than faster ones
Please note the “premature”: we don’t want to “micro-optimize”
But we do care about how they “scale” — that adds up!
Here are times, in nanoseconds, on a fast 2015 Macbook Pro, for inserting versus appending n items:

| n | push | unshift |
|---|---|---|
| 10 | 2 | 2 |
| 100 | 9 | 19 |
| 1000 | 73 | 324 |
| 10000 | 678 | 21,000 |
| 100000 | 7,810 | 1,950,000 |
| 1000000 | 84,900 | 40,323,000,000 |
Introduction to Data Structures¶
Data Structures¶
Writing scalable programs means understanding how data structures work
And picking the right ones
And using them correctly
So… Why Do We Care?¶
You want to write programs that can scale
These are super-common interview questions
Runtime Complexity¶
What is Runtime Complexity?¶
How long does it take my program to run
compared to how much workload I have,
where workload is the length of a list.
Why is it Important?¶
A mathematical way to talk about algorithm efficiency
Be aware of the efficiency of your code (and make good choices!)
Common interview question: “What’s the runtime of this code?”
Example: Sending Data from SLC to NYC¶
By sending hard drives on plane
By sending via cable modem
Can predict times for any # of TB
Notice different “shapes”
“constant” for plane
“linear” for modem
You Already Know How to Do This¶
Example¶
Say we have this function:
```javascript
function sumNums(numsList) {
  // return the sum of all numbers in an array
  let sum = 0
  numsList.forEach(num => sum += num)
  return sum
}

sumNums([1, 2, 3, 4, 5])
```
One operation for initialization of sum variable
One operation for each addition
Total operations: 1 + (1 × 5)
= 6 ops
Same Function, Bigger List¶
```javascript
function sumNums(numsList) {
  // return the sum of all numbers in an array
  let sum = 0
  numsList.forEach(num => sum += num)
  return sum
}

sumNums([1, 2, 3, ... , 1000000000])
```
How long will this take?
One operation for initialization of summation variable
One operation for each addition
Total operations: 1 + (1 × 1,000,000,000)
= 1,000,000,001 ops
Asymptotic Behavior¶
For any algorithm, we drop constant factors and lower-order terms, and pay attention only to the fastest-growing term. This is how we figure out the asymptotic behavior.
1 + (1 × 5)
1 + (1 × 1,000,000,000)
General formula of above: 1 + (1 × n)
1 + n
So if your list is n long, it takes n time to do your calculations.
This function is O(n).
For 2n + 3 …
We can drop the “+ 3” (doesn’t change when we scale)
We can drop the “2 ×” (we grow in direct proportion to n)
We’d call 2n + 3 → O(n)
But, But, But¶
Isn’t 1000n + 1000 bigger than n²?
For n <= 1000, yes.
Sooner or later, better shapes always win.
The Lingo¶
This function is O(n)
It has linear runtime
It runs in linear time
It runs in O(n) “Big oh of n” time
Q: What’s the runtime of your program? A: O(n)
The “O” here is “omicron”. As you get more advanced, you may also talk about other related but less common runtime measurements, such as omega (a lower bound: the least amount of time an algorithm could take) and theta (a tight bound, both upper and lower). Here, we’ll just be talking about O.
Worst Case¶
```javascript
let evenNums = []
for (let i = 0; i < someLargeList.length; i++) {
  if (someLargeList[i] % 2 === 0) {
    evenNums.push(someLargeList[i])
  }
}
```
How could we measure the runtime here?
Do we have to know what % of someLargeList is even?
We talk pessimistically, only about worst case.
Not useful to talk about best case:
it doesn’t happen often (worst case is more likely)
most algorithms share the same trivial best case
e.g., searching’s best case is always O(1)
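Linear search makes this concrete. Its best case (target is the first item) is O(1), but we describe the whole algorithm by its worst case, O(n):

```javascript
// Linear search: in the worst case we examine every item, so it's O(n),
// even though the best case (target at index 0) takes one step.
function linearSearch(list, target) {
  for (let i = 0; i < list.length; i++) {
    if (list[i] === target) return i // best case: found immediately
  }
  return -1 // worst case: scanned all n items without finding it
}
```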
Common Runtimes¶
Frequently Encountered Runtimes¶
O(1)
O(log n)
O(n)
O(n log n)
O(n²)
Example¶
Q: What’s the runtime?
```javascript
let listOne = [5, 48, 88, 24, 14]
console.log(listOne[3])
```
A: O(1)
Q: What’s the runtime?
```javascript
let someObj = {"a": 5, "b": 6, "c": 7}
console.log(someObj.c)
```
(hint: what if the object had 1,000,000 items in it?)
A: O(1)
O(1) - Constant Time¶
Accessing properties on objects
Array indexing
Example¶
Q: What’s the runtime?
```javascript
let myList = [2, 4, 6, 8, 10]
let sum = 0
for (let i = 0; i < myList.length; i++) {
  sum += myList[i]
}
```

A: O(n)
O(n) - Linear Time¶
A single loop
Two loops run in succession (2n → O(n))
Multi-Part Algorithms: Addition vs. Multiplication¶
If listA and listB each have n items, what is the runtime of:
```javascript
listA.forEach(item => console.log(item))
listB.forEach(item => console.log(item))
```
A: n + n → 2n → O(n)
Q: What is the runtime of:
```javascript
for (let i = 0; i < listA.length; i++) {
  for (let j = 0; j < listB.length; j++) {
    console.log(listA[i], listB[j])
  }
}
```

A: n × n → n² → O(n²)
O(n²) - Quadratic Time¶
Two nested for loops
Bubble sort
Insertion sort
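Bubble sort is a good example of the nested-loop shape. A minimal sketch:

```javascript
// Bubble sort: up to n passes, each making up to n comparisons → O(n²).
function bubbleSort(list) {
  for (let i = 0; i < list.length; i++) {            // outer pass
    for (let j = 0; j < list.length - i - 1; j++) {  // compare neighbors
      if (list[j] > list[j + 1]) {
        // swap out-of-order neighbors
        let temp = list[j]
        list[j] = list[j + 1]
        list[j + 1] = temp
      }
    }
  }
  return list
}
```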
Logarithmic Runtimes¶
What’s a Logarithm?
A logarithm is the inverse operation of exponentiation.
Consider exponentiation (raising a number “to the power of”). For example, 5³ (five “raised to the power of three”) is 5 × 5 × 5, or 125.
A logarithm is the inverse of this: log₅ 125 means “to what power would I have to raise 5 to equal 125?” (and, of course, in this case, the answer is 3).
In computer science, we deal so much with binary numbers or True/False (less-than/more-than, on/off, etc) kinds of questions that we use log₂ more than any other logarithmic base. As such, it’s common to leave off the 2 and simply assume base 2.
(More technically: from the perspective of runtime itself, it doesn’t matter what the logarithm base is. While there is a difference mathematically between log₂ and log₃, they both have the same asymptotic analysis or line shape. So it’s fine to just say “O(log n)” without saying what the log base is.)
Example¶
Q: What’s the runtime of this code?
```javascript
function findNum(max) {
  // binary search for a random target between 0 and max
  let min = 0
  let target = Math.floor(Math.random() * max)
  let guess = Math.floor((min + max) / 2)
  while (guess !== target) {
    if (guess > target) {
      max = guess - 1
    } else {
      min = guess + 1
    }
    guess = Math.floor((min + max) / 2) // re-guess at the new midpoint
  }
  return guess
}

findNum(32)
```
A: O(log n)
O(log n) - Logarithmic Time¶
“Divide and conquer”
Binary Tree Search
Computer scientists assume logarithms are logarithms in base 2, and tend to leave this off when writing, but, for clarity’s sake, that’s O(log₂ n)
O(n log n) - Linearithmic Time¶
More complex “Divide and conquer”
Mergesort
That’s n times log n, so it grows faster than n but slower than n².
Computer scientists assume logarithms are logarithms in base 2, and tend to leave this off when writing, but, for clarity’s sake, that’s O(n log₂ n)
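As a sketch, here is one common way to write mergesort; the list is halved log n times, and each level merges n items:

```javascript
// Mergesort: split in half recursively (log n levels), merge n items
// per level → O(n log n).
function mergeSort(list) {
  if (list.length <= 1) return list // base case: already sorted

  const mid = Math.floor(list.length / 2)
  const left = mergeSort(list.slice(0, mid))
  const right = mergeSort(list.slice(mid))

  // merge the two sorted halves
  const merged = []
  let i = 0, j = 0
  while (i < left.length && j < right.length) {
    if (left[i] <= right[j]) merged.push(left[i++])
    else merged.push(right[j++])
  }
  return merged.concat(left.slice(i)).concat(right.slice(j))
}
```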
What’s the Difference?¶
log n < n < n log n < n²
For n = 100:

| log n | n | n log n | n² |
|---|---|---|---|
| 6.6 | 100 | 664 | 10,000 |
O(2ⁿ) - Exponential Time¶
A common example is password cracking
Say there are 72 valid chars for a password
If password is 1 char there are 72 possibilities
For 4 chars that’s 72 × 72 × 72 × 72 (or 72⁴) possibilities (26,873,856)
For 10 chars that’s 72¹⁰ or 3,743,906,242,624,487,424
For 20: 72²⁰ => 14,016,833,953,562,607,293,918,185,758,734,155,776
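The multiplication above can be sketched in a few lines (using BigInt, since 72²⁰ is far too large for regular JavaScript numbers):

```javascript
// Count possible passwords of a given length, with 72 valid characters.
// BigInt (the n-suffixed literals) avoids overflow for large lengths.
function passwordCombos(length) {
  let total = 1n
  for (let i = 0; i < length; i++) {
    total *= 72n // one factor of 72 per character position
  }
  return total
}

console.log(passwordCombos(4))  // 72⁴
console.log(passwordCombos(20)) // 72²⁰
```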
O(n!) - Factorial Time¶
Graph traversal
Traveling Salesman problem
Find a path that hits each node once and only once:
Four choices. Start with one node, say A. [A, …]
Now there are 3 left, choose 1. [A, C, …]
Now there are 2 left, choose 1. [A, C, D, …]
Now there’s only one left. [A, C, D, B]
4 × 3 × 2 × 1 = 4! = 24
There are 24 possible routes to hit every node once.
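The route counting above is just a factorial, which can be sketched as:

```javascript
// n! = n × (n-1) × ... × 2 × 1: the number of orderings of n items,
// e.g. routes that visit each of n nodes exactly once.
function factorial(n) {
  let result = 1
  for (let i = 2; i <= n; i++) {
    result *= i
  }
  return result
}

console.log(factorial(4)) // routes through 4 nodes
```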
Data Structures¶
Built-In Data Structures¶
Data Structures are simply a way in which we can organize our data… Some common examples are:
Arrays (or lists)
Objects (or dictionaries)
Let’s take a look at some others.
Linked-Lists¶
Linked Lists feel very similar to arrays (they are, after all, lists); however, a linked list is essentially a collection of objects (or nodes) that typically have two key/value pairs: a value and a next. As an example (note that each node must be defined before the node that points to it):

```javascript
let c = {
  value: "C",
  next: null
}

let b = {
  value: "B",
  next: c
}

let a = {
  value: "A",
  next: b
}
```
Why Linked Lists?¶
Wouldn’t it be easier to just have an array ["A", "B", "C"]? Sure, but we need to think about more than just ease of use. How does an array scale? Let’s say we have a list of 20 letters and we want to insert a letter at index 0: this requires moving each letter down an index, meaning 20 steps, plus 1 for the insert itself. This is an O(n) time complexity. However, with a linked list, we can do this in O(1) time (assuming we already have a reference to the node before the insertion point). All we have to do is set the previous element’s next to the new element we want to insert, and our new element’s next to whatever the previous element’s next used to be. Ex:
If we have A -> B -> D, all we need to do is create a C node, change B’s next from D to C, and set C’s next to D. Now we will have a list that looks like: A -> B -> C -> D.
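The A -> B -> D example above, in code:

```javascript
// Build the list A -> B -> D (defining nodes back-to-front).
let d = { value: "D", next: null }
let b = { value: "B", next: d }
let a = { value: "A", next: b }

// O(1) insert of C between B and D: two pointer updates, no shifting.
let c = { value: "C", next: b.next } // C points at what B pointed at (D)
b.next = c                           // B now points at C

// The list is now A -> B -> C -> D.
```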
Queues¶
What is a Queue data structure? Well, let’s think about real-life queues…
Grocery Store line
DMV office
Airplanes queuing to take-off…
They all follow a similar process: if you are the first into the line, you are the first out of the line. This is known as FIFO (first in, first out).
It is important to note that the Queues we build here use arrays under the hood.
Why Use Queues¶
Queues are not used to improve time complexity over an array; as mentioned, they use arrays under the hood. However, queues are great to use when you want an array-like structure with limited, customized rules. For example, we could create a Queue class that has a dequeue() method, which removes the first item in the array, and an enqueue() method, which adds an element to the end of the array. Your Queue will be protected from methods such as shift, slice, and splice unless you specifically add these methods to your Queue class.
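A minimal sketch of such a Queue class (the method names enqueue/dequeue are as described above; the array-backed design is one common choice):

```javascript
// FIFO queue backed by an array: enqueue at the back, dequeue from the front.
class Queue {
  constructor() {
    this.items = []
  }
  enqueue(item) {
    this.items.push(item) // add to the end of the line
  }
  dequeue() {
    return this.items.shift() // remove (and return) the front of the line
  }
}
```

Note that only enqueue and dequeue are exposed; a user of this class has no sanctioned way to splice items out of the middle.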
Stacks¶
Stacks are very similar to Queues. Let’s think about some real-life stacks:
Stack of books
Stack of pancakes
The difference between a Stack and a Queue is that a stack is first in, last out (FILO), also described as last in, first out (LIFO). Think about a stack: if you stack a bunch of books on top of each other, in order to get to the bottom book (the first one added to the stack) you need to remove the books above it (the books added afterwards).
Why Use Stacks?¶
Just like Queues, Stacks are used for simplicity and control. A stack is an array under the hood, but you can create your own custom rules for it and prevent methods such as splice and slice from being used.
Trees¶
Trees are another type of data structure. They can be a little more complicated than some others, however, they can be much more efficient when searching through them.
Trees have a root node (you can think of this as your main category, e.g. Soda) and children nodes (e.g. Coca-Cola and Pepsi). Those children nodes can have more children nodes (e.g. Coca-Cola may have Sprite, Fanta, and Coke, and Pepsi may have Diet Pepsi, Sierra Mist, and Dr. Pepper). With this type of data structure, and assuming you organize your data well, you can quickly search through thousands of nodes by simply diving into the correct child node.
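A sketch of such a tree, using the soda example above (representing each node as a value plus a children array is one common choice):

```javascript
// A root node with children; each child can have children of its own.
let sodaTree = {
  value: "Soda",
  children: [
    {
      value: "Coca-Cola",
      children: [
        { value: "Sprite", children: [] },
        { value: "Fanta", children: [] }
      ]
    },
    {
      value: "Pepsi",
      children: [
        { value: "Diet Pepsi", children: [] },
        { value: "Sierra Mist", children: [] }
      ]
    }
  ]
}

// Depth-first search: check this node, then recurse into each child.
function findNode(node, target) {
  if (node.value === target) return node
  for (const child of node.children) {
    const found = findNode(child, target)
    if (found) return found
  }
  return null // target is nowhere in this subtree
}
```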
More Data Structures¶
This is just a brief look at some of the most commonly used data structures. There are many, many, more. Learning about the different data structures commonly used will be very helpful when you are searching for a job.