Program Execution Cost

The main goal for this unit is to develop an understanding of the cost of running programs. So far, you haven't worried about this and have been happy to write code that gives the correct result. Once you start to make programs bigger, make them do more things or run on larger inputs, you have to start thinking about the cost of running them. This question of what it costs to evaluate an execution is a very important, and it is a fundamental problem in computer science. Some people spend their whole careers working on this. It's called algorithm analysis.

You may not be aware of this but you've already written many algorithms. An algorithm is a procedure that always finishes and produces the correct result. A procedure is a well defined sequence of steps that it can be executed mechanically. We're mostly interested in procedures which can be executed by a computer, but the important part about what makes it a procedure is that the steps are precisely defined and require no thought to execute.

To be an algorithm, it has to always finish. You've already seen that it isn't an easy problem to determine if an algorithm always finishes. It isn't possible to answer that question in general, but it can be answered for many specific programs.

So, once you have an algorithm, you have well a defined sequence of steps that will always finish and produce the right results. This means you can reason about its cost.

What is Cost?

The way computer scientists think about cost is quite different from how most people think about cost.

When you think about the cost of things, you know specific things such as the Toyota Yaris car costs PKR 2,900,000 and the Suzuki Mehran car PKR 8,000,000 – the Toyota Yaris costs more than the Suzuki Mehran car. You just have to compare those costs. This is thinking in terms of very specific things with specific costs. It's different with algorithms. We don't usually have a specific execution in mind. The cost depends on the inputs.

Suppose algorithms Algo 1 and Algo 2 both solve the same problem. You can't put a fixed price on them like with the cars. For some inputs, it might be the case that Algo 1 is cheaper than Algo 2, but for others, Algo 2 might be cheaper. You don't want to have to work this out for every input because then you might as well run it for every input. You want to be able to predict the cost for every input without having to run every input.

The primary way that cost is talked about in computer science is in terms of the size of the input. Usually, the size of the input is the main factor that determines the speed of the algorithm, that is, the cost in computing is measured in terms of how the time increases as the size of the input increases. Sometimes, other properties of the input matter, which will be mentioned later. Ultimately, cost always comes down to money. What costs money when algorithms are executed?

The time it takes to finish: if it finishes more quickly, you'll spend less time on it. You can rent computers by the time it takes to execute. There are various cloud computing services where you pay for a certain sized processor for the time you use it. It's just a few cents per hour. Time really is money and, although we don't need to turn the costs into money because we might not know the exact computing costs, understanding the time to execute will give a good sense of the cost.
Memory: – if a certain amount of memory is needed to execute an algorithm then you have an indication of the size and cost of computer required to run the program.

In summary, cost is talked about in terms of time and memory rather than money; although the actual implementation of these do convert to actual monetary costs. Time is often the most important aspect of cost, but memory is also another consideration.

Q5.1 - Measuring Speed

Why do computer scientists focus on measuring how time scales with input size, instead of absolute time?

A): We want to predict how long it will take for a program to execute before running it.
B): We want to know how the time will change as computers get faster.
C): We want to understand fundamental properties of our algorithms, not things specific to a particular input or machine.
D): We want abstract answers, so they can never be wrong.

Throughout the entire history of computing, it's been the case that the computer you can buy in a year for the same amount of money will be faster than the computer you can buy today.

Answer

The correct answers are A), B), C).

A) We want to predict how long it will take for a program to execute before running it. If we have to run the program to figure out how long it takes, then we've already done what we wanted so there is no point predicting. Furthermore, we've only found out how long it takes to run for that particular input. It won't be very useful to tell us how long it will take to run it for a different input.

B) We want to know how the time will change as computers get faster. By understanding how time scales with input size, that is how the time increases as the input size increases, we get a better idea how costs will change with time. Computers keep getting cheaper and faster. This was observed by Gordon Moore in 1965 and is known as Moore's Law. It's not a physical law but a trend that has been followed throughout the history of computing. Computing power doubles approximately every 18 months, so what you can get now will be about half the power you will be able to get in a year and a half for the same amount of money. That's a pretty nice property but knowing the monetary cost is today doesn't tell us much about the cost in a year's time. Understanding the cost in a more fundamental way is needed.

C) We want to understand fundamental properties of our algorithms, not things specific to a particular input or machine. Again, this is correct. Understanding fundamental properties will give us much more information that a few measurements from particular inputs and computers.

D) We want abstract answers, so they can never be wrong. Abstract answers can be just as wrong as concrete answers, but having abstract answers can help us understand in a more fundamental way than just concrete answers.

Measuring Execution Time

It is sometimes better to do bechmarking of your code to figure out what how your procedure will behave on real inputs. There are several tools available for this job. We will use Python timeit module for this job.

Python timeit.timeit() is a method in Python library to measure the execution time taken by the given code snippet. The Python library runs the code statement 1 million times (this is configurable) and provides the minimum time taken from the given set of code snippets.

Here is an example of how to you timeit module to measure basic arithmetic operation.

import timeit

print(timeit.timeit('1 + 1'))

Spin Loop

In order to get a better idea of how timing works, define the procedure spin_loop:

def spin_loop(n):
    i = 0
    while i < n:
        i = i + 1

To measure execution time of spin_loop, you need to do something like this:

print(timeit.timeit('spin_loop(1000)', globals=globals(), number=1000))

This code will run for longer, and by picking the value n you can go through and loop any number of times. The table below shows spin_loop running 1000, 10000, 100000 and 1000000 times, returning a result that is the time it takes to run the loop that number of times.

`n`	Execution Time
`1000`	`0.04618648900009248`
`10000`	`0.42609286699996574`
`100000`	`4.169496288000005`
`1000000`	`45.209366087000035`

It is important to notice that the time changes depending on the input – when you increase the input, the time (or the output) also increases accordingly.

Q5.2 - Predicting Execution Time

Given the execution times above, what is the expected execution time for spin_loop(10**9) (one billion) in seconds? You can use table above to estimate time taken.

Answer

Looking at the values in the table, you can see that when n is increased by a factor of 10 (multiplied by 10), then so is the running time, like this:

n: 10^4 * 10 = 10^5, running time: 0.046 * 10 = ~0.46s,
n: 10^5 * 10 = 10^6, running time: 0.46 * 10 = ~4.6s.

So you can predict that going from one million to ten million, which is an increase by a factor of 10, that the running time will also increase by a factor of 10.

n: 10^6 * 10 = 10^7, predicted running time: 4.6 * 10 = ~46s Continuing in this way,
n: 10^7 * 10 = 10^8, predicted running time: 46 * 10 = ~460s
n: 10^8 * 10 = 10^9, predicted running time: 460 * 10 = ~4600s, which is the value you were trying to predict.

When the running time increases by the same factor as the size of the input, it is called linear. If we plot running time against the size of the input on a graph, we get a straight line. Unfortunately, the graph would have to be very large to plot all the points up to a billion.