Problem Statement
Consider the following problem: Find all possible combinations of a, b, and c such that a + b + c = 1000 and a^2 + b^2 = c^2 (where a, b, and c are natural numbers).
Initial Attempt
import time
start_time = time.time()
# Note: triple loop
for a in range(0, 1001):
for b in range(0, 1001):
for c in range(0, 1001):
if a**2 + b**2 == c**2 and a + b + c == 1000:
print("a, b, c: %d, %d, %d" % (a, b, c))
end_time = time.time()
print("elapsed: %f" % (end_time - start_time))
print("complete!")
Results:
a, b, c: 0, 500, 500
a, b, c: 200, 375, 425
a, b, c: 375, 200, 425
a, b, c: 500, 0, 500
elapsed: 214.583347
complete!
Time taken: 214.583347 seconds
Algorithm Development
An algorithm is a well-defined procedure or set of rules to solve a specific problem or perform a task. It consists of a finite sequence of instructions that can be executed by a computer.
Characteristics of an Algorithm
- Input: An algorithm may have zero or more inputs.
- Output: An algorithm must produce at least one output.
- Finiteness: The algorithm must terminate after a finite number of steps.
- Definiteness: Each step must be clearly defined and unambiguous.
- Feasibility: Each step must be executable within a reasonable amount of time.
Improved Attempt
import time
start_time = time.time()
# Note: double loop
for a in range(0, 1001):
for b in range(0, 1001 - a):
c = 1000 - a - b
if a**2 + b**2 == c**2:
print("a, b, c: %d, %d, %d" % (a, b, c))
end_time = time.time()
print("elapsed: %f" % (end_time - start_time))
print("complete!")
Results:
a, b, c: 0, 500, 500
a, b, c: 200, 375, 425
a, b, c: 375, 200, 425
a, b, c: 500, 0, 500
elapsed: 0.182897
complete!
Time taken: 0.182897 seconds
Evaluating Algorithm Efficiency
Execution Time as a Measure of Efficiency
For the same problem, two different algorithms were implemented. The execution times of the programs showed a significant difference (214.583347 seconds versus 0.182897 seconds). This indicates that the execution time of a program can reflect the efficiency of the algorithm.
Reliability of Execution Time
If the second attempt's algorithm were run on an older, less powerful computer, the execution time might not be significantly faster than the first algorithm's 214.583347 seconds. Relying solely on execution time to compare algorithms may not be objective.
Time Frequency
The time an algorithm takes to execute cannot be calculated theoretically; it must be tested on a machine. However, we don't need to test every algorithm. Instead, we focus on how many operations each algorithm performs. The number of operations performed by an algorithm is called its time frequency, denoted T(n).
Time Complexity and Big O Notation
The time frequency T(n) represents the number of operations an algorithm performs. If T(n) is proportional to a function g(n), then we say the algorithm has a time complexity of O(g(n)). Big O notation ignores constants and lower-order terms, focusing only on the highest-order term.
Understanding Big O Notation
When evaluating algorithm efficiency, the focus should be on the growth rate of the function rather than constant factors. For example, 3n^2 and 100n^2 are considered to be in the same complexity class.
Worst-Case Time Complexity
When analyzing algorithms, consider:
- Best-case time complexity: the minimum number of operations needed.
- Worst-case time complexity: the maximum number of operations needed.
- Average-case time complexity: the average number of operations needed.
We primarily focus on the worst-case scenario, as it provides a guarantee that the algorithm will complete within a certain number of operations.
Basic Rules for Calculating Time Complexity
- Basic operations with constant time complexity are O(1).
- Sequential structures add their time complexities.
- Loop structures multiply their time complexities.
- Branching structures take the maximum time complexity.
- Focus on the highest-order term when evaluating time complexity.
- Unless specified otherwise, the time complexity analyzed is the worst-case complexity.
Algorithm Analysis
First Attempt Algorithm Core:
for a in range(0, 1001):
for b in range(0, 1001):
for c in range(0, 1001):
if a**2 + b**2 == c**2 and a + b + c == 1000:
print("a, b, c: %d, %d, %d" % (a, b, c))
Time Complexity: T(n) = O(n^3)
Second Attempt Algorithm Core:
for a in range(0, 1001):
for b in range(0, 1001 - a):
c = 1000 - a - b
if a**2 + b**2 == c**2:
print("a, b, c: %d, %d, %d" % (a, b, c))
Time Complexity: T(n) = O(n^2)
Common Time Complexities
Note: log2n is often written as logn.
Relationships Between Common Time Complexities
O(1) < O(logn) < O(n) < O(nlogn) < O(n2) < O(n3) < O(2n) < O(n!) < O(nn)
Python Built-in Type Performance Analysis
timeit Module
timeit is used to measure the execution speed of small Python code snippets.
Class timeit.Timer(stmt='pass', setup='pass', timer=)
Timer is a class for measuring the execution speed of small code snippets.
stmt parameter is the code statement to test; setup parameter is the setup required for running the code; timer parameter is a timer function dependent on the platform.
Method timeit.Timer.timeit(number=1000000)
The method to test the execution speed of a statement in the Timer class. number parameter is the number of times to test the code, default is 1000000 times. The method returns the execution time, a float representing seconds.
List Operations Test
def t1():
l = []
for i in range(1000):
l = l + [i]
def t2():
l = []
for i in range(1000):
l.append(i)
def t3():
l = [i for i in range(1000)]
def t4():
l = list(range(1000))
from timeit import Timer
timer1 = Timer("t1()", "from __main__ import t1")
print("concat ",timer1.timeit(number=1000), "seconds")
timer2 = Timer("t2()", "from __main__ import t2")
print("append ",timer2.timeit(number=1000), "seconds")
timer3 = Timer("t3()", "from __main__ import t3")
print("comprehension ",timer3.timeit(number=1000), "seconds")
timer4 = Timer("t4()", "from __main__ import t4")
print("list range ",timer4.timeit(number=1000), "seconds")
Append vs Insert Comparison
def t5(): li = [] for i in range(10000): li.insert(0, i)
Data Structures
====
How can we store student information in a class using Python types? How can we quickly retrieve a student's information by name?
When considering this problem, we have already used data structrues. Both lists and dictionaries can store student information, but retrieving a student's information from a list requires traversing the list, which has a time complexity of O(n). Ussing a dictionary allows us to store the student's name as the key and the student's information as the value, enabling quick retrieval without traversal, which has a time complexity of O(1).
To solve problems efficiently, we need to store data and design algorithms based on the storage method. Different storage methods require different algorithms. We aim for the most efficeint algorithms, leading us to consider how to store data effectively, which is what data structures are about.
In the above problem, we can choose between Python's list or dictionary to store student information. Lists and dictionaries are two built-in data structures provided by Python.
Concept
--
A data structure is a way of organizing and storing data in a computer. It refers to a collection of data elements that have one or more specific relationships between them.
To solve problems, we need to store data and design algorithms based on the storage method. Different storage methods lead to different algorithms. We want the most efficient algorithms, so we must consider how to store data effectively, which is the essence of data structures.