Working with data
Imagine you are asked to compute the average of a list of numbers. But you should average only the positive numbers, skipping over any that are less than or equal to zero.
First try
Here is one way to do that:
def average_positive(numbers):
    """
    Take a list of numbers, and return the average of the
    positive values in the list
    """
    total = 0
    n = 0
    for number in numbers:
        if number > 0:
            total = total + number
            n = n + 1
    return total / n


if __name__ == '__main__':
    # try it the first way
    result = average_positive([-100, 1, 0, 2, 3])
    print(result)
Working with the data this way, we go through the list one time and do all our calculations as we go.
To do this, we need two variables to compute the average. One is total, which holds the running sum of all of the positive numbers in the numbers list. The second is n, which counts how many numbers are positive.
We update both total and n only for the positive numbers. We increase total by number, and we increase n by 1.
If you are confused by this code, copy it into PyCharm and run it through the debugger. Step through each line of average_positive() and notice how total, n, and number change values over time.
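If you would rather see those values printed than step through them, here is a rough sketch of the same loop that records the three variables after every pass. The function name trace_average_positive() is made up for this illustration. It is not part of the solution above.

```python
def trace_average_positive(numbers):
    """
    Hypothetical helper: runs the same loop as average_positive(),
    but records (number, total, n) after each pass through the loop.
    """
    total = 0
    n = 0
    steps = []
    for number in numbers:
        if number > 0:
            total = total + number
            n = n + 1
        # record the values even when the number was skipped
        steps.append((number, total, n))
    return steps


if __name__ == '__main__':
    for number, total, n in trace_average_positive([-100, 1, 0, 2, 3]):
        print('number =', number, ' total =', total, ' n =', n)
```

Notice that total and n stay the same on the passes where number is -100 or 0.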
Second try
Here is another way to do this problem:
def only_positives(numbers):
    """
    Take a list of numbers and return a NEW list that has
    only the positive numbers from the original list.
    """
    new_list = []
    for number in numbers:
        if number > 0:
            new_list.append(number)
    return new_list


def average_numbers(numbers):
    """
    Take a list of numbers and average them.
    """
    total = 0
    for number in numbers:
        total = total + number
    return total / len(numbers)


def average_positive(numbers):
    """
    Take a list of numbers, and return the average of the
    positive values in the list
    """
    # get a list of only the positive numbers
    new_list = only_positives(numbers)
    # average the list of numbers
    average = average_numbers(new_list)
    return average


if __name__ == '__main__':
    # try it the second way
    result = average_positive([-100, 1, 0, 2, 3])
    print(result)
In this case, we first use the original list of numbers to create a new list that has only positive numbers. This is a filtering operation. Then once we have a list of only positive numbers, we take the average of them. This is a reduce operation.
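To make the two steps easier to see, here is a minimal sketch that does the filter with a list comprehension and the reduce with Python's built-in sum() and len(). The name average_positive_short() and the choice to return None for a list with no positive numbers are assumptions made for this example, not part of the solution above.

```python
def average_positive_short(numbers):
    """
    Hypothetical shorter version of average_positive():
    filter the list, then reduce it to one number.
    """
    # filter: build a new list with only the positive numbers
    positives = [number for number in numbers if number > 0]

    # assumption: return None when nothing is left to average,
    # because dividing by len([]) would raise ZeroDivisionError
    if len(positives) == 0:
        return None

    # reduce: sum() collapses the list into a single total
    return sum(positives) / len(positives)


if __name__ == '__main__':
    print(average_positive_short([-100, 1, 0, 2, 3]))  # prints 2.0
    print(average_positive_short([-5, 0]))             # prints None
```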
Thinking about a problem this way centers the data rather than the operations on the data. The purpose of the code is to manipulate the data, one step at a time, until we have the result we need.
You may want to use this way of solving the problem if you are going to need the basic pieces (removing the numbers that are not positive, taking an average of a list of numbers) many more times. For example, maybe you have a problem that requires averaging all numbers, regardless of whether they are positive or negative. Our average_numbers() function works just fine for that. But if we used the first solution, we would need to write some new code.
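As a small sketch of that reuse, the two helper functions from the second solution are repeated here so the example runs on its own. The list called data is just sample input chosen for the illustration.

```python
def only_positives(numbers):
    """Return a NEW list with only the positive numbers."""
    new_list = []
    for number in numbers:
        if number > 0:
            new_list.append(number)
    return new_list


def average_numbers(numbers):
    """Take a list of numbers and average them."""
    total = 0
    for number in numbers:
        total = total + number
    return total / len(numbers)


if __name__ == '__main__':
    data = [-100, 1, 0, 2, 3]
    # reuse average_numbers() by itself: average every number
    print(average_numbers(data))                   # prints -18.8
    # combine both pieces: average only the positive numbers
    print(average_numbers(only_positives(data)))   # prints 2.0
    # reuse only_positives() by itself: count the positive numbers
    print(len(only_positives(data)))               # prints 3
```

Neither helper had to change to answer the new questions.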
The accumulator pattern
Both solutions to this problem use the accumulator pattern. With this pattern,
you initialize a variable to some starting value (zero, an empty list), and then
add to it in a loop (adding to a total, appending to the list).
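The starting value depends on what you are accumulating. As a hedged sketch, here are two more accumulators that start from something other than zero or an empty list. The functions product() and join_words() are hypothetical examples written for this illustration.

```python
def product(numbers):
    """Hypothetical example: multiply all of the numbers together."""
    # start at 1, because starting at 0 would make every product 0
    result = 1
    for number in numbers:
        result = result * number
    return result


def join_words(words):
    """Hypothetical example: build one string from a list of words."""
    # start with an empty string and add one word at a time
    sentence = ''
    for word in words:
        if sentence != '':
            sentence = sentence + ' '
        sentence = sentence + word
    return sentence


if __name__ == '__main__':
    print(product([1, 2, 3, 4]))                  # prints 24
    print(join_words(['one', 'step', 'at', 'a', 'time']))
```

In both functions the shape is the same as before: set up a variable before the loop, update it inside the loop, and return it after the loop.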
In this class, you will practice these patterns repeatedly so they will become second nature to you.