Working with data

Imagine you are asked to compute the average of a list of numbers, but only of the positive numbers. Any number that is less than or equal to zero should be skipped.

First try

Here is one way to do that:


def average_positive(numbers):
    """
    Take a list of numbers, and return the average of the
    positive values in the list
    """
    total = 0
    n = 0
    for number in numbers:
        if number > 0:
            total = total + number
            n = n + 1
    return total / n


if __name__ == '__main__':
    # try it the first way
    result = average_positive([-100, 1, 0, 2, 3])
    print(result)

Working with the data this way, we go through the list once and do all of our calculations along the way.

To do this, we need two variables. One is total, which holds the sum of all of the positive numbers in the numbers list. The second is n, which counts how many numbers are positive.

We increment both total and n only for the positive numbers. We increment total with number, and we increment n with 1.
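To make those updates visible without a debugger, here is a small sketch that records the values of both variables after each number. The name trace_average_positive is made up for this illustration and is not part of the solution above.

```python
def trace_average_positive(numbers):
    """
    Run the same loop as average_positive(), but record
    (number, total, n) after each pass through the loop.
    Hypothetical helper, for illustration only.
    """
    total = 0
    n = 0
    steps = []
    for number in numbers:
        if number > 0:
            total = total + number
            n = n + 1
        # record the state even when the number was skipped
        steps.append((number, total, n))
    return steps


if __name__ == '__main__':
    for number, total, n in trace_average_positive([-100, 1, 0, 2, 3]):
        print(number, total, n)
```

Notice that total and n stay the same on the rows for -100 and 0, because those numbers are skipped.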

If you are confused by this code, copy it into PyCharm and run it through the debugger. Step through each line of average_positive() and notice how total, n, and number change values over time.
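One case the first try does not handle is a list with no positive numbers: n stays at zero, so total / n raises a ZeroDivisionError. The sketch below shows one possible way to guard against that. Returning None is an assumption made here for illustration; the right behavior depends on what the caller expects, and the name average_positive_or_none is hypothetical.

```python
def average_positive_or_none(numbers):
    """
    Like average_positive(), but return None when the list
    has no positive values (an assumed design choice).
    """
    total = 0
    n = 0
    for number in numbers:
        if number > 0:
            total = total + number
            n = n + 1
    if n == 0:
        # nothing to average, so avoid dividing by zero
        return None
    return total / n


if __name__ == '__main__':
    print(average_positive_or_none([-100, 1, 0, 2, 3]))
    print(average_positive_or_none([-5, 0]))
```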

Second try

Here is another way to do this problem:

def only_positives(numbers):
    """
    Take a list of numbers and return a NEW list that has
    only the positive numbers from the original list.
    """
    new_list = []
    for number in numbers:
        if number > 0:
            new_list.append(number)
    return new_list


def average_numbers(numbers):
    """
    Take a list of numbers and average them.
    """
    total = 0
    for number in numbers:
        total = total + number
    return total / len(numbers)


def average_positive(numbers):
    """
    Take a list of numbers, and return the average of the
    positive values in the list
    """
    # get a list of only the positive numbers
    new_list = only_positives(numbers)
    # average the list of numbers
    average = average_numbers(new_list)
    return average


if __name__ == '__main__':
    result = average_positive([-100, 1, 0, 2, 3])
    print(result)

In this case, we first use the original list of numbers to create a new list that has only the positive numbers. This is a filtering operation. Once we have a list of only positive numbers, we take the average of them. This is a reduce operation: it boils a whole list down to a single value.
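The same filter-then-reduce idea can also be written with tools that Python already provides. This is only a sketch of an alternative, not a replacement for the functions above; the name average_positive_builtin is made up here, while the list comprehension, sum(), and len() are standard Python.

```python
def average_positive_builtin(numbers):
    """
    Filter with a list comprehension, then reduce with
    the built-in sum() and len().
    """
    # filter: keep only the positive numbers
    positives = [number for number in numbers if number > 0]
    # reduce: turn the list into one value
    # (raises ZeroDivisionError if there are no positives)
    return sum(positives) / len(positives)


if __name__ == '__main__':
    print(average_positive_builtin([-100, 1, 0, 2, 3]))
```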

Thinking about a problem this way centers the data rather than the operations on the data. The purpose of the code is to manipulate the data, one step at a time, until we have the result we need.

You may want to solve the problem this way if you are going to need the basic pieces (removing the non-positive numbers, taking an average of a list of numbers) many more times. For example, maybe you have a problem that requires averaging all numbers, regardless of whether they are positive or negative. Our average_numbers() function works just fine for that. If we had only the first solution, we would need to write some new code.
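As a sketch of that reuse, the two helper functions are repeated below so the example runs on its own, and each one is then used to answer a different question. Counting the positive values is an extra example that does not come from the text above.

```python
def only_positives(numbers):
    """Return a NEW list with only the positive numbers."""
    new_list = []
    for number in numbers:
        if number > 0:
            new_list.append(number)
    return new_list


def average_numbers(numbers):
    """Take a list of numbers and average them."""
    total = 0
    for number in numbers:
        total = total + number
    return total / len(numbers)


if __name__ == '__main__':
    data = [-100, 1, 0, 2, 3]
    # average of ALL the numbers, positive or not
    print(average_numbers(data))
    # how many numbers are positive
    print(len(only_positives(data)))
```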

The accumulator pattern

Both solutions to this problem show examples of using the accumulator pattern to solve a problem. With this pattern, you initialize a variable to some value (zero, an empty list), and then add to it (adding to a total, appending to the list) in a loop.
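The shape of the pattern stays the same even when the starting value and the update change. The two functions below are hypothetical examples written for this note: one accumulates a product, so it has to start at 1 rather than 0, and the other accumulates a list.

```python
def product(numbers):
    """Multiply all the numbers together."""
    result = 1  # start at 1, because starting at 0 would give 0
    for number in numbers:
        result = result * number
    return result


def squares(numbers):
    """Return a NEW list with each number squared."""
    new_list = []  # start with an empty list
    for number in numbers:
        new_list.append(number * number)
    return new_list


if __name__ == '__main__':
    print(product([1, 2, 3, 4]))
    print(squares([1, 2, 3]))
```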

In this class, you will practice these patterns repeatedly so that they become second nature to you.