Welcome to day 23 of the 30 Days of Python series! Today we're going to be looking at some ways of creating our own iterators using generators and generator expressions.
We're also going to be looking at an important function called iter, which returns an iterator for any iterable we pass to it. This is going to let us confirm a lot of the theory we discussed in yesterday's post, and we're also going to be able to use it to get a deeper understanding of for loops.
This post is going to build a great deal on what we discussed in yesterday's post, so if you haven't read it yet, I'd recommend you take a look before reading any further.
Python has a built in function called
iter which returns an iterator for the iterable we provide as an argument.
For example, let's take a simple list of numbers like this:
numbers = [1, 2, 3, 4, 5]
If we pass this list of numbers to
iter we'll get back an iterator for that list.
numbers = [1, 2, 3, 4, 5]
numbers_iter = iter(numbers)

print(numbers_iter)  # <list_iterator object at 0x7f57d138af70>
In this case we get a
list_iterator object, which is going to let us access values in
numbers. Different types have their own iterators which understand how to give us items from those iterables. Getting elements from a dictionary is somewhat different from getting items from a list, after all.
We can use this
list_iterator object just like any other iterator. We can pass it to
next, for example.
numbers = [1, 2, 3, 4, 5]
numbers_iter = iter(numbers)

print(next(numbers_iter))  # 1
print(next(numbers_iter))  # 2
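The same pattern works for other built-in iterables. The short sketch below uses a tuple, a dictionary, and a range; the variable names and values are made up for illustration. The exact type names printed at the end are a CPython implementation detail, so they may differ on other interpreters or versions.

```python
# Each built-in iterable hands back its own kind of iterator.
letters_iter = iter(("a", "b", "c"))          # iterator for a tuple
ages_iter = iter({"rick": 70, "morty": 14})   # iterator for a dict (yields keys)
range_iter = iter(range(10, 13))              # iterator for a range

first_letter = next(letters_iter)
first_key = next(ages_iter)
first_in_range = next(range_iter)

print(first_letter)    # a
print(first_key)       # rick
print(first_in_range)  # 10

# The iterator types differ, even though we use them the same way.
# The names shown here depend on the Python implementation.
print(type(letters_iter).__name__)
print(type(ages_iter).__name__)
print(type(range_iter).__name__)
```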
One interesting question is: what happens when we call iter on an iterator, such as numbers_iter?
This is perfectly legal, because
iter just expects an iterable, and all iterators are iterables. It also produces an interesting effect.
numbers = [1, 2, 3, 4, 5]
numbers_iter = iter(numbers)

print(numbers_iter is iter(numbers_iter))  # True
We find that passing
numbers_iter to the
iter function causes
iter to return the very same iterator. This might seem odd at first, but it makes quite a lot of sense.
Yesterday we talked about iterators being the means by which we access items in an iterable. When we want to iterate over an iterable, we need to ask for an iterator that knows how to get those values.
If we ask the iterator to give us a way to access those values, it offers up itself, since it's already capable of doing what we want.
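One practical consequence is that both names refer to a single iterator, so they share the same position. The sketch below illustrates this; same_iter and fresh_iter are names chosen just for this example.

```python
numbers = [1, 2, 3, 4, 5]
numbers_iter = iter(numbers)

same_iter = iter(numbers_iter)  # the very same iterator object
fresh_iter = iter(numbers)      # a brand new iterator for the list

first = next(numbers_iter)      # takes 1 from the shared iterator
second = next(same_iter)        # carries on from where numbers_iter left off
fresh_first = next(fresh_iter)  # the new iterator starts from the beginning

print(first, second, fresh_first)  # 1 2 1
```

Calling iter on the list gives us an independent iterator each time, while calling iter on an iterator just hands the same one back.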
for loops with
One cool thing we can do with the
iter function is replicate the behaviour of Python's
for loop. This is going to give us a little peek at what
for loops really do behind the scenes.
Let's use our list of numbers for this example again.
numbers = [1, 2, 3, 4, 5]
numbers_iter = iter(numbers)
We have an iterator, which is an important first step, but we need a couple of other tools to make this work. First, we need a while loop, because we want to loop a potentially infinite number of times. Second, we need a try statement so that we can look out for a StopIteration exception.
numbers = [1, 2, 3, 4, 5]
numbers_iter = iter(numbers)

while True:
    try:
        number = next(numbers_iter)
    except StopIteration:
        break
    else:
        print(number)
And just like that we have a for loop written with a while loop.
We have our loop variable number defined inside the try, and we have the loop body inside the else clause. Once we run out of numbers, the loop is going to terminate, just like we see with a for loop.
This is actually extremely close to how an actual
for loop works under the hood. It does request an iterator for whatever we want to iterate over, and it does call
next to retrieve values from that iterator. When a
StopIteration is raised, Python handles that error by breaking the loop.
This isn't something you should be doing in your production code, but it's an interesting peek behind the curtain that helps us better understand the structures we've been using since week 1.
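To see that the pattern isn't specific to lists, here is a rough sketch that wraps it in a function. The for_each name and its parameters are hypothetical, invented for this illustration, and it is not how Python implements for loops internally. It just shows the same iter, next, and StopIteration steps working for any iterable.

```python
def for_each(iterable, action):
    """Hypothetical helper: call action on each item, much like a for loop would."""
    iterator = iter(iterable)  # request an iterator, as a for loop does

    while True:
        try:
            item = next(iterator)  # ask for the next value
        except StopIteration:
            break                  # no values left, so end the loop
        else:
            action(item)           # this is the "loop body"


collected = []

for_each("abc", collected.append)             # strings are iterable
for_each({"x": 1, "y": 2}, collected.append)  # dictionaries give us their keys
for_each(range(3), collected.append)          # so are ranges

print(collected)  # ['a', 'b', 'c', 'x', 'y', 0, 1, 2]
```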
Let's leave iter alone for a moment and turn to the topic of creating our own iterators using generators.
There are quite a few ways to create custom iterators in Python, but most of them are beyond the scope of this series. This isn't really much of a limitation though, and we can do a great deal of very complicated things using generators.
The generator syntax is going to be very familiar to us, because a generator is actually just a function. The only thing which differentiates a generator from a regular function is a special keyword called yield.
Before we dive into this new
yield keyword, let's look at a simple generator example.
def first_hundred():
    for number in range(1, 101):
        yield number
Here I've defined a generator, which is just a special function, and I've called it first_hundred.
We can see from the function body that it has something to do with the numbers 1 to 100 inclusive, and we can probably infer that it's going to give us the first hundred integers, starting with 1.
Let's call our function and see what happens.
def first_hundred():
    for number in range(1, 101):
        yield number

g = first_hundred()
print(g)
If we run this code, we certainly don't get anything like the numbers 1 to 100 printed to the console. We get this:
<generator object first_hundred at 0x7faaa563fc80>
This is actually called a generator iterator, which is what gets returned when we call any function that contains the yield keyword.
As the name would imply, this is an iterator, and we can use it just like any other.
def first_hundred():
    for number in range(1, 101):
        yield number

g = first_hundred()

print(next(g))  # 1
print(next(g))  # 2
print(next(g))  # 3
When we call a generator, it gives us back a new generator iterator. Each of these generator iterators is an independent iterator, so be careful you don't do something like this:
def first_hundred():
    for number in range(1, 101):
        yield number

print(next(first_hundred()))  # 1
print(next(first_hundred()))  # 1
print(next(first_hundred()))  # 1
Each call to first_hundred gave us a new iterator, so we're only getting the first value from each one. We also didn't assign any of these iterators to a name, so there's no way for us to call next on the same iterator again.
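If we do assign the generator iterators to names, we can see that each one keeps track of its own position. The names g1 and g2 below are just for this illustration.

```python
def first_hundred():
    for number in range(1, 101):
        yield number


g1 = first_hundred()
g2 = first_hundred()

a = next(g1)  # first value from g1
b = next(g1)  # second value from g1
c = next(g2)  # g2 is unaffected, so it starts from the beginning
d = next(g1)  # g1 carries on from where it left off

print(a, b, c, d)  # 1 2 1 3
```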
Now that we've seen a generator in action, it's time to talk about what this
yield keyword is doing.
We know already that it signals to Python that we're defining a generator, but it also seems to have some role in actually providing the values we want from the resulting generator iterator.
What yield actually does is create a pause in the execution of the function body. When we call next and pass in our generator iterator, the code in the function body is going to run until we hit that yield keyword.
The value after the yield keyword is what we actually want to provide before we pause the execution of the function body. In this way we can think of yield as something like a non-terminating return.
We can see all this by adding a few print calls to our generator.
def first_hundred():
    print("First value requested\n")

    for number in range(1, 101):
        print("Starting new iteration")
        yield number
        print("Ending this iteration\n")

g = first_hundred()
At this point, nothing is printed. The generator iterator has been created, but we haven't actually tried to access any values. Now let's pass g to next a couple of times.
def first_hundred():
    print("First value requested\n")

    for number in range(1, 101):
        print("Starting new iteration")
        yield number
        print("Ending this iteration\n")

g = first_hundred()

print(next(g))
print(next(g))
Now our output looks like this:
First value requested

Starting new iteration
1
Ending this iteration

Starting new iteration
2
First we get the "First value requested\n" string, and then we enter the for loop. At this point we get a value from the range object, which is assigned to number, and we print the "Starting new iteration" string.
We then encounter the
yield keyword which pauses the execution of the function body, and our generator iterator spits out
1, which is the current value of
number. This value is returned by the call to
next and we print it to the console.
We then call next again, and we continue from where we left off. This means we print the "Ending this iteration\n" string, and we move onto a new iteration of the for loop.
We print the "Starting new iteration" string again and hit yield one more time. We yield the number, which is what next returns once again. This is then printed to the console, just as before.
For this second iteration, you'll note that we don't print the
"Ending this iteration\n" string, because
yield paused the execution before we reached that point.
If we were to call
next again, we'd get this string printed first, before starting a third iteration of the loop.
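The pausing behaviour doesn't depend on a loop. As a small sketch, here is a made-up generator called countdown with three separate yield statements. Each call to next runs the body up to the following yield, and once the function body finishes, the generator iterator raises StopIteration like any other exhausted iterator.

```python
def countdown():
    # Each yield hands back a value and pauses here until the next request.
    yield "three"
    yield "two"
    yield "one"
    # The function body ends here, so there are no more values.


c = countdown()
words = [next(c), next(c), next(c)]

try:
    next(c)  # a fourth request, but the generator has finished
except StopIteration:
    finished = True
else:
    finished = False

print(words)     # ['three', 'two', 'one']
print(finished)  # True
```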
yield is actually a very complicated keyword, and it can do a great deal more than what we're using it for. We're not going to be covering this additional behaviour in this series, however, because it only has applications in much more advanced code.
I'm mentioning this only so that you know there is more to learn once you're a little further along in your Python career.
In addition to creating generator iterators through functions, we can also use generator expressions.
The generator expression syntax is also going to be very familiar to us, because it's exactly the same as the comprehension syntax we saw in day 15. The only difference is that we use regular parentheses, rather than square brackets or curly braces.
We can use them very much like comprehensions, but they come with all the benefits of iterators that map and filter provide. If you've wanted to have those benefits, but didn't like the syntax for map and filter, generator expressions are for you.
For example, let's create a simple generator expression that squares every number in a range.
squares = (number ** 2 for number in range(1, 11))
Since squares refers to an iterator, printing it directly doesn't give us anything too useful, but it does at least confirm we're working with a generator iterator.
<generator object <genexpr> at 0x7f33225a0c80>
If we want to get values out, we can either pass it to a
for loop, we can destructure it, or we can use
next to perform manual iteration.
squares = (number ** 2 for number in range(1, 11))

for square in squares:
    print(square)

squares = (number ** 2 for number in range(1, 11))
print(*squares, sep=", ")

squares = (number ** 2 for number in range(1, 11))

print(next(squares))  # 1
print(next(squares))  # 4
print(next(squares))  # 9
Remember that the values in
squares get consumed when we iterate over the iterator, so you need to redefine
squares if you want to iterate over it more than once.
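As a quick illustration of this, here is what happens if we try to go through the same generator expression twice. The first_pass and second_pass names are just for this example.

```python
squares = (number ** 2 for number in range(1, 6))

first_pass = list(squares)   # consumes every value
second_pass = list(squares)  # nothing left to give

print(first_pass)   # [1, 4, 9, 16, 25]
print(second_pass)  # []
```

We don't get an error on the second pass. The iterator is simply exhausted, so we get an empty list.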
One nice thing about generator expressions is that we can forego the parentheses when we use the generator expression as the sole argument in a function or method.
This is totally legal syntax for example:
total = sum(number ** 2 for number in range(1, 11))
print(total)  # 385
This helps us reduce nested brackets when they would only hinder readability.
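This works with any function or method call, not just sum. The sketch below uses a made-up list of words to show a couple more cases, along with one where the parentheses have to stay because the generator expression isn't the only argument.

```python
words = ["iter", "yield", "next"]

# The generator expression is the sole argument, so no extra parentheses.
longest = max(len(word) for word in words)
shouted = ", ".join(word.upper() for word in words)

# Here there's a second argument, so the generator expression
# needs its own parentheses.
descending = sorted((len(word) for word in words), reverse=True)

print(longest)     # 5
print(shouted)     # ITER, YIELD, NEXT
print(descending)  # [5, 4, 4]
```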
1) Write a generator that generates prime numbers in a specified range. You can make use of your solution to exercise 3 from day 8 as a starting point.
2) Below we have an example where
map is being used to process names in a list. Rewrite this code using a generator expression.
names = [" rick", " MORTY ", "beth ", "Summer", "jerRy "]
names = map(lambda name: name.strip().title(), names)
3) Write a small program to deal cards for a game of Texas Hold'em. The order of the deal is as follows:
- The deck is shuffled.
- One card is handed to each player in order.
- A second card is handed to each player in order.
Then comes the more complicated part of the deal.
- First, the top card of the deck is discarded. This is called the burn.
- Three cards are then placed in the centre of the table, which is called the flop.
- Another card is burned, meaning we discard another card from the top of the deck.
- We add another card to the centre, which is called the turn.
- We burn another card.
- Finally, there's the river, where a fifth and final card is added to the centre.
The desired output for the program is something like this:
How many players are there? 2

Player 1 was dealt: (4, hearts), (4, clubs)
Player 2 was dealt: (9, clubs), (jack, diamonds)

The flop: (jack, clubs), (4, diamonds), (king, spades)
The turn: (8, hearts)
The river: (ace, hearts)
As the example would indicate, the program should accept a variable number of players. There must be at least 2 players, and no more than 10.
After the flop, the turn, and the river there's usually a round of betting, so if you want to extend this exercise, you may want to give the user the option to pause at each of these points.
Hint: We can shuffle cards using the
random.shuffle method. This shuffles a sequence in-place, which means it modifies the original sequence. We can then create an iterator from that sequence using
iter to make it easy for us to retrieve cards one at a time.
You can find documentation for random.shuffle in the official Python documentation for the random module.
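As a minimal sketch of the idea in the hint, here is the shuffle and iter combination applied to a tiny made-up list of cards. This isn't a solution to the exercise. The card values and variable names are placeholders, and the printed results will vary from run to run because the shuffle is random.

```python
import random

# A tiny, hypothetical "deck" just to show the mechanics.
cards = ["ace", "king", "queen", "jack", "ten"]

random.shuffle(cards)  # shuffles the list in place and returns None
deck = iter(cards)     # an iterator lets us take cards one at a time

first_card = next(deck)
second_card = next(deck)
remaining = list(deck)  # whatever is left after the first two cards

print(first_card, second_card)
print(remaining)
```

Because the iterator remembers its position, we never have to track an index or remove items from the list ourselves.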