Introduction to Python generator functions with a real-world example

As a junior Python coder, I am trying to master many advanced concepts. One of them is the generator object. Although there are many articles and tutorials on the Internet about this topic, most of them are far too complicated for those who are at the beginning of their coding journey.

Not only are most tutorials about advanced Python concepts too complicated for the beginner coder, they also rarely offer real-world examples that make the theory concrete. In short, most of them don’t teach you how to apply the theory in practice through real-world code.

In this article, you are going to learn how to use the Python generator object through a real-world example: code taken from a side project I am currently working on.

First of all, let’s review how normal functions work.

Normal functions in Python

As most of you may already know, a normal function in Python is declared with the def statement, as shown in the following piece of code.

def add_numbers(a, b):
    pass

The above piece of Python code is only the skeleton of a function: the pass statement is a placeholder that does nothing, so calling add_numbers() right now returns None. To be useful, the function should return something to the caller, which in this case is the sum of two numbers.

Let’s finish coding the function.

def add_numbers(a, b):
    total = a + b  # named total so the built-in sum() is not shadowed
    return total

When called with two numbers as arguments, the above function computes their sum and returns the result to the caller through Python’s return statement.

Let’s call the function as shown below.

result = add_numbers(3, 5)  # result rather than sum, which is a built-in name
print(result)

When I executed the above piece of Python code in my interactive console, I got the following output.

8

The thing with the above function, and with every normal function like it, is that once it has returned its result to the caller, it is done.

Generators, on the other hand, can return multiple results, one by one.

The generator object in Python

A generator function in Python is defined like a normal function, through the def statement, but it hands results to the caller through the yield statement instead of return. Calling it creates a generator object.

def generator_function():
    numbers = [1, 2, 3, 4, 5, 6, 7, 8]
    for el in numbers:
        yield el  # hand one element to the caller, then pause here

Now let’s call the above generator function the same way we call a normal function.

generator_function()

When the above piece of Python code is executed in my interactive console, the following output comes out.

<generator object generator_function at 0x10af779b0>

This output tells us only that the call created a generator object; none of the values has been produced yet.

To get the results one by one, we can use next(), a built-in function that asks the generator for its next value.

Let’s try it.

gen = generator_function()
next(gen)

Once the above code is executed, the following appears in the interactive Python console.

1

Continue with another next() call.

next(gen)

The next result comes out.

2

Let’s continue with one more next() call.

next(gen)

The next result comes out.

3

Keep calling next() until the last value, 8, has come out. One more next() call on the generator then raises a StopIteration exception, because there is nothing left to yield.

The console session shown below is the proof.

next(gen)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
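
In everyday code you rarely call next() by hand. The short sketch below repeats generator_function from above so that it runs on its own, and shows two common ways to consume a generator without ever seeing the error: a for loop, and next() with a default value. The names collected and gen are only illustrative.

```python
def generator_function():
    numbers = [1, 2, 3, 4, 5, 6, 7, 8]
    for el in numbers:
        yield el

# A for loop calls next() behind the scenes and stops
# cleanly when the generator raises StopIteration.
collected = []
for value in generator_function():
    collected.append(value)
print(collected)  # [1, 2, 3, 4, 5, 6, 7, 8]

# An exhausted generator stays exhausted. Passing a default
# to next() returns that default instead of raising the error.
gen = generator_function()
for _ in gen:
    pass  # consume every value
print(next(gen, "no more values"))  # no more values
```

To read the values again, call generator_function() once more to get a fresh generator object.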

For now, do not worry about how the Python generator object works under the hood; that is a more advanced topic.

To make things more concrete for junior Python coders, let’s take a look at a real-world scenario.

Real world implementation of a Python generator

At the moment I am working on a wrapper which automates some of the basic functionality provided by the FFmpeg multimedia framework. Anyone who has used this framework at least once knows that it gives live output to the user.

The problem I had to solve required the use of a generator object. Let’s see it through a practical example.

When you convert a video from one format to another with ffmpeg, you get live output on your console, such as the number of the video frame that the framework is currently processing.

[blockquote]
ffmpeg -i test.mp4 test.avi
[/blockquote]

When the above command is executed in my terminal, live output like the following comes out.

[blockquote]
ffmpeg version 3.2 Copyright (c) 2000-2016 the FFmpeg developers
built with Apple LLVM version 8.0.0 (clang-800.0.42.1)
configuration: --prefix=/usr/local/Cellar/ffmpeg/3.2 --enable-shared --enable-pthreads --enable-gpl --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-libmp3lame --enable-libx264 --enable-libxvid --enable-opencl --disable-lzma --enable-vda
libavutil 55. 34.100 / 55. 34.100
libavcodec 57. 64.100 / 57. 64.100
libavformat 57. 56.100 / 57. 56.100
libavdevice 57. 1.100 / 57. 1.100
libavfilter 6. 65.100 / 6. 65.100
libavresample 3. 1. 0 / 3. 1. 0
libswscale 4. 2.100 / 4. 2.100
libswresample 2. 3.100 / 2. 3.100
libpostproc 54. 1.100 / 54. 1.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'test.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf57.56.100
Duration: 00:05:12.89, start: 0.000000, bitrate: 1085 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 955 kb/s, 23.98 fps, 23.98 tbr, 24k tbn, 47.95 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 125 kb/s (default)
Metadata:
handler_name : SoundHandler
Output #0, avi, to 'test.avi':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
ISFT : Lavf57.56.100
Stream #0:0(und): Video: mpeg4 (FMP4 / 0x34504D46), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 23.98 fps, 23.98 tbn, 23.98 tbc (default)
Metadata:
handler_name : VideoHandler
encoder : Lavc57.64.100 mpeg4
Side data:
cpb: bitrate max/min/avg: 0/0/200000 buffer size: 0 vbv_delay: -1
Stream #0:1(und): Audio: mp3 (libmp3lame) (U[0][0][0] / 0x0055), 44100 Hz, stereo, fltp (default)
Metadata:
handler_name : SoundHandler
encoder : Lavc57.64.100 libmp3lame
Stream mapping:
Stream #0:0 -> #0:0 (h264 (native) -> mpeg4 (native))
Stream #0:1 -> #0:1 (aac (native) -> mp3 (libmp3lame))
Press [q] to stop, [?] for help
frame= 4146 fps=206 q=31.0 size= 16137kB time=00:02:52.95 bitrate= 764.3kbits/s speed=8.61x
[/blockquote]

A normal function cannot deliver live output to the user, because it returns a single result and then exits. It could only hand back the complete output after ffmpeg has finished.

I had to use a generator function to give the user live output on the console. As ffmpeg processes the frames, the generator reads the data and passes it to the user line by line.
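
The difference is easy to see in a small simulation. The sketch below has nothing to do with ffmpeg: collect_all and produce_lazily are hypothetical helpers that each produce three lines and record in a log when every line was produced and when it was displayed.

```python
def collect_all(log):
    """Normal function: builds every line before returning anything."""
    lines = []
    for n in range(3):
        log.append(f"produced line {n}")
        lines.append(f"line {n}")
    return lines

def produce_lazily(log):
    """Generator function: hands over each line as soon as it exists."""
    for n in range(3):
        log.append(f"produced line {n}")
        yield f"line {n}"

eager_log = []
for line in collect_all(eager_log):
    eager_log.append(f"displayed {line}")

lazy_log = []
for line in produce_lazily(lazy_log):
    lazy_log.append(f"displayed {line}")

print(eager_log)  # all three "produced" entries come first
print(lazy_log)   # "produced" and "displayed" entries alternate
```

With the normal function, nothing can be displayed until all the work is done. With the generator, each line is displayed right after it is produced, which is the behavior live output needs.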

The code for the generator in my project is shown below.

def get_live_output(self):
    """Yield live output from the opened subprocess.Popen object,
    one line at a time, reading it chunk by chunk."""
    line = ''  # the line of ffmpeg output built so far
    # each chunk is a single character, so only one-character line
    # endings can ever match ('\r\n' arrives as '\r' and then '\n')
    new_lines = ['\n', '\r']
    # the base parsing is the same for any kind of executable:
    # read a chunk, append it to line, yield line at a line ending,
    # reset line, and so on until all the data is read

    # the yield inside this loop is what makes the function a generator
    while True:  # loop until the child process has finished
        # when the output is not expected to contain errors, it is read
        # from stdout instead of stderr, where live ffmpeg output goes
        if self.stdout_read:  # read from stdout
            chunk = self.child_p.stdout.read(1)
        else:  # read from stderr
            chunk = self.child_p.stderr.read(1)
        # an empty chunk means end of stream; comparing with '' assumes
        # text-mode pipes (by default Python 3 pipes return bytes)
        if chunk == '' and self.child_p.poll() is not None:
            self.status_finished = True
            if line:  # do not lose a last line with no line ending
                yield line
            break  # leave the while True loop
        line += chunk  # build the line chunk by chunk
        if chunk in new_lines:  # line finished, yield it
            yield line
            line = ''  # reset line to an empty string

As you can see from the above piece of Python code, the generator reads data from a subprocess.Popen object, and every time it hits a newline character it yields the contents of the line variable.
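
For readers who want to try the idea without my wrapper class, here is a minimal standalone sketch. It is not the code from my project, and it makes several assumptions. stream_lines is a hypothetical helper, a small Python child process stands in for ffmpeg so that the example runs anywhere, stderr is merged into stdout, and the pipe is opened in text mode (text=True, available from Python 3.7). In text mode Python converts '\r' and '\r\n' line endings to '\n', so iterating over the pipe should also split ffmpeg-style progress lines that end in '\r'.

```python
import subprocess
import sys

def stream_lines(cmd):
    """Yield each line of the child's output as soon as it arrives."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout
        text=True,                 # decode bytes, normalize line endings
    )
    try:
        for line in proc.stdout:  # blocks until the next full line
            yield line.rstrip("\n")
    finally:
        proc.stdout.close()
        proc.wait()  # reap the child process

# Stand-in for ffmpeg: a child that prints three progress lines.
child = [sys.executable, "-c",
         "for i in range(3): print('frame', i, flush=True)"]

lines = []
for line in stream_lines(child):
    print(line)
    lines.append(line)
```

Replacing child with an ffmpeg command such as the one shown earlier should stream its output the same way, though I have only sketched that case here.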

Final thoughts

The Python generator is a bit hard to understand when it comes to its implementation under the hood. As a junior programmer, I had a lot of difficulty understanding this object. Having now used it in a real-world scenario in my personal project, I decided to share my experience with novice coders.
