Python generators are functions that return a traversal object and are used to create iterators. Rather than building all the items at once, a generator produces them one at a time, only when requested. A generator can also be written as an expression whose syntax is similar to a list comprehension in Python.
Creating an iterator in Python involves some complexity: we need to implement the __iter__() and __next__() methods and keep track of the internal state ourselves, raising a StopIteration exception when there are no more values to return. This is a lengthy process, which is why generators play an essential role in simplifying it.
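For comparison, here is a minimal sketch of a class-based iterator (the class name EvenNumbers and its limit parameter are illustrative, not part of any library) showing the boilerplate that a generator removes:

class EvenNumbers:
    """Iterator that produces the even numbers from 0 up to (but not including) limit."""
    def __init__(self, limit):
        self.limit = limit
        self.num = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.num >= self.limit:
            raise StopIteration
        value = self.num
        self.num += 2
        return value

for n in EvenNumbers(10):
    print(n)   # 0 2 4 6 8

The same result is produced by the short generator function shown below, with no explicit state tracking.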
It is quite simple to create a generator in Python. It is defined like a normal function with the def keyword, but uses the yield keyword instead of return. In other words, if the body of a function contains a yield statement, it automatically becomes a generator function. Consider the following example:
def simple():
    for i in range(10):
        if i % 2 == 0:
            yield i

# Successive calls to the generator using a for loop
for i in simple():
    print(i)
The yield statement is responsible for controlling the flow of the generator function. It pauses the function's execution, saving all of its state, and hands the yielded value back to the caller. Execution resumes from that point on the next call. A return statement, by contrast, returns a value and terminates the whole function; once it executes, the generator produces no further values.
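A small sketch makes this pause-and-resume behaviour visible (the function name countdown is illustrative); each call to next() runs the body only up to the next yield:

def countdown(n):
    print("Starting")
    while n > 0:
        yield n          # execution pauses here; the value of n is saved
        n -= 1
    print("Finished")    # runs only once the generator is exhausted

gen = countdown(3)
print(next(gen))   # prints "Starting", then 3
print(next(gen))   # resumes after the yield, prints 2
print(next(gen))   # prints 1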
Using Multiple yield Statements
We can use multiple yield statements in a generator function. Consider the following example.
def multiple_yield():
    str1 = "First String"
    yield str1
    str2 = "Second string"
    yield str2
    str3 = "Third String"
    yield str3

obj = multiple_yield()
print(next(obj))
print(next(obj))
print(next(obj))
Generator Expression
We can also create a generator expression without defining a user-defined function. Just as a lambda expression creates an anonymous function, a generator expression creates an anonymous generator.
The syntax of a generator expression is similar to that of a Python list comprehension; the only difference is that the square brackets are replaced by round parentheses. A list comprehension computes the entire list at once, whereas a generator expression computes one item at a time.
Consider the following example:
nums = [1, 2, 3, 4, 5, 6, 7]

# List comprehension
z = [x**3 for x in nums]

# Generator expression
a = (x**3 for x in nums)

print(a)
print(z)
In the above program, the list comprehension returns the full list of cubes, whereas the generator expression returns only a generator object; its values are computed lazily, one at a time. Instead of iterating with a for loop, we can also call next() on the generator object. Let's consider another example:
nums = [1, 2, 3, 4, 5, 6]
z = (x**3 for x in nums)

print(next(z))
print(next(z))
print(next(z))
print(next(z))
In the above program, the next() function returns the next item produced by the generator.
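Once the generator is exhausted, a further call to next() raises StopIteration; passing a default value as the second argument to next() avoids the exception. A short sketch:

z = (x ** 3 for x in [1, 2])
print(next(z))           # 1
print(next(z))           # 8
print(next(z, 'done'))   # generator is exhausted, so the default 'done' is returned
# next(z) without a default would raise StopIteration here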
Example: Write a program to print the multiplication table of a given number using a generator.
def table(n):
    for i in range(1, 11):
        yield n * i

for i in table(15):
    print(i)
Output:

15
30
45
60
75
90
105
120
135
150
In the above example, the generator function is iterated using a for loop.
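If all the values are needed at once, the generator can also be consumed with list(). Note that a generator can be iterated only once, so a fresh generator object is needed for each pass. A small sketch reusing the table() function above:

print(list(table(5)))   # [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]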
Advantages of Generators
There are various advantages of generators. A few of them are given below:
Generators are easy to implement compared to iterators. For an iterator class, we have to implement the __iter__() and __next__() methods ourselves.
Generators are memory efficient for large sequences. A normal function that returns a list builds the entire sequence in memory before returning the result, whereas a generator function computes one value at a time, pauses its execution, and resumes on each successive call. An infinite sequence generator is a great example of this memory optimization. Let's examine it in the example below using the sys.getsizeof() function.
import sys

# List comprehension: builds all 1000 values in memory
nums_squared_list = [i ** 2 for i in range(1000)]
print("Memory in Bytes:", sys.getsizeof(nums_squared_list))

# Generator expression: produces values lazily, one at a time
nums_squared_gc = (i ** 2 for i in range(1000))
print("Memory in Bytes:", sys.getsizeof(nums_squared_gc))
We can observe from the output that the list comprehension uses several thousand bytes of memory, whereas the generator expression uses only around a hundred bytes (the exact figures depend on the Python version and platform). In other words, the generator object is far more memory efficient than the list comprehension.
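The saving matters most when the sequence is consumed only once, for example by an aggregate function such as sum(). A minimal sketch (the range size is arbitrary):

# The generator expression feeds sum() one value at a time,
# so the full list of squares is never held in memory.
total = sum(i ** 2 for i in range(1_000_000))
print(total)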
Generators also make it easy to build data pipelines that process large datasets or streams of data without using extra memory.
Suppose we have a log file from a popular restaurant. The log file has a column (the fourth column) that records the number of burgers sold every hour, and we want to sum that column to find the total number of burgers sold over four years. In that scenario, generators can form a pipeline as a series of operations. Below is the code for it:
with open('sells.log') as file:
    # Extract the fourth column of each line (assuming whitespace-separated columns)
    burger_col = (line.split()[3] for line in file)
    # Skip missing entries and convert the rest to integers
    per_hour = (int(x) for x in burger_col if x != 'N/A')
    print("Total burgers sold =", sum(per_hour))
Generators can also produce an infinite stream of items. An infinite sequence cannot be stored in memory, but since a generator produces only one item at a time, it can represent one. Consider the following example:
def infinite_sequence():
    num = 0
    while True:
        yield num
        num += 1

# This loop runs forever; each value is produced on demand
for i in infinite_sequence():
    print(i)
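Because the for loop above never terminates, an infinite generator is normally consumed a bounded slice at a time, for example with itertools.islice. A minimal sketch:

from itertools import islice

# Take only the first five values from the infinite generator
for i in islice(infinite_sequence(), 5):
    print(i)   # 0 1 2 3 4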