Python, Recursive Function and Various Speedup!

Image for post
Image for post
Photo by Joel Fulgencio on Unsplash

Today, I would like to explore how simple self-recursive function could be accelerated via frameworks/libraries like Python’s functools, numba, dask, and tensorflow.

Recursive functions are essential to dynamic programming which is arguably the most intuitive way of many solving problems because it closely resembles how we think about the procedure of solving those problems. However, programmers usually get obstructed by the “too depth recursion” errors and seemingly infinite waiting time even on a simple recursive function. Careful algorithm design is one way of tackling this. This article will focus on using existing frameworks/libraries to help in speeding up. Let’s get started.

Our Baseline

This algorithm is used to solve a problem I encountered a regional data science competition at Data Squad Techjam 2018.

Image for post
Image for post
The Original Problem Statement in Thai

The problem wanted me to find a password to a zip file. The password is the concatenation of a number of possible ways to pronounce those words.

There are numerous approaches to solve this simple problem, but let’s keep this simple by using this one as a baseline.

147 ms is this function getWaysOfReading(20) execution time on my MacBook Pro.

I used n=20 because when n=30 the execution time becomes too long.

lru_cache of functools

Caching recurve function is one way to improve this function speed. There is a standard Python library called functools . It has a memory caching function lru_cache . It could be used as a Python decorator. I added a line of code like this.

Each time this function is called with the same the set of arguments(same values), the lru_cache will try to look up it the table if the cache is hit or not. If it is hit, it will return the previous function output instead.

190 µs is an execution time. Surprisingly, I observed a roughly 1000x speedup with this one line of code.

There is a disk caching implementation from joblib library that would also work perfectly. In case, you need a larger-than-memory caching.

Numba’s Just-in-time Compiler

Imagine that our Python code could be as fast as bare metal C/C++ speed. numba library has bought this magic to Python community under the Pydata umbrella.

This would be the implementation.

Sadly that, this code would give an error. Because numba recursion is not yet implemented in the current implementation.

Dask’s Multiprocess Automation

Dask utilizes the unused processing cores that previously limit by the Python GIL. The approach of using multiple cores might increase execution time due to the bottleneck communication between multiple cores. In combination with many trivial workloads being delivered to each core, I don’t expect a huge boost in execution time. Anyway, let’s give it a shot.

15 ms is spent on this function. Wow, that gives a surprising 10x speedup. It also scales well when I increased n by 10 times to n=200 with 16.6 ms (just +1.1 ms).

Tensorflow’s Eager Execution

We are about to reach the destination~. Tensorflow is the last framework to try. It uses a similar approach like numba. It converts the code to efficient code which runs on the machine level. Let’s see where it take us to.

31 sec is taken by this approach using the same parameter and same machine. The good news is that we know that tensorflow could be easily programmed in a recursive manner. The bad news is that a lot of overhead is there. It is 100x slower than our baseline.

Number 1 goes forrrrrrr rrr r r….

👏👏👏 Python’s functools.👏👏👏

Follow by, dask, our baseline and tensorflow

numba can’t make it. Try harder again na~ numba san.

That’s it. I hope this article shed some light onto these frameworks/libraries, so next time you get a better sense of which one is suited your job most.

Written by

A quant, ML Engineer— Pythonista — hacker.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store