Today, I would like to explore how a simple self-recursive function can be accelerated with frameworks and libraries like Python’s functools, numba, dask, and Tensorflow.
Recursive functions are essential to dynamic programming, which is arguably the most intuitive way of solving many problems because it closely resembles how we think about the procedure of solving them. However, programmers often get blocked by “maximum recursion depth exceeded” errors and seemingly infinite waits, even on a simple recursive function. Careful algorithm design is one way of tackling this. This article will focus on using existing frameworks and libraries to help speed things up. Let’s get started.
This algorithm is used to solve a problem I encountered at Data Squad Techjam 2018, a regional data science competition.
The problem asked me to find the password to a zip file. The password is the concatenation of the numbers of possible ways to pronounce the given words.
There are numerous approaches to solving this simple problem, but let’s keep things simple by using this one as a baseline.
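The competition code itself is not reproduced here; as a minimal stand-in, assume `getWaysOfReading` follows a Fibonacci-style recurrence (the number of ways to read a sequence of length n, taking symbols one or two at a time):

```python
# Stand-in for the competition function: a naive, uncached recursion.
# Each call spawns two more calls, so the runtime grows exponentially in n.
def getWaysOfReading(n):
    if n < 2:  # an empty or single-symbol sequence can be read one way
        return 1
    return getWaysOfReading(n - 1) + getWaysOfReading(n - 2)

print(getWaysOfReading(20))  # → 10946
```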
147 ms is the execution time of getWaysOfReading(20) on my MacBook Pro. I use n=20 because at n=30 the execution time already becomes far too long.
functools’ lru_cache
Caching the recursive function is one way to improve its speed. The standard library module functools provides an in-memory caching function, lru_cache, which can be applied as a Python decorator. I added a single line of code like this.
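As a sketch, again using the Fibonacci-style stand-in for the competition function, the decorated version looks like this:

```python
from functools import lru_cache

# maxsize=None keeps every result; the recursion only ever sees n distinct
# arguments, so memory use stays tiny while every subproblem is solved once.
@lru_cache(maxsize=None)
def getWaysOfReading(n):
    if n < 2:
        return 1
    return getWaysOfReading(n - 1) + getWaysOfReading(n - 2)

print(getWaysOfReading(200))  # finishes near-instantly even for large n
```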
Each time the function is called, lru_cache looks up the set of argument values in its table to check for a cache hit. On a hit, it returns the previously computed output instead of recomputing it.
190 µs is the new execution time. Surprisingly, I observed a roughly 1000x speedup from this one line of code.
The joblib library provides a disk-caching implementation that would also work perfectly, in case you need a larger-than-memory cache.
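A minimal sketch of the disk-cached variant, using joblib’s `Memory` on the same stand-in function (a temporary directory is used here for illustration; in real use you would point `location` at a persistent path):

```python
import tempfile
from joblib import Memory

# Persist memoized results to disk instead of RAM, so the cache can
# outgrow memory and survive between runs (with a persistent location).
memory = Memory(location=tempfile.mkdtemp(), verbose=0)

@memory.cache
def getWaysOfReading(n):
    if n < 2:
        return 1
    return getWaysOfReading(n - 1) + getWaysOfReading(n - 2)
```

Note that every recursive call now goes through a disk lookup, so per-call overhead is much higher than lru_cache; this trade-off only pays off when results are expensive or too large for memory.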
Numba’s Just-in-time Compiler
Imagine our Python code running at bare-metal C/C++ speed. The numba library has brought this magic to the Python community under the PyData umbrella.
This would be the implementation.
Sadly, this code gives an error, because recursion is not yet supported in the current numba implementation.
Dask’s Multiprocess Automation
Dask utilizes the processing cores that the Python GIL previously left idle. Using multiple cores can even increase execution time, due to the communication bottleneck between them; combined with the many trivial workloads being delivered to each core, I didn’t expect a huge boost in execution time. Anyway, let’s give it a shot.
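The article’s dask code is not shown, so here is one plausible sketch using `dask.delayed` on the stand-in function. Only the two top-level subcalls are turned into tasks, so each task carries a substantial chunk of work; wrapping every recursive call would drown the computation in scheduler overhead:

```python
from dask import delayed

def getWaysOfReading(n):
    if n < 2:
        return 1
    return getWaysOfReading(n - 1) + getWaysOfReading(n - 2)

def getWaysOfReadingDask(n):
    # Split only the top-level call into two independent delayed tasks.
    a = delayed(getWaysOfReading)(n - 1)
    b = delayed(getWaysOfReading)(n - 2)
    # Delayed objects support arithmetic, so a + b builds a small task graph;
    # pass scheduler="processes" to compute() to sidestep the GIL entirely.
    return (a + b).compute()

print(getWaysOfReadingDask(20))  # → 10946
```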
15 ms is spent on this function. Wow, that is a surprising 10x speedup. It also scales well: when I increased n tenfold to n=200, it took 16.6 ms (just 1.1 ms more).
Tensorflow’s Eager Execution
We are about to reach the destination~. Tensorflow is the last framework to try. It uses an approach similar to numba’s: it converts the code into efficient code that runs at the machine level. Let’s see where it takes us.
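The original post predates TensorFlow 2, where eager execution had to be switched on explicitly; the sketch below assumes TF 2.x, where eager mode is the default, and again uses the stand-in recurrence. Every arithmetic step becomes an individually dispatched TensorFlow op, which is where the overhead comes from:

```python
import tensorflow as tf

# Eager execution (the default in TF 2.x) runs each op immediately,
# so plain Python recursion over tensors works without building a graph.
def getWaysOfReading(n):
    if n < 2:
        return tf.constant(1)
    return getWaysOfReading(n - 1) + getWaysOfReading(n - 2)

print(int(getWaysOfReading(20)))  # → 10946
```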
31 sec is taken by this approach, using the same parameters on the same machine. The good news is that Tensorflow can easily be programmed in a recursive manner. The bad news is the large overhead: it is roughly 100x slower than our baseline.
And number 1 goes tooooo… lru_cache! It is followed by dask, our baseline, and then Tensorflow. numba can’t make it this time. Try harder again na~
That’s it. I hope this article sheds some light on these frameworks and libraries, so that next time you have a better sense of which one suits your job best.