Using TensorFlow Lite we see a considerable speed increase when compared with the original results from our previous benchmarks using full TensorFlow.
Inferencing is between ×3 and ×4 faster with TensorFlow Lite than it was in our original TensorFlow benchmark. This decrease in inferencing time brings the Raspberry Pi 4 directly into competition with the NVIDIA Jetson Nano.
A single 3888×2916 pixel test image was used containing two recognisable objects in the frame, a banana🍌 and an apple🍎. The image was resized down to 300×300 pixels before presenting it to the model, and each model was run 10,000 times before an average inferencing time was taken. The first inferencing run, which takes longer due to loading overheads, was discarded.
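The timing procedure described above can be sketched roughly as follows. This is a minimal illustration, not the benchmark script itself: `benchmark`, `speedup`, and the stand-in `fake_inference` are hypothetical names, and it assumes the discarded first run is counted as one of the total runs.

```python
import time
from statistics import mean


def benchmark(run_inference, runs=10_000):
    """Call run_inference() `runs` times and return the mean time in ms.

    The first run is discarded, since it includes loading overheads.
    (Assumption: the discarded run counts as one of the `runs` calls.)
    """
    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        run_inference()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return mean(timings_ms[1:])  # drop the first, slower run


def speedup(baseline_ms, new_ms):
    """How many times faster the new result is than the baseline."""
    return baseline_ms / new_ms


if __name__ == "__main__":
    # Stand-in workload; a real run would invoke the model on the
    # 300x300 resized image here instead.
    def fake_inference():
        sum(range(1000))

    print(f"mean time: {benchmark(fake_inference, runs=100):.4f} ms")
```

Passing the inference call in as a function keeps the timing loop independent of whichever framework is being measured, so the same harness can wrap both the TensorFlow and the TensorFlow Lite models.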
⚠️ Warning As with our previous results on the Raspberry Pi 4, a small fan, driven from the Raspberry Pi’s own GPIO headers, was needed to keep the CPU temperature stable and prevent thermal throttling of the CPU.
Our initial TensorFlow results on the new Raspberry Pi 4 showed a ×2 increase in performance. This is roughly in line with expectations: with twice the NEON capacity of the Raspberry Pi 3, we would expect this order of speedup for well-written NEON kernels.
However, we see a significantly larger speed increase with TensorFlow Lite, with a ×3 to ×4 increase in inferencing speed between our TensorFlow benchmark and the new results using TensorFlow Lite. This is much larger than the result of the same comparison on the Raspberry Pi 3, where we saw only a ×2 increase in performance between the two packages. We are therefore seeing almost double the expected speed gain by using TensorFlow Lite over TensorFlow on the Raspberry Pi 4.
This decrease in inferencing time brings the Raspberry Pi 4 directly into competition with both the NVIDIA Jetson Nano and the Movidius-based hardware from Intel.
⚠️ Warning It is probable that the Movidius Neural Compute Stick and the Intel Neural Compute Stick 2 will show better performance when connected to the Raspberry Pi 4 using USB 3 rather than USB 2. However, until the OpenVINO framework supports Python 3.7 it is impossible to know for certain. Right now the Movidius-based hardware from Intel is not usable with the Raspberry Pi 4.
If you were looking at purchasing the NVIDIA Jetson Nano to use for machine learning, there now seems no reason to do so as the Raspberry Pi 4 performs at a similar level, but for half the cost.
The performance increase seen with the new Raspberry Pi 4 makes it a very competitive platform for machine learning inferencing at the edge. The increase in inferencing performance we see with TensorFlow Lite on the Raspberry Pi 4 puts it directly into competition with the NVIDIA Jetson Nano and the Intel Neural Compute Stick 2.
Priced at $35 for the 1GB version, and $55 for the 4GB version, the new Raspberry Pi 4 is significantly cheaper than both the NVIDIA Jetson Nano and the Intel Neural Compute Stick 2, each of which costs $99. For the Compute Stick the gap is wider still, since its price is in addition to the cost of the Raspberry Pi itself, which brings the total to $134.
While the Coral Dev Board from Google is still the ‘best in class’ board, the addition of USB 3 to the Raspberry Pi 4 means that it is now also price competitive with the Dev Board. Priced at $35, the 1GB version of the new Raspberry Pi 4 is significantly cheaper than the $149 Coral Dev Board. Adding $74.99 for the Coral USB Accelerator to the price of the Raspberry Pi means that you can outperform the previous ‘best in class’ board for a cost of $109.99. That’s a saving of $39.01 over the cost of the Coral Dev Board, for better performance.
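For anyone wanting to check the sums, the totals quoted above work out as follows. This is just the arithmetic from the text, using the prices as stated there; amounts are held in cents to avoid floating-point rounding, and the variable names are illustrative.

```python
# Prices in US cents, as quoted in the text.
PI4_1GB = 3500
NCS2 = 9900              # Intel Neural Compute Stick 2
CORAL_USB = 7499         # Coral USB Accelerator
CORAL_DEV_BOARD = 14900


def dollars(cents):
    """Format an integer number of cents as a dollar amount."""
    return f"${cents // 100}.{cents % 100:02d}"


pi_plus_ncs2 = PI4_1GB + NCS2
pi_plus_coral_usb = PI4_1GB + CORAL_USB
saving = CORAL_DEV_BOARD - pi_plus_coral_usb

print("Pi 4 + Compute Stick 2:   ", dollars(pi_plus_ncs2))       # $134.00
print("Pi 4 + Coral USB:         ", dollars(pi_plus_coral_usb))  # $109.99
print("Saving over the Dev Board:", dollars(saving))             # $39.01
```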
Part II — Methodology
Installing TensorFlow Lite on the Raspberry Pi
Installing TensorFlow on the Raspberry Pi used to be a difficult process, but towards the middle of last year everything became a lot easier. Fortunately, thanks to the community, installing TensorFlow Lite isn’t that much harder, and we aren’t going to have to resort to building it from source.
Go ahead and download the latest release of Raspbian Lite and set up your Raspberry Pi. Unless you’re using wired networking, or have a display and keyboard attached to the Raspberry Pi, at a minimum you’ll need to put the Raspberry Pi on to your wireless network, and enable SSH.
Once you’ve set up your Raspberry Pi go ahead and power it on, and then open up a Terminal window on your laptop and SSH into the Raspberry Pi.
ℹ️ Information If you’re working on an existing installation, and you already have the official version of TensorFlow installed, you should make sure you have uninstalled it first, by doing sudo pip3 uninstall tensorflow.
While there isn’t yet a build of TensorFlow Lite specifically for Python 3.7, we can make use of one of the Python 3.5 builds. However, you’ll need to make some tweaks before installation.
It’ll take some time to install, so you might want to take a break and get some coffee. Once it has finished installing you can test the installation as follows.
$ python3 -c "import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))"
⚠️ Warning You will receive ‘Runtime Warnings’ when you import tensorflow. These aren’t a concern and just indicate that the wheel was built under Python 3.5 and you’re using it with Python 3.7. You can safely ignore the warnings.
Now that TensorFlow has been successfully installed we need to install OpenCV, the Pillow fork of the Python Imaging Library (PIL), and the NumPy library.
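Once the three libraries are installed, a quick way to confirm that Python can actually see them is sketched below. The import names `cv2`, `PIL`, and `numpy` are the usual ones for OpenCV, Pillow, and NumPy; `missing_modules` is a hypothetical helper written for this check, not part of any of those packages.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names for OpenCV, Pillow and NumPy respectively.
REQUIRED = ["cv2", "PIL", "numpy"]

missing = missing_modules(REQUIRED)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("OpenCV, Pillow and NumPy are all importable.")
```

Using `find_spec` rather than a bare `import` means the check reports every missing library in one pass instead of stopping at the first failure.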
If you’re interested in getting started with any of the accelerator hardware I used during my benchmarks, I’ve put together getting started guides for the Google, Intel, and NVIDIA hardware I looked at during the analysis.