Teensy 4.0 Brings 600 MHz Cortex-M7 to the Arduino World




Teensy 4.0 is an Arduino-compatible board with an Arm Cortex-M7 running at 600 MHz. That is right. Six. Hundred. Megahertz. Today, Paul Stoffregen and the PJRC team add the newest member to the Teensy product family. At $20, the Teensy 4.0 may be the best performance-per-dollar board available.

Teensy 4.0: 600 MHz Arduino-compatible board

What Is a Teensy?

PJRC produces a series of boards called the Teensy, with both 8-bit and 32-bit boards available. They are all compatible with the Arduino IDE and the Arduino library. In many cases, code written for another Arduino board works with minimal to no changes on a Teensy. As the name implies, these boards tend to be very small. For example, the current form factor is only about 18 by 36 millimeters. But do not let the size deceive you: these boards pack a ton of functionality. The new Teensy 4.0 features a megabyte of RAM, two megabytes of Flash, a bevy of I/O options, cryptographic support, a hardware floating-point unit (FPU), and a built-in real-time clock (RTC).

Here’s a list of the Teensy 4.0’s specs. I am not able to address everything on this list, but I have picked out a few highlights that I thought were interesting.

  • Arm Cortex-M7 at 600 MHz
  • 1024K RAM (512K tightly coupled)
  • 2048K Flash (64K reserved for recovery and EEPROM emulation)
  • 2 USB ports, both 480 MBit/sec
  • 3 CAN-Bus (1 with CAN FD)
  • 2 I2S digital audio
  • 1 S/PDIF digital audio
  • 1 SDIO (4 bit) native SD
  • 3 SPI, all with 16 word FIFO
  • 3 I2C, all with 4 byte FIFO
  • 7 Serial, all with 4 byte FIFO
  • 32 general purpose DMA channels
  • 31 PWM pins
  • 40 digital pins, all interrupt capable
  • 14 analog pins, 2 ADCs on chip
  • Cryptographic acceleration
  • Random number generator
  • RTC for date/time
  • Programmable FlexIO
  • Pixel processing pipeline
  • Peripheral cross triggering
  • Power on/off management

The last bullet point immediately caught my attention. It is by no means a high-tech capability. However, the Teensy 4.0 is finally a microcontroller board with an on/off control! The 4.0 also comes in the same form factor as PJRC’s Teensy 3.2, so they’ve packed all of those features into a 17.78 x 35.56 mm footprint.

Comparing Teensy 4.0 (left) and Teensy 3.6 (right)

600 MHz MCU!

At the heart of Teensy 4.0 is an NXP i.MX RT1060 series processor. It contains the Arm Cortex-M7, along with a variety of connectivity options, system controls, memory, power management, two 12-bit analog-to-digital converters, and a security module. For more details about the processor, check out the MIMXRT1062DVL6A’s datasheet.

A 600 MHz clock speed sounds impressive. But what can you do with that? One thought that comes to mind is machine learning. Imagine how great it would be to add an I2S microphone to enable sound detection, or perhaps to add gesture detection to a hands-off project. Before Teensy 4.0, you would probably need to consider a much more power-hungry single-board-computer platform. Speaking of power, Teensy 4.0 only draws about 100 mA when the clock is running at full speed!

i.MX RT1060 block diagram (📷: NXP)

The other thought that comes to mind is real-time signal analysis. Combine the speed with the floating-point unit (FPU), and you get a compelling math machine. Unlike the processors on almost all other Arduino boards, the Cortex-M7 does floating-point math in hardware, not software. Therefore, it is possible to use one of the onboard ADCs, available across the 14 analog input pins, to acquire a signal and do some first-order processing without passing the data to a PC.
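
To give a feel for that kind of on-board processing, here is a minimal sketch of a single-pole low-pass filter written in single-precision float, which the FPU handles in hardware. The `LowPass` struct, the `countsToVolts` helper, the 3.3 V reference, and the 12-bit result size are my own assumptions for illustration; they do not come from PJRC's examples. On the board, the raw counts would come from a call such as `analogRead()`.

```cpp
#include <cstdint>

// Convert a raw ADC count to volts.
// Assumes a 3.3 V reference and 12-bit results (0..4095), for example after
// calling analogReadResolution(12) in setup(). Adjust both for your project.
inline float countsToVolts(std::uint16_t raw, float vref = 3.3f, unsigned bits = 12) {
    const float fullScale = static_cast<float>((1u << bits) - 1u);
    return (static_cast<float>(raw) / fullScale) * vref;
}

// Single-pole (exponential moving average) low-pass filter.
// alpha in (0, 1]: smaller values smooth more but respond more slowly.
struct LowPass {
    float alpha;
    float state = 0.0f;
    bool primed = false;

    explicit LowPass(float a) : alpha(a) {}

    float update(float sample) {
        if (!primed) {
            // Seed with the first sample to avoid a ramp up from zero.
            state = sample;
            primed = true;
        } else {
            state += alpha * (sample - state);
        }
        return state;
    }
};

// In a sketch, loop() might do something like this (pin A0 is a hypothetical choice):
//   float volts    = countsToVolts(analogRead(A0));
//   float smoothed = filter.update(volts);
```

Everything here stays in `float`, so each update is a handful of hardware floating-point instructions rather than a software library call.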

The 600 MHz clock itself is not really a defining reason to get the board. Yet, knowing that speed gives you an immediate sense for the kind of performance to expect from the tiny board. Speaking of performance, what about some benchmark numbers?

Teensy 4.0 Benchmarks

Stoffregen provided two benchmark code examples. These give some context for the power of the board. The first is a CoreMark test, which is a synthetic workload. However, it provides a consistent way to compare single-core microcontrollers. In the graph below, you can see how the Teensy 4.0 compares to other small microcontroller boards. In this case, larger numbers are better. Compared to the Teensy 3.6, the 4.0 is over five times faster! I find this comparison interesting since the Teensy 3.6 is clocked about three times slower at “only” 180 MHz.

CoreMark benchmark comparing Teensy 4.0 to other Arduino platforms

The next test uses mbed TLS, which is an open source cryptographic library for small platforms that enables SSL. In this case, Stoffregen’s example uses RSA to sign a string. I think this example is a realistic workload for a processor like the one found in the Teensy 4.0. For example, I could see an IoT device signing messages before passing them through a mesh network.

Time to calculate an RSA signature comparing Teensy 4.0 to other Arduino platforms

The Teensy 4.0 completes the task in around 85 milliseconds. The Teensy 3.6, though, is almost six times slower at 474 ms. Since the clock is only about 3.3 times faster, it is apparent that differences between the M4 and M7 architectures contribute to performance on top of the bump in clock speed.
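
As a back-of-the-envelope check, the figures quoted above can be split into a clock-speed part and a per-clock part. This is only a rough sketch: the function and constant names are mine, and it assumes the benchmark scales linearly with clock rate, which ignores memory, cache, and compiler effects.

```cpp
// Figures quoted in this article.
constexpr float kTeensy40Mhz   = 600.0f;
constexpr float kTeensy36Mhz   = 180.0f;
constexpr float kTeensy40RsaMs = 85.0f;   // "around 85 milliseconds"
constexpr float kTeensy36RsaMs = 474.0f;

// Overall speedup on the RSA signing test: about 5.6x.
constexpr float overallSpeedup() { return kTeensy36RsaMs / kTeensy40RsaMs; }

// Speedup expected from the clock rate alone: about 3.3x.
constexpr float clockSpeedup() { return kTeensy40Mhz / kTeensy36Mhz; }

// What remains once the clock ratio is divided out: about 1.7x more work per cycle.
constexpr float perClockSpeedup() { return overallSpeedup() / clockSpeedup(); }
```

By this estimate, the higher clock accounts for the larger share of the gain, with the remaining factor of roughly 1.7 coming from everything else that changed between the two boards.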

Differences Between M0, M4 and M7

If you’re like me, you find it easy to get confused by all of the various Arm processor cores available. Using Arduino-compatible boards as a baseline, three Arm cores are popular: M0+, M4, and (now) M7.

M0+ is a minimal processor focused on energy-efficiency. The M4 adds a hardware divider and multiply-accumulate instructions, which speeds up math while also supporting additional DSP instructions. Depending on the implementation, it can also support a floating-point unit (FPU). The M7 is Arm’s high-performance core intended for intensive processing applications. It takes the M4 a step further with double-precision floating-point, cache, and the concept of tightly coupled memory (TCM).

TCM is a distinctive feature intended to improve performance. It is not a cache. Instead, TCM is a specific area of the RAM. The CPU has fast (single-cycle) access to instructions and data stored in their respective TCM areas. Teensyduino, the Arduino IDE extension, allocates the sketch code and variables to these fast memory areas. It is possible to override this behavior, as well as to use malloc() to allocate from the memory space outside of the TCM.

For most users, the default behavior means having access to this performance feature without additional code.
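
For those who do want to steer placement by hand, here is a minimal sketch. It assumes the `FASTRUN` and `DMAMEM` placement macros that, as far as I know, the Teensyduino core provides for the Teensy 4.0; check your installed version before relying on them. The buffer sizes and the function names (`sumSamples`, `allocateScratch`, `demoPlacement`) are hypothetical, and the empty fallback defines exist only so the snippet also builds off-target.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Assumption: Teensyduino defines these macros for the Teensy 4.0.
// The empty fallbacks only let this snippet compile on other toolchains.
#ifndef FASTRUN
#define FASTRUN
#endif
#ifndef DMAMEM
#define DMAMEM
#endif

// Ordinary globals go to the fast tightly coupled data memory by default.
static std::uint16_t fastSamples[256];

// DMAMEM requests the RAM region outside the TCM for a large buffer.
// The code fills it before reading and does not rely on its startup contents.
DMAMEM static std::uint16_t bulkSamples[4096];

// FASTRUN explicitly requests placement in tightly coupled instruction memory.
FASTRUN std::uint32_t sumSamples(const std::uint16_t *data, std::size_t count) {
    std::uint32_t total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += data[i];
    }
    return total;
}

// Heap allocations come from the memory space outside of the TCM.
// The caller owns the result and must release it with free().
std::uint16_t *allocateScratch(std::size_t count) {
    return static_cast<std::uint16_t *>(std::malloc(count * sizeof(std::uint16_t)));
}

// Fill both buffers with known values and sum them.
std::uint32_t demoPlacement() {
    for (std::size_t i = 0; i < 256; ++i) fastSamples[i] = 1;
    for (std::size_t i = 0; i < 4096; ++i) bulkSamples[i] = 2;
    return sumSamples(fastSamples, 256) + sumSamples(bulkSamples, 4096);
}
```

The idea is to keep small, hot data and code in TCM and push large, less time-critical buffers to the rest of the RAM, so the fast region is not used up by bulk storage.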

Full Arduino IDE Support

Speaking of Arduino, the IDE supports the Teensy 4.0. However, the installation does not use the board manager. Instead, you need to download the Teensyduino installer. In the case of the Teensy 4.0, make sure you have at least Arduino IDE 1.8.9 installed. In addition to adding examples and board support, there is also a small program that transfers programs to the Teensy. For this reason, you cannot use the Arduino Web Editor. You need to use the offline, or downloadable, version of the IDE.

As of this writing, most of the core Arduino functionality has been confirmed to work with the Teensy 4.0. There has also been a focus on graphic displays, like those supported by the Adafruit GFX library. My experience with all of PJRC’s boards has been that code generally works out of the box. Stoffregen and team do a fantastic job of making their products compatible with Arduino code. I see no changes to that effort with the 4.0 board.

Very Flexible General Purpose IO

With every Teensy board, there is a two-sided card. This card identifies all of the functions available on each pin. Color blocks group the functions for easy identification.

Fortunately, many of the functions repeat across multiple pins. It is realistic to combine multiple UART, I2C, SPI, GPIO, analog inputs, and PWM outputs in a single project. Like most 3.3 volt boards, Teensy 4.0’s I/O pins are limited to 3.3 volt operation. They are NOT 5.0 volt tolerant. However, with 14 analog input pins, 40 digital pins, 31 PWM-capable pins, and multiple serial interfaces, that is a minuscule trade-off.

Also, you might notice there is a VBat pin. That pin keeps the real-time-clock (RTC) running when the board is off. (The board does not have onboard battery charging or monitoring.)

Final Thoughts

Like all products, there are a couple of trade-offs. I wish the Teensy 4.0 included battery charging and WiFi. However, for the form factor, I do not know if either would have been possible. Even after adding an Adafruit PowerBoost and an ESP for WiFi, the overall size of a solution would still be very small. So I am okay with those two omissions. The Teensy 4.0 is about processing power and flexibility, which is why I became so excited when I first saw it.

As a longtime user of PJRC’s Teensy boards, I find this one a very welcome addition. I have a machine learning project in the works, which is what I plan to do with my 4.0. What kind of projects do you have in mind for a tiny 600 MHz Arduino board? Let me know in the comments. For more information, check out the PJRC Teensy 4.0.


Teensy 4.0 Brings 600 MHz Cortex-M7 to the Arduino World was originally published in Hackster Blog on Medium.





Original article: Teensy 4.0 Brings 600 MHz Cortex-M7 to the Arduino World
Author: James Lewis