This is my final report on the IOTA Crypto Core project. In case you missed the last one, click here.
Nine interesting months of work are coming to an end. Time to sum everything up!
The first milestone was the PiDiver. This work was done before the Ecosystem funding was announced, and I’m glad I could include it in the proposal, which gave me nice compensation for the work I had done!
There is also powsrv, a PoW-as-a-service offering for outsourcing PoW to highly specialized and efficient hardware based on the PiDiver.
IOTA Crypto FPGA Core
The second milestone (reports here, here and here) was the development of an FPGA core which provides most IOTA core functions with hardware acceleration. It offers an easy-to-use high-level API, while computationally intensive low-level calculations are off-loaded to specialized logic, which gives a significant speed advantage over a software-only solution.
For this work, an off-the-shelf Arty S7 board was used and a HAT was designed for it. The HAT carried a secure element, SPI flash (which was never used) and a W5500 Ethernet controller.
Inside the FPGA, a Cortex M1 (ARM) soft-cpu is running (@ 100MHz) which can be programmed in C/C++. The firmware is protected by security mechanisms the FPGA provides. It should be hard for attackers to gain unauthorized access to seeds (which are stored in secure memory) or to tamper with the firmware.
Also inside the FPGA, there is specialized logic which accelerates hashing (Curl-P81 including PoW, Keccak384 as used in Kerl, and Troika) and type conversions (binary <-> trinary). These specialized logic blocks are called hardware accelerators and give a nice performance boost (hashing by a factor of more than 150) compared to software-only implementations on the soft-cpu. They are often faster than hashing on a PC CPU running at e.g. 3GHz (some speed comparisons here and here).
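To give an idea of what the binary <-> trinary conversion involves, here is a minimal Python sketch. It only shows the principle on ordinary integers and is not the accelerator's logic: IOTA uses balanced trits (-1, 0, 1), and Kerl pairs 243 trits with a 384-bit integer, whereas the function names and the small lengths used here are my own choices for illustration.

```python
def int_to_trits(value, length):
    """Convert an integer to balanced trits (-1, 0, 1), least significant first."""
    trits = []
    for _ in range(length):
        r = value % 3              # Python's % yields 0, 1 or 2, also for negatives
        if r == 2:
            r = -1                 # digit 2 becomes -1 plus a carry into the next trit
        trits.append(r)
        value = (value - r) // 3
    if value != 0:
        raise ValueError("value does not fit into the requested number of trits")
    return trits


def trits_to_int(trits):
    """Convert balanced trits (least significant first) back to an integer."""
    value = 0
    for t in reversed(trits):
        value = value * 3 + t
    return value


print(int_to_trits(5, 3))          # 5 = -1 - 3 + 9
print(trits_to_int([-1, -1, 1]))
```

In software this is a chain of divisions by 3 over a very wide number, which is the kind of work that is expensive on a small 32-bit CPU and therefore a natural candidate for dedicated logic.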
IOTA Crypto FPGA Module
The next milestone (report here) was to build dedicated hardware for the FPGA core developed in the last milestone.
The result was a small (30x26mm) FPGA module which can be put in a mini-PCIe socket (but it’s not mini-PCIe!).
A small test board was also developed for this module.
Originally I thought I would never use this board again … But I was wrong. I like it and I use it for every change 🙂
Linux SoM (System-on-Module)
One further part was a SoM (report here) which could make use of the FPGA module. Originally, a slightly bigger microcontroller was planned but it was replaced by a more potent microcontroller capable of running full main-line Linux. It also has 128MB RAM which makes it suitable for a lot of applications.
The ATSAMA5D27 has very nice security features like booting signed Linux kernels and it needs little power (1/2W). It also has a nice set of peripherals like 10/100MBit ethernet, USB host and other serial interfaces (I2C, SPI, UART).
Milestone 4 — Gateway
To make use of the Linux SoM, a gateway board was designed. It made most peripherals accessible and additionally provided a 3G mobile modem, WiFi and Bluetooth.
Put it all together
In the image above, the sensor is connected to the Gateway via BLE (Bluetooth Low Energy) and 6LoWPAN (IPv6 over Bluetooth) and publishes MQTT packets. Software running on the Gateway subscribes to the MQTT topic and uses the FPGA module to efficiently build IOTA data transactions, which are then sent to the tangle.
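As a rough illustration of the step between "MQTT payload received" and "data transaction built", the following Python sketch encodes a payload into the tryte message field of a transaction. The tryte alphabet, the two-trytes-per-byte ASCII encoding and the 2187-tryte fragment length are standard IOTA conventions. The payload format and the function names are hypothetical, and the actual Gateway software (which hands the heavy lifting to the FPGA module) may look quite different.

```python
TRYTE_ALPHABET = "9ABCDEFGHIJKLMNOPQRSTUVWXYZ"
MESSAGE_TRYTES = 2187  # length of a transaction's signatureMessageFragment


def ascii_to_trytes(text):
    """Encode ASCII text as trytes: each byte becomes two trytes (value = t0 + 27 * t1)."""
    trytes = []
    for byte in text.encode("ascii"):
        trytes.append(TRYTE_ALPHABET[byte % 27])
        trytes.append(TRYTE_ALPHABET[byte // 27])
    return "".join(trytes)


def build_message_fragment(payload):
    """Pad the encoded payload with '9' (tryte value 0) to one full fragment."""
    trytes = ascii_to_trytes(payload)
    if len(trytes) > MESSAGE_TRYTES:
        raise ValueError("payload needs more than one transaction")
    return trytes.ljust(MESSAGE_TRYTES, "9")


# Hypothetical sensor reading as it might arrive on the MQTT topic
fragment = build_message_fragment('{"co2": 415}')
print(fragment[:24], len(fragment))
```

Everything after this point (hashing, PoW, signing where needed) is the part the hardware accelerators are meant for.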
There are some side-projects which originally were not planned in the Ecosystem proposal.
At the end of December, the new light-weight hashing algorithm Troika was announced. Although it was not included in the EDF budget plan, an FPGA implementation of Troika was developed, in two versions: one high-speed implementation (1 clock cycle per hashing round) for the IOTA Crypto Core FPGA and one slower variant for $5 FPGAs (picture below).
A lot of time was also put into optimizing the reference implementation of Troika. But it turned out that someone else (called the “silent-hero”) did a way better job 😂
A SIMD version was implemented as well (up to AVX256), which can compute multiple hashes at once. Time will tell whether it is of any use or not 🙂
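The idea behind hashing several inputs at once can be sketched with bit-slicing: one trit from each of many independent hash states is packed into the same bit position of two machine words, so a handful of bitwise operations processes all lanes together. The Python sketch below shows this for a single operation, addition modulo 3. It is an illustration of the general technique, not the code of the SIMD implementation mentioned above; the two-bit encoding and all names are assumptions of mine.

```python
LANES = 64                      # e.g. one 64-bit word; AVX registers are wider
MASK = (1 << LANES) - 1


def pack(trits):
    """Pack one trit (0, 1, 2) per lane into two bit planes (hi, lo)."""
    hi = lo = 0
    for lane, t in enumerate(trits):
        if t == 1:
            lo |= 1 << lane
        elif t == 2:
            hi |= 1 << lane
    return hi, lo


def unpack(hi, lo, n):
    """Read the first n lanes back as a list of trits."""
    return [2 * ((hi >> lane) & 1) + ((lo >> lane) & 1) for lane in range(n)]


def add_mod3(a, b):
    """Add two packed trit vectors modulo 3 in all lanes at once."""
    a_hi, a_lo = a
    b_hi, b_lo = b
    a_zero = ~(a_hi | a_lo) & MASK
    b_zero = ~(b_hi | b_lo) & MASK
    # result is 1 for (0,1), (1,0), (2,2); result is 2 for (0,2), (2,0), (1,1)
    lo = (a_zero & b_lo) | (a_lo & b_zero) | (a_hi & b_hi)
    hi = (a_zero & b_hi) | (a_hi & b_zero) | (a_lo & b_lo)
    return hi, lo


a = [0, 1, 2, 2, 1, 0]
b = [2, 2, 2, 1, 1, 0]
hi, lo = add_mod3(pack(a), pack(b))
print(unpack(hi, lo, len(a)))   # lane-wise (a + b) mod 3
```

The cost of the bitwise operations is the same whether one lane or all lanes carry useful data, so the approach only pays off when there really are many independent hashes to compute.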
One part not mentioned in the EDF proposal was a sensor. I needed one, so I built one. I chose an nRF52 module (UBlox NINA B112), a CO2 sensor (Sensirion SCD30) and an E-Paper display (Waveshare). The VERY nice thing about the nRF52 is that it can be programmed like any other Cortex M based microcontroller! So, the sensor only needed the UBlox module as its main processor, and Bluetooth (BLE) came for free with it. I love the nRF52!
Perhaps you noticed … The architectural overview always contained MAM but it was more or less excluded. Partly because the new MAM wasn’t ready, partly because it wasn’t clear if MAM would run on the FPGA — and which part of it.
The Cortex M1 has very limited resources (128kB ROM, 128kB+40kB RAM). Only in the last few days was an attempt made to get MAM running on the FPGA module, which is why the end of the project was postponed by 3 days.
And it worked … Of course, this is only a proof of concept and a lot of development would be needed to make it production-ready, but it shows that it’s possible and, best of all, fast enough to be of practical use. MAM can benefit a lot from the Troika accelerator. Here are some comparisons of the Cortex M1 with and without hardware acceleration and a 3.4GHz i5 CPU:
The hardware acceleration gives a speed-up of about 148x over a software-only implementation on the Cortex M1, and it can (almost) keep up with my i5, at a fraction of the power (<1W) and at only 100MHz.
This all was — more or less — proof-of-concept work and it will turn out which parts will be used more often than others but I think it gives a nice base for further development.
Personally, I like the FPGA module most because it lends itself very nicely to IoT-like stand-alone applications. It provides hashing power that would normally require a PC CPU, but needs only a fraction of the power. It is freely programmable in C/C++ and it’s protected by security mechanisms (like FPGA bitstream encryption). It also has secure memory for seed storage. The FPGA system is very flexible: if other peripherals are needed, the system can be changed accordingly.
What I would definitely change, though, is to get rid of the Cortex M1 soft-cpu (ARM’s licensing is sub-optimal) and replace it with a (really) free RISC-V core. ARM provides the IP for the Cortex M1 free of charge, but no part of it may be included in open-source code repositories. So it was quite an effort to work around the license, e.g. by writing instructions on how to download the IP from the ARM website and patch the source files afterwards for the ICCFPGA project.
The documentation can be found here: https://gitlab.com/iccfpga/iccfpga-core/wikis/home
All repositories here: https://gitlab.com/iccfpga
I’m glad everything I planned turned out nicely — especially within the planned time-frames. It’s like a miracle that there were no major mistakes or show-stoppers. Lucky me 🙂
I want to thank the IOTA Foundation who put trust in my abilities and made this project possible by funding it through the Ecosystem Development Fund! 😘