Developing AI on Microcontrollers

Less computing power in the application.

Neural networks require far less computing power in the application phase than in the learning phase. This asymmetry makes it possible to execute artificial-intelligence algorithms on microcontrollers.

The evolution of artificial intelligence (AI) technologies such as machine learning and deep learning has been remarkable in recent years, and the range of applications is rapidly expanding from the cloud market, which mainly serves the IT field, to the embedded-system market. One example of an evolving AI application is service robots. While the computing power available for AI in the cloud is almost unlimited, embedded AI is characterized by resource-constrained processors and controllers that interact with each other and with their environment.

In the future, embedded devices equipped with AI may be used in service robots that need to perform judgment and control functions depending on the particular situation. The development of AI-equipped embedded devices is also expected to accelerate beyond service robots, extending to services and their associated devices in general that require interaction with people. This paper explores embedded AI (e-AI) and provides resources for further evaluating e-AI implementations, taking Renesas artificial-intelligence technology in embedded devices as an example.

Before we look at what constitutes an e-AI solution, let's consider some terminology around the »e« for embedded in e-AI, using Renesas microcontrollers and microprocessors and their tools as an example case. As the first step of the e-AI solution, a new function must be introduced to implement the result of deep learning in the endpoint embedded device. In this specific case, it is realized in the form of plug-ins compatible with the open-source, Eclipse-based integrated development environment »e² studio« (Fig. 1):

  • e-AI translator: Converts the learned AI network from machine learning/deep learning frameworks like Caffe or TensorFlow to the MCU/MPU development environment.
  • e-AI checker: Calculates, based on the translator's output, the ROM/RAM footprint and the inference execution time on the selected MCU/MPU.
  • e-AI importer: Connects a new AI framework specialized for embedded systems, one that enables real-time performance and resource-saving design, to the MCU/MPU development environment.

What is an e-AI solution?

Anyone can use AI relatively easily with Caffe, developed by UC Berkeley, or TensorFlow, developed by Google. Although an AI's field of speciality varies depending on the algorithm used, for e-AI we are especially interested in the DNN (Deep Neural Network), a multi-layered network that became famous for computer vision but can be applied in many other fields.

The algorithm is fed with example data (so-called Teacher Data), and the calculated results are compared with the desired results; the parameters of the network are then gradually adjusted so that the error in the calculation becomes smaller and smaller (Fig. 2). In this way, automatic feature extraction becomes more and more precise. DNNs are also of such high interest because, although the computing effort for learning is very high, the application of what has been learned (the inference phase) can be carried out with far less computing power.
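The following minimal sketch illustrates this learning principle in C: a single linear neuron y = w*x + b is fitted to teacher data by gradient descent, so the prediction error shrinks step by step. All names, values, and the learning rate are illustrative and not taken from any particular framework.

#include <stdio.h>

int main(void)
{
    /* Teacher data: inputs x and desired outputs t (here t = 2x + 1). */
    const float x[4] = {0.0f, 1.0f, 2.0f, 3.0f};
    const float t[4] = {1.0f, 3.0f, 5.0f, 7.0f};
    float w = 0.0f, b = 0.0f;   /* network parameters to be learned */
    const float lr = 0.05f;     /* learning rate */

    for (int epoch = 0; epoch < 500; ++epoch) {
        float dw = 0.0f, db = 0.0f;
        for (int i = 0; i < 4; ++i) {
            float err = (w * x[i] + b) - t[i];  /* prediction error */
            dw += err * x[i];                   /* gradient w.r.t. w */
            db += err;                          /* gradient w.r.t. b */
        }
        w -= lr * dw / 4.0f;    /* adjust parameters to reduce error */
        b -= lr * db / 4.0f;
    }
    printf("learned: w = %.3f, b = %.3f\n", w, b);  /* close to 2 and 1 */
    return 0;
}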

In the learning phase, enormous amounts of learning data have to be fed into the neural network in order to calculate the coefficients. Since the computational effort is very high, servers are used for this purpose. Once the learning phase of the neural net is completed and the net is trained, the coefficients are transferred to the target system. This asymmetry of neural networks means that much less computing power is required in the application phase (Fig. 3). For this reason, Renesas calls the inference in embedded devices »e-AI« (embedded AI).

Depending on the network structure, the result of a neural network is generated by matrix calculations, proceeding step by step in one direction from the input layer to the output layer. Since the coefficients are all constant after learning, they can be stored in the ROM area, so the AI can run on a microcontroller with low RAM capacity.
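The sketch below shows in C how such constant coefficients can live in ROM: a tiny fully connected layer declares its learned weights and biases as const, so the linker can place them in flash, and only the small input and output vectors occupy RAM. The dimensions and values are made up for illustration; this is not code generated by any Renesas tool.

#include <stddef.h>

#define N_IN  3
#define N_OUT 2

/* Learned coefficients, constant after training; `const` lets the
 * linker place them in flash/ROM instead of RAM. */
static const float weights[N_OUT][N_IN] = {
    { 0.5f, -1.2f,  0.8f },
    {-0.3f,  0.9f,  1.1f }
};
static const float bias[N_OUT] = { 0.1f, -0.4f };

/* One layer of the forward pass: matrix-vector product plus ReLU. */
void layer_forward(const float in[N_IN], float out[N_OUT])
{
    for (size_t o = 0; o < N_OUT; ++o) {
        float acc = bias[o];
        for (size_t i = 0; i < N_IN; ++i)
            acc += weights[o][i] * in[i];
        out[o] = (acc > 0.0f) ? acc : 0.0f;  /* ReLU activation */
    }
}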

The e-AI development environment is an effective tool for embedding a DNN into an MCU/MPU after learning. However, there are some difficulties with implementing learning results on an MCU/MPU:

  • Python is used as a description language in many AI frameworks, while the control program of the MCU is usually written in C/C++.
  • Incompatibility with the ROM/RAM management suitable for an MCU/MPU.

The e-AI development environment solves these problems and makes it possible to implement learned DNN results on an MCU/MPU within an e² studio C/C++ project. By feeding the learned AI into the translator, the network structure, functions, and learned parameters are extracted and converted into a form usable in e² studio C/C++ projects (Fig. 4).
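As a rough idea of how converted inference code might then be called from a C project, consider the hypothetical sketch below. The entry point dnn_compute(), its signature, and all dimensions are assumptions made for illustration, not the actual interface emitted by the e-AI translator; a stub stands in for the generated network so the example is self-contained.

#include <stdio.h>

#define SENSOR_DIM  4   /* assumed input size  */
#define CLASS_COUNT 3   /* assumed output size */

/* Stub standing in for the converted network; a real project would
 * link the translator's generated sources here instead. */
static float scores[CLASS_COUNT];
static const float *dnn_compute(const float *input)
{
    for (int c = 0; c < CLASS_COUNT; ++c)
        scores[c] = input[c % SENSOR_DIM];   /* dummy computation */
    return scores;
}

/* Run one inference and return the index of the highest score. */
static int classify(const float sensor[SENSOR_DIM])
{
    const float *out = dnn_compute(sensor);
    int best = 0;
    for (int c = 1; c < CLASS_COUNT; ++c)
        if (out[c] > out[best])
            best = c;
    return best;
}

int main(void)
{
    const float sample[SENSOR_DIM] = {0.2f, 0.9f, 0.1f, 0.5f};
    printf("predicted class: %d\n", classify(sample));
    return 0;
}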

The translator (free version) supports microcontrollers with comparatively small ROM/RAM capacity. To compress the capacity used by the library, only functions that are frequently used by neural networks are supported (see table). Neural-network functions that are needed only for learning and not for inference are not supported.

Renesas has published a tutorial for the e-AI translator on its »Gadget Renesas« website.