TensorFlow Lite Micro on ESP32-CAM: Fashion MNIST [PlatformIO]

This tutorial shows how to use TensorFlow Lite Micro with the ESP32-CAM to classify images. It is an experimental project that walks through the steps needed to build a machine learning application in which the inference process runs directly on the ESP32-CAM. To demonstrate this, we will use the Fashion MNIST dataset.

To develop this ESP32-CAM project we will use PlatformIO with the Espressif ESP-IDF framework. The steps to implement a machine learning application with the ESP32-CAM and TensorFlow Lite Micro are:

  • Building the TensorFlow model using the Fashion MNIST dataset
  • Quantizing the model
  • Exporting the TensorFlow Lite model as a C array for TensorFlow Lite Micro
  • Implementing the ESP32-CAM application that uses TensorFlow Lite Micro

In this project, the inference process runs directly on the ESP32-CAM.

Building the TensorFlow Lite Micro model with Fashion MNIST

The first step is building the TensorFlow model using Fashion MNIST. This dataset holds 60,000 training images and 10,000 test images, all 28×28 grayscale pictures of clothing items in 10 classes. We will use it to train the model before exporting it so that it runs on the ESP32-CAM. Many model architectures would work; here we use a small convolutional network derived from the TensorFlow tutorial and built with Keras. You can train the model using Colab.

First, let us install the latest version of TensorFlow and import the libraries:

    !pip install tensorflow

    import tensorflow as tf
    import numpy as np

    print(tf.__version__)

Next, we import the dataset, normalize the pixel values to the range 0 to 1, and add a channel dimension:

    fashion_mnist = tf.keras.datasets.fashion_mnist
    (train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

    # Normalize the data to [0, 1]
    train_images = train_images.astype(np.float32) / 255
    test_images = test_images.astype(np.float32) / 255

    # Add a channel axis: (N, 28, 28) -> (N, 28, 28, 1)
    train_images = np.expand_dims(train_images, -1)
    test_images = np.expand_dims(test_images, -1)

Finally, we define and train the model:

    from tensorflow.keras import layers

    num_classes = 10  # Fashion MNIST has 10 clothing categories
    input_shape = (28, 28, 1)

    model = tf.keras.Sequential(
        [
            tf.keras.Input(shape=input_shape),
            layers.Conv2D(6, kernel_size=(3, 3), activation="relu"),
            layers.MaxPooling2D(pool_size=(2, 2)),
            layers.Conv2D(6, kernel_size=(3, 3), activation="relu"),
            layers.Flatten(),
            layers.Dense(num_classes, activation="softmax"),
        ]
    )
    model.summary()

    # The last layer already applies softmax, so the loss receives
    # probabilities rather than logits (from_logits=False).
    model.compile(optimizer='adam',
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
                  metrics=['accuracy'])

    model.fit(train_images, train_labels, epochs=20, validation_split=0.2)

The call to model.fit trains the model. This tutorial does not focus on model accuracy; the goal is to show how to run a TensorFlow Lite Micro model on the ESP32-CAM to classify images. You can use a different model or dataset if you prefer.
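
Because the model has to fit on a microcontroller, it helps to know how large it is before converting it. The sketch below recomputes by hand what model.summary() reports for the layers above, assuming the Keras defaults (no padding, stride 1 for the convolutions, and a pooling stride equal to the pool size). The helper names are only illustrative and are not part of Keras.

```python
def conv2d_valid(h, w, c_in, filters, k):
    """Output shape and parameter count of a Conv2D layer (no padding, stride 1)."""
    params = k * k * c_in * filters + filters  # kernel weights + one bias per filter
    return (h - k + 1, w - k + 1, filters), params


def max_pool(h, w, c, pool):
    """Output shape of a MaxPooling2D layer (it has no parameters)."""
    return (h // pool, w // pool, c)


def dense(n_in, n_out):
    """Parameter count of a fully connected layer."""
    return n_in * n_out + n_out


shape, p1 = conv2d_valid(28, 28, 1, filters=6, k=3)  # first convolution
shape = max_pool(*shape, pool=2)                      # 2x2 pooling
shape, p2 = conv2d_valid(*shape, filters=6, k=3)      # second convolution
flat = shape[0] * shape[1] * shape[2]                 # Flatten
p3 = dense(flat, 10)                                  # softmax classifier

total_params = p1 + p2 + p3
print(shape, flat, total_params)
```

With fewer than 8,000 parameters, the weights take about 30 KB as 32-bit floats, and roughly a quarter of that once they are stored as 8-bit integers (file overhead not included).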

Quantize the model

The next step is to quantize the model so that it is small enough to use on the ESP32-CAM. Quantization stores weights and activations as 8-bit integers instead of 32-bit floats. To choose the integer ranges, the converter needs a representative dataset, so we feed it a sample of the training images:

    def representative_data_gen():
        # Yield 100 single-image batches so the converter can calibrate value ranges
        for input_value in tf.data.Dataset.from_tensor_slices(train_images).batch(1).take(100):
            yield [input_value]

    # Convert directly from the in-memory Keras model trained above
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_data_gen
    tflite_model = converter.convert()

    with open('./mnist_model.tflite', "wb") as f:
        tflite_model_size = f.write(tflite_model)
    print("Quantized model is %d bytes" % tflite_model_size)
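
To make the idea of quantization more concrete, here is a minimal sketch of the affine mapping between floats and 8-bit integers. It is plain Python written for illustration, not the converter's code: the function names are hypothetical, and the range [0, 1] is only an example, since the real converter derives a range for each tensor from the representative dataset.

```python
def quant_params(x_min, x_max, q_min=-128, q_max=127):
    """Scale and zero point that map the float range [x_min, x_max] onto int8."""
    scale = (x_max - x_min) / (q_max - q_min)
    zero_point = q_min - round(x_min / scale)
    return scale, zero_point


def quantize(x, scale, zero_point, q_min=-128, q_max=127):
    """Float -> int8, clamped to the representable range."""
    q = round(x / scale) + zero_point
    return max(q_min, min(q_max, q))


def dequantize(q, scale, zero_point):
    """int8 -> approximate float."""
    return (q - zero_point) * scale


scale, zero_point = quant_params(0.0, 1.0)
q = quantize(0.25, scale, zero_point)
print(scale, zero_point, q, dequantize(q, scale, zero_point))
```

The round trip is not exact: each value can move by up to half a quantization step (scale / 2), which is the price paid for the smaller model.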

You can find more details about how to use quantization in the TensorFlow documentation. Now our machine learning model is ready and we can use it with the ESP32-CAM. Before using it, we have to export it as a C array:

    !apt-get -qq install xxd
    !xxd -i "./mnist_model.tflite" > "./mnist_model_quant.cc"
    !cat "./mnist_model_quant.cc"

The xxd -i command converts the model file into a C array, and the last line prints the generated source so that we can copy it into the ESP32-CAM project.
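
If xxd is not available, the same kind of file can be produced with a few lines of Python. The sketch below is an approximation of what xxd -i emits, not a replacement for it: the function name to_c_array is hypothetical, the const qualifiers are my addition (xxd writes plain unsigned char, and const typically lets the toolchain keep the array in flash instead of RAM), and the sample bytes are stand-ins for the real model file.

```python
def to_c_array(data: bytes, name: str, per_line: int = 12) -> str:
    """Render bytes as a C array definition, similar in spirit to `xxd -i`."""
    lines = []
    for start in range(0, len(data), per_line):
        chunk = data[start:start + per_line]
        lines.append("  " + ", ".join(f"0x{b:02x}" for b in chunk))
    body = ",\n".join(lines)
    return (
        f"const unsigned char {name}[] = {{\n{body}\n}};\n"
        f"const unsigned int {name}_len = {len(data)};\n"
    )


# Stand-in bytes; in practice read ./mnist_model.tflite in binary mode instead.
sample = bytes([0x1c, 0x00, 0x00, 0x00, 0x54, 0x46, 0x4c, 0x33])
source = to_c_array(sample, "_mnist_model")
print(source)
```

The array name matters: the firmware later passes _mnist_model to tflite::GetModel, so the generated name has to match whatever the application code expects.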

Implementing the ESP32-CAM application with TensorFlow Lite Micro

Once the machine learning model is ready, we will implement the ESP32-CAM application that uses the TensorFlow Lite Micro model to recognize fashion items. As mentioned before, we will use PlatformIO to develop this project. You can clone the GitHub repository.

Open platformio.ini. In this file the framework is espidf rather than the usual arduino:

    [env]
    build_unflags = -Werror=all

    [env:esp32cam]
    platform = espressif32
    board = esp32cam
    framework = espidf
    board_build.partitions = partitions.csv
    monitor_speed = 115200
    lib_deps = esp32-camera

Furthermore, it is necessary to adjust the compiler options to avoid errors during compilation. In this configuration, build_unflags removes -Werror=all so that compiler warnings do not stop the build.

The TensorFlow library is located under the library folder. We created it starting from the HelloWorld example.

ESP32-CAM application structure

The application has three components:

  • app_webserver
  • app_camera
  • app_machine_learning

The first two components manage the webserver and the camera so that we can connect to the ESP32-CAM using a browser. The webserver is the classic one provided with the ESP32-CAM, modified to run the inference process.

The app_camera component manages the image acquisition and the video. The captured picture is 96×96 pixels, so we have to resize it as described in the next section.

How to run a TensorFlow Lite model with ESP32-CAM

The most interesting component is app_machine_learning. In this component, we run the inference process using TensorFlow Lite Micro on the ESP32-CAM. In init_ml_module, we initialize the TensorFlow library:

    void init_ml_module() {
      // Set up logging. Google style is to avoid globals or statics because of
      // lifetime uncertainty, but since this has a trivial destructor it's okay.
      // NOLINTNEXTLINE(runtime-global-variables)
      static tflite::MicroErrorReporter micro_error_reporter;
      error_reporter = &micro_error_reporter;

      // Map the model into a usable data structure. This doesn't involve any
      // copying or parsing, it's a very lightweight operation.
      model = tflite::GetModel(_mnist_model);
      if (model->version() != TFLITE_SCHEMA_VERSION) {
        error_reporter->Report(
            "Model provided is schema version %d not equal "
            "to supported version %d.",
            model->version(), TFLITE_SCHEMA_VERSION);
        return;
      }

      // Register every available operator so the interpreter can run the model
      static tflite::AllOpsResolver micro_mutable_op_resolver;

      interpreter = new tflite::MicroInterpreter(
          model, micro_mutable_op_resolver, tensor_arena, kTensorArenaSize,
          error_reporter);

      TfLiteStatus allocate_status = interpreter->AllocateTensors();
      if (allocate_status != kTfLiteOk) {
        error_reporter->Report("--- Error allocating tensor arena ----");
        return;  // the input tensor is not usable if allocation failed
      }

      model_input = interpreter->input(0);
      model_input_buffer = model_input->data.f;
      // ESP_LOGI(TAG, "Model size %d", model_input->dims->size);
    }

In more detail, the code executes these steps:

  1. Load the TensorFlow Lite model and check the schema version
  2. Initialize the AllOpsResolver
  3. Initialize the interpreter
  4. Allocate the tensors in the tensor arena
  5. Get the input tensor and its float buffer (the output tensor is read after each inference)

It is important to notice that the data in our model is normalized between 0 and 1, therefore we feed the machine learning model with float data (model_input->data.f).

Running the inference process on the ESP32-CAM

This is the step where we combine the TensorFlow machine learning model and the ESP32-CAM to classify images. Before feeding the model with the image captured by the ESP32-CAM, it is necessary to manipulate the image. The first step is resizing the image from 96×96 to 28×28, which is the image size used in the Fashion MNIST dataset. To do it, the code uses a utility library provided by Espressif in its GitHub repository. The original utility code was modified so that it compiles with PlatformIO:

    int img_size = MODEL_IMAGE_WIDTH * MODEL_IMAGE_HEIGHT * NUM_CHANNELS;
    uint8_t *tmp_buffer = (uint8_t *) malloc(img_size);

    // Scale the camera frame down to the model input size
    image_resize_linear(tmp_buffer, fb->buf, MODEL_IMAGE_HEIGHT, MODEL_IMAGE_WIDTH,
                        NUM_CHANNELS, fb->width, fb->height);

Finally, we can normalize the image:

    // Scale each pixel from 0..255 to 0..1, as done during training
    for (int i = 0; i < img_size; i++) {
      model_input_buffer[i] = tmp_buffer[i] / 255.0f;
    }
    free(tmp_buffer);
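
When the board returns surprising predictions, it can be useful to reproduce the preprocessing on a computer and compare the numbers. The sketch below is a generic bilinear resize followed by the same division by 255. It is an assumption about how such a resize works in general, not a port of Espressif's image_resize_linear, whose exact sampling may differ, and the function names and the synthetic frame are made up for the example.

```python
def resize_bilinear(src, src_w, src_h, dst_w, dst_h):
    """Resize a flat, row-major grayscale image (values 0..255) bilinearly."""
    x_ratio = (src_w - 1) / (dst_w - 1) if dst_w > 1 else 0.0
    y_ratio = (src_h - 1) / (dst_h - 1) if dst_h > 1 else 0.0
    dst = []
    for y in range(dst_h):
        fy = y * y_ratio
        y0 = int(fy)
        y1 = min(y0 + 1, src_h - 1)
        wy = fy - y0
        for x in range(dst_w):
            fx = x * x_ratio
            x0 = int(fx)
            x1 = min(x0 + 1, src_w - 1)
            wx = fx - x0
            # Blend the four neighbouring source pixels
            top = src[y0 * src_w + x0] * (1 - wx) + src[y0 * src_w + x1] * wx
            bottom = src[y1 * src_w + x0] * (1 - wx) + src[y1 * src_w + x1] * wx
            dst.append(int(round(top * (1 - wy) + bottom * wy)))
    return dst


def normalize(pixels):
    """Scale 0..255 pixel values to floats in 0..1."""
    return [p / 255.0 for p in pixels]


# Synthetic 96x96 frame: every row is a gradient from 0 (left) to 255 (right)
frame = [round(x * 255 / 95) for _ in range(96) for x in range(96)]
small = resize_bilinear(frame, 96, 96, 28, 28)
model_input = normalize(small)
print(len(model_input), model_input[0], model_input[27])
```

The result has 28 × 28 = 784 values between 0 and 1, which is the amount of data the model input buffer expects for a single-channel image.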

The last step is running the model and reading the results:

    if (kTfLiteOk != interpreter->Invoke()) {
      error_reporter->Report("Error");
    }

    ESP_LOGI(TAG, "Showing results");
    TfLiteTensor* output = interpreter->output(0);

    // Log the probability of every class, starting from index 0
    for (int i = 0; i < kCategory; i++) {
      ESP_LOGI(TAG, "Label=%s, Prob=%f", kCategoryLabels[i], output->data.f[i]);
    }

As output, there will be a list of classes with their probabilities. Now you can use the ESP32-CAM to acquire an image and classify it. Below is the result in the output console for a bag:

[Image: ESP32-CAM with TensorFlow Lite Micro: MNIST image classification]
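
The console lists one probability per class. To reduce that list to a single answer, we can pick the class with the highest probability. Here is a small sketch of that step. The labels follow the standard Fashion MNIST order, and I am assuming that the firmware's kCategoryLabels uses the same order; the scores are invented for the example and are not real output from the board.

```python
FASHION_MNIST_LABELS = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def top_prediction(probabilities, labels=FASHION_MNIST_LABELS):
    """Return (label, probability) of the most likely class."""
    if len(probabilities) != len(labels):
        raise ValueError("expected one probability per label")
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best], probabilities[best]


# Made-up scores for illustration only
scores = [0.01, 0.00, 0.02, 0.01, 0.03, 0.00, 0.02, 0.00, 0.90, 0.01]
label, prob = top_prediction(scores)
print(f"Label={label}, Prob={prob:.2f}")
```

Note that class 0 is T-shirt/top, so any loop over the output tensor has to start at index 0 to cover all ten classes.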

To simplify the whole process, I’ve made a simple HTML page for this purpose.

Wrapping up

In this post, we have covered how to use the ESP32-CAM with TensorFlow Lite Micro. We have described all the steps to follow: first building the TensorFlow Lite model, then implementing a machine learning application on the ESP32-CAM that classifies fashion articles. You can use this ESP32-CAM project with other TensorFlow Lite models to explore how to use machine learning with the ESP32-CAM.