Creating a CNN to Classify Cats and Dogs for Kendryte K210 Boards

Classifier running on Maix Dock M1 board

When working with Kendryte K210 boards, a cost-effective option for edge AI projects, you’ll quickly notice that the documentation is often limited. After spending significant time researching and experimenting, I decided to document my process to help others save time when building a Convolutional Neural Network (CNN) to classify images of cats and dogs on these boards.

Finding and Preparing the Dataset

The first step is to obtain a dataset of cats and dogs. I recommend using the popular Microsoft Cats and Dogs dataset. Once you have the dataset, it’s essential to format the images to the appropriate size for the K210 board. Note that in this particular dataset the animals are not always centered in the frame, so it is better to resize each image and then pad it to achieve the required size. I resized the images to 128x128 pixels, which strikes a good balance between memory consumption and model performance. Smaller sizes are more memory-efficient, a critical consideration for the limited resources of the K210.
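
As a minimal sketch of the resize-and-pad step described above, the following uses the Pillow library (an assumption; any image library would do). The helper names letterbox_geometry and resize_with_padding, and the black padding color, are illustrative choices and not part of the original code.

```python
def letterbox_geometry(width, height, target=128):
    """Compute the scaled size and the paste offset that center an image
    of (width, height) on a square target x target canvas."""
    scale = target / max(width, height)      # longer side becomes `target`
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    left = (target - new_w) // 2             # horizontal padding on the left
    top = (target - new_h) // 2              # vertical padding on the top
    return (new_w, new_h), (left, top)


def resize_with_padding(src_path, dst_path, target=128):
    """Resize an image keeping its aspect ratio, then pad it to a square."""
    from PIL import Image  # Pillow; imported here so the helper above has no dependencies

    with Image.open(src_path) as img:
        img = img.convert("RGB")
        (new_w, new_h), (left, top) = letterbox_geometry(img.width, img.height, target)
        resized = img.resize((new_w, new_h))
        # Hypothetical choice: pad with black pixels
        canvas = Image.new("RGB", (target, target), (0, 0, 0))
        canvas.paste(resized, (left, top))
        canvas.save(dst_path)
```

For a 500x375 photo, letterbox_geometry(500, 375) scales the image to 128x96 and leaves 16 rows of padding above and below it, so the whole animal stays in the frame instead of being cropped.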

Here is some sample code you can use to split the dataset into training and validation folders:


import os
import shutil

from sklearn.model_selection import train_test_split

base_dir = '/kaggle/input/microsoft-catsvsdogs-dataset/PetImages'

# Collect the file paths for each class
cat_images = [os.path.join(base_dir, 'Cat', f) for f in os.listdir(os.path.join(base_dir, 'Cat'))]
dog_images = [os.path.join(base_dir, 'Dog', f) for f in os.listdir(os.path.join(base_dir, 'Dog'))]

# Hold out 20% of each class for validation
cat_train, cat_val = train_test_split(cat_images, test_size=0.2)
dog_train, dog_val = train_test_split(dog_images, test_size=0.2)

train_dir = os.path.join("data", 'train')
val_dir = os.path.join("data", 'val')

# Create data/<split>/<class> folders
os.makedirs(os.path.join(train_dir, 'Cat'), exist_ok=True)
os.makedirs(os.path.join(train_dir, 'Dog'), exist_ok=True)
os.makedirs(os.path.join(val_dir, 'Cat'), exist_ok=True)
os.makedirs(os.path.join(val_dir, 'Dog'), exist_ok=True)

# Copy each image into its split and class folder
for img in cat_train:
    shutil.copy(img, os.path.join(train_dir, 'Cat'))

for img in dog_train:
    shutil.copy(img, os.path.join(train_dir, 'Dog'))

for img in cat_val:
    shutil.copy(img, os.path.join(val_dir, 'Cat'))

for img in dog_val:
    shutil.copy(img, os.path.join(val_dir, 'Dog'))

Designing the Neural Network

Designing a neural network for the K210 requires careful consideration. You need to ensure that the network architecture is both compact and effective for the task at hand. Some key points to consider when designing your network include:

  • Padding: use 'same' for AveragePooling2D and 'valid' for Conv2D.
  • Stride: strides = (2,2) for the pooling layers.
  • Pooling: pool_size = (2,2).

Here is the network I used:

import tensorflow as tf

IMG_SIZE = 128  # input images are 128x128 RGB

# Three Conv2D -> AveragePooling2D -> BatchNormalization stages, then a small dense head
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(filters = 16, kernel_size = (3,3), padding = 'valid', activation = tf.nn.relu, input_shape = (IMG_SIZE,IMG_SIZE,3)),
    tf.keras.layers.AveragePooling2D(pool_size = (2,2), strides = (2,2), padding = 'same'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Conv2D(filters = 32, kernel_size = (3,3), padding = 'valid', activation = tf.nn.relu),
    tf.keras.layers.AveragePooling2D(pool_size = (2,2), strides = (2,2), padding = 'same'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Conv2D(filters = 64, kernel_size = (3,3), padding = 'valid', activation = tf.nn.relu),
    tf.keras.layers.AveragePooling2D(pool_size = (2,2), strides = (2,2), padding = 'same'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(units = 16, activation = tf.nn.relu),
    # Two outputs (cat, dog); categorical_crossentropy expects one-hot labels
    tf.keras.layers.Dense(units = 2, activation = tf.nn.softmax)
])

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
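
To get a feel for how these layer settings affect model size, the feature-map sizes and parameter counts of the network above can be traced by hand. The sketch below does that arithmetic in plain Python; the function names are illustrative, and it assumes 3x3 'valid' convolutions with stride 1, 2x2 'same' pooling with stride 2, and four parameters per channel for each BatchNormalization layer, as in the Keras layers used above.

```python
import math


def conv_valid(size, kernel=3):
    # 'valid' padding with stride 1: the map shrinks by kernel - 1
    return size - kernel + 1


def pool_same(size, stride=2):
    # 'same' padding: output size is ceil(input / stride)
    return math.ceil(size / stride)


def trace_network(img_size=128, filters=(16, 32, 64), dense_units=16, classes=2):
    """Return (final map size, flattened length, total parameter count)."""
    size, channels, params = img_size, 3, 0
    for f in filters:
        size = conv_valid(size)
        params += 3 * 3 * channels * f + f   # conv kernels + biases
        size = pool_same(size)
        params += 4 * f                      # batch norm: gamma, beta, moving mean, moving variance
        channels = f
    flat = size * size * channels
    params += flat * dense_units + dense_units   # first dense layer
    params += dense_units * classes + classes    # output layer
    return size, flat, params


size, flat, params = trace_network()
print(size, flat, params)  # 15 14400 254482
```

The map shrinks 128 → 126 → 63 → 61 → 31 → 29 → 15, and the first Dense layer alone accounts for 230,416 of the roughly 254,000 parameters. This is why the input resolution and the width of that layer are the first things to adjust when the model does not fit.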

Verifying Compatibility with Kmodel Format

Before diving into training, it’s crucial to verify that your network can be successfully converted to the kmodel format, which is required for the K210. To do this, train the model for a single epoch, convert it to TensorFlow Lite (tflite) format, and record its inference output. This step helps catch any potential issues early in the process.

At the time of writing this article, the conversion from a .h5 file to a .tflite file is broken in the current version of TensorFlow (2.16). To work around this issue, install the tf_keras package before running the conversion. Here’s how you can do it:

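# Install the legacy Keras package first (run in a shell, or prefix with "!" in a notebook):
#   pip install -U tf_keras
# If the conversion still fails, the TensorFlow 2.16 release notes describe setting the
# environment variable TF_USE_LEGACY_KERAS=1 before importing TensorFlow, so that
# tf.keras resolves to tf_keras.
import tensorflow as tf

# Load the trained Keras model and convert it with the default optimizations
model = tf.keras.models.load_model('your_model.h5')
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# Write the converted model to disk
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)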
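
To record what the tflite model outputs for a test image, something like the following could be used. It is a sketch under assumptions, not the code from the repository: the function names are illustrative, the pixel scaling to [0, 1] must match whatever preprocessing was used during training, and the label order assumes the classes were indexed alphabetically (Cat = 0, Dog = 1).

```python
LABELS = ["Cat", "Dog"]  # assumed class order; check it against your training labels


def describe_output(probs, labels=LABELS):
    """Return (label, probability) for the highest-scoring class."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return labels[best], float(probs[best])


def run_tflite(model_path, image_path, img_size=128):
    """Run one image through a .tflite model and return the output as a list."""
    # Heavy dependencies are imported here so describe_output stays standalone
    import numpy as np
    import tensorflow as tf
    from PIL import Image

    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    input_info = interpreter.get_input_details()[0]
    output_info = interpreter.get_output_details()[0]

    img = Image.open(image_path).convert("RGB").resize((img_size, img_size))
    # Assumption: the model was trained on pixel values scaled to [0, 1]
    x = np.asarray(img, dtype=np.float32) / 255.0
    x = np.expand_dims(x, axis=0)  # add the batch dimension

    interpreter.set_tensor(input_info["index"], x)
    interpreter.invoke()
    return interpreter.get_tensor(output_info["index"])[0].tolist()
```

Calling describe_output(run_tflite('model.tflite', 'test.jpg')), where test.jpg is a placeholder file name, gives a label and a probability that can be written down and compared against later stages.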

Converting to Kmodel with NNCase

Once you have a working tflite model, the next step is to convert it to the kmodel format using NNCase (v0.2.0). After conversion, use the ‘infer’ command on the kmodel to ensure that the output is consistent with what you observed with the tflite model. Once the output matches, it is a good idea to verify it with a test image on the board. This consistency check is vital for ensuring that the model will perform correctly on the K210 hardware.

./ncc compile /workspace/myfolder/Maix_Toolbox/catsdogs.tflite /workspace/myfolder/Maix_Toolbox/catsdogs.kmodel -i tflite -o kmodel -t k210 --dataset /workspace/myfolder/Maix_Toolbox/images
./ncc infer /workspace/myfolder/Maix_Toolbox/catsdogs.kmodel . --dataset /workspace/myfolder/Maix_Toolbox/images/

Matching output of the code on PC and on device using a test image
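
The consistency check itself can be as simple as comparing two probability vectors for the same test image. The sketch below is one hypothetical way to do it: the function name outputs_match and the 0.05 tolerance are assumptions, chosen only because a quantized kmodel is not expected to reproduce the tflite values exactly.

```python
def outputs_match(reference, candidate, tolerance=0.05):
    """Check that two output vectors pick the same class and that no
    element differs by more than `tolerance` (a hypothetical threshold)."""
    if len(reference) != len(candidate):
        return False
    ref_class = max(range(len(reference)), key=lambda i: reference[i])
    cand_class = max(range(len(candidate)), key=lambda i: candidate[i])
    max_diff = max(abs(a - b) for a, b in zip(reference, candidate))
    return ref_class == cand_class and max_diff <= tolerance


# Example values only: tflite output on PC vs. kmodel output for the same image
print(outputs_match([0.10, 0.90], [0.12, 0.88]))  # True
print(outputs_match([0.10, 0.90], [0.60, 0.40]))  # False
```

If the predicted class flips or the values drift far apart, the problem is more likely in the conversion or in the image preprocessing than in the training itself.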

Training the Neural Network

With the initial verification complete, you can now proceed to train your neural network. Balancing model size, memory consumption, and accuracy is more of an art than a science. It requires practice and experimentation to get it right. In my experience, trial and error is often necessary to find the optimal balance.

Implementing on K210

In the attached repository, I’ve provided the tflite model, the kmodel, and the inference code. Before running the code on your K210 board, remember to set the garbage collector’s heap size and reset the board. It’s also a good idea to use the minimum firmware build, as it’s smaller in size and leaves more room for your model. Here is the link to the repository: https://github.com/code2k13/k210_cats_dogs

A Note on Accuracy

Finally, it’s important to manage your expectations regarding the model’s performance in real-world scenarios. Factors such as camera quality, lighting conditions, and shadows can significantly impact accuracy compared to the controlled environment of training.

By following these steps, you should be well on your way to creating an effective CNN for classifying cats and dogs on the Kendryte K210 board.