I enjoy playing around with microcontrollers and other programmable gadgets, and lately I’ve been paying close attention to TinyML and TFLite. I’ve always wanted to run some of my machine learning models on microcontroller devices. Recently, I acquired a Raspberry Pi Pico and started experimenting with it. My goal was to create a lightweight character-based RNN that could generate random text on a 0.96-inch OLED display. However, when it came time to execute this plan, I encountered a host of issues.
Firstly, TFLite for microcontrollers is written in C++, so running it on a Pi Pico requires a fully functional C++ toolchain and a correctly configured Pi Pico SDK. While working through several TFLite examples for the Pi Pico, I ran into numerous issues with CMake. Even after I thought I had built the code correctly, nothing happened when I copied it to my Pi Pico. Unfortunately, I don't have the tools needed to debug code running on the device, and debugging without them is a challenging task.
Here is the link to the project’s GitHub repo: https://github.com/code2k13/text_gen_circuitpython
CircuitPython, my friend!
After encountering numerous issues with C++, I decided to switch to CircuitPython. However, TFLite bindings are currently not available in CircuitPython. Although there is an open-source project aimed at resolving this issue, it does not support the Pi Pico. As a result, I decided to explore other methods of generating text while continuing to use CircuitPython. This is when I came across the concept of Markov chains, which are essentially state transition maps: for each current state, they list the possible next states along with their respective probabilities.
Fortunately, there is a Python module called markovify that can generate a Markov chain from text and use it to make inferences. I have included some sample sentences below that were generated with the markovify module from the text of the book ‘Alice in Wonderland’.
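To make the idea of a state transition map concrete, here is a minimal sketch in plain Python. The states and probabilities are made up for illustration and do not come from any real text. It stores, for each character, the possible next characters with their probabilities, and multiplies the transition probabilities along a path to get the likelihood of that path.

```python
# Toy Markov chain: each state (a character) maps to its possible next
# states and their probabilities. All numbers are invented for illustration.
toy_transitions = {
    "b": {"a": 1.0},
    "a": {"n": 0.75, "\n": 0.25},
    "n": {"a": 1.0},
}


def path_probability(chain, states):
    """Probability of visiting `states` in order, given the first state."""
    probability = 1.0
    for current, following in zip(states, states[1:]):
        # A transition that is missing from the map has probability 0.
        probability *= chain.get(current, {}).get(following, 0.0)
    return probability


banana_probability = path_probability(toy_transitions, "banana")
impossible_probability = path_probability(toy_transitions, "bb")
```

Each inner dictionary sums to 1, which is what makes it a probability distribution over the next state.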
Alice said nothing; she had found the fan and gloves. |
After some experimentation with markovify, I came to appreciate this simple yet effective approach (although it is not as effective as neural networks). Consequently, I created a “pure” implementation of generating Markov Chains and making inferences from them using CircuitPython.
To create a character-based Markov Chain model from any text file, the src/generate_chain.py file in the repository needs to be executed from a standard Python interpreter (as it will not work with CircuitPython). By default, it searches for CSV files that contain dinosaur names (which must be downloaded separately) and creates dino_chain.json.
import re |
Any text file can be used as input to this script, but files with a short piece of text on each line tend to produce the best results (names of places, people, animals, compounds, and so on). Long sentences are not well suited to character-based chains; for those it’s better to use a word-based or n-gram-based chain instead. Once generated, the resulting Markov chain JSON file (dino_chain.json) should resemble the following format:
{ |
It is, in essence, a dictionary of dictionaries. The first-level keys are the unique characters found in the text file. Every character is mapped to a second-level dictionary whose keys are the characters that appear immediately after it and whose values are the number of times each one was encountered. In the preceding JSON, for example, we can see that ‘x’ is followed by ‘\n’ (newline) 39 times and by ‘a’ 9 times, and so on. It’s as straightforward as that.
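As a rough illustration of how such a dictionary of dictionaries can be built, here is a minimal sketch for a standard Python interpreter. It is not the repository's generate_chain.py: the function name `build_chain` and the sample names are hypothetical, and it assumes one entry per line with a newline marking the end of each entry.

```python
import json


def build_chain(lines):
    """Count, for every character, which characters follow it and how often."""
    chain = {}
    for line in lines:
        text = line.strip() + "\n"  # assumption: newline marks the end of an entry
        for current, following in zip(text, text[1:]):
            followers = chain.setdefault(current, {})
            followers[following] = followers.get(following, 0) + 1
    return chain


# Hypothetical input; a real run would read the lines from a text file.
sample_names = ["rex", "raptor", "apatosaurus"]
sample_chain = build_chain(sample_names)

# Serialize in the same nested shape described above.
sample_chain_json = json.dumps(sample_chain, sort_keys=True)
```

In this sketch the newline only ever appears as a follower, never as a first-level key, because nothing comes after the end of an entry.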
To produce inferences (random text), we use this JSON structure and convert the counts to probabilities in order to pick the next character for a given character. In standard Python we could have used random.choices for this, but it is not available in CircuitPython, so I wrote my own function (_custom_random_choices). It lives in the src/markov_chain_parser.py file, which is basically a utility script for generating inferences. To generate inferences, use the src/generate_text.py file, which contains the following code:
import json |
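For readers curious how weighted sampling can work without random.choices, here is a minimal sketch. It is not the repository's _custom_random_choices; the names `weighted_choice` and `generate` and the toy chain are hypothetical (only the counts for ‘x’ are taken from the example above). It relies only on random.random, which CircuitPython's random module provides.

```python
import random


def weighted_choice(counts):
    """Pick a key from {item: count} with probability count / total."""
    total = sum(counts.values())
    threshold = random.random() * total  # uniform in [0, total)
    running = 0
    item = None
    for item, count in counts.items():
        running += count
        if threshold < running:
            return item
    return item  # only reached if counts is empty or all counts are zero


def generate(chain, start, max_length=12):
    """Walk the chain from `start` until a newline or `max_length` characters."""
    result = start
    current = start
    while len(result) < max_length and current in chain:
        current = weighted_choice(chain[current])
        if current == "\n":
            break
        result += current
    return result


# Toy chain: the counts for "x" mirror the example above, the rest are made up.
toy_chain = {
    "r": {"e": 1, "a": 1},
    "e": {"x": 1},
    "a": {"x": 1},
    "x": {"\n": 39, "a": 9},
}
generated_name = generate(toy_chain, "r")
```

Walking the running total this way means no separate probability table is needed: a follower with a count of 39 out of 48 simply covers 39/48 of the range the random threshold can land in.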
We rely solely on CircuitPython’s random, json, and time modules to create inferences; there are no additional dependencies. That’s it: for text generation to work, you must copy the following files to the Pi Pico:
- dino_chain.json
- markov_chain_parser.py
- generate_text.py
I hope you found the article interesting. You may also use the src/generate_text_oled.py file to show the generated text on an OLED display.