Text Generation on Raspi Pico using Markov Chains

Random dinosaur names generated by Raspi Pico displayed on OLED screen

Here is the link to the project’s GitHub repo: https://github.com/code2k13/text_gen_circuitpython

My experiments with TFLite for microcontrollers

I enjoy tinkering with microcontrollers and other programmable gadgets, and I’ve been paying close attention to TinyML and TFLite lately. I’ve always wanted to port some of my machine learning models to microcontroller devices. I recently got a Raspberry Pi Pico and began experimenting with it, and I wanted to play around with text generation models on it. My plan was to write a lightweight character-based RNN, run it on the Pi Pico, and show some randomly generated text on a 0.96-inch OLED display. When it came to putting this plan into action, I ran into a slew of issues.

To begin, TFLite for microcontrollers is written in C++. To run it on a Pi Pico, you need a full-fledged C++ build environment as well as a properly configured Pi Pico SDK. I ran into a number of CMake problems while building several of the TFLite examples for the Pi Pico. When I finally believed I had built the code correctly, I copied it to my Pi Pico, but nothing happened. Debugging on the device is difficult, and I have neither done it before nor presently have the tools for it.

CircuitPython, my friend!

After all this trouble with C++, I switched to CircuitPython. The catch is that CircuitPython currently lacks TFLite bindings. There is an open-source project heading in this direction, https://github.com/mocleiri/tensorflow-micropython-examples, but it does not support the Pi Pico. So I decided to look for alternative text generation methods while sticking with CircuitPython. This is when I thought of ‘Markov Chains‘. Simply put, a Markov Chain is a state transition map: for each current state, it lists the possible next states along with their probabilities.
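As a tiny illustration of the idea (the states and probabilities below are made up for this sketch), such a transition map can be represented as a plain dictionary, and walking the chain is just a repeated weighted draw:

```python
import random

# A toy character-level Markov Chain (made-up probabilities, for illustration).
# Each key is a current state; its value maps possible next states to probabilities.
toy_chain = {
    "a": {"b": 0.5, "c": 0.5},
    "b": {"a": 0.9, "c": 0.1},
    "c": {"a": 1.0},
}

def next_state(chain, state):
    """Pick the next state according to the transition probabilities."""
    states = list(chain[state].keys())
    weights = list(chain[state].values())
    # random.choices performs a weighted draw (available in CPython).
    return random.choices(states, weights=weights)[0]

# Walk the chain for a few steps starting from "a".
state = "a"
walk = [state]
for _ in range(5):
    state = next_state(toy_chain, state)
    walk.append(state)
print("".join(walk))
```

Each step depends only on the current state, which is what makes the structure so cheap to store and evaluate.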

In fact, there is a Python module called markovify that can generate a Markov Chain from text and use it to draw inferences. Given below are some sample sentences generated using ‘markovify‘ from the text of the book ‘Alice in Wonderland‘:

Alice said nothing; she had found the fan and gloves.
She was walking by the time when she caught it, and on both sides at once.
She went in search of her sister, who was gently brushing away some dead leaves that had a consultation about this, and after a few things indeed were really impossible.
And so it was too slippery; and when Alice had begun to think that very few things indeed were really impossible.
Alas! it was as much as she spoke, but no result seemed to have lessons to learn!
They are waiting on the bank, with her arms round it as a cushion, resting their elbows on it, and talking over its head.

After a while of experimenting with markovify, I grew to appreciate this basic but effective method (although not as effective as neural networks). So I wrote a ‘pure’ CircuitPython implementation for generating Markov Chains and inferring from them.

The src/generate_chain.py file in the repository must be executed with a standard Python interpreter (it will not work with CircuitPython). It builds a character-based Markov Chain model from any text file. By default, it looks for a CSV file containing dinosaur names (which must be downloaded separately) and creates dino_chain.json.

import re
from collections import Counter
import json

# You will need to supply a text file containing one name per line.
# A sample one can be downloaded from
with open("dinosaurs.csv") as f:
    txt = f.read()
unique_letters = set(txt)

chain = {}
for l in unique_letters:
    next_chars = []
    # Find every occurrence of this character and record what follows it.
    for m in re.finditer(re.escape(l), txt):
        start_idx = m.start()
        if start_idx + 1 < len(txt):
            next_chars.append(txt[start_idx + 1])
    chain[l] = dict(Counter(next_chars))

with open("dino_chain.json", "w") as i:
    json.dump(chain, i)

You may use any text file as input to this script, but a file with a short piece of text per line produces the best results (names of places, people, animals, compounds, etc.). A character-based chain will not work well with a text file containing long sentences; for those, we must use a word- or n-gram-based chain. The resulting Markov Chain JSON file (dino_chain.json) should look like this:

"x": {
    "\n": 39,
    "a": 9,
    "i": 34,
    "l": 1,
    "o": 2,
    "y": 1,
    "e": 4,
    "u": 3
},
"z": {
    "s": 1,
    "o": 4,
    "i": 5,
    "u": 6,
    "e": 3,
    "a": 13,
    "h": 21,
    "k": 3,
    "y": 1,
    "r": 1,
    "b": 1
}

It is, in essence, a dictionary of dictionaries. The first-level keys are the unique characters found in the text file. Each character maps to a second-level dictionary whose keys are the characters that appear immediately after it, and whose values are the number of times each was encountered. In the preceding JSON, for example, we can see that ‘x’ is followed by ‘\n’ (newline) 39 times, by ‘a’ 9 times, and so on. It’s as straightforward as that.
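For the word- or n-gram-based chains mentioned earlier for longer sentences, the same dictionary-of-counts structure works with words as states. A minimal sketch, not part of the repository (the function and sample text here are my own):

```python
from collections import Counter

def build_word_chain(text):
    """Build a word-level Markov Chain: word -> Counter of following words."""
    words = text.split()
    chain = {}
    # Pair each word with the word that immediately follows it.
    for current, following in zip(words, words[1:]):
        chain.setdefault(current, Counter())[following] += 1
    return chain

sample = "the cat sat on the mat and the cat slept"
word_chain = build_word_chain(sample)
# In the sample, "the" is followed by "cat" twice and "mat" once.
print(word_chain["the"])
```

The only difference from the character-based version is the choice of state; inference works the same way.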

To produce inferences (random text), we just walk this JSON structure, converting the counts to probabilities to predict the next character from the current one. In standard Python we could have used random.choices for this, but it is not available in CircuitPython, so I wrote my own function (_custom_random_choices). It lives in the src/markov_chain_parser.py file, which is basically a utility script for generating inferences. To generate inferences, use the src/generate_text.py file, which contains the following code:

import json
import time
from markov_chain_parser import generate_text

with open('dino_chain.json', 'r') as f:
    dino_chain = json.loads(f.read())

while True:
    a = generate_text(200, dino_chain)
    names = a.split("\n")
    for n in names:
        # Keep only names that are neither too short nor too long.
        if len(n) > 4 and len(n) < 15:
            print(n)
    # Pause briefly before generating the next batch.
    time.sleep(1)
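The core idea behind a random.choices replacement, turning counts into a weighted draw using only the random module, can be sketched roughly like this (the repository’s actual _custom_random_choices may differ in its details):

```python
import random

def weighted_choice(counts):
    """Pick a key from a {key: count} dict with probability proportional to its count."""
    total = sum(counts.values())
    threshold = random.uniform(0, total)
    cumulative = 0
    for key, count in counts.items():
        cumulative += count
        if cumulative >= threshold:
            return key
    return key  # fallback for floating-point edge cases

# Using the "x" entry from the chain shown earlier: '\n' is drawn most often.
x_counts = {"\n": 39, "a": 9, "i": 34, "l": 1, "o": 2, "y": 1, "e": 4, "u": 3}
print(weighted_choice(x_counts))
```

Because it only needs random.uniform and a single pass over the dictionary, this approach fits comfortably within CircuitPython’s standard modules.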

We rely solely on CircuitPython’s random, json, and time modules to create inferences; there are no additional dependencies. That’s it! For text generation to work, you must copy the following files to the Pi Pico:

  • dino_chain.json
  • markov_chain_parser.py
  • generate_text.py

I hope you found this article interesting. You can also use the src/generate_text_oled.py file to show the generated text on an OLED display.