Text Generation on Raspi Pico using Markov Chains

2022-08-20

Random dinosaur names generated by Raspi Pico displayed on OLED screen

I enjoy playing around with microcontrollers and other programmable gadgets, and lately I’ve been paying close attention to TinyML and TFLite. I’ve always wanted to move some of my machine learning models onto microcontroller devices. Recently, I acquired a Raspberry Pi Pico and started experimenting with it. My goal was to create a lightweight character-based RNN that could generate random text on a 0.96-inch OLED display. However, when it came time to execute this plan, I encountered a host of issues.

First, TFLite for Microcontrollers is written in C++, so running it on a Pi Pico requires a fully functional C++ toolchain and a correctly configured Pi Pico SDK. While working through several TFLite examples for the Pi Pico, I ran into numerous issues with CMake. Even after I thought I had built the code correctly, nothing happened when I copied it to my Pi Pico. Unfortunately, I lack the tools needed to debug on the device, and debugging without them is a challenging task.

Here is the link to the project’s GitHub repo: https://github.com/code2k13/text_gen_circuitpython

CircuitPython, my friend!

After encountering numerous issues with C++, I decided to switch to CircuitPython. The problem is that TFLite bindings are currently not available in CircuitPython. Although there is an open-source project aimed at resolving this, it does not support the Pi Pico. As a result, I decided to explore other methods of generating text while continuing to use CircuitPython. This is when I came across the concept of Markov Chains, which are essentially state transition maps: for each current state, they list the possible next states along with their respective probabilities.
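As a toy illustration (the states and numbers below are made up and are not taken from the project), such a transition map can be written as a dictionary of dictionaries in which each row of probabilities sums to 1:

```python
# Hypothetical two-state weather chain; the probabilities are invented
# purely for illustration.
transitions = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}


def most_likely_next(state):
    """Return the most probable successor of `state`."""
    options = transitions[state]
    return max(options, key=options.get)


def row_sums():
    """Return the total probability mass leaving each state."""
    return {state: sum(options.values()) for state, options in transitions.items()}
```

The character-based chain used later in this post has the same shape, except that the states are single characters and the values start out as raw counts rather than probabilities.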

Fortunately, there is a Python module called markovify that can generate a Markov Chain from text and use it to make inferences. Below are some sample sentences that were generated with the ‘markovify’ module from the text of the book ‘Alice in Wonderland’.

Alice said nothing; she had found the fan and gloves.
She was walking by the time when she caught it, and on both sides at once.
She went in search of her sister, who was gently brushing away some dead leaves that had a consultation about this, and after a few things indeed were really impossible.
And so it was too slippery; and when Alice had begun to think that very few things indeed were really impossible.
Alas! it was as much as she spoke, but no result seemed to have lessons to learn!
They are waiting on the bank, with her arms round it as a cushion, resting their elbows on it, and talking over its head.

After some experimentation with markovify, I came to appreciate this simple yet effective approach (although it is not as effective as neural networks). Consequently, I created a “pure” implementation of generating Markov Chains and making inferences from them using CircuitPython.

To create a character-based Markov Chain model from any text file, run the src/generate_chain.py file in the repository with a standard Python interpreter (it will not work with CircuitPython). By default, it reads a CSV file of dinosaur names (dinosaurs.csv, which must be downloaded separately) and creates dino_chain.json.

import re
from collections import Counter
import json

# You will need to supply a text file containing one name per line.
# A sample one can be downloaded from
# https://raw.githubusercontent.com/junosuarez/dinosaurs/master/dinosaurs.csv
with open("dinosaurs.csv") as f:
    txt = f.read()
unique_letters = set(txt)

chain = {}
for letter in unique_letters:
    next_chars = []
    # re.escape keeps characters such as "." or "(" from being read as regex syntax.
    for m in re.finditer(re.escape(letter), txt):
        start_idx = m.start()
        if start_idx + 1 < len(txt):
            next_chars.append(txt[start_idx + 1])
    # Count how often each character follows `letter`.
    chain[letter] = dict(Counter(next_chars))

with open("dino_chain.json", "w") as out_file:
    json.dump(chain, out_file)

Any text file can be used as input to this script, but files with a short entry on each line tend to produce the best results (such as names of places, people, animals, compounds, etc.). Long sentences are not well suited to character-based chains; for those it is better to use a word or n-gram-based chain instead. Once generated, the resulting Markov Chain JSON file (dino_chain.json) should resemble the following format:

{
...
  "x": {
    "\n": 39,
    "a": 9,
    "i": 34,
    "l": 1,
    "o": 2,
    "y": 1,
    "e": 4,
    "u": 3
  },
  "z": {
    "s": 1,
    "o": 4,
    "i": 5,
    "u": 6,
    "e": 3,
    "a": 13,
    "h": 21,
    "k": 3,
    "y": 1,
    "r": 1,
    "b": 1
  },
  .....
}

It is, in essence, a dictionary of dictionaries. The first-level keys are the text file’s unique set of characters. Every character is mapped to a second-level dictionary whose keys are the characters that appear immediately after the first-level key character and whose values are the number of times each one was encountered. In the preceding JSON, for example, we can see that ‘x’ is followed by ‘\n’ (newline) 39 times, by ‘a’ 9 times, and so on. It’s as straightforward as that.
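To make the structure concrete, here is a small sketch that normalizes the counts for ‘x’ from the JSON excerpt above into probabilities. The helper name to_probabilities is my own for this illustration and is not part of the repository.

```python
# Follow-up counts for "x", copied from the JSON excerpt above.
x_counts = {"\n": 39, "a": 9, "i": 34, "l": 1, "o": 2, "y": 1, "e": 4, "u": 3}


def to_probabilities(counts):
    """Normalize raw follow-up counts so that they sum to 1."""
    total = sum(counts.values())
    return {char: count / total for char, count in counts.items()}


x_probs = to_probabilities(x_counts)
print(round(x_probs["\n"], 3))  # 39 / 93, i.e. 0.419
```

In other words, in this data set a name ends right after an ‘x’ roughly 42% of the time.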

To produce inferences (random text), we use this JSON structure and convert the counts to probabilities in order to pick the next character for a given character. Python’s random.choices could do this, but it is not available in CircuitPython, so I wrote my own function (_custom_random_choices). It lives in the src/markov_chain_parser.py file, which is a utility script for generating inferences. To generate text, use the src/generate_text.py file, which contains the following code:

import json
import time
from markov_chain_parser import generate_text

# Load the chain produced by generate_chain.py.
with open('dino_chain.json', 'r') as f:
    data = f.read()
dino_chain = json.loads(data)

while True:
    # Generate a batch of text and split it into candidate names.
    text = generate_text(200, dino_chain)
    names = text.split("\n")
    for n in names:
        # Keep only names of a plausible length (5 to 14 characters).
        if 4 < len(n) < 15:
            print(n)
    print("--------------------------------------")
    time.sleep(10)
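The contents of markov_chain_parser.py are not reproduced in this post. As a rough sketch of the idea only (the names weighted_choice and generate_text_sketch are hypothetical, and this is not the repository’s code), a weighted pick can be built from random.randint alone by drawing a number up to the total count and walking the cumulative counts:

```python
import random


def weighted_choice(counts):
    """Pick a key from `counts` with probability proportional to its count."""
    total = sum(counts.values())
    target = random.randint(1, total)  # inclusive on both ends
    running = 0
    for char, count in counts.items():
        running += count
        if target <= running:
            return char


def generate_text_sketch(length, chain, start="\n"):
    """Walk the chain for up to `length` steps, starting after `start`."""
    # Fall back to an arbitrary known character if `start` is not in the chain.
    current = start if start in chain else next(iter(chain))
    out = []
    for _ in range(length):
        followers = chain.get(current)
        if not followers:
            break  # dead end: nothing was ever seen after this character
        current = weighted_choice(followers)
        out.append(current)
    return "".join(out)
```

Working with integer counts directly avoids floating-point probabilities altogether, which seems like a reasonable fit for a small microcontroller, though the actual implementation may differ.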

Inference relies only on CircuitPython’s random, json, and time modules; there are no additional dependencies. For text generation to work, copy the following files to the Pi Pico:

dino_chain.json
markov_chain_parser.py
generate_text.py

You may also use the src/generate_text_oled.py file to show the generated text on an OLED display. I hope you found the article interesting.
