Tsetlin Machine: Fresh Thinking in ML

“Speed is the most important feature.”

Fred Wilson

This repository provides an alternative Fuzzy-Pattern Tsetlin Machine (FPTM) implementation with zero external dependencies and blazingly fast performance. It achieves over 32 million MNIST predictions per second at 98% accuracy, with a throughput of 4 GB/s on a desktop CPU.

Key Features

  • Up to 10× faster training and 34× faster inference compared to the original FPTM implementation, achieved through the use of bitwise operations, SIMD instructions, and a specialized memory layout.
  • Binary classifier.
  • Multi-class classifier.
  • Single-threaded and multi-threaded training and inference.
  • Specialized BitSet index over literals to improve performance on very large, sparse binary vector inputs.
  • Model compilation to reduce memory usage and increase inference speed.
  • Save and load trained models for production deployment or continued training with modified hyperparameters.
  • Automatic selection of UInt8 or UInt16 Tsetlin Automata based on the number of TA states (see the sketch after this list).
  • Automatic switching between binary and multi-class classification depending on the dataset.
  • Built-in benchmarking tool.
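
To illustrate the automatic state-width selection mentioned above: automata whose state count fits into a byte can use narrower counters. Below is a minimal, hypothetical sketch of that rule, not the library's actual code (ta_type is an illustrative name):

ta_type(states_num::Integer) = states_num <= 256 ? UInt8 : UInt16  # state values 0:255 fit into a UInt8

ta_type(256)   # => UInt8
ta_type(1024)  # => UInt16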

Quick Start

Talk is cheap, show me some examples.

Below is an example of character-level text generation in the style of Shakespeare.

First, install the Julia language by running the following command and following the installation instructions:

curl -fsSL https://install.julialang.org | sh

In the first terminal window, run the following command to train the model over multiple epochs:

julia -t auto examples/TEXT/train.jl

In the second terminal window, run the following command after each training epoch to observe how the quality of the generated text evolves from one epoch to the next:

julia examples/TEXT/sample.jl

After 400+ epochs, you should see output similar to the following:

ROMEO:
The father's death,
And then I shall be so;
For I have done that was a queen,
That I may be so, my lord.

JULIET:
I would have should be so, for the prince,
And then I shall be so;
For the princely father with the princess,
And then I shall be the virtue of your soul,
Which your son,--

ESCALUS:
What, what should be particular me to death.

BUCKINGHAM:
God save the queen's proclaim'd:
Come, come, the Duke of York.

KING EDWARD IV:
So do I do not know the prince,
And then I shall be so, and such a part.

KING RICHARD III:
Shall I be some confess the state,
Which way the sun the prince's dead;
And then I will be so.

How To Use

Here is a quick "Hello, World!" example of a typical use case with the Tsetlin Machine.

Importing the necessary functions and the MNIST dataset:

using MLDatasets: MNIST
# The relative import below assumes the repository's Tsetlin module has already been loaded into the current session.
using .Tsetlin: TMInput, TMClassifier, train!, predict, accuracy, save, load, unzip, booleanize, compile, benchmark

x_train, y_train = unzip([MNIST(:train)...])
x_test, y_test = unzip([MNIST(:test)...])
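
Here `unzip` splits the vector of (image, label) pairs produced by splatting the MLDatasets iterator into two parallel vectors. A rough sketch of that behavior (the real helper is exported by the Tsetlin module; unzip_sketch is an illustrative name):

unzip_sketch(pairs) = (first.(pairs), last.(pairs))  # split [(x1, y1), (x2, y2), ...] into ([x1, x2, ...], [y1, y2, ...])

xs, ys = unzip_sketch([([1, 2], 0), ([3, 4], 1)])    # => ([[1, 2], [3, 4]], [0, 1])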

Booleanizing input data (2 bits per pixel):

x_train = [booleanize(x, 0, 0.5) for x in x_train]
x_test = [booleanize(x, 0, 0.5) for x in x_test]
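
Conceptually, `booleanize(x, 0, 0.5)` thresholds every pixel against each of the given cut-offs, so each pixel becomes 2 booleans (hence 2 bits per pixel). A hypothetical sketch of this thresholding idea; the actual `booleanize` is provided by the Tsetlin module and may differ in detail, such as output layout:

function booleanize_sketch(x, thresholds...)
    # One boolean per (threshold, element) pair: is the value above the threshold?
    [v > t for t in thresholds for v in x]
end

booleanize_sketch([0.0, 0.3, 0.9], 0, 0.5)  # => Bool[false, true, true, false, false, true]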

Hyperparameters

This implementation introduces some differences compared to the Vanilla Tsetlin Machine:

  • L — limits the number of included literals in a clause.
  • LF — a new hyperparameter that sets the number of literal misses allowed per clause.

Define the hyperparameters:

CLAUSES = 20   # Number of clauses per class
T       = 20   # Voting threshold
S       = 200  # Specificity
L       = 150  # Maximum literals per clause
LF      = 75   # Allowed failed literals per clause

EPOCHS  = 1000 # Number of training epochs

Train the model over 1000 epochs and save the compiled model to disk:

tm = TMClassifier(x_train[1], y_train, CLAUSES, T, S, L=L, LF=LF, states_num=256, include_limit=240)
train!(tm, x_train, y_train, x_test, y_test, EPOCHS, shuffle=true, index=false)
save(compile(tm), "/tmp/tm_last.tm")

Loading the compiled model and evaluating accuracy:

tm = load("/tmp/tm_last.tm")
accuracy(predict(tm, x_test), y_test) |> println

Benchmarking the compiled model:

benchmark(tm, x_test, y_test, 1000 * 4, warmup=true, index=false)

More Examples

This repository includes examples for MNIST, Fashion-MNIST, CIFAR-10, AmazonSales, IMDb sentiment analysis, and Shakespeare character-level text generation.

Instructions on how to run the examples can be found here.
