A lightweight Natural Language Probabilistic Intent Engine designed to match free-form user queries to predefined intents using a hybrid of similarity metrics and Bayesian scoring. Ideal for chatbots, virtual assistants, and any system needing a simple yet robust intent recognition layer.
## Features

**Combined Similarity & Bayesian Scoring**

- Word-level F1 overlap for lexical matching
- Character n-gram (trigram) Jaccard for fuzzy matching
- Length penalty to normalize short vs. long inputs
- Bayesian log-likelihood with configurable match / non-match priors and example-frequency priors
- Final score is a 50/50 blend of similarity and Bayesian scores

**Clean Preprocessing**

- Lowercasing and removal of punctuation (preserves math operators)
- Stopword filtering to focus on keywords

**Simple API**

- `train_intent(intent_name, example_query)` to register labeled example phrases
- `match(query) -> (intent_name, score)` to infer the best intent and its confidence

**No External Dependencies**

- Pure Python standard library (`re`, `math`, `collections`)

## Installation

Clone or download, then include `intentmatcher.py` in your project. No additional packages are required.

If you prefer a package structure, copy it into your module and import:

```python
# Your project structure:
# myapp/
#   intentmatcher.py
#   main.py

from intentmatcher import IntentMatcher
```

## Quick Start

```python
# 1. Instantiate the engine
matcher = IntentMatcher(p_match=0.9, p_nomatch=0.1)

# 2. Train with example phrases
matcher.train_intent("get_balance", "What is my current account balance?")
matcher.train_intent("get_balance", "Show me my balance.")
matcher.train_intent("transfer", "Send $100 to Alice.")
matcher.train_intent("transfer", "Transfer funds to Bob.")

# 3. Match new queries
intent, score = matcher.match("Could you please show my balance now?")
print(intent, score)  # => "get_balance", e.g. 0.82

intent, score = matcher.match("I want to transfer funds.")
print(intent, score)  # => "transfer", e.g. 0.76
```

## How It Works

**Preprocessing**

- `clean_text`: lowercase, remove non-word characters, filter stopwords.
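
A minimal sketch of what this step could look like in plain Python. The stopword list and exact regex below are illustrative assumptions, not the module's actual choices:

```python
import re

# Illustrative stopword subset; the real list in intentmatcher.py may differ.
STOPWORDS = {"a", "an", "the", "is", "my", "to", "me", "please", "you", "could"}

def clean_text(text: str) -> list[str]:
    """Lowercase, strip punctuation (keeping math operators), drop stopwords."""
    text = text.lower()
    # Keep word characters, whitespace, and basic math operators (+ - * / = %).
    text = re.sub(r"[^\w\s+\-*/=%]", " ", text)
    return [tok for tok in text.split() if tok not in STOPWORDS]

print(clean_text("Could you please show my balance now?"))
# => ['show', 'balance', 'now']
```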

**Similarity Calculation**

- `word_f1`: F1 overlap of token sets.
- `letter_ngram_sim`: Jaccard similarity of 3-gram character shingles.
- `length_penalty`: penalizes extreme length mismatches.
- Combined (see the sketch below): `(0.5 * word_f1 + 0.3 * letter_ngram_sim) * length_penalty`
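
Under the standard definitions of F1 and Jaccard, these pieces could be sketched as follows. The length penalty shown is one plausible form; the actual formula in `intentmatcher.py` may differ:

```python
def word_f1(tokens_a: set[str], tokens_b: set[str]) -> float:
    """F1 overlap of two token sets: harmonic mean of precision and recall."""
    overlap = len(tokens_a & tokens_b)
    if overlap == 0:
        return 0.0
    precision = overlap / len(tokens_a)
    recall = overlap / len(tokens_b)
    return 2 * precision * recall / (precision + recall)

def trigrams(text: str) -> set[str]:
    """Character 3-gram shingles of a string."""
    return {text[i:i + 3] for i in range(len(text) - 2)}

def letter_ngram_sim(a: str, b: str) -> float:
    """Jaccard similarity of character trigram sets."""
    ga, gb = trigrams(a), trigrams(b)
    if not ga or not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)

def length_penalty(len_a: int, len_b: int) -> float:
    """One plausible penalty: ratio of shorter to longer token count."""
    if len_a == 0 or len_b == 0:
        return 0.0
    return min(len_a, len_b) / max(len_a, len_b)

def combined_similarity(query_tokens: list[str], example_tokens: list[str]) -> float:
    # Weights from the README: 0.5 word-level, 0.3 character-level.
    sim = (0.5 * word_f1(set(query_tokens), set(example_tokens))
           + 0.3 * letter_ngram_sim(" ".join(query_tokens), " ".join(example_tokens)))
    return sim * length_penalty(len(query_tokens), len(example_tokens))
```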

**Bayesian Scoring**

- Log-sum of `p_match` vs. `p_nomatch` per token match.
- Plus a log prior based on how often each example was seen during training.
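
In sketch form, assuming `train_intent` keeps a frequency count per example (`example_count` and `total_examples` are hypothetical names for those counts):

```python
import math

def bayesian_score(query_tokens: list[str], example_tokens: list[str],
                   example_count: int, total_examples: int,
                   p_match: float = 0.9, p_nomatch: float = 0.1) -> float:
    """Per-token log-likelihood plus a log frequency prior (illustrative)."""
    example_set = set(example_tokens)
    # Each query token contributes log(p_match) if it appears in the
    # example, log(p_nomatch) otherwise.
    log_likelihood = sum(
        math.log(p_match if tok in example_set else p_nomatch)
        for tok in query_tokens
    )
    # Example-frequency prior: examples registered more often get a boost.
    log_prior = math.log(example_count / total_examples)
    return log_likelihood + log_prior
```

The raw result is a negative log value, so it presumably gets mapped into [0, 1] (for example by exponentiating a per-token average) before entering the 50/50 blend with the similarity score.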

**Intent Matching**

- For each intent, score the query against each of its examples.
- Take the maximum example score as the intent score.
- Return the intent with the highest overall score (see the sketch below).
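
Putting it together, matching reduces to a max over examples per intent, roughly as follows. This assumes the sketches above, a hypothetical `normalized_bayes` helper that maps the log score into [0, 1], and a `self.intents` dict of intent name to example token lists:

```python
def match(self, query: str) -> tuple[str | None, float]:
    """Return (best_intent, best_score) across all trained intents."""
    query_tokens = clean_text(query)
    best_intent, best_score = None, float("-inf")
    for intent_name, examples in self.intents.items():
        # An intent scores as well as its best-matching example.
        intent_score = max(
            0.5 * combined_similarity(query_tokens, ex_tokens)
            + 0.5 * normalized_bayes(query_tokens, ex_tokens)  # hypothetical helper
            for ex_tokens in examples
        )
        if intent_score > best_score:
            best_intent, best_score = intent_name, intent_score
    return best_intent, best_score
```

Taking the max (rather than the mean) over examples means a single strong paraphrase is enough for an intent to win.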

## Configuration

- `p_match` and `p_nomatch` determine the strength of the Bayesian likelihood when tokens match or don't.
- Tweak the weights inside `combined_similarity` to adjust word-level vs. n-gram importance.
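
For instance, sharper priors make matched tokens count for more and missing tokens cost more (the values below are illustrative):

```python
# Stricter: matched tokens weigh heavily, missing tokens are punished hard.
strict = IntentMatcher(p_match=0.95, p_nomatch=0.05)

# More forgiving: missing tokens hurt less, useful for noisy user input.
lenient = IntentMatcher(p_match=0.8, p_nomatch=0.2)
```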

## Tips

- Use `match()` in your bot's message handler to route queries to handlers:

  ```python
  intent, score = matcher.match(user_input)
  if intent == "get_balance" and score > 0.5:
      handle_balance()
  else:
      fallback()
  ```

- Preload your intents at application startup for minimal latency.

## License

MIT License. See `LICENSE` for details.

Built and maintained by Tommy Muga, a Software Engineer and Computational Modeler.