tommygrammar/IntentMatcher

IntentMatcher

A lightweight Natural Language Probabilistic Intent Engine designed to match free-form user queries to predefined intents using a hybrid of similarity metrics and Bayesian scoring. Ideal for chatbots, virtual assistants, and any system needing a simple yet robust intent recognition layer.


🚀 Features

  • Combined Similarity & Bayesian Scoring

    • Word-level F1 overlap for lexical matching
    • Character n-gram (trigram) Jaccard for fuzzy matching
    • Length penalty to normalize short vs. long inputs
    • Bayesian log-likelihood with configurable match / non-match priors and example-frequency priors
    • Final score is a 50/50 blend of similarity and Bayesian scores
  • Clean Preprocessing

    • Lowercasing and removal of punctuation (preserves math operators)
    • Stopword filtering to focus on keywords
  • Simple API

    • train_intent(intent_name, example_query) to register labeled expressions
    • match(query) -> (intent_name, score) to infer best intent and confidence
  • No External Dependencies

    • Pure Python standard library (re, math, collections)
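
To make the preprocessing feature concrete, here is a minimal sketch of what such a cleaning step could look like. It is not the library's code: the function name `clean_text_sketch`, the stopword list, and the exact set of preserved math operators (`+ - * / =`) are all assumptions made for illustration.

```python
import re

# Hypothetical stopword list; the library's actual list is not shown in this README.
STOPWORDS = {"a", "an", "the", "is", "my", "me", "to", "of",
             "please", "could", "you", "what"}

# Assumed rule: drop everything except word characters, whitespace,
# and the basic math operators + - * / =
_DROP = re.compile(r"[^\w\s+\-*/=]")


def clean_text_sketch(text):
    """Lowercase, strip punctuation (keeping math operators), remove stopwords."""
    text = _DROP.sub(" ", text.lower())
    return [tok for tok in text.split() if tok not in STOPWORDS]


tokens = clean_text_sketch("What is my current account balance?")
math_tokens = clean_text_sketch("What is 2 + 2?")
```

With this stopword list, the first call keeps only the content words of the question, and the second keeps the `+` operator alongside the digits.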

📦 Installation

Clone or download the repository, then copy intentmatcher.py into your project. No additional packages are required.

If you prefer a package structure, place the file inside your package and import it from there:

# Your project structure:
#   myapp/
#     intentmatcher.py
#     main.py

🏁 Quickstart

from intentmatcher import IntentMatcher

# 1. Instantiate the engine
matcher = IntentMatcher(p_match=0.9, p_nomatch=0.1)

# 2. Train with example phrases
matcher.train_intent("get_balance", "What is my current account balance?")
matcher.train_intent("get_balance", "Show me my balance.")
matcher.train_intent("transfer", "Send $100 to Alice.")
matcher.train_intent("transfer", "Transfer funds to Bob.")

# 3. Match new queries
intent, score = matcher.match("Could you please show my balance now?")
print(intent, score)  # => "get_balance", e.g. 0.82

intent, score = matcher.match("I want to transfer funds.")
print(intent, score)  # => "transfer", e.g. 0.76

🔍 How It Works

  1. Preprocessing

    • clean_text: lowercase, remove punctuation (math operators are preserved), filter stopwords.
  2. Similarity Calculation

    • word_f1: F1 overlap of token sets.
    • letter_ngram_sim: Jaccard of 3-gram character shingles.
    • length_penalty: penalizes extreme length mismatches.
    • Combined similarity: (0.5 * word_f1 + 0.3 * letter_ngram_sim) * length_penalty
  3. Bayesian Scoring

    • Sums log(p_match) for each token that matches and log(p_nomatch) for each token that does not.
    • Adds a log prior based on how often examples were seen.
  4. Intent Matching

    • For each intent, score against each example.
    • Take maximum example score as the intent score.
    • Return the intent with highest overall score.
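
The similarity and matching steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's implementation: every `*_sketch` name is hypothetical, the length penalty is assumed to be the ratio of the shorter to the longer token count, trigrams are assumed to be taken over the space-joined tokens, and the Bayesian half of the final score is left out.

```python
def word_f1_sketch(query_tokens, example_tokens):
    """F1 overlap between two token sets."""
    q, e = set(query_tokens), set(example_tokens)
    overlap = len(q & e)
    if overlap == 0:
        return 0.0
    precision = overlap / len(q)
    recall = overlap / len(e)
    return 2 * precision * recall / (precision + recall)


def trigram_jaccard_sketch(a, b):
    """Jaccard similarity of the character 3-gram sets of two strings."""
    def shingles(s):
        return {s[i:i + 3] for i in range(len(s) - 2)}
    sa, sb = shingles(a), shingles(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def length_penalty_sketch(query_tokens, example_tokens):
    """Assumed form: shorter token count divided by longer token count."""
    if not query_tokens or not example_tokens:
        return 0.0
    short, long_ = sorted((len(query_tokens), len(example_tokens)))
    return short / long_


def combined_similarity_sketch(query_tokens, example_tokens):
    """Weights taken from the formula in this README."""
    f1 = word_f1_sketch(query_tokens, example_tokens)
    tri = trigram_jaccard_sketch(" ".join(query_tokens), " ".join(example_tokens))
    return (0.5 * f1 + 0.3 * tri) * length_penalty_sketch(query_tokens, example_tokens)


def best_intent_sketch(query_tokens, examples_by_intent):
    """Score each intent by its best example; return (intent, score)."""
    best = (None, 0.0)
    for intent, examples in examples_by_intent.items():
        score = max(
            (combined_similarity_sketch(query_tokens, ex) for ex in examples),
            default=0.0,
        )
        if score > best[1]:
            best = (intent, score)
    return best


examples = {
    "get_balance": [["show", "balance"]],
    "transfer": [["transfer", "funds", "bob"]],
}
intent, score = best_intent_sketch(["show", "balance", "now"], examples)
```

Note that with weights of 0.5 and 0.3, this sketch tops out at 0.8 for identical inputs, and a query that shares nothing with any example returns `(None, 0.0)`.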

⚙️ Configuration

  • p_match and p_nomatch set the likelihood assigned to a token when it matches or doesn’t. The wider the gap between them, the more strongly token overlap moves the Bayesian score.
  • Tweak weights inside combined_similarity to adjust word vs. n-gram importance.
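
To see how p_match and p_nomatch shape the Bayesian score, here is a small sketch of a per-token log-likelihood. It assumes that each query token is checked against the example's token set and that the prior is passed in as a probability in (0, 1]. The function name `bayes_log_score_sketch` and the `prior` parameter are hypothetical, and the library's actual scoring may differ.

```python
import math


def bayes_log_score_sketch(query_tokens, example_tokens,
                           p_match=0.9, p_nomatch=0.1, prior=1.0):
    """Log prior plus log(p_match) per matching token, log(p_nomatch) otherwise."""
    example_set = set(example_tokens)
    score = math.log(prior)  # assumed: prior derived from example frequency
    for tok in query_tokens:
        score += math.log(p_match if tok in example_set else p_nomatch)
    return score


example = ["show", "balance"]
full = bayes_log_score_sketch(["show", "balance"], example)
partial = bayes_log_score_sketch(["show", "funds"], example)

# Gap between a full and a partial match under two settings.
default_gap = full - partial
sharp_gap = (
    bayes_log_score_sketch(["show", "balance"], example, 0.99, 0.01)
    - bayes_log_score_sketch(["show", "funds"], example, 0.99, 0.01)
)
```

In this sketch each mismatched token costs log(p_match / p_nomatch), so moving from 0.9 / 0.1 to 0.99 / 0.01 widens the gap between a full and a partial match. The scores are log-probabilities and therefore never positive; this README does not describe how they are rescaled before the 50/50 blend with similarity.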

🛠️ Integration Tips

  • Use match() in your bot’s message handler to route queries to handlers:

    intent, score = matcher.match(user_input)
    if intent == "get_balance" and score > 0.5:
        handle_balance()
    else:
        fallback()
  • Preload your intents at application startup for minimal latency.
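
When there are more than a couple of intents, the if/else routing shown above can be replaced by a lookup table. The sketch below is one possible shape for that, not part of IntentMatcher: `route_sketch`, the `handlers` mapping, and the default threshold of 0.5 (borrowed from the snippet above) are illustrative assumptions.

```python
def route_sketch(intent, score, handlers, fallback, threshold=0.5):
    """Call the handler registered for `intent` if the score clears the threshold."""
    handler = handlers.get(intent)
    if handler is None or score <= threshold:
        return fallback()
    return handler()


# Hypothetical handlers standing in for real bot logic.
handlers = {
    "get_balance": lambda: "balance handler",
    "transfer": lambda: "transfer handler",
}


def fallback():
    return "fallback handler"


# In a real bot, intent and score would come from matcher.match(user_input).
confident = route_sketch("get_balance", 0.82, handlers, fallback)
uncertain = route_sketch("get_balance", 0.30, handlers, fallback)
unknown = route_sketch("cancel_card", 0.90, handlers, fallback)
```

Keeping the threshold as a parameter makes it easy to tune per deployment, since a suitable cutoff depends on how scores are distributed for your own training examples.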


📜 License

MIT License. See LICENSE for details.


Built and maintained by Tommy Muga, a Software Engineer and Computational Modeler.
