Skip to content

Conversation

@magdyksaleh
Copy link
Contributor

No description provided.

@magdyksaleh magdyksaleh requested review from Copilot and tgaddair May 21, 2025 17:40
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR enhances the /classify and /classify_batch endpoints to support Named Entity Recognition (NER) by returning raw NER entities and linkable fields, and implements NER postprocessing and regex-based transaction annotation in the inference layer.

  • Switch classification inputs from plain strings to ClassifyInput with original_description and amount, and add optional parameters.
  • Change handlers to return a structured ClassifyResponse (with raw_ner and linkable_fields) instead of raw entity lists.
  • Introduce NER postprocessing utilities in infer.rs (using lazy_static and Regex) and wire them into Infer::classify and streaming batch classify.

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

File Description
router/src/server.rs Updated HTTP handlers to consume ClassifyInput, record NER metrics, and return ClassifyResponse.
router/src/lib.rs Defined ClassifyInput, ClassifyParameters, ClassifyResponse, and BatchClassifyResponse types.
router/src/infer.rs Added NER postprocessing (postprocess_entity_rust, regex annotation, linkable-fields builder) and updated Infer methods.
router/Cargo.toml Added lazy_static dependency for static regex initializations.
Comments suppressed due to low confidence (6)

router/src/infer.rs:42

  • [nitpick] Suffix _RUST on constant names is redundant and may confuse readers. Consider renaming to MIN_LENGTH_PER_ENTITY for clarity and consistency.
static ref MIN_LENGTH_PER_ENTITY_RUST: HashMap<&'static str, usize> = {

router/src/infer.rs:49

  • The key "store number" includes a space, whereas other entity types use single words or snake_case. Verify this matches upstream entity_group values or consider using store_number for consistency.
m.insert("store number", 5); // Note: Rust variable names typically use snake_case

router/src/infer.rs:1111

  • Using expect here will panic the server if an ID is missing. Consider handling this case more gracefully (e.g., returning an error) to avoid runtime panics.
let request_id = id.expect("Classify response in batch missing ID. This is a bug.");

router/src/server.rs:1875

  • The new parameters field in ClassifyRequest is never accessed inside the handler. Consider validating or passing it to the inference layer if intended, or remove until needed.
Json(req): Json<ClassifyRequest>,

router/src/lib.rs:1221

  • The optional parameters field is defined in the request types but never read or validated by the handlers. Consider wiring it through or removing until it’s needed.
struct ClassifyParameters {

router/src/server.rs:1952

  • In the utoipa::path macro, body = Vec<ClassifyResponse> may not be recognized as a valid type reference. Verify the OpenAPI schema generates correctly or use a wrapper type.
(status = 200, description = "Classifications", body = Vec<ClassifyResponse>),

Comment on lines +1259 to +1263
#[derive(Debug, Serialize)]
struct BatchClassifyResponse {
responses: Vec<ClassifyResponse>,
}

Copy link

Copilot AI May 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The BatchClassifyResponse struct is added but never used by the handlers (they now return Vec<ClassifyResponse> directly). Consider removing it to reduce dead code.

Suggested change
#[derive(Debug, Serialize)]
struct BatchClassifyResponse {
responses: Vec<ClassifyResponse>,
}

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants