Legacy ara language not working with recent versions of tesseract #3929

@naourass

Environment

  • Tesseract Version: 5.x, 4.1.x, 4.0.x
  • Platform: Linux DESKTOP-**** 5.10.102.1-microsoft-standard-WSL2 x86_64 GNU/Linux (Ubuntu 20.04)

Current Behavior:

Other legacy languages work fine with recent versions of Tesseract, but the legacy ara 4.00 model fails with the following error in every version listed above when run with --oem 0:
read_params_file: Can't open txt
mgr->GetComponent(TESSDATA_INTTEMP, &fp):Error:Assert failed:in file src/classify/adaptmatch.cpp

I'm using tessdata 4.00 because the Arabic legacy model appears to have been removed from the newer versions.
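
The failed assertion names the TESSDATA_INTTEMP component, so one way to narrow this down is to check whether the ara.traineddata file in use contains the legacy classifier data at all. The sketch below is a rough diagnostic, not a confirmed fix. It assumes the `combine_tessdata` tool from the Tesseract training tools is on PATH, that `combine_tessdata -d` lists component names such as `inttemp` in its output, and that `pffmtable` and `normproto` are also required by the legacy engine. The helper names are hypothetical.

```python
import shutil
import subprocess

# Component names the legacy (--oem 0) classifier is assumed to need.
# "inttemp" is the one named in the assertion (TESSDATA_INTTEMP).
LEGACY_COMPONENTS = ("inttemp", "pffmtable", "normproto")


def missing_legacy_components(listing: str) -> list[str]:
    """Return the legacy component names that do not appear in a listing."""
    return [name for name in LEGACY_COMPONENTS if name not in listing]


def list_components(traineddata_path: str) -> str:
    """Run `combine_tessdata -d <file>` and return stdout plus stderr."""
    exe = shutil.which("combine_tessdata")
    if exe is None:
        raise FileNotFoundError("combine_tessdata not found on PATH")
    result = subprocess.run(
        [exe, "-d", traineddata_path],
        capture_output=True,
        text=True,
        check=False,  # inspect the output even if the tool exits non-zero
    )
    return result.stdout + result.stderr
```

Calling `missing_legacy_components(list_components("/path/to/ara.traineddata"))` with the real file path returns an empty list if all three names are present. A non-empty result would suggest the file is LSTM-only, which would be consistent with the assertion above.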

Suggested Fix:

Update the ara traineddata file to include legacy support for Tesseract 5.x, or add documentation for installing Tesseract 4.00.