Segmentation fault when initializing with null language #4028

@ccouzens

Description

Basic Information

tesseract 5.2.0
leptonica-1.82.0
libgif 5.2.1 : libjpeg 6b (libjpeg-turbo 2.1.3) : libpng 1.6.37 : libtiff 4.4.0 : zlib 1.2.12 : libwebp 1.3.0
Found AVX2
Found AVX
Found FMA
Found SSE4.1

Operating System

No response

Other Operating System

Fedora Linux 37

But this was originally reported to me from a user on a Mac M1 (presumably macOS 13 Ventura).

uname -a

Linux fedora-desktop 6.1.14-200.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Sun Feb 26 00:13:26 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Compiler

gcc version 12.2.1 20221121 (Red Hat 12.2.1-4) (GCC)

Virtualization / Containers

No response

CPU

13th Gen Intel® Core™ i7-13700K

Current Behavior

When TessBaseAPIInit3(cube, NULL, NULL) is called, the language is not set to a sensible default, which later causes a segmentation fault when TessBaseAPIRecognize is called.

Expected Behavior

Given that the documentation says:

The language is (usually) an ISO 639-3 string or nullptr will default to eng.

I would expect NULL to work the same way as "eng", rather than causing a segmentation fault at the Recognize step.

Suggested Fix

A null language pointer should default to "eng", as the documentation states.
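
Until the library handles this itself, a caller-side workaround is possible. The following is a minimal sketch, assuming the caller is free to substitute the language before calling TessBaseAPIInit3; the helper name lang_or_default is hypothetical and is not part of the Tesseract C API.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the Tesseract API): return "eng" when
   the caller passes a null language, otherwise return the language
   unchanged. This mirrors the default that the documentation describes. */
static const char *lang_or_default(const char *lang) {
    return lang != NULL ? lang : "eng";
}
```

With this in place, the init call in the test case would become TessBaseAPIInit3(cube, NULL, lang_or_default(NULL)), which is equivalent to passing "eng" explicitly, the variant noted in the test case as succeeding.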

Other Information

Test case program

#include <stdio.h>
#include <tesseract/capi.h>
#include <leptonica/allheaders.h>

int main(int argc, char *argv[]) {
    TessBaseAPI *cube = TessBaseAPICreate();
    TessBaseAPIInit3(cube, NULL, NULL); // change this 2nd `NULL` to "eng" for success

    PIX *image = pixRead("img.png");
    TessBaseAPISetImage2(cube, image);
    TessBaseAPIRecognize(cube, NULL); // the segmentation fault occurs here
    char *text = TessBaseAPIGetUTF8Text(cube);
    printf("%s\n", text);
    TessDeleteText(text);
    pixDestroy(&image); // frees the PIX itself, not only its pixel data
    TessBaseAPIDelete(cube);
    return 0;
}

Build and run with: gcc test.c $(pkg-config --cflags --libs tesseract) $(pkg-config --cflags --libs lept) && ./a.out

This was originally reported against a Rust wrapper: antimatter15/tesseract-rs#34
