PARSeq Model #2089

sineeli · 2025-02-10T22:36:45Z

PARSeq Model

Description of the Change

This PR adds an end-to-end scene text recognition model, PARSeq, to KerasHub. PARSeq is a ViT-based OCR model that enables iterative decoding for robust text recognition in natural scenes.

Closes the first half of #<issue_number>

Reference

For details, see Scene Text Recognition with Permuted Autoregressive Sequence Models (PARSeq paper). The model and configuration are based on the official paper and open-source implementation.

Colab Notebook

Usage and numerics matching Colab:

Checklist

I have added all the necessary unit tests for my change.
I have verified that my change does not break existing code and works with all backends (TensorFlow, JAX, and PyTorch).
My PR is based on the latest changes of the main branch (if unsure, rebase the code).
I have followed the Keras Hub Model contribution guidelines in making these changes.
I have followed the Keras Hub API design guidelines in making these changes.
I have signed the Contributor License Agreement.

keras_hub/src/models/parseq/parseq_tokenizer.py

abheesht17 · 2025-02-20T16:01:27Z

@sineeli - which parts of the PR are ready for review? Asking because it's still marked as draft

sineeli · 2025-02-20T18:52:00Z

Sure @abheesht17

I think the preprocessing and tokenizer parts are ready for review first, since they are the primary steps:

keras_hub/src/models/parseq/parseq_tokenizer.py
keras_hub/src/models/text_recognition_preprocessor.py

abheesht17

Thanks for the PR! Left some comments on the tokeniser. Will take a look at the text recognition preprocessor soon.

Sorry for the delay in reviewing.

keras_hub/src/models/parseq/parseq_tokenizer.py

abheesht17 · 2025-02-25T02:24:03Z

keras_hub/src/models/parseq/parseq_tokenizer.py

+        self.char_to_id = tf.lookup.StaticHashTable(
+            initializer=tf.lookup.KeyValueTensorInitializer(
+                keys=list(self._stoi.keys()),
+                values=list(self._stoi.values()),
+                key_dtype=tf.string,
+                value_dtype=tf.int32,
+            ),
+            default_value=0,
+        )
+        self.id_to_char = tf.lookup.StaticHashTable(
+            initializer=tf.lookup.KeyValueTensorInitializer(
+                keys=list(self._stoi.values()),
+                values=list(self._stoi.keys()),
+                key_dtype=tf.int32,
+                value_dtype=tf.string,
+            ),
+            default_value=self.pad_token,
+        )


The defaults don't match: char_to_id falls back to 0, which is EOS (the 0th token), while id_to_char falls back to the pad token, which is the (len(vocabulary) - 1)th token.

sineeli · 2025-02-27

I noticed the same thing in the original code. They appear to use EOS -> 0 and BOS -> len(vocabulary), but when padding they put BOS first and then EOS at the end.
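To make the mismatch discussed above concrete, here is a minimal pure-Python sketch (it is not the PR's TensorFlow code). It assumes the layout described in the review comment, with EOS at index 0 and PAD at the last index. The token strings `[E]`, `[B]`, `[P]`, the helper names, and the choice of PAD as the shared fallback are assumptions made for illustration only.

```python
# Hypothetical special tokens; the real tokenizer may use different strings.
EOS, BOS, PAD = "[E]", "[B]", "[P]"


def build_vocab(charset):
    """Build id->token and token->id maps with EOS first and PAD last."""
    itos = [EOS] + list(charset) + [BOS, PAD]
    stoi = {token: index for index, token in enumerate(itos)}
    return itos, stoi


itos, stoi = build_vocab("abc")
pad_id = stoi[PAD]


def char_to_id(ch):
    # Unknown characters fall back to the PAD id (an assumed choice).
    return stoi.get(ch, pad_id)


def id_to_char(index):
    # Out-of-range ids fall back to the PAD token, matching char_to_id.
    return itos[index] if 0 <= index < len(itos) else PAD
```

With both fallbacks pointing at the same token, an unknown character round-trips to PAD instead of being silently decoded as EOS.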

keras_hub/src/models/parseq/parseq_tokenizer.py

abheesht17 · 2025-02-25T02:29:14Z

keras_hub/src/models/parseq/parseq_tokenizer.py

+            label = tf.strings.upper(label)
+
+        label = tf.strings.regex_replace(label, self.unsupported_regex, "")
+        label = tf.strings.substr(label, 0, self.max_label_length)


Why are we truncating the input to 25 characters?

sineeli · 2025-02-27

When the original implementation prepares the dataset, it simply drops any datapoint whose label is longer than 25 characters. Instead, I truncate the label, and we can then add the start and end tokens.

Ref: https://github.com/baudm/parseq/blob/1902db043c029a7e03a3818c616c06600af574be/strhub/data/dataset.py#L112
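The two strategies mentioned in this thread can be contrasted with a small pure-Python sketch. The function names and the sample labels are hypothetical; the limit of 25 comes from the discussion above, and neither function is taken from the PR or from the referenced dataset code.

```python
MAX_LABEL_LENGTH = 25  # limit mentioned in the review thread


def drop_long_labels(labels, max_len=MAX_LABEL_LENGTH):
    """Filter strategy: discard any label longer than max_len."""
    return [label for label in labels if len(label) <= max_len]


def truncate_labels(labels, max_len=MAX_LABEL_LENGTH):
    """Truncate strategy: keep every label, cut to max_len characters."""
    return [label[:max_len] for label in labels]


labels = ["short", "x" * 30]
kept = drop_long_labels(labels)
cut = truncate_labels(labels)
```

Filtering changes the number of samples, while truncation keeps every sample but may train on labels that no longer match the full text in the image.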

keras_hub/src/models/parseq/parseq_tokenizer.py

sachinprasadhs

Thanks, I've added some comments.
Could you please add a PR description following the recent PR description template, including a Colab notebook link with an end-to-end working demo and numerics verification?
Also, please add a reference to the original implementation in the PR description.

keras_hub/src/models/parseq/parseq_backbone.py

keras_hub/src/models/parseq/parseq_decoder.py

keras_hub/src/models/parseq/parseq_tokenizer.py

divyashreepathihalli · 2025-07-11T00:05:34Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces the PARSeq model, a ViT-based OCR model, to KerasHub. I've identified a few issues, including two critical bugs related to model serialization and tokenizer functionality that must be addressed. I've also found a couple of medium-severity issues regarding a typo in a layer name and a docstring example that should be corrected for clarity and maintainability.

keras_hub/src/models/parseq/parseq_causal_lm_preprocessor.py

keras_hub/src/models/parseq/parseq_tokenizer.py

keras_hub/src/models/parseq/parseq_causal_lm.py

keras_hub/src/models/parseq/parseq_decoder.py

tools/checkpoint_conversion/convert_parseq_checkpoints.py

sineeli added 13 commits January 31, 2025 11:11

Base for parseq model

528d3a4

make it vit compatiable with diff height and width sizes

3bf11cd

correct vit conv scripts

a8fb177

make class token optional in backbone by default its included

6f4363a

add flags to adjust vit network

d1cece0

add test case for without class_token

92b2745

Merge branch 'master' into parseq

ed00b73

decoder file

25f661c

parseq tokenizer base

f97fab1

add api for parseq tokenizer

d424210

Add missing arg max_label_length.

3f3ad0d

nit

bb4457e

Merge branch 'master' into parseq

68829f8

sineeli commented Feb 10, 2025

keras_hub/src/models/parseq/parseq_tokenizer.py

sineeli added 5 commits February 11, 2025 15:28

add missing normalization step using tf_text

1bde466

add missing config for preprocessor

e6c5379

add default start, pad and end tokens

5b08c93

nit

49260ef

correct special token order

b4150ed

abheesht17 self-assigned this Feb 18, 2025

divyashreepathihalli requested a review from abheesht17 February 18, 2025 17:20

sineeli added 3 commits February 18, 2025 10:33

return padding mask as well

ed8b9d7

use proper keras ops

4e4511c

nit

9222331

abheesht17 requested changes Feb 25, 2025

sineeli added 3 commits March 3, 2025 11:42

add decoder for parseq

78a07a0

Build unbuilt layers for model validation

decc12c

fix forward pass and decoder

7aa2b67

sineeli requested a review from sachinprasadhs May 30, 2025 21:10

sachinprasadhs reviewed Jun 9, 2025

sineeli added 3 commits June 18, 2025 22:56

add example usage for backbone and causal lm

751b0a8

nit

3860843

Merge remote-tracking branch 'upstream/master' into parseq

6f5f093

sachinprasadhs added the kokoro:force-run (Runs Tests on GPU) label and removed the WIP (Pull requests which are work in progress and not ready yet for review) label Jun 23, 2025

kokoro-team removed the kokoro:force-run Runs Tests on GPU label Jun 23, 2025

gemini-code-assist bot reviewed Jul 11, 2025

sachinprasadhs added this to KerasHub Jul 16, 2025

sachinprasadhs moved this to In Progress in KerasHub Jul 16, 2025

Merge branch 'keras-team:master' into parseq

6634f23

sineeli added the kokoro:force-run Runs Tests on GPU label Aug 6, 2025

kokoro-team removed the kokoro:force-run Runs Tests on GPU label Aug 6, 2025

sineeli added 2 commits August 8, 2025 20:16

fix minor issues

4843f1f

use default params from args as we preprocessor can be None

bfb5ff7

sineeli added the kokoro:force-run Runs Tests on GPU label Aug 9, 2025

kokoro-team removed the kokoro:force-run Runs Tests on GPU label Aug 9, 2025

during pre compile self variables not available

8fbcd68

sineeli added the kokoro:force-run Runs Tests on GPU label Aug 9, 2025

kokoro-team removed the kokoro:force-run Runs Tests on GPU label Aug 9, 2025

nit

210d860

sineeli added the kokoro:force-run Runs Tests on GPU label Aug 9, 2025

kokoro-team removed the kokoro:force-run Runs Tests on GPU label Aug 9, 2025

sachinprasadhs reviewed Aug 11, 2025

tools/checkpoint_conversion/convert_parseq_checkpoints.py (outdated)

nit

6d065af

sachinprasadhs changed the title from "[WIP] PARSeq Model" to "PARSeq Model" Aug 13, 2025

sachinprasadhs added the kokoro:force-run Runs Tests on GPU label Aug 13, 2025

kokoro-team removed the kokoro:force-run Runs Tests on GPU label Aug 13, 2025
