Is the code to generate the augmented data available anywhere?

In the paper, the authors write:

> **Augmentations.** Token augmentation consists of randomly inserting up to 4 typos per token up to 25% of the token length. This is consistent with an observed maximum human error frequency of around 20% [11]. We use 22 distinct typo augmentations, which can be grouped into four categories: deletion, insertion, substitution, and transposition. For each token, we randomly select a target augmentation percentage between 0-25%, and for each augmentation step we randomly apply an augmentation from one of the four typo categories. The full list of augmentations used is reported in Appendix D.

Is the code to apply these augmentations available anywhere? I'd like to use and adapt it for my specific use case.
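For context, here is a rough sketch of how I currently read that paragraph. It is my own guess, not the authors' code: it implements only one generic operation per category rather than the 22 augmentations listed in Appendix D, and the names (`augment_token`, `OPS`, `ALPHABET`), the lowercase ASCII alphabet, and the choice to round the typo count down are all assumptions on my part.

```python
import random
import string

# Assumption: inserted/substituted characters are drawn from lowercase ASCII.
ALPHABET = string.ascii_lowercase


def delete_char(token, rng):
    """Remove one random character (no-op on tokens shorter than 2)."""
    if len(token) < 2:
        return token
    i = rng.randrange(len(token))
    return token[:i] + token[i + 1:]


def insert_char(token, rng):
    """Insert one random character at a random position."""
    i = rng.randrange(len(token) + 1)
    return token[:i] + rng.choice(ALPHABET) + token[i:]


def substitute_char(token, rng):
    """Replace one random character with a random character."""
    if not token:
        return token
    i = rng.randrange(len(token))
    return token[:i] + rng.choice(ALPHABET) + token[i + 1:]


def transpose_chars(token, rng):
    """Swap two adjacent characters (no-op on tokens shorter than 2)."""
    if len(token) < 2:
        return token
    i = rng.randrange(len(token) - 1)
    return token[:i] + token[i + 1] + token[i] + token[i + 2:]


# One representative operation per category named in the paper.
OPS = (delete_char, insert_char, substitute_char, transpose_chars)


def augment_token(token, rng, max_typos=4, max_fraction=0.25):
    """Apply up to `max_typos` random typos, capped by a sampled fraction of the token length."""
    # Sample a target augmentation percentage between 0 and 25%.
    target = rng.uniform(0.0, max_fraction)
    # Assumption: the number of typos is the target fraction of the length, rounded down.
    n_typos = min(max_typos, int(len(token) * target))
    for _ in range(n_typos):
        token = rng.choice(OPS)(token, rng)
    return token


if __name__ == "__main__":
    rng = random.Random(0)
    for word in ["augmentation", "transposition", "cat"]:
        print(word, "->", augment_token(word, rng))
```

With this reading, tokens shorter than 4 characters are never modified, because 25% of their length rounds down to zero typos. I'm not sure whether that matches the paper, which is part of why I'd like to see the original code.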

Is the code to generate the augmented data available anywhere? #44

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Is the code to generate the augmented data available anywhere? #44

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions