This repository contains the code and data resources for my ISCL master's thesis project (2022-2023 Fall Semester) at the Eberhard Karls University of Tübingen.

yixuanwu4/LowResourceTextClassification-CN

LowResourceTextClassification-CN (Thesis Project)

This repository contains the code and resources for the master's thesis "Exploring different Chinese segmentation approaches: Benefits of radical-based segmentation in low-resource text classification" (winter semester 2022-2023, Eberhard-Karls-Universität Tübingen).

Resources used in this project

  1. Data for the TNews experiments: https://metatext.io/datasets/toutiao-text-classification-for-news-titles-(tnews)-(clue-benchmark)
  2. Data for the ChnSentiCorp experiments: https://ieee-dataport.org/open-access/chnsenticorp
  3. Data for the WU3D experiments: https://github.com/aidenwang9867/Weibo-User-Depression-Detection-Dataset
  4. Data for the SWSR experiments: https://zenodo.org/record/4773875
  5. The radical list: https://github.com/hankcs/sub-character-cws/blob/master/data/radical/radical.txt
