
Commit 2294421

[Numpy Refactor] BART (#1282)

* init
* fix convert roberta
* rename TransformerNMTModel as TransformerModel
* update bart
* fix
* fix
* update init
* add layernorm_embedding for transformer
* convert script
* encoder
* fix
* fix vocab
* fix roberta
* fix
* fix electra
* add conversion bash for roberta and xlmr
* ELECTRA SETUP
* convert bart decoder
* fix
* update
* testing output
* remove arange_like for embeddings
* fix
* update
* use_pooler for bart
* fix
* upload params for bart
* add test_models_bart
* fix cfg
* test bart
* update
* fix transformer
* Squashed commit of the following:
  * commit 9e1ffde (ZheyuYe, Thu Jul 30 11:42:01 2020 +0800): todo
  * commit 9a7c343 (ZheyuYe, Thu Jul 30 10:53:15 2020 +0800): revert gelu
  * commit 0425346 (ZheyuYe, Thu Jul 30 10:49:52 2020 +0800): re-upload bart
  * commit 516ae84 (ZheyuYe, Thu Jul 30 03:32:35 2020 +0800): use_qkv_bias for transformer
  * commit 9d60cda (ZheyuYe, Thu Jul 30 03:17:28 2020 +0800): classifier_activation
  * commit 510d991 (ZheyuYe, Thu Jul 30 02:33:22 2020 +0800): test
  * commit 1b5fa7b (ZheyuYe, Thu Jul 30 01:48:01 2020 +0800): fix comment1
  * commit 6533601 (ZheyuYe, Thu Jul 30 01:27:44 2020 +0800): fix comment
  * commit a8853f9 (ZheyuYe, Thu Jul 30 01:10:06 2020 +0800): Squashed commit of the following:
    * commit 232e0b6 (ZheyuYe, Thu Jul 30 01:05:17 2020 +0800): update
    * commit 995e5d7 (ZheyuYe, Thu Jul 30 01:01:56 2020 +0800): fix
    * commit 9623240 (ZheyuYe, Thu Jul 30 00:52:17 2020 +0800): fix
    * commit d9c4140 (ZheyuYe, Wed Jul 29 23:07:10 2020 +0800): fix transformer
    * commit e49fbe1 (ZheyuYe, Wed Jul 29 22:18:12 2020 +0800): update
    * commit 1f75b26 (ZheyuYe, Wed Jul 29 22:04:08 2020 +0800): test bart
    * commit 5bab516 (ZheyuYe, Wed Jul 29 21:34:47 2020 +0800): fix cfg
    * commit 6c62a29 (merge of 3366cf3 and 033214e; ZheyuYe, Wed Jul 29 21:33:10 2020 +0800): Merge remote-tracking branch 'upstream/numpy' into bart
    * commit 033214e (Xingjian Shi, Wed Jul 29 00:36:57 2020 -0700): [Numpy] Fix SQuAD + Fix GLUE downloading (#1280) (Update run_squad.py; Update run_squad.py; Update prepare_glue.py)
    * commit 3c87457 (Xingjian Shi, Tue Jul 28 18:03:21 2020 -0700): Add layout + compute_layout support: TransformerNMT, BERT, ALBERT, ELECTRA, MobileBERT, RoBERTA, XLMR (#1258) (Add layout support; fix test; Update transformer.py; Update transformer.py; Update README.md; try to add set_layout; update test case; fix; update; update; update; Update bert.py; fix bug; update; Update test_models_bert.py; Update tokenizers.py; add compute layout; Update xlmr.py; Update test_models_bert.py; revise test cases; Update layers.py; move jieba to try import; fix; Update transformer.py; fix; Update bert.py; Update setup.py; Update test_models_bert.py; Update test_models_bert.py; fix; update; Revise; Update electra.py; Update electra.py; Update test_models_electra.py; fix; fix bug; Update test_models_albert.py; add more testcases; fix; Update albert.py; Update albert.py; fix bug; fix testcase; Update test_models_electra.py; Update bert.py; update; Update test_models_electra.py; Update mobilebert.py; Update mobilebert.py; update mobilebert; Update test_models_mobilebert.py; Update mobilebert.py; fix bug; Update roberta.py; fix roberta; update; update; fix import; fix bug; update; reduce test workloads; address comment; address comment)
    * commit 4d43f82 (Sheng Zha, Mon Jul 27 20:21:00 2020 -0700): add subversion/wget to docker, add readme (#1279)
    * commit d76897b (phile, Tue Jul 28 10:10:13 2020 +0800): Add embedding related methods in numpy version (#1263) (A draft for embedding; fix embed_loader; add hyperbolic space and some updates; revise evaluation; fix; simple fixes; move l2norm to op.py; new features; fix; update; add tests, update; newline; fix comment; use xavier for embedding initializer)
1 parent 033214e commit 2294421

18 files changed: +962 −137 lines
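Among the changelog entries above, `add layernorm_embedding for transformer` marks the main architectural switch this port needs: fairseq's BART normalizes the summed token and positional embeddings (before dropout), which a vanilla transformer embedding stack does not. A minimal Gluon sketch of the idea; the class name and defaults here are illustrative, not the repository's actual code:

```python
from mxnet.gluon import nn

class EmbeddingWithLayerNorm(nn.Block):
    """Illustrative sketch of a BART-style `layernorm_embedding` option."""

    def __init__(self, vocab_size=50265, units=768, max_length=1024,
                 layernorm_embedding=True, dropout=0.1):
        super().__init__()
        self.token_embed = nn.Embedding(vocab_size, units)
        self.pos_embed = nn.Embedding(max_length, units)
        # The new flag: BART normalizes the embedding sum, the plain
        # transformer skips this step.
        self.embed_ln = nn.LayerNorm() if layernorm_embedding else None
        self.dropout = nn.Dropout(dropout)

    def forward(self, tokens, positions):
        x = self.token_embed(tokens) + self.pos_embed(positions)
        if self.embed_ln is not None:
            x = self.embed_ln(x)
        return self.dropout(x)
```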

scripts/conversion_toolkits/README.md

Lines changed: 32 additions & 23 deletions
@@ -12,6 +12,8 @@ The testing step mentioned above are controlled by the flag `--test`, in which t
 tolerance of 1e-3 between gluon model with converted weights and original tensorflow model.
 In addition, we can use GPU in all converting scripts by adding `--gpu 0`.
 
+For the RoBERTa, XLM-R, and BART models, please install the [fairseq](https://github.com/pytorch/fairseq#requirements-and-installation) package locally via `pip install git+https://github.com/pytorch/fairseq.git@master`.
+
 ## BERT
 Convert model from [BERT LIST](https://tfhub.dev/google/collections/bert/1).
 
@@ -37,25 +39,42 @@ do
 done
 ```
 
-## RoBERTa
+## ELECTRA
+The TF Hub is not available for the ELECTRA model currently.
+Thus, you will need to clone the [electra repository](https://github.com/ZheyuYe/electra)
+and download the checkpoint. The parameters are converted from local checkpoints.
+By running the following command, you can convert and verify the ELECTRA model with both the discriminator and the generator.
+
+Notice: please set up the `--electra_path` with the cloned path ~~or get this electra repository packaged by `pip install -e .`.~~
+
+```bash
+# Need to use TF 1.13.2 to use contrib layer
+pip uninstall tensorflow
+pip install tensorflow==1.13.2
+
+# Actual conversion
+bash convert_electra.sh
+```
 
+## Mobile Bert
 ```bash
-pip install fairseq==0.9.0
+bash convert_mobilebert.sh
+```
 
+## RoBERTa
+```bash
 for model in base large
 do
 mkdir roberta_${model}
 wget "https://dl.fbaipublicfiles.com/fairseq/models/roberta.${model}.tar.gz"
 tar zxf roberta.${model}.tar.gz --directory roberta_${model}
-python convert_fairseq_roberta.py --fairseq_model_path roberta_${model}/roberta.${model} --model_size ${model} --test
+python convert_fairseq_roberta.py --fairseq_model_path roberta_${model}/roberta.${model} --test
 done
 ```
 
 ## XLM-R
 
 ```bash
-pip install fairseq==0.9.0
-
 for model in base large
 do
 mkdir xlmr_${model}
@@ -65,23 +84,13 @@ do
 done
 ```
 
-## ELECTRA
-The TF Hub is not available for ELECTRA model currently.
-Thus, you will need to clone the [electra repository](https://github.com/ZheyuYe/electra)
-and download the checkpoint. The parameters are converted from local checkpoints.
-By running the following command, you can convert + verify the ELECTRA model with both the discriminator and the generator.
-
-Notice: pleas set up the `--electra_path` with the cloned path or get this electra repository packaged by `pip install -e .`.
-
+## BART
 ```bash
-# Need to use TF 1.13.2 to use contrib layer
-pip install tensorflow==1.13.2 --upgrade --force-reinstall
-
-# Actual conversion
-bash convert_electra.sh
-```
-
-## Mobile Bert
-```bash
-bash convert_mobilebert.sh
+for model in base large
+do
+mkdir bart_${model}
+wget "https://dl.fbaipublicfiles.com/fairseq/models/bart.${model}.tar.gz"
+tar zxf bart.${model}.tar.gz --directory bart_${model}
+python convert_fairseq_bart.py --fairseq_model_path bart_${model}/bart.${model} --test
+done
 ```
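Throughout these commands, `--test` triggers the check described at the top of this README: the Gluon model with converted weights must match the original model to a 1e-3 tolerance, and per the intro, every converting script also accepts `--gpu 0`. The tolerance check boils down to an elementwise comparison along these lines (a sketch of the idea, not the converters' exact test code):

```python
import numpy as np
import numpy.testing as npt

def check_conversion(gluon_out: np.ndarray, reference_out: np.ndarray,
                     tol: float = 1e-3) -> None:
    """Raise an AssertionError if any element differs beyond the tolerance."""
    # assert_allclose passes when |gluon - reference| <= atol + rtol * |reference|.
    npt.assert_allclose(gluon_out, reference_out, rtol=tol, atol=tol)
```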
scripts/conversion_toolkits/convert_bart.sh

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+for model in base large
+do
+mkdir bart_${model}
+wget "https://dl.fbaipublicfiles.com/fairseq/models/bart.${model}.tar.gz"
+tar zxf bart.${model}.tar.gz --directory bart_${model}
+python convert_fairseq_bart.py --fairseq_model_path bart_${model}/bart.${model} --test
+done
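The new script only downloads and unpacks the checkpoints; the conversion itself lives in `convert_fairseq_bart.py`, which is not part of this excerpt. On the fairseq side, loading each extracted checkpoint presumably goes through fairseq's documented hub API, roughly:

```python
from fairseq.models.bart import BARTModel

# Directory layout produced by the script above: bart_base/bart.base/model.pt
bart = BARTModel.from_pretrained('bart_base/bart.base', checkpoint_file='model.pt')
bart.eval()

# A converter would then walk the PyTorch state dict and map each parameter
# name and shape onto the corresponding Gluon parameter (mapping not shown).
for name, param in bart.model.state_dict().items():
    print(name, tuple(param.shape))
```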

scripts/conversion_toolkits/convert_electra.py

Lines changed: 10 additions & 7 deletions
@@ -53,7 +53,9 @@ def read_tf_checkpoint(path):
     return tensors
 
 
-def get_dict_config(model_size, electra_dir):
+def get_dict_config(model_size, electra_path):
+    sys.path.append(electra_path)
+    electra_dir = os.path.abspath(os.path.join(os.path.dirname(electra_path), os.path.pardir))
     sys.path.append(electra_dir)
     from electra.util.training_utils import get_bert_config
     from electra.configure_pretraining import PretrainingConfig
@@ -100,7 +102,7 @@ def convert_tf_config(config_dict, vocab_size):
     return cfg
 
 
-def convert_tf_assets(tf_assets_dir, model_size, electra_dir):
+def convert_tf_assets(tf_assets_dir, model_size, electra_path):
     """Convert the assets file including config, vocab and tokenizer model"""
     file_names = os.listdir(tf_assets_dir)
     vocab_path = None
@@ -113,7 +115,7 @@ def convert_tf_assets(tf_assets_dir, model_size, electra_dir):
     if vocab_path:
         vocab_path = os.path.join(tf_assets_dir, vocab_path)
         vocab_size = len(open(vocab_path, 'rU').readlines())
-    config_dict = get_dict_config(model_size, electra_dir)
+    config_dict = get_dict_config(model_size, electra_path)
     cfg = convert_tf_config(config_dict, vocab_size)
     return cfg, vocab_path
 
@@ -190,12 +192,12 @@ def get_name_map(tf_names, convert_type='backbone'):
     return name_map
 
 
-def convert_tf_model(model_dir, save_dir, test_conversion, model_size, gpu, electra_dir):
+def convert_tf_model(model_dir, save_dir, test_conversion, model_size, gpu, electra_path):
     ctx = mx.gpu(gpu) if gpu is not None else mx.cpu()
     if not os.path.exists(save_dir):
         os.makedirs(save_dir)
 
-    cfg, vocab_path = convert_tf_assets(model_dir, model_size, electra_dir)
+    cfg, vocab_path = convert_tf_assets(model_dir, model_size, electra_path)
     with open(os.path.join(save_dir, 'model.yml'), 'w') as of:
         of.write(cfg.dump())
     new_vocab = HuggingFaceWordPieceTokenizer(
@@ -234,6 +236,8 @@ def convert_tf_model(model_dir, save_dir, test_conversion, model_size, gpu, elec
     tf_names = list(tf_names)
 
     # reload the electra module for this local scope
+    sys.path.append(electra_path)
+    electra_dir = os.path.abspath(os.path.join(os.path.dirname(electra_path), os.path.pardir))
     sys.path.append(electra_dir)
     from electra.util.training_utils import get_bert_config
     from electra.configure_pretraining import PretrainingConfig
@@ -426,11 +430,10 @@ def convert_qkv_weights(tf_prefix, mx_prefix):
     logging_config()
     save_dir = args.save_dir if args.save_dir is not None else os.path.basename(
         args.tf_model_path) + '_gluon'
-    electra_dir = os.path.abspath(os.path.join(os.path.dirname(args.electra_path), os.path.pardir))
     convert_tf_model(
         args.tf_model_path,
         save_dir,
         args.test,
         args.model_size,
         args.gpu,
-        electra_dir)
+        args.electra_path)
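One subtlety in the relocated `electra_dir` computation: `os.path.dirname` strips the last path component, so the derived directory, and with it whether `from electra.util.training_utils import ...` can resolve, depends on how `--electra_path` is spelled; appending `electra_path` itself additionally covers the repository's internal imports. A quick illustration with hypothetical paths:

```python
import os

def derive_parent(electra_path: str) -> str:
    # Mirrors the expression used in convert_electra.py above.
    return os.path.abspath(os.path.join(os.path.dirname(electra_path), os.path.pardir))

# With a trailing slash, dirname() yields the clone itself, and pardir
# steps up to the directory that contains it:
print(derive_parent('/home/user/electra/'))  # -> /home/user  (puts the `electra` package on the path)

# Without the trailing slash, the result is one level too high:
print(derive_parent('/home/user/electra'))   # -> /home
```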
