# FineTune BERT with Stanford Sentiment Treebank
- Reference [repo](https://github.com/kabirahuja2431/FineTuneBERT)

## Requirements
- Python 3.9
- MindSpore 2.3.1
- MindNLP 0.4.1
- pandas

## Data
Download the data from this [link](https://gluebenchmark.com/tasks); the main zip download option is on the right side of the page. Extract the archive and place its contents in `data/SST-2/`.

## Args for training the model
To train the model with the weights of the BERT layers frozen, set:
```
args.freeze_bert = True
```
To train the entire model, i.e., both the BERT layers and the classification layer, set:
```
args.freeze_bert = False
```
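The effect of this flag can be sketched in framework-agnostic Python. This is an illustrative sketch only: the `Param` class, the `select_trainable` helper, and the `bert.` name prefix are hypothetical stand-ins, not the project's actual API.

```python
# Sketch of what args.freeze_bert controls: when True, the BERT backbone
# parameters are excluded from gradient updates, so only the
# classification head is trained.

class Param:
    """Minimal stand-in for a framework parameter object."""
    def __init__(self, name):
        self.name = name
        self.requires_grad = True

def select_trainable(params, freeze_bert):
    """Disable gradients on BERT backbone params when freeze_bert is True."""
    for p in params:
        if freeze_bert and p.name.startswith("bert."):
            p.requires_grad = False
    return [p for p in params if p.requires_grad]

params = [Param("bert.encoder.layer.0.dense.weight"),
          Param("classifier.weight")]
trainable = select_trainable(params, freeze_bert=True)
# With freeze_bert=True, only the classifier parameter stays trainable.
```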

Other optional arguments:
- `args.device_target`: target device, e.g. Ascend
- `args.device_id`: device ID
- `args.base_model_name_or_path`: 'bert-base-uncased' or a local path to the model
- `args.dataset_name_or_path`: path to the data directory
- `args.maxlen`: maximum length of the input sequence
- `args.batch_size`: batch size
- `args.lr`: learning rate
- `args.print_every`: log the loss and accuracy every this many iterations
- `args.max_eps`: maximum number of epochs
- `args.save_path`: directory to save the model, e.g. './outputs/'; if not provided, the model will not be saved
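The options above map naturally onto a standard `argparse` parser. The sketch below is an assumed reconstruction: the option names mirror the list above, but the defaults are illustrative guesses, not the project's actual values.

```python
import argparse

# Hypothetical reconstruction of the training-script argument parser;
# defaults are assumptions for illustration.
parser = argparse.ArgumentParser()
parser.add_argument("--freeze_bert", action="store_true")
parser.add_argument("--device_target", default="Ascend")
parser.add_argument("--device_id", type=int, default=0)
parser.add_argument("--base_model_name_or_path", default="bert-base-uncased")
parser.add_argument("--dataset_name_or_path", default="data/SST-2/")
parser.add_argument("--maxlen", type=int, default=64)
parser.add_argument("--batch_size", type=int, default=32)
parser.add_argument("--lr", type=float, default=2e-5)
parser.add_argument("--print_every", type=int, default=100)
parser.add_argument("--max_eps", type=int, default=2)
parser.add_argument("--save_path", default=None)

# Example: fine-tune the full model with a custom learning rate.
args = parser.parse_args(["--lr", "1e-5"])
```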

## Results
### My results on MindSpore
|Model Variant|Accuracy on Dev Set|
|-------------|-------------------|
|BERT (no finetuning)|81.25%|
|BERT (with finetuning)|90.07%|

Requirements:
- Ascend 910B
- Python 3.9
- MindSpore 2.3.1
- MindNLP 0.4.1

### My results on PyTorch
|Model Variant|Accuracy on Dev Set|
|-------------|-------------------|
|BERT (no finetuning)|81.03%|
|BERT (with finetuning)|89.84%|

Requirements:
- GPU: GTX 1080 Ti
- CUDA 11.1.1
- Python 3.9
- PyTorch 1.10.2
- Transformers 4.45.2

### Original results from the repo
|Model Variant|Accuracy on Dev Set|
|-------------|-------------------|
|BERT (no finetuning)|82.59%|
|BERT (with finetuning)|88.29%|

Requirements:
- Python 3.6
- PyTorch 1.2.0
- Transformers 2.0.0