## Fine-tune
### Requirement
If you want to train this operator, you need to install these dependencies in addition to those in requirements.txt:
```shell
python -m pip install datasets evaluate scikit-learn
```
### Get started
In short, you only need to construct an op instance and pass in a few configurations to train on the specified task.
```python
import towhee
bert_op = towhee.ops.sentence_embedding.transformers(model_name='bert-base-uncased').get_op()
data_args = {
'dataset_name': 'wikitext',
'dataset_config_name': 'wikitext-2-raw-v1',
}
training_args = {
    'num_train_epochs': 3, # increase the number of epochs for better results
'per_device_train_batch_size': 8,
'per_device_eval_batch_size': 8,
'do_train': True,
'do_eval': True,
'output_dir': './tmp/test-mlm',
'overwrite_output_dir': True
}
bert_op.train(task='mlm', data_args=data_args, training_args=training_args)
```
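Because `data_args` and `training_args` are plain Python dicts, you can keep a set of shared defaults and merge per-run overrides before calling `train`. A minimal sketch (the default values here are illustrative, not the operator's actual defaults):

```python
# Shared defaults reused across fine-tuning runs (illustrative values only).
default_training_args = {
    'per_device_train_batch_size': 8,
    'per_device_eval_batch_size': 8,
    'do_train': True,
    'do_eval': True,
    'overwrite_output_dir': True,
}

# Per-run overrides; later keys win when the dicts are merged.
overrides = {
    'num_train_epochs': 5,
    'output_dir': './tmp/test-mlm',
}

# Merge: overrides take precedence over the shared defaults.
training_args = {**default_training_args, **overrides}
```

The merged dict can then be passed as `training_args=training_args` in the `bert_op.train(...)` call above.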
For more information, refer to the [examples](https://github.com/towhee-io/examples/tree/main/fine_tune/6_train_language_modeling_tasks).
### Dive deep and customize your training
You can modify the [training script](https://towhee.io/text-embedding/transformers/src/branch/main/train_clm_with_hf_trainer.py) to suit your own needs,
or refer to the original [Hugging Face transformers training examples](https://github.com/huggingface/transformers/blob/main/examples/pytorch/language-modeling).
