# Machine Translation with Opus-MT
*author: David Wang*
<br />
## Description
A machine translation operator translates a sentence, paragraph, or document from a source language
into a target language. This operator uses models trained on [OPUS](https://opus.nlpl.eu/) data by Helsinki-NLP.
More details can be found in [Helsinki-NLP/Opus-MT](https://github.com/Helsinki-NLP/Opus-MT).
<br />
## Code Example
Use the pre-trained model 'opus-mt-en-zh'
to translate the sentence "Hello, world." into Chinese.
*Write a pipeline with explicit input and output names:*
```python
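from towhee import pipe, ops, DataCollection

# Build a pipeline that maps the 'text' column to a 'translation' column.
p = (
    pipe.input('text')
        .map('text', 'translation', ops.machine_translation.opus_mt(model_name='opus-mt-en-zh'))
        .output('text', 'translation')
)

# Run the pipeline on one sentence and display the result.
DataCollection(p('hello, world.')).show()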
```
<img src="./result.png" width="800px"/>
<br />
## Factory Constructor
Create the operator via the following factory method:
***machine_translation.opus_mt(model_name="opus-mt-en-zh")***
**Parameters:**
***model_name***: *str*
The name of the pre-trained model to load, as a string.
The default is "opus-mt-en-zh".
Supported model names:
- opus-mt-en-zh
- opus-mt-zh-en
- opus-tatoeba-en-ja
- opus-tatoeba-ja-en
- opus-mt-ru-en
- opus-mt-en-ru
<br />
## Interface
The operator takes a piece of text as a string.
It loads the tokenizer and pre-trained model specified by the model name,
and then returns the translated text as a string.
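
The flow described above (load a tokenizer and model by name, then translate a string) can be sketched with the Marian classes from Hugging Face `transformers`. This is an illustration under assumptions, not the operator's actual source: the names `load_marian` and `TranslatorSketch`, the `loader` parameter, and the `Helsinki-NLP/` repository prefix are all assumptions. The default loader needs `transformers`, `torch`, `sentencepiece`, and network access to download the weights.

```python
from typing import Any, Callable, Tuple


def load_marian(model_name: str) -> Tuple[Any, Any]:
    """Load a Marian tokenizer and model from the Hugging Face Hub."""
    # Imported lazily so the sketch can be read and tested without transformers.
    from transformers import MarianMTModel, MarianTokenizer

    repo_id = f"Helsinki-NLP/{model_name}"  # assumed repository naming
    tokenizer = MarianTokenizer.from_pretrained(repo_id)
    model = MarianMTModel.from_pretrained(repo_id)
    return tokenizer, model


class TranslatorSketch:
    """Hypothetical stand-in that mirrors the operator's described behavior."""

    def __init__(
        self,
        model_name: str = "opus-mt-en-zh",
        loader: Callable[[str], Tuple[Any, Any]] = load_marian,
    ) -> None:
        # Load the tokenizer and pre-trained model using the model name.
        self.tokenizer, self.model = loader(model_name)

    def __call__(self, text: str) -> str:
        # Tokenize the source text into model inputs.
        inputs = self.tokenizer(text, return_tensors="pt", padding=True)
        # Generate target-language token ids.
        output_ids = self.model.generate(**inputs)
        # Decode the first (and only) sequence back into a string.
        return self.tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With the dependencies installed, `TranslatorSketch("opus-mt-en-zh")("Hello, world.")` would return a Chinese translation as a string. The `loader` argument exists only so the sketch can be exercised with stand-in objects.
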
***__call__(text)***
**Parameters:**
***text***: *str*
The source-language text, as a string.
**Returns:**
*str*
The translated text in the target language.