Machine Translation with Opus-MT

author: David Wang


Description

A machine translation operator translates a sentence, paragraph, or document from a source language into a target language. This operator uses pre-trained models that Helsinki-NLP trained on OPUS data. More details can be found in Helsinki-NLP/Opus-MT.


Code Example

Use the pre-trained model 'opus-mt-en-zh' to generate the Chinese translation for the sentence "Hello, world.".

Write a pipeline that names its inputs and outputs explicitly:

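from towhee.dc2 import pipe, ops, DataCollection

# Build a pipeline: take 'text' as input, translate it, and return both columns.
p = (
    pipe.input('text')
    .map('text', 'translation', ops.machine_translation.opus_mt(model_name='opus-mt-en-zh'))
    .output('text', 'translation')
)

# Run the pipeline on one sentence and display the result as a table.
DataCollection(p('Hello, world.')).show()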


Factory Constructor

Create the operator via the following factory method:

machine_translation.opus_mt(model_name="opus-mt-en-zh")

Parameters:

model_name: str

The name of the pre-trained model to load, given as a string. Defaults to "opus-mt-en-zh".

Supported model names:

  • opus-mt-en-zh
  • opus-mt-zh-en
  • opus-tatoeba-en-ja
  • opus-tatoeba-ja-en
  • opus-mt-ru-en
  • opus-mt-en-ru
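As a rough illustration of how these names could be checked before a model is loaded, here is a minimal sketch. The helper names `resolve_hub_id` and `language_pair` are hypothetical and not part of the operator, and the "Helsinki-NLP/" prefix is an assumption about where the checkpoints are hosted.

```python
# Model names listed in this README.
SUPPORTED_MODELS = (
    "opus-mt-en-zh",
    "opus-mt-zh-en",
    "opus-tatoeba-en-ja",
    "opus-tatoeba-ja-en",
    "opus-mt-ru-en",
    "opus-mt-en-ru",
)


def resolve_hub_id(model_name: str = "opus-mt-en-zh", org: str = "Helsinki-NLP") -> str:
    """Hypothetical helper: validate a model name and build a hub identifier.

    The `org` prefix is an assumption, not something this README states.
    """
    if model_name not in SUPPORTED_MODELS:
        raise ValueError(
            f"Unsupported model name {model_name!r}; expected one of {SUPPORTED_MODELS}"
        )
    return f"{org}/{model_name}"


def language_pair(model_name: str) -> tuple:
    """Hypothetical helper: read (source, target) codes from the last two name parts."""
    source, target = model_name.split("-")[-2:]
    return source, target
```

For example, `language_pair("opus-tatoeba-ja-en")` yields the source code "ja" and the target code "en", which makes it easy to confirm that the chosen model matches the direction of translation you need.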


Interface

The operator takes a piece of text as a string. It loads the tokenizer and the pre-trained model that match the model name, then returns the translated text as a string.
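The load-then-translate flow described above can be sketched roughly as follows. This is not the operator's actual source: the class name `OpusMTSketch`, the optional `tokenizer` and `model` arguments, the "Helsinki-NLP/" hub prefix, and the use of the Marian classes from Hugging Face `transformers` are all assumptions made for illustration.

```python
class OpusMTSketch:
    """Illustrative stand-in for the operator; not the real implementation."""

    def __init__(self, model_name="opus-mt-en-zh", tokenizer=None, model=None):
        # Load from the hub only when no tokenizer/model is supplied.
        # Assumption: checkpoints live under the "Helsinki-NLP/" namespace.
        if tokenizer is None or model is None:
            from transformers import MarianMTModel, MarianTokenizer

            hub_id = f"Helsinki-NLP/{model_name}"
            tokenizer = MarianTokenizer.from_pretrained(hub_id)
            model = MarianMTModel.from_pretrained(hub_id)
        self.tokenizer = tokenizer
        self.model = model

    def __call__(self, text: str) -> str:
        # Tokenize the source text as a batch of one sentence.
        batch = self.tokenizer([text], return_tensors="pt", padding=True)
        # Generate target-language token ids.
        generated = self.model.generate(**batch)
        # Decode the ids back to a string, dropping special tokens.
        return self.tokenizer.batch_decode(generated, skip_special_tokens=True)[0]


# Usage (downloads the checkpoint on first run; needs transformers,
# torch, and sentencepiece installed):
# translator = OpusMTSketch("opus-mt-en-zh")
# print(translator("Hello, world."))
```

Allowing a tokenizer and model to be passed in is a design choice for this sketch only: it lets the translation path be exercised with stand-in objects without downloading a checkpoint.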

__call__(text)

Parameters:

text: str

The text in the source language.

Returns:

str

The translated text in the target language.
