# Text Embedding with Transformers

*author: Jael Gu*

## Description

A text embedding operator takes a sentence, paragraph, or document as a string and outputs an embedding vector as an ndarray that captures the input's core semantic elements. This operator is implemented with pretrained models from [Huggingface Transformers](https://huggingface.co/docs/transformers).

## Code Example

Use the pretrained model 'distilbert-base-cased' to generate a text embedding for the sentence "Hello, world.".

*Write the pipeline in simplified style:*

```python
from towhee import dc

# Trailing backslashes continue the method chain across lines.
dc.stream(["Hello, world."]) \
    .text_embedding.transformers('distilbert-base-cased') \
    .show()
```

*Write the same pipeline with explicit input/output name specifications:*

```python
from towhee import dc

# 'txt' names the input column and 'vec' names the embedding column.
dc.stream['txt'](["Hello, world."]) \
    .text_embedding.transformers['txt', 'vec']('distilbert-base-cased') \
    .select('txt', 'vec') \
    .show()
```

## Factory Constructor

Create the operator via the following factory method:

***text_embedding.transformers(model_name="bert-base-uncased")***

**Parameters:**

***model_name***: *str*

The name of the pretrained model, as a string. You can get the list of supported model names by calling `get_model_list`.

## Interface

The operator takes a text string as input. It loads the tokenizer and the pretrained model by model name, and returns the text embedding as an ndarray.

**Parameters:**

***text***: *str*

The input text, as a string.

**Returns:**

*numpy.ndarray*

The text embedding extracted by the model.
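
The interface described above (load a tokenizer and a model by name, then return an ndarray) might be sketched as follows using Huggingface Transformers directly. This is an illustration under stated assumptions, not the operator's actual source: the helper names `load_tokenizer_and_model` and `embed_text` are hypothetical, and returning the per-token last hidden state without any pooling is an assumption about the output shape.

```python
import numpy as np


def load_tokenizer_and_model(model_name: str):
    """Hypothetical helper: load a tokenizer and a pretrained model by name."""
    # Imported lazily so the rest of the sketch can be read without the
    # `transformers` and `torch` packages installed.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def embed_text(text: str, tokenizer, model) -> np.ndarray:
    """Hypothetical helper: embed one text string as a numpy.ndarray."""
    # Tokenize into PyTorch tensors with a batch dimension of 1.
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model(**inputs)
    # Assumption: use the last hidden state, shaped
    # (1, num_tokens, hidden_size), and drop the batch axis.
    features = outputs.last_hidden_state.squeeze(0)
    # Detach from the autograd graph before converting to numpy.
    return features.detach().cpu().numpy()


# Example usage (downloads the model on first run):
#   tokenizer, model = load_tokenizer_and_model("distilbert-base-cased")
#   vec = embed_text("Hello, world.", tokenizer, model)
#   print(type(vec), vec.shape)
```

Passing the tokenizer and model in as arguments keeps loading, which is slow, separate from embedding, so one loaded model can be reused across many input texts.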