Text Loader

author: shiyu22

Description

Text loader loads a document and splits it into a list of text chunks. It can load a local file (given a file path) or a web page (given a URL).

See Recursive Characters for details on how the text is split.
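To give a feel for recursive character splitting, here is a minimal pure-Python sketch. It is not the operator's actual implementation: the function name `recursive_split` and the separator order are assumptions made for illustration. The idea is to try the coarsest separator first (paragraph breaks) and fall back to finer ones (lines, words, then single characters) only for pieces that are still longer than `chunk_size`.

```python
from typing import List, Sequence


def recursive_split(
    text: str,
    chunk_size: int = 300,
    separators: Sequence[str] = ("\n\n", "\n", " ", ""),
) -> List[str]:
    """Split text into chunks of at most chunk_size characters (illustrative only)."""
    if len(text) <= chunk_size:
        return [text] if text else []

    # No separators left (or the empty separator): cut at fixed offsets.
    if not separators or separators[0] == "":
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""
    for part in text.split(sep):
        candidate = part if not current else current + sep + part
        if len(candidate) <= chunk_size:
            # The part still fits, so keep growing the current chunk.
            current = candidate
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(part) <= chunk_size:
            current = part
        else:
            # The part alone is too long: retry with finer separators.
            chunks.extend(recursive_split(part, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks


print(recursive_split("alpha beta gamma\n\ndelta epsilon", chunk_size=12))
# ['alpha beta', 'gamma', 'delta', 'epsilon']
```

Every chunk respects the size limit, and text is cut mid-word only when a single word is longer than `chunk_size`.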

Code Example

from towhee import pipe, ops, DataCollection

p = (
    pipe.input('url')
        .flat_map('url', 'text', ops.text_loader(source_type='url'))
        .output('url', 'text')
    )

res = p('https://docs.towhee.io/Getting%20Started/create-pipeline/')
DataCollection(res).show()
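The pipeline above uses `flat_map`, so each chunk returned by the operator becomes its own output row, paired with the input URL. The sketch below mimics that behaviour in plain Python without Towhee. The names `flat_map_rows` and `fake_loader` are hypothetical stand-ins, and the real operator returns document chunks rather than the placeholder strings used here.

```python
from typing import Callable, List, Tuple


def flat_map_rows(
    urls: List[str], loader: Callable[[str], List[str]]
) -> List[Tuple[str, str]]:
    """Emit one (url, text) row per chunk produced by loader."""
    rows: List[Tuple[str, str]] = []
    for url in urls:
        for chunk in loader(url):
            rows.append((url, chunk))
    return rows


def fake_loader(url: str) -> List[str]:
    # Hypothetical stand-in for the text loader: pretend each URL yields two chunks.
    return [f"first chunk of {url}", f"second chunk of {url}"]


for row in flat_map_rows(["https://example.com/doc"], fake_loader):
    print(row)
```

With a plain `map` the output would instead be a single row holding the whole list of chunks.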

Factory Constructor

Create the operator via the following factory method:

towhee.text_loader(chunk_size=300, source_type='file')

Parameters:

chunk_size: int

The size of each chunk, defaults to 300.

source_type: str

The type of the source, defaults to 'file'. Set it to 'url' to load a document from a URL.
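As a rough illustration of how the two `source_type` values could be handled, the following sketch reads raw text from either a local path or a URL using only the standard library. The helper `read_source` is hypothetical and is not part of the operator; the README does not show how the real loader fetches or parses content.

```python
from urllib.request import urlopen


def read_source(data_src: str, source_type: str = "file") -> str:
    """Return the raw text behind a file path or a URL (illustrative helper)."""
    if source_type == "file":
        with open(data_src, encoding="utf-8") as f:
            return f.read()
    if source_type == "url":
        # Fetches the raw response body; HTML is not parsed or cleaned here.
        with urlopen(data_src) as resp:
            return resp.read().decode("utf-8", errors="replace")
    raise ValueError(f"source_type must be 'file' or 'url', got {source_type!r}")
```

Any other value raises a `ValueError` immediately, which is one simple way to surface a misconfigured `source_type` before any loading is attempted.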

Interface

The operator loads the document, splits its text, and returns the resulting chunks.

Parameters:

data_src: str

Path or URL of the document to load.

Return: List[Document]

A list of document chunks.
