Text Spliter
author: shiyu22
Description
The text spliter operator splits text into a list of chunks.
Refer to Recursive Characters for details on how the text is split.
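To give a feel for what recursive character splitting means, here is a minimal pure-Python sketch. It is an illustration only, not the operator's actual implementation: the function name `recursive_split`, the separator order, and the assumption that chunk size is measured in characters are all hypothetical choices made for this example.

```python
from typing import List, Sequence


def recursive_split(
    text: str,
    chunk_size: int = 300,
    separators: Sequence[str] = ("\n\n", "\n", " ", ""),
) -> List[str]:
    """Hypothetical helper: split text into chunks of at most chunk_size characters."""
    if len(text) <= chunk_size:
        return [text] if text else []
    # No separator left (or the empty separator): fall back to a hard cut.
    if not separators or separators[0] == "":
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, rest = separators[0], separators[1:]
    if sep not in text:
        # This separator does not occur; try the next, finer-grained one.
        return recursive_split(text, chunk_size, rest)

    chunks: List[str] = []
    current = ""
    for part in text.split(sep):
        candidate = part if not current else current + sep + part
        if len(candidate) <= chunk_size:
            # The part still fits, so keep growing the current chunk.
            current = candidate
            continue
        if current:
            chunks.append(current)
        if len(part) <= chunk_size:
            current = part
        else:
            # The part alone is too long: split it with finer separators.
            chunks.extend(recursive_split(part, chunk_size, rest))
            current = ""
    if current:
        chunks.append(current)
    return chunks


chunks = recursive_split("alpha beta gamma delta", chunk_size=10)
print(chunks)
```

The idea is to prefer coarse boundaries (paragraphs, then lines, then words) and only cut inside a word when nothing else fits, so each chunk stays within the size limit while remaining as readable as possible.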
Code Example
from towhee import pipe, ops, DataCollection

p = (
    pipe.input('url')
        .map('url', 'text', ops.text_loader())
        .flat_map('text', 'text', ops.text_spliter())
        .output('url', 'text')
)
res = p('https://github.com/towhee-io/towhee/blob/main/README.md')
DataCollection(res).show()
Factory Constructor
Create the operator via the following factory method:
towhee.text_spliter(chunk_size=300)
Parameters:
chunk_size: int
The size of each chunk, defaults to 300.
Interface
The operator splits the incoming text and returns the chunks.
Parameters:
data: str
The text data.
Return: List[Document]
A list of the document's chunks.