# readthedocs

*author: junjie.jiang*
## Description

To get the list of documents for a single Read the Docs project.
## Code Example

```python
from towhee import DataLoader, pipe, ops

p = (
    pipe.input('url')
        .map('url', 'text', ops.text_loader())
        .flat_map('text', 'sentence', ops.text_splitter())
        .map('sentence', 'embedding', ops.sentence_embedding.transformers(model_name='all-MiniLM-L6-v2'))
        .map('embedding', 'embedding', ops.towhee.np_normalize())
        .output('embedding')
)

for data in DataLoader(ops.data_source.readthedocs('https://towhee.readthedocs.io/en/latest/', include='html', exclude='index.html')):
    print(p(data).to_list(kv_format=True))

# batch
for data in DataLoader(ops.data_source.readthedocs('https://towhee.readthedocs.io/en/latest/', include='html', exclude='index.html'), batch_size=10):
    p.batch(data)
```

**Parameters:**

***page_prefix:*** *str*

The root path of the pages. The crawled links are generally relative paths, so the complete URL is obtained by joining this root path with each relative path.

***index_page:*** *str*

The main page containing links to all other pages. If None, `page_prefix` is used.
Example: https://towhee.readthedocs.io/en/latest/

***include:*** *Union[List[str], str]*

Only keep URLs that satisfy this condition.

***exclude:*** *Union[List[str], str]*

Filter out URLs that satisfy this condition.
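To make the `include`/`exclude` semantics concrete, here is a minimal sketch of how such URL filtering could work. The `filter_urls` helper below is hypothetical (it is not the operator's actual implementation, and the operator's matching rules may differ); it simply treats each condition as a substring pattern, keeps URLs matching any `include` pattern, and drops URLs matching any `exclude` pattern.

```python
from typing import List, Optional, Union

def filter_urls(urls: List[str],
                include: Optional[Union[List[str], str]] = None,
                exclude: Optional[Union[List[str], str]] = None) -> List[str]:
    """Hypothetical sketch of include/exclude URL filtering by substring match."""
    def as_list(cond):
        # Normalize None / str / list into a list of patterns.
        if cond is None:
            return []
        return [cond] if isinstance(cond, str) else list(cond)

    inc, exc = as_list(include), as_list(exclude)
    kept = []
    for url in urls:
        if inc and not any(p in url for p in inc):
            continue  # fails the include condition
        if any(p in url for p in exc):
            continue  # matches an exclude condition
        kept.append(url)
    return kept

urls = [
    'https://towhee.readthedocs.io/en/latest/index.html',
    'https://towhee.readthedocs.io/en/latest/user_guide.html',
    'https://towhee.readthedocs.io/en/latest/changelog.txt',
]
# With include='html' and exclude='index.html', only user_guide.html survives.
print(filter_urls(urls, include='html', exclude='index.html'))
```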