
update readme with dc2

main · ChengZi, 2 years ago · parent commit df99d701aa
8 changed files:

1. README.md (47 lines changed)
2. drl.py (2 lines changed)
3. text_emb_result.png (BIN)
4. vect_explicit_text.png (BIN)
5. vect_explicit_video.png (BIN)
6. vect_simplified_text.png (BIN)
7. vect_simplified_video.png (BIN)
8. video_emb_result.png (BIN)

README.md (47 lines changed)
````diff
@@ -25,42 +25,35 @@ Read the text 'kids feeding and playing with the horse' to generate a text embed
 *Write the pipeline in simplified style*:
 
 ```python
-import towhee
-
-towhee.dc(['./demo_video.mp4']) \
-      .video_decode.ffmpeg(sample_type='uniform_temporal_subsample', args={'num_samples': 12}) \
-      .runas_op(func=lambda x: [y for y in x]) \
-      .video_text_embedding.drl(base_encoder='clip_vit_b32', modality='video', device='cpu') \
-      .show()
-
-towhee.dc(['kids feeding and playing with the horse']) \
-      .video_text_embedding.drl(base_encoder='clip_vit_b32', modality='text', device='cpu') \
-      .show()
+from towhee.dc2 import pipe, ops, DataCollection
+
+p = (
+    pipe.input('text') \
+        .map('text', 'vec', ops.video_text_embedding.drl(base_encoder='clip_vit_b32', modality='text', device='cuda:0')) \
+        .output('text', 'vec')
+)
+
+DataCollection(p('kids feeding and playing with the horse')).show()
 ```
 
-![](vect_simplified_video.png)
-![](vect_simplified_text.png)
+![](text_emb_result.png)
 
 *Write a same pipeline with explicit inputs/outputs name specifications:*
 
 ```python
-import towhee
-
-towhee.dc['path'](['./demo_video.mp4']) \
-      .video_decode.ffmpeg['path', 'frames'](sample_type='uniform_temporal_subsample', args={'num_samples': 12}) \
-      .runas_op['frames', 'frames'](func=lambda x: [y for y in x]) \
-      .video_text_embedding.drl['frames', 'vec'](base_encoder='clip_vit_b32', modality='video', device='cpu') \
-      .show(formatter={'path': 'video_path'})
-
-towhee.dc['text'](['kids feeding and playing with the horse']) \
-      .video_text_embedding.drl['text','vec'](base_encoder='clip_vit_b32', modality='text', device='cpu') \
-      .select['text', 'vec']() \
-      .show()
+from towhee.dc2 import pipe, ops, DataCollection
+
+p = (
+    pipe.input('video_path') \
+        .map('video_path', 'flame_gen', ops.video_decode.ffmpeg(sample_type='uniform_temporal_subsample', args={'num_samples': 12})) \
+        .map('flame_gen', 'flame_list', lambda x: [y for y in x]) \
+        .map('flame_list', 'vec', ops.video_text_embedding.drl(base_encoder='clip_vit_b32', modality='video', device='cuda:0')) \
+        .output('video_path', 'flame_list', 'vec')
+)
+
+DataCollection(p('./demo_video.mp4')).show()
 ```
 
-![](vect_explicit_video.png)
-![](vect_explicit_text.png)
+![](video_emb_result.png)
 
 <br />
````
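The hunk above migrates the README from the chained `towhee.dc` API to the `towhee.dc2` `pipe`/`ops` graph API. As a minimal sketch of the migrated usage, the snippet below wires up both branches shown in the diff and runs them end to end. It assumes a towhee build that ships `towhee.dc2` together with the `video_text_embedding.drl` and `video_decode.ffmpeg` operators; the column names `frames`/`vec` are illustrative renamings of the diff's `flame_*` columns, and `device='cpu'` stands in for the diff's `cuda:0`.

```python
# Sketch of both migrated dc2 pipelines, assuming towhee.dc2 plus the
# video_text_embedding.drl and video_decode.ffmpeg operators are available.
# Column names 'frames'/'vec' are illustrative; device='cpu' replaces 'cuda:0'.
from towhee.dc2 import pipe, ops, DataCollection

# Text branch: raw text in, DRL text embedding out.
text_pipe = (
    pipe.input('text')
        .map('text', 'vec', ops.video_text_embedding.drl(
            base_encoder='clip_vit_b32', modality='text', device='cpu'))
        .output('text', 'vec')
)

# Video branch: decode, uniformly subsample 12 frames, embed the clip.
video_pipe = (
    pipe.input('video_path')
        .map('video_path', 'frames', ops.video_decode.ffmpeg(
            sample_type='uniform_temporal_subsample', args={'num_samples': 12}))
        .map('frames', 'frames', lambda x: [y for y in x])  # drain the frame generator
        .map('frames', 'vec', ops.video_text_embedding.drl(
            base_encoder='clip_vit_b32', modality='video', device='cpu'))
        .output('video_path', 'vec')
)

DataCollection(text_pipe('kids feeding and playing with the horse')).show()
DataCollection(video_pipe('./demo_video.mp4')).show()
```

Since both branches use the same `clip_vit_b32` base encoder, the text and video vectors land in a shared embedding space, which is what makes text-to-video retrieval by vector similarity possible.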

drl.py (2 lines changed)
````diff
@@ -41,7 +41,7 @@ class DRL(NNOperator):
             self.device = "cuda" if torch.cuda.is_available() else "cpu"
         else:
             self.device = device
-        self.model = drl.create_model(base_encoder=base_encoder, pretrained=True, cdcr=0, weights_path=weight_path)
+        self.model = drl.create_model(base_encoder=base_encoder, pretrained=True, cdcr=0, weights_path=weight_path, device=device)
         self.tokenize = clip4clip.SimpleTokenizer()
         self.tfms = transforms.Compose([
````
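The drl.py change is a one-line fix: the operator already resolves a target device in its constructor, but `drl.create_model` previously received no `device` argument, so the model could be created on a different device than the one the operator intends to use. Below is a minimal sketch of the resolve-then-forward pattern; only `resolve_device` is runnable code here, and the `create_model` call is quoted from the hunk as a comment since its internals are outside this diff.

```python
# Sketch of the device handling the fix completes. The fallback mirrors the
# hunk's context: prefer CUDA when available, otherwise fall back to CPU.
import torch

def resolve_device(device=None):
    if device is None:
        return "cuda" if torch.cuda.is_available() else "cpu"
    return device

device = resolve_device()  # 'cuda' on a GPU machine, else 'cpu'
# Forwarding the device keeps model weights and operator inputs on the same
# device, avoiding cross-device tensor errors at inference time:
# self.model = drl.create_model(base_encoder=base_encoder, pretrained=True,
#                               cdcr=0, weights_path=weight_path, device=device)
```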

Binary files (contents not shown):

| File                      | Change  | Size    |
|---------------------------|---------|---------|
| text_emb_result.png       | added   | 15 KiB  |
| vect_explicit_text.png    | removed | 15 KiB  |
| vect_explicit_video.png   | removed | 585 KiB |
| vect_simplified_text.png  | removed | 18 KiB  |
| vect_simplified_video.png | removed | 16 KiB  |
| video_emb_result.png      | added   | 30 KiB  |
