From f6fba9a0bf4d97daee1656360bbb28ae9399e26d Mon Sep 17 00:00:00 2001
From: Jael Gu
Date: Thu, 12 Jan 2023 18:08:04 +0800
Subject: [PATCH] Update readme

Signed-off-by: Jael Gu
---
 README.md  |  14 ++++----------
 result.png | Bin 5547 -> 5793 bytes
 2 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/README.md b/README.md
index 173a673..57f2b7a 100644
--- a/README.md
+++ b/README.md
@@ -7,7 +7,7 @@
 ## Description

 A text embedding operator takes a sentence, paragraph, or document in string as an input
-and output an embedding vector in ndarray which captures the input's core semantic elements.
+and outputs token embeddings which captures the input's core semantic elements.
 This operator is implemented with pre-trained models from [Huggingface Transformers](https://huggingface.co/docs/transformers).
@@ -329,18 +329,12 @@ If None, the operator will use default tokenizer by `model_name` from Huggingfac
-***return_sentence_emb***: *bool*
-
-The flag to output a sentence embedding for each text, defaults to True.
-If False, the operator returns token embeddings for each text.
-
-
 ## Interface

 The operator takes a piece of text in string as input.
 It loads tokenizer and pre-trained model using model name.
-and then return text embedding in ndarray.
+and then return text embedding(s) in ndarray.

 ***\_\_call\_\_(txt)***

@@ -349,8 +343,8 @@ and then return text embedding in ndarray.
 ***data***: *Union[str, list]*

 The text in string or a list of texts.
-If data is string, the operator returns embedding(s) in ndarray.
-If data is a list, the operator returns embedding(s) in a list.
+If data is string, the operator returns token embedding(s) in ndarray.
+If data is a list, the operator returns token embedding(s) in a list.

 **Returns**:

diff --git a/result.png b/result.png
index bbfc7ec06a5deec4e0763d5bccae191eee7aa110..4d87bb55960b4d636884d19ebdb4b57e2f8710ce 100644
GIT binary patch
literal 5793
[base85-encoded image data omitted]

literal 5547
[base85-encoded image data omitted]