Molecular Fingerprinting
author: shiyu
Description
Molecular Fingerprinting encodes a molecule, given as a Simplified Molecular Input Line Entry System (SMILES) string, as a fingerprint. A fingerprint can represent elements, atom pairs, functional groups, and similar features, and fingerprints are often used for substructure and similarity searches in drug discovery.
This operator uses RDKit to generate the molecular fingerprint.
Code Example
The example below uses the Morgan algorithm to generate a fingerprint for the molecule with the SMILES string 'Cc1ccc(cc1)S(=O)(=O)N'.
The pipeline names its inputs and outputs explicitly:
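```python
from towhee.dc2 import pipe, ops, DataCollection

# Build a pipeline that maps a SMILES string to its fingerprint.
p = (
    pipe.input('smiles')
        .map('smiles', 'fingerprint', ops.molecular_fingerprinting.rdkit())
        .output('smiles', 'fingerprint')
)

# Run the pipeline on one molecule and display the result.
DataCollection(p('Cc1ccc(cc1)S(=O)(=O)N')).show()
```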
Factory Constructor
Create the operator via the following factory method:
molecular_fingerprinting.rdkit(algorithm: str = 'morgan', size: int = 2048)
Parameters:
algorithm: str
The fingerprinting algorithm to use: one of 'morgan', 'daylight', 'ap', or 'maccs'. Defaults to 'morgan'. The RDKit documentation has the list of available fingerprints.
size: int
The bit vector size. Applies only to the 'morgan' and 'daylight' algorithms. Defaults to 2048.
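To illustrate what the two parameters control, the sketch below shows one plausible mapping from algorithm names to RDKit calls. It is an assumption for illustration, not the operator's actual source: the helper names `check_params` and `fingerprint_bits`, the Morgan radius of 2, and the RDKit function chosen for each name are all hypothetical.

```python
# Hypothetical sketch: how `algorithm` and `size` could map onto RDKit calls.
# Not the operator's actual source; radius=2 for Morgan is an assumed value.

SIZED = {"morgan", "daylight"}  # algorithms documented as using `size`
FIXED = {"ap", "maccs"}         # algorithms for which `size` is not used


def check_params(algorithm: str = "morgan", size: int = 2048) -> bool:
    """Validate the arguments; return True when `size` will be used."""
    if algorithm not in SIZED | FIXED:
        raise ValueError(f"unknown algorithm: {algorithm!r}")
    if size <= 0:
        raise ValueError("size must be a positive integer")
    return algorithm in SIZED


def fingerprint_bits(smiles: str, algorithm: str = "morgan", size: int = 2048):
    """Return an RDKit bit vector for `smiles` (requires RDKit)."""
    # Imported lazily so check_params works even without RDKit installed.
    from rdkit import Chem
    from rdkit.Chem import AllChem, MACCSkeys, rdMolDescriptors

    check_params(algorithm, size)
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    if algorithm == "morgan":
        return AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=size)
    if algorithm == "daylight":
        return Chem.RDKFingerprint(mol, fpSize=size)
    if algorithm == "ap":
        return rdMolDescriptors.GetHashedAtomPairFingerprintAsBitVect(mol)
    return MACCSkeys.GenMACCSKeys(mol)  # MACCS keys have a fixed length
```

With RDKit installed, `fingerprint_bits('Cc1ccc(cc1)S(=O)(=O)N', 'morgan', 1024)` would return a 1024-bit vector under these assumptions.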
Interface
A molecular fingerprinting operator takes a SMILES string as input. It uses the RDKit fingerprinting algorithm selected by name to generate the molecule's fingerprint.
Parameters:
smiles: str
A Simplified Molecular Input Line Entry System (SMILES) string.
Returns: bytes
The molecular fingerprint.
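Because the fingerprint is returned as bytes, a similarity search can compare two fingerprints directly. The sketch below computes the Tanimoto similarity in plain Python. It assumes the bytes are a packed bit vector in which each bit marks one fingerprint position, which this page does not confirm. The function name `tanimoto` and the toy fingerprints are hypothetical.

```python
def tanimoto(fp_a: bytes, fp_b: bytes) -> float:
    """Tanimoto similarity of two equal-length packed bit vectors."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    # Count bits set in both (AND) and in either (OR), byte by byte.
    both = sum(bin(a & b).count("1") for a, b in zip(fp_a, fp_b))
    either = sum(bin(a | b).count("1") for a, b in zip(fp_a, fp_b))
    return both / either if either else 0.0


# Toy 16-bit fingerprints for illustration, not real operator output.
fp1 = bytes([0b10110000, 0b00000001])
fp2 = bytes([0b10010000, 0b00000011])
print(tanimoto(fp1, fp2))  # 3 shared bits / 5 bits set in either = 0.6
```

Two fingerprints are only comparable this way when they were generated with the same algorithm and size.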