Molecular Fingerprinting
author: shiyu
Description
Molecular Fingerprinting encodes a molecule, given as a Simplified Molecular Input Line Entry Specification (SMILES) string, as a fingerprint. A fingerprint can represent elements, atom pairs, functional groups, and similar features, and fingerprints are often used for substructure and similarity searches in drug discovery.
This operator uses RDKit to generate the molecular fingerprint.
Code Example
Before running the following code, install RDKit; see https://www.rdkit.org/docs/Install.html for instructions.
# install rdkit with conda
$ conda install -c conda-forge rdkit
The following example uses the Morgan algorithm to generate a fingerprint for the SMILES string 'Cc1ccc(cc1)S(=O)(=O)N'.
Write the pipeline in simplified style:
import towhee
towhee.dc(['Cc1ccc(cc1)S(=O)(=O)N']) \
.molecular_fingerprinting.rdkit() \
.show()
Write the same pipeline with explicit input/output name specifications:
import towhee
towhee.dc['smiles'](['Cc1ccc(cc1)S(=O)(=O)N']) \
.molecular_fingerprinting.rdkit['smiles', 'fingerprint']() \
.show()
Factory Constructor
Create the operator via the following factory method:
molecular_fingerprinting.rdkit(algorithm: str = 'morgan', size: int = 2048)
Parameters:
algorithm: str
The fingerprinting algorithm to use: 'morgan', 'daylight', 'ap', or 'maccs'. Defaults to 'morgan'. See the RDKit documentation for the list of available fingerprints.
size: int
The bit vector size. It applies only to the 'morgan' and 'daylight' algorithms. Defaults to 2048.
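To illustrate how the algorithm and size parameters could map onto RDKit calls, here is a minimal sketch. It is not the operator's actual source: the function name smiles_to_fingerprint, the Morgan radius of 2, the use of the hashed atom-pair variant for 'ap', and the byte packing via BitVectToBinaryText are all assumptions made for illustration.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem, MACCSkeys, rdMolDescriptors


def smiles_to_fingerprint(smiles: str, algorithm: str = "morgan", size: int = 2048) -> bytes:
    """Hypothetical helper: turn a SMILES string into a packed bit-vector fingerprint."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")

    if algorithm == "morgan":
        # Radius 2 is an assumption; the README does not state the radius.
        fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=size)
    elif algorithm == "daylight":
        # RDKit's Daylight-like topological fingerprint.
        fp = Chem.RDKFingerprint(mol, fpSize=size)
    elif algorithm == "ap":
        # Hashed atom-pair fingerprint (assumed variant; uses RDKit's default length).
        fp = rdMolDescriptors.GetHashedAtomPairFingerprintAsBitVect(mol)
    elif algorithm == "maccs":
        # MACCS keys have a fixed length of 167 bits.
        fp = MACCSkeys.GenMACCSKeys(mol)
    else:
        raise ValueError(f"unknown algorithm: {algorithm!r}")

    # Pack the bit vector into bytes, 8 bits per byte.
    return DataStructs.BitVectToBinaryText(fp)


fingerprint = smiles_to_fingerprint("Cc1ccc(cc1)S(=O)(=O)N", algorithm="morgan", size=2048)
print(len(fingerprint))  # number of bytes in the packed fingerprint
```

In this sketch size is passed only to the 'morgan' and 'daylight' branches, mirroring the parameter description above.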
Interface
A molecular fingerprinting operator takes a SMILES string as input and uses the RDKit algorithm selected by the algorithm parameter to generate a fingerprint of the molecule.
Parameters:
smiles: str
A molecule written as a Simplified Molecular Input Line Entry Specification (SMILES) string.
Returns: bytes
The molecular fingerprint.
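Since the fingerprint is returned as bytes, a similarity search can compare two fingerprints bit by bit. The sketch below computes the Tanimoto coefficient (shared set bits divided by total set bits) using only the standard library. It assumes the bytes are a packed bit vector of equal length for both molecules; the function name tanimoto and the two short sample fingerprints are made up for illustration and do not come from real molecules.

```python
def tanimoto(fp_a: bytes, fp_b: bytes) -> float:
    """Tanimoto similarity of two equal-length packed bit-vector fingerprints."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    a = int.from_bytes(fp_a, "big")
    b = int.from_bytes(fp_b, "big")
    common = bin(a & b).count("1")  # bits set in both fingerprints
    total = bin(a | b).count("1")   # bits set in at least one fingerprint
    # Convention chosen here: two all-zero fingerprints have similarity 0.0.
    return common / total if total else 0.0


# Toy 16-bit fingerprints (illustrative values only).
fp_a = bytes([0b10110000, 0b00000001])
fp_b = bytes([0b10010000, 0b00000011])
similarity = tanimoto(fp_a, fp_b)
print(similarity)  # 3 shared bits out of 5 set bits
```

A value of 1.0 means the two fingerprints set exactly the same bits, and values near 0.0 mean they share almost none.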