Molecular Fingerprinting

author: shiyu


Description

Molecular fingerprinting encodes a molecule, given as a Simplified Molecular Input Line Entry Specification (SMILES) string, as a fingerprint. A fingerprint can represent features such as elements, atom pairs, or functional groups, and fingerprints are often used for substructure and similarity searches in drug discovery.

This operator uses RDKit to generate the molecular fingerprint.


Code Example

Before running the following code, install RDKit. For details, refer to https://www.rdkit.org/docs/Install.html.

# install rdkit with conda
$ conda install -c conda-forge rdkit

The following example uses the Morgan algorithm to generate a fingerprint for the SMILES string 'Cc1ccc(cc1)S(=O)(=O)N'.

Write the pipeline in simplified style:

import towhee

towhee.dc(['Cc1ccc(cc1)S(=O)(=O)N']) \
  .molecular_fingerprinting.rdkit() \
  .show()

Write the same pipeline with explicit input and output names:

import towhee

towhee.dc['smiles'](['Cc1ccc(cc1)S(=O)(=O)N']) \
  .molecular_fingerprinting.rdkit['smiles', 'fingerprint']() \
  .show()


Factory Constructor

Create the operator via the following factory method:

molecular_fingerprinting.rdkit(algorithm: str = 'morgan', size: int = 2048)

Parameters:

algorithm: str

The fingerprinting algorithm to use: one of 'morgan', 'daylight', 'ap', or 'maccs'. Defaults to 'morgan'. The RDKit documentation lists the available fingerprints.

size: int

The size of the bit vector, in bits. It applies only to the 'morgan' and 'daylight' algorithms. Defaults to 2048.
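To make the two parameters more concrete, the sketch below shows one way the algorithm names could map onto RDKit calls. It is not the operator's source code: the function name sketch_fingerprint, the Morgan radius of 2, and the choice to pack the bits with BitVectToBinaryText are all assumptions made for illustration, and the 'ap' algorithm is left out.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import MACCSkeys, rdFingerprintGenerator


def sketch_fingerprint(smiles: str, algorithm: str = "morgan", size: int = 2048) -> bytes:
    """Hypothetical stand-in for the operator, not its actual implementation."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        # RDKit returns None when it cannot parse the SMILES string.
        raise ValueError(f"could not parse SMILES: {smiles!r}")

    if algorithm == "morgan":
        # Radius 2 is an assumption; the README does not state the radius.
        generator = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=size)
        bit_vector = generator.GetFingerprint(mol)
    elif algorithm == "daylight":
        # RDKit's Daylight-like topological fingerprint.
        bit_vector = Chem.RDKFingerprint(mol, fpSize=size)
    elif algorithm == "maccs":
        # MACCS keys have a fixed length, so `size` is ignored here.
        bit_vector = MACCSkeys.GenMACCSKeys(mol)
    else:
        raise ValueError(f"algorithm not covered by this sketch: {algorithm!r}")

    # Pack the bit vector into bytes, eight bits per byte (assumed encoding).
    return DataStructs.BitVectToBinaryText(bit_vector)


fp = sketch_fingerprint("Cc1ccc(cc1)S(=O)(=O)N", algorithm="morgan", size=1024)
print(len(fp))  # number of bytes in the packed fingerprint
```

In this sketch, size changes the length of the result only for 'morgan' and 'daylight', which matches the parameter description above.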


Interface

The molecular fingerprinting operator takes a SMILES string as input and uses the RDKit algorithm selected by the algorithm parameter to generate a fingerprint of the molecule.

Parameters:

smiles: str

A Simplified Molecular Input Line Entry Specification (SMILES).

Returns: bytes

The molecular fingerprint.
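Since the description mentions similarity searches, here is a minimal sketch of comparing two returned fingerprints with the Tanimoto coefficient (shared on-bits divided by total on-bits). It assumes the bytes are a packed bit vector of equal length for both molecules, which the README does not confirm. The function name tanimoto and the sample byte values are hypothetical.

```python
def tanimoto(fp_a: bytes, fp_b: bytes) -> float:
    """Tanimoto similarity of two packed bit-vector fingerprints."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    # Treat each byte string as one large integer so bitwise operators apply.
    a = int.from_bytes(fp_a, "big")
    b = int.from_bytes(fp_b, "big")
    common = bin(a & b).count("1")  # bits set in both fingerprints
    union = bin(a | b).count("1")   # bits set in either fingerprint
    # Two all-zero fingerprints share no features; report 0.0 by convention.
    return common / union if union else 0.0


# Hand-written 16-bit fingerprints, used only to exercise the function.
fp_1 = bytes([0b10110000, 0b00000001])
fp_2 = bytes([0b10010000, 0b00000011])
print(tanimoto(fp_1, fp_2))  # 3 shared bits out of 5 set bits -> 0.6
```

A value of 1.0 means the two fingerprints are identical, and values near 0.0 mean they share few set bits.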
