In this chapter we learn the basics of reading molecules with RDKit.
Simplified molecular input line entry system (SMILES) is a notation for representing chemical structures as strings. The SMILES Tutorial explains it in detail; as a quick example, c1ccccc1 describes six aromatic carbons whose first and last atoms are joined to close a ring, that is, benzene.
Now that we know molecules can be written as SMILES, let's read a SMILES string and draw the molecule. First we import the Chem module from the RDKit library. The second line is a setting for drawing structures in Jupyter Notebook, and the third imports the drawing functions used later in this chapter.
from rdkit import Chem
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import Draw
RDKit provides a function called MolFromSmiles for reading SMILES strings, so we use it to read the molecule.
mol = Chem.MolFromSmiles("c1ccccc1")
Next we draw the structure. Simply evaluating mol is enough to display it.
mol
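You should see the structure displayed as in the figure.
The structural formula above, which represents a structure by connecting atoms with lines, and the SMILES notation both describe the same thing. A structural formula is easier for people to read, whereas SMILES is an ASCII string and therefore has the advantage of representing the molecule with less data.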
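Because a SMILES string is only one possible spelling of a structure, the same molecule can be written in more than one way. The following is a minimal sketch, assuming RDKit is installed, that compares a Kekulé and an aromatic spelling of benzene through their canonical SMILES, and shows that MolFromSmiles returns None (and logs a parse error) when a string cannot be parsed. The variable names are illustrative only.

```python
from rdkit import Chem

# Two spellings of benzene: explicit alternating double bonds vs. aromatic atoms
kekule = Chem.MolFromSmiles("C1=CC=CC=C1")
aromatic = Chem.MolFromSmiles("c1ccccc1")

# Canonical SMILES gives one normalized string per structure
same_structure = Chem.MolToSmiles(kekule) == Chem.MolToSmiles(aromatic)
print(same_structure)

# An unclosed ring cannot be parsed; MolFromSmiles returns None instead of raising
broken = Chem.MolFromSmiles("c1ccccc")
print(broken is None)
```

Checking for None right after parsing is a cheap way to catch typos in hand-written SMILES before they cause errors further down.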
Note
Because structures can be represented as strings, string-generation algorithms can be applied to generate novel chemical structures. Chapter 12 covers this in detail.
There are several ways to store multiple compounds in a single file, but the SD file format is the most common. Because it is an SD file, the file extension is usually .sdf.
The MOL format is a molecular representation format developed by MDL. The SD file is an extension of the MOL format: records written in the MOL format are separated by lines containing $$$$, so that a single file can hold multiple molecules.
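To make the role of the $$$$ delimiter concrete, here is a small sketch in plain Python that splits SD-style text into one string per record. The helper name split_sd_records and the two placeholder records are made up for illustration (they are not valid MOL blocks); in practice you would let RDKit parse the file.

```python
def split_sd_records(sd_text):
    """Split SD-file text into one string per record, using '$$$$' lines as delimiters."""
    records, current = [], []
    for line in sd_text.splitlines():
        if line.strip() == "$$$$":
            records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    # Keep a trailing record even if the final delimiter is missing
    if any(line.strip() for line in current):
        records.append("\n".join(current))
    return records


# Placeholder text that only mimics the layout: a title line, a body, then '$$$$'
example_sd = (
    "mol_A\n  placeholder\n\nM  END\n$$$$\n"
    "mol_B\n  placeholder\n\nM  END\n$$$$\n"
)
records = split_sd_records(example_sd)
titles = [record.splitlines()[0] for record in records]
print(len(records), titles)
```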
The MOL format can store the three-dimensional coordinates of a molecule, so it can represent 3D structures as well as 2D depictions. This is a major difference from SMILES.
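As a rough illustration of that difference, the sketch below starts from a SMILES string (which carries no coordinates), generates one 3D conformer with RDKit, and writes it out as a MOL block. Ethanol and the random seed are arbitrary choices for the example, not part of the original text.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# SMILES has no coordinates, so add hydrogens and embed a 3D conformer first
mol = Chem.AddHs(Chem.MolFromSmiles("CCO"))
AllChem.EmbedMolecule(mol, randomSeed=42)

# The MOL block stores x, y, z for every atom in its atom table
mol_block = Chem.MolToMolBlock(mol)
print(mol_block)
```

Each atom line of the printed block begins with three coordinate columns, which is the information a SMILES string cannot hold.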
Following Chapter 4, download the structure data for the ChEMBL topoisomerase II inhibition assay (CHEMBL669726) in SDF format.
- NOTE
In detail: open the linked page and enter CHEMBL669726 in the search form. When the results appear, click the Compounds tab. Then select all compounds and download them as SDF. The download is a gzip-compressed sdf file, so decompress it with the gunzip command or a suitable decompression tool, and save it as ch05_compounds.sdf.
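If you prefer to decompress from Python rather than with gunzip, the standard library is enough. This is a minimal sketch: the helper gunzip_file and the input name download.sdf.gz are hypothetical, and the demo writes a throwaway gzip file into a temporary directory so that it runs on its own. Replace the paths with your actual download.

```python
import gzip
import os
import shutil
import tempfile


def gunzip_file(src_path, dst_path):
    """Decompress a gzip file to dst_path, streaming so large files are not loaded at once."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


# Self-contained demo using placeholder content instead of a real ChEMBL download
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "download.sdf.gz")      # hypothetical file name
dst = os.path.join(workdir, "ch05_compounds.sdf")
with gzip.open(src, "wb") as f:
    f.write(b"placeholder record\n$$$$\n")

gunzip_file(src, dst)
with open(dst, "rb") as f:
    restored = f.read()
print(restored)
```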
To read an sdf file with RDKit, use the SDMolSupplier class. Note that because we are handling multiple compounds, we store the result in a variable named mols rather than mol. There is no rule about which variable names to use, but choosing names that are easy to understand helps reduce avoidable mistakes.
mols = Chem.SDMolSupplier("ch05_compounds.sdf")
Let's check how many molecules were read. Use len to count them.
len(mols)
There were 34.
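One thing to keep in mind is that SDMolSupplier yields None for records it cannot parse, so len counts records rather than successfully parsed molecules. The sketch below shows the usual filtering pattern. To stay self-contained it builds a tiny SD text in memory from two arbitrary SMILES and loads it with SetData; with a real file you would pass the file name to SDMolSupplier as in the text above.

```python
from rdkit import Chem

# Build a small in-memory SD text: each record is a MOL block followed by '$$$$'
sd_text = ""
for smi in ["CCO", "c1ccccc1"]:
    sd_text += Chem.MolToMolBlock(Chem.MolFromSmiles(smi)) + "$$$$\n"

supplier = Chem.SDMolSupplier()
supplier.SetData(sd_text)

# Records that fail to parse come back as None, so drop them before further use
valid_mols = [m for m in supplier if m is not None]
print(len(supplier), len(valid_mols))
```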
You could draw the molecules one at a time in a for loop, but RDKit provides a function that draws multiple molecules side by side in one image, so here we use that function, MolsToGridImage. To change the number of molecules per row, specify the molsPerRow option.
Draw.MolsToGridImage(mols)
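In compound optimization projects in drug discovery, you sometimes want to change a compound's properties without changing the shape of the molecule. In such cases, swapping the atom types that form an aromatic ring, such as carbon, nitrogen, sulfur, and oxygen, can yield compounds with better properties. This approach of swapping heteroatoms (atoms other than carbon and hydrogen) within a ring is called hetero shuffling.
Hetero shuffling can be expected to change physical properties while retaining activity and thereby improve pharmacokinetics, to improve the activity itself, or to work around patent claims.
A well-known example of a small structural difference affecting selectivity and pharmacokinetics is Sildenafil from Pfizer and Vardenafil (developed by Bayer and co-marketed with GSK).
Comparing the two structures, they are extremely similar: they differ mainly in the arrangement of nitrogen atoms in the central ring system (Vardenafil also carries an N-ethyl rather than an N-methyl piperazine, as the SMILES below show). Both molecules inhibit the same target protein, but their activity and pharmacokinetics differ.
The code that generates the image above is shown below. Note that, rather than simply applying Draw.MolsToGridImage, the molecules are first aligned on their common core structure, and that molecule names are displayed by passing the legends option to Draw.MolsToGridImage.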
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import Draw
from rdkit.Chem import rdDepictor
from rdkit.Chem import rdFMCS
from rdkit.Chem import TemplateAlign
IPythonConsole.ipython_useSVG = True
rdDepictor.SetPreferCoordGen(True)
sildenafil = Chem.MolFromSmiles('CCCC1=NN(C)C2=C1NC(=NC2=O)C1=C(OCC)C=CC(=C1)S(=O)(=O)N1CCN(C)CC1')
vardenafil = Chem.MolFromSmiles('CCCC1=NC(C)=C2N1NC(=NC2=O)C1=C(OCC)C=CC(=C1)S(=O)(=O)N1CCN(CC)CC1')
rdDepictor.Compute2DCoords(sildenafil)
rdDepictor.Compute2DCoords(vardenafil)
res = rdFMCS.FindMCS([sildenafil, vardenafil], completeRingsOnly=True, atomCompare=rdFMCS.AtomCompare.CompareAny)
MCS = Chem.MolFromSmarts(res.smartsString)
rdDepictor.Compute2DCoords(MCS)
TemplateAlign.AlignMolToTemplate2D(sildenafil, MCS)
TemplateAlign.AlignMolToTemplate2D(vardenafil, MCS)
Draw.MolsToGridImage([sildenafil, vardenafil], legends=['sildenafil', 'vardenafil'])
To generate hetero-shuffled molecules, we define a class called HeteroShuffle. Its constructor takes the molecule to shuffle and the substructure to transform (the core). The class first cuts the molecule at the core, splitting it into the core and everything else. Only aromatic atoms of the core that carry no substituent are candidates for replacement. The make_connectors method builds the reaction objects used to rejoin the shuffled core with the non-core parts, and re_construct_mol uses those reaction objects to rebuild the molecule.
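To see what "splitting into the core and everything else" looks like, here is a small sketch using the same two RDKit functions the class relies on, Chem.ReplaceCore and Chem.ReplaceSidechains. Toluene with a benzene core is an arbitrary stand-in chosen for brevity, and count_dummies is a hypothetical helper. The attachment points appear as dummy atoms (atomic number 0, written * in SMILES).

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("Cc1ccccc1")        # toluene, a small stand-in molecule
core_query = Chem.MolFromSmiles("c1ccccc1")  # benzene as the core

# ReplaceCore removes the core and keeps the side chains;
# ReplaceSidechains removes the side chains and keeps the core.
side_chains = Chem.ReplaceCore(mol, core_query)
core = Chem.ReplaceSidechains(mol, core_query)


def count_dummies(m):
    """Count attachment-point dummy atoms (atomic number 0)."""
    return sum(1 for atom in m.GetAtoms() if atom.GetAtomicNum() == 0)


print(Chem.MolToSmiles(side_chains))
print(Chem.MolToSmiles(core))
print(count_dummies(side_chains), count_dummies(core))
```

The dummy atoms mark where the two pieces were joined, which is the information the class uses when it reconnects a shuffled core to the side chains.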
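To enumerate the possible atom combinations, we pass the atomic numbers of the candidate atoms (C, N, O, S), stored in target_atomic_nums, to itertools.product, with repeat set to the number of replaceable ring atoms. Combinations that cannot form a valid molecule are filtered out afterwards, so at this point we generate every possible combination.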
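The enumeration step can be tried on its own with the standard library. In this sketch, n_positions is a hypothetical count of replaceable ring atoms. With four candidate elements the number of combinations is 4 to the power of n_positions, which grows quickly and is one reason invalid structures are filtered afterwards rather than avoided up front.

```python
import itertools

target_atomic_nums = [6, 7, 8, 16]  # C, N, O, S
n_positions = 3                     # hypothetical number of replaceable ring atoms

# Every assignment of one candidate element to each replaceable position
combinations = list(itertools.product(target_atomic_nums, repeat=n_positions))

print(len(combinations))  # 64, i.e. 4 ** 3
print(combinations[:3])   # [(6, 6, 6), (6, 6, 7), (6, 6, 8)]
```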
import copy
import itertools
from rdkit import Chem
from rdkit.Chem import AllChem

class HeteroShuffle():

    def __init__(self, mol, query):
        self.mol = mol
        self.query = query
        self.subs = Chem.ReplaceCore(self.mol, self.query)
        self.core = Chem.ReplaceSidechains(self.mol, self.query)
        self.target_atomic_nums = [6, 7, 8, 16]

    def make_connectors(self):
        n = len(Chem.MolToSmiles(self.subs).split('.'))
        map_no = n+1
        self.rxn_dict = {}
        for i in range(n):
            self.rxn_dict[i+1] = AllChem.ReactionFromSmarts('[{0}*][*:{1}].[{0}*][*:{2}]>>[*:{1}][*:{2}]'.format(i+1, map_no, map_no+1))
        return self.rxn_dict

    def re_construct_mol(self, core):
        '''
        re construct mols from given substructures and core
        '''
        keys = self.rxn_dict.keys()
        ps = [[core]]
        for key in keys:
            ps = self.rxn_dict[key].RunReactants([ps[0][0], self.subs])
        mol = ps[0][0]
        try:
            smi = Chem.MolToSmiles(mol)
            mol = Chem.MolFromSmiles(smi)
            Chem.SanitizeMol(mol)
            return mol
        except:
            return None

    def get_target_atoms(self):
        '''
        get target atoms for replace
        target atoms means atoms which don't have anyatom(*) in neighbors
        '''
        atoms = []
        for atom in self.core.GetAromaticAtoms():
            neighbors = [a.GetSymbol() for a in atom.GetNeighbors()]
            if '*' not in neighbors and atom.GetSymbol() != '*':
                atoms.append(atom)
        print(len(atoms))
        return atoms

    def generate_mols(self):
        atoms = self.get_target_atoms()
        idxs = [atom.GetIdx() for atom in atoms]
        combinations = itertools.product(self.target_atomic_nums, repeat=len(idxs))
        smiles_set = set()
        self.make_connectors()
        for combination in combinations:
            target = copy.deepcopy(self.core)
            for i, idx in enumerate(idxs):
                target.GetAtomWithIdx(idx).SetAtomicNum(combination[i])
            smi = Chem.MolToSmiles(target)
            target = Chem.MolFromSmiles(smi)
            if target is not None:
                n_attachment = len([atom for atom in target.GetAtoms() if atom.GetAtomicNum() == 0])
                n_aromatic_atoms = len(list(target.GetAromaticAtoms()))
                if target.GetNumAtoms() - n_attachment == n_aromatic_atoms:
                    try:
                        mol = self.re_construct_mol(target)
                        if check_mol(mol):
                            smiles_set.add(Chem.MolToSmiles(mol))
                    except:
                        pass
        mols = [Chem.MolFromSmiles(smi) for smi in smiles_set]
        return mols
The check_mol function used in the code above is there because six-membered structures such as c1coooo1 would otherwise be judged aromatic; it filters them out. O and S are allowed only in five-membered heteroaromatic rings.
def check_mol(mol):
    arom_atoms = mol.GetAromaticAtoms()
    symbols = [atom.GetSymbol() for atom in arom_atoms if not atom.IsInRingSize(5)]
    if not symbols:
        return True
    elif 'O' in symbols or 'S' in symbols:
        return False
    else:
        return True
Let's try it out.
# Gefitinib
mol1 = Chem.MolFromSmiles('COC1=C(C=C2C(=C1)N=CN=C2NC3=CC(=C(C=C3)F)Cl)OCCCN4CCOCC4')
core1 = Chem.MolFromSmiles('c1ccc2c(c1)cncn2')
# Oxaprozin
mol2 = Chem.MolFromSmiles('OC(=O)CCC1=NC(=C(O1)C1=CC=CC=C1)C1=CC=CC=C1')
core2 = Chem.MolFromSmiles('c1cnco1')
Original molecules
ht = HeteroShuffle(mol1, core1)
res = ht.generate_mols()
print(len(res))
Draw.MolsToGridImage(res, molsPerRow=5)
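This is part of the result for Gefitinib as input. The output molecules have aromatic ring atoms that differ from those of the original compound. Note also that only the quinazoline moiety specified as the core has been transformed.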
ht = HeteroShuffle(mol2, core2)
res = ht.generate_mols()
print(len(res))
Draw.MolsToGridImage(res, molsPerRow=5)
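This is the result for Oxaprozin as input. This molecule has a five-membered ring called oxazole at its center. Five-membered aromatic rings include ones that contain sulfur or oxygen, such as thiophene and furan. In this example too, the output includes molecules with S or O among the atoms of the five-membered ring.
How did it go? We showed examples for two molecules. In the first example, Gefitinib, the aromatic rings in the molecule are quinazoline and benzene. Quinazoline is a fused system of two six-membered rings, benzene and pyrimidine. For aromatic rings built on six-membered rings, the candidate ring atoms are carbon and nitrogen. (If charged species such as the pyrylium ion are considered, oxygen and sulfur also become candidates, but such structures are rarely used in drug design, so they are left out of this discussion. See an explanation of heterocyclic compounds.) Oxaprozin contains an oxazole. For five-membered aromatic rings, the candidate ring atoms are carbon, nitrogen, sulfur, and oxygen, and Oxaprozin was included as an example of such a molecule. In both cases, the code above generates molecules with shuffled heteroatoms.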
Tip
RDKit 2018.03.1 and later implement a function called EnumerateHeterocycles, so hetero shuffling can be done easily without writing the code above. There may also be cases where you do not want every aromatic atom to be a target for transformation. In that case, set a property named _protected on the atoms you want to exclude. Let's look at an actual example in code.
# EnumerateHeterocycles returns a generator, because the number of combinations can be large and returning a list would consume a lot of memory. In the example below it is converted to a list for clarity.
from rdkit import Chem
from rdkit.Chem import EnumerateHeterocycles
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import Draw
from rdkit.Chem import AllChem
import copy
omeprazole = Chem.MolFromSmiles('CC1=CN=C(C(=C1OC)C)CS(=O)C2=NC3=C(N2)C=C(C=C3)OC')
enumerated_mols = EnumerateHeterocycles.EnumerateHeterocycles(omeprazole)
enumerated_mols = [m for m in enumerated_mols]
print(len(enumerated_mols))
# 384
Draw.MolsToGridImage(enumerated_mols[:6], molsPerRow=3)
# This time, protect the 5,6-fused ring system.
ringinfo = omeprazole.GetRingInfo()
ringinfo.AtomRings()
# ((1, 6, 5, 4, 3, 2), (14, 13, 17, 16, 15), (18, 19, 20, 21, 15, 16))
protected_omeprazole = copy.deepcopy(omeprazole)
for idx in ringinfo.AtomRings()[1]:
atom = protected_omeprazole.GetAtomWithIdx(idx)
atom.SetProp("_protected", "1")
for idx in ringinfo.AtomRings()[2]:
atom = protected_omeprazole.GetAtomWithIdx(idx)
atom.SetProp("_protected", "1")
enumerated_mols2 = EnumerateHeterocycles.EnumerateHeterocycles(protected_omeprazole)
enumerated_mols2 = [m for m in enumerated_mols2]
print(len(enumerated_mols2))
# 4
Draw.MolsToGridImage(enumerated_mols2)
J. Med. Chem. 2012, 55, 11, 5151-5164 examines the effect of N-shuffling in PIM-1 kinase inhibitors using the Fragment Molecular Orbital method, a quantum-chemical approach. In addition, J. Chem. Inf. Model. 2019, 59, 1, 149-158 uses quantum-chemical calculations to explore the mechanism of stacking between Asp-Arg salt bridges and heterocycles, which looks useful as a guide for designing replacements.
An example of hetero shuffling carried out to improve bioavailability is J. Med. Chem. 2011, 54, 8, 3076-3080.