Molecular Representations¶

3/19/2024¶

In [1]:
%%html
<script src="https://bits.csb.pitt.edu/preamble.js"></script>

How do we get a molecule into a model?¶

"2D" Representations¶

DNA¶

ATGAGCTCCGCAGCCGGGTTCTGCGCC...

Protein¶

MSSAAGFCASRPGLLFLGLL...

Small Molecules?¶

Simplified Molecular Input Line Entry System (SMILES)¶

Atoms

Specified by their atomic symbols inside brackets

  • [Au], [Fe], [Zn], etc

No brackets are needed for the organic subset (B, C, N, O, P, S, F, Cl, Br, and I) when the atom is uncharged and has its usual valence.

Aromatic atoms are lower case: c1ccccc1

Bonds

  • Single -
  • Double =
  • Triple #
  • Aromatic :

Single and aromatic bond symbols can be omitted.

In [2]:
%%html
<div id="mchem1" style="width: 500px"></div>
<script>
    var divid = '#mchem1';
	jQuery(divid).asker({
	    id: divid,
	    question: "What is a SMILES string for ethane?",
		answers: ['c-c','CC','C=C','C#C','[Ca]-[Ca]'],
        server: "https://bits.csb.pitt.edu/asker.js/example/asker.cgi",
		charter: chartmaker})
    
    $(".jp-InputArea .o:contains(html)").closest('.jp-InputArea').hide();


</script>

SMILES, cont.¶

Branches¶

Parentheses denote branches and can be nested.

Example: SC(N)CO

Cycles¶

Break a bond in the cycle and use a digit to label the break.

A digit can be reused once the ring it labels has been closed.
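
The branch and ring rules above can be sketched as a tiny parser. This is a minimal illustration under stated assumptions, not a real SMILES reader: `parse_simple_smiles` is a hypothetical helper, it does not validate element symbols, and it skips bracket atoms, charges, stereo marks, `%nn` labels and `.`.

```python
def parse_simple_smiles(smiles):
    """Turn a restricted SMILES string into (atoms, bonds).

    Handles one-letter atoms plus Cl/Br, the bond symbols - = # :,
    parentheses, and single-digit ring closures. Bracket atoms, charges,
    stereo marks, '%nn' labels and '.' are deliberately left out.
    """
    atoms, bonds = [], []
    prev = None            # atom that the next atom will bond to
    pending = None         # explicit bond symbol waiting for its second atom
    branch_stack = []      # value of `prev` at each open parenthesis
    open_rings = {}        # digit -> (atom index, bond symbol) of the broken bond
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if smiles[i:i + 2] in ("Cl", "Br"):
            symbol = smiles[i:i + 2]
        elif ch.isalpha():
            symbol = ch
        else:
            symbol = None
        if symbol is not None:
            atoms.append(symbol)
            if prev is not None:
                bonds.append((prev, len(atoms) - 1, pending))
            prev, pending = len(atoms) - 1, None
            i += len(symbol)
            continue
        if ch in "-=#:":
            pending = ch
        elif ch == "(":
            branch_stack.append(prev)      # remember where the branch starts
        elif ch == ")" and branch_stack:
            prev = branch_stack.pop()      # return to the branch point
        elif ch.isdigit() and prev is not None:
            if ch in open_rings:           # second occurrence: rejoin the broken bond
                start, symbol_at_open = open_rings.pop(ch)
                bonds.append((start, prev, pending or symbol_at_open))
            else:                          # first occurrence: remember where the ring opens
                open_rings[ch] = (prev, pending)
            pending = None
        else:
            raise ValueError(f"unsupported or misplaced character {ch!r}")
        i += 1
    if open_rings or branch_stack:
        raise ValueError("unclosed ring or branch")
    return atoms, bonds


print(parse_simple_smiles("SC(N)CO"))     # branch example from above
print(parse_simple_smiles("C1CC1C1CC1"))  # the digit 1 is reused for a second ring
```

Each bond is a tuple `(i, j, symbol)`, where `symbol` is `None` when the bond symbol was omitted.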

SMILES, cont.¶

Disconnections¶

A period (.) separates components that are not bonded to each other.

[Na+].[Cl-]

Isomeric SMILES¶

Slashes (/ \) denote configuration around double bonds.

The at sign (@ or @@) denotes configuration around chiral centers.

In [3]:
%%html
<div id="mchem3" style="width: 500px"></div>
<script>

    var divid = '#mchem3';
	jQuery(divid).asker({
	    id: divid,
	    question: "What is a SMILES expression for benzene?",
		answers: ['c1cccc1','cccccc1','c2ccccc2','1cccccc1','c1ccccc1c'],
        server: "https://bits.csb.pitt.edu/asker.js/example/asker.cgi",
		charter: chartmaker})
    
    $(".jp-InputArea .o:contains(html)").closest('.jp-InputArea').hide();


</script>

Drawing¶

All but the simplest SMILES can be challenging to interpret (especially if chirality is included). Fortunately, you can use pybel (or a molecular viewer such as MarvinView) to convert them to a 2D depiction.

Example: CC(NC1=CC=C(O)C=C1)=O

In [4]:
from openbabel import pybel
mol = pybel.readstring('smi','CC(NC1=CC=C(O)C=C1)=O')
mol
Out[4]:
No description has been provided for this image
In [5]:
mol.draw(filename="figs/accet.png",show=False) 

Fingerprints¶

A molecular fingerprint reduces the chemical features of a molecule to a bit vector. Each feature corresponds to a bit in the vector, and that bit is set if the compound has the feature.

The most common type of fingerprint is a Daylight-style fingerprint, where all the paths (up to a given length) are enumerated and hashed to bit positions.
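
To make the path-hashing idea concrete, here is a minimal sketch in plain Python. It is not the Daylight algorithm or any toolkit's implementation: the molecule is a hand-built atom list and bond list (hypothetical inputs), bond orders are ignored, and CRC32 stands in for whatever hash a real fingerprinter uses.

```python
import zlib


def path_fingerprint(atoms, bonds, max_atoms=4, n_bits=64):
    """Hash every linear path of up to `max_atoms` atoms into a bit vector."""
    neighbors = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    paths = set()

    def extend(path):
        forward = "-".join(atoms[i] for i in path)
        backward = "-".join(atoms[i] for i in reversed(path))
        # A path and its reverse are the same substructure; keep one spelling.
        paths.add(min(forward, backward))
        if len(path) < max_atoms:
            for nxt in neighbors[path[-1]]:
                if nxt not in path:          # do not revisit atoms
                    extend(path + [nxt])

    for start in range(len(atoms)):
        extend([start])

    bits = [0] * n_bits
    for p in paths:
        bits[zlib.crc32(p.encode()) % n_bits] = 1   # collisions are possible
    return bits, paths


# Ethanol (CCO), written out by hand as atoms and bonds.
bits, paths = path_fingerprint(["C", "C", "O"], [(0, 1), (1, 2)])
print(sorted(paths))
print(sum(bits), "of", len(bits), "bits set")
```

The vector length is fixed by `n_bits` regardless of molecule size, and the result does not depend on the order in which the atoms are listed.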

This provides a fixed-length vector representation of a chemical structure.¶

In [6]:
%%html
<div id="cnnfinger" style="width: 500px"></div>
<script>

    var divid = '#cnnfinger';
	jQuery(divid).asker({
	    id: divid,
	    question: "Is a CNN a reasonable architecture for ingesting a fingerprint?",
		answers: ['Yes','No'],
        server: "https://bits.csb.pitt.edu/asker.js/example/asker.cgi",
		charter: chartmaker})
    
    $(".jp-InputArea .o:contains(html)").closest('.jp-InputArea').hide();


</script>

"Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules"¶

https://arxiv.org/pdf/1610.02415.pdf

  • Variational Autoencoder
  • Encoder: 1D convolutional layers (not recurrent)
  • Decoder: RNN using GRUs
  • Used canonical SMILES
  • 1% to 70% of the output is valid SMILES

"Generating Focused Molecule Libraries for Drug Discovery with Recurrent Neural Networks"¶

https://pubs.acs.org/doi/10.1021/acscentsci.7b00512

Uses LSTM units. 97.7% valid molecules.

"Generating Focused Molecule Libraries for Drug Discovery with Recurrent Neural Networks"¶

https://pubs.acs.org/doi/10.1021/acscentsci.7b00512

Generated molecules are sampled from the same distribution as the training set (but are still new molecules: only 12% scaffold overlap).

"Generating Focused Molecule Libraries for Drug Discovery with Recurrent Neural Networks"¶

https://pubs.acs.org/doi/10.1021/acscentsci.7b00512

Used transfer learning (fine-tuning) to generate molecules from the same distribution as compounds active against a target.

"Molecular de-novo design through deep reinforcement learning"¶

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-017-0235-x#Fig9

How the model "thinks". Uses GRUs.

SMILES Validity¶

Generative models can struggle to generate syntactically correct SMILES strings. Examples of invalid output:

  • c1cccccc
  • c1ccccc2c
  • [nHCCC
  • CSl
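
A rough way to see why these four strings fail is a purely syntactic check. The sketch below is an assumption-laden illustration (the regular expression and `smiles_syntax_errors` are made up for this example): it only looks at tokens, parentheses and ring labels, so a string that passes can still be chemically invalid (bad valence, broken aromaticity). A real pipeline would ask a toolkit parser instead.

```python
import re

# Bracket atom, two-letter halogens, organic-subset atoms (aliphatic, then
# aromatic), ring labels, and bond/branch/dot symbols.
TOKEN = re.compile(
    r"\[[^\[\]]+\]|Cl|Br|[BCNOPSFI]|[bcnops]|%\d{2}|\d|[-=#:/\\.()]"
)


def smiles_syntax_errors(smiles):
    """Return a list of syntax problems; an empty list means none were found."""
    errors, tokens, pos = [], [], 0
    while pos < len(smiles):
        match = TOKEN.match(smiles, pos)
        if match is None:
            errors.append(f"cannot tokenize {smiles[pos:]!r} at position {pos}")
            break
        tokens.append(match.group())
        pos = match.end()

    depth, open_rings = 0, set()
    for tok in tokens:
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
            if depth < 0:
                errors.append("')' without a matching '('")
                depth = 0
        elif tok[0] == "%" or tok.isdigit():
            open_rings ^= {int(tok.lstrip("%"))}   # toggle: open, then close
    if depth > 0:
        errors.append(f"{depth} unclosed '('")
    if open_rings:
        errors.append(f"unclosed ring label(s): {sorted(open_rings)}")
    return errors


for s in ["c1cccccc", "c1ccccc2c", "[nHCCC", "CSl", "c1ccccc1"]:
    print(s, smiles_syntax_errors(s))
```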

"Molecular de-novo design through deep reinforcement learning"¶

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-017-0235-x#Fig9

Note tokenization of SMILES
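
As a hedged illustration of what tokenization involves (not the scheme used in the paper), the sketch below treats bracket atoms, `Cl`, `Br` and `%nn` ring labels as single tokens and maps tokens to integer ids. The function names and the `<pad>` token are assumptions made for this example.

```python
import re

# Multi-character tokens first; any other single character is its own token.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles):
    return SMILES_TOKEN.findall(smiles)


def build_vocab(token_lists):
    """Assign an integer id to each distinct token; id 0 is reserved for padding."""
    vocab = {"<pad>": 0}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(tokens, vocab, length):
    """Map tokens to ids and right-pad to `length`."""
    ids = [vocab[tok] for tok in tokens]
    return ids + [vocab["<pad>"]] * (length - len(ids))


corpus = ["CC(NC1=CC=C(O)C=C1)=O", "[Na+].[Cl-]", "c1ccccc1"]
token_lists = [tokenize(s) for s in corpus]
vocab = build_vocab(token_lists)
longest = max(len(t) for t in token_lists)
for tokens in token_lists:
    print(encode(tokens, vocab, longest))
```

Splitting character by character would instead break `Cl` into `C` and `l`, which is one way a model ends up emitting strings like `CSl`.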

"DeepSMILES: An Adaptation of SMILES for Use in Machine-Learning of Chemical Structures"¶

https://chemrxiv.org/articles/preprint/DeepSMILES_An_Adaptation_of_SMILES_for_Use_in_Machine-Learning_of_Chemical_Structures/7097960/1

Changes syntax of SMILES to make some common syntax errors impossible.

"Self-Referencing Embedded Strings (SELFIES): A 100% robust molecular string representation"¶

https://arxiv.org/pdf/1905.13741.pdf

More local syntax combined with a rule set for interpreting a SELFIES string.

"Grammar Variational Autoencoder"¶

Keeps the SMILES representation but changes the decoder so that it must respect the SMILES grammar.

https://arxiv.org/pdf/1703.01925.pdf

"Deep reinforcement learning for de novo drug design"¶

https://advances.sciencemag.org/content/4/7/eaap7885

Mariya Popova, Olexandr Isayev and Alexander Tropsha

Describes Stack-RNN for decoding.

"Randomized SMILES strings improve the quality of molecular generative models"¶

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-019-0393-0/figures/1

SMILES can be augmented by generating them using different graph traversal orders. Data augmentation improves model performance.
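
The traversal-order idea can be sketched for a toy case. The code below is an illustration only, assuming an acyclic molecule with single bonds given as a hand-built atom list and bond list; `build_neighbors`, `write_smiles` and `randomized_smiles` are hypothetical helpers, and this is not the procedure used in the paper, which relies on a cheminformatics toolkit.

```python
import random


def build_neighbors(n_atoms, bonds):
    neighbors = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    return neighbors


def write_smiles(atoms, neighbors, root, rng):
    """Depth-first SMILES writer for an acyclic, single-bonded molecule."""
    def visit(atom, parent):
        children = [n for n in neighbors[atom] if n != parent]
        rng.shuffle(children)                 # random order of substituents
        out = atoms[atom]
        for child in children[:-1]:           # all but the last become branches
            out += "(" + visit(child, atom) + ")"
        if children:
            out += visit(children[-1], atom)  # the last continues the main chain
        return out
    return visit(root, None)


def randomized_smiles(atoms, bonds, n, seed=0):
    """Write the same molecule n times from random roots; return the distinct strings."""
    rng = random.Random(seed)
    neighbors = build_neighbors(len(atoms), bonds)
    return {
        write_smiles(atoms, neighbors, rng.randrange(len(atoms)), rng)
        for _ in range(n)
    }


# The molecule SC(N)CO from the branches example, as atoms and bonds.
atoms = ["S", "C", "N", "C", "O"]
bonds = [(0, 1), (1, 2), (1, 3), (3, 4)]
for s in sorted(randomized_smiles(atoms, bonds, n=50)):
    print(s)
```

Every printed string describes the same graph, so each one can serve as an additional training example for the same molecule.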

"Randomized SMILES strings improve the quality of molecular generative models"¶

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-019-0393-0/figures/1

"Convolutional Networks on Graphs for Learning Molecular Fingerprints"¶

Uses graph convolutions for molecular property prediction.

https://arxiv.org/abs/1509.09292
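
For intuition, one graph-convolution step can be sketched in plain Python. This is a simplified stand-in, not the paper's exact layer: each atom sums its own and its neighbors' feature vectors, applies a weight matrix and a ReLU, and a sum over atoms gives a fixed-length molecule vector. The one-hot features, identity weights and function names are assumptions for illustration.

```python
def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def graph_conv(features, bonds, W):
    """One step: aggregate self + neighbor features, then linear map + ReLU."""
    neighbors = [[] for _ in features]
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    updated = []
    for atom, own in enumerate(features):
        agg = list(own)
        for nb in neighbors[atom]:
            agg = [x + y for x, y in zip(agg, features[nb])]
        updated.append([max(0.0, z) for z in matvec(W, agg)])
    return updated


def readout(features):
    """Sum over atoms: the result has the same length for any molecule size."""
    return [sum(column) for column in zip(*features)]


# Ethanol (CCO) with one-hot atom features [is_C, is_O] and identity weights.
features = [[1, 0], [1, 0], [0, 1]]
bonds = [(0, 1), (1, 2)]
identity = [[1, 0], [0, 1]]
hidden = graph_conv(features, bonds, identity)
print(hidden)
print(readout(hidden))
```

Because both the neighbor aggregation and the readout are sums, the molecule vector does not change when the atoms are listed in a different order.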

"Molecule Attention Transformer"¶

https://arxiv.org/abs/2002.08264

Consistently outperforms graph convolutions in the authors' evaluations.

3D¶

In [7]:
mol.make3D() #this makes a reasonable 3D structure
print(mol.atoms[0].coords)
mol.localopt() #this further optimizes the structure
print(mol.atoms[0].coords)
(0.950634151515104, -0.00985885224559379, -0.1440758247433966)
(1.0366603208497924, -0.3020406109655527, -0.20514490988402379)
In [8]:
sdf = mol.write('sdf')
In [9]:
import py3Dmol
view = py3Dmol.view()
view.addModel(sdf)
view.setStyle({'stick':{}})
view.zoomTo()
view.show()

Common 3D File Formats¶

  • PDB: limited by its fixed-width format, lacks the ability to fully specify chemical properties, implicit bonds for known residues
  • mmCIF: improved version of PDB that somehow ends up worse
  • SDF: capable of representing chemical information, explicit bonds
  • MOL2: similar to SDF
  • XYZ: atom type and Cartesian coordinates only
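
As a rough illustration of what "explicit bonds" means in SDF, the sketch below reads the fixed-width V2000 fields by column position. The function names are made up for this example, the three sample lines are copied from the best.sdf dump in these notes, and only the fields used here are parsed; in practice `pybel.readfile` does this work.

```python
def parse_counts_line(line):
    """V2000 counts line: atom count in columns 1-3, bond count in columns 4-6."""
    return int(line[0:3]), int(line[3:6])


def parse_atom_line(line):
    """Atom line: x, y, z in three 10-character fields, then the element symbol."""
    x, y, z = float(line[0:10]), float(line[10:20]), float(line[20:30])
    return line[31:34].strip(), (x, y, z)


def parse_bond_line(line):
    """Bond line: first atom, second atom (both 1-based), and bond type."""
    return int(line[0:3]), int(line[3:6]), int(line[6:9])


def parse_mol_block(text):
    """Parse one record's atom and bond blocks (three header lines come first)."""
    lines = text.splitlines()
    n_atoms, n_bonds = parse_counts_line(lines[3])
    atoms = [parse_atom_line(l) for l in lines[4:4 + n_atoms]]
    bonds = [parse_bond_line(l) for l in lines[4 + n_atoms:4 + n_atoms + n_bonds]]
    return atoms, bonds


print(parse_counts_line(" 39 44  0  0  0  0  0  0  0  0999 V2000"))
print(parse_atom_line(
    "   -0.5939  -56.8911   14.3139 C   0  0  0  0  0  0  0  0  0  0  0  0"))
print(parse_bond_line("  8  9  1  0  0  0"))
```

Every bond is listed with its two atoms and its type, whereas PDB leaves the bonds of known residues implicit.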

SDF Molecules¶

In [11]:
mols = list(pybel.readfile('sdf','best.sdf'))
len(mols)
Out[11]:
10
In [12]:
atom = mols[0].atoms[0]
print(atom.coords)
(-0.5939, -56.8911, 14.3139)
In [13]:
!cat best.sdf
ZINC78996542


 39 44  0  0  0  0  0  0  0  0999 V2000
   -0.5939  -56.8911   14.3139 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.3154  -57.8883   15.8741 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3628  -55.5394   14.9296 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.0440  -55.7357   15.4805 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.3058  -57.7869   14.5684 N   0  0  0  0  0  0  0  0  0  0  0  0
    1.2724  -57.1748   15.3144 N   0  0  0  0  0  0  0  0  0  0  0  0
    3.1864  -57.3893   16.5881 O   0  0  0  0  0  0  0  0  0  0  0  0
   -6.5650  -58.0576   12.9536 C   0  0  0  0  0  0  0  0  0  0  0  0
   -6.4112  -58.0403   11.5707 C   0  0  0  0  0  0  0  0  0  0  0  0
   -5.4635  -57.8375   13.7859 C   0  0  0  0  0  0  0  0  0  0  0  0
   -5.1560  -57.8031   11.0185 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.1883  -57.5962   13.2480 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.0573  -57.5833   11.8565 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.9942  -57.3574   14.1090 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.7971  -57.1312   13.5121 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.6648  -57.1197   12.0139 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.8049  -57.3464   11.2822 N   0  0  0  0  0  0  0  0  0  0  0  0
   -0.5742  -56.9136   11.4820 O   0  0  0  0  0  0  0  0  0  0  0  0
   -2.7364  -57.3419   10.3001 H   0  0  0  0  0  0  0  0  0  0  0  0
   -2.7198  -58.5030   16.2901 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.5010  -56.2171   16.2506 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.7946  -58.5045   17.6831 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.5759  -56.2186   17.6435 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.0729  -57.3592   15.5739 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.2227  -57.3625   18.3596 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.3046  -57.3655   19.8487 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.3111  -54.6466   15.3638 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.2563  -53.8710   14.6948 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.8121  -54.3645   13.5073 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.6178  -52.1004   11.5954 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.4057  -52.3452   11.0066 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.0836  -54.8943   14.7675 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.9949  -53.3368   13.4353 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.7496  -53.5881   12.8303 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.9229  -52.5915   12.8100 N   0  0  0  0  0  0  0  0  0  0  0  0
    2.4640  -53.0881   11.6153 N   0  0  0  0  0  0  0  0  0  0  0  0
    3.6585  -59.9707   15.9992 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.1564  -60.0645   16.2196 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.3350  -59.3635   15.5577 C   0  0  0  0  0  0  0  0  0  0  0  0
  8  9  1  0  0  0
 10 12  1  0  0  0
 20 24  1  0  0  0
 21 23  1  0  0  0
 27 32  1  0  0  0
 22 25  1  0  0  0
 28 33  1  0  0  0
 11 13  1  0  0  0
 29 34  1  0  0  0
 24 14  1  0  0  0
 12 14  1  0  0  0
 33 34  1  0  0  0
 13 17  1  0  0  0
 15  1  1  0  0  0
 15 16  1  0  0  0
 16 17  1  0  0  0
  2  6  1  0  0  0
  3  1  1  0  0  0
  3  4  1  0  0  0
  4 32  1  0  0  0
  4  6  1  0  0  0
 26 25  1  0  0  0
 37 39  1  0  0  0
 38 39  1  0  0  0
 39  2  1  0  0  0
  6  5  1  0  0  0
  8 10  2  0  0  0
  9 11  2  0  0  0
 20 22  2  0  0  0
 21 24  2  0  0  0
 27 28  2  0  0  0
 23 25  2  0  0  0
 29 32  2  0  0  0
 30 31  2  0  0  0
 30 35  2  0  0  0
 31 36  2  0  0  0
 12 13  2  0  0  0
 33 35  2  0  0  0
 34 36  2  0  0  0
 14 15  2  0  0  0
  1  5  2  0  0  0
 16 18  2  0  0  0
  2  7  2  0  0  0
 17 19  1  0  0  0
M  END
> <minimizedAffinity>
-7.83433

> <minimizedRMSD>
1.45522

> <molecular weight>
475.372

$$$$
ZINC78996542


 39 44  0  0  0  0  0  0  0  0999 V2000
   -0.5722  -56.8468   14.3132 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.3170  -57.8869   15.8829 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3244  -55.4995   14.9316 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.0775  -55.7161   15.4874 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.3140  -57.7556   14.5698 N   0  0  0  0  0  0  0  0  0  0  0  0
    1.2862  -57.1582   15.3202 N   0  0  0  0  0  0  0  0  0  0  0  0
    3.1923  -57.4012   16.6007 O   0  0  0  0  0  0  0  0  0  0  0  0
   -6.5452  -57.9747   12.9290 C   0  0  0  0  0  0  0  0  0  0  0  0
   -6.3911  -57.9310   11.5468 C   0  0  0  0  0  0  0  0  0  0  0  0
   -5.4434  -57.7729   13.7658 C   0  0  0  0  0  0  0  0  0  0  0  0
   -5.1352  -57.6858   10.9997 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.1676  -57.5240   13.2330 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.0362  -57.4846   11.8422 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.9733  -57.3042   14.0987 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.7756  -57.0691   13.5066 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.6429  -57.0290   12.0089 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.7832  -57.2393   11.2727 N   0  0  0  0  0  0  0  0  0  0  0  0
   -0.5516  -56.8149   11.4815 O   0  0  0  0  0  0  0  0  0  0  0  0
   -2.7144  -57.2159   10.2909 H   0  0  0  0  0  0  0  0  0  0  0  0
   -2.7038  -58.4927   16.2574 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.4766  -56.2038   16.2618 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.7789  -58.5209   17.6500 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.5521  -56.2319   17.6545 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.0525  -57.3342   15.5634 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.2031  -57.3905   18.3484 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.2855  -57.4221   19.8372 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.3565  -54.6510   15.3845 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.3151  -53.8880   14.7200 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.8755  -54.3614   13.5148 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.7196  -52.1345   11.6161 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.5100  -52.3693   11.0185 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.1314  -54.8885   14.7794 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.0693  -53.3564   13.4560 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.8264  -53.5977   12.8421 C   0  0  0  0  0  0  0  0  0  0  0  0
    5.0099  -52.6234   12.8352 N   0  0  0  0  0  0  0  0  0  0  0  0
    2.5559  -53.0999   11.6229 N   0  0  0  0  0  0  0  0  0  0  0  0
    3.6348  -59.9861   16.0001 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.1325  -60.0490   16.2302 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.3170  -59.3619   15.5646 C   0  0  0  0  0  0  0  0  0  0  0  0
  8  9  1  0  0  0
 10 12  1  0  0  0
 20 24  1  0  0  0
 21 23  1  0  0  0
 27 32  1  0  0  0
 22 25  1  0  0  0
 28 33  1  0  0  0
 11 13  1  0  0  0
 29 34  1  0  0  0
 24 14  1  0  0  0
 12 14  1  0  0  0
 33 34  1  0  0  0
 13 17  1  0  0  0
 15  1  1  0  0  0
 15 16  1  0  0  0
 16 17  1  0  0  0
  2  6  1  0  0  0
  3  1  1  0  0  0
  3  4  1  0  0  0
  4 32  1  0  0  0
  4  6  1  0  0  0
 26 25  1  0  0  0
 37 39  1  0  0  0
 38 39  1  0  0  0
 39  2  1  0  0  0
  6  5  1  0  0  0
  8 10  2  0  0  0
  9 11  2  0  0  0
 20 22  2  0  0  0
 21 24  2  0  0  0
 27 28  2  0  0  0
 23 25  2  0  0  0
 29 32  2  0  0  0
 30 31  2  0  0  0
 30 35  2  0  0  0
 31 36  2  0  0  0
 12 13  2  0  0  0
 33 35  2  0  0  0
 34 36  2  0  0  0
 14 15  2  0  0  0
  1  5  2  0  0  0
 16 18  2  0  0  0
  2  7  2  0  0  0
 17 19  1  0  0  0
M  END
> <minimizedAffinity>
-7.7915

> <minimizedRMSD>
1.18555

> <molecular weight>
475.372

$$$$
ZINC78996534


 39 44  0  0  0  0  0  0  0  0999 V2000
   -0.6060  -58.4259   14.4308 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.2622  -57.0761   15.7885 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3848  -59.6010   15.3414 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.0076  -59.2726   15.8653 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.2852  -57.4884   14.4946 N   0  0  0  0  0  0  0  0  0  0  0  0
    1.2358  -57.9070   15.3815 N   0  0  0  0  0  0  0  0  0  0  0  0
    3.1167  -57.3919   16.6176 O   0  0  0  0  0  0  0  0  0  0  0  0
   -6.5490  -57.6468   12.7169 C   0  0  0  0  0  0  0  0  0  0  0  0
   -6.3640  -57.9721   11.3766 C   0  0  0  0  0  0  0  0  0  0  0  0
   -5.4660  -57.6655   13.6007 C   0  0  0  0  0  0  0  0  0  0  0  0
   -5.0959  -58.3170   10.9188 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.1782  -58.0108   13.1580 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.0156  -58.3341   11.8081 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.0030  -58.0395   14.0758 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.7918  -58.3829   13.5703 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.6256  -58.7300   12.1163 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.7497  -58.6825   11.3283 N   0  0  0  0  0  0  0  0  0  0  0  0
   -0.5225  -59.0404   11.6676 O   0  0  0  0  0  0  0  0  0  0  0  0
   -2.6589  -58.9060   10.3738 H   0  0  0  0  0  0  0  0  0  0  0  0
   -3.5198  -58.6843   16.4130 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.8163  -56.4211   15.9438 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.6259  -58.3704   17.7678 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.9225  -56.1072   17.2987 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.1149  -57.7096   15.5009 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.3273  -57.0819   18.2108 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.4433  -56.7463   19.6592 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.8030  -59.9702   14.2440 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.7737  -60.8749   13.8173 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.3083  -61.4219   16.0936 C   0  0  0  0  0  0  0  0  0  0  0  0
    5.1631  -64.0341   14.7974 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.4366  -64.3052   15.9262 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.0669  -60.2450   15.3869 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.0225  -62.0537   14.5162 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.2759  -62.3324   15.6757 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.9642  -62.9109   14.0844 N   0  0  0  0  0  0  0  0  0  0  0  0
    3.4901  -63.4609   16.3744 N   0  0  0  0  0  0  0  0  0  0  0  0
    1.2882  -54.7944   15.8358 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.6925  -55.1298   15.1839 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.2860  -55.7119   15.1441 C   0  0  0  0  0  0  0  0  0  0  0  0
  8  9  1  0  0  0
 10 12  1  0  0  0
 20 24  1  0  0  0
 21 23  1  0  0  0
 27 32  1  0  0  0
 22 25  1  0  0  0
 28 33  1  0  0  0
 11 13  1  0  0  0
 29 34  1  0  0  0
 24 14  1  0  0  0
 12 14  1  0  0  0
 33 34  1  0  0  0
 13 17  1  0  0  0
 15  1  1  0  0  0
 15 16  1  0  0  0
 16 17  1  0  0  0
  2  6  1  0  0  0
  3  1  1  0  0  0
  3  4  1  0  0  0
  4 32  1  0  0  0
  4  6  1  0  0  0
 26 25  1  0  0  0
 37 39  1  0  0  0
 38 39  1  0  0  0
 39  2  1  0  0  0
  6  5  1  0  0  0
  8 10  2  0  0  0
  9 11  2  0  0  0
 20 22  2  0  0  0
 21 24  2  0  0  0
 27 28  2  0  0  0
 23 25  2  0  0  0
 29 32  2  0  0  0
 30 31  2  0  0  0
 30 35  2  0  0  0
 31 36  2  0  0  0
 12 13  2  0  0  0
 33 35  2  0  0  0
 34 36  2  0  0  0
 14 15  2  0  0  0
  1  5  2  0  0  0
 16 18  2  0  0  0
  2  7  2  0  0  0
 17 19  1  0  0  0
M  END
> <minimizedAffinity>
-7.60183

> <minimizedRMSD>
2.26383

> <molecular weight>
475.372

$$$$
ZINC78996542


 39 44  0  0  0  0  0  0  0  0999 V2000
   -1.1562  -57.7105   14.6555 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.0859  -57.6546   15.8298 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.1390  -56.2126   14.7797 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.3392  -55.9873   15.0720 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.0584  -58.3149   14.9816 N   0  0  0  0  0  0  0  0  0  0  0  0
    0.8469  -57.3439   15.3020 N   0  0  0  0  0  0  0  0  0  0  0  0
    2.9188  -56.8145   16.1731 O   0  0  0  0  0  0  0  0  0  0  0  0
   -6.8917  -60.1405   14.9063 C   0  0  0  0  0  0  0  0  0  0  0  0
   -6.9219  -60.5932   13.5908 C   0  0  0  0  0  0  0  0  0  0  0  0
   -5.7606  -59.4813   15.3968 C   0  0  0  0  0  0  0  0  0  0  0  0
   -5.8212  -60.3875   12.7645 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.6396  -59.2637   14.5786 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.6919  -59.7269   13.2610 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.4188  -58.5636   15.0722 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.3780  -58.3922   14.2190 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.4425  -58.8944   12.8026 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.5968  -59.5297   12.4148 N   0  0  0  0  0  0  0  0  0  0  0  0
   -1.4927  -58.7341   12.0364 O   0  0  0  0  0  0  0  0  0  0  0  0
   -3.6562  -59.8644   11.4908 H   0  0  0  0  0  0  0  0  0  0  0  0
   -2.7756  -58.8669   17.4469 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.7367  -56.7639   16.7466 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.6706  -58.3845   18.7515 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.6320  -56.2815   18.0513 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.3085  -58.0565   16.4443 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.0988  -57.0919   19.0536 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.9885  -56.5777   20.4491 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.6434  -55.4303   12.6355 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.3219  -54.7691   11.6131 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.1612  -54.4616   14.2263 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.1096  -52.5500   11.1925 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.5256  -52.3975   12.4883 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.0650  -55.2760   13.9477 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.4186  -53.9532   11.8808 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.8462  -53.7965   13.2121 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.0567  -53.3258   10.8773 N   0  0  0  0  0  0  0  0  0  0  0  0
    3.9009  -53.0163   13.5062 N   0  0  0  0  0  0  0  0  0  0  0  0
    2.4158  -59.7838   14.5992 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.6872  -59.3395   16.7216 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.3789  -59.1280   15.9719 C   0  0  0  0  0  0  0  0  0  0  0  0
  8  9  1  0  0  0
 10 12  1  0  0  0
 20 24  1  0  0  0
 21 23  1  0  0  0
 27 32  1  0  0  0
 22 25  1  0  0  0
 28 33  1  0  0  0
 11 13  1  0  0  0
 29 34  1  0  0  0
 24 14  1  0  0  0
 12 14  1  0  0  0
 33 34  1  0  0  0
 13 17  1  0  0  0
 15  1  1  0  0  0
 15 16  1  0  0  0
 16 17  1  0  0  0
  2  6  1  0  0  0
  3  1  1  0  0  0
  3  4  1  0  0  0
  4 32  1  0  0  0
  4  6  1  0  0  0
 26 25  1  0  0  0
 37 39  1  0  0  0
 38 39  1  0  0  0
 39  2  1  0  0  0
  6  5  1  0  0  0
  8 10  2  0  0  0
  9 11  2  0  0  0
 20 22  2  0  0  0
 21 24  2  0  0  0
 27 28  2  0  0  0
 23 25  2  0  0  0
 29 32  2  0  0  0
 30 31  2  0  0  0
 30 35  2  0  0  0
 31 36  2  0  0  0
 12 13  2  0  0  0
 33 35  2  0  0  0
 34 36  2  0  0  0
 14 15  2  0  0  0
  1  5  2  0  0  0
 16 18  2  0  0  0
  2  7  2  0  0  0
 17 19  1  0  0  0
M  END
> <minimizedAffinity>
-7.58798

> <minimizedRMSD>
1.876

> <molecular weight>
475.372

$$$$
ZINC35448294


 33 38  0  0  0  0  0  0  0  0999 V2000
    6.2193  -51.5392   13.7822 C   0  0  0  0  0  0  0  0  0  0  0  0
    6.5893  -51.9773   15.0518 C   0  0  0  0  0  0  0  0  0  0  0  0
    5.1156  -52.0958   13.1259 C   0  0  0  0  0  0  0  0  0  0  0  0
    5.8703  -52.9821   15.7052 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.3763  -53.1129   13.7626 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.2347  -53.8835   13.4110 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.7696  -53.5311   15.0379 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.9678  -54.7425   14.4550 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.2954  -56.1404   13.1784 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.3853  -53.8636   12.1948 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.6865  -55.2195   12.0267 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.8500  -55.7376   14.5366 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.8912  -54.5146   15.4387 N   0  0  0  0  0  0  0  0  0  0  0  0
    1.0317  -55.6866   13.2732 N   0  0  0  0  0  0  0  0  0  0  0  0
   -0.9662  -56.1848   12.1474 O   0  0  0  0  0  0  0  0  0  0  0  0
    3.9233  -54.9834   16.3038 H   0  0  0  0  0  0  0  0  0  0  0  0
   -2.9363  -58.4538   18.1389 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.3508  -57.1319   18.2899 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.1432  -58.8360   17.0510 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.9875  -56.1521   17.3616 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.9885  -57.8949   14.9095 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.7643  -57.8655   16.1021 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.1944  -56.5486   16.2790 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.9633  -56.6191   14.3950 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.6932  -55.8143   15.2279 N   0  0  0  0  0  0  0  0  0  0  0  0
   -1.8386  -54.8497   15.0945 H   0  0  0  0  0  0  0  0  0  0  0  0
    2.4658  -57.5272   16.2097 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.6382  -58.0380   13.8546 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.9083  -58.8130   16.5209 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.0807  -59.3238   14.1659 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.3307  -57.1398   14.8765 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.2157  -59.7112   15.4991 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.7617  -61.2970   15.8834 Cl  0  0  0  0  0  0  0  0  0  0  0  0
 17 18  1  0  0  0
  1  2  1  0  0  0
 19 22  1  0  0  0
  3  5  1  0  0  0
 27 31  1  0  0  0
 28 30  1  0  0  0
 20 23  1  0  0  0
  4  7  1  0  0  0
 29 32  1  0  0  0
 21 22  1  0  0  0
  5  6  1  0  0  0
 23 25  1  0  0  0
  7 13  1  0  0  0
 32 33  1  0  0  0
 24  9  1  0  0  0
 24 25  1  0  0  0
  8 13  1  0  0  0
  9 14  1  0  0  0
 10  6  1  0  0  0
 10 11  1  0  0  0
 11 14  1  0  0  0
 12 31  1  0  0  0
 12  8  1  0  0  0
 12 14  1  0  0  0
 17 19  2  0  0  0
  1  3  2  0  0  0
 18 20  2  0  0  0
  2  4  2  0  0  0
 27 29  2  0  0  0
 28 31  2  0  0  0
 30 32  2  0  0  0
 21 24  2  0  0  0
 22 23  2  0  0  0
  5  7  2  0  0  0
  6  8  2  0  0  0
  9 15  2  0  0  0
 25 26  1  0  0  0
 13 16  1  0  0  0
M  END
> <minimizedAffinity>
-7.52352

> <minimizedRMSD>
6.72818

> <molecular weight>
407.767

$$$$
ZINC72314638


 34 38  0  0  0  0  0  0  0  0999 V2000
   -6.9192  -60.0249   14.8267 C   0  0  0  0  0  0  0  0  0  0  0  0
   -6.9224  -60.5532   13.5394 C   0  0  0  0  0  0  0  0  0  0  0  0
   -5.7591  -59.4408   15.3438 C   0  0  0  0  0  0  0  0  0  0  0  0
   -5.7655  -60.4981   12.7677 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.5814  -59.3754   14.5808 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.6074  -59.9121   13.2905 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.3284  -58.7584   15.1036 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.2330  -58.7339   14.3037 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.2694  -59.3144   12.9167 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.4559  -59.8663   12.4993 N   0  0  0  0  0  0  0  0  0  0  0  0
   -1.2700  -59.2872   12.1989 O   0  0  0  0  0  0  0  0  0  0  0  0
   -3.4977  -60.2504   11.5937 H   0  0  0  0  0  0  0  0  0  0  0  0
   -0.9776  -58.1386   14.7709 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.1138  -55.4726   15.4117 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.0692  -59.0069   15.4111 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.2024  -57.9981   15.5499 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.4884  -55.4636   16.0338 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.6963  -56.8766   14.7008 N   0  0  0  0  0  0  0  0  0  0  0  0
    0.5616  -56.7232   15.2099 N   0  0  0  0  0  0  0  0  0  0  0  0
    0.5532  -54.4163   15.1165 O   0  0  0  0  0  0  0  0  0  0  0  0
    3.3449  -58.4107   12.4428 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.5462  -58.8702   12.9824 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.2658  -58.1304   13.2810 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.6683  -59.0498   14.3602 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.3878  -58.3101   14.6587 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.5892  -58.7698   15.1985 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.7434  -58.9704   16.6705 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.8700  -58.9787   17.5299 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.5363  -56.8305   16.6476 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.7893  -58.4284   18.8090 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.4557  -56.2801   17.9268 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.2435  -58.1798   16.4491 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.0821  -57.0791   19.0075 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.9979  -56.4920   20.3757 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0
 21 22  1  0  0  0
  3  5  1  0  0  0
 28 32  1  0  0  0
 29 31  1  0  0  0
 23 25  1  0  0  0
 24 26  1  0  0  0
 30 33  1  0  0  0
  4  6  1  0  0  0
 32  7  1  0  0  0
  5  7  1  0  0  0
  6 10  1  0  0  0
  8 13  1  0  0  0
  8  9  1  0  0  0
  9 10  1  0  0  0
 14 19  1  0  0  0
 15 13  1  0  0  0
 15 16  1  0  0  0
 16 25  1  0  0  0
 16 19  1  0  0  0
 34 33  1  0  0  0
 27 26  1  0  0  0
 17 14  1  0  0  0
 19 18  1  0  0  0
  1  3  2  0  0  0
 21 23  2  0  0  0
 22 24  2  0  0  0
  2  4  2  0  0  0
 28 30  2  0  0  0
 29 32  2  0  0  0
 31 33  2  0  0  0
  5  6  2  0  0  0
 25 26  2  0  0  0
  7  8  2  0  0  0
 13 18  2  0  0  0
  9 11  2  0  0  0
 14 20  2  0  0  0
 10 12  1  0  0  0
M  END
> <minimizedAffinity>
-7.51168

> <minimizedRMSD>
1.81673

> <molecular weight>
411.326

$$$$
ZINC72314638


 34 38  0  0  0  0  0  0  0  0999 V2000
   -6.9192  -60.0250   14.8266 C   0  0  0  0  0  0  0  0  0  0  0  0
   -6.9223  -60.5532   13.5393 C   0  0  0  0  0  0  0  0  0  0  0  0
   -5.7590  -59.4409   15.3438 C   0  0  0  0  0  0  0  0  0  0  0  0
   -5.7655  -60.4980   12.7677 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.5813  -59.3753   14.5807 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.6073  -59.9120   13.2905 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.3283  -58.7583   15.1036 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.2330  -58.7336   14.3037 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.2693  -59.3140   12.9166 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.4559  -59.8659   12.4992 N   0  0  0  0  0  0  0  0  0  0  0  0
   -1.2700  -59.2867   12.1988 O   0  0  0  0  0  0  0  0  0  0  0  0
   -3.4977  -60.2499   11.5936 H   0  0  0  0  0  0  0  0  0  0  0  0
   -0.9776  -58.1384   14.7708 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.1137  -55.4723   15.4114 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.0691  -59.0065   15.4111 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.2024  -57.9977   15.5499 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.4883  -55.4632   16.0337 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.6963  -56.8762   14.7006 N   0  0  0  0  0  0  0  0  0  0  0  0
    0.5616  -56.7229   15.2098 N   0  0  0  0  0  0  0  0  0  0  0  0
    0.5532  -54.4160   15.1161 O   0  0  0  0  0  0  0  0  0  0  0  0
    3.3452  -58.4103   12.4429 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.5464  -58.8703   12.9827 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.2660  -58.1300   13.2811 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.6682  -59.0500   14.3606 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.3879  -58.3098   14.6589 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.5890  -58.7698   15.1986 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.7431  -58.9706   16.6707 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.8702  -58.9787   17.5299 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.5360  -56.8304   16.6476 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.7895  -58.4286   18.8091 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.4554  -56.2801   17.9268 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.2434  -58.1797   16.4491 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.0819  -57.0792   19.0075 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.9978  -56.4922   20.3758 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0
 21 22  1  0  0  0
  3  5  1  0  0  0
 28 32  1  0  0  0
 29 31  1  0  0  0
 23 25  1  0  0  0
 24 26  1  0  0  0
 30 33  1  0  0  0
  4  6  1  0  0  0
 32  7  1  0  0  0
  5  7  1  0  0  0
  6 10  1  0  0  0
  8 13  1  0  0  0
  8  9  1  0  0  0
  9 10  1  0  0  0
 14 19  1  0  0  0
 15 13  1  0  0  0
 15 16  1  0  0  0
 16 25  1  0  0  0
 16 19  1  0  0  0
 34 33  1  0  0  0
 27 26  1  0  0  0
 17 14  1  0  0  0
 19 18  1  0  0  0
  1  3  2  0  0  0
 21 23  2  0  0  0
 22 24  2  0  0  0
  2  4  2  0  0  0
 28 30  2  0  0  0
 29 32  2  0  0  0
 31 33  2  0  0  0
  5  6  2  0  0  0
 25 26  2  0  0  0
  7  8  2  0  0  0
 13 18  2  0  0  0
  9 11  2  0  0  0
 14 20  2  0  0  0
 10 12  1  0  0  0
M  END
> <minimizedAffinity>
-7.51156

> <minimizedRMSD>
2.07052

> <molecular weight>
411.326

$$$$
ZINC39912421


 35 39  0  0  0  0  0  0  0  0999 V2000
   -2.8637  -58.0485   14.3831 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.6591  -57.4567   14.0111 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.7718  -57.6110   13.4624 C   0  0  0  0  0  0  0  0  0  0  0  0
   -5.0742  -58.1435   13.6832 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.4965  -58.9601   15.3604 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.8048  -56.6783   12.9233 N   0  0  0  0  0  0  0  0  0  0  0  0
   -3.1200  -56.7955   12.6090 N   0  0  0  0  0  0  0  0  0  0  0  0
   -4.8956  -58.9553   14.8275 N   0  0  0  0  0  0  0  0  0  0  0  0
   -6.0866  -57.9453   13.0303 O   0  0  0  0  0  0  0  0  0  0  0  0
   -3.5472  -56.3394   11.8484 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.0881  -58.8830   14.9464 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.1072  -57.9326   15.8719 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3763  -57.6028   14.6449 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.3300  -59.0479   15.5599 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.6430  -56.6523   15.5704 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.4011  -56.4874   14.9569 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.8249  -60.4168   15.8846 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.4900  -55.4712   15.9138 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.0441  -55.2299   14.6663 O   0  0  0  0  0  0  0  0  0  0  0  0
    0.4836  -54.4853   14.8788 H   0  0  0  0  0  0  0  0  0  0  0  0
   -2.9761  -58.9905   17.8000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.6595  -56.9124   16.7746 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.8625  -58.3455   19.0315 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.5458  -56.2672   18.0061 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.3746  -58.2738   16.6715 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.1474  -56.9839   19.1346 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.0409  -56.3650   20.3177 F   0  0  0  0  0  0  0  0  0  0  0  0
   -5.9715  -59.7155   15.4305 C   0  0  0  0  0  0  0  0  0  0  0  0
   -8.5076  -61.3824   13.1820 C   0  0  0  0  0  0  0  0  0  0  0  0
   -7.7831  -62.4617   12.6762 C   0  0  0  0  0  0  0  0  0  0  0  0
   -7.9064  -60.4967   14.0761 C   0  0  0  0  0  0  0  0  0  0  0  0
   -6.4575  -62.6553   13.0647 C   0  0  0  0  0  0  0  0  0  0  0  0
   -6.5808  -60.6902   14.4647 C   0  0  0  0  0  0  0  0  0  0  0  0
   -5.8563  -61.7695   13.9589 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.2144  -62.0410   14.4164 Cl  0  0  0  0  0  0  0  0  0  0  0  0
 29 30  1  0  0  0
 21 25  1  0  0  0
 22 24  1  0  0  0
 31 33  1  0  0  0
 23 26  1  0  0  0
 32 34  1  0  0  0
 11 13  1  0  0  0
 12 14  1  0  0  0
 13  2  1  0  0  0
  1  2  1  0  0  0
 15 16  1  0  0  0
 16 19  1  0  0  0
 26 27  1  0  0  0
 34 35  1  0  0  0
  3  4  1  0  0  0
  3  7  1  0  0  0
  4  8  1  0  0  0
  5 25  1  0  0  0
  5  1  1  0  0  0
  5  8  1  0  0  0
 17 14  1  0  0  0
 18 15  1  0  0  0
 28 33  1  0  0  0
 28  8  1  0  0  0
  7  6  1  0  0  0
 29 31  2  0  0  0
 30 32  2  0  0  0
 21 23  2  0  0  0
 22 25  2  0  0  0
 24 26  2  0  0  0
 11 14  2  0  0  0
 12 15  2  0  0  0
 13 16  2  0  0  0
  1  3  2  0  0  0
 33 34  2  0  0  0
  2  6  2  0  0  0
  4  9  2  0  0  0
  7 10  1  0  0  0
 19 20  1  0  0  0
M  END
> <minimizedAffinity>
-7.49363

> <minimizedRMSD>
1.94002

> <molecular weight>
442.764

$$$$
ZINC39912421


 35 39  0  0  0  0  0  0  0  0999 V2000
   -2.8637  -58.0484   14.3832 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.6591  -57.4566   14.0111 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.7718  -57.6110   13.4624 C   0  0  0  0  0  0  0  0  0  0  0  0
   -5.0743  -58.1434   13.6832 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.4964  -58.9600   15.3605 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.8048  -56.6783   12.9233 N   0  0  0  0  0  0  0  0  0  0  0  0
   -3.1200  -56.7954   12.6090 N   0  0  0  0  0  0  0  0  0  0  0  0
   -4.8956  -58.9552   14.8276 N   0  0  0  0  0  0  0  0  0  0  0  0
   -6.0865  -57.9453   13.0303 O   0  0  0  0  0  0  0  0  0  0  0  0
   -3.5473  -56.3392   11.8484 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.0881  -58.8830   14.9464 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.1074  -57.9325   15.8720 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3763  -57.6028   14.6449 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.3299  -59.0479   15.5600 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.6428  -56.6523   15.5704 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.4011  -56.4874   14.9569 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.8249  -60.4168   15.8846 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.4900  -55.4712   15.9137 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.0441  -55.2299   14.6664 O   0  0  0  0  0  0  0  0  0  0  0  0
    0.4836  -54.4854   14.8789 H   0  0  0  0  0  0  0  0  0  0  0  0
   -2.9763  -58.9904   17.8002 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.6592  -56.9122   16.7746 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.8627  -58.3454   19.0317 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.5455  -56.2671   18.0061 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.3746  -58.2738   16.6716 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.1473  -56.9838   19.1347 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.0408  -56.3649   20.3178 F   0  0  0  0  0  0  0  0  0  0  0  0
   -5.9714  -59.7154   15.4306 C   0  0  0  0  0  0  0  0  0  0  0  0
   -8.5075  -61.3823   13.1820 C   0  0  0  0  0  0  0  0  0  0  0  0
   -7.7831  -62.4617   12.6762 C   0  0  0  0  0  0  0  0  0  0  0  0
   -7.9063  -60.4966   14.0762 C   0  0  0  0  0  0  0  0  0  0  0  0
   -6.4575  -62.6552   13.0648 C   0  0  0  0  0  0  0  0  0  0  0  0
   -6.5808  -60.6901   14.4648 C   0  0  0  0  0  0  0  0  0  0  0  0
   -5.8563  -61.7695   13.9591 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.2143  -62.0411   14.4166 Cl  0  0  0  0  0  0  0  0  0  0  0  0
 29 30  1  0  0  0
 21 25  1  0  0  0
 22 24  1  0  0  0
 31 33  1  0  0  0
 23 26  1  0  0  0
 32 34  1  0  0  0
 11 13  1  0  0  0
 12 14  1  0  0  0
 13  2  1  0  0  0
  1  2  1  0  0  0
 15 16  1  0  0  0
 16 19  1  0  0  0
 26 27  1  0  0  0
 34 35  1  0  0  0
  3  4  1  0  0  0
  3  7  1  0  0  0
  4  8  1  0  0  0
  5 25  1  0  0  0
  5  1  1  0  0  0
  5  8  1  0  0  0
 17 14  1  0  0  0
 18 15  1  0  0  0
 28 33  1  0  0  0
 28  8  1  0  0  0
  7  6  1  0  0  0
 29 31  2  0  0  0
 30 32  2  0  0  0
 21 23  2  0  0  0
 22 25  2  0  0  0
 24 26  2  0  0  0
 11 14  2  0  0  0
 12 15  2  0  0  0
 13 16  2  0  0  0
  1  3  2  0  0  0
 33 34  2  0  0  0
  2  6  2  0  0  0
  4  9  2  0  0  0
  7 10  1  0  0  0
 19 20  1  0  0  0
M  END
> <minimizedAffinity>
-7.49359

> <minimizedRMSD>
2.12367

> <molecular weight>
442.764

$$$$
ZINC39912344


 35 39  0  0  0  0  0  0  0  0999 V2000
   -2.9655  -58.0579   14.4075 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.7487  -57.5053   14.0153 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.8718  -57.6016   13.4941 C   0  0  0  0  0  0  0  0  0  0  0  0
   -5.1866  -58.0933   13.7357 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.6126  -58.9419   15.4006 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.8851  -56.7323   12.9225 N   0  0  0  0  0  0  0  0  0  0  0  0
   -3.2070  -56.8131   12.6256 N   0  0  0  0  0  0  0  0  0  0  0  0
   -5.0176  -58.9002   14.8849 N   0  0  0  0  0  0  0  0  0  0  0  0
   -6.2006  -57.8708   13.0937 O   0  0  0  0  0  0  0  0  0  0  0  0
   -3.6301  -56.3508   11.8663 H   0  0  0  0  0  0  0  0  0  0  0  0
   -0.0185  -58.9767   14.9117 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.0258  -58.0767   15.8325 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.4630  -57.6840   14.6346 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.2258  -59.1731   15.5107 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.5813  -56.7838   15.5554 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.3369  -56.5875   14.9564 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.6995  -60.5554   15.8092 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.4524  -55.6234   15.9088 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.0884  -55.3180   14.6896 O   0  0  0  0  0  0  0  0  0  0  0  0
    0.4544  -54.5863   14.9086 H   0  0  0  0  0  0  0  0  0  0  0  0
   -3.0661  -58.9675   17.8345 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.6938  -56.8776   16.7976 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.9180  -58.3157   19.0588 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.5456  -56.2257   18.0218 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.4539  -58.2484   16.7039 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.1577  -56.9448   19.1524 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.0181  -56.3191   20.3285 F   0  0  0  0  0  0  0  0  0  0  0  0
   -6.1077  -59.6230   15.5080 C   0  0  0  0  0  0  0  0  0  0  0  0
   -8.0602  -62.0396   13.3356 C   0  0  0  0  0  0  0  0  0  0  0  0
   -5.9663  -63.0767   13.9497 C   0  0  0  0  0  0  0  0  0  0  0  0
   -7.7088  -60.9008   14.0603 C   0  0  0  0  0  0  0  0  0  0  0  0
   -5.6148  -61.9379   14.6743 C   0  0  0  0  0  0  0  0  0  0  0  0
   -7.1891  -63.1276   13.2804 C   0  0  0  0  0  0  0  0  0  0  0  0
   -6.4861  -60.8499   14.7295 C   0  0  0  0  0  0  0  0  0  0  0  0
   -7.5641  -64.3456   12.5058 C   0  0  0  0  0  0  0  0  0  0  0  0
 21 25  1  0  0  0
 22 24  1  0  0  0
 29 33  1  0  0  0
 30 32  1  0  0  0
 31 34  1  0  0  0
 23 26  1  0  0  0
 11 13  1  0  0  0
 12 14  1  0  0  0
 13  2  1  0  0  0
  1  2  1  0  0  0
 15 16  1  0  0  0
 16 19  1  0  0  0
 26 27  1  0  0  0
  3  4  1  0  0  0
  3  7  1  0  0  0
  4  8  1  0  0  0
  5 25  1  0  0  0
  5  1  1  0  0  0
  5  8  1  0  0  0
 35 33  1  0  0  0
 17 14  1  0  0  0
 18 15  1  0  0  0
 28 34  1  0  0  0
 28  8  1  0  0  0
  7  6  1  0  0  0
 21 23  2  0  0  0
 22 25  2  0  0  0
 29 31  2  0  0  0
 30 33  2  0  0  0
 32 34  2  0  0  0
 24 26  2  0  0  0
 11 14  2  0  0  0
 12 15  2  0  0  0
 13 16  2  0  0  0
  1  3  2  0  0  0
  2  6  2  0  0  0
  4  9  2  0  0  0
  7 10  1  0  0  0
 19 20  1  0  0  0
M  END
> <minimizedAffinity>
-7.4906

> <minimizedRMSD>
1.79745

> <molecular weight>
419.322

$$$$

3D Representations¶

  • Fixed-width vectors (summarize structural information)
  • 2D graph representations extended with 3D information (e.g. bond lengths, angles)
    • edges in graph not necessarily bonds
  • Atomic Environment Vectors

https://arxiv.org/pdf/2101.04673.pdf

RF Score¶

https://pubmed.ncbi.nlm.nih.gov/20236947/

No description has been provided for this image

Coulomb Matrices¶

https://pubs.acs.org/doi/10.1021/ct400195d

No description has been provided for this image

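The diagonal holds per-atom energies ($0.5 Z_i^{2.4}$, approximating the energy of the isolated atom). Off-diagonal entries are the Coulombic repulsion between nuclei, $Z_i Z_j / \lVert \mathbf{R}_i - \mathbf{R}_j \rVert$ (in principle any pairwise property could be used).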
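
A minimal sketch of building such a matrix with NumPy, assuming the usual definition ($0.5 Z_i^{2.4}$ on the diagonal, $Z_i Z_j / r_{ij}$ off the diagonal). The function name and the water-like toy geometry are illustrative assumptions, not taken from the cited paper.

```python
import numpy as np

def coulomb_matrix(charges, coords):
    """Coulomb matrix: 0.5*Z_i**2.4 on the diagonal, Z_i*Z_j/r_ij elsewhere."""
    Z = np.asarray(charges, dtype=float)
    R = np.asarray(coords, dtype=float)
    # Pairwise distances between all atoms, shape (n, n)
    dist = np.linalg.norm(R[:, None, :] - R[None, :, :], axis=-1)
    np.fill_diagonal(dist, 1.0)  # placeholder so the division below is safe
    C = np.outer(Z, Z) / dist
    np.fill_diagonal(C, 0.5 * Z ** 2.4)  # overwrite the diagonal with the per-atom term
    return C

# Toy water-like geometry (made-up coordinates; units are up to the user)
charges = [8, 1, 1]
coords = [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]]
C = coulomb_matrix(charges, coords)
print(C.shape)
```

The matrix is symmetric, but its size depends on the number of atoms and its rows depend on the atom ordering.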

What are the issues with getting this into a typical neural network?¶

Coulomb Matrices¶

No description has been provided for this image

"Three different permutationally invariant representations of a molecule derived from its Coulomb matrix C: (a) eigenspectrum of the Coulomb matrix, (b) sorted Coulomb matrix, (c) set of randomly sorted Coulomb matrices."

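The three representations in the caption can be sketched as follows. This is an illustration under stated assumptions: the toy matrix values are made up, eigenvalues are ordered by decreasing magnitude, and the random orderings are sampled by adding Gaussian noise to the row norms before sorting, which is one common way to do it rather than necessarily the authors' exact procedure.

```python
import numpy as np

def eigenspectrum(C):
    """(a) Eigenvalues of the symmetric matrix, ordered by decreasing magnitude."""
    vals = np.linalg.eigvalsh(C)
    return vals[np.argsort(-np.abs(vals))]

def sorted_coulomb(C):
    """(b) Reorder rows and columns together by decreasing row norm."""
    order = np.argsort(-np.linalg.norm(C, axis=1))
    return C[np.ix_(order, order)]

def random_sorted_coulomb(C, rng, noise=1.0, n_samples=4):
    """(c) Several orderings, each obtained by sorting noisy row norms."""
    norms = np.linalg.norm(C, axis=1)
    samples = []
    for _ in range(n_samples):
        order = np.argsort(-(norms + rng.normal(0.0, noise, size=len(norms))))
        samples.append(C[np.ix_(order, order)])
    return np.stack(samples)

# Toy symmetric matrix with distinct row norms (values are made up)
C = np.array([[36.9, 5.0, 2.0],
              [5.0, 0.5, 0.3],
              [2.0, 0.3, 0.5]])
perm = [2, 0, 1]                      # relabel the atoms
C_perm = C[np.ix_(perm, perm)]

same_spectrum = np.allclose(eigenspectrum(C), eigenspectrum(C_perm))
same_sorted = np.allclose(sorted_coulomb(C), sorted_coulomb(C_perm))
samples = random_sorted_coulomb(C, np.random.default_rng(0))
print(same_spectrum, same_sorted, samples.shape)
```

Relabeling the atoms leaves (a) and (b) unchanged, while (c) deliberately presents the model with many orderings of the same molecule.
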
Bag of Bonds¶

No description has been provided for this image

"Schematic view of the Bag of Bonds (BoB) representation. (a) 3D structure of ethanol (CH3CH2OH) and (b) involved nuclear charges for each Coulomb matrix element. (c) Different Coulomb matrix entries that are present for ethanol are sorted into bags, and the BoB vector (d) is obtained by concatenating these bags and adding zeros to allow for dealing with other molecules with larger bags."

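A rough sketch of the bagging step described in the caption, using only the standard library. It assumes that only off-diagonal (pairwise) entries are bagged, that each bag is sorted by decreasing value, and that `bag_sizes` (a hypothetical parameter) gives the padded length of each bag across the dataset. The geometry is a made-up water-like example.

```python
import math
from collections import defaultdict

def bag_of_bonds(symbols, charges, coords, bag_sizes):
    """Group pairwise Coulomb terms by element pair, sort, pad, and concatenate."""
    bags = defaultdict(list)
    n = len(symbols)
    for i in range(n):
        for j in range(i + 1, n):
            key = tuple(sorted((symbols[i], symbols[j])))  # e.g. ("H", "O")
            r = math.dist(coords[i], coords[j])
            bags[key].append(charges[i] * charges[j] / r)
    vec = []
    for key in sorted(bag_sizes):  # fixed bag order shared by all molecules
        values = sorted(bags.get(key, []), reverse=True)
        if len(values) > bag_sizes[key]:
            raise ValueError(f"bag {key} is too small for this molecule")
        vec.extend(values + [0.0] * (bag_sizes[key] - len(values)))
    return vec

symbols = ["O", "H", "H"]
charges = [8, 1, 1]
coords = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
bag_sizes = {("H", "H"): 3, ("H", "O"): 4}  # room for larger molecules

bob = bag_of_bonds(symbols, charges, coords, bag_sizes)
print(len(bob))
```

Because each bag is sorted and padded, the vector has the same length for every molecule and does not depend on the atom ordering.
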
Atomic Environment Vectors¶

https://pubs.rsc.org/en/content/articlelanding/2017/SC/C6SC05720A

Each atom has its local environment summarized into a vector (this can get complicated)

  • bond distances, angles, atom types, etc
  • only "sees" atoms within cutoff distance
  • molecular output (e.g. total energy) is the sum of individual atomic contributions

No description has been provided for this image
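
To make the cutoff idea concrete, here is a simplified sketch of the radial part of an atomic environment vector: Gaussians of the neighbor distances, damped by a smooth cosine cutoff and summed per neighbor element. Angular terms are omitted, and the parameter values (`eta`, `r_c`, `shifts`) and the geometry are illustrative assumptions rather than published settings.

```python
import numpy as np

def cutoff(r, r_c):
    """Smooth cosine cutoff: 1 at r = 0, falling to 0 at r >= r_c."""
    return np.where(r <= r_c, 0.5 * np.cos(np.pi * r / r_c) + 0.5, 0.0)

def radial_aev(coords, species, i, species_order, shifts, eta=16.0, r_c=5.2):
    """Radial environment vector of atom i, one block per neighbor element."""
    coords = np.asarray(coords, dtype=float)
    shifts = np.asarray(shifts, dtype=float)
    r = np.linalg.norm(coords - coords[i], axis=1)
    blocks = []
    for s in species_order:
        mask = np.array([sp == s for sp in species])
        mask[i] = False                       # an atom is not its own neighbor
        r_s = r[mask]
        # One Gaussian per (neighbor, shift), damped by the cutoff, summed over neighbors
        g = np.exp(-eta * (r_s[:, None] - shifts[None, :]) ** 2) * cutoff(r_s, r_c)[:, None]
        blocks.append(g.sum(axis=0))
    return np.concatenate(blocks)

shifts = [0.9, 1.5, 2.1, 2.7]
coords = [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]]
species = ["O", "H", "H"]
aev = radial_aev(coords, species, 0, ["H", "O"], shifts)

# An extra atom beyond the cutoff is invisible to atom 0
aev_far = radial_aev(coords + [[10.0, 0.0, 0.0]], species + ["H"], 0, ["H", "O"], shifts)
print(aev.shape, np.allclose(aev, aev_far))
```

The vector length depends only on the number of element types and shifts, not on how many neighbors an atom has.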

"Quantum-chemical insights from deep tensor neural networks"¶

https://www.nature.com/articles/ncomms13890

No description has been provided for this image

EGNN: Equivariant Graph Neural Networks¶

https://arxiv.org/pdf/2102.09844.pdf

No description has been provided for this image

$$\begin{align*}
\mathbf{m}_{ij} &= \phi_e\left(\mathbf{h}_i^l,\mathbf{h}_j^l, ||\mathbf{x}_i^l - \mathbf{x}_j^l||^2, a_{ij}\right) \\
\mathbf{x}_i^{l+1} &= \mathbf{x}_i^l + C\sum_{j \ne i} ( \mathbf{x}_i^l - \mathbf{x}_j^l ) \phi_x (\mathbf{m}_{ij}) \\
\mathbf{m}_i &= \sum_{j \ne i} \mathbf{m}_{ij} \\
\mathbf{h}_i^{l+1} &= \phi_h(\mathbf{h}_i^l,\mathbf{m}_i)
\end{align*}$$
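
The update rules can be sketched in NumPy as a single layer. This is a toy under loud assumptions: the learned functions $\phi_e$, $\phi_x$, $\phi_h$ are replaced by one-layer `tanh` stand-ins with fixed random weights (`make_mlp` is a hypothetical helper), edge attributes $a_{ij}$ are dropped, and $C$ is set to $1/(n-1)$.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mlp(n_in, n_out):
    """Hypothetical stand-in for a learned network: one tanh layer, fixed random weights."""
    W = rng.normal(scale=0.5, size=(n_in, n_out))
    b = rng.normal(scale=0.1, size=n_out)
    return lambda z: np.tanh(z @ W + b)

def egnn_layer(h, x, phi_e, phi_x, phi_h):
    """One EGNN-style update of node features h (n, d) and coordinates x (n, 3)."""
    n = len(h)
    diff = x[:, None, :] - x[None, :, :]              # x_i - x_j, shape (n, n, 3)
    d2 = (diff ** 2).sum(axis=-1, keepdims=True)      # squared distances, (n, n, 1)
    h_i = np.repeat(h[:, None, :], n, axis=1)
    h_j = np.repeat(h[None, :, :], n, axis=0)
    m = phi_e(np.concatenate([h_i, h_j, d2], axis=-1))  # messages m_ij
    mask = 1.0 - np.eye(n)[:, :, None]                # exclude j == i
    C = 1.0 / (n - 1)
    x_new = x + C * (diff * phi_x(m) * mask).sum(axis=1)
    m_i = (m * mask).sum(axis=1)
    h_new = phi_h(np.concatenate([h, m_i], axis=-1))
    return h_new, x_new

n, d_h, d_m = 4, 3, 5
phi_e, phi_x, phi_h = make_mlp(2 * d_h + 1, d_m), make_mlp(d_m, 1), make_mlp(d_h + d_m, d_h)
h, x = rng.normal(size=(n, d_h)), rng.normal(size=(n, 3))
h_new, x_new = egnn_layer(h, x, phi_e, phi_x, phi_h)

# Equivariance check: apply a random orthogonal map Q and translation t to the input
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
t = rng.normal(size=3)
h_rot, x_rot = egnn_layer(h, x @ Q.T + t, phi_e, phi_x, phi_h)
print(np.allclose(h_rot, h_new), np.allclose(x_rot, x_new @ Q.T + t))
```

Only squared distances enter the messages, so the features are invariant, and the coordinate update is built from difference vectors, so the coordinates transform along with the input.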

Grids¶

https://pubs.acs.org/doi/full/10.1021/acs.jcim.6b00740

https://pubs.acs.org/doi/10.1021/acs.jcim.9b01145

Basically a 3D picture with atom types instead of red/green/blue.

No description has been provided for this image
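
A simplified sketch of the "3D picture" idea: each atom type gets its own channel, and every atom contributes a Gaussian density to the voxels of its channel. The function name, grid size, and plain Gaussian are assumptions for illustration; real gridding libraries use more refined atom typing and density functions. The three atom positions are taken from one of the SDF records above.

```python
import numpy as np

def voxelize(coords, types, type_order, center, dim=12.0, resolution=0.5, sigma=1.0):
    """Return a (n_types, n, n, n) grid of Gaussian atom densities."""
    coords = np.asarray(coords, dtype=float)
    n = int(round(dim / resolution))
    # Voxel centers along one axis, relative to the grid center
    axis = (np.arange(n) + 0.5) * resolution - dim / 2.0
    gx, gy, gz = np.meshgrid(axis, axis, axis, indexing="ij")
    points = np.stack([gx, gy, gz], axis=-1) + np.asarray(center, dtype=float)
    grid = np.zeros((len(type_order), n, n, n))
    for xyz, t in zip(coords, types):
        if t not in type_order:
            continue  # atoms without a channel are ignored
        d2 = ((points - xyz) ** 2).sum(axis=-1)
        grid[type_order.index(t)] += np.exp(-d2 / (2.0 * sigma ** 2))
    return grid

coords = [[-2.8637, -58.0485, 14.3831],   # C
          [-1.8048, -56.6783, 12.9233],   # N
          [-6.0866, -57.9453, 13.0303]]   # O
types = ["C", "N", "O"]
grid = voxelize(coords, types, ["C", "N", "O"], center=np.mean(coords, axis=0))
print(grid.shape)
```

The result has the same layout as a multi-channel image with one extra spatial dimension, so it can be fed to a 3D convolutional network.
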
In [14]:
%%html
<div id="gridrot" style="width: 500px"></div>
<script>

    var divid = '#gridrot';
	jQuery(divid).asker({
	    id: divid,
	    question: "If the molecule is rotated, the output of a CNN that takes a molecular grid as input will not change.",
		answers: ['True','False','Depends'],
        server: "https://bits.csb.pitt.edu/asker.js/example/asker.cgi",
		charter: chartmaker})
    
    $(".jp-InputArea .o:contains(html)").closest('.jp-InputArea').hide();


</script>