Tuesday, August 12, 2025

#Module 5 Post-Docking Analysis

Post-Docking Analysis, laid out as a clean, runnable activity so your participants can walk straight through it after docking. It includes visualization tips, quick scripts to extract scores, RMSD validation vs co-crystal ligand, interaction mapping, ranking, and next-step suggestions.

Module 5 — Post-Docking Analysis (step-by-step)

Objective

From a completed docking run (output.pdbqt / output.pdb + log.txt), you will:

Extract binding scores and create a ranked table.
Visualize and inspect docked poses and interactions.
Validate docking by redocking / RMSD vs co-crystallized ligand.
Identify key interactions and shortlist candidates.
Know simple rescoring/next-step options.

Step 0 — Files you should have

ACE_clean.pdbqt (receptor) and ACE_clean.pdb (original PDB)
lisinopril_out.pdbqt (vina output) and lisinopril_out.pdb (converted PDB)
lisinopril_log.txt (vina log) — optional
If available: original co-crystallized ligand (from 1O86.pdb) for RMSD checks

Step 1 — Parse Vina results (quick CSV of modes)

Vina writes pose results as REMARK VINA RESULT: lines in the output .pdbqt. Use this Python script to extract them into results.csv:

python
# parse_vina_results.py
import sys
import csv

if len(sys.argv) < 3:
    print("Usage: python parse_vina_results.py output.pdbqt results.csv")
    sys.exit(1)

infile = sys.argv[1]
outfile = sys.argv[2]

results = []
with open(infile) as f:
    for line in f:
        line = line.strip()
        if line.startswith("REMARK VINA RESULT:"):
            parts = line.split()
            # typical line: REMARK VINA RESULT:  -7.1  0.000  0.000
            try:
                affinity = float(parts[3])
            except:
                affinity = None
            results.append({"mode": len(results)+1, "affinity_kcal_mol": affinity})

with open(outfile, "w", newline="") as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=["mode", "affinity_kcal_mol"])
    writer.writeheader()
    for r in results:
        writer.writerow(r)

print(f"Wrote {len(results)} results to {outfile}")

Run:

bash
python parse_vina_results.py lisinopril_out.pdbqt lisinopril_results.csv

Step 2 — Batch docking results: rank multiple ligands

If you docked many ligands into out/ folder, use a small bash loop + the parser to build a master ranking:

bash
mkdir -p summary
for out in out/*_out.pdbqt; do
  base=$(basename "$out" _out.pdbqt)
  python parse_vina_results.py "$out" "summary/${base}.csv"
  # grab top affinity
  head -n 2 "summary/${base}.csv" | tail -n1 >> summary/top_affinities.txt
done

# Now combine into a single file and sort (assuming lines are "mode,affinity")
awk -F, 'NR>1{print FILENAME","$2}' summary/*.csv > summary/all_affinities.csv
# Better: write a small Python script to collect top affinities and sort

(I can provide a polished collector script if you want.)

Step 3 — Visual inspection in PyMOL (show poses, H-bonds, pocket)

Open PyMOL (or Chimera) and run these commands (adjust names to your files/selections):

pymol
# load files
load ACE_clean.pdb, receptor
load lisinopril_out.pdb, docked

# show protein and ligand
hide everything
show cartoon, receptor
show sticks, docked
color cyan, receptor
color yellow, docked

# select ligand (assumes ligand residues are present as 'resn LPR')
select lig, docked and (resn LPR or resn LPR_) # adjust resname if needed

# define pocket (all receptor atoms within 4 Å of ligand)
select pocket, receptor within 4.0 of lig

# show pocket and ligand
show surface, pocket
show sticks, pocket

# draw distances (possible H-bonds) within 3.5 Å
distance hbonds, lig, pocket, 3.5

# label pocket residues (for quick notes)
label pocket and name CA, "%s%s" % (resn, resi)

What to look for:

Does ligand sit in expected pocket (co-crystal site)?
Key H-bonds (distances < 3.5 Å).
Hydrophobic contacts (ligand buried in hydrophobic region).

Take annotated screenshots for your slides / report.

Step 4 — Compute RMSD vs co-crystal ligand (redocking validation)

Why: Redocking the crystallographic ligand and computing RMSD between docked pose and crystal pose is a sanity check. RMSD ≤ 2.0 Å is commonly considered a successful redocking.

Use this Python script (Kabsch algorithm) to compute RMSD between two ligand PDBs. It expects that you have two PDB files containing only ligand atom lines (same atom ordering helps).

python
# rmsd_kabsch.py
import sys
import numpy as np

def read_coords(pdbfile):
    coords = []
    for line in open(pdbfile):
        if line.startswith(("ATOM","HETATM")):
            x = float(line[30:38]); y = float(line[38:46]); z = float(line[46:54])
            coords.append([x,y,z])
    return np.array(coords)

def kabsch_rmsd(P, Q):
    # P and Q are Nx3
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    C = np.dot(Pc.T, Qc)
    V, S, Wt = np.linalg.svd(C)
    d = np.sign(np.linalg.det(np.dot(V, Wt)))
    D = np.diag([1.0, 1.0, d])
    U = np.dot(np.dot(V, D), Wt)
    Qr = np.dot(Qc, U.T)
    diff = Pc - Qr
    return np.sqrt((diff**2).sum() / len(P))

if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: python rmsd_kabsch.py ref_ligand.pdb dock_ligand.pdb")
        sys.exit(1)
    P = read_coords(sys.argv[1])
    Q = read_coords(sys.argv[2])
    if P.shape != Q.shape:
        print("Warning: different atom counts or order. RMSD may be invalid.")
    rmsd = kabsch_rmsd(P, Q)
    print("RMSD = %.3f Å" % rmsd)

Usage:

Extract the co-crystal ligand from 1O86.pdb to ref_lig.pdb (keep only ligand atoms).
Extract the docked ligand from lisinopril_out.pdb to dock_lig.pdb.
Run:

bash
python rmsd_kabsch.py ref_lig.pdb dock_lig.pdb

If RMSD ≤ 2.0 Å → good agreement.

Step 5 — Map interactions (hydrogen bonds, hydrophobic, salt bridges)

Options:

PLIP (Protein-Ligand Interaction Profiler) — easiest: upload PDB (receptor + ligand) to the PLIP webserver; it returns an interaction table + 2D/3D diagrams. (Use PLIP web for quick, annotated outputs.)
Manual inspection in PyMOL — use distance to find close contacts, then compile a table of interacting residues and types.

Suggested output table columns:

ligand_atom - receptor_residue - receptor_atom - interaction_type (H-bond / hydrophobic / salt bridge) - distance (Å)

Step 6 — Rank & shortlist candidates

Create a shortlist based on:

Binding affinity (vina score) — more negative is better.
Pose plausibility (does it occupy the known active site?).
Key interactions: presence of H-bond(s) to catalytic residues or zinc-binding motif (if relevant).
RMSD (if comparing to known ligand) — low RMSD preferred for validation.

Produce a final CSV with columns: ligand_name, best_affinity, num_hbonds, key_residues, rmsd (if applicable), notes

Step 7 — Rescoring & follow-up (high-value options)

If you need higher confidence, consider:

Rescore with smina (a Vina fork with different scoring / custom weights).
MM-GBSA / MM-PBSA free energy estimations (requires short MD and AMBER / GROMACS + gmx_MMPBSA).
Short MD simulations to check pose stability (OpenMM, GROMACS).
Filter for ADMET & drug-likeness: use RDKit to compute logP, MW, Lipinski rules, or commercial tools.

(These require extra compute and expertise; treat as next-stage steps.)

Step 8 — Report & deliverables

Ask participants to submit:

results.csv (modes & affinities).
top_hits.csv (ranked top 5 ligands with key notes).
Screenshots: PyMOL view(s) showing ligand in pocket with labeled interacting residues.
RMSD report for any redocking performed.
Short one-page summary: why the top hit was selected / concerns about it.

#Module 4 Molecular Docking Execution

Run the docking process in AutoDock Vina and obtain binding affinity results.

Step 1 – Check Prepared Files

Ensure the following files are ready from Module 3:

ACE_clean.pdbqt – Protein file (target)
lisinopril.pdbqt – Ligand file (drug candidate)
conf.txt – Docking configuration (grid box parameters)

Step 2 – Create Docking Configuration File (`conf.txt`)

If not already done in Module 3, create a text file with:


receptor = ACE_clean.pdbqt
ligand = lisinopril.pdbqt

center_x = <value>
center_y = <value>
center_z = <value>

size_x = 20
size_y = 20
size_z = 20

exhaustiveness = 8
num_modes = 10

Replace <value> with the grid box coordinates obtained in Module 3.

Step 3 – Run AutoDock Vina

Open Command Prompt / Terminal in the folder containing the files.

Run:

bash

vina --config conf.txt --out output.pdbqt --log log.txt

Step 4 – Check Docking Results

Open log.txt
Look for:
```
bash
Affinity:  -9.2 kcal/mol
```
Lower (more negative) = better binding.
The top-ranked mode is usually the best pose.

Step 5 – Convert Results for Visualization

Convert output.pdbqt → .pdb using OpenBabel:
```
bash

obabel output.pdbqt -O output.pdb
```

Step 6 – Visualize Binding

Open PyMOL or Chimera.
Load:
- ACE_clean.pdb (protein)
- output.pdb (docked ligand)
View ligand position in the binding site.
Highlight hydrogen bonds & interactions.

Step 7 – Record Observations

Binding energy values for each mode.
Interaction residues.
Potential drug-likeness observations

Module 3 Protein & Ligand Preparation

Module 3 – Protein & Ligand Preparation in clear step-by-step activity format, continuing from where Module 2 left off.
We’ll still use ACE (1O8A) as the example protein and Lisinopril as the ligand for hypertension drug discovery.

Module 3: Protein & Ligand Preparation for Docking

Objective:
Prepare both the protein (target) and ligand (drug molecule) files in the correct format for AutoDock Vina.

Step 1 – Clean the Protein Structure

Open protein file (1O8A.pdb) in PyMOL or UCSF Chimera.
Remove water molecules:
- In PyMOL:
```
arduino
remove solvent
```
- In Chimera:
  Select → Structure → solvent → delete.
Remove any ligands or ions from the protein that are not needed for docking.

Step 2 – Save Clean Protein File

After removing unwanted molecules,
save as ACE_clean.pdb.

Step 3 – Add Hydrogens & Charges to Protein

Open AutoDockTools (ADT).
Load protein → ACE_clean.pdb.
Add polar hydrogens:
Edit → Hydrogens → Add → Polar only.
Add Gasteiger charges:
Edit → Charges → Compute Gasteiger.
Save as ACE_clean.pdbqt.

Step 4 – Prepare Ligand

Download Lisinopril from PubChem in .SDF format.

Convert .SDF → .PDB using OpenBabel:

bash
obabel lisinopril.sdf -O lisinopril.pdb

Open ligand in AutoDockTools.
Add hydrogens and Gasteiger charges.
Save as lisinopril.pdbqt.

Step 5 – Define Docking Grid Box

In AutoDockTools, load both protein (ACE_clean.pdbqt) and ligand (lisinopril.pdbqt).
Select binding site region (around active site residues).
Set grid box center coordinates (X, Y, Z).
Set box size to cover the active site completely.
Save grid configuration as conf.txt.

Step 6 – Verify Files

Protein → ACE_clean.pdbqt
Ligand → lisinopril.pdbqt
Config file → conf.txt
All ready for docking in Module 4.

#2 Activity

Here’s a step-by-step activity list for Module 2 – Installation & Environment Setup (with AutoDock & Python tools) in your drug discovery workshop, using a real-life example file:

Activity: Installing Tools & Preparing the Environment for Hypertension Drug Discovery

Objective:

Get participants ready to perform molecular docking using AutoDock with a real hypertension-related protein (ACE – Angiotensin-Converting Enzyme).

Step 1 – Download & Install Python

Visit https://www.python.org/downloads/
Download Python 3.11+ (Windows/Mac/Linux).
During installation, check "Add Python to PATH".
Verify installation in terminal:
```
python --version
```

Step 2 – Install Required Python Libraries

Open Command Prompt / Terminal.

Install required packages:

pip install numpy pandas biopython matplotlib vina

Step 3 – Download & Install AutoDock Vina

Go to https://vina.scripps.edu/downloads/
Select your OS version (Windows/Mac/Linux).
Extract files to a known folder (e.g., C:\AutoDockVina).
Add the folder to PATH in system environment variables.
Test installation:
```
vina --help
```

Step 4 – Download a Real-Life Protein Example (ACE)

Open RCSB PDB: https://www.rcsb.org/
Search for ACE protein (PDB ID: 1O8A as example).
Download PDB file to your working directory.

Step 5 – Install Visualization Tool (PyMOL or UCSF Chimera)

Download PyMOL (open-source): https://pymol.org/2/
Install it and verify by opening 1O8A.pdb from Step 4.
Rotate and zoom to observe protein structure.

Step 6 – Prepare Ligand File

Download a known hypertension drug structure (e.g., Lisinopril) from PubChem: https://pubchem.ncbi.nlm.nih.gov/.
Save in SDF format.

Convert to PDBQT format using OpenBabel:

obabel lisinopril.sdf -O lisinopril.pdbqt

Step 7 – Check Everything Works

Open 1O8A.pdb in PyMOL.
Load lisinopril.pdbqt in Vina.
Run a test docking to confirm setup.

# Module 2 Installation & Environment Setup

~ with a real-life example (ACE co-crystal PDB 1O86 + Lisinopril ligand).
I’ll include exact commands (Linux/macOS/WSL), small scripts to compute the docking box center, and troubleshooting tips.

Quick plan (what you’ll do in this module)

Create a conda environment and install Vina, OpenBabel, and MGLTools (AutoDockTools).
Download the example receptor (PDB 1O86) and ligand (Lisinopril) SMILES.
Convert ligand SMILES → 3D → mol2 (OpenBabel).
Prepare receptor and ligand .pdbqt files (AutoDockTools prepare_receptor4.py / prepare_ligand4.py).
Run a test docking with vina (example command).
Inspect outputs and common errors.

Important downloads / references

AutoDock-Vina (binaries & manual / conda package). autodock-vina.readthedocs.ioAnaconda
MGLTools / AutoDockTools (prepare_receptor4.py, prepare_ligand4.py). ccsb.scripps.edu
OpenBabel (format conversion and 3D generation). openbabel.github.io
Example receptor: 1O86 (ACE–lisinopril co-crystal) from RCSB PDB. RCSB PDB
Example ligand data: Lisinopril (PubChem CID 5362119; SMILES available). PubChem

0) Prerequisites (one-time)

Install Miniconda/Anaconda (recommended).
(Optional but recommended) install mamba for faster conda installs.

Install mamba (optional, faster):

bash
conda install -n base -c conda-forge mamba -y

1) Create environment & install tools (recommended: Linux/macOS/WSL)

Recommended approach: use conda channels conda-forge and bioconda and install vina, openbabel, and mgltools.

Using mamba (fast):

bash
# add channels (if not already)
conda config --add channels conda-forge
conda config --add channels bioconda
conda config --set channel_priority strict

# create env and install
mamba create -n autodock python=3.9 -y
conda activate autodock

# install core packages
mamba install -n autodock -c conda-forge -c bioconda vina openbabel mgltools -y

Fallback with plain conda:

bash
conda create -n autodock python=3.9 -y
conda activate autodock
conda install -c conda-forge vina openbabel -y
conda install -c bioconda mgltools -y

Notes:

vina is available from conda-forge / bioconda (use whichever channel works on your platform). Anaconda+1
MGLTools installers are also available directly from the Scripps page if you prefer GUI installers. ccsb.scripps.edu

Windows users: you may download installers (Vina MSI or zip and MGLTools .exe) from the vendor pages and add Vina to PATH; or use WSL2 and follow the Linux steps for a smoother experience. AutoDock Vinaccsb.scripps.edu

2) Get the example files (receptor + ligand)

A. Download ACE co-crystal (PDB 1O86) — save as 1O86.pdb:

bash
curl -L -o 1O86.pdb https://files.wwpdb.org/download/1O86.pdb

(You can also download from RCSB web UI.) RCSB PDB+1

B. Create a SMILES file for Lisinopril (PubChem CID 5362119; canonical SMILES shown):

bash
# example SMILES (from PubChem)
echo "C1C[C@H](N(C1)C(=O)C(CCCCN)NC(CCC2=CC=CC=C2)C(=O)O)C(=O)O" > lisinopril.smiles

(You can also wget/download the SDF from PubChem). PubChem

3) Convert SMILES → 3D SDF → MOL2 (OpenBabel)

Generate 3D coordinates and write a mol2 (preferred for prepare_ligand4.py):

bash
# 1) SMILES -> SDF (3D)
obabel -ismi lisinopril.smiles -O lisinopril.sdf --gen3D

# 2) SDF -> MOL2 (add hydrogens)
obabel -isdf lisinopril.sdf -omol2 -O lisinopril.mol2 --addhs --minimize

If you prefer a single-step: obabel -ismi lisinopril.smiles -omol2 -O lisinopril.mol2 --gen3D --addhs --minimize

Tip: OpenBabel flags --gen3D, --addhs, and --minimize create a reasonable 3D geometry. For charges you generally use AutoDockTools (prepare_ligand4) to assign Gasteiger charges. openbabel.github.ioopen-babel.readthedocs.io

4) Prepare receptor and ligand `.pdbqt` (AutoDockTools scripts)

Why use AutoDockTools (MGLTools)? prepare_receptor4.py and prepare_ligand4.py correctly set atom types, merge non-polar hydrogens, assign Gasteiger charges and define torsions — this is the recommended route for reliable .pdbqt. dockey.readthedocs.ioGitHub

Example commands (paths depend on how MGLTools installed — if installed via conda the scripts will be on your PATH; else use full path to MGLToolsPckgs/.../Utilities24/prepare_receptor4.py):

bash
# Prepare receptor: remove water / heteroatoms and make pdbqt
prepare_receptor4.py -r 1O86.pdb -o 1O86_receptor.pdbqt

# (If needed: remove water/ligand first in PyMOL or via flags such as -U)
# e.g., prepare_receptor4.py -r 1O86.pdb -o 1O86_receptor.pdbqt -U 'nphs' 

# Prepare ligand (use the mol2 created earlier)
prepare_ligand4.py -l lisinopril.mol2 -o lisinopril.pdbqt

If your MGLTools is not Python-3 compatible on your system, use the packaged installer or the autodocktools-prepare conda package that provides prepare_* scripts. Anacondaccsb.scripps.edu

Alternate: Meeko (Python package) or obabel --partialcharges gasteiger -opdbqt can be used, but AutoDockTools remain the most commonly used/robust for assigning torsions for Vina. Durrant Labdockey.readthedocs.io

5) Find the docking box center (practical options)

You need center_x/y/z for vina. three ways:

A) PyMOL GUI (quick)

Load 1O86.pdb, select the ligand (e.g., select lig, resn LPR) then center on it and read coordinates or use a centroid plugin.

B) Small Python snippet (Biopython) — compute ligand centroid automatically
Install Biopython: conda install -c conda-forge biopython -y
Then run:

python
# get_centroid_ligand.py
from Bio.PDB import PDBParser
import numpy as np

p = PDBParser(QUIET=True)
s = p.get_structure('1O86','1O86.pdb')
coords = []
for atom in s.get_atoms():
    # ligand in 1O86 has residue name 'LPR' (lisinopril); adjust if different
    if atom.get_parent().get_resname().strip() == 'LPR':
        coords.append(atom.get_coord())
coords = np.array(coords)
centroid = coords.mean(axis=0)
print("center_x, center_y, center_z = ", centroid.tolist())

Run: python get_centroid_ligand.py — this prints coordinates you can pass to vina.

C) Manual — pick coordinates from the ligand lines in the PDB (HETATM lines).

(Residue name for lisinopril in 1O86 is LPR — confirm in the PDB file). RCSB PDBIUPHAR/MMV Guide to Malaria Pharmacology

6) Run a test docking with Vina

Example (replace center_* with values from the centroid tool or PyMOL):

bash
vina --receptor 1O86_receptor.pdbqt \
     --ligand lisinopril.pdbqt \
     --center_x  XX.XXX --center_y  YY.YYY --center_z  ZZ.ZZZ \
     --size_x 20 --size_y 20 --size_z 20 \
     --exhaustiveness 8 \
     --out lisinopril_out.pdbqt \
     --log lisinopril_log.txt

size_* controls the box dimensions (Å). 20 Å is a reasonable starting cube for small ligands but tune as needed.
Inspect lisinopril_log.txt — it lists binding energy (kcal/mol) for the top poses. AutoDock Vina

7) Quick visualization

Open lisinopril_out.pdbqt in PyMOL/PyRx and view poses; compare with co-crystallized ligand in 1O86 for validation.

Common errors & fixes

“Parse error… Unknown tag”: often malformed .pdbqt. Rebuild ligand with prepare_ligand4.py (do not hand-edit). ResearchGate
Missing charges (q = 0) after OpenBabel conversion — add --partialcharges gasteiger or use AutoDockTools. OpenBabel sometimes needs explicit flags. Durrant LabGitHub
prepare_ligand/prepare_receptor scripts expect python2: use MGLTools installer or the conda mgltools package / autodocktools-prepare that supplies python3-compatible scripts. Anaconda+1

Short checklist to hand to participants

Create env: conda/mamba create -n autodock python=3.9 → conda activate autodock.
Install: mamba install -c conda-forge -c bioconda vina openbabel mgltools (or download installers). Anacondaccsb.scripps.eduopenbabel.github.io
Download 1O86.pdb. RCSB PDB
Get Lisinopril SMILES (PubChem CID 5362119). PubChem
Convert, prepare and run vina (commands above).
Visualize results in PyMOL or PyRx.

Tuesday, August 12, 2025

#Module 5 Post-Docking Analysis

Module 5 — Post-Docking Analysis (step-by-step)

Objective

Step 0 — Files you should have

Step 1 — Parse Vina results (quick CSV of modes)

Step 2 — Batch docking results: rank multiple ligands

Step 3 — Visual inspection in PyMOL (show poses, H-bonds, pocket)

Step 4 — Compute RMSD vs co-crystal ligand (redocking validation)

Step 5 — Map interactions (hydrogen bonds, hydrophobic, salt bridges)

Step 6 — Rank & shortlist candidates

Step 7 — Rescoring & follow-up (high-value options)

Step 8 — Report & deliverables

#Module 4 Molecular Docking Execution

Step 1 – Check Prepared Files

Step 2 – Create Docking Configuration File (conf.txt)

Step 3 – Run AutoDock Vina

Step 4 – Check Docking Results

Step 5 – Convert Results for Visualization

Step 6 – Visualize Binding

Step 7 – Record Observations

Module 3 Protein & Ligand Preparation

Module 3: Protein & Ligand Preparation for Docking

Step 1 – Clean the Protein Structure

Step 2 – Save Clean Protein File

Step 3 – Add Hydrogens & Charges to Protein

Step 4 – Prepare Ligand

Step 5 – Define Docking Grid Box

Step 6 – Verify Files

#2 Activity

Activity: Installing Tools & Preparing the Environment for Hypertension Drug Discovery

Objective:

Step 1 – Download & Install Python

Step 2 – Install Required Python Libraries

Step 3 – Download & Install AutoDock Vina

Step 4 – Download a Real-Life Protein Example (ACE)

Step 5 – Install Visualization Tool (PyMOL or UCSF Chimera)

Step 6 – Prepare Ligand File

Step 7 – Check Everything Works

# Module 2 Installation & Environment Setup

Quick plan (what you’ll do in this module)

Important downloads / references

0) Prerequisites (one-time)

1) Create environment & install tools (recommended: Linux/macOS/WSL)

2) Get the example files (receptor + ligand)

3) Convert SMILES → 3D SDF → MOL2 (OpenBabel)

4) Prepare receptor and ligand .pdbqt (AutoDockTools scripts)

5) Find the docking box center (practical options)

6) Run a test docking with Vina

7) Quick visualization

Common errors & fixes

Short checklist to hand to participants

#Module 5 Post-Docking Analysis

Step 2 – Create Docking Configuration File (`conf.txt`)

4) Prepare receptor and ligand `.pdbqt` (AutoDockTools scripts)