Tuesday, August 12, 2025

#Module 5 Post-Docking Analysis

Post-Docking Analysis, laid out as a clean, runnable activity so your participants can walk straight through it after docking. It includes visualization tips, quick scripts to extract scores, RMSD validation vs co-crystal ligand, interaction mapping, ranking, and next-step suggestions.


Module 5 — Post-Docking Analysis (step-by-step)

Objective

From a completed docking run (output.pdbqt / output.pdb + log.txt), you will:

  1. Extract binding scores and create a ranked table.

  2. Visualize and inspect docked poses and interactions.

  3. Validate docking by redocking / RMSD vs co-crystallized ligand.

  4. Identify key interactions and shortlist candidates.

  5. Know simple rescoring/next-step options.


Step 0 — Files you should have

  • ACE_clean.pdbqt (receptor) and ACE_clean.pdb (original PDB)

  • lisinopril_out.pdbqt (vina output) and lisinopril_out.pdb (converted PDB)

  • lisinopril_log.txt (vina log) — optional

  • If available: original co-crystallized ligand (from 1O86.pdb) for RMSD checks


Step 1 — Parse Vina results (quick CSV of modes)

Vina writes pose results as REMARK VINA RESULT: lines in the output .pdbqt. Use this Python script to extract them into results.csv:

python
# parse_vina_results.py
import sys import csv if len(sys.argv) < 3: print("Usage: python parse_vina_results.py output.pdbqt results.csv") sys.exit(1) infile = sys.argv[1] outfile = sys.argv[2] results = [] with open(infile) as f: for line in f: line = line.strip() if line.startswith("REMARK VINA RESULT:"): parts = line.split() # typical line: REMARK VINA RESULT: -7.1 0.000 0.000 try: affinity = float(parts[3]) except: affinity = None results.append({"mode": len(results)+1, "affinity_kcal_mol": affinity}) with open(outfile, "w", newline="") as csvfile: writer = csv.DictWriter(csvfile, fieldnames=["mode", "affinity_kcal_mol"]) writer.writeheader() for r in results: writer.writerow(r) print(f"Wrote {len(results)} results to {outfile}")

Run:

bash
python parse_vina_results.py lisinopril_out.pdbqt lisinopril_results.csv

Step 2 — Batch docking results: rank multiple ligands

If you docked many ligands into out/ folder, use a small bash loop + the parser to build a master ranking:

bash
mkdir -p summary for out in out/*_out.pdbqt; do base=$(basename "$out" _out.pdbqt) python parse_vina_results.py "$out" "summary/${base}.csv" # grab top affinity head -n 2 "summary/${base}.csv" | tail -n1 >> summary/top_affinities.txt done # Now combine into a single file and sort (assuming lines are "mode,affinity") awk -F, 'NR>1{print FILENAME","$2}' summary/*.csv > summary/all_affinities.csv # Better: write a small Python script to collect top affinities and sort

(I can provide a polished collector script if you want.)


Step 3 — Visual inspection in PyMOL (show poses, H-bonds, pocket)

Open PyMOL (or Chimera) and run these commands (adjust names to your files/selections):

pymol
# load files load ACE_clean.pdb, receptor load lisinopril_out.pdb, docked # show protein and ligand hide everything show cartoon, receptor show sticks, docked color cyan, receptor color yellow, docked # select ligand (assumes ligand residues are present as 'resn LPR') select lig, docked and (resn LPR or resn LPR_) # adjust resname if needed # define pocket (all receptor atoms within 4 Å of ligand) select pocket, receptor within 4.0 of lig # show pocket and ligand show surface, pocket show sticks, pocket # draw distances (possible H-bonds) within 3.5 Å distance hbonds, lig, pocket, 3.5 # label pocket residues (for quick notes) label pocket and name CA, "%s%s" % (resn, resi)

What to look for:

  • Does ligand sit in expected pocket (co-crystal site)?

  • Key H-bonds (distances < 3.5 Å).

  • Hydrophobic contacts (ligand buried in hydrophobic region).

Take annotated screenshots for your slides / report.


Step 4 — Compute RMSD vs co-crystal ligand (redocking validation)

Why: Redocking the crystallographic ligand and computing RMSD between docked pose and crystal pose is a sanity check. RMSD ≤ 2.0 Å is commonly considered a successful redocking.

Use this Python script (Kabsch algorithm) to compute RMSD between two ligand PDBs. It expects that you have two PDB files containing only ligand atom lines (same atom ordering helps).

python
# rmsd_kabsch.py import sys import numpy as np def read_coords(pdbfile): coords = [] for line in open(pdbfile): if line.startswith(("ATOM","HETATM")): x = float(line[30:38]); y = float(line[38:46]); z = float(line[46:54]) coords.append([x,y,z]) return np.array(coords) def kabsch_rmsd(P, Q): # P and Q are Nx3 Pc = P - P.mean(axis=0) Qc = Q - Q.mean(axis=0) C = np.dot(Pc.T, Qc) V, S, Wt = np.linalg.svd(C) d = np.sign(np.linalg.det(np.dot(V, Wt))) D = np.diag([1.0, 1.0, d]) U = np.dot(np.dot(V, D), Wt) Qr = np.dot(Qc, U.T) diff = Pc - Qr return np.sqrt((diff**2).sum() / len(P)) if __name__ == "__main__": if len(sys.argv) != 3: print("Usage: python rmsd_kabsch.py ref_ligand.pdb dock_ligand.pdb") sys.exit(1) P = read_coords(sys.argv[1]) Q = read_coords(sys.argv[2]) if P.shape != Q.shape: print("Warning: different atom counts or order. RMSD may be invalid.") rmsd = kabsch_rmsd(P, Q) print("RMSD = %.3f Å" % rmsd)

Usage:

  1. Extract the co-crystal ligand from 1O86.pdb to ref_lig.pdb (keep only ligand atoms).

  2. Extract the docked ligand from lisinopril_out.pdb to dock_lig.pdb.

  3. Run:

bash
python rmsd_kabsch.py ref_lig.pdb dock_lig.pdb

If RMSD ≤ 2.0 Å → good agreement.


Step 5 — Map interactions (hydrogen bonds, hydrophobic, salt bridges)

Options:

  1. PLIP (Protein-Ligand Interaction Profiler) — easiest: upload PDB (receptor + ligand) to the PLIP webserver; it returns an interaction table + 2D/3D diagrams. (Use PLIP web for quick, annotated outputs.)

  2. Manual inspection in PyMOL — use distance to find close contacts, then compile a table of interacting residues and types.

Suggested output table columns:

  • ligand_atom - receptor_residue - receptor_atom - interaction_type (H-bond / hydrophobic / salt bridge) - distance (Å)


Step 6 — Rank & shortlist candidates

Create a shortlist based on:

  • Binding affinity (vina score) — more negative is better.

  • Pose plausibility (does it occupy the known active site?).

  • Key interactions: presence of H-bond(s) to catalytic residues or zinc-binding motif (if relevant).

  • RMSD (if comparing to known ligand) — low RMSD preferred for validation.

Produce a final CSV with columns: ligand_name, best_affinity, num_hbonds, key_residues, rmsd (if applicable), notes


Step 7 — Rescoring & follow-up (high-value options)

If you need higher confidence, consider:

  • Rescore with smina (a Vina fork with different scoring / custom weights).

  • MM-GBSA / MM-PBSA free energy estimations (requires short MD and AMBER / GROMACS + gmx_MMPBSA).

  • Short MD simulations to check pose stability (OpenMM, GROMACS).

  • Filter for ADMET & drug-likeness: use RDKit to compute logP, MW, Lipinski rules, or commercial tools.

(These require extra compute and expertise; treat as next-stage steps.)


Step 8 — Report & deliverables

Ask participants to submit:

  1. results.csv (modes & affinities).

  2. top_hits.csv (ranked top 5 ligands with key notes).

  3. Screenshots: PyMOL view(s) showing ligand in pocket with labeled interacting residues.

  4. RMSD report for any redocking performed.

  5. Short one-page summary: why the top hit was selected / concerns about it.

#Module 4 Molecular Docking Execution

Run the docking process in AutoDock Vina and obtain binding affinity results.


Step 1 – Check Prepared Files

Ensure the following files are ready from Module 3:

  • ACE_clean.pdbqt – Protein file (target)

  • lisinopril.pdbqt – Ligand file (drug candidate)

  • conf.txt – Docking configuration (grid box parameters)


Step 2 – Create Docking Configuration File (conf.txt)

If not already done in Module 3, create a text file with:


receptor = ACE_clean.pdbqt ligand = lisinopril.pdbqt center_x = <value> center_y = <value> center_z = <value> size_x = 20 size_y = 20 size_z = 20 exhaustiveness = 8 num_modes = 10

Replace <value> with the grid box coordinates obtained in Module 3.


Step 3 – Run AutoDock Vina

  1. Open Command Prompt / Terminal in the folder containing the files.

  2. Run:

    bash

    vina --config conf.txt --out output.pdbqt --log log.txt

Step 4 – Check Docking Results

  • Open log.txt

  • Look for:

    bash
    Affinity: -9.2 kcal/mol

    Lower (more negative) = better binding.

  • The top-ranked mode is usually the best pose.


Step 5 – Convert Results for Visualization

  • Convert output.pdbqt.pdb using OpenBabel:

    bash

    obabel output.pdbqt -O output.pdb

Step 6 – Visualize Binding

  1. Open PyMOL or Chimera.

  2. Load:

    • ACE_clean.pdb (protein)

    • output.pdb (docked ligand)

  3. View ligand position in the binding site.

  4. Highlight hydrogen bonds & interactions.


Step 7 – Record Observations

  • Binding energy values for each mode.

  • Interaction residues.

  • Potential drug-likeness observations

Module 3 Protein & Ligand Preparation

 

Module 3 – Protein & Ligand Preparation in clear step-by-step activity format, continuing from where Module 2 left off.
We’ll still use ACE (1O8A) as the example protein and Lisinopril as the ligand for hypertension drug discovery.


Module 3: Protein & Ligand Preparation for Docking

Objective:
Prepare both the protein (target) and ligand (drug molecule) files in the correct format for AutoDock Vina.


Step 1 – Clean the Protein Structure

  1. Open protein file (1O8A.pdb) in PyMOL or UCSF Chimera.

  2. Remove water molecules:

    • In PyMOL:

      arduino
      remove solvent
    • In Chimera:
      Select → Structure → solvent → delete.

  3. Remove any ligands or ions from the protein that are not needed for docking.


Step 2 – Save Clean Protein File

  1. After removing unwanted molecules,
    save as ACE_clean.pdb.


Step 3 – Add Hydrogens & Charges to Protein

  1. Open AutoDockTools (ADT).

  2. Load proteinACE_clean.pdb.

  3. Add polar hydrogens:
    Edit → Hydrogens → Add → Polar only.

  4. Add Gasteiger charges:
    Edit → Charges → Compute Gasteiger.

  5. Save as ACE_clean.pdbqt.


Step 4 – Prepare Ligand

  1. Download Lisinopril from PubChem in .SDF format.

  2. Convert .SDF.PDB using OpenBabel:

    bash
    obabel lisinopril.sdf -O lisinopril.pdb
  3. Open ligand in AutoDockTools.

  4. Add hydrogens and Gasteiger charges.

  5. Save as lisinopril.pdbqt.


Step 5 – Define Docking Grid Box

  1. In AutoDockTools, load both protein (ACE_clean.pdbqt) and ligand (lisinopril.pdbqt).

  2. Select binding site region (around active site residues).

  3. Set grid box center coordinates (X, Y, Z).

  4. Set box size to cover the active site completely.

  5. Save grid configuration as conf.txt.


Step 6 – Verify Files

  • ProteinACE_clean.pdbqt

  • Ligandlisinopril.pdbqt

  • Config fileconf.txt
    All ready for docking in Module 4.

#2 Activity

 Here’s a step-by-step activity list for Module 2 – Installation & Environment Setup (with AutoDock & Python tools) in your drug discovery workshop, using a real-life example file:


Activity: Installing Tools & Preparing the Environment for Hypertension Drug Discovery

Objective:

Get participants ready to perform molecular docking using AutoDock with a real hypertension-related protein (ACE – Angiotensin-Converting Enzyme).


Step 1 – Download & Install Python

  1. Visit https://www.python.org/downloads/

  2. Download Python 3.11+ (Windows/Mac/Linux).

  3. During installation, check "Add Python to PATH".

  4. Verify installation in terminal:

    python --version
    

Step 2 – Install Required Python Libraries

  1. Open Command Prompt / Terminal.

  2. Install required packages:

    pip install numpy pandas biopython matplotlib vina
    

Step 3 – Download & Install AutoDock Vina

  1. Go to https://vina.scripps.edu/downloads/

  2. Select your OS version (Windows/Mac/Linux).

  3. Extract files to a known folder (e.g., C:\AutoDockVina).

  4. Add the folder to PATH in system environment variables.

  5. Test installation:

    vina --help
    

Step 4 – Download a Real-Life Protein Example (ACE)

  1. Open RCSB PDB: https://www.rcsb.org/

  2. Search for ACE protein (PDB ID: 1O8A as example).

  3. Download PDB file to your working directory.


Step 5 – Install Visualization Tool (PyMOL or UCSF Chimera)

  1. Download PyMOL (open-source): https://pymol.org/2/

  2. Install it and verify by opening 1O8A.pdb from Step 4.

  3. Rotate and zoom to observe protein structure.


Step 6 – Prepare Ligand File

  1. Download a known hypertension drug structure (e.g., Lisinopril) from PubChem: https://pubchem.ncbi.nlm.nih.gov/.

  2. Save in SDF format.

  3. Convert to PDBQT format using OpenBabel:

    obabel lisinopril.sdf -O lisinopril.pdbqt
    

Step 7 – Check Everything Works

  1. Open 1O8A.pdb in PyMOL.

  2. Load lisinopril.pdbqt in Vina.

  3. Run a test docking to confirm setup.



# Module 2 Installation & Environment Setup

 

~ with a real-life example (ACE co-crystal PDB 1O86 + Lisinopril ligand).
I’ll include exact commands (Linux/macOS/WSL), small scripts to compute the docking box center, and troubleshooting tips.


Quick plan (what you’ll do in this module)

  1. Create a conda environment and install Vina, OpenBabel, and MGLTools (AutoDockTools).

  2. Download the example receptor (PDB 1O86) and ligand (Lisinopril) SMILES.

  3. Convert ligand SMILES → 3D → mol2 (OpenBabel).

  4. Prepare receptor and ligand .pdbqt files (AutoDockTools prepare_receptor4.py / prepare_ligand4.py).

  5. Run a test docking with vina (example command).

  6. Inspect outputs and common errors.


Important downloads / references


0) Prerequisites (one-time)

  • Install Miniconda/Anaconda (recommended).

  • (Optional but recommended) install mamba for faster conda installs.

Install mamba (optional, faster):

bash
conda install -n base -c conda-forge mamba -y

1) Create environment & install tools (recommended: Linux/macOS/WSL)

Recommended approach: use conda channels conda-forge and bioconda and install vina, openbabel, and mgltools.

Using mamba (fast):

bash
# add channels (if not already) conda config --add channels conda-forge conda config --add channels bioconda conda config --set channel_priority strict # create env and install mamba create -n autodock python=3.9 -y conda activate autodock # install core packages mamba install -n autodock -c conda-forge -c bioconda vina openbabel mgltools -y

Fallback with plain conda:

bash
conda create -n autodock python=3.9 -y conda activate autodock conda install -c conda-forge vina openbabel -y conda install -c bioconda mgltools -y

Notes:

  • vina is available from conda-forge / bioconda (use whichever channel works on your platform). Anaconda+1

  • MGLTools installers are also available directly from the Scripps page if you prefer GUI installers. ccsb.scripps.edu

Windows users: you may download installers (Vina MSI or zip and MGLTools .exe) from the vendor pages and add Vina to PATH; or use WSL2 and follow the Linux steps for a smoother experience. AutoDock Vinaccsb.scripps.edu


2) Get the example files (receptor + ligand)

A. Download ACE co-crystal (PDB 1O86) — save as 1O86.pdb:

bash
curl -L -o 1O86.pdb https://files.wwpdb.org/download/1O86.pdb

(You can also download from RCSB web UI.) RCSB PDB+1

B. Create a SMILES file for Lisinopril (PubChem CID 5362119; canonical SMILES shown):

bash
# example SMILES (from PubChem) echo "C1C[C@H](N(C1)C(=O)C(CCCCN)NC(CCC2=CC=CC=C2)C(=O)O)C(=O)O" > lisinopril.smiles

(You can also wget/download the SDF from PubChem). PubChem


3) Convert SMILES → 3D SDF → MOL2 (OpenBabel)

Generate 3D coordinates and write a mol2 (preferred for prepare_ligand4.py):

bash
# 1) SMILES -> SDF (3D) obabel -ismi lisinopril.smiles -O lisinopril.sdf --gen3D # 2) SDF -> MOL2 (add hydrogens) obabel -isdf lisinopril.sdf -omol2 -O lisinopril.mol2 --addhs --minimize

If you prefer a single-step: obabel -ismi lisinopril.smiles -omol2 -O lisinopril.mol2 --gen3D --addhs --minimize

Tip: OpenBabel flags --gen3D, --addhs, and --minimize create a reasonable 3D geometry. For charges you generally use AutoDockTools (prepare_ligand4) to assign Gasteiger charges. openbabel.github.ioopen-babel.readthedocs.io


4) Prepare receptor and ligand .pdbqt (AutoDockTools scripts)

Why use AutoDockTools (MGLTools)? prepare_receptor4.py and prepare_ligand4.py correctly set atom types, merge non-polar hydrogens, assign Gasteiger charges and define torsions — this is the recommended route for reliable .pdbqt. dockey.readthedocs.ioGitHub

Example commands (paths depend on how MGLTools installed — if installed via conda the scripts will be on your PATH; else use full path to MGLToolsPckgs/.../Utilities24/prepare_receptor4.py):

bash
# Prepare receptor: remove water / heteroatoms and make pdbqt prepare_receptor4.py -r 1O86.pdb -o 1O86_receptor.pdbqt # (If needed: remove water/ligand first in PyMOL or via flags such as -U) # e.g., prepare_receptor4.py -r 1O86.pdb -o 1O86_receptor.pdbqt -U 'nphs' # Prepare ligand (use the mol2 created earlier) prepare_ligand4.py -l lisinopril.mol2 -o lisinopril.pdbqt

If your MGLTools is not Python-3 compatible on your system, use the packaged installer or the autodocktools-prepare conda package that provides prepare_* scripts. Anacondaccsb.scripps.edu

Alternate: Meeko (Python package) or obabel --partialcharges gasteiger -opdbqt can be used, but AutoDockTools remain the most commonly used/robust for assigning torsions for Vina. Durrant Labdockey.readthedocs.io


5) Find the docking box center (practical options)

You need center_x/y/z for vina. three ways:

A) PyMOL GUI (quick)

  • Load 1O86.pdb, select the ligand (e.g., select lig, resn LPR) then center on it and read coordinates or use a centroid plugin.

B) Small Python snippet (Biopython) — compute ligand centroid automatically
Install Biopython: conda install -c conda-forge biopython -y
Then run:

python
# get_centroid_ligand.py from Bio.PDB import PDBParser import numpy as np p = PDBParser(QUIET=True) s = p.get_structure('1O86','1O86.pdb') coords = [] for atom in s.get_atoms(): # ligand in 1O86 has residue name 'LPR' (lisinopril); adjust if different if atom.get_parent().get_resname().strip() == 'LPR': coords.append(atom.get_coord()) coords = np.array(coords) centroid = coords.mean(axis=0) print("center_x, center_y, center_z = ", centroid.tolist())

Run: python get_centroid_ligand.py — this prints coordinates you can pass to vina.

C) Manual — pick coordinates from the ligand lines in the PDB (HETATM lines).

(Residue name for lisinopril in 1O86 is LPR — confirm in the PDB file). RCSB PDBIUPHAR/MMV Guide to Malaria Pharmacology


6) Run a test docking with Vina

Example (replace center_* with values from the centroid tool or PyMOL):

bash
vina --receptor 1O86_receptor.pdbqt \ --ligand lisinopril.pdbqt \ --center_x XX.XXX --center_y YY.YYY --center_z ZZ.ZZZ \ --size_x 20 --size_y 20 --size_z 20 \ --exhaustiveness 8 \ --out lisinopril_out.pdbqt \ --log lisinopril_log.txt
  • size_* controls the box dimensions (Å). 20 Å is a reasonable starting cube for small ligands but tune as needed.

  • Inspect lisinopril_log.txt — it lists binding energy (kcal/mol) for the top poses. AutoDock Vina


7) Quick visualization

Open lisinopril_out.pdbqt in PyMOL/PyRx and view poses; compare with co-crystallized ligand in 1O86 for validation.


Common errors & fixes

  • “Parse error… Unknown tag”: often malformed .pdbqt. Rebuild ligand with prepare_ligand4.py (do not hand-edit). ResearchGate

  • Missing charges (q = 0) after OpenBabel conversion — add --partialcharges gasteiger or use AutoDockTools. OpenBabel sometimes needs explicit flags. Durrant LabGitHub

  • prepare_ligand/prepare_receptor scripts expect python2: use MGLTools installer or the conda mgltools package / autodocktools-prepare that supplies python3-compatible scripts. Anaconda+1


Short checklist to hand to participants

  1. Create env: conda/mamba create -n autodock python=3.9conda activate autodock.

  2. Install: mamba install -c conda-forge -c bioconda vina openbabel mgltools (or download installers). Anacondaccsb.scripps.eduopenbabel.github.io

  3. Download 1O86.pdb. RCSB PDB

  4. Get Lisinopril SMILES (PubChem CID 5362119). PubChem

  5. Convert, prepare and run vina (commands above).

  6. Visualize results in PyMOL or PyRx.

#Module 5 Post-Docking Analysis

Post-Docking Analysis , laid out as a clean, runnable activity so your participants can walk straight through it after docking. It includes ...