Tuesday, August 12, 2025

#Module 5 Post-Docking Analysis

Post-Docking Analysis, laid out as a clean, runnable activity so your participants can walk straight through it after docking. It includes visualization tips, quick scripts to extract scores, RMSD validation vs co-crystal ligand, interaction mapping, ranking, and next-step suggestions.


Module 5 — Post-Docking Analysis (step-by-step)

Objective

From a completed docking run (output.pdbqt / output.pdb + log.txt), you will:

  1. Extract binding scores and create a ranked table.

  2. Visualize and inspect docked poses and interactions.

  3. Validate docking by redocking / RMSD vs co-crystallized ligand.

  4. Identify key interactions and shortlist candidates.

  5. Know simple rescoring/next-step options.


Step 0 — Files you should have

  • ACE_clean.pdbqt (receptor) and ACE_clean.pdb (original PDB)

  • lisinopril_out.pdbqt (vina output) and lisinopril_out.pdb (converted PDB)

  • lisinopril_log.txt (vina log) — optional

  • If available: original co-crystallized ligand (from 1O86.pdb) for RMSD checks


Step 1 — Parse Vina results (quick CSV of modes)

Vina writes pose results as REMARK VINA RESULT: lines in the output .pdbqt. Use this Python script to extract them into results.csv:

python
# parse_vina_results.py
import sys import csv if len(sys.argv) < 3: print("Usage: python parse_vina_results.py output.pdbqt results.csv") sys.exit(1) infile = sys.argv[1] outfile = sys.argv[2] results = [] with open(infile) as f: for line in f: line = line.strip() if line.startswith("REMARK VINA RESULT:"): parts = line.split() # typical line: REMARK VINA RESULT: -7.1 0.000 0.000 try: affinity = float(parts[3]) except: affinity = None results.append({"mode": len(results)+1, "affinity_kcal_mol": affinity}) with open(outfile, "w", newline="") as csvfile: writer = csv.DictWriter(csvfile, fieldnames=["mode", "affinity_kcal_mol"]) writer.writeheader() for r in results: writer.writerow(r) print(f"Wrote {len(results)} results to {outfile}")

Run:

bash
python parse_vina_results.py lisinopril_out.pdbqt lisinopril_results.csv

Step 2 — Batch docking results: rank multiple ligands

If you docked many ligands into out/ folder, use a small bash loop + the parser to build a master ranking:

bash
mkdir -p summary for out in out/*_out.pdbqt; do base=$(basename "$out" _out.pdbqt) python parse_vina_results.py "$out" "summary/${base}.csv" # grab top affinity head -n 2 "summary/${base}.csv" | tail -n1 >> summary/top_affinities.txt done # Now combine into a single file and sort (assuming lines are "mode,affinity") awk -F, 'NR>1{print FILENAME","$2}' summary/*.csv > summary/all_affinities.csv # Better: write a small Python script to collect top affinities and sort

(I can provide a polished collector script if you want.)


Step 3 — Visual inspection in PyMOL (show poses, H-bonds, pocket)

Open PyMOL (or Chimera) and run these commands (adjust names to your files/selections):

pymol
# load files load ACE_clean.pdb, receptor load lisinopril_out.pdb, docked # show protein and ligand hide everything show cartoon, receptor show sticks, docked color cyan, receptor color yellow, docked # select ligand (assumes ligand residues are present as 'resn LPR') select lig, docked and (resn LPR or resn LPR_) # adjust resname if needed # define pocket (all receptor atoms within 4 Å of ligand) select pocket, receptor within 4.0 of lig # show pocket and ligand show surface, pocket show sticks, pocket # draw distances (possible H-bonds) within 3.5 Å distance hbonds, lig, pocket, 3.5 # label pocket residues (for quick notes) label pocket and name CA, "%s%s" % (resn, resi)

What to look for:

  • Does ligand sit in expected pocket (co-crystal site)?

  • Key H-bonds (distances < 3.5 Å).

  • Hydrophobic contacts (ligand buried in hydrophobic region).

Take annotated screenshots for your slides / report.


Step 4 — Compute RMSD vs co-crystal ligand (redocking validation)

Why: Redocking the crystallographic ligand and computing RMSD between docked pose and crystal pose is a sanity check. RMSD ≤ 2.0 Å is commonly considered a successful redocking.

Use this Python script (Kabsch algorithm) to compute RMSD between two ligand PDBs. It expects that you have two PDB files containing only ligand atom lines (same atom ordering helps).

python
# rmsd_kabsch.py import sys import numpy as np def read_coords(pdbfile): coords = [] for line in open(pdbfile): if line.startswith(("ATOM","HETATM")): x = float(line[30:38]); y = float(line[38:46]); z = float(line[46:54]) coords.append([x,y,z]) return np.array(coords) def kabsch_rmsd(P, Q): # P and Q are Nx3 Pc = P - P.mean(axis=0) Qc = Q - Q.mean(axis=0) C = np.dot(Pc.T, Qc) V, S, Wt = np.linalg.svd(C) d = np.sign(np.linalg.det(np.dot(V, Wt))) D = np.diag([1.0, 1.0, d]) U = np.dot(np.dot(V, D), Wt) Qr = np.dot(Qc, U.T) diff = Pc - Qr return np.sqrt((diff**2).sum() / len(P)) if __name__ == "__main__": if len(sys.argv) != 3: print("Usage: python rmsd_kabsch.py ref_ligand.pdb dock_ligand.pdb") sys.exit(1) P = read_coords(sys.argv[1]) Q = read_coords(sys.argv[2]) if P.shape != Q.shape: print("Warning: different atom counts or order. RMSD may be invalid.") rmsd = kabsch_rmsd(P, Q) print("RMSD = %.3f Å" % rmsd)

Usage:

  1. Extract the co-crystal ligand from 1O86.pdb to ref_lig.pdb (keep only ligand atoms).

  2. Extract the docked ligand from lisinopril_out.pdb to dock_lig.pdb.

  3. Run:

bash
python rmsd_kabsch.py ref_lig.pdb dock_lig.pdb

If RMSD ≤ 2.0 Å → good agreement.


Step 5 — Map interactions (hydrogen bonds, hydrophobic, salt bridges)

Options:

  1. PLIP (Protein-Ligand Interaction Profiler) — easiest: upload PDB (receptor + ligand) to the PLIP webserver; it returns an interaction table + 2D/3D diagrams. (Use PLIP web for quick, annotated outputs.)

  2. Manual inspection in PyMOL — use distance to find close contacts, then compile a table of interacting residues and types.

Suggested output table columns:

  • ligand_atom - receptor_residue - receptor_atom - interaction_type (H-bond / hydrophobic / salt bridge) - distance (Å)


Step 6 — Rank & shortlist candidates

Create a shortlist based on:

  • Binding affinity (vina score) — more negative is better.

  • Pose plausibility (does it occupy the known active site?).

  • Key interactions: presence of H-bond(s) to catalytic residues or zinc-binding motif (if relevant).

  • RMSD (if comparing to known ligand) — low RMSD preferred for validation.

Produce a final CSV with columns: ligand_name, best_affinity, num_hbonds, key_residues, rmsd (if applicable), notes


Step 7 — Rescoring & follow-up (high-value options)

If you need higher confidence, consider:

  • Rescore with smina (a Vina fork with different scoring / custom weights).

  • MM-GBSA / MM-PBSA free energy estimations (requires short MD and AMBER / GROMACS + gmx_MMPBSA).

  • Short MD simulations to check pose stability (OpenMM, GROMACS).

  • Filter for ADMET & drug-likeness: use RDKit to compute logP, MW, Lipinski rules, or commercial tools.

(These require extra compute and expertise; treat as next-stage steps.)


Step 8 — Report & deliverables

Ask participants to submit:

  1. results.csv (modes & affinities).

  2. top_hits.csv (ranked top 5 ligands with key notes).

  3. Screenshots: PyMOL view(s) showing ligand in pocket with labeled interacting residues.

  4. RMSD report for any redocking performed.

  5. Short one-page summary: why the top hit was selected / concerns about it.

No comments:

Post a Comment

#Module 5 Post-Docking Analysis

Post-Docking Analysis , laid out as a clean, runnable activity so your participants can walk straight through it after docking. It includes ...