grand.utils module

Description

Functions to provide support for grand canonical sampling in OpenMM. These functions are not used during the simulation, but will be relevant in setting up simulations and processing results

Marley Samways Ollie Melling

class grand.utils.PDBRestartReporter(filename, topology)

Bases: object

Very basic class to write PDB files as a basic form of restarter

report(simulation, state)

Write out a PDB of the current state

Parameters
  • simulation (simtk.openmm.app.Simulation) – Simulation object being used

  • state (simtk.openmm.State) – Current State of the simulation

grand.utils.add_ghosts(topology, positions, ff='tip3p', n=10, pdb='gcmc-extra-wats.pdb')

Function to add water molecules to a topology, as extras for GCMC This is to avoid changing the number of particles throughout a simulation Instead, we can just switch between ‘ghost’ and ‘real’ waters…

Notes

Ghosts currently all added to a new chain Residue numbering continues from the existing PDB numbering

Parameters
  • topology (simtk.openmm.app.Topology) – Topology of the initial system

  • positions (simtk.unit.Quantity) – Atomic coordinates of the initial system

  • ff (str) – Water forcefield to use. Currently the only options are ‘tip3p’, ‘spce’ or ‘tip4pew’. Should be the same as used for the solvent

  • n (int) – Number of waters to add to the system

  • pdb (str) – Name of the PDB file to write containing the updated system This will be useful for visualising the results obtained.

Returns

  • modeller.topology (simtk.openmm.app.Topology) – Topology of the system after modification

  • modeller.positions (simtk.unit.Quantity) – Atomic positions of the system after modification

  • ghosts (list) – List of the residue numbers (counting from 0) of the ghost waters added to the system.

grand.utils.align_traj(topology=None, trajectory=None, t=None, reference=None, output=None)

Align a trajectory to the protein

Parameters
  • topology (str) – Name of the topology/connectivity file (e.g. PDB, GRO, etc.)

  • trajectory (str) – Name of the trajectory file (e.g. DCD, XTC, etc.)

  • t (mdtraj.Trajectory) – Trajectory object, if already loaded

  • reference (str) – Name of a PDB file to align the protein to. May be better to visualise

  • output (str) – Name of the file to which the new trajectory is written. If None, then a Trajectory will be returned

Returns

t – Will return a trajectory object, if no output file name is given

Return type

mdtraj.Trajectory

grand.utils.cluster_waters(topology, trajectory, sphere_radius, ref_atoms=None, sphere_centre=None, cutoff=2.4, output='gcmc_clusts.pdb')

Carry out a clustering analysis on GCMC water molecules with the sphere. Based on the clustering code in the ProtoMS software package.

This function currently assumes that the system has been aligned and centred on the GCMC sphere (approximately).

Parameters
  • topology (str) – Topology of the system, such as a PDB file

  • trajectory (str) – Trajectory file, such as DCD

  • sphere_radius (float) – Radius of the GCMC sphere in Angstroms

  • ref_atoms (list) – List of reference atoms for the GCMC sphere (list of dictionaries)

  • sphere_centre (simtk.unit.Quantity) – Coordinates around which the GCMC sphere is based

  • cutoff (float) – Distance cutoff used in the clustering

  • output (str) – Name of the output PDB file containing the clusters

grand.utils.create_ligand_xml(prmtop, prepi, resname='LIG', output='lig.xml')

Takes two AMBER parameter files (.prmtop and .prepi) for a small molecule and uses them to create an XML file which can be used to load the force field parameters for the ligand into OpenMM This function could do with some tidying at some point…

Parameters
  • prmtop (str) – Name of the .prmtop file

  • prepi (str) – Name of the .prepi file

  • resname (str) – Residue name of the small molecule

  • output (str) – Name of the output XML file

grand.utils.get_data_file(filename)

Get the absolute path of one of the data files included in the package

Parameters

filename (str) – Name of the file

Returns

filepath – Name of the file including the path

Return type

str

grand.utils.random_rotation_matrix()

Generate a random axis and angle for rotation of the water coordinates (using the method used for this in the ProtoMS source code (www.protoms.org), and then return a rotation matrix produced from these

Returns

rot_matrix – Rotation matrix generated

Return type

numpy.ndarray

grand.utils.read_ghosts_from_file(ghost_file)

Read in the ghost water residue IDs from each from in the ghost file

Parameters

ghost_file (str) – File containing the IDs of the ghost residues in each frame

Returns

ghost_resids – List of lists, containing residue IDs for each frame

Return type

list

grand.utils.read_prepi(filename)

Function to read in some atomic data and bonding information from an AMBER prepi file

Parameters

filename (str) – Name of the prepi file

Returns

  • atom_data (list) – A list containing a list for each atom, of the form [name, type, charge], where each are strings

  • bonds (list) – A list containing one list per bond, of the form [name1, name2]

grand.utils.recentre_traj(topology=None, trajectory=None, t=None, name='CA', resname='ALA', resid=1, output=None)

Recentre a trajectory based on a specific protein residue. Assumes that the protein has not been broken by periodic boundaries. Would be best to do this step before aligning a trajectory

Parameters
  • topology (str) – Name of the topology/connectivity file (e.g. PDB, GRO, etc.)

  • trajectory (str) – Name of the trajectory file (e.g. DCD, XTC, etc.)

  • t (mdtraj.Trajectory) – Trajectory object, if already loaded

  • resname (str) – Name of the atom to centre the trajectorname.lower(

  • resname – Name of the protein residue to centre the trajectory on. Should be a binding site residue

  • resid (int) – ID of the protein residue to centre the trajectory. Should be a binding site residue

  • output (str) – Name of the file to which the new trajectory is written. If None, then a Trajectory will be returned

Returns

t – Will return a trajectory object, if no output file name is given

Return type

mdtraj.Trajectory

grand.utils.remove_ghosts(topology, positions, ghosts=None, pdb='gcmc-removed-ghosts.pdb')

Function to remove ghost water molecules from a topology, after a simulation. This is so that a structure can then be used to run further analysis without ghost waters disturbing the system.

Parameters
  • topology (simtk.openmm.app.Topology) – Topology of the initial system

  • positions (simtk.unit.Quantity) – Atomic coordinates of the initial system

  • ghosts (list) – List of residue IDs for the ghost waters to be deleted

  • pdb (str) – Name of the PDB file to write containing the updated system This will be useful for visualising the results obtained.

Returns

  • modeller.topology (simtk.openmm.app.Topology) – Topology of the system after modification

  • modeller.positions (simtk.unit.Quantity) – Atomic positions of the system after modification

grand.utils.shift_ghost_waters(ghost_file, topology=None, trajectory=None, t=None, output=None)

Translate all ghost waters in a trajectory out of the simulation box, to make visualisation clearer

Parameters
  • ghost_file (str) – Name of the file containing the ghost water residue IDs at each frame

  • topology (str) – Name of the topology/connectivity file (e.g. PDB, GRO, etc.)

  • trajectory (str) – Name of the trajectory file (e.g. DCD, XTC, etc.)

  • t (mdtraj.Trajectory) – Trajectory object, if already loaded

  • output (str) – Name of the file to which the new trajectory is written. If None, then a Trajectory will be returned

Returns

t – Will return a trajectory object, if no output file name is given

Return type

mdtraj.Trajectory

grand.utils.wrap_waters(topology=None, trajectory=None, t=None, output=None)

Wrap water molecules if the coordinates haven’t been wrapped by the DCDReporter

Parameters
  • topology (str) – Name of the topology/connectivity file (e.g. PDB, GRO, etc.)

  • trajectory (str) – Name of the trajectory file (e.g. DCD, XTC, etc.)

  • t (mdtraj.Trajectory) – Trajectory object, if already loaded

  • output (str) – Name of the file to which the new trajectory is written. If None, then a Trajectory will be returned

Returns

t – Will return a trajectory object, if no output file name is given

Return type

mdtraj.Trajectory

grand.utils.write_conect(pdb, resname, prepi, output)

Take in a PDB and write out a new one, including CONECT lines for a specified residue, given a .prepi file Should make it easy to run this on more residues at a time - though for now it can just be run separately per residue but this isn’t ideal…

Parameters
  • pdb (str) – Name of the input PDB file

  • resname (str) – Name of the residue of interest

  • prepi (str) – Name of the prepi file

  • output (str) – Name of the output PDB file

grand.utils.write_sphere_traj(radius, ref_atoms=None, topology=None, trajectory=None, t=None, sphere_centre=None, output='gcmc_sphere.pdb', initial_frame=False)

Write out a multi-frame PDB file containing the centre of the GCMC sphere

Parameters
  • radius (float) – Radius of the GCMC sphere in Angstroms

  • ref_atoms (list) – List of reference atoms for the GCMC sphere (list of dictionaries)

  • topology (str) – Topology of the system, such as a PDB file

  • trajectory (str) – Trajectory file, such as DCD

  • t (mdtraj.Trajectory) – Trajectory object, if already loaded

  • sphere_centre (simtk.unit.Quantity) – Coordinates around which the GCMC sphere is based

  • output (str) – Name of the output PDB file

  • initial_frame (bool) – Write an extra frame for the topology at the beginning of the trajectory. Sometimes necessary when visualising a trajectory loaded onto a PDB