CP2K interface
CP2K is a very popular and powerful periodic electronic structure program, particulary known for its unique mixed Gaussian and plane waves (GPW) approach and efficient parallelization which enables largescale DFT MD simulations to be carried out.
ASH features a reasonably flexible interface to CP2K allowing the use of CP2K both as a QMonly code within ASH or as part of QM/MM Theory in ASH. Energies and gradients are available in the interface so the CP2KTheory class in ASH can be used for singlepoint energies, geometry optimizations, numerical frequencies, surface scans, NEB and molecular dynamics. Neither the CP2K optimizer,MD or NEB routines are used, instead ASH handles all of this. Both periodic and nonperiodic systems are supported to some extent. Additionally pointchargeembedding is supported by the interface via the GEEP (Gaussian Expansion of Electrostatic Potential) approach in CP2K. This allows QM/MM calculations with CP2K as QMcode and OpenMM as MMcode within ASH.
If the purpose is to primarily carry out periodic DFT MD simulations, running CP2K via the ASH interface will not offer many benefits over CP2K directly, however, running geometry optimizations, surface scans and NEB via ASH may be more convenient than using CP2K directly. Furthermore the QM/MM capabilities within ASH and the flexible forcefield support by OpenMM may be preferable to the CP2K options.
CP2KTheory class:
class CP2KTheory:
def __init__(self, cp2kdir=None, cp2k_bin_name=None, filename='cp2k', printlevel=2, basis_dict=None, potential_dict=None, label="CP2K",
periodic=False, periodic_type='XYZ', qm_periodic_type=None,cell_dimensions=None, cell_vectors=None,
qm_cell_dims=None, qm_cell_shift_par=6.0, wavelet_scf_type=40,
functional=None, psolver='wavelet', potential_file='POTENTIAL', basis_file='BASIS',
basis_method='GAPW', ngrids=4, cutoff=250, rel_cutoff=60,
method='QUICKSTEP', numcores=1, parallelization='OMP', mixed_mpi_procs=None, mixed_omp_threads=None,
center_coords=True, scf_maxiter=50, outer_SCF=True, outer_scf_maxiter=10, scf_convergence=1e6, eps_default=1e10,
coupling='GAUSSIAN', GEEP_num_gauss=6, MM_radius_scaling=1, mm_radii=None,
OT=True, OT_minimizer='DIIS', OT_preconditioner='FULL_ALL',
OT_linesearch='3PNT', outer_SCF_optimizer='DIIS', OT_energy_gap=0.08):
Keyword 
Type 
Default value 
Details 


string 
None 
Directory where CP2K binaries are. 

string 
None 
Name of CP2K binary to use. If not provided, ASH will search for cp2k.ssmp, cp2k.sopt, cp2k.psmp in the cp2kdir directory or PATH. 

integer 
2 
Printlevel 

string 
'cp2k' 
Name of CP2K inputfile created by ASH. 

integer 
1 
The number of CPU cores used. 

string 
'OMP' 
Parallelization strategy. Options: 'OMP', 'MPI', 'Mixed' 

integer 
1 
If parallelization='Mixed': The number of MPI processes active. 

integer 
None 
If parallelization='Mixed': The number of OpenMP threads per MPI process. 

dict 
None 
Dictionary defining basis set information for each element. 

dict 
None 
Dictionary defining pseudopotential information for each element. 

string 
'CP2K' 
Label for the CP2KTheory object. 

Boolean 
False 
Whether to use periodic boundary conditions or not. 

string 
'XYZ' 
Type of PBC used. Options: 'XYZ', 'XY', 'XZ', 'YZ' etc. 

list 
None 
The cell dimensions defining the box (applies to both PBC and noPBC). List of celllengths in Angstroms and cellangles in degrees. 

list of lists 
None 
Alternative way of specifying the cell dimensions. List of lists of cellvectors in Angstroms. 

integer 
40 
Waveletonly: Scaling function. Possible values: 8,14,16,20,24,30,40,50,60,100 

string 
None 
Name of DFT functional to use. 

string 
'wavelet' 
Name of Poisson solver to use. Options: 'PERIODIC', 'MT', 'WAVELET', 'MULTIPOLE'. 

string 
'POTENTIAL' 
Name of potential file. 

string 
'BASIS' 
Name of basis set file. 

string 
'GAPW' 
Type of CP2K basisset method to use. Options: 'GPW' (Gaussianplanewave with pseudopoentials), 'GAPW' (Gaussian augmented planewave). 

integer 
4 
Number of realspace grids to use. 

integer 
250 
Planewave cutoff (in Ry) used. 

integer 
60 
Relative cutoff (in Ry) used. Controls which product Gaussians are mapped onto which level of the multigrid. 

Boolean 
True 
Whether CP2K centers coordinates or not. Usually necessary. 

float 
1e6 
SCF convergence in Hartree. 

integer 
50 
Max number of (inner) SCF iterations 

float 
1e10 
Overall CP2K convergence setting (see manual). Probably more useful than scf_convergence 

Boolean 
True 
Whether the OT SCF method is on or not 

string 
'DIIS' 
Type of minimizer method. Options: 'DIIS', 'CG', 'SD', 'BROYDEN' 

string 
'FULL_ALL' 
OT preconditioner. Options: 'FULL_ALL', 'FULL_SINGLE_INVERSE' and more 

string 
'3PNT' 
OT linesearch option. Options: '3PNT', '2PNT', 'NONE', 'GOLD'. 

Boolean 
True 
Whether outerSCF loop is on or not. 

integer 
10 
How many outer SCF iterations to perform. 

string 
'DIIS' 
Outer SCF optimizer method. Options: 'DIIS', 'CG' 

float 
0.08 
Energy gap to use in OT method. 

string 
'QUICKSTEP' 
The CP2K module to use. 

string 
None 
QM/MM only: Type of PBC used for QMpart. Options: 'XYZ', 'XY', 'XZ', 'YZ' etc. 

list 
None 
QM/MM only: List of celllengths in Angstroms for the QM region. Cell dimensions estimated if not provided. 

float 
None 
QM/MM only: Parameter used to shift the QM cell dimensions based on molecule size. Only used if qm_cell_dims is not provided. 

string 
'GAUSSIAN' 
QM/MM only: The type of QM/MM coupling used. Only 'GAUSSIAN' is supported for DFT. 'COULOMB' available for semiempiricical systems. 

integer 
6 
QM/MM only: Number of Gaussians used to expand each MM point charge. 

float 
1.0 
QM/MM only: Optional scaling factor of the MM radii. 

dict 
None 
QM/MM only: Optional dictionary of MM radii for each element. If not provided, radii are estimated internal table. 
CP2K installation
CP2K can be installed in several different ways, see https://www.cp2k.org/download It can be tricky to compile. It is easiest to either download binaries (see link, though MPIparallel version not typically available) or install via conda/mamba (see https://anaconda.org/condaforge/cp2k). Alternatively you can compile CP2K from source: https://github.com/cp2k/cp2k/blob/master/INSTALL.md
Note that downloaded or compiled CP2K binaries may come in a few different forms: e.g. cp2k.ssmp, cp2k.sopt, ccp2k.popt, cp2k.psmp where sopt means serialoptimized, ssmp means singleprocess with OpenMP, popt means paralleloptimized with MPI and psmp means paralleloptimized with MPI and OpenMP. The cp2k.psmp binary is the most flexible and is recommended to use if available.
We have had success installing the latest CP2K version (2024.1) via conda/mamba like this:
#CP2K 2024.1 with OpenMPI and OpenBLAS: installs cp2k.psmp binary
#Note: mamba install cp2k will install nonMPI version: cp2k.ssmp binary
mamba install cp2k=2024.1=openblas_openmpi_h7c9ef3d_1
ASH will then try find the CP2K installation to use according to this logic:
if cp2kdir variable provided (containing path to where the binaries are) and cp2k_bin_name provided: use that binary in that directory
if cp2kdir variable provided but cp2k_bin_name NOT provided: search for cp2k.X executables in the cp2kdir directory
if cp2kdir variable NOT provided but cp2k_bin_name provided: search for cp2k_bin_name in PATH
if cp2kdir variable NOT provided and cp2k_bin_name not provided: search for cp2k.X executables in PATH
Note that the search for executables will only work if the binaries are named: cp2k.X where X is one of: ssmp, sopt, popt, psmp. ASH will search for executables in this order: ["cp2k.psmp", "cp2k.popt", "cp2k.ssmp","cp2k.sopt"]
Parallelization
CP2K binaries differ in their parallelization capabilities:
sopt: no parallelization
ssmp: only OpenMP parallelization
popt: only MPI parallelization
psmp: mixed MPI and OpenMP parallelization. Primarily useful for massive parallelization (>10K cores).
The CP2K manual advises to use the cp2k.psmp executable as it is the most flexible (can be used for serial, MPI, OpenMP and Mixed calculations). If this executable is available it is best to specify this in the CP2KTheory object using the cp2k_bin_name keyword.
The parallelization that ASH will tell CP2K to use is controlled by the parallelization keyword when defining the CP2KTheory object in ASH. It can be se set to 'OMP' (OpenMP threading, default), 'MPI' or 'Mixed'. The best parallelization strategy depends on the system and the hardware and you may have to do your own benchmarks.
OpenMP parallelization
This is the easiest parallelization strategy to start using and is hence the default (parallelization='OMP') in ASH. It requires either a cp2k.ssmp or cp2k.psmp executable. One simply has to specify the number of CPU cores to be used via the numcores keyword in CP2KTheory. 1 CP2K process (either cp2k.ssmp or cp2k.psmp) executable will be launched which will be capable of OpenMP threading up to the chosen number of cores.
MPI parallelization
For MPIparallelization one should set parallelization='MPI'. It requires either the cp2k.popt or cp2k.psmp executable. Additionally a CP2Kcompatible MPI program needs to be installed and in PATH. ASH assumes OpenMPI and will search for this. If numcores=4 and parallelization='MPI' then 4 CP2K processes will be launched by the mpirun program.
As discussed on https://www.cp2k.org/faq:mpi_vs_openmp CP2K is primarily MPIparallelized and is thus probably a faster option than OpenMP overall.
Mixed OMP/MPI parallelization
The mixed OMP/MPI parallelization is only possible using the cp2k.psmp executable. This parallelization strategy is particularly useful for massively parallel calculations (thousands of cores) and may not be as beneficial for small systems or a small amount of CPU cores. But this depends on the system and hardware so test it out.
To use one should set: parallelization='Mixed' , specify the total number of CPU cores by the numcores keyword and additionally one must specify how many MPI processes and how many OMP threads per process via the mixed_mpi_procs and mixed_omp_threads keywords. For example, if numcores=8, mixed_mpi_procs=4 and mixed_omp_threads=2 then 4 MPI processes will be used with 2 OMP threads used per process, for a total of 8 utilized CPU cores. Note that ASH will give an error if numcores is not equal to mixed_mpi_procs*mixed_omp_threads.
Controlling the basis set
The primary purpose of using CP2K is probably to take advantage of the efficient mixed Gaussian and plane wave (GPW) approach where Gaussians are used to calculate the 1electron integrals and plane waves are used to calculate the 2electron integrals. Furthermore the user should specify whether the standard GPW (Gaussian and planewaves) or GAPW (Gaussian augmented GPW) method should be used. Pseudopotentialbased calculations can be performed with both methods, however, allelectron calculations can only be performed with GAPW. GAPW may have more stable forces and require reduced cutoff but may be more expensive.
Depending on whether GPW or GAPW is used, suitable basis set and pseudopotential information should be provided. This is controlled by defining the basis_dict and potential_dict keywords in the CP2KTheory object. The chosen basis sets and pseudopotentials must be available in the specified basis and potential files. For allelectron GAPW calculations one should set value for each element in the potential_dict to 'ALL'.
#Defining MOLOPT basis sets and GTH pseudopotentials for each element
basis_dict={'C':'DZVPMOLOPTSRGTH','O':'DZVPMOLOPTSRGTH','H':'DZVPMOLOPTSRGTH'}
potential_dict={'C':'GTHPBEq4','O':'GTHPBEq6','H':'GTHPBEq1'}
cp2k_object = CP2KTheory(basis_method='GPW', basis_dict=basis_dict,potential_dict=potential_dict,
potential_file='POTENTIAL', basis_file='BASIS',)
Note that if the specified basisfile or potentialfile is not in the current dir (or parent dir) then ASH will automatically copy a file containing GTH pseudopotentials (renamed from GTH_POTENTIALS to POTENTIAL) and MOLOPT basis sets (renamed from BASIS_MOLOPT to BASIS). This will only work if MOLOPT basis sets are being used. For all other basis sets, then the user must provide the basis and potential files.
For the planewave part of the basis set, the cutoff and rel_cutoff keywords can be used to control the cutoffs used. The number of grids also play a role in the accuracy of the calculation and can be controlled by the ngrids keyword (default is 4). Suitable cutoff values and grids require some experience or testing. See https://www.cp2k.org/howto:converging_cutoff for some information on how to choose cutoffs and grids. A reasonable value for the Cutoff is 250 Ry and a good value for the rel_cutoff is usually 60 Ry. These cutoff should be varied simultaneously. These are the ASH defaults but we don't have a lot of experience with CP2K so don't rely on these values. Some system setups (depends on elements, basis set and pseudopotential) may require larger values and other systems will run more efficiently with smaller values.
Periodic vs. nonperiodic calculations
CP2K is a code first and foremost developed for the purpose of periodic calculations. It is nonetheless possible to perform nonperiodic calculations and this is probably preferable for calculations on molecules in vacuum (to avoid PBC artifacts) and may also be beneficial for some QM/MM applications within ASH.
Regardless of whether the system is periodic or not, the system cell needs to be specified. The cell_dimensions (e.g. cell_dimensions=[10.0,10.0,10.0,90.0,90.0,90.0] or cell_vectors (cell_vectors=[[10.0,0.0,0.0],[0.0,10.0,0.0],[0.0,0.0,10.0]]) keywords should be used to define the box size. For nonperiodic calculations this is necessary as the basis set and solver are based on the box dimensions. However, if cell information is not provided, then by default a cell size will be automatically estimated (by ASH) based on the molecule size and the qm_cell_shift_par parameter will extend the box by an additional amount (6.0 Angstrom by default).
Nonperiodic calculations
For nonperiodic calculations, the CP2KTheory object should be defined with periodic=False, this is the default. The cell should be specified as described above. Poisson solver should also be specified with the psolver keyword. The default is 'wavelet' which is probably the most efficient for nonperiodic calculations. The accuracy of the solver can be controlled by the wavelet_scf_type keyword (see CP2Kmanualwavelet ). Another Poisson solver option is 'MT' (CP2KmanualMT ). In the case of the MT solver the cell should be at least 2 as large as the charge density (i.e. the molecule). The cell can be smaller for the wavelet solver.
Periodic calculations
For periodic calculations, the CP2KTheory object should be defined with periodic=True. The periodic_type is by default 'XYZ' (i.e. PBC in all directions). The cell size should be specified as described above. Poisson solver options are : 'PERIODIC', 'WAVELET', 'MULTIPOLE' or 'IMPLICIT'. The PERIODIC solver is recommended (only available for full 3D periodicity).
QM/MM
QM/MM calculations are possible in the ASH interface to CP2K. Unlike most other QMcodes, however, regular electrostatic embedding is not available for DFTmethods in CP2K so instead we use the GEEP (Gaussian Expansion of Electrostatic Potential) approach available in CP2K. This approach expands the MM pointcharges as Gaussians. The GEEP approach is overall an improvement over traditional electrostatic embedding as it should prevent chargeleakage onto MM atoms (electron spillout effect) which can be a larger problem for planewavebasis calculations. The GEEP approach, however, requires definition of radii on the MMatoms which control the width of the Gaussians used to expand the MM pointcharges.
To use CP2K as QMcode in an ASH QM/MM calculation one needs be aware of a few things:
The cell size must be specified (either cell_dimension or cell_vectors) but counterintuitively it needs to be specified for the whole system (QM+MM) and not just the QMpart as CP2K needs this information.
Additionally the QMcell size should be specified (where the electrons and basis sets are) and this should be a box encompassing the whole QMregion (slightly larger). The qm_cell_dims keyword can be used to specify this or alternatively ASH can also estimate the QMcell size based on the QMregion size and the qm_cell_shift_par extension parameter (default 6). If the Poisson solver is wavelet, the QMcell needs to be cubic (automatically done if the QMcell size is estimated from the QMregion).
A QM/MM job with CP2K in ASH can either be periodic or nonperiodic. For nonperiodic calculations it is recommended to use the wavelet Poisson solver.
For periodic QM/MM calculations, one should typically set: periodic=True and psolver='PERIODIC'
One then should specify the QM/MM electrostatic coupling. For DFT only the Gaussianbased GEEP approach is available (coupling='GAUSSIAN') while coupling='COULOMB' is available for semiempirical systems. GEEP can only be used with the wavelet or periodic Poisson solver (not 'MT'). The number of Gaussians used to expand each MMcenter is controlled by the GEEP_num_gauss keyword (default=6). The width of the Gaussians depends on the defined MMradius for each MM site which should vary according to the element. Element information of the MMregion is automatically passed onto CP2K and default MMradii will be used:
#Element radii in Angstrom (will be converted to Bohrs by CP2K)
element_radii_for_cp2k = {'H':0.44,'He':0.44,'Li':0.6,'Be':0.6,'B':0.78,'C':0.78,'N':0.78,'O':0.78,'F':0.78,'Ne':0.78,
'Na':1.58,'Mg':1.58,'Al':1.67,'Si':1.67,'P':1.67,'S':1.67,'Cl':1.67,'Ar':1.67,
'K':1.52,'Ca':1.6,'Sc':1.6,'Ti':1.6,'V':1.6,'Cr':1.6,'Mn':1.6,'Fe':1.6,'Co':1.6,
'Ni':1.6,'Cu':1.6,'Zn':1.6,'Br':1.6, 'Mo':1.7}
If the user wants to use different MM radii for each element, then this can be specified with the mm_radii keyword which should point to a dictionary containing radii for each element present in the system. It is also possible to use the MM_radius_scaling keyword to scale the radii by a factor (default=1.0).
ASH handles the creation of linkatoms and chargeshifting at a QMMM boundary and this information is provided to CP2K as a modified XYZfile. It is unclear whether the automatic dipolecorrection (addition of charges to maintain dipole), commonly employed in chargeshifted electrostatic embedding QM/MM is useful when combined with the GEEP procedure of CP2K. It is thus possible to turn it off with the dipole_correction keyword in the QMMMTheory object.
Examples
In the examples below it is assumed that the CP2K binaries are already in PATH (no need to use cp2kdir)
Minimal nonperiodic geometry optimization of MeOH in vacuum:
from ash import *
numcores=2
frag = Fragment(xyzfile="MeOH.xyz",charge=0, mult=1)
#Basis set and pseudopotential information per element
basis_dict={'C':'DZVPMOLOPTSRGTH','O':'DZVPMOLOPTSRGTH','H':'DZVPMOLOPTSRGTH'}
potential_dict={'C':'GTHPBEq4','O':'GTHPBEq6','H':'GTHPBEq1'}
#Minimal CP2KTheory definition: no periodicity, psolver=wavelet by default, basis_method='GAPW', cutoff=250,rel_cutoff=60
#Cell dimensions are estimated from molecule size
qm = CP2KTheory(cp2k_bin_name="cp2k.ssmp",basis_dict=basis_dict,potential_dict=potential_dict,functional='PBE',numcores=numcores)
#Geometry optimization
Optimizer(theory=qm, fragment=frag)
Periodic geometry optimization of MeOH in vacuum:
from ash import *
numcores=2
frag = Fragment(xyzfile="MeOH.xyz",charge=0, mult=1)
#Basis set and pseudopotential information per element
basis_dict={'C':'DZVPMOLOPTSRGTH','O':'DZVPMOLOPTSRGTH','H':'DZVPMOLOPTSRGTH'}
potential_dict={'C':'GTHPBEq4','O':'GTHPBEq6','H':'GTHPBEq1'}
#Periodic CP2KTheory definition with specified cell dimensions
qm = CP2KTheory(cp2k_bin_name="cp2k.ssmp",basis_dict=basis_dict,potential_dict=potential_dict,functional='PBE',numcores=numcores,
periodic=True,cell_dimensions=[10,10,10,90,90,90], psolver='periodic',basis_method='GPW', ngrids=4, cutoff=450, rel_cutoff=50)
#Geometry optimization
Optimizer(theory=qm, fragment=frag)
Nonperiodic QM/MM geometry optimization and frequencies of a protein
Here using the simple solvated lysozyme protein as a test system with a threonine sidechain in the QMregion.
from ash import *
numcores=4
#Defining path to dir containing forcefield files and coordinates
forcefielddir="./"
psffile=forcefielddir+"step3_pbcsetup.psf"
topfile=forcefielddir+"top_all36_prot.rtf"
prmfile=forcefielddir+"par_all36_prot.prm"
xyzfile=forcefielddir+"coordinates.xyz"
#Read coordinates from XYZfile
frag = Fragment(xyzfile=xyzfile)
#Creating OpenMM object from CHARMMfiles
openmmobject = OpenMMTheory(psffile=psffile, CHARMMfiles=True, charmmtopfile=topfile,
charmmprmfile=prmfile, periodic=True, periodic_cell_dimensions=[80.0, 80.0, 80.0, 90.0, 90.0, 90.0],
autoconstraints=None, rigidwater=False)
#CP2KTheory object
basis_dict={'C':'DZVPMOLOPTSRGTH','N':'DZVPMOLOPTSRGTH','O':'DZVPMOLOPTSRGTH','H':'DZVPMOLOPTSRGTH'}
potential_dict={'C':'GTHPBEq4','N':'GTHPBEq5', 'O':'GTHPBEq6','H':'GTHPBEq1'}
functional='PBE'
#cell_dimensions are for full system (slight expansion was necessary
#QMcell dimensions here defined manually
qm = CP2KTheory(basis_dict=basis_dict,potential_dict=potential_dict,functional=functional, psolver='wavelet', coupling='GAUSS',
periodic=False, cell_dimensions=[82.0, 82.0, 82.0, 90.0, 90.0, 90.0], qm_cell_dims=[12.0,12.0,12.0], numcores=numcores)
#act and qmatoms lists. Defines QMregion (atoms described by QM) and Activeregion (atoms allowed to move)
#IMPORTANT: atom indices begin at 0.
#Here selecting the sidechain of a threonine residue
qmatoms = [569,570,571,572,573,574,575,576]
actatoms = qmatoms
# Create QM/MM OBJECT by combining QM and MM objects above. Dipolecorrection turned off.
qmmmobject = QMMMTheory(qm_theory=qm, mm_theory=openmmobject, printlevel=2, dipole_correction=False,
fragment=frag, embedding="Elstat", qmatoms=qmatoms, qm_charge=0, qm_mult=1)
#Run geometry optimization using geomeTRIC optimizer and HDLC coordinates. Using QMregion as activeregion.
Optimizer(theory=qmmmobject, fragment=frag, ActiveRegion=True, actatoms=actatoms,
maxiter=200, coordsystem='hdlc', charge=0,mult=1)
#Partial numerical Hessian calculation
NumFreq(theory=qmmmobject, fragment=frag, hessatoms=actatoms)
Periodic QM/MM MD simulation of a protein
from ash import *
numcores=2
#Defining path to dir containing forcefield files and coordinates
forcefielddir="./"
psffile=forcefielddir+"step3_pbcsetup.psf"
topfile=forcefielddir+"top_all36_prot.rtf"
prmfile=forcefielddir+"par_all36_prot.prm"
xyzfile=forcefielddir+"coordinates.xyz"
#Read coordinates from XYZfile
frag = Fragment(xyzfile=xyzfile)
#Creating OpenMM object
openmmobject = OpenMMTheory(psffile=psffile, CHARMMfiles=True, charmmtopfile=topfile,
charmmprmfile=prmfile, periodic=True, periodic_cell_dimensions=[80.0, 80.0, 80.0, 90.0, 90.0, 90.0],
do_energy_decomposition=True, autoconstraints=None, rigidwater=False)
#QM
basis_dict={'C':'DZVPMOLOPTSRGTH','N':'DZVPMOLOPTSRGTH','O':'DZVPMOLOPTSRGTH','H':'DZVPMOLOPTSRGTH'}
potential_dict={'C':'GTHPBEq4','N':'GTHPBEq5', 'O':'GTHPBEq6','H':'GTHPBEq1'}
functional='PBE'
qm = CP2KTheory(basis_dict=basis_dict,potential_dict=potential_dict,functional=functional, psolver='periodic', coupling='GAUSS',
periodic=True, cell_dimensions=[82.0, 82.0, 82.0, 90.0, 90.0, 90.0], qm_cell_dims=[12.0,12.0,12.0], numcores=numcores)
#act and qmatoms lists. Defines QMregion (atoms described by QM) and Activeregion (atoms allowed to move)
#IMPORTANT: atom indices begin at 0.
#Here selecting the sidechain of threonine
qmatoms = [569,570,571,572,573,574,575,576]
actatoms = qmatoms
# Create QM/MM OBJECT by combining QM and MM objects above
qmmmobject = QMMMTheory(qm_theory=qm, mm_theory=openmmobject, printlevel=2, dipole_correction=False,
fragment=frag, embedding="Elstat", qmatoms=qmatoms, qm_charge=0, qm_mult=1)
#MD simulation with 0.5 fs timestep for 2 ps. enforcePeriodicBox=False to prevent OpenMM PBC wrapping
OpenMM_MD(fragment=frag, theory=qmmmobject, timestep=0.0005, simulation_time=2, traj_frequency=1, temperature=300,
integrator='LangevinIntegrator', coupling_frequency=1, enforcePeriodicBox=False)