docs/tutorial.rst from choderalab/yank

docs/tutorial.rst
Summary

Maintainability

Test Coverage

Issues
.. _tutorial:

YAML syntax tutorial
********************

This tutorial assumes you've :ref:`installed <installation>` YANK successfully.

We'll start by creating a YAML file for a hydration free energy calculation in explicit solvent, and then we'll add an
absolute binding free energy experiment in explicit and implicit solvent.

.. contents::

A first example: Hydration free energy
======================================

We'll go line-by-line through the sections of the following YAML file for YANK that sets up and run a hydration binding
free energy of a small molecule. The three dashes (``---``) is the standard way in YAML to indicate the beginning of a
YAML document.

.. code-block:: yaml

    ---

    options:
      verbose: yes
      minimize: yes
      default_timestep: 2.0*femtoseconds
      default_nsteps_per_iteration: 500
      temperature: 300*kelvin
      pressure: 1*atmosphere

    molecules:
      benzene:
        filepath: benzene.tripos.mol2
        antechamber:
          charge_method: bcc
        leap:
          parameters: [leaprc.gaff]

    solvents:
      water:
        nonbonded_method: PME
        nonbonded_cutoff: 9*angstroms
        clearance: 16*angstroms
        solvent_model: tip4pew
        leap:
          parameters: [leaprc.water.tip4pew]
      vacuum:
        nonbonded_method: NoCutoff

    systems:
      hydration-system:
        solute: benzene
        solvent1: water
        solvent2: vacuum
        leap:
          parameters: [leaprc.protein.ff14SB]

    protocols:
      hydration-protocol:
        solvent1:
          alchemical_path: auto
        solvent2:
          alchemical_path: auto

    experiments:
      system: hydration-system
      protocol: hydration-protocol


Prepare the molecules
"""""""""""""""""""""

Let's skip the ``options`` section for now and look at the ``molecules`` block (:ref:`docs <yaml_molecules_head>`). This
is the place where you can specify the pipeline that a molecule should go through before eventually being solvated or
combined with another molecule to form a complex.

.. code-block:: yaml

    molecules:
      benzene:
        filepath: benzene.tripos.mol2
        antechamber:
          charge_method: bcc
        leap:
          parameters: [leaprc.gaff]

The ``benzene`` keyword here is a string identifier so it can be any name we want. The nested ``filepath`` keyword points
to a ``mol2`` file specifying the molecule topology and initial coordinates of the atoms. The automatic pipeline in YANK
is based on `AmberTools <http://ambermd.org/AmberTools.php>`_ so you can determine charges with antechamber (in this case
AM1-BCC point charges) and assign the parameters to the molecule by loading any parameter file accessible to tleap.

The tleap parameters can be assigned in the ``systems`` section as well with identical effects, but when you have multiple
molecules and you want to assign different parameters to them, it is convenient to be able to specify the parameter file
to load in the ``molecules`` section.

Specify the solvent
"""""""""""""""""""

We'll turn off benzene's nonbonded interactions in explicit water and turn them back on in vacuum to close the
thermodynamic cycle so let's create two solvents.

.. code-block:: yaml

    solvents:
      water:
        nonbonded_method: PME
        nonbonded_cutoff: 9*angstroms
        clearance: 16*angstroms
        solvent_model: tip4pew
        leap:
          parameters: [leaprc.water.tip4pew]
      vacuum:
        nonbonded_method: NoCutoff

Again, ``water`` and ``vacuum` are just two arbitrary string identifiers. The ``vacuum`` solvent is quite simple. The
``water`` solvent we use Particle Mesh Ewald for long-range electrostatic treatment with a cutoff of 9A for both Coulomb
and Lennard-Jones interactions.

The ``clearance`` and ``solvent_model`` keywords instruct YANK to solvate the solute with TIP4P-Ew water molecules so that
the solute will be distant from the box boundaries by 16A. Finally, we ask leap to load the TIP4P-Ew parameters when solvating.
Again, this parameter file can be specified in the in ``systems`` section.

Prepare the systems
"""""""""""""""""""

Finally let's put everything together to create the systems in solvent and in vacuum that YANK will use to run the
hydration free energy calculation.

.. code-block:: yaml

    systems:
      hydration-system:
        solute: benzene
        solvent1: water
        solvent2: vacuum
        leap:
          parameters: [leaprc.protein.ff14SB]

``hydration-system`` is, as usual, an arbitrary identifier. Here we are instructing YANK to create two systems by
solvating ``benzene`` in ``solvent1`` (i.e. the ``water`` specified above) and in vacuum. YANK will use these two systems
to compute the free energy of transferring benzene from ``solvent1`` to ``solvent2``. Finally, we load the general amber
parameters that are required by tleap.

If we don't specify the parameters in the ``solvents`` and ``molecules`` sections, we can do it here.

.. code-block:: yaml

    systems:
      hydration-system:
        ...
        leap:
          parameters: [leaprc.protein.ff14SB, leaprc.water.tip4pew, leaprc.gaff]

Alchemical protocol
"""""""""""""""""""

The ``protocols`` section determines the intermediate states to sample during both phases of the the alchemical calculations.
YANK provides a method to determine this automatically by spacing the intermediate states equally in thermodynamic length.
We'll see later in the tutorial how you can specify an alchemical path manually instead.

.. code-block:: yaml

    protocols:
      hydration-protocol:
        solvent1:
          alchemical_path: auto
        solvent2:
          alchemical_path: auto

Experiment
""""""""""

This is where we associate a protocol to the system and tell YANK to run the hydration free energy calculation.

.. code-block:: yaml

    experiments:
      system: hydration-system
      protocol: hydration-protocol

Running options
"""""""""""""""

Finally, the ``options`` section contains some several parameters for the simulation (see :ref:`here <yaml-options-head>`
for a complete list).

.. code-block:: yaml

    options:
      verbose: yes
      minimize: yes
      default_timestep: 2.0*femtoseconds
      default_nsteps_per_iteration: 500
      default_number_of_iterations: 1000
      temperature: 300*kelvin
      pressure: 1*atmosphere

Here we're setting the verbosity of the output, and asking YANK to minimize the systems before running 1000 iterations
of Hamiltonian replica exchange. Each iteration, by default, perform a Monte Carlo rigid translation and rotation of the
ligand followed, in this case, by 1ps of Langevin dynamics (with 2fs time step) keeping the temperature at 300K. The
pressure is controlled by a Monte Carlo barostat.

Since it is possible to specify multiple free energy calculations in a single YAML file, this section is used to keep
the options to use as default, but they can be overwritten by other parts of the YAML document as well. For example, the
``options`` block can be added in the experiment specification to overwrite some or all of the default values.

.. code-block:: yaml

    experiments:
      system: hydration-system
      protocol: hydration-protocol
      options:
        temperature: 310*kelvin
        default_nsteps_per_iteration: 1000


Add an absolute binding free energy experiment
==============================================

We mentioned that it's possible to specify multiple experiments in a single YAML file. Let's add an absolute binding free
energy calculation.

Preparing protein receptors
"""""""""""""""""""""""""""

We start as before by preparing the receptor.

.. code-block:: yaml

    molecules:
      ...
      t4-lysozyme:
        filepath: input/t4.pdb

In this case, we don't specify leap parameters since we'll add them to the system.

.. note::

    Since the PDB file goes through tleap, your PDB file should adopt the residue names consistent with their protonation state as expected by tleap (e.g., CYS for protonated Cysteine, CYM for deprotonated, and CYX for cysteines forming a disulphide bridge).


Add buffer and neutralizing ions
""""""""""""""""""""""""""""""""

In this case, we want to simulate a particular buffer ionic strength so we create a new solvent.

.. code-block:: yaml

    solvents:
      ...
      buffer:
        nonbonded_method: PME
        nonbonded_cutoff: 11*angstroms
        clearance: 12*angstroms
        positive_ion: Na+
        negative_ion: Cl-
        ionic_strength: 100*millimolar
        solvent_model: tip3p
        leap:
          parameters: [leaprc.water.tip3p]

By adding the three new keywords ``positive_ion``, ``negative_ion``, and ``ionic_strength``, YANK will add neutralizing
and buffer Na+ and Cl- ions.

.. note::

    If the molecule that is alchemically decoupled has a net charge, YANK will automatically select a random counterion to decouple together with the ligand.

Create receptor ligand system
"""""""""""""""""""""""""""""

Now we create the system by combining the ``t4-lysozyme`` and ``benzene`` to form a complex.

.. code-block:: yaml

    systems:
      ...
      t4-benzene:
        receptor: t4-lysozyme
        ligand: benzene
        solvent: buffer
        leap:
          parameters: [leaprc.protein.ff14SB, leaprc.gaff2, leaprc.water.tip4pew]

This time instead of ``solute``, ``solvent1``, and ``solvent2`` we have ``receptor``, ``ligand``, and ``solvent``. An
experiment using this system will decouple the ``ligand`` in complex and turn back on the interaction in bulk.

Restraints
""""""""""

Absolute binding calculations often require a restraint to keep the ligand in the binding pocket.

.. code-block:: yaml

    protocol:
      ...
      absolute-binding:
        complex:
          alchemical_path:
            lambda_electrostatics: [1.00, 1.00, 1.00, 1.00, 1.00, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00]
            lambda_sterics:        [1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.00]
            lambda_restraints:     [0.00, 0.25, 0.50, 0.75, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00]
        solvent:
          alchemical_path: auto

    experiment-t4-benzene:
      system: t4-benzene
      protocol: absolute-binding
      restraint:
        type: Harmonic

    experiment-hydration:
      ...

    experiments: [experiment-t4-benzene, experiment-hydration]

There are three main changes here. First, notice how the ``experiments`` section now points to a list of experiments.
This is necessary since we are using the same YAML file to run multiple experiments. Secondly, the ``experiment-t4-benzene``
block contains a ``restraint.type`` keyword that we can be used to specify the type of restraint we want to use. Other available
restraints are ``FlatBottom``, ``Boresch``, and ``PeriodicTorsionBoresch``.

Finally, we've used a manual protocol instead of ``auto`` to control the intermediate states of the Hamiltonian replica
exchange simulation of the ``complex`` phase. The protocol here first turns on the harmonic restraint (``lambda_restraints``
goes from 0.0 to 1.0), then turn off charges (``lambda_electrostatics`` from 1.0 to 0.0), and finally the Lennard-Jones
interactions are decoupled (``lambda_sterics``).

YANK uses a heuristic to find the parameters of the restraint, but it is possible to specify them manually.

.. code-block:: yaml

    experiment-t4-benzene:
      system: t4-benzene
      protocol: absolute-binding
      restraint:
        type: Harmonic
        spring_constant: 0.2*kilocalories_per_mole/(angstrom**2)
        restrained_receptor_atoms: (resi 77 or resi 90 or resi 98 or resi 110) and (mass > 1.5)
        restrained_ligand_atoms: (resname MOL) and (mass > 1.5)

This example manually set the spring constant, and attach the harmonic restraint to the centroid of the heavy atoms
identified by the `MDTraj DSL <http://mdtraj.org/latest/atom_selection.html>`_ selection expression. In general, ``type``
is the name of a class, and the keywords that is possible to specify to configure the restraints are its constructor
parameters. Check the `Python API documentation <http://getyank.org/0.23.4/api/restraints_api.html#yank.restraints.Harmonic>`_
for a complete overview.

.. rst-class:: html-toggle

The full script after the changes
"""""""""""""""""""""""""""""""""

.. code-block:: yaml

    ---

    options:
      verbose: yes
      minimize: yes
      default_timestep: 2.0*femtoseconds
      default_nsteps_per_iteration: 500
      temperature: 300*kelvin
      pressure: 1*atmosphere

    molecules:
      benzene:
        filepath: benzene.tripos.mol2
        antechamber:
          charge_method: bcc
        leap:
          parameters: [leaprc.gaff]
      t4-lysozyme:
        filepath: input/t4.pdb

    solvents:
      water:
        nonbonded_method: PME
        nonbonded_cutoff: 9*angstroms
        clearance: 16*angstroms
        solvent_model: tip4pew
        leap:
          parameters: [leaprc.water.tip4pew]
      vacuum:
        nonbonded_method: NoCutoff
      buffer:
        nonbonded_method: PME
        nonbonded_cutoff: 11*angstroms
        clearance: 12*angstroms
        positive_ion: Na+
        negative_ion: Cl-
        ionic_strength: 100*millimolar
        solvent_model: tip3p
        leap:
          parameters: [leaprc.water.tip3p]

    systems:
      hydration-system:
        solute: benzene
        solvent1: water
        solvent2: vacuum
        leap:
          parameters: [leaprc.protein.ff14SB]
      t4-benzene:
        receptor: t4-lysozyme
        ligand: benzene
        solvent: buffer
        leap:
          parameters: [leaprc.protein.ff14SB, leaprc.water.tip4pew]

    protocols:
      hydration-protocol:
        solvent1:
          alchemical_path: auto
        solvent2:
          alchemical_path: auto
      absolute-binding:
        complex:
          alchemical_path:
            lambda_electrostatics: [1.00, 1.00, 1.00, 1.00, 1.00, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00]
            lambda_sterics:        [1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.00]
            lambda_restraints:     [0.00, 0.25, 0.50, 0.75, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00]
        solvent:
          alchemical_path: auto

    experiment-t4-benzene:
      system: t4-benzene
      protocol: absolute-binding
      restraint:
        type: Harmonic

    experiment-hydration:
      system: hydration-system
      protocol: hydration-protocol

    experiments: [experiment-t4-benzene, experiment-hydration]


Combinatorial experiments: implicit and explicit solvents
=========================================================

We can use the ``!Combinatorial`` keyword in many sections to generate multiple experiments from a single YAML file while
maintaining the representation of our computational experiment compact. For example, let's assume we want to compare the
effect of multiple restraint types and the accuracy of the binding free energy estimate changes with TIP3P and the GBSA
implicit solvent model.

.. code-block:: yaml

    solvents:
      ...
      implicit:
        nonbonded_method: NoCutoff
        implicit_solvent: OBC2
        implicit_solvent_salt_conc: 100*millimolar

    systems:
      ...
      t4-benzene:
        receptor: t4-lysozyme
        ligand: benzene
        solvent: !Combinatorial [buffer, implicit]
        leap:
          parameters: [leaprc.protein.ff14SB, leaprc.water.tip4pew]

    experiment-t4-benzene:
      system: t4-benzene
      protocol: absolute-binding
      restraint:
        type: !Combinatorial [FlatBottom, Harmonic, Boresch]

Here we have defined a new solvent that uses the GBSA/OBC2 model, and we have changed the ``systems.t4-benzene.solvent``
keyword from simply ``buffer`` to ``!Combinatorial [buffer, implicit]``. We've also added ``!Combinatorial`` in the
``experiment-t4-benzene`` section, this time specifying several restraint types. When YANK parses ``experiment-t4-benzene``,
it generates 6 different binding free energy calculations using 3 different restraints and 2 different solvent models.

.. rst-class:: html-toggle

The full script after the changes
"""""""""""""""""""""""""""""""""

.. code-block:: yaml

    ---

    options:
      verbose: yes
      minimize: yes
      default_timestep: 2.0*femtoseconds
      default_nsteps_per_iteration: 500
      temperature: 300*kelvin
      pressure: 1*atmosphere

    molecules:
      benzene:
        filepath: benzene.tripos.mol2
        antechamber:
          charge_method: bcc
        leap:
          parameters: [leaprc.gaff]
      t4-lysozyme:
        filepath: input/t4.pdb

    solvents:
      water:
        nonbonded_method: PME
        nonbonded_cutoff: 9*angstroms
        clearance: 16*angstroms
        solvent_model: tip4pew
        leap:
          parameters: [leaprc.water.tip4pew]
      vacuum:
        nonbonded_method: NoCutoff
      buffer:
        nonbonded_method: PME
        nonbonded_cutoff: 11*angstroms
        clearance: 12*angstroms
        positive_ion: Na+
        negative_ion: Cl-
        ionic_strength: 100*millimolar
        solvent_model: tip3p
        leap:
          parameters: [leaprc.water.tip3p]
      implicit:
        nonbonded_method: NoCutoff
        implicit_solvent: OBC2
        implicit_solvent_salt_conc: 100*millimolar

    systems:
      hydration-system:
        solute: benzene
        solvent1: water
        solvent2: vacuum
        leap:
          parameters: [leaprc.protein.ff14SB]
      t4-benzene:
        receptor: t4-lysozyme
        ligand: benzene
        solvent: !Combinatorial [buffer, implicit]
        leap:
          parameters: [leaprc.protein.ff14SB, leaprc.water.tip4pew]

    protocols:
      hydration-protocol:
        solvent1:
          alchemical_path: auto
        solvent2:
          alchemical_path: auto
      absolute-binding:
        complex:
          alchemical_path:
            lambda_electrostatics: [1.00, 1.00, 1.00, 1.00, 1.00, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00]
            lambda_sterics:        [1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.00]
            lambda_restraints:     [0.00, 0.25, 0.50, 0.75, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00]
        solvent:
          alchemical_path: auto

    experiment-t4-benzene:
      system: t4-benzene
      protocol: absolute-binding
      restraint:
        type: !Combinatorial [FlatBottom, Harmonic, Boresch]

    experiment-hydration:
      system: hydration-system
      protocol: hydration-protocol

    experiments: [experiment-t4-benzene, experiment-hydration]

Changing sampling method
========================

By default, YANK uses a combination of Monte Carlo rigid displacement/rotation of the ligand and Langevin dynamics to
propagate the replicas of the Hamiltonian Replica Exchange algorithm, but both the propagation and the enhanced sampling
algorithm can be configured.

.. code-block:: yaml

    mcmc_moves:
      langevin:
        type: LangevinSplittingDynamicsMove
        timestep: 4.0*femtosecond
        n_steps: 1250
        collision_rate: 1.0/picosecond
        reassign_velocities: no
        splitting: 'V R O R V'
        n_restart_attempts: 4

    samplers:
      repex:
        type: ReplicaExchangeSampler
        mcmc_moves: langevin
        number_of_iterations: 100000
        online_analysis_interval: 10
      sams:
        type: SAMSSampler
        mcmc_moves: langevin
        state_update_scheme: global-jump
        gamma0: 10.0
        flatness_threshold: 10.0
        number_of_iterations: 100000
        online_analysis_interval: 1000

The ``mcmc_moves`` block allows to specify the constructor of any ``MCMCMove`` object available in the ``openmmtools.mcmc``
module (`see API docs there <https://openmmtools.readthedocs.io/en/latest/mcmc.html>`_). The ``type`` keyword is used to
indicate the class name of the ``MCMCMove``, and the other keywords are passed to its constructor. Note that ``timestep``
and ``n_steps`` overwrite the global options ``default_timestep`` and ``default_nsteps_per_iteration`` respectively
defined above.

The ``samplers`` block has a similar structure, but you can invoke the constructor of any sampler in the ``yank.multistate``
package. Similarly, the ``number_of_iterations`` overwrites the global option ``default_number_of_iterations``. Note also
that we can pass to the ``samplers.SAMPLER_ID.mcmc_moves`` keyword any identifier of the ``mcmc_moves`` section.


Starting from different input files
===================================

Files with multiple molecules
"""""""""""""""""""""""""""""

If you have input files describing multiple molecules, you can define ``molecules`` by selecting all or a subset of them.

.. code-block:: yaml

    molecules:
      t4-lysozyme:
        filepath: input/t4-frames.pdb
        select: 0
      binders:
        filepath: input/binders.mol2
        antechamber:
          charge_method: bcc
        select: all

The example above pick the first structure in the ``t4-frames.pdb`` file to define the ``t4-lysozyme`` molecule, while
it reads all the molecules in the ``binders.mol2`` file to generate the binders. In the latter case, YANK will generate
as many experiments as the number of molecules in ``binders.mol2``.

.. note::

    Selecting mol2 and sdf files requires an installation of the OpenEye library.

To select a subset of the structures in a single file you can do

.. code-block:: yaml

    molecules:
      t4-lysozyme:
        filepath: input/t4-frames.pdb
        select: !Combinatorial [0, 1, 4, 5, 8]

SMILES
""""""

If you have OpenEye installed, then you can also generate small molecules from their SMILES representation.

.. code-block:: yaml

    molecules:
      ...
      benzene:
        smiles: c1ccccc1
        openeye:
          quacpac: am1-bcc
        antechamber:
          charge_method: null
      binders:
        filepath: input/L99A-binders.csv
        openeye:
          quacpac: am1-bcc
        antechamber:
          charge_method: null
        select: all

    systems:
      t4-benzene:
        receptor: t4-lysozyme
        ligand: benzene
        pack: yes
        ...

Here we define a ``benzene`` molecules passing directly its SMILES representation to the ``smiles`` keyword. It is also
possible to load a CSV file defining multiple SMILES molecule to generate combinatorial experiments. The charges for
these molecules are assigned using the OpenEye's AM1-BCC charge scheme rather than antechamber.

Finally, note the ``pack: yes`` keyword in the definition of the system. Since no coordinates are specified by the SMILES
representation, the molecule is placed somewhat randomly in the solvation box. Setting the ``pack`` option brings receptor
and ligand close before generating the system with tleap. This is very helpful in explicit solvent since otherwise tleap
may generate a huge solvation box to include both molecules.

Extra directives for tleap
""""""""""""""""""""""""""

In special cases, you may want to add more directives for tleap. For example, you may want to use tleap to rename some
atoms that are not recognized by the force field you want to load. You can do so by creating a ``leaprc.renameatoms``
including the directive to inject into the tleap script

.. code-block:: none

    addPdbAtomMap {
      { 1DH6 DH61 }
      { 2DH6 DH62 }
    }

and then load it normally together with the leap parameters in the YAML file

.. code-block:: yaml

    molecules:
      t4-lysozyme:
        filepath: input/t4.pdb
        leap:
          parameters: [leaprc.renameatoms]

When the automatic pipeline won't do
""""""""""""""""""""""""""""""""""""

The easiest way to set up a free energy calculation in YANK is to use the automatic pipeline based on AmberTools, but
sometimes that is not flexible enough. In this case, you can use any other program to generate parameter/coordinates files
in Amber or Gromacs format or and XML representation of an OpenMM system.

.. code-block:: yaml

  solvents:
      pme:
        nonbonded_method: PME
        nonbonded_cutoff: 11*angstroms
      vacuum:
        nonbonded_method: NoCutoff

    systems:
      t4-benzene:
        phase1_path: [complex.top, complex.gro]
        phase2_path: [solvent.top, solvent.gro]
        ligand_dsl: resname MOL
        solvent: pme
        gromacs_include_dir: include/
      hydration-system:
        phase1_path: [solvent1.prmtop, solvent1.inpcrd]
        phase2_path: [solvent2.prmtop, solvent2.inpcrd]
        solvent1: pme
        solvent2: vacuum

The ``t4-benzene`` system now loads parameter/coordinates files in Gromacs format for the calculation of the ligand in
complex and in bulk. The ligand is identified by the ``ligand_dsl`` MDTraj DSL selection string. Note that the ``solvent``
must still be specified to indicate the long-range interactions treatment.

When ``ligand_dsl`` is not specified, YANK alchemically modifies the whole solute, which is used in ``hydration-system``
to perform a hydration free energy calculation.

.. note::

    If your ligand/solute has a net charged and you're using PME, your systems should contain enough ions to neutralizing. YANK will decouple them together with the system to maintain a neutral solvation box.