alchemical-analysis.py not computing BAR/MBAR results #97

mqbpksi2 · 2017-01-26T11:23:38Z

Dear Mobley lab,

I am trying to run the alchemical-analysis tool on a set of output files from Amber TI simulations. The tool successfully computes dG, but fails to compute BAR/MBAR results and output any plots.

My command line input:

alchemical_analysis -a AMBER -p l-prod-0.[1-9]-0 -q out -o . -r 5 -u kcal -s 0 -g -w

The output from alchemical_analysis:

Started on Thu Jan 26 11:12:41 2017
Command line was: /usr/local/bin/alchemical_analysis -a AMBER -p l-prod-0.[1-9]-0 -q out -o . -r 5 -u kcal -s 0 -g -w

Loading in data from ./l-prod-0.4-0.out... 1000 data points, 1 DV/DL averages
Loading in data from ./l-prod-0.6-0.out... 1000 data points, 1 DV/DL averages
Loading in data from ./l-prod-0.7-0.out... 1000 data points, 1 DV/DL averages
Loading in data from ./l-prod-0.5-0.out... 1000 data points, 1 DV/DL averages
Loading in data from ./l-prod-0.8-0.out... 1000 data points, 1 DV/DL averages
Loading in data from ./l-prod-0.9-0.out... 1000 data points, 1 DV/DL averages
Loading in data from ./l-prod-0.1-0.out... 1000 data points, 1 DV/DL averages
Loading in data from ./l-prod-0.3-0.out... 1000 data points, 1 DV/DL averages
Loading in data from ./l-prod-0.2-0.out... 1000 data points, 1 DV/DL averages

Sorting input data by starting time

Note: BAR/MBAR results are not computed.

Skipping first 0 steps (= 0.000000 ps)

Note: extrapolating missing gradient for lambda = 0.0: 58.481188
Note: extrapolating missing gradient for lambda = 1.0: -127.351642

The gradients (DV/DL) from the correlated samples (kcal/mol):

Lambda gradient

0.00000 58.481
0.10000 15.875
0.20000 -5.820
0.30000 -10.537
0.40000 -19.724
0.50000 -29.159
0.60000 -36.760
0.70000 -54.858
0.80000 -57.177
0.90000 -71.284
1.00000 -127.352

dG = -30.388

The correlated gradient (DV/DL) components from every_single step (kcal/mol):

Lambda BOND ANGLE DIHED 1-4 NB 1-4 EEL VDWAALS EELEC RESTRAINT

0.10000 2.512 0.854 -0.093 0.426 -16.137 3.537 24.232 0.000
0.20000 2.186 0.694 -0.095 0.420 -16.186 3.245 3.514 0.000
0.30000 1.932 0.580 -0.151 0.401 -16.272 5.021 -2.080 0.000
0.40000 1.835 0.351 -0.116 0.388 -16.119 3.152 -9.142 0.000
0.50000 1.758 0.577 -0.152 0.433 -16.059 -0.498 -15.740 0.000
0.60000 1.534 0.080 -0.400 0.313 -16.274 0.016 -21.656 0.000
0.70000 1.290 0.134 -0.126 0.372 -16.155 -9.769 -30.628 0.000
0.80000 1.085 -0.168 -0.227 0.340 -16.201 -0.396 -42.075 0.000
0.90000 0.898 -0.210 -0.134 0.353 -16.175 5.599 -61.267 0.000

dG = 1.606 0.283 -0.125 0.360 -16.266 -1.093 -14.374 0.000

WARNING: Found all 0 in input data.
Generating results.txt with all 0
Exiting...

The .out files can be found here: https://www.dropbox.com/sh/i4c44d0yrid1a9u/AABjyJjOMBykC9FWefZYCMNGa?dl=0

I have pymbar installed. Could it be that it's because these were generated by sander or because I'm using a lambda interval from 0.1 to 0.9? Any help and advice would be greatly appreciated.

Sincerely,

Stefan

@davidlmobley

halx · 2017-01-26T11:50:40Z

Yes, this is because sander cannot sample the end points. I could probably change the code such that BAR/MBAR free energies are calculated for the existing lambdas only (I assume pymbar would be fine with this). The user could then plot the free energy of each window vs. lambda and find a suitable extrapolation scheme (see discussion in a different thread of the issue tracker). The practical problem though is that sander provides MBAR energies only for equally spaced lambdas which limits what the minimum and maximum lambdas can be.

…

On 26 January 2017 at 11:29, mqbpksi2 ***@***.***> wrote: @davidlmobley <https://github.com/davidlmobley> — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#97 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AH3aWjbHlcPXBWmi4yP43HFQBHj4DA0Bks5rWIOAgaJpZM4LuiP6> .

davidlmobley · 2017-01-26T14:03:46Z

Yes, I'm with @halx - a key question in this case is what the expected behavior would be. If you're running a free energy calculation and you're lacking data at the endpoints, what OUGHT BAR and MBAR to give you?

In the other thread, we discussed extrapolation, but one view was that this is a bad idea since it will be difficult to know when it will break.

Historically, this has been why AMBER analysis of these types of simulations has been only with TI.

mqbpksi2 · 2017-01-26T14:07:37Z

Many thanks for the feedback.

Best wishes,

Stef

danmermelstein · 2017-04-17T23:50:40Z

Hi,

I got the same error, but I was using the pmemd version of TI, which does compute endpoints. The analysis worked when I reran my simulations with MBAR on. @halx do you know if this code is only designed to work when MBAR is on? I'm just wondering because right now the GPU version of the code I have only supports TI, and for the time being it would be nice to be able to use this code for analysis/testing

Thanks!

Dan

P.S. Command line was: /net/home/dmermels/SOFTWARE/alchemical-analysis/alchemical_analysis/alchemical_analysis.py -a AMBER -d . -p [01].*/charge.gas_0-5ns -q out -m -mbar-bar -r 5 -o . -u kcal -s 0 -g

davidlmobley · 2017-04-17T23:53:53Z

Could you provide inputs, @danmermelstein ?

danmermelstein · 2017-04-18T00:01:18Z

Sure! Let me know if anything else is missing

Thanks!

mermelstein_alchemical_analysis_inputs.tar.gz

halx · 2017-04-18T08:56:23Z

Hi Dan,

I'm not sure what exactly the problem is. When I run your command with your data I get the output just fine. Only "issue" is that the tool also computes zero results for both BAR and MBAR. If you use "-m ti+ti-cubic" you can avoid that.

You have explicitly switch off MBAR reporting with ifmbar=0. The message you see is just a warning and can safely be ignored.

Cheers,
Hannes.

danmermelstein · 2017-04-20T21:26:04Z

Hi @halx,

Thanks for looking into this! What version of python and pymbar are you using?

Thanks!

Dan

python 2.7.13

halx · 2017-04-21T07:47:13Z

Python 2.7.9 and pymbar 3.0.0

…

On 20 April 2017 at 22:26, danmermelstein ***@***.***> wrote: Hi @halx <https://github.com/halx>, Thanks for looking into this! What version of python and pymbar are you using? Thanks! Dan python 2.7.13 — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#97 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AH3aWlR7jO0AhUGjHRCG_w16n-uv5hbKks5rx81tgaJpZM4LuiP6> .

davidlmobley · 2017-04-21T16:19:16Z

@danmermelstein - LMK if you think it will be helpful for me to try too.

danmermelstein · 2017-04-21T18:34:18Z

Hi Hannes and David,

I tried pymbar 3.0.0 and no luck, so I just disabled the system exit following the warning like Hannes said and I get the correct results. I'll post here if I figure out why that was happening, but may have just been an unnecessary exit.

Either way, problem solved, thanks for all of your help!

Dan

mqbpksi2 · 2017-04-23T10:04:06Z

Hello all, Thanks for cc-ing me and letting me know about this, much appreciated. Best wishes, Stef

…

________________________________ From: danmermelstein [[email protected]] Sent: Friday, April 21, 2017 7:34 PM To: MobleyLab/alchemical-analysis Cc: Stefan Ivanov; Author Subject: Re: [MobleyLab/alchemical-analysis] alchemical-analysis.py not computing BAR/MBAR results (#97) Hi Hannes and David, I tried pymbar 3.0.0 and no luck, so I just disabled the system exit following the warning like Hannes said and I get the correct results. I'll post here if I figure out why that was happening, but may have just been an unnecessary exit. Either way, problem solved, thanks for all of your help! Dan — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub<#97 (comment)>, or mute the thread<https://github.com/notifications/unsubscribe-auth/AYMI_dYpKrCtUUnAqPpStgt56BCxyXGkks5ryPargaJpZM4LuiP6>.

jchodera · 2017-04-23T13:13:13Z

MBAR should absolutely be able to extrapolate to the endpoints even if they aren't sampled. Let me know if you folks need any help on the MBAR interface end of things.

Yes, this is because sander cannot sample the end points.

I believe this has was fixed with the introduction of softcore Lennard-Jones, which should be absolutely required for modern alchemical free energy calculations. Neglecting endpoint sampling in non-softcore free energy calculations has been shown to be highly problematic and has not been part of recommended best practices for many years now (nearly a decade).

halx · 2017-04-23T15:54:03Z

This is really an annoying implementation issue with sander. It does not allow you to set lambda < 0.005 or > 0.995. @danmermelstein can probably add more information. It is not an issue with pmemd though.

In this context, being able to estimate the non-sampled end-points would obviously be very interesting. Except, that I often get wrong results with MBAR (tested with pmemd result) that is the free energy can be very different from TI or BAR.

jchodera · 2017-04-23T16:38:31Z

Except, that I often get wrong results with MBAR (tested with pmemd result) that is the free energy can be very different from TI or BAR.

Can you elaborate on this? We find this universally indicates an issue with either (1) a problem in the thermodynamic self-consistency of your data, or (2) actual bias in the calculation that is impossible to detect via TI or BAR. TI allows you to miss near-singularities that hide the primary contributions to your integrals at the endpoints beyond or between the lambda values you evaluate, while one-sided BAR reduces to EXP, which has an enormous problem with bias being as large as the statistical error.

jchodera · 2017-04-23T16:39:05Z

In other words, and not meaning to mince words, if MBAR is giving very different results from TI or BAR, something is totally fucked up with your calculation.

halx · 2017-04-23T17:26:43Z

Thanks for the comment. I can provide more info and also data this week.

halx · 2017-04-24T07:51:50Z

I have uploaded the data for a simulation mutating ethane to methanol in water (see link below). The transformation has been split into charge only contributions and vdW+bonded contributions. The charge transformations seems to agree in the total dG in all methods (TI, TI-cubic, BAR, MBAR) but MBAR differs in almost all windows. The vdW+bonded transformation is just a mess... We have other simulations with small molecules but results are very similar with MBAR disagreeing. MBAR only agrees when the dH/dl gradients are a constant function over lambda.

https://www.dropbox.com/s/vbuonmoo42e69ye/ethane~methanol_water.tar.xz?dl=0

jchodera · 2017-04-24T12:59:36Z

Thanks! I've just downloaded the files.

Just to be clear, when you say you've split the transformation into contributions, what do you mean precisely? Do you break this into stages where you first change only the charges from ethane to methanol in the charge branch, and then change the vdW+bonded from ethane to methanol parameters in the vdw branch? Setting ifsc=1 I believe means that both Lennard-Jones and electrostatics are softened at intermediate lambda values even for the charge branch if just the charges are changing, and presumably the dummy atoms must also have nonzero Lennard-Jones repulsion to ensure the solvent does not overlap with it.

halx · 2017-04-24T13:29:50Z

Yes, in stages as you say. So, charges only.

Yes, ifsc=1 is required to make use of softcore potentials but they also require scmask for thedefinition of the dummy atoms. In my particular case ifsc=1 is really an accident but this has no further consequences because there are no dummy atoms.

Many thanks, for looking into this.

jchodera · 2017-04-24T13:36:17Z

How is it that there are no dummy atoms? An ethane-to-methanol transformation (CH3-CH3 to CH3-OH) would necessarily turn two of the hydrogens into dummy atoms, would it not?

halx · 2017-04-24T13:44:40Z

No dummy atoms for the charge only transformation. So here we really transform between the same molecule (same vdW and same bonded parameters) but with the charges of the respective end states (zero charges for "will-be" dummy atoms).

For the vdw+bonded transformation dummy atoms are defined for the two "vanishing" hydrogen atoms, see scmask1 = ':1@H7,H8' and ifsc=1. As charges are zero on these hydrogens already, only the LJ softcore will be active.

jchodera · 2017-04-24T13:56:19Z

No dummy atoms for the charge only transformation. So here we really transform between the same molecule (same vdW and same bonded parameters) but with the charges of the respective end states (zero charges for "will-be" dummy atoms).

Ah, excellent, thanks!

For the vdw+bonded transformation dummy atoms are defined for the two "vanishing" hydrogen atoms, see scmask1 = ':1@H7,H8' and ifsc=1. As charges are zero on these hydrogens already, only the LJ softcore will be active.

Thanks for the clarification!

davidlmobley · 2017-04-24T17:59:19Z

@jchodera :

On this:

MBAR should absolutely be able to extrapolate to the endpoints even if they aren't sampled. Let me know if you folks need any help on the MBAR interface end of things.

Yes, this is because sander cannot sample the end points.

I believe this has was fixed with the introduction of softcore Lennard-Jones, which should be absolutely required for modern alchemical free energy calculations. Neglecting endpoint sampling in non-softcore free energy calculations has been shown to be highly problematic and has not been part of recommended best practices for many years now (nearly a decade).

Part of the issue here is that people might be changing a variety of other things aside from Lennard-Jones and Coulomb (adding or removing restraints, adding or removing bonds, etc.), and doing it in a way which (is perhaps) very non-ideal. Since we aren't responsible for what data they are feeding in, and in fact have no idea whether they generated it by doing something sensible, our thought was that it did not make sense to do any type of extrapolation even though it certainly could be done.

We discussed this in a separate thread (#66 ) and Michael Shirts used this analogy:

I think that any sort of extrapolation is a bad idea. It's new functionality that allows users to cut off their own feet. The error analysis gets to be really horrific. There's really no good reason to extrapolate in a generally distributed code.

I really would rather not let people cut off their feet if it can be avoided.

Now, if the AMBER folks have a way of parsing their free energy files (which I haven't worked with) to allow us to identify cases where extrapolation should be relatively "well behaved" and turn it on only in those cases, then I'm totally fine with doing that (and we should get alchemlyb headed in that direction). But otherwise I just don't think we should be doing it because people will use it to cut off their feet.

danmermelstein · 2017-04-24T19:45:24Z

@davidlmobley yeah I don't know of any tools right now that can check for "well behaved" systems in AMBER.

The easier thing would probably be just to update sander to use the same bonded potential as is in pmemd?

mrshirts · 2017-04-24T19:48:28Z

It's not easier for us, but clearly having everything in AMBER use the same functional form would be a good think for AMBER if it's low(er) hanging fruit!

…

On Mon, Apr 24, 2017 at 1:45 PM, danmermelstein ***@***.***> wrote: @davidlmobley <https://github.com/davidlmobley> yeah I don't know of any tools right now that can check for "well behaved" systems in AMBER. The easier thing would probably be just to update sander to use the same bonded potential as is in pmemd? — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#97 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AEE31IHmWDd0mVM3JFfmuXG5TmbiQvrMks5rzPvUgaJpZM4LuiP6> .

danmermelstein · 2017-04-24T19:51:35Z

is the pmemd bonded functional form a problem?

davidlmobley · 2017-04-24T19:56:13Z

I think @mrshirts just means that we don't particularly care what functional form AMBER is using, at some level (it doesn't make life easier for us one way or the other), though it WOULD certainly simplify the life of people who have to write the parsers for AMBER data and worry about whether the data is coming from pmemd or sander. So it seems like a good thing to me. Otherwise you have to worry that people might be able to get one set of answers in sander and another in pmemd, etc.

danmermelstein · 2017-04-24T19:57:18Z

ah gotcha, that makes sense thanks!

danmermelstein · 2017-04-24T20:10:25Z

yeah that is probably something that should happen. I don't know much about sander so I can't say there isn't some other issue preventing us from changing the bonded functional form, but I doubt it. I also don't want to commit myself to making the changes just because I have no idea if/when I'll have some extra time, but certainly I'll look into getting it done, if not by me then someone else.

mqbpksi2 · 2017-05-22T12:02:10Z

Sorry if this is a little off topic, but did someone say that TI can now run on GPUs? If so, which version of Amber is this? 16?

davidlmobley · 2017-05-22T18:47:34Z

You want @danmermelstein I think.

danmermelstein · 2017-05-24T23:02:24Z

Sorry for the lag, @mqbpksi2 I have a patch over AMBER 16 I can give you if you're interested?

mqbpksi2 · 2017-06-07T09:24:06Z

Hi @danmermelstein,

Sorry for the late reply. Thanks for the offer; I don't need Amber16 at the moment; I was just wondering if it's possible, in principle, to run TI on GPUs now. This issue has been pending for a while, I believe. Does anyone know if there are plans to implement per-residue decomposition in TI in pmemd as well?

mqbpksi2 · 2017-06-09T13:37:39Z

The -s flag skips entries before a predefined time limit. Analysis is then carried out past that time point till the end of the input file(s). My question is "Is there a corresponding flag that limits analysis to a predetermined time point before the end of the simulation(s)?" Of course, I had a look at the available options with -h and saw that there is none, but I guess there's no harm in asking to double check. I imagine deleting the file entries past a certain time is one way to do it.

davidlmobley · 2017-06-09T14:20:31Z

@mqbpksi2 - no, there is not. What's the use case for this, out of curiosity?

Thanks.

mqbpksi2 · 2017-06-09T14:25:55Z

I wanted to see how much the ΔG values would change if I break up the trajectories, for example if I break up the trajectories into four equally long intervals and compare ΔG values among those; it's one more way of looking at convergence. It would probably be useful to add a flag like this in future releases, in my humble opinion, at least.

mqbpksi2 · 2017-06-09T17:00:34Z

If I may ask one more followup question, how are lambda values extrapolated to the end states, if we don't have sampling at the end states?

For example, I have 11 lambda values: 0.005, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.995. The output reads:

Note: extrapolating missing gradient for lambda = 0.0: -0.511236
Note: extrapolating missing gradient for lambda = 1.0: -396.616445

The gradients (DV/DL) from the correlated samples (kcal/mol):

Lambda gradient

0.00000 -0.511
0.00500 0.259
0.10000 -13.376
0.20000 55.118
0.30000 7.271
0.40000 -67.904
0.50000 -94.552
0.60000 -111.262
0.70000 -147.835
0.80000 -185.730
0.90000 -244.077
0.99500 -392.396
1.00000 -396.616

dG = -100.191

Does the tool fit to a 10th degree polynomial and then extrapolate to 0 and 1 or what exactly?

@davidlmobley, can you tell me how the extrapolation is performed?

ash286 · 2017-08-15T09:08:46Z

I have uploaded the data for a simulation mutating ethane to methanol in water (see link below).

Hi @halx ,

Can you please help me creating parmtop & rst7 files with dummy atoms that you have created in ethane~methanol system for charge transformation. May be this is question is not exactly related to alchemical-analysis script, but I would like to dig into MD/TI.

Thank you.

halx · 2017-08-15T09:39:34Z

We have a tool for this: FESetup, see http://www.ccpbiosim.ac.uk/fesetup/

…

On 15 August 2017 at 10:08, ash286 ***@***.***> wrote: I have uploaded the data for a simulation mutating ethane to methanol in water (see link below). Hi @halx <https://github.com/halx> , Can you please help me creating parmtop & rst7 files with dummy atoms that you have created in ethane~methanol system for charge transformation. May be this is question is not exactly related to alchemical-analysis script, but I would like to dig into MD/TI. Thank you. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#97 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AH3aWidj9caWFhXBqlTaOLxeD5sn0V0Eks5sYWAegaJpZM4LuiP6> .

alchemical-analysis.py not computing BAR/MBAR results #97

alchemical-analysis.py not computing BAR/MBAR results #97

Comments

mqbpksi2 commented Jan 26, 2017 • edited Loading

Lambda gradient

0.00000 58.481 0.10000 15.875 0.20000 -5.820 0.30000 -10.537 0.40000 -19.724 0.50000 -29.159 0.60000 -36.760 0.70000 -54.858 0.80000 -57.177 0.90000 -71.284 1.00000 -127.352

Lambda BOND ANGLE DIHED 1-4 NB 1-4 EEL VDWAALS EELEC RESTRAINT

halx commented Jan 26, 2017 via email

davidlmobley commented Jan 26, 2017

mqbpksi2 commented Jan 26, 2017

danmermelstein commented Apr 17, 2017

davidlmobley commented Apr 17, 2017

danmermelstein commented Apr 18, 2017

halx commented Apr 18, 2017

danmermelstein commented Apr 20, 2017

halx commented Apr 21, 2017 via email

davidlmobley commented Apr 21, 2017

danmermelstein commented Apr 21, 2017

mqbpksi2 commented Apr 23, 2017 via email

jchodera commented Apr 23, 2017

halx commented Apr 23, 2017

jchodera commented Apr 23, 2017

jchodera commented Apr 23, 2017

halx commented Apr 23, 2017

halx commented Apr 24, 2017

jchodera commented Apr 24, 2017

halx commented Apr 24, 2017

jchodera commented Apr 24, 2017

halx commented Apr 24, 2017

jchodera commented Apr 24, 2017

davidlmobley commented Apr 24, 2017

danmermelstein commented Apr 24, 2017

mrshirts commented Apr 24, 2017 via email

danmermelstein commented Apr 24, 2017

davidlmobley commented Apr 24, 2017

danmermelstein commented Apr 24, 2017

danmermelstein commented Apr 24, 2017

mqbpksi2 commented May 22, 2017

davidlmobley commented May 22, 2017

danmermelstein commented May 24, 2017

mqbpksi2 commented Jun 7, 2017

mqbpksi2 commented Jun 9, 2017

davidlmobley commented Jun 9, 2017

mqbpksi2 commented Jun 9, 2017

mqbpksi2 commented Jun 9, 2017 • edited Loading

Lambda gradient

0.00000 -0.511 0.00500 0.259 0.10000 -13.376 0.20000 55.118 0.30000 7.271 0.40000 -67.904 0.50000 -94.552 0.60000 -111.262 0.70000 -147.835 0.80000 -185.730 0.90000 -244.077 0.99500 -392.396 1.00000 -396.616

ash286 commented Aug 15, 2017

halx commented Aug 15, 2017 via email

mqbpksi2 commented Jan 26, 2017 •

edited

Loading

0.00000 58.481
0.10000 15.875
0.20000 -5.820
0.30000 -10.537
0.40000 -19.724
0.50000 -29.159
0.60000 -36.760
0.70000 -54.858
0.80000 -57.177
0.90000 -71.284
1.00000 -127.352

mqbpksi2 commented Jun 9, 2017 •

edited

Loading

0.00000 -0.511
0.00500 0.259
0.10000 -13.376
0.20000 55.118
0.30000 7.271
0.40000 -67.904
0.50000 -94.552
0.60000 -111.262
0.70000 -147.835
0.80000 -185.730
0.90000 -244.077
0.99500 -392.396
1.00000 -396.616