Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Free list transformer to replace deprecated ListT #253

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

turion
Copy link
Collaborator

@turion turion commented Feb 7, 2023

samples_free_listt.pdf
samples_master.pdf

Removes the dependency on the deprecated ListT in favour of a free monad construction. In fact improves benchmarks significantly!

@turion
Copy link
Collaborator Author

turion commented Feb 7, 2023

I observed runtimes to go down by 30%, and never going up in comparison to master. This is probably because the free monad over an applicative only adds free layers whenever it really needs, but not for fmap and <*>. I didn't test whether a free monad over the list functor gives a similar speedup, but it has no advantage I'm aware of, so I didn't

Adding ApplicativeDo globally also had a slight positive effect, but that's a separate issue.

@turion turion mentioned this pull request Feb 7, 2023
@turion turion changed the title Free list transformer, without ApplicativeDo yet Free list transformer to replace deprecated ListT Feb 7, 2023
Copy link
Contributor

@reubenharry reubenharry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is amazing! My one request would be an update to the documentation here: https://monad-bayes.netlify.app/usage/#population You can update this by changing the relevant markdown file in the docs. Just a quick explanation of the motivation of the new representation, where it comes from, maybe the definition too. I can also have a go at this, but would be nice to check if it's straightforward for you to update the docs.

src/Control/Monad/Bayes/Population.hs Outdated Show resolved Hide resolved
@turion
Copy link
Collaborator Author

turion commented Feb 7, 2023

My one request would be an update to the documentation here: https://monad-bayes.netlify.app/usage/#population

Yes, on it.

but would be nice to check if it's straightforward for you to update the docs.

Very good point.

I have trouble understanding the current documentation:

Population m a ~ m [Log Double -> (a, Log Double)]

Shouldn't it be:

Population m a ~ m [(a, Log Double)]

@reubenharry
Copy link
Contributor

newtype Population m a = Population (Weighted (ListT m) a)

Weighted is the State monad. (By the way, it's possible that one of the optimized Writer or accum monads could be used here, but that's for a separate issue).

@ohad
Copy link

ohad commented Feb 8, 2023

Nice!
Since this implementation is computing things differently, not just faster, we should at least try to check that it's producing the correct results, not just producing results faster.

@turion
Copy link
Collaborator Author

turion commented Feb 8, 2023

Weighted is the State monad. (By the way, it's possible that one of the optimized Writer or accum monads could be used here, but that's for a separate issue).

Ah, true. I guess it's implemented as State for efficiency reasons, but its interface only allows Writer operations. Ironically in the ListT situation that's an important distinction because State is not commutative, but Writer for a commutative monoid like Log Double is.

I wonder what we should do documentation-wise. I'd find it simpler to explain that Weighted is a Writer, so we can't arbitrarily change the weight of a path, but only adding to its log. But that's a separate issue.

@ohad
Copy link

ohad commented Feb 8, 2023

We should understand what the proposed change is doing a bit more before merging. I wrote it in issue #245, but it also belongs here.

The paper and its follow-up used the monad-structure transformer ListT because the computation it produces makes the correct samples and in the correct order. @adscib (and later the Tweag team) have tested that the resulting inference algorithms produces the expected distributions when run on models, and the paper proves the algorithms, when working on these data structures, are correct.

In general, free monads may do something different, for example, suspend random samples that some of the monad-bayes combinators do in crucial places. This might mean, for example, that using the same combinations of monad-bayes's building blocks to implement algorithms such as smc, smc^2 etc. are not longer correct.

I suggest we first understand better how this proposed change affects the existing inference algorithms, not just in runtime, but in the actual samples they compute, both by testing and using an accompanying formalisation.

@reubenharry
Copy link
Contributor

Ah, true. I guess it's implemented as State for efficiency reasons, but its interface only allows Writer operations.

That's right. I'm still surprised that there isn't a Writer implementation that is suitably efficient (or even just uses State under the hood), but I don't know enough about performance to solve that problem. (I tried the continuized Writer, and I recall seeing mild speedups or at least, no slowdowns, although I might be misremembering).

I wonder what we should do documentation-wise. I'd find it simpler to explain that Weighted is a Writer, so we can't arbitrarily change the weight of a path, but only adding to its log. But that's a separate issue.

I agree in theory, but I think this would just cause extra confusion when anyone looked at the code and saw State, not Writer.

@turion turion mentioned this pull request Jul 17, 2023
@turion turion force-pushed the dev_free_listt branch 5 times, most recently from 7f7bf90 to 6493bed Compare January 2, 2024 09:54
@turion
Copy link
Collaborator Author

turion commented Jan 3, 2024

I suggest we first understand better how this proposed change affects the existing inference algorithms, not just in runtime, but in the actual samples they compute, both by testing and using an accompanying formalisation.

I tried to go for a conservative approach now that keeps an Applicative PopulationT transformer which has the same semantics as before, and is used in all places where needed. Unfortunately it isn't clear to me where to appropriately flatten the free population trees to make the SMC2 algorithm work. I tried all places I could think of, but ultimately didn't succeed in fixing the corresponding test. (On the other hand, I don't know whether SMC2 was correct to begin with.)

One learning is definitely that using an unlawful monad is a maintenance nightmare. Replacing it with a lawful one and then trying to find all the places where the particular flattening behaviour was intended and where it was accidental is like looking for a needle in a haystack. If someone else wants to give this a go, I'm happy to hand over and advise, but I've done enough work on this now for free.

Another thought I had while working on this issue: SequentialT is used pretty much like a free monad transformer. But the disadvantage is that it is defined using a library that is not very well known and popular. Possibly, we should do away with it completely and just use free monads instead. This might clarify the PopulationT story.

@reubenharry
Copy link
Contributor

I also had that thought re. SequentialT, and attempted to replace with FreeT and a catamorphism, but discovered that my understanding was not good, because the "folding" in the coroutine operations was quite different to what I'd imagined. You might have more luck though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants