Initial support for Implicit TPMs #105

Open
wants to merge 191 commits into
base: feature/iit-4.0

isacdaavid
Collaborator

@isacdaavid isacdaavid commented Mar 26, 2023

This probably needs further polishing, and I have only just started writing tests and documentation for the new code. That said, it's mature enough to ask for review and suggestions. Test results are on par with the 4.0 branch, and the examples I have run are working. I also want to get a sense of the merge conflicts.

This is what the code is supposed to be doing (also, a mini guide for users):

  • The previous TPM format and Network creation remain supported. Users shouldn't need to update their old scripts 90% of the time. API breakage is mostly contained within node.py. The new TPM format obviously looks different, but most of the existing API has been either inherited or re-implemented for it.
  • Regular (explicit) TPMs are automatically converted to implicit ones upon Network creation. ImplicitTPMs are supposed to be the new common currency throughout pyphi (help us spot leftovers and subpar conversions!).
  • An ImplicitTPM is created from a list or tuple of node TPMs. Each node TPM looks very much like a multidimensional explicit TPM with one dimension per node in the network (inputs to the node contribute nonsingleton dimensions, non-inputs contribute singletons), plus a last dimension containing the probabilities for this node at t+1. Instead of only providing the probability of the ON state, that last dimension must contain entries for all states (this simplifies our work regardless of whether the node is binary or not). Users can look at the existing my_subsystem.nodes[i].tpm to get a sense of what node TPMs look like.

Example using the 2nd system in fig. 7C in the IIT 4.0 paper:

import pyphi
import numpy as np

node_labels = ("A", "B", "C")

connectivity_matrix = np.array([
    [1, 1, 0,],
    [0, 1, 1,],
    [1, 1, 1,],
])

explicit_tpm = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
])

implicit_tpm = [
    np.array(
        [[[[0., 1.],
            [1., 0.]]],
         [[[1., 0.],
            [0., 1.]]]]
    ),
    np.array(
         [[[[1., 0.],
            [1., 0.]],
           [[0., 1.],
            [1., 0.]]],
          [[[0., 1.],
            [0., 1.]],
           [[0., 1.],
            [1., 0.]]]]
    ),
    np.array(
        [[[[1., 0.],
            [1., 0.]],
           [[0., 1.],
            [0., 1.]]]]
    )
]
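As a quick sanity check on the two formats above, the following sketch rebuilds the table of ON probabilities from the node TPMs using plain numpy and compares it with explicit_tpm. It is not pyphi's own conversion code: the helper name on_probabilities is made up for illustration, and the sketch assumes binary nodes and little-endian row ordering (the first node's state varies fastest).

```python
import numpy as np

explicit_tpm = np.array([
    [1, 0, 0], [0, 1, 0], [1, 1, 1], [0, 1, 1],
    [0, 0, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1],
])

# Node TPMs copied from the example; dimensions are (A, B, C, Pr).
node_tpms = [
    np.array([[[[0., 1.], [1., 0.]]],
              [[[1., 0.], [0., 1.]]]]),      # A: shape (2, 1, 2, 2)
    np.array([[[[1., 0.], [1., 0.]],
               [[0., 1.], [1., 0.]]],
              [[[0., 1.], [0., 1.]],
               [[0., 1.], [1., 0.]]]]),      # B: shape (2, 2, 2, 2)
    np.array([[[[1., 0.], [1., 0.]],
               [[0., 1.], [0., 1.]]]]),      # C: shape (1, 2, 2, 2)
]


def on_probabilities(node_tpms):
    """Hypothetical helper: state-by-node table of P(node = ON at t+1).

    Assumes binary nodes and little-endian row ordering.
    """
    n = len(node_tpms)
    columns = []
    for tpm in node_tpms:
        # Expand singleton (non-input) dimensions to their full size.
        full = np.broadcast_to(tpm, (2,) * n + (2,))
        # Keep P(ON) and flatten so that the first node varies fastest.
        columns.append(full[..., 1].reshape(-1, order="F"))
    return np.stack(columns, axis=1)


print(np.array_equal(on_probabilities(node_tpms), explicit_tpm))
```

Broadcasting is what makes the singleton dimensions work here: a node that does not depend on some other node simply repeats the same probabilities along that node's axis.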
  • How do I convert an ExplicitTPM to an ImplicitTPM? You can do it indirectly, by defining a Network and extracting its .tpm attribute. Also see [node.tpm for node in my_subsystem.nodes] (assuming the candidate system is the whole network) and, more involved, pyphi.node.generate_nodes.
  • How do I convert an ImplicitTPM back to an ExplicitTPM? Use pyphi.tpm.reconstitute_tpm.
  • The connectivity matrix is now optional when passing an implicit TPM to Network. If absent, pyphi will infer the cm from the node TPMs, or report inconsistencies in the TPM. If passed, the cm will be used to validate that it matches the TPM. When passing an explicit TPM, the behavior of the cm parameter is as before (assumes all-to-all if absent).
  • There's a new optional parameter to Network (as well as a corresponding attribute): state_space. This can be used to define state labels for each node. If absent, pyphi will create a default state space using ints as state indices (0 for OFF and 1 for ON, as previously implied).
  • Internally, node labels and the state space are used to create xarray DataArrays with appropriate dimension and coordinate names.
  • For users, ImplicitTPMs have fancier indexing. In addition to the regular numpy syntax using positional indexing with integers and slices, there's also pandas/xarray-like indexing by name:
>>> network = pyphi.Network(implicit_tpm, node_labels=node_labels, state_space=(("OFF", "ON"),) * 3)
>>> network
Network(
ImplicitTPM((A, B, C)),
cm=[[1 1 0]
 [0 1 1]
 [1 1 1]],
node_labels=NodeLabels(('A', 'B', 'C')),
state_space={'B': ['OFF', 'ON'], 'A': ['OFF', 'ON'], 'C': ['OFF', 'ON']}
)

# P(A_{t+1} | A=0, B=0, C=0)

>>> network.tpm[0, 0, 0]
ImplicitTPM((A, B, C))

# Behind the scenes, that result holds the indexed node TPMs. To see this, we can repeat the indexing and then inspect node A:

>>> network.tpm[0, 0, 0].nodes[0].tpm
ExplicitTPM(
[0. 1.]
)

# That means P(A_{t+1}=OFF | A=0, B=0, C=0) = 0, and P(A_{t+1}=ON | A=0, B=0, C=0) = 1.

# Using state space labels, that would be the same as:

>>> network.tpm[{"B": "OFF", "A": "OFF", "C": "OFF"}].nodes[0].tpm
ExplicitTPM(
[0. 1.]
)

# A different example: P(A_{t+1}=ON). This can be obtained by indexing the last dimension, called "Pr".

>>> network.tpm[{"Pr": "ON"}].nodes[0].tpm
ExplicitTPM(
[[[1. 0.]]
 [[0. 1.]]]
)
  • We still have to work around the 32-node limit coming from numpy. However, the fact that we now use xarray DataArrays and named dimensions on top means that it should be relatively easy to come up with a workaround.
  • Nonbinary units are still unsupported, beyond being able to define nonbinary, possibly heterogeneous Networks. There are several places throughout the source code that still assume binary units, so correct analyses are neither guaranteed nor tested. This patch paves the way, though.
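The connectivity-matrix inference mentioned in the list above can be pictured with a short sketch based only on array shapes: a nonsingleton dimension i in node j's TPM is read as a connection from node i to node j. The function infer_cm is a hypothetical name, and this is an assumption about the general idea rather than pyphi's actual implementation (which may also inspect the values).

```python
import numpy as np


def infer_cm(node_tpms):
    """Hypothetical helper: infer a connectivity matrix from node TPM shapes.

    cm[i, j] == 1 means that node i is an input to node j.
    """
    n = len(node_tpms)
    cm = np.zeros((n, n), dtype=int)
    for j, tpm in enumerate(node_tpms):
        if tpm.ndim != n + 1:
            raise ValueError(
                f"node {j}: expected {n + 1} dimensions, got {tpm.ndim}"
            )
        for i in range(n):
            # Nonsingleton dimension i: node j depends on node i.
            cm[i, j] = int(tpm.shape[i] > 1)
    return cm


# Placeholder arrays with the same shapes as the A, B, C node TPMs
# in the example; only the shapes matter for this sketch.
shapes = [(2, 1, 2, 2), (2, 2, 2, 2), (1, 2, 2, 2)]
node_tpms = [np.full(shape, 0.5) for shape in shapes]

inferred = infer_cm(node_tpms)
print(inferred)

# If the user also passes a cm, it can be checked against the inferred one.
user_cm = np.array([[1, 1, 0], [0, 1, 1], [1, 1, 1]])
print(np.array_equal(inferred, user_cm))
```

For the three-node example, the inferred matrix matches the connectivity_matrix given in the description.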

isacdaavid and others added 30 commits January 3, 2023 15:15
Remove inconsequential assignments introduced in bfd62c and 99ad3f.
Thus allowing xarray DataArrays to have anonymous singleton dimensions.
It turns out that, while allowed on a DataArray-level, nameless singleton
dimensions cannot be aligned at the Dataset level.
isacdaavid and others added 30 commits November 21, 2023 17:17
In `subsystem.find_mice`, computing potential purviews can be very
expensive in some situations, and if the user has provided a short
iterable of purviews, then computing the potential purviews is not worth
it. So, we simply use the user-provided purviews directly, allowing the
user to decide whether to filter out reducible purviews.
3 participants