Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add association between tracker hit and mcparticles #70

Closed
wants to merge 3 commits into from

Conversation

ShujieL
Copy link
Contributor

@ShujieL ShujieL commented Jan 31, 2024

Briefly, what does this PR introduce?

Add association between tracker hit and mcparticles to allow easy trace back from hits to mc particles in analysis.

What kind of change does this PR introduce?

  • Bug fix (issue #__)
  • [ x] New feature (issue #__)
  • Documentation update
  • Other: __

Please check if this PR fulfills the following:

  • Tests for the changes have been added
  • Documentation has been added / updated
  • [x ] Changes have been communicated to collaborators

Does this PR introduce breaking changes? What changes might users need to make to their code?

No breaking changes. This is a standalone data structure. Corresponding factory will be developed and inserted to the downstream of tracking algorithm to fill the data structure. It should not interfere with any existing structure.

Does this PR change default behavior?

No.

@ShujieL ShujieL requested a review from a team as a code owner January 31, 2024 00:42
Copy link
Contributor

@wdconinc wdconinc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather we connect TrackerHit to RawTrackerHit.

edm4eic.yaml Outdated Show resolved Hide resolved
edm4eic.yaml Show resolved Hide resolved
Co-authored-by: Dmitry Kalinkin <[email protected]>
@ShujieL
Copy link
Contributor Author

ShujieL commented Jan 31, 2024

I'd rather we connect TrackerHit to RawTrackerHit.

I thought about this option. But we don't have the raw hit to sim hit association filled. So it's almost the same amount of work to add it in EICrecon.
And if doing so, it takes the user three steps: tracker hit -> raw hit -> sim hit -> mcparticles to know what particle creates the tracker hit. I think it's too much a burden for analysis, so I prefer to have it as an association to be stored in EICrecon output. It is harmless. I will provide a factory to fill in the structure so user can easily access it.

@wdconinc
Copy link
Contributor

But we don't have the raw hit to sim hit association filled. So it's almost the same amount of work to add it in EICrecon.
And if doing so, it takes the user three steps: tracker hit -> raw hit -> sim hit -> mcparticles to know what particle creates the tracker hit. I think it's too much a burden for analysis[...]

We should use associations to connect quantities that we can determine from data to truth MC information which is only sometimes available and ONLY when there is a clear one-to-one connection to truth information. There seems to be this desire to create associations to MCParticles for every reconstructed entity there is. Most likely that is not possible and will require that someone makes a choice as to what is the truth which others will disagree with. If an entity is derived from multiple other entities that are uniquely associated with a specific truth entity, then we should not replicate this dependency.

edm4eic.yaml Outdated Show resolved Hide resolved
@ShujieL
Copy link
Contributor Author

ShujieL commented Feb 15, 2024

But we don't have the raw hit to sim hit association filled. So it's almost the same amount of work to add it in EICrecon.
And if doing so, it takes the user three steps: tracker hit -> raw hit -> sim hit -> mcparticles to know what particle creates the tracker hit. I think it's too much a burden for analysis[...]

We should use associations to connect quantities that we can determine from data to truth MC information which is only sometimes available and ONLY when there is a clear one-to-one connection to truth information. There seems to be this desire to create associations to MCParticles for every reconstructed entity there is. Most likely that is not possible and will require that someone makes a choice as to what is the truth which others will disagree with. If an entity is derived from multiple other entities that are uniquely associated with a specific truth entity, then we should not replicate this dependency.

As Dmitry @veprbl suggested, we can make the tracker hit and mcparticle relation one to one, and using the field "weight" to handle the multiple associations. In this way, the relation between truth info and recon quantity is clear and no built-in "choices".

This is the only association data structure the tracking study requested so far b/c it's necessary for purity calculation. Any alternative solution means extensive local algorithm development out side of EICrecon that is difficult to track and maintain.

@veprbl
Copy link
Member

veprbl commented Feb 15, 2024

But we don't have the raw hit to sim hit association filled. So it's almost the same amount of work to add it in EICrecon.
And if doing so, it takes the user three steps: tracker hit -> raw hit -> sim hit -> mcparticles to know what particle creates the tracker hit. I think it's too much a burden for analysis[...]

We should use associations to connect quantities that we can determine from data to truth MC information which is only sometimes available and ONLY when there is a clear one-to-one connection to truth information. There seems to be this desire to create associations to MCParticles for every reconstructed entity there is. Most likely that is not possible and will require that someone makes a choice as to what is the truth which others will disagree with. If an entity is derived from multiple other entities that are uniquely associated with a specific truth entity, then we should not replicate this dependency.

What @ShujieL described last week, to me, sounded as very similar to CalorimeterClusters and MCParticles associations. If we were to follow the principle quoted above, we'd need to remove associations to CalorimeterClusters and introduce one with CalorimeterHit, which is probably not something we want to do. So that's a counterexample.

@ShujieL
Copy link
Contributor Author

ShujieL commented Feb 16, 2024

But we don't have the raw hit to sim hit association filled. So it's almost the same amount of work to add it in EICrecon.
And if doing so, it takes the user three steps: tracker hit -> raw hit -> sim hit -> mcparticles to know what particle creates the tracker hit. I think it's too much a burden for analysis[...]

We should use associations to connect quantities that we can determine from data to truth MC information which is only sometimes available and ONLY when there is a clear one-to-one connection to truth information. There seems to be this desire to create associations to MCParticles for every reconstructed entity there is. Most likely that is not possible and will require that someone makes a choice as to what is the truth which others will disagree with. If an entity is derived from multiple other entities that are uniquely associated with a specific truth entity, then we should not replicate this dependency.

What @ShujieL described last week, to me, sounded as very similar to CalorimeterClusters and MCParticles associations. If we were to follow the principle quoted above, we'd need to remove associations to CalorimeterClusters and introduce one with CalorimeterHit, which is probably not something we want to do. So that's a counterexample.

Thanks @veprbl. In this case, can you or @wdconinc approve it?

@wdconinc
Copy link
Contributor

This was discussed last week at the software and computing meeting, https://indico.bnl.gov/event/22292/. There was no agreement and plenty of discussion about whether this is the right way forward or not. There was no conclusion on this. We should not merge this until we reach a better conclusion. You should ask @mdiefent or @sly2j for what and when the next step is.

@ShujieL
Copy link
Contributor Author

ShujieL commented Mar 19, 2024

To close the loop, the recommendation from the software group was to NOT having this direct association.

Instead, the raw hits and sim hits are added to the EICrecon output. See eic/EICrecon#1301

So with the above PR and the existing sim hits to raw hits association data structure, one can trace back from tracker hits -> raw hits -> sim hits -> mc particle offline in the analysis script. The raw to sim hits association needs to be filled in EICrecon still.

@ShujieL ShujieL closed this Mar 19, 2024
@veprbl veprbl deleted the trajectory branch April 15, 2024 03:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants