Data product: a class representing data, that is stored in a ROOT tree
Charge[^1]
- periodically review the architecture of selected components and subsystems
- also consider issues raised ad hoc
- draft findings and recommendations for changes and improvements to the architecture
- ensure that solutions will work for all the experiments
- LArSoft should offer data products that can be read by other frameworks with no change in their source code
- each data product must represent a specific concept[^2] (different concepts want different data products)
- data must be readable with bare C++ plus ROOT, so that different code can use it[^3]
- data products should be fully expressed by pure C++ means
Q. Are dictionary libraries independent of art etc.?
A. Currently they are not.
It might be possible to fix that for some of the products which do not explicitly require art.
- C++11 is required to fully compile the data product classes
- header files must be parsable with ROOT GenReflex[^4] (C++11-specific headers and constructs need to be enclosed in #ifndef __GCCXML__ blocks)
- data products can depend on ROOT[^5]
- implementation of classes is allowed to use MessageFacility[^6]
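The guard requirement in the list above can be sketched as follows. The class name HypotheticalProduct and its members are illustrative assumptions, not LArSoft code; the only element taken from the notes is the __GCCXML__ guard around C++11 constructs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical data product header; all names are illustrative only.
class HypotheticalProduct {
public:
  HypotheticalProduct() : fValues() {}
  explicit HypotheticalProduct(std::vector<float> const& values)
    : fValues(values) {}

  // C++03-compatible interface, visible to the dictionary generator.
  std::vector<float> const& Values() const { return fValues; }
  float At(std::size_t i) const {
    assert(i < fValues.size());
    return fValues[i];
  }

#ifndef __GCCXML__
  // C++11-only constructs are hidden from the GCCXML-based parser.
  float Sum() const noexcept {
    float total = 0.0f;
    for (auto v : fValues) total += v;  // auto and range-for are C++11
    return total;
  }
#endif // __GCCXML__

private:
  std::vector<float> fValues;  // C++03 container as data member
};
```

With this layout the data members and the basic accessors stay visible to the parser, while a regular C++11 compiler also sees the guarded method.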
Is the CET library (cetlib) allowed in the implementation too[^7]?
- it contains the popular cet::exception
- some other part of the code might have to use cet search paths anyway
- it requires Fermilab's cpp0x library (from the LArSoft point of view, for no reason, since we do use C++11)
- it requires Boost libraries
The only part positively needed is an exception that can be recognized by the frameworks (it could be derived from std::exception).
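As an illustration of the last point, a minimal sketch of an exception derived from std::exception is shown here. The class DataProductError and the helper CheckedRatio are hypothetical names introduced for this example; they are not part of cetlib or LArSoft, and the sketch does not reproduce the cet::exception interface.

```cpp
#include <cassert>
#include <exception>
#include <string>

// Hypothetical exception for data product code (not a cetlib or LArSoft class).
class DataProductError : public std::exception {
public:
  DataProductError(std::string const& category, std::string const& message)
    : fCategory(category), fWhat(category + ": " + message) {
    assert(!category.empty());  // a category is expected for every error
  }

  // Frameworks catching std::exception can still read the message.
  const char* what() const noexcept override { return fWhat.c_str(); }
  std::string const& Category() const { return fCategory; }

private:
  std::string fCategory;
  std::string fWhat;
};

// Hypothetical helper reporting a bad argument through the exception.
inline float CheckedRatio(float num, float den) {
  if (den == 0.0f) throw DataProductError("InvalidArgument", "zero denominator");
  return num / den;
}

// Usage: catch through the std::exception base, as a framework would.
inline std::string DescribeFailure() {
  try { CheckedRatio(1.0f, 0.0f); }
  catch (std::exception const& e) { return e.what(); }
  return "no error";
}
```

Any framework that catches std::exception would recognize such an error without depending on cetlib, cpp0x or Boost.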
Data products can contain data members:
- plain data
- STL containers from standard C++03
- ROOT objects
- custom structures based on the above
They must not contain:
- C++11 structures[^8] (e.g. no std::array), since ROOT GenReflex support would be very problematic
- CLHEP objects[^9] (typically its data structures have an equivalent in ROOT)
- art objects[^10] (for framework independence)
The data members will be private, accessed through the class interface[^11].
The interface should offer read-only access[^12].
The object will not operate on its data, leaving the task to additional layers and algorithms[^13].
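The rules above (allowed members, private data, read-only interface, algorithms kept outside the class) can be sketched together. The namespace sketch, the class ChannelSignal and the function TotalSignal are hypothetical and only illustrate the pattern; they do not correspond to existing LArSoft classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {  // hypothetical namespace, not a LArSoft one

  // Hypothetical data product: plain data and C++03 STL containers only.
  class ChannelSignal {
  public:
    ChannelSignal() : fChannel(0), fSamples() {}
    ChannelSignal(unsigned int channel, std::vector<float> const& samples)
      : fChannel(channel), fSamples(samples) {}

    // Read-only access: no setters, no non-const references.
    unsigned int Channel() const { return fChannel; }
    std::size_t NSamples() const { return fSamples.size(); }
    float Sample(std::size_t i) const {
      assert(i < fSamples.size());
      return fSamples[i];
    }

  private:
    unsigned int fChannel;        // plain data
    std::vector<float> fSamples;  // C++03 STL container
  };

  // Operations on the data live outside the class, in algorithm code.
  inline float TotalSignal(ChannelSignal const& signal) {
    float total = 0.0f;
    for (std::size_t i = 0; i < signal.NSamples(); ++i) total += signal.Sample(i);
    return total;
  }

}  // namespace sketch
```

The object is filled once through its constructor; anything that computes on the content, such as the sum here, is a free function that could live in a separate algorithm layer.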
Suggestions:
| 3-vectors (position…) | TVector3 (unless there are data size issues) or double[3]? (both are currently used) |
| 3-vectors (momentum…) | TVector3 or float[3] (as above, except that we do not have, nor need, more than 6 digits of precision) |
| 4-vectors | TLorentzVector (unless there are data size issues) |
| error/correlation matrices | TMatrixTSym<> or float[6]? (both TMatrixD and double[6] are currently used) |
| wire/time pair | short int[2], float[2] |
| … |
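For the float[6] option in the table, one possible way to address the six independent elements of a symmetric 3×3 matrix is sketched here. The packing order (lower triangle, row by row) and the function names PackedIndex and Element are assumptions made for the example; the notes do not specify the layout actually used.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical packing of a symmetric 3x3 matrix into float[6]
// (lower triangle, row by row): (0,0) (1,0) (1,1) (2,0) (2,1) (2,2).
inline std::size_t PackedIndex(std::size_t row, std::size_t col) {
  assert(row < 3 && col < 3);
  if (row < col) {  // use symmetry: (row, col) and (col, row) share storage
    std::size_t const tmp = row;
    row = col;
    col = tmp;
  }
  return row * (row + 1) / 2 + col;
}

// Read one element of the full matrix from the packed array.
inline float Element(float const (&packed)[6], std::size_t row, std::size_t col) {
  return packed[PackedIndex(row, col)];
}
```

The packed array stores 6 values instead of 9, at the price of an index computation on each access.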
Each data product represents a physics quantity or some concept.
First step: identify and define those concepts for the existing and needed objects.
In the standard MicroBooNE simulation + reconstruction chain, 34 classes from LArSoft (lardata) are used as data products, plus 4 more for simulation from nutools.
Let's start from the largest chunk of correlated objects: reconstruction (base)
- live in the recob namespace
- connected to simulation (sim namespace) and raw data (raw namespace)
This encompasses 13 LArSoft classes:
- raw::RawDigit
- recob::Cluster
- recob::EndPoint2D
- recob::Hit
- recob::PFParticle
- recob::Shower
- recob::SpacePoint
- recob::Track
- recob::Vertex
- recob::Wire
- anab::Calorimetry
- anab::CosmicTag
- anab::ParticleID
Diagram (version with modules creating each association): recoblock_assns.svg
raw::RawDigit
The sequence of (uncalibrated?) ADC counts measured on one channel in the full time window.
- can be compressed with one of a few standard algorithms
recob::Wire
The signal deposited on a single channel as a function of data acquisition time (TDC).
The signal may have been zero-suppressed, filtered from noise and deconvolved from electronics effects.
- includes geometrical identification: type of signal (collection or induction) and wire direction (U, V or X)
- it is possible to store exclusively the time slices where there was activity (regions of interest)
- it can point to the raw::RawDigit object on its channel
- remove art::Ptr from the data products[^14]
- have modules create associations between recob::Wire and raw::RawDigit[^15]
- add channel ID data member to recob::Wire[^16][^17]
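The option of storing only the time slices with activity, mentioned in the list above, can be sketched as a sparse representation. The struct Region, the functions FindRegions and Expand, and the simple threshold criterion are all assumptions made for illustration; the notes do not describe how regions of interest are actually selected or stored.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical region of interest: a run of consecutive samples.
struct Region {
  std::size_t start;           // first tick of the region
  std::vector<float> samples;  // signal values inside the region
};

// Keep only the runs of samples whose magnitude exceeds a threshold.
inline std::vector<Region> FindRegions(std::vector<float> const& signal,
                                       float threshold) {
  std::vector<Region> regions;
  bool inside = false;
  for (std::size_t tick = 0; tick < signal.size(); ++tick) {
    float const value = signal[tick];
    bool const active = (value > threshold) || (value < -threshold);
    if (active && !inside) {  // a new region starts here
      Region r;
      r.start = tick;
      regions.push_back(r);
    }
    if (active) regions.back().samples.push_back(value);
    inside = active;
  }
  return regions;
}

// Rebuild the full-length waveform, with zeros outside the regions.
inline std::vector<float> Expand(std::vector<Region> const& regions,
                                 std::size_t size) {
  std::vector<float> signal(size, 0.0f);
  for (std::size_t r = 0; r < regions.size(); ++r) {
    for (std::size_t i = 0; i < regions[r].samples.size(); ++i) {
      assert(regions[r].start + i < size);
      signal[regions[r].start + i] = regions[r].samples[i];
    }
  }
  return signal;
}
```

For a zero-suppressed signal the expansion reproduces the original waveform exactly, while the stored size scales with the activity rather than with the length of the time window.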
recob::Hit
The observed charge from a travelling particle that drifted to a single wire.
A hit describes the charge profile in time using a shape hypothesis.
Its position information is two-dimensional: wire location and time.
It includes:
- parameters of the signal shape best fitting the calibrated charge (peak, width…)
- the portion of wire readings it fits directly, and timing information
- full geometrical identification (geo::WireID)
- references to the channel (wire) where the charge was collected and its raw data
- remove art::Ptr from the data products[^18]
- have modules create associations between recob::Hit and recob::Wire[^19]
- add channel ID data member to recob::Hit[^20][^21]
- make fHitSignal member private, add constructors to initialize it[^22]
- fHitSignal is never filled in LArSoft code… should it be? Or should it be removed?
- fView and fSignalType are considered useless by MicroBooNE[^23]
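Once hits and wires both carry a channel ID and no longer hold pointers to each other, the link between them can be built by external code. The sketch below shows one way to do that with plain index pairs; WireStub, HitStub and MatchHitsToWires are hypothetical stand-ins, and the framework's own association mechanism is deliberately not used or reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical minimal stand-ins for recob::Wire and recob::Hit.
struct WireStub { unsigned int channel; };
struct HitStub  { unsigned int channel; float peakTime; };

// Pairs of (hit index, wire index), built outside the data products.
typedef std::vector<std::pair<std::size_t, std::size_t> > HitWirePairs;

inline HitWirePairs MatchHitsToWires(std::vector<HitStub> const& hits,
                                     std::vector<WireStub> const& wires) {
  // Index the wires by channel ID once.
  std::map<unsigned int, std::size_t> wireByChannel;
  for (std::size_t w = 0; w < wires.size(); ++w)
    wireByChannel[wires[w].channel] = w;

  // Look up each hit by its own channel ID; unmatched hits are skipped.
  HitWirePairs pairs;
  for (std::size_t h = 0; h < hits.size(); ++h) {
    std::map<unsigned int, std::size_t>::const_iterator it =
      wireByChannel.find(hits[h].channel);
    if (it == wireByChannel.end()) continue;
    assert(it->second < wires.size());
    pairs.push_back(std::make_pair(h, it->second));
  }
  return pairs;
}
```

The data products themselves stay free of framework types; only the code that builds and stores the pairs needs to know about the framework.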
recob::Cluster
A sequence of hits showing geometrical correlation.
Its position information is two-dimensional.
It includes:
- initial and final position of the sequence (in wire×time), with uncertainties
- full geometrical information (geo::PlaneID, wire direction)
- total observed charge, dQ/dW and dT/dW (and errors)
- replace each of the positions and errors with two float or a float[2]
- replace double items with float
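The two replacements proposed above can be illustrated with a small sketch. The struct ClusterEnd and the helper functions are hypothetical; the digit counts come from std::numeric_limits and assume the usual IEEE 754 single and double precision types.

```cpp
#include <cassert>
#include <limits>

// Hypothetical cluster end point in wire x time coordinates.
struct ClusterEnd {
  float position[2];  // [0] wire, [1] time
  float sigma[2];     // uncertainties, same order
};

inline ClusterEnd MakeClusterEnd(float wire, float time,
                                 float sigmaWire, float sigmaTime) {
  assert(sigmaWire >= 0.0f && sigmaTime >= 0.0f);
  ClusterEnd end;
  end.position[0] = wire;
  end.position[1] = time;
  end.sigma[0] = sigmaWire;
  end.sigma[1] = sigmaTime;
  return end;
}

// Decimal digits guaranteed to survive a round trip through each type.
inline int FloatDigits()  { return std::numeric_limits<float>::digits10; }
inline int DoubleDigits() { return std::numeric_limits<double>::digits10; }
```

On such platforms float keeps 6 decimal digits and takes half the space of double, which is the trade-off behind the proposal.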
recob::EndPoint2D
Two-dimensional coordinate on a plane.
- what is this supposed to represent?
recob::Track
The observed information from a single physical particle with well-defined trajectory.
Q. All the information? or just trajectory? or…?
- split into detected information (hit collection) and trajectory (geometrical)
recob::Shower
The observed information from a single physical particle evolving in a shower.
recob::SpacePoint
A geometrical point in the detector.
recob::Vertex
A point in the detector from which particles originate.
- need uncertainty on vertex position… error matrix? independent variances?
anab::Calorimetry
Calorimetric information of a particle.
anab::CosmicTag
Cosmic-ray-like attributes of a particle.
anab::ParticleID
Hypothesis on the nature of a particle.
recob::PFParticle
The evolution of a particle in the event (particle flow).
- it is possible to attribute a very rough particle hypothesis
- interface with pandora
Footnotes
[^1]: From discussion on September 15, 2014.
[^2]: From discussion on October 6, 2014.
[^3]: From discussion on October 6, 2014.
[^4]: From discussion on October 6, 2014.
[^5]: From discussion on October 6, 2014.
[^6]: From discussion on October 6, 2014.
[^7]: From discussion on October 6, 2014.
[^8]: From discussion on October 6, 2014.
[^9]: From discussion on October 6, 2014.
[^10]: From discussion on October 6, 2014.
[^11]: From discussion on October 6, 2014.
[^12]: From discussion on October 6, 2014.
[^13]: From discussion on October 6, 2014.
[^14]: From discussion on October 13, 2014.
[^15]: From discussion on October 13, 2014.
[^16]: From discussion on October 13, 2014.
[^17]: After discussion at [^24], two proposals have been evaluated: adding the channel ID as a new data member, or adding it in the geo::WireID structure. No compelling reason was identified to prefer either. The decision was in the end based on the idea of having geo::WireID reflect a strictly geometric concept, as opposed to integrating a data acquisition concept.
[^18]: From discussion on October 13, 2014.
[^19]: From discussion on October 13, 2014.
[^20]: From discussion on October 13, 2014.
[^21]: After discussion at [^25], two proposals have been evaluated: adding the channel ID as a new data member, or adding it in the geo::WireID structure. No compelling reason was identified to prefer either. The decision was in the end based on the idea of having geo::WireID reflect a strictly geometric concept, as opposed to integrating a data acquisition concept.
[^22]: From discussion at the MicroBooNE Analysis Tools meeting, October 16, 2014, and the following private e-mail exchange with Brian Rebel.
[^23]: From discussion at the MicroBooNE Analysis Tools meeting, October 16, 2014, and the following private e-mail exchange with Brian Rebel.