Setting up a fddaq‐v4.2.0 software area
06-Nov-2023 - Work in progress (the following steps have been verified to work)
Reference links:
- general development: software development workflow, DUNE DAQ Software Style Guide
- suggested Spack commands to learn about the characteristics of an existing software area are available here
- an introduction to the "assets" system, which we use to store files that are not code, is here
- testing: NP04 computer inventory
- other: Working Group task lists, List of DUNE-DAQ GitHub teams and repos
- Main grafana dashboard
- Tag Collector
- Latest test-tracking spreadsheet
- create a new software area based on the release build (see step 1.iv for the exact dbt-create command to use)
  - The steps for this are based on the latest instructions for daq-buildtools
  - As always, you should verify that your computer has access to /cvmfs/dunedaq.opensciencegrid.org
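The access check mentioned above can be scripted. The following is a minimal sketch, assuming that "access" simply means the directory can be listed; the function name `has_release_area` is hypothetical and not part of daq-buildtools.

```python
import os

# Path quoted in the step above.
CVMFS_ROOT = "/cvmfs/dunedaq.opensciencegrid.org"


def has_release_area(path: str = CVMFS_ROOT) -> bool:
    """Return True if `path` is a directory whose contents can be listed."""
    try:
        os.listdir(path)  # listing also prompts an automounted area to mount
        return True
    except OSError:
        # Missing path, not a directory, or no permission to read it.
        return False


if __name__ == "__main__":
    status = "available" if has_release_area() else "NOT available"
    print(f"{CVMFS_ROOT} is {status}")
```

If this reports that the area is not available, the `source .../setup_dunedaq.sh` step will fail as well, so it is worth resolving first.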
  - If you are using one of the np04daq computers, and need to clone packages, add the following lines to your .gitconfig file (once you do this, there will be no need to activate the web proxy each time you log in, and this means that you won't forget to disable it...):
[http]
    proxy = http://np04-web-proxy.cern.ch:3128
    sslVerify = false
  - Here are the steps for creating the new software area:
cd <directory_above_where_you_want_the_new_software_area>
source /cvmfs/dunedaq.opensciencegrid.org/setup_dunedaq.sh
setup_dbt fddaq-v4.2.0
dbt-create -c fddaq-v4.2.0-a9 <work_dir> # for AL9
#dbt-create -c fddaq-v4.2.0-c8 <work_dir> # for CS8
#dbt-setup-release fddaq-v4.2.0-a9 or fddaq-v4.2.0-c8 to setup the release without creating a software area
cd <work_dir>
  - Please note that if you are following these instructions on a computer on which the DUNE-DAQ software has never been run before, there are several system packages that may need to be installed on that computer. These are mentioned in this script. To check whether a particular one is already installed, you can use a command like
yum list libzstd
and check whether the package is listed under Installed Packages.
- add any desired repositories to the /sourcecode area. An example is provided here.
  - clone the repositories (the following block has some extra directory checking; it can all be copy/pasted into your shell window)
# change directory to the "sourcecode" subdir, if possible and needed
if [[ -d "sourcecode" ]]; then
    cd sourcecode
fi
# double-check that we're in the correct subdir
current_subdir=`echo ${PWD} | xargs basename`
if [[ "$current_subdir" != "sourcecode" ]]; then
    echo ""
    echo "*** Current working directory is not \"sourcecode\", skipping repo clones"
else
    # finally, do the repo clone(s)
    git clone https://github.com/DUNE-DAQ/daqsystemtest.git -b v2.1.1
    cd ..
fi
- set up the work area, and build the software
dbt-workarea-env
dbt-build -j 20
dbt-workarea-env
- prepare a daqconf.json file, such as the one shown here. This sample includes a playback data file that is appropriate for the WIBEth data type. (Please note the additional comments on this sample file that are included below!)
{
  "boot": {
    "use_connectivity_service": true,
    "start_connectivity_service": true,
    "connectivity_service_host": "localhost",
    "connectivity_service_port": 15432
  },
  "daq_common": {
    "data_rate_slowdown_factor": 1
  },
  "detector": {
    "clock_speed_hz": 62500000
  },
  "readout": {
    "use_fake_cards": true,
    "default_data_file": "asset://?label=WIBEth&subsystem=readout"
  },
  "trigger": {
    "trigger_window_before_ticks": 1000,
    "trigger_window_after_ticks": 1000
  },
  "hsi": {
    "random_trigger_rate_hz": 1.0
  }
}
A few notes on the sample file shown above:
- The "use/start_connectivity_service" parameters aren't strictly needed, since their default value is "true". Ditto, the "connectivity_service_host/port". However, all of these are included so that people can use them for reference.
- A port offset is applied to the "connectivity_service_port" by nanorc, so we don't all need to use different numbers, as long as we use different partition numbers when running nanorc (e.g. 'nanorc --partition-number 2 ...').
- If you want to use an existing, externally-started Connectivity Service instance, such as the one on the np04 cluster, you would set "use_connectivity_service" to true, and "start_connectivity_service" to false.
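The "default_data_file" value in the sample above is written as an asset:// URI whose query string carries the selection keys (here "label" and "subsystem"). As a rough illustration of how such a string breaks down, the sketch below splits it with the Python standard library. It only parses the text; how the assets system actually resolves the keys to a file is not shown here, and the helper name `parse_asset_uri` is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def parse_asset_uri(uri: str) -> dict:
    """Split an asset:// URI into its query keys (one value per key)."""
    parts = urlsplit(uri)
    if parts.scheme != "asset":
        raise ValueError(f"not an asset URI: {uri!r}")
    # parse_qs returns a list per key; the samples use a single value each.
    return {key: values[0] for key, values in parse_qs(parts.query).items()}


if __name__ == "__main__":
    print(parse_asset_uri("asset://?label=WIBEth&subsystem=readout"))
```

A quick check like this can catch a mistyped key or a missing "&" before the configuration is generated.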
Another option (the initial config, but with the ConnSvc disabled)
{
  "boot": {
    "use_connectivity_service": false,
    "start_connectivity_service": false
  },
  "daq_common": {
    "data_rate_slowdown_factor": 1
  },
  "detector": {
    "clock_speed_hz": 62500000
  },
  "readout": {
    "use_fake_cards": true,
    "default_data_file": "asset://?label=WIBEth&subsystem=readout"
  },
  "trigger": {
    "trigger_window_before_ticks": 1000,
    "trigger_window_after_ticks": 1000
  },
  "hsi": {
    "random_trigger_rate_hz": 1.0
  }
}
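The notes and samples above describe three combinations of the two connectivity-service flags: start a private instance (both true, the default), use an externally-started instance (use true, start false), or run without the service (both false). The sketch below switches an existing daqconf dictionary between them. The mode names and the function are hypothetical, chosen only for this illustration.

```python
import copy

# (use_connectivity_service, start_connectivity_service) per mode.
# The mode names are made up for this sketch; the flag values follow the text.
CONNSVC_MODES = {
    "start_own": (True, True),
    "use_external": (True, False),
    "disabled": (False, False),
}


def with_connectivity_mode(config: dict, mode: str) -> dict:
    """Return a copy of `config` with the two ConnSvc boot flags set."""
    use, start = CONNSVC_MODES[mode]
    updated = copy.deepcopy(config)  # leave the caller's dictionary untouched
    boot = updated.setdefault("boot", {})
    boot["use_connectivity_service"] = use
    boot["start_connectivity_service"] = start
    return updated


if __name__ == "__main__":
    base = {"boot": {"connectivity_service_port": 15432}}
    print(with_connectivity_mode(base, "use_external"))
```

Unlike the "disabled" sample above, this sketch leaves any "connectivity_service_host" and "connectivity_service_port" entries in place; whether they are ignored when the service is disabled is not stated in the text.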
- prepare a data-readout map file (e.g. my_dro_map.json), listing the detector streams (true or fake) that you want to run with. This sample specifies parameter values that are appropriate for WIBEth data:
[
  {
    "src_id": 100,
    "geo_id": { "det_id": 3, "crate_id": 1, "slot_id": 0, "stream_id": 0 },
    "kind": "eth",
    "parameters": {
      "protocol": "udp", "mode": "fix_rate",
      "rx_iface": 0, "rx_host": "localhost", "rx_mac": "00:00:00:00:00:00", "rx_ip": "0.0.0.0",
      "tx_host": "localhost", "tx_mac": "00:00:00:00:00:00", "tx_ip": "0.0.0.0"
    }
  },
  {
    "src_id": 101,
    "geo_id": { "det_id": 3, "crate_id": 1, "slot_id": 0, "stream_id": 1 },
    "kind": "eth",
    "parameters": {
      "protocol": "udp", "mode": "fix_rate",
      "rx_iface": 0, "rx_host": "localhost", "rx_mac": "00:00:00:00:00:00", "rx_ip": "0.0.0.0",
      "tx_host": "localhost", "tx_mac": "00:00:00:00:00:00", "tx_ip": "0.0.0.0"
    }
  }
]
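Since the entries in the map above differ only in "src_id" and "stream_id", a map with more streams can be generated rather than typed. The following is a small sketch under that assumption; the function names are hypothetical, and the field values are copied from the WIBEth sample above rather than taken from any schema.

```python
import json


def make_eth_stream(src_id, stream_id, det_id=3, crate_id=1, slot_id=0, rx_iface=0):
    """Build one "eth" entry with the same fields as the sample above."""
    return {
        "src_id": src_id,
        "geo_id": {
            "det_id": det_id,
            "crate_id": crate_id,
            "slot_id": slot_id,
            "stream_id": stream_id,
        },
        "kind": "eth",
        "parameters": {
            "protocol": "udp",
            "mode": "fix_rate",
            "rx_iface": rx_iface,
            "rx_host": "localhost",
            "rx_mac": "00:00:00:00:00:00",
            "rx_ip": "0.0.0.0",
            "tx_host": "localhost",
            "tx_mac": "00:00:00:00:00:00",
            "tx_ip": "0.0.0.0",
        },
    }


def make_dro_map(n_streams, first_src_id=100):
    """Return a list of entries with consecutive src_id and stream_id values."""
    return [make_eth_stream(first_src_id + i, i) for i in range(n_streams)]


if __name__ == "__main__":
    # Redirect this output to e.g. my_dro_map.json.
    print(json.dumps(make_dro_map(2), indent=2))
```

With `n_streams=2` this reproduces the two entries of the sample; larger values have not been checked against what the readout code accepts.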
- Generate a configuration, e.g.:
fddaqconf_gen -c ./daqconf.json --detector-readout-map-file ./my_dro_map.json my_test_config
- nanorc --partition-number <num> <config name> <partition name> boot conf start_run <run number> wait 60 stop_run scrap terminate
- e.g.
nanorc --partition-number 2 my_test_config ${USER}-test boot conf start_run 111 wait 60 stop_run scrap terminate
- or, you can simply invoke
nanorc --partition-number 2 my_test_config ${USER}-test
by itself and input the commands individually
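For scripted tests it can help to assemble the one-line nanorc invocation from its parts. The sketch below only builds and prints the command string in the form shown above; it does not run nanorc, and the function name and parameter names are hypothetical.

```python
import shlex


def nanorc_command(config_name, partition_name, partition_number, run_number, wait_seconds=60):
    """Return the argument list for a complete boot-to-terminate sequence."""
    return [
        "nanorc",
        "--partition-number", str(partition_number),
        config_name,
        partition_name,
        "boot", "conf",
        "start_run", str(run_number),
        "wait", str(wait_seconds),
        "stop_run", "scrap", "terminate",
    ]


if __name__ == "__main__":
    # Prints a line that can be pasted into the shell.
    print(shlex.join(nanorc_command("my_test_config", "alice-test", 2, 111)))
```

Keeping the arguments as a list (rather than one string) avoids quoting problems if a configuration or partition name ever contains spaces.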
- When you return to working with the software area after logging out, the steps that you'll need to redo are the following:
  - cd <work_dir>
  - source ./env.sh
  - dbt-build # if needed
  - dbt-workarea-env # if needed
- For reference, here are additional sample daqconf.json and dro_map.json files that illustrate various types of running with WIBEth data. These can be mixed and matched with the samples above to generate demo systems of various levels of complexity.
Sample daqconf.json for running with TPG
{
  "boot": {
    "use_connectivity_service": true,
    "start_connectivity_service": true,
    "connectivity_service_host": "localhost",
    "connectivity_service_port": 15432
  },
  "daq_common": {
    "data_rate_slowdown_factor": 1
  },
  "detector": {
    "clock_speed_hz": 62500000
  },
  "readout": {
    "use_fake_cards": true,
    "default_data_file": "asset://?checksum=dd156b4895f1b06a06b6ff38e37bd798",
    "generate_periodic_adc_pattern": true,
    "emulated_TP_rate_per_ch": 1,
    "enable_tpg": true,
    "tpg_threshold": 500,
    "tpg_algorithm": "SimpleThreshold"
  },
  "trigger": {
    "trigger_window_before_ticks": 1000,
    "trigger_window_after_ticks": 1000,
    "trigger_activity_plugin": ["TriggerActivityMakerPrescalePlugin"],
    "trigger_activity_config": [ {"prescale": 25} ],
    "trigger_candidate_plugin": ["TriggerCandidateMakerPrescalePlugin"],
    "trigger_candidate_config": [ {"prescale": 100} ],
    "mlt_merge_overlapping_tcs": false
  },
  "dataflow": {
    "apps": [ { "app_name": "dataflow0" } ],
    "enable_tpset_writing": true,
    "token_count": 20
  },
  "hsi": {
    "random_trigger_rate_hz": 1.0
  }
}
Sample dro_map.json for three Readout Apps (3 separate processes), each with two streams of data (i.e. two DataLinkHandler modules)
[
  {
    "src_id": 100,
    "geo_id": { "det_id": 3, "crate_id": 1, "slot_id": 0, "stream_id": 0 },
    "kind": "eth",
    "parameters": {
      "protocol": "udp", "mode": "fix_rate",
      "rx_iface": 0, "rx_host": "localhost", "rx_mac": "00:00:00:00:00:00", "rx_ip": "0.0.0.0",
      "tx_host": "localhost", "tx_mac": "00:00:00:00:00:00", "tx_ip": "0.0.0.0"
    }
  },
  {
    "src_id": 101,
    "geo_id": { "det_id": 3, "crate_id": 1, "slot_id": 0, "stream_id": 1 },
    "kind": "eth",
    "parameters": {
      "protocol": "udp", "mode": "fix_rate",
      "rx_iface": 0, "rx_host": "localhost", "rx_mac": "00:00:00:00:00:00", "rx_ip": "0.0.0.0",
      "tx_host": "localhost", "tx_mac": "00:00:00:00:00:00", "tx_ip": "0.0.0.0"
    }
  },
  {
    "src_id": 102,
    "geo_id": { "det_id": 3, "crate_id": 2, "slot_id": 0, "stream_id": 0 },
    "kind": "eth",
    "parameters": {
      "protocol": "udp", "mode": "fix_rate",
      "rx_iface": 1, "rx_host": "localhost", "rx_mac": "00:00:00:00:00:01", "rx_ip": "0.0.0.1",
      "tx_host": "localhost", "tx_mac": "00:00:00:00:00:00", "tx_ip": "0.0.0.0"
    }
  },
  {
    "src_id": 103,
    "geo_id": { "det_id": 3, "crate_id": 2, "slot_id": 0, "stream_id": 1 },
    "kind": "eth",
    "parameters": {
      "protocol": "udp", "mode": "fix_rate",
      "rx_iface": 1, "rx_host": "localhost", "rx_mac": "00:00:00:00:00:01", "rx_ip": "0.0.0.1",
      "tx_host": "localhost", "tx_mac": "00:00:00:00:00:00", "tx_ip": "0.0.0.0"
    }
  },
  {
    "src_id": 104,
    "geo_id": { "det_id": 3, "crate_id": 3, "slot_id": 0, "stream_id": 0 },
    "kind": "eth",
    "parameters": {
      "protocol": "udp", "mode": "fix_rate",
      "rx_iface": 2, "rx_host": "localhost", "rx_mac": "00:00:00:00:00:02", "rx_ip": "0.0.0.2",
      "tx_host": "localhost", "tx_mac": "00:00:00:00:00:00", "tx_ip": "0.0.0.0"
    }
  },
  {
    "src_id": 105,
    "geo_id": { "det_id": 3, "crate_id": 3, "slot_id": 0, "stream_id": 1 },
    "kind": "eth",
    "parameters": {
      "protocol": "udp", "mode": "fix_rate",
      "rx_iface": 2, "rx_host": "localhost", "rx_mac": "00:00:00:00:00:02", "rx_ip": "0.0.0.2",
      "tx_host": "localhost", "tx_mac": "00:00:00:00:00:00", "tx_ip": "0.0.0.0"
    }
  }
]
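In the six-entry map above, the streams fall into three pairs that share the same "rx_host" and "rx_iface" values, which matches the "three Readout Apps, each with two streams" description. The sketch below groups entries by that pair so the expected layout can be checked before generating a configuration. Treating (rx_host, rx_iface) as the grouping key is an assumption read off this sample, not a statement about how daqconf assigns streams to applications.

```python
from collections import defaultdict


def streams_per_receiver(dro_map):
    """Group src_id values by (rx_host, rx_iface); the key is an assumption."""
    groups = defaultdict(list)
    for entry in dro_map:
        params = entry["parameters"]
        groups[(params["rx_host"], params["rx_iface"])].append(entry["src_id"])
    return dict(groups)


# Condensed stand-in for the sample above: only the fields used here are kept.
SAMPLE = [
    {"src_id": 100 + i, "parameters": {"rx_host": "localhost", "rx_iface": i // 2}}
    for i in range(6)
]

if __name__ == "__main__":
    for key, src_ids in streams_per_receiver(SAMPLE).items():
        print(key, src_ids)
```

To check a real file, load it with `json.load` and pass the resulting list in place of `SAMPLE`.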
- For further reference, here are daqconf.json and dro_map.json files for emulated DuneWIB electronics
Sample daqconf.json for DuneWIB
{
  "boot": {
    "use_connectivity_service": true,
    "start_connectivity_service": true,
    "connectivity_service_host": "localhost",
    "connectivity_service_port": 15432
  },
  "daq_common": {
    "data_rate_slowdown_factor": 10
  },
  "detector": {
    "clock_speed_hz": 62500000
  },
  "readout": {
    "use_fake_cards": true,
    "data_files": [
      {"detector_id": 3, "data_file": "asset://?label=DuneWIB&subsystem=readout"}
    ]
  },
  "trigger": {
    "trigger_window_before_ticks": 1000,
    "trigger_window_after_ticks": 1000
  },
  "hsi": {
    "random_trigger_rate_hz": 1.0
  }
}
Another option, with DuneWIB, Trigger Primitive generation enabled, and multiple Dataflow apps
{
  "boot": {
    "use_connectivity_service": true,
    "start_connectivity_service": true,
    "connectivity_service_host": "localhost",
    "connectivity_service_port": 15432
  },
  "dataflow": {
    "enable_tpset_writing": true,
    "apps": [ { "app_name": "dataflow0" }, { "app_name": "dataflow1" } ]
  },
  "daq_common": {
    "data_rate_slowdown_factor": 10
  },
  "detector": {
    "clock_speed_hz": 62500000
  },
  "readout": {
    "enable_tpg": true,
    "tpg_threshold": 500,
    "use_fake_cards": true,
    "data_files": [
      {"detector_id": 3, "data_file": "asset://?label=DuneWIB&subsystem=readout"}
    ]
  },
  "trigger": {
    "trigger_activity_plugin": ["TriggerActivityMakerPrescalePlugin"],
    "trigger_activity_config": [ {"prescale": 1000} ],
    "trigger_candidate_plugin": ["TriggerCandidateMakerPrescalePlugin"],
    "trigger_candidate_config": [ {"prescale": 100} ],
    "trigger_window_before_ticks": 1000,
    "trigger_window_after_ticks": 1000
  },
  "hsi": {
    "random_trigger_rate_hz": 1.0
  }
}
Sample dro_map.json for DuneWIB
[
  {
    "src_id": 100,
    "geo_id": { "det_id": 3, "crate_id": 1, "slot_id": 0, "stream_id": 0 },
    "kind": "flx",
    "parameters": {
      "protocol": "full", "mode": "fix_rate",
      "host": "localhost", "card": 0, "slr": 0, "link": 0
    }
  },
  {
    "src_id": 101,
    "geo_id": { "det_id": 3, "crate_id": 1, "slot_id": 0, "stream_id": 1 },
    "kind": "flx",
    "parameters": {
      "protocol": "full", "mode": "fix_rate",
      "host": "localhost", "card": 0, "slr": 0, "link": 1
    }
  }
]
An example hardware map file from the Vertical Drift Coldbox can be found here.
- For reference, here are daqconf.json and dro_map.json files for VD TDE (vertical drift, top detector electronics)
Sample daqconf.json for VD TDE
{
  "boot": {
    "use_connectivity_service": true,
    "start_connectivity_service": true,
    "connectivity_service_host": "localhost",
    "connectivity_service_port": 15432
  },
  "daq_common": {
    "data_rate_slowdown_factor": 1
  },
  "detector": {
    "clock_speed_hz": 62500000
  },
  "readout": {
    "use_fake_cards": true,
    "default_data_file": "asset://?checksum=759e5351436bead208cf4963932d6327"
  },
  "trigger": {
    "trigger_window_before_ticks": 1000,
    "trigger_window_after_ticks": 1000
  },
  "hsi": {
    "random_trigger_rate_hz": 1.0
  }
}
Sample dro_map.json for VD TDE
[
  {
    "src_id": 100,
    "geo_id": { "det_id": 11, "crate_id": 1, "slot_id": 0, "stream_id": 0 },
    "kind": "eth",
    "parameters": {
      "protocol": "udp", "mode": "fix_rate",
      "rx_iface": 0, "rx_host": "localhost", "rx_mac": "00:00:00:00:00:00", "rx_ip": "0.0.0.0",
      "tx_host": "localhost", "tx_mac": "00:00:00:00:00:00", "tx_ip": "0.0.0.0"
    }
  },
  {
    "src_id": 101,
    "geo_id": { "det_id": 11, "crate_id": 1, "slot_id": 1, "stream_id": 1 },
    "kind": "eth",
    "parameters": {
      "protocol": "udp", "mode": "fix_rate",
      "rx_iface": 0, "rx_host": "localhost", "rx_mac": "00:00:00:00:00:00", "rx_ip": "0.0.0.0",
      "tx_host": "localhost", "tx_mac": "00:00:00:00:00:00", "tx_ip": "0.0.0.0"
    }
  }
]
Starting with dunedaq-v4.0.0, when we specify a hostname of "localhost" in a daqconf.json or dro_map.json file, that hostname is resolved at configuration time, using the name of the host on which the configuration is generated. This is handled by the code in the daqconf package, and it is done to prevent problems in situations in which some of the hosts are fully specified and some are simply listed as localhost. Such a mixed system can be problematic since the meaning of "localhost" will be different depending on when, and on which host, it is resolved. To prevent such problems, localhost is now fully resolved at configuration time.
This has ramifications that should be noted, however. Previously, when localhost-only system configurations were run with nanorc, the DAQ processes would be started on the host on which nanorc was run. With the new functionality, however, the DAQ processes that had a hostname of "localhost" will always be run on the computer on which the configuration was generated, independent of where nanorc is run.
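To make the behaviour described above concrete, the sketch below replaces every "localhost" string in a nested configuration with the name of the machine on which the function runs. It illustrates the idea only and is not the code in the daqconf package; in particular, whether daqconf uses the short name or the fully-qualified name is not stated here, and `socket.gethostname()` is just one plausible choice.

```python
import socket


def resolve_localhost(node, hostname=None):
    """Return a copy of `node` with every "localhost" string replaced."""
    if hostname is None:
        # Name of the host on which this code (the "generation" step) runs.
        hostname = socket.gethostname()
    if isinstance(node, dict):
        return {key: resolve_localhost(value, hostname) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_localhost(value, hostname) for value in node]
    if isinstance(node, str) and node == "localhost":
        return hostname
    return node


if __name__ == "__main__":
    sample = {"boot": {"connectivity_service_host": "localhost"}}
    print(resolve_localhost(sample))
```

Once the name has been substituted, the generated configuration no longer depends on where it is later used, which is why the processes start on the generating host even if nanorc is run elsewhere.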
- dbt-release-info # prints out the release type and name, and the base release name (version)
- dbt-pkg-info <dunedaq_package_name> # prints out the package version and commit hash used by the release
- dbt-src-status # prints out the branch names of source repos under sourcecode, and marks those with local changes with "*"
- spack find --loaded -N <external_package_name>, e.g. spack find --loaded -N boost # prints out the version of the specified external package that is in use in the current software area
- spack info fddaq # prints out the packages that are included in the fddaq bundle for the current software area
- spack info dunedaq # prints out the packages that are included in the dunedaq (common) bundle for the current software area
Also see here.
This utility can be used to print out information from the HDF5 raw data files. To invoke it use
HDF5LIBS_TestDumpRecord <filename>
h5dump-shared -H <filename>
This is another use of the h5dump-shared utility. This case uses the following command-line arguments:
- the HDF5 path of the block we want to dump (-d <HDF5_path>)
- the output binary file name (-o <output_file>)
- the HDF5 file to be dumped
An example is:
h5dump-shared -d /TriggerRecord00001.0000/RawData/Detector_Readout_0x00000000_WIB -bLE -o dataset1.bin swtest_run002252_0000_dataflow0_datawriter_0_20221102T192809.hdf5
Once you have the binary file, you can examine it with tools like Linux od (octal dump), for example
od -x dataset1.bin
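If od is not at hand, or the words need further processing, a similar view can be produced in Python. The sketch below formats a byte string as 16-bit hexadecimal words with an octal offset at the start of each line, roughly in the style of `od -x`. It assumes little-endian word order (as on x86 hosts), pads an odd trailing byte with zero, and does not reproduce od's suppression of repeated lines or its final offset line.

```python
import struct


def hex_word_lines(data: bytes, words_per_line: int = 8):
    """Format `data` as lines of 16-bit little-endian words in hex."""
    if len(data) % 2:
        data += b"\x00"  # pad so the last byte still forms a full word
    words = [word for (word,) in struct.iter_unpack("<H", data)]
    lines = []
    for start in range(0, len(words), words_per_line):
        chunk = words[start:start + words_per_line]
        offset = start * 2  # byte offset of the first word on this line
        lines.append(f"{offset:07o} " + " ".join(f"{word:04x}" for word in chunk))
    return lines


if __name__ == "__main__":
    for line in hex_word_lines(bytes(range(20))):
        print(line)
```

For a real dump, read the file with `pathlib.Path("dataset1.bin").read_bytes()` and pass the result to `hex_word_lines`.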
There are several integration tests available in the integtest directory of the daqsystemtest package. To run them, we suggest adding the daqsystemtest package to your software area (if not already done), cd sourcecode/daqsystemtest/integtest, and cat the README file to view the suggestions listed within it. (Those suggestions are along the lines of running a test with a command like pytest -s minimal_system_quick_test.py --nanorc-option partition-number <your_fav_num_1-9>.)
When running with nanorc, metrics reports appear in the info_*.json files that are produced (e.g. info_dataflow_<portno>.json). We can collate these, grouped by metric name, using python -m opmonlib.info_file_collator info_*.json (default output file is opmon_collated.json).
It is also possible to monitor the system using a graphic interface.
From Pierre on 05-Apr-2023:
- for nano04rc: port_offset = 0 + partition_number * 500 https://github.com/DUNE-DAQ/nanorc/blob/develop/src/nanorc/__main_np04__.py#LL77C6-L77C45
- for nanorc: port_offset = 0 + partition_number * 500 https://github.com/DUNE-DAQ/nanorc/blob/develop/src/nanorc/cli.py#L108
- for nanotimingrc: port_offset = 300 + partition_number * 500 https://github.com/DUNE-DAQ/nanorc/blob/develop/src/nanorc/__main_timing__.py#L69
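The three formulas above share the same shape: a per-tool base plus 500 times the partition number. The sketch below encodes them in a small table. The `effective_port` helper additionally assumes that the offset is simply added to the configured port (such as the "connectivity_service_port" of 15432 used in the samples); that addition is an assumption made for illustration and should be checked against the linked nanorc sources.

```python
# Base offsets and stride as quoted in the list above.
BASE_OFFSETS = {"nano04rc": 0, "nanorc": 0, "nanotimingrc": 300}
PARTITION_STRIDE = 500


def port_offset(tool: str, partition_number: int) -> int:
    """Offset for `tool` when run with the given partition number."""
    return BASE_OFFSETS[tool] + partition_number * PARTITION_STRIDE


def effective_port(configured_port: int, tool: str, partition_number: int) -> int:
    """Configured port plus offset (simple addition is assumed here)."""
    return configured_port + port_offset(tool, partition_number)


if __name__ == "__main__":
    for tool in BASE_OFFSETS:
        print(tool, port_offset(tool, 2))
```

Because the stride is 500 and the timing base is 300, two users on the same host avoid overlapping ports as long as they pick different partition numbers.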
Here are suggested steps for enabling and viewing debug messages in the TRACE memory buffer:
- set up your software area, if needed (e.g. cd <work_dir>; source ./env.sh ; dbt-workarea-env)
- export TRACE_FILE=$DBT_AREA_ROOT/log/${USER}_dunedaq.trace
  - this tells TRACE which file on disk to use for its memory buffer, and in this way, enables TRACE in your shell environment and in subsequent runs of the system with nanorc.
- run the application using the nanorc commands described above
  - this populates the list of already-enabled TRACE levels so that you can view them in the next step
- run tlvls
  - this command outputs a list of all the TRACE names that are currently known, and which levels are enabled for each name
  - TRACE names allow us to group related messages, and these names typically correspond to the name of the C++ source file
  - the bitmasks that are relevant for the TRACE memory buffer are the ones in the "maskM" column
- enable levels with tonM -n <TRACE NAME> <level>
  - for example, tonM -n DataWriter DEBUG+5 (where "5" is the level that you see in the TLOG_DEBUG statement in the C++ code)
- re-run tlvls to confirm that the expected level is now set
- re-run the application
- view the TRACE messages using tshow | tdelta -ct 1 | more
  - note that the messages are displayed in reverse time order
A couple of additional notes:
- For debug statements in our code that look like TLOG_DEBUG(5) << "test, test";, we would enable the output of those messages using a shell command like tonM -n <TRACE_NAME> DEBUG+5. A couple of notes on this...
  - when we look at the output of the bitmasks with the tlvls command, bit #5 is going to be offset by the number of bits that TRACE and ERS reserve for ERROR, WARNING, INFO, etc. messages. At the moment, the offset appears to be 8, so the setting of bit "DEBUG+5" corresponds to setting bit #13.
  - when we view the messages with tshow, one of the columns in its output shows the level associated with the message (the column heading is abbreviated as "lvl"). Debug messages are prefaced with the letter "D", and they include the number that was specified in the C++ code. So, for our example of level 5, we would see "D05" in the tshow output for the "test, test" messages.
- There are many other TRACE 'commands' that allow you to enable and disable messages. For example,
  - tonMg <level> enables the specified level for all TRACE names (the "g" means global in this context)
  - toffM -n <TRACE NAME> <level> disables the specified level for the specified TRACE name
  - toffMg <level> disables the specified level for all TRACE names
  - tlvlM -n <TRACE name> <mask> explicitly sets (and un-sets) the levels specified in the bitmask
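The bit arithmetic in the notes above (DEBUG+5 corresponding to bit #13, shown as "D05" by tshow) can be worked through with a few lines of Python. This sketch takes the offset of 8 from the note, which itself says the value "appears to be" 8, so it should be treated as an assumption. The function names are hypothetical and not part of TRACE.

```python
from functools import reduce

# Number of low bits reserved for ERROR, WARNING, INFO, etc.,
# per the note above ("appears to be 8"); an assumption, not a TRACE constant.
DEBUG_BIT_OFFSET = 8


def debug_bit(level: int) -> int:
    """Bit position used for DEBUG+<level>."""
    return DEBUG_BIT_OFFSET + level


def debug_mask(levels) -> int:
    """Bitmask with the bits for all the given debug levels set."""
    return reduce(lambda mask, level: mask | (1 << debug_bit(level)), levels, 0)


def tshow_label(level: int) -> str:
    """Label expected in the "lvl" column of tshow for this debug level."""
    return f"D{level:02d}"


if __name__ == "__main__":
    print(debug_bit(5), hex(debug_mask([5])), tshow_label(5))
```

A mask computed this way is the kind of value one would compare against the "maskM" column of tlvls, or pass to tlvlM, bearing in mind that tlvlM also clears any bits not present in the mask.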