Calibration and Range File Data Structure
This document summarizes the data layout used by the calibration module and its range files.
Notation
(n,): one-dimensional array with length n
Units and data type are written as (unit, dtype)
N/A means the field has no physical unit
Main Calibration Dataset (HDF5)
Typical calibrated dataset fields:
x (nm): (n,) (nm, float64) reconstructed x-coordinate
y (nm): (n,) (nm, float64) reconstructed y-coordinate
z (nm): (n,) (nm, float64) reconstructed z-coordinate
mc (Da): (n,) (Da, float64) calibrated mass-to-charge ratio
mc_uc (Da): (n,) (Da, float64) uncalibrated mass-to-charge ratio
high_voltage (V): (n,) (V, float64) detector high voltage
pulse: (n,) (V, float64) or (pJ, float64) pulse voltage or laser energy
t (ns): (n,) (ns, float64) uncalibrated time-of-flight
t_c (ns): (n,) (ns, float64) calibrated time-of-flight
x_det (cm): (n,) (cm, float64) detector x hit position
y_det (cm): (n,) (cm, float64) detector y hit position
delta_p: (n,) (N/A, uint32) pulses since previous detected event
multi: (n,) (N/A, uint32) multiplicity per pulse
start_counter: (n,) (N/A, float64) TDC counter value
event_group_id (optional): (n,) (N/A, int64) shared event-group id linking each dld row to the matching raw /tdc rows. Present only when the dataset was loaded with load_tdc_raw=True. Survives all downstream cropping steps so the link can be used at save time.
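As a rough illustration of the layout above, the field list can be written down as a small schema and checked against a set of columns. This is a minimal sketch, not code from the calibration module: the names DLD_SCHEMA, OPTIONAL_FIELDS, and check_columns are hypothetical, and plain Python lists stand in for the (n,) arrays.

```python
# Hypothetical schema for the calibrated dataset: field -> (unit, dtype),
# following the (unit, dtype) notation used in this document.
DLD_SCHEMA = {
    "x (nm)": ("nm", "float64"),
    "y (nm)": ("nm", "float64"),
    "z (nm)": ("nm", "float64"),
    "mc (Da)": ("Da", "float64"),
    "mc_uc (Da)": ("Da", "float64"),
    "high_voltage (V)": ("V", "float64"),
    "pulse": ("V or pJ", "float64"),
    "t (ns)": ("ns", "float64"),
    "t_c (ns)": ("ns", "float64"),
    "x_det (cm)": ("cm", "float64"),
    "y_det (cm)": ("cm", "float64"),
    "delta_p": ("N/A", "uint32"),
    "multi": ("N/A", "uint32"),
    "start_counter": ("N/A", "float64"),
}
# Present only when the dataset was loaded with load_tdc_raw=True.
OPTIONAL_FIELDS = {"event_group_id": ("N/A", "int64")}


def check_columns(columns):
    """Return a list of problems; an empty list means the layout is consistent."""
    problems = [f"missing field: {name}" for name in DLD_SCHEMA if name not in columns]
    known = set(DLD_SCHEMA) | set(OPTIONAL_FIELDS)
    problems += [f"unknown field: {name}" for name in columns if name not in known]
    # Every field is an (n,) array, so all columns must share one length.
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        problems.append(f"columns differ in length: {sorted(lengths)}")
    return problems
```

The check only covers field names and the shared length n; dtype checks are left out because they depend on how the columns are loaded.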
Linked Raw TDC Group /tdc (Optional)
When load_tdc_raw=True is selected at load and save_tdc=True at save, the
output .h5 file also contains a /tdc group with the raw delay-line
timestamps that are still relevant after dld filtering. The group has the
columns of a Surface Concept tdc dataframe plus two link fields:
channel: (m,) (N/A, uint32) delay-line channel index (0-3 for two delay lines, 0-5 for three)
start_counter: (m,) (N/A, uint32) pulse-trigger id (wraps; not unique)
high_voltage (V), pulse_v (V), pulse_l (pJ), time_data: same semantics as the raw acquisition dataset
event_group_id: (m,) (N/A, int64) shared id used to link each tdc row to the dld row(s) for the same pulse trigger; -1 for orphan rows
has_dld_match: (m,) (N/A, bool) True iff the pulse trigger produced at least one dld row at load time. Orphan rows (False) are always preserved during save filtering, regardless of which dld rows the user removed.
Linking and filtering rules
The link is built once at load time by walking the consecutive start_counter runs in both groups in time order. This is robust to counter wraparound, since the algorithm never compares counter values across different runs.
A tdc row is kept on save iff has_dld_match == False OR its event_group_id is still present in the calibrated dld dataframe.
Multi-hit pulses (multiple dld rows for the same trigger) are treated at the group level: if any dld row in the group survives filtering, all tdc rows for that group are preserved.
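The rules above can be sketched in plain Python. This is one possible reading of them, not the module's actual code: split_runs, link_runs, and keep_tdc_rows are hypothetical names, and the two-pointer matching in link_runs is an assumption about how "walking the runs in time order" could be done. It also assumes that every dld run has a corresponding tdc run.

```python
from itertools import groupby


def split_runs(counters):
    """Group consecutive equal start_counter values into (value, row_indices) runs."""
    runs, start = [], 0
    for value, group in groupby(counters):
        size = len(list(group))
        runs.append((value, list(range(start, start + size))))
        start += size
    return runs


def link_runs(dld_counters, tdc_counters):
    """Return (dld_ids, tdc_ids): one event_group_id per row, -1 where unlinked.

    Both inputs must be in acquisition order. Only the current run on each
    side is compared, so a counter value that reappears after a wrap starts
    a new group instead of being merged with the earlier one.
    """
    dld_runs = split_runs(dld_counters)
    dld_ids = [-1] * len(dld_counters)
    tdc_ids = [-1] * len(tdc_counters)
    k = 0  # index of the next dld run waiting for its tdc run
    for value, tdc_rows in split_runs(tdc_counters):
        if k < len(dld_runs) and dld_runs[k][0] == value:
            for i in dld_runs[k][1]:
                dld_ids[i] = k
            for j in tdc_rows:
                tdc_ids[j] = k
            k += 1
        # Otherwise the tdc run has no dld partner and stays an orphan (-1).
    return dld_ids, tdc_ids


def keep_tdc_rows(tdc_ids, has_dld_match, surviving_dld_ids):
    """Save-time mask: keep orphans, plus rows whose group still has a dld row."""
    alive = set(surviving_dld_ids)
    return [(not matched) or (gid in alive)
            for gid, matched in zip(tdc_ids, has_dld_match)]
```

Because keep_tdc_rows tests membership of the group id rather than of individual rows, a multi-hit group is kept as soon as one of its dld rows survives, which matches the group-level rule. The sketch would mislink in the rare case where an orphan tdc run carries the same counter value as the next pending dld run from a later wrap.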
Range Dataset (HDF5)
Range data defines identified ion windows in mass-to-charge space.
name: (n,) (N/A, string) ion label (plain text)
ion: (n,) (N/A, string) ion label (LaTeX style)
mass: (n,) (Da, float64) mass-to-charge from isotope composition
mc: (n,) (Da, float64) detected peak center
mc_low: (n,) (Da, float64) lower mass-to-charge bound
mc_up: (n,) (Da, float64) upper mass-to-charge bound
color: (n,) (N/A, string) display color (hex code)
element: (n,) (N/A, list[str]) element symbols for each range
complex: (n,) (N/A, list[uint32]) stoichiometric multiplicities
isotope: (n,) (N/A, list[uint32]) isotope identifiers
charge: (n,) (N/A, uint32) ion charge state
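To show how the mc_low and mc_up bounds are typically used, the sketch below assigns a mass-to-charge value to a range. It is an illustration under stated assumptions rather than the module's own lookup: IonRange and find_range are hypothetical names, the window is treated as inclusive at the lower bound and exclusive at the upper bound, and the numeric windows are made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class IonRange:
    name: str            # ion label (plain text)
    mc_low: float        # lower mass-to-charge bound (Da)
    mc_up: float         # upper mass-to-charge bound (Da)
    element: List[str]   # element symbols in the ion
    complex: List[int]   # stoichiometric multiplicity per element
    charge: int          # ion charge state


def find_range(mc: float, ranges: List[IonRange]) -> Optional[IonRange]:
    """Return the first range whose window contains mc, or None if unranged.

    Assumption: mc_low is inclusive and mc_up is exclusive.
    """
    for r in ranges:
        if r.mc_low <= mc < r.mc_up:
            return r
    return None


# Illustrative windows only; real bounds come from the range file.
ranges = [
    IonRange("Al2+", 13.3, 13.7, ["Al"], [1], 2),
    IonRange("Al1+", 26.8, 27.2, ["Al"], [1], 1),
]
```

A linear scan is enough for the handful of ranges in a typical file; sorted bounds with a binary search would be the natural alternative when ranging many events at once.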
Interoperability
Calibration data can be imported from and exported to:
HDF5
EPOS
POS
ATO
CSV
See tutorial notebooks under pyccapt/calibration/tutorials for examples.
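As a concrete example of one of these formats, a POS file is a flat sequence of records, each holding x, y, z, and mass-to-charge as four big-endian 32-bit floats. The sketch below round-trips such records with the standard library only. It is not the package's exporter: pack_pos and unpack_pos are hypothetical helper names, and plain lists stand in for the dataset columns.

```python
import struct

# One POS record: x, y, z (nm) and mass-to-charge (Da) as big-endian float32.
POS_RECORD = struct.Struct(">4f")


def pack_pos(x, y, z, mc):
    """Serialize equal-length coordinate and mass-to-charge columns to POS bytes."""
    if not (len(x) == len(y) == len(z) == len(mc)):
        raise ValueError("all columns must have the same length")
    return b"".join(POS_RECORD.pack(*row) for row in zip(x, y, z, mc))


def unpack_pos(data):
    """Parse POS bytes back into four column lists (x, y, z, mc)."""
    if len(data) % POS_RECORD.size:
        raise ValueError("byte length is not a multiple of the 16-byte record size")
    rows = list(POS_RECORD.iter_unpack(data))
    if not rows:
        return [], [], [], []
    x, y, z, mc = (list(col) for col in zip(*rows))
    return x, y, z, mc
```

Since the dataset stores these fields as float64 and POS stores float32, an export to POS reduces precision; values read back will generally differ slightly from the originals unless they are exactly representable in 32 bits.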