Liquid argon time projection chambers (LArTPCs) provide dense, high-fidelity 3D measurements of particle interactions and underpin current and future neutrino and rare-event experiments. Physics reconstruction typically relies on complex, detector-specific pipelines built either from dozens of hand-engineered pattern recognition algorithms or from cascades of task-specific neural networks that require extensive labeled simulation.
We introduce Panda, a model that learns reusable sensor-level representations directly from raw, unlabeled LArTPC data. Panda couples a hierarchical sparse 3D encoder with a multi-view, prototype-based self-distillation objective. On a simulated dataset, Panda substantially improves label efficiency and reconstruction quality, beating the previous state-of-the-art semantic segmentation model with 1,000× fewer labels. We also show that a single set-prediction head, 1/20th the size of the backbone and built without physical priors, can achieve particle identification comparable to state-of-the-art reconstruction tools when trained on frozen outputs from Panda.
We use a shared 90M-parameter point-native hierarchical encoder that operates directly on voxelized 3D charge clouds from LArTPC images. The same encoder is used for self-distilled pre-training and all downstream tasks.
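To make "voxelized 3D charge clouds" concrete, the following is a minimal sketch of sparse voxelization, not Panda's actual preprocessing. The function name `voxelize`, the `(x, y, z, charge)` input layout, the voxel size, and the choice to sum charge within a voxel are all assumptions made for illustration.

```python
from collections import defaultdict
from math import floor


def voxelize(points, voxel_size):
    """Bin 3D charge deposits into a sparse voxel grid (hypothetical helper).

    points: iterable of (x, y, z, charge) tuples.
    voxel_size: voxel edge length, in the same units as x, y, z.
    Returns a dict mapping integer (i, j, k) indices to summed charge.
    Only occupied voxels are stored, so the result stays sparse.
    """
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    voxels = defaultdict(float)
    for x, y, z, charge in points:
        # floor() keeps negative coordinates in the correct voxel.
        key = (
            floor(x / voxel_size),
            floor(y / voxel_size),
            floor(z / voxel_size),
        )
        voxels[key] += charge  # assumption: charge is summed per voxel
    return dict(voxels)


# Illustrative deposits (made-up values, not detector data).
deposits = [
    (0.1, 0.2, 0.3, 1.0),
    (0.2, 0.1, 0.25, 2.0),
    (1.4, 0.0, -0.1, 0.5),
]
sparse_event = voxelize(deposits, voxel_size=0.5)
```

Storing only occupied voxels matters because most of a LArTPC volume is empty for a given event; a sparse encoder can then operate on the occupied sites alone.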
We employ a prototype-based self-distillation scheme inspired by DINO and Sonata. A student network learns to predict consistent prototype distributions across strong augmentations by matching the outputs of an exponential moving average (EMA) teacher network.
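The two ingredients named above, matching prototype distributions and an EMA teacher, can be sketched in a few lines. This is a simplified illustration in the spirit of DINO-style self-distillation, not Panda's training code: the function names, the temperatures, and the momentum value are assumptions, and teacher centering and multi-view cropping are omitted.

```python
import math


def log_softmax(logits, temperature):
    """Numerically stable log-softmax over prototype logits."""
    scaled = [v / temperature for v in logits]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in scaled))
    return [v - log_norm for v in scaled]


def distillation_loss(student_logits, teacher_logits,
                      student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between teacher and student prototype distributions.

    The lower teacher temperature sharpens the target distribution.
    Both temperature values are illustrative assumptions.
    """
    teacher_probs = [math.exp(v) for v in log_softmax(teacher_logits, teacher_temp)]
    student_log_probs = log_softmax(student_logits, student_temp)
    return -sum(t * s for t, s in zip(teacher_probs, student_log_probs))


def ema_update(teacher_params, student_params, momentum=0.996):
    """Move each teacher parameter toward the student: t <- m*t + (1-m)*s."""
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher_params, student_params)]


# Prototype logits for two augmented views of the same input (made-up values).
teacher_view = [2.0, 0.5, 0.1]
student_agrees = [2.0, 0.5, 0.1]
student_disagrees = [0.1, 0.5, 2.0]

loss_low = distillation_loss(student_agrees, teacher_view)
loss_high = distillation_loss(student_disagrees, teacher_view)
new_teacher = ema_update([1.0, 0.0], [0.0, 1.0], momentum=0.9)
```

In a real training loop, gradients flow only through the student; the teacher is never optimized directly and changes only through `ema_update`, which is what keeps its targets stable.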
The embeddings, projected to 2D using t-SNE, capture both inter-class diversity and intra-class multi-modality. For example, electrons manifest as showers, Michel electrons, or delta rays, and the model learns to separate these naturally.
Some overlap between the γ/e and μ/π clusters reflects genuine physical ambiguities in LArTPC data. For example, photon- and electron-initiated electromagnetic showers can be indistinguishable when there is no resolvable conversion gap and the energy deposition (dE/dx) pattern is unreliable.
@misc{young2025pandaselfdistillationreusablesensorlevel,
title={Panda: Self-distillation of Reusable Sensor-level Representations for High Energy Physics},
author={Samuel Young and Kazuhiro Terao},
year={2025},
eprint={2512.01324},
archivePrefix={arXiv},
primaryClass={hep-ex},
url={https://arxiv.org/abs/2512.01324},
}