ML4FE Workshop, University of Hawaii
We aim to connect HEP physicists and engineers interested in front-end ML developments and applications to share latest results, brainstorm future directions, build new collaborations, pursue funding, and share infrastructure.
The workshop will take place at Holmes Hall on the campus of the University of Hawai'i Manoa in Honolulu, HI. We encourage in-person attendance but will provide a hybrid participation option.
-
-
9:00 AM
→
9:30 AM
LOC: Workshop opening and welcome
Convener: Jennifer Ott (University of Hawaii Manoa)
-
9:30 AM
→
10:40 AM
Technology intro
-
9:30 AM
Overview of challenges and goals 30m
Speaker: Ryan Herbst (SLAC National Accelerator Laboratory)
-
10:00 AM
ASICs 20m
Speaker: Christian Herwig (University of Michigan)
-
10:20 AM
Heterogeneous computing/TDAQ & hls4ml 20m
Speaker: Javier Duarte (UC San Diego)
-
10:40 AM
→
11:10 AM
Coffee break 30m
-
11:10 AM
→
11:40 AM
Technology, Contributed
Contributed talks, ca. 12' + 3'
-
11:10 AM
Introduction to eFPGAs and Their Application to High Energy Physics 15m
Advancements in High Energy Physics (HEP) increasingly rely on intelligent instrumentation capable of processing vast, complex datasets in real time. As detectors evolve, front-end electronics must not only manage extreme data rates with minimal latency and power consumption but also withstand harsh environmental conditions, such as high radiation and cryogenics. Traditional Application-Specific Integrated Circuits (ASICs) deliver the necessary performance and efficiency but lack flexibility, while commercial Field-Programmable Gate Arrays (FPGAs) offer reconfigurability at the cost of power efficiency, reliability, and radiation tolerance.
Embedded FPGAs (eFPGAs) present a promising solution to this trade-off. By integrating reprogrammable logic directly within ASICs, eFPGAs enable adaptable, low-latency processing close to the detector, combining the performance benefits of ASICs with the flexibility of FPGAs. This architecture allows for in-field updates to complex data reduction and AI/ML-based triggering algorithms, eliminating the need for expensive ASIC redesigns as experimental requirements evolve.
At this workshop, we present an introduction to eFPGA technology and its relevance to the challenges of next-generation HEP experiments. We highlight recent work at SLAC focused on integrating eFPGAs into ASIC designs for real-time data processing and triggering. Additionally, we discuss future directions and potential applications of this technology within the HEP community, emphasizing its role in enabling more intelligent, adaptable, and resilient detector systems.
Speaker: Larry Ruckman (SLAC National Accelerator Laboratory)
11:25 AM
Empowering AI Implementation: The Versatile SLAC Neural Network Library (SNL) for FPGA 15m
The SLAC Neural Network Library (SNL) is a high-performance, hardware-aware framework for deploying machine learning models on FPGAs at the edge of the scientific data chain. Developed using Xilinx's High-Level Synthesis (HLS) tools, SNL combines the flexibility of software-defined design with the low-latency, high-throughput advantages of reconfigurable hardware. It offers a user-friendly API modeled after the Keras interface, streamlining the transition from model development to hardware deployment.
SNL is optimized for moderately sized neural networks, with support for dynamic weight and bias reloading—eliminating the need for time-consuming re-synthesis during model updates. Its modular architecture enables the integration of custom layers and specialized logic, while maintaining compatibility with standard formats like HDF5 for parameter storage.
Designed to meet the stringent latency and throughput demands of real-time scientific computing, SNL empowers edge intelligence by delivering adaptable, efficient, and experiment-ready ML inference engines—positioning itself as a critical enabler for next-generation AI-accelerated detector systems.
Speaker: Abhilasha Dave (SLAC National Accelerator Laboratory)
-
11:40 AM
→
12:10 PM
Ideas session
Short (5' + 5') introductions of new ideas or recently started work; can also be discussion openers to the community or asking for advice on tackling a specific problem
-
11:40 AM
Idea 1 10m
-
11:50 AM
Idea 2 10m
-
12:00 PM
Idea 3 10m
-
12:15 PM
→
1:30 PM
Lunch break 1h 15m
-
1:30 PM
→
2:30 PM
Technology intro
Conveners: Jennifer Ott (University of Hawaii Manoa), Keisuke Yoshihara (University of Hawaii at Manoa)
-
1:30 PM
Photonics 20m
Speaker: Sajjad Moazeni (University of Washington)
-
1:50 PM
TinyML on chip 20m
-
2:10 PM
New Technologies 20m
-
2:30 PM
→
3:30 PM
Technology, Contributed
Contributed talks, ca. 12' + 3'
-
2:30 PM
Machine Learning-Assisted Event Classification in Cross-Strip CZT PET Detectors Leveraging Quantum Correlations 15m
We present a new approach for positron emission tomography (PET) event classification that integrates machine learning with quantum-aware signal processing. Our system utilizes a cross-strip Cadmium Zinc Telluride (CZT) detector architecture optimized for high-resolution spatial and energy discrimination. By exploiting quantum correlations of annihilation photons, we aim to enhance the differentiation between true coincidence events and scatter-induced gamma interactions. A machine learning model is trained to identify signatures of entangled annihilation events in real time, using features derived from timing, energy, and spatial coincidence patterns across orthogonal strips. We envision this work contributing to scalable, AI-enhanced PET systems and welcome collaborations on model co-design, firmware integration, and front-end ML acceleration strategies.
Speakers: Dr Praveen Gurunath Bharathi, Shiva Abbaszadeh
2:45 PM
Reconfigurable Pulse-Shape Discrimination Algorithm Implementations using eFPGAs 15m
We present our ongoing work toward developing machine learning (ML) algorithms for embedded Field-Programmable Gate Arrays (eFPGAs) integrated on readout Application-Specific Integrated Circuits (ASICs). Our focus is on reconfigurable Pulse-Shape Discrimination (PSD), a critical signal processing technique for neutron imaging and other imaging modalities. By leveraging the reconfigurability of eFPGAs, this approach enables dynamic optimization of the PSD algorithm for specific particle energies and scintillator combinations. Hardware testing of a standalone eFPGA is currently underway, and we are exploring lightweight ML models and feature extraction techniques compatible with the power and area constraints of on-chip inference. This work lays the foundation for flexible, low-power PSD implementations in next-generation radiation detection systems.
Speaker: Carl Grace (Lawrence Berkeley National Laboratory)
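For readers unfamiliar with pulse-shape discrimination, the following is a minimal sketch of the classic charge-comparison (tail-to-total) feature. It is a conventional non-ML baseline, not the algorithm presented in the talk. The sample values, the `tail_start` index, and the `threshold` are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def tail_to_total(pulse: Sequence[float], tail_start: int) -> float:
    """Fraction of the integrated pulse that lies at or after sample tail_start."""
    total = sum(pulse)
    if total <= 0:
        raise ValueError("pulse must have positive integrated charge")
    return sum(pulse[tail_start:]) / total


def classify(pulse: Sequence[float], tail_start: int = 3, threshold: float = 0.25) -> str:
    """Label a pulse by its tail fraction (illustrative threshold, not a tuned value)."""
    if tail_to_total(pulse, tail_start) > threshold:
        return "neutron-like"
    return "gamma-like"


# Hypothetical baseline-subtracted samples: same rising edge, different decay tails.
fast_pulse = [0.0, 10.0, 6.0, 2.0, 1.0, 0.5, 0.5]
slow_pulse = [0.0, 10.0, 6.0, 4.0, 3.0, 2.0, 1.0]

print(classify(fast_pulse), classify(slow_pulse))
```

A reconfigurable implementation could expose `tail_start` and `threshold` as run-time parameters, so that they can be retuned per scintillator and energy range.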
-
3:30 PM
→
4:00 PM
Coffee break 30m
-
4:00 PM
→
4:30 PM
Ideas session
Short (5' + 5') introductions of new ideas or recently started work; can also be discussion openers to the community or asking for advice on tackling a specific problem
-
4:00 PM
Idea 4 10m
-
4:10 PM
Idea 5 10m
-
4:20 PM
Idea 6 10m
-
4:30 PM
→
5:00 PM
Workshop summary: Technology
-
-
-
9:00 AM
→
10:30 AM
Physics intro
-
9:00 AM
Overview of fast ML for detectors and control 30m
Speaker: Dylan Rankin (University of Pennsylvania)
-
9:30 AM
Application to (EF) colliders: CMS ASICs 20m
-
9:50 AM
Application to (NF) DUNE 20m
-
10:10 AM
Discussion/Overflow 20m
-
10:30 AM
→
11:00 AM
Coffee break 30m
-
11:00 AM
→
11:30 AM
Physics, Contributed
-
11:00 AM
Low-Latency On-Chip $\tau$ Event Selection with Machine Learning for the Belle II Level-1 Trigger 15m
Belle II is the second-generation $B$ physics experiment located at the SuperKEKB asymmetric $e^+ e^-$ collider, operating at the $\Upsilon(4S)$ resonance. The $\tau$ physics program at Belle II involves both probes of new physics and precision measurements of SM parameters with large statistics. These include placing strong constraints on lepton flavor violation [1], probing CP violation in the lepton sector [2], and performing precision measurements of SM parameters such as the $\tau$ magnetic moment [3]. SuperKEKB is projected to increase luminosity by roughly one order of magnitude over the next several years. Accordingly, the reconstruction logic for the Level-1 trigger will require significant upgrades to keep the overall trigger rate below the required $30~\text{kHz}$ as luminosity increases [4]. We utilize recent advances in mixed-precision neural network quantization [5] to enable fast machine learning for on-chip $\tau$ event selection at Belle II. We focus on efficient reconstruction of low-multiplicity $\tau$ decays, achieving significant improvements in trigger efficiency and background rejection rate over existing cut-based algorithms.
References
[1] Wenzhe Li. Searches for lepton-flavour violation in $\tau$ decays at Belle and Belle II. PoS, ICHEP2024:425, 2025.
[2] The ATLAS, Belle II, CMS, and LHCb collaborations. Projections for key measurements in heavy flavour physics, March 2025.
[3] Andreas Crivellin, Martin Hoferichter, and J. Michael Roney. Toward testing the magnetic moment of the tau at one part per million. Phys. Rev. D, 106(9):093007, 2022.
[4] Y. T. Lai et al. Design of the Global Reconstruction Logic in the Belle II Level-1 Trigger system, March 2025.
[5] Sun Chang, Thea Arrestad, Vladimir Loncar, Jennifer Ngadiuba, and Maria Spiropulu. Gradient-based automatic per-weight mixed precision quantization for neural networks on-chip, 2024.
Speaker: Deven Misra (Kavli IPMU)
11:15 AM
5-D calorimeter design issues with an integrated online/offline AI/ML approach 15m
Many physics analyses use some form of AI/ML to identify physics objects such as jets and electrons and/or for whole event classification. However, such an approach has generally been taken a long time after the detector was designed and constructed. It is therefore relevant to question whether a proposed design of a future calorimeter is optimal for the application of AI/ML techniques and, if such techniques are also to be used onboard the detector, what the potential benefits of an integrated online/offline AI/ML approach are. This talk raises a number of related questions in areas such as granularity vs. confusion, ML online/offline compatibility, ML and on-detector logic, ML and timing, ML-assisted PFA, and cost constraints via ML. The issues associated with a comprehensive integrated ML approach and possible related future research directions will be discussed.
Speaker: Andrew White (U. Texas at Arlington)
-
11:30 AM
→
12:00 PM
Ideas session
Short (5' + 5') introductions of new ideas or recently started work; can also be discussion openers to the community or asking for advice on tackling a specific problem
-
12:00 PM
→
1:30 PM
Lunch break 1h 30m
-
1:30 PM
→
3:00 PM
Physics intro
-
1:30 PM
Application to colliders: Smart Pixels 20m
Speaker: Lindsey Gray (Fermilab)
-
1:50 PM
Application to colliders: MuC 20m
Speaker: Timon Heim (Lawrence Berkeley National Laboratory)
-
2:10 PM
Application to accelerators 20m
Speaker: Auralee Edelen (SLAC National Accelerator Laboratory)
-
2:30 PM
Discussion/Overflow 10m
-
3:00 PM
→
3:30 PM
Coffee break 30m
-
3:30 PM
→
4:30 PM
Physics, Contributed 2
-
3:30 PM
Implementation of small-scale ML in Belle II Central Drift Chamber Front-End Electronics for cross-talk noise reduction 15m
The Central Drift Chamber of the Belle II experiment is one of the charged-particle tracking devices, used not only offline but also in the real-time trigger system. In operation so far, we have observed cross-talk noise in the Front-End Electronics, where clusters of noise wire hits occur in nearby regions. This causes fake tracks in the hardware trigger and also increases the processing load in the high-level trigger. We study the implementation of small-scale ML in the Front-End Electronics FPGA, applied to each wire, to reduce this kind of noise. Both the separation power and the FPGA resource usage are major design challenges, since ML is expected to run for each wire within a relatively small FPGA. We will report on the progress of development, the plan for deployment, and the validation with the Belle II system.
Speaker: Yun-Tsung Lai (KEK IPNS)
3:45 PM
Co-design of artificial neural networks for real-time feature extraction in front-end ASICs 15m
Integration of machine learning (ML) algorithms within the front-end ASICs used for charge detection and readout in high-energy and nuclear physics experiments can alleviate data transfer bottlenecks to back-end data acquisition systems. By only transmitting higher-level signal features inferred from the front-end signals (such as amplitude, time constant, time of arrival, or even particle types), a significant amount of energy and data movement can be saved. However, such extreme edge-AI systems must operate under stringent hardware constraints such as micron-level dimensions, sub-milliwatt power consumption, and nanosecond-scale latency, while providing clear accuracy advantages over traditional architectures. Moreover, it is impractical, if not impossible, to manually determine optimal design and architectural choices for the corresponding artificial neural networks (ANNs) among possibilities that easily exceed billions even for small-scale problems.
To address these challenges, we employ intelligent search using multi-objective Bayesian optimization, integrating neural network architecture search, variable bit quantization, and logic synthesis in the optimization loop. This approach provides reliable feedback on the collective impact of all cross-domain design choices. We showcase the effectiveness of our approach by finding several Pareto-optimal design choices for effective and efficient ANNs that perform real-time feature extraction from input pulses within the individual pixels of a readout ASIC. The proposed optimization approach was used to realize a smart readout ASIC for segmented radiation detectors. The chip, which was designed in 65 nm CMOS technology, contains 23 independent sensing channels. Each channel features a low-noise analog front-end, single-ended to differential converter, ADC driver, high-speed 12-bit ADC, digital signal processor (DSP), and two-layer ANN with on-chip weights for performing regression or classification tasks. The DSP was realized using a high-level synthesis design flow. Each channel contains 1.8 kb of on-chip memory and consumes approximately 14.3 mW at the nominal sampling rate of 25 MS/s.
Speakers: Prashansa Mukim (Brookhaven National Laboratory), Yihui Ren (Brookhaven National Laboratory)
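As a reminder of what "Pareto-optimal" means in a multi-objective search like the one described above, here is a minimal sketch that filters a list of candidate designs down to the non-dominated set. It only illustrates the selection criterion and is not the Bayesian optimization procedure itself. The two objectives (error, power in mW) and all candidate values are hypothetical.

```python
from typing import List, Tuple

# Each candidate design is (error, power_mW); both objectives are minimized.
Design = Tuple[float, float]


def dominates(a: Design, b: Design) -> bool:
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(designs: List[Design]) -> List[Design]:
    """Return the designs that no other design dominates, in input order."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


# Hypothetical candidates, e.g. different network sizes and bit widths.
candidates = [(0.10, 20.0), (0.12, 14.0), (0.11, 25.0), (0.20, 14.5), (0.08, 30.0)]
front = pareto_front(candidates)
print(front)
```

Here (0.11, 25.0) is dropped because (0.10, 20.0) is better on both objectives, and (0.20, 14.5) is dropped because (0.12, 14.0) is. The three remaining designs represent different accuracy versus power trade-offs.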
4:00 PM
AI/ML for Beam Optimization at the UH Accelerator and FEL Facility 15m
The 40 MeV linear accelerator and 3 $\mu$m free-electron laser (FEL) facility at UH offers a versatile platform for advanced beam physics experiments and compact radiation source development. In this talk, I will discuss opportunities for applying AI and machine learning methods to optimize key beam parameters critical for the performance of the FEL and the inverse Compton scattering (ICS). Targets include minimizing spot size and divergence at the ICS interaction point, controlling the beam energy spread, and maximizing beam current while avoiding beam loading and cathode back-heating. These optimization tasks involve tuning a high-dimensional parameter space (transport magnets, RF and gun settings, and cathode conditions) based on diagnostics such as beam position monitors, wire scanners, spectrometers, and others, some of which are not yet fully integrated into the control system.
Speaker: Siqi Li (University of Hawaii)
-
4:30 PM
→
5:00 PM
Ideas session
Short (5' + 5') introductions of new ideas or recently started work; can also be discussion openers to the community or asking for advice on tackling a specific problem
-
5:00 PM
→
5:30 PM
Workshop summary: Physics
-
6:30 PM
→
8:30 PM
Dinner
Workshop dinner, location to be confirmed
-
-
-
9:00 AM
→
10:00 AM
Community Session
-
9:00 AM
UF HFCC: AI, Integration, Microelectronics 15m
-
9:15 AM
NSF A3D3 Institute 15m
-
9:30 AM
Fast ML Community 15m
-
9:45 AM
HEPIC 15m
-
10:00 AM
→
10:30 AM
Coffee break 30m
-
10:30 AM
→
11:30 AM
Brainstorming/Discussion
-
10:30 AM
Community White Paper 20m
-
10:50 AM
Collaborations & Funding Opportunities 20m
-
11:30 AM
→
12:00 PM
Workshop summary
-
12:00 PM
→
12:15 PM
LOC: Closing
-
12:30 PM
→
2:00 PM
Lunch break 1h 30m
-
2:00 PM
→
5:00 PM
Tutorials / hands-on