MONDAY
8:00-9:00am | Registration and opening remarks
9:00-10:00am | Keynote 1: Re-Engineering Engineering for the Next Era of IC Design, Dr. Sabya Das, Executive Director, R&D Engineering, Synopsys
10:00-11:00am | Session 1A - Security and privacy for AI hardware (Test, Verification and Security) | Session 1B - Logic and Circuit Design (LCD)
11:00-11:30am | Coffee Break
11:30-12:45pm | Session 2A - Scalable AI Training (Computing Systems) | Session 2B - Emerging Trends (Hardware Architecture)
12:45-2:30pm | Lunch Break
2:30-3:30pm | Session 3A - Memory Management and Performance Modeling (Software Architecture, Compilers and Tool Chains) | Session 3B - Yield Analysis & 2.5D/3D Physical Design (Electronic Design Automation) | Tutorial 1: Systems Design for Efficient MoE-based LLM Inference
3:30-4:00pm | Coffee Break
4:00-5:15pm | Session 4A - Memory System Innovations (Computing Systems) | Session 4B - Emerging Memory Technologies (Hardware Architectures)
5:30-6:30pm | Welcome Reception
MONDAY – SESSIONS DETAILS |
TVS1 Session 1A -
Security and privacy for AI hardware - Session Chair: Hadi Kamali - University of Central Florida (UCF) (kamali@ucf.edu)
L1 HElix: Genome
Similarity Detection in the Encrypted Domain, Rostin Shokri, Charles Gouert and
Nektarios Georgios Tsoutsos
L2 Targeted Fault Injection Attack on
Semantic Segmentation Models, Jhon Ordoñez and Chengmo Yang
L3 Towards Low-Latency and Adaptive
Ransomware Detection Using Contrastive Learning, Zhixin Pan, Ziyu Shu and
Amberbir Alemayoh
L4 SecNPU: Securing LLM inference on
NPU, Xuanyao Peng, Yinghao Yang, Shangjie Pan, Junjie Huang, Yujun Liang, Hang
Lu, Fengwei Zhang and Xiaowei Li
LCD1 Session 1B - Logic
and Circuit Design - Session Chair: Hossein Pedram - University of Texas at Dallas (UTD) (Hossein.Pedram@UTDallas.edu)
L1 DMP-BFP: Dynamic Mixed-Precision
Block Floating-Point and Exponent-Guided Precision Adjustment, Yu-Chih Tsai,
Chia-Cheng Chang and Ren-Shuo Liu
L2 Hardware
Efficient Multiplier design using an Optimal mix of Approximate Booth
Encodings, C. Kumar N S, Bhavana S, Ajitesh Kumar Singh and M. Rao
L3 PolyPE: An Efficient
Multi-Precision Multi-Mode Floating-Point Processing Element for HPC and AI,
Zhenzhen Jia, Hongbing Tan, Ling Yang, Hui Guo, Kun Zeng, Junsheng Chang,
Yongwen Wang and Libo Huang
S1 CHQ-SC: Compact and High-Quality
Stochastic Computing Framework Using Magnetic Tunnel Junction, Yu Ma, Jianmin
Zhang, Yan Sun and Fu Siqing
CS1 Session 2A-
Scalable AI Training - Session Chair: Minah Lee - University of Texas at Dallas (UTD) (Minah.Lee@utdallas.edu)
L1 RAM-Wafer: RL-based Automatic
Mapping Framework for Large-scale AI Training on Wafer-scale Computing, Xu Dai,
Dehao Kong, Xufeng He, Zijun Xu, Shaopeng Zhai, Yang Hu and Shouyi Yin
L2 DHeLlam:
General-Purpose, Automatic Micro-batch Co-execution for Distributed LLM
Training, Haiquan Wang, Chaoyi Ruan, Jia He, Jiaqi Ruan, Chengjie Tang,
Xiaosong Ma and Cheng Li
L3 A Co-Design Framework for Graph
Processing on CPU-GPU Heterogeneous Platforms, Yuan Zhang, Huawei Cao, Yiming
Sun, Ming Dun, Jie Zhang and Xiaochun Ye
L4 Towards Affordable, Adaptive and
Automatic GNN Training on CPU-GPU Heterogeneous Platforms, Tong Qiao, Ao Zhou, Yingjie Qi, Yiou Wang, Han
Wan, Jianlei Yang and Chunming Hu
HW1 Session 2B -
Emerging Trends - Session Chair: Fang Li - Oklahoma Christian University (fang.li@oc.edu)
L1 Enhancing
Transformer Inference Efficiency on FPGA through Fully Fusion and Integer-Only
Quantization Techniques, Zhenqi Li, Yuan Li, Mingche Lai, Puguang Liu, Qiang
Wang, Yankang Zhao, Hanyuan Li and Xingyun Qi
L2 RACE-IT: A Reconfigurable Analog
Computing Engine for In-Memory Transformer Acceleration, Lei Zhao, Aishwarya
Natarajan, Luca Buonanno, Archit Gajjar, Ron Roth, Sergey Serebryakov, John
Moon, Omar Eldash, Jim Ignowski and Giacomo Pedretti
L3 PACE-lite: Compact and Efficient
Piecewise Polynomial Approximation for Transformer Nonlinearity Acceleration,
Arpan Suravi Prasad, Gamze Islamoglu, Luca Bertaccini, Davide Rossi, Francesco
Conti and Luca Benini
L4 QuFi: Adaptive Tiled Gustavson
Output Reuse for Edge Sparse DNN Accelerators, Adrián Navarro, José Cano, José
L. Abellán and Manuel E. Acacio
S1 HBM-aware Number Theoretic
Transform Accelerator for Zero-Knowledge Proof, Sangwon Shin, Ngoc-Son Pham,
Lei Xu, Weidong Shi and Taeweon Suh
SW1 Session 3A -
Memory Management and Performance Modeling - Session Chair: Chaitali Sathe, The University of Texas at Dallas (ChaitaliGajanan.Sathe@utdallas.edu)
L1 Repo:
Proactive Swapping Exploiting Loop Patterns in Modern Applications, Jiahui
Zhang, Qiang Cao, Yekang Zhan, Yuchen Hu and Jie Yao
L2 RT-PMalloc: Optimizing Persistent
Memory Allocation for Soft Real-Time Systems, Yuquan Chi, Yinjin Fu and Nong
Xiao
L3 A Scalable and Overflow-Tolerant
Mechanism for Minimum Virtual Time Tracking, Gyusun Lee, Seungwoo Jin, Jiwon
Woo and Jinkyu Jeong
S1 CAST: An Efficient Framework for
Schedules Performance Prediction based on Compact ASTs, Qingqiu Lan, Ao Ren,
Zhenyu Wang, Wei Li, Hongbin Zhu, Yujuan Tan, Duo Liu, Kan Zhong and Chaoxia
Qin
EDA1
Session 3B -
Yield Analysis & 2.5D/3D Physical Design (Electronic Design Automation) - Session Chair: Benjamin Carrion Schafer, UT Dallas (schaferb@utdallas.edu)
L1 STAMP-2.5D: Structural and Thermal
Aware Methodology for Placement in 2.5D Integration, Varun Parekh, Zachary
Wyatt Hazenstab, Srivatsa Rangachar Srinivasa, Krishnendu Chakrabarty, Kai Ni
and Vijaykrishnan Narayanan
L2 OpenYield:
An Open-Source SRAM Yield Analysis and Optimization Benchmark Suite, Shan Shen,
Xingyang Li, Zhuohua Liu, Yikai Wang, Yiheng Wu, Junhao Ma, Yuquan Sun and Wei
W. Xing
S1 3DPX - An Open-Source Methodology
for 3D Physical Design Exploration, George Goudroumanis, Maria Pantazi, George
Floros, Athanasios Tziouvaras, George Stamoulis and Alberto Garcia-Ortiz
S2 Declarative Synthesis and
Multi-Objective Optimization of Stripboard Circuit Layouts Using Answer Set
Programming, Fang Li
T1 Tutorial 1: Systems Design for Efficient MoE-based LLM
Inference, Pavan Miriyala (AMD Research Singapore) and Haris Javaid (AMD
Singapore) (slides)
CS2 Session 4A -
Memory System Innovations - Session Chair: Fang Li - Oklahoma Christian University (fang.li@oc.edu)
L1 R2Hash: A
Read-Optimized and Resize-Friendly Hashing Index for Persistent Memory, Jinlei
Hu, Bo Chen, Miaosong Zhang, Jing Hu, Jianxi Chen and Dan Feng
L2 ALPHA: A Scalable Lock-Free
Partitioned Hash Index for Persistent Memory on NUMA Architectures, Qiyang
Zheng, Hao Hu, Hao Huang, Yanqi Pan, Yifeng Zhang, Wen Xia, Xiangrui Meng and
Xudong Li
L3 DDLM: Demand-Aware Dynamic Link
Width Management for Energy-Efficient CXL Memory, Taejeong Kim, Junbum Park,
Yongho Lee and Seokin Hong
L4 Computing-In-Memory Dataflow for
Minimal Buffer Traffic, Choongseok Song and Doo Seok Jeong
S1 PIMFY: Eliminating Remote Page
Walks in MCM GPUs, Junsung Kim, Sungwoo Kim, Seunghyun Jin and Won Woo Ro
HW2 Session 4B -
Emerging Memory Technologies - Session Chair: Victoria Gammenthaler, The University of Texas at Dallas (Victoria.Gammenthaler@UTDallas.edu)
L1 PriME: PIM-Aware Efficient
Compression for Memory-Bound Embedding Layers in sLLMs, Junghyeok Lee, Jihoon
Jang and Hyun Kim
L2 Dissecting and Re-architecting 3D
NAND Flash PIM Arrays for Efficient Single-Batch Token Generation in LLMs,
Yongjoo Jang, Sangwoo Hwang, Hojin Lee, Sangwoo Jung, Donghun Lee, Wonbo Shim
and Jaeha Kung
S1 MamCIMFlow: An Integrated Co-Design
of RRAM-Based CIM and Selective State-Space Streaming for Efficient Mamba Model
Acceleration, Mingzi Li, Zhongrui Wang, Zhongwen Ye, Tao Pan and Han Wang
S2 PIM-SUM: Fast and Reliable
In-Memory Summation for Recommendation Systems, Fan Li, Ruizhi Zhu, Huize Li,
Di Wu and Xin Xin
L3 CMC: Compound Memory-Computing
Architecture for Energy-Efficient CNN Accelerators, Ming Han, Jin Wu, Jian
Dong, Ye Wang and Gang Qu
TUESDAY
8:00-9:00am | Registration and opening remarks
9:00-10:00am | Keynote 2: AI for Chip Design: The Inflection Point Is Now, Dr. Mark Ren - Director of Design Automation Research, NVIDIA
10:00-11:15am | Session 5A - Large‑Scale Model Inference (Computing Systems) | Session 5B - Design, Testing, and Verification (Test, Verification and Security) | Session 5C: Quantum Systems Security: Threats, Forensics & Defenses (Special Session)
11:15-11:45am | Coffee Break
11:45-12:45pm | Session 6A - Architecture-Aware Compilation and Scheduling (Software Architecture, Compilers and Tool Chains) | Session 6B - Timing‑Driven & Physically‑Aware Optimization (Electronic Design Automation) | Session 6C: Efficient and Secure Generative AI (Special Session)
12:45-2:30pm | Lunch Break
2:30-4:00pm | Session 7A - Efficient Data Storage (Computing Systems) | Session 7B - Fault tolerance and resilience in Systems (Test, Verification and Security) | Tutorial 2: Machine Learning for Automated Physical Design
4:00-4:30pm | Coffee Break
4:30-5:45pm | Session 8A - Advanced Hardware Acceleration Trends (Hardware Architectures) | Session 8B - AI‑Assisted RTL & HLS Code Generation (Electronic Design Automation)
6:30-9:30pm | Social Event 1 - Dinner
TUESDAY – SESSIONS DETAILS |
CS3 Session 5A -
Large‑Scale Model Inference - Session Chair: Victoria Gammenthaler, The University of Texas at Dallas (Victoria.Gammenthaler@UTDallas.edu)
L1 Ghidorah: Fast LLM Inference on
Edge with Speculative Decoding and Hetero-Core Parallelism, Jinhui Wei, Ye
Huang, Yuhui Zhou, Jiazhi Jiang and Jiangsu Du
L2 DualSpar: A Dual-Granularity
Memory Framework with Adaptive Sparsity for Efficient LLM Inference, Yujuan
Tan, Jiayi Guo, Zhuoxin Bai, Sanle Zhao, Yujiao Wang, Zongjie Wang, Ao Ren, Duo
Liu, Kan Zhong and Jun Liu
L3 Throughput-Oriented LLM Inference
via KV-Activation Hybrid Caching with a Single GPU, Sanghyeon Lee, Hongbeen
Kim, Soojin Hwang, Guseul Heo, Minwoo Noh and Jaehyuk Huh
L4 AuLoRA: Fine-Grained Loading and
Computation Orchestration for Efficient LoRA LLM Serving, Xiao Shi, Jiangsu Du,
Zhiguang Chen and Yutong Lu
S1 Taming Sparse Giants: Deploying
Mixture-of-Experts on 3D Heterogeneous Compute-in-Memory Systems, Pragya
Sharma, Ashish Reddy Bommana, Farshad Firouzi and Krishnendu Chakrabarty
TVS2 Session 5B -
Design, Testing, and Verification - Session Chair: Kimia Azar - University of Central Florida (UCF) (azar@ucf.edu)
L1 µSTT: Microarchitecture Design for
Speculative Taint Tracking, Boru Chen, Rutvik Choudhary, Kaustubh Khulbe,
Archie Lee, Adam Morrison and Christopher W. Fletcher
L2 Hot-FV: A Semi-Formal Test
Generation Framework for RTL Functional Coverage using Warm Starting States,
Ziyue Zheng, Zhiyuan Yan, Xiangchen Meng, Guangyu Hu, Hongce Zhang and Yangdi
Lyu
L3 ATPG-Based Weighted Scan Chain
Control for Programmable Low-Power LBIST, Yumei Hu, Hairui Cai, Xiaohui Xue,
Yaning Wang, Yu Huang, Zhipeng Lyu, Zhouxing Su, Zezhong Wang and Xing
Wang
S1 FitFuzz: Depth-Oriented
Coverage-Guided Fuzzing via Fitness-Based Seed Scheduling, Venkat Nitin Patnala
and Sai Manoj Pudukotai Dinakarrao
S2 Firebreak: Efficient
In-process Protection with Hardware-Assisted Dynamic Compartmentalization, Yue
Jin, Yibin Xu, Han Wang, Chengyuan Zhang, Tianyi Huang, Tianyue Lu and Mingyu
Chen
SS-1:
Session 5C -
Quantum Systems Security: Threats, Forensics & Defenses - Session Chair: Chaitali Sathe, The University of Texas at Dallas (ChaitaliGajanan.Sathe@utdallas.edu)
P1 Oracle-Guided Attack on Quantum
Circuit Obfuscation, Yuntao Liu
P2 Forensics of Error Rates of Quantum
Hardware, Swaroop Ghosh
P3 QTIME: A Machine Learning Framework for Timing Side-Channel Analysis in Quantum Circuit Simulators, Ben Dong, Hui Feng and Qian Wang
P4 Recovering QSVT Polynomials from
Side-Channel Information on Quantum Computers, Jakub Szefer
P5 Concolic Testing for Quantum Compilers, Kanad Basu
P6 Benchmarking and Characterising
NISQ Computers Through Uncertainty Quantification of Variational Quantum
Algorithms, Qiang Guan
SW2 Session 6A -
Architecture-Aware Compilation and Scheduling - Session Chair: Baharealsadat Parchamdar, The University of Texas at Dallas (Baharealsadat.Parchamdar@UTDallas.edu)
L1 NaviMap: Partial Order-guided Neural
Architecture via Deep Q-Networks for Efficient CGRA Mapping, Mingyang Kou, Jun
Zeng, Xinyu Peng, Weiqing Ji and Hailong Yao
L2 IasRT: Interference-Aware and
SLO-Driven GPU Scheduling for Real-Time DNN Inference, Heming Zhong, Jinhui
Wei, Yujia Fu, Dan Huang and Yutong Lu
L3 AICAWS: Arithmetic Intensity based
Cache-Conscious Adaptive Warp Scheduler, Bo Yuan, Sheng Liu, Zekun Jiang,
Jianfeng Cui and Yang Guo
S1 A Dynamic Virtual Memory Management
System for LLMs on AI Chips, Gaolin Wei, Zhaorui Zhang, Jiaqi Xu, Chen Zhang,
Xin Yao and Benben Liu
EDA2 Session 6B -
Timing‑Driven & Physically‑Aware
Optimization - Session Chair: Ioannis Savidis, Drexel University (is338@drexel.edu)
L1 CPA-remap: Critical-path-based
Physically Aware Remapping Framework for Timing Optimization, Mingxiao He,
Pengcheng Huang, Zhenyu Zhao and Peiyun Bian
L2 Threshold Voltage Tuning Technique
for Leakage Power Recovery, Jaejoon Yoon and Taewhan Kim
L3 Timing-Driven Global Placement
with Entropy-Mobility Guided Pin-to-Pin Weighting, Youzhi Zheng, Zhengjie Zhao,
Linhao Lu, Xiaodong Zhu, Wenxin Yu and Jingwei Lu
S1 Timing-Driven Multi-bit Flip-flop
Allocation Utilizing Design-Technology Co-optimization Techniques, Yeongyeong
Shin, Sehyeon Chung and Taewhan Kim
SS-2: Session 6C – Efficient and Secure Generative AI - Session Chair: Khaza Anuarul Hoque, University of Missouri (hoquek@missouri.edu)
P1 GAN–BiLSTM–HDC: A Hybrid Framework
for Robust and Hardware-Efficient Malware Detection, Emilien J Meyer, Abu
Kaisar Masum, Mehran Moghadam, Lida Kouhalvandi, Gourav Datta, Sercan Aygun,
and M. Hassan Najafi
P2 LM-Fix: Lightweight Bit-Flip
Detection and Rapid Recovery Framework for Language Models, Ahmad Tahmasivand,
Noureldin Zahran, Mohammed Fouda, Ihsen Alouani and Khaled N. Khasawneh
P3 FaRAccel: FPGA-Accelerated Defense
Architecture for Efficient Bit-Flip Attack Resilience in Transformer Model, Najmeh Nazari, Banafsheh Saber Latibari,
Hosein Mohammadi Makrani, Elahe Hosseini, Chongzhou Fang, Setareh Rafatirad,
Hossein Sayadi, Houman Homayoun
P4 Hammering the Diagnosis:
Rowhammer-Induced Stealthy Trojan Attacks on ViT-Based Medical Imaging, Banafsheh Saber Latibari, Najmeh Nazari, Hossein
Sayadi, Houman Homayoun, Abhijit Mahalanobis
CS4 Session 7A -
Efficient Data Storage - Session Chair: Chaitali Sathe, The University of Texas at Dallas (ChaitaliGajanan.Sathe@utdallas.edu)
L1 Hybrid-Rewrite: A Rewriting
Framework for Hybrid Deduplication and Delta Compression, Qiao Li, Hong Jiang,
Zichen Xu, Yucheng Zhang, Junyun Wu and Puchen Lu
L2 NatSep: Little-to-No Overhead Data
Separation for Log-Structured Storage Using Native Information, Jinlong Wang,
Zhipeng Tan, Yang Xiao, Wenjie Qi, Shikai Tan and Ying Yuan
L3 The Logic of Fingerprint Upgrade
in Deduplicated Storage, Cai Deng, Boju Chen, Philip Shilane, Xiangyu Zou, Wen
Xia and Hao Hu
L4 Pixel-DNA: Increasing Robustness
of Approximate DNA Storage for Images by Using Hierarchical Deduplication, Alex
Sensintaffar, David Du and Bingzhe Li
L5 TCFlash: In-Flash Bulk Bitwise
Processing via Dynamic Sensing and TLC Encoding in 3D NAND, Habib Ur Rahman,
Tharini Suresh, Sudeep Pasricha and Biswajit Ray
S1 Minimizing Read Disturb via
Localized Page Allocation for Modern NAND Flash-Based SSDs, Joonseong Hwang,
Minkyu Choi, Minjin Park, Jihun Yoon, Yoonho Jang and Seokin Hong
TVS3 Session 7B - Fault
tolerance and resilience in Systems - Session Chair: Sabrina Ahmed, University of Texas at Dallas (Sabrina.Ahmed@UTDallas.edu)
L1 Laser and Radiation Testing of
Compiler-based Protection for Multi-Bit Upsets, Davide Baroffio, Tomas Antonio
López, Federico Reghenzani and William Fornaciari
L2 Masked Gadgets for
Integer-Floating-Point Conversion with Applications to Falcon, Shuyi Chen,
Jingdian Ming, Yuejun Liu, Yiwen Gao and Yongbin Zhou
L3 WSSR: Weight Set Segmentation and
Recovery for Fault Resilient Transformers, Ntsee Ndingwan and Chengmo Yang
S1 ECOLogic: Enabling Circular,
Obfuscated, and Adaptive Logic via eFPGA-Augmented SoCs, Ishraq Tashdid, Dewan
Saiham, Nafisa Anjum, Tasnuva Farheen and Sazadur Rahman
S2 Enhancing Key-Recovery
Chosen-Ciphertext Side-Channel Attacks on NTRU Using LDPC, Xiaofei Tong, Denis
Nabokov and Qian Guo
T2 Tutorial 2: Machine Learning for Automated Physical Design, Ioannis
Savidis and Pratik Shrestha (slides)
HW3 Session 8A -
Advanced Hardware Acceleration Trends - Session Chair: Daksith Chandrasekera, The University of Texas at Dallas (csc240000@utdallas.edu)
L1 A Photonic Accelerator for Deep
Learning Training, Yuan Li
L2 FINEA: An Efficient Neural Network
Accelerator Exploiting Factorized Input Features, Yujin Kim, Chanhun Jeong,
Yunho Oh, Myung Kuk Yoon and Gunjae Koo
L3 Flame: A Multiplier-Free LLM
Accelerator with Dynamic Block Floating Point, Ao Lv, Haishuang Fan and Guihai
Yan
S1 Hermes: Accelerating Packet
Processing in DPU with Neural Network, Rui Meng, Xinyu Chen, Hanyue Lin, Jingya Wu, Wenyan Lu, Xiaowei Li and Guihai
Yan
S2 ASMA: An Anisotropy Scaling
Memristor-based Accelerator for LLM Inference, Zijian Xiong, Xiangrui Yang,
Yuhang Zhang, Yue Zhou, Jianguo Yang, Yaoyu Tao, Xiangshui Miao and Yuhui He
EDA3 Session 8B - AI‑Assisted
RTL & HLS Code Generation - Session Chair: Baharealsadat Parchamdar, The University of Texas at Dallas (Baharealsadat.Parchamdar@UTDallas.edu)
L1 RTLBench: A Multi-Dimensional
Benchmark Suite for Evaluating LLM-Generated RTL Code, Zhigang Fang, Renzhi
Chen, Yang Guo, Huadong Dai and Lei Wang
L2 SAGE-HLS: Syntax-Aware AST-Guided
LLM for High-Level Synthesis Code Generation, M Zafir Sadik Khan, Nowfel
Mashnoor, Mohammad Akyash, Kimia Azar and Hadi Kamali
L3 LLM4MCU-Onto: Leveraging LLMs for
Automated Ontology Generation from Microcontroller Reference Manual, Asmita
Asmita, Grisha Bandodkar, Sujan Ghimire, Shaurya Srivastav, Soheil Salehi and
Houman Homayoun
S1 LLM-Driven Code Generation for
Neural Networks on FPGAs: Bridging Python and HLS, Rupesh Raj Karn, Johann
Knechtel, Ramesh Karri and Ozgur Sinanoglu
WEDNESDAY
8:00-9:00am | Registration and opening remarks
9:00-10:00am | Keynote 3: Application and SW-HW codesign approach to driving low power architecture for GenAI, Akila Subramaniam, Sr Fellow, AMD
10:00-11:00am | Session 9A - Processor-Based Solutions 1 (Hardware Architectures) | Session 9B - Advanced Hardware Design Flows & Synthesis Techniques (Electronic Design Automation) | Tutorial 3: Engineering Privacy at the Edge: A Practical Guide to Differential Privacy in System Architectures
11:00-11:30am | Coffee Break
11:30-1:00pm | Session 10A - Processor-Based Solutions 2 (Hardware Architectures) | Session 10B - System Level AI Optimization (Computing Systems) | LLMs for Hardware Design Challenge [Register here]
1:00-2:30pm | Lunch Break
2:30-3:30pm | Session 11A - Emerging trends 2 (Hardware Architectures) | Session 11B - GenAI Meets Silicon: LLMs in Hardware Design, Verification, and Security (Special Session)
3:30-4:00pm | Coffee Break
4:00-5:00pm | Session 12A - Specialized High‑Performance Computing (Computing Systems) | Session 12B - Sustainable Hardware Accelerators with Integrated Electro-Photonics (Special Session)
5:30-6:00pm | Closing Remarks
6:30-10:30pm | Social Event 2 – Mavericks Game
WEDNESDAY – SESSIONS DETAILS |
HW4 Session 9A -
Processor-Based Solutions 1 - Session Chair: Hossein Pedram - University of Texas at Dallas (UTD) (Hossein.Pedram@UTDallas.edu)
L1 TROOP: At-the-Roofline Performance
for Vector Processors on Low Operational Intensity Workloads, Navaneeth Kunhi
Purayil, Diyou Shen, Matteo Perotti and Luca Benini
L2 RVME: An Efficient Matrix Engine
Design based on Matrix Extension of RISC-V, Wanqi Chen, Weidong Yang, Yiming
Guo, Jing Qiu, Renpei Wang, Jianfei Jiang, Naifeng Jing and Qin Wang
L3 TeraNoC: A Multi-Channel 32-bit
Fine-Grained, Hybrid Mesh-Crossbar NoC for Efficient Scale-up of 1000+ Core
Shared-L1-Memory Clusters, Yichao Zhang, Zexin Fu, Tim Fischer, Yinrong Li,
Marco Bertuletti and Luca Benini
L4 THENA: Accelerating Torus Fully
Homomorphic Encryption on Energy-Efficient Heterogeneous Architecture, Yanze Wu and Md Tanvir Arafin
S1 SSM-RDU: A Reconfigurable Dataflow
Unit for Long-Sequence State-Space Models, Sho Ko and Kunle Olukotun
EDA4
Session 9B -
Advanced Hardware Design Flows & Synthesis Techniques - Session Chair: Hadi Kamali - University of Central Florida (UCF) (kamali@ucf.edu)
L1 Optimization of Wire Pipelining
and Channel Parallelism for 2D-Mesh NoC Physical Design, Pei-Huan Tsai, Maico
Cassel dos Santos, Joseph Zuckerman, Kuan-Lin Chiu and Luca Carloni
L2 Agile Design Flow for
Cryptographic Hardware Accelerators, Deng Liming, Zhu Guowei, Cao Wei, Fan
Xitian and Zhou Xuegong
L3 Decomposition Attack on Structural
Logic Locking of Reversible Circuits, Feng-Jie Chao and Yung-Chih Chen
S1 Supporting Pipelined Memory
Accesses in Processor Synthesis, Essien Taylor, Colin Schilf, Sebastian
Phemister and Russ Joseph
T3 Tutorial 3:
Engineering
Privacy at the Edge: A Practical Guide to Differential Privacy in System
Architectures, Olivera Kotevska, Eyhab Al-Masri, and Wenjun Yang (slides)
HW5 Session 10A -
Processor-Based Solutions 2 (Hardware Architectures) - Session Chair: Daksith Chandrasekera, The University of Texas at Dallas (csc240000@utdallas.edu)
L1 BNRV: A Lightweight SIMD Extension
for Efficient BitNet Inference on RISC-V CPUs, Zijun Jiang and Yangdi Lyu
L2 Design and Evaluation of a N-Trace
Compliant Hardware Tracer for RISC-V Processors, Omer Karslioglu and Ismail
Akturk
L3 FlexIO: A Scalable IO Chiplet
Architecture with Flexible Memory Controller Mapping, Junpei Huang, Haobo Xu,
Yinhe Han and Ying Wang
L4 Register Bridging: A Lightweight
Microarchitectural Approach for Skipping Overhead Instructions in Distance‑Based
ISA Processors, Fan Yang, Toru Koizumi, Jun Li, Shu Sugita, Yuriko Yamauchi,
Ryota Shioya, Junichiro Kadomoto and Hidetsugu Irie
S1 XDMA: A Distributed, Extensible DMA
Architecture for Layout‑Flexible Data Movements in Heterogeneous
Multi-Accelerator SoCs, Fanchen Kong, Yunhao Deng, Xiaoling Yi, Ryan Antonio
and Marian Verhelst
CS5
Session 10B - System
Level AI Optimization - Session Chair: Jiaqi Gu - Arizona State University (ASU) (Jiaqi.Gu@asu.edu)
L1 AceHomo: Accelerating Privacy
Preserving Inference through Dynamic Level Adjustment, Hongyan Li, Jinkai
Zhang, Hang Lu and Xiaowei Li
L2 HyperDrone: an Accurate, Robust,
Fast, and Energy-Efficient Approach for Drone Classification, Shriniwas
Kulkarni, Flavio Ponzina and Tajana Rosing
L3 Access Frequency-Aware Storage
Reduction for Deep Learning Recommendation Model, Chia-Chun Wang, Chuan-Yao Lai
and Ren-Shuo Liu
L4 Recommendation-Expert Framework
for Fast and Adaptive Scheduling in Computing Power Network, Yu Chen and Wenli
Zheng
S1 Oak: A Fault-Tolerant Shared-Memory
System Atop Memory-Semantic Fabrics, Zhaoxiang Huang, Jianqin Yan, Hao Chen,
Jiaxin Li and Yiming Zhang
LLMs for Hardware Design Challenge [Register here] Organizers: Houman Homayoun,
Soheil Salehi, Farinaz Koushanfar, Benjamin Carrion Schafer, Kevin Immanuel
Gubbi, Mohammadnavid Tarighat, & Chongzhou Fang
HW6
Session 11A -
Emerging trends 2 - Session Chair: Ioannis Savidis, Drexel University (is338@drexel.edu)
L1 TLV-HGNN: Thinking Like a Vertex
for Memory-efficient HGNN Inference, Dengke Han, Duo Wang, Mingyu Yan, Xiaochun
Ye and Dongrui Fan
L2 SageSC: Accelerating GraphSAGE
Minibatch Inference on Memory-Intensive Graphs, Yuchen Gui, Wei Yuan, Qizhe Wu,
Huawen Liang, Letian Zhao, Linfeng Tao, Zhongguang Xu and Xi Jin
L3 In-DRAM True Random Number
Generation Using Simultaneous Multiple-Row Activation: An Experimental Study of
Real DRAM Chips, Ismail Emir Yuksel, Ataberk Olgun, Nisa Bostancı, Oguzhan
Canpolat, Geraldo Francisco De Oliveira Junior, Mohammad Sadrosadati, Abdullah
Giray Yaglikci and Onur Mutlu
S1 TIPS: Augment Memory Tagging to
Defend Against Prefetcher Side Channels, Yubiao Huang, Peinan Li, Huan Qiao,
Yunkai Bai, Shiwen Wang, Dan Meng and Rui Hou
S2 Adaptive ML-KEM: A Configurable
HW-SW Architecture for Post-Quantum Cryptography, Wenkai Wang, Chao Liu, Zhe Sun, Lei Ju and Zimeng Zhou
SS3 Session 11B SS-3:
GenAI Meets Silicon: LLMs in Hardware Design, Verification, and Security - Session Chair: Hadi Kamali - University of Central Florida (UCF) (kamali@ucf.edu)
P1 FV-PAL: Scalable Formal Verification through Partitioning and LLM-guided Property Generation, Sudipta Paria, Aritra Dasgupta, Dinesh R. Ankireddy, Prabuddha
Chakraborty, and Swarup Bhunia
P2 LLM Reasoning within Hardware
Design: Models, Metrics, and Methodologies, Matthew DeLorenzo, Kevin
Tieu, and Jeyavijayan Rajendran
P3 CircuitGuard: Privacy-Preserving
Fine-Tuning of LLMs Against Hardware IP Leakage, Nowfel Mashnoor, Mohammad
Akyash, Hadi Kamali, and Kimia Azar
P4 Multi-Agent LLMs for Hardware
Security Verification, Jayeeta Chaudhuri and Farshad Firouzi
CS6 Session 12A -
Specialized High‑Performance Computing - Session Chair: Sabrina Ahmed, University of Texas at Dallas (Sabrina.Ahmed@UTDallas.edu)
L1 FlashMP: Fast Discrete
Transform-Based Solver for Preconditioning Maxwell’s Equations on GPUs, Haoyuan
Zhang, Yaqian Gao, Xinxin Zhang, Jialin Li, Runfeng Jin, Yidong Chen, Feng
Zhang, Wu Yuan, Wenpeng Ma, Shan Liang, Jian Zhang and Zhonghua Lu
L2 MH-SpGEMM: Efficient Sparse
General Matrix-Matrix Multiplication on Modern GPUs via Masking and Hashing
Cooperative Optimization, Shuang Yang, Yaobin Wang, Ling Li, Qian Peng and
Qiong Yu
L3 TensTFM: Efficient Total Focusing
Method for Ultrasonic Array Imaging on Dataflow Accelerators, Jieran Zhang,
Bizhao Shi and Guojie Luo
L4 Design of an Online Surface Code
Decoder Using Union-Find Algorithm, Takuya Kasamura, Junichiro Kadomoto and
Hidetsugu Irie
S1 Early Termination with Activation
Sign Prediction for Energy-Efficient CNN Inference Using Sum-of-Power-of-Two
Quantization, Emir Mehmet Eryilmaz, Selim Sandal and Ismail Akturk
SS4 Session 12B SS-4 Sustainable
Hardware Accelerators with Integrated Electro-Photonics - Session Chair: Muhammad Rashed, UT Arlington (muhammad.rashed@uta.edu)
P1 Accelerating Diffusion Models for
Generative AI Applications with Silicon Photonics, Sudeep Pasricha
P2 Toward Lifelong-Sustainable
Electronic-Photonic AI Systems via Extreme Area Efficiency, Reconfigurability,
and Robustness, Jiaqi Gu
P3 SUSTAINPHOT: Sustainable
Large-Scale AI Training using Analog Silicon Photonic Accelerators, Dharanidhar
Dang
P4 Scaling Up Operational
Sustainability of Photonic Tensor Cores with Device-Circuit-Signaling Co-Design,
Ishan Thakkar