The 43rd IEEE International Conference on Computer Design

(ICCD 2025)

November 10-12, 2025

Dallas, USA

Tentative Program

All papers (long and short) are allotted 15 minutes: 13 minutes for presentation and 2 minutes for Q&A. All times are in CST.

Papers highlighted in red are best paper candidates.

 

MONDAY

8:00-9:00am

Registration and opening remarks

9:00-10:00am

Keynote 1: Re-Engineering Engineering for the Next Era of IC Design, Dr. Sabya Das, Executive Director, R&D Engineering, Synopsys

10:00-11:00am

Session 1A - Security and privacy for AI hardware (Test, Verification and Security)

Session 1B - Logic and Circuit Design (LCD)

 

 

 

11:00-11:30am

Coffee Break

11:30-12:45pm

 

Session 2A - Scalable AI Training (Computing Systems)

Session 2B - Emerging Trends (Hardware Architectures)

 

 

 

12:45-2:30pm

Lunch Break

2:30-3:30pm

Session 3A - Memory Management and Performance Modeling (Software Architecture, Compilers and Tool Chains)

 

 

Session 3B - Yield Analysis & 2.5D/3D Physical Design (Electronic Design Automation)

 

Tutorial 1:  Systems Design for Efficient MoE-based LLM Inference

3:30-4:00pm

Coffee Break

4:00-5:15pm

Session 4A - Memory System Innovations (Computing Systems)

Session 4B - Emerging Memory Technologies (Hardware Architectures)

 

 

 

 

 

 

5:30-6:30pm

Welcome Reception

MONDAY – SESSIONS DETAILS

 

 

TVS1        Session 1A - Security and privacy for AI hardware - Session Chair: Hadi Kamali - University of Central Florida (UCF) (kamali@ucf.edu)

L1              HElix: Genome Similarity Detection in the Encrypted Domain, Rostin Shokri, Charles Gouert and Nektarios Georgios Tsoutsos

L2              Targeted Fault Injection Attack on Semantic Segmentation Models, Jhon Ordoñez and Chengmo Yang

L3              Towards Low-Latency and Adaptive Ransomware Detection Using Contrastive Learning, Zhixin Pan, Ziyu Shu and Amberbir Alemayoh

L4              SecNPU: Securing LLM inference on NPU, Xuanyao Peng, Yinghao Yang, Shangjie Pan, Junjie Huang, Yujun Liang, Hang Lu, Fengwei Zhang and Xiaowei Li

 

LCD1        Session 1B - Logic and Circuit Design - Session Chair: Hossein Pedram - University of Texas at Dallas (UTD) (Hossein.Pedram@UTDallas.edu)

L1              DMP-BFP: Dynamic Mixed-Precision Block Floating-Point and Exponent-Guided Precision Adjustment, Yu-Chih Tsai, Chia-Cheng Chang and Ren-Shuo Liu

L2              Hardware Efficient Multiplier design using an Optimal mix of Approximate Booth Encodings, C. Kumar N S, Bhavana S, Ajitesh Kumar Singh and M. Rao

L3              PolyPE: An Efficient Multi-Precision Multi-Mode Floating-Point Processing Element for HPC and AI, Zhenzhen Jia, Hongbing Tan, Ling Yang, Hui Guo, Kun Zeng, Junsheng Chang, Yongwen Wang and Libo Huang

S1             CHQ-SC: Compact and High-Quality Stochastic Computing Framework Using Magnetic Tunnel Junction, Yu Ma, Jianmin Zhang, Yan Sun and Fu Siqing

 

CS1           Session 2A - Scalable AI Training - Session Chair: Minah Lee - University of Texas at Dallas (UTD) (Minah.Lee@utdallas.edu)

L1              RAM-Wafer: RL-based Automatic Mapping Framework for Large-scale AI Training on Wafer-scale Computing, Xu Dai, Dehao Kong, Xufeng He, Zijun Xu, Shaopeng Zhai, Yang Hu and Shouyi Yin

L2              DHeLlam: General-Purpose, Automatic Micro-batch Co-execution for Distributed LLM Training, Haiquan Wang, Chaoyi Ruan, Jia He, Jiaqi Ruan, Chengjie Tang, Xiaosong Ma and Cheng Li

L3              A Co-Design Framework for Graph Processing on CPU-GPU Heterogeneous Platforms, Yuan Zhang, Huawei Cao, Yiming Sun, Ming Dun, Jie Zhang and Xiaochun Ye

L4              Towards Affordable, Adaptive and Automatic GNN Training on CPU-GPU Heterogeneous Platforms, Tong Qiao, Ao Zhou, Yingjie Qi, Yiou Wang, Han Wan, Jianlei Yang and Chunming Hu

 

HW1        Session 2B - Emerging Trends - Session Chair: Fang Li - Oklahoma Christian University (fang.li@oc.edu)          

L1              Enhancing Transformer Inference Efficiency on FPGA through Fully Fusion and Integer-Only Quantization Techniques, Zhenqi Li, Yuan Li, Mingche Lai, Puguang Liu, Qiang Wang, Yankang Zhao, Hanyuan Li and Xingyun Qi

L2              RACE-IT: A Reconfigurable Analog Computing Engine for In-Memory Transformer Acceleration, Lei Zhao, Aishwarya Natarajan, Luca Buonanno, Archit Gajjar, Ron Roth, Sergey Serebryakov, John Moon, Omar Eldash, Jim Ignowski and Giacomo Pedretti

L3              PACE-lite: Compact and Efficient Piecewise Polynomial Approximation for Transformer Nonlinearity Acceleration, Arpan Suravi Prasad, Gamze Islamoglu, Luca Bertaccini, Davide Rossi, Francesco Conti and Luca Benini

L4              QuFi: Adaptive Tiled Gustavson Output Reuse for Edge Sparse DNN Accelerators, Adrián Navarro, José Cano, José L. Abellán and Manuel E. Acacio

S1             HBM-aware Number Theoretic Transform Accelerator for Zero-Knowledge Proof, Sangwon Shin, Ngoc-Son Pham, Lei Xu, Weidong Shi and Taeweon Suh

 

SW1         Session 3A - Memory Management and Performance Modeling - Session Chair: Chaitali Sathe, The University of Texas at Dallas (ChaitaliGajanan.Sathe@utdallas.edu)

L1              Repo: Proactive Swapping Exploiting Loop Patterns in Modern Applications, Jiahui Zhang, Qiang Cao, Yekang Zhan, Yuchen Hu and Jie Yao

L2              RT-PMalloc: Optimizing Persistent Memory Allocation for Soft Real-Time Systems, Yuquan Chi, Yinjin Fu and Nong Xiao

L3              A Scalable and Overflow-Tolerant Mechanism for Minimum Virtual Time Tracking, Gyusun Lee, Seungwoo Jin, Jiwon Woo and Jinkyu Jeong

S1             CAST: An Efficient Framework for Schedules Performance Prediction based on Compact ASTs, Qingqiu Lan, Ao Ren, Zhenyu Wang, Wei Li, Hongbin Zhu, Yujuan Tan, Duo Liu, Kan Zhong and Chaoxia Qin

 

EDA1       Session 3B - Yield Analysis & 2.5D/3D Physical Design (Electronic Design Automation) - Session Chair: Benjamin Carrion Schafer, UT Dallas (schaferb@utdallas.edu)

L1              STAMP-2.5D: Structural and Thermal Aware Methodology for Placement in 2.5D Integration, Varun Parekh, Zachary Wyatt Hazenstab, Srivatsa Rangachar Srinivasa, Krishnendu Chakrabarty, Kai Ni and Vijaykrishnan Narayanan

L2              OpenYield: An Open-Source SRAM Yield Analysis and Optimization Benchmark Suite, Shan Shen, Xingyang Li, Zhuohua Liu, Yikai Wang, Yiheng Wu, Junhao Ma, Yuquan Sun and Wei W. Xing

S1             3DPX - An Open-Source Methodology for 3D Physical Design Exploration, George Goudroumanis, Maria Pantazi, George Floros, Athanasios Tziouvaras, George Stamoulis and Alberto Garcia-Ortiz

S2             Declarative Synthesis and Multi-Objective Optimization of Stripboard Circuit Layouts Using Answer Set Programming, Fang Li

 

T1             Tutorial 1:  Systems Design for Efficient MoE-based LLM Inference, Pavan Miriyala (AMD Research Singapore) and Haris Javaid (AMD Singapore) (slides)

 

CS2           Session 4A - Memory System Innovations - Session Chair: Fang Li - Oklahoma Christian University (fang.li@oc.edu)

L1              R2Hash: A Read-Optimized and Resize-Friendly Hashing Index for Persistent Memory, Jinlei Hu, Bo Chen, Miaosong Zhang, Jing Hu, Jianxi Chen and Dan Feng

L2              ALPHA: A Scalable Lock-Free Partitioned Hash Index for Persistent Memory on NUMA Architectures, Qiyang Zheng, Hao Hu, Hao Huang, Yanqi Pan, Yifeng Zhang, Wen Xia, Xiangrui Meng and Xudong Li

L3              DDLM: Demand-Aware Dynamic Link Width Management for Energy-Efficient CXL Memory, Taejeong Kim, Junbum Park, Yongho Lee and Seokin Hong

L4              Computing-In-Memory Dataflow for Minimal Buffer Traffic, Choongseok Song and Doo Seok Jeong

S1             PIMFY: Eliminating Remote Page Walks in MCM GPUs, Junsung Kim, Sungwoo Kim, Seunghyun Jin and Won Woo Ro

 

HW2        Session 4B - Emerging Memory Technologies - Session Chair: Victoria Gammenthaler, The University of Texas at Dallas (Victoria.Gammenthaler@UTDallas.edu)

L1              PriME: PIM-Aware Efficient Compression for Memory-Bound Embedding Layers in sLLMs, Junghyeok Lee, Jihoon Jang and Hyun Kim

L2              Dissecting and Re-architecting 3D NAND Flash PIM Arrays for Efficient Single-Batch Token Generation in LLMs, Yongjoo Jang, Sangwoo Hwang, Hojin Lee, Sangwoo Jung, Donghun Lee, Wonbo Shim and Jaeha Kung

S1             MamCIMFlow: An Integrated Co-Design of RRAM-Based CIM and Selective State-Space Streaming for Efficient Mamba Model Acceleration, Mingzi Li, Zhongrui Wang, Zhongwen Ye, Tao Pan and Han Wang

S2             PIM-SUM: Fast and Reliable In-Memory Summation for Recommendation Systems, Fan Li, Ruizhi Zhu, Huize Li, Di Wu and Xin Xin

L3              CMC: Compound Memory-Computing Architecture for Energy-Efficient CNN Accelerators, Ming Han, Jin Wu, Jian Dong, Ye Wang and Gang Qu


 

 

 

 

TUESDAY

8:00-9:00am

Registration and opening remarks

9:00-10:00am

Keynote 2: AI for Chip Design: The Inflection Point Is Now, Dr. Mark Ren - Director of Design Automation Research, NVIDIA

10:00-11:15am

Session 5A - Large‑Scale Model Inference (Computing Systems)

Session 5B - Design, Testing, and Verification (Test, Verification and Security)

 

Session 5C: Quantum Systems Security: Threats, Forensics & Defenses (Special Session)

 

11:15-11:45am

Coffee Break

11:45-12:45pm

 

Session 6A - Architecture-Aware Compilation and Scheduling (Software Architecture, Compilers and Tool Chains)

Session 6B - Timing‑Driven & Physically‑Aware Optimization (Electronic Design Automation)

Session 6C: Efficient and Secure Generative AI (Special Session)

12:45-2:30pm

Lunch Break

2:30-4:00pm

Session 7A - Efficient Data Storage (Computing Systems)

Session 7B - Fault tolerance and resilience in Systems (Test, Verification and Security)

 

Tutorial 2: Machine Learning for Automated Physical Design

 

4:00-4:30pm

Coffee Break

4:30-5:45pm

Session 8A - Advanced Hardware Acceleration Trends (Hardware Architectures)

Session 8B - AI‑Assisted RTL & HLS Code Generation (Electronic Design Automation)

 

 

 

 

 

 

6:30-9:30pm

Social Event 1 - Dinner

TUESDAY – SESSIONS DETAILS

 

 

CS3           Session 5A - Large-Scale Model Inference - Session Chair: Victoria Gammenthaler, The University of Texas at Dallas (Victoria.Gammenthaler@UTDallas.edu)

L1              Ghidorah: Fast LLM Inference on Edge with Speculative Decoding and Hetero-Core Parallelism, Jinhui Wei, Ye Huang, Yuhui Zhou, Jiazhi Jiang and Jiangsu Du

L2              DualSpar: A Dual-Granularity Memory Framework with Adaptive Sparsity for Efficient LLM Inference, Yujuan Tan, Jiayi Guo, Zhuoxin Bai, Sanle Zhao, Yujiao Wang, Zongjie Wang, Ao Ren, Duo Liu, Kan Zhong and Jun Liu

L3              Throughput-Oriented LLM Inference via KV-Activation Hybrid Caching with a Single GPU, Sanghyeon Lee, Hongbeen Kim, Soojin Hwang, Guseul Heo, Minwoo Noh and Jaehyuk Huh

L4              AuLoRA: Fine-Grained Loading and Computation Orchestration for Efficient LoRA LLM Serving, Xiao Shi, Jiangsu Du, Zhiguang Chen and Yutong Lu

S1             Taming Sparse Giants: Deploying Mixture-of-Experts on 3D Heterogeneous Compute-in-Memory Systems, Pragya Sharma, Ashish Reddy Bommana, Farshad Firouzi and Krishnendu Chakrabarty

 

TVS2        Session 5B - Design, Testing, and Verification - Session Chair: Kimia Azar - University of Central Florida (UCF) (azar@ucf.edu) 

L1              µSTT: Microarchitecture Design for Speculative Taint Tracking, Boru Chen, Rutvik Choudhary, Kaustubh Khulbe, Archie Lee, Adam Morrison and Christopher W. Fletcher

L2              Hot-FV: A Semi-Formal Test Generation Framework for RTL Functional Coverage using Warm Starting States, Ziyue Zheng, Zhiyuan Yan, Xiangchen Meng, Guangyu Hu, Hongce Zhang and Yangdi Lyu

L3              ATPG-Based Weighted Scan Chain Control for Programmable Low-Power LBIST, Yumei Hu, Hairui Cai, Xiaohui Xue, Yaning Wang, Yu Huang, Zhipeng Lyu, Zhouxing Su, Zezhong Wang and Xing Wang

S1             FitFuzz: Depth-Oriented Coverage-Guided Fuzzing via Fitness-Based Seed Scheduling, Venkat Nitin Patnala and Sai Manoj Pudukotai Dinakarrao

S2              Firebreak: Efficient In-process Protection with Hardware-Assisted Dynamic Compartmentalization, Yue Jin, Yibin Xu, Han Wang, Chengyuan Zhang, Tianyi Huang, Tianyue Lu and Mingyu Chen

 

SS-1:        Session 5C - Quantum Systems Security: Threats, Forensics & Defenses - Session Chair: Chaitali Sathe, The University of Texas at Dallas (ChaitaliGajanan.Sathe@utdallas.edu)

P1             Oracle-Guided Attack on Quantum Circuit Obfuscation, Yuntao Liu

P2             Forensics of Error Rates of Quantum Hardware, Swaroop Ghosh

P3             QTIME: A Machine Learning Framework for Timing Side-Channel Analysis in Quantum Circuit Simulators, Ben Dong, Hui Feng and Qian Wang

P4             Recovering QSVT Polynomials from Side-Channel Information on Quantum Computers, Jakub Szefer

P5             Concolic Testing for Quantum Compilers, Kanad Basu

P6             Benchmarking and Characterising NISQ Computers Through Uncertainty Quantification of Variational Quantum Algorithms, Qiang Guan

 

SW2         Session 6A - Architecture-Aware Compilation and Scheduling - Session Chair: Baharealsadat Parchamdar, The University of Texas at Dallas (Baharealsadat.Parchamdar@UTDallas.edu)

L1              NaviMap: Partial Order-guided Neural Architecture via Deep Q-Networks for Efficient CGRA Mapping, Mingyang Kou, Jun Zeng, Xinyu Peng, Weiqing Ji and Hailong Yao

L2              IasRT: Interference-Aware and SLO-Driven GPU Scheduling for Real-Time DNN Inference, Heming Zhong, Jinhui Wei, Yujia Fu, Dan Huang and Yutong Lu

L3              AICAWS: Arithmetic Intensity based Cache-Conscious Adaptive Warp Scheduler, Bo Yuan, Sheng Liu, Zekun Jiang, Jianfeng Cui and Yang Guo

S1             A Dynamic Virtual Memory Management System for LLMs on AI Chips, Gaolin Wei, Zhaorui Zhang, Jiaqi Xu, Chen Zhang, Xin Yao and Benben Liu

 

EDA2       Session 6B - Timing-Driven & Physically-Aware Optimization - Session Chair: Ioannis Savidis, Drexel University (is338@drexel.edu)

L1              CPA-remap: Critical-path-based Physically Aware Remapping Framework for Timing Optimization, Mingxiao He, Pengcheng Huang, Zhenyu Zhao and Peiyun Bian

L2              Threshold Voltage Tuning Technique for Leakage Power Recovery, Jaejoon Yoon and Taewhan Kim

L3              Timing-Driven Global Placement with Entropy-Mobility Guided Pin-to-Pin Weighting, Youzhi Zheng, Zhengjie Zhao, Linhao Lu, Xiaodong Zhu, Wenxin Yu and Jingwei Lu

S1             Timing-Driven Multi-bit Flip-flop Allocation Utilizing Design-Technology Co-optimization Techniques, Yeongyeong Shin, Sehyeon Chung and Taewhan Kim

 

SS-2:        Session 6C - Efficient and Secure Generative AI - Session Chair: Khaza Anuarul Hoque, University of Missouri (hoquek@missouri.edu)

P1             GAN–BiLSTM–HDC: A Hybrid Framework for Robust and Hardware-Efficient Malware Detection, Emilien J Meyer, Abu Kaisar Masum, Mehran Moghadam, Lida Kouhalvandi, Gourav Datta, Sercan Aygun, and M. Hassan Najafi

P2             LM-Fix: Lightweight Bit-Flip Detection and Rapid Recovery Framework for Language Models, Ahmad Tahmasivand, Noureldin Zahran, Mohammed Fouda, Ihsen Alouani and Khaled N. Khasawneh

P3             FaRAccel: FPGA-Accelerated Defense Architecture for Efficient Bit-Flip Attack Resilience in Transformer Model, Najmeh Nazari, Banafsheh Saber Latibari, Hosein Mohammadi Makrani, Elahe Hosseini, Chongzhou Fang, Setareh Rafatirad, Hossein Sayadi, Houman Homayoun

P4             Hammering the Diagnosis: Rowhammer-Induced Stealthy Trojan Attacks on ViT-Based Medical Imaging, Banafsheh Saber Latibari, Najmeh Nazari, Hossein Sayadi, Houman Homayoun, Abhijit Mahalonobis

 

CS4           Session 7A - Efficient Data Storage - Session Chair: Chaitali Sathe, The University of Texas at Dallas (ChaitaliGajanan.Sathe@utdallas.edu)

L1              Hybrid-Rewrite: A Rewriting Framework for Hybrid Deduplication and Delta Compression, Qiao Li, Hong Jiang, Zichen Xu, Yucheng Zhang, Junyun Wu and Puchen Lu

L2              NatSep: Little-to-No Overhead Data Separation for Log-Structured Storage Using Native Information, Jinlong Wang, Zhipeng Tan, Yang Xiao, Wenjie Qi, Shikai Tan and Ying Yuan

L3              The Logic of Fingerprint Upgrade in Deduplicated Storage, Cai Deng, Boju Chen, Philip Shilane, Xiangyu Zou, Wen Xia and Hao Hu

L4              Pixel-DNA: Increasing Robustness of Approximate DNA Storage for Images by Using Hierarchical Deduplication, Alex Sensintaffar, David Du and Bingzhe Li

L5              TCFlash: In-Flash Bulk Bitwise Processing via Dynamic Sensing and TLC Encoding in 3D NAND, Habib Ur Rahman, Tharini Suresh, Sudeep Pasricha and Biswajit Ray

S1             Minimizing Read Disturb via Localized Page Allocation for Modern NAND Flash-Based SSDs, Joonseong Hwang, Minkyu Choi, Minjin Park, Jihun Yoon, Yoonho Jang and Seokin Hong

 

TVS3        Session 7B - Fault tolerance and resilience in Systems - Session Chair: Sabrina Ahmed, University of Texas at Dallas (Sabrina.Ahmed@UTDallas.edu)

L1              Laser and Radiation Testing of Compiler-based Protection for Multi-Bit Upsets, Davide Baroffio, Tomas Antonio López, Federico Reghenzani and William Fornaciari

L2              Masked Gadgets for Integer-Floating-Point Conversion with Applications to Falcon, Shuyi Chen, Jingdian Ming, Yuejun Liu, Yiwen Gao and Yongbin Zhou

L3              WSSR: Weight Set Segmentation and Recovery for Fault Resilient Transformers, Ntsee Ndingwan and Chengmo Yang

S1             ECOLogic: Enabling Circular, Obfuscated, and Adaptive Logic via eFPGA-Augmented SoCs, Ishraq Tashdid, Dewan Saiham, Nafisa Anjum, Tasnuva Farheen and Sazadur Rahman

S2             Enhancing Key-Recovery Chosen-Ciphertext Side-Channel Attacks on NTRU Using LDPC, Xiaofei Tong, Denis Nabokov and Qian Guo

 

T2             Tutorial 2: Machine Learning for Automated Physical Design, Ioannis Savidis and Pratik Shrestha (slides)

 

HW3        Session 8A - Advanced Hardware Acceleration Trends - Session Chair: Daksith Chandrasekera, The University of Texas at Dallas (csc240000@utdallas.edu)              

L1              A Photonic Accelerator for Deep Learning Training, Yuan Li

L2              FINEA: An Efficient Neural Network Accelerator Exploiting Factorized Input Features, Yujin Kim, Chanhun Jeong, Yunho Oh, Myung Kuk Yoon and Gunjae Koo

L3              Flame: A Multiplier-Free LLM Accelerator with Dynamic Block Floating Point, Ao Lv, Haishuang Fan and Guihai Yan

S1             Hermes: Accelerating Packet Processing in DPU with Neural Network, Rui Meng, Xinyu Chen, Hanyue Lin, Jingya Wu, Wenyan Lu, Xiaowei Li and Guihai Yan

S2             ASMA: An Anisotropy Scaling Memristor-based Accelerator for LLM Inference, Zijian Xiong, Xiangrui Yang, Yuhang Zhang, Yue Zhou, Jianguo Yang, Yaoyu Tao, Xiangshui Miao and Yuhui He

 

EDA3       Session 8B - AI-Assisted RTL & HLS Code Generation - Session Chair: Baharealsadat Parchamdar, The University of Texas at Dallas (Baharealsadat.Parchamdar@UTDallas.edu)

L1              RTLBench: A Multi-Dimensional Benchmark Suite for Evaluating LLM-Generated RTL Code, Zhigang Fang, Renzhi Chen, Yang Guo, Huadong Dai and Lei Wang

L2              SAGE-HLS: Syntax-Aware AST-Guided LLM for High-Level Synthesis Code Generation, M Zafir Sadik Khan, Nowfel Mashnoor, Mohammad Akyash, Kimia Azar and Hadi Kamali

L3              LLM4MCU-Onto: Leveraging LLMs for Automated Ontology Generation from Microcontroller Reference Manual, Asmita Asmita, Grisha Bandodkar, Sujan Ghimire, Shaurya Srivastav, Soheil Salehi and Houman Homayoun

S1             LLM-Driven Code Generation for Neural Networks on FPGAs: Bridging Python and HLS, Rupesh Raj Karn, Johann Knechtel, Ramesh Karri and Ozgur Sinanoglu

 


 


 

 

WEDNESDAY

8:00-9:00am

Registration and opening remarks

9:00-10:00am

Keynote 3: Application and SW-HW codesign approach to driving low power architecture for GenAI, Akila Subramaniam, Sr Fellow, AMD

10:00-11:00am

Session 9A - Processor-Based Solutions 1 (Hardware Architectures)

Session 9B - Advanced Hardware Design Flows & Synthesis Techniques (Electronic Design Automation)

 

Tutorial 3:  Engineering Privacy at the Edge: A Practical Guide to Differential Privacy in System Architectures

 

11:00-11:30am

Coffee Break

11:30-1:00pm

 

Session 10A - Processor-Based Solutions 2 (Hardware Architectures)

Session 10B - System Level AI Optimization (Computing Systems)

 

LLMs for Hardware Design Challenge [Register here]

 

1:00-2:30pm

Lunch Break

2:30-3:30pm

Session 11A - Emerging Trends 2 (Hardware Architectures)

 

 

Session 11B - GenAI Meets Silicon:  LLMs in Hardware Design, Verification, and Security (Special Session)

 

LLMs for Hardware Design Challenge [Register here]

3:30-4:00pm

Coffee Break

4:00-5:00pm

Session 12A - Specialized High‑Performance Computing (Computing Systems)

 

 

Session 12B - Sustainable Hardware Accelerators with Integrated Electro-Photonics (Special Session)

 

LLMs for Hardware Design Challenge [Register here]

5:30-6:00pm

Closing Remarks

 

 

6:30-10:30pm

Social Event 2 – Mavericks Game

WEDNESDAY – SESSIONS DETAILS

 

 

 

HW4        Session 9A - Processor-Based Solutions 1 - Session Chair: Hossein Pedram - University of Texas at Dallas (UTD) (Hossein.Pedram@UTDallas.edu)

L1              TROOP: At-the-Roofline Performance for Vector Processors on Low Operational Intensity Workloads, Navaneeth Kunhi Purayil, Diyou Shen, Matteo Perotti and Luca Benini

L2              RVME: An Efficient Matrix Engine Design based on Matrix Extension of RISC-V, Wanqi Chen, Weidong Yang, Yiming Guo, Jing Qiu, Renpei Wang, Jianfei Jiang, Naifeng Jing and Qin Wang

L3              TeraNoC: A Multi-Channel 32-bit Fine-Grained, Hybrid Mesh-Crossbar NoC for Efficient Scale-up of 1000+ Core Shared-L1-Memory Clusters, Yichao Zhang, Zexin Fu, Tim Fischer, Yinrong Li, Marco Bertuletti and Luca Benini

L4              THENA: Accelerating Torus Fully Homomorphic Encryption on Energy-Efficient Heterogeneous Architecture, Yanze Wu and Md Tanvir Arafin

S1             SSM-RDU: A Reconfigurable Dataflow Unit for Long-Sequence State-Space Models, Sho Ko and Kunle Olukotun

 

EDA4       Session 9B - Advanced Hardware Design Flows & Synthesis Techniques - Session Chair: Hadi Kamali - University of Central Florida (UCF) (kamali@ucf.edu)

L1              Optimization of Wire Pipelining and Channel Parallelism for 2D-Mesh NoC Physical Design, Pei-Huan Tsai, Maico Cassel dos Santos, Joseph Zuckerman, Kuan-Lin Chiu and Luca Carloni

L2              Agile Design Flow for Cryptographic Hardware Accelerators, Deng Liming, Zhu Guowei, Cao Wei, Fan Xitian and Zhou Xuegong

L3              Decomposition Attack on Structural Logic Locking of Reversible Circuits, Feng-Jie Chao and Yung-Chih Chen

S1             Supporting Pipelined Memory Accesses in Processor Synthesis, Essien Taylor, Colin Schilf, Sebastian Phemister and Russ Joseph

 

T3             Tutorial 3:  Engineering Privacy at the Edge: A Practical Guide to Differential Privacy in System Architectures, Olivera Kotevska, Eyhab Al-Masri, and Wenjun Yang (slides)

 

HW5        Session 10A - Processor-Based Solutions 2 (Hardware Architectures) - Session Chair: Daksith Chandrasekera, The University of Texas at Dallas (csc240000@utdallas.edu)

L1              BNRV: A Lightweight SIMD Extension for Efficient BitNet Inference on RISC-V CPUs, Zijun Jiang and Yangdi Lyu

L2              Design and Evaluation of a N-Trace Compliant Hardware Tracer for RISC-V Processors, Omer Karslioglu and Ismail Akturk

L3              FlexIO: A Scalable IO Chiplet Architecture with Flexible Memory Controller Mapping, Junpei Huang, Haobo Xu, Yinhe Han and Ying Wang

L4              Register Bridging: A Lightweight Microarchitectural Approach for Skipping Overhead Instructions in Distance-Based ISA Processors, Fan Yang, Toru Koizumi, Jun Li, Shu Sugita, Yuriko Yamauchi, Ryota Shioya, Junichiro Kadomoto and Hidetsugu Irie

S1             XDMA: A Distributed, Extensible DMA Architecture for Layout-Flexible Data Movements in Heterogeneous Multi-Accelerator SoCs, Fanchen Kong, Yunhao Deng, Xiaoling Yi, Ryan Antonio and Marian Verhelst

 

CS5           Session 10B - System Level AI Optimization - Session Chair: Jiaqi Gu - Arizona State University (ASU) (Jiaqi.Gu@asu.edu)

L1              AceHomo: Accelerating Privacy Preserving Inference through Dynamic Level Adjustment, Hongyan Li, Jinkai Zhang, Hang Lu and Xiaowei Li

L2              HyperDrone: an Accurate, Robust, Fast, and Energy-Efficient Approach for Drone Classification, Shriniwas Kulkarni, Flavio Ponzina and Tajana Rosing

L3              Access Frequency-Aware Storage Reduction for Deep Learning Recommendation Model, Chia-Chun Wang, Chuan-Yao Lai and Ren-Shuo Liu

L4              Recommendation-Expert Framework for Fast and Adaptive Scheduling in Computing Power Network, Yu Chen and Wenli Zheng

S1             Oak: A Fault-Tolerant Shared-Memory System Atop Memory-Semantic Fabrics, Zhaoxiang Huang, Jianqin Yan, Hao Chen, Jiaxin Li and Yiming Zhang

 

LLMs for Hardware Design Challenge [Register here] Organizers: Houman Homayoun, Soheil Salehi, Farinaz Koushanfar, Benjamin Carrion Schafer, Kevin Immanuel Gubbi, Mohammadnavid Tarighat, & Chongzhou Fang

 

HW6        Session 11A - Emerging Trends 2 - Session Chair: Ioannis Savidis, Drexel University (is338@drexel.edu)

L1              TLV-HGNN: Thinking Like a Vertex for Memory-efficient HGNN Inference, Dengke Han, Duo Wang, Mingyu Yan, Xiaochun Ye and Dongrui Fan

L2              SageSC: Accelerating GraphSAGE Minibatch Inference on Memory-Intensive Graphs, Yuchen Gui, Wei Yuan, Qizhe Wu, Huawen Liang, Letian Zhao, Linfeng Tao, Zhongguang Xu and Xi Jin

L3              In-DRAM True Random Number Generation Using Simultaneous Multiple-Row Activation: An Experimental Study of Real DRAM Chips, Ismail Emir Yuksel, Ataberk Olgun, Nisa Bostancı, Oguzhan Canpolat, Geraldo Francisco De Oliveira Junior, Mohammad Sadrosadati, Abdullah Giray Yaglikci and Onur Mutlu

S1             TIPS: Augment Memory Tagging to Defend Against Prefetcher Side Channels, Yubiao Huang, Peinan Li, Huan Qiao, Yunkai Bai, Shiwen Wang, Dan Meng and Rui Hou

S2             Adaptive ML-KEM: A Configurable HW-SW Architecture for Post-Quantum Cryptography, Wenkai Wang, Chao Liu, Zhe Sun, Lei Ju and Zimeng Zhou

 

SS3           Session 11B - GenAI Meets Silicon: LLMs in Hardware Design, Verification, and Security - Session Chair: Hadi Kamali - University of Central Florida (UCF) (kamali@ucf.edu)

P1             FV-PAL: Scalable Formal Verification through Partitioning and LLM-guided Property Generation, Sudipta Paria, Aritra Dasgupta, Dinesh R. Ankireddy, Prabuddha Chakraborty, and Swarup Bhunia

P2             LLM Reasoning within Hardware Design:  Models, Metrics, and Methodologies, Matthew DeLorenzo, Kevin Tieu, and Jeyavijayan Rajendran

P3             CircuitGuard: Privacy-Preserving Fine-Tuning of LLMs Against Hardware IP Leakage, Nowfel Mashnoor, Mohammad Akyash, Hadi Kamali, and Kimia Azar

P4             Multi-Agent LLMs for Hardware Security Verification, Jayeeta Chaudhuri and Farshad Firouzi

 

CS6           Session 12A - Specialized High-Performance Computing - Session Chair: Sabrina Ahmed, University of Texas at Dallas (Sabrina.Ahmed@UTDallas.edu)

L1              FlashMP: Fast Discrete Transform-Based Solver for Preconditioning Maxwell’s Equations on GPUs, Haoyuan Zhang, Yaqian Gao, Xinxin Zhang, Jialin Li, Runfeng Jin, Yidong Chen, Feng Zhang, Wu Yuan, Wenpeng Ma, Shan Liang, Jian Zhang and Zhonghua Lu

L2              MH-SpGEMM: Efficient Sparse General Matrix-Matrix Multiplication on Modern GPUs via Masking and Hashing Cooperative Optimization, Shuang Yang, Yaobin Wang, Ling Li, Qian Peng and Qiong Yu

L3              TensTFM: Efficient Total Focusing Method for Ultrasonic Array Imaging on Dataflow Accelerators, Jieran Zhang, Bizhao Shi and Guojie Luo

L4              Design of an Online Surface Code Decoder Using Union-Find Algorithm, Takuya Kasamura, Junichiro Kadomoto and Hidetsugu Irie

S1             Early Termination with Activation Sign Prediction for Energy-Efficient CNN Inference Using Sum-of-Power-of-Two Quantization, Emir Mehmet Eryilmaz, Selim Sandal and Ismail Akturk

 

SS4           Session 12B - Sustainable Hardware Accelerators with Integrated Electro-Photonics - Session Chair: Muhammad Rashed, UT Arlington (muhammad.rashed@uta.edu)

P1             Accelerating Diffusion Models for Generative AI Applications with Silicon Photonics, Sudeep Pasricha

P2             Toward Lifelong-Sustainable Electronic-Photonic AI Systems via Extreme Area Efficiency, Reconfigurability, and Robustness, Jiaqi Gu

P3             SUSTAINPHOT: Sustainable Large-Scale AI Training using Analog Silicon Photonic Accelerators, Dharanidhar Dang

P4             Scaling Up Operational Sustainability of Photonic Tensor Cores with Device-Circuit-Signaling Co-Design, Ishan Thakkar