CS SEMINAR

Talk 1: Optimizing for the Edge: From Neural Compilation to Instruction Fusion in RISC-V Systems
Talk 2: Accurate and Timely Prefetchers for High-Performance Computers

Speakers

Talk 1: Alexandra Jimborean, Associate Professor, University of Murcia
Talk 2: Alberto Ros, Full Professor, University of Murcia

Chaired by

Dr Trevor E. CARLSON, Associate Professor, School of Computing

tcarlson@comp.nus.edu.sg

27 Jan 2026 Tuesday, 10:00 AM to 11:30 AM

Talk 1: 10am to 10.45am

Abstract:
Bringing neural networks and Transformers to RISC-V edge devices demands both smarter compilers and architecture-aware optimizations. This talk traces a path from high-level graph compilation to low-level instruction reordering, showing how a unified compiler stack can unlock efficiency across the computing stack.
We begin with OML (ONNX-MLIR-LLVM), a portable compilation flow that automatically vectorizes deep learning operators through reduction detection and layout transformations, yielding performance on RISC-V that matches the manually tuned libraries of ONNX Runtime (ORT). We then dive into CAIF (Compiler-Assisted Instruction Fusion), a layout-aware scheduler that reorders fusible operations to expose new hardware fusion opportunities, achieving a speed-up of close to 20%.
Together, these advances reveal how compiler-architecture co-design, from graph to instruction stream, can make RISC-V a competitive target for both neural and general-purpose workloads.

Bio:

Alexandra Jimborean is an Associate Professor (Titular de Universidad) working in the area of compilers and software-hardware co-design. Previously, she held a Ramón y Cajal researcher position at the University of Murcia (2020-2025) and was an affiliated Associate Professor at Uppsala University, Sweden.
She received her PhD from the University of Strasbourg, France, in 2012, for research on compiler-enabled automatic parallelization of loops. She then held a postdoctoral fellowship in Uppsala, Sweden, where she became Assistant Professor in 2015 and was tenured as Associate Professor in 2018.
Alexandra’s main research interests are compile-time code analysis and optimization, and software-hardware co-design for performance, energy efficiency, and security. In particular, her research focuses on compiler techniques that analyze memory access patterns, which are then leveraged to optimize software for emerging architectures.

Talk 2: 10.45am to 11.30am

Abstract:

Prefetching instructions and data is a fundamental technique for designing high-performance computers. There are three key properties to consider when designing an efficient and effective prefetcher: timeliness, coverage, and accuracy. Timeliness is essential: bringing instructions in too early increases the risk of their being evicted from the cache before they are used, while requesting them too late can lead to their arriving after they are demanded. Coverage is important to reduce the number of instruction cache misses, and accuracy is important to ensure that the prefetcher does not pollute the cache or interact negatively with other hardware mechanisms. This talk presents advanced mechanisms for instruction and data prefetching that focus on timeliness and accuracy while also achieving high coverage.

Bio:

Alberto Ros is a full professor in the Computer Engineering Department at the University of Murcia, Spain. Funded by the Spanish government to conduct his Ph.D. studies, he received his Ph.D. in computer science from the University of Murcia in 2009. He held postdoctoral positions at the Universitat Politècnica de València and Uppsala University. He received a European Research Council (ERC) Consolidator Grant in 2018 to improve the performance of multicore architectures, and an ERC Proof of Concept Grant in 2024. Working on cache coherence, memory hierarchy designs, memory consistency, and processor microarchitecture, he has co-authored more than 100 peer-reviewed articles. He has been inducted into the ISCA Hall of Fame and the MICRO Hall of Fame. He is an IEEE Senior Member.


COM3 Level 2
