Rethinking Prediction for System Tuning and Architectural Modeling

October 17, 2025 Jonathan Balkind — UCSB

Abstract

In this talk, I will cover two recent papers from our lab, both of which adopt prediction in unconventional ways. Time permitting, I will also talk a little about our plans to enable hardware-enforced, privacy-preserving prediction of tenants’ applications by cloud providers.

To better facilitate application performance programming, we propose a software optimization strategy enabled by a novel low-latency Prediction System Service (PSS). Rather than relying on nuanced domain-specific knowledge or slapdash heuristics, a system service for prediction encourages programmers to spend their time uncovering new levers for optimization rather than worrying about the details of their control. The core idea is to write optimizations that improve performance in specific cases, or under specific tunings, and to leave the decision of exactly how and when to apply those optimizations to the system, which learns it through feedback. Such a prediction service can be implemented in any number of ways, including as a shared library that can be easily reused by software written in different programming languages, and it opens the door to both new software optimization patterns and new hardware design possibilities.

Modern applications exhibit memory access patterns with complex spatial and temporal relationships. Traditional architectural simulators used to evaluate these applications are highly sequential in nature, particularly for stateful components like caches. We present an innovative approach to cache simulation by reframing the problem from a deep learning perspective. We exploit the fact that memory access traces in any part of a processor design can be represented as two-dimensional heatmaps. Our key insight is that the behaviour of a cache acts as a filter on these heatmap images, and that this filter can be learned as a function using deep learning techniques. Leveraging this observation, we introduce CacheBox, a framework that employs a Generative Adversarial Network (GAN) to learn and replicate the filtering behaviour of caches using memory access heatmaps. We demonstrate that CacheBox generalises with high accuracy across multiple state-of-the-art benchmarks, various cache configurations, different cache hierarchy levels, and even alternative microarchitectural structures.
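
As a rough illustration of the heatmap framing, the sketch below bins a toy memory trace into a two-dimensional grid (address bins by time windows) and then does the same for the misses left over after a small direct-mapped cache. The pair of grids is the kind of input/output relationship the abstract describes as a learnable filter. Everything here is an assumption made for illustration: the function names `to_heatmap` and `direct_mapped_misses`, the binning scheme, and the toy cache are hypothetical and are not taken from CacheBox, whose actual pipeline and GAN model are not described in this abstract.

```python
def to_heatmap(events, trace_len, lo, hi, num_bins, window):
    """Count accesses per (address bin, time window).

    events: list of (time_index, address) pairs.
    Rows are address bins over [lo, hi]; columns are windows of
    `window` consecutive accesses in the original trace.
    """
    num_windows = (trace_len + window - 1) // window
    span = hi - lo + 1
    grid = [[0] * num_windows for _ in range(num_bins)]
    for t, addr in events:
        row = (addr - lo) * num_bins // span
        col = t // window
        grid[row][col] += 1
    return grid


def direct_mapped_misses(events, num_lines, line_size):
    """Return only the events that miss in a toy direct-mapped cache."""
    tags = [None] * num_lines  # block currently held by each cache line
    misses = []
    for t, addr in events:
        block = addr // line_size
        index = block % num_lines
        if tags[index] != block:
            tags[index] = block  # fill the line on a miss
            misses.append((t, addr))
    return misses


# Hypothetical toy trace: the same four addresses touched twice.
trace = [0, 1, 2, 3, 0, 1, 2, 3]
events = list(enumerate(trace))

access_map = to_heatmap(events, len(trace), 0, 3, num_bins=2, window=4)
miss_events = direct_mapped_misses(events, num_lines=2, line_size=2)
miss_map = to_heatmap(miss_events, len(trace), 0, 3, num_bins=2, window=4)
```

In this toy case the second pass over the addresses hits entirely in the cache, so the miss heatmap is empty in its second time window while the access heatmap is not. A learned model in the spirit of the abstract would be trained to map grids like `access_map` to grids like `miss_map`, rather than stepping through the trace sequentially as the toy cache does.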

Speaker Bio

Jonathan Balkind is an Assistant Professor in the Department of Computer Science at the University of California, Santa Barbara. His research interests lie at the intersection of Computer Architecture, Programming Languages, and Operating Systems. Jonathan completed his PhD and MA degrees at Princeton University and his MSci degree at the University of Glasgow. Jonathan was an Open Hardware Trailblazer Fellow and recipient of the NSF CAREER Award. Since 2021, he has served as a Director of the FOSSi Foundation.