Congratulations to SPP2377’s CXL-Bridge project! The team has won the Best Paper Award at the ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS ’26) for their paper “Automated Synthesis and Verification of CXL Bridges for Heterogeneous Architectures”. Congratulations to the authors: Anatole Lefort (TU Munich), Julian Pritzi (TU Munich), Nicolò Carpentieri (TU Munich), David Schall (TU Munich), Simon Dittrich (TU Munich), Soham Chakraborty (TU Delft), Nicolai Oswald (NVIDIA), and Pramod Bhatotia (TU Munich).

The authors offer some insight into the award-winning paper:
The modern datacenter is no longer a uniform machine. x86 CPUs, Arm servers, GPUs, and specialized accelerators are increasingly expected to cooperate and maintain a coherent view of memory. Compute Express Link (CXL) has emerged as the connective fabric for this heterogeneous world, promising transparent, cache-coherent remote memory access across architectures. But a promise is not a specification. For the Systems Research Group at the Technical University of Munich (TUM), the gap between CXL’s ambition and its operational reality has been the driving question of the past few years of work, led by Professor Pramod Bhatotia and Dr. Anatole Lefort.
CXL Gap
CXL’s core appeal is interoperability: multiple hosts can coherently access a shared remote memory region as if it were their own local memory. What CXL does not define, however, is how to make this work when those hosts disagree on the fundamentals. Every architecture carries its own cache coherence protocol and memory consistency model (MCM), and CXL v3.0 provides no mechanism to reconcile these differences. The result is a semantic gap that, without a principled translation layer, produces unpredictable behavior, silent memory consistency violations, and bugs that are notoriously difficult to detect.
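To make the consistency-model mismatch concrete, here is a minimal sketch of the classic message-passing litmus test. It is a toy enumeration, not a model of CXL or of any real architecture: the "in-order" mode only approximates a strong model such as x86-TSO for this particular test (which contains no store followed by a load in the same thread), and the "reorder" mode loosely stands in for a weaker model such as Arm's, where independent accesses may be reordered without barriers. All names are illustrative.

```python
from itertools import permutations

# Thread 0 publishes data, then sets a flag. Thread 1 reads the flag, then the data.
T0 = [("st", "data", 1), ("st", "flag", 1)]
T1 = [("ld", "flag", "r1"), ("ld", "data", "r2")]


def interleavings(a, b):
    """Yield every merge of sequences a and b that keeps each one's internal order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def outcomes(allow_reorder):
    """Return the set of (r1, r2) results observable under the toy model."""
    # Toy assumption: a weak model may reorder a thread's independent accesses freely.
    orders0 = list(permutations(T0)) if allow_reorder else [tuple(T0)]
    orders1 = list(permutations(T1)) if allow_reorder else [tuple(T1)]
    seen = set()
    for o0 in orders0:
        for o1 in orders1:
            for trace in interleavings(list(o0), list(o1)):
                mem = {"data": 0, "flag": 0}
                regs = {}
                for kind, addr, arg in trace:
                    if kind == "st":
                        mem[addr] = arg
                    else:
                        regs[arg] = mem[addr]
                seen.add((regs["r1"], regs["r2"]))
    return seen


strong = outcomes(allow_reorder=False)
weak = outcomes(allow_reorder=True)
print("in-order:", sorted(strong))
print("reorder: ", sorted(weak))
```

The outcome (r1, r2) = (1, 0), where the flag is seen as set but the data is still stale, appears only in the reordering mode. Code written against the stronger model's guarantees can therefore break silently when a host with weaker ordering shares the same memory.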
C³
C³: CXL Coherence Controllers for Heterogeneous Architectures, presented at HPCA ’26, introduces the concept of a CXL bridge: a hardware component at the boundary between a host’s native coherence domain and CXL, responsible for translating coherence flows transparently and correctly. Its design rests on two concrete rules. Flow Delegation ensures that any memory operation with globally visible effects is forwarded to the global coherence domain, without requiring modifications to existing protocols. Atomicity prevents any coherence effects in the origin domain until a forwarded operation has fully completed remotely, avoiding the race conditions inherent to CXL’s PCIe-based fabric.
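The two rules can be illustrated with a small software toy. This is only a sketch of one possible reading of Flow Delegation and Atomicity, not the C³ hardware design: the class `ToyBridge`, the operation types, and the choice of which operations count as globally visible are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Op(Enum):
    READ_HIT = auto()   # served by a local cache, no global effect
    READ_MISS = auto()  # needs data from the global domain
    WRITE = auto()      # needs global ownership


# Hypothetical classification of operations with globally visible effects.
GLOBALLY_VISIBLE = {Op.READ_MISS, Op.WRITE}


@dataclass
class ToyBridge:
    pending: set = field(default_factory=set)    # lines with a forwarded op in flight
    stalled: list = field(default_factory=list)  # local requests waiting on such lines
    log: list = field(default_factory=list)      # trace of bridge decisions

    def local_request(self, op, line):
        if line in self.pending:
            # Atomicity: no local coherence effect on this line until the
            # forwarded operation has completed remotely.
            self.stalled.append((op, line))
            self.log.append(("stall", op.name, line))
        elif op in GLOBALLY_VISIBLE:
            # Flow Delegation: hand the operation to the global domain.
            self.pending.add(line)
            self.log.append(("forward", op.name, line))
        else:
            self.log.append(("local", op.name, line))

    def remote_complete(self, line):
        # The global domain reports completion; replay anything that was held back.
        self.pending.discard(line)
        self.log.append(("complete", line))
        retry = [r for r in self.stalled if r[1] == line]
        self.stalled = [r for r in self.stalled if r[1] != line]
        for op, stalled_line in retry:
            self.local_request(op, stalled_line)


bridge = ToyBridge()
bridge.local_request(Op.WRITE, 0x40)     # forwarded to the global domain
bridge.local_request(Op.READ_HIT, 0x40)  # stalled: same line is in flight
bridge.local_request(Op.READ_HIT, 0x80)  # unrelated line proceeds locally
bridge.remote_complete(0x40)             # stalled request is replayed
for entry in bridge.log:
    print(entry)
```

In this toy, only requests to the line with an in-flight forwarded operation are held back; requests to other lines continue, which keeps the stall as narrow as possible. A real bridge has to handle many more message types and transient states.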
vCXLGen
C³ proved the concept, but manually designing a bridge for each new host architecture or CXL revision does not scale. vCXLGen: Automated Synthesis and Verification of CXL Bridges for Heterogeneous Architectures, presented at ASPLOS ’26, eliminates this bottleneck entirely. Building on C³’s design rules, vCXLGen automatically generates correct-by-construction CXL bridges from machine-readable protocol specifications, with no manual intervention. Correctness is not assumed but proved through a compositional formal verification approach that checks liveness properties in isolation for each host cluster, avoiding state-space explosion and scaling to arbitrarily large heterogeneous deployments. This work earned the Best Paper Award at ASPLOS ’26 in Pittsburgh, USA.
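A back-of-the-envelope sketch shows why checking each cluster in isolation helps. It assumes, as a rough worst case, that a monolithic model must explore the product of the per-cluster state counts, while isolated checks only explore their sum. The state counts below are invented for illustration and are not taken from the paper; the function names are likewise hypothetical.

```python
from math import prod


def monolithic_bound(cluster_states):
    """Worst-case states when all clusters are verified as one composed system."""
    return prod(cluster_states.values())


def compositional_bound(cluster_states):
    """Total states when each cluster is verified on its own."""
    return sum(cluster_states.values())


# Hypothetical reachable-state counts per host cluster (illustrative only).
clusters = {"x86": 1_000, "arm": 1_200, "gpu": 800}

print("monolithic:   ", monolithic_bound(clusters))     # 960000000
print("compositional:", compositional_bound(clusters))  # 3000
```

Adding another cluster multiplies the first number but only adds to the second, which is the intuition behind scaling verification to larger deployments.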
Future Research
The principles behind C³ and vCXLGen point toward a broader vision: coherence controllers for any architecture, generated automatically and verified formally. Ongoing work in the group aims to generalize this methodology to NUMA systems, chiplet-based SoCs, and GPU environments, where accelerators relying on Release Consistency protocols represent the next frontier for automated coherence synthesis.
