First Joint Workshop on Memory Disaggregation

Together with TU Delft, the OSM Group held the first joint workshop on memory disaggregation.

Organizers

Felix Eberhardt (Hasso Plattner Institute)
Andreas Grapentin (Hasso Plattner Institute)
Ákos Hadnagy (TU Delft)

Participants

IBM Labs: Böblingen, Dublin, US, India, France
TU Delft, Big Data Acceleration Group
Erasmus Medical Center Rotterdam, Neuroscience Department
CERN Switzerland
Hasso Plattner Institute Potsdam, Operating Systems and Middleware Group

Agenda

15:00 – 15:15 Welcome and Introduction
15:15 – 16:45 Research Updates
    15:15 – 15:45 HPI (disaggregated In-Memory Databases)
    15:45 – 16:30 TU Delft (disaggregation projects)
    16:30 – 16:45 IBM (Cloud integration)
16:45 – 17:00 (Coffee) break
17:00 – 18:00 Discussion on future joint projects
18:00 – 18:30 Wrap up and next steps

Summary

The workshop identified a variety of challenges, research questions and workloads relevant to memory disaggregation that could be of interest for further work. In particular, current work on In-Memory Databases, Agent-based Simulation Systems, Big Data Analytical Workloads and Brain Simulation was presented, each of which could potentially benefit from the availability of disaggregated memory in datacenters.

Afterwards, the discussion focused on two distinct topics: first, establishing collaboration on a more formal level, for example through funding in the form of a European project; and second, providing in-depth technical feedback on the presented prototypes and ideas.

Funding Discussion

The groups want to jointly pursue project funding, for example in the form of a European project. Specific calls are expected to be released next year. Country-specific calls, such as the disruptive memory technologies program of the DFG (German Research Foundation) and its Dutch equivalent, could be alternatives.

Technical Discussion

We grouped the technical discussion into three main sections: performance, reliability and architecture. The participants concluded that they want a better understanding of the runtime infrastructure and the applications running on top of it. Two perspectives on memory disaggregation emerged: as an extension of memory, as in an SMP system, and as a way to couple systems through a very low latency (<<1 microsecond) network.

Performance

Main memory is the major cost factor of modern server systems. It was noted that the cost of DRAM modules increases superlinearly with their capacity, so spreading the same capacity over more DIMMs decreases the total system cost. The increased memory latency of such a scaled SMP system was seen as a critical aspect of the memory subsystem. Viewing the system as a low-latency RDMA system instead, and thereby relaxing consistency constraints, could mitigate this issue.
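
The cost argument can be illustrated with a small sketch. It assumes, purely for illustration, that module cost follows a power law in capacity; the function names, the exponent of 1.5 and the unit cost are hypothetical and were not discussed at the workshop.

```python
def module_cost(capacity_gib, unit_cost=1.0, exponent=1.5):
    """Hypothetical cost model: cost grows superlinearly (exponent > 1) with capacity."""
    return unit_cost * capacity_gib ** exponent


def system_cost(total_gib, num_modules, unit_cost=1.0, exponent=1.5):
    """Cost of reaching total_gib with num_modules equally sized modules."""
    per_module_gib = total_gib / num_modules
    return num_modules * module_cost(per_module_gib, unit_cost, exponent)


# Same 1 TiB capacity, built from few large or many small modules.
few_large = system_cost(1024, num_modules=4)
many_small = system_cost(1024, num_modules=16)
print(few_large, many_small)
```

Under this assumed model the 16-module configuration comes out cheaper than the 4-module one. Real DIMM prices would have to be substituted before drawing any conclusion.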

Disaggregating memory introduces a super-NUMA latency stratification. Applications need to control data placement accordingly to mitigate its effects. The latency sensitivity of data structures and access patterns needs to be evaluated. To what degree are workloads latency-constrained or bandwidth-constrained, and to what extent can they be classified and mapped to different latency layers?
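
As a rough illustration of placement across latency layers, the following sketch greedily keeps the most frequently accessed data per GiB in local memory and estimates the resulting mean access latency. The `Region` type, the two-tier split, the greedy policy and the latencies of 100 ns and 1000 ns are all assumptions made for this example, not measurements or a method proposed at the workshop.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    size_gib: float        # must be > 0
    accesses_per_s: float


def place_regions(regions, local_capacity_gib):
    """Greedy placement: regions with the highest access density go local first."""
    placement = {}
    remaining = local_capacity_gib
    by_density = sorted(
        regions, key=lambda r: r.accesses_per_s / r.size_gib, reverse=True
    )
    for region in by_density:
        if region.size_gib <= remaining:
            placement[region.name] = "local"
            remaining -= region.size_gib
        else:
            placement[region.name] = "disaggregated"
    return placement


def mean_latency_ns(regions, placement, local_ns=100.0, remote_ns=1000.0):
    """Access-weighted mean latency; both latencies are illustrative values."""
    total = sum(r.accesses_per_s for r in regions)
    weighted = sum(
        r.accesses_per_s * (local_ns if placement[r.name] == "local" else remote_ns)
        for r in regions
    )
    return weighted / total


regions = [
    Region("index", size_gib=8, accesses_per_s=9e6),
    Region("log", size_gib=4, accesses_per_s=0.5e6),
    Region("cold_data", size_gib=100, accesses_per_s=0.5e6),
]
placement = place_regions(regions, local_capacity_gib=16)
print(placement, mean_latency_ns(regions, placement))
```

The sketch only captures latency. A bandwidth-constrained workload would need an additional term for the throughput available in each layer.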

Reliability

Reliability is of interest because the reliability engineering effort for scale-up machines grows superlinearly. The question arose whether it is possible to build reliable disaggregated systems that tolerate the failure of individual units.
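
One way to make this question concrete is to compute the probability that a system stays operational when it tolerates a given number of unit failures. The sketch below assumes that units fail independently and with the same probability, which is a simplification; the function name and its parameters are illustrative.

```python
from math import comb


def availability(n_units, tolerated_failures, unit_failure_prob):
    """Probability that at most `tolerated_failures` of `n_units` have failed.

    Assumes independent failures with identical probability (binomial model).
    """
    if not 0 <= tolerated_failures <= n_units:
        raise ValueError("tolerated_failures must be between 0 and n_units")
    if not 0.0 <= unit_failure_prob <= 1.0:
        raise ValueError("unit_failure_prob must be between 0 and 1")
    p = unit_failure_prob
    return sum(
        comb(n_units, k) * p**k * (1 - p) ** (n_units - k)
        for k in range(tolerated_failures + 1)
    )


# Ten units, each failed with probability 1 % at a given point in time.
no_tolerance = availability(10, tolerated_failures=0, unit_failure_prob=0.01)
one_tolerated = availability(10, tolerated_failures=1, unit_failure_prob=0.01)
print(no_tolerance, one_tolerated)
```

In this model, a system that needs all units loses availability as the number of units grows, while tolerating even a single failure recovers most of it. Correlated failures, for example of a shared interconnect, are not covered.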

Architecture

The group envisioned numerous interesting new architectures, on both the hardware and the software level, that become possible with memory disaggregation.

Hardware

Using coherent interconnects as a new way of building distributed systems was discussed. More specifically, could memory disaggregation be used to build next-generation supercomputers that couple several coherency domains?

A main part of the discussion centered on coherency and consistency. It should be evaluated to what extent workloads can tolerate non-coherent, potentially stale data. It was concluded that this could be acceptable for the simulation workloads, while it would be more difficult to justify for database transactions. Do we need new coherency/consistency models, or should older approaches like distributed shared memory be re-evaluated on the new hardware? One suggestion was to leverage coherent interconnects and memory disaggregation to vastly increase the density and effectiveness of computing resources. Systems with Power processors can be seen as memory traffic switches.

Software

The question was raised how such a decoupled system looks from a programmer's point of view: which improvements or new programming models would make use of the disaggregation technology? What is the best abstraction and granularity for working with disaggregated memory and for resource management: objects, pages, or something else? And how can they be integrated into programming language abstractions?
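
The granularity question can be made tangible by comparing how much data has to be moved to fetch one object at different transfer granularities. This is a minimal sketch with hypothetical function names; it assumes the object starts at a granule boundary and ignores protocol overhead and caching.

```python
def bytes_transferred(object_size, granularity):
    """Bytes moved when an aligned object is fetched in whole granules."""
    if object_size <= 0 or granularity <= 0:
        raise ValueError("sizes must be positive")
    granules = -(-object_size // granularity)  # ceiling division
    return granules * granularity


def amplification(object_size, granularity):
    """Ratio of bytes moved to bytes actually requested."""
    return bytes_transferred(object_size, granularity) / object_size


# A 64-byte object fetched per 4 KiB page versus per 64-byte cache line.
print(amplification(64, 4096), amplification(64, 64))
```

For small objects, page granularity moves far more data than was requested, whereas for large sequential accesses the difference largely disappears. Which abstraction fits best therefore depends on the access pattern of the workload.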

Opportunities for Dissemination

October 28, 2021: OpenPower Summit
February 1, 2022: Paper deadline for COMPSYS
May 30, 2022: COMPSYS workshop at IPDPS

Next Steps

We will use ThymesisFlow (https://github.com/OpenCAPI/ThymesisFlow) as a research vehicle for disaggregation
Deploy and evaluate the workloads on the IC922 testbeds
HPI will provide access to the ThymesisFlow installation so that the other research groups can experiment with their workloads. To request access, please send a mail to felix.eberhardt ... hpi.de
Identify possible European project calls suitable for joint submission of proposals
HPI and IBM are organizing the First Workshop on Composable Systems (COMPSYS'22) on the topic of disaggregation, to be held in conjunction with the IPDPS conference in 2022. It would be a good venue for presenting first results. https://compsysworkshop.github.io/compsys22/

Final Notes

We would like to thank all participants of the workshop for their input in the discussion. So far, we have received positive feedback and are looking forward to taking the next steps.