Kempner AI cluster named one of world’s fastest ‘green’ supercomputers
Researchers at Harvard now have access to one of the fastest and greenest supercomputers in the world.
Built to support cutting-edge research at the Kempner Institute for the Study of Natural and Artificial Intelligence, and Harvard University more broadly, the Kempner’s AI cluster has just been named the 32nd fastest “green” supercomputer in the world in the Green500, the industry’s premier, independent ranking of the most energy-efficient supercomputers globally. In addition to cracking the top 50 list of green supercomputers, the cluster has been certified as the 85th fastest supercomputer overall in the TOP500, making it one of the fastest and greenest supercomputers on the planet.
“The Kempner AI cluster’s ranking in the latest Green500 and TOP500 lists positions us squarely among the fastest and most eco-friendly AI clusters in academia and the world,” said Max Shad, Kempner senior director of AI/ML research engineering. “It is no small feat to have built this kind of green high-performance computing power in such a short period of time, enabling cutting-edge research that is innovating in real time, and allowing for truly important advancements at the intersection of artificial intelligence and neuroscience.”
High-performance computing forms the backbone of the massive growth in the field of machine learning, and researchers at the Kempner Institute are leveraging this immense computational power to train and run artificial neural networks, leading to key advances in understanding the basis of intelligence in natural and artificial systems.
Measuring green compute power, from flops to gigaflops
The Kempner’s AI cluster opened with an initial pilot installation in spring 2023, and now represents the forefront of Harvard’s growing engagement with state-of-the-art computing resources. Composed of 528 specialized computer processors called graphics processing units (GPUs), which are networked together in parallel with “switches” to enable fast and simultaneous computation, the cluster can run rapid computations on hundreds of research projects at once.
To gauge the cluster’s overall and “green” computing power, engineers from Lenovo measured the speed of the cluster’s highest-performing GPUs (called H100s) using the LINPACK Benchmark, which requires solving vast linear algebra problems. The result is expressed in floating-point operations per second, or “flops.” The system’s efficiency, or “green” computing capacity, depends on how many flops the H100s can perform with a given amount of power, and is expressed as gigaflops per watt of power used.
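To make the idea of a flops measurement concrete, here is a toy sketch in Python of what a LINPACK-style test does: solve a dense system of linear equations, time it, and divide an operation count by the elapsed time. It is only an illustration, not the benchmark the Lenovo engineers ran. The function names are hypothetical, and the operation count uses just the leading-order term (2/3)n³ for an n-by-n system, which is an assumption made for simplicity.

```python
import random
import time


def solve_dense(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on copies so the caller's data is left untouched.
    a = [row[:] for row in a]
    b = b[:]
    for k in range(n):
        # Swap in the row with the largest pivot to keep the arithmetic stable.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
            b[i] -= factor * b[k]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def measure_flops(n, seed=0):
    """Time one n-by-n solve and return an estimated flops figure."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    start = time.perf_counter()
    solve_dense(a, b)
    elapsed = time.perf_counter() - start
    # Assumption: count only the leading-order (2/3) n^3 operations.
    return (2.0 / 3.0) * n ** 3 / elapsed


print(f"{measure_flops(60):,.0f} flops (toy estimate on this machine)")
```

A laptop running this pure-Python loop will report a figure many orders of magnitude below a supercomputer’s, which is the point of the comparison: the benchmark idea is the same, but the scale of the problem and the hardware are not.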
The Kempner’s H100s demonstrated the capability to perform 16.29 petaflops, at an efficiency of 48.065 gigaflops per watt of power used.
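Those two figures together imply how much electrical power the H100s drew during the test. The short Python sketch below does the arithmetic. It assumes both numbers come from the same benchmark run and that efficiency is simply performance divided by power; the function name is hypothetical.

```python
# Figures reported above for the cluster's H100 GPUs.
PERFORMANCE_PETAFLOPS = 16.29
EFFICIENCY_GIGAFLOPS_PER_WATT = 48.065


def implied_power_kilowatts(petaflops, gigaflops_per_watt):
    """Return the power draw implied by a speed and an efficiency figure."""
    gigaflops = petaflops * 1e6  # 1 petaflop = 1,000,000 gigaflops
    watts = gigaflops / gigaflops_per_watt
    return watts / 1e3


power_kw = implied_power_kilowatts(
    PERFORMANCE_PETAFLOPS, EFFICIENCY_GIGAFLOPS_PER_WATT
)
print(f"Implied power draw: about {power_kw:.0f} kW")
```

Under those assumptions the result is about 339 kilowatts.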
Just how fast is the Kempner AI cluster? To get a sense of perspective on the Kempner’s 16.29 petaflops of computing power, consider this: The computers aboard Apollo 11, which took Neil Armstrong and Buzz Aldrin to the moon in 1969, were capable of 12,250 flops. That sounds like a lot, but by the 1980s much faster computations were possible: The CRAY-2 supercomputer recorded a performance of 1.9 gigaflops. That’s 1.9 billion flops. And now we have vastly more computing power in our pockets. The iPhone 15 is capable of more than 1,700 gigaflops. And the Kempner’s AI cluster has more than 16 petaflops of computing power (that’s 16 followed by 15 zeros), which is roughly four orders of magnitude greater than the iPhone in your pocket. These numbers suggest that producing grammatically correct language and simulating cognition with a large language model (LLM) is more computationally intensive than navigating a rocket to the moon, at least for now.
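The comparisons above can be checked with a few lines of Python. This sketch takes the figures exactly as quoted in this article (the iPhone number is a lower bound, since the phone is described as capable of “more than” 1,700 gigaflops) and reports how many orders of magnitude separate the cluster from each machine. The dictionary and function names are illustrative.

```python
import math

# Flops figures as quoted in this article.
FLOPS = {
    "Apollo 11 computers (1969)": 12_250,
    "CRAY-2 (1980s)": 1.9e9,
    "iPhone 15": 1_700e9,
    "Kempner AI cluster": 16.29e15,
}


def orders_of_magnitude(faster, slower):
    """Return the base-10 logarithm of the speed ratio."""
    return math.log10(faster / slower)


cluster = FLOPS["Kempner AI cluster"]
for name, flops in FLOPS.items():
    if name != "Kempner AI cluster":
        gap = orders_of_magnitude(cluster, flops)
        print(f"{name}: {gap:.1f} orders of magnitude slower")
```

For the iPhone the gap comes out just under four orders of magnitude, and for the Apollo 11 computers it is about twelve.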
A supercomputer supporting new research at the Kempner, and across Harvard
With this magnitude of computing power, Kempner researchers are able to train state-of-the-art AI systems like large language models (LLMs), of which ChatGPT is perhaps the best known, quickly and efficiently. For example, the Kempner cluster can train the popular Meta Llama 3.1 8B and Meta Llama 3.1 70B language models in about one week and two months, respectively. Before the Kempner’s cluster was established and operational, training the Llama models on the next-fastest computer system at Harvard would have taken years to complete.
Beyond using the cluster to create faster models, researchers are also using it to better understand how and why those models work. “With this enhanced computational power, we can delve deeper into how generative models learn to reason and complete tasks with greater efficiency,” says Kempner Institute Research Fellow Binxu Wang.
In addition to providing researchers with the capacity to train complex models quickly and efficiently, and to understand the mechanisms behind how they learn, the Kempner cluster enables scientists to compare vast numbers of model architectures and learning algorithms in parallel, with important applications in fields ranging from medicine to neuroscience. One example: In research recently published in Nature Medicine, Kempner associate faculty member and Harvard Medical School Assistant Professor Marinka Zitnik and colleagues used the cluster to develop and train TxGNN, an AI system that distills vast amounts of medical data into knowledge graphs, and then uses the graphs to predict the effectiveness of a drug for treating rare diseases.
The Kempner GPUs form part of Harvard University’s growing computational ecosystem, joining new or soon-to-be-available GPUs supported by Harvard’s Faculty of Arts and Sciences Research Computing (FASRC). More than 5,200 researchers across the University make use of these computing resources in a wide array of scientific and technological applications.
The power of parallel processing
So what exactly is a cluster? As the name suggests, a computing cluster gathers together multiple devices, each of which can function as a full-fledged computer in its own right. Linking devices together unleashes the power of parallel computing, which leads to massive speed-ups in processing time by performing large numbers of tasks simultaneously.
Until a few decades ago, most computers were powered by a central processing unit (CPU) that could only perform one computational operation at a time. By the early 2000s, computer scientists had figured out how to create “multicore” CPUs that perform multiple computations in parallel.
The road to supercomputing clusters like the Kempner’s involved stacking several levels of parallel processing on top of each other. After the introduction of multicore CPUs, the next level of parallelism was enabled by the use of GPUs. Controlling the graphics on a computer screen requires large numbers of very similar computations that can be done simultaneously. For example, displaying a video game requires computing the brightness and color of millions of pixels up to 120 times per second. GPUs perform these numerous yet simple computations in parallel, freeing up the CPU to perform more complex computations.
Computer scientists realized that the capacity of GPUs to perform vast numbers of parallel computations could be repurposed for other tasks, such as machine learning. Running an artificial neural network such as OpenAI’s GPT or DALL-E, for example, involves vast numbers of mathematical operations that can be performed in parallel. But the parallelism doesn’t stop here: Yet another level of parallelism is enabled by linking multiple GPUs together in a network. The Kempner’s network involves hundreds of NVIDIA GPUs — 144 A100s and 384 H100s — that can work in concert. This multilevel parallelism empowers the Kempner’s researchers to perform the dizzyingly intensive computations involved in the study of natural and artificial intelligence and to develop new AI applications in areas such as medicine.
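Why do neural network computations parallelize so well? A minimal Python sketch, using only the standard library, shows the idea on a single network layer: each output value depends on one row of weights, so the rows can be handed to separate workers and the results stitched back together. The function names and the use of threads are illustrative assumptions, not a description of the Kempner’s software. In standard Python, threads will not actually speed up this arithmetic; the sketch demonstrates the decomposition that GPUs and clusters exploit, not the speed-up itself.

```python
from concurrent.futures import ThreadPoolExecutor


def layer_rows(weights, inputs):
    """Compute one output value per weight row (a dense layer without bias)."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def split(rows, parts):
    """Cut a list of rows into up to `parts` contiguous chunks."""
    size = max(1, -(-len(rows) // parts))  # ceiling division, at least 1
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def parallel_layer(weights, inputs, workers=4):
    """Give each chunk of rows to its own worker, then join the results."""
    chunks = split(weights, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves chunk order, so outputs line up with the rows.
        partial = pool.map(lambda chunk: layer_rows(chunk, inputs), chunks)
        return [value for part in partial for value in part]


weights = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
inputs = [1, 1]
print(parallel_layer(weights, inputs, workers=2))
```

Because no row needs the result of any other row, the split version returns exactly what the single-worker version would. The same independence is what lets one GPU spread the work across its many cores, and a cluster spread it across many GPUs.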
When it comes to fast and flexible experimentation, iteration, and computationally intensive research, the Kempner AI cluster is, in the words of Boaz Barak, “absolutely instrumental.” Barak, a Kempner associate faculty member and professor at the Harvard John A. Paulson School of Engineering and Applied Sciences, says his lab group “relies on extensive computational experiments using the cluster” to study the mechanisms, capabilities, and limitations of deep learning systems, allowing them to “hone intuitions and study questions as they arise.”
A powerful supercomputer, built to be green
Intentionally built for optimal energy consumption, the Kempner’s AI cluster is also setting a standard for “green” supercomputing. Modern machine learning has resulted in unprecedented advances in AI, but the methods are increasingly energy-intensive. Lowering the carbon footprint of AI is therefore crucial so that advances in AI do not come at the cost of exacerbating global warming.
Housed at the Massachusetts Green High Performance Computer Center (MGHPCC) along with other FASRC resources, and located in the town of Holyoke, Massachusetts, the Kempner’s AI cluster uses a variety of state-of-the-art techniques to minimize energy usage and make every megawatt of power count. The center is powered by the Holyoke municipal electric company, which delivers 100 percent carbon-free energy through a hydroelectric power station and several solar arrays that it operates.
As the central computing hub employed by most of the state’s research universities, including Harvard, MIT, UMass, Northeastern and Boston University, the MGHPCC was the first university research data center to achieve LEED Platinum Certification, the highest level awarded by the U.S. Green Building Council’s Leadership in Energy and Environmental Design Program. Moving forward, the Kempner’s partnership with MGHPCC will allow the AI cluster to keep growing with efficiency in mind, staying green even as it becomes a faster and more powerful tool for advancement in the field.
“Building an AI cluster that is not just blazing fast but also energy-efficient fits squarely into the Kempner’s mission, both to advance the field of intelligence, and to do so in a way that benefits people,” said Kempner Executive Director Elise Porter. “We have worked closely with MGHPCC to ensure this cluster is built with energy efficiency top of mind, and ranking as the 32nd fastest green supercomputer in the world is a testament to that work.”
Fast, green — and human
While landing a top spot on both the TOP500 and Green500 lists is no small accomplishment, the real power of the Kempner’s work lies in knowing how to leverage its impressive computing resources to facilitate groundbreaking research. This involves more than building the AI cluster and giving researchers access to it. After all, researchers can’t just copy and paste old code into new machines: certain types of algorithms that work on traditional computers have to be reconceptualized and reformatted to be used with the Kempner’s computing infrastructure.
To this end, the Kempner has assembled a “full-stack” team of professional research engineers and research scientists with expertise ranging from distributed computing to data architecture to computational neuroscience. This Research & Engineering team develops codebases and standards, working with researchers to enable a seamless pipeline connecting scientific problems to computational solutions. The team also ensures that scientific findings are reproducible by helping students, fellows and faculty adopt industry-tested best practices for coding, testing, and maintenance of open repositories for models and data.
This human know-how is central to the ability of the Kempner community, and of researchers all across Harvard University, to harness the scientific and technological potential of the green supercomputing power now available at their collective fingertips.
To find out more about the latest Kempner Institute research, check out the Deeper Learning blog.