Aurora Supercomputer Ranks Fastest for AI

At ISC 2024, Intel announces that Aurora is the fastest AI supercomputer and has broken the exascale barrier, and details the importance of an open ecosystem in HPC and AI.

At ISC High Performance 2024, Intel announced in collaboration with Argonne National Laboratory and Hewlett Packard Enterprise (HPE) that the Aurora supercomputer has broken the exascale barrier at 1.012 exaflops and is the fastest AI system in the world dedicated to AI for open science, achieving 10.6 AI exaflops. Intel will also detail the crucial role of open ecosystems in driving AI-accelerated high performance computing (HPC).

“The Aurora supercomputer surpassing exascale will allow it to pave the road to tomorrow’s discoveries,” said Ogi Brkic, Intel vice president and general manager of Data Center AI Solutions. “From understanding climate patterns to unraveling the mysteries of the universe, supercomputers serve as a compass guiding us toward solving truly difficult scientific challenges that may improve humanity.”

On May 13, 2024, at International Supercomputing Conference 2024, Intel, Argonne National Laboratory and Hewlett Packard Enterprise announced that the Aurora supercomputer has broken the exascale barrier and leads as the highest ranked supercomputer for high performance computing and artificial intelligence convergence. (Credit: Argonne National Laboratory)

Designed as an AI-centric system from its inception, Aurora will allow researchers to harness generative AI models to accelerate scientific discovery. Significant progress has been made in Argonne’s early AI-driven research. Success stories include mapping the human brain’s 80 billion neurons, high-energy particle physics enhanced by deep learning, and drug design and discovery accelerated by machine learning, among others.

The Aurora supercomputer is an expansive system with 166 racks, 10,624 compute blades, 21,248 Intel® Xeon® CPU Max Series processors and 63,744 Intel® Data Center GPU Max Series units, making it one of the world’s largest GPU clusters. Aurora also includes the largest open, Ethernet-based supercomputing interconnect on a single system, with 84,992 HPE Slingshot fabric endpoints. The Aurora supercomputer came in second on the high-performance LINPACK (HPL) benchmark but broke the exascale barrier at 1.012 exaflops utilizing 9,234 nodes, only 87% of the system. It also secured the third spot on the high-performance conjugate gradient (HPCG) benchmark at 5,612 teraflops per second (TF/s) with 39% of the machine. HPCG aims to assess more realistic scenarios, providing insights into communication and memory access patterns, which are important factors in real-world HPC applications. It complements benchmarks like LINPACK by offering a comprehensive view of a system’s capabilities.
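The system figures above are internally consistent, and a few lines of arithmetic make the per-blade layout explicit. The sketch below uses only the numbers quoted in this article. It assumes that one compute blade corresponds to one node, which the article implies but does not state, and the function and variable names are illustrative.

```python
# Figures quoted in the article.
RACKS = 166
BLADES = 10_624
CPUS = 21_248
GPUS = 63_744
FABRIC_ENDPOINTS = 84_992
HPL_NODES_USED = 9_234
HPL_EXAFLOPS = 1.012


def summarize() -> dict:
    """Derive per-rack, per-blade, and per-node ratios from the quoted totals.

    Assumption: one compute blade is one node, so the HPL node count
    can be compared directly against the blade count.
    """
    return {
        "blades_per_rack": BLADES / RACKS,
        "cpus_per_blade": CPUS / BLADES,
        "gpus_per_blade": GPUS / BLADES,
        "endpoints_per_blade": FABRIC_ENDPOINTS / BLADES,
        "hpl_fraction_of_system": HPL_NODES_USED / BLADES,
        # 1 exaflop = 1,000 petaflops
        "hpl_petaflops_per_node": HPL_EXAFLOPS * 1000 / HPL_NODES_USED,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.4f}")
```

Under that assumption the totals work out to 64 blades per rack, with 2 CPUs, 6 GPUs and 8 fabric endpoints per blade, and the 9,234 nodes used for the HPL run come to roughly 87% of the system, matching the figure quoted above.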

At the heart of the Aurora supercomputer is the Intel Data Center GPU Max Series. The Intel Xe GPU architecture is foundational to the Max Series, featuring specialized hardware like matrix and vector compute blocks optimized for both AI and HPC tasks. The compute performance delivered by the Xe architecture’s design is the reason the Aurora supercomputer secured the top spot in the high-performance LINPACK-mixed precision (HPL-MxP) benchmark, which best highlights the importance of AI workloads in HPC.

The Xe architecture’s parallel processing capabilities excel in managing the intricate matrix-vector operations inherent in neural network AI computation. These compute cores are pivotal in accelerating matrix operations crucial for deep learning models. Complemented by Intel’s suite of software tools, including Intel® oneAPI DPC++/C++ Compiler, a rich set of performance libraries, and optimized AI frameworks and tools, the Xe architecture fosters an open ecosystem for developers that is characterized by flexibility and scalability across various devices and form factors.

Meanwhile, Intel® Tiber™ Developer Cloud is expanding its compute capacity with new state-of-the-art hardware platforms and new service capabilities allowing enterprises and developers to evaluate the latest Intel architecture, to innovate and optimize AI models and workloads quickly, and then to deploy AI models at scale. New hardware includes previews of Intel® Xeon® 6 E-core and P-core systems for select customers, and large-scale Intel® Gaudi® 2-based and Intel® Data Center GPU Max Series-based clusters. New capabilities include Intel® Kubernetes Service for cloud-native AI training and inference workloads and multiuser accounts.

New supercomputers being deployed with Intel Xeon CPU Max Series and Intel Data Center GPU Max Series technologies underscore Intel’s goal to advance HPC and AI. Systems include Euro-Mediterranean Centre on Climate Change’s (CMCC) Cassandra to accelerate climate change modeling; Italian National Agency for New Technologies, Energy and Sustainable Economic Development’s (ENEA) CRESCO 8 to enable breakthroughs in fusion energy; Texas Advanced Computing Center (TACC), which is in full production to enable data analysis in biology to supersonic turbulence flows and atomistic simulations on a wide range of materials; as well as United Kingdom Atomic Energy Authority (UKAEA) to solve memory-bound problems that underpin the design of future fusion powerplants.

The result from the mixed-precision AI benchmark will be foundational for Intel’s next-generation GPU for AI and HPC, code-named Falcon Shores. Falcon Shores will leverage the next-generation Intel Xe architecture with the best of Intel® Gaudi®. This integration enables a unified programming interface.

Early performance results on Intel® Xeon® 6 with P-cores and Multiplexer Combined Ranks (MCR) memory at 8800 megatransfers per second (MT/s) deliver up to 2.3x performance improvement for real-world HPC applications, like Nucleus for European Modeling of the Ocean (NEMO), when compared to the previous generation [1], setting a strong foundation as the preferred host CPU choice for HPC solutions.

[1] See ISC 2024 section of intel.com/performanceindex for workloads and configurations. Your results may vary. Intel technologies may require enabled hardware, software or service activation. Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates. No product or component can be absolutely secure. Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy.
