
What Six Months Can Mean in Supercomputing

Rapid-fire changes are to be expected in the tech industry, along with rising and falling fortunes, unusual market developments and sudden turns in conventional wisdom. That said, it’s difficult to think of an area where radical shifts occur more frequently than top-end supercomputing and high-performance computing (HPC) installations.

Such shifts are clear in the recently released Top500.org list of the world’s fastest supercomputers, which also marked the Top500 group’s 25th anniversary. Although the last Top500 report (November 2017) appeared barely half a year ago, some remarkable changes have occurred since then. Those include an all-new #1 supercomputer, the emergence of a new leader in the number of Top500-listed systems and the remarkable stumble of a vendor that has led the list for half a decade.

IBM regains leadership with Summit and Coral

The new list’s biggest bragging rights belong to IBM, whose Summit installation at the Department of Energy’s (DOE’s) Oak Ridge National Laboratory ascended to the peak of the Top500 list. As I noted in a recent Pund-IT Review, Summit contains 4,356 nodes, each equipped with two 22-core IBM Power9 CPUs and six NVIDIA Tesla V100 GPUs, linked with a Mellanox dual-rail EDR InfiniBand network.

Summit’s 122.3 petaflops of performance on High Performance Linpack (HPL), the benchmark used to rank the Top500 list, is about a third faster than the 93.015 petaflops achieved by the Sunway TaihuLight system (at China’s National Supercomputing Center in Wuxi). That installation had stood atop the Top500 since it arrived on the June 2016 list.
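As a quick check on that "about a third" figure, the comparison can be worked through in a few lines of Python. This is only an illustrative sketch: the two HPL numbers come from the paragraph above, while the function name `relative_speedup` is a hypothetical helper introduced here.

```python
# HPL results quoted in the article, in petaflops.
SUMMIT_PFLOPS = 122.3
TAIHULIGHT_PFLOPS = 93.015


def relative_speedup(new: float, old: float) -> float:
    """Return how much faster `new` is than `old`, as a fraction of `old`."""
    return (new - old) / old


speedup = relative_speedup(SUMMIT_PFLOPS, TAIHULIGHT_PFLOPS)
print(f"Summit is {speedup:.1%} faster than TaihuLight on HPL")
```

Run as written, this puts Summit roughly 31 percent ahead of TaihuLight, a little under a third.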

More importantly, IBM has noted that Summit’s hybrid Power9/Tesla GPU architecture enables the system to augment conventional supercomputing workloads with machine learning applications, substantially broadening the range of projects and data it can support. That same hybrid architecture (though with different IBM Power System servers) was used in the new #3 system on this Top500 list: the Sierra installation at the DOE’s Lawrence Livermore National Laboratory.

Taken together, Summit and Sierra also enabled IBM to leap to the front of the pack in terms of overall Top500 performance, moving from 19 systems delivering a total of 51.275 petaflops in November 2017 to 18 systems delivering 239.067 petaflops today. The #2 vendor in overall performance is Cray, whose 53 Top500-listed systems deliver a collective 187.798 petaflops.
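The vendor totals above lend themselves to a similar back-of-the-envelope comparison. The sketch below uses only the system counts and aggregate petaflops quoted in this section; the `VendorTotal` class and its field names are assumptions made for illustration, not part of any Top500 data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VendorTotal:
    """Aggregate Top500 figures for one vendor on one list (hypothetical type)."""
    label: str
    systems: int
    total_pflops: float

    @property
    def mean_pflops(self) -> float:
        # Average HPL performance per listed system.
        return self.total_pflops / self.systems


# Figures as quoted in the article.
ibm_nov_2017 = VendorTotal("IBM, November 2017", 19, 51.275)
ibm_jun_2018 = VendorTotal("IBM, June 2018", 18, 239.067)
cray_jun_2018 = VendorTotal("Cray, June 2018", 53, 187.798)

# How much IBM's aggregate performance grew between the two lists.
growth = ibm_jun_2018.total_pflops / ibm_nov_2017.total_pflops
# IBM's lead over Cray in aggregate petaflops on the June 2018 list.
lead = ibm_jun_2018.total_pflops - cray_jun_2018.total_pflops

print(f"IBM aggregate growth: {growth:.2f}x")
print(f"IBM lead over Cray: {lead:.3f} petaflops")
for entry in (ibm_jun_2018, cray_jun_2018):
    print(f"{entry.label}: {entry.mean_pflops:.2f} petaflops per system")
```

The per-system averages are a rough gauge only, since a handful of very large machines such as Summit and Sierra account for most of IBM’s total.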

Interestingly, another IBM system is also the oldest installation in the Top 10 group: the Sequoia installation at DOE’s Lawrence Livermore Lab, which leverages the company’s BlueGene/Q architecture. It originally topped the Top500 list in June 2012, and its 17.17 petaflops of performance make it #8 on the current list. That’s just behind the second-oldest system in the Top 10: the Cray-based Titan at DOE’s Oak Ridge, which pushed Sequoia out of the top spot in November 2012.

Lenovo rises, HPE falls

Vendors use placement on the Top500 list for bragging rights, so it’s not unusual to see some tussling for various leadership positions. Those range from the total number of listed systems to total petaflop performance to the most energy-efficient systems (in the Green500 list) to placement on the relatively new High-Performance Conjugate Gradients (HPCG) benchmark list.

The emergence of new leading-edge systems, like Summit, is to be expected given the amount of national pride (and funding) that goes into government-sponsored supercomputers. Plus, HPC exhibits the same sort of competitive give-and-take that’s clear in other commercial IT markets. Less common are wholesale shifts in leadership positions, including the total number of Top500-ranked systems.

However, that’s exactly what’s found on the new list, with Lenovo’s 117 installations handily bypassing HPE’s 79 systems. How unusual is this? Consider that prior to the June 2018 list, HPE (then HP) had held the topmost position in system share since June 2013, when the company bumped IBM out of the top spot. IBM, in turn, had booted HPE out of the leadership role in the November 2010 list. In other words, there isn’t a lot of turnover in that #1 position.

The other notable point is the dramatic drop in the number of HPE systems, from 122 in the last list to today’s 79, that the company suffered in the past six months. It’s no secret that HPE is shifting its go-to-market strategy toward emerging areas, like edge computing and IoT, but it’s hard to say whether that resulted in the company taking its eye off the ball in hyperscale and HPC. Then again, HPE’s customers may simply be failing to keep up with the ever-increasing pace of HPC performance.

For example, while the EPFL Blue Brain IV (IBM BlueGene/Q) system at the Swiss National Supercomputing Centre was ranked #372 last November, its 715.6 TFlop/s of performance earned it only the #500 spot on the new list. Add in that HPE has tended to be stronger in commercial HPC solutions that populate the lower end of the Top500 list, and it’s no surprise to see so many of the company’s systems washed out.

However, those points don’t detract in any way from Lenovo’s accomplishment. Along with leading in total systems on the new list, the company’s solutions captured two positions in the top 25, five in the top 100 and thirty-nine in the top 200. Lenovo is also responsible for some notable global HPC installations, including Mare Nostrum at the Barcelona Supercomputing Center, the Niagara system (Canada’s fastest supercomputer) and the Marconi system at Cineca in Italy, which is among the world’s most energy-efficient supercomputers.

Increasing energy efficiency is also the focus of Lenovo’s new Neptune initiative, which the company announced at the International Supercomputing Conference (ISC, where the new Top500.org list debuted). Neptune encompasses Lenovo’s suite of liquid cooling technologies, including Direct to Node (DTN) warm water cooling, rear door heat exchanger (RDHX) and hybrid Thermal Transfer Module (TTM) solutions, which are designed to support high-performance HPC, AI and enterprise workloads.

Final analysis

So, what’s the takeaway from the latest Top500.org list? First and foremost, innovation continues to be alive and well in HPC. That’s hardly headline news given the budgets and other resources these projects require. But the fact is that the Top500 list also serves as a marker for compute technologies and capabilities that eventually work their way into commercial markets. Consider, for example, that the IBM Power Systems AC922 servers and other components driving Oak Ridge’s massive Summit system are available for purchase today.

In addition, just like commercial markets, the new list shows that competition in HPC continues to be fierce. Both IBM and Lenovo deserve congratulations for their new leadership positions and the innovations they delivered along the way. But at the same time, studying past Top500 results underscores just how tenuous technical dominance can be.

Finally, it’s important to remember that despite Top500’s focus on pure performance, the systems on this newest list, as well as those on previous lists, were built to take on and complete some of the world’s most demanding and complex computing tasks. The capabilities and insights they deliver can and do make real differences in the lives of people, communities, countries and the larger world.

IBM’s and Lenovo’s achievements aren’t the only significant stories coming out of the new Top500 list. There are also the issues driving the need for and development of the new HPCG standard. Then there’s China’s growing leadership in the number of Top500-listed systems. Finally, consider what the rapid rise and stunning growth of youthful vendors, like Inspur, mean for mainstream HPC vendors and their allies.

For all the surprises that these latest Top500 standings delivered, the November list is just around the corner.
