Pawsey gets ready for user workloads on Setonix

Scientists prepare for new architecture.

The Pawsey Centre’s Setonix supercomputer is preparing to onboard its first user communities, ahead of full operations by February 2023.

In an interview with iTnews, Pawsey’s executive director Mark Stickells said that since Phase 1 of the project went live in June, the centre has finished removing the previous flagship, Magnus, and has been “filling out that space with Setonix”.

A little under-floor work remains, Stickells said, along with a "couple of months" of tuning and optimisation.

Onboarding will begin with Pawsey’s core community, the ASKAP radio telescope, followed in short order by projects in fields such as bioinformatics and climate science.

The Setonix project looks like a technology refresh, but Stickells said it’s as much a refresh of people and process, because researchers will need support to take full – and efficient – advantage of Setonix’s architecture.

“We've been busily preparing the Australian research community for Setonix and this new heterogeneous architecture, this blend of CPUs and GPUs, the GPU-heavy environment," he said.

CTO Ugo Varetto said that beyond the objective measure of efficiency that put Setonix fourth on the international Green500 list – 56 Gflops per watt – efficiency also demands software that aligns with the architecture.

He gave the application of AI to climate modelling as an example.

Setonix’s architecture is much better suited to AI than its predecessor, and Varetto said that means while one thread of a climate calculation is solving huge and complex differential equations, another can be running AI models to produce approximate solutions to the same problem.

“You can advance the simulation faster," Varetto said. 

“Of course, you have to compare the results at the end with the correct results”, but that will serve to improve the AI’s performance.
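What Varetto describes is often called a surrogate-model workflow: a cheap machine-learning model proposes the future state of a simulation, and the prediction is later checked against a conventional numerical solve. The toy sketch below is not Pawsey or Setonix code (the 1D heat equation, the least-squares "surrogate" and every parameter in it are illustrative assumptions), but it shows the shape of the pattern.

```python
# Toy sketch of the "AI surrogate plus verification" pattern Varetto describes.
# Not Setonix code: the 1D heat equation, the least-squares surrogate and all
# parameters below are illustrative assumptions.
import numpy as np

N, DT, DX, ALPHA = 64, 1e-4, 1.0 / 64, 1.0
JUMP = 200                      # how many solver steps the surrogate predicts in one shot

def solve(u, steps):
    """Reference solver: explicit finite differences for du/dt = alpha * d2u/dx2 (periodic)."""
    for _ in range(steps):
        u = u + ALPHA * DT / DX**2 * (np.roll(u, 1) - 2 * u + np.roll(u, -1))
    return u

# 1. Build training data from a modest number of expensive reference solves.
rng = np.random.default_rng(0)
X = rng.standard_normal((128, N))                 # random initial states
Y = np.array([solve(x, JUMP) for x in X])         # where the real solver takes them

# 2. Fit a cheap linear surrogate: u_future ~= u_now @ W.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

# 3. Use the surrogate to "advance the simulation faster" on a new initial state...
u0 = np.sin(2 * np.pi * np.arange(N) / N)
u_surrogate = u0 @ W

# 4. ...then "compare the results at the end with the correct results".
u_exact = solve(u0, JUMP)
print("max surrogate error:", np.max(np.abs(u_surrogate - u_exact)))
```

In a production climate code the surrogate would be a trained neural network and the reference solve a full earth-system model, but the structure is the same: predict cheaply, verify against the expensive solve, and use the discrepancy to keep improving the model, as Varetto notes.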

Making software run efficiently on Setonix means working with software developers – whether that developer is a scientist writing their own code or a third-party application developer.

“I would say that, say, 60 percent of the researchers do not develop their own code, they do use canned applications. These are the pure domain specialists,” Varetto said.

“So for the people who are developing code, we're really putting a lot of effort into helping them understand how to exploit the power of the new architectures.

“We recruited people, we also joined international forces in this area, we organise hackathons together with other labs. 

“But for the others who use canned applications, we just make sure that those applications can scale on our system. We work directly with the application developers, and then test constantly their applications on our systems and configure them in such a way that the scaling becomes transparent to the researchers”.
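The "scaling" Varetto refers to is usually tracked as strong-scaling speedup and parallel efficiency: run the same problem on more and more nodes, and watch how far the runtime falls short of dividing perfectly. A minimal sketch of that bookkeeping, using made-up timings rather than real Setonix benchmark data:

```python
# Minimal sketch of how scaling of a "canned" application is typically tracked.
# The timings below are hypothetical placeholders, not Setonix benchmark results.
baseline_nodes = 1
timings = {1: 1000.0, 2: 520.0, 4: 270.0, 8: 150.0}   # nodes -> wall time in seconds

t1 = timings[baseline_nodes]
for nodes, t in sorted(timings.items()):
    speedup = t1 / t                                   # how much faster than the baseline run
    efficiency = speedup / (nodes / baseline_nodes)    # 1.0 would be perfect strong scaling
    print(f"{nodes:>2} nodes: speedup {speedup:5.2f}, efficiency {efficiency:6.1%}")
```

Numbers like these let a centre flag configurations whose efficiency drops below an acceptable threshold before researchers ever submit their own jobs, which is what makes the scaling "transparent" to them.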

Stickells added: “We’ve invested in this over the last two years, and co-funded positions inside several projects around the country.

“They’re bringing on post docs, supported by Pawsey staff, to help optimise their code, or port their code to the new architecture, or improve code that has been developed in some cases over 10-to-20 years.”

That activity, Stickells said, is “a significant part of the optimisation and utilisation strategy for these sorts of massive architectures.”

Future architectures

The Pawsey team is, naturally enough, already giving thought to what will follow Setonix.

That won’t be for some time, Stickells said, because Setonix is designed to be expandable.

Expanding its capacity is “quite straightforward”, he said, because the management layer can support it.

Varetto said: “Everything on the system is software-defined. You can create your own mini clusters inside the system, you can isolate partitions, you can create your own file system, whatever you want … it's all software-defined and configured. 

“That's what allows us to expand the systems as needed … if we wanted to add a few nodes with ARM CPUs inside, for example, using the new Grace Hopper architecture from Nvidia, we can just go ahead and do it.”

“We’re also pulled along by the aspirations and challenges of the Square Kilometre Array,” Stickells said.

“Some of its data movement and data processing challenges are at a scale that we’re [yet] to tackle in the coming years,” he added. “That will test us.”

Further out, Varetto said, the architectures of the next generation of supercomputers will aim to reduce the “distance” – the number of hops – between different parts of the system.

“Today, for example, the GPU and CPU are separate and atomic components accessing their own memory," he said.

“Next year there will be a new type of architecture”, he said, that combines CPU, GPU and memory in a single package.

“So you won't need anymore to use tricks to trigger memory transfers asynchronously between components. You'll have one single memory layer and two types (CPU and GPU) accessing the same memory.”

The other architectural development is optimising memory transfer, Varetto said. 

“Today the main issue we have is to move data fast enough,” he added, a problem HPC has in common with commercial data centres.

“To give you an idea, in visualisation you spend 85 percent of the time reading the data, and only 15 percent visualising it.”

Centres like Pawsey are working with storage vendors to understand how to overcome this problem and “move the data closer to the CPU, the GPU and the compute units”.
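On the software side, one common mitigation while the hardware catches up is to overlap reading with processing, so the compute units are never idle waiting for the next chunk of data. The sketch below shows that double-buffering pattern; read_chunk and process_chunk are hypothetical stand-ins, not Pawsey tooling.

```python
# Sketch of overlapping I/O with compute (double buffering), so that an
# "85 percent reading / 15 percent visualising" split hurts less.
from concurrent.futures import ThreadPoolExecutor

CHUNK_BYTES = 64 * 1024 * 1024

def read_chunk(path, index):
    """Read one fixed-size chunk from the file (placeholder I/O)."""
    with open(path, "rb") as f:
        f.seek(index * CHUNK_BYTES)
        return f.read(CHUNK_BYTES)

def process_chunk(data):
    """Placeholder for the actual visualisation / analysis step."""
    return sum(data[::4096])

def pipeline(path, n_chunks):
    results = []
    with ThreadPoolExecutor(max_workers=1) as io:
        pending = io.submit(read_chunk, path, 0)              # prefetch the first chunk
        for i in range(n_chunks):
            data = pending.result()                           # wait only for the chunk needed now
            if i + 1 < n_chunks:
                pending = io.submit(read_chunk, path, i + 1)  # start the next read immediately
            results.append(process_chunk(data))               # compute overlaps with that read
    return results
```

With a single prefetch in flight, the read of chunk i+1 runs concurrently with the processing of chunk i, so total runtime tends towards whichever phase is slower rather than their sum.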
