A New Era of I/O Technologies Dawns to Transform HPC

Kirk Kern, CTO, NetApp U.S. Public Sector

High Performance Computing (HPC) is used across the commercial sector in industries including oil & gas, life and earth sciences, traditional and electronics manufacturing, and even the financial community. Data sets continually grow in size as sensors or models improve their resolution in an effort to increase the insight or fidelity of the resultant calculations. Large data sets containing seismology returns, protein or genomic information, and temperature and atmospheric scan data are all examples where information growth impacts the HPC architecture.

As a result, more data has to be moved to computational elements in shorter time frames. The goal of every HPC industrial system is to improve computational workflows to reduce time to insight or final product design, so expediting the flow of data to or from the computational elements presents both a technical and financial challenge.

Currently, HPC applications process against petascale volumes of data and sustain data movement rates of tens to thousands of gigabytes per second. Typically, HPC architectures use an external I/O subsystem of tightly connected storage arrays, with magnetic disk or flash in the form of SSDs providing the storage capacity. The storage arrays transfer data to the processors via a high-bandwidth fabric, and file-level access is coordinated through a distributed shared file system or via application I/O libraries.

In many cases, the I/O subsystem becomes a large component of the total cost of the HPC system, and it is often a source of bottlenecks that impact the productivity of the organization. This is an area where imprudent decisions to reduce storage cost can result in the loss of data, and potentially of valuable information, with broader impacts to the business.

Fortunately, the anticipated emergence of Storage Class Memory (SCM) appears to be a reality, as recent availability announcements have been published. These devices promise a two to three order of magnitude improvement in latency and bandwidth, and over time they will represent a high-performance, cost-effective storage medium. While the specific device technology remains undisclosed, Phase Change Memory (PCM), Magnetic RAM, and Resistive RAM are the potential candidates to move into commercial production. For HPC system and storage designers, this new class of storage technology can be used to dramatically improve the I/O characteristics of HPC systems. SCM will have broad applicability in SSDs, storage controllers, PCIe or NVMe boards, and DIMMs, but its most exciting use case for HPC will be as a nonvolatile memory element that is tightly coupled to an application or processing domain.

Diving deeper, for HPC new programmatic techniques will allow applications to address these devices directly to maximize throughput and reduce latency. Coordination of the I/O functions will be handled within the application, with support from native microprocessor instructions. Application- or node-level data protection services are still required, so a new two-layer (IOPS and capacity) I/O architecture is emerging for leveraging SCM in HPC. Each processing node will have tens of terabytes of native nonvolatile storage. These SCMs will be used to construct a shared server-side storage (IOPS) layer that spans processing nodes. External, lower-cost network-attached storage will form the capacity layer to address large block or object I/O requests.
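The direct-addressing model described above can be sketched with a memory-mapped file: once the device is mapped into the address space, I/O becomes ordinary loads and stores coordinated by the application rather than by a file-system call path. This is a minimal sketch only; the path is a hypothetical stand-in for a DAX-mounted SCM device, and on real SCM the flush() call would be replaced by native cache-line writeback and fence instructions issued by the application.

```python
import mmap
import os

# Hypothetical stand-in for a DAX-mounted SCM device file.
path = "/tmp/scm_demo.bin"
fd = os.open(path, os.O_CREAT | os.O_RDWR)
os.ftruncate(fd, 4096)

# Map the device into the address space: I/O becomes memory access.
buf = mmap.mmap(fd, 4096)

# A store into mapped memory, not a write() system call.
payload = b"stored, not written"
buf[0 : len(payload)] = payload

# Persistence point. On real SCM this is a cache-line writeback
# (e.g. CLWB) followed by a store fence, not a kernel flush.
buf.flush()

data = bytes(buf[0 : len(payload)])
print(data.decode())

buf.close()
os.close(fd)
os.unlink(path)
```

The key design point is that the operating system is removed from the data path: the kernel is involved only at mapping time, and the application itself decides when data has reached the persistence domain.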

The maturation of commodity-based RESTful object storage solutions will increasingly find applicability in HPC systems in satisfying these capacity-layer functions. Since objects are a network-centric storage technology, the repository can reside on premise or be cloud based. This leads to a secondary effect for CIOs, because they can now weigh the financial advantages against the business impact of keeping the traditionally large-footprint storage systems in a data center.
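The object access pattern behind such capacity layers is simply HTTP PUT and GET against a flat key namespace. A minimal sketch, assuming an S3-style interface; the in-process stub server and the bucket and object names are hypothetical stand-ins for a real on-premise or cloud repository:

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

objects = {}  # flat key namespace: path -> object bytes

class ObjectHandler(BaseHTTPRequestHandler):
    """Stub object store: PUT stores a blob, GET retrieves it."""

    def do_PUT(self):
        size = int(self.headers.get("Content-Length", 0))
        objects[self.path] = self.rfile.read(size)
        self.send_response(201)
        self.end_headers()

    def do_GET(self):
        body = objects.get(self.path)
        if body is None:
            self.send_response(404)
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

server = HTTPServer(("127.0.0.1", 0), ObjectHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)

# PUT a large-block result object, then GET it back by key.
conn.request("PUT", "/seismic/run-042.h5", b"large block payload")
conn.getresponse().read()
conn.request("GET", "/seismic/run-042.h5")
body = conn.getresponse().read()
print(body.decode())

server.shutdown()
```

Because the interface is plain HTTP over a network, the same application code works whether the repository sits in the data center or in a cloud region, which is precisely what gives CIOs the placement flexibility noted above.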

Ultimately, SCMs will create opportunities to pool server storage resources, either as local dedicated storage or as a clustered storage solution to accelerate computations in HPC systems. CIOs and data center operators will enjoy the benefits of reduced investment in custom high-performance switched storage architectures. However, SCM will not completely eliminate the need for external storage. It will result in the logical bifurcation of the storage subsystem into host-side storage for IOPS satisfaction and an external, lower-cost, lower-performance capacity-class storage environment for HPC.

Dense content repositories leveraging object technologies are well suited to take on this task, while hopefully simplifying or eliminating the operational complexities of maintaining high-performance shared file systems. SCMs have the potential to dramatically affect the computational and storage landscape in HPC, and application and storage developers have many interesting opportunities to take advantage of this unique new technology.
