Discuss the advantages and disadvantages of implementing cloud platforms in the High Performance Computing (HPC) industry.
The Answer
The advantages of implementing cloud platforms in the HPC industry are:-
Capacity and Capability to run HPC workloads:-
On-premises capacity for larger workloads is often unavailable. Jobs that need more cores than the on-premises cluster provides, or that queue behind an always-busy cluster, struggle to find a slot. Adding HPC resources in the cloud on demand removes this restriction and makes it possible to serve job requirements that arise only occasionally but are still important.
Adapt the hardware resources to the individual HPC job:-
Running an on-premises HPC cluster often forces a choice between an average hardware configuration that suits a mix of different applications and a specific configuration designed to run one particular application or workload very well. The compromises made to accommodate all of the applications can result in poor performance for individual applications. In the cloud, the hardware can instead be chosen to fit each job.
Testing and benchmarking new hardware:-
Access to recent hardware for testing and running benchmarks is easily available in the cloud. For example, multi-GPU servers with high-end GPUs can cost more than 100,000 USD to buy, while they are available on demand in the cloud for a few tens of USD. ISV and other codes can be tested on demand on recent GPUs and interconnects to understand the performance benefits.
Some more benefits are mentioned below:-
- Save money and time on expensive ISV software licenses.
- Archive result data.
- Parallel File System as a Service
The disadvantages of implementing cloud platforms in the HPC industry are:-
Cost of HPC in the cloud can be higher than on-premises:-
If you have a well-designed local cluster, the costs of acquiring and running it are often lower than in the cloud. The cost of running an HPC cluster in the cloud can vary significantly depending on the type of cloud instances you use to build it.
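A rough break-even calculation can make this comparison concrete. The sketch below is illustrative only: the function names and every figure (purchase price, yearly operating cost, lifetime, cloud hourly rate) are hypothetical assumptions, not prices from any provider.

```python
HOURS_PER_YEAR = 24 * 365


def on_prem_cost_per_used_hour(purchase_usd, yearly_opex_usd, lifetime_years, utilization):
    """Cost of one hour of actual use of an on-premises node.

    utilization is the fraction of wall-clock time the node runs jobs (0 < u <= 1).
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    total_cost = purchase_usd + yearly_opex_usd * lifetime_years
    used_hours = HOURS_PER_YEAR * lifetime_years * utilization
    return total_cost / used_hours


def break_even_utilization(purchase_usd, yearly_opex_usd, lifetime_years, cloud_usd_per_hour):
    """Utilization above which the on-premises node is cheaper than cloud on-demand."""
    total_cost = purchase_usd + yearly_opex_usd * lifetime_years
    return total_cost / (HOURS_PER_YEAR * lifetime_years * cloud_usd_per_hour)


# Hypothetical figures: 20,000 USD node, 2,000 USD/year for power and admin,
# 4-year lifetime, comparable cloud instance at 2.00 USD/hour.
threshold = break_even_utilization(20_000, 2_000, 4, 2.00)
print(f"On-premises is cheaper above {threshold:.0%} utilization")
```

With real quotes substituted for these placeholder numbers, a cluster that stays busy above the threshold favours on-premises, while an occasionally used one favours the cloud.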
Performance in the cloud:-
The performance of an HPC cluster in the cloud depends heavily on its configuration. Cloud providers often use virtual machines with hyper-threading turned on, so a virtual core (vCore) may correspond to a hardware thread rather than a full physical core, and the network infrastructure might differ from a dedicated HPC interconnect. Both can reduce the compute power available to a job.
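The effect of hyper-threading on sizing can be sketched with a small helper. This is a simplified illustration that assumes each virtual core maps to one hardware thread and that there are two threads per physical core; the actual mapping varies by provider and instance type, and both function names are hypothetical.

```python
def physical_cores(vcores, threads_per_core=2):
    """Estimate the physical cores behind a VM's virtual core count.

    threads_per_core=2 assumes hyper-threading is on; pass 1 if it is disabled.
    """
    if vcores <= 0 or threads_per_core <= 0:
        raise ValueError("vcores and threads_per_core must be positive")
    return vcores // threads_per_core


def vcores_for_job(cores_needed, threads_per_core=2):
    """Virtual cores to request so the job gets one physical core per rank."""
    if cores_needed <= 0 or threads_per_core <= 0:
        raise ValueError("cores_needed and threads_per_core must be positive")
    return cores_needed * threads_per_core


# A job sized for 128 physical cores on-premises
print(vcores_for_job(128), "virtual cores requested")
print(physical_cores(64), "physical cores behind a 64-vCore instance")
```

Under these assumptions, an instance advertised with 64 virtual cores should be compared against a 32-core on-premises node, not a 64-core one.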
Data Gravity keeps data in the cloud:-
Data, especially the amounts of data that are used in HPC, cannot be moved around easily. It takes time, even on fast connections, to download 50 or 100 GB for post-processing, and this is typically only for one HPC job.
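The download times involved can be estimated with simple arithmetic. The sketch below is a rough estimate, assuming decimal gigabytes and an assumed efficiency factor of 0.8 for protocol overhead; the link speeds and the function name are illustrative choices, not measurements.

```python
def transfer_time_seconds(size_gb, link_gbit_per_s, efficiency=0.8):
    """Estimated time to move size_gb gigabytes over a link.

    size_gb uses decimal gigabytes (10^9 bytes); efficiency is the assumed
    fraction of the nominal line rate that is actually achieved.
    """
    if size_gb < 0 or link_gbit_per_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid size, link speed, or efficiency")
    size_gbit = size_gb * 8  # 8 bits per byte
    return size_gbit / (link_gbit_per_s * efficiency)


# Result sizes from the text, over three illustrative link speeds
for size_gb in (50, 100):
    for link in (0.1, 1.0, 10.0):
        minutes = transfer_time_seconds(size_gb, link) / 60
        print(f"{size_gb:>4} GB over {link:>4} Gbit/s: about {minutes:.1f} min")
```

Multiplying the per-job figure by the number of jobs per day shows why results are often post-processed where they were produced.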
Some more disadvantages are mentioned below:-
- Data Egress Cost for downloading data from the cloud
- Networking into the cloud must be secured
- Split HPC clusters – on-premises and in the cloud