Resources
The University of South Carolina (USC) High Performance Computing (HPC) clusters are available to researchers requiring specialized hardware for computational research. The clusters are managed by Research Computing (RC) in the Division of Information Technology.
HPC resources are housed in the USC data center, which provides enterprise-level monitoring, cooling, backup power, and Internet2 connectivity.
Research Computing HPC clusters are accessed through SLURM job management partitions (queues) and are managed using Bright Cluster Manager, which provides a robust software environment to deploy, monitor, and manage HPC clusters.
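As a point of reference, work is typically submitted to a cluster as a batch script passed to SLURM's sbatch command. The sketch below is a minimal example rather than an official template: the partition name (defq), the module name, and the script my_analysis.py are placeholders for illustration; the actual partition names and software modules for each cluster are published by Research Computing.

```bash
#!/bin/bash
#SBATCH --job-name=example           # descriptive job name
#SBATCH --partition=defq             # placeholder partition (queue) name; run sinfo to see real partitions
#SBATCH --nodes=1                    # number of nodes
#SBATCH --ntasks-per-node=8          # tasks (processes) per node
#SBATCH --time=01:00:00              # wall-clock limit (HH:MM:SS)
#SBATCH --output=%x_%j.out           # output file named from job name (%x) and job ID (%j)

# Load site-provided software; module names vary by cluster.
module load python3

# Launch the application under SLURM.
srun python3 my_analysis.py
```

A script like this is submitted with sbatch job.sh and monitored with squeue -u $USER; sinfo lists the partitions actually available on a given cluster.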
Theia
Theia is a new AI-focused HPC cluster in early-stage production at the University of South Carolina. Theia features 28 CPU nodes with 112 cores each, 10 quad-GPU nodes (nine with four NVIDIA A100s and one with four H100s), and 1.5 PB of GPFS scratch storage, for a total of 3,808 cores and 675 TFLOPS of combined CPU and GPU peak performance.
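As a sketch of how a job might target Theia's quad-GPU nodes, the example below requests four A100 GPUs on a single node using SLURM's generic resource (GRES) syntax. The partition name (gpu), the GRES label (a100), the memory request, and the module and script names are assumptions for illustration only; consult Research Computing for Theia's actual partition and GRES names.

```bash
#!/bin/bash
#SBATCH --job-name=train-model       # AI model training example
#SBATCH --partition=gpu              # placeholder GPU partition name
#SBATCH --nodes=1
#SBATCH --gres=gpu:a100:4            # request all four A100s on a quad-A100 node (GRES label is an assumption)
#SBATCH --cpus-per-task=16           # host CPU cores for data loading
#SBATCH --mem=128G                   # host memory for the job
#SBATCH --time=08:00:00

module load cuda                     # module name varies by site

srun python3 train.py                # train.py is a placeholder training script
```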
Theia is funded by an NSF Major Research Instrumentation (MRI) grant, with the goal of lowering the barrier to entry and increasing access to AI, AI model training, and HPC for South Carolina and regional educational institutions, including K-12.
The Theia NSF MRI grant leadership includes Dr. Ming Hu (PI, Science Core), Paul Sagona
(HPC Management), Dr. Sophya Garashchuk (Workshops), Dr. Jianjun Hu (Undergraduate
Training), and Dr. Forest Agostinelli (Summer Camps and Outreach).
Hyperion
Hyperion is our flagship cluster, intended for large, parallel jobs. It consists of 356 compute, GPU, and Big Memory nodes providing 16,616 CPU cores. Compute and GPU nodes have 128-256 GB of RAM, and Big Memory nodes have 2 TB of RAM. All nodes have EDR InfiniBand (100 Gb/s) interconnects and access to 1.4 PB of GPFS storage.
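To illustrate the kind of large, parallel job Hyperion is intended for, the sketch below requests several 64-core compute nodes and launches an MPI program with srun. The partition and module names and the executable are placeholders, not Hyperion's actual configuration.

```bash
#!/bin/bash
#SBATCH --job-name=mpi-example
#SBATCH --partition=defq             # placeholder partition name
#SBATCH --nodes=4                    # four compute nodes
#SBATCH --ntasks-per-node=64         # one MPI rank per core on a 64-core compute node
#SBATCH --time=12:00:00

module load openmpi                  # module name varies by site

srun ./my_mpi_app                    # srun launches the MPI ranks across all allocated nodes
```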
Bolden
This cluster is intended for teaching purposes only and consists of 20 nodes providing 400 CPU cores. All nodes have FDR InfiniBand (54 Gb/s) interconnects and access to 300 TB of Lustre storage.
Maxwell (Retired)
This cluster was available for teaching purposes only. It consisted of 55 nodes with 2.4 GHz or 2.8 GHz CPUs, each with 24 GB of RAM.
Historical Summary of HPC Clusters
| HPC Cluster | Theia | Hyperion Phase III | Hyperion Phase II | Hyperion Phase I | Bolden | Maxwell |
| --- | --- | --- | --- | --- | --- | --- |
| Status | Active | Active | Retired | Retired | Teaching | Retired |
| Number of Nodes | 38 | 356 | 407 | 224 | 20 | 55 |
| Total Cores | 3,808 | 16,616 | 15,524 | 6,760 | 400 | 660 |
| Compute Nodes | 28 | 295 | 346 | 208 | 18 | 40 |
| Compute Node Cores | 112 | 64 | 48 | 28 | 20 | 12 |
| Compute Node CPU Speed | 2.0 GHz (3.8 GHz Max Turbo) | 3.0 GHz | 3.0 GHz | 2.8 GHz | 2.8 GHz | 2.4 GHz or 2.8 GHz |
| Compute Node Memory | 512 GB | 256 GB or 192 GB | 192 GB or 128 GB | 128 GB | 64 GB | 24 GB |
| GPU Nodes | 9 Quad A100, 1 Quad H100 | 1 DGX, (1) 8x A100, 44 Dual V100 | 9 Dual P100, 44 Dual V100 | 9 Dual P100 | 1 K20X | 15 M1060 |
| GPU Node Cores | 96 or 64 | 48 or 28 | 48 or 28 | 28 | 20 | 12 |
| GPU Node CPU Speed | 2.6 GHz (3.6 GHz Max Turbo) | 3.0 GHz | 3.0 GHz | 2.8 GHz | 2.8 GHz | 2.4 GHz or 2.8 GHz |
| GPU Node Memory | 512 GB (1.5 TB for Quad H100) | 192 GB | 128 GB | 128 GB | 128 GB | 24 GB |
| Big Memory Nodes | | 8 | 8 | 8 | 1 | 0 |
| Big Memory Node Cores | | 64 | 40 | 40 | 20 | |
| Big Memory CPU Speed | | 3.0 GHz | 3.0 GHz | 2.1 GHz | 2.8 GHz | |
| Big Memory Node Memory | | 2.0 TB | 1.5 TB | 1.5 TB | 256 GB | |
| Home Storage | | 450 TB GPFS | 600 TB NFS | 300 TB Lustre, 50 TB NFS | 50 TB NFS | |
| Home Storage Interconnect | | 1 Gb/s Ethernet | 1 Gb/s Ethernet | 1 Gb/s Ethernet | 1 Gb/s Ethernet | 1 Gb/s Ethernet |
| Scratch Storage | 1.5 PB | 1.4 PB | 1.4 PB | 1.5 PB | 300 TB | 20 TB |
| Scratch Storage Interconnect | 400 Gb/s NDR InfiniBand | 100 Gb/s EDR InfiniBand | 100 Gb/s EDR InfiniBand | 100 Gb/s EDR InfiniBand | 54 Gb/s FDR InfiniBand | 40 Gb/s QDR InfiniBand |