Post-merger summary of ARCSIM hardware and services

Upgrades to Hardware and Services

Research computing services hosted in the data center in Neville Hall on the UMaine campus have been a cornerstone for researchers across the University of Maine System (UMS) for many years. A wave of upgrades to hardware and services, intended to better serve the UMS community, is in its final stages. Here’s an overview of what’s available.

High-Performance Computing (HPC)

The centerpiece of ARCSIM’s HPC capabilities is the Penobscot cluster, which researchers access through the Open OnDemand web interface and the SLURM job scheduler. With over 100 compute nodes and more than 4,000 CPU cores, the cluster gives researchers the computational capacity they need for their work. The nodes are organized into six primary partitions, grouped by hardware type:

  • epyc-genoa: 4 AMD EPYC Genoa (4th-generation) systems, each with 96 cores and 384GB of RAM.
  • epyc: 14 AMD EPYC Milan (3rd-generation) systems, each with 96 cores and 512GB of RAM.
  • epyc-hm: 4 AMD EPYC Milan (3rd-generation) high-memory systems, each with 32 cores and 1TB of RAM.
  • skylake: 8 Intel Skylake systems, each with 36 cores and 256GB of RAM.
  • haswell: 88 Intel Haswell and Broadwell systems, each with 24 to 28 cores and 64GB to 128GB of RAM.
  • gpu: 6 GPU nodes featuring 29 NVIDIA GPUs of various types, including:
    • 14 A100 GPUs (8 with 40GB VRAM, 6 with 80GB VRAM)
    • 3 L40 GPUs with 48GB VRAM
    • 4 A30 GPUs with 24GB VRAM
    • 8 RTX2080 GPUs with 11GB VRAM
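
As a quick cross-check of the headline figures, the partition list above can be tallied in a few lines of Python. This is only a sketch: the numbers are transcribed from the list, the gpu partition is counted for nodes but not for cores because its CPU core counts are not given, and the dictionary layout is an illustrative choice rather than anything ARCSIM publishes.

```python
# Partition inventory transcribed from the list above.
# Each entry: (node_count, min_cores_per_node, max_cores_per_node).
# Core counts for the gpu partition are not stated in the text, so it is
# counted for nodes only (cores recorded as 0).
PARTITIONS = {
    "epyc-genoa": (4, 96, 96),
    "epyc": (14, 96, 96),
    "epyc-hm": (4, 32, 32),
    "skylake": (8, 36, 36),
    "haswell": (88, 24, 28),
    "gpu": (6, 0, 0),
}

# GPU counts by model, also from the list above.
GPUS = {"A100-40GB": 8, "A100-80GB": 6, "L40": 3, "A30": 4, "RTX2080": 8}


def totals(partitions):
    """Return (nodes, min_cores, max_cores) summed over all partitions."""
    nodes = sum(n for n, _, _ in partitions.values())
    lo = sum(n * c_min for n, c_min, _ in partitions.values())
    hi = sum(n * c_max for n, _, c_max in partitions.values())
    return nodes, lo, hi


nodes, min_cores, max_cores = totals(PARTITIONS)
gpu_total = sum(GPUS.values())
print(f"nodes: {nodes}")
print(f"CPU cores (excluding gpu nodes): {min_cores} to {max_cores}")
print(f"GPUs: {gpu_total}")
```

Run as written, the tally gives 124 nodes, between 4,256 and 4,608 CPU cores outside the gpu partition, and 29 GPUs, which is consistent with the figures quoted above.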

The diversity of these resources ensures that researchers can tackle everything from traditional computational tasks to cutting-edge AI and machine learning projects.
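
For readers new to SLURM, the sketch below shows how a job might target one of these partitions. It is a minimal, hypothetical example: the partition names come from the list above, but the helper build_batch_script, the resource numbers, and the "python train.py" command are placeholders invented for illustration. The exact options Penobscot expects (accounts, GPU type names, time limits) are not covered here and should be confirmed with ARCSIM.

```python
# Partition names taken from the list above; everything else here
# (job name, resource numbers, the command being run) is a placeholder.
KNOWN_PARTITIONS = {"epyc-genoa", "epyc", "epyc-hm", "skylake", "haswell", "gpu"}


def build_batch_script(partition, command, cpus=4, mem_gb=16,
                       time_limit="01:00:00", gpus=0, job_name="example"):
    """Return the text of a SLURM batch script for one single-node task."""
    if partition not in KNOWN_PARTITIONS:
        raise ValueError(f"unknown partition: {partition}")
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --partition={partition}",
        "#SBATCH --nodes=1",
        "#SBATCH --ntasks=1",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={time_limit}",
    ]
    if gpus > 0:
        # Generic GRES request; site-specific GPU type names are not assumed.
        lines.append(f"#SBATCH --gres=gpu:{gpus}")
    lines += ["", command, ""]
    return "\n".join(lines)


# Hypothetical usage: one GPU, 8 CPU cores, 64 GB of RAM on the gpu partition.
script = build_batch_script("gpu", "python train.py", cpus=8, mem_gb=64, gpus=1)
print(script)
```

The printed text can be saved to a file and submitted with sbatch. Generating the script from a function keeps the partition name and resource request in one place, which makes it easier to rerun the same job on a different partition.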

Virtual Machines (VMs)

ARCSIM’s local virtual machine services are also available, offering researchers and educators flexible options for remote workstations, compute servers, and data dissemination. Hosted on the same high-performance infrastructure as the HPC systems, the VM cloud supports up to 1,000 virtual machines, with configurations ranging from 1 vCPU and 3GB of RAM to 16 vCPUs and 48GB of RAM.

For developers needing GPU capabilities, ARCSIM provides 12 NVIDIA T4 GPUs (16GB VRAM each), which are ideal for tasks such as GPGPU programming with CUDA and running smaller large language models (LLMs) locally. Specialized VMs are also available for handling HIPAA-compliant data, ensuring secure and compliant workflows.
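
To give a rough sense of what "smaller" means for a 16GB card, the sketch below estimates whether a model's weights fit in VRAM at a given precision. It is a back-of-the-envelope calculation, not a sizing guide: the 20% overhead for activations and cache is an assumed placeholder, the example model sizes are arbitrary, and real memory use depends on the inference framework, context length, and batch size.

```python
def weights_gb(params_billion, bits_per_param):
    """Approximate size of the model weights in GB (10^9 bytes)."""
    return params_billion * bits_per_param / 8


def fits(params_billion, bits_per_param, vram_gb, overhead=0.2):
    """True if weights plus a fractional overhead fit within vram_gb.

    The 20% default overhead is a rough placeholder for activations and
    cache, not a measured figure.
    """
    return weights_gb(params_billion, bits_per_param) * (1 + overhead) <= vram_gb


T4_VRAM_GB = 16  # per-GPU VRAM, from the text above

# Arbitrary example sizes: (parameters in billions, bits per parameter).
for params, bits in [(7, 16), (7, 8), (13, 4)]:
    need = weights_gb(params, bits)
    verdict = "fits" if fits(params, bits, T4_VRAM_GB) else "does not fit"
    print(f"{params}B params at {bits}-bit: ~{need:.1f} GB of weights, "
          f"{verdict} on one T4")
```

Under these assumptions, a 7-billion-parameter model at 16-bit precision needs about 14GB for weights alone and does not leave enough headroom on a single T4, while 8-bit or 4-bit quantized variants do. Larger models are better suited to the A100 nodes in the HPC gpu partition.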

Advanced Storage Solutions

The backbone of ARCSIM’s infrastructure is its high-performance Ceph storage cluster, which supports the HPC and VM services while offering scalable and reliable storage for researchers. The cluster currently has 2.9PB of raw capacity, with an expansion planned for early 2025 that will increase raw capacity to over 7PB. Key features include:

  • High throughput and low latency for optimal performance.
  • Redundancy and resilience to ensure data availability, even during hardware upgrades or replacements.
  • Accessibility across UMS via multiple protocols, including S3 buckets, network shares, and SCP.

Whether researchers need archival storage or high-speed working space, ARCSIM’s storage solutions are designed to meet their needs.
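
Raw capacity is not the same as usable capacity, because the redundancy described above consumes part of it. The sketch below illustrates the arithmetic for two common Ceph data-protection schemes. Both schemes are assumptions chosen for illustration: the text does not say how ARCSIM's pools are configured, and the calculation ignores metadata overhead and the free-space headroom a production cluster keeps in reserve.

```python
def usable_replicated(raw_pb, replicas=3):
    """Usable capacity when every object is stored `replicas` times."""
    return raw_pb / replicas


def usable_erasure_coded(raw_pb, k=4, m=2):
    """Usable capacity with k data chunks and m coding chunks per object."""
    return raw_pb * k / (k + m)


# Raw capacities from the text; 7.0 is a lower bound ("over 7PB").
SCENARIOS = [("current", 2.9), ("after expansion (lower bound)", 7.0)]

for label, raw_pb in SCENARIOS:
    rep = usable_replicated(raw_pb)      # assumed 3-way replication
    ec = usable_erasure_coded(raw_pb)    # assumed 4+2 erasure coding
    print(f"{label}: raw {raw_pb:.1f} PB, "
          f"~{rep:.2f} PB usable if 3x replicated, "
          f"~{ec:.2f} PB usable if 4+2 erasure coded")
```

The gap between the two results shows why the protection scheme matters when planning an allocation: the same raw hardware yields roughly twice the usable space under 4+2 erasure coding as under 3-way replication.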

Collaborations and External Resources

In addition to its robust internal resources, ARCSIM partners with several leading organizations to offer researchers access to national computing and storage resources:

  • Ohio Supercomputer Center (OSC): OSC supports research in computational science and supercomputing, offering home directory storage (500GB per user), shared project storage in 0.5TB increments, and specialized staff expertise across emerging disciplines. Details on OSC’s storage options and clusters can be found on its documentation page.
  • Texas Advanced Computing Center (TACC): TACC is a leading national computing center, providing systems such as Stampede3 and Frontera, along with a range of storage solutions such as Corral for project data storage and Ranch for archival needs.
  • ACCESS Program (Advanced Cyberinfrastructure Coordination Ecosystem: Services and Support): Funded by the National Science Foundation, ACCESS offers high-performance computing clusters, cloud infrastructure, and storage resources at no cost to researchers and educators. ARCSIM serves as a local ACCESS Campus Champion, facilitating access and helping researchers secure allocations. Visit the ACCESS program page for more details or contact ARCSIM for assistance.

These partnerships extend ARCSIM’s capabilities, ensuring that researchers have access to world-class resources for their projects.

Supporting the Research Community

ARCSIM’s upgraded hardware and services reflect its ongoing commitment to empowering researchers throughout the University of Maine System. With decades of experience among its staff and a robust infrastructure that continues to evolve, ARCSIM provides the tools necessary for groundbreaking research and innovation.

To learn more or to start using these resources for your research, visit ARCSIM’s website or contact the ARCSIM team directly.