System Administrator


REVE CLOUD PTE. LTD.
Posted date: 8 days ago
Minimum level: N/A
Job category: IT
Position Summary:

We are seeking a skilled HPC System Administrator to manage and maintain high-performance computing (HPC) systems. The ideal candidate will be responsible for system administration, user support, software integration, and collaboration with research teams to optimize computational workflows.

Key Responsibilities:

1. HPC System Management and Maintenance

• Install, configure, integrate, and maintain high-performance compute clusters and associated hardware

• Monitor system performance, troubleshoot issues, and ensure security compliance

• Process and document change management procedures

2. User Support and Consultation

• Assist users with computational jobs and optimize workflows for efficient resource utilization

• Provide training sessions and resolve user issues related to HPC environments

3. Software and Application Support

• Install, configure, and maintain scientific and engineering HPC software solutions

• Support software development for parallel computing and performance optimization

4. Collaboration with Research Teams

• Understand research project requirements and recommend appropriate HPC solutions

• Assist in designing and optimizing computational workflows for researchers

5. Resource Allocation and Scheduling

• Manage resource allocation and job scheduling within the HPC environment

• Implement policies for job queuing, resource limits, and workload balancing

• Enforce operational best practices and implementation plans

6. System and Network Optimization

• Configure and maintain high-speed networks for optimal data transfer within the HPC infrastructure

• Conduct performance benchmarking and optimization efforts

7. Documentation and Reporting

• Maintain detailed system documentation, configuration guides, and user manuals

• Generate reports on system performance, resource utilization, and operational efficiency

Qualifications and Skills:

• Strong experience with HPC system administration, Linux-based environments, and cluster management tools.

• Proficiency in job scheduling and resource management frameworks (e.g., Slurm, PBS, Grid Engine).

• Hands-on experience with networking protocols, security policies, and data transfer optimizations.

• Familiarity with scientific computing software and parallel programming techniques.

• Ability to troubleshoot complex system and application issues effectively.

• Strong communication skills to collaborate with researchers and support teams.

JOB SUMMARY
System Administrator
REVE CLOUD PTE. LTD.
Location: Singapore
Employment type: Full-time