System Engineer (HPC)

HORIZON GLOBAL SERVICES PTE. LTD.
Posted date: 2 days ago
Minimum level: N/A
Job category: Engineering
Singaporean applicants only.
Key Responsibilities:
- Infrastructure Management:
Administer and manage HPC infrastructure with 700+ compute nodes and 50+ AWS cloud instances.
Ensure seamless operation and integration of HPC systems, storage subsystems, and networking components.
- Linux Administration:
Provide Linux systems administration across Red Hat and CentOS servers.
Perform patching, compiling, securing, and troubleshooting in a heterogeneous environment.
- Monitoring & Automation:
Implement and maintain system monitoring, configuration management, and automation using tools like Puppet, Splunk, BigFix, Ganglia, and Nagios.
- Job Scheduling:
Manage job scheduling environments using PBS or equivalent workload schedulers.
- Technical Support:
Support researchers and developers by troubleshooting advanced technical issues in HPC environments.
- Performance Optimization:
Contribute to continuous improvements in system performance, reliability, and scalability.
- Environment Coordination:
Coordinate and implement changes across environments, from development to production.
- Collaboration:
Collaborate with internal IT teams and research staff to ensure infrastructure meets project demands.
- Disaster Recovery & Documentation:
Participate in disaster recovery planning, system documentation, and user support activities.
Required Skills & Tools:
Operating Systems:
- Red Hat
- CentOS
HPC Tools & Technologies:
- xCAT
- PBS Scheduler
- InfiniBand
- Lustre
- SAN
- LVM
- EXT
- NFS
- XFS
Scripting & Automation:
- Bash
- Python
- sed
- awk
Compilers:
- GNU
- Intel
- CUDA
Cloud Technologies:
- AWS (AWS Certified Solutions Architect - Associate certification preferred)
Monitoring & Configuration Tools:
- Puppet
- Splunk
- BigFix
- Ganglia
- Nagios
Cluster Management:
- Red Hat PCS
- Parallel File Systems
Soft Skills:
- Strong communication
- Analytical thinking
- Advanced troubleshooting capabilities
JOB SUMMARY
Location: Singapore
Employment type: Contract / Freelance / Self-employed