High Performance Computing Solutions Engineer
We are seeking an experienced High Performance Computing (HPC) Solutions Engineer to join our team. In this role, you will install, configure, and deploy high-performance computing systems.
This role demands a strong understanding of HPC architectures, networks, and software, along with excellent problem-solving skills. You will work closely with clients, project managers, and other technical staff to ensure that HPC solutions meet performance, reliability, and scalability requirements.
Key Responsibilities:
Installation and Configuration:
* Install and configure HPC hardware and software components, including servers, storage, and networking equipment.
* Set up and manage high-speed interconnects (e.g., InfiniBand, Ethernet).
* Deploy operating systems, cluster management software, and parallel file systems.
System Deployment:
* Coordinate with clients and project managers to understand deployment requirements and timelines.
* Implement and document HPC deployment processes and best practices.
* Perform system testing and validation to ensure optimal performance and reliability.
Client Support and Training:
* Provide technical support to clients during the installation and deployment phases.
* Conduct training sessions for clients on HPC system usage and maintenance.
* Develop and maintain user documentation and guides.
Performance Optimization:
* Monitor and analyze system performance to identify and resolve bottlenecks.
* Optimize HPC configurations for specific applications and workloads.
* Implement performance tuning techniques for hardware and software.
Collaboration and Communication:
* Work closely with hardware and software vendors to troubleshoot and resolve issues.
* Collaborate with internal teams to integrate HPC solutions with existing infrastructure.
* Communicate effectively with stakeholders to provide updates on project status and technical issues.
Continuous Improvement:
* Stay updated on the latest HPC technologies and trends.
* Recommend improvements to enhance system performance, reliability, and scalability.
* Participate in the evaluation and testing of new HPC products and solutions.
Qualifications:
* Proven experience in installing, configuring, and deploying HPC systems.
* Strong knowledge of HPC architectures, parallel computing, and cluster management.
* Proficiency in Linux/Unix operating systems.
* Experience with HPC programming models and workload managers (e.g., MPI, OpenMP, Slurm, Torque).
* Familiarity with high-speed networking technologies (e.g., InfiniBand, Ethernet).
* Excellent problem-solving skills and attention to detail.
* Strong communication and interpersonal skills.
* Ability to work independently and as part of a team.
* Certifications in relevant technologies (e.g., Red Hat Certified Engineer, Certified HPC Professional).
* Experience with cloud-based HPC solutions.
Benefits:
* Comprehensive suite of benefits supporting physical, financial, and emotional wellbeing.
* Ongoing career development opportunities.
* An inclusive work environment that values individual differences.
What We Offer:
* A dynamic work environment driving innovation and growth.
* The opportunity to work with cutting-edge technology and collaborate with industry experts.
* A competitive compensation package and comprehensive benefits.