Senior Manager of Infrastructure Engineering
Core42, a leader in AI-powered cloud and digital infrastructure, is driving transformative technology solutions globally. Leveraging advanced resources and partnerships, Core42 empowers clients to harness sovereign AI infrastructure, especially in sectors with stringent regulatory needs. With a mission to redefine digital transformation, we combine sovereign capabilities with scalable, high-performance compute infrastructure, positioning ourselves at the forefront of AI innovation in the Middle East and beyond.
With a diverse team of 1,100+ employees globally from ~70 nationalities, we foster an inclusive, innovative, and collaborative environment. Our culture at Core42 is grounded in trust, accountability, and high performance. We are united by our values: Grit, where we overcome challenges with resilience and determination; Passion, which drives us to pursue excellence in everything we do; and Impact, as we aim to inspire progress and create meaningful change. Our team members thrive in an environment where each person’s contributions propel us forward, and together, we commit to achieving extraordinary results.
The Opportunity
The Senior Manager of Infrastructure Engineering will play a leading role in the planning, delivery, and support of large-scale GPU AI cluster environments. This role covers physical infrastructure planning, physical layout optimization, and management of the critical support teams that handle break/fix operations and inventory management. The ideal candidate will have deep expertise in data center infrastructure, high-performance computing (HPC), and AI cluster architecture to ensure maximum uptime, efficiency, and scalability.
Key Responsibilities
- Develop and implement infrastructure strategies to support large-scale GPU clusters, ensuring scalability, efficiency, and reliability.
- Collaborate with engineering teams to optimize physical layout, including rack design, cooling, power distribution, and networking topology.
- Work closely with the VP of AI Infrastructure and vendors to specify hardware requirements, ensuring alignment with performance and growth goals.
- Oversee the installation, cabling, and deployment of new compute, storage, and networking hardware.
- Lead and manage a team of break/fix technicians and inventory specialists, ensuring rapid response to hardware failures and continuous improvement in repair processes.
- Implement best practices for asset tracking, inventory control, and spare parts management to minimize downtime.
- Ensure that infrastructure maintenance and upgrades are conducted with minimal impact on system availability.
- Partner with facilities and data center teams to optimize cooling, airflow, and power efficiency in high-density GPU environments.
- Drive automation and process improvements to enhance infrastructure reliability and reduce operational overhead.
- Establish and track KPIs for system uptime, response times, and infrastructure efficiency.
- Serve as the primary infrastructure liaison between operations, hardware engineering, and data center teams.
- Collaborate with network engineers, storage specialists, and the VP of AI Infrastructure to ensure seamless integration of physical infrastructure with software environments.
- Advocate for continuous improvement and innovation, evaluating new technologies and methodologies to improve infrastructure performance.
Required Qualifications
- Bachelor’s degree in Computer Science or equivalent.
- 10+ years of experience in large cluster computing environments, infrastructure engineering, or HPC facility management.
- Proven experience managing large-scale GPU clusters, HPC environments, or AI-focused infrastructure.
- Strong knowledge of rack design, high-density cooling, power distribution, and structured cabling.
- Experience managing break/fix teams, inventory control, and hardware lifecycle processes.
- Deep understanding of hardware failure patterns, troubleshooting methodologies, and repair logistics.
- Familiarity with automation tools for infrastructure monitoring and asset tracking.
- Strong leadership skills with a track record of building and managing high-performing teams.
- Excellent communication skills to collaborate across engineering, operations, and procurement teams.
Preferred Qualifications
- Experience working with liquid cooling, immersion cooling, or advanced thermal management solutions for HPC.
- Background in high-speed networking (InfiniBand, Ethernet) and storage systems.
- Familiarity with DCIM (Data Center Infrastructure Management) tools.
- Certifications such as DCDC (Data Center Design Consultant) or CDCDP (Certified Data Center Design Professional) are a plus.
Compensation & Benefits
The base salary for this full-time position ranges from $156,000 in our lowest geographic market to $234,000 in our highest geographic market. The actual base salary will be determined by various factors, including the position’s location, job-related skills, knowledge, experience, and relevant education or training.
Certain roles are eligible for additional rewards, such as merit-based salary increases, annual bonuses, and long-term incentive plans, which are contingent on individual and company performance. Additionally, some positions offer the opportunity to earn sales incentives based on revenue or utilization targets.
As a full-time employee, you will also have access to comprehensive benefits, including leading healthcare options (medical, dental, and vision insurance), a 401(k) plan with company matching, company-sponsored short-term and long-term disability coverage, life insurance, paid time off, and various well-being benefits, among others.
Equal Employment Opportunity
Core42 is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances.
If you need assistance and/or a reasonable accommodation to participate in the job application or interview process, or to perform the essential functions of the position, please contact us at USA-ExternalCandidates@core42.ai.