Lead Site Reliability Engineer
Date: 30 Jul 2025
Location: Abu Dhabi, Abu Dhabi, AE
Company: G Forty Two General Trading LLC
Overview:
Role: Lead Engineer – Site Reliability
About Presight
Presight, an ADX-listed public company limited by shares whose majority shareholder is the Abu Dhabi company G42, is the region’s leading big data analytics company powered by Artificial Intelligence (“AI”). It combines big data, analytics, and AI expertise to serve every sector at every scale, creating business value and positive societal impact. With its world-class computer vision, AI, and omni-analytics platform as its engine, Presight leverages all-source data to support insight-driven decision-making that shapes policy and creates safer, healthier, happier, and more sustainable societies.
The Opportunity:
We are seeking a meticulous, expert Lead Engineer – Site Reliability to build and support the Presight delivery model, which empowers product and technology teams to develop and deliver high-quality products, improve platform infrastructure, and strengthen the reliability of products and solutions. You will play a key role in defining and establishing the delivery model used to develop cutting-edge, next-generation analytics solutions and services at Presight.
Responsibilities:
As a Lead Engineer – Site Reliability, you will work with relevant stakeholders to drive reliability, performance, and scalability across our infrastructure. You will own the SRE roadmap and guide its implementation through mentorship, code contributions, and hands-on infrastructure work. You will also partner closely with Engineering, Data Science, and Product teams to embed reliability into the development lifecycle.
Functional
- Architect and lead reliability strategies across services and environments.
- Define and enforce SLOs, SLIs, and error budgets with engineering leadership.
- Lead incident response and root cause analysis.
- Implement automation to reduce toil and improve system resilience.
- Manage capacity planning, traffic forecasting, and cost optimization.
- Mentor junior and senior SREs in technical and process excellence.
- Collaborate with MLOps, DevSecOps, and CloudOps teams to enforce best practices.
- Champion observability, metrics-driven decisions, and platform maturity.
- Deploy monitoring tools such as Prometheus and Grafana to track system performance.
- Ensure system reliability adheres to security and compliance standards, particularly within regulated sectors.
- Comply with QHSE (Quality, Health, Safety, and Environment), Business Continuity, Information Security, Privacy, Risk, Compliance Management, and Governance of Organizations policies, procedures, plans, and related risk assessments.
Qualifications:
Required Skills:
- Bachelor's degree in Computer Engineering or a related field.
- Minimum of 10 years of experience in site reliability, with 2 years in people management.
- Expertise in Kubernetes, CI/CD (e.g., GitLab), and infrastructure-as-code (Terraform/Helm).
- Strong experience in cloud (Azure, AWS, or GCP).
- Experience with multi-tenant systems or high-throughput data platforms.
- Exposure to AI/ML infrastructure or MLOps pipelines.
- Proven background in SRE principles, SLIs/SLOs, and reliability-focused engineering.
- Programming proficiency in Python or Shell (nice to have).
- Deep understanding of distributed systems, networking, and incident management.
Ideally, you’ll also have:
- A highly detail-oriented and methodical approach to problem solving.
- A passion for technology, troubleshooting and customer service.
- A strongly analytical mind.
- Great verbal and written communication skills.
What we look for:
Join us at Presight, where we offer a culture of innovation, outstanding career growth opportunities, and competitive rewards. If you're eager to conquer new frontiers in AI and thrive in a dynamic environment, we welcome you to our community.
What working at Presight offers:
Culture: An open, diverse and inclusive environment with a global vision that encourages personal growth and focuses on ground-breaking, industry-first innovations.
Career: Accelerate your career through high-impact projects and access to resources for continuous growth and learning opportunities.
Rewards: A competitive remuneration package with a host of perks including healthcare, education support, leave benefits and more.