Job Title:
Site Reliability Lead Engineer
We are looking for a Site Reliability Lead Engineer with a minimum of 8 years of relevant experience to ensure system reliability, scalability, and performance.
Key Responsibilities:
1. Primary focus: Ensure that production systems are reliable, performant, and scalable.
2. Define SLOs, SLIs, and SLAs for reliability; monitor system performance; plan capacity; and automate reliability improvements.
3. Approach: Reduce downtime, prevent failures, and maintain system stability.
4. Background: Software engineering, DevOps, operations, or cloud engineering.
Good knowledge of the technology landscape is needed to guide the team. A tertiary qualification in IT, Computer Science, or a related discipline is required.
Preferred Skills:
* Strong Azure cloud background with Infrastructure as Code (IaC) skills.
* Strong communication skills, both written and verbal.
* Highly motivated, self-driven, with a positive attitude.
Additional Skills:
* Excellent customer-facing skills.
* Outstanding analytical and problem-solving skills.
At our organization, we recognize individual needs and strive to create an inclusive environment. We are committed to diversity and equal-opportunity employment.