Job Summary
We are seeking a highly skilled Site Reliability Engineer to join our team. This role is responsible for ensuring the reliability, scalability, and performance of our systems.
Main Responsibilities:
* Defining Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Service Level Agreements (SLAs)
* Monitoring system performance and reducing toil
* Capacity planning and scaling
* Automating reliability improvements
Main Goal: Ensure production systems are reliable, performant, and scalable
Approach: Focus on reducing downtime, preventing failures, and maintaining system stability
Required Skills and Qualifications:
* Strong background in software engineering, DevOps, or operations
* Good knowledge of the technology landscape to guide the team
* Tertiary qualification in IT or computer science
* Preferred: Strong Azure cloud background with infrastructure-as-code (IaC) skills
* Strong communication skills, both written and verbal
* Highly motivated, self-driven individual with a positive attitude
Additional Skills:
* Excellent customer-facing skills
* Excellent written and verbal communication skills
* Strong attention to detail and outstanding analytical and problem-solving skills
Other Information:
* We recognize that everyone has individual requirements and are committed to creating an inclusive environment for all employees
* We are proud to be an equal opportunity employer