Reliability Engineer
We're accelerating our digital strategy with an ambition to provide customers with one of the best digital experiences of any company globally.
Site Reliability Engineering (SRE) is key to us achieving this goal. Our teams ensure that our systems maintain the highest standards of service outcomes for our customers, which enables seamless execution of our award-winning banking apps.
Our SRE team marries software engineering and operations by applying software engineering principles to our operational services. We implement the latest industry-wide methodologies for observability.
As a reliability engineer, you'll be a technical leader, designing and implementing large-scale solutions, as well as influencing and engaging the organisation's senior stakeholders on modern best practices for improving reliability throughout the SDLC.
We support our people with flexibility in where work is done, with at least half of your time each month spent connecting in the office. We also offer many other flexible working options, including changing start and finish times, part-time arrangements and job sharing, to name a few.
Key Responsibilities:
* Partnering with senior stakeholders and leading a culture of data-driven reliability, monitoring and automation aligned to SRE principles
* Designing, developing, testing and supporting applications and systems
* Managing and scaling distributed systems in a public, private or hybrid cloud environment
* Creating and developing tools required to support the operational management (including security) of software applications and systems
* Identifying technology limitations and deficiencies in existing systems and associated processes, procedures and methods, and using software engineering to develop scalable and sustainable improvements
Requirements:
* Expertise in at least one programming language
* Strong experience with modern software development practices, using tools such as Git for source control and CI/CD tools like TeamCity, Jenkins, Octopus Deploy or similar
* Strong public cloud experience in AWS, GCP or Azure
* Experience leading teams of engineers driving outcomes using observability tools such as Prometheus, Grafana, AWS CloudWatch, Splunk or AppDynamics
* Extensive and in-depth knowledge of Linux internals, networking, containers, and troubleshooting
* Experience in applying SRE practices in large organisations, and excellent communication and problem-solving skills
Benefits:
* Flexible work arrangements
* Support for career development
* Opportunities for growth and progression
* Collaborative and dynamic work environment
How to Apply:
If this sounds like the role for you, we would love to hear from you. Submit your application today and take the first step towards joining our team.