Job Overview
We are seeking a seasoned Engineering Manager to lead our Observability team. The successful candidate will be responsible for defining and executing the strategy and roadmap for Observability, covering logging, metrics, tracing, and alerting.
About the Role
The Engineering Manager will lead a team of Observability engineers and own the reliability, scalability, and performance of our Observability platforms and tools. The role also involves driving the adoption of Observability best practices and collaborating with other engineering managers and product teams.
Key Responsibilities
* Lead and manage a team of Observability engineers within the Platform Engineering domain.
* Define and execute the strategy and roadmap for Observability across the company.
* Ensure the reliability, scalability, and performance of Observability platforms and tools.
* Drive the adoption of Observability best practices and educate development teams on their effective use.
* Collaborate with other engineering managers and product teams to integrate Observability into the software development lifecycle.
* Stay current with industry trends and emerging technologies in Observability.
* Manage vendor relationships and evaluate new tools and solutions.
* Contribute to the overall architecture and design of the platform infrastructure.
* Provide clear direction, set performance expectations, and offer constructive feedback to team members.
* Foster a culture of continuous learning and growth within the team.
* Identify and nurture talent, providing opportunities for career development.
* Manage team resourcing, recruitment, and retention.
* Resolve conflicts and mediate team issues.
* Champion diversity, equity, and inclusion within the team.
* Ensure team well-being and a healthy work-life balance.
Requirements
* Proven experience in software engineering management, leading teams focused on observability, SRE, or platform engineering.
* Strong technical background in observability principles, tools (e.g., Prometheus, Grafana, the ELK stack, Jaeger, or OpenTelemetry), and best practices.
* Experience building and managing distributed systems and cloud-native applications (AWS, GCP, or Azure).
* Ability to articulate complex technical concepts to non-technical audiences.
* Experience with agile development methodologies.
Benefits
* Flexible work options, including flexible hours, hybrid work, or part-time arrangements.
* 18 weeks of paid parental leave, with no distinction between primary and secondary carers.
* Unique experiences, behind-the-scenes access, and other perks.
* Digital newspaper subscription to our mastheads.
* Annual gift voucher for a Stan subscription.