
Site reliability engineer

Melbourne
BANXA
Posted: 4 November
Offer description

Reporting to the CIO, the SRE Manager will lead and mentor our Site Reliability Engineering team to ensure the continued stability, performance, and resilience of Banxa's production systems.

This strategic role is responsible for shaping and implementing the vision for operational excellence and proactive support. You will be a key leader in establishing mature SRE practices, fostering a culture of reliability, and driving the automation and continuous improvement of our platforms. Your day-to-day will involve managing a team of talented engineers, planning strategically, and collaborating with senior stakeholders across the business, engineering, and other technology teams to build resilient, scalable systems.

Key Responsibilities

* Lead and Develop SRE Strategy and Team Culture: Lead, mentor, and grow a high-performing Site Reliability Engineering (SRE) team, developing and executing a strategic roadmap to achieve business objectives for system reliability, performance, and operational excellence.
* Own and Mature Incident Management and DORA Compliance: Own the entire incident lifecycle (detection, response, resolution, and root cause analysis) and ensure strict compliance with DORA regulations, including mandated timely reporting (initial, intermediate, and final) of all major ICT-related incidents to competent authorities.
* Ensure Operational Resilience and Readiness: Enhance operational readiness by designing and conducting regular tabletop exercises and failure simulations to rigorously test, validate, and improve Business Continuity Plans (BCP), Disaster Recovery (DR) strategies, and incident response playbooks.
* Manage and Govern Third-Party Provider Risk: Establish and monitor KPIs for critical third-party providers, ensuring they meet standards for availability, resiliency, and DORA compliance. Maintain a dependency register with clear SLAs and audit rights.
* Drive System Stability and Automation: Lead the architecture and operation of comprehensive monitoring and observability to ensure system security, availability, and performance. Guide the team in building and maintaining sustainable systems through automation, infrastructure-as-code, and continuous improvement.
* Champion Stakeholder Collaboration and Governance: Serve as the primary SRE liaison, collaborating with development teams and business stakeholders. Participate in system design reviews to ensure new services are built for reliability and scalability, and champion the creation of a central knowledge base for organizational learning.

Qualifications

* A degree in computer science, software engineering or a similar field would be advantageous
* AWS Developer, SysOps Administrator or DevOps Engineer certification
* Certifications in incident management or ITIL are advantageous

Skills:

* Proven leadership and mentoring capabilities with experience managing a technical team.
* Exceptional interpersonal and communication skills, with the ability to manage and influence technical and non-technical stakeholders.
* Strong strategic thinking, critical analysis, and problem-solving abilities.
* A proactive and forward-thinking approach to identifying systemic problems, performance bottlenecks, and areas for improvement.
* An innovative and collaborative mindset, with a passion for driving continuous improvement through CI/CD and automation.

Experience:

* At least 7 years of experience in a technology role, with a minimum of 3 years in a leadership or management capacity within SRE, DevOps, or Platform Engineering.
* Demonstrable experience managing application support and web application frameworks (e.g., Laravel).
* Deep expertise with AWS services (specifically EC2, Containers, Redis, RDS, S3, SQS, CloudWatch).
* Proven experience implementing and managing application monitoring and observability tools (e.g., Datadog, New Relic, OpenTelemetry).
* Strong background in managing infrastructure as code (IaC) and CI/CD pipelines.
* Experience working in an Agile environment and managing on-call rotations and incident response teams.
* Prior experience with operational resilience frameworks like DORA is highly beneficial.

About Banxa Holding Inc

Banxa Holding Inc ("Banxa") is a listed company on the TSX Venture Exchange with global operations. Banxa is one of the fastest-growing payments and compliance infrastructure providers for the digital asset industry. We enable the purchase of digital assets and currencies, such as Bitcoin or USDT, using traditional currencies like USD.

In 2021, Banxa was recognised by The Silicon Review as one of the "50 fastest-growing companies of the year".
