Overview
Control Plane - Site Reliability Engineer (Hosted Infrastructure)
Elastic, the Search AI Company, enables everyone to find the answers they need in real time, using all their data, at scale — unleashing the potential of businesses and people. The Elastic Search AI Platform, used by more than 50% of the Fortune 500, brings together the precision of search and the intelligence of AI to enable everyone to accelerate the results that matter. By taking advantage of all structured and unstructured data — securing and protecting private information more effectively — Elastic’s complete, cloud-based solutions for search, security, and observability help organizations deliver on the promise of AI.
What Is The Role
We integrate, scale, and evolve multi-cloud infrastructure across 4 CSPs, over 70 globally distributed regions, and tens of thousands of compute hosts to power Elastic Cloud. We scale our capabilities through automation, Infrastructure as Code (IaC), Configuration Management, and developing software that minimizes toil while improving reliability and efficiency for our customers. From provisioning to termination, the complete lifecycle of a host is our focus - and we want it to live its best life.
If this kind of work gives you positive vibes, we would love your experience to help us continue offering a truly outstanding customer experience across a diverse suite of cloud infrastructure!
What You Will Be Doing
- Applying software engineering methods to automate large scale systems administration.
- Optimizing the lifecycle and reliability of compute across multiple cloud providers.
- Ensuring proactive monitoring and alerting to prevent incidents before they happen.
- Growing our global infrastructure to meet increasing scaling demands by developing and maintaining software, tooling, and automation.
- Collaborating in an inclusive environment - focusing on Operational Excellence and uplifting each other with constructive feedback.
- Being part of an SRE on-call rotation responding to operational needs and incidents.
What You Bring
- 2+ years in software engineering using Golang.
- 2+ years operating hundreds (or more) of cloud compute hosts via automated solutions.
- 2+ years with Linux systems - proficient with terminal and shell.
- 2+ years working with containerized services (such as Docker).
- A customer-first approach in solving operational problems from an SRE perspective.
- Comfortable with working remotely on distributed teams.
Bonus Points
- Experience with Terraform, Puppet, Ansible, Argo CD, Argo Workflows, CUE, Kubernetes, or programming languages other than Golang in a production environment.
- On-call incident experience and using observability tools (e.g., Elastic Stack, Graphite, Prometheus, Influx) to diagnose issues, quantify impact, and confirm mitigations.
- Experience designing, implementing, and engineering solutions with the Elastic Stack.
Additional Information – We Take Care Of Our People
As a distributed company, diversity drives our identity. Elastic is an equal opportunity employer and is committed to creating an inclusive culture that celebrates different perspectives, experiences, and backgrounds. Qualified applicants will receive consideration for employment without regard to race, ethnicity, color, religion, sex, pregnancy, sexual orientation, gender identity, national origin, age, marital status, disability status, or any other basis protected by law.
We offer a range of benefits across regions, including competitive pay, health coverage, flexible locations and schedules, generous vacation, and volunteering opportunities. For accessibility needs during the application process, please contact us by email. We respond within 24 business hours.
We respect privacy and provide our Privacy Statement. If you have any questions about equal opportunity or accommodation, contact us accordingly.
We welcome individuals with disabilities and strive to create an accessible and inclusive experience for all. Elastic is an equal opportunity/affirmative action employer committed to diversity, equity, and inclusion.
Seniority level
- Not Applicable
Employment type
- Full-time
Job function
- Technology
📌 Control Plane - Site Reliability Engineer (Hosted Infrastructure)
🏢 Elastic
📍 Maitland