Overview
Control Plane - Site Reliability Engineer (Hosted Infrastructure) at Elastic. We integrate, scale, and evolve multi-cloud infrastructure across 4 cloud service providers (CSPs), over 70 globally distributed regions, and tens of thousands of compute hosts to power Elastic Cloud. We scale our capabilities through automation, Infrastructure as Code (IaC), configuration management, and software that minimizes toil while improving reliability and efficiency for our customers.
From provisioning to termination, the complete lifecycle of a host is our focus — and we want it to live its best life. If this kind of work gives you positive vibes, we would love your experience to help us continue offering a truly outstanding customer experience across a diverse suite of cloud infrastructure.
What Is The Role
Elastic enables everyone to find answers in real time using data at scale. The Elastic Search AI Platform brings together the precision of search and the intelligence of AI to accelerate results that matter, while securing and protecting private information more effectively.
What You Will Be Doing
Applying software engineering methods to automate large scale systems administration.
Optimizing the lifecycle and reliability of compute across multiple cloud providers.
Ensuring proactive monitoring and alerting to prevent incidents before they happen.
Growing our global infrastructure to meet increasing scaling demands by developing and maintaining software, tooling, and automations.
Collaborating in an inclusive environment — focusing on Operational Excellence and uplifting each other with constructive feedback.
Being part of an SRE on-call rotation responding to operational needs and incidents.
What You Bring
2+ years in software engineering using Golang.
2+ years operating hundreds (or more) of cloud compute instances via automated solutions.
2+ years with Linux systems — proficient with terminal and shell.
2+ years working with containerized services (such as Docker).
A customer-first approach in solving operational problems from an SRE perspective.
Comfortable with working remotely on distributed teams.
Bonus Points
Worked with Terraform, Puppet, Ansible, Argo CD, Argo Workflows, CUE, Kubernetes, or a programming language other than Golang in production.
Experience being on-call during incidents and using observability tools (e.g. Elastic Stack, Graphite, Prometheus, Influx) to diagnose issues, quantify impact, and confirm mitigations.
Designed, implemented, and engineered solutions with the Elastic Stack.
Additional Information – We Take Care Of Our People
Competitive pay based on the work you do here and not your previous salary
Health coverage for you and your family in many locations
Ability to craft your calendar with flexible locations and schedules for many roles
Generous number of vacation days each year
We match up to $2000 (or local currency equivalent) for financial donations and service
Up to 40 hours each year to use toward volunteer projects you love
Parental leave with a minimum of 16 weeks
Elastic is an equal opportunity employer and is committed to creating an inclusive culture that celebrates different perspectives, experiences, and backgrounds. Qualified applicants will receive consideration for employment without regard to race, ethnicity, color, religion, sex, pregnancy, sexual orientation, gender identity or perception, national origin, age, marital status, disability status, or any other basis protected by law. We welcome individuals with disabilities and strive to create an accessible and inclusive experience. To request an accommodation during the application or recruiting process, please email We will reply within 24 business hours of submission. Applicants have rights under Federal Employment Laws; please view the linked posters for FMLA, pay transparency, EPPA, and know-your-rights information. Please see our Privacy Statement for details.
Elasticsearch's export controls may apply to individuals located in or who are nationals of certain countries; if applicable, an export license may be required as a condition of employment. National origin or nationality does not affect eligibility for employment with Elastic.
📌 Control Plane - Site Reliability Engineer (Hosted Infrastructure)
🏢 Elastic
📍 Cairns