A fast-scaling SaaS platform is looking for a senior operator who can bring calm to the chaos, own production incidents end-to-end, and build the support foundations properly as the platform grows internationally. Startup pace, real customers, real stakes.
You're not being hired to "watch dashboards".
You're being hired to find the truth in the logs, drive the fix, and stop the same incident returning in a different outfit.
What you'll actually do (aka your daily quests)
* Incident owner & escalation lead: run production incidents from first alert to resolution and post-incident learning
* Deep technical investigations: logs, traces, metrics, and data… then turn that into clear root cause + actions
* JavaScript/TypeScript debugging: read code, follow stack traces, reproduce issues, validate fixes (services + APIs)
* Make the platform easier to operate: upgrade monitoring/alerting so signal beats noise
* Build the playbooks: runbooks, troubleshooting guides, service maps, escalation routes
* Bridge Support and Engineering: translate customer symptoms into engineering tasks that actually get solved
* Drive permanent fixes: partner with engineers to reduce repeat incidents and operational drag
* Help shape global support rhythm: build toward scalable coverage as international usage ramps
Who this is for
1. You're a self-starter who can operate in ambiguity and still create structure.
2. You're calm under pressure and sharp with prioritisation.
3. You've got 6+ years in senior production-facing work, think Level 3 support / production support / platform ops / production engineering.
4. You've worked deeply with AWS, ideally serverless (Lambda, DynamoDB, API Gateway, CloudWatch, S3, Redis/Valkey, OpenSearch).
5. You've got strong JavaScript / TypeScript fundamentals and can confidently debug services and API behaviour.
6. You can troubleshoot across mobile ecosystems well enough to connect the dots (iOS/Android → backend → "why only this cohort?").
7. You can communicate clearly during incidents, with crisp updates and clean outcomes.
Super cool to have (not required, but we'll grin)
* Datadog / observability chops
* GitLab CI/CD deployment + rollback confidence
* React / React Native familiarity
* Some legacy exposure (.NET / Windows)
* Experience in compliance-aware environments
Why it's worth your time
* High impact: you'll shape the operating model
* Real technical depth (not "support theatre")
* Runway into reliability/SRE-style work over time
* Stock options, because building stability should come with upside
Apply now - this is a rare one, as rare as a hat-trick on Boxing Day.
Sydney (Hybrid, 3 days in office)
Start ASAP