Job Title
Distributed Systems Architect with a passion for Elixir and AI compute
About the Role
We are seeking an experienced Distributed Systems Architect. As a key member of our infrastructure team, you will design, build, and maintain scalable, fault-tolerant systems that make AI compute feel seamless, distributed, and highly efficient.
You will work closely with our AI researchers to create developer-friendly interfaces and APIs, and collaborate with our engineering teams to ensure that our systems are integrated into a unified AI infrastructure.
This is a unique opportunity to work on cutting-edge technologies and push the boundaries of what is possible in AI compute.
The ideal candidate will have deep experience with OTP and Elixir, Erlang, or Gleam (or similar functional programming languages) for building distributed, fault-tolerant systems, as well as a strong understanding of distributed databases and state management.
Responsibilities
1. Build highly available Elixir-based control systems for scheduling, monitoring, and scaling AI workloads
2. Develop fault-tolerant distributed systems for orchestration, state management, and billing
3. Integrate multiple clouds, on-prem clusters, and networking layers into a unified AI infrastructure
4. Optimize real-time data pipelines and event-driven architectures
5. Implement security best practices in authentication, encryption, and API access control
6. Work closely with AI researchers to create developer-friendly interfaces and APIs
Requirements
* Deep experience with OTP and Elixir, Erlang, or Gleam (or similar functional programming languages) for building distributed, fault-tolerant systems
* Experience building event-driven architectures (e.g., RabbitMQ, Kafka, NATS)
* Strong understanding of distributed databases and state management (PostgreSQL, ETS, Mnesia, Redis)
* Knowledge of cloud and multi-cloud integrations (AWS, GCP, Azure, neoclouds)
* Experience with securing Elixir applications
Benefits
* We move fast and ship weekly: new features, improvements, and fixes go live quickly.
* We test at scale every month, face to face with large groups of users.
* We build together: weekend hackathons drive innovation and help us level up as a team.
* We iterate relentlessly: direct user feedback shapes our roadmap.
* We travel when needed: engineers may travel between SF and Sydney.
Location: SF or Sydney (OG startup house vibe)