Job Title: Staff Machine Learning Engineer - Agentic AI
Location: Remote (Fully Flexible). Listed locations: Remote, Victoria, Australia; Melbourne; Sydney.
Overview
Team: AI Agents. We run production AI agents that autonomously resolve customer service tickets across 100,000+ Zendesk accounts. Each agent decomposes a customer issue into a multi‑step plan, executes real actions (refunds, order modifications, escalations) through live APIs, and closes the ticket without a human in the loop. The core uses a proprietary iterative architecture: goals decompose into plans, reusable skills are pulled from a registry, execution is evaluated, and the result feeds the next attempt. Successful resolution patterns are synthesised into new skills and written back into the registry, allowing the system to learn from its own execution history. On GAIA‑class multi‑step tool‑use benchmarks, our agents match the best published results. Internally, 158+ scenario‑based evals run continuously against real Zendesk tickets, scored through Braintrust with regression detection on every deploy.
Responsibilities
Own and maintain the iterative planner architecture, addressing challenges such as plan decomposition under ambiguous goals, memory‑tier interference across concurrent sessions, over‑eager skill acquisition, and multi‑agent delegation via A2A.
Lead domain‑specialised reinforcement‑learning training of language models for customer‑service resolution over a 6–12 month build. You own both the science and the systems: the data pipeline, reward curricula, rollout systems, and feedback loops.
Build and maintain the evaluation infrastructure: the 158+ continuously running evals, multi‑turn evaluation, automated trajectory analysis, and quality gates that block deployments when performance drops, all integrated into CI from the start.
Design and implement guardrails at scale against tool misuse, cascading action chains, prompt injection, and hallucination loops, developing multi‑layered defences, supervisor patterns, capability‑based access control, and output validation that work across thousands of concurrent sessions without adding latency.
Qualifications
5+ years building production ML/AI systems, including experience shipping agent architectures that handle planning, tool dispatch, memory, and failure recovery.
Proficiency in Python and PyTorch, experience with at least one agent framework, and the judgment to know when to use it and when to build a custom solution.
Experience building internal evaluation frameworks that expose why public benchmarks may be misleading, and the scars to prove it.
Strong understanding of reinforcement learning for language models, including reward shaping, online/offline tradeoffs, and reward hacking diagnostics.
Experience that consists only of LangChain tutorials will not be enough; this role requires different expertise.
Equal Opportunity & EEO Statement
Zendesk is an equal opportunity employer, and we’re proud of our ongoing efforts to foster an inclusive workplace. Individuals seeking employment and employees at Zendesk are considered without regard to race, color, religion, national origin, age, sex, gender, gender identity, gender expression, sexual orientation, marital status, medical condition, ancestry, disability, military or veteran status, or any other characteristic protected by applicable law. We are an AA / EEO / Veterans / Disabled employer. If you are based in the United States and would like more information about your EEO rights under the law, please ... (PD). Zendesk endeavors to make reasonable accommodations for applicants with disabilities and disabled veterans pursuant to applicable federal and state law. If you are an individual with a disability and require a reasonable accommodation to submit this application, complete any pre‑employment testing, or otherwise participate in the employee selection process, please send an e‑mail to with your specific accommodation request.