Reinforcement learning model developer

Melbourne
beBeeExpert
Developer
Posted: 14 September
Offer description

Role Overview

We are seeking a highly skilled Reinforcement Learning expert to develop and optimize RL models for enterprise-scale applications. The ideal candidate will have a strong theoretical foundation in RL, including policy optimization, reward modeling, and planning, paired with the engineering skills to build scalable production systems.

The role requires full ownership from research through deployment, driving experimentation with systematic evaluation and benchmarking. Collaboration across research, infrastructure, and application teams will be key to delivering impactful AI solutions.

* Research and develop state-of-the-art RL algorithms focusing on large model optimization and alignment techniques.
* Design and implement RL training pipelines, including environment simulation, data generation, and reward function design.
* Apply RL methods to enhance LLM/VLM/Agentic AI capabilities in reasoning, planning, and autonomous decision-making.
* Collaborate with engineers and researchers to integrate RL solutions into enterprise AI platforms.
* Monitor model performance in production and continuously improve through iterative training and fine-tuning.

Requirements:

* Master's degree in Computer Science, Applied Mathematics, Machine Learning, or related fields.
* 3+ years of hands-on experience in RL or LLM/VLM/Agentic AI optimization.
* Strong coding skills in Python, with experience in ML frameworks and RL libraries.
* Experience with large-scale distributed training and optimization.
* Self-driven with an ownership mindset, strong problem-solving skills, and excellent communication skills for cross-functional collaboration.

Why This Opportunity?


You will shape the future with a leading blockchain ecosystem. Collaborate with world-class talent in a user-centric global organization with a flat structure. Tackle unique, fast-paced projects with autonomy in an innovative environment. Thrive in a results-driven workplace with opportunities for career growth and continuous learning. Competitive salary and company benefits. Work-from-home arrangement.

Commitment to Diversity and Inclusion:

Binance is committed to being an equal opportunity employer. We believe that having a diverse workforce is fundamental to our success.


© 2025 Jobstralia - All Rights Reserved