Data Scientist (Melbourne)

Melbourne
Maincode
Data Scientist
Posted: 27 November
Offer description

Overview

Maincode is building sovereign AI models in Australia. We are training foundation models from scratch, designing current reasoning architectures, and deploying them on state-of-the-art GPU clusters. Our models are built on datasets we create ourselves, curated, cleaned, and engineered for performance at scale. This is not buying off-the-shelf corpora or scraping without thought. This is building world-class datasets from the ground up.

As a Senior Data Engineer, you will lead the design and construction of these datasets. You will work hands-on to source, clean, transform, and structure massive amounts of raw data into training-ready form. You will design the architecture that powers data ingestion, validation, and storage for multi-terabyte to petabyte-scale AI training. You will collaborate with AI Researchers and Engineers to ensure every byte is high quality, relevant, and optimised for training cutting-edge large language models and other architectures. This is a deep technical role. You will be writing code, building pipelines, defining schemas, and debugging unusual data edge cases at scale. You will think like both a data scientist and a systems engineer, designing for correctness, scalability, and future-proofing. If you want to build the datasets that power sovereign AI from first principles, this is your team.

What You’ll Do

Design and build large-scale data ingestion and curation pipelines for AI training datasets

Source, filter, and process diverse data types, including text, structured data, code, and multimodal data, from raw form to model-ready format

Implement robust quality control and validation systems to ensure dataset integrity, relevance, and ethical compliance

Architect storage and retrieval systems optimised for distributed training at scale

Build tooling to track dataset lineage, reproducibility, and metadata at all stages of the pipeline

Work closely with AI Researchers to align datasets with evolving model architectures and training objectives

Collaborate with DevOps and ML engineers to integrate data systems into large-scale training workflows

Continuously improve ingestion speed, preprocessing efficiency, and data freshness for iterative training cycles

Who You Are

Passionate about building world-class datasets for AI training, from raw source to training-ready form

Experienced in Python and data engineering frameworks such as Apache Spark, Ray, or Dask

Skilled in working with distributed data storage and processing systems such as S3, HDFS, or cloud object storage

Strong understanding of data quality, validation, and reproducibility in large-scale ML workflows

Familiar with ML frameworks like PyTorch or JAX, and how data pipelines interact with them

Comfortable working with multi-terabyte or larger datasets

Hands-on and pragmatic, you like solving real data problems with code and automation

Motivated to help build sovereign AI capability in Australia

Why Maincode

We are a small team building some of the most advanced AI systems in Australia. We create new foundation models from scratch, not just fine-tune existing ones, and we build the datasets they run on from the ground up. We operate our own GPU clusters, run large-scale training, and integrate research and engineering closely to push the frontier of what is possible.

You Will Be Surrounded By People Who

Care deeply about data quality and architecture, not just volume

Build systems that scale reliably and repeatably

Take pride in learning, experimenting, and shipping

Want to help Australia build independent, world-class AI systems

Seniority level: Mid-Senior level

Employment type: Full-time

Job function: Information Technology

Industries: Software Development

Melbourne, Victoria, Australia
