Domain Architect - AI Storage

Sydney
World Wide Technology
Architect
Posted: 5 March
Offer description

The Domain Architect - AI Storage acts as the primary technical authority for the physical and logical lifecycle of high-performance data platforms across diverse client environments, bridging the gap between architectural design and hands‐on execution. You are a "doer" who is as comfortable configuring an NVMe-over-Fabrics connection in the CLI as you are explaining that configuration to a C-level client.

As a System Integrator, we do not simply manage a static cloud; we design and deliver bespoke, high‐scale AI factories for the world's leading enterprises. In this role, you will define the "Gold Standard" for storage infrastructure, moving beyond single‐array management to architect repeatable, scalable, and automated data fabrics. You will serve as the technical lead for NVIDIA Cloud Provider (NCP) and private enterprise AI cloud deployments, owning the "Storage" in the critical "Compute‐Network‐Storage" triad.

In this role, you will operate with a 60/40 split between delivering complex AI infrastructure (60%) and providing Pre‐Sales Subject Matter Expertise (SME) (40%). You will lead the physical provisioning of high‐throughput storage clusters for NVIDIA SuperPOD, NVIDIA BasePOD, and Cisco AI Factory environments, ensuring our clients receive "Day 2" ready AI factories, while assisting the sales team in defining the scope and cost of future deployments.

Key Responsibilities

1. Delivery & Implementation (60%)

* Parallel File System (PFS) Deployment:
  * Lead the installation and configuration of high‐performance storage clusters using technologies such as WEKA, DDN (Lustre), VAST Data, or Pure Storage.
  * Optimise storage client configurations on compute nodes, managing kernel modules and mounting parameters to ensure stability at scale.
  * Implement GPUDirect Storage (GDS) technologies to bypass the CPU and enable direct data paths between NVMe drives and GPU memory.
* Container Storage Integration:
  * Deploy and configure Container Storage Interface (CSI) drivers for Kubernetes/Red Hat OpenShift, ensuring persistent storage is dynamically provisioned for AI workloads.
  * Design storage classes that differentiate between "Scratch" (High Performance) and "Home/Project" (General Purpose) tiers.
* Performance Tuning & I/O Profiling:
  * Execute synthetic benchmark suites (IOR, FIO, mdtest) to validate throughput (GB/s) and metadata performance (IOPS) against agreed SLAs.
  * Troubleshoot "straggler" issues where slow I/O starves GPU utilisation, analysing client‐side logs and fabric counters.
* Data Lifecycle Management:
  * Implement automated data tiering strategies to move datasets between Hot (NVMe), Warm (QLC Flash), and Cold (Object/S3/Tape) tiers based on access frequency.
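
To give candidates a feel for the tiering work described above, here is a minimal sketch of a placement rule. It is an illustration only: the thresholds, the `Dataset` type, and the use of days since last access as a stand-in for access frequency are all assumptions, and a real policy would be defined per client and enforced by the storage platform's own tooling.

```python
from dataclasses import dataclass

# Hypothetical thresholds, in days since last access. Real values are
# agreed with each client and depend on the storage platform.
HOT_MAX_DAYS = 7
WARM_MAX_DAYS = 90


@dataclass
class Dataset:
    name: str
    days_since_last_access: int


def target_tier(ds: Dataset) -> str:
    """Pick a tier from access recency alone (a simple proxy for frequency)."""
    if ds.days_since_last_access <= HOT_MAX_DAYS:
        return "hot"   # NVMe
    if ds.days_since_last_access <= WARM_MAX_DAYS:
        return "warm"  # QLC flash
    return "cold"      # object / S3 / tape


datasets = [
    Dataset("training-shards", 1),
    Dataset("last-quarter-checkpoints", 45),
    Dataset("archived-raw-video", 400),
]

# Map each dataset name to the tier it should be moved to.
plan = {ds.name: target_tier(ds) for ds in datasets}
```
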
* Technical Scoping & Sizing:
  * Lead the architectural sizing for storage opportunities. Move the conversation beyond "How many Petabytes?" to "How many Terabytes per Second per GPU?".
  * Calculate required performance for specific workloads (e.g., massive small-file ingestion for Computer Vision vs. large-file streaming for LLMs) to create accurate estimates.
* BoM Validation:
  * Own the Storage Bill of Materials (BoM), ensuring the correct ratio of Storage Servers to Compute Nodes/GPUs.
  * Validate interoperability between Storage Controllers, Host Channel Adapters (HCAs), and Transceivers against the NVIDIA Hardware Compatibility List (HCL) and vendor compatibility matrices.
* Advise clients on the transition from legacy Enterprise NAS (NFSv3) to modern AI‐Native Storage protocols (NFS over RDMA, NVMe-oF).
* Design namespace architectures that support multi-tenancy, data sovereignty requirements, and potentially hybrid architectures for workload bursting capability.
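
The sizing conversation described above (throughput per GPU first, capacity second) comes down to simple arithmetic. The sketch below is a rough illustration under stated assumptions: the GPU count, per-GPU read rate, per-server throughput, and 20% headroom are hypothetical figures, not vendor numbers or a sizing methodology used on any real engagement.

```python
import math


def required_aggregate_gb_per_s(gpu_count: int, per_gpu_gb_per_s: float) -> float:
    """Aggregate read throughput the storage cluster has to sustain."""
    return gpu_count * per_gpu_gb_per_s


def storage_servers_needed(
    aggregate_gb_per_s: float,
    per_server_gb_per_s: float,
    headroom: float = 0.2,  # hypothetical margin for rebuilds and bursts
) -> int:
    """Round up to whole servers after adding headroom."""
    return math.ceil(aggregate_gb_per_s * (1 + headroom) / per_server_gb_per_s)


# Hypothetical workload: 1024 GPUs, each needing 2 GB/s of sustained reads,
# served by storage nodes that each deliver 40 GB/s.
aggregate = required_aggregate_gb_per_s(gpu_count=1024, per_gpu_gb_per_s=2.0)
servers = storage_servers_needed(aggregate, per_server_gb_per_s=40.0)

# Ratio of GPUs to storage servers, the figure a storage BoM is checked against.
gpus_per_server = 1024 / servers
```

With these inputs the aggregate requirement is 2048 GB/s and the rounded-up server count is 62. A small-file ingestion workload would also need a metadata (IOPS) check, which this sketch leaves out.
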
* High‐Performance Storage Ecosystem:
  * Expert‐level knowledge of Parallel File Systems (WEKA, Lustre, BeeGFS, GPFS).
  * Deep understanding of Object Storage (S3 protocols) for model checkpointing and archiving.
  * Mastery of storage protocols: NVMe-over-Fabrics (NVMe-oF), NFS over RDMA, and NVIDIA GPUDirect Storage (GDS).
  * Deep understanding of the Linux I/O stack, including Block Device drivers, file system tuning (xfs, ext4), and client‐side caching mechanisms.
* Infrastructure as Code (IaC):
  * Proficiency in one of Python, Ansible, or Terraform for automating storage cluster deployment and client configuration management.
* Industry Background: Experience working within a System Integrator (SI), Storage Vendor (e.g., NetApp, Dell, Pure), or MSP environment.
* Platform Integration: Hands‐on experience integrating storage with bare metal, and Kubernetes/Red Hat OpenShift.
* Network Affinity: Understanding of InfiniBand and RoCEv2 fabrics from a storage perspective (Congestion Control, Quality of Service).
* LLM Inference & Caching Stack:
  * Understanding of technologies such as NVIDIA Dynamo, vLLM, and SGLang for distributed inference serving and KV cache management.
  * Understanding of Distributed Pre‐fill concepts and their storage I/O requirements.
  * Understanding of LMCache for KV cache offloading and sharing.
* I/O Saturation: Achieving >90% of the theoretical wire‐speed throughput on compute clients during validation testing (IOR/FIO).
* Deployment Velocity: Successful "Day 1" mount availability across all compute nodes using automated playbooks.
* Storage BoM Accuracy: Usable capacity vs. Raw capacity calculations are accurate to within 5% of client requirements (i.e. accounting for product functional overhead).
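
Two of the measures above can be checked with a few lines of arithmetic. The sketch below is one possible reading, not the scoring method used on the job: it assumes the I/O saturation target is measured per client against the link's nominal line rate, and that "within 5%" compares the estimated usable capacity to the client's required usable capacity. The link speed, measured throughput, and overhead fraction are hypothetical.

```python
def saturation_pct(measured_gb_per_s: float, link_gbit_per_s: float) -> float:
    """Measured client throughput as a percentage of nominal wire speed."""
    wire_speed_gb_per_s = link_gbit_per_s / 8  # bits to bytes
    return 100 * measured_gb_per_s / wire_speed_gb_per_s


def usable_tb(raw_tb: float, overhead_fraction: float) -> float:
    """Usable capacity after product overhead (protection, spares, metadata)."""
    return raw_tb * (1 - overhead_fraction)


def within_tolerance(estimate_tb: float, required_tb: float, tolerance: float = 0.05) -> bool:
    """True when the estimate is within +/- tolerance of the requirement."""
    return abs(estimate_tb - required_tb) <= tolerance * required_tb


# Hypothetical validation run: a client on a 400 Gbit/s link reads at 46.5 GB/s.
client_saturation = saturation_pct(measured_gb_per_s=46.5, link_gbit_per_s=400)
meets_target = client_saturation > 90

# Hypothetical BoM check: 1000 TB raw with 28% overhead against a 700 TB requirement.
estimate = usable_tb(raw_tb=1000, overhead_fraction=0.28)
bom_ok = within_tolerance(estimate, required_tb=700)
```

Here the client reaches 93% of wire speed, and the 720 TB usable estimate sits inside the 5% band around 700 TB.
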