Senior AWS Data Engineer cum Data Modeler
Location: Sydney
Mode of Hiring: Permanent
Job Description
* Proficiency with Amazon SageMaker Unified Studio, including the Discover, Build and Govern modules; development in the SageMaker IDE, JupyterLab, Spaces, and partner AI apps.
* Experience with Data Analysis and Integration: Query Editor, Visual ETL Jobs, and Data Processing jobs.
* Orchestration of workflows and ML pipelines, including ML and Gen AI tools within SageMaker Unified Studio.
* Setup of projects and data governance in SageMaker Unified Studio.
* Advanced Data ingestion and processing (real‐time & batch). Design and implement reusable ingestion frameworks for diverse sources (databases, APIs, message queues, file systems).
* Low‐latency real‐time pipelines with sub‐second latency, using Amazon Kinesis Data Streams, Apache Kafka or similar streaming technologies.
* Batch ingestion patterns with AWS Glue, Apache Spark, and other robust tools.
* Design and implement efficient data transformation logic for streaming and batch data.
* Programming: advanced Python for scalable data ingestion and API integration, including use of Boto3; performance optimisation of data‐intensive Python code; knowledge of Scala or Spark is optional.
* Core AWS services: Lambda, S3, DynamoDB, CloudWatch, SQS, SNS, API Gateway; networking (VPC, security groups, private endpoints) and IAM.
* Amazon RDS for real‐time ingestion and batch export; Amazon Redshift for data warehousing; Amazon DynamoDB and MongoDB for low‐latency real‐time storage.
* Orchestration with Amazon Managed Workflows for Apache Airflow (MWAA); writing, deploying and managing Airflow DAGs.
* Real‐time monitoring and observability: design custom metrics, logging, tracing; use CloudWatch for heartbeats, alerts and notifications.
* Software engineering principles, design patterns and scalable architecture.
* Version control (Git), CI/CD pipelines for testing, deployment and release of data solutions.
* Infrastructure as Code with AWS CloudFormation or Terraform.
Qualifications
* Previous experience developing batch and real‐time ingestion frameworks for relational and NoSQL databases or filesystems.
* Experience with real‐time ingestion using Amazon Kinesis Data Streams, Apache Kafka or similar, achieving low latency.
* Knowledge of data warehouses and lakehouses; data modelling experience is a plus.
* Strong background in AWS data services and infrastructure.