Job Description: We are seeking a Senior Data Engineer with extensive experience in designing, developing, and maintaining data ingestion and replication pipelines using AWS Lambda, Airflow, DMS, and StreamSets.
The ideal candidate will have expertise in Snowflake-based data modeling, with a focus on performance tuning and best practices. They should also be proficient in Terraform, YAML, and Python for infrastructure automation and data engineering workflows.
This role involves collaborating with source teams to gather data requirements and ensure smooth data ingestion and migration for each pipeline or project. The successful candidate will also mentor team members so they can work independently with new tools in the environment.
Key Responsibilities:
* Leverage AWS services (Lambda, S3, EC2, Airflow, DMS) for data storage, compute, transformation, and processing.
* Use Terraform and YAML for infrastructure-as-code (IaC) to automate pipeline deployment.
* Manage real-time data replication using AWS DMS (Database Migration Service).
* Implement CI/CD best practices for automated data pipeline deployment and version control.
* Monitor, troubleshoot, and optimize pipeline performance for high availability and reliability.
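As a rough illustration of the Lambda-based ingestion work described above, the sketch below shows a minimal S3-triggered handler in Python. The event shape follows the standard S3 event notification format; the bucket, key, and downstream load step are hypothetical, and a real pipeline would stage the objects into Snowflake (e.g. via an external stage and `COPY INTO`).

```python
import json

def handler(event, context):
    """Collect the S3 objects referenced in an event notification.

    Minimal sketch only: a production handler would validate input,
    emit metrics, and hand each object off to the load step.
    """
    ingested = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # Placeholder for the actual load into the warehouse.
        ingested.append(f"s3://{bucket}/{key}")
    return {"statusCode": 200, "body": json.dumps({"ingested": ingested})}
```

In practice this handler would be packaged and deployed through the Terraform/CI pipeline mentioned above rather than edited by hand in the console.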
Qualifications:
* 6+ years of experience in Snowflake, including data modeling, performance tuning, and optimization.
* 11+ years of experience as an Oracle DBA, including performance tuning and database administration.
* Experience with dbt for transforming raw datasets, creating macros, and building reusable logic.
* Hands-on experience with AWS services (Lambda, S3, EC2) and DMS (Database Migration Service) for real-time replication.
* Proficiency in Terraform, YAML, and Python for infrastructure automation and data engineering workflows.
* Experience with StreamSets, Apache Airflow, and SQL-based ETL development.
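To give a flavor of the SQL-based ETL development listed in the qualifications, here is a small, self-contained sketch: raw rows are loaded, transformed with SQL (type casting and null filtering), and written to a cleaned table. It uses Python's built-in sqlite3 for portability; the table and column names are hypothetical, and in this role the equivalent logic would typically live in Snowflake SQL or dbt models.

```python
import sqlite3

def run_etl(conn: sqlite3.Connection) -> int:
    """Load raw rows, transform them with SQL, and return the cleaned row count."""
    cur = conn.cursor()
    # Extract/load: raw landing table with untyped, messy values.
    cur.execute("CREATE TABLE IF NOT EXISTS raw_orders (id INTEGER, amount TEXT)")
    cur.executemany(
        "INSERT INTO raw_orders VALUES (?, ?)",
        [(1, "10.50"), (2, "  7.25 "), (3, None)],
    )
    # Transform: trim and cast amounts, drop rows with missing values.
    cur.execute(
        """
        CREATE TABLE clean_orders AS
        SELECT id, CAST(TRIM(amount) AS REAL) AS amount
        FROM raw_orders
        WHERE amount IS NOT NULL
        """
    )
    conn.commit()
    return cur.execute("SELECT COUNT(*) FROM clean_orders").fetchone()[0]
```

The same extract-transform-load shape carries over directly to warehouse SQL, where dbt models and macros make the transformation step reusable.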