Responsibilities
- Lead the implementation of solutions using Microsoft Power Platform (Power Apps, Power Automate) and Azure
- Design and maintain cloud infrastructure across Azure and AWS platforms
- Develop and implement automation strategies for infrastructure deployment and management
- Create and maintain CI/CD pipelines for cross-platform deployments
- Establish best practices for cloud security, compliance, and cost optimization
- Mentor junior engineers and provide technical leadership to the team
Site Reliability Engineering – Good to have
- Define and maintain Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Service Level Agreements (SLAs)
- Implement and maintain observability solutions across all platforms (metrics, logging, tracing)
- Design and implement automated alerting and incident response systems
- Conduct capacity planning and performance optimization
- Lead post-incident reviews and implement systematic improvements
- Develop and maintain runbooks and playbooks for operational procedures
- Implement chaos engineering practices to improve system resilience
- Automate routine operations and maintenance tasks
- Design and implement disaster recovery and business continuity solutions
Qualifications
Required Qualifications
- 7+ years of experience in platform engineering or a similar role
- Demonstrated knowledge of AWS services (EC2, S3, Lambda, IAM, etc.)
- Experience with Infrastructure as Code (Terraform, ARM templates, CloudFormation)
- Experience with containerization technologies (Docker, Kubernetes)
- Knowledge of security best practices and compliance requirements
- Experience with monitoring and observability platforms
Preferred Qualifications
- Relevant certifications in Microsoft Power Platform, Azure, and/or AWS
- Experience with hybrid cloud architectures
- Knowledge of API design and implementation
- Experience with distributed systems and microservices architecture
- Background in implementing SRE practices in large-scale environments
- Experience with chaos engineering tools and practices
Technical Skills
- Infrastructure as Code: Terraform, ARM templates, CloudFormation
- Scripting: PowerShell, Python, Bash
- Monitoring & Observability: ELK Stack
- Security: Identity and Access Management, Network Security, Compliance
- Performance Testing: JMeter, k6, LoadRunner
SRE-Specific Tools & Methodologies
- Error budgets and SLO/SLI implementation
- Incident management and postmortem processes
- APM tools (New Relic, Dynatrace, AppDynamics)
- Git-based configuration management
- Blue-green and canary deployment strategies
Soft Skills
- Strong communication and documentation skills
- Team leadership and mentoring capabilities
- Ability to work independently and as part of a team
- Strong problem-solving and analytical thinking
- Excellent time management and prioritization skills
- Calm and methodical approach to incident response
- Ability to make data-driven decisions
On-Call Responsibilities
- Participate in on-call rotation
- Respond to critical incidents and service disruptions
- Lead incident response and coordination
- Conduct post-incident analysis and improvement planning
Seniority level
- Mid-Senior level
Employment type
- Full-time
Job function
- Information Technology
We are seeking candidates in Sydney, New South Wales, Australia.
📌 Platform Engineer
🏢 N2S.Global
📍 Sydney