Company Description
We're a large-scale systems integration company, committed to delivering trusted solutions that help to safeguard Australia. With over 20 years' local experience and the backing of a 32,000-strong global network, we currently have 1,000 employees, mainly in Canberra and Melbourne. We're growing fast and are building a business that is focused and fit for the future. Change and innovation are central to the way we work, and we thrive when developing unique, practical solutions to seriously complex challenges.
Centralised Processing (CP) is a key ICT program with the Department of Defence to supply IaaS, PaaS and SaaS to underpin Australia's largest private cloud network. Working in a team of system engineers and IT professionals, you will own the Availability Management and Incident Management processes. You will ensure that Defence applications and IT infrastructure meet their availability commitments, and that the Incident Management process is executed effectively so that high-priority Incidents are restored within SLA. You will routinely interact with all levels of management, engage with technical SMEs across a broad range of disciplines, brief management at all levels on Leidos performance in these disciplines, and produce reports to support that performance.
**Job Description**:
**Overview**
- As an IT Professional, you will understand the technologies that have been, or need to be, implemented to ensure that availability and reliability targets can be met.
- Ensure that a small team of Incident Managers runs the Incident Management process effectively, so that service levels are met on all priority Incidents
- Liaise with internal and external stakeholders, ensuring they receive appropriate communication on Availability and Incident disciplines
- Ensure the quality of information in Post Incident Reports
- Ensure all Incident cases and records are accurately maintained and of high quality in content
- You will provide advice, assistance and leadership associated with the planning, design and improvement of services and component availability, including the investigation of breaches of Availability and Incident targets and service non-availability, with the instigation of remediation activities.
- You will work closely with the IT Service Continuity Manager to ensure risks are identified and minimised.
- You will implement, maintain and improve the capability to measure and report Availability using ServiceNow, including Outage Management.
- Continuously improve Standard Operating Procedures (SOPs) to assist in the day-to-day running of the Incident and Availability processes
- Keep track of progress against incident SLAs
Specifically, you will be accountable for the following:
- Understand the Availability and Incident management contractual requirements
- Own the Availability and Incident Management processes as Process Owner, including auditing process performance and compliance with the process.
- Implement continual process improvement of both Incident and Availability disciplines
- Develop and implement the Availability Plan
- Deliver Availability and Incident reporting in line with contractual obligations
- Ensure that levels of availability and reliability meet the requirements
- Work with Architecture, Projects and Change Management to ensure that availability and recovery requirements continue to be met when updates and changes are proposed and made to the environment
- Ensure HA mechanisms are working as designed
- Assist the ITSC Manager in ensuring data is recoverable from backup
- Ensure that reactive activities, such as monitoring and the analysis of events and problems, are carried out
- Ensure that proactive activities, such as planning, design and improvement initiatives, are performed
- Contribute to the training of all technology staff to ensure that high levels of availability and fast recovery from Incidents are achieved
- Maintain a close and collaborative working relationship with Defence as the customer, as well as with internal stakeholders
- Experience as an escalation point for queries and more complex calls, incident management and fault reporting, including logging, escalation and coordination with team members and other IT support areas.
- Strong customer service focus, with the ability to empathise as well as prioritise
- Other duties as required
**Skills**
- Working knowledge of ServiceNow and the ability to implement and maintain Outage Management
- Strong technical skills in networking, storage, compute, backup and recovery of systems
- Ability to progress ITIL Availability and Incident Management practice to level 3 maturity
- Demonstrated ability to build and maintain effective working relationships across all levels and influence key stakeholders
- Strong interpersonal and presentation skills, and a proven ability to talk comfortably and knowledgeably with a wide range of stakeholders
- Excellent communication (verbal and written) and interperso