Senior Cloud Operations Manager
Role Overview
As the Senior Manager of Cloud Operations & Platform Support, you will be the guardian of our cloud ecosystem’s stability, scalability, and performance. You aren’t just "managing" a team; you are orchestrating a high-stakes environment where uptime is the gold standard and continuous improvement is the cultural norm.
You will lead a team of talented engineers, serving as the bridge between technical execution and strategic business goals. Whether you’re acting as the Incident Commander during a critical outage or steering the team toward CMMI maturity, your focus remains on delivering a seamless, high-quality experience for our stakeholders.
Core Responsibilities
1. Operational Excellence & SLA Management
- Ticket Lifecycle: Oversee the validation, categorization, and assignment of all incoming tickets to ensure no issue falls through the cracks.
- SLA Governance: Rigorously monitor Service Level Agreements (SLAs). You are responsible for ensuring the team meets or exceeds agreed-upon response and resolution times.
- Monitoring & Anomaly Detection: Implement proactive monitoring strategies to detect trends or risks before they impact service availability.
- ISO 27001 Compliance: Maintain ISO 27001 compliance, with comprehensive coverage of processes and procedures.
- Infosec Collaboration: Work closely with Infosec on routine and ad-hoc activities, such as access reviews and compliance with the latest infrastructure security policies.
2. Incident Leadership & Escalation
- Incident Commander: Take the lead during major service outages. You will coordinate the technical response, manage stakeholder communication, and ensure high-pressure situations are handled with composure.
- Timezone Authority: Serve as the primary escalation point within your designated timezone, providing guidance to on-call teams during off-hours and holidays.
- Post-Mortem Integrity: Ensure every major incident is followed by a comprehensive review that captures root causes and actionable prevention steps.
3. Leadership & Team Development
- Capability Maturity: Guide the team through the CMMI framework, advancing process maturity and operational discipline.
- Mentorship: Foster a high-performance culture rooted in accountability, professional growth, and technical excellence.
- Training: Ensure the team is fully versed in all Standard Operating Procedures (SOPs) and equipped to handle evolving cloud technologies.
4. Process & Knowledge Management
- SOP Management: Develop, review, and standardize all operational processes to align with industry best practices.
- Knowledge Base: Maintain a robust internal library of troubleshooting guides, SOPs, and resolution steps.
- Preparedness: Execute regular operational drills (Disaster Recovery, Incident Response) to identify gaps in readiness.
5. Continuous Improvement & Collaboration
- The Feedback Loop: Contribute at least three (3) data-driven improvement suggestions per quarter to the planning process, derived from operational trends and team feedback.
- Cross-Functional Synergy: Partner with Engineering, Security, and Compliance teams to ensure cloud operations remain secure, compliant, and aligned with broader organizational initiatives.
Required Qualifications & Skills
- Experience: 8+ years in Cloud Operations or Infrastructure Support, with at least 3 years in a formal leadership/management capacity.
- Technical Depth: Strong understanding of cloud architecture (GCP preferred) and the CMMI framework.
- Crisis Management: Proven track record as an Incident Commander or in a high-pressure NOC/Command Center environment.
- Communication: Exceptional ability to translate complex technical issues into clear updates for executive stakeholders.
- Analytical Mindset: Ability to leverage metrics and logs to drive process optimization and proactive risk mitigation.
Location: Singapore