
Information Technology - Lead Software Engineer (ITSM IT Service Operations and Resilience Lead)
- Singapore
- Permanent
- Full-time
This role is pivotal in enhancing operational resilience, streamlining IT service management processes, and developing the next generation of self-healing, predictive IT operations in close collaboration with diverse IT teams.
Key Responsibilities
- Reimagine and enhance core ITSM practices (Incident, Problem, Change, and Knowledge Management) using modern development frameworks and automation tools.
- Design, prototype, and implement AI-driven operational tools, including predictive incident detection, automated remediation workflows, intelligent alerting, and large language model (LLM)-based knowledge agents.
- Lead the development and deployment of custom automation solutions to improve IT service reliability and reduce manual workload across ITSM domains.
- Collaborate with platform teams, enterprise architects, and developers to conceptualize and build next-generation IT operational capabilities.
- Provide mentorship and guidance to ITSM IPC (Incident, Problem, Change and DR management) Engineers, ensuring effective execution and governance of ITSM processes aligned with ITIL best practices.
- Drive adoption and continuous improvement of ITSM best practices across all IT teams.
- Act as the primary liaison between internal stakeholders and external service providers.
- Monitor and manage the performance of vendor-managed services to ensure SLA and KPI compliance.
- Participate in service reviews, audits, and performance assessments.
- Manage Incident, Problem, and Change Management processes across vendor operations.
- Lead continuous improvement initiatives and service enhancements.
- Support escalation management and root cause analysis efforts.
Requirements
- Bachelor's Degree in Computer Science, Engineering, or a related field (or equivalent experience).
- 5+ years of experience in IT operations or substantial exposure to ITSM processes and tooling.
- Strong understanding of ITIL framework and ITSM best practices; ITIL v3/v4 certification is preferred.
- Hands-on experience with automation tools, scripting, and AI/ML technologies relevant to IT operations.
- Proficient with ITSM platforms such as ServiceNow, BMC Remedy, or similar tools.
- Demonstrated ability to mentor technical teams and lead cross-functional collaboration.
- Excellent problem-solving, communication, and stakeholder management skills.
- Hands-on software development or scripting experience in Python, JavaScript (Node.js), or similar languages.
- Experience with monitoring and observability platforms like Splunk, Grafana, ScienceLogic, or equivalent (advantageous).
- Familiarity with CI/CD pipelines, GitOps practices, cloud platforms (AWS, Azure, GCP), and Infrastructure-as-Code (IaC) tools (advantageous).
- Proficiency with AI/ML frameworks and tools (e.g., TensorFlow, scikit-learn, LangChain, OpenAI APIs) is a strong advantage.
- A passion for innovation and continuous improvement.