Senior Platform SRE (Platform Operations)
Tasks
- Define SLOs/KPIs
- Define observability standards and dashboards
- Design health checks and remediation workflows
- Develop automation and IaC configurations
- Enable outsourcing provider teams
- Ensure platform reliability
- Lead incident diagnosis and recovery
- Lead problem management and RCA
- Maintain runbooks and SOPs
- Mentor peers and promote operational culture
- Partner with platform engineering
- Support AIOps initiatives
Perks/Benefits
- Diversity and inclusion initiatives
- Growth programs
- Life empowerment
- Onboarding and community engagement
Skills/Tech-stack
AIOps | Alerting | Ansible | Automation | Azure | Bash | CI/CD | Cloud Computing | DevOps | Hybrid Cloud | ITSM | Incident Management | Incident Troubleshooting | Infrastructure as Code | Linux | Logging | ML/DL | Networking | Observability | Powershell | Python | Storage | Terraform | Virtualization | Windows server | “as-code”
Education
Roles
Related jobs
- No jobs found.