Incident Response Analyst II
Singapore, Singapore
Applications have closed
Astreya
Home - WORKING INOVATION Astreya is a leading global provider of game-changing IT Managed Services and Technology Solutions to some of the world’s mostWhat this Job Entails:
Incident & Problem Management
- Investigate and respond to alerts, incident response (war room, remote bridges) and report, and on-going maintenance, tuning, and improvements of the detection signals
- Respond to incidents and critical situations in a calm, problem-solving manner, and conduct in-depth investigation of alerts
- Be the first layer of defense responsible for quick detection and incident response using various monitoring and automation tools, conduct thorough investigation of alerts, classification and triage.
- Provide deep understanding and intelligence of the criticality and impact of the incidents to the resolver groups.
- Ensure detailed records of alarm handling activities, including actions taken, resolutions in ticketing tools and file incident reports.
- Be available to coordinate as an incident commander in event of an issue.
- Support program managers and facilitate project deliverables, improve overall operational and engineering initiatives.
- Conduct root cause analysis (RCA) to determine recurring problems to their source.
- Employ in-depth questioning and analysis techniques such as five whys to determine the underlying cause of the incident or problem.
- Perform duties in compliance with SOP.
Server, DCIM, Network and Traffic Alarms Operations
- Continuously monitor alarm dashboards and systems.
- Investigate and respond to alarms such as but not limited to Network, DC Environment, Server Health, Facility Security and Safety.
- Identify and acknowledge incidents associated with alarms.
- Assess incidents to determine their criticality and impact on operations.
- Engage the resolver group who will be resolving the incident and escalate to higher tiers or management when necessary, following established escalation paths.
- Maintain clear and concise communication with relevant teams, stakeholders, and incident responders/resolvers.
- Documented procedures to resolve incidents promptly and effectively.
- Ensure detailed records of alarm handling activities, including actions taken and resolutions in ticketing tools.
- Perform duties in compliance with SOP.
Threat Intelligence & Critical Event Management
- Monitor Everbridge's Visual Command Center (VCC), InternationalSOS e-mails, and other open source tools for real-time incidents impacting ByteDance assets and travelers.
- Monitor directed tools or queries for specific requests from stakeholders.
- Notifications about violence, inclement weather, threats to life, property and assets etc.
- Coordinate emergency response efforts, including liaising with law enforcement if needed.
- Conduct research to verify the accuracy and relevance of the information through additional sources.
- Create heatmap of the affected area to highlight areas impacted by a specific event or series of events.
- Collaborate with other security and operational teams for a coordinated response.
- Implement incident containment and mitigation strategies.
- Document incident details, response actions, and lessons learned.
- Perform duties in compliance with SOP.
Physical Security and Safety
- Basic monitoring of Closed-Circuit Television (CCTV) systems and Access Control Systems (ACS).
- Monitor safety alarms and communication channels for events such as but not limited to electrical incidents, fire & environmental hazards, equipment failure, chemical exposure, water leaks, that pose a risk to the safety of personnel or the data center infrastructure.
- Conduct audits of camera footage to ensure proper functioning, video quality, and coverage of critical areas.
- Respond to access control incidents and anomalies.
- Report findings to the security and safety engineers, and relevant stakeholders promptly.
- Perform duties in compliance with SOP.
Badge Management
- Perform badge enrolment and ensure that all requests go through proper approval process and to assess accuracy and completeness of request in compliance with SOP.
- Access card programming due to access requests such as but not limited to new or temporary access requests via email/ticket, off-boarding by revoking badge access.
- Generating access logs reports.
- Conduct access log audit.
Continuous Service Improvement
- Identify areas of improvement within current service delivery processes.
- Implement changes that lead to measurable enhancements in service quality, efficiency, and customer satisfaction.
- Establish a culture of continuous improvement within the organization.
- Establish mechanisms for ongoing feedback collection from customers and employees.
- Integrate feedback into future continuous improvement efforts.
Required Qualifications/Skills:
- 2 years+ experience in command center, service center, or similar 24x7 operations center environment
- Ability to quickly triage multiple incidents and assign the right priority based on risk and confidence levels
- Knowledge of technical elements associated with systems such as IP Networks, DC Environment and Server Health.
- Outstanding verbal and written communication skills required, work with minimal direction, meeting goals, attention to details and an eye for continuous improvements
- Ability to successfully interact at all levels of the organization, including with clients, while functioning as a team player required.
- Basic working knowledge of data protection policies such as GDPR and the need to keep sensitive information secure.
- XOC Analyst is expected to work at ByteDance datacenter site. This is an on-site role.
- Willingness to work flexible schedules/shifts/areas, including weekends, nights, and holidays.
- Excellent verbal and written communication skills in English
- Effectively utilize the ticket management systems
- Understanding of networking components and infrastructures
- Understanding of Data Center best practices (i.e. basic fault tolerance, cable routing, calculating power usage)
- Strong organization skills
Preferred Qualifications
- Diploma/Degree in Information Technology.
- Works well under pressure and within time/budget constraints to solve problems and complete deliverables.
- Experience with Ticketing, Grafana, Servers and Data Center Systems.
- Working knowledge and/or certifications in CompTIA Server+, Schneider Electric Data Center Certified Associate (DCCA).
- Knowledge of Lenel and Avigilon systems is a plus.
- Hands on experience in electrical, HVAC, and data center infrastructures
- Working knowledge of networking components and infrastructures
- Ability to adapt to changing priorities, conditions, and circumstances
* Salary range is an estimate based on our InfoSec / Cybersecurity Salary Index 💰
Tags: Audits Automation Compliance CompTIA GDPR Grafana Incident response Monitoring Open Source Threat intelligence
Perks/benefits: Flex hours Flex vacation Team events
More jobs like this
Explore more career opportunities
Find even more open roles below ordered by popularity of job title or skills/products/technologies used.