Incident Response Analyst II

Singapore, Singapore

Applications have closed

Astreya

Astreya is a leading global provider of game-changing IT Managed Services and Technology Solutions to some of the world’s most…

What this Job Entails: 

Incident & Problem Management

  • Investigate and respond to alerts, take part in incident response (war rooms, remote bridges) and reporting, and handle ongoing maintenance, tuning, and improvement of detection signals.
  • Respond to incidents and critical situations in a calm, problem-solving manner, and conduct in-depth investigation of alerts.
  • Serve as the first layer of defense, responsible for quick detection and incident response using various monitoring and automation tools, and for thorough investigation, classification, and triage of alerts.
  • Give resolver groups a clear understanding of, and intelligence on, the criticality and impact of incidents.
  • Keep detailed records of alarm handling activities, including actions taken and resolutions, in ticketing tools, and file incident reports.
  • Be available to coordinate as an incident commander in the event of an issue.
  • Support program managers, facilitate project deliverables, and improve overall operational and engineering initiatives.
  • Conduct root cause analysis (RCA) to trace recurring problems to their source.
  • Employ in-depth questioning and analysis techniques such as five whys to determine the underlying cause of the incident or problem.
  • Perform duties in compliance with SOP.

Server, DCIM, Network and Traffic Alarms Operations

  • Continuously monitor alarm dashboards and systems.
  • Investigate and respond to alarms including, but not limited to, Network, DC Environment, Server Health, and Facility Security and Safety.
  • Identify and acknowledge incidents associated with alarms.
  • Assess incidents to determine their criticality and impact on operations.
  • Engage the resolver group responsible for the incident, and escalate to higher tiers or management when necessary, following established escalation paths.
  • Maintain clear and concise communication with relevant teams, stakeholders, and incident responders/resolvers.
  • Follow documented procedures to resolve incidents promptly and effectively.
  • Ensure detailed records of alarm handling activities, including actions taken and resolutions in ticketing tools.
  • Perform duties in compliance with SOP.

Threat Intelligence & Critical Event Management

  • Monitor Everbridge's Visual Command Center (VCC), InternationalSOS e-mails, and other open-source tools for real-time incidents impacting ByteDance assets and travelers.
  • Monitor directed tools or queries for specific requests from stakeholders.
  • Handle notifications about violence, inclement weather, and threats to life, property, and assets.
  • Coordinate emergency response efforts, including liaising with law enforcement if needed.
  • Conduct research to verify the accuracy and relevance of the information through additional sources.
  • Create a heatmap of the affected area to highlight locations impacted by a specific event or series of events.
  • Collaborate with other security and operational teams for a coordinated response.
  • Implement incident containment and mitigation strategies.
  • Document incident details, response actions, and lessons learned.
  • Perform duties in compliance with SOP.

Physical Security and Safety

  • Perform basic monitoring of Closed-Circuit Television (CCTV) systems and Access Control Systems (ACS).
  • Monitor safety alarms and communication channels for events that pose a risk to the safety of personnel or the data center infrastructure, including, but not limited to, electrical incidents, fire and environmental hazards, equipment failure, chemical exposure, and water leaks.
  • Conduct audits of camera footage to ensure proper functioning, video quality, and coverage of critical areas.
  • Respond to access control incidents and anomalies.
  • Report findings to the security and safety engineers, and relevant stakeholders promptly.
  • Perform duties in compliance with SOP.

Badge Management

  • Perform badge enrolment, ensure that all requests go through the proper approval process, and assess the accuracy and completeness of each request in compliance with SOP.
  • Program access cards in response to requests including, but not limited to, new or temporary access requests via email/ticket and off-boarding (revoking badge access).
  • Generate access log reports.
  • Conduct access log audits.

Continuous Service Improvement

  • Identify areas of improvement within current service delivery processes.
  • Implement changes that lead to measurable enhancements in service quality, efficiency, and customer satisfaction.
  • Establish a culture of continuous improvement within the organization.
  • Establish mechanisms for ongoing feedback collection from customers and employees.
  • Integrate feedback into future continuous improvement efforts.

Required Qualifications/Skills:

  • 2+ years of experience in a command center, service center, or similar 24x7 operations center environment.
  • Ability to quickly triage multiple incidents and assign the right priority based on risk and confidence levels
  • Knowledge of technical elements associated with systems such as IP Networks, DC Environment and Server Health.
  • Outstanding verbal and written communication skills, the ability to work with minimal direction and meet goals, attention to detail, and an eye for continuous improvement.
  • Ability to interact successfully at all levels of the organization, including with clients, while functioning as a team player.
  • Basic working knowledge of data protection policies such as GDPR and the need to keep sensitive information secure.
  • The XOC Analyst is expected to work at the ByteDance data center site. This is an on-site role.
  • Willingness to work flexible schedules/shifts/areas, including weekends, nights, and holidays.
  • Excellent verbal and written communication skills in English
  • Ability to use ticket management systems effectively
  • Understanding of networking components and infrastructures
  • Understanding of Data Center best practices (e.g. basic fault tolerance, cable routing, calculating power usage)
  • Strong organizational skills

Preferred Qualifications

  • Diploma/Degree in Information Technology.
  • Works well under pressure and within time/budget constraints to solve problems and complete deliverables.
  • Experience with Ticketing, Grafana, Servers and Data Center Systems.
  • Working knowledge and/or certifications in CompTIA Server+, Schneider Electric Data Center Certified Associate (DCCA).
  • Knowledge of Lenel and Avigilon systems is a plus.
  • Hands-on experience in electrical, HVAC, and data center infrastructures
  • Working knowledge of networking components and infrastructures
  • Ability to adapt to changing priorities, conditions, and circumstances

Tags: Audits Automation Compliance CompTIA GDPR Grafana Incident response Monitoring Open Source Threat intelligence

Perks/benefits: Flex hours Flex vacation Team events

Region: Asia/Pacific
Country: Singapore
