Site Reliability Architect
USD 155K-190K (estimate) | Senior-level | Full Time
Tasks
- Analyze upstream and downstream dependencies
- Build Prometheus and Grafana dashboards
- Configure ELK and EFK pipelines
- Define SLIs, SLOs, and error budgets
- Design unified observability dashboards
- Detect anomalies and predict incidents with AIOps
- Enrich and manipulate JSON telemetry
- Implement Dynatrace metrics, traces, logs, and Davis AI
- Implement alerting with static and dynamic thresholds
- Integrate OpenTelemetry
- Monitor Kafka and streaming platform signals
- Monitor and troubleshoot distributed microservices
- Perform root cause analysis
- Recommend runbooks with LLMs
- Reduce alert noise with alert correlation
- Suggest auto remediation actions with GenAI
- Summarize incidents with GenAI
Perks/Benefits
- N/A
Skills/Tech-stack
AI machine learning | AIOps | AWS | Alert Correlation | Alerting | Anomaly Detection | Azure | Cloud platform | Davis AI | Dependency analysis | Distributed Systems | Dynamic Thresholds | Dynatrace | EFK Stack | ELK Stack | Error budget | GenAI | Google Cloud Platform | Grafana | Incident Management | Infrastructure as Code | JSON | Kafka | Large Language Models | Machine Learning | Microservices | Noise Reduction | Observability | OpenTelemetry | Prometheus | Reliability Engineering | Root Cause Analysis | SLI | SLO | Site Reliability Engineering | Static Thresholds | Streaming Platforms | Telemetry enrichment | Terraform | Time Series Analysis | Unified Observability
Education
N/A