Lead Cloud Infrastructure Engineer / Site Reliability Engineer (SRE)
Tasks
- Build and automate data and model pipelines
- Collaborate with engineering teams on reliability and security
- Design, deploy, and scale AI/ML/LLM infrastructure in the cloud
- Ensure cloud infrastructure stability, performance, and security
- Implement monitoring, observability, and reliability best practices
- Lead incident response during on-call rotations
- Manage and optimize Kubernetes environments for AI services
- Own infrastructure end to end and lead scaling, deployments, and automation
- Perform performance tuning and cost optimization
Skills/Tech-stack
AWS | AWS Lambda | ArgoCD | Azure | Azure Functions | Bash | CI/CD | Capacity Planning | CloudFormation | Docker | ELK | Flux | GCP | GitOps | Go | Grafana | Helm | Incident Response | Istio | Kubernetes | Linkerd | OpenTelemetry | PowerShell | Prometheus | Python | SLA | SLI | SLO | Serverless | Service Mesh | Terraform