Site Reliability Engineer (Mid-Level)
San Antonio, TX
Full Time | Mid-level / Intermediate | Clearance required | USD 105K - 130K
Dark Wolf Solutions
Dark Wolf Solutions operates at the nexus of mission and technology to meet our Nation’s most challenging missions.
Dark Wolf Solutions is seeking a Mid-Level Site Reliability Engineer (SRE) to support the Unified Platform Cyber Operations & Security Center (COSC) in San Antonio, TX. The SRE will be responsible for developing, maintaining, and enhancing infrastructure resilience, observability, and operational automation across classified and hybrid cloud environments. The successful candidate will work within a high-performing team to implement Infrastructure as Code (IaC), improve service reliability, monitor system health, and automate incident response and recovery processes.
Key Responsibilities
- Deploy and maintain observability tools including monitoring, alerting, logging, and performance dashboards.
- Develop, refine, and maintain Infrastructure as Code (IaC) configurations using tools such as Terraform, Ansible, or CloudFormation.
- Implement and monitor Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to improve service reliability and system uptime.
- Monitor system performance, identify degradation or anomalies, and implement corrective actions.
- Support incident response efforts including triage, escalation, and resolution activities.
- Assist in the development and maintenance of runbooks, operational procedures, and incident management documentation.
- Collaborate with platform, cloud, and security engineering teams to optimize system performance, scalability, and resilience.
- Integrate operational telemetry into the COSC SIEM and observability frameworks to enhance system visibility.
- Ensure systems are aligned with compliance frameworks such as NIST 800-53, DoD STIGs, and RMF guidelines.
- Participate in root cause analysis (RCA) efforts and contribute to continuous improvement of platform operations.
Basic Qualifications
- Bachelor’s degree in Computer Science, Information Technology, Cybersecurity, Engineering, or a related technical field, or equivalent industry experience.
- 3–5 years of experience in systems engineering, cloud operations, DevSecOps, or SRE roles.
- Experience with cloud-native infrastructure operations (AWS, Azure, Kubernetes) and container orchestration.
- Practical knowledge of Infrastructure as Code (IaC) tools such as Terraform, Ansible, or CloudFormation.
- Experience using observability platforms such as Elastic Stack, Prometheus, Grafana, or similar.
- Solid understanding of networking fundamentals, system administration, and distributed systems monitoring.
- Familiarity with Agile methodologies and DevSecOps best practices.
- U.S. citizenship, an active Secret clearance, and eligibility for Top Secret/SCI are required.
Desired Qualifications
- Certifications such as AWS Certified SysOps Administrator, Certified Kubernetes Administrator (CKA), or related DevOps certifications.
- Experience implementing GitOps workflows using tools like ArgoCD or Flux.
- Exposure to resilience engineering concepts, fault injection testing, or chaos engineering.
- Experience supporting mission-critical operations in DoD or Intelligence Community environments.
- Familiarity with security monitoring, SIEM integration, and compliance alignment.
The estimated salary range is $105,000.00 - $130,000.00, commensurate with experience, technical expertise, certifications, and clearance level.
Primary work location is San Antonio, TX. This is a hybrid role with a mix of remote and on-site support; on-site presence is required for classified system activities.
We are proud to be an EEO/AA employer: Minorities/Women/Veterans/Disabled and other protected categories.
In compliance with federal law, all persons hired will be required to verify identity and eligibility to work in the United States and to complete the required employment eligibility verification form upon hire.