HPC AI Systems Administrator Lead
San Jose, California, United States of America
USD 111K-211K | Senior-level | Full Time
Tasks
- Administer virtualized lab infrastructure
- Communicate lab status, risks, failures, and issues to management
- Configure and manage OS images for HPC cluster provisioning
- Coordinate hardware and software installation and configuration
- Design and operate HPC and AI lab environments
- Design fault-tolerant, highly available environments
- Design lab networks and operational policies
- Ensure cybersecurity and asset protection compliance
- Image, configure, and upgrade Linux servers
- Install, configure, and manage high-performance storage solutions
- Install, configure, and support job scheduling and resource management tools
- Mentor junior system administrators
- Oversee lab facility transitions and infrastructure refreshes
- Perform advanced hardware diagnostics
- Perform capacity planning and resource usage recommendations
- Prioritize and coordinate lab work requests
- Provide escalation level troubleshooting
- Troubleshoot power, CPU, and GPU issues
Skills/Tech-stack
AI | Capacity Planning | Cybersecurity | Fault Tolerance | Firmware Update | GPU troubleshooting | HPC | Hardware Diagnostics | High Availability | Job Scheduling | Linux | Networking | Resource Management | Server Administration | Storage | Virtual Server | Virtual Server Administration | Virtualization