HPC AI Systems Administrator
Bloomington, Minnesota, United States of America
USD 111K-211K | Senior-level | Full Time
Tasks
- Communicate lab successes, risks, failures, and issues
- Conduct advanced hardware diagnostics
- Configure and manage OS images for HPC cluster provisioning
- Coordinate power, CPU, and GPU issue resolution
- Design and operate HPC AI lab environments
- Design and support virtualized lab infrastructure
- Design highly available, fault-tolerant environments
- Design lab networks and operational policies
- Ensure cybersecurity and asset protection compliance
- Image, configure, and upgrade servers with Linux
- Install, configure, and manage high-performance storage
- Install, configure, and support job scheduling and resource management tools
- Lead escalation-level troubleshooting
- Mentor junior system administrators
- Oversee lab transitions and facility moves
- Perform hardware and software installation and configuration
- Prioritize and coordinate lab work requests
- Recommend resource usage and capacity planning
- Work with remote administrators, vendors, and partners
Perks/Benefits
- Career Development Programs
- Flexible work arrangements
- Health and wellbeing
- Inclusion and diversity programs
Skills/Tech-stack
AI | Asset Management | CPU troubleshooting | Capacity Planning | Cybersecurity | Fault Tolerance | GPU troubleshooting | HPC | HPC cluster | Hardware Diagnostics | High Availability | High Performance | High-Performance Storage | IT Asset Management | Job Scheduling | Linux | Lustre | Networking | Power Management | Resource Management | Server Administration | Storage | Switch configuration | Virtual Server | Virtual Server Administration | Virtualization | Virtualized Infrastructure