Sr. ML Platform Engineer (Hybrid)
Tasks
- Build observability solutions
- Conduct post-mortems
- Configure alerting workflows
- Debug memory leaks
- Debug resource contention
- Debug scheduling conflicts
- Develop runbooks
- Diagnose distributed systems issues
- Implement automated health checks
- Improve HPC cluster utilization
- Maintain platform reliability metrics
- Mentor engineers on debugging techniques
- Optimize GPU allocation
- Optimize Ray clusters
- Optimize Slurm job scheduling
- Optimize Spark jobs
- Optimize resource allocation
- Perform root cause analysis
- Profile performance bottlenecks
- Resolve production incidents for inference pipelines
- Resolve production incidents for training pipelines
- Troubleshoot JupyterHub spawner issues
- Troubleshoot kernel crashes
Perks/Benefits
- Employee networks
- On-call support
- Paid adoption leave
- Paid parental leave
- Professional development
- Vacation and holidays
- Volunteer opportunities
- Wellness programs
Skills/Tech-stack
AWS | Airflow | Apache Spark | CUDA | Capacity Planning | Chaos Engineering | Debugging | Distributed tracing | Docker | Google Cloud | Grafana | JupyterHub | Kubeflow | Kubernetes | Linux | Log Aggregation | MLflow | Microsoft Azure | OCI | Observability | Performance Tuning | Profiling | Prometheus | Python | Ray | Slurm | Unix
Education
N/A