Principal Platform Engineer
Tasks
- Architect cloud infrastructure for production machine learning
- Build and maintain CI/CD pipelines for ML model training, testing, and deployment
- Collaborate with data, ML, backend, and frontend engineering teams
- Design, deploy, and maintain elastic scaling infrastructure
- Implement Infrastructure as Code for scalable deployments
- Implement model monitoring for drift, accuracy, latency, and performance
- Lead infrastructure strategy for machine learning workloads
- Manage Kubernetes clusters for high-performance ML
- Serve on call for production systems
- Set up observability for system health and ML-specific metrics
Perks/Benefits
- N/A
Skills/Tech-stack
Apache Airflow | Argo CD | Argo Workflows | Auth0 | BigQuery | ClickHouse | Cloud SQL | Dex | Elasticsearch | GCP VPC | GCP VPC Service Controls | GitHub Actions | GitOps | Google Cloud Platform | Google Kubernetes Engine | Helm | IAM | Istio | Istio VirtualServices | Jenkins | Kong | Kubernetes | Kustomize | MLflow | mTLS | Machine Learning | NVIDIA Triton Inference Server | Organization Policies | PostgreSQL | Python | SOPs | Terraform | vLLM
Education
N/A
Roles
Engineer | Platform | Platform Engineer | Principal | Principal Platform Engineer
Senior-level | Full Time | Remote | Lithuania | 20d ago