Job Description:
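The AI Research Center's mission includes advancing Inventec's smart manufacturing, applying AI to open up the future of smart healthcare for the benefit of people, developing AI chips, and building the infrastructure for AI computing. We are a Silicon Valley-style team that enjoys sharing its research results. Our management structure is flat, and when we face a technical problem we brainstorm solutions together. You will work with cutting-edge technology and collaborate directly with key customers and internal engineering teams. ☆ Learn more about Inventec AI: https://ai.inventec.com/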
### What You Will Do
1. Build end-to-end documentation and instrumentation of our platform to ensure visibility, automation, self-healing, and resiliency throughout the stack.
2. Collaborate with other engineers and AI researchers on the team.
3. Design, build, and maintain infrastructure that supports every stage of the ML/software workflow, including data ingestion, model building, and model ops.
4. Improve and restructure the backend architecture so it scales to ever-larger customers while maintaining a flawless user experience and high uptime.
5. Influence architectural decisions with a focus on security, scalability, and high performance.
6. Work with the application security and IT teams at Inventec to deploy and secure both applications and infrastructure.
### What Skills You Possess
1. 1+ years of work experience in site reliability, particularly working with cloud providers or large-scale systems.
2. 1+ years of experience scripting or coding in Python or Bash.
3. Experience designing, deploying, and securing cloud-based infrastructure (AWS or Azure).
4. Experience with containerization technologies (Docker), orchestration platforms (K8s), and CI/CD frameworks (GitLab).
5. Experience writing code to deploy and automate infrastructure.
6. Familiarity with monitoring and performance tools, such as Prometheus.
7. Good understanding of networking and related protocols (HTTP, DNS, TLS, TCP).
8. Good understanding of database technologies.
### Nice To Have
1. Experience supporting production systems to minimize customer downtime.
2. Knowledge of common machine learning frameworks, such as TensorFlow, Torch, and scikit-learn.
3. Experience maintaining Kubeflow or a similar MLOps platform.
4. Experience configuring and integrating MLOps application components such as model lifecycle management, model serving, hyperparameter tuning, object storage, load balancers, and authentication (e.g. Katib, Istio, and Dex).