Tasks / Job Overview
Lead the end-to-end lifecycle of AI models (selection → tuning → configuration → deployment → optimization) and ensure robust, scalable, high-performance model operations in production that support business goals and digital transformation.
Key Responsibilities
Model Selection, Benchmarking, and Optimization
Lead the selection, evaluation, and benchmarking of AI/ML models for platforms and business applications;
Architect, configure, and streamline AI model pipelines for reliability, scalability, and efficiency;
Oversee model tuning, hyperparameter optimization, and continuous retraining processes.
MLOps/DevOps Automation
Implement and maintain CI/CD and MLOps pipelines for automated deployment and monitoring, with traceable versioning.
Performance and Availability Troubleshooting
Troubleshoot and resolve performance bottlenecks in model serving and infrastructure.
Cross-team Collaboration
Collaborate with data scientists, engineers, and product teams to translate business requirements into robust, maintainable model solutions.
Security, Compliance, and Governance
Ensure security, compliance, and governance of AI models in platforms and products (access control, audit evidence, risk management).
Innovation
Stay current with AI/ML advancements and, after evaluation, integrate new technologies into model operations when appropriate.
Qualifications
• Education
Bachelor's or Master's degree in Computer Science, Engineering, Mathematics, AI, or related fields.
• Model Expertise
Expert in model selection, tuning, and optimization, with practical evaluation methods and outcomes.
• MLOps/DevOps Tools:
Advanced experience with MLflow, Kubeflow, Airflow, Docker, Kubernetes, Jenkins, and similar tools; able to build automated pipelines.
• Programming and Frameworks:
Proficient in Python and mainstream ML frameworks (TensorFlow, PyTorch, scikit-learn).
• Cloud and Large-Scale Deployment:
Strong background in large-scale model deployment on cloud platforms (AWS, Azure, Alibaba Cloud).
• CI/CD for ML:
Experience designing and implementing CI/CD pipelines for machine learning.
• Monitoring:
Deep understanding of model monitoring, drift detection, and retraining strategies.
• Problem Solving and Communication:
Excellent problem-solving, analytical, and communication skills.
• Security and Compliance:
Knowledge of security and compliance requirements in AI model operations.
• Language:
Proficient English; German is a plus.
Experience (3-6 years)
• Experience in enterprise AI/ML model development, deployment, and operations.
• Expert experience in model selection, tuning, and optimization for business systems.
• Hands-on experience with MLOps pipelines, model versioning, and automated retraining.
• Experience in cloud-native model deployment and orchestration (Docker/Kubernetes).
• Experience in performance troubleshooting and optimization of AI model serving.