

Research Engineer, Model Inference & Serving - Paris

H Company - Paris (75)

Job details

Full-time

Qualifications

Rust
Go
Kubernetes
C++
Distributed systems
Deep learning
Artificial intelligence
Python

Full job description

Research Engineer, Model Inference & Serving

About H: H exists to push the boundaries of superintelligence with agentic AI. By automating complex, multi-step tasks typically performed by humans, AI agents will help unlock full human potential. H is hiring the world's best AI talent, seeking those who are dedicated as much to building safely and responsibly as to advancing disruptive agentic capabilities. We promote a mindset of openness, learning, and collaboration, where everyone has something to contribute.

About the Team: The Inference team builds and operates the systems that serve H's foundational models in production. We focus on multimodal inference and serving for Computer Use Agents, optimizing across both the inference engine layer (e.g., vLLM, SGLang) and the model serving layer (e.g., disaggregated inference, intelligent routing). Agentic inference brings constraints around context length, multimodality, and tool calls, which we address by co-designing with the Models team on training-time choices and with the agent teams on how models are deployed. We operate at the intersection of research and production, translating cutting-edge inference techniques into the systems that power H's next generation of agents. We are looking for strong engineers excited about inference to join the team and help shape the systems behind superintelligent AI.

Key Responsibilities:

- Build and operate the inference stack that serves H's multimodal agentic models
- Improve latency, throughput, and cost of model serving across the stack
- Research and implement inference techniques tailored to agent workloads
- Co-design with the Models team on training-time decisions that affect inference
- Collaborate with cross-functional teams to integrate inference into agentic AI products
- Evaluate inference, serving, and hardware platforms, and communicate findings to stakeholders
- Stay current with advancements in inference, model serving, and accelerator technology

Requirements:

Technical skills:
- Strong software engineering track record
- Proficient in Python and at least one systems language (Rust, C++, or Go)
- Hands-on experience with deep learning frameworks (PyTorch, JAX), preferably in an industry setting
- Solid distributed systems fundamentals
- Experience working in a modern cloud environment and with production ML infrastructure (Kubernetes, etc.)
- Working knowledge of modern ML, including transformers and multimodal architectures
Research skills:
- Research engagement, shown by any of the following: an advanced degree with research output, publications at top-tier AI or systems venues (e.g., NeurIPS, ICML, MLSys, OSDI), research internships, or substantive open-source contributions
Soft skills:
- Excellent communication and presentation skills
- Strong collaboration and teamwork skills
- Passion for inference and AI
Preferred qualifications:
- Startup experience
- Hands-on experience with inference frameworks (vLLM, SGLang, TensorRT-LLM)
- Writing or modifying GPU kernels (CUDA, Triton, etc.)
- Edge or on-device inference experience (llama.cpp, MLX, ONNX Runtime, etc.)
- Experience with quantization, speculative decoding, disaggregated inference or KV-cache compression
- Experience with multimodal models and/or agentic systems

Location:

Paris or London.
This role is hybrid, and you are expected to be in the office 3 days a week on average.
Please expect some travel between offices on a reasonable cadence (e.g., every 4-6 weeks).

What We Offer:

- Join the exciting journey of shaping the future of AI
- Collaborate with a fun, dynamic and multicultural team, working alongside world-class AI talent in a highly collaborative environment
- Enjoy a competitive salary
- Unlock opportunities for professional growth, continuous learning, and career development

If you want to change the status quo in AI, join us.
