PhD Position F/M - Procedural reasoning data generation for advanced logics and planning

Inria -
Villeneuve-d'Ascq (59)


Qualifications

French
Statistics
Computer science
Doctorate level
Big data
English
Master's degree
GitHub
Artificial intelligence
Python

Full job description

The job description below is in English

Contract type: fixed-term contract (CDD)

Required degree level: Master's degree (Bac + 5) or equivalent

Role: PhD student

About the center or functional department

Created in 2008, the Inria center at the University of Lille employs 360 people, including 305 scientists in 16 research teams. Recognized for its strong involvement in the socio-economic development of the Hauts-de-France region, the center maintains close relationships with large companies and SMEs. By fostering synergies between researchers and industry, Inria contributes to the transfer of skills and expertise in digital technologies, and provides access to the best of European and international research for the benefit of innovation and businesses, particularly in the region.

For over 10 years, the Inria center at the University of Lille has been at the heart of Lille's university and scientific ecosystem, as well as at the heart of French Tech, with a technology showroom on avenue de Bretagne in Lille, on the EuraTechnologies site of economic excellence dedicated to information and communication technologies (ICT).

Context and advantages of the position

Benchmarks for Language Models

Hosting team: Inria Lille, CRIStAL (UMR 9189)
Supervisor: Damien Sileo
Duration: 36 months
Project: TACTICS (PEPR)

Context

Language models have become surprisingly capable reasoners, but progress is bottlenecked by data. Both training (SFT, RLVR) and evaluation rely on problem sets with known answers, and the supply of high-quality, uncontaminated, difficulty-calibrated reasoning problems is running thin. Hand-curated benchmarks saturate quickly, leak into pretraining corpora, and cannot be regenerated at will. Web-scraped reasoning data carries licensing baggage and offers no correctness guarantees.

Procedural generation offers a way out. By coupling a problem generator with a symbolic solver, one can produce an effectively unbounded stream of fresh instances, each shipped with a certified solution and a tunable difficulty knob. The same pipeline serves three purposes at once: pretraining and SFT data with reasoning traces, verifiable rewards for RL, and contamination-free evaluation suites. Our group has been building this line of work through the Reasoning Core library, which already covers PDDL planning over randomized domains, full first-order logic, CFG parsing, causal inference over Bayesian networks, and equation solving.
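The generator-plus-solver pattern described above can be sketched on a toy equation-solving task. This is a minimal illustration, not the Reasoning Core API: the names `Instance`, `solve_linear`, `generate`, and `reward` are hypothetical, and the difficulty knob is assumed to simply scale coefficient magnitude.

```python
import random
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Instance:
    prompt: str        # natural-language rendering of the problem
    answer: Fraction   # certified solution computed by the solver
    difficulty: int    # the knob value used to generate the instance


def solve_linear(a: int, b: int, c: int) -> Fraction:
    """Exact solver for a*x + b = c (requires a != 0)."""
    return Fraction(c - b, a)


def generate(difficulty: int, rng: random.Random) -> Instance:
    """Sample a fresh equation; larger difficulty means larger coefficients."""
    bound = 10 ** difficulty
    a = 0
    while a == 0:  # reject the degenerate case with no unique solution
        a = rng.randint(-bound, bound)
    b = rng.randint(-bound, bound)
    c = rng.randint(-bound, bound)
    prompt = f"Solve for x: ({a})*x + ({b}) = {c}"
    return Instance(prompt, solve_linear(a, b, c), difficulty)


def reward(instance: Instance, model_output: str) -> float:
    """Verifiable reward: 1.0 if the output parses to the certified answer."""
    try:
        return float(Fraction(model_output.strip()) == instance.answer)
    except (ValueError, ZeroDivisionError):
        return 0.0


rng = random.Random(0)
for level in (1, 2, 3):
    inst = generate(level, rng)
    print(inst.prompt, "->", inst.answer)
```

The same `Instance` objects could serve as training examples, as RL reward targets through `reward`, or as a freshly sampled evaluation set, which is the three-way reuse the paragraph refers to.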

Thesis topic

The PhD will extend this infrastructure toward harder formalisms and richer reasoning regimes. The core technical question is how to turn a solver plus a formalism into a good procedural generator: one whose output distribution is broad rather than templated, whose difficulty scales smoothly, and whose problems remain well-posed and human-readable as complexity grows.

Several directions are open and will be shaped with the candidate:

Advanced logics and planning. Hierarchical planning (HTN/HDDL), multi-agent and epistemic planning, temporal logics, and modal logics for reasoning about knowledge and belief. Each formalism has mature solvers but no scalable generator that exposes their full expressive range.
From solver to generator. Most solvers are built to consume problems, not produce them. Sampling instances that are non-trivial, non-degenerate, solvable within a budget, and distributionally diverse is a research problem in itself, and it gets harder as the formalism grows.
Verbalization and alignment with human priors. A formal instance is only useful if its natural-language rendering is faithful, varied, and not artificially easy or hard. This involves grammar-based verbalization, controlling surface form independently of logical content, and understanding how phrasing interacts with model inductive biases.
Pushing scalability where solvers struggle. At high difficulty, exact solvers time out. The thesis will explore approximate verification, compositional generation, and incremental construction strategies that preserve correctness guarantees while extending the reachable difficulty range.
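To make the "from solver to generator" direction concrete, here is a minimal sketch of rejection sampling around a solver, on a toy path-planning task over random directed graphs. Everything in it is an assumption made for illustration (the function names, breadth-first search standing in for a real planner, the expansion budget, the plan-length window); it is not how Reasoning Core or any particular planner is implemented.

```python
import random
from collections import deque


def random_graph(n_nodes, n_edges, rng):
    """Sample a set of distinct directed edges (u, v) with u != v."""
    edges = set()
    while len(edges) < n_edges:
        u, v = rng.sample(range(n_nodes), 2)
        edges.add((u, v))
    return edges


def shortest_plan(edges, start, goal, max_expansions):
    """BFS as a stand-in solver; returns a shortest path, or None if the
    goal is unreachable or the expansion budget is exhausted."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
    parent = {start: None}
    queue = deque([start])
    expansions = 0
    while queue and expansions < max_expansions:
        node = queue.popleft()
        expansions += 1
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in sorted(adj.get(node, [])):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


def sample_instances(n, min_steps, max_steps, rng,
                     n_nodes=8, n_edges=12, max_tries=10_000):
    """Keep only instances that are solvable within budget, non-trivial
    (plan length inside the window), and not duplicates."""
    seen, kept = set(), []
    for _ in range(max_tries):
        if len(kept) == n:
            break
        edges = random_graph(n_nodes, n_edges, rng)
        start, goal = rng.sample(range(n_nodes), 2)
        plan = shortest_plan(edges, start, goal, max_expansions=100)
        if plan is None:
            continue  # unsolvable or over budget
        if not (min_steps <= len(plan) - 1 <= max_steps):
            continue  # too easy or too hard for the requested window
        key = (frozenset(edges), start, goal)
        if key in seen:
            continue  # exact duplicate
        seen.add(key)
        kept.append({"edges": sorted(edges), "start": start,
                     "goal": goal, "plan": plan})
    return kept


instances = sample_instances(5, min_steps=3, max_steps=5, rng=random.Random(0))
```

Plain rejection sampling like this wastes most draws and only controls the output distribution through its filters, which is one reason the sampling problem gets harder as the formalism grows.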

Practical

Location: Inria Lille, France
Funding: full PhD scholarship (PEPR TACTICS), plus compute budget for frontier model inference
Start date: flexible, ideally October 2026

Assigned mission

The work will contribute directly to the Reasoning Core ecosystem (https://github.com/sileod/reasoning-core) and benefit from its existing infrastructure (gramforge grammar framework, containerized solvers, parallel generation pipeline). Evaluation will run on two fronts: zero-shot probing of frontier models to measure where current systems break, and supervised fine-tuning of small models to measure whether the generated data actually instills reasoning capabilities. Both regimes inform what makes a generator useful, and they often disagree in instructive ways.

Main activities

Study related work

Propose new approaches

Evaluate them, iterate

Write papers

Skills

Languages:

Good written English; French not required
Solid Python; comfort with formal methods, logic, or symbolic AI is a strong plus

Good interpersonal skills

Benefits

Subsidized meals
Partial reimbursement of public transport costs
Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
Possibility of teleworking and flexible organization of working hours
Professional equipment available (videoconferencing, loan of computer equipment, etc.)
Social, cultural and sports events and activities
Access to vocational training
Social security coverage

Remuneration

Gross monthly salary: 2,300 €

General information

Theme/Domain: Data and knowledge representation and processing
Statistics (Big data) (BAP E)
City: Villeneuve d'Ascq
Inria center: Inria center at the University of Lille
Desired start date: 2026-10-01
Contract duration: 3 years
Application deadline: 2026-06-06

Warning: Applications must be submitted online on the Inria website. Processing of applications sent through other channels is not guaranteed.

Instructions for applying

Application: CV, transcripts, and a short statement of interest to [email protected]

Defense security:
This position is likely to be assigned to a restricted-access zone (zone à régime restrictif, ZRR), as defined in Decree No. 2011-1425 on the protection of the nation's scientific and technical potential (PPST). Authorization to access such a zone is granted by the head of the institution, following a favorable ministerial opinion, as defined in the order of 3 July 2012 relating to the PPST. An unfavorable ministerial opinion for a position assigned to a ZRR would result in the cancellation of the recruitment.

Recruitment policy:
As part of its diversity policy, all Inria positions are accessible to people with disabilities.

Contacts

Inria team: MAGNET
PhD supervisor:
Sileo Damien / [email protected]

Essentials for success

Master's degree in CS, ML, NLP, or a closely related area
Computer science/math background (logicians/theoretical/verification CS people welcome)
Familiarity with at least one of: automated planning, theorem proving, formal grammars
Interest in working at the boundary between symbolic systems and neural models
Good coding proficiency
Good command of, and interest in, formal methods (LLM enthusiasm is not enough)
Scientific mindset

About Inria

Inria is the French national research institute dedicated to digital science and technology. It employs 2,600 people. Its 215 agile project teams, generally run jointly with academic partners, involve more than 3,900 scientists in meeting the challenges of digital technology, often at the interface with other disciplines. The institute draws on many talents in over forty different professions. 900 research and innovation support staff help scientific and entrepreneurial projects with worldwide impact emerge and grow. Inria works with many companies and has supported the creation of more than 200 start-ups. The institute thus strives to meet the challenges of the digital transformation of science, society and the economy.
