PhD Thesis: Explainability-Driven Testing of Large Language Models (LLMs) M/F - Doctorat.Gouv.Fr
- Fixed-term contract (CDD)
- Doctorat.Gouv.Fr
Job description
Institution: Université Grenoble Alpes
Doctoral school: MSTII - Mathématiques, Sciences et technologies de l'information, Informatique
Research laboratory: Laboratoire d'Informatique de Grenoble
Thesis supervisor: Lydie DU BOUSQUET ORCID 0000000318711938
Thesis start date: 2026-10-01
Application deadline: 2026-06-22T23:59:59
This project studies how software testing methods can be adapted and applied to large language models (LLMs), which are increasingly used as essential components of modern software systems. While traditional software testing supports verification and validation at different maturity levels, current LLM evaluation practices remain mostly limited to performance metrics. They provide neither systematic verification of expected properties nor validation of system behavior in real-world contexts. This limitation is compounded by the black-box nature of neural models, which makes debugging difficult and restricts current approaches to external mitigation strategies that do not address the root causes of errors. To address these challenges, this research proposes an explainability-based testing framework that integrates explainable artificial intelligence (XAI) into the testing process to enable both verification and validation of LLM-based systems. The main idea is to use the models' internal representations to support systematic software evaluation and test suite construction. The methodology combines software testing techniques, such as metamorphic testing, category partitioning, and mutation testing, with new coverage criteria defined over several dimensions of the models' internal representations.
Expected contributions include new definitions of test adequacy, fault localization, and property-based verification for neural systems, as well as a unified framework for testing AI applications. Overall, this work aims to improve the reliability and security of LLM-based systems by bridging software testing and XAI and by enabling the verification and validation of AI applications.
Software testing is the discipline responsible for ensuring the development of high-quality software systems that satisfy both functional and non-functional requirements. Depending on its level of maturity, testing can serve different purposes: from basic debugging (Level 0), to demonstrating correctness (Level 1), identifying failures (Level 2), reducing operational risk (Level 3), and ultimately contributing to improved system design and quality (Level 4).
Large Language Models (LLMs) are becoming foundational components of modern software applications, enabling systems that automate complex tasks and provide intelligent services. However, while these systems are deployed and maintained as software, their evaluation is still primarily treated as a data-centric problem, largely limited to Level 1 maturity, where performance is demonstrated through evaluation metrics such as precision, recall, or benchmark-based scores. This limitation prevents systematic support for both verification of expected properties and validation of system results in realistic usage scenarios.
In the context of LLMs, achieving higher levels of testing maturity remains challenging. These systems are not explicitly designed with testability or correctness guarantees, as their outputs emerge from data-driven training processes rather than deterministic logic. Consequently, progressing toward Level 2 and Level 3 testing maturity requires a shift from purely metric-based evaluation toward the construction of structured test suites that can systematically expose failure modes, enabling risk reduction through deeper quality analysis rather than surface-level performance measurement.
Even when failures are exposed, existing software testing techniques are limited by their reliance on black-box evaluation, operating only at the input-output level. While this allows for performance-based testing, it does not support systematic debugging, as errors cannot be directly traced or fixed, but only mitigated. As a result, current approaches focus on external mitigation strategies: guardrails and safety filters (which can be bypassed), prompt engineering techniques, and iterative pipelines, none of which ensure the elimination of failures. These approaches do not address the internal causes of failures within the model.
Explainable AI (XAI) provides a promising mechanism to address this limitation by exposing internal model representations, including feature attributions such as input-output token importance, attention distributions, and neuron activations. However, XAI is predominantly used in a post-hoc manner for interpretability rather than as an integrated component of systematic software testing methodologies. This research proposes a testing framework for AI systems that integrates explainability into the testing process. The key idea is to move beyond purely external evaluation and instead leverage internal model representations to improve test adequacy, enable fault localization, and enhance system validation. In this way, explainability is the core mechanism supporting the testing process for LLM-based applications.
The main research question is: How can software testing principles be implemented for large language models by leveraging explainable AI to reconstruct internal representations for test suite generation, coverage assessment, and fault localization? We propose to develop a framework that adapts classical software testing techniques to neural systems and focuses on four main directions.
The first is test generation, where test cases are designed using techniques such as metamorphic testing and category partitioning. Metamorphic oracles take advantage of metamorphic relations between input values: if a metamorphic relation holds between the inputs, the corresponding model outputs must necessarily satisfy a known relation (usually equality or equivalence). In this setting, the notion of input space is extended to include not only the traditional input domain but also the embedding or representation space of the model, allowing test cases to be constructed based on both syntactic and semantic similarity structures.
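As an illustration of the metamorphic oracle described above, the following minimal Python sketch checks an equality relation under a meaning-preserving transformation. Everything in it is an assumption made for illustration: `toy_classifier` and `case_sensitive_classifier` are hypothetical stand-ins for an LLM call, and `to_upper` is one arbitrary choice of transformation. It does not represent the framework the thesis will develop.

```python
from typing import Callable, List, Sequence, Tuple


def toy_classifier(text: str) -> str:
    """Hypothetical stand-in for an LLM call: labels a text by keyword."""
    return "positive" if "good" in text.lower() else "negative"


def case_sensitive_classifier(text: str) -> str:
    """Deliberately faulty variant that is sensitive to letter case."""
    return "positive" if "good" in text else "negative"


def to_upper(text: str) -> str:
    """Transformation assumed to preserve meaning for this toy task."""
    return text.upper()


def metamorphic_violations(
    model: Callable[[str], str],
    transform: Callable[[str], str],
    inputs: Sequence[str],
) -> List[Tuple[str, str]]:
    """Return the (source, follow-up) input pairs whose outputs differ.

    The metamorphic relation checked here is equality of outputs:
    model(x) == model(transform(x)).
    """
    violations = []
    for source in inputs:
        follow_up = transform(source)
        if model(source) != model(follow_up):
            violations.append((source, follow_up))
    return violations


if __name__ == "__main__":
    samples = ["good movie", "bad movie"]
    print(metamorphic_violations(toy_classifier, to_upper, samples))
    print(metamorphic_violations(case_sensitive_classifier, to_upper, samples))
```

No ground-truth label is needed: the oracle only compares the output for a source input with the output for its transformed follow-up, which is what makes the technique attractive when correct outputs are hard to specify.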
The second direction is test suite adequacy evaluation, which addresses the challenge of assessing whether the generated test suite is comprehensive enough to test the software thoroughly. There is currently no standardized notion of test coverage for LLMs, particularly with respect to multiple internal representations. This work therefore investigates coverage metrics across multiple dimensions, including input coverage, output coverage, neural activation coverage, and flow coverage across attention heads and layers.
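One possible reading of neural activation coverage can be sketched as follows, under stated assumptions: activations are already available as one vector per test input, and a unit counts as covered when at least one test drives it above a fixed threshold. The function name `neuron_coverage` and the threshold value are hypothetical choices for this example, not definitions taken from the project.

```python
from typing import Sequence, Set


def neuron_coverage(
    activations: Sequence[Sequence[float]],
    threshold: float = 0.5,
) -> float:
    """Fraction of units activated above `threshold` by at least one test.

    `activations[i][j]` is the (assumed pre-recorded) activation of unit j
    for test input i. All rows must have the same length.
    """
    if not activations or len(activations[0]) == 0:
        return 0.0
    n_units = len(activations[0])
    covered: Set[int] = set()
    for row in activations:
        if len(row) != n_units:
            raise ValueError("all activation vectors must have the same length")
        for unit, value in enumerate(row):
            if value > threshold:
                covered.add(unit)
    return len(covered) / n_units


if __name__ == "__main__":
    # Two test inputs, four units; illustrative values only.
    recorded = [
        [0.9, 0.1, 0.0, 0.2],
        [0.0, 0.7, 0.1, 0.3],
    ]
    print(neuron_coverage(recorded))
```

The same counting scheme could in principle be applied to other dimensions mentioned above, for example by replacing units with attention heads or with partitions of the output space.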
The third direction concerns test suite quality validation, where coverage alone is considered insufficient. A high-quality test suite must also be capable of detecting faults when they exist. To address this, mutation-based techniques are introduced to evaluate fault detection capability by generating perturbed inputs or model variants and assessing whether the test suite is able to detect induced failures. This also enables analysis of the relationship between coverage, explanation representations, and failure detection described in the fourth direction.
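The mutation-based evaluation described above can be illustrated with a small sketch. It assumes the simplest setting: each mutant is a function with the same interface as the system under test, and a mutant is killed when at least one test case produces an output that differs from the expected one. The mutants `no_strip` and `no_lower` and the helper `mutation_score` are hypothetical names introduced for this example.

```python
from typing import Callable, Sequence, Tuple

TestCase = Tuple[str, str]  # (input, expected output)


def no_strip(text: str) -> str:
    """Mutant of a `text.strip().lower()` normalizer: whitespace stripping removed."""
    return text.lower()


def no_lower(text: str) -> str:
    """Mutant of the same normalizer: lowercasing removed."""
    return text.strip()


def mutation_score(
    mutants: Sequence[Callable[[str], str]],
    test_suite: Sequence[TestCase],
) -> float:
    """Fraction of mutants killed, i.e. failing at least one test case."""
    if not mutants:
        raise ValueError("at least one mutant is required")
    killed = sum(
        1
        for mutant in mutants
        if any(mutant(text) != expected for text, expected in test_suite)
    )
    return killed / len(mutants)


if __name__ == "__main__":
    weak_suite = [("hello", "hello")]
    strong_suite = [(" Hello ", "hello")]
    print(mutation_score([no_strip, no_lower], weak_suite))
    print(mutation_score([no_strip, no_lower], strong_suite))
```

The weak suite passes on both mutants and therefore detects neither injected fault, while the strong suite kills both, which is the kind of difference in fault detection capability that coverage figures alone may not reveal.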
The fourth direction is fault localization. When failures occur, Explainable AI techniques are used to support deeper analysis by addressing why the model fails and which internal representations correlate with failure cases. Rather than treating explanations as direct causal evidence, this work adopts a property-based perspective, where internal representations are analysed in relation to violations of expected system properties. These properties include, for example, consistency under metamorphic transformations, compliance with non-functional requirements, or stability of internal representations under controlled perturbations.
The proposed framework will be evaluated through case studies involving chatbot systems, test generation pipelines, and agentic AI applications. These case studies are intended to demonstrate the applicability of the approach across different classes of LLM-based systems and levels of complexity.
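To make the idea of relating internal representations to property violations more concrete, the sketch below borrows the Ochiai formula from classical spectrum-based fault localization and applies it to internal units instead of program statements. This is a swapped-in technique chosen for illustration, not the method the thesis commits to, and the resulting scores express correlation with failures, not causation. The inputs (sets of active units per test, and a flag saying whether the test violated a property) are assumed to have been collected beforehand.

```python
import math
from typing import List, Sequence, Set


def ochiai_scores(
    active_units: Sequence[Set[int]],
    failed: Sequence[bool],
    n_units: int,
) -> List[float]:
    """Ochiai suspiciousness score for each internal unit.

    active_units[i] is the set of unit indices considered active on test i;
    failed[i] is True when test i violated an expected property.
    score(u) = ef / sqrt(total_failed * (ef + ep)), where ef and ep count the
    failing and passing tests on which unit u was active.
    """
    if len(active_units) != len(failed):
        raise ValueError("active_units and failed must have the same length")
    total_failed = sum(1 for flag in failed if flag)
    scores = []
    for unit in range(n_units):
        ef = sum(1 for units, flag in zip(active_units, failed) if flag and unit in units)
        ep = sum(1 for units, flag in zip(active_units, failed) if not flag and unit in units)
        denominator = math.sqrt(total_failed * (ef + ep))
        scores.append(ef / denominator if denominator else 0.0)
    return scores


if __name__ == "__main__":
    # Four tests over three units; tests 1 and 3 violated a property.
    active = [{0, 1}, {1, 2}, {0, 2}, {1}]
    outcomes = [False, True, False, True]
    print(ochiai_scores(active, outcomes, n_units=3))
```

In this toy data, unit 1 is active on every failing test and receives the highest score, which would make it the first candidate for closer inspection.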
Candidate profile
Master's degree in computer science or applied mathematics
Experience with neural models and/or software testing