Engineer Position - Extending XTC Beyond Tensor Operator for CPU (M/F) - 38

Job description

  • INRIA

  • Grenoble - 38

  • Fixed-term contract (CDD)

  • Published on 7 April 2026


About Inria

Inria is the French national research institute for digital science and technology. It employs 2,600 people. Its 215 agile project teams, generally run jointly with academic partners, involve more than 3,900 scientists in meeting the challenges of digital technology, often at the interface with other disciplines. The institute draws on a wide range of talent in more than forty different professions. 900 research and innovation support staff help scientific and entrepreneurial projects with worldwide impact to emerge and grow. Inria works with many companies and has supported the creation of more than 200 start-ups. The institute thus strives to meet the challenges of the digital transformation of science, society and the economy.
Engineer position - Extending XTC beyond tensor operator for CPU
The job description below is in English.
Contract type: fixed-term contract (CDD)

Renewable contract: Yes

Required degree level: Master's degree (Bac+5) or equivalent

Role: Temporary scientific engineer

About the research centre or functional department

The Centre Inria de l'Université Grenoble Alpes brings together almost 450 people in 26 research teams and 9 research support departments.

Staff is present on three campuses in Grenoble, in close collaboration with other research and higher education institutions (Université Grenoble Alpes, CNRS, CEA, INRAE, ...), but also with key economic players in the area.

The Centre Inria de l'Université Grenoble Alpes is active in the fields of high-performance computing, verification and embedded systems, modeling of the environment at multiple levels, and data science and artificial intelligence. The center is a top-level scientific institute with an extensive network of international collaborations in Europe and the rest of the world.

Context and assets of the position

General motivation: domain-specific compilation

Given a program and an architecture, the first goal of a compiler is to translate this program into an equivalent assembly code. The secondary goal is to optimize the generated assembly code, in order to exploit fully the capabilities of the architecture, by using its hardware mechanisms and avoiding its performance bottlenecks.

The more information a compiler is able to extract from a program, the greater its optimization potential. There are several interesting positions in this trade-off:
- Generic compilers (such as GCC and LLVM) accept any program as input but generally do not reach the best performance.
- Domain-specific compilers restrict themselves to a specific class of programs: these programs have specific structural properties [1, 4, 6] which can be exploited to improve their performance.
- Library implementations are handwritten implementations of a key kernel for a specific architecture [8, 2]. These are usually the best-performing implementations, but they require a lot of time and expertise.

Recent work - XTC compiler

In recent years, the CORSE Inria team has focused on the development of XTC, a domain-specific compiler for tensor operations [7, 5, 3, 6] on CPU. This class of programs covers key operations such as matrix multiplication, tensor contraction and convolution. These operations are central in several application domains, such as artificial intelligence (for convolutions and matrix multiplication) or computational chemistry (for tensor contraction).

The XTC compiler includes a scheduling language (called Descript) that summarizes the optimization decisions applied to the operations (tiling, loop interchange, vectorization, etc.). This allows the user to focus on exploring the optimization space in order to find the best-performing implementations.
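
To give a concrete idea of what such scheduling decisions mean, here is a minimal sketch in plain Python. It is not Descript syntax and does not use the XTC API; the function names and tile sizes are purely illustrative assumptions. It shows a reference matrix multiplication and the same computation after tiling the `i` and `j` loops and interchanging `j` with `p`.

```python
def matmul_naive(a, b, n, m, k):
    """Reference loop nest: c[i][j] += a[i][p] * b[p][j], loop order i, j, p."""
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                c[i][j] += a[i][p] * b[p][j]
    return c


def matmul_scheduled(a, b, n, m, k, tile_i=2, tile_j=2):
    """Same computation with i and j tiled, and j interchanged with p in a tile.

    tile_i and tile_j are hypothetical scheduling parameters.
    """
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, tile_i):              # loop over tiles of i
        for j0 in range(0, m, tile_j):          # loop over tiles of j
            for i in range(i0, min(i0 + tile_i, n)):
                for p in range(k):              # interchanged: p is now outside j
                    for j in range(j0, min(j0 + tile_j, m)):
                        c[i][j] += a[i][p] * b[p][j]
    return c
```

Both functions compute the same result; only the order in which the iterations are executed changes, which is the kind of choice a scheduling language is meant to expose.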

References

[1] Paul Feautrier and Christian Lengauer. Polyhedron model. In Encyclopedia of Parallel Computing, pages 1581-1592. 2011.
[2] Intel. oneAPI deep neural network library (oneDNN). https://01.org/, 2018.
[3] Guillaume Iooss, Christophe Guillon, Fabrice Rastello, Albert Cohen, and Saday Sadayappan. SARCASM: Set-Associative Rotating Cache Analytical/Simulating Model. Working paper or preprint, 2024.
[4] Ravi Teja Mullapudi, Vinay Vasista, and Uday Bondhugula. Polymage: Automatic optimization for image processing pipelines. SIGPLAN Not., 50(4):429-443, March 2015.
[5] Auguste Olivry, Guillaume Iooss, Nicolas Tollenaere, Atanas Rountev, P. Sadayappan, and Fabrice Rastello. IOOpt: Automatic derivation of I/O complexity bounds for affine programs. In Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, PLDI 2021, pages 1187-1202, New York, NY, USA, 2021. Association for Computing Machinery.
[6] Hugo Pompougnac, Christophe Guillon, Sylvain Noiry, Alban Dutilleul, Guillaume Iooss, and Fabrice Rastello. XTC, A Research Platform for Optimizing AI Workload Operators. Technical report, CORSE - Compiler Optimization and Run-time Systems, 2025.
[7] Nicolas Tollenaere, Guillaume Iooss, Stéphane Pouget, Hugo Brunie, Christophe Guillon, Albert Cohen, P. Sadayappan, and Fabrice Rastello. Autotuning convolutions is easier than you think. ACM Transactions on Architecture and Code Optimization, 20(2), March 2023.
[8] Endong Wang, Qing Zhang, Bo Shen, Guangyong Zhang, Xiaowei Lu, Qing Wu, and Yajuan Wang. Intel Math Kernel Library, pages 167-188. Intel, May 2014.

Assignment

Objective of the position

There are multiple directions for improving XTC that the candidate could explore.

1) Extending the Descript scheduling language to multiple operators

Currently, the Descript scheduling language is attached to a single operator, and inter-operator optimizations are managed at the operator graph level.

A first goal would be to extend the formalization of the Descript language to cover a group of tensor operators. In particular, we will focus on the fusion transformation, and explore its multiple variations (e.g., should we allocate a temporary buffer, should we duplicate computation, how should we combine tiling and fusion).
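
The fusion variants mentioned above can be illustrated with a minimal plain-Python sketch. The two operators, `producer` and the sliding-window consumer, are hypothetical examples chosen for illustration and are not taken from XTC or Descript.

```python
def producer(x):
    """Hypothetical elementwise operator."""
    return x * x


def unfused(xs):
    """Two separate operators: the intermediate tensor is fully materialized."""
    tmp = [producer(x) for x in xs]              # temporary buffer of size len(xs)
    return [tmp[i] + tmp[i + 1] for i in range(len(xs) - 1)]


def fused_recompute(xs):
    """Fused loop with no buffer: interior producer values are computed twice."""
    return [producer(xs[i]) + producer(xs[i + 1]) for i in range(len(xs) - 1)]


def fused_small_buffer(xs):
    """Fused loop with a one-element buffer: no duplicated computation."""
    out = []
    if not xs:
        return out
    prev = producer(xs[0])
    for i in range(1, len(xs)):
        cur = producer(xs[i])
        out.append(prev + cur)
        prev = cur                               # reuse the value at the next iteration
    return out
```

All three versions return the same values; they differ in memory footprint and in the amount of redundant work, which is the trade-off a multi-operator schedule would need to express.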

2) Extending XTC to linear algebra kernels

A first long-term direction would be to investigate the link between tensor operations and linear algebra kernels (such as the ones found in LAPACK). The first goal would be to exhibit structural properties of these kernels and examine if it is possible to extend the Descript scheduling language to this class of operations. This would allow us to support a larger class of programs, which are useful in other application domains such as scientific applications.

3) Factorizing the code generation process across tensor accelerators

Another long-term direction would be to extend the code generation of XTC to different architectures and accelerators. Indeed, the number of accelerators dedicated to tensor operations has exploded in recent years, and building a code generator for each of them is tedious work.

The goal will be to factorize as much of the code generation process as possible. In particular, how much of the code generation decision process can be deduced from a description of the architecture? Multiple axes can be explored here, such as (i) formalizing the right abstraction level for describing a schedule on a generic architecture; (ii) checking the coherency of this schedule; (iii) generating constraints that define a search space of possible schedules.
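
As a rough sketch of axes (ii) and (iii), the following plain-Python example derives constraints on matrix-multiplication tile sizes from a machine description and enumerates the resulting search space. The `ArchDescription` fields, the two constraints and the function names are assumptions made for illustration; they do not describe how XTC represents architectures.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ArchDescription:
    """Hypothetical machine description; field names are illustrative only."""
    vector_lanes: int      # number of elements per vector register
    cache_bytes: int       # capacity of the targeted cache level
    elem_bytes: int = 4    # size of one tensor element


def is_coherent(arch, ti, tj, tk):
    """Check a matmul tile (ti x tj x tk) against constraints derived from arch."""
    # The vectorized dimension must be a multiple of the vector width.
    if tj % arch.vector_lanes != 0:
        return False
    # The tiles of A (ti x tk), B (tk x tj) and C (ti x tj) must fit in the cache.
    footprint = (ti * tk + tk * tj + ti * tj) * arch.elem_bytes
    return footprint <= arch.cache_bytes


def search_space(arch, candidates):
    """Enumerate all candidate tile sizes that satisfy the constraints."""
    return [
        (ti, tj, tk)
        for ti, tj, tk in product(candidates, repeat=3)
        if is_coherent(arch, ti, tj, tk)
    ]
```

Changing only the `ArchDescription` instance changes the set of admissible schedules, which is the kind of factorization this direction aims for.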

Collaboration:

Guillaume Iooss will be the main person collaborating with the candidate.
In practice, the candidate will interact with the entirety of the CORSE team, especially the people with neighboring subjects.

Main activities
- Mathematical formalization
- Learning about compilation techniques and hardware architecture
- Implementing new algorithms/compilation passes in XTC
- Learning about research in general (including paper writing and presentation)

Skills

Required technical knowledge:
- Knowledge of compilation and linear algebra.
- Basic knowledge of the Unix environment, git, etc. (enough to develop code in this environment).
- Programming languages: Python, basic C/C++, LaTeX (for scientific writing).

Required languages:
- Proficiency in English.
- A good level in French would be helpful (but not compulsory).

The candidate will have opportunities to learn the following competencies:
- Compilation for high-performance computing
- Polyhedral compilation
- Architectural knowledge about CPU, GPU and tensor accelerators

Benefits

- Subsidized meals
- Partial reimbursement of public transport costs
- Leave: 7 weeks of annual leave + 10 extra days off under RTT (statutory reduction in working hours, full-time basis) + possibility of exceptional leave (e.g. sick children, moving home)
- Possibility of teleworking (90 flexible days per year) and flexible organization of working hours
- Social, cultural and sports benefits (Association de gestion des œuvres sociales d'Inria)
- Access to vocational training
- Employer contribution to health and provident insurance (subject to conditions)

Remuneration

From 2,692 € gross salary / month (depending on experience and qualifications).
