MIDI Team (Multimedia Indexation and Data Integration)

Team leader

Dimitris Kotzinos (PU)

Members (June 1, 2016)

Boris BORZIC (IR), Tuyet Trâm DANG NGOC (MCF), Philippe-Henri GOSSELIN (PU), Dimitris Kotzinos (PU), Tao-Yuan JEN (MCF), Michel JORDAN (IGR), Dominique LAURENT (PU emeritus), Claudia MARINICA (MCF), David PICARD (MCF), Hedi TABIA (MCF), Dan VODISLAV (PU), Son VU (MCF)


Research axes

The work of the MIDI team (Multimedia Indexation and Data Integration) covers indexing, search, and mining in large volumes of data, ranging from relational databases to heterogeneous Web data (XML, RDF, information streams) and multimedia data (images, videos, 3D objects).

The team’s activity is organized around two axes:

  • Large-scale data

    This axis addresses problems of large-scale Web data integration (XML documents, information streams and social networks, RDF/open data, multimedia data), as well as mining in data warehouses (frequent pattern mining) and in social network graphs.

  • Multimedia retrieval systems

    This axis addresses the extraction of visual content descriptors from multimedia documents (images, videos, and 3D objects), the indexing of large multimedia document collections, and statistical learning for search in these collections.

The MIDI team is strongly involved in the LabEx PATRIMA (2011-2020) and the associated EquipEx PATRIMEX, and in several projects in collaboration with the PRISM laboratory (UVSQ) and with several laboratories and cultural institutions (BnF, Musée Rodin, the Centre de recherche du Château de Versailles, the Archives Nationales, etc.). The team’s objective in this context is to build a research hub for cultural heritage data management.

In 2013, the MIDI team organized the thematic cycle “Open Data for Cultural Heritage” (« Données ouvertes pour le patrimoine culturel »), under the aegis of the Institut des Etudes Avancées of the Université de Cergy-Pontoise and in collaboration with the FORTH-ICS institute (Greece). The cycle included the international workshop WOD 2013 and a series of tutorials on open data.

The team is involved in several ongoing research projects:

  • Investissements d’avenir / Fonds pour la Société Numérique projects: Culture 3D Clouds (2012-2015) and TerraRush (2012-2014)
  • GOD project - STIC Asie (2013-2015)
  • PATRIMA projects: EDOP (2012-2015), VERSPERA (2012-2015)


Best paper awards and invited speakers

The team received two best paper awards during the period 2014-2018.

Dimitris Kotzinos (6 times) and Dominique Laurent (3 times) were invited speakers at international conferences and workshops.


The RETIN platform brings together and gives concrete form to the team’s research on content-based multimedia analysis and search. It consolidates this research, allows the available capabilities to be demonstrated, and has proven valuable in attracting potential external partners.

"Data analytics" Chair

The team’s newest highlight is the “Data Analytics” Chair, a contribution from the company QWANT to support research in diverse areas along the team’s primary research axes.

The project draws on the expertise of several team members, since its goals are threefold: applying content-based multimedia search to QWANT’s online search engine; using the team’s work on preference-based personalized content assignment to understand users better and assign content to them more accurately; and working on data privacy, so that users of online search engines can search with various types and levels of guarantees. There is also the potential to turn this work into a startup in order to consolidate the results professionally, which would offer a different opportunity to engage with various aspects of knowledge exploitation.

The EU H2020 ANIMA project

The EU H2020 ANIMA project studies the quality of life of people living around airports, focusing in particular on the role of noise. The project gathers 22 partners from 11 countries for 48 months and attracted €7.45 million in EU contribution.

The main contribution of the team members to this project is large-scale opinion surveying to understand how people perceive their quality of life. This is done by exploiting social network platforms and combining the results with external data in order to understand how this perception actually materializes. The project demonstrates the team’s ability to participate in research at the EU level and to launch large-scale collaborations. It also demonstrates a successful collaboration inside the lab between members of the MIDI and Neuro teams.


The VERSPERA project (Numérisation et modélisation des plans de Versailles sous l’Ancien Régime), funded by the PATRIMA foundation, is an exemplary case of cooperation between partners from the cultural heritage domain and computer scientists. It has so far produced results that are directly usable, well received, and of both practical and theoretical value. The transformation of the old plans of various buildings into usable information has research, aesthetic, and practical value. The software produced is directly usable in other cases of 3D building reconstruction and has already been tested for this purpose. The project and its results have also been used for educational purposes, making it an excellent case of teaching through research. Additionally, it received widespread coverage in the popular press.

Institute for digital humanities

In 2017, the University of Cergy-Pontoise created an Institute for Digital Humanities; the MIDI team is the main ETIS research team belonging to this Institute.

Research platforms

To support the work of its members in the long term, the team has invested in the creation of three platforms:

  • Platform RETIN, which implements in a single, compact form the team’s work on multimedia description and analysis, multimedia classification and indexing, and machine learning for multimedia search;
  • Platform ARAV3D, which supports research on the acquisition of 3D models and on virtual and augmented reality experiments. It is used for experimentation in areas such as 3D modeling and facial recognition;
  • Platform MIDI cloud, a new platform in the process of becoming operational, which allows team members to move parts of their research on top-k queries, graph summarization, and data privacy processing onto parallel and distributed architectures.