(In English.) It is well known that, in data processing for AI applications, a large part of the effort goes into data preparation, a process that depends heavily on the ability to formulate complex queries and to tune the systems that execute them, so that query execution time stays reasonable. In a wide range of use cases, data preparation involves structured or semi-structured datasets. In this context, the first goal of the course is to enable students to acquire strong skills for the efficient querying of relational databases. After an initial refresher on the standard query language (SQL) and the design of complex queries, the course will present the optimization techniques that relational database management systems adopt to ensure efficient querying. Particular attention will be given to storage and indexing techniques, to algorithms for generating efficient query execution plans, and to the main approaches to database tuning. The course will also discuss how these techniques are transposed to modern systems of the big data ecosystem.
The second goal of the course is to present the foundations and advanced techniques underlying systems for semi-structured data processing. Attention is focused first on the formal specification of query languages and the design of complex queries, as well as on recent static analysis techniques that help users design correct queries, a task that is particularly difficult here because the structure of the datasets can vary widely and unpredictably. Both parts will be supported by books and by scientific articles published in the main database conferences and journals. The acquired notions will be consolidated in several lab sessions.
Bibliography, recommended reading
Database Management Systems, Third Edition,
Raghu Ramakrishnan, Johannes Gehrke.
McGraw-Hill
Relational DBMS Internals.
Antonio Albano, Dario Colazzo, Giorgio Ghelli, Enzo Orsini.
Book available as a PDF.