DataScience Lab
News / info
- We still need more volunteers to present at the first A2 session (preliminary presentations) on November 5. Please contact Constant or Alexandre if you are interested. If there are not enough volunteers, we will randomly select some groups.
(Tentative) planning for the year
Note: A1 = assignment 1, Ax = assignment x.
| Date | Description |
|---|---|
| September, 21 | - NO CLASS - |
| October, 1 | Class intro + Intro A1 |
| October, 7 | - NO CLASS - |
| October, 15 | Preliminary presentations A1 |
| October, 21 | Deadline A1 23h59 |
| October, 22 | Final presentations A1. Intro A2 |
| October, 29 | Alexandre's presentation on PR + group session |
| November, 5 | Preliminary presentations A2 |
| November, 12 | - NO CLASS - |
| November, 17 | Deadline A2 23h59 |
| November, 18 | Final presentations A2 + Intro A3 |
| November, 20 | Lucas' presentation + group session |
| November, 27 | - NO CLASS - |
| December, 2 | Deadline A3 23h59 |
| December, 3 | Preliminary + final presentation A3 |
Assignment 1
Links
- Slides assignment 1: here
- GitHub classroom link: here. If you can't find your name, come to me.
- Testing datasets are available here.
- Testing platform: here.
Refs
- Recommender Systems Survey Latent Vector (link)
- Recommender Systems: The Textbook by Charu C. Aggarwal (read the section about MF)
- Generalized Principal Component Analysis by René Vidal, Yi Ma, S. Shankar Sastry (link)
- Deep Matrix Factorization by Xue et al. (link)
- Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering (link)
- Learning to Match via Inverse Optimal Transport (link)
- Neural Graph Collaborative Filtering (link)
Assignment 2
Links
- Slides assignment 2: here
- Slides on precision/recall: here
- GitHub classroom link: here
- Testing platform: here
Refs
Available in the slides on precision/recall: here
Groups Registered for Preliminary Presentation
1 - Tzatziki
2 - 3-bet_light
3 - Shape_Shifters
4 - GAN_gsters
5 - Crous de Châtelet
6 - OrGan
7 - Hero to Zero
8 - AriGAN
(Note that the order will be randomized on the day of the presentations)
HowTo
Group sessions
How it is supposed to work:
- Students describe their plan/idea/readings/experiments and ask questions;
- Professors answer questions when they can.
How it is not supposed to work:
- Professors explain to students how to conduct their project.
Class presentations
- For each assignment, each group is expected to give exactly one presentation (either a preliminary presentation or a final presentation).
- The presentations must be uploaded on the Git repository at the start of the class (no email).
- The presentations must be in PDF format and named slides.pdf.
- The order of presentations will be randomly determined at the start of the class.
Preliminary presentations
- 6 minutes (~ 6-8 slides)
- Briefly & clearly state the problem you are working on
- Present and compare approaches you are considering
- Describe what you have implemented (briefly)
- Discuss possible experiments and evaluation metrics
- Present preliminary results if you have any
Final presentations
- 6 minutes (~ 6-8 slides)
- State the problem you studied
- Compare approaches
- Describe what you implemented
- Discuss metrics
- Show and discuss experimental results
Reports
- 1 front page with student names, team name, and optional project title
- 5 extra pages max (refs not included, figures included)
- PDF file named report.pdf on the Git repository by the deadline (no email)
- Include: implemented items + file paths, experiments with conclusions, lessons learned
- Exclude: long theory descriptions; extensive code listings (brief pseudocode is ok)
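The file-naming rules above (slides.pdf for presentations, report.pdf for reports, plus the requirements.txt needed by the testing platform) can be checked locally before pushing. The following is only a minimal sketch: the function name `missing_deliverables` is hypothetical, and it assumes all three files sit at the root of the repository, which this page does not specify.

```python
from pathlib import Path

# Assumption: all three files live at the repository root; adjust the
# names or paths if your assignment repository uses a different layout.
EXPECTED_FILES = ("slides.pdf", "report.pdf", "requirements.txt")


def missing_deliverables(repo_root, expected=EXPECTED_FILES):
    """Return the expected file names that are absent or empty under repo_root."""
    root = Path(repo_root)
    missing = []
    for name in expected:
        path = root / name
        # A zero-byte file usually means an upload or export went wrong.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    problems = missing_deliverables(".")
    if problems:
        print("Missing or empty:", ", ".join(problems))
    else:
        print("All expected files are present.")
```

Run it from the repository root before the deadline. It only checks that the files exist and are non-empty; it does not check the page limit or the PDF content.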
FAQ
Can I develop approach X (a method not discussed in class)?
You are encouraged to study & implement something not discussed in class, as long as it addresses the target problem. Comparing a known approach with a novel one is typically valuable.
Is it mandatory to use the dataset or metric specified by the professors?
Run at least one experiment with the specified dataset and metric so that your results are comparable, but feel free to explore other datasets/metrics to better understand your method’s behavior.
Do I have to work with a virtual environment?
It is not mandatory for your work, but it is good practice, and we use one to run the testing platform.
Therefore, you should at least provide a requirements.txt file listing the required packages.
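One possible way to produce such a list is to dump the packages installed in the environment you developed in. The sketch below uses the standard-library `importlib.metadata` module. The function name `pinned_requirements` is hypothetical, and pinning exact versions is an assumption, not a requirement stated on this page.

```python
from importlib.metadata import distributions


def pinned_requirements():
    """Return sorted 'name==version' lines for packages in the current environment."""
    lines = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata has no name
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    # Print to stdout so the output can be redirected into requirements.txt.
    for line in pinned_requirements():
        print(line)
```

Running this inside the project's virtual environment and redirecting the output to requirements.txt gives a result close to that of `pip freeze`. Trimming the list to the packages your code actually imports keeps installation on the testing platform faster.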
I don't have enough computing power.
Consider cloud notebooks (e.g., Colab) or Mesonet to access more resources. To use Mesonet, you need to create an account with your Dauphine email and request access from Constant Bourdrez at name.lastname@ens.fr.