Emilie Morvant
Assistant Professor (Maître de Conférences) in Machine Learning
UPDATE: Position filled!!!
A fully funded 3-year Ph.D. Thesis Position
Title: Annotations transfer in a domain adaptation framework
- Keywords: Machine Learning, Information Retrieval, Transfer Learning, Representation Learning
- Supervisors: Emilie Morvant (LaHC, Saint-Etienne) and Massih-Reza Amini (LIG, Grenoble)
- Location: Hubert Curien Laboratory, Saint-Etienne and Laboratoire d'Informatique de Grenoble
- Starting date: September or October 2015
- Application deadline: May 18th, 2015
- Decision announcement date: June 15th, 2015
- Description:
With the expansion of the web, a wealth of data is now available, and many applications rely on supervised machine learning methods that can take different information sources into account. However, such methods depend on the availability of annotated data, which can be difficult and costly to obtain. The objective of this thesis is to tackle the problem of transferring annotations from one or more source datasets to a non-annotated target dataset: the goal is to learn a model for the target dataset using the source annotations. This problem is known as domain adaptation, and one solution consists in (a) finding a common representation space for the source and target data; (b) learning a well-performing model in this space; (c) applying the model to new target data.
From a theoretical standpoint, the guarantees of learning a good model are usually not precise. As a consequence, there is no principled way to validate the chosen representation space and the learned model. The first objective of this thesis is to exploit the recent PAC-Bayesian domain adaptation framework to propose new theoretical analyses that take into account (1) the representation space explicitly and (2) the dependencies between the features of the considered data.
As a practical application of our new theory, this thesis will tackle domain adaptation for information retrieval tasks. A typical example is the problem of learning the parameters of models on a new collection, made up of a set of documents and a set of queries, for which no relevance judgements are available. Rather than building relevance judgements for the new collection, we will exploit already annotated data to learn the best values of the parameters of the information retrieval models on the target dataset. This scenario is common in information retrieval, but also in other domains such as text or image classification, where new collections need to be classified into existing taxonomies even though no annotation is available for these new collections.
- Profile:
For this position, we are looking for highly motivated people with a passion for machine learning and the skills to develop prediction algorithms for real-life applications. We are looking for an inquisitive mind with the curiosity to use a new and challenging technology. The applicant must have a Master of Science in Computer Science, Statistics, or a related field, possibly with a background in information retrieval and/or optimization. The working language in the lab is English; good written and oral communication skills are required.
- Contact and Application:
The application should include (in a single PDF file):
- Letter of intent
- Grades and ranking during Master 1 and Master 2
- Scientific CV
- List of publications (if any)
- Referees
Applications must be sent before May 18th, 2015; candidates are encouraged to apply earlier.
Applications should be sent to both of the following addresses:
emilie.morvant[AT]univ-st-etienne.fr
AND massih-reza.amini[AT]imag.fr