Emilie Morvant
Assistant Professor (Maître de Conférences) in Machine Learning

Ph.D. Thesis

Title:
Learning Majority Vote for Supervised Classification and Domain Adaptation: PAC-Bayesian Approaches and Similarity Combination

Abstract:

Nowadays, with the expansion of the web, a wealth of data is available, and many applications need supervised machine learning methods able to take into account different information sources. For instance, in multimedia semantic indexing, one has to efficiently take advantage of the color, text, texture, and sound information of a document. Most existing methods try to combine this multimodal information, either by directly fusing the descriptors or by combining similarities or classifiers, in order to produce a classification model that is more reliable for the task at hand. These multimodal facets usually raise two main issues. On the one hand, one has to make correct use of all the a priori information available. On the other hand, the data on which the model will be applied does not come from the same probability distribution as the data used during the learning step. In this context, the model has to be adapted to the new data, a problem known as domain adaptation. In this thesis, we propose several theoretically founded contributions to tackle these issues.
A first series of contributions studies the problem of learning a weighted majority vote over a set of voters in a supervised classification setting. These results fall within the PAC-Bayesian theory, which makes it possible to derive generalization guarantees for such a vote by assuming a prior on the relevance of the voters. Our first contribution extends a recent algorithm, MinCq, which minimizes a bound on the error of the majority vote in binary classification. This extension can take into account an a priori belief on the performance of the voters, expressed as an aligned distribution. We illustrate its usefulness for combining nearest neighbor classifiers [1] and for classifier fusion on a multimedia semantic indexing task [2]. Then, we propose a theoretical contribution for multiclass classification tasks. Our approach is based on an original PAC-Bayesian analysis that considers the operator norm of the confusion matrix as an error measure [3][4].
Our second series of contributions relates to domain adaptation. In this setting, our third contribution combines similarities in order to infer a representation space that brings the learning distribution and the testing distribution closer together. This contribution is based on the theory of learning with good similarity functions and is justified by the minimization of a usual bound in domain adaptation [5]. For our last contribution, we propose the first PAC-Bayesian analysis for domain adaptation. This analysis is based on a consistent divergence measure between distributions, which allows us to derive a generalization bound for learning majority votes in binary classification. Moreover, we propose a first algorithm, specialized to linear classifiers, that directly minimizes our bound [6].
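As an illustration of the weighted majority vote studied in the first series of contributions, the following minimal sketch computes the vote of a few binary voters together with the empirical value of the C-bound, the quantity that MinCq is known in the literature to target. It is not the thesis's algorithm: the function names, the toy votes, and the weights are assumptions made up for this example, and the C-bound formula (one minus the squared first moment of the margin divided by its second moment) is the standard form from the PAC-Bayesian literature rather than something stated on this page.

```python
import numpy as np

def majority_vote(votes, weights):
    """Weighted majority vote (hypothetical helper).

    votes:   array of shape (n_samples, n_voters) with entries in {-1, +1}
    weights: nonnegative voter weights summing to 1 (the distribution Q)
    """
    weighted = votes @ weights            # Q-weighted average vote per sample
    return np.where(weighted >= 0, 1, -1)

def empirical_c_bound(votes, weights, y):
    """Empirical C-bound: 1 - (first moment of margin)^2 / (second moment).

    Only meaningful when the first moment of the margin is positive.
    """
    weighted = votes @ weights
    mu1 = np.mean(y * weighted)           # first moment of the margin
    mu2 = np.mean(weighted ** 2)          # second moment of the margin
    if mu1 <= 0:
        raise ValueError("C-bound requires a positive first moment of the margin")
    return 1.0 - mu1 ** 2 / mu2

# Toy data (assumed for illustration): 4 samples, 3 voters.
votes = np.array([[ 1,  1, -1],
                  [ 1, -1, -1],
                  [-1, -1,  1],
                  [ 1,  1,  1]])
y = np.array([1, -1, -1, 1])
weights = np.array([0.4, 0.35, 0.25])

predictions = majority_vote(votes, weights)
vote_error = np.mean(predictions != y)
c_bound = empirical_c_bound(votes, weights, y)
```

On this toy sample, the majority vote makes no error while the empirical C-bound stays strictly positive, which reflects that the bound depends on the first two moments of the margin rather than on the errors alone.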

Associated Publications:

[1] Learning A Priori Constrained Weighted Majority Votes
Aurélien Bellet ; Amaury Habrard ; Emilie Morvant ; Marc Sebban
Machine Learning Journal (MLJ), 97(1-2):129-154, 2014, DOI: 10.1007/s10994-014-5462-z
[pdf] [published version] [bibtex]
[2] Majority Vote of Diverse Classifiers for Late Fusion
Emilie Morvant ; Amaury Habrard ; Stéphane Ayache
S+SSPR 2014 - IAPR Joint International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR), Joensuu, Finland.
[pdf] [bibtex] [research report arXiv:1207.1019]
[3] PAC-Bayesian Generalization Bound on Confusion Matrix for Multi-Class Classification
Emilie Morvant ; Sokol Koço ; Liva Ralaivola
International Conference on Machine Learning, 2012, Edinburgh, United Kingdom. pp. 815-822
[pdf] [bibtex] [video] [discussion] [research report arXiv:1202.6228]
[4] On Generalizing the C-Bound to the Multiclass and Multi-label Settings
François Laviolette ; Emilie Morvant ; Liva Ralaivola ; Jean-Francis Roy
NIPS 2014 Workshop on Representation and Learning Methods for Complex Outputs, Montréal, Canada.
[pdf] [research report arXiv:1408.1336]
[5] Parsimonious Unsupervised and Semi-Supervised Domain Adaptation with Good Similarity Functions
Emilie Morvant ; Amaury Habrard ; Stéphane Ayache
Knowledge and Information Systems (KAIS), 33(2):309-349, 2012, DOI: 10.1007/s10115-012-0516-7
[pdf] [published version] [bibtex]
[6] PAC-Bayesian Approach for Domain Adaptation with Specialization to Linear Classifiers
Pascal Germain ; Amaury Habrard ; François Laviolette ; Emilie Morvant
International Conference on Machine Learning, 2013, Atlanta, USA
[pdf] [bibtex] [PBDA code]
