Computers can learn much as we do: from observations, examples, images, sensors, data, and experience. Machine learning is a field of artificial intelligence concerned with the design and implementation of algorithms that allow computers to improve their behavior from such sources, among many others. My main research in machine learning is on pattern recognition, a field of machine learning that, by one of its many possible definitions, involves making decisions based on pre-defined or learned patterns.

The main focus of my research in pattern recognition is to devise efficient algorithms for classification, clustering, feature selection, and performance evaluation, with applications to interactomics, transcriptomics, and data integration. Over the past 15 years I have contributed to many application areas (mainly transcriptomics and interactomics), as can be seen below. In fundamental pattern recognition, my work has centered on statistical pattern recognition. A summary of my contributions is given below. More details about this research, as well as the relevant references, will be posted soon.


  • I have proposed a polynomial-time algorithm for finding an optimal multilevel thresholding of irregularly sampled histograms. The framework has important applications in texture recognition and biofilm image segmentation and, in particular, can optimally cluster one-dimensional data in polynomial time; a simplified sketch of this one-dimensional clustering view appears after this list. We have extended this algorithm to a sub-optimal multilevel thresholding algorithm for finding binding sites in ChIP-seq data (see Transcriptomics).
  • We have proposed a new linear dimensionality reduction method that maximizes the Chernoff distance in the transformed space; the criterion being maximized is illustrated after this list. The method has been shown to outperform traditional linear dimensionality reduction schemes such as Fisher's method and approaches based on directed distance matrices.
  • We have found the necessary and sufficient conditions under which Fisher's discriminant analysis is equivalent to Loog and Duin's linear dimensionality reduction method.
  • We have proposed a new family of weak estimators that perform very well on non-stationary data; a minimal sketch of the underlying idea follows the list. The estimators have been applied to adaptive data compression, news classification, and network-level detection of child pornography.
  • We have devised a visualization scheme for analyzing fuzzy-clustered data. The scheme displays the memberships of k-fuzzy-clustered data by projecting them onto a (k-1)-dimensional hypertetrahedron; a sketch of this simplex mapping is given after the list. Applications to fuzzy-clustered microarray data have been demonstrated.
  • I have proposed a model for selecting the threshold in Fisher's classifier. I have also studied the relationship between Fisher's classifier and the optimal quadratic classifier.
  • I have found the necessary and sufficient conditions for selecting the best hyperplane classifier within the framework of the optimal pairwise linear classifier.
  • We have developed the formal theory of optimal pairwise linear classifiers for two classes represented by two-dimensional normally distributed random vectors. The theory has also been extended to d-dimensional normally distributed random vectors, where d > 2.
  • We are currently developing approaches for linear dimensionality reduction (LDR) used for classification and feature selection.
  • We have proposed a new approach that combines non-linear mappings with LDR, with applications to the class-imbalance problems that arise in the prediction of microRNAs and of high-throughput protein-protein interactions.
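
To illustrate the one-dimensional clustering view of multilevel thresholding mentioned in the first item, here is a minimal dynamic-programming sketch that splits sorted 1-D data into k contiguous groups minimizing the total within-group sum of squared errors. It is not the published algorithm, which operates on irregularly sampled histograms and uses a more efficient recurrence; the function name, the O(k·n²) recurrence, and the squared-error criterion are my own illustrative choices.

```python
import numpy as np

def optimal_1d_clustering(values, k):
    """Partition sorted 1-D data into k contiguous clusters that minimize the
    total within-cluster sum of squared errors (illustrative O(k*n^2) DP)."""
    x = np.sort(np.asarray(values, dtype=float))
    n = len(x)

    # Prefix sums allow O(1) within-segment SSE queries.
    ps = np.concatenate(([0.0], np.cumsum(x)))
    ps2 = np.concatenate(([0.0], np.cumsum(x * x)))

    def sse(i, j):  # cost of putting x[i..j] (inclusive) in one cluster
        m = j - i + 1
        s = ps[j + 1] - ps[i]
        return (ps2[j + 1] - ps2[i]) - s * s / m

    # cost[c][j] = best cost of clustering x[0..j] into c+1 clusters
    cost = np.full((k, n), np.inf)
    split = np.zeros((k, n), dtype=int)
    for j in range(n):
        cost[0][j] = sse(0, j)
    for c in range(1, k):
        for j in range(c, n):
            for i in range(c, j + 1):  # x[i..j] forms the last cluster
                cand = cost[c - 1][i - 1] + sse(i, j)
                if cand < cost[c][j]:
                    cost[c][j] = cand
                    split[c][j] = i

    # Backtrack the thresholds (left endpoints of clusters 2..k).
    cuts, j = [], n - 1
    for c in range(k - 1, 0, -1):
        i = split[c][j]
        cuts.append(x[i])
        j = i - 1
    return sorted(cuts), cost[k - 1][n - 1]

# Example: three well-separated groups are recovered by two thresholds.
data = [1.0, 1.2, 0.9, 5.0, 5.1, 4.8, 9.0, 9.2, 8.9]
print(optimal_1d_clustering(data, 3))
```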
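
For the Chernoff-based linear dimensionality reduction, the quantity being maximized is the Chernoff distance between the two class-conditional Gaussians after projection. The sketch below only evaluates that criterion for a given projection matrix A and mixing parameter s; the function names, the default s = 0.5 (the Bhattacharyya case), and the toy data are assumptions of mine, and no optimization over A is shown.

```python
import numpy as np

def chernoff_distance(m1, S1, m2, S2, s=0.5):
    """Chernoff distance k(s) between N(m1, S1) and N(m2, S2), defined by
    integral p1^s * p2^(1-s) dx = exp(-k(s)); s = 0.5 gives the Bhattacharyya distance."""
    d = m1 - m2
    S = (1 - s) * S1 + s * S2
    quad = 0.5 * s * (1 - s) * d @ np.linalg.solve(S, d)
    logdet = lambda M: np.linalg.slogdet(M)[1]
    logterm = 0.5 * (logdet(S) - (1 - s) * logdet(S1) - s * logdet(S2))
    return quad + logterm

def projected_chernoff(A, m1, S1, m2, S2, s=0.5):
    """Chernoff distance after the linear projection y = A^T x (A is d x p)."""
    return chernoff_distance(A.T @ m1, A.T @ S1 @ A,
                             A.T @ m2, A.T @ S2 @ A, s)

# Toy usage: a random rank-2 projection of two 5-D Gaussians.
rng = np.random.default_rng(0)
m1, m2 = np.zeros(5), np.ones(5)
S1, S2 = np.eye(5), 2.0 * np.eye(5)
A = rng.standard_normal((5, 2))
print(projected_chernoff(A, m1, S1, m2, S2, s=0.5))
```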
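
The weak estimators are designed to track distributions that drift over time. The following sketch conveys the flavour with a simple multiplicative-discounting update for multinomial probabilities: the estimate is repeatedly shrunk toward the latest observation so that old data are gradually forgotten. The function name and the discount parameter lam are mine, and this is a schematic in the spirit of such estimators rather than a transcription of the published scheme.

```python
import numpy as np

def track_multinomial(observations, k, lam=0.95):
    """Track the parameters of a (possibly non-stationary) multinomial source
    with a multiplicative-discounting ("weak") update."""
    p = np.full(k, 1.0 / k)            # start from the uniform distribution
    history = []
    for x in observations:             # x is a symbol in {0, ..., k-1}
        e = np.zeros(k)
        e[x] = 1.0
        p = lam * p + (1.0 - lam) * e  # exponential forgetting of the past
        history.append(p.copy())
    return np.array(history)

# Toy usage: the source switches regimes halfway through the stream.
rng = np.random.default_rng(1)
stream = np.concatenate([rng.choice(3, 500, p=[0.7, 0.2, 0.1]),
                         rng.choice(3, 500, p=[0.1, 0.2, 0.7])])
est = track_multinomial(stream, k=3, lam=0.95)
print(est[499], est[-1])               # estimates before and after the switch
```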
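
The fuzzy-clustering visualization maps each object's membership vector to a point of a regular (k-1)-simplex (a triangle for k = 3, a tetrahedron for k = 4), so that an object lying near a vertex belongs strongly to one cluster. A minimal sketch of that barycentric mapping for k = 3 follows; the vertex placement and the names are my own choices.

```python
import numpy as np

def memberships_to_simplex(U, vertices):
    """Map fuzzy membership vectors (rows of U, each summing to 1) to points
    in the simplex spanned by `vertices` via barycentric interpolation."""
    U = np.asarray(U, dtype=float)
    return U @ vertices          # each point is a membership-weighted vertex average

# Vertices of an equilateral triangle: the 2-simplex used when k = 3.
triangle = np.array([[0.0, 0.0],
                     [1.0, 0.0],
                     [0.5, np.sqrt(3) / 2]])

# Toy memberships: one object per cluster plus one genuinely "fuzzy" object.
U = np.array([[0.90, 0.05, 0.05],
              [0.10, 0.80, 0.10],
              [0.05, 0.05, 0.90],
              [0.34, 0.33, 0.33]])
print(memberships_to_simplex(U, triangle))
```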