My main research interests and contributions focus on devising new machine learning algorithms for interactomics and transcriptomics. The main applications are to general problems in these fields, to their role in finding biomarkers in breast and prostate cancer, and, in a separate area, to their role in oral fluids. Below is a summary of my interests and contributions in the respective fields. For more details on my research contributions, refer to my List of Publications.
The interactome involves proteins and associated molecules interacting in a living system. The interactome is rather dynamic, as the interactions, and ultimately the proteins' functions, are manifested in a temporal and spatial manner. To understand the complex cellular mechanisms involved in a biological system, it is necessary to study the nature and specificity of these interactions and their dynamics at the molecular level, for which prediction of protein-protein interactions (PPIs) has played a significant role. In a broad sense, my main research in interactomics aims to develop machine learning algorithms for the prediction and analysis of PPIs from high-throughput data, and to understand the dynamic aspects of these interactions and their relationships with genomic and transcriptional features. One of the key issues I am currently investigating is the integration of transcriptomics data from RNA-seq with interactomics, in applications to the identification of biomarkers that will help understand the transcriptional and genetic mechanisms involved in the development of prostate and breast cancer. Another aspect of protein interactions that my lab is currently working on is calmodulin-binding proteins and their interactions with other proteins. More information about my research in this field can be found on the Interactomics page.
The transcriptome represents the repertoire of transcripts in an organism as the main product of DNA transcription and splicing. The human genome comprises 3 billion bases in each of the (on average) 10^14 cells in one body, where each cell may contain up to 300k RNA molecules. The full transcriptome may then contain approximately 8.4 × 10^23 RNA bases... in one body! One cell line/condition, expressed as transcriptomics data, could imply 30Mb of microarray data or 30Gb of RNA-seq data, while for all cells in one body the figure could grow to petabytes or exabytes.
Transcriptomics studies have traditionally been carried out using microarray technologies and, more recently, using the emerging next-generation sequencing techniques known as RNA-seq. My main research in transcriptomics has centered on the main aspects of microarray data analysis, with emphasis on DNA microarray image gridding and segmentation, as well as gene selection, biomarker detection and clustering of time-series gene expression data.
Currently, my research focuses on RNA-seq data analysis. It aims at finding relevant enriched regions in RNA-seq and ChIP-seq data and at studying the underlying mechanisms of alternative splicing, its relationship with non-coding RNA, and the translation of splice variants into protein isoforms and associated functions, with applications to the discovery of new biomarkers in breast and prostate cancers. One of the main goals is to understand the relationships among the different protein variants yielded by alternative splicing and their integration with interactomics: the inherent functions that result from protein interactions, the underlying domains and short motifs involved, and the dynamics of the interactome. More details are on the Transcriptomics page.
Computers can learn, much like human beings, from observations, examples, images, sensors, data, and experience. Machine learning is a field of artificial intelligence that involves the design and implementation of algorithms that let computers evolve their behavior from such sources, among many others. My main research in machine learning is on pattern recognition, a field of machine learning that involves a decision-making process based on pre-defined or learned patterns (one of the many possible definitions). The main focus of my research in pattern recognition is to devise efficient algorithms for classification, clustering, feature selection and performance evaluation, with applications to interactomics, transcriptomics and data integration. One of my most recent works involves developing new algorithms for clustering time-series data in transcriptomics and interactomics. More information about my research in this field can be found on the Machine Learning page.
I have also worked on some projects in data security. More details will be added later, both here and on the Data Security page. Prospective students interested in some of the latest projects I am working on should contact me for more details.