The world has seen rapid growth in technology and in the volume of data, at entirely new scales and levels of complexity. The field of data mining needs new tools to meet the challenges of analyzing big data across a wide range of domains. Our work on new graph compression and embedding methods in graph mining, which aims to make large graphs easier to manage, query, and display, is critical for finding meaningful patterns in enormous real-world networks. We also develop novel graph and text mining methods to extract useful information from social media data. Additionally, we develop data mining and network-based learning methods to construct different network structures from biomedical relational data. Our goal is to help combat public health problems, such as opioid addiction, Covid-19, and drug-drug interactions, by providing government officials and health administrators with timely and useful information.

Details on our projects are given below.

Graph Compression for Expediting Graph Analysis
The overall objective of this project is to develop new graph processing methods, based on graph compression, that make it possible to analyze large graphs with existing solutions. Graph compression aims to create a smaller graph that loses little or no information about the original graph and preserves its key network properties. The rationale underlying the proposed research is that the compressed graph retains the crucial information from the original large graph while eliminating redundant information. This makes it possible to analyze and visualize large graphs more efficiently with existing algorithms, without reducing their effectiveness.
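As a concrete illustration of removing redundancy without losing information, the sketch below merges nodes that have exactly the same neighbor set into a single supernode. This is one simple, well-known lossless scheme chosen for illustration, not necessarily the method developed in this project; the function name `compress_by_twin_nodes` and the toy star graph are assumptions.

```python
from collections import defaultdict


def compress_by_twin_nodes(adj):
    """Merge nodes with identical neighbor sets into supernodes.

    adj: dict mapping node -> set of neighbors (undirected, no self-loops,
         sortable node labels). Hypothetical input format for this sketch.
    Returns (supernodes, super_adj):
      supernodes: supernode id -> sorted list of merged original nodes
      super_adj:  supernode id -> set of neighboring supernode ids
    """
    # Group nodes by their exact neighbor set.
    groups = defaultdict(list)
    for node, nbrs in adj.items():
        groups[frozenset(nbrs)].append(node)

    # Assign deterministic supernode ids.
    supernodes = {}
    member_of = {}
    ordered = sorted(sorted(members) for members in groups.values())
    for sid, members in enumerate(ordered):
        supernodes[sid] = members
        for m in members:
            member_of[m] = sid

    # Two supernodes are adjacent if any of their members are adjacent.
    super_adj = {sid: set() for sid in supernodes}
    for node, nbrs in adj.items():
        for n in nbrs:
            super_adj[member_of[node]].add(member_of[n])
    return supernodes, super_adj


# Toy example (made-up data): a star with center "c" and three leaves.
star = {"c": {"a", "b", "d"}, "a": {"c"}, "b": {"c"}, "d": {"c"}}
supernodes, super_adj = compress_by_twin_nodes(star)
print(supernodes)  # {0: ['a', 'b', 'd'], 1: ['c']}
print(super_adj)   # {0: {1}, 1: {0}}
```

Because all members of a supernode share the same neighbors, the original edges can be recovered exactly: two original nodes are adjacent if and only if their supernodes are adjacent. Existing algorithms can then run on the smaller graph.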

Distributed Graph Mining
Complex data are now naturally modeled and represented as graphs, with many network applications such as community detection and graph classification. Several index structures have been proposed to store graphs so that they can be processed more efficiently. Even with an index, however, processing large-scale graphs on a single machine is difficult. For example, Facebook data can be modeled as a graph with more than 1.4 billion vertices (users) and 0.4 trillion edges (relations between users). A graph of this size cannot be processed on a single machine, so distributed computing techniques are needed. In this project, Dr. Akbas studies the distributed graph indexing problem: creating and using an index of the graph in a distributed system so that many graph algorithms can be applied to large-scale graphs.

Graph Embedding
Develop new, efficient graph embedding methods that encode the localized structure (and attribute information) of a graph into a continuous vector space.

Scientific Network Mining
Construct paper, venue, and collaborator recommendation systems.

Fingerprint Identification using Topological Data Analysis
Convert large fingerprint image data to small graph data with simplification methods. Extract features from the graph data. Identify distorted fingerprint images from these features using machine learning techniques.

Biomedical Knowledge Mining on Social Networks
The overall objective of this project is to conduct biomedical knowledge mining to track public health issues on social networks. Dr. Akbas also focuses on developing novel graph and text mining methods to extract useful information from social media data. Her goal is to help combat public health problems, such as opioid addiction and Covid-19, by providing government officials and health administrators with timely and useful information.

Knowledge Discovery on Biomedical Data
The goal of this project is to apply machine learning to massive interaction data among biomedical entities such as drugs, diseases, proteins, and side effects. The project constructs different network structures from this biomedical relational data and develops network-based learning methods to address critical problems in biology and medicine.

DDI Prediction
We are conducting multiple projects on drug-drug interaction (DDI) prediction using different graph-based methods. In one project, our method considers drugs together with other biomedical entities, such as proteins, pathways, and side effects. We design a heterogeneous information network (HIN) to model the relations between these entities. We then extract the rich semantic relationships among these entities using different meta-path-based topological features. The resulting extensive feature set is fed to different classifiers for DDI prediction.
Link to this paper: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9562802
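To make the idea of a meta-path-based feature concrete, the sketch below counts path instances of the form Drug-X-Drug (for example Drug-Protein-Drug) between two drugs, which reduces to counting the entities of type X that both drugs are linked to. Path count is only one simple meta-path feature, used here for illustration; the function names, the relation names, and the toy data are all assumptions and are not taken from the paper.

```python
def metapath_count(drug_a, drug_b, drug_to_entities):
    """Count Drug-X-Drug path instances between two drugs.

    drug_to_entities: dict mapping drug -> set of linked entities of one
    type X (hypothetical input format). Each shared entity is one path.
    """
    shared = drug_to_entities.get(drug_a, set()) & drug_to_entities.get(drug_b, set())
    return len(shared)


def pair_features(drug_a, drug_b, relations):
    """Build one feature per relation type, in sorted relation-name order.

    relations: dict mapping relation name -> {drug -> set of entities}.
    """
    return [
        metapath_count(drug_a, drug_b, relations[name])
        for name in sorted(relations)
    ]


# Toy HIN (made-up identifiers, for illustration only).
relations = {
    "protein": {"drugA": {"P1", "P2"}, "drugB": {"P2", "P3"}},
    "side_effect": {"drugA": {"S1", "S2"}, "drugB": {"S1", "S2"}},
}
print(pair_features("drugA", "drugB", relations))  # [1, 2]
```

A feature vector like this, computed for every candidate drug pair, is the kind of input that can then be passed to a standard classifier.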


In another project, we propose a novel method for predicting DDIs based on the key chemical substructures of drugs, extracted from their SMILES strings. We construct a graph that connects drugs based on their common functional chemical substructures. We then apply different well-known graph neural network (GNN) methods to generate drug embeddings. The embeddings of the two drugs in a pair are concatenated to form the features of that drug pair. Finally, the drug pair features are fed to different machine learning (ML) classifiers for DDI prediction.
Link to this paper: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9562802
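The graph construction and pair-feature steps of the substructure-based approach can be sketched as follows. This is a minimal illustration under stated assumptions: substructures are taken as already extracted from SMILES strings, the embeddings are placeholder vectors rather than GNN outputs, and the function names and toy data are hypothetical.

```python
from itertools import combinations


def build_drug_graph(substructures):
    """Connect two drugs if they share at least one substructure.

    substructures: dict mapping drug -> set of substructure labels
    (assumed to be extracted beforehand). Returns a set of (drug, drug)
    edges, each with its endpoints in sorted order.
    """
    edges = set()
    for a, b in combinations(sorted(substructures), 2):
        if substructures[a] & substructures[b]:
            edges.add((a, b))
    return edges


def drug_pair_features(embeddings, drug_a, drug_b):
    """Concatenate the embeddings of two drugs into one feature vector."""
    return list(embeddings[drug_a]) + list(embeddings[drug_b])


# Toy data (made up): substructure labels and 2-dimensional embeddings.
substructures = {
    "drugA": {"benzene", "amide"},
    "drugB": {"amide"},
    "drugC": {"ester"},
}
embeddings = {"drugA": [0.1, 0.2], "drugB": [0.3, 0.4], "drugC": [0.5, 0.6]}

print(build_drug_graph(substructures))                    # {('drugA', 'drugB')}
print(drug_pair_features(embeddings, "drugA", "drugB"))   # [0.1, 0.2, 0.3, 0.4]
```

Note that plain concatenation is order-dependent: the pair (drugA, drugB) yields a different vector from (drugB, drugA), so a consistent ordering of each pair is needed when building the training data.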

We currently have multiple ongoing projects on DDI prediction that use different GNN variants.