Abstract
Multi-view machine learning faces a persistent challenge: understanding and exploiting the inherent structure of multi-view data. This research presents two contributions: Multimodality-Enhanced Graph Generation (MEGG) and Multimodality-Driven GCN (MDGCN). Both are designed to apply across a range of multi-view data scenarios. MEGG fuses "within" and "between" graphs into a single representation of the data that draws on complementary information from multiple modalities. The fused graph serves as input to MDGCN, a custom architecture for multi-view data analysis that combines graph convolutional layers with fully connected layers for modeling and classification. In addition, this work compares different fusion strategies and examines their impact on multi-view GCN methods, including an assessment of existing graph fusion methods and GCN architectures for multi-view data and their relative strengths and weaknesses. The methods are trained and evaluated on two distinct datasets: the MNIST dataset and the Brain Activity Dataset. In comparative analyses against traditional models, namely K-Nearest Neighbors (KNN), Support Vector Machines (SVM), Neural Networks (NN), and Decision Trees, the proposed methods achieve superior performance. The flexibility of MEGG and MDGCN opens up new possibilities for harnessing the inherent structure of multi-view data in various domains.
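The pipeline summarized above (graph fusion followed by graph convolutional and fully connected layers) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and the abstract does not specify how the graphs are built or fused. The sketch assumes, purely for illustration, that each "within" graph is a k-nearest-neighbour graph on one view, that the "between" graph is a k-nearest-neighbour graph on the concatenated views, that fusion is a weighted average of adjacency matrices, and that propagation uses the standard symmetric normalization with self-loops. All function names (`knn_graph`, `fuse_graphs`, `normalize_adjacency`, `mdgcn_like_forward`) are hypothetical, and the weights are random rather than trained.

```python
import numpy as np


def knn_graph(X, k):
    """Binary, symmetric k-nearest-neighbour adjacency over the rows of X."""
    n = X.shape[0]
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)  # exclude self-matches
    nearest = np.argsort(dist, axis=1)[:, :k]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(A, A.T)  # symmetrise


def fuse_graphs(graphs, weights=None):
    """Assumed fusion rule: weighted average of adjacency matrices."""
    graphs = np.stack(graphs)
    if weights is None:
        weights = np.full(len(graphs), 1.0 / len(graphs))
    return np.tensordot(np.asarray(weights), graphs, axes=1)


def normalize_adjacency(A):
    """Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def mdgcn_like_forward(A_norm, X, W1, W2, W_fc):
    """Two graph convolutional layers followed by one fully connected layer."""
    H = np.maximum(A_norm @ X @ W1, 0.0)  # graph convolution + ReLU
    H = np.maximum(A_norm @ H @ W2, 0.0)  # graph convolution + ReLU
    logits = H @ W_fc                     # fully connected classifier
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)  # row-wise softmax


# Toy data: 20 samples observed through two synthetic views.
rng = np.random.default_rng(0)
view_a = rng.normal(size=(20, 8))
view_b = rng.normal(size=(20, 5))
X = np.hstack([view_a, view_b])

within_a = knn_graph(view_a, k=3)  # assumed "within" graph for view A
within_b = knn_graph(view_b, k=3)  # assumed "within" graph for view B
between = knn_graph(X, k=3)        # assumed "between" graph across views
fused = fuse_graphs([within_a, within_b, between])

W1 = rng.normal(scale=0.1, size=(13, 16))
W2 = rng.normal(scale=0.1, size=(16, 16))
W_fc = rng.normal(scale=0.1, size=(16, 4))
probs = mdgcn_like_forward(normalize_adjacency(fused), X, W1, W2, W_fc)
print(probs.shape)  # (20, 4)
```

Averaging adjacency matrices is only one possible fusion strategy. Passing a different `weights` vector to `fuse_graphs`, or replacing the function entirely, is one way to experiment with the alternative strategies that the abstract says are compared.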