A deep learning approach to building budget-constrained models for big data analytics: a thesis in Computer Science
Thesis   Open access

Rui Ming
Master of Science (MS), University of Massachusetts Dartmouth
2022
DOI: https://doi.org/10.62791/20203

Abstract

With the advent of big data and deep learning technologies, the need to reduce the cost of big data analytics has become increasingly urgent. Deep learning methods require collecting many different input features for accurate model training and prediction, and each feature incurs costs for data collection, maintenance, and storage. Since budgets in deep learning applications are always limited, it is crucial to reduce data costs by selecting a subset of the available input features while maintaining sufficiently high model accuracy. Because removing features usually reduces accuracy, the practical goal is often a budget-constrained model with reasonable accuracy. In this thesis, we introduce an approach to finding and eliminating features that have less impact on the outputs of big data analysis using Deep Neural Networks. In our approach, we identify the weak links in a trained model based on predefined thresholds. When all output links of a neuron are identified as weak links, the neuron is considered to have minimal impact on the outputs; therefore, it is identified as a weak neuron. Our approach starts with the last hidden layer, which most directly affects the output neurons, and then works backward to determine the weak links and weak neurons. This process is repeated until we find weak input neurons, which correspond to less important features to be eliminated. As the trade-off between budget and predictive accuracy is often a difficult decision, we provide a variety of optional models by generating a list of budget-constrained models with multiple expected predictive accuracies, sorted by predefined budget levels. This list can be used to choose the budget-constrained model with the best predictive accuracy under a given budget, or to help a user make a better trade-off between budget and model accuracy.
The experimental results show our approach is feasible and supports user selection of a suitable budget-constrained model within a given budget.
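
The backward pass described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation. The abstract does not specify how weakness propagates between layers, so the sketch assumes that a link is weak when its absolute weight falls below the layer's threshold or when it feeds a neuron already marked weak. The function name `find_weak_inputs`, the weight layout, and the example values are all hypothetical.

```python
def find_weak_inputs(weights, thresholds):
    """Return the sorted indices of weak input neurons.

    weights[l][i][j] is the weight of the link from neuron i in layer l
    to neuron j in layer l + 1 (hypothetical layout for this sketch).
    thresholds[l] is the predefined magnitude below which a link in
    weights[l] counts as weak.
    """
    # Output neurons are never weak, so the pass starts with an empty set.
    weak_next = set()
    # Start at the last hidden layer and work backward to the inputs.
    for layer in range(len(weights) - 1, -1, -1):
        weak_here = set()
        for i, row in enumerate(weights[layer]):
            # Assumption: a link is weak if its magnitude is below the
            # threshold or if it feeds a neuron already marked weak.
            all_links_weak = all(
                abs(w) < thresholds[layer] or j in weak_next
                for j, w in enumerate(row)
            )
            # A neuron is weak when every one of its output links is weak.
            if all_links_weak:
                weak_here.add(i)
        weak_next = weak_here
    return sorted(weak_next)


# Toy network: 3 input features, 2 hidden neurons, 1 output neuron.
example_weights = [
    [[0.9, 0.01], [0.02, 0.8], [0.01, 0.7]],  # input -> hidden
    [[1.2], [0.03]],                          # hidden -> output
]
weak_features = find_weak_inputs(example_weights, [0.05, 0.05])
```

In this toy network, hidden neuron 1 is weak because its only output link is below the threshold. Input features 1 and 2 are then weak as well, since each of their links is either small or leads to that weak neuron. Raising the thresholds marks more features as weak, which is one way the candidate models for different budget levels could be generated.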
PDF: Ming R. COE MS Thesis 2022 (1.26 MB)
CC BY-NC-ND V4.0 Open Access
