Abstract
Concern over blockchain scalability has motivated many studies on storage management for consortium blockchains. Most of the proposed solutions, however, rely on off-chain storage strategies such as the InterPlanetary File System and cloud storage. Although off-chain approaches can mitigate the scalability issues of blockchain storage, moving data off the chain compromises the benefits of blockchain technology and can introduce new issues regarding the security and maintainability of the off-chain data. In this thesis, we propose a novel scalable storage solution for managing blockchain data in a consortium blockchain network. To reduce the storage burden on most peers in the network, we designate network nodes as either super peers or regular peers, where super peers have greater resources and computing power. In our approach, regular peers maintain only a lightweight blockchain, called the current blockchain. The current blockchain can be split so that its old data is transferred to a historical blockchain, thereby reducing the size of the current blockchain by half. When the current blockchain has grown again after a given period of time, it can be split again, so that multiple historical blockchains are generated over time. Super peers maintain both the current blockchain and the historical blockchains, while regular peers retrieve historical data by querying the super peers. We present procedures for generating historical blockchains, dynamically balancing the data retrieval workload of super peers, and concurrently retrieving historical blockchain data in response to queries. To demonstrate the feasibility and effectiveness of our approach, we provide a case study of storing healthcare big data using a consortium blockchain. The simulation results show that our scalable storage solution supports efficient on-chain access to and sharing of big data in a consortium blockchain network.
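The splitting idea summarized above can be illustrated with a minimal sketch. It is not the procedure developed in the thesis: the `Block` layout, the helper names (`build_chain`, `split_chain`, `is_linked`, `find_block`), and the assumption that the first block of the new current blockchain keeps pointing at the hash of the last historical block are all hypothetical choices made for illustration only.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical minimal block: index, previous hash, and a payload."""
    index: int
    prev_hash: str
    payload: str

    @property
    def hash(self) -> str:
        data = f"{self.index}|{self.prev_hash}|{self.payload}".encode()
        return hashlib.sha256(data).hexdigest()


def build_chain(payloads):
    """Build a hash-linked chain from a list of payload strings."""
    chain, prev = [], "0" * 64
    for i, payload in enumerate(payloads):
        block = Block(i, prev, payload)
        chain.append(block)
        prev = block.hash
    return chain


def split_chain(current):
    """Move the older half of the blocks to a historical chain.

    Returns (historical, new_current). Assumption: the blocks are not
    rewritten, so new_current[0].prev_hash still refers to the last
    historical block.
    """
    mid = len(current) // 2
    return current[:mid], current[mid:]


def is_linked(chain):
    """Check that every block points at the hash of its predecessor."""
    return all(b.prev_hash == a.hash for a, b in zip(chain, chain[1:]))


def find_block(index, current, historical_chains):
    """Look up a block by index in the chains a peer stores locally.

    A regular peer would pass an empty list of historical chains; a
    super peer would pass every historical chain it maintains.
    """
    for chain in [current, *historical_chains]:
        for block in chain:
            if block.index == index:
                return block
    return None
```

In this sketch a regular peer that holds only the current blockchain gets `None` for an old block index and would have to send the query to a super peer, which searches the historical chains as well.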