A fundamental issue in blockchain systems is their scalability in terms of data storage, computation, communication, and security. A promising research direction for resolving this issue is coding theory, which is widely used for distributed storage, for recovery from erasures or channel errors, and for reducing communication cost. To this end, this article provides the first comprehensive survey of approaches that employ coding theory to scale blockchain systems. It shows how the use of coded symbols or shards allows participants to store only a fraction of the total blockchain, protect against malicious nodes or erasures, ensure data availability in order to promote transparency, and scale the security of sharded blockchains. Further, coded symbols help reduce communication cost when disseminating blocks, which helps bootstrap new nodes and speeds up consensus on blocks. For each category of solutions, we highlight the problems that motivated their designs and their use of coding. Moreover, we provide a qualitative analysis of their storage, communication, and computation costs.
Funding
National Key Research and Development Program of China (2023YFB2703600)