New Framework for Data Engineering on Ethereum Beacon Chain Rewards

centrality and granularity in reward distribution data, fills a critical gap in research on Ethereum’s transition to PoS.
2. We introduce a dataset comprising attestation, proposer, and sync committee rewards from the Ethereum Beacon chain, providing detailed and auditable records of validator activity. This dataset enables researchers to evaluate the enforceability of rules, protocol compliance, and long-term behavior of validators. By offering a transparent view of validator activities, this dataset facilitates robust assessments of PoS incentive structures.
3. We apply decentralization metrics such as Shannon entropy, Gini Index, Nakamoto Coefficient, and Herfindahl-Hirschman Index (HHI) to analyze the dataset and highlight trends in decentralization within Ethereum’s PoS ecosystem. These metrics offer insights into the distribution of rewards, shedding light on potential centralization trends and supporting research efforts aimed at enhancing the decentralization, security, and efficiency of blockchain systems.
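As an illustration of the four metrics named above, the following minimal sketch computes them from a list of per-entity reward totals. It is not the authors' implementation. The function names, the choice of what counts as an "entity" (a single validator or a staking pool), the base-2 logarithm for entropy, and the 50% threshold for the Nakamoto Coefficient are all assumptions made here for illustration.

```python
import math


def shares(rewards):
    """Convert raw reward totals into fractions of the overall total."""
    total = sum(rewards)
    if total <= 0:
        raise ValueError("total reward must be positive")
    return [r / total for r in rewards]


def shannon_entropy(rewards):
    """Entropy in bits; higher values indicate a more even distribution."""
    return -sum(p * math.log2(p) for p in shares(rewards) if p > 0)


def gini_index(rewards):
    """Gini Index via the sorted-rank formula; 0 means perfectly equal."""
    xs = sorted(rewards)
    n = len(xs)
    total = sum(xs)
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n


def hhi(rewards):
    """Herfindahl-Hirschman Index: sum of squared shares, in (0, 1]."""
    return sum(p * p for p in shares(rewards))


def nakamoto_coefficient(rewards, threshold=0.5):
    """Smallest number of entities whose combined share exceeds `threshold`.

    The 0.5 default is an assumption; other thresholds are also used in practice.
    """
    cumulative = 0.0
    ranked = sorted(shares(rewards), reverse=True)
    for count, p in enumerate(ranked, start=1):
        cumulative += p
        if cumulative > threshold:
            return count
    return len(ranked)


# Hypothetical reward totals per entity, for demonstration only.
example_rewards = [40, 25, 20, 10, 5]
print(shannon_entropy(example_rewards))
print(gini_index(example_rewards))
print(hhi(example_rewards))
print(nakamoto_coefficient(example_rewards))
```

Entropy rises as rewards spread more evenly, while the Gini Index and HHI rise as they concentrate, so the metrics are best read together rather than in isolation.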
Ethereum’s transition from PoW to PoS represents a significant shift in blockchain technology, replacing energy-intensive computation with stake-based validation. This transition not only addresses the environmental concerns associated with PoW but also introduces a reward model in which rewards accrue to validators who stake Ether. Previous research has emphasized the importance of reward distribution in blockchain networks, highlighting disparities among different systems and raising questions about wealth concentration and power dynamics. Ethereum’s transition to PoS presents an opportunity to examine whether this new model distributes rewards more equitably, in contrast to the centralization trends previously observed in blockchain ecosystems.
Unlike rewards in the PoW phase, PoS rewards in Ethereum are stored on the Beacon chain, which poses challenges for data collection and analysis. Existing studies often rely on third-party sources for reward data, raising concerns about data accuracy and completeness. Mainstream blockchain data platforms typically do not provide comprehensive Beacon chain reward data, further complicating research efforts. Few studies have addressed the technical complexities of collecting and parsing raw data from the Ethereum Beacon chain, which underscores the need for a systematic approach to data collection and analysis in decentralized systems.
Our study addresses these gaps by developing a methodology for collecting reward data from the Ethereum Beacon chain, providing detailed insights into reward distribution and decentralization in Ethereum’s PoS ecosystem. Our dataset, which includes attestation, proposer, and sync committee rewards, offers transparent and auditable records of validator activities. Researchers can use this dataset to evaluate protocol compliance, the enforceability of rules, and the long-term behavior of validators, improving the understanding of PoS incentive structures.
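To make the collection step more concrete, the sketch below shows one plausible way to address the three reward types on a Beacon node and to reduce an attestation-rewards response to a net reward per validator. It is not the methodology of this study. The endpoint paths and JSON field names follow the standard Beacon node REST API as understood here and should be checked against the specification; the node address, the helper names, and the sample payload are hypothetical.

```python
# Sketch only: paths and field names are assumptions based on the standard
# Beacon node REST API, not a description of this paper's pipeline.
BASE_URL = "http://localhost:5052"  # hypothetical local Beacon node

# Components assumed to make up a validator's attestation reward (in Gwei).
REWARD_COMPONENTS = ("head", "target", "source", "inactivity")


def attestation_rewards_url(epoch, base_url=BASE_URL):
    """URL for per-validator attestation rewards of one epoch (POST)."""
    return f"{base_url}/eth/v1/beacon/rewards/attestations/{epoch}"


def block_rewards_url(block_id, base_url=BASE_URL):
    """URL for the proposer reward of one block (GET)."""
    return f"{base_url}/eth/v1/beacon/rewards/blocks/{block_id}"


def sync_committee_rewards_url(block_id, base_url=BASE_URL):
    """URL for sync committee rewards of one block (POST)."""
    return f"{base_url}/eth/v1/beacon/rewards/sync_committee/{block_id}"


def net_attestation_rewards(payload):
    """Sum the reward components per validator; penalties are negative."""
    totals = {}
    for entry in payload["data"]["total_rewards"]:
        index = int(entry["validator_index"])
        totals[index] = sum(int(entry.get(name, 0)) for name in REWARD_COMPONENTS)
    return totals


# Hypothetical response body; amounts are encoded as decimal strings.
sample_payload = {
    "data": {
        "total_rewards": [
            {"validator_index": "0", "head": "2000", "target": "4000",
             "source": "2000", "inactivity": "0"},
            {"validator_index": "1", "head": "0", "target": "-4000",
             "source": "-2000", "inactivity": "-100"},
        ]
    }
}

print(attestation_rewards_url(1000))
print(net_attestation_rewards(sample_payload))
```

Separating URL construction from response parsing keeps the parsing logic testable without a running node, which helps when auditing collected records against raw responses.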
By applying decentralization metrics such as Shannon entropy, Gini Index, Nakamoto Coefficient, and HHI to our dataset, we gain valuable insights into decentralization trends within Ethereum’s PoS ecosystem. These metrics reveal patterns in reward distribution and shed light on potential centralization trends, supporting ongoing research efforts to improve the decentralization, security, and efficiency of blockchain systems. Our dataset, publicly available on Harvard Dataverse and supported by open-source analytical tools on GitHub, serves as a valuable resource for future research endeavors focused on enhancing blockchain decentralization.