Weight Copying in Bittensor

May 29, 2024

Summary

This article provides the context and background on the issue of weight copying in Bittensor. It describes the updates we are making to the Bittensor protocol to address this issue.

This article accompanies the release of Bittensor 7.0.1, which is available on testchain 1.1.0. This version of the testchain implements these updates. A technical paper formalizing the weight copying problem is also being released.

Introduction

In 1906 at a county fair in England, Francis Galton conducted an experiment. He held a competition in which attendees would submit their best guess of the weight of a slaughtered and dressed ox. The person who had the best guess would get the meat. Eight hundred people participated and Galton found that the median guess was within 1% of the actual weight of 1198 pounds.

This phenomenon is known as the wisdom of crowds. It underlies much of the current success of decentralized systems. Individuals observe the same phenomena, yet their perceptions are not all the same. There is an unknown amount of variability in their reports, often considered as noise, that tends to be canceled out when taking some aggregate over all observations.
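The cancelling-out of noise can be illustrated with a small simulation. This is only a sketch: the Gaussian noise model, the spread of 75 pounds, and the fixed random seed are assumptions chosen for illustration, not data from Galton's experiment.

```python
import random
import statistics

def simulate_crowd(true_value=1198.0, n_guessers=800, noise_sd=75.0, seed=0):
    """Return (median guess, mean absolute error of individual guesses).

    Each guess is the true value plus independent Gaussian noise
    (an assumed noise model, used only for illustration).
    """
    rng = random.Random(seed)
    guesses = [rng.gauss(true_value, noise_sd) for _ in range(n_guessers)]
    median_guess = statistics.median(guesses)
    mean_abs_error = statistics.fmean(abs(g - true_value) for g in guesses)
    return median_guess, mean_abs_error

median_guess, mean_abs_error = simulate_crowd()
crowd_error = abs(median_guess - 1198.0)
print(f"error of the crowd median:   {crowd_error:.1f} lb")
print(f"typical individual error:    {mean_abs_error:.1f} lb")
```

Under these assumptions the median of the crowd lands much closer to the true weight than a typical individual does, because independent errors largely offset one another.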

This form of collective intelligence was tested extensively in the 20th century, with many early experiments finding success when the task was to make a single estimate of a continuous quantity: simple things like estimating the number of jellybeans in a jar or the temperature of a room.

In his 2004 book titled The Wisdom of Crowds, author James Surowiecki recounts its use to find a lost submarine:

In 1968, when the submarine Scorpion went down in the north Atlantic, the navy had only the most general idea of where it was.
Yet, using the expertise of various experts in diverse disciplines, and combining them through mathematical formulas, a naval officer managed to determine its resting place within 220 yards.

More recently, researchers have shown that the wisdom of crowds effect extends successfully to higher-dimensional tasks, such as the traveling salesman problem, minimum spanning trees, ordering and ranking problems, and multi-armed bandit problems.

A 2010 paper by Zhang and Lee [Zhang, S., and Lee, M.D., (2010). “Cognitive models and the wisdom of crowds: A case study using the bandit problem”. In R. Catrambone, and S. Ohlsson (Eds.), Proceedings of the 32nd Annual Conference of the Cognitive Science Society, pp. 1118–1123. Austin, TX: Cognitive Science Society] used hierarchical Bayesian models, which include parameters for individual people drawn from Gaussian distributions, to determine an optimal strategy.

At the largest scale, collective intelligence is harnessed in the form of markets. Open markets for things like stocks, currency, and commodities are generally more efficient at determining value or utility than top-down authorities, even sophisticated ones such as Project Cybersyn, an effort of the Chilean government in the early 1970s to apply cybernetic control principles to the country’s agricultural and manufacturing sectors.

The primary assumption underlying all this is the independence of the participants. If members of the crowd do not work to produce their own judgements, then the inherent bias of the few active participants becomes amplified, degrading the ability of the market to accurately assign value.

We have observed that this does indeed occur in the Bittensor ecosystem. Furthermore, the structure of the incentive mechanism can actually encourage and reward this inefficient behavior. The following will describe the problem and the steps we are taking to ameliorate it.

Bittensor

Bittensor can be thought of as a system of independent markets. The Bittensor protocol directs the emission of the valuable TAO token that participants must work to obtain.

In Bittensor, markets, referred to as subnets, are organized around a specific goal or task.

One group of participants within each subnet, called subnet miners, continually perform work to get closer to the goal or better perform the task. How that is achieved is specific to the subnet, and is explicitly defined by the subnet’s creators.

The other group of participants within a subnet are the subnet validators. Subnet validators evaluate the work performed by the subnet miners and assign them credit based on their judgment of how well the miners’ work contributes to the goal of the subnet.

In this scenario, the subnet validators are the crowd, and their individual best efforts to determine quality form the collective intelligence that we wish to reward.

TAO tokens emitted by the chain are distributed over two levels. First, the tokens are split between the subnets. Second, within each subnet, subnet validators individually assign credit, or weight, to each subnet miner based on their evaluation of that miner’s performance. This usually involves running potentially very complex code that computes a metric or loss function to measure the work submitted by the miners.

Subnet miners may appear and disappear frequently, and a subnet validator that produces its own weights (a weight-originator) can determine the quality of a miner’s work as soon as it is submitted.

Half of the emissions given to the subnet are distributed to the subnet miners via these weights. The other half are distributed to the validators via the Bittensor consensus algorithm called Yuma Consensus.
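As a rough sketch of the split described above, the following divides a subnet's emission in half and shares the miner half in proportion to a set of weights. The function name, the miner names, and the use of a single aggregated weight per miner are assumptions for illustration; how the validator half is divided is left to the consensus algorithm and is not modeled here.

```python
def split_subnet_emission(subnet_emission, miner_weights):
    """Split an emission into a miner pool and a validator pool.

    miner_weights maps a (hypothetical) miner name to its aggregated weight.
    Returns (per-miner rewards, validator pool).
    """
    total_weight = sum(miner_weights.values())
    if total_weight <= 0:
        raise ValueError("at least one miner must have positive weight")
    miner_pool = subnet_emission / 2
    validator_pool = subnet_emission - miner_pool
    miner_rewards = {
        miner: miner_pool * weight / total_weight
        for miner, weight in miner_weights.items()
    }
    return miner_rewards, validator_pool

miner_rewards, validator_pool = split_subnet_emission(
    1.0, {"miner_a": 3.0, "miner_b": 1.0}
)
print(miner_rewards, validator_pool)
```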

Since each subnet validator sees the work of every subnet miner, their judgements should be similar. Yuma Consensus rewards agreement among subnet validators. Subnet validators whose scores are very different from the group are penalized and receive proportionally less of their half of the subnet emissions.
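One way to picture how agreement can be rewarded is sketched below. It is a simplified illustration, not the Yuma Consensus implementation: it assumes that the consensus for each miner is a stake-weighted median of the validators' weights, and that any weight above that consensus is clipped down to it. Both choices, and all names, are assumptions made for this example.

```python
def stake_weighted_median(values, stakes):
    """Smallest value at which cumulative stake reaches half of the total."""
    total = sum(stakes)
    cumulative = 0.0
    for value, stake in sorted(zip(values, stakes)):
        cumulative += stake
        if cumulative >= total / 2:
            return value
    raise ValueError("no values given")

def consensus_and_clipped(weights, stakes):
    """weights[v][m] is the weight validator v assigns to miner m."""
    n_miners = len(weights[0])
    consensus = [
        stake_weighted_median([row[m] for row in weights], stakes)
        for m in range(n_miners)
    ]
    # Weight above the consensus earns no extra credit in this sketch.
    clipped = [
        [min(row[m], consensus[m]) for m in range(n_miners)]
        for row in weights
    ]
    return consensus, clipped

weights = [
    [0.6, 0.4],  # validator 0
    [0.5, 0.5],  # validator 1
    [0.1, 0.9],  # validator 2, far from the others
]
stakes = [1.0, 1.0, 1.0]
consensus, clipped = consensus_and_clipped(weights, stakes)
for v, row in enumerate(clipped):
    print(f"validator {v}: clipped weights {row}, retained total {sum(row):.2f}")
```

In this toy setting the validator whose weights differ most from the group retains the least weight after clipping, which is the sense in which disagreement is penalized.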

Also rewarded is the speed of subnet miner discovery. When a subnet validator assigns credit to a subnet miner, it receives a bond on that subnet miner that grows over time. If, for example, subnet validator 1 finds a good subnet miner before subnet validator 2, then subnet validator 1’s bond will be larger, and subnet validator 1 will be given a higher proportion of the subnet rewards than other validators.
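The effect of early discovery can be sketched with a bond that accumulates as an exponential moving average of the weight a validator places on a miner. The moving-average form and the smoothing factor `alpha` are assumptions for illustration; the text above only states that bonds grow over time.

```python
def update_bond(previous_bond, weight, alpha=0.1):
    """One epoch of an exponential moving average (assumed update rule)."""
    return (1 - alpha) * previous_bond + alpha * weight

def bond_after(first_epoch_with_weight, n_epochs, weight=1.0, alpha=0.1):
    """Bond held after n_epochs if weight is first assigned at a given epoch."""
    bond = 0.0
    for epoch in range(n_epochs):
        assigned = weight if epoch >= first_epoch_with_weight else 0.0
        bond = update_bond(bond, assigned, alpha)
    return bond

early_bond = bond_after(first_epoch_with_weight=0, n_epochs=10)
late_bond = bond_after(first_epoch_with_weight=5, n_epochs=10)
print(f"validator 1 (found the miner at epoch 0): {early_bond:.3f}")
print(f"validator 2 (found the miner at epoch 5): {late_bond:.3f}")
```

Under this assumed rule the validator that assigned weight five epochs earlier holds the larger bond at epoch 10.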

These are some of the aspects of Bittensor’s consensus mechanism that lead both to the problem we are highlighting here as well as to our solution.

Weight Copying

The credits that the subnet validators assign to subnet miners are referred to as weights. Weights are submitted to the blockchain, where they are recorded permanently and publicly, and they form the inputs to the consensus algorithm. Each subnet follows a tempo: after a certain number of blocks have been finalized (currently 360 blocks of ~12 seconds each, the same for all subnets, though this may change in the future), subnet validators submit their weights for their subnet, and their rewards are calculated and distributed according to the consensus algorithm.
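The tempo figures translate into wall-clock time as follows. This is a small helper using the values quoted in this article (360 blocks per tempo, roughly 12 seconds per block); the function names are made up for the example, and the block time is approximate.

```python
BLOCKS_PER_TEMPO = 360   # current value quoted in the text
SECONDS_PER_BLOCK = 12   # approximate block time quoted in the text

def tempos_to_hours(tempos):
    """Approximate duration of a number of tempos, in hours."""
    return tempos * BLOCKS_PER_TEMPO * SECONDS_PER_BLOCK / 3600

def blocks_to_tempos(blocks):
    """Number of tempos spanned by a number of blocks."""
    return blocks / BLOCKS_PER_TEMPO

print(f"1 tempo     ~ {tempos_to_hours(1) * 60:.0f} minutes")
print(f"19 tempos   ~ {tempos_to_hours(19):.1f} hours")
print(f"1800 blocks = {blocks_to_tempos(1800):.0f} tempos")
```

One tempo is therefore about 72 minutes, which matches the 19 tempos (22.8 hours) and 1800 blocks (5 tempos) used in the Analysis section.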

Because the consensus is recorded on the chain and therefore known, it is possible for subnet validators to avoid doing any actual validation work by simply setting their weights on the subnet miners equal to the consensus of the honest validators. That is, they copy the consensus weights from the previous epoch and submit them as their own work.

Because of the consensus algorithm’s emphasis on rewarding agreement among the subnet validators, this leads to the perverse outcome that subnet validators who copy the average or median of the weights will receive higher rewards than weight-originators. We refer to this as copier advantage. This kind of behavior breaks the assumption of independence for collective decision making. It not only diverts resources from subnet validators doing useful work, it also decreases the effectiveness of the subnet as a whole, because the copier is occupying a slot that could otherwise go to a weight-originating subnet validator.
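A toy example can make the copier advantage concrete. It assumes an unweighted per-miner median as the consensus and uses the distance from that consensus as a stand-in for the penalty on disagreement. All numbers are invented, and the real mechanism is stake-weighted and more involved.

```python
import statistics

def consensus(weight_rows):
    """Per-miner median across validators (unweighted, for simplicity)."""
    return [statistics.median(column) for column in zip(*weight_rows)]

def distance(a, b):
    """Sum of absolute differences between two weight vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Honest validators score three miners with some independent noise.
previous_epoch = [
    [0.54, 0.26, 0.20],
    [0.46, 0.34, 0.20],
    [0.52, 0.28, 0.20],
    [0.48, 0.32, 0.20],
]
current_honest = [
    [0.56, 0.24, 0.20],
    [0.44, 0.36, 0.20],
    [0.53, 0.27, 0.20],
    [0.47, 0.33, 0.20],
]

# The copier does no evaluation: it resubmits last epoch's consensus.
copied = consensus(previous_epoch)
current_consensus = consensus(current_honest + [copied])

honest_distances = [distance(row, current_consensus) for row in current_honest]
copier_distance = distance(copied, current_consensus)
print("honest distances:", [round(d, 3) for d in honest_distances])
print("copier distance: ", round(copier_distance, 3))
```

When miner quality is stable, the copied consensus sits in the middle of the honest validators' noisy scores, so the copier ends up closer to the new consensus than any validator that did the work.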

Commit-Reveal

To address this, we are releasing a new feature that changes the way subnet validator weights are recorded to the chain. Rather than submitting weights openly to the chain, where anyone can see them on the next block, subnet validators will upload an encrypted hash of their weights, which will be automatically decrypted after a set number of blocks. Going back to Galton and the county fair, this would be like each participant submitting their guess in a sealed envelope, so that others cannot take the guess and the participant avoids potentially having to split the meat.
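The sealed-envelope idea can be sketched with a generic salted-hash commit-reveal scheme. Note that this is a swapped-in, textbook technique in which the validator itself later reveals the weights and salt; it is not the on-chain format or the Bittensor API, and the function names and payload layout are assumptions for illustration.

```python
import hashlib
import json
import secrets

def commit(weights, salt):
    """Return a hex digest that binds the weights without revealing them."""
    payload = json.dumps({"weights": weights, "salt": salt.hex()}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def verify_reveal(commitment, weights, salt):
    """Check that revealed weights and salt match an earlier commitment."""
    return secrets.compare_digest(commitment, commit(weights, salt))

# Commit phase: only the digest is published.
salt = secrets.token_bytes(16)
weights = {"miner_0": 0.6, "miner_1": 0.4}
commitment = commit(weights, salt)

# Reveal phase, some blocks later: weights and salt are published and checked.
honest_reveal_ok = verify_reveal(commitment, weights, salt)
altered_reveal_ok = verify_reveal(commitment, {"miner_0": 0.5, "miner_1": 0.5}, salt)
print(honest_reveal_ok, altered_reveal_ok)
```

The random salt matters: without it, an observer could hash plausible weight vectors and compare them with the published digest.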

The length of the delay in revealing the weights is a parameter that can be set by the subnet owners. The optimal delay will be different for different subnets and depends mostly on the rate of turnover in the subnet miners. For subnets that are very stable and have durable subnet miners who change ranks rarely, a longer delay interval would likely be more effective. For subnets with more frequent subnet miner registrations and deregistrations, a shorter interval could be effective as copiers will not be able to independently score new subnet miners.
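The interaction between the reveal delay and miner turnover can be sketched as follows. The helper returns the newest weights a copier could see at a given block, assuming a commit becomes visible exactly `reveal_delay_blocks` after it is made. The data, miner names, and function name are hypothetical.

```python
def visible_weights(commits, current_block, reveal_delay_blocks):
    """Latest weights committed at least reveal_delay_blocks ago, or None."""
    revealed = [
        (block, weights)
        for block, weights in commits
        if block + reveal_delay_blocks <= current_block
    ]
    if not revealed:
        return None
    return max(revealed, key=lambda item: item[0])[1]

# (commit block, weights) pairs from an honest validator.
commits = [
    (0, {"miner_a": 0.7, "miner_b": 0.3}),
    (360, {"miner_a": 0.6, "miner_b": 0.4}),
    (720, {"miner_a": 0.2, "miner_c": 0.8}),  # miner_c has just registered
]

no_delay = visible_weights(commits, current_block=720, reveal_delay_blocks=0)
one_tempo = visible_weights(commits, current_block=720, reveal_delay_blocks=360)
five_tempos = visible_weights(commits, current_block=720, reveal_delay_blocks=1800)
print(no_delay, one_tempo, five_tempos)
```

With no delay the copier sees the strong new miner immediately. With a one-tempo delay it can only copy weights that predate the newcomer, so it misses that miner entirely until the newer commit is revealed.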

The goal of giving copiers access only to old weights is to reduce the advantage gained by copying the current consensus. TAO holders can make passive income by staking their TAO on their validator of choice, which does not require them to run any code or maintain any resources to operate validation. A passive staker’s cost for this is that 18% of their staking rewards are paid to the validator for working on their behalf.

With commit-reveal, we aim to make copying less attractive by decreasing its reward below this level, so that would-be copiers are encouraged to nominate an honest validator rather than create their own dishonest validators.
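The arithmetic behind this comparison can be sketched as below. It reflects one possible reading of the paragraph above: a copier's return per unit of stake, expressed relative to an honest validator's, is compared with what a nominator keeps after the 18% fee. The function names are hypothetical, and operating costs and any fees the copier collects from its own nominators are ignored.

```python
VALIDATOR_TAKE = 0.18  # share of staking rewards paid to the validator

def nominator_return(validator_dividend_per_stake, take=VALIDATOR_TAKE):
    """Dividend per unit of stake kept by a passive staker after the fee."""
    return validator_dividend_per_stake * (1 - take)

def copying_beats_nominating(relative_copier_rate, take=VALIDATOR_TAKE):
    """True if copying pays more per stake than nominating (assumed criterion).

    relative_copier_rate is the copier's dividend per stake divided by an
    honest validator's dividend per stake.
    """
    return relative_copier_rate > 1 - take

kept = nominator_return(1.0)
print(f"nominator keeps {kept:.2f} of each unit of dividend")
print(copying_beats_nominating(1.05), copying_beats_nominating(0.80))
```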

Analysis

To investigate the effect of delaying the reveal of weights on the copier advantage, we used historical data from subnets to simulate the relative dividend return made by a hypothetical validator following a pure consensus-copying strategy.

Figures 1–3 below show that the dividend-per-stake return of a copier relative to an honest validator (G) decreases as a function of the number of blocks by which the reveal is delayed.

Figures 1–3. Relative dividend rate (G) of the weight copier as the commit reveal weights interval increases, with the interval ranging from 0 tempos (no concealment at all) to 19 tempos (22.8 hours). The red dotted line G = 1 is where the weight copier receives the same dividend return as the median validator, while the red area highlights the region G > 1. The scenario for each plot is indicated in its subtitle.
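Based on the description above, G can be sketched as the copier's dividend per unit of stake divided by the median honest validator's dividend per unit of stake. The function name and the sample numbers are invented for illustration; the technical paper gives the formal definition.

```python
import statistics

def relative_dividend_rate(copier_dividend, copier_stake, honest_validators):
    """G: copier dividend per stake relative to the median honest validator.

    honest_validators is a list of (dividend, stake) pairs.
    """
    copier_rate = copier_dividend / copier_stake
    median_rate = statistics.median(
        dividend / stake for dividend, stake in honest_validators
    )
    return copier_rate / median_rate

honest = [(10.0, 100.0), (22.0, 200.0), (4.5, 50.0)]  # rates 0.10, 0.11, 0.09
g_advantaged = relative_dividend_rate(12.0, 100.0, honest)
g_disadvantaged = relative_dividend_rate(9.5, 100.0, honest)
print(f"G = {g_advantaged:.2f} (copier ahead), G = {g_disadvantaged:.2f} (copier behind)")
```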

Of the 30 subnets, the 10 that cannot pass the G = 1 threshold (that is, reach G < 1) are listed in Figure 1, and the 20 that can pass it are listed in Figure 1. For these 20 subnets, nominators would no longer be incentivised to stake to the consensus-copying validator.

Moreover, we can observe from Figure 2 that, on most of the subnets, G has converged by the time the commit reveal weights interval reaches 1800 blocks (5 tempos). A longer interval yields little additional benefit but slows down the evaluation of miners.

Figure 3 shows some examples where the relative dividend rate G does not decrease monotonically as the commit reveal weights interval increases. There is no straightforward explanation for this behavior.

Figure 2 shows that it is very profitable to run a consensus-copying strategy on subnet 30, whereas consensus-copying yields a relative dividend rate of only 0.95 on subnets 10, 16, 24, 31, and 32 with a long enough commit reveal weights interval.

Also refer to Section 5 Experiment of the weight copier paper (PDF).

We also recommend that subnet owners update their immunity period. Immunity refers to the number of blocks a newly registered miner has in which to be detected and evaluated by the subnet validators. Subnet miners that do not perform well are deregistered and removed from the subnet. Because the weights will be concealed for some number of blocks, the immunity period should be increased by the same amount. Refer to the technical paper for more in-depth analysis.
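The recommended adjustment is a simple addition, sketched here with hypothetical values. The function name and both block counts are made up for the example; actual immunity periods and reveal delays are set per subnet.

```python
def adjusted_immunity_period(current_immunity_blocks, reveal_delay_blocks):
    """Extend the immunity period by the number of blocks weights stay concealed."""
    if current_immunity_blocks < 0 or reveal_delay_blocks < 0:
        raise ValueError("block counts must be non-negative")
    return current_immunity_blocks + reveal_delay_blocks

# Hypothetical subnet: 7200-block immunity, weights concealed for 5 tempos.
new_immunity = adjusted_immunity_period(7200, 5 * 360)
print(new_immunity)
```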

Written by Opentensor Foundation
