Benchmarks and performance attribution for DAO treasuries
Can we learn from investment performance evaluation in traditional finance to help DAOs make better decisions?
adcv (Steakhouse Financial Limited)
Sonya Kim (Steakhouse Financial Limited, Programd Capital)
tldr
DAOs have limited scope to do complex things with respect to allocating their treasuries
We propose simple methods borrowed from traditional portfolio management to help DAOs make decisions more clearly
The following frameworks can help communities:
Set a benchmark
Weigh risk to help select between alternatives
Do performance attribution analysis to see if active decisions add or detract value
Introduction
The literature on, and practice of, investment performance evaluation in traditional finance are extensive. The on-chain world, however, has few proxies that give DAOs or on-chain investors the tools needed to make quantitative investment allocation decisions. There are many ways of conducting portfolio performance attribution analysis, each with its own tradeoffs, but they all start and finish with the benchmark that a portfolio aims to track. In the absence of a benchmark, no performance attribution is really possible. The CFA Institute states, in its practitioner materials by Wright & Mitchell Conover (2024), that:
Benchmarking is necessary for performance measurement, attribution, and evaluation. Effective benchmarking allows the investment committee to evaluate staff and external managers. Two separate levels of benchmarks are appropriate: one that measures the success of the investment managers relative to the purpose for which they were hired and another to measure the gap between the policy portfolio and the portfolio as actually implemented.
With this paper, we propose a simplified framework for benchmark construction. We also demonstrate these benchmarks in use by proposing some performance attribution models based on existing practice and literature. Our contribution rests on the fact that all these tools are built with DAO treasury audiences in mind, taking into account the risks and limitations of operating in a fully on-chain or a fully decentralized manner.
This foundational tool should provide the starting point for any community member to answer what the performance of a decentralized treasury has been and how much of it was attributable to active decisions that the community or a delegated manager has taken.
What is a benchmark?
To attempt to formulate an appropriate on-chain benchmark framework for DAOs, we revisit the definition of a benchmark and its role in traditional finance.
Benchmarks originally served as a general indicator of market sentiment but evolved into a central performance measurement tool for active management, as outsourcing investment management duties to professional managers became more common. Defining a benchmark provides important clarifications for a given strategy, including:
A starting point for the portfolio in the absence of a view on specific assets
Base returns available from the asset class in consideration
A control mechanism for active risk
According to Bailey and Tierney (1998), a good benchmark should have the following characteristics:
Unambiguous—The individual assets and weights should be clearly identifiable
Investable—It must be possible to replicate and hold the benchmark to earn its return (at least net of expenses)
Measurable—It must be possible to measure the benchmark’s return on a reasonably frequent and timely basis
Appropriate—The benchmark must be consistent with the (DAO’s) objectives
Specified in advance—The benchmark must be constructed prior to the evaluation period so that treasury performance is not judged against benchmarks created after the fact.
Accountable—The community should accept ownership of the benchmark and be willing to be held accountable to it. The benchmark should be fully consistent with the allocation process.
How can DAOs set their risk management frameworks?
Stani Kulechov recently described DAO treasuries quite succinctly:
The Aave DAO is a net-positive DAO, meaning that after costs, it has enough revenue to leave a surplus. It's not a business; it's more akin to using democracy to govern a vital public infrastructure (this is how the internet would be governed if crypto-economics existed back then)
We believe that DAO treasury management is a holistic activity, rather than an isolated activity. Effective management of a DAO treasury begins with the recognition of the priorities and unique constraints, which are not present in discretionary portfolio management for actively managed funds.
In our November 2023 report, Steakhouse Financial introduced the Risk Management Framework, which could be used to define a protocol’s objectives, key performance indicators and key risk indicators. A well-structured Risk Appetite Statement should help lay out the DAO’s priorities. Once these objectives are established, the treasury can be utilized to advance the goals further.
For good treasury management practice, we defer to a few ranked priorities:
Determine the minimum treasury required for effective risk mitigation for the normal functioning of the protocol, in the appropriate token denomination
Calculate the ongoing burn requirement needed to cover anticipated operating expenses or grant distributions (e.g. in months of treasury)
Allocate according to DAO-approved rules
The target denomination of treasury returns may vary depending on the protocol. Most DAOs solving for grants and operating expenses will target returns in USD.
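As a rough illustration of the first two priorities, the sketch below expresses the burn requirement as months of runway and derives what is left to allocate after a minimum reserve and a runway buffer are set aside. The function names, the way the reserve and buffer are combined, and all figures are hypothetical assumptions for illustration, not a prescribed rule.

```python
def runway_months(treasury_usd: float, monthly_burn_usd: float) -> float:
    """Months of operating expenses and grants the treasury can cover."""
    if monthly_burn_usd <= 0:
        raise ValueError("monthly burn must be positive")
    return treasury_usd / monthly_burn_usd


def allocatable_surplus(
    treasury_usd: float,
    minimum_reserve_usd: float,
    monthly_burn_usd: float,
    target_runway_months: float,
) -> float:
    """Amount left to allocate after the reserve and runway buffer (assumed rule)."""
    buffer_usd = minimum_reserve_usd + monthly_burn_usd * target_runway_months
    return max(0.0, treasury_usd - buffer_usd)


# Hypothetical DAO: 12m USD of non-native assets, 500k USD monthly burn,
# a 2m USD minimum reserve and a 12-month runway target.
treasury = 12_000_000.0
burn = 500_000.0
months = runway_months(treasury, burn)
surplus = allocatable_surplus(treasury, 2_000_000.0, burn, 12)

print(f"Runway: {months:.1f} months")
print(f"Allocatable surplus: {surplus:,.0f} USD")
```

Only the surplus would then be allocated according to DAO-approved rules, in whichever denomination the DAO has chosen.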
Why is a benchmark needed for DAO treasuries?
DAO treasuries are a finite public resource. They arise either from external fundraising or internal fee generation of the protocol. They require careful stewardship to be able to perpetuate the development and maintenance of the associated projects.
One of the main challenges that decentralized communities face, however, is the difficulty of making and approving complex decisions such as asset allocation. One solution, proposed by claberus from karpatkey, advocates delegating this decision-making to external third parties using tools such as SAFE wallets with Zodiac Role Modifiers, which allow DAO treasury management to remain non-custodial.
Whether or not DAOs opt for external delegation, establishing benchmarks can be beneficial. It can help communities align on a true north metric as part of a broader, community-endorsed, Risk Appetite Statement. It can also help evaluate the performance of decision-making, whether executed through delegated authority by a third-party, or directly by token holders.
In our view, the question is less whether a benchmark is necessary for DAOs with active treasuries. It is, and we presume that active treasuries without benchmarks are incomplete. Rather, we try to provide a general framework to answer the question of “which benchmark” to use, taking into account the unique sovereign domains DAOs are exposed to and the constraints they face in navigating these.
A restricted benchmark universe definition for DAOs
Our humble worldview of crypto protocol treasuries is a simplified universe of assets across three separate, sovereign, domains:
USD
ETH
BTC
We believe the restricted asset basket of sDAI, wstETH and WBTC best represents, for DAOs and at time of writing, these three domains in a benchmark. DAOs could combine these three assets in any number of ways to make their own reference exposure benchmark. For convenience the returns should be denominated in one currency, which could be USD. Returns therefore capture beta performance of the underlying asset relative to the presentation currency.
We propose that the majority of crypto protocols will face exposure to one or more of these three assets, which should therefore form the core of a benchmark selection for DAOs. Over time, the number of domains may expand but, for now, edge cases aside, we expect most crypto protocols to be oriented to some degree towards at least one of the above three assets. All sub-assets in each universe represent a “selection decision” within that universe. For example, a tokenized S&P500 ETF would be benchmarked in the USD bucket and UNI tokens would be benchmarked in the ETH bucket. BTC remains something of a monolith, with few permissionless exposure options available for ETH-based DAOs beyond its plain price appreciation, though this may change in time.
For instance, a protocol that accumulates ETH could decide to follow a benchmark consisting of 80% ETH, 20% USD. A lending protocol with revenue exposure to multiple assets, including WBTC, could follow a broader benchmark such as 40% ETH, 20% BTC, 40% USD. A hypothetical DAO with no desired directional exposure to crypto assets could follow a benchmark of 100% USD.
For assets that make up the components of the benchmark, we propose a combination of assets that can meet Bailey and Tierney’s characteristics for a good benchmark and be useful for a community selecting a benchmark to follow.
Regarding native tokens, our own view is to ignore them strictly and never count them in the treasury. DAOs could raise non-native assets by issuing native tokens out of the treasury. In fact, theoretically, DAOs could issue an infinite number of tokens well past the ‘total supply’, or issue new tokens with a new symbol and circumvent the original token altogether. Issuing tokens for non-native assets would count as a primary issuance event that dilutes existing token holders by some percentage. Therefore, the employment of those non-native assets should at least be more productive than the cost of the dilution, whether measured by accomplishment of the DAO's mission and objectives or even just by a pure financial return.
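The dilution argument can be made concrete with a little arithmetic. The sketch below takes the purely financial reading only, and assumes that pre-existing holders are "no worse off" when the value of their reduced share matches what they held before the issuance. The function names and token amounts are hypothetical.

```python
def dilution(circulating: float, newly_issued: float) -> float:
    """Share of the post-issuance supply represented by the new tokens."""
    if circulating <= 0 or newly_issued < 0:
        raise ValueError("circulating must be positive, newly_issued non-negative")
    return newly_issued / (circulating + newly_issued)


def breakeven_value_growth(dilution_fraction: float) -> float:
    """Growth in total value needed for pre-existing holders to break even.

    Holders keep (1 - d) of the whole, so the whole must grow by d / (1 - d).
    """
    if not 0.0 <= dilution_fraction < 1.0:
        raise ValueError("dilution must be in [0, 1)")
    return dilution_fraction / (1.0 - dilution_fraction)


# Hypothetical issuance: 5m tokens issued against 95m already circulating.
d = dilution(95_000_000, 5_000_000)
g = breakeven_value_growth(d)
print(f"dilution: {d:.2%}, break-even value growth: {g:.2%}")
```

Under these assumptions, whatever the raised assets are used for has to add more than that break-even growth before the issuance pays for itself.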
We try to select the smallest increments of risk that allow for rewards or returns to be generated with as few active decision steps as possible:
wstETH, a tokenized representation of Ether staked through the Lido protocol. As the token of the largest liquid staking protocol, with over 240k individual holding addresses, it is the best proxy for staked ETH rewards and penalties.
sDAI, a tokenized representation of DAI held in the Dai Savings Rate contract. The Dai Savings Rate is set by MakerDAO governance, but it represents the broadest available USD-denominated reward rate that even decentralized DAOs could have exposure to.
WBTC, a tokenized representation of BTC held in custody by BitGo, which we could replace with more decentralized alternatives as soon as feasible.
The simplest portfolio simply consists of a combination of one or more of the above. An example index portfolio with daily rebalancing featuring these three assets is available on our public Dune query.
A DAO could decide, as part of a broader Risk Management Framework ratification process, that its benchmark is, with respect to USD/BTC/ETH: 100/0/0 or 33/33/33 or 50/0/50 or any other out of an infinite number of combinations.
Any decision-making is therefore evaluated in relation to the default choice, the benchmark. If the DAO does nothing else, what is the return it should be expecting, within the domains it is exposed to in its benchmark?
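A minimal sketch of that default return, assuming a fixed-weight benchmark that is rebalanced back to target every day and asset returns already expressed in USD. The weights mirror the 40% ETH, 20% BTC, 40% USD example above; the daily returns are invented, and the function name is hypothetical.

```python
from typing import Dict, List


def benchmark_return(
    weights: Dict[str, float], daily_returns: List[Dict[str, float]]
) -> float:
    """Cumulative return of a fixed-weight benchmark rebalanced daily."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("benchmark weights must sum to 1")
    growth = 1.0
    for day in daily_returns:
        # With daily rebalancing, each day's return is the weighted average
        # of the asset returns at the target weights.
        day_return = sum(w * day[asset] for asset, w in weights.items())
        growth *= 1.0 + day_return
    return growth - 1.0


# Benchmark of 40% USD, 40% ETH, 20% BTC using the restricted asset basket.
weights = {"sDAI": 0.40, "wstETH": 0.40, "WBTC": 0.20}

# Invented USD-denominated daily returns, for illustration only.
daily = [
    {"sDAI": 0.0001, "wstETH": 0.010, "WBTC": -0.005},
    {"sDAI": 0.0001, "wstETH": -0.020, "WBTC": 0.010},
]

period_return = benchmark_return(weights, daily)
print(f"Benchmark return over the period: {period_return:.4%}")
```

Any active decision is then judged by how far the actual treasury return departs from this figure over the same period.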
DAO Treasury Risk Attribution with Dialectic’s Risk Matrix
Selecting assets to include into the DAO portfolio should take incremental risk into consideration. Dialectic’s @aaaaaaaaaa and @Meph1587 propose a risk matrix for evaluating protocol risk.
Firstly, the overall risk is segmented into six components:
Smart contract
Economic
Bridge
Oracle
Governance
Audit
Smart Contract
Faulty smart contracts are probably the number one cause of hacks. Doing a comprehensive analysis is a non-trivial task (there are audit companies for that), but a general pragmatic approach can be taken.
Over the last few years, hundreds of protocols have been hacked through unintended code execution or state manipulation. In 2023 alone, 172 hacks were registered in DeFi.
As soon as a protocol is live, hundreds of eyes start looking at the codebase in order to find possible vulnerabilities. It’s not unreasonable to believe that the black hat market is quite efficient. Of course, there are edge cases where a hidden vulnerability in the codebase remains hidden for months, if not years (Curve and Euler exploits are a fine example of this), but generally speaking taking into account the age of a protocol when assessing the smart contract risk is a good heuristic.
Furthermore, the protocol’s architecture, components, and set of dependencies from other protocols should be taken into account. For instance, full-range AMMs are probably the safest protocols in terms of attack surface, whereas money markets and CLAMMs are much more prone to bugs.
The inclusion of functions within the codebase that can mitigate damage in the event of an attack, such as the ability to pause trading, is a valuable feature.
Governance
Governance refers to the process of implementing modifications to the protocol's codebase, as well as adjusting key parameters, such as loan-to-value ratios and liquidation thresholds.
Another facet of risk assessment involves contract upgradability. Teams frequently segregate the state and logic of the protocol through the use of proxy contracts. This method facilitates the deployment of upgrades by altering only the protocol's logic component, thereby eliminating the need to migrate the state, such as the liquidity users have deposited in the protocol. Nonetheless, this strategy has its drawbacks, as a new upgrade could inadvertently introduce vulnerabilities into the codebase. Moreover, a malicious upgrade that causes user losses could also be pushed.
There are three major aspects that should be evaluated in regards to Governance risk:
Access control to the Governance process (who/what is in charge)
Time-delay between proposal and implementation
Frequency of upgrades
Regarding (1), the high-level risk order (from low to high) is the following: DAO, then Multisig, then EOA. Of course, nuances prevail in the evaluation process (e.g. Is the voting power sufficiently distributed? Is the Multisig resistant to collusion?).
Concerning point (2), implementing a timelock of a minimum of 3 days is recommended. This period allows users to evaluate the potential impacts of the proposed upgrade and, if desired, to opt out by removing their liquidity from the protocol.
A smaller but still relevant aspect, regarding (3), is how frequently upgrades or modifications are pushed to the protocol. Ideally, the less frequent the better.
Economic
Analyzing the economic risks of engaging with a protocol must be conducted on several levels.
Firstly, by providing liquidity to a pool, we expose ourselves to the underlying assets within that pool. Should the value of any of these assets decline, LPs may incur an unrealized loss due to the pricing mechanisms of an AMM. Therefore, it is crucial to analyze the risk profile of each asset to accurately evaluate the pool's overall risk. Factors such as the asset's liquidity, the depth of liquidity, the historical volatility, and the risk profile of the counterparty (be it a centralized issuer or a decentralized protocol) all contribute to the cumulative risk of the pool.
Bridge
When it comes to asset exposure, an asset can either be native or bridged with respect to a given chain.
Historically, bridges have been a pain point of the industry, as they have been targeted by attacks on a large scale.
We can categorize bridges in three different macro categories, ranging from highest to lowest risk:
External
Optimistic
Native
External bridges, as the name suggests, are validated by an external set of validators. To function properly, they rely on an M-of-N honest majority assumption. Sometimes, economic safeguards are put in place to prevent a colluding majority from executing a profitable attack (i.e., slashing).
Overall, this type of bridge is considered the riskiest as it introduces an additional set of trust assumptions into the equation. Bridges like Axelar and Multichain belong to this category.
Optimistic bridges incorporate a latency component in the stack, allowing time to challenge any forged messages between chain A and B by observers. These bridges rely on a 1-of-N model, as only a single actor is required to prevent fraud. Bridges like Across and Connext belong to this category.
Natively verified bridges are secured directly by the validators of the underlying chain(s) by incorporating a light client of one chain within the runtime of the other. For instance, smart contract rollups such as Arbitrum or Starknet have a light-client bridge with Ethereum, where every state transition is verified and validated by Ethereum validators through fraud or validity proofs. From a security standpoint, this family of bridges is the most secure to date as it does not introduce any additional security assumptions.
When providing liquidity, the asset exposure in a given pool is, in the majority of cases, issued either through the native bridge (e.g., USDC.e on Arbitrum) or through an external bridge (e.g., USDC.wh). Optimistic bridges mostly act as fast liquidity facilitators (e.g., USDC.e (arb) -> USDC (eth) transfers).
Oracle
Protocols may rely on one or more oracles to import external data, such as asset prices, into their systems. If the data is inaccurately reported, it could be exploited to execute attacks, such as harmful arbitrages. The advent of flash loans has provided access to virtually unlimited lines of credit, making oracle price manipulation a common attack vector in DeFi.
Oracles can be categorized into two types: those provided by third parties (e.g., Chainlink, Tellor) and those that are integrations fetching data from on-chain sources (e.g., Uniswap V3 TWAP oracle).
Ideally, projects should not depend solely on a single provider. Implementing a fallback mechanism is crucial to ensure checks and balances against inaccurately reported data. Furthermore, it is always easier to understand a protocol that uses a standardized oracle pricing method rather than a specific solution (such as using LP token prices as a price oracle, for example).
Audit
The audits conducted on the codebase are crucial for risk assessment, as they help identify and patch most potential attack vectors during this phase.
However, the quality of audits can vary significantly since auditors have different resources and adhere to different quality standards. While providing a quality-tier list of audit firms is beyond the scope of this article, it is the responsibility of the capital allocator to rank audit firms based on their track records and the quality of their previous audits. It's also important to note that auditors may specialize in certain areas, making some firms more suitable for auditing a specific protocol than others.
Preferably, a project should undergo multiple audits and maintain an ongoing bug bounty program to enhance security.
Formal verification gives additional guarantees of security, as it involves mathematically proving the correctness of the contract's code, ensuring it behaves as intended under all possible conditions.
It's essential to remember that audits are performed on a particular snapshot of the codebase. Therefore, their validity might be compromised by subsequent changes, such as those introduced through a proxy upgrade.
Ongoing audit agreements are a good compromise when the project's contracts are upgradeable, since any upgrade can harm the protocol in unpredictable ways. This also holds for on-chain governance decisions.
Risk score
For each component of risk, a numerical value is assigned. This methodology treats risk as a continuous variable, although it is more accurately binary—indicating whether a protocol is secure or not. In essence, what we do is assign a probability value to each category, representing the likelihood of that component being exploited, leading to the protocol being compromised.
These scores are systematically applied to each new strategy or asset and are multiplied geometrically to calculate a total risk discount.
The formula above operates under the assumption that all dependencies contribute equally to the overall risk, an assumption that may not hold true due to the significant variations in the architecture of different protocols. To account for these variations, assigning a weight to each risk category could enhance the formula's accuracy. Consequently, our revised formula would be:
Where w_i represents the weight assigned to risk category i.
Taking Uniswap V3 and Aave V3 as examples, we can explore assigning weights to governance risk. Both protocols are governed by their respective DAOs, but the extent of enforcement varies significantly between them.
For Uniswap V3, governance power is somewhat limited; the Uniswap DAO can only impose a protocol fee on swaps. It lacks the authority to blacklist assets, pause pools, or upgrade protocol contracts. In contrast, the Aave V3 DAO plays a more critical role, overseeing the setting of risk parameters for each market, such as Supply and Borrow caps, Loan To Value ratios and Reserve Factors, in addition to having the capability to upgrade protocol contracts themselves.
This distinction highlights a significant variance in potential governance-related risks: Uniswap's DAO, with its limited powers, poses minimal risk, suggesting a governance risk weight close to 0. On the other hand, the Aave DAO, with its broader authority, bears a more substantial governance risk, warranting a higher weight.
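The formulas themselves are not reproduced in this text, so the sketch below is only one plausible reading of a geometric risk discount, not necessarily Dialectic's exact formula. It assumes each score r_i is a probability of compromise between 0 and 1, that the unweighted discount is the product of (1 - r_i), and that a weight w_i scales the score inside the product, so a weight close to 0 removes that category's contribution. The scores and weights are invented.

```python
from math import prod
from typing import Dict, Optional


def risk_discount(
    scores: Dict[str, float], weights: Optional[Dict[str, float]] = None
) -> float:
    """Assumed geometric discount: product over categories of (1 - w_i * r_i).

    Categories missing from `weights` default to a weight of 1, which
    reduces to the unweighted product of (1 - r_i).
    """
    weights = weights or {}
    for category, score in scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {category} must be in [0, 1]")
    return prod(1.0 - weights.get(c, 1.0) * r for c, r in scores.items())


# Invented scores for a hypothetical protocol, one per risk category.
scores = {
    "smart_contract": 0.02,
    "economic": 0.01,
    "bridge": 0.00,
    "oracle": 0.01,
    "governance": 0.05,
    "audit": 0.01,
}

unweighted = risk_discount(scores)
# A governance weight of 0 mimics a DAO with very limited powers.
limited_governance = risk_discount(scores, {"governance": 0.0})

print(f"Unweighted discount: {unweighted:.4f}")
print(f"Discount with governance weight 0: {limited_governance:.4f}")
```

Under this reading, a discount of 1 means no haircut at all, and lowering a category's weight moves the discount back towards 1.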
Below is a risk matrix designed to provide guidelines for assigning risk scores to each risk category. Values could be interpolated between columns - there is no benefit from a purely discrete categorization.
The expected return from a strategy should be discounted by this risk factor to produce a relative ranking of what the risk-adjusted return for a new strategy is, in comparison to benchmark assets.
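As a sketch of that ranking step, the snippet below multiplies each expected return by a discount factor between 0 and 1 (where 1 means no haircut) and sorts the results. Treating the discount as a simple multiplier is an assumption, and the strategy names, returns and discounts are all invented.

```python
from typing import Dict, List, Tuple


def rank_by_risk_adjusted_return(
    candidates: Dict[str, Tuple[float, float]]
) -> List[Tuple[str, float]]:
    """Sort candidates by expected return times risk discount, highest first.

    `candidates` maps a name to (expected_return, risk_discount).
    """
    adjusted = [(name, er * disc) for name, (er, disc) in candidates.items()]
    return sorted(adjusted, key=lambda item: item[1], reverse=True)


# Invented figures: expected annual return and assumed risk discount.
candidates = {
    "benchmark (wstETH)": (0.035, 0.97),
    "strategy A": (0.080, 0.60),
    "strategy B": (0.050, 0.65),
}

ranked = rank_by_risk_adjusted_return(candidates)
for name, value in ranked:
    print(f"{name}: {value:.4f}")
```

With these made-up numbers, strategy B offers a higher headline return than the benchmark asset but ranks below it once discounted, which is the kind of comparison the ranking is meant to surface.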
Dialectic: Risk Scores on Benchmark Assets
The benchmark assets (wstETH, sDAI, WBTC) themselves are not risk-free. They have positive Dialectic scores on at least two of the above categories. However, this should provide a useful heuristic for communities looking to evaluate whether to include a new strategy or to simply stick to the benchmark.
DAO Treasury Performance Attribution
The performance attribution model we use to answer our two original questions (what the performance of a decentralized surplus has been, and how much of it was attributable to active decisions that the community or a delegated manager has taken) is a simplified period-on-period Brinson-Fachler attribution decomposition.
There are many simplifications that are necessary for this performance attribution model to work, but we are comfortable introducing this first iteration of a model as a simplified minimally viable performance attribution tool for DAOs.
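For readers unfamiliar with the decomposition, here is a minimal single-period sketch using the standard textbook Brinson-Fachler effects per domain: allocation (w_p - w_b)(R_b,i - R_b), selection w_b(R_p,i - R_b,i) and interaction (w_p - w_b)(R_p,i - R_b,i). The simplified variant used in this paper may group these terms differently, and all weights and returns below are invented.

```python
from typing import Dict


def brinson_fachler(
    bench_w: Dict[str, float],
    bench_r: Dict[str, float],
    port_w: Dict[str, float],
    port_r: Dict[str, float],
) -> Dict[str, Dict[str, float]]:
    """Single-period Brinson-Fachler effects for each domain."""
    total_bench = sum(bench_w[d] * bench_r[d] for d in bench_w)
    effects = {}
    for d in bench_w:
        active_w = port_w[d] - bench_w[d]
        effects[d] = {
            # Reward for over/underweighting a domain that beat/lagged the benchmark.
            "allocation": active_w * (bench_r[d] - total_bench),
            # Reward for picking assets within the domain, at benchmark weight.
            "selection": bench_w[d] * (port_r[d] - bench_r[d]),
            # Cross term between the weight and the selection decisions.
            "interaction": active_w * (port_r[d] - bench_r[d]),
        }
    return effects


# Invented example: domain weights and period returns in USD terms.
bench_w = {"USD": 0.40, "ETH": 0.50, "BTC": 0.10}
bench_r = {"USD": 0.01, "ETH": 0.05, "BTC": 0.08}
port_w = {"USD": 0.20, "ETH": 0.75, "BTC": 0.05}
port_r = {"USD": 0.01, "ETH": 0.03, "BTC": 0.08}

effects = brinson_fachler(bench_w, bench_r, port_w, port_r)
total_effect = sum(sum(e.values()) for e in effects.values())
active_return = sum(port_w[d] * port_r[d] for d in port_w) - sum(
    bench_w[d] * bench_r[d] for d in bench_w
)

for domain, e in effects.items():
    print(domain, {k: round(v, 5) for k, v in e.items()})
print(f"Active return: {active_return:.4%}")
```

When both sets of weights sum to 1, the three effects summed across domains equal the portfolio return minus the benchmark return, which is a useful sanity check on any attribution table.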
Appendix: Examples
Example 1: Simple benchmark index built with the basket
We picked out our own favorite minimum effort allocation. It is just about the lowest effort benchmark you could build with a very strong long bias on the broad crypto market from the perspective of a DAO. It’s suitable if your DAO has a structural long bias on crypto, including BTC. This may be the case for a lending protocol or an on-chain perp protocol. For example: wstETH 53.2%, WBTC 34.2% and sDAI for the remaining 12.6%.
Example 2: ETH-based DAO evaluating three new strategies
If an ETH-based DAO (i.e. one whose benchmark is 100% ETH) were evaluating three strategies to choose from, a risk discount table could look like the below.
We compute the Dialectic risk discounting scores for each of:
Muniswap AMM, a novel AMM design on a highly hyped Layer 2 secured by a multisig
Braave, the hypothetical lending protocol described in Example 3
A new stETH-based stablecoin forked from Liquity but without the ability to perform redemptions
When comparing these to the benchmark asset risk score, a community has one more data point to look at when it comes to deciding whether to execute a given strategy.
In this example, other than Braave, either the relative risks for each unit of excess return are too high, or the excess return for each unit of risk is too low. The community now has the vocabulary to debate the merits of allocation, or simply not take any decisions and stay in the benchmark asset, wstETH.
Example 3: Benchmark and performance attribution
Take as an example (spreadsheet for illustration) a DAO that governs Braave, a hypothetical lending market protocol which offers leverage against WBTC, stETH and USDC, among others. Given the weighting of their revenue, their surplus ends up accumulating as a mix of BTC, ETH, USDC and various governance tokens from ETH-based DeFi protocols. The community determines that their ideal benchmark is 10% BTC, 50% ETH and 40% USD.
The treasury committee, composed of ETH maxis, has received delegated authority from token holders to manage the surplus and decides to pursue some more active opportunities weighted away from USD and from BTC despite the mandate. The period returns for the benchmark assets are:
(Completely invented return figures below)
The portfolio selection the committee makes is:
The resulting performance attribution table is:
From this analysis, the community can conclude that, during this period, the treasury committee underperformed the benchmark on both its active domain allocation decisions and its within-domain selection decisions. The community can now evaluate whether its decision-making process is driving the management of the surplus towards the desired results, as stated in its Risk Appetite Statement in the context of a DAO-approved Risk Management Framework. It could well be the case that it is, in which case the benchmark allocation may need revisiting in a further DAO vote.
References
Bailey, J. V., & Tierney, D. E. (1998). Controlling Misfit Risk in Multiple-Manager Investment Programs. Research Foundation of CFA Institute.
Wright, M. A., & Mitchell Conover, C. (2024). Portfolio Performance Evaluation. In Refresher Readings (Vol. Portfolio Management). CFA Institute.
Colophon
About Steakhouse Financial
Steakhouse Financial is a boutique crypto-native advisory firm specialized in stablecoins of various types and backings.
Follow @SteakhouseFi for more grilling tips
Steakhouse.financial
dune.com/steakhouse
Disclaimers: steakhouse.financial/disclaimers
About Dialectic
Dialectic is a crypto-native fund specialized in onchain yielding and treasury management.
Follow @Dialectic for more wizardry
Dialectic.ky