### Market growth

We start by investigating the evolution of NFT sales in our dataset over time. We find that interest in the collections remained stable until the end of 2020, then started to gain traction in 2021, especially in terms of available NFTs on the market (see SI Fig. 1). The number of primary sales grew from an average of 14 daily sales in January 2021 to 784 daily sales in March 2022, when the market peaked, implying a percentage growth of \(5500\%\) (see Fig. 3). Similarly, secondary sales grew by 110,177%, starting from 9 sales/day in January 2021 and reaching 9925 sales/day in March 2022. Interestingly, around October 2021, the number of secondary sales started to exceed the number of primary sales, a trend that still holds at the time of writing. This surge in activity led to a growth in the daily volume of trades of 18,520% between January 2021 and March 2022 (see Fig. 3 inset), and attracted new users: the number of new buyers increased by 41,755% in 2021. These results indicate an overall growth in the popularity of NFT collections on OpenSea, both with respect to the size of the NFT community and to the total market value.
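
As a quick check of the arithmetic behind these growth figures, the sketch below recomputes the percentage growth from the rounded daily averages quoted above. The function name `percentage_growth` is introduced here for illustration only. Because the inputs are rounded averages, the secondary-sales result agrees with the reported 110,177% only up to rounding.

```python
def percentage_growth(start: float, end: float) -> float:
    """Relative change from `start` to `end`, expressed in percent."""
    return (end - start) / start * 100.0


# Rounded daily averages quoted in the text (January 2021 -> March 2022).
primary = percentage_growth(14, 784)
secondary = percentage_growth(9, 9925)

print(f"primary sales growth:   {primary:.0f}%")
print(f"secondary sales growth: {secondary:.0f}%")
```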

Different collections contributed to varying extents to the growth of the collectible NFT market. Figure 4 shows the distribution of key market properties across NFT collections: total number of sales per collection (Fig. 4a), total traded volume per collection (Fig. 4b) and median sale price of the items in each collection (Fig. 4c).

Collections are widely heterogeneous with respect to market properties. \(25.6\%\) of the collections have generated fewer than 1000 sales, whereas 17.1% have generated more than 10,000 (see Fig. 4a). Further, \(43.9\%\) of the collections had a total trade volume below a million dollars, whereas \(3.64\%\) generated more than a hundred million dollars of sales on the marketplace (see Fig. 4b). The success of a collection can also be measured by looking at the median price at which its NFTs are sold on OpenSea. For 18.3% of the collections, the median sale price is lower than or equal to a hundred dollars, whereas it is higher than a thousand dollars for 12.9% of the considered collections (see Fig. 4c). These findings indicate that collectible NFTs do not all meet the same success on OpenSea, a claim that is supported by the famous success stories of a few collections, whereas the others quickly flop on the platform^{51}.

### Quantifying rarity

We quantify the distribution of rarity scores for items within the same collection. As an example, Fig. 5 shows the distribution of rarity for three popular collections, namely CryptoPunks, Bored Ape Yacht Club, and World of Women.

For CryptoPunks, the median rarity score is 0.82, with only one of the 10,000 CryptoPunks having a rarity score above 75, whereas \(99.7\%\) of the tokens have a rarity score below 10 (see Fig. 5a). Moreover, as most of the CryptoPunks have a low rarity score, the least rare ones are aggregated into two bins, whereas the rarest one occupies the only bin with a high rarity score within the collection. The median rarity score for Bored Ape Yacht Club is 20.3, and 26 apes (i.e., \(0.26\%\) of the collection) have a rarity score above 75. The distribution is skewed towards lower rarity scores, with \(68.2\%\) of the assets having a rarity score below 25, among which \(8.23\%\) fall below a rarity score of 10 (see Fig. 5b). The profile for the World of Women collection is also not as heterogeneous as that of CryptoPunks; it has a median rarity score of 14.8 and only 24 assets (\(0.24\%\) of the collection) have a rarity score above 75. \(87.3\%\) of the tokens have a rarity score below 25, and \(19.9\%\) of those lie below a rarity score of 10 (see Fig. 5c). To generalize these observations, we calculated the Spearman rank correlation coefficient between the rarity bin and the number of NFTs in each rarity bin. A negative value of the correlation coefficient indicates that the higher the rarity score, the lower the supply of NFTs within the considered collection. Like the three example collections in Fig. 5a–c, \(96\%\) of the collections in our dataset have a Spearman rank correlation \(r \le 0\), as shown in Fig. 5d, where the violin plot represents the probability distribution of the Spearman rank correlation by collection. We compare the ability of six different statistical distributions, namely the exponential, power-law, uniform, Cauchy, log-normal and Lévy distributions, to capture the distribution of rarity for each collection, using the Akaike model selection method^{52} (see SI for more details).
We find that, among the distributions considered, \(90\%\) of the collections are best described by a log-normal distribution (with \(\langle \mu \rangle = 0.91 \pm 0.16\), see SI Fig. 2), only \(7\%\) by an exponential, \(1\%\) by a uniform function and the rest by heterogeneous distributions such as power laws or Lévy distributions (for a visualization of a sample of these distributions, see SI Fig. 3).
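
To illustrate the idea of Akaike-based model selection, the following minimal sketch compares only two of the six candidate families (exponential and log-normal), chosen because their maximum-likelihood estimates have closed forms. It is not the fitting procedure used for the results above, which is described in the SI. The synthetic scores, the random seed, the value \(\sigma = 0.5\) and all function names are assumptions made for this example.

```python
import math
import random


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


def exponential_loglik(xs):
    """Maximised log-likelihood of an exponential model (MLE rate = 1 / mean)."""
    n = len(xs)
    rate = n / sum(xs)
    return n * math.log(rate) - rate * sum(xs)


def lognormal_loglik(xs):
    """Maximised log-likelihood of a log-normal model (MLE mu, sigma^2 on log x)."""
    logs = [math.log(x) for x in xs]
    n = len(logs)
    mu = sum(logs) / n
    var = sum((v - mu) ** 2 for v in logs) / n
    return -sum(logs) - 0.5 * n * math.log(2 * math.pi * var) - 0.5 * n


# Synthetic rarity scores (assumption): log-normal with mu = 0.9, sigma = 0.5.
rng = random.Random(0)
scores = [rng.lognormvariate(0.9, 0.5) for _ in range(2000)]

aics = {
    "exponential": aic(exponential_loglik(scores), n_params=1),
    "log-normal": aic(lognormal_loglik(scores), n_params=2),
}
best = min(aics, key=aics.get)
print(aics, "->", best)
```

The family with the lowest AIC is retained; extending the comparison to the remaining families would require numerical fitting, which is omitted here.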

The same correlation analysis performed using the rarity rank confirms our results (see SI Fig. 4). In the following, we will focus on the NFT rarity score, because this metric takes into account all the traits associated with an NFT, and is therefore more suitable for quantifying NFT properties and rarity. All the following results are replicated using the trait rarity rank as a robustness check (see Supplementary Information section E.1).
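
The correlation analysis between rarity bins and the number of NFTs per bin can be sketched as follows. This is a plain-Python implementation of the Spearman rank correlation (Pearson correlation of average ranks), shown on hypothetical bin counts; the data and function names are assumptions for illustration and are not taken from the dataset.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical collection: rarity bin index vs. number of NFTs in that bin.
rarity_bin = [0, 1, 2, 3, 4, 5]
nft_count = [5200, 3100, 1200, 400, 90, 10]

rho = spearman(rarity_bin, nft_count)
print(f"Spearman correlation: {rho:.2f}")
```

A strictly decreasing supply across bins, as in this toy example, gives a coefficient of about -1; a collection with an even supply across bins would give a value near 0.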

Our analysis indicates that the distribution of rarity within a collection is heterogeneous, thus leading to a situation where rare NFTs are genuinely scarce on the marketplace. Notice that while this may seem trivial (“rare items are fewer than common items”), the distribution of trait rarity, and in turn the combination of traits in single NFTs, could in principle generate a wide range of distributions of NFT rarity, including homogeneous ones.

### Rarity and market performance

To measure the relationship between rarity and market performance, we compute the rarity score of each NFT, and we split the assets into quantiles with respect to their rarity score to analyse collections individually. We then compare the median sale price across quantiles. We use quantiles to ensure that the NFTs within a collection are evenly balanced across bins, so as to avoid a single collection skewing the results of the aggregated analysis by having all of its NFTs concentrated in a single bin. For the analysis of individual collections, NFTs are partitioned into 20 quantiles, whereas 100 quantiles are used when aggregating the collections together.
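
The quantile-based procedure described above can be sketched as follows. This is a minimal illustration, assuming rank-based equal-size bins with ties broken by input order; the exact binning rule used in the analysis is not specified here, and the data and function names are hypothetical.

```python
from collections import defaultdict
from statistics import median


def assign_quantiles(scores, n_quantiles):
    """Label each item with a 1-based quantile using rank-based equal-size bins."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [0] * len(scores)
    for rank, idx in enumerate(order):
        labels[idx] = rank * n_quantiles // len(scores) + 1
    return labels


def median_price_by_quantile(scores, prices, n_quantiles):
    """Median sale price of the items falling in each rarity quantile."""
    grouped = defaultdict(list)
    for label, price in zip(assign_quantiles(scores, n_quantiles), prices):
        grouped[label].append(price)
    return {q: median(values) for q, values in sorted(grouped.items())}


# Hypothetical collection of 8 NFTs: rarity scores and sale prices in USD.
scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 80.0]
prices = [100, 110, 90, 105, 120, 115, 300, 2500]

result = median_price_by_quantile(scores, prices, n_quantiles=4)
print(result)
```

Because the bins are defined by rank rather than by score value, each collection contributes the same number of NFTs to every bin, which is the balancing property motivated in the text.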

First, we consider the relation between market behaviour and rarity for three exemplar collections, CryptoPunks, Bored Ape Yacht Club, and World of Women (see Fig. 6a–c). We observe that the median sale price at which NFTs are auctioned is relatively constant for the most common NFTs in each collection (rarity quantile smaller than 10), and then increases sharply for the rarest NFTs (rarity quantile larger than 10, see Fig. 6a–c). These findings are robust, and are also observed when we consider NFTs in all collections (see Fig. 6d). We notice that the median sale price is relatively flat for the \(50\%\) least rare NFTs, before increasing by \(195\%\) for the \(10\%\) rarest NFTs. More strikingly, the median sale price for the \(90\%\) least rare NFTs is equal to \(298 \pm 3.2\) USD, and rises to 1254 USD for the \(1\%\) rarest NFTs. Focusing on the top \(10\%\) rarest NFTs, the relationship between the median sale price *p* and the quantity (100-*q*), where *q* is the rarity quantile, is well described by a power law \(p\sim (100 - q)^{\alpha }\) with exponent \(\alpha = -0.55\) (see Fig. 6 inset). This result indicates a strong relationship between NFT rarity and median sale price.
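
One simple way to estimate a power-law exponent of this kind is an ordinary least-squares fit in log-log space. The sketch below is an illustration under stated assumptions and not necessarily the fitting method behind Fig. 6: the data are synthetic, generated with the reported exponent \(\alpha = -0.55\) and a hypothetical prefactor of 400 USD, and quantiles are assumed to be indexed so that \(100 - q \ge 1\).

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of log(y) = alpha * log(x) + log(c); returns (alpha, c)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    slope_num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    slope_den = sum((a - mx) ** 2 for a in lx)
    alpha = slope_num / slope_den
    c = math.exp(my - alpha * mx)
    return alpha, c


# Top 10% rarest NFTs: quantiles q = 90..99, so that 100 - q runs from 10 to 1.
quantiles = list(range(90, 100))
x = [100 - q for q in quantiles]

# Synthetic median prices following p = 400 * (100 - q)^(-0.55) (assumption).
median_prices = [400.0 * v ** -0.55 for v in x]

alpha, c = fit_power_law(x, median_prices)
print(f"alpha = {alpha:.2f}, prefactor = {c:.1f}")
```

A negative exponent means the price rises as \(100 - q\) shrinks, that is, as one moves towards the rarest quantiles.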

On the other hand, we find that rare NFTs are not sold as frequently as common ones on the marketplace. Looking at the individual collections, we see that the average number of sales decreases as the rarity of the NFTs increases (see Fig. 6e–g). The same holds when aggregating all collections together: the average number of sales decreases for rarer NFTs. In particular, the \(1\%\) least rare NFTs are sold, on average, \(10.8\%\) more often than the \(1\%\) rarest ones (see Fig. 6h).

To check that this behaviour holds when considering a shorter time span within OpenSea’s lifetime, we performed the same analysis considering only sales that occurred during the third quarter of 2021 (see SI Fig. 9) and during the fourth quarter (see SI Fig. 11). Our findings are also robust when considering the sale price in ETH rather than in USD (see SI Fig. 7), and when discarding the rarest and least rare NFTs from each collection (see SI Fig. 13). Moreover, we notice a similar pattern when quantifying the rarity of the NFTs with the NFT rarity rank instead of the NFT rarity score (see SI Fig. 5).

### Rarity and return on investment

NFTs can be purchased and later put on sale again on the marketplace. An NFT owner is free to set an initial price for an auction, and to transfer the ownership of the NFT to the highest bidder. As such, NFTs that were minted years ago, such as the CryptoPunks, can still be purchased on OpenSea. The results shown in Fig. 6 indicate that, within a collection, the rarest NFTs are typically sold at a higher absolute price than the least rare ones on the market. However, this fact does not necessarily imply that the return on investment of secondary sales is positive, as it does not take into account the price at which the asset was initially purchased before being auctioned again. To study whether the correlation between rarity and price strengthens as a token keeps being exchanged on the market, we computed the return *R* of the \(k^{\text{th}}\) sale of an NFT as:

$$\begin{aligned} R = \frac{P(k) - P(k-1)}{P(k-1)}, \end{aligned}$$

(3)

where *P*(*k*) is the price that was paid for the NFT for its \(k^{\text{th}}\) sale. A positive return indicates that the NFT was sold at a higher price than the one it was bought for, whereas a negative return represents a financial loss for the seller.
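
As a worked illustration of Eq. (3), the sketch below computes the return of each resale from the chronological list of prices paid for a single NFT. The price history and the function name `sale_returns` are hypothetical.

```python
def sale_returns(prices):
    """Return R = (P(k) - P(k-1)) / P(k-1) for every sale after the first."""
    return [(curr - prev) / prev for prev, curr in zip(prices, prices[1:])]


# Hypothetical sale history of one NFT, in USD, in chronological order.
history = [100.0, 150.0, 120.0]

returns = sale_returns(history)
print(returns)
```

In this example, the second sale yields a positive return (the NFT was bought at 100 USD and resold at 150 USD), whereas the third yields a negative return (bought at 150 USD and resold at 120 USD).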

Figure 7a shows the median return computed when aggregating all collections by rarity quantile. We find that the rarest NFTs have a much higher median return, whereas the value is almost constant in the first half of the curve. Focusing on the top \(10\%\) rarest NFTs, we observe that the relationship between the quantity (100-*q*), where *q* is the rarity quantile, and the median return *R* is well described by a power law \(R\sim (100 - q)^{\alpha }\), with an exponent \(\alpha = -0.29\) (see Fig. 7 inset). The median return is relatively flat around \(0.24 \pm 0.001\) for the \(50\%\) least rare NFTs, thus indicating that, in terms of return, it makes no noticeable difference whether an NFT is one of the least rare assets of the collection or one of average rarity, whereas the median return grows by \(105\%\) within the top \(10\%\) rarest NFTs. Finally, we study the relation between NFT rarity and the probability of generating negative returns. We observe that, on average, rarer NFTs are less likely to generate negative returns (see Fig. 7b). The fraction of sales generating negative returns is equal to \(34.6 \pm 0.58\%\) for the \(50\%\) least rare NFTs, but drops from \(30.5\%\) to \(22.9\%\) within the top \(10\%\) rarest NFTs, i.e., a decrease of \(24.9\%\). These results also hold when considering only the sales that occurred during a shorter time period, such as the third quarter of 2021 (see SI Fig. 10) and the fourth quarter (see SI Fig. 12). The same analysis has been performed by considering the sale prices in ETH (see SI Fig. 8) and by discarding the rarest and least rare NFTs of every collection (see SI Fig. 14). These results are also robust when using the NFT rarity rank to measure the rarity of an NFT rather than the rarity score (see SI Fig. 6).
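
The two summary statistics used in this section, the median return and the fraction of sales with negative returns, can be sketched as follows. The two groups of returns are hypothetical toy values; only the last computation uses figures from the text, namely the drop in the fraction of negative returns from 30.5% to 22.9%.

```python
from statistics import median


def summarize_returns(returns):
    """Return (median return, fraction of sales with a negative return)."""
    negative = sum(1 for r in returns if r < 0)
    return median(returns), negative / len(returns)


def relative_change(start, end):
    """Relative change from `start` to `end`, in percent."""
    return (end - start) / start * 100.0


# Hypothetical returns for sales of common and of rare NFTs.
common = [0.30, -0.10, 0.25, -0.40, 0.20]
rare = [0.90, 0.45, -0.05, 0.60, 0.70]

common_median, common_negative = summarize_returns(common)
rare_median, rare_negative = summarize_returns(rare)
print(common_median, common_negative)
print(rare_median, rare_negative)

# Figures quoted in the text: 30.5% -> 22.9% of sales with negative returns.
drop = relative_change(30.5, 22.9)
print(f"relative change: {drop:.1f}%")
```

The relative change of about -24.9% matches the decrease reported above.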

### Discussion

We have quantified rarity in 410 NFT collections and analysed its effect on market performance. Rarity is a fundamental feature of NFTs belonging to a collection because (i) it allows users to categorise NFTs on the traditionally market-relevant axis of scarcity and (ii) it is based on human-readable, easy-to-identify traits that creators assign to NFTs. We have found that the distribution of rarity is heterogeneous throughout the vast majority of collections. We have shown that rarity is positively correlated with the sale price and negatively correlated with the number of sales of an NFT, with the effect being stronger for the top \(10\%\) rarest NFTs. Last, we have shown that rarity is associated with a higher return on investment and a lower probability of yielding negative returns in secondary sales.

The finding that most rarity distributions are heavily heterogeneous, with few very rare NFTs, is interesting since in principle more homogeneous distributions would be possible. The ubiquitous nature of this pattern may indicate either that creators deliberately choose heterogeneous distributions (design perspective) or that heterogeneous distributions help make a collection successful and are therefore dominant in our sample of actively traded collections (evolutionary perspective). While information on the rationale behind rarity distributions is hard to retrieve^{53}, the design and evolutionary explanations could have fuelled one another over time, with creators embedding rarity in imitation of successful pre-existing collections. In this perspective, our results could help to further improve the design of NFT collections.

From the point of view of trading, it is important to highlight that our results concern genuinely emerging properties of the NFT market, since we only considered user-to-user sales. In doing so, we discarded the very first creator-to-user sales, which are often based on lotteries that prevent users from selecting which NFT to buy^{48}. We found that while the impact of rarity is particularly strong for, and among, the rarest NFTs, which are thus genuinely non-fungible according to the market, its influence propagates to a large number of somewhat rare NFTs (see Fig. 6g, inset and Fig. 7a, inset). The most common NFTs, on the other hand, appear to behave more uniformly in the market, which appears to consider them essentially “fungible”. Overall, we anticipate that our results in this context may help inform the decisions of users interested in the financial aspects of NFTs.

Our study has limitations that future work could address. First, our dataset is limited to collections available on OpenSea, the biggest NFT market, and sold on the Ethereum blockchain. A natural extension would cover other platforms (potentially on other blockchains) and different types of NFTs, assessing whether rarity has the same effects on other kinds of NFTs such as those related to gaming and the metaverse. Second, we used the rarity score to quantify the rarity of an NFT. While this method does take into account every trait associated with an NFT, it does not consider possible effects stemming from the combination of multiple traits (e.g., two traits that are each common in a collection might be present together in just one NFT, making it very rare). Future work could assess whether such second-order effects play a role in the market performance of NFTs. Third, we considered traits as they are encoded in the NFT metadata and reported on rarity.tools, limiting the analysis to collections where such metadata are available and consistently recorded. Future work making use of computer vision techniques to extract human-readable attributes from the visual information of NFTs would yield larger datasets and make it possible to assess whether less “official” visual traits, potentially shared by NFTs in multiple collections and for which previously developed metrics might help^{54,55}, also play a role in the NFT market. Finally, while this work has focused on how rarity affects NFT market success, a natural extension of the work should focus on how buyers behave with respect to rarity.