DeepNFTValue: Machine Learning Driven Crypto Asset Pricing

Nikolai Yakovenko
11 min read · Apr 4, 2022

The Mission: Industry-Standard NFT Pricing

I started thinking about this problem a year ago — and couldn’t get it out of my mind. All during Miami Tech Week 2021, every time we’d take Uber Black to the next location, I’d be looking at the CryptoPunks Twitter bot, see the price another JPEG was selling for, and think I should build a deep learning model to estimate what they were worth.

Quickly you pick up on the fact that, like baseball cards, some Punks are commons, or “floors” as NFT traders call them, some trade for a bit of a premium, and some, like Zombies and Apes, sell for 20–50x the price of “floor” Punks.

The difference turned out to be this: with baseball cards, if a common was listed for $0.10 in the price guide, you’d be happy to get a couple of pennies for it in bulk. The “floor Punks,” by contrast, have out-performed most of the premiums over the past year.

During my involvement in this space, I’ve probably heard every alpha strategy that early Punks traders were trying a year ago… and just about all have under-performed buying the cheapest Punks for a competitive price. A story for another time…

In any case, before you can think about building a model to predict future Punk values, you need to estimate the current values. And that is what we’re doing here.

Now-Casting the JPEGs

Just as the Beckett price guide tells you what your baseball card is worth, the Kelley Blue Book tells you what your used car will go for, and Zillow will make an attempt to price your house, we’re proud to announce DeepNFTValue — the first machine learning driven company aiming to accurately price every market-relevant NFT.

We will soon announce the closing of our seed investment round, led by prominent crypto VCs and angel investoooors. With that we will expand our market-leading CryptoPunks valuation model to every NFT collection worth pricing.

Below is a bit of background on what we are doing: how we see the NFT and digital assets market developing, where pricing comes in, and a little bit on our strategy toward pricing these assets. We will follow up with Part 2 on how we train and evaluate machine learning models for pricing NFTs. This is a difficult, unique modeling problem, one you won’t see taught in school, not to mention the esoteric wrinkles of each project. [Are you looking at Apes before or after they claimed their $APE token?] You’re not going to want to miss Part 2.

For now, let me explain why accurate, timely asset pricing is necessary for the future of NFTs, what some short-, medium- and long-term use cases are, and what else to expect from DeepNFTValue.

Good at Tech, Curious About Crypto? Join Us!

Lastly, we are hiring! Most of all, we are looking for experienced back-end developers to help us scale to multiple projects, to train and run inference on our ensembles of deep learning models faster, and to put our price predictions on-chain, so they can be consumed by other on-chain programs such as NFT lending protocols.

Based on what has worked well for us so far, we are mostly looking for:

  • developers with 2–10 years of serious technical experience (beyond academic work) — in big tech, startup, significant open source project, etc
  • interested in machine learning and Web 3.0/crypto

In other words, we’re a bit Web 2.0 ourselves. You don’t need to be Solidity-native to come work with us, but if you want to join us on our journey, and learn some Solidity along the way, that would work well.

We will be filling other roles as well, especially if you read this 3–6 months from now. In particular, we will have business development roles (see below) and we are very much interested in expanding our ML and analytics group as well over time.

Let’s connect, and keep in touch. If you have great ideas for how to use our valuations to study market problem X, let us know — happy to share data and start by collaborating on a blog post. There is much to do in this space! We have a million ideas that we can’t possibly get to.

But it all starts with estimating fair pricing for the NFTs. Let us explain our approach to how and why.

The Problem: Pricing Is Hard

Fewer than 0.1% of the items in a typical NFT project trade on any given day. Many items in a collection don’t trade for months, if ever. These items still need to be priced, without direct price discovery from the market.

Markets shift weekly, if not daily, based on collector interest, the USD:ETH exchange rate, and other dynamic factors.

Simply “now-casting” current prices, based on recent transactions, historical ratios and project-specific factors, requires dedicated machine learning (ML) work. It is a challenging problem, requiring not just modeling expertise and technical rigor, but passion and creativity.

Initial attempts to curate NFT prices have focused on human evaluation. That approach has several problems:

  • Human curation is too slow to keep up with changes in the market.
  • Humans are biased and cannot estimate value accurately.
  • The sharpest market participants cannot be incentivized to participate.

Where the market needs baseline value estimates, these need to be generated by a well calibrated machine.

It is much like sports gambling, where we go by the “Pinnacle line” when a fair price is needed.

The Solution: DeepNFTValue — Real Time ML-Driven Pricing

We have built fully automated, machine-driven pricing, accounting for:

  • Bids, offers (listings) and sales in the marketplace
  • Comparison with similar items
  • Category trends in the marketplace
  • Historical relationships between categories and individual items

We train an ensemble of models, not just to predict future sales, but also to take account of many forms of “common sense” and structure in the NFT markets.
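To make the idea of an ensemble constrained by “common sense” concrete, here is a minimal Python sketch. It is an illustration only, not the DeepNFTValue model: the function name `ensemble_estimate`, the use of a median to combine predictions, and the specific rule (an estimate should not sit below the best standing bid or above the item's own listing) are all assumptions made for this example.

```python
from statistics import median
from typing import Optional, Sequence


def ensemble_estimate(
    predictions: Sequence[float],
    best_bid: Optional[float] = None,
    listing: Optional[float] = None,
) -> float:
    """Combine per-model price predictions (in ETH) into one estimate.

    Hypothetical "common sense" rule: the owner could accept the best
    standing bid, and a buyer could take the listing, so the estimate is
    clamped to lie between the two when they are known.
    """
    if not predictions:
        raise ValueError("need at least one model prediction")
    estimate = median(predictions)  # robust to a single outlier model
    if best_bid is not None:
        estimate = max(estimate, best_bid)
    if listing is not None:
        estimate = min(estimate, listing)
    return estimate


# Three hypothetical model outputs for one item, with a standing bid of 75 ETH.
print(ensemble_estimate([60.0, 70.0, 90.0], best_bid=75.0))
```

A median is used here only because it is simple and tolerant of one bad model; a real ensemble could weight models by their recent accuracy.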

More details in Part 2 where we dig into the modeling details, from a technical perspective.

For now, we will say that we are the only crypto startup taking an ML-native approach to this problem. Machine learning is not an afterthought for us; it’s the core of the company.

As the founder of DeepNFTValue, I have spent my whole career in applied machine learning, building world-class models for problems from language modeling, to genomics, to computer vision, to poker AI. None of these were just academic models built to beat a score on a benchmark (although we did that too), but applied ML models, on real problems useful in industry, games, or everyday life.

Like many of you, I caught the crypto bug years ago, and got into NFTs in early 2021. This is a real world problem that ML is uniquely positioned to solve. I’m happy that our proof of concept works well for CryptoPunks, and according to many market participants, helps anchor that market. Now, there’s just every other crypto asset left to go…

The Future: Pricing Millions of On-Chain Assets

On some of his podcast interviews, crypto-optimist Balaji Srinivasan talks about a not too distant future, where we have consensus ownership of millions of individual assets, with a live market on each, administered by some sort of million by million grid, DEX, etc.

For NFT collectoooors, that future is already here:

If you’re active in the NFT space, you can probably name all of those collections — from Punks to Apes, to Doodles, Nouns, MFers and Toadz. Even then it’s unlikely — perhaps impossible — that you would have a notion of what each item in the collection is worth in today’s market. The JPEGs above are all iconic projects — representing twenty out of ~500,000 candidate individual JPEG assets.

Sure, you can estimate the JPEGs’ values by floor price, last sale (I’d not recommend that) or a spreadsheet model based on rare and common attributes. But wouldn’t you rather just know the fair market price for every item?
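For reference, the spreadsheet-style baseline mentioned above can be sketched in a few lines of Python. The trait names and multipliers below are made up for illustration (loosely echoing the 20–50x premiums mentioned earlier); they are not DeepNFTValue's numbers or method.

```python
from typing import Dict, Iterable

# Hypothetical premium multipliers over the floor price, per attribute.
TRAIT_MULTIPLIERS: Dict[str, float] = {
    "Zombie": 20.0,
    "Ape": 30.0,
    "Hoodie": 1.5,
}


def spreadsheet_estimate(floor_eth: float, traits: Iterable[str]) -> float:
    """Floor price times the largest known trait multiplier.

    Items with no recognized trait are valued at the floor.
    """
    multiplier = max(
        (TRAIT_MULTIPLIERS.get(trait, 1.0) for trait in traits),
        default=1.0,
    )
    return floor_eth * multiplier


# A hypothetical Punk with a Hoodie, when the floor is 100 ETH.
print(spreadsheet_estimate(100.0, ["Hoodie"]))
```

A table like this ignores trait combinations, recent bids and listings, and shifts in the ratio between categories, which is where a learned model has room to do better.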

NFT pricing that is accurate, both in real time and historically, helps with several obvious problems:

  • What are my bags worth?
  • What’s my live PnL given reasonable liquidation prices?
  • What’s a fair trade with another collector?
  • How has this subset of JPEGs performed lately, and how is it trending?
  • At what price should I list my NFT, so that it isn’t too high to get any interest, yet not so low that it gets snapped up immediately by market makers (who maintain their own proprietary valuation models)?
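The first two questions in the list reduce to simple arithmetic once per-item estimates exist. The sketch below assumes a hypothetical `estimates` mapping from token ID to an estimated price in ETH, and a hypothetical `haircut` parameter standing in for “reasonable liquidation prices”; neither is part of any real DeepNFTValue API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Holding:
    token_id: int
    cost_eth: float  # what the collector paid for the item


def portfolio_value(holdings: List[Holding], estimates: Dict[int, float]) -> float:
    """Sum of estimated values (ETH) for every item held."""
    return sum(estimates[h.token_id] for h in holdings)


def unrealized_pnl(
    holdings: List[Holding],
    estimates: Dict[int, float],
    haircut: float = 0.0,
) -> float:
    """Estimated value after a liquidation haircut, minus total cost."""
    liquidation_value = portfolio_value(holdings, estimates) * (1.0 - haircut)
    total_cost = sum(h.cost_eth for h in holdings)
    return liquidation_value - total_cost


# Hypothetical bag of two items and hypothetical estimates.
bag = [Holding(token_id=1, cost_eth=50.0), Holding(token_id=2, cost_eth=80.0)]
prices = {1: 70.0, 2: 75.0}
print(portfolio_value(bag, prices), unrealized_pnl(bag, prices, haircut=0.1))
```

An item missing from `estimates` raises a `KeyError` here, which is deliberate: silently valuing an unpriced item at zero would understate the bag.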

We can also answer more complex questions like:

For CryptoPunks, we answer these questions right now, on DeepNFTValue, and have been for months now.

Mo Prices, Mo Problems

What that single picture goes to show is that you can’t imagine tracking all 500,000+ prices by hand, not even for 20 well-known, iconic projects. You need an automated system to pull in price activity, train models, evaluate models, ensemble and push out price estimates.

What we’ve seen so far is that 10,000-item PFP projects, once they undergo price discovery, do tend to form similar patterns of price distributions. We will drill down on what those patterns are in future posts, but suffice it to say that the same algorithms work for pricing multiple PFP collections, provided those algorithms are flexible enough to learn which attributes and combinations matter, what items are viewed as similar by the market, etc.

Looking past the “blue chip” NFT projects — there will be more — who knows what other kinds of assets we will be looking at on-chain a couple of years from now? Either way, we will be there to price millions of new individual assets on the blockchain, based on logical constraints, the structure of the project, and spoof-filtered market activity.

We will no doubt need to develop new pricing algorithms, as not all assets will behave like a 10,000 item PFP project. Every sub-industry, especially in engineering, over time builds its own tools.

Fudding Our Own Bags

  • Do we really need to develop original deep learning models, and spend thousands of dollars a month on GPUs, to estimate CryptoPunk prices?
  • Who is going to pay for these price estimates?
  • If your prices are so good, why don’t you just trade this information, and beat the NFT market makers at their own game?
  • Wen token??

We have heard these questions, and more, from crypto-knowledgeable folks throughout the fundraising process.

The truth is yes, it’s a bit crazy to care this much about modeling NFT prices, updating those prices daily, then giving them away for free. However, let’s take a look into the future, a couple of years from now. Are we realistically going to be looking at JPEG collections, without accurate mark-to-market prices? Will we be unable to sort our OpenSea bags by value, not just purchase price or collection? Will we not be able to sort deals in the market by anything other than floors, restricted by attribute? I don’t think so.

There will be standard, accurate pricing, and we aim to fill that need in the market. As such, everything else, like tokens and market-making, is a bit of a distraction. Furthermore, while market makers are known to make sharp markets, they need not cover every asset, much less every collection. They can have gaps in their coverage; as a mark-to-market provider, we cannot.

Scaling an accurate, timely, 100% automated ML-driven model for pricing every project that matters (or at least has experienced multi-party price discovery) is our single mission. Much of the work scales very well from 1 to 10 to 100 projects. A bit of the work needs to be done on a per-project basis.

This can’t be done “well enough” with part-time or hacky work, but nor do we think it is impossible to price millions of assets accurately, reacting to the market quickly, with a relatively small team, a reasonable AWS budget, and a laser focus on the pricing problem.

We anticipate that, at steady state, we will be able to price and onboard a new project within a day or two. Even so it’s a long road to millions of assets, and we’re going to need to put a lot of automated eval in place — and a bit of human eval — to make sure all of these price models are working reasonably. Make no mistake, this is a lot of work!

Unlocking Billion$ in Illiquid Assets

Take CryptoPunks, the first major NFT project, and still probably the largest by market cap. We estimate Punks at 1.087 million ETH, or roughly $3.8B USD.
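As a quick sanity check on those figures, the arithmetic below derives the average estimated value per Punk and the ETH price implied by the two totals. Both outputs are derived here, not quoted from the source, and the USD figure is rounded, so treat the implied exchange rate as approximate.

```python
PUNKS_SUPPLY = 10_000          # items in the collection
MARKET_CAP_ETH = 1_087_000     # estimated total value, from the text
MARKET_CAP_USD = 3.8e9         # rounded USD figure, from the text

# Average estimated value per Punk, in ETH.
avg_value_eth = MARKET_CAP_ETH / PUNKS_SUPPLY

# ETH price implied by the two totals (approximate, since $3.8B is rounded).
implied_eth_usd = MARKET_CAP_USD / MARKET_CAP_ETH

print(f"average per Punk: {avg_value_eth:.1f} ETH")
print(f"implied ETH price: ${implied_eth_usd:,.0f}")
```

The average works out to 108.7 ETH per Punk and the implied ETH price to roughly $3,500. The average sits well above a “floor” price because a small number of rare items carry very large estimates.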

Of these 10,000 items, 80–90% have never sold for anything approaching today’s prices. Many of the most valuable assets (the Aliens) have not sold at all, or have not sold for a non-trivial amount, despite a last sale of 8,000 ETH and our estimated value of ~7,900 ETH per Alien.

Only ~800 Punks are thought to be lost on-chain. The rest are accounted for, but truly illiquid. Until they sell to another buyer, you can’t do much with them.

Bored Apes and other NFTs produce residual “drops” for their owners, in part to get around this problem of illiquidity.

Lending protocols and other financialization tools address this problem more directly. However, all of these, in some form, require knowing what the underlying asset is worth: what it would sell for in a reasonable time at auction. And this is exactly what we provide.

As for the business model, we plan to offer our price estimates, to all users, for free on our website. We are also experimenting with putting the price estimates on-chain, for better use in smart contracts and lending protocols.

As our prices become ubiquitous, and something that users expect, we will also license our pricing API to marketplaces and large aggregators. Surely, their users will want to be able to score their bags and sort the marketplace deals by best values.

Join Us

There’s a lot more we can say about this project, and crypto asset pricing in general. Please comment on this article — we’d love to continue the discussion.

If you are interested in joining us, we will update this piece with a better place to reach out. But for now, either email me via this article, or send me a DM on the bird app:

I look forward to meeting you in the metaverse. Thank you for reading!

[This is cross-posted from Substack.]

Nikolai Yakovenko

AI (deep learning) researcher. Moscow → NYC → Bay Area → Miami