Appraisal Performance

Evaluation and Selection Method

How does SPICE determine whether an appraisal tool's performance is up to par?

The performance is evaluated via the following method:

First, a set of recent transaction pricing data is algorithmically selected for the desired NFT collection.

Then, each pricing model is called upon to predict the price of each NFT in the selected dataset, and the error between each model's predicted price and the actual traded price is recorded for every NFT in the set.
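In concrete terms, the per-NFT error is an absolute percent error. The sketch below is illustrative only; the function name and arguments are assumptions, not SPICE's implementation:

```python
# Absolute percent error of one prediction, expressed as a fraction
# (e.g. 0.12 = 12%). Names here are illustrative, not SPICE's API.
def absolute_percent_error(predicted_price: float, traded_price: float) -> float:
    return abs(predicted_price - traded_price) / traded_price
```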

Then, four metrics are calculated from the resulting set of errors:

  • Median absolute percent error (MAPE)

  • 95th percentile absolute percent error (95APE)

  • Maximum absolute percent error (MaxAPE)

  • The geometric mean of the three metrics above, called the Normal Score or ‘N Score’

Finally, the model with the lowest ‘N Score’ is deemed the most accurate pricing model for the dataset.

The most accurate model is accepted by SPICE's appraisal aggregator and used to appraise the NFT in question.
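Putting the steps together, a minimal sketch of the selection logic might look like this. The model interface, data shapes, and names below are assumptions for illustration, not SPICE's actual code:

```python
import numpy as np

def n_score(errors: np.ndarray) -> float:
    """Geometric mean of MAPE, 95APE, and MaxAPE over a set of
    absolute percent errors (illustrative sketch, not SPICE's code)."""
    mape = np.median(errors)            # median absolute percent error
    p95ape = np.percentile(errors, 95)  # 95th percentile absolute percent error
    max_ape = np.max(errors)            # maximum absolute percent error
    return (mape * p95ape * max_ape) ** (1 / 3)

def select_best_model(models: dict, sales: list) -> str:
    """Return the name of the model with the lowest N Score.

    `models` maps a model name to a callable that prices an NFT;
    `sales` is a list of (nft, traded_price) pairs. Both interfaces
    are hypothetical, used only to sketch the method."""
    scores = {}
    for name, predict in models.items():
        errors = np.array(
            [abs(predict(nft) - price) / price for nft, price in sales]
        )
        scores[name] = n_score(errors)
    return min(scores, key=scores.get)
```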

Median Absolute Percent Error (MAPE)

MAPE is calculated by taking the absolute percent error of each prediction in the dataset and evaluating the median of those values.
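As a worked example: if the absolute percent errors for a dataset are 5%, 12%, and 30%, the MAPE is the middle value once sorted, i.e. 12%.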

Below, we compare the MAPE of individual industry appraisal tools versus SPICE's aggregator over time. (Lower is better)

Normal Score (N Score)

N Score is calculated by taking the MAPE, 95APE, and MaxAPE values for the aforementioned dataset of errors and evaluating their geometric mean.
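As a worked example: for a dataset where a model's MAPE is 0.10, 95APE is 0.50, and MaxAPE is 1.00, its N Score is (0.10 × 0.50 × 1.00)^(1/3) ≈ 0.37.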

Below, we compare the N Score of individual industry appraisal tools versus SPICE's aggregator over time. (Lower is better)
