Data Sources
We believe you deserve to know exactly where our data comes from. This page documents our data sources, collection methods, and known limitations. No black boxes.
Honest Disclaimer
All data sources have inherent biases and limitations. Our valuations are estimates based on available market data, not guarantees of actual sale price. See our Limitations page for a complete list of known issues.
Primary Data Sources
Public Resale Marketplaces
Aggregated listing data from publicly accessible marketplace APIs and web sources
Sources:
- eBay completed listings (public API)
- Poshmark sold items (public data)
- Depop historical sales
Update frequency: Daily
Volume: ~2.1M new records/month
Coverage: US, UK, EU markets
Known limitations:
- Self-reported condition by sellers
- May not reflect private sales
- Geographic bias toward English-speaking markets
Brand MSRP & Retail Data
Original retail pricing from brand catalogs and retail APIs
Sources:
- Official brand websites
- Retail partner feeds
- Historical price archives
Update frequency: Weekly
Volume: ~850K active SKUs tracked
Coverage: Fashion, outdoor gear, electronics
Known limitations:
- Some brands don't publish MSRP publicly
- Historical pricing may be incomplete for older items
- Limited coverage for independent/artisan brands
Materials & Chemical Databases
Scientific databases for toxicology and material safety assessments
Sources:
- EPA ToxCast (public)
- PubChem compound data
- OEKO-TEX standards (licensed)
Update frequency: Monthly
Volume: ~12K chemical compounds tracked
Coverage: Textiles, plastics, metals, dyes
Known limitations:
- Not all chemicals have complete toxicity data
- Product-specific testing not performed (risk is estimated)
- New chemicals may not be in databases yet
Sustainability Certifications
Verified certification databases from official bodies
Sources:
- B Corp certified list
- Fair Trade registry
- GOTS certified facilities
Update frequency: Monthly
Volume: ~45K certified entities
Coverage: Global
Known limitations:
- A certification does not verify actual practice
- Some certifications are self-reported
- Coverage varies by industry
Data Processing Pipeline
How raw data becomes actionable intelligence:
Collection
Raw data ingested from APIs, web sources, and partner feeds
All data is collected in compliance with Terms of Service and robots.txt. We do not scrape private or authenticated content.
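As an illustration of the robots.txt check described above, here is a minimal sketch using Python's standard urllib.robotparser module. The user agent name, the sample robots.txt content, and the example URLs are all hypothetical; this is not our production collector, only one way such a check could look.

```python
from urllib.robotparser import RobotFileParser

USER_AGENT = "example-collector"  # hypothetical crawler name

# Hypothetical robots.txt content, inlined so the sketch needs no network access.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /account/
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set that can be queried per URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser: RobotFileParser, urls, user_agent: str = USER_AGENT):
    """Keep only the URLs that the parsed rules permit for this user agent."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    candidates = [
        "https://example.com/listings/123",
        "https://example.com/account/orders",
    ]
    print(allowed_urls(build_parser(SAMPLE_ROBOTS), candidates))
```

In this sketch any URL under /account/ is dropped before a request is ever made, which mirrors the rule that private or authenticated content is not collected.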
Validation
Automated and manual checks for data quality
Prices more than 3 standard deviations from the mean are flagged as outliers, duplicates are removed, and incomplete records are filtered out.
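The validation checks could be sketched roughly as follows. The record fields (listing_id, brand, price), the choice of listing_id as the duplicate key, and computing the standard deviation over the whole batch are assumptions made for illustration; the actual grouping and schema may differ.

```python
from statistics import mean, pstdev

# Hypothetical required fields; the real schema is not documented here.
REQUIRED_FIELDS = ("listing_id", "brand", "price")


def validate(records, z_threshold=3.0):
    """Return (clean, flagged) after dropping incomplete and duplicate records.

    Outliers are flagged rather than deleted, so they can be reviewed later.
    """
    seen_ids = set()
    complete = []
    for rec in records:
        # Filter records with a missing or empty required field.
        if any(rec.get(field) in (None, "") for field in REQUIRED_FIELDS):
            continue
        # Remove duplicates, assuming listing_id identifies a listing.
        if rec["listing_id"] in seen_ids:
            continue
        seen_ids.add(rec["listing_id"])
        complete.append(rec)

    prices = [rec["price"] for rec in complete]
    if len(prices) < 2:
        return complete, []

    mu, sigma = mean(prices), pstdev(prices)
    clean, flagged = [], []
    for rec in complete:
        is_outlier = sigma > 0 and abs(rec["price"] - mu) > z_threshold * sigma
        (flagged if is_outlier else clean).append(rec)
    return clean, flagged
```

Note that a 3-sigma rule only becomes meaningful with enough records: in a very small batch no single price can sit more than 3 standard deviations from the mean.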
Normalization
Standardization across different data formats
Brand names are unified, condition grades are mapped to our scale, and currencies are converted using daily exchange rates.
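A minimal sketch of these three normalization steps is shown below. The alias table, the 1 to 5 condition scale, the field names, and the exchange rates are all hypothetical placeholders, not our actual mappings.

```python
# Hypothetical lookup tables for illustration only.
BRAND_ALIASES = {
    "levis": "Levi's",
    "levi's": "Levi's",
    "levi strauss": "Levi's",
}
CONDITION_SCALE = {
    "new with tags": 5,
    "like new": 4,
    "good": 3,
    "fair": 2,
    "poor": 1,
}


def normalize(record, usd_rates):
    """Return a copy of record with unified brand, numeric condition, and USD price.

    usd_rates maps a currency code to that day's rate into USD (assumed input).
    """
    raw_brand = record["brand"].strip()
    brand = BRAND_ALIASES.get(raw_brand.lower(), raw_brand)
    # Unknown condition labels map to None instead of a guessed grade.
    condition = CONDITION_SCALE.get(record["condition"].strip().lower())
    price_usd = round(record["price"] * usd_rates[record["currency"]], 2)
    return {**record, "brand": brand, "condition": condition, "price_usd": price_usd}


if __name__ == "__main__":
    listing = {"brand": " LEVIS ", "condition": "Like New", "price": 40.0, "currency": "GBP"}
    print(normalize(listing, {"GBP": 1.25, "USD": 1.0}))
```

Leaving unrecognized condition labels as None, rather than forcing them onto the scale, keeps unmapped values visible for later review.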
Enrichment
Additional context added from reference databases
Material composition is inferred, and each item is assigned a category and a brand tier.
Model Training
Processed data used to train and update ML models
Models are retrained monthly with holdout validation, and model performance is monitored continuously.
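To show what holdout validation means in practice, here is a small sketch. The 80/20 split, the fixed seed, the mean-price baseline standing in for a real model, and mean absolute error as the metric are assumptions for illustration, not a description of our actual models.

```python
import random
from statistics import mean


def holdout_split(rows, holdout_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and return (train, holdout)."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_holdout = max(1, int(len(rows) * holdout_fraction))
    return rows[n_holdout:], rows[:n_holdout]


def fit_baseline(train):
    """Stand-in 'model': always predicts the mean training price."""
    average_price = mean(price for _, price in train)
    return lambda features: average_price


def mean_absolute_error(model, rows):
    """Average absolute gap between predicted and actual prices."""
    return mean(abs(model(features) - price) for features, price in rows)


if __name__ == "__main__":
    # Each row is (features, sale_price); the values here are made up.
    rows = [({"brand": "A"}, 40.0 + i) for i in range(10)]
    train, holdout = holdout_split(rows)
    model = fit_baseline(train)
    # The holdout rows were never seen during fitting, so this error
    # estimates how the model behaves on new listings.
    print(mean_absolute_error(model, holdout))
```

The key property is that the holdout rows are excluded from fitting, so the reported error is not flattered by data the model has already seen.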
What We Don't Use
Transparency also means being clear about data we don't collect or use:
Personal User Data
We don't track individual user behavior or collect PII for model training
Private Sales Data
No access to private marketplace transactions or internal business data
Proprietary Retailer Data
Unless explicitly licensed, we don't use internal retailer pricing strategies
Social Media Scraping
We don't scrape Instagram, TikTok, or other social platforms
Data Freshness & Quality Metrics
(from marketplace to our system)
(required fields populated)
(outliers or quality issues)
Metrics updated monthly. Last update: December 2024
Questions About Our Data?
We're committed to transparency. If you have questions about specific data sources or our methodology, or want to report a data quality issue, please reach out.