Where the index comes from.
Every domain in the index originates from one of these public, open sources. We do not scrape private platforms and we do not buy data.
- Tranco
A research-grade ranking of the top one million domains, published by KU Leuven. Provides our long tail of established sites and feeds the visit estimate.
- Hacker News
The public Hacker News API, hosted on Firebase. We surface domains linked from front-page and rising stories.
- GitHub trending & awesome lists
Public GitHub data. Repositories with linked homepages, plus curated awesome-* lists, contribute domains and category hints.
- Public RSS feeds
A rotating set of well-known RSS feeds covering tech, design, indie web, and regional publications.
- Public web directories
Curated, openly licensed directories of indie sites, launches, and community catalogues.
- Common Crawl
The open web index. Used to confirm a domain is reachable and to seed the long tail of newly discovered hostnames.
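
To illustrate how domains from several of these sources could be combined while keeping track of where each one came from, here is a minimal Python sketch. It is an illustration under stated assumptions, not a description of the actual pipeline: the function names (`hostname_of`, `domains_from_tranco_csv`, `domains_from_hn_items`, `merge_sources`) are hypothetical, the Tranco input is assumed to be plain `rank,domain` CSV rows, Hacker News items are assumed to be dictionaries with an optional `url` field, and stripping a leading `www.` is a simplification chosen for the example.

```python
import csv
import io
from collections import defaultdict
from urllib.parse import urlsplit


def hostname_of(url):
    """Return the lowercased hostname of a URL, or None if it has none."""
    try:
        host = urlsplit(url).hostname
    except ValueError:
        # urlsplit rejects some malformed inputs (e.g. broken IPv6 literals).
        return None
    if not host:
        return None
    # Assumption for this sketch: treat "www.example.org" as "example.org".
    return host[4:] if host.startswith("www.") else host


def domains_from_tranco_csv(text):
    """Parse 'rank,domain' rows into a {domain: rank} mapping."""
    ranks = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) == 2 and row[0].isdigit():
            ranks[row[1].strip().lower()] = int(row[0])
    return ranks


def domains_from_hn_items(items):
    """Collect hostnames from story items; items without a URL are skipped."""
    found = set()
    for item in items:
        host = hostname_of(item.get("url", ""))
        if host:
            found.add(host)
    return found


def merge_sources(**sources):
    """Map each domain to the set of source names that contributed it."""
    index = defaultdict(set)
    for name, domains in sources.items():
        for domain in domains:
            index[domain].add(name)
    return dict(index)
```

A usage example with made-up input data:

```python
tranco = domains_from_tranco_csv("1,google.com\n2,example.org\n")
hn = domains_from_hn_items([{"url": "https://www.example.org/post"}])
index = merge_sources(tranco=tranco, hacker_news=hn)
# index["example.org"] contains both "tranco" and "hacker_news"
```

Recording the contributing sources per domain, rather than a flat list, makes it possible to state for any entry which public source it originated from.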
