How NeighborhoodTreasure ranks websites
Full, plain-English methodology. No editorial filter, no paid placement, no user accounts — every listing is generated from public signals and a fixed set of formulas described below.
Updated 1970-01-01
What NeighborhoodTreasure is
NeighborhoodTreasure is a continuously updated, public index of the websites currently surfacing across the open web. Domains are discovered automatically from public signal sources, classified into categories, scored along four axes, and presented as browsable feeds. There is no submission form — inclusion is fully automated based on whether a domain appears in our sources.
Where new sites come from
Discovery runs continuously — new domains are ingested every few minutes. The signal sources are entirely public:
- Tranco — the peer-reviewed combined top-list (Cisco Umbrella, Majestic, Cloudflare Radar, Farsight) — used as the long-term popularity backbone.
- Hacker News — story submissions and comment links, weighted by score and recency.
- GitHub Trending — repositories with linked homepages, daily and weekly trending lists.
- Public RSS feeds — a curated set of cross-domain feeds covering tech, design, news, and indie-web sources.
- Common Crawl — open web-scale crawl data, used for verification and metadata enrichment.
- Awesome-list directories — curated GitHub-hosted topic lists.
- Product Hunt — daily product launches.
- Reddit signal — subreddit-level link aggregation across geography- and topic-keyed sources.
Full provenance per source category is documented at /data-sources.
How sites are classified
Every newly discovered domain runs through a two-pass classifier:
- Heuristic pass. A fast, deterministic classifier matches the domain, TLD, title, and description against a fixed taxonomy: AI, tools, dev, design, media, news, video, music, gaming, art & creative, ecommerce, SaaS, education, research, community, finance, health, food, travel, sports, lifestyle, personal, government, search, email, and business.
- AI verification pass. A second pass using OpenAI's gpt-4o-mini reads the site's metadata plus a short Firecrawl-extracted text excerpt and either confirms the heuristic category or upgrades it. The AI also assigns a primary region, language, and a one-sentence factual description.
Heuristic-fallback assignments are flagged and re-checked on subsequent passes, so the Other category shrinks over time as classification improves.
When a site sits in Other but clearly belongs somewhere else, it almost always means we couldn't read enough text from the page during the scrape: a missing title or description, a blocked crawler, or content rendered later by JavaScript that we don't execute. As soon as more signal arrives (or the site unblocks our crawler), the next classification pass moves it into the right category automatically. Categorization is best-effort: a site's first assignment is never final, and accuracy improves with each pass.
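As a rough illustration of the two passes described above, here is a minimal Python sketch. The keyword table, the TLD hints, and the function names are hypothetical stand-ins rather than the production rules, and the AI verdict is passed in as a plain string instead of calling any external API.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword and TLD tables, for illustration only.
KEYWORDS = {
    "AI": {"ai", "llm", "chatbot"},
    "dev": {"developer", "api", "sdk"},
    "news": {"news", "headlines"},
}
TLD_HINTS = {"gov": "government", "edu": "education"}


@dataclass
class Classification:
    category: str
    needs_recheck: bool  # True when the heuristic fell back to "other"


def heuristic_classify(domain: str, title: str = "", description: str = "") -> Classification:
    """First pass: deterministic match on TLD, then on whole words."""
    tld = domain.rsplit(".", 1)[-1].lower()
    if tld in TLD_HINTS:
        return Classification(TLD_HINTS[tld], False)
    text = f"{domain} {title} {description}".lower()
    words = set(re.findall(r"[a-z0-9]+", text))
    for category, keywords in KEYWORDS.items():
        if words & keywords:
            return Classification(category, False)
    # Not enough signal: park in "other" and flag for a later pass.
    return Classification("other", True)


def apply_ai_verdict(first: Classification, ai_category: Optional[str]) -> Classification:
    """Second pass: keep the heuristic result unless the AI supplied a category."""
    if ai_category is None:
        return first
    return Classification(ai_category, False)
```

Matching on whole words rather than substrings keeps a short keyword such as "ai" from firing on unrelated words like "email".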
How traffic is estimated
Estimated monthly visits are derived from the site's Tranco global rank using a calibrated rank-to-traffic curve. These numbers are directional — they put sites into consistent traffic tiers and order them sensibly relative to one another. They are not, and don't claim to be, measurements of a site's actual analytics.
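To make the idea of a rank-to-traffic curve concrete, the sketch below assumes a simple power law. The curve shape, the `scale` and `exponent` values, and the tier boundaries are all assumptions chosen for illustration; the calibrated curve itself is not published on this page.

```python
def estimate_monthly_visits(tranco_rank: int, scale: float = 5e9, exponent: float = 0.9) -> float:
    """Map a Tranco rank to a directional visit estimate (assumed power law)."""
    if tranco_rank < 1:
        raise ValueError("rank must be >= 1")
    return scale * tranco_rank ** -exponent


def traffic_tier(visits: float) -> str:
    """Bucket an estimate into a coarse tier; boundaries are hypothetical."""
    tiers = ((1e8, "100M+"), (1e7, "10M+"), (1e6, "1M+"), (1e5, "100K+"))
    for floor, label in tiers:
        if visits >= floor:
            return label
    return "<100K"
```

Whatever the real constants are, the useful property is that the estimate falls steadily as rank worsens, which is what keeps tiers and relative ordering consistent.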
The four scores: Hot, New, Rising, Overall
Every site receives four independent scores.
Hot
Current public-signal velocity over the last 7 days: Hacker News upvotes, GitHub stars, RSS link frequency, and similar real-time signals, normalized against the site's own baseline. A small site getting a sudden burst of attention can outrank a household name in a quiet week.
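One way to picture "normalized against the site's own baseline" is a smoothed ratio, sketched below. The function name, the ratio form, and the smoothing constant are assumptions, not the production formula.

```python
def hot_score(signals_last_7d: float, baseline_weekly: float, smoothing: float = 1.0) -> float:
    """Ratio of this week's signal count to the site's typical week.

    The smoothing term (an assumed constant) avoids division by zero
    for sites with no history.
    """
    return (signals_last_7d + smoothing) / (baseline_weekly + smoothing)


# A small site with a burst versus a large site having a quiet week.
small_site = hot_score(signals_last_7d=40, baseline_weekly=2)
big_site = hot_score(signals_last_7d=10_000, baseline_weekly=12_000)
```

Under these assumptions the small site scores well above 1 and the large site just below it, which matches the behavior described above.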
New
Pure recency. The score is highest the day a domain is first detected and decays smoothly to zero over 30 days. After a month a site no longer scores on New regardless of how popular it becomes.
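The decay can be sketched as follows, assuming a linear ramp from 1 to 0 across the 30-day window. The text above only says the decay is smooth, so the linear shape and the 0-to-1 scale are assumptions.

```python
from datetime import datetime, timedelta, timezone

WINDOW_DAYS = 30  # from the description above


def new_score(first_seen: datetime, now: datetime) -> float:
    """Return 1.0 on the detection day, falling linearly to 0.0 at 30 days."""
    age_days = (now - first_seen).total_seconds() / 86_400
    return min(1.0, max(0.0, 1.0 - age_days / WINDOW_DAYS))


detected = datetime(2025, 1, 1, tzinfo=timezone.utc)
halfway = new_score(detected, detected + timedelta(days=15))
expired = new_score(detected, detected + timedelta(days=45))
```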
Rising
Acceleration relative to the site's own baseline — the percentage change in signal volume over the last 7 days versus the prior 30. Rising rewards genuine growth rather than absolute size, so a fast-doubling indie project can rank above a flat top-100 site.
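The percentage change could be computed along these lines. Because the two windows differ in length, this sketch compares daily averages. That normalization, and the choice to return `None` when there is no prior baseline, are assumptions.

```python
from typing import Optional


def rising_score(last_7d_total: float, prior_30d_total: float) -> Optional[float]:
    """Percentage change in average daily signal volume, recent vs. prior window."""
    recent_rate = last_7d_total / 7
    prior_rate = prior_30d_total / 30
    if prior_rate == 0:
        return None  # no baseline to compare against
    return (recent_rate - prior_rate) / prior_rate * 100


indie = rising_score(last_7d_total=14, prior_30d_total=30)        # daily rate doubled
top_100 = rising_score(last_7d_total=7_000, prior_30d_total=30_000)  # flat
```

Here the indie project's daily rate doubles while the large site's stays flat, so the indie project ranks higher on Rising despite its far smaller absolute volume.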
Overall
A long-term blend dominated by Tranco global rank (30-day sustained worldwide traffic) with smaller weights for the internal Hot and Rising scores so genuinely surging sites can break through. The homepage feed mixes all four; /hot, /new, and /rising surface each individual score.
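A blend of this kind might look like the sketch below. The weights (0.7, 0.15, 0.15), the log-scaled rank component, the list size, and the assumption that Hot and Rising have already been rescaled to the range 0 to 1 are all illustrative choices, not published values.

```python
import math


def rank_component(tranco_rank: int, list_size: int = 1_000_000) -> float:
    """Log-scaled rank: 1.0 for rank 1, approaching 0.0 at the end of the list."""
    return max(0.0, 1.0 - math.log10(tranco_rank) / math.log10(list_size))


def overall_score(
    tranco_rank: int,
    hot: float,
    rising: float,
    w_rank: float = 0.7,
    w_hot: float = 0.15,
    w_rising: float = 0.15,
) -> float:
    """Weighted blend dominated by rank; hot and rising are assumed to be in [0, 1]."""
    return w_rank * rank_component(tranco_rank) + w_hot * hot + w_rising * rising


surging = overall_score(tranco_rank=50_000, hot=1.0, rising=1.0)
steady = overall_score(tranco_rank=20_000, hot=0.0, rising=0.0)
```

With these example weights, a strongly surging site at rank 50,000 edges past a flat site at rank 20,000. That is the "break through" behavior described above.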
What gets blocked
The following are excluded from the public index:
- Adult / pornographic content — matched against a maintained domain blocklist plus heuristic detection.
- Spam, doorway pages, and AI-content-farm domains.
- Parked domains and registrar placeholder pages.
- Infrastructure-only hosts (CDN endpoints, telemetry domains, tracking pixel servers) — useful for the underlying internet, but not browsable destinations.
- Any domain whose verified owner has requested removal.
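The list-based exclusions above can be sketched as a single lookup. The set names, the suffix-matching rule, and the reason labels are hypothetical. Spam, parked-page, and heuristic adult-content detection need page content and are not modeled here.

```python
from typing import Iterable, Optional, Set


def exclusion_reason(
    domain: str,
    blocklist: Set[str],
    removed: Set[str],
    infra_suffixes: Iterable[str],
) -> Optional[str]:
    """Return why a domain is excluded, or None if it may be listed."""
    d = domain.lower().rstrip(".")
    if d in removed:
        return "owner_removal"
    if d in blocklist:
        return "blocklist"
    # Match the suffix itself or any subdomain of it.
    if any(d == s or d.endswith("." + s) for s in infra_suffixes):
        return "infrastructure"
    return None
```

Checking owner removals first reflects the statement elsewhere on this page that removed domains are permanently excluded from future ingestion.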
Update cadence
- Ingestion from public signal sources runs continuously, every few minutes.
- Score recompute runs every 30 minutes.
- Full re-score and metadata refresh runs nightly.
- Per-site freshness is shown on every detail page as "Last seen".
Editorial policy
No paid placement. No sponsorships. No affiliate fees influencing rank. No "featured" tier. No user-submitted listings. There is no way to pay to be listed, to rank higher, or to appear in a particular category. Listings are determined entirely by the public signal sources and formulas above.
Removal requests
Domain owners can request removal at /removal. Verified requests are processed within a few business days. Removed domains are permanently excluded from future ingestion.
Who runs it
NeighborhoodTreasure is operated by Stanley Nero through StarNest LLC. It launched in 2025 as a public, automated alternative to algorithmic discovery feeds. For general inquiries see /contact.
