🎯 600K Net-New Scannable Products

Net-new = UPC + nutrition/ingredients NOT in Edgar's usable set (218,995).
Updated 2026-06-26 08:22:08 ET · auto-refresh 90s
Structured (solid): 224,683 (37.4% of 600K goal)
Incl. ingredient-text: 456,203 (76.0% of 600K goal)
Added today (structured): 62,168 (20.7% of 300K today-goal)
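
The percentages on the cards above are plain ratios of each count against its goal. A minimal Python sketch of that arithmetic, using the figures shown; the helper name `pct_of_goal` is hypothetical and not taken from the dashboard's own code:

```python
def pct_of_goal(count: int, goal: int) -> float:
    """Return count as a percentage of goal, rounded to one decimal place."""
    return round(100 * count / goal, 1)


GOAL = 600_000        # overall net-new goal
TODAY_GOAL = 300_000  # goal for structured rows added today

structured = pct_of_goal(224_683, GOAL)
with_ingredient_text = pct_of_goal(456_203, GOAL)
added_today = pct_of_goal(62_168, TODAY_GOAL)

print(structured, with_ingredient_text, added_today)  # 37.4 76.0 20.7
```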

Net-new by source (structured)

Source | Net-new | Share
freshop | 140,058 |
hy_vee | 39,959 |
wynshop | 30,330 |
wegmans | 29,696 |
kroger_api | 28,068 |
instacart_foodtown | 26,462 |
instacart_woodmans | 18,464 |
instacart_ssr | 17,381 |
instacart_pathmark | 13,852 |
schnucks | 9,034 |
target_ssr_pdp | 1,201 |
le_frazierfarmsmarket | 207 |
instacart | 30 |

Instacart-SSR retailers (the 600K engine)

Retailer | Rows
sprouts | 25,817

Staging — ext_staging_products (212,412 rows, 211,025 w/ label · awaiting owner QC + promote)

Retailer (staging) | Rows | With label
hy_vee | 61,178 | 61,178
wegmans | 42,547 | 42,547
instacart_foodtown | 35,771 | 35,771
instacart_woodmans | 29,443 | 29,443
instacart_pathmark | 19,862 | 19,862
schnucks | 19,736 | 19,736
le_frazierfarmsmarket | 2,428 | 1,041
target_ssr_pdp | 1,447 | 1,447
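
The per-retailer staging rows can be cross-checked against the totals in the section header (212,412 rows, 211,025 with label). A small Python sketch of that check, with the figures copied from the table above; the dict layout and variable names are assumptions made for illustration, not the schema of ext_staging_products:

```python
# (rows, with_label) per retailer, copied from the staging table
staging = {
    "hy_vee": (61_178, 61_178),
    "wegmans": (42_547, 42_547),
    "instacart_foodtown": (35_771, 35_771),
    "instacart_woodmans": (29_443, 29_443),
    "instacart_pathmark": (19_862, 19_862),
    "schnucks": (19_736, 19_736),
    "le_frazierfarmsmarket": (2_428, 1_041),
    "target_ssr_pdp": (1_447, 1_447),
}

total_rows = sum(rows for rows, _ in staging.values())
total_labeled = sum(labeled for _, labeled in staging.values())

# Retailers where some staged rows still lack a label
unlabeled = {
    name: rows - labeled
    for name, (rows, labeled) in staging.items()
    if rows != labeled
}

print(total_rows, total_labeled)  # 212412 211025
print(unlabeled)                  # {'le_frazierfarmsmarket': 1387}
```

Both sums agree with the header, and the only label gap is in le_frazierfarmsmarket.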

Active pipelines / agents

Agent / pipeline | What | State
AI Scraper 2 · Schnucks | Next.js _next/data JSON → fullUpc + nutrition, $0 direct; 19,736 staged | done
AI Scraper 2 · Foodtown | Instacart SSR DOM-UPC, multizone (8 sessions+Decodo); now running 24/7 on DO droplet 159.223.164.186 (systemd), ~112K to go | running
AI Scraper 2 · Pathmark | Instacart SSR DOM-UPC, multizone (4 sessions+Decodo); now running 24/7 on DO droplet (systemd), ~62K to go | running
AI Scraper 2 · multizone fix | built multizone_harvest.py; beats Instacart per-ZONE-SESSION throttle via N independent store cookies, resume-aware | done
AI Scraper 2 · Edgar-reject recovery | FINDING: reject pool = OCR gap, not scrape gap. ~200K HEB/Walmart/CVS have images, just need Edgar re-OCR (owner). Source re-scrape = Incapsula-locked, not worth cost. → EDGAR_REJECT_RECOVERY.md | done
AI Scraper 3 · Wegmans | open Algolia (hardcoded keys) → real UPC + STRUCTURED nutrition + ingredients, $0; brand-facet sharding | running
AI Scraper 3 · Hy-Vee | self-hosted Next.js __NEXT_DATA__ → item.ean13 + structured ingredients, ~102K, $0 | running
AI Scraper 3 · Woodman's | Instacart zone-gated UPC + nutrition/ingredients via Decodo rotating proxy; autorun | running
AI Scraper 3 · QC + playbook | audit_staging.py (100% scannable GTIN) + verify_correctness.py (label vs Open Food Facts) + AI_SCRAPING_PLAYBOOK.md | done
Sprouts SSR harvest | shop.sprouts.com ~94K PDPs, UPC+nutrition | running
Freshop new keys | 21 AWG/regional app_keys, private labels | running
Instacart-SSR discovery | validating ~50 shop.<retailer>.com storefronts | running
Wynshop chains discovery | finding new Mi9/storefrontgateway chains | running
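
The QC row above mentions audit_staging.py and a "100% scannable GTIN" check; the script's actual logic is not shown here. One test such an audit might plausibly include is the standard GS1 check-digit validation, sketched below in Python. The function name `is_valid_gtin` is hypothetical and this is not the script's real implementation:

```python
def is_valid_gtin(code: str) -> bool:
    """Check length and GS1 check digit for GTIN-8/12/13/14 strings."""
    if not (code.isascii() and code.isdigit()):
        return False
    if len(code) not in (8, 12, 13, 14):
        return False

    digits = [int(c) for c in code]
    body, check = digits[:-1], digits[-1]

    # Walking right to left over the body, weights alternate 3, 1, 3, 1, ...
    total = sum(
        d * (3 if i % 2 == 0 else 1)
        for i, d in enumerate(reversed(body))
    )
    return (10 - total % 10) % 10 == check


print(is_valid_gtin("036000291452"))   # True  (valid UPC-A)
print(is_valid_gtin("4006381333931"))  # True  (valid EAN-13)
print(is_valid_gtin("036000291453"))   # False (wrong check digit)
```

A check like this only confirms that a code is well-formed; it does not confirm that the code belongs to the product it was scraped with.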