For Developers

Want to build something like this? Here's how we indexed 109,000+ documents from the DOJ Epstein Files release.

The EpsteIN Tool

EpsteIN by cfinke is the extraction tool that made this possible. It handles the DOJ's paginated PDF releases and extracts their text content.

What We Did

API Access

You can query the Meilisearch index directly over HTTP. Read operations such as search require no authentication.

curl -X POST "https://epstein.dugganusa.com/indexes/epstein_files/search" \
  -H "Content-Type: application/json" \
  -d '{"q": "Prince Andrew", "limit": 20}'
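
For anyone calling the endpoint from a script rather than a shell, the same request can be sketched in Python using only the standard library. The URL and the "q" and "limit" fields come from the curl example. The function names are hypothetical, and the "offset" field is a standard Meilisearch search parameter that this page does not show, so treat paging as an assumption about how this deployment behaves.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
SEARCH_URL = "https://epstein.dugganusa.com/indexes/epstein_files/search"


def build_search_request(query, limit=20, offset=0):
    """Build (but do not send) the POST request for one page of results."""
    payload = {"q": query, "limit": limit}
    if offset:
        # "offset" is a standard Meilisearch search parameter; it is not
        # shown on this page, so its use here is an assumption.
        payload["offset"] = offset
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query, limit=20, offset=0, timeout=10):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_search_request(query, limit, offset)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling search("Prince Andrew", limit=20) should send the same body as the curl command. Building the request separately from sending it makes the payload easy to inspect or test without touching the network.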

Response Format

{
  "hits": [
    {
      "efta_id": "EFTA00022136",
      "content": "...",
      "dataset": "dataset3",
      "pages": 5,
      "people": ["prince_andrew", "virginia_giuffre"],
      "locations": ["new_york"]
    }
  ],
  "estimatedTotalHits": 228,
  "processingTimeMs": 12
}
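
A response in this shape can be turned into typed records before further processing. The sketch below is one way to do it, not the site's own code. The field names are taken from the sample response above. The Hit class and parse_hits function are hypothetical names, and treating every field except efta_id as optional is an assumption, since the page does not say which fields are always present.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class Hit:
    """One search hit, mirroring the fields in the sample response."""
    efta_id: str
    content: str
    dataset: str
    pages: int
    people: List[str] = field(default_factory=list)
    locations: List[str] = field(default_factory=list)


def parse_hits(response):
    """Convert the "hits" array of a decoded response into Hit objects."""
    hits = []
    for raw in response.get("hits", []):
        hits.append(
            Hit(
                efta_id=raw["efta_id"],
                # Assumed optional: fall back to empty values when missing.
                content=raw.get("content", ""),
                dataset=raw.get("dataset", ""),
                pages=raw.get("pages", 0),
                people=list(raw.get("people") or []),
                locations=list(raw.get("locations") or []),
            )
        )
    return hits


# The sample response from this page, used as offline test data.
SAMPLE = json.loads("""
{
  "hits": [
    {
      "efta_id": "EFTA00022136",
      "content": "...",
      "dataset": "dataset3",
      "pages": 5,
      "people": ["prince_andrew", "virginia_giuffre"],
      "locations": ["new_york"]
    }
  ],
  "estimatedTotalHits": 228,
  "processingTimeMs": 12
}
""")

hits = parse_hits(SAMPLE)
print(hits[0].efta_id, hits[0].people)
# prints: EFTA00022136 ['prince_andrew', 'virginia_giuffre']
```

Note that estimatedTotalHits is an estimate, as the name says, so it is safer to use it for display than as an exact loop bound.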

Breaking eggs: This index exists because someone decided completeness mattered more than perfect formatting. If you're building something similar, don't let perfect be the enemy of done. Index what you have, improve later.

Data Completeness

We indexed:

Want the Raw Data?

The source documents are public record from justice.gov. Our index adds searchability and entity extraction on top.

Questions? Contact us.