How this archive is built

Methodology

A transparent, reproducible pipeline. Every step is documented so journalists, researchers and skeptics can audit our work and, if needed, replace it with their own.

01

Ingest

We poll the official manifest published by the U.S. Department of War alongside the May 8, 2026 UAP release: war.gov/Portals/1/Interactive/2026/UFO/uap-release001.csv. This CSV is the Department's own machine-readable index of the release and is the single source of truth for which records exist. We do not add, remove, or reorder records relative to the manifest.
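
As a minimal sketch of the ingest step, the manifest can be read into an ordered list of rows without adding, dropping, or reordering anything. The function name `parse_manifest` and the sample column names (`title`, `url`) are assumptions for illustration; the real manifest's header row may differ.

```python
import csv
import io


def parse_manifest(csv_text: str) -> list[dict[str, str]]:
    """Parse manifest CSV text into row dicts, keeping the manifest's row order."""
    # Strip a leading byte-order mark if the file was saved with one.
    reader = csv.DictReader(io.StringIO(csv_text.lstrip("\ufeff")))
    return [dict(row) for row in reader]


# Hypothetical sample data; the column names are assumed, not taken from the manifest.
SAMPLE = (
    "title,url\n"
    "Record A,https://www.war.gov/a.pdf\n"
    "Record B,https://www.war.gov/b.jpg\n"
)
rows = parse_manifest(SAMPLE)
```

Because the list is built straight from the reader, record order in the archive follows the manifest row for row.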
02

Resolve

For each manifest row we resolve the canonical asset URL — a PDF on war.gov, an image on war.gov/portals, or a DVIDS video. We never rehost the bytes. The clickable link in every record points at the government server.
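
One way to enforce the "links point at the government server" rule is a host allowlist check on each resolved URL. This is a sketch under assumptions: the hostnames in `ALLOWED_HOSTS` (in particular `dvidshub.net` for DVIDS) and the helper name `is_government_link` are illustrative and not confirmed by the pipeline itself.

```python
from urllib.parse import urlparse

# Assumed allowlist; adjust to the hosts that actually appear in the manifest.
ALLOWED_HOSTS = {"war.gov", "www.war.gov", "dvidshub.net", "www.dvidshub.net"}


def is_government_link(url: str) -> bool:
    """Return True if an absolute URL points at an allowlisted host."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    return parsed.scheme in {"http", "https"} and host in ALLOWED_HOSTS
```

An exact-match allowlist rejects lookalike hosts such as `war.gov.evil.example`, which a simple substring test would let through.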
03

Classify

Five derived fields are computed from the manifest text using deterministic rules:
  • kind — pdf, image, or video
  • era — pre-1970, 70s–90s, 2000s, 2010s, 2020s
  • agency — DOW, FBI, NASA, State, DVIDS
  • tags — keyword extraction over title + description
  • threat_level — low / med / high, based on presence of trained-observer corroboration and sensor type
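
To illustrate what a deterministic rule can look like, here is a sketch for the era field. The bucket boundaries follow the labels listed above; the way the year is pulled from the text (first four-digit year starting with 19 or 20) is an assumption, and `extract_year` and `era_for_year` are hypothetical names.

```python
import re
from typing import Optional

# Assumed rule: the first 19xx or 20xx year in the text dates the record.
YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b")


def extract_year(text: str) -> Optional[int]:
    """Return the first four-digit year found in the text, or None."""
    match = YEAR_PATTERN.search(text)
    return int(match.group()) if match else None


def era_for_year(year: int) -> str:
    """Map a year onto one of the five era buckets."""
    if year < 1970:
        return "pre-1970"
    if year < 2000:
        return "70s–90s"
    if year < 2010:
        return "2000s"
    if year < 2020:
        return "2010s"
    return "2020s"
```

The same input always yields the same bucket, which is what makes the derived fields auditable.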
04

Summarize

A short TL;DR (≤ 240 chars) and a longer briefing paragraph are generated by a large language model conditioned on the manifest's official descriptive blurb only. The model is explicitly instructed to:
  • Stay strictly within the source text.
  • Never speculate about extraterrestrial origin.
  • Use neutral, journalistic register.
  • Mark anything inferred as inferred.
A human reviewer spot-checks every tenth record. Detected errors are corrected and the prompt is updated.
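
Two of the constraints in this step can be checked mechanically: the 240-character TL;DR limit and the choice of records for review. The sketch below assumes the review sample is the 10th, 20th, 30th record and so on; the starting point and the helper names are assumptions.

```python
TLDR_MAX_CHARS = 240


def tldr_within_limit(tldr: str) -> bool:
    """Check the TL;DR against the 240-character ceiling."""
    return len(tldr) <= TLDR_MAX_CHARS


def spot_check_indices(total: int, step: int = 10) -> list[int]:
    """Zero-based indices of every step-th record (the 10th, 20th, ...)."""
    return list(range(step - 1, total, step))
```

For a set of 35 records, `spot_check_indices(35)` selects indices 9, 19 and 29.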
05

Publish

The full record set ships as a static JSON file embedded in the site bundle. There is no database and no server-side rendering of editorial content: what you read is what shipped. Changes to the record set are version-controlled, and a changelog is kept on the legal page.
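
A static JSON file is easiest to audit when its serialization is stable, so that a version-control diff shows only real content changes. The sketch below is one way to do that, not a description of the site's actual build; `serialize_records` and `bundle_digest` are hypothetical helpers.

```python
import hashlib
import json


def serialize_records(records: list[dict]) -> str:
    """Serialize records with sorted keys and fixed indentation for stable diffs."""
    # sort_keys orders keys inside each object; the list order of records is kept.
    return json.dumps(records, ensure_ascii=False, indent=2, sort_keys=True) + "\n"


def bundle_digest(payload: str) -> str:
    """SHA-256 hex digest of the serialized payload, usable as a changelog fingerprint."""
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Two builds from the same records then produce byte-identical files and the same digest, regardless of the key order in which fields were assembled.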

What we don't do
