Colophon · 日本

Captioned by AI

Two couples spent two weeks in Japan in October 2024 and came home with nearly 2,000 photos. The pictures were vivid, but they were missing the context — where each was taken, what was happening, and the small cultural details that are easy to miss. Writing all of that by hand would have taken forever, so I had AI do it.

Every caption and every point of interest on this site is AI-generated. What makes captioning vacation photos hard isn't the captioning — it's the context: where and when a photo was taken, who took it, and who's actually in the frame. So the pipeline feeds the model all of that.

The pipeline

  1. A Python script pulls EXIF data (GPS coordinates, timestamps, the photographer) from each photo.
  2. Coordinates are reverse-geocoded into place names, and a map is generated from the GPS for context.
  3. Each photo is sent to Claude, along with reference photos so it can identify who is actually in the shot.
  4. Claude writes a caption in the site's voice and extracts notable points of interest, each linked to a real source.
  5. A second pass reads the photos in trip order and harmonizes the captions so sequences read coherently.
  6. Everything is stored in a SQLite database that this site reads from.
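
To make the data side of steps 1, 5, and 6 concrete, here is a minimal sketch in Python using only the standard library. It is not the site's actual code: the `PhotoRecord` fields, the `photos` table layout, and every function name are hypothetical, chosen for illustration. It assumes the EXIF values have already been read from the file, with GPS as degrees/minutes/seconds plus a hemisphere letter and the timestamp in the usual EXIF `YYYY:MM:DD HH:MM:SS` form.

```python
import sqlite3
from dataclasses import dataclass
from datetime import datetime


@dataclass
class PhotoRecord:
    """Hypothetical per-photo record; the real schema may differ."""
    filename: str
    taken_at: datetime
    latitude: float
    longitude: float
    photographer: str
    caption: str = ""


def dms_to_decimal(dms, ref):
    """Convert EXIF-style (degrees, minutes, seconds) to decimal degrees.

    `ref` is the hemisphere letter; south and west become negative.
    """
    degrees, minutes, seconds = (float(part) for part in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    return -decimal if ref in ("S", "W") else decimal


def parse_exif_timestamp(value):
    """Parse an EXIF DateTimeOriginal string such as '2024:10:05 18:00:00'."""
    return datetime.strptime(value, "%Y:%m:%d %H:%M:%S")


def save_photos(conn, photos):
    """Create the (assumed) photos table if needed and upsert the records."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS photos (
               filename TEXT PRIMARY KEY,
               taken_at TEXT NOT NULL,
               latitude REAL,
               longitude REAL,
               photographer TEXT,
               caption TEXT
           )"""
    )
    rows = [
        (p.filename, p.taken_at.isoformat(), p.latitude, p.longitude,
         p.photographer, p.caption)
        for p in photos
    ]
    conn.executemany(
        "INSERT OR REPLACE INTO photos VALUES (?, ?, ?, ?, ?, ?)", rows
    )
    conn.commit()


def photos_in_trip_order(conn):
    """Return filenames sorted by capture time, as a second pass would read them."""
    cursor = conn.execute("SELECT filename FROM photos ORDER BY taken_at")
    return [row[0] for row in cursor]
```

Storing `taken_at` as an ISO 8601 string keeps the table readable and lets a plain `ORDER BY` return photos in trip order, as long as every timestamp is in the same time zone. Calling `save_photos(sqlite3.connect("trip.db"), records)` would write to a file (the filename here is made up), and passing `":memory:"` instead is handy for trying it out.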

The voice

The captions are written from my perspective, in a deliberately casual, all-lowercase voice — like texting a friend. The whole collection was most recently regenerated with Claude Opus, which is a good deal sharper at reading a scene and identifying people than the model that first captioned the trip.
