Build an Image Gallery with Lightbox

An image gallery with lightbox (thumbnails grid + tap-to-zoom modal) appears in many products. The interview tests virtualization, accessibility, gesture handling, and the polish that separates serviceable from delightful.

Functional requirements

  • Grid of thumbnails
  • Tap a thumbnail to open lightbox
  • Lightbox shows full image
  • Navigate next/previous via buttons or swipe
  • Pinch-to-zoom on full image
  • Close via Escape, tap outside, or swipe down
  • Smooth animation on open/close

Architecture

Two views: grid (thumbnails) and lightbox (single image, navigable).

Thumbnail grid

For a few hundred images: CSS Grid with srcset for responsive sizing.

For thousands: virtualize. Render only visible thumbnails.

Use grid-template-columns: repeat(auto-fill, minmax(150px, 1fr)) for fluid columns.
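
A minimal sketch of the virtualization math, assuming square cells and a fixed gap: it derives the column count that repeat(auto-fill, minmax(150px, 1fr)) would resolve to, then maps the scroll position to the slice of thumbnails worth rendering. The names visibleRange, GridMetrics, and overscanRows are illustrative, not taken from any library.

```typescript
// Hypothetical helper: which thumbnails should be mounted for the current scroll position.
interface GridMetrics {
  containerWidth: number; // px, inner width of the grid
  viewportHeight: number; // px, height of the scroll viewport
  scrollTop: number;      // px scrolled from the top of the grid
  minCellWidth: number;   // the 150 in minmax(150px, 1fr)
  gap: number;            // px gap between cells
  itemCount: number;
  overscanRows: number;   // extra rows rendered above and below the viewport
}

interface VisibleRange {
  columns: number;
  start: number; // first index to render (inclusive)
  end: number;   // last index to render (exclusive)
}

function visibleRange(m: GridMetrics): VisibleRange {
  // Largest n such that n * min + (n - 1) * gap fits in the container.
  const columns = Math.max(1, Math.floor((m.containerWidth + m.gap) / (m.minCellWidth + m.gap)));
  // Assumption: cells are square, so row height equals the stretched cell width.
  const cellSize = (m.containerWidth - m.gap * (columns - 1)) / columns;
  const rowStride = cellSize + m.gap;
  const totalRows = Math.ceil(m.itemCount / columns);

  const firstRow = Math.max(0, Math.floor(m.scrollTop / rowStride) - m.overscanRows);
  const lastRow = Math.min(
    totalRows,
    Math.ceil((m.scrollTop + m.viewportHeight) / rowStride) + m.overscanRows,
  );

  return {
    columns,
    start: Math.min(m.itemCount, firstRow * columns),
    end: Math.min(m.itemCount, lastRow * columns),
  };
}
```

Recomputing this on scroll and resize, and padding the container to the full grid height, keeps the scrollbar honest while only a few dozen thumbnails are mounted at once.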

Image preloading

Thumbnails: loading="lazy" for off-screen.

Full images: preload on thumbnail hover or as the user opens the lightbox. Prefetch the next and previous images while the lightbox is open so a swipe feels instant.
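
One way to sketch the neighbor prefetch, assuming the gallery holds an ordered list of full-size URLs. The loader is injected so the logic stays testable; createPrefetcher and prefetchAround are hypothetical names.

```typescript
// Hypothetical helper: prefetch the images adjacent to the one currently shown.
type Loader = (url: string) => void;

function createPrefetcher(urls: readonly string[], load: Loader) {
  const requested = new Set<string>(); // avoid issuing the same request twice

  // Returns the URLs for which a load was actually started.
  return function prefetchAround(index: number, radius = 1): string[] {
    const started: string[] = [];
    for (let offset = 1; offset <= radius; offset++) {
      // Next image first, since forward navigation is the more likely move.
      for (const i of [index + offset, index - offset]) {
        if (i < 0 || i >= urls.length) continue; // no wrap-around at the ends
        const url = urls[i];
        if (requested.has(url)) continue;
        requested.add(url);
        load(url);
        started.push(url);
      }
    }
    return started;
  };
}
```

In a browser the loader could be as small as (url) => { new Image().src = url; }, which warms the HTTP cache without attaching anything to the DOM.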

Open and close animation

The “shared element transition” pattern:

  1. User taps thumbnail
  2. Lightbox opens; the thumbnail visually expands to fill the screen
  3. Image loads at full resolution underneath

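Implementation: use the View Transitions API (same-document transitions shipped in Chromium 111 in 2023; check current support in other browsers) or a library such as Framer Motion, which animates between elements that share a layoutId.

For browsers without View Transitions, a simpler fade-in is fine.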
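
A hedged sketch of the feature-detect-and-fall-back step. The document is typed structurally so the example does not depend on a particular DOM typings version; openWithTransition and both callbacks are illustrative names.

```typescript
// Structural type: only the one method this sketch relies on.
interface TransitionCapableDocument {
  startViewTransition?: (update: () => void) => unknown;
}

function openWithTransition(
  doc: TransitionCapableDocument,
  update: () => void,   // mutates the DOM so the lightbox is shown
  fallback: () => void, // e.g. adds a CSS class that fades the lightbox in
): "view-transition" | "fallback" {
  if (typeof doc.startViewTransition === "function") {
    // The browser snapshots the old state, runs update(), then animates to the new state.
    // Called as a method so `this` stays bound to the document.
    doc.startViewTransition(update);
    return "view-transition";
  }
  update();
  fallback();
  return "fallback";
}
```

In a browser, doc would be the global document. For the thumbnail to morph into the full image rather than cross-fade, the clicked thumbnail before the update and the lightbox image after it would need the same view-transition-name in CSS.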

Navigation

  • Buttons (left/right) on desktop
  • Swipe gesture on mobile
  • Keyboard arrows on desktop
  • Indicator showing current position (3 of 47)
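
The navigation rules above reduce to a small amount of state logic. A minimal sketch, assuming the lightbox stops at the first and last image instead of wrapping; all function names are illustrative.

```typescript
type NavAction = "next" | "previous" | "close" | null;

// Map a KeyboardEvent.key value to a lightbox action.
function actionForKey(key: string): NavAction {
  switch (key) {
    case "ArrowRight": return "next";
    case "ArrowLeft": return "previous";
    case "Escape": return "close";
    default: return null;
  }
}

// Step the index and clamp at both ends (assumption: no wrap-around).
function step(index: number, total: number, action: NavAction): number {
  if (total <= 0) return 0;
  if (action === "next") return Math.min(total - 1, index + 1);
  if (action === "previous") return Math.max(0, index - 1);
  return index;
}

// 1-based position for the indicator: index 2 of 47 images reads "3 of 47".
function positionLabel(index: number, total: number): string {
  return `${index + 1} of ${total}`;
}
```

Buttons and swipe handlers can call the same step function, so every input method shares one source of truth for the current index.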

Pinch-to-zoom

Mobile: track the two active pointers with Pointer Events to detect a pinch, and apply transform: scale() to the image.

Constraints:

  • Min zoom: fit-to-screen
  • Max zoom: 4x or 8x
  • Pan when zoomed
  • Tap to reset zoom
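
The zoom constraints can be sketched as pure math, independent of how the pointers are tracked. This assumes the image exactly fits the viewport at scale 1 and is scaled from its center; pinchScale and clampPan are hypothetical helpers.

```typescript
interface Point { x: number; y: number; }

function distance(a: Point, b: Point): number {
  return Math.hypot(a.x - b.x, a.y - b.y);
}

// New scale = scale at gesture start * (current finger spread / starting spread),
// clamped between fit-to-screen (1) and the maximum zoom.
function pinchScale(
  startScale: number,
  startDistance: number,
  currentDistance: number,
  minScale = 1,
  maxScale = 4,
): number {
  if (startDistance <= 0) return startScale; // degenerate gesture, ignore
  const raw = startScale * (currentDistance / startDistance);
  return Math.min(maxScale, Math.max(minScale, raw));
}

// Limit panning along one axis so the image edge never moves inside the viewport.
// At scale s the image is s * viewportSize wide, leaving (s - 1) * viewportSize / 2
// of overflow on each side.
function clampPan(offset: number, scale: number, viewportSize: number): number {
  const max = ((scale - 1) * viewportSize) / 2;
  return Math.min(max, Math.max(-max, offset));
}
```

Tap-to-reset then amounts to setting the scale back to 1 and both pan offsets to 0.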

Use a library: react-zoom-pan-pinch in React projects, or PhotoSwipe if you want a small, framework-independent JavaScript library.

Closing the lightbox

  • Escape key
  • Tap outside the image
  • Tap close button
  • Swipe down (mobile)

The swipe-down close should follow the finger. On release, close the lightbox if the drag distance exceeds a threshold; otherwise spring back to the resting position.
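
A small sketch of the release decision. The 25% distance threshold and the flick-velocity shortcut are assumptions chosen for illustration, not values from the text; resolveRelease and backdropOpacity are hypothetical names.

```typescript
interface DragState {
  offsetY: number;   // px the image has been dragged down from rest
  velocityY: number; // px per ms at release, positive is downward
}

// Decide what happens when the finger lifts.
function resolveRelease(
  drag: DragState,
  viewportHeight: number,
  distanceRatio = 0.25, // assumption: a quarter of the screen counts as intent to close
  flickVelocity = 0.5,  // assumption: a fast downward flick also closes
): "close" | "spring-back" {
  const farEnough = drag.offsetY > viewportHeight * distanceRatio;
  const fastEnough = drag.offsetY > 0 && drag.velocityY > flickVelocity;
  return farEnough || fastEnough ? "close" : "spring-back";
}

// Fade the backdrop as the image is dragged so the gesture feels connected.
function backdropOpacity(offsetY: number, viewportHeight: number): number {
  const progress = Math.min(1, Math.max(0, offsetY / viewportHeight));
  return 1 - progress;
}
```

While the finger is down, the image would get transform: translateY(offsetY) and the backdrop the computed opacity; on release the result picks between the close animation and the spring.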

Accessibility

  • Lightbox is a modal: trap focus inside it while it is open and return focus to the originating thumbnail on close
  • Each image has alt text from metadata
  • Navigation buttons have explicit labels
  • Keyboard navigation works (arrow keys, Escape)
  • Screen reader announces “image 3 of 47” on navigation
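
The focus trap and the announcement can both be reduced to small pure functions. A sketch, assuming the caller already has the ordered list of focusable elements inside the lightbox; nextFocusIndex and announcement are illustrative names.

```typescript
// Given the index of the focused element among the lightbox's focusable elements,
// return the index that Tab (or Shift+Tab) should move to, wrapping at both ends.
function nextFocusIndex(current: number, count: number, shiftKey: boolean): number {
  if (count === 0) return -1;                       // nothing focusable
  if (current < 0) return shiftKey ? count - 1 : 0; // focus was outside the lightbox
  return (current + (shiftKey ? -1 : 1) + count) % count;
}

// Text for a live region, e.g. "Image 3 of 47: Sunset over the bay".
function announcement(index: number, total: number, alt: string): string {
  const position = `Image ${index + 1} of ${total}`;
  return alt.trim() ? `${position}: ${alt.trim()}` : position;
}
```

In the browser, a keydown handler for Tab would call preventDefault() and focus the element at the returned index. The container would typically carry role="dialog" and aria-modal="true", and the announcement text would go into an element with aria-live="polite".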

Performance

  • Use modern formats (AVIF, WebP) with JPEG fallback
  • Serve responsive sizes via srcset
  • Decode images off-thread (decoding="async")
  • Set width and height attributes on each img element so the browser reserves the correct aspect ratio before the image loads
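
A sketch of generating the srcset string and the reserved dimensions. The URL scheme (?w=...&fm=...) is a made-up example of an image CDN query format, and both function names are hypothetical.

```typescript
type ImageFormat = "avif" | "webp" | "jpg";

// Build a width-descriptor srcset, e.g. "/img/1?w=400&fm=webp 400w, /img/1?w=800&fm=webp 800w".
// Assumption: the server resizes and converts based on the w and fm query parameters.
function buildSrcset(base: string, widths: readonly number[], format: ImageFormat): string {
  return [...widths]
    .sort((a, b) => a - b)
    .map((w) => `${base}?w=${w}&fm=${format} ${w}w`)
    .join(", ");
}

// Width and height attributes that preserve the original aspect ratio,
// so layout space is reserved before any bytes arrive.
function scaledDimensions(
  naturalWidth: number,
  naturalHeight: number,
  displayWidth: number,
): { width: number; height: number } {
  return {
    width: displayWidth,
    height: Math.round((displayWidth * naturalHeight) / naturalWidth),
  };
}
```

The format fallback itself is usually expressed in markup: a picture element with one source per modern format (type="image/avif", then type="image/webp") and an img with the JPEG srcset as the last resort.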

EXIF metadata

For photo galleries, show metadata: camera, lens, aperture, shutter, ISO. Parse client-side from JPEG headers using a library (exif-js, ExifReader).
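
Once a parsing library has produced the raw values, turning them into a caption is straightforward. A minimal sketch, assuming the tags have already been converted to plain numbers and strings; the ExposureInfo field names are illustrative and are not the tag names used by any particular EXIF library.

```typescript
// Hypothetical shape for already-parsed EXIF values.
interface ExposureInfo {
  camera?: string;
  lens?: string;
  fNumber?: number;      // aperture, e.g. 2.8
  exposureTime?: number; // shutter time in seconds, e.g. 0.004
  iso?: number;
}

// Sub-second exposures read better as fractions: 0.004 s becomes "1/250 s".
function formatShutter(seconds: number): string {
  return seconds >= 1 ? `${seconds} s` : `1/${Math.round(1 / seconds)} s`;
}

// Join whichever fields are present; missing fields are skipped.
function formatExposure(info: ExposureInfo): string {
  const parts: string[] = [];
  if (info.camera) parts.push(info.camera);
  if (info.lens) parts.push(info.lens);
  if (info.fNumber !== undefined) parts.push(`f/${info.fNumber}`);
  if (info.exposureTime !== undefined && info.exposureTime > 0) {
    parts.push(formatShutter(info.exposureTime));
  }
  if (info.iso !== undefined) parts.push(`ISO ${info.iso}`);
  return parts.join(", ");
}
```

Photos that have been re-encoded or stripped for privacy often carry no EXIF at all, so the caption should simply be omitted when the result is an empty string.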

Common mistakes

  • No virtualization with thousands of thumbnails
  • Lightbox covers the whole page but does not trap focus
  • Swipe gestures conflict with browser scroll
  • Pinch-to-zoom that does not work reliably on real devices
  • Image decode blocks scrolling

Frequently Asked Questions

Should I use a library or build from scratch?

For interview practice, build it. For production, use a library such as PhotoSwipe, lightGallery, react-photo-gallery, or Swiper. They handle gesture and accessibility edge cases.

How do I handle 4K images on mobile?

Serve a resolution that matches the layout width multiplied by the device pixel ratio. An iPhone Pro has a 3x display, so it needs roughly three times the CSS width in image pixels. Don’t serve 4K to a 1x display.
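
When srcset cannot be used (for example, when the URL is built in script), the same selection can be sketched by hand. The cap at 3x and the function name pickImageWidth are assumptions for illustration.

```typescript
// Pick the smallest available rendition that covers cssWidth * devicePixelRatio.
// Falls back to the largest rendition when none is big enough.
function pickImageWidth(
  cssWidth: number,
  dpr: number,
  available: readonly number[],
  maxDpr = 3, // assumption: detail beyond 3x is rarely visible
): number {
  if (available.length === 0) throw new Error("no renditions available");
  const target = Math.ceil(cssWidth * Math.min(dpr, maxDpr));
  const sorted = [...available].sort((a, b) => a - b);
  const match = sorted.find((w) => w >= target);
  return match !== undefined ? match : sorted[sorted.length - 1];
}
```

For a 390px-wide phone layout with renditions at 640, 1280, 1920, and 3840 pixels, a 3x device lands on the 1280 file and a 1x device on the 640 file, so neither downloads the 4K original.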

What about video in the gallery?

Same architecture; video element instead of img. Same lazy-load and lightbox patterns. Add play/pause UI.
