An image gallery with lightbox (thumbnails grid + tap-to-zoom modal) appears in many products. The interview tests virtualization, accessibility, gesture handling, and the polish that separates serviceable from delightful.
Functional requirements
- Grid of thumbnails
- Tap a thumbnail to open lightbox
- Lightbox shows full image
- Navigate next/previous via buttons or swipe
- Pinch-to-zoom on full image
- Close via Escape, tap outside, or swipe down
- Smooth animation on open/close
Architecture
Two views: grid (thumbnails) and lightbox (single image, navigable).
Thumbnail grid
For a few hundred images: CSS Grid with srcset for responsive sizing.
For thousands: virtualize. Render only visible thumbnails.
Use grid-template-columns: repeat(auto-fill, minmax(150px, 1fr)) for fluid columns.
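The virtualization step can be sketched as a pure function that works out which thumbnails to mount for a given scroll position. This is a minimal sketch, not a full virtualizer: the function name, the square-cell assumption, the zero-gap layout, and the overscan default are all illustrative assumptions.

```typescript
// Hypothetical helper: which thumbnail indices to mount in a virtualized grid.
// Assumes square cells, no gaps, and columns derived the same way as
// repeat(auto-fill, minmax(150px, 1fr)).
interface GridWindow {
  first: number;   // first index to render (inclusive)
  last: number;    // last index to render (inclusive); -1 when the grid is empty
  columns: number; // column count at this container width
}

function visibleThumbnailRange(
  itemCount: number,
  containerWidth: number,
  viewportHeight: number,
  scrollTop: number,
  minCellWidth = 150, // matches the minmax() minimum
  overscanRows = 2,   // assumed: extra rows above and below to hide pop-in
): GridWindow {
  const columns = Math.max(1, Math.floor(containerWidth / minCellWidth));
  const rowHeight = containerWidth / columns; // square cells stretched by 1fr
  const totalRows = Math.ceil(itemCount / columns);

  const firstRow = Math.max(0, Math.floor(scrollTop / rowHeight) - overscanRows);
  const lastRow = Math.min(
    totalRows - 1,
    Math.floor((scrollTop + viewportHeight) / rowHeight) + overscanRows,
  );

  return {
    first: firstRow * columns,
    last: Math.min(itemCount - 1, (lastRow + 1) * columns - 1),
    columns,
  };
}
```

Calling it on every scroll event (or from an IntersectionObserver callback) and rendering only `first` through `last` keeps the DOM small; a spacer element sized to the full row count preserves the scrollbar.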
Image preloading
Thumbnails: loading="lazy" for off-screen.
Full images: preload on thumbnail hover or as user opens lightbox. Pre-fetch the next/previous in lightbox so swipe is instant.
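One way to sketch the next/previous prefetch is to separate the decision (which URLs to fetch) from the loading mechanism. The names `neighborsToPrefetch` and `createPrefetcher` are hypothetical, and the `load` callback is injected so the logic stays testable outside a browser.

```typescript
// Hypothetical helper: URLs worth prefetching around the current lightbox image.
function neighborsToPrefetch(
  urls: readonly string[],
  current: number,
  radius = 1, // assumed: one image on each side
): string[] {
  const result: string[] = [];
  for (let offset = 1; offset <= radius; offset++) {
    for (const i of [current + offset, current - offset]) {
      const url = urls[i];
      if (i >= 0 && i < urls.length && url !== undefined) result.push(url);
    }
  }
  return result;
}

// Remembers what was already requested so each URL is loaded at most once.
function createPrefetcher(load: (url: string) => void) {
  const requested = new Set<string>();
  return (urls: readonly string[], current: number): void => {
    for (const url of neighborsToPrefetch(urls, current)) {
      if (!requested.has(url)) {
        requested.add(url);
        load(url);
      }
    }
  };
}
```

In a browser, `load` could be as small as `(url) => { new Image().src = url; }`, which warms the HTTP cache; the same prefetcher can be called from a thumbnail hover handler and from the lightbox navigation handler.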
Lightbox open animation
The “shared element transition” pattern:
- User taps thumbnail
- Lightbox opens; the thumbnail visually expands to fill the screen
- Image loads at full resolution underneath
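Implementation: use the View Transitions API (available in Chromium-based browsers since Chrome 111 and in Safari since version 18) or an animation library with shared-layout support, such as Framer Motion and its layoutId prop.
For browsers without View Transitions, a simple fade-in is fine.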
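The feature-detect-and-fall-back logic might look like the sketch below. `openWithTransition` and the `TransitionCapable` type are hypothetical; the only real API used is `startViewTransition(callback)`, which is typed structurally here so the example does not depend on up-to-date DOM typings.

```typescript
// Minimal structural type for the one method this sketch needs.
interface TransitionCapable {
  startViewTransition?: (update: () => void) => unknown;
}

// Runs `update` (the DOM change that shows the lightbox) inside a view
// transition when the API exists; otherwise applies it immediately so a
// plain CSS fade can take over.
function openWithTransition(
  doc: TransitionCapable,
  update: () => void,
): "view-transition" | "fallback" {
  if (typeof doc.startViewTransition === "function") {
    doc.startViewTransition(update);
    return "view-transition";
  }
  update();
  return "fallback";
}
```

In a page this would be called as `openWithTransition(document, showLightbox)`, where `showLightbox` is whatever function mounts the modal. For the thumbnail to appear to expand into the full image, both elements would need the same `view-transition-name` in CSS at the moment of the transition.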
Navigation in lightbox
- Buttons (left/right) on desktop
- Swipe gesture on mobile
- Keyboard arrows on desktop
- Indicator showing current position (3 of 47)
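The navigation rules above reduce to a small index calculation plus a label. A minimal sketch, with hypothetical function names and the assumption that the gallery clamps at its ends rather than wrapping:

```typescript
// Maps a KeyboardEvent.key value to the next lightbox index.
// Clamps at both ends; use modulo arithmetic instead if the gallery should wrap.
function nextLightboxIndex(key: string, index: number, total: number): number {
  switch (key) {
    case "ArrowRight":
      return Math.min(index + 1, total - 1);
    case "ArrowLeft":
      return Math.max(index - 1, 0);
    default:
      return index; // any other key leaves the position unchanged
  }
}

// Position indicator text; indices are 0-based internally, 1-based on screen.
function positionLabel(index: number, total: number): string {
  return `${index + 1} of ${total}`;
}
```

The on-screen buttons and a completed swipe can call the same function with "ArrowLeft" or "ArrowRight", so all three input methods share one code path.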
Pinch-to-zoom
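Mobile: use Pointer Events to track two active pointers and derive the pinch from the change in distance between them. Apply the result with transform: scale(), and set touch-action: none on the image so the browser's own pan and zoom gestures do not compete.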
Constraints:
- Min zoom: fit-to-screen
- Max zoom: 4x or 8x
- Pan when zoomed
- Tap to reset zoom
Use a library: react-zoom-pan-pinch in React, or PhotoSwipe (a dependency-free JavaScript lightbox) if you want to stay framework-independent.
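If you do build the gesture by hand, the core of pinch-to-zoom is a ratio of finger distances, clamped to the allowed range. The sketch below is illustrative only: the names are hypothetical, and it covers the scale calculation but not panning or pointer bookkeeping.

```typescript
interface Point {
  x: number;
  y: number;
}

function pointerDistance(a: Point, b: Point): number {
  return Math.hypot(a.x - b.x, a.y - b.y);
}

// New scale = scale when the pinch began * (current distance / starting distance),
// clamped between fit-to-screen and the maximum zoom.
function pinchScale(
  startScale: number,
  startPointers: [Point, Point],
  currentPointers: [Point, Point],
  minScale = 1, // 1 means fit-to-screen
  maxScale = 4, // assumed maximum, per the 4x constraint
): number {
  const startDist = pointerDistance(startPointers[0], startPointers[1]);
  if (startDist === 0) return startScale; // degenerate gesture, keep current zoom
  const currentDist = pointerDistance(currentPointers[0], currentPointers[1]);
  const raw = (startScale * currentDist) / startDist;
  return Math.min(maxScale, Math.max(minScale, raw));
}
```

A pointermove handler would store each active pointer by its `pointerId`, call `pinchScale` once two pointers are down, and write the result into the image's `transform`.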
Closing the lightbox
- Escape key
- Tap outside the image
- Tap close button
- Swipe down (mobile)
The swipe-down close should follow the finger: if the image is released past a distance threshold, close the lightbox; otherwise spring it back into place.
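The release decision can be sketched as follows. The thresholds are assumptions chosen for illustration (a quarter of the viewport, or a fast downward flick), and the velocity check goes slightly beyond the distance-only rule described above.

```typescript
type ReleaseAction = "close" | "spring-back";

// Decides what happens when the finger lifts.
// dragY: downward drag distance in CSS pixels.
// velocityY: downward speed at release, in pixels per millisecond.
function onSwipeRelease(
  dragY: number,
  velocityY: number,
  viewportHeight: number,
  distanceRatio = 0.25, // assumed: close after dragging a quarter of the viewport
  flickVelocity = 0.5,  // assumed: or after a fast downward flick
): ReleaseAction {
  const farEnough = dragY > viewportHeight * distanceRatio;
  const fastEnough = dragY > 0 && velocityY > flickVelocity;
  return farEnough || fastEnough ? "close" : "spring-back";
}

// While dragging, fade the backdrop in proportion to the drag distance
// so the image visibly follows the finger away from the page.
function backdropOpacity(dragY: number, viewportHeight: number): number {
  return Math.max(0, 1 - Math.max(0, dragY) / viewportHeight);
}
```

During the drag the image would be moved with `transform: translateY(...)`; on "spring-back" the transform animates back to zero.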
Accessibility
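- The lightbox is a modal dialog (role="dialog", aria-modal="true"): trap focus inside it while open and return focus to the originating thumbnail on close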
- Each image has alt text from metadata
- Navigation buttons have explicit labels
- Keyboard navigation works (arrow keys, Escape)
- Screen reader announces “image 3 of 47” on navigation
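Two of these requirements, the focus trap and the position announcement, come down to small pure functions. This is a sketch under assumptions: the function names are hypothetical, and the caller is expected to supply the list of focusable elements inside the lightbox.

```typescript
// Given the index of the currently focused control among the lightbox's
// focusable elements, returns the index that should receive focus after
// Tab (shiftKey = false) or Shift+Tab (shiftKey = true), wrapping at the ends.
function trappedFocusIndex(
  current: number,
  focusableCount: number,
  shiftKey: boolean,
): number {
  if (focusableCount === 0) return -1; // nothing focusable inside the dialog
  const step = shiftKey ? -1 : 1;
  return (current + step + focusableCount) % focusableCount;
}

// Text to place in a live region after each navigation.
function navigationAnnouncement(index: number, total: number, alt: string): string {
  const position = `Image ${index + 1} of ${total}`;
  const description = alt.trim();
  return description ? `${position}: ${description}` : position;
}
```

A keydown handler for Tab would call `preventDefault()` and focus the element at the returned index; the announcement string would be written into an element with `aria-live="polite"` so screen readers read it without moving focus.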
Performance
- Use modern formats (AVIF, WebP) with JPEG fallback
- Serve responsive sizes via srcset
- Hint off-main-thread decoding with decoding="async"
- Set the width and height attributes on each img so the browser reserves the correct aspect ratio before the image loads, avoiding layout shift
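As an illustration of how these attributes fit together, the sketch below builds the attribute set for one thumbnail. The `?w=` query parameter, the width list, and the `sizes` value are hypothetical stand-ins for whatever your image CDN and layout actually use.

```typescript
// Builds a width-descriptor srcset string.
// The `?w=<pixels>` URL convention is an assumption, not a real API.
function buildSrcset(baseUrl: string, widths: readonly number[]): string {
  return widths.map((w) => `${baseUrl}?w=${w} ${w}w`).join(", ");
}

// Attributes for a single thumbnail <img>.
// width/height are the image's intrinsic dimensions; the browser uses their
// ratio to reserve space before any bytes arrive.
function thumbnailAttrs(
  baseUrl: string,
  intrinsicWidth: number,
  intrinsicHeight: number,
): Record<string, string> {
  return {
    src: `${baseUrl}?w=300`,
    srcset: buildSrcset(baseUrl, [150, 300, 450]),
    sizes: "(min-width: 600px) 200px, 50vw", // assumed layout
    width: String(intrinsicWidth),
    height: String(intrinsicHeight),
    loading: "lazy",
    decoding: "async",
  };
}
```

Format negotiation (AVIF, then WebP, then JPEG) would sit one level up, in a `picture` element with one `source` per format, each carrying its own srcset.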
EXIF metadata
For photo galleries, show metadata: camera, lens, aperture, shutter speed, ISO. Parse it client-side from the JPEG headers with a library such as exif-js or ExifReader.
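Once the tags are parsed, they still need formatting for display. The sketch below assumes an already-normalized object; the `ExifSummary` shape and its field names are hypothetical and do not correspond to the output of any particular parsing library.

```typescript
// Hypothetical, already-parsed EXIF values. Map your library's tags into this shape.
interface ExifSummary {
  camera?: string;
  lens?: string;
  fNumber?: number;      // aperture, e.g. 2.8
  exposureTime?: number; // shutter speed in seconds, e.g. 0.004
  iso?: number;
}

// Photographers expect fractions for fast shutter speeds: 0.004 s reads as 1/250 s.
function formatShutter(seconds: number): string {
  if (seconds >= 1) return `${seconds} s`;
  return `1/${Math.round(1 / seconds)} s`;
}

// Joins whichever fields are present into a single caption line.
function formatExif(exif: ExifSummary): string {
  const parts: string[] = [];
  if (exif.camera) parts.push(exif.camera);
  if (exif.lens) parts.push(exif.lens);
  if (exif.fNumber !== undefined) parts.push(`f/${exif.fNumber}`);
  if (exif.exposureTime !== undefined && exif.exposureTime > 0) {
    parts.push(formatShutter(exif.exposureTime));
  }
  if (exif.iso !== undefined) parts.push(`ISO ${exif.iso}`);
  return parts.join(", ");
}
```

Every field is optional because EXIF data is often incomplete or stripped entirely during image processing, so the caption should degrade gracefully to an empty string.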
Common mistakes
- No virtualization with thousands of thumbnails
- Lightbox covers the whole page but does not trap focus
- Swipe gestures conflict with browser scroll
- Pinch-to-zoom that does not work reliably on real devices
- Image decode blocks scrolling
Frequently Asked Questions
Should I use a library or build from scratch?
For interview practice, build it. For production, use a library such as PhotoSwipe, lightGallery, react-photo-gallery, or Swiper. They handle gesture and accessibility edge cases.
How do I handle 4K images on mobile?
Serve a resolution that matches the image's rendered size multiplied by the device pixel ratio. An iPhone Pro is 3x, so it needs an image about three times its CSS width. Don’t serve 4K to a 1x display.
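The selection rule can be written out as a short function. In practice `srcset` with width descriptors lets the browser make this choice itself; the sketch below only makes the arithmetic explicit, and its name, the width list, and the DPR cap are assumptions.

```typescript
// Picks the smallest available width that covers cssWidth * devicePixelRatio.
// Falls back to the largest available width when nothing is big enough.
function pickImageWidth(
  cssWidth: number,
  devicePixelRatio: number,
  availableWidths: readonly number[],
  maxDpr = 3, // assumed cap: little visible gain beyond 3x
): number {
  if (availableWidths.length === 0) {
    throw new Error("availableWidths must not be empty");
  }
  const needed = cssWidth * Math.min(devicePixelRatio, maxDpr);
  let best: number | undefined;
  let largest = -Infinity;
  for (const w of availableWidths) {
    if (w > largest) largest = w;
    if (w >= needed && (best === undefined || w < best)) best = w;
  }
  return best ?? largest;
}
```

For example, with renditions at 640, 1280, 1920, and 3840 pixels, a phone showing the image 393 CSS pixels wide at 3x needs 1179 physical pixels and gets the 1280 rendition, while a 1x desktop showing it 1920 pixels wide gets the 1920 rendition rather than the 4K file.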
What about video in the gallery?
Same architecture; video element instead of img. Same lazy-load and lightbox patterns. Add play/pause UI.