Featured Image: iPhone 12 Pro shot on iPhone 11 Pro

This week I went to the dentist to have a crown replaced and encountered some technology that was new to me. I won’t pretend to know much about dentistry, but a crown is meant to match the tooth it’s replacing, so the process begins with obtaining a model of the existing teeth. I’d previously had a mold cast using some sort of clay to take an impression, but this time my dentist instead used an imaging wand to capture the model of my teeth. He then perfected that model on a console right in front of me, designing the new crown and sending it off to be manufactured in a little CNC-like machine sitting in the next room. The entire process, formerly taking days and requiring multiple visits, took less than an hour – a quantum leap in efficiency, and a nice surprise for me.


Image: Scanned 3D model in CEREC software from Dentsply Sirona

The underlying technology that my dentist adopted in the years since my last crown was actually quite familiar to me. In fact, my colleagues and I have also been exploring its uses for years. For lack of a better term, I’ll use photogrammetry to broadly describe it: using imaging techniques like photography to create 3D representations of real-world objects. The technique has many other industrial uses, perhaps the oldest being the creation of topographical maps way back in the 1800s. It’s obviously progressed enormously since then with the application of ever-increasing computing power. In my line of work, we look for novel and interesting ways to use such technologies.

From a tactical standpoint, there are multiple ways to accomplish the basic goal of imaging 3D objects. Photogrammetry rigs (specialized camera setups designed to capture a subject) can use conventional cameras and sophisticated software to stitch together a 3D model. To capture a subject from all angles instantaneously, these rigs require a large number of cameras, all synchronized to shoot simultaneously. A more cost-effective, smaller-footprint approach is to use a rotating armature or turntable, allowing just a few cameras to capture a series of images over a brief period as either the camera or the subject rotates. I have seen both approaches used to great effect, capturing human subjects and placing them into 3D scenes in real time. Auto-rigging software can even allow the newly created characters to jump, kick, or dance right in front of the subjects who were scanned into the experience only moments before.


Image: 90-camera DSLR rig at UMBC Imaging Research Center
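
To give a sense of what the stitching software is doing, here is a minimal Python sketch of the first step in a typical photogrammetry pipeline: detecting and matching feature points between two overlapping photos using OpenCV. The filenames are placeholders of my own, and a real pipeline would go on to triangulate correspondences like these across many views to recover 3D geometry.

```python
import cv2

# Two overlapping photos of the subject (placeholder filenames).
img1 = cv2.imread("view_01.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view_02.jpg", cv2.IMREAD_GRAYSCALE)

# Detect ORB keypoints and compute descriptors in each view.
orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Match descriptors between the two views. These correspondences are what
# structure-from-motion software triangulates into 3D points.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

print(f"{len(matches)} candidate correspondences between the two views")
```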

As a brief aside, you’ve almost certainly seen the results from a similar type of rig called a bullet-time rig. Made famous by the bullet-dodging scenes in The Matrix, it also uses multiple cameras to capture many images at once or in quick succession, allowing a “frozen moment” pan around what is usually some chaotic scene. It has since been used in countless movies and commercials and has become something of a standby for event activations. While not technically photogrammetry as I’ve defined it, it similarly enables an exploration of 3D space using 2D images.

Specialized depth-sensing cameras can also be used to deliver real-time 3D models of subjects. One of the earliest consumer-facing devices for this was Microsoft’s original Kinect, introduced a decade ago. The Kinect was specifically intended for scanning human figures, and its software used the depth imagery to translate a figure into a skeleton and then track it in real time, which fit nicely into existing 3D character workflows. Originally conceived as an Xbox accessory (and an answer to Nintendo’s Wii), it was quickly adopted by hackers and artists for all manner of experimentation and remains a useful tool for experiential activations. I’ve personally worked on dozens of Kinect-driven projects over the years, many of which are still in use (or were until pandemic shutdowns).


Image: Kinect-scanned 3D figures in an activation for IBM
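
A common first step when building 3D figures from a depth camera like the Kinect is back-projecting each depth frame into a cloud of 3D points using the camera’s intrinsics. Here is a rough Python sketch of that step using the standard pinhole camera model; the intrinsics and the flat synthetic depth frame are illustrative values of my own, not calibrated Kinect numbers, and this is not the Kinect SDK’s actual skeleton tracker.

```python
import numpy as np

def depth_to_point_cloud(depth_m, fx, fy, cx, cy):
    """Back-project a depth image (in meters) to 3D points
    using the standard pinhole camera model."""
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_m
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]   # drop pixels with no depth reading

# Illustrative 640x480 depth frame (a flat surface 2 m away) and
# roughly Kinect-like intrinsics -- assumptions, not calibrated values.
depth = np.full((480, 640), 2.0)
cloud = depth_to_point_cloud(depth, fx=570.0, fy=570.0, cx=320.0, cy=240.0)
print(cloud.shape)   # (307200, 3)
```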

PrimeSense, the original company behind Kinect’s technology, was acquired by Apple back in 2013. Some of that technology has surely been incorporated into Apple’s front-facing cameras to power Face ID, introduced several years ago. But Apple’s newly released iPhone 12 Pro incorporates even more advanced depth-scanning capabilities, including a Lidar scanner, potentially opening a fascinating new area for experimentation. Early explorations show some impressive depth-sensing capabilities that could be used in a multitude of applications.


Images: From a tweet by Shawn Frayne (@haddockinvent)

For a while now there have been many mobile apps that attempt to create accurate floorplans from conventional photo scans of a room, but they haven’t worked very well. The iPhone 12 Pro’s new capabilities can make such apps much more accurate, and apps like Canvas are already taking advantage of them. I recently had a fascinating conversation with a construction engineer friend of mine about the use of 3D scanning in his field. For most industrial applications the goal is highly accurate scans, which can be expensive and time-consuming to produce. But large-project construction management requires massive and frequent scans of a site in order to track progress, which leads to a preference for ease and speed over quality. Inexpensive and readily available Lidar could allow for both.

Image: Room scan on iPhone 12 Pro by Canvas
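
As a rough illustration of the kind of geometry processing a floorplan or construction-tracking app might do with a Lidar scan, here is a Python sketch that uses a simple RANSAC fit to pull the floor plane out of a synthetic room point cloud. The data and thresholds are made up for the example; this is not how Canvas or any particular app actually works.

```python
import numpy as np

def fit_floor_plane(points, iterations=200, threshold=0.02, rng=None):
    """Find the plane supported by the most points within `threshold` meters
    (simple RANSAC). Returns (normal, d) for the plane n.x + d = 0 and an inlier mask."""
    rng = rng or np.random.default_rng(0)
    best_inliers, best_plane = None, None
    for _ in range(iterations):
        # Pick 3 random points and form the plane through them.
        sample = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-9:
            continue  # degenerate (collinear) sample
        normal /= norm
        d = -normal.dot(sample[0])
        inliers = np.abs(points @ normal + d) < threshold
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers, best_plane = inliers, (normal, d)
    return best_plane, best_inliers

# Synthetic "room scan": a 5 m x 4 m floor at z = 0 plus scattered clutter above it.
rng = np.random.default_rng(1)
floor = np.column_stack([rng.uniform(0, 5, 5000),
                         rng.uniform(0, 4, 5000),
                         rng.normal(0, 0.005, 5000)])
clutter = rng.uniform([0, 0, 0.1], [5, 4, 2.5], size=(1000, 3))
(normal, d), inliers = fit_floor_plane(np.vstack([floor, clutter]))
print("floor normal ~", np.round(normal, 2), "| inlier points:", inliers.sum())
```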

Such capability is obviously useful for architects and interior designers. But accurate 3D models of the environment are also necessary for augmented reality (AR) applications and room-scale virtual reality (VR) installations. The newest generations of consumer AR and VR headsets include sensors that scan the environment (and VR often uses external sensors for so-called outside-in tracking of the headsets, but that’s a whole other rabbit hole). XR (a broad term covering AR, VR, “mixed reality”, and the like) is another set of technologies we use a lot for brand activations and attractions. The need for headsets and specialized hardware is a significant barrier to entry for such projects, and asking people to share that hardware at in-person events only complicates the situation in a post-COVID world. So technologies that enable better AR experiences on participants’ own phones are of great interest to experiential creators like me.

The inclusion of Lidar on the iPhone 12 Pro will almost certainly kick off a new phase of experimentation similar to what happened with the Kinect. In fact, the massively larger installed base of hardware promises much more experimentation than we saw then. And other phone manufacturers will likely follow suit with analogous tech. If, in a few years, the ability to quickly and easily scan 3D objects and spaces is part of the standard capability set for smartphones, we could see a whole new realm of digital experiences and consumer applications take hold.