Capturing the real world in three dimensions: 3D scanning, photogrammetry, range imaging, voxels and point clouds

3D computer graphics date back to around the 1970s. They consist of three-dimensional geometric data which can be rendered to 2D images, for example for display on a 2D screen. 3D computer graphics objects are referred to as 3D models.

3D models can be created using 3D modelling software, and virtual 3D worlds can be created by compositing multiple 3D objects. This is typically how 3D games and scenes for films have been made. Traditionally, when a 3D artist wanted to create a 3D model of a real space, they built it in software based on observations of that space.

Emerging technologies are opening up new and exciting possibilities for 3D model creation.

3D scanning is a method for analysing real-world objects or environments in order to record shape and sometimes colour data. The data is then used to construct 3D models. As 3D scanners come down in price and are included in more mobile devices, more people will be able to create 3D scans of whatever they choose.

Rapid creation of realistic digital artefacts based on real-world objects and locations allows artists and storytellers to create more vivid cultural interfaces more quickly. The increased accessibility of these tools empowers more creators, enabling more diverse digital placemaking with greater impact.

In the point cloud render of a heritage site in danger (below), made up of 1.2 billion data points, the high level of detail is impressive. It is a very different experience to looking at a photograph or video. You can see both the beauty and the damage to this historical building. Seeing like this increases emotional connection. If viewed using a VR headset, the audience would be inside the model, which could be presented at 1:1 scale in the virtual world. This could in turn lead, for example, to better education and to more action to protect such sites or end the bombing that threatens them.

An example of a 1.2 billion data point cloud render of Beit Ghazaleh, a heritage site in danger in Aleppo (Syria). Source. Creative Commons Attribution-Share Alike 4.0 International license.

Range imaging

Range imaging is the measurement of the distance to many points on an object or scene, producing a depth value for each point. There are different technologies for achieving this, and those available differ in resolution.
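
To make the idea concrete, here is a minimal sketch (assuming a simple pinhole camera model and NumPy) of how a range image, a grid of per-pixel depth values, can be back-projected into 3D points. The intrinsics fx, fy, cx and cy are hypothetical values standing in for a real device's calibration.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image (in metres) into 3D points using a
    pinhole camera model; fx, fy, cx, cy are the camera intrinsics."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx      # horizontal offset from the optical axis
    y = (v - cy) * depth / fy      # vertical offset from the optical axis
    points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # drop pixels with no depth reading

# Hypothetical example: a flat 4x4 depth map, every pixel 2 m away
demo_depth = np.full((4, 4), 2.0)
print(depth_to_points(demo_depth, fx=500.0, fy=500.0, cx=2.0, cy=2.0))
```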

Lidar scanners

Lidar scanners measure distances by emitting laser light and using a sensor to measure its reflection. The iPhone 12 Pro (released late 2020) features a built-in lidar scanner on the rear of the handset which is capable of scanning objects at up to a 5 m distance. This makes it possible to scan a space or room by moving the handset around until all surfaces have been scanned. Capturing an environment as a 3D object this way is much faster than having a 3D artist manually recreate that space in software.
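
The basic principle behind lidar is time of flight: a pulse of light travels to a surface and back, and the measured round-trip time gives the distance. A minimal sketch of that relationship:

```python
SPEED_OF_LIGHT = 299_792_458  # metres per second

def tof_distance(round_trip_seconds: float) -> float:
    """Distance to a surface given the round-trip time of a light pulse.
    The pulse travels out and back, so the one-way distance is half."""
    return SPEED_OF_LIGHT * round_trip_seconds / 2

# A surface roughly 5 m away returns the pulse after about 33 nanoseconds
print(tof_distance(33.4e-9))  # ~5.0 m
```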

At launch the iPhone 12 Pro does not have many applications available in the Apple App Store which use the lidar scanner. Apple uses it to improve camera performance, but the technology will also hugely improve augmented reality, as the data it collects can be used for accurate occlusion of virtual objects behind real-world objects. It is therefore a great leap towards augmented experiences with greater immersion, opening up new possibilities for storytelling and experience design. Apple augmented reality glasses have been rumoured for some time, and I imagine that if they become a reality they could well feature a lidar scanner, by which time many content creators will already have started using this technology.
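
To illustrate why per-pixel depth enables occlusion, here is a hedged NumPy sketch (not Apple's implementation) of the underlying idea: a virtual pixel is only drawn where the virtual surface is closer to the camera than the sensed real-world surface.

```python
import numpy as np

def composite_with_occlusion(real_rgb, real_depth, virtual_rgb, virtual_depth):
    """Per-pixel depth test: keep the virtual pixel only where the virtual
    surface is closer to the camera than the sensed real-world surface,
    so real objects correctly hide virtual objects placed behind them."""
    virtual_in_front = virtual_depth < real_depth      # (H, W) boolean mask
    mask = virtual_in_front[..., np.newaxis]           # broadcast over the RGB channels
    return np.where(mask, virtual_rgb, real_rgb)

# Hypothetical 2x2 frames: the real wall is 1 m away, the virtual cube 3 m away,
# so the wall should occlude the cube everywhere
real_rgb, virtual_rgb = np.zeros((2, 2, 3)), np.ones((2, 2, 3))
print(composite_with_occlusion(real_rgb, np.full((2, 2), 1.0),
                               virtual_rgb, np.full((2, 2), 3.0)))
```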

The rapid distribution of this technology can be seen in the prediction that Apple would sell up to 9 million iPhone 12 units in just one weekend of pre-orders. That's millions of lidar scanners being carried around in millions of pockets after just one weekend.

IR depth sensors

IR depth sensors work in a similar way to lidar, but the electromagnetic radiation used is infrared light rather than laser light.

An example of an IR depth sensor is Microsoft’s Azure Kinect DK developer kit which features a 12 megapixel RGB camera, a 1 megapixel depth camera, a 360 degree seven-microphone array and an orientation sensor.

The Azure Kinect has a higher resolution than the iPhone 12 Pro's lidar, which means it produces more accurate point cloud representations of objects. However, the iPhone 12 Pro has the advantage of being a consumer product which will find its way into many more people's (and developers') hands. The result will likely be many more consumer applications being developed and released.

Below are some stills from a 3D scan I made using the IR depth sensor on an iPhone 11 Pro with the app Capture. The results are not very high in resolution, but the form has been captured well enough to recognise the person. Colours have been successfully captured using the iPhone camera and automatically combined with the points in the point cloud. From a creation perspective the process was quick and easy, though a little awkward because the phone has to be held with its screen facing the subject (the IR sensor is located on that side of the device).

Point clouds

Point clouds are data sets of points with X, Y and Z coordinates representing 3D objects, and are most often produced by 3D scanners or photogrammetry software. Point clouds can be rendered and viewed as they are, appearing much as their name describes. It is also possible to convert them to polygon or triangle mesh models using so-called surface reconstruction.
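
As a rough sketch of that workflow, the snippet below (assuming the Open3D library and a hypothetical scan.xyz file with one "x y z" point per line) loads points into a point cloud and converts them into a triangle mesh using Poisson surface reconstruction, one common surface reconstruction method.

```python
import numpy as np
import open3d as o3d  # assumes the Open3D package is installed

# Load raw XYZ data (hypothetical file exported from a scanner app)
xyz = np.loadtxt("scan.xyz")
pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(xyz)

# Surface reconstruction needs per-point normals
pcd.estimate_normals()

# Poisson reconstruction turns the points into a triangle mesh
mesh, _densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(pcd, depth=9)
o3d.io.write_triangle_mesh("scan_mesh.ply", mesh)
```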

Point clouds in their raw visualised form have an abstract but appealing aesthetic, which many storytellers are making use of.

Voxels

Voxels differ from point clouds and polygons in that they are not represented by coordinates of vertices. Instead they are comparable to 2D pixels: cubic blocks positioned relative to other blocks of the same shape and size. You can see some low-resolution examples that I exported from mobile apps below. Just like 2D raster graphics, as the resolution increases so does the fidelity, better replicating real-world objects.
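
A minimal NumPy sketch of the relationship between points and voxels: each point is quantised into the cubic cell it falls in, and the chosen voxel size controls the fidelity, just as pixel size does for a 2D raster image. The data here is randomly generated purely for illustration.

```python
import numpy as np

def voxelise(points: np.ndarray, voxel_size: float) -> np.ndarray:
    """Quantise XYZ points into a grid of equally sized cubes (voxels) and
    return the integer grid index of each occupied voxel. A smaller
    voxel_size means more, smaller voxels and therefore higher fidelity."""
    indices = np.floor(points / voxel_size).astype(int)
    return np.unique(indices, axis=0)

# Hypothetical example: 100,000 random points in a 1 m cube, 5 cm voxels
cloud = np.random.rand(100_000, 3)
print(len(voxelise(cloud, 0.05)), "occupied voxels")
```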

Volumetric video and photogrammetry

Another interesting technique is photogrammetry, the process of extracting accurate information about physical objects from photographs using a range of techniques. An example is creating a 3D model from a series of overlapping images alone.
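
To give a feel for the core idea, here is a hedged two-view sketch (assuming OpenCV, hypothetical image filenames and an assumed camera matrix K) of the first steps a photogrammetry pipeline takes: matching features between overlapping photos, recovering the relative camera pose, and triangulating the matches into sparse 3D points. Full structure-from-motion tools do far more than this, over many images.

```python
import cv2
import numpy as np

# Two overlapping photos of the same object (hypothetical filenames)
img1 = cv2.imread("view_1.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view_2.jpg", cv2.IMREAD_GRAYSCALE)

# 1. Detect feature points and match them between the two images
orb = cv2.ORB_create(5000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# 2. Recover the relative camera pose (K is an assumed intrinsic matrix)
K = np.array([[1200.0, 0.0, img1.shape[1] / 2],
              [0.0, 1200.0, img1.shape[0] / 2],
              [0.0, 0.0, 1.0]])
E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
_, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)

# 3. Triangulate the matched points into a sparse 3D point cloud
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t])
points_h = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
points_3d = (points_h[:3] / points_h[3]).T   # homogeneous -> Cartesian
print(points_3d.shape[0], "3D points recovered")
```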

Improved immersion

As depth sensors make their way into mobile devices and improve in quality, the creation of experiences with much greater immersion and cultural relevance will become both easier and accessible to more people.

Conclusions

Storytellers and artists will inevitably integrate these technologies into their experience designs, using them to capture ever more lifelike environments, enabling more powerful relationships between artists, experiences and audiences.

I can imagine the family video call evolving into a spatial experience taking place in one of the family members’ living rooms, with the option of collaborative game-like interactions in those spaces.

I imagine people archiving spaces so that others (in other places or at later times) can walk around in them, guided or unguided, with layers of data, visual and/or audio, placed there for others to experience.

The technologies covered in this article are yet more tools available to the cultural interfaces emerging in the 21st century, incredibly useful to artists, storytellers and creators designing and making experiences for audiences to interact with cultural data.


Cover Image: Source. Creative Commons Attribution 4.0 International license.