Table of contents
1. Introduction
2. How is Occlusion Solved?
3. How does Occlusion work in AR?
   3.1. Work-Flow of 3D Sensing
      3.1.1. Structured light
      3.1.2. Time of Flight
      3.1.3. Stereo Camera
   3.2. Limitations of depth-sensing
   3.3. How to Use Depth Sensing Data for Occlusion?
      3.3.1. Approach 1
      3.3.2. Approach 2
4. The problem with AR devices
5. Is There Any Other Solution?
6. Frequently Asked Questions
   6.1. What is Augmented Reality (AR)?
   6.2. What is Virtual Reality (VR)?
   6.3. What are Virtual Web Applications? Are they Virtual Reality?
7. Conclusion
Last Updated: Mar 27, 2024

Occlusion

Author: Shivani Singh

Introduction

To experience a convincing new reality with AR, certain rules of realism must be followed, along with three basic functionalities: a fusion of the real and virtual worlds, real-time interaction, and highly accurate registration of virtual and real objects. Beyond that, AR faces challenges such as occlusion. Occlusion handling suppresses the rendering of virtual objects (or parts of them) that should not be visible because other objects obscure (occlude) them from the camera's point of view. So, what does occlusion look like in AR? Let us discuss this in depth.

In augmented and mixed reality, occlusion refers to the idea that a real physical object should hide a virtual object placed farther from the camera than that real object, just as physics dictates. Occlusion is critical for immersive AR experiences: a virtual object should be drawn only where there is no physical thing between it and the camera.

The image below depicts an unoccluded AR experience.

Image source: occlusion

The virtual model of the portrait should appear behind the person and further away on screen, but instead it is drawn like an overlay. This breaks the normal perception of depth and spoils the experience.

This is what happens when occlusion is not handled: the virtual object is drawn on top of the scene even though something real stands between it and the camera, as in the picture above.

How is Occlusion Solved?

To add virtual content to a physical scene, the system must understand which physical objects are in view and where they are in the real world.

This is done to determine what needs to be occluded and to accurately render content.

Currently, application developers address this issue with a variety of special effects.

There are several approaches to dealing with occlusion in mobile AR. One simple solution is to use depth cameras, which provide the 3D location of each pixel. This will reveal what (if anything) stands between the camera and the simulated content.

An even better solution is to combine this depth data across frames to create the scene's 3D geometry.

How does Occlusion work in AR?

When creating AR scenes, the primary objective of occlusion is to keep the rules of line-of-sight intact. This entails three major functions:

  1. Sensing the three-dimensional structure of the real world.
  2. Reconstructing a digital 3D picture of the world.
  3. Rendering: Using that model to create a transparent mask that hides the parts of virtual objects that should not be visible.

 

The most difficult part of building this occlusion mask is reconstructing a good enough model of the real world to apply it, because no current AR device can perceive its surroundings precisely or quickly enough to provide truly realistic occlusion.

Work-Flow of 3D Sensing

Let us now look at how the real world is sensed in the first place: depth sensing. There are several ways to accomplish it, such as structured light, time of flight, and stereo cameras.

Structured light

Structured light works by projecting a known infrared light pattern, or a ribbon of light, onto a three-dimensional surface. The sensor observes how the pattern is distorted by the surface and, from those distortions, reconstructs the surface's shape.

Image source: structured light

In the picture above, a light strip is projected onto the object. From the way that strip deforms across the object, the sensor collects data and reconstructs the object's outline.

Time of Flight

A time-of-flight sensor emits short bursts of infrared light. The light strikes objects in the scene and is reflected back, and the image sensor measures the round-trip delay for every pixel. That delay converts directly into distance.
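As a rough illustration of the idea (not any particular sensor's API), the distance for a pixel follows directly from its measured round-trip time:

```python
# Depth from time of flight: light travels to the object and back, so the
# one-way distance is half of (speed of light * round-trip delay).
SPEED_OF_LIGHT = 3.0e8  # metres per second

def tof_depth(round_trip_seconds: float) -> float:
    """Distance (in metres) to the surface seen by one pixel."""
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0

# A reflection that arrives ~13.3 nanoseconds later corresponds to ~2 m.
print(tof_depth(13.3e-9))  # ~1.995
```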

Image source: time of flight

Stereo Camera

Stereo cameras imitate human binocular vision: they measure the displacement (disparity) of matching pixels between two cameras separated by a fixed distance (the baseline) and use that information to triangulate distances to points in the scene.
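A minimal sketch of the underlying triangulation, assuming an idealised pair of parallel cameras (the numbers are purely illustrative):

```python
def stereo_depth(focal_length_px: float, baseline_m: float, disparity_px: float) -> float:
    """Triangulate distance (metres) to a point matched in both cameras.

    depth = focal_length * baseline / disparity, with focal length and
    disparity measured in pixels and the camera baseline in metres.
    """
    if disparity_px <= 0:
        raise ValueError("point not matched, or effectively at infinity")
    return focal_length_px * baseline_m / disparity_px

# Example: 700 px focal length, 10 cm baseline, 35 px disparity -> 2 m away.
print(stereo_depth(700.0, 0.10, 35.0))  # 2.0
```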

Image source: stereo camera

Limitations of depth-sensing

We now know how these sensors work, but they have some drawbacks:

  1. Outdoors, IR-based sensors struggle because sunlight contains a lot of infrared, which can wash out the signal or add noise to the measurements. As a result, the readings would be misleading.
  2. Stereo cameras work best in well-lit areas with plenty of texture (features) and strong contrast.
  3. Because these sensors use pixel-based measurements, any noise or error in the measurements results in holes in the depth image.
  4. So far, the practical maximum range is estimated to be less than 4 meters.

How to Use Depth Sensing Data for Occlusion?

Approach 1:

Align the camera image with a depth map, then mask out the parts of the virtual content that should be hidden behind nearer depth-map pixels. Because it works directly on depth images, this method does not require any 3D reconstruction (see the sketch below the image). However, it has some potential issues:

  1. The range is limited to below 4 meters.
  2. The depth map has flaws and is not perfect.
  3. The depth map has a lower resolution than the camera image. As a result, occlusion edges come out pixelated and jagged.

Image source: using depth-sensing
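A minimal sketch of Approach 1, assuming we already have a sensor depth map aligned to the camera image and the depth at which the virtual object would be drawn for the same pixels (all names and values are illustrative):

```python
import numpy as np

# Hypothetical inputs: a sensor depth map in metres, aligned to the camera
# image, and the per-pixel depth of the rendered virtual object.
sensor_depth = np.array([[1.2, 1.2, 3.0],
                         [1.1, 2.9, 3.1],
                         [3.0, 3.0, 3.2]])
virtual_depth = np.full_like(sensor_depth, 2.0)  # virtual object 2 m away

# Holes in the depth map (zeros or NaNs) carry no information, so those
# pixels are left visible rather than occluded.
valid = np.isfinite(sensor_depth) & (sensor_depth > 0)

# A virtual pixel is hidden wherever something real is measurably closer.
occluded = valid & (sensor_depth < virtual_depth)
visible_mask = ~occluded  # True where the virtual content should be drawn

print(visible_mask)
```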

Approach 2:

We can create meshes of the real world from a 3D point cloud. These meshes can then be combined into a transparent mask that occludes virtual elements.

However, this works better in a static environment than in a dynamic one, because mesh generation from point clouds is generally too slow for real-time occlusion. A rough sketch of the idea follows the image below.

Image source: using depth-sensing
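As a rough sketch of Approach 2, here is how a mesh might be reconstructed offline from a captured point cloud using the open-source Open3D library (the file names and parameters are illustrative assumptions; a real pipeline would also need to fuse and filter the points):

```python
import open3d as o3d

# Load a previously captured point cloud of the room (illustrative file name).
pcd = o3d.io.read_point_cloud("room_scan.ply")

# Poisson surface reconstruction needs per-point normals.
pcd.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.1, max_nn=30))

# Turn the point cloud into a triangle mesh; this step is the slow part
# that makes real-time occlusion from point clouds impractical today.
mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
    pcd, depth=8)

# The mesh can then be loaded into the AR engine as an invisible "occluder":
# it writes only to the depth buffer, so virtual objects behind it are hidden.
o3d.io.write_triangle_mesh("room_occluder.obj", mesh)
```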

The problem with AR devices

The main issue with all of the popular AR devices, such as Google Tango, Microsoft Hololens, and Apple iPhone X systems, is that they have:

  1. Limited range (<4m): Due to the size and power restrictions of mobile devices, the range of Infrared and Stereo depth sensors is limited.
  2. Low resolution: Finer details in the scene are not captured in the point cloud, and achieving crisp, reliable occlusion surfaces is extremely difficult.
  3. Slow mesh reconstruction: Current methods for creating meshes from point clouds are too slow for these devices to support real-time occlusion.

Is There Any Other Solution?

Right now, there are numerous augmented reality SDKs available. Apple and Google have introduced ARKit and ARCore in recent years, and AR Foundation sits on top of both, connecting them and providing cross-platform development with Unity. Occlusion is one of the problems these SDKs try to address, but none of them produce a perfect result, particularly in a real-time environment. In some cases, we can still use shaders to get occlusion to work. If we are willing to relax the real-time constraint, i.e. restrict ourselves to a static environment, and the application allows the environment to be pre-mapped, a pre-built mesh can be used as an occlusion mask for the larger objects, provided they remain static.

 

Each AR experience raises its own occlusion challenge that necessitates its own solution. Technical advances, particularly in sensor sensitivity and processing power, will undoubtedly help resolve these issues in the near future. The combination of creativity and software will push augmented reality to the point where, one day, the seam between real and virtual is almost undetectable.

Frequently Asked Questions

What is Augmented Reality (AR)?

Augmented reality overlays computer-generated data, such as a restaurant review, on top of real-world views, such as a city street. Many apps use this augmented layer to place animations, images, video, and text in the real world.

What is Virtual Reality (VR)?

Wearing a headset, users can see and interact with a fully computer-generated space. In other words, VR creates a fully immersive virtual environment.

What are Virtual Web Applications? Are they Virtual Reality?

No. People frequently confuse anything "virtual" with virtual reality, but virtual reality always involves a device that immerses users in a different world; a virtual web application does not.

Conclusion

To summarise, we discussed the fundamentals of occlusion and how to solve it in this blog. We also talked about how occlusion works in AR, as well as the workflow and limitations of 3D sensing. Then we saw the application of depth sensing for occlusion. There were two approaches described. We also talked about the issues with today's AR devices and whether there is another solution besides depth sensing. Finally, we discussed potential future scenarios.

Refer to our guided paths on Coding Ninjas Studio to upskill yourself in Data Structures and Algorithms, Competitive Programming, JavaScript, System Design, and many more! If you want to test your competency in coding, you may check out the mock test series and participate in the contests hosted on Coding Ninjas Studio! But if you have just started your learning process and are looking for questions asked by tech giants like Amazon, Microsoft, Uber, etc., you must have a look at the problems, interview experiences, and interview bundle for placement preparations.

Nevertheless, you may consider our paid courses to give your career an edge over others!

Do upvote our blogs if you find them helpful and engaging!

Happy Learning!
