Hello! Welcome to COT. Machine learning is doing wonders nowadays. If you are a geek, you have probably tried Blender or game programming. 3D models are used everywhere: in 3D animation, 3D printing, games, and engineering. Creating 3D models is a very time-consuming task, which is why it is worth automating as much of the process as possible. One way to do that is to use 2D images to understand the geometry and texture of objects, and then create 3D models from them. That is what this article is about: creating 3D models from images.
Many tools and algorithms have been proposed for this, but if you know how neural networks work, you will not be surprised that learning-based methods often give better results than hand-crafted approaches. The last thing in this article is the most exciting, so keep reading till the end. Let’s start.
Ways to represent 3D models
Before we go any further, you have to understand what voxels are. If you have used Blender or any other 3D modeling software, you may be familiar with meshes. A mesh is simply a collection of vertices connected by edges, which in turn form faces; together they make up the surface of a 3D model. But there is another way: you can represent a 3D model using voxels. Voxels are volumetric pixels, small cubes on a regular 3D grid that collectively form a 3D model when we fill in the cells the object occupies. Let’s go ahead.
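To make the difference between the two representations concrete, here is a minimal sketch in plain Python. It is only an illustration: the square mesh, the `voxelize_sphere` helper, and the 16x16x16 grid size are all made up for this example and do not come from any of the tools or papers discussed here.

```python
# A mesh stores vertices plus faces that index into the vertex list.
# Here: a flat unit square made of two triangles (a deliberately tiny example).
mesh_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
mesh_faces = [(0, 1, 2), (0, 2, 3)]  # each face lists three vertex indices


def voxelize_sphere(resolution, radius):
    """Return the set of (x, y, z) grid cells whose centre lies inside a sphere.

    The sphere is centred in a resolution^3 grid; radius is in voxel units.
    This helper is hypothetical and exists only for illustration.
    """
    centre = (resolution - 1) / 2.0
    filled = set()
    for x in range(resolution):
        for y in range(resolution):
            for z in range(resolution):
                dist_sq = (x - centre) ** 2 + (y - centre) ** 2 + (z - centre) ** 2
                if dist_sq <= radius ** 2:
                    filled.add((x, y, z))
    return filled


voxels = voxelize_sphere(resolution=16, radius=6.0)
print(len(mesh_vertices), "vertices and", len(mesh_faces), "faces in the mesh")
print(len(voxels), "of", 16 ** 3, "voxels are filled")
```

Notice that the mesh only describes a surface, while the voxel set describes which cells of the volume are occupied. Real pipelines usually store voxels in a dense 3D array rather than a Python set; the set is used here just to keep the example free of dependencies.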
I have read some research papers on this topic, and I liked four approaches the most. One is a non-machine-learning approach; the other three use a Convolutional Neural Network (CNN) or a Generative Adversarial Network (GAN).
- Non-ML-based approaches
- Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling (3D-GAN)
- Hierarchical Surface Prediction for 3D Object Reconstruction (HSP)
- Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer (DIB-R)
Let’s start with the first one.
1. Non-ML-based approaches to generate 3D models from images
There are many, but I would like to talk about two of them.
(i) Inverse Projection:
When you capture something with a camera, it draws the 3D scene you see onto a 2D plane (the image). In other words, it performs a projection. In doing that we lose the depth information, the z value of every point, which is why you cannot turn an object in a photo around to see it from another side. Inverse projection is nothing but the reverse of this projection. If we have the depth (z) of each pixel, that is, a depth map, then we can reconstruct the 3D scene. It needs both a depth map and an image, plus the camera parameters (focal length and principal point) that relate pixels to directions in space.
If you want to know about inverse projection in detail, you can read this. There is also a Python script here that you can try running. It can be done using just OpenCV. If you find any error in the code, let me know in the comments.
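The idea of inverse projection can be sketched in a few lines, assuming a simple pinhole camera. This is not the script linked above; the `back_project` function, the 3x3 depth map, and the intrinsics `fx`, `fy`, `cx`, `cy` are all made-up values for illustration. For a pixel `(u, v)` with depth `Z`, the pinhole model gives `X = (u - cx) * Z / fx` and `Y = (v - cy) * Z / fy`.

```python
def back_project(depth_map, fx, fy, cx, cy):
    """Turn a depth map into a list of 3D points using the pinhole camera model.

    depth_map is a list of rows; depth_map[v][u] is the depth Z at pixel (u, v).
    fx, fy are focal lengths in pixels; (cx, cy) is the principal point.
    Pixels with depth <= 0 are treated as "no measurement" and skipped.
    """
    points = []
    for v, row in enumerate(depth_map):
        for u, z in enumerate(row):
            if z <= 0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Made-up 3x3 depth map (in metres) and made-up intrinsics, for illustration only.
depth = [
    [2.0, 2.0, 2.0],
    [2.0, 1.0, 2.0],
    [0.0, 2.0, 2.0],
]
cloud = back_project(depth, fx=2.0, fy=2.0, cx=1.0, cy=1.0)
print(len(cloud), "points")  # 8 points, because one pixel has no depth
print(cloud[0])              # (-1.0, -1.0, 2.0)
```

The result is a point cloud, one 3D point per valid pixel. To get a coloured reconstruction, you would attach the colour of pixel `(u, v)` from the image to the corresponding point. With real images you would normally do the same arithmetic on whole arrays instead of looping over pixels.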
(ii) 3D Reconstruction from Single 2D Image:
This is a simple paper I came across. The method first identifies the various planes in the scene and then reconstructs the whole scene from those planes. Here is the paper link. You can see the result below.
Not so impressive. These approaches do not give very convincing results. What if we could use the power of neural networks to predict 3D models, just like our brain does? Many papers have tried that, so let’s see the popular ones.
2. Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling (3D-GAN)
This is one of the most famous approaches for generating 3D models with machine learning. Generative Adversarial Networks are powerful: they can create things that do not exist, which is exactly what 3D modeling artists want. The paper proposes two networks, 3D-GAN and 3D-VAE-GAN. 3D-GAN generates new 3D models from random input, in the style of the shapes it was trained on. 3D-VAE-GAN, in contrast, uses a Variational Autoencoder to encode an input image and then generates the 3D model corresponding to that image.
For training, you can use a 3D model dataset called ShapeNet. You can use the IKEA furniture dataset if you want to try 3D-VAE-GAN. The image below shows the results of using 3D-GAN for 3D model prediction.
The 3D models predicted in the image above are in the form of voxels, which I have already explained. Here is the link where you will get the paper, code, and pre-trained model.
3. Hierarchical Surface Prediction for 3D Object Reconstruction (HSP)
Hierarchical Surface Prediction (HSP) overcomes a drawback of 3D-GAN by focusing on one thing that 3D-GAN did not consider. The paper asks: why do we have to predict voxels for the whole volume of the object when we can predict only the surface? The 3D model predicted by 3D-GAN is solid, but that is not necessary; it is fine even if the model is hollow. Predicting only the surface reduces the computation required, and the voxels saved can be spent on the surface instead, which improves the resolution and quality of the 3D model.
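To see why predicting only the surface saves so much work, here is a small sketch that counts solid versus surface voxels for a cube. This is not the HSP network or its algorithm, only an illustration of the motivation; the helpers `solid_cube` and `surface_voxels` are hypothetical names written for this example.

```python
# The six face-adjacent neighbours of a voxel.
NEIGHBOUR_OFFSETS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def solid_cube(size):
    """Return every voxel of a filled size^3 cube as a set of (x, y, z) cells."""
    return {(x, y, z) for x in range(size) for y in range(size) for z in range(size)}


def surface_voxels(filled):
    """Keep only the voxels that touch empty space through at least one face."""
    surface = set()
    for (x, y, z) in filled:
        for dx, dy, dz in NEIGHBOUR_OFFSETS:
            if (x + dx, y + dy, z + dz) not in filled:
                surface.add((x, y, z))
                break
    return surface


for size in (8, 16, 32):
    solid = solid_cube(size)
    shell = surface_voxels(solid)
    print(size, len(solid), len(shell))
# 8 512 296
# 16 4096 1352
# 32 32768 5768
```

Each time the resolution doubles, the solid voxel count grows about 8 times, while the surface count grows only about 4 times. That gap is what makes higher resolutions affordable when only the surface is predicted.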
HSP uses a Convolutional Neural Network (CNN). Here is the link to the HSP paper, and here is the code, which I have already tried. Here is what the result looks like if we predict a 3D model using HSP.
4. Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer (DIB-R)
Remember what I said at the start when talking about inverse projection: images are 2D projections of a 3D scene. 3D-GAN and HSP did not take this into account as much as they could have. A 3D model is not only about vertices, edges, faces, or voxels. Texture and lighting are also important. They are time consuming to create too, and without them a 3D model is not complete. The DIB-R paper takes the whole thing to the next level: it predicts lighting and texture as well.
Besides texture and lighting, it also introduces its own renderer. The renderer is differentiable, which means gradients can flow from the rendered image back to the 3D model, so a neural network can be trained by comparing rendered images with real ones. The results can blow your mind; they look almost like real 3D models. With some more research, this could automate much of the process of creating 3D models. You can see its magic in the image below.
Here is the link to read more about the paper.
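To get a feeling for what "differentiable renderer" means, here is a toy sketch. It is not DIB-R and does not use its interpolation scheme; it is a made-up 1D example in which the "scene" is a single segment with an unknown radius, the "renderer" draws it with a soft edge, and gradient descent recovers the radius from a target image. All names and constants (`PIXELS`, `SOFTNESS`, `render`, the step size `0.1`) are assumptions chosen for illustration.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


PIXELS = [i * 0.5 - 5.0 for i in range(21)]  # pixel centres from -5.0 to 5.0
SOFTNESS = 0.5  # larger values blur the edge more


def render(radius):
    """Soft 1D 'render' of the segment [-radius, radius]: near 1 inside, near 0 outside."""
    return [sigmoid((radius - abs(x)) / SOFTNESS) for x in PIXELS]


def loss_and_grad(radius, target):
    """Squared pixel error against the target image and its derivative w.r.t. radius."""
    image = render(radius)
    loss = sum((p - t) ** 2 for p, t in zip(image, target))
    # d(pixel)/d(radius) = p * (1 - p) / SOFTNESS, then the chain rule on the square.
    grad = sum(2.0 * (p - t) * p * (1.0 - p) / SOFTNESS for p, t in zip(image, target))
    return loss, grad


target_image = render(3.0)  # the "photo" we want to explain
radius = 1.0                # deliberately wrong starting guess
initial_loss, _ = loss_and_grad(radius, target_image)
for _ in range(200):
    _, grad = loss_and_grad(radius, target_image)
    radius -= 0.1 * grad    # plain gradient descent
final_loss, _ = loss_and_grad(radius, target_image)
print("recovered radius:", round(radius, 3))
```

The recovered radius should end up close to 3.0. The soft edge is the important part: with a hard edge, a tiny change in the radius would not change any pixel, the gradient would be zero almost everywhere, and there would be nothing to learn from. A differentiable renderer for real 3D models applies the same principle to shape, texture, and lighting.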
The future is exciting. As you can see from these results, AI can greatly increase our productivity. So, I would like to conclude here. If you liked this article, please share it with your friends, and you may like our Facebook page for future articles. If you have any doubts, please tell us in the comments below. Bye!