The art of the illusion with photogrammetry

The Problem

This was one of the most amazing finds of the last year for me. As someone with no real classical artistic talent, I find the idea of taking real world assets and getting them into computers fascinating.

There are many different ways of doing this, each with its own process. One is rotoscoping, an animation technique in which you trace over a sequence of photographs to give the animation a lifelike feel.

Rotoscoping

There is cel shading, which renders a 3D model so that it looks flat and two dimensional, like a hand-drawn cartoon.

Toon-shader

Of all of them, the most magical to me has always been photogrammetry.

Photogrammetry is the science of making measurements from photographs, especially for recovering the exact positions of surface points. Photogrammetry is as old as modern photography, dating to the mid-19th century. In the simplest example, the distance between two points that lie on a plane parallel to the photographic image plane can be determined by measuring their distance on the image, if the scale (s) of the image is known.

https://en.wikipedia.org/wiki/Photogrammetry
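As a toy numerical example of that scale relationship (a sketch with made-up values, assuming a simple pinhole model where the scale is the focal length divided by the distance to the object plane):

```python
# Simplest case: real distance = measured image distance / scale,
# where (for a pinhole camera) scale s = focal_length / distance_to_plane.
# All numbers below are made up for illustration.
image_distance_mm = 12.0        # distance between the two points, measured on the photo
focal_length_mm = 50.0          # lens focal length
plane_distance_mm = 10_000.0    # camera to the object plane (parallel to the image plane)

scale = focal_length_mm / plane_distance_mm      # s = 0.005
real_distance_mm = image_distance_mm / scale     # 2400 mm
print(f"Real-world distance: {real_distance_mm / 1000:.2f} m")   # 2.40 m
```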

In 2008 Microsoft released a beta service called PhotoSynth. I recall it being hailed as the next big thing in photography, and it was truly magical: given enough ordinary two dimensional photographs, it would calculate a 3D point cloud that could be used to reconstruct the scene in 3D space, much like hundreds of Minecraft blocks. The idea was that one day you would be able to take a single photo and, with the massive resources of PhotoSynth behind you, augment your photos and have them turn into 3D models.

I took my ~200 photos of the Diana and the Fawn statue in Cannizaro Park and turned them into the most amazing 3D model, which I could spin around and view from absolutely any angle. It was like magic.

PhotoSynth

Unfortunately this was not to last, and the service was eventually reduced to a (very good) photo stitching service.

The problem was that with this service gone, there was no easy way for me to take my beautiful flat 2D images and turn them into a fantastical model, so they sat dormant, sad and unanimated, since 2010.

Diana and the Fawn

The Solution

Fast forward to a week ago and I found a wonderful program called Meshroom by Alice Vision.

Meshroom

This program promised the same goodness as Microsoft's PhotoSynth, but running entirely locally on the power of a single computer.

I was able to bring my images back to life using Meshroom and Blender.

Blender

All while following an amazing video by CG Geek.

Loading my ~200 images into Meshroom with default settings and hitting go resulted in a lot of waiting, but a pretty spectacular result.

A small subsection of the images
Loaded into Meshroom

And two hours later I had a 3D model with textures mapped onto it, all created from my 2D images.

There are a couple of things that are important when using this technology (with a small blur-checking sketch after the list):

  • Good high quality images, without blur
  • A camera that Meshroom already knows, or the willingness to add its sensor details yourself
  • About 60% overlap between photos to give the feature-matching algorithm enough to work with
  • Willingness to learn a bit of Blender (which was pretty complicated for me)
  • Patience
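On the first point, here is how you could pre-screen a folder of photos for blur before feeding them to Meshroom. This is a sketch only, assuming OpenCV is installed, using the common variance-of-Laplacian heuristic; the folder name and threshold are made up.

```python
import cv2                 # OpenCV: pip install opencv-python
from pathlib import Path

BLUR_THRESHOLD = 100.0     # arbitrary; tune it for your camera and scene

# Flag photos whose variance-of-Laplacian is low, a common sharpness heuristic.
for path in sorted(Path("photos").glob("*.jpg")):
    gray = cv2.imread(str(path), cv2.IMREAD_GRAYSCALE)
    if gray is None:
        continue           # unreadable file, skip it
    sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()
    if sharpness < BLUR_THRESHOLD:
        print(f"Possibly blurry, consider excluding: {path.name} ({sharpness:.0f})")
```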

The most rewarding part of this project for me was being able to bring back to life the photos from a previous project of mine. They were uploaded to Google Photos at some point and have sat there for eight years. They are stored using the free tier, which does introduce some additional compression, but they seem to have weathered time well enough.

Once I imported the OBJ model from Meshroom into Blender, I was greeted with a lot of complicated screens and information.


There was a 3D model, which included some of the surroundings, and four texture atlases for the scene. A texture atlas is essentially a flat image with information on how to project it onto your model, kind of like throwing a carpet over a mannequin, but doing it the same specific way every time so you end up with the same result.
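In practice, every vertex of the model carries a (u, v) coordinate into that flat atlas, so projecting the texture is just a lookup, which is what makes it land the same way every time. A toy sketch of the idea, using a blank stand-in image rather than one of Meshroom's real atlases:

```python
from PIL import Image      # Pillow: pip install Pillow

atlas = Image.new("RGB", (1024, 1024), "grey")   # stand-in for a texture atlas

def sample_atlas(u: float, v: float) -> tuple:
    """Look up the colour for a surface point from its UV coordinates (0..1)."""
    x = int(u * (atlas.width - 1))
    y = int((1 - v) * (atlas.height - 1))        # image origin is top-left, UV origin bottom-left
    return atlas.getpixel((x, y))

# Each vertex in the OBJ stores a UV pair like this, so the "carpet" always lands the same way.
print(sample_atlas(0.25, 0.75))
```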

I was very pleased with the model.

3D Wireframe Model

And the textured look wasn't too bad either.

Textured in Blender

Unfortunately, my final rendered output looked like it needed a bit of attention.

A very cement look and a lot of additional scenery

I was able to animate this and render a result out to video, which I was very pleased with.

But there was more work to be done. The grounds around the statue didn't look very good, although it was a quick step to remove all that additional scenery and put some lights into the scene to polish it up.

Since I had taken these photos in normal daylight without a flash, the idea was to light the model as evenly as possible without introducing any new shadows, which was just a checkbox on the lights inside Blender.
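If you wanted to script that instead of clicking through it, something like the following should do it. This is only a sketch, assuming Blender 2.8x where light data exposes a use_shadow toggle (the checkbox mentioned above); the positions and energy are made up.

```python
import math
import bpy   # Blender's built-in Python API; run this from Blender's scripting tab

# Surround the model with four area lights that cast no shadows,
# to light it as evenly as possible without introducing new ones.
for i in range(4):
    angle = i * math.pi / 2
    bpy.ops.object.light_add(type='AREA',
                             location=(5 * math.cos(angle), 5 * math.sin(angle), 3))
    light = bpy.context.object.data
    light.energy = 500           # made-up brightness, tune to taste
    light.use_shadow = False     # the "no new shadows" checkbox
```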

And this took me to my final output.

This video is a more dynamic pan around, showing that I can now view the statue from any angle I want. While it is still a little too shiny, it represents the statue much better now.

Where to from here

The next step for me is to see whether Meshroom can map an indoor area properly, with better photos. Indoor scenes pose a unique problem: a lot of these algorithms rely on texture and object features to work out how the camera is moving through a scene, and single-colour painted walls don't give them much to match on, so you need interesting rooms if you hope for this to succeed.

I think it is amazing how far this technology has come, and what you can now do with open source projects that are out in the wild for anyone to try. I'm quite keen to see how much of my world I can map.

With mapping technology like Google Maps and Bing Maps, how long until they can apply this to Street View photos and give us a literal like-for-like 3D view of what we would see, one we could walk around in?

How much more immersive would Google Earth be with technology like this behind it?

They are using similar technology already, but if you could source ground-level data, this could essentially turn our whole world virtual.

Imagine deciding to play Grand Theft Auto or Just Cause and choosing your last holiday destination, or your local neighbourhood, as the location. Artists would no longer be constrained by the environment; they could focus on making dynamic stories that adapt to whatever environment you choose to play in.

We're not quite there yet, but how about Tower Defence on Google Maps?

Is Raytracing the future of rendering or the next big fad?

I was surprised that Ray tracing made a massive resurgence at GDC 2018.

Ray tracing has always been the alternative to polygon rasterization (standard 3D rendering) and voxels (similar to polygons, but built from many, many small blocks in 3D space).

Take a look at these phenomenal videos which were released at the Conference. Pay particular attention to shadows, lights and reflections as well as things like the ‘feel’ of the material and the way it interacts with the light.

Proving it was real by using a phone as a camera in 3D space: you can see how the camera moves through the 3D space with the phone's movements, showing that this is being generated in real time.

What I want to accomplish

I'm quite fascinated by the technology and wanted to dig into the history, the alternatives, some of the theory behind it, and my reasoning for why it's seeing a resurgence.

Mowr Video

Here is the video that was shown at GDC recently, demoing a lot of the technology and its use in the DX12 DXR API.

https://www.youtube.com/watch?v=mgyJseJrkx8

History… Haven't we heard about it before?

A very long time ago, before we had 3D accelerators, there was a world of choice for game developers wanting to do something approximating our reality on the PC. The surge of the FPS, and Quake in particular, saw the dawn of 3DFx and eventually OpenGL on consumer hardware. Microsoft pushed into the same space with the initial versions of DirectX and Direct3D.

Quake was one of the first mainstream games I remember doing 'real 3D' using polygon rasterization, but there were other technologies that we lost along the way.

There were voxels, using elementary 3D volume elements to build environments (kind of like Minecraft, but with much smaller building blocks).

Outcast, with its voxel terrain rendering and its recent 3D rasterizer remake, 18 years later!

Ray tracing, by contrast, has mostly been an offline affair, used to get amazing lighting on 3D models in tools like those from Autodesk.

A ray-traced render of a motorcycle engine

But real-time ray tracing has long been the domain of coding demo competitions, showcasing the skills of up-and-coming programmers, and was being done even on the early Amigas, as Jeff Atwood wrote about here.

And here is an amazing 64 KB demo, Exceed – Heaven Seven (Heaven 7).

The technology

Ray tracing builds on the idea of ray casting: shooting rays from a camera into a scene and evaluating what each ray hits in order to determine what it looks like. These rays are affected by the lights shining into the scene as well.

The idea is that the more rays you shoot into a scene, the higher the level of fidelity you achieve.

The increasing detail illustrated by additional rays being cast at a 2D kettle, from a tutorial on programming ray tracing engines.

Since each ray relies on additional recursive rays being cast, you can also speed up the process by limiting the recursion depth, i.e. how many further rays you cast back into the scene from each hit, but limiting it too heavily means losing a lot of the amazing lighting you get from ray tracing.
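The depth limit itself is only a few lines in any ray tracer. Below is a minimal, self-contained Python sketch (not the JavaScript tracer mentioned next, and with no lighting, only mirror reflections) just to show where the depth cut-off sits in the recursion:

```python
from dataclasses import dataclass
import math

MAX_DEPTH = 5  # the "depth" knob discussed above: how many reflection bounces to follow

@dataclass
class Sphere:
    centre: tuple
    radius: float
    colour: tuple          # base RGB in 0..1 (no lights here, just reflections)
    reflectivity: float    # 0 = matte, 1 = perfect mirror

def intersect(origin, direction, sphere):
    """Distance along the ray to the sphere, or None if it misses."""
    oc = [o - c for o, c in zip(origin, sphere.centre)]
    a = sum(d * d for d in direction)
    b = 2 * sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - sphere.radius ** 2
    disc = b * b - 4 * a * c
    if disc < 0:
        return None
    t = (-b - math.sqrt(disc)) / (2 * a)
    return t if t > 1e-4 else None   # small epsilon avoids re-hitting the same surface

def trace(origin, direction, scene, depth=0):
    """Colour seen along a ray; the depth cap limits how many recursive rays are cast."""
    if depth > MAX_DEPTH:
        return (0.0, 0.0, 0.0)       # give up: cheaper, but you lose the later reflections
    hits = [(intersect(origin, direction, s), s) for s in scene]
    hits = [(t, s) for t, s in hits if t is not None]
    if not hits:
        return (0.2, 0.2, 0.3)       # background colour
    t, sphere = min(hits, key=lambda h: h[0])
    point = [o + t * d for o, d in zip(origin, direction)]
    normal = [(p - c) / sphere.radius for p, c in zip(point, sphere.centre)]
    colour = sphere.colour
    if sphere.reflectivity > 0:
        d_dot_n = sum(d * n for d, n in zip(direction, normal))
        reflected = [d - 2 * d_dot_n * n for d, n in zip(direction, normal)]
        bounce = trace(point, reflected, scene, depth + 1)   # the recursive ray
        colour = tuple(c * (1 - sphere.reflectivity) + b * sphere.reflectivity
                       for c, b in zip(colour, bounce))
    return colour

# Two shiny spheres; each bounce between surfaces costs one level of depth.
scene = [Sphere((0.0, 0.0, 5.0), 1.0, (1.0, 0.2, 0.2), 0.8),
         Sphere((2.2, 0.0, 5.0), 1.0, (0.2, 0.2, 1.0), 0.8)]
print(trace((0.0, 0.0, 0.0), (0.1, 0.0, 1.0), scene))
```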

Here is a simple JavaScript ray tracer illustrating the effect of the scene depth setting.

Depth of 1
You can distinctly see the lack of detail from the floor reflected on the balls, and the lack of any detail in the reflection of the balls on the plane below them.

Depth of 2
With two levels of rays you can see the distinct checkerboard pattern more realistically displayed on the balls, as well as the detail on the floor.

Depth of 5
There is a lot more detail on the floor reflections as well as muted colours in the later reflections on the ball.

Depth of 50
Very minimal change in detail. The only difference I can see is that the underside of the big ball, on the floor plane, has become a lot darker, due to more shadow rays being cast from the big ball obscuring the light sources at greater depths.

The resurgence and the future?

Accurate light and shadow representation in a scene has long been one of the tenets of graphical fidelity.

I can remember seeing the real time lighting in Doom 3 and barely being able to believe it was all running in real time on my own PC.

The comparisons were always to animated movies by Pixar, and the question of when we could have that level of detail in our games. When movies like Toy Story were made, frames took hours rather than milliseconds to calculate.

I believe that, if our computers have the power, it could be the next step for our graphics. It's a much simpler way of representing the world, and it mimics our reality and vision far more closely than rasterization does.

There is a concern about building renderers that support current-gen consoles and current PC graphics cards while also supporting the new technology.

Thankfully, in a lot of cases the final renderer seems to be hot-swappable between ray tracing and rasterization, and the Unity and Unreal engines are great at supporting up-and-coming rendering technologies as they emerge.

Futuremark is in on the action too, with the next version of 3DMark having a ray tracing benchmark built in.

DXR Raytracing on and off

Though the Siren demo didn't specifically mention DXR, it also shows how the state of the art is moving forward in so many other areas right now.

We are going to be spoilt for ways to be amazed in the near future.

I personally think there's a bright future for the technology and that it will gain more mainstream acceptance as we keep pushing towards digitally representing our reality.

 

The creation of Lauren

For me, the best part of finishing a project is turning people's tedious problems into non-entities.

I believe that people's jobs are generally encumbered with too much work peripheral to what they should actually be doing and focussing on.

I take great pleasure in making manual things go away, to free up a person to do the more interesting parts of their job.

So this project has a particular place in my heart… and GitHub.

Premise

As part of their job, a friend writes articles and publishes them on a custom CMS, and they have to put each picture into the CMS in multiple formats so that it can display in many different forms.

Some of the formats are for the full page article, the mini-preview, linking previews and sister-site integrations. Each article may need up to 10 photos, and each of these has to be cropped, centred and resized into its respective formats, meaning up to 40 separate images to create; it was taking well over an hour each time they had to do it!

I felt certain I could help them spend their time better.

Research!

Whenever I want to start a project, the first step is always research. In this age of information, when you set out to do something there is almost always someone who has done it already, so I looked at a lot of my favourite imaging tools to see how easily they could be set up to automatically create a bunch of formats each time they were needed.

I'd used XNConvert and IrfanView in the past with great success, but couldn't find a way to automate them for this specific task that would be easy enough and portable enough to set up on any machine.

So the project was a go!

What I set out to accomplish

I had a couple of goals when I set out to do this.

  • Solve the Problem : It must resize into the custom formats very quickly.
  • Don’t get overzealous : It mustn’t take me a lot of time to complete.
  • Keep it simple : It should be tiny and singularly focussed.

I thought of these as mantras during my development.

I decided to write a Windows desktop application. This was an easy choice, as I wouldn't have to host anything for them; they could keep the application locally on their PC and transfer it as they desired.

It was built using C# and Visual Studio by Microsoft.

Solve the Problem

I wanted a very simple UI: drag and drop an image, and get your copies.

There were a couple of iterations of even this, but this is what I came up with in the end.

A simple header (I had to use an 80s-era logo for something in my lifetime).

I also wanted to highlight the chosen name, Lauren, which was a work colleague's request. While I was happy to oblige, I still wanted it to somehow relate to the application, so the chosen backronym was Largely AUtomated REsiziNg tool.

With the two most important questions answered (what does it do, and what do we call it?), it was time to move on.

Don’t get overzealous

A brief breakdown: the top bar is for the different sizing requirements depending on the CMS and the task, so instead of creating twenty different sizes, we only create the ones pertinent to the specific task.

The bottom bar is for quickly setting the maximum size and quality of the picture.

These were the only two concessions I made to expanding requirements.

There is a choose file option, but dragging and dropping into the window is the expected usage.

There is no process button, we just process as you go.

Keep it simple (stupid)

I do actually try to follow this principle when I do anything.

It’s tightly coupled to a favourite saying.

Keeping this project small and singularly focussed got me to the finish line fast with the best ROI (return on investment) for the work, which was done in my spare time to help a friend. The ROI was seeing a happy friend with a tedious task removed from their life.

Fun stuff?!?!

My original approach was to naively resize everything regardless of the aspect ratio changing.

Surprisingly to me, this resulted in a really terrible output.

I quickly changed to a resize-and-crop method, which still didn't give the best result.

The image above illustrates the centred focal point in each photo, which is where the resize and crop were happening.

I then changed this to allow the user to select a focal point if they wished, defaulting it to the centre. This avoided having to go back to an external tool.

With the focal point selected you get much higher quality output.

Compared to the plain resize and crop, photos with similar aspect ratios come out much the same, but the long upright photos get a much better result.

A couple of things I learnt (sketched in code after the list):

  • Resize to get one axis exactly right.
  • Then crop the other axis to get the final size.
  • Maintain image quality at a certain level.
  • Sometimes you may have to zoom in (scale up) to get the first axis right.
  • Some CMSs have file size limits as well.
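As a sketch of that resize-then-crop approach around a focal point (written in Python with Pillow rather than Lauren's actual C# code, and with made-up target sizes):

```python
from pathlib import Path
from PIL import Image  # Pillow: pip install Pillow

# Hypothetical target formats (width, height), standing in for the CMS requirements.
FORMATS = [(1200, 675), (600, 600), (300, 450)]

def resize_and_crop(src: Path, target_w: int, target_h: int,
                    focal: tuple[float, float] = (0.5, 0.5)) -> Image.Image:
    """Scale so one axis fits exactly, then crop the other around the focal point."""
    img = Image.open(src)
    scale = max(target_w / img.width, target_h / img.height)  # cover, never letterbox
    img = img.resize((round(img.width * scale), round(img.height * scale)))
    # Centre the crop window on the focal point, clamped so it stays inside the image.
    left = min(max(int(img.width * focal[0] - target_w / 2), 0), img.width - target_w)
    top = min(max(int(img.height * focal[1] - target_h / 2), 0), img.height - target_h)
    return img.crop((left, top, left + target_w, top + target_h))

src = Path("article-photo.jpg")
for w, h in FORMATS:
    out = resize_and_crop(src, w, h, focal=(0.5, 0.35))
    # Saved next to the original with the output size appended to the filename.
    out.save(src.with_name(f"{src.stem}_{w}x{h}{src.suffix}"), quality=85)
```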

This resulted in a very good image which my friend was happy with.

To get the files into the user's hands with minimal fuss, I create the new files in the same directory as the original, with the output size appended to the filename.

Finished product

I put the single exe in a folder on Google Drive and shared the link with my friend; after a couple of firewall false starts, I did get it to them.

Once they had the application, we went through a couple iterations.

Initially I created a folder named after the file and stored the output in there, but they found it easier to have everything in the same folder.

After that it was mostly about the sizes required, so I added XML configuration so they could control their own destiny.

And finally it was about the different tasks, so I added 4 xml profiles.

They have been very happy with the program and have even shared it with colleagues to help them do their jobs quicker and more efficiently.

Where to from here

Sometimes a project should be short-lived, but there are a couple of things I would add if I worked on it in the future.

  • The resizing and crop algorithm could be improved, but it’s Good Enough ™ right now.
  • The UI could give more feedback and be multi-threaded.
  • The code should follow best practices and have tests built in.
  • The profiles should be configurable in-app, and you shouldn't be limited in their number.
  • Context menu integration in Windows.
  • It should be multi platform.

I’m not sure if I’ll ever do any of the above, but I am still very happy to have invested the time I have so far into this tool.

Have a look at the source on GitHub if you want, and give me a shout if you'd like to talk about it 🙂

https://github.com/tonym128/lauren