The art of the illusion with photogrammetry

The Problem

This was one of the most amazing finds of the last year for me. As someone with no real classical artistic talent, I find the idea of taking real world assets and getting them into computers fascinating.

There are many different ways and processes for doing this. There is a way of animating called rotoscoping, in which you take a stream of pictures and trace the movement to get a lifelike feel for the animation.

Rotoscoping

There is cel shading, which is the idea of taking an image and transforming it into something that looks more two-dimensional and flat.

Toon-shader

Of all of them, the most magical to me has always been photogrammetry.

Photogrammetry is the science of making measurements from photographs, especially for recovering the exact positions of surface points. Photogrammetry is as old as modern photography, dating to the mid-19th century. In the simplest example, the distance between two points that lie on a plane parallel to the photographic image plane can be determined by measuring their distance on the image, if the scale (s) of the image is known.

https://en.wikipedia.org/wiki/Photogrammetry
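
To make the scale part of that definition concrete, here is a tiny Python snippet with completely made-up numbers:

```python
# Toy example of the scale relationship in the definition above.
# If the image scale s (image distance / real distance) is known, a real-world
# distance can be recovered from a measurement taken on the photograph.
# All numbers here are invented purely for illustration.
s = 0.002                 # 1 m in the scene maps to 2 mm on the image
image_distance_mm = 14.0  # distance measured between two points on the photo

real_distance_m = (image_distance_mm / 1000.0) / s
print(f"Real-world distance: {real_distance_m:.1f} m")  # -> 7.0 m
```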

In 2008 Microsoft released a beta service called PhotoSynth. I recall the service being deemed the next big thing in photography, and it was truly magical: given enough ordinary two-dimensional camera photographs, it would calculate a 3D point cloud which could be used to model the images in 3D space, much like hundreds of Minecraft blocks. There was an idea that, one day, you would be able to take your one photo and, with the massive resources of PhotoSynth behind you, augment your photos and have them turn into 3D models.

I took my ~200 photos of the Diana and the Fawn statue in Cannizaro Park and turned them into the most amazing 3D model, which I could spin around and view from absolutely any angle. It was like magic.

PhotoSynth

Unfortunately this was not to last, and the service was degraded over time into a (very good) photo-stitching service.

The problem was that, with this service gone, there was no easy way for me to take my beautiful flat 2D images and turn them into a fantastical, amazing model, and so they have sat dormant, sad and unanimated since 2010.

Diana and the Fawn

The Solution

Fast forward to a week ago, when I found a wonderful program called Meshroom by AliceVision.

Meshroom

This program promised the same goodness as Microsoft’s PhotoSynth, but run entirely locally using the power of your single computer.

I was able to bring my images back to life using Meshroom and Blender.

Blender

And following an amazing video by CG Geek.

Loading my ~200 images into Meshroom with the default settings and hitting go meant a lot of waiting, but produced a pretty spectacular result.

A small subsection of the images
Loaded into Meshroom

And two hours later I had a 3D model, with textures mapped onto it, all created from my 2D images.

There are a couple of things that are important when using this technology:

  • Good, high-quality images without blur
  • A camera which is known by Meshroom, or a willingness to add yours
  • About 60% overlap between photos to allow the algorithm to work
  • Willingness to learn a bit of Blender (which was pretty complicated for me)
  • Patience

The most rewarding part of this project for me was being able to bring back to life the photos from a previous project of mine. They were uploaded to Google Photos at some point and have sat there for 8 years. They are stored using the free tier, which does introduce some additional compression, but they seem to have weathered time well enough.

Once I imported the OBJ model from Meshroom into Blender, I was greeted with a lot of complicated screens and information.

The Blender interface

There was a 3D model, which included some of the surroundings, and four texture atlases for the scene. A texture atlas is essentially a flat texture with information on how to project it onto your model, kind of like throwing a carpet over a mannequin but doing it in exactly the same way every time to end up with the same result.

I was very pleased with the model.

3D Wireframe Model

And the textured look wasn’t too bad either.

Textured in Blender

Unfortunately, my final rendered output looked like it was in need of a bit of attention.

A very cement look and a lot of additional scenery

I was able to animate this and get a result out into video, which I was very pleased with.

But there was more work to be done. The grounds around the statue didn’t look very good, but it was a quick step to remove all that additional scenery and add some lights to the scene to polish it up.

Since I had taken these photos in normal daylight without a flash, the idea was to light the model up as evenly as possible but not introduce any new shadows, which was just a checkbox for the lights inside Blender.

And this took me to my final output.

This video is a more dynamic pan around to show that I can now view any angle I want, and while it is a little too shiny still, it does represent the statue much better now.

Where to from here?

The next step for me is to see if Meshroom can map an indoor area properly with better photos. Indoor scenes pose a unique problem: a lot of these algorithms rely on the texture of an object to work out how the camera is moving through a scene, and plain single-colour painted walls do not reconstruct well, so you need interesting rooms if you hope for this to succeed.

I think it is amazing how far this technology has come, and what you can now do with Open Source projects which are out in the wild for anyone to try. I’m quite keen to see how much of my world I can map.

With mapping technology like Google Maps and Bing Maps, how long is it until they can apply this to our Street View photos and give us a literal like-for-like 3D view of what we would see, one we could walk around in?

How much more immersive would Google Earth be with technology like this behind it?

They are using similar technology already, but if you could source ground-level data, this could essentially turn our whole world virtual.

Imagine deciding to play Grand Theft Auto or Just Cause and choosing your last holiday destination, or your local neighbourhood, as the location. The artists would no longer be constrained by the environment; they could focus on making dynamic stories which adapt to the environment you choose to play in.

We’re not quite there yet, but how about Tower Defence on Google Maps?

Is Raytracing the future of rendering or the next big fad?

I was surprised that Ray tracing made a massive resurgence at GDC 2018.

Ray tracing has always been the alternative to polygon rasterization (standard 3D rendering) and voxels (similar to polygons, but using many, many dots in 3D space).

Take a look at these phenomenal videos which were released at the Conference. Pay particular attention to shadows, lights and reflections as well as things like the ‘feel’ of the material and the way it interacts with the light.

Proving it was real using a phone as a camera in 3D space. You can see how the camera moves in the 3D space with the phone movements, showing that this is being generated in real time.

What I want to accomplish

I’m quite fascinated by the technology and wanted to dig into the history and the alternatives, some of the theory behind it, and why I think it’s seeing a resurgence.

Mowr Video

Here is the video shown at GDC recently, demoing a lot of the technology and its use in the DX12 DXR API.

https://www.youtube.com/watch?v=mgyJseJrkx8

History… Haven’t we heard about it before?

A very long time ago, before we had 3D accelerators, there was a world of choice for game developers wanting to do something approximating our reality on the PC. The surge of the FPS and Quake saw the dawn of 3dfx and eventually OpenGL. Microsoft made a push into the same space with the initial versions of DirectX and Direct3D.

Quake was one of the first mainstream games I remember doing ‘real 3D’ using polygon rasterizers, but there were other technologies which we lost along the way.

There were voxels, using elementary 3D particles to build environments (kind of like Minecraft with smaller building blocks).

Outcast, with its voxel terrain rendering and its recent 3D rasterizer remake, 18 years later!

Ray tracing has been something more of an offline affair, used for getting amazing lighting on 3D models, as you can see from Autodesk.

A ray-traced motorcycle engine render

For real-time ray tracing, though, it has been in the domain of coding demo competitions for a long time, showcasing the skills of up-and-coming programmers and being done even on the early Amigas, as Jeff Atwood spoke about here.

And here is an amazing 64 kB demo from 2000 by Exceed – Heaven Seven (Heaven 7).

The technology

Ray tracing uses the idea of ray casting, which involves shooting rays from a camera into a scene and evaluating what each ray hits in order to determine what it looks like. These rays are affected by the lights cast into the scene as well.

The idea is that the more rays you shoot into a scene, the higher the level of fidelity you achieve.

The increasing detail illustrated by additional rays being cast at a 2D kettle, from a tutorial on programming ray tracing engines.

Since the rays rely on additional recursive rays being cast, you can also speed up the process by limiting how many additional rays are cast back into the scene from each object a ray hits, i.e. the depth, but doing this too aggressively will result in losing a lot of the amazing lighting you get from ray tracing.
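
To make the depth limit concrete, here is a small Python sketch of the idea. It is not a real renderer, just a single reflective sphere, one directional light and an ASCII ‘image’, but the MAX_DEPTH check in trace() is exactly the cut-off described above:

```python
# Toy depth-limited recursive ray tracer: one reflective sphere, one light,
# rendered to ASCII. MAX_DEPTH caps how many reflection bounces are followed.
import math

MAX_DEPTH = 5                      # raise for more reflection detail, at more cost
SPHERE_CENTRE = (0.0, 0.0, -3.0)
SPHERE_RADIUS = 1.0
REFLECTIVITY = 0.5
BACKGROUND = 0.2                   # shade returned when a ray escapes the scene

def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def add(a, b): return tuple(x + y for x, y in zip(a, b))
def mul(a, s): return tuple(x * s for x in a)
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def norm(a): return mul(a, 1.0 / math.sqrt(dot(a, a)))

LIGHT_DIR = norm((1.0, 1.0, 1.0))  # direction pointing towards the light

def hit_sphere(origin, direction):
    """Distance along the ray to the sphere, or None if it misses."""
    oc = sub(origin, SPHERE_CENTRE)
    b = 2.0 * dot(oc, direction)
    c = dot(oc, oc) - SPHERE_RADIUS ** 2
    disc = b * b - 4.0 * c
    if disc < 0.0:
        return None
    t = (-b - math.sqrt(disc)) / 2.0
    return t if t > 1e-4 else None

def trace(origin, direction, depth):
    """Shoot one ray; recurse for the reflection until the depth limit is hit."""
    if depth > MAX_DEPTH:
        return BACKGROUND                       # the depth cut-off discussed above
    t = hit_sphere(origin, direction)
    if t is None:
        return BACKGROUND                       # ray escaped the scene
    hit = add(origin, mul(direction, t))
    normal = norm(sub(hit, SPHERE_CENTRE))
    diffuse = max(dot(normal, LIGHT_DIR), 0.0)  # simple diffuse lighting term
    # reflect the ray about the surface normal and keep tracing recursively
    reflected = sub(direction, mul(normal, 2.0 * dot(direction, normal)))
    bounce = trace(hit, norm(reflected), depth + 1)
    return (1.0 - REFLECTIVITY) * diffuse + REFLECTIVITY * bounce

# one ray per "pixel" of a tiny 16x16 image, looking down the -z axis
for y in range(16):
    row = ""
    for x in range(16):
        direction = norm(((x - 7.5) / 8.0, (7.5 - y) / 8.0, -1.0))
        shade = trace((0.0, 0.0, 0.0), direction, depth=0)
        row += " .:-=+*#@"[min(int(shade * 8.0), 8)]
    print(row)
```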

Here is a simple JavaScript ray tracer illustrating the effect of ray depth on the scene.

Depth of 1
You can distinctly see the lack of detail of the floor reflected on the balls, and the lack of any detail in the reflection of the ball on the plane below it.

Depth of 2
With two levels of rays you can see the distinct checkerboard pattern more realistically displayed on the balls, as well as the detail on the floor.

Depth of 5
There is a lot more detail on the floor reflections as well as muted colours in the later reflections on the ball.

Depth of 50
Very minimal change in detail. The only difference I can see is that the underside of the big ball, on the floor plane, has become a lot darker due to more dark shadow rays being cast into the scene from the big ball obscuring the light sources at later depths.

The resurgence and the future?

Accurate light and shadow representation in a scene has long been one of the tenets of graphical fidelity.

I can remember seeing the real time lighting in Doom 3 and barely being able to believe it was all running in real time on my own PC.

The comparisons made were always to animated movies by Pixar, and to when we could have that level of detail in our games. When movies like Toy Story were made, they were taking hours rather than milliseconds to calculate frames.

I believe, if our computers have the power, it could be the next step for our graphics. It’s a much simpler way of representing the world and more closely mimics our reality and vision than rasterizers do.

There is a concern about building renderers which support current-gen consoles and our current PC graphics cards while also supporting the new technology.

Thankfully, in a lot of instances, the final renderer seems to be hot-swappable between ray tracing and rasterization. The Unity and Unreal engines are also great at supporting up-and-coming rendering technologies as they emerge.

Futuremark is in on the action too, with the next version of 3DMark having a ray tracing benchmark built in.

DXR Raytracing on and off

Though the Siren demo didn’t specifically mention DXR, it also shows how the state of the art is moving forwards in so many other areas right now.

We are going to be spoilt for ways to be amazed in the near future.

I personally think there’s a bright future for the technology and that it will get more mainstream acceptance as we push ever closer towards digitally representing our reality.

 

FaceOff without the surgery… DeepFace and FaceSwap

What is FaceSwap?

FaceSwap is a technology for taking one face and swapping it with another in an image or video. What makes this news? All the tools are available for anyone to use now!

It’s an interesting topic and it’s pretty current. I found the images of Nic Cage everywhere absolutely amazing, especially the Indiana Jones video; that was fantastic. Take a look at the video below!

I love the technology and the way it’s so accessible. Unfortunately, that comes with a bad side too: people are using this to put people into situations they were never in.

What I’m trying to accomplish and what I want to share

So with the bad people doing bad things, what am I trying to do?

I want to see how easy it is for a person like me to grab the required tools and, with very little understanding of the underlying technologies and how they work, see what I can accomplish at what feels to me like the cutting edge of ML (machine learning).

Getting setup

The tools

OpenCV logo

The toolset runs on Python and uses OpenCV and TensorFlow on a GPU, which is pretty standard for this area from what I understand.

So I grabbed the source code, Python, OpenCV and TensorFlow, and then proceeded to have ultimate pain on my Windows box.

After fighting the requirements and massive Nvidia downloads of many different drivers, I actually upgraded the requirements of the DeepFace source code to a newer version of TensorFlow, which then entailed using newer versions of CUDA and cuDNN, and then the magic happened.

Faces!

Some of the libraries used are fascinating. To get the images set up the way you need, the authors understand that in all likelihood you’re not going to have a homogeneous image set of faces ready to use, so they provide a first step which takes all the faces out of the photos you have, even doing some simple rotations to align them. I will definitely be using this feature in future.
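
Just to picture what that extraction step is doing, here is a rough sketch of the concept using OpenCV’s bundled Haar cascade. This is not the actual FaceSwap extractor (the real tools use much better detectors and align on facial landmarks), and the folder names are placeholders:

```python
# Rough approximation of the "extract" step: find faces in each photo,
# crop them and resize them to a uniform size. The real tooling also rotates
# the crops to align the facial landmarks; this sketch skips that part.
import glob
import os
import cv2

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

os.makedirs("faces", exist_ok=True)
for n, path in enumerate(glob.glob("photos/*.jpg")):           # placeholder input folder
    image = cv2.imread(path)
    grey = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    for i, (x, y, w, h) in enumerate(detector.detectMultiScale(grey, 1.3, 5)):
        face = cv2.resize(image[y:y + h, x:x + w], (256, 256))  # uniform training size
        cv2.imwrite(f"faces/{n:04d}_{i}.png", face)
```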

Once I had my two face sets from a search by name on my Google Photos, I pulled the faces out and cleaned the output directories of the additional people who appeared in the photos.

This took about an hour of manual labour, but there is now a very freaky collection of 300+ photos each of me and my wife in their own folders. (The actual recommendation is about 3,000 photos, so I’m well under; this might bite me later.)

I am imagining doing this process for multiple friends one day so I can put them in places they’ve never been and see if I can freak them out. That is probably a LOT of effort for a very minimal payoff.

Do the face magic

Now that we have training data, we need to train a model that will encode the logic for transforming one face into another. My understanding is that the libraries contain logic for mapping the features of a face, and with enough data the mapping from the one face to the other becomes seamless. This involves a lot of processing, and trial and error, to build a model that can do this in a repeatable fashion over and over.
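
My (possibly oversimplified) understanding is that these face-swap models use one shared encoder with two decoders, one per face. Here is a heavily cut-down Keras sketch of that shape; the layer sizes are arbitrary and this is not the actual DeepFace/FaceSwap architecture:

```python
# Heavily simplified sketch of the shared-encoder / two-decoder idea behind
# face swapping. Layer sizes are arbitrary; the real projects are much bigger.
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_encoder():
    inp = layers.Input(shape=(64, 64, 3))
    x = layers.Conv2D(64, 5, strides=2, padding="same", activation="relu")(inp)
    x = layers.Conv2D(128, 5, strides=2, padding="same", activation="relu")(x)
    x = layers.Flatten()(x)
    x = layers.Dense(512, activation="relu")(x)
    return Model(inp, x, name="shared_encoder")

def build_decoder(name):
    inp = layers.Input(shape=(512,))
    x = layers.Dense(16 * 16 * 128, activation="relu")(inp)
    x = layers.Reshape((16, 16, 128))(x)
    x = layers.Conv2DTranspose(64, 5, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(3, 5, strides=2, padding="same", activation="sigmoid")(x)
    return Model(inp, x, name=name)

encoder = build_encoder()
decoder_a = build_decoder("decoder_face_a")   # learns to rebuild face A
decoder_b = build_decoder("decoder_face_b")   # learns to rebuild face B

# Each autoencoder trains only on its own face set, but they share the encoder.
img = layers.Input(shape=(64, 64, 3))
autoencoder_a = Model(img, decoder_a(encoder(img)))
autoencoder_b = Model(img, decoder_b(encoder(img)))
autoencoder_a.compile(optimizer="adam", loss="mae")
autoencoder_b.compile(optimizer="adam", loss="mae")
# autoencoder_a.fit(faces_a, faces_a, ...); autoencoder_b.fit(faces_b, faces_b, ...)
# Swapping = encode a frame of face A, then decode it with decoder_b.
```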

The training step runs for a very long time, and the longer you run it the better the result you get. Thankfully this isn’t a one-off process, and you can pick up your training where you left off to make the resulting images better quite easily.

Working Hard

Make me images

The process of making images is quite straightforward: you provide a source image, it gets taken along with the model, the face-mapping magic happens on your GPU, and you get a new image output with the faces changed.

After a few minutes of processing, my loss went down to 0.055 and my images looked like a child had cut a copy of the one face out and stuck it inside the other!

You can see above how the alignment is actually amazing, but unfortunately without the face shapes being similar that effect is not believable at all. Even more so when we move to video.

Make me videos

Interestingly, the process of creating a video is the same as with images; it’s just batch processing of the individual frames, which you then recombine back into a video.
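
For illustration, here is roughly what that frame-by-frame round trip looks like using OpenCV. The file names are placeholders, and the face conversion itself would happen where the comment marks it:

```python
# Sketch of the frame-by-frame video round trip: split a clip into frames,
# run the conversion on each frame, then stitch them back into a video.
import cv2

reader = cv2.VideoCapture("input.mp4")          # placeholder input clip
fps = reader.get(cv2.CAP_PROP_FPS)
frames = []
while True:
    ok, frame = reader.read()
    if not ok:
        break
    # ... this is where each frame would be passed through the face converter ...
    frames.append(frame)
reader.release()

height, width = frames[0].shape[:2]
writer = cv2.VideoWriter("output.mp4",
                         cv2.VideoWriter_fourcc(*"mp4v"),
                         fps, (width, height))
for frame in frames:
    writer.write(frame)
writer.release()
```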

Where to from here?

Streamline

The technology is amazing and the work that has been put in is awesome, but there is a lot of manual tinkering to get to where you want to be, so automating this would be a big win.

I had a few ideas.

Face Classification from Source Images

This whole process could be automated: you take your whole image set and let DeepFace automate the pulling out of faces (which it already does), but then it takes those faces and matches them against each other to work out which belong to the same people (much like Google does using TensorFlow on Google Photos).

Automated model generation

Once we have our people classified, we could have a process to take all the images fed in and do all the combinations of models for those people.

Only replace the person matching the model

Since we can identify individuals now, you could replace only the face of the person who matches the model. In my experience it was a bit unfortunate that everyone in the frame gets treated to the model fitting.

Video Processing

Being able to process videos directly could be a boon for the software.

Audio

And finally, since we’re doing video now, we should extend this to do the audio as well. Imagine being able to change a person entirely in a video, including the way their voice sounds, on your own home PC!

Adobe and Lyra both have products in testing to do just this, with one minute of audio!

The future is bright

And fake! Or whatever you want it to be, depending on how you look at it.

A miner in a new world

I’m a self-confessed techno nerd. I find almost everything to do with computers fascinating, from reading every technology news source on the web on a daily basis to Kickstarting insane technologies, so it’s actually a surprise to me that I never had the urge to learn about the ‘seedy underworld’ of crypto.

Even now I still find it amazing how far the phenomenon has come and what lengths people have gone to to be involved in it.

With so much critical mass and not wanting to feel left out anymore, I started a journey to find out more about what crypto is and how it works.

In all honesty, it’s after the fact, and the bubble has actually started to pop after the last six months’ heyday of ever-rising exchange rates, but I was finally ready to read up, learn some more about what’s going on, and dig my hands in a bit further than I had before.

I decided that more than for financial gain, I wanted to get in on the technology and learn start to finish what it takes to be involved.

Mining

What is it?

At a simplistic, off-the-top-of-my-head level, mining means making your computer work hard to validate something. This work, when done by thousands of PCs, can’t be faked by any one corporation, and the PoW (Proof of Work) that you do builds upon the blocks of work previously done (the blockchain), allowing a decentralised network to validate a transaction on your behalf without actually involving a third party between the two parties transacting.
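
A toy example shows why this work is hard to produce but trivial for the rest of the network to check. This is not how any particular coin hashes its blocks, just the general shape of Proof of Work:

```python
# Toy Proof of Work: find a nonce so the hash of (block data + nonce) starts
# with a number of zeros. Raising the difficulty makes finding the nonce
# exponentially harder, but verifying it always takes a single hash.
import hashlib

def mine(block_data: str, difficulty: int = 4) -> int:
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce
        nonce += 1

nonce = mine("previous-block-hash + this block's transactions")
print("Found nonce:", nonce)
# Anyone can verify instantly: re-hash the data with this nonce and check the zeros.
```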

What should I mine ?

Actually mining the popular cryptos takes a lot of patience, or money for a rig, so my journey started with finding a viable currency to mine.

This changes on a practically daily basis. WhatToMine is a great website for finding out what’s going on in the market and what you can potentially do with your gaming rig. After working out that my little gaming rig’s 1070 wasn’t going to bring home the $$$’s, I started looking for something that I could feasibly mine and make some coins off of.

A quick investigation into the algorithms shows that some are now FPGA- and ASIC-resistant. Bitcoin’s is not, so if you’re not mining it with ASICs you shouldn’t waste your time.

FPGA – Field-programmable gate array, a device that you can program to run specific logic very efficiently and fast, much faster than a normal general-purpose processor could.

ASIC – Application-specific integrated circuit, a circuit developed to run a specific program in hardware; VERY fast at doing a very specific task.

Another great website is CoinMarketCap where you can see all the top currencies.

For my journey I found Monero, an even more privacy-orientated coin, which seems quite interesting. I also really liked the look of TRON and Ripple, but unfortunately they were too hard to mine and already felt a bit too mainstream for my interests. ZCash also cropped up a few times as really good for Nvidia cards.

Thankfully there are a couple of offshoot projects like AEON and SumoKoin.

I chose SumoKoin for my learning experience.

How to mine

There is a lot of software out there for mining, from the simple miners (which are normally CPU-only) to what feel like dodgy miners (unverified). A couple pop out quite early on, though which ones support your GPU and the crypto you want to mine will determine which you want to use.

  • CCMiner – Good comprehensive miner, not always the fastest though.
  • Claymore – Recognised as a great AMD Mining software stack, supports Nvidia too.
  • XMR-Stak – A great Monero miner for AMD and Nvidia.
  • MinerGate – Really easy to use, while it was easy to get up and running, I didn’t feel like I had control over anything, which this whole new world is meant to provide you with.

I chose XMR-Stak for its speed and ease of use; the SumoKoin website actually detailed its setup as well.

What else do I need?

For any of the cryptos you want to get involved in, you need to have a wallet for storing and transferring funds.

Wallet

Normally you would generate these yourself, as there is no fully trusted third party to do this. To do so, you need to download the wallet software as well as the blockchain (which can be tens of gigabytes big).

If you’re willing to cede control you can have a third party do this on your behalf. It’s safest to do this on your own, or use a very reputable exchange.

Exchange

At some point you’re going to want to turn your coin into something else, so you’ll need an exchange which can handle converting one token into another. They normally do this by providing you with wallets on the exchange in the different cryptos you want to use. Some exchanges will even allow you to transfer back into fiat, letting you spend your hard-earned $$$’s.

SumoKoin is currently only listed on Cryptopia, a well-regarded exchange from my travels online. LiveCoin is the one I used to successfully make my conversions between cryptos.

Pool

Even with the new currencies, the amount of processing power being thrown at them is staggering; finding the key to the next block in the chain is like searching for a needle in a haystack. You could possibly get lucky on your own, or you could pool your processing together with others (for a small fee) and, collectively as a group, split the bounty you get for successfully finding a key.
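
The payout maths for a pool is straightforward: your share of the block reward is proportional to the work you contributed, minus the pool’s fee. All the numbers below are invented for illustration:

```python
# Back-of-the-envelope pool payout with made-up numbers.
block_reward = 100.0      # coins the pool earns for finding a block
pool_fee = 0.01           # 1% fee kept by the pool
my_shares = 250           # shares of work I submitted towards this block
total_shares = 1_000_000  # shares submitted by the whole pool

my_payout = block_reward * (my_shares / total_shares) * (1.0 - pool_fee)
print(f"My cut: {my_payout:.5f} coins")  # -> 0.02475 coins
```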

I chose to go with the Official SumoKoin pool

Other ones I found included:

  • Supernova.cc – Main coins, but not Sumo.
  • FlyPool – Seemed quite nice; I have some unconfirmed balances there, no transfers.

Let’s get mining

Once you have a wallet, you set up your mining software to mine to the pool, either to your account at the pool or to your wallet directly.

With your software configured, off you go.

Some time later

For mining SumoKoin, I determined my end goal was to mine enough to make a profit. Apparently I was going to mine revenue of about $4/day, so from a profit expectation I should have come out with roughly $3.40 per day.
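
The gap between revenue and profit is just the electricity. The power draw and tariff in the sums below are my own rough assumptions, included only to show where that roughly $0.60 a day goes:

```python
# Rough profitability sums. The power draw and electricity price are assumed
# figures for illustration; only the ~$4/day revenue estimate comes from above.
revenue_per_day = 4.00      # estimated mining revenue, USD/day
power_draw_kw = 0.20        # assumed average draw of the rig while mining
electricity_price = 0.125   # assumed USD per kWh

power_cost = power_draw_kw * 24.0 * electricity_price   # ~0.60 USD/day
profit_per_day = revenue_per_day - power_cost
print(f"Electricity: ${power_cost:.2f}/day, profit: ${profit_per_day:.2f}/day")
```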

I decided on 5 SumoKoins for my experiment and this took about a week to mine with my hardware and the current difficulty on the coin.

Thus the task of retrieving cash from my hard-earned coins started.

Get the money in your physical wallet

Depending on the pool you’re mining to, you could have a different flow.

For the main SumoKoin pool, I was mining directly to my wallet, and whenever I hit a threshold of 0.5 SumoKoin, they would transfer the coins from my unconfirmed balance on the pool to the wallet I had set up to mine to.

For another pool, you would mine to your account and then have to initiate a transfer at a later date, which is also subject to a threshold they hold you to.

Now that you have control over your money, it’s time to move it to an exchange where you can get it into the crypto / currency you need.

In my case I transferred it:

  • From my wallet
  • To an exchange which supported it (very few do for some altcoins, sometimes only one)
  • Exchanged the Sumo into Ethereum
  • Moved it to an exchange in my country which supported Ethereum and my local currency
  • Moved it onto my bank card

The whole process took about a day all in all with the moving around, mostly waiting for confirmations.

In the case of the cryptocurrencies there is normally a waiting time to confirm that you didn’t try to double-spend the coins or fake the transfer, and in the case of banks, they just take a day or two.

At the end of that, I made half of what the coins were worth when I started. Most of the loss was actually the withdrawal fee, which was fixed but hurt quite a bit at the low volume I was moving around.

Where to from here?

I honestly thought I knew more about the process when I started, and I learnt quite a lot. Setting up the wallets and the mining clients, as well as keeping it all running smoothly, was very interesting. The process of moving the coins around was actually quite complicated to work out, but once I had it down I was very happy with it.

The other things I’m interested in are the idea of a mining rig with proper operating system support, the actual running of a cryptocurrency, the pools, and the exchanges. I still think there’s a lot for me to learn here and I look forward to exploring it more.