There are two major trends in photography today. The first is the ever-increasing number of photos being made. I can’t even be bothered to look up how many billions of photos are being uploaded to Instickrbook every minute or every day or every year. It’s a lot. This is usually talked about in terms of how many photos there are, and how we are drowning in them.
This isn’t quite right. Unless you’re an Instagram addict, most of the pictures you see in a day are still mass media: ads, news, entertainment. You see maybe a few hundred of those billions of snapshots. A picture of a movie star might be seen by a billion people, whereas your picture of your cat, or even of that pretty model, might only be seen by ten. The billions don’t affect us much on the consumption side.
The billions are a reflection of how many people are now picture-makers. 50 years ago, maybe 1-2 percent of the world’s population were picture-makers. Now, it’s more like 20-80 percent and rising. Is it reasonable now to say that basically every person on earth has had their picture taken? Has almost every object, every car, every hill, every waterfall, every beach been photographed? If not yet, soon.
It’s not really so many photographs! It’s so many photographers!
Hold that thought while we take a look at the second trend: AI, neural networks.
All the top-end phone cameras use neural networks for something, be it Apple’s Portrait Mode, or Google’s low-light photography. We’re changing from a world in which you pull pixels off the sensor and manipulate them in-place, leaving them fixed in their rigid rectangular grid except maybe for a little bit of cloning (usually). In the new world we will dump one or more grids of pixels (frames, exposures) into a neural network, and what pops out is something new.
Perhaps soon we’ll see sensors designed to be handled this way.
Perhaps sensors will have some big photosites for high sensitivity, and some small (noisy) ones for detail. Perhaps the Bayer array will be replaced by something else. Perhaps it would be reasonable to skip demosaicing and let the AI sort that out. Take a couple of exposures with this crazy mass of different-sized pixels, some noisy, some not, some overexposed, some underexposed, some colored, some not. Throw the whole mess into the AI that’s been trained to turn these into pictures. At no point before the AI starts working is there anything that even remotely looks like a picture. It’s twice as raw as RAW. Maybe.
But whether sensors go this way or not, AI/neural networks are here to stay.
As a simple experiment, I used one of the online AI photo tools to uprez a photograph of some fruit, which I had first downrezzed to 100×80 pixels. The tool gave me back a 400×320 pixel picture which it created.
It looks a bit soft, but it’s not bad, and it’s much much better than the 100 pixel mess I gave it to work with. Here is the original, downrezzed to a matching 400×320 picture:
We can see that the AI was able to re-paint convincing-looking edges on the fruit, which were formerly jaggies from the downrezzing. The AI did not put the detail in the surface of the fruit back, obviously. How could it? While this is a pretty simple system, it is essentially painting a new picture based on the input (jagged) picture, and its “knowledge” of how the real world looks. Most of the little defects and splotches on the fruit are pretty much gone in the reconstructed one.
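To make the information loss concrete, here is a minimal sketch in Python. It is not the online tool I used, and it is not a neural network: it assumes a made-up 8×8 grayscale "fruit" with a single dark blemish, shrinks it by averaging blocks, and blows it back up with plain nearest-neighbour repetition. The function names and the pixel values are all hypothetical, chosen only for illustration. The point is that once the blemish has been averaged away, nothing in the small picture says where it was, so any uprezzer, however clever, has to invent whatever it puts there.

```python
# Sketch only: block-average downsampling followed by nearest-neighbour
# upsampling. Assumes image height and width are multiples of `factor`.

def downsample(img, factor):
    """Average each factor x factor block into a single pixel."""
    out = []
    for y in range(0, len(img), factor):
        row = []
        for x in range(0, len(img[0]), factor):
            block = [img[yy][xx]
                     for yy in range(y, y + factor)
                     for xx in range(x, x + factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

def upsample(img, factor):
    """Repeat every pixel `factor` times horizontally and vertically."""
    out = []
    for row in img:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

def mean_abs_error(a, b):
    """Average absolute per-pixel difference between two same-sized images."""
    diffs = [abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)]
    return sum(diffs) / len(diffs)

# Hypothetical 8x8 "fruit": a smooth left-to-right gradient...
original = [[float(10 * x) for x in range(8)] for _ in range(8)]
# ...with one dark single-pixel blemish on its surface.
original[2][5] = 0.0

small = downsample(original, 4)   # 2x2: the blemish is averaged away
restored = upsample(small, 4)     # 8x8 again, but the detail is gone

print("blemish pixel, original:", original[2][5])
print("blemish pixel, restored:", restored[2][5])
print("mean absolute error:", mean_abs_error(original, restored))
```

A neural uprezzer would replace `upsample` with something that paints far more plausible edges, but it starts from the same `small` grid, and the blemish is no more present in that grid for the AI than it is for this loop.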
So, here’s the key point: The uprezzed picture looks pretty real, but it’s not. It’s not what the fruit looked like.
There are really two areas where AIs can do things we don’t want, and they’re really simple. One is “what if they work wrong?” and the other is “what if they work right?”
Wrong is easy: everything gets rendered as a huge pile of eyes or kittens or whatever. It’s amusing, maybe, but just wrong. What if the AI takes in a blurry picture of a toy gun, and repaints it as a real gun based on what it “knows” about guns? It’s funny, or irrelevant, right up to the moment the picture turns up in a courtroom.
What about when it’s working right? What happens when your phone’s camera insists on making women’s eyes a little bigger and their lips a little fuller and their skin a little smoother, and you can’t turn it off because the sensor literally won’t work without the neural network? Is this good or bad? It depends, right?
Apparently “Snapchat body dysmorphia” is already a thing, with people finding dissatisfaction with their own bodies arising from their self-image as seen through Snapchat. Retail portraiture often renders the subjects as a sort of plastic with worryingly intense eyes, but imagine what happens when you can’t even obtain a straight photograph of yourself unless you know someone with an old camera.
Now put these two trends together.
Five years from now, maybe, pretty much every camera will have some kind of neural network/AI technology in it. Maybe to make it work at all, maybe just for a handful of shooting modes, maybe something we haven’t even thought of. This means pretty much every picture that gets taken is going to be something that an AI painted for us, based on some inputs.
This means that in 5 years, maybe 10, not only will everything and everyone be photographed, but also every one of those photographs will be subtly distorted, wrong, untruthful. Whether we want them to be or not. Every picture of a woman or a girl will be subtly beautified by the standards of some programmer in Palo Alto. Every landscape will be a little cleaned up, a little more colorful. Every night photograph will be an interpolation based on a bunch of very noisy pixels, an interpolation that looks very very realistic, but which shows a bunch of stuff in the shadows that is just flat out made up by the AI.
We’re going to have billions of cameras, from a dozen vendors, each with a dozen software versions in the wild. If there is a plausible scenario for something that could go wrong, at this kind of scale you can be pretty sure that somewhere, sometime, it’s going to go wrong. Sometimes the AI will simply behave badly, and at other times it will behave exactly as designed, with lousy outcomes anyway.
The thing that makes a photograph a photo and not a painting is that it’s drawn, with light, from the world itself. Yes, it’s just one point of view. Yes, it’s a crop. Yes, some parts are blurry. Yes, it might have been cloned and airbrushed. But with those caveats, in its own limited way, it’s truthful. The new world, coming with the speed of a freight train, is about to throw that away. Things that look like photographs will not be photographs. They will be photo-realistic paintings, made by neural networks, that look a lot like what was in front of the lens.
They’ll be close enough that we’ll tend to trust them because the programmers will make sure of that. That’s what a photograph is, right? It’s trustable, with limits. But these things won’t be trustworthy. Not quite.
These magical automatic painting machines will be trained by the young, the white, the male, the American, by whomever, for use by the old, the non-white, the global. Even if you object to “political correctness,” you will someday be using a camera that was trained to paint its pictures by someone who isn’t very much like you.
There will be consequences: social, ethical, cultural, legal. We just don’t know entirely what they will be.
About the author: Andrew Molitor writes software by day and takes pictures by night. The opinions expressed in this article are solely those of the author. Molitor is based in Norfolk, Virginia, and does his best to obsess over gear, specs, or sharpness. You can find more of his writing on his blog. This article was also published here.
Image credits: Header photo based on illustration by Mike MacKenzie via www.vpnsrus.com and licensed under CC BY 2.0