Imagine you're staring at a photo you took of a serene lake, palm trees gently swaying under an overcast sky. Now, picture this: with just a few spoken words, you can transform that sky into a dark, stormy scene with flashes of lightning.
Yes, I said spoken words. Not written prompts.
Sounds like magic, right? Well, Apple, in collaboration with AI researchers from the University of California, Santa Barbara, is turning this into reality.
I was away in Dublin when Apple hosted its latest WWDC event, and I missed the live stream. I have now caught up on how Apple will integrate AI (Apple Intelligence) into its upcoming system updates. And they're… interesting.
Something very specific got me thinking, something that could be revolutionary. I hinted at it in the intro, but let me rewind a little bit…
Background (March 2024)
It's March, and a new GitHub repository has been spotted. People are excited because it is owned by Apple, and it could be the first tangible confirmation that Apple is indeed close to releasing AI features on its devices.
Introducing ‘MGIE’ – MLLM-Guided Image Editing, where MLLM stands for multimodal large language model. It's an innovative tool that uses natural language to make precise changes to images. You simply type what you want, and the AI does the rest. Want to add a touch of drama to your sunset photo? Just type “add red hues to the sky,” and voilà! The AI-powered editor brings your vision to life.
The tech behind this is pretty exciting. In a recent paper, the researchers outlined how MGIE can handle various editing tasks, making it a powerful tool for anyone from professional photographers to casual users.
Here's a quick rundown of what MGIE can do:
- Expressive Instruction-Based Editing: Guide your edits with simple, clear instructions. “Brighten the scene” or “make the background blurrier” – MGIE understands and executes these commands efficiently, improving both the quality of the edits and your experience.
- Photoshop-Style Modifications: MGIE can perform all the classic edits we know and love – cropping, resizing, adding filters, and more. It can even tackle advanced tasks like changing backgrounds or blending images seamlessly.
- Global Photo Optimisation: MGIE can optimise brightness, contrast, sharpness, and colour balance. Plus, it can add artistic effects like sketching or cartooning.
- Local Editing: Focus on specific parts of your image, like enhancing the colour of someone's eyes or changing the texture of a dress. MGIE can tweak the shapes, sizes, colours, textures, and styles of objects in your photos.
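To make the idea of instruction-based editing a bit more concrete, here is a toy sketch in Python of how a phrase like “Brighten the scene” could end up as a pixel-level operation. To be clear, this is not how MGIE works: MGIE relies on a multimodal language model, while my sketch uses simple keyword matching. The function names, the keyword table, and the adjustment factors are all made up for illustration.

```python
# Toy illustration only: keyword matching stands in for the language model.
# Every name and factor here is hypothetical, not taken from MGIE.

def clamp(value):
    """Keep a channel value inside the valid 0-255 range."""
    return max(0, min(255, round(value)))

def brighten(pixels, factor=1.2):
    """Scale every RGB channel by `factor`."""
    return [tuple(clamp(c * factor) for c in px) for px in pixels]

def add_contrast(pixels, factor=1.3):
    """Push channel values away from mid-grey (128)."""
    return [tuple(clamp(128 + (c - 128) * factor) for c in px) for px in pixels]

# Hypothetical keyword -> operation table.
OPERATIONS = {
    "brighten": brighten,
    "contrast": add_contrast,
}

def apply_instruction(pixels, instruction):
    """Apply every operation whose keyword appears in the instruction."""
    text = instruction.lower()
    for keyword, operation in OPERATIONS.items():
        if keyword in text:
            pixels = operation(pixels)
    return pixels

image = [(100, 150, 200), (10, 20, 30)]  # a two-pixel "image"
edited = apply_instruction(image, "Brighten the scene")
print(edited)  # [(120, 180, 240), (12, 24, 36)]
```

The point of the sketch is the shape of the pipeline (plain words in, an edit out), not the matching logic, which a real system replaces with a model that actually understands the request.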
The initial results aren't perfect (it's still in beta), but the tool shows real potential.
Imagine how much faster and more intuitive editing could become, especially if the tool integrates voice commands. This could be a game-changer, making photo editing accessible to everyone, including those with disabilities.
This is exactly where we're going with this…
The Current State (WWDC June 2024)
There is a lot to unpack from the WWDC event; it's not all about Apple Intelligence. They even announced that the iPad will finally get a calculator app!
However, Apple's first steps into the AI world seem to be going in the right direction. Importantly, they stress keeping privacy at the core. Personally, I suspect Apple is only hiding everything from third-party eyes, while its own are probably still watching. But this is not the place for such a debate.
What is funny, though, is that I'm writing an entire article about one sentence at the end of an almost two-hour event. It's why I believe you may have missed it, but it definitely caught my attention.
By the way, you can re-watch the entire event on Apple's YouTube channel (and maybe subscribe to my channel as well while you're there).
Anyway, this is the passage:
“Using the new App Intents, an app like Darkroom will be able to use the Apply Filter intent to give users the ability to say, ‘Apply a cinematic preset to the photo I took of Ian yesterday.’”
Craig Federighi – Senior Vice President (SVP) of Software Engineering
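One way to picture what an intent is: a named action plus a handful of parameters that the assistant fills in from your sentence. The sketch below is a Python toy, not Apple's App Intents API (which is a Swift framework). The class, the field names, and the regular expression are my own inventions for illustration, and the pattern only handles sentences shaped like the one Federighi quoted.

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class ApplyFilterRequest:
    """Hypothetical stand-in for the parameters an 'Apply Filter' intent might take."""
    preset: str
    person: Optional[str]
    when: Optional[str]

# Matches only sentences shaped like:
# "Apply a <preset> preset to the photo I took [of <person>] [yesterday|today]"
PATTERN = re.compile(
    r"apply an? (?P<preset>\w+) preset to the photo i took"
    r"(?: of (?P<person>\w+))?"
    r"(?: (?P<when>yesterday|today))?",
    re.IGNORECASE,
)

def parse_request(utterance: str) -> Optional[ApplyFilterRequest]:
    """Turn a spoken sentence into structured parameters, or None if it doesn't fit."""
    match = PATTERN.search(utterance)
    if match is None:
        return None
    return ApplyFilterRequest(
        preset=match.group("preset"),
        person=match.group("person"),
        when=match.group("when"),
    )

request = parse_request("Apply a cinematic preset to the photo I took of Ian yesterday")
print(request)
# ApplyFilterRequest(preset='cinematic', person='Ian', when='yesterday')
```

In the real thing, the hard part (working out which photo “the photo I took of Ian yesterday” refers to) would presumably be handled by the system rather than by a pattern like this; the app only needs to describe the action and its parameters.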
This, to me, is the opening of an entirely new world of possibilities.
Voice-Based Photo Editing
Apple Intelligence could be a game-changer for photo editing.
Forget fiddling with menus and sliders or spending endless time making a precise selection. With App Intents, your favourite photo editing apps will understand plain English (and potentially other languages, too). Just say, “Make my sunset pic pop”, “Brighten my smile in this group photo”, or “Remove that person in the background.”
No more cryptic icons or technical know-how. Just you and your photos, having a conversation!
Apple Intelligence is like having a tiny photo editing genius in your pocket. Imagine struggling with a stubborn red eye or a distracting background element. Just tell your editing app to fix it, and watch the magic happen—in seconds.
However, the possibilities go way beyond basic edits. Want to add a retro vibe to your latest selfie? No problem! Feeling artistic? Tell your app to turn your portrait into a cool cartoon or a classic painting. The creative potential is endless!
And imagine what this could mean for anyone with a disability, and what this new creative tool could enable.
It's Early Days for Apple Intelligence
This tech is still in its early stages, but the future looks incredibly bright (pun intended!).
Imagine voice-controlled batch editing, where you can tell your app to adjust lighting or colours across a whole collection of photos. Imagine using your voice to add captions, watermarks, or even funny effects. Remove distractions, crop, turn a horizontal photo into a vertical one for Stories, apply artistic effects… all just by saying it. The possibilities are truly mind-boggling!
As excited as I always am for new tech, I'm here to embrace the future of photo manipulation. With Apple Intelligence, your voice and imagination are all you need to turn everyday snaps into stunning masterpieces. Who knows, maybe you'll become the next Instagram sensation—all thanks to the power of your voice!
The future of image editing is coming this Autumn, and it's looking pretty amazing.