OMNIGEN - REMARKABLE NEW IMAGE EDITOR
Imagine being able to say "take the person on the left in image 1 and the middle person in image 2 and have them [whatever you want them to do, wherever you want them to do it]." Or telling it to deblur an image, or to add or remove things when combining multiple images or parts of images.
OmniGen can do this and much more. You just tell it what you want it to do to the image and it does it. You can see it in action at https://www.youtube.com/watch?v=PCL9SAlHqzw
And try it out at https://huggingface.co/spaces/Shitao/OmniGen
Warning: Being designed by geeks, it's not the most intuitive, and it can cost you in credits after a while. If you have a powerful enough PC and graphics card, you can install it locally and use it for free with no limits.
RECRAFT AKA RED PANDA
Another new image generator, Recraft.ai, has appeared and is claiming to be the best, but in all the tests I've seen, while it is actually on a par with the best - Ideogram, MidJourney, Flux etc. - it is not better than them.
It is very good for photorealism and long text, and has similar extra features to some of the others (upscaling, background removal, erasing portions), and it adds vector images, collages, and mockups. There is a free version, so it is definitely worth a try.
RUNWAYML ADDS ADVANCED CAMERA CONTROLS
RunwayML has added advanced camera controls that give you ultra-precise, much easier-to-use control over the camera when you are generating your videos.
You can check these out at https://www.youtube.com/watch?v=0buDtZKLDJ8
WONDERANIMATION
Wonder Dynamics, the folks who enabled us to drop animated CGI characters into our videos, and who I featured in many of my seminars, have now introduced WonderAnimation, which turns any footage that you shoot into fully rendered 3D animated scenes that you have full post-production control over!
You can literally shoot a scene with any camera (or phone) in any location, and turn the sequence into an animated scene with CG characters in a 3D environment - even with shots from multiple angles!
You can read about it at https://adsknews.autodesk.com/en/news/autodesk-launches-wonder-animation-video-to-3d-scene-technology/
You can see it in action at https://www.youtube.com/watch?v=xad1ajxln28
CHATGPT SEARCH
Per ChatGPT, it "can now search the web in a much better way than before. You can get fast, timely answers with links to relevant web sources, which you would have previously needed to go to a search engine for. This blends the benefits of a natural language interface with the value of up-to-date sports scores, news, stock quotes, and more.
ChatGPT will choose to search the web based on what you ask, or you can manually choose to search by clicking the web search icon.
Search will be available at chatgpt.com, as well as on our desktop and mobile apps. All ChatGPT Plus and Team users, as well as SearchGPT waitlist users, will have access today. Enterprise and Edu users will get access in the next few weeks. We’ll roll out to all Free users over the coming months."
ChatGPT also added a much-needed conversation search function at the top left, enabling you to search through all your previous conversations.
IDEOGRAM AND MIDJOURNEY ADD IMAGE EDITING
Both Ideogram and Midjourney have introduced excellent editing tools for the images you create with them, or that you upload.
With Ideogram's Canvas, you can upload your own images or generate new ones, then seamlessly edit, extend, or combine them using Magic Fill (inpainting - adding things to the image, like the girl added above) and Image Extending (outpainting) tools. You can also seamlessly combine multiple images into one unified image. Magic Fill allows you to edit specific regions of your images to replace objects, add text, fix imperfections, change backgrounds, and more.
With Midjourney, users can upload any image of their choosing and edit sections of it with AI, or change the style and texture of it from the source to something totally different, such as turning a vintage photograph into anime — while preserving most of the image’s subjects and objects and spatial relationships. It also works on doodles and hand drawings that the user submits, turning scribbles into full art pieces in seconds.
SHORTS
RunwayML introduced Act-One, an extraordinary way to add fully controllable facial expressiveness to any face - real or animated - in a video. Instead of trying to explain all that it does, check it out here: https://runwayml.com/research/introducing-act-one
Stability AI released the open source Stable Diffusion 3.5 - with improved photorealism of people and much better rendering of hands.
Alibaba’s MIMO - Alibaba has a new AI tool called MIMO that can swap out people in videos using just a single reference photo, and change them into whatever characters you like, doing whatever you wish. It eliminates the need for complicated stuff like multi-camera setups or motion capture.
Leonardo in Canva - Canva has launched Dream Lab, which incorporates Leonardo in its text-to-image creations. The new Dream Lab tool can generate up to 19 different types of graphics, including 3D renders and illustrations, and can also reference other images to fine-tune outputs, making its outputs more reliable. It’s also capable of generating multi-subject images and photorealistic portraits.
HEYGEN INTERACTIVE AVATARS FOR ZOOM
HeyGen has introduced an innovative feature that allows users to integrate AI-powered avatars into Zoom meetings, enhancing virtual interactions. These Interactive Avatars can join multiple Zoom sessions simultaneously, operating 24/7, and are designed to look, sound, and behave like the user, making real-time decisions based on provided knowledge bases. https://www.heygen.com/
Key Features:
- Real-Time Interaction: The avatars engage in dynamic conversations, responding promptly to participants using OpenAI's real-time voice integration. This ensures natural and efficient interactions during meetings.
- Versatility: Suitable for various applications such as online coaching, customer support, sales calls, and interviews, these avatars can handle repetitive tasks, allowing users to focus on more critical aspects of their work.
- Personalization: Users can create custom avatars that mirror their appearance and voice, and how they speak, providing a consistent and authentic presence in virtual meetings. Additionally, users can create up to 100 different "looks" for their avatar, enabling variations in backgrounds, outfits, and camera angles to keep the virtual presence engaging and versatile.
While they are definitely getting better all the time, the avatars still look and sound fake to me - almost there, but not quite.
KREA ADDS VIDEO GENERATORS
Image generator Krea - https://www.krea.ai/ - has released a major update, partnering with some of the top AI video generators to bring multiple video models into Krea. Now you can create videos with MiniMax, LumaLabs, RunwayML, Pika Labs and Kling all in the one place.
They also have real-time image generation, image to video, and can upscale images and videos, as well as animations that morph from one image to another.
NEW ADOBE AI TOOLS
At Adobe MAX 2024, Adobe announced many new AI features which include:
Adobe Firefly Video Model (Beta): Adobe expanded its Firefly family of generative AI models to include video, enabling creators to generate videos from text and image prompts. This model is designed to be commercially safe and is integrated into Premiere Pro, offering features like Generative Extend to seamlessly add frames to video clips.
Photoshop Enhancements: Photoshop received several AI-driven updates:
- Distraction Removal: Automatically identifies and removes elements like people, wires, and poles from images.
- Generative Workspace (Beta): Allows designers to ideate and iterate concepts simultaneously using generative AI.
- Substance 3D Viewer (Beta): Enables viewing and editing 3D objects within Photoshop.
Premiere Pro Enhancements: Premiere Pro introduces Generative Extend, allowing editors to seamlessly add frames to video clips using AI.
Adobe Express: Adobe Express introduces new AI capabilities to simplify content creation, such as campaign creation, animation, and one-click brand setup.
NOT DIAMOND
My favorite new GPT is Not Diamond at https://chat.notdiamond.ai
Like Poe, it enables you to try different GPTs (ChatGPT, Claude, Gemini, Perplexity etc.) in the one place, but it does more. Based on what you ask, it chooses the best GPT for your query.
And you can compare the output of different GPTs side by side. And it does image generation, including the new Flux. And it is free.
INSTANT PODCASTS
Google recently enhanced its NotebookLM tool with an experimental Audio Overview feature, turning any collection of sources into a captivating podcast discussion hosted by two AI personalities. The AI-generated dialogue is downloadable, engaging, and tailored for auditory learners, as advertised by Google.
However, the feature goes beyond mere audio playback. The AI hosts display remarkable pacing, tone, and delivery, mimicking the natural flow of a human conversation. It's quite remarkable.
Credit: Lifehacker
FREE YOUTUBE TRANSCRIPTS
Another way to get a free transcript of any YouTube video you're watching is to add three "t"s after the "youtube" in the address - e.g. https://www.youtubettt.com/watch?v=cw0UOQd3ZB8
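If you want to do this for a batch of links rather than editing each address by hand, here is a minimal Python sketch of the same trick. The function name to_transcript_url is just something made up for this example, and it only rewrites the address; it assumes the link is a full youtube.com address (not a shortened youtu.be one) and makes no promise about what the transcript site does with it.

```python
from urllib.parse import urlsplit, urlunsplit


def to_transcript_url(url: str) -> str:
    """Insert "ttt" right after "youtube" in the host part of a YouTube address.

    Hypothetical helper for illustration only; it rewrites the address
    and does not contact any site.
    """
    parts = urlsplit(url)
    host = parts.netloc
    if "youtube." not in host:
        # Shortened youtu.be links and other sites are not handled here.
        raise ValueError(f"not a youtube.com address: {url}")
    # Change only the first "youtube." in the host, leaving path and query alone.
    new_host = host.replace("youtube.", "youtubettt.", 1)
    return urlunsplit(parts._replace(netloc=new_host))


print(to_transcript_url("https://www.youtube.com/watch?v=cw0UOQd3ZB8"))
```

Only the host name is touched, so the video ID and anything else after the slash carry over unchanged.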