Welcome to the AI Connection Club, a welcoming and interactive community centered around AI, where members can learn, share knowledge, stay updated, and support each other.

AI Tips and News of the Week

For AI news updated daily, go here

OmniHuman: ByteDance’s New AI Turning Photos into Moving Videos

ByteDance’s new AI tool, OmniHuman, takes a single photo and transforms it into a realistic video where the subject can speak, move, and even gesture naturally. Instead of just animating facial expressions, this AI creates full-body movement, making videos look more lifelike. The system was trained using thousands of hours of video data to better understand how people move, helping it generate smooth and natural-looking animations.

This development could be useful for content creators, educators, and digital artists looking for new ways to bring static images to life. While it opens exciting creative possibilities, experts are also considering the ethical implications of AI-generated videos. Learn more about OmniHuman here.

ReRender AI: Turning Ideas into Stunning Visuals

ReRender AI offers a way to transform simple sketches, 3D models, or images into high-quality, photorealistic renderings within seconds. Whether you're designing architecture, interiors, or creative concepts, this tool helps users bring their visions to life with ease. By automating the rendering process, it provides quick results without requiring advanced technical skills.

Users can adjust details to fit their creative needs, making it useful for professionals and hobbyists alike. To explore more, visit ReRender AI.

Pika 2.1: AI Video Creation Gets a Big Upgrade

It hasn’t been long since Pika AI introduced its 2.0 update, yet it's already rolling out Pika 2.1 with even more upgrades. This rapid pace of innovation brings smoother animations, improved video resolution up to 1080p, and new tools like advanced motion control and dynamic lighting. A key addition is the ability to upload images to customize video scenes, giving users more creative control than ever before.

These updates make AI-powered video creation even more accessible and intuitive. Whether you’re crafting short clips, animations, or cinematic projects, Pika 2.1 continues to push boundaries. Learn more at Pika AI.

ChatGPT Now Works Without an Account

OpenAI has made it easier than ever to use ChatGPT by allowing access without needing to sign up for an account. This change means anyone can try AI-powered conversations instantly, removing barriers and making AI more accessible. While some advanced features remain exclusive to registered users, this update is a big step toward making AI tools available to more people with minimal hassle.

For those curious about AI but hesitant to create an account, this update offers a chance to experiment with ChatGPT effortlessly. It’s a great way to explore what AI can do without committing to a sign-up process. Read more at The Verge.

Riffusion: A New Free Competitor To Suno And Udio

Riffusion is an AI-powered tool that lets users create entire songs just by typing a simple prompt. Whether it's country, hip-hop, metal, or even a quirky song about too many streaming subscriptions, it generates unique tracks with lyrics and melodies in seconds. Users can also experiment with blending different musical styles, tweaking elements like tempo and tone, and even adjusting how "weird" a song sounds. The tool is still in beta but already offers an easy way to play with AI-generated music, making it accessible to anyone curious about the intersection of AI and creativity.

One of Riffusion’s standout features is the ability to upload audio and transform it in creative ways. Users can remix old recordings, extend songs, or even reimagine a track in a completely different genre. Whether it’s turning a jazz tune into punk rock or adding a funk beat to a guitar riff, the possibilities are endless. The platform also plans to personalize music based on user preferences, making AI-generated tracks even more tailored. And it’s currently free.

ChatGPT's 'Deep Research': Your New Research Companion

OpenAI has introduced 'Deep Research,' a new feature in ChatGPT designed to assist users in conducting thorough research on complex topics. By analyzing information from various online sources, it compiles detailed reports complete with citations, all within a short time frame. This tool aims to make in-depth research more accessible, saving users valuable time and effort.

Unfortunately, 'Deep Research' is currently available only to Pro subscribers, who can run up to 100 queries per month. Depending on the complexity of the topic, the research process takes between 5 and 30 minutes. Users receive clarifying questions at the start and notifications once the results are ready. This feature represents a significant step toward making comprehensive research more manageable for everyone.

Learn more here.

Ideogram Canvas: A New Way to Create and Edit AI-Generated Images

Last year Ideogram introduced Canvas, a creative workspace that allows users to organize, generate, and refine images seamlessly. With features like Magic Fill and Extend, users can edit specific areas of an image or expand its borders while maintaining a cohesive look. Whether you’re tweaking details or blending multiple images, Canvas offers a flexible and user-friendly approach to AI-powered design. To explore more, visit Ideogram Canvas.

Ideogram has now added a very powerful text editor, which you can read about and watch a video of here.

OpenAI Introduces o3-mini: Making Advanced AI More Accessible

OpenAI has unveiled o3-mini, a new AI model designed to make advanced reasoning tasks more accessible to everyone. This model is now available to all ChatGPT users, including those on the free tier, marking the first time free users can experience such capabilities. For those on paid plans, o3-mini offers increased usage limits, allowing up to 150 messages daily.

One of the standout features of o3-mini is its efficiency. It excels in areas like math and coding, delivering responses 24% faster than previous models. Additionally, it operates at a significantly reduced cost, being 63% less expensive to run than its predecessor. This efficiency doesn't come at the expense of performance; o3-mini matches or even surpasses earlier models in technical tasks. Developers also have the flexibility to adjust the 'reasoning effort' to balance speed and accuracy according to their needs. Read more here.

Kling AI's 'Elements': Bringing Your Stories to Life

Kling AI has recently introduced a new feature called 'Elements' that makes creating consistent and engaging videos easier than ever. With 'Elements', you can upload up to four images—such as characters, objects, or backgrounds—and the tool will help you weave them into a smooth animation. This means your characters and scenes stay consistent throughout the video, making your storytelling more coherent and visually appealing.

If you're crafting a short story, an educational clip, or just experimenting with creative ideas, 'Elements' offers a straightforward way to bring your concepts to life. By allowing multiple images to interact seamlessly, it opens up new possibilities for dynamic and engaging content creation. It's a user-friendly approach to making your videos more lively and connected, helping you share your ideas in a more compelling way. Learn more here.

Krea AI's New Feature: Transforming Your 2D Images into 3D Creations

Krea AI has introduced an exciting feature that allows users to convert their flat, 2D images into dynamic 3D models. By simply uploading a picture, you can watch it come to life with added depth and perspective, making your visuals more engaging and interactive.

This tool is designed to be user-friendly, requiring no prior experience in design or 3D modeling. Whether you're looking to enhance your digital art, create captivating social media content, or explore new creative avenues, Krea AI's 2D-to-3D conversion offers a straightforward way to add a new dimension to your projects.

Learn more here: seaart.ai

Captions.ai: Transform Your Videos with AI Magic

Editing videos can be time-consuming and complex, but tools like Captions.ai aim to make it easier. Upload your footage, select an editing style, and the platform can help you add captions, transitions, and background music.

Whether you’re sharing stories, building a brand, or creating social media content, Captions.ai provides features designed to streamline the process, even for beginners.

Captions.ai could be the perfect tool for anyone who wants to take their video editing to the next level.

Skyrocket Your Ideas with SkyReels.ai

SkyReels.ai is a creative tool designed to help you brainstorm and develop ideas for video content. Simply type in your concept or upload a clip, and it can suggest storylines, script ideas, and visuals to bring your project to life.

Whether you’re working on social media content, exploring a creative idea, or stuck in a brainstorming rut, SkyReels.ai offers a way to spark inspiration and move your project forward. It’s a straightforward, helpful tool for turning ideas into something more. Check it out here.

Gemini + YouTube Integration

Gemini can now help you learn from YouTube videos more effectively. By pasting a YouTube video link into Gemini, you can ask it to create a step-by-step guide or summary. This can be helpful when learning a new skill or trying to understand a complex topic.

You can use this feature in Google Chrome by typing "@ Gemini" in the address bar and starting a chat. This can be a useful tool for anyone who wants to learn from video content in a more interactive and efficient way.

OpenAI’s New Tool “Operator”

OpenAI has introduced a tool called “Operator”, aimed at helping with everyday tasks like planning trips, booking reservations, and ordering groceries. With a few prompts, users can delegate these small but time-consuming chores.

Right now, Operator is only available to Pro users in the U.S., but OpenAI has plans to expand access over time. As it rolls out to more people, it could become a useful way to streamline routine tasks and free up time for other priorities.

This release is part of OpenAI’s broader effort to develop AI tools that fit into daily life.

Read more: https://www.yenisafak.com/en/news/openai-announces-new-artificial-intelligence-tool-3697594

Smart Write by Neo

PixVerse V3: Explore New Creative Frontiers

PixVerse has recently released a significant update, Version 3, with a focus on enhancing the user experience. This new version aims to better understand and translate your creative visions into captivating videos.

Additionally, explore effects like "Zombie Mode" and "Alive Art" to add a touch of the unexpected to your creations.

Version 3 also introduces new features like Lipsync, which allows characters to seamlessly match spoken words, and Extend, which enables you to easily build upon existing video clips. Learn more here.

Google Gemini Deep Research + NotebookLM - Ultimate AI Combo

Google Gemini's Deep Research feature is designed to streamline your research process. It helps you organize your thoughts and gather information efficiently by creating research plans and compiling sources into a single document.

Combining Deep Research with NotebookLM can further enhance your workflow. You can easily synthesize information, generate insights, and even create AI-powered content like podcasts. This integrated approach can be a valuable asset for anyone who wants to explore new topics, conduct research, or produce high-quality content.

To learn more, watch this.

Vidu 2.0: Making Video Creation Faster, Cheaper, and Easier

ShengShu Technology has introduced Vidu 2.0, an update to its video creation tool. This version offers faster video generation and a new "Templates" feature, which allows users to add actions or props with a single click.

With a focus on accessibility, Vidu 2.0 is designed for a range of users, from small business owners to aspiring editors. The update provides a streamlined way to create videos quickly and efficiently, making video production more approachable for a wider audience.

Learn more here.

Guidde: Your Friendly Guide to Effortless How-To Videos

Creating how-to videos can feel overwhelming, especially if you’re unfamiliar with video editing. Guidde offers a way to simplify the process by allowing you to record your screen and turn your actions into step-by-step video guides.

With a focus on ease of use, Guidde requires no design or technical expertise—just record your workflow, and it organizes the content into a clear tutorial. Whether for training, customer support, or sharing tips, it helps streamline the process.

For a visual demonstration of how Guidde works, check out this video.

Spotter Studio

Spotter Studio offers a variety of tools to help streamline your YouTube content creation. Whether you're brainstorming video ideas or trying to connect more with your audience, these tools aim to spark your creativity and keep things fresh.

Some popular creators, like Dude Perfect and The Odditty Diaries, have shared how Spotter Studio helps them save time and stay productive. It's designed to be a helpful companion in your creative process, making video creation a bit easier and more enjoyable.

Curious how it works? Check it out here.

ChatGPT introduces "Tasks"

OpenAI has introduced a new feature in ChatGPT called “Tasks,” designed to make your life a little easier by helping you stay on top of things. With Tasks, you can set reminders for important events, whether it’s a one-time meeting or a recurring commitment, like weekly family calls. You can also schedule helpful updates, such as receiving the weather forecast every morning or a roundup of news each week, delivered straight to your email. It’s like having a friendly assistant that keeps you organized and informed.

The best part? You’re in complete control. You can easily create, edit, or remove tasks through a simple interface, and ChatGPT will let you know when a task is completed with a notification or email. Tasks even work when you’re offline! You can set up to 10 tasks. Here's how to use it: https://help.openai.com/en/articles/10291617-scheduled-tasks-in-chatgpt

Invideo 3.0 update

Big news from InVideo - their latest update, V3, is a game-changer for video creation. You can now create entire videos—script, footage, voiceovers, music, subtitles, animations, the whole package—just by typing a single text prompt. No editing skills or juggling multiple tools needed!

This means anyone, whether you’re a creator, marketer, or entrepreneur, can easily produce professional-quality videos to tell your story, promote a product, or create engaging content for social media. Imagine whipping up a polished video ad or even translating existing videos with just a few clicks!

If you’ve ever felt intimidated by video editing, this might be the perfect time to give it a try here.

ChatGPT update giving it "eyes"

OpenAI has unveiled a groundbreaking feature for ChatGPT—real-time video and voice interaction through its new Advanced Voice mode. This update enables the chatbot to visually interpret its surroundings and respond contextually to what users show through their device's camera.

For example, users can display items or situations, and ask for guidance. The chatbot can provide detailed, step-by-step instructions, answer clarifying questions, and adapt its responses to what’s in the frame. Additionally, users can share their device screen, allowing ChatGPT to view and assist with tasks, such as drafting replies to messages within a messenger app.

This feature is part of the ChatGPT Plus and Pro subscription plans and will roll out next week. Business and educational users can expect access to this functionality by early 2025.

MagicQuill (revolutionary image editing)

MagicQuill is here to make image editing simpler, smarter, and more fun for everyone. With its AI-powered tools and an intuitive interface, you can easily do things like insert new elements, erase objects, or tweak colors—no complex skills required.

What’s really cool? MagicQuill understands what you’re trying to do in real time, so there’s no need to type out prompts or navigate tricky menus. Just a few quick strokes, and you’re in control, getting exactly the edits you want with precision and ease.

Whether you’re working on casual photo tweaks or intricate design projects, MagicQuill combines powerful AI with simplicity to bring your creative vision to life. If you’ve been looking for a tool that makes advanced editing feel effortless, this one’s definitely worth checking out.

MultiFoley

MultiFoley is an impressive new AI tool for creating soundtracks that perfectly match silent videos, whether you’re going for realistic sound effects or something more imaginative. With MultiFoley, you can generate high-quality, synchronized sounds using text, audio, or video as inputs.

One of its coolest tricks is that you can guide it with reference sounds—like pulling audio from a sound effects library or a partial video soundtrack—and it will build a complete, seamless audio experience. Need a skateboard’s wheels spinning without the wind noise, or maybe a lion’s roar that sounds like a cat’s meow? MultiFoley can help you with this.

It’s super versatile, too. You can use it to create sounds based on text prompts, extend incomplete soundtracks, or tweak audio using existing references. By combining AI smarts with professional sound effects, it produces clear, full-bandwidth audio that’s perfect for everything from film production to creative projects.

If sound design is part of your workflow, MultiFoley could save you a lot of time while opening up endless creative possibilities.

NotebookLM update

Google has just rolled out some exciting updates to NotebookLM, their AI-powered productivity tool, and they’re pretty game-changing.

The standout feature? You can now jump into the podcast conversations with their new Audio Overview update. This means you can interact with the audio using your voice—ask questions, get extra details, or even request different explanations, all in real-time. It’s like being part of the discussion!

They’ve also redesigned the interface to make things easier and more intuitive. You’ve got three key panels now: one for keeping track of your sources, one for AI-powered chats (with citations!), and another for creating things like study guides and custom audio overviews.

And for those who need even more power, there’s a new premium tier called NotebookLM Plus coming early next year. It’s built for teams, schools, and businesses, offering more storage, shared notebooks, and collaboration features.

You can check it out here.

Pika 2 update

Pika Labs has introduced Pika 2.0, a fun and user-friendly AI video generator designed for everyday creators, not just big studios. One of its standout features is the Scene Ingredients tool, which lets you upload your own characters, props, and settings to mix with AI-generated content. Whether it’s a dragon flying over a castle or a cat surfing through space, you get more control to bring your ideas to life.

Unlike traditional text-prompt-based video tools, Pika 2.0 has improved text alignment for better results and upgraded motion rendering for smoother, more natural movements. It’s made with small creators and social media users in mind, making it perfect for TikToks, marketing clips, or just having fun with creative video projects.

Available for both free and paid users, Pika 2.0 is all about making video creation accessible and enjoyable for “actual people,” as they put it.

Nvidia's Fugatto

Fugatto by Nvidia is a revolutionary AI model that generates and transforms audio using text and audio prompts. It allows users to compose music, modify voices, add or remove instruments, and even create entirely new sounds.

It allows fine-grained control over attributes like accent, emotion, and sound evolution. For example, it can morph sounds over time, such as a train transitioning into a string orchestra, or a choir.

Its debut showcased impressive creativity, from music with barking dogs to instruments mimicking animal sounds, marking Fugatto as a groundbreaking leap in generative audio technology.

The video below shows what amazing capabilities it will give to filmmakers.

Hunyuan video generator

Hunyuan AI Video is a new, state-of-the-art AI video generator that creates high-quality videos from text descriptions. With massive horsepower behind it, it claims to be the most powerful open-source video generation model available.

It generates high-quality AI videos with superior motion stability, scene transitions, and realistic visuals.

Try it at https://fal.ai/models/fal-ai/hunyuan-video

CapCut's AI Avatar Generator

You can now make a lip-syncing, talking custom AI avatar for free in CapCut!

CapCut's AI avatar generator is completely free to use. You can create and personalize your avatar without any subscriptions or hidden fees, allowing you to explore your creativity without breaking the bank.

From CapCut's promo:

"Key Features of CapCut’s AI Avatar Generator

  • Diverse Avatar Styles: Explore a library of unique styles, from bold and graphic to soft and whimsical, tailored to your vision.
  • User-Friendly Interface: Create avatars effortlessly with intuitive tools, perfect for beginners and pros alike.
  • Extensive Customization: Personalize avatars with detailed features, sound effects, and backgrounds to reflect your individuality.

Benefits of Using CapCut’s AI Avatar Generator

  • Free Creativity: Design avatars at no cost, eliminating the need for expensive software or subscriptions.
  • Pre-Designed Templates: Start with diverse character templates that inspire and simplify the creative process.
  • Seamless Video Integration: Easily incorporate custom avatars into videos with CapCut’s editing tools.

Creative Applications of AI Avatars

  • Gaming & Entertainment: Enhance gameplay commentary or skits with unique character avatars.
  • Marketing & Advertising: Create memorable campaigns featuring custom avatars to elevate your brand.
  • Reaction & Review Videos: Add personality and engagement to your content with visually captivating avatars."

Learn more on how to use CapCut AI avatar generator here: https://www.capcut.com/tools/free-avatar-creator

Google Raises the Bar in Video Generation

Google just announced Veo 2, which marks another advance in video quality, outperforming even OpenAI's Sora. They also announced Imagen 3, an upgraded image model that also offers state-of-the-art quality.

While video models frequently “hallucinate” unwanted details—such as extra fingers or unexpected objects—Veo 2 minimizes these occurrences, resulting in more realistic outputs.

Additionally, Veo 2 embeds an invisible SynthID watermark in its videos, allowing them to be identified as AI-generated. This helps mitigate risks of misinformation and misattribution.

Visit Google Labs to sign up for the waitlist. They also plan to expand Veo 2 to YouTube Shorts and other products next year.

Read more about it at https://blog.google/technology/google-labs/video-image-generation-update-december-2024

Imagen 3 outperformed all models, including Midjourney, Flux, and Ideogram, in human evaluations for preference, visual quality, and prompt adherence. The model is now available through Google Labs’ ImageFX.

PUT YOUR FRIENDS IN ANY ENVIRONMENT IN ANY POSITION

OpenArt combines numerous great image generation and editing tools into one online program, but what sets it apart is its ability to train a "model" composed of different images that you upload of a friend, a family member, a pet etc. that you can then place into any environment, in any pose, and any style.

You can see it in action in this great video from Bob Doyle at 5:10 to about 20:30: https://www.youtube.com/watch?v=gEjm0Mc1jkc

Ai-Da, a humanoid robot artist, just made history by selling her portrait of Alan Turing for over $1 million at Sotheby's. The painting is below.

2.-Ai-God-Polyptych-by-Ai-Da-Robot

OMNIGEN - REMARKABLE NEW IMAGE EDITOR

Imagine being able to say "take the person on the left in image 1 and the middle person in image 2 and have them [whatever you want them to do, wherever you want them to do it]." Or telling it to deblur an image, or to add or remove things when combining multiple images or parts of images.

Omnigen can do this and much more. You just tell it what you want it to do to the image and it does it. You can see it in action at https://www.youtube.com/watch?v=PCL9SAlHqzw

And try it out at https://huggingface.co/spaces/Shitao/OmniGen

Warning: Being designed by geeks, it's not the most intuitive, and it can cost you in credits after a while. If you have a powerful enough PC and graphics card, you can install it locally and use it for free with no limits.

RECRAFT AKA RED PANDA

Another new image generator, Recraft.ai, has appeared and is claiming to be the best, but in all the tests I've seen, while it is actually on a par with the best - Ideogram, MidJourney, Flux etc. - it is not better than them.

It is very good for photorealism and long text, and has similar extra features to some of the others (upscaling, background removal, erasing portions), and it adds vector images, collages, and mockups. There is a free version, so it is definitely worth a try.

RUNWAYML ADDS ADVANCED CAMERA CONTROLS

RunwayML has added advanced camera controls that give you ultra-precise and much easier-to-use control over the camera when you are generating your videos.

You can check these out at https://www.youtube.com/watch?v=0buDtZKLDJ8

WONDERANIMATION

Wonder Dynamics, the folks who enabled us to drop animated CGI characters into our videos, and whom I featured in many of my seminars, have now introduced WonderAnimation, which turns any footage that you shoot into fully rendered 3D animated scenes that you have full post-production control over!

You can literally shoot a scene with any camera (or phone) in any location, and turn the sequence into an animated scene with CG characters in a 3D environment, even with shots from multiple angles!

You can read about it at https://adsknews.autodesk.com/en/news/autodesk-launches-wonder-animation-video-to-3d-scene-technology/

You can see it in action at https://www.youtube.com/watch?v=xad1ajxln28

CHATGPT SEARCH

Per ChatGPT, it "can now search the web in a much better way than before. You can get fast, timely answers with links to relevant web sources, which you would have previously needed to go to a search engine for. This blends the benefits of a natural language interface with the value of up-to-date sports scores, news, stock quotes, and more.

ChatGPT will choose to search the web based on what you ask, or you can manually choose to search by clicking the web search icon.

Search will be available at chatgpt.com (opens in a new window), as well as on our desktop and mobile apps. All ChatGPT Plus and Team users, as well as SearchGPT waitlist users, will have access today. Enterprise and Edu users will get access in the next few weeks. We’ll roll out to all Free users over the coming months."

ChatGPT also added a much-needed conversation search function at the top left enabling you to search through all your previous conversations.

IDEOGRAM AND MIDJOURNEY ADD IMAGE EDITING

Both Ideogram and Midjourney have introduced excellent editing tools for the images you create with them, or that you upload.

id dogs 2

With Ideogram's Canvas, you can upload your own images or generate new ones, then seamlessly edit, extend, or combine them using Magic Fill (inpainting - adding things to the image, like the girl added above) and Image Extending (outpainting) tools. You can also seamlessly combine multiple images into one unified image. Magic Fill allows you to edit specific regions of your images to replace objects, add text, fix imperfections, change backgrounds, and more.

With Midjourney, users can upload any image of their choosing and edit sections of it with AI, or change the style and texture of it from the source to something totally different, such as turning a vintage photograph into anime — while preserving most of the image’s subjects and objects and spatial relationships. It also works on doodles and hand drawings that the user submits, turning scribbles into full art pieces in seconds.

Shorts:

RunwayML introduced Act-One, an extraordinary way to add fully controllable facial expressiveness to any face - real or animated - in a video. Instead of trying to explain all that it does, check it out here: https://runwayml.com/research/introducing-act-one

Stability AI released the open source Stable Diffusion 3.5 - with improved photorealism of people and much better rendering of hands.

Alibaba’s MIMO - Alibaba has a new AI tool called MIMO that can swap out people in videos using just a single photo reference, and change them into whatever characters you like, doing whatever you wish. It eliminates the need for complicated setups like multi-camera rigs or motion capture.


Leonardo in Canva - Canva has launched Dream Lab, which incorporates Leonardo in its text-to-image creations. The new Dream Lab tool can generate up to 19 different types of graphics, including 3D renders and illustrations, and can also reference other images to fine-tune its results, making them more reliable. It’s also capable of generating multi-subject images and photorealistic portraits.

HEYGEN INTERACTIVE AVATARS FOR ZOOM

HeyGen has introduced an innovative feature that allows users to integrate AI-powered avatars into Zoom meetings, enhancing virtual interactions. These Interactive Avatars can join multiple Zoom sessions simultaneously, operating 24/7, and are designed to look, sound, and behave like the user, making real-time decisions based on provided knowledge bases. https://www.heygen.com/


Key Features:

  • Real-Time Interaction: The avatars engage in dynamic conversations, responding promptly to participants using OpenAI's real-time voice integration. This ensures natural and efficient interactions during meetings.
  • Versatility: Suitable for various applications such as online coaching, customer support, sales calls, and interviews, these avatars can handle repetitive tasks, allowing users to focus on more critical aspects of their work.
  • Personalization: Users can create custom avatars that mirror their appearance and voice, and how they speak, providing a consistent and authentic presence in virtual meetings. Additionally, users can create up to 100 different "looks" for their avatar, enabling variations in backgrounds, outfits, and camera angles to keep the virtual presence engaging and versatile.

While it is definitely getting better all the time, the avatars still look and sound fake to me - almost there, but not quite.

krea new

Image generator Krea - https://www.krea.ai/ - has released a major update in which it partnered with some of the top AI video generators to bring multiple video models into Krea. Now you can create videos with MiniMax, LumaLabs, RunwayML, Pika Labs and Kling all in one place.

Krea also offers real-time image generation, image-to-video, image and video upscaling, and animations that morph from one image to another.

adobe new

NEW ADOBE AI TOOLS

At Adobe MAX 2024, Adobe announced many new AI features, including:

Adobe Firefly Video Model (Beta): Adobe expanded its Firefly family of generative AI models to include video, enabling creators to generate videos from text and image prompts. This model is designed to be commercially safe and is integrated into Premiere Pro, offering features like Generative Extend to seamlessly add frames to video clips.

Photoshop Enhancements: Photoshop received several AI-driven updates:

  • Distraction Removal: Automatically identifies and removes elements like people, wires, and poles from images.
  • Generative Workspace (Beta): Allows designers to ideate and iterate concepts simultaneously using generative AI.
  • Substance 3D Viewer (Beta): Enables viewing and editing 3D objects within Photoshop.

Premiere Pro Enhancements: Premiere Pro introduces Generative Extend, allowing editors to seamlessly add frames to video clips using AI.

Adobe Express: Adobe Express introduces new AI capabilities to simplify content creation, such as campaign creation, animation, and one-click brand setup.

NOT DIAMOND

My favorite new AI chat tool is Not Diamond at https://chat.notdiamond.ai

Like Poe, it enables you to try different AI models (ChatGPT, Claude, Gemini, Perplexity, etc.) in one place, but it does more. Based on what you ask, it chooses the best model for your query.

You can also compare the output of different models side by side. It does image generation too, including the new Flux. And it is free.

INSTANT PODCASTS

Google recently enhanced its NotebookLM tool with an experimental Audio Overview feature, turning any collection of sources into a captivating podcast discussion hosted by two AI personalities. The AI-generated dialogue is downloadable, engaging, and tailored for auditory learners, as advertised by Google.

The feature goes beyond mere audio playback, though. The AI hosts display convincing pacing, tone, and delivery, mimicking the natural flow of a human conversation. It's quite remarkable.

Credit: Lifehacker

FREE YOUTUBE TRANSCRIPTS

Another way to get a free transcript of any YouTube video you're watching is to add three "t"s after "youtube" in the address - e.g. https://www.youtubettt.com/watch?v=cw0UOQd3ZB8
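
If you do this a lot, the address change is easy to script. Here is a minimal Python sketch of the trick described above. It only rewrites the address and does not fetch anything; the function name to_transcript_url is made up for illustration, and it assumes a full youtube.com address (short youtu.be links are not handled).

```python
from urllib.parse import urlsplit, urlunsplit


def to_transcript_url(video_url: str) -> str:
    """Insert "ttt" after "youtube" in the host part of a YouTube address.

    Hypothetical helper: it only edits the text of the address.
    """
    parts = urlsplit(video_url)
    host = parts.netloc
    if "youtube" not in host:
        # Short links such as youtu.be are outside the scope of this sketch.
        raise ValueError(f"not a youtube.com address: {video_url}")
    # Change only the first "youtube" in the host, leaving the path and query alone.
    new_host = host.replace("youtube", "youtubettt", 1)
    return urlunsplit(parts._replace(netloc=new_host))


print(to_transcript_url("https://www.youtube.com/watch?v=cw0UOQd3ZB8"))
```

Pasting the printed address into your browser is the same as typing the three extra letters by hand.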

InVideo 3.0 update

invideo 3 b

Big news from InVideo - their latest update, V3, is a game-changer for video creation. You can now create entire videos—script, footage, voiceovers, music, subtitles, animations, the whole package—just by typing a single text prompt. No editing skills or juggling multiple tools needed!

This means anyone, whether you’re a creator, marketer, or entrepreneur, can easily produce professional-quality videos to tell your story, promote a product, or create engaging content for social media. Imagine whipping up a polished video ad or even translating existing videos with just a few clicks!

If you’ve ever felt intimidated by video editing, this might be the perfect time to give it a try here.

ChatGPT update giving it "eyes"

ChatGPT

OpenAI has unveiled a groundbreaking feature for ChatGPT—real-time video and voice interaction through its new Advanced Voice mode. This update enables the chatbot to visually interpret its surroundings and respond contextually to what users show through their device's camera.

For example, users can display items or situations, and ask for guidance. The chatbot can provide detailed, step-by-step instructions, answer clarifying questions, and adapt its responses to what’s in the frame. Additionally, users can share their device screen, allowing ChatGPT to view and assist with tasks, such as drafting replies to messages within a messenger app.

This feature is part of the ChatGPT Plus and Pro subscription plans and will roll out next week. Business and educational users can expect access to this functionality by early 2025.

MagicQuill (revolutionary image editing)

MagicQuill

MagicQuill is here to make image editing simpler, smarter, and more fun for everyone. With its AI-powered tools and an intuitive interface, you can easily do things like insert new elements, erase objects, or tweak colors—no complex skills required.

What’s really cool? MagicQuill understands what you’re trying to do in real time, so there’s no need to type out prompts or navigate tricky menus. Just a few quick strokes, and you’re in control, getting exactly the edits you want with precision and ease.

Whether you’re working on casual photo tweaks or intricate design projects, MagicQuill combines powerful AI with simplicity to bring your creative vision to life. If you’ve been looking for a tool that makes advanced editing feel effortless, this one’s definitely worth checking out.

MultiFoley

MultiFoley is an impressive new AI tool for creating soundtracks that perfectly match silent videos, whether you’re going for realistic sound effects or something more imaginative. With MultiFoley, you can generate high-quality, synchronized sounds using text, audio, or video as inputs.

One of its coolest tricks is that you can guide it with reference sounds—like pulling audio from a sound effects library or a partial video soundtrack—and it will build a complete, seamless audio experience. Need a skateboard’s wheels spinning without the wind noise, or maybe a lion’s roar that sounds like a cat’s meow? MultiFoley can help you with this.

It’s super versatile, too. You can use it to create sounds based on text prompts, extend incomplete soundtracks, or tweak audio using existing references. By combining AI smarts with professional sound effects, it produces clear, full-bandwidth audio that’s perfect for everything from film production to creative projects.

If sound design is part of your workflow, MultiFoley could save you a lot of time while opening up endless creative possibilities.

NotebookLM update

nlm

Google has just rolled out some exciting updates to NotebookLM, their AI-powered productivity tool, and they’re pretty game-changing.

The standout feature? You can now jump into the podcast conversations with their new Audio Overview update. This means you can interact with the audio using your voice—ask questions, get extra details, or even request different explanations, all in real-time. It’s like being part of the discussion!

They’ve also redesigned the interface to make things easier and more intuitive. You’ve got three key panels now: one for keeping track of your sources, one for AI-powered chats (with citations!), and another for creating things like study guides and custom audio overviews.

And for those who need even more power, there’s a new premium tier called NotebookLM Plus coming early next year. It’s built for teams, schools, and businesses, offering more storage, shared notebooks, and collaboration features.

You can check it out here.

Pika 2 update

Pika Labs has introduced Pika 2.0, a fun and user-friendly AI video generator designed for everyday creators, not just big studios. One of its standout features is the Scene Ingredients tool, which lets you upload your own characters, props, and settings to mix with AI-generated content. Whether it’s a dragon flying over a castle or a cat surfing through space, you get more control to bring your ideas to life.

Unlike traditional text-prompt-based video tools, Pika 2.0 has improved text alignment for better results and upgraded motion rendering for smoother, more natural movements. It’s made with small creators and social media users in mind, making it perfect for TikToks, marketing clips, or just having fun with creative video projects.

Available for both free and paid users, Pika 2.0 is all about making video creation accessible and enjoyable for “actual people,” as they put it.

Nvidia's Fugatto

Fugatto by Nvidia is a revolutionary AI model that generates and transforms audio using text and audio prompts. It allows users to compose music, modify voices, add or remove instruments, and even create entirely new sounds.

It also offers fine-grained control over attributes like accent, emotion, and how a sound evolves. For example, it can morph sounds over time, such as a train transitioning into a string orchestra or a choir.

Its debut showcased impressive creativity, from music with barking dogs to instruments mimicking animal sounds, marking Fugatto as a groundbreaking leap in generative audio technology.

The video below shows what amazing capabilities it will give to filmmakers.

Hunyuan video generator

Hunyuan AI Video is a new, state-of-the-art AI video generator that creates high-quality videos from text descriptions. It claims to be the most powerful open-source video generation model available.

It generates high-quality AI videos with superior motion stability, scene transitions, and realistic visuals.

Try it at https://fal.ai/models/fal-ai/hunyuan-video

CapCut's AI Avatar Generator

You can now make a lip-syncing, talking custom AI avatar for free in CapCut!

CapCut's AI avatar generator is completely free to use. You can create and personalize your avatar without any subscriptions or hidden fees, allowing you to explore your creativity without breaking the bank.

From CapCut's promo:

"Key Features of CapCut’s AI Avatar Generator

  • Diverse Avatar Styles: Explore a library of unique styles, from bold and graphic to soft and whimsical, tailored to your vision.
  • User-Friendly Interface: Create avatars effortlessly with intuitive tools, perfect for beginners and pros alike.
  • Extensive Customization: Personalize avatars with detailed features, sound effects, and backgrounds to reflect your individuality.

Benefits of Using CapCut’s AI Avatar Generator

  • Free Creativity: Design avatars at no cost, eliminating the need for expensive software or subscriptions.
  • Pre-Designed Templates: Start with diverse character templates that inspire and simplify the creative process.
  • Seamless Video Integration: Easily incorporate custom avatars into videos with CapCut’s editing tools.

Creative Applications of AI Avatars

  • Gaming & Entertainment: Enhance gameplay commentary or skits with unique character avatars.
  • Marketing & Advertising: Create memorable campaigns featuring custom avatars to elevate your brand.
  • Reaction & Review Videos: Add personality and engagement to your content with visually captivating avatars."

Learn more about how to use CapCut's AI avatar generator here: https://www.capcut.com/tools/free-avatar-creator

Google Raises the Bar in Video Generation

Google just announced Veo 2, which marks another advance in video quality, outperforming even OpenAI's Sora. They also announced Imagen 3, an upgraded image model that also offers state-of-the-art quality.

While video models frequently “hallucinate” unwanted details—such as extra fingers or unexpected objects—Veo 2 minimizes these occurrences, resulting in more realistic outputs.

Additionally, Veo 2 embeds an invisible SynthID watermark in its videos, allowing them to be identified as AI-generated. This helps mitigate risks of misinformation and misattribution.

Visit Google Labs to sign up for the waitlist. They also plan to expand Veo 2 to YouTube Shorts and other products next year.

Read more about it at https://blog.google/technology/google-labs/video-image-generation-update-december-2024

Imagen 3 outperformed all models, including Midjourney, Flux, and Ideogram, in human evaluations for preference, visual quality, and prompt adherence. The model is now available through Google Labs’ ImageFX.

PUT YOUR FRIENDS IN ANY ENVIRONMENT IN ANY POSITION

OpenArt combines numerous great image generation and editing tools into one online program, but what sets it apart is its ability to train a "model" from different images that you upload of a friend, a family member, a pet, etc. You can then place that subject into any environment, in any pose, and in any style.

You can see it in action in this great video from Bob Doyle at 5:10 to about 20:30: https://www.youtube.com/watch?v=gEjm0Mc1jkc

Ai-Da, a humanoid robot artist, just made history by selling her portrait of Alan Turing for over $1 million at Sotheby's. The painting is below.

2.-Ai-God-Polyptych-by-Ai-Da-Robot

OMNIGEN - REMARKABLE NEW IMAGE EDITOR

omnigen

Imagine being able to say "take the person on the left in image 1 and the middle person in image 2 and have them [whatever you want them to do, wherever you want them to do it]." Or telling it to deblur an image, or to add or remove things when combining multiple images or parts of images.

OmniGen can do this and much more. You just tell it what you want it to do to the image and it does it. You can see it in action at https://www.youtube.com/watch?v=PCL9SAlHqzw

And try it out at https://huggingface.co/spaces/Shitao/OmniGen

Warning: Being designed by geeks, it's not the most intuitive, and it can cost you in credits after a while. If you have a powerful enough PC and graphics card, you can install it locally and use it for free with no limits.

RECRAFT AKA RED PANDA

Another new image generator, Recraft.ai, has appeared and is claiming to be the best. In all the tests I've seen, it is on a par with the best - Ideogram, Midjourney, Flux, etc. - but it is not better than them.

It is very good for photorealism and long text, and has similar extra features to some of the others (upscaling, background removal, erasing portions), and it adds vector images, collages, and mockups. There is a free version, so it is definitely worth a try.

RUNWAYML ADDS ADVANCED CAMERA CONTROLS

runway-advanced-camera-control-1456x1202

RunwayML has added advanced camera controls that give you ultra-precise and much easier-to-use control over the camera when you are generating your videos.

You can check these out at https://www.youtube.com/watch?v=0buDtZKLDJ8

WONDERANIMATION

wonderanim

Wonder Dynamics, the folks who enabled us to drop animated CGI characters into our videos, and whom I featured in many of my seminars, have now introduced WonderAnimation, which turns any footage that you shoot into fully rendered 3D animated scenes that you have full post-production control over!

You can literally shoot a scene with any camera (or phone), in any location, and turn the sequence into an animated scene with CG characters in a 3D environment - even with shots from multiple angles!

You can read about it at https://adsknews.autodesk.com/en/news/autodesk-launches-wonder-animation-video-to-3d-scene-technology/

You can see it in action at https://www.youtube.com/watch?v=xad1ajxln28

CHATGPT SEARCH

Per OpenAI, ChatGPT "can now search the web in a much better way than before. You can get fast, timely answers with links to relevant web sources, which you would have previously needed to go to a search engine for. This blends the benefits of a natural language interface with the value of up-to-date sports scores, news, stock quotes, and more.

ChatGPT will choose to search the web based on what you ask, or you can manually choose to search by clicking the web search icon.

Search will be available at chatgpt.com, as well as on our desktop and mobile apps. All ChatGPT Plus and Team users, as well as SearchGPT waitlist users, will have access today. Enterprise and Edu users will get access in the next few weeks. We’ll roll out to all Free users over the coming months."

ChatGPT also added a much-needed conversation search function at the top left, enabling you to search through all your previous conversations.

IDEOGRAM AND MIDJOURNEY ADD IMAGE EDITING

Both Ideogram and Midjourney have introduced excellent editing tools for the images you create with them, or that you upload.

id dogs 2

With Ideogram's Canvas, you can upload your own images or generate new ones, then edit, extend, or combine them into one unified image using the Magic Fill (inpainting - adding things to the image, like the girl added above) and Image Extending (outpainting) tools. Magic Fill allows you to edit specific regions of your images to replace objects, add text, fix imperfections, change backgrounds, and more.

mj cars

With Midjourney, users can upload any image of their choosing and edit sections of it with AI, or change the style and texture of it from the source to something totally different, such as turning a vintage photograph into anime — while preserving most of the image’s subjects and objects and spatial relationships. It also works on doodles and hand drawings that the user submits, turning scribbles into full art pieces in seconds.

Shorts:

RunwayML introduced Act-One, an extraordinary way to add fully controllable facial expressiveness to any face - real or animated - in a video. Instead of trying to explain all that it does, check it out here: https://runwayml.com/research/introducing-act-one

Stability AI released the open source Stable Diffusion 3.5 - with improved photorealism of people and much better rendering of hands.

Alibaba’s MIMO - Alibaba's got a new AI tool called MIMO  that can swap out people in videos using just a single photo reference, and change them into whatever characters you like, doing whatever you wish. It eliminates the need for complicated stuff like multi-camera setups or motion capture.


Leonardo in Canvas
-
Canva has launched Dream Lab which incorporates Leonardo in its text to image creations. The new Dream Lab tool can generate up to 19 different types of graphics, including 3D renders and illustrations, and can also reference other images to fine-tune outputs, making its outputs more reliable. It’s also capable of generating multi-subject images and photorealistic portraits.

HEYGEN INTERACTIVE AVATARS FOR ZOOM

heygen new

HeyGen has introduced an innovative feature that allows users to integrate AI-powered avatars into Zoom meetings, enhancing virtual interactions. These Interactive Avatars can join multiple Zoom sessions simultaneously, operating 24/7, and are designed to look, sound, and behave like the user, making real-time decisions based on provided knowledge bases. https://www.heygen.com/


Key Features:

  • Real-Time Interaction: The avatars engage in dynamic conversations, responding promptly to participants using OpenAI's real-time voice integration. This ensures natural and efficient interactions during meetings.
  • Versatility: Suitable for various applications such as online coaching, customer support, sales calls, and interviews, these avatars can handle repetitive tasks, allowing users to focus on more critical aspects of their work.
  • Personalization: Users can create custom avatars that mirror their appearance and voice, and how they speak, providing a consistent and authentic presence in virtual meetings. Additionally, users can create up to 100 different "looks" for their avatar, enabling variations in backgrounds, outfits, and camera angles to keep the virtual presence engaging and versatile.

While it is definitely getting better all the time, the avatars still look and sound fake to me - almost there, but not quite.

krea new

Image generator Krea - https://www.krea.ai/ - has released a major update where they partnered with some of the top AI video generators to bring multiple video models into Krea. Now you can create videos with MiniMax, LumaLabs, RunwayML, Pika Labs and Kling all in the one place.

They also have real-time image generation, image to video, and can upscale images and videos, as well as animations that morph from one image to another.

adobe new

NEW ADOBE AI TOOLS

At Adobe MAX 2024, Adobe announced many new AI features which include:

Adobe Firefly Video Model (Beta): Adobe expanded its Firefly family of generative AI models to include video, enabling creators to generate videos from text and image prompts. This model is designed to be commercially safe and is integrated into Premiere Pro, offering features like Generative Extend to seamlessly add frames to video clips .

Photoshop Enhancements: Photoshop received several AI-driven updates:

  • Distraction Removal: Automatically identifies and removes elements like people, wires, and poles from images.
  • Generative Workspace (Beta): Allows designers to ideate and iterate concepts simultaneously using generative AI.
  • Substance 3D Viewer (Beta): Enables viewing and editing 3D objects within Photoshop.
  • Premiere Pro Enhancements:  Premiere Pro introduces Generative Extend, allowing editors to seamlessly add frames to video clips using AI.
  • Adobe Express:  Adobe Express introduces new AI capabilities to simplify content creation, such as campaign creation, animation, and one-click brand setup.

NOT DIAMOND

My favorite new GPT is Not Diamond at https://chat.notdiamond.ai

Like Poe, it enables you to try different GPTs (ChatGPT, Claude, Gemini, Perplexity etc.) in the one place, but it does more. Based on what you ask, it chooses the best GPT for your query.

And you can compare the output of different GPTs side by side. And it does image generation, including the new Flux. And it is free.

INSTANT PODCASTS

Google recently enhanced its NotebookLM tool with an experimental Audio Overview feature, turning any collection of sources into a captivating podcast discussion hosted by two AI personalities. The AI-generated dialogue is downloadable, engaging, and tailored for auditory learners, as advertised by Google.

However, the feature goes beyond mere audio playback. The AI hosts display remarkable pacing, tone, and delivery, mimicking the natural flow of a human conversation. It's quite remarkable.

Credit: Lifehacker

FREE YOUTUBE TRANSCRIPTS

Another way to get a free transcript of a YouTube video is to add 3 "t's"after the youtube in the address - e.g. https://www.youtubettt.com/watch?v=cw0UOQd3ZB8 of any YouTube video you're watching.

Invideo 3.0 update

invideo 3 b

Big news from InVideo - their latest update, V3, is a game-changer for video creation. You can now create entire videos—script, footage, voiceovers, music, subtitles, animations, the whole package—just by typing a single text prompt. No editing skills or juggling multiple tools needed!

This means anyone, whether you’re a creator, marketer, or entrepreneur, can easily produce professional-quality videos to tell your story, promote a product, or create engaging content for social media. Imagine whipping up a polished video ad or even translating existing videos with just a few clicks!

If you’ve ever felt intimidated by video editing, this might be the perfect time to give it a try here.

ChatGPT update giving it "eyes"

ChatGPT

OpenAI has unveiled a groundbreaking feature for ChatGPT—real-time video and voice interaction through its new Advanced Voice mode. This update enables the chatbot to visually interpret its surroundings and respond contextually to what users show through their device's camera.

For example, users can display items or situations, and ask for guidance. The chatbot can provide detailed, step-by-step instructions, answer clarifying questions, and adapt its responses to what’s in the frame. Additionally, users can share their device screen, allowing ChatGPT to view and assist with tasks, such as drafting replies to messages within a messenger app.

This feature is part of ChatGPT Plus and Pro subscription plans and will roll out next week. Business and educational users can expect access to this functionality by early 2025. With these advancements,

MagicQuill (revolutionary image editing)

MagicQuill

MagicQuill is here to make image editing simpler, smarter, and more fun for everyone. With its AI-powered tools and an intuitive interface, you can easily do things like insert new elements, erase objects, or tweak colors—no complex skills required.

What’s really cool? MagicQuill understands what you’re trying to do in real time, so there’s no need to type out prompts or navigate tricky menus. Just a few quick strokes, and you’re in control, getting exactly the edits you want with precision and ease.

Whether you’re working on casual photo tweaks or intricate design projects, MagicQuill combines powerful AI with simplicity to bring your creative vision to life. If you’ve been looking for a tool that makes advanced editing feel effortless, this one’s definitely worth checking out.

MultiFoley

MultiFoley

MultiFoley is an impressive new AI tool for creating soundtracks that perfectly match silent videos, whether you’re going for realistic sound effects or something more imaginative. With MultiFoley, you can generate high-quality, synchronized sounds using text, audio, or video as inputs.

One of its coolest tricks is that you can guide it with reference sounds—like pulling audio from a sound effects library or a partial video soundtrack—and it will build a complete, seamless audio experience. Need a skateboard’s wheels spinning without the wind noise, or maybe a lion’s roar that sounds like a cat’s meow? MultiFoley can help you with this.

It’s super versatile, too. You can use it to create sounds based on text prompts, extend incomplete soundtracks, or tweak audio using existing references. By combining AI smarts with professional sound effects, it produces clear, full-bandwidth audio that’s perfect for everything from film production to creative projects.

If sound design is part of your workflow, MultiFoley could save you a lot of time while opening up endless creative possibilities.

NotebookLM update

nlm

Google has just rolled out some exciting updates to NotebookLM, their AI-powered productivity tool, and they’re pretty game-changing.

The standout feature? You can now jump into the podcast conversations with their new Audio Overview update. This means you can interact with the audio using your voice—ask questions, get extra details, or even request different explanations, all in real-time. It’s like being part of the discussion!

They’ve also redesigned the interface to make things easier and more intuitive. You’ve got three key panels now: one for keeping track of your sources, one for AI-powered chats (with citations!), and another for creating things like study guides and custom audio overviews.

And for those who need even more power, there’s a new premium tier called NotebookLM Plus coming early next year. It’s built for teams, schools, and businesses, offering more storage, shared notebooks, and collaboration features.

You can check it out here.

Pika 2 update

Pika 2 update

Pika Labs has introduced Pika 2.0, a fun and user-friendly AI video generator designed for everyday creators, not just big studios. One of its standout features is the Scene Ingredients tool, which lets you upload your own characters, props, and settings to mix with AI-generated content. Whether it’s a dragon flying over a castle or a cat surfing through space, you get more control to bring your ideas to life.

Unlike traditional text-prompt-based video tools, Pika 2.0 has improved text alignment for better results and upgraded motion rendering for smoother, more natural movements. It’s made with small creators and social media users in mind, making it perfect for TikToks, marketing clips, or just having fun with creative video projects.

Available for both free and paid users, Pika 2.0 is all about making video creation accessible and enjoyable for “actual people,” as they put it.

Nvidia's Fugato

unnamed (1)

Fugatto by Nvidia is a revolutionary AI model that generates and transforms audio using text and audio prompts. It allows users to compose music, modify voices, add or remove instruments, and even create entirely new sounds.

It allows fine-grained control over attributes like accent, emotion, and sound evolution. For example, it can morph sounds over time, such as a train transitioning into a string orchestra, or a choir.

Its debut showcased impressive creativity, from music with barking dogs to instruments mimicking animal sounds, marking Fugatto as a groundbreaking leap in generative audio technology.

The video below shows what amazing capabilities it will give to filmmakers.

Hunyuan video generator

unnamed (2)

Hunyuan AI Video is a new, state of the art, AI Video Generator that creates high-quality videos from text descriptions. With massive horsepower and state-of-the-art performance, it claims to be the most powerful open-source video generation model available.

It generates high-quality AI videos with superior motion stability, scene transitions, and realistic visuals.

Try it at https://fal.ai/models/fal-ai/hunyuan-video

CapCut's AI Avatar Generator

You can now make a lip-syncing talking Custom AI Avatar for free in Capcut!

unnamed (3)

CapCut's AI avatar generator is completely free to use. You can create and personalize your avatar without any subscriptions or hidden fees, allowing you to explore your creativity without breaking the bank.

From Capcut's promo:

"Key Features of CapCut’s AI Avatar Generator

  • Diverse Avatar Styles: Explore a library of unique styles, from bold and graphic to soft and whimsical, tailored to your vision.
  • User-Friendly Interface: Create avatars effortlessly with intuitive tools, perfect for beginners and pros alike.
  • Extensive Customization: Personalize avatars with detailed features, sound effects, and backgrounds to reflect your individuality.

Benefits of Using CapCut’s AI Avatar Generator

  • Free Creativity: Design avatars at no cost, eliminating the need for expensive software or subscriptions.
  • Pre-Designed Templates: Start with diverse character templates that inspire and simplify the creative process.
  • Seamless Video Integration: Easily incorporate custom avatars into videos with CapCut’s editing tools.

Creative Applications of AI Avatars

  • Gaming & Entertainment: Enhance gameplay commentary or skits with unique character avatars.
  • Marketing & Advertising: Create memorable campaigns featuring custom avatars to elevate your brand.
  • Reaction & Review Videos: Add personality and engagement to your content with visually captivating avatars."

Learn more on how to use CapCut AI avatar generator here: https://www.capcut.com/tools/free-avatar-creator

Google Raises the Bar in Video Generation

unnamed

Google just announced Veo 2 which produces another advance in video quality, out-performing even OpenAI's Sora. They also announced Imagen 3, an upgraded image model also offering state-of-the-art quality.

While video models frequently “hallucinate” unwanted details—such as extra fingers or unexpected objects—Veo 2 minimizes these occurrences, resulting in more realistic outputs.

Additionally, Veo 2 embeds an invisible SynthID watermark in its videos, allowing them to be identified as AI-generated. This helps mitigate risks of misinformation and misattribution.

Visit Google Labs to sign up for the waitlist. They also plan to expand Veo 2 to YouTube Shorts and other products next year.

Read more about it at https://blog.google/technology/google-labs/video-image-generation-update-december-2024

Imagen 3 outperformed all models, including Midjourney, Flux, and Ideogram, in human evaluations for preference, visual quality, and prompt adherence. The model is now available through Google Labs’ ImageFX.

PUT YOUR FRIENDS IN ANY ENVIRONMENT IN ANY POSITION

OpenArt combines numerous great image generation and editing tools into one online program, but what sets it apart is its ability to train a "model" on different images that you upload of a friend, a family member, a pet etc. You can then place that subject into any environment, in any pose, and in any style.

You can see it in action in this great video from Bob Doyle at 5:10 to about 20:30: https://www.youtube.com/watch?v=gEjm0Mc1jkc

Ai-Da, a humanoid robot artist, just made history by selling her portrait of Alan Turing for over $1 million at Sotheby's. The painting is below.

2.-Ai-God-Polyptych-by-Ai-Da-Robot

OMNIGEN - REMARKABLE NEW IMAGE EDITOR

omnigen

Imagine being able to say "take the person on the left in image 1 and the middle person in image 2 and have them [whatever you want them to do, wherever you want them to do it]". Or telling it to deblur an image, or to add or remove things when combining multiple images or parts of images.

OmniGen can do this and much more. You just tell it what you want it to do to the image and it does it. You can see it in action at https://www.youtube.com/watch?v=PCL9SAlHqzw

And try it out at https://huggingface.co/spaces/Shitao/OmniGen

Warning: Being designed by geeks, it's not the most intuitive, and it can cost you in credits after a while. If you have a powerful enough PC and graphics card, you can install it locally and use it for free with no limits.

RECRAFT AKA RED PANDA

Another new image generator, Recraft.ai, has appeared and is claiming to be the best, but in all the tests I've seen, while it is actually on a par with the best - Ideogram, MidJourney, Flux etc. - it is not better than them.

It is very good for photorealism and long text, and has similar extra features to some of the others (upscaling, background removal, erasing portions), and it adds vector images, collages, and mockups. There is a free version, so it is definitely worth a try.

RUNWAYML ADDS ADVANCED CAMERA CONTROLS

runway-advanced-camera-control-1456x1202

RunwayML has added advanced camera controls that give you ultra-precise, much easier-to-use control over the camera when you are generating your videos.

You can check these out at https://www.youtube.com/watch?v=0buDtZKLDJ8

WONDERANIMATION

wonderanim

Wonder Dynamics, the folks who enabled us to drop animated CGI characters into our videos, and who I featured in many of my seminars, have now introduced WonderAnimation, which turns any footage that you shoot into fully rendered 3D animated scenes that you have full post-production control over!

You can literally shoot a scene with any camera (or phone) in any location, and turn the sequence into an animated scene with CG characters in a 3D environment - even with shots from multiple angles!

You can read about it at https://adsknews.autodesk.com/en/news/autodesk-launches-wonder-animation-video-to-3d-scene-technology/

You can see it in action at https://www.youtube.com/watch?v=xad1ajxln28

CHATGPT SEARCH

Per ChatGPT, it "can now search the web in a much better way than before. You can get fast, timely answers with links to relevant web sources, which you would have previously needed to go to a search engine for. This blends the benefits of a natural language interface with the value of up-to-date sports scores, news, stock quotes, and more.

ChatGPT will choose to search the web based on what you ask, or you can manually choose to search by clicking the web search icon.

Search will be available at chatgpt.com, as well as on our desktop and mobile apps. All ChatGPT Plus and Team users, as well as SearchGPT waitlist users, will have access today. Enterprise and Edu users will get access in the next few weeks. We’ll roll out to all Free users over the coming months."

ChatGPT also added a much-needed conversation search function at the top left, enabling you to search through all your previous conversations.

IDEOGRAM AND MIDJOURNEY ADD IMAGE EDITING

Both Ideogram and Midjourney have introduced excellent editing tools for the images you create with them, or that you upload.

id dogs 2

With Ideogram's Canvas, you can upload your own images or generate new ones, then seamlessly edit, extend, or combine them using Magic Fill (inpainting - adding things to the image, like the girl added above) and Image Extending (outpainting) tools. You can also seamlessly combine multiple images into one unified image. Magic Fill allows you to edit specific regions of your images to replace objects, add text, fix imperfections, change backgrounds, and more.

mj cars

With Midjourney, users can upload any image of their choosing and edit sections of it with AI, or change the style and texture of it from the source to something totally different, such as turning a vintage photograph into anime — while preserving most of the image’s subjects and objects and spatial relationships. It also works on doodles and hand drawings that the user submits, turning scribbles into full art pieces in seconds.

Shorts:

RunwayML introduced Act-One, an extraordinary way to add fully controllable facial expressiveness to any face - real or animated - in a video. Instead of trying to explain all that it does, check it out here: https://runwayml.com/research/introducing-act-one

Stability AI released the open source Stable Diffusion 3.5 - with improved photorealism of people and much better rendering of hands.

Alibaba’s MIMO - Alibaba has a new AI tool called MIMO that can swap out people in videos using just a single reference photo, and change them into whatever characters you like, doing whatever you wish. It eliminates the need for complicated stuff like multi-camera setups or motion capture.


Leonardo in Canva - Canva has launched Dream Lab, which incorporates Leonardo in its text-to-image creations. The new Dream Lab tool can generate up to 19 different types of graphics, including 3D renders and illustrations, and can also reference other images to fine-tune its results, making them more reliable. It’s also capable of generating multi-subject images and photorealistic portraits.

HEYGEN INTERACTIVE AVATARS FOR ZOOM

heygen new

HeyGen has introduced an innovative feature that allows users to integrate AI-powered avatars into Zoom meetings, enhancing virtual interactions. These Interactive Avatars can join multiple Zoom sessions simultaneously, operating 24/7, and are designed to look, sound, and behave like the user, making real-time decisions based on provided knowledge bases. https://www.heygen.com/


Key Features:

  • Real-Time Interaction: The avatars engage in dynamic conversations, responding promptly to participants using OpenAI's real-time voice integration. This ensures natural and efficient interactions during meetings.
  • Versatility: Suitable for various applications such as online coaching, customer support, sales calls, and interviews, these avatars can handle repetitive tasks, allowing users to focus on more critical aspects of their work.
  • Personalization: Users can create custom avatars that mirror their appearance and voice, and how they speak, providing a consistent and authentic presence in virtual meetings. Additionally, users can create up to 100 different "looks" for their avatar, enabling variations in backgrounds, outfits, and camera angles to keep the virtual presence engaging and versatile.

While it is definitely getting better all the time, the avatars still look and sound fake to me - almost there, but not quite.

krea new

Image generator Krea - https://www.krea.ai/ - has released a major update where they partnered with some of the top AI video generators to bring multiple video models into Krea. Now you can create videos with MiniMax, LumaLabs, RunwayML, Pika Labs and Kling all in the one place.

They also have real-time image generation, image to video, and can upscale images and videos, as well as animations that morph from one image to another.

adobe new

NEW ADOBE AI TOOLS

At Adobe MAX 2024, Adobe announced many new AI features which include:

Adobe Firefly Video Model (Beta): Adobe expanded its Firefly family of generative AI models to include video, enabling creators to generate videos from text and image prompts. This model is designed to be commercially safe and is integrated into Premiere Pro, offering features like Generative Extend to seamlessly add frames to video clips.

Photoshop Enhancements: Photoshop received several AI-driven updates:

  • Distraction Removal: Automatically identifies and removes elements like people, wires, and poles from images.
  • Generative Workspace (Beta): Allows designers to ideate and iterate concepts simultaneously using generative AI.
  • Substance 3D Viewer (Beta): Enables viewing and editing 3D objects within Photoshop.

Premiere Pro Enhancements: Premiere Pro introduces Generative Extend, allowing editors to seamlessly add frames to video clips using AI.

Adobe Express: Adobe Express introduces new AI capabilities to simplify content creation, such as campaign creation, animation, and one-click brand setup.

NOT DIAMOND

My favorite new GPT is Not Diamond at https://chat.notdiamond.ai

Like Poe, it enables you to try different GPTs (ChatGPT, Claude, Gemini, Perplexity etc.) in the one place, but it does more. Based on what you ask, it chooses the best GPT for your query.

And you can compare the output of different GPTs side by side. And it does image generation, including the new Flux. And it is free.

INSTANT PODCASTS

Google recently enhanced its NotebookLM tool with an experimental Audio Overview feature, turning any collection of sources into a captivating podcast discussion hosted by two AI personalities. The AI-generated dialogue is downloadable, engaging, and tailored for auditory learners, as advertised by Google.

However, the feature goes beyond mere audio playback. The AI hosts display remarkable pacing, tone, and delivery, mimicking the natural flow of a human conversation. It's quite something.

Credit: Lifehacker

FREE YOUTUBE TRANSCRIPTS

Another way to get a free transcript of any YouTube video you're watching is to add three "t's" after "youtube" in the address, e.g. https://www.youtubettt.com/watch?v=cw0UOQd3ZB8
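
If you collect a lot of video links, the address trick above can be scripted. Here is a minimal Python sketch that rewrites a standard youtube.com link into its transcript address. The function name to_transcript_url is my own invention for illustration, and the sketch assumes the link uses the normal youtube.com domain (short youtu.be links are not handled). It only builds the address; it does not fetch the transcript or check that the transcript site is up.

```python
from urllib.parse import urlsplit, urlunsplit


def to_transcript_url(video_url: str) -> str:
    """Insert "ttt" after "youtube" in the host part of a YouTube link.

    Hypothetical helper for illustration only: it rewrites the address
    and nothing else.
    """
    parts = urlsplit(video_url)
    host = parts.netloc

    # Already a transcript address: leave it alone.
    if "youtubettt" in host:
        return video_url

    # Assumption: only standard youtube.com hosts are supported.
    if "youtube" not in host:
        raise ValueError("Not a youtube.com address: " + video_url)

    new_host = host.replace("youtube", "youtubettt", 1)
    return urlunsplit(parts._replace(netloc=new_host))


if __name__ == "__main__":
    print(to_transcript_url("https://www.youtube.com/watch?v=cw0UOQd3ZB8"))
```

Only the host part of the address is changed, so the video ID and anything else after the domain carry over untouched.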