Latest AI Tools

For AI news updated daily, go here

Monica - One AI Tool to Rule Them All

monica ai

Monica enables me to use all the GPTs - ChatGPT, Claude, Gemini, and more, as well as the best image and video generators all in one place. This enables me to save a lot on all the individual subscriptions, but also to compare results and not to have to go to multiple tools to do so. Monica also allows quite a bit more usage for less cost than Poe, even in their free version, and it has much better privacy implementations.

It has a browser extension that enables you to "chat" with websites, get summaries, do screenshots, create mindmaps, and get YouTube video summaries. And it has a large number of writing and image editing tools, and it can summarize multiple open browser tabs at once, check grammar, remove image backgrounds and watermarks, integrate with Gmail, Facebook and others to help with replies and posts, transcribe audio, convert PDFs to PowerPoints, generate podcasts, humanize AI text, translate and convert document files, do OCR on images, and all sorts of other things.

It has a free plan, but I think the $9.90 plan is an incredible bargain as it can replace hundreds of dollars in subscriptions for all the tools it incudes. You can get free access to GPT-4o, Claude 3.5 & DeepSeek with Monica here

Keeping the Story Together: Runway Gen-4 Update

runway

Runway has introduced Gen-4, its latest update in AI video generation, and it’s catching attention for one very welcome reason—more consistency. If you’ve ever tried making AI-generated video, you know how tricky it is to keep the same characters looking the same across different scenes. Gen-4 aims to make that easier by helping maintain the same style, characters, and objects from one shot to the next.

This upgrade means users can now create smoother, more coherent videos without jarring shifts in appearance or tone. While it’s not perfect (what tool is?), it’s a big step toward making AI video creation more usable for those who want to tell stories that flow naturally. It’s still early days, but Gen-4 might just make experimenting with video a little more fun—and a lot less frustrating. Read more

 

Bring Characters to Life with Lemon Slice

Lemonslice3

Lemon Slice (formerly Infinity AI) makes it easy to create short, expressive videos starring digital characters that talk and emote. Whether you’re telling a story, making content for fun, or experimenting with video ideas, the platform offers a creative way to animate characters without needing advanced tools or editing skills. Some describe it as better than Heygen for less.

You can start with a character template and guide them through a scene with your script or audio. It's designed for those who want to bring an idea to life visually—even if they’ve never made a video before. If you're curious, you can check it out here: https://lemonslice.com/

Image Makeover, Made Easy

chatgpt2

A new feature in ChatGPT using GPT-4o now lets users blend the style of one image with another—kind of like giving your photos a wardrobe swap. Want your selfie to match the look of a dreamy landscape photo? This tool makes it easier to play around with visuals in fun and creative ways without needing any design background.

It’s built right into ChatGPT’s image tools, and it works by letting you pick one image for content and another for style. Whether you're experimenting for a project or just curious about how your vacation pic might look with a vintage film vibe, it opens up a new way to get artistic with your images—no extra apps required. You can explore this feature through the tutorial here: Transferring Styles with GPT-4o

Krea AI's Video Re-Style: Transform Your Videos with Ease

krea

Krea AI has introduced an exciting feature called Video Re-Style, designed to help you effortlessly change the style of your videos while keeping the original movement intact. This means you can apply new artistic effects or themes to your existing footage without altering the flow of the action.

The Video Re-Style feature is part of Krea AI's ongoing efforts to make advanced video editing accessible to everyone. By integrating this tool into their platform, Krea AI allows users to reimagine their video content in creative ways without needing extensive technical skills. It's a user-friendly solution for anyone interested in enhancing their videos with new styles and effects. Learn more

Image Generation Takes a Quantum Leap!

hero_image_1-whiteboard1

OpenAI, the company who makes ChatGPT just raised the bar in image generation. And within 24 hours my favorite image generator for the last year, Ideogram 2.0, released version 3.0. Both of these are game changers.

OpenAI played it smart this time, unlike when they announced their quantum leap in video generation, Sora, in February of 2024 and then didn't release it until December, by which time many other companies had gone beyond it. So its release was kind of like a damp squib.

This time they said nothing and just released it and became the best image generator there is. They have finally retired my least favorite generator DALL-E3, thank goodness. They're just calling it 4o Image Generation and you can access it by typing "Create Image" (or clicking on the 3 View Tools dots) right within ChatGPT, even the free version (which has limits as to how many images you can create).

What’s special about it? You can refine and edit your image via natural conversations, keeping what you like exactly as it is and only changing what you want changed, it delivers exactly what you ask it to, its handling of text, even long text is phenomenal, it's photorealistic or whatever you want it to be, you can show it an image as a guide, and more. It can create beautiful promo pieces fully. It can accept a single image of you, or anybody, and you can place you or them in any environment. It will maintain character consistency from image to image, and more.

It uses a similar technology to the Google AI Studio one that I wrote about on Monday, but in my tests, It is better.

You can find out more at https://openai.com/index/introducing-4o-image-generation/

And yesterday Ideogram 3.0 came out, which is also a wonderfully enhanced version, with a lot of similar features. To prevent this post from becoming a book, you can see and read more about it at https://about.ideogram.ai/3.0

And now there's also Reve, which is also excellent!  https://preview.reve.art/app/explore

Higgsfield: AI-Powered Video Creation Made Simple

higgsfield

Higgsfield is a tool designed to streamline video creation, helping users go from idea to finished video in one place. It offers a script editor, a storyboard tool to keep visuals consistent, and a video editor to refine the final product. There are also built-in audio options, including background music, sound effects, and voiceovers with lip-sync, making it easy to bring projects to life without juggling multiple tools.

It also has high-end cinematic camera shots with bullet time (think Matrix), super dolly shots and more.

For those looking to create content without complicated software, Higgsfield provides an intuitive way to experiment and iterate. Whether it's a short story, a marketing video, or a creative project, the platform is designed to simplify the process while keeping users in control. You can check out this AI video production studio at Higgsfield.ai.

Cloning Website Made Easy with Gamma AI

gamma Ai 2

Gamma AI is an innovative tool that simplifies website creation by allowing users to replicate existing sites effortlessly. By simply inputting the URL of a desired website, Gamma AI generates a fully functional replica, eliminating the need for extensive coding knowledge. This feature is particularly beneficial for individuals looking to understand website structures or seeking inspiration for their own projects. Gamma also creates PowerPoints.

Beyond cloning, Gamma AI offers customization options to tailor the duplicated site to individual preferences. Users can modify layouts, tables, graphs, fonts, and other elements to better align with their vision. Gamma also integrates AI models like Flux Pro, Imagen 3, Ideogram, and Luma Photon to assist in generating high-quality content, including text and images.

Effortless Ad Creation with Icon AI

a-sleek-product-shot-advertisement-featu_WxU-ZTu2Ro-dYiyLaGiofQ_SHlzj2CjQXGjHFwtHGAwlQ

Creating ads that stand out can be time-consuming, but Icon AI is designed to make the process much smoother. This AI-powered tool helps with everything from researching successful ads to writing scripts, generating voiceovers, adding captions, and even choosing background music. It aims to simplify the creative process so users can focus more on their ideas rather than the technical aspects of ad production.

What makes Icon AI particularly useful is its ability to streamline multiple steps into one seamless workflow. Instead of juggling different tools for each task, users can generate and refine their ads in one place, potentially saving time and effort. The tool also supports auto-uploading, making it even easier to get finished ads live. Learn more here

What happens when Suno meets the new Hedra Studio with Character 3?

Bob Doyle found out. 🙂

Google Lets You Really Edit Your images

new elephant trunk 4

Google just introduced a really great new feature to their free-to-use AI Studio called "Native Image Output." This exciting addition not only lets Gemini create images, but even better, it allows Gemini to edit your existing images!

All you need to do is upload an image and describe what changes you'd like to see. Within moments, Gemini will provide you with your newly edited image. It's that simple!

Here's how to use it:

1. Go to Google AI Studio using this link https://aistudio.google.com/prompts/new_chat
2. Sign in with your Google Account.
3. Select "Create Prompt" in the left navigation bar.
4. Set the model to Gemini 2.0 Flash Experimental.
5. Ensure the output format is set to "Images and text" in the right hand settings sidebar.
6. Upload an image that you want edited using the + at the bottom right.
7. Describe the change that you want Gemini to make in your text prompt.

8. Click Run.

See it in action at this great video: https://www.youtube.com/watch?v=DDrjlE_ecSw

New Effects from Pika Labs Add More Fun to AI Video Creation

pika-1-5-ai-video-generator-release

Pika Labs recently introduced 16 new effects to their AI video platform, giving users more creative ways to transform images into animated character videos. These new effects open up playful options like turning a photo into a cartoon-style scene or adding dynamic movements that bring images to life. It’s designed to be simple enough for everyday users who just want to explore video creation without getting too technical.

The Pika Labs platform keeps things easy to navigate, so users can try out these new features without feeling overwhelmed. You can check out their update here.

Breathe New Life into Your Old Videos with Freepik’s AI Video Upscaler using Topaz

topaz-video-ai-video-upscaler-interface

Freepik’s AI Video Upscaler is a handy tool designed for anyone who wants to improve the quality of their videos without diving into complicated editing software. With just one click, you can take your low-quality videos and upscale them to crisp 4K resolution. Whether it’s an old family video, a favorite clip, or content you’re preparing to share online, this tool makes the process simple and accessible for everyday users.

What’s great about this tool is how user-friendly it is — no need to figure out tricky settings or have technical knowledge. You just upload your video and let the AI handle the rest. It’s perfect for those moments when you want your videos to look a bit sharper and more polished. If you’re curious, you can check it out directly on Freepik’s platform here: Freepik AI Video Upscaler

Bring Your Images to Life with Stable Virtual Camera

stable virtual camera

Stable Virtual Camera is a new tool from Stability AI designed to add movement and depth to still images. It turns your photos into short 3D videos by creating smooth, dynamic camera paths—making your pictures feel like they’re being filmed rather than just viewed. Instead of a flat image, you get a video that moves around the scene, offering a fresh perspective and more life-like visuals.

The tool gives everyday users a creative way to breathe new energy into their photos without needing complex editing skills. If you’re experimenting with visuals for fun or looking to add simple motion to a project, this update could make your images more engaging. You can check out more details and see examples here: Stable Virtual Camera

Claude Adds Real-Time Web Search

claude

Claude, the AI assistant from Anthropic, now catches up with ChatGPT and others and comes with a web search feature that lets it pull in real-time information from the internet. This means Claude can now help with up-to-date answers, whether you’re looking for the latest news, checking current events, or curious about something happening right now. It’s a simple but useful addition that helps Claude keep up with fast-moving topics that static knowledge can’t always cover.

The web search feature is rolling out in Claude’s Pro and Team plans, making the assistant even more flexible for everyday users. Instead of switching between apps or windows to look something up, you can stay in the conversation and let Claude handle the search. While it’s still early days for the feature, it opens the door to more dynamic and responsive interactions—great for anyone who values staying current without the extra hassle. Read more

PikaSwaps: Effortless Object Swapping in Videos

pikaswaps crop

PikaSwaps makes video editing more accessible by allowing users to swap objects in a clip without complex tools or technical skills. Simply select an item in your video, describe what you’d like to replace it with, or use a reference image, and PikaSwaps smoothly integrates the change..

With an intuitive approach, PikaSwaps helps users make quick and seamless edits without extensive editing experience. Whether enhancing visuals, updating product placements, or just experimenting with fun swaps, this tool opens up new possibilities for video creativity. Learn more at PikaSwaps

Generate Unlimited Images, 100% Free with Raphael AI

Raphael AI

Raphael AI is a simple, no-sign-up image generator designed for anyone curious about creating visuals with AI. It lets you type what you imagine and watch it turn into an image—without worrying about usage limits or registration. You can even upload a reference image to help guide the result.

What’s nice is that it feels accessible, even if you’re not a designer or AI expert. Great for testing creative ideas, trying new styles, or just playing around—all free. Explore more here.

Manus: A Step Toward AI Autonomy

Manus3

A new AI tool called Manus is making waves with its ability to handle real-world tasks on its own. In a recent demo, it was shown screening resumes, conducting property research, and even working on freelance platforms like Upwork and Fiverr. Unlike traditional AI assistants that require user guidance, Manus operates independently, navigating websites, writing code, and generating visuals within its own virtual environment.

While details on its full capabilities are still emerging, early reports suggest it has outperformed major AI assistants in specific benchmarks. For now, Manus is available on an invite-only basis, but its developers have announced plans to open-source its underlying models later this year. Learn more at Manus.im

Sesame: Making AI Voices Sound More Human

Sesame

Sesame is working on making AI voices more natural and expressive, aiming to create digital assistants that feel more engaging and lifelike. Their research focuses on improving tone, rhythm, and emotional depth in speech, making AI conversations more fluid and relatable. Instead of robotic-sounding voices, they want AI to sound like it truly understands and responds in a meaningful way.

Their latest demo - you can try at the link below - showcases these improvements, allowing users to experience AI voices with better expressiveness and personality. It’s amazing a lot of people. More details: Sesame Research.

QwQ-32B: A Smarter Way to Train AI

Qwen 2

Qwen has introduced QwQ-32B, a reasoning model that demonstrates an efficient way to enhance AI without requiring massive computational resources. The model incorporates tools and adapts its responses based on feedback, showing promising results in tasks like problem-solving and code generation.

QwQ-32B is openly available on platforms like Hugging Face, allowing researchers and developers to explore its capabilities. More details here: Qwen Blog

Tavus: Conversational Video Interface To Bring AI Agents To Life

Tavus

Tavus has introduced an upgraded Conversational Video Interface (CVI) that incorporates emotional intelligence, making AI-powered video interactions feel more natural. This update includes new models that enhance facial expressiveness, perception, and conversation flow. The system aims to make AI interactions more intuitive for applications like customer support and coaching.

By focusing on timing, context, and emotional cues, this update allows AI to engage in more realistic, real-time conversations. Learn more here: Tavus AI Update

OpenAI Introduces GPT-4.5: Enhancing Conversational AI

ChatGPT 4-5

OpenAI has unveiled GPT-4.5, also known as Orion, aiming to make interactions with AI more natural and intuitive. This latest model focuses on understanding user intentions better and responding with improved emotional awareness, making conversations feel more engaging and human-like. Early users have noted that GPT-4.5 provides more accurate and contextually appropriate responses, which could be particularly beneficial for creative projects and everyday inquiries.

Currently, GPT-4.5 is accessible to ChatGPT Pro subscribers and developers on paid plans, with plans to extend access to Plus and Team users soon. Learn more

Ideogram's 2a Model: Faster and More Affordable Image Generation

Ideogram 2a

Ideogram has introduced its latest 2a model, designed to make creating images from text descriptions quicker and more budget-friendly. Users can now generate images in about 10 seconds, with a '2a Turbo' option that delivers results even faster. Additionally, the new model is priced at half the cost of the previous version, Ideogram 2.0, making it more accessible for both personal and professional projects. You can explore these features through Ideogram's web platform, API, or applications like Freepik, Poe, and Gamma.- Check it out

Hunyuan Turbo S: Tencent’s AI Built for Speed

Hunyuan

Tencent has introduced Hunyuan Turbo S, an AI model designed for speed, offering instant responses rather than deep reasoning. Unlike models that take a moment to process and reply, Turbo S prioritizes quick interactions while maintaining strong performance in knowledge, math, and reasoning. Tencent has also reduced the cost significantly, making it more accessible to users.

This release is part of a growing AI race in China, with companies like DeepSeek and Alibaba launching competitive models. Tencent is also preparing a separate model, T1, for more in-depth reasoning tasks. Learn more: Tencent Hunyuan Turbo S

Scribe by ElevenLabs: Expanding Speech-to-Text Possibilities

Scribe

ElevenLabs has introduced Scribe, a speech-to-text tool designed to transcribe audio with high accuracy across 99 languages. It includes features like word-level timestamps, speaker identification, and the ability to detect non-verbal sounds like laughter or music. The model is designed to handle real-world audio challenges, making it useful for subtitles, searchable podcasts, and multilingual transcriptions.

Scribe is priced at $0.40 per hour for transcribing pre-recorded audio, with a real-time version coming soon. It aims to improve accessibility in languages that have limited speech recognition options. Learn more: ElevenLabs Blog.

Adobe's Firefly Introduces AI Video Creation

Adobe Firefly

Adobe has unveiled its latest addition to the Firefly family: an AI-powered video generator now available in public beta. This new tool allows users to craft short video clips using simple text or image prompts, making video creation more accessible to everyone. While the current version produces clips up to five seconds long at 1080p resolution, it's designed to help users quickly visualize and prototype their ideas without needing extensive video editing skills. This feature is particularly useful for creators looking to fill gaps in their content or generate quick visual concepts. Learn more

Goku AI: Bridging Images and Videos with Realism

ByteDance_Goku-01-thumb-1080x608-66019

ByteDance and the University of Hong Kong have introduced Goku and Goku+, AI models designed to generate both images and videos. These models aim to produce high-quality visuals, with applications in content creation, advertising, and marketing. The release includes demos showing detailed animations, cinematic shots, and lifelike video scenes.

While details about their accessibility and broader use cases are still emerging, the technology hints at a growing shift in how AI can assist in producing realistic video content efficiently. They show some of their versions of content that Sora previously produced - to my eye, the Sora ones look considerably better. You can explore more about these models here.

Alibaba’s Wan2.1: A Leap in AI Video Creation

Tongyi

Alibaba’s Tongyi Lab has introduced Wan2.1, an open-source video generation model designed to create high-quality AI-generated videos at impressive speeds. This model suite includes tools for text-to-video, image-to-video, and video-to-audio generation, making it a flexible option for various creative needs. A key highlight is advanced editing features like video inpainting and multi-image referencing.

What Wan2.1 claims makes it stand out is its efficiency—it generates videos 2.5 times faster than some leading models, while excelling in motion dynamics and real-world physics simulation. Learn more at WanX AI.

PlayAI Dialog: A Step Forward in AI-Generated Speech

PlayAIv1

PlayAI Dialog has introduced a new voice AI model designed to sound more natural and expressive. Recent testing showed that users preferred its speech quality over other industry-leading models, highlighting its ability to deliver smoother, more emotionally coherent dialogue. This improvement could be useful for applications like voice assistants, automated customer support, and content narration.

The model now supports multiple languages, making it accessible to a wider audience. It also maintains low response times, which can be beneficial for real-time interactions. More details are available at Play.ht.

Grok 3: Enhancing Your AI Experience

Grok3-1

Elon Musk's AI venture, xAI, has introduced Grok 3, the latest version of its AI chatbot designed to make interactions more intuitive and helpful. It boasts more than 10 times the compute power of its predecessor and excels in math, science, and coding benchmarks. Grok 3 is now available to all users for a limited time, offering an opportunity to experience its enhanced features without a subscription. Its improved responsiveness and expanded capabilities are designed to make interactions smoother and more productive. And since February 21st, it is now free to all X users. Read more here.

Veo 2 Update: Enhancing Video Creation

Veo 2

Google has introduced Veo 2, an advanced AI tool designed to simplify video creation for everyone. Many video creators are stating that it is currently the best video generator currently out there.. Veo 2 is also now integrated into YouTube's Dream Screen feature, allowing creators to add unique AI-generated backgrounds or standalone clips to their Shorts. This integration offers a fun and accessible way to enhance your content, making it more engaging for your audience. Learn more

AI Portrait Generators: Transforming Photos into Art

AI Portrait Generator

Wondering how you'd look as a painting or a sketch? AI portrait generators make it easy to transform your regular photos into artistic creations. Platforms like Canva and Fotor offer simple tools where you can upload a photo and choose from various art styles to create a personalized portrait. You don't need any special skills—just a few clicks, and you can see yourself in a whole new light.

Beatoven AI: A Simple Way to Create Custom Music

Beatoven.ai_-scaled

Beatoven AI is designed for anyone who wants background music without the hassle of composing it themselves. It offers an easy way to generate music based on the mood and genre you choose, making it useful for videos, podcasts, and other creative projects. Instead of searching for tracks that fit, users can guide the AI to create something that matches their vision. You can explore more at Beatoven.ai.

ElevenLabs Conversational AI: Bringing Voices to AI Agents

ElevenLabsB

ElevenLabs' Conversational AI makes it easier to create AI-powered voice assistants that feel more natural and responsive. Whether for customer support, gaming, or education, this tool enables AI agents to speak and interact in real time. It supports multiple languages and can be integrated into various platforms like websites and phone systems, making AI-driven conversations more accessible.

With features like turn-taking, external app connections, and customizable voices, users can design agents that sound realistic and engage smoothly in conversations. Learn more at ElevenLabs Conversational AI.

Shopify Magic: AI Tools for Online Sellers

Shopify Magic

Shopify Magic is a set of AI-powered features designed to help online store owners streamline their business operations. It assists with writing compelling product descriptions, generating responses to customer inquiries, and even editing product images to enhance their appeal. These tools aim to save time and effort for entrepreneurs, making it easier to manage their stores efficiently.

Beyond text-based assistance, Shopify Magic also integrates with Shopify’s chat and email tools to improve customer interactions. It helps store owners craft responses, automate frequently asked questions, and refine marketing emails for better engagement. Learn more at Shopify Magic.

Relevance AI: Building AI Teams That Work for You

relevance1

Relevance AI is designed to help businesses create and manage AI-powered agents that handle repetitive tasks, making workflows smoother and more efficient. Instead of relying on multiple tools, users can build their own AI workforce—customized to fit sales, marketing, customer support, and other business functions. The platform offers pre-built templates and integrations, allowing users to get started quickly without needing technical expertise.

With Relevance AI, businesses can automate processes like lead nurturing, content creation, and customer inquiries, helping teams focus on higher-value work. The platform also prioritizes security and compliance, ensuring that AI agents work safely within organizations. Learn more at Relevance AI.

OmniHuman: ByteDance’s New AI Turning Photos into Moving Videos

bytedance-omnihuman-ai-video-gen

ByteDance’s new AI tool, OmniHuman, takes a single photo and transforms it into a realistic video where the subject can speak, move, and even gesture naturally. Instead of just animating facial expressions, this AI creates full-body movement, making videos look more lifelike. The system was trained using thousands of hours of video data to better understand how people move, helping it generate smooth and natural-looking animations.

This development could be useful for content creators, educators, and digital artists looking for new ways to bring static images to life. While it opens exciting creative possibilities, experts are also considering the ethical implications of AI-generated videos. Learn more about OmniHuman here.

ReRender AI: Turning Ideas into Stunning Visuals

Rerender

ReRender AI offers a way to transform simple sketches, 3D models, or images into high-quality, photorealistic renderings within seconds. Whether you're designing architecture, interiors, or creative concepts, this tool helps users bring their visions to life with ease. By automating the rendering process, it provides quick results without requiring advanced technical skills.

Users can adjust details to fit their creative needs, making it useful for professionals and hobbyists alike. To explore more, visit ReRender AI.

Pika 2.1: AI Video Creation Gets a Big Upgrade

Pika 2-1

It hasn’t been long since Pika AI introduced its 2.0 update, yet it's already rolling out Pika 2.1 with even more upgrades. This rapid pace of innovation brings smoother animations, improved video resolution up to 1080p, and new tools like advanced motion control and dynamic lighting. A key addition is the ability to upload images to customize video scenes, giving users more creative control than ever before.

These updates make AI-powered video creation even more accessible and intuitive. Whether you’re crafting short clips, animations, or cinematic projects, Pika 2.1 continues to push boundaries. Learn more at Pika AI.

ChatGPT Now Works Without an Account

ChatGPT-Open-AI-1120

OpenAI has made it easier than ever to use ChatGPT by allowing access without needing to sign up for an account. This change means anyone can try AI-powered conversations instantly, removing barriers and making AI more accessible. While some advanced features remain exclusive to registered users, this update is a big step toward making AI tools available to more people with minimal hassle.

For those curious about AI but hesitant to create an account, this update offers a chance to experiment with ChatGPT effortlessly. It’s a great way to explore what AI can do without committing to a sign-up process. Read more at The Verge.

Riffusion: A New Free Competitor To Suno And Udio

riffusion image

Riffusion is an AI-powered tool that lets users create entire songs just by typing a simple prompt. Whether it's country, hip-hop, metal, or even a quirky song about too many streaming subscriptions, it generates unique tracks with lyrics and melodies in seconds. Users can also experiment with blending different musical styles, tweaking elements like tempo and tone, and even adjusting how "weird" a song sounds. The tool is still in beta but already offers an easy way to play with AI-generated music, making it accessible to anyone curious about the intersection of AI and creativity.

One of Riffusion’s standout features is the ability to upload audio and transform it in creative ways. Users can remix old recordings, extend songs, or even reimagine a track in a completely different genre. Whether it’s turning a jazz tune into punk rock or adding a funk beat to a guitar riff, the possibilities are endless. The platform also plans to personalize music based on user preferences, making AI-generated tracks even more tailored. And it’s currently free.

ChatGPT's 'Deep Research': Your New Research Companion

deep research

OpenAI has introduced 'Deep Research,' a new feature in ChatGPT designed to assist users in conducting thorough research on complex topics. By analyzing information from various online sources, it compiles detailed reports complete with citations, all within a short time frame. This tool aims to make in-depth research more accessible, saving users valuable time and effort.

Unfortunately, 'Deep Research' is only available to Pro subscribers at this time, and they can utilize up to 100 queries per month. Depending on the complexity of the topic, the research process takes between 5 to 30 minutes. Users receive clarifying questions at the start and notifications once the results are ready. This feature represents a significant step toward making comprehensive research more manageable for everyone.

Learn more here.

Ideogram Canvas: A New Way to Create and Edit AI-Generated Images

ideogram

Last year Ideogram introduced Canvas, a creative workspace that allows users to organize, generate, and refine images seamlessly. With features like Magic Fill and Extend, users can edit specific areas of an image or expand its borders while maintaining a cohesive look. Whether you’re tweaking details or blending multiple images, Canvas offers a flexible and user-friendly approach to AI-powered design. To explore more, visit Ideogram Canvas.

Now they have added a very powerful text editor which you can read about and see a video about here.

OpenAI Introduces o3-mini: Making Advanced AI More Accessible

OpenAI-Launches-O3-Model-Family

OpenAI has unveiled o3-mini, a new AI model designed to make advanced reasoning tasks more accessible to everyone. This model is now available to all ChatGPT users, including those on the free tier, marking the first time free users can experience such capabilities. For those on paid plans, o3-mini offers increased usage limits, allowing up to 150 messages daily.

One of the standout features of o3-mini is its efficiency. It excels in areas like math and coding, delivering responses 24% faster than previous models. Additionally, it operates at a significantly reduced cost, being 63% less expensive to run than its predecessor. This efficiency doesn't come at the expense of performance; o3-mini matches or even surpasses earlier models in technical tasks. Developers also have the flexibility to adjust the 'reasoning effort' to balance speed and accuracy according to their needs. Read more here.

Kling AI's 'Elements': Bringing Your Stories to Life

New-Kling-AI-Elements

Kling AI has recently introduced a new feature called 'Elements' that makes creating consistent and engaging videos easier than ever. With 'Elements', you can upload up to four images—such as characters, objects, or backgrounds—and the tool will help you weave them into a smooth animation. This means your characters and scenes stay consistent throughout the video, making your storytelling more coherent and visually appealing.

If you're crafting a short story, an educational clip, or just experimenting with creative ideas, 'Elements' offers a straightforward way to bring your concepts to life. By allowing multiple images to interact seamlessly, it opens up new possibilities for dynamic and engaging content creation. It's a user-friendly approach to making your videos more lively and connected, helping you share your ideas in a more compelling way. Learn more here.

Krea AI's New Feature: Transforming Your 2D Images into 3D Creations

Krea 2

Krea AI has introduced an exciting feature that allows users to convert their flat, 2D images into dynamic 3D models. By simply uploading a picture, you can watch it come to life with added depth and perspective, making your visuals more engaging and interactive.

This tool is designed to be user-friendly, requiring no prior experience in design or 3D modeling. Whether you're looking to enhance your digital art, create captivating social media content, or explore new creative avenues, Krea AI's 2D-to-3D conversion offers a straightforward way to add a new dimension to your projects.

Learn more here: seaart.ai

Captions.ai: Transform Your Videos with AI Magic

image1

Editing videos can be time-consuming and complex, but tools like Captions.ai aim to make it easier. Upload your footage, select an editing style, and the platform can help you add captions, transitions, and background music.

Whether you’re sharing stories, building a brand, or creating social media content, Captions.ai provides features designed to streamline the process, even for beginners.

Captions.ai could be the perfect tool for anyone who wants to take their video editing to the next level.

Skyrocket Your Ideas with SkyReels.ai

skyreels

SkyReels.ai is a creative tool designed to help you brainstorm and develop ideas for video content. Simply type in your concept or upload a clip, and it can suggest storylines, script ideas, and visuals to bring your project to life.

Whether you’re working on social media content, exploring a creative idea, or stuck in a brainstorming rut, SkyReels.ai offers a way to spark inspiration and move your project forward. It’s a straightforward, helpful tool for turning ideas into something more. Check it out here.

Gemini + YouTube Integration

Gemini YouTube

Gemini can now help you learn from YouTube videos more effectively. By pasting a YouTube video link into Gemini, you can ask it to create a step-by-step guide or summary. This can be helpful when learning a new skill or trying to understand a complex topic.

You can use this feature in Google Chrome by typing "@ Gemini" in the address bar and starting a chat. This can be a useful tool for anyone who wants to learn from video content in a more interactive and efficient way.

OpenAI’s New Tool “Operator”

operator

OpenAI has introduced a tool called “Operator”, aimed at helping with everyday tasks like planning trips, booking reservations, and ordering groceries. With a few prompts, users can delegate these small but time-consuming chores.

Right now, Operator is only available to Pro users in the U.S., but OpenAI has plans to expand access over time. As it rolls out to more people, it could become a useful way to streamline routine tasks and free up time for other priorities.

This release is part of OpenAI’s broader effort to develop AI tools that fit into daily life.

Read more: https://www.yenisafak.com/en/news/openai-announces-new-artificial-intelligence-tool-3697594

Smart Write by Neo

neo 3a

PixVerse has recently released a significant update, Version 3, with a focus on enhancing the user experience. This new version aims to better understand and translate your creative visions into captivating videos.

Additionally, explore effects like "Zombie Mode" and "Alive Art" to add a touch of the unexpected to your creations.

Version 3 also introduces new features like Lipsync, which allows characters to seamlessly match spoken words, and Extend, which enables you to easily build upon existing video clips. Learn more here.

PixVerse V3: Explore New Creative Frontiers

pixverse

PixVerse has recently released a significant update, Version 3, with a focus on enhancing the user experience. This new version aims to better understand and translate your creative visions into captivating videos.

Additionally, explore effects like "Zombie Mode" and "Alive Art" to add a touch of the unexpected to your creations.

Version 3 also introduces new features like Lipsync, which allows characters to seamlessly match spoken words, and Extend, which enables you to easily build upon existing video clips. Learn more here.

Google Gemini Deep Research + NotebookLM - Ultimate AI Combo

Google AI Combo

Google Gemini's Deep Research feature is designed to streamline your research process. It helps you organize your thoughts and gather information efficiently by creating research plans and compiling sources into a single document.

Combining Deep Research with Notebook LM can further enhance your workflow. You can easily synthesize information, generate insights, and even create AI-powered content like podcasts. This integrated approach can be a valuable asset for anyone who wants to explore new topics, conduct research, or produce high-quality content.

To learn more, watch this.

Vidu 2.0: Making Video Creation Faster, Cheaper, and Easier

Vidu

ShengShu Technology has introduced Vidu 2.0, an update to its video creation tool. This version offers faster video generation and a new "Templates" feature, which allows users to add actions or props with a single click.

With a focus on accessibility, Vidu 2.0 is designed for a range of users, from small business owners to aspiring editors. The update provides a streamlined way to create videos quickly and efficiently, making video production more approachable for a wider audience.

Learn more here.

Guidde: Your Friendly Guide to Effortless How-To Videos

Guidde

Creating how-to videos can feel overwhelming, especially if you’re unfamiliar with video editing. Guidde offers a way to simplify the process by allowing you to record your screen and turn your actions into step-by-step video guides.

With a focus on ease of use, Guidde requires no design or technical expertise—just record your workflow, and it organizes the content into a clear tutorial. Whether for training, customer support, or sharing tips, it helps streamline the process.

For a visual demonstration of how Guidde works, check out this video.

Spotter Studio

Spotter

Spotter Studio offers a variety of tools to help streamline your YouTube content creation. Whether you're brainstorming video ideas or trying to connect more with your audience, these tools aim to spark your creativity and keep things fresh.

Some popular creators, like Dude Perfect and The Odditty Diaries, have shared how Spotter Studio helps them save time and stay productive. It's designed to be a helpful companion in your creative process, making video creation a bit easier and more enjoyable.

Curious how it works? Check it here.

ChatGPT introduces "Tasks"

tasks

OpenAI has introduced a new feature in ChatGPT called “Tasks,” designed to make your life a little easier by helping you stay on top of things. With Tasks, you can set reminders for important events, whether it’s a one-time meeting or a recurring commitment, like weekly family calls. You can also schedule helpful updates, such as receiving the weather forecast every morning or a roundup of news each week, delivered straight to your email. It’s like having a friendly assistant that keeps you organized and informed.

The best part? You’re in complete control. You can easily create, edit, or remove tasks through a simple interface, and ChatGPT will let you know when a task is completed with a notification or email. Tasks even work when you’re offline! You can set up to 10 tasks, Here's how to use it: https://help.openai.com/en/articles/10291617-scheduled-tasks-in-chatgpt

Invideo 3.0 update

invideo 3 b

Big news from InVideo - their latest update, V3, is a game-changer for video creation. You can now create entire videos—script, footage, voiceovers, music, subtitles, animations, the whole package—just by typing a single text prompt. No editing skills or juggling multiple tools needed!

This means anyone, whether you’re a creator, marketer, or entrepreneur, can easily produce professional-quality videos to tell your story, promote a product, or create engaging content for social media. Imagine whipping up a polished video ad or even translating existing videos with just a few clicks!

If you’ve ever felt intimidated by video editing, this might be the perfect time to give it a try here.

ChatGPT update giving it "eyes"

ChatGPT

OpenAI has unveiled a groundbreaking feature for ChatGPT—real-time video and voice interaction through its new Advanced Voice mode. This update enables the chatbot to visually interpret its surroundings and respond contextually to what users show through their device's camera.

For example, users can display items or situations, and ask for guidance. The chatbot can provide detailed, step-by-step instructions, answer clarifying questions, and adapt its responses to what’s in the frame. Additionally, users can share their device screen, allowing ChatGPT to view and assist with tasks, such as drafting replies to messages within a messenger app.

This feature is part of ChatGPT Plus and Pro subscription plans and will roll out next week. Business and educational users can expect access to this functionality by early 2025. With these advancements,

MagicQuill (revolutionary image editing)

MagicQuill

MagicQuill is here to make image editing simpler, smarter, and more fun for everyone. With its AI-powered tools and an intuitive interface, you can easily do things like insert new elements, erase objects, or tweak colors—no complex skills required.

What’s really cool? MagicQuill understands what you’re trying to do in real time, so there’s no need to type out prompts or navigate tricky menus. Just a few quick strokes, and you’re in control, getting exactly the edits you want with precision and ease.

Whether you’re working on casual photo tweaks or intricate design projects, MagicQuill combines powerful AI with simplicity to bring your creative vision to life. If you’ve been looking for a tool that makes advanced editing feel effortless, this one’s definitely worth checking out.

MultiFoley

MultiFoley

MultiFoley is an impressive new AI tool for creating soundtracks that perfectly match silent videos, whether you’re going for realistic sound effects or something more imaginative. With MultiFoley, you can generate high-quality, synchronized sounds using text, audio, or video as inputs.

One of its coolest tricks is that you can guide it with reference sounds—like pulling audio from a sound effects library or a partial video soundtrack—and it will build a complete, seamless audio experience. Need a skateboard’s wheels spinning without the wind noise, or maybe a lion’s roar that sounds like a cat’s meow? MultiFoley can help you with this.

It’s super versatile, too. You can use it to create sounds based on text prompts, extend incomplete soundtracks, or tweak audio using existing references. By combining AI smarts with professional sound effects, it produces clear, full-bandwidth audio that’s perfect for everything from film production to creative projects.

If sound design is part of your workflow, MultiFoley could save you a lot of time while opening up endless creative possibilities.

NotebookLM update

nlm

Google has just rolled out some exciting updates to NotebookLM, their AI-powered productivity tool, and they’re pretty game-changing.

The standout feature? You can now jump into the podcast conversations with their new Audio Overview update. This means you can interact with the audio using your voice—ask questions, get extra details, or even request different explanations, all in real-time. It’s like being part of the discussion!

They’ve also redesigned the interface to make things easier and more intuitive. You’ve got three key panels now: one for keeping track of your sources, one for AI-powered chats (with citations!), and another for creating things like study guides and custom audio overviews.

And for those who need even more power, there’s a new premium tier called NotebookLM Plus coming early next year. It’s built for teams, schools, and businesses, offering more storage, shared notebooks, and collaboration features.

You can check it out here.

Pika 2 update

Pika 2 update

Pika Labs has introduced Pika 2.0, a fun and user-friendly AI video generator designed for everyday creators, not just big studios. One of its standout features is the Scene Ingredients tool, which lets you upload your own characters, props, and settings to mix with AI-generated content. Whether it’s a dragon flying over a castle or a cat surfing through space, you get more control to bring your ideas to life.

Unlike traditional text-prompt-based video tools, Pika 2.0 has improved text alignment for better results and upgraded motion rendering for smoother, more natural movements. It’s made with small creators and social media users in mind, making it perfect for TikToks, marketing clips, or just having fun with creative video projects.

Available for both free and paid users, Pika 2.0 is all about making video creation accessible and enjoyable for “actual people,” as they put it.

Nvidia's Fugato

unnamed (1)

Fugatto by Nvidia is a revolutionary AI model that generates and transforms audio using text and audio prompts. It allows users to compose music, modify voices, add or remove instruments, and even create entirely new sounds.

It allows fine-grained control over attributes like accent, emotion, and sound evolution. For example, it can morph sounds over time, such as a train transitioning into a string orchestra, or a choir.

Its debut showcased impressive creativity, from music with barking dogs to instruments mimicking animal sounds, marking Fugatto as a groundbreaking leap in generative audio technology.

The video below shows what amazing capabilities it will give to filmmakers.

Hunyuan video generator

unnamed (2)

Hunyuan AI Video is a new, state of the art, AI Video Generator that creates high-quality videos from text descriptions. With massive horsepower and state-of-the-art performance, it claims to be the most powerful open-source video generation model available.

It generates high-quality AI videos with superior motion stability, scene transitions, and realistic visuals.

Try it at https://fal.ai/models/fal-ai/hunyuan-video

CapCut's AI Avatar Generator

You can now make a lip-syncing talking Custom AI Avatar for free in Capcut!

unnamed (3)

CapCut's AI avatar generator is completely free to use. You can create and personalize your avatar without any subscriptions or hidden fees, allowing you to explore your creativity without breaking the bank.

From Capcut's promo:

"Key Features of CapCut’s AI Avatar Generator

  • Diverse Avatar Styles: Explore a library of unique styles, from bold and graphic to soft and whimsical, tailored to your vision.
  • User-Friendly Interface: Create avatars effortlessly with intuitive tools, perfect for beginners and pros alike.
  • Extensive Customization: Personalize avatars with detailed features, sound effects, and backgrounds to reflect your individuality.

Benefits of Using CapCut’s AI Avatar Generator

  • Free Creativity: Design avatars at no cost, eliminating the need for expensive software or subscriptions.
  • Pre-Designed Templates: Start with diverse character templates that inspire and simplify the creative process.
  • Seamless Video Integration: Easily incorporate custom avatars into videos with CapCut’s editing tools.

Creative Applications of AI Avatars

  • Gaming & Entertainment: Enhance gameplay commentary or skits with unique character avatars.
  • Marketing & Advertising: Create memorable campaigns featuring custom avatars to elevate your brand.
  • Reaction & Review Videos: Add personality and engagement to your content with visually captivating avatars."

Learn more on how to use CapCut AI avatar generator here: https://www.capcut.com/tools/free-avatar-creator

Google Raises the Bar in Video Generation

unnamed

Google just announced Veo 2 which produces another advance in video quality, out-performing even OpenAI's Sora. They also announced Imagen 3, an upgraded image model also offering state-of-the-art quality.

While video models frequently “hallucinate” unwanted details—such as extra fingers or unexpected objects—Veo 2 minimizes these occurrences, resulting in more realistic outputs.

Additionally, Veo 2 embeds an invisible SynthID watermark in its videos, allowing them to be identified as AI-generated. This helps mitigate risks of misinformation and misattribution.

Visit Google Labs to sign up for the waitlist. They also plan to expand Veo 2 to YouTube Shorts and other products next year.

Read more about it at https://blog.google/technology/google-labs/video-image-generation-update-december-2024

Imagen 3 outperformed all models, including Midjourney, Flux, and Ideogram, in human evaluations for preference, visual quality, and prompt adherence. The model is now available through Google Labs’ ImageFX.

PUT YOUR FRIENDS IN ANY ENVIRONMENT IN ANY POSITION

OpenArt combines numerous great image generation and editing tools into one online program, but what sets it apart is its ability to train a "model" composed of different images that you upload of a friend, a family member, a pet etc. that you can then place into any environment, in any pose, and any style.

You can see it in action in this great video from Bob Doyle at 5:10 to about 20:30: https://www.youtube.com/watch?v=gEjm0Mc1jkc 

 

 

 

 

Ai-Da, a humanoid robot artist, just made history by selling her portrait of Alan Turing for over $1 million at Sotherby's. The painting is below.

2.-Ai-God-Polyptych-by-Ai-Da-Robot

OMNIGEN - REMARKABLE NEW IMAGE EDITOR

omnigen

Imagine being able to say "take the person on the left in image 1 and the middle person in image 2 and have them [whatever you went them to do, wherever you want them to do it]. Or telling it to deblur an image, or add or remove things when combining multiple images or parts of images.

Omnigen can do this and much more. You just tell it what you want it to do to the image and it does it. You can see it in action at https://www.youtube.com/watch?v=PCL9SAlHqzw

And try it out at https://huggingface.co/spaces/Shitao/OmniGen

Warning: Being designed by geeks, it's not the most intuitive, and it can cost you in credits after a while. If you have a powerful enough PC and graphics card, you can install it locally and use it for free with no limits.

RECRAFT AKA RED PANDA

fb68852f-4c99-4ff6-aa79-a50ba8a8aa1e

Another new image generator, Recraft.ai, has appeared and is claiming to be the best, but in all the tests I've seen, while it is actually on a par with the best - Ideogram, MidJourney, Flux etc. - it is not better than them.

It is very good for photorealism and long text, and has similar extra features to some of the others (upscaling, background removal, erasing portions), and it adds vector images, collages, and mockups. There is a free version, so it is definitely worth a try.

RUNWAYML ADDS ADVANCED CAMERA CONTROLS

runway-advanced-camera-control-1456x1202

RunWayML has added advanced camera controls the give ultraprecise, and much easier to use camera controls when you are generating your videos.

You can check these out at https://www.youtube.com/watch?v=0buDtZKLDJ8

WONDERANIMATION

wonderanim

Wonder Dynamics, the folks who enabled us to drop animated CGI characters into our videos, and who I featured in in many of my seminars, have now introduced WonderAnimation, which turns any footage that you shoot into fully rendered 3D animated scenes that you have full post-production control over!

You can literally shoot a scene with any camera, (or phone) in any location, and turn the sequence into an animated scene with CG characters in a 3D environment - even with shots from multiple angles!

You can read about it at https://adsknews.autodesk.com/en/news/autodesk-launches-wonder-animation-video-to-3d-scene-technology/

You can see it in action at https://www.youtube.com/watch?v=xad1ajxln28

CHATGPT SEARCH

Per ChatGPT, it "can now search the web in a much better way than before. You can get fast, timely answers with links to relevant web sources, which you would have previously needed to go to a search engine for. This blends the benefits of a natural language interface with the value of up-to-date sports scores, news, stock quotes, and more.

ChatGPT will choose to search the web based on what you ask, or you can manually choose to search by clicking the web search icon.

Search will be available at chatgpt.com (opens in a new window), as well as on our desktop and mobile apps. All ChatGPT Plus and Team users, as well as SearchGPT waitlist users, will have access today. Enterprise and Edu users will get access in the next few weeks. We’ll roll out to all Free users over the coming months."

ChatGPT also added a much-needed conversation search function at the top left enabling you to search through all your previous conversations.

IDEOGRAM AND MIDJOURNEY ADD IMAGE EDITING

Both ideogram and MidJourney have introduced excellent editing tools for the images you create with them, or that you upload.

id dogs 2

With Ideogram's Canvas, you can upload your own images or generate new ones, then seamlessly edit, extend, or combine them using Magic Fill (inpainting - adding things to the image, like the girl added above) and Image Extending (outpainting) tools. You can also seamlessly combine multiple images into one unified image. Magic Fill allows you to edit specific regions of your images to replace objects, add text, fix imperfections, change backgrounds, and more.

mj cars

With Midjourney, users can upload any image of their choosing and edit sections of it with AI, or change the style and texture of it from the source to something totally different, such as turning a vintage photograph into anime — while preserving most of the image’s subjects and objects and spatial relationships. It also works on doodles and hand drawings that the user submits, turning scribbles into full art pieces in seconds.

Shorts:

RunwayML introduced Act-One, an extraordinary way to add fully controllable facial expressiveness to any face - real or animated - in a video. Instead of trying to explain all that it does, check it out here: https://runwayml.com/research/introducing-act-one

Stability AI released the open source Stable Diffusion 3.5 - with improved photorealism of people and much better rendering of hands.

Alibaba’s MIMO - Alibaba's got a new AI tool called MIMO  that can swap out people in videos using just a single photo reference, and change them into whatever characters you like, doing whatever you wish. It eliminates the need for complicated stuff like multi-camera setups or motion capture.


Leonardo in Canvas
-
Canva has launched Dream Lab which incorporates Leonardo in its text to image creations. The new Dream Lab tool can generate up to 19 different types of graphics, including 3D renders and illustrations, and can also reference other images to fine-tune outputs, making its outputs more reliable. It’s also capable of generating multi-subject images and photorealistic portraits.

HEYGEN INTERACTIVE AVATARS FOR ZOOM

heygen new

HeyGen has introduced an innovative feature that allows users to integrate AI-powered avatars into Zoom meetings, enhancing virtual interactions. These Interactive Avatars can join multiple Zoom sessions simultaneously, operating 24/7, and are designed to look, sound, and behave like the user, making real-time decisions based on provided knowledge bases. https://www.heygen.com/


Key Features:

  • Real-Time Interaction: The avatars engage in dynamic conversations, responding promptly to participants using OpenAI's real-time voice integration. This ensures natural and efficient interactions during meetings.
  • Versatility: Suitable for various applications such as online coaching, customer support, sales calls, and interviews, these avatars can handle repetitive tasks, allowing users to focus on more critical aspects of their work.
  • Personalization: Users can create custom avatars that mirror their appearance and voice, and how they speak, providing a consistent and authentic presence in virtual meetings. Additionally, users can create up to 100 different "looks" for their avatar, enabling variations in backgrounds, outfits, and camera angles to keep the virtual presence engaging and versatile.

While it is definitely getting better all the time, the avatars still look and sound fake to me - almost there, but not quite.

krea new

Image generator Krea - https://www.krea.ai/ - has released a major update where they partnered with some of the top AI video generators to bring multiple video models into Krea. Now you can create videos with MiniMax, LumaLabs, RunwayML, Pika Labs and Kling all in the one place.

They also have real-time image generation, image to video, and can upscale images and videos, as well as animations that morph from one image to another.

adobe new

NEW ADOBE AI TOOLS

At Adobe MAX 2024, Adobe announced many new AI features which include:

Adobe Firefly Video Model (Beta): Adobe expanded its Firefly family of generative AI models to include video, enabling creators to generate videos from text and image prompts. This model is designed to be commercially safe and is integrated into Premiere Pro, offering features like Generative Extend to seamlessly add frames to video clips .

Photoshop Enhancements: Photoshop received several AI-driven updates:

  • Distraction Removal: Automatically identifies and removes elements like people, wires, and poles from images.
  • Generative Workspace (Beta): Allows designers to ideate and iterate concepts simultaneously using generative AI.
  • Substance 3D Viewer (Beta): Enables viewing and editing 3D objects within Photoshop.
  • Premiere Pro Enhancements:  Premiere Pro introduces Generative Extend, allowing editors to seamlessly add frames to video clips using AI.
  • Adobe Express:  Adobe Express introduces new AI capabilities to simplify content creation, such as campaign creation, animation, and one-click brand setup.

NOT DIAMOND

My favorite new GPT is Not Diamond at https://chat.notdiamond.ai

Like Poe, it enables you to try different GPTs (ChatGPT, Claude, Gemini, Perplexity etc.) in the one place, but it does more. Based on what you ask, it chooses the best GPT for your query.

And you can compare the output of different GPTs side by side. And it does image generation, including the new Flux. And it is free.

INSTANT PODCASTS

Google recently enhanced its NotebookLM tool with an experimental Audio Overview feature, turning any collection of sources into a captivating podcast discussion hosted by two AI personalities. The AI-generated dialogue is downloadable, engaging, and tailored for auditory learners, as advertised by Google.

However, the feature goes beyond mere audio playback. The AI hosts display remarkable pacing, tone, and delivery, mimicking the natural flow of a human conversation. It's quite remarkable.

Credit: Lifehacker

FREE YOUTUBE TRANSCRIPTS

Another way to get a free transcript of a YouTube video is to add 3 "t's"after the youtube in the address - e.g. https://www.youtubettt.com/watch?v=cw0UOQd3ZB8 of any YouTube video you're watching.

Invideo 3.0 update

invideo 3 b

Big news from InVideo - their latest update, V3, is a game-changer for video creation. You can now create entire videos—script, footage, voiceovers, music, subtitles, animations, the whole package—just by typing a single text prompt. No editing skills or juggling multiple tools needed!

This means anyone, whether you’re a creator, marketer, or entrepreneur, can easily produce professional-quality videos to tell your story, promote a product, or create engaging content for social media. Imagine whipping up a polished video ad or even translating existing videos with just a few clicks!

If you’ve ever felt intimidated by video editing, this might be the perfect time to give it a try here.

ChatGPT update giving it "eyes"

ChatGPT

OpenAI has unveiled a groundbreaking feature for ChatGPT—real-time video and voice interaction through its new Advanced Voice mode. This update enables the chatbot to visually interpret its surroundings and respond contextually to what users show through their device's camera.

For example, users can display items or situations, and ask for guidance. The chatbot can provide detailed, step-by-step instructions, answer clarifying questions, and adapt its responses to what’s in the frame. Additionally, users can share their device screen, allowing ChatGPT to view and assist with tasks, such as drafting replies to messages within a messenger app.

This feature is part of ChatGPT Plus and Pro subscription plans and will roll out next week. Business and educational users can expect access to this functionality by early 2025. With these advancements,

MagicQuill (revolutionary image editing)

MagicQuill

MagicQuill is here to make image editing simpler, smarter, and more fun for everyone. With its AI-powered tools and an intuitive interface, you can easily do things like insert new elements, erase objects, or tweak colors—no complex skills required.

What’s really cool? MagicQuill understands what you’re trying to do in real time, so there’s no need to type out prompts or navigate tricky menus. Just a few quick strokes, and you’re in control, getting exactly the edits you want with precision and ease.

Whether you’re working on casual photo tweaks or intricate design projects, MagicQuill combines powerful AI with simplicity to bring your creative vision to life. If you’ve been looking for a tool that makes advanced editing feel effortless, this one’s definitely worth checking out.

MultiFoley

MultiFoley

MultiFoley is an impressive new AI tool for creating soundtracks that perfectly match silent videos, whether you’re going for realistic sound effects or something more imaginative. With MultiFoley, you can generate high-quality, synchronized sounds using text, audio, or video as inputs.

One of its coolest tricks is that you can guide it with reference sounds—like pulling audio from a sound effects library or a partial video soundtrack—and it will build a complete, seamless audio experience. Need a skateboard’s wheels spinning without the wind noise, or maybe a lion’s roar that sounds like a cat’s meow? MultiFoley can help you with this.

It’s super versatile, too. You can use it to create sounds based on text prompts, extend incomplete soundtracks, or tweak audio using existing references. By combining AI smarts with professional sound effects, it produces clear, full-bandwidth audio that’s perfect for everything from film production to creative projects.

If sound design is part of your workflow, MultiFoley could save you a lot of time while opening up endless creative possibilities.

NotebookLM update

nlm

Google has just rolled out some exciting updates to NotebookLM, their AI-powered productivity tool, and they’re pretty game-changing.

The standout feature? You can now jump into the podcast conversations with their new Audio Overview update. This means you can interact with the audio using your voice—ask questions, get extra details, or even request different explanations, all in real-time. It’s like being part of the discussion!

They’ve also redesigned the interface to make things easier and more intuitive. You’ve got three key panels now: one for keeping track of your sources, one for AI-powered chats (with citations!), and another for creating things like study guides and custom audio overviews.

And for those who need even more power, there’s a new premium tier called NotebookLM Plus coming early next year. It’s built for teams, schools, and businesses, offering more storage, shared notebooks, and collaboration features.

You can check it out here.

Pika 2 update

Pika 2 update

Pika Labs has introduced Pika 2.0, a fun and user-friendly AI video generator designed for everyday creators, not just big studios. One of its standout features is the Scene Ingredients tool, which lets you upload your own characters, props, and settings to mix with AI-generated content. Whether it’s a dragon flying over a castle or a cat surfing through space, you get more control to bring your ideas to life.

Unlike traditional text-prompt-based video tools, Pika 2.0 has improved text alignment for better results and upgraded motion rendering for smoother, more natural movements. It’s made with small creators and social media users in mind, making it perfect for TikToks, marketing clips, or just having fun with creative video projects.

Available for both free and paid users, Pika 2.0 is all about making video creation accessible and enjoyable for “actual people,” as they put it.

Nvidia's Fugato

unnamed (1)

Fugatto by Nvidia is a revolutionary AI model that generates and transforms audio using text and audio prompts. It allows users to compose music, modify voices, add or remove instruments, and even create entirely new sounds.

It allows fine-grained control over attributes like accent, emotion, and sound evolution. For example, it can morph sounds over time, such as a train transitioning into a string orchestra, or a choir.

Its debut showcased impressive creativity, from music with barking dogs to instruments mimicking animal sounds, marking Fugatto as a groundbreaking leap in generative audio technology.

The video below shows what amazing capabilities it will give to filmmakers.

Hunyuan video generator

unnamed (2)

Hunyuan AI Video is a new, state of the art, AI Video Generator that creates high-quality videos from text descriptions. With massive horsepower and state-of-the-art performance, it claims to be the most powerful open-source video generation model available.

It generates high-quality AI videos with superior motion stability, scene transitions, and realistic visuals.

Try it at https://fal.ai/models/fal-ai/hunyuan-video

CapCut's AI Avatar Generator

You can now make a lip-syncing talking Custom AI Avatar for free in Capcut!

unnamed (3)

CapCut's AI avatar generator is completely free to use. You can create and personalize your avatar without any subscriptions or hidden fees, allowing you to explore your creativity without breaking the bank.

From Capcut's promo:

"Key Features of CapCut’s AI Avatar Generator

  • Diverse Avatar Styles: Explore a library of unique styles, from bold and graphic to soft and whimsical, tailored to your vision.
  • User-Friendly Interface: Create avatars effortlessly with intuitive tools, perfect for beginners and pros alike.
  • Extensive Customization: Personalize avatars with detailed features, sound effects, and backgrounds to reflect your individuality.

Benefits of Using CapCut’s AI Avatar Generator

  • Free Creativity: Design avatars at no cost, eliminating the need for expensive software or subscriptions.
  • Pre-Designed Templates: Start with diverse character templates that inspire and simplify the creative process.
  • Seamless Video Integration: Easily incorporate custom avatars into videos with CapCut’s editing tools.

Creative Applications of AI Avatars

  • Gaming & Entertainment: Enhance gameplay commentary or skits with unique character avatars.
  • Marketing & Advertising: Create memorable campaigns featuring custom avatars to elevate your brand.
  • Reaction & Review Videos: Add personality and engagement to your content with visually captivating avatars."

Learn more on how to use CapCut AI avatar generator here: https://www.capcut.com/tools/free-avatar-creator

Google Raises the Bar in Video Generation

unnamed

Google just announced Veo 2 which produces another advance in video quality, out-performing even OpenAI's Sora. They also announced Imagen 3, an upgraded image model also offering state-of-the-art quality.

While video models frequently “hallucinate” unwanted details—such as extra fingers or unexpected objects—Veo 2 minimizes these occurrences, resulting in more realistic outputs.

Additionally, Veo 2 embeds an invisible SynthID watermark in its videos, allowing them to be identified as AI-generated. This helps mitigate risks of misinformation and misattribution.

Visit Google Labs to sign up for the waitlist. They also plan to expand Veo 2 to YouTube Shorts and other products next year.

Read more about it at https://blog.google/technology/google-labs/video-image-generation-update-december-2024

Imagen 3 outperformed all models, including Midjourney, Flux, and Ideogram, in human evaluations for preference, visual quality, and prompt adherence. The model is now available through Google Labs’ ImageFX.

PUT YOUR FRIENDS IN ANY ENVIRONMENT IN ANY POSITION

OpenArt combines numerous great image generation and editing tools into one online program, but what sets it apart is its ability to train a "model" composed of different images that you upload of a friend, a family member, a pet etc. that you can then place into any environment, in any pose, and any style.

You can see it in action in this great video from Bob Doyle at 5:10 to about 20:30: https://www.youtube.com/watch?v=gEjm0Mc1jkc 

 

 

 

 

Ai-Da, a humanoid robot artist, just made history by selling her portrait of Alan Turing for over $1 million at Sotherby's. The painting is below.

2.-Ai-God-Polyptych-by-Ai-Da-Robot

OMNIGEN - REMARKABLE NEW IMAGE EDITOR

omnigen

Imagine being able to say "take the person on the left in image 1 and the middle person in image 2 and have them [whatever you went them to do, wherever you want them to do it]. Or telling it to deblur an image, or add or remove things when combining multiple images or parts of images.

Omnigen can do this and much more. You just tell it what you want it to do to the image and it does it. You can see it in action at https://www.youtube.com/watch?v=PCL9SAlHqzw

And try it out at https://huggingface.co/spaces/Shitao/OmniGen

Warning: Being designed by geeks, it's not the most intuitive, and it can cost you in credits after a while. If you have a powerful enough PC and graphics card, you can install it locally and use it for free with no limits.

RECRAFT AKA RED PANDA

fb68852f-4c99-4ff6-aa79-a50ba8a8aa1e

Another new image generator, Recraft.ai, has appeared and is claiming to be the best, but in all the tests I've seen, while it is actually on a par with the best - Ideogram, MidJourney, Flux etc. - it is not better than them.

It is very good for photorealism and long text, and has similar extra features to some of the others (upscaling, background removal, erasing portions), and it adds vector images, collages, and mockups. There is a free version, so it is definitely worth a try.

RUNWAYML ADDS ADVANCED CAMERA CONTROLS

runway-advanced-camera-control-1456x1202

RunWayML has added advanced camera controls the give ultraprecise, and much easier to use camera controls when you are generating your videos.

You can check these out at https://www.youtube.com/watch?v=0buDtZKLDJ8

WONDERANIMATION

wonderanim

Wonder Dynamics, the folks who enabled us to drop animated CGI characters into our videos, and who I featured in in many of my seminars, have now introduced WonderAnimation, which turns any footage that you shoot into fully rendered 3D animated scenes that you have full post-production control over!

You can literally shoot a scene with any camera, (or phone) in any location, and turn the sequence into an animated scene with CG characters in a 3D environment - even with shots from multiple angles!

You can read about it at https://adsknews.autodesk.com/en/news/autodesk-launches-wonder-animation-video-to-3d-scene-technology/

You can see it in action at https://www.youtube.com/watch?v=xad1ajxln28

CHATGPT SEARCH

Per ChatGPT, it "can now search the web in a much better way than before. You can get fast, timely answers with links to relevant web sources, which you would have previously needed to go to a search engine for. This blends the benefits of a natural language interface with the value of up-to-date sports scores, news, stock quotes, and more.

ChatGPT will choose to search the web based on what you ask, or you can manually choose to search by clicking the web search icon.

Search will be available at chatgpt.com (opens in a new window), as well as on our desktop and mobile apps. All ChatGPT Plus and Team users, as well as SearchGPT waitlist users, will have access today. Enterprise and Edu users will get access in the next few weeks. We’ll roll out to all Free users over the coming months."

ChatGPT also added a much-needed conversation search function at the top left enabling you to search through all your previous conversations.

IDEOGRAM AND MIDJOURNEY ADD IMAGE EDITING

Both ideogram and MidJourney have introduced excellent editing tools for the images you create with them, or that you upload.

id dogs 2

With Ideogram's Canvas, you can upload your own images or generate new ones, then seamlessly edit, extend, or combine them using Magic Fill (inpainting - adding things to the image, like the girl added above) and Image Extending (outpainting) tools. You can also seamlessly combine multiple images into one unified image. Magic Fill allows you to edit specific regions of your images to replace objects, add text, fix imperfections, change backgrounds, and more.

mj cars

With Midjourney, users can upload any image of their choosing and edit sections of it with AI, or change the style and texture of it from the source to something totally different, such as turning a vintage photograph into anime — while preserving most of the image’s subjects and objects and spatial relationships. It also works on doodles and hand drawings that the user submits, turning scribbles into full art pieces in seconds.

Shorts:

RunwayML introduced Act-One, an extraordinary way to add fully controllable facial expressiveness to any face - real or animated - in a video. Instead of trying to explain all that it does, check it out here: https://runwayml.com/research/introducing-act-one

Stability AI released the open source Stable Diffusion 3.5 - with improved photorealism of people and much better rendering of hands.

Alibaba’s MIMO - Alibaba's got a new AI tool called MIMO  that can swap out people in videos using just a single photo reference, and change them into whatever characters you like, doing whatever you wish. It eliminates the need for complicated stuff like multi-camera setups or motion capture.


Leonardo in Canvas
-
Canva has launched Dream Lab which incorporates Leonardo in its text to image creations. The new Dream Lab tool can generate up to 19 different types of graphics, including 3D renders and illustrations, and can also reference other images to fine-tune outputs, making its outputs more reliable. It’s also capable of generating multi-subject images and photorealistic portraits.

HEYGEN INTERACTIVE AVATARS FOR ZOOM

heygen new

HeyGen has introduced an innovative feature that allows users to integrate AI-powered avatars into Zoom meetings, enhancing virtual interactions. These Interactive Avatars can join multiple Zoom sessions simultaneously, operating 24/7, and are designed to look, sound, and behave like the user, making real-time decisions based on provided knowledge bases. https://www.heygen.com/


Key Features:

  • Real-Time Interaction: The avatars engage in dynamic conversations, responding promptly to participants using OpenAI's real-time voice integration. This ensures natural and efficient interactions during meetings.
  • Versatility: Suitable for various applications such as online coaching, customer support, sales calls, and interviews, these avatars can handle repetitive tasks, allowing users to focus on more critical aspects of their work.
  • Personalization: Users can create custom avatars that mirror their appearance and voice, and how they speak, providing a consistent and authentic presence in virtual meetings. Additionally, users can create up to 100 different "looks" for their avatar, enabling variations in backgrounds, outfits, and camera angles to keep the virtual presence engaging and versatile.

While it is definitely getting better all the time, the avatars still look and sound fake to me - almost there, but not quite.

krea new

Image generator Krea - https://www.krea.ai/ - has released a major update where they partnered with some of the top AI video generators to bring multiple video models into Krea. Now you can create videos with MiniMax, LumaLabs, RunwayML, Pika Labs and Kling all in the one place.

They also have real-time image generation, image to video, and can upscale images and videos, as well as animations that morph from one image to another.

adobe new

NEW ADOBE AI TOOLS

At Adobe MAX 2024, Adobe announced many new AI features which include:

Adobe Firefly Video Model (Beta): Adobe expanded its Firefly family of generative AI models to include video, enabling creators to generate videos from text and image prompts. This model is designed to be commercially safe and is integrated into Premiere Pro, offering features like Generative Extend to seamlessly add frames to video clips .

Photoshop Enhancements: Photoshop received several AI-driven updates:

  • Distraction Removal: Automatically identifies and removes elements like people, wires, and poles from images.
  • Generative Workspace (Beta): Allows designers to ideate and iterate concepts simultaneously using generative AI.
  • Substance 3D Viewer (Beta): Enables viewing and editing 3D objects within Photoshop.
  • Premiere Pro Enhancements:  Premiere Pro introduces Generative Extend, allowing editors to seamlessly add frames to video clips using AI.
  • Adobe Express:  Adobe Express introduces new AI capabilities to simplify content creation, such as campaign creation, animation, and one-click brand setup.

NOT DIAMOND

My favorite new GPT is Not Diamond at https://chat.notdiamond.ai

Like Poe, it enables you to try different GPTs (ChatGPT, Claude, Gemini, Perplexity etc.) in the one place, but it does more. Based on what you ask, it chooses the best GPT for your query.

And you can compare the output of different GPTs side by side. And it does image generation, including the new Flux. And it is free.

INSTANT PODCASTS

Google recently enhanced its NotebookLM tool with an experimental Audio Overview feature, turning any collection of sources into a captivating podcast discussion hosted by two AI personalities. The AI-generated dialogue is downloadable, engaging, and tailored for auditory learners, as advertised by Google.

However, the feature goes beyond mere audio playback. The AI hosts display remarkable pacing, tone, and delivery, mimicking the natural flow of a human conversation. It's quite remarkable.

Credit: Lifehacker

FREE YOUTUBE TRANSCRIPTS

Another way to get a free transcript of a YouTube video is to add 3 "t's"after the youtube in the address - e.g. https://www.youtubettt.com/watch?v=cw0UOQd3ZB8 of any YouTube video you're watching.

Invideo 3.0 update

invideo 3 b

Big news from InVideo - their latest update, V3, is a game-changer for video creation. You can now create entire videos—script, footage, voiceovers, music, subtitles, animations, the whole package—just by typing a single text prompt. No editing skills or juggling multiple tools needed!

This means anyone, whether you’re a creator, marketer, or entrepreneur, can easily produce professional-quality videos to tell your story, promote a product, or create engaging content for social media. Imagine whipping up a polished video ad or even translating existing videos with just a few clicks!

If you’ve ever felt intimidated by video editing, this might be the perfect time to give it a try here.

ChatGPT update giving it "eyes"

ChatGPT

OpenAI has unveiled a groundbreaking feature for ChatGPT—real-time video and voice interaction through its new Advanced Voice mode. This update enables the chatbot to visually interpret its surroundings and respond contextually to what users show through their device's camera.

For example, users can display items or situations, and ask for guidance. The chatbot can provide detailed, step-by-step instructions, answer clarifying questions, and adapt its responses to what’s in the frame. Additionally, users can share their device screen, allowing ChatGPT to view and assist with tasks, such as drafting replies to messages within a messenger app.

This feature is part of ChatGPT Plus and Pro subscription plans and will roll out next week. Business and educational users can expect access to this functionality by early 2025. With these advancements,

MagicQuill (revolutionary image editing)

MagicQuill

MagicQuill is here to make image editing simpler, smarter, and more fun for everyone. With its AI-powered tools and an intuitive interface, you can easily do things like insert new elements, erase objects, or tweak colors—no complex skills required.

What’s really cool? MagicQuill understands what you’re trying to do in real time, so there’s no need to type out prompts or navigate tricky menus. Just a few quick strokes, and you’re in control, getting exactly the edits you want with precision and ease.

Whether you’re working on casual photo tweaks or intricate design projects, MagicQuill combines powerful AI with simplicity to bring your creative vision to life. If you’ve been looking for a tool that makes advanced editing feel effortless, this one’s definitely worth checking out.

MultiFoley

MultiFoley

MultiFoley is an impressive new AI tool for creating soundtracks that perfectly match silent videos, whether you’re going for realistic sound effects or something more imaginative. With MultiFoley, you can generate high-quality, synchronized sounds using text, audio, or video as inputs.

One of its coolest tricks is that you can guide it with reference sounds—like pulling audio from a sound effects library or a partial video soundtrack—and it will build a complete, seamless audio experience. Need a skateboard’s wheels spinning without the wind noise, or maybe a lion’s roar that sounds like a cat’s meow? MultiFoley can help you with this.

It’s super versatile, too. You can use it to create sounds based on text prompts, extend incomplete soundtracks, or tweak audio using existing references. By combining AI smarts with professional sound effects, it produces clear, full-bandwidth audio that’s perfect for everything from film production to creative projects.

If sound design is part of your workflow, MultiFoley could save you a lot of time while opening up endless creative possibilities.

NotebookLM update

nlm

Google has just rolled out some exciting updates to NotebookLM, their AI-powered productivity tool, and they’re pretty game-changing.

The standout feature? You can now jump into the podcast conversations with their new Audio Overview update. This means you can interact with the audio using your voice—ask questions, get extra details, or even request different explanations, all in real-time. It’s like being part of the discussion!

They’ve also redesigned the interface to make things easier and more intuitive. You’ve got three key panels now: one for keeping track of your sources, one for AI-powered chats (with citations!), and another for creating things like study guides and custom audio overviews.

And for those who need even more power, there’s a new premium tier called NotebookLM Plus coming early next year. It’s built for teams, schools, and businesses, offering more storage, shared notebooks, and collaboration features.

You can check it out here.

Pika 2 update

Pika 2 update

Pika Labs has introduced Pika 2.0, a fun and user-friendly AI video generator designed for everyday creators, not just big studios. One of its standout features is the Scene Ingredients tool, which lets you upload your own characters, props, and settings to mix with AI-generated content. Whether it’s a dragon flying over a castle or a cat surfing through space, you get more control to bring your ideas to life.

Unlike traditional text-prompt-based video tools, Pika 2.0 has improved text alignment for better results and upgraded motion rendering for smoother, more natural movements. It’s made with small creators and social media users in mind, making it perfect for TikToks, marketing clips, or just having fun with creative video projects.

Available for both free and paid users, Pika 2.0 is all about making video creation accessible and enjoyable for “actual people,” as they put it.

Nvidia's Fugato

unnamed (1)

Fugatto by Nvidia is a revolutionary AI model that generates and transforms audio using text and audio prompts. It allows users to compose music, modify voices, add or remove instruments, and even create entirely new sounds.

It allows fine-grained control over attributes like accent, emotion, and sound evolution. For example, it can morph sounds over time, such as a train transitioning into a string orchestra, or a choir.

Its debut showcased impressive creativity, from music with barking dogs to instruments mimicking animal sounds, marking Fugatto as a groundbreaking leap in generative audio technology.

The video below shows what amazing capabilities it will give to filmmakers.

Hunyuan video generator

unnamed (2)

Hunyuan AI Video is a new, state of the art, AI Video Generator that creates high-quality videos from text descriptions. With massive horsepower and state-of-the-art performance, it claims to be the most powerful open-source video generation model available.

It generates high-quality AI videos with superior motion stability, scene transitions, and realistic visuals.

Try it at https://fal.ai/models/fal-ai/hunyuan-video

CapCut's AI Avatar Generator

You can now make a lip-syncing talking Custom AI Avatar for free in Capcut!

unnamed (3)

CapCut's AI avatar generator is completely free to use. You can create and personalize your avatar without any subscriptions or hidden fees, allowing you to explore your creativity without breaking the bank.

From Capcut's promo:

"Key Features of CapCut’s AI Avatar Generator

  • Diverse Avatar Styles: Explore a library of unique styles, from bold and graphic to soft and whimsical, tailored to your vision.
  • User-Friendly Interface: Create avatars effortlessly with intuitive tools, perfect for beginners and pros alike.
  • Extensive Customization: Personalize avatars with detailed features, sound effects, and backgrounds to reflect your individuality.

Benefits of Using CapCut’s AI Avatar Generator

  • Free Creativity: Design avatars at no cost, eliminating the need for expensive software or subscriptions.
  • Pre-Designed Templates: Start with diverse character templates that inspire and simplify the creative process.
  • Seamless Video Integration: Easily incorporate custom avatars into videos with CapCut’s editing tools.

Creative Applications of AI Avatars

  • Gaming & Entertainment: Enhance gameplay commentary or skits with unique character avatars.
  • Marketing & Advertising: Create memorable campaigns featuring custom avatars to elevate your brand.
  • Reaction & Review Videos: Add personality and engagement to your content with visually captivating avatars."

Learn more on how to use CapCut AI avatar generator here: https://www.capcut.com/tools/free-avatar-creator

Google Raises the Bar in Video Generation

unnamed

Google just announced Veo 2 which produces another advance in video quality, out-performing even OpenAI's Sora. They also announced Imagen 3, an upgraded image model also offering state-of-the-art quality.

While video models frequently “hallucinate” unwanted details—such as extra fingers or unexpected objects—Veo 2 minimizes these occurrences, resulting in more realistic outputs.

Additionally, Veo 2 embeds an invisible SynthID watermark in its videos, allowing them to be identified as AI-generated. This helps mitigate risks of misinformation and misattribution.

Visit Google Labs to sign up for the waitlist. They also plan to expand Veo 2 to YouTube Shorts and other products next year.

Read more about it at https://blog.google/technology/google-labs/video-image-generation-update-december-2024

Imagen 3 outperformed all models, including Midjourney, Flux, and Ideogram, in human evaluations for preference, visual quality, and prompt adherence. The model is now available through Google Labs’ ImageFX.

PUT YOUR FRIENDS IN ANY ENVIRONMENT IN ANY POSITION

OpenArt combines numerous great image generation and editing tools into one online program, but what sets it apart is its ability to train a "model" composed of different images that you upload of a friend, a family member, a pet etc. that you can then place into any environment, in any pose, and any style.

You can see it in action in this great video from Bob Doyle at 5:10 to about 20:30: https://www.youtube.com/watch?v=gEjm0Mc1jkc 

 

 

 

 

Ai-Da, a humanoid robot artist, just made history by selling her portrait of Alan Turing for over $1 million at Sotherby's. The painting is below.

2.-Ai-God-Polyptych-by-Ai-Da-Robot

OMNIGEN - REMARKABLE NEW IMAGE EDITOR

omnigen

Imagine being able to say "take the person on the left in image 1 and the middle person in image 2 and have them [whatever you went them to do, wherever you want them to do it]. Or telling it to deblur an image, or add or remove things when combining multiple images or parts of images.

Omnigen can do this and much more. You just tell it what you want it to do to the image and it does it. You can see it in action at https://www.youtube.com/watch?v=PCL9SAlHqzw

And try it out at https://huggingface.co/spaces/Shitao/OmniGen

Warning: Being designed by geeks, it's not the most intuitive, and it can cost you in credits after a while. If you have a powerful enough PC and graphics card, you can install it locally and use it for free with no limits.

RECRAFT AKA RED PANDA

fb68852f-4c99-4ff6-aa79-a50ba8a8aa1e

Another new image generator, Recraft.ai, has appeared and is claiming to be the best, but in all the tests I've seen, while it is actually on a par with the best - Ideogram, MidJourney, Flux etc. - it is not better than them.

It is very good for photorealism and long text, and has similar extra features to some of the others (upscaling, background removal, erasing portions), and it adds vector images, collages, and mockups. There is a free version, so it is definitely worth a try.

RUNWAYML ADDS ADVANCED CAMERA CONTROLS

runway-advanced-camera-control-1456x1202

RunWayML has added advanced camera controls the give ultraprecise, and much easier to use camera controls when you are generating your videos.

You can check these out at https://www.youtube.com/watch?v=0buDtZKLDJ8

WONDERANIMATION

wonderanim

Wonder Dynamics, the folks who enabled us to drop animated CGI characters into our videos, and who I featured in in many of my seminars, have now introduced WonderAnimation, which turns any footage that you shoot into fully rendered 3D animated scenes that you have full post-production control over!

You can literally shoot a scene with any camera, (or phone) in any location, and turn the sequence into an animated scene with CG characters in a 3D environment - even with shots from multiple angles!

You can read about it at https://adsknews.autodesk.com/en/news/autodesk-launches-wonder-animation-video-to-3d-scene-technology/

You can see it in action at https://www.youtube.com/watch?v=xad1ajxln28

CHATGPT SEARCH

Per ChatGPT, it "can now search the web in a much better way than before. You can get fast, timely answers with links to relevant web sources, which you would have previously needed to go to a search engine for. This blends the benefits of a natural language interface with the value of up-to-date sports scores, news, stock quotes, and more.

ChatGPT will choose to search the web based on what you ask, or you can manually choose to search by clicking the web search icon.

Search will be available at chatgpt.com (opens in a new window), as well as on our desktop and mobile apps. All ChatGPT Plus and Team users, as well as SearchGPT waitlist users, will have access today. Enterprise and Edu users will get access in the next few weeks. We’ll roll out to all Free users over the coming months."

ChatGPT also added a much-needed conversation search function at the top left enabling you to search through all your previous conversations.

IDEOGRAM AND MIDJOURNEY ADD IMAGE EDITING

Both ideogram and MidJourney have introduced excellent editing tools for the images you create with them, or that you upload.

id dogs 2

With Ideogram's Canvas, you can upload your own images or generate new ones, then seamlessly edit, extend, or combine them using Magic Fill (inpainting - adding things to the image, like the girl added above) and Image Extending (outpainting) tools. You can also seamlessly combine multiple images into one unified image. Magic Fill allows you to edit specific regions of your images to replace objects, add text, fix imperfections, change backgrounds, and more.

mj cars

With Midjourney, users can upload any image of their choosing and edit sections of it with AI, or change the style and texture of it from the source to something totally different, such as turning a vintage photograph into anime — while preserving most of the image’s subjects and objects and spatial relationships. It also works on doodles and hand drawings that the user submits, turning scribbles into full art pieces in seconds.

Shorts:

RunwayML introduced Act-One, an extraordinary way to add fully controllable facial expressiveness to any face - real or animated - in a video. Instead of trying to explain all that it does, check it out here: https://runwayml.com/research/introducing-act-one

Stability AI released the open source Stable Diffusion 3.5 - with improved photorealism of people and much better rendering of hands.

Alibaba’s MIMO - Alibaba's got a new AI tool called MIMO  that can swap out people in videos using just a single photo reference, and change them into whatever characters you like, doing whatever you wish. It eliminates the need for complicated stuff like multi-camera setups or motion capture.


Leonardo in Canvas
-
Canva has launched Dream Lab which incorporates Leonardo in its text to image creations. The new Dream Lab tool can generate up to 19 different types of graphics, including 3D renders and illustrations, and can also reference other images to fine-tune outputs, making its outputs more reliable. It’s also capable of generating multi-subject images and photorealistic portraits.

HEYGEN INTERACTIVE AVATARS FOR ZOOM

heygen new

HeyGen has introduced an innovative feature that allows users to integrate AI-powered avatars into Zoom meetings, enhancing virtual interactions. These Interactive Avatars can join multiple Zoom sessions simultaneously, operating 24/7, and are designed to look, sound, and behave like the user, making real-time decisions based on provided knowledge bases. https://www.heygen.com/


Key Features:

  • Real-Time Interaction: The avatars engage in dynamic conversations, responding promptly to participants using OpenAI's real-time voice integration. This ensures natural and efficient interactions during meetings.
  • Versatility: Suitable for various applications such as online coaching, customer support, sales calls, and interviews, these avatars can handle repetitive tasks, allowing users to focus on more critical aspects of their work.
  • Personalization: Users can create custom avatars that mirror their appearance and voice, and how they speak, providing a consistent and authentic presence in virtual meetings. Additionally, users can create up to 100 different "looks" for their avatar, enabling variations in backgrounds, outfits, and camera angles to keep the virtual presence engaging and versatile.

While it is definitely getting better all the time, the avatars still look and sound fake to me - almost there, but not quite.

krea new

Image generator Krea - https://www.krea.ai/ - has released a major update where they partnered with some of the top AI video generators to bring multiple video models into Krea. Now you can create videos with MiniMax, LumaLabs, RunwayML, Pika Labs and Kling all in the one place.

They also have real-time image generation, image to video, and can upscale images and videos, as well as animations that morph from one image to another.

adobe new

NEW ADOBE AI TOOLS

At Adobe MAX 2024, Adobe announced many new AI features which include:

Adobe Firefly Video Model (Beta): Adobe expanded its Firefly family of generative AI models to include video, enabling creators to generate videos from text and image prompts. This model is designed to be commercially safe and is integrated into Premiere Pro, offering features like Generative Extend to seamlessly add frames to video clips .

Photoshop Enhancements: Photoshop received several AI-driven updates:

  • Distraction Removal: Automatically identifies and removes elements like people, wires, and poles from images.
  • Generative Workspace (Beta): Allows designers to ideate and iterate concepts simultaneously using generative AI.
  • Substance 3D Viewer (Beta): Enables viewing and editing 3D objects within Photoshop.
  • Premiere Pro Enhancements:  Premiere Pro introduces Generative Extend, allowing editors to seamlessly add frames to video clips using AI.
  • Adobe Express:  Adobe Express introduces new AI capabilities to simplify content creation, such as campaign creation, animation, and one-click brand setup.

NOT DIAMOND

My favorite new GPT is Not Diamond at https://chat.notdiamond.ai

Like Poe, it enables you to try different GPTs (ChatGPT, Claude, Gemini, Perplexity etc.) in the one place, but it does more. Based on what you ask, it chooses the best GPT for your query.

And you can compare the output of different GPTs side by side. And it does image generation, including the new Flux. And it is free.

INSTANT PODCASTS

Google recently enhanced its NotebookLM tool with an experimental Audio Overview feature, turning any collection of sources into a captivating podcast discussion hosted by two AI personalities. The AI-generated dialogue is downloadable, engaging, and tailored for auditory learners, as advertised by Google.

However, the feature goes beyond mere audio playback. The AI hosts display remarkable pacing, tone, and delivery, mimicking the natural flow of a human conversation. It's quite remarkable.

Credit: Lifehacker

FREE YOUTUBE TRANSCRIPTS

Another way to get a free transcript of a YouTube video is to add 3 "t's"after the youtube in the address - e.g. https://www.youtubettt.com/watch?v=cw0UOQd3ZB8 of any YouTube video you're watching.