Making Sense of Tech

 

AI video generators have made tremendous progress in recent months - you have most likely noticed this through posts in your LinkedIn timeline featuring 5- to 10-second clips. But there is a difference between creating short clips that collect “likes” and generating videos for your business that customers will relate to.

Anyone who has worked on drawing attention to their products and services knows that great images help. In social networks like LinkedIn, the trend has moved from images to short videos in recent years. As creating high-quality videos can be time-consuming and costly, the temptation to leverage generative AI technologies is strong.

What are the limitations of today's AI video generators? The most relevant ones concern consistency and realistic physics.

Limitations

Consistency of objects: the more objects or persons are part of a video, the harder it is for the AI to render them consistently throughout the whole scene. As most AI video generators are limited to clips of no more than 5 or 10 seconds, creating longer videos means stitching together separate clips - and every transition between clips is another place where objects and persons can look different.
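To get a feeling for how much stitching a longer video involves, here is a minimal Python sketch. It assumes a 10-second limit per clip, as mentioned above, and simply splits a target length into equally long clips. The function name `plan_clips` and its parameters are made up for this illustration and do not belong to any video generator.

```python
import math


def plan_clips(total_seconds: float, max_clip_seconds: float = 10.0) -> list[float]:
    """Split a target video length into equally long clips, none above the limit."""
    if total_seconds <= 0 or max_clip_seconds <= 0:
        raise ValueError("durations must be positive")
    # Smallest number of clips that keeps every clip within the limit.
    n_clips = math.ceil(total_seconds / max_clip_seconds)
    return [total_seconds / n_clips] * n_clips


clips = plan_clips(45)  # a 45-second video, assumed for illustration
print(len(clips), "clips,", len(clips) - 1, "stitch points")
# prints "5 clips, 4 stitch points"
```

Each stitch point is a transition where consistency has to be checked by hand, so the number grows quickly with the length of the video.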

Scene consistency: generating an image of persons in front of a great background, and then turning it into a video, is possible today. You have good control over how foreground and background look. However, once persons move in your video, the background scenery needs to change accordingly, and you lose control over the details and quality of the background. This limits your ability to create dynamic videos while keeping quality high.

Quality degradation for persons: persons can be depicted at great quality with many details. As the seconds in a video pass, however, this quality tends to decrease. Even though the overall quality of the video might still be good, persons will start to look a bit “off”.

Control over objects: in business scenarios, it is not enough to depict objects that look similar to your product - they must look exactly like it. This generally means that you need to fine-tune the AI on this very object, which is possible today, even for non-developers. The more objects and persons you want to control, however, the harder it becomes to reach a satisfying level of quality.

Realistic physics and interaction: when people move in a video, cups fall to the ground, or an actor's hair blows in the wind, it must look realistic and physically correct. AI video generators have made progress in this respect. But if you want to control specific interactions - for example, having a person take a cup out of a cupboard or open a fridge - the chances rise that the result will look weird or even physically incorrect. This adds to the challenge of controlling objects: ensuring that objects and persons look a certain way, and then also controlling how they interact, is very difficult.

Bias in how persons and scenery appear: this depends strongly on the data on which the AI has been trained. If the AI has learned its capabilities mostly from Western movies or web clips, it might struggle with Asian or African settings, or with depicting specific business environments. Which video generators work best for you? You need to find out yourself through trial and error.

Summary

How should you get started today? Here is what is realistic with today's technologies:

  • Limit your clips to no more than 10 seconds, if possible.
  • Reduce the number of persons and objects you need to control.
  • Avoid changing background scenery.
  • Try out different video generators - quality will vary with regard to physical correctness, image quality, natural facial expressions, etc.
Prominent AI video generators you can try out: Google Veo-2, OpenAI Sora, Alibaba Qwen, Runway Gen-3, Hailuo AI, and many more.
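If you want to make this trial-and-error a bit more systematic, you can rate each generator on the criteria named above and compare the averages. The following is only a sketch: the 1-to-5 scale, the criteria keys, the function name `rank_generators`, and the generator names are assumptions for illustration, and the ratings are placeholders, not measured results.

```python
# Criteria taken from the list above; the 1-5 rating scale is an assumption.
CRITERIA = ("physical_correctness", "image_quality", "facial_expressions")


def rank_generators(ratings: dict[str, dict[str, int]]) -> list[tuple[str, float]]:
    """Rank generators by their average rating across all criteria, best first."""
    scores = {
        name: sum(rating[c] for c in CRITERIA) / len(CRITERIA)
        for name, rating in ratings.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Placeholder ratings for two hypothetical generators, for illustration only.
ratings = {
    "generator_a": {"physical_correctness": 3, "image_quality": 5, "facial_expressions": 4},
    "generator_b": {"physical_correctness": 4, "image_quality": 3, "facial_expressions": 2},
}

for name, score in rank_generators(ratings):
    print(f"{name}: {score:.1f}")
```

Rating the same prompt across all generators keeps the comparison fair. If one criterion matters more for your business, for example physical correctness for product demos, you could weight it higher.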

Many tech companies are working on overcoming these limitations. We will likely be able to create more complex and compelling videos within the next 12 to 24 months. It is therefore worth keeping track of how innovations in video generation expand its capabilities.

 

 

Imprint        © Dominik Hörndlein 2025, all rights reserved.