Runway continues to enrich its platform with tools designed to give creators more control over its AI models. Its co-founder shares the latest advances and explains how these tools are transforming creative practices.
JDN. Runway regularly enriches its platform with new tools. What are the latest additions, and what significant progress have you made in recent months?
Anastasis Germanidis. Although the results obtained with our text-to-video models were impressive and compelling, our users, who are creatives, quickly realized that they could not obtain precise results with text prompts alone. For example, a director may need to create specific interactions between objects or perform very particular camera movements. Our objective is therefore to provide them with as many advanced tools as possible so they can operate our models, Gen-3 Alpha in particular, with greater control.
We currently have around forty tools, some of which let you generate videos from images, control camera movements, and so on. I can also mention the recent additions of Expand Video, which lets you extend a video into a vertical or horizontal format, and Act One, which captures facial expressions and transfers them to another face.
Runway presented its Gen-1 and Gen-2 models in February 2023, then Gen-3 Alpha in June 2024, with considerable progress. When do you plan to introduce Gen-4, and how are you training these models?
Each generation of our base models is trained entirely from scratch. One of the keys, apart from improvements in architecture and algorithms, is the increase in the computational resources used for each model. Regarding Gen-4, there are discussions internally about the next steps to come, but we have nothing concrete to announce at the moment. When you’re doing research, it’s difficult to have very specific deadlines. It’s a process that takes time, and only a fraction of the ideas and projects underway will actually come to fruition. I think we still have a lot to do with Gen-3, especially since we haven’t released everything yet.
Act One is useful for animating a character while preserving the facial expressions of a face from another source video. What uses do you envision for this new tool?
The common misconception around generative AI models was, initially, that the results would all look alike and that everyone would create the same things. In reality, the control tools were not yet there to allow real direction of AI models. Act One is arguably one of the best illustrations of our efforts to give our users more control and precision.
“We launch a new tool approximately every two weeks.”
Controlling human facial expressions is an important part of video storytelling because it allows you to better connect with characters and master certain subtleties. Of course, this is only one step and we still have other areas to improve, especially when it comes to mastering body movements. But Act One already represents a big step forward.
What are your plans in the field of audio, which remains an essential component of filmmaking, for example to add sound effects or dialogue?
We’re thinking about different things, but we’re looking for the right time to offer fully integrated tools. It seemed important to us, initially, to perfect the visual rendering of our AI models and to gradually add control options dedicated to video before adding the audio component. It will come; it’s a matter of time. Eventually, we will introduce complete audio workflows. But it is essential today for our research team to stay focused, especially since video is already a sufficiently complex field. That said, we already offer several audio tools on our platform, allowing you, for example, to have text read by AI, create personalized voices, synchronize audio with a character’s lips, or clean up a file to remove unwanted sounds.
The ability to create a feature film entirely with AI appears more accessible than ever. What time frame do you anticipate for achieving this?
We are actually not far from it anymore. If I had to give an estimate, I would say it should be achievable within one to two years. This doesn’t mean that writing a few sentences will instantly produce a two-hour film: you will still have to work on each shot, using different control tools to refine every shot and scene in order to obtain the desired result. But yes, it will potentially be possible to create a complete film with Runway without having to film anything.
How many people work at Runway, and how is your workforce distributed across your different departments?
We are around 90 people, the majority of whom work in our research center. We try to stay as small as possible because we want to maintain a certain agility. Some members of our creative team were originally video directors or editors. Their role consists, among other things, of testing our models and tools to provide feedback to our researchers, while also producing videos to promote our platform.
“I hope that works generated entirely by AI will win awards at major ceremonies without anyone knowing or wondering whether it is AI.”
Our researchers are all driven by the idea of advancing innovation by publishing papers. But they are also very motivated by the idea of seeing their models used concretely by the community. Last November, we announced the opening of our London office, which has around ten people, mainly dedicated to research.
Last September, you announced a partnership with the production studio Lionsgate. What are your goals, and are you planning similar collaborations with other Hollywood studios?
This partnership, a first of its kind, has two objectives, the first of which concerns the creative aspect. We work closely with Lionsgate’s various teams, including video editors, VFX supervisors, and production teams, to integrate our tools into the workflows of their future film projects. The other aspect is data: we create custom models, based on Gen-3 Alpha, from Lionsgate’s catalog of films and series.
These two parts are interconnected, as custom models trained on this content will perform better for certain use cases specific to Lionsgate. Movie studios are increasingly adopting AI-based technologies. Tools like Runway allow them to move faster while creating ever more spectacular, higher-quality work. We of course remain open to the idea of working with other studios.
How do you think the film industry views AI, and how is your solution perceived by professionals in the sector?
“Runway is above all a platform designed for professional creatives.”
There is a lot of discussion within the creative sectors at large about AI’s impact on the jobs of tomorrow. I often observe that opinions evolve as soon as someone sees content produced with this technology. This is also why we organized our AI Film Festival last May in Los Angeles and New York. Some films presented during this event were also shown at the Tribeca Film Festival, which allowed the artists to reach a new audience from the traditional cinema sector and to receive excellent feedback. When we watch a film, we don’t spend time wondering what techniques or cameras were used. I think it will be the same with AI. I hope that works generated entirely by AI will win awards at major ceremonies without anyone knowing or wondering whether it is AI.
You are of Greek origin and therefore a European citizen. What is your view of Europe’s place in the AI race and of the impact of the AI Act, the European regulation governing this technology?
There is currently a lot of discussion around AI, but we must understand that we are still at the beginning. The uses of models, tools, and products integrating AI will be very different in five years from those of today. There are currently many projections about the potential risks of AI, but I think it’s risky to extrapolate that far. Of course, it is essential to put in place safeguards and appropriate regulation, for example to ensure that models are deployed safely. But trying to anticipate such distant potential risks, which may never materialize, seems to me a mistake. It is important to remain pragmatic and not get too far ahead of the facts.