2024-12-21 [ENG]: How I use GenAI at work: HeyGen

Generative AI continues to reshape the way we work and interact, particularly in creative and communication domains. Predicting the full scope of its impact remains challenging, but adopting these tools is becoming hard to avoid. In this post, I’ll share my experience with HeyGen, a tool I’ve been experimenting with recently. Its capabilities are impressive and have sparked a lot of ideas for practical use. I hope this overview helps others, especially in academic and creative fields, explore its potential.

Today I’m going to write about a tool with enormous potential, one I will certainly be working with a lot in the near future. For now, here are my first experiences of working with HeyGen.

What is HeyGen?

HeyGen (https://heygen.com/) is an AI-powered platform for generating video content using customizable avatars. It enables users to create professional-looking videos by simply typing a script and selecting an avatar. The platform offers support for multiple languages and accents, making it suitable for diverse audiences.

In essence, HeyGen is a virtual studio where you don’t need to endlessly retake recordings, set up cameras, or spend hours editing. It’s all about efficiency, accessibility, and customization. In my opinion, such a solution has enormous potential in the education sector (and beyond).

How do I work with HeyGen?

HeyGen feels like a video creation studio (a simplified version of Camtasia or iMovie) but with some fantastic features. Let’s take a closer look at these features one by one.

HeyGen has an excellent text-to-speech engine. You can input any text into your script, and it will be converted into an audio recording. There’s a wide selection of voice-over options available for virtually every popular language. These voice-overs can be customized by adjusting the speech rate, excitement level, accent, and more. You can even clone your own voice. I must emphasize that the voice synthesis is outstanding—it’s pleasant to listen to, and the variety of voice options is impressive. You can include multiple languages in a single recording and switch between them dynamically. HeyGen can also translate text into another language and generate audio in that language. However, I usually prefer using other tools for translations (I used to rely on DeepL, but lately, I’ve been using ChatGPT more often—more on that in the future).

HeyGen doesn’t just create audio: it also lets you play it back using an avatar. There’s a wide range of avatars to choose from, and you can also create your own from a photo or video. This is an amazing feature, as it allows you to create highly realistic animations from a single image. That image can be a real photo, but it can also be a synthetic character created, for example, using Midjourney.

HeyGen also offers the option to automatically process videos, translate them into another language, and provide realistic lip-syncing. This is the feature I use the least, but I believe it could attract quite a significant user base.

What am I doing with HeyGen?

Here are the two main ways I’ve integrated HeyGen into my workflow:

Create videos to popularize projects or papers:

In my academic work, I often work on sophisticated papers or tackle challenging projects. The documentation for these can easily span hundreds of pages (see, for example, the entire Explanatory Model Analysis monograph). However, sometimes it would be helpful to have a short video introduction to the topic: a brief summary that presents the main findings in just a few minutes.

This is where HeyGen proves to be an excellent solution. We can write the script ourselves or with the help of tools like NotebookLM. An automatic summary usually requires some editing, but that’s much easier than writing the text from scratch. The summary is then converted into audio using HeyGen, while background slides showcase the project’s or article’s results.

For me personally, this is a game-changer. Even though I’ve worked a lot with tools like Camtasia, the most time-consuming part (taking up to 80% of the effort) was always creating the audio layer. HeyGen simplifies this process significantly, which is an incredible help.

Below is an example summary for the PINEBERRY project. It’s one of the first recordings I made, and despite that, it took only two hours to complete, from the initial idea through learning the tool to editing the video. Compared to traditional tools, this time-to-YouTube is truly revolutionary.

Automatic translations:

Both in teaching and in outreach activities, I am always torn between two languages. Many topics relate to “local” issues, so writing about them in Polish feels more natural. On the other hand, when it comes to reaching a broader, more universal audience, English seems to be a much better option.

HeyGen makes it much easier to create content in multiple languages. Switching between languages is quite simple, and in just a few minutes, you can convert a short video from one language to another.

For example, below is an “almost” automatic translation of the above video into Polish.

What am I NOT doing with HeyGen?

So far, there are two things I don’t do with HeyGen: one because it’s not (yet) possible, and the other because it probably wouldn’t work well.

Currently, HeyGen allows you to use a static slide as a background, but you can’t include animations or other videos. This is a significant limitation, as staring at a static screen for an extended period can become quite boring. So, for now, I don’t create recordings with animated backgrounds.

I also don’t make long recordings. While the paid version allows for this, I feel that videos longer than a few minutes would become tedious and drawn-out. Five minutes is still an acceptable length (for me).

And how do you use HeyGen?

Let me know in the comments.