FAQ

The most frequently asked questions are answered below. If your question is not covered here, feel free to reach out to us at [email protected].

What is Captioner?

Captioner is AI speech-to-text software that specializes in transcribing videos. You can use it to generate subtitles for your videos. Captioner converts video files to text in 98+ languages with high accuracy, and it aligns the subtitles closely with the speech in the video.

What is the difference between Express and Unlimited plans?

Captioner Express is geared towards creators who need to quickly add subtitles without much editing, or quickly add subtitle translations to existing videos. Captioner Unlimited is for creators who need more editing control over the subtitles and want to integrate the subtitles into their video editing workflow.

How much does it cost?

Captioner Express costs $5/month when billed yearly ($60 per year) or $10/month when billed monthly. Captioner Unlimited costs $10/month when billed yearly ($120 per year) or $20/month when billed monthly.

Is Captioner really unlimited?

Yes, Captioner is truly unlimited. You can upload as many videos as you want, and each video can be up to 3 hours long. There are no hidden fees or extra charges. The only thing we ask is that you don't share the account with others.

Can I upload large files?

You can submit videos up to 3 hours long, but the file size limit is currently 2GB. If your file is larger, shrink it to a lower resolution (480p) and compress it before uploading. Edit the subtitles on Captioner, export them in SRT / VTT format, and then import them into your video editing software before rendering.
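As a rough illustration of the shrinking step, the Python sketch below builds an ffmpeg command that downscales a video to 480p and re-encodes it with H.264. It assumes ffmpeg is installed separately; the function names, the CRF value of 28, and the reading of "2GB" as 2 * 1024**3 bytes are illustrative assumptions, not part of Captioner.

```python
import shlex

# Assumption: "2GB" is read as 2 * 1024**3 bytes; the FAQ does not specify.
MAX_UPLOAD_BYTES = 2 * 1024**3


def exceeds_upload_limit(size_bytes: int) -> bool:
    """Return True if a file of this size is over the assumed 2GB limit."""
    return size_bytes > MAX_UPLOAD_BYTES


def build_shrink_command(src: str, dst: str, height: int = 480, crf: int = 28) -> list[str]:
    """Build an ffmpeg command that downscales and compresses a video.

    scale=-2:<height> keeps the aspect ratio and rounds the width to an
    even number, which the H.264 encoder requires.
    """
    return [
        "ffmpeg",
        "-i", src,
        "-vf", f"scale=-2:{height}",
        "-c:v", "libx264",
        "-crf", str(crf),  # higher CRF means a smaller file and lower quality
        "-c:a", "aac",
        dst,
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    command = build_shrink_command("talk.mov", "talk_480p.mp4")
    print(shlex.join(command))
```

The printed command can be pasted into a terminal, or passed to subprocess.run(command, check=True) to run it directly from Python.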

What video formats do you support?

Captioner currently supports MP4 and MOV file uploads. We recommend using MP4 files for the best results. If you have a different format, you can convert it to MP4 using a free online converter.

How long does it take to transcribe a video?

We are not the fastest, because we optimize for accuracy so that you don't have to spend much time editing afterwards. An average 30-minute video usually takes around 5-6 minutes to transcribe.

How long does it take to render a video?

Rendering is the most time-consuming step in the process. A 5-minute video usually takes about 5-10 minutes to render.

The processing is taking too long! I have waited for more than 30 minutes!

We apologize for the delay. Please reach out to us using the chat bubble at the bottom right of the dashboard, and we will be happy to help speed things up.

What formats can I export the subtitles in?

You can export the subtitles in SBV, SRT and VTT formats. These are the most common formats that are supported by most video editing software.
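If you have an SRT file but your tool only accepts VTT, the two formats are close enough to convert by hand. The Python sketch below is a minimal example, not Captioner's own exporter: it adds the WEBVTT header and switches the millisecond separator on timing lines from a comma to a period. The function name srt_to_vtt is hypothetical, and styling or positioning tags are not handled.

```python
import re

# Matches an SRT timestamp such as 00:00:01,000 (comma before milliseconds).
_SRT_TIMESTAMP = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert SRT subtitle text to WebVTT text."""
    converted = []
    for line in srt_text.strip().splitlines():
        # Only rewrite timing lines, so commas in the subtitle text are untouched.
        if "-->" in line:
            line = _SRT_TIMESTAMP.sub(r"\1.\2", line)
        converted.append(line)
    # A WebVTT file starts with the "WEBVTT" header followed by a blank line.
    return "WEBVTT\n\n" + "\n".join(converted) + "\n"


if __name__ == "__main__":
    sample = "1\n00:00:01,000 --> 00:00:04,000\nHello, world\n"
    print(srt_to_vtt(sample))
```

The numeric counters from the SRT file are kept, since WebVTT accepts them as optional cue identifiers.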

What languages do you support?

Captioner converts speech to text in over 98 languages using high-accuracy AI transcription technology. Accuracy varies by language: languages like English and Chinese are the most accurate, and we have specifically fine-tuned our Whisper model to support Cantonese, a dialect of Chinese. Feel free to try it out and see if it meets your needs. You can check out the full list of supported languages in the documentation.

What languages do you offer translations for?

Captioner currently offers translations from English to Mandarin Chinese on the Captioner Express plan. We are working hard to add support for more translation languages on both the Express and Unlimited plans.

Can I edit the subtitles?

If you subscribe to the Unlimited plan, you can edit the subtitles directly on Captioner once transcription is done. You can change the text, adjust the timing, and add or remove subtitles. You can also export the subtitles and import them into your video editing software for further editing. We are working on adding a simple way for you to make quick edits to the subtitles on the Free and Express plans.

Do I need to mute the background music in my video?

Clean speech audio gives the best transcription results, but you don't have to mute the background music in your video. Captioner is designed to transcribe speech even with background noise: a built-in voice extraction process separates the speech from the background noise, which helps keep the subtitles as accurate as possible.

How do I cancel the subscription?

You can cancel your subscription at any time by clicking "Billing" in the dashboard. You'll have full access to Captioner through the end of the current billing period.

Who is behind Captioner?

Hey! I am Simon. I spent over a decade building AI systems at companies like Amazon. Now I'm building products and services like Captioner.