FAQ
Frequently asked questions are answered below. If your question is not covered here, feel free to reach out to us at support@captioner.io.
What is Captioner?
Captioner is AI speech-to-text software built for transcribing videos to text. You can use it to generate subtitles for your videos. Captioner converts video files to text in 98+ languages with high accuracy, and it aligns the subtitles closely with the speech in the videos.
How much does it cost?
Captioner has a single unlimited plan that costs $10/month when billed yearly or $20/month when billed monthly.
Is Captioner really unlimited?
Yes, Captioner is truly unlimited. You can upload as many videos as you want, and each video can be up to 3 hours long. There are no hidden fees or extra charges. The only thing we ask is that you don't share the account with others.
Can I upload large files?
You can submit videos up to 3 hours long, but the file size limit is currently 2GB. If your file is larger, shrink it to a lower resolution (480p) and compress it before uploading. You can then edit the subtitles on Captioner, export them in SRT or VTT format, and import them into your video editing software before rendering.
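As a rough sketch of the shrinking step described above, the Python helper below builds an ffmpeg command that scales a video down to 480p and re-encodes it. It assumes ffmpeg is installed on your machine. The function names, the file names, and the CRF value of 28 are illustrative choices, not settings recommended by Captioner.

```python
import subprocess


def build_downscale_command(src, dst, height=480, crf=28):
    """Return an ffmpeg argument list that shrinks `src` into `dst`."""
    return [
        "ffmpeg",
        "-i", src,                    # input video
        "-vf", f"scale=-2:{height}",  # target height; width keeps the aspect ratio (rounded to an even number)
        "-c:v", "libx264",            # re-encode the video as H.264
        "-crf", str(crf),             # higher CRF means a smaller file and lower quality
        "-c:a", "aac",                # re-encode the audio as AAC
        dst,
    ]


def downscale(src, dst):
    """Run the conversion. Requires ffmpeg on the PATH."""
    subprocess.run(build_downscale_command(src, dst), check=True)


if __name__ == "__main__":
    # Hypothetical file names, printed only so the command can be inspected.
    print(" ".join(build_downscale_command("talk.mov", "talk_480p.mp4")))
```

If the result is still above 2GB, raising the CRF value shrinks the file further at the cost of picture quality, which matters little here because the upload is only used for transcription.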
What video formats do you support?
Captioner currently supports MP4 and MOV file uploads. We recommend using MP4 files for the best results. If you have a different format, you can convert it to MP4 using a free online converter.
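The upload limits mentioned in this FAQ (MP4 or MOV, up to 2GB, up to 3 hours) can be checked before uploading with a small script. This is only a sketch: the function name is made up for illustration, the duration has to be supplied by you, and reading "2GB" as 2,000,000,000 bytes is an assumption, chosen because it is the stricter interpretation.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".mp4", ".mov"}
MAX_BYTES = 2 * 1000**3    # assumption: "2GB" read as 2,000,000,000 bytes
MAX_SECONDS = 3 * 60 * 60  # 3 hours


def upload_problems(filename, size_bytes, duration_seconds):
    """Return a list of reasons the file may be rejected (empty if none)."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or 'no extension'}")
    if size_bytes > MAX_BYTES:
        problems.append("file is larger than 2GB")
    if duration_seconds > MAX_SECONDS:
        problems.append("video is longer than 3 hours")
    return problems


if __name__ == "__main__":
    # Hypothetical example: a 45-minute MKV file of about 2.5GB.
    print(upload_problems("lecture.mkv", 2_500_000_000, 45 * 60))
```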
How long does it take to transcribe a video?
We are not the fastest, because we optimize for accuracy so that you don't have to spend much time editing afterwards. For an average 30-minute video, transcription usually takes around 5-6 minutes.
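For a rough idea of what this means for other video lengths, the sketch below scales the 30-minute figure linearly. Linear scaling is an assumption made for illustration only; actual processing time may not grow in proportion to video length, and the function name is hypothetical.

```python
def estimate_transcription_minutes(video_minutes):
    """Scale the "30-minute video takes 5-6 minutes" figure linearly.

    Returns a (low, high) estimate in minutes. This is a rough guess,
    not a guarantee.
    """
    low = video_minutes * 5 / 30
    high = video_minutes * 6 / 30
    return low, high


if __name__ == "__main__":
    # A video at the 3-hour limit, under the linear assumption.
    print(estimate_transcription_minutes(180))
```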
What formats can I export the subtitles in?
You can export the subtitles in SBV, SRT and VTT formats. These are among the most common subtitle formats and are supported by most video editing software.
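SRT and VTT are both plain-text formats and differ only slightly for simple subtitles: a VTT file starts with a WEBVTT header line and uses a period instead of a comma before the milliseconds in each timestamp. If you ever need to convert between them yourself, the minimal sketch below handles that basic case. It is an illustration, not part of Captioner, and it ignores styling and positioning features.

```python
import re

# Matches an SRT timestamp such as 00:01:02,345
_SRT_TIMESTAMP = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text):
    """Convert basic SRT subtitle text to WebVTT text."""
    lines = []
    for line in srt_text.strip().splitlines():
        if "-->" in line:
            # Timing line: swap the comma before the milliseconds for a period.
            line = _SRT_TIMESTAMP.sub(r"\1.\2", line)
        lines.append(line)
    # A WebVTT file must begin with the WEBVTT header followed by a blank line.
    return "WEBVTT\n\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = "1\n00:00:01,000 --> 00:00:04,500\nHello and welcome.\n"
    print(srt_to_vtt(sample))
```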
What languages do you support?
Captioner converts speech to text in over 98 languages using high-accuracy AI transcription technology. Languages like English and Chinese are the most accurate, and we have specifically fine-tuned our Whisper model to support Cantonese, a dialect of Chinese. Speech-to-text accuracy varies by language, however, so feel free to try it out and see if it meets your needs. You can check out the full list of supported languages in the documentation.
Can I edit the subtitles?
Yes. Once transcription is complete, you can edit the subtitles directly on Captioner. You can change the text, adjust the timing, and add or remove subtitles. You can also export the subtitles and import them into your video editing software for further editing.
Do I need to mute the background music in my video?
Clean speech audio gives the best transcription results, but you don't have to mute the background music in your video. Captioner is designed to transcribe speech even with background noise: a built-in voice extraction process separates the speech from the background noise before transcription, which keeps the subtitles as accurate as possible.
How do I cancel the subscription?
You can cancel your subscription at any time by clicking "Billing" in the dashboard. You'll have full access to Captioner through the end of the current billing period.
Who is behind Captioner?
Hey! I am Simon. I spent over a decade building AI systems at companies like Amazon. Now I'm building products and services like Captioner.