Caption 1.0 has shipped, and we couldn’t be more excited to share the new features in this release!
One of our main value props, and a key differentiator from competing products, is that we’re able to extract meaning from customers’ raw materials. We started down this route by providing keyword extraction from transcripts. Apart from being a cool feature in and of itself, it’s an enormously valuable one: broadcasters, one of our main customer segments, lose hours and hours of productivity due to lousy metadata about their materials. While the road is long, and much investment in NLP will still be needed, we’ve taken decisive steps in this direction. Keywords are available both through the dashboard and through the API. Example:
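To make the API side concrete, here’s a minimal sketch of consuming a keywords response. The endpoint path and field names (`media_id`, `keywords`, `score`) are illustrative assumptions, not the documented Caption API; only the general shape, a list of scored keywords per media item, is what matters.

```python
import json

# Hypothetical response from an (assumed) endpoint like
# GET /v1/media/{id}/keywords -- path and fields are illustrative only.
sample_response = json.loads("""
{
  "media_id": "abc123",
  "keywords": [
    {"text": "election", "score": 0.92},
    {"text": "turnout", "score": 0.87}
  ]
}
""")

def top_keywords(response, min_score=0.5):
    """Return keyword strings above a relevance threshold, best first."""
    hits = [k for k in response["keywords"] if k["score"] >= min_score]
    return [k["text"] for k in sorted(hits, key=lambda k: k["score"], reverse=True)]

print(top_keywords(sample_response))  # ['election', 'turnout']
```

The threshold-and-sort step is where you’d tune how aggressive the extracted metadata should be for your catalog.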
The early beta version of Caption required customers to host their files in a system like Amazon S3 or similar (files needed to be fetchable with a simple HTTP GET). This caused a lot of inconvenience, which is why we expanded the ingest options to cover YouTube. The process is really simple: find the YouTube URL of your video and paste it into the Caption dashboard, or pass it to the API. The rest of the flow is exactly the same as before.
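For the API route, the request could look something like the sketch below. The `POST /v1/ingest` endpoint name and the `source`/`url` fields are assumptions for illustration; the point is simply that a YouTube URL is detected and submitted like any other source.

```python
import json
from urllib.parse import urlparse

def build_ingest_request(url):
    """Build the JSON body for a hypothetical POST /v1/ingest call.

    The endpoint and field names are illustrative, not the documented API.
    YouTube URLs are tagged so the backend knows to fetch via YouTube
    rather than a plain HTTP GET.
    """
    host = urlparse(url).netloc
    source = "youtube" if ("youtube.com" in host or "youtu.be" in host) else "http"
    return {"source": source, "url": url}

body = build_ingest_request("https://www.youtube.com/watch?v=VIDEO_ID")
print(json.dumps(body))
```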
Needless to say, this is not the end of our effort to cover as many ingestion options as possible and provide the best possible customer experience. On the contrary: we’re planning integrations with Google Drive and Microsoft OneDrive, as well as NFS systems.
We pride ourselves on our ability to make customers’ audio and video collections searchable, and we achieve this by analyzing timestamps inside the files. For customers who use Caption to process raw materials (such as TV producers working on news segments), a related feature is of immense importance: automatically generating a subtitle (SRT) file along with the timestamps. We now allow this with just one click. To give you a taste, here is the SRT file for the video above:
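If you’re curious how timestamped segments map onto the SRT format, here’s a small illustrative sketch (the segment texts are made up, not taken from any real video): each cue gets an index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, and the text.

```python
def to_srt_time(seconds):
    """Format a time in seconds as the SRT timestamp HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(segments):
    """Render (start_sec, end_sec, text) segments as an SRT document."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, start=1):
        blocks.append(f"{i}\n{to_srt_time(start)} --> {to_srt_time(end)}\n{text}\n")
    return "\n".join(blocks)

print(to_srt([
    (0.0, 2.5, "Good evening."),
    (2.5, 6.0, "Here is tonight's top story."),
]))
```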
This is definitely one of the coolest features we’ve developed, and it builds on the same keyword extraction algorithms. We enable customers to find audio and video files that are similar to the current one. Here’s a quick example from our podcast use case:
And keep in mind, the coolest thing about this is not viewing it from the dashboard! In fact, you can use it to build a recommender for your own end users via the Caption API.
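A minimal recommender on top of a similarity response could be as simple as the sketch below. The endpoint shape (say, `GET /v1/media/{id}/similar` returning scored matches) and all field names are assumptions for illustration; the sample data is made up.

```python
# Hypothetical similarity response for one podcast episode; the endpoint,
# fields, and titles are illustrative, not real API output.
sample_similar = {
    "media_id": "ep-041",
    "similar": [
        {"media_id": "ep-017", "title": "Interview special", "score": 0.91},
        {"media_id": "ep-029", "title": "Listener mailbag", "score": 0.78},
        {"media_id": "ep-003", "title": "Pilot episode", "score": 0.40},
    ],
}

def recommend(similar_response, limit=2, min_score=0.5):
    """Pick the top-scoring similar items to show an end user."""
    hits = [s for s in similar_response["similar"] if s["score"] >= min_score]
    hits.sort(key=lambda s: s["score"], reverse=True)
    return [s["title"] for s in hits[:limit]]

print(recommend(sample_similar))  # ['Interview special', 'Listener mailbag']
```

In a real integration you’d call this after each playback event and surface the titles as “you might also like” suggestions.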
We now provide information about which speaker utters which part of the text. And to make it even neater, you can easily rename the labels from the default Speaker X! This also helps keep the transcript files organized. Here’s an example:
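As a rough sketch of what renaming looks like programmatically (the diarized-output shape, a list of `(label, text)` pairs, is an assumption for illustration):

```python
def rename_speakers(utterances, names):
    """Replace default 'Speaker N' labels with human-readable names.

    Labels without a mapping are kept as-is.
    """
    return [(names.get(speaker, speaker), text) for speaker, text in utterances]

# Made-up diarized transcript with the default labels.
transcript = [
    ("Speaker 1", "Welcome back to the show."),
    ("Speaker 2", "Thanks for having me."),
]

for speaker, text in rename_speakers(transcript, {"Speaker 1": "Host", "Speaker 2": "Guest"}):
    print(f"{speaker}: {text}")
```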