Metadata, Transcription, & Tagging

Unlock the spoken word in your video and audio content

At the foundation of RAMP’s solutions is MediaCloud, a proprietary, patented technology for generating time-coded text transcripts and metadata from video content. RAMP’s solution delivers a complete time-coded transcript, tag set, and dynamic thumbnailing “thumbprint” for your video assets. The rich metadata created by MediaCloud makes otherwise inaccessible spoken-word content searchable. This metadata can be pushed back into your CMS, to YouTube, and to other workflows to help you make better use of that content, making it more discoverable, more engaging, and more actionable.

Core Capabilities

Powerful Content Ingestion & Workflow

Content to be enriched with metadata can be provided to RAMP via feeds, streams, uploads, a RESTful API, or live capture. Content can also be extracted directly from popular video content management systems such as Brightcove and thePlatform.

Automated Transcription & Tagging

MediaCloud creates the metadata for your videos automatically. Our cloud-based platform provides:

  • Time-coded transcripts of video & audio assets, produced with automated speech-to-text technology.
  • Natural language processing to extract meaning from transcripts and metadata. Tags can be generated from RAMP’s global dictionaries or from your custom dictionaries.
  • Audio text alignment (“ATA”), which lets you paste existing scripts or transcripts into the console and have MediaCloud time-code them automatically.
  • Automatic video thumbnailing and scene segmentation.
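
As a rough illustration of why time-coded transcripts matter for search, the sketch below looks up a spoken phrase and returns the times at which it occurs. The `Word` record, the `find_phrase` helper, and the sample data are hypothetical and are not RAMP's or MediaCloud's actual data model or API; they only assume that each transcribed word carries a start and end time in seconds.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One transcribed word with its time codes (hypothetical structure)."""
    text: str
    start: float  # seconds from the start of the asset
    end: float


def find_phrase(words, phrase):
    """Return the start time of every occurrence of `phrase` in the transcript."""
    target = [t.lower() for t in phrase.split()]
    if not target:
        return []
    # Normalize case and strip trailing punctuation so "call." matches "call".
    tokens = [w.text.lower().strip(".,!?") for w in words]
    hits = []
    for i in range(len(tokens) - len(target) + 1):
        if tokens[i:i + len(target)] == target:
            hits.append(words[i].start)
    return hits


# Made-up sample transcript for illustration only.
transcript = [
    Word("Welcome", 0.0, 0.4),
    Word("to", 0.4, 0.5),
    Word("the", 0.5, 0.6),
    Word("quarterly", 0.6, 1.1),
    Word("earnings", 1.1, 1.6),
    Word("call.", 1.6, 2.0),
    Word("Earnings", 2.5, 3.0),
    Word("rose", 3.0, 3.3),
]

print(find_phrase(transcript, "earnings"))
print(find_phrase(transcript, "quarterly earnings"))
```

Because every hit comes back as a time offset, a player could jump straight to the moment the phrase is spoken rather than only identifying the asset that contains it.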

Human Transcription & Translation

RAMP also offers integrated human transcription and translation. You can specify turnaround time requirements and receive transcripts in a variety of output formats compatible with popular video platforms, including YouTube, Brightcove, thePlatform, and more.

Image Detection

RAMP’s facial recognition automatically identifies faces in a frame, time-stamps each appearance and its duration, and matches the faces against a predefined list of people. This capability can be extended to objects as well.
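
To make the idea of time-stamped appearances and durations concrete, here is a minimal sketch that groups per-frame detection times into continuous appearances. It assumes a detector has already reported the times (in seconds) at which a person was matched; the `merge_appearances` function, the `max_gap` threshold, and the sample data are illustrative assumptions, not a description of how RAMP implements this.

```python
def merge_appearances(timestamps, max_gap=1.0):
    """Group detection times into (start, duration) appearances.

    Detections no more than `max_gap` seconds apart are treated as
    part of the same on-screen appearance.
    """
    appearances = []
    for t in sorted(timestamps):
        if appearances and t - appearances[-1][1] <= max_gap:
            appearances[-1][1] = t        # extend the current appearance
        else:
            appearances.append([t, t])    # start a new appearance
    return [(start, end - start) for start, end in appearances]


# Made-up detection times per matched person, for illustration only.
detections = {
    "Person A": [0.0, 0.5, 1.0, 7.0, 7.5],
    "Person B": [3.0],
}

for name, times in detections.items():
    print(name, merge_appearances(times))
```

A single isolated detection yields a zero-length appearance here; a real system would likely add the sampling interval to each duration.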

Sentiment Analysis

Better understand the emotional tone of video, audio, or text content with RAMP sentiment analysis. Sentiment is defined on named entities and themes that you control. Sentiment can also be determined for specific time-coded segments.

On-Premise Option for Speech-to-Text Transcription

RAMP licenses its speech-to-text transcription software as an on-premise solution for companies that need to transcribe audio or video assets in-house. Contact RAMP to learn more.

Web Captioning & CVAA Compliance

For companies in the media space, video tags, transcripts, and captions are more important than ever for complying with the 21st Century Communications and Video Accessibility Act (CVAA). RAMP has developed the first end-to-end solution for deploying comprehensive web closed captioning, and offers three approaches that can be used in combination based on your needs:

  1. Broadcast Capture and Alignment
  2. Automated Caption Alignment
  3. Human Transcription
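
Whichever approach produces the captions, the result is a set of time-coded text cues that a web player can display. As a minimal sketch, the code below renders such cues in WebVTT, a common web caption format. The source does not say which caption formats RAMP emits, so the choice of WebVTT, the function names, and the sample cues are assumptions made for illustration.

```python
def vtt_timestamp(seconds):
    """Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(cues):
    """Render (start, end, text) cues as the text of a WebVTT file."""
    lines = ["WEBVTT", ""]
    for start, end, text in cues:
        lines.append(f"{vtt_timestamp(start)} --> {vtt_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line separates cues
    return "\n".join(lines)


# Made-up cues for illustration only.
cues = [
    (0.0, 2.0, "Welcome to the quarterly earnings call."),
    (2.5, 4.0, "Earnings rose this quarter."),
]

print(to_webvtt(cues))
```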