Transcribe Call Workflow Action

The Transcribe Call workflow action in AI Studio for Workflows allows you to automatically transcribe recorded calls and return a text transcript as part of a HubSpot workflow. It is specially designed to leverage HubSpot’s Call-based Workflows, ensuring you can seamlessly insert the transcription step anywhere in your automated call-related processes.

Key Features & Benefits

Multilingual Transcription using OpenAI Whisper (automatically detects and transcribes 50+ languages).
Pause Removal & Audio Optimization to reduce token usage and improve transcription clarity.
HubSpot-Native Integration (no separate platform needed).

Prerequisites

AI Studio for HubSpot installed via the HubSpot Marketplace or ai.resonatehq.com, with your HubSpot portal successfully connected at ai.resonatehq.com.

Setting Up a Call-Based Workflow

In your HubSpot account, go to Automation > Workflows. Click Create workflow (or edit an existing one). Pick the Call-based workflow option (or another type if you intend to trigger the workflow from a different object type).

Ensure your enrollment criteria include a condition such as “Recording URL is known” so that only calls with existing recordings are enrolled.

Adding the “Transcribe Call” Action

Click the + icon in the workflow to add a new action.
Select “AI Studio for Workflows” in the action list.
Choose “Transcribe Call” from the AI Studio actions.

Action Inputs

Call Recording URL: Select or map the property that holds the URL of the recorded call.
Typically, this will be the Call recording URL property that HubSpot logs automatically.
Prompt (optional): Enter a short text prompt to help Whisper understand context or specialized terms.

For example:
The transcript is about Resonate - HubSpot Partner, CRM, apps for HubSpot: AI Studio, OCodeTools, and B2B Data Enrichment
A well-structured prompt can help the model produce clearer, more accurate transcriptions of proper nouns, acronyms, and similar terms. See Prompting Best Practices below.

How it Works

When this action runs:

AI Studio retrieves the audio from the provided recording URL.
It automatically converts the audio and trims silent segments (pauses) to reduce token usage, which helps control costs.
The OpenAI Whisper model processes the audio to generate a text transcript.
The transcript is then output as a variable you can use in subsequent workflow steps.
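
The pause-removal step can be pictured with a small sketch. This is an illustration only, not AI Studio's actual implementation: the function name trim_pauses, the amplitude threshold, and the minimum pause length are all assumptions made for the example. It drops long runs of near-silent samples while keeping short gaps so speech still sounds natural.

```python
from typing import List


def trim_pauses(samples: List[float], threshold: float = 0.02,
                min_pause: int = 4) -> List[float]:
    """Drop runs of quiet samples that are at least `min_pause` long.

    Hypothetical illustration: `threshold` and `min_pause` are assumed
    values, not settings used by AI Studio.
    """
    kept: List[float] = []
    quiet_run: List[float] = []
    for s in samples:
        if abs(s) < threshold:
            quiet_run.append(s)          # possible pause, decide later
            continue
        if len(quiet_run) < min_pause:   # short gap: keep it
            kept.extend(quiet_run)
        quiet_run = []
        kept.append(s)
    if len(quiet_run) < min_pause:       # trailing short gap
        kept.extend(quiet_run)
    return kept


speech = [0.5, 0.4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.3, 0.0, 0.2]
shorter = trim_pauses(speech)
print(len(speech), "->", len(shorter))   # prints: 10 -> 5
```

The five-sample pause is removed and the single quiet sample is kept, so less audio reaches the transcription step.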

Using the Transcription Output

Scenario 1: Save the transcript into Call Notes.

In your workflow, add an “Edit record” or “Set property value” action right after “Transcribe Call”. Set the action output “Transcription” as the value of the Call notes property.

Scenario 2: Use the Send to AI workflow action to process the call.

Add a “Send to AI” step (also part of AI Studio) after the Transcribe Call action.

Choose an AI model, e.g., OpenAI GPT-4o mini. If you aim to process long conversations, Google Gemini is a better option as it has the largest context window.

Cost & Token Usage

Transcription uses 500 tokens per minute of final audio (the audio that remains after pause removal).
Pause removal helps reduce total tokens, which lowers cost for longer calls that contain significant silence.
You can view your AI usage credits in the AI Studio settings to track monthly consumption.
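
The 500 tokens per minute rate makes it easy to estimate usage before running a workflow at scale. The sketch below is a rough calculator, not an official billing formula: the function name estimate_tokens is hypothetical, and prorating by the second and rounding up are assumptions, since the exact rounding rule is not stated here.

```python
import math

TOKENS_PER_MINUTE = 500  # rate stated in this article


def estimate_tokens(final_audio_seconds: float) -> int:
    """Estimate tokens from the length of the final (trimmed) audio.

    Assumption: usage is prorated by the second and rounded up.
    """
    if final_audio_seconds < 0:
        raise ValueError("duration must not be negative")
    minutes = final_audio_seconds / 60
    return math.ceil(minutes * TOKENS_PER_MINUTE)


# A 12-minute call, before and after 3 minutes of silence are removed.
untrimmed = estimate_tokens(12 * 60)
trimmed = estimate_tokens(9 * 60)
print(untrimmed, trimmed)  # prints: 6000 4500
```

In this example, removing three minutes of silence saves 1,500 tokens on a single call.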

Prompting Best Practices

Contextual Prompts: If your calls often mention specific product names or technical acronyms, include them in the prompt (e.g., “GPT-3,” “DALL·E,” “OCodeTools”).

Format: Include punctuation and capitalization in your prompt if you want the transcribed text to do the same.

Split Calls into Segments: For very long calls, you may consider segmenting the audio. Use the transcript of the previous segment as part of the prompt for the next segment to preserve context.

Filler Words & Punctuation: If you want to preserve “um,” “uh,” or other natural language features, add them in the prompt as examples.
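
The segment-chaining tip above can be sketched as follows. The code assumes you already have the call split into segments and some way to transcribe one segment with a prompt; the transcribe callable here is a hypothetical stand-in for that step, not an AI Studio or OpenAI API. Each segment's prompt combines your fixed terms with the transcript of the previous segment.

```python
from typing import Callable, List


def transcribe_in_segments(
    segments: List[bytes],
    transcribe: Callable[[bytes, str], str],
    base_prompt: str = "",
) -> str:
    """Transcribe segments in order, carrying context forward."""
    parts: List[str] = []
    prompt = base_prompt
    for audio in segments:
        text = transcribe(audio, prompt)
        parts.append(text)
        # Next prompt: fixed terms plus the previous segment's transcript.
        prompt = f"{base_prompt} {text}".strip()
    return " ".join(parts)


seen_prompts: List[str] = []


def fake_transcribe(audio: bytes, prompt: str) -> str:
    """Stand-in transcriber that records the prompt it was given."""
    seen_prompts.append(prompt)
    return audio.decode()


result = transcribe_in_segments(
    [b"Hello from Resonate.", b"Let's talk about OCodeTools."],
    fake_transcribe,
    base_prompt="Resonate, OCodeTools",
)
print(seen_prompts[1])  # prompt used for the second segment
```

The fake transcriber only echoes its input, which makes it easy to see that the second segment receives the first segment's transcript as context.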

Language Detection & Multilingual Support

The Whisper model automatically detects the language of the call. No manual switching is required.

Over 50 languages are supported, including English, Spanish, French, Japanese, Arabic, and Chinese.

If the language switches during a single call, the transcription will still do its best to capture the content accurately.

Troubleshooting & FAQs

I'm getting an error when trying to transcribe my Zoom call recordings: "Invalid data. Check if URL can be read without authorization."

The most common cause is a password-protected Zoom link. To resolve this, turn off the following Zoom setting:

Access Zoom Web Portal:
• Navigate to Zoom's web portal and sign in to your account.

Navigate to Settings:
• Click on the Settings option in the left-hand menu.

Select the Recording Tab:
• At the top of the page, choose the Recording tab to view recording-specific settings.

Disable Password Protection:
• Scroll down to find the setting labeled "Require password to access shared cloud recordings."
• Toggle this setting off to prevent future recordings from being password-protected.
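
To check a recording link yourself before running the workflow, you can request it without any login or cookies and look at what comes back. The sketch below is one way to do that with Python's standard library. The function names and the rules in classify_response are assumptions for illustration; they are not the checks AI Studio performs internally.

```python
import urllib.error
import urllib.request


def classify_response(status: int, content_type: str) -> str:
    """Interpret an unauthenticated response to a recording URL."""
    if status in (401, 403):
        return "requires authorization"
    if status != 200:
        return f"unexpected status {status}"
    if content_type.lower().startswith("text/html"):
        # A login or password page is usually HTML rather than audio/video.
        return "returned a web page, not media"
    return "readable"


def check_recording_url(url: str, timeout: float = 10.0) -> str:
    """Request the URL with no credentials and classify the reply."""
    request = urllib.request.Request(url, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            content_type = response.headers.get("Content-Type", "")
            return classify_response(response.status, content_type)
    except urllib.error.HTTPError as err:
        # 4xx/5xx replies arrive as exceptions; only the status matters here.
        return classify_response(err.code, "")
    except urllib.error.URLError as err:
        return f"could not connect: {err.reason}"
```

Call check_recording_url with the value of the Call recording URL property. A result other than "readable" suggests the link is still protected or does not point directly at the media file.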