How to Generate Podcasts with the Podhoc API: A Complete Walkthrough

Generate AI podcasts programmatically with the Podhoc REST API. Authentication, the create-poll-download lifecycle, rate limits, credit estimation, and production-ready Python and Node.js examples.

To generate a podcast with the Podhoc API, send a POST request to https://api-ext.podhoc.com/v1/podcasts with your X-Api-Key header, the source URLs, a target duration between 1 and 120 minutes, and an optional style and voice configuration. The endpoint returns a podcast_id that you poll on /v1/podcasts/{id}/status until completion (usually 2-5 minutes), then call /v1/podcasts/{id}/download to fetch a presigned MP3 URL valid for one hour.

The API is designed for a single create-poll-download lifecycle. There are no streams, no callbacks, no webhooks (yet). Authentication is a single static header. If you have ever called a REST API, you can integrate Podhoc in under an hour.


What the API is for

The Podhoc API exposes the same generation pipeline that powers the web app, reduced to a small set of primitives:

  • Programmatic podcast creation from public URLs.
  • Cost estimation before you spend credits.
  • Account introspection — credit balance, usage history, current tier.
  • Lifecycle management — status polling and presigned download.

The pipeline behind those primitives is the same five-stage system described in What Is an AI Podcast?: ingestion, understanding, audio reformatting, format choice, and voice synthesis. The API simply makes it scriptable.

Common use cases:

  • A SaaS product that turns customer-uploaded URLs into onboarding audio.
  • An internal tool that converts weekly newsletters into commute-ready episodes.
  • A learning platform that automatically generates audio versions of new course modules.
  • A research workflow that synthesises a set of papers into a single 30-minute briefing.

For more, see API integration ideas.


Step 1 — Provision a token

API access is gated to the Pro plan (€29/month, 3500 credits) and above. Once you have upgraded, head to app.podhoc.com/account/api-access and create a token.

There are two flavours:

  • Test tokens — prefix phk_test_…, cheaper (1.5x credit multiplier), restricted feature set. Use these during development and CI integration tests.
  • Production tokens — prefix phk_prod_…, full feature set, 2.5x credit multiplier.

Treat tokens like passwords. Store them in a secrets manager (AWS Secrets Manager, HashiCorp Vault, Doppler) and never commit them to source control. The API rejects any request carrying a leaked or revoked token with a 401 UNAUTHORIZED response.
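In application code, read the key from the environment and fail fast if it is missing or malformed. A minimal sketch (require_token is our own helper; the prefix check simply mirrors the phk_test_/phk_prod_ convention described above):

```python
import os


def require_token() -> str:
    """Fetch the API key from the environment and fail fast on obvious misconfiguration."""
    token = os.environ.get("PODHOC_API_KEY", "")
    if not token:
        raise SystemExit("PODHOC_API_KEY is not set; create a token at app.podhoc.com/account/api-access")
    if not token.startswith(("phk_test_", "phk_prod_")):
        raise SystemExit("PODHOC_API_KEY does not look like a Podhoc token (expected phk_test_/phk_prod_ prefix)")
    return token
```

Failing at startup, with a message that says where to provision a token, beats a confusing 401 deep inside a request loop.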

Authentication is a single header on every request:

X-Api-Key: phk_prod_a1b2c3d4e5f6...

The base URL is https://api-ext.podhoc.com/v1. All endpoints return JSON with a success boolean, a data object on success, an error object on failure, and a meta object with request_id plus credit-related fields. The envelope is borrowed from the same conventions Anthropic and other modern AI providers follow — predictable structure, predictable error handling.


Step 2 — Estimate the cost first

Before calling the create endpoint, ask the API how much it will charge. The estimate endpoint is free and lets you implement spending controls cleanly.

curl "https://api-ext.podhoc.com/v1/estimate-cost?duration_minutes=30&source_count=2&voice_count=2" \
  -H "X-Api-Key: $PODHOC_API_KEY"

The response breaks down the cost so you can apply your own policy logic:

{
  "success": true,
  "data": {
    "base_credits": 114,
    "credit_multiplier": 1.5,
    "final_credits": 171,
    "breakdown": {
      "base_cost": 75,
      "multi_source_bonus": 20,
      "custom_weights_bonus": 0,
      "voice_multiplier": 1.2,
      "subtotal_before_cap": 114,
      "cap_applied": null,
      "tier_max_cost": 500
    },
    "formula": "max(30, ceil(30 x 2.5)) + 20 x 1.2 = 114 x multiplier = 171"
  }
}

The pricing formula has a 500-credit cap per request, applied after voice multipliers, multi-source bonuses, and custom-weight bonuses. Use the breakdown to surface a “this episode will cost N credits” preview to your end users before they confirm.
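For an instant preview with no round trip, you can mirror the documented formula client-side. This is a sketch under the assumptions stated in this article (base max(30, ceil(minutes x 2.5)), +20 multi-source bonus, +10 custom-weights bonus, x1.2 multi-voice multiplier, 500-credit cap, then the token multiplier); where exactly rounding happens is our guess, so treat /v1/estimate-cost as authoritative:

```python
import math


def estimate_credits(
    duration_minutes: int,
    source_count: int = 1,
    voice_count: int = 1,
    custom_weights: bool = False,
    token_multiplier: float = 1.5,  # 1.5 for test tokens, 2.5 for production
) -> int:
    """Local mirror of the documented pricing formula, for UI previews only."""
    subtotal = max(30, math.ceil(duration_minutes * 2.5))  # base cost
    if source_count > 1:
        subtotal += 20                                     # multi-source bonus
    if custom_weights:
        subtotal += 10                                     # custom-weights bonus
    if voice_count > 1:
        subtotal = math.ceil(subtotal * 1.2)               # multi-voice multiplier
    subtotal = min(subtotal, 500)                          # per-request cap
    return math.ceil(subtotal * token_multiplier)
```

For the worked example above (30 minutes, 2 sources, 2 voices, test token) this reproduces the API's 171-credit answer.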

You can also fetch your live balance:

curl https://api-ext.podhoc.com/v1/account/credits \
  -H "X-Api-Key: $PODHOC_API_KEY"

Step 3 — Create the podcast

The create endpoint is the one that does work and charges credits. The minimum payload is a list of URLs:

curl -X POST https://api-ext.podhoc.com/v1/podcasts \
  -H "X-Api-Key: $PODHOC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": ["https://example.com/article"],
    "language": "en-US",
    "target_duration_minutes": 15,
    "style": "deep_dive",
    "voice_config": { "voices": 2 }
  }'

The response (HTTP 202 Accepted) gives you the identifier you will poll on:

{
  "success": true,
  "data": {
    "podcast_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "status": "processing",
    "estimated_duration_minutes": 15,
    "credits_charged": 113
  },
  "meta": {
    "request_id": "f1e2d3c4-b5a6-7890-abcd-ef1234567890",
    "credits_charged": 113,
    "credit_balance": 3387
  }
}

The style parameter is one of the eight pedagogical formats Podhoc supports — deep_dive, didactic, feynman_technique, critique, debate, simplified_explanation, pedagogical_framework, alchemist_formula. Each produces noticeably different output for the same source. See the audio styles guide for when to pick which.

The language parameter accepts any of the 74 supported language codes — en-US, es, fr, de, it, ca, ar, ru, plus 66 others. Source language and output language are decoupled: pass an English URL and a language: "es" flag and you get a Spanish podcast.
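A small payload builder lets you validate the style and duration client-side before spending a request. A sketch (build_payload is our own helper, not part of any SDK; the style list mirrors the eight formats named above):

```python
STYLES = {
    "deep_dive", "didactic", "feynman_technique", "critique", "debate",
    "simplified_explanation", "pedagogical_framework", "alchemist_formula",
}


def build_payload(urls: list[str], language: str = "en-US",
                  duration: int = 15, style: str = "deep_dive",
                  voices: int = 2) -> dict:
    """Assemble a create-podcast payload, rejecting obviously invalid inputs."""
    if not 1 <= duration <= 120:
        raise ValueError("target_duration_minutes must be between 1 and 120")
    if style not in STYLES:
        raise ValueError(f"unknown style: {style}")
    return {
        "urls": urls,
        "language": language,  # output language; sources may be in another language
        "target_duration_minutes": duration,
        "style": style,
        "voice_config": {"voices": voices},
    }


# English source, Spanish episode:
payload = build_payload(["https://example.com/article"], language="es")
```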


Step 4 — Poll the status endpoint

Generation runs asynchronously. The podcast_id is your handle for the rest of the lifecycle.

curl https://api-ext.podhoc.com/v1/podcasts/$PODCAST_ID/status \
  -H "X-Api-Key: $PODHOC_API_KEY"

The status moves through four values:

  • requested — accepted, queued.
  • processing — actively running.
  • completed — done, ready to download.
  • failed — terminal error; consult the response details.

Poll once every 10 seconds with a small backoff. Most podcasts complete in 2-5 minutes regardless of source length, because the pipeline parallelises across cloud GPUs. A reasonable client implements polling with a hard cap (15 minutes) and surfaces an “in progress” UI to the end user.
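A flat 10-second loop works, but a schedule generator makes the gentle backoff and the 15-minute hard cap explicit. A sketch (the factor and cap values are illustrative, not prescribed by the API):

```python
def poll_intervals(initial: float = 10.0, factor: float = 1.2,
                   cap: float = 30.0, budget: float = 900.0):
    """Yield sleep intervals: start at 10 s, back off gently toward a cap,
    and stop once the total budget (15 minutes) is exhausted."""
    elapsed, delay = 0.0, initial
    while elapsed < budget:
        step = min(delay, budget - elapsed)  # trim the final interval to the budget
        yield step
        elapsed += step
        delay = min(delay * factor, cap)
```

In the client loop you would `time.sleep(pause)` for each yielded pause, check the status endpoint, and break on completed or failed; if the generator runs out, raise a timeout.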


Step 5 — Download the MP3

The download endpoint returns a presigned S3 URL that expires after one hour:

curl https://api-ext.podhoc.com/v1/podcasts/$PODCAST_ID/download \
  -H "X-Api-Key: $PODHOC_API_KEY"

{
  "success": true,
  "data": {
    "download_url": "https://s3.amazonaws.com/...",
    "expires_at": "2026-05-06T11:00:00+00:00",
    "format": "mp3",
    "duration_seconds": 900
  }
}

Stream the URL into your own storage. If you need a permanent reference, copy the bytes into your S3 / GCS / Azure bucket: the presigned URL itself is short-lived, but once copied, the audio persists in your environment for as long as you keep it.
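When copying, stream the bytes in chunks rather than buffering whole episodes in memory. A generic chunked-copy sketch (with requests you would pass stream=True and hand it response.raw; the same open file handle also works with boto3's upload_fileobj for an S3 destination):

```python
from typing import BinaryIO


def stream_copy(src: BinaryIO, dest: BinaryIO, chunk_size: int = 256 * 1024) -> int:
    """Copy audio bytes in fixed-size chunks so large episodes never sit fully in memory.

    Returns the number of bytes copied."""
    copied = 0
    while chunk := src.read(chunk_size):
        dest.write(chunk)
        copied += len(chunk)
    return copied
```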


A complete Python integration

Here is the same lifecycle as a runnable script. It estimates cost, checks balance, creates, polls and downloads. The error handling is intentionally explicit so you can see where each API contract is enforced.

import os
import time

import requests

API_KEY = os.environ["PODHOC_API_KEY"]
BASE = "https://api-ext.podhoc.com/v1"
HEADERS = {"X-Api-Key": API_KEY, "Content-Type": "application/json"}


def estimate(duration: int, sources: int, voices: int) -> int:
    r = requests.get(
        f"{BASE}/estimate-cost",
        headers={"X-Api-Key": API_KEY},
        params={
            "duration_minutes": duration,
            "source_count": sources,
            "voice_count": voices,
        },
        timeout=15,
    )
    r.raise_for_status()
    return r.json()["data"]["final_credits"]


def balance() -> int:
    r = requests.get(f"{BASE}/account/credits", headers={"X-Api-Key": API_KEY}, timeout=15)
    r.raise_for_status()
    return r.json()["data"]["credits"]


def create(urls: list[str], duration: int, language: str, style: str) -> str:
    r = requests.post(
        f"{BASE}/podcasts",
        headers=HEADERS,
        json={
            "urls": urls,
            "language": language,
            "target_duration_minutes": duration,
            "style": style,
        },
        timeout=30,
    )
    r.raise_for_status()
    return r.json()["data"]["podcast_id"]


def wait_for(podcast_id: str, max_seconds: int = 900) -> None:
    started = time.time()
    while time.time() - started < max_seconds:
        r = requests.get(
            f"{BASE}/podcasts/{podcast_id}/status",
            headers={"X-Api-Key": API_KEY},
            timeout=15,
        )
        r.raise_for_status()
        status = r.json()["data"]["status"]
        if status == "completed":
            return
        if status == "failed":
            raise RuntimeError(f"Generation failed: {r.json()}")
        time.sleep(10)
    raise TimeoutError(f"Podcast {podcast_id} did not complete within {max_seconds}s")


def download(podcast_id: str, dest: str) -> None:
    r = requests.get(
        f"{BASE}/podcasts/{podcast_id}/download",
        headers={"X-Api-Key": API_KEY},
        timeout=15,
    )
    r.raise_for_status()
    audio = requests.get(r.json()["data"]["download_url"], timeout=60)
    audio.raise_for_status()
    with open(dest, "wb") as f:
        f.write(audio.content)


if __name__ == "__main__":
    cost = estimate(duration=15, sources=1, voices=2)
    available = balance()
    if available < cost:
        raise SystemExit(f"Insufficient credits ({available} < {cost})")

    pid = create(
        urls=["https://example.com/article"],
        duration=15,
        language="en-US",
        style="deep_dive",
    )
    wait_for(pid)
    download(pid, "podcast.mp3")
    print("Saved podcast.mp3")

A Node.js equivalent uses the same flow with fetch. Both are documented in the full API reference.


Rate limits and how to respect them

  Token type    Requests/minute   Requests/hour   Concurrent generations
  Test          2                 20              1
  Production    30                300             5

Hitting a limit returns HTTP 429 with a Retry-After header. A correct client honours that header and queues the next attempt accordingly. Production limits are calibrated for typical SaaS integrations — most teams never approach them. If you do, talk to us about an enterprise quota.

The standard practice for hardening any API client applies here: set a timeout on every request (15-30 seconds is plenty), cap retries (three attempts with exponential backoff), and surface 5xx errors to your operator instead of swallowing them. Full OAuth-style token machinery is overkill for a static-key API like this, but the same operational hygiene is worth borrowing: log the request ID (meta.request_id) on every error so support can correlate.
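That advice packages neatly into a small retry wrapper. A sketch (send is any zero-argument callable returning a response-like object with status_code and headers, such as a requests.Response; which status codes count as retryable is our choice):

```python
import time

RETRYABLE = {429, 500, 502, 503, 504}


def with_retries(send, attempts: int = 3, base_delay: float = 1.0, sleep=time.sleep):
    """Call send() until success, honouring Retry-After on 429 and backing off
    exponentially on 5xx. Returns the last response if retries are exhausted."""
    for attempt in range(attempts):
        resp = send()
        if resp.status_code not in RETRYABLE:
            return resp
        if attempt == attempts - 1:
            break  # out of attempts; hand the failure back to the caller
        retry_after = resp.headers.get("Retry-After")
        delay = float(retry_after) if retry_after else base_delay * (2 ** attempt)
        sleep(delay)
    return resp
```

Injecting sleep makes the wrapper trivially testable; in production you leave the default.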


What test tokens cannot do

Test tokens are deliberately restricted so you can integrate cheaply without burning production credits. Specifically:

  • Maximum episode duration: 5 minutes.
  • Languages allowed: English (en-US, en) only.
  • URLs per request: 1.
  • Maximum voices: 2.
  • custom_focus, source_weights and auto_publish: not available.

Attempting any of these returns 400 TEST_TOKEN_RESTRICTED. Switch to a production token (still within the same Pro account) when you are ready to ship.
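You can catch these restrictions client-side before the request ever leaves your process. A sketch that mirrors the list above (check_test_token_payload is our own helper, not part of any SDK):

```python
TEST_RESTRICTED_FIELDS = {"custom_focus", "source_weights", "auto_publish"}


def check_test_token_payload(payload: dict) -> list[str]:
    """Return the test-token restrictions a payload would violate;
    an empty list means it should pass."""
    problems = []
    if payload.get("target_duration_minutes", 0) > 5:
        problems.append("duration exceeds 5 minutes")
    if payload.get("language", "en-US") not in {"en-US", "en"}:
        problems.append("non-English language")
    if len(payload.get("urls", [])) > 1:
        problems.append("more than 1 URL")
    if payload.get("voice_config", {}).get("voices", 1) > 2:
        problems.append("more than 2 voices")
    for field in TEST_RESTRICTED_FIELDS & payload.keys():
        problems.append(f"{field} is not available on test tokens")
    return problems
```

Surfacing all violations at once is friendlier than fixing one 400 TEST_TOKEN_RESTRICTED at a time.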


Webhooks, file uploads and what is not supported (yet)

A few capabilities that exist in the web app are not in the API today:

  • File uploads. You can pass URLs but not raw PDF / DOCX / TXT bytes. If your content lives behind authentication, host it publicly with a signed URL or contact us about an enterprise route.
  • Raw text bodies. The urls parameter is the only ingestion mechanism; you cannot POST a text_content field.
  • Webhooks / callbacks. Status changes are observed via polling. A webhook layer is on our roadmap; for now, a 10-second polling loop is the recommended pattern.

These gaps are intentional — the API ships with a minimum viable surface and we will widen it as integration patterns stabilise.


What to build next

Once your first integration is running, the natural next step is adopting the API into a real product surface. Some places to start:

  • Pair the API with a Telegram bot to let your users trigger generation from chat.
  • Combine podcast creation with our eight audio styles to let your users pick the pedagogical treatment per source.
  • Drive PDF-to-podcast workflows by hosting the PDF publicly first, then passing the URL.
  • Read API integration ideas for concrete patterns we have seen teams ship with the API in the first 30 days.

The goal of the API is to take Podhoc out of the browser and into your product. The five-step lifecycle is intentionally small. Build the thin wrapper, test against your own URLs, and iterate.

Provision an API token →

Frequently asked questions

What does the Podhoc API let me do?
The Podhoc API lets you generate AI podcasts programmatically from publicly accessible URLs. You can integrate podcast creation into your own product, automate batch generation of episodes, build internal knowledge tools that turn documents into audio, and orchestrate multi-step workflows that combine podcast generation with other services.
Which plan includes API access?
API access is included with the Pro plan (€29/month, 3500 credits) and higher. The Free and Creator plans do not include API tokens. You can find pricing at podhoc.com and provision tokens at app.podhoc.com/account/api-access once subscribed.
What is the difference between test and production tokens?
Test tokens (phk_test_…) are cheaper to use (1.5x credit multiplier) but have restrictions: 5-minute episodes max, English only, 1 URL per request, max 2 voices, no custom focus, no source weighting, no auto-publish. They are intended for development and integration testing. Production tokens (phk_prod_…) cost 2.5x credits and unlock the full feature set: 120-minute episodes, all 74 output languages, unlimited URLs per request, all voice options, custom focus, source weighting and auto-publish.
What are the API rate limits?
Test tokens: 2 requests/minute, 20 requests/hour, 1 concurrent generation. Production tokens: 30 requests/minute, 300 requests/hour, 5 concurrent generations. When you hit a limit, the API returns HTTP 429 with a Retry-After header indicating when to retry. The production limits are designed for typical SaaS-style integrations; if you need higher throughput, contact us for an enterprise quota.
How long does generation take, and how should I poll?
Most podcasts complete in 2-5 minutes regardless of source length, because the LLM and TTS pipeline run in parallel on cloud GPUs. Poll GET /v1/podcasts/{id}/status every 10 seconds with exponential backoff if you want to be polite. Stop polling on status completed or failed. For latency-sensitive integrations, surface a generic “Generating your episode” UI to the user while you poll behind the scenes.
What sources can I pass to the API?
The API accepts publicly accessible URLs only — articles, blog posts, research papers hosted on arXiv or institutional repositories, YouTube videos with transcripts, and any other URL whose body is reachable without authentication. File uploads (PDF/DOCX/TXT) and raw text content are not supported via the API today; if your sources are behind authentication, host them publicly first or contact us for an enterprise integration discussion.
How is the credit cost calculated?
The base cost is max(30, ceil(duration_minutes × 2.5)). Multi-source bonus adds 20 credits, custom-weights bonus adds 10, multi-voice multiplies the subtotal by 1.2, and the cap is 500 credits per request. The final API cost multiplies the base by 1.5 (test token) or 2.5 (production token). Use GET /v1/estimate-cost to preview before generating — it returns the full breakdown.
What error codes should I handle?
The most common ones are 400 INVALID_REQUEST (missing or malformed fields), 400 INVALID_DURATION (outside 1-120 minutes), 400 TEST_TOKEN_RESTRICTED (a feature unavailable on test tokens — switch to production), 401 UNAUTHORIZED (revoked or expired token), 402 INSUFFICIENT_CREDITS, 404 PODCAST_NOT_FOUND, and 429 RATE_LIMITED (back off using the Retry-After header). All errors share the same JSON envelope with code, message and optional details fields.