Submit any public audio URL and get back a fully speaker-diarised, timestamped transcript in seconds. Ideal for agents that need to reason over specific recordings not yet in the Sonar index.
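Submitting a recording can be sketched roughly as below. This is purely illustrative: the endpoint URL, auth header, and parameter names are assumptions, not the documented API; only `audio_url` and `word_timestamps` come from the event schema shown on this page.

```python
import json
import urllib.request

# Placeholder endpoint -- an assumption for illustration, not the real API URL.
API_URL = "https://api.example.com/v1/transcripts"

def build_submit_request(audio_url: str, api_key: str) -> urllib.request.Request:
    """Build a POST request submitting a public audio URL for transcription.

    Field names mirror the event schema shown in the docs; everything else
    (endpoint, bearer auth) is a hypothetical sketch.
    """
    payload = {"audio_url": audio_url, "word_timestamps": True}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen` (or any HTTP client) would then return the diarised transcript for that recording.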
Consume utterances as JSON events with optional word-level timestamps. Pipe the stream straight into your tool layer or buffer segments for summarisation — same schema whether the source is a podcast feed or a one-off courtroom archive.
◎ In beta

```json
{
  "type": "transcript.segment.created",
  "audio_url": "podcasts.apple.com/.../ai-a16z",
  "entry": {
    "speaker_name": "Martin",
    "timestamp": "00:04.120",
    "confidence": 0.97,
    "verbatim_utterance": "AI apps are becoming systems, not just prompts."
  },
  "format": "json",
  "word_timestamps": true
}
```