I co-founded Speak AI in 2019. Today it's used by 250,000+ people across 100+ countries to transcribe, analyze, and make sense of audio and video at scale. This is the story of how we got there.
The premise was simple: voice is the most natural way humans communicate, and almost all of it disappears. Every meeting, every customer call, every research interview - gone. The insight, the nuance, the follow-up that should have happened. Lost.
I started building Speak AI because I kept watching this happen. Smart people were making decisions without the full picture because the conversation wasn't captured, or if it was, nobody had time to go back through hours of recording to find the part that mattered.
The early product was rough. We iterated constantly - talking to researchers, marketers, academics, journalists. What they all had in common was a backlog of audio they couldn't do anything with. We built the tools to fix that.
We grew from zero to 250,000+ users without significant outside funding. No big raise, no blitzscaling. Organic search, product-led growth, and deep iteration on what people actually needed. I drove product direction and content in the early years - a lot of what you find ranking for transcription and voice AI terms today started as blog posts I wrote at 6am before anyone else was up.
Techstars Toronto 2021 was a turning point. The program pushed us to get sharper on positioning and accelerated some key partnerships. But the core growth engine was always the same: build something that works, make it findable, let the product speak.
UX researchers running qualitative studies. Marketing teams analyzing customer calls. Academics transcribing hundreds of hours of fieldwork. Journalists. Podcast creators. Enterprise teams that need a reliable, HIPAA-compliant transcription layer for their existing workflow.
The unifying thread: people who deal with spoken data at scale and need to do something useful with it beyond just having a recording.
The most durable products live closest to the user's actual workflow. Not a feature, not an integration - a tool that becomes part of how work gets done. Voice and speech are still massively underserved. Most of the data that matters to businesses is spoken. Most of it is never captured, let alone analyzed. That gap hasn't closed yet, and it's still a big opportunity.
Automatic transcription in 100+ languages. Upload audio or video, or record live. Fast, accurate, and speaker-labeled.
Theme extraction, sentiment analysis, keyword tracking, and AI summaries across any conversation or recording.
AI meeting assistant for Zoom, Teams, and Google Meet. Auto-joins, transcribes, and delivers structured notes.
Built for qualitative teams. Bulk upload, tag, search, and analyze across hundreds of interviews.
AI voice agents for phone and web. Inbound support, lead qualification, outbound calls.
Connects with Zapier, Google Drive, Dropbox, and your existing stack. Full API access for custom builds.