AI is changing how we transcribe, and this might be the best example of it on Mac yet

For all the things I (and many others) am wary of in the AI future, there are certainly some obvious uses for the technology.

Running huge calculations in a fraction of the time, translating text, and other “donkey work” is what it should be used for, rather than an excuse to overwrite humankind’s propensity for creativity.

As a writer, one of the most frustrating parts of interviews has long been transcribing voice to text. It’s got much better in recent years, but it’s often required expensive, sometimes demanding software tools.

I stumbled upon TypeWhisper, and it’s a fantastic AI tool for transcription - and it runs locally.

Talking to myself

TypeWhisper is free to use for non-commercial purposes, but you can pay for additional models. I’ve been testing the app with the WhisperKit speech recognition engine, installed locally on my MacBook Air.

This means there are multiple sizes to choose from, and I’ve settled on the Large v3 model at 1.5GB - but some models are as small as 40MB.

Out of the box, though, it’s pretty fantastic. I use a hotkey (tied to HyperKey, which I wrote about recently) to trigger voice-to-text, and that allows me to speak and see live transcription at the top of my screen within the Mac’s ‘notch’ - almost like an iPhone’s Dynamic Island.

In my testing, I’ve found it’s a great way to get ideas down so I can copy them into my writing app of choice (usually Drafts or Google Docs). It’s not perfect, but when I pointed it at an American Dad episode playing in the room, it did a great job of transcribing the show’s absurdist humor to text.

I can add phrases to the dictionary so that “Eggs Box” becomes “Xbox” more consistently, and more besides.
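To illustrate the idea, here is a minimal Python sketch of how a custom dictionary like this could work as a post-processing step. It is not TypeWhisper’s actual implementation, and the `CORRECTIONS` table and `apply_corrections` function are hypothetical names made up for this example.

```python
import re

# Hypothetical correction table: misheard phrase -> intended phrase.
CORRECTIONS = {"eggs box": "Xbox"}


def apply_corrections(text: str, corrections: dict[str, str]) -> str:
    """Replace each misheard phrase, ignoring case, on whole-word boundaries."""
    for wrong, right in corrections.items():
        pattern = re.compile(r"\b" + re.escape(wrong) + r"\b", re.IGNORECASE)
        # A function replacement keeps `right` literal (no backslash expansion).
        text = pattern.sub(lambda _match, right=right: right, text)
    return text


print(apply_corrections("I played Eggs Box all night", CORRECTIONS))
```

Matching on word boundaries is an assumption that keeps a short entry from rewriting the middle of a longer word.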

You talking to me?

That brings us to the real draw, though - file transcription. The majority of interviews I’ve conducted are recorded on messaging platforms and sent over as audio or video files.

What once involved slowly working through a recording with headphones on now just means dragging and dropping the file into TypeWhisper and letting it transcribe. Naturally, larger models will have more consistent results (particularly if you’re plugging in cloud models like ChatGPT), but in wanting to keep my environmental footprint to a minimum, I’ve been very pleased with the local model. It can even export to subtitle files with timestamps - ideal for content creation.
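For a sense of what a timestamped subtitle export involves, here is a rough Python sketch that turns transcript segments into the common SRT format. It assumes each segment is a `(start_seconds, end_seconds, text)` tuple, which is a shape chosen for this example rather than anything TypeWhisper documents, and `srt_timestamp` and `to_srt` are hypothetical helper names.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[tuple[float, float, str]]) -> str:
    """Build SRT text from (start, end, text) segments, numbered from 1."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}"
        )
    # SRT cues are separated by a blank line.
    return "\n\n".join(blocks) + "\n"


# Assumed sample data, purely for illustration.
print(to_srt([(0.0, 2.5, "Hello there."), (2.5, 5.0, "Thanks for joining me.")]))
```

The resulting text can be saved with a `.srt` extension and loaded by most video players and editors.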

There’s a Workflow function to take transcriptions and run automations on them, like dropping them into a specific application.

All in all, what began as a curiosity in my own workflows has become something I can see myself using more and more often.

TypeWhisper is stable on macOS right now, and it’s in beta for Windows and alpha for iOS.
