Unfortunately, not yet. Whisper by itself can't do that. Right now there are only a few viable solutions for integration, and I’m looking at this one, but every solution I know of needs a GPU.
Whisper models have a very good WER (word error rate) for languages like Spanish, English, and French. For English audio, the English-only models do even better. Check out this page in the docs:
https://whishper.net/reference/models/#languages-and-accuracy
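In case the metric is unfamiliar: WER is the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference text, divided by the number of words in the reference (lower is better). Here is a minimal sketch for checking it on your own audio. The function name `word_error_rate` is just something I made up for illustration, it is not part of Whisper or Whishper, and it only lowercases and splits on whitespace, whereas published WER numbers usually normalize punctuation and numbers first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Hypothetical helper for illustration; normalization is deliberately
    naive (lowercase + whitespace split).
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(round(word_error_rate("the cat sat on the mat",
                                "the cat sat on a mat"), 3))  # 0.167
```

Comparing a hand-corrected transcript against the output of two model sizes this way gives a rough idea of whether a bigger model is worth it for your language.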
I’m glad you were able to solve the problem. Here is the comment I made to another user with the same problem: