Risks of Using Speech Recognition for Transcription of Confidential Data

A blog post by OutSec, the UK's leading online transcription company.

Automated speech recognition (ASR) is a technology that converts spoken words into text. However, it poses significant risks for the privacy and security of the data it processes, especially when that data is confidential or sensitive.

Data Breaches by Major Tech Companies

In recent years, several major tech companies have faced public backlash and legal consequences for violating the privacy of their users through their automated speech recognition services.

The Guardian reported that “Google workers can listen to what is said on its AI home devices”. This became apparent after certain recordings were leaked.

The Guardian also raised the question of whether Amazon is invading its users' privacy in its article ‘Alexa, are you invading my privacy?’. In the US, Amazon is being sued over recordings of children.

There have been further press reports highlighting similar issues.

These incidents show that ASR services are not always transparent about how they handle the voice data they collect from their users. They also expose the users to potential data breaches, identity theft, fraud, or blackmail.

Why Speech Recognition Should Not Be Used by the Legal or Medical Sectors

The legal and medical sectors deal with highly confidential and sensitive data. This requires strict protection and compliance with various laws and regulations. Using automated speech recognition for transcription purposes in these sectors can pose serious risks.

For instance, automated speech recognition can introduce errors or inaccuracies in the transcription, which can affect the quality and validity of the documents. A study by Hodgson and Coiera found that ASR had an average error rate of 1.3 per document, with 15% of them being clinically significant. Another study by Topaz et al. found that physician-created notes using ASR had four times the rate of errors compared to non-ASR notes.

Moreover, automated speech recognition can compromise the confidentiality and security of the data. This can occur when storing it on cloud servers or sharing it with third parties without proper consent or encryption. Obviously, this can violate the privacy rights of the clients or patients and expose them to legal liabilities. For example, a lawyer who uses automated speech recognition to transcribe a client’s testimony may inadvertently disclose privileged information to an unauthorised party or a malicious actor. Similarly, a doctor who uses ASR to transcribe a patient’s medical history may unintentionally reveal personal health information to an unauthorised third party.

The GDPR Issues with Using Automated Speech Recognition for Transcription

The GDPR is a regulation that aims to protect the personal data of individuals in the EU. It gives them more control over how their data is collected, processed, stored, and shared. The GDPR treats voice as personally identifiable information (PII), because voice recordings can reveal information such as gender, ethnic origin, or potential diseases. Therefore, any entity that uses automated speech recognition for transcription purposes must comply with the GDPR requirements, such as:

  • Obtaining explicit and informed consent from the users before collecting, processing, or storing their voice data
  • Providing clear and accessible information about how their voice data is used, who has access to it, and how long it is kept
  • Implementing appropriate technical and organisational measures to ensure the security and integrity of their voice data
  • Respecting the rights of the users to access, rectify, erase, restrict, or object to the processing of their voice data
  • Reporting any data breaches involving their voice data to the relevant supervisory authority within 72 hours, and informing the affected users without undue delay where the breach poses a high risk to them

Failing to comply with the GDPR can result in hefty fines of up to 4% of annual global turnover or €20 million, whichever is higher. Additionally, it can damage the entity's reputation and the trust of its users and stakeholders.
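To make the "whichever is higher" rule concrete, here is a minimal sketch of the arithmetic. The function name and the example turnover figures are hypothetical and chosen purely for illustration. The result is only the upper limit quoted above, not a prediction of any actual fine.

```python
# Illustrative only: the ceiling described above is the greater of
# 4% of annual global turnover or EUR 20 million.
# The function name and the sample turnovers are assumptions for this sketch.

FIXED_CEILING_EUR = 20_000_000.0
TURNOVER_PERCENT = 4


def max_gdpr_fine(annual_global_turnover_eur: float) -> float:
    """Return the upper limit of the fine for a given annual global turnover."""
    turnover_based = annual_global_turnover_eur * TURNOVER_PERCENT / 100
    return max(turnover_based, FIXED_CEILING_EUR)


if __name__ == "__main__":
    for turnover in (100_000_000, 1_000_000_000):
        print(f"Turnover EUR {turnover:,}: ceiling EUR {max_gdpr_fine(turnover):,.0f}")
```

For a smaller organisation the €20 million figure is the ceiling, while for a large one the 4% figure takes over.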

Further Reading:

  • Voice Recognition Tech Privacy and Cybersecurity Concerns. This article discusses the challenges and risks of using voice recognition technology in various applications and domains, especially in the context of the GDPR and other relevant laws and regulations.
  • Data, privacy, and security for Speech to text. This article provides some high-level details on how speech-to-text processes data provided by customers, and on the privacy and security obligations that need to be complied with. It also reminds people that they are responsible for obtaining all necessary permissions for processing the data, including any licenses, permissions or other proprietary rights required for the content they input into the speech-to-text service.

Why Human Transcription is Better than Using ASR for Transcription Purposes

Human transcription is the process of converting speech into text by a human transcriber. They listen to an audio file and type what they hear. Human transcription has several advantages over automated speech recognition for transcription purposes, such as:

Higher accuracy:

Human transcribers can capture context, homophones, accents, jargon, and nuances better than ASR systems. They can also mark inaudible or unclear speech as such rather than guessing or making errors.

More flexibility:

Human transcribers can adapt to different formats, styles, standards, and preferences according to the needs and specifications of the clients. They can also handle complex or specialised content that may be beyond the scope or capability of automated speech recognition systems.

More privacy:

Human transcribers can ensure the confidentiality and security of the data they transcribe by following strict protocols and policies. These can include signing non-disclosure agreements, deleting the audio files after transcription, and using encrypted platforms or devices.

Therefore, human transcription is a more reliable, customisable, and secure option than ASR for transcription purposes, especially when it involves confidential or sensitive data.


Speech recognition is a powerful and convenient technology that can facilitate and enhance the transcription of speech into text. However, it also poses significant risks for the privacy and security of the data it processes, especially when it involves confidential or sensitive information. Therefore, users of automated speech recognition should be aware of these risks and take appropriate measures to protect their data and comply with the relevant laws and regulations. Alternatively, users of ASR can opt for human transcription, which is a more accurate, flexible, and secure option than automated speech recognition for transcription purposes.

About OutSec

OutSec is the UK’s leading online transcription company whose business has grown substantially since 2002. We are one of the most successful transcription companies in the United Kingdom.

OutSec provides secure outsourced transcription services to the following sectors: medical; legal; property and surveying; universities; media and interviews; advisory boards, conferences & seminars; inventories; financial; corporate; and HR, recruitment and Executive Search.

Why is Dictation More Efficient than Typing?

Well, the simple fact is that we can all speak considerably faster than we can physically type:

“The average person types between 38 and 40 words per minute”.

A “good rate of speech ranges between 140-160 words per minute”.

In other words, dictation is up to four times faster than typing. Simply dictating a document is therefore more cost-efficient, giving you more time to dedicate to other parts of your business.
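As a rough check of that comparison, the sketch below works through the arithmetic using only the two ranges quoted above. The function names and the 1,000-word document length are assumptions made for illustration.

```python
# Illustrative arithmetic based on the two ranges quoted above.
# The 1,000-word document length is an assumed example, not a figure from the quotes.

TYPING_WPM = (38, 40)     # average typing speed, words per minute
SPEECH_WPM = (140, 160)   # good rate of speech, words per minute


def speedup_range(typing, speech):
    """Return (lowest, highest) ratio of speaking speed to typing speed."""
    lowest = speech[0] / typing[1]    # slowest speech against fastest typing
    highest = speech[1] / typing[0]   # fastest speech against slowest typing
    return lowest, highest


def minutes_needed(words, words_per_minute):
    """Minutes required to produce `words` at a steady rate."""
    return words / words_per_minute


if __name__ == "__main__":
    low, high = speedup_range(TYPING_WPM, SPEECH_WPM)
    print(f"Speaking is roughly {low:.1f} to {high:.1f} times faster than typing")
    print(f"1,000 words typed at 40 wpm: {minutes_needed(1000, 40):.1f} minutes")
    print(f"1,000 words dictated at 160 wpm: {minutes_needed(1000, 160):.2f} minutes")
```

On these figures the ratio falls between about 3.5 and 4.2, which is in line with the "up to four times" estimate.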

So why not add OutSec as a business continuity option for your business? Accounts are free, and you pay per minute (rounded to the nearest minute) on a pay-as-you-go basis, with no contracts or minimum spend. What do you have to lose? Why not open an account today!

Picture Attribution:

Image by WangXiNa on Freepik
