(Research) STT (1) – Converting Speech to Text (Guide)

3/31/2021

Guide to Converting Speech to Text Using Python

If you have some programming experience, you may have used an AI speaker (or voice assistant) and wondered how to build a voice-based service yourself.

For readers in that position, including those with limited development experience or little background in natural language processing or speech recognition, I would like to introduce several easy-to-follow packages.

Pick whichever of these tools fits your situation.

Speech_recognition

Speech Recognition Using Python: A Guide

Several Python packages and APIs can convert speech to text, and many of them support more than one language. Some popular ones are listed below.

Popular Speech Recognition Libraries/Packages

apiai
assemblyai
google-cloud-speech
pocketsphinx
SpeechRecognition
watson-developer-cloud
wit

Speech Recognition with Language Support (Other than English)

Many of these libraries and APIs accept a `language` keyword argument for recognizing speech in languages other than English. For example, with the SpeechRecognition library:

python
recognize_google(audio, language='fr-FR')  # For French
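As a rough illustration of the `language` argument, the sketch below maps a human-readable language name to a language tag and passes it to `recognize_google()`. The `LANGUAGE_TAGS` table and the `recognize_in()` helper are hypothetical names made up for this example; only `recognize_google(audio, language=...)` comes from the SpeechRecognition library.

```python
# Hypothetical lookup table for this example; extend it with the tags you need.
LANGUAGE_TAGS = {
    "english": "en-US",
    "french": "fr-FR",
    "korean": "ko-KR",
}


def recognize_in(recognizer, audio, language_name):
    """Transcribe `audio` via recognize_google() in the named language.

    `recognizer` is assumed to be a speech_recognition.Recognizer, or any
    object that exposes a compatible recognize_google() method.
    """
    try:
        tag = LANGUAGE_TAGS[language_name.lower()]
    except KeyError:
        raise ValueError(f"no language tag configured for {language_name!r}") from None
    return recognizer.recognize_google(audio, language=tag)
```

With a real recognizer `r` and recorded `audio`, a call would look like `recognize_in(r, audio, "French")`.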

Using `SpeechRecognition` Library

In my case, I use the `speech_recognition` library and its `recognize_google()` method to convert Korean speech to text.

Environment Setup for Speech Recognition

Install SpeechRecognition Library
Use pip to install the SpeechRecognition library:
```bash
pip install SpeechRecognition
```
Install PyAudio for Microphone Access
To access the microphone through SpeechRecognition, you also need to install PyAudio:
On Windows:
```bash
pip install pyaudio
```
On Debian Linux (Python 3):
```bash
sudo apt-get install python3-pyaudio
```
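After installing, a quick way to confirm that both packages are visible to your interpreter is to look up their import names. This is only a minimal sketch using the standard library; `check_modules()` is a hypothetical helper name, and the import names `speech_recognition` and `pyaudio` are the modules that the SpeechRecognition and PyAudio packages provide.

```python
import importlib.util

# Import names (not pip names): SpeechRecognition provides `speech_recognition`,
# PyAudio provides `pyaudio`.
REQUIRED_MODULES = ("speech_recognition", "pyaudio")


def check_modules(names=REQUIRED_MODULES):
    """Return {module_name: True/False} depending on whether it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, available in check_modules().items():
        print(f"{name}: {'installed' if available else 'missing'}")
```

If `pyaudio` is reported as missing, file-based recognition can still work; PyAudio is only needed for microphone input.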

Supported Audio File Formats

The following file formats are supported for speech recognition:

WAV (PCM / LPCM format)
AIFF
AIFF-C
FLAC (Standard FLAC format, OGG-FLAC is not supported)
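Before sending a file to a recognizer, it can help to check that it looks like one of the formats above. The sketch below is one possible pre-check using only the standard library. The helper names and the extension set (including `.aif` and `.aifc`) are assumptions for illustration, and the WAV check relies on Python's `wave` module, which reads uncompressed PCM data.

```python
import wave
from pathlib import Path

# Assumed file extensions for the formats listed above (illustrative only).
SUPPORTED_EXTENSIONS = {".wav", ".aiff", ".aif", ".aifc", ".flac"}


def has_supported_extension(path):
    """Return True if the file extension matches one of the listed formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def is_pcm_wav(path):
    """Return True if `path` can be opened as an uncompressed PCM WAV file."""
    try:
        with wave.open(str(path), "rb") as wav_file:
            return wav_file.getcomptype() == "NONE"
    except (wave.Error, EOFError):
        # Not a RIFF/WAVE file, truncated, or an encoding `wave` cannot read.
        return False
```

An extension check alone is weak evidence, so for WAV input it is worth combining both: `has_supported_extension(p) and is_pcm_wav(p)`.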

Example Code: Converting Speech to Text (Korean)

Here’s a simple example of how to use the SpeechRecognition library to convert speech into text:

import speech_recognition as sr

# Create a single recognizer and reuse it for every step.
r = sr.Recognizer()
r.energy_threshold = 300  # energy level above which sound counts as speech (matters for microphone input)

# Load the WAV file.
harvard_audio = sr.AudioFile("./TEST.wav")

# Use the audio file as the audio source.
with harvard_audio as source:
    audio = r.record(source)  # read the entire audio file

# Recognize with the Google Web Speech API (limited to 50 requests per day).
try:
    print("Google Speech Recognition thinks you said : " + r.recognize_google(audio, language='ko'))
except sr.UnknownValueError:
    print("Google Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))

# Other recognizer methods in the library:
#   recognize_bing(): Microsoft Bing Speech
#   recognize_google(): Google Web Speech API
#   recognize_google_cloud(): Google Cloud Speech - requires installation of the google-cloud-speech package
#   recognize_houndify(): Houndify by SoundHound
#   recognize_ibm(): IBM Speech to Text
#   recognize_sphinx(): CMU Sphinx - requires installing PocketSphinx
#   recognize_wit(): Wit.ai
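Because the recognizer methods share the `recognize_<engine>()` naming pattern, switching engines can be reduced to choosing a name. The following is a minimal sketch of that idea; `recognize_with()` is a hypothetical helper, not part of the library, and which engines are actually available depends on the installed SpeechRecognition version and any API keys the engine requires.

```python
def recognize_with(recognizer, audio, engine="google", **kwargs):
    """Call recognizer.recognize_<engine>(audio, **kwargs).

    `recognizer` is assumed to be a speech_recognition.Recognizer; extra
    keyword arguments (for example language="ko") are passed through as-is.
    """
    method = getattr(recognizer, f"recognize_{engine}", None)
    if method is None:
        raise ValueError(f"this recognizer has no recognize_{engine}() method")
    return method(audio, **kwargs)
```

For example, `recognize_with(r, audio, "google", language="ko")` would behave like the direct `r.recognize_google(audio, language="ko")` call, while `recognize_with(r, audio, "sphinx")` would try the offline CMU Sphinx engine if PocketSphinx is installed.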
