Google Cloud Platform – Machine learning APIs

I have been watching a few Google Cloud Platform videos recently from Google Cloud Next and really enjoyed the demo in one of them: Machine learning APIs.

The idea is simply to record your voice (here using the microphone on your laptop). Then the audio file is sent to Cloud Storage. Demo @11″35.

By using Google Speech, you can not only get a transcript of your record, but you can add additional context words in your API call to make sure GCP understands it perfectly.

"speechContext": { "phrases": ["GKE", "Kubernetes", "Containers"] }

I tried to work on the script to do the exact same thing and decided to share it if you want to try it at home.
Prerequisites are:

  • A GCP projet
  • Run the following command on your laptop:
    brew install sox --with-flac
  • Download and install Google Cloud SDK
  • Create a Cloud Storage bucket
  • Create an API Key and give it access to Google Speech
#!/usr/bin/env bash

# Configuration

gcloud auth login $GCP_USERNAME
gcloud config set project $GCP_PROJECT_ID

# Recording with Sox (brew install sox --with-flac)
rec --encoding signed-integer --bits 32 --channels 1 --rate 44100 recording.flac

# Upload to Cloud Storage
gsutil cp -a public-read recording.flac gs://$BUCKET_NAME

# Prepare our request parameters for Google Speech
cat <<< '
    "config": {
    "sample_rate": 44100,
    "language_code": "en-US",
    "speechContext": {
        "phrases": ["<My context word>"]
    "audio": {
}' > request.json

# API call to Google Speech
curl -s -X POST -H "Content-Type: application/json" --data-binary @request.json \

# Cleaning
rm -f recording.flac

If you are interested in learning more about AI and machine learning, there is a great video from Andrew Ng which covers the state of AI today and what you can do to be the next AI company!

Notify of
Inline Feedbacks
View all comments