I recently watched a few Google Cloud Platform videos from Google Cloud Next and really enjoyed the demo in one of them: Machine learning APIs (demo at 11:35).
The idea is simple: record your voice (here, using the microphone on your laptop), then send the audio file to Cloud Storage.
By using Google Speech, you can not only get a transcript of your recording, but also pass additional context words in your API call to help the service recognize specific terms correctly.
Example:
"speechContext": { "phrases": ["GKE", "Kubernetes", "Containers"] }
I wrote a script that does the same thing and decided to share it in case you want to try it at home.
Prerequisites are:
- A GCP project
- Install SoX with FLAC support by running the following command on your laptop:
brew install sox --with-flac
- Download and install the Google Cloud SDK
- Create a Cloud Storage bucket
- Create an API key and give it access to the Google Speech API
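Before going further, it can help to confirm that the command-line tools from the list above are actually on your PATH. This is only a rough sketch: `missing_tools` is a name I made up, and the tool list (`rec` from SoX, `gcloud` and `gsutil` from the Cloud SDK, plus `curl`) is simply what the script in this post calls.

```shell
#!/usr/bin/env bash
# Print the name of every requested tool that is not found on PATH.
# missing_tools is a hypothetical helper written for this post.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tools the script relies on: rec (SoX), gcloud and gsutil (Cloud SDK), curl.
missing=$(missing_tools rec gcloud gsutil curl)
if [ -n "$missing" ]; then
  # Unquoted on purpose so the names print on a single line
  echo "Missing tools:" $missing
else
  echo "All required tools found."
fi
```

It does not check the bucket or the API key, only that the binaries exist.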
#!/usr/bin/env bash

# Configuration
GCP_USERNAME=<my-user-email>
GCP_PROJECT_ID=<my-project-id>
BUCKET_NAME=<my-bucket-name>
API_KEY=<my-api-key>

gcloud auth login $GCP_USERNAME
gcloud config set project $GCP_PROJECT_ID

# Recording with Sox (brew install sox --with-flac)
rec --encoding signed-integer --bits 32 --channels 1 --rate 44100 recording.flac

# Upload to Cloud Storage
gsutil cp -a public-read recording.flac gs://$BUCKET_NAME

# Prepare our request parameters for Google Speech
cat <<< '
{
  "config": {
    "encoding":"FLAC",
    "sample_rate": 44100,
    "language_code": "en-US",
    "speechContext": {
      "phrases": ["<My context word>"]
    }
  },
  "audio": {
    "uri":"gs://'$BUCKET_NAME'/recording.flac"
  }
}' > request.json

# API call to Google Speech
curl -s -X POST -H "Content-Type: application/json" --data-binary @request.json \
  "https://speech.googleapis.com/v1beta1/speech:syncrecognize?key=$API_KEY"

# Cleaning
rm -f recording.flac
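The script prints the raw JSON response. If you only want the recognized text, something like the sketch below can pull it out. It assumes the response nests each result under `results` and `alternatives` with a `transcript` field, and that transcripts contain no escaped double quotes. `extract_transcripts` is a name I made up, and the sample response is illustrative, not real API output. A proper JSON parser such as jq would be more robust.

```shell
#!/usr/bin/env bash
# Print every "transcript" value found in a JSON response read from stdin.
# extract_transcripts is a hypothetical helper written for this post.
# Assumption: transcripts contain no escaped double quotes.
extract_transcripts() {
  grep -o '"transcript": *"[^"]*"' | sed 's/^"transcript": *"//; s/"$//'
}

# Illustrative sample only, shaped like the response the API is assumed to return.
sample_response='{
  "results": [
    {
      "alternatives": [
        { "transcript": "deploy my containers to GKE", "confidence": 0.97 }
      ]
    }
  ]
}'

printf '%s\n' "$sample_response" | extract_transcripts
```

To use it with the real call, pipe the output of the curl command into `extract_transcripts`.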
If you are interested in learning more about AI, there is a great video from Andrew Ng which covers the state of AI today and what you can do to be the next AI company!