I have been watching a few Google Cloud Platform videos recently from Google Cloud Next and really enjoyed the demo in one of them: Machine learning APIs.
The idea is simple: record your voice (here with the microphone on your laptop), then send the audio file to Cloud Storage. The demo starts at 11:35 in the video.
With Google Speech, you not only get a transcript of your recording, you can also add context words to your API call to help the service recognize domain-specific terms.
Example:
"speechContext": { "phrases": ["GKE", "Kubernetes", "Containers"] }
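If you want to generate that fragment from a list of words rather than typing it by hand, here is a minimal sketch. The helper name `speech_context` is my own and not part of any Google tool, and it assumes the words contain no double quotes or backslashes, since it does no JSON escaping.

```shell
# Hypothetical helper: print a "speechContext" JSON fragment built from the
# words passed as arguments. Assumes no word needs JSON escaping.
# The body runs in a subshell so its variables do not leak into the caller.
speech_context() (
  phrases=""
  for word in "$@"; do
    # Prepend ", " only when the list already contains an entry.
    phrases="${phrases:+$phrases, }\"$word\""
  done
  printf '"speechContext": { "phrases": [%s] }\n' "$phrases"
)

speech_context GKE Kubernetes Containers
```

Called with the three words above, it prints the same fragment as the example.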
I wrote a script that does the same thing and decided to share it in case you want to try it at home.
Prerequisites are:
- A GCP project
- Run the following command on your laptop:
brew install sox --with-flac
- Download and install Google Cloud SDK
- Create a Cloud Storage bucket
- Create an API Key and give it access to Google Speech
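Before running anything, it can help to check that the command-line tools from the list above are actually on your PATH. This is only a sketch: `check_tools` is a name I made up, and it checks for the executables only, not for the bucket or the API key.

```shell
# Hypothetical helper: report every command from the argument list that is
# not found on PATH. Returns 0 when all are found, 1 otherwise.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# rec comes with SoX; gcloud and gsutil come with the Google Cloud SDK.
check_tools rec gcloud gsutil curl || echo "Install the missing tools first." >&2
```

The bucket itself can be created from the Cloud Console or with `gsutil mb gs://<my-bucket-name>`.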
#!/usr/bin/env bash

# Configuration
GCP_USERNAME=<my-user-email>
GCP_PROJECT_ID=<my-project-id>
BUCKET_NAME=<my-bucket-name>
API_KEY=<my-api-key>

gcloud auth login $GCP_USERNAME
gcloud config set project $GCP_PROJECT_ID

# Recording with Sox (brew install sox --with-flac)
rec --encoding signed-integer --bits 32 --channels 1 --rate 44100 recording.flac

# Upload to Cloud Storage
gsutil cp -a public-read recording.flac gs://$BUCKET_NAME

# Prepare our request parameters for Google Speech
cat <<< '
{
  "config": {
    "encoding":"FLAC",
    "sample_rate": 44100,
    "language_code": "en-US",
    "speechContext": {
      "phrases": ["<My context word>"]
    }
  },
  "audio": {
    "uri":"gs://'$BUCKET_NAME'/recording.flac"
  }
}' > request.json

# API call to Google Speech
curl -s -X POST -H "Content-Type: application/json" --data-binary @request.json \
  "https://speech.googleapis.com/v1beta1/speech:syncrecognize?key=$API_KEY"

# Cleaning
rm -f recording.flac
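The script prints the raw JSON response. If you only want the recognized text, here is a rough sketch that pulls out the `transcript` values with `grep` and `sed`. The function name `extract_transcripts` is my own, the sample response is hand-written for illustration (not real API output), and the pattern assumes the transcript contains no escaped double quotes. A JSON-aware tool would be more robust.

```shell
# Hypothetical helper: read a Speech API JSON response on stdin and print
# each "transcript" value on its own line.
# Assumes transcripts contain no escaped double quotes.
extract_transcripts() {
  grep -o '"transcript"[[:space:]]*:[[:space:]]*"[^"]*"' |
    sed 's/^"transcript"[[:space:]]*:[[:space:]]*"//; s/"$//'
}

# Hand-written sample shaped like a recognize response (illustrative only).
sample='{"results":[{"alternatives":[{"transcript":"how do I deploy containers on GKE","confidence":0.97}]}]}'
printf '%s\n' "$sample" | extract_transcripts
```

In the script, you could pipe the output of the `curl` call into `extract_transcripts` instead of printing the whole response.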
If you are interested in learning more about AI and machine learning, there is a great video from Andrew Ng which covers the state of AI today and what you can do to be the next AI company!