Google Cloud Data Engineer Professional certified!

One more GCP certification on the list! This one was by far the most interesting in a while, as it gave me a chance to review topics I don’t work with every day: machine learning and big data.


Let’s dive right in, here is the preparation I followed:

My feedback on the exam:

  • Check the scope of this exam and be prepared for design questions on database models, optimization and troubleshooting
  • Know BigQuery vs Bigtable vs Datastore vs Cloud SQL
  • Dataflow, and how to deal with batch and stream processing
  • Read as much as you can and play with machine learning!
  • How to share datasets, queries and reports comes up really often; don’t underestimate the security aspects
  • Understand the Hadoop ecosystem and learn about the typical big data lifecycle on GCP

Good luck to everyone taking this exam!

Google Cloud Platform – Machine learning APIs

I have recently been watching a few Google Cloud Platform videos from Google Cloud Next and really enjoyed the demo in one of them: Machine learning APIs (demo at 11:35).

The idea is simply to record your voice (here, using your laptop’s microphone) and then send the audio file to Cloud Storage.

Using Google Speech, you can not only get a transcript of your recording, but also add context words to your API call to make sure GCP understands it perfectly.
Example:

"speechContext": { "phrases": ["GKE", "Kubernetes", "Containers"] }

I worked on a script to do the exact same thing and decided to share it in case you want to try it at home.
Prerequisites are:

  • A GCP project
  • Run the following command on your laptop:
    brew install sox --with-flac
  • Download and install the Google Cloud SDK
  • Create a Cloud Storage bucket (see the sketch after this list)
  • Create an API key and give it access to the Google Speech API
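For the Cloud Storage and Speech API prerequisites above, something like the following should work; the bucket name and project ID are placeholders, older SDK versions use gcloud service-management enable instead, and the API key itself still has to be created in the Cloud Console:

# Create the bucket that will store the recordings
gsutil mb -p <my-project-id> gs://<my-bucket-name>

# Enable the Speech API on the project
gcloud services enable speech.googleapis.com --project <my-project-id>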

#!/usr/bin/env bash

# Configuration
GCP_USERNAME=<my-user-email>
GCP_PROJECT_ID=<my-project-id>
BUCKET_NAME=<my-bucket-name>
API_KEY=<my-api-key>

gcloud auth login $GCP_USERNAME
gcloud config set project $GCP_PROJECT_ID

# Recording with SoX (brew install sox --with-flac); stop the recording with Ctrl-C
# The Speech API expects 16-bit or 24-bit FLAC, so record in 16-bit
rec --encoding signed-integer --bits 16 --channels 1 --rate 44100 recording.flac

# Upload to Cloud Storage
gsutil cp -a public-read recording.flac gs://$BUCKET_NAME

# Prepare our request parameters for Google Speech
cat <<EOF > request.json
{
    "config": {
        "encoding": "FLAC",
        "sample_rate": 44100,
        "language_code": "en-US",
        "speechContext": {
            "phrases": ["<My context word>"]
        }
    },
    "audio": {
        "uri": "gs://$BUCKET_NAME/recording.flac"
    }
}
EOF

# API call to Google Speech
curl -s -X POST -H "Content-Type: application/json" --data-binary @request.json \
"https://speech.googleapis.com/v1beta1/speech:syncrecognize?key=$API_KEY"

# Cleaning
rm -f recording.flac
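For reference, a successful call returns a JSON payload along these lines (the transcript and confidence values are of course only illustrative):

{
    "results": [
        {
            "alternatives": [
                {
                    "transcript": "how do I run containers on GKE with Kubernetes",
                    "confidence": 0.98
                }
            ]
        }
    ]
}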

If you are interested in learning more about AI, there is a great video from Andrew Ng which covers the state of AI today and what you can do to be the next AI company!

AWS – Bastions with user-managed SSH keys

I recently architected a bastion solution that lets employees manage their own SSH keys from the AWS console. The CodeCommit integration actually lets you upload your SSH public keys directly in the IAM section of your user, a bit like on GitHub.

Benefits of this solution:

  • Nothing to manage once installed and configured
  • Lets users update their public SSH keys themselves inside the console
  • Deploys the keys automatically and keeps them up to date on all bastions and instances
  • Adds and removes users on all Linux boxes automatically when you add or remove accounts in IAM
  • Linux usernames are generated from the IAM account email: paul.chapotet@domain.com -> pchapotet (see the snippet after this list)
  • Keys are automatically deployed on bastions and instances based on the VPC where they are located
  • Inexpensive: the Lambda function runs only when there is a change in IAM:
    • UploadSSHPublicKey, when anyone adds an SSH key to an IAM user
    • UpdateSSHPublicKey, when anyone activates or deactivates an SSH key
    • DeleteSSHPublicKey, when anyone deletes an SSH key
    • DeleteUser, when anyone deletes an IAM user
  • A single S3 GET operation is needed to update the SSH keys from bastions and instances
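As a side note, the username derivation mentioned in the list above can be sketched with plain shell parameter expansion; the actual logic lives in the Lambda code in the repository, so treat this as an illustration only:

# Turn an IAM email into a Linux username: first initial + last name
email="paul.chapotet@domain.com"
local_part="${email%%@*}"                      # paul.chapotet
username="${local_part:0:1}${local_part#*.}"   # pchapotet
echo "$username"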

In the diagram above, I assume that you are following AWS best practices and that you have a central account to manage IAM users, one account for production and one for your development environment. Interested in digging into the code? It’s available here: https://github.com/pchapotet/aws-bastions
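The solution relies on users uploading their keys through the console, but the same operations are available from the AWS CLI if you prefer scripting it; the user name and key file below are placeholders:

# Upload a public SSH key for an IAM user (what the console does behind the scenes)
aws iam upload-ssh-public-key \
    --user-name paul.chapotet \
    --ssh-public-key-body file://id_rsa.pub

# List the SSH keys attached to that user
aws iam list-ssh-public-keys --user-name paul.chapotet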

Google Cloud Platform – Start stop instance scheduler

I recently worked on a feature missing from GCP: a start/stop scheduler for my GCE instances based on labels. I was initially excited about using Cloud Functions, but App Engine seemed to be the way to go for several reasons: it supports Python and the task scheduling (cron) feature is already built in.

I had a few requirements:

  • Ability to schedule start and stop of GCE instances every hour
  • Extra options to run only on working days or only on weekends; the default is every day
  • It must work across all projects inside an organisation, provided you give the right permissions to the default App Engine service account
  • Inexpensive to run (or free); who wants to pay for a feature that should be available by default in the cloud?
    • According to https://cloud.google.com/free/docs/always-free-usage-limits you should have 28 instance hours of App Engine Standard free per day.
    • If you are already using App Engine for something else, the script is easy to merge with your application code.
    • If you don’t want to use App Engine, the Python code can be executed from any other machine with the right credentials, even your laptop if availability is not critical.

To deploy the solution, please follow the instructions from the following repository: https://github.com/pchapotet/gcp-start-stop-scheduler
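For context, the task scheduling on App Engine relies on a cron.yaml; a minimal version could look like the one below (the handler URL here is illustrative, the real one is defined in the repository) and is deployed with gcloud app deploy cron.yaml:

cron:
- description: hourly start/stop check
  url: /tasks/scheduler
  schedule: every 1 hours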

Once it is installed, simply add a few labels to your instances and enjoy the automation! You can run it only on working days (Monday to Friday) with the ‘d’ option and only on weekends (Saturday and Sunday) with the ‘w’ option. Feel free to comment and raise GitHub issues if you see anything to improve.

With just two labels, it starts your instance at 8 am and stops it at midnight on working days.
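Adding the labels can be done from the console or with gcloud; note that the label keys and value format below are only illustrative, the exact schema is documented in the repository README:

gcloud compute instances add-labels my-instance \
    --zone us-central1-a \
    --labels=start=8d,stop=0d
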
Google Cloud Architect Professional certified!

Taking the GCP Architect exam is quite a challenge, as there are very few study materials or practice questions available at the moment.

To prepare for the exam:

To sum up the exam without giving too much away: it was 50 questions for a total of 120 minutes. Timing is friendly; I had about 15-20 minutes left before the end. Half of the exam could be handled fairly easily by process of elimination, removing the most far-fetched answers. I was surprised to see a split screen with questions on the left and a list box on the right allowing you to switch between the four use cases currently available.

About 15 questions were related to use cases. They seemed more complex, even confusing at times. I only had to use two of the four use cases; the rest of the questions were more general and seemed to be what I would categorize as medium-level questions.

A few points I would suggest to work on:

  • Prepare yourself with the four use cases available; work on each of them for an hour as if they were your customer and decide how you would handle each point (i.e. which GCP service you would use instead of what they currently have)
  • Read about BigQuery, Bigtable, Cloud Storage, Pub/Sub, Dataflow and Dataproc, and when to use each of them
  • Container Engine vs Compute Engine vs App Engine
  • Know cloud-related business terms: CapEx, OpEx, TCO, capacity planning
  • Best practices regarding IAM, audit logs and how to secure them
  • Know resources that are global vs regional vs zonal (some major differences with AWS)
  • Know how the different databases are structured
  • Learn everything about instance groups, load balancers, stress tests
  • CI/CD on GCP, and how to properly architect dev/QA/staging/prod environments
  • As expected, you will have to look at Java and Python code
  • Cloud Deployment Manager is part of the exam and is worth knowing in detail
  • Migration: how to deal with an existing data center, move data around, etc.
  • Network: VPN, firewall, tags

Once again, good luck to everyone taking this exam!


AWS DevOps Professional Certification – All-5 AWS certified!

I successfully passed the AWS DevOps Professional exam this weekend after a few weeks spent looking at the following services: CloudFormation, Auto Scaling, Elastic Beanstalk, OpsWorks and CloudWatch. My strategy for the exam was to watch all the https://acloud.guru videos, then do the https://cloudacademy.com/ quizzes (there is a 7-day free trial), as well as review the following:

Docs:

  • http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/crpg-walkthrough.html
  • http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/create_deploy_docker.html
  • http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features.deployment.source.html
  • http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/introducing-lifecycle-hooks.html
  • http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-attribute-updatepolicy.html
  • http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features.rollingupdates.html
  • https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/ebextensions.html
  • https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/environment-resources.html
  • https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-init.html
  • http://docs.aws.amazon.com/cli/latest/reference/opsworks/create-deployment.html
  • http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/eb-cli3-getting-started.html
  • http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/applications-sourcebundle.html

Blogs:

  • http://cantrill.io/certification/aws/2015/10/29/passing-the-aws-devops-engineer-professional-exam.html
  • http://ozaws.com/2015/10/30/aws-professional-devops-engineer-certification-tips/
  • http://blog.error510.com/2016/03/28/aws-devops-engineer-exam-passed/
  • https://www.sumologic.com/blog-amazon-web-services/monitoring-aws-auto-scaling-and-elastic-load-balancers-with-log-analytics/

Videos:

  • https://www.youtube.com/watch?v=aX54mhZbN58
  • https://www.youtube.com/watch?v=ZhGMaw67Yu0
  • https://www.youtube.com/watch?v=4trGuelatMI

Must know:

  • Rolling Updates versus Rolling Deployments
  • Blue/green strategies on OpsWorks and Beanstalk, and with Route 53 and Auto Scaling
  • A/B deployments
  • AutoScaling lifecycle hooks
  • Cloudwatch Logs
  • OpsWorks CLI commands
  • CloudFormation custom resources, cfn-signal and wait conditions
  • Kinesis, CloudTrail, S3 logging

AWS Solutions Architect Professional Certification

Getting ready for the AWS Solutions Architect Professional exam is not an easy task! Along with the DevOps exam, it is currently one of the most difficult AWS certifications to get due to the number of services it covers. Plan on studying for a few months, covering not only AWS services but a very wide range of concepts. The level required to pass this exam is very high, nothing compared to the Associate-level certifications. AWS even recommends two years of experience on the platform.

As usual, a good start is to follow the awesome https://acloud.guru/ courses.

Don’t forget to study all the AWS Reference Architectures and watch AWS Summit videos:

The exam tests your ability to answer very quickly: it’s a bit more than 2 minutes per question and very few of them are short. Sometimes the answers are very similar and you will have to proceed by elimination. The best tip that helped me, from Reddit: focus on the “kicker”. This is the part of the question, after the fluff, that tells you exactly what they want, e.g. “Which option provides the MOST COST-EFFECTIVE solution?”

One last thing: if English is not your first language, you might be able to get an extra 30 minutes by contacting the certification team, but this request can take up to a month, so plan well ahead of your exam date.

Good luck to everyone taking this exam!