Battle of the AI Titans Part 1: AWS's Artificial Intelligence

Chris Dunn, Regional Vice President - APAC
March 22, 2018

Movies, companies, and careers have all been built on the promise of artificial intelligence (AI). But, until now, it's been more hype than reality.

The good news is that all of that has changed. And, from a techie perspective, the opportunity to build AI-augmented apps has never been better.

In previous articles, we've had conversations with luminaries in the AI field. We've also surveyed use cases and libraries available in different programming languages to build AI functionality. But this blog is different. It focuses on another piece of the puzzle: accelerating AI development with cloud application platforms.

We're going to look at the three mega-vendors in this area: Amazon AWS, Microsoft Azure, and Google Cloud. Each of these vendors has a different flavor based on their heritage. We'll also take a look at how that influences the broader low-code and application platform markets.

Amazon Web Services (AWS)

Let's start with arguably the most famous IaaS provider in the world - AWS. Amazon has a rich history of providing powerful services to developers and AWS's artificial intelligence and machine learning options continue in that tradition. They lump these services under their Machine Learning service line.

Amazon Deep Learning AMIs

At the most basic level, AWS offers Amazon Machine Images (AMIs) tuned for deep learning. There are a few different flavors available. The Deep Learning AMIs come preinstalled with tools and frameworks like Apache MXNet, TensorFlow, PyTorch, Microsoft Cognitive Toolkit (CNTK), Caffe, Caffe2, Theano, Torch, Gluon, and Keras.

Sagemaker

I have to admit, this is a pretty cool service. It reminds me of the movie Inception, but with AI.

More on the Inception bit later.

Let's start off with the core value: a preset environment with everything you need to begin building ML models. Everything? Yes, everything.

First, for collaborating on and building models, the Sagemaker environment has Jupyter Notebooks built in. If you're not familiar with Jupyter Notebooks (formerly known as IPython Notebooks), they allow you to combine rich text, code, and the output of code all on the same page. Notebooks have become the go-to development and collaboration method for data scientists working on machine learning problems.

Second, AWS builds ready-to-use algorithms into the platform (10 of the most commonly used ones):

    • K-Means Clustering

    • Principal Component Analysis

    • Neural Topic Modeling

    • Factorization Machines

    • Linear Learner - Regression

    • XGBoost

    • Latent Dirichlet Allocation

    • Image Classification

    • Seq2Seq

    • Linear Learner - Classification

They've also installed the drivers needed to run these algorithms and set the proper configurations, which means you are ready to run out of the gate. But that's not the best part!

AWS assigned teams to make each of the algorithms run faster. These teams worked for months to tune the algorithms to a point where, as Andy Jassy claims, 8 out of the 10 run 10x faster than anywhere else, with the other 2 running 3x faster.

https://www.youtube.com/watch?time_continue=1&v=lM4zhNO5Rbg

I'm impressed by the commitment to AI that this demonstrates on Amazon's part.

Should you not want the cookie-cutter model, you can always bring your framework of choice. Sagemaker comes preconfigured with TensorFlow and Apache MXNet. You can also use Caffe2, CNTK, PyTorch, and Torch, but you have to do that via a Docker container.

With Sagemaker, AWS is also introducing what they call "One-Click Training." All you need to do is point to the S3 datastore where your data resides and, in one click, Sagemaker:

    • sets up an isolated cluster

    • provides a separate SDN

    • sets up auto-scaling

    • sets up the EBS volumes

    • sets up the data pipelines

    • starts training with the algorithm you chose - right away

Pretty impressive! And better, when it's done, Sagemaker tears everything back down.
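
For the curious, here is a rough sketch of what that one click might look like if you drove it through the API instead. It only assembles the keyword arguments for the CreateTrainingJob call; nothing is sent to AWS. The job name, bucket, role ARN, image URI placeholder, and instance settings are all made up for illustration, so treat it as a starting point rather than a recipe.

```python
def build_training_request(job_name, image_uri, role_arn, train_s3_uri,
                           output_s3_uri, hyperparameters=None):
    """Assemble keyword arguments for Sagemaker's CreateTrainingJob call."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,    # container that holds the algorithm
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        # Point the job at the S3 location where the training data lives.
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        # Cluster size and EBS volume; these values are illustrative only.
        "ResourceConfig": {
            "InstanceType": "ml.m4.xlarge",
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
        # The API expects every hyperparameter value as a string.
        "HyperParameters": {k: str(v) for k, v in (hyperparameters or {}).items()},
    }


# All names below are hypothetical placeholders.
request = build_training_request(
    job_name="kmeans-demo",
    image_uri="<algorithm-image-uri>",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    train_s3_uri="s3://example-bucket/train/",
    output_s3_uri="s3://example-bucket/output/",
    hyperparameters={"k": 10, "feature_dim": 784},
)
print(request["InputDataConfig"][0]["DataSource"]["S3DataSource"]["S3Uri"])
```

With credentials configured, the request would be submitted with something like `boto3.client("sagemaker").create_training_job(**request)`.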

Remember I mentioned Inception? Sagemaker also does something dubbed "hyperparameter optimization." It automatically tunes your model for you as it runs. How does it do that? It spins up multiple copies of your model, and then uses machine learning to inform the model. Machine learning within machine learning - can you say Inception?
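
To make the idea concrete, here is a toy sketch of the "many copies, keep the best" loop. To be clear, this is plain random search over a made-up error function, not Sagemaker's tuner, which (as described above) uses machine learning to decide what to try next. The function names, parameter ranges, and the error formula are all invented for illustration.

```python
import random


def validation_error(learning_rate, depth):
    """Toy stand-in for 'train a model and measure its error' (lower is better)."""
    return (learning_rate - 0.1) ** 2 + 0.01 * (depth - 6) ** 2


def random_search(objective, trials, seed=0):
    """Try `trials` random hyperparameter settings and keep the best one."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    best_params, best_score = None, float("inf")
    for _ in range(trials):
        params = {
            "learning_rate": rng.uniform(0.01, 0.5),
            "depth": rng.randint(2, 10),
        }
        score = objective(**params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = random_search(validation_error, trials=50)
print(best_params, best_score)
```

A smarter tuner would replace the random draws with a model that learns from earlier trials which regions look promising, which is where the "machine learning within machine learning" part comes in.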

So, that's building your model. But what about deploying it? Sagemaker covers that as well. With one click, you can deploy your model. Once deployed, Sagemaker:

    • Handles autoscaling

    • Applies security patches

    • Performs health checks

    • Does node scaling
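
If you would rather script the deployment than click, the sketch below shows roughly what the pieces look like: an endpoint configuration, plus a helper that sends one row of data to a running endpoint. The `runtime_client` argument is assumed to be `boto3.client("sagemaker-runtime")`; the config, model, and variant names are hypothetical; and the CSV-in, JSON-out format is an assumption that depends entirely on the model you deploy.

```python
import json


def build_endpoint_config(config_name, model_name,
                          instance_type="ml.m4.xlarge", instance_count=1):
    """Keyword arguments for Sagemaker's CreateEndpointConfig call."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",   # hypothetical variant name
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": instance_count,
        }],
    }


def predict(runtime_client, endpoint_name, features):
    """Send one CSV row to a deployed endpoint and decode a JSON reply.

    Assumes the deployed model accepts text/csv and answers with JSON.
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=",".join(str(x) for x in features),
    )
    return json.loads(response["Body"].read())


config = build_endpoint_config("demo-config", "demo-model")
print(config["ProductionVariants"][0]["InstanceType"])
```

The config would go to `create_endpoint_config(**config)` on a `boto3.client("sagemaker")` client, followed by `create_endpoint` with the same config name.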

Finally, it's modular. You can build and train in Sagemaker. Then, run your model in a different environment. You can also build and train your model in a different environment and run with Sagemaker.

DeepLens

This is the ultimate AI developer's gadget. DeepLens is a deep learning enabled video camera. It's pre-installed with Apache MXNet and includes a library of pre-trained models. Oh, and the Sagemaker service that I was just gushing about? It ties right into DeepLens. So, it's easy for you to feed your models from Sagemaker to DeepLens.

Should you want to automate actions and program DeepLens into your overall application schema, it supports, as you would guess, the use of Lambda functions.

But don't take my word for it. Here is Andy Jassy introducing it at AWS re:Invent:

https://www.youtube.com/watch?time_continue=1&v=RhEVld4GwzU

I'm already thinking about this as the perfect Father's Day gift for me (My darling wife, are you reading this :) ?). You can pre-order it for $249.

Rekognition

This machine learning service earns the "closest-to-TV-show" technology award. We'll get to that in a second but first, the basics. Rekognition identifies objects, people, text, scenes, and activities in both photos and videos.

What does that practically translate into? I'm glad you asked:

    • Facial Recognition and Analysis - identifying a person in a photo or a video. It also analyzes facial features to identify sex, tell if the person is smiling or frowning, if the eyes are open, if they are happy, etc.

    • Person of Interest - This is the function that gives Rekognition the award for "closest-to-TV-show." Rekognition can track people within video even if they go out of the picture and come back. It's crazy how close this technology comes to one of my all time favorite TV shows - Person of Interest. Great for analyzing security camera footage!

https://www.youtube.com/watch?v=WYDWSNMTauQ

    • Unsafe Content - It can identify videos and images with adult-themed activities and content as well as other content that might be objectionable.

    • Image to Text - As mentioned in the intro to this section, Rekognition can turn the text in an image into editable text.
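
Here is a minimal sketch of how a few of the image features above might be called from Python. The `client` argument is assumed to be `boto3.client("rekognition")`, and the image is assumed to already sit in an S3 bucket; the helper names, the summary layout, and the confidence threshold are my own choices for illustration.

```python
def image_ref(bucket, key):
    """Describe an image stored in S3 in the shape Rekognition expects."""
    return {"S3Object": {"Bucket": bucket, "Name": key}}


def summarize_image(client, bucket, key, min_confidence=80.0):
    """Run label, face, moderation, and text detection on one image."""
    image = image_ref(bucket, key)
    labels = client.detect_labels(
        Image=image, MaxLabels=10, MinConfidence=min_confidence)
    faces = client.detect_faces(Image=image, Attributes=["ALL"])
    unsafe = client.detect_moderation_labels(
        Image=image, MinConfidence=min_confidence)
    text = client.detect_text(Image=image)
    return {
        # Objects and scenes found in the picture.
        "labels": [item["Name"] for item in labels["Labels"]],
        # One entry per detected face.
        "smiling": [face["Smile"]["Value"] for face in faces["FaceDetails"]],
        "eyes_open": [face["EyesOpen"]["Value"] for face in faces["FaceDetails"]],
        # Categories of potentially objectionable content.
        "unsafe": [item["Name"] for item in unsafe["ModerationLabels"]],
        # Keep whole lines of text and skip the per-word detections.
        "lines": [item["DetectedText"] for item in text["TextDetections"]
                  if item["Type"] == "LINE"],
    }
```

Calling `summarize_image(boto3.client("rekognition"), "example-bucket", "photo.jpg")` would return one small dictionary per image, which is handy for tagging or filtering uploads.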

Lex

For all of you Alexa fans, this is the go-to service for you. Amazon Lex gives the power of Alexa and its voice recognition and simulation capabilities to developers.

At the beginning of this blog, I mentioned that each of these platform providers has a different flavor based on its heritage. Alexa is the perfect example of that. Since its initial release, Alexa has been used by hundreds of thousands, if not millions, of users. And all of that use has honed and trained Alexa to be better at recognizing and responding to audio. This is an obvious value-add for Amazon. So what does the Lex service offer?

    • Voice/Chatbots - You can build voice and chatbots using this service and then publish them to Facebook Messenger, Slack, Kik, and Twilio SMS.

    • Database Information - Lex ties in with Lambda so you can pull data from your S3 datastores and databases to provide a response to inquiries.

    • Intent Chaining - Lex is capable of complex interactions that bridge multiple intents. So, for example, someone could book a flight through an audio service and then be asked if they need help reserving a hotel room at their destination, restaurant suggestions, etc.

    • Telephony - Lex wasn't just trained for high-quality audio, but also for phone quality audio. Now you can power your own Interactive Voice Response (IVR) system!
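
To give a feel for the Lambda tie-in and intent chaining, here is a sketch of a fulfillment function for a Lex bot. It follows the event and response shapes Lex uses when it calls Lambda, but the intent names (`BookFlight`, `BookHotel`), the slot names, and the wording of the messages are hypothetical, and a real function would look up data and book something before replying.

```python
def plain_text(content):
    """Wrap a string in the message shape Lex expects."""
    return {"contentType": "PlainText", "content": content}


def lambda_handler(event, context):
    """Fulfill the current intent and, after a flight, offer a hotel."""
    intent = event["currentIntent"]
    slots = intent["slots"]
    session = event.get("sessionAttributes") or {}

    if intent["name"] == "BookFlight":
        destination = slots["Destination"]
        session["destination"] = destination
        # Chain into the next intent: ask whether to start a hotel booking,
        # pre-filling the hotel slots we already know.
        return {
            "sessionAttributes": session,
            "dialogAction": {
                "type": "ConfirmIntent",
                "intentName": "BookHotel",
                "slots": {"City": destination,
                          "CheckInDate": slots.get("TravelDate")},
                "message": plain_text(
                    "Your flight to {} is booked. Do you need a hotel there?"
                    .format(destination)),
            },
        }

    # Any other intent: close the conversation.
    return {
        "sessionAttributes": session,
        "dialogAction": {
            "type": "Close",
            "fulfillmentState": "Fulfilled",
            "message": plain_text("All set. Anything else?"),
        },
    }
```

The design choice here is to carry the destination forward in the session attributes and the pre-filled slots, so the user is not asked for the same city twice.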

How could Lex work in the AWS ecosystem? The diagram below, available on AWS's Lex page, gives one example. You can find this image and other examples on that page.

Figure: Amazon Lex being used for a business application. Source: https://aws.amazon.com/lex/

Comprehend

Similar to Lex, this is another one of those services where Amazon uses a natural differentiator: its product descriptions and reviews (possibly the largest collection of its kind). Comprehend looks at text and identifies a series of things: language used, key phrases, places, people, brands, and events. Comprehend also does sentiment analysis and automatically organizes a collection of text by topic.

There are definitely a few interesting use cases for Comprehend, like Voice of the Customer (VOC). For VOC, you could use Comprehend to take text from a variety of sources (emails, call transcripts, social channels, etc.) and glean feedback on your product or service. Another example might be cataloging and then searching a knowledge repository.
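
As a small sketch of the VOC idea, the function below tallies sentiment and key phrases across a pile of customer comments. The `client` argument is assumed to be `boto3.client("comprehend")`; the function name and the way results are rolled up into counters are my own illustration, not something Comprehend does for you.

```python
from collections import Counter


def voice_of_customer(client, documents, language_code="en"):
    """Count sentiment labels and key phrases across customer comments."""
    sentiments = Counter()
    phrases = Counter()
    for text in documents:
        # One of POSITIVE, NEGATIVE, NEUTRAL, or MIXED per document.
        sentiment = client.detect_sentiment(
            Text=text, LanguageCode=language_code)["Sentiment"]
        sentiments[sentiment] += 1
        # Lower-case the phrases so "Battery Life" and "battery life" merge.
        found = client.detect_key_phrases(
            Text=text, LanguageCode=language_code)["KeyPhrases"]
        for phrase in found:
            phrases[phrase["Text"].lower()] += 1
    return sentiments, phrases
```

From there, `phrases.most_common(10)` gives a quick list of what customers keep bringing up, and `sentiments` tells you how they feel about it overall.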

https://www.youtube.com/watch?time_continue=1&v=hdXvVyVjPLg

Translate

As the name would suggest, this service provides real-time translation. Currently, it's available in Preview. So, it isn't a full-fledged offering just yet, but it's certainly worth exploring. Beyond real-time translation, you can also do batch translation, so your applications don't throttle your network!
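
For a quick taste, here is a minimal sketch that translates a list of strings one real-time call at a time. The `client` argument is assumed to be `boto3.client("translate")`, and the helper name and default language pair are mine. Note that this simple loop is not the batch capability mentioned above; it just shows the shape of a single request.

```python
def translate_all(client, texts, source="en", target="es"):
    """Translate each string with one real-time call per string."""
    translated = []
    for text in texts:
        response = client.translate_text(
            Text=text,
            SourceLanguageCode=source,
            TargetLanguageCode=target,
        )
        translated.append(response["TranslatedText"])
    return translated
```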

Transcribe

This is another service that is named for what it does: transcription from speech to text. This offering currently only supports English and Spanish. It was designed to handle poor-quality phone audio, which makes transcribing phone calls a great use case for this service. Beyond the words it currently recognizes, you can extend it with custom vocabulary, and support for multiple speakers is planned for the future.
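
Here is a rough sketch of how a transcription job for a recorded phone call might be described, including the optional custom vocabulary. It only builds the keyword arguments for the StartTranscriptionJob call; the job name, S3 location, and vocabulary name are hypothetical.

```python
def build_transcription_request(job_name, media_uri, language_code="en-US",
                                media_format="wav", vocabulary_name=None):
    """Keyword arguments for Transcribe's StartTranscriptionJob call."""
    request = {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "MediaFormat": media_format,
        "Media": {"MediaFileUri": media_uri},  # audio file stored in S3
    }
    if vocabulary_name:
        # Point the job at a custom vocabulary created beforehand.
        request["Settings"] = {"VocabularyName": vocabulary_name}
    return request


# Hypothetical names, for illustration only.
request = build_transcription_request(
    job_name="support-call-0001",
    media_uri="s3://example-bucket/calls/0001.wav",
    vocabulary_name="product-names",
)
print(request["Settings"])
```

With credentials configured, `boto3.client("transcribe").start_transcription_job(**request)` would kick off the job, and `get_transcription_job` can be polled until its status is no longer `IN_PROGRESS`.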

Polly

If you are picking up on the theme from the previous two entries, you'd likely think this is a parrot service! Close. It actually parrots back (pun intended) the audio from a text file. You give it a text file and it converts it into a standard audio format. An example use case might be reading emails for the busy executive, or catching up on your blog "reading" on the drive home from work.

This is truly a global service in that it supports 24 languages. It also supports many different voices, so you can choose the voice style that is most soothing to your ears.
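
As a sketch of the "read my email to me" use case, the function below turns a string into an MP3 file. The `client` argument is assumed to be `boto3.client("polly")`; the helper name and the default voice are my choices, and any voice the service offers could be swapped in.

```python
def text_to_mp3(client, text, out_path, voice_id="Joanna"):
    """Synthesize `text` with Polly and save the audio as an MP3 file."""
    response = client.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    # The audio comes back as a stream; read it all and write it to disk.
    audio = response["AudioStream"].read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
    return len(audio)  # number of bytes written
```

Something like `text_to_mp3(boto3.client("polly"), email_body, "email.mp3")` would give you a file ready for the drive home.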

_______________________________________________________________________________________

That's the roundup of AWS's latest offerings.

As you can see, there are a ton of services available for us techies to use in our apps. The next blog in this series will look at Google's suite of AI services. So, stay tuned!

Want to learn more about some of the use cases that show artificial intelligence in action?

Be sure to register to view our eWeek webinar all about Artificial Intelligence. It's available on demand!

How to Help Your Business Become an AI Early Adopter

Chris Dunn

Director, Product Marketing