Last Updated: July 31, 2020

Extend the capability of SageMaker with HERE Location Services

Introduction

Duration is 5 min

How do you optimize Machine Learning? Start HERE

Location intelligence that deepens Machine Learning

Rising customer expectations have put unprecedented pressure on businesses. In response, forward-thinking businesses use Machine Learning to tap into the big data needed to predict what customers want, when they want it, and where they want to get it. To optimize Machine Learning, organizations must be able to improve the location data within their datasets and avoid incomplete location data and missing datasets. Those are just a few of the many reasons location intelligence has become a critical tool for organizations around the world, regardless of industry.

Explore how location intelligence combined with Machine Learning dramatically increases your ability to respond to the needs of your organization.

Benefits of integrating HERE location awareness technology with SageMaker

  • Increases business insights with deeper visualization of all aspects of business
  • Enables real-time tracking of assets, devices, products and people in the field
  • Creates visibility of entire shipments down to individual SKUs
  • Facilitates ad-hoc queries for any neighborhood, city, territory and region
  • Delivers intelligent location data for better decision-making
  • Provides unmatched flexibility with cloud-based services
  • Adds valuable context through geospatial data
  • Drives business value through differentiated services like route optimization, sequencing of waypoints and more
  • Provides map visualization, geocoding, optimized routing and more

Start the tutorial

This tutorial walks you through the steps required to integrate your Machine Learning (ML) data pipeline with HERE Location Services. This tutorial will use Amazon SageMaker to manage the ML workflow.

What you’ll learn

  • How to leverage HERE Location Services to enrich an ML dataset with additional location information
  • How to integrate the ML dataset with Amazon SageMaker

What you’ll build

  • AWS Lambda function that will call HERE Location Services and use the returned data to update the ML dataset with additional location data.

Assumptions & Prerequisites

  • A familiarity with cloud computing and AWS products
  • Working knowledge of Amazon SageMaker

AWS Resources

Get set up

Duration is 10 min

What you’ll need

An Amazon AWS Account

  • Set up an account with Amazon Web Services (AWS)

A HERE Location Services Account

Get started with Amazon SageMaker

Duration is 45 min

Machine Learning with Amazon SageMaker

The following diagram illustrates the typical workflow for creating a machine learning model:

img

Get Started

This tutorial focuses on the integration between Amazon SageMaker and HERE Location Services. For the most up-to-date steps on creating a sample SageMaker notebook, refer to the following documentation:

  • Create an Amazon SageMaker Notebook Instance
  • Train a Model with a Built-in Algorithm and Deploy It

Set Up Amazon SageMaker

Get Started

Sample Dataset

For this tutorial, we’ll use a fictional dataset that is a list of incidents. The incident list has attributes such as case#, address, description and timestamp.

With the current data, ML models will be able to visualize and aggregate data points including:

  • Number of incidents over time
  • Types of incidents over time
  • Limited location insights such as number of incidents at a specific address
 {
   "results": [
     {
       "case": "12345",
       "vicinity": "1300 1st Ave Seattle, WA 98101",
       "description": "Traffic Collision",
       "timestamp": "Dec 31 2018 13:28:14",
       "position": [
         47.60707,
         -122.3384
       ]
     },
     {
       "case": "86743",
       "vicinity": "400 Broad St<br/>Seattle WA 98109",
       "description": "Disturbance",
       "timestamp": "Dec 31 2018 17:28:14",
       "position": [
         47.61993,
         -122.34867
       ]
     },
     {
       "case": "9798",
       "vicinity": "1000 4th Ave Seattle WA 98104",
       "description": "Public Intoxication",
       "timestamp": "Dec 31 2018 23:32:10",
       "position": [
         47.60668,
         -122.33266
       ]
     },
     {
       "case": "45323",
       "vicinity": "1300 1st Ave Seattle WA 98101",
       "description": "Noise Disturbance",
       "timestamp": "Jan 01 2018 1:32:33",
       "position": [
         47.60707,
         -122.3384
       ]
     }
   ]
 }
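
As a rough illustration of the aggregations listed above, the following sketch counts incidents by type, by address and by day using only the Python standard library. The records are copied from the sample dataset (positions omitted for brevity); the helper name normalize_address and the choice to strip commas and "<br/>" markers before comparing addresses are assumptions made for this example, not part of the tutorial's notebook.

```python
from collections import Counter
from datetime import datetime

# Records copied from the sample dataset above (positions omitted for brevity).
incidents = [
    {"case": "12345", "vicinity": "1300 1st Ave Seattle, WA 98101",
     "description": "Traffic Collision", "timestamp": "Dec 31 2018 13:28:14"},
    {"case": "86743", "vicinity": "400 Broad St<br/>Seattle WA 98109",
     "description": "Disturbance", "timestamp": "Dec 31 2018 17:28:14"},
    {"case": "9798", "vicinity": "1000 4th Ave Seattle WA 98104",
     "description": "Public Intoxication", "timestamp": "Dec 31 2018 23:32:10"},
    {"case": "45323", "vicinity": "1300 1st Ave Seattle WA 98101",
     "description": "Noise Disturbance", "timestamp": "Jan 01 2018 1:32:33"},
]


def normalize_address(vicinity):
    """Hypothetical helper: make addresses comparable by dropping
    '<br/>' markers and commas and collapsing repeated whitespace."""
    cleaned = vicinity.replace("<br/>", " ").replace(",", " ")
    return " ".join(cleaned.split())


def incident_day(timestamp):
    """Parse the dataset's timestamp format and return the ISO date."""
    return datetime.strptime(timestamp, "%b %d %Y %H:%M:%S").date().isoformat()


by_type = Counter(i["description"] for i in incidents)
by_address = Counter(normalize_address(i["vicinity"]) for i in incidents)
by_day = Counter(incident_day(i["timestamp"]) for i in incidents)

print(by_type.most_common())
print(by_address.most_common())
print(by_day.most_common())
```

Without the address normalization, the two records for 1300 1st Ave would be counted as different locations because one of them contains a comma.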

Enrich the Dataset with HERE Location Services

In order to enhance the capabilities of the Machine Learning models that will be applied to the dataset, you’ll enrich the data by leveraging HERE Location Services.

Additional data that will be added to each incident includes:

  • Nearby points of interest such as schools, hospitals, police stations and fire stations

With this additional incident data, your ML models will have more dimensions to include within the model’s learning process. Potential insights may include:

  • Number of incidents within a certain distance from a hospital or school
  • Type of incidents occurring within a certain distance from a school
  • The types of incidents occurring within a given distance of a specific point of interest
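
The distance-based insights above can be sketched with a great-circle calculation. The example below is a minimal sketch, assuming positions are [latitude, longitude] pairs in degrees as in the sample dataset; the function names haversine_m and count_within are hypothetical, and the haversine formula on a spherical Earth is only an approximation of real-world distance.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)


def haversine_m(a, b):
    """Great-circle distance in meters between two [lat, lon] pairs in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def count_within(incidents, poi_position, radius_m):
    """Count incidents whose position lies within radius_m of a point of interest."""
    return sum(
        1 for incident in incidents
        if haversine_m(incident["position"], poi_position) <= radius_m
    )


# Positions taken from the sample dataset.
incidents = [
    {"case": "12345", "position": [47.60707, -122.3384]},
    {"case": "86743", "position": [47.61993, -122.34867]},
    {"case": "9798", "position": [47.60668, -122.33266]},
    {"case": "45323", "position": [47.60707, -122.3384]},
]

# Example point of interest: a school position, used here only for illustration.
school = [47.62385, -122.34937]
print(count_within(incidents, school, radius_m=500))
```

If the location service already returns a distance for each point of interest, that value can be used directly; a local calculation like this is mainly useful for comparing incidents against a fixed list of places.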

For more information on how to trigger an AWS Lambda function when a file is dropped on Amazon S3, refer to ‘Using AWS Lambda with Amazon S3’.
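
As a hedged sketch of that trigger, the handler below only extracts the bucket and object key from an S3 notification event; the enrichment itself is left out. The function name extract_s3_objects and the rule that skips files ending in ".enriched.json" are assumptions for this example. Object keys in S3 events arrive URL-encoded, so they are decoded before use.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return (bucket, key) pairs from an S3 notification event."""
    pairs = []
    for record in event.get("Records", []):
        s3_info = record.get("s3")
        if not s3_info:
            continue  # not an S3 record
        bucket = s3_info["bucket"]["name"]
        key = unquote_plus(s3_info["object"]["key"])  # keys are URL-encoded
        pairs.append((bucket, key))
    return pairs


def lambda_handler(event, context):
    """Entry point: list the uploaded files that still need enrichment."""
    to_process = []
    for bucket, key in extract_s3_objects(event):
        # Assumption: enriched output is written to the same bucket, so skip
        # it to avoid the function re-triggering on its own output.
        if key.endswith(".enriched.json"):
            continue
        to_process.append({"bucket": bucket, "key": key})
    return {"to_process": to_process}


# Trimmed-down event containing only the fields read above.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "my-bucket"},
                "object": {"key": "incoming/incidents+2018.json"}}},
        {"s3": {"bucket": {"name": "my-bucket"},
                "object": {"key": "incidents.enriched.json"}}},
    ]
}
print(lambda_handler(sample_event, None))
```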

Integrate Places API with Amazon SageMaker

Duration is 20 min

img

Transform Dataset

We will use a Jupyter Notebook to update our fictitious ‘incidents’ dataset.

Click ‘Open Jupyter Notebook’ from your notebook instance

img

Create a ‘New’ notebook and use ‘conda_python3’ for the notebook type

img

Set variables for S3 file location

 from sagemaker import get_execution_role

 role = get_execution_role()
 bucket = '<S3_BUCKET_NAME>'
 sourcefileName = 'incidents.json'
 destfileName = 'incidents.enriched.json'
 print('bucket: {}, sourcefileName:{}, destfileName:{}'.format(bucket, sourcefileName, destfileName))
 print(role)

The HERE API takes latitude and longitude coordinates and returns nearby points of interest.

 import json
 import requests
 import boto3
 import os

 def checkKey(d, key):
     # True if the key is present in the dictionary
     return key in d

 def getPoiData(query, coord):
     # coord is [latitude, longitude]; query is a search term such as "hospital"
     params = {
         "at": str(coord[0]) + "," + str(coord[1]),
         "q": query,
         "apiKey": "<YOUR-API-KEY>",
     }
     resp = requests.get("https://places.cit.api.here.com/places/v1/autosuggest",
         params=params)
     resp.raise_for_status()

     # The matching places are returned under the "results" key
     return resp.json().get("results", [])
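
The items returned by the API carry presentation fields such as highlighted titles and HTML line breaks that a model is unlikely to need. The following is a minimal sketch of trimming them before they are stored, assuming each item looks like the entries in the enriched dataset shown later in this tutorial; the function name slim_pois and the set of fields kept are choices made for this example.

```python
import re

# Fields kept for each point of interest. This selection is an assumption,
# based on the keys that appear in the tutorial's enriched dataset.
KEEP_FIELDS = ("title", "category", "distance", "position")


def slim_pois(items):
    """Return trimmed copies of POI items, skipping entries without a position."""
    slimmed = []
    for item in items:
        if "position" not in item:
            continue  # nothing to locate, so it cannot be used as a place
        poi = {field: item[field] for field in KEEP_FIELDS if field in item}
        if "vicinity" in item:
            # Turn the HTML line break into a plain comma separator.
            poi["vicinity"] = re.sub(r"\s*<br\s*/?>\s*", ", ", item["vicinity"])
        slimmed.append(poi)
    return slimmed


sample_items = [
    {
        "title": "Seattle Children's Hospital",
        "highlightedTitle": "Seattle Children's <b>Hospital</b>",
        "vicinity": "4800 Sand Point Way NE<br/>Seattle, WA 98105",
        "position": [47.66337, -122.28278],
        "category": "hospital",
        "categoryTitle": "Hospital",
        "distance": 6912,
    },
    {"title": "hospital"},  # an entry without a position is dropped
]
print(slim_pois(sample_items))
```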

Load the dataset from S3, call the ‘getPoiData’ function, and save the updated enriched dataset back to S3.

 # Helpers that read and write JSON objects in the S3 bucket
 s3 = boto3.resource("s3").Bucket(bucket)
 json.load_s3 = lambda f: json.load(s3.Object(key=f).get()["Body"])
 json.dump_s3 = lambda obj, f: s3.Object(key=f).put(Body=json.dumps(obj))

 jsonData = json.load_s3(sourcefileName)

 for incident in jsonData["results"]:
     # Only look up points of interest that are not already present
     if "hospitals" not in incident:
         hospitals = getPoiData("hospital", incident["position"])
         incident["hospitals"] = hospitals

     if "schools" not in incident:
         schools = getPoiData("school", incident["position"])
         incident["schools"] = schools

     print("Enriched case {}".format(incident["case"]))

 json.dump_s3(jsonData, destfileName)
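
Two records in the sample dataset (cases 12345 and 45323) share the same position, so the loop asks the API the same question twice. One possible refinement, sketched below, is to remember results per query and coordinate. The wrapper name make_cached_lookup is hypothetical, and fake_fetch stands in for the real lookup function so the example runs without network access.

```python
def make_cached_lookup(fetch):
    """Wrap a fetch(query, coord) function so repeated lookups reuse results."""
    cache = {}

    def lookup(query, coord):
        # Round the coordinates so equal positions map to the same cache key.
        key = (query, round(coord[0], 5), round(coord[1], 5))
        if key not in cache:
            cache[key] = fetch(query, coord)
        return cache[key]

    return lookup


calls = []


def fake_fetch(query, coord):
    """Stand-in for the real API call; records each invocation."""
    calls.append((query, tuple(coord)))
    return [{"title": "example " + query, "position": list(coord)}]


lookup = make_cached_lookup(fake_fetch)

incidents = [
    {"case": "12345", "position": [47.60707, -122.3384]},
    {"case": "45323", "position": [47.60707, -122.3384]},
    {"case": "86743", "position": [47.61993, -122.34867]},
]

for incident in incidents:
    incident["hospitals"] = lookup("hospital", incident["position"])
    incident["schools"] = lookup("school", incident["position"])

print(len(calls))  # number of fetches actually made
```

Incidents at the same position end up sharing the same result list object, which is fine when the data is only serialized to JSON afterwards; copy the list if individual incidents will be modified later.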

Test it

Click the ‘Run’ button.

img

For demo purposes, we put “print” statements into the code.

img

Enriched Dataset

The updated dataset will be saved to S3. Notice the new ‘hospitals’ and ‘schools’ nodes following the ‘position’ node.

img
 {
   "results": [
     {
       "case": "12345",
       "vicinity": "1300 1st Ave Seattle, WA 98101",
       "description": "Traffic Collision",
       "timestamp": "Dec 31 2018 13:28:14",
       "position": [
         47.60707,
         -122.3384
       ],
       "hospitals": [
         {
           "title": "Seattle Children's Hospital",
           "highlightedTitle": "Seattle Children's <b>Hospital</b>",
           "vicinity": "4800 Sand Point Way NE<br/>Seattle, WA 98105",
           "highlightedVicinity": "4800 Sand Point Way NE<br/>Seattle, WA 98105",
           "position": [
             47.66337,
             -122.28278
           ],
           "category": "hospital",
           "categoryTitle": "Hospital",
           "distance": 6912
         }
       ],
       "schools": [
         {
           "title": "Pacific Northwest Ballet School",
           "highlightedTitle": "Pacific Northwest Ballet <b>School</b>",
           "vicinity": "301 Mercer St<br/>Seattle, WA 98109",
           "highlightedVicinity": "301 Mercer St<br/>Seattle, WA 98109",
           "position": [
             47.62385,
             -122.34937
           ],
           "category": "education-facility",
           "categoryTitle": "Educational Facility",
           "distance": 438
         },
         {
           "title": "School of Visual Concepts",
           "highlightedTitle": "<b>School</b> of Visual Concepts",
           "vicinity": "2300 7th Ave<br/>Seattle, WA 98121",
           "highlightedVicinity": "2300 7th Ave<br/>Seattle, WA 98121",
           "position": [
             47.61769,
             -122.3414
           ],
           "category": "education-facility",
           "categoryTitle": "Educational Facility",
           "distance": 597
         }
       ]
     },
     {
       "case": "86743",
       "vicinity": "400 Broad St<br/>Seattle WA 98109",
       "description": "Disturbance",
       "timestamp": "Dec 31 2018 17:28:14",
       "position": [
         47.61993,
         -122.34867
       ],
       "hospitals": [
         {
           "title": "Seattle Children's Hospital",
           "highlightedTitle": "Seattle Children's <b>Hospital</b>",
           "vicinity": "4800 Sand Point Way NE<br/>Seattle, WA 98105",
           "highlightedVicinity": "4800 Sand Point Way NE<br/>Seattle, WA 98105",
           "position": [
             47.66337,
             -122.28278
           ],
           "category": "hospital",
           "categoryTitle": "Hospital",
           "distance": 6912
         }
       ],
       "schools": [
         {
           "title": "Pacific Northwest Ballet School",
           "highlightedTitle": "Pacific Northwest Ballet <b>School</b>",
           "vicinity": "301 Mercer St<br/>Seattle, WA 98109",
           "highlightedVicinity": "301 Mercer St<br/>Seattle, WA 98109",
           "position": [
             47.62385,
             -122.34937
           ],
           "category": "education-facility",
           "categoryTitle": "Educational Facility",
           "distance": 438
         },
         {
           "title": "School of Visual Concepts",
           "highlightedTitle": "<b>School</b> of Visual Concepts",
           "vicinity": "2300 7th Ave<br/>Seattle, WA 98121",
           "highlightedVicinity": "2300 7th Ave<br/>Seattle, WA 98121",
           "position": [
             47.61769,
             -122.3414
           ],
           "category": "education-facility",
           "categoryTitle": "Educational Facility",
           "distance": 597
         }
       ]
     },
     {
       "case": "9798",
       "vicinity": "1000 4th Ave Seattle WA 98104",
       "description": "Public Intoxication",
       "timestamp": "Dec 31 2018 23:32:10",
       "position": [
         47.60668,
         -122.33266
       ]
     },
     {
       "case": "45323",
       "vicinity": "1300 1st Ave Seattle WA 98101",
       "description": "Noise Disturbance",
       "timestamp": "Jan 01 2018 1:32:33",
       "position": [
         47.60707,
         -122.3384
       ]
     }
   ]
 }
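
Before training, the nested ‘hospitals’ and ‘schools’ nodes usually need to be reduced to numeric columns. The sketch below shows one way to do that, assuming each point of interest carries a ‘distance’ value in meters as in the records above; the function name poi_features, the feature names and the 500 m radius are illustrative choices rather than anything prescribed by SageMaker or HERE.

```python
def poi_features(incident, radius_m=500):
    """Flatten the nested POI lists of one incident into simple numeric features."""

    def distances(key):
        # Incidents that were not enriched simply yield an empty list.
        return [poi["distance"] for poi in incident.get(key, []) if "distance" in poi]

    hospitals = distances("hospitals")
    schools = distances("schools")
    return {
        "case": incident["case"],
        "nearest_hospital_m": min(hospitals) if hospitals else None,
        "nearest_school_m": min(schools) if schools else None,
        "schools_within_radius": sum(1 for d in schools if d <= radius_m),
    }


# Distances copied from the enriched dataset; other fields omitted for brevity.
enriched = [
    {"case": "86743",
     "hospitals": [{"title": "Seattle Children's Hospital", "distance": 6912}],
     "schools": [{"title": "Pacific Northwest Ballet School", "distance": 438},
                 {"title": "School of Visual Concepts", "distance": 597}]},
    {"case": "9798"},  # record without POI nodes
]

rows = [poi_features(incident) for incident in enriched]
for row in rows:
    print(row)
```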

Review

In this tutorial, we created a SageMaker Jupyter Notebook that used HERE Location Services to add valuable location data to a fictitious ‘incidents’ dataset during the “fetch” phase of the Machine Learning workflow. The additional location data will allow Machine Learning models to draw greater insights from the data once it is cleaned and prepared for training and evaluation.