
Geocoding Address Data in Batch Quantities with HERE and Golang

By Nic Raboy | 06 November 2018

If you’re a larger organization processing massive amounts of address information, you’re probably going to burn through your HERE API transactions pretty quickly and rack up a nice bill. The good news is that you can easily prevent this by batching your requests into fewer transactions using the HERE Batch Geocoder API.

With the Batch Geocoder API you can send up to one million addresses to be geocoded in a single transaction. This could potentially save you a lot of money in the long term, depending on the needs of your application.

We’re going to see how to use the HERE Batch Geocoder API with Golang to process a potentially large amount of data, check on the status, and download results to a file.

Building Native Data Structures for the HTTP Response

Rather than trying to use a generic Go interface{} for all of our responses and data, we should probably more accurately model our data so it is easier to work with in the long term.

If you take a look at the documentation, you’ll notice that all responses of the Batch Geocoder API are of the same format. This is good for us because it limits the work we need to do.

Create a main.go file somewhere in your $GOPATH and include the following:

package main

import (
    "bytes"
    "encoding/json"
    "encoding/xml"
    "flag"
    "fmt"
    "io/ioutil"
    "net/http"
    "net/url"
    "os"
)

type BatchResponse struct {
    Response struct {
        MetaInfo struct {
            RequestId string `xml:"RequestId" json:"RequestId"`
        } `xml:"MetaInfo" json:"MetaInfo"`
        Status         string `xml:"Status" json:"Status"`
        TotalCount     int    `xml:"TotalCount" json:"TotalCount"`
        ValidCount     int    `xml:"ValidCount" json:"ValidCount"`
        InvalidCount   int    `xml:"InvalidCount" json:"InvalidCount"`
        ProcessedCount int    `xml:"ProcessedCount" json:"ProcessedCount"`
        PendingCount   int    `xml:"PendingCount" json:"PendingCount"`
        SuccessCount   int    `xml:"SuccessCount" json:"SuccessCount"`
        ErrorCount     int    `xml:"ErrorCount" json:"ErrorCount"`
    } `xml:"Response" json:"Response"`
}

type Geocoder struct {
    AppId   string `json:"app_id"`
    AppCode string `json:"app_code"`
}

func main() { }

A lot of the code above is boilerplate Go code, but we have two data structures. The first models the response. Per the documentation, the response is XML, which isn’t exactly the nicest format to work with. Luckily, Go makes it easy to switch between formats using XML and JSON struct tags, the annotations that follow each field. The tags let us map the response onto a struct with no manual parsing.
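To make the tag mapping concrete, here is a minimal sketch that decodes XML into a struct and re-encodes the same struct as JSON. The StatusResponse type is a trimmed-down, hypothetical stand-in for BatchResponse, and the sample XML is invented to match the struct rather than captured from the API.

```go
package main

import (
	"encoding/json"
	"encoding/xml"
	"fmt"
)

// StatusResponse is a hypothetical, trimmed-down stand-in for BatchResponse.
type StatusResponse struct {
	Response struct {
		MetaInfo struct {
			RequestId string `xml:"RequestId" json:"RequestId"`
		} `xml:"MetaInfo" json:"MetaInfo"`
		Status     string `xml:"Status" json:"Status"`
		TotalCount int    `xml:"TotalCount" json:"TotalCount"`
	} `xml:"Response" json:"Response"`
}

// sampleXML is an invented payload shaped after the struct, not real API output.
const sampleXML = `<SearchBatch>
  <Response>
    <MetaInfo><RequestId>example-request-id</RequestId></MetaInfo>
    <Status>accepted</Status>
    <TotalCount>3</TotalCount>
  </Response>
</SearchBatch>`

// xmlToJSON decodes XML into the struct, then encodes that struct as JSON.
func xmlToJSON(input string) (string, error) {
	var parsed StatusResponse
	if err := xml.Unmarshal([]byte(input), &parsed); err != nil {
		return "", err
	}
	out, err := json.Marshal(parsed)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	out, err := xmlToJSON(sampleXML)
	if err != nil {
		fmt.Println("conversion failed:", err)
		return
	}
	fmt.Println(out)
}
```

The xml tags drive the decoding step and the json tags drive the encoding step, so the same struct serves both formats.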

More information on working with XML data in Go can be found in a previous tutorial I wrote titled, Parse XML Data in a Golang Application.

The Geocoder data structure will hold our app id and app code found in the HERE Developer Portal. To use this API you must have a developer account, which is free.

Developing Functions for the HERE Batch Geocoder API Requests

With the data models in place, we can focus on developing functions that will process our requests. As of now there is no Golang SDK for the HERE services, but REST is available and not difficult to use.

Using the HERE Batch Geocoder API is a multi-step process. This means that no single request will allow us to upload data and get a geocoded response back. Instead, there are a few things to do, the first of which is upload data and start the process.

Take a look at the following batch function:

func (geocoder *Geocoder) batch(payload []byte) (BatchResponse, error) {
    endpoint, _ := url.Parse("https://batch.geocoder.api.here.com/6.2/jobs")
    queryParams := endpoint.Query()
    queryParams.Set("app_id", geocoder.AppId)
    queryParams.Set("app_code", geocoder.AppCode)
    queryParams.Set("indelim", "|")
    queryParams.Set("outdelim", "|")
    queryParams.Set("outcols", "displayLatitude,displayLongitude,locationLabel,houseNumber,street,district,city,postalCode,county,state,country")
    queryParams.Set("outputcombined", "false")
    queryParams.Set("action", "run")
    endpoint.RawQuery = queryParams.Encode()
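    response, err := http.Post(endpoint.String(), "text/plain", bytes.NewBuffer(payload))
    if err != nil {
        return BatchResponse{}, err
    }
    // Close the body so the underlying connection can be released.
    defer response.Body.Close()
    data, err := ioutil.ReadAll(response.Body)
    if err != nil {
        return BatchResponse{}, err
    }
    var batchResponse BatchResponse
    // Report malformed XML instead of silently returning an empty struct.
    if err := xml.Unmarshal(data, &batchResponse); err != nil {
        return BatchResponse{}, err
    }
    return batchResponse, nil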
}

Most of the above code constructs the request. The batch function requires a payload, which in this case is the byte data of a CSV file. Because indelim is set to |, the columns in that file are expected to be pipe-delimited rather than comma-delimited. After setting the query parameters, a POST request is made with the file contents as the request body.
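If your addresses live in memory rather than in a file, the payload can be assembled directly. The following is a rough sketch, assuming a header row of recId, searchText, and country columns; those column names are my assumption here, so check them against the Batch Geocoder documentation before relying on them. The Record type and buildPayload function are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Record is a hypothetical holder for one address to geocode.
type Record struct {
	SearchText string
	Country    string
}

// buildPayload joins a header row and one row per record using the "|"
// delimiter that the batch function passes as indelim.
// The recId, searchText, and country column names are assumptions.
func buildPayload(records []Record) []byte {
	var b strings.Builder
	b.WriteString("recId|searchText|country\n")
	for i, record := range records {
		// A literal "|" inside an address would split the column, so replace it.
		clean := strings.ReplaceAll(record.SearchText, "|", " ")
		fmt.Fprintf(&b, "%04d|%s|%s\n", i+1, clean, record.Country)
	}
	return []byte(b.String())
}

func main() {
	payload := buildPayload([]Record{
		{SearchText: "425 W Randolph St Chicago IL", Country: "USA"},
		{SearchText: "Invalidenstraße 116 10115 Berlin", Country: "DEU"},
	})
	fmt.Print(string(payload))
}
```

The resulting byte slice could be handed to the batch function in place of the file contents.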

When the request is successful, the XML body that is returned is unmarshalled into a BatchResponse value. Without getting too far ahead of ourselves, the final outcome for this request might look something like the following image:

[Image: here-batch-geocode-run-golang]

If everything went smoothly, you’d be given a RequestId and a status saying the request was accepted. The RequestId will be used in future requests for checking the status or downloading the results.

With the batch process running remotely, we’d want to check the status with a status function like this:

func (geocoder *Geocoder) status(id string) (BatchResponse, error) {
    endpoint, _ := url.Parse("https://batch.geocoder.api.here.com/6.2/jobs/" + id)
    queryParams := endpoint.Query()
    queryParams.Set("app_id", geocoder.AppId)
    queryParams.Set("app_code", geocoder.AppCode)
    queryParams.Set("action", "status")
    endpoint.RawQuery = queryParams.Encode()
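    response, err := http.Get(endpoint.String())
    if err != nil {
        return BatchResponse{}, err
    }
    // Close the body so the underlying connection can be released.
    defer response.Body.Close()
    data, err := ioutil.ReadAll(response.Body)
    if err != nil {
        return BatchResponse{}, err
    }
    var batchResponse BatchResponse
    // Report malformed XML instead of silently returning an empty struct.
    if err := xml.Unmarshal(data, &batchResponse); err != nil {
        return BatchResponse{}, err
    }
    return batchResponse, nil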
}

In the above code, the request is a little different. Instead of a POST request, we’re doing a GET request, and the RequestId is required as part of the URL path. Assuming all went well, the response will be parsed into a BatchResponse object and returned. In the end, it might look something like the following image:

[Image: here-batch-geocode-status-golang]

The status response and the batch response are more or less the same, except that the status response includes counts describing how far the process has progressed. When the process has completed, you’ll have the final counts and be able to download the results.

The final step in the process is to download a ZIP archive of the data. We can create a data function that looks like the following:

func (geocoder *Geocoder) data(id string) ([]byte, error) {
    endpoint, _ := url.Parse("https://batch.geocoder.api.here.com/6.2/jobs/" + id + "/result")
    queryParams := endpoint.Query()
    queryParams.Set("app_id", geocoder.AppId)
    queryParams.Set("app_code", geocoder.AppCode)
    endpoint.RawQuery = queryParams.Encode()
    response, err := http.Get(endpoint.String())
    if err != nil {
        return nil, err
    }
    // Close the body once the archive has been read into memory.
    defer response.Body.Close()
    return ioutil.ReadAll(response.Body)
}

We cannot create a response model for this function because we’re getting binary data back. In the end we’ll want to take the binary data and save it to a file on the computer.
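Although the archive cannot be modeled as a struct, the bytes can still be inspected in memory with Go's archive/zip package. Here is a small sketch of that idea. The readArchive and makeArchive names are hypothetical, and the archive is built locally with invented contents because the real file names and layout of the download are not covered here. It uses io.ReadAll, which needs Go 1.16 or newer.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
)

// readArchive returns the contents of every file in an in-memory ZIP, keyed by name.
func readArchive(data []byte) (map[string]string, error) {
	reader, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return nil, err
	}
	files := make(map[string]string)
	for _, file := range reader.File {
		rc, err := file.Open()
		if err != nil {
			return nil, err
		}
		content, err := io.ReadAll(rc)
		rc.Close()
		if err != nil {
			return nil, err
		}
		files[file.Name] = string(content)
	}
	return files, nil
}

// makeArchive builds a one-file ZIP in memory so the sketch runs without the API.
func makeArchive(name, content string) ([]byte, error) {
	var buf bytes.Buffer
	writer := zip.NewWriter(&buf)
	entry, err := writer.Create(name)
	if err != nil {
		return nil, err
	}
	if _, err := entry.Write([]byte(content)); err != nil {
		return nil, err
	}
	if err := writer.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// The file name and rows below are invented for illustration.
	data, err := makeArchive("result.txt", "recId|displayLatitude|displayLongitude\n")
	if err != nil {
		fmt.Println("could not build archive:", err)
		return
	}
	files, err := readArchive(data)
	if err != nil {
		fmt.Println("could not read archive:", err)
		return
	}
	for name, content := range files {
		fmt.Printf("%s: %q\n", name, content)
	}
}
```

In the real application, the byte slice returned by the data function would take the place of the locally built archive.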

So how do we use each of the functions we created?

Establishing a CLI for Executing Requests Against the API

The final part of this project is to bring everything together. We’re actually going to build our own little CLI, so we’re going to define some possible flags and how to react to them.

Let’s have a look at the main function:

func main() {
    filepath := flag.String("p", "", "Path to CSV Data")
    status := flag.String("s", "", "Id for status of batch process")
    download := flag.String("d", "", "Id for download of batch process")
    flag.Parse()
    geocoder := Geocoder{AppId: "APP-ID-HERE", AppCode: "APP-CODE-HERE"}
    if *filepath != "" {

    } else if *status != "" {

    } else if *download != "" {

    }
}

In the main function, we define three possible flags for when the application is run: a path to a CSV file, a status flag that takes an id, and a download flag that also takes an id.

When the application is run, a simple conditional statement checks which flag was provided. In this example we’re assuming only one flag is provided per run.
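The one-flag-per-run assumption is not enforced by the conditional, so passing two flags would quietly run only the first branch. If you want to reject that case, something along these lines could work. It is only a sketch: parseMode and the mode names are hypothetical, and it uses its own flag.FlagSet so it can be exercised with an explicit argument list.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// parseMode parses the three flags and reports which single one was set.
// It returns an error when none or more than one is provided.
func parseMode(args []string) (string, string, error) {
	fs := flag.NewFlagSet("geocoder", flag.ContinueOnError)
	filepath := fs.String("p", "", "Path to CSV Data")
	status := fs.String("s", "", "Id for status of batch process")
	download := fs.String("d", "", "Id for download of batch process")
	if err := fs.Parse(args); err != nil {
		return "", "", err
	}
	candidates := []struct{ name, value string }{
		{"batch", *filepath},
		{"status", *status},
		{"download", *download},
	}
	var mode, value string
	for _, candidate := range candidates {
		if candidate.value == "" {
			continue
		}
		if mode != "" {
			return "", "", errors.New("provide only one of -p, -s, or -d")
		}
		mode, value = candidate.name, candidate.value
	}
	if mode == "" {
		return "", "", errors.New("provide one of -p, -s, or -d")
	}
	return mode, value, nil
}

func main() {
	mode, value, err := parseMode([]string{"-s", "example-request-id"})
	fmt.Println(mode, value, err)

	_, _, err = parseMode([]string{"-p", "addresses.csv", "-d", "example-request-id"})
	fmt.Println(err)
}
```

In the real main function you would pass os.Args[1:] and switch on the returned mode.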

If the filepath is present, we can execute the following:

if *filepath != "" {
    data, err := ioutil.ReadFile(*filepath)
    if err != nil {
        fmt.Printf("Could not read the CSV file: %s\n", err)
        return
    }
    result, err := geocoder.batch(data)
    if err != nil {
        fmt.Printf("The HTTP request failed with error %s\n", err)
        return
    }
    jsonData, _ := json.Marshal(result)
    fmt.Println(string(jsonData))
}

The CSV file is read and the byte data is passed as a payload to the batch function. The response is then converted into JSON and printed in the terminal. The output in the example images was pretty printed with an optional Python tool.

If the status is present, we can execute the following:

if *status != "" {
    result, err := geocoder.status(*status)
    if err != nil {
        fmt.Printf("The HTTP request failed with error %s\n", err)
        return
    }
    jsonData, _ := json.Marshal(result)
    fmt.Println(string(jsonData))
}

In the above code, we are taking the provided RequestId and passing it to the status function for an update. The result is also converted into JSON and printed in the Terminal.

Finally, we have the last part of the process:

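if *download != "" {
    result, err := geocoder.data(*download)
    if err != nil {
        // Report on stderr so the message never ends up in the redirected archive.
        fmt.Fprintf(os.Stderr, "The HTTP request failed with error %s\n", err)
        return
    }
    // Write the raw bytes to stdout without converting them to a string.
    os.Stdout.Write(result)
}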

In the above code, we pass the RequestId to the data function and write the binary data that is returned to standard output. We can save that output to a file like so:

go run *.go -d yCQBgMSQgdXhxnwkdMqjMec3doxeZl2v > output.zip

If you’d rather not redirect the output into a file, you could always implement some file logic within the Go application directly. From a scripting perspective, I thought it’d be more valuable to allow redirecting or piping the data.

Conclusion

You just saw how to use the HERE Batch Geocoder API in your Golang application. While batching requests might not be useful in every scenario, in the right one it could save your company a lot of money in API transactions.

If you’d like to see how to use the Geocoder API without batching requests in Go, check out a previous tutorial I wrote on the topic titled, Process CSV Address Information with Golang and the HERE Geocoder API.