Probe data
Probe data, also known as floating car data (FCD), is named after the concept that a vehicle is working as a probe into the overall flow of traffic. Probe data is distinguished from sensor data, which is generally produced by some kind of sensor installed in or next to a roadway. Sensors can usually monitor every vehicle that triggers them, whereas probe data is available only from vehicles that are equipped to produce probe data. The advantage of probe data is that it requires no investment in installing and maintaining sensor equipment, and probe data can be collected anywhere, not just where sensors are installed.
HERE probe data comes from vehicles with GNSS receivers (for example, GPS) installed. The position of each vehicle is established using this satellite-based positioning system. This makes satellite-based probe data more valuable than probe data generated from mobile cellular network systems, which can locate a vehicle only as precisely as the spacing of the cell towers allows. For example, GPS positioning data is precise enough to reliably indicate which road the vehicle is on.
A HERE Probe Data dataset is a list of probe points in the Protobuf format. For further details on the format, see the HERE Probe Data Catalog Specification.
Probe points
Each probe point is a record of where a vehicle was at a particular point in time, and how it was moving. A point has a timestamp (in seconds) and a GNSS/GPS coordinate (latitude and longitude). Most probe points also have a speed and heading value.
Each probe point has an anonymized trace ID that it shares with other probe points from the same vehicle. These IDs make it possible to reconstruct the path the vehicle takes. A future enhancement will add a vehicle type identifier to each point, for example truck, passenger, or taxi.
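As an illustration of how trace IDs tie points together, the following Python sketch groups probe points into per-vehicle paths. The `ProbePoint` class and its field names are hypothetical stand-ins chosen for this example, not the HERE Protobuf schema; it assumes the points have already been decoded into memory.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class ProbePoint:
    """Hypothetical in-memory probe point; NOT the HERE Protobuf schema."""
    trace_id: str                    # anonymized ID shared by one vehicle's points
    timestamp: int                   # seconds
    latitude: float                  # degrees
    longitude: float                 # degrees
    speed: Optional[float] = None    # most, but not all, points carry a speed
    heading: Optional[float] = None  # most, but not all, points carry a heading


def group_by_trace(points: Iterable[ProbePoint]) -> Dict[str, List[ProbePoint]]:
    """Group points by trace ID and order each group by timestamp."""
    traces: Dict[str, List[ProbePoint]] = defaultdict(list)
    for point in points:
        traces[point.trace_id].append(point)
    for trace in traces.values():
        trace.sort(key=lambda p: p.timestamp)
    return dict(traces)
```

Each value in the returned dictionary is then the time-ordered path of one vehicle.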
Each vehicle in the HERE fleet records probe points based on its installed equipment and connection technologies. The rate at which points are generated varies across the fleet, from one per second to one per minute. It is not possible to select only vehicles with a particular sample rate, as HERE Probe Data always combines the probe points from all vehicles in a given area. If you require a fixed sample rate, HERE recommends that you filter the dataset you receive.
For more information on the HERE fleet, see HERE fleet.
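One possible way to filter a received dataset toward a common sample rate is to thin each trace so that kept points are at least a chosen interval apart. The Python sketch below assumes points have been decoded into a hypothetical `Sample` tuple (not the HERE schema), and `thin_to_interval` is an illustrative helper name. It produces a minimum spacing, not an exactly uniform rate, and it cannot add points to traces that are already sparser than the target.

```python
from typing import Dict, Iterable, List, NamedTuple


class Sample(NamedTuple):
    """Hypothetical decoded probe point; field names are assumptions."""
    trace_id: str
    timestamp: int  # seconds
    latitude: float
    longitude: float


def thin_to_interval(points: Iterable[Sample], min_interval_s: int = 60) -> List[Sample]:
    """Per trace, keep a point only if it is at least min_interval_s
    seconds after the previously kept point of the same trace."""
    last_kept: Dict[str, int] = {}
    kept: List[Sample] = []
    for point in sorted(points, key=lambda p: (p.trace_id, p.timestamp)):
        previous = last_kept.get(point.trace_id)
        if previous is None or point.timestamp - previous >= min_interval_s:
            kept.append(point)
            last_kept[point.trace_id] = point.timestamp
    return kept
```

With the default of 60 seconds, a vehicle reporting once per second is reduced to roughly one point per minute, while a vehicle that already reports once per minute is left unchanged.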
Historical and current data
HERE Probe Data offers its data in two forms:
- Historical: provides up to two years of historical data.
- Real-time: data delivered as a stream, with a potential latency of up to 10 minutes.
Note
If you have a project that needs before and after data, HERE can construct a dataset to your specifications and email you when it’s ready.
The real-time stream has a buffer of up to 12 hours. If you don’t consume the data when it is ready, you have up to 12 hours to read it into your application.
Outliers
Probe points represent samples of vehicle movement, and there may be outliers. The way to handle outliers depends on your specific needs, but the common sources of outliers are:
- GNSS aliasing: there are several ways in which the GNSS/GPS receiver in a vehicle can generate a false position. In dense urban areas, the satellite signals can bounce off tall buildings, creating a multipath situation. Some chipsets can also generate inaccurate positions when they first start up.
- Single-vehicle anomalies: a single vehicle may not represent the general traffic flow on a road, for example when it is turning onto that road, or when it stops to park.
HERE recommends that you decide how best to detect and treat outliers for your particular project. There is no single correct way to do this, so HERE cannot filter out data that some users would consider bad but that others may find valuable.
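As one example of a project-specific treatment, the Python sketch below flags points within a single trace that would require an implausibly high speed to reach from the last accepted point, which is a common symptom of a false GNSS position. This is only an illustrative heuristic, not a HERE-recommended method: the `Fix` tuple, the function names, and the 70 m/s threshold are all assumptions for this example. It also trusts the first point of the trace, so a bad start-up position would need separate handling.

```python
import math
from typing import List, NamedTuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, in meters


class Fix(NamedTuple):
    """Hypothetical decoded probe point from a single trace."""
    timestamp: int  # seconds
    latitude: float  # degrees
    longitude: float  # degrees


def haversine_m(a: Fix, b: Fix) -> float:
    """Great-circle distance between two fixes, in meters."""
    phi1, phi2 = math.radians(a.latitude), math.radians(b.latitude)
    dphi = phi2 - phi1
    dlmb = math.radians(b.longitude - a.longitude)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def flag_position_jumps(trace: List[Fix], max_speed_mps: float = 70.0) -> List[bool]:
    """Return one flag per point of a time-ordered trace. True marks a point
    whose implied speed from the last accepted point exceeds max_speed_mps,
    or whose timestamp does not advance past that point."""
    flags: List[bool] = []
    last_good = None
    for point in trace:
        if last_good is None:
            flags.append(False)  # the first point is trusted (assumption)
            last_good = point
            continue
        dt = point.timestamp - last_good.timestamp
        if dt <= 0 or haversine_m(last_good, point) / dt > max_speed_mps:
            flags.append(True)
        else:
            flags.append(False)
            last_good = point
    return flags
```

Whether flagged points should be dropped, corrected, or kept depends on the project. The same check does not address single-vehicle anomalies such as parking or turning, which typically need comparison against other vehicles on the same road.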