Dataset API

note

This document assumes you are already familiar with the Dataset API Configurator. You should only consider this document if you plan to develop your own web interface or need to construct very complex data calls not supported by the web interface.

The recommended way is to configure a call in the web-interfaces and then export the generated call definition using Output format -> API query.

Commonly, meteoblue APIs use simple URL GET parameters like http://my.meteoblue.com/packages/basic-1?lat=47.2&lon=9.6.... Unfortunately, this approach is not sufficient to query datasets dynamically. Instead, an HTTP JSON request body is used:

{
"units": {
"temperature": "C",
"velocity": "km/h",
"length": "metric",
"energy": "watts"
},
"geometry": {
"type": "MultiPoint",
"coordinates": [[7.57327,47.558399,279]], // lon, lat, asl
"locationNames": ["Basel"]
},
"format": "json",
"timeIntervals": [
"2019-01-01T+00:00/2019-12-31T+00:00"
],
"queries": [{
"domain": "NEMSGLOBAL",
"gapFillDomain": null,
"timeResolution": "hourly",
"codes": [{
"code": 157,
"level": "180-0 mb above gnd"
}]
}]
}

This call can be executed with the command line tool curl:

curl \
-L -H "Content-Type: application/json" \
-d '{"units":{"temperature":"C","velocity":"km/h","length":"metric","energy":"watts"},"geometry":{"type":"MultiPoint","coordinates":[[7.57327,47.558399,279]],"locationNames":["Basel"]},"format":"json","timeIntervals":["2019-01-01T+00:00/2019-12-31T+00:00"],"timeIntervalsAlignment":"none","queries":[{"domain":"NEMSGLOBAL","gapFillDomain":null,"timeResolution":"hourly","codes":[{"code":157,"level":"180-0 mb above gnd"}]}]}' \
"http://my.meteoblue.com/dataset/query?apikey=APIKEY"

Many web-development tools like the Insomnia REST client support using JSON bodies. Alternatively, the JSON query can be encoded into the URL. This results in long URLs, however, and quickly hits maximum URL length limits.

https://my.meteoblue.com/dataset/query?apikey=DEMOKEY&json=%7B%22units%22%3A%7B%22temperature%22%3A%22C%22%2C%22velocity%22%3A%22km%2Fh%22%2C%22length%22%3A%22metric%22%2C%22energy%22%3A%22watts%22%7D%2C%22geometry%22%3A%7B%22typ....
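For illustration, a short Python sketch that URL-encodes a query body into the json GET parameter (the query dict is shortened here; use the full structure shown above):

import json
import urllib.parse

# Shortened, hypothetical query body for illustration only.
query = {"units": {"temperature": "C"}, "format": "json"}
url = ("https://my.meteoblue.com/dataset/query?apikey=DEMOKEY&json="
       + urllib.parse.quote(json.dumps(query, separators=(",", ":"))))
print(url)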

All calls to meteoblue APIs require a valid API key. Please contact [email protected] for more information.

More complex calls might also be declined for direct execution and require the use of job queues. The next chapter explains job queues in more detail.

To use the dataset API with Python, we recommend using the meteoblue-dataset-sdk Python module. This library simplifies access to the dataset API and transparently implements job queues and decoding of data using protobuf.
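A minimal sketch following the SDK's README; the query dict mirrors the JSON body shown above, the API key is a placeholder, and the exact result accessors follow the SDK's protobuf schema (an assumption here):

import meteoblue_dataset_sdk

query = {
    "geometry": {
        "type": "MultiPoint",
        "coordinates": [[7.57327, 47.558399, 279]],  # lon, lat, asl
        "locationNames": ["Basel"],
    },
    "format": "protobuf",  # the SDK decodes protobuf transparently
    "timeIntervals": ["2019-01-01T+00:00/2019-12-31T+00:00"],
    "queries": [{
        "domain": "NEMSGLOBAL",
        "gapFillDomain": None,
        "timeResolution": "hourly",
        "codes": [{"code": 157, "level": "180-0 mb above gnd"}],
    }],
}

client = meteoblue_dataset_sdk.Client(apikey="APIKEY")  # placeholder key
result = client.query_sync(query)  # job queues are handled transparently
# Data values of the first code for the first location and time-interval:
data = result.geometries[0].codes[0].timeIntervals[0].data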

JSON Query Structure

The JSON body uses various structures and arrays that are nested to build complex queries with recursive transformations. All JSON attributes are case-sensitive and use camel-case names. As in the example above, the outer JSON structure contains properties like units, geometry, timeIntervals or queries.

The following tables describe all properties and how they are integrated with other structures. Some properties address special use-cases that are not available in the web-interfaces. For completeness all API properties are documented in the next chapters.

Property | Type | Description
units | Structure: Units | Option to select units like Fahrenheit
geometry | Structure: GeoJSON | Select polygon or points
format | String enumeration: Format | Which output format to use
timeIntervals | Array of Structure: TimeInterval | Define time intervals to read
timeIntervalsAlignment | String enumeration: Alignment | How multiple time-intervals are aligned in charts
queries | Array of Structure: Query | Per-dataset queries
oneTimeIntervalPerGeometry | Boolean | See below
checkOnly | Boolean | Only calculate the number of required datapoints
runOnJobQueue | Boolean | Execute this job on a queue instead of directly

By default, the API returns one data-series per combination of time-interval and geometry: 10 elements in timeIntervals and 20 coordinates in the geometry return 200 data-series.

If oneTimeIntervalPerGeometry is set to true and a GeoJSON GeometryCollection is used, the first geometry uses the first time-interval, the second geometry the second time-interval, and so on. This is used to return different time-intervals for each coordinate. In the web-interfaces this is used in the Coordinates and time mode. An example call is available in the GeoJSON description below.

If checkOnly is set to true, the API only calculates how many data points must be processed and whether a job queue must be used. runOnJobQueue would then be required to submit the call to a job queue. More information can be found in the last chapter about job queues.
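As an illustration, a checkOnly request can be sent with any HTTP client; this sketch uses the third-party Python requests package (an assumption, not part of the API):

import requests  # third-party HTTP client, assumed installed

body = {
    "checkOnly": True,
    "geometry": {"type": "MultiPoint", "coordinates": [[7.57327, 47.558399, 279]]},
    "timeIntervals": ["2019-01-01T+00:00/2019-12-31T+00:00"],
    "queries": [{
        "domain": "NEMSGLOBAL",
        "gapFillDomain": None,
        "timeResolution": "hourly",
        "codes": [{"code": 157, "level": "180-0 mb above gnd"}],
    }],
}
response = requests.post("http://my.meteoblue.com/dataset/query?apikey=APIKEY", json=body)
print(response.json())  # e.g. {"requiresJobQueue": true, ...}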

Units

If units are not set, the defaults are Celsius, km/h, metric and watts.

Property | Type | Description
temperature | String | celsius or fahrenheit
velocity | String | km/h, m/s, mph, kn or bft
length | String | metric or imperial
energy | String | watts or joules

GeoJSON Geometry

important

Please make sure to provide all input coordinates in the correct order: "lon" -> "lat" (-> "asl")

The geometry structure is based on GeoJSON, but extended to support features like geoname polygon id, location names and additional attributes. A geometry could also be of type GeometryCollection to select multiple geometries (this can be used in conjunction with oneTimeIntervalPerGeometry).

Depending on the feature type, different geometries can be used.

Point and MultiPoint

{
"type": "Point",
"coordinates": [8.6, 47.5, 351.1] // lon, lat, asl
}
{
"type": "MultiPoint",
"coordinates": [[8.6, 47.5,351.1], [8.55, 47.37, 429]], // lon, lat, asl
"locationNames": ["Basel", "Zürich"]
}

Coordinates are defined as a tuple of longitude, latitude and elevation above sea level. Elevation is optional and will be automatically resolved from an 80 m resolution digital elevation model (DEM). locationNames can optionally be specified and will be replicated in the output.

The order of coordinates will be preserved in the output.

The coordinates in the output refer to the center of the selected grid cell and therefore do not necessarily correspond to the input coordinates (unless the latter happen to be identical to the grid cell's center). To ensure that the desired coordinates can be found (if no index table is available), the mode attribute selects the preference for how the grid point is chosen for Point and MultiPoint requests:

{
"type": "MultiPoint",
"coordinates": [[8.6, 47.5, 351.1], [8.55, 47.37, 429]], // lon, lat
"mode": "preferLandWithMatchingElevation" // default value
}

Four mode-options can be chosen in the query:

Property | Description
preferLandWithMatchingElevation | Closest vertical distance
preferSea | Considers grid points over the sea
nearest | Closest horizontal distance
includeNeighbours | Combination of all 4 grid points closest in horizontal distance

The default grid selection mode is preferLandWithMatchingElevation: it evaluates the four closest grid points and selects the one that best matches the desired criteria, otherwise the nearest.

important

Caution: if a specific mode is selected, the output may deviate from the desired criteria. For example, if nearest is selected in a valley, the elevation of the closest eligible grid point may differ greatly from that of the input coordinates.

Polygon and MultiPolygon

{
"type": "Polygon",
"coordinates": [
[[7.5,47.5],[7.5,47.6],[7.7,47.6],[7.7,47.5],[7.5,47.5]] // lon, lat
]
}
{
"type": "MultiPolygon",
"coordinates": [
[[[8.0,47.4],[8.0,47.6],[8.2,47.6],[8.2,47.4],[8.0,47.4]]], // lon, lat
[[[7.5,47.5],[7.5,47.6],[7.7,47.6],[7.7,47.5],[7.5,47.5]]] // lon, lat
],
"excludeSeaPoints": true,
"fallbackToNearestNeighbour": true
}

The first and last coordinate must be the same. Please make sure to supply a valid polygon without self-intersections.

The optional Boolean parameter excludeSeaPoints can be set to true to ignore grid-cells that are located on the sea.

If no grid-cells are within the polygon, the result would be empty. If fallbackToNearestNeighbour is set to true, the nearest neighbouring grid-cell is selected instead.

Geoname Polygon

Administrative areas in the web-interfaces are based on the geonames polygon database. To keep calls short and avoid including the full GeoJSON polygon for each administrative area, the API can fetch a polygon directly from a database. Once the polygon is loaded from the database, the behavior is identical to a regular polygon API call.

{
"type": "GeonamePolygon",
"geonameid": 2345235
}

Multiple geoname polygons can also be selected in one call. Internally, the polygons are merged into a single polygon. If the transformation Aggregate all grid-cells were now used, all grid-cells of both administrative areas would be aggregated to a single data-series.

{
"type": "MultiGeonamePolygon",
"geonameids": [2345235, 312453]
}

Parameters excludeSeaPoints and fallbackToNearestNeighbour are also considered, if set.

Geometry Collection

Multiple geometries can also be processed in one call instead of calling the API multiple times. If the GeoJSON type GeometryCollection is used, the API will process one geometry after another.

The previous MultiGeonamePolygon call could be split into a collection like:

{
"type": "GeometryCollection",
"geometries": [
{ "type": "GeonamePolygon", "geonameid": 2345235 },
{ "type": "GeonamePolygon", "geonameid": 312453 }
]
}

It is important to note that for a GeometryCollection all transformations are applied to each geometry individually. The transformation Aggregate all grid-cells will only aggregate grid-cells within one geometry of the collection. This can be used to select multiple administrative areas in a country, apply the transformation Aggregate all grid-cells, and retrieve one index for each area individually. In the example above, two data-series would be returned.

Alternatively, GeometryCollection with the parameter oneTimeIntervalPerGeometry allows you to select different time-intervals for each geometry. It is used in the web-interface for the coordinates and time selection mode. For the first coordinate, the first time-interval will be used, for the second coordinate the second time-interval will be used, and so on.

{
"oneTimeIntervalPerGeometry": true,
"geometry": {
"type": "GeometryCollection",
"geometries": [
{ "type": "Point", "coordinates": [8.6, 47.5, 351.1] } // lon, lat, asl
{ "type": "Point", "coordinates": [8.55, 47.37, 429] } // lon, lat, asl
]
},
"timeIntervals": [
"2015-05-05T+00:00/2016-06-06T+00:00",
"2015-05-03T+00:00/2016-06-01T+00:00"
]
}

Output Format

The attribute format accepts the following values:

  • json: Recommended JSON format (default, if not set)
  • csv: CSV format for a large number of locations
  • csvTimeOriented: CSV format for long time-ranges
  • csvIrregular: CSV format for mixed time-intervals and locations
  • xlsx: XLSX format for a large number of locations
  • xlsxTimeOriented: XLSX format for long time-ranges
  • xlsxIrregular: XLSX format for mixed time-intervals and locations
  • highcharts: JSON output to create a highcharts graph
  • highchartsHtml: HTML page that embeds the highcharts library and the chart
  • geoJson: JSON output to create a map with bullet points
  • geoJsonHtml: HTML page that embeds a map library and the map JSON
  • kml: KML format that only includes the grid cell coordinates
  • netCDF: Recommended binary format for further scientific data analysis

Detailed information about the structure of each format can be found in the previous format chapter.

Time Intervals

Time intervals and timezones can be specified using the ISO8601 format. The timeIntervals attribute is an array of ISO8601 strings. By default, the web-interfaces generate time-intervals with a timezone offset, but without specifying hour and minute.

{
"timeIntervals": [
"2015-05-01T+00:00/2015-05-02T+00:00",
"2016-05-01T+00:00/2016-05-02T+00:00"
]
}

In the intervals above, 2 full days are selected. For hourly data, the API would return 48 hourly values for each time interval. In the API syntax, time-intervals can also be specified to select exactly 1 hour:

{
"timeIntervals": [
"2019-01-01T00:00+00:00/2019-01-01T01:00+00:00"
]
}

Datasets and Variables

The selection of datasets and variables is specified in the attribute queries as an array to select multiple datasets. For each dataset, specified by the domain attribute, multiple weather variable codes can then be selected.

In this example, three variables are selected from NEMSGLOBAL and then transformed with two transformations. In the same call, data can be selected from the dataset NEMS12 and transformed individually.

{
"queries": [
{
"domain": "NEMSGLOBAL",
"gapFillDomain": null,
"timeResolution": "hourly",
"codes": [
{"code": 11, "level": "2 m above gnd"},
{"code": 52, "level": "2 m above gnd"},
{"code": 157, "level": "180-0 mb above gnd"}
],
"transformations": [
{
"type": "valueIsAbove",
"valueMin": 30,
"returnClassification": "zeroOrOne"
},
{
"type": "aggregateTimeInterval",
"aggregation": "mean"
}
]
},
{
"domain": "NEMS12",
"gapFillDomain": null,
"codes": [ ... ],
"transformations": [...]
}
]
}

Attributes for the structure query:

Property | Type | Description
domain | String | Dataset name like NEMSGLOBAL or ERA5
gapFillDomain | Optional String | Dataset to use to fill gaps
timeResolution | String | hourly or daily
codes | Array of Codes | Individual selection of weather variables. See next chapter.
transformations | Optional Array of Transformations | Transformations to apply. See the transformations chapter below.
allowForecast | Boolean, default true | Whether to allow forecast data
allowHistory | Boolean, default true | Whether to allow history data

Notes:

  • allowHistory enables reads from the meteoblue archive storage. Forecasts are archived once a day and tend to be more consistent.
  • allowForecast enables reads from up-to-date forecasts, which reside on SSD storage and are updated more frequently. Data of the last days may change slightly. This applies only to datasets which offer forecasts.
  • timeResolution specifies the resolution to read. It can also be set to daily for a dataset that only offers hourly data, to automatically calculate daily aggregations (see the sketch after this list). Aggregations like monthly must use transformations. In the future, some datasets may offer pre-computed monthly or yearly data directly.
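For example, a query sketch that reads daily maximum temperature from a dataset with native hourly resolution (the per-code aggregation attribute is described in the next chapter):

{
"domain": "NEMSGLOBAL",
"gapFillDomain": null,
"timeResolution": "daily",
"codes": [{"code": 11, "level": "2 m above gnd", "aggregation": "max"}]
}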

Once the dataset has been selected, multiple variables at different levels can be encoded into the call. The web-interfaces only use one variable per dataset for simplicity. The API is capable of selecting multiple variables per dataset at once. This could improve API call performance, because expensive spatial calculations are only performed once.

Attributes for the structure code:

Property | Type | Description
code | Integer | Numeric variable code, e.g. 11 for temperature
level | String | Level of the variable, e.g. 2 m above gnd
aggregation | Optional String | min, max, mean or sum, to be used with daily aggregations
gddBase | Optional Float | Lower limit for the GDD calculation. Celsius unless Fahrenheit is selected
gddLimit | Optional Float | Upper limit for the GDD calculation
startDepth | Optional Integer | Start depth in centimeters for the soil depth aggregation
endDepth | Optional Integer | End depth in centimeters for the soil depth aggregation
slope | Optional Float | Inclination to calculate GTI. 0 = horizontal, 20 = typical value, 90 = vertical
facing | Optional Float | East-west orientation for GTI. 90° = east, 180° = south, 270° = west

Variable Codes

The numeric codes to select a variable from a dataset originate from NOAA GRIB 1 codes, but have been extended to include more variables.

A list of all weather variable codes at meteoblue is available as a JSON API. Please note that any individual dataset only supports a small fraction of the available codes.

Transformations

Within the query structure, an array of transformations can be specified. All transformations are processed one after another, but some also modify the behavior of others, e.g. by extending time-intervals or spatial contexts.

We recommend using web-interfaces to configure calls, but as a reference the API syntax for each transformation is documented below. For more details on each transformation consult the web-interfaces documentation.

Temporal Transformations Syntax

Aggregations to daily, monthly and yearly resolution use a simple syntax. In this example, 3 transformations are applied to a 30-year temperature time-series:

  • Calculate the daily minimum
  • Use all daily minima and calculate the mean for a month. This is now the monthly mean of daily minimum temperatures.
  • From all the monthly means pick the coldest monthly value. The call now returns 30 values, because 30 years are used as input.
{
"transformations": [
{
"type": "aggregateDaily",
"aggregation": "min"
},
{
"type": "aggregateMonthly",
"aggregation": "mean"
},
{
"type": "aggregateYearly",
"aggregation": "min"
}
]
}

The following values are supported for the attribute aggregation:

  • sum, min, max, mean, stddev
  • sumIgnoreNaN, minIgnoreNaN, maxIgnoreNaN, meanIgnoreNaN
  • p10, p25, p50, p75, p90

The transformations Aggregate daily by longitude and Aggregate each time-interval also just use the aggregation type:

{
"transformations": [
{
"type": "aggregateDailyByLongitude",
"aggregation": "mean"
},
{
"type": "aggregateTimeInterval",
"aggregation": "mean"
}
]
}

The transformation Aggregate by day and night additionally takes an attribute dailyNightly:

  • daylightAndNighttime: Return 2 values per day. One for daytime and one for nighttime
  • daylight: Only aggregate daylight hours
  • nighttime: Only aggregate nighttime hours
{
"type": "aggregateHalfDaily",
"dailyNightly": "daylightAndNighttime",
"aggregation": "mean"
}

Note: To keep the documentation compact, the examples only include the minimum JSON syntax.

Aggregate over a sliding time window requires an nTimesteps attribute, an Integer specifying how many time-steps are used in the sliding-window aggregation.

{
"type": "timeLaggedAggregation",
"aggregation": "mean",
"nTimesteps": 3
}

Aggregate to climate normals allows selecting daily or hourly resolution with the attribute temporalResolution:

{
"type": "aggregateNormals",
"aggregation": "mean",
"temporalResolution": "daily"
}

For temporal interpolations, the transformation Interpolate temporal expects a temporalResolution attribute with the options 15min, 10min, 5min and 1min:

{
"type": "interpolateTemporal",
"temporalResolution": "15min"
}

Value Filter Transformation Syntax

The transformations that filter values based on a threshold use a returnClassification attribute to specify the return behavior:

  • zeroOrOne
  • zeroOrValue
  • zeroOrDelta
  • zeroOrOneAccumulated
  • zeroOrValueAccumulated
  • zeroOrDeltaAccumulated
  • zeroOrConsecutiveCount
{
"type": "valueIsAbove",
"valueMin": 30,
"returnClassification": "zeroOrOne"
},
{
"type": "valueIsBelow",
"valueMax": 10,
"returnClassification": "zeroOrOne"
},
{
"type": "valueIsBetween",
"valueMin": 10,
"valueMax": 30,
"returnClassification": "zeroOrOne"
}

The transformation Value limited to a range takes two numeric attributes to clip values to a certain range.

{
"type": "valueLimitRange",
"valueMin": 5,
"valueMax": 10
}

Accumulate time-series to a running total takes no additional attributes.

{
"type": "accumulate"
}

Spatial Transformations Syntax

The transformation Resample to a regular grid takes a floating-point gridResolution greater than 0.001, options to control interpolation and aggregation, and the behavior for the disjoint area of the grid and polygon. Spatial transformations only work for polygon calls, not for calls based on single coordinates.

The attribute interpolationMethod supports:

  • linear interpolation using triangulated irregular networks
  • nearest neighbor interpolation

Attribute spatialAggregation:

  • mean, min, max: Return NaN if any input value is NaN.
  • meanIgnoreNaN, minIgnoreNaN, maxIgnoreNaN: Ignore NaNs if possible.

The disjointArea of the polygon and the resampled grid can be discarded (discard) or kept (keep).

{
"type": "spatialTransform",
"gridResolution": 0.5,
"interpolationMethod": "linear",
"spatialAggregation": "mean",
"disjointArea": "discard"
}

This transformation also offers an additional attribute geometry, which can be set to a MultiPoint geometry to select individual grid-cells after a dataset has been resampled. The grid-cells are selected by a nearest neighbor search in the new regular grid. In the next example, the selected polygon is gridded to 0.1° and afterwards 2 locations are extracted.

{
"type": "spatialTransform",
"gridResolution": 0.1,
"interpolationMethod": "linear",
"spatialAggregation": "mean",
"geometry": {
"type": "MultiPoint",
"coordinates": [[7.57327,47.558399], [7.85222,47.995899]], // lon, lat
"locationNames": ["Basel","Freiburg"]
}
}

Combine Dataset Transformations Syntax

With the transformation Combine the selected data-series, the API syntax uses recursion to select another data-series. The attribute dataQuery uses the same query structure as described above.

The attribute mathOperator supports the following modes:

  • multiply, divide, add, substract
  • maximum, minimum, mean
  • equals, notEquals, greaterThanEquals, lessThanEquals
{
"type": "combineDataset",
"mathOperator": "multiply",
"dataQuery": {
"domain": "ERA5",
"gapFillDomain": null,
"timeResolution": "hourly",
"codes": [{"code": 75, "level": "high cld lay"}],
"transformations": [...]
}
}

To combine a dataset with a different resolution, resampling can also be used. The attributes accept the same values as explained above.

 {
"type": "combineDatasetWithResampling",
"mathOperator": "multiply",
"interpolationMethod": "linear",
"spatialAggregation": "mean",
"dataQuery": {
"domain": "GFS05",
"gapFillDomain": null,
"timeResolution": "3hourly",
"codes": [{"code": 301, "level": "2 m above gnd"}]
}
}

Aggregate all Grid Cells Syntax

The transformation Aggregate all grid-cells aggregates all grid-cells using an aggregation function. The aggregation function uses the same syntax as for temporal transformations. This transformation works for coordinate as well as polygon calls. For polygon calls, the centroid coordinate will be shown in the output.

{
"type": "spatialTotalAggregate",
"aggregation": "mean"
}

For a weighted average, the transformation spatialTotalWeighted can be used; it takes the weights from a data-series specified in dataQuery.

 {
"type": "spatialTotalWeighted",
"dataQuery": {
"domain": "ERA5",
"gapFillDomain": null,
"timeResolution": "hourly",
"codes": [{"code": 301,"level": "2 m above gnd"}]
}
}

In case the weights originate from another dataset with a different grid, resampling can be used. interpolationMethod and spatialAggregation follow the same specifications as before.

 {
"type": "spatialTotalWeightedWithResampling",
"interpolationMethod": "linear",
"spatialAggregation": "mean",
"dataQuery": {
"domain": "ERA5",
"gapFillDomain": null,
"timeResolution": "hourly",
"codes": [{"code": 301,"level": "2 m above gnd"}]
}
}

Mask out Grid Cells Syntax

To set values to NaN based on filter criteria, the transformation maskOut uses a floating-point threshold attribute and an aboveOrBelow setting. The filter criteria are retrieved from another data-series, which can be specified with dataQuery.

The attribute aboveOrBelow supports (the naming is not consistent for historical reasons):

  • above: Greater than condition >
  • below: Less than condition <
  • greaterThanEquals: Greater than or equal condition >=
  • lessThanEquals: Less than or equal condition <=
{
"type": "maskOut",
"aboveOrBelow": "greaterThanEquals",
"threshold": 10.0,
"dataQuery": {
"domain": "NEMSGLOBAL",
"gapFillDomain": null,
"timeResolution": "hourly",
"codes": [{"code": 256, "level": "sfc"}]
}
}

With resampling to match another grid:

{
"type": "maskOutWithResampling",
"aboveOrBelow": "greaterThanEquals",
"threshold": 10.0,
"interpolationMethod": "linear",
"spatialAggregation": "mean",
"dataQuery": {
"domain": "VHP",
"gapFillDomain": null,
"timeResolution": "daily",
"codes": [{"code": 274, "level": "sfc"}]
}
}

Downscale Grid Cells Syntax

Activating this transformation for coordinate API calls enables linear downscaling using 3 neighboring grid-cells. It is not available for polygon calls.

{
"type": "downscaleGridCell"
}

Sign Mechanism

The meteoblue APIs support shared secrets to make API URLs tamper-proof or to set an expiry date. Because the query is submitted as a JSON POST body, the body content itself is not signed. If your API key requires a signature, you have to calculate the MD5 sum of the POST body and set the URL GET parameter &post_body_md5=. A signed URL may look like this:

https://my.meteoblue.com/dataset/query?apikey=MY_API_KEY&post_body_md5=070aff9477baf1844e37e68606483436&expire=1581606535&sig=6a5c276f186539bf1d0c57835c4fb0dd
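A minimal Python sketch to compute post_body_md5 (the expire and sig parameters follow the regular URL-signing mechanism and are not covered here):

import hashlib

# The MD5 sum must be computed over the exact bytes sent as the POST body.
post_body = '{"units":{"temperature":"C"},"format":"json"}'  # shortened example body
md5 = hashlib.md5(post_body.encode("utf-8")).hexdigest()
url = "https://my.meteoblue.com/dataset/query?apikey=MY_API_KEY&post_body_md5=" + md5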

Metadata API

The Metadata API provides basic information about a dataset, including the time of its last update, and differentiates between the preliminary (first) and final run. In the case of satellite data, the preliminary run is usually available after a couple of hours. A second or final run will be published days or even weeks later with improved data quality. Because there could be changes in the data, meteoblue keeps track of these dates and includes them in the metadata API.

Example API call for CHIRPS2: http://my.meteoblue.com/dataset/meta?dataset=CHIRPS2

{
"name": "CHIRPS2",
"temporalResolution": "daily",
"region": "50S-50N",
"spatialResolution": "5.0 km",
"historyDataStart": "19810101T0000",
"historyDataFinalRun": "20200731T2300",
"source": "USGS & CHG",
"sourceUrl": "http://chg.geog.ucsb.edu"
}

Fields:

  • name: Name of the dataset, e.g. NEMSGLOBAL or CHIRPS2
  • temporalResolution: Native temporal resolution, e.g. hourly or daily
  • spatialResolution: Spatial resolution, e.g. "5 km", but could also be a range like "4-30 km"
  • historyDataStart: The first valid timestamp for API calls using archived historical data
  • historyDataFinalRun: The last timestamp that will not be modified anymore by future planned updates. E.g. for CHIRPS this date is a couple of weeks in the past.
  • region: Extent of this dataset, e.g. global, central-asia or a latitude bound like 50S-50N
  • source: Provider of this dataset
  • sourceUrl: URL of the provider's website
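A metadata call is a plain HTTP GET; as a sketch with the third-party Python requests package (an assumption, any HTTP client works):

import requests  # third-party HTTP client, assumed installed

meta = requests.get("http://my.meteoblue.com/dataset/meta", params={"dataset": "CHIRPS2"}).json()
print(meta["temporalResolution"], meta["historyDataFinalRun"])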

Job Queues

While regular API calls only take a couple of seconds, complex dataset calls can take minutes or even hours. HTTP APIs do not work well with long-running calls; they lead to timeouts on server and client side. Before executing dataset calls, the API calculates the estimated run-time. If the estimated run-time exceeds a threshold, the API returns an error and the user must submit the call to the job queue system.

After a job is completed, the result will be uploaded to an Amazon S3 web-storage and kept for 7 days. The job-queue result is identical to a regular dataset call.

Implementing the mechanics of job queues correctly needs special care. For Python, meteoblue offers a simple SDK to use the dataset API without having to care about job queues: meteoblue-dataset-sdk on GitHub.

Jobs "belong" to a queue. Queues are associated with API keys by meteoblue and provisioned according to performance requirements. Multiple API keys can share the same queue. Each queue will be processed by multiple "workers" which are exclusively dedicated to one queue. Workers run on multiple physical servers and synchronize with a central job queue dispatcher.

Each API key can submit up to 5 jobs in parallel to the job queue system. Additional jobs will be declined until the previous jobs are completed. This prevents a single API user from over-utilizing the queue system and starving other applications of resources.

Current job queue limits are (only one needs to be fulfilled):

  • Data-points > 876'000: This is one year of hourly data for 100 locations.
  • Number of grid-cells > 5000: This uses an approximated number of potentially affected grid-cells, derived from the polygon area and grid-resolution. This approximation is necessary to quickly estimate the number of grid-cells without performing expensive polygon/grid operations.
  • A spatial resampling transformation is used: This is independent of polygon size or resolution.

Running Jobs on the Queue

There are 2 options to determine whether a job must be executed on a job queue:

  1. The API will return an error: {"error_message": "This job must be executed on a job-queue"}
  2. The dataset query JSON syntax accepts a checkOnly parameter: {"checkOnly": true, "geometry": ..., "queries":...}. The response JSON contains {"requiresJobQueue": true}

To start a job on the queue, the parameter runOnJobQueue must be set to true in the request POST JSON. An API key is only necessary to start a job; status and result can be retrieved without an API key.

curl "http://my.meteoblue.com/dataset/query/?apikey=123456789"
-H "Content-Type: application/json"
-d '{"runOnJobQueue":true,"geometry":{"type":"MultiPoint","coordinates":[[7.57327,47.558399,279]],"locationNames":["Basel"]},"format":"csv","timeIntervals":["2017-01-01T+00:00/2017-12-31T+00:00"],"queries":[{"domain":"ERA5","gapFillDomain":null,"timeResolution":"hourly","codes":[{"code":75,"level":"high cld lay"}]}]}'

If the API call is accepted, the API responds with a JSON object which contains the UUID of the newly submitted job. The status shows waiting. If a worker is available, the job will be started immediately.

{
"id": "6768BAC9-2446-4A9F-A2CD-A8CCAE070919",
"queueTime": 1524063300,
"apikey": "123456",
"status": "waiting"
}

Calls to http://my.meteoblue.com/queue/status/API-ID show running after a couple of seconds and finally finished:

{
"id": "6768BAC9-2446-4A9F-A2CD-A8CCAE070919",
"queueTime": 1524063300,
"apikey": "123456",
"status": "finished"
}

The result is uploaded to a central storage system and can be retrieved with:

curl "http://queueresults.meteoblue.com/DD7F947B-C11E-48A2-92DC-1A4A5E77DE8C"

Job States

Jobs can have the following states:

  • waiting: A job is queued but has not yet been started. It should start within a couple of seconds, unless the queues are highly utilized.
  • running: The job is currently running. Each job is checked every couple of seconds to verify that it is actually running.
  • finished: Job successfully completed. The result can now be retrieved at http://queueresults.meteoblue.com/<id>.
  • deleted: Job has been cancelled manually by the user
  • error: An error occurred

To cancel a waiting or running job, send an HTTP DELETE request to /queue/delete/6768BAC9-2446-4A9F-A2CD-A8CCAE070919. If the job is already finished, this call will delete the stored result on AWS S3.

To retrieve the JSON call of a queued or running job: http://my.meteoblue.com/queue/jobquery/API-ID.

Errors

If an error occurs while executing a job, the error message is set in the job status. Sometimes a job will fail with the error message Job failed for unknown reasons. In this case, the application executing the job most likely ran out of memory. Generating CSV text output with large polygons or long time-intervals quickly requires many gigabytes of memory. Try using netCDF as the output format, or use smaller polygons, fewer coordinates or shorter time-intervals.

{
"status": "error",
"error_message": "Job failed for unknown reasons"
}

This error is not limited to out-of-memory issues, but could also indicate a programming error which led to a crash of the application. In case the error persists even with a smaller geographical and temporal extent, please contact us.

API Endpoints:

  • Status: http://my.meteoblue.com/queue/status/API-ID
  • Delete: http://my.meteoblue.com/queue/delete/API-ID (only HTTP DELETE)
  • Result: http://queueresults.meteoblue.com/API-ID
  • Query JSON: http://my.meteoblue.com/queue/jobquery/API-ID