APIs#
Application Programming Interface
A standardised interface for access to:
data and transfers,
programs and services,
general communication between apps/programs.
The “hidden” version of a User Interface letting computers and programs communicate.
Owners can limit access through:
number of requests per time unit,
access codes/credentials.
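When an owner caps the number of requests per time unit, the client can space out its own calls. The following is a minimal sketch of such client-side throttling, not any particular API's mechanism; the class name `Throttle` and its parameters are hypothetical.

```python
import time


class Throttle:
    """Keep calls at least `min_interval` seconds apart (hypothetical helper)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last_call = None   # time of the previous call, None before the first one

    def wait(self):
        """Block until the next request is allowed; return the seconds waited."""
        waited = 0.0
        if self._last_call is not None:
            remaining = self.min_interval - (self._clock() - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
        self._last_call = self._clock()
        return waited
```

Calling `throttle.wait()` immediately before each request keeps the request rate below one per `min_interval` seconds.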
Web APIs#
Hypertext Transfer Protocol (HTTP) based queries and answers using the POST or GET methods.
Each API has its own hierarchy and possibilities for querying.
URLs used for querying using the GET method typically consist of:
a server address: http://api.openweathermap.org,
a hierarchy with descriptive names: /data/2.5/forecast, and
a question mark marking the beginning of user-supplied named variables, written as name=value pairs joined by ampersands: ?q=London&appid=MY_API_KEY
Variations in naming include data / api, q / query.
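The three URL parts listed above can be assembled with Python's standard library. This is a small sketch; the helper name `build_get_url` is hypothetical, and `MY_API_KEY` is the placeholder from the text, not a working key.

```python
from urllib.parse import urlencode


def build_get_url(server, path, params):
    """Join server address, hierarchy and named variables into a GET URL."""
    # urlencode joins name=value pairs with '&' and escapes special characters
    return f"{server}{path}?{urlencode(params)}"


url = build_get_url(
    "http://api.openweathermap.org",
    "/data/2.5/forecast",
    {"q": "London", "appid": "MY_API_KEY"},
)
print(url)
# http://api.openweathermap.org/data/2.5/forecast?q=London&appid=MY_API_KEY
```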
A full query can look like this: https://ec.europa.eu/eurostat/api/dissemination/statistics/1.0/data/prc_hicp_mv12r?geo=DK&geo=NO&geo=SE&geo=DE&geo=UK&coicop=CP00&unit=RCH_MV12MAVR
Here the geo variable is repeated to select several countries, while coicop and unit filter which data are returned.
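To see how such a query decomposes, the URL can be taken apart again with the standard library. The sketch below only parses the string and does not contact Eurostat.

```python
from urllib.parse import parse_qs, urlsplit

url = (
    "https://ec.europa.eu/eurostat/api/dissemination/statistics/1.0/data/"
    "prc_hicp_mv12r?geo=DK&geo=NO&geo=SE&geo=DE&geo=UK&coicop=CP00&unit=RCH_MV12MAVR"
)

parts = urlsplit(url)
filters = parse_qs(parts.query)  # repeated names are collected into lists

print(parts.netloc)  # server address: ec.europa.eu
print(parts.path)    # hierarchy, ending in the dataset code prc_hicp_mv12r
print(filters)
# {'geo': ['DK', 'NO', 'SE', 'DE', 'UK'], 'coicop': ['CP00'], 'unit': ['RCH_MV12MAVR']}
```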
For the POST method, the server address and hierarchy have the same format, but the query text is “POSTed” separately (see example below).
Some APIs use time-limited access tokens, e.g., see BarentsWatch tutorial and GitHub example:
First POST to the API includes a client ID and client “secret”.
A token (temporary password) is returned and can be used for subsequent requests (typically expiring after 3600 s).
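The token flow above can be sketched as a small cache that requests a new token only when the old one is about to expire. This is an illustration under assumptions, not the BarentsWatch or GitHub implementation: `fetch_token` is a hypothetical callable that would perform the POST with client ID and secret, and the reply fields `access_token` and `expires_in` follow common OAuth 2.0 naming, which a given API may not use.

```python
import time


class TokenCache:
    """Reuse an access token until shortly before it expires (hypothetical helper)."""

    def __init__(self, fetch_token, clock=time.monotonic, margin=60):
        self._fetch_token = fetch_token  # assumed to POST credentials and return a dict
        self._clock = clock
        self._margin = margin            # refresh this many seconds before expiry
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return a valid token, fetching a new one if needed."""
        if self._token is None or self._clock() >= self._expires_at:
            reply = self._fetch_token()
            self._token = reply["access_token"]        # assumed field name
            lifetime = reply.get("expires_in", 3600)   # assumed field name, seconds
            self._expires_at = self._clock() + lifetime - self._margin
        return self._token
```

Subsequent requests would then send the value of `cache.get()` in whatever form the API documents, often an `Authorization` header.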
APIs are not eternal.
Formats are changed over time.
Sometimes different format versions can be accessed through the version part of the URL (1.0, 1.1, etc.).
JSON#
The query can be a JSON string, e.g., {"city": "London", "year": "2000"}, which can be sent separately; see the example below using POST.
The returned contents are also often JSON formatted.
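In Python, a query like the one above is usually held as a dictionary and converted to and from JSON text with the standard `json` module. A minimal round trip:

```python
import json

query = {"city": "London", "year": "2000"}

# Serialise the dictionary to a JSON string (note the double quotes)
query_text = json.dumps(query)
print(query_text)  # {"city": "London", "year": "2000"}

# Parse JSON text back into a Python dictionary
restored = json.loads(query_text)
print(restored == query)  # True
```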
Example with JSON query and JSON-stat return#
Statistics Norway (SSB)
Traffic accident data
from pyjstat import pyjstat
import requests
# API for Statistics Norway, table of traffic accidents
POST_URL = 'https://data.ssb.no/api/v0/en/table/06794'
# Paste the URL into a browser to see all the options
# The payload is a JSON query that selects values for each variable in the table
payload = { "query": [{ "code": "Skadegrad", "selection": { "filter": "item", "values": [ "01", "20", "02", "04", "05" ] } },
{ "code": "Kjonn", "selection": { "filter": "item", "values": [ "1", "2" ] } },
{ "code": "Trafikkantgruppe", "selection": { "filter": "item", "values": [ "1", "2", "3", "7", "8" ] } },
{ "code": "ContentsCode", "selection": { "filter": "item", "values": [ "SkaddDrept" ] } },
{ "code": "Tid", "selection": { "filter": "item", "values": [ "1999M01", "1999M02", "1999M03", "2024M06", "2024M07" ] } }
],
"response": { "format": "json-stat2" } }
result = requests.post(POST_URL, json=payload)
print(result) # 200 = OK
<Response [200]>
# Extract DataFrame from JSON-stat
dataset = pyjstat.Dataset.read(result.text)
df = dataset.write('dataframe')
print(df.shape)
df.head()
(250, 6)
| | degree of damage | sex | group of road user | contents | month | value |
---|---|---|---|---|---|---|
0 | Killed | Females | Drivers of car | Persons killed or injured | 1999M01 | 2 |
1 | Killed | Females | Drivers of car | Persons killed or injured | 1999M02 | 3 |
2 | Killed | Females | Drivers of car | Persons killed or injured | 1999M03 | 1 |
3 | Killed | Females | Drivers of car | Persons killed or injured | 2024M06 | 0 |
4 | Killed | Females | Drivers of car | Persons killed or injured | 2024M07 | 1 |
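To illustrate what pyjstat does in the conversion above, the sketch below flattens a tiny hand-made JSON-stat 2.0 dataset into rows. The sample data are invented for illustration and are not SSB figures; the function name `jsonstat_to_rows` is hypothetical, and the sketch assumes `value` is a plain list (the format also allows a sparse mapping).

```python
from itertools import product

# Invented sample: 2 sexes x 3 months, values stored in row-major order
sample = {
    "version": "2.0",
    "class": "dataset",
    "id": ["sex", "month"],
    "size": [2, 3],
    "dimension": {
        "sex": {"category": {"index": {"F": 0, "M": 1},
                             "label": {"F": "Females", "M": "Males"}}},
        "month": {"category": {"index": ["1999M01", "1999M02", "1999M03"]}},
    },
    "value": [10, 11, 12, 20, 21, 22],
}


def jsonstat_to_rows(dataset):
    """Return one tuple (label per dimension..., value) per cell of the dataset."""
    labels_per_dim = []
    for dim_id in dataset["id"]:
        category = dataset["dimension"][dim_id]["category"]
        index = category["index"]
        # "index" is either a list of codes or a mapping code -> position
        codes = sorted(index, key=index.get) if isinstance(index, dict) else list(index)
        labels = category.get("label", {})
        labels_per_dim.append([labels.get(code, code) for code in codes])
    # product() varies the last dimension fastest, matching the value order
    return [combo + (value,)
            for combo, value in zip(product(*labels_per_dim), dataset["value"])]


rows = jsonstat_to_rows(sample)
print(rows[0])    # ('Females', '1999M01', 10)
print(len(rows))  # 6
```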
# New payload with fewer restrictions
payload = { "query": [ { "code": "Skadegrad", "selection": { "filter": "all", "values": [ "*" ] } },
{ "code": "Kjonn", "selection": { "filter": "all", "values": [ "*" ] } },
{ "code": "Trafikkantgruppe", "selection": { "filter": "all", "values": [ "*" ] } },
{ "code": "ContentsCode", "selection": { "filter": "all", "values": [ "*" ] } },
{ "code": "Tid", "selection": { "filter": "all", "values": [ "*" ] } }
],
"response": { "format": "json-stat2" } }
result = requests.post(POST_URL, json=payload)
print(result) # 200 = OK
<Response [200]>
dataset = pyjstat.Dataset.read(result.text)
df_all = dataset.write('dataframe')
print(df_all.shape)
df_all.head()
(29760, 6)
| | degree of damage | sex | group of road user | contents | month | value |
---|---|---|---|---|---|---|
0 | Killed | Females | Drivers of car | Persons killed or injured | 1999M01 | 2 |
1 | Killed | Females | Drivers of car | Persons killed or injured | 1999M02 | 3 |
2 | Killed | Females | Drivers of car | Persons killed or injured | 1999M03 | 1 |
3 | Killed | Females | Drivers of car | Persons killed or injured | 1999M04 | 3 |
4 | Killed | Females | Drivers of car | Persons killed or injured | 1999M05 | 3 |
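Both payloads above share the same structure, so they can be generated from a compact description. The helper below is a hypothetical convenience, not part of the SSB API or pyjstat: a list of values produces an "item" filter and `None` produces the "all" filter with a wildcard.

```python
def build_payload(selections, response_format="json-stat2"):
    """Build a query payload from a mapping {variable code: values or None}."""
    query = []
    for code, values in selections.items():
        if values is None:
            # No restriction: select everything for this variable
            selection = {"filter": "all", "values": ["*"]}
        else:
            selection = {"filter": "item", "values": list(values)}
        query.append({"code": code, "selection": selection})
    return {"query": query, "response": {"format": response_format}}


# Restrict sex and contents, leave the other variables unrestricted
payload = build_payload({
    "Skadegrad": None,
    "Kjonn": ["1", "2"],
    "Trafikkantgruppe": None,
    "ContentsCode": ["SkaddDrept"],
    "Tid": None,
})
print(payload["query"][1])
# {'code': 'Kjonn', 'selection': {'filter': 'item', 'values': ['1', '2']}}
```

The resulting dictionary can be passed to `requests.post` through the `json` argument in the same way as the hand-written payloads.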
Exercise#
Visit Statistics Norway’s Ready-made datasets.
Select a different dataset, download through the API and inspect the results.