Access the ETER database via API request using R

Accessing data via API requests has a major advantage: the data are up to date every time you run a request. Setting up an API request takes some initial effort, especially compared to downloading a .csv file, but it pays off whenever you need to repeat a process several times. This blog post shows how ETER data can be loaded into R using the ETER API.

First, we need to load the necessary packages for making API calls:

#---------------------------#
#        load packages      #
#---------------------------#
library(httr)
library(jsonlite)
library(tidyverse)

Then we log in, since we want the full dataset, not just the public one:

#---------------------------#
#            Login          #
#---------------------------#
cred <- '{
"email": "YOURUSERNAME",
"password": "YOURPASSWORD"
}'
url_login <- "https://eter-project.com/api/4.0/users/login"
result_login <- POST(url_login, body = cred, content_type_json())
 
# convert the raw response bytes into text
login_raw <- rawToChar(result_login$content)
 
# transform json into workable format for R
mylogin <- fromJSON(login_raw, flatten = TRUE)
mytoken <- mylogin$token
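Before going further, it helps to verify that the login actually succeeded and a token was returned. The check below is a small sketch; the exact error behavior of the ETER API is an assumption:

```r
# stop early if the login failed or no token came back
if (result_login$status_code != 200 || is.null(mytoken)) {
  stop("ETER login failed - check your credentials")
}
```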

Now we can make some requests. We have the following options:

    • get all data at once

    • filter the data by country and/or year (in the examples below we extract data for Austria in 2015, and for several countries across 2013–2016)

Examples for both options are provided below. Be aware that you need to add the token, prefixed by the string “Bearer”, to the Authorization header of your request to receive the unrestricted dataset.

#---------------------------#
#         API requests      #
#---------------------------#
##  get all data at once  ##
url <- "https://eter-project.com/api/4.0/HEIs/full/flattened"
# request for all data
raw_result <- GET(url, add_headers(Authorization = paste("Bearer", mytoken)))
##  filter data within API request  ##
#  build a query
myquery <- '{
    "filter": {"BAS.REFYEAR.v": 2015, "BAS.COUNTRY.v": "AT"},
    "searchTerms": []
    }'
myquery2 <- '{
    "filter": {"BAS.REFYEAR.v": { "$in": [2013, 2014, 2015, 2016]},
    "BAS.COUNTRY.v": { "$in": ["AL", "AT", "BG", "LV"]}},
    "fieldIds": [],
    "searchTerms": []
    }'
# now make the request for a filtered query
url <- "https://eter-project.com/api/4.0/HEIs/query/flattened"
raw_result <- POST(url, body = myquery, content_type_json(),
                   add_headers(Authorization = paste("Bearer", mytoken)))
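The second query, myquery2, which filters on several years and countries at once, is used in exactly the same way; only the request body changes:

```r
# same endpoint and headers, but with the multi-year, multi-country filter
raw_result2 <- POST(url, body = myquery2, content_type_json(),
                    add_headers(Authorization = paste("Bearer", mytoken)))
```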

After the API request, we need to do some data processing to get the data from a JSON format into a workable format for R:

#---------------------------#
#       data processing     #
#---------------------------#
# check the status code (200 for the GET request for all HEIs, 201 for the POST request, indicates success)
raw_result$status_code
# convert the raw response bytes into text
this.raw.content <- rawToChar(raw_result$content)
# encoding to UTF-8
Encoding(this.raw.content) <- "UTF-8"
# transform JSON into workable format for R
mydata <- fromJSON(this.raw.content, flatten = TRUE)
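As an alternative to the manual steps above, httr can handle the status check and the text conversion for you: stop_for_status() raises an R error on any non-success code, and content(..., as = "text") does the UTF-8 decoding in one step. A sketch of the same processing with these helpers:

```r
# raise an error if the request returned a non-success status code
stop_for_status(raw_result)
# extract the response body as UTF-8 text in one step
this.raw.content <- content(raw_result, as = "text", encoding = "UTF-8")
# transform JSON into a workable format for R
mydata <- fromJSON(this.raw.content, flatten = TRUE)
```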

Now you can inspect the data to make sure it has the right structure:

# check if mydata is a dataframe
class(mydata)
# check column names
colnames(mydata)
# check the number of rows and columns
dim(mydata)
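With the data in a data frame, the tidyverse loaded at the start can be used for a first look. The snippet below is a small sketch; it assumes the flattened column names match the field identifiers used in the queries above (e.g. BAS.COUNTRY.v for the country code):

```r
# count institutions per country (column name assumed from the query fields)
mydata %>%
  count(BAS.COUNTRY.v, sort = TRUE)
```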