Package 'SmarterPoland' reference manual

Title:	Tools for Accessing Various Datasets Developed by the Foundation SmarterPoland.pl
Description:	Tools for accessing and processing datasets prepared by the Foundation SmarterPoland.pl. Among others: access to the APIs of Google Maps, Central Statistical Office of Poland, MojePanstwo, Eurostat, WHO and other sources.
Authors:	Przemyslaw Biecek
Maintainer:	Przemyslaw Biecek <[email protected]>
License:	GPL-3
Version:	1.8.1
Built:	2025-03-12 03:15:39 UTC
Source:	https://github.com/pbiecek/smarterpoland

Tools for Accessing Various Datasets Developed by the Foundation SmarterPoland.pl

Description

Tools for accessing and processing datasets prepared by the Foundation SmarterPoland.pl. Among others: access to the APIs of Google Maps, Central Statistical Office of Poland, Eurostat, WHO and other sources.

Author(s)

Author: Przemyslaw Biecek Maintainer: Przemyslaw Biecek <[email protected]>

Examples

## Not run: 
 # download the dataset 'Pupil/Student - teacher ratio and average class' from eurostat
 # for more developed API see https://github.com/rOpenGov/eurostat
 tmp <- getEurostatRCV(kod = "educ_iste")
 head(tmp)

 # download the dataset 'People killed in road accidents' from eurostat
 # and plot a maptable for selected countries
 # for more developed API see https://github.com/rOpenGov/eurostat
 library(ggplot2)
 t1 <- getEurostatRCV("tsdtr420")
 t1 <- t1[t1$geo 
 ggplot(t1, aes(time, value, color=sex, group=sex)) +
 	geom_line() + facet_wrap(~geo)

## End(Not run)

API to Bank Danych Lokalnych [GUS]

Description

Access to the GUS Bank Danych Lokalnych with the use of API developed by MojePanstwo.

Download and parse data from Bank Danych Lokalnych with the use of API developed by MojePanstwo.

Usage

getBDLtree(raw = FALSE, debug = 0)
getBDLsearch(query = "", debug = 0, raw = FALSE)
getBDLseries(metric_id = "", slice = NULL, time_range = NULL,
            wojewodztwo_id = NULL, powiat_id = NULL, gmina_id = NULL,
            meta = NULL, debug = 0, raw = FALSE)
getMPgminy(debug = 0)
getMPpowiaty(debug = 0)
getMPwojewodztwa(debug = 0)

Arguments

`debug`	Level of debug info. 0 for no debug, 1 or 2 for info about processed groups.
`raw`	If raw = TRUE the resulting JSON is returned without any transformation. For raw = FALSE results are transformed into a data.frame.
`query`	A query for the BDL search.
`metric_id`	Metric id; if unknown, look for it in the BDL tree or BDL search.
`slice`	A table with id dimensions, with format [1,34,]. Use '' to choose all dimensions (or use an empty string).
`time_range`	Year or range (like 2000:2010), empty means - full range.
`wojewodztwo_id`	Voivodeship id or '*' for all.
`powiat_id`	County id or '*' for all. It is an internal ID. Use `getMPpowiaty()` to get names and other information.
`gmina_id`	Subcounty id or '*' for all. It is an internal ID. Use `getMPgminy()` to get TERYT codes.
`meta`	Should meta data be returned?

Value

The function getMPgminy() returns a data frame with identifiers id/TERYT for each subcounty. The function getMPpowiaty() returns a data frame with identifiers id for each county.

The function getBDLtree() returns a data frame with identifiers of resources in Bank Danych Lokalnych.

Author(s)

Przemyslaw Biecek

References

The API of Bank Danych Lokalnych developed by MojePanstwo is described at https://mojepanstwo.pl/api/dane/get_dane_dataset

Examples

## Not run: 
 # the data is downloaded and parsed from Internet
 # note that this dataset is pre-calculated in the package
 BDLtree <- getBDLtree(debug = 2)
 head(BDLtree)

 BDLtransport <- getBDLsearch("transport")
 head(BDLtransport)

 BDLseries <- getBDLseries(metric_id = 1)
 head(BDLseries)

 gminy <- getMPgminy()
 head(gminy)

 powiaty <- getMPpowiaty()
 head(powiaty)

## End(Not run)

Geocoordinates of Largest Cities

Description

A subset of world.cities from the maps package, extracted in order to shrink the number of dependencies. Only cities with pop > 50k are kept.

Author(s)

Przemyslaw Biecek [based on world.cities]

Examples

## Not run: 
	library(maps)
	data(world.cities)
	cities_lon_lat <- world.cities[!duplicated(world.cities$name),]
	rownames(cities_lon_lat) = cities_lon_lat[,1]
	cities_lon_lat <- cities_lon_lat[cities_lon_lat$pop > 50000,]
	cities_lon_lat <- cities_lon_lat[,4:5]

## End(Not run)

Birth and death rates, continent and population for selected countries

Description

Data from World Health Organization database http://apps.who.int/gho/data/view.main.CBDR2040. Based on the example from Grammar of Graphics by Leland Wilkinson.

Author(s)

Przemyslaw Biecek [based on WHO data]

Examples

## Not run: 
	library(maps)
	data(countries)
	head(countries)

## End(Not run)

Access to Weather Forecasts with the Use of Dark Sky API.

Description

Access to hourly and daily weather forecasts with the use of Dark Sky API.

Usage

getWeatherForecast(apiKey, lat = NA, lon = NA, city = NA, raw=FALSE)

Arguments

`apiKey`	A Dark Sky API key is required to access weather forecasts. See https://developer.forecast.io/ for more details.
`lat`	The latitude coordinate for which prediction has to be made.
`lon`	The longitude coordinate for which prediction has to be made.
`city`	Instead of lat and lon you may specify name of the city for which prediction has to be made.
`raw`	If TRUE then no parsing is done. The function getWeatherForecast() just downloads a forecast and returns it as a list.

Value

The function getWeatherForecast() returns a list of three datasets. The now and by.hour datasets contain predictions. For each time point the following information is collected:

time, summary, icon, precipIntensity, precipProbability, temperature, apparentTemperature, dewPoint, humidity, windSpeed, windBearing, visibility, cloudCover, pressure, ozone, temperatureCelsius, apparentTemperatureCelsius

Daily predictions (the by.day component) contain the following information:

time, summary, icon, sunriseTime, sunsetTime, moonPhase, precipIntensity, precipIntensityMax, precipProbability, temperatureMin, temperatureMinTime, temperatureMax, temperatureMaxTime, apparentTemperatureMin, apparentTemperatureMinTime, apparentTemperatureMax, apparentTemperatureMaxTime, dewPoint, humidity, windSpeed, windBearing, visibility, cloudCover, pressure, ozone, precipIntensityMaxTime, precipType, temperatureMaxCelsius, temperatureMinCelsius, apparentTemperatureMaxCelsius, apparentTemperatureMinCelsius
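
The Celsius columns listed above (for example temperatureCelsius) suggest that a unit conversion is applied to the values returned by the API. The following is only a sketch of such a conversion, assuming the source temperatures are in degrees Fahrenheit, which this manual does not confirm. The helper name fahrenheit_to_celsius and the sample data frame are hypothetical and are not part of the package.

```r
# Hypothetical helper; assumes the input is in degrees Fahrenheit.
fahrenheit_to_celsius <- function(f) (f - 32) * 5 / 9

# Mock stand-in for a 'by.hour'-like table (values invented for illustration).
by_hour <- data.frame(
  time = as.POSIXct("2015-01-01 12:00:00", tz = "UTC") + 3600 * 0:2,
  temperature = c(32, 50, 68)
)

# Add a Celsius column next to the original one.
by_hour$temperatureCelsius <- fahrenheit_to_celsius(by_hour$temperature)
print(by_hour)
```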

Author(s)

Przemyslaw Biecek

References

The Dark Sky API for weather forecasts is described at https://developer.forecast.io/

Examples

## Not run: 
 # you have to have apiKey to execute these examples
library(scales)
library(ggplot2)

prognoza <- getWeatherForecast(apiKey, city='Warsaw')

ggplot(prognoza$by.hour, aes(y=temperatureCelsius, x=time)) + 
  geom_line() + geom_point() +
  geom_point(data=prognoza$now, size=10, color='red') +
  theme(title=element_text(size=20),
        axis.text=element_text(size=20)) + 
  scale_x_datetime(breaks = date_breaks("3 hour"),
                   minor_breaks = date_breaks("1 hour"),
                   labels = date_format("
  ylab("") + xlab("") + ggtitle("Prognoza temperatury dla Warszawy")


## End(Not run)

Download a Dictionary from the Eurostat Database

Description

Download a dictionary for given coded variable from Eurostat (ec.europa.eu/eurostat).

Usage

getEurostatDictionary(dictname)

Arguments

`dictname`	Character, dictionary for given variable name will be downloaded.

Value

A data.frame with two columns, first with code names and second with full names.
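
A two-column dictionary of this shape can be used to translate codes into full names. The sketch below uses a small mock dictionary instead of a real download. Its contents and the variable names are invented for illustration, and only the column order (codes first, full names second) is taken from the description above.

```r
# Mock dictionary: first column holds codes, second column holds full names.
dict <- data.frame(
  code = c("PL", "FR", "ES"),
  name = c("Poland", "France", "Spain"),
  stringsAsFactors = FALSE
)

# Codes as they might appear in a downloaded dataset; "XX" is not in the dictionary.
codes <- c("FR", "PL", "FR", "XX")

# Look up each code by position; unknown codes become NA.
labels <- dict[[2]][match(codes, dict[[1]])]
print(labels)
```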

Author(s)

Przemyslaw Biecek

References

The dictionary is downloaded from http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?file=dic....

Examples

## Not run: 
 tmp <- getEurostatDictionary("crop_pro")
 head(tmp)

## End(Not run)

Download a Dataset from the Eurostat Database

Description

Download a dataset from the eurostat database. The dataset is transformed into the tabular format.

Usage

getEurostatRaw(kod = "educ_iste", rowRegExp=NULL, colRegExp=NULL, 
       strip.white = TRUE)

Arguments

`kod`	A code name for the dataset of interest. See the table of contents of eurostat datasets for more details.
`rowRegExp`	If not NULL, this regular expression will be used to filter rows out of the downloaded file.
`colRegExp`	If not NULL, this regular expression will be used to filter columns out of the downloaded file.
`strip.white`	Passed to the internal `read.table()`. By default it strips white spaces from eurostat values.
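
The effect of rowRegExp and colRegExp can be pictured with base R on a mock table. This is only a sketch: it assumes that rows and columns matching the expression are the ones kept, which the wording above does not settle, and it is not the package's implementation. The helper filter_by_regexp and the sample values are hypothetical.

```r
# Mock table standing in for a downloaded file (values invented).
raw_tab <- data.frame(
  id = c("TOTAL,PL", "TOTAL,FR", "F,PL"),
  `2010` = c(10, 20, 4),
  `2011` = c(11, 21, 5),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose first column matches row_re and
# data columns whose names match col_re; NULL means "no filtering".
filter_by_regexp <- function(tab, row_re = NULL, col_re = NULL) {
  if (!is.null(row_re)) {
    tab <- tab[grepl(row_re, tab[[1]]), , drop = FALSE]
  }
  if (!is.null(col_re)) {
    keep <- c(TRUE, grepl(col_re, names(tab)[-1]))  # always keep the id column
    tab <- tab[, keep, drop = FALSE]
  }
  tab
}

# Rows ending in "PL", and only the 2011 column.
pl_2011 <- filter_by_regexp(raw_tab, row_re = "PL$", col_re = "2011")
print(pl_2011)
```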

Value

A dataset in data.frame format. The first column contains the names of cases. Column names usually correspond to years.

Author(s)

Przemyslaw Biecek

References

Data is downloaded from http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing website.

Examples

## Not run: 
 tmp <- getEurostatRaw(kod = "educ_iste")
 head(tmp)

## End(Not run)

Download a Dataset from the Eurostat Database

Description

Download a dataset from the eurostat database. The dataset is transformed into the molten / row-column-value format (RCV).

Usage

getEurostatRCV(kod = "educ_iste", ...)

Arguments

`kod`	A code name for the dataset of interest. See the table of contents of eurostat datasets for more details.
`...`	Other parameters that are passed to getEurostatRaw().

Value

A dataset in the molten format with the last column 'value'.
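
To make the molten (row-column-value) layout concrete, the sketch below reshapes a small mock wide table into that form using base R only. The data are invented and the column names geo and time are chosen to mirror the examples in this manual. Real eurostat tables carry more dimensions, and this is not the code used by getEurostatRCV().

```r
# Mock wide table: first column identifies the case, remaining columns are years.
wide <- data.frame(
  geo = c("PL", "FR"),
  `2010` = c(1.5, 2.5),
  `2011` = c(1.7, 2.4),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Stack the year columns into (geo, time, value) rows.
year_cols <- setdiff(names(wide), "geo")
molten <- do.call(rbind, lapply(year_cols, function(y) {
  data.frame(geo = wide$geo, time = y, value = wide[[y]],
             stringsAsFactors = FALSE)
}))
print(molten)
```

Each row of the result holds one observation, with 'value' as the last column, which is the shape expected by tools such as ggplot2.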

Author(s)

Przemyslaw Biecek

References

Data is downloaded from http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing website.

Examples

## Not run: 
 tmp <- getEurostatRCV(kod = "educ_iste")
 head(tmp)
 
 library(reshape)  # provides cast()
 t1 <- getEurostatRCV("rail_ac_catvict")
 tmp <- cast(t1, geo ~ time , mean, subset=victim=="KIL" & 
                 pers_inv=="TOTAL" & accident=="TOTAL")
 tmp3 <- tmp[,1:9]
 rownames(tmp3) <- tmp3[,1]
 tmp3 <- tmp3[c("UK", "SK", "FR", "PL", "ES", "PT", "LV"),]
 matplot(2005:2012,t(tmp3[,-1]), type="o", pch=19, lty=1, 
                 las=1, xlab="", ylab="", yaxt="n")
 axis(2,tmp3[,9], rownames(tmp3), las=1)

## End(Not run)

Eurostat Table of Contents

Description

Download a table of contents of eurostat datasets. Note that the values in column 'code' should be used to download a selected dataset.

Usage

getEurostatTOC()

Value

A data.frame with eight columns

`title`	The name of the dataset or theme.
`code`	The code name of the dataset or theme; it is used by the getEurostatRCV and getEurostatRaw functions.
`type`	Is it a dataset, folder or table.
`last.update.of.data`, `last.table.structure.change`, `data.start`, `data.end`	Dates.

Author(s)

Przemyslaw Biecek

References

The TOC is downloaded from http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?sort=1&file=table_of_contents_en.txt

Examples

## Not run: 
 tmp <- getEurostatTOC()
 head(tmp)

## End(Not run)

Geolocalisation with Google Maps

Description

Get the geolocalisation (longitude, latitude) of a given address with the use of the Google Maps API.

The Google Maps API is used to determine the geolocalisation (longitude, latitude) of a given address.

Usage

getGoogleMapsAddress(street = "Banacha 2", city = "Warszawa", 
	country="Poland", positionOnly = TRUE, delay=1)

Arguments

`street`	An address (street and building number)
`city`	City
`country`	Country
`positionOnly`	What should be returned: a vector with longitude and latitude coordinates, or the raw result from the Google Maps API.
`delay`	Number of seconds to wait between API calls.

Value

If positionOnly=TRUE, a vector with two values (longitude, latitude); otherwise the raw list returned by Google Maps.

Author(s)

Przemyslaw Biecek

References

The Google Maps API https://developers.google.com/maps/

Examples

## Not run: 
 getGoogleMapsAddress()

## End(Not run)

MillwardBrown Poll Results

Description

Download poll results from the MillwardBrown website.

Usage

getMillwardBrown()

Value

A dataset in the molten format with poll date, party and percent of votes.

Author(s)

Maciej Beresewicz [data extraction], Przemyslaw Biecek [data melting]

Examples

## Not run: 
 getMillwardBrown()

## End(Not run)

Names of Eurostat Datasets That Fit Given Pattern

Description

Lists the names of eurostat datasets whose description contains a given pattern.

This function downloads the list of all datasets available on eurostat and returns those whose description contains the given pattern.

E.g. all datasets related to education or teaching.

Usage

grepEurostatTOC(pattern)

Arguments

`pattern`	Character, only datasets that contain this pattern in the description will be returned.

Value

A data.frame with eight columns

`title`	The name of the dataset or theme.
`code`	The code name of the dataset or theme; it is used by the getEurostatRCV and getEurostatRaw functions.
`type`	Is it a dataset, folder or table.
`last.update.of.data`, `last.table.structure.change`, `data.start`, `data.end`	Dates.
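
The kind of filtering described here can be sketched with grepl() on a mock table of contents. The rows below are invented, the helper grep_toc is hypothetical, and matching case-insensitively against the title column is an assumption rather than documented behaviour of grepEurostatTOC().

```r
# Mock table of contents with a subset of the columns listed above (values invented).
toc <- data.frame(
  title = c("Education statistics", "Road accidents", "Teaching staff"),
  code  = c("educ", "tran_acc", "educ_staff"),
  type  = c("folder", "dataset", "table"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep entries whose title matches the pattern.
grep_toc <- function(toc, pattern) {
  toc[grepl(pattern, toc$title, ignore.case = TRUE), , drop = FALSE]
}

hits <- grep_toc(toc, "educ|teach")
print(hits$code)
```

The values in the code column of the result are what would then be passed on as the kod argument.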

Author(s)

Przemyslaw Biecek

Examples

## Not run: 
 tmp <- grepEurostatTOC("education")
 head(tmp)

## End(Not run)

Results from Matura Exams in Poland for Math and Language for Years 2010-2015

Description

This dataset was created from data in the ZPD package, see https://github.com/zozlak/ZPD and http://zpd.ibe.edu.pl/doku.php?id=obazie. Each row shows the results of one person who took the matura exams in a given year.

Author(s)

Przemyslaw Biecek [based on IBE / ZPD data]

Examples

## Not run: 
	data(maturaExam)
	head(maturaExam)

## End(Not run)
