
Get started with pRycollection
Andreas Schneider
Source:vignettes/pRycollection.Rmd
Metadata could provide far more context than it usually does, but data and their metadata often live in separate places, which makes data analysis harder.
The main goal of pRycollection is to provide datasets about Paraguay for research and teaching that are otherwise hard to find or access. The data package was built from the beginning with the FAIR principles in mind. FAIR stands for Findable, Accessible, Interoperable, and Reusable. These principles are critical to maximizing the impact and value of data in research and practice.
You can download all datasets directly from vignette("download").
Installation
You can install the development version of pRycollection from GitHub with:
# install.packages("pak")
pak::pak("schneiderpy/pRycollection")
# load pRycollection
library(pRycollection)
Examples
Choose an available dataset from the package:
# a new tab opens listing the available datasets
data(package = "pRycollection")
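If you prefer to work with the listing as a regular object rather than reading it in a viewer, the value returned by data() can be inspected directly. The sketch below is an illustration only: it uses the "datasets" package that ships with base R as a stand-in, on the assumption that replacing the package name with "pRycollection" behaves the same way once the package is installed.

```r
# "datasets" ships with base R and stands in for "pRycollection" here
# (assumption: the same call works for any installed data package).
available <- data(package = "datasets")

# The result carries a character matrix with one row per dataset;
# keep only the dataset name (Item) and its short description (Title).
listing <- as.data.frame(available$results[, c("Item", "Title")])

head(listing)
```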
Choose your dataset and print the first six rows…
head(py_temperature)
#> Schneider (2025): Weekly mean temperature data [dataset], https://doi.org/10.5281/zenodo.16729963
#> rowid country ISO city week avg_temp holiday
#> <defined> <defined> <defined> <defined> <dttm_dfn> <defined> <defined>
#> 1 obs:1 Paraguay PY 1 [Asuncion] 2016-01-04 27.8 0
#> 2 obs:2 Paraguay PY 1 [Asuncion] 2016-01-11 30.3 0
#> 3 obs:3 Paraguay PY 1 [Asuncion] 2016-01-18 29.9 0
#> 4 obs:4 Paraguay PY 1 [Asuncion] 2016-01-25 27.3 1
#> 5 obs:5 Paraguay PY 1 [Asuncion] 2016-02-01 26.6 0
#> 6 obs:6 Paraguay PY 1 [Asuncion] 2016-02-08 30.1 0
You might have noticed that the dataset looks like an ordinary data frame, but with additional metadata attached. The metadata was added with the dataset R package. For example, the printed header shows the author, year, title, and reference of the dataset.
The new data frames are stored in an R native format to avoid loss of contextual information and to keep data and metadata together.
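Why a native format matters can be seen in a small round trip. The following is a minimal sketch, not the package's own storage code: it assumes that "R native format" means something like an .rds file, and it uses a hypothetical two-row data frame (temps) with a hand-set unit attribute in place of the packaged data.

```r
# Hypothetical stand-in data frame; not part of pRycollection.
temps <- data.frame(
  week = as.Date(c("2016-01-04", "2016-01-11")),
  avg_temp = c(27.8, 30.3)
)
attr(temps$avg_temp, "unit") <- "degrees Celsius"

# Write the same data once as .rds (R native) and once as plain CSV.
rds_path <- tempfile(fileext = ".rds")
csv_path <- tempfile(fileext = ".csv")
saveRDS(temps, rds_path)
write.csv(temps, csv_path, row.names = FALSE)

from_rds <- readRDS(rds_path)
from_csv <- read.csv(csv_path)

# The attribute survives the .rds round trip but is lost in the CSV.
unit_rds <- attr(from_rds$avg_temp, "unit")
unit_csv <- attr(from_csv$avg_temp, "unit")
print(unit_rds)
print(unit_csv)
```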
To look up the unit of measure of the avg_temp variable, inspect its attributes:
attributes(py_temperature$avg_temp)
#> $label
#> [1] "Mean temperature"
#>
#> $class
#> [1] "haven_labelled_defined" "haven_labelled" "vctrs_vctr"
#> [4] "double"
#>
#> $unit
#> [1] "degrees Celsius"
# or alternatively
# attributes(py_temperature[[6]])
As you can see, the unit of measure is “degrees Celsius”.
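When only a single attribute is needed, base R's attr() returns it without the rest of the list. The sketch below builds a stand-in vector by hand, copying the label and unit shown in the output above, so that it runs without pRycollection installed. The same attr() calls should apply to py_temperature$avg_temp, though that is an assumption rather than something tested here.

```r
# Stand-in for py_temperature$avg_temp: a plain numeric vector carrying
# the same "label" and "unit" attributes as in the output above.
avg_temp <- structure(
  c(27.8, 30.3, 29.9),
  label = "Mean temperature",
  unit = "degrees Celsius"
)

# attr() extracts one attribute by name; exact = TRUE disables
# partial matching of attribute names.
temp_label <- attr(avg_temp, "label", exact = TRUE)
temp_unit <- attr(avg_temp, "unit", exact = TRUE)

# Combine label and unit, for example to use as a plot axis title.
axis_label <- paste0(temp_label, " (", temp_unit, ")")
print(axis_label)
```

Keeping the unit next to the values in this way means a plot label or table header can be generated from the data itself instead of being typed by hand.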