
Metadata could provide far more context than they usually do, but data and their metadata often live in separate places, which makes data analysis more difficult.

The main goal of pRycollection is to provide datasets about Paraguay that are otherwise hard to find or access, for use in research and teaching. The data package was built from the beginning with FAIR principles in mind. FAIR stands for Findable, Accessible, Interoperable, and Reusable. These principles are critical to maximizing the impact and value of data in research and practice.

You can download all datasets directly from vignette("download").

Installation

You can install the development version of pRycollection from GitHub with:

# install.packages("pak")
pak::pak("schneiderpy/pRycollection")

# load pRycollection
library(pRycollection)

Examples

Choose an available dataset from the package:

# opens a new tab listing the available datasets
data(package = "pRycollection")
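If you prefer to work with the list of datasets programmatically rather than in a viewer tab, the value returned by data() can be inspected directly. The sketch below uses the built-in datasets package as a stand-in so that it runs without pRycollection installed; replacing "datasets" with "pRycollection" should give the corresponding listing.

```r
# data() returns an object whose `results` element is a character matrix
# with one row per dataset (columns include "Item" and "Title").
# "datasets" is a stand-in here; swap in "pRycollection" once it is installed.
available <- data(package = "datasets")$results

# Show dataset names and their short titles
head(available[, c("Item", "Title")])
```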

Choose your dataset and print the first six rows…

head(py_temperature)
#> Schneider (2025): Weekly mean temperature data [dataset], https://doi.org/10.5281/zenodo.16729963
#>   rowid     country   ISO       city         week       avg_temp  holiday   
#>   <defined> <defined> <defined> <defined>    <dttm_dfn> <defined> <defined>
#> 1 obs:1     Paraguay  PY        1 [Asuncion] 2016-01-04 27.8      0        
#> 2 obs:2     Paraguay  PY        1 [Asuncion] 2016-01-11 30.3      0        
#> 3 obs:3     Paraguay  PY        1 [Asuncion] 2016-01-18 29.9      0        
#> 4 obs:4     Paraguay  PY        1 [Asuncion] 2016-01-25 27.3      1        
#> 5 obs:5     Paraguay  PY        1 [Asuncion] 2016-02-01 26.6      0        
#> 6 obs:6     Paraguay  PY        1 [Asuncion] 2016-02-08 30.1      0

You might have noticed that the dataset looks like an ordinary data frame, but it carries additional metadata. This metadata was added with the dataset R package. For example, the first line of the output shows the author, year, title, and reference (DOI) of the dataset.

The new data frames are stored in an R native format to avoid loss of contextual information and to keep data and metadata together.
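To illustrate why a native format matters, here is a minimal sketch using only base R and a toy data frame (hypothetical values, not taken from the package). It assumes .rds as the native format purely for illustration; the text above does not state which native format the package uses.

```r
# Toy data frame with a unit attached to one column (illustrative only)
df <- data.frame(avg_temp = c(27.8, 30.3, 29.9))
attr(df$avg_temp, "unit") <- "degrees Celsius"

# Write the same object once as .rds (R native) and once as .csv
rds_file <- tempfile(fileext = ".rds")
csv_file <- tempfile(fileext = ".csv")
saveRDS(df, rds_file)
write.csv(df, csv_file, row.names = FALSE)

# Read both back
from_rds <- readRDS(rds_file)
from_csv <- read.csv(csv_file)

attr(from_rds$avg_temp, "unit")  # the attribute survives the round trip
attr(from_csv$avg_temp, "unit")  # NULL: CSV stores values only
```

The values are identical in both files; only the native format keeps the metadata attached to the column.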

For example, if you are interested in the unit of measure for the avg_temp variable, type the following line of code:

attributes(py_temperature$avg_temp)
#> $label
#> [1] "Mean temperature"
#> 
#> $class
#> [1] "haven_labelled_defined" "haven_labelled"         "vctrs_vctr"            
#> [4] "double"                
#> 
#> $unit
#> [1] "degrees Celsius"

# or alternatively
# attributes(py_temperature[[6]])

As you can see, the unit of measure is “degrees Celsius”.
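If you only need one piece of metadata, base R's attr() reads a single attribute instead of printing all of them. The sketch below uses a toy data frame whose attribute names ("label", "unit") mirror the output above; the data and the col_units name are illustrative, not part of the package.

```r
# Toy data frame standing in for a dataset with column-level metadata
# (hypothetical values; attribute names mirror the output above)
toy <- data.frame(
  week = as.Date(c("2016-01-04", "2016-01-11")),
  avg_temp = c(27.8, 30.3)
)
attr(toy$avg_temp, "label") <- "Mean temperature"
attr(toy$avg_temp, "unit") <- "degrees Celsius"

# Read one attribute directly
attr(toy$avg_temp, "unit", exact = TRUE)

# Collect the unit of every column; columns without a unit yield NA
col_units <- vapply(
  toy,
  function(col) {
    u <- attr(col, "unit", exact = TRUE)
    if (is.null(u)) NA_character_ else u
  },
  character(1)
)
col_units
```

With the package loaded, the same pattern should apply to the real data, for example attr(py_temperature$avg_temp, "unit").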