So far, the computational results are as expected and match what we have seen in previous vignettes.

However, we have only glanced at the additional metadata so far. For example, in our py_temperature data, what is the unit of the avg_temp variable?

To find out, we need the dataset package: all metadata was added with this package, so we use its functions to read it back. For more details, see the package documentation.

Make sure the dataset package is installed, then load it.

# load the dataset package
library(dataset)
# load data locally
data("py_temperature")

The summary() function gives an overview of both data and metadata: a citation line, the variable labels, and the usual per-column summary statistics.

summary(py_temperature)
#> Schneider (2025): Summary of Weekly mean temperature data [dataset], https://doi.org/10.5281/zenodo.16729963
#> 
#> Country name
#> Country ISO code
#> Mean temperature (degrees Celsius)
#> Holiday indicator
#>     rowid             country              ISO                 city  
#>  Length:1565        Length:1565        Length:1565        Min.   :1  
#>  Class :character   Class :character   Class :character   1st Qu.:2  
#>  Mode  :character   Mode  :character   Mode  :character   Median :3  
#>                                                           Mean   :3  
#>                                                           3rd Qu.:4  
#>                                                           Max.   :5  
#>       week               avg_temp         holiday      
#>  Min.   :2016-01-04   Min.   : 9.329   Min.   :0.0000  
#>  1st Qu.:2017-07-03   1st Qu.:20.043   1st Qu.:0.0000  
#>  Median :2018-12-31   Median :24.214   Median :0.0000  
#>  Mean   :2018-12-31   Mean   :23.280   Mean   :0.1885  
#>  3rd Qu.:2020-06-29   3rd Qu.:26.529   3rd Qu.:0.0000  
#>  Max.   :2021-12-27   Max.   :32.000   Max.   :1.0000

While this is a good starting point, you might only be interested in the unit of measure of a single variable. One line of code is enough:

var_unit(py_temperature$avg_temp)
#> [1] "degrees Celsius"
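
To build some intuition for how a unit can travel with a variable, here is a minimal base R sketch that stores a unit as a vector attribute. It only illustrates the general idea and is not necessarily how the dataset package implements var_unit(). The helpers set_unit() and get_unit() are hypothetical names introduced for this example, and the temperature values are toy numbers, not taken from py_temperature.

```r
# Hypothetical helpers, not part of the dataset package:
# attach a unit to a vector and read it back via base R attributes.
set_unit <- function(x, unit) {
  attr(x, "unit") <- unit
  x
}
get_unit <- function(x) attr(x, "unit", exact = TRUE)

# Toy values for illustration only (not taken from py_temperature).
temps <- set_unit(c(21.5, 24.0, 26.3), "degrees Celsius")

get_unit(temps)  # the unit stored alongside the numbers
mean(temps)      # arithmetic still works on the underlying values
get_unit(1:3)    # NULL: a plain vector carries no unit
```

The point of the sketch is that the metadata lives on the variable itself, so it can be queried at any time without consulting separate documentation.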

To see what the data is about, use the description() function:

description(py_temperature)
#> [1] "Weekly mean temperature of the five largest cities (2016-2021).\n  Data set also includes a holiday indicator."
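
The returned string contains an embedded line break followed by indentation (the "\n  " in the output above). If you want to reuse the text in a report or plot caption, a small base R sketch like the following tidies it up. The string is copied from the output above so that the example is self-contained; in practice you would presumably start from description(py_temperature).

```r
# The description string as printed above, including the embedded line break.
desc <- "Weekly mean temperature of the five largest cities (2016-2021).\n  Data set also includes a holiday indicator."

# Collapse any run of whitespace (newline plus indentation) into one space.
desc_clean <- gsub("\\s+", " ", desc)

# cat() writes the text without quotes or the [1] index.
cat(desc_clean, "\n")
```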

For a more complete description, print the bibliographic record of the dataset:

print(get_bibentry(py_temperature), "bibtex")
#> Dublin Core Metadata Record
#> --------------------------
#> Title:       Weekly mean temperature data
#> Creator(s):  Andreas Schneider [cre, dtm]
#> Publisher:   :unas
#> Year:        2025
#> Language:    en
#> Description: Weekly mean temperature of the five largest cities (2016-2021).
#>   Data set also includes a holiday indicator.

As you can see, the dataset carries rich metadata alongside the measured data. The metadata fields are easy to access, which makes guessing unnecessary.