B. Advanced Usage

Gerold Hepp

2019-03-09

This vignette illustrates some more advanced concepts of the DTSg package, namely reference semantics as well as chaining and piping.


First, let’s load the package and some data, then create a DTSg object:

library(DTSg)

data(flow)
TS <- DTSg$new(flow)
TS
#> Values:
#>        .dateTime   flow
#>           <POSc>  <num>
#>    1: 2007-01-01  9.540
#>    2: 2007-01-02  9.285
#>    3: 2007-01-03  8.940
#>    4: 2007-01-04  8.745
#>    5: 2007-01-05  8.490
#>   ---                  
#> 2188: 2012-12-27 26.685
#> 2189: 2012-12-28 28.050
#> 2190: 2012-12-29 23.580
#> 2191: 2012-12-30 18.840
#> 2192: 2012-12-31 17.250
#> 
#> ID:          
#> Parameter:   
#> Variant:     
#> Unit:        
#> Aggregated:  FALSE
#> Regular:     TRUE
#> Periodicity: Time difference of 1 days
#> Time zone:   UTC

Reference Semantics

By default, every method that manipulates the values of a DTSg object works on a copy (clone) of it and leaves the original untouched. This behaviour can be overridden for a single call by setting the method’s clone argument to FALSE. Globally, cloning can be controlled with the DTSgClone option:

TS$alter("2007-01-01", "2008-12-31")
# end date is still in the year 2012
TS
#> Values:
#>        .dateTime   flow
#>           <POSc>  <num>
#>    1: 2007-01-01  9.540
#>    2: 2007-01-02  9.285
#>    3: 2007-01-03  8.940
#>    4: 2007-01-04  8.745
#>    5: 2007-01-05  8.490
#>   ---                  
#> 2188: 2012-12-27 26.685
#> 2189: 2012-12-28 28.050
#> 2190: 2012-12-29 23.580
#> 2191: 2012-12-30 18.840
#> 2192: 2012-12-31 17.250
#> 
#> ID:          
#> Parameter:   
#> Variant:     
#> Unit:        
#> Aggregated:  FALSE
#> Regular:     TRUE
#> Periodicity: Time difference of 1 days
#> Time zone:   UTC

options("DTSgClone" = FALSE)
getOption("DTSgClone")
#> [1] FALSE
TS$alter("2007-01-01", "2008-12-31")
# end date is in the year 2008 now
TS
#> Values:
#>       .dateTime   flow
#>          <POSc>  <num>
#>   1: 2007-01-01  9.540
#>   2: 2007-01-02  9.285
#>   3: 2007-01-03  8.940
#>   4: 2007-01-04  8.745
#>   5: 2007-01-05  8.490
#>  ---                  
#> 727: 2008-12-27 18.180
#> 728: 2008-12-28 16.575
#> 729: 2008-12-29 13.695
#> 730: 2008-12-30 12.540
#> 731: 2008-12-31 11.940
#> 
#> ID:          
#> Parameter:   
#> Variant:     
#> Unit:        
#> Aggregated:  FALSE
#> Regular:     TRUE
#> Periodicity: Time difference of 1 days
#> Time zone:   UTC

As we can see, with cloning set to FALSE, the object was altered in place, i.e., neither assignment to a new variable nor reassignment to an existing one was necessary to make the changes stick. This is because DTSg objects are R6 objects, which have reference semantics.
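The same effect can be had for a single call without touching the global option. The following is a minimal sketch of the per-call clone argument mentioned above; the variable names TSa and TSb are only illustrative, and the DTSgClone option is temporarily set to TRUE so that the default behaviour is visible:

```r
library(DTSg)

data(flow)

# make sure cloning is switched on globally for this sketch
oldOptions <- options(DTSgClone = TRUE)

TSa <- DTSg$new(flow)

# default: alter works on a clone, so the result has to be assigned
TSb <- TSa$alter("2007-01-01", "2008-12-31")
nrow(TSa$values()) # still the full series (2192 rows)
nrow(TSb$values()) # shortened clone (731 rows)

# per-call override: TSa itself is shortened, no assignment needed
TSa$alter("2007-01-01", "2008-12-31", clone = FALSE)
nrow(TSa$values()) # now 731 rows as well

# restore whatever was set before
options(oldOptions)
```

Setting clone = FALSE per call keeps the in-place modification visible at the call site, which can be easier to follow than a global option set elsewhere.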

Note

Using reference semantics can result in undesired behaviour. Merely assigning a variable representing a DTSg object to a new variable does not result in a copy of the object. Instead, both variables will reference and access the same data in the background, i.e., changing one will also affect the other. If you really want a copy of a DTSg object, you have to use the clone method with its deep argument set to TRUE (for consistency with the R6 package, its default is FALSE):

TSc <- TS$clone(deep = TRUE)
# or clone(TS, deep = TRUE)
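To make the difference between a plain assignment and a deep clone tangible, here is a small sketch. The variable names TS1, TSalias and TScopy are hypothetical and only serve the illustration:

```r
library(DTSg)

data(flow)

TS1 <- DTSg$new(flow)

TSalias <- TS1                    # no copy: a second name for the same object
TScopy  <- TS1$clone(deep = TRUE) # independent copy

# modify TS1 in place
TS1$alter("2007-01-01", "2008-12-31", clone = FALSE)

nrow(TSalias$values()) # 731: the alias follows TS1
nrow(TScopy$values())  # 2192: the deep clone is unaffected
```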

Chaining and Piping

Especially in combination with reference semantics, chaining and piping can be a fast and comfortable way to apply several object manipulations in a row. While chaining only works in combination with the R6 interface, piping is an exclusive feature of the S3 interface.

Let’s start with chaining:

TS <- DTSg$
  new(flow)$
  alter("2007-01-01", "2008-12-31")$
  colapply(interpolateLinear)$
  aggregate(byYm____, mean)
TS
#> Values:
#>      .dateTime      flow
#>         <POSc>     <num>
#>  1: 2007-01-01 25.281290
#>  2: 2007-02-01 14.496964
#>  3: 2007-03-01 12.889839
#>  4: 2007-04-01 12.470500
#>  5: 2007-05-01  9.233226
#>  6: 2007-06-01  9.193500
#>  7: 2007-07-01 12.272419
#>  8: 2007-08-01 11.291129
#>  9: 2007-09-01  8.874500
#> 10: 2007-10-01 13.063065
#> 11: 2007-11-01 25.668500
#> 12: 2007-12-01 20.474032
#> 13: 2008-01-01 19.677097
#> 14: 2008-02-01 14.185345
#> 15: 2008-03-01 23.577581
#> 16: 2008-04-01 23.284000
#> 17: 2008-05-01 14.325968
#> 18: 2008-06-01  9.287000
#> 19: 2008-07-01 22.004032
#> 20: 2008-08-01 12.641129
#> 21: 2008-09-01 13.710500
#> 22: 2008-10-01 10.626774
#> 23: 2008-11-01  8.902000
#> 24: 2008-12-01 16.435645
#>      .dateTime      flow
#> 
#> ID:          
#> Parameter:   
#> Variant:     
#> Unit:        
#> Aggregated:  TRUE
#> Regular:     FALSE
#> Periodicity: 1 months
#> Min lag:     Time difference of 28 days
#> Max lag:     Time difference of 31 days
#> Time zone:   UTC

For piping, we first have to make sure the magrittr package is installed and load it to get access to its pipe operator:

if (requireNamespace("magrittr", quietly = TRUE)) {
  library(magrittr)
  
  TS <- new("DTSg", flow) %>%
    alter("2007-01-01", "2008-12-31") %>%
    colapply(interpolateLinear) %>%
    aggregate(byYm____, mean)
  TS
}
#> Values:
#>      .dateTime      flow
#>         <POSc>     <num>
#>  1: 2007-01-01 25.281290
#>  2: 2007-02-01 14.496964
#>  3: 2007-03-01 12.889839
#>  4: 2007-04-01 12.470500
#>  5: 2007-05-01  9.233226
#>  6: 2007-06-01  9.193500
#>  7: 2007-07-01 12.272419
#>  8: 2007-08-01 11.291129
#>  9: 2007-09-01  8.874500
#> 10: 2007-10-01 13.063065
#> 11: 2007-11-01 25.668500
#> 12: 2007-12-01 20.474032
#> 13: 2008-01-01 19.677097
#> 14: 2008-02-01 14.185345
#> 15: 2008-03-01 23.577581
#> 16: 2008-04-01 23.284000
#> 17: 2008-05-01 14.325968
#> 18: 2008-06-01  9.287000
#> 19: 2008-07-01 22.004032
#> 20: 2008-08-01 12.641129
#> 21: 2008-09-01 13.710500
#> 22: 2008-10-01 10.626774
#> 23: 2008-11-01  8.902000
#> 24: 2008-12-01 16.435645
#>      .dateTime      flow
#> 
#> ID:          
#> Parameter:   
#> Variant:     
#> Unit:        
#> Aggregated:  TRUE
#> Regular:     FALSE
#> Periodicity: 1 months
#> Min lag:     Time difference of 28 days
#> Max lag:     Time difference of 31 days
#> Time zone:   UTC
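As a possible alternative, the same pipeline can be sketched with R’s native pipe operator |>, which needs no additional package. This assumes R version 4.1.0 or later; TSp is an illustrative name:

```r
library(DTSg)

data(flow)

# the native pipe passes the left-hand side as first argument,
# which is all the S3 interface needs
TSp <- new("DTSg", flow) |>
  alter("2007-01-01", "2008-12-31") |>
  colapply(interpolateLinear) |>
  aggregate(byYm____, mean)

nrow(TSp$values()) # 24 monthly means
```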