---
title: "Extracting environmental data with `rvdat`"
date: 2024-01-10
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Extracting environmental data with `rvdat`}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---


We often want to extract the environmental data that some of our receivers log. You can do this by going into Innovasea's VUE or Fathom Connect programs and clicking a bunch of buttons, or you can do it programmatically and reproducibly by combining `matos` and [`rvdat`](https://rvdat.obrien.page).


```r
library(matos)
#> By continuing, you are agreeing to the ACT Network MATOS User Agreement and Data Policy, Version 1.2:
#> 
#> <https://matos.asascience.com/static/MATOS.User.Agreement.V1.1.pdf>
#> 
#> Attaching package: 'matos'
#> The following objects are masked from 'package:rvdat':
#> 
#>     skip_example_on_ci, skip_example_on_runiverse
library(rvdat)
```

## Download the data

If you're an ACT member, log into your MATOS account. If you're not, we'll skip to the next section and assume you already have your files ready.


```r
matos_login()
#> ! Please log in.
#> ✔ Login successful!
```

Next, tell `rvdat` the location of the `vdat.exe` executable (check out the [`rvdat` documentation](https://rvdat.obrien.page) for more information on this).


```r
vdat_here("vdat.exe")
```

Using `matos`, I'm going to find the VDAT files associated with the [Mid-Bay Chesapeake Backbone array](https://matos.asascience.com/project/detail/161). Note that things like temperature and tilt are logged within the VRL/VDAT files, and so will be under the "detections" file type.


```r
detection_files <- list_project_files(
  project = "UMCES Chesapeake Backbone, Mid-Bay",
  file_type = "detections"
)
#> Warning in (function (what, read_access, quiet, warn_multimatch) :
#> MATOS has exactly matched multiple OTN project names.

head(detection_files)
#>   project                  file_type upload_date
#> 3     161 Tag Detections - .vfl file  2023-10-12
#> 5     161 Tag Detections - .vfl file  2023-09-13
#> 6     161 Tag Detections - .vfl file  2023-09-13
#> 7     161 Tag Detections - .vfl file  2023-09-13
#> 8     161 Tag Detections - .vfl file  2023-09-13
#> 9     161 Tag Detections - .vfl file  2023-09-13
#>                     file_name
#> 3 VR2AR_546323_20231012_1.vrl
#> 5 VR2AR_547715_20230912_1.vrl
#> 6 VR2AR_547714_20230912_1.vrl
#> 7 VR2AR_546476_20230912_1.vrl
#> 8 VR2AR_546470_20230912_1.vrl
#> 9 VR2AR_546462_20230912_1.vrl
#>                                                       url
#> 3 https://matos.asascience.com/projectfile/download/11087
#> 5  https://matos.asascience.com/projectfile/download/8003
#> 6  https://matos.asascience.com/projectfile/download/8002
#> 7  https://matos.asascience.com/projectfile/download/8001
#> 8  https://matos.asascience.com/projectfile/download/8000
#> 9  https://matos.asascience.com/projectfile/download/7999
```

I'll download the first file into a temporary directory.


```r
get_project_file(
  url = detection_files$url[1],
  out_dir = tempdir()
)
#> 
#> ── Downloading files ─────────────────────────────────────────────────
#> ✔ File(s) saved to:
#>    C:\Users\darpa2\AppData\Local\Temp\RtmpuIFd1a\VR2AR_546323_20231012_1.vrl
#> [1] "C:\\Users\\darpa2\\AppData\\Local\\Temp\\RtmpuIFd1a\\VR2AR_546323_20231012_1.vrl"

vrl_file <- list.files(tempdir(), pattern = "vrl$", full.names = T)
```

## Convert VRL using `rvdat`

Now, I'll use `rvdat::vdat_to_folder` to convert that file into a folder of CSV files, each one representing one data type.


```r
vdat_to_folder(
  vrl_file,
  outdir = tempdir()
)
#> ✔ File converted:
#>   C:\Users\darpa2\AppData\Local\Temp\RtmpuIFd1a/VR2AR_546323_20231012_1.vrl
#> ℹ Files saved in:
#>   C:\Users\darpa2\AppData\Local\Temp\RtmpuIFd1a/VR2AR_546323_20231012_1.csv-fathom-split
```

There are quite a few files in there, but for this we're going to focus on the temperature records stored in `TEMP.csv`.


```r
list.files(tempdir(), pattern = "fathom-split", full.names = T) |>
  list.files()
#>  [1] "ATTITUDE.csv"         "BATTERY.csv"         
#>  [3] "CFG_CHANNEL.csv"      "CFG_STUDY.csv"       
#>  [5] "CFG_TRANSMITTER.csv"  "CLOCK_REF.csv"       
#>  [7] "DATA_SOURCE_FILE.csv" "DEPTH.csv"           
#>  [9] "DET.csv"              "DIAG.csv"            
#> [11] "EVENT.csv"            "EVENT_INIT.csv"      
#> [13] "EVENT_OFFLOAD.csv"    "HEALTH_VR2AR.csv"    
#> [15] "TEMP.csv"

bwt_file <- list.files(tempdir(), pattern = "TEMP", full.names = T, recursive = T)
```

## Read it into R

Let's read in the data.


```r
read.csv(bwt_file)
#> Error in read.table(file = file, header = header, sep = sep, quote = quote, : more columns than column names
```

An error, oh no! Let's see what's causing it.


```r
read.csv(bwt_file,
  header = FALSE,
  nrows = 5
)
#>               V1                  V2
#> 1 VEMCO DATA LOG               2.0.0
#> 2      TEMP_DESC   Device Time (UTC)
#> 3           TEMP 2022-05-04 16:00:00
#> 4           TEMP 2022-05-04 17:00:00
#> 5           TEMP 2022-05-04 18:00:00
#>                                    V3              V4
#> 1 vdat-9.13.2-20240607-7a2e9d-release                
#> 2                                Time Time Offset (h)
#> 3                                                    
#> 4                                                    
#> 5                                                    
#>                    V5       V6            V7              V8
#> 1                                                           
#> 2 Time Correction (s)    Model Serial Number Ambient (deg C)
#> 3                     VR2AR-69        546323          13.558
#> 4                     VR2AR-69        546323          12.869
#> 5                     VR2AR-69        546323          12.827
#>                 V9
#> 1                 
#> 2 Internal (deg C)
#> 3                 
#> 4                 
#> 5
```

Ah, the data doesn't really start until the second row. Skip the first one and take a look.


```r
bwt <- read.csv(
  bwt_file,
  skip = 1
)

head(bwt)
#>   TEMP_DESC   Device.Time..UTC. Time Time.Offset..h.
#> 1      TEMP 2022-05-04 16:00:00   NA              NA
#> 2      TEMP 2022-05-04 17:00:00   NA              NA
#> 3      TEMP 2022-05-04 18:00:00   NA              NA
#> 4      TEMP 2022-05-04 19:00:00   NA              NA
#> 5      TEMP 2022-05-04 20:00:00   NA              NA
#> 6      TEMP 2022-05-04 21:00:00   NA              NA
#>   Time.Correction..s.    Model Serial.Number Ambient..deg.C.
#> 1                  NA VR2AR-69        546323          13.558
#> 2                  NA VR2AR-69        546323          12.869
#> 3                  NA VR2AR-69        546323          12.827
#> 4                  NA VR2AR-69        546323          12.806
#> 5                  NA VR2AR-69        546323          12.806
#> 6                  NA VR2AR-69        546323          12.827
#>   Internal..deg.C.
#> 1               NA
#> 2               NA
#> 3               NA
#> 4               NA
#> 5               NA
#> 6               NA
```

Great! The data are in. I'm going to clean up the names a little bit and convert the time column from a character string to POSIX time.


```r
names(bwt) <- gsub("[_\\.]", "", tolower(names(bwt)))

names(bwt)
#> [1] "tempdesc"        "devicetimeutc"   "time"           
#> [4] "timeoffseth"     "timecorrections" "model"          
#> [7] "serialnumber"    "ambientdegc"     "internaldegc"

bwt$devicetimeutc <- as.POSIXct(bwt$devicetimeutc,
  tz = "UTC",
  format = "%F %T"
)

head(bwt)
#>   tempdesc       devicetimeutc time timeoffseth timecorrections
#> 1     TEMP 2022-05-04 16:00:00   NA          NA              NA
#> 2     TEMP 2022-05-04 17:00:00   NA          NA              NA
#> 3     TEMP 2022-05-04 18:00:00   NA          NA              NA
#> 4     TEMP 2022-05-04 19:00:00   NA          NA              NA
#> 5     TEMP 2022-05-04 20:00:00   NA          NA              NA
#> 6     TEMP 2022-05-04 21:00:00   NA          NA              NA
#>      model serialnumber ambientdegc internaldegc
#> 1 VR2AR-69       546323      13.558           NA
#> 2 VR2AR-69       546323      12.869           NA
#> 3 VR2AR-69       546323      12.827           NA
#> 4 VR2AR-69       546323      12.806           NA
#> 5 VR2AR-69       546323      12.806           NA
#> 6 VR2AR-69       546323      12.827           NA
```

## See what we've got

Now that we have our time series, let's see what it looks like!


```r
plot(ambientdegc ~ devicetimeutc,
  data = bwt,
  type = "l"
)
```

![](figures/rvdat-bwt-1.png)

## Other variables

We can pull out other variables in a similar manner. Take, for example, receiver tilt, located in `ATTITUDE.csv`.


```r
env_import <- function(file) {
  hold <- list.files(
    tempdir(),
    pattern = file,
    full.names = T,
    recursive = T
  ) |>
    read.csv(skip = 1)

  names(hold) <- gsub("[_\\.]", "", tolower(names(hold)))

  hold$devicetimeutc <- as.POSIXct(hold$devicetimeutc,
    tz = "UTC",
    format = "%F %T"
  )

  hold
}

tilt <- env_import("ATTITUDE")

plot(tiltdeg ~ devicetimeutc, data = tilt, type = "l")
```

![](figures/rvdat-tilt-1.png)

Depth, in `DEPTH.csv`:


```r
depth <- env_import("DEPTH")

plot(depthm ~ devicetimeutc, data = depth)
```

![](figures/rvdat-depth-1.png)

Battery, in `BATTERY.csv`:


```r
battery <- env_import("BATTERY")

plot(batteryvoltagev ~ devicetimeutc, data = battery)
```

![](figures/rvdat-battery-1.png)

Noise, which is actually in the diagnostic file `DIAG.csv`:


```r
diagnostics <- env_import("DIAG")

plot(noisemeanmv ~ devicetimeutc, data = diagnostics)
```

![](figures/rvdat-noise-1.png)
Hourly summaries of pings and detections are also in the diagnositc file:


```r
plot(ppmdetections ~ devicetimeutc, data = diagnostics)
```

![](figures/rvdat-detections-1.png)

```r
plot(ppmpings ~ devicetimeutc, data = diagnostics)
```

![](figures/rvdat-detections-2.png)

But are really located in `DET.csv`.


```r
dets <- env_import("DET")

head(dets[, ])
#>   detdesc       devicetimeutc time timeoffseth timecorrections
#> 1     DET 2022-05-04 15:28:58   NA          NA              NA
#> 2     DET 2022-05-04 15:39:32   NA          NA              NA
#> 3     DET 2022-05-04 15:49:37   NA          NA              NA
#> 4     DET 2022-05-04 16:09:11   NA          NA              NA
#> 5     DET 2022-05-04 16:18:23   NA          NA              NA
#> 6     DET 2022-05-04 16:27:44   NA          NA              NA
#>      model serialnumber channel detectiontype         fullid    id
#> 1 VR2AR-69       546323       1           PPM A69-1601-60787 60787
#> 2 VR2AR-69       546323       1           PPM A69-1601-60787 60787
#> 3 VR2AR-69       546323       1           PPM A69-1601-60787 60787
#> 4 VR2AR-69       546323       1           PPM A69-1601-60787 60787
#> 5 VR2AR-69       546323       1           PPM A69-1601-60787 60787
#> 6 VR2AR-69       546323       1           PPM A69-1601-60787 60787
#>   rawdata transmitterserial signalstrengthdb noisedb gaindb
#> 1      NA                NA               NA      NA     NA
#> 2      NA                NA               NA      NA     NA
#> 3      NA                NA               NA      NA     NA
#> 4      NA                NA               NA      NA     NA
#> 5      NA                NA               NA      NA     NA
#> 6      NA                NA               NA      NA     NA
#>   qualityscore stationname latitude longitude gpshdop
#> 1           NA          NA       NA        NA      NA
#> 2           NA          NA       NA        NA      NA
#> 3           NA          NA       NA        NA      NA
#> 4           NA          NA       NA        NA      NA
#> 5           NA          NA       NA        NA      NA
#> 6           NA          NA       NA        NA      NA
```

These data were gleaned from a particular receiver programmed in a particular way -- note that many of the data fields are empty! Depending on the receiver, how you programmed it, and the arguments you passed to `rvdat` (time correction, anyone?), your split CSV could look rather different.