--- title: "Extracting environmental data with `rvdat`" date: 2024-01-10 output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Extracting environmental data with `rvdat`} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- We often want to extract the environmental data that some of our receivers log. You can do this by going into Innovasea's VUE or Fathom Connect programs and clicking a bunch of buttons, or you can do it programmatically and reproducibly by combining `matos` and [`rvdat`](https://rvdat.obrien.page). ```r library(matos) #> By continuing, you are agreeing to the ACT Network MATOS User Agreement and Data Policy, Version 1.2: #> #> #> #> Attaching package: 'matos' #> The following objects are masked from 'package:rvdat': #> #> skip_example_on_ci, skip_example_on_runiverse library(rvdat) ``` ## Download the data If you're an ACT member, log into your MATOS account. If you're not, we'll skip to the next section and assume you already have your files ready. ```r matos_login() #> ! Please log in. #> ✔ Login successful! ``` Next, tell `rvdat` the location of the `vdat.exe` executable (check out the [`rvdat` documentation](https://rvdat.obrien.page) for more information on this). ```r vdat_here("vdat.exe") ``` Using `matos`, I'm going to find the VDAT files associated with the [Mid-Bay Chesapeake Backbone array](https://matos.asascience.com/project/detail/161). Note that things like temperature and tilt are logged within the VRL/VDAT files, and so will be under the "detections" file type. ```r detection_files <- list_project_files( project = "UMCES Chesapeake Backbone, Mid-Bay", file_type = "detections" ) #> Warning in (function (what, read_access, quiet, warn_multimatch) : #> MATOS has exactly matched multiple OTN project names. head(detection_files) #> project file_type upload_date #> 3 161 Tag Detections - .vfl file 2023-10-12 #> 5 161 Tag Detections - .vfl file 2023-09-13 #> 6 161 Tag Detections - .vfl file 2023-09-13 #> 7 161 Tag Detections - .vfl file 2023-09-13 #> 8 161 Tag Detections - .vfl file 2023-09-13 #> 9 161 Tag Detections - .vfl file 2023-09-13 #> file_name #> 3 VR2AR_546323_20231012_1.vrl #> 5 VR2AR_547715_20230912_1.vrl #> 6 VR2AR_547714_20230912_1.vrl #> 7 VR2AR_546476_20230912_1.vrl #> 8 VR2AR_546470_20230912_1.vrl #> 9 VR2AR_546462_20230912_1.vrl #> url #> 3 https://matos.asascience.com/projectfile/download/11087 #> 5 https://matos.asascience.com/projectfile/download/8003 #> 6 https://matos.asascience.com/projectfile/download/8002 #> 7 https://matos.asascience.com/projectfile/download/8001 #> 8 https://matos.asascience.com/projectfile/download/8000 #> 9 https://matos.asascience.com/projectfile/download/7999 ``` I'll download the first file into a temporary directory. ```r get_project_file( url = detection_files$url[1], out_dir = tempdir() ) #> #> ── Downloading files ───────────────────────────────────────────────── #> ✔ File(s) saved to: #> C:\Users\darpa2\AppData\Local\Temp\RtmpuIFd1a\VR2AR_546323_20231012_1.vrl #> [1] "C:\\Users\\darpa2\\AppData\\Local\\Temp\\RtmpuIFd1a\\VR2AR_546323_20231012_1.vrl" vrl_file <- list.files(tempdir(), pattern = "vrl$", full.names = T) ``` ## Convert VRL using `rvdat` Now, I'll use `rvdat::vdat_to_folder` to convert that file into a folder of CSV files, each one representing one data type. ```r vdat_to_folder( vrl_file, outdir = tempdir() ) #> ✔ File converted: #> C:\Users\darpa2\AppData\Local\Temp\RtmpuIFd1a/VR2AR_546323_20231012_1.vrl #> ℹ Files saved in: #> C:\Users\darpa2\AppData\Local\Temp\RtmpuIFd1a/VR2AR_546323_20231012_1.csv-fathom-split ``` There are quite a few files in there, but for this we're going to focus on the temperature records stored in `TEMP.csv`. ```r list.files(tempdir(), pattern = "fathom-split", full.names = T) |> list.files() #> [1] "ATTITUDE.csv" "BATTERY.csv" #> [3] "CFG_CHANNEL.csv" "CFG_STUDY.csv" #> [5] "CFG_TRANSMITTER.csv" "CLOCK_REF.csv" #> [7] "DATA_SOURCE_FILE.csv" "DEPTH.csv" #> [9] "DET.csv" "DIAG.csv" #> [11] "EVENT.csv" "EVENT_INIT.csv" #> [13] "EVENT_OFFLOAD.csv" "HEALTH_VR2AR.csv" #> [15] "TEMP.csv" bwt_file <- list.files(tempdir(), pattern = "TEMP", full.names = T, recursive = T) ``` ## Read it into R Let's read in the data. ```r read.csv(bwt_file) #> Error in read.table(file = file, header = header, sep = sep, quote = quote, : more columns than column names ``` An error, oh no! Let's see what's causing it. ```r read.csv(bwt_file, header = FALSE, nrows = 5 ) #> V1 V2 #> 1 VEMCO DATA LOG 2.0.0 #> 2 TEMP_DESC Device Time (UTC) #> 3 TEMP 2022-05-04 16:00:00 #> 4 TEMP 2022-05-04 17:00:00 #> 5 TEMP 2022-05-04 18:00:00 #> V3 V4 #> 1 vdat-9.13.2-20240607-7a2e9d-release #> 2 Time Time Offset (h) #> 3 #> 4 #> 5 #> V5 V6 V7 V8 #> 1 #> 2 Time Correction (s) Model Serial Number Ambient (deg C) #> 3 VR2AR-69 546323 13.558 #> 4 VR2AR-69 546323 12.869 #> 5 VR2AR-69 546323 12.827 #> V9 #> 1 #> 2 Internal (deg C) #> 3 #> 4 #> 5 ``` Ah, the data doesn't really start until the second row. Skip the first one and take a look. ```r bwt <- read.csv( bwt_file, skip = 1 ) head(bwt) #> TEMP_DESC Device.Time..UTC. Time Time.Offset..h. #> 1 TEMP 2022-05-04 16:00:00 NA NA #> 2 TEMP 2022-05-04 17:00:00 NA NA #> 3 TEMP 2022-05-04 18:00:00 NA NA #> 4 TEMP 2022-05-04 19:00:00 NA NA #> 5 TEMP 2022-05-04 20:00:00 NA NA #> 6 TEMP 2022-05-04 21:00:00 NA NA #> Time.Correction..s. Model Serial.Number Ambient..deg.C. #> 1 NA VR2AR-69 546323 13.558 #> 2 NA VR2AR-69 546323 12.869 #> 3 NA VR2AR-69 546323 12.827 #> 4 NA VR2AR-69 546323 12.806 #> 5 NA VR2AR-69 546323 12.806 #> 6 NA VR2AR-69 546323 12.827 #> Internal..deg.C. #> 1 NA #> 2 NA #> 3 NA #> 4 NA #> 5 NA #> 6 NA ``` Great! The data are in. I'm going to clean up the names a little bit and convert the time column from a character string to POSIX time. ```r names(bwt) <- gsub("[_\\.]", "", tolower(names(bwt))) names(bwt) #> [1] "tempdesc" "devicetimeutc" "time" #> [4] "timeoffseth" "timecorrections" "model" #> [7] "serialnumber" "ambientdegc" "internaldegc" bwt$devicetimeutc <- as.POSIXct(bwt$devicetimeutc, tz = "UTC", format = "%F %T" ) head(bwt) #> tempdesc devicetimeutc time timeoffseth timecorrections #> 1 TEMP 2022-05-04 16:00:00 NA NA NA #> 2 TEMP 2022-05-04 17:00:00 NA NA NA #> 3 TEMP 2022-05-04 18:00:00 NA NA NA #> 4 TEMP 2022-05-04 19:00:00 NA NA NA #> 5 TEMP 2022-05-04 20:00:00 NA NA NA #> 6 TEMP 2022-05-04 21:00:00 NA NA NA #> model serialnumber ambientdegc internaldegc #> 1 VR2AR-69 546323 13.558 NA #> 2 VR2AR-69 546323 12.869 NA #> 3 VR2AR-69 546323 12.827 NA #> 4 VR2AR-69 546323 12.806 NA #> 5 VR2AR-69 546323 12.806 NA #> 6 VR2AR-69 546323 12.827 NA ``` ## See what we've got Now that we have our time series, let's see what it looks like! ```r plot(ambientdegc ~ devicetimeutc, data = bwt, type = "l" ) ``` ![](figures/rvdat-bwt-1.png) ## Other variables We can pull out other variables in a similar manner. Take, for example, receiver tilt, located in `ATTITUDE.csv`. ```r env_import <- function(file) { hold <- list.files( tempdir(), pattern = file, full.names = T, recursive = T ) |> read.csv(skip = 1) names(hold) <- gsub("[_\\.]", "", tolower(names(hold))) hold$devicetimeutc <- as.POSIXct(hold$devicetimeutc, tz = "UTC", format = "%F %T" ) hold } tilt <- env_import("ATTITUDE") plot(tiltdeg ~ devicetimeutc, data = tilt, type = "l") ``` ![](figures/rvdat-tilt-1.png) Depth, in `DEPTH.csv`: ```r depth <- env_import("DEPTH") plot(depthm ~ devicetimeutc, data = depth) ``` ![](figures/rvdat-depth-1.png) Battery, in `BATTERY.csv`: ```r battery <- env_import("BATTERY") plot(batteryvoltagev ~ devicetimeutc, data = battery) ``` ![](figures/rvdat-battery-1.png) Noise, which is actually in the diagnostic file `DIAG.csv`: ```r diagnostics <- env_import("DIAG") plot(noisemeanmv ~ devicetimeutc, data = diagnostics) ``` ![](figures/rvdat-noise-1.png) Hourly summaries of pings and detections are also in the diagnositc file: ```r plot(ppmdetections ~ devicetimeutc, data = diagnostics) ``` ![](figures/rvdat-detections-1.png) ```r plot(ppmpings ~ devicetimeutc, data = diagnostics) ``` ![](figures/rvdat-detections-2.png) But are really located in `DET.csv`. ```r dets <- env_import("DET") head(dets[, ]) #> detdesc devicetimeutc time timeoffseth timecorrections #> 1 DET 2022-05-04 15:28:58 NA NA NA #> 2 DET 2022-05-04 15:39:32 NA NA NA #> 3 DET 2022-05-04 15:49:37 NA NA NA #> 4 DET 2022-05-04 16:09:11 NA NA NA #> 5 DET 2022-05-04 16:18:23 NA NA NA #> 6 DET 2022-05-04 16:27:44 NA NA NA #> model serialnumber channel detectiontype fullid id #> 1 VR2AR-69 546323 1 PPM A69-1601-60787 60787 #> 2 VR2AR-69 546323 1 PPM A69-1601-60787 60787 #> 3 VR2AR-69 546323 1 PPM A69-1601-60787 60787 #> 4 VR2AR-69 546323 1 PPM A69-1601-60787 60787 #> 5 VR2AR-69 546323 1 PPM A69-1601-60787 60787 #> 6 VR2AR-69 546323 1 PPM A69-1601-60787 60787 #> rawdata transmitterserial signalstrengthdb noisedb gaindb #> 1 NA NA NA NA NA #> 2 NA NA NA NA NA #> 3 NA NA NA NA NA #> 4 NA NA NA NA NA #> 5 NA NA NA NA NA #> 6 NA NA NA NA NA #> qualityscore stationname latitude longitude gpshdop #> 1 NA NA NA NA NA #> 2 NA NA NA NA NA #> 3 NA NA NA NA NA #> 4 NA NA NA NA NA #> 5 NA NA NA NA NA #> 6 NA NA NA NA NA ``` These data were gleaned from a particular receiver programmed in a particular way -- note that many of the data fields are empty! Depending on the receiver, how you programmed it, and the arguments you passed to `rvdat` (time correction, anyone?), your split CSV could look rather different.