Read and Combine Terra Climate Parquet Files into a Single Data Frame — TerraClimParquetRead • WaterBalanceSummary

This function reads all Terra Climate Parquet files from a specified S3 bucket and combines them into a single data frame. It also recodes the `var` column to standardize variable names: "q" becomes "run" and "soil" becomes "str".

Usage

TerraClimParquetRead(
  sub = "terraclim",
  bucket = "tnc-dangermond",
  prefix = "water_balance/v2/"
)

Arguments

sub: A character string specifying the subfolder under `prefix` in which to look for Parquet files. Default is "terraclim".
bucket: A character string giving the S3 bucket name. Default is "tnc-dangermond".
prefix: A character string giving the prefix (path) within the S3 bucket where the Parquet files are stored. Default is "water_balance/v2/".
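
As a rough illustration of how the three arguments might fit together, the sketch below assembles an S3 location from them. The helper `build_s3_uri` is hypothetical and not part of the package, and the assumption that `sub` is simply appended to `prefix` is inferred from the argument descriptions, not confirmed by the function's source.

```r
# Hypothetical helper (not part of WaterBalanceSummary): build the S3 location
# that the three arguments appear to describe. Appending `sub` to `prefix`
# is an assumption.
build_s3_uri <- function(sub = "terraclim",
                         bucket = "tnc-dangermond",
                         prefix = "water_balance/v2/") {
  # Strip trailing slashes so file.path() inserts exactly one separator.
  # base::sub is called explicitly because the argument is also named `sub`.
  prefix <- base::sub("/+$", "", prefix)
  paste0("s3://", file.path(bucket, prefix, sub), "/")
}

uri <- build_s3_uri()
print(uri)
```

A URI of this shape is the kind of location that S3-aware Parquet readers such as `arrow::open_dataset()` accept; whether `TerraClimParquetRead` uses that reader is not stated here.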

Value

A data frame containing the combined rows from all Parquet files in the specified subfolder, with an additional `file` column identifying the source file of each row. In the `var` column, the value "q" is replaced with "run" and "soil" is replaced with "str".

Details

The `TerraClimParquetRead` function performs the following steps:

1. Lists all Parquet files in the subfolder `sub` under `prefix` in the S3 bucket `bucket`.
2. Reads each Parquet file into a data frame and stores the data frames in a list.
3. Combines all data frames in the list into a single data frame, adding a `file` column that records the file from which each row was read.
4. Recodes the `var` column to standardize variable names, changing "q" to "run" and "soil" to "str".
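
The combine and recode steps can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the helper `combine_and_standardize`, the toy file names, and the `value` column are all hypothetical, and small in-memory data frames stand in for the Parquet files read from S3.

```r
# Hypothetical helper: combine a named list of data frames (one per file)
# and standardize the `var` column as described above.
combine_and_standardize <- function(frames) {
  # Tag every row with the name of the file it came from.
  tagged <- Map(function(df, name) {
    df$file <- rep(name, nrow(df))
    df
  }, frames, names(frames))

  # Stack all data frames into one and reset the row names.
  combined <- do.call(rbind, unname(tagged))
  rownames(combined) <- NULL

  # Recode variable names: "q" becomes "run", "soil" becomes "str".
  combined$var[combined$var == "q"] <- "run"
  combined$var[combined$var == "soil"] <- "str"
  combined
}

# Toy stand-ins for the per-file data frames.
frames <- list(
  "a.parquet" = data.frame(var = c("q", "pet"), value = c(1.5, 2.0),
                           stringsAsFactors = FALSE),
  "b.parquet" = data.frame(var = c("soil", "q"), value = c(3.0, 4.5),
                           stringsAsFactors = FALSE)
)

result <- combine_and_standardize(frames)
print(result)
```

Values of `var` other than "q" and "soil" (here the made-up "pet") pass through unchanged.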