Read and Combine Terra Climate Parquet Files into a Single Data Frame
Source: R/TerraClimParquetRead.R
TerraClimParquetRead.Rd
This function reads all Terra Climate Parquet files from a specified S3 bucket and combines them into a single data frame. It also standardizes variable names in the `var` column, recoding "q" to "run" and "soil" to "str".
Usage
TerraClimParquetRead(
  sub = "terraclim",
  bucket = "tnc-dangermond",
  prefix = "water_balance/v2/"
)
Arguments
- sub
A character string specifying the subfolder within the base folder to look for Parquet files. Default is "terraclim".
- bucket
A character string giving the S3 bucket name. Default is "tnc-dangermond".
- prefix
A character string giving the prefix (path) within the S3 bucket where the Parquet files are stored. Default is "water_balance/v2/".
Value
A data frame containing the combined data from all Parquet files in the specified subfolder. In the `var` column, "q" is recoded to "run" and "soil" is recoded to "str".
Details
The `TerraClimParquetRead` function performs the following steps:
1. Lists all Parquet files in the specified subfolder within the base folder `"/data/water_balance/"`.
2. Reads each Parquet file into a data frame and stores the data frames in a list.
3. Combines all data frames in the list into a single data frame with an additional column `file` indicating the file from which each row was read.
4. Transforms the `var` column to standardize variable names, changing "q" to "run" and "soil" to "str".
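
The combine and recode steps above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: the helper `combine_and_recode`, the file names, and the `value` column are hypothetical, and small in-memory data frames stand in for the data frames read from the Parquet files.

```r
# Hypothetical stand-ins for data frames already read from two Parquet files.
# The list names play the role of the source file names.
dfs <- list(
  "terraclim/a.parquet" = data.frame(var = c("q", "ppt"), value = c(1.2, 3.4),
                                     stringsAsFactors = FALSE),
  "terraclim/b.parquet" = data.frame(var = c("soil", "q"), value = c(5.6, 7.8),
                                     stringsAsFactors = FALSE)
)

# Hypothetical helper: combine a named list of data frames and recode `var`.
combine_and_recode <- function(dfs) {
  # Add a `file` column to each data frame, taken from the list names
  tagged <- Map(
    function(df, nm) data.frame(file = nm, df, stringsAsFactors = FALSE),
    dfs, names(dfs)
  )

  # Stack all data frames into one and drop the generated row names
  combined <- do.call(rbind, tagged)
  rownames(combined) <- NULL

  # Standardize variable names: "q" becomes "run", "soil" becomes "str"
  combined$var[combined$var == "q"] <- "run"
  combined$var[combined$var == "soil"] <- "str"

  combined
}

out <- combine_and_recode(dfs)
print(out)
```

Values of `var` other than "q" and "soil" (here, "ppt") pass through unchanged, which matches the recoding described in step 4.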