This function reads all TerraClimate Parquet files under a given prefix in an S3 bucket and combines them into a single data frame. It also recodes values in the `var` column to standardize variable names.

Usage

TerraClimParquetRead(
  sub = "terraclim",
  bucket = "tnc-dangermond",
  prefix = "water_balance/v2/"
)

Arguments

sub

A character string specifying the subfolder under `prefix` in which to look for Parquet files. Default is "terraclim".

bucket

A character string giving the S3 bucket name. Default is "tnc-dangermond".

prefix

A character string giving the prefix (path) within the S3 bucket where the Parquet files are stored. Default is "water_balance/v2/".

Value

A data frame containing the combined data from all Parquet files in the specified subfolder, with an additional `file` column identifying the source file of each row. In the `var` column, "q" is recoded as "run" and "soil" as "str".

Details

The `TerraClimParquetRead` function performs the following steps:

  • Lists all Parquet files in the specified subfolder (`sub`) under `prefix` in the S3 bucket `bucket`.

  • Reads each Parquet file into a data frame and stores them in a list.

  • Combines all data frames in the list into a single data frame with an additional column `file` indicating the file from which each row was read.

  • Recodes the `var` column to standardize variable names, changing "q" to "run" and "soil" to "str".
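
The combine and recode steps above can be sketched in base R as follows. This is a minimal illustration, not the function's actual implementation: in-memory data frames stand in for the Parquet files read from S3, and the file names, the `value` column, and the base R approach are all assumptions made for the example.

```r
# Stand-ins for data frames read from two Parquet files
# (hypothetical file names and values, for illustration only).
frames <- list(
  "q_2020.parquet" = data.frame(
    var = c("q", "q"), value = c(1.2, 3.4), stringsAsFactors = FALSE
  ),
  "soil_2020.parquet" = data.frame(
    var = c("soil", "soil"), value = c(50, 48), stringsAsFactors = FALSE
  )
)

# Combine into one data frame, adding a `file` column that records
# the source of each row.
combined <- do.call(rbind, lapply(names(frames), function(nm) {
  cbind(file = nm, frames[[nm]], stringsAsFactors = FALSE)
}))
rownames(combined) <- NULL

# Recode `var`: "q" becomes "run" and "soil" becomes "str";
# any other value is left unchanged.
recode <- c(q = "run", soil = "str")
hit <- combined$var %in% names(recode)
combined$var[hit] <- unname(recode[combined$var[hit]])
```

Using a named lookup vector keeps the mapping in one place, so values of `var` without an entry pass through untouched.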