
This function reads all Parquet files from a specified S3 bucket and combines them into a single data frame.

Usage

CABCMParquetRead(
  sub = "cabcm",
  bucket = "tnc-dangermond",
  prefix = "water_balance/v2/"
)

Arguments

sub

A character string specifying the subfolder within the base folder to look for Parquet files. Default is "cabcm".

bucket

A character string giving the S3 bucket name. Default is "tnc-dangermond".

prefix

A character string giving the prefix (path) within the S3 bucket where the Parquet files are stored. Default is "water_balance/v2/".

Value

A data frame containing the combined data from all Parquet files in the specified subfolder. The data frame has an additional column named `file` that records the source file of each row.
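As an illustration of how the `file` column might be used, the sketch below builds a small stand-in for the returned data frame and then counts and filters rows by source file. The column names `id` and `value` and the file names are hypothetical; the real columns depend on the contents of the Parquet files.

```r
# Hypothetical stand-in for the data frame returned by CABCMParquetRead().
# Only the `file` column is documented; `id` and `value` are made up.
combined <- data.frame(
  id    = 1:5,
  value = c(0.5, 1.5, 2.5, 3.5, 4.5),
  file  = c("a.parquet", "a.parquet", "b.parquet", "b.parquet", "b.parquet")
)

# Count how many rows came from each source file.
rows_per_file <- table(combined$file)

# Keep only the rows that were read from one particular file.
from_a <- combined[combined$file == "a.parquet", ]
```

Because every row carries its source, the combined data frame can be split back into per-file pieces or summarized by file at any point.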

Details

The `CABCMParquetRead` function performs the following steps:

  • Lists all Parquet files in the specified subfolder within the base folder `"/Users/wamclean/Desktop/Lynker/tnc_hf/water_balance/"`.

  • Reads each Parquet file into a data frame and stores the resulting data frames in a list.

  • Combines all data frames in the list into a single data frame with an additional column `file` indicating the file from which each row was read.
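The three steps above can be sketched roughly as follows. This is a minimal illustration and not the package's actual source: the helper name `read_parquet_folder` and its `base_dir` argument are hypothetical, it reads from a local folder, and it assumes the `arrow` package is installed and that all files share the same columns. The demo writes two small files to a temporary directory so the sketch runs without any real data.

```r
library(arrow)  # provides read_parquet() and write_parquet()

# Hypothetical helper illustrating the list / read / combine steps.
read_parquet_folder <- function(base_dir, sub = "cabcm") {
  folder <- file.path(base_dir, sub)

  # Step 1: list the Parquet files in the subfolder.
  files <- list.files(folder, pattern = "\\.parquet$", full.names = TRUE)
  if (length(files) == 0) {
    stop("No Parquet files found in ", folder)
  }

  # Step 2: read each file into a data frame, tagging rows with the file name.
  frames <- lapply(files, function(f) {
    df <- as.data.frame(read_parquet(f))
    df$file <- basename(f)
    df
  })

  # Step 3: stack the data frames (assumes identical columns in every file).
  do.call(rbind, frames)
}

# Demo on throwaway data in a temporary directory.
base_dir <- tempfile("water_balance_")
dir.create(file.path(base_dir, "cabcm"), recursive = TRUE)
write_parquet(data.frame(id = 1:2, value = c(0.5, 1.5)),
              file.path(base_dir, "cabcm", "a.parquet"))
write_parquet(data.frame(id = 3:4, value = c(2.5, 3.5)),
              file.path(base_dir, "cabcm", "b.parquet"))

combined <- read_parquet_folder(base_dir)
```

For files stored in S3, `arrow::s3_bucket()` returns a filesystem object whose `ls()` method could take the place of `list.files()`; that variant is not shown here because it needs bucket access.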