Read and Combine Parquet Files into a Single Data Frame
Source: R/CABCMParquetRead.R
CABCMParquetRead.Rd
This function reads all Parquet files from a specified S3 bucket and combines them into a single data frame.
Arguments
- sub
A character string specifying the subfolder within the base folder to look for Parquet files. Default is "cabcm".
- bucket
A string representing the S3 bucket name.
- prefix
A string representing the prefix (path) within the S3 bucket where the Parquet files are stored.
Value
A data frame containing the combined data from all Parquet files in the specified subfolder. The combined data frame includes an additional column, `file`, that identifies the source file of each row.
Details
The `CABCMParquetRead` function performs the following steps:
1. Lists all Parquet files in the specified subfolder within the base folder `"/Users/wamclean/Desktop/Lynker/tnc_hf/water_balance/"`.
2. Reads each Parquet file into its own data frame and stores the data frames in a list.
3. Combines all data frames in the list into a single data frame, adding a `file` column that records the file each row was read from.
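
The list, read, and combine steps above can be sketched roughly as follows. This is a minimal illustration and not the package's actual source: the helper name `read_parquet_dir` and the demo file names are hypothetical, and the sketch assumes the `arrow` and `dplyr` packages are installed. It reads from a local directory only; reading from an S3 bucket would need an S3-aware file system (for example, one created with `arrow::s3_bucket()`), which is not shown here.

```r
library(arrow)
library(dplyr)

# Hypothetical helper: read every Parquet file in `dir` and stack the results.
read_parquet_dir <- function(dir) {
  # Step 1: list the Parquet files in the directory
  files <- list.files(dir, pattern = "\\.parquet$", full.names = TRUE)

  # Step 2: read each file into its own data frame, kept in a list
  dfs <- lapply(files, arrow::read_parquet)

  # Name each list element after its file so bind_rows() can record it
  names(dfs) <- basename(files)

  # Step 3: combine the list; `.id = "file"` adds the source file name per row
  dplyr::bind_rows(dfs, .id = "file")
}

# Demo with two small files written to a temporary directory
tmp <- file.path(tempdir(), "cabcm_demo")
dir.create(tmp, showWarnings = FALSE)
arrow::write_parquet(data.frame(x = 1:2), file.path(tmp, "a.parquet"))
arrow::write_parquet(data.frame(x = 3:4), file.path(tmp, "b.parquet"))

combined <- read_parquet_dir(tmp)
```

Naming the list elements before calling `bind_rows()` is one convenient way to produce the `file` column. Here the column holds the base file name rather than the full path, which is an assumption about the intended output.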