Download Status and Intensity Records — npn_download_status_data

This function performs a parameterized search of all status records in the USA-NPN database and returns the matching records in a data table. Data fetched from NPN services is returned as raw JSON and then converted into a data table. Optionally, results can be directed to an output file, in which case the raw JSON is converted to CSV and streamed to the file. Streaming makes it easier to handle a search that would otherwise return more data than fits in memory at once.

npn_download_status_data(
  request_source,
  years,
  coords = NULL,
  species_ids = NULL,
  genus_ids = NULL,
  family_ids = NULL,
  order_ids = NULL,
  class_ids = NULL,
  station_ids = NULL,
  species_types = NULL,
  network_ids = NULL,
  states = NULL,
  phenophase_ids = NULL,
  functional_types = NULL,
  additional_fields = NULL,
  climate_data = FALSE,
  ip_address = NULL,
  dataset_ids = NULL,
  email = NULL,
  download_path = NULL,
  six_leaf_layer = FALSE,
  six_bloom_layer = FALSE,
  agdd_layer = NULL,
  six_sub_model = NULL,
  additional_layers = NULL,
  pheno_class_ids = NULL,
  wkt = NULL
)

Arguments

request_source: Required field, character. Self-identify who is making requests to the data service.
years: Required field, character vector. Specify the years to include in the search, e.g. c('2013','2014'). You must specify at least one year.
coords: Numeric vector, used to specify a bounding box as a search parameter, e.g. c(lower_left_lat, lower_left_long, upper_right_lat, upper_right_long).
species_ids: Integer vector of unique IDs for searching based on species, e.g. c(3, 34, 35).
genus_ids: Integer vector of unique IDs for searching based on taxonomic genus, e.g. c(3, 34, 35). This parameter will take precedence if species_ids is also set.
family_ids: Integer vector of unique IDs for searching based on taxonomic family, e.g. c(3, 34, 35). This parameter will take precedence if species_ids is also set.
order_ids: Integer vector of unique IDs for searching based on taxonomic order, e.g. c(3, 34, 35). This parameter will take precedence if species_ids or family_ids are also set.
class_ids: Integer vector of unique IDs for searching based on taxonomic class, e.g. c(3, 34, 35). This parameter will take precedence if species_ids, family_ids or order_ids are also set.
station_ids: Integer vector of unique IDs for searching based on site location, e.g. c(5, 9).
species_types: Character vector of unique species type names for searching based on species types, e.g. c("Deciduous", "Evergreen").
network_ids: Integer vector of unique IDs for searching based on partner group/network, e.g. c(500, 300).
states: Character vector of US postal states to be used as search params, e.g. c("AZ", "IL").
phenophase_ids: Integer vector of unique IDs for searching based on phenophase, e.g. c(323, 324).
functional_types: Character vector of unique functional type names, e.g. c("Birds").
additional_fields: Character vector of additional fields to be included in the search results, e.g. c("Station_Name", "Plant_Nickname").
climate_data: Boolean value indicating that all climate variables should be included in additional_fields.
ip_address: Optional field, string. IP Address of user requesting data. Used for generating data reports.
dataset_ids: Integer vector of unique IDs for searching based on dataset, e.g. c(17, 15) for datasets such as NEON or GRSM.
email: Optional field, string. Email of user requesting data.
download_path: Character, optional path of the file to which the results are written.
six_leaf_layer: Boolean value. When set to TRUE, the results will attempt to resolve the date of the observation to a spring index leafing value for the location at which the observation was taken.
six_bloom_layer: Boolean value. When set to TRUE, the results will attempt to resolve the date of the observation to a spring index bloom value for the location at which the observation was taken.
agdd_layer: Numeric value, accepts 32 or 50. When set, the results will attempt to resolve the date of the observation to an AGDD value for the location; the 32 or 50 represents the base value of the AGDD value returned. All AGDD values are based on a January 1st start date of the year in which the observation was taken.
six_sub_model: Affects the results of the six layers returned. Can be used to specify one of three submodels used to calculate the spring index values. Thus setting this field will change the results of six_leaf_layer and six_bloom_layer. Valid values include: 'lilac', 'zabelli' and 'arnoldred'. For more information see the NPN's Spring Index Maps documentation: https://www.usanpn.org/data/maps/spring.
additional_layers: Data frame with first column named name and containing the names of the layer for which to retrieve data and the second column named param and containing string representations of the time/elevation subset parameter to use. This variable can be used to append additional geospatial layer data fields to the results, such that the date of observation in each row will resolve to a value from the specified layers, given the location of the observation.
pheno_class_ids: Integer vector of unique IDs for searching based on pheno class. Note that if both pheno_class_ids and phenophase_ids are provided in the same request, phenophase_ids will be ignored.
wkt: WKT geometry by which to filter data. Specifying a valid WKT within the contiguous US will filter data based on the locations which fall within that WKT.
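
The spatial and layer arguments above take plain R objects, which can be built before the call. The following is a minimal sketch, assuming the function is loaded from the rnpn package; the bounding box values are arbitrary, and the layer names and subset parameters in extra_layers are placeholders rather than real NPN layers.

```r
# Bounding box in the order documented for coords:
# lower-left latitude, lower-left longitude,
# upper-right latitude, upper-right longitude.
bbox <- c(31.3, -114.8, 37.0, -109.0)

# The same rectangle as a WKT polygon, as an alternative to coords.
# WKT writes each vertex as "longitude latitude" and repeats the
# first vertex to close the ring.
bbox_wkt <- "POLYGON((-114.8 31.3, -109.0 31.3, -109.0 37.0, -114.8 37.0, -114.8 31.3))"

# additional_layers expects a data frame whose first column is `name`
# and whose second column is `param`, both as strings.
extra_layers <- data.frame(
  name = c("example:layer_a", "example:layer_b"),  # placeholder layer names
  param = c("2016-05-01", "120"),                  # placeholder subset parameters
  stringsAsFactors = FALSE
)

run_query <- FALSE  # set to TRUE to contact the NPN service
if (run_query) {
  status <- rnpn::npn_download_status_data(
    request_source = "Your Name or Org Here",
    years = c("2016"),
    coords = bbox
    # or: wkt = bbox_wkt
    # additional_layers = extra_layers can be added once the
    # placeholder names are replaced with real layer names
  )
}
```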

Value

A tibble of all status records returned as per the search parameters. If download_path is specified, the file path is returned instead.
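
Because the return value is either a table or a file path, downstream code may want to treat both cases uniformly. The sketch below uses a hypothetical helper, as_status_table, which is not part of the package; it assumes that a single character string is a path to the CSV written via download_path. Reading the file back loads it fully into memory, so this only suits results small enough to fit.

```r
# Hypothetical helper: return a table whether the download produced
# an in-memory table or a path to a CSV file on disk.
as_status_table <- function(result) {
  if (is.character(result) && length(result) == 1) {
    # A single string is treated as the path of the CSV output file.
    utils::read.csv(result, stringsAsFactors = FALSE)
  } else {
    # Anything else (for example a tibble) is passed through unchanged.
    result
  }
}
```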

Details

Most search parameters are optional. However, users are encouraged to supply additional search parameters to get results that are easier to work with. request_source must be provided. This is a self-identifying string, telling the service who is asking for the data or from where the request is being made. It is recommended you provide your name or organization name. If the call to this function is acting as an intermediary for a client, then you may also optionally provide a user email and/or IP address for usage data reporting later.

The additional_fields parameter lets you specify more, non-critical fields to include in the search results. A complete list of additional fields can be found in the NPN service's companion documentation. Metadata on all fields can be found in the following Excel sheet: https://www.usanpn.org/files/metadata/status_intensity_datafield_descriptions.xlsx
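
As an illustration of the points above, the arguments for a request made on behalf of a client can be collected in a list and passed in one call. This is a sketch only: it assumes the function is loaded from the rnpn package, the email address is a placeholder, and the field names are the ones used as examples in the Arguments section.

```r
# Arguments for a request that self-identifies, passes along a client
# email for usage reporting, and asks for extra fields.
query_args <- list(
  request_source = "Your Name or Org Here",
  years = c("2016"),
  species_ids = c(210),
  additional_fields = c("Station_Name", "Plant_Nickname"),
  climate_data = TRUE,           # also include all climate variables
  email = "client@example.org"   # placeholder address
)

run_query <- FALSE  # set to TRUE to contact the NPN service
if (run_query) {
  status <- do.call(rnpn::npn_download_status_data, query_args)
}
```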

Examples

if (FALSE) { # \dontrun{
# Download all saguaro data for 2016 and write it to a CSV file
npn_download_status_data(
  request_source = "Your Name or Org Here",
  years = c(2016),
  species_ids = c(210),
  download_path = "saguaro_data_2016.csv"
)
} # }