Package 'ibis.iSDM'

Title: Modelling framework for integrated biodiversity distribution scenarios
Description: An integrated framework for modelling the distribution of species and ecosystems in a suitability framing. This package allows the estimation of integrated species distribution models (iSDM) based on several sources of evidence and provided presence-only and presence-absence datasets. It makes heavy use of point-process models for estimating habitat suitability and allows the inclusion of spatial latent effects and priors in the estimation. To do so, 'ibis.iSDM' supports a number of engines for Bayesian and non-parametric machine-learning estimation. Further, 'ibis.iSDM' is specifically customized to support spatio-temporal projections of habitat suitability into the future.
Authors: Martin Jung [aut, cre, cph], Maximilian H.K. Hesselbarth [ctb]
Maintainer: Martin Jung <[email protected]>
License: CC BY 4.0
Version: 0.1.5
Built: 2025-02-06 05:46:08 UTC
Source: https://github.com/iiasa/ibis.iSDM

Help Index


Add biodiversity point dataset to a distribution object (presence-absence)

Description

This function adds a presence-absence biodiversity dataset to a distribution object. As opposed to presence-only data, presence-absence biodiversity records usually originate from structured biodiversity surveys where the absence of a species in a given region was specifically assessed.

At the analyst's choice, it is also possible to convert presence-only biodiversity data into a presence-absence form by adding pseudo-absences through add_pseudoabsence. See the help file for more information.

Usage

add_biodiversity_poipa(
  x,
  poipa,
  name = NULL,
  field_occurrence = "observed",
  formula = NULL,
  family = "binomial",
  link = NULL,
  weight = 1,
  separate_intercept = TRUE,
  docheck = TRUE,
  ...
)

## S4 method for signature 'BiodiversityDistribution,sf'
add_biodiversity_poipa(
  x,
  poipa,
  name = NULL,
  field_occurrence = "observed",
  formula = NULL,
  family = "binomial",
  link = NULL,
  weight = 1,
  separate_intercept = TRUE,
  docheck = TRUE,
  ...
)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

poipa

A data.frame or sf object of presence-absence point occurrences.

name

The name of the biodiversity dataset used as internal identifier.

field_occurrence

A numeric or character location of biodiversity point records indicating presence/absence. By default set to "observed" and an error will be thrown if a numeric column with that name does not exist.

formula

A character or formula object to be passed. Default (NULL) is to use all covariates.

family

A character stating the family to be used (Default: 'binomial').

link

A character to overwrite the default link function (Default: NULL).

weight

A numeric value acting as a multiplier with regard to any weights used in the modelling. Larger weights indicate a higher weighting relative to any other datasets. By default set to 1 if only one dataset is added. A vector is also supported but must be of the same length as parameter "poipa".

separate_intercept

A logical value stating whether a separate intercept is to be added in shared likelihood models for engines engine_inla, engine_inlabru and engine_stan.

docheck

logical on whether additional checks should be performed (e.g. intersection tests) (Default: TRUE).

...

Other parameters passed down.

Details

By default, the logit link function is used in a logistic regression setting unless the specific engine does not support generalised linear regressions (e.g. engine_bart).

Value

Adds biodiversity data to distribution object.

References

  • Renner, I. W., J. Elith, A. Baddeley, W. Fithian, T. Hastie, S. J. Phillips, G. Popovic, and D. I. Warton. 2015. Point process models for presence-only analysis. Methods in Ecology and Evolution 6:366–379.

  • Guisan A. and Zimmermann N. 2000. Predictive habitat distribution models in ecology. Ecol. Model. 135: 147–186.

See Also

Other add_biodiversity: add_biodiversity_poipo(), add_biodiversity_polpa(), add_biodiversity_polpo()

Examples

## Not run: 
# Define model
x <- distribution(background) |> add_biodiversity_poipa(virtual_species)

## End(Not run)

Add biodiversity point dataset to a distribution object (presence-only)

Description

This function adds a presence-only biodiversity dataset to a distribution object.

Usage

add_biodiversity_poipo(
  x,
  poipo,
  name = NULL,
  field_occurrence = "observed",
  formula = NULL,
  family = "poisson",
  link = NULL,
  weight = 1,
  separate_intercept = TRUE,
  docheck = TRUE,
  pseudoabsence_settings = NULL,
  ...
)

## S4 method for signature 'BiodiversityDistribution,sf'
add_biodiversity_poipo(
  x,
  poipo,
  name = NULL,
  field_occurrence = "observed",
  formula = NULL,
  family = "poisson",
  link = NULL,
  weight = 1,
  separate_intercept = TRUE,
  docheck = TRUE,
  pseudoabsence_settings = NULL,
  ...
)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

poipo

A data.frame or sf object of presence-only point occurrences.

name

The name of the biodiversity dataset used as internal identifier.

field_occurrence

A numeric or character location of biodiversity point records.

formula

A character or formula object to be passed. Default is to use all covariates (if specified).

family

A character stating the family to be used (Default: 'poisson').

link

A character to overwrite the default link function (Default: NULL).

weight

A numeric value acting as a multiplier with regard to any weights used in the modelling. Larger weights indicate a higher weighting relative to any other datasets. By default set to 1 if only one dataset is added. A vector is also supported but must be of the same length as "poipo". Note: Weights are reformatted to the inverse for models with area offsets (e.g. 5 is converted to 1/5).

separate_intercept

A logical value stating whether a separate intercept is to be added in shared likelihood models for engines engine_inla, engine_inlabru and engine_stan. Otherwise ignored.

docheck

logical on whether additional checks should be performed (e.g. intersection tests) (Default: TRUE).

pseudoabsence_settings

Either NULL or a pseudoabs_settings() created settings object.

...

Other parameters passed down to the object. Normally not used unless described in details.

Details

This function allows adding presence-only biodiversity records to a distribution object. Presence-only data are usually modelled through an inferential model (see Guisan and Zimmermann, 2000) that relates their occurrence to environmental covariates and a selected sample of 'background' points. The most common approach for estimation, and the one supported for this type of dataset, are Poisson process models (PPM), in which presence-only points are fitted through a down-weighted Poisson regression. See Renner et al. (2015) for an overview.

Value

Adds biodiversity data to distribution object.

References

  • Guisan A. and Zimmermann N. 2000. Predictive habitat distribution models in ecology. Ecol. Model. 135: 147–186.

  • Renner, I. W., J. Elith, A. Baddeley, W. Fithian, T. Hastie, S. J. Phillips, G. Popovic, and D. I. Warton. 2015. Point process models for presence-only analysis. Methods in Ecology and Evolution 6:366–379.

See Also

Other add_biodiversity: add_biodiversity_poipa(), add_biodiversity_polpa(), add_biodiversity_polpo()

Examples

# Load background
background <- terra::rast(system.file('extdata/europegrid_50km.tif',
package='ibis.iSDM',mustWork = TRUE))
# Load virtual species
virtual_points <- sf::st_read(system.file('extdata/input_data.gpkg',
package='ibis.iSDM',mustWork = TRUE),'points',quiet = TRUE)
# Define model
x <- distribution(background) |>
add_biodiversity_poipo(virtual_points, field_occurrence = "Observed")

Add biodiversity polygon dataset to a distribution object (presence-absence)

Description

This function can be used to add a sf polygon dataset to an existing distribution object. Presence-absence polygon data assumes that each area within the polygon can be treated as 'presence' for the species, while each area outside the polygon is where the species is absent.

Usage

add_biodiversity_polpa(
  x,
  polpa,
  name = NULL,
  field_occurrence = "observed",
  formula = NULL,
  family = "binomial",
  link = NULL,
  weight = 1,
  simulate = FALSE,
  simulate_points = 100,
  simulate_bias = NULL,
  simulate_strategy = "random",
  separate_intercept = TRUE,
  docheck = TRUE,
  pseudoabsence_settings = NULL,
  ...
)

## S4 method for signature 'BiodiversityDistribution,sf'
add_biodiversity_polpa(
  x,
  polpa,
  name = NULL,
  field_occurrence = "observed",
  formula = NULL,
  family = "binomial",
  link = NULL,
  weight = 1,
  simulate = FALSE,
  simulate_points = 100,
  simulate_bias = NULL,
  simulate_strategy = "random",
  separate_intercept = TRUE,
  docheck = TRUE,
  pseudoabsence_settings = NULL,
  ...
)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

polpa

A sf polygon object of presence-absence occurrences.

name

The name of the biodiversity dataset used as internal identifier.

field_occurrence

A numeric or character location of biodiversity point records.

formula

A character or formula object to be passed. Default (NULL) is to use all covariates.

family

A character stating the family to be used (Default: binomial).

link

A character to overwrite the default link function (Default: NULL).

weight

A numeric value acting as a multiplier with regard to any weights used in the modelling. Larger weights indicate a higher weighting relative to any other datasets. By default set to 1 if only one dataset is added. A vector is also supported but must be of the same length as "polpa".

simulate

Simulate poipa points within the polygon's boundaries. Results are passed to add_biodiversity_poipa (Default: FALSE).

simulate_points

A numeric number of points to be created by simulation.

simulate_bias

A SpatRaster layer describing an eventual preference for simulation (Default: NULL).

simulate_strategy

A character stating the strategy for sampling. Can be set to either 'random' or 'regular', the latter requiring a raster supplied in the 'simulate_weights' parameter.

separate_intercept

A logical value stating whether a separate intercept is to be added in shared likelihood models for engines engine_inla, engine_inlabru and engine_stan.

docheck

logical on whether additional checks should be performed (e.g. intersection tests) (Default: TRUE).

pseudoabsence_settings

Either NULL or a pseudoabs_settings() created settings object.

...

Other parameters passed down.

Details

The default approach for polygon data is to sample presence-absence points across the region of the polygons. This function thus acts as a wrapper around add_biodiversity_poipa(), as the points are created by the model. Note that if the polygon is used directly in the modelling, the link between covariates and polygonal data is established by regular sampling of points within the polygon, which is thus equivalent to simulating the points directly.

For an integration of range data as predictor or offset, see add_predictor_range() and add_offset_range() instead.

Value

Adds biodiversity data to distribution object.

See Also

Other add_biodiversity: add_biodiversity_poipa(), add_biodiversity_poipo(), add_biodiversity_polpo()

Examples

## Not run: 
 x <- distribution(background) |>
   add_biodiversity_polpa(protectedArea)

## End(Not run)

Add biodiversity polygon dataset to a distribution object (presence-only)

Description

This function can be used to add a sf polygon dataset to an existing distribution object. Presence-only polygon data are treated differently than point data in some engines, in particular through the way that points are generated.

Usage

add_biodiversity_polpo(
  x,
  polpo,
  name = NULL,
  field_occurrence = "observed",
  formula = NULL,
  family = "poisson",
  link = NULL,
  weight = 1,
  simulate = FALSE,
  simulate_points = 100,
  simulate_bias = NULL,
  simulate_strategy = "random",
  separate_intercept = TRUE,
  docheck = TRUE,
  pseudoabsence_settings = NULL,
  ...
)

## S4 method for signature 'BiodiversityDistribution,sf'
add_biodiversity_polpo(
  x,
  polpo,
  name = NULL,
  field_occurrence = "observed",
  formula = NULL,
  family = "poisson",
  link = NULL,
  weight = 1,
  simulate = FALSE,
  simulate_points = 100,
  simulate_bias = NULL,
  simulate_strategy = "random",
  separate_intercept = TRUE,
  docheck = TRUE,
  pseudoabsence_settings = NULL,
  ...
)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

polpo

A sf polygon object of presence-only occurrences.

name

The name of the biodiversity dataset used as internal identifier.

field_occurrence

A numeric or character location of biodiversity point records.

formula

A character or formula object to be passed. Default is to use all covariates (if specified).

family

A character stating the family to be used (Default: poisson).

link

A character to overwrite the default link function (Default: NULL).

weight

A numeric value acting as a multiplier with regard to any weights used in the modelling. Larger weights indicate a higher weighting relative to any other datasets. By default set to 1 if only one dataset is added. A vector is also supported but must be of the same length as "polpo".

simulate

Simulate poipo points within the polygon's boundaries. Results are passed to add_biodiversity_poipo (Default: FALSE).

simulate_points

A numeric number of points to be created by simulation (Default: 100).

simulate_bias

A SpatRaster layer describing an eventual preference for simulation (Default: NULL).

simulate_strategy

A character stating the strategy for sampling. Can be set to either 'random' or 'regular', the latter requiring a raster supplied in the 'simulate_weights' parameter.

separate_intercept

A logical value stating whether a separate intercept is to be added in shared likelihood models for engines engine_inla, engine_inlabru and engine_stan.

docheck

logical on whether additional checks should be performed (e.g. intersection tests) (Default: TRUE).

pseudoabsence_settings

Either NULL or a pseudoabs_settings() created settings object.

...

Other parameters passed down.

Details

The default approach for polygon data is to sample presence-only points across the region of the polygons. This function thus acts as a wrapper around add_biodiversity_poipo(), as presence-only points are created by the model. If no points are simulated directly (Default), the polygon is processed by train() by creating regular point data over the supplied predictors.

Use add_biodiversity_polpa() to create binomially distributed inside-outside points for the given polygon!

For an integration of range data as predictor or offset, see add_predictor_range() and add_offset_range() instead.

Value

Adds biodiversity data to distribution object.

See Also

Other add_biodiversity: add_biodiversity_poipa(), add_biodiversity_poipo(), add_biodiversity_polpa()

Examples

## Not run: 
 x <- distribution(mod) |>
   add_biodiversity_polpo(protectedArea)

## End(Not run)

Add a constraint to an existing scenario

Description

This function adds a constraint to a BiodiversityScenario object to constrain (future) projections. Such constraints can for instance be limits on a possible dispersal distance, connectivity between identified patches, or limitations on species adaptability.

Most constraints require pre-calculated thresholds to be present in the BiodiversityScenario object!

Usage

add_constraint(mod, method, ...)

## S4 method for signature 'BiodiversityScenario'
add_constraint(mod, method, ...)

Arguments

mod

A BiodiversityScenario object with specified predictors.

method

A character indicating the type of constraints to be added to the scenario. See details for more information.

...

passed on parameters. See also the specific methods for adding constraints.

Details

Constraints can be added to scenario objects to increase or decrease the suitability of a given area for the target feature. This function acts as a wrapper to add these constraints. Currently supported are the following options:

Dispersal:

  • sdd_fixed - Applies a fixed uniform dispersal distance per modelling timestep.

  • sdd_nexpkernel - Applies a dispersal distance using a negative exponential kernel from its origin.

  • kissmig - Applies the kissmig stochastic dispersal model. Requires `kissmig` package. Applied at each modelling time step.

  • migclim - Applies the dispersal algorithm MigClim to the modelled objects. Requires "MigClim" package.

A comprehensive overview of the benefits of including dispersal constraints in species distribution models can be found in Bateman et al. (2013).

Connectivity:

  • hardbarrier - Defines a hard barrier to any dispersal events. By definition this sets all values larger than 0 in the barrier layer to 0 in the projection. Barrier has to be provided through the "resistance" parameter.

  • resistance - Allows the provision of a static or dynamic layer that is multiplied with the projection at each time step. Can for example be used to reduce the suitability of any given area (using pressures not included in the model). The respective layer(s) have to be provided through the "resistance" parameter. Provided layers are incorporated as abs(resistance - 1) and multiplied with the prediction.

Adaptability:

  • nichelimit - Specifies a limit on the environmental niche to only allow a modest amount of extrapolation beyond the known occurrences. This can be particularly useful to limit the influence of increasing marginal responses and avoid biologically unrealistic projections.

Boundary and size:

  • boundary - Applies a hard boundary constraint on the projection, thus disallowing an expansion of a range outside the provided layer. Similar to specifying projection limits (see distribution), but can be used to specifically constrain a projection within a certain area (e.g. a species range or an island).

  • minsize - Allows specifying a certain size that must be satisfied in order for a thresholded patch to be considered occupied. Can be thought of as a minimum size requirement. See add_constraint_minsize() for the required parameters.

  • threshold - Applies the set threshold as a constraint directly on the suitability projections. Requires a threshold to be set.

Value

Adds constraints data to a BiodiversityScenario object.

References

  • Bateman, B. L., Murphy, H. T., Reside, A. E., Mokany, K., & VanDerWal, J. (2013). Appropriateness of full‐, partial‐and no‐dispersal scenarios in climate change impact modelling. Diversity and Distributions, 19(10), 1224-1234.

  • Nobis MP and Normand S (2014) KISSMig - a simple model for R to account for limited migration in analyses of species distributions. Ecography 37: 1282-1287.

  • Mendes, P., Velazco, S. J. E., de Andrade, A. F. A., & Júnior, P. D. M. (2020). Dealing with overprediction in species distribution models: How adding distance constraints can improve model accuracy. Ecological Modelling, 431, 109180.

See Also

Other constraint: add_constraint_MigClim(), add_constraint_adaptability(), add_constraint_boundary(), add_constraint_connectivity(), add_constraint_dispersal(), add_constraint_minsize(), add_constraint_threshold(), simulate_population_steps()

Examples

## Not run: 
# Assumes that a trained 'model' object exists
 mod <- scenario(model) |>
  add_predictors(env = predictors, transform = 'scale', derivates = "none") |>
  add_constraint_dispersal(method = "kissmig", value = 2, pext = 0.1) |>
  project()

## End(Not run)

Adds an adaptability constraint to a scenario object

Description

Adaptability constraints assume that suitable habitat for species in (future) projections might be unsuitable if it is outside the range of conditions currently observed for the species.

Currently only nichelimit is implemented, which adds a simple constraint on the predictor parameter space, which can be defined through the "value" parameter. For example, by setting it to 1 (Default), any projections are constrained to be within at most 1 standard deviation of the range of covariates used for model training.

Usage

add_constraint_adaptability(
  mod,
  method = "nichelimit",
  names = NULL,
  value = 1,
  increment = 0,
  ...
)

## S4 method for signature 'BiodiversityScenario'
add_constraint_adaptability(
  mod,
  method = "nichelimit",
  names = NULL,
  value = 1,
  increment = 0,
  ...
)

Arguments

mod

A BiodiversityScenario object with specified predictors.

method

A character indicating the type of constraints to be added to the scenario. See details for more information.

names

A character vector with names of the predictors for which an adaptability threshold should be set (Default: NULL for all).

value

A numeric value in units of standard deviation (Default: 1).

increment

A numeric constant that is added to value at every time step (Default: 0). Allows incremental widening of the niche space, thus opening constraints.

...

passed on parameters. See also the specific methods for adding constraints.

See Also

Other constraint: add_constraint(), add_constraint_MigClim(), add_constraint_boundary(), add_constraint_connectivity(), add_constraint_dispersal(), add_constraint_minsize(), add_constraint_threshold(), simulate_population_steps()

Examples

## Not run: 
scenario(fit) |>
 add_constraint_adaptability(value = 1)

## End(Not run)

Adds a boundary constraint to a scenario object

Description

The purpose of boundary constraints is to limit a future projection within a specified area (such as for example a range or ecoregion). This can help to limit unreasonable projections into geographic space.

Similar to boundary constraints, it is also possible to define a "zone" for the scenario projections, as was done for model training. The difference to a boundary constraint is that the boundary constraint is applied posthoc as a hard cut on any projection, while zones allow any projection (and other constraints) to be applied within the zone. Note: Setting a boundary constraint for future projections effectively excludes potentially suitable areas outside the boundary!

Usage

add_constraint_boundary(mod, layer, ...)

## S4 method for signature 'BiodiversityScenario,sf'
add_constraint_boundary(mod, layer, method = "boundary", ...)

## S4 method for signature 'BiodiversityScenario,ANY'
add_constraint_boundary(mod, layer, method = "boundary", ...)

Arguments

mod

A BiodiversityScenario object with specified predictors.

layer

A SpatRaster or sf object with the same extent as the model background. Has to be binary and is used for a posthoc masking of projected grid cells.

...

passed on parameters. See also the specific methods for adding constraints.

method

A character indicating the type of constraints to be added to the scenario. See details for more information.

See Also

Other constraint: add_constraint(), add_constraint_MigClim(), add_constraint_adaptability(), add_constraint_connectivity(), add_constraint_dispersal(), add_constraint_minsize(), add_constraint_threshold(), simulate_population_steps()

Examples

## Not run: 
# Add scenario constraint
scenario(fit) |> add_constraint_boundary(range)

## End(Not run)

Adds a connectivity constraint to a scenario object.

Description

Adds a connectivity constraint to a scenario object.

Usage

add_constraint_connectivity(mod, method, value = NULL, resistance = NULL, ...)

## S4 method for signature 'BiodiversityScenario'
add_constraint_connectivity(mod, method, value = NULL, resistance = NULL, ...)

Arguments

mod

A BiodiversityScenario object with specified predictors.

method

A character indicating the type of constraints to be added to the scenario. See details for more information.

value

For many dispersal constraints this is set as a numeric value specifying a fixed constraint or constant in units of "m" (Default: NULL). For kissmig the value needs to give the number of iteration steps (or within-year migration steps). For adaptability constraints this parameter specifies the extent (in units of standard deviation) to which extrapolations should be performed.

resistance

A SpatRaster object describing a resistance surface or barrier for use in connectivity constrains (Default: NULL).

...

passed on parameters. See also the specific methods for adding constraints.

Details

  • hardbarrier - Defines a hard barrier to any dispersal events. By definition this sets all values larger than 0 in the barrier layer to 0 in the projection. Barrier has to be provided through the "resistance" parameter.

  • resistance - Allows the provision of a static or dynamic layer that is multiplied with the projection at each time step. Can for example be used to reduce the suitability of any given area (using pressures not included in the model). The respective layer(s) have to be provided through the "resistance" parameter. Provided layers are incorporated as abs(resistance - 1) and multiplied with the prediction.

See Also

Other constraint: add_constraint(), add_constraint_MigClim(), add_constraint_adaptability(), add_constraint_boundary(), add_constraint_dispersal(), add_constraint_minsize(), add_constraint_threshold(), simulate_population_steps()
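
Examples

A minimal sketch of adding a connectivity constraint, mirroring the other examples in this manual. It assumes a trained model object `fit`, a set of future predictors `future_covariates` and a binary barrier SpatRaster `barrier_layer` (all hypothetical names) already exist:

```r
## Not run: 
# Constrain the projection with a hard barrier (e.g. a river network);
# cells with barrier values > 0 receive a projected suitability of 0.
sc <- scenario(fit) |>
  add_predictors(future_covariates) |>
  add_constraint_connectivity(method = "hardbarrier",
                              resistance = barrier_layer) |>
  project()

## End(Not run)
```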


Add dispersal constraint to an existing scenario

Description

Add dispersal constraint to an existing scenario

Usage

add_constraint_dispersal(mod, method, value = NULL, type = NULL, ...)

## S4 method for signature 'BiodiversityScenario'
add_constraint_dispersal(mod, method, value = NULL, type = NULL, ...)

Arguments

mod

A BiodiversityScenario object with specified predictors.

method

A character indicating the type of constraints to be added to the scenario. See details for more information.

value

For many dispersal constraints this is set as a numeric value specifying a fixed constraint or constant in units of "m" (Default: NULL). For kissmig the value needs to give the number of iteration steps (or within-year migration steps). For adaptability constraints this parameter specifies the extent (in units of standard deviation) to which extrapolations should be performed.

type

A character indicating the type used in the method. See for instance `kissmig`.

...

passed on parameters. See also the specific methods for adding constraints.

Details

Dispersal: Parameters for 'method':

  • sdd_fixed - Applies a fixed uniform dispersal distance per modelling timestep.

  • sdd_nexpkernel - Applies a dispersal distance using a negative exponential kernel from its origin. The negative exponential kernel is defined as:

    f(x) = 1 / (2 * pi * a^2) * exp(-x / a)

    where a is the mean dispersal distance (in m) divided by 2.

  • kissmig - Applies the kissmig stochastic dispersal model. Requires `kissmig` package. Applied at each modelling time step.

  • migclim - Applies the dispersal algorithm MigClim to the modelled objects. Requires "MigClim" package.

A comprehensive overview of the benefits of including dispersal constraints in species distribution models can be found in Bateman et al. (2013).

The following additional parameters can be set:

  • pext: numeric indicator for `kissmig` of the probability that a colonized cell becomes uncolonised, i.e., the species goes locally extinct (Default: 0.1).

  • pcor: numeric probability that corner cells are considered in the 3x3 neighbourhood (Default: 0.2).
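
As an illustration, the negative exponential kernel defined above can be written as a short base-R function (a sketch only; `nexp_kernel` is a hypothetical name and the package's internal implementation may differ):

```r
# Negative exponential dispersal kernel as defined above:
#   f(x) = 1 / (2 * pi * a^2) * exp(-x / a)
# with a = mean dispersal distance (in m) divided by 2.
nexp_kernel <- function(x, mean_dispersal_distance) {
  a <- mean_dispersal_distance / 2
  (1 / (2 * pi * a^2)) * exp(-x / a)
}

# Relative dispersal density at 0 m, 500 m and 2000 m from the origin,
# for a mean dispersal distance of 1000 m:
nexp_kernel(c(0, 500, 2000), mean_dispersal_distance = 1000)
```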

Note

Unless otherwise stated, the default unit of supplied distance values (e.g. average dispersal distance) should be in "m".

References

  • Bateman, B. L., Murphy, H. T., Reside, A. E., Mokany, K., & VanDerWal, J. (2013). Appropriateness of full‐, partial‐and no‐dispersal scenarios in climate change impact modelling. Diversity and Distributions, 19(10), 1224-1234.

See Also

Other constraint: add_constraint(), add_constraint_MigClim(), add_constraint_adaptability(), add_constraint_boundary(), add_constraint_connectivity(), add_constraint_minsize(), add_constraint_threshold(), simulate_population_steps()
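
Examples

A minimal sketch following the pattern of the add_constraint() example above; it assumes a trained 'model' object and a `predictors` dataset (hypothetical names) already exist:

```r
## Not run: 
# Apply a fixed short-distance dispersal of 1000 m per timestep
mod <- scenario(model) |>
  add_predictors(env = predictors, transform = 'scale', derivates = "none") |>
  add_constraint_dispersal(method = "sdd_fixed", value = 1000) |>
  project()

## End(Not run)
```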


Add constrains to the modelled distribution projection using the MigClim approach

Description

This function adds constraints as defined by the MigClim approach (Engler et al. 2013) to a BiodiversityScenario object to constrain future projections. For a detailed description of MigClim, please see the respective reference and the user guide. The default parameters chosen here are suggestions.

Usage

add_constraint_MigClim(
  mod,
  rcThresholdMode = "continuous",
  dispSteps = 1,
  dispKernel = c(1, 0.4, 0.16, 0.06, 0.03),
  barrierType = "strong",
  lddFreq = 0,
  lddRange = c(1000, 10000),
  iniMatAge = 1,
  propaguleProdProb = c(0.2, 0.6, 0.8, 0.95),
  replicateNb = 10,
  dtmp = terra::terraOptions(print = F)$tempdir
)

## S4 method for signature 'BiodiversityScenario'
add_constraint_MigClim(
  mod,
  rcThresholdMode = "continuous",
  dispSteps = 1,
  dispKernel = c(1, 0.4, 0.16, 0.06, 0.03),
  barrierType = "strong",
  lddFreq = 0,
  lddRange = c(1000, 10000),
  iniMatAge = 1,
  propaguleProdProb = c(0.2, 0.6, 0.8, 0.95),
  replicateNb = 10,
  dtmp = terra::terraOptions(print = F)$tempdir
)

Arguments

mod

A BiodiversityScenario object with specified predictors.

rcThresholdMode

A character of either binary or continuous value (Default: continuous).

dispSteps

numeric parameter for the number of dispersal steps. Dispersal steps are executed for each timestep (prediction layer) and ideally should be aligned with the number of steps of the projection. Minimum is 1 (Default) and maximum is 99.

dispKernel

A numeric vector defining the dispersal kernel to be applied. Can be set either to a uniform vector, e.g. c(1, 1, 1, 1), or to a proportional decline, c(1, 0.4, 0.16, 0.06, 0.03) (Default). Depending on the resolution of the raster, this parameter needs to be adapted.

barrierType

A character indicating whether any set barrier should be set as 'strong' or 'weak' barriers. Strong barriers prevent any dispersal across the barrier and weak barriers only do so if the whole "dispKernel" length is covered by the barrier (Default: 'strong').

lddFreq

numeric parameter indicating the frequency of long-distance dispersal (LDD) events. Default is 0, so no long-distance dispersal.

lddRange

A numeric vector giving the minimum and maximum distance of LDD events. Note: The units for these distances are in cells, thus in the projection units of the raster.

iniMatAge

Initial maturity age. Used together with propaguleProd as a proxy of population growth. Must be set to the cell age in time units which are dispersal steps (Default: 1).

propaguleProdProb

Probability of a source cell to produce propagules as a function of time since colonization. Set as probability vector that defines the probability of a cell producing propagules.

replicateNb

Number of replicates to be used for the analysis (Default: 10).

dtmp

A character to a folder where temporary files are to be created.

Details

The barrier parameter is defined through "add_barrier".

Value

Adds a MigClim constraint to a BiodiversityScenario object.

References

  • Engler R., Hordijk W. and Guisan A. The MIGCLIM R package – seamless integration of dispersal constraints into projections of species distribution models. Ecography.

  • Robin Engler, Wim Hordijk and Loic Pellissier (2013). MigClim: Implementing dispersal into species distribution models. R package version 1.6.

See Also

Other constraint: add_constraint(), add_constraint_adaptability(), add_constraint_boundary(), add_constraint_connectivity(), add_constraint_dispersal(), add_constraint_minsize(), add_constraint_threshold(), simulate_population_steps()

Examples

## Not run: 
# Assumes that a trained 'model' object exists
 mod <- scenario(model) |>
  add_predictors(env = predictors, transform = 'scale',
                 derivates = "none") |>
  add_constraint_MigClim() |>
  project()

## End(Not run)

Adds a size constraint on a scenario

Description

This function applies a minimum size constraint on a scenario() created object. The rationale here is that isolated habitat patches smaller than a given size might not be viable for a species to establish a (long-term) presence.

The idea is thus to apply a constraint such that only patches larger than a certain size are retained between timesteps. This has the potential to reduce subsequent colonizations of neighbouring patches.

Usage

add_constraint_minsize(
  mod,
  value,
  unit = "km2",
  establishment_step = FALSE,
  ...
)

## S4 method for signature 'BiodiversityScenario,numeric'
add_constraint_minsize(
  mod,
  value,
  unit = "km2",
  establishment_step = FALSE,
  ...
)

Arguments

mod

A BiodiversityScenario object with specified predictors.

value

A numeric value describing the minimum area of a given patch.

unit

A character of the unit of area. Options available are km2 (Default), ha and pixel.

establishment_step

A logical flag indicating whether a given patch is only to be removed if it was not small in a previous time step (not yet implemented!).

...

passed on parameters. See also the specific methods for adding constraints.

Details

Area values in a specific unit need to be supplied.

Note

This function requires that a scenario has a set threshold()!

See Also

Other constraint: add_constraint(), add_constraint_MigClim(), add_constraint_adaptability(), add_constraint_boundary(), add_constraint_connectivity(), add_constraint_dispersal(), add_constraint_threshold(), simulate_population_steps()

Examples

## Not run: 
scenario(fit) |>
 add_predictors(future_covariates) |>
 threshold() |>
 add_constraint_minsize(value = 1000, unit = "km2") |>
 project()

## End(Not run)

Adds a threshold constraint to a scenario object

Description

This option adds a threshold() constraint to a scenario projection, effectively applying the threshold as a mask to each projection step made during the scenario projection.

Applying this constraint thus means that the "suitability" projection is clipped to the threshold. This method requires a threshold() set for the scenario object.

In theory it would be possible to recalculate the threshold for each time step based on supplied parameters or even observation records. So far this option has not been necessary to implement.

Usage

add_constraint_threshold(mod, updatevalue = NA, ...)

## S4 method for signature 'BiodiversityScenario'
add_constraint_threshold(mod, updatevalue = NA, ...)

Arguments

mod

A BiodiversityScenario object with specified predictors.

updatevalue

A numeric indicating what the masked-out values (those outside the threshold) should become (Default: NA).

...

passed on parameters. See also the specific methods for adding constraints.

Note

Threshold values are taken from the original fitted model.

See Also

Other constraint: add_constraint(), add_constraint_MigClim(), add_constraint_adaptability(), add_constraint_boundary(), add_constraint_connectivity(), add_constraint_dispersal(), add_constraint_minsize(), simulate_population_steps()

Examples

## Not run: 
# Add scenario constraint
scenario(fit) |> threshold() |>
add_constraint_threshold()

## End(Not run)

Add a control to a BiodiversityModel object to control biases

Description

Sampling and other biases are pervasive drivers of the spatial location of biodiversity datasets. While the integration of other, presumably less biased data can be one way of controlling for sampling biases, another way is to control directly for the bias in the model. Currently supported methods are:

  • "partial" - An approach described by Warton et al. (2013) to control the biases in a model, by including a specified variable ("layer") in the model, but "partialling" it out during the projection phase. Specifically the variable is set to a specified value ("bias_value"), which is by default the minimum value observed across the background.

  • "offset" - Dummy method that points to the add_offset_bias() functionality (see note). Makes use of offsets to factor out a specified bias variable.

  • "proximity" - Use the proximity or distance between points as a weight in the model. This option effectively places greater weight on points farther away. Note: In the best case this can control for spatial bias and aggregation, in the worst case it can place a lot of emphasis on points that likely outliers or misidentification (in terms of species).

See also details for some explanations.

Usage

add_control_bias(
  x,
  layer,
  method = "partial",
  bias_value = NULL,
  maxdist = NULL,
  alpha = 1,
  add = TRUE
)

## S4 method for signature 'BiodiversityDistribution'
add_control_bias(
  x,
  layer,
  method = "partial",
  bias_value = NULL,
  maxdist = NULL,
  alpha = 1,
  add = TRUE
)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

layer

A sf or SpatRaster object with the range for the target feature. Specify a variable that is not already added to "x" to avoid issues with duplications.

method

A character vector describing the method used for bias control. Available options are "partial" (Default), "offset" or "proximity".

bias_value

A numeric with a value for "layer". Specifying a numeric value here sets layer to the target value during projection. By default the value is set to the minimum value found in the layer (Default: NULL).

maxdist

A numeric giving the maximum distance if method "proximity" is used. If unset it uses by default the distance to the centroid of a minimum convex polygon encircling all points.

alpha

A numeric giving the initial weight of points if method "proximity" is used (Default: 1). For example, if set to values smaller than 1, neighbouring points will be weighted less.

add

logical specifying whether a new offset is to be added. Setting this parameter to FALSE replaces the current offsets with the new one (Default: TRUE).

Details

In the case of "proximity" weights are assigned to each point, placing higher weight on points further away and with less overlap. Weights are are assigned up to a maximum of distance which can be provided by the user (parameter "maxdist"). This distance is ideally informed by some knowledge of the species to be modelled (e.g., maximum dispersal distance). If not provided, it is set to the distance of the centroid of a minimum convex polygon encircling all observations. The parameter "alpha" is a weighting factor which can be used to diminish the effect of neighboring points.
For a given observation i, the weight w_i is defined as

w_i = 1 / (1 + ε)

where

ε = Σ_{n=1}^{N} ((1 - d_n) / d_sac)^α

in which N is the total number of points closer than the maximum distance (d_sac) to point i, and d_n is the distance between focal point i and point n.
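The weighting above can be sketched in a few lines of base R (the helper name and the normalized distances are illustrative, not part of the package API):

```r
# Proximity weight for a focal point i, following w_i = 1 / (1 + eps) with
# eps = sum(((1 - d_n) / d_sac)^alpha) over the N neighbours closer than d_sac.
# Distances are assumed to be normalized to [0, 1] here.
proximity_weight <- function(d, d_sac, alpha = 1) {
  d <- d[d < d_sac]  # keep only neighbours within the maximum distance
  eps <- sum(((1 - d) / d_sac)^alpha)
  1 / (1 + eps)
}

# A point with several close neighbours receives a lower weight...
proximity_weight(c(0.1, 0.2, 0.3), d_sac = 1)
# ...than a point whose only neighbour is far away
proximity_weight(0.9, d_sac = 1)
```

Larger values of "alpha" shrink the contribution of each neighbour and hence pull the weights back towards 1.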

Value

Adds bias control option to a distribution object.

Note

Covariate transformations applied to other predictors need to be applied to the bias layer too. Another option to consider biases, particularly in Poisson point-process models, is to remove them through an offset. Functionality to do so is available through the add_offset_bias() method. Setting the method to "offset" will automatically point to this option.

References

  • Warton, D.I., Renner, I.W. and Ramp, D., 2013. Model-based control of observer bias for the analysis of presence-only data in ecology. PloS one, 8(11), p.e79168.

  • Merow, C., Allen, J.M., Aiello-Lammens, M., Silander, J.A., 2016. Improving niche and range estimates with Maxent and point process models by integrating spatially explicit information. Glob. Ecol. Biogeogr. 25, 1022–1036. https://doi.org/10.1111/geb.12453

  • Botella, C., Joly, A., Bonnet, P., Munoz, F., & Monestiez, P. (2021). Jointly estimating spatial sampling effort and habitat suitability for multiple species from opportunistic presence‐only data. Methods in Ecology and Evolution, 12(5), 933-945.

See Also

add_limits_extrapolation()

Examples

## Not run: 
 x <- distribution(background) |>
   add_predictors(covariates) |>
   add_control_bias(biasvariable, bias_value = NULL)

## End(Not run)

Add latent spatial effect to the model equation

Description

In general, latent spatial effects are understood as the occurrence of spatial dependency in the observations, which might be caused by spatial biases, similarities in the underlying sampling processes, or unmeasured latent covariates (e.g. those that have not been quantified).

This package supports a range of different spatial effects; however, they differ from one another in their impact on the estimated prediction. Some effects simply add the spatial dependence as a covariate, while others make use of spatial random effects to account for spatial dependence in the predictions. By default these effects are added to each dataset as a covariate or shared spatial field (e.g. SPDE). See details for an explanation of the available options.

Usage

add_latent_spatial(
  x,
  method = "spde",
  priors = NULL,
  separate_spde = FALSE,
  ...
)

## S4 method for signature 'BiodiversityDistribution'
add_latent_spatial(
  x,
  method = "spde",
  priors = NULL,
  separate_spde = FALSE,
  ...
)

## S4 method for signature 'BiodiversityScenario'
add_latent_spatial(x, layer = NULL, reuse_latent = TRUE, ...)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

method

A character describing what kind of spatial effect is to be added to the model. See details.

priors

A "Prior-List" object supplied to the latent effect. Relevant only for engine_inla and NULL equates the use of default priors.

separate_spde

A logical parameter indicating whether, in the case of SPDE effects, separate effects for each likelihood are being fitted. Default (FALSE) uses a copy of the first added likelihood.

...

Other parameters passed down

layer

A SpatRaster layer describing alternative latent effects to be used instead if "reuse_latent" is set to FALSE.

reuse_latent

A logical flag on whether any latent effects found in the fitted model should be reused (Default TRUE).

Details

There are several different options, some of which depend on the engine used. In case an unsupported method is chosen for an engine, it is changed to the most similar available method.

Available are:

  • "spde" - stochastic partial differential equation (SPDE) for engine_inla and engine_inlabru. SPDE effects aim at capturing the variation of the response variable in space, once all of the covariates are accounted for. Examining the spatial distribution of the spatial error can reveal which covariates might be missing. For example, if elevation is positively correlated with the response variable, but is not included in the model, we could see a higher posterior mean in areas with higher elevation. Note that calculations of SPDE's can be computationally costly.

  • "car" - conditional autocorrelative errors (CAR) for engine_inla. Not yet implemented in full.

  • "kde" - additional covariate of the kernel density of input point observations.

  • "poly" - spatial trend correction by adding coordinates as polynominal transformation. This method assumed that a transformation of spatial coordinates can if - included as additional predictor - explain some of the variance in the distribution. This method does not interact with species occurrences.

  • "nnd" - nearest neighbour distance. This function calculates the euclidean distance from each point to the nearest other grid cell with known species occurrence. Originally proposed by Allouche et al. (2008) and can be applied across all datasets in the BiodiversityDistribution) object.

Value

Adds latent spatial effect to a distribution object.

References

  • Allouche, O.; Steinitz, O.; Rotem, D.; Rosenfeld, A.; Kadmon, R. (2008). Incorporating distance constraints into species distribution models. Journal of Applied Ecology, 45(2), 599-609. doi:10.1111/j.1365-2664.2007.01445.x

  • Mendes, P., Velazco, S. J. E., de Andrade, A. F. A., & Júnior, P. D. M. (2020). Dealing with overprediction in species distribution models: How adding distance constraints can improve model accuracy. Ecological Modelling, 431, 109180.

Examples

## Not run: 
 distribution(background) |> add_latent_spatial(method = "poly")

## End(Not run)

Add a control to a BiodiversityModel object to limit extrapolation

Description

One of the main aims of species distribution models (SDMs) is to project in space and time. For projections a common issue is extrapolation as - unconstrained - SDMs can indicate areas as suitable which are unlikely to be occupied by species or habitats (often due to historic or biotic factors). To some extent this can be related to an insufficient quantification of the niche (e.g. niche truncation by considering only a subset of observations within the actual distribution), in other cases there can also be general barriers or constraints that limit any projections (e.g. islands). This limit method adds some of those options to a model distribution object. Currently supported methods are:

* "zones" - This is a wrapper to allow the addition of zones to a distribution model object, similar to what is also possible via distribution(). Required is a spatial layer that describes a environmental zoning.

* "mcp" - Rather than using an external or additional layer, this option constraints predictions by a certain distance of points in its vicinity. Buffer distances have to be in the unit of the projection used and can be configured via "mcp_buffer".

* "nt2" - Constraints the predictions using the multivariate combination novelty index (NT2) following Mesgaran et al. (2014). This method is also available in the similarity() function.

* "mess" - Constraints the predictions using the Multivariate Environmental Similarity Surfaces (MESS) following Mesgaran et al. (2014). This method is also available in the similarity() function.

* "shape" - This is an implementation of the 'shape' method introduced by Velazco et al. (2023). Through a user defined threshold it effectively limits model extrapolation so that no projections are made beyond the extent judged as defensible and informed by the training observations. Not yet implemented!

See also details for further explanations.

Usage

add_limits_extrapolation(
  x,
  layer,
  method = "mcp",
  mcp_buffer = 0,
  novel = "within",
  limits_clip = FALSE
)

## S4 method for signature 'BiodiversityDistribution'
add_limits_extrapolation(
  x,
  layer,
  method = "mcp",
  mcp_buffer = 0,
  novel = "within",
  limits_clip = FALSE
)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

layer

A SpatRaster or sf object that limits the prediction surface when intersected with input data (Default: NULL).

method

A character vector describing the method used for controlling extrapolation. Available options are "zones", "mcp" (Default), or "nt2", "mess" or "shape".

mcp_buffer

A numeric distance to buffer the mcp (Default: 0). Only used if method "mcp" is chosen.

novel

A character indicating which conditions are to be masked out: either only the novel conditions ("within", Default) or also including outside reference conditions ("outside"). Only used for method = "nt2"; for method = "mess" this is always "within".

limits_clip

logical Should the limits clip all predictors before fitting a model (TRUE) or just the prediction (FALSE, default).

Details

For method "zones" a zoning layer can be supplied which is then used to intersect the provided training points with. Any projections made with the model can then be constrained so as to not project into areas that do not consider any training points and are unlikely to have any. Examples for zones are for the separation of islands and mainlands, biomes, or lithological soil conditions.

If no layer is available, it is also possible to constrain predictions by the distance to a minimum convex polygon surrounding the training points with method "mcp" (optionally buffered). This can make sense particularly for rare species or those fully sampled across their niche.

For the "NT2" and "MESS" index it is possible to constrain the prediction to conditions within (novel = "within") or also include outside (novel = "outside") conditions.

Value

Adds extrapolation limit option to a distribution object.

Note

The method "zones" is also possible directly within distribution().

References

  • Randin, C. F., Dirnböck, T., Dullinger, S., Zimmermann, N. E., Zappa, M., & Guisan, A. (2006). Are niche‐based species distribution models transferable in space?. Journal of biogeography, 33(10), 1689-1703. https://doi.org/10.1111/j.1365-2699.2006.01466.x

  • Chevalier, M., Broennimann, O., Cornuault, J., & Guisan, A. (2021). Data integration methods to account for spatial niche truncation effects in regional projections of species distribution. Ecological Applications, 31(7), e02427. https://doi.org/10.1002/eap.2427

  • Velazco, S. J. E., Brooke, M. R., De Marco Jr., P., Regan, H. M., & Franklin, J. (2023). How far can I extrapolate my species distribution model? Exploring Shape, a novel method. Ecography, 11, e06992. https://doi.org/10.1111/ecog.06992

  • Mesgaran, M. B., R. D. Cousens, B. L. Webber, and J. Franklin. (2014) Here be dragons: a tool for quantifying novelty due to covariate range and correlation change when projecting species distribution models. Diversity and Distributions 20:1147-1159.

Examples

## Not run: 
 # To add a zone layer for extrapolation constraints.
 x <- distribution(background) |>
   add_predictors(covariates) |>
   add_limits_extrapolation(method = "zones", layer = zones)

## End(Not run)

Adds a log file to distribution object

Description

This function allows specifying a file as log file, to which all console outputs, prints and messages are saved.

Usage

add_log(x, filename)

## S4 method for signature 'BiodiversityDistribution,character'
add_log(x, filename)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

filename

A character object. The destination must be writeable and the filename must end with 'txt'.

Value

Adds a log file to a distribution object.

Examples

## Not run: 
 x <- distribution(background) |>
    add_log("log.txt")
 x

## End(Not run)

Specify a spatial explicit offset

Description

Including offsets is another option to integrate spatial prior information in linear and additive regression models. Offsets shift the intercept of the regression fit by a certain amount. Although only one offset can be added to a regression model, it is possible to combine several spatial-explicit estimates into one offset by calculating the sum of all spatial-explicit layers.

Usage

add_offset(x, layer, add = TRUE)

## S4 method for signature 'BiodiversityDistribution,SpatRaster'
add_offset(x, layer, add = TRUE)

## S4 method for signature 'BiodiversityDistribution,sf'
add_offset(x, layer, add = TRUE)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

layer

A sf or SpatRaster object with the range for the target feature.

add

logical specifying whether new offset is to be added. Setting this parameter to FALSE replaces the current offsets with the new one (Default: TRUE).

Details

This function allows setting any specific offset for a regression model. The offset has to be provided as a spatial SpatRaster object. This function simply adds the layer to a distribution() object. Note that any transformation of the offset (such as log) has to be done externally!

If the layer is a range and requires additional formatting, consider using the function add_offset_range(), which has additional functionalities such as distance transformations.

Value

Adds an offset to a distribution object.

Note

Since offsets only make sense for linear regressions (and not, for instance, regression tree based methods such as engine_bart()), they do not work for all engines. Offsets specified for non-supported engines are ignored during the estimation.

References

  • Merow, C., Allen, J.M., Aiello-Lammens, M., Silander, J.A., 2016. Improving niche and range estimates with Maxent and point process models by integrating spatially explicit information. Glob. Ecol. Biogeogr. 25, 1022–1036. https://doi.org/10.1111/geb.12453

See Also

Other offset: add_offset_bias(), add_offset_elevation(), add_offset_range(), rm_offset()

Examples

## Not run: 
 x <- distribution(background) |>
   add_predictors(covariates) |>
   add_offset(nicheEstimate)

## End(Not run)

Specify a spatial explicit offset as bias

Description

Including offsets is another option to integrate spatial prior information in linear and additive regression models. Offsets shift the intercept of the regression fit by a certain amount. Although only one offset can be added to a regression model, it is possible to combine several spatial-explicit estimates into one offset by calculating the sum of all spatial-explicit layers.

Usage

add_offset_bias(x, layer, add = TRUE, points = NULL)

## S4 method for signature 'BiodiversityDistribution,SpatRaster'
add_offset_bias(x, layer, add = TRUE, points = NULL)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

layer

A sf or SpatRaster object with the range for the target feature.

add

logical specifying whether new offset is to be added. Setting this parameter to FALSE replaces the current offsets with the new one (Default: TRUE).

points

An optional sf object with key points. The locations of the points are then used to calculate the probability that a cell has been sampled, while accounting for area differences (Default: NULL).

Details

This function emulates the use of the add_offset() function, however it applies an inverse transformation to remove the provided layer from the overall offset. So if, for instance, an offset is already specified (such as area), this function removes the provided bias layer from it via "offset(log(off.area) - log(bias.layer))".

Note that any transformation of the offset (such as log) has to be done externally!

If a generic offset is added, consider using the add_offset() function. If the layer is an expert-based range and requires additional parametrization, consider using the function add_offset_range() or the bossMaps R-package.
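A minimal sketch of the inverse transformation described above, using terra (the layer names are hypothetical stand-ins for an area offset and a bias layer):

```r
library(terra)

# Hypothetical area offset and bias layer on the same grid
off_area   <- rast(nrows = 10, ncols = 10, vals = runif(100, 1, 2))
bias_layer <- rast(off_area, vals = runif(100, 0.1, 1))

# Following "offset(log(off.area) - log(bias.layer))", the bias is
# subtracted from the area offset on the log scale
combined_offset <- log(off_area) - log(bias_layer)
```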

Value

Adds a bias offset to a distribution object.

References

  • Merow, C., Allen, J.M., Aiello-Lammens, M., Silander, J.A., 2016. Improving niche and range estimates with Maxent and point process models by integrating spatially explicit information. Glob. Ecol. Biogeogr. 25, 1022–1036. https://doi.org/10.1111/geb.12453

See Also

Other offset: add_offset(), add_offset_elevation(), add_offset_range(), rm_offset()

Examples

## Not run: 
 x <- distribution(background) |>
   add_predictors(covariates) |>
   add_offset_bias(samplingBias)

## End(Not run)

Specify elevational preferences as offset

Description

This function implements the elevation preferences offset defined in Ellis‐Soto et al. (2021). The code here was adapted from the Supporting materials script.

Usage

add_offset_elevation(x, elev, pref, rate = 0.0089, add = TRUE)

## S4 method for signature 'BiodiversityDistribution,SpatRaster,numeric'
add_offset_elevation(x, elev, pref, rate = 0.0089, add = TRUE)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

elev

A SpatRaster with the elevation for a given background.

pref

A numeric vector of length 2 giving the lower and upper bound of known elevational preferences. Can be set to Inf if unknown.

rate

A numeric for the rate used in the offset (Default: 0.0089). This parameter specifies the decay to near-zero probability at elevations above and below the expert limits.

add

logical specifying whether new offset is to be added. Setting this parameter to FALSE replaces the current offsets with the new one (Default: TRUE).

Details

Specifically, this function calculates a continuous decay in the probability of a species to occur beyond its elevational limits. It requires a SpatRaster with elevation information. A generalized logistic transform (aka Richards curve) is used to calculate the decay from the suitable elevational areas, with the "rate" parameter allowing the steepness of the decline to be varied.

Note that all offsets created by this function are by default log-transformed before export. In addition this function also mean-centers the output as recommended by Ellis-Soto et al.
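As a rough illustration of the decay described above (the exact parametrization used in the package may differ; the helper below is a simplified sketch using the default rate):

```r
# Simplified logistic decay of suitability beyond expert elevation limits
elev_decay <- function(elev, pref = c(400, 1200), rate = 0.0089) {
  # Distance outside the preferred elevational band (0 inside it)
  d <- pmax(pref[1] - elev, 0) + pmax(elev - pref[2], 0)
  # Equals 1 within the limits and decays towards 0 outside
  2 / (1 + exp(rate * d))
}

elev <- seq(0, 3000, by = 100)
plot(elev, elev_decay(elev), type = "l",
     xlab = "Elevation (m)", ylab = "Relative suitability")
```

Larger "rate" values produce a steeper drop-off at the elevational limits.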

Value

Adds an elevational offset to a distribution object.

References

  • Ellis‐Soto, D., Merow, C., Amatulli, G., Parra, J.L., Jetz, W., 2021. Continental‐scale 1 km hummingbird diversity derived from fusing point records with lateral and elevational expert information. Ecography (Cop.). 44, 640–652. https://doi.org/10.1111/ecog.05119

  • Merow, C., Allen, J.M., Aiello-Lammens, M., Silander, J.A., 2016. Improving niche and range estimates with Maxent and point process models by integrating spatially explicit information. Glob. Ecol. Biogeogr. 25, 1022–1036. https://doi.org/10.1111/geb.12453

See Also

Other offset: add_offset(), add_offset_bias(), add_offset_range(), rm_offset()

Examples

## Not run: 
 # Adds the offset to a distribution object
 distribution(background) |> add_offset_elevation(dem, pref = c(400, 1200))

## End(Not run)

Specify an expert-based species range as offset

Description

This function has additional options compared to the more generic add_offset(), allowing customized options specifically for expert-based ranges as offsets or spatialized polygon information on species occurrences. If even more control is needed, the user is referred to the "bossMaps" package (Merow et al. 2017). Some functionalities of that package are emulated through setting "distance_function" to "logcurve". This tries to fit a five-parameter logistic function to estimate the distance from the range (Merow et al. 2017).

Usage

add_offset_range(
  x,
  layer,
  distance_max = Inf,
  family = "poisson",
  presence_prop = 0.9,
  distance_clip = FALSE,
  distance_function = "negexp",
  field_occurrence = "observed",
  fraction = NULL,
  point = FALSE,
  add = TRUE
)

## S4 method for signature 'BiodiversityDistribution,SpatRaster'
add_offset_range(x, layer, fraction = NULL, add = TRUE)

## S4 method for signature 'BiodiversityDistribution,sf'
add_offset_range(
  x,
  layer,
  distance_max = Inf,
  family = "poisson",
  presence_prop = 0.9,
  distance_clip = FALSE,
  distance_function = "negexp",
  field_occurrence = "observed",
  fraction = NULL,
  point = FALSE,
  add = TRUE
)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

layer

A sf or SpatRaster object with the range for the target feature.

distance_max

A numeric threshold on the maximum distance beyond the range that should be considered to have a high likelihood of containing species occurrences (Default: Inf "m"). Can be set to NULL or 0 to indicate that no distance should be calculated.

family

A character denoting the type of model to which this offset is to be added. By default it assumes a 'poisson' distributed model, and as a result the output created by this function will be log-transformed. If however a 'binomial' distribution is chosen, then the output will be `logit` transformed. For integrated models leave at default.

presence_prop

numeric giving the proportion of all records expected to be inside the range. By default this is set to 0.9 indicating that 10% of all records are likely outside the range.

distance_clip

logical as to whether distance should be clipped after the maximum distance (Default: FALSE).

distance_function

A character specifying the distance function to be used. Available are linear ("linear"), negative exponential kernels ("negexp", default) and a five-parameter logistic curve ("logcurve") as proposed by Merow et al. (2017).

field_occurrence

A numeric or character location of biodiversity point records.

fraction

An optional SpatRaster object that is multiplied with the digitized raster layer. Can be used, for example, to remove or reduce the expected value (Default: NULL).

point

An optional sf layer with points or logical argument. In the case of the latter the point data is ignored (Default: FALSE).

add

logical specifying whether new offset is to be added. Setting this parameter to FALSE replaces the current offsets with the new one (Default: TRUE).

Details

This function creates a SpatRaster to be added to a provided distribution object. Offsets in regression models are likelihood-specific as they are added directly to the overall estimate of `y^hat`.

Note that all offsets created by this function are by default log-transformed before export. Background values (e.g. beyond "distance_max") are set to a very small constant (1e-10).
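The default negative exponential distance decay, including the clipping and small background constant mentioned above, can be sketched as follows (the helper is illustrative; the package computes this internally):

```r
# Negative exponential decay of the expected presence with distance d
# from the range boundary, clipped at distance_max and floored at 1e-10
range_decay <- function(d, distance_max = Inf, rate = 1) {
  out <- exp(-rate * d)
  out[d > distance_max] <- 0
  pmax(out, 1e-10)  # background constant applied before the log-transform
}

# Log-transformed offset values as used for a 'poisson' model
log(range_decay(c(0, 1, 5, 20), distance_max = 10))
```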

Value

Adds a range offset to a distribution object.

References

  • Merow, C., Wilson, A.M., Jetz, W., 2017. Integrating occurrence data and expert maps for improved species range predictions. Glob. Ecol. Biogeogr. 26, 243–258. https://doi.org/10.1111/geb.12539

  • Merow, C., Allen, J.M., Aiello-Lammens, M., Silander, J.A., 2016. Improving niche and range estimates with Maxent and point process models by integrating spatially explicit information. Glob. Ecol. Biogeogr. 25, 1022–1036. https://doi.org/10.1111/geb.12453

See Also

"bossMaps"

Other offset: add_offset(), add_offset_bias(), add_offset_elevation(), rm_offset()

Examples

## Not run: 
 # Train a presence-only model with a simple offset
 fit <- distribution(background) |>
   add_biodiversity_poipo(virtual_points, field_occurrence = "Observed") |>
   add_predictors(predictors) |>
   add_offset_range(virtual_range, distance_max = 5,
                    distance_function = "logcurve", distance_clip = TRUE) |>
   engine_glm() |>
   train()

## End(Not run)

Create lower and upper limits for an elevational range and add them as separate predictors

Description

Create lower and upper limits for an elevational range and add them as separate predictors

Usage

add_predictor_elevationpref(x, layer, lower, upper, transform = "none")

## S4 method for signature 'BiodiversityDistribution,ANY,numeric,numeric'
add_predictor_elevationpref(x, layer, lower, upper, transform = "none")

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

layer

A character naming the elevation layer in the distribution object, or a SpatRaster object.

lower

A numeric value for the lower elevational preference of a species.

upper

A numeric value for the upper elevational preference of a species.

transform

character Any optional transformation to be applied. Usually not needed (Default: "none").

Examples

## Not run: 
distribution(background) |>
  add_predictor_elevationpref(elevation, lower = 200, upper = 1000)

## End(Not run)

Add a range of a species as predictor to a distribution object

Description

This function allows adding a species range, which is usually drawn by experts in a separate process, as a spatially explicit prior. Both sf and SpatRaster objects are supported as input.

Users are advised to look at the "bossMaps" R-package presented as part of Merow et al. (2017), which allows flexible calculation of non-linear distance transforms from the boundary of the range. Outputs of this package can be added directly to this function. Note that this function adds the range as a predictor and not as an offset. For that purpose a separate function add_offset_range() exists.

Additional options allow including the range either as a "binary" or as a "distance" transformed predictor. The difference is that the range is either directly included as a presence-only predictor or, alternatively, with a linear distance transform from the range boundary. The parameter "distance_max" can be specified to constrain this distance transform.

Usage

add_predictor_range(
  x,
  layer,
  method = "distance",
  distance_max = NULL,
  fraction = NULL,
  priors = NULL
)

## S4 method for signature 'BiodiversityDistribution,SpatRaster'
add_predictor_range(
  x,
  layer,
  method = "precomputed_range",
  fraction = NULL,
  priors = NULL
)

## S4 method for signature 'BiodiversityDistribution,sf'
add_predictor_range(
  x,
  layer,
  method = "distance",
  distance_max = Inf,
  fraction = NULL,
  priors = NULL
)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

layer

A sf or SpatRaster object with the range for the target feature.

method

character describing how the range should be included ("binary" | "distance").

distance_max

Numeric threshold on the maximum distance (Default: NULL).

fraction

An optional SpatRaster object that is multiplied with the digitized raster layer. Can be used, for example, to remove or reduce the expected value (Default: NULL).

priors

A PriorList object. Default is set to NULL which uses default prior assumptions.

References

  • Merow, C., Wilson, A. M., & Jetz, W. (2017). Integrating occurrence data and expert maps for improved species range predictions. Global Ecology and Biogeography, 26(2), 243–258. https://doi.org/10.1111/geb.12539

Examples

## Not run: 
distribution(background) |>
  add_predictor_range(range, method = "distance", distance_max = 2)

## End(Not run)
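
The range can alternatively be included directly as a binary presence predictor; a hypothetical sketch, assuming the same background and range objects as above:

## Not run: 
# Include the expert range directly as a binary (presence-only) predictor
distribution(background) |>
  add_predictor_range(range, method = "binary")

## End(Not run)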

Add predictors to a Biodiversity distribution object

Description

This function allows adding predictors to distribution or BiodiversityScenario objects. Predictors are covariates whose spatial projection has to match the geographic projection of the background layer in the distribution object. This function furthermore allows transforming the provided predictors or creating derivates of them.

Usage

add_predictors(
  x,
  env,
  names = NULL,
  transform = "none",
  derivates = "none",
  derivate_knots = 4,
  int_variables = NULL,
  bgmask = TRUE,
  harmonize_na = FALSE,
  explode_factors = FALSE,
  priors = NULL,
  state = NULL,
  ...
)

## S4 method for signature 'BiodiversityDistribution,SpatRasterCollection'
add_predictors(
  x,
  env,
  names = NULL,
  transform = "none",
  derivates = "none",
  derivate_knots = 4,
  int_variables = NULL,
  bgmask = TRUE,
  harmonize_na = FALSE,
  explode_factors = FALSE,
  priors = NULL,
  state = NULL,
  ...
)

## S4 method for signature 'BiodiversityDistribution,SpatRaster'
add_predictors(
  x,
  env,
  names = NULL,
  transform = "none",
  derivates = "none",
  derivate_knots = 4,
  int_variables = NULL,
  bgmask = TRUE,
  harmonize_na = FALSE,
  explode_factors = FALSE,
  priors = NULL,
  state = NULL,
  ...
)

## S4 method for signature 'BiodiversityDistribution,stars'
add_predictors(
  x,
  env,
  names = NULL,
  transform = "none",
  derivates = "none",
  derivate_knots = 4,
  int_variables = NULL,
  bgmask = TRUE,
  harmonize_na = FALSE,
  explode_factors = FALSE,
  priors = NULL,
  state = NULL,
  ...
)

## S4 method for signature 'BiodiversityScenario,SpatRaster'
add_predictors(
  x,
  env,
  names = NULL,
  transform = "none",
  derivates = "none",
  derivate_knots = 4,
  int_variables = NULL,
  bgmask = TRUE,
  harmonize_na = FALSE,
  explode_factors = FALSE,
  priors = NULL,
  state = NULL,
  ...
)

## S4 method for signature 'BiodiversityScenario,stars'
add_predictors(
  x,
  env,
  names = NULL,
  transform = "none",
  derivates = "none",
  derivate_knots = 4,
  int_variables = NULL,
  bgmask = TRUE,
  harmonize_na = FALSE,
  explode_factors = FALSE,
  priors = NULL,
  state = NULL,
  ...
)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

env

A SpatRaster or stars object.

names

A vector of character names describing the environmental stack in case they should be renamed.

transform

A vector stating whether predictors should be preprocessed in any way (Options: 'none', 'pca', 'scale', 'norm').

derivates

A character stating whether derivate features should be created (Options: 'none', 'thresh', 'hinge', 'quad').

derivate_knots

A single numeric or vector giving the number of knots for derivate creation if relevant (Default: 4).

int_variables

A vector of length 2 or greater specifying the covariates used for interactions (Default: NULL).

bgmask

Check whether the environmental data should be masked with the background layer (Default: TRUE).

harmonize_na

A logical value indicating whether NA values should be harmonized among predictors (Default: FALSE).

explode_factors

A logical flag indicating whether any factor variables should be split up into binary variables (one per class) (Default: FALSE).

priors

A PriorList object. Default is set to NULL which uses default prior assumptions.

state

A matrix with one value per variable (column), providing either a mean and standard deviation ( stats::mean(), stats::sd() ) for each variable in env for option 'scale', or a range of minimum and maximum values for option 'norm'. Effectively applies their value range for rescaling. If stars data is provided to a BiodiversityScenario object, the state variables are attempted to be compiled from the predictor ranges used for model inference (Default: NULL).

...

Other parameters passed down

Details

A transformation takes the provided rasters and, for instance, rescales them or transforms them through a principal component analysis (prcomp). In contrast, derivates leave the provided predictors untouched, but instead create new ones, for instance by transforming their values through a quadratic or hinge transformation. Note that this effectively increases the number of predictors in the object, generally requiring stronger regularization by the engine used. Transformations and derivates can also be combined. Available options for transformation are:

  • 'none' - Leaves the provided predictors in the original scale.

  • 'pca' - Converts the predictors to principal components. Note that this results in a renaming of the variables to principal component axes!

  • 'scale' - Transforms all predictors by applying scale on them.

  • 'norm' - Normalizes all predictors by transforming them to a scale from 0 to 1.

  • 'windsor' - Applies winsorization to the target predictors. By default this caps the predictors at their 0.05 and 0.95 quantiles, thus helping to remove extreme outliers.

Available options for creating derivates are:

  • 'none' - No additional predictor derivates are created.

  • 'quad' - Adds quadratic derivate predictors.

  • 'interaction' - Add interacting predictors. Interactions need to be specified ("int_variables")!

  • 'thresh' - Add threshold derivate predictors.

  • 'hinge' - Add hinge derivate predictors.

  • 'kmeans' - Add k-means derived factors.

  • 'bin' - Add predictors binned by their percentiles.

Note

Important: Not every engine supported by the ibis.iSDM R package allows missing data points among extracted covariates, so any observation with missing data is generally removed prior to model fitting. Ensure that covariates have appropriate no-data settings (for instance, setting NA values to 0 or another out-of-range constant).

Not every engine actually needs covariates. For instance, it is perfectly legitimate to fit a model with only occurrence data and a spatial latent effect (add_latent_spatial). This corresponds to a spatial kernel density estimate.

Certain names such as "offset" are forbidden as predictor variable names. The function will return an error message if these are used.

Some engines use binary variables regardless of the parameter explode_factors set here.

Examples

## Not run: 
 obj <- distribution(background) |>
        add_predictors(covariates, transform = 'scale')
 obj

## End(Not run)
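
The transformation and derivate options described above can also be combined; a hypothetical sketch, assuming the same background and covariates objects:

## Not run: 
 # Scale the predictors and additionally create hinge derivates
 # with 4 knots for each variable
 obj <- distribution(background) |>
        add_predictors(covariates,
                       transform = 'scale',
                       derivates = 'hinge',
                       derivate_knots = 4)
 obj

## End(Not run)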

Add GLOBIOM-DownScaleR derived predictors to a Biodiversity distribution object

Description

This is a customized function to format and add downscaled land-use shares from the Global Biosphere Management Model (GLOBIOM) to a distribution or BiodiversityScenario object in ibis.iSDM. GLOBIOM is a partial-equilibrium model developed at IIASA that represents land-use sectors with a rich set of environmental and socio-economic parameters; for instance, the agricultural and forestry sectors are estimated through dedicated process-based models. GLOBIOM outputs are spatially explicit and usually at a half-degree resolution globally. For finer-grained analyses, GLOBIOM outputs can be produced in a downscaled format with a customized statistical downscaling module.

The purpose of this function is to format the GLOBIOM outputs of DownScale for use in the ibis.iSDM package.

Usage

add_predictors_globiom(
  x,
  fname,
  names = NULL,
  transform = "none",
  derivates = "none",
  derivate_knots = 4,
  int_variables = NULL,
  bgmask = TRUE,
  harmonize_na = FALSE,
  priors = NULL,
  ...
)

## S4 method for signature 'BiodiversityDistribution,character'
add_predictors_globiom(
  x,
  fname,
  names = NULL,
  transform = "none",
  derivates = "none",
  derivate_knots = 4,
  int_variables = NULL,
  bgmask = TRUE,
  harmonize_na = FALSE,
  priors = NULL,
  ...
)

## S4 method for signature 'BiodiversityScenario,character'
add_predictors_globiom(
  x,
  fname,
  names = NULL,
  transform = "none",
  derivates = "none",
  derivate_knots = 4,
  int_variables = NULL,
  bgmask = TRUE,
  harmonize_na = FALSE,
  priors = NULL,
  ...
)

Arguments

x

A BiodiversityDistribution or BiodiversityScenario object.

fname

A character file path pointing to a netCDF file with the GLOBIOM data.

names

A vector of character names describing the environmental stack in case they should be renamed (Default: NULL).

transform

A vector stating whether predictors should be preprocessed in any way (Options: 'none', 'pca', 'scale', 'norm').

derivates

A character stating whether derivate features should be created (Options: 'none', 'thresh', 'hinge', 'quad').

derivate_knots

A single numeric or vector giving the number of knots for derivate creation if relevant (Default: 4).

int_variables

A vector of length 2 or greater specifying the covariates used for interactions (Default: NULL).

bgmask

Check whether the environmental data should be masked with the background layer (Default: TRUE).

harmonize_na

A logical value indicating whether NA values should be harmonized among predictors (Default: FALSE).

priors

A PriorList object. Default is set to NULL which uses default prior assumptions.

...

Other parameters passed down

Details

See add_predictors() for additional parameters and customizations. For more (manual) control the function for formatting the GLOBIOM data can also be called directly via formatGLOBIOM().

See Also

add_predictors

Examples

## Not run: 
 obj <- distribution(background) |>
        add_predictors_globiom(fname = "", transform = 'none')
 obj

## End(Not run)

Add predictions from a fitted model to a Biodiversity distribution object

Description

This function is a convenience wrapper to add the output of a previously fitted DistributionModel to another BiodiversityDistribution object. This only works if a prediction was created by the fitted model. Options to add thresholds instead, or to transform the model outputs or create derivates of them, are also supported.

Usage

add_predictors_model(
  x,
  model,
  transform = "scale",
  derivates = "none",
  threshold_only = FALSE,
  priors = NULL,
  ...
)

## S4 method for signature 'BiodiversityDistribution'
add_predictors_model(
  x,
  model,
  transform = "scale",
  derivates = "none",
  threshold_only = FALSE,
  priors = NULL,
  ...
)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

model

A DistributionModel object.

transform

A vector stating whether predictors should be preprocessed in any way (Options: 'none', 'pca', 'scale', 'norm').

derivates

A character stating whether derivate features should be created (Options: 'none', 'thresh', 'hinge', 'quad').

threshold_only

A logical flag indicating whether to add thresholded layers from the fitted model (if existing) instead (Default: FALSE).

priors

A PriorList object. Default is set to NULL which uses default prior assumptions.

...

Other parameters passed down

Details

A transformation takes the provided rasters and, for instance, rescales them or transforms them through a principal component analysis (prcomp). In contrast, derivates leave the provided predictors untouched, but instead create new ones, for instance by transforming their values through a quadratic or hinge transformation. Note that this effectively increases the number of predictors in the object, generally requiring stronger regularization by the engine used. Transformations and derivates can also be combined. Available options for transformation are:

  • 'none' - Leaves the provided predictors in the original scale.

  • 'pca' - Converts the predictors to principal components. Note that this results in a renaming of the variables to principal component axes!

  • 'scale' - Transforms all predictors by applying scale on them.

  • 'norm' - Normalizes all predictors by transforming them to a scale from 0 to 1.

  • 'windsor' - Applies winsorization to the target predictors. By default this caps the predictors at their 0.05 and 0.95 quantiles, thus helping to remove extreme outliers.

Available options for creating derivates are:

  • 'none' - No additional predictor derivates are created.

  • 'quad' - Adds quadratic transformed predictors.

  • 'interaction' - Add interacting predictors. Interactions need to be specified ("int_variables")!

  • 'thresh' - Add threshold transformed predictors.

  • 'hinge' - Add hinge transformed predictors.

  • 'bin' - Add predictors binned by their percentiles.

Examples

## Not run: 
 # Fit first model
 fit <- distribution(background) |>
        add_predictors(covariates) |>
        add_biodiversity_poipa(species) |>
        engine_glmnet() |>
        train()

 # New model object
 obj <- distribution(background) |>
        add_predictors_model(fit)
 obj

## End(Not run)

Add priors to an existing distribution object

Description

This function adds priors to an existing distribution object. The supplied priors must be a PriorList object created by calling priors.

Usage

add_priors(x, priors = NULL, ...)

## S4 method for signature 'BiodiversityDistribution'
add_priors(x, priors = NULL, ...)

Arguments

x

distribution (i.e. BiodiversityDistribution) object.

priors

A PriorList object containing multiple priors.

...

Other parameters passed down.

Note

Alternatively, priors for environmental predictors can also be added directly as a parameter to add_predictors.

See Also

Other prior: BARTPrior(), BARTPriors(), BREGPrior(), BREGPriors(), GDBPrior(), GDBPriors(), GLMNETPrior(), GLMNETPriors(), INLAPrior(), INLAPriors(), STANPrior(), STANPriors(), XGBPrior(), XGBPriors(), get_priors(), priors(), rm_priors()

Examples

## Not run: 
 pp <-  GLMNETPrior("forest")
 x <- distribution(background) |>
  add_priors(pp)


## End(Not run)

Add pseudo-absence points to a point data set

Description

For most engines, background or pseudo-absence points are necessary. The distinction lies in how the absence data are handled. For Poisson-distributed responses, absence points are considered background points over which the intensity of sampling (lambda) is integrated (as in a classical Poisson point-process model).

In contrast, for binomial-distributed responses the absence information is assumed to be an adequate representation of the true absences and is treated by the model as such. Here it is advised to specify absence points so that they represent potential true absences, for example through targeted background sampling or by sampling them within/outside a given range.

Usage

add_pseudoabsence(
  df,
  field_occurrence = "observed",
  template = NULL,
  settings = getOption("ibis.pseudoabsence")
)

Arguments

df

A sf, data.frame or tibble object containing point data.

field_occurrence

A character name of the column containing the presence information (Default: observed).

template

A SpatRaster object that is aligned with the predictors (Default: NULL). If set to NULL, then background in the pseudoabs_settings() has to be a SpatRaster object.

settings

A pseudoabs_settings() object. Absence settings are otherwise taken from ibis_options (Default).

Details

A pseudoabs_settings() object can be added to setup how absence points should be sampled. A bias parameter can be set to specify a bias layer to sample from, for instance a layer of accessibility. Note that when modelling several datasets, it might make sense to check across all datasets whether certain areas are truly absent. By default, the pseudo-absence points are not sampled in areas in which there are already presence points.

Value

A data.frame containing the newly created pseudo absence points.

Note

This method removes all columns from the input df object other than the field_occurrence column and the coordinate columns (which will be created if not already present).

References

  • Stolar, J., & Nielsen, S. E. (2015). Accounting for spatially biased sampling effort in presence‐only species distribution modelling. Diversity and Distributions, 21(5), 595-608.

  • Bird, T.J., Bates, A.E., Lefcheck, J.S., Hill, N.A., Thomson, R.J., Edgar, G.J., Stuart-Smith, R.D., Wotherspoon, S., Krkosek, M., Stuart-Smith, J.F., & Pecl, G.T. (2014). Statistical solutions for error and bias in global citizen science datasets. Biological Conservation, 173, 144-154.
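
Examples

A hypothetical sketch, assuming an sf point dataset virtual_points and a SpatRaster template aligned with the predictors are available; the pseudoabs_settings() arguments shown are illustrative:

## Not run: 
# Define how absence points are to be sampled, e.g. randomly across
# the background (the nrpoints argument is assumed here for illustration)
ass <- pseudoabs_settings(background = template, nrpoints = 1000)

# Add pseudo-absence points to the presence-only records
df <- add_pseudoabsence(virtual_points,
                        field_occurrence = "observed",
                        template = template,
                        settings = ass)

## End(Not run)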


Align a SpatRaster object to another by harmonizing geometry and extent.

Description

If the data is not in the same projection as the template, the alignment will be computed by reprojection only. If the data already has the same projection, the dataset will be cropped and aggregated prior to resampling in order to reduce computation time.

Usage

alignRasters(data, template, method = "bilinear", func = mean, cl = TRUE)

Arguments

data

SpatRaster object to be resampled.

template

SpatRaster or sf object from which geometry can be extracted.

method

method for resampling (Options: "near" or "bilinear").

func

function for resampling (Default: mean).

cl

logical value if multicore computation should be used (Default: TRUE).

Details

Nearest-neighbour resampling ("near") is recommended for discrete data and bilinear resampling for continuous data. See also the help of terra::resample for other options.

Value

New SpatRaster object aligned to the supplied template layer.

Examples

## Not run: 
 # Align one raster to another
 ras1 <- alignRasters( ras1, ras2, method = "near", cl = FALSE)

## End(Not run)

As Id

Description

As Id

Usage

as.Id(x, ...)

## S3 method for class 'character'
as.Id(x, ...)

Arguments

x

A character to be converted to an id.

...

Other arguments.
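
Examples

A minimal sketch:

## Not run: 
# Convert a character to an id object
new_id <- as.Id("dataset_1")

## End(Not run)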


Create a tree-based split probability prior for BART

Description

Function to include prior information as split probability for the Bayesian additive regression tree model added via engine_bart.

Priors for engine_bart have to be specified as transition probabilities of the variables that are internally used to generate splits in the regression tree. Specifying a prior can thus help to 'enforce' a split on a given variable. These probabilities are numeric and coded as values between 0 and 1.

Usage

BARTPrior(variable, hyper = 0.75, ...)

## S4 method for signature 'character'
BARTPrior(variable, hyper = 0.75, ...)

Arguments

variable

A character matched against existing predictors or latent effects.

hyper

A numeric value greater than 0 and less than or equal to 1 (Default: 0.75).

...

Variables passed on to prior object.

Note

Even if a given variable is included as a split in the regression or classification tree, this does not necessarily mean that the prediction changes if the variable is non-informative (as the split can occur early on). It does, however, affect any variable importance estimates calculated from the model.

References

  • Chipman, H., George, E., and McCulloch, R. (2009) BART: Bayesian Additive Regression Trees.

  • Chipman, H., George, E., and McCulloch R. (2006) Bayesian Ensemble Learning. Advances in Neural Information Processing Systems 19, Scholkopf, Platt and Hoffman, Eds., MIT Press, Cambridge, MA, 265-272.

See Also

Prior.

Other prior: BARTPriors(), BREGPrior(), BREGPriors(), GDBPrior(), GDBPriors(), GLMNETPrior(), GLMNETPriors(), INLAPrior(), INLAPriors(), STANPrior(), STANPriors(), XGBPrior(), XGBPriors(), add_priors(), get_priors(), priors(), rm_priors()
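
Examples

A hypothetical sketch, assuming a background layer and a predictor named "forest" exist in the object:

## Not run: 
# Favour splits on the variable "forest" with a split probability of 0.75
p <- BARTPrior("forest", hyper = 0.75)

x <- distribution(background) |>
  add_priors(p)

## End(Not run)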


Helper function when multiple variables are supplied for a BART prior

Description

This is a helper function to specify several BARTPrior objects with the same hyper-parameters, but different variables.

Usage

BARTPriors(variable, hyper = 0.75, ...)

## S4 method for signature 'character'
BARTPriors(variable, hyper = 0.75, ...)

Arguments

variable

A character matched against existing predictors or latent effects.

hyper

A numeric value greater than 0 and less than or equal to 1 (Default: 0.75).

...

Variables passed on to prior object.

See Also

Other prior: BARTPrior(), BREGPrior(), BREGPriors(), GDBPrior(), GDBPriors(), GLMNETPrior(), GLMNETPriors(), INLAPrior(), INLAPriors(), STANPrior(), STANPriors(), XGBPrior(), XGBPriors(), add_priors(), get_priors(), priors(), rm_priors()
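
Examples

A hypothetical sketch, assuming a background layer and the named predictors exist in the object:

## Not run: 
# Set the same split probability for several variables at once
p <- BARTPriors(c("forest", "cropland"), hyper = 0.9)

x <- distribution(background) |>
  add_priors(p)

## End(Not run)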


BiodiversityDataset prototype description

Description

BiodiversityDataset prototype description

Public fields

name

The default name of this dataset as character.

id

A character with the unique id for this dataset.

equation

A formula object containing the equation of how this dataset is modelled.

family

The family used for this dataset as character.

link

The link function used for this data as character.

type

A character with the type of the dataset.

weight

A numeric containing custom weights per observation for this dataset.

field_occurrence

A character with the name of the column containing observations.

data

Contains the observational data in sf format.

use_intercept

A logical flag on whether intercepts are included for this dataset.

pseudoabsence_settings

Optionally provided pseudoabsence settings.

Methods

Public methods


Method new()

Initializes the object and creates an empty list

Usage
BiodiversityDataset$new(
  name,
  id,
  equation,
  family,
  link,
  type,
  weight,
  field_occurrence,
  data,
  use_intercept,
  pseudoabsence_settings
)
Arguments
name

The default name of this dataset as character.

id

A character with the unique id for this dataset.

equation

A formula object containing the equation of how this dataset is modelled.

family

The family used for this dataset as character.

link

The link function used for this data as character.

type

A character with the type of the dataset.

weight

A numeric containing custom weights per observation for this dataset.

field_occurrence

A character with the name of the column containing observations.

data

Contains the observational data in sf format.

use_intercept

A logical flag on whether intercepts are included for this dataset.

pseudoabsence_settings

Optionally provided pseudoabsence settings.

Returns

NULL


Method print()

Print the name and properties of this Biodiversity dataset.

Usage
BiodiversityDataset$print()
Returns

A message on screen


Method set_equation()

Set new equation and writes it into formula

Usage
BiodiversityDataset$set_equation(x)
Arguments
x

A new formula object.

Returns

Invisible


Method get_equation()

Get equation

Usage
BiodiversityDataset$get_equation()
Returns

A placeholder or formula object.


Method show_equation()

Function to print the equation

Usage
BiodiversityDataset$show_equation()
Returns

A message on screen.


Method get_id()

Get Id within the dataset

Usage
BiodiversityDataset$get_id()
Returns

A character with the id.


Method get_type()

Get type of the dataset.

Usage
BiodiversityDataset$get_type(short = FALSE)
Arguments
short

A logical flag if this should be formatted in shortform.

Returns

A character with the type


Method get_column_occ()

Get field with occurrence information

Usage
BiodiversityDataset$get_column_occ()
Returns

A character with the occurrence field


Method get_family()

Get family

Usage
BiodiversityDataset$get_family()
Returns

A character with the family for the dataset


Method get_link()

Get custom link function

Usage
BiodiversityDataset$get_link()
Returns

A character with the link function for the dataset


Method get_data()

Get data from the object

Usage
BiodiversityDataset$get_data()
Returns

A sf object with the data


Method get_weight()

Get weight

Usage
BiodiversityDataset$get_weight()
Returns

A numeric with the weights within the dataset.


Method show()

Print input messages

Usage
BiodiversityDataset$show()
Returns

A message on screen.


Method get_observations()

Collect info statistics about number of observations

Usage
BiodiversityDataset$get_observations()
Returns

A numeric with the number of observations.


Method mask()

Convenience function to mask all input datasets.

Usage
BiodiversityDataset$mask(mask, inverse = FALSE, ...)
Arguments
mask

A SpatRaster or sf object.

inverse

A logical flag if the inverse should be masked instead.

...

Any other parameters passed on to mask

Returns

Invisible


Method clone()

The objects of this class are cloneable with this method.

Usage
BiodiversityDataset$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.


BiodiversityDatasetCollection super class description

Description

Acts as a container for a specified set of BiodiversityDataset objects contained within. Functions are provided to summarize across the BiodiversityDataset-class objects.

Public fields

data

A list of BiodiversityDataset objects.

name

The default name of this collection as character.

Methods

Public methods


Method new()

Initializes the object and creates an empty list

Usage
BiodiversityDatasetCollection$new()
Returns

NULL


Method print()

Print the names and properties of all Biodiversity datasets contained within

Usage
BiodiversityDatasetCollection$print(format = TRUE)
Arguments
format

A logical flag on whether a message should be printed.

Returns

A message on screen


Method show()

An alias that calls print.

Usage
BiodiversityDatasetCollection$show()
Returns

A message on screen


Method get_types()

Types of all biodiversity datasets included in this collection.

Usage
BiodiversityDatasetCollection$get_types(short = FALSE)
Arguments
short

A logical flag whether types should be in short format.

Returns

A character vector.


Method get_names()

Get names and format them if necessary

Usage
BiodiversityDatasetCollection$get_names(format = FALSE)
Arguments
format

A logical flag whether names are to be formatted

Returns

A character vector.


Method set_data()

Add a new Biodiversity dataset to this collection.

Usage
BiodiversityDatasetCollection$set_data(x, value)
Arguments
x

A character with the name or id of this dataset.

value

A BiodiversityDataset

Returns

Invisible


Method get_data_object()

Get a specific Biodiversity dataset by id

Usage
BiodiversityDatasetCollection$get_data_object(id)
Arguments
id

A character with a given id for the dataset.

Returns

Returns a BiodiversityDataset.


Method get_data()

Get all biodiversity observations from a given dataset.

Usage
BiodiversityDatasetCollection$get_data(id)
Arguments
id

A character with a given id for the dataset.

Returns

Returns all data from a set BiodiversityDataset.


Method get_coordinates()

Get coordinates for a given biodiversity dataset, else return a wkt object.

Usage
BiodiversityDatasetCollection$get_coordinates(id)
Arguments
id

A character with a given id for the dataset.

Returns

All coordinates from a given object in data.frame.


Method mask()

Convenience function to mask all input datasets.

Usage
BiodiversityDatasetCollection$mask(mask, inverse = FALSE)
Arguments
mask

A SpatRaster or sf object.

inverse

A logical flag if the inverse should be masked instead.

Returns

Invisible


Method rm_data()

Remove a specific biodiversity dataset by id

Usage
BiodiversityDatasetCollection$rm_data(id)
Arguments
id

A character with a given id for the dataset.

Returns

Invisible


Method length()

Number of Biodiversity datasets in the collection.

Usage
BiodiversityDatasetCollection$length()
Returns

A numeric with the number of datasets.


Method get_observations()

Get number of observations of all datasets

Usage
BiodiversityDatasetCollection$get_observations()
Returns

A numeric with the number of observations across datasets.


Method get_equations()

Get equations from all datasets

Usage
BiodiversityDatasetCollection$get_equations()
Returns

A list vector with all equations across datasets.


Method get_families()

Get families from datasets.

Usage
BiodiversityDatasetCollection$get_families()
Returns

A list vector with all families across datasets.


Method get_links()

Get custom link functions

Usage
BiodiversityDatasetCollection$get_links()
Returns

A list vector with all link functions across datasets.


Method get_columns_occ()

Get fields with observation columns

Usage
BiodiversityDatasetCollection$get_columns_occ()
Returns

A list vector with the names of observation columns.


Method get_weights()

Get the weights across datasets.

Usage
BiodiversityDatasetCollection$get_weights()
Returns

A list vector with the weights if set per dataset.


Method get_ids()

Get ids of all assets in the collection.

Usage
BiodiversityDatasetCollection$get_ids()
Returns

A list vector with the ids of all datasets.


Method get_id_byType()

Search for a specific biodiversity dataset with type

Usage
BiodiversityDatasetCollection$get_id_byType(type)
Arguments
type

A character for a given data type.

Returns

A character with the id(s) of datasets with the given type.


Method get_id_byName()

Get id by name

Usage
BiodiversityDatasetCollection$get_id_byName(name)
Arguments
name

A character for a given name.

Returns

A character with the id(s) of datasets with the given name.


Method show_equations()

Show equations of all datasets

Usage
BiodiversityDatasetCollection$show_equations(msg = TRUE)
Arguments
msg

A logical flag on whether to print a message instead.

Returns

Shows equations on screen or as character.


Method plot()

Plot the whole collection

Usage
BiodiversityDatasetCollection$plot()
Returns

Invisible


Method clone()

The objects of this class are cloneable with this method.

Usage
BiodiversityDatasetCollection$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Note

This can likely be beautified further.
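
Examples

A hypothetical sketch of querying the collection stored within a distribution object, assuming a background layer and a presence-absence point dataset virtual_points are available:

## Not run: 
x <- distribution(background) |>
  add_biodiversity_poipa(virtual_points)

# The collection is stored in the object and can be queried directly
x$biodiversity$get_types()
x$biodiversity$length()
x$biodiversity$get_observations()

## End(Not run)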


Biodiversity Distribution master class

Description

Base R6 class for any biodiversity distribution objects. Serves as container that supplies data and functions to other R6 classes. Generally stores all objects and parameters added to a model.

Details

Run names() on a distribution object to show all available functions.

Public fields

background

A SpatRaster or sf object delineating the modelling extent.

limits

An optional sf object on potential extrapolation limits.

biodiversity

A BiodiversityDatasetCollection object.

predictors

A PredictorDataset object.

priors

An optional PriorList object.

control

An optional Control object.

latentfactors

A character stating whether latent factors are used.

offset

A character stating whether offset methods are used.

log

An optional Log object.

engine

An Engine object.

Methods

Public methods


Method new()

Initializes the object and creates a BiodiversityDataset by default.

Usage
BiodiversityDistribution$new(background, limits, biodiversity, ...)
Arguments
background

A SpatRaster or sf object delineating the modelling extent.

limits

An optional sf object on potential extrapolation limits

biodiversity

A BiodiversityDatasetCollection object.

...

Any other objects

Returns

NULL


Method print()

Looks for and returns the properties of all contained objects.

Usage
BiodiversityDistribution$print()
Returns

A message on screen


Method show()

An alias for print

Usage
BiodiversityDistribution$show()
Returns

A message on screen


Method name()

Returns self-describing name

Usage
BiodiversityDistribution$name()
Returns

A character with the name


Method show_background_info()

Summarizes extent and projection from set background

Usage
BiodiversityDistribution$show_background_info()
Returns

A character summarizing the extent and projection.


Method set_limits()

Specify new limits to the background

Usage
BiodiversityDistribution$set_limits(x)
Arguments
x

A list object with method and limit type.

Returns

This object.


Method get_limits()

Get provided limits if set or a waiver

Usage
BiodiversityDistribution$get_limits()
Returns

A list or waiver.


Method rm_limits()

Remove limits if set.

Usage
BiodiversityDistribution$rm_limits()
Returns

This object.


Method get_predictor_names()

Function for querying predictor names if existing

Usage
BiodiversityDistribution$get_predictor_names()
Returns

A character vector.


Method set_latent()

Adding latent factors to the object.

Usage
BiodiversityDistribution$set_latent(type, method = NULL, separate_spde = FALSE)
Arguments
type

A character with the given type.

method

A character with a method.

separate_spde

A logical flag on whether duplicate SPDE effects are to be created.

Returns

This object.


Method get_latent()

Get latent factors if found in object.

Usage
BiodiversityDistribution$get_latent()
Returns

A character with those objects.


Method rm_latent()

Remove latent factors if found in object.

Usage
BiodiversityDistribution$rm_latent()
Returns

This object.


Method get_priors()

Get prior object if found in object.

Usage
BiodiversityDistribution$get_priors()
Returns

This object.


Method set_priors()

Specify new prior object. Overwrites existing ones

Usage
BiodiversityDistribution$set_priors(x)
Arguments
x

A PriorList object.

Returns

This object.


Method set_biodiversity()

Adds a new biodiversity object to the existing empty collection.

Usage
BiodiversityDistribution$set_biodiversity(id, p)
Arguments
id

A character or id defining this object.

p

A BiodiversityDataset object.

Returns

This object.


Method set_predictors()

Set a new Predictor object to this object.

Usage
BiodiversityDistribution$set_predictors(x)
Arguments
x

A PredictorDataset with predictors for this object.

Returns

This object.


Method set_engine()

Set a new Engine object to this object.

Usage
BiodiversityDistribution$set_engine(x)
Arguments
x

An Engine for this object.

Returns

This object.


Method get_engine()

Gets the name of the current engine if set.

Usage
BiodiversityDistribution$get_engine()
Returns

A character with the engine name


Method rm_engine()

Removes the current engine if set.

Usage
BiodiversityDistribution$rm_engine()
Returns

This object


Method get_prior_variables()

Get prior variables

Usage
BiodiversityDistribution$get_prior_variables()
Returns

A character with the variable names for which priors have been added.


Method set_offset()

Specify new offsets.

Usage
BiodiversityDistribution$set_offset(x)
Arguments
x

A new SpatRaster object to be used as offset.

Returns

This object.


Method get_offset()

Get offset (print name)

Usage
BiodiversityDistribution$get_offset()
Returns

A character with all the offsets in here.


Method rm_offset()

Remove offsets if found.

Usage
BiodiversityDistribution$rm_offset(what = NULL)
Arguments
what

Optional character of specific offsets to remove.

Returns

This object.


Method plot_offsets()

Plot offset if found.

Usage
BiodiversityDistribution$plot_offsets()
Returns

A graphical element.


Method get_offset_type()

Get offset parameters if found

Usage
BiodiversityDistribution$get_offset_type()
Returns

A list with the offset parameters if found.


Method set_control()

Set new bias control

Usage
BiodiversityDistribution$set_control(type = "bias", x, method, value)
Arguments
type

A character with the type of control object.

x

A new bias control object. Expecting a SpatRaster object.

method

The method used to create the object.

value

A bias value as numeric.

Returns

This object.


Method get_control()

Get bias control (print name)

Usage
BiodiversityDistribution$get_control(type = "bias")
Arguments
type

A character with the type of control object.

Returns

A character with the bias object if found.


Method rm_control()

Remove bias controls if found.

Usage
BiodiversityDistribution$rm_control()
Returns

This object.


Method plot_bias()

Plot bias variable if set.

Usage
BiodiversityDistribution$plot_bias()
Returns

A graphical element.


Method get_log()

Returns the output filename of the current log object if set.

Usage
BiodiversityDistribution$get_log()
Returns

A character where the output is returned.


Method set_log()

Set a new log object

Usage
BiodiversityDistribution$set_log(x)
Arguments
x

A Log object.

Returns

This object


Method get_extent()

Get extent

Usage
BiodiversityDistribution$get_extent()
Returns

Background extent or NULL.


Method get_projection()

Get projection from the background in crs format.

Usage
BiodiversityDistribution$get_projection()
Returns

A character of the projection


Method get_resolution()

Return resolution of the background object.

Usage
BiodiversityDistribution$get_resolution()
Returns

A vector with the resolution.


Method rm_predictors()

Remove predictors. Either all of them or specific ones.

Usage
BiodiversityDistribution$rm_predictors(names)
Arguments
names

A character with the predictors to be removed.

Returns

This object.


Method rm_priors()

Remove priors. Either all of them or specific ones.

Usage
BiodiversityDistribution$rm_priors(names = NULL)
Arguments
names

A character with the priors to be removed.

Returns

This object.


Method show_biodiversity_length()

Show number of biodiversity records

Usage
BiodiversityDistribution$show_biodiversity_length()
Returns

A numeric with sum of biodiversity records


Method show_biodiversity_equations()

Show Equations of biodiversity records

Usage
BiodiversityDistribution$show_biodiversity_equations()
Returns

A message on screen.


Method get_biodiversity_equations()

Get equations of biodiversity records

Usage
BiodiversityDistribution$get_biodiversity_equations()
Returns

A list vector.


Method get_biodiversity_types()

Query all biodiversity types in this object

Usage
BiodiversityDistribution$get_biodiversity_types()
Returns

A character vector.


Method get_biodiversity_ids()

Return all biodiversity dataset ids in the object

Usage
BiodiversityDistribution$get_biodiversity_ids()
Returns

A list for the ids in the biodiversity datasets


Method get_biodiversity_names()

Return all the character names of all biodiversity datasets

Usage
BiodiversityDistribution$get_biodiversity_names()
Returns

A list with the names in the biodiversity datasets


Method plot()

Plots the content of this class.

Usage
BiodiversityDistribution$plot()
Returns

A message.


Method summary()

Summary function for this object.

Usage
BiodiversityDistribution$summary()
Returns

A message.


Method clone()

The objects of this class are cloneable with this method.

Usage
BiodiversityDistribution$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Note

Not implemented yet.

Not implemented yet.

See Also

add_limits_extrapolation()

add_latent_spatial()

add_priors()

add_biodiversity_poipa(), add_biodiversity_poipo(), add_biodiversity_polpa(), add_biodiversity_polpo()

add_predictors()

add_offset()

Examples

# Query available functions and entries
background <- terra::rast(system.file('extdata/europegrid_50km.tif',
package='ibis.iSDM',mustWork = TRUE))
# Define model
x <- distribution(background)
names(x)
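The accessor methods documented above can be called directly on the created object. A brief hedged sketch extending the example (the returned values depend on the data added so far):

```r
# Query further properties of the empty model object
x <- distribution(background)
x$get_extent()           # extent of the modelling region
x$get_projection()       # character with the coordinate reference system
x$get_predictor_names()  # character vector, empty until predictors are added
```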

Class for a biodiversity scenario from a trained model

Description

Base R6 class for any biodiversity scenario objects. Serves as container that supplies data and functions to other R6 classes and functions.

Public fields

modelobject

A name of the model for projection.

modelid

An id of the model used for projection.

limits

An sf object used to constrain the prediction.

predictors

A predictor object for projection.

constraints

Any constraints set for projection.

latentfactors

A list on whether latentfactors are used.

scenarios

The resulting stars objects.

Methods

Public methods


Method new()

Initializes the object and creates an empty list

Usage
BiodiversityScenario$new()
Returns

NULL


Method print()

Print the names and properties of all scenarios.

Usage
BiodiversityScenario$print()
Returns

A message on screen


Method verify()

Verify that set Model exist and check self-validity

Usage
BiodiversityScenario$verify()
Returns

Invisible


Method show()

Show the name of the Model

Usage
BiodiversityScenario$show()
Returns

The model object name.


Method get_projection()

Get the geographic projection used for the scenario projection.

Usage
BiodiversityScenario$get_projection()
Returns

A sf object with the geographic projection


Method get_resolution()

Get the resolution of the projection.

Usage
BiodiversityScenario$get_resolution()
Returns

A numeric indication of the resolution.


Method get_model()

Get the actual model used for projection

Usage
BiodiversityScenario$get_model(copy = FALSE)
Arguments
copy

A logical flag on whether a deep copy should be created.

Returns

A DistributionModel object.


Method get_limits()

Get provided projection limits if set.

Usage
BiodiversityScenario$get_limits()
Returns

A sf object or NULL.


Method rm_limits()

Remove current limits.

Usage
BiodiversityScenario$rm_limits()
Returns

Invisible


Method get_predictor_names()

Get names of predictors for scenario object.

Usage
BiodiversityScenario$get_predictor_names()
Returns

A character vector with the names.


Method get_timeperiod()

Get time period of projection.

Usage
BiodiversityScenario$get_timeperiod(what = "range")
Arguments
what

A character on whether the full time period or just the range is to be returned.

Returns

A time period from start to end.


Method get_constraints()

Get constraints for the model

Usage
BiodiversityScenario$get_constraints()
Returns

A list with the constraints within the scenario.


Method rm_constraints()

Remove constraints from the model

Usage
BiodiversityScenario$rm_constraints()
Returns

Invisible


Method get_threshold()

Get thresholds if specified.

Usage
BiodiversityScenario$get_threshold()
Returns

A list with method and value for the threshold.


Method get_thresholdvalue()

Duplicate function for internal consistency to return threshold

Usage
BiodiversityScenario$get_thresholdvalue()
Returns

A list with method and value for the threshold.


Method apply_threshold()

Apply a new threshold to the projection.

Usage
BiodiversityScenario$apply_threshold(tr = new_waiver())
Arguments
tr

A numeric value with the new threshold.

Returns

This object.


Method set_predictors()

Set new predictors to this object.

Usage
BiodiversityScenario$set_predictors(x)
Arguments
x

PredictorDataset object to be supplied.

Returns

This object.


Method set_constraints()

Set new constraints

Usage
BiodiversityScenario$set_constraints(x)
Arguments
x

A list object with constraint settings.

Returns

This object.


Method get_simulation()

Get simulation options and parameters if found

Usage
BiodiversityScenario$get_simulation()
Returns

A list with the parameters.


Method set_simulation()

Set simulation objects.

Usage
BiodiversityScenario$set_simulation(x)
Arguments
x

new simulation entries and options as list to be set.

Returns

This object.


Method get_predictors()

Get Predictors from the object.

Usage
BiodiversityScenario$get_predictors()
Returns

A predictor dataset.


Method rm_predictors()

Remove predictors from the object.

Usage
BiodiversityScenario$rm_predictors(names)
Arguments
names

A character vector with names

Returns

This object.


Method get_data()

Get scenario predictions or any other data

Usage
BiodiversityScenario$get_data(what = "scenarios")
Arguments
what

A character vector naming the data entries to return.

Returns

The requested data.


Method rm_data()

Remove scenario predictions

Usage
BiodiversityScenario$rm_data()
Arguments
what

A character vector naming the data entries to remove.

Returns

Invisible


Method set_data()

Set new data in object.

Usage
BiodiversityScenario$set_data(x)
Arguments
x

A new data object measuring scenarios.

Returns

This object.


Method set_latent()

Adding latent factors to the object.

Usage
BiodiversityScenario$set_latent(latent)
Arguments
latent

A list containing the data object.

Returns

This object.


Method get_latent()

Get latent factors if found in object.

Usage
BiodiversityScenario$get_latent()
Returns

A list with the latent settings


Method rm_latent()

Remove latent factors if found in object.

Usage
BiodiversityScenario$rm_latent()
Returns

This object.


Method plot()

Plot the predictions made here.

Usage
BiodiversityScenario$plot(what = "suitability", which = NULL, ...)
Arguments
what

A character describing the layers to be plotted.

which

A numeric subset to any specific time steps.

...

Any other parameters passed on.

Returns

A graphical representation


Method plot_threshold()

Convenience function to plot thresholds if set

Usage
BiodiversityScenario$plot_threshold(which = NULL)
Arguments
which

A numeric subset to any specific time steps.

Returns

A graphical representation


Method plot_migclim()

Plot Migclim results if existing.

Usage
BiodiversityScenario$plot_migclim()
Returns

A graphical representation


Method plot_animation()

Plot animation of scenarios if possible

Usage
BiodiversityScenario$plot_animation(what = "suitability", fname = NULL)
Arguments
what

A character describing the layers to be plotted.

fname

An optional filename to write the result.

Returns

A graphical representation


Method plot_relative_change()

Plot relative change between baseline and projected thresholds

Usage
BiodiversityScenario$plot_relative_change(
  position = NULL,
  variable = "mean",
  plot = TRUE
)
Arguments
position

Which layer to be plotted

variable

A character of the variable to be plotted

plot

logical flag on whether to plot the results or return the object.

Returns

A graphical representation or SpatRaster.


Method summary()

Summarize the change in layers between timesteps

Usage
BiodiversityScenario$summary(
  layer = "threshold",
  plot = FALSE,
  relative = FALSE
)
Arguments
layer

A character of the variable to be plotted

plot

logical flag on whether to plot the results or return the coefficients.

relative

logical on coefficients to be converted to relative change.

Returns

Summarized coefficients as data.frame


Method summary_beforeafter()

Summarize before-after change of first and last layer.

Usage
BiodiversityScenario$summary_beforeafter()
Returns

Summarized coefficients as data.frame


Method plot_scenarios_slope()

Calculate slopes across the projection

Usage
BiodiversityScenario$plot_scenarios_slope(
  what = "suitability",
  oftype = "stars"
)
Arguments
what

A character with layer to be plotted (default: "suitability").

oftype

character of the output type.

Returns

A plot of the scenario slopes


Method calc_scenarios_slope()

Calculate slopes across the projection

Usage
BiodiversityScenario$calc_scenarios_slope(
  what = "suitability",
  plot = TRUE,
  oftype = "stars"
)
Arguments
what

A character with layer to be plotted (default: "suitability").

plot

logical flag on whether to plot the results or return the coefficients.

oftype

character of the output type.

Returns

A SpatRaster layer or stars object.


Method mask()

Convenience function to mask all input projections.

Usage
BiodiversityScenario$mask(mask, inverse = FALSE, ...)
Arguments
mask

A SpatRaster or sf object.

inverse

A logical flag if the inverse should be masked instead.

...

Any other parameters passed on.

Returns

Invisible


Method get_centroid()

Get centroids of projection layers

Usage
BiodiversityScenario$get_centroid(patch = FALSE)
Arguments
patch

A logical flag on whether the centroid should be calculated weighted by the values.

Returns

An sf object.


Method save()

Save object as output somewhere

Usage
BiodiversityScenario$save(fname, type = "tif", dt = "FLT4S")
Arguments
fname

An output filename as character.

type

A format as character. Matched against a list of supported formats.

dt

The datatype used, such as float64

Returns

Saved spatial prediction on drive.


Method clone()

The objects of this class are cloneable with this method.

Usage
BiodiversityScenario$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Note

This sets the threshold method internally to 'fixed'.

The latent factor is usually obtained from the fitted model object, unless re-specified and added here to the list.

This requires the "gganimate" package.

This requires a set threshold() to the scenario object.

This requires set threshold prior to projection.

See Also

threshold()


add_latent_spatial()
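A typical workflow touching several of the methods above can be sketched as follows; this assumes a previously trained DistributionModel ('fit') and a scenario-ready predictor object ('pred_future'), both of which are illustrative names:

```r
## Not run: 
# Sketch of a scenario projection ('fit' and 'pred_future' are assumed objects)
sc <- scenario(fit) |>
  add_predictors(pred_future) |>
  threshold() |>          # required for threshold-based summaries
  project()
sc$plot()                 # plot the projected suitability
sc$summary()              # summarize change between timesteps

## End(Not run)
```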


Bivariate prediction plot for distribution objects

Description

Often there is an intention to display not only the predictions made with an SDM, but also the uncertainty of the prediction. Uncertainty can be estimated either directly by the model or by calculating the variation in prediction values among a set of models.

In particular Bayesian engines can produce not only mean estimates of fitted responses, but also pixel-based estimates of uncertainty from the posterior such as the standard deviation (SD) or the coefficient of variation of a given prediction.

This function makes use of the "biscale" R-package to create bivariate plots of the fitted distribution object, allowing two variables to be visualized at once. It is mostly thought of as a convenience function to create such bivariate plots for quick visualization.

Supported Inputs are either single trained Bayesian DistributionModel with uncertainty or the output of an ensemble() call. In both cases, users have to make sure that "xvar" and "yvar" are set accordingly.

Usage

bivplot(
  mod,
  xvar = "mean",
  yvar = "sd",
  plot = TRUE,
  fname = NULL,
  title = NULL,
  col = "BlueGold",
  ...
)

## S4 method for signature 'ANY'
bivplot(
  mod,
  xvar = "mean",
  yvar = "sd",
  plot = TRUE,
  fname = NULL,
  title = NULL,
  col = "BlueGold",
  ...
)

Arguments

mod

A trained DistributionModel or alternatively a SpatRaster object containing a model prediction.

xvar

A character denoting the value on the x-axis (Default: 'mean').

yvar

A character denoting the value on the y-axis (Default: 'sd').

plot

A logical indication of whether the result is to be plotted (Default: TRUE).

fname

A character specifying the output filename a created figure should be written to.

title

Allows respecifying the title through a character (Default: NULL).

col

A character stating the colour palette to use. Has to be either a predefined value or a vector of colours. See "biscale::bi_pal_manual". Default: "BlueGold".

...

Other engine specific parameters.

Value

Saved bivariate plot in 'fname' if specified, otherwise plot.

Note

This function requires the biscale package to be installed. Although a workaround without the package could be developed, it was not deemed necessary at this point. See also this gist.

See Also

partial, plot.DistributionModel
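Since no Examples section is provided, a minimal hedged sketch follows; it assumes a trained Bayesian DistributionModel ('mod') whose prediction contains both a "mean" and an "sd" layer and that the "biscale" package is installed:

```r
## Not run: 
# Bivariate plot of the mean prediction against its standard deviation
bivplot(mod, xvar = "mean", yvar = "sd", plot = TRUE)
# Alternatively write the figure to a file instead of plotting it
bivplot(mod, xvar = "mean", yvar = "sd", plot = FALSE, fname = "bivplot.png")

## End(Not run)
```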


Create a new spike and slab prior for Bayesian generalized linear models

Description

Function to include prior information via Zellner-style spike and slab prior for generalized linear models used in engine_breg. These priors are similar to the horseshoe priors used in regularized engine_stan models and penalize regressions by assuming that most predictors have an effect of 0.

Usage

BREGPrior(variable, hyper = NULL, ip = NULL)

## S4 method for signature 'character'
BREGPrior(variable, hyper = NULL, ip = NULL)

Arguments

variable

A character matched against existing predictors.

hyper

A numeric estimate of the mean regression coefficients.

ip

A numeric estimate between 0 and 1 of the inclusion probability of the target variable (Default: NULL).

Details

The Zellner-style spike and slab prior for generalized linear models is specified as described in the Boom R-package. Currently two options are supported, which work for models with Poisson and binomial (Bernoulli) distributed errors. Two types of priors can be provided on a variable:

  • "coefficient" Allows specifying Gaussian priors on the mean coefficients of the model. Priors on the coefficients can be provided via the "hyper" parameter. Note that variables with such a prior can still be regularized out of the model.

  • "inclusion.probability" A vector giving the prior probability of inclusion for the specified variable. This can be useful when prior information on preference is known but not the strength of it.

If coefficients are set, then the inclusion probability is also modified by default. However, even without knowing a particular estimate of a beta coefficient and its direction, one can still provide an estimate of the inclusion probability. In other words: the hyperparameters 'hyper' and 'ip' cannot both be NULL.

References

  • Hugh Chipman, Edward I. George, Robert E. McCulloch, M. Clyde, Dean P. Foster, Robert A. Stine (2001), "The Practical Implementation of Bayesian Model Selection" Lecture Notes-Monograph Series, Vol. 38, pp. 65-134. Institute of Mathematical Statistics.

See Also

Prior

Other prior: BARTPrior(), BARTPriors(), BREGPriors(), GDBPrior(), GDBPriors(), GLMNETPrior(), GLMNETPriors(), INLAPrior(), INLAPriors(), STANPrior(), STANPriors(), XGBPrior(), XGBPriors(), add_priors(), get_priors(), priors(), rm_priors()

Examples

## Not run: 
# Positive coefficient
p1 <- BREGPrior(variable = "forest", hyper = 2, ip = NULL)
p1
# Coefficient and direction unknown, but variable definitely important
p2 <- BREGPrior(variable = "forest", hyper = NULL, ip = 1)
p2

## End(Not run)

Helper function when multiple variables are supplied for a BREG prior

Description

This is a helper function to specify several BREGPrior with the same hyper-parameters, but different variables.

Usage

BREGPriors(variable, hyper = NULL, ip = NULL)

## S4 method for signature 'character'
BREGPriors(variable, hyper = NULL, ip = NULL)

Arguments

variable

A character matched against existing predictors.

hyper

A numeric estimate of the mean regression coefficients.

ip

A numeric estimate between 0 and 1 of the inclusion probability of the target variable (Default: NULL).

See Also

Other prior: BARTPrior(), BARTPriors(), BREGPrior(), GDBPrior(), GDBPriors(), GLMNETPrior(), GLMNETPriors(), INLAPrior(), INLAPriors(), STANPrior(), STANPriors(), XGBPrior(), XGBPriors(), add_priors(), get_priors(), priors(), rm_priors()
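Analogous to the BREGPrior examples, a short sketch of setting the same prior on several variables at once (the variable names are illustrative):

```r
## Not run: 
# Same positive coefficient prior for multiple predictors
pp <- BREGPriors(variable = c("forest", "cropland"), hyper = 2, ip = NULL)
pp

## End(Not run)
```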


Check objects in the package for common errors or issues

Description

There is not always enough data or sufficient information to robustly infer the suitable habitat or niche of a species. As many SDM algorithms are essentially regression models, similar assumptions about model convergence, homogeneity of residuals and inference usually apply (although they are often ignored). This function simply checks the respective input object for common issues or mistakes.

Usage

check(obj, stoponwarning = FALSE)

## S4 method for signature 'ANY'
check(obj, stoponwarning = FALSE)

Arguments

obj

A BiodiversityDistribution, DistributionModel or BiodiversityScenario object.

stoponwarning

logical Should check return a stop if warning is raised? (Default: FALSE).

Details

Different checks are implemented depending on the supplied object

  • Checks if there are less than 200 observations

  • TODO: Add rm_insufficient_covs link

  • Check model convergence

  • Check if model is found

  • Check if coefficients exist

  • Check if there are unusual outliers in the prediction (using 10 times the median absolute deviation)

  • Check if threshold is larger than layer

Value

Message outputs

Note

This function will likely be expanded with additional checks in the future. If you have ideas, please let us know via an issue.

Examples

## Not run: 
 # Where mod is an estimated DistributionModel
 check(mod)

## End(Not run)

Obtains the coefficients of a trained model

Description

Similar to summary, this helper function obtains the coefficients from a given DistributionModel object.

Usage

## S3 method for class 'DistributionModel'
coef(object, ...)

Arguments

object

Any prepared object.

...

not used.

Note

For models trained with machine-learning approaches (e.g. engine_bart etc.) this function will return variable importance estimates rather than linear coefficients. The same applies to trained non-linear models.

See Also

stats::coef().
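A minimal hedged usage sketch, assuming 'mod' is a trained DistributionModel:

```r
## Not run: 
# Returns coefficients, or variable importance for machine-learning engines
coef(mod)

## End(Not run)
```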


Combine or concatenate multiple formula objects

Description

This small helper function allows combining multiple formula() objects into one. In the case of duplicate variable entries, only the unique ones are used.

Usage

combine_formulas(..., combine = "both", env = parent.frame())

Arguments

...

Any number formula objects in "LHS ~ RHS" format, also supporting character strings.

combine

character on whether LHS and RHS duplicates are to be removed. Can be set to either "lhs", "rhs" or "both" (Default).

env

A new environment of the formula (def=parent.frame()).

Details

Use "y ~ 0" to specify a stand-alone LHS.

Value

A formula as cbind(lhs_1, lhs_2, ...) ~ rhs_1 + rhs_2 + ... or lhs ~ rhs_1 + rhs_2 in case of identical LHS (see examples).

Note

This likely won't work for interaction terms (such as * or :).

Examples

# Combine everything (default)
combine_formulas(observed ~ rainfall + temp, observed ~ rainfall + forest.cover)
# Combine only LHS
combine_formulas(observed ~ rainfall + temp, observed ~ rainfall + forest.cover, combine = "lhs")
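Following the note in Details, a stand-alone LHS can be added via "y ~ 0" (the variable names are illustrative):

```r
# Add a stand-alone LHS via "y ~ 0"
combine_formulas(observed ~ rainfall, counts ~ 0)
```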

Create distribution modelling procedure

Description

This function creates an object that contains all the data, parameters and settings for building an (integrated) species distribution model. Key functions to add data are add_biodiversity_poipo and the like, add_predictors, add_latent_spatial, engine_glmnet or similar, add_priors and add_offset. It creates a prototype BiodiversityDistribution object with its own functions. After setting input data and parameters, the model can then be fitted via the train function and predictions created.

Additionally, it is possible to specify a "limit" to any predictions conducted on the background. This can be for instance a buffered layer by a certain dispersal distance (Cooper and Soberon, 2018) or a categorical layer representing biomes or soil conditions. Another option is to create a constraint by constructing a minimum convex polygon (MCP) using the supplied biodiversity data. This option can be enabled by setting "limits_method" to "mcp". It is also possible to add a small buffer to the constructed MCP via the "mcp_buffer" parameter. See the frequently asked question (FAQ) section on the homepage for more information.

See Details for a description of the internal functions available to modify or summarize data within the created object.

Note that any model requires at minimum a single added biodiversity dataset as well as a specified engine.

Usage

distribution(
  background,
  limits = NULL,
  limits_method = "none",
  mcp_buffer = 0,
  limits_clip = FALSE
)

## S4 method for signature 'SpatRaster'
distribution(
  background,
  limits = NULL,
  limits_method = "none",
  mcp_buffer = 0,
  limits_clip = FALSE
)

## S4 method for signature 'sf'
distribution(
  background,
  limits = NULL,
  limits_method = "none",
  mcp_buffer = 0,
  limits_clip = FALSE
)

Arguments

background

Specification of the modelling background. Must be a SpatRaster or sf object.

limits

A SpatRaster, sf or stars object that limits the prediction surface when intersected with input data (Default: NULL). In case of a stars object the first factorized time entry is taken.

limits_method

A character of the method used for hard limiting a projection. Available options are "none" (Default), "zones" or "mcp". See also add_limits_extrapolation().

mcp_buffer

A numeric distance to buffer the MCP (Default: 0). Only used if limits_method = "mcp".

limits_clip

logical Should the limits clip all predictors before fitting a model (TRUE) or just the prediction (FALSE, default).

Details

This function creates a BiodiversityDistribution object that in itself contains other functions and stores parameters and (pre-)processed data. A full list of functions available can be queried via "names(object)". Some of the functions are not intended to be manipulated directly, but rather through convenience functions (e.g. "object$set_predictors()"). Similarly other objects are stored in the BiodiversityDistribution object that have their own functions as well and can be queried (e.g. "names(object)"). For a list of functions see the reference documentation. By default, if some datasets are not set, then a "Waiver" object is returned instead.

The following objects can be stored:

Useful high-level functions to address those objects are for instance:

  • object$show() A generic summary of the BiodiversityDistribution object contents. Can also be called via print.

  • object$get_biodiversity_equations() Lists the equations used for each biodiversity dataset with given id. Defaults to all predictors.

  • object$get_biodiversity_types() Lists the type of each specified biodiversity dataset with given id.

  • object$get_extent() Outputs the terra::ext of the modelling region.

  • object$show_background_info() Returns a list with the terra::ext and the terra::crs.

  • object$get_extent_dimensions() Outputs the terra::ext dimension by calling the "extent_dimensions()" function.

  • object$get_predictor_names() Returns a character vector with the names of all added predictors.

  • object$get_prior_variables() Returns a description of priors added.

There are other functions as well but those are better accessed through their respective wrapper functions.

Value

BiodiversityDistribution object containing data for building a biodiversity distribution modelling problem.

References

  • Fletcher, R.J., Hefley, T.J., Robertson, E.P., Zuckerberg, B., McCleery, R.A., Dorazio, R.M., (2019) A practical guide for combining data to model species distributions. Ecology 100, e02710. https://doi.org/10.1002/ecy.2710

  • Cooper, J.C., Soberón, J. (2018) Creating individual accessible area hypotheses improves stacked species distribution model performance. Global Ecology and Biogeography 27(1), 156-165.

See Also

BiodiversityDistribution and other classes.

Examples

# Load background raster
background <- terra::rast(system.file("extdata/europegrid_50km.tif",package = "ibis.iSDM"))
# Define model
x <- distribution(background)
x
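Building on the limits description above, a hedged sketch of enabling an MCP-based constraint (the buffer distance is illustrative and interpreted in map units):

```r
# Define a model whose predictions are limited to a buffered MCP
x_mcp <- distribution(background,
                      limits_method = "mcp",
                      mcp_buffer = 10000)
```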

Class for the trained Model object

Description

All trained Models inherit the options here plus any additional ones defined by the engine and inference.

Public fields

id

A character id for any trained model

name

A description of the model as character.

model

A list containing all input datasets and parameters to the model.

settings

A Settings object with information on inference.

fits

A list containing the prediction and fitted model.

.internals

A list containing previous fitted models.

Methods

Public methods


Method new()

Initializes the object and creates an empty list

Usage
DistributionModel$new(name)
Arguments
name

A description of the model as character.

Returns

NULL


Method get_name()

Return the name of the model

Usage
DistributionModel$get_name()
Returns

A character with the model name used.


Method print()

Print the names and summarizes the model within

Usage
DistributionModel$print()
Returns

A message on screen


Method show()

Show the name of the Model.

Usage
DistributionModel$show()
Returns

A character of the run name.


Method plot()

Plots the prediction if found.

Usage
DistributionModel$plot(what = "mean")
Arguments
what

character with the specific layer to be plotted.

Returns

A graphical representation of the prediction


Method plot_threshold()

Plots the thresholded prediction if found.

Usage
DistributionModel$plot_threshold(what = 1)
Arguments
what

character or numeric for the layer to be plotted.

Returns

A graphical representation of the thresholded prediction if found.


Method show_duration()

Show model run time if settings exist

Usage
DistributionModel$show_duration()
Returns

A numeric estimate of the duration it took to fit the models.


Method summary()

Get effects or importance tables from model

Usage
DistributionModel$summary(obj = "fit_best")
Arguments
obj

A character of which object to return.

Returns

A data.frame summarizing the model, usually its coefficients.


Method effects()

Generic plotting function for effect plots

Usage
DistributionModel$effects(x = "fit_best", what = "fixed", ...)
Arguments
x

A character for the object in question.

what

A character for the type of coefficients.

...

Any other options.

Returns

A graphical representation of the coefficients.


Method get_equation()

Get equation

Usage
DistributionModel$get_equation()
Returns

A formula of the inferred model.


Method get_data()

Get specific fit from this Model

Usage
DistributionModel$get_data(x = "prediction")
Arguments
x

A character stating what should be returned.

Returns

A SpatRaster object with the prediction.


Method get_model()

Small internal helper function to directly get the model object

Usage
DistributionModel$get_model()
Returns

A fitted model if existing.


Method set_data()

Set new fit for this Model.

Usage
DistributionModel$set_data(x, value)
Arguments
x

The name of the new fit.

value

The SpatRaster layer (or model) to be inserted.

Returns

This object.


Method get_thresholdvalue()

Get the threshold value if calculated

Usage
DistributionModel$get_thresholdvalue()
Returns

A numeric threshold value.


Method get_thresholdtype()

Get threshold type and format if calculated.

Usage
DistributionModel$get_thresholdtype()
Returns

A vector with a character method and numeric threshold value.


Method show_rasters()

List all rasters in object

Usage
DistributionModel$show_rasters()
Returns

A vector with logical flags for the various objects.


Method get_projection()

Get projection of the background.

Usage
DistributionModel$get_projection()
Returns

A geographic projection


Method get_resolution()

Get the resolution of the projection

Usage
DistributionModel$get_resolution()
Returns

A numeric vector with the spatial resolution of the prediction.


Method rm_threshold()

Remove calculated thresholds

Usage
DistributionModel$rm_threshold()
Returns

Invisible


Method calc_suitabilityindex()

Calculate a suitability index for a given projection

Usage
DistributionModel$calc_suitabilityindex(method = "normalize")
Arguments
method

The method used for normalization.

Details

Methods can either normalize the prediction by its minimum and maximum, or by the relative total using the sum of values.

Returns

Returns a SpatRaster.
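As a rough illustration of the two normalization options, here is a hedged base-terra sketch of the underlying arithmetic (the layer and its values are invented; this is not the method's actual implementation):

```r
library(terra)
# Hypothetical prediction layer standing in for a model output
pred <- rast(matrix(runif(100), 10, 10))

# "normalize": rescale by minimum and maximum to [0, 1]
rng <- minmax(pred)
idx_minmax <- (pred - rng[1]) / (rng[2] - rng[1])

# relative total: divide each cell by the sum of all values
idx_reltotal <- pred / global(pred, "sum", na.rm = TRUE)[, 1]
```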


Method get_centroid()

Get centroids of prediction layers

Usage
DistributionModel$get_centroid(patch = FALSE, layer = "mean")
Arguments
patch

A logical flag if the centroid should be calculated weighted by the layer values.

layer

character of the layer to use.

Returns

Returns a sf object.


Method has_limits()

Logical indication if the prediction was limited.

Usage
DistributionModel$has_limits()
Returns

A logical flag.


Method has_latent()

Logical indication if the prediction has added latent factors.

Usage
DistributionModel$has_latent()
Returns

A logical flag.


Method has_offset()

Has an offset been used?

Usage
DistributionModel$has_offset()
Returns

A logical flag.


Method mask()

Convenience function to mask all input datasets.

Usage
DistributionModel$mask(mask, inverse = FALSE, ...)
Arguments
mask

A SpatRaster or sf object.

inverse

A logical flag if the inverse should be masked instead.

...

Any other parameters passed on to mask

Returns

Invisible


Method save()

Save the prediction as output.

Usage
DistributionModel$save(fname, type = "gtif", dt = "FLT4S")
Arguments
fname

An output filename as character.

type

A format as character. Matched against a list of supported formats.

dt

The datatype used for writing, e.g. "FLT4S" (see the terra datatype options).

Returns

Saved spatial prediction on drive.


Method clone()

The objects of this class are cloneable with this method.

Usage
DistributionModel$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Note

Could be further prettified and commands outsourced.


Plot effects of trained model

Description

This function is a handy wrapper that calls the default plotting functions for the model of a specific engine. It is equivalent to calling the effects method of a fitted distribution object.

Usage

## S3 method for class 'DistributionModel'
effects(object, ...)

Arguments

object

Any fitted distribution object.

...

Not used.

Value

None.

Note

For some models, where default coefficients plots are not available, this function will attempt to generate partial dependency plots instead.

Examples

## Not run: 
# Where mod is an estimated distribution model
mod$effects()

## End(Not run)

Create an empty SpatRaster based on a template

Description

This function creates an empty copy of a provided SpatRaster object. It is primarily used in the package to create the outputs for the predictions.

Usage

emptyraster(x, ...)

Arguments

x

A SpatRaster object to serve as a template.

...

other arguments that can be passed to terra

Value

an empty SpatRaster, i.e. all cells are NA.

Examples

require(terra)
r <- rast(matrix(1:100, 5, 20))
emptyraster(r)

Engine for use of Bayesian Additive Regression Trees (BART)

Description

The Bayesian regression approach to a sum of complementary trees is to shrink the fit of each tree through a regularization prior. BART models provide non-linear, highly flexible estimation and have been shown to compare favourably among machine learning algorithms (Dorie et al. 2019). The default prior preference is for trees to be small (few terminal nodes) and for shrinkage towards 0.

This package requires the "dbarts" R-package to be installed. Many of the functionalities of this engine have been inspired by the "embarcadero" R-package. Users are therefore advised to cite if they make heavy use of BART.

Usage

engine_bart(x, iter = 1000, nburn = 250, chains = 4, type = "response", ...)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

iter

A numeric estimate of the number of trees to be used in the sum-of-trees formulation (Default: 1000).

nburn

A numeric estimate of the burn in samples (Default: 250).

chains

The number of MCMC chains to be used (Default: 4).

type

The type used for creating posterior predictions. Either "link" or "response" (Default: "response").

...

Other options.

Details

Prior distributions can furthermore be set for:

  • probability that a tree stops at a node of a given depth (Not yet implemented)

  • probability that a given variable is chosen for a splitting rule

  • probability of splitting that variable at a particular value (Not yet implemented)

Value

An Engine.

References

  • Carlson, CJ. embarcadero: Species distribution modelling with Bayesian additive regression trees in r. Methods Ecol Evol. 2020; 11: 850– 858. https://doi.org/10.1111/2041-210X.13389

  • Dorie, V., Hill, J., Shalit, U., Scott, M., & Cervone, D. (2019). Automated versus do-it-yourself methods for causal inference: Lessons learned from a data analysis competition. Statistical Science, 34(1), 43-68.

  • Vincent Dorie (2020). dbarts: Discrete Bayesian Additive Regression Trees Sampler. R package version 0.9-19. https://CRAN.R-project.org/package=dbarts

See Also

Other engine: engine_breg(), engine_gdb(), engine_glm(), engine_glmnet(), engine_inla(), engine_inlabru(), engine_scampr(), engine_stan(), engine_xgboost()

Examples

## Not run: 
# Add BART as an engine
x <- distribution(background) |> engine_bart(iter = 100)

## End(Not run)

Engine for Bayesian regularized regression models

Description

Efficient MCMC algorithm for linear regression models that makes use of 'spike-and-slab' priors for some modest regularization on the amount of posterior probability for a subset of the coefficients.

Usage

engine_breg(
  x,
  iter = 10000,
  nthread = getOption("ibis.nthread"),
  type = "response",
  ...
)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

iter

numeric on the number of MCMC iterations to run (Default: 10000).

nthread

numeric on the number of CPU-threads to use for data augmentation.

type

The mode used for creating posterior predictions. Either "link" or "response" (Default: "response").

...

Other non-specified parameters passed on to the model.

Details

This engine provides efficient Bayesian predictions through the "Boom" R-package. Note, however, that not all link functions and model families are supported, and certain functionalities such as offsets are generally not available. This engine allows the estimation of linear and non-linear effects via the "only_linear" option specified in train.

Value

An Engine.

References

  • Nguyen, K., Le, T., Nguyen, V., Nguyen, T., & Phung, D. (2016, November). Multiple kernel learning with data augmentation. In Asian Conference on Machine Learning (pp. 49-64). PMLR.

  • Steven L. Scott (2021). BoomSpikeSlab: MCMC for Spike and Slab Regression. R package version 1.2.4. https://CRAN.R-project.org/package=BoomSpikeSlab

See Also

Other engine: engine_bart(), engine_gdb(), engine_glm(), engine_glmnet(), engine_inla(), engine_inlabru(), engine_scampr(), engine_stan(), engine_xgboost()

Examples

## Not run: 
# Add BREG as an engine
x <- distribution(background) |> engine_breg(iter = 1000)

## End(Not run)

Use of Gradient Descent Boosting for model estimation

Description

Gradient descent boosting is an efficient way to optimize any loss function of a generalized linear or additive model (such as the GAMs available through the "mgcv" R-package). It furthermore automatically regularizes the fit, thus the resulting model only contains the covariates whose baselearners have some influence on the response. Depending on the type of the add_biodiversity data, either poisson process models or logistic regressions are estimated. If the "only_linear" term in train is set to FALSE, splines are added to the estimation, thus providing a non-linear additive inference.

Usage

engine_gdb(
  x,
  iter = 2000,
  learning_rate = 0.1,
  empirical_risk = "inbag",
  type = "response",
  ...
)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

iter

An integer giving the number of boosting iterations (Default: 2e3L).

learning_rate

A bounded numeric value between 0 and 1 defining the shrinkage parameter.

empirical_risk

method for empirical risk calculation. Available options are 'inbag', 'oobag' and 'none'. (Default: 'inbag').

type

The mode used for creating posterior predictions. Either "link", "response" or "class" (Default: "response").

...

Other variables or control parameters

Details

This engine requires the "mboost" R-package to be installed. In philosophy it is somewhat related to engine_xgboost and the "XGBoost" R-package, however it provides some additional desirable features that make estimation quicker and particularly useful for spatial projections, such as the ability to specifically add spatial baselearners via add_latent_spatial or the specification of monotonically constrained priors via GDBPrior.

Value

An engine.

Note

The coefficients resulting from gdb with poipa (Binomial) data are only half the size of the typical coefficients of a logit model obtained via glm (see Binomial).
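As a hedged, standalone illustration of this halving (using mboost directly rather than this package; the data are simulated, so estimates only agree approximately):

```r
library(mboost)
set.seed(42)
d <- data.frame(x = rnorm(500))
d$y <- factor(rbinom(500, 1, plogis(0.8 * d$x)))

# Standard logistic regression
f_glm <- glm(y ~ x, data = d, family = binomial())

# Component-wise boosting with mboost's Binomial family
f_gdb <- glmboost(y ~ x, data = d, family = Binomial(),
                  control = boost_control(mstop = 2000, nu = 0.1))

coef(f_glm)["x"]     # logit-scale coefficient
coef(f_gdb)["x"] * 2 # doubling recovers approximately the glm estimate
```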

References

  • Hofner, B., Mayr, A., Robinzonov, N., & Schmid, M. (2014). Model-based boosting in R: a hands-on tutorial using the R package mboost. Computational statistics, 29(1-2), 3-35.

  • Hofner, B., Müller, J., Hothorn, T., (2011). Monotonicity-constrained species distribution models. Ecology 92, 1895–901.

  • Mayr, A., Hofner, B. and Schmid, M. (2012). The importance of knowing when to stop - a sequential stopping rule for component-wise gradient boosting. Methods of Information in Medicine, 51, 178–186.

See Also

Other engine: engine_bart(), engine_breg(), engine_glm(), engine_glmnet(), engine_inla(), engine_inlabru(), engine_scampr(), engine_stan(), engine_xgboost()

Examples

## Not run: 
# Add GDB as an engine
x <- distribution(background) |> engine_gdb(iter = 1000)

## End(Not run)

Engine for Generalized linear models (GLM)

Description

This engine implements a basic generalized linear model (GLM) for creating species distribution models. The main purpose of this engine is to support a basic, dependency-free method for inference and projection that can be used within the package for examples and vignettes. That being said, the engine is fully functional like any other engine.

The basic implementation of GLMs here is part of a general class of linear models and has - with the exception of offsets - only minimal options to integrate other sources of information such as priors or joint integration. The general recommendation is to use engine_glmnet() instead for regularization support. However, basic GLMs can in some cases be useful for quick projections or for ensemble() of small models (a practice common for rare species).

Usage

engine_glm(x, control = NULL, type = "response", ...)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

control

A list containing parameters for controlling the fitting process (Default: NULL).

type

The mode used for creating posterior predictions. Either making "link" or "response" (Default: "response").

...

Other parameters passed on to stats::glm().

Details

This engine is essentially a wrapper for stats::glm.fit(), however with customized settings to support offsets and weights.

If "optim_hyperparam" is set to TRUE in train(), then a AIC based step-wise (backwards) model selection is performed. Generally however engine_glmnet should be the preferred package for models with more than >3 covariates.

Value

An Engine.

References

  • Hastie, T. J. and Pregibon, D. (1992) Generalized linear models. Chapter 6 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.

See Also

Other engine: engine_bart(), engine_breg(), engine_gdb(), engine_glmnet(), engine_inla(), engine_inlabru(), engine_scampr(), engine_stan(), engine_xgboost()

Examples

# Load background
background <- terra::rast(system.file('extdata/europegrid_50km.tif',
package='ibis.iSDM',mustWork = TRUE))

# Add GLM as an engine
x <- distribution(background) |> engine_glm()
print(x)

Engine for regularized regression models

Description

This engine allows the estimation of linear coefficients using ridge, lasso or elastic-net regression techniques. The backbone of this engine is the glmnet R-package, which is commonly used in SDMs, including by the popular 'maxnet' (e.g. Maxent) package. Ultimately this engine is an equivalent of engine_breg, but in a "frequentist" setting. If users aim to emulate a model that most closely resembles Maxent within the ibis.iSDM modelling framework, then this engine is the best way of doing so. Compared to the 'maxnet' R-package, a number of efficiency settings are implemented, in particular for cross-validation of alpha and lambda values.

A limited amount of prior information can be specified for this engine, specifically via offsets or as GLMNETPrior, which allows priors to be specified as regularization constants.

Usage

engine_glmnet(
  x,
  alpha = 0,
  nlambda = 100,
  lambda = NULL,
  type = "response",
  ...
)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

alpha

A numeric giving the elasticnet mixing parameter, which has to be between 0 and 1. alpha=1 is the lasso penalty, and alpha=0 the ridge penalty (Default: 0).

nlambda

A numeric giving the number of lambda values to be used (Default: 100).

lambda

A numeric with a user-supplied estimate of lambda. Usually it is best to let this parameter be determined automatically (Default: NULL).

type

The mode used for creating posterior predictions. Either "link" or "response" (Default: "response").

...

Other parameters passed on to glmnet.

Details

Regularized regressions are effectively GLMs that are fitted with ridge, lasso or elastic-net regularization. Which of them is chosen depends critically on the alpha value:

  • For alpha equal to 0 a ridge regularization is used. Ridge regularization has the property that it does not remove variables entirely, but instead shrinks their coefficients towards 0.

  • For alpha equal to 1 a lasso regularization is used. Lassos tend to fully remove those coefficients from the final model that do not improve the loss function.

  • For alpha values between 0 and 1 an elastic-net regularization is used, which is essentially a combination of the two.

The optimal lambda parameter can be determined via cross-validation. For this option set "varsel" in train() to "reg".
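A minimal standalone glmnet sketch of these alpha settings (simulated data, independent of the ibis.iSDM workflow):

```r
library(glmnet)
set.seed(1)
X <- matrix(rnorm(100 * 10), 100, 10)
y <- rbinom(100, 1, plogis(X[, 1] - 0.5 * X[, 2]))

fit_ridge <- glmnet(X, y, family = "binomial", alpha = 0)  # ridge penalty
fit_lasso <- glmnet(X, y, family = "binomial", alpha = 1)  # lasso penalty

# Cross-validation to pick lambda for an elastic-net mixture
cv <- cv.glmnet(X, y, family = "binomial", alpha = 0.5)
cv$lambda.min
```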

Value

An Engine.

References

  • Jerome Friedman, Trevor Hastie, Robert Tibshirani (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1-22. URL https://www.jstatsoft.org/v33/i01/.

  • Renner, I.W., Elith, J., Baddeley, A., Fithian, W., Hastie, T., Phillips, S.J., Popovic, G. and Warton, D.I., 2015. Point process models for presence‐only analysis. Methods in Ecology and Evolution, 6(4), pp.366-379.

  • Fithian, W. & Hastie, T. (2013) Finite-sample equivalence in statistical models for presence-only data. The Annals of Applied Statistics 7, 1917–1939

See Also

Other engine: engine_bart(), engine_breg(), engine_gdb(), engine_glm(), engine_inla(), engine_inlabru(), engine_scampr(), engine_stan(), engine_xgboost()

Examples

## Not run: 
# Add GLMNET as an engine
x <- distribution(background) |> engine_glmnet()

## End(Not run)

Use INLA as engine

Description

Allows a full Bayesian analysis of linear and additive models using Integrated Nested Laplace Approximation. This engine has been largely superseded by engine_inlabru and users are advised to use that engine instead, unless specific options are required.

Usage

engine_inla(
  x,
  optional_mesh = NULL,
  optional_projstk = NULL,
  max.edge = NULL,
  offset = NULL,
  cutoff = NULL,
  proj_stepsize = NULL,
  timeout = NULL,
  strategy = "auto",
  int.strategy = "eb",
  barrier = FALSE,
  type = "response",
  area = "gpc2",
  nonconvex.bdry = FALSE,
  nonconvex.convex = -0.15,
  nonconvex.concave = -0.05,
  nonconvex.res = 40,
  ...
)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

optional_mesh

A directly supplied "INLA" mesh (Default: NULL)

optional_projstk

A directly supplied projection stack. Useful if projection stack is identical for multiple species (Default: NULL)

max.edge

The largest allowed triangle edge length, must be in the same scale units as the coordinates. Default is an educated guess (Default: NULL).

offset

interpreted as a numeric factor relative to the approximate data diameter. Default is an educated guess (Default: NULL).

cutoff

The minimum allowed distance between points on the mesh. Default is an educated guess (Default: NULL).

proj_stepsize

The stepsize in coordinate units between cells of the projection grid (Default: NULL).

timeout

Specify a timeout for INLA models in seconds, after which the estimation is aborted.

strategy

Which approximation to use for the joint posterior. Options are "auto" (default), "adaptive", "gaussian", "simplified.laplace" & "laplace".

int.strategy

Integration strategy. Options are "auto","grid", "eb" ("default") & "ccd". See also https://groups.google.com/g/r-inla-discussion-group/c/hDboQsJ1Mls

barrier

Should a barrier model be added to the model?

type

The mode used for creating posterior predictions. Either summarizing the linear "predictor" or "response" (Default: "response").

area

Accepts a character denoting the type of area calculation to be done on the mesh (Default: 'gpc2').

nonconvex.bdry

Create a non-convex boundary hull instead (Default: FALSE). Not yet implemented.

nonconvex.convex

Non-convex minimal extension radius for convex curvature. Not yet implemented.

nonconvex.concave

Non-convex minimal extension radius for concave curvature. Not yet implemented.

nonconvex.res

Computation resolution for the non-convex hull. Not yet implemented.

...

Other options.

Details

All INLA engines require the specification of a mesh that needs to be provided to the "optional_mesh" parameter. Otherwise the mesh will be created based on best guesses of the data spread. A good mesh needs to have triangles as regular as possible in size and shape: equilateral.

* "max.edge": The largest allowed triangle edge length, must be in the same scale units as the coordinates Lower bounds affect the density of triangles * "offset": The automatic extension distance of the mesh If positive: same scale units. If negative, interpreted as a factor relative to the approximate data diameter i.e., a value of -0.10 will add a 10% of the data diameter as outer extension. * "cutoff": The minimum allowed distance between points, it means that points at a closer distance than the supplied value are replaced by a single vertex. it is critical when there are some points very close to each other, either for point locations or in the domain boundary. * "proj_stepsize": The stepsize for spatial predictions, which affects the spatial grain of any outputs created.

Priors can be set via INLAPrior.
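A hedged sketch of supplying a custom mesh (the coordinates here are invented, and `background` is assumed to be defined as in the other examples):

```r
library(INLA)
# Hypothetical point coordinates (two-column matrix) spanning the study region
coords <- cbind(runif(100, 5, 15), runif(100, 45, 55))

# Triangles no larger than 0.5 units inside the domain, 2 units in the extension
mesh <- inla.mesh.2d(loc = coords,
                     max.edge = c(0.5, 2),
                     offset = c(0.5, 1),
                     cutoff = 0.1)

# Supply the mesh directly instead of relying on the automatic guess
x <- distribution(background) |> engine_inla(optional_mesh = mesh)
```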

Value

An engine.

Note

How INLA meshes are generated substantially influences prediction outcomes. See Dambly et al. (2023).

References

  • Havard Rue, Sara Martino, and Nicholas Chopin (2009), Approximate Bayesian Inference for Latent Gaussian Models Using Integrated Nested Laplace Approximations (with discussion), Journal of the Royal Statistical Society B, 71, 319-392.

  • Finn Lindgren, Havard Rue, and Johan Lindstrom (2011). An Explicit Link Between Gaussian Fields and Gaussian Markov Random Fields: The Stochastic Partial Differential Equation Approach (with discussion), Journal of the Royal Statistical Society B, 73(4), 423-498.

  • Simpson, Daniel, Janine B. Illian, S. H. Sørbye, and Håvard Rue. 2016. “Going Off Grid: Computationally Efficient Inference for Log-Gaussian Cox Processes.” Biometrika 1 (103): 49–70.

  • Dambly, L. I., Isaac, N. J., Jones, K. E., Boughey, K. L., & O'Hara, R. B. (2023). Integrated species distribution models fitted in INLA are sensitive to mesh parameterisation. Ecography, e06391.

See Also

Other engine: engine_bart(), engine_breg(), engine_gdb(), engine_glm(), engine_glmnet(), engine_inlabru(), engine_scampr(), engine_stan(), engine_xgboost()

Examples

## Not run: 
# Add INLA as an engine (with a custom mesh)
x <- distribution(background) |> engine_inla(optional_mesh = my_mesh)

## End(Not run)

Use inlabru as engine

Description

Model components are specified with general inputs and mapping methods to the latent variables, and the predictors are specified via general R expressions, with separate expressions for each observation likelihood model in multi-likelihood models. The inlabru engine - similar to the engine_inla function - acts as a wrapper for INLA, albeit "inlabru" has a number of convenience functions implemented that make in particular predictions with new data much more straightforward (e.g. via posterior simulation instead of refitting). Since more recent versions "inlabru" also supports the addition of multiple likelihoods, therefore allowing fully integrated inference.

Usage

engine_inlabru(
  x,
  optional_mesh = NULL,
  max.edge = NULL,
  offset = NULL,
  cutoff = NULL,
  proj_stepsize = NULL,
  strategy = "auto",
  int.strategy = "eb",
  area = "gpc2",
  timeout = NULL,
  type = "response",
  ...
)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

optional_mesh

A directly supplied "INLA" mesh (Default: NULL)

max.edge

The largest allowed triangle edge length, must be in the same scale units as the coordinates. Default is an educated guess (Default: NULL).

offset

interpreted as a numeric factor relative to the approximate data diameter. Default is an educated guess (Default: NULL).

cutoff

The minimum allowed distance between points on the mesh. Default is an educated guess (Default: NULL).

proj_stepsize

The stepsize in coordinate units between cells of the projection grid (Default: NULL)

strategy

Which approximation to use for the joint posterior. Options are "auto" (default), "adaptive", "gaussian", "simplified.laplace" & "laplace".

int.strategy

Integration strategy. Options are "auto", "grid", "eb" ("default") & "ccd".

area

Accepts a character denoting the type of area calculation to be done on the mesh (Default: 'gpc2').

timeout

Specify a timeout for INLA models in seconds, after which the estimation is aborted.

type

The mode used for creating posterior predictions. Either summarizing the linear "predictor" or "response" (Default:"response").

...

Other variables

Details

All INLA engines require the specification of a mesh that needs to be provided to the "optional_mesh" parameter. Otherwise the mesh will be created based on best guesses of the data spread. A good mesh needs to have triangles as regular as possible in size and shape: equilateral.

* "max.edge": The largest allowed triangle edge length, must be in the same scale units as the coordinates Lower bounds affect the density of triangles * "offset": The automatic extension distance of the mesh If positive: same scale units. If negative, interpreted as a factor relative to the approximate data diameter i.e., a value of -0.10 will add a 10% of the data diameter as outer extension. * "cutoff": The minimum allowed distance between points, it means that points at a closer distance than the supplied value are replaced by a single vertex. it is critical when there are some points very close to each other, either for point locations or in the domain boundary. * "proj_stepsize": The stepsize for spatial predictions, which affects the spatial grain of any outputs created.

Priors can be set via INLAPrior.

Value

An Engine.

Note

How INLA meshes are generated substantially influences prediction outcomes. See Dambly et al. (2023).

Source

https://inlabru-org.github.io/inlabru/articles/

References

  • Bachl, F. E., Lindgren, F., Borchers, D. L., & Illian, J. B. (2019). inlabru: an R package for Bayesian spatial modelling from ecological survey data. Methods in Ecology and Evolution, 10(6), 760-766.

  • Simpson, Daniel, Janine B. Illian, S. H. Sørbye, and Håvard Rue. 2016. “Going Off Grid: Computationally Efficient Inference for Log-Gaussian Cox Processes.” Biometrika 1 (103): 49–70.

  • Dambly, L. I., Isaac, N. J., Jones, K. E., Boughey, K. L., & O'Hara, R. B. (2023). Integrated species distribution models fitted in INLA are sensitive to mesh parameterisation. Ecography, e06391.

See Also

Other engine: engine_bart(), engine_breg(), engine_gdb(), engine_glm(), engine_glmnet(), engine_inla(), engine_scampr(), engine_stan(), engine_xgboost()

Examples

## Not run: 
# Add inlabru as an engine
x <- distribution(background) |> engine_inlabru()

## End(Not run)

Engine for process models using scampr

Description

Similar to other engines, this engine enables the fitting and prediction of log-Gaussian Cox process (LGCP) and inhomogeneous Poisson process (IPP) models. It uses the scampr package, which fits models via maximum likelihood estimation through TMB (Template Model Builder).

It also supports the addition of spatial latent effects, which can be added via Gaussian fields, approximated through 'FRK' (Fixed Rank Kriging) and integrated out using either variational or Laplace approximation.

The main use case for this engine is as an alternative to engine_inlabru() and engine_inla() for fitting iSDMs, e.g. those combining both presence-only and presence-absence point occurrence data.

Usage

engine_scampr(x, type = "response", dens = "posterior", maxit = 500, ...)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

type

The mode used for creating (posterior or prior) predictions. Either "link" or "response" (Default: "response").

dens

A character on how predictions are made, either from the "posterior" (Default) or "prior".

maxit

A numeric on the number of iterations for the optimizer (Default: 500).

...

Other parameters passed on.

Details

This engine may only be used to predict for one or two datasets at most. It supports only presence-only PPMs, presence-absence binary GLMs, or an 'IDM' (integrated data model).

Value

An Engine.

Note

  • The package can currently only be installed directly from GitHub ("ElliotDovers/scampr").

  • Presence-absence models in scampr currently only support the cloglog link function!

References

  • Dovers, E., Popovic, G. C., & Warton, D. I. (2024). A fast method for fitting integrated species distribution models. Methods in Ecology and Evolution, 15(1), 191-203.

  • Dovers, E., Stoklosa, D., and Warton D. I. (2024). Fitting log-Gaussian Cox processes using generalized additive model software. The American Statistician, 1-17.

See Also

Other engine: engine_bart(), engine_breg(), engine_gdb(), engine_glm(), engine_glmnet(), engine_inla(), engine_inlabru(), engine_stan(), engine_xgboost()

Examples

## Not run: 
# Load background
background <- terra::rast(system.file('extdata/europegrid_50km.tif',
package='ibis.iSDM',mustWork = TRUE))

# Add scampr as an engine
x <- distribution(background) |> engine_scampr()

## End(Not run)

Use Stan as engine

Description

Stan is a probabilistic programming language that can be used to specify most types of statistical linear and non-linear regression models. Stan provides full Bayesian inference for continuous-variable models through Markov chain Monte Carlo methods such as the No-U-Turn sampler, an adaptive form of Hamiltonian Monte Carlo sampling. Stan code has to be written separately, and this function acts as a compiler to build the Stan model. Requires the "cmdstanr" package to be installed!

Usage

engine_stan(
  x,
  chains = 4,
  iter = 2000,
  warmup = floor(iter/2),
  init = "random",
  cores = getOption("ibis.nthread"),
  algorithm = "sampling",
  control = list(adapt_delta = 0.95),
  type = "response",
  ...
)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

chains

A positive integer specifying the number of Markov chains (Default: 4 chains).

iter

A positive integer specifying the number of iterations for each chain (including warmup). (Default: 2000).

warmup

A positive integer specifying the number of warmup (aka burnin) iterations per chain. If step-size adaptation is on (Default: TRUE), this also controls the number of iterations for which adaptation is run (and hence these warmup samples should not be used for inference). The number of warmup iterations should be smaller than iter and the default is iter/2.

init

Initial values for parameters (Default: 'random'). Can also be specified as list (see: "rstan::stan")

cores

If set to NULL, takes the value from the ibis option getOption('ibis.nthread').

algorithm

Mode used to sample from the posterior. Available options are "sampling", "optimize", or "variational". See "cmdstanr" package for more details. (Default: "sampling").

control

See "rstan::stan" for more details on specifying the controls.

type

The mode used for creating posterior predictions. Either summarizing the linear "predictor" or "response" (Default: "response").

...

Other variables

Details

By default the posterior is obtained through sampling; however, Stan also supports approximate inference through penalized maximum likelihood estimation (see Carpenter et al. 2017).

Value

An Engine.

Note

The function obj$stancode() can be used to print out the stancode of the model.

References

  • Jonah Gabry and Rok Češnovar (2021). cmdstanr: R Interface to 'CmdStan'. https://mc-stan.org/cmdstanr, https://discourse.mc-stan.org.

  • Carpenter, B., Gelman, A., Hoffman, M. D., Lee, D., Goodrich, B., Betancourt, M., ... & Riddell, A. (2017). Stan: A probabilistic programming language. Journal of statistical software, 76(1), 1-32.

  • Piironen, J., & Vehtari, A. (2017). Sparsity information and regularization in the horseshoe and other shrinkage priors. Electronic Journal of Statistics, 11(2), 5018-5051.

See Also

rstan, cmdstanr

Other engine: engine_bart(), engine_breg(), engine_gdb(), engine_glm(), engine_glmnet(), engine_inla(), engine_inlabru(), engine_scampr(), engine_xgboost()

Examples

## Not run: 
# Add Stan as an engine
x <- distribution(background) |> engine_stan(iter = 1000)

## End(Not run)
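A slightly fuller sketch, assuming prepared background and biodiversity objects as in the other engine examples; the approximate "variational" algorithm and the stancode() inspection follow the Arguments and Note above.

```r
## Not run: 
# Faster approximate inference via variational Bayes instead of full sampling
x <- distribution(background) |>
  add_biodiversity_poipo(points) |>
  engine_stan(iter = 2000, chains = 2, algorithm = "variational")

# After training, the generated Stan code can be printed (see Note)
mod <- train(x)
mod$stancode()

## End(Not run)
```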

Engine for extreme gradient boosting (XGBoost)

Description

Allows the estimation of eXtreme gradient boosting for tree-based or linear boosting regressions. The XGBoost engine is a flexible, yet powerful engine with many customization options, supporting multiple options to perform single and multi-class regression and classification tasks. For a full list of options users are advised to have a look at the xgboost::xgb.train help file and https://xgboost.readthedocs.io.

Usage

engine_xgboost(
  x,
  booster = "gbtree",
  iter = 8000L,
  learning_rate = 0.001,
  gamma = 6,
  reg_lambda = 0,
  reg_alpha = 0,
  max_depth = 2,
  subsample = 0.75,
  colsample_bytree = 0.4,
  min_child_weight = 3,
  nthread = getOption("ibis.nthread"),
  ...
)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

booster

A character of the booster to use. Either "gbtree" or "gblinear" (Default: "gbtree").

iter

A numeric value giving the maximum number of boosting iterations for cross-validation (Default: 8e3L).

learning_rate

A numeric value indicating the learning rate (eta). Lower values are generally better but also computationally more costly (Default: 1e-3).

gamma

numeric A regularization parameter in the model. Lower values for better estimates (Default: 6). Also see the "reg_lambda" parameter for the L2 regularization on the weights.

reg_lambda

numeric L2 regularization term on weights (Default: 0).

reg_alpha

numeric L1 regularization term on weights (Default: 0).

max_depth

numeric The maximum depth of a tree (Default: 2).

subsample

numeric The ratio used for subsampling to prevent overfitting. Also used for creating a random testing dataset (Default: 0.75).

colsample_bytree

numeric Sub-sample ratio of columns when constructing each tree (Default: 0.4).

min_child_weight

numeric Broadly related to the number of instances necessary for each node (Default: 3).

nthread

A numeric giving the number of CPU threads to use.

...

Other parameters passed on to the engine.

Details

The default parameters have been set relatively conservatively so as to reduce overfitting.

XGBoost supports the specification of monotonic constraints on certain variables. Within ibis this is possible via XGBPrior. However, constraints are available only for the "gbtree" baselearners.

Value

An Engine.

Note

'Machine learning is statistics minus any checking of models and assumptions' ~ Brian D. Ripley, useR! 2004, Vienna

References

  • Tianqi Chen and Carlos Guestrin, "XGBoost: A Scalable Tree Boosting System", 22nd SIGKDD Conference on Knowledge Discovery and Data Mining, 2016, https://arxiv.org/abs/1603.02754

See Also

xgboost::xgb.train

Other engine: engine_bart(), engine_breg(), engine_gdb(), engine_glm(), engine_glmnet(), engine_inla(), engine_inlabru(), engine_scampr(), engine_stan()

Examples

## Not run: 
# Add xgboost as an engine
x <- distribution(background) |> engine_xgboost(iter = 4000)

## End(Not run)
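Building on the Details above, a hedged sketch of adding a monotonic constraint through XGBPrior; the predictor name "forest" and the surrounding objects are illustrative assumptions.

```r
## Not run: 
# Monotonic constraints require the "gbtree" booster
p <- XGBPrior(variable = "forest", hyper = "increasing")

x <- distribution(background) |>
  add_predictors(covariates) |>
  add_priors(priors(p)) |>
  engine_xgboost(booster = "gbtree", iter = 4000)

## End(Not run)
```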

Engine class description

Description

Basic object for engine, all other engines inherit from here.

Public fields

engine

The class name of the engine.

name

The name of the engine

data

Any data or parameters necessary to make this engine work.

Methods

Public methods


Method new()

Initializes the object and creates an empty list

Usage
Engine$new(engine, name)
Arguments
engine

The class name of the engine.

name

The name of the engine

Returns

NULL


Method print()

Print the Engine name

Usage
Engine$print()
Returns

A message on screen


Method show()

Alias that calls print.

Usage
Engine$show()
Returns

A message on screen


Method get_class()

Get class description

Usage
Engine$get_class()
Returns

A character with the class as saved in engine


Method get_data()

Get specific data from this engine

Usage
Engine$get_data(x)
Arguments
x

A character with the name of the data entry to be returned from the engine.

Returns

A list with the data.


Method list_data()

List all data

Usage
Engine$list_data()
Returns

A character vector of the data entries.


Method set_data()

Set data for this engine

Usage
Engine$set_data(x, value)
Arguments
x

A character with the name or id of this dataset.

value

A new list of parameters.

Returns

Invisible


Method get_self()

Dummy function to get self object

Usage
Engine$get_self()
Returns

This object


Method clone()

The objects of this class are cloneable with this method.

Usage
Engine$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.
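
Examples

A minimal sketch of the R6 interface described above. Engines are normally created through the engine_*() functions rather than instantiated directly, so this is illustrative only.

```r
## Not run: 
# Create a bare engine object and attach some data
eng <- Engine$new(engine = "GLM-Engine", name = "<GLM>")
eng$set_data("params", list(control = "none"))

# Inspect what is stored
eng$list_data()          # names of data entries
eng$get_data("params")   # retrieve the stored list

## End(Not run)
```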


Function to create an ensemble of multiple fitted models

Description

Ensembles calculated across multiple models have often been shown to outperform any single model in comparative assessments (Valavi et al. 2022).

This function creates an ensemble of multiple provided distribution models fitted with the ibis.iSDM-package. Each model has to have estimated predictions with a given method and optional uncertainty in form of the standard deviation or similar. Through the layer parameter it can be specified which part of the prediction should be averaged in an ensemble. This can be for instance the mean prediction and/or the standard deviation sd. See Details below for an overview of the different methods.

The function also returns a coefficient of variation (cv) as part of the ensemble output. Note that this should not be interpreted as a measure of model uncertainty, as it cannot capture the parameter uncertainty of individual models; rather, it reflects the variation among predictions, which can be due to many factors including simply differences in model complexity.

Usage

ensemble(
  ...,
  method = "mean",
  weights = NULL,
  min.value = NULL,
  layer = "mean",
  normalize = FALSE,
  uncertainty = "cv",
  point = NULL,
  field_occurrence = "observed",
  apply_threshold = TRUE
)

## S4 method for signature 'ANY'
ensemble(
  ...,
  method = "mean",
  weights = NULL,
  min.value = NULL,
  layer = "mean",
  normalize = FALSE,
  uncertainty = "cv",
  point = NULL,
  field_occurrence = "observed",
  apply_threshold = TRUE
)

Arguments

...

Provided DistributionModel or SpatRaster objects.

method

Approach on how the ensemble is to be created. See details for available options (Default: 'mean').

weights

(Optional) weights provided to the ensemble function if weighted means are to be constructed (Default: NULL).

min.value

An optional numeric stating a minimum value that needs to be surpassed in each layer before calculating an ensemble (Default: NULL).

layer

A character of the layer to be taken from each prediction (Default: 'mean'). If set to NULL ignore any of the layer names in ensembles of SpatRaster objects.

normalize

logical on whether the inputs of the ensemble should be normalized to a scale of 0-1 (Default: FALSE).

uncertainty

A character indicating how the uncertainty among models should be calculated. Available options include "none", the standard deviation ("sd"), the average of all PCA axes except the first "pca", the coefficient of variation ("cv", Default) or the range between the lowest and highest value ("range").

point

A sf object containing observational data used for model training. Used for method 'superlearner' only (Default: NULL).

field_occurrence

A character location of biodiversity point records (Default: 'observed').

apply_threshold

A logical flag (Default: TRUE) specifying whether threshold values should also be created via "method". Only applies to DistributionModel inputs for which thresholds have been computed.

Details

Possible options for creating an ensemble includes:

  • 'mean' - Calculates the mean of several predictions.

  • 'median' - Calculates the median of several predictions.

  • 'max' - The maximum value across predictions.

  • 'min' - The minimum value across predictions.

  • 'mode' - The mode/modal values as the most commonly occurring value.

  • 'weighted.mean' - Calculates a weighted mean. Weights have to be supplied separately (e.g. TSS).

  • 'min.sd' - Ensemble created by minimizing the uncertainty among predictions.

  • 'threshold.frequency' - Returns an ensemble based on threshold frequency (simple count). Requires thresholds to be computed.

  • 'pca' - Calculates a PCA between predictions of each algorithm and then extract the first axis (the one explaining the most variation).

  • 'superlearner' - Composites two predictions through a 'meta-model' fitted on top (using a glm by default). Requires binomial data in the current setup.

In addition to the different ensemble methods, a minimal threshold (min.value) can be set that needs to be surpassed for averaging. By default this option is not used (Default: NULL).

Note by default only the band in the layer parameter is composited. If supported by the model other summary statistics from the posterior (e.g. 'sd') can be specified.

Value

A SpatRaster object containing the ensemble of the provided predictions specified by method and a coefficient of variation across all models.

Note

If a list is supplied, then it is assumed that each entry in the list is a fitted DistributionModel object. Take care not to create an ensemble of models constructed with different link functions, e.g. logistic vs log. In this case the "normalize" parameter has to be set.

References

  • Valavi, R., Guillera‐Arroita, G., Lahoz‐Monfort, J. J., & Elith, J. (2022). Predictive performance of presence‐only species distribution models: a benchmark study with reproducible code. Ecological Monographs, 92(1), e01486.

Examples

# Method works for fitted models as well as rasters
r1 <- terra::rast(nrows = 10, ncols = 10, res = 0.05, xmin = -1.5,
 xmax = 1.5, ymin = -1.5, ymax = 1.5, vals = rnorm(3600,mean = .5,sd = .1))
r2 <- terra::rast(nrows = 10, ncols = 10, res = 0.05, xmin = -1.5,
 xmax = 1.5, ymin = -1.5, ymax = 1.5, vals = rnorm(3600,mean = .5,sd = .5))
names(r1) <- names(r2) <- "mean"

# Assumes previously computed predictions
ex <- ensemble(r1, r2, method = "mean")

terra::plot(ex)
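
Extending the example above, a weighted ensemble can be sketched as follows; the weights are arbitrary here and would typically come from evaluation scores such as TSS.

```r
# Weighted ensemble giving the first prediction twice the influence
ex_w <- ensemble(r1, r2, method = "weighted.mean", weights = c(2, 1),
                 normalize = TRUE)
terra::plot(ex_w)
```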

Function to create an ensemble of partial effects from multiple models

Description

Similar to the ensemble() function, this function creates an ensemble of partial responses of provided distribution models fitted with the ibis.iSDM-package. Through the layer parameter it can be specified which part of the partial prediction should be averaged in an ensemble (if given). This can be for instance the mean prediction and/or the standard deviation sd. ensemble_partial() is also called if more than one input DistributionModel object is provided to partial.

By default the ensemble of partial responses is created as average across all models with the uncertainty being the standard deviation of responses.

Usage

ensemble_partial(
  ...,
  x.var,
  method = "mean",
  layer = "mean",
  newdata = NULL,
  normalize = TRUE
)

## S4 method for signature 'ANY'
ensemble_partial(
  ...,
  x.var,
  method = "mean",
  layer = "mean",
  newdata = NULL,
  normalize = TRUE
)

Arguments

...

Provided DistributionModel objects from which partial responses can be called. In the future provided data.frames might be supported as well.

x.var

A character of the variable from which an ensemble is to be created.

method

Approach on how the ensemble is to be created. See details for options (Default: 'mean').

layer

A character of the layer to be taken from each prediction (Default: 'mean'). If set to NULL ignore any of the layer names in ensembles of SpatRaster objects.

newdata

An optional data.frame or SpatRaster object supplied to the model (Default: NULL). This object needs to have identical names as the original predictors.

normalize

logical on whether the inputs of the ensemble should be normalized to a scale of 0-1 (Default: TRUE).

Details

Possible options for creating an ensemble includes:

  • 'mean' - Calculates the mean of several predictions.

  • 'median' - Calculates the median of several predictions.

Value

A data.frame with the combined partial effects of the supplied models.

Note

If a list is supplied, then it is assumed that each entry in the list is a fitted DistributionModel object. Take care not to create an ensemble of models constructed with different link functions, e.g. logistic vs log. By default the response functions of each model are normalized.

Examples

## Not run: 
 # Assumes previously computed models
 ex <- ensemble_partial(mod1, mod2, mod3, method = "mean")

## End(Not run)

Function to create an ensemble of spartial effects from multiple models

Description

Similar to the ensemble() function, this function creates an ensemble of spatial partial (spartial) responses of provided distribution models fitted with the ibis.iSDM-package. Through the layer parameter it can be specified which part of the spartial prediction should be averaged in an ensemble (if given). This can be for instance the mean prediction and/or the standard deviation sd.

By default the ensemble of partial responses is created as average across all models with the uncertainty being the standard deviation of responses.

Usage

ensemble_spartial(
  ...,
  x.var,
  method = "mean",
  layer = "mean",
  newdata = NULL,
  min.value = NULL,
  normalize = TRUE
)

## S4 method for signature 'ANY'
ensemble_spartial(
  ...,
  x.var,
  method = "mean",
  layer = "mean",
  newdata = NULL,
  min.value = NULL,
  normalize = TRUE
)

Arguments

...

Provided DistributionModel objects from which partial responses can be called. In the future provided data.frames might be supported as well.

x.var

A character of the variable from which an ensemble is to be created.

method

Approach on how the ensemble is to be created. See details for options (Default: 'mean').

layer

A character of the layer to be taken from each prediction (Default: 'mean'). If set to NULL ignore any of the layer names in ensembles of SpatRaster objects.

newdata

An optional data.frame or SpatRaster object supplied to the model (Default: NULL). This object needs to have identical names as the original predictors.

min.value

An optional numeric stating a minimum value that needs to be surpassed in each layer before calculating an ensemble (Default: NULL).

normalize

logical on whether the inputs of the ensemble should be normalized to a scale of 0-1 (Default: TRUE).

Details

Possible options for creating an ensemble includes:

  • 'mean' - Calculates the mean of several predictions.

  • 'median' - Calculates the median of several predictions.

Value

A SpatRaster object with the combined partial effects of the supplied models.

Note

If a list is supplied, then it is assumed that each entry in the list is a fitted DistributionModel object. Take care not to create an ensemble of models constructed with different link functions, e.g. logistic vs log. By default the response functions of each model are normalized.

Examples

## Not run: 
 # Assumes previously computed models
 ex <- ensemble_spartial(mod1, mod2, mod3, method = "mean")

## End(Not run)

Function to format a prepared GLOBIOM netCDF file for use in Ibis.iSDM

Description

This function expects a downscaled GLOBIOM output as created in the BIOCLIMA project. Likely of little use for anyone outside IIASA.

Usage

formatGLOBIOM(
  fname,
  oftype = "raster",
  ignore = NULL,
  period = "all",
  template = NULL,
  shares_to_area = FALSE,
  use_gdalutils = FALSE,
  verbose = getOption("ibis.setupmessages", default = TRUE)
)

Arguments

fname

A filename in character pointing to a GLOBIOM output in netCDF format.

oftype

A character denoting the output type (Default: 'raster').

ignore

A vector of variables to be ignored (Default: NULL).

period

A character limiting the period to be returned from the formatted data. Options include "reference" for the first entry, "projection" for all entries but the first, and "all" for all entries (Default: "all").

template

An optional SpatRaster object onto which the projections should be transformed.

shares_to_area

A logical on whether shares should be corrected to areas (if identified).

use_gdalutils

(Deprecated) A logical on whether to use the gdalutils workaround.

verbose

logical on whether to be chatty.

Value

A SpatRaster stack with the formatted GLOBIOM predictors.

Examples

## Not run: 
# Expects a filename pointing to a netCDF file.
covariates <- formatGLOBIOM(fname)

## End(Not run)

Monotonic constrained priors for boosted regressions

Description

Monotonic constraints for gradient descent boosting models do not work in the same way as other priors, where a specific coefficient or magnitude of importance is specified. Rather, monotonic constraints enforce a specific directionality of regression coefficients, so that for instance a coefficient has to be positive or negative.

Important: Specifying a monotonic constraint for the engine_gdb does not guarantee that the variable is retained in the model, as it can still be regularized out.

Usage

GDBPrior(variable, hyper = "increasing", ...)

## S4 method for signature 'character'
GDBPrior(variable, hyper = "increasing", ...)

Arguments

variable

A character matched against existing predictor variables.

hyper

A character object describing the type of constraint. Available options are 'increasing', 'decreasing', 'convex', 'concave', 'positive', 'negative' or 'none'.

...

Variables passed on to prior object.

Note

Similar priors can also be defined for the engine_xgboost via XGBPrior().

References

  • Hofner, B., Müller, J., & Hothorn, T. (2011). Monotonicity‐constrained species distribution models. Ecology, 92(10), 1895-1901.

See Also

Prior, XGBPrior

Other prior: BARTPrior(), BARTPriors(), BREGPrior(), BREGPriors(), GDBPriors(), GLMNETPrior(), GLMNETPriors(), INLAPrior(), INLAPriors(), STANPrior(), STANPriors(), XGBPrior(), XGBPriors(), add_priors(), get_priors(), priors(), rm_priors()
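
Examples

A hedged usage sketch for this prior (this help entry lacks an Examples section); the predictor name and model objects are illustrative assumptions.

```r
## Not run: 
# Enforce an increasing response for the predictor "forest"
p <- GDBPrior(variable = "forest", hyper = "increasing")

# Add the prior to a model estimated with the gradient boosting engine
x <- distribution(background) |>
  add_predictors(covariates) |>
  add_priors(priors(p)) |>
  engine_gdb()

## End(Not run)
```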


Helper function when multiple variables are supplied for a GDB prior

Description

This is a helper function to specify several GDBPrior with the same hyper-parameters, but different variables.

Usage

GDBPriors(variable, hyper = "increasing", ...)

## S4 method for signature 'character'
GDBPriors(variable, hyper = "increasing", ...)

Arguments

variable

A character matched against existing predictor variables.

hyper

A character object describing the type of constraint. Available options are 'increasing', 'decreasing', 'convex', 'concave', 'positive', 'negative' or 'none'.

...

Variables passed on to prior object.

See Also

Other prior: BARTPrior(), BARTPriors(), BREGPrior(), BREGPriors(), GDBPrior(), GLMNETPrior(), GLMNETPriors(), INLAPrior(), INLAPriors(), STANPrior(), STANPriors(), XGBPrior(), XGBPriors(), add_priors(), get_priors(), priors(), rm_priors()


Small helper function to obtain predictions from an object

Description

This function is a short helper function to return the fitted data from a DistributionModel or BiodiversityScenario object. It can be used to easily obtain for example the estimated prediction from a model or the projected scenario from a scenario() object.

Usage

get_data(obj, what = NULL)

## S4 method for signature 'ANY'
get_data(obj, what = NULL)

Arguments

obj

Provided DistributionModel or BiodiversityScenario object.

what

A character of specific layer to be returned if existing (Default: NULL).

Value

A SpatRaster or "stars" object depending on the input.

Note

This function is essentially identical to querying the internal function x$get_data() from the object. However it does attempt some lazy character matching if what is supplied.

Examples

## Not run: 
 # Assumes previously computed model
 get_data(fit)

## End(Not run)

Function to extract nearest neighbour predictor values of provided points

Description

This function performs nearest neighbour matching between biodiversity observations and independent predictors, and operates directly on provided data.frames. Note that despite being parallelized this function can be rather slow for large volumes of data!

Usage

get_ngbvalue(
  coords,
  env,
  longlat = TRUE,
  field_space = c("x", "y"),
  cheap = FALSE,
  ...
)

Arguments

coords

A matrix, data.frame or sf object.

env

A data.frame object with the predictors.

longlat

A logical variable indicating whether the projection is long-lat.

field_space

A vector highlighting the columns from which coordinates are to be extracted (Default: c('x','y')).

cheap

A logical variable indicating whether a cheaper, approximate distance computation should be used. Can help for large datasets.

...

other options.

Details

Nearest neighbour matching is done via the geodist R-package (geodist::geodist).

Value

A data.frame with the extracted covariate data from each provided data point.

Note

If multiple values are of equal distance during the nearest neighbour check, then the result is by default averaged.

References

  • Mark Padgham and Michael D. Sumner (2021). geodist: Fast, Dependency-Free Geodesic Distance Calculations. R package version 0.0.7. https://CRAN.R-project.org/package=geodist

Examples

## Not run: 
 # Create matchup table
tab <- get_ngbvalue( coords = coords, # Coordinates
                     env = env # Data.frame with covariates and coordinates
                  )

## End(Not run)

Create priors from an existing distribution model

Description

Often it can make sense to fit an additional model to get a grasp on the range of values that "beta" parameters can take. This function takes an existing BiodiversityDistribution object and creates a PriorList object from it. The resulting object can be used to add for instance priors to a new model.

Usage

get_priors(mod, target_engine, ...)

## S4 method for signature 'ANY,character'
get_priors(mod, target_engine, ...)

Arguments

mod

A fitted DistributionModel object. If instead a BiodiversityDistribution object is passed to this function, it simply returns the contained priors used for estimation (if any).

target_engine

A character denoting the engine for which the priors should be created.

...

Other parameters passed down.

Note

Not all engines support priors in similar ways. See the vignettes and help pages on that topic!

See Also

Other prior: BARTPrior(), BARTPriors(), BREGPrior(), BREGPriors(), GDBPrior(), GDBPriors(), GLMNETPrior(), GLMNETPriors(), INLAPrior(), INLAPriors(), STANPrior(), STANPriors(), XGBPrior(), XGBPriors(), add_priors(), priors(), rm_priors()

Examples

## Not run: 
 mod <- distribution(background) |>
    add_predictors(covariates) |>
    add_biodiversity_poipo(points) |>
    engine_inlabru() |>
    train()
 get_priors(mod, target_engine = "BART")

## End(Not run)

Function to extract point values directly from a SpatRaster

Description

This function simply extracts the values from a provided SpatRaster, SpatRasterDataset or SpatRasterCollection object. For points where NA values were extracted, a small buffer is applied to try and obtain the remaining values.

Usage

get_rastervalue(coords, env, ngb_fill = TRUE, rm.na = FALSE)

Arguments

coords

A data.frame, matrix or sf object.

env

A SpatRaster object with the provided predictors.

ngb_fill

logical on whether cells should be interpolated from neighbouring values.

rm.na

logical parameter which - if set - removes all rows with a missing data point (NA) from the result.

Details

It is essentially a wrapper for terra::extract.

Value

A data.frame with the extracted covariate data from each provided data point.

Examples

# Dummy raster:
r <- terra::rast(nrows = 10, ncols = 10, res = 0.05, xmin = -1.5,
 xmax = 1.5, ymin = -1.5, ymax = 1.5, vals = rnorm(3600,mean = .5,sd = .1))
# (dummy points)
pp <- terra::spatSample(r,20,as.points = TRUE) |> sf::st_as_sf()

# Extract values
vals <- get_rastervalue(pp, r)
head(vals)

Regression penalty priors for GLMNET

Description

The engine_glmnet engine does not support priors in a typical sense, however it is possible to specify so called penalty factors as well as lower and upper limits on all variables in the model.

The default penalty multiplier is 1 for each coefficient, i.e. coefficients are penalized equally and then informed by an intersection of any absence information with the covariates. In contrast, a variable with penalty.factor equal to 0 is not penalized at all.

In addition, it is possible to specify a lower and upper limit for specific coefficients, which constrains them to a certain range. By default those ranges are set to -Inf and Inf respectively, but can be reset to a specific value range by altering "lims" (see examples).

For a regularized regression that supports a few more options on the priors, check out the Bayesian engine_breg.

Usage

GLMNETPrior(variable, hyper = 0, lims = c(-Inf, Inf), ...)

## S4 method for signature 'character'
GLMNETPrior(variable, hyper = 0, lims = c(-Inf, Inf), ...)

Arguments

variable

A character variable passed on to the prior object.

hyper

A numeric value between 0 and 1 that states the penalization factor. By default this is set to 0, implying that the "variable" provided is not regularized at all.

lims

A numeric vector of the lower and upper limits for each coefficient (Default: c(-Inf, Inf)).

...

Variables passed on to prior object.

See Also

Prior

Other prior: BARTPrior(), BARTPriors(), BREGPrior(), BREGPriors(), GDBPrior(), GDBPriors(), GLMNETPriors(), INLAPrior(), INLAPriors(), STANPrior(), STANPriors(), XGBPrior(), XGBPriors(), add_priors(), get_priors(), priors(), rm_priors()

Examples

## Not run: 
# Retain variable
p1 <- GLMNETPrior(variable = "forest", hyper = 0)
p1
# Smaller chance to be regularized
p2 <- GLMNETPrior(variable = "forest", hyper = 0.2, lims = c(0, Inf))
p2

## End(Not run)

Helper function when multiple variables are supplied for a GLMNET prior

Description

This is a helper function to specify several GLMNETPrior with the same hyper-parameters, but different variables.

Usage

GLMNETPriors(variable, hyper = 0, lims = c(-Inf, Inf))

## S4 method for signature 'character'
GLMNETPriors(variable, hyper = 0, lims = c(-Inf, Inf))

Arguments

variable

A character variable passed on to the prior object.

hyper

A numeric value between 0 and 1 that states the penalization factor. By default this is set to 0, implying that the "variable" provided is not regularized at all.

lims

A numeric vector of the lower and upper limits for each coefficient (Default: c(-Inf, Inf)).

See Also

Other prior: BARTPrior(), BARTPriors(), BREGPrior(), BREGPriors(), GDBPrior(), GDBPriors(), GLMNETPrior(), INLAPrior(), INLAPriors(), STANPrior(), STANPriors(), XGBPrior(), XGBPriors(), add_priors(), get_priors(), priors(), rm_priors()
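
Examples

A short hedged sketch with hypothetical predictor names, setting the same penalty and limits for several variables at once:

```r
## Not run: 
# Same mild penalty (0.2) and positive-only coefficients for three variables
ps <- GLMNETPriors(variable = c("forest", "cropland", "urban"),
                   hyper = 0.2, lims = c(0, Inf))

## End(Not run)
```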


Install ibis dependencies

Description

Some of the dependencies (R-Packages) that ibis.iSDM relies on are intentionally not added to the package Description to keep the number of mandatory dependencies small and enable the package to run even on systems that might not have all libraries pre-installed.

This function provides a convenience wrapper to install those missing dependencies as needed. It furthermore checks which packages require updating and updates them as needed.

Usage

ibis_dependencies(deps = getOption("ibis.dependencies"), update = TRUE)

Arguments

deps

A vector with the names of the packages to be installed (Default: "ibis.dependencies" in ibis_options).

update

A logical flag of whether all (installed) packages should also be checked for updates (Default: TRUE).

Value

Nothing. Packages will be installed.

Note

INLA is handled in a special way as it is not available via CRAN.

Examples

## Not run: 
  # Install and update all dependencies
  ibis_dependencies()

## End(Not run)

Set the parallel processing flag to TRUE

Description

Small helper function to enable parallel processing. If set to TRUE, then parallel inference (if supported by engines) and projection is enabled across the package. For enabling prediction support beyond sequential prediction see the ibis_future function.

Usage

ibis_enable_parallel()

Value

Invisible

See Also

future, ibis_future


Internal function to enable (a)synchronous parallel processing

Description

This function checks if parallel processing can be set up and enables it. Ideally this is done by the user for more control! In the package parallelization is usually only used for predictions and projections, but not for inference in which case parallel inference should be handled by the engine.

Usage

ibis_future(
  plan_exists = FALSE,
  cores = getOption("ibis.nthread", default = 2),
  strategy = getOption("ibis.futurestrategy"),
  workers = NULL
)

Arguments

plan_exists

A logical check on whether an existing future plan exists (Default: FALSE).

cores

A numeric number stating the number of cores to use.

strategy

A character denoting the strategy to be used for future. See help of future for options. (Default: "multisession").

workers

An optional list of remote machines or workers, e.g. "c(remote.server.org)". Alternatively a "cluster" object can be provided.

Details

Currently supported strategies are:

  • "sequential" = Resolves futures sequentially in the current R process (Package default).

  • "multisession" = Resolves futures asynchronously across 'cores' sessions.

  • "multicore" = Resolves futures asynchronously on forked processes. Only works on UNIX systems!

  • "cluster" = Resolves futures asynchronously in sessions on this or other machines.

  • "slurm" = To be implemented: Slurm linkage via batchtools.

Value

Invisible

Note

The 'plan' set by future persists after the function has been executed.

If the aim is to parallelize across many species, this is better done in a scripted solution. Make sure not to parallelize predictions within existing clusters to avoid out-of-memory issues.

See Also

future

Examples

## Not run: 
# Set up a future plan for parallel processing across 4 cores
ibis_future(cores = 4, strategy = "multisession")

## End(Not run)

Print ibis options

Description

There are a number of hidden options that can be specified for ibis.iSDM. Currently supported are:

  • 'ibis.runparallel' : logical value on whether processing should be run in parallel.

  • 'ibis.nthread' : numeric value on how many cores should be used by default.

  • 'ibis.setupmessages' : logical value indicating whether messages during object creation should be shown (Default: NULL).

  • 'ibis.engines' : Returns a vector with all valid engines.

  • 'ibis.use_future' : logical on whether the future package should be used for parallel computing.

Usage

ibis_options()

Value

The output of getOption() for all ibis-related options.

Examples

ibis_options()

Set the strategy for parallel processing.

Description

Small helper function to respecify the strategy for parallel processing (Default: 'sequential').

Usage

ibis_set_strategy(strategy = "sequential")

Arguments

strategy

A character with the strategy.

Details

Currently supported strategies are:

  • "sequential" = Resolves futures sequentially in the current R process (Package default).

  • "multisession" = Resolves futures asynchronously across 'cores' sessions.

  • "multicore" = Resolves futures asynchronously on forked processes. Only works on UNIX systems!

  • "cluster" = Resolves futures asynchronously in sessions on this or more machines.

  • "slurm" = To be implemented: Slurm linkage via batchtools.

Value

Invisible

See Also

future, ibis_future
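
Examples

A minimal sketch (not run, since it alters session state) of switching between the strategies listed above:

```r
## Not run:
# Resolve futures across multiple background sessions
ibis_set_strategy(strategy = "multisession")

# Revert to the package default
ibis_set_strategy(strategy = "sequential")

## End(Not run)
```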


Set the threads for parallel processing.

Description

Small helper function to respecify the number of threads for parallel processing.

Usage

ibis_set_threads(threads = 2)

Arguments

threads

A numeric greater than 0.

Value

Invisible

See Also

future, ibis_future
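
Examples

A minimal sketch (not run) of adjusting the number of threads; that the value is stored in the 'ibis.nthread' option is an assumption based on the options listed under ibis_options():

```r
## Not run:
# Use 4 threads for subsequent parallel processing
ibis_set_threads(threads = 4)

# Check the corresponding ibis option
getOption("ibis.nthread")

## End(Not run)
```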


Create a new INLA prior

Description

For any fixed and random effect INLA supports a range of different priors of exponential distributions.

Currently supported for INLA in ibis.iSDM are the following priors that can be specified via "type":

  • "normal" or "gaussian": Priors on normal distributed and set to specified variable. Required parameters are a mean and a precision estimate provided to "hyper". Note that precision is not equivalent (rather the inverse) to typical standard deviation specified in Gaussian priors. Defaults are set to a mean of 0 and a precision of 0.001.

  • "clinear": Prior that places a constraint on the linear coefficients of a model so as that the coefficient is in a specified interval "c(lower,upper)". Specified through hyper these values can be negative, positive or infinite.

  • "spde", specifically 'prior.range' and 'prior.sigma': Specification of penalized complexity priors which can be added to a SPDE spatial random effect added via add_latent_spatial(). Here the range of the penalized complexity prior can be specified through 'prior.range' and the uncertainty via 'prior.sigma' both supplied to the options 'type' and 'hyper'.

Other priors available in INLA (see names(INLA::inla.models()$prior)) might also work, but have not been tested!

Usage

INLAPrior(variable, type = "normal", hyper = c(0, 0.001), ...)

## S4 method for signature 'character,character'
INLAPrior(variable, type = "normal", hyper = c(0, 0.001), ...)

Arguments

variable

A character matched against existing predictors or latent effects.

type

A character specifying the type of prior to be set.

hyper

A vector with numeric values to be used as hyper-parameters. See description. The default values are set to a mean of 0 and a precision of 0.001.

...

Variables passed on to prior object.

Note

Compared to other engines, INLA unfortunately does not support priors related to more stringent parameter regularization such as Laplace or Horseshoe priors, which limits the capability of engine_inla for regularization. That being said, many of the default uninformative priors already act to regularize the coefficients to some degree.

References

  • Rue, H., Riebler, A., Sørbye, S. H., Illian, J. B., Simpson, D. P., & Lindgren, F. K. (2017). Bayesian computing with INLA: a review. Annual Review of Statistics and Its Application, 4, 395-421.

  • Simpson, D., Rue, H., Riebler, A., Martins, T. G., & Sørbye, S. H. (2017). Penalising model component complexity: A principled, practical approach to constructing priors. Statistical science, 32(1), 1-28.

See Also

Prior.

Other prior: BARTPrior(), BARTPriors(), BREGPrior(), BREGPriors(), GDBPrior(), GDBPriors(), GLMNETPrior(), GLMNETPriors(), INLAPriors(), STANPrior(), STANPriors(), XGBPrior(), XGBPriors(), add_priors(), get_priors(), priors(), rm_priors()
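
Examples

A hedged sketch (not run; requires the INLA package) of creating the prior types described above. The variable names and the background object are placeholders, and passing priors via add_priors(priors(...)) is assumed from the related prior functions:

```r
## Not run:
# A normal prior with mean 0 and precision 0.001 on a covariate
p1 <- INLAPrior(variable = "bio01", type = "normal", hyper = c(0, 0.001))

# Constrain a coefficient to be positive via "clinear"
p2 <- INLAPrior(variable = "forest", type = "clinear", hyper = c(0, Inf))

# Add both priors to a distribution object
x <- distribution(background) |>
  add_priors(priors(p1, p2))

## End(Not run)
```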


Helper function when multiple variables and types are supplied for INLA

Description

This is a helper function to specify several INLAPrior objects with the same hyper-parameters, but different variables.

Usage

INLAPriors(variables, type, hyper = c(0, 0.001), ...)

## S4 method for signature 'vector,character'
INLAPriors(variables, type, hyper = c(0, 0.001), ...)

Arguments

variables

A vector of character matched against existing predictors or latent effects.

type

A character specifying the type of prior to be set.

hyper

A vector with numeric values to be used as hyper-parameters.

...

Variables passed on to prior object.

See Also

Other prior: BARTPrior(), BARTPriors(), BREGPrior(), BREGPriors(), GDBPrior(), GDBPriors(), GLMNETPrior(), GLMNETPriors(), INLAPrior(), STANPrior(), STANPriors(), XGBPrior(), XGBPriors(), add_priors(), get_priors(), priors(), rm_priors()
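
Examples

A sketch (not run) of setting the same normal prior on several covariates at once; variable names and the background object are placeholders:

```r
## Not run:
# Identical normal priors (mean 0, precision 0.001) for two covariates
pp <- INLAPriors(variables = c("bio01", "bio12"), type = "normal",
                 hyper = c(0, 0.001))

x <- distribution(background) |>
  add_priors(pp)

## End(Not run)
```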


Approximate missing time steps between dates

Description

This function linearly interpolates values between time steps, so that gaps, for instance between 2010 and 2020, are filled with interpolated data for 2011, 2012, and so on.

Usage

interpolate_gaps(env, date_interpolation = "annual", method = "linear")

Arguments

env

A stars object.

date_interpolation

character on how missing dates between events should be interpolated. See project().

method

A character on the used method for approximation, either "linear" (Default) or "constant" through a step function.

Value

A stars object with the missing time steps interpolated.

Examples

## Not run: 
  # Interpolate stars stack
  sc <- interpolate_gaps( stack, "annual")

## End(Not run)

Check whether a formula is valid

Description

Check whether a formula is valid

Usage

is.formula(x)

Arguments

x

An R object to be tested.

Value

Boolean evaluation with logical output.
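
Examples

A short illustration of the check:

```r
is.formula(y ~ x)    # a formula object
is.formula(~ bio01)  # one-sided formulas are formulas too
```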


Check whether a provided object is truly of a specific type

Description

Check whether a provided object is truly of a specific type

Usage

is.Id(x)

Arguments

x

A provided Id object

Value

Boolean evaluation with logical output.


Tests if an input is a SpatRaster object.

Description

Tests if an input is a SpatRaster object.

Usage

is.Raster(x)

Arguments

x

an R Object.

Value

Boolean evaluation with logical output.
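
Examples

A short illustration (requires terra):

```r
is.Raster(terra::rast(nrows = 2, ncols = 2)) # SpatRaster input
is.Raster(matrix(1:4, 2))                    # not a raster object
```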


Tests if an input is a stars object.

Description

Tests if an input is a stars object.

Usage

is.stars(x)

Arguments

x

an R Object.

Value

Boolean evaluation with logical output.


Is the provided object of type waiver?

Description

Is the provided object of type waiver?

Usage

is.Waiver(x)

Arguments

x

A provided Waiver object

Value

Boolean evaluation with logical output.


Identify local limiting factor

Description

Calculates a SpatRaster of locally limiting factors from a given projected model. To calculate this, first the spatial effect of each individual covariate in the model is calculated.

The effect is estimated as the variable most responsible for decreasing suitability at a given cell. The decrease in suitability is calculated, for each predictor in turn, relative to the suitability that would be achieved if that predictor took its mean value. The predictor associated with the largest decrease in suitability is the most limiting factor.

Usage

limiting(mod, plot = TRUE)

## S4 method for signature 'ANY'
limiting(mod, plot = TRUE)

Arguments

mod

A fitted 'DistributionModel' object from which limited factors are to be identified.

plot

Should the result be plotted? (Default: TRUE).

Value

A terra object of the most important variable for a given grid cell.

References

  • Elith, J., Kearney, M. and Phillips, S. (2010), The art of modelling range-shifting species. Methods in Ecology and Evolution, 1: 330-342. doi: 10.1111/j.2041-210X.2010.00036.x

Examples

## Not run: 
o <- limiting(fit)
plot(o)

## End(Not run)

Load a pre-computed model

Description

The load_model function (as opposed to write_model) loads a previously saved DistributionModel. It is essentially a wrapper around readRDS.

When models are loaded, they are briefly checked for their validity and presence of necessary components.

Usage

load_model(fname, verbose = getOption("ibis.setupmessages", default = TRUE))

## S4 method for signature 'character'
load_model(fname, verbose = getOption("ibis.setupmessages", default = TRUE))

Arguments

fname

A character depicting an output filename.

verbose

logical indicating whether messages should be shown. Overwrites getOption("ibis.setupmessages") (Default: TRUE).

Value

A DistributionModel object.

See Also

write_model

Examples

## Not run: 
# Load model
mod <- load_model("testmodel.rds")

summary(mod)

## End(Not run)

Log prototype.

Description

Basic R6 object for Log; any Log inherits from here.

Public fields

filename

A character of where the log is to be stored.

output

The log content.

Methods

Public methods


Method new()

Initializes the object and specifies some default parameters.

Usage
Log$new(filename, output)
Arguments
filename

A character of where the log is to be stored.

output

The log content.

Returns

NULL


Method print()

Print message with filename

Usage
Log$print()
Returns

A message on screen


Method open()

Opens the connection to the output filename.

Usage
Log$open(type = c("output", "message"))
Arguments
type

A character vector of the output types.

Returns

Invisible TRUE


Method close()

Closes the connection to the output file

Usage
Log$close()
Returns

Invisible TRUE


Method get_filename()

Get output filename

Usage
Log$get_filename()
Returns

A character with the filename


Method set_filename()

Set a new output filename

Usage
Log$set_filename(value)
Arguments
value

A character with the new filename.

Returns

Invisible TRUE


Method delete()

Delete log file

Usage
Log$delete()
Returns

Invisible TRUE


Method open_system()

Open log with system viewer

Usage
Log$open_system()
Returns

Invisible TRUE


Method clone()

The objects of this class are cloneable with this method.

Usage
Log$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.
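
Examples

A hedged usage sketch (not run) of the Log class; the class is normally created internally, and the constructor arguments shown here follow the field descriptions above:

```r
## Not run:
l <- Log$new(filename = "run.log", output = new_waiver())
l$open()
message("Model run started")
l$close()

## End(Not run)
```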


Mask data with an external layer

Description

This is a helper function that takes an existing object created by the ibis.iSDM package and an external layer, then intersects both. It currently takes either a DistributionModel, BiodiversityDatasetCollection, PredictorDataset or BiodiversityScenario as input.

As mask either a sf or SpatRaster object can be chosen. The mask will be converted internally depending on the object.

Usage

mask.DistributionModel(x, mask, inverse = FALSE, ...)

mask.BiodiversityDatasetCollection(x, mask, inverse = FALSE, ...)

mask.PredictorDataset(x, mask, inverse = FALSE, ...)

mask.BiodiversityScenario(x, mask, inverse = FALSE, ...)

Arguments

x

Any object belonging to DistributionModel, BiodiversityDatasetCollection, PredictorDataset or BiodiversityScenario.

mask

A sf or SpatRaster object.

inverse

A logical flag whether to take inverse of the mask instead (Default: FALSE).

...

Passed on arguments

Value

A respective object of the input type.

See Also

terra::mask()

Examples

## Not run: 
# Build and train a model
mod <- distribution(background) |>
  add_biodiversity_poipo(species) |>
  add_predictors(predictors) |>
  engine_glmnet() |>
  train()

# Constrain the prediction by another object
mod <- mask(mod, speciesrange)


## End(Not run)

Identifier

Description

Generate a new unique identifier.

Usage

new_id()

Details

Identifiers are generated using uuid::UUIDgenerate().

Value

"Id" object.

See Also

uuid::UUIDgenerate().

Examples

# create new id
i <- new_id()

# print id
print(i)

# convert to character
as.character(i)

# check if it is an Id object
is.Id(i)

Waiver

Description

Create a waiver object.

Usage

new_waiver()

Details

This object is used to represent that the user has not manually specified a setting, and so defaults should be used. By explicitly using new_waiver(), NULL remains available as a valid setting. The use of a "waiver" object was inspired by the ggplot2 and prioritizr packages.

Value

Object of class Waiver.

Examples

# create new waiver object
w <- new_waiver()

# print object
print(w)

# is it a waiver object?
is.Waiver(w)

Niche plot for distribution objects

Description

The suitability of any given area for a biodiversity feature can in many instances be complex and non-linear. Visualizing obtained suitability predictions (e.g. from train()) against underlying predictors might help to explain the underlying gradients of the niche.

Supported inputs for this function are either single trained ibis.iSDM DistributionModel objects or alternatively a set of three SpatRaster objects. In both cases, users can specify "xvar" and "yvar" explicitly or leave them empty. In the latter case a principal component analysis (PCA) is conducted on the full environmental stack (loaded from DistributionModel or supplied separately).

Usage

nicheplot(
  mod,
  xvar = NULL,
  yvar = NULL,
  envvars = NULL,
  overlay_data = FALSE,
  plot = TRUE,
  fname = NULL,
  title = NULL,
  pal = NULL,
  ...
)

## S4 method for signature 'ANY'
nicheplot(
  mod,
  xvar = NULL,
  yvar = NULL,
  envvars = NULL,
  overlay_data = FALSE,
  plot = TRUE,
  fname = NULL,
  title = NULL,
  pal = NULL,
  ...
)

Arguments

mod

A trained DistributionModel or alternatively a SpatRaster object containing a model prediction.

xvar

A character denoting the predictor on the x-axis. Alternatively a SpatRaster object can be provided.

yvar

A character denoting the predictor on the y-axis. Alternatively a SpatRaster object can be provided.

envvars

A SpatRaster object containing all environmental variables. Only used if xvar and yvar are empty (Default: NULL).

overlay_data

A logical on whether training data should be overlaid on the plot. Only used for DistributionModel objects (Default: FALSE).

plot

A logical indication of whether the result is to be plotted (Default: TRUE).

fname

A character specifying the output file name a created figure should be written to.

title

Allows to respecify the title through a character (Default: NULL).

pal

An optional vector with continuous custom colours (Default: NULL).

...

Other engine specific parameters.

Value

Saved niche plot in 'fname' if specified, otherwise plot.

See Also

partial, plot.DistributionModel

Examples

# Make quick prediction
background <- terra::rast(system.file('extdata/europegrid_50km.tif',
package='ibis.iSDM',mustWork = TRUE))
virtual_points <- sf::st_read(system.file('extdata/input_data.gpkg', package = 'ibis.iSDM'),
                              'points', quiet = TRUE)
ll <- list.files(system.file('extdata/predictors/', package = 'ibis.iSDM', mustWork = TRUE),
                 full.names = TRUE)

# Load them as rasters
predictors <- terra::rast(ll)
names(predictors) <- tools::file_path_sans_ext(basename(ll))

# Add GLM as an engine and predict
fit <- distribution(background) |>
  add_biodiversity_poipo(virtual_points, field_occurrence = 'Observed',
                         name = 'Virtual points', docheck = FALSE) |>
  add_predictors(predictors, transform = 'none', derivates = 'none') |>
  engine_glm() |>
  train()

# Plot niche for prediction for temperature and forest cover
nicheplot(fit, xvar = "bio01_mean_50km", yvar = "CLC3_312_mean_50km" )

Shows size of objects in the R environment

Description

Shows the size of the objects currently in the R environment. Helps to locate large objects cluttering the R environment and/or causing memory problems during the execution of large workflows.

Usage

objects_size(n = 10)

Arguments

n

Number of objects to show (Default: 10).

Value

A data frame with the row names indicating the object name, the field 'Type' indicating the object type, 'Size' indicating the object size, and the columns 'Length/Rows' and 'Columns' indicating the object dimensions if applicable.

Author(s)

Blas M. Benito

Examples

if(interactive()){

 #creating dummy objects
 x <- matrix(runif(100), 10, 10)
 y <- matrix(runif(10000), 100, 100)

 #reading their in-memory size
 objects_size()

}

Obtain partial effects of trained model

Description

Create a partial response or effect plot of a trained model.

Usage

partial(
  mod,
  x.var = NULL,
  constant = NULL,
  variable_length = 100,
  values = NULL,
  newdata = NULL,
  plot = FALSE,
  type = "response",
  ...
)

## S4 method for signature 'ANY'
partial(
  mod,
  x.var = NULL,
  constant = NULL,
  variable_length = 100,
  values = NULL,
  newdata = NULL,
  plot = FALSE,
  type = "response",
  ...
)

partial.DistributionModel(mod, ...)

Arguments

mod

A trained DistributionModel object with fit_best model within.

x.var

A character indicating the variable for which a partial effect is to be calculated.

constant

A numeric constant to be inserted for all other variables. Default calculates a mean per variable.

variable_length

numeric The interpolation depth (nr. of points) to be used (Default: 100).

values

numeric Directly specified values to compute partial effects for. If this parameter is set to anything other than NULL, the parameter "variable_length" is ignored (Default: NULL).

newdata

An optional data.frame with provided data for partial estimation (Default: NULL).

plot

A logical indication of whether the result is to be plotted (Default: FALSE).

type

A specified type, either 'response' or 'predictor'. Can be missing.

...

Other engine specific parameters.

Details

By default the mean is calculated across all parameters that are not x.var. Alternatively, a constant (for instance 0) can be supplied to be applied to all other variables.

Value

A data.frame with the created partial response.

See Also

partial

Examples

## Not run: 
 # Do a partial calculation of a trained model
 partial(fit, x.var = "Forest.cover", plot = TRUE)

## End(Not run)

Visualize the density of the data over the environmental data

Description

Based on a fitted model, plot the density of observations over the estimated variable and environmental space. As opposed to the partial and spartial functions, which are rather low-level interfaces, this function provides more detail in light of the data. It is also able to contrast different variables against each other and show the used data.

Usage

partial_density(mod, x.var, df = FALSE, ...)

## S4 method for signature 'ANY,character'
partial_density(mod, x.var, df = FALSE, ...)

Arguments

mod

A trained DistributionModel object. Requires a fitted model and inferred prediction.

x.var

A character indicating the variable to be investigated. Can be a vector of length 1 or 2.

df

logical if plotting data should be returned instead (Default: FALSE).

...

Other engine specific parameters.

Details

This function calculates the observed density of presence and absence points over the whole surface of a specific variable. It can be used to visually inspect the fit of the model to data.

Value

A ggplot2 object showing the marginal response in light of the data.

Note

By default all variables that are not x.var are held constant at the mean.

References

  • Warren, D.L., Matzke, N.J., Cardillo, M., Baumgartner, J.B., Beaumont, L.J., Turelli, M., Glor, R.E., Huron, N.A., Simões, M., Iglesias, T.L. Piquet, J.C., and Dinnage, R. 2021. ENMTools 1.0: an R package for comparative ecological biogeography. Ecography, 44(4), pp.504-511.

See Also

partial

Examples

## Not run: 
 # Do a partial calculation of a trained model
 partial_density(fit, x.var = "Forest.cover")
 # Or with two variables
 partial_density(fit, x.var = c("Forest.cover", "bio01"))

## End(Not run)

Plot wrappers

Description

Plots information from a given object where a plotting object is available.

Usage

## S3 method for class 'DistributionModel'
plot(x, what = "mean", ...)

## S3 method for class 'BiodiversityDatasetCollection'
plot(x, ...)

## S3 method for class 'PredictorDataset'
plot(x, ...)

## S3 method for class 'Engine'
plot(x, ...)

## S3 method for class 'BiodiversityScenario'
plot(x, ...)

Arguments

x

Any object belonging to DistributionModel, BiodiversityDatasetCollection, PredictorDataset or BiodiversityScenario.

what

In case a SpatRaster is supplied, this parameter specifies the layer to be shown (Default: "mean").

...

Further arguments passed on to x$plot.

Details

The plotted outputs vary depending on what object is being plotted. For example for a fitted DistributionModel the output is usually the fitted spatial prediction (Default: 'mean').

Value

Graphical output

Examples

## Not run: 
# Build and train a model
mod <- distribution(background) |>
  add_biodiversity_poipo(species) |>
  add_predictors(predictors) |>
  engine_glmnet() |>
  train()
# Plot the resulting model
plot(mod)

## End(Not run)

Create a posterior prediction from a rstanfit object

Description

This function simulates from the posterior of a fitted stan model, therefore providing a fast and efficient way to project coefficients obtained from Bayesian models to new/novel contexts.

Usage

posterior_predict_stanfit(
  obj,
  form,
  newdata,
  type = "predictor",
  family = NULL,
  offset = NULL,
  draws = NULL
)

Arguments

obj

A "stanfit" object (as used by rstan).

form

A formula object created for the DistributionModel.

newdata

A data.frame with new data to be used for prediction.

type

A character of whether the linear predictor or the response is to be summarized.

family

A character giving the family for simulating linear response values (Default: NULL)

offset

A vector with an optionally specified offset.

draws

numeric indicating whether a specific number of draws should be taken.

References
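
Examples

A hedged sketch (not run) of projecting a fitted stanfit object to new data; all object names are placeholders:

```r
## Not run:
# 'fit' is a stanfit object, 'form' the formula used for fitting,
# and 'new_env_df' a data.frame of covariates for new contexts
pred <- posterior_predict_stanfit(obj = fit, form = form,
                                  newdata = new_env_df,
                                  type = "response",
                                  family = "poisson",
                                  draws = 100)

## End(Not run)
```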


Create spatial derivative of raster stacks

Description

This function creates derivatives of existing covariates and returns them in raster format. Derivative variables are commonly understood in the machine learning literature as one aspect of feature engineering. They can be particularly powerful in introducing non-linearities into otherwise linear models, for example as is often done in the popular Maxent framework.

Usage

predictor_derivate(
  env,
  option,
  nknots = 4,
  deriv = NULL,
  int_variables = NULL,
  method = NULL,
  ...
)

Arguments

env

A SpatRaster object.

option

A vector stating whether predictors should be preprocessed in any way (Options: 'none', 'quadratic', 'hinge', 'kmeans', 'thresh', 'bin').

nknots

The number of knots to be used for the transformation (Default: 4).

deriv

A vector with character of specific derivates to create (Default: NULL).

int_variables

A vector with length greater or equal than 2 specifying the covariates (Default: NULL).

method

As 'option' for more intuitive method setting. Can be left empty (in this case option has to be set).

...

other options (Non specified).

Details

Available options are:

  • 'none' - The original layer(s) are returned.

  • 'quadratic' - A quadratic transformation (x^2) is created of the provided layers.

  • 'hinge' - Creates hinge transformations of covariates, which set all values below a given threshold to 0 and rescale all others to the range [0, 1]. The number of thresholds and thus new derivates is specified via the parameter 'nknots' (Default: 4).

  • 'interaction' - Creates interactions between variables. Target variables have to be specified via "int_variables".

  • 'thresh' - A threshold transformation of covariates, which sets all values lower than a given threshold to 0 and those larger to 1. The number of thresholds and thus new derivates is specified via the parameter 'nknots' (Default: 4).

  • 'bin' - Creates a factor representation of covariates by cutting the range of covariates by their percentiles. The number of percentile cuts and thus new derivates is specified via the parameter 'nknots' (Default: 4).

  • 'kmeans' - Creates a factor representation of covariates through a kmeans() clustering. The number of clusters is specified via the parameter 'nknots'.

Value

Returns the derived adjusted SpatRaster objects of identical resolution.

See Also

predictor_transform

Examples

# Dummy raster
r_ori <- terra::rast(nrows = 10, ncols = 10, res = 0.05,
                     xmin = -1.5, xmax = 1.5, ymin = -1.5, ymax = 1.5,
                     vals = rpois(3600, 10))

# Create a hinge transformation with 4 knots of one or multiple SpatRaster.
new <- predictor_derivate(r_ori, option = "hinge", nknots = 4)
terra::plot(new)

# Or a quadratic transformation
new2 <- predictor_derivate(r_ori, option = "quadratic")
terra::plot(new2)

Filter a set of correlated predictors to fewer ones

Description

This function helps to remove highly correlated variables from a set of predictors. It supports multiple options some of which require both environmental predictors and observations, others only predictors.

Some of the options require different packages to be pre-installed, such as ranger or Boruta.

Usage

predictor_filter(env, keep = NULL, method = "pearson", ...)

Arguments

env

A data.frame or matrix with extracted environmental covariates for a given species.

keep

A vector with variables to keep regardless. These are usually variables for which prior information is known.

method

Which method to use for constructing the correlation matrix (Options: 'none', 'pearson' (Default), 'spearman', 'kendall', 'abess', 'boruta').

...

Other options for a specific method

Details

Available options are:

  • "none" No prior variable removal is performed (Default).

  • "pearson", "spearman" or "kendall" Makes use of pairwise comparisons to identify and remove highly collinear predictors (Pearson's r >= 0.7).

  • "abess" A-priori adaptive best subset selection of covariates via the abess package (see References). Note that this effectively fits a separate generalized linear model to reduce the number of covariates.

  • "boruta" Uses the Boruta package to identify non-informative features.

Value

A character vector of variable names to be excluded. If the function fails for any reason, NULL is returned.

Note

Using this function on predictors effectively means that a separate model is fitted on the data, with all the assumptions that come with it (e.g. linearity, appropriateness of response, normality, etc.).

Examples

## Not run: 
 # Remove highly correlated predictors
 env <- predictor_filter(env, method = "pearson")

## End(Not run)

Homogenize NA values across a set of predictors.

Description

This method allows the homogenization of missing data across a set of environmental predictors. It is called by default when predictors are added to a BiodiversityDistribution object. Only grid cells with NA values that contain values in some raster layers are homogenized. Additional parameters allow, instead of homogenization, filling the missing data with neighbouring values.

Usage

predictor_homogenize_na(
  env,
  fill = FALSE,
  fill_method = "ngb",
  return_na_cells = FALSE
)

Arguments

env

A SpatRaster object with the predictors.

fill

A logical value indicating whether missing data are to be filled (Default: FALSE).

fill_method

A character of the method for filling gaps to be used (Default: 'ngb').

return_na_cells

A logical value of whether the ids of grid cells with NA values is to be returned instead (Default: FALSE).

Value

A SpatRaster object with the same number of layers as the input.

Examples

## Not run: 
 # Harmonize predictors
 env <- predictor_homogenize_na(env)

## End(Not run)

Spatial adjustment of environmental predictors and raster stacks

Description

This function allows the transformation of provided environmental predictors (in SpatRaster format). A common use case is for instance the standardization (or scaling) of all predictors prior to model fitting. This function works both with SpatRaster as well as with stars objects.

Usage

predictor_transform(
  env,
  option,
  windsor_props = c(0.05, 0.95),
  pca.var = 0.8,
  state = NULL,
  method = NULL,
  ...
)

Arguments

env

A SpatRaster or stars object.

option

A vector stating whether predictors should be preprocessed in any way (Options: 'none', 'scale', 'norm', 'windsor', 'windsor_thresh', 'percentile', 'pca', 'revjack'). See Details.

windsor_props

A numeric vector specifying the proportions to be clipped for windsorization (Default: c(.05,.95)).

pca.var

A numeric value between >0 and 1 stating the minimum amount of variance to be covered (Default: 0.8).

state

A matrix with one value per variable (column), providing either a mean and standard deviation (as from stats::mean() and stats::sd()) for each variable in env for option 'scale', or a range of minimum and maximum values for option 'norm'. Effectively applies their value range for rescaling (Default: NULL).

method

As 'option' for more intuitive method setting. Can be left empty (in this case option has to be set).

...

other options (Non specified).

Details

Available options are:

  • 'none' The original layer(s) are returned.

  • 'scale' This runs the scale() function with default settings (1 standard deviation) across all predictors. A sensible default for most model fitting.

  • 'norm' This normalizes all predictors to a range from 0-1.

  • 'windsor' This applies a 'windsorization' to an existing raster layer by setting the lowest, respectively largest values to the value at a certain percentage level (e.g. 95%). Those can be set via the parameter "windsor_props".

  • 'windsor_thresh' Same as option 'windsor', however in this case values are clamped to thresholds rather than to percentages calculated on the data.

  • 'percentile' This converts and bins all values into percentiles, e.g. the top 10% or lowest 10% of values and so on.

  • 'pca' This option runs a principal component decomposition of all predictors (via prcomp()). It returns new predictors resembling all components in order of the most important ones. Can be useful to reduce collinearity, however note that this changes all predictor names to 'PCX', where X is the number of the component. The parameter 'pca.var' can be modified to specify the minimum variance to be covered by the axes.

  • 'revjack' Removes outliers from the supplied stack via a reverse jackknife procedure. Identified outliers are by default set to NA.

Value

Returns an adjusted SpatRaster object of identical resolution.

Note

If future covariates are rescaled or normalized, it is highly recommended to use the statistical moments on which the models were trained for any variable transformations, also to ensure that variable ranges are consistent among relative values.

See Also

predictor_derivate

Examples

# Dummy raster
r_ori <- terra::rast(nrows = 10, ncols = 10, res = 0.05,
                     xmin = -1.5, xmax = 1.5, ymin = -1.5, ymax = 1.5,
                     vals = rnorm(3600, mean = .01, sd = .1))

# Normalize
r_norm <- predictor_transform(r_ori, option = 'norm')
new <- c(r_ori, r_norm)
names(new) <- c("original scale", "normalized units")
terra::plot(new)
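
Continuing the example above, predictors can also be scaled; per the Note, stored statistical moments can be reused via 'state' when projecting to new data. The exact matrix layout for 'state' shown here is an assumption:

```r
# Scale predictors to zero mean and unit variance
r_scaled <- predictor_transform(r_ori, option = 'scale')

# Reapply stored moments via 'state' (assumed layout: one column
# per variable holding its mean and standard deviation)
st <- matrix(c(0.01, 0.1), ncol = 1,
             dimnames = list(c("mean", "sd"), names(r_ori)))
r_scaled2 <- predictor_transform(r_ori, option = 'scale', state = st)
```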

PredictorDataset class description

Description

This class describes the PredictorDataset and is used to store covariates within.

Public fields

id

The id for this collection as character.

data

A predictor dataset usually as SpatRaster.

name

A name for this object.

transformed

Saves whether the predictors have been transformed somehow.

timeperiod

A timeperiod field

Methods

Public methods


Method new()

Initializes the object and creates an empty list

Usage
PredictorDataset$new(id, data, transformed = FALSE, ...)
Arguments
id

The id for this collection as character.

data

A predictor dataset usually as SpatRaster.

transformed

A logical flag if predictors have been transformed (Default: FALSE).

...

Any other parameters found.

Returns

NULL


Method print()

Print the names and properties of all Biodiversity datasets contained within

Usage
PredictorDataset$print(format = TRUE)
Arguments
format

A logical flag on whether a message should be printed.

Returns

A message on screen


Method get_name()

Return name of this object

Usage
PredictorDataset$get_name()
Returns

Default character name.


Method get_id()

Get Id of this object

Usage
PredictorDataset$get_id()
Returns

Default character name.


Method get_names()

Get names of data

Usage
PredictorDataset$get_names()
Returns

character names of the data value.


Method get_predictor_names()

Alias for get_names

Usage
PredictorDataset$get_predictor_names()
Returns

character names of the data value.


Method get_data()

Get a specific dataset

Usage
PredictorDataset$get_data(df = FALSE, na.rm = TRUE, ...)
Arguments
df

logical on whether data is to be returned as data.frame.

na.rm

logical if NA is to be removed from data.frame.

...

Any other parameters passed on.

Returns

A SpatRaster or data.frame.


Method get_time()

Get time dimension of object.

Usage
PredictorDataset$get_time(...)
Arguments
...

Any other parameters passed on.

Returns

A vector with the time dimension of the dataset.


Method get_projection()

Get Projection

Usage
PredictorDataset$get_projection()
Returns

A vector with the geographical projection of the object.


Method get_resolution()

Get Resolution

Usage
PredictorDataset$get_resolution()
Returns

A numeric vector with the spatial resolution of the data.


Method get_ext()

Get Extent of predictors

Usage
PredictorDataset$get_ext()
Returns

A numeric vector with the spatial extent of the data.


Method crop_data()

Utility function to clip the predictor dataset by another dataset

Usage
PredictorDataset$crop_data(pol, apply_time = FALSE)
Arguments
pol

A sf object used for cropping the data

apply_time

A logical flag indicating if time should be acknowledged in cropping.

Details

If apply_time is set, the cropping also takes the temporal dimension of the dataset into account.

Returns

Invisible TRUE


Method mask()

Utility function to mask the predictor dataset by another dataset

Usage
PredictorDataset$mask(mask, inverse = FALSE, ...)
Arguments
mask

A SpatRaster or sf object.

inverse

A logical flag if the inverse should be masked instead.

...

Any other parameters passed on to masking.

Returns

Invisible


Method set_data()

Add a new Predictor dataset to this collection

Usage
PredictorDataset$set_data(value)
Arguments
value

A new SpatRaster or stars object.

Returns

This object


Method rm_data()

Remove a specific Predictor by name

Usage
PredictorDataset$rm_data(x)
Arguments
x

character of the predictor name to be removed.

Returns

Invisible


Method show()

Alias for print method

Usage
PredictorDataset$show()
Returns

Invisible


Method summary()

Collect info statistics with optional decimals

Usage
PredictorDataset$summary(digits = 2)
Arguments
digits

A numeric giving the rounding precision.

Returns

A data.frame summarizing the data.


Method has_derivates()

Indication of whether any predictors are derivates of others.

Usage
PredictorDataset$has_derivates()
Returns

A logical flag.


Method is_transformed()

Have the predictors been transformed?

Usage
PredictorDataset$is_transformed()
Returns

A logical flag.


Method get_transformed_params()

Get transformation params.

Usage
PredictorDataset$get_transformed_params()
Returns

A matrix with the transformation parameters.


Method length()

Number of Predictors in object

Usage
PredictorDataset$length()
Returns

A numeric estimate


Method ncell()

Number of cells or values in object

Usage
PredictorDataset$ncell()
Returns

A numeric estimate


Method plot()

Basic Plotting function

Usage
PredictorDataset$plot()
Returns

A graphical interpretation of the predictors in this object.


Method clone()

The objects of this class are cloneable with this method.

Usage
PredictorDataset$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

See Also

predictor_derivate()

predictor_transform()


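Examples

A minimal sketch of typical accessor use for this class; it assumes the ibis.iSDM package is loaded, that background and covariates objects exist, and that the PredictorDataset is stored in the predictors field of the distribution object (all assumptions, not guaranteed by this entry):

```r
## Not run: 
# Predictors are usually added via add_predictors(), which stores
# them internally as a PredictorDataset
x <- distribution(background) |>
  add_predictors(env = covariates)

# Query the contained PredictorDataset (assumed predictors field)
pd <- x$predictors
pd$get_names()       # variable names of the covariates
pd$length()          # number of predictors
pd$get_resolution()  # spatial resolution of the data
pd$summary()         # data.frame with summary statistics

## End(Not run)
```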

Print

Description

Display information about any object created through the ibis.iSDM R-package.

Usage

## S3 method for class 'distribution'
print(x, ...)

## S3 method for class 'BiodiversityDistribution'
print(x, ...)

## S3 method for class 'BiodiversityDatasetCollection'
print(x, ...)

## S3 method for class 'BiodiversityDataset'
print(x, ...)

## S3 method for class 'PredictorDataset'
print(x, ...)

## S3 method for class 'DistributionModel'
print(x, ...)

## S3 method for class 'BiodiversityScenario'
print(x, ...)

## S3 method for class 'Prior'
print(x, ...)

## S3 method for class 'PriorList'
print(x, ...)

## S3 method for class 'Engine'
print(x, ...)

## S3 method for class 'Settings'
print(x, ...)

## S3 method for class 'Log'
print(x, ...)

## S3 method for class 'Id'
print(x, ...)

## S4 method for signature 'Id'
print(x, ...)

## S4 method for signature 'tbl_df'
print(x, ...)

Arguments

x

Any object created through the package.

...

not used.

Value

Object specific.

See Also

base::print().

Examples

## Not run: 
# Where mod is fitted object
mod
print(mod)

## End(Not run)

Base Prior class

Description

This class sets up the base class for priors which will be inherited by all priors.

Value

Defines a Prior object.

Public fields

id

A character with the id of the prior.

name

A character with the name of the prior.

type

A character with the type of the prior.

variable

A character with the variable name for the prior.

distribution

A character with the distribution of the prior if relevant.

value

A numeric or character with the prior value, e.g. the hyper-parameters.

prob

Another numeric entry on the prior field. The inclusion probability.

lims

A limitation on the lower and upper bounds of a numeric value.

Methods

Public methods


Method new()

Initializes the object and prepares the various prior variables

Usage
Prior$new(
  id,
  name,
  variable,
  value,
  type = NULL,
  distribution = NULL,
  prob = NULL,
  lims = NULL
)
Arguments
id

A character with the id of the prior.

name

A character with the name of the prior.

variable

A character with the variable name for the prior.

value

A numeric or character with the prior value, e.g. the hyper-parameters.

type

A character with the type of the prior.

distribution

A character with the distribution of the prior if relevant.

prob

Another numeric entry on the prior field. The inclusion probability.

lims

A limitation on the lower and upper bounds of a numeric value.

Returns

NULL


Method print()

Print out the prior type and variable.

Usage
Prior$print()
Returns

A message on screen


Method validate()

Generic validation function for a provided value.

Usage
Prior$validate(x)
Arguments
x

A new prior value.

Returns

Invisible TRUE


Method get()

Get prior values

Usage
Prior$get(what = "value")
Arguments
what

A character with the entry to be returned (Default: value).

Returns

The requested prior value or entry.


Method set()

Set prior

Usage
Prior$set(x)
Arguments
x

A new prior value as numeric or character.

Returns

Invisible TRUE


Method get_id()

Get a specific ID from a prior.

Usage
Prior$get_id()
Returns

A character id.


Method get_name()

Get Name of object

Usage
Prior$get_name()
Returns

Returns a character with the class name.


Method clone()

The objects of this class are cloneable with this method.

Usage
Prior$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Note

This functionality is likely deprecated or its checks have been superseded.

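Examples

Prior objects are normally created through the engine-specific constructors (e.g. INLAPrior()) rather than via Prior$new() directly. A hedged sketch of inspecting such an object, assuming the ibis.iSDM package is loaded:

```r
## Not run: 
p <- INLAPrior(variable = "Forest", type = "normal", hyper = c(0, 0.5))
p$get("value")   # returns the hyper-parameters
p$get_id()       # unique id of this prior
p$print()        # prints the prior type and variable

## End(Not run)
```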

List of Priors supplied to a class

Description

This class represents a collection of Prior objects. It provides methods for accessing, adding and removing priors from the list.

Value

A PriorList object.

Public fields

priors

A list of Prior objects.

Methods

Public methods


Method new()

Initializes the object

Usage
PriorList$new(priors)
Arguments
priors

A list of Prior objects.

Returns

NULL


Method print()

Print out summary statistics

Usage
PriorList$print()
Returns

A message on screen


Method show()

Alias that calls the print method.

Usage
PriorList$show()
Returns

A message on screen


Method length()

Number of priors in object

Usage
PriorList$length()
Returns

A numeric with the number of priors set


Method ids()

Ids of prior objects

Usage
PriorList$ids()
Returns

A list with the ids of the prior objects for querying.


Method varnames()

Variable names of priors in object

Usage
PriorList$varnames()
Returns

A character list with the variable names of the priors.


Method classes()

Function to return the classes of all contained priors

Usage
PriorList$classes()
Returns

A character list with the class names of the priors.


Method types()

Get types of all contained priors

Usage
PriorList$types()
Returns

A character list with the type names of the priors.


Method exists()

Does a certain variable or type combination exist as a prior?

Usage
PriorList$exists(variable, type = NULL)
Arguments
variable

A character with the variable name.

type

A character with the type.

Returns

A character id.


Method add()

Add a new prior to the object.

Usage
PriorList$add(p)
Arguments
p

A Prior object.

Returns

Invisible TRUE


Method get()

Get specific prior values from the list if set

Usage
PriorList$get(variable, type = NULL, what = "value")
Arguments
variable

A character with the variable name.

type

A character with the type name

what

A character on the specific entry to return (Default: prior value).

Returns

The prior object.


Method collect()

Collect priors for a given id or multiple.

Usage
PriorList$collect(id)
Arguments
id

A character with the prior id.

Returns

A PriorList object.


Method rm()

Remove a set prior by id

Usage
PriorList$rm(id)
Arguments
id

A character with the prior id.

Returns

Invisible TRUE


Method summary()

Summary function that lists all priors

Usage
PriorList$summary()
Returns

A data.frame with the summarized priors.


Method combine()

Combine this PriorList with another PriorList object.

Usage
PriorList$combine(x)
Arguments
x

A new PriorList object.

Returns

Invisible TRUE


Method clone()

The objects of this class are cloneable with this method.

Usage
PriorList$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Examples

## Not run: 
priors(
    INLAPrior('var1','normal',c(0,0.1)),
    INLAPrior('var2','normal',c(0,0.1))
   )

## End(Not run)
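A hedged sketch of querying a PriorList with the methods described above, assuming the ibis.iSDM package with an INLA engine is available:

```r
## Not run: 
pl <- priors(
    INLAPrior('var1','normal',c(0,0.1)),
    INLAPrior('var2','normal',c(0,0.1))
   )
pl$length()        # number of priors set
pl$varnames()      # variable names of all priors
pl$exists('var1')  # check whether a prior is set for this variable
pl$summary()       # data.frame listing all priors

## End(Not run)
```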

Creates a new PriorList object

Description

A PriorList object is essentially a list that contains individual Prior objects. In order to use priors for any of the engines, the respective Prior has to be identified (e.g. INLAPrior) and embedded in a PriorList object. Afterwards these objects can then be added to a distribution object with the add_priors function.


Usage

priors(x, ...)

## S4 method for signature 'ANY'
priors(x, ...)


Arguments

x

A Prior object added to the list.

...

One or multiple additional Prior object added to the list.

Value

A PriorList object.


See Also

Prior, PriorList


Other prior: BARTPrior(), BARTPriors(), BREGPrior(), BREGPriors(), GDBPrior(), GDBPriors(), GLMNETPrior(), GLMNETPriors(), INLAPrior(), INLAPriors(), STANPrior(), STANPriors(), XGBPrior(), XGBPriors(), add_priors(), get_priors(), rm_priors()


Examples

p1 <- GDBPrior(variable = "Forest", hyper = "positive")
p2 <- GDBPrior(variable = "Urban", hyper = "decreasing")
priors(p1, p2)

## Not run: 
p1 <- INLAPrior(variable = "Forest",type = "normal", hyper = c(1,1e4))
p2 <- INLAPrior(variable = "Urban",type = "normal", hyper = c(0,1e-2))
priors(p1, p2)

## End(Not run)

Project a fitted model to a new environment and covariates

Description

Equivalent to train, this function acts as a wrapper to project the model stored in a BiodiversityScenario object to newly supplied (future) covariates. Supplied predictors are usually spatial-temporal predictors which should be prepared via add_predictors() (e.g. transformations and derivates) in the same way as they have been during the initial modelling with distribution(). Any constrains specified in the scenario object are applied during the projection.

Usage

project.BiodiversityScenario(x, ...)

## S4 method for signature 'BiodiversityScenario'
project(
  x,
  date_interpolation = "none",
  stabilize = FALSE,
  stabilize_method = "loess",
  layer = "mean",
  verbose = getOption("ibis.setupmessages", default = TRUE),
  ...
)

Arguments

x

A BiodiversityScenario object with set predictors. Note that some constraints such as MigClim can still simulate future change without projections.

...

passed on parameters.

date_interpolation

A character on whether dates should be interpolated. Options include "none" (Default), "annual", "monthly", "daily".

stabilize

A logical value indicating whether the suitability projection should be stabilized (Default: FALSE).

stabilize_method

A character stating the stabilization method to be applied. Currently only `loess` is supported.

layer

A character specifying the layer to be projected (Default: "mean").

verbose

Setting this logical value to TRUE prints out further information during the model fitting (Default: TRUE).

Details

In the background the function x$project() of the respective model object is called, where x is the fitted model object. For specifics on the constraints, see the relevant constraint functions:

  • add_constraint() for a generic wrapper to add any of the available constraints.

  • add_constraint_dispersal() for specifying dispersal constraint on the temporal projections at each step.

  • add_constraint_MigClim() Using the MigClim R-package to simulate dispersal in projections.

  • add_constraint_connectivity() Apply a connectivity constraint at the projection, for instance by adding a barrier that prevents migration.

  • add_constraint_minsize() Adds a constraint on the minimum area a given thresholded patch should have, assuming that smaller areas are in fact not suitable.

  • add_constraint_adaptability() Apply an adaptability constraint to the projection, for instance constraining the speed a species is able to adapt to new conditions.

  • add_constraint_boundary() To artificially limit the distribution change. Similar as specifying projection limits, but can be used to specifically constrain a projection within a certain area (e.g. a species range or an island).

Many constraints also require thresholds to be calculated. Adding threshold() to a BiodiversityScenario object enables the computation of thresholds at every step based on the threshold used for the main model (threshold values are taken from there).

It is also possible to make a complementary simulation with the steps package, which can be provided via simulate_population_steps() to the BiodiversityScenario object. As with thresholds, the estimated values will then be added to the outputs.

Finally, this function also allows temporal stabilization across prediction steps by enabling the parameter stabilize and setting the stabilize_method argument. Stabilization can for instance be helpful in situations where environmental variables are quite dynamic, but changes in projected suitability are not expected to increase or decrease abruptly. It is thus a way to smooth out outliers from the projection. Currently 'loess' is supported, which fits a loess() model per pixel and time step. This is conducted at the very end of the processing steps, and any thresholds will be recalculated afterwards.

Value

Saves stars objects of the obtained predictions in mod.

See Also

scenario()

Examples

## Not run: 
# Fit a model
fit <- distribution(background) |>
        add_biodiversity_poipa(surveydata) |>
        add_predictors(env = predictors) |>
        engine_breg() |>
        train()

# Fit a scenario
sc <- scenario(fit) |>
        add_predictors(env = future_predictors) |>
        project()

## End(Not run)
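As described in the Details, projections can also be thresholded and stabilized at each step. A hedged sketch, assuming fit is a trained model and future_predictors a set of prepared scenario covariates:

```r
## Not run: 
sc <- scenario(fit) |>
        add_predictors(env = future_predictors) |>
        threshold() |>
        project(stabilize = TRUE, stabilize_method = "loess")

## End(Not run)
```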

Settings for specifying pseudo-absence points within the model background

Description

This function defines the settings for pseudo-absence sampling of the background. For many engines such points are necessary to model Poisson (or Binomial) distributed point process data. Specifically we call absence points for Binomial (Bernoulli really) distributed responses 'pseudo-absence' and absence data for Poisson responses 'background' points. For more details read Renner et al. (2015).

The function 'add_pseudoabsence' allows adding absence points to any sf object. See Details for additional parameter descriptions and examples of how to 'turn' a presence-only dataset into a presence-(pseudo-)absence one.

Usage

pseudoabs_settings(
  background = NULL,
  nrpoints = 10000,
  min_ratio = 0.25,
  method = "random",
  buffer_distance = 10000,
  inside = FALSE,
  layer = NULL,
  bias = NULL,
  ...
)

## S4 method for signature 'ANY'
pseudoabs_settings(
  background = NULL,
  nrpoints = 10000,
  min_ratio = 0.25,
  method = "random",
  buffer_distance = 10000,
  inside = FALSE,
  layer = NULL,
  bias = NULL,
  ...
)

Arguments

background

A SpatRaster or sf object over which background points can be sampled. Defaults to NULL, in which case the background is added when the sampling is first called.

nrpoints

A numeric given the number of absence points to be created. Has to be larger than 0 and normally points are not created in excess of the number of cells of the background (Default: 10000).

min_ratio

A numeric with the minimum ratio of background points relative to the presence points. Setting this value to 1 generates an equal amount of absence points relative to the presence points. Usually ignored unless the ratio exceeds the nrpoints parameters (Default: 0.25).

method

character denoting how the sampling should be done. See details for options (Default: "random").

buffer_distance

numeric A distance from the observations in which pseudo-absence points are not to be generated. Note that units follow the units of the projection (e.g. m or °). Only used when method = "buffer".

inside

A logical value of whether absence points should be sampled outside (Default) or inside a minimum convex polygon or range provided the respective method is chosen (parameter method = "mcp" or method = "range").

layer

A sf or SpatRaster (in the case of method 'zones') object indicating the range of a species. Only used with method = "range" or method = "zones" (Default: NULL).

bias

A SpatRaster with the same extent and projection as the background. Absence points will be preferentially sampled in areas with higher (!) bias. (Default: NULL).

...

Any other settings to be added to the pseudoabs settings.

Details

There are multiple methods available for sampling a biased background layer. Possible parameters for method are:

  • 'random' Absence points are generated randomly over the background (Default),

  • 'buffer' Absence points are generated only within a buffered distance of existing points. This option requires the specification of the parameter buffer_distance.

  • 'mcp' Can be used to only generate absence points within or outside a minimum convex polygon of the presence points. The parameter inside specifies whether points should be sampled inside or outside (Default) the minimum convex polygon.

  • 'range' Absence points are created either inside or outside a provided additional layer that indicates for example a range of species (controlled through parameter inside).

  • 'zones' A ratified (e.g. of type factor) SpatRaster layer depicting zones from which absence points are to be sampled. This method checks which points fall within which zones and then samples absence points either within or outside these zones exclusively. Both 'layer' and 'inside' have to be set for this option.

  • 'target' Make use of a target background for sampling absence points. Here a SpatRaster object has to be provided through the parameter 'layer'. Absence points are then sampled exclusively within the target areas for grid cells with non-zero values.

References

  • Renner IW, Elith J, Baddeley A, Fithian W, Hastie T, Phillips SJ, Popovic G, Warton DI. 2015. Point process models for presence-only analysis. Methods in Ecology and Evolution 6:366–379. DOI: 10.1111/2041-210X.12352.

  • Renner, I. W., & Warton, D. I. (2013). Equivalence of MAXENT and Poisson point process models for species distribution modeling in ecology. Biometrics, 69(1), 274-281.

Examples

## Not run: 
# This setting generates 10000 pseudo-absence points outside the
# minimum convex polygon of presence points
ass1 <- pseudoabs_settings(nrpoints = 10000, method = 'mcp', inside = FALSE)

# This setting would match the number of presence-absence points directly.
ass2 <- pseudoabs_settings(nrpoints = 0, min_ratio = 1)

# These settings can then be used to add pseudo-absence data to a
# presence-only dataset. This effectively adds these simulated absence
# points to the resulting model
all_my_points <- add_pseudoabsence(
                     df = virtual_points,
                     field_occurrence = 'observed',
                     template = background,
                     settings = ass1)

## End(Not run)
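Two further hedged sketches of the sampling methods listed in the Details; bias_layer is an assumed SpatRaster matching the background, not part of this entry:

```r
## Not run: 
# Sample absence points only within 25000 units (e.g. metres,
# following the projection) of existing presence points
ass3 <- pseudoabs_settings(method = 'buffer', buffer_distance = 25000)

# Preferentially sample absence points in areas of higher bias
ass4 <- pseudoabs_settings(method = 'random', bias = bias_layer)

## End(Not run)
```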

render_html

Description

Renders DistributionModel to HTML

Usage

render_html(mod, file, title = NULL, author = NULL, notes = "-", ...)

## S4 method for signature 'ANY'
render_html(mod, file, title = NULL, author = NULL, notes = "-", ...)

Arguments

mod

Any object belonging to DistributionModel

file

Character with path to file.

title

Character with title of document.

author

Character with name of author.

notes

Character with notes added at the beginning of the document.

...

Currently not used

Details

Renders an HTML file with several summaries of a trained DistributionModel. The file path must have an .html ending. The function creates a temporary Rmd file that is then rendered to HTML at the path given by the file argument.

Value

Writes HTML file

Examples

## Not run: 
mod <- distribution(background) |>
  add_biodiversity_poipo(species) |>
  add_predictors(predictors) |>
  engine_glmnet() |>
  train()

render_html(mod, file = "Test.html")

## End(Not run)

Remove specific BiodiversityDataset from a distribution object

Description

Remove a particular dataset (or all) from a distribution object with a BiodiversityDatasetCollection.

Usage

rm_biodiversity(x, name, id)

## S4 method for signature 'BiodiversityDistribution'
rm_biodiversity(x, name, id)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

name

A character with the name of the biodiversity dataset.

id

A character with the id of the biodiversity dataset.

Examples

## Not run: 
distribution(background) |>
 add_biodiversity_poipa(species, "Duckus communus") |>
 rm_biodiversity(name = "Duckus communus")

## End(Not run)

Remove control from an existing distribution object

Description

This function allows removing previously set control options from an existing distribution object.

Usage

rm_control(x)

## S4 method for signature 'BiodiversityDistribution'
rm_control(x)

Arguments

x

distribution (i.e. BiodiversityDistribution) object.

See Also

add_control_bias()

Other control: rm_limits()

Examples

## Not run: 
 x <- distribution(background) |>
   add_predictors(covariates) |>
   add_control_bias(method = "proximity")
 x <- x |> rm_control()
 x

## End(Not run)

Function to remove a latent effect

Description

This is just a wrapper function for removing a specified latent effect from a BiodiversityDistribution object.

Usage

rm_latent(x)

## S4 method for signature 'BiodiversityDistribution'
rm_latent(x)

## S4 method for signature 'BiodiversityScenario'
rm_latent(x)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

Value

Removes a latent spatial effect from a distribution object.

See Also

add_latent_spatial

Examples

## Not run: 
 rm_latent(model) -> model

## End(Not run)

Remove limits from an existing distribution object

Description

This function allows removing previously set limits from an existing distribution object.

Usage

rm_limits(x)

## S4 method for signature 'BiodiversityDistribution'
rm_limits(x)

Arguments

x

distribution (i.e. BiodiversityDistribution) object.

See Also

add_limits_extrapolation()

Other control: rm_control()

Examples

## Not run: 
 x <- distribution(background) |>
   add_predictors(covariates) |>
   add_limits_extrapolation(method = "zones", layer = zones)
 x <- x |> rm_limits()
 x

## End(Not run)

Function to remove an offset

Description

This is just a wrapper function for removing specified offsets from a BiodiversityDistribution object.

Usage

rm_offset(x, layer = NULL)

## S4 method for signature 'BiodiversityDistribution'
rm_offset(x, layer = NULL)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

layer

A character pointing to the specific layer to be removed. If set to NULL, then all offsets are removed from the object.

Value

Removes an offset from a distribution object.

See Also

Other offset: add_offset(), add_offset_bias(), add_offset_elevation(), add_offset_range()

Examples

## Not run: 
 rm_offset(model) -> model

## End(Not run)

Remove specific predictors from a distribution object

Description

Remove a particular variable from a distribution object with a PredictorDataset. See Examples.

Usage

rm_predictors(x, names)

## S4 method for signature 'BiodiversityDistribution,character'
rm_predictors(x, names)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

names

A vector of character names describing the environmental stack.

Examples

## Not run: 
distribution(background) |>
 add_predictors(my_covariates) |>
 rm_predictors(names = "Urban")

## End(Not run)

Remove existing priors from an existing distribution object

Description

This function allows removing priors from an existing distribution object. In order to remove a set prior, the name of the prior has to be specified.

Usage

rm_priors(x, names = NULL, ...)

## S4 method for signature 'BiodiversityDistribution'
rm_priors(x, names = NULL, ...)

Arguments

x

distribution (i.e. BiodiversityDistribution) object.

names

A vector or character object for priors to be removed.

...

Other parameters passed down

See Also

Other prior: BARTPrior(), BARTPriors(), BREGPrior(), BREGPriors(), GDBPrior(), GDBPriors(), GLMNETPrior(), GLMNETPriors(), INLAPrior(), INLAPriors(), STANPrior(), STANPriors(), XGBPrior(), XGBPriors(), add_priors(), get_priors(), priors()

Examples

## Not run: 
 # Add prior
 pp <-  GLMNETPrior("forest")
 x <- distribution(background) |>
  add_priors(pp)
 # Remove again
 x <- x |> rm_priors("forest")

## End(Not run)

Parallel computation of function

Description

Some computations take a considerable amount of time to execute. This function provides a helper wrapper for running functions of the apply family in parallel.

Usage

run_parallel(
  X,
  FUN,
  cores = 1,
  approach = "future",
  export_packages = NULL,
  ...
)

Arguments

X

A list, data.frame or matrix object to be fed to a single core or parallel apply call.

FUN

A function passed on for computation.

cores

A numeric of the number of cores to use (Default: 1).

approach

character for the parallelization approach taken (Options: "parallel" or "future").

export_packages

A vector with packages to export for use on parallel nodes (Default: NULL).

...

Any other parameter passed on.

Details

By default, the parallel package is used for parallel computation, however an option exists to use the future package instead.

Examples

## Not run: 
 run_parallel(list, mean, cores = 4)

## End(Not run)
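A slightly fuller hedged sketch; df is an assumed data.frame with species and observed columns, purely illustrative:

```r
## Not run: 
# Apply a function over a split list on 4 cores via the future
# backend, exporting the terra package to each worker
out <- run_parallel(
   X = split(df, df$species),
   FUN = function(z) summary(z$observed),
   cores = 4,
   approach = "future",
   export_packages = c("terra")
 )

## End(Not run)
```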

Fit cmdstanr model and convert to rstan object

Description

This function fits a stan model using the light-weight interface provided by cmdstanr. The code was adapted from McElreath's rethinking package.

Usage

run_stan(
  model_code,
  data = list(),
  algorithm = "sampling",
  chains = 4,
  cores = getOption("ibis.nthread"),
  threads = 1,
  iter = 1000,
  warmup = floor(iter/2),
  control = list(adapt_delta = 0.95),
  cpp_options = list(),
  force = FALSE,
  path = base::getwd(),
  save_warmup = TRUE,
  ...
)

Arguments

model_code

A character pointing to the stan modelling code.

data

A list with all the parameters required to run the model_code in stan.

algorithm

A character giving the algorithm to use. Either 'sampling' (Default), 'optimize' or 'variational' for penalized likelihood estimation.

chains

A numeric indicating the number of chains to use for estimation.

cores

Number of threads for sampling. Default set to 'getOption("ibis.nthread")'. See ibis_options().

threads

numeric giving the number of threads to be run per chain. Has to be specified in accordance with cores.

iter

A numeric value giving the number of MCMC samples to generate.

warmup

numeric for the number of warm-up samples for MCMC. Default set to 1/2 of iter.

control

A list with further control options for stan.

cpp_options

A list with options for the Cpp compiling.

force

A logical indicating whether to force recompilation of the model (Default: FALSE).

path

character indicating a path to be made available to the stan compiler.

save_warmup

A logical flag whether to save the warmup samples.

...

Other non-specified parameters.

Value

A rstan object

See Also

rethinking R package
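
Examples

A hedged sketch of a typical call; mc (valid Stan model code) and dat (a matching data list) are assumed objects, not part of this entry:

```r
## Not run: 
fit <- run_stan(
   model_code = mc,
   data = dat,
   chains = 2,
   iter = 2000,
   warmup = 500,
   cores = 2
 )
# The result is an rstan object, so standard rstan tooling applies
summary(fit)

## End(Not run)
```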


Sanitize variable names

Description

Prepared covariates often have special characters in their variable names, which may not be usable in formulas or may cause errors for certain engines. This function converts special characters in variable names into a syntactically valid format.

Usage

sanitize_names(names)

Arguments

names

A vector of character vectors to be sanitized.

Value

A vector of sanitized character.

Examples

# Correct variable names
vars <- c("Climate-temperature2015", "Elevation__sealevel", "Landuse.forest..meanshare")
sanitize_names(vars)

Create a new scenario based on trained model parameters

Description

This function creates a new BiodiversityScenario object that contains the projections of a model.

Usage

scenario(fit, limits = NULL, reuse_limits = FALSE, copy_model = FALSE)

## S4 method for signature 'ANY'
scenario(fit, limits = NULL, reuse_limits = FALSE, copy_model = FALSE)

Arguments

fit

A BiodiversityDistribution object containing a trained model.

limits

A SpatRaster or sf object that limits the projection surface when intersected with the prediction data (Default: NULL). This can for instance be set as an expert-delineated constraint to limit spatial projections.

reuse_limits

A logical on whether to reuse limits if found in the trained BiodiversityDistribution object (Default: FALSE). See also notes!

copy_model

A logical of whether the model object is to be copied to the scenario object. Note that setting this option to TRUE can increase the required amount of memory (Default: FALSE).

Note

If a limit has been defined already during train(), for example by adding an extrapolation limit add_limits_extrapolation(), this zonal layer can be reused for the projections. Note: This effectively fixes the projections to certain areas.

Examples

## Not run: 
  scenario(fit, limits = island_area)

## End(Not run)

Select specific predictors from a distribution object

Description

This function allows selecting a particular set of predictors from an already added PredictorDataset object via a character vector of their names. See Examples.

Usage

sel_predictors(x, names)

## S4 method for signature 'BiodiversityDistribution,character'
sel_predictors(x, names)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

names

A vector of character names describing the environmental stack.

Examples

## Not run: 
distribution(background) |>
 add_predictors(my_covariates) |>
 sel_predictors(names = c("Forest", "Elevation"))

## End(Not run)

Add priors to an existing distribution object

Description

This function simply allows adding priors to an existing distribution object. The supplied priors must be a PriorList object created by calling priors().

Usage

set_priors(x, priors = NULL, ...)

## S4 method for signature 'BiodiversityDistribution'
set_priors(x, priors = NULL, ...)

Arguments

x

distribution (i.e. BiodiversityDistribution) object.

priors

A PriorList object containing multiple priors.

...

Other parameters passed down.

Note

Alternatively, priors for environmental predictors can also be added directly as a parameter via add_predictors.

See Also

Other prior: BARTPrior(), BARTPriors(), BREGPrior(), BREGPriors(), GDBPrior(), GDBPriors(), GLMNETPrior(), GLMNETPriors(), INLAPrior(), INLAPriors(), STANPrior(), STANPriors(), XGBPrior(), XGBPriors(), get_priors(), priors(), rm_priors()

Examples

## Not run: 
 pp <-  GLMNETPrior("forest")
 x <- distribution(background) |>
  add_priors(pp)


## End(Not run)

Add priors to an existing distribution object

Description

This function adds priors to an existing distribution object. The supplied priors must be a PriorList object created by calling priors.

Usage

## S4 method for signature 'BiodiversityDistribution'
set_priors(x, priors = NULL, ...)

Arguments

x

distribution (i.e. BiodiversityDistribution) object.

priors

A PriorList object containing multiple priors.

...

Other parameters passed down.

Note

Alternatively, priors on environmental predictors can also be added directly as a parameter via add_predictors.

See Also

Other prior: BARTPrior(), BARTPriors(), BREGPrior(), BREGPriors(), GDBPrior(), GDBPriors(), GLMNETPrior(), GLMNETPriors(), INLAPrior(), INLAPriors(), STANPrior(), STANPriors(), XGBPrior(), XGBPriors(), get_priors(), priors(), rm_priors()

Examples

## Not run: 
 pp <-  GLMNETPrior("forest")
 x <- distribution(background) |>
  add_priors(pp)


## End(Not run)

Prototype for model settings object

Description

Basic R6 object for Settings object, a List that stores settings used related to model training.

Public fields

name

The default name of this settings object as character.

modelid

A character of the model id this belongs to.

data

A list of contained settings.

Methods

Public methods


Method new()

Initializes the object and creates an empty list

Usage
Settings$new()
Returns

NULL


Method print()

Print the names and properties of all settings contained within.

Usage
Settings$print()
Returns

A message on screen


Method show()

Shows the name and the settings

Usage
Settings$show()
Returns

A character of the name and settings.


Method length()

Number of options

Usage
Settings$length()
Returns

A numeric with the number of options.


Method duration()

Computation duration convenience function

Usage
Settings$duration()
Returns

The amount of time passed for model fitting if found.


Method summary()

Summary call of the contained parameters

Usage
Settings$summary()
Returns

A list with the parameters in this object.


Method get()

Get a specific setting

Usage
Settings$get(what)
Arguments
what

A character with the respective setting.

Returns

The setting if found in the object.


Method set()

Set new settings

Usage
Settings$set(what, x, copy = FALSE)
Arguments
what

A character with the name for the new settings.

x

The new setting to be stored. Can be any object.

copy

logical on whether a new settings object is to be created.

Returns

The setting if found in the object.


Method clone()

The objects of this class are cloneable with this method.

Usage
Settings$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.
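
Examples

A minimal sketch of interacting with a Settings object. The assumption that a trained model mod exposes its settings via mod$settings is illustrative and may differ between engines.

## Not run: 
 s <- mod$settings
 s$summary()                # list all stored parameters
 s$get("optim_hyperparam")  # retrieve a single setting
 s$duration()               # time spent on model fitting

## End(Not run)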


Calculate environmental similarity of reference datasets to predictors.

Description

Calculate the environmental similarity of the provided covariates with respect to a reference dataset. Currently supported is Multivariate Environmental Similarity index and the multivariate combination novelty index (NT2) based on the Mahalanobis divergence (see references).

Usage

similarity(
  obj,
  ref,
  ref_type = "poipo",
  method = "mess",
  predictor_names = NULL,
  full = FALSE,
  plot = TRUE,
  ...
)

## S4 method for signature 'BiodiversityDistribution'
similarity(
  obj,
  ref,
  ref_type = "poipo",
  method = "mess",
  predictor_names = NULL,
  full = FALSE,
  plot = TRUE,
  ...
)

## S4 method for signature 'SpatRaster'
similarity(
  obj,
  ref,
  ref_type = "poipo",
  method = "mess",
  predictor_names = NULL,
  full = FALSE,
  plot = TRUE,
  ...
)

Arguments

obj

A BiodiversityDistribution, DistributionModel or alternatively a SpatRaster object.

ref

A BiodiversityDistribution, DistributionModel or alternatively a data.frame with extracted values (corresponding to those given in obj).

ref_type

A character specifying the type of biodiversity to use when obj is a BiodiversityDistribution.

method

A specific method for similarity calculation. Currently supported: 'mess', 'nt'.

predictor_names

An optional character specifying the covariates to be used (Default: NULL).

full

A logical on whether similarity values should be returned for all variables (Default: FALSE).

plot

Should the result be plotted? Otherwise return the output list (Default: TRUE).

...

other options (Non specified).

Details

similarity implements the MESS algorithm described in Appendix S3 of Elith et al. (2010) as well as the Mahalanobis dissimilarity described in Mesgaran et al. (2014).

Value

This function returns a list containing:

  • similarity: A SpatRaster object with multiple layers giving the environmental similarities for each variable in x (only included when "full=TRUE");

  • mis: a SpatRaster layer giving the minimum similarity value across all variables for each location (i.e. the MESS);

  • exip: a SpatRaster layer indicating whether any model would interpolate or extrapolate to this location based on environmental surface;

  • mod: a factor SpatRaster layer indicating which variable was most dissimilar to its reference range (i.e. the MoD map, Elith et al. 2010); and

  • mos: a factor SpatRaster layer indicating which variable was most similar to its reference range.

References

  • Elith, J., Kearney, M., and Phillips, S. (2010) "The art of modelling range-shifting species". Methods in Ecology and Evolution, 1: 330-342. https://doi.org/10.1111/j.2041-210X.2010.00036.x

  • Mesgaran, M.B., Cousens, R.D. and Webber, B.L. (2014) "Here be dragons: a tool for quantifying novelty due to covariate range and correlation change when projecting species distribution models". Diversity and Distributions, 20: 1147-1159. https://doi.org/10.1111/ddi.12209

See Also

dismo R-package.

Examples

## Not run: 
plot(
  similarity(x) # Where x is a distribution or Raster object
)
 
## End(Not run)
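
A slightly fuller sketch; the parameter choices below (method = "nt", full = TRUE) are illustrative assumptions, not recommendations.

## Not run: 
 # Compute the NT2 novelty index and keep all per-variable layers
 out <- similarity(x, method = "nt", full = TRUE, plot = FALSE)

## End(Not run)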

Simulate population dynamics following the steps approach

Description

This function adds a flag to a BiodiversityScenario object to indicate that species abundances are to be simulated based on the expected habitat suitability, as well as demography, density-dependence and dispersal information. The simulation is done using the steps package (Visintin et al. 2020) and conducted after a habitat suitability projection has been created. steps is a spatially explicit population modelling package coded mostly in R.

For a detailed description of steps parameters, please see the respective reference and help files. Default assumptions underlying this wrapper are presented in the Details section.

Usage

simulate_population_steps(
  mod,
  vital_rates,
  replicates = 1,
  carrying_capacity = NULL,
  initial = NULL,
  dispersal = NULL,
  density_dependence = NULL,
  include_suitability = TRUE
)

## S4 method for signature 'BiodiversityScenario,matrix'
simulate_population_steps(
  mod,
  vital_rates,
  replicates = 1,
  carrying_capacity = NULL,
  initial = NULL,
  dispersal = NULL,
  density_dependence = NULL,
  include_suitability = TRUE
)

Arguments

mod

A BiodiversityScenario object with specified predictors.

vital_rates

A square demographic transition matrix. Should have column and row names matching the vital stages that are to be estimated.

replicates

A numeric vector of the number of replicates (Default: 1).

carrying_capacity

Either a SpatRaster or a numeric estimate of the maximum carrying capacity, i.e. how many adult individuals are likely to occur per grid cell. If set to numeric, then the carrying capacity is estimated up to this maximum (Note: a more clever way would be to use a species-area relationship for scaling. This is not yet implemented).

initial

A SpatRaster giving the initial population size. If not provided, then initial populations are guessed (see details) from the projected suitability rasters (Default: NULL).

dispersal

A dispersal object defined by the steps package (Default: NULL).

density_dependence

Specification of density dependence defined by the steps package (Default: NULL).

include_suitability

A logical flag on whether the projected suitability estimates should be used (Default: TRUE) or only the initial conditions set to the first time step.

Details

In order for this function to work the steps package has to be installed separately. Instructions to do so can be found on github.

If initial population lifestages are not provided, then they are estimated assuming a linear scaling with suitability, a 50:50 split between sexes and a 1:3 ratio of adults to juveniles. The provision of different parameters is highly encouraged!

Value

Adds flag to a BiodiversityScenario object to indicate that further simulations are added during projection.

Note

The steps package has multiple options for simulating species population and not all possible options are represented in this wrapper.

Furthermore, the package still makes use of the raster package for much of its internal data processing. Since ibis.iSDM switched to terra a while ago, there can be efficiency problems as layers need to be translated between packages.

References

  • Visintin, C., Briscoe, N. J., Woolley, S. N., Lentini, P. E., Tingley, R., Wintle, B. A., & Golding, N. (2020). steps: Software for spatially and temporally explicit population simulations. Methods in Ecology and Evolution, 11(4), 596-603. https://doi.org/10.1111/2041-210X.13354

See Also

Other constraint: add_constraint(), add_constraint_MigClim(), add_constraint_adaptability(), add_constraint_boundary(), add_constraint_connectivity(), add_constraint_dispersal(), add_constraint_minsize(), add_constraint_threshold()

Examples

## Not run: 
# Define vital rates
vt <- matrix(c(0.0,0.5,0.75,
               0.5,0.2,0.0,
               0.0,0.5,0.9),
               nrow = 3, ncol = 3, byrow = TRUE)
colnames(vt) <- rownames(vt) <- c('juvenile','subadult','adult')

# Assumes that a trained 'model' object exists
 mod <- scenario(model) |>
  add_predictors(env = predictors, transform = 'scale',
                 derivates = "none") |>
  # Use Vital rates here, but note the other parameters!
  simulate_population_steps(vital_rates = vt) |>
  project()

## End(Not run)

Obtain spatial partial effects of trained model

Description

Similar to partial, this function calculates the partial response of a trained model for a given variable. Different from partial, however, the result is a SpatRaster showing the spatial magnitude of the partial response.

Usage

spartial(mod, x.var, constant = NULL, newdata = NULL, plot = FALSE, ...)

## S4 method for signature 'ANY,character'
spartial(mod, x.var, constant = NULL, newdata = NULL, plot = FALSE, ...)

spartial.DistributionModel(mod, ...)

Arguments

mod

A DistributionModel object with trained model.

x.var

A character indicating the variable for which a partial effect is to be calculated.

constant

A numeric constant to be inserted for all other variables. Default calculates the mean per variable.

newdata

A data.frame for which to calculate the spartial. Can for example be created from a raster file (Default: NULL).

plot

A logical indicating whether the result is to be plotted.

...

Other engine specific parameters.

Details

By default the mean is calculated across all parameters that are not x.var. Instead a constant can be set (for instance 0) to be applied to the output.

Value

A SpatRaster containing the mapped partial response of the variable.

See Also

partial

Examples

## Not run: 
 # Create and visualize the spartial effect
 spartial(fit, x.var = "Forest.cover", plot = TRUE)

## End(Not run)

Show the stan code from a trained model

Description

This helper function shows the code from a trained DistributionModel using the engine_stan. This function is modelled after similar functionality in the brms R-package. It only works with models inferred with Stan!

Usage

stancode(obj, ...)

stancode.DistributionModel(obj, ...)

Arguments

obj

Any prepared object.

...

not used.

Value

None.

See Also

rstan, cmdstanr, brms
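
Examples

A minimal sketch; the object fit is assumed to be a DistributionModel trained with engine_stan.

## Not run: 
 stancode(fit)

## End(Not run)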


Create a new STAN prior

Description

Function to create a new prior for engine_stan models. Priors currently can be set on specific environmental predictors.

Usage

STANPrior(variable, type, hyper = c(0, 2), ...)

## S4 method for signature 'character,character'
STANPrior(variable, type, hyper = c(0, 2), ...)

Arguments

variable

A character matched against existing predictors or latent effects.

type

A character specifying the type of prior to be set.

hyper

A vector with numeric values to be used as hyper-parameters. The first entry is treated as the mean (Default: 0), the second as the standard deviation (Default: 2) of a Gaussian distribution on the respective coefficient.

...

Variables passed on to prior object.

References

  • Lemoine, N. P. (2019). Moving beyond noninformative priors: why and how to choose weakly informative priors in Bayesian analyses. Oikos, 128(7), 912-928.

  • Carpenter, B., Gelman, A., Hoffman, M. D., Lee, D., Goodrich, B., Betancourt, M., ... & Riddell, A. (2017). Stan: A probabilistic programming language. Journal of statistical software, 76(1), 1-32.

See Also

Prior.

Other prior: BARTPrior(), BARTPriors(), BREGPrior(), BREGPriors(), GDBPrior(), GDBPriors(), GLMNETPrior(), GLMNETPriors(), INLAPrior(), INLAPriors(), STANPriors(), XGBPrior(), XGBPriors(), add_priors(), get_priors(), priors(), rm_priors()

Examples

## Not run: 
 pp <- STANPrior("forest", "normal", c(0,1))

## End(Not run)

Helper function when multiple variables and types are supplied for STAN

Description

This is a helper function to specify several STANPrior with the same hyper-parameters, but different variables.

Usage

STANPriors(variables, type, hyper = c(0, 2), ...)

## S4 method for signature 'vector,character'
STANPriors(variables, type, hyper = c(0, 2), ...)

Arguments

variables

A vector of character matched against existing predictors or latent effects.

type

A character specifying the type of prior to be set.

hyper

A vector with numeric values to be used as hyper-parameters.

...

Variables passed on to prior object

See Also

Other prior: BARTPrior(), BARTPriors(), BREGPrior(), BREGPriors(), GDBPrior(), GDBPriors(), GLMNETPrior(), GLMNETPriors(), INLAPrior(), INLAPriors(), STANPrior(), XGBPrior(), XGBPriors(), add_priors(), get_priors(), priors(), rm_priors()
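
Examples

A minimal sketch; the predictor names "forest" and "cropland" are assumptions for illustration.

## Not run: 
 pps <- STANPriors(c("forest", "cropland"), "normal", c(0, 1))

## End(Not run)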


Summarises a trained model or predictor object

Description

This helper function summarizes a given object, including DistributionModel, PredictorDataset or PriorList objects and others. This can be a helpful way to summarize what is contained within and the values of specified models or objects.

When unsure, it is usually a good strategy to run summary on any object.

Usage

## S3 method for class 'distribution'
summary(object, ...)

## S3 method for class 'DistributionModel'
summary(object, ...)

## S3 method for class 'PredictorDataset'
summary(object, ...)

## S3 method for class 'BiodiversityScenario'
summary(object, ...)

## S3 method for class 'PriorList'
summary(object, ...)

## S3 method for class 'Settings'
summary(object, ...)

Arguments

object

Any prepared object.

...

not used.

See Also

base::summary().

Examples

## Not run: 
# Example with a trained model
x <- distribution(background) |>
        # Presence-absence data
        add_biodiversity_poipa(surveydata) |>
        # Add predictors and scale them
        add_predictors(env = predictors) |>
        # Use glmnet and lasso regression for estimation
        engine_glmnet(alpha = 1)
 # Train the model
 mod <- train(x)
 summary(mod)

 # Example with a prior object
 p1 <- BREGPrior(variable = "forest", hyper = 2, ip = NULL)
 p2 <- BREGPrior(variable = "cropland", hyper = NULL, ip = 1)
 pp <- priors(p1,p2)
 summary(pp)

## End(Not run)

Functionality for geographic and environmental thinning

Description

For most species distribution modelling approaches it is assumed that occurrence records are unbiased, which is rarely the case. While model-based control can alleviate some of the effects of sampling bias, it can often be desirable to account for some sampling biases through spatial thinning (Aiello‐Lammens et al. 2015). This is an approach based on the assumption that over-sampled grid cells contribute little more than bias, rather than strengthening any environmental responses. This function provides some methods to apply spatial thinning approaches. Note that this effectively removes data prior to any estimation and its use should be considered with care (see also Steen et al. 2021).

Usage

thin_observations(
  data,
  background,
  env = NULL,
  method = "random",
  remainpoints = 10,
  mindistance = NULL,
  zones = NULL,
  probs = 0.75,
  global = TRUE,
  centers = NULL,
  verbose = TRUE
)

Arguments

data

A sf object with observed occurrence points. All methods treat presence-only and presence-absence occurrence points equally.

background

A SpatRaster object with the background of the study region. Used for assessing point density.

env

A SpatRaster object with environmental covariates. Needed when method is set to "environmental" or "bias" (Default: NULL).

method

A character of the method to be applied (Default: "random").

remainpoints

A numeric giving the number of data points at minimum to remain (Default: 10).

mindistance

A numeric for the minimum distance of neighbouring observations (Default: NULL).

zones

A SpatRaster to be supplied when option "zones" is chosen (Default: NULL).

probs

A numeric used as quantile threshold in "bias" method. (Default: 0.75).

global

A logical if during "bias" method global (entire env raster) or local (extracted at point locations) bias values are used as for quantile threshold. (Default: TRUE).

centers

A numeric used as the number of centers for the "environmental" method (Default: NULL). If not set, automatically set to three or nlayers - 1 (whichever is bigger).

verbose

logical of whether to print some statistics about the thinning outcome (Default: TRUE).

Details

All methods only remove points from "over-sampled" grid cells/areas. These are defined as all cells/areas which either have more points than remainpoints or more points than the global minimum point count per cell/area (whichever is larger).

Currently implemented thinning methods:

  • "random": Samples at random across all over-sampled grid cells returning only "remainpoints" from over-sampled cells. Does not account for any spatial or environmental distance between observations.

  • "bias": This option removes explicitly points that are considered biased only (based on "env"). Points are only thinned from grid cells which are above the bias quantile (larger values equals greater bias). Thins the observations returning "remainpoints" from each over-sampled and biased cell.

  • "zones": Thins observations from each zone that is above the over-sampled threshold and returns "remainpoints" for each zone. Careful: If the zones are relatively wide this can remove quite a few observations.

  • "environmental": This approach creates an observation-wide clustering (k-means) under the assumption that the full environmental niche has been comprehensively sampled and is covered by the provided covariates env. For each over-sampled cluster, we then obtain ("remainpoints") by thinning points.

  • "spatial": Calculates the spatial distance between all observations. Then points are removed iteratively until the minimum distance between points is crossed. The "mindistance" parameter has to be set for this function to work.

References

  • Aiello‐Lammens, M. E., Boria, R. A., Radosavljevic, A., Vilela, B., & Anderson, R. P. (2015). spThin: an R package for spatial thinning of species occurrence records for use in ecological niche models. Ecography, 38(5), 541-545.

  • Steen, V. A., Tingley, M. W., Paton, P. W., & Elphick, C. S. (2021). Spatial thinning and class balancing: Key choices lead to variation in the performance of species distribution models with citizen science data. Methods in Ecology and Evolution, 12(2), 216-226.

Examples

## Not run: 
 # Thin a certain number of observations
 # At random
 thin_points <- thin_observations(points, background, method = "random")
 # using a bias layer
 thin_points <- thin_observations(points, background, method = "bias", env = bias)

## End(Not run)
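
A further sketch for the "spatial" method; the minimum distance of 2000 (in the units of the coordinate reference system of points) is an illustrative assumption.

## Not run: 
 thin_points <- thin_observations(points, background, method = "spatial",
                                  mindistance = 2000)

## End(Not run)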

Threshold a continuous prediction to a categorical layer

Description

It is common in many applications of species distribution modelling that estimated continuous suitability surfaces are converted into discrete representations of where suitable habitat might or might not exist. This so-called thresholding can be done in various ways, which are further described in the details.

In case a SpatRaster is provided as input in this function for obj, it is furthermore necessary to provide a sf object for validation as there is no DistributionModel to read this information from.

Note: This of course also allows to estimate the threshold based on withheld data, for instance those created from an a-priori cross-validation procedure.

For BiodiversityScenario objects, adding this function to the processing pipeline stores a threshold attribute in the created scenario object.

For BiodiversityScenario objects a set threshold() simply indicates that the projection should create and use thresholds as part of the results. The threshold values for this are either taken from the provided model or from an optionally provided parameter value.

If instead the aim is to apply thresholds to each step of the suitability projection, see add_constraint_threshold().

Usage

threshold(
  obj,
  method = "mtp",
  value = NULL,
  point = NULL,
  field_occurrence = "observed",
  format = "binary",
  return_threshold = FALSE,
  ...
)

## S4 method for signature 'ANY'
threshold(
  obj,
  method = "mtp",
  value = NULL,
  point = NULL,
  field_occurrence = "observed",
  format = "binary",
  return_threshold = FALSE,
  ...
)

## S4 method for signature 'SpatRaster'
threshold(
  obj,
  method = "fixed",
  value = NULL,
  point = NULL,
  field_occurrence = "observed",
  format = "binary",
  return_threshold = FALSE
)

## S4 method for signature 'BiodiversityScenario'
threshold(
  obj,
  method = "mtp",
  value = NULL,
  point = NULL,
  field_occurrence = "observed",
  format = "binary",
  return_threshold = FALSE,
  ...
)

Arguments

obj

A BiodiversityScenario object to which an existing threshold is to be added.

method

A specific method for thresholding. See Details for available options.

value

A numeric value specifying the specific threshold for scenarios (Default: NULL, in which case the value is taken from the object).

point

A sf object containing observational data used for model training.

field_occurrence

A character with the name of the column holding the biodiversity point records.

format

character indication of whether "binary", "normalize" or "percentile" formatted thresholds are to be created (Default: "binary"). Also see Muscatello et al. (2021).

return_threshold

A logical on whether the threshold value should be returned instead of the thresholded layer (Default: FALSE).

...

Any other parameter. Used to fetch the value if it has been set elsewhere.

Details

The following options are currently implemented:

  • 'fixed' = applies a single pre-determined threshold. Requires value to be set.

  • 'mtp' = minimum training presence is used to find and set the lowest predicted suitability for any occurrence point.

  • 'percentile' = applies a percentile threshold. Requires value to be set.

  • 'min.cv' = Threshold the raster so to minimize the coefficient of variation (cv) of the posterior. Uses the lowest tercile of the cv in space. Only feasible with Bayesian engines.

  • 'TSS' = Determines the optimal TSS (True Skill Statistic). Requires the "modEvA" package to be installed.

  • 'kappa' = Determines the optimal kappa value (Kappa). Requires the "modEvA" package to be installed.

  • 'F1score' = Determines the optimal F1score (also known as Sorensen similarity). Requires the "modEvA" package to be installed.

  • 'Sensitivity' = Determines the optimal sensitivity of presence records. Requires the "modEvA" package to be installed.

  • 'Specificity' = Determines the optimal specificity of presence records. Requires the "modEvA" package to be installed.

  • 'AUC' = Determines the optimal AUC of presence records. Requires the "modEvA" package to be installed.

  • 'kmeans' = Determines a threshold based on a 2 cluster k-means clustering. The presence class is assumed to be the cluster with the larger mean.
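
As an illustrative sketch of the options above (the cutoff and percentile values below are assumptions, not recommendations):

## Not run: 
 # Fixed threshold at an assumed cutoff
 tr <- threshold(mod, method = "fixed", value = 0.5)
 # Percentile threshold with an assumed percentile value
 tr <- threshold(mod, method = "percentile", value = 0.9)

## End(Not run)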

Value

A SpatRaster if a SpatRaster object was provided as input. Otherwise the threshold is added to the respective DistributionModel or BiodiversityScenario object.

References

  • Lawson, C.R., Hodgson, J.A., Wilson, R.J., Richards, S.A., 2014. Prevalence, thresholds and the performance of presence-absence models. Methods Ecol. Evol. 5, 54–64. https://doi.org/10.1111/2041-210X.12123

  • Liu, C., White, M., Newell, G., 2013. Selecting thresholds for the prediction of species occurrence with presence-only data. J. Biogeogr. 40, 778–789. https://doi.org/10.1111/jbi.12058

  • Muscatello, A., Elith, J., Kujala, H., 2021. How decisions about fitting species distribution models affect conservation outcomes. Conserv. Biol. 35, 1309–1320. https://doi.org/10.1111/cobi.13669

See Also

"modEvA"

Examples

## Not run: 
 # Where mod is an estimated DistributionModel
 tr <- threshold(mod)
 tr$plot_threshold()

## End(Not run)

Train the model from a given engine

Description

This function trains a distribution() model with the specified engine and furthermore has some generic options that apply to all engines (regardless of type). See Details with regards to such options.

Users are advised to check the help files for individual engines for advice on how the estimation is being done.

Usage

train(
  x,
  runname,
  filter_predictors = "none",
  optim_hyperparam = FALSE,
  inference_only = FALSE,
  only_linear = TRUE,
  method_integration = "predictor",
  keep_models = TRUE,
  aggregate_observations = TRUE,
  clamp = FALSE,
  verbose = getOption("ibis.setupmessages", default = TRUE),
  ...
)

## S4 method for signature 'BiodiversityDistribution'
train(
  x,
  runname,
  filter_predictors = "none",
  optim_hyperparam = FALSE,
  inference_only = FALSE,
  only_linear = TRUE,
  method_integration = "predictor",
  keep_models = TRUE,
  aggregate_observations = TRUE,
  clamp = FALSE,
  verbose = getOption("ibis.setupmessages", default = TRUE),
  ...
)

Arguments

x

distribution() (i.e. BiodiversityDistribution) object.

runname

A character name of the trained run.

filter_predictors

A character defining if and how highly correlated predictors are to be removed prior to any model estimation. Available options are:

  • "none" No prior variable removal is performed (Default).

  • "pearson", "spearman" or "kendall" Makes use of pairwise comparisons to identify and remove highly collinear predictors (Pearson's r >= 0.7).

  • "abess" A-priori adaptive best subset selection of covariates via the "abess" package (see References). Note that this effectively fits a separate generalized linear model to reduce the number of covariates.

  • "boruta" Uses the "Boruta" package to identify non-informative features.

optim_hyperparam

A logical parameter to tune the model by iterating over input parameters or the selection of predictors included in each iteration. Can be set to TRUE if extra precision is needed (Default: FALSE).

inference_only

By default the engine is used to create a spatial prediction of the suitability surface, which can take time. If only inferences of the strength of relationship between covariates and observations are required, this parameter can be set to TRUE to ignore any spatial projection (Default: FALSE).

only_linear

Fit model only on linear baselearners and functions. Depending on the engine setting this option to FALSE will result in non-linear relationships between observations and covariates, often increasing processing time (Default: TRUE). How non-linearity is captured depends on the used engine.

method_integration

A character with the type of integration that should be applied if more than one BiodiversityDataset object is provided in x. Particular relevant for engines that do not support the integration of more than one dataset. Integration methods are generally sensitive to the order in which they have been added to the BiodiversityDistribution object. Available options are:

  • "predictor" The predicted output of the first (or previously fitted) models are added to the predictor stack and thus are predictors for subsequent models (Default).

  • "offset" The predicted output of the first (or previously fitted) models are added as spatial offsets to subsequent models. Offsets are back-transformed depending on the model family. This option might not be supported for every Engine.

  • "interaction" Instead of fitting several separate models, the observations from each dataset are combined and incorporated in the prediction as a factor interaction with the "weaker" data source being partialed out during prediction. Here the first dataset added determines the reference level (see Leung et al. 2019 for a description).

  • "prior" In this option we only make use of the coefficients from a previous model to define priors to be used in the next model. Might not work with any engine!

  • "weight" This option only works for multiple biodiversity datasets with the same type (e.g. "poipo"). Individual weight multipliers can be determined while setting up the model (Note: Default is 1). Datasets are then combined for estimation and weighted respectively, thus giving for example presence-only records less weight than survey records. Note that this parameter is ignored for engines that support joint likelihood estimation.

keep_models

A logical; if TRUE and method_integration = "predictor", all models are stored in the .internal list of the model object.

aggregate_observations

logical on whether observations covering the same grid cell should be aggregated (Default: TRUE).

clamp

logical whether predictions should be clamped to the range of predictor values observed during model fitting (Default: FALSE).

verbose

Setting this logical value to TRUE prints out further information during the model fitting (Default: TRUE, taken from the "ibis.setupmessages" option).

...

further arguments passed on.

Details

This function acts as a generic training function that - based on the provided BiodiversityDistribution object - creates a new distribution model. The resulting object contains both a "fit_best" object of the estimated model and, if inference_only is FALSE, a SpatRaster object named "prediction" that contains the spatial prediction of the model. These objects can be requested via object$get_data("fit_best").

Other parameters in this function:

  • "filter_predictors" The parameter can be set to various options to remove highly correlated variables or those with little additional information gain from the model prior to any estimation. Available options are "none" (Default), "pearson" for applying a 0.7 correlation cutoff, "abess" for the regularization framework by Zhu et al. (2020), or "RF" or "randomforest" for removing the least important variables according to a randomForest model. Note: This function is only applied on predictors for which no prior has been provided (e.g. potentially non-informative ones).

  • "optim_hyperparam" This option allows to make use of hyper-parameter search for several models, which can improve prediction accuracy, albeit at a substantial increase in computational cost.

  • "method_integration" Only relevant if more than one BiodiversityDataset is supplied and when the engine does not support joint integration of likelihoods. See also Miller et al. (2019) in the references for more details on different types of integration. Of course, if users want more control about this aspect, another option is to fit separate models and make use of the add_offset, add_offset_range and ensemble functionalities.

  • "clamp" Boolean parameter to clamp the projection predictors to the range of values observed during model training.
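A hedged sketch of how these options might be combined (object names are placeholders and the "predictor" integration choice is illustrative, following the conventions above):

```r
## Not run: 
# 'background', 'po_points', 'pa_points' and 'predictors' are placeholder objects
x <- distribution(background) |>
  add_biodiversity_poipo(po_points, field_occurrence = "observed") |>
  add_biodiversity_poipa(pa_points, field_occurrence = "observed") |>
  add_predictors(predictors, transform = "scale") |>
  engine_glm()

# Remove highly correlated predictors, combine the two datasets via an
# integration method, and clamp predictors at projection time.
mod <- train(x,
             filter_predictors = "pearson",
             method_integration = "predictor",
             clamp = TRUE)

## End(Not run)
```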

Value

A DistributionModel object.

Note

There are no silver bullets in (correlative) species distribution modelling, and for each model the analyst has to understand the objective, workflow and parameters that can be used to modify the outcomes. Different predictions can be obtained from the same data and parameters, and not all of them necessarily make sense or are useful.

References

  • Miller, D.A.W., Pacifici, K., Sanderlin, J.S., Reich, B.J., 2019. The recent past and promising future for data integration methods to estimate species’ distributions. Methods Ecol. Evol. 10, 22–37. https://doi.org/10.1111/2041-210X.13110

  • Zhu, J., Wen, C., Zhu, J., Zhang, H., & Wang, X. (2020). A polynomial algorithm for best-subset selection problem. Proceedings of the National Academy of Sciences, 117(52), 33117-33123.

  • Leung, B., Hudgins, E. J., Potapova, A. & Ruiz‐Jaen, M. C. A new baseline for countrywide α‐diversity and species distributions: illustration using >6,000 plant species in Panama. Ecol. Appl. 29, 1–13 (2019).

See Also

engine_gdb, engine_xgboost, engine_bart, engine_inla, engine_inlabru, engine_breg, engine_stan, engine_glm

Examples

# Load example data
 background <- terra::rast(system.file('extdata/europegrid_50km.tif',
 package='ibis.iSDM',mustWork = TRUE))
 # Get test species
 virtual_points <- sf::st_read(system.file('extdata/input_data.gpkg',
 package='ibis.iSDM',mustWork = TRUE),'points',quiet = TRUE)

 # Get list of test predictors
 ll <- list.files(system.file('extdata/predictors/', package = 'ibis.iSDM',
 mustWork = TRUE),full.names = TRUE)
 # Load them as rasters
 predictors <- terra::rast(ll);names(predictors) <- tools::file_path_sans_ext(basename(ll))

 # Use a basic GLM to fit a SDM
 x <- distribution(background) |>
        # Presence-only data
        add_biodiversity_poipo(virtual_points, field_occurrence = "Observed") |>
        # Add predictors and scale them
        add_predictors(env = predictors, transform = "scale", derivates = "none") |>
        # Use GLM as engine
        engine_glm()

 # Train the model; also filter out co-linear predictors using a Pearson threshold
 mod <- train(x, only_linear = TRUE, filter_predictors = 'pearson')
 mod

Unwrap a model for later use

Description

The unwrap_model function uses terra::unwrap() to restore the raster layers of a DistributionModel object that was previously wrapped for shipping.

Usage

unwrap_model(mod, verbose = getOption("ibis.setupmessages", default = TRUE))

## S4 method for signature 'ANY'
unwrap_model(mod, verbose = getOption("ibis.setupmessages", default = TRUE))

Arguments

mod

Provided DistributionModel object.

verbose

logical indicating whether messages should be shown. Overwrites getOption("ibis.setupmessages") (Default: TRUE).

Value

DistributionModel with unwrapped raster layers

See Also

wrap_model

Examples

## Not run: 
x <- distribution(background) |>
 add_biodiversity_poipo(virtual_points, field_occurrence = 'observed', name = 'Virtual points') |>
 add_predictors(pred_current, transform = 'scale',derivates = 'none') |>
 engine_xgboost(nrounds = 2000) |>
 train(varsel = FALSE, only_linear = TRUE) |>
 wrap_model()
unwrap_model(x)

## End(Not run)

Validation of a fitted distribution object

Description

This function conducts a model evaluation based on either the fitted point data or any supplied independent dataset. Currently only point datasets are supported. For validation of integrated models more work is needed.

Usage

validate(
  mod,
  method = "continuous",
  layer = "mean",
  point = NULL,
  point_column = "observed",
  field_occurrence = NULL,
  ...
)

## S4 method for signature 'ANY'
validate(
  mod,
  method = "continuous",
  layer = "mean",
  point = NULL,
  point_column = "observed",
  field_occurrence = NULL,
  ...
)

## S4 method for signature 'SpatRaster'
validate(
  mod,
  method = "continuous",
  layer = NULL,
  point = NULL,
  point_column = "observed",
  field_occurrence = NULL,
  ...
)

Arguments

mod

A fitted BiodiversityDistribution object with set predictors. Alternatively, one can also directly provide a SpatRaster; however, in this case the point layer also needs to be provided.

method

Should the validation be conducted on the continuous prediction or a (previously calculated) thresholded layer in binary format? Note that depending on the method, different metrics can be computed. See Details.

layer

In case multiple layers exist, which one to use? (Default: 'mean').

point

A sf object with type POINT or MULTIPOINT.

point_column

A character vector with the name of the column containing the independent observations. (Default: 'observed').

field_occurrence

(Deprecated) A character field pointing to the name of the independent observations. Identical to "point_column".

...

Other parameters that are passed on. Currently unused.

Details

The 'validate' function calculates different validation metrics depending on the output type.

The output metrics for each type are defined as follows (where TP stands for true positive, TN for true negative, FP for false positive and FN for false negative).

Continuous:

  • 'n' = Number of observations.

  • 'rmse' = Root Mean Square Error,

    \sqrt{\frac{1}{N} \sum_{i=1}^{N} (\hat{y}_{i} - y_{i})^2}

  • 'mae' = Mean Absolute Error,

    \frac{ \sum_{i=1}^{N} |y_{i} - x_{i}| }{n}

  • 'logloss' = Log loss, TBD

  • 'normgini' = Normalized Gini index, TBD

  • 'cont.boyce' = Continuous Boyce index, the ratio of predicted against expected frequency calculated over a moving window:

    \frac{P_{i}}{E_{i}}

    , where

    P_{i} = \frac{p_{i}}{\sum_{j=1}^{b} p_{j}}

    and

    E_{i} = \frac{a_{i}}{\sum_{j=1}^{b} a_{j}}
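The continuous metrics above reduce to a few lines of base R; a self-contained illustration (the observation and prediction values are made up):

```r
# Hypothetical observed values and model predictions
obs  <- c(0.2, 0.5, 0.9, 0.4)
pred <- c(0.1, 0.6, 0.8, 0.5)

n    <- length(obs)                 # number of observations
rmse <- sqrt(mean((pred - obs)^2))  # root mean square error
mae  <- mean(abs(obs - pred))       # mean absolute error
```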

Discrete:

  • 'n' = Number of observations.

  • 'auc' = Area under the curve, i.e. the integral of the function relating the true positive rate to the false positive rate.

  • 'overall.accuracy' = Overall accuracy, the proportion of correctly classified observations,

    \frac{TP + TN}{n}

  • 'true.presence.ratio' = True presence ratio or Jaccard index,

    \frac{TP}{TP+TN+FP+FN}

  • 'precision' = Precision, positive detection rate

    \frac{TP}{TP+FP}

  • 'sensitivity' = Sensitivity, the ratio of true positives against all actual positives,

    \frac{TP}{TP+FN}

  • 'specificity' = Specificity, the ratio of true negatives against all actual negatives,

    \frac{TN}{TN+FP}

  • 'tss' = True Skill Statistic, sensitivity + specificity - 1

  • 'f1' = F1 Score, the harmonic mean of precision and sensitivity,

    \frac{2TP}{2TP + FP + FN}

  • 'logloss' = Log loss, TBD

  • 'expected.accuracy' = Expected Accuracy,

    \frac{TP + FP}{N} \times \frac{TP + FN}{N} + \frac{TN + FN}{N} \times \frac{TN + FP}{N}

  • 'kappa' = Kappa value,

    \frac{2 (TP \times TN - FN \times FP)}{(TP + FP)(FP + TN) + (TP + FN)(FN + TN)}

  • 'brier.score' = Brier score,

    \frac{ \sum_{i=1}^{N} (y_{i} - x_{i})^{2} }{n}

    , where y_{i} is the predicted presence or absence and x_{i} the observed one.
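The discrete metrics are simple functions of the confusion-matrix counts; a self-contained base R sketch (the counts are made up for illustration):

```r
# Hypothetical confusion-matrix counts
TP <- 40; TN <- 30; FP <- 10; FN <- 20
n  <- TP + TN + FP + FN

overall.accuracy <- (TP + TN) / n        # proportion correctly classified
precision        <- TP / (TP + FP)       # positive predictive value
sensitivity      <- TP / (TP + FN)       # true positive rate
specificity      <- TN / (TN + FP)       # true negative rate
tss              <- sensitivity + specificity - 1
f1               <- 2 * TP / (2 * TP + FP + FN)
```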

Value

Returns a tidy tibble with validation results.

Note

If you use the Boyce Index, please cite the original Hirzel et al. (2006) paper.

References

  • Allouche O., Tsoar A., Kadmon R., (2006). Assessing the accuracy of species distribution models: prevalence, kappa and the true skill statistic (TSS). Journal of Applied Ecology, 43(6), 1223–1232.

  • Liu, C., White, M., Newell, G., 2013. Selecting thresholds for the prediction of species occurrence with presence-only data. J. Biogeogr. 40, 778–789. https://doi.org/10.1111/jbi.12058

  • Hirzel, A. H., Le Lay, G., Helfer, V., Randin, C., & Guisan, A. (2006). Evaluating the ability of habitat suitability models to predict species presences. Ecological modelling, 199(2), 142-152.

Examples

## Not run: 
 # Assuming that mod is a distribution object and has a thresholded layer
 mod <- threshold(mod, method = "TSS")
 validate(mod, method = "discrete")
 
## End(Not run)

Wrap a model for later use

Description

The wrap_model function uses terra::wrap() to make a DistributionModel object easier to ship, for example across R sessions.

Usage

wrap_model(mod, verbose = getOption("ibis.setupmessages", default = TRUE))

## S4 method for signature 'ANY'
wrap_model(mod, verbose = getOption("ibis.setupmessages", default = TRUE))

Arguments

mod

Provided DistributionModel object.

verbose

logical indicating whether messages should be shown. Overwrites getOption("ibis.setupmessages") (Default: TRUE).

Value

DistributionModel with wrapped raster layers

See Also

unwrap_model

Examples

## Not run: 
x <- distribution(background) |>
 add_biodiversity_poipo(virtual_points, field_occurrence = 'observed', name = 'Virtual points') |>
 add_predictors(pred_current, transform = 'scale',derivates = 'none') |>
 engine_xgboost(nrounds = 2000) |>
 train(varsel = FALSE, only_linear = TRUE)
wrap_model(x)

## End(Not run)

Save a model for later use

Description

The write_model function (as opposed to write_output) is a generic wrapper for writing a DistributionModel to disk. It is essentially a wrapper around saveRDS. Models can be loaded again via the load_model function.

Usage

write_model(
  mod,
  fname,
  slim = FALSE,
  verbose = getOption("ibis.setupmessages", default = TRUE)
)

## S4 method for signature 'ANY'
write_model(
  mod,
  fname,
  slim = FALSE,
  verbose = getOption("ibis.setupmessages", default = TRUE)
)

Arguments

mod

Provided DistributionModel object.

fname

A character depicting an output filename.

slim

A logical option on whether unnecessary entries in the model object should be deleted. This deletes, for example, predictions or any other non-model content from the object (Default: FALSE).

verbose

logical indicating whether messages should be shown. Overwrites getOption("ibis.setupmessages") (Default: TRUE).

Value

No R output is created. A file is written to the target directory.

Note

By default output files will be overwritten if already existing!

See Also

load_model

Examples

## Not run: 
x <- distribution(background) |>
 add_biodiversity_poipo(virtual_points, field_occurrence = 'observed', name = 'Virtual points') |>
 add_predictors(pred_current, transform = 'scale',derivates = 'none') |>
 engine_xgboost(nrounds = 2000) |> train(varsel = FALSE, only_linear = TRUE)
write_model(x, "testmodel.rds")

## End(Not run)
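Since write_model() is essentially a saveRDS wrapper, the underlying round trip can be sketched in self-contained base R (the real functions add checks and raster wrapping; the toy model list is purely illustrative):

```r
# Stand-in for a fitted model object
mod <- list(name = "toy model", coef = c(intercept = 0.1, forest = 1.5))

fname <- tempfile(fileext = ".rds")
saveRDS(mod, fname)        # what write_model() ultimately does
mod2 <- readRDS(fname)     # what load_model() ultimately does
identical(mod, mod2)       # TRUE
unlink(fname)
```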

Generic function to write spatial outputs

Description

The write_output function is a generic wrapper for writing any output files (e.g. projections) created with the ibis.iSDM-package. It is possible to write outputs of fitted DistributionModel, BiodiversityScenario or individual terra or stars objects. In case a data.frame is supplied, the output is written as a csv file. For creating summaries of distribution and scenario parameters and performance, see write_summary().

Usage

write_output(
  mod,
  fname,
  dt = "FLT4S",
  verbose = getOption("ibis.setupmessages", default = TRUE),
  ...
)

## S4 method for signature 'ANY,character'
write_output(
  mod,
  fname,
  dt = "FLT4S",
  verbose = getOption("ibis.setupmessages", default = TRUE),
  ...
)

## S4 method for signature 'BiodiversityScenario,character'
write_output(
  mod,
  fname,
  dt = "FLT4S",
  verbose = getOption("ibis.setupmessages", default = TRUE),
  ...
)

## S4 method for signature 'SpatRaster,character'
write_output(
  mod,
  fname,
  dt = "FLT4S",
  verbose = getOption("ibis.setupmessages", default = TRUE),
  ...
)

## S4 method for signature 'data.frame,character'
write_output(
  mod,
  fname,
  dt = "FLT4S",
  verbose = getOption("ibis.setupmessages", default = TRUE),
  ...
)

## S4 method for signature 'stars,character'
write_output(
  mod,
  fname,
  dt = "FLT4S",
  verbose = getOption("ibis.setupmessages", default = TRUE),
  ...
)

Arguments

mod

Provided DistributionModel, BiodiversityScenario, terra or stars object.

fname

A character depicting an output filename.

dt

A character for the output datatype. Following the terra::writeRaster options (Default: 'FLT4S').

verbose

logical indicating whether messages should be shown. Overwrites getOption("ibis.setupmessages") (Default: TRUE).

...

Any other arguments passed on to the individual functions.

Value

No R output is created. A file is written to the target directory.

Note

By default output files will be overwritten if already existing!

Examples

## Not run: 
x <- distribution(background)  |>
 add_biodiversity_poipo(virtual_points, field_occurrence = 'observed', name = 'Virtual points') |>
 add_predictors(pred_current, transform = 'scale',derivates = 'none') |>
 engine_xgboost(nrounds = 2000) |> train(varsel = FALSE, only_linear = TRUE)
write_output(x, "testmodel.tif")

## End(Not run)

Generic function to write summary outputs from created models.

Description

The write_summary function is a wrapper function to create summaries from fitted DistributionModel or BiodiversityScenario objects. This function extracts parameters and statistics about the used data from the input object and writes the output as either an 'rds' or 'rdata' file. Alternative, more open file formats are under consideration.

Usage

write_summary(
  mod,
  fname,
  partial = FALSE,
  verbose = getOption("ibis.setupmessages", default = TRUE),
  ...
)

## S4 method for signature 'ANY,character'
write_summary(
  mod,
  fname,
  partial = FALSE,
  verbose = getOption("ibis.setupmessages", default = TRUE),
  ...
)

Arguments

mod

Provided DistributionModel or BiodiversityScenario object.

fname

A character depicting an output filename. The suffix determines the file type of the output (Options: 'rds', 'rdata').

partial

A logical value determining whether partial variable contributions should be calculated and added to the model summary. Note that this can be rather slow (Default: FALSE).

verbose

logical indicating whether messages should be shown. Overwrites getOption("ibis.setupmessages") (Default: TRUE).

...

Any other arguments passed on to the individual functions.

Value

No R output is created. A file is written to the target directory.

Note

No predictions or tabular data are saved through this function. Use write_output() to save those.

Examples

## Not run: 
x <- distribution(background) |>
 add_biodiversity_poipo(virtual_points, field_occurrence = 'observed', name = 'Virtual points')  |>
 add_predictors(pred_current, transform = 'scale',derivates = 'none') |>
 engine_xgboost(nrounds = 2000) |> train(varsel = FALSE, only_linear = TRUE)
write_summary(x, "testmodel.rds")

## End(Not run)

Create a new monotonic prior for boosted regressions

Description

Function to include prior information as a monotonic constraint in an extreme gradient boosting model (engine_xgboost). Monotonic priors enforce a direction of effect for certain variables; however, specifying a monotonic constraint does not guarantee that the variable is not regularized out during model fitting.

Usage

XGBPrior(variable, hyper = "increasing", ...)

## S4 method for signature 'character,character'
XGBPrior(variable, hyper = "increasing", ...)

Arguments

variable

A character matched against existing predictors or latent effects.

hyper

A character object describing the type of constraint. Available options are 'increasing', 'decreasing', 'convex', 'concave', 'none'.

...

Variables passed on to prior object.

References

  • Chen, T., He, T., Benesty, M., Khotilovich, V., Tang, Y., & Cho, H. (2015). Xgboost: extreme gradient boosting. R package version 0.4-2, 1(4), 1-4.

See Also

Prior and GDBPrior.

Other prior: BARTPrior(), BARTPriors(), BREGPrior(), BREGPriors(), GDBPrior(), GDBPriors(), GLMNETPrior(), GLMNETPriors(), INLAPrior(), INLAPriors(), STANPrior(), STANPriors(), XGBPriors(), add_priors(), get_priors(), priors(), rm_priors()

Examples

## Not run: 
 pp <- XGBPrior("forest", "increasing")

## End(Not run)
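For context, an 'increasing' or 'decreasing' prior corresponds to xgboost's monotone_constraints parameter (1 for increasing, -1 for decreasing, 0 for none); a hedged, illustrative sketch outside of ibis.iSDM (assuming the xgboost package is installed, with made-up toy data):

```r
## Not run: 
library(xgboost)
# Toy data: y increases with x1 and decreases with x2
set.seed(42)
X <- matrix(runif(200), ncol = 2, dimnames = list(NULL, c("x1", "x2")))
y <- 2 * X[, 1] - X[, 2] + rnorm(100, sd = 0.1)

# Enforce an increasing effect of x1 and a decreasing effect of x2
bst <- xgboost(data = X, label = y, nrounds = 50, verbose = 0,
               params = list(monotone_constraints = c(1, -1)))

## End(Not run)
```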

Helper function when multiple variables are supplied for XGBOOST

Description

This is a helper function to specify several XGBPrior objects with the same hyper-parameters but different variables.

Usage

XGBPriors(variable, hyper = "increasing", ...)

## S4 method for signature 'character'
XGBPriors(variable, hyper = "increasing", ...)

Arguments

variable

A character matched against existing predictors or latent effects.

hyper

A character object describing the type of constraint. Available options are 'increasing', 'decreasing', 'convex', 'concave', 'none'.

...

Variables passed on to prior object.

See Also

Other prior: BARTPrior(), BARTPriors(), BREGPrior(), BREGPriors(), GDBPrior(), GDBPriors(), GLMNETPrior(), GLMNETPriors(), INLAPrior(), INLAPriors(), STANPrior(), STANPriors(), XGBPrior(), add_priors(), get_priors(), priors(), rm_priors()