Title: Modelling framework for integrated biodiversity distribution scenarios
Description: An integrated framework for modelling the distribution of species and ecosystems in a suitability framing. This package allows the estimation of integrated species distribution models (iSDMs) based on several sources of evidence and provided presence-only and presence-absence datasets. It makes heavy use of point process models for estimating habitat suitability and allows the inclusion of spatial latent effects and priors in the estimation. To do so, 'ibis.iSDM' supports a number of engines for Bayesian and non-parametric machine learning estimation. Furthermore, 'ibis.iSDM' is specifically customized to support spatio-temporal projections of habitat suitability into the future.
Authors: Martin Jung [aut, cre, cph]
Maintainer: Martin Jung <[email protected]>
License: CC BY 4.0
Version: 0.1.5
Built: 2025-02-06 05:46:08 UTC
Source: https://github.com/iiasa/ibis.iSDM
This function adds a presence-absence biodiversity dataset to a distribution object. As opposed to presence-only data, presence-absence biodiversity records usually originate from structured biodiversity surveys in which the absence of a species in a given region was specifically assessed.
If the analyst so chooses, it is also possible to format presence-only biodiversity data into a presence-absence form by adding pseudo-absences through add_pseudoabsence(). See the help file for more information.
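A minimal sketch of that conversion (assuming a presence-only sf point layer `virtual_points` and a background raster `background`; the exact arguments of add_pseudoabsence() may differ, see its help file):

```r
## Not run:
# Sketch: sample pseudo-absences for a presence-only dataset and
# then add the result as a presence-absence (poipa) dataset.
virtual_pa <- add_pseudoabsence(virtual_points, field_occurrence = "observed")

x <- distribution(background) |>
  add_biodiversity_poipa(virtual_pa, field_occurrence = "observed")
## End(Not run)
```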
```r
add_biodiversity_poipa(x, poipa, name = NULL, field_occurrence = "observed",
  formula = NULL, family = "binomial", link = NULL, weight = 1,
  separate_intercept = TRUE, docheck = TRUE, ...)

## S4 method for signature 'BiodiversityDistribution,sf'
## (same arguments as the generic)
```
Arguments:
* `x`: A BiodiversityDistribution object.
* `poipa`: A sf object with the presence-absence point records.
* `name`: The name of the biodiversity dataset used as internal identifier. (Default: NULL)
* `field_occurrence`: … (Default: "observed")
* `formula`: … (Default: NULL)
* `family`: … (Default: "binomial")
* `link`: … (Default: NULL)
* `weight`: … (Default: 1)
* `separate_intercept`: … (Default: TRUE)
* `docheck`: … (Default: TRUE)
* `...`: Other parameters passed down.
By default, the logit link function is used in a logistic regression setting unless the specific engine does not support generalised linear regressions (e.g. engine_bart).
Adds biodiversity data to the distribution object.
Renner, I. W., J. Elith, A. Baddeley, W. Fithian, T. Hastie, S. J. Phillips, G. Popovic, and D. I. Warton. 2015. Point process models for presence-only analysis. Methods in Ecology and Evolution 6:366–379.
Guisan A. and Zimmerman N. 2000. Predictive habitat distribution models in ecology. Ecol. Model. 135: 147–186.
Other add_biodiversity: add_biodiversity_poipo(), add_biodiversity_polpa(), add_biodiversity_polpo()
```r
## Not run:
# Define model
x <- distribution(background) |>
  add_biodiversity_poipa(virtual_species)
## End(Not run)
```
This function adds a presence-only biodiversity dataset to a distribution object.
```r
add_biodiversity_poipo(x, poipo, name = NULL, field_occurrence = "observed",
  formula = NULL, family = "poisson", link = NULL, weight = 1,
  separate_intercept = TRUE, docheck = TRUE,
  pseudoabsence_settings = NULL, ...)

## S4 method for signature 'BiodiversityDistribution,sf'
## (same arguments as the generic)
```
Arguments:
* `x`: A BiodiversityDistribution object.
* `poipo`: A sf object with the presence-only point records.
* `name`: The name of the biodiversity dataset used as internal identifier. (Default: NULL)
* `field_occurrence`: … (Default: "observed")
* `formula`: … (Default: NULL)
* `family`: … (Default: "poisson")
* `link`: … (Default: NULL)
* `weight`: … (Default: 1)
* `separate_intercept`: … (Default: TRUE)
* `docheck`: … (Default: TRUE)
* `pseudoabsence_settings`: Either … (Default: NULL)
* `...`: Other parameters passed down to the object. Normally not used unless described in details.
This function allows adding presence-only biodiversity records to a distribution object within ibis.iSDM. Presence-only data are usually modelled through an inferential model (see Guisan and Zimmerman, 2000) that relates their occurrence to environmental covariates and a selected sample of 'background' points. The most common approach for estimation, and the one supported by this type of dataset, are Poisson process models (PPMs), in which presence-only points are fitted through a down-weighted Poisson regression. See Renner et al. 2015 for an overview.
Adds biodiversity data to the distribution object.
Guisan A. and Zimmerman N. 2000. Predictive habitat distribution models in ecology. Ecol. Model. 135: 147–186.
Renner, I. W., J. Elith, A. Baddeley, W. Fithian, T. Hastie, S. J. Phillips, G. Popovic, and D. I. Warton. 2015. Point process models for presence-only analysis. Methods in Ecology and Evolution 6:366–379.
Other add_biodiversity: add_biodiversity_poipa(), add_biodiversity_polpa(), add_biodiversity_polpo()
```r
# Load background
background <- terra::rast(system.file('extdata/europegrid_50km.tif',
  package = 'ibis.iSDM', mustWork = TRUE))
# Load virtual species
virtual_points <- sf::st_read(system.file('extdata/input_data.gpkg',
  package = 'ibis.iSDM', mustWork = TRUE), 'points', quiet = TRUE)
# Define model
x <- distribution(background) |>
  add_biodiversity_poipo(virtual_points, field_occurrence = "Observed")
```
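Building on the example above, a full model could then be specified by adding predictors and an engine before training. A sketch (assuming `predictors` is a SpatRaster of covariates matching the background):

```r
## Not run:
# Sketch: complete the workflow with covariates, an engine and training
fit <- distribution(background) |>
  add_biodiversity_poipo(virtual_points, field_occurrence = "Observed") |>
  add_predictors(predictors) |>
  engine_glm() |>
  train()
## End(Not run)
```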
This function can be used to add an sf polygon dataset to an existing distribution object. Presence-absence polygon data assume that each area within the polygon can be treated as a 'presence' for the species, while each area outside the polygon is where the species is absent.
```r
add_biodiversity_polpa(x, polpa, name = NULL, field_occurrence = "observed",
  formula = NULL, family = "binomial", link = NULL, weight = 1,
  simulate = FALSE, simulate_points = 100, simulate_bias = NULL,
  simulate_strategy = "random", separate_intercept = TRUE, docheck = TRUE,
  pseudoabsence_settings = NULL, ...)

## S4 method for signature 'BiodiversityDistribution,sf'
## (same arguments as the generic)
```
Arguments:
* `x`: A BiodiversityDistribution object.
* `polpa`: A sf polygon object with the presence-absence records.
* `name`: The name of the biodiversity dataset used as internal identifier. (Default: NULL)
* `field_occurrence`: … (Default: "observed")
* `formula`: … (Default: NULL)
* `family`: … (Default: "binomial")
* `link`: … (Default: NULL)
* `weight`: … (Default: 1)
* `simulate`: Simulate poipa points within the polygon's boundaries. Results are passed to … (Default: FALSE)
* `simulate_points`: … (Default: 100)
* `simulate_bias`: … (Default: NULL)
* `simulate_strategy`: … (Default: "random")
* `separate_intercept`: … (Default: TRUE)
* `docheck`: … (Default: TRUE)
* `pseudoabsence_settings`: Either … (Default: NULL)
* `...`: Other parameters passed down.
The default approach for polygon data is to sample presence-absence points across the region of the polygons. This function thus acts as a wrapper to add_biodiversity_poipa(), as presence-absence points are created from the polygon by the model. Note that if the polygon is used directly in the modelling, the link between covariates and polygonal data is established by regular sampling of points within the polygon, which is thus equivalent to simulating the points directly.
For an integration of range data as predictor or offset, see add_predictor_range() and add_offset_range() instead.
Adds biodiversity data to the distribution object.
Other add_biodiversity: add_biodiversity_poipa(), add_biodiversity_poipo(), add_biodiversity_polpo()
```r
## Not run:
x <- distribution(background) |>
  add_biodiversity_polpa(protectedArea)
## End(Not run)
```
This function can be used to add an sf polygon dataset to an existing distribution object. Presence-only polygon data are treated differently than point data in some engines, particularly through the way that points are generated.
```r
add_biodiversity_polpo(x, polpo, name = NULL, field_occurrence = "observed",
  formula = NULL, family = "poisson", link = NULL, weight = 1,
  simulate = FALSE, simulate_points = 100, simulate_bias = NULL,
  simulate_strategy = "random", separate_intercept = TRUE, docheck = TRUE,
  pseudoabsence_settings = NULL, ...)

## S4 method for signature 'BiodiversityDistribution,sf'
## (same arguments as the generic)
```
Arguments:
* `x`: A BiodiversityDistribution object.
* `polpo`: A sf polygon object with the presence-only records.
* `name`: The name of the biodiversity dataset used as internal identifier. (Default: NULL)
* `field_occurrence`: … (Default: "observed")
* `formula`: … (Default: NULL)
* `family`: … (Default: "poisson")
* `link`: … (Default: NULL)
* `weight`: … (Default: 1)
* `simulate`: Simulate poipo points within the polygon's boundaries. Results are passed to … (Default: FALSE)
* `simulate_points`: … (Default: 100)
* `simulate_bias`: … (Default: NULL)
* `simulate_strategy`: … (Default: "random")
* `separate_intercept`: … (Default: TRUE)
* `docheck`: … (Default: TRUE)
* `pseudoabsence_settings`: Either … (Default: NULL)
* `...`: Other parameters passed down.
The default approach for polygon data is to sample presence-only points across the region of the polygons. This function thus acts as a wrapper to add_biodiversity_poipo(), as presence-only points are created from the polygon by the model. If no points are simulated directly (the default), the polygon is processed by train() by creating regular point data over the supplied predictors.
Use add_biodiversity_polpa() to create binomially distributed inside-outside points for the given polygon!
For an integration of range data as predictor or offset, see add_predictor_range() and add_offset_range() instead.
Adds biodiversity data to the distribution object.
Other add_biodiversity: add_biodiversity_poipa(), add_biodiversity_poipo(), add_biodiversity_polpa()
```r
## Not run:
x <- distribution(mod) |>
  add_biodiversity_polpo(protectedArea)
## End(Not run)
```
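Points can also be simulated directly from the polygon via the documented `simulate` arguments. A sketch (the number of points and the strategy are illustrative):

```r
## Not run:
# Sketch: simulate 200 presence-only points within the polygon
# instead of passing the polygon directly to train()
x <- distribution(background) |>
  add_biodiversity_polpo(protectedArea, simulate = TRUE,
    simulate_points = 200, simulate_strategy = "random")
## End(Not run)
```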
This function adds a constraint to a BiodiversityScenario object to constrain (future) projections. These constraints can, for instance, be limits on the possible dispersal distance, connectivity between identified patches, or limitations on species adaptability.
Most constraints require pre-calculated thresholds to be present in the BiodiversityScenario object!
```r
add_constraint(mod, method, ...)

## S4 method for signature 'BiodiversityScenario'
## (same arguments as the generic)
```
Arguments:
* `mod`: A BiodiversityScenario object.
* `method`: …
* `...`: Passed on parameters. See also the specific methods for adding constraints.
Constraints can be added to scenario objects to increase or decrease the suitability of a given area for the target feature. This function acts as a wrapper to add these constraints. Currently supported are the following options:
Dispersal:
* sdd_fixed - Applies a fixed uniform dispersal distance per modelling timestep.
* sdd_nexpkernel - Applies a dispersal distance using a negative exponential kernel from its origin.
* kissmig - Applies the kissmig stochastic dispersal model. Requires the `kissmig` package. Applied at each modelling time step.
* migclim - Applies the dispersal algorithm MigClim to the modelled objects. Requires the "MigClim" package.

A comprehensive overview of the benefits of including dispersal constraints in species distribution models can be found in Bateman et al. (2013).
Connectivity:
* hardbarrier - Defines a hard barrier to any dispersal events. By definition this sets all values larger than 0 in the barrier layer to 0 in the projection. The barrier has to be provided through the "resistance" parameter.
* resistance - Allows the provision of a static or dynamic layer that is multiplied with the projection at each time step. Can for example be used to reduce the suitability of any given area (using pressures not included in the model). The respective layer(s) have to be provided through the "resistance" parameter. Provided layers are incorporated as abs(resistance - 1) and multiplied with the prediction.
Adaptability:
* nichelimit - Specifies a limit on the environmental niche to allow only a modest amount of extrapolation beyond the known occurrences. This can be particularly useful to limit the influence of increasing marginal responses and avoid biologically unrealistic projections.
Boundary and size:
* boundary - Applies a hard boundary constraint on the projection, thus disallowing an expansion of a range outside the provided layer. Similar to specifying projection limits (see distribution), but can be used to specifically constrain a projection within a certain area (e.g. a species range or an island).
* minsize - Allows specifying a certain size that must be satisfied in order for a thresholded patch to be considered occupied. Can be thought of as a minimum size requirement. See add_constraint_minsize() for the required parameters.
* threshold - Applies the set threshold as a constraint directly on the suitability projections. Requires a threshold to be set.
Adds constraints to a BiodiversityScenario object.
Bateman, B. L., Murphy, H. T., Reside, A. E., Mokany, K., & VanDerWal, J. (2013). Appropriateness of full‐, partial‐and no‐dispersal scenarios in climate change impact modelling. Diversity and Distributions, 19(10), 1224-1234.
Nobis MP and Normand S (2014) KISSMig - a simple model for R to account for limited migration in analyses of species distributions. Ecography 37: 1282-1287.
Mendes, P., Velazco, S. J. E., de Andrade, A. F. A., & Júnior, P. D. M. (2020). Dealing with overprediction in species distribution models: How adding distance constraints can improve model accuracy. Ecological Modelling, 431, 109180.
Other constraint:
add_constraint_MigClim()
,
add_constraint_adaptability()
,
add_constraint_boundary()
,
add_constraint_connectivity()
,
add_constraint_dispersal()
,
add_constraint_minsize()
,
add_constraint_threshold()
,
simulate_population_steps()
```r
## Not run:
# Assumes that a trained 'model' object exists
mod <- scenario(model) |>
  add_predictors(env = predictors, transform = 'scale', derivates = "none") |>
  add_constraint_dispersal(method = "kissmig", value = 2, pext = 0.1) |>
  project()
## End(Not run)
```
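Because add_constraint() acts as a wrapper, the same kind of constraint can also be specified through its generic method argument. A sketch (the distance value is illustrative and assumed to be in metres, per the dispersal defaults):

```r
## Not run:
# Sketch: fixed short-distance dispersal via the generic wrapper
mod <- scenario(model) |>
  add_predictors(env = predictors, transform = 'scale', derivates = "none") |>
  add_constraint(method = "sdd_fixed", value = 2000) |>
  project()
## End(Not run)
```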
Adaptability constraints assume that suitable habitat for species in (future) projections might be unsuitable if it is outside the range of conditions currently observed for the species.
Currently only nichelimit is implemented, which adds a simple constraint on the predictor parameter space that can be defined through the "value" parameter. For example, by setting it to 1 (the default), any projections are constrained to be within at most 1 standard deviation of the range of covariates used for model training.
```r
add_constraint_adaptability(mod, method = "nichelimit", names = NULL,
  value = 1, increment = 0, ...)

## S4 method for signature 'BiodiversityScenario'
## (same arguments as the generic)
```
Arguments:
* `mod`: A BiodiversityScenario object.
* `method`: … (Default: "nichelimit")
* `names`: … (Default: NULL)
* `value`: … (Default: 1)
* `increment`: … (Default: 0)
* `...`: Passed on parameters. See also the specific methods for adding constraints.
Other constraint: add_constraint(), add_constraint_MigClim(), add_constraint_boundary(), add_constraint_connectivity(), add_constraint_dispersal(), add_constraint_minsize(), add_constraint_threshold(), simulate_population_steps()
```r
## Not run:
scenario(fit) |>
  add_constraint_adaptability(value = 1)
## End(Not run)
```
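The constraint can also be limited to selected covariates via the documented `names` argument. A sketch (the variable name "bio1" is hypothetical):

```r
## Not run:
# Sketch: constrain only "bio1" to at most 2 standard deviations
# beyond the training range
scenario(fit) |>
  add_constraint_adaptability(method = "nichelimit", names = "bio1", value = 2)
## End(Not run)
```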
The purpose of boundary constraints is to limit a future projection within a specified area (such as for example a range or ecoregion). This can help to limit unreasonable projections into geographic space.
Similar to boundary constraints, it is also possible to define a "zone" for the scenario projections, as was done for model training. The difference to a boundary constraint is that the boundary constraint is applied post hoc as a hard cut on any projection, while zones allow any projection (and other constraints) to be applied within the zone.
Note: Setting a boundary constraint for future projections effectively removes potentially suitable areas!
```r
add_constraint_boundary(mod, layer, ...)

## S4 method for signature 'BiodiversityScenario,sf'
add_constraint_boundary(mod, layer, method = "boundary", ...)

## S4 method for signature 'BiodiversityScenario,ANY'
add_constraint_boundary(mod, layer, method = "boundary", ...)
```
Arguments:
* `mod`: A BiodiversityScenario object.
* `layer`: A sf or other spatial layer with the boundary.
* `...`: Passed on parameters. See also the specific methods for adding constraints.
* `method`: … (Default: "boundary")
Other constraint: add_constraint(), add_constraint_MigClim(), add_constraint_adaptability(), add_constraint_connectivity(), add_constraint_dispersal(), add_constraint_minsize(), add_constraint_threshold(), simulate_population_steps()
```r
## Not run:
# Add scenario constraint
scenario(fit) |>
  add_constraint_boundary(range)
## End(Not run)
```
Adds a connectivity constraint to a scenario object.
```r
add_constraint_connectivity(mod, method, value = NULL, resistance = NULL, ...)

## S4 method for signature 'BiodiversityScenario'
## (same arguments as the generic)
```
Arguments:
* `mod`: A BiodiversityScenario object.
* `method`: …
* `value`: For many dispersal … (Default: NULL)
* `resistance`: … (Default: NULL)
* `...`: Passed on parameters. See also the specific methods for adding constraints.
* hardbarrier - Defines a hard barrier to any dispersal events. By definition this sets all values larger than 0 in the barrier layer to 0 in the projection. The barrier has to be provided through the "resistance" parameter.
* resistance - Allows the provision of a static or dynamic layer that is multiplied with the projection at each time step. Can for example be used to reduce the suitability of any given area (using pressures not included in the model). The respective layer(s) have to be provided through the "resistance" parameter. Provided layers are incorporated as abs(resistance - 1) and multiplied with the prediction.
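A minimal sketch of adding a resistance surface (assuming `resistanceLayer` is a SpatRaster scaled to the unit range, since it is incorporated as abs(resistance - 1)):

```r
## Not run:
# Sketch: multiply projections at each time step with a static
# resistance layer
scenario(fit) |>
  add_constraint_connectivity(method = "resistance",
    resistance = resistanceLayer)
## End(Not run)
```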
Other constraint: add_constraint(), add_constraint_MigClim(), add_constraint_adaptability(), add_constraint_boundary(), add_constraint_dispersal(), add_constraint_minsize(), add_constraint_threshold(), simulate_population_steps()
Add a dispersal constraint to an existing scenario object.
```r
add_constraint_dispersal(mod, method, value = NULL, type = NULL, ...)

## S4 method for signature 'BiodiversityScenario'
## (same arguments as the generic)
```
Arguments:
* `mod`: A BiodiversityScenario object.
* `method`: …
* `value`: For many dispersal … (Default: NULL)
* `type`: … (Default: NULL)
* `...`: Passed on parameters. See also the specific methods for adding constraints.
Dispersal parameters for 'method':
* sdd_fixed - Applies a fixed uniform dispersal distance per modelling timestep.
* sdd_nexpkernel - Applies a dispersal distance using a negative exponential kernel from its origin. The negative exponential kernel is defined as f(x) = 1 / (2 * pi * a^2) * exp(-x / a), where a is the mean dispersal distance (in m) divided by 2.
* kissmig - Applies the kissmig stochastic dispersal model. Requires the `kissmig` package. Applied at each modelling time step.
* migclim - Applies the dispersal algorithm MigClim to the modelled objects. Requires the "MigClim" package.

A comprehensive overview of the benefits of including dispersal constraints in species distribution models can be found in Bateman et al. (2013).
The following additional parameters can be set:
* pext: numeric indicator for `kissmig` of the probability that a colonized cell becomes uncolonised, i.e., the species gets locally extinct (Default: 0.1).
* pcor: numeric probability that corner cells are considered in the 3x3 neighbourhood (Default: 0.2).
Unless otherwise stated, the default unit of supplied distance values (e.g. average dispersal distance) should be in "m".
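As a sketch, the kissmig parameters described above can be passed through `...` (the values shown are the stated defaults):

```r
## Not run:
# Sketch: stochastic dispersal with kissmig, setting the extinction
# and corner-cell probabilities explicitly
scenario(fit) |>
  add_constraint_dispersal(method = "kissmig", value = 2,
    pext = 0.1, pcor = 0.2)
## End(Not run)
```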
Bateman, B. L., Murphy, H. T., Reside, A. E., Mokany, K., & VanDerWal, J. (2013). Appropriateness of full‐, partial‐and no‐dispersal scenarios in climate change impact modelling. Diversity and Distributions, 19(10), 1224-1234.
Other constraint: add_constraint(), add_constraint_MigClim(), add_constraint_adaptability(), add_constraint_boundary(), add_constraint_connectivity(), add_constraint_minsize(), add_constraint_threshold(), simulate_population_steps()
This function adds a constraint as defined by the MigClim approach (Engler et al. 2013) to a BiodiversityScenario object to constrain future projections. For a detailed description of MigClim, please see the respective reference and the user guide. The default parameters chosen here are suggestions.
```r
add_constraint_MigClim(mod, rcThresholdMode = "continuous", dispSteps = 1,
  dispKernel = c(1, 0.4, 0.16, 0.06, 0.03), barrierType = "strong",
  lddFreq = 0, lddRange = c(1000, 10000), iniMatAge = 1,
  propaguleProdProb = c(0.2, 0.6, 0.8, 0.95), replicateNb = 10,
  dtmp = terra::terraOptions(print = F)$tempdir)

## S4 method for signature 'BiodiversityScenario'
## (same arguments as the generic)
```
Arguments:
* `mod`: A BiodiversityScenario object.
* `rcThresholdMode`: … (Default: "continuous")
* `dispSteps`: … (Default: 1)
* `dispKernel`: … (Default: c(1, 0.4, 0.16, 0.06, 0.03))
* `barrierType`: A character indicating whether any set barrier should be set as … (Default: "strong")
* `lddFreq`: … (Default: 0)
* `lddRange`: … (Default: c(1000, 10000))
* `iniMatAge`: Initial maturity age. Used together with … (Default: 1)
* `propaguleProdProb`: Probability of a source cell to produce propagules as a function of time since colonization. Set as a probability vector that defines the probability of a cell producing propagules. (Default: c(0.2, 0.6, 0.8, 0.95))
* `replicateNb`: Number of replicates to be used for the analysis. (Default: 10)
* `dtmp`: … (Default: terra::terraOptions(print = F)$tempdir)
The barrier parameter is defined through "add_barrier".
Adds a MigClim constraint to a BiodiversityScenario object.
Engler R., Hordijk W. and Guisan A. (2012). The MIGCLIM R package – seamless integration of dispersal constraints into projections of species distribution models. Ecography, 35(10), 872-878.
Robin Engler, Wim Hordijk and Loic Pellissier (2013). MigClim: Implementing dispersal into species distribution models. R package version 1.6.
Other constraint: add_constraint(), add_constraint_adaptability(), add_constraint_boundary(), add_constraint_connectivity(), add_constraint_dispersal(), add_constraint_minsize(), add_constraint_threshold(), simulate_population_steps()
```r
## Not run:
# Assumes that a trained 'model' object exists
mod <- scenario(model) |>
  add_predictors(env = predictors, transform = 'scale', derivates = "none") |>
  add_constraint_MigClim() |>
  project()
## End(Not run)
```
This function applies a minimum size constraint on a scenario() created object. The rationale here is that, for a given species, isolated habitat patches smaller than a given size might not be viable (or realistic) for the species to establish a (long-term) presence.
The idea thus is to apply a constraint such that only patches bigger than a certain size are retained between timesteps. It thus has the potential to reduce subsequent colonizations of neighbouring patches.
```r
add_constraint_minsize(mod, value, unit = "km2",
  establishment_step = FALSE, ...)

## S4 method for signature 'BiodiversityScenario,numeric'
## (same arguments as the generic)
```
Arguments:
* `mod`: A BiodiversityScenario object.
* `value`: A numeric …
* `unit`: … (Default: "km2")
* `establishment_step`: … (Default: FALSE)
* `...`: Passed on parameters. See also the specific methods for adding constraints.
Area values need to be supplied in a specific unit.
This function requires that the scenario has a threshold() set!
Other constraint: add_constraint(), add_constraint_MigClim(), add_constraint_adaptability(), add_constraint_boundary(), add_constraint_connectivity(), add_constraint_dispersal(), add_constraint_threshold(), simulate_population_steps()
```r
## Not run:
scenario(fit) |>
  add_predictors(future_covariates) |>
  threshold() |>
  add_constraint_minsize(value = 1000, unit = "km2") |>
  project()
## End(Not run)
```
This option adds a threshold() constraint to a scenario projection, thus effectively applying the threshold as a mask to each projection step made during the scenario projection.
Applying this constraint thus means that the "suitability" projection is clipped to the threshold. This method requires that threshold() has been set for the scenario object.
In theory it would be possible to recalculate the threshold for each time step based on supplied parameters or even observation records. So far this option has not been necessary to implement.
```r
add_constraint_threshold(mod, updatevalue = NA, ...)

## S4 method for signature 'BiodiversityScenario'
## (same arguments as the generic)
```
Arguments:
* `mod`: A BiodiversityScenario object.
* `updatevalue`: … (Default: NA)
* `...`: Passed on parameters. See also the specific methods for adding constraints.
Threshold values are taken from the original fitted model.
Other constraint: add_constraint(), add_constraint_MigClim(), add_constraint_adaptability(), add_constraint_boundary(), add_constraint_connectivity(), add_constraint_dispersal(), add_constraint_minsize(), simulate_population_steps()
```r
## Not run:
# Add scenario constraint
scenario(fit) |>
  threshold() |>
  add_constraint_threshold()
## End(Not run)
```
Sampling and other biases are pervasive drivers of the spatial location of biodiversity datasets. While the integration of other, presumably less biased data can be one way of controlling for sampling biases, another way is to control directly for the bias in the model. Currently supported methods are:
* "partial" - An approach described by Warton et al. (2013) to control the biases in a model by including a specified variable ("layer") in the model, but "partialling" it out during the projection phase. Specifically, the variable is set to a specified value ("bias_value"), which is by default the minimum value observed across the background.
* "offset" - Dummy method that points to the add_offset_bias() functionality (see note). Makes use of offsets to factor out a specified bias variable.
* "proximity" - Uses the proximity or distance between points as a weight in the model. This option effectively places greater weight on points farther away. Note: In the best case this can control for spatial bias and aggregation; in the worst case it can place a lot of emphasis on points that are likely outliers or misidentifications (in terms of species).

See also the details for some explanations.
```r
add_control_bias(x, layer, method = "partial", bias_value = NULL,
  maxdist = NULL, alpha = 1, add = TRUE)

## S4 method for signature 'BiodiversityDistribution'
## (same arguments as the generic)
```
Arguments:
* `x`: A BiodiversityDistribution object.
* `layer`: …
* `method`: … (Default: "partial")
* `bias_value`: … (Default: NULL)
* `maxdist`: … (Default: NULL)
* `alpha`: … (Default: 1)
* `add`: … (Default: TRUE)
In the case of "proximity"
weights are assigned to each point, placing
higher weight on points further away and with less overlap. Weights are are
assigned up to a maximum of distance which can be provided by the user
(parameter "maxdist"
). This distance is ideally informed by some
knowledge of the species to be modelled (e.g., maximum dispersal distance).
If not provided, it is set to the distance of the centroid of a minimum
convex polygon encircling all observations. The parameter "alpha"
is a
weighting factor which can be used to diminish the effect of neighboring
points.
For a given observation i, the weight is computed from the total number of points N that lie closer than the maximum distance (maxdist) to point i, and the distances d_ij between the focal point i and each neighbouring point j.
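A sketch of the proximity option (the maxdist value is illustrative and in the units of the projection; whether `layer` is required for this method should be checked against the function's input checks):

```r
## Not run:
# Sketch: weight observations by proximity, down-weighting densely
# clustered points up to a 10 km maximum distance
x <- distribution(background) |>
  add_predictors(covariates) |>
  add_control_bias(method = "proximity", maxdist = 10000, alpha = 1)
## End(Not run)
```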
Adds a bias control option to a distribution object.
Covariate transformations applied to other predictors need to be applied to the bias variable too.
Another option to consider biases, particularly in Poisson point process models, is to remove them through an offset. Functionality to do so is available through the add_offset_bias() method. Setting the method to "offset" will automatically point to this option.
Warton, D.I., Renner, I.W. and Ramp, D., 2013. Model-based control of observer bias for the analysis of presence-only data in ecology. PloS one, 8(11), p.e79168.
Merow, C., Allen, J.M., Aiello-Lammens, M., Silander, J.A., 2016. Improving niche and range estimates with Maxent and point process models by integrating spatially explicit information. Glob. Ecol. Biogeogr. 25, 1022–1036. https://doi.org/10.1111/geb.12453
Botella, C., Joly, A., Bonnet, P., Munoz, F., & Monestiez, P. (2021). Jointly estimating spatial sampling effort and habitat suitability for multiple species from opportunistic presence‐only data. Methods in Ecology and Evolution, 12(5), 933-945.
```r
## Not run:
x <- distribution(background) |>
  add_predictors(covariates) |>
  add_control_bias(biasvariable, bias_value = NULL)
## End(Not run)
```
In general, we understand latent spatial effects as the occurrence of spatial dependency in the observations, which might be caused by spatial biases, similarities in the underlying sampling processes, or unmeasured latent covariates, i.e. those that have not been quantified.
This package supports a range of different spatial effects; however, they differ from one another in their impact on the estimated prediction. Some effects simply add the spatial dependence as a covariate, others make use of spatial random effects to account for spatial dependence in the predictions. By default these effects are added to each dataset as a covariate or shared spatial field (e.g. SPDE). See the details for an explanation of the available options.
```r
add_latent_spatial(x, method = "spde", priors = NULL,
  separate_spde = FALSE, ...)

## S4 method for signature 'BiodiversityDistribution'
## (same arguments as the generic)

## S4 method for signature 'BiodiversityScenario'
add_latent_spatial(x, layer = NULL, reuse_latent = TRUE, ...)
```
Arguments:
* `x`: A BiodiversityDistribution or BiodiversityScenario object.
* `method`: … (Default: "spde")
* `priors`: … (Default: NULL)
* `separate_spde`: … (Default: FALSE)
* `...`: Other parameters passed down.
* `layer`: … (scenario method only; Default: NULL)
* `reuse_latent`: … (scenario method only; Default: TRUE)
There are several different options, some of which depend on the engine used. In case an unsupported method is chosen for an engine, it is changed to the next most similar method.
Available are:
* "spde" - stochastic partial differential equation (SPDE) for engine_inla and engine_inlabru. SPDE effects aim at capturing the variation of the response variable in space once all of the covariates are accounted for. Examining the spatial distribution of the spatial error can reveal which covariates might be missing. For example, if elevation is positively correlated with the response variable but is not included in the model, we could see a higher posterior mean in areas with higher elevation. Note that calculations of SPDEs can be computationally costly.
* "car" - conditional autoregressive (CAR) errors for engine_inla. Not yet implemented in full.
* "kde" - additional covariate of the kernel density of input point observations.
* "poly" - spatial trend correction by adding coordinates as a polynomial transformation. This method assumes that a transformation of spatial coordinates can - if included as an additional predictor - explain some of the variance in the distribution. This method does not interact with species occurrences.
* "nnd" - nearest neighbour distance. This method calculates the Euclidean distance from each point to the nearest other grid cell with known species occurrence. Originally proposed by Allouche et al. (2008) and can be applied across all datasets in the BiodiversityDistribution object.
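A minimal sketch for the SPDE option, which requires one of the INLA-based engines named above:

```r
## Not run:
# Sketch: add a shared SPDE spatial field and estimate with inlabru
x <- distribution(background) |>
  add_biodiversity_poipo(virtual_points, field_occurrence = "Observed") |>
  add_predictors(predictors) |>
  add_latent_spatial(method = "spde") |>
  engine_inlabru() |>
  train()
## End(Not run)
```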
Adds a latent spatial effect to a distribution object.
Allouche, O.; Steinitz, O.; Rotem, D.; Rosenfeld, A.; Kadmon, R. (2008). Incorporating distance constraints into species distribution models. Journal of Applied Ecology, 45(2), 599-609. doi:10.1111/j.1365-2664.2007.01445.x
Mendes, P., Velazco, S. J. E., de Andrade, A. F. A., & Júnior, P. D. M. (2020). Dealing with overprediction in species distribution models: How adding distance constraints can improve model accuracy. Ecological Modelling, 431, 109180.
```r
## Not run:
distribution(background) |>
  add_latent_spatial(method = "poly")
## End(Not run)
```
One of the main aims of species distribution models (SDMs) is to project in space and time. For projections a common issue is extrapolation as - unconstrained - SDMs can indicate areas as suitable which are unlikely to be occupied by species or habitats (often due to historic or biotic factors). To some extent this can be related to an insufficient quantification of the niche (e.g. niche truncation by considering only a subset of observations within the actual distribution), in other cases there can also be general barriers or constraints that limit any projections (e.g. islands). This limit method adds some of those options to a model distribution object. Currently supported methods are:
* "zones"
- This is a wrapper to allow the addition of zones to a
distribution model object, similar to what is also possible via distribution()
.
Required is a spatial layer that describes a environmental zoning.
* "mcp"
- Rather than using an external or additional layer, this option constraints
predictions by a certain distance of points in its vicinity. Buffer distances
have to be in the unit of the projection used and can be configured via
"mcp_buffer"
.
* "nt2"
- Constraints the predictions using the multivariate combination novelty index (NT2)
following Mesgaran et al. (2014). This method is also available in the similarity()
function.
* "mess"
- Constraints the predictions using the
Multivariate Environmental Similarity Surfaces (MESS) following Mesgaran et al. (2014).
This method is also available in the similarity()
function.
* "shape"
- This is an implementation of the 'shape' method introduced
by Velazco et al. (2023). Through a user defined threshold it effectively limits
model extrapolation so that no projections are made beyond the extent judged as
defensible and informed by the training observations. Not yet implemented!
See also details for further explanations.
```r
add_limits_extrapolation(x, layer, method = "mcp", mcp_buffer = 0,
  novel = "within", limits_clip = FALSE)

## S4 method for signature 'BiodiversityDistribution'
## (same arguments as the generic)
```
Arguments:
* `x`: A BiodiversityDistribution object.
* `layer`: …
* `method`: … (Default: "mcp")
* `mcp_buffer`: … (Default: 0)
* `novel`: Which conditions are to be masked out respectively, either only the novel conditions within ("within") … (Default: "within")
* `limits_clip`: … (Default: FALSE)
For method "zones"
a zoning layer can be supplied which is then used to intersect
the provided training points with. Any projections made with the model can
then be constrained so as to not project into areas that do not consider any
training points and are unlikely to have any. Examples for zones are for the
separation of islands and mainlands, biomes, or lithological soil conditions.
If no layer is available, it is also possible to constraint predictions by the
distance to a minimum convex polygon surrounding the training points with
method "mcp"
(optionally buffered). This can make sense particular for
rare species or those fully sampled across their niche.
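A sketch of the "mcp" option (the buffer value is illustrative and has to be in the units of the projection used):

```r
## Not run:
# Sketch: constrain projections to a 10 km buffer around the minimum
# convex polygon of the training points (projection units: m)
x <- distribution(background) |>
  add_predictors(covariates) |>
  add_limits_extrapolation(method = "mcp", mcp_buffer = 10000)
## End(Not run)
```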
For the "NT2"
and "MESS"
index it is possible to constrain
the prediction to conditions within (novel = "within"
) or also include
outside (novel = "outside"
) conditions.
Adds an extrapolation limit option to a distribution object.
The method "zones"
is also possible directly within distribution()
.
Randin, C. F., Dirnböck, T., Dullinger, S., Zimmermann, N. E., Zappa, M., & Guisan, A. (2006). Are niche‐based species distribution models transferable in space?. Journal of biogeography, 33(10), 1689-1703. https://doi.org/10.1111/j.1365-2699.2006.01466.x
Chevalier, M., Broennimann, O., Cornuault, J., & Guisan, A. (2021). Data integration methods to account for spatial niche truncation effects in regional projections of species distribution. Ecological Applications, 31(7), e02427. https://doi.org/10.1002/eap.2427
Velazco, S. J. E., Brooke, M. R., De Marco Jr., P., Regan, H. M., & Franklin, J. (2023). How far can I extrapolate my species distribution model? Exploring Shape, a novel method. Ecography, 11, e06992. https://doi.org/10.1111/ecog.06992
Mesgaran, M. B., R. D. Cousens, B. L. Webber, and J. Franklin. (2014) Here be dragons: a tool for quantifying novelty due to covariate range and correlation change when projecting species distribution models. Diversity and Distributions 20:1147-1159.
```r
## Not run:
# To add a zone layer for extrapolation constraints.
x <- distribution(background) |>
  add_predictors(covariates) |>
  add_limits_extrapolation(method = "zones", layer = zones)
## End(Not run)
```
This function allows specifying a file as log file, which is used to save all console outputs, prints and messages.
```r
add_log(x, filename)

## S4 method for signature 'BiodiversityDistribution,character'
## (same arguments as the generic)
```
Arguments:
* `x`: A BiodiversityDistribution object.
* `filename`: A character with the file name of the log.
Adds a log file to a distribution object.
```r
## Not run:
# A file name has to be supplied (hypothetical example)
x <- distribution(background) |>
  add_log(filename = "model_run.log")
x
## End(Not run)
```
Including offsets is another option to integrate spatial prior information in linear and additive regression models. Offsets shift the intercept of the regression fit by a certain amount. Although only one offset can be added to a regression model, it is possible to combine several spatially explicit estimates into one offset by calculating the sum of all spatially explicit layers.
```r
add_offset(x, layer, add = TRUE)

## S4 method for signature 'BiodiversityDistribution,SpatRaster'
## S4 method for signature 'BiodiversityDistribution,sf'
## (same arguments as the generic)
```
Arguments:
* `x`: A BiodiversityDistribution object.
* `layer`: A SpatRaster or sf object with the offset.
* `add`: … (Default: TRUE)
This function allows setting a specific offset for a regression model. The offset has to be provided as a spatial SpatRaster object. This function simply adds the layer to a distribution() object.
Note that any transformation of the offset (such as log) has to be done externally!
If the layer is a range and requires additional formatting, consider using the function add_offset_range(), which has additional functionalities such as distance transformations.
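A minimal sketch of an externally transformed offset (assuming `off.area` is a strictly positive SpatRaster, e.g. an expected-abundance estimate; terra supports log() directly on SpatRaster objects):

```r
## Not run:
# Sketch: log-transform the offset externally, then add it
log_offset <- log(off.area)
x <- distribution(background) |>
  add_predictors(covariates) |>
  add_offset(log_offset)
## End(Not run)
```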
Adds an offset to a distribution object.
Since offsets only make sense for linear regressions (and not, for instance, for regression-tree based methods such as engine_bart()), they do not work for all engines. Offsets specified for non-supported engines are ignored during the estimation.
Merow, C., Allen, J.M., Aiello-Lammens, M., Silander, J.A., 2016. Improving niche and range estimates with Maxent and point process models by integrating spatially explicit information. Glob. Ecol. Biogeogr. 25, 1022–1036. https://doi.org/10.1111/geb.12453
Other offset: add_offset_bias(), add_offset_elevation(), add_offset_range(), rm_offset()
```r
## Not run:
x <- distribution(background) |>
  add_predictors(covariates) |>
  add_offset(nicheEstimate)
## End(Not run)
```
Including offsets is another option to integrate spatial prior information in linear and additive regression models. Offsets shift the intercept of the regression fit by a certain amount. Although only one offset can be added to a regression model, it is possible to combine several spatially explicit estimates into one offset by calculating the sum of all spatially explicit layers.
```r
add_offset_bias(x, layer, add = TRUE, points = NULL)

## S4 method for signature 'BiodiversityDistribution,SpatRaster'
## (same arguments as the generic)
```
Arguments:
* `x`: A BiodiversityDistribution object.
* `layer`: A SpatRaster object with the bias layer.
* `add`: … (Default: TRUE)
* `points`: An optional … (Default: NULL)
This function emulates the use of the add_offset() function, however it applies an inverse transformation to remove the provided layer from the overall offset. So if, for instance, an offset is already specified (such as area), this function removes the provided bias.layer from it via "offset(log(off.area) - log(bias.layer))".
Note that any transformation of the offset (such as log) has to be done externally!
If a generic offset is to be added, consider using the add_offset() function. If the layer is an expert-based range and requires additional parametrization, consider using the function add_offset_range() or the bossMaps R-package.
Adds a bias offset to a distribution object.
Merow, C., Allen, J.M., Aiello-Lammens, M., Silander, J.A., 2016. Improving niche and range estimates with Maxent and point process models by integrating spatially explicit information. Glob. Ecol. Biogeogr. 25, 1022–1036. https://doi.org/10.1111/geb.12453
Other offset: add_offset(), add_offset_elevation(), add_offset_range(), rm_offset()
```r
## Not run:
x <- distribution(background) |>
  add_predictors(covariates) |>
  add_offset_bias(samplingBias)
## End(Not run)
```
This function implements the elevation preferences offset defined in Ellis‐Soto et al. (2021). The code here was adapted from the Supporting materials script.
```r
add_offset_elevation(x, elev, pref, rate = 0.0089, add = TRUE)

## S4 method for signature 'BiodiversityDistribution,SpatRaster,numeric'
## (same arguments as the generic)
```
Arguments:
* `x`: A BiodiversityDistribution object.
* `elev`: A SpatRaster with the elevation information.
* `pref`: A numeric vector with the lower and upper elevational preference.
* `rate`: … (Default: 0.0089)
* `add`: … (Default: TRUE)
Specifically, this function calculates a continuous decay and decreasing probability of a species to occur away from its elevation limits. It requires a SpatRaster with elevation information. A generalized logistic transform (aka Richards' curve) is used to calculate the decay from the suitable elevational areas, with the "rate" parameter allowing to vary the steepness of the decline.
Note that all offsets created by this function are by default log-transformed before export. In addition this function also mean-centers the output as recommended by Ellis-Soto et al.
Adds an elevational offset to a distribution object.
Ellis‐Soto, D., Merow, C., Amatulli, G., Parra, J.L., Jetz, W., 2021. Continental‐scale 1 km hummingbird diversity derived from fusing point records with lateral and elevational expert information. Ecography (Cop.). 44, 640–652. https://doi.org/10.1111/ecog.05119
Merow, C., Allen, J.M., Aiello-Lammens, M., Silander, J.A., 2016. Improving niche and range estimates with Maxent and point process models by integrating spatially explicit information. Glob. Ecol. Biogeogr. 25, 1022–1036. https://doi.org/10.1111/geb.12453
Other offset: add_offset(), add_offset_bias(), add_offset_range(), rm_offset()
```r
## Not run:
# Adds the offset to a distribution object
distribution(background) |>
  add_offset_elevation(dem, pref = c(400, 1200))
## End(Not run)
```
This function has additional options compared to the more generic add_offset(), allowing customized options specifically for expert-based ranges as offsets or spatialized polygon information on species occurrences. If even more control is needed, the user is referred to the "bossMaps" package by Merow et al. (2017). Some functionalities of that package are emulated through setting the "distance_function" to "log". This tries to fit a 5-parameter logistic function to estimate the distance from the range (Merow et al. 2017).
```r
add_offset_range(x, layer, distance_max = Inf, family = "poisson",
  presence_prop = 0.9, distance_clip = FALSE, distance_function = "negexp",
  field_occurrence = "observed", fraction = NULL, point = FALSE, add = TRUE)

## S4 method for signature 'BiodiversityDistribution,SpatRaster'
add_offset_range(x, layer, fraction = NULL, add = TRUE)

## S4 method for signature 'BiodiversityDistribution,sf'
## (same arguments as the generic)
```
Arguments:
* `x`: A BiodiversityDistribution object.
* `layer`: A SpatRaster or sf object with the range.
* `distance_max`: … (Default: Inf)
* `family`: … (Default: "poisson")
* `presence_prop`: … (Default: 0.9)
* `distance_clip`: … (Default: FALSE)
* `distance_function`: … (Default: "negexp")
* `field_occurrence`: … (Default: "observed")
* `fraction`: An optional … (Default: NULL)
* `point`: An optional … (Default: FALSE)
* `add`: … (Default: TRUE)
The output created by this function is a SpatRaster to be added to a provided distribution object. Offsets in regression models are likelihood specific, as they are added directly to the overall estimate of `y^hat`.
Note that all offsets created by this function are by default log-transformed before export. Background values (e.g. beyond "distance_max") are set to a very small constant (1e-10).
Adds a range offset to a distribution object.
Merow, C., Wilson, A.M., Jetz, W., 2017. Integrating occurrence data and expert maps for improved species range predictions. Glob. Ecol. Biogeogr. 26, 243–258. https://doi.org/10.1111/geb.12539
Merow, C., Allen, J.M., Aiello-Lammens, M., Silander, J.A., 2016. Improving niche and range estimates with Maxent and point process models by integrating spatially explicit information. Glob. Ecol. Biogeogr. 25, 1022–1036. https://doi.org/10.1111/geb.12453
"bossMaps"
Other offset:
add_offset()
,
add_offset_bias()
,
add_offset_elevation()
,
rm_offset()
```r
## Not run:
# Train a presence-only model with a simple offset
fit <- distribution(background) |>
  add_biodiversity_poipo(virtual_points, field_occurrence = "Observed") |>
  add_predictors(predictors) |>
  add_offset_range(virtual_range, distance_max = 5,
    distance_function = "logcurve", distance_clip = TRUE) |>
  engine_glm() |>
  train()
## End(Not run)
```
Create lower and upper limits for an elevational range and add them as separate predictors.
```r
add_predictor_elevationpref(x, layer, lower, upper, transform = "none")

## S4 method for signature 'BiodiversityDistribution,ANY,numeric,numeric'
## (same arguments as the generic)
```
Arguments:
* `x`: A BiodiversityDistribution object.
* `layer`: A layer with the elevation information.
* `lower`: A numeric value for the lower elevational limit.
* `upper`: A numeric value for the upper elevational limit.
* `transform`: … (Default: "none")
```r
## Not run:
distribution(background) |>
  add_predictor_elevationpref(elevation, lower = 200, upper = 1000)
## End(Not run)
```
This function allows adding a species range, which is usually drawn by experts in a separate process, as a spatially explicit prior. Both sf and SpatRaster objects are supported as input.
Users are advised to look at the "bossMaps" R-package presented as part of Merow et al. (2017), which allows flexible calculation of non-linear distance transforms from the boundary of the range. Outputs of this package could be added directly to this function.
Note that this function adds the range as a predictor and not as an offset. For this purpose a separate function add_offset_range() exists.
Additional options allow including the range either as a "binary" or as a "distance"-transformed predictor. The difference is that the range is either directly included as a presence-only predictor or, alternatively, with a linear distance transform from the range boundary. The parameter "distance_max" can be specified to constrain this distance transform.
```r
add_predictor_range(x, layer, method = "distance", distance_max = NULL,
  fraction = NULL, priors = NULL)

## S4 method for signature 'BiodiversityDistribution,SpatRaster'
add_predictor_range(x, layer, method = "precomputed_range", fraction = NULL,
  priors = NULL)

## S4 method for signature 'BiodiversityDistribution,sf'
add_predictor_range(x, layer, method = "distance", distance_max = Inf,
  fraction = NULL, priors = NULL)
```
Arguments:
* `x`: A BiodiversityDistribution object.
* `layer`: A sf or SpatRaster object with the range.
* `method`: … (Default: "distance"; "precomputed_range" for the SpatRaster method)
* `distance_max`: Numeric threshold on the maximum distance. (Default: NULL; Inf for the sf method)
* `fraction`: An optional … (Default: NULL)
* `priors`: … (Default: NULL)
Merow, C., Wilson, A. M., & Jetz, W. (2017). Integrating occurrence data and expert maps for improved species range predictions. Global Ecology and Biogeography, 26(2), 243–258. https://doi.org/10.1111/geb.12539
```r
## Not run:
distribution(background) |>
  add_predictor_range(range, method = "distance", distance_max = 2)
## End(Not run)
```
This function allows adding predictors to distribution or BiodiversityScenario objects. Predictors are covariates whose geographic projection has to match that of the background layer in the distribution object. This function furthermore allows transforming the provided predictors or creating derivates of them.
add_predictors(x, env, names = NULL, transform = "none", derivates = "none",
  derivate_knots = 4, int_variables = NULL, bgmask = TRUE, harmonize_na = FALSE,
  explode_factors = FALSE, priors = NULL, state = NULL, ...)

## S4 method for signature 'BiodiversityDistribution,SpatRasterCollection'
add_predictors(x, env, names = NULL, transform = "none", derivates = "none",
  derivate_knots = 4, int_variables = NULL, bgmask = TRUE, harmonize_na = FALSE,
  explode_factors = FALSE, priors = NULL, state = NULL, ...)

## S4 method for signature 'BiodiversityDistribution,SpatRaster'
add_predictors(x, env, names = NULL, transform = "none", derivates = "none",
  derivate_knots = 4, int_variables = NULL, bgmask = TRUE, harmonize_na = FALSE,
  explode_factors = FALSE, priors = NULL, state = NULL, ...)

## S4 method for signature 'BiodiversityDistribution,stars'
add_predictors(x, env, names = NULL, transform = "none", derivates = "none",
  derivate_knots = 4, int_variables = NULL, bgmask = TRUE, harmonize_na = FALSE,
  explode_factors = FALSE, priors = NULL, state = NULL, ...)

## S4 method for signature 'BiodiversityScenario,SpatRaster'
add_predictors(x, env, names = NULL, transform = "none", derivates = "none",
  derivate_knots = 4, int_variables = NULL, bgmask = TRUE, harmonize_na = FALSE,
  explode_factors = FALSE, priors = NULL, state = NULL, ...)

## S4 method for signature 'BiodiversityScenario,stars'
add_predictors(x, env, names = NULL, transform = "none", derivates = "none",
  derivate_knots = 4, int_variables = NULL, bgmask = TRUE, harmonize_na = FALSE,
  explode_factors = FALSE, priors = NULL, state = NULL, ...)
x |
|
env |
A |
names |
A |
transform |
A |
derivates |
A Boolean check whether derivate features should be
considered (Options: |
derivate_knots |
A single |
int_variables |
A |
bgmask |
Check whether the environmental data should be masked with the
background layer (Default: |
harmonize_na |
A |
explode_factors |
|
priors |
A |
state |
A |
... |
Other parameters passed down |
A transformation takes the provided rasters and for instance rescales them or
transforms them through a principal component analysis (prcomp). In
contrast, derivates leave the original provided predictors alone, but instead
create new ones, for instance by transforming their values through a
quadratic or hinge transformation. Note that this effectively increases the
number of predictors in the object, generally requiring stronger
regularization by the used Engine
. Both transformations and derivates can
also be combined. Available options for transformation are:
'none'
- Leaves the provided predictors in the original scale.
'pca'
- Converts the predictors to principal components. Note that this
results in a renaming of the variables to principal component axes!
'scale'
- Transforms all predictors by applying scale on them.
'norm'
- Normalizes all predictors by transforming them to a scale from 0 to 1.
'windsor'
- Applies a winsorization to the target predictors. By default
this effectively cuts the predictors to the 0.05 and 0.95 quantiles, thus helping to
remove extreme outliers.
Available options for creating derivates are:
'none'
- No additional predictor derivates are created.
'quad'
- Adds quadratic derivate predictors.
'interaction'
- Add interacting predictors. Interactions need to be specified ("int_variables"
)!
'thresh'
- Add threshold derivate predictors.
'hinge'
- Add hinge derivate predictors.
'kmeans'
- Add k-means derived factors.
'bin'
- Add predictors binned by their percentiles.
Important:
Not every Engine
supported by the ibis.iSDM R-package allows
missing data points among extracted covariates. Any observation with
missing data is therefore generally removed prior to model fitting. Ensure that
covariates have appropriate no-data settings (for instance by setting NA
values to 0
or another out-of-range constant).
Not every engine actually needs covariates. For instance, it is perfectly legitimate to fit a model with only occurrence data and a spatial latent effect (add_latent_spatial). This corresponds to a spatial kernel density estimate.
Certain names such as "offset"
are forbidden as predictor variable names.
The function will return an error message if these are used.
Some engines use binary variables regardless of the parameter explode_factors
set here.
## Not run: 
obj <- distribution(background) |>
  add_predictors(covariates, transform = 'scale')
obj

## End(Not run)
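To further illustrate the transformation and derivate options described above, a hedged sketch follows; covariates is assumed to be a SpatRaster of predictors and the variable names "bio01" and "bio12" are purely illustrative:

## Not run: 
# Sketch: scale predictors and additionally create quadratic derivates
obj <- distribution(background) |>
  add_predictors(covariates, transform = 'scale', derivates = 'quad')

# Sketch: interaction derivates need the variables named via "int_variables"
obj2 <- distribution(background) |>
  add_predictors(covariates, derivates = 'interaction',
                 int_variables = c("bio01", "bio12"))  # illustrative names

## End(Not run)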
This is a customized function to format and add downscaled land-use shares from the Global Biosphere Management Model (GLOBIOM) to a distribution or BiodiversityScenario in ibis.iSDM. GLOBIOM is a partial-equilibrium model developed at IIASA and represents land-use sectors with a rich set of environmental and socio-economic parameters, where for instance the agricultural and forestry sectors are estimated through dedicated process-based models. GLOBIOM outputs are spatially explicit and usually at a half-degree resolution globally. For finer-grain analyses GLOBIOM outputs can be produced in a downscaled format with a customized statistical downscaling module.
The purpose of this script is to format the GLOBIOM outputs of DownScale for use in the ibis.iSDM package.
add_predictors_globiom(x, fname, names = NULL, transform = "none",
  derivates = "none", derivate_knots = 4, int_variables = NULL, bgmask = TRUE,
  harmonize_na = FALSE, priors = NULL, ...)

## S4 method for signature 'BiodiversityDistribution,character'
add_predictors_globiom(x, fname, names = NULL, transform = "none",
  derivates = "none", derivate_knots = 4, int_variables = NULL, bgmask = TRUE,
  harmonize_na = FALSE, priors = NULL, ...)

## S4 method for signature 'BiodiversityScenario,character'
add_predictors_globiom(x, fname, names = NULL, transform = "none",
  derivates = "none", derivate_knots = 4, int_variables = NULL, bgmask = TRUE,
  harmonize_na = FALSE, priors = NULL, ...)
x |
A |
fname |
A |
names |
A |
transform |
A |
derivates |
A Boolean check whether derivate features should be considered
(Options: |
derivate_knots |
A single |
int_variables |
A |
bgmask |
Check whether the environmental data should be masked with the
background layer (Default: |
harmonize_na |
A |
priors |
A |
... |
Other parameters passed down |
See add_predictors()
for additional parameters and
customizations. For more (manual) control the function for formatting the
GLOBIOM data can also be called directly via formatGLOBIOM()
.
## Not run: 
obj <- distribution(background) |>
  add_predictors_globiom(fname = "", transform = 'none')
obj

## End(Not run)
This function is a convenience wrapper to add the output from a
previously fitted DistributionModel
to another BiodiversityDistribution
object. It obviously only works if a prediction was fitted in the model. Options
to instead add thresholds, or to transform the model outputs or create derivates of them, are
also supported.
add_predictors_model(x, model, transform = "scale", derivates = "none",
  threshold_only = FALSE, priors = NULL, ...)

## S4 method for signature 'BiodiversityDistribution'
add_predictors_model(x, model, transform = "scale", derivates = "none",
  threshold_only = FALSE, priors = NULL, ...)
x |
|
model |
A |
transform |
A |
derivates |
A Boolean check whether derivate features should be considered
(Options: |
threshold_only |
A |
priors |
A |
... |
Other parameters passed down |
A transformation takes the provided rasters and for instance rescales them or
transforms them through a principal component analysis (prcomp). In
contrast, derivates leave the original provided predictors alone, but instead
create new ones, for instance by transforming their values through a
quadratic or hinge transformation. Note that this effectively increases the
number of predictors in the object, generally requiring stronger
regularization by the used Engine
. Both transformations and derivates can
also be combined. Available options for transformation are:
'none'
- Leaves the provided predictors in the original scale.
'pca'
- Converts the predictors to principal components. Note that this
results in a renaming of the variables to principal component axes!
'scale'
- Transforms all predictors by applying scale on them.
'norm'
- Normalizes all predictors by transforming them to a scale from 0 to 1.
'windsor'
- Applies a winsorization to the target predictors. By default
this effectively cuts the predictors to the 0.05 and 0.95 quantiles, thus helping to
remove extreme outliers.
Available options for creating derivates are:
'none'
- No additional predictor derivates are created.
'quad'
- Adds quadratic transformed predictors.
'interaction'
- Add interacting predictors. Interactions need to be specified ("int_variables"
)!
'thresh'
- Add threshold transformed predictors.
'hinge'
- Add hinge transformed predictors.
'bin'
- Add predictors binned by their percentiles.
## Not run: 
# Fit first model
fit <- distribution(background) |>
  add_predictors(covariates) |>
  add_biodiversity_poipa(species) |>
  engine_glmnet() |>
  train()

# New model object
obj <- distribution(background) |>
  add_predictors_model(fit)
obj

## End(Not run)
This function simply allows adding priors to an existing
distribution object. The supplied priors must be a PriorList
object created through calling priors.
add_priors(x, priors = NULL, ...)

## S4 method for signature 'BiodiversityDistribution'
add_priors(x, priors = NULL, ...)
x |
distribution (i.e. |
priors |
A |
... |
Other parameters passed down. |
Alternatively, priors for environmental predictors can also be added directly as a parameter via add_predictors.
Other prior:
BARTPrior()
,
BARTPriors()
,
BREGPrior()
,
BREGPriors()
,
GDBPrior()
,
GDBPriors()
,
GLMNETPrior()
,
GLMNETPriors()
,
INLAPrior()
,
INLAPriors()
,
STANPrior()
,
STANPriors()
,
XGBPrior()
,
XGBPriors()
,
get_priors()
,
priors()
,
rm_priors()
## Not run: 
pp <- GLMNETPrior("forest")
x <- distribution(background) |>
  add_priors(pp)

## End(Not run)
For most engines, background or pseudo-absence points are
necessary. The distinction lies in how the absence data are handled. For
poisson
distributed responses, absence points are considered background
points over which the intensity of sampling (lambda
) is integrated (in
a classical Poisson point-process model).
In contrast, for binomial
distributed responses, the absence information is
assumed to be an adequate representation of the true absences and is treated by
the model as such. Here it is advised to specify absence points in a way
that they represent potential true absences, for example through
targeted background sampling or by sampling them within/outside a given
range.
add_pseudoabsence(
  df,
  field_occurrence = "observed",
  template = NULL,
  settings = getOption("ibis.pseudoabsence")
)
df |
A |
field_occurrence |
A |
template |
A |
settings |
A |
A pseudoabs_settings()
object can be added to set up how absence
points should be sampled. A bias
parameter can be set to specify a
bias layer to sample from, for instance a layer of accessibility. Note that
when modelling several datasets, it might make sense to check across all
datasets whether certain areas are truly absent. By default, the
pseudo-absence points are not sampled in areas in which there are already
presence points.
A data.frame
containing the newly created pseudo absence points.
This method removes all columns from the input df
object other
than the field_occurrence
column and the coordinate columns (which
will be created if not already present).
Stolar, J., & Nielsen, S. E. (2015). Accounting for spatially biased sampling effort in presence‐only species distribution modelling. Diversity and Distributions, 21(5), 595-608.
Bird, T.J., Bates, A.E., Lefcheck, J.S., Hill, N.A., Thomson, R.J., Edgar, G.J., Stuart-Smith, R.D., Wotherspoon, S., Krkosek, M., Stuart-Smith, J.F. and Pecl, G.T., 2014. Statistical solutions for error and bias in global citizen science datasets. Biological Conservation, 173, pp.144-154.
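No example is provided for this entry; a minimal sketch follows, assuming virtual_points is an sf object with an "observed" column and background a SpatRaster template:

## Not run: 
# Sketch: add pseudo-absence points to a presence-only dataset
df <- add_pseudoabsence(virtual_points,
                        field_occurrence = "observed",
                        template = background)
# A pseudoabs_settings() object could additionally be passed via "settings",
# for instance to sample absences proportional to a bias layer.

## End(Not run)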
Aligns a SpatRaster
object to another by harmonizing geometry and
extent.
If the data is not in the same projection as the template, the alignment will be computed by reprojection only. If the data already has the same projection, the data set will be cropped and aggregated prior to resampling in order to reduce computation time.
alignRasters(data, template, method = "bilinear", func = mean, cl = TRUE)
data |
|
template |
|
method |
method for resampling (Options: |
func |
function for resampling (Default: mean). |
cl |
|
Nearest-neighbour resampling ("near") is recommended for discrete data, while bilinear resampling is recommended for continuous data. See also the help of terra::resample for other options.
New SpatRaster
object aligned to the supplied template layer.
## Not run: 
# Align one raster to another
ras1 <- alignRasters(ras1, ras2, method = "near", cl = FALSE)

## End(Not run)
Function to include prior information as split probability for the Bayesian additive regression tree model added via engine_bart.
Priors for engine_bart have to be specified as transition probabilities of
variables which are internally used to generate splits in the regression
tree. Specifying a prior can thus help to 'enforce' a split with a given
variable. These probabilities are numeric and coded as values between 0
and 1.
BARTPrior(variable, hyper = 0.75, ...)

## S4 method for signature 'character'
BARTPrior(variable, hyper = 0.75, ...)
variable |
A |
hyper |
A |
... |
Variables passed on to prior object. |
Even if a given variable is included as split in the regression or classification tree, this does not necessarily mean that the prediction changes if the value is non-informative (as the split can occur early on). It does however affect any variable importance estimates calculated from the model.
Chipman, H., George, E., and McCulloch, R. (2009) BART: Bayesian Additive Regression Trees.
Chipman, H., George, E., and McCulloch R. (2006) Bayesian Ensemble Learning. Advances in Neural Information Processing Systems 19, Scholkopf, Platt and Hoffman, Eds., MIT Press, Cambridge, MA, 265-272.
Other prior:
BARTPriors()
,
BREGPrior()
,
BREGPriors()
,
GDBPrior()
,
GDBPriors()
,
GLMNETPrior()
,
GLMNETPriors()
,
INLAPrior()
,
INLAPriors()
,
STANPrior()
,
STANPriors()
,
XGBPrior()
,
XGBPriors()
,
add_priors()
,
get_priors()
,
priors()
,
rm_priors()
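No example is given for BARTPrior; a minimal sketch, assuming a predictor named "forest" has been added to the model:

## Not run: 
# Sketch: raise the split probability for the variable "forest"
p <- BARTPrior(variable = "forest", hyper = 0.75)
x <- distribution(background) |>
  add_priors(p)

## End(Not run)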
This is a helper function to specify several BARTPrior objects with the same hyper-parameters, but different variables.
BARTPriors(variable, hyper = 0.75, ...)

## S4 method for signature 'character'
BARTPriors(variable, hyper = 0.75, ...)
variable |
A |
hyper |
A |
... |
Variables passed on to prior object. |
Other prior:
BARTPrior()
,
BREGPrior()
,
BREGPriors()
,
GDBPrior()
,
GDBPriors()
,
GLMNETPrior()
,
GLMNETPriors()
,
INLAPrior()
,
INLAPriors()
,
STANPrior()
,
STANPriors()
,
XGBPrior()
,
XGBPriors()
,
add_priors()
,
get_priors()
,
priors()
,
rm_priors()
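Analogously to the BARTPrior sketch above, one could set the same hyper-parameter across several variables at once (variable names are illustrative):

## Not run: 
# Sketch: one split probability for several variables
pp <- BARTPriors(variable = c("forest", "cropland"), hyper = 0.5)
x <- distribution(background) |>
  add_priors(pp)

## End(Not run)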
BiodiversityDataset prototype description
name
The default name of this dataset as character
.
id
A character
with the unique id for this dataset.
equation
A formula
object containing the equation of how this dataset is modelled.
family
The family used for this dataset as character
.
link
The link function used for this data as character
.
type
weight
A numeric
containing custom weights per observation for this dataset.
field_occurrence
A character
with the name of the column containing observations.
data
Contains the observational data in sf
format.
use_intercept
A logical
flag on whether intercepts are included for this dataset.
pseudoabsence_settings
Optionally provided pseudoabsence settings.
new()
Initializes the object and creates an empty list
BiodiversityDataset$new( name, id, equation, family, link, type, weight, field_occurrence, data, use_intercept, pseudoabsence_settings )
name
The default name of this dataset as character
.
id
A character
with the unique id for this dataset.
equation
A formula
object containing the equation of how this dataset is modelled.
family
The family used for this dataset as character
.
link
The link function used for this data as character
.
type
weight
A numeric
containing custom weights per observation for this dataset.
field_occurrence
A character
with the name of the column containing observations.
data
Contains the observational data in sf
format.
use_intercept
A logical
flag on whether intercepts are included for this dataset.
pseudoabsence_settings
Optionally provided pseudoabsence settings.
NULL
print()
Print the names and properties of all Biodiversity datasets contained within
BiodiversityDataset$print()
A message on screen
set_equation()
Set new equation and writes it into formula
BiodiversityDataset$set_equation(x)
x
A new formula
object.
Invisible
get_equation()
Get equation
BiodiversityDataset$get_equation()
A placeholder or formula
object.
show_equation()
Function to print the equation
BiodiversityDataset$show_equation()
A message on screen.
get_id()
Get Id within the dataset
BiodiversityDataset$get_id()
A character
with the id.
get_type()
Get type of the dataset.
BiodiversityDataset$get_type(short = FALSE)
short
A logical
flag if this should be formatted in shortform.
A character
with the type
get_column_occ()
Get field with occurrence information
BiodiversityDataset$get_column_occ()
A character
with the occurrence field
get_family()
Get family
BiodiversityDataset$get_family()
A character
with the family for the dataset
get_link()
Get custom link function
BiodiversityDataset$get_link()
A character
with the family for the dataset
get_data()
Get data from the object
BiodiversityDataset$get_data()
A sf
object with the data
get_weight()
Get weight
BiodiversityDataset$get_weight()
A numeric
with the weights within the dataset.
show()
Print input messages
BiodiversityDataset$show()
A message on screen.
get_observations()
Collect info statistics about number of observations
BiodiversityDataset$get_observations()
A numeric
with the number of observations.
mask()
Convenience function to mask all input datasets.
BiodiversityDataset$mask(mask, inverse = FALSE, ...)
mask
A SpatRaster
or sf
object.
inverse
A logical
flag if the inverse should be masked instead.
...
Any other parameters passed on to mask
Invisible
clone()
The objects of this class are cloneable with this method.
BiodiversityDataset$clone(deep = FALSE)
deep
Whether to make a deep clone.
Acts as a container for a specified set of BiodiversityDataset objects contained within. Functions are provided to summarize across the BiodiversityDataset-class objects.
data
A list
of BiodiversityDataset
objects.
name
The default name of this collection as character
.
new()
Initializes the object and creates an empty list
BiodiversityDatasetCollection$new()
NULL
print()
Print the names and properties of all Biodiversity datasets contained within
BiodiversityDatasetCollection$print(format = TRUE)
format
A logical
flag on whether a message should be printed.
A message on screen
show()
Aliases that calls print.
BiodiversityDatasetCollection$show()
A message on screen
get_types()
Types of all biodiversity datasets included in this
BiodiversityDatasetCollection$get_types(short = FALSE)
short
A logical
flag whether types should be in short format.
A character
vector.
get_names()
Get names and format them if necessary
BiodiversityDatasetCollection$get_names(format = FALSE)
format
A logical
flag whether names are to be formatted
A character
vector.
set_data()
Add a new Biodiversity dataset to this collection.
BiodiversityDatasetCollection$set_data(x, value)
x
A character
with the name or id of this dataset.
value
A BiodiversityDataset
Invisible
get_data_object()
Get a specific Biodiversity dataset by id
BiodiversityDatasetCollection$get_data_object(id)
id
A character
with a given id for the dataset.
Returns a BiodiversityDataset.
get_data()
Get all biodiversity observations from a given dataset.
BiodiversityDatasetCollection$get_data(id)
id
A character
with a given id for the dataset.
Returns all data from a set BiodiversityDataset.
get_coordinates()
Get coordinates for a given biodiversity dataset, else return a wkt object.
BiodiversityDatasetCollection$get_coordinates(id)
id
A character
with a given id for the dataset.
All coordinates from a given object in data.frame
.
mask()
Convenience function to mask all input datasets.
BiodiversityDatasetCollection$mask(mask, inverse = FALSE)
mask
A SpatRaster
or sf
object.
inverse
A logical
flag if the inverse should be masked instead.
Invisible
rm_data()
Remove a specific biodiversity dataset by id
BiodiversityDatasetCollection$rm_data(id)
id
A character
with a given id for the dataset.
Invisible
length()
Number of Biodiversity Datasets in collection
BiodiversityDatasetCollection$length()
A numeric
with the number of datasets.
get_observations()
Get number of observations of all datasets
BiodiversityDatasetCollection$get_observations()
A numeric
with the number of observations across datasets.
get_equations()
Get equations from all datasets
BiodiversityDatasetCollection$get_equations()
A list
vector with all equations across datasets.
get_families()
Get families from datasets.
BiodiversityDatasetCollection$get_families()
A list
vector with all families across datasets.
get_links()
Get custom link functions
BiodiversityDatasetCollection$get_links()
A list
vector with all link functions across datasets.
get_columns_occ()
Get fields with observation columns
BiodiversityDatasetCollection$get_columns_occ()
A list
vector with the names of observation columns.
get_weights()
Get the weights across datasets.
BiodiversityDatasetCollection$get_weights()
A list
vector with the weights if set per dataset.
get_ids()
Get ids of all assets in the collection.
BiodiversityDatasetCollection$get_ids()
A list
vector with the ids of all datasets.
get_id_byType()
Search for a specific biodiversity dataset with type
BiodiversityDatasetCollection$get_id_byType(type)
type
A character
for a given data type.
A character
with the id(s) of datasets with the given type.
get_id_byName()
Get id by name
BiodiversityDatasetCollection$get_id_byName(name)
name
A character
for a given name.
A character
with the id(s) of datasets with the given name.
show_equations()
Show equations of all datasets
BiodiversityDatasetCollection$show_equations(msg = TRUE)
msg
A logical
on whether to use print a message instead.
Shows equations on screen or as character
.
plot()
Plot the whole collection
BiodiversityDatasetCollection$plot()
Invisible
clone()
The objects of this class are cloneable with this method.
BiodiversityDatasetCollection$clone(deep = FALSE)
deep
Whether to make a deep clone.
This can likely be beautified further.
Base R6
class for any biodiversity distribution objects.
Serves as container that supplies data and functions to other R6
classes. Generally stores all objects and parameters added to a model.
Run names()
on a distribution
object to show all available
functions.
background
A SpatRaster
or sf
object delineating the modelling extent.
limits
An optional sf
object on potential extrapolation limits
biodiversity
A BiodiversityDatasetCollection
object.
predictors
A PredictorDataset
object.
priors
An optional PriorList
object.
control
An optional Control object.
latentfactors
A character
on whether latentfactors are used.
offset
A character
on whether offset methods are used.
log
An optional Log
object.
engine
A Engine
object.
new()
Initializes the object and creates an BiodiversityDataset by default.
BiodiversityDistribution$new(background, limits, biodiversity, ...)
background
A SpatRaster
or sf
object delineating the modelling extent.
limits
An optional sf
object on potential extrapolation limits
biodiversity
A BiodiversityDatasetCollection
object.
...
Any other objects
NULL
print()
Looks for and returns the properties of all contained objects.
BiodiversityDistribution$print()
A message on screen
show()
An alias for print
BiodiversityDistribution$show()
A message on screen
name()
Returns self-describing name
BiodiversityDistribution$name()
A character
with the name
show_background_info()
Summarizes extent and projection from set background
BiodiversityDistribution$show_background_info()
A character
with the name
set_limits()
Specify new limits to the background
BiodiversityDistribution$set_limits(x)
x
A list
object with method and limit type.
This object.
get_limits()
Get provided limits if set or a waiver
BiodiversityDistribution$get_limits()
A list
or waiver.
rm_limits()
Remove limits if set.
BiodiversityDistribution$rm_limits()
This object.
get_predictor_names()
Function for querying predictor names if existing
BiodiversityDistribution$get_predictor_names()
A character
vector.
set_latent()
Adding latent factors to the object.
BiodiversityDistribution$set_latent(type, method = NULL, separate_spde = FALSE)
This object.
get_latent()
Get latent factors if found in object.
BiodiversityDistribution$get_latent()
A character
with those objects.
rm_latent()
Remove latent factors if found in object.
BiodiversityDistribution$rm_latent()
This object.
get_priors()
Get prior object if found in object.
BiodiversityDistribution$get_priors()
This object.
set_priors()
Specify new prior object. Overwrites existing ones
BiodiversityDistribution$set_priors(x)
x
A PriorList
object.
This object.
set_biodiversity()
Adds a new biodiversity object to the existing empty collection.
BiodiversityDistribution$set_biodiversity(id, p)
id
A character
or id defining this object.
p
A BiodiversityDataset
object.
This object.
set_predictors()
Set a new Predictor object to this object.
BiodiversityDistribution$set_predictors(x)
x
A PredictorDataset
with predictors for this object.
This object.
set_engine()
Set a new Engine object to this object.
BiodiversityDistribution$set_engine(x)
x
A Engine
for this object.
This object.
get_engine()
Gets the name of the current engine if set.
BiodiversityDistribution$get_engine()
A character
with the engine name
rm_engine()
Removes the current engine if set.
BiodiversityDistribution$rm_engine()
This object
get_prior_variables()
Get prior variables
BiodiversityDistribution$get_prior_variables()
A character
with the variable names for which priors have been added.
set_offset()
Specify new offsets.
BiodiversityDistribution$set_offset(x)
x
A new SpatRaster
object to be used as offset.
This object.
get_offset()
Get offset (print name)
BiodiversityDistribution$get_offset()
A character
with all the offsets in here.
rm_offset()
Remove offsets if found.
BiodiversityDistribution$rm_offset(what = NULL)
what
Optional character
of specific offsets to remove.
This object.
plot_offsets()
Plot offset if found.
BiodiversityDistribution$plot_offsets()
A graphical element.
get_offset_type()
Get offset parameters if found
BiodiversityDistribution$get_offset_type()
A list
with the offset parameters if found.
set_control()
Set new bias control
BiodiversityDistribution$set_control(type = "bias", x, method, value)
type
A character
with the type of control object.
x
A new bias control object. Expecting a SpatRaster
object.
method
The method used to create the object.
value
A bias value as numeric
.
This object.
get_control()
Get bias control (print name)
BiodiversityDistribution$get_control(type = "bias")
type
A character
with the type of control object.
A character
with the bias object if found.
rm_control()
Remove bias controls if found.
BiodiversityDistribution$rm_control()
This object.
plot_bias()
Plot bias variable if set.
BiodiversityDistribution$plot_bias()
A graphical element.
get_log()
Returns the output filename of the current log object if set.
BiodiversityDistribution$get_log()
A character
where the output is returned.
set_log()
Set a new log object
BiodiversityDistribution$set_log(x)
x
A Log
object.
This object
get_extent()
Get extent
BiodiversityDistribution$get_extent()
Background extent or NULL.
get_projection()
Get projection from the background in crs format.
BiodiversityDistribution$get_projection()
A character
of the projection
get_resolution()
Return resolution of the background object.
BiodiversityDistribution$get_resolution()
A vector
with the resolution.
rm_predictors()
Remove predictors. Either all of them or specific ones.
BiodiversityDistribution$rm_predictors(names)
names
A character
with the predictors to be removed.
This object.
rm_priors()
Remove priors. Either all of them or specific ones.
BiodiversityDistribution$rm_priors(names = NULL)
names
A character
with the priors to be removed.
This object.
show_biodiversity_length()
Show number of biodiversity records
BiodiversityDistribution$show_biodiversity_length()
A numeric
with sum of biodiversity records
show_biodiversity_equations()
Show Equations of biodiversity records
BiodiversityDistribution$show_biodiversity_equations()
A message on screen.
get_biodiversity_equations()
Get equations of biodiversity records
BiodiversityDistribution$get_biodiversity_equations()
A list
vector.
get_biodiversity_types()
Query all biodiversity types in this object
BiodiversityDistribution$get_biodiversity_types()
A character
vector.
get_biodiversity_ids()
Return all biodiversity dataset ids in the object
BiodiversityDistribution$get_biodiversity_ids()
A list
for the ids in the biodiversity datasets
get_biodiversity_names()
Return all the character
names of all biodiversity datasets
BiodiversityDistribution$get_biodiversity_names()
A list
with the names in the biodiversity datasets
plot()
Plots the content of this class.
BiodiversityDistribution$plot()
A message.
summary()
Summary function for this object.
BiodiversityDistribution$summary()
A message.
clone()
The objects of this class are cloneable with this method.
BiodiversityDistribution$clone(deep = FALSE)
deep
Whether to make a deep clone.
Not implemented yet.
Not implemented yet.
add_biodiversity_poipa()
, add_biodiversity_poipo()
, add_biodiversity_polpa()
, add_biodiversity_polpo()
# Query available functions and entries
background <- terra::rast(system.file('extdata/europegrid_50km.tif',
  package = 'ibis.iSDM', mustWork = TRUE))

# Define model
x <- distribution(background)
names(x)
Base R6
class for any biodiversity scenario objects. Serves as
container that supplies data and functions to other R6
classes and functions.
modelobject
A name of the model for projection.
modelid
An id of the model used for projection.
limits
A sf
object used to constraint the prediction.
predictors
A predictor object for projection.
constraints
Any constraints set for projection.
latentfactors
A list
on whether latentfactors are used.
scenarios
The resulting stars
objects.
new()
Initializes the object and creates an empty list
BiodiversityScenario$new()
NULL
print()
Print the names and properties of all scenarios.
BiodiversityScenario$print()
A message on screen
verify()
Verify that the set Model exists and check self-validity
BiodiversityScenario$verify()
Invisible
show()
Show the name of the Model
BiodiversityScenario$show()
Model object name
get_projection()
Get the geographic projection of the projection.
BiodiversityScenario$get_projection()
A sf
object with the geographic projection
get_resolution()
Get resolution of the projection.
BiodiversityScenario$get_resolution()
A numeric
indication of the resolution.
get_model()
Get the actual model used for projection
BiodiversityScenario$get_model(copy = FALSE)
copy
A logical
flag on whether a deep copy should be created.
A DistributionModel object.
get_limits()
Get provided projection limits if set.
BiodiversityScenario$get_limits()
A sf
object or NULL.
rm_limits()
Remove current limits.
BiodiversityScenario$rm_limits()
Invisible
get_predictor_names()
Get names of predictors for scenario object.
BiodiversityScenario$get_predictor_names()
A character
vector with the names.
get_timeperiod()
Get time period of projection.
BiodiversityScenario$get_timeperiod(what = "range")
what
character
on whether full time period or just the range is to be returned.
A time period from start to end.
get_constraints()
Get constraints for the model
BiodiversityScenario$get_constraints()
A list
with the constraints within the scenario.
rm_constraints()
Remove constraints from the model
BiodiversityScenario$rm_constraints()
Invisible
get_threshold()
Get thresholds if specified.
BiodiversityScenario$get_threshold()
A list
with method and value for the threshold.
get_thresholdvalue()
Duplicate function for internal consistency to return threshold
BiodiversityScenario$get_thresholdvalue()
A list
with method and value for the threshold.
apply_threshold()
Apply a new threshold to the projection.
BiodiversityScenario$apply_threshold(tr = new_waiver())
tr
A numeric
value with the new threshold.
This object.
set_predictors()
Set new predictors to this object.
BiodiversityScenario$set_predictors(x)
x
PredictorDataset
object to be supplied.
This object.
set_constraints()
Set new constraints
BiodiversityScenario$set_constraints(x)
x
A list
object with constraint settings.
This object.
get_simulation()
Get simulation options and parameters if found
BiodiversityScenario$get_simulation()
A list
with the parameters.
set_simulation()
Set simulation objects.
BiodiversityScenario$set_simulation(x)
x
new simulation entries and options as list
to be set.
This object.
get_predictors()
Get Predictors from the object.
BiodiversityScenario$get_predictors()
A predictor dataset.
rm_predictors()
Remove predictors from the object.
BiodiversityScenario$rm_predictors(names)
names
A character
vector with names
This object.
get_data()
Get scenario predictions or any other data
BiodiversityScenario$get_data(what = "scenarios")
what
A character
vector with names of what
This object.
rm_data()
Remove scenario predictions
BiodiversityScenario$rm_data()
what
A character
vector with names of what
Invisible
set_data()
Set new data in object.
BiodiversityScenario$set_data(x)
x
A new data object measuring scenarios.
This object.
set_latent()
Adding latent factors to the object.
BiodiversityScenario$set_latent(latent)
latent
A list
containing the data object.
This object.
get_latent()
Get latent factors if found in object.
BiodiversityScenario$get_latent()
A list
with the latent settings
rm_latent()
Remove latent factors if found in object.
BiodiversityScenario$rm_latent()
This object.
plot()
Plot the predictions made here.
BiodiversityScenario$plot(what = "suitability", which = NULL, ...)
A graphical representation
plot_threshold()
Convenience function to plot thresholds if set
BiodiversityScenario$plot_threshold(which = NULL)
which
A numeric
subset to any specific time steps.
A graphical representation
plot_migclim()
Plot Migclim results if existing.
BiodiversityScenario$plot_migclim()
A graphical representation
plot_animation()
Plot animation of scenarios if possible
BiodiversityScenario$plot_animation(what = "suitability", fname = NULL)
what
A character
describing the layers to be plotted.
fname
An optional filename to write the result.
A graphical representation
plot_relative_change()
Plot relative change between baseline and projected thresholds
BiodiversityScenario$plot_relative_change( position = NULL, variable = "mean", plot = TRUE )
A graphical representation or SpatRaster
.
summary()
Summarize the change in layers between timesteps
BiodiversityScenario$summary( layer = "threshold", plot = FALSE, relative = FALSE )
Summarized coefficients as data.frame
summary_beforeafter()
Summarize before-after change of first and last layer.
BiodiversityScenario$summary_beforeafter()
Summarized coefficients as data.frame
plot_scenarios_slope()
Calculate slopes across the projection
BiodiversityScenario$plot_scenarios_slope( what = "suitability", oftype = "stars" )
A plot of the scenario slopes
calc_scenarios_slope()
Calculate slopes across the projection
BiodiversityScenario$calc_scenarios_slope( what = "suitability", plot = TRUE, oftype = "stars" )
A SpatRaster
layer or stars
object.
mask()
Convenience function to mask all input projections.
BiodiversityScenario$mask(mask, inverse = FALSE, ...)
mask
A SpatRaster
or sf
object.
inverse
A logical
flag if the inverse should be masked instead.
...
Any other parameters passed on.
Invisible
get_centroid()
Get centroids of projection layers
BiodiversityScenario$get_centroid(patch = FALSE)
patch
A logical
if centroid should be calculated weighted by values.
Returns a sf
object.
save()
Save object as output somewhere
BiodiversityScenario$save(fname, type = "tif", dt = "FLT4S")
Saved spatial prediction on drive.
clone()
The objects of this class are cloneable with this method.
BiodiversityScenario$clone(deep = FALSE)
deep
Whether to make a deep clone.
This sets the threshold method internally to 'fixed'
.
The latent factor is usually obtained from the fitted model object, unless re-specified and added here to the list.
This requires the "gganimate"
package.
This requires a threshold()
to be set on the scenario object.
This requires a threshold to be set
prior to projection.
Often there is an intention to display not only the predictions made with an SDM, but also the uncertainty of the prediction. Uncertainty can be estimated either directly by the model or by calculating the variation in prediction values among a set of models.
In particular Bayesian engines can produce not only mean estimates of fitted responses, but also pixel-based estimates of uncertainty from the posterior such as the standard deviation (SD) or the coefficient of variation of a given prediction.
This function makes use of the "biscale"
R-package to create bivariate
plots of the fitted distribution object, allowing to visualize two variables
at once. It is mostly thought of as a convenience function to create such
bivariate plots for quick visualization.
Supported Inputs are either single trained Bayesian DistributionModel
with uncertainty or the output of an ensemble()
call. In both cases,
users have to make sure that "xvar"
and "yvar"
are set
accordingly.
bivplot(mod, xvar = "mean", yvar = "sd", plot = TRUE, fname = NULL,
  title = NULL, col = "BlueGold", ...)

## S4 method for signature 'ANY'
bivplot(mod, xvar = "mean", yvar = "sd", plot = TRUE, fname = NULL,
  title = NULL, col = "BlueGold", ...)
mod |
A trained |
xvar |
A |
yvar |
A |
plot |
A |
fname |
A |
title |
Allows to respecify the title through a |
col |
A |
... |
Other engine specific parameters. |
Saved bivariate plot in 'fname'
if specified, otherwise plot.
This function requires the biscale package to be installed. Although a workaround without the package could be developed, it was not deemed necessary at this point. See also this gist.
partial, plot.DistributionModel
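No example accompanies this entry; a minimal sketch, assuming mod is a trained Bayesian DistributionModel whose prediction contains "mean" and "sd" layers and that the biscale package is installed:

## Not run: 
# Sketch: bivariate plot of posterior mean against standard deviation
bivplot(mod, xvar = "mean", yvar = "sd")

# The return description above suggests specifying "fname" saves the plot
bivplot(mod, xvar = "mean", yvar = "sd", fname = "bivplot.png")

## End(Not run)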
Function to include prior information via Zellner-style spike
and slab prior for generalized linear models used in engine_breg. These
priors are similar to the horseshoe priors used in regularized engine_stan
models and penalize regressions by assuming that most predictors have an effect
of 0.
BREGPrior(variable, hyper = NULL, ip = NULL)

## S4 method for signature 'character'
BREGPrior(variable, hyper = NULL, ip = NULL)
variable |
A |
hyper |
A |
ip |
A |
The Zellner-style spike and slab prior for generalized linear models
are specified as described in the Boom R-package. Currently supported
are two options which work for models with Poisson
and binomial
(Bernoulli
) distributed errors. Two types of priors can be provided on
a variable:
"coefficient"
Allows to specify Gaussian priors on the mean coefficients of the model.
Priors on the coefficients can be provided via the "hyper"
parameter.
Note that variables with such a prior can still be regularized out from the
model.
"inclusion.probability"
A vector
giving the prior probability of inclusion for the
specified variable. This can be useful when prior information on preference
is known but not the strength of it.
If coefficients are set, then the inclusion probability is also modified by
default. However, even without knowing a particular estimate of a beta
coefficient and its direction, one can still provide an estimate of the
inclusion probability. In other words:
The hyperparameters 'hyper' and 'ip' cannot both be NULL.
Hugh Chipman, Edward I. George, Robert E. McCulloch, M. Clyde, Dean P. Foster, Robert A. Stine (2001), "The Practical Implementation of Bayesian Model Selection" Lecture Notes-Monograph Series, Vol. 38, pp. 65-134. Institute of Mathematical Statistics.
Other prior:
BARTPrior()
,
BARTPriors()
,
BREGPriors()
,
GDBPrior()
,
GDBPriors()
,
GLMNETPrior()
,
GLMNETPriors()
,
INLAPrior()
,
INLAPriors()
,
STANPrior()
,
STANPriors()
,
XGBPrior()
,
XGBPriors()
,
add_priors()
,
get_priors()
,
priors()
,
rm_priors()
## Not run: 
# Positive coefficient
p1 <- BREGPrior(variable = "forest", hyper = 2, ip = NULL)
p1

# Coefficient and direction unknown but variable def. important
p2 <- BREGPrior(variable = "forest", hyper = NULL, ip = 1)
p2

## End(Not run)
This is a helper function to specify several BREGPrior with the same hyper-parameters, but different variables.
BREGPriors(variable, hyper = NULL, ip = NULL)

## S4 method for signature 'character'
BREGPriors(variable, hyper = NULL, ip = NULL)
variable |
A |
hyper |
A |
ip |
A |
Other prior:
BARTPrior()
,
BARTPriors()
,
BREGPrior()
,
GDBPrior()
,
GDBPriors()
,
GLMNETPrior()
,
GLMNETPriors()
,
INLAPrior()
,
INLAPriors()
,
STANPrior()
,
STANPriors()
,
XGBPrior()
,
XGBPriors()
,
add_priors()
,
get_priors()
,
priors()
,
rm_priors()
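As with BREGPrior above, a sketch for several variables sharing the same settings (variable names are illustrative):

## Not run: 
# Sketch: identical inclusion probability for several variables
pp <- BREGPriors(variable = c("forest", "cropland"), hyper = NULL, ip = 1)
x <- distribution(background) |>
  add_priors(pp)

## End(Not run)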
Not always is there enough data or sufficient information to robustly infer the suitable habitat or niche of a species. As many SDM algorithms are essentially regression models, similar assumptions about model convergence, homogeneity of residuals and inference usually apply (although these are often ignored). This function simply checks the respective input object for common issues or mistakes.
check(obj, stoponwarning = FALSE)

## S4 method for signature 'ANY'
check(obj, stoponwarning = FALSE)
obj |
A |
stoponwarning |
|
Different checks are implemented depending on the supplied object
Checks if there are fewer than 200 observations
Check model convergence
Check if model is found
Check if coefficients exist
Check if there are unusual outliers in the prediction (using 10 times the median absolute deviation)
Check if threshold is larger than layer
Message outputs
This function will likely be expanded with additional checks in the future. If you have ideas, please share them via an issue.
## Not run: 
# Where mod is an estimated DistributionModel
check(mod)

## End(Not run)
Similar to summary
, this helper function obtains the
coefficients from a given DistributionModel object.
## S3 method for class 'DistributionModel'
coef(object, ...)
object |
Any prepared object. |
... |
not used. |
For models trained with machine-learning approaches (e.g. engine_bart
etc) this function will return variable importance estimates rather than
linear coefficients. The same applies to trained non-linear models.
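A short sketch, assuming fit is a trained DistributionModel:

## Not run: 
# Sketch: obtain coefficients (or variable importance for machine-learning engines)
coef(fit)

# Equivalent summary-style access through the object itself
fit$summary()

## End(Not run)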
This small helper function allows combining multiple formula()
objects
into one. In the case of duplicate variable entries, only the unique ones
are used.
combine_formulas(..., combine = "both", env = parent.frame())
... |
Any number |
combine |
|
env |
A new environment of the formula |
Use "y ~ 0" to specify a stand-alone LHS.
A formula as cbind(lhs_1, lhs_2, ...) ~ rhs_1 + rhs_2 + ...
or
lhs ~ rhs_1 + rhs_2
in case of identical LHS (see examples).
This likely won't work for interaction terms (such as *
or :
).
# Combine everything (default)
combine_formulas(observed ~ rainfall + temp, observed ~ rainfall + forest.cover)

# Combine only LHS
combine_formulas(observed ~ rainfall + temp, observed ~ rainfall + forest.cover,
  combine = "lhs")
This function creates an object that contains all the data,
parameters and settings for building an (integrated) species distribution
model. Key functions to add data are add_biodiversity_poipo
and the like,
add_predictors
, add_latent_spatial
, engine_glmnet
or similar,
add_priors
and add_offset
. It creates a prototype
BiodiversityDistribution
object with its own functions. After setting
input data and parameters, a model can then be fitted and predictions created via the
train function.
Additionally, it is possible to specify a "limit"
to any predictions
conducted on the background. This can be for instance a buffered layer by a
certain dispersal distance (Cooper and Soberon, 2018) or a categorical layer
representing biomes or soil conditions. Another option is to create a
constraint by constructing a minimum convex polygon (MCP) using the supplied
biodiversity data. This option can be enabled by setting
"limits_method"
to "mcp"
. It is also possible to provide a
small buffer to constructed MCP that way. See the frequently asked question
(FAQ) section on the homepage for more information.
See Details for a description of the internal functions available to modify or summarize data within the created object.
Note that any model requires at minimum a single added biodiversity dataset as well as a specified engine.
distribution(background, limits = NULL, limits_method = "none",
  mcp_buffer = 0, limits_clip = FALSE)

## S4 method for signature 'SpatRaster'
distribution(background, limits = NULL, limits_method = "none",
  mcp_buffer = 0, limits_clip = FALSE)

## S4 method for signature 'sf'
distribution(background, limits = NULL, limits_method = "none",
  mcp_buffer = 0, limits_clip = FALSE)
background |
Specification of the modelling background. Must be a
|
limits |
A |
limits_method |
A |
mcp_buffer |
A |
limits_clip |
|
This function creates a BiodiversityDistribution
object
that in itself contains other functions and stores parameters and
(pre-)processed data. A full list of functions available can be queried via
"names(object)"
. Some of the functions are not intended to be
manipulated directly, but rather through convenience functions (e.g.
"object$set_predictors()"
). Similarly other objects are stored in the
BiodiversityDistribution
object that have their own functions as
well and can be queried (e.g. "names(object)"
). For a list of
functions see the reference documentation. By default, if some datasets are
not set, then a "Waiver"
object is returned instead.
The following objects can be stored:
object$biodiversity
A BiodiversityDatasetCollection
object with
the added biodiversity data.
object$engine
An "engine"
object (e.g. engine_inlabru()
)
with functions dependent on the added engine.
object$predictors
A PredictorDataset
object with all set predictors.
object$priors
A PriorList
object with all specified priors.
object$log
A Log
object that captures the processing log.
Useful high-level functions to address those objects are for instance:
object$show()
A generic summary of the BiodiversityDistribution
object contents. Can also be called via print.
object$get_biodiversity_equations()
Lists the equations used for each
biodiversity dataset with given id. Defaults to all predictors.
object$get_biodiversity_types()
Lists the type of each specified
biodiversity dataset with given id.
object$get_extent()
Outputs the terra::ext of the modelling region.
object$show_background_info()
Returns a list
with the terra::ext
and the terra::crs.
object$get_extent_dimensions()
Outputs the terra::ext dimension by
calling the "extent_dimensions()"
function.
object$get_predictor_names()
Returns a character vector with the
names of all added predictors.
object$get_prior_variables()
Returns a description of priors
added.
There are other functions as well but those are better accessed through their respective wrapper functions.
BiodiversityDistribution
object containing data for building
a biodiversity distribution modelling problem.
Fletcher, R.J., Hefley, T.J., Robertson, E.P., Zuckerberg, B., McCleery, R.A., Dorazio, R.M., (2019) A practical guide for combining data to model species distributions. Ecology 100, e02710. https://doi.org/10.1002/ecy.2710
Cooper, Jacob C., and Jorge Soberón. "Creating individual accessible area hypotheses improves stacked species distribution model performance." Global Ecology and Biogeography 27, no. 1 (2018): 156-165.
BiodiversityDistribution
and other classes.
# Load background raster
background <- terra::rast(system.file("extdata/europegrid_50km.tif",
  package = "ibis.iSDM"))

# Define model
x <- distribution(background)
x
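To illustrate the "limits_method" option described above, a hedged sketch constraining predictions to a minimum convex polygon around the supplied data (the buffer is in map units and purely illustrative, and virtual_points is an assumed sf dataset):

## Not run: 
# Sketch: constrain predictions via a minimum convex polygon (MCP)
x <- distribution(background, limits_method = "mcp", mcp_buffer = 10000) |>
  add_biodiversity_poipo(virtual_points, field_occurrence = "Observed")

## End(Not run)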
All trained Models inherit the options here plus any additional ones defined by the engine and inference.
id
A character id for any trained model
name
A description of the model as character
.
model
A list
containing all input datasets and parameters to the model.
settings
A Settings
object with information on inference.
fits
A list
containing the prediction and fitted model.
.internals
A list
containing previous fitted models.
new()
Initializes the object and creates an empty list
DistributionModel$new(name)
name
A description of the model as character
.
NULL
get_name()
Return the name of the model
DistributionModel$get_name()
A character
with the model name used.
print()
Print the names and summarizes the model within
DistributionModel$print()
A message on screen
show()
Show the name of the Model.
DistributionModel$show()
A character
of the run name.
plot()
Plots the prediction if found.
DistributionModel$plot(what = "mean")
what
character
with the specific layer to be plotted.
A graphical representation of the prediction
plot_threshold()
Plots the thresholded prediction if found.
DistributionModel$plot_threshold(what = 1)
A graphical representation of the thresholded prediction if found.
show_duration()
Show model run time if settings exist
DistributionModel$show_duration()
A numeric
estimate of the duration it took to fit the models.
summary()
Get effects or importance tables from model
DistributionModel$summary(obj = "fit_best")
obj
A character
of which object to return.
A data.frame
summarizing the model, usually its coefficient.
effects()
Generic plotting function for effect plots
DistributionModel$effects(x = "fit_best", what = "fixed", ...)
A graphical representation of the coefficients.
get_equation()
Get equation
DistributionModel$get_equation()
A formula
of the inferred model.
get_data()
Get specific fit from this Model
DistributionModel$get_data(x = "prediction")
x
A character
stating what should be returned.
A SpatRaster
object with the prediction.
get_model()
Small internal helper function to directly get the model object
DistributionModel$get_model()
A fitted model if existing.
set_data()
Set new fit for this Model.
DistributionModel$set_data(x, value)
x
The name of the new fit.
value
The SpatRaster
layer (or model) to be inserted.
This object.
get_thresholdvalue()
Get the threshold value if calculated
DistributionModel$get_thresholdvalue()
A numeric
threshold value.
get_thresholdtype()
Get threshold type and format if calculated.
DistributionModel$get_thresholdtype()
A vector with a character
method and numeric
threshold value.
show_rasters()
List all rasters in object
DistributionModel$show_rasters()
A vector
with logical
flags for the various objects.
get_projection()
Get projection of the background.
DistributionModel$get_projection()
A geographic projection
get_resolution()
Get the resolution of the projection
DistributionModel$get_resolution()
numeric
estimates of the resolution.
rm_threshold()
Remove calculated thresholds
DistributionModel$rm_threshold()
Invisible
calc_suitabilityindex()
Calculate a suitability index for a given projection
DistributionModel$calc_suitabilityindex(method = "normalize")
method
The method used for normalization.
Values can either be normalized by the minimum and maximum, or by the relative total using the sum of values.
Returns a SpatRaster
.
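As a minimal sketch, the two normalization options correspond to the following manual calculations (assuming pred is a hypothetical single-layer SpatRaster prediction):

# Min-max normalization to the range [0, 1]
rmin <- terra::global(pred, "min", na.rm = TRUE)[, 1]
rmax <- terra::global(pred, "max", na.rm = TRUE)[, 1]
norm_minmax <- (pred - rmin) / (rmax - rmin)
# Relative total: each cell as a share of the summed values
norm_total <- pred / terra::global(pred, "sum", na.rm = TRUE)[, 1]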
get_centroid()
Get centroids of prediction layers
DistributionModel$get_centroid(patch = FALSE, layer = "mean")
Returns a sf
object.
has_limits()
Logical indication if the prediction was limited.
DistributionModel$has_limits()
A logical
flag.
has_latent()
Logical indication if the prediction has added latent factors.
DistributionModel$has_latent()
A logical
flag.
has_offset()
Has an offset been used?
DistributionModel$has_offset()
A logical
flag.
mask()
Convenience function to mask all input datasets.
DistributionModel$mask(mask, inverse = FALSE, ...)
mask
A SpatRaster
or sf
object.
inverse
A logical
flag if the inverse should be masked instead.
...
Any other parameters passed on to mask
Invisible
save()
Save the prediction as output.
DistributionModel$save(fname, type = "gtif", dt = "FLT4S")
Saved spatial prediction on drive.
clone()
The objects of this class are cloneable with this method.
DistributionModel$clone(deep = FALSE)
deep
Whether to make a deep clone.
Could be further prettified and commands outsourced.
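As a quick orientation, here is a hedged sketch of common interactions with a fitted model object, assuming fit is a previously trained DistributionModel:

fit$summary()                       # Coefficient or importance table
fit$plot(what = "mean")             # Plot the mean prediction
pred <- fit$get_data("prediction")  # SpatRaster with the prediction
fit$show_duration()                 # Time it took to fit the model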
This function is a handy wrapper that calls the default plotting
functions for the model of a specific engine. It is equivalent to calling effects
on a fitted distribution model.
## S3 method for class 'DistributionModel' effects(object, ...)
object |
Any fitted distribution object. |
... |
Not used. |
None.
For some models, where default coefficient plots are not available, this function will attempt to generate partial dependency plots instead.
## Not run: # Where mod is an estimated distribution model mod$effects() ## End(Not run)
SpatRaster
based on a template. This function creates an empty copy of a provided
SpatRaster
object. It is primarily used in the package to create the
outputs for the predictions.
emptyraster(x, ...)
x |
A |
... |
other arguments that can be passed to |
an empty SpatRaster
, i.e. all cells are NA
.
require(terra) r <- rast(matrix(1:100, 5, 20)) emptyraster(r)
The Bayesian regression approach to a sum of complementary trees
is to shrink the fit of each tree through a regularization prior. BART
models provide non-linear, highly flexible estimation and have been shown to
compare favourably among machine learning algorithms (Dorie et al. 2019).
The default prior preference is for trees to be small (few terminal nodes) and
for shrinkage towards 0
.
This package requires the "dbarts"
R-package to be installed. Many
of the functionalities of this engine have been inspired by the
"embarcadero"
R-package. Users are therefore advised to cite it if they
make heavy use of BART.
engine_bart(x, iter = 1000, nburn = 250, chains = 4, type = "response", ...)
x |
|
iter |
A |
nburn |
A |
chains |
The number of chains to be used (Default: |
type |
The type used for creating posterior predictions. Either |
... |
Other options. |
Prior distributions can furthermore be set for:
probability that a tree stops at a node of a given depth (Not yet implemented)
probability that a given variable is chosen for a splitting rule
probability of splitting that variable at a particular value (Not yet implemented)
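Of these, the probability that a given variable is chosen for a splitting rule can be set via BARTPrior. A minimal sketch, assuming a covariate named "forest" and that the inclusion probability is passed as hyper:

p <- BARTPrior(variable = "forest", hyper = 0.75)
x <- distribution(background) |>
  engine_bart(iter = 1000) |>
  add_priors(priors(p))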
An Engine.
Carlson, CJ. embarcadero: Species distribution modelling with Bayesian additive regression trees in r. Methods Ecol Evol. 2020; 11: 850– 858. https://doi.org/10.1111/2041-210X.13389
Dorie, V., Hill, J., Shalit, U., Scott, M., & Cervone, D. (2019). Automated versus do-it-yourself methods for causal inference: Lessons learned from a data analysis competition. Statistical Science, 34(1), 43-68.
Vincent Dorie (2020). dbarts: Discrete Bayesian Additive Regression Trees Sampler. R package version 0.9-19. https://CRAN.R-project.org/package=dbarts
Other engine:
engine_breg()
,
engine_gdb()
,
engine_glm()
,
engine_glmnet()
,
engine_inla()
,
engine_inlabru()
,
engine_scampr()
,
engine_stan()
,
engine_xgboost()
## Not run: # Add BART as an engine x <- distribution(background) |> engine_bart(iter = 100) ## End(Not run)
Efficient MCMC algorithm for linear regression models that makes use of 'spike-and-slab' priors for some modest regularization on the amount of posterior probability for a subset of the coefficients.
engine_breg( x, iter = 10000, nthread = getOption("ibis.nthread"), type = "response", ... )
x |
|
iter |
|
nthread |
|
type |
The mode used for creating posterior predictions. Either making
|
... |
Other non-specified parameters passed on to the model. |
This engine provides efficient Bayesian predictions through the
Boom R-package. Note, however, that not all link functions and model families are
supported and certain functionalities such as offsets are generally not
available. This engine allows the estimation of linear and non-linear
effects via the "only_linear"
option specified in train.
An Engine.
Nguyen, K., Le, T., Nguyen, V., Nguyen, T., & Phung, D. (2016, November). Multiple kernel learning with data augmentation. In Asian Conference on Machine Learning (pp. 49-64). PMLR.
Steven L. Scott (2021). BoomSpikeSlab: MCMC for Spike and Slab Regression. R package version 1.2.4. https://CRAN.R-project.org/package=BoomSpikeSlab
Other engine:
engine_bart()
,
engine_gdb()
,
engine_glm()
,
engine_glmnet()
,
engine_inla()
,
engine_inlabru()
,
engine_scampr()
,
engine_stan()
,
engine_xgboost()
## Not run: # Add BREG as an engine x <- distribution(background) |> engine_breg(iter = 1000) ## End(Not run)
Gradient descent boosting is an efficient way to optimize any
loss function of a generalized linear or additive model (such as the GAMs
available through the "mgcv"
R-package). It furthermore automatically
regularizes the fit, thus the resulting model only contains the covariates
whose baselearners have some influence on the response. Depending on the type
of the add_biodiversity
data, either Poisson process models or
logistic regressions are estimated. If the "only_linear"
term in
train is set to FALSE
, splines are added to the estimation, thus
providing a non-linear additive inference.
engine_gdb( x, iter = 2000, learning_rate = 0.1, empirical_risk = "inbag", type = "response", ... )
x |
|
iter |
An |
learning_rate |
A bounded |
empirical_risk |
method for empirical risk calculation. Available options
are |
type |
The mode used for creating posterior predictions. Either making
|
... |
Other variables or control parameters |
This package requires the "mboost"
R-package to be
installed. It is in philosophy somewhat related to the engine_xgboost and
"XGBoost"
R-package, but provides some additional desirable
features that make estimation quicker and particularly useful for spatial
projections, such as the ability to specifically add spatial
baselearners via add_latent_spatial or to specify monotonic
constraints via GDBPrior.
An Engine.
The coefficients resulting from gdb with poipa data (Binomial) are only 0.5 of the typical coefficients of a logit model obtained via glm (see Binomial).
Hofner, B., Mayr, A., Robinzonov, N., & Schmid, M. (2014). Model-based boosting in R: a hands-on tutorial using the R package mboost. Computational statistics, 29(1-2), 3-35.
Hofner, B., Müller, J., Hothorn, T., (2011). Monotonicity-constrained species distribution models. Ecology 92, 1895–901.
Mayr, A., Hofner, B. and Schmid, M. (2012). The importance of knowing when to stop - a sequential stopping rule for component-wise gradient boosting. Methods of Information in Medicine, 51, 178–186.
Other engine:
engine_bart()
,
engine_breg()
,
engine_glm()
,
engine_glmnet()
,
engine_inla()
,
engine_inlabru()
,
engine_scampr()
,
engine_stan()
,
engine_xgboost()
## Not run: # Add GDB as an engine x <- distribution(background) |> engine_gdb(iter = 1000) ## End(Not run)
This engine implements a basic generalized linear model (GLM) for creating species distribution models. The main purpose of this engine is to support a basic, dependency-free method for inference and projection that can be used within the package for examples and vignettes. That being said, the engine is fully functional like any other engine.
The basic implementation of GLMs here is part of a general class of linear models
and has - with the exception of offsets - only minimal options to integrate other
sources of information such as priors or joint integration. The general
recommendation is to use engine_glmnet()
instead for regularization support.
However, basic GLMs can in some cases be useful for quick projections or
for ensemble()
of small models (a practice common for rare species).
engine_glm(x, control = NULL, type = "response", ...)
x |
|
control |
A |
type |
The mode used for creating posterior predictions. Either making
|
... |
Other parameters passed on to |
This engine is essentially a wrapper for stats::glm.fit()
, albeit with customized
settings to support offsets and weights.
If "optim_hyperparam"
is set to TRUE
in train()
, then an AIC
based step-wise (backwards) model selection is performed.
Generally, however, engine_glmnet
should be the preferred engine for models
with more than 3
covariates.
An Engine.
Hastie, T. J. and Pregibon, D. (1992) Generalized linear models. Chapter 6 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
Other engine:
engine_bart()
,
engine_breg()
,
engine_gdb()
,
engine_glmnet()
,
engine_inla()
,
engine_inlabru()
,
engine_scampr()
,
engine_stan()
,
engine_xgboost()
# Load background background <- terra::rast(system.file('extdata/europegrid_50km.tif', package='ibis.iSDM',mustWork = TRUE)) # Add GLM as an engine x <- distribution(background) |> engine_glm() print(x)
This engine allows the estimation of linear coefficients using
either ridge, lasso or elastic-net regression techniques. The backbone of this
engine is the glmnet R-package, which is commonly used in SDMs,
including the popular 'maxnet'
(e.g. Maxent) package. Ultimately this
engine is an equivalent of engine_breg, but in a "frequentist" setting. If
users aim to emulate a model that most closely resembles Maxent within the
ibis.iSDM modelling framework, then this engine is the best way of doing so.
Compared to the 'maxnet'
R-package, a number of efficiency settings
are implemented in particular for cross-validation of alpha and lambda
values.
A limited amount of prior information can be specified for this engine,
specifically via offsets or via GLMNETPrior
, which allows specifying priors
as regularization constants.
engine_glmnet( x, alpha = 0, nlambda = 100, lambda = NULL, type = "response", ... )
x |
|
alpha |
A |
nlambda |
A |
lambda |
A |
type |
The mode used for creating posterior predictions. Either making
|
... |
Other parameters passed on to glmnet. |
Regularized regressions are effectively GLMs that are fitted with
ridge, lasso or elastic-net regularization. Which of these is chosen depends
critically on the alpha value:

* For alpha equal to 0, a ridge regularization is used. Ridge regularization does not remove variables entirely, but instead shrinks their coefficients towards 0.
* For alpha equal to 1, a lasso regularization is used. Lassos tend to remove fully from the final model those coefficients that do not improve the loss function.
* For alpha values between 0 and 1, an elastic-net regularization is used, which is essentially a combination of the two.

The optimal lambda parameter can be
determined via cross-validation. For this option set "varsel"
in
train()
to "reg"
.
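A brief sketch of how the alpha parameter maps onto these regularization types (values shown are illustrative, assuming a background layer as in the other examples):

x_ridge <- distribution(background) |> engine_glmnet(alpha = 0)    # Ridge
x_lasso <- distribution(background) |> engine_glmnet(alpha = 1)    # Lasso
x_enet  <- distribution(background) |> engine_glmnet(alpha = 0.5)  # Elastic net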
An Engine.
Jerome Friedman, Trevor Hastie, Robert Tibshirani (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1-22. URL https://www.jstatsoft.org/v33/i01/.
Renner, I.W., Elith, J., Baddeley, A., Fithian, W., Hastie, T., Phillips, S.J., Popovic, G. and Warton, D.I., 2015. Point process models for presence‐only analysis. Methods in Ecology and Evolution, 6(4), pp.366-379.
Fithian, W. & Hastie, T. (2013) Finite-sample equivalence in statistical models for presence-only data. The Annals of Applied Statistics 7, 1917–1939
Other engine:
engine_bart()
,
engine_breg()
,
engine_gdb()
,
engine_glm()
,
engine_inla()
,
engine_inlabru()
,
engine_scampr()
,
engine_stan()
,
engine_xgboost()
## Not run: # Add GLMNET as an engine x <- distribution(background) |> engine_glmnet(iter = 1000) ## End(Not run)
Allows a full Bayesian analysis of linear and additive models
using Integrated Nested Laplace Approximation. This engine has been largely
superseded by the engine_inlabru
engine and users are advised to use that
one, unless specific options are required.
engine_inla( x, optional_mesh = NULL, optional_projstk = NULL, max.edge = NULL, offset = NULL, cutoff = NULL, proj_stepsize = NULL, timeout = NULL, strategy = "auto", int.strategy = "eb", barrier = FALSE, type = "response", area = "gpc2", nonconvex.bdry = FALSE, nonconvex.convex = -0.15, nonconvex.concave = -0.05, nonconvex.res = 40, ... )
x |
|
optional_mesh |
A directly supplied |
optional_projstk |
A directly supplied projection stack. Useful if
projection stack is identical for multiple species (Default: |
max.edge |
The largest allowed triangle edge length, must be in the same
scale units as the coordinates. Default is an educated guess (Default:
|
offset |
Interpreted as a numeric factor relative to the approximate
data diameter. Default is an educated guess (Default: |
cutoff |
The minimum allowed distance between points on the mesh.
Default is an educated guess (Default: |
proj_stepsize |
The stepsize in coordinate units between cells of the
projection grid (Default: |
timeout |
Specify a timeout for INLA models in seconds; afterwards the model is passed over. |
strategy |
Which approximation to use for the joint posterior. Options
are |
int.strategy |
Integration strategy. Options are
|
barrier |
Should a barrier model be added to the model? |
type |
The mode used for creating posterior predictions. Either
summarizing the linear |
area |
Accepts a |
nonconvex.bdry |
Create a non-convex boundary hulls instead (Default:
|
nonconvex.convex |
Non-convex minimal extension radius for convex curvature (Not yet implemented). |
nonconvex.concave |
Non-convex minimal extension radius for concave curvature (Not yet implemented). |
nonconvex.res |
Computation resolution for nonconvex hulls (Not yet implemented). |
... |
Other options. |
All INLA
engines require the specification of a mesh that
needs to be provided to the "optional_mesh"
parameter. Otherwise the
mesh will be created based on best guesses of the data spread. A good mesh
needs to have triangles as regular as possible in size and shape:
equilateral.
* "max.edge"
: The largest allowed triangle edge length, must be in
the same scale units as the coordinates Lower bounds affect the density of
triangles * "offset"
: The automatic extension distance of the mesh
If positive: same scale units. If negative, interpreted as a factor relative
to the approximate data diameter i.e., a value of -0.10 will add a 10% of the
data diameter as outer extension. * "cutoff"
: The minimum allowed
distance between points, it means that points at a closer distance than the
supplied value are replaced by a single vertex. it is critical when there are
some points very close to each other, either for point locations or in the
domain boundary. * "proj_stepsize"
: The stepsize for spatial
predictions, which affects the spatial grain of any outputs created.
Priors can be set via INLAPrior.
An Engine.
How INLA Meshes are generated, substantially influences prediction outcomes. See Dambly et al. (2023).
Havard Rue, Sara Martino, and Nicholas Chopin (2009), Approximate Bayesian Inference for Latent Gaussian Models Using Integrated Nested Laplace Approximations (with discussion), Journal of the Royal Statistical Society B, 71, 319-392.
Finn Lindgren, Havard Rue, and Johan Lindstrom (2011). An Explicit Link Between Gaussian Fields and Gaussian Markov Random Fields: The Stochastic Partial Differential Equation Approach (with discussion), Journal of the Royal Statistical Society B, 73(4), 423-498.
Simpson, Daniel, Janine B. Illian, S. H. Sørbye, and Håvard Rue. 2016. “Going Off Grid: Computationally Efficient Inference for Log-Gaussian Cox Processes.” Biometrika 1 (103): 49–70.
Dambly, L. I., Isaac, N. J., Jones, K. E., Boughey, K. L., & O'Hara, R. B. (2023). Integrated species distribution models fitted in INLA are sensitive to mesh parameterisation. Ecography, e06391.
Other engine:
engine_bart()
,
engine_breg()
,
engine_gdb()
,
engine_glm()
,
engine_glmnet()
,
engine_inlabru()
,
engine_scampr()
,
engine_stan()
,
engine_xgboost()
## Not run: # Add INLA as an engine (with a custom mesh) x <- distribution(background) |> engine_inla(mesh = my_mesh) ## End(Not run)
Model components are specified with general inputs and mapping
methods to the latent variables, and the predictors are specified via
general R expressions, with separate expressions for each observation
likelihood model in multi-likelihood models. The inlabru engine - similar
to the engine_inla
function - acts as a wrapper for INLA, albeit
"inlabru"
has a number of convenience functions implemented that
make in particular predictions with new data much more straightforward
(e.g. via posterior simulation instead of refitting). Since more recent
versions "inlabru"
also supports the addition of multiple
likelihoods, therefore allowing fully integrated inference.
engine_inlabru( x, optional_mesh = NULL, max.edge = NULL, offset = NULL, cutoff = NULL, proj_stepsize = NULL, strategy = "auto", int.strategy = "eb", area = "gpc2", timeout = NULL, type = "response", ... )
x |
|
optional_mesh |
A directly supplied |
max.edge |
The largest allowed triangle edge length, must be in the same
scale units as the coordinates. Default is an educated guess (Default: |
offset |
Interpreted as a numeric factor relative to the approximate data
diameter. Default is an educated guess (Default: |
cutoff |
The minimum allowed distance between points on the mesh. Default
is an educated guess (Default: |
proj_stepsize |
The stepsize in coordinate units between cells of the
projection grid (Default: |
strategy |
Which approximation to use for the joint posterior. Options
are |
int.strategy |
Integration strategy. Options are |
area |
Accepts a |
timeout |
Specify a timeout for INLA models in seconds; afterwards the model is passed over. |
type |
The mode used for creating posterior predictions. Either summarizing
the linear |
... |
Other variables |
All INLA
engines require the specification of a mesh that
needs to be provided to the "optional_mesh"
parameter. Otherwise the
mesh will be created based on best guesses of the data spread. A good mesh
needs to have triangles as regular as possible in size and shape:
equilateral.
* "max.edge"
: The largest allowed triangle edge length, must be in
the same scale units as the coordinates Lower bounds affect the density of
triangles * "offset"
: The automatic extension distance of the mesh
If positive: same scale units. If negative, interpreted as a factor relative
to the approximate data diameter i.e., a value of -0.10 will add a 10% of the
data diameter as outer extension. * "cutoff"
: The minimum allowed
distance between points, it means that points at a closer distance than the
supplied value are replaced by a single vertex. it is critical when there are
some points very close to each other, either for point locations or in the
domain boundary. * "proj_stepsize"
: The stepsize for spatial
predictions, which affects the spatial grain of any outputs created.
Priors can be set via INLAPrior.
An Engine.
How INLA Meshes are generated, substantially influences prediction outcomes. See Dambly et al. (2023).
https://inlabru-org.github.io/inlabru/articles/
Bachl, F. E., Lindgren, F., Borchers, D. L., & Illian, J. B. (2019). inlabru: an R package for Bayesian spatial modelling from ecological survey data. Methods in Ecology and Evolution, 10(6), 760-766.
Simpson, Daniel, Janine B. Illian, S. H. Sørbye, and Håvard Rue. 2016. “Going Off Grid: Computationally Efficient Inference for Log-Gaussian Cox Processes.” Biometrika 1 (103): 49–70.
Dambly, L. I., Isaac, N. J., Jones, K. E., Boughey, K. L., & O'Hara, R. B. (2023). Integrated species distribution models fitted in INLA are sensitive to mesh parameterisation. Ecography, e06391.
Other engine:
engine_bart()
,
engine_breg()
,
engine_gdb()
,
engine_glm()
,
engine_glmnet()
,
engine_inla()
,
engine_scampr()
,
engine_stan()
,
engine_xgboost()
## Not run: # Add inlabru as an engine x <- distribution(background) |> engine_inlabru() ## End(Not run)
Similar to others, this engine enables the fitting and prediction of
log-Gaussian Cox processes (LGCP) and inhomogeneous Poisson processes (IPP).
It uses the scampr
package, which relies on maximum likelihood estimation
fitted via TMB
(Template Model Builder).
It also supports the addition of spatial latent effects, which can be added via Gaussian fields, approximated by 'FRK' (Fixed Rank Kriging), and integrated out using either variational or Laplace approximation.
The main use case for this engine is as an alternative to engine_inlabru()
and
engine_inla()
for fitting iSDMs, e.g. those combining both presence-only
and presence-absence point occurrence data.
engine_scampr(x, type = "response", dens = "posterior", maxit = 500, ...)
x |
|
type |
The mode used for creating (posterior or prior) predictions. Either setting
|
dens |
A |
maxit |
A |
... |
Other parameters passed on. |
This engine may only be used to predict for at most one or two datasets. It supports only presence-only PPMs, presence-absence binary GLMs, or an 'IDM' (integrated data model).
An Engine.
The package can currently only be installed directly from GitHub ("ElliotDovers/scampr").
Presence-absence models in SCAMPR currently only support cloglog link functions!
Dovers, E., Popovic, G. C., & Warton, D. I. (2024). A fast method for fitting integrated species distribution models. Methods in Ecology and Evolution, 15(1), 191-203.
Dovers, E., Stoklosa, D., and Warton D. I. (2024). Fitting log-Gaussian Cox processes using generalized additive model software. The American Statistician, 1-17.
Other engine:
engine_bart()
,
engine_breg()
,
engine_gdb()
,
engine_glm()
,
engine_glmnet()
,
engine_inla()
,
engine_inlabru()
,
engine_stan()
,
engine_xgboost()
## Not run: # Load background background <- terra::rast(system.file('extdata/europegrid_50km.tif', package='ibis.iSDM',mustWork = TRUE)) # Add GLM as an engine x <- distribution(background) |> engine_scampr() ## End(Not run)
Stan is a probabilistic programming language that can be used to
specify most types of statistical linear and non-linear regression models.
Stan provides full Bayesian inference for continuous-variable models
through Markov chain Monte Carlo methods such as the No-U-Turn sampler, an
adaptive form of Hamiltonian Monte Carlo sampling. Stan code has to be
written separately and this function acts as a compiler to build the
Stan model.
Requires the "cmdstanr"
package to be installed!
engine_stan( x, chains = 4, iter = 2000, warmup = floor(iter/2), init = "random", cores = getOption("ibis.nthread"), algorithm = "sampling", control = list(adapt_delta = 0.95), type = "response", ... )
x |
|
chains |
A positive |
iter |
A positive |
warmup |
A positive |
init |
Initial values for parameters (Default: |
cores |
If set to NULL, takes values from the specified ibis option. |
algorithm |
Mode used to sample from the posterior. Available options are
|
control |
See |
type |
The mode used for creating posterior predictions. Either summarizing
the linear |
... |
Other variables |
By default the posterior is obtained through sampling; however, Stan also supports approximate forms of inference through penalized maximum likelihood estimation (see Carpenter et al. 2017).
An Engine.
The function obj$stancode()
can be used to print out the
stancode of the model.
Jonah Gabry and Rok Češnovar (2021). cmdstanr: R Interface to 'CmdStan'. https://mc-stan.org/cmdstanr, https://discourse.mc-stan.org.
Carpenter, B., Gelman, A., Hoffman, M. D., Lee, D., Goodrich, B., Betancourt, M., ... & Riddell, A. (2017). Stan: A probabilistic programming language. Journal of statistical software, 76(1), 1-32.
Piironen, J., & Vehtari, A. (2017). Sparsity information and regularization in the horseshoe and other shrinkage priors. Electronic Journal of Statistics, 11(2), 5018-5051.
rstan, cmdstanr
Other engine:
engine_bart()
,
engine_breg()
,
engine_gdb()
,
engine_glm()
,
engine_glmnet()
,
engine_inla()
,
engine_inlabru()
,
engine_scampr()
,
engine_xgboost()
## Not run: # Add Stan as an engine x <- distribution(background) |> engine_stan(iter = 1000) ## End(Not run)
Allows the estimation of eXtreme gradient boosting for tree-based or linear boosting regressions. The XGBoost engine is a flexible, yet powerful engine with many customization options, supporting multiple options to perform single and multi-class regression and classification tasks. For a full list of options, users are advised to have a look at the xgboost::xgb.train help file and https://xgboost.readthedocs.io.
engine_xgboost( x, booster = "gbtree", iter = 8000L, learning_rate = 0.001, gamma = 6, reg_lambda = 0, reg_alpha = 0, max_depth = 2, subsample = 0.75, colsample_bytree = 0.4, min_child_weight = 3, nthread = getOption("ibis.nthread"), ... )
x |
|
booster |
A |
iter |
|
learning_rate |
|
gamma |
|
reg_lambda |
|
reg_alpha |
|
max_depth |
|
subsample |
|
colsample_bytree |
|
min_child_weight |
|
nthread |
|
... |
Other non-specified parameters. |
The default parameters have been set relatively conservatively so as to reduce overfitting.
XGBoost supports the specification of monotonic constraints on certain
variables. Within ibis this is possible via XGBPrior
. However, constraints
are available only for the "gbtree"
baselearners.
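A minimal sketch of such a constraint, assuming a covariate named "forest" and that XGBPrior takes the same directionality hyper-parameter as GDBPrior (documented elsewhere in this manual):

x <- distribution(background) |>
  engine_xgboost(booster = "gbtree", iter = 4000) |>
  add_priors(priors(XGBPrior(variable = "forest", hyper = "increasing")))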
An Engine.
'Machine learning is statistics minus any checking of models and assumptions' ~ Brian D. Ripley, useR! 2004, Vienna
Tianqi Chen and Carlos Guestrin, "XGBoost: A Scalable Tree Boosting System", 22nd SIGKDD Conference on Knowledge Discovery and Data Mining, 2016, https://arxiv.org/abs/1603.02754
Other engine:
engine_bart()
,
engine_breg()
,
engine_gdb()
,
engine_glm()
,
engine_glmnet()
,
engine_inla()
,
engine_inlabru()
,
engine_scampr()
,
engine_stan()
## Not run: # Add xgboost as an engine x <- distribution(background) |> engine_xgboost(iter = 4000) ## End(Not run)
Basic object for engine, all other engines inherit from here.
engine
The class name of the engine.
name
The name of the engine
data
Any data or parameters necessary to make this engine work.
new()
Initializes the object and creates an empty list
Engine$new(engine, name)
engine
The class name of the engine.
name
The name of the engine
NULL
print()
Print the Engine name
Engine$print()
A message on screen
show()
Alias that calls print.
Engine$show()
A message on screen
get_class()
Get class description
Engine$get_class()
A character
with the class as saved in engine
get_data()
Get specific data from this engine
Engine$get_data(x)
x
A character stating which data entry to return.
A list
with the data.
list_data()
List all data
Engine$list_data()
A character
vector of the data entries.
set_data()
Set data for this engine
Engine$set_data(x, value)
Invisible
get_self()
Dummy function to get self object
Engine$get_self()
This object
clone()
The objects of this class are cloneable with this method.
Engine$clone(deep = FALSE)
deep
Whether to make a deep clone.
Ensembles calculated from multiple models have often been shown to outcompete any single model in comparative assessments (Valavi et al. 2022).
This function creates an ensemble of multiple provided distribution models
fitted with the ibis.iSDM-package
. Each model has to have estimated
predictions with a given method and optional uncertainty in form of the
standard deviation or similar. Through the layer
parameter it can be
specified which part of the prediction should be averaged in an ensemble.
This can be for instance the mean prediction and/or the standard deviation
sd. See Details below for an overview of the different methods.
The function also returns a coefficient of variation (cv) as output of the ensemble. Note, however, that this should not be interpreted as a measure of model uncertainty, as it cannot capture the parameter uncertainty of individual models; rather, it reflects variation among predictions, which can be due to many factors including simply differences in model complexity.
ensemble( ..., method = "mean", weights = NULL, min.value = NULL, layer = "mean", normalize = FALSE, uncertainty = "cv", point = NULL, field_occurrence = "observed", apply_threshold = TRUE ) ## S4 method for signature 'ANY' ensemble( ..., method = "mean", weights = NULL, min.value = NULL, layer = "mean", normalize = FALSE, uncertainty = "cv", point = NULL, field_occurrence = "observed", apply_threshold = TRUE )
... |
Provided |
method |
Approach on how the ensemble is to be created. See details for
available options (Default: |
weights |
(Optional) weights provided to the ensemble function if
weighted means are to be constructed (Default: |
min.value |
A optional |
layer |
A |
normalize |
|
uncertainty |
A |
point |
A |
field_occurrence |
A |
apply_threshold |
A |
Possible options for creating an ensemble includes:
'mean'
- Calculates the mean of several predictions.
'median'
- Calculates the median of several predictions.
'max'
- The maximum value across predictions.
'min'
- The minimum value across predictions.
'mode'
- The mode/modal values as the most commonly occurring value.
'weighted.mean'
- Calculates a weighted mean. Weights have to be supplied separately (e.g. TSS).
'min.sd'
- Ensemble created by minimizing the uncertainty among predictions.
'threshold.frequency'
- Returns an ensemble based on threshold frequency (simple count). Requires thresholds to be computed.
'pca'
- Calculates a PCA between predictions of each algorithm and then extract the first axis (the one explaining the most variation).
'superlearner'
- Composites two predictions through a 'meta-model' fitted on top
(using a glm
by default). Requires binomial data in the current setup.
In addition to the different ensemble methods, a minimal threshold
(min.value
) can be set that needs to be surpassed for averaging. By
default this option is not used (Default: NULL
).
Note that by default only the band in the layer
parameter is composited. If
supported by the model other summary statistics from the posterior (e.g.
'sd'
) can be specified.
A SpatRaster
object containing the ensemble of the provided
predictions specified by method
and a coefficient of variation
across all models.
If a list is supplied, then it is assumed that each entry in the list
is a fitted DistributionModel
object. Take care not to create an ensemble
of models constructed with different link functions, e.g. logistic vs log.
In this case the "normalize"
parameter has to be set.
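For instance, a weighted ensemble could be constructed as follows (a hedged sketch; mod1 to mod3 are hypothetical fitted models and the weights, e.g. evaluation scores such as TSS, are made up):

ex <- ensemble(mod1, mod2, mod3,
               method = "weighted.mean",
               weights = c(0.5, 0.3, 0.2),
               layer = "mean")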
Valavi, R., Guillera‐Arroita, G., Lahoz‐Monfort, J. J., & Elith, J. (2022). Predictive performance of presence‐only species distribution models: a benchmark study with reproducible code. Ecological Monographs, 92(1), e01486.
# Method works for fitted models as well as for rasters r1 <- terra::rast(nrows = 10, ncols = 10, res = 0.05, xmin = -1.5, xmax = 1.5, ymin = -1.5, ymax = 1.5, vals = rnorm(3600,mean = .5,sd = .1)) r2 <- terra::rast(nrows = 10, ncols = 10, res = 0.05, xmin = -1.5, xmax = 1.5, ymin = -1.5, ymax = 1.5, vals = rnorm(3600,mean = .5,sd = .5)) names(r1) <- names(r2) <- "mean" # Assumes previously computed predictions ex <- ensemble(r1, r2, method = "mean") terra::plot(ex)
Similar to the ensemble()
function, this function creates an
ensemble of partial responses of provided distribution models fitted with
the ibis.iSDM-package
. Through the layer
parameter it can be
specified which part of the partial prediction should be averaged in an
ensemble (if given). This can be for instance the mean prediction and/or
the standard deviation sd. Ensemble partial is also being called if more
than one input DistributionModel
object is provided to partial
.
By default, the ensemble of partial responses is created as the average across all models, with the uncertainty being the standard deviation of responses.
ensemble_partial( ..., x.var, method = "mean", layer = "mean", newdata = NULL, normalize = TRUE ) ## S4 method for signature 'ANY' ensemble_partial( ..., x.var, method = "mean", layer = "mean", newdata = NULL, normalize = TRUE )
... |
Provided |
x.var |
A |
method |
Approach on how the ensemble is to be created. See details for
options (Default: |
layer |
A |
newdata |
A optional |
normalize |
|
Possible options for creating an ensemble includes:
'mean'
- Calculates the mean of several predictions.
'median'
- Calculates the median of several predictions.
A data.frame with the combined partial effects of the supplied models.
If a list is supplied, then it is assumed that each entry in the list
is a fitted DistributionModel
object. Take care not to create an ensemble
of models constructed with different link functions, e.g. logistic vs log.
By default the response functions of each model are normalized.
## Not run: # Assumes previously computed models ex <- ensemble_partial(mod1, mod2, mod3, method = "mean") ## End(Not run)
Similar to the ensemble()
function, this function creates an
ensemble of partial responses of provided distribution models fitted with
the ibis.iSDM-package
. Through the layer
parameter it can be
specified which part of the partial prediction should be averaged in an
ensemble (if given). This can be for instance the mean prediction and/or
the standard deviation sd. Ensemble spartial is also being called if more
than one input DistributionModel
object is provided to spartial
.
By default, the ensemble of partial responses is created as the average across all models, with the uncertainty being the standard deviation of responses.
ensemble_spartial( ..., x.var, method = "mean", layer = "mean", newdata = NULL, min.value = NULL, normalize = TRUE ) ## S4 method for signature 'ANY' ensemble_spartial( ..., x.var, method = "mean", layer = "mean", newdata = NULL, min.value = NULL, normalize = TRUE )
... |
Provided |
x.var |
A |
method |
Approach on how the ensemble is to be created. See details for
options (Default: |
layer |
A |
newdata |
A optional |
min.value |
A optional |
normalize |
|
Possible options for creating an ensemble includes:
'mean'
- Calculates the mean of several predictions.
'median'
- Calculates the median of several predictions.
A SpatRaster object with the combined partial effects of the supplied models.
If a list is supplied, then it is assumed that each entry in the list
is a fitted DistributionModel
object. Take care not to create an ensemble
of models constructed with different link functions, e.g. logistic vs log.
By default the response functions of each model are normalized.
## Not run: # Assumes previously computed models ex <- ensemble_spartial(mod1, mod2, mod3, method = "mean") ## End(Not run)
This function expects a downscaled GLOBIOM output as created in the BIOCLIMA project. Likely of little use for anyone outside IIASA.
formatGLOBIOM( fname, oftype = "raster", ignore = NULL, period = "all", template = NULL, shares_to_area = FALSE, use_gdalutils = FALSE, verbose = getOption("ibis.setupmessages", default = TRUE) )
fname |
A filename in |
oftype |
A |
ignore |
A |
period |
A |
template |
An optional |
shares_to_area |
A |
use_gdalutils |
(Deprecated) |
verbose |
|
A SpatRaster
stack with the formatted GLOBIOM predictors.
## Not run: # Expects a filename pointing to a netCDF file. covariates <- formatGLOBIOM(fname) ## End(Not run)
Monotonic constraints for gradient descent boosting models do not work in the same way as other priors, where a specific coefficient or magnitude of importance is specified. Rather, monotonic constraints enforce a specific directionality of regression coefficients, so that for instance a coefficient has to be positive or negative.
Important: Specifying a monotonic constraint for the engine_gdb does not guarantee that the variable is retained in the model, as it can still be regularized out.
GDBPrior(variable, hyper = "increasing", ...) ## S4 method for signature 'character' GDBPrior(variable, hyper = "increasing", ...)
variable |
A |
hyper |
A |
... |
Variables passed on to prior object. |
Similar priors can also be defined for the engine_xgboost
via
XGBPrior()
.
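A minimal sketch, assuming a covariate named "forest" whose effect is constrained to be increasing:

p <- GDBPrior(variable = "forest", hyper = "increasing")
x <- distribution(background) |>
  engine_gdb(iter = 2000) |>
  add_priors(priors(p))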
Hofner, B., Müller, J., & Hothorn, T. (2011). Monotonicity‐constrained species distribution models. Ecology, 92(10), 1895-1901.
Other prior:
BARTPrior()
,
BARTPriors()
,
BREGPrior()
,
BREGPriors()
,
GDBPriors()
,
GLMNETPrior()
,
GLMNETPriors()
,
INLAPrior()
,
INLAPriors()
,
STANPrior()
,
STANPriors()
,
XGBPrior()
,
XGBPriors()
,
add_priors()
,
get_priors()
,
priors()
,
rm_priors()
This is a helper function to specify several GDBPrior objects with the same hyper-parameters, but different variables.
GDBPriors(variable, hyper = "increasing", ...) ## S4 method for signature 'character' GDBPriors(variable, hyper = "increasing", ...)
variable |
A |
hyper |
A |
... |
Variables passed on to prior object. |
Other prior:
BARTPrior()
,
BARTPriors()
,
BREGPrior()
,
BREGPriors()
,
GDBPrior()
,
GLMNETPrior()
,
GLMNETPriors()
,
INLAPrior()
,
INLAPriors()
,
STANPrior()
,
STANPriors()
,
XGBPrior()
,
XGBPriors()
,
add_priors()
,
get_priors()
,
priors()
,
rm_priors()
This function is a short helper function to return the fitted
data from a DistributionModel
or BiodiversityScenario
object. It can be
used to easily obtain for example the estimated prediction from a model or
the projected scenario from a scenario()
object.
get_data(obj, what = NULL) ## S4 method for signature 'ANY' get_data(obj, what = NULL)
obj |
Provided |
what |
A |
A SpatRaster
or "stars" object depending on the input.
This function is essentially identical to querying the internal
function x$get_data()
from the object. However, it does attempt some
lazy character matching if what is supplied.
## Not run: # Assumes previously computed model get_data(fit) ## End(Not run)
This function performs nearest-neighbour matching between biodiversity observations and independent predictors, and operates directly on provided data.frames. Note that despite being parallelized, this function can be rather slow for large volumes of data!
get_ngbvalue( coords, env, longlat = TRUE, field_space = c("x", "y"), cheap = FALSE, ... )
coords |
A |
env |
A |
longlat |
A |
field_space |
A |
cheap |
A |
... |
other options. |
Nearest neighbour matching is done via the geodist R-package (geodist::geodist
).
A data.frame
with the extracted covariate data from each provided
data point.
If multiple values are of equal distance during the nearest neighbour check, then the result is by default averaged.
Mark Padgham and Michael D. Sumner (2021). geodist: Fast, Dependency-Free Geodesic Distance Calculations. R package version 0.0.7. https://CRAN.R-project.org/package=geodist
## Not run: # Create matchup table tab <- get_ngbvalue( coords = coords, # Coordinates env = env # Data.frame with covariates and coordinates ) ## End(Not run)
Often it can make sense to fit an additional model to get a
grasp on the range of values that "beta" parameters can take. This function
takes an existing BiodiversityDistribution
object and creates a
PriorList
object from it. The resulting object can be used to add,
for instance, priors to a new model.
get_priors(mod, target_engine, ...) ## S4 method for signature 'ANY,character' get_priors(mod, target_engine, ...)
mod |
A fitted |
target_engine |
A |
... |
Other parameters passed down. |
Not all engines support priors in similar ways. See the vignettes and help pages on that topic!
Other prior:
BARTPrior()
,
BARTPriors()
,
BREGPrior()
,
BREGPriors()
,
GDBPrior()
,
GDBPriors()
,
GLMNETPrior()
,
GLMNETPriors()
,
INLAPrior()
,
INLAPriors()
,
STANPrior()
,
STANPriors()
,
XGBPrior()
,
XGBPriors()
,
add_priors()
,
priors()
,
rm_priors()
## Not run: mod <- distribution(background) |> add_predictors(covariates) |> add_biodiversity_poipo(points) |> engine_inlabru() |> train() get_priors(mod, target_engine = "BART") ## End(Not run)
This function simply extracts the values from a provided
SpatRaster
, SpatRasterDataset
or SpatRasterCollection
object. For
points where NA
values were extracted, a small buffer is applied to try and
obtain the remaining values.
get_rastervalue(coords, env, ngb_fill = TRUE, rm.na = FALSE)
coords |
A |
env |
A |
ngb_fill |
|
rm.na |
|
It is essentially a wrapper for terra::extract
.
A data.frame
with the extracted covariate data from each provided
data point.
# Dummy raster: r <- terra::rast(nrows = 10, ncols = 10, res = 0.05, xmin = -1.5, xmax = 1.5, ymin = -1.5, ymax = 1.5, vals = rnorm(3600,mean = .5,sd = .1)) # (dummy points) pp <- terra::spatSample(r,20,as.points = TRUE) |> sf::st_as_sf() # Extract values vals <- get_rastervalue(pp, r) head(vals)
The engine_glmnet
engine does not support priors in a
typical sense, however it is possible to specify so called penalty factors as
well as lower and upper limits on all variables in the model.
The default penalty multiplier is 1
for each covariate coefficient,
i.e. coefficients are penalized equally and then informed by an intersection
of any absence information with the covariates. In contrast, a variable with
penalty.factor equal to 0
is not penalized at all.
In addition, it is possible to specify a lower and upper limit for specific
coefficients, which constrain them to a certain range. By default those
ranges are set to -Inf
and Inf
respectively, but can be reset
to a specific value range by altering "lims"
(see examples).
For a regularized regression that supports a few more options on the priors,
check out the Bayesian engine_breg
.
GLMNETPrior(variable, hyper = 0, lims = c(-Inf, Inf), ...) ## S4 method for signature 'character' GLMNETPrior(variable, hyper = 0, lims = c(-Inf, Inf), ...)
variable |
A |
hyper |
A |
lims |
A |
... |
Variables passed on to prior object. |
Other prior:
BARTPrior()
,
BARTPriors()
,
BREGPrior()
,
BREGPriors()
,
GDBPrior()
,
GDBPriors()
,
GLMNETPriors()
,
INLAPrior()
,
INLAPriors()
,
STANPrior()
,
STANPriors()
,
XGBPrior()
,
XGBPriors()
,
add_priors()
,
get_priors()
,
priors()
,
rm_priors()
## Not run: # Retain variable p1 <- GLMNETPrior(variable = "forest", hyper = 0) p1 # Smaller chance to be regularized p2 <- GLMNETPrior(variable = "forest", hyper = 0.2, lims = c(0, Inf)) p2 ## End(Not run)
This is a helper function to specify several GLMNETPrior with the same hyper-parameters, but different variables.
GLMNETPriors(variable, hyper = 0, lims = c(-Inf, Inf)) ## S4 method for signature 'character' GLMNETPriors(variable, hyper = 0, lims = c(-Inf, Inf))
variable |
A |
hyper |
A |
lims |
A |
Other prior:
BARTPrior()
,
BARTPriors()
,
BREGPrior()
,
BREGPriors()
,
GDBPrior()
,
GDBPriors()
,
GLMNETPrior()
,
INLAPrior()
,
INLAPriors()
,
STANPrior()
,
STANPriors()
,
XGBPrior()
,
XGBPriors()
,
add_priors()
,
get_priors()
,
priors()
,
rm_priors()
Some of the dependencies (R-packages) that ibis.iSDM relies on are intentionally not added to the package Description to keep the number of mandatory dependencies small and to enable the package to run even on systems that might not have all libraries pre-installed.
This function provides a convenience wrapper to install those missing dependencies as needed. It furthermore checks which packages require updating and updates them as needed.
ibis_dependencies(deps = getOption("ibis.dependencies"), update = TRUE)
deps |
A |
update |
A |
Nothing. Packages will be installed.
INLA is handled in a special way as it is not available via CRAN.
## Not run: # Install and update all dependencies ibis_dependencies() ## End(Not run)
Small helper function to enable parallel processing. When enabled,
parallel inference (if supported by engines) and projection are
used across the package.
For enabling prediction support beyond sequential prediction see the ibis_future
function.
ibis_enable_parallel()
Invisible
This function checks if parallel processing can be set up and enables it. Ideally this is done by the user for more control! In the package parallelization is usually only used for predictions and projections, but not for inference in which case parallel inference should be handled by the engine.
ibis_future( plan_exists = FALSE, cores = getOption("ibis.nthread", default = 2), strategy = getOption("ibis.futurestrategy"), workers = NULL )
plan_exists |
A |
cores |
A |
strategy |
A |
workers |
An optional list of remote machines or workers, e.g. |
Currently supported strategies are:
"sequential"
= Resolves futures sequentially in the current R process (Package default).
"multisession"
= Resolves futures asynchronously across 'cores'
sessions.
"multicore"
= Resolves futures asynchronously on forked processes. Only works on UNIX systems!
"cluster"
= Resolves futures asynchronously in sessions on one or more machines.
"slurm"
= To be implemented: Slurm linkage via batchtools.
Invisible
The 'plan'
set by future exists after the function has been executed.
If the aim is to parallelize across many species, this is better done in a scripted solution. Make sure not to parallelize predictions within existing clusters to avoid out-of-memory issues.
## Not run: # Starts future job. F in this case is a prediction function. ibis_future(cores = 4, strategy = "multisession") ## End(Not run)
There are a number of hidden options that can be specified for ibis.iSDM. Currently supported are:
'ibis.runparallel'
: logical
value on whether processing should be run in parallel.
'ibis.nthread'
: numeric
value on how many cores should be used by default.
'ibis.setupmessages'
: logical
value indicating whether message during object creation should be shown (Default: NULL
).
'ibis.engines'
: Returns a vector
with all valid engines.
'ibis.use_future'
: logical
on whether the future package should be used for parallel computing.
ibis_options()
The output of getOptions
for all ibis related variables.
ibis_options()
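Beyond inspecting the options, they can be set with base R's options(); a short sketch using the option names listed above:

# Suppress setup messages and raise the default thread count
options("ibis.setupmessages" = FALSE)
options("ibis.nthread" = 4)
ibis_options()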
Small helper function to respecify the strategy for parallel processing (Default: 'sequential'
).
ibis_set_strategy(strategy = "sequential")
strategy |
A |
Currently supported strategies are:
"sequential"
= Resolves futures sequentially in the current R process (Package default).
"multisession"
= Resolves futures asynchronously across 'cores'
sessions.
"multicore"
= Resolves futures asynchronously on forked processes. Only works on UNIX systems!
"cluster"
= Resolves futures asynchronously in sessions on one or more machines.
"slurm"
= To be implemented: Slurm linkage via batchtools.
Invisible
Small helper function to respecify the number of threads for parallel processing.
ibis_set_threads(threads = 2)
threads |
A |
Invisible
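A short usage sketch combining this helper with ibis_set_strategy() documented above:

# Use four threads with a multisession strategy for subsequent predictions
ibis_set_strategy("multisession")
ibis_set_threads(4)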
future, ibis_future_run
For any fixed and random effect INLA supports a range of different priors of exponential distributions.
Currently supported for INLA in ibis.iSDM are the following priors that can
be specified via "type"
:
"normal"
or "gaussian"
: Normally distributed priors set on the specified variable. Required parameters
are a mean and a precision estimate provided to "hyper"
. Note that
precision is the inverse of the variance rather than the standard
deviation typically specified in Gaussian priors. Defaults are set to a mean of
0
and a precision of 0.001
.
"clinear"
: Prior that places a constraint on the linear coefficients of a model
so that the coefficient lies in a specified interval
"c(lower,upper)"
. Specified through hyper, these values can be
negative, positive or infinite.
"spde"
, specifically 'prior.range'
and 'prior.sigma'
: Specification of
penalized complexity priors which can be added to a SPDE spatial random
effect added via add_latent_spatial()
. Here the range of the penalized
complexity prior can be specified through 'prior.range'
and the
uncertainty via 'prior.sigma'
both supplied to the options 'type' and
'hyper'.
Other priors available in INLA (see names(INLA::inla.models()$prior))
might also work, but have not been tested!
INLAPrior(variable, type = "normal", hyper = c(0, 0.001), ...) ## S4 method for signature 'character,character' INLAPrior(variable, type = "normal", hyper = c(0, 0.001), ...)
variable |
A |
type |
A |
hyper |
A |
... |
Variables passed on to prior object. |
Compared to other engines, INLA unfortunately does not support
priors related to more stringent parameter regularization such as Laplace or
Horseshoe priors, which limits the capability of engine_inla
for
regularization. That being said, many of the default uninformative priors
already regularize the coefficients to some degree.
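Hedged examples of the prior types described above (the variable name "forest" is illustrative):

# Normal prior with mean 0 and precision 0.001
p1 <- INLAPrior(variable = "forest", type = "normal", hyper = c(0, 0.001))
# Constrain a coefficient to be positive
p2 <- INLAPrior(variable = "forest", type = "clinear", hyper = c(0, Inf))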
Rue, H., Riebler, A., Sørbye, S. H., Illian, J. B., Simpson, D. P., & Lindgren, F. K. (2017). Bayesian computing with INLA: a review. Annual Review of Statistics and Its Application, 4, 395-421.
Simpson, D., Rue, H., Riebler, A., Martins, T. G., & Sørbye, S. H. (2017). Penalising model component complexity: A principled, practical approach to constructing priors. Statistical science, 32(1), 1-28.
Other prior:
BARTPrior()
,
BARTPriors()
,
BREGPrior()
,
BREGPriors()
,
GDBPrior()
,
GDBPriors()
,
GLMNETPrior()
,
GLMNETPriors()
,
INLAPriors()
,
STANPrior()
,
STANPriors()
,
XGBPrior()
,
XGBPriors()
,
add_priors()
,
get_priors()
,
priors()
,
rm_priors()
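As an illustration, a normally distributed and a constrained prior could be specified as follows (a sketch; the variable names are hypothetical):

# Weakly informative normal prior (mean 0, precision 0.01) on a hypothetical covariate
p1 <- INLAPrior(variable = "bio01", type = "normal", hyper = c(0, 0.01))
# Constrain a hypothetical coefficient to be positive via 'clinear'
p2 <- INLAPrior(variable = "forest", type = "clinear", hyper = c(0, Inf))
# Combine into a PriorList that can be added via add_priors()
priors(p1, p2)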
This is a helper function to specify several INLAPrior objects with the same hyper-parameters, but different variables.
INLAPriors(variables, type, hyper = c(0, 0.001), ...) ## S4 method for signature 'vector,character' INLAPriors(variables, type, hyper = c(0, 0.001), ...)
variables |
A |
type |
A |
hyper |
A |
... |
Variables passed on to prior object. |
Other prior:
BARTPrior()
,
BARTPriors()
,
BREGPrior()
,
BREGPriors()
,
GDBPrior()
,
GDBPriors()
,
GLMNETPrior()
,
GLMNETPriors()
,
INLAPrior()
,
STANPrior()
,
STANPriors()
,
XGBPrior()
,
XGBPriors()
,
add_priors()
,
get_priors()
,
priors()
,
rm_priors()
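A short usage sketch (hypothetical variable names):

# The same normal prior for several covariates at once
ps <- INLAPriors(variables = c("bio01", "bio12"), type = "normal", hyper = c(0, 0.001))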
This function linearly approximates shares between time steps, so that gaps for instance between 2010 and 2020 are filled with data for 2010, 2011, 2012, etc.
interpolate_gaps(env, date_interpolation = "annual", method = "linear")
env |
A |
date_interpolation |
|
method |
A |
The interpolated stars
or SpatRaster
object with gaps between time steps filled.
## Not run: # Interpolate stars stack sc <- interpolate_gaps( stack, "annual") ## End(Not run)
Check whether a formula is valid
is.formula(x)
x |
A |
Boolean evaluation with logical output.
Check whether a provided object is truly of a specific type
is.Id(x)
x |
A provided Id object |
Boolean evaluation with logical output.
Tests if an input is a SpatRaster object.
is.Raster(x)
x |
an R Object. |
Boolean evaluation with logical output.
Tests if an input is a stars object.
is.stars(x)
x |
an R Object. |
Boolean evaluation with logical output.
Is the provided object of type waiver?
is.Waiver(x)
x |
A provided |
Boolean evaluation with logical output.
Calculates a SpatRaster
of locally limiting factors from a
given projected model. To calculate this, first the spartial effect of each individual covariate in the model is calculated.
The effect is estimated as the variable most responsible for decreasing suitability at a given cell. The decrease in suitability is calculated, for each predictor in turn, relative to the suitability that would be achieved if that predictor took its mean value. The predictor associated with the largest decrease in suitability is the most limiting factor.
limiting(mod, plot = TRUE) ## S4 method for signature 'ANY' limiting(mod, plot = TRUE)
mod |
A fitted |
plot |
Should the result be plotted? (Default: |
A terra
object of the most important variable for a given grid cell.
Elith, J., Kearney, M. and Phillips, S. (2010), The art of modelling range-shifting species. Methods in Ecology and Evolution, 1: 330-342. doi: 10.1111/j.2041-210X.2010.00036.x
## Not run: o <- limiting(fit) plot(o) ## End(Not run)
The load_model function (as opposed to write_model) loads a previously saved DistributionModel. It is essentially a wrapper around readRDS.
When models are loaded, they are briefly checked for their validity and presence of necessary components.
load_model(fname, verbose = getOption("ibis.setupmessages", default = TRUE)) ## S4 method for signature 'character' load_model(fname, verbose = getOption("ibis.setupmessages", default = TRUE))
fname |
A |
verbose |
|
A DistributionModel
object.
write_model
## Not run: # Load model mod <- load_model("testmodel.rds") summary(mod) ## End(Not run)
Basic R6
object for Log; any Log inherits from here
filename
A character
of where the log is to be stored.
output
The log content.
new()
Initializes the object and specifies some default parameters.
Log$new(filename, output)
filename
A character
of where the log is to be stored.
output
The log content.
NULL
print()
Print message with filename
Log$print()
A message on screen
open()
Opens the connection to the output filename.
Log$open(type = c("output", "message"))
type
A character
vector of the output types.
Invisible TRUE
close()
Closes the connection to the output file
Log$close()
Invisible TRUE
get_filename()
Get output filename
Log$get_filename()
A character
with the filename
set_filename()
Set a new output filename
Log$set_filename(value)
value
A character
with the new filename.
Invisible TRUE
delete()
Delete log file
Log$delete()
Invisible TRUE
open_system()
Open log with system viewer
Log$open_system()
Invisible TRUE
clone()
The objects of this class are cloneable with this method.
Log$clone(deep = FALSE)
deep
Whether to make a deep clone.
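Log objects are typically created internally when a log is added to a model, but the R6 interface can also be used directly; a minimal sketch (the file name is hypothetical):

## Not run: 
lg <- Log$new(filename = "ibis_run.log", output = NULL)
lg$get_filename()
lg$set_filename("ibis_run2.log")
## End(Not run)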
This is a helper function that takes an existing object created by the ibis.iSDM package and an external layer, then intersects both. It currently takes either a DistributionModel, BiodiversityDatasetCollection, PredictorDataset or BiodiversityScenario as input.
As mask either a sf
or SpatRaster
object can be chosen. The mask will
be converted internally depending on the object.
mask.DistributionModel(x, mask, inverse = FALSE, ...) mask.BiodiversityDatasetCollection(x, mask, inverse = FALSE, ...) mask.PredictorDataset(x, mask, inverse = FALSE, ...) mask.BiodiversityScenario(x, mask, inverse = FALSE, ...)
x |
Any object belonging to DistributionModel, BiodiversityDatasetCollection, PredictorDataset or BiodiversityScenario. |
mask |
A |
inverse |
A |
... |
Passed on arguments |
A respective object of the input type.
## Not run: # Build and train a model mod <- distribution(background) |> add_biodiversity_poipo(species) |> add_predictors(predictors) |> engine_glmnet() |> train() # Constrain the prediction by another object mod <- mask(mod, speciesrange) ## End(Not run)
Calculate the mode of a provided vector
modal(x, na.rm = TRUE)
x |
A |
na.rm |
|
The most common (mode) estimate.
# Example modal(trees$Girth)
Generate a new unique identifier.
new_id()
Identifiers are made using uuid::UUIDgenerate()
.
"Id"
object.
# create new id i <- new_id() # print id print(i) # convert to character as.character(i) # check if it is an Id object is.Id(i)
Create a waiver
object.
new_waiver()
This object is used to represent that the user has not manually
specified a setting, and so defaults should be used. By explicitly using a
new_waiver()
, this means that NULL
objects can be a valid setting. The
use of a "waiver" object was inspired by the ggplot2
and prioritizr
packages.
Object of class Waiver
.
# create new waiver object w <- new_waiver() # print object print(w) # is it a waiver object? is.Waiver(w)
The suitability of any given area for a biodiversity feature can in
many instances be complex and non-linear. Visualizing obtained suitability
predictions (e.g. from train()
) against underlying predictors might help
to explain the underlying gradients of the niche.
Supported Inputs for this function are either single trained ibis.iSDM
DistributionModel
objects or alternatively a set of three SpatRaster
objects.
In both cases, users can specify "xvar"
and "yvar"
explicitly
or leave them empty. In the latter case a principal component analysis (PCA)
is conducted on the full environmental stack (loaded from DistributionModel
or supplied separately).
nicheplot( mod, xvar = NULL, yvar = NULL, envvars = NULL, overlay_data = FALSE, plot = TRUE, fname = NULL, title = NULL, pal = NULL, ... ) ## S4 method for signature 'ANY' nicheplot( mod, xvar = NULL, yvar = NULL, envvars = NULL, overlay_data = FALSE, plot = TRUE, fname = NULL, title = NULL, pal = NULL, ... )
mod |
A trained |
xvar |
A |
yvar |
A |
envvars |
A |
overlay_data |
A |
plot |
A |
fname |
A |
title |
Allows to respecify the title through a |
pal |
An optional |
... |
Other engine specific parameters. |
Saved niche plot in 'fname'
if specified, otherwise plot.
partial, plot.DistributionModel
# Make quick prediction background <- terra::rast(system.file('extdata/europegrid_50km.tif', package='ibis.iSDM',mustWork = TRUE)) virtual_points <- sf::st_read(system.file('extdata/input_data.gpkg', package='ibis.iSDM'), 'points',quiet = TRUE) ll <- list.files(system.file('extdata/predictors/',package = 'ibis.iSDM',mustWork = TRUE),full.names = TRUE) # Load them as rasters predictors <- terra::rast(ll);names(predictors) <- tools::file_path_sans_ext(basename(ll)) # Add GLM as an engine and predict fit <- distribution(background) |> add_biodiversity_poipo(virtual_points, field_occurrence = 'Observed', name = 'Virtual points',docheck = FALSE) |> add_predictors(predictors, transform = 'none',derivates = 'none') |> engine_glm() |> train() # Plot niche for prediction for temperature and forest cover nicheplot(fit, xvar = "bio01_mean_50km", yvar = "CLC3_312_mean_50km" )
Shows the size of the objects currently in the R environment. Helps to locate large objects cluttering the R environment and/or causing memory problems during the execution of large workflows.
objects_size(n = 10)
n |
Number of objects to show, Default: |
A data frame with the row names indicating the object name, the field 'Type' indicating the object type, 'Size' indicating the object size, and the columns 'Length/Rows' and 'Columns' indicating the object dimensions if applicable.
Blas Benito
if(interactive()){ #creating dummy objects x <- matrix(runif(100), 10, 10) y <- matrix(runif(10000), 100, 100) #reading their in-memory size objects_size() }
Create a partial response or effect plot of a trained model.
partial( mod, x.var = NULL, constant = NULL, variable_length = 100, values = NULL, newdata = NULL, plot = FALSE, type = "response", ... ) ## S4 method for signature 'ANY' partial( mod, x.var = NULL, constant = NULL, variable_length = 100, values = NULL, newdata = NULL, plot = FALSE, type = "response", ... ) partial.DistributionModel(mod, ...)
mod |
A trained |
x.var |
A character indicating the variable for which a partial effect is to be calculated. |
constant |
A numeric constant to be inserted for all other variables. Default calculates a mean per variable. |
variable_length |
numeric The interpolation depth (nr. of points) to
be used (Default: |
values |
numeric Directly specified values to compute partial effects
for. If this parameter is set to anything other than |
newdata |
An optional data.frame with provided data for partial estimation
(Default: |
plot |
A |
type |
A specified type, either |
... |
Other engine specific parameters. |
By default the mean is calculated across all parameters that are not
x.var
. Instead a constant can be set (for instance 0
) to be
applied to the output.
A data.frame with the created partial response.
## Not run: # Do a partial calculation of a trained model partial(fit, x.var = "Forest.cover", plot = TRUE) ## End(Not run)
Based on a fitted model, plot the density of observations over the estimated variable and environmental space. As opposed to the partial and spartial functions, which are rather low-level interfaces, this function provides more detail in the light of the data. It is also able to contrast different variables against each other and show the used data.
partial_density(mod, x.var, df = FALSE, ...) ## S4 method for signature 'ANY,character' partial_density(mod, x.var, df = FALSE, ...)
mod |
A trained |
x.var |
A character indicating the variable to be investigated. Can be
a |
df |
|
... |
Other engine specific parameters. |
This functions calculates the observed density of presence and absence points over the whole surface of a specific variable. It can be used to visually inspect the fit of the model to data.
A ggplot2
object showing the marginal response in light of the data.
By default all variables that are not x.var
are held constant at
the mean.
Warren, D.L., Matzke, N.J., Cardillo, M., Baumgartner, J.B., Beaumont, L.J., Turelli, M., Glor, R.E., Huron, N.A., Simões, M., Iglesias, T.L. Piquet, J.C., and Dinnage, R. 2021. ENMTools 1.0: an R package for comparative ecological biogeography. Ecography, 44(4), pp.504-511.
## Not run: # Do a partial calculation of a trained model partial_density(fit, x.var = "Forest.cover") # Or with two variables partial_density(fit, x.var = c("Forest.cover", "bio01")) ## End(Not run)
Plots information from a given object where a plotting object is available.
## S3 method for class 'DistributionModel' plot(x, what = "mean", ...) ## S3 method for class 'BiodiversityDatasetCollection' plot(x, ...) ## S3 method for class 'PredictorDataset' plot(x, ...) ## S3 method for class 'Engine' plot(x, ...) ## S3 method for class 'BiodiversityScenario' plot(x, ...)
x |
Any object belonging to DistributionModel, BiodiversityDatasetCollection, PredictorDataset or BiodiversityScenario. |
what |
In case a SpatRaster is supplied, this parameter specifies the layer
to be shown (Default: |
... |
Further arguments passed on to |
The plotted outputs vary depending on what object is being plotted.
For example for a fitted DistributionModel the output is usually the fitted
spatial prediction (Default: 'mean'
).
Graphical output
## Not run: # Build and train a model mod <- distribution(background) |> add_biodiversity_poipo(species) |> add_predictors(predictors) |> engine_glmnet() |> train() # Plot the resulting model plot(mod) ## End(Not run)
This function simulates from the posterior of a created stan model, providing a fast and efficient way to project coefficients obtained from Bayesian models to new/novel contexts.
posterior_predict_stanfit( obj, form, newdata, type = "predictor", family = NULL, offset = NULL, draws = NULL )
obj |
A |
form |
A |
newdata |
A data.frame with new data to be used for prediction. |
type |
A |
family |
A |
offset |
A vector with an optionally specified offset. |
draws |
numeric indicating whether a specific number of draws should be taken. |
https://medium.com/@alex.pavlakis/making-predictions-from-stan-models-in-r-3e349dfac1ed.
The brms R-package.
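A usage sketch (assuming 'fit' holds a fitted stanfit object and 'newenv' a data.frame with the covariates named in the formula):

## Not run: 
pred <- posterior_predict_stanfit(obj = fit, form = ~ bio01 + forest, newdata = newenv, family = "poisson", draws = 100)
## End(Not run)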
This function creates derivatives of existing covariates and returns them in Raster format. Derivative variables are in the machine learning literature commonly understood as one aspect of feature engineering. They can be particularly powerful in introducing non-linearities in otherwise linear models, as is for example done in the popular Maxent framework.
predictor_derivate( env, option, nknots = 4, deriv = NULL, int_variables = NULL, method = NULL, ... )
env |
A |
option |
A |
nknots |
The number of knots to be used for the transformation (Default: |
deriv |
A |
int_variables |
A |
method |
As |
... |
other options (Non specified). |
Available options are:
'none'
- The original layer(s) are returned.
'quadratic'
- A quadratic transformation (x^2) is created of the provided layers.
'hinge'
- Creates a hinge transformation of covariates, which sets all values lower than a set threshold to 0 and all others to a range of [0, 1].
The number of thresholds and thus new derivates is specified via the parameter
'nknots'
(Default: 4
).
'interaction'
- Creates interactions between variables. Target variables
have to be specified via "int_variables"
.
'thresh'
- A threshold transformation of covariates, which sets all
values lower than a set threshold to 0
and those larger to 1
.
The number of thresholds and thus new derivates is specified via the parameter
'nknots'
(Default: 4
).
'bin'
- Creates a factor representation of a covariate by cutting the
range of covariates by their percentiles. The number of percentile cuts and thus
new derivates is specified via the parameter 'nknots'
(Default: 4
).
'kmeans'
Creates a factor representation of a covariate through a
kmeans()
clustering. The number of clusters are specified via the parameter 'nknots'
.
Returns the derived adjusted SpatRaster
objects of identical resolution.
predictor_transform
# Dummy raster r_ori <- terra::rast(nrows = 10, ncols = 10, res = 0.05, xmin = -1.5, xmax = 1.5, ymin = -1.5, ymax = 1.5, vals = rpois(3600, 10)) # Create a hinge transformation with 4 knots of one or multiple SpatRaster. new <- predictor_derivate(r_ori, option = "hinge", nknots = 4) terra::plot(new) # Or a quadratic transformation new2 <- predictor_derivate(r_ori, option = "quadratic") terra::plot(new2)
This function helps to remove highly correlated variables from a set of predictors. It supports multiple options, some of which require both environmental predictors and observations, while others only require predictors.
Some of the options require different packages to be pre-installed, such as
ranger
or Boruta
.
predictor_filter(env, keep = NULL, method = "pearson", ...)
env |
A |
keep |
A |
method |
Which method to use for constructing the correlation matrix
(Options: |
... |
Other options for a specific method |
Available options are:
"none"
No prior variable removal is performed (Default).
"pearson"
, "spearman"
or "kendall"
Makes use of pairwise
comparisons to identify and remove highly collinear predictors (Pearson's r >= 0.7
).
"abess"
A-priori adaptive best subset selection of covariates via the
abess
package (see References). Note that this effectively fits a separate
generalized linear model to reduce the number of covariates.
"boruta"
Uses the Boruta
package to identify non-informative features.
A character
vector
of variable names to be excluded. If the
function fails for some reason it returns NULL
.
Using this function on predictors effectively means that a separate model is fitted on the data, with all the assumptions that come with it (e.g. linearity, appropriateness of response, normality, etc).
## Not run: # Remove highly correlated predictors env <- predictor_filter( env, method = "pearson") ## End(Not run)
This method allows the homogenization of missing data across a
set of environmental predictors. It is by default called when predictors
are added to BiodiversityDistribution
object. Only grid cells with NAs
that contain values at some raster layers are homogenized. Additional
parameters allow, instead of homogenization, filling the missing data with neighbouring values.
predictor_homogenize_na( env, fill = FALSE, fill_method = "ngb", return_na_cells = FALSE )
env |
A |
fill |
A |
fill_method |
A |
return_na_cells |
A |
A SpatRaster
object with the same number of layers as the input.
## Not run: # Harmonize predictors env <- predictor_homogenize_na(env) ## End(Not run)
This function allows the transformation of provided environmental
predictors (in SpatRaster
format). A common use case is for instance the
standardization (or scaling) of all predictors prior to model fitting. This
function works both with SpatRaster
as well as with stars
objects.
predictor_transform( env, option, windsor_props = c(0.05, 0.95), pca.var = 0.8, state = NULL, method = NULL, ... )
env |
A |
option |
A |
windsor_props |
A |
pca.var |
A |
state |
A |
method |
As |
... |
other options (Non specified). |
Available options are:
'none'
The original layer(s) are returned.
'scale'
This runs the scale() function with default settings (1 standard deviation) across all predictors. A sensible default for most model fitting.
'norm'
This normalizes all predictors to a range from 0-1
.
'windsor'
This applies a 'windsorization' to an existing raster layer
by setting the lowest and largest values to the values at certain percentile levels (e.g. 5% and 95%). Those can be set via the parameter "windsor_props"
.
'windsor_thresh'
Same as option 'windsor', however in this case values
are clamped to fixed thresholds rather than percentiles calculated from the data.
'percentile'
This converts and bins all values into percentiles, e.g.
the top 10% or lowest 10% of values and so on.
'pca'
This option runs a principal component decomposition of all
predictors (via prcomp()
). It returns new predictors resembling all components
in order of the most important ones. Can be useful to reduce collinearity, however
note that this changes all predictor names to 'PCX', where X is the number of
the component. The parameter 'pca.var'
can be modified to specify the
minimum variance to be covered by the axes.
'revjack'
Removes outliers from the supplied stack via a reverse jackknife
procedure. Identified outliers are by default set to NA
.
Returns an adjusted SpatRaster
object of identical resolution.
If future covariates are rescaled or normalized, it is highly recommended to reuse the statistical moments on which the models were trained for any variable transformation, so that variable ranges remain consistent between current and future values.
predictor_derivate
# Dummy raster r_ori <- terra::rast(nrows = 10, ncols = 10, res = 0.05, xmin = -1.5, xmax = 1.5, ymin = -1.5, ymax = 1.5, vals = rnorm(3600,mean = .01,sd = .1)) # Normalize r_norm <- predictor_transform(r_ori, option = 'norm') new <- c(r_ori, r_norm) names(new) <- c("original scale", "normalized units") terra::plot(new)
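The note above on reusing training moments can be sketched with plain terra calls (hypothetical objects 'env' and 'future_env'; alternatively the 'state' parameter may be used to pass stored transformation parameters):

## Not run: 
# Moments from the training covariates
m <- terra::global(env, "mean", na.rm = TRUE)[, 1]
s <- terra::global(env, "sd", na.rm = TRUE)[, 1]
# Scale the future covariates with the training moments
future_scaled <- terra::scale(future_env, center = m, scale = s)
## End(Not run)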
This class describes the PredictorDataset and is used to store covariates within.
id
The id for this collection as character
.
data
A predictor dataset usually as SpatRaster
.
name
A name for this object.
transformed
Saves whether the predictors have been transformed somehow.
timeperiod
A timeperiod field
new()
Initializes the object and creates an empty list
PredictorDataset$new(id, data, transformed = FALSE, ...)
id
The id for this collection as character
.
data
A predictor dataset usually as SpatRaster
.
transformed
A logical
flag if predictors have been transformed. Assume not.
...
Any other parameters found.
NULL
print()
Print the names and properties of all predictor datasets contained within
PredictorDataset$print(format = TRUE)
format
A logical
flag on whether a message should be printed.
A message on screen
get_name()
Return name of this object
PredictorDataset$get_name()
Default character
name.
get_id()
Get Id of this object
PredictorDataset$get_id()
Default character
name.
get_names()
Get names of data
PredictorDataset$get_names()
character
names of the data value.
get_predictor_names()
Alias for get_names
PredictorDataset$get_predictor_names()
character
names of the data value.
get_data()
Get a specific dataset
PredictorDataset$get_data(df = FALSE, na.rm = TRUE, ...)
df
logical
on whether data is to be returned as data.frame
.
na.rm
logical
if NA
is to be removed from data.frame.
...
Any other parameters passed on.
A SpatRaster
or data.frame
.
get_time()
Get time dimension of object.
PredictorDataset$get_time(...)
...
Any other parameters passed on.
A vector
with the time dimension of the dataset.
get_projection()
Get Projection
PredictorDataset$get_projection()
A vector
with the geographical projection of the object.
get_resolution()
Get Resolution
PredictorDataset$get_resolution()
A numeric
vector
with the spatial resolution of the data.
get_ext()
Get Extent of predictors
PredictorDataset$get_ext()
A numeric
vector
with the spatial extent of the data.
crop_data()
Utility function to clip the predictor dataset by another dataset
PredictorDataset$crop_data(pol, apply_time = FALSE)
Invisible TRUE
mask()
Utility function to mask the predictor dataset by another dataset
PredictorDataset$mask(mask, inverse = FALSE, ...)
mask
A SpatRaster
or sf
object.
inverse
A logical
flag if the inverse should be masked instead.
...
Any other parameters passed on to masking.
Invisible
set_data()
Add a new Predictor dataset to this collection
PredictorDataset$set_data(value)
value
A new SpatRaster
or stars
object.
This object
rm_data()
Remove a specific Predictor by name
PredictorDataset$rm_data(x)
x
character
of the predictor name to be removed.
Invisible
show()
Alias for print method
PredictorDataset$show()
Invisible
summary()
Collect info statistics with optional decimals
PredictorDataset$summary(digits = 2)
digits
numeric
Giving the rounding precision
A data.frame
summarizing the data.
has_derivates()
Indication if there are any predictors that are derivates of others
PredictorDataset$has_derivates()
A logical
flag.
is_transformed()
Have the predictors been transformed?
PredictorDataset$is_transformed()
A logical
flag.
get_transformed_params()
Get transformation params.
PredictorDataset$get_transformed_params()
A matrix
with the transformation parameters.
length()
Number of Predictors in object
PredictorDataset$length()
A numeric
estimate
ncell()
Number of cells or values in object
PredictorDataset$ncell()
A numeric
estimate
plot()
Basic Plotting function
PredictorDataset$plot()
A graphical interpretation of the predictors in this object.
clone()
The objects of this class are cloneable with this method.
PredictorDataset$clone(deep = FALSE)
deep
Whether to make a deep clone.
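Predictor datasets are normally created internally via add_predictors(), but the R6 class can also be inspected directly; a minimal sketch (assuming 'predictors' is a SpatRaster):

## Not run: 
pd <- PredictorDataset$new(id = new_id(), data = predictors)
pd$get_names()
pd$summary()
## End(Not run)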
Display information about any object created through the ibis.iSDM R-package.
## S3 method for class 'distribution' print(x, ...) ## S3 method for class 'BiodiversityDistribution' print(x, ...) ## S3 method for class 'BiodiversityDatasetCollection' print(x, ...) ## S3 method for class 'BiodiversityDataset' print(x, ...) ## S3 method for class 'PredictorDataset' print(x, ...) ## S3 method for class 'DistributionModel' print(x, ...) ## S3 method for class 'BiodiversityScenario' print(x, ...) ## S3 method for class 'Prior' print(x, ...) ## S3 method for class 'PriorList' print(x, ...) ## S3 method for class 'Engine' print(x, ...) ## S3 method for class 'Settings' print(x, ...) ## S3 method for class 'Log' print(x, ...) ## S3 method for class 'Id' print(x, ...) ## S4 method for signature 'Id' print(x, ...) ## S4 method for signature 'tbl_df' print(x, ...)
x |
Any object created through the package. |
... |
not used. |
Object specific.
## Not run: # Where mod is fitted object mod print(mod) ## End(Not run)
This class sets up the base class for priors which will be inherited by all priors.
Defines a Prior object.
id
A character
with the id of the prior.
name
A character
with the name of the prior.
type
A character
with the type of the prior.
variable
A character
with the variable name for the prior.
distribution
A character
with the distribution of the prior if relevant.
value
A numeric
or character
with the prior value, e.g. the hyper-parameters.
prob
Another numeric
entry on the prior field. The inclusion probability.
lims
A limitation on the lower and upper bounds of a numeric value.
new()
Initializes the object and prepares the various prior variables
Prior$new( id, name, variable, value, type = NULL, distribution = NULL, prob = NULL, lims = NULL )
id
A character
with the id of the prior.
name
A character
with the name of the prior.
variable
A character
with the variable name for the prior.
value
A numeric
or character
with the prior value, e.g. the hyper-parameters.
type
A character
with the type of the prior.
distribution
A character
with the distribution of the prior if relevant.
prob
Another numeric
entry on the prior field. The inclusion probability.
lims
A limitation on the lower and upper bounds of a numeric value.
NULL
print()
Print out the prior type and variable.
Prior$print()
A message on screen
validate()
Generic validation function for a provided value.
Prior$validate(x)
x
A new prior value.
Invisible TRUE
get()
Get prior values
Prior$get(what = "value")
what
A character
with the entry to be returned (Default: value
).
The requested value.
set()
Set prior
Prior$set(x)
Invisible TRUE
get_id()
Get a specific ID from a prior.
Prior$get_id()
A character
id.
get_name()
Get Name of object
Prior$get_name()
Returns a character
with the class name.
clone()
The objects of this class are cloneable with this method.
Prior$clone(deep = FALSE)
deep
Whether to make a deep clone.
This functionality is likely deprecated or its checks have been superseded.
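Prior objects are usually created via the engine-specific wrappers (e.g. INLAPrior()); the generic methods documented above can then be queried directly, as in this sketch:

## Not run: 
p <- INLAPrior(variable = "forest", type = "normal", hyper = c(0, 0.001))
p$get("value")
p$get_name()
## End(Not run)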
This class represents a collection of Prior
objects. It provides
methods for accessing, adding and removing priors from the list
A PriorList object.
priors
A list of Prior
object.
new()
Initializes the object
PriorList$new(priors)
priors
A list of Prior
object.
NULL
print()
Print out summary statistics
PriorList$print()
A message on screen
show()
Alias that calls print.
PriorList$show()
A message on screen
length()
Number of priors in object
PriorList$length()
A numeric
with the number of priors set
ids()
Ids of prior objects
PriorList$ids()
A list with ids of the priors objects for query
varnames()
Variable names of priors in object
PriorList$varnames()
A character
list with the variable names of the priors.
classes()
Function to return the classes of all contained priors
PriorList$classes()
A character
list with the class names of the priors.
types()
Get types of all contained priors
PriorList$types()
A character
list with the type names of the priors.
exists()
Does a certain variable or type combination exist as prior?
PriorList$exists(variable, type = NULL)
A character
id.
add()
Add a new prior to the object.
PriorList$add(p)
p
A Prior
object.
Invisible TRUE
get()
Get specific prior values from the list if set
PriorList$get(variable, type = NULL, what = "value")
The prior object.
collect()
Collect priors for one or multiple given ids.
PriorList$collect(id)
id
A character
with the prior id.
A PriorList
object.
rm()
Remove a set prior by id
PriorList$rm(id)
id
A character
with the prior id.
Invisible TRUE
summary()
Summary function that lists all priors
PriorList$summary()
A data.frame
with the summarized priors.
combine()
Combining function to combine this PriorList with another new one
PriorList$combine(x)
x
A new PriorList
object.
Invisible TRUE
clone()
The objects of this class are cloneable with this method.
PriorList$clone(deep = FALSE)
deep
Whether to make a deep clone.
## Not run: priors( INLAPrior('var1','normal',c(0,0.1)), INLAPrior('var2','normal',c(0,0.1)) ) ## End(Not run)
A PriorList
object is essentially a list that contains
individual Prior
objects. In order to use priors for any of the
engines, the respective Prior
has to be identified (e.g.
INLAPrior
) and embedded in a PriorList
object. Afterwards these
objects can then be added to a distribution object with the add_priors
function.
priors(x, ...) ## S4 method for signature 'ANY' priors(x, ...)
x |
A |
... |
One or multiple additional |
A PriorList
object.
Other prior:
BARTPrior()
,
BARTPriors()
,
BREGPrior()
,
BREGPriors()
,
GDBPrior()
,
GDBPriors()
,
GLMNETPrior()
,
GLMNETPriors()
,
INLAPrior()
,
INLAPriors()
,
STANPrior()
,
STANPriors()
,
XGBPrior()
,
XGBPriors()
,
add_priors()
,
get_priors()
,
rm_priors()
p1 <- GDBPrior(variable = "Forest", hyper = "positive") p2 <- GDBPrior(variable = "Urban", hyper = "decreasing") priors(p1, p2) ## Not run: p1 <- INLAPrior(variable = "Forest",type = "normal", hyper = c(1,1e4)) p2 <- INLAPrior(variable = "Urban",type = "normal", hyper = c(0,1e-2)) priors(p1, p2) ## End(Not run)
Equivalent to train, this function acts as a wrapper to
project the model stored in a BiodiversityScenario
object to newly
supplied (future) covariates. Supplied predictors are usually
spatial-temporal predictors which should be prepared via add_predictors()
(e.g. transformations and derivates) in the same way as they have been during
the initial modelling with distribution()
. Any constraints specified in
the scenario object are applied during the projection.
project.BiodiversityScenario(x, ...) ## S4 method for signature 'BiodiversityScenario' project( x, date_interpolation = "none", stabilize = FALSE, stabilize_method = "loess", layer = "mean", verbose = getOption("ibis.setupmessages", default = TRUE), ... )
x |
A |
... |
passed on parameters. |
date_interpolation |
A |
stabilize |
A |
stabilize_method |
|
layer |
A |
verbose |
Setting this |
In the background the function x$project()
for the respective
model object is called, where x
is the fitted model object. For specifics on the constraints, see the relevant constraint functions, respectively:
add_constraint()
for a generic wrapper to add any of the available constraints.
add_constraint_dispersal()
for specifying dispersal constraint on the temporal projections at each step.
add_constraint_MigClim()
Using the MigClim R-package to simulate dispersal in projections.
add_constraint_connectivity()
Apply a connectivity constraint at the projection, for instance by adding
a barrier that prevents migration.
add_constraint_minsize()
Adds a constraint on the minimum area a given
thresholded patch should have, assuming that smaller areas are in fact not
suitable.
add_constraint_adaptability()
Apply an adaptability constraint to the projection, for instance
constraining the speed a species is able to adapt to new conditions.
add_constraint_boundary()
To artificially limit the distribution change. Similar as specifying projection limits, but
can be used to specifically constrain a projection within a certain area
(e.g. a species range or an island).
Many constraints also require thresholds to be calculated. Adding
threshold()
to a BiodiversityScenario
object enables the
computation of thresholds at every step based on the threshold used for the
main model (threshold values are taken from there).
It is also possible to make a complementary simulation with the steps
package, which can be provided via simulate_population_steps()
to the
BiodiversityScenario
object. Similar to thresholds, estimated
values will then be added to the outputs.
Finally this function also allows temporal stabilization across prediction
steps via enabling the parameter stabilize
and checking the
stablize_method
argument. Stabilization can for instance be helpful in
situations where environmental variables are quite dynamic, but changes in
projected suitability are not expected to abruptly increase or decrease. It
is thus a way to smoothen out outliers from the projection. Options are so
far for instance 'loess'
which fits a loess()
model per pixel and
time step. This is conducted at the very end of the processing steps and any
thresholds will be recalculated afterwards.
Saves stars
objects of the obtained predictions in mod.
## Not run: # Fit a model fit <- distribution(background) |> add_biodiversity_poipa(surveydata) |> add_predictors(env = predictors) |> engine_breg() |> train() # Fit a scenario sc <- scenario(fit) |> add_predictors(env = future_predictors) |> project() ## End(Not run)
This function defines the settings for pseudo-absence sampling
of the background. For many engines such points are necessary to model
Poisson (or Binomial) distributed point process data. Specifically we call
absence points for Binomial (Bernoulli really) distributed responses
'pseudo-absence'
and absence data for Poisson responses
'background'
points. For more details read Renner et al. (2015).
The function 'add_pseudoabsence'
allows adding absence points to any
sf
object. See Details for additional parameter description and
examples on how to 'turn' a presence-only dataset into a
presence-(pseudo-)absence dataset.
pseudoabs_settings( background = NULL, nrpoints = 10000, min_ratio = 0.25, method = "random", buffer_distance = 10000, inside = FALSE, layer = NULL, bias = NULL, ... ) ## S4 method for signature 'ANY' pseudoabs_settings( background = NULL, nrpoints = 10000, min_ratio = 0.25, method = "random", buffer_distance = 10000, inside = FALSE, layer = NULL, bias = NULL, ... )
background |
A |
nrpoints |
A |
min_ratio |
A |
method |
|
buffer_distance |
|
inside |
A |
layer |
A |
bias |
A |
... |
Any other settings to be added to the pseudoabs settings. |
There are multiple methods available for sampling a biased
background layer. Possible parameters for method
are:
'random'
Absence points are generated randomly over the background (Default),
'buffer'
Absence points are generated only within a buffered distance of existing points.
This option requires the specification of the parameter
buffer_distance
.
'mcp'
Can be used to only generate absence points within or outside a
minimum convex polygon of the presence points. The parameter inside
specifies whether points should be sampled inside or outside (Default) the
minimum convex polygon.
'range'
Absence points are created either inside or outside a provided additional
layer that indicates for example a range of species (controlled through
parameter inside
).
'zones'
A ratified (e.g. of type factor) SpatRaster
layer depicting zones from which absence
points are to be sampled. This method checks which points fall within which
zones and then samples absence points either within or outside these zones
exclusively. Both 'layer'
and 'inside'
have to be set for this
option.
'target'
Make use of a target background for sampling absence points.
Here a SpatRaster
object has to be provided through the parameter 'layer'
.
Absence points are then sampled exclusively within the target areas for grid
cells with non-zero values.
Renner IW, Elith J, Baddeley A, Fithian W, Hastie T, Phillips SJ, Popovic G, Warton DI. 2015. Point process models for presence-only analysis. Methods in Ecology and Evolution 6:366–379. DOI: 10.1111/2041-210X.12352.
Renner, I. W., & Warton, D. I. (2013). Equivalence of MAXENT and Poisson point process models for species distribution modeling in ecology. Biometrics, 69(1), 274-281.
## Not run: # This setting generates 10000 pseudo-absence points outside the # minimum convex polygon of presence points ass1 <- pseudoabs_settings(nrpoints = 10000, method = 'mcp', inside = FALSE) # This setting would match the number of presence-absence points directly. ass2 <- pseudoabs_settings(nrpoints = 0, min_ratio = 1) # These settings can then be used to add pseudo-absence data to a # presence-only dataset. This effectively adds these simulated absence # points to the resulting model all_my_points <- add_pseudoabsence( df = virtual_points, field_occurrence = 'observed', template = background, settings = ass1) ## End(Not run)
Renders DistributionModel to HTML
render_html(mod, file, title = NULL, author = NULL, notes = "-", ...) ## S4 method for signature 'ANY' render_html(mod, file, title = NULL, author = NULL, notes = "-", ...)
mod |
Any object belonging to DistributionModel |
file |
|
title |
|
author |
|
notes |
|
... |
Currently not used |
Renders an HTML file with several summaries of a trained DistributionModel. The file path must have an HTML file ending. The function creates a temporary Rmd file that gets rendered as an HTML file at the path given by the file argument.
Writes HTML file
## Not run: mod <- distribution(background) |> add_biodiversity_poipo(species) |> add_predictors(predictors) |> engine_glmnet() |> train() render_html(mod, file = "Test.html") ## End(Not run)
Remove a particular dataset (or all) from a distribution object with
a BiodiversityDatasetCollection
.
rm_biodiversity(x, name, id) ## S4 method for signature 'BiodiversityDistribution' rm_biodiversity(x, name, id)
x |
|
name |
A |
id |
A |
## Not run: distribution(background) |> add_biodiversity_poipa(species, "Duckus communus") |> rm_biodiversity(name = "Duckus communus") ## End(Not run)
This function allows removing previously set control options from an existing distribution object.
rm_control(x) ## S4 method for signature 'BiodiversityDistribution' rm_control(x)
x |
distribution (i.e. |
Other control:
rm_limits()
## Not run: x <- distribution(background) |> add_predictors(covariates) |> add_control_bias(method = "proximity") x <- x |> rm_control() x ## End(Not run)
This is just a wrapper function for removing a specified latent effect from a BiodiversityDistribution object.
rm_latent(x) ## S4 method for signature 'BiodiversityDistribution' rm_latent(x) ## S4 method for signature 'BiodiversityScenario' rm_latent(x)
x |
|
Removes a latent spatial effect from a distribution
object.
add_latent_spatial
## Not run: rm_latent(model) -> model ## End(Not run)
This function allows removing previously set limits from an existing distribution object.
rm_limits(x) ## S4 method for signature 'BiodiversityDistribution' rm_limits(x)
x |
distribution (i.e. |
Other control:
rm_control()
## Not run: x <- distribution(background) |> add_predictors(covariates) |> add_limits_extrapolation(method = "zones", layer = zones) x <- x |> rm_limits() x ## End(Not run)
This is just a wrapper function for removing specified offsets from a BiodiversityDistribution object.
rm_offset(x, layer = NULL) ## S4 method for signature 'BiodiversityDistribution' rm_offset(x, layer = NULL)
x |
|
layer |
A |
Removes an offset from a distribution
object.
Other offset:
add_offset()
,
add_offset_bias()
,
add_offset_elevation()
,
add_offset_range()
## Not run: rm_offset(model) -> model ## End(Not run)
Remove a particular variable from a distribution object with
a PredictorDataset
. See Examples.
rm_predictors(x, names) ## S4 method for signature 'BiodiversityDistribution,character' rm_predictors(x, names)
x |
|
names |
|
## Not run: distribution(background) |> add_predictors(my_covariates) |> rm_predictors(names = "Urban") ## End(Not run)
This function allows removing priors from an existing distribution object. In order to remove a set prior, the name of the prior has to be specified.
rm_priors(x, names = NULL, ...) ## S4 method for signature 'BiodiversityDistribution' rm_priors(x, names = NULL, ...)
x |
distribution (i.e. |
names |
|
... |
Other parameters passed down |
Other prior:
BARTPrior()
,
BARTPriors()
,
BREGPrior()
,
BREGPriors()
,
GDBPrior()
,
GDBPriors()
,
GLMNETPrior()
,
GLMNETPriors()
,
INLAPrior()
,
INLAPriors()
,
STANPrior()
,
STANPriors()
,
XGBPrior()
,
XGBPriors()
,
add_priors()
,
get_priors()
,
priors()
## Not run: # Add prior pp <- GLMNETPrior("forest") x <- distribution(background) |> add_priors(pp) # Remove again x <- x |> rm_priors("forest") ## End(Not run)
Some computations take a considerable amount of time to execute.
This function provides a helper wrapper for running functions of the
apply
family to specified outputs.
run_parallel( X, FUN, cores = 1, approach = "future", export_packages = NULL, ... )
X |
A |
FUN |
A |
cores |
A numeric of the number of cores to use (Default: |
approach |
|
export_packages |
A |
... |
Any other parameter passed on. |
By default, the parallel package is used for parallel computation, however an option exists to use the future package instead.
## Not run: run_parallel(list, mean, cores = 4) ## End(Not run)
This function fits a stan model using the light-weight interface provided by cmdstanr. The code was adapted from McElreath's rethinking package.
run_stan( model_code, data = list(), algorithm = "sampling", chains = 4, cores = getOption("ibis.nthread"), threads = 1, iter = 1000, warmup = floor(iter/2), control = list(adapt_delta = 0.95), cpp_options = list(), force = FALSE, path = base::getwd(), save_warmup = TRUE, ... )
model_code |
A |
data |
A |
algorithm |
A |
chains |
A |
cores |
Number of threads for sampling. Default set to |
threads |
|
iter |
A |
warmup |
|
control |
A |
cpp_options |
A |
force |
|
path |
|
save_warmup |
A |
... |
Other non-specified parameters. |
A rstan object
rethinking R package
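A minimal sketch with a toy Stan model (the model code is illustrative only and requires a working cmdstanr setup):

## Not run: 
mc <- "data { int N; vector[N] y; } parameters { real mu; } model { mu ~ normal(0, 5); y ~ normal(mu, 1); }"
fit <- run_stan(mc, data = list(N = 10, y = rnorm(10)), chains = 2, iter = 1000)
## End(Not run)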
Prepared covariates often have special characters in their variable names which may not be usable in formulas or can cause errors for certain engines. This function converts special characters in variable names into a safe format.
sanitize_names(names)
names |
A vector of character names to sanitize. |
A vector of sanitized character names.
# Correct variable names vars <- c("Climate-temperature2015", "Elevation__sealevel", "Landuse.forest..meanshare") sanitize_names(vars)
This function creates a new BiodiversityScenario object that contains the projections of a model.
scenario(fit, limits = NULL, reuse_limits = FALSE, copy_model = FALSE) ## S4 method for signature 'ANY' scenario(fit, limits = NULL, reuse_limits = FALSE, copy_model = FALSE)
fit |
A |
limits |
A |
reuse_limits |
A |
copy_model |
A |
If a limit has been defined already during train()
, for example by adding
an extrapolation limit via add_limits_extrapolation()
, this zonal layer can be
reused for the projections. Note: This effectively fixes the projections to certain areas.
## Not run: scenario(fit, limits = island_area) ## End(Not run)
This function allows one - out of a character
vector with the
names of an already added PredictorDataset
object - to select a
particular set of predictors. See Examples.
sel_predictors(x, names) ## S4 method for signature 'BiodiversityDistribution,character' sel_predictors(x, names)
x |
|
names |
|
## Not run: distribution(background) |> add_predictors(my_covariates) |> sel_predictors(names = c("Forest", "Elevation")) ## End(Not run)
This function simply allows adding priors to an existing
distribution object. The supplied priors must be a PriorList
object created through calling priors.
set_priors(x, priors = NULL, ...) ## S4 method for signature 'BiodiversityDistribution' set_priors(x, priors = NULL, ...)
x |
distribution (i.e. |
priors |
A |
... |
Other parameters passed down. |
Alternatively, priors for environmental predictors can also be added directly as a parameter via add_predictors.
Other prior:
BARTPrior()
,
BARTPriors()
,
BREGPrior()
,
BREGPriors()
,
GDBPrior()
,
GDBPriors()
,
GLMNETPrior()
,
GLMNETPriors()
,
INLAPrior()
,
INLAPriors()
,
STANPrior()
,
STANPriors()
,
XGBPrior()
,
XGBPriors()
,
get_priors()
,
priors()
,
rm_priors()
## Not run: pp <- GLMNETPrior("forest") x <- distribution(background) |> add_priors(pp) ## End(Not run)
Basic R6 object for the Settings class, a list that stores settings related to model training.
new()
Initializes the object and creates an empty list
Settings$new()
NULL
print()
Print the names and properties of all settings contained within
Settings$print()
A message on screen
show()
Shows the name and the settings
Settings$show()
A character
of the name and settings.
length()
Number of options
Settings$length()
A numeric
with the number of options.
duration()
Computation duration convenience function
Settings$duration()
The amount of time passed for model fitting if found.
summary()
Summary call of the contained parameters
Settings$summary()
A list
with the parameters in this object.
get()
Get a specific setting
Settings$get(what)
what
A character
with the respective setting.
The setting if found in the object.
set()
Set new settings
Settings$set(what, x, copy = FALSE)
Invisibly updates the setting in the object.
clone()
The objects of this class are cloneable with this method.
Settings$clone(deep = FALSE)
deep
Whether to make a deep clone.
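A minimal usage sketch based on the accessors documented above (the "iterations" key is purely illustrative):
## Not run:
s <- Settings$new()
# Store and retrieve a setting
s$set("iterations", 100)
s$get("iterations")
# Number of stored options
s$length()
## End(Not run)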
Calculate the environmental similarity of the provided covariates with respect to a reference dataset. Currently supported are the Multivariate Environmental Similarity (MESS) index and the multivariate combination novelty index (NT2) based on the Mahalanobis divergence (see references).
similarity( obj, ref, ref_type = "poipo", method = "mess", predictor_names = NULL, full = FALSE, plot = TRUE, ... ) ## S4 method for signature 'BiodiversityDistribution' similarity( obj, ref, ref_type = "poipo", method = "mess", predictor_names = NULL, full = FALSE, plot = TRUE, ... ) ## S4 method for signature 'SpatRaster' similarity( obj, ref, ref_type = "poipo", method = "mess", predictor_names = NULL, full = FALSE, plot = TRUE, ... )
obj |
A BiodiversityDistribution or SpatRaster object. |
ref |
A reference dataset against which the similarity is calculated. |
ref_type |
A character with the type of biodiversity dataset used as reference (Default: "poipo"). |
method |
A specific method for similarity calculation. Currently supported: "mess" (Multivariate Environmental Similarity) and the multivariate combination novelty index (NT2). See Details. |
predictor_names |
An optional character vector with the names of the predictors to use. |
full |
Should similarity values be returned for all variables (Default: FALSE)? |
plot |
Should the result be plotted? Otherwise return the output list (Default: TRUE). |
... |
Other options (non-specified). |
similarity implements the MESS algorithm described in Appendix S3 of Elith et al. (2010) as well as the Mahalanobis dissimilarity described in Mesgaran et al. (2014).
This function returns a list containing:
similarity: a SpatRaster object with multiple layers giving the environmental similarities for each variable in x (only included when full = TRUE);
mis: a SpatRaster layer giving the minimum similarity value across all variables for each location (i.e. the MESS);
exip: a SpatRaster layer indicating whether any model would interpolate or extrapolate to this location based on the environmental surface;
mod: a factor SpatRaster layer indicating which variable was most dissimilar to its reference range (i.e. the MoD map, Elith et al. 2010); and
mos: a factor SpatRaster layer indicating which variable was most similar to its reference range.
Elith, J., Kearney, M., and Phillips, S. (2010) "The art of modelling range-shifting species". Methods in Ecology and Evolution, 1: 330-342. https://doi.org/10.1111/j.2041-210X.2010.00036.x
Mesgaran, M.B., Cousens, R.D. and Webber, B.L. (2014) "Here be dragons: a tool for quantifying novelty due to covariate range and correlation change when projecting species distribution models". Diversity and Distributions, 20: 1147-1159. https://doi.org/10.1111/ddi.12209
dismo R-package.
## Not run: plot( similarity(x) # Where x is a distribution or Raster object ) ## End(Not run)
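A hedged sketch with explicit inputs, where covariates is a SpatRaster of predictors and points a reference occurrence dataset (both illustrative objects):
## Not run: sim <- similarity(covariates, ref = points, method = "mess", plot = FALSE) ## End(Not run)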
This function adds a flag to a BiodiversityScenario object to indicate that species abundances are to be simulated based on the expected habitat suitability, as well as demography, density-dependence and dispersal information. The simulation is done using the steps package (Visintin et al. 2020) and conducted after a habitat suitability projection has been created. steps is a spatially explicit population modelling framework coded mostly in R.
For a detailed description of steps parameters, please see the respective reference and help files. Default assumptions underlying this wrapper are presented in the Details.
simulate_population_steps( mod, vital_rates, replicates = 1, carrying_capacity = NULL, initial = NULL, dispersal = NULL, density_dependence = NULL, include_suitability = TRUE ) ## S4 method for signature 'BiodiversityScenario,matrix' simulate_population_steps( mod, vital_rates, replicates = 1, carrying_capacity = NULL, initial = NULL, dispersal = NULL, density_dependence = NULL, include_suitability = TRUE )
mod |
A BiodiversityScenario object with specified predictors. |
vital_rates |
A square demographic transition matrix. Should have column and row names equivalent to the vital stages that are to be estimated. |
replicates |
A numeric with the number of replicate simulations to run (Default: 1). |
carrying_capacity |
Either a SpatRaster or a numeric estimate of the maximum carrying capacity (Default: NULL). |
initial |
A SpatRaster with the initial population size (Default: NULL, in which case it is estimated from the suitability; see Details). |
dispersal |
A dispersal object defined by the steps package (Default: NULL). |
density_dependence |
Specification of density dependence defined by the steps package (Default: NULL). |
include_suitability |
A logical of whether the projected suitability should be included in the simulation (Default: TRUE). |
In order for this function to work the steps package has to be installed separately. Instructions to do so can be found on GitHub.
If initial population lifestages are not provided, they are estimated assuming a linear scaling with suitability, a 50:50 split between sexes and a 1:3 ratio of adults to juveniles. The provision of different parameters is highly encouraged!
Adds a flag to a BiodiversityScenario object to indicate that further simulations are added during projection.
The steps package has multiple options for simulating species populations and not all possible options are represented in this wrapper. Furthermore, the package still makes use of the raster package for much of its internal data processing. Since ibis.iSDM switched to terra a while ago, there can be efficiency problems as layers need to be translated between packages.
Visintin, C., Briscoe, N. J., Woolley, S. N., Lentini, P. E., Tingley, R., Wintle, B. A., & Golding, N. (2020). steps: Software for spatially and temporally explicit population simulations. Methods in Ecology and Evolution, 11(4), 596-603. https://doi.org/10.1111/2041-210X.13354
Other constraint:
add_constraint()
,
add_constraint_MigClim()
,
add_constraint_adaptability()
,
add_constraint_boundary()
,
add_constraint_connectivity()
,
add_constraint_dispersal()
,
add_constraint_minsize()
,
add_constraint_threshold()
## Not run: # Define vital rates vt <- matrix(c(0.0,0.5,0.75, 0.5,0.2,0.0, 0.0,0.5,0.9), nrow = 3, ncol = 3, byrow = TRUE) colnames(vt) <- rownames(vt) <- c('juvenile','subadult','adult') # Assumes that a trained 'model' object exists mod <- scenario(model) |> add_predictors(env = predictors, transform = 'scale', derivates = "none") |> # Use Vital rates here, but note the other parameters! simulate_population_steps(vital_rates = vt) |> project() ## End(Not run)
Similar to partial, this function calculates the partial response of a trained model for a given variable. Different from partial, however, the result is a SpatRaster showing the spatial magnitude of the partial response.
spartial(mod, x.var, constant = NULL, newdata = NULL, plot = FALSE, ...) ## S4 method for signature 'ANY,character' spartial(mod, x.var, constant = NULL, newdata = NULL, plot = FALSE, ...) spartial.DistributionModel(mod, ...)
mod |
A trained DistributionModel object with a fitted model. |
x.var |
A character indicating the variable for which a partial effect is to be calculated. |
constant |
A numeric constant to be inserted for all other variables. Default calculates the mean per variable. |
newdata |
An optional data.frame with new data to be used for the prediction (Default: NULL). |
plot |
A logical indicating whether the result should be plotted (Default: FALSE). |
... |
Other engine specific parameters. |
By default the mean is calculated across all parameters that are not x.var. Instead a constant can be set (for instance 0) to be applied to the output.
A SpatRaster containing the mapped partial response of the variable.
## Not run: # Create and visualize the spartial effect spartial(fit, x.var = "Forest.cover", plot = TRUE) ## End(Not run)
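As noted in the Details, a constant can be supplied for all non-target variables; a sketch:
## Not run: spartial(fit, x.var = "Forest.cover", constant = 0, plot = TRUE) ## End(Not run)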
This helper function shows the code from a trained DistributionModel using engine_stan. This function is modelled after similar functionality in the brms R-package. It only works with models inferred with Stan!
stancode(obj, ...) stancode.DistributionModel(obj, ...)
obj |
Any prepared object. |
... |
not used. |
None.
rstan, cmdstanr, brms
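A minimal sketch, assuming fit is a DistributionModel trained with engine_stan:
## Not run: stancode(fit) ## End(Not run)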
Function to create a new prior for engine_stan models. Priors currently can be set on specific environmental predictors.
STANPrior(variable, type, hyper = c(0, 2), ...) ## S4 method for signature 'character,character' STANPrior(variable, type, hyper = c(0, 2), ...)
variable |
A character with the name of the variable for which the prior is set. |
type |
A character with the type of prior, e.g. "normal". |
hyper |
A vector with the hyper-parameters of the prior (Default: c(0, 2)). |
... |
Variables passed on to prior object. |
Lemoine, N. P. (2019). Moving beyond noninformative priors: why and how to choose weakly informative priors in Bayesian analyses. Oikos, 128(7), 912-928.
Carpenter, B., Gelman, A., Hoffman, M. D., Lee, D., Goodrich, B., Betancourt, M., ... & Riddell, A. (2017). Stan: A probabilistic programming language. Journal of statistical software, 76(1), 1-32.
Other prior:
BARTPrior()
,
BARTPriors()
,
BREGPrior()
,
BREGPriors()
,
GDBPrior()
,
GDBPriors()
,
GLMNETPrior()
,
GLMNETPriors()
,
INLAPrior()
,
INLAPriors()
,
STANPriors()
,
XGBPrior()
,
XGBPriors()
,
add_priors()
,
get_priors()
,
priors()
,
rm_priors()
## Not run: pp <- STANPrior("forest", "normal", c(0,1)) ## End(Not run)
This is a helper function to specify several STANPrior with the same hyper-parameters, but different variables.
STANPriors(variables, type, hyper = c(0, 2), ...) ## S4 method for signature 'vector,character' STANPriors(variables, type, hyper = c(0, 2), ...)
variables |
A vector of characters with the names of the variables for which the priors are set. |
type |
A character with the type of prior, e.g. "normal". |
hyper |
A vector with the hyper-parameters applied to all variables (Default: c(0, 2)). |
... |
Variables passed on to prior object. |
Other prior:
BARTPrior()
,
BARTPriors()
,
BREGPrior()
,
BREGPriors()
,
GDBPrior()
,
GDBPriors()
,
GLMNETPrior()
,
GLMNETPriors()
,
INLAPrior()
,
INLAPriors()
,
STANPrior()
,
XGBPrior()
,
XGBPriors()
,
add_priors()
,
get_priors()
,
priors()
,
rm_priors()
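Analogous to STANPrior, a sketch specifying the same normal prior for several (illustrative) variables:
## Not run: pp <- STANPriors(c("forest", "cropland"), "normal", c(0, 1)) ## End(Not run)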
This helper function summarizes a given object, including DistributionModel, PredictorDataset or PriorList objects and others. This can be a helpful way to summarize what is contained within an object and the values of specified models.
When unsure, it is usually a good strategy to run summary on any object.
## S3 method for class 'distribution' summary(object, ...) ## S3 method for class 'DistributionModel' summary(object, ...) ## S3 method for class 'PredictorDataset' summary(object, ...) ## S3 method for class 'BiodiversityScenario' summary(object, ...) ## S3 method for class 'PriorList' summary(object, ...) ## S3 method for class 'Settings' summary(object, ...)
object |
Any prepared object. |
... |
not used. |
## Not run: # Example with a trained model x <- distribution(background) |> # Presence-absence data add_biodiversity_poipa(surveydata) |> # Add predictors and scale them add_predictors(env = predictors) |> # Use glmnet and lasso regression for estimation engine_glmnet(alpha = 1) # Train the model mod <- train(x) summary(mod) # Example with a prior object p1 <- BREGPrior(variable = "forest", hyper = 2, ip = NULL) p2 <- BREGPrior(variable = "cropland", hyper = NULL, ip = 1) pp <- priors(p1,p2) summary(pp) ## End(Not run)
For most species distribution modelling approaches it is assumed that occurrence records are unbiased, which is rarely the case. While model-based control can alleviate some of the effects of sampling bias, it can often be desirable to account for some sampling biases through spatial thinning (Aiello‐Lammens et al. 2015). This is an approach based on the assumption that over-sampled grid cells contribute little more than bias, rather than strengthening any environmental responses. This function provides some methods to apply spatial thinning approaches. Note that this effectively removes data prior to any estimation and its use should be considered with care (see also Steen et al. 2021).
thin_observations( data, background, env = NULL, method = "random", remainpoints = 10, mindistance = NULL, zones = NULL, probs = 0.75, global = TRUE, centers = NULL, verbose = TRUE )
data |
A sf object with the observed occurrence points. |
background |
A SpatRaster object with the background extent of the study region. |
env |
A SpatRaster object with environmental covariates, required for the "bias" and "environmental" methods (Default: NULL). |
method |
A character of the thinning method to be applied. See Details (Default: "random"). |
remainpoints |
A numeric with the minimum number of points to remain per over-sampled cell or area (Default: 10). |
mindistance |
A numeric with the minimum distance between points, required for the "spatial" method (Default: NULL). |
zones |
A SpatRaster with delineated zones, required for the "zones" method (Default: NULL). |
probs |
A numeric with the quantile used to identify over-sampled or biased cells (Default: 0.75). |
global |
A logical of whether the over-sampling threshold should be determined globally across the dataset (Default: TRUE). |
centers |
A numeric with the number of cluster centers for the "environmental" (k-means) method (Default: NULL). |
verbose |
A logical of whether messages should be printed (Default: TRUE). |
All methods only remove points from "over-sampled" grid cells/areas. These are defined as all cells/areas which either have more points than remainpoints or more points than the global minimum point count per cell/area (whichever is larger).
Currently implemented thinning methods:
"random": Samples at random across all over-sampled grid cells, returning only "remainpoints" from over-sampled cells. Does not account for any spatial or environmental distance between observations.
"bias": Explicitly removes only points that are considered biased (based on "env"). Points are only thinned from grid cells which are above the bias quantile (larger values equal greater bias). Thins the observations returning "remainpoints" from each over-sampled and biased cell.
"zones": Thins observations from each zone that is above the over-sampled threshold and returns "remainpoints" for each zone. Careful: If the zones are relatively wide this can remove quite a few observations.
"environmental": This approach creates an observation-wide clustering (k-means) under the assumption that the full environmental niche has been comprehensively sampled and is covered by the provided covariates env. For each over-sampled cluster, "remainpoints" are then retained through thinning.
"spatial": Calculates the spatial distance between all observations. Points are then removed iteratively until the minimum distance between points is crossed. The "mindistance" parameter has to be set for this function to work.
Aiello‐Lammens, M. E., Boria, R. A., Radosavljevic, A., Vilela, B., & Anderson, R. P. (2015). spThin: an R package for spatial thinning of species occurrence records for use in ecological niche models. Ecography, 38(5), 541-545.
Steen, V. A., Tingley, M. W., Paton, P. W., & Elphick, C. S. (2021). Spatial thinning and class balancing: Key choices lead to variation in the performance of species distribution models with citizen science data. Methods in Ecology and Evolution, 12(2), 216-226.
## Not run: # Thin a certain number of observations # At random thin_points <- thin_observations(points, background, method = "random") # using a bias layer thin_points <- thin_observations(points, background, method = "bias", env = bias) ## End(Not run)
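A sketch of the "spatial" method, which requires mindistance to be set (the value is illustrative and in the units of the coordinate system):
## Not run: thin_points <- thin_observations(points, background, method = "spatial", mindistance = 10000) ## End(Not run)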
It is common in many applications of species distribution modelling that estimated continuous suitability surfaces are converted into discrete representations of where suitable habitat might or might not exist. This so-called thresholding can be done in various ways which are further described in the Details.
In case a SpatRaster is provided as input to obj, it is furthermore necessary to provide a sf object for validation as there is no DistributionModel to read this information from.
Note: This of course also allows to estimate the threshold based on withheld data, for instance those created from an a-priori cross-validation procedure.
For BiodiversityScenario objects, adding this function to the processing pipeline stores a threshold attribute in the created scenario object. A set threshold() simply indicates that the projection should create and use thresholds as part of the results. The threshold values for this are either taken from the provided model or through an optionally provided parameter value.
If instead the aim is to apply thresholds to each step of the suitability projection, see add_constraint_threshold().
threshold( obj, method = "mtp", value = NULL, point = NULL, field_occurrence = "observed", format = "binary", return_threshold = FALSE, ... ) ## S4 method for signature 'ANY' threshold( obj, method = "mtp", value = NULL, point = NULL, field_occurrence = "observed", format = "binary", return_threshold = FALSE, ... ) ## S4 method for signature 'SpatRaster' threshold( obj, method = "fixed", value = NULL, point = NULL, field_occurrence = "observed", format = "binary", return_threshold = FALSE ) ## S4 method for signature 'BiodiversityScenario' threshold( obj, method = "mtp", value = NULL, point = NULL, field_occurrence = "observed", format = "binary", return_threshold = FALSE, ... )
obj |
A trained DistributionModel, a SpatRaster, or a BiodiversityScenario object. |
method |
A specific method for thresholding. See Details for available options. |
value |
A numeric value used as threshold, required by some methods such as "fixed" or "percentile" (Default: NULL). |
point |
A sf object with observed occurrence points, required when a SpatRaster is supplied (Default: NULL). |
field_occurrence |
A character with the name of the column containing the occurrence information (Default: "observed"). |
format |
A character with the output format, e.g. "binary" (Default: "binary"). |
return_threshold |
Should the threshold value be returned instead (Default: FALSE)? |
... |
Any other parameter. Used to fetch value if set somehow. |
The following options are currently implemented:
'fixed' = applies a single pre-determined threshold. Requires value to be set.
'mtp' = minimum training presence is used to find and set the lowest predicted suitability for any occurrence point.
'percentile' = For a percentile threshold. A value has to be set here as parameter.
'min.cv' = Threshold the raster so as to minimize the coefficient of variation (cv) of the posterior. Uses the lowest tercile of the cv in space. Only feasible with Bayesian engines.
'TSS' = Determines the optimal TSS (True Skill Statistic). Requires the "modEvA" package to be installed.
'kappa' = Determines the optimal kappa value (Kappa). Requires the "modEvA" package to be installed.
'F1score' = Determines the optimal F1score (also known as Sorensen similarity). Requires the "modEvA" package to be installed.
'Sensitivity' = Determines the optimal sensitivity of presence records. Requires the "modEvA" package to be installed.
'Specificity' = Determines the optimal specificity of absence records. Requires the "modEvA" package to be installed.
'AUC' = Determines the optimal AUC of presence records. Requires the "modEvA" package to be installed.
'kmeans' = Determines a threshold based on a 2-cluster k-means clustering. The presence class is assumed to be the cluster with the larger mean.
A SpatRaster if a SpatRaster object was provided as input. Otherwise the threshold is added to the respective DistributionModel or BiodiversityScenario object.
Lawson, C.R., Hodgson, J.A., Wilson, R.J., Richards, S.A., 2014. Prevalence, thresholds and the performance of presence-absence models. Methods Ecol. Evol. 5, 54–64. https://doi.org/10.1111/2041-210X.12123
Liu, C., White, M., Newell, G., 2013. Selecting thresholds for the prediction of species occurrence with presence-only data. J. Biogeogr. 40, 778–789. https://doi.org/10.1111/jbi.12058
Muscatello, A., Elith, J., Kujala, H., 2021. How decisions about fitting species distribution models affect conservation outcomes. Conserv. Biol. 35, 1309–1320. https://doi.org/10.1111/cobi.13669
"modEvA"
## Not run: # Where mod is an estimated DistributionModel tr <- threshold(mod) tr$plot_threshold() ## End(Not run)
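A sketch of a percentile threshold, which requires value to be set (the 0.1 here is illustrative):
## Not run: tr <- threshold(mod, method = "percentile", value = 0.1) ## End(Not run)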
This function trains a distribution()
model with the specified
engine and furthermore has some generic options that apply to all engines
(regardless of type). See Details with regards to such options.
Users are advised to check the help files for individual engines for advice on how the estimation is being done.
train( x, runname, filter_predictors = "none", optim_hyperparam = FALSE, inference_only = FALSE, only_linear = TRUE, method_integration = "predictor", keep_models = TRUE, aggregate_observations = TRUE, clamp = FALSE, verbose = getOption("ibis.setupmessages", default = TRUE), ... ) ## S4 method for signature 'BiodiversityDistribution' train( x, runname, filter_predictors = "none", optim_hyperparam = FALSE, inference_only = FALSE, only_linear = TRUE, method_integration = "predictor", keep_models = TRUE, aggregate_observations = TRUE, clamp = TRUE, verbose = getOption("ibis.setupmessages", default = TRUE), ... )
x |
distribution() (i.e. BiodiversityDistribution) object. |
runname |
A character name of the trained run. |
filter_predictors |
A character defining if and how highly correlated predictors are to be removed prior to any model estimation (Default: "none"). See Details for available options. |
optim_hyperparam |
Parameter to tune the model by iterating over input parameters or selection of predictors included in each iteration. Can be set to TRUE (Default: FALSE). |
inference_only |
By default the engine is used to create a spatial prediction of the suitability surface, which can take time. If only inferences of the strength of relationship between covariates and observations are required, this parameter can be set to TRUE (Default: FALSE). |
only_linear |
Fit model only on linear baselearners and functions. Depending on the engine, setting this option to FALSE will result in non-linear relationships between observations and covariates (Default: TRUE). |
method_integration |
A character with the method for integrating multiple biodiversity datasets whenever the engine does not support joint likelihood estimation (Default: "predictor"). See Details. |
keep_models |
A logical of whether all models should be kept in the output when multiple datasets are integrated (Default: TRUE). |
aggregate_observations |
A logical of whether observations falling in the same grid cell should be aggregated (Default: TRUE). |
clamp |
A logical of whether predictions should be clamped to the range of predictor values observed during model training (Default: FALSE). See Details. |
verbose |
Setting this logical value to TRUE prints out further information during the model fitting (Default: getOption("ibis.setupmessages")). |
... |
further arguments passed on. |
This function acts as a generic training function that - based on the
provided BiodiversityDistribution
object creates a new distribution model.
The resulting object contains both a "fit_best"
object of the estimated
model and, if inference_only
is FALSE
a SpatRaster object named
"prediction"
that contains the spatial prediction of the model. These
objects can be requested via object$get_data("fit_best")
.
Other parameters in this function:
"filter_predictors"
The parameter can be set to various options to
remove highly correlated variables or those with little additional information
gain from the model prior to any estimation. Available options are "none"
(Default) "pearson"
for applying a 0.7
correlation cutoff, "abess"
for the regularization framework by Zhu et al. (2020), or "RF"
or
"randomforest"
for removing the least important variables according to a
randomForest model. Note: This function is only applied on predictors for
which no prior has been provided (e.g. potentially non-informative ones).
"optim_hyperparam"
This option allows to make use of hyper-parameter
search for several models, which can improve prediction accuracy although through
the a substantial increase in computational cost.
"method_integration"
Only relevant if more than one BiodiversityDataset
is supplied and when the engine does not support joint integration of likelihoods.
See also Miller et al. (2019) in the references for more details on different types
of integration. Of course, if users want more control about this aspect, another
option is to fit separate models and make use of the add_offset, add_offset_range
and ensemble functionalities.
"clamp"
Boolean parameter to support a clamping of the projection predictors
to the range of values observed during model training.
A DistributionModel object.
There are no silver bullets in (correlative) species distribution modelling and for each model the analyst has to understand the objective, workflow and parameters than can be used to modify the outcomes. Different predictions can be obtained from the same data and parameters and not all necessarily make sense or are useful.
Miller, D.A.W., Pacifici, K., Sanderlin, J.S., Reich, B.J., 2019. The recent past and promising future for data integration methods to estimate species’ distributions. Methods Ecol. Evol. 10, 22–37. https://doi.org/10.1111/2041-210X.13110
Zhu, J., Wen, C., Zhu, J., Zhang, H., & Wang, X. (2020). A polynomial algorithm for best-subset selection problem. Proceedings of the National Academy of Sciences, 117(52), 33117-33123.
Leung, B., Hudgins, E. J., Potapova, A. & Ruiz‐Jaen, M. C. A new baseline for countrywide α‐diversity and species distributions: illustration using >6,000 plant species in Panama. Ecol. Appl. 29, 1–13 (2019).
engine_gdb, engine_xgboost, engine_bart, engine_inla, engine_inlabru, engine_breg, engine_stan, engine_glm
# Load example data background <- terra::rast(system.file('extdata/europegrid_50km.tif', package='ibis.iSDM',mustWork = TRUE)) # Get test species virtual_points <- sf::st_read(system.file('extdata/input_data.gpkg', package='ibis.iSDM',mustWork = TRUE),'points',quiet = TRUE) # Get list of test predictors ll <- list.files(system.file('extdata/predictors/', package = 'ibis.iSDM', mustWork = TRUE),full.names = TRUE) # Load them as rasters predictors <- terra::rast(ll);names(predictors) <- tools::file_path_sans_ext(basename(ll)) # Use a basic GLM to fit a SDM x <- distribution(background) |> # Presence-only data add_biodiversity_poipo(virtual_points, field_occurrence = "Observed") |> # Add predictors and scale them add_predictors(env = predictors, transform = "scale", derivates = "none") |> # Use GLM as engine engine_glm() # Train the model, Also filter out co-linear predictors using a pearson threshold mod <- train(x, only_linear = TRUE, filter_predictors = 'pearson') mod
The unwrap_model function uses terra::unwrap() to restore the raster layers of a previously wrapped DistributionModel object, making it easier to ship such objects.
unwrap_model(mod, verbose = getOption("ibis.setupmessages", default = TRUE)) ## S4 method for signature 'ANY' unwrap_model(mod, verbose = getOption("ibis.setupmessages", default = TRUE))
mod |
Provided DistributionModel object with wrapped raster layers. |
verbose |
A logical of whether messages should be shown (Default: getOption("ibis.setupmessages")). |
DistributionModel with unwrapped raster layers
wrap_model
## Not run: x <- distribution(background) |> add_biodiversity_poipo(virtual_points, field_occurrence = 'observed', name = 'Virtual points') |> add_predictors(pred_current, transform = 'scale',derivates = 'none') |> engine_xgboost(nrounds = 2000) |> train(varsel = FALSE, only_linear = TRUE) |> wrap_model() x <- unwrap_model(x) ## End(Not run)
This function conducts a model evaluation based either on the fitted point data or on any supplied independent data. Currently only point datasets are supported. For validation of integrated models more work is needed.
validate( mod, method = "continuous", layer = "mean", point = NULL, point_column = "observed", field_occurrence = NULL, ... ) ## S4 method for signature 'ANY' validate( mod, method = "continuous", layer = "mean", point = NULL, point_column = "observed", field_occurrence = NULL, ... ) ## S4 method for signature 'SpatRaster' validate( mod, method = "continuous", layer = NULL, point = NULL, point_column = "observed", field_occurrence = NULL, ... )
mod |
A fitted DistributionModel object with a prediction, or alternatively a SpatRaster. |
method |
Should the validation be conducted on the continuous prediction or a (previously calculated) thresholded layer in binary format? Note that depending on the method different metrics can be computed. See Details. |
layer |
In case multiple layers exist, which one to use? (Default: "mean"). |
point |
A sf object with independent observations used for validation (Default: NULL). |
point_column |
A character with the name of the column containing the observed values (Default: "observed"). |
field_occurrence |
(Deprecated) A character with the name of the occurrence column; use point_column instead. |
... |
Other parameters that are passed on. Currently unused. |
The validate function calculates different validation metrics depending on the output type.
The output metrics for each type are defined as follows (where TP stands for true positive, TN for true negative, FP for false positive and FN for false negative):
Continuous:
'n' = Number of observations.
'rmse' = Root Mean Square Error.
'mae' = Mean Absolute Error.
'logloss' = Log loss, TBD
'normgini' = Normalized Gini index, TBD
'cont.boyce' = Continuous Boyce index, the ratio of predicted against expected frequency of observations, calculated over a moving window across the range of predicted values.
Discrete:
'n' = Number of observations.
'auc' = Area under the curve, e.g. the integral of a function relating the true positive rate against the false positive rate.
'overall.accuracy' = Overall accuracy, (TP + TN) / n.
'true.presence.ratio' = True presence ratio or Jaccard index, TP / (TP + FP + FN).
'precision' = Precision or positive detection rate, TP / (TP + FP).
'sensitivity' = Sensitivity, the ratio of true positives against all positives, TP / (TP + FN).
'specificity' = Specificity, the ratio of true negatives against all negatives, TN / (TN + FP).
'tss' = True Skill Statistic, sensitivity + specificity - 1.
'f1' = F1 Score, the harmonic mean of precision and sensitivity, 2TP / (2TP + FP + FN).
'logloss' = Log loss, TBD
'expected.accuracy' = Expected (chance) accuracy, ((TP + FP)(TP + FN) + (FN + TN)(FP + TN)) / n^2.
'kappa' = Kappa value, (observed accuracy - expected accuracy) / (1 - expected accuracy).
'brier.score' = Brier score, the mean squared difference mean((p - o)^2), where p is the predicted probability of presence and o the observed presence or absence.
Returns a tidy tibble with validation results.
If you use the Boyce Index, please cite the original Hirzel et al. (2006) paper.
Allouche O., Tsoar A., Kadmon R., (2006). Assessing the accuracy of species distribution models: prevalence, kappa and the true skill statistic (TSS). Journal of Applied Ecology, 43(6), 1223–1232.
Liu, C., White, M., Newell, G., 2013. Selecting thresholds for the prediction of species occurrence with presence-only data. J. Biogeogr. 40, 778–789. https://doi.org/10.1111/jbi.12058
Hirzel, A. H., Le Lay, G., Helfer, V., Randin, C., & Guisan, A. (2006). Evaluating the ability of habitat suitability models to predict species presences. Ecological modelling, 199(2), 142-152.
## Not run: # Assuming that mod is a distribution object and has a thresholded layer mod <- threshold(mod, method = "TSS") validate(mod, method = "discrete") ## End(Not run)
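A sketch of validation against independent data, where indep_points is an illustrative sf object with an "observed" column:
## Not run: validate(mod, method = "continuous", point = indep_points, point_column = "observed") ## End(Not run)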
The wrap_model function uses terra::wrap() to wrap the raster layers of a DistributionModel object, making it easier to ship such objects (e.g. for saving).
wrap_model(mod, verbose = getOption("ibis.setupmessages", default = TRUE)) ## S4 method for signature 'ANY' wrap_model(mod, verbose = getOption("ibis.setupmessages", default = TRUE))
mod |
Provided DistributionModel object. |
verbose |
A logical of whether messages should be shown (Default: getOption("ibis.setupmessages")). |
DistributionModel with wrapped raster layers
unwrap_model
## Not run: x <- distribution(background) |> add_biodiversity_poipo(virtual_points, field_occurrence = 'observed', name = 'Virtual points') |> add_predictors(pred_current, transform = 'scale',derivates = 'none') |> engine_xgboost(nrounds = 2000) |> train(varsel = FALSE, only_linear = TRUE) x <- wrap_model(x) ## End(Not run)
The write_model function (as opposed to write_output) is a generic wrapper for writing a DistributionModel to disk. It is essentially a wrapper around saveRDS. Models can be loaded again via the load_model function.
write_model( mod, fname, slim = FALSE, verbose = getOption("ibis.setupmessages", default = TRUE) ) ## S4 method for signature 'ANY' write_model( mod, fname, slim = FALSE, verbose = getOption("ibis.setupmessages", default = TRUE) )
mod |
Provided DistributionModel object. |
fname |
A character depicting an output filename (ending in ".rds"). |
slim |
A logical of whether unnecessary entries in the model object should be removed before saving to reduce the file size (Default: FALSE). |
verbose |
A logical of whether messages should be shown (Default: getOption("ibis.setupmessages")). |
No R-output is created. A file is written to the target directory.
By default output files will be overwritten if already existing!
load_model
## Not run: x <- distribution(background) |> add_biodiversity_poipo(virtual_points, field_occurrence = 'observed', name = 'Virtual points') |> add_predictors(pred_current, transform = 'scale',derivates = 'none') |> engine_xgboost(nrounds = 2000) |> train(varsel = FALSE, only_linear = TRUE) write_model(x, "testmodel.rds") ## End(Not run)
The write_output
function is a generic wrapper to writing any
output files (e.g. projections) created with the ibis.iSDM-package
. It is
possible to write outputs of fitted DistributionModel
,
BiodiversityScenario
or individual terra
or stars
objects. In
case a data.frame
is supplied, the output is written as csv file.
For creating summaries of distribution and scenario parameters and performance, see write_summary().
write_output( mod, fname, dt = "FLT4S", verbose = getOption("ibis.setupmessages", default = TRUE), ... ) ## S4 method for signature 'ANY,character' write_output( mod, fname, dt = "FLT4S", verbose = getOption("ibis.setupmessages", default = TRUE), ... ) ## S4 method for signature 'BiodiversityScenario,character' write_output( mod, fname, dt = "FLT4S", verbose = getOption("ibis.setupmessages", default = TRUE), ... ) ## S4 method for signature 'SpatRaster,character' write_output( mod, fname, dt = "FLT4S", verbose = getOption("ibis.setupmessages", default = TRUE), ... ) ## S4 method for signature 'data.frame,character' write_output( mod, fname, dt = "FLT4S", verbose = getOption("ibis.setupmessages", default = TRUE), ... ) ## S4 method for signature 'stars,character' write_output( mod, fname, dt = "FLT4S", verbose = getOption("ibis.setupmessages", default = TRUE), ... )
mod |
Provided DistributionModel, BiodiversityScenario, SpatRaster, stars or data.frame object. |
fname |
A character depicting an output filename. |
dt |
A character with the datatype of the written raster output (Default: "FLT4S"). |
verbose |
A logical of whether messages should be shown (Default: getOption("ibis.setupmessages")). |
... |
Any other arguments passed on the individual functions. |
No R-output is created. A file is written to the target directory.
By default output files will be overwritten if already existing!
## Not run: x <- distribution(background) |> add_biodiversity_poipo(virtual_points, field_occurrence = 'observed', name = 'Virtual points') |> add_predictors(pred_current, transform = 'scale',derivates = 'none') |> engine_xgboost(nrounds = 2000) |> train(varsel = FALSE, only_linear = TRUE) write_output(x, "testmodel.tif") ## End(Not run)
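A sketch for writing a plain SpatRaster with an explicit datatype, where prediction is an illustrative SpatRaster object:
## Not run: write_output(prediction, "prediction.tif", dt = "FLT4S") ## End(Not run)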
The write_summary
function is a wrapper function to create
summaries from fitted DistributionModel
or BiodiversityScenario
objects. This function will extract parameters and statistics about the used
data from the input object and writes the output as either 'rds'
or
'rdata'
file. Alternative, more open file formats are under
consideration.
write_summary( mod, fname, partial = FALSE, verbose = getOption("ibis.setupmessages", default = TRUE), ... ) ## S4 method for signature 'ANY,character' write_summary( mod, fname, partial = FALSE, verbose = getOption("ibis.setupmessages", default = TRUE), ... )
mod |
Provided DistributionModel or BiodiversityScenario object. |
fname |
A character depicting an output filename, ending in 'rds' or 'rdata'. |
partial |
A logical of whether partial variable contributions should be calculated and added to the summary (Default: FALSE). |
verbose |
A logical of whether messages should be shown (Default: getOption("ibis.setupmessages")). |
... |
Any other arguments passed on the individual functions. |
No R-output is created. A file is written to the target directory.
No predictions or tabular data are saved through this function. Use write_output() to save those.
## Not run: x <- distribution(background) |> add_biodiversity_poipo(virtual_points, field_occurrence = 'observed', name = 'Virtual points') |> add_predictors(pred_current, transform = 'scale',derivates = 'none') |> engine_xgboost(nrounds = 2000) |> train(varsel = FALSE, only_linear = TRUE) write_summary(x, "testmodel.rds") ## End(Not run)
Function to include prior information as a monotonic constraint in an extreme gradient boosting model (engine_xgboost). Monotonic priors enforce directionality in the response to certain variables; however, specifying a monotonic constraint does not guarantee that the variable is not regularized out during model fitting.
XGBPrior(variable, hyper = "increasing", ...) ## S4 method for signature 'character,character' XGBPrior(variable, hyper = "increasing", ...)
variable |
A character with the name of the variable for which the prior is set. |
hyper |
A character indicating the direction of the monotonic constraint, e.g. "increasing" or "decreasing" (Default: "increasing"). |
... |
Variables passed on to prior object. |
Chen, T., He, T., Benesty, M., Khotilovich, V., Tang, Y., & Cho, H. (2015). Xgboost: extreme gradient boosting. R package version 0.4-2, 1(4), 1-4.
Other prior:
BARTPrior()
,
BARTPriors()
,
BREGPrior()
,
BREGPriors()
,
GDBPrior()
,
GDBPriors()
,
GLMNETPrior()
,
GLMNETPriors()
,
INLAPrior()
,
INLAPriors()
,
STANPrior()
,
STANPriors()
,
XGBPriors()
,
add_priors()
,
get_priors()
,
priors()
,
rm_priors()
## Not run: pp <- XGBPrior("forest", "increasing") ## End(Not run)
This is a helper function to specify several XGBPrior with the same hyper-parameters, but different variables.
XGBPriors(variable, hyper = "increasing", ...) ## S4 method for signature 'character' XGBPriors(variable, hyper = "increasing", ...)
variable |
A vector of characters with the names of the variables for which the priors are set. |
hyper |
A character indicating the direction of the monotonic constraint applied to all variables (Default: "increasing"). |
... |
Variables passed on to prior object. |
Other prior:
BARTPrior()
,
BARTPriors()
,
BREGPrior()
,
BREGPriors()
,
GDBPrior()
,
GDBPriors()
,
GLMNETPrior()
,
GLMNETPriors()
,
INLAPrior()
,
INLAPriors()
,
STANPrior()
,
STANPriors()
,
XGBPrior()
,
add_priors()
,
get_priors()
,
priors()
,
rm_priors()
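Analogous to XGBPrior, a sketch applying the same monotonic constraint to several (illustrative) variables:
## Not run: pp <- XGBPriors(c("forest", "cropland"), hyper = "increasing") ## End(Not run)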