likelihood_sparsesample.Rd
Compare model predictions against a set of sparse data points
g3l_sparsesample_linreg(
        fit = c('log', 'linear'),
        slope = 1,
        intercept = NULL )

g3l_sparsesample_sumsquares(
        weighting = "model_stddev" )

g3l_sparsesample(
        nll_name,
        obs_df,
        stocks,
        measurement_f = quote( wgt ),
        function_f = g3l_sparsesample_linreg(),
        predstocks = list(),
        area_group = NULL,
        weight = g3_parameterized(paste(
            if (length(predstocks) > 0) "csparse" else "asparse",
            function_f_name,
            nll_name,
            "weight",
            sep = "_"), optimise = FALSE, value = 1),
        run_at = g3_action_order$likelihood )
Formula substituted into surveyindices calculations to fix slope/intercept of linear regression, or NULL if not fixed. See below.
Is the fit 'log' or 'linear'? See below.
Weighting applied to sum-of-squares. One of "model_stddev", "obs_stddev" or a formula.
Character string, used to define the variable name for obsstock and modelstock.
By default set to (asparse|csparse)_(name_of_function_f)_(nll_name)_weight
.
Data.frame of observation data. See details.
A list of g3_stock
objects to collect sparsesample data for,
depending if stocks were provided.
formula to derive the model's equivalent predicted value for a data point.
You can use wgt
to refer to weight of matching individuals,
length
to refer to length of matching individuals.
A formula to compare obs_df to predicted values generated via measurement_f and generate nll, defined by one of the g3l_sparsesample_* functions.
A list of g3_stock
predator or fleet objects.
If present, we will compare against the model predicted catch.
Without (the default), we compare against overall abundance.
List mapping area names used in obs_df to integer model areas,
most likely generated by g3_areas
.
Weighting applied to this likelihood component. Default is a g3_param
that defaults to 1, allowing weights to be altered without recompiling.
Integer order that actions will be run within model, see g3_action_order
.
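The default name of the weight parameter is assembled with paste(), as shown in the usage above. A minimal base-R sketch of that naming rule follows. The helper default_weight_name is hypothetical, and the value "linreg" for function_f_name is an assumption made for illustration; gadget3 determines the actual value internally.

```r
# Hypothetical helper mirroring the paste() call in the default for `weight`
default_weight_name <- function(nll_name, function_f_name, predstocks = list()) {
    paste(
        # "csparse" when comparing against catch, "asparse" for abundance
        if (length(predstocks) > 0) "csparse" else "asparse",
        function_f_name,
        nll_name,
        "weight",
        sep = "_")
}

default_weight_name("bt", "linreg")
# [1] "asparse_linreg_bt_weight"
default_weight_name("bt", "linreg", predstocks = list("fleet"))
# [1] "csparse_linreg_bt_weight"
```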
The actions will define the following variables in your model, which could be reported with g3a_report_history
:
Observation mean, the mean column from obs_df
Observation standard deviation, the stddev column from obs_df
Observation number, the number column from obs_df
The corresponding model prediction vector, total datapoints. Use __model_sum / __model_n for the mean.
The corresponding model prediction vector, squared-sum datapoints.
The number of data points at each point in the model prediction vector. If predstocks is set, this is the number of individuals caught matching the datapoint (length/age/...), otherwise the abundance of individuals matching the datapoint.
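As a rough illustration of how the three model vectors relate, the sketch below recovers a mean and standard deviation from a total, a squared-sum and a count. It uses plain base R with made-up numbers and assumes the squared-sum vector holds the sum of squared values. The variable names are hypothetical stand-ins for the reported vectors, not the names used in a compiled model.

```r
# Made-up values: data point 1 aggregates (10, 14), data point 2 aggregates (20, 22, 24)
model_sum   <- c(24, 66)      # total of predicted values at each data point
model_sqsum <- c(296, 1460)   # sum of squared predicted values (assumed meaning)
model_n     <- c(2, 3)        # number of values contributing to each data point

model_mean   <- model_sum / model_n
# Population variance from the sum and squared-sum
model_var    <- model_sqsum / model_n - model_mean^2
model_stddev <- sqrt(model_var)

model_mean
# [1] 12 22
```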
data.frame of observation data. Unlike g3l_abundancedistribution, gaps and sparse data are accepted, and gaps will not be filled with zero.
For each row in the table, all matching predictions are aggregated. Aggregation columns include:
Required. The year the sample is from
Optional. The timestep/season the sample is from
Optional. Only aggregate predicted values from given area
Optional. Only aggregate predicted values with given age
Optional. Only aggregate predicted values with given length (matches nearest lengthgroup)
So, a row with "year=1998,age=4" will be compared against age 4 individuals of all lengths in 1998, steps 1 & 2. A row with "year=2004,step=1,age=2,length=19" will be compared against individuals of age 2, length 10..20, in winter 2004.
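The matching rule can be sketched in base R on a toy prediction table. This is an illustration only, not how gadget3 implements it: the table layout, the helper match_rows and the use of findInterval to pick the length group are all assumptions.

```r
# Toy table of model predictions: one row per year/step/age/length group.
# minlen is the lower bound of each length group (10..20 and 20..30)
pred <- expand.grid(
    year = c(1998, 2004), step = 1:2, age = 2:4, minlen = c(10, 20))

# Hypothetical matcher: columns absent from the observation row match everything
match_rows <- function(pred, obs_row, lengroups = c(10, 20)) {
    keep <- pred$year == obs_row$year
    if (!is.null(obs_row$step)) keep <- keep & pred$step == obs_row$step
    if (!is.null(obs_row$age)) keep <- keep & pred$age == obs_row$age
    if (!is.null(obs_row$length)) {
        # Map the observed length onto the length group that contains it
        grp <- lengroups[findInterval(obs_row$length, lengroups)]
        keep <- keep & pred$minlen == grp
    }
    pred[keep, ]
}

# No step or length given: both steps and both length groups match
nrow(match_rows(pred, list(year = 1998, age = 4)))
# [1] 4
# Fully specified row: a single prediction matches
nrow(match_rows(pred, list(year = 2004, step = 1, age = 2, length = 19)))
# [1] 1
```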
The observation data is provided in the following columns:
Required. Mean value at this data point
Optional. Number of data points, defaults to 1
Optional. Observed standard deviation (only required if weighting = "obs_stddev"
)
Returns a formula for use as function_f:
If fit = "log":
$$
\sum_{\it i}^{rows} (\alpha + \beta \log N_{i} - \log \frac{\nu_{i}}{P_{i}})^2
$$
If fit = "linear":
$$
\sum_{\it i}^{rows} (\alpha + \beta N_{i} - \frac{\nu_{i}}{P_{i}})^2
$$
\(N_i\): "mean" column from obs_df
\(\nu_i\): Total predicted values for all data points, i.e. nll_spabund_name__model_sum
\(P_i\): Number of data points, i.e. nll_spabund_name__model_n
\(\alpha\): intercept parameter, defaults to NULL, i.e. linear regression performed to find optimal value
\(\beta\): slope parameter, defaults to 1, i.e. fixed slope
If either \(\alpha\) or \(\beta\) is NULL, linear regression is performed on \(N\) vs \(\nu\) for each value in the table, and the optimal value is used for each missing parameter.
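The sums above can be evaluated by hand in base R. The sketch below is a hypothetical helper, not part of gadget3, and it assumes that a NULL slope or intercept is replaced by its ordinary least-squares estimate; the package may compute the optimum differently.

```r
# Hypothetical helper evaluating the linreg sum of squares for plain vectors
sparse_linreg_nll <- function(N, nu, P, fit = c("log", "linear"),
                              slope = 1, intercept = NULL) {
    fit <- match.arg(fit)
    x <- if (fit == "log") log(N) else N             # observed means
    y <- if (fit == "log") log(nu / P) else nu / P   # model means
    if (is.null(slope) && is.null(intercept)) {
        # Neither fixed: ordinary least squares of y on x
        slope <- cov(x, y) / var(x)
        intercept <- mean(y) - slope * mean(x)
    } else if (is.null(slope)) {
        # Intercept fixed: least-squares slope given that intercept
        slope <- sum(x * (y - intercept)) / sum(x^2)
    } else if (is.null(intercept)) {
        # Slope fixed: least-squares intercept given that slope
        intercept <- mean(y - slope * x)
    }
    sum((intercept + slope * x - y)^2)
}

N  <- c(10, 20, 30)    # "mean" column from obs_df
nu <- c(24, 66, 128)   # model totals
P  <- c(2, 3, 4)       # model counts, so nu / P is 12, 22, 32

# Intercept fitted (it comes out as 2), slope fixed at 1: residuals vanish
sparse_linreg_nll(N, nu, P, fit = "linear")
# [1] 0
# Both fixed: each residual is -2, so the sum of squares is 12
sparse_linreg_nll(N, nu, P, fit = "linear", slope = 1, intercept = 0)
# [1] 12
```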
Returns a formula for use as function_f:
$$ \sum_{\it i}^{rows} w (\frac{\nu_{i}}{P_{i}} - N_{i})^2 $$
\(N_i\): "mean" column from obs_df
\(\nu_i\): Total predicted values, i.e. nll_spabund_name__model_sum
\(P_i\): Number of data points, i.e. nll_spabund_name__model_n
\(w\): weighting parameter, either:
\(1 / \sigma^2\), using stddev of model predicted values if weighting = "model_stddev"
\(1 / \sigma^2\), using stddev column from obs_df if weighting = "obs_stddev"
A custom formula provided for weighting
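A worked instance of the weighted sum of squares, again in base R with made-up numbers. The helper sparse_sumsquares_nll is hypothetical and only evaluates the formula above; here the weights follow the "obs_stddev" case, taking \(w = 1 / \sigma^2\) from an assumed stddev column.

```r
# Hypothetical helper evaluating the weighted sum of squares for plain vectors
sparse_sumsquares_nll <- function(N, nu, P, w = 1) {
    sum(w * (nu / P - N)^2)
}

N  <- c(10, 20, 30)         # "mean" column from obs_df
nu <- c(24, 66, 128)        # model totals
P  <- c(2, 3, 4)            # model counts, so nu / P is 12, 22, 32
obs_stddev <- c(1, 2, 2)    # made-up "stddev" column from obs_df

# Unweighted: three residuals of 2, so 4 + 4 + 4
sparse_sumsquares_nll(N, nu, P)
# [1] 12
# Weighted by 1 / stddev^2: 4 * 1 + 4 * 0.25 + 4 * 0.25
sparse_sumsquares_nll(N, nu, P, w = 1 / obs_stddev^2)
# [1] 6
```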
library(gadget3)
library(magrittr)  # provides the %>% pipe used below
st <- g3_stock("fish", c(10, 20, 30)) %>% g3s_age(3,5)
# Generate some random sparsesample samples
obs_df <- data.frame(
    # NB: No 1993, we don't have any samples for that year
    year = rep(c(1990, 1991, 1992, 1994), each = 2),
    step = 1:2 )
obs_df$age <- floor(runif(nrow(obs_df), min = 3, max = 5.1))
obs_df$length <- floor(runif(nrow(obs_df), min = 10, max = 50))
obs_df$mean <- runif(nrow(obs_df), min = 10, max = 1000)
actions <- list(
    g3a_time(1990, 1994, c(6,6)),
    # Use otherfood to populate abundance / mean weight
    g3a_otherfood(st,
        quote( age * 100 + stock__minlen ),
        quote( cur_year * 1e5 + cur_step * 1e4 + 0 * stock__minlen ) ),
    g3l_sparsesample(
        "bt",
        obs_df,
        list(st),
        measurement_f = g3_formula(
            # Derive blubber thickness from length/weight
            ((wgt/(wmax.a * length^wmax.b) - 0.5) * 100 - 4.44) / (5693 * (length/wgt)^0.5),
            wmax.a = g3_parameterized("wmax.a", by_stock = TRUE),
            wmax.b = g3_parameterized("wmax.b", by_stock = TRUE),
            end = NULL ),
        function_f = g3l_sparsesample_linreg(fit = "linear") ),
    NULL )
model_fn <- g3_to_r(c(actions, list(
    g3a_report_detail(actions), # TODO: Not reporting anything useful
    NULL )))
r <- attributes(model_fn())
colSums(r$dstart_fish__num) # TODO: Report something related
#> time
#> age 1990-01 1990-02 1991-01 1991-02 1992-01 1992-02 1993-01 1993-02 1994-01
#> age3 960 960 960 960 960 960 960 960 960
#> age4 1260 1260 1260 1260 1260 1260 1260 1260 1260
#> age5 1560 1560 1560 1560 1560 1560 1560 1560 1560
#> time
#> age 1994-02
#> age3 960
#> age4 1260
#> age5 1560