Aggregate data from the database in a variety of ways

mfdb_area_size(mdb, params)
mfdb_area_size_depth(mdb, params)
mfdb_temperature(mdb, params)
mfdb_survey_index_mean(mdb, cols, params, scale_index = NULL)
mfdb_survey_index_total(mdb, cols, params, scale_index = NULL)
mfdb_sample_count(mdb, cols, params, scale_index = NULL)
mfdb_sample_meanlength(mdb, cols, params, scale_index = NULL)
mfdb_sample_meanlength_stddev(mdb, cols, params, scale_index = NULL)
mfdb_sample_totalweight(mdb, cols, params, measurements = c('overall'))
mfdb_sample_meanweight(mdb, cols, params, scale_index = NULL,
                       measurements = c('overall'))
mfdb_sample_meanweight_stddev(mdb, cols, params, scale_index = NULL,
                              measurements = c('overall'))
mfdb_sample_rawdata(mdb, cols, params, scale_index = NULL)
mfdb_sample_scaled(mdb, cols, params, abundance_scale = NULL, scale = 'tow_length')
mfdb_stomach_preycount(mdb, cols, params)
mfdb_stomach_preymeanlength(mdb, cols, params)
mfdb_stomach_preymeanweight(mdb, cols, params)
mfdb_stomach_preyweightratio(mdb, cols, params)
mfdb_stomach_presenceratio(mdb, cols, params)

Arguments

mdb

An object created by mfdb()

cols

Any additional columns to group by, see details.

params

A list of parameters, see details.

scale_index

Optional. The survey index used to scale results before aggregation: either "tow_length", "area_size", or the name of an index imported with mfdb_import_survey_index.

abundance_scale

Optional. Same as scale_index

scale

Optional. A scale to apply to the resulting values, e.g. 'tow_length'

measurements

Optional, default 'overall'. A vector of measurement names to use, each one of 'overall', 'liver', 'gonad' or 'stomach'.

Details

Each item in the params list either restricts the data that is returned or, if its name also appears in the cols vector or is 'year', 'timestep' or 'area', groups the data by that column.

If you are grouping by a column, its entry in params should be one of the following:

NULL

Don't do any grouping, instead put 'all' in the resulting column. For example, age = NULL results in "all".

character / numeric vector

Aggregate all samples together where they match. For example, year = 1990:2000 results in 1990, ... , 2000.

mfdb_unaggregated()

Don't do any aggregation for this column, return all possible values.

mfdb_group()

Group several discrete items together. For example, age = mfdb_group(young = 1:3, old = 4:5) results in "young" and "old".

mfdb_interval()

Group irregular ranges together. For example, length = mfdb_interval('len', c(0, 10, 100, 1000)) results in "len0", "len10", "len100" (1000 is the upper bound to len100).

mfdb_step_interval()

Group regular ranges together. For example, length = mfdb_step_interval('len', to = 100, by = 10) results in "len0", "len10", ... , "len90".
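
As a minimal sketch of how these options combine, the list below uses each grouping form once. It only builds the params list and does not query a database; the choice of which column gets which form is an illustrative assumption.

```r
library(mfdb)

# One entry per grouping form described above (column choices are illustrative)
params <- list(
    year    = 1990:2000,                                   # vector: one group per matching value
    area    = NULL,                                        # NULL: no grouping, column reads "all"
    species = mfdb_unaggregated(),                         # return all possible values
    age     = mfdb_group(young = 1:3, old = 4:5),          # named discrete groups
    length  = mfdb_step_interval('len', to = 100, by = 10) # regular ranges len0 ... len90
)

# Irregular ranges could be used for length instead
irregular_length <- mfdb_interval('len', c(0, 10, 100, 1000))

names(params)
```

Passing c('species', 'age', 'length') as cols alongside this list would group by those columns in addition to year and area.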

In addition, params can contain other items that only restrict the data that is returned:

institute

A vector of institute names / countries, see mfdb::institute for possible values

gear

A vector of gear names, see mfdb::gear for possible values

vessel

A vector of vessel names, see mfdb::vessel for possible values

sampling_type

A vector of sampling_type names, see mfdb::sampling_type for possible values

species

A vector of species names, see mfdb::species for possible values

sex

A vector of sex names, see mfdb::sex for possible values

To save specifying the same items repeatedly, you can use list concatenation to keep some defaults, for example:


defaults <- list(year = 1998:2000)
mfdb_sample_meanlength(mdb, c('age'), c(list(), defaults))
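
The concatenation itself is plain base R, so it can be tried without a database. The sketch below uses made-up per-query items to show what the combined lists look like.

```r
# Shared restrictions, defined once
defaults <- list(year = 1998:2000, species = 'COD')

# Per-query items come first, then the shared defaults are appended
params_by_age    <- c(list(age = 1:5), defaults)
params_by_length <- c(list(length = c(0, 40, 80)), defaults)

names(params_by_age)
names(params_by_length)
```

Note that c() only appends: a name present in both lists ends up in the result twice rather than being overridden, so it is safest to keep the per-query items and the defaults disjoint.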

scale_index allows you to scale samples before aggregation. If it contains the name of a survey index (see mfdb_import_survey_index), then any counts will be scaled by the value for that areacell before being used in aggregation / weighted averages. As a special case, you can use "tow_length" to scale counts by the tow length.
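
To see why scaling before aggregation matters, the following base R sketch works through the arithmetic on invented numbers. It is not how mfdb computes its results internally, and it assumes that "scaled by" means multiplying each count by the index value, which this page does not specify.

```r
# Hypothetical samples from two areacells
samples <- data.frame(
    areacell = c("divA", "divB"),
    length   = c(10, 20),
    count    = c(2, 2))

# Hypothetical survey index value per areacell
index <- c(divA = 1, divB = 3)

# Scale each count by the value for its areacell
# (assumption: scaling is a multiplication)
samples$scaled_count <- samples$count * unname(index[as.character(samples$areacell)])

# Mean length weighted by raw counts, then by scaled counts
unscaled_mean <- weighted.mean(samples$length, samples$count)
scaled_mean   <- weighted.mean(samples$length, samples$scaled_count)
```

With these numbers the unscaled mean length is 15 and the scaled one is 17.5, because the samples from divB carry three times the weight.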

Value

All functions return a list of data.frame objects. If no bootstrapping was requested, the list contains only one; otherwise, it contains one for each sample.
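
A short base R sketch of working with such a list follows. The list here is a hand-built mock, with invented names and values, standing in for a real query result.

```r
# Mock of the return value: a named list holding one data.frame per sample.
# Names and numbers are invented for illustration.
result <- list(
    "0.0.0.0.0" = data.frame(year = 1998, step = "all", area = "all", number = 10),
    "0.0.0.0.1" = data.frame(year = 1998, step = "all", area = "all", number = 12))

# Without bootstrapping there is a single element, so take the first
first <- result[[1]]

# With several, stack them and record which element each row came from
combined <- do.call(rbind, lapply(names(result), function(n) {
    cbind(sample = n, result[[n]])
}))
```

Keeping the element name in its own column is one way to retain the link between each row and the sample it belongs to after stacking.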

The columns of these data frames depend on the function called.

mfdb_area_size

Returns area, (total area) size

mfdb_area_size_depth

Returns area, (total area) size, mean depth, weighted by area size

mfdb_temperature

Returns year, step, area, (mean) temperature

mfdb_survey_index_mean

Returns year, step, area, (group cols), (mean) survey index

mfdb_survey_index_total

Returns year, step, area, (group cols), (sum) survey index

mfdb_sample_count

Returns year, step, area, (group cols), number (i.e. sum of count)

mfdb_sample_meanlength

Returns year, step, area, (group cols), number (i.e. sum of count), mean (length)

mfdb_sample_meanlength_stddev

As mfdb_sample_meanlength, but also returns std. deviation.

mfdb_sample_totalweight

Returns year, step, area, (group cols), total (weight of group)

mfdb_sample_meanweight

Returns year, step, area, (group cols), number (i.e. sum of count), mean (weight)

mfdb_sample_meanweight_stddev

As mfdb_sample_meanweight, but also returns std. deviation.

mfdb_sample_rawdata

Returns year, step, area, (group cols), number of samples, raw_weight and raw_length.

NB: No grouping of results is performed; instead, all matching table entries are returned.

mfdb_sample_scaled

Returns year, step, area, (group cols), number (i.e. sum of count, scaled by tow_length), mean_weight (scaled by tow_length)

mfdb_stomach_preycount

Returns year, step, area, (group cols), number (of prey found in stomach)

mfdb_stomach_preymeanlength

Returns year, step, area, (group cols), number (of prey found in stomach), mean_length (of prey found in stomach). NB: Entries where count is NA (i.e. totals) are ignored with this function.

mfdb_stomach_preymeanweight

Returns year, step, area, (group cols), number (of unique stomachs in group), mean_weight (per unique stomach).

mfdb_stomach_preyweightratio

Returns year, step, area, (group cols), ratio (of selected prey in stomach to all prey by weight)

mfdb_stomach_presenceratio

Returns year, step, area, (group cols), ratio (of selected prey in stomach to all prey by count)

Examples

mdb <- mfdb(tempfile(fileext = '.duckdb'))
#> 2022-11-16 12:34:34 INFO:mfdb:Creating schema from scratch
#> 2022-11-16 12:34:34 INFO:mfdb:Taxonomy market_category no updates to make
#> 2022-11-16 12:34:34 INFO:mfdb:Schema up-to-date

# Define 2 areacells of equal size
mfdb_import_area(mdb, data.frame(name=c("divA", "divB"), size=1))

# Make up some samples
samples <- expand.grid(
    year = 1998,
    month = c(1:12),
    areacell = c("divA", "divB"),
    species = 'COD',
    age = c(1:5),
    length = c(0,40,80))
samples$count <- runif(nrow(samples), 20, 90)
mfdb_import_survey(mdb, data_source = "x", samples)

# Query numbers by age and length
agg_data <- mfdb_sample_count(mdb, c('age', 'length'), list(
    length = mfdb_interval("len", seq(0, 500, by = 30)),
    age = mfdb_group('young' = c(1,2), old = 3),
    year = c(1998)))
agg_data
#> $`0.0.0.0.0`
#>   year step area   age length   number
#> 1 1998  all  all   old   len0 1231.319
#> 2 1998  all  all   old  len30 1233.876
#> 3 1998  all  all   old  len60 1308.686
#> 4 1998  all  all young   len0 2560.562
#> 5 1998  all  all young  len30 2693.943
#> 6 1998  all  all young  len60 2297.988
#> 

# Use in a catchdistribution likelihood component
gadget_dir_write(gadget_directory(tempfile()), gadget_likelihood_component("catchdistribution",
        name = "cdist",
        weight = 0.9,
        data = agg_data[[1]],
        area = attr(agg_data[[1]], "area"),
        age = attr(agg_data[[1]], "age")))

mfdb_disconnect(mdb)