Functions to import data into MareFrame DB

mfdb_import_temperature(mdb, data_in)
mfdb_import_survey(mdb, data_in, data_source = 'default_sample')
mfdb_import_survey_index(mdb, data_in, data_source = 'default_index')
mfdb_import_stomach(mdb, predator_data, prey_data, data_source = "default_stomach")

Arguments

mdb

Database connection created by mfdb().

data_in, predator_data, prey_data

A data.frame of survey data to import, see details.

data_source

A name for this data, e.g. the filename it came from. Used so you can replace it later without disturbing other data.

Details

All functions replace existing data in the case study with the new data. If you specify a data_source, only existing data with the same data_source is replaced.

If you want to remove the data, import empty data.frames with the same data_source.
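The replace-and-remove behaviour described above can be sketched as follows. This is a minimal illustration, not a definitive recipe: it assumes an open connection named mdb and an area vocabulary that already contains 35F1, and the data_source name 'cod-1998' and the sample values are made up for the example.

```r
# Illustrative survey rows; the values are made up for this sketch
data_in <- data.frame(
    year = 1998,
    month = 1,
    areacell = '35F1',
    species = 'COD',
    length = c(140, 150),
    stringsAsFactors = FALSE)

# An empty data.frame, used to remove data stored under a data_source
no_data <- data.frame()

# Runs only when a connection named mdb exists (an assumption of this sketch)
if (exists("mdb")) {
    # First import, stored under 'cod-1998'
    mfdb_import_survey(mdb, data_in, data_source = 'cod-1998')
    # Importing again replaces only what was stored under 'cod-1998'
    mfdb_import_survey(mdb, data_in[1, ], data_source = 'cod-1998')
    # Importing the empty data.frame removes what was stored under 'cod-1998'
    mfdb_import_survey(mdb, no_data, data_source = 'cod-1998')
}
```

Data imported under any other data_source is left untouched by all three calls.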

mfdb_import_temperature imports temperature time-series data for areacells. The data_in should be a data.frame with the following columns:

id

A numeric ID for this areacell (will be combined with the case study number internally)

year

Required. Year each sample was taken, e.g. c(2000,2001)

month

Required. Month (1-12) each sample was taken, e.g. c(1,12)

areacell

Required. Areacell sample was taken within

temperature

The temperature at given location/time
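As a sketch of what a temperature table might look like, the snippet below builds one row per year, month, and areacell. It assumes an open connection named mdb and that the areacells 35F1 and 35F2 were already imported with mfdb_import_area; the temperature values are placeholders, not real measurements.

```r
# One row for every year / month / areacell combination
temperature_data <- expand.grid(
    year = c(2000, 2001),
    month = 1:12,
    areacell = c('35F1', '35F2'),
    stringsAsFactors = FALSE)

# Placeholder seasonal curve, for illustration only
temperature_data$temperature <-
    5 + 3 * sin(pi * (temperature_data$month - 3) / 6)

# Runs only when a connection named mdb exists (an assumption of this sketch)
if (exists("mdb")) {
    mfdb_import_temperature(mdb, temperature_data)
}
```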

mfdb_import_survey imports institution surveys and commercial sampling for your case study. The data_in should be a data.frame with the following columns:

institute

Optional. An institute name, see mfdb::institute for possible values

gear

Optional. Gear name, see mfdb::gear for possible values

vessel

Optional. Vessel defined previously with mfdb_import_vessel_taxonomy(...)

tow

Optional. Tow defined previously with mfdb_import_tow_taxonomy(...)

sampling_type

Optional. A sampling_type, see mfdb::sampling_type for possible values

year

Required. Year each sample was taken, e.g. c(2000,2001)

month

Required. Month (1-12) each sample was taken, e.g. c(1,12)

areacell

Required. Areacell sample was taken within

species

Optional, default c(NA). Species of sample, see mfdb::species for possible values

age

Optional, default c(NA). Age of sample, or mean age

sex

Optional, default c(NA). Sex of sample, see mfdb::sex for possible values

length

Optional, default c(NA). Length of sample / mean length of all samples

length_var

Optional, default c(NA). Sample variance, if data is already aggregated

length_min

Optional, default c(NA). Minimum theoretical length, if data is already aggregated

weight

Optional, default c(NA). Weight of sample / mean weight of all samples

weight_var

Optional, default c(NA). Sample variance, if data is already aggregated

weight_total

Optional, default c(NA). Total weight of all samples, can be used with count = NA to represent an unknown number of samples

liver_weight

Optional, default c(NA). Liver weight of sample / mean liver weight of all samples

liver_weight_var

Optional, default c(NA). Sample variance, if data is already aggregated

gonad_weight

Optional, default c(NA). Gonad weight of sample / mean gonad weight of all samples

gonad_weight_var

Optional, default c(NA). Sample variance, if data is already aggregated

stomach_weight

Optional, default c(NA). Stomach weight of sample / mean stomach weight of all samples

stomach_weight_var

Optional, default c(NA). Sample variance, if data is already aggregated

count

Optional, default c(1). Number of samples this row represents (i.e. if the data is aggregated)
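The length, length_var, and count columns allow pre-aggregated data to be imported. One possible way to build such rows from individual samples is sketched below in base R. The grouping keys and sample values are illustrative, and the final import assumes an open connection named mdb with the 35F1 areacell already defined.

```r
# Individual samples (illustrative values)
samples <- data.frame(
    year = 1998, month = 1, areacell = '35F1', species = 'COD',
    age = 3, sex = c('M', 'M', 'F'), length = c(140, 150, 150),
    stringsAsFactors = FALSE)

# Columns that identify one aggregated row
keys <- c('year', 'month', 'areacell', 'species', 'age', 'sex')

# Split into groups, then summarise each group as a single row
groups <- split(samples, samples[keys], drop = TRUE)
aggregated <- do.call(rbind, lapply(groups, function(g) {
    data.frame(
        g[1, keys],
        length = mean(g$length),     # mean length of the group
        length_var = var(g$length),  # sample variance (NA for a single sample)
        count = nrow(g))             # number of samples the row represents
}))
rownames(aggregated) <- NULL

# Runs only when a connection named mdb exists (an assumption of this sketch)
if (exists("mdb")) {
    mfdb_import_survey(mdb, aggregated, data_source = 'cod-1998-aggregated')
}
```

Here the two male samples collapse into one row with count = 2, while the single female sample keeps count = 1 and an NA variance.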

mfdb_import_survey_index adds indices that can be used, for example, as abundance information. Before using mfdb_import_survey_index, make sure that the index_type you intend to use exists by defining it with mfdb_import_cs_taxonomy. The data_in should be a data.frame with the following columns:

index_type

Required. The name of the index data you are storing, e.g. 'acoustic'

year

Required. Year each sample was taken, e.g. c(2000,2001)

month

Required. Month (1-12) each sample was taken, e.g. c(1,12)

areacell

Required. Areacell sample was taken within

value

Value of the index at this point in space/time
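A minimal sketch of an index import follows. It assumes an open connection named mdb with the 35F1 areacell already defined. The index values and the data_source name 'acoustic-2000' are invented for illustration.

```r
# Illustrative acoustic index, one value per month (made-up numbers)
index_data <- data.frame(
    index_type = 'acoustic',
    year = 2000,
    month = 1:3,
    areacell = '35F1',
    value = c(12.5, 14.1, 13.8),
    stringsAsFactors = FALSE)

# Runs only when a connection named mdb exists (an assumption of this sketch)
if (exists("mdb")) {
    # The index_type must exist before any index data refers to it
    mfdb_import_cs_taxonomy(mdb, 'index_type', data.frame(name = 'acoustic'))
    mfdb_import_survey_index(mdb, index_data, data_source = 'acoustic-2000')
}
```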

mfdb_import_stomach imports data on predators and prey. The predator and prey data are stored separately, but they must be linked by the stomach_name column. If a prey row has a stomach name that does not match a predator, an error is returned.

The predator_data should be a data.frame with the following columns:

stomach_name

Required. An arbitrary name that provides a link between the predator and prey tables

institute

Optional. An institute name, see mfdb::institute for possible values

gear

Optional. Gear name, see mfdb::gear for possible values

vessel

Optional. Vessel defined previously with mfdb_import_vessel_taxonomy(...)

tow

Optional. Tow defined previously with mfdb_import_tow_taxonomy(...)

sampling_type

Optional. A sampling_type, see mfdb::sampling_type for possible values

year

Required. Year each sample was taken, e.g. c(2000,2001)

month

Required. Month (1-12) each sample was taken, e.g. c(1,12)

areacell

Required. Areacell sample was taken within

species

Optional, default c(NA). Species of sample, see mfdb::species for possible values

age

Optional, default c(NA). Age of sample, or mean age

sex

Optional, default c(NA). Sex of sample, see mfdb::sex for possible values

maturity_stage

Optional, default c(NA). Maturity stage of sample, see mfdb::maturity_stage for possible values

stomach_state

Optional, default c(NA). Stomach state of sample, see mfdb::stomach_state for possible values

length

Optional, default c(NA). Length of sample

weight

Optional, default c(NA). Weight of sample

The prey_data should be a data.frame with the following columns:

stomach_name

Required. The stomach name of the predator this was found in

species

Optional, default c(NA). Species of sample, see mfdb::species for possible values

digestion_stage

Optional, default c(NA). Stage of digestion of the sample, see mfdb::digestion_stage for possible values

length

Optional, default c(NA). Length of sample / mean length of all samples

weight

Optional, default c(NA). Weight of sample / mean weight of all samples

weight_total

Optional, default c(NA). Total weight of all samples

count

Optional, default c(NA). Number of samples this row represents (i.e. if the data is aggregated), count = NA represents an unknown number of samples
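Because every prey row must match a predator by stomach_name, it can help to check the link before importing. The sketch below shows one way to do that in base R. The stomach names, species codes, and measurements are illustrative, and the import itself assumes an open connection named mdb with the 35F1 areacell already defined.

```r
# One row per predator stomach (illustrative values)
predator_data <- data.frame(
    stomach_name = c('A', 'B'),
    year = 2000, month = 1, areacell = '35F1',
    species = 'COD', length = c(21, 34),
    stringsAsFactors = FALSE)

# Prey found in those stomachs, linked by stomach_name
prey_data <- data.frame(
    stomach_name = c('A', 'A', 'B'),
    species = 'CAP',
    length = c(1.2, 1.4, 1.1),
    count = c(5, 1, 2),
    stringsAsFactors = FALSE)

# Prey stomach names with no matching predator would cause an import error
orphans <- setdiff(prey_data$stomach_name, predator_data$stomach_name)
if (length(orphans) > 0) {
    stop("Prey rows without a predator: ", paste(orphans, collapse = ", "))
}

# Runs only when a connection named mdb exists (an assumption of this sketch)
if (exists("mdb")) {
    mfdb_import_stomach(mdb, predator_data, prey_data, data_source = 'cod-2000')
}
```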

Value

NULL

Examples

mdb <- mfdb(tempfile(fileext = '.duckdb'))
#> 2022-11-16 12:34:33 INFO:mfdb:Creating schema from scratch
#> 2022-11-16 12:34:33 INFO:mfdb:Taxonomy market_category no updates to make
#> 2022-11-16 12:34:33 INFO:mfdb:Schema up-to-date

# We need to set-up vocabularies first
mfdb_import_area(mdb, data.frame(
    id = c(1,2,3),
    name = c('35F1', '35F2', '35F3'),
    size = c(5)))
mfdb_import_vessel_taxonomy(mdb, data.frame(
    name = c('1.RSH', '2.COM'),
    stringsAsFactors = FALSE))
mfdb_import_sampling_type(mdb, data.frame(
    name = c("RES", "LND"),
    description = c("Research", "Landings"),
    stringsAsFactors = FALSE))

data_in <- read.csv(text = '
year,month,areacell,species,age,sex,length
1998,1,35F1,COD,3,M,140
1998,1,35F1,COD,3,M,150
1998,1,35F1,COD,3,F,150
')

data_in$institute <- 'MRI'
data_in$gear <- 'GIL'
data_in$vessel <- '1.RSH'
data_in$sampling_type <- 'RES'
mfdb_import_survey(mdb, data_in, data_source = 'cod-1998')

mfdb_disconnect(mdb)