Unravelling the scientific potential of high resolution fishery data

Kristian Schreiber Plet-Hansen; Erling Larsen; Lars Olof Mortensen; J. Rasmus Nielsen; Clara Ulrich

doi:10.1051/alr/2018016

Aquat. Living Resour., 31 (2018) 24

Issue		Aquat. Living Resour. Volume 31, 2018


Article Number		24
Number of page(s)		14
DOI		https://doi.org/10.1051/alr/2018016
Published online		04 October 2018

Research Article

Kristian Schreiber Plet-Hansen^*, Erling Larsen, Lars Olof Mortensen, J. Rasmus Nielsen and Clara Ulrich

Technical University of Denmark, National Institute of Aquatic Resources (DTU Aqua), Kemitorvet, DK-2800 Kgs, Lyngby, Denmark

^* Corresponding author: kspl@aqua.dtu.dk

Handling Editor: Verena Trenkel

Received: 8 February 2018
Accepted: 25 July 2018

Abstract

Fisheries science and fisheries management advice rely on both scientific and commercial data to estimate the distribution and abundance of marine species. These two data types differ, with scientific data having a broader geographical coverage but less intensity and time coverage compared to commercial data. Here we present a new type of commercial data with high resolution and coverage. To our knowledge, the dataset presented in this study has never been used for scientific purposes. While commercial datasets usually include the total weight by species on a per-haul basis, the new data also include the commercial size class for the species landed, recorded directly on a haul-by-haul basis. Thus, this dataset has the potential to provide knowledge on landed fish with as high spatio-temporal resolution as when coupling logbooks and sales slips but with the addition of detailed knowledge on the size distribution. Such information may otherwise be obtained through on-board observer programmes but, unlike the observers’ data, the dataset presented here is routinely collected on most of the trips of the vessels involved, which means that the coverage of the data for the individual vessel is larger than that of observers’ data. Furthermore, the risk of changes in fishing behaviour due to the presence of an observer on-board is avoided. This paper describes the coverage and completeness of the dataset, and explores the reliability of the data available. We conclude that the main limitation is the small number of fishing vessels covered by the program, but that the data from those vessels are accurate, detailed, and relatively reliable.

Key words: Fisheries / haul-by-haul information / science-industry cooperation / sea-packing commercial fishery data / size distribution / spatial and seasonal selectivity

© EDP Sciences 2018

1 Introduction

Fisheries science and management rely on scientific survey data and commercial fishery data to estimate the status of marine populations and assess the impact of fishery on the environment. A key challenge is that the two data sources differ considerably in quality and detail. Scientific survey data usually have a broader and more homogeneous geographical coverage than commercial fishery data, as fishers target certain species and areas. However, scientific survey data have less intensity and temporal coverage (Pennino et al., 2016; Bourdaud et al., 2017). While both commercial and scientific data are important sources of information, it is a challenge to link the two types of data and provide a coherent picture (Poos et al., 2013; Bourdaud et al., 2017). Currently, integrated commercial datasets rely on coupling data from logbooks, sales slips and the vessel monitoring system (VMS) to allocate landings to vessels’ hauls and fishing grounds (Hintzen et al., 2012). However, size composition at haul level is not known, and it is usually assumed that it is the same as the aggregated size composition from the entire trip (Bastardie et al., 2010). Fishing trips can cover several days and large areas, with potentially large variation in size composition; hence, these estimates probably introduce a bias. Thus, expanding the commercial data to incorporate accurate recordings of size at haul level could add significant quality to the information available (Verdoit et al., 2003; Bourdaud et al., 2017). A Danish packing-at-sea initiative that might be able to provide such information came to our attention. The project started in 1995 with the purpose of investigating whether sea-packing could provide additional profit to fishers, by reducing their costs of size-sorting and packing at the auctions, and by ensuring higher quality fish.
The project found a reduction in costs of 6–7% when packing fish at-sea but remained inconclusive on whether sea-packing resulted in a profit increase (Frederiksen and Olsen, 1997; Frederiksen et al., 2002). Because sea-packed fish are labelled with information on size class, species, weight, vessel, and catch time, a by-product of this project was the development of a database collecting the size composition of landings at the haul level together with detailed spatio-temporal information. Although on-board observer programmes in the EU collect data with similar resolution and characteristics, the sea-packing data extend the data coverage substantially because vessels engaged in sea-packing record their sea-packed landings for most trips, while observers only record a limited number of trips. Additionally, sea-packing data are collected by fishers, without additional costs to be borne by scientists or public authorities.

In 2002, the Council of the European Union laid down rules for increased traceability of food goods, including fish (EU, 2002). The traceability regulations apply to batches of fish, with a batch being a quantity of fish caught at one time. The regulations do allow for the registration of a batch as the compiled landings from a full fishing trip. Additionally, spatial traceability regulations are complied with if a batch can be traced to the fishing area (e.g. an ICES subdivision), which covers a large area. In Denmark, three traceability systems were developed to meet the requirements: the Vessels Data Exchange Center (VDEC) software, the yellow catch information notes and the “Sporbarhed i Fiskerisektoren” (SIF) database, which is an add-on to the sea-packing project. The VDEC is in theory capable of delivering more detailed data than the electronic logbook (eLog), including crate landing composition and size classes (a crate is a standard size box used to store fish for landing (Pack and Sea A/S, 2018)). However, in practice, most of the data reported in the VDEC are limited to haul position, time, and non-sized landings information (O. Skov, personal communication). The yellow catch information notes were developed by the industry to ensure compliance with the regulations among vessels unfit for carrying sea-packing or VDEC equipment (Dandanell and Vejrup, 2013). A note is filled in for the crate with information on the fishing trip including date of first and last fishing, geographical area where fishing took place (as ICES subdivision), gear type and other administrative information, as well as the species and commercial size class. The minimum labelling and information requirements are thus complied with (EU, 2001, 2009, 2011; Dandanell and Vejrup, 2013).

The present study focuses on the third system, the SIF database. We analyse and explore the accessibility, coverage, consistency and reliability of the data, in order to assess whether they may be used for scientific studies and in management advice. The quality of the data is assessed by comparing them with the eLog, sales slips and data from a trial using Remote Electronic Monitoring with a camera system (EM). The objective of the present paper is only to investigate whether SIF data are suitable and reliable, before they can be used in future studies. As such, we primarily focus here on describing these new data and assessing their quality. Future studies involving SIF data are briefly suggested, including comparison with coupled VMS and logbook data as well as studies on spatial size distribution for certain species.

2 Materials and methods

2.1 The SIF database

The SIF database began in 2012 as a collaboration between the Danish Fishermen’s Association (DFPO), the Danish AgriFish Agency and the retail industry. The sea-packing data in SIF provide information at haul level on the landed species and size composition by weight, together with detailed information on date, time and position of the haul. The size classes applied are those defined by the EU regulation and size classes used by the fish auctions (Tab. 1) (EU, 1996; Danske Fiskeauktioner, 2017). The sea-packing equipment includes a dynamic scale, which records the weight of each size class of each species automatically. When in port, the records are relayed online from the sea-packing software to SIF. The weight recorded by the sea-packing equipment is the gutted weight, not the live weight as recorded in the eLog (Frederiksen et al., 1997, 2002; Danish AgriFish Agency, 2017). As in the eLog, the SIF database allows for entries of discards in addition to the landings. Figure 1 presents a schematic of the difference between landings information at haul level in the eLog and SIF. SIF provides the size composition of the landings directly at haul level, assuming that the sea-packed fish of a given species are representative of the total landings of that species in the individual haul. This assumption will be discussed in the subsection Using SIF data. SIF is linked with the eLog, from which the temporal and spatial data for the hauls are derived. In 2016, funding for SIF operational costs was reduced. The future of SIF is thus uncertain, although it recently proved valuable. In 2017, the German authorities required traceability data for a batch of fish a German buyer had purchased from a wholesaler in Denmark. The required information could be retrieved from SIF and met the expectations of the German authorities, thus demonstrating the operationality of the system (C.S. Pedersen, personal communication).

Table 1

Commercial fish size classes and their corresponding weight in kg for the 10 investigated species based on SIF and Danish fish auction as well as DFAD and EU regulations.

Fig. 1

Conceptual figure of the difference between landings data available at haul level in the electronic logbook and the sea-packing data available in the SIF database.

2.2 Data collection

As each vessel owns its own data in SIF, individual acceptance to use the data for the present study was required. Around 90 vessels operated with sea-packing in Denmark in 2015 and 2016. All sea-packing vessels were part of the large-scale fleet, which consisted of 419 vessels in 2015 and 396 vessels in 2016 (STECF, 2017). However, due to confidentiality agreements, vessel details from SIF could not be provided by the database administrator (C.S. Pedersen, personal communication). Twenty-eight vessel owners have thus been personally contacted so far, and asked whether they sea-pack their landings and are willing to grant access to their SIF data. At the time of writing, confirmation was still pending from four skippers, 13 skippers had granted access to their SIF data and 11 skippers had refused (Tab. 2). Access to SIF occurs through a website with no export function. A web scraper was thus developed to extract the data.

Table 2

Vessel ID, remarks and whether access to SIF data has been granted for contacted vessels. 4.a = Northern North Sea, 4.b = Central North Sea, 3.a = Skagerrak and Kattegat, 22–28 = Baltic Sea. Vessels where owners were unwilling to share SIF or who are undecided have been aggregated into groups based on reason for not granting access or remark on current status.

2.3 Study period

The study period is 1 January 2015 to 31 December 2016. Over this period, high resolution haul data were available for five vessels, and SIF data could be compared with EM data (GPS) for two vessels, which both had sea-packing equipment and participated in the Danish Cod Catch Quota Management trial (Ulrich et al., 2015; Bergsson and Plet-Hansen, 2016; Bergsson et al., 2017).

2.4 Assessing validity of SIF against DFAD and eLog

For the validity assessment, SIF data from vessels A, B, C, D and E in 2015 and 2016 were compared to the DTU Aqua DFAD (Danish Fisheries Analyses Database) dataset. DFAD is based on sales slips merged with the eLog catches and fleet register data. Catches are recorded as total live weight of each species and since 2015 it has been mandatory to record catches in the eLog on a haul-by-haul level (EU, 2011; Danish AgriFish Agency, 2017). The coupling of eLog haul data and sales slips data does allow for inference of landings’ size composition at the haul level, assuming constant size distribution across all hauls (Bastardie et al., 2010; Hintzen et al., 2012). However, the assumption of even size distribution risks assigning inaccurate size distributions to the haul.

Not all species landed by a vessel are sea-packed. To analyse the completeness of the SIF data, the species recorded in SIF were compared to the same vessels’ data from DFAD. The 10 most important species (in landings by weight) for the five vessels were identified based on DFAD landings records. These 10 species constituted 95.8% of the landings by weight for the five vessels in both years. The completeness of landings recorded in SIF compared to DFAD was calculated as: $C_{L} = 100 - \frac{L_{\mathrm{DFAD}} - L_{\mathrm{SIF}}}{L_{\mathrm{DFAD}}} \times 100,$ (1) where L is the sum of recorded landings of the species in DFAD and SIF respectively. No conversion factor was needed for the comparison, since both SIF and DFAD have records of the gutted weight.

Similarly, the completeness of hauls available in SIF was estimated based on the number of hauls according to the eLog, using: $C_{H} = 100 - \frac{H_{\mathrm{eLog}} - H_{\mathrm{SIF}}}{H_{\mathrm{eLog}}} \times 100,$ (2) where H is the number of recorded hauls in eLog and SIF respectively.
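Equations (1) and (2) share the same form, so a single helper covers both. The following is a minimal Python sketch; the function name and the input totals are hypothetical and are not values from the study.

```python
def completeness(reference: float, sif: float) -> float:
    """Completeness in percent: C = 100 - (reference - sif) / reference * 100."""
    if reference <= 0:
        raise ValueError("reference total must be positive")
    return 100.0 - (reference - sif) / reference * 100.0


# Hypothetical totals, for illustration only (not values from the study).
landings_dfad_kg = 1000.0   # L_DFAD
landings_sif_kg = 800.0     # L_SIF
hauls_elog = 250            # H_eLog
hauls_sif = 240             # H_SIF

c_l = completeness(landings_dfad_kg, landings_sif_kg)
c_h = completeness(hauls_elog, hauls_sif)
print(f"C_L = {c_l:.1f}%, C_H = {c_h:.1f}%")
```

Algebraically the expression reduces to 100 times the SIF total divided by the reference total, so a value above 100% indicates that SIF holds more than the reference source for that species or vessel.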

A comparison between SIF and DFAD of the species and commercial size classes recorded by vessels A, B, C, D and E during 2015 and 2016 for the 10 most landed species was then performed. SIF and DFAD data were merged based on the trips’ landing date. The weight of each commercial size class of the 10 most landed species for each trip was summed based on the unique logbook number identifying each fishing trip. Trips with no records in either SIF or DFAD were excluded. The largest size class for cod (Gadus morhua) and hake (Merluccius merluccius) in SIF is 0, whereas the largest size class is 1 in DFAD (Tab. 1). The division between the second largest size class, size class 2, and size class 1 is the same for SIF and DFAD. Therefore, size class 0 was aggregated with size class 1 in SIF to render the comparison between databases possible. In addition to a visual comparison of SIF and DFAD data at trip level, the fit between SIF and DFAD records was analysed with a linear model using the lm function in R. This was done to estimate how close SIF records are to DFAD records and vice-versa. A log-transformation was applied to the landings recorded in SIF and DFAD to bring them closer to a normal distribution.
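The trip-level pairing described above can be sketched as follows. This is an illustrative Python sketch, not the authors' code: the record layout, the use of (vessel, landing date) as the trip key, and all weights are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical records: (vessel, landing_date, species, size_class, weight_kg).
sif_records = [
    ("A", "2015-03-02", "cod", 0, 120.5),
    ("A", "2015-03-02", "cod", 1, 80.0),
    ("A", "2015-03-02", "cod", 2, 40.25),
    ("A", "2015-03-09", "cod", 2, 55.0),   # trip without DFAD records
]
dfad_records = [
    ("A", "2015-03-02", "cod", 1, 201.0),
    ("A", "2015-03-02", "cod", 2, 40.0),
    ("A", "2015-03-16", "cod", 1, 90.0),   # trip without SIF records
]

# Species whose largest SIF size class is 0 (cod and hake in the text).
MERGE_LARGEST = {"cod", "hake"}


def trip_totals(records, merge_class_zero=False):
    """Sum weight per (vessel, landing date, species, size class)."""
    totals = defaultdict(float)
    for vessel, date, species, size_class, weight in records:
        if merge_class_zero and species in MERGE_LARGEST and size_class == 0:
            size_class = 1  # fold SIF class 0 into class 1 to match DFAD
        totals[(vessel, date, species, size_class)] += weight
    return totals


sif = trip_totals(sif_records, merge_class_zero=True)
dfad = trip_totals(dfad_records)

# Keep only trips (vessel, landing date) present in both sources.
shared_trips = {k[:2] for k in sif} & {k[:2] for k in dfad}
paired = {
    key: (sif.get(key, 0.0), dfad.get(key, 0.0))
    for key in sorted(set(sif) | set(dfad))
    if key[:2] in shared_trips
}

for key, (w_sif, w_dfad) in paired.items():
    print(key, w_sif, w_dfad)
```

Matching on landing date alone assumes that a vessel does not land twice on the same date; where that can happen, an additional key would be needed.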

The model is thus written as: $\log(y_{i}) = a + b \log(x_{i}),$ (3) where a is the intercept, b is the slope, y is the landings by size class recorded in SIF, x is the landings by size class recorded in DFAD and i is an index for the fishing trip and commercial size class of the investigated species.
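The study fitted equation (3) with the lm function in R. As a rough stand-in, the Python sketch below fits the same log-log model by ordinary least squares and reports R². The function name and the data are hypothetical, and the sketch assumes strictly positive weights, since the logarithm of zero is undefined.

```python
import math


def fit_log_log(x, y):
    """Ordinary least squares fit of log(y) = a + b * log(x).

    Returns (a, b, r_squared). All inputs must be strictly positive.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_res = sum((v - (a + b * u)) ** 2 for u, v in zip(lx, ly))
    ss_tot = sum((v - mean_y) ** 2 for v in ly)
    return a, b, 1.0 - ss_res / ss_tot


# Hypothetical trip-level weights in kg (x: DFAD, y: SIF), illustration only.
dfad_kg = [10.0, 50.0, 200.0, 800.0]
sif_kg = [0.9 * w for w in dfad_kg]   # SIF holds 90% of every DFAD record

a, b, r2 = fit_log_log(dfad_kg, sif_kg)
print(f"a = {a:.3f}, b = {b:.3f}, R^2 = {r2:.3f}")
```

In this constructed case the slope is 1 and the intercept equals log(0.9): a constant proportional shortfall in SIF shifts the intercept on the log scale while leaving the slope and R² unchanged, so a high R² does not by itself imply a 1:1 match.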

Essentially, DFAD should contain all landings of all species from all the vessels’ fishing trips. SIF, in contrast, only holds records of landings from the point at which the vessel started sea-packing during the fishing trip. A comparison of the trip-based percentage size class compositions of landings was performed between trips where sea-packing did not take place and trips where sea-packing was conducted. This was done to investigate whether the size class composition may be biased depending on whether a vessel packs at-sea or not. The comparison was made solely using DFAD, because SIF does not have information on trips without sea-packing. First, the size class composition of the landings recorded in DFAD was calculated as a percentage of the total landings recorded in DFAD for trips where SIF records also existed and for trips where SIF records did not exist. This was plotted and investigated visually. Then, a non-parametric analysis was performed using the Wilcoxon rank-sum test, to detect potential bias in size distribution, which could occur if fishers, for instance, only sea-pack on trips with ample volumes of large fish.
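To make the comparison concrete, the Python sketch below implements a two-sided Wilcoxon rank-sum test with the large-sample normal approximation and average ranks for ties. It is not the authors' implementation: statistical packages may apply exact p-values, tie corrections or continuity corrections, so results can differ slightly. The function names and the percentage shares are hypothetical.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """1-based ranks of values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1   # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(sample_a, sample_b):
    """Two-sided rank-sum test (normal approximation). Returns (z, p)."""
    n1, n2 = len(sample_a), len(sample_b)
    ranks = average_ranks(list(sample_a) + list(sample_b))
    w = sum(ranks[:n1])                       # rank sum of the first sample
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical share (%) of one size class per trip, illustration only.
share_with_sif = [22.0, 25.5, 19.0, 30.0, 27.5, 24.0]
share_without_sif = [21.0, 26.0, 18.5, 29.0, 28.0, 23.0]

z, p = rank_sum_test(share_with_sif, share_without_sif)
print(f"z = {z:.3f}, p = {p:.3f}")
```

The test would be repeated for each species and size class, which is how a count such as "16 out of 39 cases" arises in a table of p-values.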

To investigate the effect of year, vessel and size class on the differences between landings recorded in SIF compared to DFAD, an extension of the model in equation (3) was made and analysed using an analysis of covariance (ANCOVA). The model is written as: $\log(y_{i}) = \log(x_{i}) + \beta_{1}(\mu_{i}) + \beta_{2}(\nu_{i}) + \beta_{3}(s_{i}),$ (4) where y is the landings by size class recorded in SIF, x is the landings by size class recorded in DFAD, i is an index for the fishing trip, µ is year, ν is vessel, s is size class and β₁ to β₃ are the effects of year, vessel and size class for the investigated species.

2.5 Spatial distribution of SIF data compared to EM data

Because the SIF system depends on the eLog for the temporal and spatial haul information, a geographic comparison with DFAD is not relevant. Therefore, coverage quality was assessed using a different dataset, comparing SIF with the GPS sensor data from an EM trial run by the Danish AgriFish Agency in 2015 and 2016 (Bergsson and Plet-Hansen, 2016; Bergsson et al., 2017). This was done for two vessels that took part in this trial during 2015 and 2016. EM GPS data were plotted as dots at a 1-minute interval. Start and end positions according to SIF were used to plot lines for each haul on the same chart. Because this assumes linear track courses, some deviation is expected. Additionally, some hauls with unrealistic haul lengths and towing speeds were spotted in SIF. SIF hauls were excluded if the towing speed exceeded 7 knots. The exclusion criterion was based on information from the vessel owners on their maximum and usual towing lengths as well as an inspection of the maximum towing speeds recorded in the EM trial. In addition to the visual inspection, the mean mid-latitude and mid-longitude were calculated for each haul. Because fishers target certain fishing grounds, the distribution of fishing hauls becomes non-random and it is not possible to induce normal distribution of samples. Therefore, statistical comparison of mid-latitude and mid-longitude was performed using a Wilcoxon rank-sum test.
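The haul filter and mid-point calculation can be sketched in Python as below. The text does not state how towing speed was derived, so this sketch assumes it is the straight-line (great-circle) distance between the SIF start and end positions divided by the haul duration, and takes the mid-point as the arithmetic mean of the start and end coordinates. The field names and haul records are hypothetical.

```python
import math
from datetime import datetime

EARTH_RADIUS_NM = 3440.065   # mean Earth radius in nautical miles
MAX_SPEED_KNOTS = 7.0        # exclusion threshold stated in the text


def haversine_nm(lat1, lon1, lat2, lon2):
    """Great-circle distance in nautical miles between two positions."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(h))


def towing_speed_knots(haul):
    """Assumed definition: straight-line distance / haul duration."""
    hours = (haul["end_time"] - haul["start_time"]).total_seconds() / 3600
    if hours <= 0:
        return math.inf   # zero or negative durations count as unrealistic
    dist = haversine_nm(haul["start_lat"], haul["start_lon"],
                        haul["end_lat"], haul["end_lon"])
    return dist / hours


def midpoint(haul):
    """Arithmetic mean of start and end coordinates (fine for short hauls)."""
    return ((haul["start_lat"] + haul["end_lat"]) / 2,
            (haul["start_lon"] + haul["end_lon"]) / 2)


# Hypothetical hauls, for illustration only.
hauls = [
    {"id": 1,
     "start_time": datetime(2015, 5, 1, 6, 0), "end_time": datetime(2015, 5, 1, 10, 0),
     "start_lat": 57.0, "start_lon": 7.0, "end_lat": 57.2, "end_lon": 7.0},
    {"id": 2,
     "start_time": datetime(2015, 5, 1, 12, 0), "end_time": datetime(2015, 5, 1, 14, 0),
     "start_lat": 57.0, "start_lon": 7.0, "end_lat": 58.0, "end_lon": 7.0},
]

kept = [h for h in hauls if towing_speed_knots(h) <= MAX_SPEED_KNOTS]
for h in kept:
    lat, lon = midpoint(h)
    print(h["id"], round(towing_speed_knots(h), 2), round(lat, 3), round(lon, 3))
```

Here the first haul covers about 12 nautical miles in four hours (roughly 3 knots) and is kept, while the second would require about 30 knots and is dropped. Because a straight line is the shortest path between two points, this speed is a lower bound on the true towing speed.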

3 Results

Although it is possible to enter discards in SIF, none of the investigated vessels had any discards recorded. Seven of the 13 skippers who granted access to their SIF data had recordings at the haul level with high resolution, while the data from the other six showed that on these vessels, the sea-packing equipment was not used in a manner where the size classes were recorded at the haul level. The main reason given for this was that the vessels had used the sea-packing equipment to clean the fish during their catch processing but had not stored their landings in size-graded crates (Tab. 2). This was also the main reason given by the 11 skippers who have not granted access.

3.1 Species not occurring in SIF

Of all species reported in DFAD for each vessel, only a few were never reported in SIF. For vessel A, this was the case for five species: Atlantic mackerel (Scomber scombrus), edible crab (Cancer pagurus), marine crabs (Brachyura sp.), greater weever (Trachinus draco) and lumpfish (Cyclopterus lumpus). For vessel B, six species: Norway lobster (Nephrops norvegicus), golden redfish (Sebastes marinus), greater forkbeard (Phycis blennoides), long rough dab (Hippoglossoides platessoides), cuttlefish (Sepiidae sp.) and tope shark (Galeorhinus galeus). For vessels C and D, three species: Atlantic mackerel, edible crab and lumpfish. For vessel E, five species: Norway lobster, golden redfish, lumpfish, greater forkbeard and blue ling (Molva dypterygia). The weight of the species never recorded in SIF ranged from 0.02% (vessels C and E) to 0.1% (vessel B).

3.2 Comparison of trips, hauls and 10 most landed species

The majority of hauls and trips were represented in both SIF and DFAD, although a third of the 14 570 species*haul combinations were missing in SIF (Tab. 3). For the reported landings, the highest completeness C_L was achieved for vessel B at around 90% on average, followed by vessel A at around 80% on average, whereas vessel C had the poorest completeness, at 69%. Overall, the size class composition was similar on an aggregated level (Fig. 2) but the means differed significantly in 16 out of 39 cases when α = 0.05 (Tab. 4). For cod, hake, haddock (Melanogrammus aeglefinus), lemon sole (Microstomus kitt), turbot (Scophthalmus maximus) and witch flounder (Glyptocephalus cynoglossus), the size classes constituted roughly the same percentage of the overall landings regardless of whether the trips had only DFAD data or had SIF too. The largest overall discrepancy was for saithe (Pollachius virens) where size class 3 constituted a lower percentage of the landed weight while size class 4 constituted a larger share when trips had not been recorded in SIF. However, all species had at least one size class with a significant difference in percentwise composition. Conversely, all species also had at least one size class where no significant difference was found. Additionally, the standard deviation was large for all species and size classes, meaning that large variation in size composition occurs between trips.

Log-transformation of landings recorded in SIF and DFAD was necessary to assume normal distribution (Fig. 3). A scatterplot and a linear model fit were made for the size classes of the 10 investigated species of each vessel at trip level (Fig. 4 and Tab. 5). Saithe, turbot, witch flounder, wolffish (Anarhichas spp.) and monkfish (Lophius spp.) had high R²-values and scatterplots close to a 1:1 ratio between SIF and DFAD by trip for most vessels. However, monkfish was not sorted into size classes on vessel A when sea-packed, and vessel E had several trips with a poor fit for the medium size classes of saithe as well as some trips with a poor fit for the largest size class of wolffish. Correlations were also generally high for hake and lemon sole but vessel D had several trips where the larger size classes of these two species had a poor fit. Vessel A also had some trips with a poor fit for lemon sole, and this species was rarely landed by vessel B. Haddock also had high R²-values, but not for all years and vessels; vessels B and D in 2016 in particular had a poor fit. Cod had R²-values and a scatterplot with a good agreement between SIF and DFAD for vessel B, but not for the rest of the vessels. For plaice (Pleuronectes platessa) the scatterplot and R²-values were poor for most vessels. Interestingly, in some cases the landed weight in SIF exceeded that in DFAD, mainly for witch flounder. In theory this should not be possible, since the sum of all SIF records should be contained in the total recorded landings for any given trip. Presenting this to the fishers revealed two reasons: (1) Small mismatches are inevitable, as the fish auctions, from which the landings data in DFAD are derived, only record landings in whole kilograms, whereas the sea-packing equipment uses scales with dynamic motion compensation and relays data with two decimals. (2) Larger mismatches could be an artefact of the SIF system. If a crate is labelled incorrectly, e.g. by recording the wrong size class or species, a new label must be made. This in turn is recorded as a new entry in SIF and the fishers cannot delete the old entry, meaning that the same crate is counted twice in SIF.

Extension of the model to include the effect of year, vessel and size class revealed that each of these factors could have a significant effect among the species (Tab. 6). The effect of year was significant for cod, hake and lemon sole. Vessel effect was significant for all species, except haddock and turbot and the effect of size class was significant for all species, except witch flounder. The log-transformed landings in DFAD had a significant effect and the largest sum of squares and F-value for all species.

Table 3

Completeness of SIF when compared to the eLog (hauls and trips) and vessel landings data from DFAD for the 10 most landed species in 2015 and 2016.

Fig. 2

Landings’ size composition in percent stratified on trips with only DFAD data and trips with both DFAD and SIF data. Size class 1 contains the largest specimens.

Table 4

Mean and standard deviation in percentage of size classes as well as p-value from Wilcoxon rank-sum test. Comparison is done solely using DFAD data between trips where only DFAD data exist and trips where both SIF and DFAD data exist. *Vessel A is not included for monkfish as the vessel does not sea-pack monkfish.

Fig. 3

QQ-plot for (I) log-transformed landings recorded in SIF. (II) log-transformed landings recorded in DFAD.

Fig. 4

Landings per trip according to DFAD and SIF for the 10 most landed species in 2015 and 2016 by species and commercial size class. Points: the aggregated weight of the species and size class for a fishing trip. The x-axis represents the weight according to DFAD, the y-axis represents the weight according to SIF. Blue dashed line: linear model fit between DFAD and SIF. Black line: the 1:1 ratio between DFAD and SIF. Size class 9 is unsorted.

Table 5

R² and degrees of freedom for linear model fit of landings in SIF and DFAD for the 10 most landed species in 2015 and 2016. SIF data has been aggregated to trip level in order to make the comparison possible with DFAD and comparison is done solely for trips where both SIF and DFAD have records.

Table 6

ANCOVA output for the effect of year, vessel and size class as well as remaining effect of log-transformed landings from DFAD and residuals.

3.3 Spatial distribution of hauls compared to EM data

The exclusion criteria to filter for unrealistic haul lengths and towing speeds in SIF led to the exclusion of respectively 91 and 71 hauls for the two EM vessels, corresponding to 6.33% and 7.67% of recorded hauls. Overlay maps for positions according to EM GPS data and according to SIF in 2015 and 2016 are presented in Figure 5. Visually, most areas had overlap between SIF and EM but in 2015, the difference between positional data in SIF and EM was statistically significant (Tab. 7). An area at roughly 59° N and 0.5° W was visually identified where fishing took place according to EM but no hauls have been recorded in SIF, neither in 2015 nor in 2016.

Fig. 5

Fishing activity overlap between EM and SIF for two vessels. (I) 2015. (II) 2016. Blue points: Fishing activity recorded by EM GPS sensors (1-minute interval). Yellow lines: Hauls according to SIF. The EM trial did not cover the Baltic Sea and the maps do therefore not include hauls in this area.

Table 7

Mean latitude and longitude as well as p-value from Wilcoxon rank-sum test for all hauls recorded in SIF and EM during 2015 and 2016. Two vessels had records in both datasets. Due to confidentiality agreements, the number of hauls cannot be revealed; however, it exceeded 1000 observations in both years.

4 Discussion

The SIF dataset possesses information not available in the currently used commercial fisheries data, namely direct observations of size distributions at the haul level instead of merely at the trip level. The completeness of SIF compared to DFAD shows overall a good match, albeit not perfect. Although all five vessels landed a few species that were never sea-packed and, consequently, present in DFAD but not in SIF, these species only constituted a minor fraction of the vessels’ total landings. Thus, they were non-target species for the vessels. According to the fishers, vessels engaged in sea-packing may choose not to sea-pack a species if it is not considered worth the effort of sea-packing during the catch processing. Norway lobster is an example of a potential target species that is not necessarily sea-packed. This is because the added value is not considered to be large enough, which is also the case for several flatfish species.

Fishing trips and hauls recorded in the eLog were overall well represented in SIF. No discards were recorded in SIF, which is likely because the legal purpose of the dataset is for traceability requirements of the landings.

Several trips had records of landings for one or more of the 10 investigated species in DFAD but no records of the species in SIF. A reason for this may be the loss of data when merging DFAD and SIF, because there are no unique haul and trip IDs shared between SIF and DFAD. Therefore, the common identifier used to merge SIF and DFAD was the landings date, which can be inferred from SIF and is recorded in the DFAD data. Mismatch may also be due to lack of vessel storage capacity to pack all their landings in crates at-sea. Because sea-packed landings take up more storage room, there is a trade-off between returning to port and continuing to fish once the storage capacity for sea-packing is reached. On the one hand, sea-packing should give a higher quality and thereby higher price for the landings (Frederiksen and Olsen, 1997; Frederiksen et al., 2002). On the other hand, the cost of steaming between fishing grounds and port may make it more profitable to continue fishing, store landings in larger bulks, and land a larger amount of unsorted fish, which will give a higher total revenue. The choice between one and the other is likely to be influenced by several factors. These include, among others, the amount of remaining quota, the expected value of the landings already in storage, how far into the expected duration of the fishing trip a haul takes place, and the weather conditions. Accordingly, there may not necessarily be consistency between fishing trips as to whether a species is sea-packed or not. The fact that plaice is the species where SIF records are poorest supports this, as plaice is a relatively low value species in this context. Conversely, it is likely that species with a high profit gained from sea-packing will have the best agreement between DFAD and SIF records. Monkfish has good agreement for most vessels, which supports the above perspective as monkfish has a relatively high value.
The model extension to include the effect of year, vessel, and size class for each species did not reveal which factors specifically and significantly influence the choice of sea-packing or not. The model output shows that factors other than year, vessel, and size class significantly influence the lack of a perfect fit between SIF and DFAD records. As stated above, external factors may well heavily influence the choice. These include factors that may vary substantially, such as fish price. Furthermore, due to the Danish Individual Vessel Quota system, it is difficult to specify the remaining quota during a year, which may also influence the choice. We nonetheless consider it to be beyond the scope of this study to further analyse these factors here. Future studies on the frequency of storage limitations, possible correlation between expected fish prices and sea-packing, or cost-benefit analysis of the added workload at-sea compared to the potential gain from sea-packing could shed further light on the underlying reasons and key driving factors behind the frequency of trips with landings recorded in DFAD while lacking in SIF. The potential bias created by lack of SIF records for certain trips seems limited, though. Overall, there are only small differences in the percentwise size composition in the landings for the DFAD dataset when looking at trips where SIF data were available compared to trips with no SIF data available. However, the statistical test output for the percentwise composition suggests large variation among trips. As a whole, the investigations and tests comparing SIF and DFAD revealed that a consistent bias in SIF records seems unlikely. Lack of entries in SIF varies between vessels, years, species and possibly size classes, although fishers have stated that they either do not sea-pack a species or sea-pack all retained specimens at the hauls where they sea-pack.
In light of this, SIF should not be viewed as a full record but rather as a subsample of the landings with higher resolution for certain species. Because the reliability of SIF varies from species to species, studies utilizing SIF data should verify the completeness of the SIF data for the species to be investigated prior to any further analysis.
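The completeness check recommended above can be sketched as follows. This is a minimal illustration rather than the procedure used in this study: it assumes that haul-level SIF records and trip-level DFAD records are each available as simple tuples, and that vessel plus landing date can stand in for the missing shared trip ID. The function name, field order and example values are hypothetical.

```python
from collections import defaultdict


def trip_completeness(sif_hauls, dfad_trips):
    """Aggregate haul-level SIF weights to trip level and compare with DFAD.

    sif_hauls:  iterable of (vessel, landing_date, species, kg), one per haul.
    dfad_trips: iterable of (vessel, landing_date, species, kg), one per trip.
    Returns {(vessel, landing_date, species): SIF kg / DFAD kg}.
    A ratio of 0.0 marks a species landed according to DFAD but absent in SIF.
    """
    sif_total = defaultdict(float)
    for vessel, landing_date, species, kg in sif_hauls:
        # Assumption: no shared trip ID exists, so vessel + landing date is the key.
        sif_total[(vessel, landing_date, species)] += kg

    ratios = {}
    for vessel, landing_date, species, kg in dfad_trips:
        key = (vessel, landing_date, species)
        if kg > 0:  # skip empty DFAD entries to avoid division by zero
            ratios[key] = sif_total.get(key, 0.0) / kg
    return ratios


# Hypothetical example records (not data from this study).
sif = [
    ("A", "2015-03-02", "plaice", 120.0),
    ("A", "2015-03-02", "plaice", 80.0),
    ("A", "2015-03-02", "monkfish", 50.0),
]
dfad = [
    ("A", "2015-03-02", "plaice", 400.0),
    ("A", "2015-03-02", "monkfish", 50.0),
    ("A", "2015-03-09", "plaice", 300.0),
]
completeness = trip_completeness(sif, dfad)
```

Ratios well below 1 for a given vessel and species would suggest that SIF holds only a subsample of those landings, which is the situation described for plaice above.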

4.1 Spatial data

Overall, there is a good spatial overlap between the SIF and EM datasets. However, some gaps in spatial coverage occur, and a statistically significant difference between the mid-points of hauls was found for 2015. Several reasons can explain the discrepancy. First, hauls recorded in SIF with unrealistic durations and towing speeds were excluded, which inevitably creates gaps in SIF compared to EM. Second, positional data in SIF is exported from the eLog. Although the eLog software allows for real-time entries of the vessel’s position, the skipper may postpone entries of haul data, including fishing time and position, as long as the data has been entered prior to the mandatory deadline for data transmission (once every 24 h). Therefore, some mismatch could be caused by human error if positional data is entered manually in the eLog. Third, there is an inherent error in plotting a haul as a simple straight line from haul start to end. Adjustments in a vessel’s course and drag mean that real towing paths are not straight, which can cause mismatch when a straight line is assumed between the start and end positions of the haul. Fourth, some gaps may come from fishers testing an area for fish. If the catch in this area is poor, no sea-packing will occur, meaning that no haul is recorded in SIF; because a fishing activity was recorded in EM, however, the haul will appear in the EM data. This could explain the mismatch in an area around 59° N and 0.5° W. Fifth, the spatial resolution of the data used for the statistical test will influence the outcome of the test of means. Finally, the GPS equipment broke down at times during the EM trial, meaning that hauls may have taken place and be present in SIF without being recorded in EM.
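The straight-line simplification discussed above can be illustrated with a short sketch. It assumes haul start and end positions are given as (latitude, longitude) pairs in decimal degrees, takes the mid-point as the plain average of the coordinates, and uses the haversine formula on a spherical Earth for the straight-line length. The function names and example coordinates are hypothetical and are not taken from the study's own processing.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haul_midpoint(start, end):
    """Mid-point of a haul as the plain average of start and end coordinates.

    Adequate for short hauls away from the poles and the date line.
    """
    return ((start[0] + end[0]) / 2.0, (start[1] + end[1]) / 2.0)


def straight_line_km(start, end):
    """Great-circle (haversine) distance in km between start and end positions."""
    lat1, lon1 = math.radians(start[0]), math.radians(start[1])
    lat2, lon2 = math.radians(end[0]), math.radians(end[1])
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical haul running one degree due north along 0.5 degrees W.
haul_start = (59.0, -0.5)
haul_end = (60.0, -0.5)
midpoint = haul_midpoint(haul_start, haul_end)
length_km = straight_line_km(haul_start, haul_end)
```

Because a real towing path is rarely straight, a length computed this way is a lower bound on the distance actually towed, which is one source of the mismatch with the EM GPS tracks.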

4.2 Using SIF data

When the differences in data between DFAD and SIF are taken into account, it is clear that the quality of the SIF data has to be scrutinized at the vessel and species level before it can be utilized for scientific and management purposes. Spatial and temporal entries in SIF seem valid, but due to inaccurate reporting, it is necessary to filter out hauls where spatial or temporal records are unrealistic. This can be done by setting up exclusion criteria and filtering by these. Prior to in-depth analysis of species distributions, it is necessary to validate the species records in SIF for the individual year, vessel, species, and size class, because the agreement between DFAD and SIF can vary substantially. The discrepancies originating from incorrect crate labelling are more difficult to remove. The issue is highly species- and vessel-specific and therefore only relates to analyses of these specific species, e.g. witch flounder. The simplest approach is to exclude the records from the problematic vessel and/or species, depending on the analysis. The more cumbersome solution is to identify the trips where incorrect labelling has happened, as can be done for the trips where SIF does not contain the majority of the landings of a species. By identifying the vessel, species and size class, one can find the corresponding landings in DFAD and SIF and subset for these. Then, using the landing date, the corresponding hauls for the specific fishing trip can be removed from the dataset.
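Filtering by exclusion criteria could look like the sketch below. The thresholds for haul duration and towing speed are placeholders chosen purely for illustration; they are not the criteria applied in this study and would need to be set per fishery. The record layout (a dict with a duration in hours and a straight-line distance in km) is likewise an assumption.

```python
KM_PER_NAUTICAL_MILE = 1.852


def filter_hauls(hauls, min_hours=0.25, max_hours=12.0, min_knots=1.0, max_knots=6.0):
    """Split hauls into (kept, dropped) using duration and towing-speed limits.

    Each haul is a dict with 'duration_h' (hours) and 'distance_km'
    (straight-line distance between start and end position).
    All default thresholds are hypothetical placeholders.
    """
    kept, dropped = [], []
    for haul in hauls:
        hours = haul["duration_h"]
        if hours <= 0:
            # Zero or negative duration cannot give a meaningful speed.
            dropped.append(haul)
            continue
        knots = haul["distance_km"] / KM_PER_NAUTICAL_MILE / hours
        if min_hours <= hours <= max_hours and min_knots <= knots <= max_knots:
            kept.append(haul)
        else:
            dropped.append(haul)
    return kept, dropped


# Hypothetical haul records.
example_hauls = [
    {"id": 1, "duration_h": 4.0, "distance_km": 22.224},  # 3 knots: plausible
    {"id": 2, "duration_h": 0.0, "distance_km": 5.0},     # no duration recorded
    {"id": 3, "duration_h": 5.0, "distance_km": 200.0},   # about 21.6 knots: unrealistic
]
kept_hauls, dropped_hauls = filter_hauls(example_hauls)
```

Returning the dropped hauls alongside the kept ones makes it easy to inspect how much of each vessel's record is lost to the criteria before any further analysis.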

Based on talks with sea-packing fishers, species are generally either sea-packed at the haul level or not at all. Mismatch between SIF and DFAD at the trip level should therefore be due to hauls where a species was not sea-packed rather than hauls where only a fraction of a species was sea-packed. However, the effect of size class in the extended model does not fully support this statement.

4.3 Possible applications

There are clear limitations regarding the usefulness of SIF owing to the facts that (i) the future of SIF is uncertain due to funding issues, (ii) the majority of Danish fishing vessels do not use it, and (iii) vessels can refuse to share SIF data. Furthermore, several sea-packing vessels do not complete the entries in SIF in a manner that allows for better spatial resolution than DFAD. The relatively short time coverage of SIF further limits its use. Nevertheless, SIF has several benefits. SIF data is already collected and is therefore a free data source, which only requires the time spent on obtaining access permission and adjusting a web scraper to collect the data. SIF does not serve as a direct control measure but is used for commercial purposes and to fulfil traceability requirements, so there should be little if any incentive to tamper with the system. This study therefore serves as a proof of concept that it is possible to obtain precise size distributions from fisheries data at the haul level, even though it is not a legal requirement. Indeed, the fisheries control in Greenland already requires vessels above 75 GRT to include the size distribution of the landings at the haul level (Greenland’s Autonomy, 2010). Although the number of sea-packing vessels is low in Denmark, the landed volume from sea-packing vessels is large and the activity coverage is extensive. The five Danish vessels investigated in this study have SIF data from 258 trips in 2015 and 293 trips in 2016. By comparison, the entire Danish observer programme covered a total of 224 trips in 2015 and 262 trips in 2016. When SIF and observer data overlap, SIF could also be used to investigate potential behavioural aspects of observer presence. Because fishers may refuse to take observers on-board, there is a risk of a bias in the observer data related to the reason for not wanting observers.
Likewise, fishers may adapt their fishing behaviour while carrying observers, either intentionally or unintentionally, which may also cause a bias in observer data. While sharing SIF data with scientists or fisheries managers is purely voluntary, there is an economic incentive to conduct sea-packing as costs are reduced (Frederiksen and Olsen, 1997; Frederiksen et al., 2002), and vessels are liable to the fish auctions for correct labelling of sea-packed landings. Therefore, the risk of fishers adapting their fishing behaviour is lower for SIF. Investigations with SIF data could enhance the knowledge on spatially explicit fish distributions, for instance by mapping areas with a larger share of juveniles for certain species, whereby fishers may improve their spatial selectivity. Monkfish and wolffish could be of particular interest for analyses utilizing the size class information in SIF, as these species are data poor and have some of the best records in SIF among the investigated species.

Based on the presented results, the next planned step in the utilization of SIF data is to compare the spatial and temporal distribution of size classes for species well represented in SIF data with that of DFAD and VMS-logbook coupled data. This will allow for testing the validity of the homogeneous reallocation of size classes, as well as showing the importance of having the size composition at the haul level.
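As a rough illustration of what such a test could compare, the sketch below assumes that "homogeneous reallocation" means every haul in a trip receives the trip-level size-class shares in proportion to the haul's total weight. That reading, the function names and the example numbers are assumptions made for illustration, not a description of the coupling method itself.

```python
def reallocate_homogeneously(trip_kg_by_class, haul_totals_kg):
    """Spread trip-level size-class shares evenly over the hauls of a trip.

    trip_kg_by_class: {size_class: kg landed on the trip}.
    haul_totals_kg:   total kg of the species in each haul.
    Returns one {size_class: kg} dict per haul, all with identical shares.
    """
    trip_total = sum(trip_kg_by_class.values())
    shares = {cls: kg / trip_total for cls, kg in trip_kg_by_class.items()}
    return [{cls: share * haul_kg for cls, share in shares.items()}
            for haul_kg in haul_totals_kg]


def max_abs_difference(expected, observed):
    """Largest absolute kg difference between reallocated and observed hauls."""
    return max(abs(exp[cls] - obs.get(cls, 0.0))
               for exp, obs in zip(expected, observed)
               for cls in exp)


# Hypothetical trip: 100 kg of size class 1 and 300 kg of size class 2.
trip = {1: 100.0, 2: 300.0}
# Hypothetical haul-level records: size class 1 was caught only in haul 1.
observed_hauls = [{1: 100.0, 2: 100.0}, {1: 0.0, 2: 200.0}]
haul_totals = [sum(h.values()) for h in observed_hauls]

reallocated = reallocate_homogeneously(trip, haul_totals)
worst_gap_kg = max_abs_difference(reallocated, observed_hauls)
```

In this constructed example the reallocation assigns 50 kg of size class 1 to each haul, whereas the haul-level records place all 100 kg in the first haul; differences of this kind are what haul-level size composition would reveal.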

5 Conclusion

SIF provides new, relatively reliable data on the size composition of important commercial fish species with the same or higher resolution than what is available in traditional fisheries data. However, the quantity, quality and reliability vary between vessels and species. Although SIF has high coverage and detailed landings and spatio-temporal information, the dataset covers only a limited number of vessels. If the SIF database is maintained and SIF data is continuously collected, we believe SIF could provide additional knowledge on detailed spatial patterns of fishing effort and on commercial fish species and size distributions. Because SIF provides direct observations at the haul level, it could be used for analysis at a vessel- or métier-specific level, for instance on catchability, spatial selectivity or seasonal patterns, or to compare and verify outcomes of spatial fishery evaluation models as evaluated in Nielsen et al. (2018). A fleet-wide application or stock assessment usage would require an expansion of the vessel coverage and better accessibility to SIF data. It is our hope that this study may serve as a case study to highlight the possibilities that exist for enhancing the commercial fisheries data available to science.

Acknowledgements

This work has received funding from the Horizon 2020 Programme under grant agreement DiscardLess number 633680 and from the EASME project DRuMFISH contract number EASME/EMFF/2014/1.3.2.4/SI2.721116. This support is gratefully acknowledged. The authors thankfully acknowledge the support from fishers who have been willing to share their data as well as the parties who have made such data sharing possible. The help and information provided by Lyngsoe Systems, Anchor Lab K/S, DFPO and the Danish AgriFish Agency is gratefully acknowledged. Carsten Søndergaard Pedersen from Pack and Sea A/S and Kim Clemmensen deserve special thanks for their continuous support.

References

Bastardie F, Nielsen JR, Ulrich C, Egekvist J, Degel H. 2010. Detailed mapping of fishing effort and landings by coupling fishing logbooks with satellite-recorded vessel geo-location. Fish Res 106: 41–53.
Bergsson H, Plet-Hansen KS. 2016. Final report on development and usage of electronic monitoring systems as a measure to monitor compliance with the landing obligation – 2015.
Bergsson H, Plet-Hansen KS, Jessen LN, Jensen P, Bahlke SØ. 2017. Final report on development and usage of REM systems along with electronic data transfer as a measure to monitor compliance with the landing obligation – 2016.
Bourdaud P, Travers-Trolet M, Vermard Y, Cormon X, Marchal P. 2017. Inferring the annual, seasonal, and spatial distributions of marine species from complementary research and commercial vessels’ catch rates. ICES J Mar Sci 74: 2415–2426.
Dandanell R, Vejrup K. 2013. TEMA om sporbarhed i fiskeriet. Fisk. Tid. Available online at: https://issuu.com/dandanell/docs/sporbarhed_i_fiskeriet_tema_i_fiskeri_tidende_feb (accessed 1.10.18).
Danish AgriFish Agency. 2017. Elektronisk logbog. Available online at: http://lfst.dk/fiskeri/erhvervsfiskeri/indberetning-og-foering-af-logbog/elektronisk-logbog/ (accessed 12.21.17).
Danske Fiskeauktioner. 2017. Grading. Available online at: http://www.dfa.as/sortering (accessed 12.21.17).
EU. 1996. Council Regulation (EC) No 2406/96 of 26 November 1996 laying down common marketing standards for certain fishery products. Off J Eur Communities L334/1: 1–15.
EU. 2001. Commission Regulation (EC) No 2065/2001 of 22 October 2001 laying down detailed rules for the application of Council Regulation (EC) No 104/2000 as regards informing consumers about fishery and aquaculture products. Off J Eur Union L 278: 6.
EU. 2002. Regulation (EC) No 178/2002 of the European Parliament and of the Council of 28 January 2002 laying down the general principles and requirements of food law, establishing the European Food Safety Authority and laying down procedures in matters of food saf. Off J Eur Communities L31: 1.
EU. 2009. Council Regulation (EC) No 1224/2009 of 20 November 2009 establishing a Community control system for ensuring compliance with the rules of the common fisheries policy, amending Regulations (EC) No 847/96, (EC) No 2371/2002, (EC) No 811/2004, (EC) No 768/2. Off J Eur Union L 343: 1.
EU. 2011. Commission Implementing Regulation (EU) No 404/2011 of 8 April 2011 laying down detailed rules for the implementation of Council Regulation (EC) No 1224/2009 establishing a Community control system for ensuring compliance with the rules of the Common Fish. Off J Eur Union L 112: 1.
Frederiksen MT, Olsen KB. 1997. Søpakning med sporbar deklaration, Danmarks Fiskeriundersøgelser, Lyngby (DFU-rapport; Nr. 45–97).
Frederiksen MT, Olsen KB, Popescu V. 1997. Integrated quality assurance of chilled food fish at sea. In: Luten JB, Børresen T, Oehlenschläger J (Eds.), Seafood From Producer To Consumer, Integrated Approach To Quality, Elsevier, Amsterdam, pp. 87–96.
Frederiksen M, Osterberg C, Silberg S, Larsen E, Bremner A. 2002. Info-Fisk. Development and validation of an internet based traceability system in a Danish domestic fresh fish chain. J Aquat Food Prod Technol 11: 13–34.
Greenland’s Autonomy. 2010. Selvstyrets bekendtgørelse nr. 18 af 9. December 2010 om kontrol med havgående fiskeri. Available online at: http://lovgivning.gl/lov?rid=%7B8EC0C382-4157-4543-AEE1-E83AB40CABEE%7D (accessed 12.21.17).
Hintzen NT, Bastardie F, Beare D, Piet G, Ulrich C, Deporte N, Egekvist J, Degel H. 2012. VMStools: open source software for the processing, analysis and visualization of fisheries logbook and VMS data. Fish Res 115–116: 31–43.
Nielsen JR, Thunberg E, Holland DS, Schmidt JO, Fulton EA, Bastardie F, Punt AE, Allen I, Bartelings H, Bertignac M, Bethke E, Bossier S, Buckworth R, Carpenter G, Christensen A, Christensen V, Da-Rocha JM, Deng R, Dichmont C, Doering R, Esteban A, Fernandes JA, Frost H, Garcia D, Gasche L, Gascuel D, Gourguet S, Groeneveld RA, Guillén J, Guyader O, Hamon KG, Hoff A, Horbowy J, Hutton T, Lehuta S, Little LR, Lleonart J, Macher C, Mackinson S, Mahevas S, Marchal P, Mato-Amboage R, Mapstone B, Maynou F, Merzéréaud M, Palacz A, Pascoe S, Paulrud A, Plaganyi E, Prellezo R, van Putten EI, Quaas M, Ravn-Jonsen L, Sanchez S, Simons S, Thébaud O, Tomczak MT, Ulrich C, van Dijk D, Vermard Y, Voss R, Waldo S. 2018. Integrated ecological-economic fisheries models – evaluation, review and challenges for implementation. Fish Fish 19: 1–29.
Pack and Sea A/S. 2018. Pack and sea – types of crates/tubs. Available online at: http://packandsea.dk/ (accessed 1.10.18).
Pennino MG, Conesa D, López-Quílez A, Muñoz F, Fernández A, Bellido JM. 2016. Fishery-dependent and -independent data lead to consistent estimations of essential habitats. ICES J Mar Sci 73: 2302–2310.
Poos JJ, Aarts G, Vandemaele S, Willems W, Bolle LJ, Van Helmond ATM. 2013. Estimating spatial and temporal variability of juvenile North Sea plaice from opportunistic data. J Sea Res 75: 118–128.
STECF. 2017. Scientific, Technical and Economic Committee for Fisheries (STECF) – The 2017 Annual Economic Report on the EU Fishing Fleet (STECF-17-12).
Ulrich C, Olesen HJ, Bergsson H, Egekvist J, Håkansson KB, Dalskov J, Kindt-Larsen L, Storr-Paulsen M. 2015. Discarding of cod in the Danish fully documented fisheries trials. ICES J Mar Sci 72: 1848–1860.
Verdoit M, Pelletier D, Bellail R. 2003. Are commercial logbook and scientific CPUE data useful for characterizing the spatial and seasonal distribution of exploited populations? The case of the Celtic Sea whiting. Aquat Liv Resour 16: 467–485.

Cite this article as: Plet-Hansen KS, Larsen E, Mortensen LO, Nielsen JR, Ulrich C. 2018. Unravelling the scientific potential of high resolution fishery data. Aquat. Living Resour. 31: 24

All Tables

Table 1

Commercial fish size classes and their corresponding weight in kg for the 10 investigated species based on SIF and Danish fish auction as well as DFAD and EU regulations.

Table 2

Vessel ID, remarks and whether access to SIF data has been granted for contacted vessels. 4.a = Northern North Sea, 4.b = Central North Sea, 3.a = Skagerrak and Kattegat, 22–28 = Baltic Sea. Vessels where owners were unwilling to share SIF or who are undecided have been aggregated into groups based on reason for not granting access or remark on current status.

Table 3

Completeness of SIF when compared to the eLog (hauls and trips) and vessel landings data from DFAD for the 10 most landed species in 2015 and 2016.

Table 4

Mean and standard deviation in percentage of size classes as well as p-value from Wilcoxon rank-sum test. Comparison is done solely using DFAD data between trips where only DFAD data exist and trips where both SIF and DFAD data exist. *Vessel A is not included for monkfish as the vessel does not sea-pack monkfish.

Table 5

R² and degrees of freedom for linear model fit of landings in SIF and DFAD for the 10 most landed species in 2015 and 2016. SIF data has been aggregated to trip level in order to make the comparison possible with DFAD and comparison is done solely for trips where both SIF and DFAD have records.

Table 6

ANCOVA output for the effect of year, vessel and size class as well as remaining effect of log-transformed landings from DFAD and residuals.

Table 7

Mean latitude and longitude as well as p-value from Wilcoxon rank-sum test for all hauls recorded in SIF and EM during 2015 and 2016. Two vessels had records in both datasets. Due to confidentiality agreements, the number of hauls cannot be revealed; however, it exceeded 1000 observations in both years.

All Figures

Fig. 1

Conceptual figure of the difference between landings data available at haul level in the electronic logbook and the sea-packing data available in the SIF database.

Fig. 2

Size composition of landings in percent, stratified by trips with only DFAD data and trips with both DFAD and SIF data. Size class 1 contains the largest specimens.

Fig. 3

QQ-plot for (I) log-transformed landings recorded in SIF and (II) log-transformed landings recorded in DFAD.

Fig. 4

Landings per trip according to DFAD and SIF for the 10 most landed species in 2015 and 2016 by species and commercial size class. Points: the aggregated weight of the species and size class for a fishing trip. The x-axis represents the weight according to DFAD, the y-axis represents the weight according to SIF. Blue dashed line: linear model fit between DFAD and SIF. Black line: the 1:1 ratio between DFAD and SIF. Size class 9 is unsorted.

Fig. 5

Fishing activity overlap between EM and SIF for two vessels. (I) 2015. (II) 2016. Blue points: Fishing activity recorded by EM GPS sensors (1-minute interval). Yellow lines: Hauls according to SIF. The EM trial did not cover the Baltic Sea and the maps therefore do not include hauls in this area.
