Tommaso Russo, Lorenzo D'Andrea, Antonio Parisi, Stefano Cataudella.
Abstract
VMSbase is an R package devised to manage, process and visualize information about fishing vessel activity (provided by the vessel monitoring system, VMS) and catches/landings (as reported in the logbooks). VMSbase is primarily conceived to be user-friendly; to this end, a suite of state-of-the-art analyses is accessible via a graphical interface. In addition, the package uses a database platform allowing large datasets to be stored, managed and processed very efficiently. Methodologies include data cleaning, that is, removal of redundant or evidently erroneous records, and data enhancing, that is, interpolation and merging with external data sources. In particular, VMSbase is able to estimate sea bottom depth for single VMS pings using an on-line connection to the National Oceanic and Atmospheric Administration (NOAA) database. It also allows VMS pings to be assigned to whatever geographic partitioning has been selected by users. Standard analyses comprise: 1) métier identification (using a modified CLARA clustering approach on Logbook data or Artificial Neural Networks on VMS data); 2) linkage between VMS and Logbook records, with the former organized into fishing trips; 3) discrimination between steaming and fishing points; 4) computation of spatial effort with respect to user-selected grids; 5) calculation of standard fishing effort indicators within Data Collection Framework; 6) a variety of mapping tools, including an interface for Google viewer; 7) estimation of trawled area. Here we report a sample workflow for the accessory sample datasets (available with the package) to explore the potential of VMSbase. In addition, the results of some performance tests on two large datasets (1×10^5 and 1×10^6 VMS signals, respectively) are reported to indicate the time required for the analyses. The results, although merely illustrative, indicate that VMSbase can represent a step forward in extracting and enhancing information from VMS/logbook data for fisheries studies.
Year: 2014 PMID: 24932915 PMCID: PMC4059747 DOI: 10.1371/journal.pone.0100195
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
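The data cleaning step described in the abstract (removal of redundant or evidently erroneous records) can be illustrated with a minimal Python sketch. It is not VMSbase code: the `Ping` record, its field names and the `clean_pings` function are hypothetical, and only two checks are shown, exact duplicates and coordinates outside the −90/90 and −180/180 ranges.

```python
from typing import List, NamedTuple


class Ping(NamedTuple):
    """Hypothetical VMS record; field names are assumptions for illustration."""
    vessel_id: str
    timestamp: int  # seconds since an arbitrary epoch (assumed encoding)
    lat: float      # decimal degrees
    lon: float      # decimal degrees


def clean_pings(pings: List[Ping]) -> List[Ping]:
    """Drop out-of-range positions and exact duplicates, keeping input order."""
    seen = set()
    kept = []
    for p in pings:
        # Evidently erroneous record: coordinates outside the valid ranges.
        if not (-90.0 <= p.lat <= 90.0 and -180.0 <= p.lon <= 180.0):
            continue
        # Redundant record: identical vessel, time and position already kept.
        if p in seen:
            continue
        seen.add(p)
        kept.append(p)
    return kept
```

A real cleaning pass would also need harbour and land masks, which are not sketched here.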
Figure 1. Main interface of the VMSbase package.
It is organized into groups of icons that allow access to different functions that operate on different input data (VMS, logbooks, or a combination of the two). The “Project Management” panel on the bottom left side allows setting the files for the work session (and also saving and loading the workspace). The “VMS Data Management” and “Logbook Data Management” provide access to functions and routines for the analysis of, respectively, VMS and Logbook data. These two flows converge in the analyses provided in the “VMS-Logbook Analysis” panel. Finally, the main graphical and numerical outputs can be produced by the tools in the “Data Output” panel at the bottom-right side of the main interface.
Figure 2. VMSbase interfaces for: a) VMS and b) Logbook data loading.
While a series of data format options is available in both cases, these interfaces also allow the selected configurations to be saved.
Figure 3. VMSbase interfaces for VMS data viewer or database query.
The upper area returns a series of statistical data for the selected database. A traffic light system is present to indicate the analyses already completed (green lights) and the ones which can still be performed (red lights). The bottom area shows the appearance of the dataset. Different panels are used for the native VMS pings, the Tracks (i.e. groups of VMS pings), the Interpolated tracks, the Warnings, etc. The names of the datasets as well as the data in the figure are just for the sake of illustration.
Figure 4. VMSbase interfaces for the Assign bathymetry tool: a) main interface that allows data and algorithm to be selected; b) interface for Custom Box selection of data.
Figure 5. VMSbase interfaces for: a) the ping viewer, in which single VMS positions are represented by red points; b) the track viewer in which all the tracks for a single vessel are plotted; c) the track viewer in which only a single track for a single vessel is plotted; d) the interpolation viewer in which a single track is plotted, with black points representing interpolated pings and red points representing real pings.
It should be noted that, in all cases, the main isobaths are visualized as computed by the “Get Isobaths” tool in the main interface. The names of the datasets as well as the data in the figure are just for the sake of illustration.
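The distinction between real and interpolated pings in the interpolation viewer can be illustrated with a minimal Python sketch. It uses plain linear interpolation as a stand-in, which is an assumption and not necessarily the method implemented in VMSbase; the function name, the tuple layout and the 600-second default step (10 minutes) are hypothetical.

```python
from typing import List, Tuple

RealPing = Tuple[int, float, float]            # (time in seconds, lat, lon)
TrackPoint = Tuple[int, float, float, bool]    # same, plus an is_real flag


def interpolate_track(pings: List[RealPing], step_s: int = 600) -> List[TrackPoint]:
    """Insert linearly interpolated points every step_s seconds between real pings.

    pings must be sorted by time. Real pings are flagged True and
    interpolated ones False.
    """
    out: List[TrackPoint] = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(pings, pings[1:]):
        out.append((t0, la0, lo0, True))
        t = t0 + step_s
        while t < t1:
            f = (t - t0) / (t1 - t0)  # fraction of the segment already covered
            out.append((t, la0 + f * (la1 - la0), lo0 + f * (lo1 - lo0), False))
            t += step_s
    if pings:
        t, la, lo = pings[-1]
        out.append((t, la, lo, True))  # the last real ping closes the track
    return out
```

Linear segments cut corners on curved tracks, so this sketch only shows the bookkeeping of real versus interpolated points, not a realistic path reconstruction.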
Figure 6. VMSbase interfaces for: a) Logbook database viewer, the tool to visually inspect the logbook data; b) Métier discovery, the tool for searching the métiers, defined as catch profiles, in a given logbook database; c) Métier editing tool, which allows the user to assign DCF labels to the detected métiers.
The names of the datasets as well as the data in the figure are just for illustrative purposes.
Figure 7. VMSbase interface for the Métier Prediction tool.
It allows the parameters of the Artificial Neural Network (ANN) to be customized and the network to be trained. It also shows the performance of the trained ANN in terms of correct predictions, and allows a previously trained ANN to be loaded and applied to a new dataset.
Figure 8. VMSbase interfaces for: a) “Mark fishing points”, the tool that allows the user to specify the speed range and, optionally, the bathymetric range for different fishing activities; b) “Effort Gridding”, the tool designed to associate fishing effort with grids or other partitioning, which are then plotted and exported in the desired formats; c) “DCF Indicators”, the tool that computes the values of DCF indicators of fishing pressure 5 and 6.
The “Effort Gridding” tool returns a value (hours of fishing activity) for each cell of the grid, and the results can be saved and exported as an ESRI shapefile or a CSV text file.
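The logic behind marking fishing points and gridding effort can be illustrated with a minimal Python sketch. It is not the VMSbase implementation: the function names, the dictionary keys, the 2 to 4 knot default speed range, the 0.5 degree cell size and the assumption that each ping stands for 10 minutes (1/6 hour) of activity are all hypothetical. Depth is assumed to be expressed in positive metres.

```python
import math
from collections import defaultdict


def is_fishing(speed_kn, depth_m=None, speed_range=(2.0, 4.0), depth_range=None):
    """Flag a ping as fishing if its speed (and, optionally, depth) is in range."""
    lo, hi = speed_range
    if not (lo <= speed_kn <= hi):
        return False
    if depth_range is not None:
        if depth_m is None:
            return False  # a depth filter was requested but no depth is known
        dmin, dmax = depth_range
        return dmin <= depth_m <= dmax
    return True


def grid_effort(pings, cell_deg=0.5, ping_interval_h=1.0 / 6.0, **criteria):
    """Sum hours of fishing per grid cell.

    pings: iterable of dicts with keys 'lat', 'lon', 'speed_kn' and,
    optionally, 'depth_m'. Cells are indexed by (row, col) integers
    obtained by flooring lat/lon divided by the cell size.
    """
    effort = defaultdict(float)
    for p in pings:
        if is_fishing(p["speed_kn"], p.get("depth_m"), **criteria):
            cell = (math.floor(p["lat"] / cell_deg), math.floor(p["lon"] / cell_deg))
            effort[cell] += ping_interval_h
    return dict(effort)
```

Counting a fixed time share per ping only makes sense on tracks with a regular ping frequency, for example after interpolation.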
Temporal performances (time required for computation) for each step of a typical data flow.
| General Step of the Analysis | Tool | Dataset #1 (10^5 initial VMS pings): time required to complete the processing (minutes) | Dataset #1: main statistics | Dataset #2 (10^6 initial VMS pings): time required to complete the processing (minutes) | Dataset #2: main statistics |
| | Edit Raw Data | <1 | 0 NAs found in latitude degrees; 0 NAs found in latitude minutes; 0 NAs found in latitude seconds; 0 NAs found in latitude direction; 0 Latitudes out of range (−90/90); 0 NAs found in longitude degrees; 0 NAs found in longitude minutes; 0 NAs found in longitude seconds; 0 NAs found in longitude direction; 0 Longitudes out of range (−180/180); 0 NAs found in dates; 0 NAs found in hours; 0 NAs found in minutes; 0 NAs found in seconds; 0 dates found with bad format; 0 NAs found in knots speed; 0 NAs found in degrees heading | <1 | 0 NAs found in latitude degrees; 0 NAs found in latitude minutes; 0 NAs found in latitude seconds; 0 NAs found in latitude direction; 0 Latitudes out of range (−90/90); 0 NAs found in longitude degrees; 0 NAs found in longitude minutes; 0 NAs found in longitude seconds; 0 NAs found in longitude direction; 0 Longitudes out of range (−180/180); 0 NAs found in dates; 0 NAs found in hours; 0 NAs found in minutes; 0 NAs found in seconds; 0 dates found with bad format; 0 NAs found in knots speed; 0 NAs found in degrees heading |
| | Create Database | <1 | / | <1 | / |
| | Load DB in the VMS Data Viewer | <1 | / | <1 | / |
| | Clean DB Data | 20 | Found 4584 (4.58% of total) duplicated pings; Found 31019 (31.02% of total) pings in harbour; Found 7550 (7.55% of total) pings on land; Found 38 (0.04% of total) not coherent pings | 150 | Found 98381 (9.84% of total) duplicated pings; Found 272996 (27.31% of total) pings in harbour; Found 124015 (12.41% of total) pings on land; Found 279 (0.03% of total) not coherent pings |
| | Track Cutting | 5 | 3883 tracks detected | 40 | 36994 tracks detected |
| | Interpolation (10 minutes frequency) | 20 | Number of pings increases from 10^5 (real) to 4.9×10^5 (interpolated) | 180 | Number of pings increases from 10^6 (real) to 4.4×10^6 (interpolated) |
| | Assign Bathymetry (1 degree resolution) | 150 (Fast & Heavy Algorithm) / 60 (Slow & Light) | / | 1000 (Fast & Heavy Algorithm) / 400 (Slow & Light) | / |
| | Assign Area (Mediterranean GSAs) | 100 | / | 900 | / |
| | | | | | |
| | Edit Raw Data | <1 | 20 NAs found in Start Times; 140 NAs found in End Times; 0 NAs found in Species; 0 NAs found in Quantity; Removed 0.08% of data, that is 160 logbooks | | |
| | Create Database | 180 | This step requires only a few minutes for the EFLALO format | | |
| | Métier Discovery (searching between 2–30 groups on the whole dataset with 100 samples of 1000 records each) | 20 | The best partitioning corresponded to 11 métiers. These are also available in the package as a reference dataset for Métier Classification | | |
| | Métier editing | Depends on the user, reasonably no more than half an hour | 69397 records in the database; 387 species | | |
| | Métier Classification | 20 | / | | |
| | | | | | |
| | Logbook-VMS Matching | 5 | 60.3% of VMS tracks with matching in LB database | 30 | 57.2% of VMS tracks with matching in LB database |
| | Predict Métier: Data Preparation | 10 | / | 50 | / |
| | Predict Métier: Training ANN | 3 | Prediction completed for all the tracks without métiers by LB database | 3 | Prediction completed for all the tracks without métiers by LB database |
| | Find Fishing Points | 30 | / | 45 | / |
| | | | | | |
| | Gridding & Mapping (for each métier) | 2 | / | 9 | / |
| | DCF Indicators (for each métier) | <1 | / | <1 | / |
| | Trawled Area Viewer | Untested | / | / | / |
These data were measured on three sample datasets (two for VMS and one for Logbook).
The Fast & Heavy Algorithm was tested on a different personal computer with 16 GB of RAM.
List of métiers identified in the sample Logbook database for the activity of the Italian fishing fleet during year 2012.
| DCF Code | Extended definition |
| DRB_MOL_0_0_0 | Boat dredge for Molluscs |
| LLD_LPF_0_0_0 SWO | Drifting longlines for large pelagic fish |
| OTB_DES_>=40_0_0 | Bottom otter trawl for Demersal species |
| OTB_DWS_>=40_0_0 | Bottom otter trawl for Deep Water species |
| OTB_MDD_>=40_0_0 | Bottom otter trawl for Mixed demersal species and deep water species |
| OTM_MPD_>=20_0_0 | Midwater otter trawl for Mixed demersal and pelagic species |
| PS_SPF_>=14_0_0 | Purse seine for small pelagic fish |
| PS_LPF_>=14_0_0 | Purse seine for Large pelagic fish |
| PTM_SPF_>=20_0_0 | Pelagic pair trawl for Small pelagic fish |
| TBB_DEF_0_0_0 | Beam trawl for Demersal species |
List of dependencies (other R add-on packages) for VMSbase.
| Package Name | Scope | Reference |
| cairoDevice: Cairo-based cross-platform antialiased graphics device driver | Cairo/GTK graphics device driver with output to screen, file (png, svg, pdf, and ps) or memory (arbitrary GdkDrawable or Cairo context). | |
| chron: Chronological objects which can handle dates and times | Chronological objects which can handle dates and times | |
| cluster: Cluster Analysis Extended Rousseeuw et al. | Cluster Analysis, extended original from Peter Rousseeuw, Anja Struyf and Mia Hubert. | |
| ecodist: Dissimilarity-based functions for ecological analysis | Dissimilarity-based analysis functions including ordination and Mantel test functions, intended for use with spatial and community data. | |
| fields: Tools for spatial data | Fields is for curve, surface and function fitting with an emphasis on splines, spatial data and spatial statistics. Implementation of sparse matrix methods for large data sets and currently requires the sparse matrix (spam) package for testing and use with large data sets. | |
| ggmap: A package for spatial visualization with Google Maps and OpenStreetMap | Visualization of spatial data and models on top of Google Maps, OpenStreetMaps, Stamen Maps, or CloudMade Maps using ggplot2. | |
| gWidgets: gWidgets API for building toolkit-independent, interactive GUIs | Toolkit-independent API for building interactive GUIs. | |
| gWidgetsRGtk2: Toolkit implementation of gWidgets for RGtk2 | Port of gWidgets API to RGtk2 | |
| intervals: Weighted Logrank Tests and NPMLE for interval censored data | Functions to fit nonparametric survival curves, plot them, and perform logrank or Wilcoxon type tests. | |
| mapdata: Extra Map Databases | Supplement to maps package, providing the larger and/or higher-resolution databases. | |
| maps: Draw Geographical Maps | Display of maps. Projection code and larger maps are in separate packages (mapproj and mapdata) | |
| maptools: Tools for reading and handling spatial objects | Set of tools for manipulating and reading geographic data, in particular ESRI shapefiles; C code used from shapelib. It includes binary access to GSHHS shoreline files. The package also provides interface wrappers for exchanging spatial objects with packages such as PBSmapping, spatstat, maps, RArcInfo, Stata tmap, WinBUGS, Mondrian, and others. | |
| marmap: Import, plot and analyze bathymetric and topographic data | Import, plot and analyze bathymetric and topographic data | |
| outliers: Tests for outliers | A collection of some tests commonly used for identifying outliers | |
| PBSmapping: Mapping Fisheries Data and Spatial Analysis Tools | Two-dimensional plotting features similar to those commonly available in a Geographic Information System (GIS). Embedded C code speeds algorithms from computational geometry, such as finding polygons that contain specified point events or converting between longitude-latitude and Universal Transverse Mercator (UTM) coordinates. | |
| plotrix: Various plotting functions | Lots of plots, various labeling, axis and color scaling functions | |
| sp: classes and methods for spatial data | Classes and methods for spatial data. The classes document where the spatial location information resides, for 2D or 3D data. | |
| sqldf: Perform SQL Selects on R Data Frames | Manipulate R data frames using SQL. | |