The Mapping Application for Penguin Populations and Projected Dynamics (MAPPPD; www.penguinmap.com) is an open access decision-support tool designed for use by all Antarctic stakeholders. This system uses modern web-based technologies to deliver consistent and verifiable population data for the four most widespread penguin species in the Antarctic (emperor, Aptenodytes forsteri; gentoo, Pygoscelis papua; chinstrap, P. antarctica; and Adélie, P. adeliae). Such web-based technologies are increasingly important for long-term storage of curated data in the public domain. Furthermore, they allow us to make data freely available, which is important in ensuring the transfer of scientific knowledge (Klump and others 2006). Open access data are important for the creation of global agreements that rely on using the best available science and benefit from wide access to shared data (Dittert and others 2001). The MAPPPD tool acts as a portal to deliver population data to stakeholders including, but not limited to, consultative and non-consultative parties to the Antarctic Treaty System (ATS; for example, the Commission for the Conservation of Antarctic Marine Living Resources [CCAMLR] and its associated committees and working groups), the Council of Managers of National Antarctic Programs (COMNAP) representatives, scientists, non-governmental organisations (NGOs), fishing and tourism interests, and the general public. MAPPPD can play an important supporting role to monitoring programmes such as the CCAMLR Ecosystem Monitoring Program (CEMP) by providing information on regional distribution, abundance and population trends that can be linked to detailed measurements at CEMP sites for parameters involving, for example, reproductive success, foraging effort, diet and demography.
MAPPPD serves three primary functions: (1) as a means by which data on the distribution and abundance of penguin species can be submitted, vetted and stored in the public record; (2) as a tool for searching the existing state of knowledge on Antarctic penguin abundance and distribution, and estimating abundance at sites or aggregated abundance across larger areas of interest (for example, Antarctic Specially Protected Areas [ASPAs], Antarctic Specially Managed Areas [ASMAs], CCAMLR statistical areas and small-scale management units, proposed or existing Marine Protected Areas [MPAs]); and (3) to create and deliver checklists of all bird species at sites along the Antarctic Peninsula that are likely to be visited by tour ships or otherwise impacted by human activities.
Mapping Application for Penguin Populations and Projected Dynamics (MAPPPD)
There are two principal components of MAPPPD: the ‘back end’ (that is, code and processes not visible to website users), which includes the database of penguin abundance, as well as model output used for understanding population dynamics and occupancy at sites in the MAPPPD database; and the ‘front end’ web-based interface used for search and retrieval of information. The back end of MAPPPD is made up of several interacting components: (1) a PostgreSQL database; (2) a Bayesian population model calculating the number of nests at any site for any year covering years 1982 to present; and (3) a Bayesian occupancy model for determining probabilities of presence and breeding for all Antarctic avian species. Currently, a population dynamics model describing abundance through time is only available for Adélie penguins, though population models for the other three species of penguin are in development and will be added to MAPPPD in the future.
The database was built in Structured Query Language (SQL), with the R programming language (v3.2.0; R Core Team 2015) as the interface. It consists of 24 tables that are related via several primary key identifier fields. Once population data from the various data sources are processed, they are loaded into a table that contains information on the type of count (chicks, nests, adults), quality of the count, date, species counted and the associated citation. The table structure and procedures for version control that we have adopted allow us to trace the analysis process in a transparent fashion. The population model framework then creates estimates of current populations with corresponding credible intervals and generates forecasts for the abundance at any breeding population or in any user-defined collection of breeding locations. MAPPPD has been designed for ease of use by the management community and as a mechanism for open source development, testing and benchmarking of population models to support Antarctic science and management.
Each site in MAPPPD is intended to represent one biologically relevant population that is separated, except through occasional migration, from populations at other ‘sites’. The delineation of a geographical area as a ‘site’ is largely consistent with historical precedent (for example, Croxall and Kirkwood 1979; Woehler 1993; Lynch and LaRue 2014), including in cases where several small islands have been combined into a single ‘site’. This was done so that historical survey data could be used for more complete population time series. We have used the names that have been historically associated with each site; however, we have eliminated synonymous names and merged the associated time series where required. We have also renamed sites as needed to correct previous naming errors (based on, for example, an incorrectly identified geographical feature).
Data on penguin abundance (number of breeding pairs or chicks) and occupancy (presence/absence) form the largest component of MAPPPD's database. We classify the sources of abundance and occupancy data for MAPPPD into four categories. By far the largest source of abundance and occupancy data currently in the MAPPPD database is the publicly available published literature, which includes both peer-reviewed scientific literature as well as reports, management plans and other ‘grey’ literature. Data contributed through the literature may derive from direct ground surveys, aerial counts, satellite counts or counts from photographs, and the methods associated with each record are included with the data's metadata. Quality flags, following the precedent set by Croxall and Kirkwood (1979) and other compilations of penguin census data, are included with all data records so end users can filter by count accuracy. Quality flags of ‘1’ are those of the highest accuracy (for example, recent ground counts that are site-wide), while flags of ‘5’ are those of the lowest accuracy (for example, estimates from satellite data or modelled output). The Antarctic Site Inventory project and publications stemming from it (for example, Lynch and others 2013; Casanovas and others 2015) contribute 41.1% of all the population data in the MAPPPD database; the second biggest contributor is the Landcare Research dataset (http://www.landcareresearch.co.nz/resources/data/adelie-census-data) with 12.3%. All other contributors combined make up the other 46.6% of surveys in MAPPPD for all species.
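The count table and quality flags described above can be illustrated with a minimal sketch. It assumes a single simplified table; MAPPPD itself uses a 24-table PostgreSQL database whose schema is not given here, so the table name, column names and all rows below are hypothetical placeholders, and SQLite (from the Python standard library) stands in for PostgreSQL purely so the example is self-contained.

```python
import sqlite3

# Hypothetical, simplified stand-in for the count table described in the text:
# type of count, quality flag (1 = highest accuracy, 5 = lowest), date,
# species and associated citation. Not the actual MAPPPD schema.
SCHEMA = """
CREATE TABLE counts (
    site_id     TEXT    NOT NULL,
    species     TEXT    NOT NULL,
    count_type  TEXT    NOT NULL CHECK (count_type IN ('nests', 'chicks', 'adults')),
    count_value INTEGER NOT NULL CHECK (count_value >= 0),
    quality     INTEGER NOT NULL CHECK (quality BETWEEN 1 AND 5),
    survey_date TEXT    NOT NULL,
    citation    TEXT    NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database holding three made-up count records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    rows = [
        # Placeholder values only; these are not real survey data.
        ("SITE_A", "adelie", "nests", 1200, 1, "2010-12-01", "Placeholder citation A"),
        ("SITE_A", "adelie", "chicks", 900, 3, "2011-01-20", "Placeholder citation B"),
        ("SITE_B", "gentoo", "nests", 450, 5, "2012-11-15", "Placeholder citation C"),
    ]
    conn.executemany("INSERT INTO counts VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    return conn


def counts_at_or_better(conn, max_flag):
    """Return records whose quality flag is max_flag or better (numerically lower)."""
    cursor = conn.execute(
        "SELECT site_id, species, count_type, count_value, quality, survey_date, citation "
        "FROM counts WHERE quality <= ? ORDER BY survey_date",
        (max_flag,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    for record in counts_at_or_better(demo, 3):
        print(record)
```

Because lower flag values indicate higher accuracy, filtering by count accuracy reduces to a `quality <= threshold` condition; the CHECK constraints simply reject records whose count type or flag falls outside the ranges named in the text.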
The second category of data are those that are contributed directly to MAPPPD, which may include pre-publication survey data or census data collected on an ad hoc or opportunistic basis by professional or citizen scientists in the region. While this is currently a minor component of MAPPPD, we expect this data stream to grow in future iterations of the web application. To ensure consistent quality and metadata for users, data contributed to MAPPPD will be vetted by MAPPPD collaborators before integration. This is a twofold process wherein we first ensure that data being submitted are representative of site-wide counts for the four penguin species in our database and, second, that precision estimates are consistent with those already existing in the database. In this way, MAPPPD will serve as a data ‘clearinghouse’ for ad hoc data that may otherwise go unpublished and will allow data contributors to receive proper credit.
The third category of data are those derived from historical sources (for example, aerial photographs) that have not been included in previous census data compilations. Finally, when fully developed, MAPPPD will automatically ingest satellite data from NASA and other imagery providers, extract pixels classified as guano (Fretwell and others 2012; LaRue and others 2014; Lynch and LaRue 2014; Lynch and Schwaller 2014; Witharana and Lynch 2016) and incorporate those data into the MAPPPD database and, when appropriate, into models for current abundance and forecasts of future abundance.
The distribution of survey data by species per year varies greatly and is dominated by the Pygoscelis spp. penguins (Adélie, chinstrap and gentoo). While the number of surveys with chinstrap penguin data has remained roughly consistent over time, the numbers of surveys with Adélie and gentoo penguin data have decreased and increased, respectively, since the 1980s. This shift is probably due to an increase in visitation to gentoo penguin colonies on the Antarctic Peninsula, which represent a large and growing proportion of sites visited by passenger vessels (Bender and others 2016). Except for a large effort in 2009, emperor penguins represent the smallest component, reflecting both their smaller population size and the remoteness of their colonies (Fig. 1). However, we expect that as satellites are more frequently used to estimate colony size for emperor penguins, the number of abundance estimates available for this species will increase in the next decade.
On the front end, the user can query the database and models by individual colonies, groups of colonies, species, region or user-defined polygons (Fig. 2). Estimates of population size are delivered as graphical outputs that can be downloaded in a variety of formats. MAPPPD's underlying population model generates Bayesian posterior distributions (that is, the statistical distribution associated with predictions of the number of nests at a colony) for the number of breeding pairs (that is, number of nests) in each colony (known to exist) in each year since 1982. An initial model has been developed to illustrate the functions available for display and download of results, the details of which will be distributed in a public GitHub repository. When complete, MAPPPD will include tools for community-contributed population models; the suite of models available within MAPPPD will facilitate the creation and display of ensemble model predictions, and models will be compared each year in terms of their predictive performance as new data become available. Since the model development toolkit is still undergoing development, users are encouraged to interpret the displayed results with caution (that is, although general trends will remain the same, be aware that results are likely to vary in the final version); downloaded results will be associated with metadata on the versions of each model included.
Bayesian posteriors are summed across selected sites for each year to give a population estimate for the entire user query. Population model predictions within MAPPPD use the median, rather than the mean, of the posterior distribution for each year because it provides a more sensible measure of central tendency for right-skewed distributions. Uncertainty is communicated through the display of 90% highest posterior density credible intervals. For example, if the model output says that the Adélie penguin population at Petermann Island is 939 (416–1,482) breeding pairs, we would say that there is a 90% probability that the true abundance of Adélie penguins at Petermann Island was between 416 and 1,482 breeding pairs, and that there was an equal probability that the true count was greater than 939 and less than 939.
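The aggregation and summary steps described above can be sketched as follows. This is a minimal illustration rather than MAPPPD's own code: the function names are hypothetical, the draws in the demonstration are simulated from an arbitrary lognormal distribution instead of being taken from the population model, and the sketch assumes that the draws for different sites are aligned (draw i at every site comes from the same posterior sample) so that they can be summed draw by draw.

```python
import math
import random
import statistics


def aggregate_sites(site_draws):
    """Sum posterior draws across sites, draw by draw, for a single year.

    site_draws maps a site code to a list of posterior draws (nests).
    Assumes every site has the same number of aligned draws.
    """
    if len({len(draws) for draws in site_draws.values()}) != 1:
        raise ValueError("all sites must have the same number of draws")
    return [sum(values) for values in zip(*site_draws.values())]


def hpd_interval(draws, mass=0.90):
    """Narrowest interval containing `mass` of the draws (empirical HPD)."""
    ordered = sorted(draws)
    n = len(ordered)
    # Number of draws the interval must contain; the small offset guards
    # against floating-point round-off in mass * n.
    k = max(1, math.ceil(mass * n - 1e-9))
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]


def summarise(draws, mass=0.90):
    """Posterior median with a highest posterior density credible interval."""
    lower, upper = hpd_interval(draws, mass)
    return {"median": statistics.median(draws), "lower": lower, "upper": upper}


if __name__ == "__main__":
    rng = random.Random(1)
    # Simulated, right-skewed draws for two hypothetical sites (not model output).
    simulated = {
        "SITE_A": [rng.lognormvariate(6.5, 0.3) for _ in range(4500)],
        "SITE_B": [rng.lognormvariate(5.0, 0.5) for _ in range(4500)],
    }
    print(summarise(aggregate_sites(simulated)))
```

Summing the draws before summarising, rather than adding the per-site medians and interval bounds, keeps the aggregate interval consistent with the joint posterior; the medians and credible bounds of individual sites do not in general add up to those of the total.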
Occupancy, or the presence of species at a site, can be measured in different ways, such as whether a species is ever physically present at a site and whether a species uses a site as a breeding location. MAPPPD uses a series of Bayesian single-species multistate occupancy models to simultaneously calculate both an annual probability of presence and probability of breeding for 16 breeding bird species (Schrimpf and others unpublished). These models currently use presence/absence data from ground surveys by the Antarctic Site Inventory at 181 sites throughout the Antarctic Peninsula, South Shetland Islands, and South Orkney Islands from 1995–2015 (Naveen and Lynch 2011). Future research will allow other survey types, including citizen-science bird checklists such as those submitted to eBird (Sullivan and others 2009), to inform these models.
Modelling both presence and breeding as a probability is necessary to account for uncertainty in the survey process. Multistate occupancy models incorporate probability of non-detection for each state, and accommodate the possibility that a survey missed evidence of breeding and/or presence during a visit. Probability of presence is useful for MAPPPD users who wish to create a list of species most likely to be seen at a site, while probability of breeding is most useful for assessing the chance that flying bird species with cryptic nesting habitats (for example, storm petrels) may be using a site for reproduction. Though Bayesian occupancy modelling is still relatively new, similar models have been adapted to suit different needs (for example, MacKenzie and others 2009; Bailey and others 2014). The models constructed for MAPPPD currently incorporate site-specific November sea ice concentration averaged across all years as an environmental covariate and previous-state (that is, previous population estimate) as a biological covariate, which help to estimate occupancy for sites with very few data. On the front end, MAPPPD displays the probability of each species either being present or present and breeding, which can be collapsed to a simpler checklist of birds likely to be present and/or breeding at a site. These probabilities are influenced by direct observation at the site in question, as well as by the statistically estimated relationship between occupancy status and environmental conditions, the latter of which allows us to estimate the probability of occupancy for sites with known site characteristics even if no survey data have been collected.
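To illustrate how imperfect detection enters a multistate occupancy calculation, the sketch below applies Bayes' rule to a generic three-state formulation (absent, present without breeding, present and breeding). It is not the model used in MAPPPD, which also includes the sea ice and previous-state covariates described above; the parameter names (psi, r, p1, p2, delta) and all numerical values are assumptions chosen for illustration.

```python
# True states and observation codes share the same labels:
# 0 = absent / nothing detected
# 1 = present, not breeding / species detected without breeding evidence
# 2 = present and breeding / breeding evidence detected


def state_prior(psi, r):
    """Prior over true states: psi = P(present), r = P(breeding | present)."""
    return [1.0 - psi, psi * (1.0 - r), psi * r]


def observation_probs(p1, p2, delta):
    """P(observation | true state) for a single visit.

    p1: detection probability when present but not breeding
    p2: detection probability when present and breeding
    delta: P(breeding evidence is seen | detected, breeding)
    Assumes no false positives: a state is never over-reported.
    """
    return [
        [1.0, 0.0, 0.0],
        [1.0 - p1, p1, 0.0],
        [1.0 - p2, p2 * (1.0 - delta), p2 * delta],
    ]


def state_posterior(prior, obs_matrix, history):
    """Posterior over true states given a list of per-visit observations."""
    weights = []
    for state in range(3):
        weight = prior[state]
        for obs in history:
            weight *= obs_matrix[state][obs]
        weights.append(weight)
    total = sum(weights)
    if total == 0.0:
        raise ValueError("observation history has zero probability under the model")
    return [weight / total for weight in weights]


def presence_and_breeding(posterior):
    """Collapse the state posterior to the two probabilities shown to users."""
    return {"present": posterior[1] + posterior[2], "breeding": posterior[2]}


if __name__ == "__main__":
    prior = state_prior(psi=0.6, r=0.5)                       # illustrative values
    obs = observation_probs(p1=0.5, p2=0.8, delta=0.5)        # illustrative values
    # Two visits with no detection lower, but do not eliminate, P(present).
    print(presence_and_breeding(state_posterior(prior, obs, [0, 0])))
```

Under these assumptions a species missed on every visit retains a non-zero probability of presence and of breeding, which is the behaviour that makes a probability-based checklist more informative than a list of confirmed sightings alone.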
MAPPPD's graphical user interface
The front end of MAPPPD was programmed using the Django framework in Python version 2.7.11. All HTML and CSS code for creating the static website displays was written to be compliant with standards set by the World Wide Web Consortium (https://www.w3.org/). The concept behind the front end was to maintain a fully open access framework for end users to easily query and download population data. Furthermore, we aimed for simplicity with respect to the layout, ensuring the pages were self-explanatory, with headers that could be found by web searches. The overall website is broken down into a home page, publications page, about page, news page and the MAPPPD application. All pages, except the MAPPPD application, are static content pages that contain information about MAPPPD and links to other internal and external sites. The structure and content of these pages are likely to change in the future as new web programming tools become available and more features are added in response to feature requests by MAPPPD users. The following describes MAPPPD as it currently exists.
Upon completion of a query, the results window displays five tabs: sites, counts, count plots, population estimate and download. The ‘sites’ tab displays the full names of all sites selected, and shows the four-letter ID code when the user hovers the mouse over a site name. The ‘sites’ tab also contains a button to locate the site on the map (that is, a red circle ‘pings’ and centres on the location). An option to view any associated Landsat images for each site appears as a button under this tab.
Each Adélie penguin site is associated with a polygon mask that includes the location of the colony with a buffer that accommodates misalignment of satellite imagery used for guano detection; similar masks for the other species are in development. These site-delineating polygons are available for download under the ‘sites’ tab and should minimise confusion over the locations of penguin colonies.
The ‘counts’ tab lists all available count data for all species and all sites within the user's query. When the user selects the species of interest, available count data and associated citations (via a button click) are displayed within expandable tables. All of the count data can then be visualised under the ‘count plots’ tab as scatter plots, which display all nest counts for all species. Under this tab the user can select which site to display and can then download the figure as a PNG, JPEG or SVG file.
The ‘population estimate’ tab is where the results of the population model for each site reside. Once a user selects sites using the query tools, MAPPPD aggregates the results across the individual sites to give an overall estimate. This is the most computationally intensive part of MAPPPD, as 4,500 random posterior draws from each site need to be summed across the query. As data are loaded, a progress bar updates until complete, which then activates the ability to view population estimates for all sites combined and individual sites. These figures are interactive and can be downloaded as PNG, JPEG or SVG files. All results from the ‘population estimate’ and ‘counts’ tabs will be available to download under the ‘download’ tab in various formats (for example, CSV, SHP, XLSX, etc.). The user will also have the option to generate a PDF report of the results under the ‘download’ tab.
Occupancy data are accessed by switching layers on the map and are currently available only for the Antarctic Peninsula. The user can choose to query by site, species or polygon in the same way that the population data are queried. Results from the occupancy data are displayed in the ‘results’ window as a checklist in expandable tables based on probability of presence or present and breeding.
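Querying by a user-defined polygon, as described above for both population and occupancy data, amounts to selecting the sites whose coordinates fall inside the polygon. The sketch below uses the standard ray-casting test; it is not MAPPPD's implementation, the site codes and coordinates are hypothetical, and it assumes coordinates can be treated as planar (longitude, latitude) pairs in a polygon that does not cross the antimeridian or enclose the pole.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: is (lon, lat) inside the polygon?

    polygon is a list of (lon, lat) vertices in order; the last vertex is
    joined back to the first. Points exactly on an edge may fall either way.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def sites_in_polygon(sites, polygon):
    """Return the codes of all sites located inside the polygon, sorted."""
    return sorted(
        code for code, (lon, lat) in sites.items()
        if point_in_polygon(lon, lat, polygon)
    )


if __name__ == "__main__":
    # Hypothetical four-letter site codes with made-up coordinates.
    demo_sites = {"AAAA": (-62.0, -65.0), "BBBB": (-70.0, -65.0)}
    query = [(-65.0, -66.0), (-60.0, -66.0), (-60.0, -64.0), (-65.0, -64.0)]
    print(sites_in_polygon(demo_sites, query))
```

A production system would more likely delegate this test to a spatial database or GIS library and work in a polar projection, since planar treatment of longitude and latitude becomes increasingly distorted at high latitudes.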
MAPPPD support for ATS decision-making
The MAPPPD interface is designed to assist the ATS in achieving a variety of management objectives. At the Treaty level, all human activities must be evaluated in advance for their potential environmental impacts, ideally using the best available, peer-reviewed scientific data and information (Lynch and others 2016). For activities involving sites where penguins may be breeding, MAPPPD can help to ensure that supporting environmental documentation for such activities – whether Comprehensive Environmental Evaluations (CEEs) or Initial Environmental Evaluations (IEEs) – is fact-based and involves the most recent, readily available data on a site's breeding penguin population and population trends. Furthermore, MAPPPD will assist in the future development of guidelines for visitors for additional sites (ATS 1994).
The CAMLR Convention was signed in 1980 and sets forth the principle of ecosystem management and specifically emphasises under Paragraph 3 of Article II: the prevention of a decrease in size of any harvested population to levels below those ensuring its stable recruitment; maintenance of ecological relationships between harvested, dependent and related populations of Antarctic marine resources and the restoration of any depleted populations; and prevention of changes or minimisation of the risk of change that are not potentially reversible over two or three decades; all aimed at making possible the sustained conservation of Antarctic marine living resources. This led to the formation of monitoring programmes, such as CEMP, designed to support the view, set in 1991 by CCAMLR, that:
‘Reactive management – the practice of taking management action when the need for it has become apparent – is not a viable long-term strategy for the krill fishery. Some form of feedback management, which involves the continuous adjustment of management measures in response to information, is to be preferred as a long-term strategy. In the interim, a precautionary approach is desirable and in particular, a precautionary limit on annual catches should be considered’ (Constable and others 2000).
MAPPPD can assist CCAMLR in fulfilling its management objectives by providing data from sentinel species (for example, penguins) used to monitor ecosystem changes in line with techniques based on whole-ecosystem approaches (Fabra and Gascon 2008). These data, in concert with CEMP, can be used to inform CCAMLR's precautionary catch limits for Antarctic krill (Euphausia superba), which is a major prey source for many Antarctic predators (Constable and others 2000), including but not limited to penguins, and may be vulnerable to over-exploitation (Croxall and Nicol 2004). MAPPPD has the potential to play a central role in linking population processes (for example, predator–prey dynamics), environmental conditions and human activities (for example, resource extraction, krill fisheries) to population assessments, which allows for the re-assessment of management procedures in accordance with the precautionary principle (Fig. 3).
To supplement and enhance the ecosystem-based management approach, it has been suggested that model predictive control systems be put in place to act as a feedback mechanism for management. In simulation studies, predictive control systems decreased the chances of target stocks falling below a specific reference point (Hill and Cannon 2013). In the future, MAPPPD could act as a predictive control system for penguins, as efforts are currently underway to include population forecasts as a feature of the application.
Recently, discussions within CCAMLR have included the idea of seasonal closures or ‘no take’ zones in the vicinity of penguin breeding colonies (CCAMLR 2015). MAPPPD can enable spatially and temporally comprehensive analyses of penguin distribution and abundance data vis-à-vis krill fishing vessel track and catch data, results from which should assist and guide CCAMLR in any forthcoming decision-making. In addition, MAPPPD may also assist with the interpretation and utility of data flowing from the CEMP sites, since three of the four species in MAPPPD (Adélie, chinstrap and gentoo penguins) represent key species in CEMP (Agnew 1997). Specifically, MAPPPD can provide a larger spatial context for more detailed studies occurring at long-term CEMP study sites and provides a data-driven mechanism for identifying new CEMP sites.
There are several stakeholders in the Antarctic ecosystem including governments, fisheries managers, scientists, conservation-based NGOs and, more recently, tour operators (Cavanagh and others 2016). In order to decrease potential conflict between these parties, it is important that all participants have access to the best available science and transparent, easy-to-use decision-support tools. Because MAPPPD is easily queried, provides all known publicly available data sources and is open access, it acts as an ideal tool to help bridge the gap between stakeholders by providing all individuals the same information for evaluating management proposals. Critically, MAPPPD operates at any user-defined spatial scale and naturally accommodates not only single-site queries for environmental impact assessment studies but also scenario planning involving groups of sites such as an ASPA or MPA.
The MAPPPD application is less a static tool than a platform for data sharing and model testing, and its mission is to support Antarctic decision-making using models and code that can be shared, modified and discussed by the widest possible community of Antarctic stakeholders. Its ultimate functionality will depend on community input, and we encourage users (or potential users) to contact us with suggestions and feature requests.
The MAPPPD application is funded by the US National Aeronautics and Space Administration Award No. NNX14AC32G. We would like to thank C. Foley for comments and edits.
To view supplementary material for this article, please visit https://doi.org/10.1017/S0032247417000055