OCEANIC DRIVERS OF SEI WHALE DISTRIBUTION IN THE NORTH ATLANTIC

This study investigated the oceanic drivers of sei whale (Balaenoptera borealis) distribution in the central and eastern North Atlantic, and explored how distribution may have changed over almost three decades. Cetacean sightings data were available from Icelandic, Faroese and Norwegian surveys conducted throughout the central and eastern North Atlantic during summer between 1987 and 2015. Effective strip half width was estimated from the data to take account of variation in detection probability. Spatially-referenced environmental variables used as predictors in generalised additive models of sei whale relative density included: relief-related variables seabed depth, slope and aspect; monthly-varying physical oceanographic variables sea surface temperature (SST), mixed layer depth, bottom temperature, salinity, and sea surface height anomaly (SSH); and monthly-varying biological oceanographic variables chlorophyll-a concentration and primary productivity. Preliminary analysis considered which month (March-August) in the dynamic oceanographic variables explained most variability in sei whale density. Models including all variables (“full models”) could only be run for 1998-2015 because data for several variables were missing in earlier years. “Simple models" including only reliefrelated variables and SST were therefore run for 1987-89, and also for 1998-2015 for comparison. The best-fitting full model for 19982015 retained the covariates depth, May SST, May bottom temperature, July salinity, July SSH and July primary productivity. Of these, depth, May SST and July SSH were the strongest predictors of sei whale density. In the simple models for both 1987-89 and 19982015, depth (especially), May SST and seabed slope were the strongest predictors of sei whale density. The highest densities of sei whales were predicted in the Irminger Sea and over the Charlie-Gibbs Fracture Zone; a pattern driven by large negative SSH, deep water (>1500m) and polar-temperate SST (5-12°C). There was some inter-annual variability in predicted distribution and there appears to be a northward expansion in distribution consistent with prey species responding to ocean warming. The models could potentially be used to predict future distribution of sei whales based on future environmental conditions predicted by climate models.


INTRODUCTION
Cetacean distributions may be associated with environmental variables through oceanographic influences on prey species, relationships that may vary in space and over time (Hátún et al., 2009;Nøttestad et al., 2015;Víkingsson et al., 2015). This study investigated the oceanic drivers of sei whale (Balaenoptera borealis) distribution (Figure 1) in the central and eastern North Atlantic (Figure 2A), and explored how their distribution may have changed over a period of almost three decades. Changes in the distribution of other cetacean species have been observed in the North Atlantic (Øien, 1988;IJsseldijk et al., 2018) but this has not been explored for sei whales.
The sei whale is found in all oceans. It is typically restricted to temperate waters and is found in shelf seas (for example off eastern North America and around South America) as well as offshore waters (Horwood, 2009;Roberts et al., 2016;Sigurjónsson, 1995). Sei whales migrate to high latitude summer feeding grounds (Horwood, 2009). Whilst there have been sporadic reports of sei whales close to the ice edge, this is thought to be uncommon (Jonsgård, 1966). The International Whaling Commission recognizes three stocks of sei whales in the North Atlantic (Nova Scotia, Iceland-Denmark Strait, and North-eastern Atlantic) (Cooke, 2018). However, sei whale genetics from different North Atlantic locations have been compared and no evidence was found for population structure (Huijser et al., 2018). Estimates of abundance of sei whales in the central and eastern North Atlantic have been made from all North Atlantic Sightings Surveys (NASS), as summarised by Pike et al. (2019a). and sightings of sei whale groups (red circles) in the complete dataset for NASS andNILS 1987-2015. The size of the circle for each sighting is representative of the best estimate of group size.
The coverage and timing of the NASS has not been optimal for sei whales but estimates of 10,300 in 198910,300 in , 9,200 in 199510,300 in and 9,700 in 200710,300 in (Cattanach et al., 1993Borchers & Burt 1997;Pike et al., 2019b) indicate a likely minimum abundance of around 10,000 animals. Sei whales are rarely seen on surveys in the north eastern North Atlantic (Pike et al., 2019a).
The sei whale has the ability to capture prey both by engulfing patches of dense prey and by skimming on relatively low prey concentrations (Prieto et al., 2012). The ability to switch between feeding strategies is enabled by adaptations of having a finer baleen fringe than other rorquals (Collet, 1886) and having some mouth cross-section features similar to right whales (genus Eubalaena) (Brodie & Víkingsson, 2009). These adaptations allow sei whales to consume a wide variety of prey species including copepods, euphausiids, fish and cephalopods. The diet of sei whales is important because the distribution and abundance of prey species is expected to have a strong influence on sei whale distribution.
Prey preference appears to be highly dependent on the ocean basin and the swarming characteristics of the prey (Horwood, 1987). Sei whales in the North Atlantic are known to feed almost exclusively on the copepod Calanus finmarchicus when available (Christensen et al., 1992;Sigurjónsson, 1995). If copepods are absent, they will feed on euphausiids, such as northern krill (Meganyctiphanes norvegica) and Thysanoessa species, that congregate near the surface in dense shoals (Christensen et al., 1992). However, off Iceland, it is suggested that this pattern is inverted, and euphausiids are the main prey, followed by copepods (Sigurjónsson & Víkingsson, 1997). Whilst sei whales feed almost exclusively on plankton, they have been observed to consume fish in Icelandic waters, e.g. sandeel (Ammodytes tobianus), lumpfish (Cyclopterus lumpus) and capelin (Mallotus villosus) (Sigurjónsson & Víkingsson, 1997). Sei whales in the North Pacific, where they feed more extensively on small schooling fishes, have been recorded as consuming anchovies (Engraulidae), sauries (Scomberesocidae), jack mackerel (Carangidae), pollock (Gadidae), sardines (Clupeidae) and mackerel (Scombridae) (Shuntov & Ivanov, 2015). It is, therefore, possible that North Atlantic sei whales could also consume the Atlantic forms of these fish species.
Little is currently known about oceanographic influences on sei whale distribution. Aggregations of sei whales over the Mid-Atlantic Ridge have been associated with ocean fronts, northern facing seafloor (aspect) and interactions between bottom topography and flow gradients, especially at depths of less than 100m . Distributions of sei whales in the North Pacific have also been associated with ocean fronts, as well as cooler surface waters, negative sea surface height anomaly, and high chlorophyll-a concentration (Murase et al., 2014;Sasaki et al., 2013).
In this paper, we investigate the oceanic environmental drivers of sei whale distribution in the central and eastern North Atlantic, and explore how distribution may have changed over almost three decades.

Data
The cetacean survey data used for this project were collected as part of the North Atlantic Sighting Surveys (NASS) and the Norwegian Independent Line-transect Surveys (NILS) in the central and eastern North Atlantic during summer months between 1987 and 2015. The majority of the data were collected in July (59%) and August (36%), with some in June (5%). A summary of the effort and sightings data is given in Table 1 and the effort and sightings are shown in Figure 2B. Sei whales and fin whales can sometimes be difficult to identify to species with certainty. Here, sightings included in the analysis were those classified with a high and medium degree of certainty of being sei whales. It is possible that some of the probable sei whales were actually fin whales; equally, some sightings identified as fin whales could have been sei whales. Data were processed into segments of effort, as described in Ramirez-Martinez (2020).
The environmental variables used in analysis were selected on the basis of there being a potential for them to influence sei whale distribution through potential effects on prey, and on their availability for the whole survey area. The variables selected were: relief-related variables (depth, slope and aspect), dynamic variables measured remotely (SST, chlorophyll-a concentration and primary productivity) and dynamic variables reconstructed by an ocean model (mixed layer depth (MLD), sea surface height anomaly (SSH), bottom temperature and salinity). The sources and units of these environmental variables are described in Table 2. Other potential variables that might explain sei whale distribution, such as measures of secondary production, prey and ocean fronts, were not available at the scale of the survey area. A grid containing values of the environmental variables in each grid cell (cell size 25km x 25km, reflecting the lowest resolution of the available data - Table 2) overlaying the study area was created. The environmental variables in each grid cell were extracted from the sources listed in Table 2, mean weighted over a circular buffer (10km radius) centred on the mid-point of each grid cell. The grid was built using QGIS software using the vector grid function (QGIS Desktop 3.2.3; QGIS Development Team, 2018).

Estimation of effect strip half width (ESW)
Detection functions were fitted using the R package 'mrds' (version 2.2.0; Laake, 1999) to estimate the ESW. Sea conditions measured on the Beaufort scale, vessel identity and group size were included as potential covariates in the models to reduce bias and improve precision. Other covariates that could have affected detection probability, such as observer and swell height, were not available consistently in the data. These covariates were only retained in the model if the resultant Akaike's Information Criterion (AIC) score was lower (∆AIC > 2) than that for the model without the covariate. Beaufort scale was rounded to the nearest integer. Vessels from NASS were grouped into six categories based on similar observer platform heights. Platform height was not available for all vessels on NILS surveys, so these vessels were grouped into four categories based on vessel length, which was considered to be the most appropriate way to reflect vessel differences for these data.
Detection functions with hazard rate and half normal key functions were considered and truncation of the perpendicular distance data to improve model fit was investigated where there were outlier sightings far from the transect line. The final detection functions were selected by minimizing AIC, where appropriate, comparing the Cramér-von Mises goodness of fit test statistics, and visual inspection of the fitted model (Buckland et al., 2015). Estimated ESW was derived from the best fitting model.

Checking for collinearity
Collinearity between dynamic variables was assessed using multi-panel scatterplots ('ggplot2' and 'GGally' packages in R) and Pearson's product-moment correlation coefficients. Collinearity between two variables was considered sufficient not to include both variables in the same model when the correlation coefficient exceeded 0.7.

Selecting month for lagged dynamic variables
Oceanic conditions in spring months might influence sei whale density in summer through effects on primary production and hence higher trophic levels. To investigate this, monthly data for March-August for each of the dynamic variables were first used in a preliminary analysis to determine which month might best explain variability in the data for each of these variables. Only one month for each variable could be included in any model because values of each variable were highly correlated among months. Each covariate was modelled in a GAM with each month as a single smooth term. In cases in which models fitted equally well for more than one month (i.e. ∆AIC < 2), alternate "full models" (see below) were run.

Model fitting using GAMs
Sei whale relative density was modelled as a function of environmental variables using GAMs in the R package 'mgcv' (version 1.8.24;Wood, 2017). The response variable was the count of individuals in each effort segment. An offset (log of effective area searched, where effective area searched = segment length x 2 ESW) was included to account for variation among segments. The restricted maximum likelihood (REML) method was used to fit the models (Wood, 2017). The gamma argument was set to 1.4 to produce smoother models and correct overfitting without compromising model fit, as recommended by Kim and Gu (2004). AIC, quantile-quantile (QQ) plots, plots of Pearson residuals and response vs. fitted values plots were assessed to determine the most appropriate error distribution (Poisson, quasi-Poisson, negative binomial or Tweedie). All environmental variables listed in Table 2 were considered as smooth terms in the models. Cubic regression splines were fitted for all environmental variables, excluding aspect (0-360°), for which a cyclic regression spline was fitted to match start and end points.  (54) Shrinkage spline smooths were used for covariate selection, which penalize the terms in the model so that a smooth curve can be shrunk to a linear function or to zero (in which case a covariate can be removed from the model), as appropriate.
Models including all variables ("full models") could only be run for 1998-2015 because data for several variables were missing in earlier years. "Simple models" including only relief-related variables and sea surface temperature were run for 1987-89, and also for 1998-2015 for comparison. AIC was used for model selection (Akaike, 1973).

Model evaluation
Adequacy of the fit of the best-performing models was evaluated by visual inspection of the QQ and residual plots, and the shape and size of the confidence intervals around the smooth plots. Density maps from the full models were also compared to the distribution of observed sightings as a qualitative check for any obvious mismatch.

Predicted density
The results of best fitting models were used to predict the relative density (individuals/km 2 ) in each cell of the grid described above. Predictive maps were generated using the R package 'ggplot2' (version 3.1.0; Wickham, 2016). Where the grid values were outside the range of the environmental variables covered by the effort data, these grid cells were omitted from the prediction area to avoid extrapolation outside the range of the data and appear as white on the prediction maps. Weighted mean primary productivity (mg C m -2 day -1 ), derived from Chlorophyll-a, for the months of March to August 1998 to 2015.

Reconstructed dynamic variables:
Ocean mixed layer depth (mld) Weighted mean ocean mixed layer depth (m) for the months of March to August 1998 to 2015.
Global reanalysis, GLORYS2V4. Relies on three main components: 1) the ocean model, including the surface atmospheric boundary condition, 2) the data assimilation method, and 3) the assimilated components. (Same as mld).

Salinity (sal)
Weighted mean salinity (PSU) for the months of March to August 1998 to 2015.

Prediction uncertainty
To estimate the uncertainty of predicted density, coefficients of variation (CV) were calculated based on posterior simulation from the best models. From the posterior distributions of the model coefficients, 1000 coefficient vectors were simulated using 'mvrnorm' from the R MASS library (version 7.3.50, Ripley, 2009) and these were used to generate 1000 predictions for each grid cell. The CV was calculated as the standard deviation divided by the mean of these 1000 predictions. Maps of the CV were generated using 'ggplot2'.

Effort and sightings data
The full dataset of compiled Norwegian and Iceland-Faroes data from 1987-2015 contained 282,101 km of transect lines (20,957 segments of effort). Overall, 516 individual sei whales were sighted on effort during seven surveys (Table 1), with mean group size of 2.5 individuals (SE = 0.2). Maximum group size was 25 animals. There were very few observations north of 68°N, with the large majority of sightings (96.7%) between 51 o and 65°N ( Figure 2B).

Detection function
Sightings of sei whales from the Iceland-Faroes data were truncated so that sightings further away than 5000m were removed. No group size bias in detectability was found. The best detection function model for the NILS data was a hazard rate key function with no additional covariates. The best detection function model for the NASS data was a half-normal key function with vessel identity as a factor covariate.

Error distribution
The negative binomial error distribution best described the data. This was used for all models.

Collinearity
Only chlorophyll-a concentration and primary productivity were collinear in the same month; e.g. correlation between July chlorophyll-a concentration and July primary productivity = 0.91. July was the best fitting month for both these environmental covariates.

Full model
The best-fitting full model for 1998-2015 retained the covariates depth, May SST, May bottom temperature, July salinity, July SSH and July primary productivity (Table 3). Aspect and June mld were shrunk to zero degrees of freedom by the shrinkage regression splines and seabed slope had very low effective degrees of freedom (edf = 0.60, p-value = 0.095); therefore, these three covariates were dropped from the model. The July chlorophyll-a term was not included because it was collinear with July primary productivity. Diagnostic plots showed that the model fitted the data well (see Supplementary  File). This final model explained 54.7% of the deviance. All covariates had significant p-values (Figure 3), with depth, SST and SSH the most significant. There were positive effects (higher estimated relative density of sei whales) where SST was warmer than 5°C, bottom temperature was warmer than 275K (approx. 1.8°C) and salinity was greater than 34.7 PSU. There were negative effects at depths shallower than approximately 1400m, where SSH was less than -0.7m, and where primary productivity was greater than 1500 mg C m -2 day -1 .  Predicted sei whale distribution based on the full model is shown in Figure 4. Visual inspection indicates that the model prediction was generally a good reflection of the distribution of observations, including in the Irminger Sea and over the Charlie-Gibbs Fracture Zone. The predictions in these areas also have higher precision ( Figure 4B). Some lower density is predicted curving through the Norwegian and Greenland Seas, around the Faroe Islands and Jan Mayen; whilst this predicted density is lower, it is still contained within the region of higher precision in the prediction.

Simple models
In both simple models, depth was the strongest environmental variable for explaining relative density of sei whales, although SST still explained a high amount of the variability in the data, particularly in the early simple model (Table 3). Slope was shrunk to zero degrees of freedom and dropped from the late simple model but was retained in the early simple model. Diagnostic plots showed that both models fitted the data well (see Supplementary File). All retained covariates in both simple models were highly significant. The smooth plots are shown in Figure 5. In the early simple model  there was a negative effect on density in waters shallower than approximately 1,800m and positive effects where the slope was steeper than 1° and where SST was between 3.0° -8.0°C. Similarly, in the late simple model (1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015) there was a negative effect on density where waters were shallower than 1200m and a positive effect when SST was between 2.5°C and 11.0°C. In the late simple model, where SST < 0°C the confidence interval is so wide that it includes both positive and negative effects. This does not appear to be as a result of limited observations, as shown in the "rug plot" along the horizontal axis.
Predictions of sei whale density, based on the simple models, are shown in Figure 6. Visual inspection of both maps shows very similar predicted distributions in 1987-89 and 1998-2015, with densities predicted to be highest in the Irminger Sea, towards the Labrador Basin, and in the Norwegian Sea (between the Faroe Islands and Jan Mayen). The models predict similar areas to where the observed sightings were made, especially for 1998-2015. Maps of the CV of these predictions show high uncertainty in Arctic waters and the Western European Basin. However, for the regions where sei whales were predicted to be present, the CV is relatively low.

Prediction in 2001, 2007 and 2015
To illustrate variation in modelled distribution over time, Figure  7 shows

Environmental drivers of sei whale distribution
In all models of sei whale distribution, depth was consistently one of the strongest explanatory variables. The negative effect on density at depths shallower than 1200-1800m, depending on the model, suggests that shelf and slope waters are less favoured than deep waters by sei whales in this region. This result is similar to Waring et al. (2008) who found that sei whales over the Mid-Atlantic Ridge were most common over depths between 1500m and 3000m. May SST was also consistently one of the strongest predictors of distribution. There were positive effects on density when SST was between 3°C and 11°C in May. In the Irminger Sea, May is the month when the net heat flux changes from negative (mixing) to positive (stratification); this time also coincides with the spring bloom (Waniek & Holliday, 2006). The influence of SST in May on sei whale distribution during the summer (survey effort focused primarily in July) appears likely to be a result of changes in productivity influencing sei whale prey later in the year. SSH was the strongest predictor in the full model. Throughout the North Atlantic, values of SSH are generally negative and the largest anomaly is in the Irminger Sea (Häkkinen et al., 2013;Hátún et al., 2009), which is an area of high sei whale density. Depressed (negative) SSH can be associated with cyclonic eddies or cold-core rings, within which upwelling of nutrientrich water occurs that favours higher primary productivity and supports high biomass of phytoplankton, zooplankton and fish stocks (Biggs et al., 1997;Wormuth et al., 2000). These food resources associated with these oceanic features can attract higher trophic level predators, such as cetaceans (Wormuth et al., 2000). In the North Pacific, Murase et al. (2014) found sei whales to be associated with oceanographic fronts and reported mesoscale eddies associated with these fronts in the feeding area. In the North Atlantic, Skov et al. (2008) found aggregations of sei whales to be primarily associated with finescale ocean frontal processes to the north of the Charlie-Gibbs Fracture Zone. It is possible that oceanic fronts could enhance foraging efficiency because of primary productivity increasing prey biomass or advection processes aggregating prey (Skov et  1987-1989panels B andD: SST averaged over 1998-2015. In all panels, the circles show the observed sightings of sei whale groups made over the specified time periods, with size scaled to number of individuals in each group. al., 2008). High concentrations of C. finmarchicus in the upper 100m of the water column were identified to the north of the Charlie-Gibbs Fracture Zone, which overlapped with aggregations of sei whales . In the North Atlantic, these copepods appear to be the main prey species for sei whales (Christensen et al., 1992;Sigurjónsson, 1995).
Primary productivity was also an important predictor in the full model. Primary productivity is thought to have an indirect effect on sei whales because it drives the distribution and abundance of grazing zooplankton, such as C. finmarchicus, which in turn drives whale distribution. Visser et al. (2011) believed that baleen whales can track the secondary production induced by the spring bloom to find foraging areas. The North Atlantic can be considered as a multi-seasonal bloom region with spring (long lasting) and summer blooms (Waniek and Holliday, 2006). The summer blooms in the Labrador Sea start at the beginning of June and the summer blooms in the north-eastern Atlantic (Irminger and Iceland Basins) start at the beginning of July (Friedland et al., 2016). These bloom activities could explain why July was the best month for primary productivity for sei whales.
Salinity and bottom temperature were retained in the final full model but had a very weak influence on sei whale density as shown by edf < 1, and the 95% confidence interval including the line of zero effect across the range of the variable (Figure 3).

Variability in sei whale distribution in recent decades
There is some inter-annual variability in the sightings, as shown in Figures 6 and 7, which is consistent with the suggestion of Jonsgård and Darling (1977) that summer distributions of sei whales on their feeding grounds exhibit year-to-year variability. However, the regions of predicted highest density are consistently in the Irminger Sea and over the Charlie-Gibbs Fracture Zone, with some occurrence also in the Norwegian Sea, between the Faroe Islands and Jan Mayen. Other studies have found sei whales feeding over the Charlie-Gibbs Fracture Zone and migrating to the Labrador Sea (Olsen et al., 2009;Prieto et al., 2014;Skov et al., 2008;Waring et al., 2008). Sigurjónsson et al. (1991) referred to the possibility that the "stock" of sei whales previously hunted off Newfoundland and Labrador could extend as far as the Denmark Strait. The high density of sei whales predicted in the Irminger Sea, which lies between the Labrador Sea and the Denmark Strait, lends some support to this. There is little evidence for a directional change in distribution over the study period. The simple models show some signal of a slight northward expansion in distribution in 1998-2015 compared to 1987-1989 ( Figure 6) but this is not reflected in the predicted distribution from the full model for 2001, 2007 and 2015 (Figure 7).

Overlap in predicted habitats of sei whale and other species
Part of the habitat characterised as suitable for sei whales in this study overlaps with the habitat of the fin whale (Ramirez-Martinez, 2020). Similar to sei whales, the two most important environmental variables for predicting fin whale distribution in 1987-89 were depth and July SST, whilst in 1998-2015 it was depth and August SSH (Ramirez-Martinez, 2020). Likewise, in the Gulf of St Lawrence, Schleimer et al. (2019) found that water depth and aspect were important predictors of fin whale distribution, with highest densities occurring in deep waters, over steep and northward facing slopes, and where SST was greater than 12°C. Fin whales in the Northern Hemisphere are known to prey on euphausiids (primarily Arctic krill, Thysanoessa spp., but also northern krill) as well as capelin and the copepod C. finmarchicus (Gavrilchuk et al., 2014;Nøttestad et al., 2014;Reissler et al., 2015;Sigurjónsson & Víkingsson, 1997;Víkingsson, 1997). As these are the same prey species as sei whales (Christensen et al., 1992), there may be competition for prey between sei and fin whales in this region. However, it is not possible to assess the extent of this simply via observation. For example, Sigurjónsson et al. (1991) noted that in the 1989 survey, the distribution of sightings of fin and sei whales was largely non-overlapping, which could indicate a lack of competition or be the result of the species' response to competition.

Future research
Although prey data are not available for the whole study area, the direct effect of prey on sei whale distribution could be investigated in regions where such data are available. Copepod, krill, capelin and herring data are available in patches from acoustic surveys or pelagic trawls (e.g. Skern-Mauritzen et al., 2009;Víkingsson et al., 2015). Fitting models with these as covariates could indicate how much, if at all, prey data improved the fit and predictive ability of the models.
The current models could potentially be used to predict the future distribution of sei whales based on climate change induced effects on environmental conditions. This would require cross-validation to check the predictive power of the models and for the prediction grid to be populated with future values of environmental variables predicted by climate models. Such a prediction grid would likely include warmer SST values, a wider range of SSH values due to sea level rise, and fresher salinity values resulting from increased precipitation.

ADHERENCE TO ANIMAL WELFARE PROTOCOLS
The research presented in this article has been done in accordance with the University of St Andrews' Animal Welfare and Ethics Committee and in accordance with the national animal welfare laws and protocols applicable in the waters of the United Kingdom, Norway, Iceland and Denmark in which the animals were surveyed.