Census analysis

Census estimator for asynchronous populations

Multi-season version

Counts of individuals made over many seasons are converted into an estimate of the total population size in each season. The method covers asynchronous populations, in which there is never a day when all individuals are present at once. The degree of asynchrony is estimated from the mean length of time each individual remains in the study area (referred to as the tenure of individuals). If the entire season is much longer than the mean tenure, asynchrony is high (i.e. synchrony is low) and the maximum count is much lower than the total population. At the other extreme, if tenure equals the duration of the season, there is complete synchrony and the count of individuals equals the population size.
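The relationship between tenure and the maximum count can be illustrated with a minimal sketch. It is not the published model: it assumes arrival and tenure are independent Gaussians, so that the expected fraction present on a day is the fraction that has arrived minus the fraction that has departed. All function names and numbers are hypothetical.

```python
from math import erf, sqrt

def normal_cdf(x, mean, sd):
    """Cumulative probability of a Gaussian at x."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

def fraction_present(day, arrival_mean, arrival_sd, tenure_mean, tenure_sd):
    """Expected fraction of the population present on a given day.

    Assumes independent Gaussian arrival and tenure, so departure
    (arrival + tenure) is also Gaussian. The small chance of a
    negative tenure is ignored.
    """
    departure_mean = arrival_mean + tenure_mean
    departure_sd = sqrt(arrival_sd ** 2 + tenure_sd ** 2)
    arrived = normal_cdf(day, arrival_mean, arrival_sd)
    departed = normal_cdf(day, departure_mean, departure_sd)
    return arrived - departed

def peak_fraction(arrival_sd, tenure_mean, tenure_sd, days=range(-60, 200)):
    """Largest expected fraction present on any single day (arrival mean fixed at day 50)."""
    return max(
        fraction_present(d, 50.0, arrival_sd, tenure_mean, tenure_sd) for d in days
    )

# Hypothetical values: same arrival spread, short versus long tenure.
low_synchrony = peak_fraction(arrival_sd=15.0, tenure_mean=10.0, tenure_sd=3.0)
high_synchrony = peak_fraction(arrival_sd=15.0, tenure_mean=120.0, tenure_sd=3.0)
print(f"short tenure: peak count is {low_synchrony:.0%} of the population")
print(f"long tenure:  peak count is {high_synchrony:.0%} of the population")
```

With a short tenure the peak count is a small fraction of the population; with a tenure much longer than the spread of arrivals it approaches the whole population.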

The model requires prior estimates of the mean tenure per individual and of the variance of tenure among individuals. Without them, there are too many parameters to fit. Ideally, tenure is known from observations of some marked individuals. Both the mean and the variance of tenure must be input as prior probability distributions in a Bayesian sense. Some background on the use of priors is helpful in understanding the method.

On the other hand, the distributions of arrival and departure dates of individuals are estimated by the model. No knowledge of either is needed in advance. Both distributions are assumed to be Gaussian. The model also estimates the correlation between arrival and tenure, i.e. whether late-arriving individuals have shorter (or longer) tenure on the colony.
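A correlation between arrival and tenure can be illustrated by simulation. The sketch below is an assumption-laden illustration, not the model's fitting procedure: it draws arrival and tenure from a bivariate Gaussian with correlation rho, then counts who is present each day. Function names and all parameter values are hypothetical.

```python
import random
from math import sqrt

def simulate_individuals(n, arrival_mean, arrival_sd, tenure_mean, tenure_sd, rho, seed=1):
    """Draw (arrival, tenure) pairs from a bivariate Gaussian with correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        arrival = arrival_mean + arrival_sd * z1
        # Reusing z1 gives tenure the requested correlation with arrival.
        tenure = tenure_mean + tenure_sd * (rho * z1 + sqrt(1.0 - rho ** 2) * z2)
        pairs.append((arrival, max(tenure, 0.0)))  # tenure cannot be negative
    return pairs

def pearson(pairs):
    """Sample correlation between the two members of each pair."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)

def count_on_day(pairs, day):
    """Number of individuals that have arrived but not yet departed."""
    return sum(1 for arrival, tenure in pairs if arrival <= day < arrival + tenure)

# Hypothetical population of 2000 in which late arrivals stay for less time.
pairs = simulate_individuals(2000, 50.0, 12.0, 31.0, 6.0, rho=-0.5)
daily_counts = [count_on_day(pairs, day) for day in range(0, 131)]
print("sample correlation:", round(pearson(pairs), 2))
print("maximum daily count:", max(daily_counts), "of", len(pairs))
```

A negative correlation compresses the departure dates, which raises the peak count relative to the uncorrelated case; a positive correlation does the opposite.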

The power of the multi-season approach is that some seasons with poor coverage will still yield good population estimates as long as other seasons have many counts. This is based on the assumption that individual phenology is consistent (but not identical!) across seasons. Some knowledge of multi-level statistical modeling will be helpful in understanding how this works.

The model assumes all individuals present on any day are detected and counted. Incomplete detection would have to be estimated with additional information.

To execute the model, a table of counts per day in one or more seasons must be input (see text box below), along with basic input parameters needed to initiate the model. Detailed instructions follow, along with a sample data table. Details of the procedure are published in "Estimating population size in asynchronous aggregations: a Bayesian approach and test with elephant seal censuses".


Input parameters and data. Instructions for each are given below.

Start day each season
End day each season
Mean tenure of one individual
Prior standard deviation of mean tenure
CV of tenure among individuals
Prior standard deviation of CV tenure
Steps
Burn-in
Show steps
Paste from a spreadsheet (Excel, LibreOffice, OpenOffice, etc.) or tab-delimited ASCII into the text box
  • There must be 3 columns of integers
  • There should be no header row
  • Column 1 must be Season: an integer. In typical use this is a year, but other numbers will work
  • Column 2 must be Day within Season: an integer. In typical use this is the day within each year
  • Column 3 is the Count on each day: must be an integer
  • A single season is allowed, but then the hyper-parameters are meaningless
  • Days must have the same meaning each year, i.e. day 10 might mean 10 Jan every year
  • It is not necessary to have matching days every year; one year can have day 1, 5, 10, the next year 7, 8, 12
  • Include only days with counts, with no blank records; some seasons may have no data at all
  • Some seasons can have few counts, or even a single day, but some seasons need a full series of counts
  • Sample data below
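The format rules above can be checked before pasting. The following is a minimal sketch of such a check, not part of the tool itself; the function name `parse_counts` and the example rows are illustrative only.

```python
def parse_counts(text):
    """Parse tab-delimited rows of Season, Day, Count.

    Returns a dict mapping each season to a list of (day, count) tuples.
    Raises ValueError on a header row, a blank record, or a wrong column count.
    """
    seasons = {}
    # Ignore leading or trailing newlines left over from a copy and paste.
    for line_number, line in enumerate(text.strip("\n").splitlines(), start=1):
        if not line.strip():
            raise ValueError(f"line {line_number}: blank records are not allowed")
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(f"line {line_number}: expected 3 columns, got {len(fields)}")
        try:
            season, day, count = (int(field) for field in fields)
        except ValueError:
            raise ValueError(
                f"line {line_number}: all three columns must be integers"
            ) from None
        seasons.setdefault(season, []).append((day, count))
    return seasons

# Illustrative rows: two counts in one season, one count in the next.
example = "2013\t31\t150\n2013\t39\t422\n2014\t26\t77\n"
print(parse_counts(example))
```

A header row such as "Season, Day, Count" fails the integer check, which matches the rule that no header row should be included.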

When the execution button is clicked, nothing will change on the screen as the model runs, but the browser should indicate that it is waiting. Results appear when the run is complete. A full 6000 steps with 10 or more seasons will take several minutes to finish, so it is very helpful to start with a test run of few steps (500 or fewer) to confirm that results are output. When execution completes, the estimated population size in each season, with confidence limits, is printed to the screen and saved in a table for download. Estimated hyper-parameters are also reported: the mean across years of arrival date, the standard deviation in arrival, and the correlation between arrival and tenure.




Input parameters needed:

Start Day: Start day each season should be earlier than counts ever start, i.e. before the earliest arrivals. It can be negative. The results are most accurate if there are days with a predicted count near 0. The default of -30 works for northern elephant seals; it is equivalent to 31 October, and Day 1 is then 1 December.
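The day numbering can be converted to calendar dates with a short helper. This is a sketch under the convention stated above (Day 1 is 1 December, so Day -30 is 31 October); the function name and the choice to pass the calendar year containing Day 1 are assumptions made for illustration.

```python
from datetime import date, timedelta

def season_day_to_date(day_one_year, day, day_one_month=12, day_one_day=1):
    """Convert a day-within-season number to a calendar date.

    day_one_year is the calendar year in which Day 1 falls (an assumption
    of this sketch; the tool itself only sees integer day numbers).
    Day 0 and negative days fall before Day 1.
    """
    day_one = date(day_one_year, day_one_month, day_one_day)
    return day_one + timedelta(days=day - 1)

print(season_day_to_date(2012, 1))    # Day 1
print(season_day_to_date(2012, -30))  # default start day
print(season_day_to_date(2012, 57))   # a day in late January of the following year
```

Subtracting 1 from the day number is what makes Day 1, rather than Day 0, land on the reference date.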

End Day: End day each season should be a day later than the last departure.

Mean Tenure: The mean length of time an individual is present. This must be known independently for good estimates. The default (in days) is for northern elephant seals.

Prior SD of Tenure: The prior standard deviation (SD) of that mean tenure. It is the degree of confidence in the independent estimate. Ideally, it is very small; if it is high, it will add error to the estimated population size. It must be positive. The default is from northern elephant seals.

CV Tenure: The coefficient of variation (CV) in tenure among individuals (CV = SD/mean). This must be known independently for good estimates. Note the difference between CV tenure, which is a trait of the organism, and prior SD of tenure, which expresses confidence in the observations.

Prior SD of CV Tenure: The prior standard deviation (SD) of that CV tenure. It is the degree of confidence in the independent estimate of CV tenure. Ideally, it is very small; if it is high, it will add error to the estimated population size. It must be positive. The default is from northern elephant seals.
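The four tenure inputs can be derived from observed tenures of marked individuals. The sketch below shows one plausible approach and is an assumption, not the documented procedure: the prior SD of mean tenure is taken as the standard error of the mean, and the prior SD of CV tenure is taken from a bootstrap. The function name and the ten tenure values are hypothetical.

```python
import random
import statistics

def tenure_priors(tenures, n_boot=2000, seed=1):
    """Summarize observed tenures (in days) into the four tenure inputs."""
    n = len(tenures)
    mean = statistics.mean(tenures)
    sd = statistics.stdev(tenures)   # variation among individuals
    cv = sd / mean                   # trait of the organism
    se_mean = sd / n ** 0.5          # confidence in the mean

    # Bootstrap: resample individuals with replacement and recompute the CV.
    rng = random.Random(seed)
    boot_cvs = []
    for _ in range(n_boot):
        sample = rng.choices(tenures, k=n)
        boot_cvs.append(statistics.stdev(sample) / statistics.mean(sample))
    se_cv = statistics.stdev(boot_cvs)  # confidence in the CV

    return {
        "mean_tenure": mean,
        "prior_sd_mean_tenure": se_mean,
        "cv_tenure": cv,
        "prior_sd_cv_tenure": se_cv,
    }

# Hypothetical tenures of ten marked individuals, in days.
observed = [28, 31, 33, 30, 35, 27, 32, 34, 29, 31]
priors = tenure_priors(observed)
for name, value in priors.items():
    print(f"{name}: {value:.3f}")
```

Observing more marked individuals shrinks both prior SDs but leaves the CV itself roughly unchanged, which is the distinction between the trait and the confidence in its estimate.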

Steps: Number of steps to run the parameter search. Final results should use 6000-10000 steps, but first test with ~200 steps. This will confirm that the model runs, and it finishes quickly.

Burn-in: Preliminary steps to be discarded in parameter calculations. Must be < number of steps. Final run should be 2000. Test run can be any number as long as it is < number of test steps.
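The role of burn-in can be shown with a short sketch that discards the first steps of a parameter chain and summarizes the rest. It assumes the reported limits are percentiles of the retained steps, which the instructions do not state; the function name and the chain values are hypothetical.

```python
def summarize_chain(chain, burn_in, lower_q=0.025, upper_q=0.975):
    """Discard burn-in steps, then return (mean, lower limit, upper limit).

    Limits are taken as percentiles of the retained steps, using the
    sorted value at the truncated index.
    """
    if not 0 <= burn_in < len(chain):
        raise ValueError("burn-in must be smaller than the number of steps")
    kept = sorted(chain[burn_in:])
    n = len(kept)
    mean = sum(kept) / n
    lower = kept[int(lower_q * (n - 1))]
    upper = kept[int(upper_q * (n - 1))]
    return mean, lower, upper

# Hypothetical chain: 50 unconverged steps followed by 101 retained steps.
chain = [0] * 50 + list(range(100, 201))
print(summarize_chain(chain, burn_in=50))
```

If the burn-in were set to 0 here, the 50 early steps would pull the mean and the lower limit toward 0, which is why preliminary steps are excluded from the parameter calculations.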

Show Steps: After completion, the estimated hyper-parameters are printed to the screen at intervals of Show Steps steps. This can confirm that the estimates are converging, or suggest a problem.

Download this table as SampleSealData.csv

Sample Data:

2013	31	150
2013	39	422
2013	49	1076
2013	55	1308
2013	62	1303
2014	26	77
2014	27	93
2014	31	179
2014	33	227
2014	44	829
2014	59	1406
2014	60	1374
2014	87	161
2015	17	20
2015	31	154
2015	33	198
2015	44	701
2015	52	1117
2015	58	1276
2015	92	64
2016	33	230
2016	34	274
2016	43	677
2016	51	1258
2016	57	1414
2016	67	1284
2016	93	67
2017	28	46
2017	44	694
2017	56	1442
2017	57	1473
2017	91	93
2018	38	410
2018	54	1424
2018	62	1483
2018	83	382
2018	89	132