Gene Conservation Laboratory
Statistics Program for Analyzing Mixtures (SPAM) Software
SPAM Version History Page
Version 3.7b - 5 January 2005
No changes were made to the documentation for the previous version, 3.7. This release fixes a bug found in SPAM version 3.7 that affected synchronization between simulated mixtures when performing a likelihood ratio test (LRT) of mixture equality (see the example in the SPAM v 3.5 Addendum, pp. 25-30). The bug appeared only when testing mixtures of unequal sizes, which requires the keywords BEFORE and AFTER in the parameter section of the control files. These keywords allow simulated mixtures to be partitioned into (random) subsamples in a manner that is coordinated across multiple SPAM runs. Check the synchronization by comparing the *.rlk files that SPAM produces across all simulation runs. The random seeds in the last three columns of the *.rlk files must be the same for all runs; matching values indicate that the same random seeds were used in the mixture simulations.
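The seed comparison described above can be automated. The following is a minimal sketch, not part of SPAM: it assumes each *.rlk file is plain text with whitespace-separated columns and no header lines, which is an assumption about the file layout beyond the statement that the seeds occupy the last three columns. The function names are hypothetical.

```python
from pathlib import Path


def read_seed_columns(path):
    """Return the last three whitespace-separated fields of every line.

    Assumption: the file is plain text, one simulation per line, with the
    random seeds in the last three columns. Lines with fewer than three
    fields (for example blank lines) are skipped.
    """
    seeds = []
    for line in Path(path).read_text().splitlines():
        fields = line.split()
        if len(fields) >= 3:
            seeds.append(tuple(fields[-3:]))
    return seeds


def runs_are_synchronized(paths):
    """True when every file lists the same seed triples in the same order."""
    paths = list(paths)
    if not paths:
        return True
    reference = read_seed_columns(paths[0])
    return all(read_seed_columns(p) == reference for p in paths[1:])
```

Calling `runs_are_synchronized` with the *.rlk files from all simulation runs returns `False` as soon as one run used different seeds, even though the other columns are expected to differ between runs.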
Version 3.7 - 25 June 2003
The new features of this version were developed and implemented by Michele Masuda (Auke Bay Laboratory, Alaska Fisheries Science Center, NMFS, Juneau, Alaska).
This version adds an option to use Bayesian modeling when estimating baseline allele frequency distributions. This is especially useful for loci with many rare alleles, whose frequencies are estimated more reliably with a Bayesian approach than from the observed allele frequencies in the baseline samples alone (see also FAQ - Sampling Zeros).
Version 3.7 offers two Bayesian models of baseline allele frequency distributions:
- Rannala-Mountain (Rannala, B. and Mountain, J. L. 1997. Detecting immigration by using multilocus genotypes. Proceedings of the National Academy of Sciences of the United States of America. 94:9197-9201)
- Pella-Masuda (Pella, J. and Masuda, M. 2001. Bayesian methods for analysis of stock mixtures from genetic characters. Fishery Bulletin. 99:151-167).
The new option is turned on or off in the SPAM control file. When Bayesian modeling is turned off, SPAM 3.7 runs just like SPAM Version 3.6, in which baseline allele frequencies are estimated by the traditional maximum likelihood method.
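To illustrate why a Bayesian estimate helps with rare alleles, the sketch below computes posterior mean allele frequencies at one locus under a symmetric Dirichlet prior with parameter 1/K for each of K alleles, the prior usually associated with Rannala and Mountain (1997). It is only an illustration: whether SPAM computes exactly this quantity is an assumption, the function name is hypothetical, and the Pella-Masuda model is not covered.

```python
def posterior_mean_frequencies(counts):
    """Posterior mean allele frequencies for one locus in one population.

    Assumption: a symmetric Dirichlet prior with parameter 1/K per allele,
    where K is the number of alleles at the locus. `counts` holds the
    observed baseline count of each allele, including zeros.
    """
    k = len(counts)
    if k == 0:
        raise ValueError("need at least one allele")
    prior = 1.0 / k
    # The K prior parameters of 1/K add up to 1, hence the "+ 1.0".
    total = sum(counts) + 1.0
    return [(c + prior) / total for c in counts]
```

An allele that was never observed in the baseline sample receives a small positive frequency instead of exactly zero, so a mixture individual carrying it does not get probability zero for that population.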
Version 3: 3.0, 3.1 - 1998, 3.2 - November 1999
Version 3.5 - 4 October 2001, 18 March 2002, 5 April 2002
Version 3.6 - 8 May 2002
Microsoft Fortran PowerStation executables that run under the Windows 95/98 OS. SPAM 3.* provides a convenient windows environment from which the user can edit input files, perform analyses, and view output files.
SPAM 3.* includes a number of control, memory, output, and calculation enhancements:
- Simpler installation and initialization of program.
- Expanded manual.
- Expanded error warnings and internal error-checking.
- The maximum number of alleles per locus has been increased from 9 to 100.
- Up to 3 different (not necessarily nested) regional partitions (combinations of the baseline populations into different regions) can be defined by the user and their proportion estimates and confidence intervals will be calculated.
- The maximum number of characters is limited only by a maximum baseline file line length of 512.
- The user can define the desired confidence level (previously fixed at 80%).
- Various Bootstrap confidence intervals can be calculated (symmetric percentile, nonsymmetric percentile, bootstrap-t).
- Simulated mixtures can be partitioned across SPAM calls. In conjunction with the reporting of each simulation's maximum likelihood value, this allows the construction of Monte Carlo likelihood ratio tests (see the SPAM v 3.5 Addendum for two example applications: testing mixture equality, and testing whether a baseline can be reduced without loss of performance, thereby minimizing bias).
- Bootstrap resamples can be printed out for analysis in another package.
- Conditional Genotype Probabilities, i.e., prob(mixture fish i belongs to pop. j | baseline frequency distributions), can be calculated for each observed mixture genotype.
- Conditional Population Probabilities, i.e., the Bayes update of the prob(fish belongs to pop j | it has genotype i), can be calculated from the conditional genotype probabilities and the contribution estimates.
- SPAM won't balk if you analyze a mixture with only 1 character.
- (18 March 02 Release) - fixed a bug that could lead, in rare circumstances involving sufficient amounts of missing data, to resample contribution estimates summing to slightly more than 100%. In general, if one used the recommended 1000 bootstrap resamples in estimating region-specific confidence intervals, the bug would only influence the 4th digit of the confidence intervals.
- (5 April 02 Release) - fixed a bug that occurred in simulating mixtures when the user explicitly assigns, in the control file, a nonzero region contribution and zero population contributions for all populations in that region. Depending on your perspective, this was less a bug and more an incomplete assessment of the ways a user could circumvent the error-checking procedures.
- (8 May 02 Release) - fixed a bug that could occur when using large numbers of highly polymorphic loci. If the prob(observed genotype) < 1.0d-30, yet larger than the user-set genotype threshold, then, depending on what proportion of the observed mixture this held for, the search could become increasingly unreliable. Such situations were accompanied by absurd values of the achieved GPA (listed in the *.log file), that is, achieved GPA > 100%.
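The Bayes update behind the Conditional Population Probabilities listed above can be sketched as follows. This is a minimal illustration of Bayes' rule applied to one mixture genotype, not SPAM's own code; the function and argument names are hypothetical.

```python
def conditional_population_probabilities(genotype_probs, contributions):
    """Return prob(pop j | genotype) for one mixture genotype.

    genotype_probs[j] is prob(genotype | pop j), computed from the baseline
    frequency distributions; contributions[j] is the estimated contribution
    of pop j to the mixture. Both lists are indexed by population.
    """
    if len(genotype_probs) != len(contributions):
        raise ValueError("one genotype probability is needed per population")
    # Numerator of Bayes' rule for each population.
    joint = [p * c for p, c in zip(genotype_probs, contributions)]
    total = sum(joint)
    if total <= 0.0:
        raise ValueError("genotype has zero probability under every population")
    return [x / total for x in joint]
```

For example, a genotype that is twice as likely in the first of two populations is nevertheless assigned mostly to the second one if the second contributes 80% of the mixture.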
Version 2 - April 1997
Aka SPAM 95 - conversion to the Windows 95 OS with a GUI. Fixed various bugs from previous versions (mainly regarding output) and added the following enhancements:
- The random number seed can be either generated by the computer or set by the user (for controlled simulation comparisons).
- The user can run several control files in succession in batch mode.
Version 1 - May 1995
Windows 3.1 OS Fortran executable.
SPAM was developed by the Alaska Department of Fish and Game's (ADF&G) genetics lab using the algorithms in the GIRLSEM and CONJA-S programs written by Masuda et al. (1991) and Pella et al. (1994) of NMFS Auke Bay Laboratory and from the program HighSeas written by Smouse et al. (1990).
SPAM 1.0 provides conditional likelihood point estimates for the mixture proportions, as well as likelihood confidence intervals (80% confidence), infinitesimal jackknife standard deviations for use in approximate normal 80% confidence intervals, and bootstrapping of the mixture, the baseline population marginal 'allele' frequency distributions, or both, to estimate bias (note that non-genetic characters can be used).
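As an illustration of what a conditional likelihood point estimate of the mixture proportions involves, the sketch below runs a plain EM iteration with the baseline frequencies treated as known. EM is a standard way to maximize this likelihood; whether SPAM 1.0 used this exact iteration is not stated here, so treat it as an assumption. The function name and the fixed iteration count are hypothetical.

```python
def em_mixture_proportions(genotype_probs, iterations=200):
    """Estimate mixture proportions by EM, holding the baseline fixed.

    genotype_probs[i][j] is prob(genotype of mixture individual i | pop j).
    Returns one estimated proportion per population.
    """
    if not genotype_probs:
        raise ValueError("need at least one mixture individual")
    n_pops = len(genotype_probs[0])
    theta = [1.0 / n_pops] * n_pops  # start from equal proportions
    for _ in range(iterations):
        new_theta = [0.0] * n_pops
        for row in genotype_probs:
            denom = sum(t * p for t, p in zip(theta, row))
            if denom <= 0.0:
                raise ValueError("an individual has zero probability everywhere")
            # E-step: posterior probability that this individual is from pop j.
            for j in range(n_pops):
                new_theta[j] += theta[j] * row[j] / denom
        # M-step: average the posterior memberships over individuals.
        theta = [v / len(genotype_probs) for v in new_theta]
    return theta
```

A fixed number of iterations keeps the sketch short; a practical implementation would stop when the log-likelihood or the proportions stop changing.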