These programs were written to analyze genetic
diversity and relationships among bacterial strains characterized
by multilocus enzyme electrophoresis (Selander et al. 1986, App.
Environ. Microbiol. 51:873-884). The programs can also be used with
other types of binary-state or multi-state data with unordered
categories. Electromorph data should be stored as integers, with 0 (null
alleles) treated as missing data. The input data files need to
be saved as text files in the format described in the
Readme file.
See an example input data file
Programs can be executed on a PC running Windows by clicking on the application in Windows Explorer or typing the name of the program at the Command prompt. The FORTRAN source code can be obtained by request.
ETDIV finds and lists the electrophoretic types (ETs) in
a collection of bacterial isolates with multilocus enzyme profiles.
The program writes the results to an output file and creates a file
named ETLIST.DAT to be used as input for ETCLUS. The input file for
ETDIV must have the format explained in the Readme file.
See example output file
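The idea of listing ETs can be illustrated with a minimal Python sketch (not the FORTRAN source). It assumes each isolate is represented by a tuple of integer electromorph codes and that two isolates belong to the same ET only when their profiles are identical at every locus. The function name, the isolate labels, and the exact-match treatment of 0 codes are assumptions made for illustration; ETDIV may handle null alleles differently.

```python
def list_electrophoretic_types(profiles):
    """Group isolates that share an identical multilocus profile.

    `profiles` maps an isolate label to a sequence of integer
    electromorph codes, one per locus. Returns a list of
    (profile, [isolate labels]) in order of first appearance.
    Assumption: profiles are compared exactly, so a 0 (null) is
    treated like any other code here.
    """
    groups = {}
    for isolate, profile in profiles.items():
        groups.setdefault(tuple(profile), []).append(isolate)
    return list(groups.items())


# Hypothetical isolates scored at three loci.
isolates = {
    "A1": (1, 2, 3),
    "A2": (1, 2, 3),
    "B1": (1, 4, 3),
}

ets = list_electrophoretic_types(isolates)
for et_number, (profile, members) in enumerate(ets, start=1):
    print(f"ET{et_number}", profile, members)
```

In this example A1 and A2 share one profile and form a single ET, while B1 differs at the second locus and forms another.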
ETCLUS reads the file ETLIST.DAT created by ETDIV and
constructs a dendrogram using the average linkage algorithm (UPGMA).
Distance is measured as the proportion of mismatched loci between
pairs of ETs. Null alleles that are scored as '0' are not used
in the calculation of pairwise distances. To obtain dendrograms for
publication, I use ETMEGA (see below) and the MEGA program.
See example output file
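The distance measure described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the ETCLUS code: it assumes a locus is skipped whenever either ET has a 0 at that locus, and that the proportion is taken over the loci that remain. The function name and the ET profiles are hypothetical.

```python
def mismatch_distance(et_a, et_b):
    """Proportion of mismatched loci between two ET profiles.

    Assumption: a locus is skipped when either profile has a 0 (null)
    there, and the proportion is taken over the remaining loci.
    Returns None when no locus can be compared.
    """
    compared = [(a, b) for a, b in zip(et_a, et_b) if a != 0 and b != 0]
    if not compared:
        return None
    mismatches = sum(1 for a, b in compared if a != b)
    return mismatches / len(compared)


# Hypothetical ET profiles (integer electromorph codes at four loci).
ets = {
    "ET1": (1, 2, 3, 0),
    "ET2": (1, 4, 3, 2),
    "ET3": (2, 4, 1, 2),
}

labels = sorted(ets)
for i, a in enumerate(labels):
    for b in labels[i + 1:]:
        print(a, b, mismatch_distance(ets[a], ets[b]))
```

For ET1 and ET2 the fourth locus is skipped because ET1 has a null there, so one mismatch among three compared loci gives a distance of 1/3.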
ETMEGA creates a distance matrix for input into the MEGA program
(Kumar, Tamura, and Nei, 1994, CABIOS 10:189).
The program uses the same input file format and has the same default
parameter values as ETCLUS. It calculates genetic distance between pairs
of ETs and writes a file in the MEGA input format. Note that MEGA does not
accept blank spaces within strain labels, so replace any blank spaces
with some other symbol. The output file from ETMEGA is then used as the data file
input (distance matrix option) in MEGA.
See annotated output file
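Replacing blanks in strain labels before running ETMEGA can be done by hand or with a few lines of Python. The sketch below is only a convenience; the function name and the choice of an underscore as the replacement symbol are assumptions, and any symbol that MEGA accepts in labels would do.

```python
def mega_safe_label(label, filler="_"):
    """Replace each run of blanks inside a strain label with `filler`.

    Leading and trailing blanks are dropped. The underscore default is
    an arbitrary choice made for this example.
    """
    return filler.join(label.split())


# Hypothetical strain labels.
for raw in ["ECOR 37", " K 12  ", "DEC5A"]:
    print(repr(raw), "->", mega_safe_label(raw))
```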
The MEGA computer program can be obtained from the authors by filling out an order form.
ETLINK calculates several measures of linkage disequilibrium, including the
distribution of the standardized coefficient (D') between all pairs of alleles,
the two-locus coefficient Q* for multiple alleles per locus, and the indices
of multilocus association based on the properties of the mismatch distribution.
For information and references about these measures, see Whittam et al. (1983,
Proc. Natl. Acad. Sci. USA 80:1751-1755) and Hedrick and Thomson (1986, Genetics
112:135-156).
See annotated output file
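As an illustration of the standardized coefficient, the Python sketch below computes D' for every pair of alleles at two loci using the usual definition D' = D / Dmax, where D is the observed joint frequency minus the product of the two allele frequencies. It is not the ETLINK code. It assumes haploid profiles (one allele per locus per strain) and assumes that strains with a 0 at either locus are left out; the function name and the data are hypothetical.

```python
from collections import Counter


def d_prime(profiles, locus_a, locus_b):
    """Standardized disequilibrium D' for each allele pair at two loci.

    `profiles` is a list of sequences of integer allele codes.
    Assumption: profiles with a 0 (null) at either locus are skipped.
    Returns a dict mapping (allele_a, allele_b) to D'.
    """
    pairs = [(p[locus_a], p[locus_b]) for p in profiles
             if p[locus_a] != 0 and p[locus_b] != 0]
    n = len(pairs)
    count_a = Counter(a for a, _ in pairs)
    count_b = Counter(b for _, b in pairs)
    count_ab = Counter(pairs)

    result = {}
    for a in count_a:
        for b in count_b:
            p = count_a[a] / n
            q = count_b[b] / n
            d = count_ab[(a, b)] / n - p * q
            # Dmax depends on the sign of D.
            if d >= 0:
                d_max = min(p * (1 - q), (1 - p) * q)
            else:
                d_max = min(p * q, (1 - p) * (1 - q))
            result[(a, b)] = d / d_max if d_max > 0 else 0.0
    return result


# Hypothetical two-locus profiles with complete association.
strains = [(1, 1), (1, 1), (2, 2), (2, 2)]
for allele_pair, value in sorted(d_prime(strains, 0, 1).items()):
    print(allele_pair, value)
```

With complete association, as in this example, allele pairs that always occur together have D' = 1 and pairs that never occur together have D' = -1.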
ETBOOT is a bootstrap program that randomly selects loci, obtains
a distance matrix, finds a tree (based on the average linkage or
the neighbor-joining algorithm), and records the nodes of the tree. The
process is repeated for a number of bootstrapped trees specified
by the user. ETBOOT then tabulates the number and frequency of
each observed node recovered among the randomly generated trees.
See annotated output file
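The bootstrap procedure can be sketched in Python as shown below. This is a simplified illustration, not the ETBOOT code: it covers only the average linkage (UPGMA) option and omits neighbor joining, it assumes loci are resampled with replacement, it records each node as the set of ETs below it, and it skips loci with a 0 in either profile when computing distances. All function names and the data are hypothetical.

```python
import random
from collections import Counter


def mismatch_distance(a, b):
    """Proportion of mismatched loci, skipping loci with a 0 (assumption)."""
    compared = [(x, y) for x, y in zip(a, b) if x != 0 and y != 0]
    if not compared:
        return 0.0  # assumption: no comparable loci counts as distance 0
    return sum(x != y for x, y in compared) / len(compared)


def upgma_nodes(labels, profiles):
    """Return the internal nodes (as frozensets of labels) of a UPGMA tree.

    The root, which contains every label, is not recorded.
    """
    base = {frozenset((a, b)): mismatch_distance(profiles[a], profiles[b])
            for a in labels for b in labels if a != b}

    def average_distance(c1, c2):
        total = sum(base[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [frozenset([label]) for label in labels]
    nodes = set()
    while len(clusters) > 1:
        # Pick the closest pair; ties are broken by cluster position.
        _, i, j = min((average_distance(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        if len(clusters) > 1:
            nodes.add(merged)
    return nodes


def bootstrap_node_counts(profiles, replicates, seed=0):
    """Count how often each node appears among bootstrapped UPGMA trees."""
    rng = random.Random(seed)
    labels = sorted(profiles)
    n_loci = len(profiles[labels[0]])
    counts = Counter()
    for _ in range(replicates):
        # Resample loci with replacement (same loci for every ET).
        sampled = [rng.randrange(n_loci) for _ in range(n_loci)]
        resampled = {label: [profiles[label][k] for k in sampled]
                     for label in labels}
        counts.update(upgma_nodes(labels, resampled))
    return counts


# Hypothetical ET profiles at six loci.
ets = {
    "ET1": (1, 1, 1, 1, 1, 1),
    "ET2": (1, 1, 1, 1, 1, 2),
    "ET3": (2, 2, 2, 2, 2, 2),
    "ET4": (2, 2, 2, 2, 3, 2),
}
replicates = 100
for node, count in bootstrap_node_counts(ets, replicates).items():
    print(sorted(node), count, count / replicates)
```

The frequency printed for each node is the fraction of bootstrapped trees in which that exact group of ETs was recovered.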