PROGRAM: dmsimulate (DMotif Sequence Simulator)

USAGE: dmsimulate [OPTIONS] neutral.mod motif.pfm msa-fname > features.gff

DESCRIPTION:

    Simulates a multiple sequence alignment, or a set of short
    alignments adding up to approximately the specified total sequence
    length, based on a DMotif Phylo-HMM and a set of parameter values.
    Prints MSA(s) to output files named according to the supplied
    prefix and writes a GFF describing motif features to stdout. The
    default parameter values and output sequence length given in the
    OPTIONS section are used unless otherwise specified.

EXAMPLES:

OPTIONS:

    --msa-format, -o FASTA|PHYLIP|MPM|MAF|SS
        Alignment format for output sequences (default FASTA). Note
        that the program msa_view can be used for conversion.

    --msa-length, -L
        Length (in bp) of output sequences. Represents the sum of all
        fragment lengths when used with --random-lengths (default
        1,000,000).

    --nseqs, -n
        Specify the number of output sequences, all of the length
        specified with --msa-length. Not compatible with
        --random-lengths.

    --random-lengths, -l offset,k,theta
        Generate a set of many fragments with lengths drawn from a
        gamma distribution with the given parameters k and theta,
        offset by the given value (can be used to shift the base of
        the distribution without affecting the range of length
        values).

    --branch, -b
        Produce output sequence(s) containing a single motif, on the
        subtree below or the supertree above the given branch,
        embedded in the center of the sequence. Requires --event (see
        below). The DMotif HMM is not used with this option; only the
        phylogenetic models for the neutral background and the motif
        states relevant to the given branch and event type are used.
        If used with --nseqs, all output sequences will contain a
        single motif instance of the given type.

    --event, -e CONS|NEUT|BIRTH|DEATH
        Simulate sequences containing a single instance of the given
        type embedded in the center of the sequence.
        Types BIRTH and DEATH require a branch upon which the motif is
        gained or lost, respectively, specified with the --branch
        option. The DMotif HMM is not used with this option; only the
        phylogenetic models for the neutral background and pertinent
        motif states are used.

    --rho, -R
        (default 0.3)

    --phi, -p
        (default 0.5)

    --zeta, -z
        Motif entry parameter for conserved sequences (default 0.001).

    --xi, -Z
        Motif entry parameter for nonconserved sequences (default
        0.0001).

    --scale-by-branch, -B
        Scale transition probabilities for lineage-specific states by
        the length of the branch leading to the node predicted to
        contain the gain or loss event. Default is to treat all
        branches as if they were of equal length when assigning
        transition probabilities.

    --xi-off, -F
        Use the alternate model parameterization (a single motif entry
        parameter) instead of the default (separate motif entry
        parameters for nonconserved (xi) and conserved (zeta)
        sequences).

    --mot-mod-type, -S F81|HB
        Set the substitution model type used for motif states.
        Default is HB. With HB, the Halpern-Bruno model (1998. MBE.
        15(7):910-917) is used on motif branches, with site-specific
        selective pressure modeled through the rate matrix and
        equilibrium frequencies; branch lengths are not scaled by rho
        on motif branches. With F81, motif branches are explicitly
        scaled by rho and only equilibrium frequencies are used in the
        substitution model for these branches. All branches are scaled
        equally, by rho, regardless of the constraint implied by the
        motif weight matrix, which implies equal constraint at all
        motif positions.

    --target-coverage, -C gamma
        (Alternative to --transitions; use with --expected-length)
        Set the transition parameters such that the expected fraction
        of sites in conserved elements is gamma (between 0 and 1).
        This is a *prior* rather than *posterior* expectation and
        assumes stationarity of the state-transition process.
        This option causes the ratio mu/nu to be fixed at
        (1-gamma)/gamma and, together with --expected-length,
        completely defines the transition probabilities.

    --expected-length, -E omega
        (Alternative to --transitions; use with --target-coverage)
        Set the transition probabilities such that the (prior)
        expected length of a conserved element is omega. The
        parameter mu is set to 1/omega.

    --transitions, -t mu,nu
        Set the transition probabilities of the two-state HMM using
        the specified values of mu and nu (both between 0 and 1).

    --refseq, -M
        (For use with --msa-format MAF) Read the complete text of the
        reference sequence from the specified file (FASTA format) and
        combine it with the contents of the MAF file to produce a
        complete, ordered representation of the alignment. The
        reference sequence of the MAF file is assumed to be the one
        that appears first in each block.

    --refidx, -r
        Use the coordinate frame of the specified sequence in output.
        Default value is 1, the first sequence in the alignment; 0
        indicates the coordinate frame of the entire multiple
        alignment.

    --seqname, -N
        Use the specified string for the 'seqname' (GFF) or 'chrom'
        field in the output file. Default is obtained from the input
        file name (double filename root, e.g., "chr22" if the input
        file is "chr22.35.ss").

    --idpref, -P
        Use the specified string as the prefix of generated ids in the
        output file. Can be used to ensure ids are unique. Default is
        obtained from the input file name (single filename root, e.g.,
        "chr22.35" if the input file is "chr22.35.ss").

    --indel-model, -I alpha,beta,tau,epsilon[,alpha2,beta2,tau2,epsilon2]
        Use a simple model of insertions and deletions that assumes a
        known indel history and at most one indel per branch of the
        tree at any given position. The parameters alpha and beta are
        rates of insertion and deletion, respectively, per expected
        substitution per site, and the parameter tau is approximately
        the inverse of the expected indel length (see indelFit). If
        two sets of parameters are given, the first is used for
        nonconserved regions and the second for conserved regions.
        If --indel-history is not used, a history is inferred on the
        fly using a simple parsimony algorithm.

    --nc-mot-indel-mode, -j
        Specify the set of indel parameters to use in nonconserved
        motif states. Default is the motif parameters. Use of this
        option switches to the background parameters in these states.

    --keep-ancestral, -k
        Include ancestral sequences in the output MSA. Ancestral bases
        are not predicted; all are reported as N's.

    --no-require-subs, -s
        Do not require any substitutions on a branch incurring a
        gain/loss event. Default behavior is to require at least one
        substitution relative to the parent node when generating a
        motif gain or loss.

    --help, -h
        Show this help message and exit.
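
NOTE ON --target-coverage AND --expected-length:

    The relationship stated under those two options (mu = 1/omega and
    mu/nu = (1-gamma)/gamma) can be worked through numerically. The
    Python sketch below is illustrative only and is not part of
    dmsimulate; the function name transitions_from_coverage is
    hypothetical, and the range checks are an assumption based on the
    statement that mu and nu must both lie between 0 and 1.

```python
def transitions_from_coverage(gamma, omega):
    """Return (mu, nu) implied by a target coverage gamma and an
    expected conserved-element length omega.

    Uses only the two relations given in the help text:
        mu = 1 / omega
        mu / nu = (1 - gamma) / gamma
    Hypothetical helper for illustration; not dmsimulate's own code.
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must be strictly between 0 and 1")
    if omega < 1.0:
        raise ValueError("omega must be at least 1 so that mu <= 1")

    mu = 1.0 / omega
    # Rearranging mu/nu = (1-gamma)/gamma gives nu = mu*gamma/(1-gamma).
    nu = mu * gamma / (1.0 - gamma)
    if nu > 1.0:
        raise ValueError("gamma and omega imply nu > 1; no valid probability")
    return mu, nu


if __name__ == "__main__":
    mu, nu = transitions_from_coverage(gamma=0.25, omega=10.0)
    # Sanity check: the ratio nu / (mu + nu) recovers the target coverage.
    print(f"mu={mu:.4f} nu={nu:.4f} coverage={nu / (mu + nu):.4f}")
```

    The resulting pair could then be compared against values passed
    directly with --transitions mu,nu, which is described above as the
    alternative to these two options.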