- Introduction
- Definitions
- Availability
- Analysis on Single Variants
- Analysis on Group of Variants
- Population Stratification
- Trios Analysis
- Linkage Analysis
- Citations
|
ATAV Input Parameters of Collapsing Methods
--memory {50000}: assign 50000 MB (about 50 GB) of memory to run ATAV. The memory required depends on your dataset: assign more than 50 GB for a whole-genome dataset and about 30 GB for a whole-exome dataset.
-
--project {$PROJECT.gsap}: specify an SVA project gsap file; the gsap.sva and gsap.sva.data files must be in the same folder. $PROJECT.gsap is the SVA project filename.
-
--collapsing: a collapsing method will be performed.
-
--recessive [optional]: add this flag to run collapsing under a recessive model; by default, collapsing runs under a dominant model.
-
--collapsing-comp-het: a collapsing method with compound het model will be performed.
-
--min-variant-present {1} [optional]: consider a variant only if it is observed n or more times, in either heterozygotes or homozygotes; the default value is 1 (all variants are considered).
-
--min-coverage {3} [optional]: specify a minimum coverage (read depth); the default value is 3.
-
--out {foldername & fileroot} [optional]: specify output foldername and output root filename; the default value is the project name.
-
--exclude-male-het [optional]: when this flag is specified, variants on sex chromosomes for which one or more males carry a heterozygous genotype are excluded. By default, these variants are included, but the questionable male genotypes are set to missing.
-
--ctrlMAF {0.05} [optional]: specify a maximum variant allele frequency in controls; the default value is 0.05. For example, if a user specifies "--ctrlMAF 0.05", ATAV loads variants whose frequencies are either <= 0.05 or >= 0.95. This restricts loading to rare variants and is used to calculate the significance threshold for rare variants.
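The frequency rule above can be sketched in a few lines of Python. This is only an illustration of the stated rule (keep a variant when its control frequency is at most the cutoff or at least one minus the cutoff); the function name `passes_ctrl_maf` and the example variants are hypothetical and are not part of ATAV.

```python
def passes_ctrl_maf(ctrl_allele_freq: float, ctrl_maf: float = 0.05) -> bool:
    """Return True if the variant is rare in controls at either end of the spectrum.

    Hypothetical helper: mirrors the documented rule "<= cutoff or >= 1 - cutoff".
    """
    return ctrl_allele_freq <= ctrl_maf or ctrl_allele_freq >= 1.0 - ctrl_maf


# Made-up control allele frequencies, for illustration only.
control_freqs = {"var_a": 0.01, "var_b": 0.30, "var_c": 0.97}
kept = [name for name, freq in control_freqs.items() if passes_ctrl_maf(freq)]
print(kept)
```

With the default cutoff of 0.05, the common variant (`var_b`) is the only one that is not loaded.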
-
--snvFunctionList {STOP_GAINED,STOP_LOST,ESSENTIAL_SPLICE_SITE,NON_SYNONYMOUS_CODING} [optional]: specify a list of SNV functional categories, separated by commas (NOTE: do not add a blank after a comma); the default value is STOP_GAINED, STOP_LOST, ESSENTIAL_SPLICE_SITE, NON_SYNONYMOUS_CODING. The available SNV functional categories are: STOP_GAINED, STOP_LOST, FRAMESHIFT_CODING, NON_SYNONYMOUS_CODING, ESSENTIAL_SPLICE_SITE, SPLICE_SITE, REGULATION_REGION, INTRONIC_EXON_BOUNDARY, 5PRIME_UTR, 3PRIME_UTR, EXONIC_NON_CODING_RNA, UPSTREAM, DOWNSTREAM, INTRONIC, SYNONYMOUS_CODING, INTERGENIC, REFERENCE, CANNOT_ANNOTATE.
-
--indelFunctionList {CODING_DISRUPTED_FRAMESHIFT,CODING_DISRUPTED_OTHER} [optional]: specify a list of indel functional categories, separated by commas (NOTE: do not add a blank after a comma); the default value is CODING_DISRUPTED_FRAMESHIFT, CODING_DISRUPTED_OTHER. The available indel functional categories are: CODING_DISRUPTED_FRAMESHIFT, CODING_DISRUPTED_OTHER, TRANSCRIPT_INCLUDED, 5PRIME_UTR, 3PRIME_UTR, INTRONIC_EXON_BOUNDARY, UPSTREAM, DOWNSTREAM, INTRONIC, INTERGENIC, CANNOT_ANNOTATE, SPLICE_SITE.
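Because the functional lists must be comma-separated with no blanks, it can help to build the argument value programmatically. The sketch below is a hypothetical helper (not part of ATAV) that strips stray whitespace, rejects names outside the indel categories listed above, and joins the rest with bare commas.

```python
# Indel functional categories copied from the parameter description above.
AVAILABLE_INDEL_FUNCTIONS = {
    "CODING_DISRUPTED_FRAMESHIFT", "CODING_DISRUPTED_OTHER", "TRANSCRIPT_INCLUDED",
    "5PRIME_UTR", "3PRIME_UTR", "INTRONIC_EXON_BOUNDARY", "UPSTREAM", "DOWNSTREAM",
    "INTRONIC", "INTERGENIC", "CANNOT_ANNOTATE", "SPLICE_SITE",
}


def format_function_list(names, available):
    """Return a comma-separated value with no blanks (hypothetical helper)."""
    cleaned = [name.strip() for name in names]
    unknown = [name for name in cleaned if name not in available]
    if unknown:
        raise ValueError(f"unknown functional categories: {unknown}")
    return ",".join(cleaned)


value = format_function_list(
    [" CODING_DISRUPTED_FRAMESHIFT", "CODING_DISRUPTED_OTHER "],
    AVAILABLE_INDEL_FUNCTIONS,
)
print("--indelFunctionList", value)
```

The same helper works for "--snvFunctionList" if it is given the set of SNV categories instead.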
-
--linear [optional]: use linear regression for continuous traits; the default is logistic regression (for dichotomous traits).
-
--permute [optional]: add this flag to run permutations; by default, no permutations are run.
-
--mperm [optional]: number of permutations to run when permutations are enabled; the default value is 1000.
-
--var-missing-rate [optional]: a variant is dropped before collapsing in a gene if its missing rate is greater than or equal to the specified number (a value greater than 0.8 is suggested); the default value is 1.0.
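As a minimal sketch of the rule above, the following Python function keeps a variant only when its missing rate is strictly below the threshold. The function name and the way the missing rate is computed (missing genotypes divided by total samples) are assumptions made for illustration; ATAV's internal calculation may differ.

```python
def keep_variant(n_missing: int, n_samples: int, max_missing_rate: float = 1.0) -> bool:
    """Return True if the variant survives the missing-rate filter.

    Assumption: missing rate = missing genotypes / total samples.
    A variant is dropped when the rate is >= the threshold, as documented.
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return n_missing / n_samples < max_missing_rate


# Made-up counts: 100 samples, threshold 0.9.
print(keep_variant(n_missing=10, n_samples=100, max_missing_rate=0.9))  # 0.10 < 0.9
print(keep_variant(n_missing=95, n_samples=100, max_missing_rate=0.9))  # 0.95 >= 0.9
```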
-
--region [optional]: if a region is specified, e.g. "--region chr22:17257787-19792353", the combined results of the Fisher's exact test are output only for the specified region.
-
--without-non-carrier [optional]: add this flag, together with other parameters, to analyze data sets that lack non-carrier information.
-
--exclude-tolerant [optional]: add this flag to exclude NON_SYNONYMOUS_CODING variants that are predicted as "tolerant".
-
--test-model [optional]: use "--test-model fisher" to test the collapsed variants in a gene with a Fisher's exact test; use "--test-model regression" to test them with a log-likelihood ratio test in a linear/logistic regression framework. Note: gender is used as a covariate in the log-likelihood ratio test under logistic regression. Users can also include a specified number of eigenvectors as covariates in the regression models to adjust for population stratification. By default, ATAV tests collapsed variants in gene units using a Fisher's exact test.
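To make the default gene-level test concrete, here is a generic two-sided Fisher's exact test written with only the Python standard library. It is a textbook sketch, not ATAV's implementation. It assumes that collapsing reduces each gene to a 2x2 table of carriers and non-carriers among cases and controls, and the counts in the example are made up.

```python
from math import comb


def fisher_exact_two_sided(case_carriers, case_noncarriers,
                           ctrl_carriers, ctrl_noncarriers):
    """Two-sided Fisher's exact p-value for a 2x2 carrier table (generic sketch)."""
    n_cases = case_carriers + case_noncarriers
    n_ctrls = ctrl_carriers + ctrl_noncarriers
    n_carriers = case_carriers + ctrl_carriers
    denom = comb(n_cases + n_ctrls, n_carriers)

    def prob(x):
        # Hypergeometric probability of x carriers among the cases.
        return comb(n_cases, x) * comb(n_ctrls, n_carriers - x) / denom

    p_observed = prob(case_carriers)
    lowest = max(0, n_carriers - n_ctrls)
    highest = min(n_cases, n_carriers)
    # Sum the probabilities of all tables at least as extreme as the observed one;
    # the small relative tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lowest, highest + 1))
                if p <= p_observed * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical gene: 1 of 10 cases and 11 of 14 controls carry a qualifying variant.
print(fisher_exact_two_sided(1, 9, 11, 3))
```

For large cohorts a library routine such as `scipy.stats.fisher_exact` would be the more practical choice; the hand-written version is shown only to keep the example self-contained.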
-
--ctrl-maf-recessive [optional]: minor allele frequency in controls for recessive model.
-
--ctrl-mhgf-recessive [optional]: minor homozygous genotype frequency in controls for recessive model. For example, if ctrl-maf-recessive=0.15 and ctrl-mhgf-recessive=0.05, a variant is removed if it has either ctrlVAF > ctrlMAF (0.15) or ctrlVHGF > ctrlMHGF (0.05).
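The two recessive-model cutoffs combine with a logical OR, which the following sketch spells out. The function and argument names are hypothetical; the thresholds in the example are the ones from the text (0.15 and 0.05), and the variant frequencies are made up.

```python
def removed_by_recessive_filter(ctrl_vaf: float, ctrl_vhgf: float,
                                ctrl_maf: float, ctrl_mhgf: float) -> bool:
    """Return True if the variant is removed under the recessive-model cutoffs.

    ctrl_vaf:  variant allele frequency in controls (ctrlVAF)
    ctrl_vhgf: variant homozygous genotype frequency in controls (ctrlVHGF)
    A variant is removed if EITHER frequency exceeds its cutoff.
    """
    return ctrl_vaf > ctrl_maf or ctrl_vhgf > ctrl_mhgf


# Cutoffs from the example in the text; frequencies are illustrative only.
print(removed_by_recessive_filter(0.10, 0.02, ctrl_maf=0.15, ctrl_mhgf=0.05))  # kept
print(removed_by_recessive_filter(0.10, 0.08, ctrl_maf=0.15, ctrl_mhgf=0.05))  # removed
```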
-
--covar {$COV_FILE} [optional]: specify a covariate file. "$COV_FILE" can be a regular covariate file in flat text format (space-delimited, tab-delimited, or a mix of both), or a .evec file if no other covariates are included. When a .evec file is specified, users may set the number of eigen axes to include with the parameter "--ncov $N", where the default of $N is 3. Note: (1) in a covariate file, the first column should contain subject IDs and the remaining columns, from the second onward, should contain covariates; (2) if the covariate file has a header line, add a number sign "#" before it.
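The layout of a regular covariate file described above can be illustrated with a small reader. This is a sketch for checking a file before handing it to ATAV, not ATAV's own parser: it assumes all covariates are numeric, it does not handle the .evec format, and the `n_cov` argument is a hypothetical convenience for keeping only the first few covariate columns.

```python
def parse_covariates(text, n_cov=None):
    """Parse a flat-text covariate file (hypothetical helper).

    Returns (header, covariates): header is a list of column names or None,
    covariates maps each subject ID to its list of numeric covariate values.
    """
    header = None
    covariates = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header line, marked with a leading number sign.
            header = line[1:].split()
            continue
        # split() with no argument handles spaces, tabs, or a mix of both.
        fields = line.split()
        values = [float(v) for v in fields[1:]]
        covariates[fields[0]] = values if n_cov is None else values[:n_cov]
    return header, covariates


# Made-up file content mixing tabs and spaces.
example = "#ID\tAGE SEX\nS1\t34 1\nS2 51\t2\n"
header, covariates = parse_covariates(example)
print(header)
print(covariates)
```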
-
--nov {3} [optional]: number of eigenvectors/covariates from a covariate file to be included in multivariate regression.
|