Monday, 2 July 2007 (B.E. 2550)

Assignment: LOD score. Deadline: 24:00, 4 July 2007 (4/7/50)

Mode of Inheritance
Pattern of Inheritance and Multifactorial
- If the genetic abnormality arises in somatic cells, it is not passed on to the next generation (non-inherited).
- If it arises in germ cells, it can be passed on (inherited).
Genetic Disorder
1. Single-gene disorders
2. Multiple-gene disorders
3. Mitochondrial disorders
4. Chromosomal disorders
Patterns of transmission of genetic disease
Single-gene disorders are usually used as the model for studying how a disease is transmitted. The pattern depends on 2 factors:
1. Location of the gene involved
1.1 Autosomal Chromosome
1.2 Sex Chromosome
2. Phenotype
2.1 Dominant
2.2 Recessive
Single Gene Disorders, or Mendelian Inheritance
- Expressed as dominant or recessive traits
- The abnormality is usually at a single locus (one position on the chromosome)
- Each has a characteristic pattern of transmission
- Affected offspring appear in fixed ratios
There are 5 patterns:
1. Autosomal Recessive
2. Autosomal Dominant
3. X-linked Recessive
4. X-linked Dominant
5. Y-linked
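Pedigree
Symbols used in a pedigree chart:
- square = male; circle (O) = female; filled square = affected male
- line joining a couple (-O) = marriage; double line (=O) = marriage between relatives
- twin symbols: monozygotic (identical) twins and dizygotic (fraternal) twins
- number written with a symbol = number of children of that sex, e.g. female = 2, male = 6
- marked individual = the person who came to see the doctor; partly filled symbol = autosomal recessive heterozygote
- carrier symbol = X-linked carrier; symbol with a slash through it = deceased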

Autosomal Recessive
- Males and females have an equal risk of being affected.
- Affected individuals need not appear in every generation.
- An affected person must have received the recessive gene from both the father and the mother.
- A heterozygote (carrier) shows no abnormality, because the normal gene on the partner chromosome still works well enough to compensate for the abnormal one.
- If both parents are heterozygotes, each child has a 25% chance of being normal, a 50% chance of being a carrier, and a 25% chance of being affected.
Example diseases
- Thalassemia ("trait" normally means having the disease, but in thalassemia it means being a carrier)
- Sickle Cell Anemia
- SCID (Severe Combined Immunodeficiency), etc.
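The 25% : 50% : 25% ratio quoted above for two heterozygous parents can be checked by listing every combination of gametes. The sketch below is only an illustration: the function name `cross` and the allele letters `A` (normal) and `a` (recessive disease allele) are assumptions made for this example, and each allele is assumed to be transmitted with probability 1/2.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1: str, parent2: str) -> dict:
    """Offspring genotype probabilities for a one-locus cross.

    Each parent is a two-letter genotype such as "Aa". Every allele is
    assumed to be passed on with probability 1/2 (simple Mendelian model).
    """
    counts = Counter(
        "".join(sorted(a + b))  # "aA" and "Aa" are the same genotype
        for a, b in product(parent1, parent2)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Two carrier parents: AA = normal, Aa = carrier, aa = affected.
for genotype, probability in sorted(cross("Aa", "Aa").items()):
    print(genotype, probability)
```

Calling `cross("aa", "Aa")` in the same way gives half carriers and half affected children.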
Autosomal Dominant
- A person who receives the abnormal gene on just one chromosome of the pair can show the abnormality, because the abnormal gene interferes with the normally working gene on the partner chromosome, so that the normal gene's function is impaired as well.
- Usually seen in every generation, except when the disease arises by new mutation.
- Females and males have an equal chance of being affected.
Example diseases
- Huntington's chorea
- Marfan's syndrome
- Brachydactyly, Polydactyly, etc.
Term
1. New mutation: a mutation that is not found in either parent.
2. Incomplete penetrance: the gene is passed from the parents to the children and would be expected to cause the disease (trait), but not every child who inherits it shows the disease.
3. Variable Expressivity: the gene is expressed to a different degree in each person, depending on various factors.
4. Codominant: both alleles are expressed at the same time, e.g. the AB blood group.
X-linked Recessive
- The trait can occur in females or males, but in males a single abnormal X chromosome is enough to cause the disease.
- Males are affected when they are hemizygotes.
- Females are affected when they are homozygotes.
- A male passes his X chromosome only to his daughters.
- Males are more likely to be affected than females.
Example diseases
- Hemophilia A, B
- G6PD deficiency
- Color Blindness
- Duchenne Muscular Dystrophy
Lyon Hypothesis
Females have two X chromosomes as their sex chromosomes, but one of the pair is inactive and becomes a Barr body. XIST (X-inactive specific transcript) is produced from a region on the X chromosome called the XIC (X-inactivation center).
XIST RNA from the XIC region is expressed on the inactive X chromosome.

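X-linked Dominant
- Rare; about twice as many females as males are affected.
- Symptoms in females are less severe than in males.
- If the father is affected, all of his daughters are affected and none of his sons; an affected heterozygous mother passes the disease to 50% of her children of either sex.
Example disease
- Vitamin D resistant rickets
Y-linked
- Rare; found only in males.
- Involves only genes located on the Y chromosome.
Example disease: Hairy Pinna (very hairy ears)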

Other mode of inheritance
1. Genomic imprinting: the copy of a gene inherited from one parent (the father's or the mother's) is not expressed.
2. Uniparental disomy: the child receives both copies of a chromosome, and the genes on it, from only one parent.
3. Germline Mosaicism: a child of parents with normal genes is nevertheless affected, possibly because a mutation arose in a parent's germline cells.
4. Mitochondrial inheritance: abnormality of the genetic material in the mitochondria.

Mitochondrial DNA Disorders
- The abnormality is passed only from mother to child. At fertilization the nuclei of the egg and the sperm fuse, but the sperm contributes only its nucleus to the egg, so the other organelles in the zygote, including the mitochondria, come from the mother alone.
- Females and males have an equal chance of being affected.
- The diseases often involve the nervous system: nerve cells need a lot of energy and contain many mitochondria, so they show the most obvious effects.
Example disease: Leber's hereditary optic neuropathy

Chromosomal DNA Disorders
Numerical abnormalities
Trisomy 21 = Down syndrome
Trisomy 18 = Edwards syndrome
Trisomy 13 = Patau syndrome
44+X (45,X) = Turner syndrome
44+XXY (47,XXY) = Klinefelter syndrome
Structural abnormalities
Deletion on chromosome 22 (22q11) = DiGeorge syndrome
Deletion 5p = Cri-du-chat syndrome (Cat cry syndrome)
Term
1. Delayed age of onset: a gene abnormality whose disease appears late, e.g. polycystic kidney disease appears at around 35 years of age.
2. Pleiotropy: an abnormality in a single gene produces several different effects, causing symptoms in many organ systems, e.g. in Marfan's syndrome elastic fibers are formed abnormally throughout the body.
3. Genetic heterogeneity: abnormalities in any of several different genes can produce the same single disorder, e.g. congenital deafness.
4. Sex-limited phenotype: a trait expressed in only one sex, e.g. uterine cancer.
5. Sex-influenced phenotype: sex influences the occurrence of the trait, e.g. baldness occurs more often in males than in females because of the influence of male sex hormones.

Multifactorial Inheritance
Definition: the trait is determined by the combined action of several genes and non-genetic factors such as the environment.
Polygene: the trait is determined by the combined action of two or more genes.
Multifactorial or polygenic traits fall into 2 groups
1. Continuous Trait: height, weight, skin color
2. Discontinuous Trait: DM (diabetes mellitus), cleft lip, cleft palate
Characteristics of Multifactorial Inheritance
- The pattern of transmission is not fixed, and environmental influences are usually involved.
- The more closely related someone is to an affected person, the higher the incidence of the disease.
- The concordance rate (both individuals having the disease) is higher in identical twins than in non-identical twins.
Normal Distribution
The more genes that determine a trait, the smaller the proportion of pure-breeding (homozygous) individuals. For example, if height were determined by a single gene with alleles A (tall) and a (short), pure tall (AA) and pure short (aa) together would make up 1/2 of the offspring of two Aa parents. If more genes control the trait, the proportion of pure-breeding offspring falls.
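The shrinking share of pure types can be made concrete with a small calculation. This is a sketch under simplifying assumptions that are not in the notes: every gene has two alleles (tall and short) with equal, additive effects, the genes are unlinked, and both parents are heterozygous at every gene. The function names are made up for this example. Under those assumptions the number of tall alleles in a child follows a binomial distribution, which approaches the bell-shaped normal curve as the number of genes grows.

```python
from fractions import Fraction
from math import comb


def tall_allele_distribution(n_genes: int) -> list:
    """P(child carries k tall alleles), k = 0 .. 2 * n_genes.

    Assumes both parents are heterozygous at every gene, so each of the
    child's 2 * n_genes alleles is "tall" with probability 1/2.
    """
    total = 2 * n_genes
    return [Fraction(comb(total, k), 2 ** total) for k in range(total + 1)]


def pure_extremes(n_genes: int) -> Fraction:
    """Share of children that are pure tall or pure short at every gene."""
    distribution = tall_allele_distribution(n_genes)
    return distribution[0] + distribution[-1]


for n in (1, 2, 3):
    print(n, "gene(s):", pure_extremes(n))
```

With one gene the pure types make up 1/2 of the offspring, as in the example above; with two genes the share is 1/8, and with three it is 1/32.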

LOH is the underlying mechanism in the development of some cancers in which a single mutation in one homolog of a tumor-suppressor gene is not sufficient to initiate tumor growth; however, deletion or disabling of the allele on the homologous chromosome results in unregulated cell growth.
LOH can underlie cancer development in both sporadic tumors and hereditary tumors. In sporadic tumors, both alleles are normal at conception; in hereditary tumors, one mutant allele is present at conception.

Loss of heterozygosity (synonym: LOH): at a particular locus heterozygous for a deleterious mutant allele and a normal allele, a deletion or other mutational event within the normal allele renders the cell either hemizygous (one deleterious allele and one deleted allele) or homozygous for the deleterious allele.
Related Terms: deletion; hemizygous; heterozygote; homozygote
Your Comprehensive Source Program For Easy and Efficient LOD Score Calculations

http://phg.mc.vanderbilt.edu/content/ezlod
Factors for the LOD score calculator:
1. Length of genomic region in Morgans
2. Number of chromosomes
3. Crossover rate
4. Global significance level
5. Precision

Standard lod score analysis is not without problems
Standard lod score analysis is a tremendously powerful method for scanning the genome in 20-Mb segments to locate a disease gene, but it can run into difficulties. These include:
• vulnerability to errors;
• computational limits on what pedigrees can be analyzed;
• problems with locus heterogeneity;
• limits on the ultimate resolution achievable;
• the need to specify a precise genetic model, detailing the mode of inheritance, gene frequencies and penetrance of each genotype.
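As a reminder of the quantity under discussion, the simplest textbook case of a lod score can be written in a few lines. This is a minimal sketch that assumes phase-known, fully informative meioses, where the lod at recombination fraction θ is log10 of θ^R (1 - θ)^(N - R) divided by (1/2)^N for R recombinants in N meioses. The function name `lod` is an assumption for this example, and real linkage programs handle general pedigrees very differently.

```python
from math import log10


def lod(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point lod score for phase-known, fully informative meioses.

    Compares the likelihood of the data at recombination fraction theta
    with the likelihood under no linkage (theta = 0.5).
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    non_recombinants = meioses - recombinants
    if theta == 0.0:
        # A single recombinant is impossible at theta = 0.
        if recombinants > 0:
            return float("-inf")
        return meioses * log10(2)
    return (
        recombinants * log10(theta)
        + non_recombinants * log10(1 - theta)
        + meioses * log10(2)
    )


# Scan a grid of theta values for 1 recombinant in 10 meioses.
grid = [i / 100 for i in range(51)]
best_theta = max(grid, key=lambda t: lod(1, 10, t))
print(best_theta, round(lod(1, 10, best_theta), 2))
```

The score is zero at θ = 0.5 by construction, and a single spurious recombinant drives the score at θ = 0 to minus infinity, which is one reason the errors discussed in this section matter so much.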
1. Errors in genotyping and misdiagnoses can generate spurious recombinants
With highly polymorphic markers, common errors such as misread gels, switched samples or nonpaternity will usually result in a child being given a genotype incompatible with the parents. The linkage analysis program will stall until such errors have been corrected. Errors that introduce possible but wrong genotypes are more of a problem. These include misdiagnosis of somebody's disease status. Such errors inflate the length of genetic maps by introducing spurious recombinants, because if a child has been assigned the wrong parental allele, it will appear to be a recombinant. Multilocus analysis can help, because spurious recombinants appear as close double recombinants (Figure 11.7). Error-checking routines test the extent to which the map can be shortened by omitting any single test result (see Broman et al., 1998). Results that significantly lengthen the map (i.e. add recombinants) are suspect.
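The first kind of error described above, a child whose genotype is incompatible with the parents, is easy to detect mechanically. The sketch below shows the idea for a single marker; the function name and the representation of a genotype as a pair of allele labels are assumptions for this example, not the interface of any linkage program.

```python
from itertools import product


def is_compatible(child: tuple, father: tuple, mother: tuple) -> bool:
    """True if the child could have received one allele from each parent."""
    child_sorted = tuple(sorted(child))
    return any(
        tuple(sorted((paternal, maternal))) == child_sorted
        for paternal, maternal in product(father, mother)
    )


father = ("1", "2")
mother = ("3", "4")
print(is_compatible(("1", "3"), father, mother))  # consistent with both parents
print(is_compatible(("2", "2"), father, mother))  # needs a "2" from the mother
```

A check like this catches only impossible genotypes. A possible but wrong genotype, such as ("2", "3") recorded for a child who is really ("1", "3"), passes the test and shows up later as a spurious recombinant.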
2. Computational difficulties limit the pedigrees that can be analyzed
As we saw in Section 11.3.2, human linkage analysis depends on computer programs that implement algorithms for handling branching trees of genotype probabilities, given the pedigree data and gene frequencies. LIPED was the first generally useful program, and MLINK (part of a package called LINKAGE) used the same basic algorithm, the Elston-Stewart algorithm, but extended it to multipoint data. The Elston-Stewart algorithm can handle arbitrarily large pedigrees, but the computing time increases exponentially with increasing numbers of possible haplotypes (more alleles and/or more loci). This limits the ability of MLINK to analyse multipoint data. An alternative algorithm, the Lander-Green algorithm, can cope with any number of genotypes but the computing time increases exponentially with the size of the pedigree. This algorithm is implemented in the GENEHUNTER program (see Section 12.2.4), which is particularly good for analysing whole-genome searches of modest-sized pedigrees. The general theory of linkage analysis is excellently covered in the book by Ott (Further reading), while the book by Terwilliger and Ott (Further reading) is full of practical advice indispensable to anybody undertaking human linkage analysis.
3. Locus heterogeneity is always a pitfall in human gene mapping
As we saw in Section 3.1.4, it is common for mutations in several unlinked genes to produce the same clinical phenotype. Even a dominant condition with large families can be hard to map if there is locus heterogeneity within the collection of families studied. It took years of collaborative work to show that tuberous sclerosis was caused by mutations at either of two loci, TSC1 (MIM 191100) at 9q34 and TSC2 (MIM 191092) at 16p13. With recessive conditions, the difficulty is multiplied by the need to combine many small families. Autozygosity mapping (Section 11.5.5) is the main solution in such cases.
GENEHUNTER or HOMOG and related programs (see Terwilliger and Ott, 1994) can compare the likelihood of the data on the alternative assumptions of locus homogeneity (all families map to the location under test) and heterogeneity (a proportion α of unlinked families), and give a maximum likelihood estimate of α.
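The comparison of homogeneity and heterogeneity can be illustrated with a simplified mixture calculation at one fixed test position. This is only a sketch and not the code of any of the programs named above: it assumes a finite lod score is already available for each family at the test position, it follows the text in letting α be the proportion of unlinked families, and the function names and the grid search are choices made for this example.

```python
from math import log10


def heterogeneity_lod(family_lods: list, alpha_unlinked: float) -> float:
    """Combined lod when a proportion alpha_unlinked of families is unlinked.

    Each family's likelihood ratio is a mixture: linked families contribute
    10 ** lod, unlinked families contribute a ratio of 1.
    """
    return sum(
        log10((1 - alpha_unlinked) * 10 ** z + alpha_unlinked)
        for z in family_lods
    )


def estimate_alpha(family_lods: list, steps: int = 100) -> float:
    """Grid-search estimate of the unlinked proportion alpha."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda a: heterogeneity_lod(family_lods, a))


# One family strongly supports linkage, one strongly contradicts it.
family_lods = [3.0, -3.0]
alpha_hat = estimate_alpha(family_lods)
print(alpha_hat, round(heterogeneity_lod(family_lods, alpha_hat), 2))
```

With α = 0 the function reduces to the ordinary sum of family lod scores (homogeneity), so the two assumptions can be compared by looking at how much the maximum over α improves on that sum.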
4. The limited resolution of human genetic mapping may be overcome by typing single sperm or by using linkage disequilibrium
Once a marker is found for which all meioses are informative and nonrecombinant, linkage analysis comes to a halt. In typical collections of disease families, the target region thus identified is likely to be 1 Mb or more. This is uncomfortably large for positional cloning of an unknown disease gene. One possible way to increase the resolution of marker-marker mapping is to type sperm instead of children. Humans have far too few children for optimal linkage analysis, but men produce untold millions of sperm, and modern PCR technology allows markers to be scored on single separated sperm from a doubly heterozygous man. Yu et al. (1996) show examples. Apart from technical problems, one drawback is that a single sperm cannot be resampled repeatedly to confirm interesting results, in the same way as a child can. Whole genome amplification ( Zhang et al., 1992) partially circumvents this problem. Individual spermatozoa are subjected to whole genome amplification followed by multiplex PCR amplification of markers from an aliquot. Further aliquots can be used to check any recombinants. Unfortunately sperm typing could not be used for disease-marker mapping, unless the disease mutations were already characterized.
Linkage disequilibrium provides the best hope of narrowing down the candidate region in disease-marker mapping. Genotypes or haplotypes for markers spread across the candidate region are examined in a series of unrelated affected patients. If the patients all carry independent mutations, as may very well be the case for a dominant or X-linked disease, this exercise will reveal nothing of interest. However, if a proportion of the disease genes in apparently unrelated patients derive from a common ancestor, as often happens with recessive conditions, it may be possible to find a shared ancestral haplotype that defines a small part of the candidate region. This approach is illustrated in Section 12.4.1.
5. Autozygosity mapping can map recessive conditions efficiently in extended inbred families
Autozygosity is a term used to mean homozygosity for markers identical by descent, inherited from a recent common ancestor. People with rare recessive diseases in consanguineous families are likely to be autozygous for markers linked to the disease locus. Suppose the parents are second cousins: they would be expected to share 1/32 of all their genes because of their common ancestry, and a child would be autozygous at only 1/64 of all loci. If a child is homozygous for a particular marker allele, this could be because of autozygosity, or it could be because a second copy of the same allele has entered the family independently. The rarer the allele is in the population, the greater the likelihood that homozygosity represents autozygosity. For an infinitely rare allele, a single homozygous affected child born to second cousin parents generates a lod score of log10(64) = 1.8. If there are two other affected sibs who are both also homozygous for the same rare allele, the lod score is 3.0 (log10(64 × 4 × 4); the chance that a sib would have inherited the same pair of parental haplotypes even if they are unrelated to the disease is 1 in 4).
Thus quite small inbred families can generate significant lod scores, and autozygosity mapping becomes a powerful tool for linkage analysis if families can be found with multiple affected people in two or more sibships, linked by inbreeding. Suitable families may be found in Middle Eastern countries where inbreeding is common. The method has been applied with great success to locating genes for autosomal recessive hearing loss, which otherwise presents intractable problems because of extensive locus heterogeneity (Guilford et al., 1994). An example is shown in Figure 11.8.
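The arithmetic of the second-cousin example can be reproduced directly. The sketch below simply restates the numbers given in the text (autozygosity at 1/64 of loci, and a 1 in 4 chance per additional sib of inheriting the same pair of parental haplotypes) under the same idealization of an infinitely rare allele; the function and parameter names are assumptions for this example.

```python
from math import log10


def autozygosity_lod(autozygous_fraction: float = 1 / 64,
                     extra_affected_sibs: int = 0) -> float:
    """Lod score for a homozygous affected child of related parents.

    autozygous_fraction is the share of loci expected to be autozygous by
    chance (1/64 for children of second cousins). Each additional affected
    sib sharing the genotype multiplies the odds by 4.
    """
    odds = (1 / autozygous_fraction) * 4 ** extra_affected_sibs
    return log10(odds)


print(round(autozygosity_lod(), 1))                       # one affected child
print(round(autozygosity_lod(extra_affected_sibs=2), 1))  # three affected sibs
```

The two printed values correspond to log10(64) and log10(64 × 4 × 4) in the paragraph above.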
The same principle can be extended to populations where the common ancestry is inferred rather than demonstrated. A bold application of this principle enabled Houwen et al. (1994) to map the rare recessive condition, benign recurrent intrahepatic cholestasis, using only four affected individuals (two sibs and two supposedly unrelated people) from an isolated Dutch village. The more remote the shared ancestor, the smaller is the proportion of the genome that is shared by virtue of that common ancestry, and therefore the greater the significance for linkage if autozygosity can be demonstrated. But at the same time, the remoter the common ancestor, the more chances there are for a second independent allele to enter the family from outside, and so the less likely is it that homozygosity represents autozygosity, either for the disease or for the markers. With remote common ancestry, as in the study of Houwen et al., everything depends on finding people with a very rare recessive condition who are homozygous for a very rare marker allele or (more likely) haplotype. The power of Houwen's study seems almost miraculous, but it is important to remember that this methodology applies only to diseases and populations where most affected people are descended from a common ancestor who was a carrier. The wider use of allelic association is described in the next chapter (Sections 12.3 and 12.4).
6. Characters whose inheritance is not mendelian are not suitable for mapping by the methods described in this chapter
The methods of lod score analysis described in this chapter require a precise genetic model that specifies the mode of inheritance, gene frequencies and penetrance of each genotype. For mendelian characters, penetrance is the main problem area. If no allowance is made for unaffected people being nonpenetrant gene carriers, or affected people being phenocopies, then these people may be wrongly scored as recombinant. On the other hand, if the penetrance is set too low there is a reduction in the power to detect linkage, because a less precise hypothesis is being tested. Errors in the order of markers on marker framework maps can cause problems, but these are diminishing as genetic maps are cross-checked against physical mapping data. Given sufficient meioses, the main obstacle in linkage analysis of mendelian characters is locus heterogeneity. However, for common complex diseases like diabetes or schizophrenia, the problems are far more intractable. Any genetic model is no more than a hypothesis - we have no real idea of the gene frequencies or penetrance of any susceptibility alleles, or even the mode of inheritance. This makes it near-impossible to apply the methods we have described in this chapter to such diseases. Nevertheless, identifying the genetic components of susceptibility to complex diseases is now a major part of human genetics research. The ways one can attempt to do this are the subject of the next chapter.

Genetic information
INTRODUCTION
It is common knowledge that a person's appearance (e.g., height, hair color, skin color, eye color, etc.) is determined by genes. A person's mental abilities and natural talents are certainly affected by heredity. Some diseases (or susceptibility to acquire a disease) are also known to be genetically related.

An inherited abnormal trait or anomaly may be of no real consequence to a person's health or well being (for example, a white splotch of hair or an extended ear lobe). An inherited anomaly may be of minor consequence (for example, color blindness). On the other hand, an inherited disorder may also have multiple effects resulting in dramatically decreased quality or length of life. For some genetic disorders, genetic counseling and prenatal diagnosis may be advised.

The terms anomaly, abnormality, disorder, defect, disease, and syndrome are not used consistently and do not have precise definitions.


BACKGROUND INFORMATION
Human beings have cells with 46 chromosomes (2 sex chromosomes and 22 pairs of autosomal, that is, non-sex chromosomes). Males are 46,XY; females are 46,XX (44 autosomes plus the two sex chromosomes). Each chromosome consists of one extremely long double-stranded DNA molecule in combination with chromosomal proteins. Genes are defined by intervals along the DNA molecule. The location of a gene is called its locus. Most genes carry information which is necessary to synthesize a protein. The pairs of autosomal chromosomes (one from mom and one from dad) carry basically the same information; that is, each has the same genes, but there may be slight variations in the DNA sequence of nucleotide bases in each gene. Alleles are different variants of a particular gene.

The information contained in the nucleotide sequence of a gene is transcribed to mRNA (messenger RNA) by enzymes in the cell's nucleus and then translated to a protein in the cytoplasm. This protein may be a structural constituent of a given tissue. It may be an enzyme which catalyzes a chemical reaction, or it may be a hormone. There are also many other potential functions for proteins.

If a gene is abnormal, it may code for an abnormal protein or for an insufficient amount of a normal protein. Since the autosomal chromosomes are paired, there are 2 copies of each gene. If one of these genes is defective, the other may code for sufficient protein so that the abnormality is not clinically apparent. This is called a recessive disease gene. If one abnormal gene somehow produces disease, this is called a dominant hereditary disorder. In the case of a dominant disorder, if one abnormal gene is inherited from mom or dad, the child will show the disease. In the case of a recessive disease, if one abnormal gene is inherited, the child will not show clinical disease, but they will pass the abnormal gene to 50% (on average) of their offspring.

A person with one abnormal gene is termed HETEROZYGOUS for that gene. If a child receives an abnormal recessive disease gene from both parents, the child will show the disease and will be HOMOZYGOUS for that gene. If two parents are each heterozygous for a particular recessive disease gene, then 25% of their children (on average) will be homozygous for that gene and show the disease. If one parent is homozygous and the other heterozygous, then 50% of the children will be homozygous.


GENETIC DISORDERS:
Almost all diseases have a genetic component, but the importance of that component varies. Disorders where genetics play an important role, so-called genetic diseases, can be classified as single gene defects, chromosomal disorders, or multifactorial. Single-gene defects are also called mendelian disorders.

A single gene disorder is one that is determined by a specific allele at a single locus on one or both members of a chromosome pair. Single gene defects are rare, with a frequency of less than 1 in 500 births, but since there are about 3000 known their combined impact is significant. The incidence of serious single gene disorders is estimated to be about 1 in 300 births.

Single-gene disorders are characterized by their pattern of transmission in families, which is displayed in a family-tree chart called a pedigree. A kindred includes the relatives outside of the immediate family. The affected individual who initially comes to light or is of immediate interest is called the proband. The brothers and sisters of the proband are called sibs.

There are only four basic patterns of single gene inheritance:
autosomal dominant,
autosomal recessive,
X-linked dominant, and
X-linked recessive.

The observed effect of an abnormal gene (the appearance of a disorder) is called the abnormal phenotype. A phenotype expressed in the same way (in both homozygotes and heterozygotes) is dominant. A phenotype expressed only in homozygotes (or, for X-linked traits expressed in males but not females) is recessive. Heterozygotes for a recessive gene are called carriers. They usually don't express the phenotype clinically, but it can frequently be identified by sensitive laboratory methods.

In autosomal dominant inheritance, the abnormality or abnormalities appear in every generation. Every affected child has an affected parent and, on average, each child of an affected parent has a 50% chance of showing the disease. Normal members of the family do not transmit the disease. Males and females are equally likely to have the disease and to transmit the disease. Male-to-male transmission can occur (unlike with X-linked dominant inheritance) and males can have unaffected daughters (unlike with X-linked dominant inheritance).

In autosomal recessive inheritance, the parents of an affected individual may not express the disease. On average, the chance of an affected child's brother or sister having the disease is 1 in 4. Males and females are equally likely to be affected. For a child to have symptoms of an autosomal recessive disorder, the child must receive the recessive gene from BOTH parents. Because these disorders are rare, when a child has symptoms of an autosomal recessive disorder there is a chance that the parents are related.

In X-linked recessive inheritance, the incidence of the disease is much higher in males than females. Since the abnormal gene is carried on the X chromosome, males do not transmit it to their sons; they do transmit it to all their daughters. The presence of one normal X chromosome masks the effects of the X chromosome with the abnormal gene so almost all of the daughters of an affected man appear normal, but they are all carriers of the abnormal gene. The sons of these daughters then have a 50% chance of receiving the defective gene.

In X-linked dominant inheritance, the presence of the defective gene makes itself manifest in females even if there is also a normal X chromosome present. Since males pass the Y chromosome to their sons, affected males will not have affected sons, but all of their daughters will be affected. Sons or daughters of affected females will have a 50% chance of getting the disease (except for the rare case of the female with two abnormal genes).
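The X-linked rules in the two paragraphs above follow from which sex chromosome each parent hands on, and they can be checked by enumeration. In this sketch the labels "N" (normal X) and "M" (X carrying the abnormal gene) and both function names are assumptions for the example; a son is assumed to receive his single X from his mother and a daughter one X from each parent.

```python
from itertools import product


def x_linked_offspring(father_x: str, mother_xs: tuple) -> list:
    """List the four equally likely children as (sex, X alleles) pairs."""
    children = []
    for paternal, maternal in product((father_x, "Y"), mother_xs):
        if paternal == "Y":
            children.append(("son", (maternal,)))  # the son's X is maternal
        else:
            children.append(("daughter", tuple(sorted((paternal, maternal)))))
    return children


def is_affected(x_alleles: tuple, dominant: bool) -> bool:
    """Apply the dominant or recessive rule to a child's X alleles."""
    mutant = x_alleles.count("M")
    if len(x_alleles) == 1:  # hemizygous male
        return mutant == 1
    return mutant >= 1 if dominant else mutant == 2


# Affected father, unaffected non-carrier mother.
for sex, alleles in x_linked_offspring("M", ("N", "N")):
    print(sex, alleles,
          "dominant:", is_affected(alleles, dominant=True),
          "recessive:", is_affected(alleles, dominant=False))
```

For the affected father, every daughter receives "M" and no son does, so under the dominant rule all daughters and no sons are affected, while under the recessive rule the daughters are unaffected carriers. Calling `x_linked_offspring("N", ("M", "N"))` models a carrier mother, whose sons have a 50% chance of being affected.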


EXAMPLES OF SINGLE GENE DISORDERS:
Autosomal recessive:
Cystic fibrosis (CF) is a very common hereditary disorder (1 out of 2000 caucasian births). The normal function of the affected protein (CFTR) is to transport chloride ions across the membranes of certain cells. Deficiency of this protein somehow results in the accumulation of thick mucus in the lungs and other parts of the body. This situation compromises respiration and greatly increases the chance of pulmonary infections. Affected individuals rarely survive to the age of 30.
Phenylketonuria (PKU) is a common genetic disorder (1 out of 12,000 births) which results from a deficient enzyme required for the metabolism of the amino acid phenylalanine. Failure to recognize the disorder early in life results in mental retardation. Many states require all newborns to be screened for this disease.
AAT deficiency is a disorder seen in about 1 out of 10,000 births. The normal function of the protein is to inhibit enzymes which escape from white blood cells in the process of destroying invading bacteria. Affected individuals are much more likely to develop emphysema than usual.
Sickle cell anemia is a disorder common in individuals with an African ethnic background. The high frequency of the gene probably relates to the fact that the heterozygotes are resistant to malaria. The homozygotes have a predominance of an abnormal hemoglobin in their red blood cells. This abnormal protein causes the red blood cells to assume abnormal shapes and to lyse in small blood vessels under conditions of reduced oxygen pressure.
ADA deficiency is a rare immunodeficiency disorder, sometimes called the "boy in a bubble" disease, which results from the deficiency of an enzyme called adenosine deaminase. This enzyme is important for the normal function of lymphocytes which are the primary components of the immune system. This disease has the distinction of being the first to be treated effectively by genetic engineering, where some of the patient's lymphocytes are removed from the body, injected with a normal gene, then reintroduced to the body.
Tay-Sachs disease is a disorder which is seen almost solely in Ashkenazi Jewish populations. The incidence in this population has been substantially reduced (from about 1 out of 900) as a result of massive screening programs. The affected protein is an enzyme necessary to break down lipids in the membranes of cells. Abnormal membrane fragments accumulate and cause a deterioration of the nervous system. Affected individuals die before the age of 3.

X-linked recessive:
Duchenne muscular dystrophy is a very common (1 out of 3500 male births) disorder which results from the presence of an abnormal muscle protein. Muscles of young boys gradually deteriorate until even the muscles required for normal respiration become ineffective. These boys usually die of pulmonary infections before the age of 20.
Hemophilia A is seen in 1 out of 10,000 male births. The defective protein (coagulation factor VIII) is required for normal blood clotting. Affected individuals require injections of the protein or transfusion of blood products to prevent internal bleeding. Until recently when the genetically engineered protein became available, many of these individuals contracted hepatitis or AIDS as a result of their many transfusions.

Autosomal dominant:
Familial hypercholesterolemia (FHC) is a fairly common disorder (1 out of 500 individuals are heterozygous). The affected gene codes for a protein which is found on the external surface of most of the body's cells. This so-called receptor protein mediates the uptake of cholesterol into the cells. This cholesterol is transported in the blood by a lipoprotein called LDL. When LDL can't get into cells it increases to high levels in the blood. High levels of LDL (with its associated cholesterol) increase the risk of developing arteriosclerosis and coronary artery disease. Homozygous individuals (about 1 out of 1,000,000 births) have extremely high levels of LDL and develop coronary heart disease in childhood.
Huntington's disease is a neurodegenerative disease which doesn't appear until approximately age 30. It has recently become possible to test for the presence of the abnormal gene at any age. This information may be of great interest to individuals who learn they will develop the disease later in life, since they may wish to modify their plans with regard to marriage, childbearing, etc.

X-linked dominant:
Only a few, very rare, disorders are classified as X-linked dominants. One of these is hypophosphatemic rickets (also called vitamin D resistant rickets). In this case a protein in the kidneys is defective. This protein normally transports phosphate from the urinary filtrate back into the blood. Since the amount of phosphate in the blood is much lower than normal, the bones are chronically stimulated to release calcium and phosphate by hormones such as parathormone. This results in fragile and abnormally structured bones.


CHROMOSOMAL DISORDERS
In chromosomal disorders, the defect is due not to a single gene, but to an excess or deficiency of the genes contained in a whole chromosome or chromosome segment.

Down syndrome is the most common chromosomal disorder (1 out of 800). Affected individuals have an extra copy of chromosome 21. This unbalanced set of genes results in moderate to severe mental retardation and numerous physical changes.
Other examples are Klinefelter syndrome (1 out of 1000 males) and Turner syndrome (1 out of 5000 females).


MULTIFACTORIAL DISORDERS
Many of the most common diseases which affect humans undoubtedly involve interactions of numerous genes, e.g., coronary heart disease, hypertension, stroke, and various kinds of cancer. These are currently active areas of research.


MITOCHONDRIAL DISORDERS
Mitochondria are small organelles present in most of the body's cells which function in the conversion of certain chemicals in our food, in the presence of oxygen, to the common currency of energy inside cells, i.e., ATP. Mitochondria contain their own private DNA. In recent years several hereditary disorders have been shown to result from mutations in mitochondrial DNA.





12. Autosomal dominant

Definition:
A single abnormal gene on one of the autosomal chromosomes (one of the first 22 "non-sex" chromosomes) from either parent can cause the disease. One of the parents will have the disease (since it is dominant) in this mode of inheritance and that person is called the CARRIER. Only one parent must be a carrier in order for the child to inherit the disease.

BACKGROUND:
The inheritance of genetic diseases, abnormalities, or traits is described by both the type of chromosome the abnormal gene resides on (autosomal or sex chromosome) and by whether the gene itself is dominant or recessive.

Autosomally inherited diseases are inherited through the non-sex chromosomes, pairs 1 through 22. Sex-linked diseases are inherited through one of the "sex chromosomes", the X chromosome (diseases are not inherited through the Y chromosome).

Dominant inheritance occurs when an abnormal gene from ONE parent is capable of causing disease even though the matching gene from the other parent is normal. The abnormal gene dominates the outcome of the gene pair.

Recessive inheritance occurs when BOTH matching genes must be abnormal to produce disease. If only one gene in the pair is abnormal the disease is not manifest or is only mildly manifest; however the disease can be passed on to the children.

STATISTICAL CHANCES OF INHERITING A TRAIT:
For an autosomal dominant disorder: If one parent is a carrier and the other normal there is a 50% chance a child will inherit the trait.

In other words, if it is assumed that 4 children are produced, one parent is carrier and exhibits disease, the STATISTICAL expectation is for:

2 children normal
2 children with the disease
This does not mean that children WILL necessarily be affected; it does mean that EACH child has a 50:50 chance of inheriting the disorder.
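To make the last point concrete, the number of affected children among four can be treated as a binomial count, assuming each child independently has a 50% chance. The sketch below is only an illustration; the function name is not from the source.

```python
from math import comb

def prob_affected(k, n=4, p=0.5):
    """Probability that exactly k of n children inherit the disorder,
    assuming each child independently has probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Distribution of the number of affected children in a family of four
for k in range(5):
    print(k, prob_affected(k))
```

Under these assumptions the "expected" split of 2 affected and 2 normal children occurs in only 37.5% of four-child families.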

RELATED TOPICS:
autosomal recessive
genetic counseling and prenatal diagnosis
sex-linked dominant
sex-linked recessive

For detailed information, see heredity and disease (genetics).




13. SEX-LINKED DOMINANT
SEX-LINKED DOMINANT Information:

Definition:
A single abnormal gene on the X chromosome can cause the disease. Both boys and girls can be affected, although an affected father passes the gene only to his daughters. This is a rare mode of transmission.

BACKGROUND:
The inheritance of genetic diseases, abnormalities, or traits is described by both the type of chromosome the abnormal gene resides on (autosomal or sex chromosome) and by whether the gene itself is dominant or recessive.

Autosomally inherited diseases are inherited through the non-sex chromosomes, pairs 1 through 22. Sex-linked diseases are inherited through one of the "sex chromosomes", the X chromosome (diseases are not inherited through the Y chromosome).

Dominant inheritance occurs when an abnormal gene from ONE parent is capable of causing disease even though the matching gene from the other parent is normal. The abnormal gene dominates the outcome of the gene pair.

Recessive inheritance occurs when BOTH matching genes must be abnormal to produce disease. If only one gene in the pair is abnormal the disease is not manifest or is only mildly manifest; however the disease can be passed on to the children.

STATISTICAL CHANCES OF INHERITING A TRAIT:
For an X-linked dominant disorder: If the father carries the abnormal X gene all of his daughters will have the disease and none of the sons will have the disease. If the mother carries the abnormal X gene half of all their children (daughters and sons) will have the disease.

In other words, if it is assumed that 4 children are produced (2 male and 2 female), the mother is a carrier (1 abnormal X, she has disease), and the father is normal, the STATISTICAL expectation is for:

2 children (1 girl & 1 boy) with disease
2 children (1 girl & 1 boy) normal
If it is assumed that 4 children are produced (2 male and 2 female), the father is a carrier (abnormal X, he has disease), and the mother is normal, the STATISTICAL expectation is for:
2 girls with disease
2 boys normal
This does not mean that children WILL necessarily be affected; it does mean that EACH child has a chance of inheriting the disorder or of being a carrier.




14. SEX-LINKED RECESSIVE
SEX-LINKED RECESSIVE Information:

Definition:
An abnormal gene on the X chromosome from each parent is required to cause the disease in females since the female has 2 X chromosomes. In males there is only one X chromosome, therefore, a single recessive gene on the X chromosome will cause the disease. (Note: Although the Y chromosome is the other half of the XY pair in the male, the Y chromosome is much smaller than the X chromosome, carries no matching copy of most X-linked genes, and so doesn't protect the male. Therefore recessive genes on the X chromosome of the male will be expressed. This is seen in diseases such as hemophilia and muscular dystrophy.)

BACKGROUND:
The inheritance of genetic diseases, abnormalities, or traits is described by both the type of chromosome the abnormal gene resides on (autosomal or sex chromosome) and by whether the gene itself is dominant or recessive.

Autosomally inherited diseases are inherited through the non-sex chromosomes, pairs 1 through 22. Sex-linked diseases are inherited through one of the "sex chromosomes", the X chromosome (diseases are not inherited through the Y chromosome).

Dominant inheritance occurs when an abnormal gene from ONE parent is capable of causing disease even though the matching gene from the other parent is normal. The abnormal gene dominates the outcome of the gene pair.

Recessive inheritance occurs when BOTH matching genes must be abnormal to produce disease. If only one gene in the pair is abnormal the disease is not manifest or is only mildly manifest; however the disease can be passed on to the children.

STATISTICAL CHANCES OF INHERITING A TRAIT:
For an X-linked recessive disorder:

If only the mother carries the gene and the father is normal, all of the female children will be normal (50% with 2 normal chromosomes and 50% carriers), one half of all the male children will exhibit the disease, and one half will be normal. The recessive gene is expressed in the male because there is not another X to counteract it, only the Y (which determines maleness).
If only the father carries the recessive gene, all of his daughters will be carriers and all of his sons will be normal.
If both the mother and the father carry the abnormal gene, then STATISTICALLY out of 4 children 1 daughter will have the disease (two recessive genes on the X chromosome), 1 daughter will be a carrier, 1 son will have the disease (one recessive gene on the X and a Y chromosome) and the other son will be normal. In other words, 50% of the children (boys and girls) will be affected and 50% normal.
In other words, if it is assumed that 4 children are produced (2 boys and 2 girls), the mother is a carrier (one abnormal X but no disease), and the father is normal, the STATISTICAL expectation is for:
1 boy normal
1 boy with disease
1 girl normal
1 girl carrier without disease
If it is assumed that 4 children are produced (2 boys and 2 girls), the father is a carrier (1 abnormal X, he has the disease), and the mother is normal, the STATISTICAL expectation is for:
2 boys normal
2 girls carriers without disease
If it is assumed that 4 children are produced (2 boys and 2 girls), the father is a carrier (1 abnormal X, he has the disease), and the mother is a carrier (one abnormal X but no disease), the STATISTICAL expectation is for:
1 girl with disease
1 girl carrier without disease
1 boy (abnormal X) with disease
1 boy normal
This does not mean that children WILL necessarily be affected; it does mean that EACH child has a chance of inheriting the disorder or of being a carrier.
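The three X-linked recessive crosses above can be checked by enumerating the four equally likely combinations of parental sex chromosomes. This is a minimal sketch; the allele notation ("X" normal, "x" abnormal) and the function name are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

def x_linked_recessive_cross(mother, father):
    """Enumerate the equally likely offspring of an X-linked recessive cross.

    mother: her two X alleles, e.g. ("X", "x")
    father: his X allele plus "Y", e.g. ("x", "Y")
    "x" marks the abnormal recessive allele (notation assumed here).
    """
    outcomes = Counter()
    for from_mother, from_father in product(mother, father):
        if from_father == "Y":
            # Son: his only X comes from the mother
            status = "affected" if from_mother == "x" else "normal"
            outcomes[("boy", status)] += 1
        else:
            # Daughter: one X from each parent
            n_abnormal = (from_mother + from_father).count("x")
            status = ("normal", "carrier", "affected")[n_abnormal]
            outcomes[("girl", status)] += 1
    return dict(outcomes)

# Carrier mother, normal father
print(x_linked_recessive_cross(("X", "x"), ("X", "Y")))
# Normal mother, affected father
print(x_linked_recessive_cross(("X", "X"), ("x", "Y")))
# Carrier mother, affected father
print(x_linked_recessive_cross(("X", "x"), ("x", "Y")))
```

Each count is out of four equally likely combinations, so it can be read directly against the lists of four children given above.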


15. Autosomal recessive
Autosomal recessive Information:

Definition:
An abnormal gene on one of the autosomal chromosomes (one of the first 22 "non-sex" chromosomes) from each parent is required to cause the disease. People with only one abnormal gene in the gene pair are called CARRIERS but since the gene is recessive they do not exhibit the disease. Both parents must be carriers in order for a child to have symptoms of the disease; a child who inherits the gene from one parent will be a carrier.

BACKGROUND:
The inheritance of genetic diseases, abnormalities, or traits is described by both the type of chromosome the abnormal gene resides on (autosomal or sex chromosome) and by whether the gene itself is dominant or recessive.

Autosomally inherited diseases are inherited through the non-sex chromosomes, pairs 1 through 22. Sex-linked diseases are inherited through one of the "sex chromosomes", the X chromosome (diseases are not inherited through the Y chromosome).

Dominant inheritance occurs when an abnormal gene from ONE parent is capable of causing disease even though the matching gene from the other parent is normal. The abnormal gene dominates the outcome of the gene pair.

Recessive inheritance occurs when BOTH matching genes must be abnormal to produce disease. If only one gene in the pair is abnormal the disease is not manifest or is only mildly manifest; however the disease can be passed on to the children.

STATISTICAL CHANCES OF INHERITING A TRAIT:
For an autosomal recessive disorder: When both parents are carriers of an autosomal recessive trait there is a 25% chance of a child inheriting both abnormal genes (developing the disease). There is a 50% chance of a child inheriting one abnormal gene (being a carrier).

In other words, if it is assumed that 4 children are produced, and both parents are carriers (neither exhibits any disease), the STATISTICAL expectation is for:

1 child with 2 normal chromosomes (normal)
2 children with 1 normal and 1 abnormal chromosome (carriers, without disease)
1 child with 2 abnormal chromosomes (has the disease)
This does not mean that children WILL necessarily be affected; it does mean that EACH child has a one in four chance of inheriting the disorder and a 50:50 chance of being a carrier.
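The 1 : 2 : 1 expectation for two carrier parents follows from listing every combination of one allele from each parent (a Punnett square). A minimal sketch, with "A" for the normal allele and "a" for the abnormal allele as assumed notation:

```python
from collections import Counter
from itertools import product

def autosomal_cross(parent1, parent2):
    """Count offspring genotypes; "A" = normal allele, "a" = abnormal allele."""
    counts = Counter()
    for allele1, allele2 in product(parent1, parent2):
        # "Aa" and "aA" are the same genotype, so sort the two alleles
        genotype = "".join(sorted(allele1 + allele2))
        counts[genotype] += 1
    return dict(counts)

print(autosomal_cross("Aa", "Aa"))   # {'AA': 1, 'Aa': 2, 'aa': 1}
```

Here "AA" is normal, "Aa" is a carrier without disease, and "aa" has the disease.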




16. Genetic counseling and prenatal diagnosis
Genetic counseling and prenatal diagnosis
Information:
For over 4000 years, certain human abnormalities have been noted to run in families but the "WHY" of the observations did not become apparent until the advent of modern genetics and the recognition of how genetic information is transmitted. Before then one only heard the admonition, "it's in the blood" (thought to refer more to bloodline rather than some abnormal element in the blood).

Present day medicine has recognized how genetic diseases are inherited based on an understanding of the nature of DNA, genes, and chromosomes. Scientists are presently trying to "map" the chromosomes, to determine the location and function of all of the genes on each chromosome. This will ultimately help in treating genetic disorders.

However, until science has the ability to treat some of the more disastrous and ultimately fatal genetic disorders the best remaining recourse is prevention. Prevention of genetically transmitted disease can consist of major choices: abstinence from pregnancy, artificial insemination, prenatal diagnosis, and termination of affected pregnancies.

Prenatal diagnosis involves testing fetal cells, amniotic fluid, or amniotic membranes to detect fetal abnormalities.

Genetic counseling (and prenatal diagnosis) provides parents with the knowledge to make intelligent, informed decisions regarding possible pregnancy and its outcome. Based on genetic counseling some parents, in the face of possibly lethal genetic disease, have forgone pregnancy and adopted children while others have opted for artificial insemination from an anonymous donor who is not a carrier of the specific disease.

Many diseases transmitted as a single gene defect can now be diagnosed very early in pregnancy. Because of this some parents have elected to become pregnant and then, early in the pregnancy, had the disease status of the fetus determined. The pregnancy is continued if the fetus is disease-free. Parents who decide to continue the pregnancy with a defective fetus may be able to better prepare to care for the infant by being informed about the disease in advance.




17. Chromosome
Chromosome Information:
Humans have 46 chromosomes, arranged in 23 pairs. Almost all of the body's genes are contained within these 46 chromosomes (a small number reside in mitochondrial DNA).

Two of the chromosomes, the X and the Y chromosome, determine sex and are called the SEX CHROMOSOMES. Females have 2 X chromosomes and males have 1 X and 1 Y chromosome. The Y chromosome determines the male sex but does little else.

The remaining 44 chromosomes are called AUTOSOMAL CHROMOSOMES. Chromosomes exist in pairs. For convenience, scientists have numbered the autosomal chromosome pairs 1 through 22. The X and Y chromosome are the 23rd pair.

Each parent contributes one half of each pair or 23 chromosomes to their child, 22 autosomal and 1 sex chromosome. Females always contribute an X chromosome to the child while a male may contribute an X or a Y. Therefore, it is the male that determines the sex of the child.





18. Gene
Gene Information:
Genes are the smallest units of heredity. The information from all the genes, taken together, makes up the blueprint or plan for the human body and its functions. A gene is a short segment of DNA which is interpreted by the body as a plan or template for building a specific protein. Genes reside within long strands of DNA which in turn make up the chromosomes. Some diseases, such as sickle cell anemia, can be caused by a change in a single gene (one out of the many thousands of genes which make up the plan for the entire human body).

Genes are arranged in order along the DNA strand within the chromosome (similar to beads on a string). Matching genes from each parent exist on matching chromosomes and matching positions along the DNA within the chromosome. These genes are paired, one from the mother and one from the father. Genes are described as DOMINANT or RECESSIVE. DOMINANT means that one gene in the gene pair is able to control the trait which that gene pair codes for. RECESSIVE means that both genes in the gene pair are necessary to control the trait.

Lod score add

Lod Score Method of Estimating Linkage Distances
We will now introduce a method for estimating linkage distances called the Lod Score Method. The method, developed by Newton E. Morton, is an iterative approach in which a series of lod scores is calculated from a number of proposed linkage distances. Here is how the method works. A linkage distance is estimated, and given that estimate, the probability of a given birth sequence is calculated. That value is then divided by the probability of the same birth sequence assuming that the genes are unlinked. The log (base 10) of this ratio is calculated, and that value is the lod score for this linkage distance estimate. The same process is repeated with another linkage distance estimate. A series of these lod scores is obtained using different linkage distances, and the linkage distance giving the highest lod score is considered the estimate of the linkage distance. The following is the formula for the lod score:

$$\text{lod score} = \log_{10}\frac{P(\text{birth sequence} \mid \text{linkage at recombination fraction } \theta)}{P(\text{birth sequence} \mid \text{no linkage})}$$

The above example will be used to demonstrate the principle. We will first use 0.125 as our estimate of the recombination fraction. In this first birth sequence, we have an individual with a parental genotype. The probability of this event is (1 - 0.125). Because there are two parental types, this value is divided by two to give a value of 0.4375. In this pedigree we have a total of seven parental types. We also have one recombinant type. The probability of this event is 0.125, which is divided by two because two recombinant types exist, giving 0.0625.
What would the sequence of births be if these genes were unlinked? When two genes are unlinked the recombination frequency is 0.5. Therefore, the probability of any given genotype would be 0.25.
Now let's put the whole method together. The probability of a given birth sequence is the product of the probabilities of the independent events. So the probability of the birth sequence based on our estimate of 0.125 as the recombination frequency would be equal to (0.4375)^7 (0.0625)^1 = 0.0001917. The probability of the birth sequence based on no linkage would be (0.25)^8 = 0.0000153. Now divide the linkage probability by the non-linkage probability and you get a value of 12.566. Next take the log (base 10) of this value, and you obtain a value of 1.099. This value is the lod score.
As was mentioned, this is repeated for a series of recombination frequency estimates. The table below gives the lod score for six different linkage estimates.
Recombination Frequency    Lod Score
0.050                      0.951
0.100                      1.088
0.125                      1.099
0.150                      1.090
0.200                      1.031
0.250                      0.932
As the table shows, the largest lod score corresponds to a linkage estimate of 0.125. In practice, we would like to see a lod score greater than 3.0. What this means is that the likelihood of linkage occurring at this distance is 1000 times greater than that of no linkage.
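The worked example can be reproduced in a few lines of Python. This is a sketch of the calculation as described in the text (seven parental types and one recombinant type); the function and variable names are illustrative.

```python
import math

def lod_score(theta, n_parental=7, n_recombinant=1):
    """Lod score for a birth sequence with the given numbers of parental
    and recombinant offspring, at recombination fraction theta."""
    n = n_parental + n_recombinant
    # Each parental type has probability (1 - theta) / 2,
    # each recombinant type has probability theta / 2.
    linked = ((1 - theta) / 2) ** n_parental * (theta / 2) ** n_recombinant
    # With no linkage every genotype has probability 0.25.
    unlinked = 0.25 ** n
    return math.log10(linked / unlinked)

thetas = [0.050, 0.100, 0.125, 0.150, 0.200, 0.250]
for theta in thetas:
    print(f"{theta:.3f}  {lod_score(theta):.3f}")

# The estimate is the recombination frequency with the highest lod score
best_theta = max(thetas, key=lod_score)
print("best estimate:", best_theta)
```

A lod score of 3.0 corresponds to a likelihood ratio of 10^3 = 1000, which is where the "1000 times" figure comes from.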
The lod score is a widely used technique not only in human research but in plant and animal linkage analyses as well. An important software package, MAPMAKER, that is widely used in plant mapping research is based in part on the lod score method.
Modifications introduced into Version 1.4
• Maximum-Likelihood Haplotyping analysis, for finding the most likely haplotypes that generated the input.
• An analysis that moves the disease locus over the whole map, or a specified part of it, and finds the position which produces the maximum likelihood, using golden section search.
• MMLS-C analysis (can be used when the inheritance model of the disease isn't known). This analysis finds the position of the disease locus which produces the maximum LOD score, once assuming dominant inheritance with 50% penetrance, and once assuming recessive inheritance with 50% penetrance. It then commits to the model which produced the higher maximum LOD score and subtracts 0.3 from this score to correct for multiple tests.
• MBLOD analysis (can be used when the inheritance model of the disease isn't known). This analysis finds the position of the disease locus which produces the maximum LOD score, once assuming dominant inheritance, and once assuming recessive inheritance. The likelihood at a specific position of the disease locus is computed by averaging the likelihood given the specific inheritance model (dominant or recessive) over all penetrance values. The inheritance model which produced the higher maximum LOD score is committed to, and 0.3 is subtracted from the maximum LOD score to correct for multiple tests.
• An option to generate a Postscript graph of the Lod-Score as a function of the position of a disease locus (only possible for the case where one disease locus is iterated).
• An improved, faster implementation (more noticeable with medium to large files).
• Correction of some bugs in Version 1.3.
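Two of the ideas in the list above, golden section search and the MMLS-C correction, can be sketched generically. The code below is not the program's implementation: the function names and the toy LOD curve are assumptions, and only the rule "commit to the higher maximum and subtract 0.3" comes from the text.

```python
import math

INV_PHI = (math.sqrt(5) - 1) / 2   # 1/phi, about 0.618

def golden_section_max(f, lo, hi, tol=1e-6):
    """Locate the maximum of a unimodal function f on the interval [lo, hi]."""
    a, b = lo, hi
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    while b - a > tol:
        if f(c) > f(d):
            b = d          # the maximum lies in [a, d]
        else:
            a = c          # the maximum lies in [c, b]
        c = b - INV_PHI * (b - a)
        d = a + INV_PHI * (b - a)
    return (a + b) / 2

def mmls_c(max_lod_dominant, max_lod_recessive, correction=0.3):
    """Commit to the model with the higher maximum LOD score, then
    subtract the correction for having tested two models."""
    if max_lod_dominant >= max_lod_recessive:
        return "dominant", max_lod_dominant - correction
    return "recessive", max_lod_recessive - correction

# Hypothetical LOD curve along the map, peaking at position 42
toy_lod = lambda pos: 3.4 - (pos - 42.0) ** 2 / 100.0
best_pos = golden_section_max(toy_lod, 0.0, 100.0)

# Hypothetical maximum of 2.1 under the dominant model
model, corrected = mmls_c(2.1, toy_lod(best_pos))
print(round(best_pos, 3), model, round(corrected, 3))
```

Golden section search assumes a single peak in the searched interval; a real LOD curve with several peaks would need the search restricted to a region around each one.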

LOD SCORE METHOD

Documentation and program files for FLOSS version 1.4.1
License
You have permission to use and develop the FLOSS and COV programs ("the Program"), provided that the following conditions are met:
You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program.
If the FLOSS or COV software is used for analyses which will be reported or published,you specify the version of the software used, and cite the article noted in the citation section below.
You acknowledge that Brian Browning, GlaxoSmithKline ("GSK") and the GSK developers may develop modifications to the software that may be substantially similar to your modifications of the software and that Brian Browning, GSK and GSK developers shall not be constrained in any way by you in Brian Browning's, GSK's and GSK developers' use or management of such modifications. You acknowledge the right of Brian Browning, GSK and GSK developers to prepare and publish modifications to the software that may be substantially similar or functionally equivalent to your modifications and improvements, and if you obtain patent protection for any modification or improvement to the software, you agree not to allege or enjoin infringement of your patent by Brian Browning, GSK or GSK developers
This software is provided ``AS IS'' and any express or implied warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose are disclaimed. In no event shall Brian Browning or GlaxoSmithKline be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage.
Contents
Introduction
Citation
Creating FLOSS input files
Running the FLOSS program
FLOSS output files
Download FLOSS
Frequently Asked Questions
References
Introduction
The FLOSS software package uses input and output files from the MERLIN linkage analysis package (Abecasis et al, 2002) to perform an ordered subset analysis using either nonparametric linkage analysis z-scores or linear allele sharing model LOD scores. The FLOSS program is written in Java and requires a Java 1.4 interpreter.
Citation
If you use FLOSS in your published work, please cite
Browning, BL (2006) FLOSS: Flexible ordered subsets analysis for linkage analysis of complex traits. Bioinformatics 22(4):512-3.
If you are publishing results of an ordered subset analysis, the following Suggested Reporting Guidelines may be helpful.
Creating FLOSS input files
The FLOSS program requires two input files: a linkage score file and a covariate file. The linkage score file is a MERLIN ".lod" output file created using MERLIN with the --perFamily option. The covariate file can be created using the MERLIN pedigree (.ped) and data (.dat) input files. All covariates must be identified with a 'C' in the MERLIN data file, and must be numeric (not categorical) data.
To create the covariate file from the MERLIN pedigree and data files enter the command:
java -jar cov.jar [options]
where [options] are combinations of the following flags and arguments:
-d [.dat file]
name of MERLIN data file. Required.
-p [.ped file]
name of MERLIN pedigree file. Required.
-o [output file]
name of output file. It is suggested that the output covariate filename end in ".cov". Required.
-f [filter]
subject filter. Optional: defaults to "-f all".
-n [number]
minimum number of subjects required to define a family covariate value. The argument must be an integer. Optional: defaults to "-n 2".
-s [statistic]
statistic used. Optional: defaults to "-s avg".
A short suffix is appended to the name of each covariate that identifies the subject filter, minimum number of subjects, and statistic used to define the family covariate score. For example, if you defined the family covariate score using the flags "-f all -n 2 -s avg" for the "age_of_onset" covariate, then the covariate name in the covariate file would be "age_of_onset.avg2all".
Subject filter: -f
The subject filter specifies the subset of family members that will be used to calculate the family covariate value. Three filter options are available: all, aff, and FDU.
-f all
use all family members who have a covariate value.
-f aff
use all affected family members who have a covariate value.
-f FDU
First Degree Unaffected: use all family members who are not affected (i.e., whose affection status is either unaffected or unknown) and who have a first degree relative (parent, offspring, or full sibling) who is affected. This filter is useful if you are concerned that the covariate values of affected members will be influenced by treatment for the affection.
Note: if there are multiple affection status variables specified in the MERLIN data file, only the first affection status variable is used to determine the subject's affection status.
Minimum number of subjects: -n
The minimum number of subjects required in order to define a family covariate value. First the subject filter specified with the "-f" option is applied. If there are fewer than the specified number of subjects after the subject filter is applied, the family is assigned an unknown covariate value ("NaN").
Statistic: -s
The statistic used to create the covariate value. Three statistics are available: min, max, and avg.
-s min
minimum covariate value for the family members specified with the subject filter argument.
-s max
maximum covariate value for the family members specified with the subject filter argument.
-s avg
mean covariate value for the family members specified with the subject filter argument.
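The way the filter, minimum count, and statistic combine into one family covariate score can be sketched as follows. This is not the COV program's code: the function names and the input layout are assumptions, the FDU filter is left out because it needs pedigree relationships, and the suffix order is inferred from the single "age_of_onset.avg2all" example.

```python
import math

def family_covariate(members, subject_filter="all", min_subjects=2, statistic="avg"):
    """Family covariate score for one family.

    members: list of (covariate_value_or_None, is_affected) tuples
             (an assumed layout, not a MERLIN file format).
    """
    if subject_filter == "all":
        values = [v for v, _ in members if v is not None]
    elif subject_filter == "aff":
        values = [v for v, affected in members if affected and v is not None]
    else:
        raise ValueError("only 'all' and 'aff' are sketched here")
    if len(values) < min_subjects:
        return math.nan                  # too few subjects: unknown value
    if statistic == "min":
        return min(values)
    if statistic == "max":
        return max(values)
    if statistic == "avg":
        return sum(values) / len(values)
    raise ValueError(f"unknown statistic: {statistic}")

def covariate_label(name, subject_filter="all", min_subjects=2, statistic="avg"):
    """Covariate name with suffix, e.g. age_of_onset.avg2all."""
    return f"{name}.{statistic}{min_subjects}{subject_filter}"

family = [(40.0, True), (50.0, True), (None, False), (60.0, False)]
print(covariate_label("age_of_onset"), family_covariate(family))
print(family_covariate(family, subject_filter="aff", min_subjects=3))
```

The second call returns NaN because only two affected members have a covariate value, fewer than the required three.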
Covariate File Format
The covariate file is a white-space delimited matrix of entries. The first column contains FamilyID followed by the family identifiers. The first row contains FamilyID followed by the covariate identifiers. All other entries of the matrix give the family covariate scores for the family (determined by the row) and the covariate (determined by the column).
When creating the covariate file using the COV program, an additional column __asm__ is added. This column gives the maximum value for the allele sharing parameter for each family when using linear allele sharing model LOD scores. The __asm__ column is not used when using nonparametric linkage scores.
Missing Covariate Data
A NaN entry in the covariate file means that there were not enough pedigree members with covariate data to assign a family covariate score. If the number of individuals with covariate data in the subset of family members specified by the filter (-f) parameter is less than the minimum number specified by the -n parameter, then the family covariate score is reported as missing (i.e. NaN). An ordered subset analysis for a particular covariate uses only the families with non-missing covariate scores for that covariate.
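A covariate file laid out as described can be read with a few lines of Python. The sketch below assumes the whitespace-delimited layout given in the text; the sample rows, family identifiers, and __asm__ values are made up for illustration.

```python
import math

def parse_covariate_file(text):
    """Return {family_id: {covariate_name: score}} from whitespace-delimited text."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    covariate_names = rows[0][1:]        # rows[0][0] is the "FamilyID" label
    table = {}
    for row in rows[1:]:
        family_id, scores = row[0], row[1:]
        # float("NaN") yields a NaN value, so missing scores parse directly
        table[family_id] = {name: float(s) for name, s in zip(covariate_names, scores)}
    return table

def families_for(table, covariate):
    """Families with a non-missing score for this covariate."""
    return [fam for fam, scores in table.items() if not math.isnan(scores[covariate])]

# Hypothetical file contents
sample = """FamilyID age_of_onset.avg2all __asm__
F1 42.5 0.5
F2 NaN 0.5
F3 37.0 0.25
"""

table = parse_covariate_file(sample)
print(families_for(table, "age_of_onset.avg2all"))   # ['F1', 'F3']
```

Family F2 is dropped for this covariate only; it would still be used for any other covariate where its score is defined.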
Choice of Covariates
Ordered subset analysis is well suited to covariates which can discriminate between the families, but usually is not recommended when the number of ranks due to the covariate ordering is small or when a large number of families share the same family covariate score. For example, one usually would not define a covariate based on the number of females or males in the family or the number of affected family members.
Running the FLOSS program
The ordered subset analysis program is run using "floss.jar". Enter
java -jar floss.jar [options]
where [options] are combinations of the following flags and arguments:
-c [.cov file]
name of covariate file. Multiple covariate files can be analyzed in a single run by including a separate "-c" flag before each covariate filename. Required (see Creating FLOSS input files).
-merlin [.lod file]
name of MERLIN ".lod" file. Created by MERLIN when using the --perFamily option. Required.
-o [output prefix]
prefix of output files. Required.
-seed [integer]
seed for random number generator. Optional: defaults to "-seed 0".
-subsets [type]
Type of ordered subsets used. type must be "extreme" or "slice". Optional: defaults to "-subsets extreme".
-asm [interval_type]
Type of allele sharing parameter interval used. interval_type must be "unequal" or "equal". Optional: defaults to "-asm unequal".
-minperm [integer]
minimum number of permutations for the permutation test. Optional: defaults to "-minperm 100".
-maxperm [integer]
maximum number of permutations for the permutation test. Optional: defaults to "-maxperm 10000".
--npl
Compute a nonparametric linkage (NPL) Z-score statistic for each subset of families. Note that two hyphens "--" are required. Optional: linear allele sharing model LOD scores are used if "--npl" option is absent.
Type of ordered subsets: -subsets
-subsets extreme
Rank the families in order of increasing family covariate score and perform linkage on all subsets of families with the k smallest or k largest covariate scores. The "extreme" option is the recommended option, and the default option.
-subsets slice
Rank the families in order of increasing family covariate score and perform linkage on all subsets of families with consecutive covariate scores. For example, if there are N families, linkage analysis is performed using families i through j for 1 ≤ i ≤ j ≤ N. This option is discouraged since the increased number of subsets makes it more difficult to detect disease loci associated with unusually low or high covariate values and requires substantially more computing time.
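The difference between the two subset types is easiest to see by listing them. The sketch below enumerates both for families already ranked by covariate score; it ignores tied scores and any other rules FLOSS may apply, and the function names and family identifiers are illustrative.

```python
def extreme_subsets(ranked):
    """Subsets made of the k smallest or the k largest covariate scores."""
    n = len(ranked)
    lows = [tuple(ranked[:k]) for k in range(1, n + 1)]
    # k = n would repeat the full set already listed in lows
    highs = [tuple(ranked[n - k:]) for k in range(1, n)]
    return lows + highs

def slice_subsets(ranked):
    """All runs of consecutive families i..j with 1 <= i <= j <= N."""
    n = len(ranked)
    return [tuple(ranked[i:j + 1]) for i in range(n) for j in range(i, n)]

ranked = ["F3", "F1", "F4", "F2"]        # families in order of increasing score
print(len(extreme_subsets(ranked)))      # 7
print(len(slice_subsets(ranked)))        # 10
```

Under these assumptions "extreme" yields 2N - 1 subsets while "slice" yields N(N + 1)/2, so the gap grows quickly with the number of families.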
Type of allele sharing parameter interval: -asm
-asm unequal
The allele sharing parameter interval for an ordered subset is the intersection of the allele sharing parameter intervals for each family in the ordered subset. The "unequal" option is the default option.
-asm equal
The allele sharing parameter interval for an ordered subset is the intersection of the allele sharing parameter intervals for all families. The parameter interval will be the same for all ordered subsets.
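For scripted runs, the flags documented above can be assembled into an argument list. The helper below is a hypothetical convenience wrapper, not part of FLOSS; it uses only the flags and defaults listed in this section, and the file names in the example are made up.

```python
def floss_command(cov_files, lod_file, out_prefix, subsets="extreme",
                  asm="unequal", seed=0, minperm=100, maxperm=10000, npl=False):
    """Assemble the argument list for floss.jar from the documented flags."""
    if subsets not in ("extreme", "slice"):
        raise ValueError("subsets must be 'extreme' or 'slice'")
    if asm not in ("unequal", "equal"):
        raise ValueError("asm must be 'unequal' or 'equal'")
    cmd = ["java", "-jar", "floss.jar"]
    for cov in cov_files:
        cmd += ["-c", cov]               # one -c flag per covariate file
    cmd += ["-merlin", lod_file, "-o", out_prefix,
            "-seed", str(seed), "-subsets", subsets, "-asm", asm,
            "-minperm", str(minperm), "-maxperm", str(maxperm)]
    if npl:
        cmd.append("--npl")              # two hyphens are required
    return cmd

# Hypothetical file names
cmd = floss_command(["study.cov"], "study.lod", "run1", npl=True)
print(" ".join(cmd))
```

The resulting list could be passed to subprocess.run(cmd, check=True) on a machine where Java and floss.jar are available.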
FLOSS output files
Ordered subset analysis produces four output files. The output filenames have the format prefix.extension where the prefix is the filename prefix specified with the "-o" flag when running FLOSS and the extension is ".out", ".fam", ".plt", or ".log".
Summary file (.out)
The summary file (.out) records the analysis options and gives summary information for each covariate analyzed. The file reports the change in linkage score between the entire set of families, and the ordered subset with the highest linkage score, the maximum linkage score for this ordered subset, the optimal interval of family covariate scores, and the Monte Carlo p-value with a 95% confidence interval. The summary file is self-documented with documentation included at the end of the ".out" file.
Families file (.fam)
The ".fam" file gives the families ordered by the covariate values. The ".fam" file is arranged in sections corresponding to each covariate listed in the Covariate file. The sections are separated by a blank line. Each section contains three columns, and the first row in each section contains labels for the columns. The first column is labeled "Family" and contains the identifiers for families with defined covariate values in order of increasing covariate value. The second column is labeled "Subset" and contains "x" if the family in the first column is included in the ordered subset with the highest linkage score (when maximized over all ordered subsets and all loci). The third column is labeled with the covariate name and gives the covariate value for the families in the first column.
Plotting file (.plt)
The ".plt" file contains linkage scores for the complete set of families and for the optimal ordered subset of families for each covariate at the loci in the MERLIN .lod file. The plotting file has a simple format that is easily read and plotted using a spreadsheet (e.g., Excel) or a statistical software package (e.g., R).
The first column is labeled "Position" and contains the position of all loci used in the ordered subset analysis. All data in each row is computed at the position specified in the first column. The second column is labeled "Orig_Score" and lists the linkage scores at the position specified in the first column obtained using all families. After the first two columns, the columns correspond to the covariates in the ordered subset analysis and are labeled by the covariate names. Each covariate column gives the linkage score at the position specified in the first column for the ordered subset that maximizes the linkage score for that covariate.
The Download section of this documentation includes an R script for plotting the linkage curves.
Log file (.log)
The ".log" file gives details for all ordered subsets considered in the ordered subset analysis. The ".log" file is arranged in sections corresponding to each covariate listed in the Covariate file.
The first line in a section contains the name of the covariate. The next line begins "Ordered Families:", and the following line or lines list the identifiers for the families used in the ordered subset analysis. The families are listed in order of increasing covariate values. An = between two family identifiers means the two families have the same covariate score.
Following the ordered family identifiers are the results from each ordered subset considered in the ordered subset analysis. The results are presented in eight columns. Each line corresponds to a distinct ordered subset and has the following entries (in order from left to right):
First Fam gives the family identifier with the smallest covariate value in the ordered subset.
Last Fam gives the family identifier with the largest covariate value in the ordered subset.
Num Fams gives the number of families in the ordered subset.
Peak gives the locus where the highest linkage score was observed for the ordered subset of families specified by the first two entries (First Fam and Last Fam).
Subset Score gives the highest linkage score observed for the ordered subset of families specified by the first two entries (First Fam and Last Fam).
Orig Score gives the linkage score for the position specified in the fourth entry (Peak) for the set of all families.
Subset Params gives the parameter value that yielded the maximum linkage score for the ordered subset of families specified by the first two entries (First Fam and Last Fam). This entry is blank when using nonparametric linkage analysis z-scores with the --npl option.
Orig Params gives the parameter value that yielded the maximum linkage score for the position specified in the fourth entry (Peak) for the set of all families. This entry is blank when using nonparametric linkage analysis z-scores with the --npl option.
Download FLOSS
The executable files for COV and FLOSS can be run using a Java 1.4 (or later) interpreter with the "-jar" flag. See "Creating FLOSS input files" and "Running the FLOSS program" for details. The following files are available for viewing or download:
executable files
sample input and output files
R script for graphing ".plt" file linkage scores
source code
version notes
Frequently Asked Questions
The list of Frequently Asked Questions answers common questions about FLOSS and gives tips for using FLOSS.
References
Abecasis GR, Cherny SS, Cookson WO, Cardon LR (2002) MERLIN--rapid analysis of dense genetic maps using sparse gene flow trees. Nat Genet 30:97-101.
Browning BL (2006) FLOSS: Flexible ordered subsets analysis for linkage analysis of complex traits. Bioinformatics 22(4):512-513.
Hauser ER, Watanabe RM, Duren WL, Bass MP, Langefeld CD, Boehnke M (2004) Ordered subset analysis in genetic linkage mapping of complex traits. Genet Epi 27:53-63.
Kong A, Cox NJ (1997) Allele-sharing models: LOD scores and accurate linkage tests. Am J Hum Genet 61:1179-1188.
Kruglyak L, Daly MJ, Reeve-Daly MP, Lander ES (1996) Parametric and nonparametric linkage analysis: a unified multipoint approach. Am J Hum Genet 58:1347-1363.