Elsevier

The Lancet

Volume 366, Issue 9490, 17–23 September 2005, Pages 1036-1044
Series
Genetic linkage studies

https://doi.org/10.1016/S0140-6736(05)67382-5

Summary

Linkage analysis maps genetic loci using observations of related individuals. We introduce the methods commonly used to map loci that predispose to disease. These methods can be applied to both major gene disorders (parametric linkage) and complex diseases (model-free or non-parametric linkage). Evidence for linkage is most commonly expressed as a logarithm of the odds (LOD) score. We provide a framework for interpreting these scores and discuss the role of simulation in assessing statistical significance and estimating power. Genetic and phenotypic heterogeneity can also affect the success of a study, and several methods exist to address such problems.

Section snippets

Parametric linkage analysis

Parametric or model-based linkage analysis is the analysis of the cosegregation of genetic loci in pedigrees. Loci that are close enough together on the same chromosome segregate together more often than do loci on different chromosomes. Loci on different chromosomes segregate together purely by chance. Each genotype for one genetic marker or locus is made up of two alleles, one inherited from each parent. Specific alleles are in gametic phase when they are coinherited from the same parent—ie,

Ehlers-Danlos disease

As an example, figure 1 shows a pedigree segregating a form of the Ehlers-Danlos disease (EDS-VIII [MIM 130080]). We will use the reported linkage analysis of this pedigree [3] to illustrate parametric linkage analysis. EDS-VIII is a very rare autosomal dominant disorder. In this family, 72 individuals from five generations were clinically examined, and DNA samples were available for genetic analysis from 19 of them. Figure 1 shows only those parts of the pedigree segregating the disease (ie, many

LOD scores

Linkage is usually reported as a logarithm of the odds (LOD) score (panel 1). This score was first proposed by Morton in 1955 [5]. It is a function of the recombination fraction (θ) or of chromosomal position measured in cM, so the LOD score varies with the value of θ being considered. Large positive scores are evidence for linkage (or cosegregation), and negative scores are evidence against. To calculate a LOD score, a model for disease expression must be specified.
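The dependence of the LOD score on θ can be illustrated in the simplest possible setting. The sketch below assumes phase-known, fully informative meioses in which recombinants can be counted directly, so that the likelihood is binomial in θ; real pedigrees (including the EDS-VIII example) need pedigree likelihood software. The function names and the example counts are hypothetical and are not taken from the article.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """LOD at recombination fraction theta for phase-known meioses.

    LOD(theta) = log10[ L(theta) / L(0.5) ], with a binomial likelihood
    L(theta) = theta^r * (1 - theta)^(n - r).
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie in [0, meioses]")

    non_recombinants = meioses - recombinants
    if theta == 0.0:
        # A single recombinant excludes complete linkage.
        if recombinants > 0:
            return -math.inf
        log_l_theta = 0.0
    else:
        log_l_theta = (recombinants * math.log10(theta)
                       + non_recombinants * math.log10(1.0 - theta))
    log_l_null = meioses * math.log10(0.5)  # no linkage: theta = 0.5
    return log_l_theta - log_l_null


def max_lod(recombinants: int, meioses: int, steps: int = 50):
    """Grid search over theta in [0, 0.5]; returns (theta_hat, LOD)."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    return max(((t, lod_score(recombinants, meioses, t)) for t in grid),
               key=lambda pair: pair[1])


# Hypothetical data: 10 informative meioses.
print(round(lod_score(0, 10, 0.0), 2))   # no recombinants, theta = 0 -> 3.01
print(max_lod(2, 10))                    # maximum near theta = 0.2
```

With no recombinants the score peaks at θ = 0, whereas two recombinants in ten meioses move the maximum to about θ = 0.2 and lower it markedly.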

Specifying the genetic model

For any parametric linkage analysis, the genetic model for the disease of interest must be specified. For a simple mendelian disease, this model amounts to mode of inheritance and frequency of disease allele. For some diseases, carrying the risk genotype does not always result in the individual being affected (incomplete penetrance). In more complex models, only a proportion of disease cases are due to a specific major gene, resulting in some risk of disease for individuals with any disease
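The components of such a model can be written down explicitly. The following is a minimal sketch, not the article's specification: the class name GeneticModel, the example parameter values, and the use of Hardy-Weinberg proportions for genotype frequencies are all assumptions made for illustration. A non-zero penetrance for non-carriers is one way to represent disease risk that is not due to the major gene.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GeneticModel:
    """Hypothetical container for a single-locus, two-allele disease model."""
    disease_allele_freq: float               # frequency q of the disease allele
    penetrance: Tuple[float, float, float]   # P(affected | 0, 1, 2 disease alleles)

    def genotype_frequencies(self) -> Tuple[float, float, float]:
        """Frequencies of 0, 1, 2 disease alleles (assumes Hardy-Weinberg)."""
        q = self.disease_allele_freq
        p = 1.0 - q
        return (p * p, 2.0 * p * q, q * q)

    def prevalence(self) -> float:
        """Population disease frequency implied by the model."""
        return sum(freq * pen for freq, pen
                   in zip(self.genotype_frequencies(), self.penetrance))


# Rare dominant disease with incomplete penetrance (illustrative values).
dominant = GeneticModel(disease_allele_freq=0.001, penetrance=(0.0, 0.8, 0.8))

# Same gene, plus a small risk in non-carriers (sporadic cases).
with_sporadic = GeneticModel(disease_allele_freq=0.001,
                             penetrance=(0.01, 0.8, 0.8))

print(dominant.prevalence())
print(with_sporadic.prevalence())
```

Comparing the implied prevalence with the observed population frequency of the disease is a simple consistency check on the chosen parameters.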

Genetic heterogeneity

The fact that the pattern of disease in families is consistent with a strong major gene component does not necessarily imply that only one gene is involved. There are many examples of diseases caused by inherited mutations in distinct genes. Some mutations give rise to the same disease but with a different mode of inheritance; for example, Charcot-Marie-Tooth disease has autosomal recessive, dominant, and X-linked forms, and mutations in up to ten genes are responsible for the different forms [8].

Heterogeneity LOD scores

Locus heterogeneity such as that with Charcot-Marie-Tooth disease can seriously affect the power of parametric linkage analysis. The most common solution is to assume that mutations in the disease genes will be so rare that each family will be linked to only one such gene. The genome scan is then done maximising a heterogeneity LOD score (panel 1). At each genomic position, the heterogeneity LOD score is maximised with respect to another parameter, α: the proportion of families linked to this
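As an illustration of maximising over α, the sketch below uses the widely used admixture form of the heterogeneity LOD score, in which each family contributes log10[α·10^LOD + (1 − α)] at a given position. The article's own definition is in panel 1, which is not reproduced here, so treating this form as the intended one is an assumption. The per-family LOD scores are invented for illustration.

```python
import math
from typing import Sequence, Tuple


def hlod(family_lods: Sequence[float], alpha: float) -> float:
    """Heterogeneity LOD at one position under the admixture model.

    family_lods: LOD score of each family at this position.
    alpha: assumed proportion of families linked to this locus.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return sum(math.log10(alpha * 10.0 ** lod + (1.0 - alpha))
               for lod in family_lods)


def max_hlod(family_lods: Sequence[float],
             steps: int = 100) -> Tuple[float, float]:
    """Grid search over alpha in [0, 1]; returns (alpha_hat, HLOD)."""
    grid = [i / steps for i in range(steps + 1)]
    return max(((a, hlod(family_lods, a)) for a in grid),
               key=lambda pair: pair[1])


# Hypothetical scores: two families support linkage, two argue against it.
lods = [2.0, 1.5, -3.0, -2.5]

print(sum(lods))        # homogeneity LOD (alpha = 1) is negative
print(max_hlod(lods))   # HLOD is positive at an intermediate alpha
```

Because the score at α = 0 is always zero, the maximised heterogeneity LOD cannot be negative, which is one reason its significance thresholds differ from those of an ordinary LOD score.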

Model-free (non-parametric) linkage analysis

For multifactorial diseases, where several genes (and environmental factors) might contribute to disease risk, there is no clear mode of inheritance. Methods to investigate linkage have therefore been developed that do not require specification of a disease model. Such methods are referred to as non-parametric, or model-free. The rationale is that, between affected relatives, excess sharing of haplotypes that are identical by descent (IBD) in the region of a disease-causing gene would be

Sibling pairs

The simplest approach is to study sibling pairs, both of whom are affected. At any locus, according to the null hypothesis of no linkage, the number of IBD alleles shared by a pair of siblings is none with probability 0·25, one with probability 0·5, or two with probability 0·25 (panel 2). If IBD sharing in the families is known, the observed proportions of pairs sharing no, one, and two alleles at a candidate locus can be compared with these expectations. Linkage would be suggested if the pairs
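The comparison of observed and expected IBD sharing can be sketched as a chi-square goodness-of-fit test against the null proportions 0·25, 0·5, and 0·25. This is only one of several possible tests and is not necessarily the one used in the article; it assumes IBD status is known without error for every pair, and it flags any departure from expectation rather than excess sharing specifically. The function name and the counts are hypothetical.

```python
import math
from typing import Sequence, Tuple

# Null probabilities that a sibling pair shares 0, 1, or 2 alleles IBD.
NULL_IBD_PROBS = (0.25, 0.5, 0.25)


def ibd_goodness_of_fit(counts: Sequence[int]) -> Tuple[float, float]:
    """Chi-square test of observed IBD counts against the null.

    counts: numbers of affected sibling pairs sharing 0, 1, and 2 alleles.
    Returns (chi-square statistic, p-value) on 2 degrees of freedom.
    """
    if len(counts) != 3:
        raise ValueError("counts must have three entries (IBD 0, 1, 2)")
    n = sum(counts)
    if n == 0:
        raise ValueError("at least one sibling pair is required")

    expected = [n * p for p in NULL_IBD_PROBS]
    statistic = sum((obs - exp) ** 2 / exp
                    for obs, exp in zip(counts, expected))
    # For 2 degrees of freedom the chi-square upper tail is exp(-x / 2).
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Hypothetical sample of 100 affected sibling pairs.
statistic, p_value = ibd_goodness_of_fit((15, 50, 35))
print(statistic, round(p_value, 4))   # 8.0 0.0183
```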

Other groups of relatives

Pairwise comparisons between relatives can easily be modified for types of relative pair other than siblings. However, in studies that set out to examine affected sibling pairs, additional affected siblings are often recruited. Various methods have been proposed to extend the pairwise approach to sibships larger than two. Selecting one pair at random or using only independent pairs means discarding information, so using all possible pairs is preferred. Should larger sibships be down-weighted to

Issues of power and interpretation

A fundamental issue in understanding the results of a linkage analysis is the interpretation of statistical significance. Whenever statistical tests are done, a balance must be struck between making claims, many of which fail to be substantiated, and adopting criteria so stringent that true findings are missed. For the parametric analysis of single gene disorders, it was suggested early on that a LOD score threshold of 3 indicated a significant result at the genome-wide level. This approach
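To relate a LOD score to a more familiar p value, a common large-sample approximation treats 2·ln(10)·LOD as a chi-square statistic on one degree of freedom and halves the tail probability because the test is one-sided. The sketch below gives the resulting pointwise p value only; it is an asymptotic approximation for a single test at a single position, and it does not account for the multiple testing involved in a genome scan. The function name is hypothetical.

```python
import math


def pointwise_p_value(lod: float) -> float:
    """Approximate one-sided pointwise p value for a positive LOD score.

    Uses chi2 = 2 * ln(10) * LOD on 1 degree of freedom and
    P(chi2_1 > x) = erfc(sqrt(x / 2)), halved for the one-sided test.
    """
    if lod <= 0.0:
        raise ValueError("approximation is intended for positive LOD scores")
    chi_square = 2.0 * math.log(10.0) * lod
    return 0.5 * math.erfc(math.sqrt(chi_square / 2.0))


for lod in (1.0, 2.0, 3.0):
    print(lod, pointwise_p_value(lod))
```

A LOD score of 3 corresponds to a pointwise p value of roughly 0·0001, far smaller than the conventional 0·05, which reflects the allowance made for testing across the whole genome.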

Choice of phenotype

Some traits or diseases have a clear phenotype definition. For simple mendelian traits, it is straightforward to identify affected and unaffected individuals, and even in a disease such as cancer, once symptoms are experienced, the diagnosis is based on pathological findings. However, other illnesses such as psychiatric disorders are more problematic because the diagnosis often depends upon several distinct symptoms, and there is often disagreement as to what constitutes a definitive diagnosis [43].

Linkage analysis: what next?

A linkage analysis of the whole genome can identify regions that show evidence of containing a disease gene. In the study of mendelian traits, crossover events often narrow down the region sufficiently to define a small interval of interest. Linkage analysis of complex diseases can identify only large regions (typically tens of cM). Location estimates indicated by the linkage peak are highly variable, and increasing the density of the marker map improves the resolution only modestly [48]. Although

References (49)

  • JA Douglas et al.

    A multipoint method for detecting genotyping errors and mutations in sibling-pair linkage data

    Am J Hum Genet

    (2000)
  • DF Levinson et al.

    Genome scan meta-analysis of schizophrenia and bipolar disorder, part I: methods and power analysis

    Am J Hum Genet

    (2003)
  • SB Roberts et al.

    Replication of linkage studies of complex traits: an examination of variation in location estimates

    Am J Hum Genet

    (1999)
  • Cordell HJ, Clayton DG. Genetic association studies. Lancet (in...
  • JBS Haldane et al.

    A new estimate of the linkage between the genes for colour-blindness and haemophilia in man

    Ann Eugen

    (1947)
  • NE Morton

    Sequential tests for the detection of linkage

    Am J Hum Genet

    (1955)
  • J Chotai

    On the LOD score method in linkage analysis

    Ann Hum Genet

    (1984)
  • ES Lander et al.

    Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results

    Nat Genet

    (1995)
  • P Berger et al.

    Molecular cell biology of Charcot-Marie-Tooth disease

    Neurogenetics

    (2002)
  • JM Hall et al.

    Linkage of early-onset familial breast cancer to chromosome 17q21

    Science

    (1990)
  • R Wooster et al.

    Identification of the breast cancer susceptibility gene BRCA2

    Nature

    (1995)
  • WC Blackwelder et al.

    A comparison of sib-pair linkage tests for disease susceptibility loci

    Genet Epidemiol

    (1985)
  • JK Haseman et al.

    The investigation of linkage between a quantitative trait and a marker locus

    Behav Genet

    (1972)
  • N Risch

    Genetics of IDDM: evidence for complex inheritance with HLA

    Genet Epidemiol

    (1989)