The background of the MBGP.
Gene expression analyses quantify global temporal and spatial patterns in gene expression.
Understanding these patterns, how they relate to those of other genes,
and how they respond to, for example, behavioural alterations or the onset of disease is of enormous interest, hence the attention given
to "global" gene expression technologies such as cDNA microarrays
and SAGE. With approximately 90-96% of the mouse genome available in the
current mouse assembly, it is possible to exploit genome-scale approaches
to the study of the brain. The
Melbourne Brain Genome Project (MBGP)
uses murine models for quantifying gene expression in the brain, enabling
correlation of expression with different developmental, pathological,
or functional states. SAGE is the main gene expression technology
used by the MBGP.
MBGP: neural development, Down syndrome, neurodegeneration.
The MBGP has five main aims:
- to amass catalogues of genes representing developmental stages, functional regions and stem-cell derived neurospheres of normal and mutant mouse brain
- to reveal gene expression profiles characteristic of Down syndrome mouse models
- to build and compare databases of expressed genes in mouse models of neurodegenerative diseases
- to systematically compare SAGE and microarray data and utilize SAGE data to validate microarray analyses
- to systematically describe the molecular anatomy of previously undiscovered genes during brain development and to test the function of a subset of these genes using a variety of techniques including gene knockout technology
Why choose to study the brain?
Our brains make us what we are, both as individuals and as a species. The
brain presents unique challenges to the study of how gene expression is
manifested in specialized cell form and function. The brain possesses
a complex array of cell types falling into two broad groups (neurons and
glia). Importantly, groups of neurons that appear morphologically similar
can have very different functional properties and molecular phenotypes, providing an opportunity to study the problem of how gene expression regulates
the development, structure and function of individual cells. In addition, the brain displays unparalleled sophistication of function, being
responsible for sensory processing, learning and memory, cognition and
reasoning, emotion and coordination of motor and autonomic activity. Mounting
evidence suggests that these higher-order functions are dependent on gene
expression alterations.
An understanding of brain development
is unlikely to advance without consideration of the multiple gene sets
that are activated during complex morphogenetic events including long-distance
migration from germinal zones into specific layers and regions; interactions
with radial glial scaffold, other migrating neurons and the surrounding
extracellular matrix; reciprocal innervation with specific targets, and
juggling of the above with progressive lineage restriction and cellular
differentiation programs. The MBGP aims to provide a molecular inventory
of genes that are expressed in developing and adult brain tissues in defined
space and time. Expression profiles derived from different sources of
neural tissue will reveal molecular asymmetry, enabling detection of new
genes present in one domain but not in another and uncovering different
complexities of gene expression in different brain territories. SAGE libraries,
containing information on sequence abundance and complexity, can be further
analysed for higher order correlations using appropriate algorithms. The MBGP plans to address the molecular bases of brain regionalization by comparing libraries from various time points including embryonic (ganglionic eminences), E12, and adult brain regions.
Comparison of regional gene expression will provide a foundation for higher
resolution analysis of particular brain nuclei or cell types and pave
the way for system level interpretations. SAGE technology can also be
applied to the study of the transcriptional networks activated in response
to particular developmental signals. The MBGP plans to conduct such studies with
two mutant mouse models: p75 and disabled-1. The p75 neurotrophin receptor
mediates the death of neural cells, including stem cells, during development
and in the damaged and diseased nervous system (Coulson 1999, Coulson 2000). Preliminary
studies indicate that p75 is likely to be involved in maintaining the
crucial balance between neural precursor proliferation and apoptosis; however,
the components that make up the downstream signaling pathway are unknown.
The consequences of inactivating the p75 gene on the molecular profile
of stem cell neurospheres (Rietze 2001) will be assessed. Disabled-1 (Dab1)
is part of the Reelin signaling pathway, important for directing neuronal
migration and positioning in the developing cortex (Rice 1999). Mutations
in Dab1, or in other members of the Reelin pathway, result in
inversion of cortical layers; however, the transcriptional consequences
of disrupting the Reelin signaling pathway have not been analyzed. Recent
data from the Tan laboratory (Hammond 2001) suggest that Dab1 activation leads to increased
neuron-neuron adhesion and the MBGP aims to test this hypothesis by comparing
the cortical transcriptomes of Dab1 -/- mutants with those of wild-type
mice. Similar SAGE strategies have been spectacularly successful, revealing
previously unknown components of tumor suppressor signaling pathways (Polyak
1997, He 1998).
How valid is the mouse as an experimental model for human?
Differences in gene expression between developmentally regulated mouse and human orthologues have been demonstrated (Ross 2000); however, preliminary comparisons of the mouse and human brain transcriptomes show that there is good correlation for highly expressed genes in both transcript identity and abundance (Fougerousse 2000). With the completion of both the human and mouse genomes, it
will be possible to capitalise enormously on SAGE data, both pre-existing
and that to be generated. Detailed comparisons will be possible and the
absolutely quantitative nature of SAGE data will be important in determining
the validity of mouse global gene expression studies for extrapolation
to human. A longer term goal is that the genes identified may be possible
targets for therapeutic intervention. Translation and comparison of our
mouse studies to human neurological diseases will bring our analyses closer
to direct application.
How does Serial Analysis of Gene Expression (SAGE) work?
SAGE rests on two principles (Velculescu 1995). First, a short tag of 10 base pairs acts like a bar-code, containing sufficient information to distinguish between 1,048,576 (4^10)
transcripts provided the tag is taken from a defined position in the mRNA.
Second, by joining the tags together for sequence analysis, SAGE provides
for rapid and automatable data collection and analysis.
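The two principles can be illustrated with a minimal Python sketch. It assumes that the "defined position" is the 3'-most recognition site (CATG) of the anchoring enzyme NlaIII and that the tag is the 10 bases immediately downstream of that site. The function names and the example sequences are hypothetical and are not part of any MBGP software.

```python
from collections import Counter

ANCHOR_SITE = "CATG"  # recognition site of NlaIII, the anchoring enzyme
TAG_LENGTH = 10       # tag length in base pairs


def tag_space(tag_length=TAG_LENGTH):
    """Number of distinct tags of a given length over the alphabet A, C, G, T."""
    return 4 ** tag_length


def extract_tag(cdna, tag_length=TAG_LENGTH):
    """Return the tag directly 3' of the 3'-most anchor site, or None.

    None is returned when the sequence has no anchor site or when fewer
    than tag_length bases follow the last site.
    """
    seq = cdna.upper()
    site = seq.rfind(ANCHOR_SITE)  # rfind locates the 3'-most occurrence
    if site == -1:
        return None
    start = site + len(ANCHOR_SITE)
    tag = seq[start:start + tag_length]
    return tag if len(tag) == tag_length else None


def count_tags(cdnas):
    """Tally tags over a collection of transcripts (a toy 'SAGE library')."""
    tags = (extract_tag(seq) for seq in cdnas)
    return Counter(tag for tag in tags if tag is not None)


if __name__ == "__main__":
    # Hypothetical transcript sequences, for illustration only.
    library = [
        "AAACATGTTTCATGACGTACGTACGG",
        "GGGCATGACGTACGTACTT",
        "TTTTTTTT",  # no anchor site, so it yields no tag
    ]
    print(tag_space())
    print(count_tags(library))
```

Transcripts that share the same 10 bases after their last CATG site are indistinguishable in this scheme, which is why the tag length and the defined position both matter.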
What are the advantages of SAGE?
As SAGE relies on sequencing to identify genes it may be considered as
a variant of expressed sequence tag (EST) analysis. While large scale
EST sequencing is "somewhat" quantitative and also an effective
approach to gene discovery, it is laborious due to the length of the clones
and the high level of redundancy (more than 2 million human ESTs have
been found to collapse by UniGene clustering to only approximately 86,000
unique genes; dbEST release, October, 2000). In comparison, by reducing
the sequencing effort to the minimum sequence length required for unambiguous
transcript identification, SAGE results in an approximately 40 fold increase
in efficiency over EST sequencing. Techniques such as cDNA (Schena 1995) or oligonucleotide
arrays (Lockhart 1996) have been used to compare the expression of thousands of genes
in a variety of tissues but are limited to analysing only previously identified
transcripts. New methods and ever-increasing numbers of molecules to array
make microarrays very much an evolving technology and therefore it can
be difficult, if not impossible, to compare between microarray experiments.
Microarrays are also a "closed system" in that it is necessary
to know what is on the microarray. It is possible that even the developing
microarray technologies will not have the sensitivity to detect and quantitate
the preponderance of low abundance transcripts given hybridization kinetic
limitations.
The power of SAGE:
- SAGE does not depend on the prior availability of transcript information (Velculescu 1995).
- SAGE is an absolutely quantitative methodology.
- SAGE can be considered to be "global".
- There is a high probability that transcripts detected by SAGE that cannot be assigned an identity (e.g. rare transcripts) will correspond to previously unknown genes. [A SAGE tag sequence has sufficient information to generate longer cDNA fragments for gene identification (Polyak 1997)].
- SAGE data is in a standard, digital format allowing quick and easy comparison of data sets from within lab experiments and from external groups.
- The data files do not have large storage requirements.
- Data can be stored and reanalysed at any stage in the future as new information is acquired.
- A number of SAGE data sets are freely available and a system is in place to enable all future SAGE data to be deposited in the public domain (NCBI SAGEmap).
- SAGE is high-throughput, something that EST sequencing lacks. The MBGP is aided in this respect by the close proximity of the Australian Genome Research Facility (AGRF).
- SAGE provides the ability to uncover higher order organization patterns of chromosomal arrangement (e.g. RIDGEs, Regions of IncreaseD Gene Expression, in the Human Transcriptome Map; Caron 2000).
Current issues with SAGE, for example the identification of unknown tags, will be reduced as genome sequencing nears completion. Many genes not currently in the databases will be predicted from accumulating genomic sequence and corresponding cDNAs will eventually be arrayed.
Which groups comprise the MBGP?
The MBGP is a collaboration between the laboratory of Dr. Hamish Scott
at The Walter and Eliza Hall Institute
of Medical Research, the Brain Development team under Associate Professor
Seong-Seng Tan at the Howard Florey Institute of Experimental Physiology and Medicine,
and Professor Colin Masters' group at the Department of Pathology
at The University
of Melbourne.
Dr Scott has contributed extensively to the genetic, physical,
and transcription maps of human chromosome 21 (HC21), including participation
in the cloning and functional characterization of 24 HC21 genes, leading
to the identification of genes for two HC21 monogenic disorders.
The identification of the genes on HC21 is essential in order to fully
understand the molecular pathogenesis of Down Syndrome.
The Tan group has demonstrated expertise
in the SAGE technique, having published the results of a small-scale study
identifying molecules potentially involved in the growth and migration
of rat C6 glioma cells (Gunnersen 2000). In line
with the major research focus of the Tan laboratory, to understand the
molecular processes underlying patterning and regionalization of the developing
mouse brain (Tan 1998), SAGE analyses of genes expressed in the developing neocortex
are currently underway. A total of more than 40,000 SAGE tags have been
sequenced from two libraries prepared from neocortex at two different
time points (embryonic day 15 [E15] and post-natal day 1 [P1]). Comparison
of these SAGE libraries has revealed 153 differentially expressed genes
(p-chance <0.01), one third of which represent unknown genes. Of the known
genes, transcription factors of the bHLH, zinc finger, forkhead box and
Sox gene families featured prominently, several of which (e.g. neuroD2,
Id2 and brain factor 1) have recognized roles in neurogenesis and patterning
of the cortex. The majority of differentially-expressed genes tested have
been confirmed by Northern blot and the detailed expression patterns of
the complete set of differentially expressed genes are being characterized
by in situ hybridization on developing brain tissue.
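As an illustration of how tag counts from two libraries can be compared, the sketch below applies a two-sided Fisher's exact test to a single tag. This is a stand-in chosen for simplicity: it is not the p-chance statistic quoted above, whose calculation is not described here. The function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(count_a, total_a, count_b, total_b):
    """Two-sided Fisher's exact p-value for one tag seen in two libraries.

    count_a, count_b: times the tag was observed in library A and library B.
    total_a, total_b: total number of tags sequenced from each library.
    """
    pooled_count = count_a + count_b
    pooled_total = total_a + total_b
    denominator = comb(pooled_total, total_a)

    def probability(x):
        # Hypergeometric probability that x of the pooled observations of
        # this tag fall in library A if the tag is equally abundant in both.
        return (comb(pooled_count, x)
                * comb(pooled_total - pooled_count, total_a - x)
                / denominator)

    lowest = max(0, total_a - (pooled_total - pooled_count))
    highest = min(pooled_count, total_a)
    observed = probability(count_a)
    # Sum every outcome that is no more likely than the observed one; the
    # small tolerance keeps ties from being lost to floating-point error.
    p_value = sum(p for p in map(probability, range(lowest, highest + 1))
                  if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)


if __name__ == "__main__":
    # Hypothetical counts for one tag in two libraries of 20,000 tags each.
    print(fisher_exact_two_sided(30, 20000, 8, 20000))
```

When many tags are tested at once, as in a whole-library comparison, some form of multiple-testing control would normally be applied to the resulting p-values.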
The Masters laboratory has worked on the
mechanisms of neurodegeneration for more than 20 years. The focus has
been on Alzheimer's and Creutzfeldt-Jakob diseases, with analyses of the
amyloidogenic proteins which accumulate in the extracellular space (as
plaques). More recently, the laboratory has extended its studies into
the proteins which aggregate in the other major neurodegenerative conditions.
Facilities within the laboratory are wide-ranging, from basic molecular
genetic techniques through to human brain banking and clinical trials.
Thus, results from the proposed gene expression profiles of transgenic
models of these disorders can be rapidly translated into experiments on
human tissues and systems.
Syndromes and diseases under study
Mouse models of neurodegenerative disorders
Over the past five years, the molecular bases of the major neurodegenerative
diseases have become more clearly defined, and for each disease, the corresponding
mouse model has been created based on gene targeting or transgenic technologies.
The major disease categories encompass the commonly recognized sub-types
of degenerations associated with the aging human brain. Yet despite the
discovery of the molecular lesion in each category, no satisfactory therapeutics
have emerged. A strategy aimed at identifying key genes or pathways in
these diseases is to examine the downstream effects of the primary molecular
lesion at a very early stage of preclinical evolution using SAGE, a technique
well-suited to this approach. A primary outcome of this approach
would be to define the common elements of down-stream effects across all
neurodegenerative phenotypes, thus indicating a possible therapeutic
strategy applicable across the whole spectrum of disorders.
The MBGP is working with a number of mouse
models of neurodegenerative disorders. The main models are:
Alzheimer's disease: several transgenic lines now exist which reproducibly
form abundant extracellular Aβ amyloid deposits although none of the models
fully replicate the neurofibrillary tangles and neuritic changes which
are also an integral part of the human condition. A double transgenic
line (APP x PS), in which over-expression of the substrate (APP) combined
with abnormal processing (PS) results in earlier phenotypic onset, is
established in our colony. By 16 weeks of age, doubly transgenic mice
display both increased brain Aβ and behavioural abnormalities (Holcomb 1998). Cortical tissue will
be harvested from these mice at 8 weeks for SAGE analysis.
Creutzfeldt-Jakob Disease: the various forms of experimental transmissible
spongiform encephalopathy (TSE) represent very authentic models for this
infectious neurodegenerative and neurogenetic disease (Prusiner 1998). The Masters lab uses
routine inoculation of a mouse-adapted strain of a Japanese Gerstmann-Straussler-Scheinker
syndrome isolate (Fujisaki strain) to generate a reproducible incubation
period of 120 days. Mouse cortices will be taken for SAGE 60 days after
inoculation, when the abnormal isoforms of PrP are first detectable by
Western blot.
Parkinson's disease: transgenic mice over-expressing either wild
type or mutant forms of human α-synuclein (van der Putten 2000, Masliah 2000)
develop aggregates of α-SN in association with degenerative changes in
the dopaminergic systems. These mice provide valuable models of idiopathic
Parkinson's disease and multiple system atrophy, although they do not display
a full phenotype. The downstream consequences of α-SN aggregation are
poorly defined, and it will be of immense interest to determine whether they
mirror effects seen in the other types of neurodegeneration. SAGE libraries
will be constructed from the striatum of 6-month-old mice.
Motor neuron disease (amyotrophic lateral sclerosis): motor neuron
disease is far more restricted in its bulbar/spinal cord phenotype. The
mouse models which over-express mutant SOD-1 closely match the human disease
in this respect (Gurney 1994, Dal Canto 1997). Aggregation of SOD-1
with ensuing abnormal redox chemical reactions is expected to generate
a defined sequence of downstream events to be analysed by SAGE. To encompass
the restricted topographic distribution of the lesions in this model,
we will collect spinal cord and lower brain stem from 4-month-old mice.
Huntington's disease and selected forms of spinocerebellar ataxias: The
discovery of polyglutamine aggregates in this class of neurogenetic disorder
immediately allowed a synthesis of ideas underlying the neurodegeneration
in these conditions. Although initially thought to be primarily nuclear
in their distribution, polyglutamine aggregates are also formed in the
neuronal cytoplasm. Mice transgenic for exon I of the huntingtin gene
containing an expanded CAG sequence display inclusions which may affect
the expression of genes important for neuronal function and appear to
result in non-apoptotic neurodegeneration (Turmaine 2000). Striatal tissue for SAGE from appropriately aged animals (depending
on transgenic mouse line used) will be available through collaborators.
Fronto-temporal dementias: This group of illnesses is still being categorised
according to the patterns of Tau protein aggregation. In the first instance,
however, the model in which four-repeat Tau is overexpressed (Goedert 1999)
provides an approximation for the Tau aggregates seen as Pick
bodies and straight filaments in Pick's disease and progressive supranuclear
palsy, respectively. Mice of this line are also available for SAGE studies
through collaborators.
Trisomy 21 or Down Syndrome (DS) (OMIM 190685) occurs at a frequency of approximately 1/700 live
births. In contrast to the variable phenotypic penetrance in many organs,
mental retardation (MR) and early onset Alzheimer disease (AD) are invariably
present in brains of all DS patients such that DS is the most common genetic
cause of mental retardation (Epstein). Changes in the neuropathology, neurochemistry,
neurophysiology, and neuropharmacology of DS patient brains indicate that
there is abnormal development and maintenance of CNS structure and function.
DS is thus a model for mental retardation with abnormal development and
neurodegeneration or premature aging. Importantly, two of the genes known
to be involved in neurodegenerative disorders, APP and SOD1, map to human
chromosome 21 (HC21). To understand fully the molecular pathogenesis of
DS, it is necessary to identify all HC21 genes. The DS SAGE studies have
provided the first global analysis of gene expression differences in aneuploid
versus normal cells, implicating certain genes as "directly"
involved in several DS phenotypes (Chrast 2000) and providing pointers to diagnostic or prognostic markers and
possible targets for therapeutic intervention.
A number of viable mouse models of DS trisomic for different parts of
mouse chromosome 16 (MMU16 syntenic to HC21) have been produced. While
all show learning and behavioural abnormalities, these phenotypes are
the most pronounced in the Ts65Dn mice which are trisomic for the largest
region of MMU16 (App to Mx1, Reeves 1995) (Hernandez 1999, Baxter 2000). Ts65Dn
is better characterized phenotypically than the human syndrome due to
availability of tissues for testing. An increasingly long list of Ts65Dn
phenotypes can be directly related to the human syndrome including skeletal
defects, developmental delay, learning and behavioural deficits, and age-related
degeneration of neurons (Reeves 1995). Imaging and neuropathological studies have
shown few differences between the brains of DS and normal neonates and
similar observations have been made for the Ts65Dn mouse. However, both
DS and Ts65Dn brains show a reduction in cell number and volume in the
hippocampal dentate gyrus (Insausti 1998) and the cerebellar internal granule and molecular
layers (Baxter 2000) and a reduction in excitatory (asymmetric) synapses in the temporal
cortex at more advanced ages (Kurt 2000, Wisniewski 1990). These neuropathological changes are likely
to be associated with the impaired memory, sensory and motor function
seen in both DS and Ts65Dn mice (Escorihuela 1998, Martinez-Cue, Holtzman 1996). Additionally, age-related degeneration of
septohippocampal cholinergic neurons and astrocytic hypertrophy, markers
of Alzheimer’s disease pathology, are seen in elderly DS individuals
(Holtzman 1996). Thus
Ts65Dn mice may be used to study both developmental and degenerative abnormalities
in the DS brain.
Systematic comparison of SAGE and microarrays and generation of a control for microarrays
The MBGP plans to use the complementary
gene expression technique of microarrays in combination with SAGE, which
will enhance the utility of both techniques. A systematic comparison of
SAGE results to those generated using AGRF microarrays will be performed
using the same samples used in the SAGE analyses. Recent comparisons of
SAGE with Affymetrix chips (Ishii 2000), filter arrays (Nacht 1999; Lyle, Chrast, Antonarakis
and Scott, unpublished data) and microarrays (Chrast, Antonarakis and
Scott, unpublished; Blackshaw 2000)
have shown that the techniques have similar sensitivity, although microarrays
tended to underestimate the fold difference in expression determined by
SAGE or Northern blot (Blackshaw 2000). The MBGP aims to make microarray expression data absolutely
quantitative and comparable between different experiments by employing
SAGE data. Reliable and precise microarray gene expression profiling relies
on comparison of hybridization efficiency between an experimental and
a reference RNA sample. Differences in hybridization intensity between
these RNA targets reflect relative differences in gene expression levels.
Using the same reference RNA in different microarray experiments provides
a common denominator for accurate and reproducible comparison of gene
expression data (Eisen 1999, Lash 2000). For comparison of multiple
experimental RNA samples (hybridization experiments), a common reference
RNA sample provides an essential internal control and allows comparisons
to be made among large numbers of samples (experiments) and can thus dramatically
increase the power of a microarray experiment. The MBGP will generate
large quantities of whole C57BL/6J adult mouse brain RNA as a reference
RNA sample for microarray gene expression profiling from pooled samples
from routinely sacrificed mice. The mouse reference RNA will be made available
as a reference sample through the AGRF allowing inter-laboratory comparisons
of microarray data in Australia.
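The role of the common reference can be shown with a short Python sketch. It assumes that each experiment reports one intensity for the experimental sample and one for the shared reference RNA at the same array element, and that expression is summarised as a log2 ratio. The function names and the intensities are hypothetical.

```python
from math import log2


def log_ratio_vs_reference(sample_intensity, reference_intensity):
    """log2 of sample over reference intensity within a single experiment."""
    return log2(sample_intensity / reference_intensity)


def compare_via_reference(log_ratio_a, log_ratio_b):
    """Relative expression of sample A versus sample B across experiments.

    Because both samples were measured against the same reference RNA,
    log2(A/B) = log2(A/ref) - log2(B/ref) and the reference cancels out.
    """
    return log_ratio_a - log_ratio_b


if __name__ == "__main__":
    # Two hypothetical experiments whose overall signal differs two-fold;
    # the reference measured in each experiment absorbs that difference.
    experiment_1 = log_ratio_vs_reference(sample_intensity=800.0,
                                          reference_intensity=200.0)
    experiment_2 = log_ratio_vs_reference(sample_intensity=50.0,
                                          reference_intensity=100.0)
    print(compare_via_reference(experiment_1, experiment_2))
```

The same subtraction extends to any number of experiments, provided every one of them uses the same reference RNA.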
Velculescu et al. (14)
showed that the number of new unique transcripts identified approached
zero at the level of 600,000 tags. We will generate a "saturated"
transcriptome of 600,000 tags from the whole C57BL/6J adult mouse brain
reference RNA. Microarray elements representing a subset of genes representative
of different expression levels in the SAGE transcriptome (600,000 tags)
will then act as additional microarray controls. Use of the reference
RNA sample should allow conversion of microarray data to absolute expression
levels as well as thorough monitoring of sensitivity. With the advent
of multiple colour microarray scanners and additional fluorophores, it
may become possible to include three (or more) RNA samples in a microarray
experiment, the reference RNA and the two (or more) samples for which
an immediate comparison is desired. Additionally, the saturated whole mouse
brain transcriptome will allow ready identification of transcripts that
are specific to the more defined areas of the brain to be studied.
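One simple way such a conversion to absolute expression levels might work is sketched below in Python. It assumes that the microarray signal is linear in transcript abundance and that the abundance of a gene in the reference RNA is given by its tag count in the 600,000-tag reference transcriptome, expressed as tags per million. Both assumptions and all names are illustrative only and do not describe an established MBGP procedure.

```python
REFERENCE_LIBRARY_SIZE = 600_000  # tags in the "saturated" reference transcriptome


def tags_per_million(tag_count, library_size=REFERENCE_LIBRARY_SIZE):
    """Normalise a raw SAGE tag count to tags per million sequenced tags."""
    return tag_count * 1_000_000 / library_size


def absolute_expression(sample_to_reference_ratio, reference_tag_count,
                        library_size=REFERENCE_LIBRARY_SIZE):
    """Estimate the abundance of a transcript in an experimental sample.

    sample_to_reference_ratio: microarray intensity ratio (sample / reference)
        for the array element that represents the gene.
    reference_tag_count: SAGE tag count for the same gene in the reference
        transcriptome.
    Returns an estimate in tags per million, assuming a linear array response.
    """
    reference_abundance = tags_per_million(reference_tag_count, library_size)
    return sample_to_reference_ratio * reference_abundance


if __name__ == "__main__":
    # Hypothetical gene: 60 tags in the reference library, and a microarray
    # signal 2.5 times higher in the experimental sample than in the reference.
    print(absolute_expression(2.5, 60))
```

Array elements chosen to span low, medium and high tag counts in the reference transcriptome would show whether the assumed linearity holds across the range of expression levels.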
Flow-on benefits to the research community
Upon publication all SAGE data will be made available to the public. These data will provide a stimulus for hypothesis-driven
research on genes of neurological importance. In keeping with
the vast majority of SAGE libraries constructed to date (SAGE 2000, Baltimore,
MD, USA, Sept.18-20), we will continue to use the anchoring enzyme NlaIII
and the tagging enzyme BsmFI during SAGE library construction. In this
way, the value of newly generated SAGE data is enhanced by access to other
large data sets for mouse. The MBGP is pleased to offer assistance to other
investigators in the production of SAGE libraries by supplying proven
reagents and expertise.
Future plans
In order to more precisely define the gene expression changes associated
with the neurodevelopmental defects seen in DS/Ts65Dn, we propose to conduct
SAGE analyses of discrete brain regions, namely hippocampus, cortex and
cerebellum at three developmental stages (P1, P15 and P30). In addition
we will follow the subsequent neurodegeneration seen in DS/Ts65Dn by comparison
of these libraries with those generated from equivalent regions of adult
mice. Ts1Cje mice, which are trisomic for a region of MMU16 from
Sod1 to Mx1, may be an appropriate model mainly for the neurodevelopmental
defects in DS, while Ms1Ts65 mice (3 copies of MMU16 from App to Sod1, both
implicated in neurodegenerative diseases; see above) may be used
to study the neurodegenerative aspects of DS. To dissect the contribution
of these two chromosomal segments to the DS/Ts65Dn phenotypes, gene expression
profiles of hippocampus, cortex and cerebellum from these two mouse models
will be analysed at the same time points as for Ts65Dn using microarrays
from the AGRF. The SAGE data from the Ts65Dn mouse line will serve as
a framework for the interpretation of the microarray results.
Last modified on the 28th January 2004.