Imaging in neuroscience has revolutionized our current understanding of brain structure, architecture, and, increasingly, function. Many characteristics of morphology, cell type, and neuronal circuitry have been elucidated through methods of neuroimaging. Combining these data in a meaningful, standardized, and accessible manner is the scope and goal of the digital brain atlas. Digital brain atlases are used today in neuroscience to characterize the spatial organization of neuronal structures, for planning and guidance during neurosurgery, and as a reference for interpreting other data modalities such as gene expression and connectivity data. The field of digital atlases is extensive and, in addition to atlases of the human brain, includes high-quality brain atlases of the mouse, rat, rhesus macaque, and other model organisms. Using techniques based on histology, structural and functional magnetic resonance imaging, and gene expression data, modern digital atlases combine probabilistic and multimodal techniques with sophisticated visualization software to form an integrated product. Brain atlases thus form a common coordinate framework for summarizing, accessing, and organizing this knowledge, and will undoubtedly remain a key technology in neuroscience in the future. Since the development of its flagship project, a genome-wide image-based atlas of the mouse brain, the Allen Institute for Brain Science has used imaging as a primary data modality for many of its large-scale atlas projects. We present an overview of Allen Institute digital atlases in neuroscience, with a focus on the challenges and opportunities for image processing and computation.
Understanding the geography of genetic expression in the mouse brain has opened previously unexplored avenues in
neuroinformatics. The Allen Brain Atlas (ABA; www.brain-map.org) provides genome-wide colorimetric in situ
hybridization (ISH) gene expression images at high spatial resolution, all mapped to a common three-dimensional
200 μm³ spatial framework defined by the Allen Reference Atlas (ARA), making it a unique data set for studying
expression-based structural and functional organization of the brain. The goal of this study was to facilitate an unbiased data-driven
structural partitioning of the major structures in the mouse brain. We have developed an algorithm that uses nonnegative
matrix factorization (NMF) to perform parts-based analysis of ISH gene expression images. The standard NMF
approach and its variants are limited in their ability to flexibly integrate prior knowledge in the context of spatial data.
In this paper, we introduce spatial connectivity as an additional regularization in NMF decomposition via the use of
Markov Random Fields (mNMF). The mNMF algorithm alternates neighborhood updates with iterations of the standard
NMF algorithm to exploit spatial correlations in the data. We present the algorithm and show the subdivisions of
the hippocampus and somatosensory cortex obtained via this approach. The results are compared with established
neuroanatomic knowledge. We also highlight novel gene-expression-based subdivisions of the hippocampus identified
by using the mNMF algorithm.
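The alternation described above (spatial neighborhood updates interleaved with standard NMF iterations) can be illustrated with a minimal sketch. It is not the mNMF formulation itself: the Markov Random Field update is replaced here by a simple 4-neighbor averaging of the voxel loadings, and the function names, the `weight` parameter, the iteration counts, and the synthetic 8x8 "brain" are all assumptions made for illustration.

```python
import numpy as np

def nmf_step(V, W, H, eps=1e-9):
    """One round of standard multiplicative NMF updates for V ~ W @ H."""
    H = H * (W.T @ V) / (W.T @ W @ H + eps)
    W = W * (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def neighborhood_smooth(W, grid_shape, weight=0.5):
    """Blend each voxel's loadings with the mean of its 4-connected neighbors.

    Stand-in for an MRF neighborhood update (an assumption, not the paper's rule).
    """
    k = W.shape[1]
    maps = W.reshape(tuple(grid_shape) + (k,))
    p = np.pad(maps, ((1, 1), (1, 1), (0, 0)), mode="edge")
    neigh = (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]) / 4.0
    return ((1.0 - weight) * maps + weight * neigh).reshape(-1, k)

def spatial_nmf(V, grid_shape, k, n_outer=30, n_inner=5, weight=0.5, seed=0):
    """Alternate a few NMF iterations with one spatial smoothing pass."""
    rng = np.random.default_rng(seed)
    n_vox, n_genes = V.shape
    W = rng.random((n_vox, k)) + 1e-3   # voxel loadings (spatial parts)
    H = rng.random((k, n_genes)) + 1e-3  # gene signatures of each part
    for _ in range(n_outer):
        for _ in range(n_inner):
            W, H = nmf_step(V, W, H)
        W = neighborhood_smooth(W, grid_shape, weight)
    return W, H

# Synthetic demo: an 8x8 grid whose left and right halves express disjoint gene sets.
grid_shape = (8, 8)
rng = np.random.default_rng(1)
region = np.zeros(grid_shape, dtype=int)
region[:, 4:] = 1
signatures = np.array([[1.0, 0.8, 0.6, 0.0, 0.0, 0.0],
                       [0.0, 0.0, 0.0, 0.7, 1.0, 0.9]])
V = signatures[region.ravel()] + 0.02 * rng.random((64, 6))  # voxels x genes

W, H = spatial_nmf(V, grid_shape, k=2)
# Weight each loading by its signature norm so the argmax does not depend on
# the arbitrary scaling between W and H.
labels = (W * np.linalg.norm(H, axis=1)).argmax(axis=1).reshape(grid_shape)
```

In this sketch each voxel is assigned to the part with the largest loading, which yields a hard partition of the grid; the smoothing weight controls how strongly neighboring voxels are encouraged to share a part.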
Understanding gene expression in the mouse brain should provide a better understanding of the underlying topology of the mammalian brain, thereby opening previously unexplored avenues in neuroscience and brain informatics. An important step in this direction is to develop robust algorithms to quantify gene expression in in situ hybridization (ISH) data. ISH methodology involves the use of labeled nucleic acid probes that bind to specific mRNA transcripts in tissue sections. The bound probe is detected using colorimetric methods, and the resulting stained tissue sections are imaged at high resolution. The goals of the present study are, first, to identify a staining method that produces the maximum signal-to-noise ratio (SNR) and, second, to develop a method for gene expression detection that works across a wide range of ISH images spanning different intensities and expression patterns. A simple k-means-based clustering method is used to separate foreground labeling from background and non-expressing tissue in a variety of images stained with different staining/counterstaining techniques. We found that NBT/BCIP with no counterstain produces the best signal-to-background separation. The foreground cluster detected using the k-means algorithm was further modeled using a normal distribution. A novel one-sided Mahalanobis-distance-based metric with a majority partial ordered voting method was then developed to generate a fuzzy segmentation of the gene expression in each ISH image. This algorithm is fully automatic and facilitates high-throughput analysis of large amounts of image data. Using this methodology, a cluster of 10 PCs was able to process approximately 10% of the mouse genome (17 TBytes of JPEG2000 lossless compressed images) over a period of 2 weeks. The results may be visualized at the Allen Institute for Brain Science web site, www.brain-map.org.
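The detection pipeline outlined above (k-means clustering, a normal model of the foreground cluster, and a one-sided Mahalanobis distance) can be sketched as follows. This is a hedged illustration rather than the published method: the majority partial ordered voting step is omitted, the "one-sided" rule is interpreted here as applying no penalty to pixels at least as dark as the foreground mean, the darkest cluster is assumed to be the stain, and the brightness-quantile seeding, the function names, and the synthetic RGB values are all assumptions.

```python
import numpy as np

def kmeans(X, k, n_iter=20):
    """Plain Lloyd's k-means; centers are seeded at brightness quantiles (an assumption)."""
    order = np.argsort(X.sum(axis=1))
    seeds = ((np.arange(k) + 0.5) / k * len(X)).astype(int)
    centers = X[order[seeds]].copy()
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

def fuzzy_expression(image, k=3):
    """Return a per-pixel membership in [0, 1] for the stained (foreground) class."""
    X = image.reshape(-1, 3).astype(float)
    labels, centers = kmeans(X, k)
    fg = int(np.argmin(centers.sum(axis=1)))  # assumption: darkest cluster is the stain
    fg_pixels = X[labels == fg]
    # Model the foreground cluster as a multivariate normal distribution.
    mu = fg_pixels.mean(axis=0)
    cov = np.cov(fg_pixels, rowvar=False) + 1e-6 * np.eye(3)
    diff = X - mu
    d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
    # One-sided rule (our interpretation): only pixels brighter than the
    # foreground mean are penalized; darker pixels count as fully expressing.
    d2 = np.where(diff.sum(axis=1) > 0, d2, 0.0)
    return np.exp(-0.5 * d2).reshape(image.shape[:2])

# Synthetic demo with made-up colors: bare slide, unlabeled tissue, stained cells.
rng = np.random.default_rng(0)
image = rng.normal((235, 235, 230), 8, size=(40, 40, 3))
image[5:35, 5:35] = rng.normal((200, 180, 200), 8, size=(30, 30, 3))
image[10:30, 10:30] = rng.normal((70, 50, 130), 10, size=(20, 20, 3))
stain_mask = np.zeros((40, 40), dtype=bool)
stain_mask[10:30, 10:30] = True

membership = fuzzy_expression(image)
```

The membership map is a soft segmentation: values near 1 mark pixels consistent with the foreground model, and values near 0 mark background or non-expressing tissue. A threshold on this map, or a voting scheme across several such maps, would be needed to obtain a final expression mask.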