Human Genes: What Makes a Hit

Historical bias is a key reason biomedical researchers continue to study the same 10 percent of all human genes whose sequences are known while ignoring many genes known to play roles in disease, according to a new study by Northwestern University. The bias is bolstered by research funding mechanisms and social forces.

Recent studies from other labs have reported that researchers actively study only about 2,000 of the nearly 20,000 human protein-coding genes. Thomas Stoeger, a Data Science Scholar at the Northwestern Institute on Complex Systems (NICO) and the Center for Genetic Medicine, set out to find out why. With colleagues including Luís Amaral, professor of chemical and biological engineering in Northwestern’s McCormick School of Engineering, Stoeger compiled 36 distinct resources describing various aspects of biomedical research and analyzed the resulting large database for answers.

The team found that well-meaning policy interventions to promote exploratory or innovative research actually result primarily in additional work on the most established research topics: genes first characterized in the 1980s and 1990s, before completion of the Human Genome Project. The researchers also discovered that postdoctoral fellows and Ph.D. students who focus on poorly characterized genes are 50 percent less likely to become independent researchers.

The interdisciplinary study was published September 18 by the open-access journal PLOS Biology.

“We discovered that current research on human genes does not reflect the medical importance of the genes,” Stoeger said. “Many genes with a very strong relevance to human disease are still not studied. Instead, social forces and funding mechanisms reinforce a focus of present-day science on past research topics.”

Stoeger, Amaral, postdoctoral fellow Martin Gerlach and Richard I. Morimoto, the Bill and Gayle Cook Professor of Molecular Biosciences in Northwestern’s Weinberg College of Arts and Sciences, conducted the study.

The researchers applied a systems approach to the data, which included chemical, physical, biological, historical and experimental data, to uncover underlying patterns. Beyond explaining why some genes are not studied, the approach can account for the extent to which an individual gene is studied, and it does so for approximately 15,000 genes.

The Human Genome Project, the identification and mapping of all human genes that was completed in 2003, promised to expand the scope of scientific study beyond the small group of genes scientists had studied since the 1980s. But the Northwestern researchers found that 30 percent of all genes have never been the focus of a scientific study and that fewer than 10 percent of genes are the subject of more than 90 percent of published papers. This imbalance persists despite the increasing availability of new techniques to study and characterize genes.

“Everything was supposed to change with the Human Genome Project, but everything stayed the same,” said Amaral, the Erastus Otis Haven Professor of Chemical and Biological Engineering and a co-author of the study. “Scientists keep going to the same place, studying the exact same genes. Should we be focusing all of our attention on this small group of genes?”

With researchers focused on just 2,000 human genes, the biology encoded by the remaining 18,000 genes is largely uncharacterized. These understudied genes, the researchers note, include a breast cancer gene cluster and genes connected to lung cancer that could be at least as important as well-studied genes.

“The bias to study the exact same human genes is very high,” Amaral said. “The entire system is fighting the very purpose of the agencies and scientific knowledge, which is to broaden the set of things we study and understand. We need to make a concerted effort to incentivize the study of other genes important to human health.”

Looking forward, the Northwestern team is developing a public resource that could help identify understudied genes that have the potential to be of critical importance to specific diseases. The resource includes information on whether a gene has any extraordinary chemical property, whether it is highly active in a specific tissue and whether it has a strong link to a disease.

The research was supported by the National Science Foundation, the Department of Defense’s Army Research Office, the National Institute on Aging, the National Institute of Allergy and Infectious Diseases, the Simons Foundation, the Daniel F. and Ada L. Rice Foundation and a gift from John and Leslie McQuown.

The title of the paper is “Large-scale investigation of the reasons why potentially important genes are ignored.”