Wednesday, January 18, 2006

Around the World in 800 Billion Bases

Sanger Institute Genetic Records are World's Biggest

On Tuesday 17 January 2006 the Wellcome Trust Sanger Institute's World Trace Archive database of DNA sequences hit one billion entries. The Trace Archive is a store of all the sequence data produced and published by the world scientific community, including the Sanger Institute's own prodigious output as a world-leading genomics institution.

To grasp how much data is in the Archive, consider that if it were printed out as a single line of text, it would stretch around the world more than 250 times. Printing it out on pages of A4 would produce a stack of paper two-and-a-half times as high as Mount Everest.

Each entry is a piece of genetic information averaging 864 characters long. Scientists can search these sequences and piece them together to build up the whole genetic information of organisms - mice, fish, flies, bacteria and, of course, humans.
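As a rough check on the headline figure, the total number of characters can be estimated by multiplying the number of entries by the average entry length. The sketch below uses only the two figures quoted above; the variable names are illustrative.

```python
# Figures quoted in the press release.
entries = 1_000_000_000   # one billion trace records
avg_length = 864          # average characters (bases) per record

# Total characters held in the Archive under these two figures.
total_bases = entries * avg_length

print(f"{total_bases:,} bases")
print(f"about {total_bases / 1e9:.0f} billion bases")
```

This gives 864 billion characters, consistent with the rounded "800 Billion Bases" of the headline.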

The Archive is 22 Terabytes in size and doubling every ten months - perhaps the largest single scientific database in Europe, if not the world.
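To see what a ten-month doubling time implies, the sketch below projects the Archive's size forward. It assumes the doubling rate stays constant, which the press release does not claim; the function name and its parameters are hypothetical.

```python
def projected_size_tb(months, start_tb=22.0, doubling_months=10.0):
    """Projected size in terabytes after `months`, assuming steady doubling."""
    return start_tb * 2 ** (months / doubling_months)

# Project a few doubling periods ahead from the 22 TB starting point.
for months in (0, 10, 20, 30):
    print(f"after {months:2d} months: {projected_size_tb(months):.0f} TB")
```

Under that assumption the Archive would pass 170 TB within two and a half years.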

Martin Widlake, Database Services Manager at the Wellcome Trust Sanger Institute said: "At 22 000 GB the Trace Archive is in the Top Ten UNIX databases in the world. That's not bad for a research organisation of 850 employees in the countryside just outside Cambridge."

"It is possibly the biggest single (acknowledged) scientific RDBMS database in Europe, if not the world."

All the data are freely available to the world scientific community (http://trace.ensembl.org/), as a resource to geneticists all over the globe. When a researcher is studying a disease or gene, they can download the genetic information known about the area they are studying.

The data are being actively used by biomedical researchers in academic and commercial organizations. The three internet domains that make most use of the Trace Archive are .com, .edu and .uk. Dotcoms are responsible for about 80% of downloads each week - mostly as big 'customers', taking vast chunks each visit. Next are US university researchers, followed by UK scientists.

Trace data are the raw results of genetic research. They allow researchers to identify and study genes, to reveal variations (mutations) in genes and to study similarity to genes in other organisms. These are vital starting points for studying and better understanding the biology of health and disease.

By any comparison, the billion records stand above many other familiar repositories. The British Library holds 13 million items; the US Library of Congress holds 115 million items. The Trace Archive holds one billion chunks of unique information.

"Accessing the data becomes a larger and larger problem as the dataset grows," continued Martin Widlake. "At present it is simple and very quick to access a record if you know its unique identifier as issued by the Sanger Institute, the US National Center for Biotechnology Information (NCBI) database, or the 'name' of the trace as given by the organization that originally sequenced that piece of genetic information."

"Scanning the whole dataset for a single genetic sequence, which is a lot like searching for a single sentence in the contents of the British Library, is a massive task. However, the team at the Sanger Institute are working on new methods to make the data easier to search and access."
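The contrast between the two kinds of access can be illustrated with a toy example. This is only a sketch of the general idea, not the Sanger Institute's implementation: the identifiers and sequences are made up, and an in-memory dictionary stands in for the database.

```python
# Toy archive: identifiers and sequences are invented for illustration.
archive = {
    "trace-0001": "ACGTTGCAAGCT",
    "trace-0002": "GGCTAACGTTAG",
    "trace-0003": "TTAGGCCATACG",
}

def fetch_by_id(trace_id):
    """Keyed lookup: goes straight to one record, however many are stored."""
    return archive.get(trace_id)

def scan_for_sequence(fragment):
    """Full scan: every record has to be examined for the fragment."""
    return sorted(tid for tid, seq in archive.items() if fragment in seq)

print(fetch_by_id("trace-0002"))
print(scan_for_sequence("ACGTT"))
```

With three records the difference is invisible; with a billion, the scan has to read every one of them while the keyed lookup still touches only one.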

The data are held in duplicate, with the NCBI also maintaining a copy: with two sites holding it, a single disaster cannot wipe out the only copy of this vital and heavily used database.



source:http://www.sanger.ac.uk/Info/Press/2006/060117.shtml


Bored meetings

Do you believe, as someone somewhere perhaps does, that meetings, meetings, meetings, followed by more meetings are altogether a good thing? If so, Alexandra Luong and Steven G Rogelberg think you should think again. In a newly published study, they say: "We propose that despite the fact that meetings may help to achieve work-related goals, having too many meetings and spending too much time in meetings per day may have negative effects on the individual."

Luong is an assistant professor of industrial and organisational psychology at the University of Minnesota, Duluth. Rogelberg is an associate professor of psychology at the University of North Carolina at Charlotte. Their report appears in the journal Group Dynamics: Theory, Research and Practice.

It begins with a somewhat brief recitation of the history of important research discoveries about meetings. Here is a capsule version of their tale.

Discovery: The majority of a manager's typical workday is spent in meetings. This was reported by an investigator named Mintzberg in 1973.

Discovery: The frequency and length of meetings have grown considerably in the last few decades. So declared the team of Mosvick and Nelson in 1987.

Discovery: A scientist named Zohar, in a series of reports published during the 1990s, found evidence that "annoying episodes" - which are sometimes also known as "hassles" - contribute to burnout, anxiety, depression and other negative emotions. Zohar advanced a theoretical framework that may one day help to explain why this is so.

Discovery: In 1999, a scientist named Zijlstra "had a sample of office workers work in a simulated office for a period of two days in order to examine the psychological effects of interruptions. [They] were periodically interrupted by telephone calls from the researcher." This had what Zijlstra calls "negative effects" on their mood.

Luong and Rogelberg used those and other discoveries as a basis for their own innovatively broad theory.

They devised a pair of hypotheses, educatedly guessing that:

1. The more meetings one has to attend, the greater the negative effects; and

2. The more time one spends in meetings, the greater the negative effects.

Then they performed an experiment to test these two hypotheses. Thirty-seven volunteers each kept a diary for five working days, answering survey questions after every meeting they attended and also at the end of each day. That was the experiment.

The results speak volumes. "It is impressive," Luong and Rogelberg write in their summary, "that a general relationship between meeting load and the employee's level of fatigue and subjective workload was found". Their central insight, they say, is the concept of "the meeting as one more type of hassle or interruption that can occur for individuals".

Rogelberg has delivered this insight in a talk called "Meetings and More Meetings," which he presented to a meeting at the University of Sheffield. He also does a talk called "Not Another Meeting!", which has been well received at two meetings in North Carolina.

· Marc Abrahams is editor of the bimonthly magazine Annals of Improbable Research (www.improbable.com) and organiser of the Ig Nobel Prize



source:http://education.guardian.co.uk/higher/research/improbable/story/0,11109,1687547,00.html

