Nick Deason received his B.S. in Biology from the University of Notre Dame and currently studies the spread of drug-resistant malaria.
The scale of our universe is astounding. The Milky Way alone is up to 100 trillion times the size of our sun. When you try to compare the size of subatomic particles to the observable universe, you must abandon words and turn to scientific notation to describe the numbers involved.
As a biologist studying infectious diseases, I often struggle to comprehend the scale of the microbes that live inside us and make us sick. Prompted by this ignorance, I set out to identify the largest and smallest viruses, bacteria, and parasites that call humans home. While not quite as dramatic as the difference between subatomic particles and astronomical bodies, the range of sizes of human pathogens is astonishing in its own right, with a size difference of more than one billion-fold from smallest to largest. If the smallest human virus were enlarged to the size of a tennis ball, the longest parasitic worm would easily wrap around the circumference of Earth’s equator. With this range of proportions in mind, let’s start with our smallest pathogens and make our way up.
The Nano: Prions, Viroids, and Viruses
At the small end of the spectrum, things are a bit hazy. For one, the infectious agents involved at this level – prions, viroids, and viruses – are not necessarily living organisms. Prions (such as the causative agent of Mad Cow Disease) are simply misfolded proteins that cause a chain reaction of more protein misfolding. Viroids consist only of RNA (genetic material similar in structure to DNA) that replicates using host enzymes. Meanwhile, viruses are semi-living packages of genetic material and protein that reproduce inside of organisms’ cells.
The smallest of these agents, prions, may be as small as 5 to 10 nanometers (nm) in length (for reference, a nanometer is one billionth of a meter, or roughly the length of 3 carbon atoms). However, the caveat is that prions often cling together to form large aggregates of proteins, and so the size of infectious prion particles is often much larger than a single protein.
Viroids, as noted above, are infectious pieces of RNA that replicate using host enzymes. They cause disease by interfering with normal host cell function through a process called gene silencing. However, all known viroids infect plants, so I largely ignored them for this article. If you’re interested, you can read more about Hepatitis D, which is a human disease caused by a viroid-like mechanism.
Viruses are much more familiar to the average person and can cause diseases as common as the seasonal flu or as rare and deadly as Ebola. Viruses are made up of DNA or RNA surrounded by a protective protein coat. Some may also have an outer lipid layer that protects the virus and functions in pathogenesis. While larger than viroids or single proteins, viruses are still very small. The smallest human-infecting viruses that I came across in my search belong to a group called parvoviruses, of which I chose parvovirus B19 as an example to focus on. At a mere 23-26nm in diameter, parvovirus B19 can only be seen with powerful electron microscopes. Somehow, this virus is nonetheless able to pack several thousand base pairs of DNA into its small protein capsid. Normally, DNA is about 2nm wide, but in this case the virus can save some precious space because its genome is actually composed of single-stranded DNA.
Parvovirus B19 is most notable for causing bright red rashes in infected children, leading to the term “slapped cheek syndrome.” However, immune-compromised people, such as those with HIV infection, can have more serious manifestations of disease.
Surprisingly, it’s not just viruses that make their living at this nanometer scale. I was astounded to learn that a bacterium in the urinary tract of humans called Mycoplasma genitalium can measure as little as 200nm in length. It also has one of the smallest genomes of any cellular organism, with around 580,000 base pairs. This made it the ideal organism to use in The Minimal Genome Project, in which scientists deleted various genes in M. genitalium and discovered a minimum of 382 that were needed to sustain life. With regard to public health, this bacterium causes urethritis and sometimes more serious pathology in the reproductive organs of both men and women.
The last stop on our tour of the nanoscale is the poxviruses: the largest viruses that infect humans. A familiar example is variola virus, the causative agent of smallpox. This enveloped virus may be up to 350nm at its longest point, making it a behemoth in the virus world. Amazingly though, there are viruses that dwarf even variola, although they almost exclusively infect amoebas. The largest of all, Pithovirus, measures 1500nm in length. Finally, there is some debate as to whether a virus called Mimivirus causes pneumonia in humans. If confirmed, this 400+ nm long virus would replace poxviruses on our list.
The Micro: Bacteria and Parasites
A micrometer (μm) is 1000 times longer than a nanometer. In other words, it takes 1000nm to equal 1μm. The organisms measured in micrometers are typically single cells, such as bacteria and small eukaryotes. The latter category includes microsporidia, which are single-celled infectious fungi. When they infect humans, they replicate inside of our cells, which requires them to be very small. Typically, the spores of the 15 human-infecting microsporidian species measure 1-4μm in length. Disease manifestation includes diarrhea and wasting, most commonly in AIDS patients. Similarly sized intracellular parasites are the malaria parasites, whose “ring stages” are about 1-2μm in diameter.
The largest human-infecting bacteria are probably spirochetes. These long, thin, and spiral-shaped prokaryotes cause diseases like syphilis, Lyme disease, and maybe even dementia. One example is Borrelia burgdorferi, the causative agent of Lyme disease. While less than a micrometer wide, these spirochetes can be more than 20μm in length. For reference, this is almost as long as a human skin cell.
The Macro: Bugs and Worms
Of course, there are infectious agents that we can see with our own eyes. These include parasites like fleas, mites, and worms. None come close to the size of the beef tapeworm, however. Obtained by consuming undercooked beef, these worms can grow to be over 22m long (that’s over 70ft!) in our intestines, although they are typically less than 10m long in most infections.
Thinking back to our smallest pathogen, parvovirus B19, we see the huge range of sizes that infectious agents can come in. Each organism’s body size is adapted to its specific home. Intracellular viruses are small, constrained by the size of our own cells, while worms that live in our intestines, like the tapeworm, can afford to be much larger. As the parasitologist Dickson Despommier always says: “Successful systems attract parasites.” In the case of humans, we are so successful that viruses, bacteria, and parasites of all sizes have evolved to find a home in our bodies.
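To make the scale comparison concrete, here is a minimal Python sketch that redoes the arithmetic with the sizes quoted in this article. The tennis ball diameter (about 6.7 cm) and Earth's equatorial circumference (about 40,075 km) are my own assumed reference values, not figures from the article, and the function names are purely illustrative.

```python
# Sizes quoted in the article, converted to meters.
NM = 1e-9  # one nanometer in meters
UM = 1e-6  # one micrometer in meters

sizes_m = {
    "prion (single protein)": 5 * NM,
    "parvovirus B19": 23 * NM,
    "Mycoplasma genitalium": 200 * NM,
    "variola virus": 350 * NM,
    "Borrelia burgdorferi": 20 * UM,
    "beef tapeworm": 22.0,
}

# Assumed reference values (not from the article).
TENNIS_BALL_M = 0.067            # approximate tennis ball diameter
EARTH_EQUATOR_M = 40_075_000.0   # approximate equatorial circumference


def fold_difference(small_m, large_m):
    """How many times longer the large object is than the small one."""
    return large_m / small_m


def scaled_length(real_m, reference_m, model_m):
    """Length of an object if `reference_m` were enlarged to `model_m`."""
    return real_m * (model_m / reference_m)


virus = sizes_m["parvovirus B19"]
worm = sizes_m["beef tapeworm"]

virus_to_worm = fold_difference(virus, worm)
prion_to_worm = fold_difference(sizes_m["prion (single protein)"], worm)
scaled_worm_m = scaled_length(worm, virus, TENNIS_BALL_M)
laps_around_equator = scaled_worm_m / EARTH_EQUATOR_M

print(f"virus to tapeworm: {virus_to_worm:.2e}-fold")
print(f"prion to tapeworm: {prion_to_worm:.2e}-fold")
print(f"scaled tapeworm: {scaled_worm_m / 1000:.0f} km, "
      f"or {laps_around_equator:.1f} trips around the equator")
```

With these inputs, the virus-to-tapeworm ratio lands just under a billion, and counting a 5nm prion as the smallest agent pushes the ratio past four billion. The scaled-up tapeworm works out to roughly 1.6 trips around the equator, so the tennis ball comparison holds under these assumptions.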
Francesca Tomasi received her B.A. from the University of Chicago and currently researches tuberculosis drug targets in search for novel antibiotics.
Proteins are essential components in all forms of carbon-based life, capable of serving just about every function imaginable. From providing physical structure to a cell, to orchestrating complex metabolic networks, our bodies need hundreds of thousands of different types of proteins to survive.
So how do we study them? To begin, scientists typically need to isolate a protein of interest. For example, say someone wants to investigate a bacterial protein as a new drug target. They want to isolate that molecule and test whether different drugs successfully bind to it and inhibit its function.
To do so, scientists use SDS-PAGE, a popular gel electrophoresis technique that separates biological molecules by size.
Say you have a massive box filled with knotted ropes, but you just want to take out a single shoelace from the box. If you were looking for a specific, 15” shoelace, how would you distinguish it from a similar-looking one that’s longer or shorter if they are all knotted up? You would have to dump the box on the floor, detangle its contents, and pick out your shoelace.
Proteins exist in their own sort of knotted mess, complex three-dimensional conformations that render them uniquely capable of carrying out a very specific process. So just like it’s easier to find a specific string in a tangled mess after untangling and sorting everything out, it is easier to pick out a protein of interest by “untangling” all the proteins in a solution and sorting them out by length.
In SDS-PAGE, proteins are linearized by a detergent called SDS (which stands for sodium dodecyl sulfate) that also gives them a negative charge. Once proteins are loaded into a gel (the starting point of this time-lapse video), an electric current is turned on, which pulls the negatively charged proteins through the gel toward the positive electrode. Larger proteins have a harder time threading through the gel than smaller ones, so everything moves at a different rate down the gel, with the smallest proteins traveling farthest.
A researcher knows the size of a specific protein of interest by studying its genetic code and the amino acids that make it up; it is therefore easy to pinpoint its location on a gel, using a template ladder (the left-most lane in this video), which acts as a size key.
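As a rough illustration of how a ladder acts as a size key, the sketch below estimates a protein's mass from its amino acid count and then predicts where it should sit relative to the ladder bands. It relies on two common rules of thumb that are not stated in this article: an average of about 110 daltons per amino acid, and an approximately straight-line relationship between migration distance and the logarithm of protein mass. The ladder values are invented for the example.

```python
import math

# Rule-of-thumb average mass of one amino acid residue (an assumption).
AVG_RESIDUE_MASS_DA = 110.0

# Hypothetical ladder: (mass in kDa, distance migrated in mm).
ladder = [(250, 5.0), (100, 15.0), (50, 25.0), (25, 35.0), (10, 48.0)]


def estimate_mass_kda(n_residues):
    """Approximate protein mass from its number of amino acids."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_distance_mm(mass_kda, ladder_bands):
    """Predict migration distance by fitting distance against log10(mass)."""
    log_masses = [math.log10(mass) for mass, _ in ladder_bands]
    distances = [dist for _, dist in ladder_bands]
    slope, intercept = fit_line(log_masses, distances)
    return slope * math.log10(mass_kda) + intercept


protein_mass = estimate_mass_kda(400)  # a hypothetical 400-residue protein
expected_mm = predict_distance_mm(protein_mass, ladder)
print(f"~{protein_mass:.0f} kDa, expected near {expected_mm:.1f} mm")
```

In this made-up gel, a 400-residue protein comes out at about 44 kDa, so the researcher would look for its band between the 50 kDa and 25 kDa ladder markers.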
Francesca Tomasi received her B.A. from the University of Chicago and currently researches tuberculosis drug targets in search for novel antibiotics.
You have probably heard words like genome or chromosome before, both of which refer to some self-contained unit of completion: chromosomes contain genetic information that all together makes up the genome, the complete set of genes within an organism. Now in the twenty-first century, the suffix -ome has been modified into a neologism, -omics, aimed at encompassing different fields that can collectively characterize (even quantify) entire pools of biological molecules. With modern tools, scientists can translate massive data sets into the inner function and dynamics of living things.
Sometimes we assume many of the “major” discoveries in science have already been made: the discovery of DNA, the identification of etiologic agents of disease through a thorough understanding of human physiology, the structure and function of viruses, bacteria, proteins, eukaryotic cells, and so on. We assume that as members of this century it is our duty instead to dig into the intimate details of a biological system to pinpoint the nuances responsible for diseases we have not been able to treat yet. As we learn repeatedly, however, this is not exactly true. Major discoveries are still made today that revolutionize anything from a small niche in research to the central dogma of medicine. Recently, for instance, the discovery of CRISPR and its simple applications to genome editing have stirred the foundation of genetics and the potential for gene therapy as a very real way to treat hereditary diseases.
Another revolution comes from the invention of tools to study entire organisms all at once. A cell is a highly complex, dynamic unit: the smallest structural and functional component of a living organism. Cells contain all sorts of biological molecules that allow them to function as single living entities. In multicellular organisms like ourselves, our cells work together to make us us. We are, to echo some biology teachers’ favorite cliché, walking chemical factories, constantly undergoing biochemical processes to maintain a functional, physiological balance.
Until now, we have predominantly studied cells and greater organisms piece by piece, by mutating, deleting, or overexpressing specific genes that encode specific molecules, to understand the purpose of those molecules and how they come together in biochemical pathways. The goal of the omics, however, is to look at something – a single cell, organ, or an entire population of individuals – and deduce patterns that correspond to specific consequences. Let’s look at three of the main scientific omics.
GENOMICS: PUTTING TOGETHER THE GLOBAL GENETIC CODE
In 1953, James Watson and Francis Crick published the double-helix structure of DNA, drawing on X-ray diffraction data from Rosalind Franklin. As soon as the scientific world learned what DNA was and how it worked, the advent of genetic sequencing technology revolutionized molecular biology. As early as 1955, researchers developed ways to decode the genetic makeup of anything from a single protein to cellular organelles. By the turn of the century, DNA sequencing evolved into a rapid technique that today allows us to sequence entire genomes. In 2003, the Human Genome Project was declared complete, and by 2007 the entire genome of a single human being had been published. Since then, thousands of genomes across all of life’s kingdoms have been sequenced, and thus was born the study of genomics. For example, scientists can easily compare different genomes to search for potential genes associated with different diseases. In the event of an infectious disease outbreak, sequencing pathogens can allow biologists to understand the microbe’s evolutionary pattern over the course of an outbreak. Understanding the genetic makeup of viruses and bacteria helps scientists develop vaccines, or predict mutations that will make a pathogen resistant to drugs or more contagious.
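To give a feel for what comparing genomes means in practice, here is a toy Python sketch that lines up two short, already-aligned DNA sequences and reports where they differ. The sequences and function names are invented for illustration. Real genome comparisons involve millions of bases and dedicated alignment software, so treat this only as a picture of the basic idea.

```python
def find_differences(reference, sample):
    """Return (position, reference_base, sample_base) for each mismatch.

    Positions are 1-based. Both sequences must already be aligned
    to the same length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (index + 1, ref_base, sample_base)
        for index, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


def percent_identity(reference, sample):
    """Percentage of aligned positions where the two sequences match."""
    mismatches = len(find_differences(reference, sample))
    return 100.0 * (len(reference) - mismatches) / len(reference)


# Hypothetical sequences: an early isolate and a later one from the
# same imagined outbreak.
early_isolate = "ATGGCGTACGTTAGC"
later_isolate = "ATGGCGTACATTAGC"

for position, old_base, new_base in find_differences(early_isolate, later_isolate):
    print(f"position {position}: {old_base} -> {new_base}")
print(f"identity: {percent_identity(early_isolate, later_isolate):.1f}%")
```

Tracking which positions change between samples collected at different times is, in miniature, how sequencing can reveal a pathogen's evolution over the course of an outbreak.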
Genomics has multiple subsets. Functional genomics, for instance, attempts to describe gene functions and interactions (you will see this recurring theme as you read on about proteomics and metabolomics). This area of interest focuses on the more dynamic nature of DNA and its use as a blueprint for proteins as opposed to examining the static nature of the DNA code. For instance, scientists trying to understand how an organism coordinates its life processes on a genetic level will employ techniques in functional genomics.
Meanwhile, researchers studying structural genomics would like to elucidate the three-dimensional structure of the proteins encoded by a genome. Like a blueprint that describes a building, moving from a one- or two-dimensional map to a three-dimensional building is a complex process; nonetheless, a combination of physics, structural chemistry, and genomics allows scientists to develop structure prediction algorithms to model an unknown protein’s structure. This, in turn, helps study the protein’s interactions with other proteins, and design potential molecular inhibitors of protein drug targets.
Finally, metagenomics is the study of genomes recovered from environmental samples. A major limitation in microbiology is the inability to culture over 90% of microbes. To study a community of bacteria, classical microbiology requires scientists to be able to grow and manipulate each species in the lab. However, a growth medium is the complex result of years of trial and error to determine the minimal necessary growth conditions for a given organism. Thus, it is difficult – and often deemed impossible – to invent and manufacture special media for any species of interest. Metagenomics helps navigate this limitation by allowing researchers to sequence entire populations and study microbial diversity by understanding the different genome sequences within a sample. It holds the power to reveal previously untapped microscopic life, allowing us to delve more deeply into hidden wonders of the world including our very own microbiomes.
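One simple way to put a number on microbial diversity is the Shannon diversity index, which grows as a sample contains more kinds of organisms in more even proportions. The sketch below assumes each sequenced read has already been assigned to a group. The group names and counts are invented, and the index is offered as one common choice rather than the method any particular study uses.

```python
import math
from collections import Counter


def shannon_diversity(counts):
    """Shannon index H = -sum(p * ln p) over the groups in a sample."""
    total = sum(counts.values())
    index = 0.0
    for count in counts.values():
        if count > 0:
            proportion = count / total
            index -= proportion * math.log(proportion)
    return index


# Hypothetical read assignments from one environmental sample.
read_assignments = (
    ["Bacteroides"] * 50 + ["Escherichia"] * 30 + ["unclassified"] * 20
)
sample_counts = Counter(read_assignments)

print(dict(sample_counts))
print(f"Shannon diversity: {shannon_diversity(sample_counts):.3f}")
```

A sample dominated by a single group scores zero, while a sample split evenly among four groups scores the natural logarithm of 4, so higher values point to a richer and more balanced community.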
PROTEOMICS: PUTTING THE “PRO” IN PROTEIN
Biochemical processes are carried out by proteins, the indispensable organic compounds that give us our body tissues, enzymes, and antibodies (amongst so many other things). Proteins are encoded by genes, our biological blueprints. The word “proteome” describes the entire set of proteins expressed by a genome, cell, tissue, or organism at a given time. Proteomics, therefore, is the study of proteomes: a quest to understand how entire networks of proteins interact in a specific context. The proteome of a given organism, or subset of an organism, varies with the environment – stress, nutrient levels, and so on – so the amount of data one can generate through proteomic studies is massive.
Proteomic analysis is further complicated by the fact that once a protein is made, it does not necessarily remain unchanged. Proteins constantly interact with each other and are altered along with a cell’s physiological state. So-called “post-translational modifications” exist to slightly alter a protein to perform a new task. An extremely common example of this is called “phosphorylation.” Phosphorylation refers to the addition of a phosphate group to specific amino acids (the building blocks of proteins). This often happens in processes like cell signaling, where the presence or absence of a molecule causes a protein to become phosphorylated, giving it a kind of energy-rich tag that equips the protein to set off a new cascade of events in response to a change in the environment. Thus, proteomics can also involve studying variations of a specific protein in different scenarios to understand when and where it is used for a given purpose.
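As a small illustration of why modifications like phosphorylation are detectable at all, the sketch below lists the positions in a short peptide that could carry a phosphate and computes how much heavier the peptide becomes. Two details here come from general biochemistry rather than from this article: the usual phosphate acceptors are serine (S), threonine (T), and tyrosine (Y), and each added phosphate raises the mass by roughly 80 daltons. The peptide and its starting mass are made up.

```python
# Approximate mass added by one phosphate group, in daltons (an assumption
# drawn from general biochemistry, not from the article).
PHOSPHO_SHIFT_DA = 79.966

# One-letter codes of the amino acids most commonly phosphorylated.
PHOSPHO_ACCEPTORS = {"S", "T", "Y"}


def candidate_sites(peptide):
    """Return (1-based position, residue) pairs that could be phosphorylated."""
    return [
        (index + 1, residue)
        for index, residue in enumerate(peptide)
        if residue in PHOSPHO_ACCEPTORS
    ]


def modified_mass(unmodified_mass_da, n_phosphates):
    """Mass of the peptide after adding `n_phosphates` phosphate groups."""
    return unmodified_mass_da + n_phosphates * PHOSPHO_SHIFT_DA


peptide = "AKSPTLYR"      # hypothetical peptide
base_mass_da = 1000.0     # hypothetical unmodified mass

sites = candidate_sites(peptide)
print("candidate sites:", sites)
for n in range(len(sites) + 1):
    print(f"{n} phosphate(s): {modified_mass(base_mass_da, n):.3f} Da")
```

Because each phosphate shifts the mass by a predictable amount, the same protein can show up as several distinct species in a proteomic data set, one for each modification state.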
Suffice it to say, proteomics experiments are complex tasks that require very precise planning and execution. In any biological field, proteomics has seemingly infinite applications. One such application is in drug discovery.
In the past, antimicrobial drug discovery relied heavily on serendipity. Penicillin, for instance, was only discovered when a laboratory petri dish was contaminated with mold that killed the dish’s resident bacteria. Nature is arguably Earth’s greatest pharmacist in this regard, as organisms have evolved side by side for billions of years, developing arsenals against each other. Name a bacterial species, and it is more likely than not that there exist different bacteria or fungi capable of producing their own antibiotics to outcompete that species. Nonetheless, treating bacteria within the human body has added criteria that are not necessarily met by natural antimicrobial compounds. An antibiotic used in people is, ideally, selectively toxic to a specific type of pathogen while simultaneously causing little to no harm to patients. Meanwhile, a compound produced by other microbes might be broad-spectrum, or it might target some essential biochemical process common to both a pathogen and its host. Artificial drug discovery efforts – and proteomics – step in to understand existing drugs and build novel ones.
Proteomic analyses are helpful in many subsets of drug research: namely, target identification, identification of a compound’s efficacy or toxicity, and understanding mechanisms of action. The first category requires the identification of proteins whose activities change between healthy and diseased states. Such proteins provide insight into potential novel drug targets as well as diagnostic biomarkers of a given condition. If researchers analyze a pool of extracted proteins from healthy control populations and from individuals with a given disease, they can analyze differences between both cohorts (and even along a spectrum of disease severity). With regards to compound efficacy, proteomics can aid in the identification of protein interactions and the detection of any biomarkers to assess whether an intended, theoretical target is modulated by a given compound in real life. Lastly, sometimes researchers identify chemicals that successfully treat a specific pathogen, but whose specific target(s) or mechanisms of action are not understood. In this scenario, proteomic profiling can shed light on drug targets to pinpoint what is taking place. This is often accomplished by slightly modifying a drug into a probe – that is, giving it a “handle” (usually a small chemical modification that will not change the drug’s activity) that is later detectable by a tag (such as a fluorescent molecule) that specifically binds to the handle. This in turn makes the drug discernible from other uninvolved proteins, allowing a scientist to purify specific tagged molecules and, consequently, any target protein they are bound to.
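The comparison between healthy and diseased cohorts described above often boils down to asking which proteins change in abundance, and by how much. Here is a minimal Python sketch of that idea using a log2 fold change, where a value of 1 means a doubling and -1 means a halving. The protein names, measurements, and cutoff are all hypothetical, and a real analysis would also test whether each difference is statistically meaningful.

```python
import math
from statistics import mean

# Hypothetical abundance measurements (arbitrary units) per person.
healthy = {
    "protein_A": [10.0, 12.0, 11.0],
    "protein_B": [5.0, 5.5, 4.5],
    "protein_C": [8.0, 8.0, 8.0],
}
diseased = {
    "protein_A": [44.0, 40.0, 48.0],
    "protein_B": [5.0, 4.8, 5.2],
    "protein_C": [2.0, 2.0, 2.0],
}


def log2_fold_change(control_values, disease_values):
    """log2 of (mean disease abundance / mean control abundance)."""
    return math.log2(mean(disease_values) / mean(control_values))


def flag_candidates(control, disease, threshold=1.0):
    """Return proteins whose abundance at least doubles or halves."""
    changes = {
        name: log2_fold_change(control[name], disease[name])
        for name in control
    }
    return {
        name: change
        for name, change in changes.items()
        if abs(change) >= threshold
    }


candidates = flag_candidates(healthy, diseased)
for name, change in sorted(candidates.items()):
    direction = "up" if change > 0 else "down"
    print(f"{name}: log2 fold change {change:+.2f} ({direction} in disease)")
```

In this toy data set, protein_A rises fourfold and protein_C falls fourfold in the diseased group, so both would be flagged as potential drug targets or biomarkers, while the essentially unchanged protein_B would not.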
METABOLOMICS: YOU ARE WHAT YOU EAT
When researchers study new drugs to treat an infection, the obvious first question to ask when a compound is found that kills a pathogen of interest is “what is the target?” That is, what molecule or biochemical pathway within the organism is this foreign compound destroying or inhibiting that leads to the microbe’s failure to thrive? One way to address this question, besides the proteomic approach discussed above, is to compare the metabolites of untreated and treated bacterial populations. Metabolomics, the youngest of the three -omics discussed in this article, is defined as “the systematic study of the unique chemical fingerprints that specific cellular processes leave behind.” The metabolome, simply put, is a collection of all end products of cellular processes within a cell, tissue, organ, or entire organism. This physiological snapshot can provide valuable insight into the characteristics of healthy and diseased cells. In the world of drug discovery, researchers can compare the metabolic profiles over time of bacteria in a control environment with those of the same bacteria in a drug-treated environment. The distinct populations will inevitably diverge in their chemical processes, both because of metabolic disaster triggered by a foreign substance and as a potential compensatory mechanism.
Metabolomics also plays an increasingly important role in the diagnostic world, both for communicable and non-communicable diseases. Researchers are asking questions about metabolic processes that might be altered within a host, or characteristic pathogen-related metabolites that could indicate an infection. They are also asking about potential molecular predictors of a disease’s severity or state of progression.
For instance, dengue fever is a major global health concern that presents with a range of symptoms, from minor illness to organ failure. The ability of doctors to predict disease outcome on a patient-by-patient basis would aid in on-site triaging of patients in the clinic. Scientists at Colorado State University have investigated just that. In a study published last year, the researchers characterized the serum metabolome of dengue patients who experienced different disease outcomes. In this retrospective analysis of blood samples, the researchers could differentiate between different outcomes (for instance, between dengue fever and dengue shock syndrome, as well as between patients who progressed from one state to the other or ultimately succumbed to the infection) by way of different metabolites. Altogether, this study provided a proof-of-concept that metabolomics may serve as a predictive tool for infection outcome, which will help inform individualized treatment plans. Metabolomics is considered the most dynamic level of biological regulation (more so than protein modification, and much more so than any genome-level alterations), and as such offers a very convenient real-time method of physiological analysis.
Many times, large-scale biological projects such as antibiotic discovery efforts and environmental research do not use a single omics approach. Instead, proteomics, metabolomics, and genomics are often used together as synergistic approaches to create a more unified perspective within a given biological question. The omics do not end here, though. In fact, new omics fields are constantly popping up, from neurogenomics (the study of genetic influences on the nervous system) to nutrigenomics (understanding the relationship between the human genome, nutrition, and health). This is the era of the omics, the quantification of biology to uncover untapped questions and patterns.