Today we’re going to talk to Eleazar Eskin. He’s a Professor of Computer Science and Human Genetics at UCLA, and one of the youngest people I know to become a Full Professor. In other words he’s really smart.
Eleazar, can you tell everyone your title and what you work on?
My name is Eleazar Eskin and I’m a professor of computer science and human genetics at UCLA. I work on understanding which genetic variants are involved in human disease.
Human DNA is a sequence 3 billion letters long; small differences in our sequences account for much of the variability between individuals, including differences in disease risk.
I study which specific letters, or specific differences, explain differences in disease. I do this using genome-wide association studies (GWAS): we take the genomes of thousands of individuals, some who have a disease and some who do not, and measure genetic variants in each of these individuals. We look for differences in the DNA sequence that are more common in individuals who have the disease than in individuals who do not.
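The comparison described above can be sketched for a single variant as a test on a 2x2 table of allele counts. This is only an illustration: the counts are made up, the function names are hypothetical, and a plain Pearson chi-square test is assumed here rather than whatever method the lab actually uses.

```python
import math

def allele_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square statistic for a 2x2 table of allele counts."""
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    total = case_alt + case_ref + control_alt + control_ref
    row_sums = [sum(row) for row in table]
    col_sums = [case_alt + control_alt, case_ref + control_ref]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected in this cell if the variant were unrelated to disease.
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2

def p_value_1df(chi2):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

# Hypothetical counts: the variant allele is seen 600 times among 2000 case
# chromosomes and 500 times among 2000 control chromosomes.
chi2 = allele_chi_square(600, 1400, 500, 1500)
print(chi2, p_value_1df(chi2))
```

A small p-value says the allele frequencies differ between the two groups more than chance alone would usually produce. It does not by itself say that the variant causes the disease.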
How do you do it? Do you gather DNA and see whether a specific sequence leads to disease in people or do you first find people with disease and then try to find the DNA sequences in them?
That’s a good question. Both ways. In a case-control design, we collect people from the same region and compare two sets: those who have the disease and those who do not. Another way is a technique called a population cohort, where you collect all individuals who were born in a specific year; that way you can measure many things about different individuals and then look at how genetics affects them.
We’re looking at correlations between diseases and genes. Just because you see a correlation between a genetic variant and a disease trait, you cannot be sure that the variant causes the disease, because there are several reasons why it might not. One is that maybe your sample is not big enough, and you’re just seeing a correlation by chance.
Typically, today, we look at studies of tens of thousands of individuals to avoid that possibility. There are also other reasons, like population structure: if you have a combination of different ethnicities in your sample, you might see a correlation with something that doesn’t cause the disease.
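A toy calculation can show how population structure produces a misleading correlation. All counts below are invented for illustration, and the helper name is hypothetical. Within each population the variant has no relationship to disease (odds ratio of 1), yet pooling the two populations makes the variant look like a risk factor, simply because one population has both more carriers and more disease.

```python
def odds_ratio(case_carriers, case_noncarriers, control_carriers, control_noncarriers):
    """Odds of carrying the variant among cases divided by the odds among controls."""
    return (case_carriers * control_noncarriers) / (case_noncarriers * control_carriers)

# Hypothetical counts, in the order:
# (case carriers, case non-carriers, control carriers, control non-carriers)
population_a = (80, 20, 320, 80)    # variant common (80%), disease common (20%)
population_b = (10, 40, 190, 760)   # variant rare (20%), disease rare (5%)

# Pool the two populations by adding the counts cell by cell.
pooled = tuple(a + b for a, b in zip(population_a, population_b))

print(odds_ratio(*population_a))  # 1.0, no association inside population A
print(odds_ratio(*population_b))  # 1.0, no association inside population B
print(odds_ratio(*pooled))        # about 2.47, an apparent association from mixing
```

Analyzing each ancestry group separately, or otherwise adjusting for ancestry, removes this kind of artifact in the toy example.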
My lab has worked on this aspect a lot: understanding when correlations are valid and when they are not, and developing statistical methods to help determine that.
These projects are huge projects involving hundreds of researchers and everyone does their own little part. My group is focused on the statistical methods that are used later to analyze the data.
We either pair up with researchers who have data or those who are collecting this data. We are a computational lab so our primary focus and our research is to develop computational techniques that will be used in analyzing the data.
I can give an example. One of the things that we do is try to understand how to take 3 billion data points and process and analyze that data.
Because you’re looking all over the genome, at so many different places, your chances of detecting false positives increase. One thing we’ve worked on is figuring out how to control those false positives when you test so many different locations in the genome.
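One standard way to handle this many tests is the Bonferroni correction, which divides the significance level by the number of tests. The interview does not say which correction the lab uses, so this is a sketch of the general idea only; the variant names and p-values are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance cutoff that keeps the overall false-positive rate near alpha."""
    return alpha / n_tests

def significant_variants(p_values, alpha=0.05):
    """Return names of variants whose p-value passes the Bonferroni cutoff."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [name for name, p in p_values.items() if p < threshold]

# With one million tests, a naive 0.05 cutoff would flag roughly
# 0.05 * 1,000,000 = 50,000 variants by chance alone.
print(bonferroni_threshold(0.05, 1_000_000))

# Hypothetical p-values for four variants; the cutoff here is 0.05 / 4 = 0.0125.
p_values = {"variant_1": 0.03, "variant_2": 0.001, "variant_3": 0.2, "variant_4": 0.012}
print(significant_variants(p_values))
```

Bonferroni is deliberately conservative: it treats every test as independent, which nearby positions in the genome usually are not.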
How do you account for the epigenetic effect – environmental vs genetic contribution?
Our software actually accounts for this. We can calculate how much the environment contributes to risk. You can think of it this way: based on your genetics, you get a genetic score. Based on your environment (your lifestyle and so on), you get an environmental score. If the sum of those two scores is greater than a threshold value, you will have that disease.
If your genetics gives you a 30 and your environment gives you an 80, and the threshold is 100 for, say, diabetes, then your total of 110 is over the threshold and you will have that disease. If you have a higher genetic score, then you have to be much more careful about controlling your environment: your diet, lifestyle, etc.
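The threshold picture in this answer can be written out directly. This is a minimal sketch of the simplified model as described in the interview, not the lab's software; the function names are hypothetical and the scores are the ones from the example.

```python
def has_disease(genetic_score, environmental_score, threshold):
    """Simplified threshold model: disease occurs when the summed score exceeds the threshold."""
    return genetic_score + environmental_score > threshold

def max_safe_environment(genetic_score, threshold):
    """Largest environmental score that still keeps the total at or below the threshold."""
    return threshold - genetic_score

print(has_disease(30, 80, 100))        # True: 30 + 80 = 110 exceeds 100
print(has_disease(30, 60, 100))        # False: 30 + 60 = 90 stays under 100
print(max_safe_environment(30, 100))   # 70
print(max_safe_environment(60, 100))   # 40: a higher genetic score leaves less room
```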
It’s always both environment and genetics. Using genetic data you can control for the environment. That is the basic idea behind twin studies: there are fraternal and identical twins. Fraternal twins are just like siblings; on average, they share 50% of their DNA. Identical twins are 100% identical. So to determine how much of a disease is genetic, we look at the concordance rate: if one twin has a disease, what is the probability that the other one has that same disease? For example, if the concordance rate for diabetes is 40%, then when one twin has diabetes the other has a 40% chance of having it as well. If the disease is not genetic at all, then the identical and fraternal concordance rates will be the same. So that’s one way we can measure how genetic a trait is.
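The concordance comparison can be sketched with a few lines of arithmetic. The pair counts are invented, the function name is hypothetical, and the code assumes the probandwise definition of concordance (the chance that the co-twin of an affected twin is also affected), which matches the question posed above.

```python
def probandwise_concordance(concordant_pairs, discordant_pairs):
    """Probability that the co-twin of an affected twin is also affected.

    concordant_pairs: pairs where both twins have the disease
    discordant_pairs: pairs where exactly one twin has the disease
    """
    affected_with_affected_cotwin = 2 * concordant_pairs
    all_affected_twins = 2 * concordant_pairs + discordant_pairs
    return affected_with_affected_cotwin / all_affected_twins

# Hypothetical counts of twin pairs with at least one affected twin.
identical = probandwise_concordance(concordant_pairs=40, discordant_pairs=120)
fraternal = probandwise_concordance(concordant_pairs=10, discordant_pairs=180)

print(identical, fraternal)
# A clearly higher rate in identical twins points to a genetic contribution;
# equal rates would suggest the trait is not genetic.
print(identical > fraternal)
```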
That’s what we do in our lab. Twins are hard to come by in general so we’ve figured out a way to try to estimate how genetic something is based on genetic data in individuals.
With things like health and medicine, what we used to think was healthy for us has changed over time. Take chocolate or Vitamin E or nutrition where we think they’re healthy or unhealthy and then change our minds from later studies. We’ve relearned a lot of what we thought we knew in medicine. How similar is that here? Do we know the research is accurate or will we find we’re wrong in the future? Do we have large enough samples that we know this concordance rate is accurate? Are we relearning things in genetics that we thought we knew but didn’t?
Remarkably, the estimates of how genetic a disease is haven’t changed much. The older studies that use twins and the newer studies that collect DNA give estimates that don’t differ substantially. Having all this genetic information, we can now look and see what is causing those similarities between identical twins. Previously we knew that disease traits had a substantial genetic component, but before twin studies we didn’t know to what extent.
With mouse data, we see that individuals oftentimes have a very different response to their environment. For example, we did a study where we took different strains of mice and gave them high-fat diets. We compared their body weights under a high-fat diet to a normal diet. Different strains have different genetics. We found that some strains become obese under a high-fat diet. There were other strains whose body weight does not change even with a high-fat diet. So if you look at two different strains with two different responses, they are in the exact same environment, so there must be something in their genetics that is affecting their response to their diet.
With vitamin D or even chocolate, there are genetic variants that may make some people more predisposed to getting benefits from vitamins, or that affect how they metabolize food.
In behavioral science research, we think of strains as different gender or race. Is that an accurate parallel – if we’re finding differences in strains of rats, then in humans we might find differences based on their race or their gender? Is that the right way to look at humans, or do you find even smaller genetic group differences like among groups of African Americans, differences that are not at the phenotypic level?
It’s not a good analogy at all. Ethnicity is more like ancestry. The human population has a very complex ancestral origin, and its genetics are shaped by this. What people don’t realize is that in Africa, there’s substantially more genetic variation than in the rest of the world. That is shown through population studies.
When we think about testing new pharmaceuticals and medications, we’re starting to study medication response differences by ethnicity, race, and gender. Maybe that’s not the right way of studying things. Should we look at ancestry instead?
Eventually, the real hope is that at some point we will know the genetic variant that influences the effectiveness of a given drug. We would just check whether an individual has that variant and what their drug response profile will be. That will be more accurate than any correlation with ancestry. Our ability to measure these drug response variants has increased, so it’s not far off.
How can people use this knowledge to reduce their disease risk?
Where genetics makes a big difference is in drug response. There are some diseases, like infectious diseases, that are highly environmental. Still, there is a genetic component to how well your body fights the infection. Even with a disease that you would think has a low genetic component, research into genetic variants is still very valuable. These genetic studies will give us good ideas for developing therapeutics.
Is there any research that’s done on not just outcome of disease – but on behavior? Do genetics predict who is going to be having sex more often than other people, or what drugs people are going to choose to take?
There is a lot of research in this area. There is a lot of interest related to psychiatric diseases. Neuroscience and neurogenetics is a huge field; one study we’re involved with now looks at the sounds males make when they are exposed to female urine. We have data from animals and recordings of their sounds. We want to understand what the relationship is and how epigenetics affects that response. There are a lot of studies where we’re trying to understand genetics and learning, or genetics and anxiety. There is a lot of interesting research going on in behavioral genetics.
Any take-home points on how the work that you’re doing, and work in this area, can be applied in people’s lives?
Soon, we’re going to start seeing genetics play a larger role in how you are treated by your doctor: exactly what drugs you’re given or what conditions you’re tested for. That will really happen within the next few years.