Raw notes – I’ll try to do a proofread and get pictures in sometime this week.
Katherine Borges opened the ISOGG meeting with a welcome and review of the history of ISOGG since it was founded in 2005. In 2018, ISOGG spent a lot of time redesigning the website to make it GDPR compliant. James Irvine wrote up the guidance to put on the website and a group of volunteers helped. Katherine talked about the history of JoGG.
Katherine announced Debbie Parker Wayne’s new book, Advanced Genetic Genealogy: Techniques and Case Studies, which is available on Amazon now. It includes chapters by Jim Bartlett, Blaine Bettinger, Melissa Johnson, Kimberly Powell, Ann Turner, Patti Hobbs, Judy Russell, Michael Lacopo, Debbie Kennett, and others.
Peter Sjolund talked about the conference in Sweden. The Swedish are very much into DNA testing. Peter told the story of two people who took DNA tests and appeared as siblings. They had never known their biological family, and they met two months later. This was last February or March and in November, they got another half sibling in the US! The new sibling knew her parents, so the first two met their mother. Peter shared the I1-L1302 tree of the Big Y results of men who had a common ancestor that lived in the 3rd-4th century AD.
INSERT PICTURES OF I1-L1302
Bennett called Dr. Connie Bormans to the stage. Connie has been with FTDNA since about 2006. Max and Bennett accidentally overbooked their travel and must leave at 3:30 today. Connie Bormans will be doing the lab tours tomorrow. They are 50-minute slots.
What’s Taking So Long??? The Life Cycle of a DNA Sample
Connie reviewed the stages of a DNA sample. It starts in preanalytic, which includes sample collection and accessioning. Then it goes to analytic for DNA extraction, processing of the test, and quality control. After that it goes to postanalytic for data analysis, quality control, and troubleshooting. An example of QC would be that on an upgrade they rerun something they ran initially and check to make sure the values match.
In Stage 1, Preanalytic, the old way to do sample accessioning was a manual scan and rack before extraction, but now there is an automatic sorter that scans and racks. This allows them to process a lot of samples in less time. For sample collection and extraction, they used to have to pull out plates. They don’t need to do that step any more because they are extracting by test type. DNA is aliquoted and sent to lab teams for processing prior to storage. The new storage system holds 2 million samples and allows faster retrieval. This process gives a time savings of 1-2 days. A few years ago they moved to a new swab and tube configuration with a flocked brush vs. filter paper. The filter paper would swell in the lysis buffer and soak up more and more of that liquid. They couldn’t automate the first step to take some of the liquid out because a lot of times there wasn’t enough space. A technician had to manually pull things out and little pieces of filter paper were getting stuck in the tip. The new brush does not swell at all. The tube is a little wider. This small change was huge because now the samples can be put on the deck of a robot and the robot can put the tips in, take out an aliquot, and transfer it to a new plate to begin the extraction process. The amount of effort for technicians was huge and painful for their hands. Now the only thing they have to do is move the plate from point A to point B.
Stage 2 includes an additional quality control step. Buccal swabs contain human and non-human DNA. The ratio can differ between individuals and affect the quality of test results. They now have a way to determine the amount of human DNA to help with balancing of samples in the downstream process. If one sample is 50% human DNA and another is 25% human DNA and you take the same amount from each, one will have twice as much human DNA as the other. This was affecting some of the results. By determining the amount of human DNA, they can adjust and balance the samples. This will help them with throughput.
Stage 3 was a change in personnel. People used to be in one big group with tasks. They got so big and the number of samples increased so much that the model was not working. They have now developed teams and they’re divided by test type. They are headed by team leaders who have knowledge to run their teams through the pipeline. This allows fast and efficient processing because they are experts on their own pipelines with the ability to manage and troubleshoot. They have the ability to run and make decisions. Right now there are eight teams with about 46 team members. There are about 6-8 people on each team with very specific functions on each team. It’s very efficient. They also use these teams to provide feedback to make changes to be more efficient. For example, machines Step A and Step B are across the lab. Can we move them together?
The next changes are Robotics and Automation. When they were not fully automated, they could do about 200 samples in one hour. They have now doubled it to 400 samples in one hour on one machine, and they have multiple machines. Also with the arrays, they initially could run 576 samples in a batch. Then they started growing and got more and more tests and it wasn’t great. They reconfigured some of the robots and now they can run 96 chips, a total of 2,304 samples in one batch in the same amount of time or even less. That’s a huge increase and it’s on one machine so if they need to scale up, they can just get more machines.
For the new Big Y test, they looked at the entire Big Y process in the lab and completely redesigned the process to make it more efficient and more automated. It has gone to full automation. The pass rate has increased to ~95%. What they’ve seen with all of the tests is that by implementing these changes, they’ve been able to decrease the turnaround time for tests. Their goal is to bring the turnaround time down for all of them.
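The balancing idea in Stage 2 can be illustrated with a small calculation. This is only a sketch under stated assumptions, not FTDNA's actual procedure: the function name, the total concentration of 10 ng/µL, and the 20 ng target are all hypothetical values chosen to make the 50% vs. 25% example concrete.

```python
def volume_for_target(total_conc_ng_per_ul, human_fraction, target_human_ng):
    """Return the volume (in µL) of extract that delivers target_human_ng of human DNA.

    All inputs are hypothetical illustration values, not lab parameters.
    """
    if not 0 < human_fraction <= 1:
        raise ValueError("human_fraction must be in (0, 1]")
    human_conc = total_conc_ng_per_ul * human_fraction  # ng of human DNA per µL
    return target_human_ng / human_conc


# Two samples with the same total DNA concentration but different human fractions.
samples = {"A": 0.50, "B": 0.25}
volumes = {
    name: volume_for_target(10.0, fraction, 20.0)
    for name, fraction in samples.items()
}
print(volumes)  # sample B needs twice the volume of sample A
```

Taking the same volume from both samples would give sample A twice as much human DNA as sample B; scaling the volume by the measured human fraction evens them out.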
Automation allows them to easily scale production as needed. Their goal is to use technicians to do high level tasks and prevent burnout, reduce risk of error, and increase the number of samples that can be processed with the same amount of people.
Connie acknowledged all of the people who work in the lab.
INSERT ACKNOWLEDGEMENTS PHOTO
Q&A
What is the variance in the human/non-human DNA approximately? It varies. The range would probably be 10-50% with most being human DNA. The desired threshold is 100% but you’re not going to get that from a buccal swab.
What would the recommendations for testers be to ensure greater quality? Connie thinks they’re in there already. Don’t do it first thing in the morning. There are no real guidelines other than what’s put in the instructions with the kit.
What is slowing down the Y-700 testing and what is being done to speed it up? Connie said with the new Big Y 700 process, they looked at the entire process in the lab start to finish. They have a brand new team who is running the test in the lab and have been completely retrained. They had their director of R&D look at the process and put every step possible onto automation. That is streamlining the entire process start to finish.
How much longer will you be using the chip? Illumina has put them onto the Global Screening [not sure I got this right]. They are finishing up the differences and two pipelines. Because people transfer, they have to have a comparison between all the different files and compare to OmniExpress and now to the new GSA (Global Screening Array) product. It’s a lot more for the IT side than the lab side.
The January announcement said Big Y would be completed in 2-4 weeks. When do you expect to reach this? Bennett said 6-8 weeks out. They’ve cut the open queue in half with the new product and it has much higher pass rates.
Are you starting the new Big Y test via whole genome sequencing? It is a Y-chromosome enriched assay.
Slow vs busy season? Busy November- clear backlog. Slow is rest of year. From sample return standpoint, into slow season and finishing deluge from Christmas and then it’s slow to Nov.
Can you provide lab photos for presentations? Yes, contact Janine.
If we tested in 2010 on Affymetrix, should we test again for more SNPs and resolution? When they switched from Affymetrix to Illumina, they reran every sample that had been on Affymetrix. There’s no reason to.
What do you do with the samples that have been all used up? Can previously viable be enhanced? If it’s gone it’s gone but if it was unviable, that’s a question they’re asking themselves. Right now, can’t, but still thinking about it. They like to play with that in the lab when they have time. They have the samples just in case.
Is there cross training between teams? Yes, people have to be able to go on vacation. Primary and secondary functions.
Is it okay to use old kits? Absolutely. Kits have been used and stored at room temperature for almost 20 years, since 2000. They don’t throw away samples. When Ancestry started selling tests in 2002, they said they threw away samples. FTDNA took the opposite approach. There is no concern about a kit of any age because the buffer is designed to keep DNA nice and happy for years and years in the tube, Bennett said.
When a kit is delayed and notice is posted, after how many occurrences does someone intervene? Right now if a kit fails twice, it will trigger an action by someone in the Data Analysis team. There is someone whose primary function is to serve as a liaison between lab and customer service. She will implement a request to send a new kit or something along those lines. After 2, there are some that fall through the cracks but when those are raised, a high level person looks to see what is going on. Their goal is to stop having samples triggered. If it fails after two tries, someone from the lab will speak to them about sample collection.
How will your automation process hold samples [I couldn’t hear]? It will be done by hand. They are still getting the old tubes. Sometimes they get kits they sold ten years ago. The kits come back and they honor them and they are not expiring and they work so it’s all good.
What is the oldest sample you have run a test on? How many times can you test a sample? They have tested decades old samples. The DNA from a single extraction is enough to run every test they offer ten times. They’ve increased amount they get from a single extraction. It’s usually quality that fails.
Did the increase in use of more effective enzymes reduce cost? Sometimes. Sometimes it’s better faster cheaper and sometimes just better faster.
Will you be processing artifacts envelopes etc in foreseeable future? Bennett said he gets to make that decision with team and he will say no based on prior experience. About 10 years ago they did postage stamps and other things. Most of it didn’t work. It was hand work, meaning one person one sample and a success rate of 10 or 20 percent and they needed to charge a lot of money and it kind of made Bennett sick. Stamps, underpants, etc. did not work. Number of times of disappointment was more than someone was elated. They would like to but the technology has not gotten to the point that is effective.
How do you separate human and non-human DNA? We can’t. Just knowing how much is in there allows them to adjust how much they put in the test.
How long does a sample last for upgrades? FTDNA said 25 years but some have learned it’s not true and person has died? Their pledge was to hold the sample for 25 years. Not every sample is scraped well enough that it’s going to be there in 25 years. They did a study in 2014 that looked back at samples as old as 11 years old. On an upgrade sale, they ran it for what they ran it in 2004 and 2005 and one of those 96 samples failed to extract and 95 matched the prior results. That’s the only benchmark to share that makes sense.
How do you know what is human vs non-human? They use a human gene and see how much amplifies. It’s a quantitative PCR. Someone scraped their dog 10 or 12 years ago. They didn’t realize someone would spend $100 to do that.
I paid for a kit with old style swab. Should I replace it? Those will work just fine.
Street address for lab? 1445 N Loop West, Houston, TX 77008. Go to Suite 706
For older samples, how long before they start degrading? What about human-non-human? Don’t worry. We are just trying to further reduce reruns. Don’t read into it too much.
Is it still recommended that a fresh sample be used for a Big Y test? It works better. There is a higher success rate the first time when the sample is fresher and NGS is not as forgiving as Sanger was but they are doing things with Next Gen that you can’t do with Sanger. Sometimes you will see a sample delayed because they’re trying to get the sample they’ve got to work and they’re running it a second or third time.
Have you still retained original NatGeo tests? Can they be upgraded? When they transferred all the samples from U of Arizona to Houston, anyone who had transferred from Nat Geo to FTDNA, those samples were saved in storage. For those who had not transferred, they did not have the legal authority from Nat Geo to save those samples. There was no argument to keep those. They created a script to determine which had transferred and those were preserved.
Original sample from 2005, Y-37 upgraded to 67 to 111. Can you use the remainder for FF now? Possible. With the original DNA from 2005, most likely no, but they can check if there is a spare vial and they would pull it and run it.
Is there a method for coordinating rides to the lab? Talk to Fernando.
Purchasing 5 big y and two received and other 3 have not? Come talk to Connie.
Can you generate an email when new samples are needed?
When a second tube is pulled, can you generate an email to ask if they can submit another kit?
They will save those two suggestions.
Five of my first 6 YDA kits came in with red marker at 17? See Connie
Bennett introduced Dr. Caleb Davis who joined the team in 2017. He’s been working on Big Y. Today he will introduce Big Y-700. It can reveal the paternal ancestry even if the men in your family have Big Ys. They went back to the drawing board on it for a couple of reasons. In the five years since the release of Big Y, they moved over from the previous genome reference assembly and there’s new parts of the genome that were not there. Their knowledge of the Y tree itself has really grown and they had a lot of opportunities to focus on regions where they were already doing well and regions where they could do better.
Y-DNA has 23,600,000 nucleotides. Caleb demonstrated how to build a tree out of the SNP differences. Start with the DNA letters at each position that differs between a man and any other men.
HOW TO BUILD A TREE OUT OF DIFFERENCES PHOTO
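The idea of building a tree out of SNP differences can be sketched in a few lines. This is a simplified illustration, not the method FTDNA uses: it assumes every SNP arose exactly once, that there are no missing calls, and that the men who carry each SNP are already known. The SNP names (S1 to S4) and the men (Al, Bo, Cy, Dan) are hypothetical.

```python
def build_tree(carriers):
    """Map each SNP to its parent SNP, based on nested sets of carriers.

    carriers: dict of SNP name -> set of men who carry the derived allele.
    A SNP's parent is the SNP with the smallest carrier set that strictly
    contains its own carriers. Top-level SNPs get None.
    SNPs with identical carrier sets (equivalents) are not merged here.
    """
    parents = {}
    for snp, men in carriers.items():
        supersets = [
            (len(other_men), other)
            for other, other_men in carriers.items()
            if men < other_men  # strict subset test
        ]
        parents[snp] = min(supersets)[1] if supersets else None
    return parents


# Hypothetical data: which men carry the derived allele of each SNP.
carriers = {
    "S1": {"Al", "Bo", "Cy", "Dan"},
    "S2": {"Al", "Bo"},
    "S3": {"Cy"},
    "S4": {"Al"},
}
parents = build_tree(carriers)
print(parents)
```

Here S1 is shared by everyone and sits at the top, S2 and S3 branch under it, and S4, carried only by Al, is a tip under S2. A SNP seen in just one man, like S3 or S4, corresponds to what the notes later call a private variant.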
The point for Big Y is to get as much information on chromosome Y as we can. Not all nucleotides are created equal. Big Y-700 has a 50% increase in high quality SNPs over Big Y-500.
BIG Y-700 HAS 50% INCREASE PHOTO
SNPs should not just accumulate in some regions and drop out in other regions. Looking at the regions between 200 and 1000 SNPs, each tick on the X-axis is 100,000 nucleotides and he just binned the results together. In total, about 57% of the SNPs were called identically between Big Y 700 and Big Y. About 37% were found only in Big Y 700. Only about 5% were found in Big Y but not Big Y 700. In total, they expect to get 50% more high quality SNPs in Big Y 700 than in Big Y. He thought that couldn’t be right so he went back to the example and tried to figure out what was going on.
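As a rough consistency check, the three percentages quoted above can be combined to see whether they agree with the "50% more" figure. This is back-of-the-envelope arithmetic on the rounded numbers from the talk (which sum to 99%, presumably because of rounding), not a calculation Caleb presented.

```python
shared = 57       # % of SNPs called identically by both tests
only_700 = 37     # % found only in Big Y 700
only_big_y = 5    # % found only in the original Big Y

total_700 = shared + only_700      # share of all SNPs seen by Big Y 700
total_big_y = shared + only_big_y  # share of all SNPs seen by the original Big Y

increase_pct = (total_700 / total_big_y - 1) * 100
print(f"Big Y 700 finds about {increase_pct:.0f}% more SNPs than Big Y")
```

The result lands a little above 50%, so the quoted percentages and the headline increase are consistent with each other.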
REAL FAMILY TREE EXAMPLE WITH CIRCLES PHOTO
S5 and S6 are more closely related to each other than they are to ZA1724. They knew from genealogical data that S5 and S6 were closer but only with Big Y 700 could they identify that. In the Big Y, they used 16 SNPs. In Big Y 700, they found 8 more, or 50%.
If we zoom out a bit, we can see where it falls in the context of the Y-Tree.
Y-TREE VISUALIZED FOR ALL OF HG J PHOTO WITH RED ARROW
Caleb showed a model that incorporates the other haplogroups. [Wow!]
Q&A
On your 16 SNP tree could you have added resolution with STRs? STRs change at a much higher rate from generation to generation so they’re not as reliable. If you’re trying to tell the difference at the very end, all the leaves on the tree, you may be able to find a consistent pattern but the patterns you pick may not tell you the right answer. If you go a little further back, you will need to use a lot of STRs. Caleb focused on SNPs for this example.
Can the tree that you showed at the end of the presentation be shared? Caleb said it can. It’s in Gephi and he can export it and it’s easy to share. He has some work to do to make it more consumable. On the slide there were 50 or 60 different squares because he had to add it manually. Audience members said they would like to have “you are here.”
How are our new unnamed variants moved up the tree to where they belong? Caleb talks to Michael Sager a lot about this. The last time he checked, they had about 30,000 new SNPs to place on the Y-tree, all FT high quality new SNPs. He is working so fast and going as fast as he possibly can. It’s coming.
What are private variants in the Y-tree? The SNPs that are solely observed in one person in the database.
Is the tree hand drawn manually? Gephi took care of the vast majority of the work but it’s not perfect. You have to move branches from one side of the visualization to the other because it’s not perfect.
How are the extra 200 STRs used in genealogical research? They are used how every STR has been used. With enough together, you will have enough confidence in the relations you find.
In my Y-500 about a dozen markers did not return a value. If I upgrade, will it be reprocessed? There’s two ways that can go. There are 5% that were in Big Y and they’re not. There were 88 in the reference sample and about 5% were just Big Y and not 700 but the rest of them are all included. It’s not a wash. If vital information was lost, they can endeavor to fix that. Path from one version to the next will hopefully be going on for a while.
What coordination is there between FTDNA and YFull? There isn’t any. There’s no collusion with the Russians. They don’t have conversations with them.
If a man has only one Y-match, will they learn anything from Y-700 vs. Y-500? “If I had only one match, I’d be looking for more men to take Y-37.” Bennett
Will more individual SNP tests be made available for those who have done the old Big Y to check them so they don’t have to redo it? That is under consideration. They need to select the right technology to do something. It is important to measure twice and cut once. They are looking at using a different technology that will make it easier to quickly add discrete SNPs rather than rebuild an entire panel, which is painful.
What benefit is the Y700 to an individual? Get me a persuasive argument to get my group to do it? You’re going to find new relationships that you never knew about. Caleb is not sure he understands the question.
There are more questions he is thinking about and work.
What is the typical age of the newest mutation at the tips of the tree? They are working on that and intend to put on the block tree a time scale. Bennett doesn’t know when but it’s an active project at this point.
The benefit of Big Y to an individual? Bennett said Caleb’s example is in Big Y 700 white paper on blog.familytreedna.com and you can read it.
If you don’t match anyone, then there is not a lot of benefit because you will be in genetic isolation. If you have matches, the goal is try to definitively determine who your closest relative is who doesn’t (or could) share your surname.
What is FTDNA doing to increase the number of testers from Poland, Scotland, Ireland? They go to shows in Scotland and Ireland and England when they have them. That face in front of people works best. If they can keep out of the press with bad news it won’t hurt.
What is the ETA for upgrades? If you have taken Big Y 500 and you have those results back, you will be able to order a Big Y upgrade from the current product to the product with 50% more SNPs now until the end of this month for $179. Generally they have conferences in November and have a Christmas sale, but they’ve decided to put the shopping cart on sale until the end of this month, March 31st. There will be a sale at the end of the month for attendees and members of the attendees’ projects.
Can we upgrade to Y-700 from Y-500 no matter how long ago it was done? Yes, as long as they have DNA. It’s new chemistry and new design.
Bennett introduced Roberta Estes to present Beyond Pie Charts: Using Y and Mitochondrial DNA Testing to Solve Genealogical Puzzles. Roberta tested her mitochondrial DNA in 1999 and got haplogroup J at Oxford Ancestors. She heard about Y DNA from a cousin. She called and a man named Bennett Greenspan called back and said if you get 5 Estes men, I’ll make a project for you. “Don’t worry. I’ll help you.” Here we are 19 years later.
Roberta has a love-hate relationship with pie charts. A lot of genealogical information is hidden in the pie charts but people who only look at ethnicity information will never see it. With autosomal DNA, we see roughly 6 generations.
OPPORTUNITY PHOTO
Of the 64 ancestors in a 6-generation chart, 63 have mitochondrial DNA that you don’t carry. 31 or 32 have Y-DNA that you don’t carry.
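The counts for a six-generation chart follow from simple arithmetic. The sketch below assumes every ancestor in that generation is a distinct person (no pedigree collapse); the function name is just for illustration. You carry the mitochondrial DNA of only one ancestor in the generation (your direct maternal line), and a male tester carries the Y-DNA of only one (his direct paternal line), while a female tester carries none.

```python
def ancestor_counts(generations=6, tester_is_male=True):
    """Return (ancestors, mtDNA lines not carried, Y-DNA lines not carried)."""
    ancestors = 2 ** generations       # ancestors in that generation
    male_ancestors = ancestors // 2    # half of them are men with Y-DNA
    mt_not_carried = ancestors - 1     # all but the direct maternal line
    y_carried = 1 if tester_is_male else 0
    y_not_carried = male_ancestors - y_carried
    return ancestors, mt_not_carried, y_not_carried


print(ancestor_counts(6, tester_is_male=True))
print(ancestor_counts(6, tester_is_male=False))
```

This is the gap Roberta's scholarships are meant to fill: each of those lines can only be recovered by finding and testing an appropriate descendant.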
What? Why? When? Who? Where? How? Reasonably exhaustive search. Due Diligence.
Why Y & Mitochondrial DNA? It reaches far beyond the 5-6 generation threshold. They can be very specific for each individual line and provides focus. You don’t know what you don’t know. Roberta offers scholarships for the first person in each line. They get access and the agreement is that she also gets access. You can’t tell where your ethnicity came from when you look at an ethnicity estimate. These may shed light on it.
Roberta shared that Marie Rundquist didn’t know her mitochondrial line was Native American. Marie wrote a book, Revisiting Anne Marie. Roberta has information on her website about which subclades of A, B, C, and X are Native.
Roberta shared a case study and encouraged people to share about their family to find cousins. A cousin worked with Roberta on a case. She did an analysis of all of the families that migrated from one place to another and helped to determine the identity of the ancestor.
Roberta reminded people to use all of the information that is provided on the DNA testing website.
Roberta suggested to use Google Maps to plot your matches and it’s free.
She shared about her brother David, who she found in 2004. She hunted for him for years and hired a private investigator who found him in 20 minutes.
Y-DNA can help to show an invisible illegitimacy.
INVISIBLE ILLEGITIMACY PHOTO
Some suggestions for finding candidates include joining surname projects at FTDNA, mining autosomal matches, and mining Ancestry and MyHeritage trees for candidates in ThruLines, as well as Geni.com and WikiTree.
For questions, ask FTDNA customer support, project administrators, haplogroup project admins, and the Facebook FTDNA User Group. There are also 1100 free articles at her website www.dnaexplain.com. She writes two per week every week. On the weekend she writes about her ancestors and during the week, methodology. There is more help at DNAexplain because all articles are fully keyword searchable. She also has a Help link with a lot of information.
Q&A
The advanced matching to combine Y and autosomal where and how? On main page look under Y and Mitochondrial and it’s a small yellow link. You can select combinations. Select only the people who match.
Does your access to a scholarship kit include GEDmatch? They would have to upload themselves.
Is it in writing? She doesn’t make them sign anything. It’s on a friendly basis. The worst is that they decide she can’t see it. She has never had anybody renege on it.
Bennett asked the breakout sessions to check out with the AV folks.
The afternoon breakout DNA Standards for Genealogy: How, Who, Why, was presented by Debbie Parker Wayne and Patti Hobbs. The BCG had discussed if the current standards covered everything needed for DNA or whether they needed to be modified. They decided that they needed something specific to DNA in the standards and a committee was formed. New standards were adopted at the board meeting in December 2018 and were released at the beginning of this month.
Debbie reminded us that there are no genealogy or DNA police that will make sure you use these standards; however, genealogists and genetic genealogists have decided these are proven methods and you have to use them for your portfolio if you apply for certification. If you are a forensic genealogist, you might have to use them to testify in court.
Patti Hobbs stated that the standards were based on the work of genealogists over decades or even centuries. There are five aspects to the Genealogical Proof Standard. The first is thorough research, including all major record types. We also need to use citations, even if we don’t find anything. One standard requires giving a citation for any fact that is not general knowledge. Next, we analyze and correlate the information, resolve any conflicts, and write a soundly reasoned, logically written conclusion.
Special skills for DNA include math skills. Statistics and probabilities are used. Some basic genetics knowledge about DNA inheritance, recombination, etc. is also required. Debbie reviewed the DNA-specific new standards. A common question is how much DNA is needed. That depends on the case. Patti talked about having sufficient verifiable data. This may include Family Tree DNA Public Projects, GEDmatch kits, screen shots, or providing an editor with login credentials for verification purposes. Establishing proof of the common ancestral relationship requires documenting the lineages of the matching test takers. DNA has to be integrated with documentary evidence. A combination of DNA and documentary evidence supports a conclusion about a genetic relationship. Provide an analysis of all types of evidence.
Achieving genealogical proof is a package. It includes thorough searching in traditional records as well as DNA test results. Analysis and correlation requires the researcher to correlate traditional evidence, correlate DNA evidence, and then correlate traditional evidence with DNA evidence to show that those work together. Discuss weaknesses, conflicts and alternate explanations. Perhaps it would have been ideal to have Y-DNA but the line daughtered out. That would be a weakness that you would explain. You might also have alternate explanations. This might apply if there was intermarriage between families. Be sure you have addressed why that is not the one you’ve identified or confirmed.
Another idea that has been integrated is conclusions about genetic relationships and the need for accurate representation of genealogical conclusions. There is a distinction between saying there is a genetic relationship. There’s a difference of opinion about how important this aspect is.
Debbie returned to talk about “the grey area.” We definitely have to show respect for the privacy for our test takers. We want informed consent from the test takers when we want to use their DNA to support our arguments. Blaine has a beneficiary form that has to be notarized. He released it under a cc license, so Debbie modified it to have witnesses only. We need informed consent before we publish in a journal or a book or anywhere. Exactly what you need is up to you and whoever you’ll have to show the form to.
For living persons, explain what you want to share, whether it be total DNA, start and stop points of segments, how and where you plan to share it, etc. Explain the benefits and risks of sharing. We know we cannot guarantee anonymity. They should also know what the alternatives are for the privacy options. Perhaps they choose no sharing or to anonymize or privatize identifying info, and many people are fine with giving permission to do whatever you like.
The standards require written consent. In talking to lawyers, all agreed that email is written permission. If it’s oral, then write an email or letter confirming the understanding and send a copy to the test taker. Be sure to share only what the consent allows.
There has been a lot of discussion about deceased people. The lawyers have agreed that the deceased people have no privacy rights except what they get from HIPAA. All agreed that it would be best to contact the heirs and try to get permission. Also, you can anonymize the identity. Debbie got an obituary and then anonymized an identity. The problem is, if you have one of 20 anonymized, that would probably be ok.
What about public project data? Info on the Y-DNA public project data is out there because the person gave permission for it to be out there, and some think it will be fine to use it. Others think permission is required. This will continue to be outlined over the coming months.
If you want a clear answer on something from BCG, you must go to BCG.
What about people on GEDmatch? They have given permission. There are more people who say no, you should get permission. This is a grey area.
A modified existing standard says where appropriate, distinction among adoptive, foster, genetic, step, and other kinds of familial relationships. Some interpret that to mean that every relationship must have that qualifier. What Debbie might do to make sure she’s not showing she means it’s biological, “William’s will names John as his son.” This avoids using any of the words. Some people may be offended by saying what kind of relationship it is. Some feel that if you’re not using DNA, you’re not proving a biological relationship.
Genealogical charts and diagrams can help to describe the data. If you’re going to publish in a journal, they’re going to want finished and polished tables. Tom Jones describes that in his chapter about writing about DNA. Debbie demonstrated a chart she created with SmartArt.
In plain language, test the right descendants at the right company with the right test, select matches or target test others with the right test, analyze those results, integrate DNA and documentary evidence, logically sequence and correlate to clearly illustrate… ensure enough people have tested, support every parent-child link in the line from all test takers to the common ancestor, and where appropriate, identify the family relationships. Debbie would not use the term genetic link. She would not assume you have proven a biological link without DNA. Make the DNA data available for verification within the limits of test taker permissions. Publish or share only as much as the test taker’s permission allows.
For any questions on portfolios, see BCG.
For other discussions on Standards, see debsdelvings.blogspot.com, thegeneticgenealogist.com, legalgenealogist.com, and on Facebook on the Genetic Genealogy Tips and Techniques group.
Q&A
If you have matches on your match list, would you need permission and how would you get it? You can use any information for your analysis. It’s when you want to publish that you need to be concerned about permissions. On Ancestry, people may not answer you back. You may need to figure out who they are. If you want to publish with a name attached, you need permission. You could anonymize but if it’s critical to your proof of the exact lineage, that could hurt your argument.
The next speaker was Paul Maier presenting A Taste of Population Genetics: The Sampler Platter. Paul has worked at FTDNA for about 3-4 months. His goal is to give a sense of how they do what they do. Paul’s background is working on frogs and toads in Eastern California. It provides a general background on genetics for any species you work on. Most toads are unconcerned with privacy issues and not terribly sophisticated. Paul will provide an introduction to pop gen, who we are and where we come from, and methods of estimating population ancestry. Are common d
Two major insights were common ancestors and evolution by natural selection. He talked about discrete mutations vs. continuous traits. Paul talked about genetic drift, which decreases variation. Effective population size is the size of an idealized population that would experience the same amount of drift as the real one. There are 7.5 billion humans on earth but the effective population size is only about 10,000.
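To make the point about drift decreasing variation concrete, here is a minimal sketch of a Wright-Fisher style simulation. It is not from Paul's slides; the function names, the population of 20 gene copies, and the 2,000 generations are all assumptions chosen for illustration.

```python
import random


def simulate_drift(num_copies, start_freq, generations, seed=0):
    """Return the allele-frequency trajectory under pure drift (no selection, no gene flow)."""
    rng = random.Random(seed)
    freq = start_freq
    trajectory = [freq]
    for _ in range(generations):
        # Each gene copy in the next generation is drawn at random from the current pool.
        count = sum(1 for _ in range(num_copies) if rng.random() < freq)
        freq = count / num_copies
        trajectory.append(freq)
    return trajectory


def heterozygosity(freq):
    """Expected heterozygosity for a two-allele locus: a simple measure of variation."""
    return 2 * freq * (1 - freq)


# Hypothetical tiny population: 20 gene copies, allele starting at 50%.
small = simulate_drift(num_copies=20, start_freq=0.5, generations=2000)
print("final frequency:", small[-1])
print("final heterozygosity:", heterozygosity(small[-1]))
```

In a population this small the allele is eventually either lost or fixed, at which point the variation at that locus is gone. A larger effective population size slows the process down but does not change the direction.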
Gene flow is the opposite of drift. It restores variation in the populations. Natural selection can have many effects.
How do we study the forces of evolution? We can simulate forward in time or take the data we have and try to match it to go backwards in time. The backwards approach is called coalescent theory.
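As a small worked example of the backwards view, the standard coalescent result for a constant-size, randomly mating diploid population says that while k sampled lineages remain, the expected wait until two of them merge is 4N/(k(k-1)) generations. The sketch below just adds up those waits. It is textbook arithmetic rather than anything shown in the talk, and the function name is made up for illustration.

```python
def expected_tmrca_generations(sample_size, effective_size):
    """Expected generations back to the common ancestor of `sample_size` sampled gene copies.

    Assumes a constant-size, randomly mating diploid population (standard coalescent).
    """
    total = 0.0
    for k in range(sample_size, 1, -1):
        # While k lineages remain, the expected wait for the next merger
        # is 4N / (k * (k - 1)) generations.
        total += 4 * effective_size / (k * (k - 1))
    return total


# Two gene copies in a population with the effective size of about 10,000 quoted above.
two = expected_tmrca_generations(2, 10_000)
print(two)  # 20000.0
```

Under these assumptions, two copies of an autosomal stretch of DNA picked at random would be expected to trace back to a shared ancestor roughly 20,000 generations ago.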
Gene trees often do not reflect population history. There are two primary reasons: gene flow (introgression) and incomplete lineage sorting (ILS).
In 2015, one of Reich’s students found a connection between Native Americans in Brazil and Papuans. There may have been an older migration event into the Americas that was replaced by the first peoples. Another possibility is that there was a seafaring migration across the ocean.
Paul gave an example of recent admixture using the Globetrotter technique. It was used to estimate the date of the Bantu expansion, the Mongol invasion (estimated at about 1306 CE using the Hazara of Pakistan), and another event showing non-Eurasian ancestry in Eurasia.
http://Admixturemap.paintmychromosome.com [I don’t think this is right. Will check it out]
People often think of ancestry in terms of genealogies and Paul doesn’t think that’s right. Your genealogical ancestors double every generation, but your genetic ancestors increase at more of a linear rate. You don’t inherit DNA from all of your ancestors.
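A rough back-of-the-envelope comparison shows why the two counts diverge. The sketch below is not Paul's calculation. It assumes 22 autosomes and an average of about 33 crossovers per meiosis (a round figure used here purely for illustration), so the number of autosomal blocks you carry from generation g grows linearly, while the number of genealogical ancestors doubles. Each block can come from only one ancestor, so the block count caps the number of genetic ancestors.

```python
def genealogical_ancestors(g):
    """Number of ancestor slots g generations back (2 parents, 4 grandparents, ...)."""
    return 2 ** g


def approx_autosomal_blocks(g, autosomes=22, crossovers_per_meiosis=33):
    """Assumed approximation: blocks inherited from generation g, counting both parental sides."""
    return 2 * (autosomes + crossovers_per_meiosis * (g - 1))


def max_genetic_ancestors(g):
    """Upper bound on ancestors in generation g who can have contributed autosomal DNA."""
    return min(genealogical_ancestors(g), approx_autosomal_blocks(g))


for g in (1, 5, 10, 15):
    print(g, genealogical_ancestors(g), max_genetic_ancestors(g))
```

Under these assumptions, the two counts match for the first several generations. By ten generations back there are 1,024 ancestor slots but room for at most about 638 genetic ancestors, and the gap keeps widening from there.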
Q&A
How is new knowledge of epigenetics affecting genetics? We know that there’s much more transmission of epigenetic information than we used to think. Look at the Dutch Hunger Winter babies. They’ve had discussions about ChIP-seq to inform.
What is the estimated error of current results? Using admixture, you can typically look at the standard deviation. They don’t report results unless it is extremely small.
Genealogists use genealogical trees and genetic trees as coined by Dr. Blaine Bettinger. See Genetic Genealogy in Practice Chapter 2.
Next up was the product update by Chris Pace and Meagan Peters. It was Chris’ first time giving a presentation ever. Their job isn’t just product. They interact with shipping, lab, customer service, and any department you can think of. They try to deliver end user results that benefit the market as well as listen to concerns or needs and evaluate and digest them as much as possible.
They will review recent features, discuss features in development, and the future of FamilyTreeDNA specific to product.
For the longest time, they had basic packaging, which made it difficult to give a kit as a gift, and that is becoming more and more common. With the update, they included detailed instructions for individuals. Each kit, although similar in its delivery instructions, can be confusing for new users. They want to make sure samples are clean and tidy.
The next one is the Big Y Block Tree. That is a more recent, larger tool that they’ve delivered to customers. It’s one of the hot topics right now. They’ve talked to other FTDNA members and team members. The block tree, although a lot to digest, is an alternative view to the already well-known Y-DNA haplotree. The Y-tree gives a single axis whereas the block tree gives both axes. The horizontal axis represents branch splits and the vertical axis represents number of mutations or time. You can count the SNPs in each block and translate that into representations of time. They want to make that easier to digest and understand from a novice or advanced standpoint. It is not only an alternative view of the tree structure; it also displays match results. They also integrate user-submitted earliest known ancestor information and aggregated origins information. The layout is based on the Big Tree by Alexander R. Williamson.
Something that Meagan is excited about is community engagement and quality users have. They’re hoping that their blog, the FamilyTreeDNA Blog, will allow users to see what’s happening at FTDNA and how it can impact the user. They want to give insight into how tools and features can be utilized. It’s a casual way to explain why something is important or how to use something rather than the technical view the learning center has. They’re hoping to add posts such as genetic genealogy guidance, white papers, and to share success stories.
Another community tool is the public haplotrees. They can be shared with non-FTDNA members. It’s a modernized, easy-to-navigate layout. The public haplotrees have surnames, origin countries, variants, and positions. Any two people who are at a branch with the same surname would show up in the surname. Another thing that’s been requested is the positions.
Next they talked about some things they are working on that are near complete. The mtDNA videos are near complete. They are personalized views that give a deeper dive into the haplogroup route. They provide better relatability to matrilineal ancestors. Currently there is a small writeup about many haplogroups and where they’re found today. They’re hoping with the videos they can go deeper into the exact path.
They’ve been working really hard on Family Tree 2.0. They launched a beta for user testing. They wanted to fix known issues with the current family tree, which they gathered through feedback. They updated the architecture and want to have a better notification system. Many things couldn’t be fixed in the current application because of its architecture, so they wanted to take their time building this application so they can keep adding building blocks on top of it. That has taken the most time. As people are finding more relatives, building larger trees, and gathering more information, the GEDCOM files are getting huge in size. Sometimes they take a little longer to process. A notification process that lets you know where your file is in the process will help set them up for success on future implementations. One of the new features that will be implemented in 2.0 is the ability to export your GEDCOMs. They will add a geolocation feature. When you enter a location, they will use a system and feature to help geolocate it for you. You can definitely still enter just a string of data. There will also be a step-by-step onboarding experience. At this point, there are many trees that have been built with single nodes and it’s hard to tell why that is. One idea is that novice users may not know how to fill in that information. They want to make it very clear for that kind of user. If you are seasoned, upload a GEDCOM because you’ve already put in the work.
Future Plans included Family Tree enhancements. Building a very robust base layer architecture will set them up for success. Adding inferred haplogroups to unlinked tree relatives means when you start linking individuals on your tree who have kits and tests, they can trace further back and infer a haplogroup for you to have a pointed area to investigate. They hope to enhance suggested matches to the link. They want to be able to give a sound tool that will give better suggestions and maybe even why and not just a name or date match. Maybe it has to meet three sets of criteria to be recommended to you. That helps the community to help prevent false linking. Also, they want to allow the ability to have more than one family tree per account. Building a parking garage of trees within your account. Sharing with other matches and make it easier to share. Be able to put your tree to others as viewer or editor. You can help build it out for them or they can for you.
Family Finder features include an FF Matrix update. They want to add some value to it. They want to be able to overlay more information and continue to build on top of it. They are working on the chromosome browser heat map view. What they want to do is turn the current browser on its head and give users all the information up front so they have somewhere to start. This is hopefully going to be helpful to people trying to break into the industry and learn what they’re getting out of their tests. They recently did an update to the chromosome browser and it had both negative and positive feedback. They wanted to digest that feedback and also implement it. Hopefully in the near future they will have another iteration along with the heat map.
One thing that Meagan definitely wants to do is bring the GAP experience up to date a little bit more with a facelift. She hopes it will have more features and enhancements. That is something they want to look into and focus on in the coming year and years. Something that they’re going to strive to accomplish is to evaluate needs and wants effectively so that they can touch on something that is beneficial to the most people first and then go through everything else they want to do. They’re hoping to improve quality of life for everyone who uses the platform. Some ideas are a bulk email tool and member subgrouping. Meagan thinks those are very important in getting new members and communicating with existing members. They want to have strong avenues to get feedback. Right now, customer service gives feedback that they hear on the phone. Meagan looks at it every day. She wants to hear directly from users without a person in between. They want to start doing more surveys and have them be more effective to give them actionable items to work with. Also, beta testing. Like Chris mentioned, the new tree is in beta. They want to do that for as many different features as they have so they can incorporate feedback before they launch the new feature. They also want to have internal tools for understanding a user’s journey. It’s important to know how many different users utilize a feature.
Elliott presented the IT and Engineering Update. He will talk about stuff that has not yet made it to the product team. For Big Y, they’re still innovating. There have been some issues. On the matching side, the biggest challenge was bad regions. They undermined the original assumption behind the 30-mismatch limit, which was based on 150 years per mismatch. The problem is that bad regions could add 15 or even 20 polymorphic SNPs. They also have Y-STR matching that they recently added to the Big Y page. The plan is to do a re-run. They will take all original BAM files, block the regions they know to be bad, and reprocess all the samples.
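The arithmetic behind the bad-region problem can be sketched quickly. The 30-mismatch limit, the 150 years per mismatch, and the 15 to 20 extra SNPs are the figures quoted in the talk. The function names and the reading of those figures at face value are assumptions made for illustration.

```python
YEARS_PER_MISMATCH = 150  # figure quoted in the talk
MISMATCH_LIMIT = 30       # figure quoted in the talk


def implied_years(mismatches, years_per_mismatch=YEARS_PER_MISMATCH):
    """Time span implied by a number of mismatches, taking the quoted rate at face value."""
    return mismatches * years_per_mismatch


def mismatches_left_for_real_snps(bad_region_mismatches, limit=MISMATCH_LIMIT):
    """How much of the limit remains once spurious bad-region mismatches are counted."""
    return max(limit - bad_region_mismatches, 0)


print(implied_years(MISMATCH_LIMIT))  # 4500
for bad in (0, 15, 20):
    real = mismatches_left_for_real_snps(bad)
    print(bad, real, implied_years(real))
```

Read this way, a limit meant to reach back about 4,500 years shrinks to roughly 1,500 to 2,250 years when 15 to 20 of the allowed mismatches are used up by bad regions, which is why blocking those regions and reprocessing matters.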
Mapping and aligning- they will come out with a new version that will be completely different so they call it V2. They will rerun those and create new BAM files. The goal for matching is 6-7 weeks and for mapping and aligning, Q3 or Q4. They plan to update annually.
SNP Improvement – Age estimates. Now that they have identified those bad regions, they will have a more even understanding of how SNPs are transmitted generation after generation. The goal is to put those on the block tree to align for an age estimate. Instead of the 30-mismatch limit, they will pick an age estimate and use that. Over time they will revise it. The goal with the new aligner is also to add more detailed statistics, such as quality scores, on each of the calls. Each of those will be evaluated and produced on the website.
Y500 vs 700 – More variability than the previous group, about 1.5 times more. These have more consistent call rates. They will be reprocessing all samples that ran for Big Y for those additional markers and those will appear over the next few weeks as well.
They will also be working on tree matching. Identifying shared ancestors between DNA matches. Take into account Y DNA, X DNA, and mtDNA lines. Using traditional genealogical techniques (names, dates, and locations).
Graph database – Faster display of tree, distributed processing, enabling future feature updates. The graph database is how social media works. They store their data in graph databases because they work on relationships. The graph between those nodes tells you how someone is related. Saving in a graph database allows them to add a significant number of features in the future.
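To illustrate the idea that the path between nodes tells you how two people are related, here is a minimal in-memory sketch. It uses a plain Python dictionary rather than any real graph database, and the family, the names, and the function names are all hypothetical.

```python
from collections import deque


def build_graph(child_parent_pairs):
    """Store each child-parent link as an undirected edge in an adjacency map."""
    graph = {}
    for child, parent in child_parent_pairs:
        graph.setdefault(child, []).append(parent)
        graph.setdefault(parent, []).append(child)
    return graph


def relationship_path(graph, start, goal):
    """Breadth-first search for the shortest chain of parent-child links between two people."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in graph.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no connection found


# Hypothetical family: Ann and Bob are siblings; Cara and Dan are their children.
links = [("Ann", "Mary"), ("Ann", "John"), ("Bob", "Mary"), ("Bob", "John"),
         ("Cara", "Ann"), ("Dan", "Bob")]
graph = build_graph(links)
path = relationship_path(graph, "Cara", "Dan")
print(" - ".join(path))
```

The path from Cara to Dan runs up two links to a shared grandparent and down two links again, which is the shape of a first-cousin relationship. A graph database answers the same kind of question without joining tables generation by generation.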
Family Finder – they have been working on some licensing issues. NatGeo and their chip were the same. On it they had some Y and mtDNA mutations. Soon they will be processing all of those and providing that for those who have been run on the chip that had that data.
Matching and transfers updates – The amount of chips coming out in the last two years with variations has exploded. They want to enhance QC methods and update predictions and reduce the weight of small blocks. Some are useful for IBS but not useful for IBD. They will remove them from the calculations. They will expand the number of SNPs they use for the process.
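The effect of dropping small blocks from the calculation can be shown with a few lines. The segment sizes and the 6 cM cutoff below are hypothetical; the talk did not give a threshold, and this is only a sketch of the general idea, not FTDNA's matching code.

```python
def total_shared_cm(segments_cm, min_segment_cm=0.0):
    """Sum shared centimorgans, ignoring segments below the chosen minimum size."""
    return sum(cm for cm in segments_cm if cm >= min_segment_cm)


# Hypothetical match: two substantial segments plus several very small ones.
segments = [42.0, 18.5, 3.0, 2.0, 1.5, 1.0]

print(total_shared_cm(segments))                      # every block counted
print(total_shared_cm(segments, min_segment_cm=6.0))  # small blocks removed
```

With this made-up match, the total falls from 68.0 cM to 60.5 cM once the small blocks are excluded, which would shift any relationship prediction based on the total toward a slightly more distant estimate.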
MyOrigins v3 – Work in progress. Improved and increased number of references, more accurate predictions, more specific regions, machine learning.
Family Finder – Triangulation – they call it bucketing internally. It works only on maternal and paternal. If you go to your tree and link a node, behind the scene it will do bucketing. This is a new take. The more people you actually link, the more powerful it gets. They will be enhancing the method using a new strategy that he won’t talk about now.
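A rough sketch of how bucketing could work is below. The Q&A later in these notes says the current method uses ICW plus segment, so the sketch assigns a match to a side only when the match is in common with a linked relative and also shares an overlapping segment. Everything else (the data layout, names, positions, and function names) is hypothetical, and this is not FTDNA's actual algorithm.

```python
def overlaps(seg_a, seg_b, min_overlap=1):
    """True if two (chromosome, start, end) segments overlap by at least min_overlap bases."""
    chrom_a, start_a, end_a = seg_a
    chrom_b, start_b, end_b = seg_b
    if chrom_a != chrom_b:
        return False
    return min(end_a, end_b) - max(start_a, start_b) >= min_overlap


def bucket_match(match, linked_relatives):
    """Return 'paternal', 'maternal', 'both', or None for one match.

    All segments are the ones the tester shares with that person.
    """
    sides = set()
    for relative in linked_relatives:
        in_common = match["name"] in relative["icw"]
        shares_segment = any(
            overlaps(m, r) for m in match["segments"] for r in relative["segments"]
        )
        if in_common and shares_segment:
            sides.add(relative["side"])
    if sides == {"paternal", "maternal"}:
        return "both"
    return next(iter(sides), None)


# Hypothetical relatives already linked on the tester's tree.
linked_relatives = [
    {"name": "paternal aunt", "side": "paternal",
     "icw": {"Match A", "Match C"}, "segments": [(1, 10_000_000, 40_000_000)]},
    {"name": "maternal uncle", "side": "maternal",
     "icw": {"Match B"}, "segments": [(7, 5_000_000, 30_000_000)]},
]
match_a = {"name": "Match A", "segments": [(1, 20_000_000, 35_000_000)]}
match_b = {"name": "Match B", "segments": [(7, 1_000_000, 9_000_000)]}
match_c = {"name": "Match C", "segments": [(3, 1_000_000, 9_000_000)]}

for m in (match_a, match_b, match_c):
    print(m["name"], bucket_match(m, linked_relatives))
```

Match C is in common with the paternal aunt but shares no overlapping segment with her, so it stays unbucketed. The sketch also shows why linking more relatives makes the result more powerful: every linked relative adds another set of segments that matches can be compared against.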
Q&A
We need locations showing in the family tree. Is that coming soon? One piece is that it will have standard and advanced view. Advanced view will have more information overlaid on it.
Is there a timeline for any of the new features you listed? In regards to the tree, in the coming weeks they will be pushing another version of the beta, and they hope to have it shored up and released to production in the next month.
Will FF have multiple levels of sorting like ICW? They want to add better sorting or filtering or view by to their applications. The goal is to be able to stack or sort.
When will the very useful step chart be returned to big y matching? There were a few reasons it was removed. It wasn’t to pull the rug out. They are evaluating a better way to implement it. They want to hear why it is valuable to you.
Might FTDNA sometime have some kind of app? They are working on a new dashboard so it will be way more mobile friendly. They are trying to take baby steps to have a robust app to cover the whole site.
The member order summary has no column for transfers, etc.? They could add that.
Are you planning to add ethnicity of matches on the chromosome browser? They could look into that with an opt-in/opt-out.
Missed question-The new tree has a tree management section that will help for the future of how to manage trees in your kit. It will have number of nodes, linked kits, so you’re not scouring across multiple generations.
For those who manage more than one kit, at most companies you can have one tree with multiple family members in it? One of the things as far as the future for 2.0 or the version right after it is tree sharing. You can take one single tree and give access to other kits as owner, editor, viewer state. They are sorting through what that level of privacy looks like.
Will the changeover for 2.0… ? They want to leave your information intact and not remove it.
Did you say that all Big Y would have 700 STRs? Elliott said he hopes. The original was run on one version of an enrichment process and the new is run on another. They will evaluate all. Looks like an average of 650. No, not 700 but some will if they were sequenced a lot or got lucky.
Will you provide true segment triangulation? In the end, yes. Currently, bucket does not use just segment. It uses ICW plus segment. The new will be only segment.
Approximately when were FF chips processed that had Y and mtDNA? Approximately April 2016 maybe a few months before or after
Can you add… block chart? Elliott doesn’t know how to make them consistent.
Will autoclusters be added? The goal of their triangulation process is to make it completely automated. The goal is not to have to click 6 or 7 samples to see how they triangulate. People in the room who probably know how to use it represent less than 1% of users.
Can the 30 nonmatching variant limit be extended for Y700? The 30 is there and that’s part of the problem. Not only is the 30 in place but the number of regions has been extended. The number of bad regions has slightly increased. If it was 10 on average before and they doubled the number of regions, maybe now it’s 20.
Could the notification for project admin for donated funds be reinstated? Yes, pretty sure we could. Need to look into it a little more.
Why has MyOrigins been removed from admins who have limited access? They can look into that.
There are many projects who have no admin and others who don’t? Have you considered building a tool for subgroups? They have considered support and Janine has some good ideas about how to help that and bolster that.
Will the 2020 conference be held in March or November? Word is it might be March. Max said it. It is not confirmed.
Please send one new email instead of for each person he matches? They are aware of that. It costs them every time they send an email and it’s an issue they are discussing. They want it resolved. They will do a better job.
Will you resize product boxes to fit in post office small priority mail box? They have a smaller box right now but it’s not being used just yet. It’s not that much smaller. They have a lot of boxes they’ve ordered.
Customers who purchase Y DNA from homepage button should be immediately presented with an opportunity to join a surname project. Search should not be placed at bottom of homepage? The point they should select a project is when they become a customer. When they join, that is when they should get notified to join a project.
How are you handling comments on the blog Open or Moderated? Right now it’s about spam so it can’t be open. She needs someone to help her check comments. She already has someone helping with blog posts.
What if someone doesn’t know how anyone matches and can’t link them to do bucketing? How can they make this work and how many links do they have to have to make this show? Bucketing goes out to third cousins, so if you link a third cousin then the portion of your match list that you both match will be bucketed. If you don’t know where anyone on your match list falls on your tree, then bucketing isn’t going to work for you. If you had parents or aunts/uncles test, then you can link them and bucketing will be enabled that way.
Sometime ago the ability of the admins to be added to the catalog was removed/suspended. Will it be restored? Connie said the issue was with all the Big Y being generated and the requests started numbering into the thousands. It was something the lab couldn’t handle timewise or even financially. A couple weeks ago they were trying to brainstorm a way to start adding individual SNPs again to the testing menu. It may not be a one off. It may be little clusters.
Does a current Big Y tester get free upgrade to BY700 or do we have to pay? What is for the $179 upgrade? BY testers will have to pay for upgrade to BY700 for $179. What they will get is their sample will be completely rerun. They will get additional SNPs, coverage, STRs. The only thing they’ll keep are the SNPs that are in BY500 that are not in 700.
I have several members in the project who don’t reply and are set to limited. What is the view on dropping them? She said if you don’t want someone in your project because they’re hurting your research, that’s your call. That would be Janine’s call.
How can we convince people with other companies to convert to FTDNA? One thing they always say is they don’t expect you to join just one database. If you want to discover more, join all the databases. No database is created equal.
ISOGG has leftover wine and goodies. Meet at 8pm.