L540 - A Small Y-DNA Haplogroup

7 Mar 2023
Peter Gwozdz
pete2g2@comcast.net

 

News

            7 Mar 2023:  The L540 Tree continues to grow as a few men per year submit their Y-DNA samples and test positive for L540.  My version of the L540 Tree has 39 men.  I only include men who send me an email with permission to list them here.  The FTDNA “Big Y Block Tree” has 62 men in L540 (S3003).  The Yfull tree has 50 sample id numbers in L540.  I do not have contact information for many of these names and sample numbers; most men that I contact give me permission to list them here.  My tree has several men who are in neither the Big Y nor the Yfull tree because they tested individual SNPs at Yseq following my recommendations.  Big Y, Yfull, and Yseq use different phyloequivalent SNPs as the code names for some L540 branch nodes;  in such cases I use my favorite SNP code name and include the others in parentheses.

           

Abstract

            Edited 30 Jun 2022.
            This web document is a summary of my information about a small haplogroup of Y-DNA based on an SNP mutation named L540.  The subject is genetic genealogy.
            The L540 Tree shows the samples that have been tested for the branches of L540.  STRs cannot predict the branch with confidence, so L540 samples that have not been tested for these specific SNPs need SNP testing to determine their branch.
            The L540 haplogroup seems to be roughly 2,000 years old, with an origin perhaps in Central Europe. 
            This web document is written for people reasonably familiar with the jargon of genetic genealogy.  If you are new to genetic genealogy you might prefer to first read an Introduction that I wrote for another of my web documents.
            My References and Sources are listed at the bottom.

L540 Tree

            6 Feb 2023 Update

Click on a link in this tree for more discussion about that SNP or about that sample (DNA data from one man).

Format for each individual line - data from one sample:

Ancestor:  Country.  FTDNA code number, BY = Big Y data;  Yfull code;  Yseq code

References:  FTDNA, Big Y, Yfull, Yseq
-- means no Big Y data;  placed in the tree by testing individual SNPs

More explanation:  Next topic L540 Tree Format

 

            7 Mar 2023 Update

L540 (S3003) Tree:

L540* = A6295-, Y7026-, Y82423-

              Kovalev:  Russia.  268215, BY; YF04818

              Ponto:  Poland.  557832, BY;  YF08450

     A6295-, Y7026-, Y82423?

              Henriksen:  Norway.  B4816, BY

              Helgesson:  Sweden.  682512, --

              Belov:  Russia.  321021, --

              Simonsson:  Sweden.  70482, --

              Kusyi:  Ukraine.  370518, --

Y82423

              Hoff:  Norway.  374375, BY;  YF09864;  3203

              Kurganov: Russia. B752475, --, YF84217; WGS37431

A6295   

     BY5854

              Nowak:  Poland.  225596, BY; YF03833

     FTA519

              Vajanszky:  Hungary.  IN36326, BY, YF78756

     A9034   

          Y153993 (BY161037)

              Eliasson:  Sweden.  458953, BY;  YF15457

              Rohss:  Germany.  555020;  BY;  YF14486

          A9035   

              A9036-

                   Kargul:  Poland.  199446, --; 4230

              A9036/BY5850

                   Gwozdz:  Poland.  N16800, BY; YF02909;  1433

                   Gush (Gwozdz):  Poland.  B182584, BY; YF13748

Y7026 (Y7025)

     Z29042-, A783-, Y17710?, FT402631?

              Kline:  Germany.  158091, --;  2360

     FT402631

          FT402631*

              Sholtz:  Czech.  926019, BY

              Marschner:  Czech.  480087, BY;  YF05931

              Mrocek: Czech. IN69642, BY

          FT13909

              Wacha:  Czech.  347884, BY

          BY82428

              Stavbom:  Sweden.  B3807, BY;  YF05186;  2100

              Glasser:  Germany.  171456, BY;  YF04393

          Y17710 

              BY178938

                   Blind:  Germany.  B2670, BY; --; 2891

              Z29042 

                   BY5193

                             Gebert:  Germany.  166692, BY;  YF01811

                   Z39377

                             Roider:  Germany.  275510, BY;  YF04216

                   BY5185

                             Hartsfield:  Prussia.  140927, BY;  YF04834;  2397

                             Hartsfield2:  Prussia.  80059, BY;  YF13261

     A783

          A783*       

              Weiand:  Germany.  51282, BY

          A1157

              A1157*

                        Zeidler:  Germany.  B615039, BY;  YF84363

                        Grundel:  Germany.  768562, BY;  YF90359

              BY5876

                        Ratuszni:  Hungary.  200924, BY

                        Stelz:  Germany.  175213, BY;  YF05757;  2028

              Y12393

                   Y12393*

                             Svercl:  Czech.  155155, BY;  YF02913

                             Michalski:  Poland.  SM10049, BY;  YF80537

                   A779 /BY5841

                        Y33576 (A782/BY5856)

                             Hochreutter:  Germany.  N45041, BY;  YF02161

                             Hochreutter:  Germany.  --;  YF09477

                        BY5909

                             Hochreutter:  Germany.  131761, BY;  YF06285

                             Hochreutter:  Germany.  --;  YF09478

 

L540 Tree Format & Discussion

            Update 7 Mar 2023.

            This Tree has only about half the men that I know about who have L540 positive Y-DNA results.  If you are L540+ and would like to be listed in this tree, please send me an email request.  pete2g2@comcast.net.  If you are listed and wish to be removed, just send me an email.

            This tree uses the most common format for Y-DNA trees, where the branches and sub branches are listed below, using tabs for columns.  For each terminal branch, the samples are listed below that branch, with one more tab.

Format for each individual line - from each individual sample (DNA data from one man):

Ancestor:  Country.  FTDNA code number, BY = Big Y data;  Yfull code;  Yseq code

References:  FTDNA, Big Y, Yfull, Yseq    
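            As an illustration only, a sample line in this format can be split into its fields with a few lines of Python.  This is a minimal sketch, not a tool I use for the tree.  The function name and the field names are my own inventions, and the rule "the first code that is not BY, -- or YF... is the FTDNA number, the next one is the Yseq number" is an assumption that matches the lines in the tree above.

```python
import re

def parse_sample_line(line):
    """Split one tree line 'Ancestor:  Country.  codes...' into fields."""
    ancestor, rest = line.strip().split(":", 1)
    country, codes = rest.strip().split(".", 1)
    # The codes are separated by commas or semicolons in the tree.
    tokens = [t.strip() for t in re.split(r"[;,]", codes) if t.strip()]
    record = {"ancestor": ancestor.strip(), "country": country.strip(),
              "ftdna": None, "big_y": False, "yfull": None, "yseq": None}
    for tok in tokens:
        if tok == "BY":
            record["big_y"] = True        # BY = Big Y data
        elif tok == "--":
            continue                      # "--" marks a missing item
        elif tok.startswith("YF"):
            record["yfull"] = tok         # Yfull codes start with YF
        elif record["ftdna"] is None:
            record["ftdna"] = tok         # assumption: first remaining code
        else:
            record["yseq"] = tok          # assumption: any later code
    return record

print(parse_sample_line("Hoff:  Norway.  374375, BY;  YF09864;  3203"))
```

            For the Hoff line this gives the FTDNA number 374375, Big Y present, Yfull code YF09864, and Yseq code 3203.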

            Each branch is defined by an SNP.  All the samples in that branch have that particular SNP, and all the samples outside the branch do not.

            For example, L540 has 3 known branches, defined by the SNPs Y82423, A6295 and Y7026.  A6295 and Y7026 each have multiple branches.

            L540* is listed first, with 7 men, but that’s not really a branch;  those 7 samples are positive for L540 but seem to be negative for the known branches.  They are like twigs that might become new branches in the future, as new samples show up matching with these 7.  We expect that future samples will share a few unique (“private”) SNPs with one of those 7, thereby defining new branches of L540.  Four of those L540* samples do not have Big Y data;  they tested individual SNPs and are so far negative for the 2 main branches (A6295 and Y7026).

            A9036/BY5850:  That “/” means an SNP that has been assigned two different names, discovered and named independently.  These are the same SNP.

            Y153993 (BY161037):  These two are different SNPs but they are phyloequivalent, which means that so far, all samples in that branch have both of these, so either can be used to define this branch.  In this case, I follow Yfull, which uses Y153993, but FTDNA uses BY161037 as the name of this same branch.  Actually, this branch has more than 10 phyloequivalent SNPs, so any one of them can be used as the name of this branch.
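            The idea of phyloequivalence can be sketched in Python:  two SNPs are phyloequivalent (so far) if exactly the same samples are positive for both.  This is only an illustration;  the function name is my own, and the small input table is read off the tree above rather than taken from any company database.

```python
from collections import defaultdict

def phyloequivalent_groups(positives):
    """Group SNPs that are positive in exactly the same set of samples."""
    groups = defaultdict(list)
    for snp, samples in positives.items():
        groups[frozenset(samples)].append(snp)
    # Only groups with two or more SNPs are phyloequivalent sets.
    return [sorted(snps) for snps in groups.values() if len(snps) > 1]

# Illustrative input: which listed samples are positive for each SNP.
positives = {
    "Y153993": {"Eliasson", "Rohss"},
    "BY161037": {"Eliasson", "Rohss"},
    "A9035": {"Kargul", "Gwozdz", "Gush"},
    "A9036": {"Gwozdz", "Gush"},
}
print(phyloequivalent_groups(positives))  # [['BY161037', 'Y153993']]
```

            A new sample that is positive for only one SNP of such a group splits the group, which is what happened to S3003 and L540 with PGP89.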

            S3003 used to be phyloequivalent to L540, but Steve Fix found an on-line sample (PGP89) positive for S3003 but negative for L540;  this means L540 is a branch of S3003.  FTDNA still uses S3003 as the name for the L540 tree;  Yfull uses L540 for the name of the tree;  both still list S3003 as phyloequivalent to L540 because their databases do not have any sample like PGP89.

            In other words, different trees for L540 have different details.  My tree is similar to Yfull’s and FTDNA’s trees for L540 (S3003).  My tree is restricted to men who have requested to be listed;  my tree has samples that have not joined Yfull;  FTDNA’s tree is restricted to men who have purchased Big Y, but my tree has some men who joined FTDNA but did not purchase Big Y, using Yseq tests instead.

            The FTDNA “Big Y Block Tree” has 30 branches for S3003/L540 (more than my tree) because many men with Big Y data have not contacted me with permission to list their sample.  If you look at the Block Tree at your FTDNA site, you will see very few names because only your closest “matches” are listed.  Also, I suppose some men withhold permission for FTDNA to list their name.  The Block Tree has 8 samples for L540* that I do not have, which means we can expect more branches soon, because future samples are likely to match unique SNPs from one of those L540* men.

            The FTDNA “Y-DNA Haplotree & SNPs” has the same branches for S3003/L540 in the simpler tab indented format, and you can display or hide the sub branches for each branch; this provides a compact simplified display.

 

L540 in Y-DNA Tree

            Update 15 Dec 2020.

            Rough outline of the human Y-DNA tree with ISOGG and SNP code names, showing the location of L540:

E (M96)

       E1b1b1 (M35.1)

              E1b1b1a1 (M78)

                      E1b1b1a1b1a (V13)

                             CTS8814/Z1057

                                    CTS1273, CTS5856

                                           E1b1b1a1b1a5 (S3003)

                                                   E1b1b1a1b1a5a (Z7019)

                                                          E1b1b1a1b1a5a1 (L540)

              More detailed trees are available at:  Yfull, FTDNA, ISOGG, and Steve Fix.  These trees have different details depending upon the different samples available, and upon which SNP results are included.

            The ISOGG tree does not have branches CTS8814 or CTS1273.  Z7019 is unique to ISOGG.

            Yfull and FTDNA combine S3003 and L540 into one branch;  FTDNA calls this S3003;  Yfull calls this L540.

            There are intermediate and parallel branches not shown in this outline of mine.  These branches have many phyloequivalent SNPs not shown in this outline, which is intended to show the location of L540.

            Those long ISOGG code names (like E1b1b1a1b1a5a1) change when new SNPs define new intermediate branches, which makes those long names confusing.

 

L540

            Update 23 Dec 2020.

            L540 is the code name for an SNP that was discovered in my WTY.  L540 was announced 29 March 2011.  On 27 Apr 2011 I demonstrated that L540 defines a new haplogroup branch of V13.

            I use the code name L540 for the SNP, for the associated  haplogroup, and for the samples (men) in that haplogroup.

            This haplogroup was predicted as cluster C based on STR correlations in 2008.  When I originated this web page in early 2010, I coined the name V13C, renaming it L540 on 30 Apr 2011.  Cluster C, also called C type, is the STR equivalent of L540.

            Update of statistics 23 Dec 2020:  My L540 Tree has 35 samples.  This Tree has only about half the men that I know about who have L540 positive Y-DNA results.  I do not list samples in my tree until I get an email request to be listed by the man who submitted the sample.  Also, I do not list samples with an L540+ result if they have not been tested for the 2 known branches of L540.  In addition, I know of about a dozen more men who seem to be cluster C based on their STR results, but they have not tested for L540.  I do not include relatives closer than 4th cousin in my tree;  those with the same family name are distant cousins.

            Judging from the size of L540 at Yfull, as a small fraction of the Yfull tree, my wild guess for the size of L540 worldwide is more than 10,000 men but less than 100,000.

            If you are in my tree, I encourage you to contact your closest Y-DNA matches at FTDNA;  if they have not already tested for their Y-DNA branch, encourage them to test for L540 and to contact me for questions and for inclusion in my L540 Tree.

 

Notice

            3 Aug 2018:  FTDNA changed the rules for DNA data, requiring that DNA data must be removed from all files whenever a person changes their privacy settings to restrict web posting.  This web page has links to my 6 STR analysis “xls” files.  It would be too much trouble for me to change all 6 files every time a person changes settings, so I removed these files from the web.  Some of the links to these files are still here, but the links do not work.  As I update and rewrite this page I remove links to my STR analysis files.  This does not matter very much because these days SNPs are much more important than STRs, and SNPs do not require the statistical analysis.
            The few individuals named in this web page requested to be mentioned;  they may request to be removed at any time.

 

V13

            Rewrite 31 Oct 2015.  Edit 16 Nov 2019.

            For detailed V13 trees, see:

http://www.yfull.com/tree/E-V13/

ISOGG

            V13, in the E haplogroup, is a major branch of the human Y-DNA tree.  The L540 branch is a relatively small branch of V13.

            There are about 80 known SNP equivalents to V13.  V13 was the first to be discovered and the one used in most discussions about this haplogroup.  All but a very few V13 samples belong to L142 and CTS5856, so technically L540 is the main branch of the S3003 haplogroup, which is one of many branches of the CTS5856 haplogroup, which is the main branch of the L142 haplogroup, which is the main branch of V13.  For simplicity the L540 Tree above minimizes these details.  I usually just say in this web page that L540 is a branch of V13.  I say that V13 is the father of L540, when technically S3003 is the father and V13 is the great-great grandfather, and even that may change if additional side branches are discovered with very few samples.  I’m ignoring the known branches that have few samples, for simplicity.

            L542 is one of those 80 equivalents.  V13 is sometimes called L542.  L542 was found in my WTY.

 

PGP89

            New Topic 10 Feb 2015.

            PGP89 is a sample from the Personal Genome Project (search Google for details).  PGP89 is S3003+ but L540-, so this sample represents an older node in the branch leading to L540.  So far there are no such S3003+ L540- results in the E-M35 Project.

            This is from Steve Fix, who included PGP analysis in his tree.

 

Y17710

            New Topic 28 Jan 2020.  Edited 30 Jan 2020.

            This is a new branch, discovered Jan 2020 in the Big Y-700 of Blind.  It is a branch of Y7026.  Previously, Y7026 had two branches, Z29042 and A783.  This new Y17710 includes the branch Z29042.

            Previously, there were 6 samples in Y7026*, including Blind, who is now moved to Y17710, in a new small branch BY178938.  Those other 5 samples are currently listed in my Tree as Y7026*:  Z29042-, A783-, Y17710? because it is not yet known how many of them will test positive for Y17710.  Three of those 5 have Big Y results from 2015 and early 2016, but Y17710 showed up in the expanded Big Y-700 in late 2016, so we don’t know their Y17710 status.

            Y17710 is not available individually from FTDNA, but Yseq has it for $18:

https://www.yseq.net/advanced_search_result.php?keywords=Y17710&search_in_description=1&x=6&y=9

            Big Y-700 is pricey;  FTDNA has not yet announced a reasonable price for upgrade from the original Big Y to the new Big Y-700.

            There are 2 more known men with Big Y-700 in this new branch;  one is Y17710* and one is with Blind in BY178938.  I’m contacting them.

            I’m putting a request into Yseq for an $18 test for BY178938.

 

Z29042

            New Topic 14 Jan 2015.  Edit 27 Dec 2015.

            This SNP was discovered by Steve Fix on 10 Jan 2015, from the Big Y data of Roider, compared to Gebert.  These two samples have this SNP, but Hochreiter does not, so Z29042 defined a new Haplogroup, the first branch to be found for L540.  Steve assigned the Z series code number.  Actually, there are 6 new SNP locations common to Roider and Gebert, but only Z29042 was assigned a code;  some of those others may be needed in the future.

            I’m a bit surprised.  I expected Roider to fall into a branch with Hochreiter, because they are closest in STRs.  Also, I have been predicting an older node for Gebert, based on his DYS389 value, and his STR values that differ from other L540 samples, more than L540 samples differ from each other.  STR predictions are statistical, because STRs mutate relatively rapidly.  So this is a surprise, but such surprises are expected from time to time when making predictions based on STRs.

 

A783

            Update 10 Feb 2015.

            This SNP was noticed by Steve Fix and me in Hochreiter’s Big Y data, our first L540 Big Y.  Actually, there were 10 new SNPs;  I tested myself for them but came out negative.  Yseq assigned A series code numbers to them.  None of the 10 showed up in the Big Y data for Roider or Gebert.  In Feb 2015 I noticed this one in the Big Y data for Svercl, so it defined a new haplogroup branch for L540, with Hochreiter and Svercl, not me, not Roider, not Gebert.

 

Y12393

            New topic 12 Jul 2016.

            This SNP was newly posted by Yfull in July 2016, as a new branch of A783.

 

A779

            New topic 12 Jul 2016. Edit 31 Jul 2016.  Edit 17 Aug 2016.  Edit 29 Aug 2016.

            Steve Fix suggested the SNP A779 to me on 12 Jul 2016, to distinguish Svercl from two Hochreiter Big Ys.  The two Hochreiters are 7th cousins, with a common ancestor about 300 years ago, so the A779 mutation is at least 300 years old.  The 2nd Hochreiter Big Y provided a split into two branches, defined by A782 and BY5909, although those two will not be listed in Yfull’s tree until a 2nd sample shows up in the same branches.

 

Y7026

            New Topic 8 Feb 2015.  Update 17 Mar 2016.

            This SNP represents the major division of L540, with 11 of the 16 samples in the L540 tree so far.  The Yfull tree estimates Y7026 to be about 2,000 years old, although this is a very rough estimate due to the caveats associated with DNA age estimates.  See the next topic discussion about the “bushy” nature of Y7026.

 

Y7026*

            New Topic 17 Mar 2016.

            We now have 5 samples in the paragroup Y7026*.  Click here for a jump to Y7026* in the tree.

            All 5 have been confirmed with SNP results Y7026+, and Z29042-, A783-, so they do not belong to those two known haplogroup branches of Y7026.  Two of the 5 have Big Y results, and they do not have a common novel SNP, which means they will end up in two different new branches of Y7026, as soon as a future sample in their branch gets a Big Y result with a common novel SNP to define that future haplogroup branch.
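            The check for a “common novel SNP” is just a comparison of each pair of samples’ novel SNP lists.  Here is a minimal Python sketch of that comparison.  The sample labels and SNP names in it are invented placeholders, not real data, and the function name is my own.

```python
from itertools import combinations

def candidate_branches(novel_snps):
    """Return {(sample_a, sample_b): shared SNPs} for every pair of
    samples that shares at least one novel SNP."""
    found = {}
    for a, b in combinations(sorted(novel_snps), 2):
        shared = novel_snps[a] & novel_snps[b]
        if shared:
            found[(a, b)] = shared
    return found

# Hypothetical placeholder data: labels and SNP names are invented.
novel_snps = {
    "sample_1": {"snp_a", "snp_b"},
    "sample_2": {"snp_c"},
    "sample_3": {"snp_b", "snp_d"},
}
print(candidate_branches(novel_snps))  # {('sample_1', 'sample_3'): {'snp_b'}}
```

            An empty result for two Big Y samples in the same paragroup corresponds to the situation above:  no shared novel SNP, so the two samples point to two different future branches.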

            In other words, we know Y7026 has at least 4 branches.  The node associated with Y7026 is the major “bushy” node of the L540 tree.

            The other three samples have not purchased Big Y;  their results are from SNP testing only.  These three may belong to those two future branches.  Or, perhaps one or more of those three may end up in yet another branch of Y7026.

            A bushy node is evidence that the immediate descendants of the corresponding MRCA participated in a significant population expansion.

            On the other hand, bushy nodes may be just random, not evidence of population expansion, due to the luck of SNP discovery statistics, particularly for the case of Y7026, with only 11 total samples so far.  Big Y does not cover the entire Y chromosome, and Big Y does occasionally randomly miss SNPs, so future testing may show Y7026 to be not so bushy after all, if future novel SNPs combine the Y7026 branches into fewer larger branches.

 

A6295

            New Topic 22 Jul 2015.  Update 17 Mar 2016.

            This SNP and haplogroup was defined 22 Jul 2015, being present in Nowak’s Big Y, and also present in my earlier Gwozdz Big Y.  Kargul tested positive for A6295, making 3 samples so far.  My Gwozdz cousin would no doubt also test positive, but I leave him out of the L540 tree since together we represent one ancestral line.  Actually, I recruited both Kargul and Nowak, based on close STR matches to me, so statistically, the A6295 branch should be considered to be much smaller than the Y7026 branch with 11 independent samples.  Note that the three A6295 samples are the only Poland origin samples in the L540 tree;  I did not recruit on the basis of Poland origin, so we can speculate that A6295 might represent a small Polish branch of L540, although three samples is far too few for any confidence in this regard.

 

A9035

            New Topic 21 Jan 2016.

            This SNP has just been defined 21 Jan 2016.  It is negative in one A6295 sample (Nowak) and positive in the other two (Gwozdz and Kargul), so it represents a haplogroup - a small twig in the Y-DNA tree.  Kargul does not have Big Y data;  Kargul’s FTDNA sample is A6295+;  Kargul’s Yseq sample is A9035+.  I (Gwozdz) ordered SNP tests at Yseq for 4 of my “private” SNPs, A9032, A9033, A9035, and A9036;  Kargul is negative for those other 3, implying that our MRCA node for A9035 is roughly 3/4 as old as our node with Nowak for A6295, although this is a very rough estimate with only 4 SNPs tested.  At the Yfull SNP browser, using the locations for those 4 SNPs from my (Gwozdz) Big Y data, I verified my positive standing for all 4 of these SNPs;  Nowak and all other V13 samples in the V13 Project at Yfull are negative for all 4.
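            The 3/4 figure is simple arithmetic.  All 4 tested SNPs arose on my line after the A6295 node (Nowak is negative for all 4).  Kargul shares 1 of them, so that one arose before our common ancestor and the other 3 arose after.  The sketch below assumes that SNPs accumulate at a roughly steady rate along a line, which is the usual assumption for such estimates;  the function name is my own.

```python
from fractions import Fraction

def relative_node_age(snps_tested, snps_shared):
    """Age of the younger node as a fraction of the older node's age,
    assuming SNPs accumulate at a steady rate along the line."""
    # SNPs not shared with the closer match arose after the younger node.
    return Fraction(snps_tested - snps_shared, snps_tested)

# 4 SNPs tested (A9032, A9033, A9035, A9036); Kargul shares 1 (A9035).
print(relative_node_age(snps_tested=4, snps_shared=1))  # 3/4
```

            With only 4 SNPs the statistical uncertainty of this ratio is large, as noted above.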

 

Z39377

            New Topic 22 Nov 2015.

            This SNP has just been defined 22 Nov 2015.  It is present in Hartsfield’s recent Big Y, and is also present in Roider’s Big Y from earlier this year.  So Z39377 defines a new haplogroup, with only those two samples so far.

 

S3003

            Update Feb 2015.

            This SNP is in the L540 branch, but older.  PGP89 is a sample from the Personal Genome Project (search Google for details).  PGP89 is S3003+ but L540-, so this sample represents an older node in the branch leading to L540.  So far there are no such S3003+ L540- results in the E-M35 Project.  Technically, S3003 defines a haplogroup with branches PGP89 and also L540, but for simplicity I just say in this web page that L540 is a branch of V13.

 

Determining Your L540 Twig;  Dividing L540;
Discovering New SNPs

            Rewrite 13 Mar 2020.

            I recommend Big Y, next paragraph, if cost is not an issue for you, and if you are enthusiastic about discovering new SNP haplogroups.  Otherwise, consider the less expensive tests per the following paragraphs, to determine your current haplogroup.

            Big Y:  The newest version is Big Y-700:  Discovering new haplogroups is part of my genetic genealogy hobby.  I have been recruiting L540 members to purchase Big Y in order to discover new SNPs, which provide new SNP haplogroups - terminal  twigs on the Y tree, to further subdivide L540.  Big Y-700 is not cheap.  $449.  Anyone interested in joining me in this L540 project can order Big Y-700;  please contact me so I can keep track of the status.  With Big Y-700, there is no need for individual SNP testing.  In fact, with Big Y, most men immediately discover new “private” SNPs of the Y unique to their sample (unique so far in the Big Y database).  Many men have an immediate match at a new SNP, thereby defining a new twig in the Y tree, combining their sample with a previous Big Y sample.  It is almost certain that future Big Y tests will match one of the new SNPs in your Big Y data, thereby defining new small twigs, eventually combining quite a few samples.  An exception would be if cousins of yours are already in the Big Y database, in which case your Big Y places you in that twig.

            I encourage testing at FTDNA, and joining the E-M35 Project, because I like the convenience of finding all the data in one place.  The administrators of the E-M35 Project will classify your sample into a category, where they give the recommendation for which SNPs you should test.  You may have to wait a few weeks after joining the project for a recommendation.  If you think you might be L540, please feel free to email me for advice on which SNP branches to test.

            There are other companies.  Yseq offers individual SNPs at lower price with faster results.  See SNP Test Orders for detailed instructions.

            If you are new to Y-DNA testing and do not know your haplogroup, I recommend Y-37 as an inexpensive first test.  The FTDNA computer will use that data to predict your haplogroup.  The computer is very conservative.  If you are really L540, the computer will not predict L540 just based on Y-37, but it can confidently predict E-M35 or E-V13.  (L540 is a branch of V13, which is a branch of E-M35.)  If you are not predicted E-M35, consider joining a different Project, for your haplogroup.  The FTDNA computer makes a recommendation for further testing;  from your home page, click on “Haplotree & SNPs”.  The administrators of the E-M35 project may make a better recommendation, using that information, plus your close Y-DNA matches, plus other data analysis for which SNP, or which SNP panel, you should test next.  See individual SNPs or panels of SNPs, a separate topic below.

            For example:  If you purchase Y-111, look at your Y-DNA matches, using all 111 of your STRs.  These matches are quite close, so you are probably in the same haplogroup branch as your closest matches.  If you have a close Y-111 match who has determined his haplogroup to be one of the L540 branches, you can purchase the test for the SNP for that branch, then continue up and down the L540 tree, inexpensively testing individual SNPs to determine your terminal position.
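            The step-by-step testing can be pictured as a walk down the tree.  The Python sketch below is only an illustration:  it starts at L540 and goes downward only (it does not start from a close match’s branch), the function name is my own, and the tree is a partial copy of my L540 Tree above with the deeper Y7026 sub-branches left out.  Each call to test stands for ordering one individual SNP test.

```python
# Partial copy of the L540 Tree: each SNP maps to its sub-branches.
TREE = {
    "L540": {
        "Y82423": {},
        "A6295": {
            "BY5854": {},
            "FTA519": {},
            "A9034": {"Y153993": {}, "A9035": {"A9036": {}}},
        },
        "Y7026": {"FT402631": {}, "A783": {}},  # deeper branches omitted
    },
}

def terminal_branch(tree, root, test):
    """Walk down from root, testing child SNPs until none is positive.
    test(snp) returns True for a positive result."""
    path = [root]
    children = tree[root]
    while True:
        positive = next((snp for snp in children if test(snp)), None)
        if positive is None:
            return path  # no positive child: the last SNP is terminal
        path.append(positive)
        children = children[positive]

# Example: a man who is positive for these SNPs and negative for the rest.
my_positives = {"L540", "A6295", "A9034", "A9035"}
print(terminal_branch(TREE, "L540", lambda snp: snp in my_positives))
# ['L540', 'A6295', 'A9034', 'A9035']
```

            A man who ends at a node with no positive sub-branch is in the paragroup of that node (for example A9035 with A9036 negative) until a new SNP shows up below it.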

            If you previously purchased Y-67, no longer offered, then you still might be in the same branch as your closest Y-67 matches, although with less confidence than with Y-111.

            If you previously determined your “terminal twig”, you can just watch this web page L540 Tree (or other web sites, other trees).  When a new SNP twig shows up extending your twig, you can order just that one SNP to see if you belong, or not.

            For more specific discussion, click on  L540, A783, Z29042, Y7026, A6295, SNP ordering, and Big Y.

            How about STRs?  In the past, I encouraged upgrading to 111 Markers, the largest set available at FTDNA.  Now that there are plenty of SNPs available with low cost tests, SNPs are better than STRs for finding your closest Y matches.  However, there are plenty of samples without the latest SNP tests, so if you are anxious to find out which of these best match your Y, 111 STR markers are much better than the smaller standard sets.

 

Y37

            Y-37 is a test for 37 STRs, from which the FTDNA computer can predict your main branch haplogroup with high confidence.  Then, from your home page at FTDNA, click on “Haplotree & SNPs”.  That will show your position in the Y-DNA tree.

 

Cluster C

            Rewrite 31 Oct 2015.

            Friedman proposed cluster C in 2008, based on STR correlations, when the data was less than what is available today.  Cluster C now seems equivalent to L540.  The cluster C data is still available at the haplozone site but may not be up to date.

 

C Type

            Rewrite 31 Oct 2015.  Edit 16 Aug 2018.

            I defined C type in Jan 2010 as my version of Cluster C.

            I use C type to predict L540 samples based on STRs, for samples that do not have the L540 SNP test.

            I use the word type for an STR cluster with statistical validity as established by my Mountain Method.  “Type” is my own term.  I chose the word “type” because it is not generally used in genetic genealogy and I wish to distinguish my types from haplogroups and from other clusters.  By “type” I mean the cluster data, the hypothetical clade, the modal haplotype, and the set of all possible haplotypes, at any number of markers.  Accordingly, by “C type” I mean any or all of these 4 things.  I sometimes use just “C” as short for “C type”.  I also have a previous C type identified in R1a;  unrelated;  please don’t get confused.  I published my methods in the Fall 2009 issue of JoGG.

            My analysis files define C type.  Sorry, it can be a bit confusing because I have multiple STR definitions for C type, for various marker sets.  The number of markers in my definitions change slightly when new samples show up with unusual STR values.  I hope the meanings are clear from the context of my discussions in this web document. 

            Click on seems equivalent for an explanation that STR types (such as C type) cannot be exactly equal to equivalent SNP haplogroups (such as L540), due to STR outliers.

 

V13C

            Rewrite 31 Oct 2015.

            I coined the name V13C in 2010 to represent C type, cluster C, the hypothetical haplogroup, and the samples (men) in the hypothetical haplogroup.

            I also used V13C to mean samples that match C type from the database of samples at E-M35 or at Haplozone, or at other databases.

            This web document used to be named V13C.html.

            Now that C type seems equivalent to L540 I edited away most of my mentions of the name “V13C”, but I’ll continue to use “C type” for the predicted clade based on STRs.

 

L Type

            Rewrite 27 Sep 2017.

            I proposed L type on this web page in mid 2011, based on only 2 samples, which means not very high statistical confidence.  L type (also called L540 type) was a type that included C type plus those 2 samples that did not fit C type at that time.

            I no longer consider the distinction between C type and L type useful.  Those two samples, Gebert and Fredeen, both tested positive for L540.  So they are just statistical STR outliers.  Since then, more outliers have shown up;  recently, with lots of 111 marker (next topic) data, I was able to come up with an STR definition of C type to capture all L540+ STR outliers, and not capture any L540- samples.  My C type definitions using fewer than 111 markers are not quite perfect at predicting L540 based on STRs, but they are satisfactory.

            I edited this web page to remove mentions of L type (except this topic).

 

111 Markers

STRs are still Valuable

            Rewrite 6 Oct 2017.  Edit 16 Aug 2018.

            FTDNA provides STR markers in various sets.  The largest, a set of 111, was introduced in 2011.  Upgrades can be purchased for samples with fewer markers.  Obviously, matches and predictions are more accurate using more markers.  Until 2014, I had been recommending the 111 set to L540 members, hoping to discover STR correlations good enough to divide the L540 haplogroup into clusters with high confidence.  Today, SNPs are more important than STRs.  This is because the cost of discovering new SNPs has come down a lot.  SNPs define haplogroup divisions;  STRs only provide statistical predictions for haplogroups.

            Some clusters are still defined by STRs, as predictions for new haplogroups, which need confirmation by discovery of a corresponding SNP.  However, STR analysis is yielding diminishing returns for this effort.  SNP discovery is now accelerating instead.

            Still, the majority of on-line samples have STR data without adequate SNP data.  So Y-STRs still provide you with your best list of on-line close male line matches.

            At your FTDNA home page click on the Y-DNA “Matches” button to see your closest matches using the various STR marker sets.  Many men at FTDNA do not join the various projects;  if someone in L540 does not join the E-M35 Project, I do not get to see his data.  If you are L540 and have a very close STR match, please send him an email message about this L540 web page and about the E-M35 Project.  I still occasionally find new L540 members this way.

            Haplozone is another on-line STR database.

            As a specific example of the value of STRs, I discovered DYS445 = 11 as an unusual mutation in my own Y, shared by my 3rd cousin, and also shared by Kargul, adding evidence that we form a twig in the L540 tree, perhaps restricted to south Poland, perhaps only a few centuries old.  DYS445 is not available at less than 111 markers in FTDNA standard sets.  The rest of L540 samples have the value DYS445 = 10.  The value 11 does show up very rarely elsewhere in V13, as an independent mutation, so although DYS445 is very slowly mutating it is not as slow as a typical SNP, so not as statistically reliable as an SNP.  Later, I discovered A9035 (tested at Yseq), an SNP for only the 3 of us.  A9035 is a twig in the A6295 branch (see the Tree).  In other words, DYS445 = 11 seems equivalent to A9035 today, although exceptions may well show up in the future.

            Summary:  111 STR markers are valuable if you are very interested in genetic genealogy, and if cost is not a big issue for you.  If cost is an issue, and if you are merely curious about your Y-DNA, as a first test I recommend the 37 marker STR set (topic after next).

 

67 Markers

            Rewrite 5 Dec 2015.  Edit 16 Aug 2018.

            FTDNA provides a 67 marker standard set of STR markers.  I have been using this 67 set for analysis for more than 8 years.  Although the 111 set is more accurate, this 67 set is valuable for analysis because there are a lot more samples on-line at 67, and all samples with 111 are included.

 

37 Markers

            Rewrite 5 Dec 2015.  Edit 16 Aug 2018.

            FTDNA no longer offers the 25 and 12 STR marker standard sets.  The 37 marker set is sufficient as a first test if you are curious to see in which Y-DNA main branch haplogroup you belong.  With 37 markers, FTDNA will automatically place you in one of the main large haplogroup branches of the Y-DNA tree.  For the smaller branches of the tree, there are SNP tests.  For L540 candidates, I have a separate discussion topic about this:  Dividing L540.

            Most of the more rapidly mutating STRs are in the 37 marker set, so the 37 marker set is good to search for your best matches to other men with a male line common ancestor in the last millennium or so.  FTDNA provides you with matches to other men with similar STR haplotypes.  All samples with 67 or 111 are included because they have these 37 plus more.  For more discussion see Value of STRs.

 

25 Markers

12 Markers

            Rewrite 11 Oct 2017.  Edit 16 Aug 2018.

            FTDNA provides the older STR sets, using 12 and 25, as special orders by project administrators, but for the price difference the 37 set makes more sense.

            There are still lots of data on-line with only 12 markers - not so many with only 25.  Those samples can still be checked for candidates for L540, but not with very high confidence.

 

Best STR Markers

            Edited 27 Sep 2017.  Edit 16 Aug 2018.

            STR markers that mutate relatively slowly are statistical indicators for clades in which they are recently mutated, but they are not perfect because of subsequent independent mutations.  When a clade has a few such good STR markers those provide a signature set of STR markers.  A signature is statistically expected to be a more probable indicator of a clade than just one marker.  Indeed cluster C is characterized by the Friedman Signature.  My definitions of C type (and thereby L540) use other helpful markers, not just the signature.

            My analysis files automatically rank markers, as useful for a particular definition, using a method that I published.  The exact ranking of markers varies slightly from month to month due to the random nature of mutation values in new samples, and due to the somewhat arbitrary cutoff that I use to restrict the database to the L540 neighborhood.  (Using too many samples provides a ranking of the father clade instead of the clade of interest.)  For example, a marker that ranks 6th one month might come out 4th or 5th or 7th or 8th the next month.

            An SNP that defines a haplogroup is very unlikely to have happened exactly at the time of the most recent common ancestor (TMRCA) of a haplogroup.  Most likely the SNP is somewhat older, because usually there are many generations between nodes.  By definition an SNP cannot be younger than the TMRCA.  Similarly, we can consider a hypothetical clade defined by a particular STR mutation, which is likely somewhat older than the TMRCA of that clade.  However, for clusters defined by signatures, and for types defined by definitions, one rare STR mutation that contributes to the signature might have happened before or after the TMRCA of that cluster or type.

            Very slow mutators should make the best markers.  However, the slowest are rarely mutated, so those with intermediate mutation rate show up more often as signature markers.  My Type.xls master file has the Chandler STR mutation rates, in the ASD sheet, row 5.  The ASD sheet is not usually included in my analysis files.

            Best Dozen STR Markers:  Using my latest (Sep 2017) analysis at 111 markers, here are my rankings of the best STRs for C type and thereby for L540 (DYS numbers):  1&2 (two way tie) - 594=12 & 636=12;  3 - 390=25;  4 - Δ389=19;  5 - 561=17;  6 - 444=13;  7 - 406=11;  8 - 504=14;  9 - 517=24;  10 - CDYa=29;  11 - 447=25;  12 - CDYb=33.

 

ΔDYS389 = 19

Original Marker for Cluster C

            Rewrite 27 Dec 2015.  Edited 27 Sep 2017.  Edit 16 Aug 2018.

            ΔDYS389II = 19 is one of the original Friedman Signature markers for cluster C.  It remains a good marker for C type and L540.

            [Technical detail:  DYS389 is a compound marker, where 389I is the first STR chain and (389II minus 389I) is the second STR chain.  For cluster C the first chain is 389-1 = 389I = 13.  The second chain is 389-2 = 19.  389II = 13 + 19 = 32.  The marker of interest here is really 389-2 = 19 (389II minus 389I = 19).  However, 389I mutates more slowly and has the value 13 for all but one L540 sample so far and for almost all samples in the L540 neighborhood.  At Haplozone, both 389 markers need to be used together;  if one is omitted both are ignored.  I use both 389 values, or neither, in my definitions to be compatible with other web sites.]  My xls files can be easily modified to use Δ389 without 389-1.
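            The arithmetic in the technical detail above can be written out as a minimal sketch.  The function name delta_389 is my own label for illustration;  it is not taken from the xls files.  The values used are the cluster C value 389 = 13, 32 and the V13 modal value 389 = 13, 30 from this topic.

```python
def delta_389(dys389_i: int, dys389_ii: int) -> int:
    """Return the second-chain repeat count (389-2), i.e. 389II minus 389I."""
    if dys389_ii < dys389_i:
        # 389II is reported as the sum of both chains, so it cannot be smaller.
        raise ValueError("389II includes 389I, so it cannot be smaller than 389I")
    return dys389_ii - dys389_i


# Cluster C: 389I = 13, 389II = 32
print(delta_389(13, 32))  # 19

# V13 modal value: 389 = 13, 30
print(delta_389(13, 30))  # 17
```

            Working with the difference keeps a mutation in the first chain from being counted twice, once in 389I and again in 389II.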

            All STR standard marker sets by all DNA companies include the 389 pair.  (I have not noticed any exceptions.)

            389 = 13, 30 is the modal value for V13, so it seems to be the ancestral value for L540.  389 = 13, 32 is rare in V13 (other than L540), but shows up in E-M35 branches outside V13.

            Only two L540+ samples, Fredeen and Gebert, have the ancestral value 13, 30.  Butman, the closest STR match with L540-, also has 13, 30.  Only a few samples in the branches of L540 have the value 13, 31, which is not common in the neighborhood.  On this basis, it seems likely that the mutations from 13, 30 to 31 to 32 happened before the TMRCA for L540, and later mutations back from 13, 32 to 31 to 30 happened in very few L540 male lines.  (We cannot rule out a rare double size mutation incident, from 30 to 32, or a double mutation back to 30.)

            DYS389II (actually the difference value 389-2) ranks 43rd in Chandler mutation rates, near the middle, so exceptions are expected, due to recent mutations.  DYS389-2 is ranked as the 4th best marker in my analysis of 111 markers.

 

DYS594 = 12;  Best Marker for L540 at 67 Markers

            Rewrite 27 Dec 2015.  Edited 27 Sep 2017.  Edit 16 Aug 2018.

            In my analysis, DYS594 = 12 is the best marker for L540 (and C type) using the 67 marker set.  594 is not in the 37 marker set.

            All L540+ samples with 67 or more markers  have the 594 = 12 value.  Butman, the closest STR match not predicted L540, indeed tested L540-, and has the ancestral 11.

            All C type samples (predicted L540), except one marginal sample not yet tested for L540, have the 12 value.

            A few samples in the L540 neighborhood have 594 = 12 but are L540-.  These are not a random sample;  I recruited two of them for the L540 test to find out if all 594 = 12 in the neighborhood are L540;  no, not all.

            The 594 = 12 value is more common in the L540 neighborhood than in the rest of the V13 data.  So I was wondering if 594 = 12 is an old mutation in the S3003 branch.  So I tested one of those two L540- samples with 594 = 12;  it came out S3003-, so it seems to be an independent mutation.  Also, considering the L241 haplogroup, some of those samples are in the neighborhood, but they have 594 = 11 except one sample that has the value 12, so that is also independent.

            DYS594 ranks 12th from the slowest in the 67 Chandler mutation rates.  Quite slow, so independent recent mutations should be rare.

 

DYS636 = 12;  DYS561 = 17;  DYS504 = 14; DYS714 = 24

Excellent Signature Markers for L540

Available in the 111 Set

            Rewrite 27 Sep 2017.

            These 4 are not in the FTDNA 67 STR marker set, but are available in the 111 STR marker set.  636 is just as good as 594 [previous topic];  they are tied as the best two STR markers.  Those other 3 are among the dozen best.  That’s why C74(111), my 111 marker definition for C type, works very well.

 

Friedman Signature

            Rewrite 29 Dec 2015.  Edit 27 Sep 2017.  Edit 16 Aug 2018.

            The signature is (390, 389II, 447) = (25, 32, 25).  In the difference notation used above, 389II = 32 corresponds to 389-2 = 19.

            Friedman had been calling this the “characteristic marker values” for cluster C at the Haplozone site before I started working on this, back in 2008, when there were only 9 samples available in cluster C, including mine.

            This original Friedman signature works surprisingly well by itself for samples with only 25 of the standard markers, but not with high confidence.

            In early 2011 Friedman added 594 = 12 to the “characteristic marker values”, for 67 marker samples.

            DYS389 is a compound marker, discussed above.

            Friedman used a more complicated analysis than just this simple signature in her C type assignments.  I do not know her method exactly, but most definitions (not all) that I tried, selecting well ranked markers, extracted the same samples that she did.
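            The simple signature in this topic can be checked against a haplotype with a few lines of code.  This is only a sketch of counting signature matches;  it is not Friedman’s method nor my C type definitions.  The sample haplotype is hypothetical, and the names FRIEDMAN_SIGNATURE, SIGNATURE_67, and signature_matches are made up for illustration.

```python
# Signature values quoted in this topic (389II = 13 + 19 = 32).
FRIEDMAN_SIGNATURE = {"DYS390": 25, "DYS389II": 32, "DYS447": 25}

# With the extra value added in early 2011 for 67 marker samples.
SIGNATURE_67 = {**FRIEDMAN_SIGNATURE, "DYS594": 12}


def signature_matches(haplotype: dict, signature: dict) -> int:
    """Count signature markers whose value equals the haplotype's value.

    Markers missing from the haplotype count as non-matches.
    """
    return sum(1 for marker, value in signature.items()
               if haplotype.get(marker) == value)


# Hypothetical sample (not real data) with the ancestral 389 = 13, 30.
sample = {"DYS390": 25, "DYS389I": 13, "DYS389II": 30,
          "DYS447": 25, "DYS594": 12}

print(signature_matches(sample, FRIEDMAN_SIGNATURE), "of", len(FRIEDMAN_SIGNATURE))
print(signature_matches(sample, SIGNATURE_67), "of", len(SIGNATURE_67))
```

            A sample that misses one signature marker, like this hypothetical one, can still belong to the clade, which is why a signature is only a statistical indicator.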

 

L540 Neighborhood

            16 Aug 2018:  Neighborhood Table removed, due to the new FTDNA rules for on-line data.

            I still use the word Neighborhood to mean samples that seem close to L540 based on STRs but are not predicted L540 with high confidence based on STRs.  Neighborhood samples may have results for the L540 SNP test;  those are used to calibrate my predictions based on STRs.  I also use the word Neighbor to refer to samples that are close STR matches.
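            For readers who want to see how a “step” count between two samples can be computed, here is a minimal sketch.  It assumes the simplest rule, the sum of absolute repeat differences over the markers that both samples have.  My xls files and the FTDNA genetic distance may treat some markers differently, so take this as an illustration only.  The haplotypes and sample names are hypothetical.

```python
def step_distance(a: dict, b: dict) -> int:
    """Sum of absolute repeat differences over markers present in both samples."""
    shared = a.keys() & b.keys()
    return sum(abs(a[m] - b[m]) for m in shared)


def nearest_neighbors(target: dict, others: dict, count: int = 3) -> list:
    """Return up to `count` (name, distance) pairs, closest first."""
    ranked = sorted(
        ((name, step_distance(target, hap)) for name, hap in others.items()),
        key=lambda pair: pair[1],
    )
    return ranked[:count]


# Hypothetical haplotypes using four marker names mentioned on this page.
target = {"DYS390": 25, "DYS594": 12, "DYS636": 12, "DYS561": 17}
others = {
    "sample_a": {"DYS390": 25, "DYS594": 12, "DYS636": 12, "DYS561": 16},
    "sample_b": {"DYS390": 24, "DYS594": 11, "DYS636": 11, "DYS561": 15},
}

print(nearest_neighbors(target, others))
```

            With real data the same ranking would be run over all markers in the chosen set (37, 67, or 111), which is why step counts grow as more markers are compared.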

Gwozdz

            My sample is kit N16800.  N81304 is my 3rd cousin Gwozdz.

 

Kargul

            Edit 17 Dec 2015.

            Kit 199446, Aloysius Kargol is my closest STR match available on the web (other than my 3rd cousin).  In May 2010, his daughter noticed, on ancestry.com, that he and I are perfect matches at 12 STR markers.  I studied the LDS microfilms and located his 1820’s Kargul ancestor living in a village in Poland only 20 miles away from the village of my Gwozdz ancestor.  I paid for his FTDNA sample.  His L540 test came out positive, placing him in that new haplogroup.  We are 5 steps apart at 67 STR markers;  9 at 111.

            For estimating the size of L540 or C type, my cousin and Kargul should not be included, because I recruited them, paying for their tests.  Family sets such as these distort size estimates, when comparing the number of samples per haplogroup or per STR type or cluster.

 

Butman

            New topic 13 May 2011.  Rewrite 22 Dec 2015.  Edit 16 Aug 2018.

            Butman’s L540 SNP test came out negative in 2011.  That means he is not a member of the L540 haplogroup.  Kit N91348.

            This sample is interesting because it is an STR outlier from another haplogroup, coming out closest to C type.  (C type is the STR equivalent of L540.)

            At 67 markers, this sample actually falls within C type;  check the numbers in that table, at the columns for the 67 and 37 marker modal haplotypes.  At 111 markers it falls outside C type, because the 111 marker set has quite a few good signature markers for C type.  Before 2011, at this web page, I listed this sample as at the edge of C type, or predicted L540 with low confidence.  Using only the 37 marker set, Butman’s 5 closest neighbors are C type (Dec 2015).

            This sample recently came out negative for S3003, which is the “father” of L540.  The MRCA node for S3003 is older than the MRCA for L540.  This sample tested V13+ but has not yet been tested for all the recently discovered SNP branches of V13.  Using all 111 STR markers, Butman has no close neighbors;  his closest are Bartlett at step 21, Hohnloser at step 22, and Hochreiter (L540+) at step 23 along with another Bartlett sample and two other samples that are not in the Table above (Dec 2015).

            In the Y-DNA tree, Butman’s node where he branches apart from L540 is surely older than 1,000 years and might even be older than 4,000 years, according to the estimated age of L540.

            What does this mean?  The simplest explanation is that Butman is alone in the E-M35 database, in a very small haplogroup that branches off the branch leading to S3003 and L540 perhaps 2 or 3 millennia ago.  Another possibility:  he may belong to the recently discovered Z17264 haplogroup, since Bartlett belongs to that one (Table above).  Z17264 is a twig in the main branch Z5018 so Butman might have an MRCA older than Z17264, perhaps.  (The test results might come out Z5018+ Z17264-.)  This paragraph is statistical speculation;  Butman might end up in a new branch of V13, negative for all known branches, for all we know.  This paragraph is a good example of the uncertainty of STR based predictions for outliers.  Big Y or SNP tests are needed here.

 

Fredeen

            Rewrite 27 Dec 2015.

            Kit 162917, Fredeen, has been listed at this web page since Mar 2010.  L540+ result May 2011.

            This sample is an STR outlier.  Even with all 111 markers, this Fredeen sample differs a lot from all the other L540 (C type) samples.  The closest neighbor is at step 24;  most L540 samples have closest neighbor at step 14 to 18.  (Samples with the same family name are even closer, of course.)

            The original best L540 signature marker is DYS389 = 13,32;  Fredeen has 13,30, which is the ancestral value (for most Neighborhood samples outside L540).  Fredeen also differs at two other L540 signature markers.

            The simplest explanation is that Fredeen belongs to a branch with a node in the L540 tree that is older than the other nodes.  Perhaps those 3 signature markers mutated to the L540 values after the node leading to Fredeen.

            However, there is an alternate possibility:  Fredeen may belong to one of the currently known branches;  perhaps those 3 signature markers experienced back mutations;  perhaps the Fredeen line has more mutations than normal, due to the luck of mutations.  Read the following topic, Gebert, also an outlier.

            SNP testing is required to determine the branch for this sample.

 

Gebert

            Rewrite 27 Dec 2015.

            I noticed Gebert’s sample on-line and encouraged him to join the E-M35 project, which he did in 2011, kit 166692 in the table.  I helped pay for the orders for the L540 test and for the 111 extension.  He purchased Big Y in 2014.

            Gebert is also an outlier;  read the previous topic, Fredeen, for a brief explanation.  Gebert is not quite as extreme an outlier as Fredeen, with closest neighbor at step 20.  Gebert also has the ancestral DYS389 = 13,30, and also differs at two other signature markers (not the same two as Fredeen).

            In this case, because Gebert purchased Big Y, we know that this sample falls in the Z29042 branch of the L540 tree.  So it is clear that the Gebert line has more than the expected number of STR mutations;  it is just luck that those 3 signature markers mutated back to the ancestral values, because L540 samples both in Z29042 and outside Z29042 have the signature values.  This sample is an example of the limitation of predicting haplogroup based on STR values.

 

Hohnloser

            Rewrite 22 Dec 2015.

            Hohnloser (kit N39989) is another outlier outside L540.  To understand this, please see the topic above for Butman.  Hohnloser is not quite as close to C type as Butman, but otherwise the Butman discussion mostly applies also to Hohnloser.

            Hohnloser has been mentioned here at this web page since 2010.

            Hohnloser also does not belong to the L540 haplogroup because his SNP test came out negative.  He has not been tested for S3003.

            Hohnloser’s nearest neighbors at 111 markers, step 22, are Butman and two other samples not in the Table above.  Hohnloser’s nearest neighbors with haplogroup identification are at the next step, 23:  3 samples, 2 of which are L241+.  However, Hohnloser tested L241-.  L241 is a branch of Z5018, so Hohnloser might fall in one of the other Z5018 branches.

            Jorg Hohnloser has extensive family tree research results.  He administers a Hohnloser project at FTDNA.  He exchanged helpful email discussions with me.

 

Hochreutter

            New topic 12 Dec 2014.  Edit 17 Dec 2015.

            Kit N45041, Administered by Andrew Hochreiter, who runs the Hochreiter Project.

 

Ysearch

            Due to the European Union DNA privacy law in 2018, Ysearch.org was closed down, so that STR data is now unavailable.

 

Ancestry.com

            Update 27 Dec 2015.  Edit 16 Aug 2018.

            Ancestry.com no longer provides a comprehensive Y-DNA database.  They now concentrate on autosomal DNA (all chromosomes, not just Y).

            Kargul originally matched with me at this site, back in 2010, so I encouraged Kargul to join the E-M35 Project.

            I last checked for matches 16 May 2011, when the Y-DNA database was still active.  There were 9 matches of Y-DNA to Kargul & me, but these were not very close matches.

 

Age of L540

            Rewrite 23 Jan 2021.  Edit, with a speculative last paragraph 3 Feb 2021

            Summary:  The L540 haplogroup is about 2,000 years old, which is the time to the most recent common ancestor (TMRCA).  This male line seems to have formed with a split from the ancestral (Z7019) haplogroup roughly 4,000 years ago.

            These are “rough estimates”;  there are reasons why these age numbers may not be exact, and some of those reasons are explained in the following paragraphs of this topic.

            Clarification:  The L540 segment spans the time from 4,000 years ago (TMRCA of the Z7019 haplogroup) to 2,000 years ago (TMRCA of the L540 haplogroup).  There are no known branches along that 2,000 year segment of time (from 4,000 to 2,000).  The actual L540 SNP originated some time during that segment of time;  we do not know exactly when.  There are many other SNPs that also originated during that segment, so these are called phyloequivalent to L540.  A TMRCA is also called a node (a branching point of the tree).

            Yfull does not have data for the Z7019 node, but Yfull does a good job of estimating node ages.  Very briefly, Yfull provides estimates of TMRCA age for haplogroups, based on the number of accumulated SNPs.  The calculation is complicated, as explained at the Yfull site.

            Here is a link to the Yfull L540 tree: 

https://www.yfull.com/tree/E-L540/.  Notice at the L540 line:

“formed 4600 ybp, TMRCA 2000 ybp”;  ybp = years before present.  Click on the “info” box for links to details of the Yfull age estimation methodology.  This is my source for the L540 TMRCA 2000 years ago age.  That “formed 4600 ybp” is the TMRCA for CTS1273, the immediate ancestor of L540 in Yfull’s tree;  I use that 4600 for Z7019 in a paragraph below.

            The CTS1273 to L540 segment in the Yfull tree has 28 phyloequivalent SNPs;  all of us L540 men have all 28 of these SNPs, and these 28 are not found in men at Yfull who are not in the L540 haplogroup.  Yfull calculated the estimated L540 segment based on these 28.  The 2,000 ybp TMRCA is calculated by averaging the number of additional SNPs carried by only some of the men in our L540 haplogroup - different SNPs in different branches.

            Statistical adjustments:  These Yfull numbers for L540 are for 23 Jan 2021.  In the past, when there were fewer L540 men known, the numbers changed as more men (more data) became available, mostly for statistical reasons;  the numbers have been stable now for several months, but the numbers may well change slightly in the future, for statistical reasons - with more data.

            More importantly, I expect significant changes in the future, if new branches are discovered, as I explain in the following paragraphs of this topic.

            The FTDNA tree for Y-DNA uses S3003 as the name for what Yfull and I call L540.  That’s OK:  FTDNA and Yfull both show L540 as phyloequivalent to S3003, based on the data that they have.

            FTDNA lists 41 (23 Jan 2021) phyloequivalent SNPs, including L540 and S3003.  Compare to 28 for Yfull, as mentioned above.  I can think of a few reasons for the difference, but I don’t know for sure.  I suppose the main reason is that FTDNA uses all the data from the most recent version of Big Y, which covers more of the Y chromosome (more SNPs) than previous versions.  The Yfull tree also uses Big Y data with very few exceptions;  I assume Yfull lists only the SNPs found in all samples of a haplogroup, thereby restricting the list to the original Big Y;  I’m not sure of this assumption.

            Also, identifying SNPs from raw DNA data is complicated;  it’s not surprising that different web sites (different computer algorithms) have different counts.

            Z7019 branch:  The ISOGG tree shows branches for S3003, as I outline above for L540 in the Y-tree.  ISOGG shows Z7019 as a branch of S3003, and L540 as a branch of Z7019.  That means ISOGG has at least one sample that is positive for S3003 and negative for Z7019 and L540, and at least one sample that is positive for Z7019 and negative for L540.  Z7019 is not listed by FTDNA nor Yfull trees.  Perhaps Z7019 was found in a region of the Y chromosome not tested by Big Y.  ISOGG does not publish who provides samples.  ISOGG has a “~” next to Z7019, indicating not full certainty.  I have some concerns about Z7019, but I’m assuming it represents a valid branch because Steve Fix analyzed a sample called PGP89, which is S3003+ L540-.

            We can make a very rough guess for the TMRCA node of Z7019 from the number of phyloequivalent SNPs listed by ISOGG:  ISOGG shows 4 phyloequivalent SNPs in the S3003 segment, only that one Z7019 SNP in the Z7019 segment, and 15 phyloequivalent SNPs in the L540 segment.  20 Total.  19 of those SNPs (not Z7019) are listed by FTDNA and Yfull.  The TMRCA of Z7019, based on ISOGG SNPs, falls 5/20, or 1/4th the time distance along the S3003 to L540 segment.  Yfull figures that segment as 4,600 minus 2,000 = 2,600 years;  1/4th is 650 years, so I’m guessing L540 formed (branched off) from Z7019 at 4,600 minus 650 = 3,950, rounded off to 4,000 years ago.  Again, this is a very rough guess for the formation time of L540, as summarized at the top of this topic.  This paragraph makes no change to the 2,000 ybp for the L540 TMRCA. 
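            The rough guess in the paragraph above can be reproduced with a few lines of arithmetic.  This sketch assumes, as that paragraph does implicitly, that SNPs accumulate at a steady rate along the segment.  The function name split_time is made up for illustration.

```python
def split_time(formed_ybp: float, tmrca_ybp: float,
               snps_before_node: int, snps_total: int) -> float:
    """Place a node along a segment in proportion to the SNPs that precede it."""
    segment_years = formed_ybp - tmrca_ybp
    return formed_ybp - segment_years * snps_before_node / snps_total


# ISOGG counts quoted above for the three segments.
isogg = {"S3003": 4, "Z7019": 1, "L540": 15}
total = sum(isogg.values())               # 20
before = isogg["S3003"] + isogg["Z7019"]  # 5

# Yfull ages: formed 4600 ybp, TMRCA 2000 ybp.
estimate = split_time(4600, 2000, before, total)
print(estimate)  # 3950.0, rounded off to 4,000 in the text
```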

            Validation of the Z7019 branch will probably come with time as more men from that haplogroup show up with DNA data positive for S3003 and negative for L540, plus positive for about 1/4th of the SNPs currently listed as phyloequivalent to S3003 and L540 by FTDNA and Yfull.

            Even if Z7019 is not validated, the discussion above demonstrates how a new branch can split a segment of the Y-DNA tree.  In the case of Z7019, the Yfull 4,600 ybp formation time of L540 becomes the formation time of Z7019, and a new time roughly 4,000 ybp becomes the TMRCA time for Z7019 and the formation time for L540.

            That new Z7019 node did not change the TMRCA for L540.  On the other hand, if the future provides a new branch with a node more recent in time, for example 3,000 ybp, and if L540 ends up in the older part of the split, then in that example the L540 age would get adjusted from 2,000 ybp to 3,000 ybp, including that new branch;  the 2,000 ybp number would remain as the best estimate for a new haplogroup representing the currently known branches of L540.

            At the top of this topic, I mentioned that the actual L540 SNP originated some time during the segment of time from roughly 4,000 to 2,000 years ago;  we do not know exactly when.  That’s sometime from the TMRCA of Z7019 to the TMRCA of L540.  It’s possible the L540 mutation happened to one of those 2 MRCA, but it’s very unlikely.  How unlikely?  If we take 25 years as the average time per generation, that’s 4 generations per century, and that 2,000 year segment has 4x20 = 80 generations.  If we take 33.3 years per generation, that’s 3x20 = 60 generations.  Either way, that’s a lot of generations.  A good estimate might be 70 generations - a continuous chain of 70 male line descendants.  The L540 mutation happened in the Y-DNA of one of those 70 men;  we don’t know which one.

            Back to the mutations that are phyloequivalent to L540:  A few paragraphs above, I estimated the Z7019 segment to be roughly 1/4 the length of the L540 segment as currently (1 Feb 2021) estimated by Yfull (and by FTDNA).  Yfull has 28 mutations in that segment, but FTDNA has 41.  So I estimate 3/4 x 41 = about 30 mutations in the L540 segment when defined to start at Z7019.  That’s about 30 mutations spread out along that chain of about 70 men.  Any one of those 30 could be selected as the name of our haplogroup;  this topic is actually about all of them.  Probably more will be found in the future.  L540 was the first one found, in my Y-DNA, in 2011;  that’s when I renamed this web page “L540”.  Obviously, I like to use L540 as the name of our haplogroup.
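            The generation and mutation counts in the last two paragraphs can be checked with this short sketch.  The variable names are my own;  the numbers are the ones quoted above.

```python
SEGMENT_YEARS = 2000                 # roughly 4,000 to 2,000 years ago
centuries = SEGMENT_YEARS // 100     # 20

gens_at_25_years = 4 * centuries     # 4 generations per century -> 80
gens_at_33_years = 3 * centuries     # 3 generations per century -> 60
gens_estimate = (gens_at_25_years + gens_at_33_years) // 2  # 70

FTDNA_PHYLOEQUIVALENT = 41           # FTDNA count for the whole segment
l540_share = 3 / 4                   # part of the segment after the Z7019 node
snps_in_l540_segment = l540_share * FTDNA_PHYLOEQUIVALENT

print(gens_at_25_years, gens_at_33_years, gens_estimate)
print(snps_in_l540_segment)  # 30.75, "about 30" in the text
```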

            Long segment discussion:  The current Yfull L540 segment spans 4,600 to 2,000 ybp.  Using ISOGG data, I adjusted that to roughly 4,000 to 2,000 ybp.  Either way, that’s a long time without any known branches.  This is not unusual;  many old segments in the Y-DNA tree span thousands of years without known branches.  The reason:  most new Y-DNA haplogroup branches become extinct because statistically that’s most likely.  The many men who form that continuous male line chain in a segment are like the rare winners in a casino.  Or like a group of men who get together to buy lottery tickets, and win the lottery.  Throughout the time span of a segment, many men in that chain had more than one son, but by luck only one of those sons at each generation had a male line that did not go extinct.  Many of the male lines that went extinct probably existed for centuries.  It’s possible and even likely that 2,000 years ago hundreds or maybe thousands of men were L540, most of them not descended from our MRCA.  Statistically, most male lines go extinct, so maybe all those lines became extinct with one exception - the one line that we now call L540.  Statistics is the simplest explanation, but it’s not the only explanation.  We use the word “statistical” when something is caused by a large number of causes, where none of the causes dominates.  But maybe there was one dominant cause, like a plague, and that was the dominant reason why all but one of the L540 branches went extinct.  The Justinian plague, starting in 541, was preceded by a few years of famine caused by cold summers starting in 536, apparently due to volcanic dust in the atmosphere;  that’s an example of 2 dominant causes combining to cause extinction.  Maybe most of the L540 men belonged to a tribe that became involved in a tribal war just before 536, so that there were 3 reasons - war, famine, and plague - that together caused all but one of the L540 lines to expire over a short period of time.
This last paragraph of this topic is highly speculative.  This paragraph can be applied to any long segment of the Y tree, not just L540.  I included this paragraph not to propose an explanation for the age of L540, but to emphasize that we don’t know the explanation.  We also do not know the region of origin, next topic:
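            As an aside on the statement in this topic that most male lines go extinct by chance alone:  a small simulation can illustrate it.  This is a toy branching model, not a reconstruction of L540 history.  The probabilities for the number of sons are invented for illustration (they average exactly one son per man, a stable population), and all the names in the code are hypothetical.

```python
import random

# Assumed chances that a man leaves 0, 1, 2, or 3 sons who themselves
# reach adulthood.  Invented numbers; the mean is 1.0 son per man.
SON_COUNTS = [0, 1, 2, 3]
SON_WEIGHTS = [0.4, 0.3, 0.2, 0.1]


def line_survives(generations: int, rng: random.Random, cap: int = 1000) -> bool:
    """Follow one founder's male line; True if any man is left at the end."""
    men = 1
    for _ in range(generations):
        if men == 0:
            return False
        if men >= cap:
            # Shortcut: a line this large is treated as having survived.
            return True
        men = sum(rng.choices(SON_COUNTS, weights=SON_WEIGHTS, k=men))
    return men > 0


def extinct_fraction(trials: int = 2000, generations: int = 70, seed: int = 1) -> float:
    """Fraction of simulated founders whose male line dies out."""
    rng = random.Random(seed)
    extinct = sum(1 for _ in range(trials) if not line_survives(generations, rng))
    return extinct / trials


print(extinct_fraction())
```

            With these assumed numbers the large majority of lines die out within 70 generations, without any war, famine, or plague in the model.  Changing the weights changes the fraction, so the result is only as good as the assumption.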

 

Place of Origin for L540

            Rewrite 3 Feb 2021.

            Summary:  Our L540 male line MRCA (Most Recent Common Ancestor) probably lived in Central Europe.  This is not certain;  it’s just a best guess.

            This best guess is based on the reports by L540 men for the location of their most distant known male line ancestor, although most of those ancestors lived only 200 to 400 years ago.  Our L540 TMRCA [previous topic] is roughly 2,000 years ago, so that’s why Central Europe is just a best guess.  After all, our MRCA might have lived his early life somewhere far from Central Europe and migrated to Central Europe.  Or he may have lived his entire life far from Central Europe and some of his sons or grandsons or even later descendants may have migrated, whereby many of his descendants today are in Central Europe.

            My L540 Tree has 35 ancestor names, corresponding to Y-DNA data from 35 men.  For a statistical analysis, I need to disregard the men who were recruited, counting only the men who independently decided to test their DNA.  For example, the 4 Hochreutter all come from one project, so they count as one.  I’m also counting the 2 Hartsfield as one.

            Gush and Kargul (A6295) independently tested autosomal DNA and they matched with me; I highly encouraged them to test Y-DNA.  It’s hard to say if their Y-DNA testing is fully independent.  Also, I do more encouragement for everyone in my A6295 branch.  So I adjusted A6295 by counting Gwozdz, Gush, and Kargul as one common ancestor from Poland.

            That makes 29 ancestors, from various countries as listed in my Tree.  The results:

                                    10 Germany

                                    4   Poland

                                    3   Czech

                                    2   Hungary

            19 Central Europe

                                    4   Sweden

                                    3   Norway

            7  Northern Europe

                                    2   Russia

                                    1   Ukraine

            3  Eastern Europe                  

                        29 Total (data as of 3 Feb 2021)

            So Central Europe seems like a best guess for where our MRCA lived.  19 out of the 29 male line ancestors of the men in my L540 Tree are from Central Europe.  29  is not a very statistically significant sample.  With more data in the future, I suppose the list above might look different.  However this statistical uncertainty is not important, because I mentioned above a much more important reason for uncertainty:  Our MRCA lived about 2,000 years ago, while these 29 known ancestors lived only 200 to 400 years ago.
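            The counting in this topic can be reproduced with a short sketch.  It assumes that the three groups named above (Hochreutter, Hartsfield, and the A6295 trio) are the only ones merged, which does give the 29 in the list.  The variable names are my own.

```python
MEN_IN_TREE = 35

# Groups counted as one ancestor each, as described in this topic.
merged_groups = {
    "Hochreutter": 4,
    "Hartsfield": 2,
    "A6295 (Gwozdz, Gush, Kargul)": 3,
}
independent = MEN_IN_TREE - sum(size - 1 for size in merged_groups.values())

# Ancestor counts by country, grouped by region (data as of 3 Feb 2021).
counts = {
    "Central Europe": {"Germany": 10, "Poland": 4, "Czech": 3, "Hungary": 2},
    "Northern Europe": {"Sweden": 4, "Norway": 3},
    "Eastern Europe": {"Russia": 2, "Ukraine": 1},
}
region_totals = {region: sum(c.values()) for region, c in counts.items()}
total = sum(region_totals.values())

print(independent, total)
for region, n in region_totals.items():
    print(f"{region}: {n} of {total} ({n / total:.0%})")
```

            Central Europe comes out at about two thirds of the 29, which is the basis for the best guess, subject to the much larger uncertainty from the 2,000 year gap.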

            My L540 Tree has only about half the men with Y-DNA data who are known L540+ or expected to be L540 based on STR data.  But those other men have not been tested for the branches of L540, and most of them do not report where their male line ancestors lived.  In this topic I’m using branch data and location data.  Most of those other men have their Y-DNA data available at the E-M35 Project.

            My Tree has at least 6 branches from the L540 node:  Y7026 is the largest, with 16 of the 29.  A6295 has 5 of the 29.  Under L540*, there are 4 with “BY”, meaning they have Big Y data and did not have any SNP in common with one another, so those represent 4 branches with only 1 each so far;  each of those has many novel unique SNPs.  The other 4 L540*, without “BY”, tested L540+ Y7026- A6295- with individual SNP tests;  they have not tested for the unique SNPs of the previous 4, but they will test when one of those unique SNPs gets identified as a definition for a branch when someone in the future matches.  One or more of these last 4 might in fact be a member of a branch of one of the previous 4, so these last 4 may or may not represent independent branches.  In other words, our L540 tree is more like a bush, with at least 6 independent branches identified;  this is a strong hint that more independent branches will be identified in the future.
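
            As a quick check on the counts in the paragraph above, here is a minimal Python sketch.  The group sizes are taken from the text;  the upper bound is only an illustration that assumes every L540* man without Big Y data turns out to be on his own branch, and it says nothing about men who have not tested yet.

```python
# Men per group under the L540 node, as stated in the paragraph above.
groups = {
    "Y7026": 16,
    "A6295": 5,
    "L540* with Big Y": 4,        # no SNPs in common, so one branch each
    "L540* single SNP tests": 4,  # branch membership not yet known
}

total_men = sum(groups.values())

# Y7026 and A6295 are one branch each; each Big Y L540* man is his own branch.
confirmed_branches = 2 + groups["L540* with Big Y"]

# Hypothetical upper bound: every untested L540* man is on a separate new branch.
max_branches = confirmed_branches + groups["L540* single SNP tests"]

print(f"{total_men} men, between {confirmed_branches} and {max_branches} branches")
```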

            Germany has 10 of the 29 men in the list.  However, all but 1 of those 10 are in the Y7026 branch of L540.  So Germany is the specific best guess for the origin of the Y7026 branch -  where the MRCA of the Y7026 men lived.  But that’s a highly uncertain guess.  The Yfull Tree estimated TMRCA ages are rounded off;  if we examine the Yfull calculation details, we see 1,976 years for L540 and 2,003 years for Y7026.  That is only 27 years of separation, which suggests that the Y7026 MRCA is the son of the L540 MRCA.  Of course that’s just a very rough estimate;  he might be a grandson or great grandson, or 2-great grandson.  I checked the web:  Julius Caesar’s writings are the oldest known use of “Germani” for people who lived north of the Roman lands.  That’s a little more than 2,000 years ago, before Rome was an empire, and those “Germani” lived in an area larger than modern Germany.  So it seems the best guess for the Y7026 MRCA is that he was one of the Germani.  I have the state in Germany for only 5 of those 10 L540 German ancestors:  1 is from Prussia, 2 are from the former East Germany, and 2 are from the southeast of the former West Germany.  5 is not enough for statistical significance, but the trend seems to be toward the east side of Germany, consistent with most of the other L540 ancestors being from the region to the east of Germany.  A6295 has only 1 ancestor entry from Germany, and none of the L540* ancestor entries are from Germany, so there is no reason to suggest Germany as the origin of L540, although Germany cannot be ruled out.
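
            The 27 year separation mentioned above can be turned into a rough number of generations with a few lines of Python.  The two TMRCA values are the ones quoted from the Yfull calculation details;  the generation lengths are hypothetical round numbers chosen for illustration, not values from Yfull.

```python
# TMRCA values in years, as quoted above from the Yfull calculation details.
tmrca_l540 = 1976
tmrca_y7026 = 2003

# Hypothetical male generation lengths in years (assumptions for illustration).
generation_lengths = (25, 30, 35)

separation = abs(tmrca_y7026 - tmrca_l540)
for years in generation_lengths:
    print(f"{years} years per generation: about {separation / years:.1f} generations")
```

            For any of these assumed generation lengths the result is close to one generation, but the uncertainty in each TMRCA estimate is far larger than 27 years, so this is only a rough indication.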

            So far, no specific country or region within what we now call Central Europe is a reasonable guess for the specific place of origin of L540.

            There is some bias in on-line DNA data toward Europe.  Many parts of the world are not represented well in the database.  I suppose there is a slight chance that someday L540 samples will show up as common elsewhere - for example a group of villages somewhere in the mountains of Russia, or somewhere in the Balkans, or somewhere on the Eurasian Steppe.  Discussing an origin today on the basis of only 29 samples is a bit speculative.  We’ll see how it comes out as more data accumulates.

            Origin of ancestral branch haplogroup:  Z7019 and S3003 are the ancestral haplogroups for L540.  But those 2 are very small, with no significant data so far, so I’ll ignore Z7019 and S3003 for this paragraph.  At Yfull, CTS1273 is the ancestral haplogroup for L540.  CTS1273 is much larger than L540, and Yfull estimates the CTS1273 TMRCA at 4,600 years ago, which is the formation date for L540 [previous topic].  If we glance at the countries listed at Yfull for CTS1273, we see a much wider range than for L540;  England, Italy, Greece, and other countries are included in CTS1273.  This makes sense;  4,600 years is a long time, allowing a wider range of migration.  The place of origin for CTS1273 is even more uncertain than for L540.

 

Size of L540

            New topic 29 Dec 2015.  Rewrite 12 Apr 2020.

            I estimate there are 100,000 L540 males living in the world.  This is my very rough educated guess.  This estimate is almost surely not wrong by a factor of 10;  in other words, my 99% confidence range is more than 10,000, less than 1 million.  My 90% confidence range is a factor of 3;  in other words the actual number is very likely between 33,000 and 300,000.
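
            The two confidence ranges above are multiplicative:  divide and multiply the central estimate by the same factor.  A minimal Python sketch of that arithmetic follows;  the function name is just a label chosen for this example.

```python
estimate = 100_000  # estimated number of living L540 males, from the text above


def multiplicative_range(center, factor):
    """Return the range from center / factor up to center * factor."""
    return center / factor, center * factor


low99, high99 = multiplicative_range(estimate, 10)  # 99% confidence range
low90, high90 = multiplicative_range(estimate, 3)   # 90% confidence range

print(f"99% range: {low99:,.0f} to {high99:,.0f}")
print(f"90% range: {low90:,.0f} to {high90:,.0f}")
```

            The factor of 3 gives about 33,000 at the low end, which is where the rounded figures in the text come from.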

            That’s a wide confidence range.  I worked out that estimate in 2015, based on a statistical analysis of STR data.  I expected to narrow the confidence range as more STR data accumulated.

            A more precise estimate should be possible in a few years based on SNP data, but there is not enough SNP data publicly available yet for L540, as of spring 2020.

            My 2015 version of this topic had a detailed explanation of my estimate, but the STR data that my estimate was based on was no longer available on-line in 2018, so I removed the detailed explanation.

 

Validity of C Type

            Edit 27 Dec 2015.

            Quite frankly, I was originally surprised by cluster C.  Friedman did a good job finding this one.  I admit I dismissed it when I first saw cluster C in 2007 because it was so small that statistical significance did not seem possible to me.  I postponed analysis until Jan 2010, independently verifying cluster C as C type.

            By “valid” I mean a cluster whereby most of the samples belong to a single clade, and whereby very few other samples in the database belong to that clade.  In other words, a valid cluster should eventually have a corresponding SNP discovered.  Throughout 2010 I confidently predicted such an SNP here in this topic, although I doubted it would be discovered soon.  L540 turned up in my WTY (next topic) in 2011.  C type is the STR equivalent of L540.

 

My WTY Analysis

            Edit 27 Dec 2015.

            Fifteen new SNPs were discovered in my “Walk Through the Y” (WTY):  L535 through L547, L614, and L618.  All 15 are available as commercial SNP tests from FTDNA.

            My WTY test read about 200,000 base pairs of the Y chromosome in Feb 2011.  WTY is no longer available, having been replaced by Big Y.

            I announced 8 new SNPs here on 29 Mar 2011.  The count on 30 Mar was 13 new SNPs in my WTY.  L614 was added in June.  L618 was added in August.  That was a lot more than I expected.  I now realize that’s because FTDNA expanded the number of DNA bases included in WTY just before my test.  Also, I seem to have been the first WTY from E-M78 in quite some time.

 

SNP Test Orders

            Rewrite 13 Mar 2020.

            The Topic Dividing L540 has tips for which SNPs to order.

            FTDNA:  SNP tests cost $39 each if your sample is already there from previous testing.  The V13 SNP Pack or the V68 SNP pack cost $119.  From your FTDNA home page, top right, click on “ADD ONS & UPGRADES”.  Look for either “Y-DNA SNPs” or “Y-DNA SNP Packs” and click on the “EXPLORE” box to the right.  Then type the SNP or Pack into the “Find” box to search.

            FTDNA has an “E-V68 SNP Pack”, with more than 100 SNPs downstream of V68, to determine which SNPs are yours.  L540 is a branch of V13, which is a branch of V68.  This pack has SNPs for many of the branches of L540.

            FTDNA update 13 Mar 2020:  The following L540 SNPs are included in the V68 Pack and also individually:  S3003, L540, Y7026, A783, Z29042, A6295.  If you are sure you are V13, the V13 pack includes:  S3003, Y7026, A783, Y12393, Z29042, A6295.

            Yseq:  SNPs are $18 each.  They ship a cheek swab kit with your first order;  the swab is good for several orders.  At the Yseq home page, the “Quick Find” box is on the left, near the center.  If you Quick Find L540, the result is a long list of SNPs that are in L540, as well as the E1b-V13 Panel.

            Yseq update 13 Mar 2020:  If you are sure you are V13, the Yseq E1b-V13 Panel includes:  S3003, L540, Y7026, A783, Y12393, A779, A782, Z29042, Z39377, A6295.  It costs $88.

            Please let me know if you are L540 and order SNPs from Yseq, so I can keep track of results.
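
            To compare ordering individual SNPs against a pack or panel, here is a rough Python sketch using the prices quoted above (as of 13 Mar 2020;  they may have changed).  It only compares prices.  It ignores which SNPs each pack actually includes, and the FTDNA single test price assumes your sample is already there.

```python
import math

# Prices in US dollars, as quoted above (13 Mar 2020).
prices = {
    "FTDNA V13 or V68 SNP Pack": {"single": 39, "bundle": 119},
    "Yseq E1b-V13 Panel": {"single": 18, "bundle": 88},
}


def break_even(single, bundle):
    """Smallest number of single SNP tests that costs more than the bundle."""
    return math.floor(bundle / single) + 1


for name, p in prices.items():
    n = break_even(p["single"], p["bundle"])
    print(f"{name}: cheaper than single tests from {n} SNPs "
          f"({n} x ${p['single']} = ${n * p['single']} vs ${p['bundle']})")
```

            With these prices, a pack or panel only pays off if you would otherwise order 4 or more single SNPs at FTDNA, or 5 or more at Yseq.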

 

Index of Bookmarks

            If you open this html document with Word, all the link targets (bookmarks) can be viewed alphabetically or by location.

 

References & Sources

            Update Nov 2019.  Update 30 Jan 2020.  Edit 30 Jun 2022.

            Big Y-700:  https://www.familytreedna.com/products/y-dna#/compare.  A commercial product at FTDNA for reading about 12 million base pairs of the DNA of the Y chromosome, which has about 60 million base pairs total.  New SNPs are being discovered in the Big Y-700 data provided by customers.  Big Y-700 replaces the original Big Y, which was used to discover most of the SNP branches in the L540 Tree.  There was also another version, Big Y-500, for a while.  The newer versions provide more SNPs.

            Y-37 and Y-111:   https://www.familytreedna.com/products/y-dna#/compare.  These tests are less expensive than Big Y, using 37 or 111 STRs to predict your SNP branch.

            E-M35, a project at FTDNA, is my main source of data.  Previously called E3b.  Link:  https://www.familytreedna.com/groups/e-3b/about/background.  The official name today would be E1b1b1.  ISOGG changes the name when new defining SNPs are discovered, so the name may change again in the future.  M35.1 is the name of the SNP that defines E1b1b1 within haplogroup E.  I am not planning a separate L540 project, because it is more convenient to run this web page using the E-M35 project.

            Haplozone is a web site for analysis of data from the E-M35 project.  This site has not been fully updated since September 2013, but it is still useful.  Link:  http://www.haplozone.net/e3b/project.  It has the data from E-M35, plus some data added from sources other than FTDNA, so this database is larger than the E-M35 database.  Page with a listing of proposed clusters:  http://www.haplozone.net/e3b/project/cluster/.  Page with L540 / C cluster samples:  http://www.haplozone.net/e3b/project/cluster/42.  Discussion forum:  http://community.haplozone.net/

            Yseq:  www.Yseq.net.  A company that provides Y-SNP tests at competitive prices with fast turnaround.

            Yfull:  www.Yfull.com.  A company that provides analysis of raw DNA data, very useful for Big Y data.  Yfull presents a tree for L540 at:

                        https://www.yfull.com/tree/E-L540/

            SNP Tracker is a web page added to the E-M35 project in late 2011, to keep track of all the new SNP branches in M35.  http://tinyurl.com/e-m35-snps.  Not up to date.

            The V13 data:  http://www.haplozone.net/e3b/project/cluster/10.  V13 is the defining SNP for E1b1b1a1b1a, a major branch haplogroup in E, and “father” of L540.  That page of data does not have the data for samples that have been assigned to clusters as subdivisions of V13, just the data that does not fit any downstream proposed STR cluster.  The number code for other clusters can be typed over that “10” to quickly get to other cluster data.

            Cluster C Data:  http://www.haplozone.net/e3b/project/cluster/42.

            ISOGG link:  http://isogg.org/tree/  Y-DNA tree SNPs and corresponding alphanumeric codes for the haplogroups.  ISOGG names change as new SNP divisions are discovered.  ISOGG names are getting quite long due to the flood of new SNPs in the past few years.  Click on the link for the E branch, and download it (as xlsx, for example) to search for L540 to see their version of the L540 tree.  For V13, it’s easier to search for PF2211, which is an alternate name.

            Steve Fix uses Big Y data to maintain a tree for V13.

            Andrew Lancaster was an administrator for the E-M35 (E3b) Project.  Andrew was particularly patient with me in long, helpful email discussions.  Villarreal and Friedman were also very helpful.

            Victor Villarreal was an administrator for the E-M35 Project.

            Elise Friedman was a co-administrator for the E-M35 Project and is administrator for the Jewish E3b project.

            Denis Savard is a current administrator for the E-M35 Project.

 

            Peter Gwozdz.  That’s me.  pete2g2@comcast.net.

 

Revision History

2010 Jan 14 original draft version

2010    13 updates

2011    28 updates

2012 - 2014     28 updates

2015    39 updates

2016 - 2017     28 updates

2018 - 2019     4 updates

2020    16 updates

2021 Jan 14  Michalski added to Tree;  add FT402631 branch

2021 Jan 23 rewrite of topic “Age of L540”

2021 May 5 add Kurganov to tree

2021 May 8 add Zeidler to tree;  add A1157* branch

2021 Jul 20 add Paul Sholtz to the tree

2021 Sep 14 add Grundel to the tree

2021 Oct 19 new branch of L540:  Y82423

2021 Nov 27 move Shultz in Tree

2022 Jun 30 update tree

2023 Mar 7 minor updates on the first 4 topics, including the tree