L540 - A Small Y-DNA Haplogroup
7 Mar 2023
Peter Gwozdz
pete2g2@comcast.net
News
7 Mar 2023: The L540 Tree
continues to grow as a few men per year submit their Y-DNA samples and test
positive for L540. My version of the L540 Tree has 39 men.
I only include men who send me an email with permission to list them
here. The FTDNA “Big Y
Block Tree” has 62 men in L540 (S3003).
The Yfull tree has 50 sample id numbers in
L540. I do not have contact information
for many of these names and sample numbers; most men that I contact give me
permission to list them here. My tree has
several men who are not in the Big Y nor Yfull trees because they tested
individual SNPs at Yseq
following my recommendations. Big Y,
Yfull, and Yseq use different phyloequivalent
SNPs as the code names for some L540 branch nodes; in such cases I use my favorite SNP code name
and include the others in parentheses.
Edited 30 Jun 2022.
This web document is a summary
of my information about a small haplogroup
of Y-DNA based on an SNP mutation named L540. The subject is
genetic genealogy.
The L540
Tree shows the samples that have
been tested for the branches of L540.
Prediction to the branches cannot be done with confidence by STRs, so
L540 samples without tests for these specific SNPs need to do SNP testing to determine their branch.
The L540 haplogroup seems to
be roughly 2,000 years old, with an origin perhaps in Central Europe.
This web document is written
for people reasonably familiar with the jargon of genetic genealogy. If you are new to genetic genealogy you might
prefer to first read an Introduction
that I wrote for another of my web documents.
My References
and Sources are listed at the bottom.
L540 Tree
6 Feb 2023 Update
Click on a link in this tree for more discussion about that SNP or about that sample (DNA data from one man).
Format for each individual line (data from one sample):
Ancestor: Country. FTDNA code number, BY = Big Y data; Yfull code;
Yseq code
References: FTDNA, Big Y, Yfull, Yseq
-- means no Big Y data; placed in the
tree by testing individual SNPs
More explanation: Next
topic L540 Tree Format
7 Mar 2023 Update
L540 (S3003) Tree:
L540* = A6295-, Y7026-, Y82423-
Kovalev: Russia. 268215, BY; YF04818
Ponto: Poland. 557832, BY; YF08450
A6295-, Y7026-, Y82423?
Henriksen: Norway. B4816, BY
Helgesson: Sweden. 682512, --
Belov: Russia. 321021, --
Simonsson: Sweden. 70482, --
Kusyi: Ukraine. 370518, --
Y82423
Hoff: Norway. 374375, BY; YF09864; 3203
Kurganov: Russia. B752475, --, YF84217; WGS37431
BY5854
Nowak: Poland. 225596, BY; YF03833
FTA519
Vajanszky: Hungary. IN36326, BY, YF78756
A9034
Y153993 (BY161037)
Eliasson: Sweden. 458953, BY; YF15457
Rohss: Germany. 555020, BY; YF14486
A9036-
Kargul: Poland. 199446, --; 4230
A9036/BY5850
Gwozdz: Poland. N16800, BY; YF02909; 1433
Gush (Gwozdz): Poland. B182584, BY; YF13748
Y7026 (Y7025)
Z29042-, A783-, Y17710?, FT402631?
Kline: Germany. 158091, --; 2360
FT402631
FT402631*
Sholtz: Czech. 926019, BY
Marschner: Czech. 480087, BY; YF05931
Mrocek: Czech. IN69642, BY
FT13909
Wacha: Czech. 347884, BY
BY82428
Stavbom: Sweden. B3807, BY; YF05186; 2100
Glasser: Germany. 171456, BY; YF04393
BY178938
Blind: Germany. B2670, BY; --; 2891
BY5193
Gebert: Germany. 166692, BY; YF01811
Roider: Germany. 275510, BY; YF04216
BY5185
Hartsfield: Prussia. 140927, BY; YF04834; 2397
Hartsfield2: Prussia. 80059, BY; YF13261
A783*
Weiand: Germany. 51282, BY
A1157
A1157*
Zeidler: Germany. B615039, BY; YF84363
Grundel: Germany. 768562, BY; YF90359
BY5876
Ratuszni: Hungary. 200924, BY
Stelz: Germany. 175213, BY; YF05757; 2028
Y12393*
Svercl: Czech. 155155, BY; YF02913
Michalski: Poland. SM10049, BY; YF80537
A779/BY5841
Y33576 (A782/BY5856)
Hochreutter: Germany. N45041, BY; YF02161
Hochreutter: Germany. --; YF09477
BY5909
Hochreutter: Germany. 131761, BY; YF06285
Hochreutter: Germany. --; YF09478
Update 7 Mar 2023.
This Tree has only
about half the men that I know about who have L540 positive Y-DNA results. If you are L540+ and would like to be listed
in this tree, please send me an email request.
pete2g2@comcast.net. If you are listed and wish to be removed,
just send me an email.
This tree uses the
most common format for Y-DNA trees, where the branches and sub branches are
listed below, using tabs for columns. For
each terminal branch, the samples are listed below that branch, with one more
tab.
Format for each individual line - from each individual sample (DNA data from one man):
Ancestor: Country. FTDNA code number, BY = Big Y data; Yfull code;
Yseq code
References: FTDNA, Big Y, Yfull, Yseq
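The line format above is regular enough to read mechanically. The following Python sketch is only an illustration under stated assumptions: the names LINE, Sample, and parse_sample are hypothetical and do not belong to any FTDNA, Yfull, or Yseq tool, and the pattern covers only the regular form of a line. Lines that deviate from that form (for example, a sample listed without an FTDNA code) return None and need to be read by hand.

```python
import re
from typing import NamedTuple, Optional


class Sample(NamedTuple):
    ancestor: str
    country: str
    ftdna: str              # FTDNA code number (kit number)
    has_big_y: bool         # True for "BY", False for "--"
    yfull: Optional[str]    # Yfull code such as "YF09864", if listed
    yseq: Optional[str]     # Yseq code, if listed


# Hypothetical pattern for "Ancestor: Country. FTDNA, BY|--; Yfull; Yseq".
# The Yfull and Yseq fields are optional; "--" in the Yfull slot means none.
LINE = re.compile(
    r"^(?P<ancestor>[^:]+):\s*(?P<country>[^.]+)\.\s*"
    r"(?P<ftdna>[A-Za-z0-9]+),\s*(?P<bigy>BY|--)"
    r"(?:;\s*(?P<yfull>YF\d+|--))?"
    r"(?:;\s*(?P<yseq>\d+))?\s*$"
)


def parse_sample(line: str) -> Optional[Sample]:
    """Parse one regular sample line; return None if it does not fit."""
    m = LINE.match(line.strip())
    if m is None:
        return None
    yfull = m.group("yfull")
    return Sample(
        ancestor=m.group("ancestor").strip(),
        country=m.group("country").strip(),
        ftdna=m.group("ftdna"),
        has_big_y=(m.group("bigy") == "BY"),
        yfull=None if yfull in (None, "--") else yfull,
        yseq=m.group("yseq"),
    )


print(parse_sample("Hoff: Norway. 374375, BY; YF09864; 3203"))
print(parse_sample("Kargul: Poland. 199446, --; 4230"))
```

A strict pattern is used deliberately, so that an irregular line is flagged for manual review instead of being silently misread.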
Each branch is defined
by an SNP.
All the samples in that branch have that particular SNP, and all the
samples outside the branch do not.
For example, L540 has 3 known
branches, defined by the SNPs Y82423, A6295 and Y7026. A6295 and Y7026 each have multiple branches.
L540* is listed first,
with 7 men, but that’s not really a branch;
those 7 samples are positive for L540 but seem to be negative for the known
branches. They are like twigs that might
become new branches in the future, as new samples show up matching with these 7. We expect that future samples will share a
few unique (“private”) SNPs with one of those 7, thereby defining new branches of
L540. 4 of those L540* samples do not have Big Y data; they test for individual
SNPs; they are so far negative for
the 2 main branches.
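The rule for placing a sample in L540* can be sketched in a few lines of Python. This is a hedged illustration, not a tool used for the tree: the function name classify and the dictionary layout are assumptions, and the three branch SNPs are the ones named above.

```python
# The three known branch SNPs directly below L540, as named in the text.
KNOWN_BRANCHES = ("Y82423", "A6295", "Y7026")


def classify(results):
    """Place a sample at the first level below L540.

    `results` maps an SNP name to True (positive) or False (negative).
    An SNP that was never tested is simply absent from the mapping.
    """
    if results.get("L540") is not True:
        return "not L540 (or untested)"
    positives = [snp for snp in KNOWN_BRANCHES if results.get(snp) is True]
    if positives:
        return positives[0]
    untested = [snp for snp in KNOWN_BRANCHES if snp not in results]
    if untested:
        # Same idea as the "Y82423?" label in the tree: status unknown.
        return "L540, branch unknown: " + ", ".join(s + "?" for s in untested)
    return "L540*"


print(classify({"L540": True, "A6295": False, "Y7026": False, "Y82423": False}))
print(classify({"L540": True, "A6295": False, "Y7026": False}))
```

The first call prints L540*, because every known branch is negative. The second prints a "branch unknown" label, because Y82423 has not been tested, which mirrors the group marked "Y82423?" in the tree.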
A9036/BY5850: That “/” means an SNP that has been assigned
two different names, discovered and named independently. These are the same SNP.
Y153993 (BY161037): These two are different SNPs, but they are phyloequivalent, which means that, so far, all
samples in that branch have both of these, so either can be used to define this
branch. In this case, I follow Yfull,
which uses Y153993, but FTDNA uses BY161037 as the name of this same
branch. Actually, this branch has more
than 10 phyloequivalent SNPs, so any one of them can be used as the name of
this branch.
S3003
used to be phyloequivalent to L540, but Steve Fix found
an on-line sample (PGP89) positive for S3003 but negative
for L540; this means L540 is a branch of
S3003. FTDNA still uses S3003 as the
name for the L540 tree; Yfull uses L540
for the name of the tree; both still
list S3003 as phyloequivalent to L540 because their databases do not have any
sample like PGP89.
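Phyloequivalence depends entirely on which samples are in the database, and a short Python sketch can make that concrete. Everything here is an assumption for illustration: the function name phyloequivalent_groups, the sample labels s1 and s2, and the simplification that every sample has a reliable call at every SNP. Only PGP89, S3003, L540, and Y7026 come from the text.

```python
from collections import defaultdict


def phyloequivalent_groups(calls):
    """Group SNPs that are carried by exactly the same set of samples.

    `calls` maps a sample label to the set of SNPs it is positive for.
    Returns only groups with two or more SNPs (the phyloequivalent ones).
    """
    carriers = defaultdict(set)
    for sample, snps in calls.items():
        for snp in snps:
            carriers[snp].add(sample)
    groups = defaultdict(list)
    for snp, samples in carriers.items():
        groups[frozenset(samples)].append(snp)
    return [sorted(snps) for snps in groups.values() if len(snps) > 1]


# A database without PGP89: S3003 and L540 cannot be told apart.
calls = {
    "s1": {"S3003", "L540"},            # hypothetical sample
    "s2": {"S3003", "L540", "Y7026"},   # hypothetical sample
}
print(phyloequivalent_groups(calls))    # [['L540', 'S3003']]

# Adding one S3003+ L540- sample splits the pair.
calls["PGP89"] = {"S3003"}
print(phyloequivalent_groups(calls))    # []
```

This is why two trees built from different sample sets can disagree about whether S3003 and L540 are the same branch.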
In other words,
different trees for L540 have different details. My tree is similar to Yfull’s and FTDNA’s trees
for L540 (S3003). My tree is restricted
to men who have requested to be listed; my
tree has samples that have not joined Yfull;
FTDNA’s tree is restricted to men who have purchased Big Y, but my tree
has some men who joined FTDNA but did not purchase Big Y, using Yseq tests
instead.
The FTDNA “Big Y Block
Tree” has 30 branches for S3003/L540 (more than my tree) because many men with
Big Y data have not contacted me with permission to list their sample. If you look at the Block Tree at your FTDNA
site, you will see very few names because only your closest “matches” are
listed. Also, I suppose some men
withhold permission for FTDNA to list their name. The Block Tree has 8 samples for L540* that I
do not have, which means we can expect more branches soon, because future
samples are likely to match unique SNPs from one of those L540* men.
The FTDNA “Y-DNA Haplotree
& SNPs” has the same branches for S3003/L540 in the simpler tab indented
format, and you can display or hide the sub branches for each branch; this
provides a compact simplified display.
Update 15 Dec 2020.
Rough outline of the
human Y-DNA tree with ISOGG and SNP code names, showing the location of L540:
E (M96)
E1b1b1
(M35.1)
E1b1b1a1
(M78)
E1b1b1a1b1a
(V13)
CTS8814/Z1057
E1b1b1a1b1a5
(S3003)
E1b1b1a1b1a5a1 (L540)
More
detailed trees are available at: Yfull, FTDNA, ISOGG,
and Steve Fix. These
trees have different details depending upon the different samples available,
and upon which SNP results are included.
The ISOGG tree does not have
branches CTS8814 or CTS1273. Z7019 is
unique to ISOGG.
Yfull and FTDNA combine S3003 and
L540 into one branch; FTDNA calls this
S3003; Yfull calls this L540.
There are intermediate
and parallel branches not shown in this outline of mine. These branches have many phyloequivalent SNPs not shown in this outline,
which is intended to show the location of L540.
Those long ISOGG code names (like
E1b1b1a1b1a5a1) change when new SNPs define new intermediate branches, which
makes those long names confusing.
Update 23 Dec 2020.
L540 is the code name for an SNP that was discovered in my WTY. L540 was announced 29 March 2011. On 27 Apr 2011 I demonstrated that L540
defines a new haplogroup branch of V13.
I use the code name L540 for the SNP,
for the associated haplogroup, and for the samples (men) in that haplogroup.
This haplogroup was predicted as cluster C based on STR
correlations in 2008. When I originated
this web page in early 2010, I coined the name V13C,
renaming it L540 on 30 Apr 2011. Cluster
C, also called C type, is the STR equivalent of L540.
Update
of statistics 23 Dec 2020: My L540 Tree has 35 samples.
This Tree has only about half the men that I know about who have L540
positive Y-DNA results. I do not list
samples in my tree until I get an email request to be listed by the man who
submitted the sample. Also, I do not
list samples with an L540+ result if they have not been tested for the 2 known
branches of L540. In addition, I know of
about a dozen more men who seem to be cluster C based on their STR results, but
they have not tested for L540. I do not
include relatives closer than 4th cousin in my tree; those with the same family name are distant
cousins.
Judging from the size of L540 at Yfull, as a small fraction of the Yfull tree, my wild guess
for the size of L540 worldwide is more than 10,000 men but less than 100,000.
If you are in my tree, I encourage
you to contact your closest Y-DNA matches at FTDNA; if they have not already tested for their
Y-DNA branch, encourage them to test for L540 and
to contact me for questions and for inclusion in my L540 Tree.
Notice
3 Aug 2018: FTDNA changed
the rules for DNA data, requiring that DNA data must be removed from all files
whenever a person changes their privacy settings to restrict web posting. This web page has links to my 6 STR analysis “xls” files. It would be too much trouble for me to change
all 6 files every time a person changes settings, so I removed these files from
the web. Some of the links to these
files are still here, but the links do not work. As I update and rewrite this page I remove
links to my STR analysis files. This
does not matter very much because these days SNPs
are much more important than STRs, and SNPs do not require the statistical
analysis.
The few individuals named in
this web page requested to be mentioned;
they may request to be removed at any time.
Rewrite 31 Oct 2015. Edit 16 Nov 2019.
For detailed V13 trees, see:
http://www.yfull.com/tree/E-V13/
V13, in the E haplogroup, is a major branch of the
human Y-DNA tree. The L540 branch is a
relatively small branch of V13.
There are about 80 known SNP equivalents to V13. V13 was the first to be discovered and the
one used in most discussions about this haplogroup. All but a very few V13 samples belong to L142
and CTS5856, so technically L540 is the main branch of the S3003 haplogroup,
which is one of many branches of the CTS5856 haplogroup, which is the main
branch of the L142 haplogroup, which is the main branch of V13. For simplicity the L540 Tree above minimizes
these details. I usually just say in
this web page that L540 is a branch of V13.
I say that V13 is the father of L540, when technically S3003 is the
father and V13 is the great-great grandfather, and even that may change if
additional side branches are discovered with very few samples. I’m ignoring the known branches that have few
samples, for simplicity.
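The chain of "father" haplogroups described above can be written as a simple parent map. This Python sketch is illustrative only: the names PARENT and ancestors are assumptions, and the map holds just the four links stated in this topic, ignoring the small side branches.

```python
# Parent links as stated in the text: L540 < S3003 < CTS5856 < L142 < V13.
PARENT = {
    "L540": "S3003",
    "S3003": "CTS5856",
    "CTS5856": "L142",
    "L142": "V13",
}


def ancestors(snp):
    """Return the chain of parent haplogroups, nearest first."""
    chain = []
    while snp in PARENT:
        snp = PARENT[snp]
        chain.append(snp)
    return chain


chain = ancestors("L540")
print(chain)  # ['S3003', 'CTS5856', 'L142', 'V13']
print(" > ".join(reversed(["L540"] + chain)))
```

Counting the entries shows why V13 is the great-great grandfather: it is the fourth ancestor in the chain, with S3003 as the father.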
L542 is one of those 80
equivalents. V13 is sometimes called
L542. L542 was found in my
WTY.
New Topic 10 Feb 2015.
PGP89 is a sample from the Personal
Genome Project (search Google for details).
PGP89 is S3003+ but L540-, so this sample
represents an older node in the branch
leading to L540. So far there are no
such S3003+ L540- results in the E-M35 Project.
This is from Steve
Fix, who included PGP analysis in his tree.
New Topic 28 Jan
2020. Edited 30 Jan 2020.
This is a new branch,
discovered Jan 2020 in the Big Y-700 of Blind. It is a branch of Y7026. Previously, Y7026 had two branches, Z29042
and A783. This new Y17710 includes the
branch Z29042.
Previously, there were
6 samples in Y7026*, including Blind, who is now moved to Y17710, in a new
small branch BY178938. Those other 5
samples are currently listed in my Tree as Y7026*: Z29042-, A783-, Y17710? because it is not yet
known how many of them will test positive for Y17710. Three of those 5 have Big Y
results from 2015 and early 2016, but Y17710 showed up in the expanded Big
Y-700 in late 2016, so we don’t know their Y17710 status.
Y17710 is not
available individually from FTDNA, but Yseq has it for $18:
https://www.yseq.net/advanced_search_result.php?keywords=Y17710&search_in_description=1&x=6&y=9
Big Y-700 is
pricey; FTDNA has not yet announced a
reasonable price for upgrade from the original Big Y to the new Big Y-700.
There are 2 more known
men with Big Y-700 in this new branch;
one is Y17710* and one is with Blind in BY178938. I’m contacting them.
I’m putting a request
into Yseq for an $18 test for BY178938.
New Topic 14 Jan 2015. Edit 27 Dec 2015.
This SNP
was discovered by Steve Fix on 10 Jan 2015, from the Big Y data of Roider,
compared to Gebert. These two samples have this SNP, but Hochreiter does not, so Z29042
defined a new haplogroup, the first branch to be found for L540. Steve assigned the Z series code number. Actually, there are 6 new SNP locations
common to Roider and Gebert,
but only Z29042 was assigned a code;
some of those others may be needed in the future.
I’m a bit surprised. I expected Roider
to fall into a branch with Hochreiter, because they
are closest in STRs. Also, I have been predicting an older node
for Gebert, based on his DYS389
value, and his STR values that differ from other L540 samples, more than L540
samples differ from each other. STR
predictions are statistical, because STRs mutate relatively rapidly. So this is a surprise, but such surprises are
expected from time to time when making predictions based on STRs.
Update 10 Feb 2015.
This SNP
was noticed by Steve Fix and me
in Hochreiter’s Big Y data, our first L540 Big
Y. Actually, there were 10 new
SNPs; I tested myself for them but came
out negative. Yseq
assigned A series code numbers to them.
None of the 10 showed up in the Big Y data for Roider or Gebert. In Feb 2015 I noticed this one in the Big Y
data for Svercl, so it defined
a new haplogroup branch for L540, with Hochreiter and
Svercl, not me, not Roider,
not Gebert.
New topic 12 Jul 2016.
This SNP was newly posted by Yfull
in July 2016, as a new branch of A783.
New topic 12 Jul 2016. Edit 31 Jul
2016. Edit 17 Aug 2016. Edit 29 Aug 2016.
Steve Fix
suggested the SNP A779 to me on 12 Jul 2016, to distinguish Svercl from two Hochreiter Big Ys.
The two Hochreiters are 7th cousins, with a
common ancestor about 300 years ago, so the A779 mutation is at least 300 years
old. The 2nd Hochreiter
Big Y provided a split into two branches, defined by A782 and BY5909, although
those two will not be listed in Yfull’s tree until a 2nd sample shows up in the
same branches.
New Topic 8 Feb 2015. Update 17 Mar 2016.
This SNP represents the major
division of L540, with 11 of the 16 samples in the L540
tree so far. The Yfull
tree estimates Y7026 to be about 2,000 years old, although this is a very rough
estimate due to the caveats
associated with DNA age estimates. See
the next topic discussion about the “bushy” nature of Y7026.
New Topic 17 Mar 2016.
We now have 5 samples in the paragroup
Y7026*. Click here for a jump to Y7026* in the tree.
All 5 have been confirmed with SNP results Y7026+, and Z29042-, A783-, so
they do not belong to those two known haplogroup
branches of Y7026. Two of the 5 have Big Y results, and they do not have a common novel SNP, which
means they will end up in two different new branches of Y7026, as soon as a
future sample in their branch gets a Big Y result with a common novel SNP to
define that future haplogroup branch.
In other words, we know Y7026 has at
least 4 branches. The node associated with Y7026 is the major
“bushy” node of the L540 tree.
The other three samples have not
purchased Big Y; their results are from
SNP testing only. These three may belong
to those two future branches. Or,
perhaps one or more of those three may end up in yet another branch of Y7026.
A bushy node is evidence that the
immediate descendants of the corresponding MRCA
participated in a significant population expansion.
On the other hand, bushy nodes may
be just random, not evidence of population expansion, due to the luck of SNP
discovery statistics, particularly for the case of Y7026, with only 11 total
samples so far. Big Y does not cover the
entire Y chromosome, and Big Y does occasionally randomly miss SNPs, so future
testing may show Y7026 to be not so bushy after all, if future novel SNPs
combine the Y7026 branches into fewer larger branches.
New Topic 22 Jul 2015. Update 17 Mar 2016.
This SNP and haplogroup was defined
22 Jul 2015, being present in Nowak’s Big Y, and also
present in my earlier Gwozdz Big Y. Kargul tested positive for A6295, making 3 samples so
far. My Gwozdz
cousin would no doubt also test positive, but I leave him out of the L540
tree since together we represent one ancestral line. Actually, I recruited both Kargul and Nowak,
based on close STR matches to me, so statistically, the A6295 branch should be
considered to be much smaller than the Y7026 branch with 11 independent samples. Note that the three A6295 samples are the
only Poland origin samples in the L540 tree;
I did not recruit on the basis of Poland origin, so we can speculate
that A6295 might represent a small Polish branch of L540, although three
samples is far too few for any confidence in this regard.
New Topic 21 Jan 2016.
This SNP has just been defined 21
Jan 2016. It is negative in one A6295
sample (Nowak) and positive in the other two (Gwozdz and Kargul), so it
represents a haplogroup - a small twig in the Y-DNA tree. Kargul does not have Big Y
data; Kargul’s FTDNA sample is
A6295+; Kargul’s Yseq
sample is A9035+. I
(Gwozdz) ordered SNP tests at Yseq for 4 of my “private” SNPs, A9032, A9033,
A9035, and A9036; Kargul is negative for
those other 3, implying that our MRCA node for A9035 is roughly 3/4 as old as our
node with Nowak for A6295, although this is a very rough estimate with only 4
SNPs tested. At the Yfull
SNP browser, using the locations for those 4 SNPs from my (Gwozdz) Big Y data,
I verified my positive standing for all 4 of these SNPs; Nowak and all other V13 samples in the V13
Project at Yfull are negative for all 4.
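The 3/4 figure follows from simple SNP counting, sketched here in Python. This is a rough illustration under a strong assumption: SNPs accumulate at a steady rate along the line, so the share of tested SNPs that arose after the younger node approximates its relative age. The function name relative_node_age is hypothetical, and with only 4 SNPs the result is very uncertain.

```python
from fractions import Fraction


def relative_node_age(private_snps_tested, shared_with_match):
    """Estimate the younger node's age as a fraction of the older node's age.

    private_snps_tested: SNPs on one line below the older node (here 4:
        A9032, A9033, A9035, A9036, all below the A6295 node).
    shared_with_match: how many of those the closer match also has
        (here 1: Kargul is A9035+ and negative for the other 3).
    Assumes a constant SNP rate; this is a sketch, not a dating method.
    """
    if private_snps_tested <= 0 or not 0 <= shared_with_match <= private_snps_tested:
        raise ValueError("counts must satisfy 0 <= shared <= tested, tested > 0")
    # Shared SNPs arose before the younger node; the rest arose after it.
    return Fraction(private_snps_tested - shared_with_match, private_snps_tested)


print(relative_node_age(4, 1))  # 3/4
```

One shared SNP out of four places the A9035 node one quarter of the way down from the A6295 node, leaving three quarters of that age below it.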
New Topic 22 Nov 2015.
This SNP has just been defined 22
Nov 2015. It is present in Hartsfield’s recent Big Y, and is also present in Roider’s
Big Y from earlier this year. So Z39377
defines a new haplogroup, with only those two samples so far.
Update Feb 2015.
This SNP is in the L540 branch, but
older. PGP89 is a sample from the
Personal Genome Project (search Google for details). PGP89 is S3003+ but L540-, so this sample
represents an older node in the branch
leading to L540. So far there are no
such S3003+ L540- results in the E-M35 Project. Technically, S3003 defines a haplogroup with
branches PGP89 and also L540, but for simplicity I just say in this web page
that L540 is a branch of V13.
Determining Your L540 Twig; Dividing L540;
Discovering New SNPs
Rewrite 13 Mar 2020.
I recommend Big Y, next paragraph,
if cost is not an issue for you, and if you are enthusiastic about discovering
new SNP haplogroups. Otherwise, consider the less expensive tests
per the following paragraphs, to determine your current haplogroup.
Big Y: The newest version is Big Y-700. Discovering new haplogroups is part of my
genetic genealogy hobby. I have been
recruiting L540 members to purchase Big Y in order to
discover new SNPs, which provide new SNP haplogroups (terminal
twigs on the Y tree) to further subdivide L540. Big Y-700 is not cheap: $449.
Anyone interested in joining me in this L540 project can order Big
Y-700; please contact me so I can keep
track of the status. With Big Y-700,
there is no need for individual SNP
testing. In fact, with Big Y, most men
immediately discover new “private” SNPs of the Y unique to their sample (unique
so far in the Big Y database). Many men
have an immediate match at a new SNP, thereby defining a new twig in the Y
tree, combining their sample with a previous Big Y sample. It is almost certain that future Big Y tests
will match one of the new SNPs in your Big Y data, thereby defining new small
twigs, eventually combining quite a few samples. An exception would be if cousins of yours are
already in the Big Y database, in which case your Big Y places you in that
twig.
I encourage testing at FTDNA, and joining the E-M35
Project, because I like the convenience of finding all the data in one
place. The administrators of the E-M35
Project will classify your sample into a category, where they give the
recommendation for which SNPs you should test.
You may have to wait a few weeks after joining the project for a
recommendation. If you think you might
be L540, please feel free to email me for advice on
which SNP branches to test.
There are other companies. Yseq offers individual
SNPs at lower price with faster results.
See SNP Test Orders for detailed instructions.
If you are new to Y-DNA testing and
do not know your haplogroup, I recommend Y-37 as an
inexpensive first test. The FTDNA
computer will use that data to predict your haplogroup. The computer is very conservative. If you are really L540, the computer will not
predict L540 just based on Y-37, but it can confidently predict E-M35 or E-V13. (L540 is a branch of V13, which is a branch of E-M35.) If you are not predicted E-M35, consider
joining a different Project, for your haplogroup. The FTDNA computer makes a
recommendation for further testing; from
your home page, click on “Haplotree & SNPs”. The administrators of the E-M35 project may
make a better recommendation, using that information, plus your close Y-DNA
matches, plus other data analysis for which SNP, or which SNP panel, you should
test next. See individual
SNPs or panels of SNPs, a separate topic below.
For example: If you purchase Y-111,
look at your Y-DNA matches, using all 111 of your STRs. These matches are quite close, so you are
probably in the same haplogroup branch as your closest matches. If you have a close Y-111 match who has
determined his haplogroup to be one of the L540 branches, you can purchase the
test for the SNP for that branch, then continue up and down the L540 tree,
inexpensively testing individual SNPs to determine your terminal position.
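Moving up and down the tree one SNP at a time can be sketched as a small search. The Python below is a hedged illustration: the names CHILDREN and next_snps_to_test are assumptions, and the tree is deliberately partial, holding only parent and child relationships that are stated in the discussion topics of this page. The real tree has more branches.

```python
# Partial, illustrative tree built from relationships stated in the text.
CHILDREN = {
    "L540": ["Y82423", "A6295", "Y7026"],
    "A6295": ["A9035"],
    "Y7026": ["Y17710", "A783"],
    "Y17710": ["Z29042", "BY178938"],
}


def next_snps_to_test(results, root="L540"):
    """Suggest which individual SNPs to order next.

    `results` maps an SNP name to True (positive) or False (negative);
    untested SNPs are absent. The search follows positive results down
    the tree and returns the untested branches below the deepest
    confirmed node. An empty list means nothing is left to test in
    this partial tree.
    """
    if results.get(root) is not True:
        return [root]
    node = root
    while True:
        children = CHILDREN.get(node, [])
        positive = [c for c in children if results.get(c) is True]
        if not positive:
            return [c for c in children if c not in results]
        node = positive[0]


print(next_snps_to_test({"L540": True}))
print(next_snps_to_test({"L540": True, "Y7026": True, "A783": False}))
```

The first call suggests the three branches directly below L540. The second, for a sample already known to be Y7026+ and A783-, suggests only Y17710.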
If you previously purchased Y-67, no
longer offered, then you still might be in the same branch as your closest Y-67
matches, although with less confidence than with Y-111.
If you previously determined your “terminal
twig”, you can just watch this web page L540 Tree (or other web sites, other
trees). When a new SNP twig shows up
extending your twig, you can order just that one SNP to see if you belong, or
not.
For more specific discussion, click
on L540, A783, Z29042, Y7026,
A6295, SNP ordering, and Big Y.
How about STRs?
In the past, I encouraged upgrading to 111 Markers,
the largest set available at FTDNA. Now
that there are plenty of SNPs available with low cost tests, SNPs are better
than STRs for finding your closest Y matches.
However, there are plenty of samples without the latest SNP tests, so if
you are anxious to find out which of these best match your Y, 111 STR markers
are much better than the smaller standard sets.
Y-37
Y-37
is a test for 37 STRs, from which the FTDNA computer can
predict your main branch haplogroup with high confidence. Then, from your home page at FTDNA, click on
“Haplotree & SNPs”. That will show
your position in the Y-DNA tree.
Rewrite 31 Oct 2015.
Friedman
proposed cluster C in 2008, based on STR
correlations, when less data was available than today. Cluster C now seems equivalent to L540. The cluster C data is still available at the haplozone site but may not
be up to date.
Rewrite 31 Oct 2015. Edit 16 Aug 2018.
I defined C type in Jan 2010 as my
version of Cluster C.
I use C type to predict L540 samples based on STRs,
for samples that do not have the L540 SNP test.
I use the word type for an STR cluster with statistical validity as established by my Mountain Method. “Type” is my own term. I chose the word “type” because it is not
generally used in genetic genealogy and I wish to distinguish my types from haplogroups and from other clusters.
By “type” I mean the cluster data, the hypothetical clade, the modal
haplotype, and the set of all
possible haplotypes, at any number
of markers. Accordingly, by “C type” I
mean any or all of these 4 things. I
sometimes use just “C” as short for “C type”.
I also have a previous C type
identified in R1a; unrelated; please don’t get confused. I published my methods in the Fall 2009 issue of JoGG.
My analysis files
define C type. Sorry, it can be a bit confusing because I
have multiple STR definitions for C type, for various marker sets. The number of markers in my definitions
changes slightly when new samples show up with unusual STR values. I hope the meanings are clear from the
context of my discussions in this web document.
Click on seems equivalent for an explanation
that STR types (such as C type) cannot be exactly equal to equivalent SNP
haplogroups (such as L540), due to STR outliers.
Rewrite 31 Oct 2015.
I coined the name V13C in 2010 to
represent C type, cluster C, the hypothetical haplogroup, and the samples (men)
in the hypothetical haplogroup.
I also used V13C to mean samples
that match C type from the database of samples at E-M35 or at Haplozone,
or at other databases.
This web document used to be named
V13C.html.
Now that C type seems equivalent to L540 I edited away most of my mentions of the name “V13C”, but
I’ll continue to use “C type” for the predicted clade based on STRs.
Rewrite 27 Sep 2017.
I proposed L type on this web page
in mid 2011, based on only 2 samples, which means not very high statistical
confidence. L type (also called L540
type) was a type that included C type plus those 2 samples that did not fit C type at that
time.
I no longer consider the distinction
between C type and L type useful. Those
two samples, Gebert and Fredeen, both tested positive for
L540. So they are just statistical STR outliers. Since then, more outliers have shown up; recently, with lots of 111 marker (next topic) data, I was able to come up with
an STR definition of C type to
capture all L540+ STR outliers, and not capture any L540- samples. My C type definitions using fewer than 111
markers are not quite perfect at predicting L540 based on STRs, but they are
satisfactory.
I edited this web page to remove
mentions of L type (except this topic).
Rewrite 6 Oct 2017. Edit 16 Aug 2018.
FTDNA
provides STR markers in various sets. The largest, a set of 111, was introduced in
2011. Upgrades can be purchased for samples with fewer markers. Obviously, matches and predictions are more
accurate using more markers. Until 2014,
I had been recommending the 111 set to L540 members, hoping to discover STR
correlations good enough to divide the L540 haplogroup into clusters with high confidence. Today, SNPs
are more important than STRs. This is
because the cost of discovering new SNPs has come down a lot. SNPs define haplogroup divisions; STRs only provide statistical predictions for
haplogroups.
Some clusters are still defined by
STRs, as predictions for new haplogroups, which need confirmation by discovery
of a corresponding SNP. However, STR
analysis is yielding diminishing returns for this effort. SNP discovery is now accelerating instead.
Still, the majority of on-line
samples have STR data without adequate SNP data. So Y-STRs still provide you with your best
list of on-line close male line matches.
At your FTDNA home page click on the
Y-DNA “Matches” button to see your closest matches using
the various STR marker sets. Many men at
FTDNA do not join the various projects;
if someone in L540 does not join the E-M35 Project,
I do not get to see his data. If you are L540 and have a very close STR match, please send
him an email message about this L540 web page and about the E-M35 Project. I still occasionally find new L540 members
this way.
Haplozone is another on-line STR database.
As a specific example of the value
of STRs, I discovered DYS445 = 11 as an unusual mutation in my own Y, shared by
my 3rd cousin, and also shared by Kargul, adding evidence
that we form a twig in the L540 tree,
perhaps restricted to south Poland, perhaps only a few centuries old. DYS445 is not available at less than 111
markers in FTDNA standard sets. The rest
of L540 samples have the value DYS445 = 10.
The value 11 does show up very rarely elsewhere in V13,
as an independent mutation, so although DYS445 is very slowly mutating it is
not as slow as a typical SNP, so not as statistically reliable as an SNP. Later, I discovered A9035 (tested at Yseq), an SNP for only the 3 of us. A9035 is a twig in the A6295 branch (see the Tree). In other
words, DYS445 = 11 seems equivalent to A9035
today, although exceptions may well show up in the future.
Summary: 111 STR markers are valuable if you are very
interested in genetic genealogy,
and if cost is not a big issue for you.
If cost is an issue, and if you are merely curious about your Y-DNA, as
a first test I recommend the 37 marker STR set (topic
after next).
Rewrite 5 Dec 2015. Edit 16 Aug 2018.
FTDNA
provides a 67 marker standard
set of STR markers. I have been
using this 67 set for analysis for more than 8 years. Although the 111 set is more accurate, this
67 set is valuable for analysis because there are a lot more samples on-line at
67, and all samples with 111 are included.
Rewrite 5 Dec 2015. Edit 16 Aug 2018.
FTDNA
no longer offers the 25 and 12 STR marker standard sets. The 37 marker set is sufficient as a first
test if you are curious to see in which Y-DNA main branch haplogroup you
belong. With 37 markers, FTDNA will
automatically place you in one of the main large haplogroup branches of the
Y-DNA tree. For the smaller branches of
the tree, there are SNP tests. For L540
candidates, I have a separate discussion topic about this: Dividing L540.
Most of the more rapidly mutating
STRs are in the 37 marker set, so the 37 marker set is good to search for your
best matches to other men with a male line common ancestor in the last
millennium or so. FTDNA provides you
with matches to other men with similar STR haplotypes. All samples with 67 or 111 are included
because they have these 37 plus more.
For more discussion see Value of STRs.
Rewrite 11 Oct 2017. Edit 16 Aug 2018.
FTDNA provides the older STR sets,
using 12 and 25, as special orders by project administrators, but for the price
difference the 37 set makes more sense.
There are still lots of data on-line
with only 12 markers - not so many with only 25. Those samples can still be checked for
candidates for L540, but not with very high confidence.
Best STR Markers
Edited 27 Sep 2017. Edit 16 Aug 2018.
STR
markers that mutate relatively slowly are statistical indicators for clades in which they are recently mutated,
but they are not perfect because of subsequent independent mutations. When a clade has a few such good STR markers
those provide a signature set of STR
markers. A signature is statistically
expected to be a more probable indicator of a clade than just one marker. Indeed cluster C is
characterized by the Friedman Signature. My definitions
of C type (and thereby L540) use other
helpful markers, not just the signature.
My analysis files
automatically rank markers, as useful for a particular definition, using a
method that I published. The exact ranking of markers varies slightly
from month to month due to the random nature of mutation values in new samples,
and due to the somewhat arbitrary cutoff that I use to restrict the database to
the L540 neighborhood. (Using too many samples provides a ranking of
the father clade instead of the clade of interest.) For example, a marker that ranks 6th one
month might come out 4th or 5th or 7th or 8th the next month.
An SNP
that defines a haplogroup is very
unlikely to have happened exactly at the time of the most recent common
ancestor (TMRCA) of a haplogroup. Most likely the SNP is somewhat older,
because usually there are many generations between nodes.
By definition an SNP cannot be younger than the TMRCA. Similarly, we can consider a hypothetical
clade defined by a particular STR mutation, which is likely somewhat older than
the TMRCA of that clade. However, for
clusters defined by signatures, and for types defined by definitions, one rare
STR mutation that contributes to the signature might have happened before or
after the TMRCA of that cluster or type.
Very slow mutators
should make the best markers. However,
the slowest are rarely mutated, so those with intermediate mutation rate show
up more often as signature markers. My Type.xls master
file has the Chandler STR mutation
rates, in the ASD sheet, row 5. The ASD
sheet is not usually included in my analysis files.
Best Dozen
STR Markers: Using
my latest (Sep 2017) analysis at 111 markers, here are my rankings of the best
STRs for C type and thereby for L540 (DYS numbers): 1&2 (two way tie) - 594=12
& 636=12; 3 - 390=25; 4
- Δ389=19; 5
- 561=17; 6 - 444=13; 7 - 406=11;
8 - 504=14; 9 - 517=24; 10 - CDYa=29; 11 - 447=25;
12 - CDYb=33.
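As a rough illustration of how a list like this can be applied to a sample, the sketch below counts how many of the dozen values a haplotype matches. The marker values are copied from the list above; the dictionary name, the function name, and the example haplotype are hypothetical, and this simple count is not the ranking or definition method used in the analysis files.

```python
# Best dozen STR values for C type (L540), copied from the list above.
# "389-2" stands for the difference value (389II minus 389I).
BEST_DOZEN = {
    "594": 12, "636": 12, "390": 25, "389-2": 19,
    "561": 17, "444": 13, "406": 11, "504": 14,
    "517": 24, "CDYa": 29, "447": 25, "CDYb": 33,
}

def signature_matches(haplotype, signature=BEST_DOZEN):
    """Return (matched, tested): how many signature markers the sample
    matches, out of the signature markers the sample has values for."""
    tested = [m for m in signature if m in haplotype]
    matched = [m for m in tested if haplotype[m] == signature[m]]
    return len(matched), len(tested)

# Hypothetical sample that lacks values for some of the twelve markers.
example = {"594": 12, "390": 25, "389-2": 17, "444": 13,
           "406": 11, "CDYa": 29, "447": 25, "CDYb": 34}
print(signature_matches(example))  # (6, 8)
```

A sample that matches most of the markers it has been tested for is a candidate, but as the topics below show, outliers exist in both directions, so an SNP test is still needed.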
Original Marker for Cluster C
Rewrite 27 Dec 2015. Edited 27 Sep 2017. Edit 16 Aug 2018.
ΔDYS389II = 19 is one of the
original Friedman Signature markers for cluster C. It
remains a good marker for C type and L540.
[Technical detail: DYS389 is a compound marker, where 389I is
the first STR chain and (389II minus 389I) is the second STR chain. For cluster C the first chain is 389-1 = 389I
= 13. The second chain is 389-2 =
19. 389II = 13 + 19 = 32. The marker of interest here is really 389-2 =
19 (389II minus 389I = 19). However,
389I mutates more slowly and has the value 13 for all but one L540 sample so
far and for almost all samples in the L540 neighborhood. At Haplozone,
both 389 markers need to be used together;
if one is omitted both are ignored.
I use both 389 values, or neither, in my definitions to be compatible with other
web sites.] My xls files can be easily
modified to use Δ389 without 389-1.
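The compound marker arithmetic in the technical detail above can be sketched in a few lines. The function name is illustrative; the values are the ones quoted in this topic.

```python
def split_dys389(dys389_i, dys389_ii):
    """Split the reported DYS389 pair into its two STR chains.

    dys389_i  : reported 389I value (first chain, 389-1)
    dys389_ii : reported 389II value (sum of both chains)
    Returns (389-1, 389-2), where 389-2 is the difference value.
    """
    if dys389_ii < dys389_i:
        raise ValueError("389II should not be smaller than 389I")
    return dys389_i, dys389_ii - dys389_i

print(split_dys389(13, 32))  # (13, 19): the cluster C values
print(split_dys389(13, 30))  # (13, 17): the pair 13, 30
```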
All STR standard marker sets by all
DNA companies include the 389 pair. (I
have not noticed any exceptions.)
389 = 13, 30 is the modal value for V13, so it seems to be the ancestral value for L540. 389 = 13, 32 is rare in V13 (other than
L540), but shows up in E-M35 branches outside V13.
Only two L540+ samples, Fredeen and Gebert, have the ancestral value 13, 30. Butman,
the closest STR match with L540-, also has 13, 30. Only a few samples in the branches of L540
have the value 13, 31, which is not common in the neighborhood. On this basis, it seems likely that the
mutations from 13, 30 to 31 to 32 happened before the TMRCA for L540, and later mutations back
from 13, 32 to 31 to 30 happened in very few L540 male lines. (We cannot rule out a rare double size
mutation incident, from 30 to 32, or a double mutation back to 30.)
DYS389II (actually the difference
value 389-2) ranks 43rd in Chandler
mutation rates. Near the middle. So exceptions are expected, due to recent
mutations. DYS389-2 is ranked as the 4th
best marker in my analysis of 111 markers.
DYS594 = 12; Best Marker for L540 at 67 Markers
Rewrite 27 Dec 2015. Edited 27 Sep 2017. Edit 16 Aug 2018.
In my analysis,
DYS594 = 12 is the best marker for L540 (and C type) using the 67 marker
set. 594 is not in the 37 marker set.
All L540+ samples with 67 or more
markers have the 594 = 12 value. Butman,
the closest STR match not predicted L540, indeed tested L540-, and has the
ancestral 11.
All C type samples (predicted L540),
except one marginal sample not yet tested for L540, have the 12 value.
A few samples in the L540 neighborhood have 594 = 12 but are L540-. These are not a random sample; I recruited two of them for the L540 test to
find out if all 594 = 12 in the neighborhood are L540; no, not all.
The 594 = 12 value is more common in
the L540 neighborhood than in the rest of the V13 data. So I was wondering if 594 = 12 is an old
mutation in the S3003 branch. So I tested one of those two L540- samples
with 594 = 12; it came out S3003-, so it
seems to be an independent mutation.
Also, considering the L241 haplogroup, some of those samples are in the
neighborhood, but they have 594 = 11 except one sample that has the value 12,
so that is also independent.
DYS594 ranks 12th from the slowest
in the 67 Chandler
mutation rates. Quite slow, so
independent recent mutations should be rare.
DYS636 = 12; DYS561 = 17; DYS504 = 14; DYS714 = 24
Excellent Signature Markers for L540
Available in the 111 Set
Rewrite 27 Sep 2017.
These 4 are not in the FTDNA 67 STR marker set, but
are available in the 111 STR marker set.
636 is just as good as 594 [previous topic]; they are tied as the best two STR
markers. Those other 3 are among the dozen best. That’s
why C74(111), my 111 marker definition for C type, works
very well.
Rewrite 29 Dec 2015. Edit 27 Sep 2017. Edit 16 Aug 2018.
The signature is (390, 389II, 447) =
(25, 32, 25).
Friedman had
been calling this the “characteristic marker values” for cluster
C at the Haplozone site
before I started working on this, back in 2008, when there were only 9 samples
available in cluster C, including mine.
This original Friedman signature
works surprisingly well by itself for samples with only 25 of the standard markers,
but not with high confidence.
In early 2011 Friedman added 594 =
12 to the “characteristic marker values”, for 67 marker samples.
DYS389 is a
compound marker, discussed above.
Friedman used a more complicated
analysis than just this simple signature in her C type assignments. I do not know her method exactly, but most
definitions (not all) that I tried, selecting well ranked markers, extracted
the same samples that she did.
16 Aug 2018: Neighborhood Table removed, due to the new
FTDNA rules for on-line data.
I still use the word Neighborhood
to mean samples that seem close to L540 based on STRs but are not predicted
L540 with high confidence based on STRs.
Neighborhood samples may have results for the L540 SNP test; those are used to calibrate my predictions
based on STRs. I also use the word Neighbor
to refer to samples that are close STR matches.
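The "steps" quoted for neighbors in the topics that follow can be pictured as a sum of per-marker differences. This is a minimal sketch, assuming a plain stepwise count over the markers two samples share; real tools may treat multi-copy markers or missing values differently, so it is an illustration and not the exact calculation behind the numbers on this page. The function name and both example haplotypes are hypothetical.

```python
def step_distance(sample_a, sample_b):
    """Sum of absolute repeat-count differences over shared markers."""
    shared = sample_a.keys() & sample_b.keys()
    if not shared:
        raise ValueError("no markers in common")
    return sum(abs(sample_a[m] - sample_b[m]) for m in shared)

# Hypothetical haplotypes, four markers only, for illustration.
a = {"390": 25, "447": 25, "594": 12, "444": 13}
b = {"390": 24, "447": 25, "594": 11, "444": 15}
print(step_distance(a, b))  # 4
```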
My sample
is kit N16800. N81304 is my 3rd cousin
Gwozdz.
Edit 17 Dec 2015.
Kit 199446, Aloysius Kargol is my closest STR
match available on the web (other than my 3rd cousin). In May 2010, his daughter noticed, on
ancestry.com, that he and I are perfect matches at 12 STR markers. I studied the LDS microfilms and
located his 1820’s Kargul ancestor living in a village in Poland only 20 miles
away from the village of my Gwozdz ancestor.
I paid for his FTDNA sample. His
L540 test came out positive, placing him in that new haplogroup. We are 5 steps apart at 67 STR markers; 9 at 111.
For estimating the size of L540 or C
type, my cousin and Kargul should not be included, because I recruited them,
paying for their tests. Family sets such
as these distort size estimates, when comparing the number of samples per
haplogroup or per STR type or cluster.
New topic 13 May 2011. Rewrite 22 Dec 2015. Edit 16 Aug 2018.
Butman’s
L540 SNP test came out negative in 2011.
That means he is not a member of the L540 haplogroup. Kit N91348.
This sample is interesting because
it is an STR outlier from another
haplogroup, coming out closest to C type.
(C type is the STR equivalent of L540.)
At 67 markers, this sample
actually falls within C type; check the
numbers in that table, at the columns for the 67 and 37 marker modal haplotypes. That’s because the 111 marker set has quite a few good signature markers for C type. Before 2011, at this web page, I listed this
sample as at the edge of C type, or predicted L540 with low confidence. Using only the 37 marker set, Butman’s 5 closest neighbors
are C type (Dec 2015).
This sample recently came out
negative for S3003, which is the “father” of L540. The MRCA
node for S3003 is older than the MRCA for L540. This sample tested V13+
but has not yet been tested for all the recently discovered SNP branches of
V13. Using all 111 STR markers, Butman has no close neighbors; his closest are Bartlett at step 21, Hohnloser at step 22, and Hochreiter
(L540+) at step 23 along with another Bartlett sample and two other samples
that are not in the Table above (Dec 2015).
In the Y-DNA tree, Butman’s node where he branches apart from L540 is surely
older than 1,000 years and might even be older than 4,000 years, according to
the estimated age of L540.
What does this mean? The simplest explanation is that Butman is alone in the E-M35 database,
in a very small haplogroup that branches off the branch leading to S3003 and
L540 perhaps 2 or 3 millennia ago.
Another possibility: he may
belong to the recently discovered Z17264 haplogroup, since Bartlett belongs to
that one (Table above). Z17264 is a twig
in the main branch Z5018 so Butman might have an MRCA
older than Z17264, perhaps. (The test
results might come out Z5018+ Z17264-.)
This paragraph is statistical speculation; Butman might end up
in a new branch of V13, negative for all known branches, for all we know. This paragraph is a good example of the
uncertainty of STR based predictions for outliers. Big Y or SNP tests are needed here.
Rewrite 27 Dec 2015.
Kit 162917, Fredeen,
has been listed at this web page since Mar 2010. L540+ result May 2011.
This sample is an STR outlier. Even with all 111 markers, this Fredeen sample differs a lot from all the other L540 (C
type) samples. The closest neighbor is at step
24; most L540 samples have closest
neighbor at step 14 to 18. (Samples with
the same family name are even closer, of course.)
The original best L540 signature marker is DYS389
= 13,32; Fredeen
has 13,30, which is the ancestral value (for most Neighborhood samples outside
L540). Fredeen
also differs at two other L540 signature markers.
The simplest explanation is that Fredeen belongs to a branch with a node in the L540 tree that is older than the
other nodes. Perhaps those 3 signature
markers mutated to the L540 values after the node leading to Fredeen.
However, there is an alternate
possibility: Fredeen
may belong to one of the currently known branches; perhaps those 3 signature markers experienced
back mutations; perhaps the Fredeen line has more mutations than normal, due to the
luck of mutations. Read the following
topic, Gebert, also an outlier.
SNP testing
is required to determine the branch for this sample.
Rewrite 27 Dec 2015.
I noticed Gebert’s
sample on-line and encouraged him to join the E-M35 project, which he did in
2011, kit 166692 in the table. I helped pay for the orders for the L540 test
and for the 111 extension. He purchased Big Y in
2014.
Gebert is
also an outlier; read the previous
topic, Fredeen, for a brief explanation. Gebert is not quite
as extreme an outlier as Fredeen, with closest
neighbor at step 20. Gebert
also has the ancestral DYS389 = 13,30, and also differs at two other signature
markers (not the same two as Fredeen).
In this case, because Gebert purchased Big Y, we know that this sample falls in
the Z29042 branch of the L540 tree. So it is clear that the Gebert
line has more than the expected number of STR mutations; it is just luck that those 3 signature
markers mutated back to the ancestral values, because L540 samples both in
Z29042 and outside Z29042 have the signature values. This sample is an example of the limitation
of predicting haplogroup based on STR values.
Rewrite 22 Dec 2015.
Hohnloser
(kit N39989) is another outlier outside L540.
To understand this, please see the topic above for Butman. Hohnloser is not quite as close to C type as Butman, but otherwise the Butman
discussion mostly applies also to Hohnloser.
Hohnloser
has been mentioned here at this web page since 2010.
Hohnloser
also does not belong to the L540 haplogroup because his SNP test came out
negative. He has not been tested for
S3003.
Hohnloser’s
nearest neighbors at 111 markers, step 22, are Butman
and two other samples not in the Table above.
Hohnloser’s nearest neighbors with haplogroup
identification are 3 samples at the next step, 23; 2 of them are L241+. However, Hohnloser
tested L241-. L241 is a branch of Z5018,
so Hohnloser might fall in one of the other
Z5018 branches.
Jorg Hohnloser has extensive family tree research results. He administers a Hohnloser
project at FTDNA. He exchanged helpful
email discussions with me.
New topic 12 Dec 2014. Edit 17 Dec 2015.
Kit N45041, Administered by Andrew Hochreiter, who runs the Hochreiter Project.
Due to the European Union DNA privacy law in 2018, Ysearch.org was
closed down, so that STR data is now unavailable.
Update 27 Dec 2015. Edit 16 Aug 2018.
Ancestry.com no longer provides a
comprehensive Y-DNA database. They now
concentrate on autosomal DNA (all chromosomes, not just Y).
Kargul
originally matched with me at this site, back in 2010, so I encouraged Kargul
to join the E-M35 Project.
I last checked for matches 16 May
2011, when the Y-DNA database was still active.
There were 9 matches of Y-DNA to Kargul & me, but these were not
very close matches.
Rewrite 23 Jan 2021. Edit, with a speculative last paragraph 3 Feb
2021
Summary: The L540 haplogroup is about 2,000 years old, which
is the time to the most recent common ancestor (TMRCA). This male line seems to have formed with a
split from the ancestral (Z7019) haplogroup roughly 4,000
years ago.
These are “rough estimates”; there are reasons why these age numbers may
not be exact, and some of those reasons are explained in the following paragraphs of this
topic.
Clarification: The L540 segment
spans the time from 4,000 years ago (TMRCA of the Z7019 haplogroup) to 2,000
years ago (TMRCA of the L540 haplogroup).
There are no known branches along that 2,000 year segment of time (from
4,000 to 2,000). The actual L540 SNP originated some time during that segment
of time; we do not know exactly
when. There are many other SNPs that
also originated during that segment, so these are called phyloequivalent to L540. A TMRCA is also called a node (a branching point of the tree).
Yfull does not
have data for the Z7019 node, but Yfull does a good job of estimating node
ages. Very briefly, Yfull provides
estimates of TMRCA age for haplogroups, based on the number of accumulated
SNPs. The calculation is complicated, as
explained at the Yfull site.
Here is a link to the Yfull L540
tree:
https://www.yfull.com/tree/E-L540/. Notice at the L540 line:
“formed 4600 ybp,
TMRCA 2000 ybp”; ybp = years before
present. Click on the “info” box for
links to details of the Yfull age estimation methodology. This is my source for the L540 TMRCA age of 2,000
years. That “formed 4600 ybp”
is the TMRCA for CTS1273, the immediate ancestor of L540
in Yfull’s tree; I use that 4600 for
Z7019 in a paragraph below.
The CTS1273 to L540 segment in the
Yfull tree has 28 phyloequivalent SNPs;
all of us L540 men have all 28 of these SNPs, and these 28 are not found in
men at Yfull who are not in the L540 haplogroup. Yfull calculated the estimated L540 segment
based on these 28. The 2,000 ybp TMRCA
is calculated by averaging the number of additional SNPs carried by only some
of the men in our L540 haplogroup - different SNPs in different branches.
Statistical adjustments: These Yfull numbers for L540 are for 23 Jan
2021. In the past, when there were fewer
L540 men known, the numbers changed as more men (more data) became available,
mostly for statistical reasons; the
numbers have been stable now for several months, but the numbers may well
change slightly in the future, for statistical reasons - with more data.
More importantly, I expect
significant changes in the future, if new branches are discovered, as I explain
in the following paragraphs of this topic.
The FTDNA
tree for Y-DNA uses S3003 as the name for what Yfull and I call L540. That’s OK:
FTDNA and Yfull both show L540 as phyloequivalent to S3003, based on the
data that they have.
FTDNA lists 41 (23 Jan 2021)
phyloequivalent SNPs, including L540 and S3003.
Compare to 28 for Yfull, as mentioned above. I can think of a few reasons for the
difference, but I don’t know for sure. I
suppose the main reason is that FTDNA uses all the data from the most recent
version of Big Y, which covers more of the Y chromosome
(more SNPs) than previous versions. The
Yfull tree also uses Big Y data with very few exceptions; I assume Yfull lists only the SNPs found in
all samples of a haplogroup, thereby restricting the list to the original Big
Y; I’m not sure of this assumption.
Also, identifying SNPs from raw DNA
data is complicated; it’s not surprising
that different web sites (different computer algorithms) have different counts.
Z7019 branch: The ISOGG tree shows branches
for S3003, as I outline above for L540 in the Y-tree. ISOGG shows Z7019 as a branch of S3003, and
L540 as a branch of Z7019. That means
ISOGG has at least one sample that is
positive for S3003 and negative for Z7019 and L540, and at least one sample
that is positive for Z7019 and negative for L540. Z7019 is not listed by FTDNA nor Yfull
trees. Perhaps Z7019 was found in a
region of the Y chromosome not tested by Big Y.
ISOGG does not publish who provides samples. ISOGG has a “~” next to Z7019, indicating not
full certainty. I have some concerns about
Z7019, but I’m assuming it represents a valid branch because Steve
Fix analyzed a sample called PGP89, which is S3003+
L540-.
We can make a very rough guess for
the TMRCA node of Z7019 from the number of phyloequivalent SNPs listed by
ISOGG: ISOGG shows 4 phyloequivalent
SNPs in the S3003 segment, only that one Z7019 SNP in the Z7019 segment, and 15
phyloequivalent SNPs in the L540 segment.
20 Total. 19 of those SNPs (not
Z7019) are listed by FTDNA and Yfull.
The TMRCA of Z7019, based on ISOGG SNPs, falls 5/20, or 1/4th the time
distance along the S3003 to L540 segment.
Yfull figures that segment as 4,600 minus 2,000 = 2,600 years; 1/4th is 650 years, so I’m guessing L540
formed (branched off) from Z7019 at 4,600 minus 650 = 3,950, rounded off to
4,000 years ago. Again, this is a very
rough guess for the formation time of L540, as summarized at the top of this
topic. This paragraph makes no change to
the 2,000 ybp for the L540 TMRCA.
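The arithmetic in the paragraph above can be written out as a short sketch. It assumes, as the paragraph does, that SNPs accumulate at a roughly even rate along the segment; the function and parameter names are illustrative.

```python
def split_segment(start_ybp, end_ybp, snps_before_node, snps_total):
    """Place a new node along a segment in proportion to SNP counts.

    start_ybp, end_ybp : older and younger ends of the segment (years before present)
    snps_before_node   : SNPs on the older side of the new node
    snps_total         : all SNPs along the segment
    """
    span = start_ybp - end_ybp
    return start_ybp - span * snps_before_node / snps_total

# ISOGG counts from the text: 4 (S3003) + 1 (Z7019) before the node, 15 (L540) after.
node = split_segment(4600, 2000, snps_before_node=4 + 1, snps_total=4 + 1 + 15)
print(node)  # 3950.0, rounded off to 4,000 in the text
```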
Validation of the Z7019 branch will
probably come with time as more men from that haplogroup show up with DNA data
positive for S3003 and negative for L540, plus positive for about 1/4th of the
SNPs currently listed as phyloequivalent to S3003 and L540 by FTDNA and Yfull.
Even if Z7019 is not validated, the
discussion above demonstrates how a new branch can split a segment of the Y-DNA
tree. In the case of Z7019, the Yfull
4,600 ybp formation time of L540 becomes the formation time of Z7019, and a new
time roughly 4,000 ybp becomes the TMRCA time for Z7019 and the formation time
for L540.
That new Z7019 node did not change the
TMRCA for L540. On the other hand, if the
future provides a new branch with a node more recent in time, for example 3,000
ybp, and if L540 ends up in the older part of the split, then in that example
the L540 age would get adjusted from 2,000 ybp to 3,000 ybp, including that new
branch; the 2,000 ybp number would
remain as the best estimate for a new haplogroup representing the currently
known branches of L540.
At the top of this topic, I
mentioned that the actual L540 SNP
originated some time during the segment of time from roughly 4,000 to 2,000
years ago; we do not know exactly
when. That’s sometime from the TMRCA of
Z7019 to the TMRCA of L540. It’s
possible the L540 mutation happened to one of those 2 MRCA, but it’s very
unlikely. How unlikely? If we take 25 years as the average time per
generation, that’s 4 generations per century, and that 2,000 year segment has
4x20 = 80 generations. If we take 33.3
years per generation, that’s 3x20 = 60 generations. Either way, that’s a lot of generations. A good estimate might be 70 generations - a
continuous chain of 70 male line descendants.
The L540 mutation happened in the Y-DNA of one of those 70 men; we don’t know which one.
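The generation count above is simple division; a small sketch, with illustrative names, makes the two bounds and their midpoint explicit.

```python
def generations(span_years, years_per_generation):
    """Number of father-to-son steps that fit in a span of years."""
    return span_years / years_per_generation

span = 4000 - 2000                 # the L540 segment, in years
fast = generations(span, 25)       # 4 generations per century
slow = generations(span, 100 / 3)  # 3 generations per century
print(round(fast), round(slow), round((fast + slow) / 2))  # 80 60 70
```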
Back to the mutations that are
phyloequivalent to L540: A few
paragraphs above, I estimated the Z7019 segment to be roughly 1/4 the length of
the L540 segment as currently (1 Feb 2021) estimated by Yfull (and by FTDNA). Yfull has 28 mutations in that segment, but
FTDNA has 41. So I estimate 3/4 x 41 =
about 30 mutations in the L540 segment when defined to start at Z7019. That’s about 30 mutations spread out along
that chain of about 70 men. Any one of those
30 could be selected as the name of our haplogroup; this topic is actually about all of
them. Probably more will be found in the
future. L540 was the first one found, in
my Y-DNA, in 2011; that’s when I renamed
this web page “L540”.
Obviously, I like to use L540 as the name of our haplogroup.
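For readers who want to check the numbers in this paragraph, here is the same estimate as a short calculation. The variable names are illustrative, and the last figure (generations per mutation) is only a consequence of these rough inputs, not a measured rate.

```python
ftdna_equivalents = 41         # FTDNA count for the current L540 segment
fraction_after_z7019 = 3 / 4   # share of the segment that follows the Z7019 node
chain_length = 70              # estimated number of men in the chain

snps_in_segment = fraction_after_z7019 * ftdna_equivalents
generations_per_snp = chain_length / snps_in_segment
print(snps_in_segment)                # 30.75, "about 30" in the text
print(round(generations_per_snp, 1))  # 2.3
```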
Long segment discussion: The current Yfull L540 segment spans 4,600 to
2,000 ybp. Using ISOGG data, I adjusted
that to roughly 4,000 to 2,000 ybp.
Either way, that’s a long time without any known branches. This is not unusual; many old segments in the Y-DNA tree span
thousands of years without known branches.
The reason: most new Y-DNA
haplogroup branches become extinct because statistically that’s most likely. The many men who form that continuous male
line chain in a segment are like the rare winners in a casino. Or like a group of men who get together to
buy lottery tickets, and win the lottery.
Throughout the time span of a segment, many men in that chain had more
than one son, but by luck only one of those sons at each generation had a male
line that did not go extinct. Many of
the male lines that went extinct probably existed for centuries. It’s possible and even likely that 2,000
years ago hundreds or maybe thousands of men were L540, most of them not
descended from our MRCA. Statistically, most male lines go extinct, so maybe
all those lines became extinct with one exception - the one line that we now
call L540. Statistics is the simplest
explanation, but it’s not the only explanation.
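The statistical explanation can be illustrated with a toy simulation. This is a sketch under strong assumptions that are not from the text: each man has a Poisson-distributed number of sons with an average of exactly one (a stable population), and all lines are independent. The function names, the number of trials, and the seed are arbitrary choices.

```python
import math
import random

def poisson(rng, mean):
    """Draw a Poisson-distributed integer (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k

def surviving_lines(trials, generations, mean_sons=1.0, seed=1):
    """Start `trials` independent male lines with one founder each and
    return how many lines still have living males after each generation."""
    rng = random.Random(seed)
    survivors = [0] * generations
    for _ in range(trials):
        males = 1
        for g in range(generations):
            # Every living male has a random number of sons.
            males = sum(poisson(rng, mean_sons) for _ in range(males))
            if males == 0:
                break  # this male line is extinct
            survivors[g] += 1
    return survivors

counts = surviving_lines(trials=2000, generations=70)
print(counts[0], counts[9], counts[69])  # lines alive after 1, 10, and 70 generations
```

Under these assumptions only a small fraction of the starting lines typically last 70 generations, even though the population as a whole is stable.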
We use the word “statistical” when something is caused by a large number
of causes, where none of the causes dominates.
But maybe there was one dominant cause, like a plague, and that was the
dominant reason why all but one of the L540 branches went extinct. The Justinian plague, starting in 541, was
preceded by a few years of famine caused by cold summers starting in 536,
apparently due to volcanic dust in the atmosphere; that’s an example of 2 dominant causes
combining to cause extinction. Maybe
most of the L540 men belonged to a tribe that became involved in a tribal war
just before 536, so that there were 3
reasons - war, famine, and plague - that together caused all but one of the
L540 lines to expire over a short period of time. This last paragraph of this topic is highly
speculative. This paragraph can be
applied to any long segment of the Y tree, not just L540. I included this paragraph not to propose an
explanation for the age of L540, but to emphasize that we don’t know the
explanation. We also do not know the
region of origin, next topic:
Rewrite 3 Feb 2021.
Summary: Our L540 male line MRCA (Most Recent Common
Ancestor) probably lived in Central
Europe. This
is not certain; it’s just a best guess.
This best guess is based on the
reports by L540 men for the location of their most distant known male line
ancestor, although most of those ancestors lived only 200 to 400 years
ago. Our L540 TMRCA [previous
topic] is roughly 2,000 years ago, so that’s why Central Europe is just a
best guess. After all, our MRCA might
have lived his early life somewhere far from Central Europe and migrated to
Central Europe. Or he may have lived his
entire life far from Central Europe and some of his sons or grandsons or even
later descendants may have migrated, whereby many of his descendants today are
in Central Europe.
My L540 Tree
has 35 ancestor names, corresponding to Y-DNA data from 35 men. For a statistical analysis, I need to
disregard the men who were recruited, counting only the men who independently
decided to test their DNA. For example,
the 4 Hochreutter all come from one project, so they
count as one. I’m also counting the 2
Hartsfield as one.
Gush and Kargul (A6295) independently
tested autosomal DNA and they matched with me; I highly encouraged them to test
Y-DNA. It’s hard to say if their Y-DNA
testing is fully independent. Also, I do
more encouragement for everyone in my A6295 branch. So I adjusted A6295 by counting Gwozdz, Gush,
and Kargul as one common ancestor from Poland.
That makes 29 ancestors, from
various countries as listed in my Tree.
The results:
10 Germany
4 Poland
3 Czech
2 Hungary
19 Central Europe (subtotal)
4 Sweden
3 Norway
7 Northern Europe (subtotal)
2 Russia
1 Ukraine
3 Eastern Europe (subtotal)
29 Total (data as of 3 Feb 2021)
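The subtotals and shares in this list can be re-checked with a short tally. The counts are copied from the list; the dictionary layout and names are illustrative.

```python
# Counts per country, copied from the list above (data as of 3 Feb 2021).
REGIONS = {
    "Central Europe": {"Germany": 10, "Poland": 4, "Czech": 3, "Hungary": 2},
    "Northern Europe": {"Sweden": 4, "Norway": 3},
    "Eastern Europe": {"Russia": 2, "Ukraine": 1},
}

subtotals = {region: sum(countries.values()) for region, countries in REGIONS.items()}
total = sum(subtotals.values())

for region, count in subtotals.items():
    print(f"{region}: {count} of {total} ({count / total:.0%})")
```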
So Central Europe seems like a best
guess for where our MRCA lived. 19 out
of the 29 male line ancestors of the men in my L540 Tree are from Central
Europe. 29 is not a very statistically significant
sample. With more data in the future, I
suppose the list above might look different.
However this statistical uncertainty is not important, because I
mentioned above a much more important reason for uncertainty: Our MRCA lived about 2,000 years ago, while
these 29 known ancestors lived only 200 to 400 years ago.
My L540 Tree has only about half the men with Y-DNA data who are known
L540+ or expected to be L540 based on STR
data. But those other men have not been
tested for the branches of L540, and most of them do not report where their
male line ancestors lived. In this topic
I’m using branch data and location data.
Most of those other men have their Y-DNA data available at the E-M35 Project.
My Tree has
at least 6 branches from the L540 node:
Y7026 is the largest, with 16 of the 29.
A6295 has 5 of the 29. Under
L540*, there are 4 with “BY”, meaning they have Big Y data
and did not have any SNP in common with one another, so those represent 4
branches with only 1 each so far; each
of those has many novel unique SNPs. The
other 4 L540*, without “BY”, tested L540+ Y7026- A6295- with individual SNP
tests; they have not tested for the
unique SNPs of the previous 4, but they will test when one of those unique SNPs
gets identified as a definition for a branch when someone in the future
matches. One or more of these last 4
might in fact be a member of a branch of one of the previous 4, so these last 4
may or may not represent independent branches.
In other words, our L540 tree is more like a bush, with at least 6
independent branches identified; this is
a strong hint that more independent branches will be identified in the future.
Germany has 10 of the 29 men
in the list. However, all but 1 of those
10 are in the Y7026 branch of L540. So
Germany is the specific best guess for the origin of the Y7026 branch - where the MRCA of the Y7026 men lived. But that’s a highly uncertain guess. The Yfull Tree estimated TMRCA ages are
rounded off; if we examine the Yfull
calculation details, we see 1,976 years for L540 and 2,003 years for
Y7026. Only 27 years separation. That implies that the Y7026 MRCA is the son
of the L540 MRCA. Of course that’s just
a very rough estimate; he might be a
grandson or great grandson, or 2-great grandson. I checked the web: Julius Caesar’s writings are the oldest known
use of “Germani” for people who lived north of the
Roman lands. That’s a little more than
2,000 years ago, before Rome was an empire, and those “Germani”
lived in an area larger than modern Germany.
So it seems the best guess for the Y7026 MRCA is that he was
one of the Germani. I
have the state in Germany for only 5 of those 10 L540 German ancestors: 1 is Prussia, 2 are in the former East
Germany, and 2 are from the southeast of the former West Germany. 5 is not enough for statistical significance,
but the trend seems to be toward the east side of Germany, consistent with most
of the other L540 ancestors being from the region to the east of Germany. A6295 has only 1 ancestor entry from Germany,
and none of the L540* ancestor entries are from Germany, so there is no reason
to suggest Germany as the origin of L540, although Germany cannot be ruled out.
So far, no specific country region
from what we now call Central Europe is a reasonable guess for the specific
place of origin of L540.
There is some bias in on-line DNA
data toward Europe. Many parts of the
world are not represented well in the database.
I suppose there is a slight chance that someday L540 samples will show
up as common elsewhere - for example a group of villages somewhere in the mountains
of Russia, or somewhere in the Balkans, or somewhere on the Eurasian
Steppe. Discussing an origin today on
the basis of only 29 samples is a bit speculative. We’ll see how it comes out as more data
accumulates.
Origin of ancestral branch haplogroup: Z7019 and S3003 are the ancestral haplogroups for L540. But those 2 are very small, with no
significant data so far, so I’ll ignore Z7019 and S3003 for this
paragraph. At Yfull, CTS1273
is the ancestral haplogroup for L540.
CTS1273 is much larger than L540, and Yfull estimates the CTS1273 TMRCA
at 4,600 years ago, which is the formation date for L540 [previous
topic]. If we glance, at Yfull, at
the countries listed for CTS1273, we see a much wider range than for L540; England, Italy, Greece, and other countries
are included in CTS1273. This makes
sense; 4,600 years is a long time for a
wider range of migration. The place of
origin for CTS1273 is even more uncertain than for L540.
New topic 29 Dec 2015. Rewrite 12 Apr 2020.
I estimate there are 100,000 L540
males living in the world. This is my
very rough educated guess. This estimate
is almost surely not wrong by a factor of 10;
in other words, my 99% confidence range is more than 10,000, less than 1
million. My 90% confidence range is a
factor of 3; in other words the actual
number is very likely between 33,000 and 300,000.
That’s a wide confidence range. I worked out that estimate in 2015, based on
a statistical analysis of STR data. I
expected to narrow the confidence range as more STR data accumulated.
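The confidence ranges above are multiplicative: the estimate is divided and multiplied by the same factor. A minimal sketch with an illustrative function name:

```python
def factor_range(estimate, factor):
    """Range from estimate / factor to estimate * factor."""
    return estimate / factor, estimate * factor

estimate = 100_000  # estimated living L540 males, from the text
low90, high90 = factor_range(estimate, 3)
low99, high99 = factor_range(estimate, 10)
print(round(low90), round(high90))  # 33333 300000
print(round(low99), round(high99))  # 10000 1000000
```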
A more precise estimate should be
possible in a few years based on SNP data, but there is not enough SNP data
publicly available yet for L540, as of spring 2020.
My 2015 version of this topic had a
detailed explanation of my estimate, but the STR basis data of my estimate was
no longer available on-line in 2018, so I removed the detailed explanation.
Edit 27 Dec 2015.
Quite frankly, I was originally
surprised by cluster C.
Friedman did a good job finding this one. I admit I dismissed it when I first saw
cluster C in 2007 because it was so small that statistical significance did not
seem possible to me. I postponed
analysis until Jan 2010, independently verifying cluster C as C
type.
By “valid” I mean a cluster whereby most of the samples belong to a single clade, and whereby very few other samples in
the database belong to that clade. In
other words, a valid cluster should eventually have a corresponding SNP discovered. Throughout 2010 I confidently predicted such
an SNP here in this topic, although I doubted it would be discovered soon. L540 turned up in my WTY (next topic) in
2011. C type is the STR equivalent of L540.
Edit 27 Dec 2015.
Fifteen new SNPs were discovered in my “Walk Through the Y” (WTY): L535 through L547, L614, and L618. All 15 are available
as commercial SNP tests from FTDNA.
My WTY test read about 200,000 base
pairs of the Y chromosome in Feb 2011.
WTY is no longer available, having been replaced by Big
Y.
I announced 8 new SNPs here on 29
Mar 2011. The count on 30 Mar was 13 new
SNPs in my WTY. L614 was added in
June. L618 was added in August. That was a lot more than I expected. I now realize that’s because FTDNA expanded
the number of DNA bases included in WTY just before my test. Also, I seem to have been the first WTY from
E-M78 in quite some time.
Rewrite 13 Mar 2020.
The Topic Dividing
L540 has tips for which SNPs to order.
FTDNA: SNP tests cost $39 each if your sample is
already there from previous testing. The
V13 SNP Pack and the V68 SNP Pack cost $119 each.
From your FTDNA home page, top right, click on “ADD ONS &
UPGRADES”. Look for either “Y-DNA SNPs”
or “Y-DNA SNP Packs” and click on the “EXPLORE” box to the right. Then type the SNP or Pack into the “Find” box
to search.
FTDNA has an “E-V68 SNP Pack”, with
more than 100 SNPs downstream of V68, to determine which SNPs are yours. L540 is a branch of V13,
which is a branch of V68. This pack has
SNPs for many of the branches of L540.
FTDNA update 13 Mar 2020: The following L540 SNPs are included in the V68 Pack and also individually: S3003, L540, Y7026, A783, Z29042, A6295. If you are sure you are V13, the V13 pack includes: S3003,
Y7026, A783, Y12393, Z29042, A6295.
Yseq: SNPs are $18 each. They ship a cheek swab kit with your first
order; the swab is good for several
orders. At the Yseq home page, the
“Quick Find” box is on the left, near the center. If you Quick Find L540, the result is a long
list of SNPs that are in L540, as well as the E1b-V13 Panel.
Yseq update 13 Mar 2020: If you are sure you are V13, the Yseq E1b-V13
Panel includes: S3003, L540, Y7026,
A783, Y12393, A779, A782, Z29042, Z39377, A6295. It costs $88.
Please let me
know if you are L540 and order SNPs from Yseq, so I can keep track of results.
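A rough sketch of how the prices above compare, assuming the figures quoted in the 13 Mar 2020 updates still apply and that the SNPs you want are all in the pack or panel. It finds the smallest number of individually ordered SNPs that would cost more than the pack. The function name break_even_count is hypothetical, used only for this illustration.

```python
def break_even_count(price_per_snp, pack_price):
    """Smallest number of individual SNP tests that costs more than the pack."""
    return pack_price // price_per_snp + 1


# Prices in US dollars, as quoted in the text (13 Mar 2020 updates).
vendors = {
    "FTDNA (V13 or V68 SNP Pack)": (39, 119),
    "Yseq (E1b-V13 Panel)": (18, 88),
}

for name, (single, pack) in vendors.items():
    n = break_even_count(single, pack)
    print(f"{name}: pack is cheaper from {n} SNPs on "
          f"({n} x ${single} = ${n * single} vs ${pack})")
```

With these prices, a pack costs less than individual tests once you would order 4 or more SNPs at FTDNA, or 5 or more at Yseq. For only one or two branch SNPs, individual tests remain cheaper.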
If you open this html document with
Word, all the link targets (bookmarks) can be viewed alphabetically or by
location.
Update Nov 2019. Update 30 Jan 2020. Edit 30 Jun 2022.
Big Y-700: https://www.familytreedna.com/products/y-dna#/compare. A
commercial product at FTDNA for reading
about 12 million base pairs of the DNA of the Y chromosome, which has about 60
million base pairs total. New SNPs are being
discovered in the Big Y-700 data provided by customers. Big Y-700 replaces the original Big Y, which
was used to discover most of the SNP
branches in the L540 Tree. There was also another version, Big Y-500,
for a while. The newer versions provide
more SNPs.
Y-37 and Y-111: https://www.familytreedna.com/products/y-dna#/compare.
These tests are less expensive than Big Y, using 37 or 111 STRs to predict your SNP branch.
E-M35, a
project at FTDNA, is my main source of
data. Previously called E3b. Link: https://www.familytreedna.com/groups/e-3b/about/background. The official name today would be E1b1b1. ISOGG changes the name
when new defining SNPs are discovered, so
the name may change again in the future.
M35.1 is the name of the SNP that defines E1b1b1 within haplogroup E. I am not planning a separate L540 project, because
it is more convenient to run this web page using the E-M35
project.
Haplozone is a web site for analysis of data from
the E-M35 project. This site has not
been fully updated since September 2013, but it is still useful. Link: http://www.haplozone.net/e3b/project. It holds the data from the E-M35 project plus some data added from
sources other than FTDNA, so this database is larger than the E-M35 project database. Page with a listing of proposed clusters: http://www.haplozone.net/e3b/project/cluster/. Page with L540 / C cluster samples: http://www.haplozone.net/e3b/project/cluster/42. Discussion forum: http://community.haplozone.net/
Yseq: www.Yseq.net.
A company that provides Y-SNP tests at competitive prices and with fast
turnaround.
Yfull: www.Yfull.com.
A company that provides analysis of raw DNA data, very useful for Big Y data. Yfull
presents a tree for L540 at:
https://www.yfull.com/tree/E-L540/
SNP Tracker
is a web page added to the E-M35 project in late 2011, to keep track of all the
new SNP branches in M35. http://tinyurl.com/e-m35-snps. Not up to date.
The V13
data: http://www.haplozone.net/e3b/project/cluster/10. V13 is the defining SNP for E1b1b1a1b1a, a
major branch haplogroup in E, and “father” of L540. That page of data does not have the data for samples that have been assigned to clusters
as subdivisions of V13, just the data that does not fit any downstream proposed
STR cluster. The number code for other
clusters can be typed over that “10” to quickly get to other cluster data.
Cluster C Data: http://www.haplozone.net/e3b/project/cluster/42.
ISOGG
link: http://isogg.org/tree/ Y-DNA tree SNPs and corresponding
alphanumeric codes for the haplogroups.
ISOGG names change as new SNP divisions are discovered. ISOGG names are getting quite long due to the
flood of new SNPs in the past few years.
Click on the link for the E branch, and download it (as xlsx, for example) to search for L540 to see their version
of the L540 tree. For V13, it’s easier
to search for PF2211, which is an alternate name.
Steve Fix
uses Big Y data to maintain a tree for V13.
Andrew Lancaster
was an administrator for the E-M35 (E3b) Project. Andrew had been particularly patient with me
in long, helpful email discussions.
Villarreal and Friedman had also been very helpful.
Victor Villarreal
was an administrator for the E-M35 Project.
Elise Friedman
was a co-administrator for the E-M35 Project and is administrator for the
Jewish E3b project.
Denis Savard
is a current administrator for the E-M35 Project.
Peter Gwozdz. That’s me. pete2g2@comcast.net.
Revision History
2010 Jan 14 original
draft version
2010 13 updates
2011 28 updates
2012 - 2014 28 updates
2015 39 updates
2016 - 2017 28 updates
2018 - 2019 4 updates
2020 16 updates
2021 Jan 14 Michalski added to Tree; add FT402631 branch
2021 Jan 23 rewrite of
topic “Age of L540”
2021 May 5 add Kurganov to tree
2021 May 8 add Zeidler to tree; add
A1157* branch
2021 Jul 20 add Paul Sholtz to the tree
2021 Sep 14 add Grundel to the tree
2021 Oct 19 new branch
of L540: Y82423
2021 Nov 27 move
Shultz in Tree
2022 Jun 30 update
tree
2023 Mar 7 minor updates
on the first 4 topics, including the tree