Genetic distances between Han Chinese, several other ethnic minorities, and foreigners (repost)

Source: 百度文库  Editor: 超级军网  Time: 2024/04/29 13:19:36
Genetic distances between Han Chinese, several other ethnic minorities, and foreigners
王小东

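I have finally, for the time being, finished everything I had on hand, so I can chat about this bit of idle business. All of the material below comes from the book The History and Geography of Human Genes. The book is of course not the final word, but so far it is the best there is, far better than hearsay and far better than the ambiguous historical records of antiquity. It is a fairly complicated book, and there are parts I cannot follow. For example, there are several ways to measure genetic distance, and the book does not conclude which is best; it simply piles the results of all of them in front of you. I cannot quite tell what distinguishes the methods either; maybe Yaomi can. Different studies do not show exactly the same picture, but they broadly agree. There is too much material, so I can only pick out the essentials.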

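I am not an expert in this field and have not put much work into it, so mistakes are inevitable. I am writing this up only because I have had this rather interesting book on hand for many years without translating it, so for now I am extracting some of the most interesting information, not entirely accurately.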

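The genetic distance between northern Han and southern Han in China is large, so the book discusses the Han as these two groups. In any case the northern Han are the main body of the Han, because "southern Han" mainly means the inhabitants of Fujian, Guangdong, Guangxi and Taiwan, who really do differ quite a lot from everyone else. Everyone else is counted as northern Han; of course there are transitional types among them, with a fairly steep gradient. I will start with the northern Han.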

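The northern Han's closest relatives are the Tibetans and the Bhutanese. For ease of description (a diagram would really be clearest and most convenient, but I cannot draw one) I will call them first-degree relatives. Second-degree relatives are the Koreans and the Japanese; we do not like the Japanese, but there is nothing to be done, that is what the genetics says. Third-degree relatives are the Ainu of Japan; European anthropologists once claimed the Ainu were white (because they are hairy), but genetics shows they are not. Fourth-degree relatives are the Mongols and the Ural-Siberian peoples. Fifth-degree relatives are the Reindeer Chukchi. Sixth-degree relatives are far too many, so I can only pick the important, representative ones: Turks, Iranians, Arabs, central Indians, northern Turkic peoples, Tungus, and finally the relatives that I reckon many Chinese nowadays most like to claim, the Caucasians. By the sixth degree, kinship has been established even with the Caucasians but still not with the southern Han, so I wonder whether there is a conspiracy here? Haha, just kidding; academic research like this would not go that far. Only at the seventh degree do the northern Han finally link up with the southern Han. By that degree, the only ones still not linked up are the Australian Aborigines, the New Guineans and our African brothers.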

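So what do the relationships look like from the southern Han side? The southern Han's first-degree relatives are the Thais and the Vietnamese. Second-degree relatives are the Indonesians. Third-degree relatives are the Malaysians and the Balinese (who also live in Indonesia). Fourth-degree relatives are the Filipinos. Only at the fifth degree do they finally link up with the northern Han. And by the time our southern Han brothers link up with the northern Han, they have also linked up with the Turks, Iranians, Arabs, central Indians, northern Turkic peoples, Tungus and Caucasians all at once. Readers can draw the genetic tree themselves from my description, which makes it all clear at a glance.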
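For readers who would rather generate the tree than draw it, here is a minimal sketch that turns the "degrees of kinship" above into a Newick-style bracket string. It assumes, as a simplification of my own, that each degree joins the cluster built so far as a set of sister branches; the helper name `ladder` and the English population labels are made up for illustration and are not taken from the book.

```python
# Hypothetical helper: every "degree" is a group of populations that joins
# the cluster built so far (a simplifying assumption, not the book's method).
def ladder(start, degrees):
    tree = start
    for group in degrees:
        tree = "(" + ",".join([tree] + group) + ")"
    return tree

# Degrees one to six as listed from the northern Han side.
north = ladder("Northern_Han", [
    ["Tibetan", "Bhutanese"],
    ["Korean", "Japanese"],
    ["Ainu"],
    ["Mongol", "Ural_Siberian"],
    ["Reindeer_Chukchi"],
    ["Turkish", "Iranian", "Arab", "Central_Indian",
     "Northern_Turkic", "Tungus", "Caucasian"],
])

# Degrees one to four as listed from the southern Han side.
south = ladder("Southern_Han", [
    ["Thai", "Vietnamese"],
    ["Indonesian"],
    ["Malaysian", "Balinese"],
    ["Filipino"],
])

# The two ladders meet at the seventh degree (north) / fifth degree (south).
tree = "(" + north + "," + south + ");"
print(tree)
```

The printed string can be pasted into any viewer that accepts Newick trees. Only the branching order described in the text is encoded; branch lengths are not, since the post gives no distance values.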

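Talking in degrees of kinship describes the genetic tree; it does not yet touch the actual values of genetic distance. In fact, the values exist too. There are several large tables, with too much data to copy out here. What is more, those tables surprisingly contain no data for the northern Han, only for the southern Han. To see how things stand for the northern Han, we can only use the Tibetans as a reference. Another point: the data in those tables do not seem entirely consistent with the tree described above. Why exactly, I have not looked closely; I reckon I would not understand it even if I did, so I left it at that. But the broad outline still agrees.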

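By genetic distance, the Tibetans are closest to the Mongols, then the Japanese, then the Koreans. The southern Han are completely different: they are closest to the Thais, then the Khmer, then the Filipinos and Indonesians. And how do the Tibetans (that is, the nearest reference point for the northern Han) relate to the southern Han? It is truly hard to believe: the Tibetans (one could also say the northern Han) are closer to the Greeks than to the southern Han. So who are the Tibetans and the southern Han farthest from? Both are farthest from the Bantu.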

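All of the above is based on genetics. There are also data based on anthropometry: a rather crude table, without Mongols and the like, in which the Chinese and the Japanese are closest. From the standpoint of physical anthropology, the Caucasoid race varies a great deal, from skin colour to hair to eye colour. The Mongoloid race (the Mongoloid race here is not the same concept as the Mongols mentioned earlier; no need to explain that again, I hope?) varies very little and basically all looks the same. But this changes in southern China, which is only to be expected since, as already said, southern Chinese are genetically quite distant from northerners. Archaeological material shows that this was already so in the Old and New Stone Ages. Apart from appearance, in the Old and New Stone Ages the technology of northern Chinese was more advanced than that of southern Chinese; the excavated tools differ in how finely they are made.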

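Furthermore, suppose human skin colour is divided into eight grades: lightest, light (lighter), light (darker), medium (lighter), medium (darker), dark (lighter), dark (darker), darkest. Europeans basically all belong to the lightest grade; only some Spaniards and people in the north of northern Europe are light (lighter). People in the light (lighter) grade are the most numerous and the most widely distributed, including China, Russia, the north of northern Europe and the north of North America. North Africans, Southeast Asians and Arabs basically fall into light (darker), medium (lighter) and medium (darker). Indians and Pakistanis fall into medium (lighter), medium (darker), dark (lighter) and dark (darker). The book says: light skin answers the need to produce one's own vitamin D when sunlight is insufficient, so the farther north you go, the paler people are. Then why are some peoples living in the far north not the lightest? Simply because they eat meat, so their food is already fairly rich in vitamin D. The populations that live in the north, eat relatively little meat and mainly eat wheat are the palest, such as the Swedes (this is the result of many years of natural selection; if you want to get paler now, it is too late for this trick, so girls who like meat might as well eat their fill).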

As for eye and hair colour, the book only covers a little about Europe.

Overall picture:

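(1) Our African brothers are off to one side. The genetic distance between Asians and Europeans is much smaller than the distance of either from Africans. Although both are fairly far from Africans, Europeans are genetically slightly closer to Africans than Asians are (not what skin colour would suggest!).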

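(2) The genetic distance between northern and southern Han is far from small; the northern Han have to go around by way of the Europeans before linking up with our southern Han brothers (however I look at it, it does not seem right! But that is science, so what can you do. Could Yaomi help explain?).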

(3) The northern Han are genetically very close to the Japanese, Koreans and Mongols.
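王小东
2006-08-18 14:06:39 Yaomi, the time frame you mentioned is not right. Sforza's book (it is Sforza, not "Spiza"; Sforza is the surname of an Italian noble family, and Italians are also relatively close to northern Chinese, though of course the closest among Europeans are the Greeks) seems to say that genetic methods can be used to trace back in time, and that the relationship between northern Chinese and the people over there goes back far, far earlier than Attila and the like. But to this day I still have not quite figured out how genetics pins down dates.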

yaomi 2006-08-18 14:10:09 Here is a paper on PNAS:
Evolution
Genetic relationship of populations in China

ABSTRACT
Despite the fact that the continuity of morphology of fossil specimens of modern humans found in China has repeatedly challenged the Out-of-Africa hypothesis, Chinese populations are underrepresented in genetic studies. Genetic profiles of 28 populations sampled in China supported the distinction between southern and northern populations, while the latter are biphyletic. Linguistic boundaries are often transgressed across language families studied, reflecting substantial gene flow between populations. Nevertheless, genetic evidence does not support an independent origin of Homo sapiens in China. The phylogeny also suggested that it is more likely that ancestors of the populations currently residing in East Asia entered from Southeast Asia.


RESULTS
The phylogeny based on 30 microsatellites (Fig. 1A) revealed a clear distinction between southern and northern Chinese populations, although the number of Chinese populations included in this phylogeny is small. Three northern Chinese populations clustered with the Japanese and Korean as expected. The southern populations in this phylogeny are not representative because three of the five southern populations are Taiwanese Aborigines speaking Austronesian languages. However, this phylogeny provides validation for our current approach, given the fact that the relationship among worldwide populations is identical to that presented in Bowcock et al. (8). The latter was derived by using a completely different set of markers, but some populations analyzed in this study were included in Bowcock et al. (Cambodian, Karitiana, Mayan, Australian, New Guinean, Italian, Zaire Pygmy, Central Republic Pygmy, and Lissongo). Populations from East Asia form a distinctive cluster indicating a common ancestry shared among those groups. Taiwanese Aborigines populations derived from the southern population cluster from the continent, indicating the probable origin of those populations and probably Polynesians.
The distinction between southern populations and northern populations was noticeable but far less clear when 16 more Chinese populations were added, producing the phylogeny presented in Fig. 1B. The number of loci was reduced to 15 due to incomplete data for some loci. Again, the populations from East Asia were derived from the same lineage.
In Fig. 1B, two clusters for the northern populations are discernible. Altaic language-speaking Buryat, Yakut, Uyghur, and Manchu clustered with the Korean and Japanese, two language isolates but closely related to Altaic. Two Han populations, one from north China and the other from Yunnan, also contributed to this cluster (cluster N1). Another Altaic language-speaking population, Ewenki, formed a cluster (cluster N2) with Tibetan, Tujia, and Hui, all of which were originally derived from the northern populations though currently living in the western part of China (21).
Populations of southern origin formed three clusters. In the first south cluster (S1), Blang, an Austro-Asiatic population, grouped with Deang, Aini, Lahu, and Dai, all sampled from the southwest part of Yunnan. This lineage then clustered with three populations from Taiwan (Paiwan, Atayal, and Yami), probably reflecting the origin of Taiwanese Aborigines and thus Polynesians from Southeast Asia. The fourth Taiwanese aboriginal population, Ami, forms a separate cluster with Han Chinese of southern origin living in the U.S. before they joined the previous cluster to form cluster S1. The second southern group consists of three Daic populations (Li, Dong, and Yao from Jinxiu) all from Guangxi or Hainan, two Hmong-Mien populations (She and Yao speaking Punu), Cambodian (an Austro-Asiatic population), Yi and Han from Henan (cluster S2). The second northern lineage (cluster N2) consists of mostly western populations derived from this southern group except Ewenki. Jingpo and Wa formed the third southern lineage (cluster S3). In this phylogeny, populations in East Asia can be divided into two groups: a northern group consisting of populations in cluster N1 and a southern group including all southern populations (clusters S1, S2, and S3) and the second cluster of northern origin (cluster N2). This relationship was not strongly supported by the bootstrap values among major clusters most of which were small. However, a phylogeny with 17 Chinese populations and 8 worldwide populations based on 26 loci presented a topology very similar to that of Fig. 1B, and the bootstrap value supporting the separation of the first northern cluster and the southern clusters being 13% and the bootstrap value supporting the second northern lineage being 19% (data not shown).
The measure of genetic distance, Dc (19), was used in this study because it generally outperformed other measures in obtaining correct topology for microsatellite markers in an extensive simulation study (15). The neighbor-joining method tends to be less affected by the presence of admixture occurring among populations in recovering the correct topology compared with the unweighted pair-group method of averages (UPGMA) and therefore became the method of choice in this analysis (17). Phylogenies using UPGMA were also constructed but not included because the relationships of worldwide populations are different from those in Bowcock et al. and other studies using microsatellites (8-10). Other measures of genetic distance such as Dsw, Rst, and (

2006-08-18 14:14:09 were also used in the analysis (20-23), but they lead to less sensible results inconsistent with known ethnohistory of the populations studied (15-17).
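To give a feel for what a distance such as Dc measures, here is a minimal sketch. It assumes that Dc denotes the Cavalli-Sforza and Edwards chord distance in the form usually quoted in the microsatellite literature, Dc = (2 / (pi * r)) * sum over loci of sqrt(2 * (1 - sum over alleles of sqrt(x * y))), where x and y are the frequencies of the same allele in the two populations and r is the number of loci. The excerpt itself does not print the formula, and the function name, allele labels and frequencies below are invented for illustration.

```python
import math

def chord_distance(pop_x, pop_y):
    """Chord distance between two populations (assumed form of Dc).

    pop_x, pop_y: lists with one dict per locus, mapping allele -> frequency.
    Both lists must describe the same loci in the same order.
    """
    total = 0.0
    for freq_x, freq_y in zip(pop_x, pop_y):
        alleles = set(freq_x) | set(freq_y)
        # Sum of sqrt(x * y) over alleles; equals 1 for identical frequencies.
        overlap = sum(
            math.sqrt(freq_x.get(a, 0.0) * freq_y.get(a, 0.0)) for a in alleles
        )
        overlap = min(overlap, 1.0)  # guard against floating-point overshoot
        total += math.sqrt(2.0 * (1.0 - overlap))
    return 2.0 * total / (math.pi * len(pop_x))

# Made-up allele frequencies at two loci for two hypothetical populations.
pop_a = [{"a1": 0.5, "a2": 0.5}, {"b1": 0.5, "b2": 0.5}]
pop_b = [{"a1": 0.5, "a2": 0.5}, {"b3": 1.0}]

print(chord_distance(pop_a, pop_a))
print(chord_distance(pop_a, pop_b))
```

A matrix of such pairwise values is what a tree-building method like neighbor-joining or UPGMA takes as input.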

CONCLUSIONS AND DISCUSSION
Validation of the utility of microsatellites in reconstructing evolutionary history of human populations has been made not only theoretically (20-23) but also empirically; the relationships based on microsatellites are generally consistent with morphological and paleontological evidence and other types of genetic markers (8-10). However, many of such studies used distantly related populations and, therefore, the utility of such markers in the study of closely related populations is yet to be explored. The current study reflects, to some extent, a lack of resolution of microsatellites in the reconstruction of closely related populations, probably because of an insufficient number of loci and a large number of populations studied but less likely because of the insufficient number of samples for each population as demonstrated by Shriver et al. (20). This is so because the variance of the genetic distance between loci is much larger than the variance due to sampling error (20) in the estimation of genetic distance. Small bootstrap values reflect insufficient amount of information available to resolve the genetic relationship among closely related populations in the presence of strong gene flow among those populations. But the employment of a much larger number of microsatellite loci in the current analysis may not guarantee a better resolution under such a scenario. Nevertheless, it is not our primary intention to reveal the detailed genetic relationship among those closely related populations, rather we are interested in exploring the major pattern of evolutionary history of the human populations currently residing in East Asia.
In both phylogenies with different loci and populations, populations from East Asia always derived from a single lineage, indicating the single origin of those populations. It does not preclude the possibility of an independent origin of modern humans in East Asia, but its contribution to the extant populations is not detectable in this analysis. It is now probably safe to conclude that modern humans originating in Africa constitute the majority of the current gene pool in East Asia. A phylogeny with very different topological structure would have been expected if an independent Asian origin of modern human had made a major contribution to the current gene pool in Asian populations. Since the methods employed in this analysis can detect only major genetic contribution from particular sources, a haplotype-based analysis will probably detect minor contribution from an independent origin of modern humans in East Asia (24, 25).
In contrast with previous studies (2-4) where distinction between southern and northern populations was clear, our current analysis showed that northern populations belong to two different groups, although statistical support was still weak. One noticeable difference in our study is the employment in the phylogeny reconstruction of the neighbor-joining method, which is supposedly more robust in the presence of genetic admixture. The use of microsatellites, a different type of genetic markers from previous studies, and the measures of genetic distance introduced further complication. However, the northern populations in cluster N2 were sampled from the southwestern part of China, except for Ewenki, where genetic admixture with the southern population was more likely to occur. This might explain why this group of northern populations clustered with southern populations.
Another noticeable feature from this analysis is that the linguistic boundaries are often transgressed across the six language families studied (Sino-Tibetan, Daic, Hmong-Mien, Austro-Asiatic, Altaic, and Austronesian). Such a phenomenon is even more pronounced among southern populations, where populations from the same geographic regions tend to cluster in the phylogeny (see Fig. 1B). This observation is consistent with the history of Chinese populations, where population migrations were substantial.
The current analysis suggests that the southern populations in East Asia may be derived from the populations in Southeast Asia that originally migrated from Africa, possibly via mid-Asia, and the northern populations were under strong genetic influences from Altaic populations from the north. But it is unclear how Altaic populations migrated to Northeast Asia. It is possible that ancestral Altaic populations arrived there from middle Asia, or alternatively they may have originated from East Asia.
The analyses of metric and nonmetric cranial traits of modern and prehistoric Siberian and Chinese populations showed that Siberians are closer to Northern Chinese and Mongolian than European (26, 27). The same notion holds for the facial flatness (26-28). European populations did not appear in Siberia, western Mongolia, and China until the Neolithic and Bronze Age (26, 27, 29, 30). Furthermore, cranial and dental analyses have linked the Arctic peoples, Buryat and east Asians with American Indians (31-35), which arrived through Beringia (Bering land bridge) somewhere between 15,000 and 30,000 years ago (36). These observations are generally consistent with the genetic evidence based on this research and mitochondrial DNA data (37-40). Therefore, it is more likely that ancestors of Altaic-speaking populations originated from an East Asian population that was originally derived from Southeast Asia, although the current Altaic-speaking populations undeniably admixed with later arrivers from mid-Asia and Europe (see Fig. 2, thin solid lines). The possibility of early northern route migration from mid-Asia to Siberia is doubtful, given the fact that the last glacier started to recede only 15,000 years ago (see Fig. 2, dashed lines).

Fig. 2. Hypothetical ancestral migration routes to the Far East. Refer to Table 1 for names of the numbered populations.


This conclusion can be tested by using simple inductive logic. If the ancestral Altaic-speaking population was of northern origin, the genetic relationship of extant populations should follow the phylogeny presented in the bottom of Fig. 3. The phylogeny generated in the current study apparently supports the upper phylogeny of Fig. 3. In this analysis, Altaic populations are represented by Buryat and Yakut. Southern Chinese populations are those populations from Yunnan and Taiwan that reportedly did not have any admixture with Altaic populations. Populations from Middle Asia were not available to this study.

Fig. 3. Phylogenetic relationships of worldwide populations under two hypotheses; see text for discussion.


Now that we have established that populations in East Asia were subjected to genetic contributions from multiple sources: Southeast Asia, Altaic from northeast Asia, and mid-Asia or Europe, it would be interesting to estimate relative contributions from each source. Unfortunately, the current study involved only mostly minority populations. A study involving populations across the country is necessary to reveal such a picture.
戴嘉琦
2006-08-22 12:53:04 You are all old hands at this; no wonder.
Nonsense