Allele Mining of Seed-Related Genes Reveals Early Movement, Selection and Adaptation of Asian Rice Landraces
Rice volume 18, Article number: 95 (2025)
Abstract
Asian cultivated rice is one of the most important crops in the world. According to archaeological studies, carbonized rice grains in Taiwan were quite small before 3300 BP, but rice seeds from later excavated sites were much larger. In the current study, we explored seed size differences by using genome re-sequencing, followed by allele mining of several seed- and adaptation-related genes, to propose the early movement, selection and adaptation of Asian rice landraces.
Results: Taiwan indigenous people are descendants of early Austronesians. We collected 116 rice accessions from Taiwan indigenous villages and used whole-genome re-sequencing to explore the mutations, early movement, selection and adaptation of world rice landraces together with landraces collected from all rice-growing areas in Asia. The morphology of Taiwan indigenous rice accessions featured huge variations, and the most primitive accessions were tropical japonica. Also, some grain size-related genes could explain grain size differences. Allele analyses revealed that some mutations in grain morphology-controlling genes may have been targets for selection. Mutation, movements, selection and adaptation occurred in early Asian rice cultivation. Our findings do not conflict with the “out-of-Taiwan” theory.
Conclusion: The grain sizes of all rice subgroups studied had been selection targets, and seed size gradually changed along with the improvement of rice cultivation, via introgression and expansion (i.e., by human beings), over thousands of years.
Introduction
Asian cultivated rice (Oryza sativa) is one of the most important crops in the world and the most widely consumed. Rice was genetically divided into 2 main subspecies, japonica and indica, by Kato (Kato et al. 1930) and according to many subsequent analyses of morphologic, physiologic and cytogenetic characteristics (reviewed by Matsuo and Hoshikawa 1993). Later on, with the analysis of molecular markers (Liakat Ali et al. 2011) and whole genome sequencing (3KRGP 2014), rice was classified into 5 major subpopulations: aus, indica, temperate japonica, tropical japonica, and aromatic. In 2018, a detailed analysis of the 3 K accessions further classified indica rice into ind1a, ind1b, ind2 and ind3 and japonica rice into tropical (trop1), subtropical (trop2) and temperate (Wang et al. 2018).
According to archaeological studies, 2 early domesticated (i.e., non-shattering) cultivated rice types in China were the japonica type, including rice grown about 8,500 years ago in Baligang in the Yellow River-Huai River Plain (Deng et al. 2015) and about 9,000 years ago in the lower Yangtze River region (Fuller et al. 2009). Wild rice ancestral to indica, also known as proto-indica, was present in the northern Indian subcontinent before the arrival of domesticated japonica (Silva et al. 2018). Domestication genes from japonica were transferred to cultivated proto-indica rice, and domesticated indica rice was present about 4,000 years ago in the Ganges plains of eastern India (Fuller et al. 2010; Choi et al. 2017). PCR amplification with nuclear and plastid primers, followed by sequencing of the fragmented products, of hundreds of archaeological grains excavated from India (2 archaeological sites) and Thailand (4 sites) and dated 2500 to 1500 before present (BP) revealed that the rice was predominantly japonica in Thailand and a mixture of japonica and indica, with indica in the minority, in India (Castillo et al. 2016). Thus, the movement of indica rice accessions to Southeast Asia (SEA) and East Asia was relatively late as compared with japonica rice cultivation.
Taiwan indigenous people are the original inhabitants of Taiwan. Their ancestors may have been living in Taiwan for thousands of years before Han Chinese immigration began in the 17th century. Taiwan indigenous peoples are Austronesian speakers, and Taiwan has been shown to be the origin of the Austronesian languages spoken in the Philippines, Indonesia, Malaysia, Madagascar, Polynesia and Oceania (Hill et al. 2007).
Archaeological studies revealed many carbonized rice grains in the excavated sites in southern (Hsieh et al. 2011; Tsang and Li 2015), eastern (Wu et al. 2016; Deng et al. 2022), central (Chu 2016; Deng et al. 2022) and northern Taiwan (Huang 1984; Deng et al. 2022) from 4 to 5 thousand years ago. Thus, the archaeological record shows that rice cultivation has gone on since the 5th millennium BP. The collection of rice accessions from the indigenous villages started from the early 20th century to recent times, including both japonica and indica rice (Wu et al. 2022). Han Chinese immigrated to Taiwan during the late Ming to early Ching Dynasty, and all rice lines brought with them were indica (Teng 1999).
The early movements of rice between Taiwan and nearby regions are an important topic and have been studied for decades. The dispersal of rice into insular SEA has been suggested to be linked to the Austronesian expansion (Bellwood 1997; Diamond 2001; Bellwood 2006). However, many studies conflict with this simple “out-of-Taiwan” hypothesis for the spread of rice. For instance, in our recent whole-genome re-sequencing study of the regional dispersal of rice in Taiwan and SEA, japonica rice cultivated by Taiwan indigenous peoples consisted of 2 distinct populations: (1) temperate japonica from northeast Asia and (2) tropical japonica from northern Philippines and mainland SEA (Alam et al. 2021). However, only 24 indigenous lines were used in that study. Many of the indigenous rice lines showed typical tropical japonica morphology and some were temperate japonica. There was no definite conclusion about which type was the earlier one in Taiwan. Also, subsequent frequent trade activities might have influenced the diversity of rice accessions.
Taiwan is a small island and is relatively far away from nearby islands and the mainland. Even though many studies have found exchanges of crops, jade, language, etc. with adjacent regions (Bellwood 1997; Pawley 2002), further exchange and reception of new types of seeds were not as frequent as in large continents. The indigenous people in Taiwan had kept the seed resources faithfully for thousands of years and the Han farmers for hundreds of years. Thus, the detailed comparison of Taiwan rice landraces with those in nearby regions is important.
Many indigenous japonica rice seeds were larger than the early carbonized rice grains and the seeds of modern temperate japonica varieties. The major aim of the current study was to examine changes in grain morphology by investigating in detail the allele types of several grain size-controlling genes. Rice grain size is a quantitative trait and is controlled by multiple genes (for recent reviews, see Li et al. 2018; Jiang et al. 2022). Grain size is a 3-D trait comprising seed length, width and thickness, and as an appearance quality it is a target during harvesting/selection. Mutations in the open reading frame (ORF) or sequence changes in the promoter of such genes could lead to changes in grain size. The genes include the transcriptional regulatory factor LGY3 (rice grain yield quantitative trait locus on chromosome 3, Liu et al. 2018), grain weight in chromosome 7 (GW7, Wang et al. 2015), grain length and weight on chromosome 7 (GLW7, Si et al. 2016), grain weight in chromosome 8 (GW8, Wang et al. 2012), the protein GS3 (grain shape gene on chromosome 3, Fan et al. 2006; Mao et al. 2010), the phytohormone signaling protein qTGW3 (quantitative trait controlling grain size and weight on chromosome 3, Hu et al. 2018), the seed width-controlling protein qSW5 (quantitative trait controlling seed width on chromosome 5, Shomura et al. 2008; Duan et al. 2017), as well as the hormone signaling proteins grain size in chromosome 5 (GS5, Li et al. 2011), TGW6 (thousand grain weight on chromosome 6, Ishimaru et al. 2013) and GW6 (grain weight on chromosome 6, Shi et al. 2020).
Thus, we performed whole-genome re-sequencing and seed morphology analysis of many Taiwan landraces. Additionally, we retrieved genome sequence data and seed size information for landraces from rice-growing regions through the 3 K Rice Genome Project (3KRGP 2014). We established a panel of cultivated rice with > 85% landraces and also performed allele mining of several seed-related genes. We aimed to provide data on seed morphology changes and allele variations to give information on the early movement, selection and adaptation of Asian rice landraces.
Materials and Methods
Archaeological Materials
Hundreds of carbonized rice grains were collected from 4 excavated sites in Taiwan belonging to 4 cultures by using the flotation method. The sites are Nan-kuan-li East (NKLE, 23°6’58"N, 120°16’35"E, altitude 0.5 m, 4800 BP) of the Dapenkeng culture (4800–3300 BP), Youhsienfang (YHF, 23°06’54.0"N, 120°16’25.0"E, altitude 6 m) of the Niuchouzi culture (3800–3300 BP), Wuchiantsuo (WCT, 23°05’39.0"N, 120°16’25.0"E, altitude 7 m) of the Niaosong culture (1400–500 BP), and Huilaili (HLL, 24°09’42.8"N, 120°38’11.6"E, altitude 73 m, 1300 BP) of the Fanziyuen culture (2000–400 BP). In total, 100 carbonized rice samples from each site were used for measuring seed width, length and thickness with a digital ruler (Mitutoyo Co., Japan) with 0.01-mm accuracy. Information on the excavated sites and cultures, including Chinese translations, is in Table S1.
Sequencing Data
We performed whole-genome re-sequencing of 260 rice accessions collected in Taiwan: (1) 116 Taiwan indigenous rice accessions, landraces collected from the indigenous villages of the mountain regions of Taiwan; (2) 59 landraces that were brought to Taiwan by Han people during the late Ming to early Ching Dynasty, designated “Ming-Ching”; (3) 72 modern varieties; (4) 11 weedy rice accessions found in paddy fields; and (5) 2 wild rice (O. rufipogon) accessions chosen from those found in northern Taiwan (Oka 1991). All seeds were available from the National Plant Genetic Resource Center (NPGRC) of the Taiwan Agricultural Research Institute (TARI). The list of all Taiwan rice accessions, sorted by type, is shown in Table S2. Plants of single-seed descent were cultivated until the tillering stage in an Academia Sinica greenhouse under natural light. Healthy leaves without insect damage from a single plant were harvested, frozen in liquid nitrogen and stored at -80℃ for DNA extraction. High-quality genomic DNA preparation and the subsequent whole-genome re-sequencing were performed as described (Wu et al. 2020a). Sequencing data for these accessions are available from the NCBI Short Read Archive (SRA), and the Bioproject accession numbers are listed in Table S3.
Several sets of whole-genome re-sequencing data were downloaded from the public domain: the 3 K-RG project (3KRGP 2014; Wang et al. 2018); Asian wild rice (Stein et al. 2018; Zhao et al. 2018b); sequencing data for some modern varieties such as Koshihikari, 9311, IR24 and IR64; and Chinese weedy rice accessions (Qiu et al. 2017). The accession names and seed storage information for all seeds used as well as sequencing data for all these accessions are available from the NCBI Short Read Archive (SRA) under the Bioproject accession numbers listed in Table S3.
Mining of Loss-of-Function (LOF) Alleles for Seed Trait-Related Genes
Adaptor sequences, low-quality bases and reads < 20 bp long were discarded. The trimmed paired reads were then aligned to the reference rice Nipponbare genome sequence (IRGSP v1.0). SAMtools and VCFtools (Danecek et al. 2011) were used to manipulate and transform the Sequence Alignment Map (SAM) and variant call format (VCF) files. We performed allele mining of 10 grain size-related genes: LGY3 (Liu et al. 2018), GS3 (Fan et al. 2006), qTGW3 (Hu et al. 2018), qSW5 (Shomura et al. 2008), GS5 (Li et al. 2011), TGW6 (Ishimaru et al. 2013), GW6 (Song et al. 2015), GW7 (Wang et al. 2015), GLW7 (Si et al. 2016), and GW8 (Wang et al. 2012). The functional impact of nucleotide variants in the whole-genome re-sequencing data was analyzed by using SnpEff (Cingolani et al. 2012). From the genome annotation, sequence variants (single nucleotide polymorphisms [SNPs] and small indels) were classified according to their location (ORF, intron, splice sites, etc.) and predicted functional impact (missense, frameshift, early stop, etc.). We also analyzed genes for other seed-related traits such as red caryopsis (Rc, Sweeney et al. 2006); waxy (wx, Wanchana et al. 2003); 3 non-shattering genes, qSH1 (Konishi et al. 2006), qSH3 (Ishikawa et al. 2022) and sh4 (Li et al. 2006); and 2 adaptation-related genes, Cold1 (Ma et al. 2015) and Heading date 1 (Hd1, Yano et al. 2000; Takahashi and Shimamoto 2011; Wu et al. 2020a).
To detect fragment (> 35 bp) insertions or deletions, we used the Nipponbare genome sequence or the semi-assembled genome of indica rice R498 (Du et al. 2017; for the qSW5 gene) for alignment. The VCF files for each accession in the nearby region were compared by using “vcf-isec” to classify sample-specific or intersection variants. The VCF files were then imported into Integrative Genomics Viewer (IGV) (Robinson et al. 2011; Thorvaldsdóttir et al. 2013) to show alignments.
Bioinformatic Analyses
Alignment sites shared in all accessions were identified first, and SNPs in the shared alignment sites were used for phylogenetic analyses. To remove low-frequency and low-quality SNPs, we filtered the VCF files with VCFtools (Danecek et al. 2011) to exclude sites with any missing genotypes, mean depth values < 5, or minor allele frequency (MAF) < 5%. The remaining SNPs were used to evaluate the genetic distances between samples. A tree was constructed from the distance matrix by the Neighbor-Joining/UPGMA method with Neighbor 3.696 (Felsenstein 2004) and presented with MEGA 5.2 (Saitou and Nei 1987; Kumar et al. 2016) and FigTree 1.4.4 (http://tree.bio.ed.ac.uk/software/figtree/).
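The three site-level filters described above (no missing genotypes, mean depth of at least 5, MAF of at least 5%) can be illustrated with a minimal sketch. This is not the VCFtools pipeline used in the study; the `Site` record, the function names and the example values are hypothetical and serve only to make the filtering logic explicit for biallelic SNPs in diploid accessions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Site:
    """One biallelic SNP site (hypothetical structure, for illustration only)."""
    genotypes: List[Optional[int]]  # alt-allele count per accession (0, 1 or 2); None = missing
    depths: List[int]               # read depth per accession


def minor_allele_frequency(genotypes: List[Optional[int]]) -> float:
    """Return the minor allele frequency from diploid alt-allele counts."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def keep_site(site: Site, min_mean_depth: float = 5.0, min_maf: float = 0.05) -> bool:
    """Apply the three filters: no missing genotypes, mean depth, and MAF."""
    if any(g is None for g in site.genotypes):
        return False
    mean_depth = sum(site.depths) / len(site.depths)
    if mean_depth < min_mean_depth:
        return False
    return minor_allele_frequency(site.genotypes) >= min_maf


# Assumed toy data: four accessions, well covered, common variant.
example = Site(genotypes=[0, 2, 0, 2], depths=[10, 8, 6, 12])
print(keep_site(example))
```

The thresholds are exposed as parameters so that the same check can be reused with other cut-offs.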
Population structure was analyzed with ADMIXTURE, which implements a model-based maximum likelihood approach (Alexander et al. 2009), and with PCA. ADMIXTURE is a clustering software similar to STRUCTURE that infers populations and individual ancestries. SNPs in VCF format were converted to PLINK format by using PLINK (Chang et al. 2015). Linked sites, monomorphic or multiallelic sites, and sites with MAF < 5% or Phred quality < 30 were removed by using PLINK. We tested clustering models in the population with presumed cluster number K from 2 to 15. In addition, PLINK was used for PCA on the same genotype likelihood dataset. R 4.03 was used for structure analysis and PCA plots (R Core Team 2013).
To detect selection on the target gene (GS3), the 280 accessions with the type 2 mutation were used. DNA sequences were aligned by using MUSCLE (Edgar 2004a, b). The −1- to +1-kb region of the gene corresponding to the Nipponbare genome was used for Bayesian inference implemented in BEAST2 (v2.7.7) (Bouckaert et al. 2014) to reconstruct the early relationships. The parameters for the BEAST run were as follows: site model, HKY (Hasegawa Kishino Yano); molecular clock models, strict clock or fixed local clock; tree priors, default.
To check the insertions in the tandem genes of flavone glucosyltransferase, IGV (Thorvaldsdóttir et al. 2013) was used to visualize the 30-Mb region of chromosome 1.
To identify the signatures of selective sweeps in the chromosome 1 long arm region, we used Raised Accuracy in Sweep Detection (RAiSD v2.9). Mu (µ) was computed by integrating multiple genomic signals, including the site frequency spectrum (SFS) and linkage disequilibrium (LD) patterns (Alachiotis and Pavlidis 2018). Analysis parameters included a window size of 50 SNPs and evaluation of the µ statistic across 10,000 equidistant grid points per chromosome. Results were visualized as Manhattan plots with custom R scripts.
We used the ±5-kb region of each gene, including the promoter, ORF, intron, and untranslated region (UTR), for all 17 genes of the 14 subgroups for analyses of genetic distance and diversity. Nucleotide diversity (π) (Cingolani et al. 2012), θw (Watterson 1975), and Tajima’s D (Tajima 1989) were calculated by using DnaSP v5.0 (Rozas et al. 2017). Insertion, deletion, and unalignable regions were excluded from the analysis.
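For readers unfamiliar with these statistics, the following minimal sketch shows how π, θw and Tajima’s D relate to one another for a set of aligned sequences. It is an illustration under stated assumptions, not the DnaSP implementation: the function name `diversity_stats` and the toy alignment are hypothetical, and any alignment column containing a character other than A, C, G or T is skipped, mirroring the exclusion of insertion, deletion and unalignable regions.

```python
from itertools import combinations
from math import sqrt


def diversity_stats(seqs):
    """Return per-site pi, per-site Watterson's theta, Tajima's D and S.

    seqs: list of equal-length aligned DNA strings (at least 2 sequences).
    Columns with any character outside A/C/G/T are excluded.
    """
    n = len(seqs)
    cols = [i for i in range(len(seqs[0])) if all(s[i] in "ACGT" for s in seqs)]
    # Number of segregating sites among the retained columns.
    seg_sites = sum(1 for i in cols if len({s[i] for s in seqs}) > 1)
    # Mean number of pairwise differences (k) over all sequence pairs.
    pairs = list(combinations(seqs, 2))
    k = sum(sum(x[i] != y[i] for i in cols) for x, y in pairs) / len(pairs)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    theta_w = seg_sites / a1
    tajima_d = float("nan")
    if seg_sites > 0:
        # Variance terms from Tajima (1989).
        b1 = (n + 1) / (3.0 * (n - 1))
        b2 = 2.0 * (n ** 2 + n + 3) / (9.0 * n * (n - 1))
        c1 = b1 - 1.0 / a1
        c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
        e1 = c1 / a1
        e2 = c2 / (a1 ** 2 + a2)
        variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
        if variance > 0:
            tajima_d = (k - theta_w) / sqrt(variance)
    length = len(cols)
    return {
        "pi": k / length,
        "theta_w": theta_w / length,
        "tajima_d": tajima_d,
        "segregating_sites": seg_sites,
    }


# Assumed toy alignment of four sequences.
print(diversity_stats(["ACGT", "ACGA", "ACTA", "ACTA"]))
```

A positive D indicates an excess of intermediate-frequency variants relative to neutral expectation, and a negative D an excess of rare variants.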
Protein Structure Prediction
The protein structures of Os01g0734800 and Os01g0734600 were predicted by using Boltz-1 (Wohlwend et al. 2024), an AlphaFold3 (Abramson et al. 2024)-based tool available on the Neurosnap website (https://neurosnap.ai/). The predicted protein structures were then analyzed for their binding interactions with the flavonoid compound kaempferol.
Statistical Analysis and Scoring of Grain-Related Traits
To characterize and compare the grain morphologic traits across excavated sites and 12 subpopulations, we implemented a comprehensive statistical analysis framework. Statistical analyses involved using R 4.4.2 (R Core Team 2024) and the analytical approach comprised the following methodology: (1) raw data were processed and grouped by using the tidyverse ecosystem (Wickham et al. 2019); (2) one-way analysis of variance (ANOVA) was conducted individually for each trait; (3) post-hoc analysis involved using Tukey’s Honest Significant Difference (HSD) test to identify significant pairwise differences between subpopulations (α = 0.05); (4) for each excavated site or subpopulation and trait combination, we calculated comprehensive summary statistics including mean values, quantiles (25%, 50%, 75%), maximum values, and sample sizes (n); and (5) distribution patterns were visualized with violin plots created with ggplot2 (Wickham 2011).
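The core of steps (2) and (4) can be sketched without external libraries. The study used R; the Python functions below (`one_way_anova_f`, `summarize`) and the example measurements are hypothetical and only illustrate how the F statistic and the summary statistics are derived. The p-value lookup and the Tukey HSD post-hoc test are deliberately omitted.

```python
from statistics import mean, quantiles


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a dict of group -> list of values."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    # Between-group and within-group sums of squares.
    ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def summarize(values):
    """Mean, quartiles, maximum and sample size for one group and trait."""
    # method="inclusive" treats the data as the whole population of interest.
    q25, q50, q75 = quantiles(values, n=4, method="inclusive")
    return {"n": len(values), "mean": mean(values),
            "q25": q25, "median": q50, "q75": q75, "max": max(values)}


# Assumed seed lengths (mm) for two hypothetical groups.
lengths = {"group_a": [3.0, 3.5, 4.0], "group_b": [5.0, 5.5, 6.0]}
print(one_way_anova_f(lengths))
print(summarize(lengths["group_a"]))
```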
Several grain-related traits of the Taiwan landraces were screened and recorded. Long awn: awn > 3 cm. Red caryopsis: the palea and lemma were removed, and the caryopsis color was recorded. Glutinous phenotype: 0.2 ml of 0.2% iodine reagent was added to the endosperm powder from 3 polished seeds, and the color change (if any) was recorded after 5 min. Seed width and seed length: measured with a digital ruler (Mitutoyo Co., Japan). Twenty seeds per sample were scored.
Results
Very Early Rice Grown in Taiwan Had Small Grains
Figure 1A to C show details of carbonized rice seeds from the excavated sites in Taiwan. Using 100 carbonized seeds from each of the 4 excavated sites, we studied seed size changes by measuring the seed length, width and thickness (Fig. 1D, Table S4). ANOVA results showed significant differences in size parameters among the 4 excavated sites (Table S5). Seed length before 3,300 BP was < 4 mm, much shorter than that of seeds from the later sites, which averaged 5.3 and 6.2 mm at HLL and WCT, respectively (Table S6). Some of these seeds were twice the size of seeds from the 2 earlier sites, as shown in Fig. 1D. Early rice seeds in Taiwan thus changed over time, from small and round to large and oblong.
Fig. 1 Carbonized rice seeds excavated in Taiwan. A A group of carbonized rice grains, bar = 5 mm. B Scanning electron microscope (SEM) image of a complete carbonized rice grain, bar = 2 mm. C Zoom-in SEM image of the tip of a carbonized grain showing the base of an awn, bar = 500 μm. D Distribution of seed length, width and thickness (in mm) of 100 seeds from each site. Green: Nankuanli East (NKLE), ~4800 BP at Tainan; purple: Youhsienfang (YHF), ~3800 BP at Tainan; cyan: Wuchiantsuo (WCT), ~1400–500 BP at Tainan; orange: Huilaili (HLL), ~1300 BP at Taichung
Specific Rice Landraces from Taiwan
Many rice heritage landraces had been cultivated by indigenous people living in the hill and mountainous regions, with an upland practice, from about 4500 years ago (Hsieh et al. 2011). A total of 60 upland rice accessions were collected from indigenous villages from 1895 to the early 1910s. These accessions have since been propagated (renewed) every 5 years by rice breeders, and seeds were stored in TARI (Teng 1999). Only the names were recorded for this seed resource, without other information such as collection year, village or tribe. In cooperation with linguists, possible tribe information was proposed for 10 accessions, and these are listed in Table S7. Some rice breeders continued to visit mountain regions and collected more accessions in recent years. They added another 56 lines, with information on tribes collected (Table S7).
There are 16 indigenous tribes in Taiwan. The rice accessions collected in these villages are highly diverse in seed morphology, plant height, plant stature, and heading behavior. For instance, 16 accessions were collected from a “CD” village of the Tsou tribe. Figure S1 illustrates the plant architecture during heading and the mature seeds and caryopses of 4 accessions, CD1, CD5, CD7 and CD10. These plants were transplanted into the open field on the Academia Sinica campus at the same time, but the developmental stage of these 4 accessions varied from early heading to seed maturity (i.e., panicle bent down) (Fig. S1A to D). The grain shape, grain color and length of awns (Fig. S1E to H) as well as the caryopsis color (Fig. S1I to L) also show variations. Table S8 lists the subspecies (japonica or indica), awn length, awn color, grain color, caryopsis color, glutinous type, and grain length, width, length/width ratio and area index (length × width) for each of the 10 lines from the CD village. The data indicate that grain-related traits such as the presence and length of awns, grain type, color, size and caryopsis color, glutinous trait, etc. show large variations. Similar variations are present among lines collected within or between tribes.
We received 16 accessions from this CD village that had been stored at ambient temperature for decades, and only 10 were rescued (germinated) successfully. Similar problems existed for the seeds collected from many other villages. Together, we estimate that 30–40 accessions were lost (failed to germinate).
The Han people migrated from the southeast coastal area of China, mainly Fujian and Guangdong, to Taiwan during the late Ming to early Ching Dynasty. According to the literature, there were 1,679 Ming-Ching rice accessions during the survey in 1906, and the breeders reduced these to 547 lines after screening and elimination according to similarity in morphology (Iso 1944). These resources are also stored in NPGRC. They are representative of the early landraces existing about 400 years ago in southern China. Weedy rice, also known as red rice, has been a problem in rice production in Taiwan in recent years. All these weedy rice lines are indica but grew in and contaminated japonica fields (Wu et al. 2020b; Huang et al. 2021). In addition, 2 wild rice accessions that existed in northern Taiwan were used for the current study.
We then compared the grain shape of heritage landraces and modern varieties. Figure 2 illustrates seeds of 6 indigenous japonica lines, 5 japonica modern varieties, 3 indigenous indica lines, 5 Ming-Ching indica lines and 5 modern indica lines. Of note, (1) 2 types of Taiwan indigenous japonica seeds were present (e.g., the 4 on the left are larger than the 2 on the right), and the latter 2 are similar in seed size to modern temperate japonica varieties; and (2) the Ming-Ching indica rice grains were smaller, both in length and length/width ratio, than the modern indica seeds. Thus, several questions were raised: (1) Did qSW5, a seed width-controlling gene, have other LOF mutation(s) in addition to the 1.2-kb deletion?; (2) Which type(s) of the Taiwan indigenous japonica rice were, and what were the reason(s) for their larger size as compared with the early carbonized seeds?; and (3) What were the reasons for the short seed length of the Ming-Ching rice accessions?
Fig. 2 Grain shapes of several landraces and modern rice varieties. Six Taiwan indigenous japonica lines, 5 modern japonica varieties, 3 Taiwan indigenous indica lines, 5 Ming-Ching indica lines and 5 modern indica varieties. Scale bar: 10 mm. Several accession names are abbreviated: TC194, Taichung 194; TNG67, Tainung 67; TNG71, Tainung 71; TC65, Taichung 65; KH 145, Kaohsiung 145; SML, SzuMingLuTao; DGWG, Dee Geo Woo Gen; TNGS20, Tainung Sen 20; TNGS22, Tainung Sen 22; TCS10, Taichung Sen 10; TN17, Taichung Sen 17; TNGSW2, Tainung Sen Naw 2
To address these questions, in addition to the Taiwan landraces, we included in the analysis panel 855 rice accessions from the 3 K data marked as “T” and weedy rice (also known as red rice, Huang et al. 2021) sequencing data from Taiwan and China. Data for 78 modern varieties and 20 wild rice accessions were also added. Together, < 15% of the lines used had been improved by modern breeding programs. Information on the accessions’ IDs, names, collection regions, seed storage centers, DNA sequencing accessions, etc. is available in Table S3. The GPS information for the collection sites of most accessions is also available in the same table. That information was obtained from Gutaker et al. (2020), and we added the data for accessions from Taiwan.
The grain size information for about 3 K rice accessions is from the IRRI SNP-Seek Database (Mansueto et al. 2016). We measured the grain size information for the accessions from Taiwan. Thus, we had grain size information for 1196 of the 1365 accessions, listed in Table S9. With the morphologic information, we also performed allele mining of 10 seed size-/shape-related genes followed by in-depth comparisons. We also checked another 5 grain morphology-related genes and 2 adaptation-related genes.
Genome Sequence Analysis Results
The average sequencing depth of the panel was about 16 X, with 0.7 million SNPs per accession. Population structure analysis with ADMIXTURE at K = 2 revealed 2 major groups, corresponding to the 2 subspecies japonica and indica, and the charts for K = 2 to K = 15 are shown in Fig. S2. Figure S3 illustrates the cross-validation (CV) error versus K in the admixture analysis, and we chose K = 14 because of its low CV error. Figure 3A shows the K = 14 structure analysis and Fig. 3B the phylogenetic tree analysis. Indica rice accessions were grouped into 6 categories: V8 are accessions mainly from the Indian subcontinent; V12 are accessions mainly from the Indochina Peninsula; V11 are accessions mainly from insular SEA; V6 are indica rice, with more than half from Taiwan Ming-Ching and indigenous villages, the remainder being Chinese weedy rice; V13 are accessions with about half from Taiwan Ming-Ching and a few from indigenous villages, the other half being Chinese landraces and weedy rice; and V9 are aus. Japonica rice accessions were also grouped into 6 categories: V2 are temperate japonica, including modern varieties and some Taiwan indigenous accessions; V1 are japonica accessions all collected from Taiwan indigenous villages; and V4 are trop2 japonica rice mainly from the Indochina Peninsula. Trop1 japonica accessions were grouped into 3 types: V5 were collected from Japan, Korea and Taiwan indigenous villages, that is, from the northern region; both V7 and V10 are trop1 japonica from insular SEA; 89% of V7 were collected from western Indonesia, including Sumatra and Java; and V10 are lines collected from all tropical insular regions, including eastern Indonesia. Most wild rice (V3) and weedy rice (V14) accessions were grouped separately. Brief and detailed descriptions of these 14 subgroups are shown in Table 1 and Table S10, respectively. The color code for each group was kept the same in all figures, as shown in Table 1. The detailed subgroup information for all 1365 accessions is in Table S3.
Fig. 3 Population structure of Asian rice landraces. A Structure analysis showing K = 14 subgroups; B Phylogenetic analysis of all subgroups. Color code for each subspecies, subgroup, and area: japonica: V7, trop1, insular, violet; V10, trop1, insular, peru; V5, trop1, northern, cornflower blue; V1, trop1, Taiwan indigenous, light salmon; V4, trop2, Indochina, sky blue; V2, temp, medium aquamarine. indica: V6, primitive, thistle; V14, weedy, dark olive green; V13, primitive, dark sea green; V11, insular, light coral; V12, Indochina, yellow green; V8, India, orchid; V9, aus, pale violet red. Other Oryza species, V3, light slate gray
The phylogenetic analyses revealed clear differentiation of the panel into 2 major groups – indica and japonica (Fig. 3B, with each subgroup indicated). Trop1 japonica was divided into 4 clades: V7 (violet), V10 (peru), V5 (cornflower blue) and V1 (light salmon). V4 (sky blue) and V2 (medium aquamarine) were in another clade. A total of 12 wild rice accessions were clustered with japonica rice: W1944, W0104, W1739, W2012, W1943, W3095, W3078, W1777, W1979, TWR1, TWR72, and O. barthii. For indica rice, V9 (pale violet red) were aus and were distinct from other indica clades. V11 (light coral), V12 (yellow green) and V8 (orchid) were relatively close, and V6 (thistle), V13 (dark sea green) and V14 (dark olive green) were relatively close. Four wild rice accessions, W0170, W1698, W1687 and W1718, were clustered with indica rice. The colors for structure and phylogenetic analyses fit well, indicating that the classification is reasonable.
The Taiwan indigenous rice accessions were grouped into several subgroups. For japonica, all 43 lines in V1 are indigenous and have relatively primitive traits, that is, tall plants, very long awns (e.g., some > 8 cm), mostly red caryopsis and relatively high shattering, etc. V2 (temperate japonica) included 20 accessions collected from indigenous villages. They are relatively modern, have no awn, rarely have red caryopsis, and are less shattering; V4, V5 and V10 include 10, 6 and 2 indigenous lines, respectively, and are relatively primitive. No Taiwan indigenous rice accessions are found in subgroup V7, the insular trop1 group from western SEA. For the indica type, V6, V8 and V13 include 34, 1 and 4 indigenous lines, respectively, and all Taiwan indigenous indica rice accessions had no awn, and most had white caryopsis and were less shattering. Also, subgroups V11 and V12, the insular SEA and Indochina indica subgroups, contained no indigenous indica rice accessions.
The PCA results indicated japonica lineages positioned on the left and indica lineages on the right along principal component 1 (PC1) (Fig. S4A, B). PC2 further distinguished indica accessions (Fig. S4A), and PC3 separated japonica accessions (Fig. S4B), with each subgroup circled and labeled. The wild relatives (V3) were located at the center of PC1. The aus accessions (V9) were located at the top and were well separated from the indica of the Indian subcontinent (V8). Those from Indochina (V12) and from insular SEA (V11) could be distinguished, whereas those from China and Taiwan (V6 and V13) as well as weedy rice (V14) were scattered. The temperate japonica accessions (V2) were located at the very top of PC3, followed by V1, the trop1 of primitive Taiwan indigenous accessions. Of note, V1 was closer to V2 than to V4. Trop2 japonica lines from Indochina (V4) and trop1 lines from insular SEA (V7 and V10) were at the bottom. The northern trop1 accessions (V5) were separated from other trop1 accessions (V1, V7 and V10). Taiwan indigenous japonica accessions are scattered from the top (V2, V1) to the bottom (V4, V5, V10) along the PC3 scale (Fig. S4, and Table S2).
Grain Size Varied in These Subgroups; Many Rice Accessions Contained Different Alleles in Grain-Controlling Genes
Rice grain size is a quantitative trait and is controlled by multiple genes. Sequence changes of several grain-controlling genes led to variations in seed size. We searched for different alleles mentioned in Methods. For instance, a G-to-T SNP occurring in GS3 led to an early stop and thus a larger grain (Fan et al. 2006). We found another 5 mutations leading to a frameshift, early-stop or splicing site change, also leading to LOF GS3 gene, in the panel we used. Table S11 shows the gene names, locus ID, references, mutation positions and types, and sequence changes for all GS3 LOF mutations. Likewise, the allele information for all genes are listed in the same table. The detail information of the mutated alleles of each gene for each accession are in Table S3. If the alleles are heterozygous, they are marked as a reference sequence. Other grain size-related genes such as WTG1 (Huang et al. 2017), GL2 (Che et al. 2015), GW2 (Song et al. 2007), qGSL3-1 (Qi et al. 2012), GL6 (Wang et al. 2019) and GS9 (Zhao et al. 2018a) were checked, but no mutation was found in this panel, possibly because it contained traditional landraces.
ANOVA results (Table S12) indicated significant differences in the grain size parameters among the 1196 accessions listed in Table S9. The statistical analysis for seed length, width, length/width ratio and area is shown in Table S13, and violin plots are in Figure S5, with the significance level and population size indicated. The means for each subgroup and the significance levels are in Table 2. For seed width, japonica rice (V7, V10, V5, V1, V4 and V2) was wider than all indica accessions. For seed length, only temperate japonica (V2) and 2 relatively primitive indica groups (V6 and V13) were < 8 mm. The latter two subgroups were mainly composed of Taiwan indigenous lines and Ming-Ching accessions. The slender seed shape (length/width ratio > 2.8) was present only in V11, V12, V8 and V9, the subtropical and tropical indica lines. Small seeds (smaller length × width) were mainly in temperate japonica (V2) and the relatively primitive indica landraces (V6 and V13).
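As a minimal sketch of the kind of one-way ANOVA summarized in Table S12 (not the authors' actual script, which is not described in this section), the F statistic for one grain size parameter across subgroups can be computed as follows. The group labels and width values are hypothetical and serve only as an illustration.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of measurements per subgroup.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of subgroup means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual grains around their own subgroup mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical grain widths (mm); not values from Table S9
widths = {
    "japonica_like": [3.2, 3.4, 3.3, 3.5],
    "indica_like": [2.6, 2.8, 2.7, 2.5],
}
f_stat, df_b, df_w = one_way_anova_f(list(widths.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.1f}")
```

A large F value relative to the F distribution with the given degrees of freedom indicates that subgroup means differ more than expected from within-subgroup variation; a post hoc test would still be needed to say which subgroups differ.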
Given the obvious seed size differences, we then compared the alleles of the 10 grain size-related genes by using pie charts. Because sample sizes differed among subgroups, the allele mining results in Table S3 were converted into percentages. The functional alleles (in grey) and mutated alleles (other colors, one color per kind of mutation) of the 10 grain size-related genes for each subgroup are illustrated in Fig. 4.
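The conversion of per-accession allele calls into subgroup percentages can be sketched as below. This is an illustrative assumption about the bookkeeping, not the authors' pipeline; the function name `allele_percentages` and the example calls are hypothetical.

```python
from collections import Counter


def allele_percentages(calls):
    """Map subgroup -> {allele type: percentage} from (subgroup, allele) pairs."""
    counts = {}
    for subgroup, allele in calls:
        counts.setdefault(subgroup, Counter())[allele] += 1
    percentages = {}
    for subgroup, counter in counts.items():
        total = sum(counter.values())
        # Normalizing by subgroup size makes subgroups of unequal size comparable
        percentages[subgroup] = {
            allele: round(100 * n / total, 1) for allele, n in counter.items()
        }
    return percentages


# Hypothetical allele calls for one gene; not data from Table S3
calls = [
    ("V1", "type3"), ("V1", "type3"), ("V1", "type3"), ("V1", "functional"),
    ("V2", "functional"), ("V2", "functional"),
]
result = allele_percentages(calls)
print(result)
```

Each inner dictionary corresponds to one pie chart: the slices for a subgroup sum to 100% regardless of how many accessions the subgroup contains.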
There were 2 kinds of mutations in the LGY3 gene. Type 1 was local, present in 1.0% of V9. Type 2 was present in all indica subgroups: 5.2, 5.2, 24.8, 13.7, 17.6 and 5.9% for V6, V13, V11, V12, V8 and V9, respectively. In the 6 japonica subgroups, the type 2 mutation was present in 44.2, 40.1, 34.8, 18.6, 46.5 and 11.4% for V7, V10, V5, V1, V4 and V2, respectively.
There were 5 types of LOF alleles for the GS3 gene. Type 1 was local, present in 2.0% of V9. Type 2 was present in all indica subgroups: 28.9, 61.9, 5.1, 15.14, 25.4 and 68.6% for V6, V13, V11, V12, V8 and V9, respectively. Type 3 was present in all indica and japonica subgroups: 25.8, 6.2, 29.3, 56.0, 13.4 and 1.0% for V6, V13, V11, V12, V8 and V9, respectively, of indica; and 88.5, 65.5, 4.4, 95.4, 83.1 and 3.4% for V7, V10, V5, V1, V4 and V2, respectively, of japonica. Type 5 was local, present in 0.6 and 0.5% of V11 and V12, respectively. Type 6 was also local, present in 1.0% of V9 only.
There were 3 types of qTGW3 mutations. Type 1 was present in wild rice. Type 2 was local, present in 28.2% of V4. Type 3 was also local, present in 32.6 and 1.4% of V1 and V4, respectively.
A total of 3 deletions were found in the qSW5 gene region (Fig. 4 and Table S6). The 0.95-kb deletion was present in 79.4, 88.5, 40.4, 42.2, 45.0 and 19.4% for V6, V13, V11, V12, V8 and V9, respectively, of indica; and 63.5, 4.9, 4.3, 0, 2.3 and 0.9% for V7, V10, V5, V1, V4 and V2, respectively, of japonica. The 1.2-kb deletion was present in 1.5, 1.3, 0, 0.46, 2.9 and 0% for V6, V13, V11, V12, V8 and V9, respectively, of indica; and 0, 18.3, 91.3, 90.7, 54.3 and 89.0% for V7, V10, V5, V1, V4 and V2, respectively, of japonica. The 13.4-kb deletion was present in 1.9 and 2.0% for V11 and V9, respectively, of indica; and 23.1, 15.5, 4.4, 0, 5.6 and 3.4% for V7, V10, V5, V1, V4 and V2, respectively, of japonica. Of note, the 0.95-kb deletion was located 593 bp downstream (3’) of the qSW5 gene (Fig. S6).
ANOVA results indicated significant differences among accessions carrying the different qSW5 deletions (Table S14). Grain width was significantly larger for accessions with the 13.4-kb deletion than for those with the 1.2-kb deletion, and for those with the 13.4- or 1.2-kb deletion than for those with the 0.95-kb deletion. Grain width was also larger for accessions with the 0.95-kb deletion than for the wild type (i.e., without any deletion), but to a lesser extent (Figure S7 and Table S15).
Only one type of GS5 mutation was found, and in indica rice only: it was present in 23.7, 6.2, 70.1, 76.2, 14.1 and 3.9% for V6, V13, V11, V12, V8 and V9, respectively.
There were 2 types of TGW6 mutations. Type 1 was local, present in 1.7% of V2 only. Type 2 was found only in indica rice: 3.1, 38.1, 21.7, 5.1, 3.5 and 7.8% for V6, V13, V11, V12, V8 and V9, respectively.
GW6a mutations occurred at 2 positions in the promoter region, designated types 1 and 2, 300 bp apart. Type 1 alone was local, present in 0.5 and 1.1% of V12 and V8, respectively. In all other cases, types 1 and 2 occurred together, in 79.4, 74.2, 96.8, 60.1, 0 and 35.3% for V6, V13, V11, V12, V8 and V9, respectively, of indica; and 7.7, 0.7 and 2.8% for V7, V10 and V4, respectively, of japonica.
There were 4 types of GW7 mutations: type 1, type 2, type 3, and types 1 and 3 together. Type 1 was present in 19.6, 2.1, 0.6, 0, 0.1 and 2.9% for V6, V13, V11, V12, V8 and V9, respectively, of indica; and 0, 0.7, 8.7, 13.6, 1.4 and 50.3% for V7, V10, V5, V1, V4 and V2, respectively, of japonica. Type 2 was present in 72.2, 87.6, 96.2, 99.1, 95.1 and 61.8% for V6, V13, V11, V12, V8 and V9, respectively, of indica; and 0.6% for V2 of japonica. Type 3 alone was present only in japonica: 1.9, 2.1 and 5.6% for V7, V10 and V4, respectively. Types 1 and 3 together were present in only 2 indica subgroups, 3.1 and 0.7% in V6 and V8, respectively, but in all japonica subgroups: 98.1, 95.8, 87.0, 86.1, 91.6 and 5.1% for V7, V10, V5, V1, V4 and V2, respectively.
For the GLW7 gene, the type 1 mutation was present in 85.6, 93.8, 77.7, 81.2, 84.5 and 86.3% for V6, V13, V11, V12, V8 and V9, respectively, of indica; and 13.5, 58.5, 34.8, 18.6, 28.2 and 2.3% for V7, V10, V5, V1, V4 and V2, respectively, of japonica. For the GW8 gene, the type 1 mutation was present in 95.9, 100, 96.2, 98.2, 97.2 and 67.7% for V6, V13, V11, V12, V8 and V9, respectively, of indica; and 0, 27.5, 82.6, 2.3, 42.3 and 1.7% for V7, V10, V5, V1, V4 and V2, respectively, of japonica.
Together, these results showed that a few mutations in the grain size-related genes were local, such as those in qTGW3. However, most were either present at relatively high percentages in indica accessions only, such as those in GS5, TGW6 and GW6a, or distributed across both indica and japonica accessions from Taiwan, Indochina and insular SEA to the Indian subcontinent, such as those in LGY3, GS3, qSW5, GW7, GLW7 and GW8. The data indicate that rice grain size has long been a target of selection.
Allele Mining of Other Genes Related To Seed Morphology and Adaptation
In addition to the grain size-controlling genes, we performed similar allele mining analyses of seed morphology-related genes. The proportion of each allele type is also shown in pie charts (Fig. 5). The allele information is in Table S11 and the allele type of each accession is in Table S3. The LOF mutation of the Rc gene, which results in a white caryopsis, was present in 79.4, 77.6, 93.5, 95.4, 64.2 and 3.1% for V6, V13, V11, V12, V8 and V9, respectively, of indica; and 93.9, 80.1, 100, 55.8, 83.1 and 93.2% for V7, V10, V5, V1, V4 and V2, respectively, of japonica.
The LOF wx mutation led to glutinous (sticky) grain (Wanchana et al. 2003). The 22-bp deletion was present in 19.4, 6.5, 5.9, 30.0, 0.7 and 0% for V6, V13, V11, V12, V8, and V9, respectively, of indica; and 34.0, 8.5, 50.0, 88.1, 41.8 and 6.1% for V7, V10, V5, V1, V4 and V2, respectively, of japonica.
Three non-shattering seed genes have been cloned and the mutated SNP information is available: qSH1 (Konishi et al. 2006), qSH3 (Ishikawa et al. 2022) and sh4 (Li et al. 2006). The qSH1 mutation was present in 4.1 and 0.5% for V6 and V12, respectively, of indica, and 12.4% for V2 of japonica. The qSH3 mutation was present in 100, 99.0, 99.4, 100, 95.1 and 12.2% for V6, V13, V11, V12, V8 and V9, respectively, of indica; and 100, 95.0, 100, 100, 100 and 100% for V7, V10, V5, V1, V4 and V2, respectively, of japonica. The sh4 mutation was present in 81.3, 98.0, 99.4, 100, 98.6 and 97.1% for V6, V13, V11, V12, V8 and V9, respectively, of indica; and 100% for all the japonica subgroups.
We also performed allele mining of 2 adaptation-related genes: Cold1 (Ma et al. 2015) and Heading date 1 (Hd1; Yano et al. 2000; Takahashi and Shimamoto 2011; Wu et al. 2020a). Cold1 is related to chilling tolerance; an SNP located in exon 4 of Cold1 is associated with adaptation to cold environments in temperate japonica rice. The alleles at this SNP distinguished the subgroups well (Fig. 5): A was present in 95.5, 98.7, 100, 100, 85.1 and 27.4% for V6, V13, V11, V12, V8 and V9, respectively, of indica; and 37.3, 72.7, 86.4, 37.2, 72.9 and 1.7% for V7, V10, V5, V1, V4 and V2, respectively, of japonica. T was present in 4.5, 1.3, 0, 0, 0.7 and 0% for V6, V13, V11, V12, V8 and V9, respectively, of indica; and 62.8, 27.3, 13.6, 62.8, 27.1 and 98.3% for V7, V10, V5, V1, V4 and V2, respectively, of japonica. G was present only in indica of the Indian subcontinent, with 13.6 and 72.6% for V8 and V9, respectively.
Rice is a short-day plant; its sensitivity to photoperiod is important for adaptation, especially for rice grown in sub-tropical or tropical regions. As one of the important genes for photoperiod sensitivity, Hd1 contains several LOF mutations leading to loss of sensitivity (Yano et al. 2000; Takahashi and Shimamoto 2011; Wu et al. 2020a). The allele definitions follow Takahashi and Shimamoto (2011) and Wu et al. (2020a). Figure 5 shows 6 different mutations and the wild type in these subgroups. Type 13 was present in 8.8, 10.3, 18.0, 0.9, 4.3 and 72.5% for V6, V13, V11, V12, V8 and V9, respectively, of indica; and 63.5, 90.9, 4.4, 2.3, 0 and 5.1% for V7, V10, V5, V1, V4 and V2, respectively, of japonica. Type 3 was local, present in 1.5 and 20.5% of V6 and V13, respectively. Type 7 was present in indica only, in 16.2, 42.3, 10.9, 2.8, 4.3 and 0% for V6, V13, V11, V12, V8 and V9, respectively. Type 19 was present in japonica only, in 0, 0.7, 0, 2.3, 0 and 42.4% for V7, V10, V5, V1, V4 and V2, respectively. Type 12 was local, present in 14.1% of V4. Type 20 was also local, present in 5.7% of V8.
Some Indigenous Japonica Rice Accessions Showed Selection Related To UV Tolerance
By using 24 Taiwan indigenous japonica accessions together with other SEA landraces, we previously showed that some Taiwan japonica lines were associated with genomic regions under strong selection (Alam et al. 2021). The candidate gene was suggested to be OsUGT706D1, which could be related to UV tolerance (Peng et al. 2017). In fact, the region, at 30.7 Mb on chromosome 1, contains 6 tandem flavone glucosyltransferase genes: Os01t0734600, Os01t0734800, Os01t0735300, Os01t0735900, Os01t0736100 and Os01t0736300 (also known as OsUGT706D1).
In the present study, we performed whole-genome re-sequencing of all Taiwan indigenous lines, including 77 japonica and 39 indica accessions. RAiSD (Alachiotis and Pavlidis 2018) was used to survey positive selection in the long-arm region of chromosome 1 in 6 Taiwan rice populations: indigenous japonica, indigenous indica, Ming-Ching (indica), modern japonica, modern indica and weedy rice (all indica). Figure S8 indicates a high selection peak at the 30.7-Mb region of chromosome 1 in indigenous japonica only. (1) IGV analysis was used to check for insertions near this region and revealed no newly transposed transposable element. (2) Amino acid changes in these 6 proteins in indigenous lines were checked by using SnpEff (Cingolani et al. 2012). Many randomly distributed missense mutations occurred in these lines, but only 2 proteins had specific mutations. Data mining suggested that the selected genes were Os01t0734600 and Os01t0734800 and that 20 indigenous japonica accessions were the selected accessions (Table S16). These included 4 trop1, 11 trop2 and 5 temperate japonica lines. These accessions carried specific amino acid mutations in the ORFs of the 2 proteins (listed in Tables S17 and S18).
Uridine diphosphate-dependent glycosyltransferases (UGTs) generally adopt a two-domain architecture, consisting of N-terminal and C-terminal domains arranged opposite one another. A catalytic pocket is formed between these two domains, serving as the site for interactions with both the glycosyl donor and acceptor substrates. The glycosyl donor typically interacts with the C-terminal domain, while the acceptor substrate binds to the N-terminal domain (Liu et al. 2025). Among the five mutation sites identified in Os01g0734600, four are located within the glucosyl donor binding region, suggesting potential effects on the affinity between the protein and the glucosyl donor (Fig. S9). All mutation sites in Os01g0734800 are located in the glucosyl acceptor binding region, specifically on the outer sides of two peripheral helices (helix wheel positions) (Fig. S10). These mutations may affect the interaction between the protein and the glucosyl acceptor, or with other molecules. These predictions await experimental verification.
Discussion
The current study details the story behind the early carbonated rice grains versus the indigenous rice currently cultivated in Taiwan. We performed whole-genome re-sequencing and analyzed seed morphology in a panel of 1365 rice accessions, about 85% of which were landraces, weedy rice and wild rice. We show the sequence changes and the fluctuation in early seed size, and how genes related to grain size have changed. We also show that Taiwan indigenous rice accessions carry important traits, including tolerance to UV, drought and flooding.
Several Mutations Could Lead To Larger Seeds as Compared with Very Early Rice
In one of our previous studies, very early (before 3300 BP) carbonated rice grains from 5 excavated sites in southern Taiwan were small: the average length, width and length/width ratio were 3.91 mm, 2.35 mm and 1.68, respectively (Hsu et al. 2019). In the current study, we performed statistical analyses of 100 seeds from 4 excavated sites and also showed that rice seeds collected before 3300 BP were tiny (Fig. 1D, Tables S4 and S5). For instance, the average grain length at the NKLE site was 3.83 mm (min 2.92, max 5.00, mode 4.05, median 3.79), with a coefficient of variation of 10.29%. This value was quite high compared with modern varieties, indicating large variation in grain size in ancient times. Our previous studies indicated that early Taiwan agriculture came from northern China (Sagart et al. 2018), so we checked the rice seed size of early northern China (before 5000 BP) and Korea (around 3300 BP). The average carbonated rice grain length was 4.1 mm at the Yuezhuang site (6060-5750 BP), northern China (Crawford et al. 2006). At the Baligang site (northern China), there were 4 carbonated grains from the pre-Yangshao period (8700-8300 BP), with length from 4.0 to 4.8 mm, and 119 grains from the Yangshao period (6300-5000 BP), with length from 3.1 to 5.0 mm (Deng et al. 2015). In the early Mumun period (~3300 BP) in Korea, with a total of 7 excavated sites, the average length of carbonated rice grains was 3.3 mm (Ahn 2010). Hence, very early rice grains in northern China, Korea and Taiwan were tiny as compared with later grains, ranging from 2.9 to 5.0 mm in length.
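The summary statistics quoted above (mean, minimum, maximum, mode, median and coefficient of variation) can be reproduced for any set of grain lengths with a short sketch such as the following. It assumes the coefficient of variation is the sample standard deviation divided by the mean, which this section does not state explicitly, and the measurements are hypothetical. Under the same assumption, the reported NKLE mean of 3.83 mm and coefficient of variation of 10.29% would imply a standard deviation of about 0.39 mm.

```python
import statistics


def summarize_lengths(lengths_mm):
    """Return descriptive statistics for a list of grain lengths in mm."""
    avg = statistics.mean(lengths_mm)
    # Assumption: sample standard deviation (n - 1 denominator)
    sd = statistics.stdev(lengths_mm)
    return {
        "mean": avg,
        "min": min(lengths_mm),
        "max": max(lengths_mm),
        "mode": statistics.mode(lengths_mm),
        "median": statistics.median(lengths_mm),
        "cv_percent": 100 * sd / avg,
    }


# Hypothetical carbonated grain lengths (mm); not measurements from Table S4
summary = summarize_lengths([3.5, 3.8, 4.0, 4.0, 4.2])
print(summary)
```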
Many studies have reproduced charring experimentally to assess the grain size changes in carbonated grains from excavated sites. Several cereals showed a decrease in grain length during the process, such as rice (Chuenwattana 2010), wheat (Renfrew 1973), barley (Renfrew 1973) and oat (Renfrew 1973). A recent study of natural charring processes, which examined the preservation biases arising in prehistoric contexts of carbonization, showed that grain length shrank by 21.05% for rice, 25.0% for proso millet and 18.9% for foxtail millet, among others (Castillo 2019). Thus, the estimated original grain length of early Taiwan rice was 4.8 mm before 3300 BP; grains then became larger, reaching 6.4 mm at HLL and 7.5 mm at WCT. For the present indigenous trop1 accessions, the grain length was 8.6 mm (Table 2).
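The shrinkage correction can be illustrated with a short calculation. This sketch assumes that the 21.05% shrinkage is expressed relative to the original (uncharred) length, so that the original length equals the charred length divided by (1 - shrinkage); the exact formula used is not given in this section. Applied to the NKLE average of 3.83 mm, it yields about 4.85 mm, in line with the estimate of roughly 4.8 mm given above.

```python
def estimate_original_length(charred_mm, shrinkage_fraction):
    """Estimate pre-charring grain length from a charred measurement.

    Assumes shrinkage is a fraction of the original length:
    charred = original * (1 - shrinkage).
    """
    if not 0 <= shrinkage_fraction < 1:
        raise ValueError("shrinkage_fraction must be in [0, 1)")
    return charred_mm / (1.0 - shrinkage_fraction)


# Length shrinkage for rice reported by Castillo (2019)
RICE_LENGTH_SHRINKAGE = 0.2105

# Average charred grain length at the NKLE site (mm)
nkle_original = estimate_original_length(3.83, RICE_LENGTH_SHRINKAGE)
print(f"Estimated original length: {nkle_original:.2f} mm")
```

If shrinkage were instead defined relative to the charred length, the correction would be a multiplication by (1 + shrinkage) and would give a smaller estimate, so the definition matters when comparing across studies.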
All carbonated rice seeds used for size measurements were caryopses, as the images in Fig. 1A show. In the present work, we used sequence information on genes controlling grain size to interpret differences between carbonated grains and current grains. According to studies of rice grain development, the size of the caryopsis is highly correlated with that of the whole grain (with palea and lemma) (Takeda and Takahashi 1970; Takita 1988), and thus allele analysis could be used to explain differences in ancient seeds excavated from archaeological sites.
According to the sequencing analysis, the most primitive Taiwan trop1 indigenous lines, V1, carried alleles leading to larger grains in LGY3 (19%), GS3 (96%), qTGW3 (33%), GLW7 (19%) and GW7 (100%). The additive effects of these alleles could be one reason for the larger grains.
In the current study, we investigated different mutations present in 10 grain size-controlling genes (summarized in Fig. 4). However, these mutations could not fully explain why the grain size of Ming-Ching rice accessions was smaller than that of most other indica rice accessions in Asia. Further detailed studies are required.
Landraces are Precious Genetic Resources
Taiwan is a small island, yet 16 different tribes have lived here for more than 4000 years. We have collected 116 indigenous rice accessions and 546 Ming-Ching accessions, all of which are maintained in the national seed stock center. In the current study, we re-sequenced 116 indigenous lines and 59 Ming-Ching lines, and all the sequencing data were submitted to the public domain. The biological materials and genome sequencing data are important resources for further studies.
All Taiwan rice accessions used in the current study are listed in Table S2, sorted by type. The most primitive indigenous trop1 group (V1) consisted of 43 accessions, 67.4% of which were collected after WWII. This finding indicates the importance of continued visits to and collection in mountain regions. All lines unable to germinate belonged to this collection, which indicates that collection should proceed as soon as possible.
The famous rice variety Taichung 65, bred in 1929, received LOF mutations of 2 important flowering time-related genes, Hd1 and early heading date 1 (Ehd1), from 2 Taiwan indigenous accessions during breeding (Wei et al. 2016). Indigenous rice lines have also been used in breeding programs, and Tunglu 1, Tunglu 2 and Tunglu 3 were designated in 1964. The other parental lines for the 3 varieties were temperate japonica varieties. According to the current sequencing data, Tunglu 3 is trop1 and the other 2 are temperate japonica. All 3 lines are resistant to rice blast, lodging and drought stress and have been cultivated under upland conditions in the indigenous villages.
We previously performed a genome-wide association study of 2 panels, an indica panel of 268 accessions and a japonica panel of 238 accessions, for phenotypes of resistance to drought, flooding and abscisic acid treatments (Wu et al. 2022). All Taiwan indigenous lines were in the panels. Table S19 lists the indigenous lines resistant to drought treatment: 16 were indica accessions and 3 japonica. Table S20 lists the indigenous lines whose shoots still grew well after 7-day flooding treatment: 3 were indica accessions and 13 japonica. Table S21 lists the indigenous lines whose roots still grew well after 7-day flooding treatment: 4 were indica accessions and 12 japonica. Of note, 4 lines grew well with both shoots and roots under flooding.
The Taiwan indigenous people have lived in hill to mountain regions, at elevations ranging from 500 to 1500 m, for thousands of years. Our studies illustrated that many of the indigenous rice accessions contained resilient traits such as tolerance to drought, flooding or UV. They certainly provide very precious resources in the current climate change era.
The semidwarf 1 (SD1) mutation played an important role in the rice green revolution. The miracle rice IR8 and one of its parental lines, Dee Geo Woo Gen (DGWG), contain a 383-bp deletion in GA20ox-2, spanning part of exon 1 to exon 2, that leads to LOF (Sasaki et al. 2002; Spielmeyer et al. 2002). DGWG was one of the Ming-Ching accessions. In addition, another 4 Ming-Ching lines, including Hsinchu-Ai-Chueh-Chien, Ti-Chueh-Wu-K’o, Ai-Tzu-Ch’ung and Liu-Tou-Tzu, were also semidwarf and contained the same mutation. W1718, one of the wild rice accessions in the current panel, which was collected from southern China, also had the same allele. All 6 accessions contained the DGWG sd1 allele and were originally from southern China. However, this allele was not present in the southern China landrace collections of the China National Rice Research Institute. Thus, these Ming-Ching lines may have been abandoned in China but maintained in Taiwan.
Bayesian Evolutionary Analysis Indicated GS3 Gene Mutation Occurrence, Introgression and Expansion
The type 2 mutation of the GS3 gene was relatively widely distributed, from Taiwan, Indochina and insular SEA to the Indian subcontinent, and was found in japonica, indica and aus accessions, as shown in Fig. 4. Thus, we further examined the gene by using Bayesian Evolutionary Analysis by Sampling Trees (BEAST). To elucidate the relationships, we used fragments containing the whole gene plus 1 kb of upstream and 1 kb of downstream sequence. The 280 accessions carrying the mutation and all 20 wild rice accessions were used in the analysis.
The BEAST results could be separated into 6 sessions (Fig. S11). Wild rice accessions were grouped into Wa and Wb. Session I was close to Wa, indicating that the mutation might have occurred in trop1 japonica of insular SEA (V10). It was followed by session II, which consisted mainly of V13 with a few V8 accessions. The next clade started with another group of wild rice, Wb. Session III was a small clade consisting mainly of V6, followed by session IV, which consisted entirely of V9. Together, the results suggested that the GS3 type 2 mutation occurred in trop1 japonica from insular SEA (V10). However, most of the remaining accessions were indica rice, including V13, V8, V12, V9 (aus) and V6. In the pie chart analysis (Fig. 4), type 2 mutations were mainly in indica rice, and only one japonica subgroup (V10, insular trop1) carried this mutation. Thus, the mutation appears to have occurred in insular SEA and then introgressed and expanded (i.e., was carried by human beings) to indica rice in Taiwan (V13), Indochina (V12) and the Indian subcontinent (V8), as well as aus (V9).
Mutations, Movements, Selections and Adaptations Occurred in Early Rice Cultivation
Trade networks around the South China Sea have been active since ~2500 BP (Hung et al. 2007, 2013; Alam et al. 2021; Wang et al. 2022, 2023). The early Austronesian trading area included Taiwan, southern China, the Indochina peninsula, insular SEA, and India to the west, with high activity from ~2500 to ~1800 BP. For instance, Fengtian jade was found in excavated sites across these areas, indicating a jade trade (Hung et al. 2007, 2013; Alam et al. 2021), whereas both mining and manufacturing were found only on the east side of Taiwan (Hung et al. 2007, 2013). Thus, back-and-forth movement between these areas was relatively frequent in early times.
We showed that mutations and movements occurred frequently in eastern, southeastern and southern Asia. We also used Tajima’s D analysis (Tajima 1989) to check for selection across all 17 genes tested in this study. D values were significant, or negative although not significant, for most genes in all subgroups except V14 (wild rice) (Table S22). Thus, these genes have indeed been under selection in the landraces. Taiwan indigenous millet farmers still perform panicle selection at the harvesting stage (Fig. S12), illustrating the importance of selection in current agriculture as well.
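For readers unfamiliar with the statistic, Tajima’s D compares two estimates of genetic diversity: the mean number of pairwise differences and the number of segregating sites scaled by sample size. The following is a minimal sketch of the standard formula (Tajima 1989) applied to aligned sequences; it is not the software used in this study, which is not named in this section, and the toy alignments are hypothetical.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Compute Tajima's D for a list of equal-length aligned sequences."""
    n = len(sequences)
    if n < 4:
        raise ValueError("at least 4 sequences are needed")
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of pairwise differences
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    # Constants from Tajima (1989)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    return (pi - seg_sites / a1) / sqrt(e1 * seg_sites + e2 * seg_sites * (seg_sites - 1))


# Hypothetical alignments: only rare variants vs. two balanced haplotypes
rare_variants = ["AAAA", "TAAA", "ATAA", "AATA", "AAAT", "AAAA"]
balanced = ["AAAA", "AAAA", "AAAA", "TTTT", "TTTT", "TTTT"]
d_rare = tajimas_d(rare_variants)
d_balanced = tajimas_d(balanced)
print(f"rare variants: D = {d_rare:.2f}; balanced haplotypes: D = {d_balanced:.2f}")
```

A negative D reflects an excess of rare variants, as expected after a selective sweep or population expansion, whereas a positive D reflects an excess of intermediate-frequency variants; significance is normally assessed against the expected distribution for the given sample size.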
The movement of rice into insular SEA has been attributed to the Austronesian expansion (Bellwood 1997; Diamond 2001; Bellwood 2006). Austronesian farmers from southern Taiwan started to migrate to the northern Philippines about 4,000 years ago (Gray and Jordan 2000; Diamond and Bellwood 2003), then expanded across insular SEA (Sagart 2004; Bellwood 2011). The Austronesian vocabulary of rice cultivation is shared by Taiwan and the Malayo-Polynesian languages (Proto-Austronesian *pajay ‘rice plant’, *beRas ‘husked grain’, *S1emay ‘rice as food’; Sagart et al. 2018). This finding argues that the first Austronesians who migrated out of Taiwan brought with them rice cultivation and the attendant vocabulary. The previous whole-genome re-sequencing study suggested that Taiwan indigenous tropical japonica could have come from the northern Philippines and mainland SEA (Alam et al. 2021). We hypothesize that once the rice lines arrived in the Philippines, these landraces were out-competed by better-adapted tropical landraces from the southern mainland (Indochina Peninsula), while the vocabulary remained unchanged. From the current allele mining of grain size-related genes, we suggest that rice accessions with larger seeds were then brought back to Taiwan via the trade networks and replaced the original small-seeded ones. Together, the genome re-sequencing and linguistic analyses fit well with the out-of-Taiwan theory.
Conclusion
Crop cultivation is influenced by social and cultural exchanges with nearby regions. The history of the dispersal and exchange of rice in Asia has been an interesting research target and has been studied by archaeological and linguistic research as well as whole-genome re-sequencing. In the current study, we showed that ancient seeds, for example, carbonated seeds from before 3,300 BP in Taiwan, were tiny as compared with recent accessions grown in indigenous villages. Whole-genome re-sequencing and a survey of grain size revealed that some mutations in grain morphology-controlling genes might have been targets for selection, leading to the great size differences between carbonated rice grains and recently cultivated ones. We also performed allele mining of 2 adaptation-related genes. Similar to the current small-scale farming of foxtail millet in Taiwan indigenous villages, ancient people may have selected rice during harvest for the following growing seasons. The grain sizes of all subgroups studied have been selection targets, and seed size gradually changed along with the improvement of rice cultivation, via introgression and expansion, over thousands of years.
Data Availability
The data underlying this article are available in the NCBI Short Read Archive (SRA) database as indicated in Table S3.
Abbreviations
- ANOVA: Analysis of variance
- BEAST: Bayesian evolutionary analysis by sampling trees
- BP: Before present
- CV: Cross-validation
- DGWG: Dee Geo Woo Gen
- GS3: Grain shape gene on chromosome 3
- GS5: Grain shape gene on chromosome 5
- GW6: Grain weight on chromosome 6
- GW7: Grain weight on chromosome 7
- GLW7: Grain length and weight on chromosome 7
- GW8: Grain weight on chromosome 8
- HLL: Huilaili
- IGV: Integrative Genomics Viewer
- IRRI: International Rice Research Institute
- LGY3: Rice grain yield quantitative trait locus on chromosome 3
- LOF: Loss-of-function
- NKLE: Nan-kuan-li East
- NPGRC: National Plant Genetic Resources Center
- ORF: Open reading frame
- qSW5: Quantitative trait controlling seed width on chromosome 5
- qTGW3: Quantitative trait controlling grain size and weight on chromosome 3
- SAM: Sequence Alignment/Map
- SEA: Southeast Asia
- TARI: Taiwan Agricultural Research Institute
- TGW6: Thousand grain weight on chromosome 6
- WCT: Wuchiantsuo
- YHF: Youhsienfang
References
Abramson J, Adler J, Dunger J, Evans R, Green T, Pritzel A, Ronneberger O, Willmore L, Ballard AJ, Bambrick J, Bodenstein SW, Evans DA, Hung C-C, O’Neill M, Reiman D, Tunyasuvunakool K, Wu Z, Žemgulytė A, Arvaniti E, Beattie C, Bertolli O, Bridgland A, Cherepanov A, Congreve M, Cowen-Rivers AI, Cowie A, Figurnov M, Fuchs FB, Gladman H, Jain R, Khan YA, Low CMR, Perlin K, Potapenko A, Savy P, Singh S, Stecula A, Thillaisundaram A, Tong C, Yakneen S, Zhong ED, Zielinski M, Žídek A, Bapst V, Kohli P, Jaderberg M, Hassabis D, Jumper JM (2024) Accurate structure prediction of biomolecular interactions with alphafold 3. Nature 630(8016):493–500
Ahn S-M (2010) The emergence of rice agriculture in korea: archaeobotanical perspectives. Archaeol Anthropol Sci 2(2):89–98. https://doi.org/10.1007/s12520-010-0029-9
Alachiotis N, Pavlidis P (2018) Raisd detects positive selection based on multiple signatures of a selective sweep and SNP vectors. Commun Biol 1(1):79
Alam O, Gutaker RM, Wu C-C, Hicks KA, Bocinsky K, Castillo CC, Acabado S, Fuller D, d’Alpoim Guedes JA, Hsing Y-I, Purugganan MD (2021) Genome analysis traces regional dispersal of rice in Taiwan and Southeast Asia. Mol Biol Evol 38(11):4832–4846. https://doi.org/10.1093/molbev/msab209
Alexander DH, Novembre J, Lange K (2009) Fast model-based Estimation of ancestry in unrelated individuals. Genome Res 19(9):1655–1664. https://doi.org/10.1101/gr.094052.109
Bellwood P (2006) The early movements of Austronesian-speaking peoples in the Indonesian region. Austronesian Diaspora and the Ethnogeneses of People in Indonesian Archipelago:61–82
Bellwood P (2011) The checkered prehistory of rice movement southwards as a domesticated cereal—from the Yangzi to the equator. Rice 4(3–4):93–103. https://doi.org/10.1007/s12284-011-9068-9
Bellwood P (1997) Prehistory of the Indo-Malaysian archipelago. University of Hawaii Press, Honolulu (first edition published 1985 by Academic Press, Sydney). Indo-Pacific Prehistory Association Bulletin 23:2003
Bouckaert R, Heled J, Kühnert D, Vaughan T, Wu C-H, Xie D, Suchard MA, Rambaut A, Drummond AJ (2014) Beast 2: a software platform for bayesian evolutionary analysis. PLoS Comput Biol 10(4):e1003537
Castillo CC (2019) Preservation bias: is rice overrepresented in the archaeological record? Archaeol Anthropol Sci 11(12):6451–6471
Castillo CC, Tanaka K, Sato Y-I, Ishikawa R, Bellina B, Higham C, Chang N, Mohanty R, Kajale M, Fuller DQ (2016) Archaeogenetic study of prehistoric rice remains from Thailand and india: evidence of early Japonica in South and Southeast Asia. Archaeol Anthropol Sci 8(3):523–543. https://doi.org/10.1007/s12520-015-0236-5
Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ (2015) Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4(1):s13742-015-0047-8
Che R, Tong H, Shi B, Liu Y, Fang S, Liu D, Xiao Y, Hu B, Liu L, Wang H, Zhao M, Chu C (2015) Control of grain size and rice yield by GL2-mediated brassinosteroid responses. Nat Plants 2(1):15195. https://doi.org/10.1038/nplants.2015.195
Choi JY, Platts AE, Fuller DQ, Hsing Y-I, Wing RA, Purugganan MD (2017) The rice paradox: multiple origins but single domestication in Asian rice. Mol Biol Evol 34(4):969–979. https://doi.org/10.1093/molbev/msx049
Chu WL (2016) Rescue excavation report of the Anhelu site. National Museum of Natural Science, Taichung. (in Chinese)
Chuenwattana N (2010) Rice grain charring experiments: can we distinguish sticky or plain archaeologically. Dissertation, UCL, Institute of Archaeology, London
Cingolani P, Platts A, Wang le L, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM (2012) A program for annotating and predicting the effects of single nucleotide polymorphisms, snpeff: SNPs in the genome of drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 6(2):80–92. https://doi.org/10.4161/fly.19695
R Core Team (2024) R: A language and environment for statistical computing (version 4.4.2). R foundation for statistical computing, Vienna, Austria. https://www.R-project.org/
R Core Team (2013) R: A language and environment for statistical computing
Crawford GW, Chen X, Wang J (2006) Houli culture rice from the Yuezhuang site. Jinan Dongfang Kaogu 3:247–251 (in Chinese with English Abstract)
Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, Handsaker RE, Lunter G, Marth GT, Sherry ST (2011) The variant call format and vcftools. Bioinformatics 27(15):2156–2158
Deng Z, Qin L, Gao Y, Weisskopf AR, Zhang C, Fuller DQ (2015) From early domesticated rice of the middle Yangtze basin to millet, rice and wheat agriculture: archaeobotanical macro-remains from Baligang, Nanyang basin, central China (6700–500 BC). PLoS ONE 10(10):e0139885
Deng Z, Kuo S-C, Carson MT, Hung H-C (2022) Early Austronesians cultivated rice and millet together: tracing Taiwan’s first Neolithic crops. Front Plant Sci 13:962073
Diamond J (2001) Polynesian origins: slow boat to Melanesia? Nature 410(6825):167–168
Diamond J, Bellwood P (2003) Farmers and their languages: the first expansions. Science 300(5619):597–603. https://doi.org/10.1126/science.1078208
Du H, Yu Y, Ma Y, Gao Q, Cao Y, Chen Z, Ma B, Qi M, Li Y, Zhao X (2017) Sequencing and de novo assembly of a near complete indica rice genome. Nat Commun 8(1):1–12
Duan P, Xu J, Zeng D, Zhang B, Geng M, Zhang G, Huang K, Huang L, Xu R, Ge S, Qian Q, Li Y (2017) Natural variation in the promoter of GSE5 contributes to grain size diversity in rice. Mol Plant 10(5):685–694. https://doi.org/10.1016/j.molp.2017.03.009
Edgar RC (2004a) MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5(1):113
Edgar RC (2004b) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucl Acids Res 32(5):1792–1797
Fan C, Xing Y, Mao H, Lu T, Han B, Xu C, Li X, Zhang Q (2006) GS3, a major QTL for grain length and weight and minor QTL for grain width and thickness in rice, encodes a putative transmembrane protein. Theor Appl Genet 112(6):1164–1171. https://doi.org/10.1007/s00122-006-0218-1
Felsenstein J (2004) PHYLIP (phylogeny inference package) version 3.6. Distributed by the author. http://www.evolution.gs.washington.edu/phylip.html
Fuller DQ, Qin L, Zheng Y, Zhao Z, Chen X, Hosoya LA, Sun G-P (2009) The domestication process and domestication rate in rice: spikelet bases from the lower Yangtze. Science 323(5921):1607–1610
Fuller DQ, Sato Y-I, Castillo C, Qin L, Weisskopf AR, Kingwell-Banham EJ, Song J, Ahn S-M, Van Etten J (2010) Consilience of genetics and archaeobotany in the entangled history of rice. Archaeol Anthropol Sci 2(2):115–131
Gray RD, Jordan FM (2000) Language trees support the express-train sequence of Austronesian expansion. Nature 405(6790):1052–1055. https://doi.org/10.1038/35016575
Gutaker RM, Groen SC, Bellis ES, Choi JY, Pires IS, Bocinsky RK, Slayton ER, Wilkins O, Castillo CC, Negrão S (2020) Genomic history and ecology of the geographic spread of rice. Nat Plants 6(5):492–502
Hill C, Soares P, Mormina M, Macaulay V, Clarke D, Blumbach PB, Vizuete-Forster M, Forster P, Bulbeck D, Oppenheimer S (2007) A mitochondrial stratigraphy for Island Southeast Asia. Am J Hum Genet 80(1):29–43
Hsieh J-S, Hsing Y-IC, Hsu T-F, Li PJ-K, Li K-T, Tsang C-H (2011) Studies on ancient rice—where botanists, agronomists, archeologists, linguists, and ethnologists meet. Rice 4(3–4):178–183. https://doi.org/10.1007/s12284-011-9075-x
Hsu T-F, Wang Y-H, Fang B-X, Chen Y-Q, Tsai Y-C, Xie Z-S, Hsing Y-IC (2019) A comparative study on morphological types of carbonized rice grains in prehistorical Taiwan. Field Archaeol Taiwan 19:55–86 (in Chinese with English Abstract)
Hu Z, Lu S-J, Wang M-J, He H, Sun L, Wang H, Liu X-H, Jiang L, Sun J-L, Xin X, Kong W, Chu C, Xue H-W, Yang J, Luo X, Liu J-X (2018) A novel QTL qTGW3 encodes the GSK3/SHAGGY-like kinase OsGSK5/OsSK41 that interacts with OsARF4 to negatively regulate grain size and weight in rice. Mol Plant 11(5):736–749. https://doi.org/10.1016/j.molp.2018.03.005
Huang H (1984) Report for the rescue excavation of the Chi-Shan-Yen site. Taipei City Archives. (in Chinese)
Huang K, Wang D, Duan P, Zhang B, Xu R, Li N, Li Y (2017) WIDE AND THICK GRAIN 1, which encodes an otubain-like protease with deubiquitination activity, influences grain size and shape in rice. Plant J 91(5):849–860. https://doi.org/10.1111/tpj.13613
Huang Y-F, Wu D-H, Wang C-L, Du P-R, Cheng C-Y, Cheng C-C (2021) Survey of rice production practices and perception of weedy red rice (Oryza sativa f. spontanea) in Taiwan. Weed Sci 69(5):526–535
Hung H-C, Iizuka Y, Bellwood P, Nguyen KD, Bellina B, Silapanth P, Dizon E, Santiago R, Datan I, Manton JH (2007) Ancient jades map 3,000 years of prehistoric exchange in Southeast Asia. Proc Natl Acad Sci USA 104(50):19745–19750. https://doi.org/10.1073/pnas.0707304104
Hung H-C, Nguyen KD, Bellwood P, Carson MT (2013) Coastal connectivity: long-term trading networks across the South China Sea. J Isl Coast Archaeol 8(3):384–404
Ishikawa R, Castillo CC, Htun TM, Numaguchi K, Inoue K, Oka Y, Ogasawara M, Sugiyama S, Takama N, Orn C, Inoue C, Nonomura K-I, Allaby R, Fuller DQ, Ishii T (2022) A stepwise route to domesticate rice by controlling seed shattering and panicle shape. Proc Natl Acad Sci USA 119(26):e2121692119. https://doi.org/10.1073/pnas.2121692119
Ishimaru K, Hirotsu N, Madoka Y, Murakami N, Hara N, Onodera H, Kashiwagi T, Ujiie K, Shimizu B, Onishi A, Miyagawa H, Katoh E (2013) Loss of function of the IAA-glucose hydrolase gene TGW6 enhances rice grain weight and increases yield. Nat Genet 45(6):707–711. https://doi.org/10.1038/ng.2612
Iso E (1944) Lectures on rice cultivating in Formosa (Taiwan)
Jiang H, Zhang A, Liu X, Chen J (2022) Grain size associated genes and the molecular regulatory mechanism in rice. Int J Mol Sci 23(6):3169
Kato S, Kosaka H, Hara S (1930) On the affinity of the cultivated varieties of rice plants, Oryza sativa L. J Dept Agric Kyushu Imp Univ 2(29):241–276
Konishi S, Izawa T, Lin SY, Ebana K, Fukuta Y, Sasaki T, Yano M (2006) An SNP caused loss of seed shattering during rice domestication. Science 312(5778):1392–1396. https://doi.org/10.1126/science.1126410
3KRGP (2014) The 3,000 rice genomes project. GigaScience 3(1):7. https://doi.org/10.1186/2047-217X-3-7
Kumar S, Stecher G, Tamura K (2016) MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol 33(7):1870–1874
Li C, Zhou A, Sang T (2006) Rice domestication by reducing shattering. Science 311(5769):1936–1939. https://doi.org/10.1126/science.1123604
Li Y, Fan C, Xing Y, Jiang Y, Luo L, Sun L, Shao D, Xu C, Li X, Xiao J, He Y, Zhang Q (2011) Natural variation in GS5 plays an important role in regulating grain size and yield in rice. Nat Genet 43(12):1266–1269. https://doi.org/10.1038/ng.977
Li N, Xu R, Duan P, Li Y (2018) Control of grain size in rice. Plant Reprod 31(3):237–251. https://doi.org/10.1007/s00497-018-0333-6
Liakat Ali M, McClung AM, Jia MH, Kimball JA, McCouch SR, Eizenga GC (2011) A rice diversity panel evaluated for genetic and agro-morphological diversity between subpopulations and its geographic distribution. Crop Sci 51(5):2021–2035. https://doi.org/10.2135/cropsci2010.11.0641
Liu Q, Han R, Wu K, Zhang J, Ye Y, Wang S, Chen J, Pan Y, Li Q, Xu X (2018) G-protein βγ subunits determine grain size through interaction with MADS-domain transcription factors in rice. Nat Commun 9(1):852
Liu Z, Xie L, Chen W (2025) Advancement of uridine diphosphate-dependent glycosyltransferases (UGTs) in the glycosylation modification of natural products and their protein engineering. Food Qual Saf 9:fyaf005. https://doi.org/10.1093/fqsafe/fyaf005
Ma Y, Dai X, Xu Y, Luo W, Zheng X, Zeng D, Pan Y, Lin X, Liu H, Zhang D, Xiao J, Guo X, Xu S, Niu Y, Jin J, Zhang H, Xu X, Li L, Wang W, Qian Q, Ge S, Chong K (2015) COLD1 confers chilling tolerance in rice. Cell 160(6):1209–1221. https://doi.org/10.1016/j.cell.2015.01.046
Mansueto L, Fuentes RR, Chebotarov D, Borja FN, Detras J, Abriol-Santos JM, Palis K, Poliakov A, Dubchak I, Solovyev V (2016) SNP-Seek II: a resource for allele mining and analysis of big genomic data in Oryza sativa. Curr Plant Biol 7:16–25
Mao H, Sun S, Yao J, Wang C, Yu S, Xu C, Li X, Zhang Q (2010) Linking differential domain functions of the GS3 protein to natural variation of grain size in rice. Proc Natl Acad Sci USA 107(45):19579–19584. https://doi.org/10.1073/pnas.1014419107
Matsuo T, Hoshikawa K (1993) Science of the rice plant: morphology, vol 1. Food and Agricult Policy Res Center, p 686
Oka HI (1991) Ecology of wild rice planted in Taiwan I. Sequential distribution of species and their interactions in weed. Bot Bull Acad Sinica 32
Pawley A (2002) The Austronesian dispersal: languages, technologies, people. Examining the farming/language dispersal hypothesis. McDonald Institute for Archaeological Research, pp 149–172
Peng M, Shahzad R, Gul A, Subthain H, Shen S, Lei L, Zheng Z, Zhou J, Lu D, Wang S (2017) Differentially evolved glucosyltransferases determine natural variation of rice flavone accumulation and UV-tolerance. Nat Commun 8(1):1975
Qi P, Lin YS, Song XJ, Shen JB, Huang W, Shan JX, Zhu MZ, Jiang L, Gao JP, Lin HX (2012) The novel quantitative trait locus GL3.1 controls rice grain size and yield by regulating Cyclin-T1;3. Cell Res 22(12):1666–1680. https://doi.org/10.1038/cr.2012.151
Qiu J, Zhou Y, Mao L, Ye C, Wang W, Zhang J, Yu Y, Fu F, Wang Y, Qian F, Qi T, Wu S, Sultana MH, Cao YN, Wang Y, Timko MP, Ge S, Fan L, Lu Y (2017) Genomic variation associated with local adaptation of weedy rice during de-domestication. Nat Commun 8:15323. https://doi.org/10.1038/ncomms15323
Renfrew JM (1973) Palaeoethnobotany. The Prehistoric Food Plants of the Near East and Europe
Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP (2011) Integrative genomics viewer. Nat Biotechnol 29(1):24–26
Rozas J, Ferrer-Mata A, Sánchez-DelBarrio JC, Guirao-Rico S, Librado P, Ramos-Onsins SE, Sánchez-Gracia A (2017) DnaSP 6: DNA sequence polymorphism analysis of large data sets. Mol Biol Evol 34(12):3299–3302
Sagart L (2004) The higher phylogeny of Austronesian and the position of Tai-Kadai. Ocean Ling 43:411–444. https://doi.org/10.1353/ol.2005.0012
Sagart L, Hsu T-F, Tsai Y-C, Wu C-C, Huang L-T, Chen Y-C, Chen Y-F, Tseng Y-C, Lin H-Y, Hsing Y-iC (2018) A Northern Chinese origin of Austronesian agriculture: new evidence on traditional Formosan cereals. Rice 11(1):1–16
Saitou N, Nei M (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 4(4):406–425
Sasaki A, Ashikari M, Ueguchi-Tanaka M, Itoh H, Nishimura A, Swapan D, Ishiyama K, Saito T, Kobayashi M, Khush GS (2002) A mutant gibberellin-synthesis gene in rice. Nature 416(6882):701–702
Shi CL, Dong NQ, Guo T, Ye WW, Shan JX, Lin HX (2020) A quantitative trait locus GW6 controls rice grain size and yield through the gibberellin pathway. Plant J 103(3):1174–1188
Shomura A, Izawa T, Ebana K, Ebitani T, Kanegae H, Konishi S, Yano M (2008) Deletion in a gene associated with grain size increased yields during rice domestication. Nat Genet 40(8):1023–1028. https://doi.org/10.1038/ng.169
Si L, Chen J, Huang X, Gong H, Luo J, Hou Q, Zhou T, Lu T, Zhu J, Shangguan Y, Chen E, Gong C, Zhao Q, Jing Y, Zhao Y, Li Y, Cui L, Fan D, Lu Y, Weng Q, Wang Y, Zhan Q, Liu K, Wei X, An K, An G, Han B (2016) OsSPL13 controls grain size in cultivated rice. Nat Genet 48(4):447–456. https://doi.org/10.1038/ng.3518
Silva F, Weisskopf A, Castillo C, Murphy C, Kingwell-Banham E, Qin L, Fuller DQ (2018) A tale of two rice varieties: modelling the prehistoric dispersals of japonica and proto-indica rices. Holocene 28(11):1745–1758
Song XJ, Huang W, Shi M, Zhu MZ, Lin HX (2007) A QTL for rice grain width and weight encodes a previously unknown RING-type E3 ubiquitin ligase. Nat Genet 39(5):623–630. https://doi.org/10.1038/ng2014
Song XJ, Kuroha T, Ayano M, Furuta T, Nagai K, Komeda N, Segami S, Miura K, Ogawa D, Kamura T, Suzuki T, Higashiyama T, Yamasaki M, Mori H, Inukai Y, Wu J, Kitano H, Sakakibara H, Jacobsen SE, Ashikari M (2015) Rare allele of a previously unidentified histone H4 acetyltransferase enhances grain weight, yield, and plant biomass in rice. Proc Natl Acad Sci USA 112(1):76–81. https://doi.org/10.1073/pnas.1421127112
Spielmeyer W, Ellis MH, Chandler PM (2002) Semidwarf (sd-1), green revolution rice, contains a defective gibberellin 20-oxidase gene. Proc Natl Acad Sci USA 99(13):9043–9048
Stein JC, Yu Y, Copetti D, Zwickl DJ, Zhang L, Zhang C, Chougule K, Gao D, Iwata A, Goicoechea JL, Wei S, Wang J, Liao Y, Wang M, Jacquemin J, Becker C, Kudrna D, Zhang J, Londono CEM, Song X, Lee S, Sanchez P, Zuccolo A, Ammiraju JSS, Talag J, Danowitz A, Rivera LF, Gschwend AR, Noutsos C, Wu CC, Kao SM, Zeng JW, Wei FJ, Zhao Q, Feng Q, El Baidouri M, Carpentier MC, Lasserre E, Cooke R, Rosa Farias DD, da Maia LC, Dos Santos RS, Nyberg KG, McNally KL, Mauleon R, Alexandrov N, Schmutz J, Flowers D, Fan C, Weigel D, Jena KK, Wicker T, Chen M, Han B, Henry R, Hsing YC, Kurata N, de Oliveira AC, Panaud O, Jackson SA, Machado CA, Sanderson MJ, Long M, Ware D, Wing RA (2018) Genomes of 13 domesticated and wild rice relatives highlight genetic conservation, turnover and innovation across the genus Oryza. Nat Genet 50(2):285–296. https://doi.org/10.1038/s41588-018-0040-0
Sweeney MT, Thomson MJ, Pfeil BE, McCouch S (2006) Caught red-handed: Rc encodes a basic helix-loop-helix protein conditioning red pericarp in rice. Plant Cell 18(2):283–294. https://doi.org/10.1105/tpc.105.038430
Tajima F (1989) Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123(3):585–595
Takahashi Y, Shimamoto K (2011) Heading date 1 (Hd1), an ortholog of Arabidopsis CONSTANS, is a possible target of human selection during domestication to diversify flowering times of cultivated rice. Genes Genet Syst 86(3):175–182. https://doi.org/10.1266/ggs.86.175
Takeda K, Takahashi M-E (1970) Unbalanced growth in floral glumes and caryopsis in rice: I. varietal difference in the degree of unbalance and the occurrence of malformed grains. (Genetical studies on rice plant, XXXXV). Japan J Breed 20(6):337–343 (in Japanese with English Summary)
Takita T (1988) Grain ripening of a high yielding rice cultivar with very large grains. Breed Sci 38:443–448
Teng Y-C (1999) The development history of rice farming in Taiwan. Department of Agriculture and Forestry, Taiwan Provincial Government. 793pp. (in Chinese)
Thorvaldsdóttir H, Robinson JT, Mesirov JP (2013) Integrative genomics viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform 14(2):178–192
Tsang C-H, Li K-T (2015) Archaeological heritage in the Tainan science park of Taiwan. Tainan science park archaeological discoveries series; 3 Taitung:350–359
Wanchana S, Toojinda T, Tragoonrung S, Vanavichit A (2003) Duplicated coding sequence in the waxy allele of tropical glutinous rice (Oryza sativa L). Plant Sci 165(6):1193–1199. https://doi.org/10.1016/s0168-9452(03)00326-1
Wang S, Wu K, Yuan Q, Liu X, Liu Z, Lin X, Zeng R, Zhu H, Dong G, Qian Q (2012) Control of grain size, shape and quality by OsSPL16 in rice. Nat Genet 44(8):950–954
Wang S, Li S, Liu Q, Wu K, Zhang J, Wang S, Wang Y, Chen X, Zhang Y, Gao C, Wang F, Huang H, Fu X (2015) The OsSPL16-GW7 regulatory module determines grain shape and simultaneously improves rice yield and grain quality. Nat Genet 47(8):949–954. https://doi.org/10.1038/ng.3352
Wang W, Mauleon R, Hu Z, Chebotarov D, Tai S, Wu Z, Li M, Zheng T, Fuentes RR, Zhang F, Mansueto L, Copetti D, Sanciangco M, Palis KC, Xu J, Sun C, Fu B, Zhang H, Gao Y, Zhao X, Shen F, Cui X, Yu H, Li Z, Chen M, Detras J, Zhou Y, Zhang X, Zhao Y, Kudrna D, Wang C, Li R, Jia B, Lu J, He X, Dong Z, Xu J, Li Y, Wang M, Shi J, Li J, Zhang D, Lee S, Hu W, Poliakov A, Dubchak I, Ulat VJ, Borja FN, Mendoza JR, Ali J, Li J, Gao Q, Niu Y, Yue Z, Naredo MEB, Talag J, Wang X, Li J, Fang X, Yin Y, Glaszmann JC, Zhang J, Li J, Hamilton RS, Wing RA, Ruan J, Zhang G, Wei C, Alexandrov N, McNally KL, Li Z, Leung H (2018) Genomic variation in 3,010 diverse accessions of Asian cultivated rice. Nature 557(7703):43–49. https://doi.org/10.1038/s41586-018-0063-9
Wang A, Hou Q, Si L, Huang X, Luo J, Lu D, Zhu J, Shangguan Y, Miao J, Xie Y, Wang Y, Zhao Q, Feng Q, Zhou C, Li Y, Fan D, Lu Y, Tian Q, Wang Z, Han B (2019) The PLATZ transcription factor GL6 affects grain length and number in rice. Plant Physiol 180(4):2077–2090. https://doi.org/10.1104/pp.18.01574
Wang K-W, Iizuka Y, Jackson C (2022) The production technology of mineral soda alumina glass: a perspective from microstructural analysis of glass beads in iron age Taiwan. PLoS ONE 17(2):e0263986. https://doi.org/10.1371/journal.pone.0263986
Wang K-W, Dussubieux L, Iizuka Y, Li K-t, Tsang C-h (2023) Glass ornaments from southwestern Taiwan: new light on maritime glass exchange across Southeast, South and West Asia in the early-mid 1st millennium CE. Herit Sci 11(1):255. https://doi.org/10.1186/s40494-023-01093-1
Watterson G (1975) On the number of segregating sites in genetical models without recombination. Theor Popul Biol 7(2):256–276
Wei F-J, Tsai Y-C, Wu H-P, Huang L-T, Chen Y-C, Chen Y-F, Wu C-C, Tseng Y-T, Hsing Y-IC (2016) Both Hd1 and Ehd1 are important for artificial selection of flowering time in cultivated rice. Plant Sci 242:187–194
Wickham H (2011) ggplot2. WIREs Comput Stat 3(2):180–185
Wickham H, Averick M, Bryan J, Chang W, McGowan LDA, François R, Grolemund G, Hayes A, Henry L, Hester J, Kuhn M, Pedersen TL, Miller E, Bache SM, Müller K, Ooms J, Robinson D, Seidel DP, Spinu V, Takahashi K, Vaughan D, Wilke C, Woo K, Yutani H (2019) Welcome to the tidyverse. J Open Source Softw 4(43):1686
Wohlwend J, Corso G, Passaro S, Reveiz M, Leidal K, Swiderski W, Portnoi T, Chinn I, Silterra J, Jaakkola T, Barzilay R (2024) Boltz-1: democratizing biomolecular interaction modeling. bioRxiv 2024.11.19.624167. https://doi.org/10.1101/2024.11.19.624167
Wu I-L, Lee T, Li K, Lee K-H (2016) The origin of rice cultivation at 4,000 years ago on the East Coast of Taiwan: preliminary results of phytolith analysis. J Austronesian Stud 6(1):25–50
Wu C-C, Wei F-J, Chiou W-Y, Tsai Y-C, Wu H-P, Gotarkar D, Wei Z-H, Lai M-H, Hsing Y-IC (2020a) Studies of rice Hd1 haplotypes worldwide reveal adaptation of flowering time to different environments. PLoS ONE 15(9):e0239028. https://doi.org/10.1371/journal.pone.0239028
Wu D-H, Gealy DR, Jia MH, Edwards JD, Lai M-H, McClung AM (2020b) Phylogenetic origin and dispersal pattern of Taiwan weedy rice. Pest Manag Sci 76(5):1639–1651
Wu C-C, Liu C-K, Wei F-J, Huang L-T, Lin W-C, Hsie Y-T, Lin M-C, Chan C-H, Le T-T, Wu Y-P, Lo J-C, Li H-F, Lai M-H, Chen S, Hou A-L, Chiou W-Y, Yu S-M, Ho T-HD, Hsing Y-IC (2022) A rice genomics and phenomics resource with primarily Taiwan rice accessions. Crop Environ Bioinf 18:12–36. https://doi.org/10.30061/CEB.202212_18.0002
Yano M, Katayose Y, Ashikari M, Yamanouchi U, Monna L, Fuse T, Baba T, Yamamoto K, Umehara Y, Nagamura Y, Sasaki T (2000) Hd1, a major photoperiod sensitivity quantitative trait locus in rice, is closely related to the Arabidopsis flowering time gene CONSTANS. Plant Cell 12(12):2473–2484. https://doi.org/10.1105/tpc.12.12.2473
Zhao D-S, Li Q-F, Zhang C-Q, Zhang C, Yang Q-Q, Pan L-X, Ren X-Y, Lu J, Gu M-H, Liu Q-Q (2018a) GS9 acts as a transcriptional activator to regulate rice grain shape and appearance quality. Nat Commun 9(1):1240. https://doi.org/10.1038/s41467-018-03616-y
Zhao Q, Feng Q, Lu H, Li Y, Wang A, Tian Q, Zhan Q, Lu Y, Zhang L, Huang T, Wang Y, Fan D, Zhao Y, Wang Z, Zhou C, Chen J, Zhu C, Li W, Weng Q, Xu Q, Wang ZX, Wei X, Han B, Huang X (2018b) Pan-genome analysis highlights the extent of genomic variation in cultivated and wild rice. Nat Genet 50(2):278–284. https://doi.org/10.1038/s41588-018-0041-z
Acknowledgements
We are grateful to the Taiwanese indigenous people and traditional farmers for their stewardship of traditional rice landraces. We thank Ms. Lie-Hong Wu for maintaining the greenhouse plants and Laura Smales (BioMedEditing, Toronto, Canada) for English editing. We also wish to acknowledge the contributions of Dr. Yung-Pei Wu, who passed away two years ago while conducting fieldwork.
Funding
This work was supported by MOST grants (110-2313-B-001-005 and 109-2313-B-001-008 to YIH) and ITAR grants (AS-ITAR-110-TD06, AS-109-ITAR-TD08 and AS-108-ITAR-TD08 to YIH).
Author information
Contributions
YICH conceived and designed the study. CCW, KYC, YFC, LTH, SMY and THH generated sequencing data. YHW, YCT, TFH and YTT analyzed the carbonized grains. YCT, YTT, NCD, JCL, DPS, CWW, MHL, DHW, SC and SJC collected rice accessions. CCW, CKL and WYC performed bioinformatics analysis. CHT, KTL and WLC provided archaeological materials. WYC and JYY performed the protein modelling analysis. SL provided linguistic and historical information. YICH wrote the manuscript with input from all authors.
Ethics declarations
Competing Interests
The authors declare no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Wu, CC., Tseng, YT., Tsai, YC. et al. Allele Mining of Seed-Related Genes Reveals Early movement, Selection and Adaptation of Asian Rice Landraces. Rice 18, 95 (2025). https://doi.org/10.1186/s12284-025-00854-9
DOI: https://doi.org/10.1186/s12284-025-00854-9