Skip to main content

Deciphering the transcriptional regulatory networks that control size, color, and oil content in Brassica rapa seeds



Brassica rapa is an important oilseed and vegetable crop species and is the A subgenome donor of two important oilseed Brassica crops, Brassica napus and Brassica juncea. Although seed size (SZ), seed color (SC), and oil content (OC) substantially affect seed yield and quality, the mechanisms regulating these traits in Brassica crops remain unclear.


We collected seeds from a pair of B. rapa accessions with significantly different SZ, SC, and OC at seven seed developmental stages (every 7 days from 7 to 49 days after pollination), and identified 28,954 differentially expressed genes (DEGs) from seven pairwise comparisons between accessions at each developmental stage. K-means clustering identified a group of cell cycle-related genes closely connected to variation in SZ of B. rapa. A weighted correlation analysis using the WGCNA package in R revealed two important co-expression modules comprising genes whose expression was positively correlated with SZ increase and negatively correlated with seed yellowness, respectively. Upregulated expression of cell cycle-related genes in one module was important for the G2/M cell cycle transition, and the transcription factor Bra.A05TSO1 seemed to positively stimulate the expression of two CYCB1;2 genes to promote seed development. In the second module, a conserved complex regulated by the transcription factor TT8 appear to determine SC through downregulation of TT8 and its target genes TT3, TT18, and ANR. In the third module, WRI1 and FUS3 were conserved to increase the seed OC, and Bra.A03GRF5 was revealed as a key transcription factor on lipid biosynthesis. Further, upregulation of genes involved in triacylglycerol biosynthesis and storage in the seed oil body may increase OC. We further validated the accuracy of the transcriptome data by quantitative real-time PCR of 15 DEGs. Finally, we used our results to construct detailed models to clarify the regulatory mechanisms underlying variations in SZ, SC, and OC in B. rapa.


This study provides insight into the regulatory mechanisms underlying the variations of SZ, SC, and OC in plants based on transcriptome comparison. The findings hold great promise for improving seed yield, quality and OC through genetic engineering of critical genes in future molecular breeding.


Seeds are the most important harvested organ in many crop plants, storing large amounts of essential substances used by humans, and contain genetic materials that could be potentially used to improve crops. In angiosperms, seed development is initiated by a double fertilization event in which one of the two spermatids unites with an egg cell, while the other fertilizes the diploid central cell to form a triploid endosperm [1]. In monocotyledons and some dicotyledonous species, the endosperm occupies most of the mature seed, but in most dicotyledons, the embryo consumes a large amount of endosperm and occupies the bulk of the mature seed [2].

Seed size (SZ) is an important factor in crop yield, not only directly affecting the yield of seed crops, but also positively influencing seed germination and seedling growth in all species [2]. To date, several pathways regulating this significant agronomic trait have been identified and characterized. In the HAIKU (IKU) pathway, a classical SZ pathway controlling endosperm development, the triple mutant iku1/iku2/mini3 of Arabidopsis thaliana (Arabidopsis) shows reduced SZ due to precocious cell formation of the endosperm [3, 4]. Furthermore, the MINI3 pathway directly regulates CYTOKININ OXIDASE 2 (CKX2), inhibiting the activity of functional cytokinin (CK) to maintain endosperm growth [5]. SHORT HYPOCOTYL UNDER BLUE 1 (SHB1) is recruited by the promoters of IKU2 and MINI3 to promote their transcription. CKX2 is also regulated epigenetically [5, 6]. In the ubiquitin–proteasome pathway, the ubiquitin receptor DA1 acts synergistically with the E3 ubiquitin ligases DA2 or BIG BROTHER (BB)/ENHANCER OF DA1 (EOD1) to degrade various substrates and thereby reduce SZ by restricting cell proliferation [7]. In addition, DA1 functions synergistically with unknown E3 ubiquitin ligases to ubiquitinate UBIQUITIN-SPECIFIC PROTEASE 15 (UBP15), encoded by SUPPRESSOR OF DA1 (SOD2), resulting in small seeds and organs [6]. Overexpression of two transcription factors (TFs), TRANSPARENT TESTA GLABRA2 (TTG2), and KLU/CYP78A5, result in larger seeds due to positive regulation of cell expansion and cell proliferation in Arabidopsis [8]. In rice (Oryza sativa), the SQUAMOSA PROMOTER BINDING PROTEIN-LIKE (OsSPL13) TF promotes cell expansion and cell elongation to augment grain length and weight [9]. By contrast, APETALA2 (AP2) TFs reduce SZ through negative control of cell expansion [10]. In addition, plant hormones such as brassinosteroid (BR) and CK also play key roles in SZ variation. Overexpression of genes involved in BR biosynthesis leads to bigger seeds in Arabidopsis and rice, whereas mutants of BR biosynthesis genes show reduced SZ [11]. In Arabidopsis, the simultaneous silencing of the CK response factors AHK2, AHK3, and CRE1 produces larger but fewer seeds [12]. In Brassica crops, however, although several pathways associated with SZ variation have been identified, the molecular mechanism underlying this trait, in terms of cell cycle and cell division pathways, remains largely unknown.

Yellow seed is a desirable trait of Brassica species due to the higher oil and protein content, but lower fiber content compared with dark seed [13]. Oxidized procyanidins are the key pigments in seed color (SC) formation in plants, and they are secondary products in the flavonoid and phenylpropane biosynthesis pathways [14]. Numerous TRANSPARENT TESTA (TT) genes, which participate in various stages of flavonoid synthesis, have been identified in Arabidopsis [15, 16]. A complex structure, composed of MYB, basic helix–loop–helix (bHLH) TFs, and WD40 proteins, has been identified as a crucial factor in determining SC by regulating the flavonoid biosynthesis pathway [17]. Although several homologous TT genes have been identified in Brassica crops, the regulatory mechanism of SC formation is still not particularly clear [18, 19], retarding the progress of yellow-seeded breeding of Brassica crops. In addition, Brassica species have been widely cultivated as oilseed crop, such as Brassica napus, Brassica juncea, and Brassica rapa. Triacylglycerols (TAG), as the main form of seed oil, not only play an important role in edible oil, but also can be used as a biofuel or biodiesel for industrial production [20]. Since 1961, when the TAG biosynthesis pathway was first proposed by Kennedy, the synthesis of glycerol has become a hallmark of lipid biochemistry [21]. In Arabidopsis, the TAG biosynthetic, metabolic, and degradation pathways have been studied comprehensively [22], and the genes encoding most of the enzymes involved in TAG metabolism have been and cloned characterized [23]. However, the molecular mechanism underlying fatty acid (FA) accumulation in seeds of Brassica crops remains poorly understood, resulting in a slow improvement of breeding for high oil content (OC) in Brassica crops.

Brassica rapa (AA, 2n = 20) is an important oil and vegetable Brassica crop that is widely cultivated worldwide [24]. It is also the diploid parental species of two important oilseed Brassica crops, B. napus (AACC, 2n = 38) and B. juncea (AABB, 2n = 36) [25]. Therefore, unravelling the regulatory mechanisms governing seed development and SZ in B. rapa would lay a solid foundation for breeding high-yield Brassica crops and related plants. Hence, we employed a pair of B. rapa accessions with significant differences in SZ, SC, and OC to analyze the transcriptomic variations of seeds at seven seed developmental stages. These results provide insight not only into the regulatory mechanism influencing SZ by regulating cell cycle progression, but also into the molecular mechanism of SC formation regulated by flavonoid biosynthesis, and that of OC difference caused by FA accumulation in B. rapa.


Phenotypic variations between two B. rapa accessions

The two B. rapa accessions were remarkably different in both SC, SZ and OC after harvest (Fig. 1a–c; Additional file 1: Table S1). The SC of B. rapa accession SWUK4 was yellow, whereas that of accession SWUK3 was black (Fig. 1a). The final SZ of SWUK4 seed (mean ± standard error of the longest dimension, 2.42 ± 0.05 mm) was significantly higher than that of SWUK3 (1.23 ± 0.01 mm), representing an approximately two-fold difference. And the fresh seeds of SWUK4 were visibly larger than those of SWUK3 at all seven sampling stages (Fig. 1b). Moreover, near-infrared reflection spectroscopy (NIRS) assays indicated that mature seed OC was significantly higher in SWUK4 (42.15 ± 1.97%) than in SWUK3 (38.93 ± 1.44%) (Fig. 1c). Gas chromatography (GC) measurements also identified differences in the abundances of the FA components (palmitic acid, stearic acid, oleic acid, linoleic acid, and linolenic acid) between the two accessions (Additional file 1: Table S1).

Fig. 1
figure 1

Phenotypic comparison between two B. rapa accessions. a Comparison of SZ and SC in SWUK3 and SWUK4. b Comparison of SZ of seeds sampled at seven seed developmental stages and dry mature seeds. c Comparison of OC in seeds sampled at seven seed developmental stages and dry mature seeds. d Comparison of SZ increase between two adjacent sampling stages. e Comparison of OC increase between two adjacent sampling stages. Values are the mean ± standard error (SE) of three biological replicates. Student’s t-test was used for statistical analysis of data from the two B. rapa accessions (**P < 0.01; ***P < 0.001)

The speed of seed development is an important factor affecting the seed-filling process in B. rapa. The SZ of accession SWUK3 showed the greatest increase at 21–28 days after pollination (DAP) (0.75 ± 0.004 mm), followed by 7–14 DAP (0.4 ± 0.006 mm), whereas that of SWUK4 varied most at 21–28 DAP (0.94 ± 0.009 mm), followed by 28–35 DAP (0.63 ± 0.002 mm) (Fig. 1d), suggesting that the critical stage for SZ formation was 21–28 DAP in both B. rapa accessions, and the following stage, 28–35 DAP, was also quite important for the larger-seed accession. In addition, the two accessions showed the greatest increase in seed OC at 28–35 DAP (SWUK3: 18.02 ± 1.49%, SWUK4: 19.24 ± 2.84%) (Fig. 1e). The seed OC of SWUK4 increased greater than that of SWUK3 in all 7 developmental stages, especially 21–49 DAP (Fig. 1e). These results suggested that the critical stage for seed OC formation might be 28–35 DAP in B. rapa.

The silique length (SL) and thousand-seed weight (TSW) of the two B. rapa accessions were also significantly different. The SL of SWUK4 (40.90 ± 1.14 mm) was 1.6-fold longer than that of SWUK3 (25.00 ± 0.36 mm) (Additional file 1: Table S1), and the TSW of SWUK4 (7.06 ± 0.14 g) was approximate 2.9-fold greater than that of SWUK3 (2.42 ± 0.05 g) (Additional file 1: Table S1).

Transcriptome sequencing analysis

To generate transcriptional networks and identify key regulatory genes regulating SZ, SC, and OC in B. rapa, we collected 42 seed samples from two distinct accessions at seven seed developmental stages. Transcriptome sequencing generated a total of 320 Gb of raw data, including 1076.28 million reads, averaging 7.62 G and 25.63 million reads per sample (Additional file 1: Table S2). Filtering of low-quality and contaminant reads left 1034 million clean reads, with an average of 24.64 million reads per sample. The G + C content of all samples varied from 44.39% to 51.51%, and the range of Q30 was from 91.12% to 94.28%. After mapping, three samples (one sampled from SWUK3 at 14 DAP, the rest sampled from SWUK4 at 14 and 21 DAP) were removed because they had unique mapping rates of < 70%. This resulted in an average unique mapping rate of 87.27% (Additional file 1: Table S2), indicating that the quality of our transcriptome sequencing reads was good enough for further regulatory network construction and identification of DEGs.

We eliminated one sample from SWUK4 at 49 DAP from further analysis because the correlation coefficient calculated for two of its biological replicates was only ~ 0.62, suggesting that this sample may have been contaminated. The remaining 38 samples all showed high reproducibility (r2 > 0.9) among biological replicates at each seed sampling stage (Additional file 2: Fig. S1). The correlations among the samples harvested at different seed developmental stages were much lower than those among biological replicates. Furthermore, the samples could be divided into two distinct groups in both B. rapa accessions—samples collected at 7, 14, and 21 DAP grouped together, whereas those from the other four time points formed a second group—suggesting that there might be a clear transition event at 21–28 DAP during seed development in B. rapa (Additional file 2: Fig. S1b,d).

Identification of DEGs

Through comparing the same seven seed developmental stages between the two accessions and the adjacent stages in each accession, a total of 28,954 unique DEGs were identified (Fig. 2, Additional file 1: Table S3). In the former of comparison, the number of DEGs rose over the course of seed development, suggesting that the variation in seed transcriptome between the two accessions gradually increased from seed formation to maturation (Fig. 2). The highest number of DEGs, 14,725, was observed at 49 DAP, and the number of downregulated DEGs was higher than that of upregulated DEGs at six stages, with the exception being 49 DAP (Fig. 2). In the later comparison, both two accessions had the highest number of DEGs at 21 DAP vs. 28 DAP (SWUK3: 12,468, SWUK4: 8040) (Fig. 2, Additional file 1: Table S3), and the least number of DEGs at 14 DAP vs. 21 DAP (SWUK3: 1906, SWUK4: 1797) (Fig. 2, Additional file 1: Table S3). These results suggested that the 21–28 DAP might be the most important seed developmental stage in B. rapa, duo to the greatest number of DEGs. According to the transcriptome comparison between samples at adjacent developmental stages, the change of DEG number showed similar trend in two B. rapa accessions (Fig. 2), suggesting that there was not obvious difference of the seed developmental stage between the two accessions.

Fig. 2
figure 2

Number of DEGs. Classification of TF, TR, and kinase genes was performed using iTAK v1.6, (

An analysis of TFs, transcription repressors (TRs), and kinases using the online program iTAK identified a total of 3053 TFs, 602 TRs, and 1500 kinase genes in B. rapa (Additional file 1: Table S4). Among the DEGs, we detected 2261 unique differentially expressed TFs, indicating that the transcription of TFs (P = 2.48E−12) was regulated to a greater degree during seed development than was that of other genes, based on a Chi square test of the ratio between differentially expressed TFs and DEGs versus that between all TFs and all B. rapa genes. Among the TFs in B. rapa, the AP2/ERF–ERF, MYB, and bHLH families seem to play relatively important roles in seed development, as 169, 136, and 130 members of these families, respectively, were differentially expressed between the two accessions. Similar to TFs, the expression of 889 unique kinase genes and 355 unique TRs was also readily reprogrammed during seed development, but the significance levels (P = 6.33E−5 for kinase genes and P = 0.04 for TRs) determined using Chi square tests were lower than those for TFs.

Gene Ontology (GO) enrichment analyses of DEGs

GO enrichment analysis revealed that the genes upregulated at 35, 42, and 49 DAP were significantly enriched in the term “fatty acid biosynthetic process” (GO:0006633). Given that OC differed significantly between the two B. rapa accessions, we deduce that transcriptional variation of these genes may play a crucial role in FA accumulation in B. rapa. Another group of upregulated genes may result in larger seeds through mechanisms involving the cell cycle regulation and CK signaling pathway, as they showed enrichment in “response to cytokinin stimulus” (GO:0009735), “cell cycle” (GO:0007049), “cell division” (GO:0051301), “M phase” (GO:0000279), “M phase of mitotic cell cycle” (GO:0000087), and “regulation of cyclin-dependent protein kinase activity” (GO:0000079) at the middle to late stages of seed development. Among the downregulated genes, important functions related to SC could be identified (Additional file 1: Table S5, S6). The GO enrichment analyses showed that “phenylpropanoid biosynthetic process” (GO:0009699) and “flavonoid biosynthesis process” (GO:0009813) were enriched at all stages of seed development according to both methods, except at 49 DAP in the GO enrichment analysis. These results indicated that a reduction of pigment biosynthesis and accumulation may be the cause for the yellow seeds of B. rapa accession SWUK4 (Additional file 1: Table S5).

GO enrichment analysis of DEGs obtained from comparison of the adjacent stages revealed that the up-regulation genes of 21 DAP vs. 28 DAP and 28 DAP vs. 35 DAP in both two accessions were significantly enriched in GO term “fatty acid biosynthetic process” (GO:0006636), “lipid biosynthetic process” (GO:0008610), and “cellular lipid metabolic process” (GO:0044255), suggesting that 21–28 DAP and 28–35 DAP may be the key stages of seed OC information in B. rapa. Genes down-regulated in 7 DAP vs. 14 DAP of SWUK3 and up-regulated in 21 DAP vs. 28 DAP of SWUK4 were significantly enriched in GO terms “cell cycle” (GO:0007049) and “cell division” (GO:0051301), implying that the key stages for SZ increase were different between the two B. rapa accessions. (Additional file 1: Table S6).

Identification of expression patterns in DEGs by K-means clustering

To identify DEGs with similar expression patterns during seed development, we performed two independent K-means clustering tests for 28,954 DEGs in two accessions (Fig. 3a, b), and generated 12 optimal clusters for each accession. We identified several clusters with similar expression patterns between the two accessions, such as cluster 5 of SWUK3 and cluster 12 of SWUK4, both of which showed gene transcription levels that increased continuously with seed development (Fig. 3a, b). GO and KEGG enrichment analyses revealed that these genes were enriched in lipid storage, seed oil body biogenesis, and related pathways, suggesting that the expression levels of genes related to lipid biosynthesis elevated since 21 DAP in SWUK3, and 14 DAP in SWUK4, and both peaked at 35 DAP, which was consistent with the GO results of DEGs obtained from comparison of the adjacent stages in each accession.

Fig. 3
figure 3

K-means clustering of DEGs. a, b Twelve clusters of a SWUK3 and b SWUK4 based on FPKM values at seven seed developmental stages. c, d GO enrichment analysis results for genes in c cluster 2 of SWUK3 and d cluster 4 of SWUK4. e Heatmap of expression of cell cycle core genes in B. rapa. Gene expression levels were transformed with the log2(FPKM + 1). All genes are listed in Additional file 1: Table S7

A comparison of gene clusters in the two accessions revealed an interesting result in cluster 2 of SWUK3 and cluster 4 in SWUK4. Although the two clusters have different expression patterns, they were significantly enriched in similar functions involved in regulating the SZ, including cell cycle, cell division, and mitosis (Fig. 3c, d). In Arabidopsis, a total of 59 genes have been identified as core genes regulating the cell cycle and cell division. Based on a BLASTP analysis, we identified 101 homologs of these genes in B. rapa (Additional file 1: Table S7). Among these, the majority of positive regulatory genes showed similar patterns upon K-means clustering: they were highly transcribed at 7 and 28 DAP in SWUK3, but at 21–35 DAP in SWUK4 (Fig. 4e). Notably, the expression patterns were consistent with the respective phenotypic variations of SZ in the two accessions (Fig. 1d). Therefore, duration of cell cycle-specific gene expression at different stages may contribute to the difference of B. rapa seed SZ, and merit further investigation.

Fig. 4
figure 4

The WGCNA MEgreen module is significantly associated with SZ formation. a Scatter plot of the correlation of module membership (correlation coefficients between genes with MEgreen module) and gene significance (correlation coefficients between genes with trait of SZ increase). b Heatmap of MEgreen module genes in two B. rapa accessions, which was displayed based on log2(FPKM + 1). c Primary co-expression network of Bra.A05TSO1. Square, down arrow, triangle, and disc represent TF, TR, kinase, and other genes, respectively. The edge width represents the weight value between the two nodes: the higher the value of the weight between the nodes, the wider the edge. The regulatory relationships between the two nodes obtained from PlantRegMap are represented by red edges. d The regulatory model of TSO1 controlling seed size, with the heatmap of genes profiles in the two accessions. TSO1 play an important role in regulating the genes of G2/M phase, and MYB3R1 may bind to TSO1

Construction of co-expression networks

In crops, complex traits are generally regulated by several transcriptional networks. To identify the co-expression networks associated with our target traits, we used the R WGCNA software based on the fragments per kilobase per million mapped (FPKM) expression matrix and phenotypic data of six traits: developmental stage, SZ, SZ increase, seed yellowness, OC and OC increase. The sample clustering and correlation coefficients revealed strong repeatability among the biological replicates, and no outliers needed to be removed (Additional file 2: Fig. S2a). The results were also in agreement with those from the calculation of correlation coefficients, indicating that the seed samples from 7–21 DAP were grouped together, whereas the samples from the remaining stages formed a second group, again suggesting that seed transcriptome reprogramming is closely related to seed developmental stage, but weakly affected by accession.

In the WGCNA pipeline, pickSoftThreshold calculation revealed that the optimal soft threshold was 18, where the fitting curve approached 0.9 (Additional file 2: Fig. S2b). Then, we used the automatic blockwiseModules network construction approach to identify co-expression modules (Additional file 2: Fig. S2c). This allows the visualization of modules by color scheme, showing genes that are highly correlated in the same color, and genes that are weakly correlated in different colors (Additional file 2: Fig. S2d). The module construction process demonstrated that our functional color modules were clearly divided. After merging of modules with similar expression pattern, this process produced a total of 15 color modules, each composed of genes with similar expression patterns over time (Additional file 2: Fig. S3a).

As shown in Additional file 2: Fig. S3a, the abovementioned six traits (developmental stage, SZ, SZ increase, seed yellowness, OC and OC increase) showed significant correlation with different modules. After enrichment analysis of GO for each color module (Additional file 1: Table S8), the genes of the module defined as MEgreen, which had the highest correlation with SZ increase (r2 = 0.73, P = 2E−7), were significantly enriched in “cell cycle” (GO: 0007049), “cell division” (GO: 0051301), “regulation of cell cycle” (GO: 0051726), and “DNA replication” (GO: 006260) (Additional file 2: Fig. S3b). This indicated that the genes related to cell cycle participated in the positive regulation of seed growth rate. The functions of genes in the MEsalmon color module, which showed a significant negative correlation with seed coat transparency (r2 = –0.69, P = 1E−6), were enriched in the terms “flavonoid metabolic process” (GO: 0009812), “flavonoid biosynthetic process” (GO: 0009813), “phenylpropanoid metabolic process” (GO: 0009698), and “phenylpropanoid biosynthetic process” (GO: 0009699) (Additional file 2: Fig. S3c). This revealed that flavonoids negatively regulate yellow seed coat, such as that found in SWUK4. Furthermore, the MEbrown module, which is closely correlated with SZ (r2 = 0.7, P = 1E−6) and OC increase (r2 = 0.7, P = 1E−6), was enriched in “photosynthesis” (GO: 015979), “photosynthesis, light harvesting” (0009765), “lipid biosynthetic process” (GO: 0008610), and “fatty acid biosynthetic process” (GO: 0006633) (Additional file 2: Fig. S3d). This demonstrated that lipid synthesis occurs in the late stages of seed development in B. rapa, and requires photosynthesis to provide necessary materials and energy.

Co-expression modules regulating SZ and SC

Co-expression network construction indicated that a SZ increase was most highly correlated with the MEgreen module (r2 = 0.73, P = 2E−7) (Additional file 2: Fig. S3a). The expression profiles of a large number of genes were highly correlated with both the module eigengene (average expression profile of module genes) and SZ in the MEgreen module (Fig. 4a). GO enrichment analysis indicated that the MEgreen module genes were significantly enriched in “cell cycle” (GO:0007049) and “cell division” (GO:0051301) (Additional file 2: Fig. S3b). A heatmap based on the expression levels of the MEgreen module genes revealed that the module eigengene of the MEgreen module was similar to the average expression profiles of cluster 3 in SWUK3 and cluster 4 in SWUK4 from the K-means clustering results (Figs. 3a, b, 4b), suggesting that the SZ of B. rapa was regulated mainly by genes related to the cell cycle and cell division.

To identify hub genes in our modules of interest, we first assessed gene connectivity (Kwithin) on the basis of the absolute value of Pearson’s correlations. We then considered genes with the top 30% Kwithin in each module as hub genes of those modules. In the 800 genes of the MEgreen module, the Kwithin values ranged from 29.01 to 285.97. Based on the iTAK results, a total of 21 TFs, 17 TRs, and 36 kinase genes were identified in this module (Additional file 2: Fig. S4a). Two TF genes with high Kwithin values, Bra.A05TSO1 (BraA05g024430, Kwithin = 217.73) and Bra.A09GRAS (BraA09g015380, Kwithin = 201.97), were selected as hub genes of the MEgreen module. Their Arabidopsis orthologs are AtTSO1 (AT3G22780) and AtSCL28 (AT1G63100, a TF of the GRAS family), respectively. In the primary Bra.A05TSO1 network, Bra.A05TSO1 was directly co-expressed with 147 genes, including 2 TFs (belonging to the E2F and B3 families, respectively), 7 TRs, and 11 kinase genes (Fig. 4c). The primary network of Bra.A09GRAS contained 248 co-expressed genes, including 5 TFs (2 B3, 1 GARP and 1 TUB family), 9 TRs, and 15 kinase genes (Additional file 2: Fig. S4b).

Another module of interest was MEsalmon, which showed a significant negative correlation with seed yellowness (r2 = –0.69, P = 1E−6) (Additional file 2: Fig. S3a). The genes in this module were enriched in the GO terms “flavonoid biosynthetic process” (GO:0009813), “regulation of flavonoid biosynthetic process” (GO:0009962), and “phenylpropanoid biosynthetic process” (GO:0009699) (Additional file 2: Fig. S3c). The gene expression levels in this module were clearly lower in SWUK4 than in SWUK3 at most of the sampling stages (Fig. 5a). These results suggest that the formation of yellow seeds in SWUK4 is most likely due to weak expression or silencing of flavonoid biosynthesis pathway genes. The co-expression network of the MEsalmon module comprised 91 genes, including 5 TFs, 1 TR, and 2 kinase genes (Fig. 5b). The Kwithin values of these genes varied from 6.17 to 37.62. Two well-known phenylpropanoid biosynthetic pathway TF genes, Bra.A05TT2 (BraA05g008220, R2R3-MYB, Kwithin = 30.23) and Bra.A09TT8 (BraA09g028560, bHLH, Kwithin = 27.54), were identified as hub genes underlying SC formation (Fig. 5c, d).

Fig. 5
figure 5

The WGCNA MEsalmon module is negatively related to SC formation. a Heatmap of MEsalmon module genes in two B. rapa accessions. Gene expression levels are normalized by Z-score. b The MEsalmon co-expression network. c Primary co-expression network of Bra.A05TT2.d Primary co-expression network of Bra.A09TT8. Square, down arrow, triangle, and disc represent TF, TR, kinase, and other genes, respectively. The edge width represents the weight value between the two nodes: the higher the value of the weight between the nodes, the wider the edge

MEbrown module had the highest correlation coefficient with OC increase (r2 = 0.7, P = 1E−6) (Additional file 2: Fig. S3a). Genes in this module were enriched in the GO terms “lipid biosynthetic process” (GO:0008610), “fatty acid biosynthetic process” (GO:0006633), and “lipid metabolic process” (GO:0006629) (Additional file 2: Fig. S3d). Most of the genes in this module were most expressed at 28 and 35 DAP in the two accessions, but they had higher expression level in late seed development of SWUK4 (high OC) than SWUK3 (low OC) (Additional file 2: Fig. S5a). It indicates that the critical period for seed oil synthesis in B. rapa is 28–35 DAP. The co-expression network of the MEbrown module comprised 2268 genes, including 209 TFs, 21 TRs, and 79 kinase genes. The Kwithin values of them from 6.33 to 511.45. Three TFs with the highest Kwithin values, Bra.A03GRF5 (BraA03g036220, Kwithin = 433.30), Bra.A09WRI1 (BraA09g045300, Kwithin = 432.83), and Bra.A06FUS3 (BraA06g038070, Kwithin = 414.13), were selected as hub genes of the MEbrown (Additional file 2: Fig. S5b–d).

Role of flavonoid pathway genes in SC formation

Because the abovementioned analysis suggested that the differences in SC and OC between the two B. rapa accessions might be caused by flavonoid pathway and FA biosynthesis genes, we performed a further expression comparison analysis of these two types of genes. All expression values were subjected to log2(FPKM + 1) transformation and Z-score normalization.

For pathways involved in the formation of compounds contributing to SC (Fig. 6), there were 43 homologous genes in B. rapa (Additional file 1: Table S9). At the beginning of the pathway, the generation of p-cinnamoyl-CoA requires the three enzymes PAL, CH4, and 4CL, encoded by 14 genes in B. rapa (Additional file 1: Table S9). All of these genes showed significantly differential expression between the two accessions at one or more stages of seed development. Their expression patterns were similar, with expression increasing over time and being higher in the black-seeded accession SWUK3 at almost all stages. In the first step of flavonoid biosynthesis, the formation of p-cinnamoyl-CoA is catalyzed by CHS to produce the chalcone compound naringenin, which is then converted to its isomer naringenin in a reaction catalyzed by CHI. Next, F3H and F3′H catalyze reactions converting naringenin to dihydrokaempferol and then eriodictyol. Of the 12 genes encoding these four enzymes in B. rapa, only two (one CHI and one F3H gene) were not differentially expressed between the two accessions.

Fig. 6
figure 6

Flavonoid pathways, with enzymes encoded by genes involved in SC formation. Gene expression values in the heatmap were normalized by log2(FPKM + 1). All of these genes are listed in Additional file 1: Table S9. Asterisks (*) above the heatmap images indicate that a given gene is significantly more highly expressed in the B. rapa accession marked with an asterisk. Protein abbreviations: PAL phenylalanine ammonia lyase; C4H cinnamate 4-hydroxylase, 4CL 4-coumarate coenzyme A ligase; CHS; chalcone synthase, CHI chalcone isomerase, F3H flavonol 3-hydroxylase, F3′H flavonol 3′-hydroxylase, FLS flavonol synthase, DFR dihydroflavonol-4-reductase, LDOX leucoanthocyanidin dioxygenase; BAN BANYULS, TTG1 TRANSPARENT TESTA GLABRA1

The downstream portion of flavonoid biosynthesis is divided into two branches. In one branch, dihydrokaempferol and eriodictyol are catalyzed by FLS to generate dihydrokaempferol and dihydroquercetin. Of the five genes encoding this enzyme, only one gene was not differentially transcribed between accessions. In the other branch of the pathway, dihydrokaempferol and dihydroquercetin are catalyzed to become (–)-epiafzelechin and (–)-epicatechin by the enzymes DFR, LDOX, and BAN, which are encoded by five B. rapa genes. Notably, four of these genes were silenced and the fifth showed dramatically decreased transcription in the yellow-seeded accession SWUK4 at all seed developmental stages, indicating that the mutation of epicatechin branch genes may block pigment accumulation in B. rapa seeds.

Role of genes involved in lipid biosynthesis and storage in OC increase

In plants, lipid biosynthesis and accumulation in seeds can be divided into four steps (Fig. 7). (1) Pyruvate and other substances form C16-C18 FAs in the plastid. (2) FAs are transported into the cytoplasm, where they undergo elongation and desaturation of their carbon chains. (3) Various FAs and glycerol are catalyzed to synthesize triacylglycerols (TAG) and store them in the seed oil body, after which some are (4) degraded by products of the GDSL-type Seed Fatty Acid Reducer (SFAR) genes.

Fig. 7
figure 7

FA metabolism pathway genes regulating the OC. All the genes are listed in Additional file 1: Table S10. Gene expression values were normalized by Z-score. Asterisks (*) above the heatmap indicate that a given gene is expressed at significantly higher levels in the B. rapa accession marked with an asterisk. Protein abbreviations: PDHC pyruvate dehydrogenase complex, ACCase acetyl-CoA carboxylase, MCMT malonyl-CoA:ACP malonyltransferase, ACP acyl carrier protein, KAS II 3-ketoacyl-ACP synthase II, KAR ketoacyl-ACP reductase, HAD hydroxyacyl-ACP dehydrase, ENR enoyl-ACP reductase, SAD stearoyl-acyl carrier protein desaturase, FATA/B fatty acyl-ACP thioesterase A/B, FAX1 plastid fatty acid export 1, LACS9 long-chain acyl-CoA synthetase 9, GPDH glycerol-3-phosphate dehydrogenase, GPAT9 glycerol-3-phosphate acyltransferase 9, LPAAT lysophosphatidic acid acyltransferase, PAP phosphatidic acid phosphatase, DGAT diacylglycerol acyltransferase; LPCAT lysophosphatidylcholine acyltransferase, FAD2/3 fatty acid desaturase 2/3, PLD phospholipase D, PDAT phospholipid: diacylglycerol acyltransferase, PDCT phosphatidylcholine: diacylglycerol cholinephosphotransferase, CPT CDP-choline: diacylglycerol cholinephosphotransferase, PLC phospholipase C, OBO oil body oleosin, CALO caleosin, SFAR seed fatty acid reducer

In the first step, genes shared similar expression patterns in both accessions, being more highly expressed at 28–49 DAP than at 7–21 DAP (Fig. 7). However, most genes encoding proteins involved in this step were generally more highly expressed in the higher-OC accession SWUK4 at each developmental stage, and could be found in cluster 12 of SWUK3 and cluster 5 of SWUK4 in K-means clustering, indicating that oil synthesis in B. rapa seed is mainly initiated at a later stage of seed development. In the second step, the four genes encoding the relevant catalytic enzymes [3-Ketoacyl-ACP Synthase II (KAS II), Stearoyl acyl Carrier Protein Desaturase (SAD), Fatty Acid Export 1 (FAX1), and Long Chain Acyl-CoA Synthetase 9 (LACS9)] were more highly expressed in SWUK4 (Fig. 7), suggesting that SWUK4 seeds have a larger acyl-CoA pool than SWUK3 seeds. Differential expression of genes encoding Lysophosphatidylcholine Acyltransferase (LPCAT) and FA Desaturase 2/3 (FAD2/3) between the two accessions may cause the significant variation in FA components, including oleic acid, linoleic acid, and linolenic acid, in mature seeds detected by GC analysis (Additional file 1: Table S1). In the third step, TAG is formed in the endoplasmic reticulum catalyzed by four enzymes, glycerol-3-phosphate acyltransferase 9 (GPAT9), 1-acylglycerol-3-phosphate acyltransferase (LPAAT), phosphatidic acid phosphatase (PAP), and diacylglycerol acyltransferase (DGAT) [26]. The majority of genes encoding these four enzymes were more highly expressed in SWUK4 than in SWUK3 at stages 28–49 DAP, implying that greater TAG biosynthesis and storage in seed oil body might be the cause for the higher OC in SWUK4. In the fourth step, SFAR genes negatively regulate FA storage and seed oil body size, leading to FA degradation. B. rapa has 12 SFAR genes, but only two of them showed significant upregulation in the higher-OC accession SWUK4 at late sampling stages, suggesting that this accession does not adopt the strategy of negatively regulating SFAR activity to increase OC. In addition, the overexpression of SFAR genes resulted in the increase of C18:1 and C20:1, and the decrease  of C18:2 and C18:3 proportions [27], which explained why the SWUK4 has the higher content of oleic acid and lower content of linoleic acid and linolenic acid in its mature seeds (Additional file 1: Table S1). However, whether the OC of accession SWUK4 could be increased by reducing FA degradation remains to be established.

qRT-PCR validation

The relative expression levels of the 15 key DEGs at the seven seed developmental stages were analyzed by qRT-PCR to assess the accuracy of the RNA-seq results. The relative expression patterns of DEGs tested were positively correlated with the fold change variations obtained from the RNA-seq results (Fig. 8). The correlation coefficients between qRT-PCR and RNA-seq for the 15 DEGs ranged from 0.75 to 0.92. Further comparison showed that all the lower correlation coefficients between the two approaches belonged to three flavonoid pathway genes, Bra.A01TT18, Bra.A01ANR, and Bra.A09TT3. Due to the silencing of the three genes in the yellow-seeded B. rapa accession, their FPKM values were zero, but had to be set to 0.001 for fold change calculation, leading to lower correlation coefficients. However, the remaining 12 genes associated with FA metabolism and cell cycle progression all showed better correlation than the flavonoid pathway genes, demonstrating the reliability and accuracy of our RNA-seq results.

Fig. 8
figure 8

qRT-PCR validation of DEGs. Fifteen genes were selected for validation, including three encoding proteins involved in SC formation (Bra.A01ANR, Bra.A09TT3, and Bra.A01TT18) and four related to lipid biosynthesis and degradation (Bra.A03SFAR5, Bra.A07LACS9, Bra.A06FATA, and Bra.A09HAD); the remaining genes are related to the cell cycle. The blue lines represent RNA-seq results and the orange lines represent qRT-PCR results. Student’s t-test was used for statistical analysis of data from the two B. rapa accessions (*P < 0.05; **P < 0.01; ***P < 0.001)


Cell cycle-related genes may regulate SZ in B. rapa

B. rapa is a model plant for Brassica species. Understanding the molecular mechanism regulating B. rapa SZ can provide essential reference information for other Brassica crops. In this study, we investigated these mechanisms in two B. rapa accessions with significant differences in seed development progression and final SZ. WGCNA co-expression network analysis showed that one expression module, i.e., the MEgreen module, had the highest correlation with SZ increase (Additional file 2: Fig. S3a), and the genes in the module were significantly enriched in cell cycle-related functions. Using the K-means clustering method, we obtained two clusters with different expression profiles but similar functions in the cell cycle from two B. rapa accessions. A comparison between the WGCNA and K-means clustering results indicated that both methods identified as critical genes those associated with the cell cycle, cell division, and mitotic cell cycle, and that these were significantly correlated with traits of interest, suggesting that transcriptional regulators of cell cycle-related genes might be potential targets for improvement of seed yield in B. rapa and related Brassica crops.

Cell cycle regulation plays a crucial role in the growth and development of plant organs, affecting both cell size and cell number. We identified 101 cell cycle core genes in B. rapa, most of which were more highly expressed in the large-seeded accession SWUK4 than in the small-seeded accession SWUK3 (Fig. 3e). The cyclin-dependent kinases (CDKs) play a central role in cell cycle regulation, and their activity is also controlled by regulatory subunits, such as cyclins and E2F TFs [28].

Previous studies have shown that SZ is regulated by cell cycle-related genes. A genome-wide association study of Arabidopsis SZ variation identified a B1-type CDK gene, AtCYCB1;4, and revealed that AtCYCB1;4 overexpression plants produced larger seeds than wild-type plants [29]. Arabidopsis AtCYCB1;2 is responsible for stimulating the G2/M transition and is critical for cell division cycle progression during seed development [30, 31]. Brassica rapa has four CYCB1 genes (one CYCB1;1, two CYCB1;2, and one CYCB1;3 genes), of which three, Bra.A02CYCB1;2, Bra.A03CYCB1;2, and Bra.A01CYCB1;3, showed higher expression levels in the large-seeded accession SWUK4 at seed developmental stages from 28 to 49 DAP, the fast-growing stages of seeds in B. rapa.

In addition, the expression patterns of the two CYCB1;2 genes were positively correlated with SZ increase, suggesting that CYCB1;2 might be a key regulatory target for SZ trait improvement in Brassica crops. Eight other CDK genes (including four CYCB, three CYCA, and one CDKB genes) were also differentially expressed between the two B. rapa accessions. Although they were not included in the co-expression networks, the relationship of their expression to that of genes in these networks is also worthy of further investigation given their importance in cell cycle regulation.

The conserved DREAM complex may control seed size as an upstream coordinator

In the past 20 years, an evolutionarily conserved complex known as the DREAM complex has been identified as a master coordinator of cell cycle-dependent gene expression in animals [32]. DREAM consists of a core complex of five members, LIN9, LIN37, LIN52, LIN 53 (RBBP4), and LIN54, called MuvB (multivulva class B proteins) or MIP (MYB-interacting proteins) in Caenorhabditis elegans and mammals [32]. Perturbations in DREAM complex regulation trigger a shift of the balance from quiescence towards proliferation and result in increased mitotic gene expression that is frequently observed in human cancers.

The conserved function of the DREAM complex has also been identified in plants. In Arabidopsis, the homolog of LIN54, AtTSO1, coordinates with MYB3R1 to activate or inhibit cell cycle-related genes, and regulates the development of the root apical meristem and shoot apical meristem [33]. LIN54 is a core subunit of the DREAM complex. Downregulating LIN54 expression disrupts the DREAM complex, thereby reducing the expression of G2/M cell cycle genes [34].

In this study, we constructed a co-expression network with the hub TF Bra.A05TSO1, and identified the regulatory relationship between Bra.A05TSO1 and its 23 downstream target genes using PlantRegMap (Fig. 4c). Bra.A05TSO1 could bind cis-regulatory elements in the promoters of two cell cycle core genes, Bra.A02CYCB1;2 and Bra.A03CYCB1;2, to stimulate their expression (Fig. 4c), suggesting that the regulatory module TSO1-CYCB1;2 may contribute to the G2/M transition so as to control the cell cycle progression and affect the final SZ in plants, reminiscent of the similar mechanism adopted by LIN54 in animals (Fig. 4d). Hence, it is likely that the function of the DREAM complex in cell cycle regulation is conserved between plants and animals.

MYB binding with MuvB is a key factor in DREAM complex regulation of the cell cycle. When the cell leaves the S phase and enters the G2 phase, MYB binds the MuvB complex to activate genes required for mitosis [32]. However, B-MYB may also cause transcriptional suppression in mammals [30]. The Arabidopsis homologs of the Myb oncoprotein are the MYB3R (R1R2R3-MYB) TFs, which contain similar domains and also regulate the expression of G2 and M phase cell cycle genes [31]. Among the five Arabidopsis MYB3Rs, MYB3R1 not only can be redundant with MYB3R4 in transactivating the expression of specific genes at the G2 and M phases, including that of CYCB1 [28, 35], but also can act as a repressor that inhibits the expression of cell cycle-related genes, similarly to MYB3R3 and MYB3R5 [32].

In our study, we identified 11 members of the MYB3R gene family. Though we did not identify MYB3R genes in the MEgreen module, we did identify Bra.A05TSO1, Bra.A02CYCB1;2, and Bra.A01MYB3R1 (BraA01g005340) in cluster 6 of SWUK4 in the K-means clustering analysis, suggesting that the three genes have similar expression patterns in the larger-seeded accession. Hence, we propose a model of SZ regulation in B. rapa whereby the TF Bra.A05TSO1 positively regulates BraCYCB1;2 either directly or in coordination with Bra.A01MYB3R1 (as in the DREAM complex in animals), and then promotes cell division to produce larger seeds (Fig. 4d).

TT8 might be a conserved key target for creating yellow-seeded Brassica crops

Yellow seed color is a desirable trait with great potential for improving oil and protein content and meal quality in Brassica crops. Though several yellow-seeded varieties of B. rapa, B. napus, B. juncea, and Brassica carinata have been bred, and the inheritance of SC traits in these varieties has been studied, the molecular mechanism controlling the target trait remains to be unraveled. However, a similar trait regulated by the flavonoid pathway has been well characterized at the molecular level by utilizing transparent testa (tt) mutants, which carry mutations that hinder flavonoid accumulation and modify pigmentation in the Arabidopsis seed coat [36,37,38].

Our WGCNA identified a module, denoted MEsalmon, that is significantly negatively correlated with B. rapa SC (Additional file 2: Fig. S3a). The two TF genes Bra.A09TT8 and Bra.A08TT2 were identified as hub genes in this co-expression network (Fig. 5c, d). Previous studies had revealed that three TF regulatory genes, AtTT2, AtTT8, and TTG1 (encoding a WD40 protein), could form an MYB-bHLH-WD40 (MBW) complex that activates proanthocyanidin (PA)-specific genes (such as TT3, TT18, and ANR) during seed coat development [39]. In our results, two of the genes encoding MBW complex members, Bra.A08TT2 and Bra.A06TTG1, seem to be unimportant for SC formation in B. rapa, since their expression was either unregulated or higher in the yellow-seeded accession at one stage only. However, the product of the third MBW gene, Bra.A09TT8, may play a critical regulatory role in SC formation as a core TF in the flavonoid pathway. Our results indicated that Bra.A09TT8 expression in the yellow-seeded accession was very weak, and significantly lower than in the black-seeded accession, at all seed developmental stages except at 49 DAP. Therefore, it is likely that the accumulation of the PA precursor epicatechin might be blocked due to the silencing of Bra.A09TT8, consistent with a previous study showing that the insertion of a transposable element in BraTT8 caused a loss of function of BraTT8 in the yellow-seeded B. rapa var. yellow sarson [40]. Recent studies have also revealed that simultaneous natural mutations of two homologous TT8 genes in allotetraploid B. juncea resulted in the yellow-seeded trait [41], and a similar mutant also could be created in B. napus using CRISPR/Cas9 technology [42].

We also observed that the regulatory network of TT8 was well conserved in Arabidopsis and Brassica crops. In Arabidopsis seeds, the MBW complex AtTT8-AtTT2-AtTTG1 positively regulates PA synthesis by directly binding to the cis-regulatory elements of DFR, LODX, and BAN in the flavonoid biosynthesis pathway [43]. In the primary co-expression network of Bra.A09TT8, we identified three downstream target genes, Bra.A09TT3 (BraA09g019440), Bra.A01ANR (BraA01g029500), and Bra.A03ANR (BraA03g064730), that were almost silenced in the yellow-seeded B. rapa accession during seed development. Though two TT18 genes (Bra.A01TT18 and Bra.A03TT18) in B. rapa were not involved in the Bra.TT8-containing primary network, a positive regulatory relationship between TT8 and TT18 could be observed due to their similar weak expression in the yellow-seeded B. rapa accession. A similar network has also been identified in B. napus [42, 44, 45], Thus, we infer that the regulatory network of the TT8-involved complex was evolutionarily conserved between Arabidopsis and Brassica crop species. Considering the divergence of the two genera, the conserved TT8 genes might be optimal target loci for manipulation in the breeding of yellow-seeded varieties of Brassica and related crops.

GRF5, WRI1, and FUS3 may be key TF genes on increasing seed OC in B. rapa

B. rapa is one of the most important oil crop in the world. It is essential to explore the molecular mechanism on seed OC formation in B.rapa. Our WGCNA identified the MEbrown module, which was the highest correlation with OC increase and significantly associated with fatty acid biosynthsis (Additional file 2: Fig. S3a, d). Three TF genes Bra.A03GRF5, Bra.A09WRI1, and Bra.A06FUS3 were identified as hub genes in this regulatory network (Additional file 2: Fig. S5b–d).

The hub TF Bra.A03GRF5, as the highest Kwithin value of TFs in MEbrown module, may be a novel target in the regulation of FA biosynthesis. In previous researches, GRFs were identified as transcription activators [46], and invoved in ovule formation and female reproductive development in Arabidopsis [47]. They were also expressed in rice embryo and maize ear [48, 49], suggesting that GRFs may play role in seed development regulation. In addition, GRF2-like genes were highly expressed in high seed OC accessions in B. napus, and overexpression of BnGRF2 in Arabidopsis produced higher seed OC than wild type [50]. In our study, Bra.A03GRF5 co-expressed directly with 36 genes related to TAG biosynthesis, such as Bra.A07LACS9, Bra.A09PDHC (E1-β) and Bra.A02KASI, which may be the downstream target genes of Bra.A03GRF5. (Additional file 2: Fig. S5b). Hence, it could be expected that Bra.A03GRF5 may be involved in regulation of seed lipid biosynthesis, similar with BnGRF2.

In addition, Bra.A09WRI1 and Bra.A06FUS3 were also identified as hub TFs in MEbrown module, and co-expressed with genes involved in lipid biosynthesis (Additional file 2: Fig. S5c,d). Overexpression of JcWRI1 increased Jatropha seed OC and seed mass [51]. Heterologous overexpression of homologs of Arabidopsis AtWRI1, such as B. napus, Zea mays and Elaeis guineensis, led to seed OC increase in Arabidopsis [52,53,54]. FUS3 is an well-known TF on enhancing TAG biosynthesis. The fus3 mutant plants shown a decrease in lipid accumulation in Arabidopsis [55], and suppression of BnFUC3 also decreased OC in B. napus [56]. Besides, previous study also shown that LEC1 was upstream of WRI1 and FUS3 to promote lipid synthesis [57, 58], and the WRI1 was also activated by FUS3 [59, 60]. In our study, Bra.A07LEC1 was simultaneously occurred in the primary networks of Bra.A09WRI1, Bra.A06FUS3 and Bra.A03GRF5. And the three TF genes were also directly co-expressed each other. Our results suggested the conservative function of WRI1 and FUS3 in B. rapa, and confirmed the reliability of the construction of our WGCNA results. However, it is interested and worth to explore that the role of GRF5 in seed OC increase and the interaction between the three TF genes.


In this study, we identified 28,954 DEGs from transcriptome comparisons of seeds sampled from a pair of B. rapa accessions with different SZ, SC, and OC at seven seed developmental stages. Both K-means clustering and WGCNA identified a group of cell cycle-related genes whose expression was also positively correlated with SZ increase, and indicated that the TF Bra.A05TSO1 may positively stimulate the expression of two CYCB1;2 genes to increase the SZ through regulation of the G2/M transition. In a module whose expression was negatively correlated with seed yellowness, a conserved TT8-involved complex may determine the SC through downregulation of the key TF gene TT8 and its targets TT3, TT18, and ANR in the flavonoid pathway. Moreover, in a module that is enriched in fatty acid biosynthesis, we identified a novel key TF Bra.A03GRF5 may increase the seed OC, and found two conservative TFs WRI1 and FUS3 on regulating seed OC. Upregulated genes involved in triacylglycerol biosynthesis and storage in the seed oil body may increase the OC of B. rapa. This study thus unravels the regulatory mechanisms underlying the variation of SZ, SC, and OC in B. rapa seeds, and may facilitate genetic engineering efforts to breed Brassica crops with improved yield and OC.

Materials and methods

Plant materials

Seeds of two B. rapa accessions, SWUK3 (small black seeds with lower OC) and SWUK4 (large yellow seeds with higher OC), were obtained from the Chongqing Rapeseed Engineering Research Center. All seeds were sown at the beginning of October, 2018, and transferred into the field at Southwest University, Beibei, Chongqing, China (29° 45′ N, 106° 22′ E, 238.57 m), 1 month later. Each accession was planted in a plot of five rows, 10 plants per row, with 20 cm between plants within each row, and 30 cm between rows. Based on the field trail, the duration time from flowering to final maturation of B. rapa accessions were similar, about 53 and 56 days for SWUK3 and SWUK4, respectively.

Phenotype measurements

Seeds were collected at seven seed developmental stages: 7, 14, 21, 28, 35, 42, and 49 DAP. Fifty seeds were harvested from the siliques on the main inflorescences at each stage to measure SZ. Based on the Biologische Bundesanstalt, Bundessortenamt, and CHemical (BBCH) industry scale [61], six representative plants from the middle of each plot were harvested at growth stage 99 (harvested product) for measurement of SZ, TSW, and SL. The mature seed OC and seed coat transparency were assayed using NIRS (DS2500) with previously established models for Brassica crops [62]. The FA composition, specifically the compositions of palmitic acid, stearic acid, oleic acid, linoleic acid, and linolenic acid, was analyzed by gas–liquid chromatography on a Perkin Elmer Gas Chromatograph Model GC-2010 (Shimadzu, Kyoto, Japan) equipped with a fused silica capillary column DB-WAX (30 m × 0.25 mm id, 0.25 μm film thickness, J&W, Folsom, CA, USA), as reported previously [63]. The OC in seeds harvested from seven developmental stages was assayed by GC–MS, as previously described [64]. For each trait, at least five independent biological replicates were measured.

Transcriptome sequencing and identification of DEGs

To compare the transcriptomic variations between the two B. rapa accessions, a total of 42 seed samples (2 accessions × 7 stages × 3 biological replicates) were collected. Total RNAs of abovementioned samples were isolated using an RNAprep Pure Plant Kit (Tiangen, Beijing, China), and sent to Novogene Corporation (Beijing, China) for library construction and transcriptome sequencing on an Illumina HiSeq 2500 platform. The raw sequencing data were deposited in the BIG Data Center (BIGD) under BioProject accession number PRJCA002339.

Low-quality reads, connectors, and barcode sequences were eliminated using Trimmomatic-0.39 [65] Then, the clean data were aligned to the B. rapa reference genome, v3.0 (, using STAR-2.5.3 [66]. Gene expression levels were quantified as count number and FPKM using the programs featurecounts and cuffquant, respectively [67]. The correlation relationships among all the samples were examined by principal component analysis (PCA), and the correlation coefficients were determined through the R package ggfortify and SPSS15.0 [68]. DEGs were identified using the R package DEseq 2, based on the criteria of false discovery rate (FDR, Benjamini–Hochberg multiple test correction) < 0.01 and absolute fold change > 2 [69].

Identification of transcription factors, and genes associated with SC and acyl-lipid metabolism

To identify and classify transcription factors (TFs), transcription repressors (TRs), and kinases, all the protein sequences of B. rapa were analyzed using the online program iTAK v1.6, ( [70].

Since the SC and OC differed significantly between the two B. rapa accessions, we obtained all of the genes involved in the flavonoid pathway that play critical roles in SC formation, as well as those involved in acyl-lipid metabolism, in Arabidopsis from the Arabidopsis Acyl-Lipid Metabolism database (ARALIP) ( [71]. Subsequently, a reciprocal BLASTP with an E-value cut-off of 1E−5 was used to identify the homologous relationships between the Arabidopsis and B. rapa genes [72], and then the protein sequences were analyzed with Pfam Scan ( to further confirm the existence of corresponding functional domains.

K-means clustering of DEGs

K-means clustering is an effective approach to identify gene expression patterns for transcriptome data. To establish the expression profiles of DEGs with potential biological function in regulating our target traits in B. rapa, cluster analysis was performed by the K-means method with Pearson’s correlation distance using the cluster package in R. Optimal number of clusters for K-means were estimated by the gap statistic that computed using the clusGap function in R package factoextra [73]. K-means clustering was then performed with the optimal cluster number 12 for both two B. rapa accessions. Heatmaps were created using the expression values with log2(FPKM + 1) and Z-score normalization and visualized using the pheatmap package in R.

Weighted gene co-expression network analysis (WGCNA)

To detect co-expression modules and key regulatory genes associated with target seed traits in B. rapa, we generated co-expression networks using the WGCNA package in R as previously described [74]. Briefly, only expressed genes with average FPMK values higher than 1 in any seeds were retained and were then log2(FPKM + 1) transformed before further processing. The soft thresholding power was determined using pickSoftThreshold function based on the scale-free topology model fit (R2) > 0.9. Then, the automatic blockwiseModules network construction approach was applied to obtain the highly correlated modules, with the following parameters: power, 18; TOM-type, unsigned; miniModuleSize, 50; maxBlockSize, 35,000; mergeCutHeight, 0.25. The Plant Transcriptional Regulatory Map (PlantRegMap, was used to analyze regulatory relationships between TFs and their co-expressed genes in the same network [75]. The co-expression and transcriptional regulatory networks were displayed using Cytoscape v3.5.1 [76].

GO and KEGG enrichment analyses

All the B. rapa genes were annotated with BLASTP against the Arabidopsis proteome (TAIR10) with an E-value cut-off of 1E−5 (Altschul et al. [72]). GO enrichment analysis was performed using the BiNGO plug-in in Cytoscape v3.5.1 [76]. Significantly overrepresented GO terms were identified with the threshold of FDR < 0.05. The online OmicShare tool (, a free online platform, was used to perform KEGG pathway enrichment analysis. Bubble plots for GO terms and KEGG pathways were generated using the R package ggplot2 [77].

qRT-PCR validation

To confirm the accuracy of transcriptomic sequencing and identification of DEGs, the cDNA was synthesized from 1 μg of total RNA that used for transcriptomic sequencing using a PrimeScript RT Master Mix Kit (TaKaRa, Dalian, China). A total of 15 DEGs (3, 4, and 8 genes involved in flavonoid pathway, acyl-lipid metabolism, and seed development, respectively) were selected for qRT-PCR assays. All gene-specific primers were retrieved from the qPrimerDB database ( [78]. All qRT-PCR assays were carried out following the Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE) guidelines. Bna.UBC21 and BnaACT7 were used as internal controls, and relative expression levels were calculated using the 2−ΔΔCt method [25].

Availability of data and materials

The datasets supporting the conclusions of this article are included within the article and its additional files.





Arabidopsis Acyl-Lipid Metabolism database




BIG Data Center








Days after pollination


Differentially expressed gene


Diacylglycerol acyltransferase




Fatty acid


FA Desaturase


Fatty Acid Export 1


Fragments per kilobase per million mapped reads


Gas chromatography


Gene Ontology


Glycerol-3-Phosphate Acyltransferase 9






3-Ketoacyl-ACP synthase II


Kyoto Encyclopedia of Genes and Genomes


Long Chain Acyl-Coa Synthetase 9


1-Acylglycerol-3-phosphate acyltransferase


Lysophosphatidylcholine acyltransferase


Minimum Information for Publication of Quantitative Real-Time PCR Experiments


Near-infrared reflection spectroscopy


Oil content


Phosphatidic acid phosphatase


Principal component analysis


Quantitative real-time PCR


Stearoylacyl Carrier Protein Desaturase


Seed color


Seed Fatty Acid Reducer




Silique length






Seed size




transcription factor


Transcription repressor


Thousand-seed weight






Weighted gene co-expression network analysis


  1. Syrenne RD, Shi W, Stewart CN, Yuan JS. Omics platforms: Importance of twenty-first century genome-enabled technologies in seed developmental research for improved seed quality and crop yield. In: Agrawal G, Rakwal R, editors. Seed development: OMICS technologies toward improvement of seed quality and crop yield. Dordrecht: Springer; 2012. p. 43–57.

    Chapter  Google Scholar 

  2. Sundaresan V. Control of seed size in plants. Proc Natl Acad Sci USA. 2005;102:17887–8.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  3. Luo M, Dennis ES, Berger F, Peacock WJ, Chaudhury A. MINISEED3 (MINI3), a WRKY family gene, and HAIKU2 (IKU2), a leucine-rich repeat (LRR) KINASE gene, are regulators of seed size in Arabidopsis. Proc Natl Acad Sci USA. 2005;102:17531–6.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  4. Wang A, Garcia D, Zhang H, Feng K, Chaudhury A, Berger F, Peacock WJ, Dennis ES, Luo M. The VQ motif protein IKU1 regulates endosperm growth and seed size in Arabidopsis. Plant J. 2010;63:670–9.

    Article  CAS  PubMed  Google Scholar 

  5. Li J, Nie X, Tan JLH, Berger F. Integration of epigenetic and genetic controls of seed size by cytokinin in Arabidopsis. Proc Natl Acad Sci USA. 2013;110:15479–84.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  6. Li N, Li Y. Signaling pathways of seed size control in plants. Curr Opin Plant Biol. 2016;33:23–32.

    Article  PubMed  CAS  Google Scholar 

  7. Xia T, Li N, Dumenil J, Li J, Kamenski A, Bevan MW, Gao F, Li Y. The ubiquitin receptor DA1 interacts with the E3 ubiquitin ligase DA2 to regulate seed and organ size in Arabidopsis. Plant Cell. 2013;25:3347–59.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  8. Doughty J, Aljabri M, Scott RJ. Flavonoids and the regulation of seed size in Arabidopsis. Biochem Soc Technol. 2014;42(2):364–9.

    Article  CAS  Google Scholar 

  9. Si L, Chen J, Huang X, Gong H, Luo J, Hou Q, Zhou T, Lu T, Zhu J, Shangguan Y. OsSPL13 controls grain size in cultivated rice. Nat Genet. 2016;48:447.

    Article  CAS  PubMed  Google Scholar 

  10. Ohto M, Floyd SK, Fischer RL, Goldberg RB, Harada JJ. Effects of APETALA2 on embryo, endosperm, and seed coat development determine seed size in Arabidopsis. Sex Plant Reprod. 2009;22:277–89.

    Article  PubMed  PubMed Central  Google Scholar 

  11. Zuo J, Li J. Molecular genetic dissection of quantitative trait loci regulating rice grain size. Annu Rev Genet. 2014;48:99–118.

    Article  CAS  PubMed  Google Scholar 

  12. Riefler M, Novak O, Strnad M, Schmülling T. Arabidopsis cytokinin receptor mutants reveal functions in shoot growth, leaf senescence, seed size, germination, root development, and cytokinin metabolism. Plant Cell. 2006;18:40–54.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  13. Tang ZL, Li JN, Zhang XK, Chen L, Wang R. Genetic variation of yellow-seeded rapeseed lines (Brassica napus L.) from different genetic sources. Plant Breed. 1997;116:471–4.

    Article  CAS  Google Scholar 

  14. Winkel-Shirley B. Flavonoid biosynthesis A colorful model for genetics, biochemistry, cell biology, and biotechnology. Plant Physiol. 2001;126:485–93.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  15. Routaboul JM, Kerhoas L, Debeaujon I, Pourcel L, Caboche M, Einhorn J, Lepiniec L. Flavonoid diversity and biosynthesis in seed of Arabidopsis thaliana. Planta. 2006;224:96–107.

    Article  CAS  PubMed  Google Scholar 

  16. Appelhagen I, Thiedig K, Nordholt N, Schmidt N, Huep G, Sagasser M, Weisshaar B. Update on transparent testa mutants from Arabidopsis thaliana: characterisation of new alleles from an isogenic collection. Planta. 2014;240:955–70.

    Article  CAS  PubMed  Google Scholar 

  17. Zhang F, Gonzalez A, Zhao M, Payne CT, Lloyd A. A network of redundant bHLH proteins functions in all TTG1-dependent pathways of Arabidopsis. Development. 2003;130:4859–69.

    Article  CAS  PubMed  Google Scholar 

  18. Wang Y, Xiao L, Dun X, Liu K, Du D. Characterization of the BrTT1 gene responsible for seed coat color formation in Dahuang (Brassica rapa L. landrace). Mol Breed. 2017;37:137.

    Article  CAS  Google Scholar 

  19. Zhang J, Lu Y, Yuan Y, Zhang X, Geng J, Chen Y, Cloutier S, McVetty PBE, Li G. Map-based cloning and characterization of a gene controlling hairiness and seed coat color traits in Brassica rapa. Plant Mol Biol. 2009;69:553–63.

    Article  CAS  PubMed  Google Scholar 

  20. Kalscheuer R, Steinbüchel A. A novel bifunctional wax ester synthase/acyl-CoA: diacylglycerol acyltransferase mediates wax ester and triacylglycerol biosynthesis in Acinetobacter calcoaceticus ADP1. J Biol Chem. 2003;278:8075–82.

    Article  CAS  PubMed  Google Scholar 

  21. Kenedy EP. Biosynthesis of complex lipids. Fed Proc. 1961;20:934–40.

    Google Scholar 

  22. Sébastien B, Lepiniec L. Regulation of de novo fatty acid synthesis in maturing oilseeds of Arabidopsis. Plant Physiol Biochem. 2009;47:448–55.

    Article  CAS  Google Scholar 

  23. Li-Beisson Y, Shorrosh B, Beisson F, Andersson MX, Arondel V, Bates PD, Baud S, Bird D, Debono A, Durrett TP, Franke RB, Graham IA, Katayama K, Kelly AA, Larson T, Markham JE, Miquel M, Molina I, Nishida I, Rowland O, Samuels L, Schmid KM, Wada H, Welti R, Xu C, Zallot R, Ohlrogge J. Acyl-lipid metabolism. Arabidopsis Book. 2013;11:e0161.

    Article  PubMed  PubMed Central  Google Scholar 

  24. Wang X, Wang H, Wang J, Sun R, Wu J, Liu S, Bai Y, Mun JH, Bancroft I, Cheng F, Huang S, Li X, Hua W, Wang J, Wang X, Freeling M, Pires JC, Paterson AH, Chalhoub B, Wang B, Hayward A, Sharpe AG, Park BS, Weisshaar B, Liu B, Li B, Liu B, Tong C, Song C, Duran C, Peng C, Geng C, Koh C, Lin C, Edwards D, Mu D, Shen D, Soumpourou E, Li F, Fraser F, Conant G, Lassalle G, King GJ, Bonnema G, Tang H, Wang H, Belcram H, Zhou H, Hirakawa H, Abe H, Guo H, Wang H, Jin H, Parkin IA, Batley J, Kim JS, Just J, Li J, Xu J, Deng J, Kim JA, Li J, Yu J, Meng J, Wang J, Min J, Poulain J, Wang J, Hatakeyama K, Wu K, Wang L, Fang L, Trick M, Links MG, Zhao M, Jin M, Ramchiary N, Drou N, Berkman PJ, Cai Q, Huang Q, Li R, Tabata S, Cheng S, Zhang S, Zhang S, Huang S, Sato S, Sun S, Kwon SJ, Choi SR, Lee TH, Fan W, Zhao X, Tan X, Xu X, Wang Y, Qiu Y, Yin Y, Li Y, Du Y, Liao Y, Lim Y, Narusaka Y, Wang Y, Wang Z, Li Z, Wang Z, Xiong Z, Zhang Z, Brassica rapa Genome Sequencing Project C. The genome of the mesopolyploid crop species Brassica rapa. Nat Genet. 2010;43(10):1035–9.

    Article  CAS  Google Scholar 

  25. Lu K, Wei L, Li X, Wang Y, Wu J, Liu M, Zhang C, Chen Z, Xiao Z, Jian H, Cheng F, Zhang K, Du H, Cheng X, Qu C, Qian W, Liu L, Wang R, Zou Q, Ying J, Xu X, Mei J, Liang Y, Chai YR, Tang Z, Wan H, Ni Y, He Y, Lin N, Fan Y, Sun W, Li NN, Zhou G, Zheng H, Wang X, Paterson AH, Li J. Whole-genome resequencing reveals Brassica napus origin and genetic loci involved in its improvement. Nat Commun. 2019;10:1154.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  26. Chapman KD, Ohlrogge JB. Compartmentation of triacylglycerol accumulation in plants. J Biol Chem. 2012;287:2288–94.

    Article  CAS  PubMed  Google Scholar 

  27. Chen M, Du X, Zhu Y, Wang Z, Hua S, Li Z, Guo W, Zhang G, Peng J, Jiang L. Seed Fatty Acid Reducer acts downstream of gibberellin signalling pathway to lower seed fatty acid storage in Arabidopsis. Plant Cell Environ. 2012;35:2155–69.

    Article  CAS  PubMed  Google Scholar 

  28. de Veylder L, Joubès J, Inzé D. Plant cell cycle transitions. Curr Opin Plant Biol. 2003;6:536–43.

    Article  PubMed  CAS  Google Scholar 

  29. Ren D, Wang X, Yang M, Yang L, He G, Deng XW. A new regulator of seed size control in Arabidopsis identified by a genome-wide association study. New Phytol. 2019;222:895–906.

    Article  CAS  PubMed  Google Scholar 

  30. Schnittger A, Schöbinger U, Stierhof Y-D, Hülskamp M. Ectopic B-type cyclin expression induces mitotic cycles in endoreduplicating Arabidopsis trichomes. Curr Biol. 2002;12:415–20.

    Article  CAS  PubMed  Google Scholar 

  31. Weingartner M, Pelayo HR, Binarova P, Zwerger K, Melikant B, de la Torre C, Heberle-Bors E, Bögre L. A plant cyclin B2 is degraded early in mitosis and its ectopic expression shortens G2-phase and alleviates the DNA-damage checkpoint. J Cell Sci. 2003;116:487–98.

    Article  CAS  PubMed  Google Scholar 

  32. Sadasivam S, DeCaprio JA. The DREAM complex: master coordinator of cell cycle-dependent gene expression. Nat Rev Cancer. 2013;13:585–95.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  33. Wang W, Sijacic P, Xu P, Lian H, Liu Z. Arabidopsis TSO1 and MYB3R1 form a regulatory module to coordinate cell proliferation with differentiation in shoot and root. Proc Natl Acad Sci USA. 2018;115:E3045–54.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  34. Schmit F, Cremer S, Gaubatz S. LIN54 is an essential core subunit of the DREAM/LINC complex that binds to the cdc2 promoter in a sequence-specific manner. FEBS J. 2009;276:5703–16.

    Article  CAS  PubMed  Google Scholar 

  35. Fischer M, DeCaprio JA. Does Arabidopsis thaliana DREAM of cell cycle control? EMBO J. 2015;34:1987–9.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  36. Shirley BW, Kubasek WL, Storz G, Bruggemann E, Koornneef M, Ausubel FM, Goodman HM. Analysis of Arabidopsis mutants deficient in flavonoid biosynthesis. Plant J. 1995;8:659–71.

    Article  CAS  PubMed  Google Scholar 

  37. Peer WA, Brown DE, Tague BW, Muday GK, Taiz L, Murphy AS. Flavonoid accumulation patterns of transparent testa mutants of Arabidopsis. Plant Physiol. 2001;126:536–48.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  38. Zhang B, Schrader A. TRANSPARENT TESTA GLABRA 1-dependent regulation of flavonoid biosynthesis. Plants. 2017;6:65.

    Article  PubMed Central  CAS  Google Scholar 

  39. Xu W, Dubos C, Lepiniec L. Transcriptional control of flavonoid biosynthesis by MYB-bHLH-WDR complexes. Trends Plant Sci. 2015;20:176–85.

    Article  CAS  PubMed  Google Scholar 

  40. Li X, Chen L, Hong M, Zhang Y, Zu F, Wen J, Yi B, Ma C, Shen J, Tu J. A large insertion in bHLH transcription factor BrTT8 resulting in yellow seed coat in Brassica rapa. PLoS ONE. 2012;7:e44145.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  41. Padmaja LK, Agarwal P, Gupta V, Mukhopadhyay A, Sodhi YS, Pental D, Pradhan AK. Natural mutations in two homoeologous TT8 genes control yellow seed coat trait in allotetraploid Brassica juncea (AABB). Theor Appl Genet. 2014;127:339–47.

    Article  CAS  PubMed  Google Scholar 

  42. Zhai Y, Yu K, Cai S, Hu L, Amoo O, Xu L, Yang Y, Ma B, Jiao Y, Zhang C. Targeted mutagenesis of BnTT8 homologs controls yellow seed coat development for effective oil production in Brassica napus L. Plant Biotechnol J. 2020;18:1153–68.

    Article  CAS  PubMed  Google Scholar 

  43. Xu W, Grain D, Bobet S, Le Gourrierec J, Thévenin J, Kelemen Z, Lepiniec L, Dubos C. Complexity and robustness of the flavonoid transcriptional regulatory network revealed by comprehensive analyses of MYB–bHLH–WDR complexes and their targets in Arabidopsis seed. New Phytol. 2014;202:132–44.

    Article  CAS  PubMed  Google Scholar 

  44. Hong M, Hu K, Tian T, Li X, Chen L, Zhang Y, Yi B, Wen J, Ma C, Shen J. Transcriptomic analysis of seed coats in yellow-seeded Brassica napus reveals novel genes that influence proanthocyanidin biosynthesis. Front Plant Sci. 2017;8:1674.

    Article  PubMed  PubMed Central  Google Scholar 

  45. Qu C, Fu F, Lu K, Zhang K, Wang R, Xu X, Wang M, Lu J, Wan H, Zhanglin T, Li J. Differential accumulation of phenolic compounds and expression of related genes in black- and yellow-seeded Brassica napus. J Exp Bot. 2013;64:2885–98.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  46. Kim JH, Kende H. A transcriptional coactivator, AtGIF1, is involved in regulating leaf growth and morphology in Arabidopsis. Proc Natl Acad Sci USA. 2004;101:13374–9.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  47. Wynn AN, Rueschhoff EE, Franks RG. Transcriptomic characterization of a synergistic genetic interaction during carpel margin meristem development in Arabidopsis thaliana. PLoS ONE. 2011;6:e26231.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  48. Ye R, Yao QH, Xu ZH, Xue HW. Development of an efficient method for the isolation of factors involved in gene transcription during rice embryo development. Plant J. 2004;38:348–57.

    Article  CAS  PubMed  Google Scholar 

  49. Zhang DF, Li B, Jia GQ, Zhang TF, Dai JR, Li JS, Wang SC. Isolation and characterization of genes encoding GRF transcription factors and GIF transcriptional coactivators in maize (Zea mays L.). Plant Sci. 2008;175:809–17.

    Article  CAS  Google Scholar 

  50. Liu J, Hua W, Yang HL, Zhan GM, Li RJ, Deng LB, Wang XF, Liu GH, Wang HZ. The BnGRF2 gene (GRF2-like gene from Brassica napus) enhances seed oil production through regulating cell number and plant photosynthesis. J Exp Bot. 2012;63:3727–40.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  51. Ye J, Wang C, Sun Y, Qu J, Mao H, Chua N-H. Overexpression of a transcription factor increases lipid content in a woody perennial Jatropha curcas. Front Plant Sci. 2018;9:1479.

    Article  PubMed  PubMed Central  Google Scholar 

  52. Liu J, Hua W, Zhan G, Wei F, Wang X, Liu G, Wang H. Increasing seed mass and oil content in transgenic Arabidopsis by the overexpression of wri1-like gene from Brassica napus. Plant Physiol Biochem. 2010;48:9–15.

    Article  CAS  PubMed  Google Scholar 

  53. Grimberg A, Carlsson AS, Marttila S, Bhalerao R, Hofvander P. Transcriptional transitions in Nicotiana benthamiana leaves upon induction of oil synthesis by WRINKLED1 homologs from diverses pecies and tissues. BMC Plant Biol. 2015;15:192.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  54. Jin J, Sun Y, Qu J. Transcriptome and functional analysis reveals hybrid vigor for oil biosynthesis in oil palm. Sci Rep. 2017;7:439.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  55. Meinke DW, Franzmann LH, Nickle TC, Yeung EC. Leafy cotyledon mutants of Arabidopsis. Plant Cell. 1994;6:1049–64.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  56. Elahi N, Duncan RW, Stasolla C. Decreased seed oil production in FUSCA3 Brassica napus mutant plants. Plant Physiol Biochem. 2015;96:222–30.

    Article  CAS  PubMed  Google Scholar 

  57. Wang F, Perry SE. Identification of direct targets of FUSCA3, a key regulator of Arabidopsis Seed development. Plant Physiol. 2013;136:2676–86.

    Google Scholar 

  58. Mu J, Tan H, Zheng Q, Fu F, Liang Y, Zhang J, Yang X, Wang T, Chong K, Wang X, Zuo J. LEAFY COTYLEDON1 is a key regulator of fatty acid biosynthesis in Arabidopsis. Plant Physiol. 2008;148:1042–54.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  59. Yamamoto A, Kagaya Y, Usui H, Hobo T, Takeda S, Hattori T. Diverse roles and mechanisms of gene regulation by the Arabidopsis seed maturation master regulator FUS3 revealed by microarray analysis. Plant Cell Physiol. 2010;51:2031–46.

    Article  CAS  PubMed  Google Scholar 

  60. Zhang M, Cao X, Jia Q, Ohlrogge J. FUSCA3 activates triacylglycerol accumulation in Arabidopsis seedlings and tobacco BY2 cells. Plant J. 2016;88:95–107.

    Article  CAS  PubMed  Google Scholar 

  61. Lancashire PD, Bleiholder H, Boom TVD, Langelüddeke P, Stauss R, Weber E, Witzenberger A. A uniform decimal code for growth stages of crops and weeds. Ann Appl Biol. 1991;119:561–601.

    Article  Google Scholar 

  62. Liu L, Qu C, Wittkop B, Yi B, Xiao Y, He Y, Snowdon RJ, Li J. A high-density snp map for accurate mapping of seed fibre QTL in Brassica napus L. PLoS ONE. 2013;8:1–9.

    Google Scholar 

  63. Qu C, Jia L, Fu F, Zhao H, Lu K, Wei L, Xu X, Liang Y, Li S, Wang R, Li J. Genome-wide association mapping and Identification of candidate genes for fatty acid composition in Brassica napus L. using SNP markers. BMC Genomics. 2017;18:232.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  64. Shen E, Zhu C, Hua S, Chen H, Ye C, Zhou L, Liu Q, Zhu Q, Fan L, Chen X. Genome-wide identification of oil biosynthesis-related long non-coding RNAs in allopolyploid Brassica napus. BMC Genom. 2018;19:745.

    Article  CAS  Google Scholar 

  65. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–20.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  66. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TRSTAR. Ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29:15–21.

    Article  CAS  PubMed  Google Scholar 

  67. Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, Pimentel H, Salzberg SL, Rinn JL, Pachter L. Differential gene transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc. 2012;7:562–78.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  68. Tang Y, Horikoshi M, Li W. ggfortify: unified interface to visualize statistical results of popular r packages. R J. 2016;8:474–85.

    Article  Google Scholar 

  69. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B. 1995;527:289–300.

    Google Scholar 

  70. Zheng Y, Jiao C, Sun H, Rosli HG, Pombo MA, Zhang P, Banf M, Dai X, Martin GB, Giovannoni JJ, Zhao PX, Rhee SY, Fei Z. iTAK: a program for genome-wide prediction and classification of plant transcription factors, transcriptional regulators, and protein kinases. Mol Plant. 2016;9:1667–70.

    Article  CAS  PubMed  Google Scholar 

  71. McGlew K, Shaw V, Zhang M, Kim RJ, Yang W, Shorrosh B, Suh MC, Ohlrogge J. An annotated database of Arabidopsis mutants of acyl lipid metabolism. Plant Cell Rep. 2015;34:519–32.

    Article  CAS  PubMed  Google Scholar 

  72. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–402.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  73. Kassambara A, Mundt F. Factoextra: extract and visualize the results of multivariate data analyses. R Packag Version. 2017;1:2017.

    Google Scholar 

  74. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinform. 2008;9:559.

    Article  CAS  Google Scholar 

  75. Tian F, Yang DC, Meng YQ, Jin J, Gao G. PlantRegMap: charting functional regulatory maps in plants. Nucleic Acids Res. 2020;48:D1104–13.

    PubMed  Google Scholar 

  76. Smoot ME, Ono K, Ruscheinski J, Wang PL, Ideker T. Cytoscape 2.8: new features for data integration and network visualization. Bioinformatics. 2011;27:431–2.

    Article  CAS  PubMed  Google Scholar 

  77. Ginestet C. ggplot2: elegant graphics for data analysis. J R Stat Soc Ser A Stat. 2011;174:245–6.

    Article  Google Scholar 

  78. Lu K, Li T, He J, Chang W, Zhang R, Liu M, Yu M, Fan Y, Sun W, Ma J, Qu C, Liu L, Li N, Liang Y, Wang R, Tang Z, Qian W, Xu X, Lei B, Zhang K, Li J. qPrimerDB: a thermodynamics-based gene-specific qPCR primer database for 147 organisms. Nucleic Acids Res. 2018;46:D1229–36.

    Article  CAS  PubMed  Google Scholar 

Download references


The authors are grateful to the reviewers and editors for their constructive review and suggestions for this paper.


This work was supported by grants from the National Key Research and Development Plan (2018YFD0100500 and 2016YFD0101007), the National Natural Science Foundation of China (31871653), the 111 project (B12006); the Natural Science Foundation of Chongqing, China (cstc2018jcyjA1219), and the Chongqing Agricultural Development Fund (NKY-2019AB008-2).

Author information

Authors and Affiliations



KL and LW conceived and designed the experiments; YN, YL and MQ performed the sampling and experiments; HH, HZ, and YX measured the phenotypic traits; WS, YF, UM, BX, and KZ contributed to data processing, analysis, interpretation, and discussion; KL and YN wrote the manuscript; CQ and JL contributed to discussion and helped with manuscript reviewing. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Kun Lu.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1: Table S1.

Phenotypic traits of two B. rapa accessions. Table S2. Overview of transcriptomic data and mapping efficiency. Table S3. Classification and overview of DEGs. Table S4. Identification of classification of all TF, TR and kinase genes in B. rapa. Table S5. GO enrichment analysis of DEGs obtained by comparison two accessions at the seven stages. Table S6. GO enrichment analysis of DEGs obtained by comparison the adjacent stages in each one accession. Table S7. Homologous cell cycle genes in Arabidopsis and B. rapa. Table S8. GO enrichment analysis of color modules in WGCNA. Table S9. Expression values of flavonoid pathway genes in two B. rapa accessions at seven seed developmental stages. Table S10. Expression values of genes involved in fatty acid metabolism in two B. rapa accessions at seven seed developmental stages. Table S11. Primers used in qRT-PCR.

Additional file 2: Fig. S1.

PCA and coefficient of association analysis. (a) The 2D PCA analysis of SWUK3, The biological replicates of the same period have the same color; (b) Heatmap of all retained samples correlation coefficient of SWUK3. In the upper right corner is the red circle representing relevance, and the bottom left is the correlation coefficient. The higher the value of correlation coefficient, the redder the color; (c) The 2D PCA of SWUK4; (d) Heatmap of all retained samples correlation coefficient of SWUK4. Fig. S2. The process of co-expression network construction. (a) The clustering tree of all 38 samples; (b) The left one displays the relationship between the soft threshold and scale independence with 0.9 red line. The right one displays the relationship between the soft threshold and mean connectivity; (c) The cluster dendrogram; (d) Heatmap of the correlation among the color modules, the higher the correlation, the color more deep. Fig. S3. Module-trait relationships and GO enrichment of interested color modules. (a) Heatmap of the correlation between the 15 color modules and 4 traits, with correlation coefficient and P value; (b) GO enrichment analysis for genes in the MEgreen; (c) GO enrichment analysis for genes in the MEsalmon; (d) GO enrichment analysis for genes in the MEbrown. Only top twenty GO terms were displayed here. Fig. S4. Co-expression network of MEgreen and Bra.A09GRAS. Fig S5. The WGCNA MEbrown module is significantly associated with OC increase. (a) Heatmap of MEbrown module genes in two B. rapa accessions, which was displayed based on log2(FPKM + 1). (b) Primary co-expression network of Bra.A03GRF5. (c) Primary co-expression network of Bra.A09WRI1. (d) Primary co-expression network of Bra.A06FUS3. Square, down arrow, triangle, and disc represent TF, TR, kinase, and other genes, respectively. The edge width represents the weight value between the two nodes: the higher the value of the weight between the nodes, the wider the edge.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Niu, Y., Wu, L., Li, Y. et al. Deciphering the transcriptional regulatory networks that control size, color, and oil content in Brassica rapa seeds. Biotechnol Biofuels 13, 90 (2020).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: