Integrated Omics Approach for Prediction of Operon-like Gene Clusters in Plants: Tools, Techniques, and Future Aspects


Vineeth Changarangath1, Sakshi Tripathi1, Shweta Singh2, Himanshu Singh3

1Department of Biotechnology, School of Bio-engineering and Biosciences,

Lovely Professional University, Phagwara, Punjab, India.

2Department of Life Sciences, University Institute of Science and Humanities,

Sant Baba Bhag Singh University, Jalandhar, Punjab, India.

3Department of Biotechnology, School of Bio-engineering and Biosciences,

Lovely Professional University, Phagwara, Punjab, India.

*Corresponding Author E-mail:



Coordinated expression of genes within microbial genomes is a well-established concept under the name of operons. Similarly, recent developments in genetics and biochemistry have revealed operon-like genetic arrangements in plants, called biosynthetic gene clusters (BGCs), that have changed the way we approach applied plant genetics for human use. Plant gene clusters contain signature and tailoring genes. Signature genes are responsible for forming the backbone of the structure of the molecule. Tailoring genes are the remaining genes of the cluster, whose products modify this backbone so that the pathway yields its final product. Recent genetic and chemical studies have shed light on an interesting aspect of plant metabolism: the grouping of genes, i.e. gene clusters, involved in specialized metabolic pathways in plants. Advanced genetic engineering tools further provide the opportunity to modify the plant genome at the gene level for the production of products beneficial to humans. In this review we look at the background, mechanism, discovery, significance, general methodology and techniques, and current and future prospects of BGCs. We also look at some of the tools (for example BLAST and plantiSMASH) that are applied in studying these gene clusters, their properties and their functions.


KEYWORDS: Biosynthetic gene clusters, Genetic cluster, Gene cluster, BGC, Operons, Biosynthesis, Secondary metabolites, Specialized metabolites, Signature enzymes, Tailoring enzymes, Synthetic biology, Allelopathy, Co-regulation, Co-expression, Coordinated expression, Arabidopsis thaliana, Zea mays, Oryza sativa, plantiSMASH, BLAST.




Physically linked groups of genes, called clusters, that encode the enzymes for the formation of specialized metabolites are a well-established feature of microbial secondary metabolism. BGCs for specialized plant metabolites were long thought not to exist; the genes of these pathways were assumed to be scattered randomly across the genome.


However, groundbreaking reports in recent years indicate a growing number of gene clusters for specialized metabolic pathways in plants. Many gene clusters for the biosynthesis of different classes of metabolites have now been identified in various plant species. Comparison of these characterized clusters now lets us define their essential features and exploit them in synthetic biology applications. The rapidly growing number of reports of gene clustering suggests that this type of genomic organization is common.1


How common gene clustering for secondary metabolism is in plants, compared with fungi and bacteria, is not yet clear. Clear examples of non-clustered metabolic pathways in plants are the anthocyanin and glucosinolate pathways2. However, the ever-increasing number of gene cluster reports from different plant species suggests that this type of genomic arrangement is common, and the number of pathways for which the genes are known to be scattered is very limited. Importantly, most of the specialized metabolic pathways of plants remain uncharacterized and their genomic organization unknown. In this review we summarize the current knowledge of plant gene clusters, touching on their common features and highlighting their similarities and differences. We also discuss the potential for exploiting plant gene clusters in genetic engineering and other biological applications1.



Plants produce low-molecular-weight molecules called metabolites, which they use for growth and for interacting with their environment. These metabolites also perform other functions (for example as UV protectants and insecticides), although the majority of them have not yet been discovered, which is a major handicap for biotechnological advancement. Recent developments have highlighted the role of gene clustering in specialized metabolic pathways. These clusters occur in both monocots and dicots and play a role in the formation of many molecules, such as alkaloids and terpenes.1



To understand gene clusters in the context of the growing body of genomic data, we first need to understand the forms of genetic organization already known across the kingdoms of life.3


Clusters of non-homologous genes, known as operons, are common in bacteria4. An operon comprises a group of genes that lie close to one another in the genome and contribute to the same function, for example the production of a secondary metabolite. All genes of an operon share a single promoter and are transcribed together as a single polycistronic message5.


Initially it was believed that gene clusters were limited to bacteria, in which more than 50% of the genes are organized as operons; a classic example is the lac operon in E. coli6. Bacteria also use operons to produce many natural products. According to the biosynthetic gene cluster repository MIBiG (Minimum Information about a Biosynthetic Gene cluster), 1,221 bacterial clusters have now been reported, showing that bacterial genomes hold a wealth of chemistry with medical, agricultural and industrial applications7,8.


Fig 1: The Escherichia coli lac operon. It contains three structural genes for lactose utilization: lacZ, lacY and lacA. They code for β-galactosidase, a lac permease and a transacetylase, respectively. A single mRNA is produced by the operon. The lacI gene encodes the lac repressor, which in its active form inhibits transcription of the structural genes.3


Gene Clusters in Plants:

Today, it is clear that gene clusters also occur in eukaryotes. Some of these are clusters of genes that share sequence similarity and may have emerged by tandem duplication, such as the leucine-rich repeat genes that contribute to disease resistance in plants9,10. A metabolic gene cluster contains at least three genes coding for enzymes of a metabolic pathway. Sometimes it also contains genes coding for other proteins needed for the pathway to function, for example transcription factors and transporters. The presence of gene clusters in plants was not originally expected, and this expectation largely holds for primary metabolism, although computational analyses have now revealed some loose or partial clustering11,12. Secondary metabolism provides many examples of biosynthetic gene clusters, with the pathways for anthocyanins, carotenoids, and glucosinolates being exceptions16. It is now clear that many secondary metabolic pathways have genes that are grouped together and jointly synthesize compounds such as terpenes, hydroxamic acids, and steroidal compounds14,15,16,17. These clusters arose by gene duplication and neofunctionalization, not by horizontal gene transfer. Partial clustering of plant genes, and the duplication of genes and gene modules to generate new functions, are also being investigated and confirmed.12,13,15,18,19,20
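The working definition above (at least three enzyme-coding genes lying close together) can be illustrated with a minimal sketch. It assumes genes on a single chromosome are described by their position in the gene order; the function name, the `max_gap` parameter and the example gene identifiers are hypothetical choices made for illustration and are not taken from any published cluster-finding tool.

```python
from typing import List, Tuple

# (gene id, index in the gene order along one chromosome, encodes a metabolic enzyme?)
Gene = Tuple[str, int, bool]


def find_candidate_clusters(genes: List[Gene], max_gap: int = 2,
                            min_enzymes: int = 3) -> List[List[str]]:
    """Group enzyme genes into candidate clusters.

    Consecutive enzyme genes stay in the same group when at most `max_gap`
    other genes lie between them; groups with fewer than `min_enzymes`
    enzyme genes are discarded. Both thresholds are illustrative assumptions.
    """
    enzymes = sorted((pos, gid) for gid, pos, is_enzyme in genes if is_enzyme)
    clusters: List[List[str]] = []
    current: List[Tuple[int, str]] = []
    for pos, gid in enzymes:
        # Number of intervening genes between this enzyme gene and the previous one.
        if current and pos - current[-1][0] - 1 > max_gap:
            if len(current) >= min_enzymes:
                clusters.append([g for _, g in current])
            current = []
        current.append((pos, gid))
    if len(current) >= min_enzymes:
        clusters.append([g for _, g in current])
    return clusters


example = [("g1", 0, True), ("g2", 1, False), ("g3", 2, True),
           ("g4", 3, True), ("g5", 10, True), ("g6", 11, True)]
print(find_candidate_clusters(example))
```

In this toy input, g1, g3 and g4 form one candidate cluster, while g5 and g6 are too few to qualify. A real analysis would additionally require the enzymes to belong to different families and would validate the candidates experimentally.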


Commonality of Gene Clusters in Plants:

To identify as yet undiscovered gene clusters, staple crops such as corn and rice, along with many other plants, have been subjected to high-throughput genomic analysis in search of clustering, combining molecular biology, biochemistry, and reverse genetics approaches21,22,23,24,25,26. With the help of advanced technology, the rice genome could be analyzed rapidly for clustering27,28. Genome mining based on clustering and co-expression of secondary metabolite pathway genes has proven successful in microbes29,20.


What is the Significance of Clustering?

From the plant's perspective, regulation is one of the key benefits of clustering. Co-regulation does not in itself require clustering: the control of genetically dispersed genes by a common transcription factor is a well-known example of co-regulation in plants30. However, the physical clustering of functionally associated genes in eukaryotic genomes has the potential to add further dimensions to gene regulation, providing mechanisms for coordinated control of gene expression at the nuclear and/or chromatin level31,32,33.


By proper use of gene clusters we can improve performance, inheritance and survival characteristics in plants, by combining the right genes to produce the desired proteins34,4. Plants use gene clusters according to their “needs” to produce protective metabolites, which can differ greatly even between closely related species35. Gene clusters for the same pathway are found in related species (the Bx cluster in corn, rye, and wheat36) and in unrelated species (the cyanogenic glucoside pathway in sorghum, lotus, and cassava21). This suggests that clustering is not driven entirely by chance. The retention (and expansion) of a cluster is the product of positive selection, meaning that the cluster benefits the plant that carries it. For example, clustering may help prevent the accumulation of potentially toxic intermediate metabolites 37. A recent study of fungal gene clusters reported that 40% of 'double neighbor gene pairs' (genes that are neighbors on the chromosome and whose products act next to each other in a biosynthetic pathway) involve toxic intermediates 38. Clustering may also allow coordinated regulation through chromatin remodeling, giving better and more efficient gene regulation20,31,33,40,41. In fungi, a global regulator (LaeA) was first found in Aspergillus42,43,44,45 and later in other species as well46,47,48, and it has been shown to control many secondary metabolite clusters in several species. Although no such 'master regulator' has been found in plants so far, its existence cannot be ruled out.


Discovery of Gene Clusters in Plants:

To date, the majority of gene clusters discovered are the result of genetic and biochemical studies combined with genomic sequence information from the plant9,49,50,51. With continued advances in the bioinformatics tools used for sequence analysis (for example antiSMASH, SMURF and ClusterMine360), it is becoming more feasible to explore plant genomes52,53,54 and identify clusters, although the biggest challenge in doing so is the huge size of plant genomes55,56.


The cognate metabolic pathway is defined after a cluster has been identified. Heterologous expression of the candidate genes in bacteria or yeast can be employed for product identification. The same can be achieved by transient expression in Nicotiana benthamiana. The function of the gene product dictates the choice of expression system. For example, the yeast strain GIL77 tends to accumulate 2,3-oxidosqualene, which makes it ideal for the functional testing of oxidosqualene cyclases57,58.


Chemical genetic strategies can be employed to narrow down the list of candidate genes for a gene cluster pathway. For example, the CYP inhibitor uniconazole-P was used to test the involvement of CYPs in the momilactone pathway24, and the 2-oxoglutarate-dependent dioxygenase inhibitor prohexadione-Ca was used to help identify Bx6 in the DIMBOA pathway in maize19.


·      Validating Candidate Metabolic Gene Clusters:

To show that a group of genes forms a gene cluster, it is important to show that the gene products work together. There are two main ways in which this can be achieved: (i) by heterologous expression in plant or microbial hosts; and/or (ii) by genetic techniques, including gene knockouts and knockdowns, in the plant from which the cluster originated.59


·      Heterologous expression approaches:

Various expression systems have been used to analyze candidate genes and gene clusters from plants. Expressing the cluster genes in E. coli, followed by protein purification and assays with substrate molecules, has been successful for characterizing many different enzymes, such as glycosyltransferases and methyltransferases20,25,26,16.


·      Reverse genetic approaches:

Reverse genetic techniques require information on genes that can be targeted for mutagenesis. Knocking out a gene halts the production of the respective enzyme and often results in accumulation of the enzyme’s substrate, as well as the loss of the final product of the metabolic pathway. Gene knockouts can be obtained for a variety of plants from established mutant collections, such as the T-DNA insertion collections of Arabidopsis, or by screening EMS-mutagenized populations for mutations60.


Organization and Operation of Clusters in Plant Chromosomes:

In a gene cluster, one gene encodes the signature enzyme and the remaining genes encode tailoring enzymes20. The signature gene within a cluster may have arisen, directly or indirectly, through gene duplication and neofunctionalization61.


In some cases, the gene coding for the first step does not lie close to the other genes of the cluster (e.g., the CYP72 gene GAME7, which may be involved in the first step of biosynthesis of steroidal glycoalkaloids in tomato, is 8 Mb away from the other clustered genes on the same chromosome)62.


Regulation of Gene Clusters:

Metabolic gene clusters share a transcriptional response, defined by co-expression, co-regulation and coordinated expression, which depends on the particular developmental or environmental conditions.


Fig 2: (a) Coexpression. Metabolic gene clusters are typically tightly coexpressed. All cluster genes (colored arrows) show similar expression patterns across different conditions. Neighboring genes (gray arrows) do not follow the cluster expression pattern. (b) Coregulation. Common regulators control transcription of all cluster genes. Transcription factors bind to target sequences in promoters of cluster genes. (c) Coordinated expression. Cluster-wide expression is under the control of single regulatory regions, and transcriptional changes in one gene directly or indirectly alter the expression level of other cluster genes. Different arrow colors indicate nonsequence relatedness of genes.63
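Coexpression of the kind shown in panel (a) is commonly quantified with a correlation coefficient between expression profiles. The following is a minimal sketch, assuming each gene has one expression value per condition; the gene names, the toy values and the helper `mean_pairwise_correlation` are hypothetical and serve only to illustrate the idea.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no coexpression signal
    return cov / (sd_x * sd_y)


def mean_pairwise_correlation(profiles: Dict[str, List[float]],
                              gene_ids: List[str]) -> float:
    """Average Pearson correlation over all gene pairs in `gene_ids`."""
    pairs = list(combinations(gene_ids, 2))
    if not pairs:
        raise ValueError("need at least two genes")
    return sum(pearson(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)


# Toy data: two cluster genes rise together, a neighboring gene does not.
profiles = {
    "cluster_gene_1": [1.0, 2.0, 3.0, 4.0],
    "cluster_gene_2": [2.0, 4.0, 6.0, 8.0],
    "neighbor_gene": [4.0, 3.0, 2.0, 1.0],
}
print(mean_pairwise_correlation(profiles, ["cluster_gene_1", "cluster_gene_2"]))
print(pearson(profiles["cluster_gene_1"], profiles["neighbor_gene"]))
```

A high average correlation among the cluster genes, combined with low or negative correlation to flanking genes, matches the pattern described in the caption. Real analyses would use many more conditions and a significance test.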


Tools and Techniques Used for Clustering Experiments:

Protein-protein interactions are important in cellular activities. Computational methods for predicting protein structure and function, protein-protein interactions, and the evolutionary relationships of interacting partners (proteins) are often used in gene cluster studies. Data sets obtained with these methods have been used to create cluster prediction tools, e.g. antiSMASH and plantiSMASH.64


·      Sequence Alignment:

With the growth of next-generation sequencing (NGS) data, multiple sequence alignment (MSA) is widely used in comparative studies of biological sequences. An MSA provides a basic understanding of the sequence, structure and function of nucleic acids and proteins. An MSA usually involves three or more sequences, whereas an alignment of only two sequences is known as a pairwise alignment. Sequences are aligned by inserting gaps so that all of them reach a common length L, which exposes the relationships between the sequences65.


An MSA makes it possible to identify conservation and variation within a protein family, along with evolutionary and functional relationships. For this reason, MSAs are used for homology modeling, secondary structure prediction, phylogenetic reconstruction, and building profiles66 or hidden Markov models67,68.
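To make the notion of conservation within an alignment concrete, the sketch below scores each column of an already aligned set of sequences. It assumes the alignment has been produced by some MSA program and that gaps are written as "-"; the scoring rule (fraction of sequences carrying the most frequent residue) is a deliberately simple, illustrative choice rather than a standard conservation measure.

```python
from collections import Counter
from typing import List


def column_conservation(alignment: List[str]) -> List[float]:
    """Per-column conservation of an MSA.

    Every sequence must already be padded with '-' to the common length L.
    The score of a column is the count of its most frequent residue divided
    by the number of sequences; gap characters never count as a residue.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must share the same length L")
    scores = []
    for column in zip(*alignment):
        residues = [c for c in column if c != "-"]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


toy_alignment = ["MK-LV", "MKALV", "MR-LI"]
print(column_conservation(toy_alignment))
```

In the toy alignment, the first and fourth columns are fully conserved, while the third column, which is mostly gaps, scores lowest.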


The Basic Local Alignment Search Tool (BLAST) is a sequence similarity search program that can be used through a web interface or as a standalone tool69,70. There are several variants of BLAST for comparing nucleotide or protein queries against nucleotide or protein databases. BLAST is a heuristic that finds short matches between the two sequences and tries to start alignments from these ‘hot spots’.
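The seed-and-extend idea behind such 'hot spots' can be sketched in a few lines. This is a simplified illustration and not the BLAST implementation: it uses exact k-mer matches as seeds and extends them only while characters are identical, whereas BLAST itself uses scoring matrices and score-based extension cut-offs. The function names and the example sequences are hypothetical.

```python
from collections import defaultdict
from typing import List, Tuple


def find_seeds(query: str, subject: str, k: int = 4) -> List[Tuple[int, int]]:
    """Return (query_pos, subject_pos) for every exact k-mer shared by both."""
    index = defaultdict(list)
    for i in range(len(subject) - k + 1):
        index[subject[i:i + k]].append(i)
    seeds = []
    for q in range(len(query) - k + 1):
        for s in index.get(query[q:q + k], []):
            seeds.append((q, s))
    return seeds


def extend_seed(query: str, subject: str, q: int, s: int,
                k: int = 4) -> Tuple[int, int, int]:
    """Extend a seed without gaps, left and right, while characters match.

    Returns (query_start, subject_start, match_length).
    """
    left = 0
    while (q - left - 1 >= 0 and s - left - 1 >= 0
           and query[q - left - 1] == subject[s - left - 1]):
        left += 1
    right = k
    while (q + right < len(query) and s + right < len(subject)
           and query[q + right] == subject[s + right]):
        right += 1
    return q - left, s - left, left + right


query, subject = "ACGTTGCA", "TTACGTTGAA"
seeds = find_seeds(query, subject)
print(seeds)
print(extend_seed(query, subject, *seeds[0]))
```

Here all three seeds extend to the same ungapped match, ACGTTG, which starts at position 0 of the query and position 2 of the subject.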


·      Protein Modeling:

The three-dimensional structure of a protein gives information about its function in cellular processes, and in doing so it helps us understand the working of biological systems71,72. Using the 3D structures of binding partners, a three-dimensional model of the complex can be built based on the geometric and physicochemical complementarity of the interacting molecules73-76.


·      Homology Modeling:

Homology modeling helps narrow the gap between the number of known protein sequences and the number of experimentally determined structures. Thanks to recent advancements, servers such as SWISS-MODEL, introduced about 25 years ago, allow users without specialist expertise to carry out modeling and to visualize and interpret the results. Today homology modeling is used to predict, from the amino acid sequences of interacting proteins, the stoichiometry and overall structure of the complex77.


·      Threading approach:

Threading refers to the bioinformatics process of identifying, from structure databases, template proteins that have the same or a similar structural fold as the query protein sequence78, in order to detect distant evolutionary relatives. The result of this approach consists of full-length secondary and tertiary structure predictions, together with predicted ligand-binding sites, Enzyme Commission numbers and Gene Ontology terms. An estimate of the accuracy of the predictions is given in terms of the model's confidence score. The server is available at


·      Protein-Protein Interaction:

Prediction of protein-protein and protein-small molecule interactions benefits not only academia but also industry, and therefore requires as much accuracy as possible. This is pursued through docking, which searches for the minimum-energy conformation of two interacting molecules and has led to the development of various docking algorithms79, some of which are available as free web servers, such as the ClusPro server80. Before such software and servers are released they undergo extensive testing81,82,83. PatchDock is one of the best-known servers for protein-protein docking. It divides the Connolly dot surface representation31,32 of the molecules into concave, convex and flat patches, which are then matched and evaluated for geometric fit and atomic energy33 to give the best conformation84,85.


·      Phylogenetic Tree:

A phylogenetic tree depicts the relationships between protein sequences: its topology indicates how the sequences are grouped, and branch length reflects the amount of evolutionary change.58 It is used as a guide tree for clustering orthologous proteins and for extracting intra-matrix junctions from protein matrices.64 Using a calculated measure of similarity or dissimilarity, the most similar sequences are placed close together, while very different sequences lie far from one another86,87.
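One simple dissimilarity measure that can feed a tree-building method is the p-distance, the fraction of aligned positions at which two sequences differ. The sketch below assumes the sequences are already aligned to equal length with "-" for gaps; the sequence names and helper functions are hypothetical, and the code computes only the distance matrix and the closest pair, not a full tree.

```python
from itertools import combinations
from typing import Dict, Tuple


def p_distance(a: str, b: str) -> float:
    """Fraction of differing positions, ignoring columns with a gap in either sequence."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in compared) / len(compared)


def distance_matrix(aligned: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Pairwise p-distances keyed by alphabetically ordered name pairs."""
    return {(m, n): p_distance(aligned[m], aligned[n])
            for m, n in combinations(sorted(aligned), 2)}


def closest_pair(aligned: Dict[str, str]) -> Tuple[str, str]:
    """The two sequences that a distance-based method would join first."""
    matrix = distance_matrix(aligned)
    return min(matrix, key=matrix.get)


aligned = {"seqA": "MKALV", "seqB": "MKALI", "seqC": "MRGLI"}
print(distance_matrix(aligned))
print(closest_pair(aligned))
```

In this toy input seqA and seqB differ at one of five positions and would therefore be grouped before either is joined with seqC.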


·      Gene Cluster Mining Algorithms:

Plant Secondary Metabolites Analysis Shell (plantiSMASH) and PlantClusterFinder allow automatic identification of gene clusters in plants, comparison between them, and even prediction of functional interactions between their genes. To use plantiSMASH, users need to provide genomic information (with or without annotations), preferably together with transcriptomic data. plantiSMASH offers a wide range of options for setting up the analysis and produces many visual outputs that require expert interpretation. plantiSMASH accepts two types of genomic data: unannotated (FASTA) and annotated (GBK / EMBL / GFF + FASTA). By default, scaffolds smaller than 1000 bp are ignored by the algorithm. plantiSMASH incorporates a coexpression analysis module to facilitate the study of expression patterns in the BGCs predicted by the algorithm. plantiSMASH accepts two formats for this: SOFT files and CSV files.
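The effect of the 1000 bp scaffold cut-off on an unannotated FASTA input can be previewed before submitting a genome. The following is a minimal sketch written for illustration only; it is not plantiSMASH code, and the function names and the toy scaffolds are hypothetical.

```python
from typing import Dict


def read_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {record name: sequence}; the name is the first word of the header."""
    records: Dict[str, list] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            words = line[1:].split()
            name = words[0] if words else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def drop_short_scaffolds(records: Dict[str, str],
                         min_length: int = 1000) -> Dict[str, str]:
    """Keep only scaffolds of at least `min_length` bp (1000 bp mirrors the default described above)."""
    return {n: seq for n, seq in records.items() if len(seq) >= min_length}


fasta_text = (">scaffold_1 example\n" + "A" * 600 + "\n" + "C" * 600 + "\n"
              ">scaffold_2\n" + "G" * 999 + "\n")
kept = drop_short_scaffolds(read_fasta(fasta_text))
print({name: len(seq) for name, seq in kept.items()})
```

Only scaffold_1 (1200 bp) survives the filter in this toy input; scaffold_2 falls one base short of the threshold.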


Applications and Future Prospects of Gene Clusters:

With the help of bioinformatics tools and technology we can gather a great deal of information about genes, gene clusters and their functions. This raises the question of how we can use this information to improve plant characteristics that are valuable to both agriculture and industry. Genetic information will help us understand and use genes to obtain the traits we desire.89


·      Allelopathy:

Allelopathy is the release of allelochemicals into the rhizosphere by a plant to influence the growth of nearby plants90. In recent years, genetic traits underlying plant-weed allelopathy have been identified and used to explore possibilities for plant proliferation91,92,93. By understanding it, we may be able to link and organize gene expression so as to reduce the self-harm caused by the accumulation of toxic intermediate chemicals and thereby promote plant survival94,95.


·      Synthetic Biology:

Yeast is used to produce plant metabolites via metabolic engineering, most notably artemisinin, a major antimalarial drug, at large scale96,97. The tobacco species Nicotiana tabacum and Nicotiana benthamiana can, with the help of Agrobacterium-mediated transient expression, successfully express biosynthetic gene clusters in a matter of days with minimal detrimental effects98,99. There are multiple levels of cluster control mechanisms in bacteria and fungi100,101. Gene clusters can also be manipulated chemically to produce a molecule of interest102. It might also be possible to produce target molecules by overexpressing or deleting a transcription factor103.


Transfer of a whole metabolic pathway into a heterologous host system for the purposes of industrial production is widely used104. However, the large size of clusters, often exceeding 100 kb, poses a further challenge, which can be overcome by eliminating non-essential DNA fragments such as non-coding intergenic regions. By introducing gene clusters into agricultural crops through transformation or introgression, the plant’s natural defenses can be enhanced, reducing the need for pesticides. It is now clear that clustering of genes from secondary metabolic pathways in plants is not unusual. The question remains, however, whether genes involved in primary metabolism can also be clustered; if such clusters exist, it is only a matter of time before they are discovered.60



The exponential development of plant biology and natural product discovery, driven by the discovery of plant metabolic gene clusters, has opened a highly influential avenue of research. The commonalities and unique features of plant clusters are becoming clear with the steady influx of new information. Looking ahead, these data should prove useful for future genome mining efforts aimed at pathway discovery and for the development of biotechnological pipelines. This accelerated development could have a major impact on how we approach everything from farming to drug discovery to nutrition. It will, however, be important to address the overly broad scope of current cluster search engines and to develop more specific, narrower methods. As the number of sequenced plant genomes increases, it is becoming easier to identify the regulatory mechanisms of gene cluster expression and to uncover the evolutionary forces responsible for cluster formation and maintenance, both of which are important prospects to consider.



1.     Nützmann HW, Osbourn A. Gene clustering in plant specialized metabolism. Current opinion in biotechnology. 2014 Apr 1;26:91-9.

2.     Kliebenstein DJ, Osbourn A. Making new molecules–evolution of pathways for novel metabolites in plants. Current opinion in plant biology. 2012 Aug 1;15(4):415-23. DOI: 10.1016/j.pbi.2012.05.005

3.     Chu HY, Wegel E, Osbourn A. From hormones to secondary metabolism: the emergence of metabolic gene clusters in plants. The Plant Journal. 2011 Apr;66(1):66-79.

4.     Rocha EP. The organization of the bacterial genome. Annual review of genetics. 2008 Dec 1;42:211-33.

5.     Jacob F, Monod J. Genetic regulatory mechanisms in the synthesis of proteins. Journal of molecular biology. 1961 Jun 1;3(3):318-56.

6.     Jacob F, Perrin D, Sánchez C, Monod J. The operon: a group of genes whose expression is co-ordinated by an operator. Compte Rendu de l'Academie des Sciences. 1960;250:1727-9. DOI: 10.1016/j.crvi.2005.04.005

7.     Cimermancic P, Medema MH, Claesen J, et al. Insights into secondary metabolism from a global analysis of prokaryotic biosynthetic gene clusters. Cell. 2014 Jul 17;158(2):412-21. DOI: 10.1016/j.cell.2014.06.034

8.     Medema MH, Kottmann R, Yilmaz P, et al. Minimum information about a biosynthetic gene cluster. Nature chemical biology. 2015 Sep;11(9):625-31. DOI: 10.1038/nchembio.1890

9.     Ferrier DE, Holland PW. Ancient origin of the Hox gene cluster. Nature Reviews Genetics. 2001 Jan;2(1):33-8.

10.   Li J, Cocker JM, Wright J, et al. Genetic architecture and evolution of the S locus supergene in Primula vulgaris. Nature plants. 2016 Dec 2;2(12):1-7. DOI: 10.1038/nplants.2016.188

11.   Lee JM, Sonnhammer EL. Genomic gene clustering analysis of pathways in eukaryotes. Genome research. 2003 May 1;13(5):875-82. DOI: 10.1101/gr.737703

12.   Schläpfer P, Zhang P, Wang C, et al. Genome-wide prediction of metabolic enzymes, pathways, and gene clusters in plants. Plant physiology. 2017 Apr;173(4):2041-59. DOI: 10.1104/pp.16.01942

13.   Medema MH, Osbourn A. Computational genomic identification and functional reconstitution of plant natural product biosynthetic pathways. Natural product reports. 2016;33(8):951-62. DOI: 10.1039/c6np00035e

14.   Hen-Avivi S, Savin O, Racovita RC, et al. A metabolic gene cluster in the wheat W1 and the barley Cer-cqu loci determines β-diketone biosynthesis and glaucousness. The Plant Cell. 2016 Jun;28(6):1440-60. DOI: 10.1105/tpc.16.00197

15.   Nützmann HW, Huang A, Osbourn A. Plant metabolic clusters–from genetics to genomics. New phytologist. 2016 Aug;211(3):771-89. DOI: 10.1111/nph.13981

16.   Wilderman PR, Xu M, Jin Y, et al. Identification of syn-pimara-7, 15-diene synthase reveals functional clustering of terpene synthases involved in rice phytoalexin/allelochemical biosynthesis. Plant Physiology. 2004 Aug;135(4):2098-105. DOI: 10.1104/pp.104.045971

17.   Zhou Y, Ma Y, Zeng J, et al. Convergence and divergence of bitterness biosynthesis and regulation in Cucurbitaceae. Nature plants. 2016 Nov 28;2(12):1-8. DOI: 10.1038/nplants.2016.183

18.   Boutanaev AM, Moses T, Zi J, et al. Investigation of terpene diversification across multiple sequenced plant genomes. Proceedings of the National Academy of Sciences. 2015 Jan 6;112(1):E81-8. DOI: 10.1073/pnas.1419547112

19.   Dutartre L, Hilliou F, Feyereisen R. Phylogenomics of the benzoxazinoid biosynthetic pathway of Poaceae: gene duplications and origin of the Bx cluster. BMC Evolutionary Biology. 2012 Dec;12(1):1-9. DOI: 10.1186/1471-2148-12-64

20.   Frey M, Chomet P, Glawischnig E, et al. Analysis of a chemical plant defense mechanism in grasses. Science. 1997 Aug 1;277(5326):696-9. DOI: 10.1126/science.277.5326.696

21.   Osbourn A. Secondary metabolic gene clusters: evolutionary toolkits for chemical innovation. Trends in Genetics. 2010 Oct 1;26(10):449-57. DOI: 10.1016/j.tig.2010.07.001

22.   Takos AM, Knudsen C, Lai D, et al. Genomic clustering of cyanogenic glucoside biosynthetic genes aids their identification in Lotus japonicus and suggests the repeated evolution of this chemical defence pathway. The Plant Journal. 2011 Oct;68(2):273-86. DOI: 10.1111/j.1365-313X.2011.04685.x

23.   Von Rad U, Hüttl R, Lottspeich F, et al. Two glucosyltransferases are involved in detoxification of benzoxazinoids in maize. The Plant Journal. 2001 Dec;28(6):633-42. DOI: 10.1046/j.1365-313x.2001.01161.x

24.   Sakamoto T, Miura K, Itoh H, et al. An overview of gibberellin metabolism enzyme genes and their related mutants in rice. Plant physiology. 2004 Apr;134(4):1642-53. DOI: 10.1104/pp.103.033696

25.   Shimura K, Okada A, Okada K, et al. Identification of a biosynthetic gene cluster in rice for momilactones. Journal of Biological Chemistry. 2007 Nov 23;282(47):34013-8. DOI: 10.1074/jbc.M703344200

26.   Jonczyk R, Schmidt H, Osterrieder A, et al. Elucidation of the final reactions of DIMBOA-glucoside biosynthesis in maize: characterization of Bx6 and Bx7. Plant physiology. 2008 Mar;146(3):1053-63. DOI: 10.1104/pp.107.111237

27.   Papadopoulou K, Melton RE, Leggett M, et al. Compromised disease resistance in saponin-deficient plants. Proceedings of the National Academy of Sciences. 1999 Oct 26;96(22):12923-8. DOI: 10.1073/pnas.96.22.12923

28.   Qi X, Bakht S, Qin B, et al. A different function for a member of an ancient and highly conserved cytochrome P450 family: from essential sterols to plant defense. Proceedings of the National Academy of Sciences. 2006 Dec 5;103(49):18848-53. DOI: 10.1073/pnas.0607849103

29.   Martin C, Ellis N, Rook F. Do transcription factors play special roles in adaptive variation?. Plant physiology. 2010 Oct;154(2):506-11. DOI: 10.1104/pp.110.161331

30.   Hurst LD, Pál C, Lercher MJ. The evolutionary dynamics of eukaryotic gene order. Nature Reviews Genetics. 2004 Apr;5(4):299-310. DOI: 10.1038/nrg1319

31.   Sproul D, Gilbert N, Bickmore WA. The role of chromatin structure in regulating the expression of clustered genes. Nature Reviews Genetics. 2005 Oct;6(10):775-81. DOI: 10.1038/nrg1688

32.   Osbourn AE, Field B. Operons. Cellular and Molecular Life Sciences. 2009 Dec;66(23):3755-75. DOI: 10.1007/s00018-009-0114-3

33.   Price MN, Arkin AP, Alm EJ. The life-cycle of operons. PLoS genetics. 2006 Jun;2(6):e96. DOI: 10.1371/journal.pgen.0020096

34.   Sue M, Nakamura C, Nomura T. Dispersed benzoxazinone gene cluster: molecular characterization and chromosomal localization of glucosyltransferase and glucosidase genes in wheat and rye. Plant physiology. 2011 Nov;157(3):985-97. DOI: 10.1104/pp.111.182378

35.   Dutartre L, Hilliou F, Feyereisen R. Phylogenomics of the benzoxazinoid biosynthetic pathway of Poaceae: gene duplications and origin of the Bx cluster. BMC Evolutionary Biology. 2012 Dec;12(1):1-9. DOI: 10.1186/1471-2148-12-64

36.   Mylona P, Owatworakit A, Papadopoulou K, et all. Sad3 and Sad4 are required for saponin biosynthesis and root development in oat. The Plant Cell. 2008 Jan;20(1):201-12.DOI: 10.1105/tpc.107.056531

37.   McGary KL, Slot JC, Rokas A. Physical linkage of metabolic genes in fungi is an adaptation against the accumulation of toxic intermediate compounds. Proceedings of the National Academy of Sciences. 2013 Jul 9;110(28):11481-6.DOI: 10.1073/pnas.1304461110

38.   Qi X, Bakht S, Leggett M, et all. A gene cluster for secondary metabolism in oat: implications for the evolution of metabolic diversity in plants. Proceedings of the National Academy of Sciences. 2004 May 25;101(21):8233-8.DOI: 10.1073/pnas.0401301101

39.   Takos AM, Rook F. Why biosynthetic genes for chemical defense compounds cluster. Trends in plant science. 2012 Jul 1;17(7):383-8.DOI: 10.1016/j.tplants.2012.04.004

40.   Palmer JM, Keller NP. Secondary metabolism in fungi: does chromosomal location matter?. Current opinion in microbiology. 2010 Aug 1;13(4):431-6.. DOI: 10.1016/j.mib.2010.04.008

41.   Bayram O, Krappmann S, Ni M, et all. VelB/VeA/LaeA complex coordinates light signal with fungal development and secondary metabolism. Science. 2008 Jun 13;320(5882):1504-6.DOI: 10.1126/science.1155888

42.   Bok JW, Keller NP. LaeA, a regulator of secondary metabolism in Aspergillus spp. Eukaryotic cell. 2004 Apr;3(2):527-35.DOI: 10.1128/EC.3.2.527-535.2004

43.   Bok JW, Balajee SA, Marr KA, Andes D, Nielsen KF, Frisvad JC, Keller NP. LaeA, a regulator of morphogenetic fungal virulence factors. Eukaryotic Cell. 2005 Sep;4(9):1574-82.DOI: 10.1128/EC.4.9.1574-1582.2005

44.   Amaike S, Keller NP. Distinct roles for VeA and LaeA in development and pathogenesis of Aspergillus flavus. Eukaryotic cell. 2009 Jul;8(7):1051-60.DOI: 10.1128/EC.00088-09

45.   Kosalková K, García-Estrada C, Ullán RV, et all. The global regulator LaeA controls penicillin biosynthesis, pigmentation and sporulation, but not roquefortine C synthesis in Penicillium chrysogenum. Biochimie. 2009 Feb 1;91(2):214-25.DOI: 10.1016/j.biochi.2008.09.004

46.   Wiemann P, Brown DW, Kleigrewe K, et al. FfVel1 and FfLae1, components of a velvet‐like complex in Fusarium fujikuroi, affect differentiation, secondary metabolism and virulence. Molecular microbiology. 2010 Aug;77(4):972-94. DOI: 10.1111/j.1365-2958.2010.07263.x

47.   Hoff B, Kamerewerd J, Sigl C, et al. Two components of a velvet-like complex control hyphal morphogenesis, conidiophore development, and penicillin biosynthesis in Penicillium chrysogenum. Eukaryotic cell. 2010 Aug;9(8):1236-50. DOI: 10.1128/EC.00077-10

48.   Field B, Fiston-Lavier AS, Kemen A, et al. Formation of plant metabolic gene clusters within dynamic chromosomal regions. Proceedings of the National Academy of Sciences. 2011 Sep 20;108(38):16116-21. DOI: 10.1073/pnas.1109273108

49.   Osbourn A, Papadopoulou KK, Qi X, et al. Finding and analyzing plant metabolic gene clusters. In: Methods in Enzymology 2012 Jan 1 (Vol. 517, pp. 113-138). Academic Press. DOI: 10.1016/B978-0-12-404634-4.00006-1

50.   Castillo DA, Kolesnikova MD, Matsuda SP. An effective strategy for exploring unknown metabolic pathways by genome mining. Journal of the American Chemical Society. 2013 Apr 17;135(15):5885-94. DOI: 10.1021/ja401535g

51.   Blin K, Medema MH, Kazempour D, et al. antiSMASH 2.0—a versatile platform for genome mining of secondary metabolite producers. Nucleic acids research. 2013 Jul 1;41(W1):W204-12. DOI: 10.1093/nar/gkt449

52.   Khaldi N, Seifuddin FT, Turner G, et al. SMURF: genomic mapping of fungal secondary metabolite clusters. Fungal Genetics and Biology. 2010 Sep 1;47(9):736-41. DOI: 10.1016/j.fgb.2010.06.003

53.   Conway KR, Boddy CN. ClusterMine360: a database of microbial PKS/NRPS biosynthesis. Nucleic acids research. 2012 Oct 25;41(D1):D402-7. DOI: 10.1093/nar/gks993

54.   Mackay J, Dean JF, Plomion C, et al. Towards decoding the conifer giga-genome. Plant molecular biology. 2012 Dec;80(6):555-69. DOI: 10.1007/s11103-012-9961-7

55.   Nystedt B, Street NR, Wetterbom A, et al. The Norway spruce genome sequence and conifer genome evolution. Nature. 2013 May;497(7451):579-84. DOI: 10.1038/nature12211

56.   Winzer T, Gazda V, He Z, et al. A Papaver somniferum 10-gene cluster for synthesis of the anticancer alkaloid noscapine. Science. 2012 Jun 29;336(6089):1704-8. DOI: 10.1126/science.1220757

57.   Frey M, Kliem R, Saedler H, Gierl A. Expression of a cytochrome P450 gene family in maize. Molecular and General Genetics MGG. 1995 Jan;246(1):100-9. DOI: 10.1007/BF00290138

58.   Dixon RA, Achnine L, Deavours BE, Naoumkina M. Metabolomics and gene identification in plant natural product pathways. In: Plant Metabolomics 2006 (pp. 243-259). Springer, Berlin, Heidelberg. DOI: 10.1007/3-540-29782-0_18

59.   McCallum CM, Comai L, Greene EA, Henikoff S. Targeted screening for induced mutations. Nature biotechnology. 2000 Apr;18(4):455-7. DOI: 10.1038/74542

60.   Chu HY, Wegel E, Osbourn A. From hormones to secondary metabolism: the emergence of metabolic gene clusters in plants. The Plant Journal. 2011 Apr;66(1):66-79. DOI: 10.1111/j.1365-313X.2011.04503.x

61.   Gao C, Hindra, Mulder D, Yin C, Elliot MA. Crp is a global regulator of antibiotic production in Streptomyces. MBio. 2012 Dec 11;3(6):e00407-12. DOI: 10.1128/mBio.00407-12

62.   Nützmann HW, Scazzocchio C, Osbourn A. Metabolic gene clusters in eukaryotes. Annual Review of Genetics. 2018 Nov 23;52:159-83. DOI: 10.1146/annurev-genet-120417-031237

63.   Craig RA, Liao L. Phylogenetic tree information aids supervised learning for predicting protein-protein interaction based on distance matrices. BMC Bioinformatics. 2007 Dec;8(1):1-2. DOI: 10.1186/1471-2105-8-6

64.   Bawono P, Dijkstra M, Pirovano W, et al. Multiple sequence alignment. In: Bioinformatics 2017 (pp. 167-189). Humana Press, New York, NY. DOI: 10.1007/978-1-4939-6622-6_8

65.   Gribskov M, McLachlan AD, Eisenberg D. Profile analysis: detection of distantly related proteins. Proceedings of the National Academy of Sciences. 1987 Jul 1;84(13):4355-8. DOI: 10.1073/pnas.84.13.4355

66.   Haussler D, Krogh A, Mian IS, Sjolander K. Protein modeling using hidden Markov models: Analysis of globins. In: [1993] Proceedings of the Twenty-sixth Hawaii International Conference on System Sciences 1993 Jan 8 (Vol. 1, pp. 792-802). IEEE. DOI: 10.1109/HICSS.1993.270611

67.   Bucher P, Karplus K, Moeri N, Hofmann K. A flexible motif search technique based on generalized profiles. Computers & chemistry. 1996 Mar 1;20(1):3-23. DOI: 10.1016/s0097-8485(96)80003-9

68.   Altschul SF, Gish W, Miller W, et al. Basic local alignment search tool. Journal of molecular biology. 1990 Oct 5;215(3):403-10. DOI: 10.1016/S0022-2836(05)80360-2

69.   Altschul SF, Madden TL, Schäffer AA, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic acids research. 1997 Sep 1;25(17):3389-402. DOI: 10.1093/nar/25.17.3389

70.   Fuller JC, Burgoyne NJ, Jackson RM. Predicting druggable binding sites at the protein–protein interface. Drug discovery today. 2009 Feb 1;14(3-4):155-61. DOI: 10.1016/j.drudis.2008.10.009

71.   Nim S, Jeon J, Corbi-Verge C, et al. Pooled screening for antiproliferative inhibitors of protein-protein interactions. Nature chemical biology. 2016 Apr;12(4):275-81. DOI: 10.1038/nchembio.2026

72.   Morris GM, Lim-Wilby M. Molecular docking. In: Molecular modeling of proteins 2008 (pp. 365-382). Humana Press. DOI: 10.1007/978-1-59745-177-2_19

73.   Chaudhury S, Berrondo M, Weitzner BD, et al. Benchmarking and analysis of protein docking performance in Rosetta v3.2. PloS one. 2011 Aug 2;6(8):e22477. DOI: 10.1371/journal.pone.0022477

74.   Kurkcuoglu Z, Koukos PI, Citro N, et al. Performance of HADDOCK and a simple contact-based protein–ligand binding affinity predictor in the D3R Grand Challenge 2. Journal of computer-aided molecular design. 2018 Jan;32(1):175-85. DOI: 10.1007/s10822-017-0049-y

75.   Peterson LX, Togawa Y, Esquivel-Rodriguez J, et al. Modeling the assembly order of multimeric heteroprotein complexes. PLoS computational biology. 2018 Jan 12;14(1):e1005937. DOI: 10.1371/journal.pcbi.1005937

76.   Waterhouse A, Bertoni M, Bienert S, et al. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic acids research. 2018 Jul 2;46(W1):W296-303. DOI: 10.1093/nar/gky427

77.   Roy A, Kucukural A, Zhang Y. I-TASSER: a unified platform for automated protein structure and function prediction. Nature protocols. 2010 Apr;5(4):725-38. DOI: 10.1038/nprot.2010.5

78.   Gray JJ, Moughon S, Wang C, et al. Protein–protein docking with simultaneous optimization of rigid-body displacement and side-chain conformations. Journal of molecular biology. 2003 Aug 1;331(1):281-99. DOI: 10.1016/s0022-2836(03)00670-3

79.   Comeau SR, Gatchell DW, Vajda S, et al. ClusPro: a fully automated algorithm for protein–protein docking. Nucleic acids research. 2004 Jul 1;32(suppl_2):W96-9. DOI: 10.1093/nar/gkh354

80.   Duhovny D, Nussinov R, Wolfson HJ. Efficient unbound docking of rigid molecules. In: International workshop on algorithms in bioinformatics 2002 Sep 17 (pp. 185-200). Springer, Berlin, Heidelberg. DOI: 10.1007/3-540-45784-4_14

81.   Connolly ML. Solvent-accessible surfaces of proteins and nucleic acids. Science. 1983 Aug 19;221(4612):709-13. DOI: 10.1126/science.6879170

82.   Connolly ML. Analytical molecular surface calculation. Journal of applied crystallography. 1983 Oct 1;16(5):548-58.

83.   Zhang C, Vasmatzis G, Cornette JL, DeLisi C. Determination of atomic desolvation energies from the structures of crystallized proteins. Journal of molecular biology. 1997 Apr 4;267(3):707-26. DOI: 10.1006/jmbi.1996.0859

84.   Schneidman-Duhovny D, Inbar Y, Nussinov R, Wolfson HJ. PatchDock and SymmDock: servers for rigid and symmetric docking. Nucleic acids research. 2005 Jul 1;33(suppl_2):W363-7. DOI: 10.1093/nar/gki481

85.   Feng DF, Doolittle RF. [23] Progressive alignment and phylogenetic tree construction of protein sequences. DOI: 10.1016/0076-6879(90)83025-5

86.   Lewi PJ. 3.3 Receptor Mapping and Phylogenetic Clustering. Methods and Principles in Medicinal Chemistry. 1994:131. DOI: 10.1002/9783527615674

87.   Kautsar SA, Suarez Duran HG, Medema MH. Genomic identification and analysis of specialized metabolite biosynthetic gene clusters in plants using PlantiSMASH. In: Plant Chemical Genomics 2018 (pp. 173-188). Humana Press, New York, NY. DOI: 10.1007/978-1-4939-7874-8_15

88.   Osbourn A. Gene clusters for secondary metabolic pathways: an emerging theme in plant biology. Plant physiology. 2010 Oct;154(2):531-5. DOI: 10.1104/pp.110.161315

89.   de Albuquerque MB, dos Santos RC, Lima LM, et al. Allelopathy, an alternative tool to improve cropping systems. A review. Agronomy for Sustainable Development. 2011 Apr;31(2):379-95.

90.   Khanh TD, Chung MI, Xuan TD, Tawata S. The exploitation of crop allelopathy in sustainable agricultural production. Journal of Agronomy and Crop Science. 2005 Jun;191(3):172-84.

91.   Cheng F, Cheng Z. Research progress on the use of plant allelopathy in agriculture and the physiological and ecological mechanisms of allelopathy. Frontiers in plant science. 2015 Nov 17;6:1020. DOI: 10.3389/fpls.2015.01020

92.   Guo L, Qiu J, Ye C, et al. Echinochloa crus-galli genome analysis provides insight into its adaptation and invasiveness as a weed. Nature communications. 2017 Oct 18;8(1):1-0. DOI: 10.1038/s41467-017-01067-5

93.   Boycheva S, Daviet L, Wolfender JL, Fitzpatrick TB. The rise of operon-like gene clusters in plants. Trends in plant science. 2014 Jul 1;19(7):447-59. DOI: 10.1016/j.tplants.2014.01.013

94.   Olsen KM, Small LL. Micro‐ and macroevolutionary adaptation through repeated loss of a complete metabolic pathway. New Phytologist. 2018 Jul;219(2):757-66. DOI: 10.1111/nph.15184

95.   Paddon CJ, Westfall PJ, Pitera DJ, et al. High-level semi-synthetic production of the potent antimalarial artemisinin. Nature. 2013 Apr;496(7446):528-32. DOI: 10.1038/nature12051

96.   Westfall PJ, Pitera DJ, Lenihan JR, et al. Production of amorphadiene in yeast, and its conversion to dihydroartemisinic acid, precursor to the antimalarial agent artemisinin. Proceedings of the National Academy of Sciences. 2012 Jan 17;109(3):E111-8. DOI: 10.1073/pnas.1110740109

97.   Mugford ST, Louveau T, Melton R, et al. Modularity of plant metabolic gene clusters: a trio of linked genes that are collectively required for acylation of triterpenes in oat. The Plant Cell. 2013 Mar;25(3):1078-92. DOI: 10.1105/tpc.113.110551

98.   Geisler K, Hughes RK, Sainsbury F, et al. Biochemical analysis of a multifunctional cytochrome P450 (CYP51) enzyme required for synthesis of antimicrobial triterpenes in plants. Proceedings of the National Academy of Sciences. 2013 Aug 27;110(35):E3360-7. DOI: 10.1073/pnas.1309157110

99.   Brakhage AA. Regulation of fungal secondary metabolism. Nature Reviews Microbiology. 2013 Jan;11(1):21-32. DOI: 10.1038/nrmicro2916

100. van Wezel GP, McDowall KJ. The regulation of the secondary metabolism of Streptomyces: new links and experimental advances. Natural product reports. 2011;28(7):1311-33. DOI: 10.1039/c1np00003a

101. Bok JW, Chiang YM, Szewczyk E, Reyes-Dominguez Y, Davidson AD, Sanchez JF, Lo HC, Watanabe K, Strauss J, Oakley BR, Wang CC. Chromatin-level regulation of biosynthetic gene clusters. Nature chemical biology. 2009 Jul;5(7):462-4. DOI: 10.1038/nchembio.177

102. Okada A, Okada K, Miyamoto K, et al. OsTGAP1, a bZIP transcription factor, coordinately regulates the inductive production of diterpenoid phytoalexins in rice. Journal of Biological Chemistry. 2009 Sep 25;284(39):26510-8. DOI: 10.1074/jbc.M109.036871

103. Nour-Eldin HH, Hansen BG, Nørholm MH, Jensen JK, Halkier BA. Advancing uracil-excision based cloning towards an ideal technique for cloning PCR fragments. Nucleic acids research. 2006 Oct 1;34(18):e122. DOI: 10.1093/nar/gkl635

104. Wu S, Schalk M, Clark A, Miles RB, Coates R, Chappell J. Redirection of cytosolic or plastidic isoprenoid precursors elevates terpene production in plants. Nature biotechnology. 2006 Nov;24(11):1441-7. DOI: 10.1038/nbt1251









Received on 20.05.2021          Modified on 22.11.2021

Accepted on 27.06.2022        © RJPT All rights reserved

Research J. Pharm. and Tech 2023; 16(2):947-954.

DOI: 10.52711/0974-360X.2023.00159