The analysis of the distribution of γ along chromosomes at the 100-kb scale reveals a more uniform distribution than that of CO (c) rates, with no reduction near telomeres or centromeres (Figure 5). More than 80% of 100-kb windows show γ within a 2-fold range of the chromosome average, whereas only 26.3% of 100-kb windows show c within that range. To test specifically whether the distribution of CO events is more variable across the genome than either GC or the combination of GC and CO events (i.e., number of DSBs), we estimated the coefficient of variation (CV) along chromosomes for each of the three parameters for different window sizes and chromosome arms. In all cases (window size and chromosome arm), the CV for CO is much greater (more than 2-fold) than that for either GC or DSBs (CO+GC), while the CV for DSBs is only marginally greater than that for GC: for 100-kb windows, the average CV per chromosome arm for CO, GC and DSBs is 0.90, 0.37 and 0.38, respectively. Nevertheless, we can also rule out the possibility that the distribution of GC events or DSBs is completely random, because there is significant heterogeneity along each chromosome (P<0.0001 at all physical scales analyzed, from 100 kb to 10 Mb; see Materials and Methods for details). Not surprisingly, given the excess of GC over CO events, GC is a much better predictor of the total number of DSBs (total recombination events) across the genome than CO rates, with semi-partial correlations of 0.96 for GC and 0.38 for CO to explain the overall variance in DSBs (not taking into account the fourth chromosome).
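As a rough illustration of the two summary statistics used above, the following sketch computes a coefficient of variation and the fraction of windows within a 2-fold range of the mean. It makes several assumptions that are not stated in the text: the CV uses the population standard deviation, "within a 2-fold range" is read as lying between half and twice the mean, and the per-window rates are invented for illustration only.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Return SD / mean, using the population SD (an assumption)."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return pstdev(values) / m


def fraction_within_twofold(values):
    """Fraction of windows with mean/2 <= rate <= 2*mean (assumed definition)."""
    m = mean(values)
    inside = sum(1 for v in values if m / 2 <= v <= 2 * m)
    return inside / len(values)


# Hypothetical per-window rates; these are NOT data from the study.
co_rates = [0.0, 0.5, 4.0, 0.1, 3.4, 0.0, 2.0, 6.0]
gc_rates = [1.0, 1.2, 0.8, 1.1, 0.9, 1.3, 0.7, 1.0]

if __name__ == "__main__":
    for name, rates in (("CO", co_rates), ("GC", gc_rates)):
        print(name, coefficient_of_variation(rates), fraction_within_twofold(rates))
```

With real data, each list would hold the rates for all windows of one chromosome arm, and the CV would then be averaged across arms.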
DSB resolution requires the formation of heteroduplex sequences (for either CO or GC events; Figure S1). These heteroduplex sequences can contain A(T):C(G) mismatches, which may be repaired randomly or in favor of specific nucleotides. In Drosophila, there is no direct experimental evidence supporting G+C-biased gene conversion repair, and evolutionary analyses have produced contradictory results when using CO rates as a proxy for heteroduplex formation. Note, however, that GC events are more frequent than CO events in Drosophila as well as in other organisms, and that GC (γ) rates should therefore be more relevant than CO (c) rates when investigating the possible consequences of heteroduplex repair.
In some species, gene conversion mismatch repair has been proposed to be biased, favoring G and C nucleotides and predicting a positive relationship between recombination rates (sensu frequency of heteroduplex formation) and the G+C content of noncoding DNA.
Our data show no association of γ with G+C nucleotide composition in intergenic sequences (R = +0.036, P>0.20) or introns (R = −0.041, P>0.16). A similar lack of association is observed when G+C nucleotide composition is compared with c (P>0.25 for both intergenic sequences and introns). We therefore find no evidence of gene conversion bias favoring G and C nucleotides in D. melanogaster based on nucleotide composition. The reasons why some previous studies inferred gene conversion bias toward G and C nucleotides in Drosophila may be multiple, and include the use of sparse CO maps as well as incomplete genome annotation. Because gene density in D. melanogaster is higher in regions with non-reduced CO, many recently annotated transcribed regions and G+C-rich exons may previously have been analyzed as neutral sequences, particularly in these genomic regions with non-reduced CO.
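The test described above can be sketched as a correlation between per-region recombination rate and G+C content. The sketch below assumes that R is a Pearson coefficient, which the text does not specify, and the helper names `gc_fraction` and `pearson_r` are hypothetical. It does not compute P values.

```python
from math import sqrt


def gc_fraction(seq):
    """G+C content of a sequence, ignoring characters other than A, C, G, T."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (seq.count("G") + seq.count("C")) / counted


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)
```

In use, one list would hold γ (or c) for each intergenic or intronic region and the other the `gc_fraction` of the same region. A coefficient near zero, as reported here, indicates no linear association.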
The motifs of recombination in Drosophila
To discover DNA motifs associated with recombination events (CO or GC), we focused on 1,909 CO and 3,701 GC events delimited by 500 bp or less (CO500 and GC500, respectively). Our D. melanogaster data reveal many motifs significantly enriched in sequences surrounding recombination events (18 and 10 motifs for CO and GC, respectively) (Figure 6 and Figure 7). Individually, the motifs surrounding CO events (MCO) are present in 6.8 to 43.2% of CO500 sequences, while motifs surrounding GC events (MGC) are present in 7.8 to 27.6% of GC500 sequences. Note that 97.7% of all CO500 sequences contain at least one MCO motif and 85.0% of GC500 sequences contain at least one MGC motif (Figure S4).
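The two kinds of percentage reported above (per-motif presence, and presence of at least one motif) can be illustrated with a minimal sketch. It assumes exact string matching on one strand, which is a simplification: motif discovery tools typically use position weight matrices and scan both strands. The sequences and motifs are invented placeholders, not the motifs found in the study.

```python
def motif_coverage(sequences, motifs):
    """Return (per-motif fraction of sequences, fraction with >= 1 motif).

    Uses exact substring matching on the given strand only (an assumption).
    """
    if not sequences:
        raise ValueError("need at least one sequence")
    n = len(sequences)
    per_motif = {m: sum(m in s for s in sequences) / n for m in motifs}
    any_motif = sum(any(m in s for m in motifs) for s in sequences) / n
    return per_motif, any_motif


# Hypothetical example sequences and motifs, for illustration only.
example_sequences = ["ACGTACGT", "TTTTCACA", "GGGGGGGG", "CACAGTGT"]
example_motifs = ["CACA", "GTGT"]

if __name__ == "__main__":
    per_motif, any_motif = motif_coverage(example_sequences, example_motifs)
    print(per_motif, any_motif)
```

Because a sequence can carry several motifs, the fraction with at least one motif is not the sum of the per-motif fractions.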