<p>The majority of proteins within mitochondria and chloroplasts are nuclear-encoded – they are expressed by the host and are imported into the organelle. Proteins meant for the organelle are usually marked by a targeting sequence at one end, also known as a transit peptide, which directs the protein to its destination after which it is cleaved. </p>
<p>The same holds for UCYN-A: in their 2024 study, Coale et al. [1] used proteomics to identify proteins that are encoded by the host and imported into the nitroplast. On examining these protein sequences, they noticed that many show hallmarks of organellar import: most carry a C-terminal 120 amino acid extension compared to their orthologues. This extension is reminiscent of the targeting sequences found in proteins imported into mitochondria [2] and chloroplasts [3]. They termed the putative targeting sequence uTP (UCYN-A Transit Peptide, with a lowercase “u” to distinguish it from uridine triphosphate).
</p>
...
...
@@ -44,8 +44,22 @@
</figcaption>
</div>
<p>Our motif analysis produced results consistent with [1], revealing 8 conserved motifs in the C-terminal region (Fig 2). Further investigation of motif co-occurrence and relative positioning uncovered common patterns: two motifs consistently appeared at fixed positions near the start of the C-terminal region, followed by various combinations of the remaining motifs. This arrangement suggests a possible sub-organellar localization mechanism, in which the first two motifs target the protein to UCYN-A and the subsequent motifs specify its localization within the endosymbiont. Chloroplast targeting works in a comparable way: a bipartite N-terminal targeting sequence specifies stromal and thylakoidal localization. Further research is needed to test this hypothesis.</p>
<div class="img-pagestyle">
<img src="https://static.igem.wiki/teams/5054/msa.png" alt="Fig 1: Graphical overview of the experiment plan.">
<figcaption>Figure 3: Five uTP variants discovered among UCYN-A imported sequences. The left panel shows the positions of the conserved motifs in each variant relative to motif #1, which was present in all examined sequences.
</figcaption>
</div>
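<p>The positional comparison shown in the figure above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline we actually ran: the input format (a dictionary mapping motif names to start positions within the C-terminal region) and the names <code>relative_positions</code>, <code>motif_order</code> and <code>count_patterns</code> are assumptions made for the example, and the positions are made-up placeholders.</p>

```python
from collections import Counter


def relative_positions(hits, anchor="motif_1"):
    """Return each motif's start offset relative to the anchor motif.

    `hits` maps a motif name to its start position in one sequence
    (hypothetical input format, assumed for this sketch).
    """
    if anchor not in hits:
        raise ValueError(f"anchor motif {anchor!r} not found in this sequence")
    base = hits[anchor]
    return {name: start - base for name, start in hits.items() if name != anchor}


def motif_order(hits):
    """Return the motif names ordered by start position (the motif pattern)."""
    return tuple(sorted(hits, key=hits.get))


def count_patterns(all_hits):
    """Count how often each ordered motif pattern occurs across sequences."""
    return Counter(motif_order(hits) for hits in all_hits)


if __name__ == "__main__":
    # Placeholder positions, not real data.
    example = [
        {"motif_1": 10, "motif_2": 25, "motif_5": 60},
        {"motif_1": 12, "motif_2": 27, "motif_5": 62},
        {"motif_1": 8, "motif_2": 23, "motif_3": 70},
    ]
    for hits in example:
        print(relative_positions(hits))
    print(count_patterns(example).most_common())
```

<p>Anchoring on motif #1 works here only because that motif was present in every examined sequence; a motif that is sometimes absent would not be a usable reference point.</p>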
<p>We investigated the relationship between transit peptide (uTP) sequences and the functional core of each protein, known as the mature domain. The mature domain is the part of a protein that remains after the transit peptide is cleaved off, and it performs the protein's primary function. Given the observed diversity in uTP sequences, understanding how they relate to specific mature domains is crucial for designing effective uTP constructs for future experiments, since certain uTP sequences may only be compatible with specific proteins. To explore potential correlations between uTP motif patterns and mature domain sequences, we trained classifiers to predict the appropriate uTP sequence (that is, the correct combination of motifs) from a given mature domain sequence. The classifiers were evaluated using a permutation test [10], and three of the four yielded statistically significant results (p &lt; 0.05) (Fig 4).</p>
<div class="img-pagestyle">
<img src="https://static.igem.wiki/teams/5054/msa.png" alt="Fig 1: Graphical overview of the experiment plan.">
<figcaption>Figure 4: Classification results of 4 different classifiers trained to predict uTP sequences based on mature domains.
</figcaption>
</div>
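<p>For readers unfamiliar with permutation testing of classifiers, the general idea can be sketched as follows: score the classifier on the real labels, then re-score it many times on shuffled labels and ask how often chance does at least as well. The code below is a generic illustration under stated assumptions and not our exact evaluation code. The toy leave-one-out nearest-neighbour scorer, the function names and the numeric features are all hypothetical stand-ins for the real classifiers and sequence features.</p>

```python
import random


def loo_1nn_accuracy(features, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier.

    Toy scorer on one-dimensional features, used only as a stand-in
    for whichever classifier is being evaluated.
    """
    n = len(features)
    correct = 0
    for i in range(n):
        # Find the closest sample other than sample i itself.
        nearest = min(
            (k for k in range(n) if k != i),
            key=lambda k: abs(features[k] - features[i]),
        )
        correct += labels[nearest] == labels[i]
    return correct / n


def permutation_test(features, labels, score_fn, n_permutations=1000, seed=0):
    """Return (observed score, p-value) for a label-permutation test."""
    rng = random.Random(seed)
    observed = score_fn(features, labels)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        # Shuffling breaks any real link between features and labels.
        rng.shuffle(shuffled)
        if score_fn(features, shuffled) >= observed:
            at_least_as_good += 1
    # The +1 terms count the observed labelling as one permutation,
    # so the p-value can never be exactly zero.
    p_value = (at_least_as_good + 1) / (n_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    # Placeholder data: two well-separated groups.
    features = [0.0, 1.0, 2.1, 3.3, 4.6, 6.0, 100.0, 101.0, 102.1, 103.3, 104.6, 106.0]
    labels = ["pattern_A"] * 6 + ["pattern_B"] * 6
    score, p = permutation_test(features, labels, loo_1nn_accuracy)
    print(f"accuracy={score:.2f}, p={p:.4f}")
```

<p>A small p-value means that the score obtained on the real labels is rarely matched when the labels are assigned at random, which is the sense in which a classifier's result is called statistically significant.</p>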
<p>For in vivo characterization, we constructed candidate uTP sequences by concatenating the consensus sequences of the motifs in each discovered motif pattern. The uTP sequence classifiers were used to select the motif combinations best suited to the fluorescent proteins we planned to use, mVenus and mNeonGreen. The two sequences with the highest confidence values (uTP1 and uTP2) were selected for in vivo experiments. These sequences were also submitted to the Parts Registry.</p>
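<p>The construction step described above can be illustrated with a short sketch: rank the candidate motif patterns by classifier confidence, concatenate the consensus sequence of each motif in order, and append the result to the C-terminus of the mature protein. Everything concrete in this example is hypothetical: the function names, the confidence values and the consensus strings are placeholders and not the sequences of uTP1 or uTP2.</p>

```python
def top_patterns(confidences, k=2):
    """Return the k motif patterns with the highest classifier confidence."""
    return sorted(confidences, key=confidences.get, reverse=True)[:k]


def build_utp(pattern, consensus):
    """Concatenate the consensus sequence of each motif in `pattern`, in order."""
    missing = [motif for motif in pattern if motif not in consensus]
    if missing:
        raise KeyError(f"no consensus sequence for: {missing}")
    return "".join(consensus[motif] for motif in pattern)


def fuse_c_terminal(mature_domain, utp):
    """Append the uTP to the C-terminus of the mature protein sequence."""
    return mature_domain + utp


if __name__ == "__main__":
    # Placeholder consensus strings and confidences, not real data.
    consensus = {"motif_1": "AAAA", "motif_2": "CCCC", "motif_3": "DDDD"}
    confidences = {
        ("motif_1", "motif_2"): 0.55,
        ("motif_1", "motif_2", "motif_3"): 0.30,
        ("motif_1", "motif_3"): 0.15,
    }
    reporter = "MREPORTER"  # stand-in for a fluorescent protein sequence
    for pattern in top_patterns(confidences, k=2):
        print(pattern, fuse_c_terminal(reporter, build_utp(pattern, consensus)))
```

<p>The uTP is appended to the end of the reporter because the extension reported in [1] is C-terminal, unlike the N-terminal targeting sequences of chloroplast proteins.</p>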