<h3>2. The characterization of the yeast two-hybrid library</h3>
<p>The 150-bp insert fragments were amplified using the primers CCAGATTACGCTCATATGACAAGTTTGTAC and TCATCTGCAGCTCGAGACCACTTTGTACAA. The constructed libraries were sequenced on the Illumina NovaSeq 6000 platform by ANNOROAD Biotech. Co., Ltd. (China), following the manufacturer’s instructions. After quality control, about 34.5 GB of clean data were obtained. Based on the library design (TCGTCGGGGACAACTTTGTACAAAAAAGTTGGAACC-(NNK)20-TAAGACCCAACTTTCTTGTACAAAGTTGTGCGGCCGCC), the 60-bp random sequences were extracted from the clean data and translated into amino acid sequences using the standard genetic code. Translation was terminated at the first stop codon, and peptides shorter than five amino acids were removed. In total, we obtained 68,190,232 peptides, of which 3,359,176 remained after removing redundant sequences. Saturation analysis showed that the roughly 35 GB of data were still far from covering all of the distinct peptides in our random peptide library (Figure 4B), indicating that the library is highly diverse and has broad potential for investigating protein interactions and for related applications.</p>
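<p>The extraction and translation steps described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names (<code>extract_insert</code>, <code>translate_until_stop</code>, <code>reads_to_peptides</code>) are hypothetical, and the sketch assumes forward-strand reads with exact matches to both flanking sequences, whereas a production pipeline would probably also handle reverse-complement reads and sequencing mismatches.</p>

```python
import re

# Flanking sequences taken from the library design; the 60-nt random
# region, (NNK)20, lies between them.
LEFT_FLANK = "TCGTCGGGGACAACTTTGTACAAAAAAGTTGGAACC"
RIGHT_FLANK = "TAAGACCCAACTTTCTTGTACAAAGTTGTGCGGCCGCC"
INSERT_RE = re.compile(LEFT_FLANK + "([ACGT]{60})" + RIGHT_FLANK)

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def extract_insert(read):
    """Return the 60-nt random region of a read, or None if the flanks are absent."""
    match = INSERT_RE.search(read)
    return match.group(1) if match else None


def translate_until_stop(dna):
    """Translate codon by codon and stop at the first stop codon."""
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


def reads_to_peptides(reads, min_len=5):
    """Extract, translate, and drop peptides shorter than min_len residues."""
    peptides = []
    for read in reads:
        insert = extract_insert(read)
        if insert is None:
            continue
        peptide = translate_until_stop(insert)
        if len(peptide) >= min_len:
            peptides.append(peptide)
    return peptides
```

<p>In this sketch, the total peptide count corresponds to <code>len(peptides)</code> and the non-redundant count to <code>len(set(peptides))</code>.</p>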
<p class="refs">Figure 4. Statistics of the sequencing data for our random peptide library. (A) Number of peptides of each length; (B) Saturation analysis based on random sampling of the sequencing data. The abscissa shows the proportion of the data that was randomly sampled (10%, 20%, 30%, 40%, 50%, 60%, 70%, 80% and 90%), and the ordinate shows the number of peptides detected at each sampling proportion.</p>
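<p>The saturation analysis can be sketched in Python as follows. The function name <code>saturation_curve</code> is hypothetical. The sketch also assumes nested subsamples taken from a single shuffle of the peptide list, so that the counts never decrease; the original analysis may instead have drawn each proportion independently.</p>

```python
import random


def saturation_curve(peptides,
                     fractions=(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9),
                     seed=0):
    """Count distinct peptides in random subsamples of increasing size.

    Returns a list of (fraction, distinct_peptide_count) pairs.
    """
    shuffled = list(peptides)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    curve = []
    for fraction in fractions:
        n = round(fraction * len(shuffled))
        # The first n items of a shuffled list form a uniform random sample.
        curve.append((fraction, len(set(shuffled[:n]))))
    return curve
```

<p>A curve that keeps rising up to the 90% sample, rather than levelling off, suggests that deeper sequencing would continue to reveal new peptides.</p>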