The library concentration equivalence was calculated as 2.8 × 10⁹ molecules/µL. The library was stored at -20°C until further use. The shotgun library was clonally amplified with 1 and 2 copies per bead (cpb) in two emPCR reactions each, and the paired-end library was amplified with 0.5 cpb in three emPCR reactions, using the GS Titanium SV emPCR Kit (Lib-L) v2 (Roche). The emPCR yields were 6.8 and 9.8%, respectively, for the shotgun library, and 11.29% for the paired-end library. These yields fall within the 5 to 20% range expected according to the Roche protocol. For each library, approximately 790,000 beads for a quarter region were loaded on the GS Titanium PicoTiterPlate PTP kit and sequenced with the GS FLX Titanium Sequencing Kit XLR70 (Roche).
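The comparison of the emPCR yields reported above with the expected 5 to 20% range can be written out as a short check. This is only an illustrative sketch: the labels and the function name `within_expected_range` are invented here, and pairing 6.8% with 1 cpb and 9.8% with 2 cpb follows our reading of "respectively" in the text.

```python
# Expected emPCR yield range quoted in the text (percent).
EXPECTED_MIN = 5.0
EXPECTED_MAX = 20.0

# Yields as reported in the text. The labels are chosen for this sketch,
# and the cpb pairing for the shotgun library is an assumption.
reported_yields = {
    "shotgun, 1 cpb": 6.8,
    "shotgun, 2 cpb": 9.8,
    "paired-end, 0.5 cpb": 11.29,
}


def within_expected_range(yield_percent, low=EXPECTED_MIN, high=EXPECTED_MAX):
    """Return True if the yield lies inside the inclusive [low, high] range."""
    return low <= yield_percent <= high


# Evaluate every reported yield against the expected range.
results = {
    label: within_expected_range(value)
    for label, value in reported_yields.items()
}

for label, ok in results.items():
    status = "within" if ok else "outside"
    print(f"{label}: {reported_yields[label]}% is {status} the expected range")
```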
The run was performed overnight and analyzed on a cluster using the gsRunBrowser and the Newbler assembler (Roche). For the shotgun sequencing, 188,659 passed-filter wells were obtained, generating 129.3 Mb of sequence with an average read length of 685 bp. For the paired-end sequencing, 106,675 passed-filter wells were obtained, generating 35 Mb with an average read length of 262 bp. The passed-filter sequences were assembled using Newbler with thresholds of 90% identity and 40 bp overlap. The final assembly identified 8 scaffolds and 66 contigs (>1,500 bp) and generated a genome size of 3.79 Mb, which corresponds to a coverage of 54.25 genome equivalents.
Genome annotation
Open Reading Frames (ORFs) were predicted using Prodigal with default parameters, but predicted ORFs were excluded if they spanned a sequencing gap region.
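The exclusion of ORFs that span a sequencing gap can be illustrated with a minimal sketch. It is not the pipeline actually used. It assumes that gaps appear as runs of `N` in the scaffold sequence, that ORF coordinates are 0-based half-open intervals, and that any overlap with a gap counts as "spanning". The function names `find_gaps` and `drop_gap_spanning_orfs` are hypothetical.

```python
import re
from typing import List, Tuple

# 0-based, half-open interval [start, end) on a scaffold.
Interval = Tuple[int, int]


def find_gaps(sequence: str) -> List[Interval]:
    """Locate gap regions, assumed here to be runs of N (or n)."""
    return [(m.start(), m.end()) for m in re.finditer(r"[Nn]+", sequence)]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def drop_gap_spanning_orfs(orfs: List[Interval],
                           gaps: List[Interval]) -> List[Interval]:
    """Keep only ORFs that do not overlap any gap region."""
    return [orf for orf in orfs if not any(overlaps(orf, g) for g in gaps)]


# Toy scaffold: two short coding stretches separated by a 10-N gap.
scaffold = "ATGAAATAG" + "N" * 10 + "ATGCCCTAA"
gaps = find_gaps(scaffold)

# Hypothetical ORF predictions; the second one runs across the gap.
predicted_orfs = [(0, 9), (5, 22), (19, 28)]
kept_orfs = drop_gap_spanning_orfs(predicted_orfs, gaps)
print(kept_orfs)
```

In practice the ORF coordinates would come from the gene predictor's output, and the coordinate convention would need to be converted if it is 1-based and inclusive.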
The predicted bacterial protein sequences were searched against the GenBank database and the Clusters of Orthologous Groups (COG) databases using BLASTP. The tRNAScanSE tool was used to find tRNA genes, whereas ribosomal RNAs were found using RNAmmer and BLASTn against the GenBank database. Lipoprotein signal peptides and numbers of transmembrane helices were predicted using SignalP and TMHMM, respectively. ORFans were identified if their BLASTP E-value was lower than 1e-03 for alignment lengths greater than 80 amino acids. If alignment lengths were smaller than 80 amino acids, we used an E-value of 1e-05. Such parameter thresholds have already been used in previous works to define ORFans. Ortholog sets composed of one gene from each of the four genomes H.