The Human Web Mcneill Ebook Torrents


Genomics and metagenomics have revolutionized our understanding of marine microbial ecology and the importance of microbes in global geochemical cycles. However, the process of DNA sequencing has always been an abstract extension of the research expedition, completed once the samples were returned to the laboratory.


During the 2013 Southern Line Islands Research Expedition, we started the first effort to bring next-generation sequencing to some of the most remote locations on our planet. We successfully sequenced twenty-six marine microbial genomes and two marine microbial metagenomes using the Ion Torrent PGM platform on the Merchant Yacht Hanse Explorer. Onboard sequence assembly, annotation, and analysis enabled us to investigate the role of the microbes in the coral reef ecology of these islands and atolls. This analysis identified phosphonate as an important phosphorus source for microbes growing in the Line Islands and reinforced the importance of L-serine in marine microbial ecosystems. Sequencing in the field allowed us to propose hypotheses and conduct experiments and further sampling based on the sequences generated. By eliminating the delay between sampling and sequencing, we enhanced the productivity of the research expedition. By overcoming the hurdles associated with sequencing on a boat in the middle of the Pacific Ocean, we proved the flexibility of the sequencing, annotation, and analysis pipelines.

Introduction

DNA sequencing has revolutionized microbial ecology: next-generation sequencing has upended our traditional views of microbial communities and enabled exploration of the microbial components of many unusual environments. In a typical environmental exploration, samples are collected at the study site, transported back to the laboratory, and analyzed after the scientific team returns from the field. This separation of DNA sequencing from sampling eliminates the possibility of immediate follow-up studies to explore interesting findings. In our previous studies of environmental microbial and viral components, we identified questions and challenges that could have been answered with additional sample collection but that had to await a future return to the field. The use of next-generation sequencing for microbial ecology involves two distinct components. The first comprises the experimental aspects, including sample collection and preparation, DNA extraction, and sequencing, which are routine in the laboratory but challenging in the field. The principal limitation to taking sequencing into the field is the significant infrastructure and resources required for all steps of the sequencing process.

In addition to the dedicated hardware required to generate the sequences, much of the hardware, and many of the sample preparation steps, require physically separated laboratory space (to reduce cross-contamination between the samples). Second, the informatics aspects (processing the raw data into high-quality sequences, comparing those sequences to existing databases to generate annotations, and the subsequent data analysis) all require high-performance computational resources to generate meaningful biological interpretations.

There are many challenges to next-generation sequencing in the field, but surmounting those obstacles will allow scientists to pursue new research avenues in exploring the environment using genetic approaches. Some of these challenges may be mitigated by the specific location being explored. For example, many terrestrial locations are accessible to mobile laboratories and have access to cellular communications that can provide Internet access. Though low-bandwidth, these connections can be used for analysis of next-generation DNA sequences. In contrast, aside from near-shore venues, Internet communication at remote marine stations typically relies on satellite transmissions, and thus is both limited in bandwidth and extremely expensive. New bioinformatics approaches are reducing the computational complexity of the algorithms in DNA sequence processing, thereby minimizing the resources needed for data analysis.

Together with faster and cheaper computational technologies, these improved approaches mitigate the need for Internet-based computations. To explore the frontier of next-generation sequencing at sea, we deployed a Life Technologies Ion Torrent Personal Genome Machine (PGM) during the 2013 Southern Line Islands Research Expedition to sequence bacterial isolates and community metagenomes from these remote islands. We installed local bioinformatics capabilities to perform the necessary sequence analysis. There were numerous challenges to remote DNA sequencing and analysis; however, the end result, genome sequences generated at the remote central Pacific atolls, allowed us to focus our research on questions relevant to the samples we collected. In this paper we describe the sequencing and informatics pipelines established during the expedition, release the data generated during the 2013 Line Islands research expedition, and discuss some of the unexpected challenges in remote sequencing.

Sample collections

Water samples were collected above the reef, and in-reef water samples were collected through crevices and against the benthos, both at 10 m depth. All sampling sites were named as either “tent sites” or “black reef”.

“Supersucker” samples were collected from either coral or algal surfaces with a modified syringe system that uses pre-filtered sterile seawater to flush the targeted microbial community from the respective surface. Metagenomics samples were collected from the benthic boundary layer of two sites at Starbuck Island: a newly discovered black reef site (5.62653°S, 155.90886°W) and the tent site (5.62891°S, 155.92529°W). The collection was performed using a 19 l low-density polyethylene collapsible bag (Cole-Parmer, Vernon Hills, IL, USA) connected to a modified bilge pump, as we have described previously. Large debris and eukaryotic cells were removed by filtration through 100 µm Nitex mesh, and microbial cells were captured by passing the filtrate through a 0.45 µm Sterivex filter (Millipore, Inc., MA, USA). The Sterivex filters were stored at −20 °C until DNA extraction.

Bacterial isolates collection

A sample (100 µl) of each water sample was plated onto thiosulfate-citrate-bile salts-sucrose (TCBS) agar for the isolation of Vibrio-like spp.

Typically >90% of colonies are Vibrio spp., but Pseudoalteromonas, Pseudovibrio, Shewanella and others also grow on TCBS. Therefore, colonies isolated from the TCBS plates were designated as Vibrio-like (F Thompson, pers. comm., 2014). In addition, a sample (100 µl) of each water sample was also plated onto Zobell’s Marine agar for the isolation of heterotrophic marine bacteria. In the naming scheme of isolates, “V” indicates Vibrio-like spp. and “Z” indicates isolates from Zobell’s Marine agar. Single colonies were picked and re-streaked onto new agar plates for colony isolation. Vibrio-like isolates were selected based on the color and size of the colony. Non-Vibrio isolates were selected based on pigmentation (color) and colony morphology. Cells were scraped off the agar plate for DNA extraction, multi-phenotype assay plates (MAP), storage in RNAlater, and metabolite extraction using 100% MeOH. Permit regulations restrict the import and export of live biological material between Kiribati and the United States, and therefore viable bacteria are not available.

DNA extraction and sequencing

The DNA from bacterial isolates was extracted and purified using the standard bacteria protocol in the Nucleospin Tissue Kit (Macherey-Nagel, Dueren, Germany).

In short, the cells were re-suspended with 180 µl T1 lysis buffer and mixed thoroughly. Proteinase K (25 µl) was added and the mixture was incubated at 37 °C for 3–8 h.

The remaining extraction procedure followed the manufacturer's protocol. Total microbial DNA was isolated from the Sterivex filters based on a modified protocol using the Nucleospin Tissue Kit (Macherey-Nagel, Dueren, Germany). Lysis steps were completed overnight at 37 °C in the Sterivex filters with double the amount of Proteinase K-added T1 lysis buffer. An appropriate volume of B3 lysis buffer (200 µl when 180 µl of T1 lysis buffer was added, and 400 µl when 360 µl was added) was added for complete lysis before the lysate was removed from the Sterivex filter for the subsequent extraction procedure described in the manufacturer's protocol.

Sequence libraries were prepared using the Ion Xpress™ Plus Fragment Library Kit (Life Technologies, NY, USA) with slight protocol modification, and each library was barcoded using the Ion Xpress™ Barcode Adapters 1–16 Kit. SPRI bead-based size selection according to the published New England Biolabs (NEB) E6270 protocol was performed to select 200–300 bp fragments after adapter ligation. Emulsion PCR was performed on the 8-cycle amplified library using the OneTouch supplemented with the Ion Torrent PGM Template OT2 200 Kit, and template libraries were sequenced on the Ion Torrent PGM using the Ion Torrent PGM Sequencing 200 Kit v2 and Ion 318™ Chip Kit v2. Sequencing was performed across five different locations on the ship.

Multi-phenotype assay plate (MAP)

Bacterial cells were resuspended from single colonies into sterile artificial seawater. Before leaving San Diego, MAPs were created as stock plates using 48 different carbon substrates arrayed on the plate in duplicate. Each stock well contains 1 ml of 6X basal media (6X MOPS media, 57 mM NH4Cl, 1.5 mM NaSO4, 30 µM CaCl2, 6 mM MgSO4, 1.9 M NaCl, 7.92 mM K2HPO4, 60 mM KCl, 36 µM FeCl3) and 1 ml of 5X carbon substrate. The substrates are used at a final concentration of 0.2% unless specified. Each experimental well on a 96-well plate consists of 50 µl of pre-mixed basal media + substrate solution, 75 µl sterile water, and 25 µl re-suspended bacterial cells.

Bacterial cell optical density (OD) was read using a spectrophotometer at 650 nm at the start of the experiment (T = 0) and subsequently at the times noted. The multi-phenotype assay data were parsed and compiled using in-house Perl scripts. The data were visualized as growth curves by plotting OD measurements over time. Using the ggplot library in R, the entire plate and curves were generated as images that were manually inspected (DA Cuevas, DR Garza, S Sanchez, JE Rostron, CS Henry, RA Overbeek, V Vonstein, F Rohwer, EA Dinsdale, RA Edwards, 2014, unpublished data). The OD measurements occurring at or after 40 h were extracted from the data for comparative analysis between the samples.
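As a minimal illustration of the extraction step, the Python sketch below reduces each substrate's time series to a single value taken from readings at or after 40 h. The text says only that these late readings were extracted; averaging them, the function name, and the substrate labels are assumptions made for the example, not the authors' Perl implementation.

```python
def substrate_profile(readings, cutoff_hours=40.0):
    """Summarise late growth per substrate.

    readings maps a substrate name to a list of (hours, od) tuples.
    Returns a dict mapping each substrate to the mean OD of readings taken
    at or after cutoff_hours (None if there are no such readings).
    Averaging is an assumption; the source only says the values were extracted.
    """
    profile = {}
    for substrate, series in readings.items():
        late = [od for hours, od in series if hours >= cutoff_hours]
        profile[substrate] = sum(late) / len(late) if late else None
    return profile


# Hypothetical readings for two substrates on one isolate.
example_readings = {
    "substrate_1": [(0, 0.05), (24, 0.30), (40, 0.60), (48, 0.80)],
    "substrate_2": [(0, 0.05), (24, 0.06)],
}
example_profile = substrate_profile(example_readings)
```

Ordering the resulting values by substrate gives one vector per isolate that can be compared between samples.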

These values were used to establish the 48-substrate vector profile of each sample. The Euclidean distance was calculated using the SciPy spatial distance module to generate a distance matrix that was the basis for a neighbor-joining tree (DA Cuevas, DR Garza, S Sanchez, JE Rostron, CS Henry, RA Overbeek, V Vonstein, F Rohwer, EA Dinsdale, RA Edwards, 2014, unpublished data). This code is available from GitHub.

Bioinformatics analysis of sequence libraries

As noted below, the most common computational issue was corruption of the data files on the hard drives. To mitigate this issue, the MD5 checksum values for each file were calculated on the personal genome machine using the command-line md5sum application.

(The PGM contains a single hard drive.) This application was chosen because it is fast and efficient. The checksum for each file was computed and compared to the expected values before the computation started and at the completion of each computation. On board ship, bases were called using a modified version of the Ion Torrent pipeline.

To expedite the processing in the absence of a large compute cluster, the sequencing chip was digitally divided into four quadrants using the –cropped option to the bead finding application justBeadFind (part of the Ion Torrent suite, Life Technologies, Carlsbad, CA). A standard IonExpress 318 chip is 3,392 × 3,792 beads, and the chip was divided into four quadrants with the coordinate ranges (0–1,746, 0–1,946), (0–1,746, 1,846–3,792), (1,646–3,392, 0–1,946), and (1,646–3,392, 1,846–3,792).
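The listed coordinates are consistent with splitting each axis at its midpoint and extending each half by 50 positions. The Python sketch below reproduces them under that assumption and shows one way to drop reads reported twice from the shared strips. The function names, the inferred 50-position margin, treating 3,392 as the first axis, and keying reads by bead coordinates are illustrative choices and not part of the Ion Torrent software.

```python
def quadrant_bounds(width, height, margin):
    """Return four (a_start, a_end, b_start, b_end) regions covering the chip.

    Each axis is split at its midpoint and both halves are extended by
    `margin`, so neighbouring quadrants share a strip 2 * margin wide.
    """
    a_mid, b_mid = width // 2, height // 2
    a_ranges = [(0, a_mid + margin), (a_mid - margin, width)]
    b_ranges = [(0, b_mid + margin), (b_mid - margin, height)]
    return [(a0, a1, b0, b1) for a0, a1 in a_ranges for b0, b1 in b_ranges]


def merge_quadrant_reads(quadrant_reads):
    """Combine per-quadrant results, keeping one read per bead.

    quadrant_reads is a list of dicts mapping a bead coordinate tuple to its
    sequence; a bead seen in an earlier quadrant is not overwritten.
    """
    merged = {}
    for reads in quadrant_reads:
        for bead, sequence in reads.items():
            merged.setdefault(bead, sequence)
    return merged


quads = quadrant_bounds(3392, 3792, 50)
for quad in quads:
    print(quad)
```

Because each quadrant can be processed independently, the memory needed at any one time is roughly that of a quarter of the chip plus the overlap.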

An overlap was provided on either side to ensure that all beads were identified. Any identical sequences from the same bead that were found in more than one quadrant were removed in post-processing steps. Bead finding, bead analysis, and base calling were performed using the Life Technologies software version 4.0.

Phylogenetic relationships of isolates

The 16S rRNA, rpoB, and recA gene sequences were extracted from the unassembled reads of each genome using the program genomePeek (K McNair, RA Edwards, 2014, unpublished data). Each group of sequences extracted from the same genome library was assembled into contigs using Newbler 2.7 with default parameters. The contigs were then grouped into 16S rRNA, RpoB, and RecA gene groups. Each group was aligned with ClustalW2 using the default parameters.

The alignments were visually checked using Seaview (). Extraneous contigs were removed from the original set, and the remaining contigs were re-aligned, trimmed and exported in the PHYLIP format.

Phylogenetic trees were generated using the neighbor-joining clustering method and visualized using the Interactive Tree of Life. This analysis was not performed on the boat.

Conserved functions

Function is best conserved between orthologous proteins (i.e., proteins that are derived from the same common ancestor). On the boat we determined which functions were conserved across all genomes by counting proteins that had the same annotation. This allowed us to quickly compare the common functions and identify genes that were unique to the strains that we sequenced. Upon our return, and after the RAST annotations were complete, we composed orthologous groups (OGs) specific for the organisms sequenced here.

These orthologous groups represent protein families derived from a single protein in the common ancestor of the genomes and were identified using an approach similar to one described previously. Briefly, we first queried the complete proteomes using an all-by-all blastp search. The resulting bitscores were used to define in-paralogous groups of recently duplicated genes (i.e., duplicated after the last speciation event) within every genome. Within a genome, all proteins with a matching score better than or equal to their score against any protein in another genome were joined into an “in-paralogous group”.

We then combined the in-paralogous groups conservatively between species by joining pairs of reciprocal best blastp hits to create the final list of orthologous groups for the complete set of genomes.

Identification of genes required for growth on L-serine

The microbes were scored for growth on L-serine in the multi-phenotype assay plates. A matrix was constructed listing all the genomes and all the functional roles annotated as being present in those genomes, with the values in the matrix indicating whether each functional role was present in each genome.

Two approaches were used to identify those genes that separate the strains that can grow on L-serine as a sole carbon source from those strains that cannot. First, a random forest machine learning approach was used, with the genes as variables and the ability to grow on L-serine as categories. The random forest identifies important variables (genes) that discriminate the two categories. The approach used the R package “randomForest” to classify the matrix.

Second, a simple summation approach was used: for each gene, we counted the number of organisms containing it among those that could utilize L-serine and among those that could not. This table was sorted to identify those genes that are present in the strains that can utilize L-serine and absent from those strains that cannot. Both approaches gave similar results.
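The summation approach can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' code: the function name, the genome and role labels, and the choice to sort by the difference in the fraction of strains carrying each role (so that unequal group sizes do not skew the ranking) are all hypothetical.

```python
def rank_discriminating_roles(presence, growers):
    """Count each functional role among growers and non-growers, then sort.

    presence maps a genome name to the set of functional roles annotated in it.
    growers is the set of genome names that grew on the substrate.
    Returns (role, n_in_growers, n_in_non_growers) tuples, with roles most
    enriched in growers first.
    """
    growers = set(growers)
    non_growers = set(presence) - growers
    roles = set().union(*presence.values())

    def fraction(count, group):
        # Guard against an empty group to avoid dividing by zero.
        return count / len(group) if group else 0.0

    rows = []
    for role in roles:
        n_grow = sum(role in presence[g] for g in growers)
        n_non = sum(role in presence[g] for g in non_growers)
        score = fraction(n_grow, growers) - fraction(n_non, non_growers)
        rows.append((score, role, n_grow, n_non))
    rows.sort(key=lambda row: (-row[0], row[1]))
    return [(role, n_grow, n_non) for _, role, n_grow, n_non in rows]


# Hypothetical presence/absence data for four genomes.
presence = {
    "genome_1": {"role_A", "role_B"},
    "genome_2": {"role_A", "role_C"},
    "genome_3": {"role_B", "role_C"},
    "genome_4": {"role_C"},
}
ranked = rank_discriminating_roles(presence, growers={"genome_1", "genome_2"})
```

In this toy input, `role_A` is found in both growers and in neither non-grower, so it sorts to the top of the table.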

This was initiated on the boat (with the random forest approach) and then subsequently refined.

Genome sequencing

Using solely the Ion Torrent PGM sequencing technology (Life Technologies), twenty-six genomes and two metagenomes were successfully sequenced onboard the M/Y Hanse Explorer during the three-week expedition in the Southern Line Islands. We generated close to 1.5 billion bases (post quality filtering) of high-quality DNA sequence data to investigate the role of microbes on the world's most pristine coral reef ecosystems. Additionally, more than 7.5 billion bases (post quality filtering) were generated by Life Technologies to supplement the dataset with six additional genomes from the last two islands and to increase the amount of data for the under-sequenced libraries.

In total, we sequenced three Pseudoalteromonas, one Ruegeria, two Serratia, and twenty Vibrio isolates. All sequences have been deposited in public databases. On board, culturing, DNA extraction, library construction, and sequencing took approximately 5 days to complete. The analysis of the sequence chips took approximately 5 h per quadrant to complete, and the annotation of those sequences took about an hour to complete. As noted below, analysis of the genomes and metagenomes was the most time-consuming part of the bioinformatics analysis, and remains an ongoing project. Accession numbers.

Twenty Vibrio isolates corresponding to five Vibrio spp., V. harveyi (and potentially its sister species, V. campbellii), V. coralliilyticus, V. alginolyticus, V. shilonii, and V. cyclitrophicus, were cultured and their genomes were sequenced. Non-Vibrio isolates whose genomes were sequenced included Pseudomonas fluorescens, Serratia proteamaculans, Serratia marcescens, Pseudoalteromonas spp., and Phaeobacter gallaeciensis. Sequencing these genomes with the Ion Torrent PGM demonstrated that approximately 1 gigabase (10^9 bp) of DNA sequence is required to assemble typical marine microbial genomes to fewer than 100 contigs using this technology.

The quality of assembly for the genomes appears to be solely dependent on the number of reads generated, and thus with sufficient time and resources all the genomes could be reduced to fewer than 100 contigs (high-quality draft status). These genomes were annotated onboard the Hanse Explorer using our rapid annotation pipeline. Based on these annotations, the ten closest genomes to our newly sequenced genomes were identified, and the presence and absence of genes in those genomes were summarized to identify the unique functions in our genomes. Subsequently, we also created groups of orthologous genes to identify those genes unique to our isolates. In total we identified 11,585 orthologous groups in the genomes. Each genome had 3,032 ± 550 orthologous groups.

There were 1,442 orthologous groups that were unique to the Vibrio-like genomes and 4,913 orthologous groups that were unique to the Zobell genomes. Presumably these are the specialization genes that allow these organisms to grow on the reefs of the Southern Line Islands. Many of these genes belong to classes previously identified as separating microbial species, such as prophages, transposons, IS elements and other mobile genes.

All of the sequenced isolates contain prophage-like elements, suggesting that phage predation controls bacterial populations, as we have shown before. Many of the genomes also contained nucleases and CRISPR elements indicative of resistance to active phage infections. The bacteria may be responding to phage infections by altering their cell surface, and genes involved in alternative pathways to construct lipopolysaccharide (LPS) were unique to some of the strains that we sequenced. Twenty of the twenty-six genomes contain variable genes involved in the synthesis of β-L-rhamnose, a deoxy-sugar that is a building block of LPS. Some examples include glucose-1-phosphate thymidylyltransferase similar to E. coli rfbA; dTDP-glucose 4,6-dehydratase similar to E. coli rfbB; dTDP-4-dehydrorhamnose 3,5-epimerase similar to E. coli rfbC; and dTDP-4-dehydrorhamnose reductase similar to E. coli rfbD. In the isolate VRT11, for example, these four genes are located adjacent to each other, presumably in a single operon within the rfb gene cluster. Phosphorus is essential for growth but is often limiting in marine environments since most phosphate salts are insoluble. Phosphorus is readily converted to phosphonates, compounds that contain C–P bonds (rather than the more typical C–O–P bonds of phosphates), by the phosphoenolpyruvate mutase (PepM) mediated isomerization of phosphoenolpyruvate to phosphonopyruvate.

In marine environments, phosphonate production is catalyzed by Prochlorococcus and Pelagibacter, but is also catalyzed by marine mollusks, anemones and by members of the coral holobiont. Phosphonate utilization by Vibrio species has been shown in mesocosm experiments using surface water of the North Pacific Subtropical Gyre.

However, not all Vibrio isolates are able to utilize phosphonate. For example, the coral pathogen V. shiloi AK1 is predicted to be able to use phosphonate, while the coral pathogen V. coralliilyticus is not able to use phosphonate. Eighteen of the isolates that were sequenced here (VAR3, VAR4, VRT2, VRT4, VRT5B, VRT14, VRT22, VRT23, VRT25, VRT35, VRT37, VRT3, VRT41, ZAR1, ZAR2, ZRT3, ZRT28, and ZRT32) contained phosphonate transporters and utilization genes, suggesting that in the oligotrophic waters of the Southern Line Islands, phosphonate is a critical phosphorus source for heterotrophic bacteria and that they likely scavenge it from the coral reef. Iron is also often limiting in offshore marine environments in the Southern Ocean, and the exogenous addition of iron to reef systems (e.g., from shipwrecks) promotes the over-growth of algae.

The presence of a multitude of iron acquisition mechanisms, including high-affinity transporters for both ferric (Fe3+) and ferrous (Fe2+) iron, ABC transporters, and an average of twenty siderophore genes per genome, suggests that the marine isolates from the Southern Line Islands actively scavenge iron and are poised to consume any additional iron that enters the system.

Phenotypic analysis

In addition to sequencing the genomes of all isolates, we examined the phenotypic differences by using a multi-phenotype assay plate. The MAP allowed us to quantitatively measure the cellular phenotypes of each isolate in response to different nutrient sources based on their growth. Examples of the growth curves for all 48 carbon sources are shown for isolates VRT1 and VRT2. The growth characteristics of each isolate in the 48 carbon sources used in this experiment are shown as a heatmap. The growth curves from the negative controls and filtered seawater-only samples displayed no change in OD650 over time; the OD650 measurement was consistently below 0.10 in those controls, indicating a viable protocol and setup. Although a few isolates (e.g., ZAR1, VAR2, and VAR4) were only able to grow on a few compounds, most of the isolates were generalists, able to grow on a wide range of carbon and nitrogen sources.

The isolates did not separate by island of isolation, suggesting that any variations in oceanographic conditions among the atolls are outweighed by biological influences (see below).

Serine utilization

Free serine is abundant in the ocean and we previously proposed that serine is used as an osmolyte by marine microbes.

Fifteen of the twenty-six isolates that we assayed were able to grow on serine as a sole carbon source (VAR3, VRT1, VRT3, VRT4, VRT5B, VRT14, VRT18, VRT22, VRT23, VRT30, VRT35, VRT38, VRT41, ZAR2, and ZRT1), and we therefore examined which genotypes are responsible for growth on serine. L-serine dehydratase (E.C. 4.3.1.17), the enzyme that converts L-serine to pyruvate and ammonia, is in every one of the genomes that we sequenced except VAR3, and is almost always associated with a serine transporter (including in all of those strains that cannot utilize serine as a sole carbon source). D-serine dehydratase (E.C. 4.3.1.18), which performs the same reaction with D-serine, is in twenty of the genomes that we sequenced.

We therefore compared the features present in the genomes to identify which annotations are associated with serine catabolism. Genes involved in vitamin B12 synthesis (cobalamin; cobU, cobS) and the conversion of serine to homocysteine (O-acetylhomoserine sulfhydrylase (EC 2.5.1.49)/O-succinylhomoserine sulfhydrylase (EC 2.5.1.48)) are present in all of the strains that can use L-serine as a sole carbon source and few of those strains that cannot. These enzymes all connect serine catabolism to methionine metabolism via homocysteine (so that the S-adenosyl methionine that catalyzes the reaction can be replaced). It has previously been shown in E. coli that growth with L-serine as a sole carbon source is dependent on methionine metabolism, suggesting that in marine microbes a similar requirement holds and these microbes are using the same metabolic pathways.

Comparison to metagenomes

Following our previous expeditions to the Line Islands we sequenced 33 microbial metagenomes, and during the most recent expedition we sequenced two additional metagenomes.

Comparing the microbial genomes that we sequenced with the microbial metagenomes shows that we have observed each of the microbial genomes previously in our metagenomic sequences. Similarity between the genomes and metagenomes was not dependent on metagenome size, genome size, or sequence coverage. ZAR2, the only Ruegeria, and ZRT1, a Pseudoalteromonas, have unique profiles when compared to the metagenomes.

This suggests that these organisms may be either transient colonizers of the reef that are passing through, or low-abundance colonizers that are rarely sampled and that we isolated by chance. In contrast, the Serratia and most of the Vibrio clones are frequently found in the different samples and are therefore likely generalists. However, the uniform similarity across genera (Vibrio, Pseudoalteromonas, and Serratia) suggests that the previous metagenome sequences contain the genus- and species-specific genes (e.g., housekeeping genes) of those organisms and not necessarily the strain-specific genes that may be unique in these organisms.

Discussion

Next-generation sequencing has revolutionized microbial ecology but has always remained a step away from the field work.

Samples are collected, returned to the lab, and studied. In many ways this is analogous to the field ecologists of the 19th century who captured wild beasts and brought them to zoos or museums to study.

There are many reasons why researchers will want to use sequencing in remote locations, including the limitations of permitting, archiving samples, and the ability to perform experiments as soon as the data is generated. With the advancements in next-generation sequencing, sample preparation, and data analysis, microbes can be studied in their natural habitat.

By bringing the instruments to the environment, and not the other way around, environmental microbiology can be explored in heretofore unimagined ways. The onboard sequencing and analysis suggested that microbes in the Southern Line Islands are limited by phosphorus and iron. The genomes predicted their potential to scavenge phosphorus from phosphonate, and iron from a variety of sources through various transporters and siderophores. Approximately half of the microbes that were isolated are able to grow on L-serine by converting L-serine to methionine. Although there does not appear to be any genus-specific preference for growing on L-serine (some isolates of Vibrio, Pseudoalteromonas, and Serratia could grow on L-serine), there is a specific biochemical pathway that is required: the transformation of L-serine to methionine via cobalamin.

We could not identify any correlation between the ability to utilize serine and the location where the microbes were isolated, at any scale from kilometers to micrometers. It therefore remains to be determined what selects for the ability to utilize L-serine as a sole carbon source in the marine environment. The physical distance between the five islands is shown in. Island biogeography suggests that closer islands should have more closely related organisms. To test whether the microbes on the Southern Line Islands follow this rule, we calculated the genetic distance between each of the isolates based on several marker genes (16S, rpoB, and recA), the genotypic distance based on the presence of orthologous groups in each genome, and the phenotypic distance based on the multi-phenotype assay plates. There was no correlation between the distance between the islands and the genetic, genotypic, or phenotypic distances, suggesting that the microbes of the Southern Line Islands are not constrained to their local islands and are not restricted in their migration between islands.
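One simple way to examine such a relationship is to correlate the pairwise geographic distances with the pairwise phenotypic distances over the same pairs of isolates. The Python sketch below does this with a Pearson coefficient. The text does not state which correlation statistic was used, so the statistic, the function names, the planar coordinates in kilometres, and the two-substrate profiles are all assumptions made for illustration.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / (spread_x * spread_y)


def distance_correlation(positions, profiles):
    """Correlate geographic and phenotypic Euclidean distances over all pairs.

    positions maps an isolate name to planar coordinates (km);
    profiles maps the same names to phenotype vectors of equal length.
    """
    geo, pheno = [], []
    for a, b in combinations(sorted(positions), 2):
        geo.append(math.dist(positions[a], positions[b]))
        pheno.append(math.dist(profiles[a], profiles[b]))
    return pearson(geo, pheno)


# Hypothetical isolates whose phenotypic distance scales with geographic distance.
positions = {"iso_a": (0.0, 0.0), "iso_b": (3.0, 4.0), "iso_c": (6.0, 8.0)}
profiles = {"iso_a": [0.1, 0.1], "iso_b": [0.4, 0.5], "iso_c": [0.7, 0.9]}
r = distance_correlation(positions, profiles)
```

Pairwise distances are not independent of one another, so judging significance would need a permutation-based procedure such as a Mantel test rather than a standard p-value for the coefficient.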

Challenges with onboard sequencing

The first challenge to sequencing on a boat was organizing the equipment to minimize the possibility of cross-contamination between the samples. The hardest part of the microbiology and molecular biology was keeping everything clean. On the M/Y Hanse Explorer, the microbiology lab was on the upper aft deck, the DNA isolation and quantification station was in a cabin, and the PCR station was in the dining room. Because the OneTouch contains a centrifuge, this equipment was placed in the lowest part of the ship, the laundry room, for maximum stability. The Ion Torrent PGM was housed in the owner’s quarters, atop the ship, to allow connection to the nitrogen gas tank which was stored outside. Centrifugation poses a significant problem on boats because of the conservation of angular momentum. Therefore, whenever possible, centrifugation was eliminated from the protocol.

For example, cultures were grown to dense colonies to avoid the need to pellet cells, a mini-centrifuge was used for column-based DNA extraction, and vacuum-based purification protocols were used as a backup. The computational aspects were surprisingly challenging. First, there were the unexpected equipment failures that had to be overcome without access to technical support or replacements. The touch screen on the OneTouch did not survive transit to the vessel, and control of that instrument had to be reverse engineered using the X11 interface and a Linux laptop. Second, data analysis requires consistent read/write access to the disks, and that process frequently experienced data corruption on the compute server, creating the potential for loss of data. The solution that was implemented was to compute the md5sum (a fixed-length digest computed from the contents of the file) for each file on the Ion Torrent PGM hard drive, and continually compare the md5sum calculated for the files on the compute server with those on the Ion Torrent PGM hard drive. Any deviation in the calculated values suggested that the file had been corrupted.
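The checksum comparison described above can be sketched with Python's standard `hashlib` module. The expedition used the command-line md5sum application, so this is an equivalent illustration rather than the tooling that was run onboard; the function names and the idea of passing the reference checksums as a dictionary are assumptions.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_corrupted(expected, directory):
    """List files whose current digest differs from the reference digest.

    expected maps a file name to the MD5 digest recorded on the source drive.
    Files that are missing from `directory` are reported as well.
    """
    suspect = []
    for name, checksum in expected.items():
        path = Path(directory) / name
        if not path.is_file() or md5_of(path) != checksum:
            suspect.append(name)
    return sorted(suspect)
```

Running `find_corrupted` before a computation starts and again when it finishes mirrors the before-and-after checks described in the methods; reading in chunks keeps memory use constant for large data files.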

It is not known what caused the data corruption as upon returning to San Diego the server has been through several compute cycles without a file corruption. We speculate that it was most likely the motion of the boat (as noted above for centrifugation) or potentially the uneven power that is available on a ship. It is likely that implementation of a RAID system in the PGM or the substitution of the hard drive with a solid-state drive would also mitigate these issues. The third problem that had to be overcome was ensuring appropriate compute resources for data analysis.

As discussed in the methods, the Ion Torrent PGM data files are amenable to partial processing, which reduces the memory footprint and computational time required to analyze the data. The final problem is to ensure that there is sufficient expertise available to analyze the data in a timely manner. Two proposed solutions are giving all members of the scientific team access to the data via a local (ship-board) Wi-Fi network or sending the data off the ship for remote analysis. The latter is potentially feasible as sequence data is highly compressible and thus the resource requirements for data transfer can be reduced.

Funding Statement

This work is partially supported by NSF Dimensions Grant (DEB-1046413; Edwards and Rohwer). This project was also funded in part by the Gordon and Betty Moore Foundation through Grant GBMF-3781 to Rohwer.

Additional funding for Yan Wei Lim was provided by the Canadian Institute for Advanced Research (CIFAR; IMB-ROHW-141679). Additional funding for Edwards was provided by NSF grants CNS-1305112, and MCB-1330800. Dutilh was supported by an award from CAPES/BRASIL. The SDSU Vice President of Research, Director’s Office of Scripps Institution of Oceanography, Moore Family Foundation, and several private donors provided cruise support.

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen Y-J, Chen Z, Dewell SB, Du L, Fierro JM, Gomes XV, Godwin BC, He W, Helgesen S, Ho CH, Irzyk GP, Jando SC, Alenquer MLI, Jarvie TP, Jirage KB, Kim J-B, Knight JR, Lanza JR, Leamon JH, Lefkowitz SM, Lei M, Li J, Lohman KL, Lu H, Makhijani VB, McDade KE, McKenna MP, Myers EW, Nickerson E, Nobile JR, Plant R, Puc BP, Ronan MT, Roth GT, Sarkis GJ, Simons JF, Simpson JW, Srinivasan M, Tartaro KR, Tomasz A, Vogt KA, Volkmer GA, Wang SH, Wang Y, Weiner MP, Yu P, Begley RF, Rothberg JM. Genome sequencing in microfabricated high-density picolitre reactors. Nature. 2005;437:376–380.