High Speed DNA (Deoxyribonucleic Acid) Sequencing for Environmental Sample Biological Threat Detection, Identification, and Support for Attribution
DHS (S&T) SBIR FY-08.2 - H-SB08.2-003
Department of Homeland Security
Opens: May 1, 2008 - Closes: June 17, 2008 4:30pm EST

8.3 SBIR TOPIC NUMBER: H-SB08.2-003

TITLE: High Speed DNA (Deoxyribonucleic Acid) Sequencing for Environmental Sample Biological Threat Detection, Identification, and Support for Attribution

TECHNOLOGY AREAS: DNA Sample Preparation, Sequencing and Bioinformatics

OBJECTIVE: Research and develop a prototype instrument that performs high speed DNA and RNA sample preparation, DNA sequencing, DNA sequence data generation and data analysis to detect and identify, at the strain level, biological threat agents in less than four hours from collected sample to warning signal.

DESCRIPTION: Complex samples from the environment need to be analyzed for biological threat agents in backgrounds of near neighbor strains, other endogenous species, mammalian cells and a variety of chemical and biological clutter. One method identified as a possible solution to this issue is robust sample preparation followed by high-speed DNA sequencing. However, current commercial platforms take between twenty and sixty hours to generate sequencing data and are not designed for the analysis of environmental samples of the type stated above. Speeding up the process at all stages will be required. No assumption of technique beyond DNA Sequencing is made by this call for proposals.

Sample Preparation - Current sample preparation and library building for high throughput DNA sequencing are slow (greater than four hours). This is particularly true when analyzing environmental samples where many different types of biological threat agents may be present, e.g., bacteria and RNA viruses. As a result, research and development to produce an automated sample preparation and library building front end for high throughput sequencing would dramatically speed up the process of obtaining threat agent data. If library building is not required by the sequencing technique proposed to satisfy the next section of this call for proposals, it can be left out.

Develop an automated instrument for the extraction and preparation of sequencing quality DNA and RNA from complex samples derived from a filter sample (e.g., a dry filter unit). Assume that the sample contains the components of floor sweepings as a guide to defining relevant background contaminants.

DNA Sequencing - Current technology requires approximately three eight-hour shifts to complete one run of DNA sequencing. This includes sample preparation and library building (approximately four hours), amplification of genomic fragments (on the order of eight hours), and sequencing data collection (on the order of eight hours). These runs produce millions of bases of finished product with relatively poor quality.
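As a rough illustration of the gap between current run times and the four hour goal in the OBJECTIVE, the stage durations quoted above can be summed in a few lines of Python. This is only arithmetic on the approximate figures in the text. The variable names are illustrative and not part of the solicitation.

```python
# Approximate stage durations in hours, taken from the figures quoted above.
current_stages = {
    "sample preparation and library building": 4,
    "amplification of genomic fragments": 8,
    "sequencing data collection": 8,
}
TARGET_HOURS = 4  # sample-to-warning goal stated in the OBJECTIVE

total_hours = sum(current_stages.values())
speedup_needed = total_hours / TARGET_HOURS

print(f"Itemized stages total about {total_hours} hours")
print(f"Overall speed-up needed to reach {TARGET_HOURS} hours: about {speedup_needed:.0f}x")
```

The itemized stages sum to about twenty hours, which matches the lower end of the twenty to sixty hour range given in the DESCRIPTION.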

Develop a DNA sequencing prototype instrument and control system that, through any means, speeds up the process of producing high quality DNA sequencing data with read lengths of twenty bases or more without assembly. No assumption of technology is made by this call for proposals. Assume that the output from automated sample preparation defined in the previous sections is the input for this instrument. Sample preparation and sequencing instrumentation need not be integrated but an integrated system is preferred.

Sequencing Data Analysis - Current sequencing systems require the completion of a run to start assembling data. For the purpose of biological threat detection it may not be necessary to assemble the sequencing information as biological threat identification should be possible with as little as twenty bases of sequence information.

Develop a computer algorithm and program that uses known sequence databases to analyze sequencing data as it is generated (initiating at a sequencing stage where a positive identification can be made, e.g., the twenty base stage) and identifies biological threat signatures allowing phylogenetic mapping to the strain level. No assumption of sequencing technology is made by this call for proposals but the proposed computer program in this section must be consistent with the sequencing instrument proposed as a response to the DNA Sequencing section above.
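One possible shape for such a program, offered only as a minimal sketch, is a streaming exact-match lookup of twenty base windows against an index built from known reference sequences. Everything here is an assumption made for illustration: the function names, the toy reference sequences, the strain names and the `min_hits` threshold are hypothetical, and a real system would also need to tolerate sequencing errors and place hits on a phylogeny.

```python
from collections import Counter

K = 20  # the twenty base stage mentioned in the text


def reverse_complement(seq):
    """Return the reverse complement of a DNA string made of A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def build_index(references, k=K):
    """Map every k-mer on both strands to the set of reference names containing it."""
    index = {}
    for name, seq in references.items():
        for strand in (seq, reverse_complement(seq)):
            for i in range(len(strand) - k + 1):
                index.setdefault(strand[i:i + k], set()).add(name)
    return index


def stream_hits(reads, index, k=K, min_hits=3):
    """Consume reads as they arrive and yield (name, hits) once a reference reaches min_hits."""
    counts = Counter()
    reported = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            names = index.get(read[i:i + k])
            # Count only k-mers found in exactly one reference, so sequence
            # shared with a near neighbor strain does not trigger a call.
            if names and len(names) == 1:
                (name,) = names
                counts[name] += 1
                if counts[name] >= min_hits and name not in reported:
                    reported.add(name)
                    yield name, counts[name]


# Toy data (hypothetical): strain_B differs from strain_A at a single position.
strain_a = "ATGGCGTACCTTGAGCATCGGATACCGTTAGGCAATCGTA"
strain_b = strain_a[:20] + "T" + strain_a[21:]
index = build_index({"strain_A": strain_a, "strain_B": strain_b})

reads = [
    "G" * 24,         # background read matching nothing
    strain_a[0:20],   # read shared by both strains, so not discriminating
    strain_a[9:31],   # read spanning the position that distinguishes strain_A
]
calls = list(stream_hits(reads, index))
print(calls)
```

Because `stream_hits` is a generator, a call is emitted as soon as enough discriminating windows have been seen, without waiting for the run to finish or for any assembly step.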

Informatic Gene Attribution - When analyzing environmental samples that include many different species and strains of interest, there currently are only a few ways of determining that all of the signatures detected come from a single strain or a few strains, i.e., the threat(s). Flanking sequence and copy number are among the indications that the components found in a metagenomic study are coming from a single threat strain. When a set of threat signatures is detected, it is important to know that a single organism type (or a few types) is responsible for all or at least a set of the signatures.

Develop improved computer-based algorithms and programs for the analysis of partly assembled metagenomic data from environmental samples for the purpose of determining that a set of threat signatures comes from a single species and strain. Demonstrate that this program reduces the false positive rate when detecting threat organisms. No assumption of sequencing technology is made by this call for proposals but the proposed computer program in this section must be consistent with the sequencing instrument proposed as a response to the DNA Sequencing section above.
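The copy number indication mentioned above can be illustrated with a toy heuristic: if one strain carries every detected signature, read depths divided by each signature's expected copy number should be roughly equal. The sketch below is an assumption for illustration only and is not a method prescribed by this call. The signature names, depths, copy numbers and the 0.25 cut-off are all hypothetical.

```python
from statistics import mean, pstdev


def single_source_score(signature_depths, copy_numbers):
    """Return the coefficient of variation of copy-number-normalized depths.

    Low values suggest the signatures are present in similar proportion,
    as would be expected if a single strain carries all of them.
    """
    normalized = [signature_depths[s] / copy_numbers[s] for s in signature_depths]
    avg = mean(normalized)
    return pstdev(normalized) / avg if avg > 0 else float("inf")


# Hypothetical expected copies per genome for three signatures.
copy_numbers = {"toxin_gene": 1, "plasmid_marker": 3, "chromosomal_marker": 1}

# Depths consistent with one strain: normalized values are 30, 30 and 31.
consistent = {"toxin_gene": 30, "plasmid_marker": 90, "chromosomal_marker": 31}
# Depths suggesting a mixture: normalized values are 30, 30 and 300.
mixed = {"toxin_gene": 30, "plasmid_marker": 90, "chromosomal_marker": 300}

THRESHOLD = 0.25  # illustrative cut-off, not a validated value
single_cv = single_source_score(consistent, copy_numbers)
mixed_cv = single_source_score(mixed, copy_numbers)
print(f"consistent set: CV={single_cv:.3f}, single source likely: {single_cv < THRESHOLD}")
print(f"mixed set:      CV={mixed_cv:.3f}, single source likely: {mixed_cv < THRESHOLD}")
```

A score of this kind could feed a confidence measure for attribution, although flanking sequence evidence, which the text also names, is not modeled here.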

PHASE I: Using prototype or commercial off-the-shelf laboratory equipment (modified or unmodified) and computer algorithms, demonstrate and report the feasibility of improving upon the time to detect and identify biological agents in a complex environmental background (e.g., from a dry filter containing dust) by the use of DNA sequencing. A written report including graphical and tabular data comparisons with statistical significance clearly stated, and a copy of all raw data collected, shall be delivered.

The report should include details of experiments including: biological threat simulant species (e.g., Bacillus thuringiensis spores) in a complex background, standard laboratory sample preparation techniques, operational details for a commercial or in-development DNA sequencing instrument, the real time base calling data and the algorithm(s) used to analyze this data. The report should state results of testing to determine whether or not, early in the sequencing data collection, the species of interest can be detected and identified in blind samples.

PHASE II: Deliver a prototype instrument and analysis/control software suite capable of (1) taking raw vacuum filters as its input and generating DNA sequencing data, (2) building a phylogenetic map of the biological threat agent(s) present (based on DNA sequence matching algorithms), (3) sending a signal to the operator that a threat has been detected, and (4) providing a confidence value for that identification, in less than four hours. The computer algorithm and program should use information about the likelihood that all threat signatures came from a single organism type (strain) to generate a second confidence measure that indicates this likelihood.

PHASE III COMMERCIAL APPLICATIONS: Deliver a shippable analytical instrument platform and analysis/control software suite capable of generating, analyzing and utilizing DNA sequencing data to detect and identify biological threat agents in less than four hours starting from a dry vacuum filter sample.

This instrument has multiple commercial applications. The smaller market may in fact be the Homeland Security market. However, Homeland Security is the intended beneficiary of this development and must be part of the commercialization plan. The human molecular diagnostics and research markets will clearly benefit from this development and will most likely be the larger commercial market for this technology. Solving the problems associated with the identification of biological threat agents from environmental samples will help solve problems in clinical molecular diagnostics. Examples include speeding up DNA sequence analysis in the point of care arena and reducing the complexity of comparisons needed to answer specific molecular diagnostic questions without requiring complete genome sequencing and sequence assembly.

KEY WORDS: DNA, Nucleic Acid, Sequencing, Analysis, Bioinformatic, Biological, Threat, Warfare, Genes, Virulence Factor, Antibiotic Resistance

TECHNICAL POINT OF CONTACT: Dr. James Anthony, 202-254-5742, james.anthony@dhs.gov


DHS Notice: From 1 May 2008 through 16 May 2008, proposers may contact the Technical Points of Contact identified in the solicitation by telephone or by email to ask technical questions about specific technical topics. No further direct contact between proposers and Technical Points of Contact shall occur after 16 May 2008 for reasons of competitive fairness. Additional questions will only be accepted until 6 June 2008 through e-mail at faq@hsarpasbir.com.

General DHS SBIR questions/information
Help Desk. All questions about this solicitation, the proposal preparation and electronic submission should be submitted via the website: www.sbir.dhs.gov. The DHS SBIR Help Desk is available toll free at: 1-800-754-3043.

NOTE: The Solicitations listed on this site are copies from the various SBIR agency solicitations and are not necessarily the latest and most up-to-date. For this reason, you should use the agency link listed below which will take you directly to the appropriate agency server where you can read the official version of this solicitation and download the appropriate forms and rules.
The official link for this solicitation is: www.sbir.dhs.gov.
