Annotation data for the nucleic acid and amino acid sequences are available for download. The data are provided as a simple tab-delimited text file for easy parsing.

- Download [tar.gz] (~21M)
- Download [zip] (~21M)
- Download [txt] (~87M)
- Download [readme.txt]

Each line of the annotation file is tab-delimited and consists of the following fields, in this order.




Publication_SEQID

The patent (or application) number and the SEQ ID NO, joined by an underscore. For example, if a sequence has 'US298321' as its patent (or application) number and '12' as its SEQ ID NO, this field will be US298321_12.

Gene ID

Entrez Gene ID that corresponds to the RefSeq number

RefSeq number

The top hit RefSeq number in the BLAST results


E-value

BLAST E-value of the alignment

RefSeq length

Length of the top hit RefSeq sequence

Seq. length

Length of a query patent sequence

Alignment length

Alignment length in BLAST results

RefSeq coverage

alignment length / RefSeq length

Seq. coverage

alignment length / query sequence length

Alignment type

alignment type (see How to build)
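
The field list above can be read with a few lines of Python. This is a minimal sketch, not an official parser. It assumes that the file has no header line, that each line holds exactly the ten fields in the order listed, and that the coverage fields are stored as plain decimal numbers. The names `Annotation`, `parse_annotations`, and `split_publication_seqid` are hypothetical, and the sample line is made up for illustration (including the placeholder alignment type `X`).

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO, Tuple


class Annotation(NamedTuple):
    """One line of the annotation file (attribute names are illustrative)."""
    publication_seqid: str
    gene_id: str
    refseq_number: str
    e_value: float
    refseq_length: int
    seq_length: int
    alignment_length: int
    refseq_coverage: float
    seq_coverage: float
    alignment_type: str


def parse_annotations(handle: TextIO) -> Iterator[Annotation]:
    """Yield one Annotation per non-empty tab-delimited line."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for line_no, row in enumerate(reader, start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 10:
            raise ValueError(f"line {line_no}: expected 10 fields, got {len(row)}")
        yield Annotation(
            row[0], row[1], row[2],
            float(row[3]),
            int(row[4]), int(row[5]), int(row[6]),
            float(row[7]), float(row[8]),
            row[9],
        )


def split_publication_seqid(value: str) -> Tuple[str, int]:
    """Split 'US298321_12' into ('US298321', 12) at the last underscore."""
    publication, _, seq_id = value.rpartition("_")
    return publication, int(seq_id)


# Made-up sample line; real values will differ.
sample = "US298321_12\t1234\tNM_000001\t1e-50\t2000\t500\t400\t0.2\t0.8\tX\n"
records = list(parse_annotations(io.StringIO(sample)))
record = records[0]

publication, seq_id = split_publication_seqid(record.publication_seqid)

# Recompute the two coverage values from the length fields.
refseq_coverage = record.alignment_length / record.refseq_length
seq_coverage = record.alignment_length / record.seq_length
```

For the downloaded file, the same function can be given an open file handle, for example `open(path, newline="")`, where `path` points to the extracted text file. Splitting at the last underscore keeps the publication number intact even if it contains an underscore itself.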


