Ashbya Genome Database

 

Introduction

The Ensembl FTP site provides biological sequence databases suitable for large-scale local sequence similarity search approaches, as well as MySQL table dumps of all underlying Ensembl databases. These table dumps are suitable for import into relational database management systems and allow installation of complete Ensembl mirror sites.

Please note: many correlation tables can also be downloaded through the highly customisable BioMart data mining tool. You may find this web-based tool easier to explore than extracting information from our normalised database dumps.

The URL ftp://ftp.ensembl.org/pub/ is the root of the directory structure outlined below. The structure is also described in the FTP site README. The latest data sets for each species are available via directories prefixed 'current_'. For example, 'current_mouse' always points to the latest data release for mouse.
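
As a minimal sketch of the 'current_' naming convention described above, the following Python helper builds the directory URL for a species. The function name current_species_url and the name normalisation (lower-casing, spaces to underscores) are illustrative assumptions and are not part of the FTP site itself.

```python
# Base URL as given in the text above.
BASE_URL = "ftp://ftp.ensembl.org/pub/"


def current_species_url(species: str) -> str:
    """Return the URL of the 'current_' directory for a species.

    Hypothetical helper: the normalisation below is an assumption,
    not a documented rule of the FTP site.
    """
    name = species.strip().lower().replace(" ", "_")
    if not name:
        raise ValueError("species name must not be empty")
    return f"{BASE_URL}current_{name}/"


if __name__ == "__main__":
    print(current_species_url("mouse"))
```

The helper only assembles a string; whether a given 'current_' directory exists has to be checked against the server.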

Species Directories

All species directories have the following basic structure, although not all information is necessarily available for each species.


-- species
   |
   |-- data
       |
       |-- fasta          Gene predictions in FASTA database format
       |   |
       |   |-- cdna         * Transcript (cDNA) predictions
       |   |-- dna          * Genomic DNA in assembled entities
       |   |-- pep          * Translation (peptide) predictions
       |   |-- rna          * Non-coding RNA predictions
       |
       |-- flat files     Gene predictions annotated on genomic DNA slices of 1 Mb.
       |   |
       |   |-- embl         * EMBL format
       |   |-- genbank      * GenBank format
       |
       |-- mysql          MySQL database table text dumps
           |
           |-- core       General genome annotation information
           |
           |                * Genome sequence assembly
           |                * Ensembl gene predictions
           |                * Ab initio gene predictions
           |                * Marker information
           |                * ...
           |
           |-- otherfeatures  Additional genome annotation
           |
           |                * Gene predictions based on EST information
           |                * ...
           |
           |-- variation  Genetic variation information
           |-- vega       Manually curated gene sets
           |-- cdna       cDNA to genome alignments based on the latest EMBL database
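
The tree above can also be written down as a nested mapping, which makes it easy to list every leaf directory for a species. This is a rough sketch only: the directory names are copied from the diagram (including the label 'flat files'), the names SPECIES_LAYOUT, leaf_paths and species_paths are hypothetical, and the names actually used on the server may differ.

```python
# Layout copied from the diagram above; descriptions are omitted.
# Assumption: the diagram labels match the on-disk directory names.
SPECIES_LAYOUT = {
    "data": {
        "fasta": {"cdna": {}, "dna": {}, "pep": {}, "rna": {}},
        "flat files": {"embl": {}, "genbank": {}},
        "mysql": {
            "core": {},
            "otherfeatures": {},
            "variation": {},
            "vega": {},
            "cdna": {},
        },
    }
}


def leaf_paths(tree, prefix=""):
    """Yield the relative path of every leaf directory in a nested dict."""
    for name, children in tree.items():
        path = f"{prefix}/{name}" if prefix else name
        if children:
            yield from leaf_paths(children, path)
        else:
            yield path


def species_paths(species_dir):
    """Return all leaf paths below one species directory, e.g. 'current_mouse'."""
    return [f"{species_dir}/{path}" for path in leaf_paths(SPECIES_LAYOUT)]


if __name__ == "__main__":
    for path in species_paths("current_mouse"):
        print(path)
```

Because not all information is available for each species, the paths produced here are candidates to probe rather than a guarantee that each directory exists.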

Multi-species Directories

The 'multi-species' directories contain databases that are not tied to a single species, or that hold data linking several species.


-- multi-species
   |
   |-- data
       |
       |-- mysql                    MySQL database table text dumps
           |
           |-- ensembl_compara      Cross-species comparative genomics data:
           |
           |                          * Orthologue/paralogue predictions
           |                          * Protein families
           |                          * Whole genome alignments
           |                          * Synteny information
           |
           |-- ensembl_go           Gene Ontology database
           |
           |-- ensembl_web_user_db  SQL table definition for server-side user config database
           |
           |-- ensembl_website      Ensembl web site database:
           |                          * Context-sensitive help articles
           |                          * News articles
           |                          * Mini-ads

Mart Directory

The 'mart' directory contains table dumps of the databases underlying the BioMart data mining tool.


-- mart
   |
   |-- data
       |
       |-- mysql              MySQL database table text dumps
           |
           |-- ensembl_mart   Cross-species data mining tables
           |-- sequence_mart  Genome sequences
           |-- snp_mart       Genetic variation information
           |-- vega_mart      Manually curated gene sets
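
To illustrate how the MySQL table text dumps listed in the trees above might be loaded into a local server, the sketch below assembles the command lines for the standard mysql and mysqlimport clients without running them. It rests on several assumptions that this page does not state: that each dump directory holds one SQL schema file plus one tab-delimited text file per table, that the files are already uncompressed, and that the target database has been created. The function name build_import_commands, the user name and the file names are hypothetical.

```python
def build_import_commands(database, schema_file, table_files, user="ensembl_user"):
    """Return argv lists for loading one dump directory; nothing is executed.

    Assumptions (not taken from this page): 'schema_file' holds the
    CREATE TABLE statements and each entry in 'table_files' is an
    uncompressed tab-delimited dump of a single table.
    """
    # Create the tables first by sourcing the schema file in the mysql client.
    commands = [
        ["mysql", f"--user={user}", database, "-e", f"source {schema_file}"]
    ]
    # mysqlimport derives the table name from the file name without its
    # extension, so 'gene.txt' is loaded into table 'gene'.
    for table_file in sorted(table_files):
        commands.append(
            ["mysqlimport", f"--user={user}", "--local", database, table_file]
        )
    return commands


if __name__ == "__main__":
    for argv in build_import_commands("core_db", "core_db.sql", ["gene.txt", "exon.txt"]):
        print(" ".join(argv))
```

Each returned list could be passed to subprocess.run once the file layout and credentials have been checked against the actual dumps.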


 

© 2024 University of Basel. This product includes software developed by Ensembl.

                
AGD version 3 based on Ensembl release 40 - Aug 2006