The Ensembl FTP site provides biological sequence databases suitable for large-scale local sequence similarity search approaches, as well as MySQL table dumps of all underlying Ensembl databases. These table dumps are suitable for import into relational database management systems and allow installation of complete Ensembl mirror sites.
Please note: Ensembl supports downloading of many correlation tables via the highly customisable BioMart data mining tool. You may find exploring this web-based data mining tool easier than extracting information from our normalised database dumps.
The directory structure outlined below is rooted at ftp://ftp.ensembl.org/pub/ and is also described in the FTP site README. The latest data sets for each species are available via directories prefixed with 'current_'. For example, 'current_mouse' will always point to the latest data release for the mouse.
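As a minimal sketch of the 'current_' naming convention described above, the following Python snippet builds the URL of a species' latest release directory. The helper name current_release_url and its input checks are illustrative assumptions, not part of any Ensembl tooling; only the base URL and the 'current_' prefix come from the text.

```python
BASE_URL = "ftp://ftp.ensembl.org/pub/"  # base URL stated in the text above


def current_release_url(species: str) -> str:
    """Return the URL of the 'current_' directory for a species.

    Hypothetical helper: it only concatenates strings and does not
    check that the directory actually exists on the FTP site.
    """
    name = species.strip().lower()
    if not name or "/" in name:
        raise ValueError(f"not a usable species directory name: {species!r}")
    return f"{BASE_URL}current_{name}/"


if __name__ == "__main__":
    print(current_release_url("mouse"))
```

The function performs no network access, so whether a given species name is available still has to be checked against the FTP site itself.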
All species directories have the following basic structure, although not all information is necessarily available for each species.
|-- species
    |
    |-- data
        |
        |-- fasta              Gene predictions in FASTA database format
        |   |
        |   |-- cdna           * Transcript (cDNA) predictions
        |   |-- dna            * Genomic DNA in assembled entities
        |   |-- pep            * Translation (peptide) predictions
        |   |-- rna            * Non-coding RNA predictions
        |
        |-- flat files         Gene predictions annotated on genomic DNA slices of 1 Mb.
        |   |
        |   |-- embl           * EMBL format
        |   |-- genbank        * GenBank format
        |
        |-- mysql              MySQL database table text dumps
            |
            |-- core           General genome annotation information
            |                  * Genome sequence assembly
            |                  * Ensembl gene predictions
            |                  * Ab initio gene predictions
            |                  * Marker information
            |                  * ...
            |
            |-- otherfeatures  Additional genome annotation
            |                  * Gene predictions based on EST information
            |                  * ...
            |
            |-- variation      Genetic variation information
            |-- vega           Manually curated gene sets
            |-- cdna           cDNA to genome alignments based on the latest EMBL database
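The species layout above can be captured in a small data structure, which may help when scripting downloads. The sketch below is illustrative only: the names SPECIES_LAYOUT and species_paths are assumptions, the directory names are copied verbatim from the listing (including 'flat files' with a space), and, as noted above, not every directory necessarily exists for every species.

```python
# Sub-directories of <species>/data, copied from the listing above.
SPECIES_LAYOUT = {
    "fasta": ["cdna", "dna", "pep", "rna"],
    "flat files": ["embl", "genbank"],
    "mysql": ["core", "otherfeatures", "variation", "vega", "cdna"],
}


def species_paths(species_dir: str) -> list:
    """List the relative paths implied by the layout for one species directory.

    Hypothetical helper: it enumerates the documented layout and does not
    check which directories are actually present on the FTP site.
    """
    paths = []
    for group, leaves in SPECIES_LAYOUT.items():
        for leaf in leaves:
            paths.append(f"{species_dir}/data/{group}/{leaf}")
    return paths


if __name__ == "__main__":
    for path in species_paths("current_mouse"):
        print(path)
```

Keeping the layout in a dictionary rather than hard-coding paths makes it easy to adjust if the structure described in the FTP site README differs from this listing.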
The 'multi-species' directories contain databases not directly linked to single species data or which contain link data between species.
|-- multi-species
    |
    |-- data
        |
        |-- mysql                    MySQL database table text dumps
            |
            |-- ensembl_compara      Cross-species comparative genomics data:
            |                        * Orthologue/paralogue predictions
            |                        * Protein families
            |                        * Whole genome alignments
            |                        * Synteny information
            |
            |-- ensembl_go           Gene Ontology database
            |
            |-- ensembl_web_user_db  SQL table definition for server-side user config database
            |
            |-- ensembl_website      Ensembl web site database:
                                     * Context-sensitive help articles
                                     * News articles
                                     * Mini-ads
The 'mart' directory contains table dumps of the databases underlying the BioMart data mining tool.
|-- mart
    |
    |-- data
        |
        |-- mysql              MySQL database table text dumps
            |
            |-- ensembl_mart   Cross-species data mining tables
            |-- sequence_mart  Genome sequences
            |-- snp_mart       Genetic variation information
            |-- vega_mart      Manually curated gene sets
© 2024 University of Basel. This product includes software developed by Ensembl.