Software

Information Software

Ensembl Versioning Scheme

The Ensembl project releases updated data sets and software approximately ten times a year. Each release of the Ensembl web code and databases is given its own version number, which can be found in the top left-hand corner of the horizontal blue bar.

More about the Ensembl Versioning Scheme...

Ensembl Archive!

Since November 2004, every Ensembl release has been archived as a fully functional site for about two years.

More about the Ensembl Archive...

Ensembl Website

The Ensembl website is written in Perl and can be installed locally. The code is modular and extensible so it can be customised for local demands.

More about the Ensembl Website...

Ensembl Databases and Application Programme Interfaces (APIs)

Ensembl uses MySQL relational databases to store its information. A comprehensive set of Application Programme Interfaces (APIs) serves as a middle layer between the underlying database schemas and more specific application programmes. The APIs aim to encapsulate the database layout by providing efficient high-level access to data tables and by isolating applications from changes to the data layout. Ensembl provides a Perl API and a Java API (Ensj), although the Java API is slightly less complete.

The main Ensembl databases are introduced below. Data releases for these databases can be obtained from the Ensembl FTP site. For installation of the Perl APIs see the installation instructions.

Ensembl Core Databases and APIs

The set of species-specific Ensembl Core databases stores genome sequences and most of the annotation information. This includes the gene, transcript and protein models annotated by the Ensembl automated genome analysis and annotation pipeline. Ensembl Core databases also store assembly information, cDNA and protein alignments, external references, markers and repeat regions data sets.

More about the Core database...

Ensembl EST Databases and APIs

Species-specific Ensembl EST databases hold an independent EST gene set provided for all well-characterised species with a suitable amount of biological evidence. Ensembl EST databases use the same schema as the Ensembl Core databases, so the Core schema descriptions and API access apply to them equally.

More about the EST database...

Ensembl Compara Database and APIs

The Ensembl Compara multi-species database stores the results of genome-wide species comparisons re-calculated for each release. The comparative genomics set includes pairwise whole genome alignments and synteny regions. The comparative proteomics data set contains orthologue predictions and protein family clusters.

More about the Compara database...

Ensembl Variation Databases and APIs

The large amount of genetic variation information is organised in a set of species-specific Ensembl Variation databases.

More about the Variation database...

Ensembl Registry

The Registry system lets you tell your programs where to find the Ensembl databases and how to connect to them. It has been implemented for the Ensembl Core and Compara Perl APIs as well as the Ensembl Java API.

More about the Registry...

Ensembl Tools

BioMart

BioMart is a generic data management system, originally built for Ensembl (EnsMart), which offers a range of advanced query interfaces and administration tools. The system comes with built-in support for query-optimisation and database federation.

Ensembl builds a BioMart database so that users can run fast, powerful queries through the online 'MartView' page, through a graphical or text-based application, or programmatically through software libraries written in Perl and Java. For data providers, the system simplifies the task of integrating their own data with other Ensembl datasets.

More about BioMart...

Exonerate

Exonerate is a generic tool for pairwise sequence comparison. It lets you align sequences under many alignment models, using either exhaustive dynamic programming or a variety of heuristics.
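As a rough illustration of what exhaustive dynamic programming means for pairwise comparison, the sketch below computes a Smith-Waterman-style local alignment score in Python. It is not Exonerate's code and does not reproduce any of its alignment models; the function name and the scoring values (match, mismatch, gap) are assumptions chosen only for the example.

```python
def local_alignment_score(a: str, b: str,
                          match: int = 2,
                          mismatch: int = -1,
                          gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b.

    Illustrative scoring scheme (an assumption, not Exonerate's defaults):
    a linear gap penalty and fixed match/mismatch scores.
    """
    # prev holds the scores of the previous row of the DP matrix.
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]  # column 0 is always 0 for local alignment
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            score = max(
                0,                        # start a new local alignment
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,            # gap in b
                curr[j - 1] + gap,        # gap in a
            )
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


# Every cell of the matrix is filled in, which is what makes the
# search exhaustive; heuristics avoid filling most of it.
identical = local_alignment_score("ACGT", "ACGT")
unrelated = local_alignment_score("AAAA", "TTTT")
shared_core = local_alignment_score("TTACGTT", "GGACGGG")
```

The cost grows with the product of the two sequence lengths, which is why heuristic methods are attractive for large inputs.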

More about Exonerate...

SSAHA

SSAHA is a software tool for very fast matching and alignment of DNA sequences. It stands for Sequence Search and Alignment by Hashing Algorithm. It achieves its fast search speed by converting sequence information into a 'hash table' data structure, which can then be searched very rapidly for matches.
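The hash-table idea can be sketched in a few lines of Python. This is a simplified illustration, not SSAHA's actual implementation: the function names, the choice of indexing every overlapping k-mer, and the value of k are all assumptions made for the example.

```python
from collections import defaultdict


def build_kmer_index(subject: str, k: int) -> dict[str, list[int]]:
    """Map each k-mer of `subject` to the positions where it starts."""
    index: dict[str, list[int]] = defaultdict(list)
    for pos in range(len(subject) - k + 1):
        index[subject[pos:pos + k]].append(pos)
    return dict(index)


def find_seed_matches(query: str, index: dict[str, list[int]],
                      k: int) -> list[tuple[int, int]]:
    """Return (query_pos, subject_pos) pairs for every shared k-mer."""
    hits = []
    for qpos in range(len(query) - k + 1):
        # One dictionary lookup per query k-mer replaces a scan of the
        # whole subject sequence.
        for spos in index.get(query[qpos:qpos + k], []):
            hits.append((qpos, spos))
    return hits


subject = "ACGTACGTTAGC"
index = build_kmer_index(subject, 4)
hits = find_seed_matches("CGTTAG", index, 4)
```

Building the index is done once per subject sequence; each query then costs only one lookup per k-mer, and runs of hits on the same diagonal (constant difference between subject and query position) indicate candidate alignments.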

See SSAHA configuration, information and software...

Wise2

Wise2 is a package focused on comparisons of biopolymers, commonly DNA sequence and protein sequence. Algorithms in this package include genewise and estwise.

More about Wise2...

Ensembl Software Support

Ensembl is an open project, and we encourage correspondence and discussion on any aspect of Ensembl. Please see the Ensembl Contacts page for ways to get in touch with us.
