Ashbya Genome Database

 

This section explains the configuration changes you have to make so that Ensembl will run on your local set-up.

Background: configuration

File Configuration

The configuration for the website is split up into site-wide configuration and species configuration. The default configuration files can all be found in the "conf" subdirectory of your server-root.

The site-wide configuration (Apache config, global settings) is stored in conf/SiteDefs.pm.

The species configuration (database names, etc.) is kept in the conf/ini-files directory as a series of "ini" files.

  • There is a generic file called DEFAULTS.ini which holds default values for all the species. (This is simply to save having to make the same setting in multiple species ini files.)
  • There needs to be a species-specific .ini file for each species you want to display on your Ensembl website. (<Genus_species>.ini files e.g. Homo_sapiens.ini). You can override settings in DEFAULTS.ini by redefining the setting in a particular species' ini file.
  • In addition, MULTI.ini contains multi-species settings. It has database connection information for the Compara and Mart databases.
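
The layering described above (species ini settings taking precedence over DEFAULTS.ini) can be sketched as follows. This is only an illustration of the override idea, written in Python with the standard configparser module; Ensembl itself parses these files in Perl, and the file contents and values shown here are placeholders, not real settings.

```python
import configparser

# Placeholder file contents; the values are invented for illustration.
DEFAULTS_INI = """
[general]
ENSEMBL_HOST = mysqlserver1
ENSEMBL_HOST_PORT = 3306
"""

SPECIES_INI = """
[general]
ENSEMBL_HOST_PORT = 3307
"""


def layered_settings(defaults_text, species_text):
    """Return per-section settings where species values override defaults."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case (configparser lowercases by default)
    parser.read_string(defaults_text)  # lowest priority
    parser.read_string(species_text)   # later reads overwrite earlier keys
    return {name: dict(parser[name]) for name in parser.sections()}


settings = layered_settings(DEFAULTS_INI, SPECIES_INI)
print(settings["general"])
```

In this sketch the species file redefines only ENSEMBL_HOST_PORT, so ENSEMBL_HOST is still taken from the defaults.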

To create a new species-specific ini file, copy an existing one and use that as a template. These lines will need to be edited by hand:

  • SPECIES_DESCRIPTION (used for the species descriptions in the table at the end of the FTP download page).
  • ASSEMBLY_ID (used for display on the homepage)
  • ENSEMBL_GOLDEN_PATH (used in the webcode)
  • SPECIES_RELEASE_VERSION is the last number of the database name for the species. It is used to work out the database name, the name of the FTP site directory, etc.
  • Are there gene predictions or not? If not, you can comment out or ignore all the sections for genes.
  • Under [databases], list all the databases configured for this species.
    You can use "%" in place of the release number if it is identical to <ENSEMBL_VERSION> (configured in conf/SiteDefs.pm).
    You can likewise use "%" in place of the species release version if you have set SPECIES_RELEASE_VERSION above.
  • DAS sections are optional

Plugins

Elsewhere in the documentation you will read about how to extend Ensembl by writing your own plugins. A by-product of this extensibility is that you can also use plugins simply to configure your server. For example, your checked-out copy of Ensembl contains a public-plugins directory with a mirror directory inside it. The mirror/conf directory contains DEFAULTS.ini and SiteDefs.pm files which can be used to configure your copy of Ensembl.

Setting up the "Mirror Plugin"

If you installed Ensembl prior to release 31, you will have edited files in the conf directory directly. From version 32 onwards, the site is configured using plugins instead.

In the main conf directory you will find a file called Plugins.pm-dist. Copy this file and name it Plugins.pm (i.e. without the -dist extension). This file sets up the mirror plugin mentioned above.

Now go into the public-plugins/mirror/conf directory and make the changes listed below to SiteDefs.pm and ini-files/DEFAULTS.ini:

public-plugins/mirror/conf/SiteDefs.pm

First copy SiteDefs.pm-dist to SiteDefs.pm in this directory. If you open this file in a text editor, such as vi, you will see that it is a Perl module with a single method, "update_conf", which contains the changes needed to configure your copy of the website. The "update_conf" method consists of a list of variable assignments (of the form $SiteDefs::ENSEMBL_whatever).

Below are the variables you will need to change to match your installation. If you wish, you can change any variables that are defined in the main SiteDefs.pm file.

You should only change the values, that is the parts between single quotes.

General configuration

$SiteDefs::ENSEMBL_SERVERROOT sets the filesystem location of the web server root directory. By default, this is determined dynamically based on the location of the SiteDefs.pm file. You can, however, hard-code this value if you wish. For example, if you installed the Ensembl site in /usr/local/ensembl, then you could change this line of SiteDefs to read:

  $SiteDefs::ENSEMBL_SERVERROOT = '/usr/local/ensembl';
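
As a rough illustration of the dynamic default, the sketch below derives a server root from the location of SiteDefs.pm. It assumes the root is simply the parent of the conf directory, which matches the layout described on this page but is not taken from the Ensembl code; the function name is hypothetical and Python is used only for illustration.

```python
from pathlib import PurePosixPath


def default_server_root(sitedefs_path):
    """Assumed rule: the server root is the parent of the conf directory."""
    return str(PurePosixPath(sitedefs_path).parent.parent)


print(default_server_root("/usr/local/ensembl/conf/SiteDefs.pm"))
```

Hard-coding $SiteDefs::ENSEMBL_SERVERROOT is mainly useful when the file is reached through a path, such as a symlink, from which this kind of derivation would give the wrong directory.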

Configuration of the Apache web server

Change $SiteDefs::ENSEMBL_SERVERNAME to the web name of the server - e.g. "www.yoursite.org". This value is dynamically set to the server hostname by default.

Change $SiteDefs::ENSEMBL_USER and $SiteDefs::ENSEMBL_GROUP to the system user and group you want the Apache web server to run as. Usually, for security, this is a special user (such as "nobody") who has very few permissions.

Mail configuration - error messages

If you want errors to be automatically emailed to you, change $SiteDefs::ENSEMBL_MAIL_ERRORS to the value 1, and change $SiteDefs::ENSEMBL_ERRORS_TO to your email address. If you don't want errors mailed, set $SiteDefs::ENSEMBL_MAIL_ERRORS to 0.

User database - Database and cookie configuration

Set the following variables to the connection details for your web_user database:

  • $SiteDefs::ENSEMBL_USERDB_NAME
  • $SiteDefs::ENSEMBL_USERDB_HOST
  • $SiteDefs::ENSEMBL_USERDB_USER
  • $SiteDefs::ENSEMBL_USERDB_PASS

Remember that this database needs a user with update/insert/delete privileges. Additionally, if you wish to change the encryption keys used to "protect" the cookies, $SiteDefs::ENSEMBL_ENCRYPT_0 should be a six-digit hexadecimal number and $SiteDefs::ENSEMBL_ENCRYPT_1, _2 and _3 should each contain two alphanumeric characters. Unless you are particularly concerned about people changing cookies, the default values will probably do.
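
If you do change the keys, a quick format check before restarting the server can catch typos. The following is a minimal sketch, assuming the four values are handled as plain strings without any prefix; the helper name is hypothetical and the script is not part of Ensembl.

```python
import re

HEX6 = re.compile(r"[0-9A-Fa-f]{6}")   # six hexadecimal digits
ALNUM2 = re.compile(r"[0-9A-Za-z]{2}")  # two alphanumeric characters


def cookie_key_problems(encrypt_0, encrypt_1, encrypt_2, encrypt_3):
    """Return messages for values that do not match the documented shapes."""
    problems = []
    if not HEX6.fullmatch(encrypt_0):
        problems.append("ENSEMBL_ENCRYPT_0 should be six hex digits")
    for index, value in enumerate((encrypt_1, encrypt_2, encrypt_3), start=1):
        if not ALNUM2.fullmatch(value):
            problems.append(
                f"ENSEMBL_ENCRYPT_{index} should be two alphanumeric characters"
            )
    return problems


# Placeholder values, not the shipped defaults.
print(cookie_key_problems("16a3b3", "aB", "3x", "Zz"))
```

An empty list means all four values have the expected shape.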

Configuring the available species

If you wish to use Ensembl 'as-is', with the full list of species found in the release you downloaded, you can skip to the next section.

If you wish to add or remove species, please see the species configuration page.

Temporary Files

There are three temporary file locations that can be configured:

  $SiteDefs::ENSEMBL_TMP_DIR       General storage for temporary files
  $SiteDefs::ENSEMBL_TMP_DIR_IMG   Storage for image files
  $SiteDefs::ENSEMBL_TMP_DIR_BLAST Storage for blast files

The values for these should be set to an appropriate filesystem path.

Some temporary files need to be referenced by URL from web pages. The first two tmp directories above therefore have URL aliases, also configured within SiteDefs.pm. You should not need to edit these.

  $SiteDefs::ENSEMBL_TMP_URL       URL alias for $ENSEMBL_TMP_DIR
  $SiteDefs::ENSEMBL_TMP_URL_IMG   URL alias for $ENSEMBL_TMP_DIR_IMG

There are two further temporary file configuration options available:

  • $SiteDefs::ENSEMBL_TMP_CREATE If set, then at Apache startup the server will attempt to create any temporary directories that have been configured but do not already exist. It also changes their ownership to $SiteDefs::ENSEMBL_USER.$SiteDefs::ENSEMBL_GROUP.
  • $SiteDefs::ENSEMBL_TMP_DELETE If set, then at Apache startup the server will attempt to delete the contents of $SiteDefs::ENSEMBL_TMP_DIR and $SiteDefs::ENSEMBL_TMP_DIR_IMG.
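
The effect of these two options can be pictured with the sketch below. It is an approximation of the documented behaviour written in Python for illustration, not the server's actual start-up code; the function name and the demonstration paths are hypothetical, and the ownership change is left out because it requires root.

```python
import os
import shutil
import tempfile


def prepare_tmp_dirs(all_dirs, purge_dirs, tmp_create=True, tmp_delete=False):
    """Approximate the documented start-up handling of temporary directories."""
    if tmp_create:
        for path in all_dirs:
            # Create only the directories that do not exist yet.
            os.makedirs(path, exist_ok=True)
    if tmp_delete:
        for path in purge_dirs:
            if not os.path.isdir(path):
                continue
            # Remove the contents but keep the directory itself.
            for entry in os.listdir(path):
                full = os.path.join(path, entry)
                if os.path.isdir(full) and not os.path.islink(full):
                    shutil.rmtree(full)
                else:
                    os.remove(full)


# Demonstration inside a throwaway directory (paths are placeholders).
with tempfile.TemporaryDirectory() as base:
    tmp_dir = os.path.join(base, "tmp")
    tmp_img = os.path.join(base, "img", "tmp")
    tmp_blast = os.path.join(base, "blast")
    prepare_tmp_dirs([tmp_dir, tmp_img, tmp_blast], [tmp_dir, tmp_img])
    print(sorted(os.listdir(base)))
```

Note that only the general and image directories are passed as purge targets, mirroring the description of ENSEMBL_TMP_DELETE; the BLAST directory is created but never emptied.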

public-plugins/mirror/conf/ini-files/DEFAULTS.ini
and public-plugins/mirror/conf/ini-files/[Species_name].ini

Again, the first thing to do is copy the DEFAULTS.ini-dist file to DEFAULTS.ini. Then edit the appropriate lines of DEFAULTS.ini.

From version 33 onwards you should be able to re-use the same Plugins.pm, SiteDefs.pm and DEFAULTS.ini file to configure your new Ensembl.

The format of the species-specific .ini files is similar to that of a standard Windows ini file, i.e. split into sections identified by a header in square brackets, each containing key = value entries. When the Apache server is started, these files are parsed and stored in a file called config.packed in the conf directory. This config.packed file is regenerated whenever the server is restarted.

Database configuration

In the [general] section, change the values of ENSEMBL_DBUSER and ENSEMBL_DBPASS to the username and password you want MySQL to be accessed by. Set ENSEMBL_HOST and ENSEMBL_HOST_PORT to be the database server and port that MySQL is running on.

These are default settings. You can override them by adding a section for a particular database. For example, adding a section like the following to the mirror plugin would override the defaults in public-plugins for the ENSEMBL_WEBSITE database:

[ENSEMBL_WEBSITE]
USER=mysqluser2
PASS=helppass
HOST=mysqlserver2
PORT=4444
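
To make the fallback explicit, here is a small sketch of how connection settings could be resolved: a per-database section wins where it exists, otherwise the [general] values apply. It is written in Python for illustration only (Ensembl does this in Perl), the mapping between the short keys and the [general] keys is an assumption based on the names used on this page, and all values are placeholders.

```python
import configparser

INI_TEXT = """
[general]
ENSEMBL_DBUSER = mysqluser
ENSEMBL_DBPASS = secret
ENSEMBL_HOST = mysqlserver1
ENSEMBL_HOST_PORT = 3306

[ENSEMBL_WEBSITE]
USER = mysqluser2
PASS = helppass
HOST = mysqlserver2
PORT = 4444
"""

# Assumed mapping from per-database keys to the [general] keys they fall back to.
FALLBACK_KEYS = {
    "USER": "ENSEMBL_DBUSER",
    "PASS": "ENSEMBL_DBPASS",
    "HOST": "ENSEMBL_HOST",
    "PORT": "ENSEMBL_HOST_PORT",
}


def load_ini(text):
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the upper-case key names
    parser.read_string(text)
    return parser


def connection_for(parser, database):
    """Return USER/PASS/HOST/PORT for a database, using [general] as fallback."""
    general = parser["general"]
    override = parser[database] if parser.has_section(database) else {}
    return {
        key: (override[key] if key in override else general[fallback_key])
        for key, fallback_key in FALLBACK_KEYS.items()
    }


parser = load_ini(INI_TEXT)
print(connection_for(parser, "ENSEMBL_WEBSITE"))  # uses the override section
print(connection_for(parser, "ENSEMBL_DB"))       # falls back to [general]
```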

Names of databases

In the [databases] section, change the values of ENSEMBL_DB, ENSEMBL_OTHERFEATURES, etc. to match the names of the core, OTHERFEATURES, etc. databases you created.

DAS Proxy

If you wish to display external DAS sources on your Ensembl installation and are behind a firewall, you will need to set a value for ENSEMBL_DAS_PROXY. This should probably be set in DEFAULTS.ini, as the proxy is likely to be the same for all species. The value should be your usual web proxy setting, e.g.

ENSEMBL_DAS_PROXY = http://webproxy.mycompany.com:80

Search Links

These are used to provide sample links for the top right-hand corner of dynamic pages, as well as for the links in the site maps. If you would like to include these in any added species, copy the [SEARCH_LINKS] section from another species, remove any views that do not apply to your species (e.g. species with no chromosomes have no link to MapView) and then edit the links to point to representative sequence locations, genes, etc.

If you extend Ensembl by adding new views, don't forget to add sample links to the ini files!

public-plugins/mirror/conf/ini-files/MULTI.ini

Create this file to configure the multi species databases. Configure the [general] section for connection to the MySQL server, as for the species-specific ini files. In the [databases] section, change the values of ENSEMBL_COMPARA and the various _MART entries to match the name of the ensembl_compara and ensembl_mart databases that you created.

In addition, if you chose to install a local GO database, you can configure it here by adding an entry, ENSEMBL_GO = your_go_database_name, to the [databases] section along with ENSEMBL_COMPARA and the BioMart databases.

As in the other species.ini files, you can override the database connection settings for a particular database by adding a section similar to the following:

[ENSEMBL_MART_ENSEMBL]
USER=mysqluser2
PASS=dbpass
HOST=mysqlserver2
PORT=4444

At server start-up, the Mart configuration automatically creates a simplified BioMart configuration file, martRegistry.xml, in the conf directory. See the BioMart install document for how to edit this file to create a more functional BioMart registry that makes use of BioMart's advanced features, such as federated searches.

BLAST

BLAST is not installed by default. You have to configure BLAST yourself for a local installation. See the documentation describing how to configure BLAST.

All relevant fasta files are on ftp://ftp.ensembl.org/pub.

SSAHA

SSAHA2 is not installed by default, as it requires a large amount of memory to hold its lookup tables. (Ensembl currently uses ten 4G machines for its SSAHA servers.)

All relevant fasta files are on ftp://ftp.ensembl.org/pub. These need processing to produce the appropriate files. SSAHA2 servers can be configured for any DNA based data.

SSAHA2 software is at: http://www.sanger.ac.uk/Software/analysis/SSAHA2/.

Each species ini file (public-plugins/mirror/conf/ini-files/<Genus_species>.ini) needs the following:

[SSAHA2_DATASOURCES]
DATASOURCE_TYPE = dna
LATESTGP        = host:port        ; for unmasked dna
CDNA_ALL        = host:port        ; for cDNA data

Note on Ensembl Web Site Search

The Ensembl web site uses the Exalead search engine (http://www.exalead.com). Previously it used AltaVista (http://www.altavista.com/). This software requires a user license and cannot be distributed as part of the Ensembl web code.

By default, local installations of Ensembl use a simple search page called Unisearch that just runs SQL searches against the database. This method is rather slow and crude, however. Unisearch does not do cross-species queries, as these would be too time-consuming for its MySQL queries.

Site Preparation

cd into your server root and run the following commands:

  mkdir logs
  chown -R $ENSEMBL_USER:$ENSEMBL_GROUP .

where $ENSEMBL_USER and $ENSEMBL_GROUP are the web server user and group you configured in SiteDefs.pm.

Your Ensembl site should now be ready to start up.

DISCLAIMER

Please note that the suggested set-up for Apache, mod_perl, MySQL and all Ensembl modules is not intended to provide high-level security. We strongly recommend that you get your systems administrator to audit any system that you provide to others. In particular, note that the perl.startup file in the conf directory may be run as root, so be careful about permissions on this file.

We are not responsible for any damage or loss of data resulting or arising from running the Ensembl website on your own machines.


 

© 2024 University of Basel. This product includes software developed by Ensembl.

                
AGD version 3 based on Ensembl release 40 - Aug 2006