Ensembl/ensembl-hive

View on GitHub
docs/install.html

Summary

Maintainability
Test Coverage
<html>
    <head>
        <title>eHive installation and setup</title>
        <link rel="stylesheet" type="text/css" media="all" href="ehive_doc.css" />
    </head>
<body>

    <center><h1>eHive installation and setup</h1></center>
    <hr width=50% />

    <center><h2>eHive dependencies</h2></center>

    eHive system depends on the following components that you may need to download and install first:
    <ol>
        <li>Perl 5.14 <a href=http://www.perl.org/get.html>or higher</a></li>

        <li>A database engine of your choice. eHive keeps its state in a database, so you will need
            <ol>
                <li>a server installed on the machine where you want to maintain the state of your pipeline(s) and</li>
                <li>clients installed on the machines where the jobs are to be executed.</li>
            </ol>
            At the moment, the following database options are available:
            <ul>
                <li>MySQL 5.1 <a href=http://dev.mysql.com/downloads/>or higher</a>.<br/>
                <b>Warning:</b> eHive is not compatible with MysQL 5.6.20 but is with versions 5.6.16 and 5.6.23. We suggest avoiding the 5.6.[17-22] interval
                </li>
                <li>SQLite 3.6 <a href=http://www.sqlite.org/download.html>or higher</a></li>
                <li>PostgreSQL 9.2 <a href=http://www.postgresql.org/download/>or higher</a></li>
            </ul>
        </li>

        <li>GraphViz visualization package (includes "dot" executable and libraries used by the Perl dependencies).
        <ol>
            <li>Check in your terminal that you have "dot" installed</li>
            <li>If not, visit <a href=http://graphviz.org/>graphviz.org</a> to download it</li>
        </ol>
        </li>

        <li>cpanm -- a handy utility to recursively install Perl dependencies.
        <ol>
            <li>Check in your terminal that you have "cpanm" installed</li>
            <li>If not, visit <a href=https://cpanmin.us>cpanmin.us</a> to download it (just read and follow the instructions in the header of the script)</li>
        </ol>
        </li>

    </ol>

    <hr width=50% />

    <center><h2>Installing eHive code</h2></center>

<h3>Check out the repository by cloning it from GitHub:</h3>

<p>
All eHive pipelines will require the ensembl-hive repository, which can be found on
<a href="https://github.com/Ensembl/ensembl-hive">GitHub</a>. As such it is assumed that <a href="http://git-scm.com/">Git</a> is 
installed on your system, if not follow the instructions <a href="https://help.github.com/articles/set-up-git/">here</a>
</p>

<p>
To download the repository, move to a suitable directory and run the following on the 
command line:
</p>
<pre>
        git clone -b version/2.4 https://github.com/Ensembl/ensembl-hive.git
</pre>

<p>
This will create ensembl-hive directory with all the code and documentation.<br/>
If you cd into the ensembl-hive directory and do an ls you should see something like the
following:
</p>

<pre>
        ls
        Changelog  docs  hive_config.json  modules  README.md  scripts  sql  t
</pre>

The major directories here are:
<dl>
    <dt>modules</dt>
    <dd>This contains all the eHive modules, which are written in Perl</dd>
    <dt>scripts</dt>
    <dd>Has various scripts that are key to initialising, running and debugging the pipeline</dd>
    <dt>sql</dt>
    <dd>Contains sql used to build a standard pipeline database</dd>
</dl>


    <hr width=50% />

    <center><h2>Use cpanm to recursively install the Perl dependencies declared in ensembl-hive/cpanfile</h2></center>

<pre>
    cd ensembl-hive
    cpanm --installdeps .
</pre>

<p>
If installation of either DBD::mysql or DBD::Pg fails, check that the corresponding database system (MySQL or PostgreSQL) was installed correctly.
</p>

    <hr width=50% />

    <center><h2>Optional configuration of the system:</h2></center>

<p>
You may find it convenient (although it is not necessary) to add "ensembl-hive/scripts"
to your <code>$PATH</code> variable to make it easier to run beekeeper.pl and other useful Hive scripts.
</p>

<ul>
<li><i>using bash syntax:</i>
<pre>
        export PATH=$PATH:$ENSEMBL_CVS_ROOT_DIR/ensembl-hive/scripts<i>
                #
                # (for best results, append this line to your ~/.bashrc or ~/.bash_profile configuration file)</i>
</pre></li>

<li><i>using [t]csh syntax:</i>
<pre>
        set path = ( $path ${ENSEMBL_CVS_ROOT_DIR}/ensembl-hive/scripts )<i>
                #
                # (for best results, append this line to your ~/.cshrc or ~/.tcshrc configuration file)</i>
</pre></li>
</ul>

<p>
Also, if you are developing the code and not just running ready pipelines,
you may find it convenient to add "ensembl-hive/modules" to your <code>$PERL5LIB</code> variable.
</p>

<ul>
<li><i>using bash syntax:</i>
<pre>
        export PERL5LIB=${PERL5LIB}:${ENSEMBL_CVS_ROOT_DIR}/ensembl/modules
        export PERL5LIB=${PERL5LIB}:${ENSEMBL_CVS_ROOT_DIR}/ensembl-hive/modules<i>
                #
                # (for best results, append these lines to your ~/.bashrc or ~/.bash_profile configuration file)</i>
</pre></li>

<li><i>using [t]csh syntax:</i>
<pre>
        setenv PERL5LIB  ${PERL5LIB}:${ENSEMBL_CVS_ROOT_DIR}/ensembl/modules
        setenv PERL5LIB  ${PERL5LIB}:${ENSEMBL_CVS_ROOT_DIR}/ensembl-hive/modules<i>
                #
                # (for best results, append these lines to your ~/.cshrc or ~/.tcshrc configuration file)</i>
</pre></li>
</ul>

</body>
</html>