How to set up an AWS machine for assembly

A Trinity assembly needs roughly 0.5 GB of RAM per million read pairs. For instance, to assemble 40 million paired-end reads using Trinity, you'll need a minimum of 20 GB of RAM. BinPacker needs substantially more, possibly as much as 2 GB per million read pairs.
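The rule of thumb above can be turned into a quick check before choosing an instance size. This is a minimal sketch: the helper names `estimate_ram_gb` and `installed_ram_gb` are hypothetical, and the per-million-pair factors (0.5 for Trinity, up to 2 for BinPacker) are the rough figures quoted above, not measured values.

```shell
# Hypothetical helper: estimate RAM (GB) from read pairs (in millions)
# and a GB-per-million-pairs factor (0.5 for Trinity, up to 2 for BinPacker).
estimate_ram_gb() {
    awk -v pairs="$1" -v factor="$2" 'BEGIN { print pairs * factor }'
}

# RAM installed on this machine, in whole GB (read from /proc/meminfo).
installed_ram_gb() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' /proc/meminfo
}

needed=$(estimate_ram_gb 40 0.5)   # 40 million read pairs with Trinity
echo "Estimated need: ${needed} GB; installed: $(installed_ram_gb) GB"
```

Treat the result as a lower bound; actual memory use depends on the data set.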

These instructions work with a standard Ubuntu 16.04 machine available on AWS (for instance, ami-40d28157 or ami-2ef48339). Similar instructions should work on your own workstation, especially if you have sudo privileges.
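If you are unsure which release a machine is running, a check along these lines may help. It is only a sketch: the helper name `os_release_summary` is hypothetical, and it assumes the system provides the usual /etc/os-release file with `ID` and `VERSION_ID` fields.

```shell
# Hypothetical helper: print "ID VERSION_ID" from an os-release style file.
os_release_summary() {
    # Source the file in a subshell so its variables do not leak into the caller.
    ( . "$1" && echo "$ID $VERSION_ID" )
}

if [ -r /etc/os-release ]; then
    summary=$(os_release_summary /etc/os-release)
    if [ "$summary" = "ubuntu 16.04" ]; then
        echo "Running Ubuntu 16.04, as these instructions assume"
    else
        echo "Running '$summary'; package names and versions may differ"
    fi
fi
```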

Update software and install packages with apt-get

sudo apt-get update && sudo apt-get -y upgrade && sudo apt-get -y dist-upgrade

sudo apt-get -y install build-essential git python-pip python-numpy python-matplotlib  

Format and Mount hard drive (if needed)

sudo mkfs -t ext4 /dev/xvdf
sudo mount /dev/xvdf /mnt
sudo chown -R ubuntu:ubuntu /mnt
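Because mkfs erases whatever is on the device, it can be worth confirming that the volume is actually blank before formatting. The sketch below relies on `file -s`, which reports a bare "data" for a device with no recognizable filesystem. The helper name `is_blank_device` is hypothetical, and /dev/xvdf is simply the device name used above; yours may differ.

```shell
# Hypothetical helper: succeed if a `file -s` description indicates a blank device.
is_blank_device() {
    case "$1" in
        *": data") return 0 ;;   # no filesystem signature was detected
        *) return 1 ;;           # something is already on the device
    esac
}

DEVICE=/dev/xvdf   # device name from the steps above; adjust as needed
if [ -b "$DEVICE" ]; then
    description=$(sudo file -s "$DEVICE")
    if is_blank_device "$description"; then
        echo "$DEVICE looks blank; formatting should be safe"
    else
        echo "$DEVICE already has content: $description"
    fi
fi
```

If the device already holds a filesystem, skip the mkfs step and mount it directly.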

Install adapter sequences and a few utility scripts

cd && mkdir share && cd share
curl -LO
chmod +x
curl -LO

Install Perl Module

sudo cpan URI::Escape

Install BioPython

pip install biopython

Install Ruby and LinuxBrew

gpg --import key.asc
\curl -sSL | bash -s stable --ruby
source /home/ubuntu/.rvm/scripts/rvm

sudo mkdir /home/linuxbrew
sudo chown $USER:$USER /home/linuxbrew
git clone /home/linuxbrew/.linuxbrew
echo 'export PATH="/home/linuxbrew/.linuxbrew/bin:$PATH"' >> ~/.profile
echo 'export MANPATH="/home/linuxbrew/.linuxbrew/share/man:$MANPATH"' >> ~/.profile
echo 'export INFOPATH="/home/linuxbrew/.linuxbrew/share/info:$INFOPATH"' >> ~/.profile
source ~/.profile
brew tap homebrew/science
brew update
brew doctor

Install SolexaQA

curl -LO
cd Linux_x64

Install Software (gcc, skewer, seqtk, python, jellyfish, bfc, rcorrector, trinity, LAST, TransDecoder, vsearch, salmon, kallisto, etc.)

brew install gcc skewer seqtk python jellyfish bfc rcorrector hmmer infernal quorum \
trinity --without-express vsearch salmon transdecoder last parallel spades
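After a long brew install it is easy to miss a formula that failed to build. A minimal sketch for checking what ended up on the PATH follows. The helper name `missing_tools` is hypothetical, and the executable names in the list are assumptions about what each formula installs; adjust them to match your system.

```shell
# Hypothetical helper: print each named command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Executable names below are assumptions, not taken from the formulae.
missing=$(missing_tools skewer seqtk jellyfish Trinity vsearch salmon)
if [ -z "$missing" ]; then
    echo "All checked tools are on PATH"
else
    echo "Not found on PATH:" $missing
fi
```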

Install Kallisto

curl -LO
tar -zxf kallisto_linux-v0.43.0.tar.gz

Install BLAST

curl -LO
tar -zxf ncbi-blast-2.5.0+-x64-linux.tar.gz

Install TransFuse

gem install transfuse

curl -LO
tar -zxf transfuse-0.5.0-linux-x86_64.tar.gz

Install Transrate

curl -LO
tar -zxf transrate-1.0.3-linux-x86_64.tar.gz

Install BUSCO

git clone
cd busco
wget && tar -zxf mammalia_odb9.tar.gz
wget && tar -zxf eukaryota_odb9.tar.gz
wget && tar -zxf metazoa_odb9.tar.gz

Install dammit!

gem install crb-blast
pip install -U setuptools
pip install pandas
pip install dammit
sed -i 's/BUSCO_v1.1b1/BUSCO/' /home/linuxbrew/.linuxbrew/lib/python2.7/site-packages/dammit/
sed -i 's/BUSCO_v1.1b1/BUSCO/' /home/linuxbrew/.linuxbrew/lib/python2.7/site-packages/dammit/
sed -i 's_-in_--in_' /home/linuxbrew/.linuxbrew/lib/python2.7/site-packages/dammit/

Add everything to the permanent PATH

echo "export PATH=\"$PATH\"" >> ~/.profile
echo export LD_LIBRARY_PATH=/home/linuxbrew/.linuxbrew/Cellar/salmon/0.7.2/lib >> ~/.profile
echo source /home/ubuntu/.rvm/scripts/rvm >> ~/.profile
source ~/.profile
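Re-running the `echo ... >> ~/.profile` commands above appends the same lines again each time. One way to avoid duplicates is an append-if-absent helper, sketched below. The name `append_once` is hypothetical, and the demonstration writes to a scratch file rather than to the real ~/.profile.

```shell
# Hypothetical helper: append a line to a file only if that exact line is absent.
append_once() {
    line=$1
    file=$2
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: treat the line as a fixed string
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demonstrate on a scratch file; the second call changes nothing.
scratch=$(mktemp)
append_once 'source /home/ubuntu/.rvm/scripts/rvm' "$scratch"
append_once 'source /home/ubuntu/.rvm/scripts/rvm' "$scratch"
wc -l < "$scratch"
rm -f "$scratch"
```

To use it for real, pass ~/.profile as the second argument in place of the scratch file.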