Big Data on a Shoestring

Authors: Nicholas Bessmer
installation, use SFTP plugin:
     

     
    It lets you see your local computer files and your remote EC2 instance. Now download the following to your computer:
     
    http://www.sai.msu.su/apache/hadoop/core/stable/
     
    for the latest stable version of HADOOP and download CASSANDRA:
     
    http://cassandra.apache.org/download/
     
    “PIG” is a query language designed for Big Data. We will use it to query our Big Data dataset.
     
    http://www.sai.msu.su/apache/pig
     
    Now copy these over to your new EC2 Linux Server:
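    If you prefer the command line to the SFTP plugin, scp can do the same copy. The following is only a sketch: the key file my-key.pem and the host name ec2-host.example.com are placeholders I made up, and the ec2-user login is assumed from the /home/ec2-user paths used later on. Setting DRY_RUN=1 prints the command instead of running it, so you can check it first.

```shell
#!/bin/sh
# copy_to_ec2 KEY HOST FILE...
# Copies the given files to the home directory of ec2-user on HOST.
# KEY and HOST are placeholders; substitute your own key pair and instance.
copy_to_ec2() {
    key=$1
    host=$2
    shift 2
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Dry run: only show the command that would be executed.
        echo scp -i "$key" "$@" "ec2-user@$host:~/"
    else
        scp -i "$key" "$@" "ec2-user@$host:~/"
    fi
}

# Example (hypothetical key and host):
#   copy_to_ec2 my-key.pem ec2-host.example.com \
#       hadoop-0.20.2.tar.gz apache-cassandra-1.2.1-bin.tar.gz pig-0.10.1.tar.gz
```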
     

     
    Once the files have been copied, copy and paste the following commands:
     
    tar -xvf hadoop-0.20.2.tar.gz
    tar -xvf apache-cassandra-1.2.1-bin.tar.gz
    tar -xvf pig-0.10.1.tar.gz
    Please also be sure to unpack the Pig tutorial inside the Pig directory by typing these commands:
     
    »         cd pig-0.10.1 (cd changes directory)
    »         tar -xvf tutorial.tar (you can also use the gunzip utility)
     
    This extracts the files, which are bundled together much like a ZIP file.
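    The extraction step can also be wrapped in a small helper that checks each archive is present before unpacking it. This is a sketch, assuming the three archives sit in one directory under the file names used above; the function name extract_archives is my own.

```shell
#!/bin/sh
# extract_archives DIR
# Unpacks each expected archive found in DIR and reports any that are missing.
# Returns the number of missing archives (0 means all three were unpacked).
extract_archives() {
    dir=$1
    missing=0
    for archive in hadoop-0.20.2.tar.gz \
                   apache-cassandra-1.2.1-bin.tar.gz \
                   pig-0.10.1.tar.gz; do
        if [ -f "$dir/$archive" ]; then
            # -x extract, -z gunzip, -f archive file, -C directory to unpack into
            tar -xzf "$dir/$archive" -C "$dir"
        else
            echo "missing: $archive" >&2
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

# Example: extract_archives "$HOME"
```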
     
    It is possible to choose MS Windows Server as your preferred EC2 server. We installed Linux here because it is cheaper to run than Windows Server, so files are edited with the vi editor, which is a bit harder to use. Look up vi on the Internet: it is like a very powerful Windows Notepad, but it is driven from the command line.
     

Getting The Linux Environment Set Up – Basic Steps
     
    Type the following:
     
    »         cd   (changes to your home directory)
    »         vi .bash_profile (vi is the editor and you will be modifying a simple text configuration file); please see this helpful link from the University of California, San Diego
     
    http://acms.ucsd.edu/info/vi_tutorial.html
     
    »         copy and paste the following into your file
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/bin:/home/ec2-user/hadoop-0.20.2/bin:/home/ec2-user/pig-0.10.1/bin

export PATH
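
    After saving the file, the new PATH only takes effect once the profile is re-read (or you log in again). Here is a minimal sketch for checking that the two directories were picked up; the helper name path_contains is my own, not part of any tool.

```shell
#!/bin/sh
# path_contains DIR
# Succeeds if DIR is one of the colon-separated entries in $PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example, run on the EC2 server after editing the file:
#   . ~/.bash_profile
#   path_contains /home/ec2-user/hadoop-0.20.2/bin && echo "hadoop is on the PATH"
#   path_contains /home/ec2-user/pig-0.10.1/bin    && echo "pig is on the PATH"
```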

Editing Our Hadoop Configuration Files
     
    Next, following up on step #1 above (downloading the Hadoop TAR file), we need to edit the following files and run these commands:
     
Edit conf/core-site.xml (inside the hadoop-0.20.2 directory). I have used localhost in the value of fs.default.name [1]
     
           <property>
             <name>fs.default.name</name>
             <value>hdfs://localhost:9000</value>
           </property>
     
     
Edit conf/mapred-site.xml.
     
           <property>
             <name>mapred.job.tracker</name>
             <value>localhost:9001</value>
           </property>
     
     
Edit conf/hdfs-site.xml. Since this test cluster has a single node, the replication factor should be set to 1.
     
    dfs.replication = 1
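
    The three edits above can also be scripted rather than typed into vi. The sketch below writes each file as a complete Hadoop configuration document with a single property in it. The function names are my own, and note that it overwrites whatever is already in those files, so only use it on a fresh install.

```shell
#!/bin/sh
# write_property_file FILE NAME VALUE
# Writes a Hadoop XML configuration file containing one property.
# Any existing content of FILE is replaced.
write_property_file() {
    cat > "$1" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>$2</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

# write_hadoop_conf CONF_DIR
# Writes the single-node settings described above into CONF_DIR.
write_hadoop_conf() {
    conf_dir=$1
    write_property_file "$conf_dir/core-site.xml"   fs.default.name    hdfs://localhost:9000
    write_property_file "$conf_dir/mapred-site.xml" mapred.job.tracker localhost:9001
    write_property_file "$conf_dir/hdfs-site.xml"   dfs.replication    1
}

# Example: write_hadoop_conf /home/ec2-user/hadoop-0.20.2/conf
```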
     
Format the name node (once per install).
     
    $ bin/hadoop namenode -format
     
    It should print out something like the following message:
     
    12/07/15 15:54:20 INFO namenode.NameNode: STARTUP_MSG:
    /************************************************************
    STARTUP_MSG: Starting NameNode
    STARTUP_MSG:   host = Shamim-2.local/192.168.0.103
    STARTUP_MSG:   args = [-format]
    STARTUP_MSG:   version = 0.20.2
    STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
    ************************************************************/
    12/07/15 15:54:21 INFO namenode.FSNamesystem: fsOwner=samim,staff,com.apple.sharepoint.group.1,everyone,_appstore,localaccounts,_appserverusr,admin,_appserveradm,_lpadmin,_lpoperator,_developer,com.apple.access_screensharing
    12/07/15 15:54:21 INFO namenode.FSNamesystem: supergroup=supergroup
    12/07/15 15:54:21 INFO namenode.FSNamesystem: isPermissionEnabled=true
    12/07/15 15:54:21 INFO common.Storage: Image file of size 95 saved in 0 seconds.
    12/07/15 15:54:21 INFO common.Storage: Storage directory
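
    Because the format step should only happen once per install, it can help to guard it. The sketch below assumes that a formatted name directory contains a current/VERSION file and that, with no dfs.name.dir configured, the directory defaults to /tmp/hadoop-ec2-user/dfs/name for the ec2-user login. Please treat both as assumptions to verify on your own server; the function names are my own.

```shell
#!/bin/sh
# namenode_formatted NAME_DIR
# Succeeds if NAME_DIR already looks like a formatted name directory.
namenode_formatted() {
    [ -f "$1/current/VERSION" ]
}

# format_once HADOOP_HOME NAME_DIR
# Formats the name node only if NAME_DIR has not been formatted yet.
format_once() {
    if namenode_formatted "$2"; then
        echo "already formatted: $2"
    else
        "$1/bin/hadoop" namenode -format
    fi
}

# Example (the name directory path is an assumed default):
#   format_once /home/ec2-user/hadoop-0.20.2 /tmp/hadoop-ec2-user/dfs/name
```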
