Feb 22, 2012

Hong Kong to host GNOME.Asia 2012, June 9-15

I am taking on more responsibility for GNOME.Asia 2012.
This year, I have the opportunity to learn how to organize the GNOME.Asia Summit.


It's awesome to work with GNOME people.
I am learning to discuss things on IRC and to keep my partners' different time zones in mind.
We send mail across time zones and also discuss in real time.


I am trying to meet and get to know many friends in GNOME.
I hope to do my best for GNOME.Asia 2012.






^___^








GNOME.Asia 2012 official announcement.
http://www.gnome.org/news/2012/02/hong-kong-to-host-gnome-asia-2012-june-9-15/


Hong Kong to host GNOME.Asia 2012, June 9-15

It is with great pleasure that the GNOME Foundation announces that Hong Kong has been selected as the venue of our upcoming GNOME.Asia 2012. GNOME.Asia 2012 follows the release of GNOME 3.4, helping to bring new desktop paradigms that facilitate user interaction in the computing world. It will be a great place to celebrate and explore the many new features and enhancements to the groundbreaking GNOME 3 release and to help make GNOME as successful as possible.
Hong Kong is well known for being one of the largest cities in Asia, with a thriving cultural scene, solid infrastructure, and robust public transportation system. Many countries have a visa-free period for travel with Hong Kong and the city has well integrated international connections. We believe that hosting the event in Hong
Kong will bring the spotlight on GNOME and make an impact locally, regionally and internationally in terms of business and community building. Aside from being a business capital, Hong Kong is also well known as a tourist destination that is famous for its food, shopping and many attractions.
Potential sites for the conference are the Breakthrough Youth Village Campsite and City University of Hong Kong, and reasonable rates for accommodations have been arranged.
We would like to thank everyone who participated in the GNOME.Asia 2012 bidding process, especially the great work from Team GNOME Hong Kong and Team GNOME Indonesia. We look forward to working with you more in the future!

Feb 19, 2012

Hadoop with openSUSE -- English Version


This document is for openSUSE users who want to use Hadoop.

Environment setup note:

OS: openSUSE 11.2 ( sure, of course ^^)
HD: 80GB

Prepare two PCs for the single-host and cluster practices.
You can set your own IP addresses according to your environment.

Server1:
10.10.x.y    server.digitalairlines.com    server

Server2:
10.10.v.w    server2.digitalairlines.com    server2



Partition
  • swap 1GB
  • /         73.5GB


User Admin
  • User: root  password: linux
  • User: max  password:  linux


Software
  • select Base Development Packages
  • update openSUSE packages
  • install the  java-1_6_0-sun-devel  package (I found OpenJDK had problems ^^||) (you can install it from the update repositories)


Services (Daemons)
  • Start sshd and enable it at boot (see the quick check after the commands)
    • #rcsshd  start
    • #chkconfig  sshd  on
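To confirm that sshd is actually running and listening, you can check the service status (a quick sanity check, not part of the original setup list):
    • #rcsshd  status
It should report that the daemon is running; if not, start it again with rcsshd start.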


Notes for mass deployment (disk cloning)
  • In  /etc/fstab  and  /boot/grub/menu.lst , use plain device names such as  /dev/sda1  instead of  /dev/disk/by-id   -- if you want to clone your hard disk to deploy it!! (see the sketch after this list)
  • Delete  /etc/udev/rules.d/70-persistent-net.rules  for the network interface card (if you don't delete it, the NIC on the cloned machine will be named eth1 instead of eth0)
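As a sketch of what the first point means (the by-id name below is just a made-up example; your own entries and filesystem type will differ), an /etc/fstab line like

/dev/disk/by-id/ata-SAMPLE-DISK-part1   /   ext4   acl,user_xattr   1 1

would become

/dev/sda1   /   ext4   acl,user_xattr   1 1

and the matching  root=  kernel parameter in  /boot/grub/menu.lst  would change from the /dev/disk/by-id/... form to  root=/dev/sda1  in the same way.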



Prepare the software
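If you don't have the pre-packed tarball under  /opt/OSSF , you could fetch Hadoop 0.20.2 yourself from the Apache archive (the exact archive path is an assumption; adjust it if the URL has moved):

>sudo  mkdir  -p  /opt/OSSF
>sudo  wget  -P  /opt/OSSF  http://archive.apache.org/dist/hadoop/core/hadoop-0.20.2/hadoop-0.20.2.tar.gz

The later steps assume the tarball sits at  /opt/OSSF/hadoop-0.20.2.tar.gz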


---------------------------------------- Practice  ------------------------------------------
Hadoop on a single host

At Server1
Please log in as  max  with password  linux
Please note that the shell prompt is shown as  >

Step 1. Create an SSH key to connect over SSH without a password
Use the non-interactive method to create Server1's DSA key pair

>ssh-keygen  -N ''  -d  -q  -f  ~/.ssh/id_dsa

Copy the public key to  authorized_keys
>cp  ~/.ssh/id_dsa.pub   ~/.ssh/authorized_keys

>ssh-add   ~/.ssh/id_dsa
Identity added: /root/.ssh/id_dsa (/root/.ssh/id_dsa)
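If ssh-add complains that it cannot open a connection to your authentication agent (whether an agent is already running depends on how your desktop session was started), you can start one in the current shell first:

>eval  $(ssh-agent)
>ssh-add   ~/.ssh/id_dsa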


Test connecting over SSH without a password -- with the key
>ssh  localhost
The authenticity of host 'localhost (::1)' can't be established.
RSA key fingerprint is 05:22:61:78:05:04:7e:d1:81:67:f2:d5:8a:42:bb:9f.
Are you sure you want to continue connecting (yes/no)? Please input   yes

Log out of SSH
>exit

Step 2.  Install Hadoop
Extract the Hadoop package (we prepared it at  /opt/OSSF) -- please use sudo to do it
(because a regular user has no write permission on the  /opt  folder)

>sudo  tar  zxvf   /opt/OSSF/hadoop-0.20.2.tar.gz   -C   /opt

It will ask for the root password, please input  linux

Change the owner of  /opt/hadoop-0.20.2  to  max,  and the group to  users
> sudo  chown   -R  max:users   /opt/hadoop-0.20.2/

Create  /var/hadoop Folder
> sudo  mkdir   /var/hadoop

Change the owner of  /var/hadoop  to  max, and the group to  users
> sudo  chown  -R  max:users   /var/hadoop/


Step 3.  Set up Hadoop Configuration


3-1. Set up environment with  hadoop-env.sh
>vi   /opt/hadoop-0.20.2/conf/hadoop-env.sh
#Please add these settings (adjust to your environment)
export JAVA_HOME=/usr/lib/jvm/java-1.6.0-sun
export HADOOP_HOME=/opt/hadoop-0.20.2
export HADOOP_CONF_DIR=/opt/hadoop-0.20.2/conf
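After saving the file, you can double-check that JAVA_HOME points at a real JDK from the shell (the exact directory name depends on how the java-1_6_0-sun-devel package is installed on your system, so treat this path as an assumption):

>ls  -d  /usr/lib/jvm/java-1.6.0-sun
>/usr/lib/jvm/java-1.6.0-sun/bin/java  -version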

3-2.  Add the configuration below to  core-site.xml
you can copy and paste it ^^
>vi   /opt/hadoop-0.20.2/conf/core-site.xml




<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/var/hadoop/hadoop-${user.name}</value>
  </property>
</configuration>




3-3. Add the configuration below to  hdfs-site.xml ( set up replication )
you can copy and paste it ^^
>vi   /opt/hadoop-0.20.2/conf/hdfs-site.xml




<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>




3-4. Add the configuration below to  mapred-site.xml ( for the JobTracker )
you can copy and paste it ^^

>vi   /opt/hadoop-0.20.2/conf/mapred-site.xml




<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>




Step 4. Format  HDFS
>/opt/hadoop-0.20.2/bin/hadoop   namenode   -format
10/07/20 00:51:13 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = server/127.0.0.2
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
10/07/20 00:51:13 INFO namenode.FSNamesystem: fsOwner=max,users,video
10/07/20 00:51:13 INFO namenode.FSNamesystem: supergroup=supergroup
10/07/20 00:51:13 INFO namenode.FSNamesystem: isPermissionEnabled=true
10/07/20 00:51:14 INFO common.Storage: Image file of size 93 saved in 0 seconds.
10/07/20 00:51:14 INFO common.Storage: Storage directory /var/hadoop/hadoop-max/dfs/name has been successfully formatted.
10/07/20 00:51:14 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at server/127.0.0.2
************************************************************/

Step 5. Start  hadoop
>/opt/hadoop-0.20.2/bin/start-all.sh
starting namenode, logging to /opt/hadoop-0.20.2/logs/hadoop-max-namenode-server.out
localhost: starting datanode, logging to /opt/hadoop-0.20.2/logs/hadoop-max-datanode-server.out
localhost: starting secondarynamenode, logging to /opt/hadoop-0.20.2/logs/hadoop-max-secondarynamenode-server.out
starting jobtracker, logging to /opt/hadoop-0.20.2/logs/hadoop-max-jobtracker-server.out
localhost: starting tasktracker, logging to /opt/hadoop-0.20.2/logs/hadoop-max-tasktracker-server.out
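If you want a quick command-line check that all five daemons came up, the JDK's jps tool lists the running Java processes (jps ships with the Sun JDK; use the full path if it is not on your PATH):

>/usr/lib/jvm/java-1.6.0-sun/bin/jps

You should see one entry each for NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker (plus Jps itself).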

Step 6. Check Hadoop Status
You can open these pages in a web browser on your computer

Hadoop Map/Reduce Administration ( JobTracker )
http://localhost:50030

Hadoop Task Tracker
http://localhost:50060

Hadoop DFS ( NameNode )
http://localhost:50070
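If you are working on a console without a browser, you can still confirm that the web interfaces answer (a simple reachability check; it only prints the raw HTML):

>wget  -qO-  http://localhost:50070/  | head
>wget  -qO-  http://localhost:50030/  | head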



Lab 2  HDFS command practice

1. Show the hadoop command help
>/opt/hadoop-0.20.2/bin/hadoop   fs

Use the  hadoop  command to list HDFS
( Since we haven't uploaded any files to HDFS yet, it will show an error message )
>/opt/hadoop-0.20.2/bin/hadoop   fs   -ls


2. Upload the  /opt/hadoop-0.20.2/conf  folder to  HDFS  and rename it to  input
The syntax is
#hadoop command                              upload           Local-Dir              HDFS-Folder-Name
>/opt/hadoop-0.20.2/bin/hadoop   fs   -put   /opt/hadoop-0.20.2/conf   input


3. Please check  HDFS  again
3-1 check the HDFS
> /opt/hadoop-0.20.2/bin/hadoop  fs   -ls
Found 1 items
drwxr-xr-x   - max supergroup       0 2010-07-18 21:16 /user/max/input

If you don't specify a path, the default path is   /user/username
You can use an absolute path name too, for example
> /opt/hadoop-0.20.2/bin/hadoop   fs   -ls   /user/max/

Tip: You can check the  /var/hadoop  folder before / after you upload to HDFS
(You can see some changes in that folder on your local host)
>ls  -lh  /var/hadoop/hadoop-max/dfs/data/current/

3-2 List  input  folder on HDFS
>/opt/hadoop-0.20.2/bin/hadoop   fs    -ls   input
Found 13 items
-rw-r--r--   1 max supergroup    3936 2010-07-21 16:00 /user/max/input/capacity-scheduler.xml
-rw-r--r--   1 max supergroup     535 2010-07-21 16:00 /user/max/input/configuration.xsl
-rw-r--r--   1 max supergroup     379 2010-07-21 16:00 /user/max/input/core-site.xml
-rw-r--r--   1 max supergroup    2367 2010-07-21 16:00 /user/max/input/hadoop-env.sh
-rw-r--r--   1 max supergroup    1245 2010-07-21 16:00 /user/max/input/hadoop-metrics.properties
-rw-r--r--   1 max supergroup    4190 2010-07-21 16:00 /user/max/input/hadoop-policy.xml
-rw-r--r--   1 max supergroup     254 2010-07-21 16:00 /user/max/input/hdfs-site.xml
-rw-r--r--   1 max supergroup    2815 2010-07-21 16:00 /user/max/input/log4j.properties
-rw-r--r--   1 max supergroup     270 2010-07-21 16:00 /user/max/input/mapred-site.xml
-rw-r--r--   1 max supergroup      10 2010-07-21 16:00 /user/max/input/masters
-rw-r--r--   1 max supergroup      10 2010-07-21 16:00 /user/max/input/slaves
-rw-r--r--   1 max supergroup    1243 2010-07-21 16:00 /user/max/input/ssl-client.xml.example
-rw-r--r--   1 max supergroup    1195 2010-07-21 16:00 /user/max/input/ssl-server.xml.example

4. Download files from  HDFS  to local
Please check your local folder first
>ls

Use the  “ hadoop  fs  -get ”  command to download it
>/opt/hadoop-0.20.2/bin/hadoop   fs   -get   input    fromHDFS

Please check your local folder again
>ls


5. Use -cat to check the file on HDFS
>/opt/hadoop-0.20.2/bin/hadoop   fs   -cat   input/slaves
localhost

6. Delete files on  HDFS  with  -rm  ( for directories please use  -rmr )
Check the  input  folder's files first; you will see that /user/max/input/slaves exists
> /opt/hadoop-0.20.2/bin/hadoop   fs   -ls   /user/max/input
Found 13 items
-rw-r--r--   1 max supergroup    3936 2010-07-21 16:00 /user/max/input/capacity-scheduler.xml
-rw-r--r--   1 max supergroup     535 2010-07-21 16:00 /user/max/input/configuration.xsl
-rw-r--r--   1 max supergroup     379 2010-07-21 16:00 /user/max/input/core-site.xml
-rw-r--r--   1 max supergroup    2367 2010-07-21 16:00 /user/max/input/hadoop-env.sh
-rw-r--r--   1 max supergroup    1245 2010-07-21 16:00 /user/max/input/hadoop-metrics.properties
-rw-r--r--   1 max supergroup    4190 2010-07-21 16:00 /user/max/input/hadoop-policy.xml
-rw-r--r--   1 max supergroup     254 2010-07-21 16:00 /user/max/input/hdfs-site.xml
-rw-r--r--   1 max supergroup    2815 2010-07-21 16:00 /user/max/input/log4j.properties
-rw-r--r--   1 max supergroup     270 2010-07-21 16:00 /user/max/input/mapred-site.xml
-rw-r--r--   1 max supergroup      10 2010-07-21 16:00 /user/max/input/masters
-rw-r--r--   1 max supergroup      10 2010-07-21 16:00 /user/max/input/slaves
-rw-r--r--   1 max supergroup    1243 2010-07-21 16:00 /user/max/input/ssl-client.xml.example
-rw-r--r--   1 max supergroup    1195 2010-07-21 16:00 /user/max/input/ssl-server.xml.example

Use  hadoop fs -rm  to delete the file named  slaves
>/opt/hadoop-0.20.2/bin/hadoop   fs   -rm   input/slaves
Deleted hdfs://localhost:9000/user/max/input/slaves

Check the  input  folder's files again; you will see that /user/max/input/slaves no longer exists
> /opt/hadoop-0.20.2/bin/hadoop   fs   -ls   /user/max/input
Found 12 items
-rw-r--r--   1 max supergroup    3936 2010-07-22 15:08 /user/max/input/capacity-scheduler.xml
-rw-r--r--   1 max supergroup     535 2010-07-22 15:08 /user/max/input/configuration.xsl
-rw-r--r--   1 max supergroup     379 2010-07-22 15:08 /user/max/input/core-site.xml
-rw-r--r--   1 max supergroup    2367 2010-07-22 15:08 /user/max/input/hadoop-env.sh
-rw-r--r--   1 max supergroup    1245 2010-07-22 15:08 /user/max/input/hadoop-metrics.properties
-rw-r--r--   1 max supergroup    4190 2010-07-22 15:08 /user/max/input/hadoop-policy.xml
-rw-r--r--   1 max supergroup     254 2010-07-22 15:08 /user/max/input/hdfs-site.xml
-rw-r--r--   1 max supergroup    2815 2010-07-22 15:08 /user/max/input/log4j.properties
-rw-r--r--   1 max supergroup     270 2010-07-22 15:08 /user/max/input/mapred-site.xml
-rw-r--r--   1 max supergroup      10 2010-07-22 15:08 /user/max/input/masters
-rw-r--r--   1 max supergroup    1243 2010-07-22 15:08 /user/max/input/ssl-client.xml.example
-rw-r--r--   1 max supergroup    1195 2010-07-22 15:08 /user/max/input/ssl-server.xml.example

Use  hadoop  fs  -rmr  to delete the folder
>/opt/hadoop-0.20.2/bin/hadoop   fs   -rmr   input
Deleted hdfs://localhost:9000/user/max/input



Lab 3  Hadoop example practice

1. grep example

1-1 Upload the  /opt/hadoop-0.20.2/conf  folder to  HDFS  and rename it to  source
The syntax is
#hadoop                                           upload      LocalFolder                       HDFS-Folder-Name
>/opt/hadoop-0.20.2/bin/hadoop   fs   -put   /opt/hadoop-0.20.2/conf           source

1-2 Check that the  source  folder was uploaded successfully

> /opt/hadoop-0.20.2/bin/hadoop   fs   -ls   /user/max/
Found 1 items
drwxr-xr-x   - max supergroup       0 2010-07-23 15:13 /user/max/source


1-3 Use the grep example to search the files in the  source  folder for text matching  'dfs[a-z.]+' , and save the result to  output-1

>/opt/hadoop-0.20.2/bin/hadoop   jar   /opt/hadoop-0.20.2/hadoop-0.20.2-examples.jar grep   source   output-1    'dfs[a-z.]+'

1-4  Check Result
>/opt/hadoop-0.20.2/bin/hadoop   fs   -ls    output-1
Found 2 items
drwxr-xr-x   - max supergroup       0 2010-07-20 00:33 /user/max/output-1/_logs
-rw-r--r--   1 max supergroup      96 2010-07-20 00:33 /user/max/output-1/part-00000

>/opt/hadoop-0.20.2/bin/hadoop  fs   -cat    output-1/part-00000
3    dfs.class
2    dfs.period
1    dfs.file
1    dfs.replication
1    dfs.servers
1    dfsadmin
1    dfsmetrics.log

2. wordcount practice

2-1 Count the words in the  source  folder and save the result to  output-2
>/opt/hadoop-0.20.2/bin/hadoop   jar /opt/hadoop-0.20.2/hadoop-0.20.2-examples.jar wordcount   source   output-2

2-2 Check result
>/opt/hadoop-0.20.2/bin/hadoop   fs   -ls    output-2
Found 2 items
drwxr-xr-x   - max supergroup       0 2010-07-20 02:00 /user/max/output-2/_logs
-rw-r--r--   1 max supergroup   10886 2010-07-20 02:01 /user/max/output-2/part-r-00000

Display the result with -cat
>/opt/hadoop-0.20.2/bin/hadoop   fs  -cat   output-2/part-r-00000
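If you want to see the most frequent words at a glance, you can pipe the output through a local shell sort (just an extra convenience on top of the lab, not part of Hadoop itself):

>/opt/hadoop-0.20.2/bin/hadoop   fs  -cat   output-2/part-r-00000  |  sort  -k2  -nr  |  head  -10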





Lab 4  Hadoop Cluster  

-- Please do it on  Server 2 ---
Log in as  max  with password  linux
1. Prepare the folders

>sudo  mkdir   /opt/hadoop-0.20.2
It will ask for the root password, please input  linux

>sudo  mkdir   /var/hadoop
>sudo  chown   -R  max:users   /opt/hadoop-0.20.2/
>sudo  chown   -R  max:users   /var/hadoop

Set up name resolution ( it's very important )
>sudo   vi   /etc/hosts
Please comment out server2's name resolution entry for  127.0.0.2
#127.0.0.2    server2.digitalairlines.com    server2
Please add server1's and server2's IP addresses ( depending on your environment )
10.10.x.y    server.digitalairlines.com    server
10.10.v.w    server2.digitalairlines.com    server2


-----------------------------------------------------------------------------------------------------------------------

-- Please do it on  Server 1 ---

1-1 stop hadoop
>/opt/hadoop-0.20.2/bin/stop-all.sh

1-2 Delete the old Hadoop data
>rm  -rf   /var/hadoop/*

1-3 Modify the NameNode configuration
>vi   /opt/hadoop-0.20.2/conf/core-site.xml
Please fix
                            hdfs://localhost:9000
To  server1  IP
                            hdfs://Srv1’s ip:9000

***You can use  ">ip address show"  or  "/sbin/ifconfig"  to display the IP address***

1-4  Modify  HDFS replication setting
>vi   /opt/hadoop-0.20.2/conf/hdfs-site.xml
Please fix
            1
to
            2

1-5  Modify the JobTracker setting
>vi  /opt/hadoop-0.20.2/conf/mapred-site.xml
Please fix
localhost:9001
to
Srv1’s ip:9001

1-6  Set up  slaves (the hosts listed in  slaves  will take the  datanode  and  tasktracker  roles)
>vi  /opt/hadoop-0.20.2/conf/slaves
Please delete  localhost
Please add  Srv1’s ip
Please add  Srv2’s ip

***The IP addresses might look like 10.10.x.y ***

1-7 Set up name resolution
>sudo   vi   /etc/hosts
Please comment out server1's name resolution entry for  127.0.0.2
#127.0.0.2    server.digitalairlines.com    server
Please add server1's and server2's IP addresses for name resolution
10.10.x.y    server.digitalairlines.com    server
10.10.v.w    server2.digitalairlines.com    server2

1-8 Modify  ssh configuration
>sudo   vi   /etc/ssh/ssh_config
Uncomment  StrictHostKeyChecking  and change it to  no
# StrictHostKeyChecking ask

StrictHostKeyChecking  no

1-9 Copy  SSH Key to another Node

>scp   -r   ~/.ssh   Srv2-IP:~/
Warning: Permanently added '10.10.v.w' (RSA) to the list of known hosts.
Password: Please input  max password

Test connecting over SSH without a password -- with the key
Connect to  server1
>ssh    Srv1’s IP
>exit
Connect to  server2
>ssh    Srv2’s IP
>exit

1-10 Copy  hadoop to Server 2
>scp   -r   /opt/hadoop-0.20.2/*    Srv2-IP:/opt/hadoop-0.20.2/

1-11 Format HDFS
>/opt/hadoop-0.20.2/bin/hadoop   namenode   -format


1-12 Start  DFS ( it will use  /opt/hadoop-0.20.2/conf/slaves  to start the  datanodes )
>/opt/hadoop-0.20.2/bin/start-dfs.sh
Please check that there are 2 datanodes
starting namenode, logging to /opt/hadoop-0.20.2/logs/hadoop-max-namenode-linux-7tce.out
10.10.x.y: starting datanode, logging to /opt/hadoop-0.20.2/logs/hadoop-max-datanode-linux-7tce.out
10.10.v.w: starting datanode, logging to /opt/hadoop-0.20.2/logs/hadoop-max-datanode-server2.out
localhost: starting secondarynamenode, logging to /opt/hadoop-0.20.2/logs/hadoop-max-secondarynamenode-linux-7tce.out

Please Check  “ http://Srv1’s IP:50070/ ”
Please check  “Live Nodes” -- It should be  2 
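Besides the web page, you can also ask HDFS itself how many datanodes registered (dfsadmin -report is part of the same hadoop command used throughout this lab):

>/opt/hadoop-0.20.2/bin/hadoop   dfsadmin   -report

The report should list 2 datanodes available.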

1-13 Start JobTracker 
>/opt/hadoop-0.20.2/bin/start-mapred.sh

Please Check  “ http://Srv1’s IP:50030/ ”
Please Check  “Nodes” -- It should be  2 

Now you can run the example programs from Lab 3 again to verify that the cluster works (see the sketch below).
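For instance, you could repeat the wordcount practice on the cluster. HDFS was reformatted in step 1-11, so the conf folder has to be uploaded again first (same commands as in Lab 3, just run against the new cluster):

>/opt/hadoop-0.20.2/bin/hadoop   fs   -put   /opt/hadoop-0.20.2/conf   source
>/opt/hadoop-0.20.2/bin/hadoop   jar   /opt/hadoop-0.20.2/hadoop-0.20.2-examples.jar   wordcount   source   output-2
>/opt/hadoop-0.20.2/bin/hadoop   fs   -cat   output-2/part-r-00000  |  head

While the job runs, the JobTracker page on port 50030 should show tasks being assigned to both nodes.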