Monday, December 16, 2013

How to increase Xmx for Hadoop client applications

Sometimes you need more memory for Hadoop client tools such as Hive, Beeline or Pig. Set the maximum Java heap size (here 2 GB) before you start the tool:
export HADOOP_CLIENT_OPTS=-Xmx2G
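
If HADOOP_CLIENT_OPTS may already contain other options, a plain export overwrites them. A minimal sketch that appends the heap setting instead (the 2G value is taken from above; the hive call in the comment assumes hive is on your PATH):

```shell
# Keep any options that are already set and append the heap size.
export HADOOP_CLIENT_OPTS="${HADOOP_CLIENT_OPTS:+$HADOOP_CLIENT_OPTS }-Xmx2G"
echo "HADOOP_CLIENT_OPTS is now: $HADOOP_CLIENT_OPTS"

# To raise the limit for a single run only, prefix the command instead:
#   HADOOP_CLIENT_OPTS=-Xmx2G hive
```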

Thursday, December 12, 2013

Find out the total size of directories with the 'du' command

To list the size in megabytes of every file and folder in the current directory, sorted from smallest to largest, run:
du -sm * | sort -n
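
If you only want directories, the glob */ can replace *. A small sketch that tries this in a throwaway directory (the names small and big and the file sizes are made up for the demo):

```shell
# Build a throwaway directory so the sketch is safe to run anywhere.
demo=$(mktemp -d)
mkdir "$demo/small" "$demo/big"
head -c 1048576 /dev/urandom > "$demo/small/file"   # about 1 MB
head -c 3145728 /dev/urandom > "$demo/big/file"     # about 3 MB

# */ matches directories only; sort -n puts the largest entry last.
sorted=$(cd "$demo" && du -sm -- */ | sort -n)
echo "$sorted"

# Print only the name of the last (largest) entry.
largest=$(echo "$sorted" | tail -n 1 | cut -f 2)
echo "largest: $largest"

rm -rf "$demo"
```

In a real directory, the one line that matters is du -sm -- */ | sort -n.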

How to access the web interface of a remote Hadoop cluster over a SOCKS proxy

You want to use the web interface of a Hadoop cluster but you only have SSH access to it? A SOCKS proxy is the solution:

Use ssh to open a SOCKS proxy:

ssh -f -N -D 7070 user@remote-host
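The options: -f sends ssh to the background after authentication, -N tells it not to run a remote command, and -D 7070 opens a SOCKS proxy on local port 7070.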

After that you can configure Firefox to use this proxy:

  1. Go to the manual proxy settings, enter localhost as the SOCKS host with port 7070 and select SOCKS v5
  2. Go to about:config and set network.proxy.socks_remote_dns to true to use DNS resolution over the proxy (thanks to Aczire for this!).
That's all!
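
To check the tunnel from the command line before touching the browser, curl can use the same proxy. A rough sketch, assuming curl is installed and the proxy from above is listening on port 7070; the function name fetch_via_proxy and the URL in the comment are placeholders:

```shell
# Hypothetical helper: fetch a page through the local SOCKS proxy.
# --socks5-hostname lets the proxy resolve the host name, much like
# network.proxy.socks_remote_dns does in Firefox.
fetch_via_proxy() {
    curl --silent --show-error --socks5-hostname localhost:7070 "$1"
}

# Usage (replace host and port with those of your cluster's web interface):
#   fetch_via_proxy http://remote-host:50070/
```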

Wednesday, December 11, 2013

How to rsync a folder over ssh

Another short shell snippet:

rsync -avze ssh source user@host:target-folder
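The options: -a is archive mode (recursive, keeps permissions, timestamps and symlinks), -v is verbose, -z compresses data during the transfer, and -e ssh uses ssh as the remote shell.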
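
Before copying for real, it can help to preview what rsync would transfer. A small sketch using the -n (dry run) option; the function name rsync_preview is made up for this example:

```shell
# Hypothetical helper: list what would be copied without changing anything.
# $1 is the source folder, $2 the destination (for example user@host:target-folder).
rsync_preview() {
    rsync -avzn -e ssh "$1" "$2"
}

# Usage:
#   rsync_preview source user@host:target-folder
```

Note that source copies the folder itself into target-folder, while source/ (with a trailing slash) copies only its contents.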