How and where to practise Big Data and Apache Spark Programs?

Platforms to practise Big Data and Apache Spark Programs

When you start learning Big Data, the most obvious problem is where to practise and run your code. You need a cluster or a single-node/pseudo-distributed installation for hands-on work. As a developer, you would rather spend time practising than installing software, and no one will ask you for the installation steps unless you are a big data admin.
Installing big data frameworks on a personal laptop is not always feasible, as it needs a high-end laptop to run your code smoothly and quickly. Moreover, installation and configuration are not things developers usually like to do.



I am going to list some platforms where you can practise all big data technologies.


VirtualBox Image (Free)

Installing VirtualBox and using an image is the best free way to practise. You can install Oracle VM VirtualBox (free) or VMware and download the Cloudera Quickstart VM image, but you need at least 8 GB of RAM and a 64-bit processor to use it.
The steps are:
1.    Download Oracle VM VirtualBox from https://www.virtualbox.org/wiki/Downloads
2.    Download the Cloudera “Quickstart VM” image (VirtualBox image) and unzip it.
3.    Open Oracle VM VirtualBox, click New, fill in the details, then click Next.


If you have 8 GB of RAM, allocate about 5-6 GB to the VM. Allocating more or less than this may cause your laptop to hang frequently.
Once done, you can start your VM and practise. The Cloudera Quickstart image has all the required tools installed on it. You can check with Cloudera for detailed information about the tools and versions installed.
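The hardware requirements above (at least 8 GB of RAM and a 64-bit processor) can be checked before downloading the image. The following is only a minimal sketch: the function names are made up for illustration, the 64-bit test inspects the Python build rather than the CPU itself, and RAM detection relies on `os.sysconf`, which is assumed to be available (it is not on Windows).

```python
import os
import struct

MIN_HOST_RAM_GB = 8  # minimum host RAM stated in this article


def is_64bit_python():
    """Return True if this Python build is 64-bit (a rough proxy for a 64-bit machine)."""
    return struct.calcsize("P") * 8 == 64


def detect_ram_gb():
    """Return physical RAM in GB, or None where os.sysconf is unavailable (e.g. Windows)."""
    try:
        total_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        return None
    return total_bytes / 1024 ** 3


def check_host(ram_gb, is_64bit):
    """Return a list of unmet requirements; an empty list means the host qualifies."""
    problems = []
    if ram_gb < MIN_HOST_RAM_GB:
        problems.append(
            "needs at least %d GB of RAM, found %s GB" % (MIN_HOST_RAM_GB, ram_gb)
        )
    if not is_64bit:
        problems.append("needs a 64-bit processor")
    return problems


if __name__ == "__main__":
    ram = detect_ram_gb()
    if ram is None:
        print("Could not detect RAM on this platform; check it manually.")
    else:
        # Reported memory is usually a little below the nominal size, so round it.
        issues = check_host(round(ram), is_64bit_python())
        print(issues or "Host meets the stated requirements.")
```

If the script reports a problem, one of the hosted labs listed in this article may be a better fit than a local VM.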


CloudALabs

CloudALabs is a cloud-hosted lab for practising big data. It is good and very cheap; all you need is a browser to access the cluster. A monthly subscription costs no more than Rs. 500 ($7-8). You can also get a yearly subscription at a cheaper rate.

They also provide a free subscription for 3 days, so if you are not satisfied you can opt out without paying a penny.

If you need a subscription you can get in touch with me by writing to me at discussbigdata@gmail.com or WhatsApp me on 7022553011.


BigData Labs

BigData Labs will provide you almost the same features as CloudALabs. BigData Labs is a little costlier than CloudALabs; a monthly subscription will cost you around $14-15.
They also have a free forum where you can discuss your queries.


Databricks (Free)

You can also go for the free version of the Databricks platform. You just need to create a free account on https://community.cloud.databricks.com
However, here you can only do basic hands-on work, as it has limited features.
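Whichever platform you pick, a small word count is a quick way to confirm that Spark is working. The sketch below assumes you are in a PySpark shell or a Databricks notebook where a SparkContext named `sc` is already defined; `tokenize`, `word_count` and `SAMPLE_LINES` are illustrative names, not part of any platform.

```python
# Hypothetical sample data, small enough to verify the result by eye.
SAMPLE_LINES = ["big data", "apache spark", "big data with spark"]


def tokenize(line):
    """Split a line into lower-case words on whitespace."""
    return line.lower().split()


def word_count(sc, lines):
    """Count word occurrences with the RDD API of the given SparkContext."""
    counts = (
        sc.parallelize(lines)             # distribute the lines as an RDD
        .flatMap(tokenize)                # one record per word
        .map(lambda word: (word, 1))      # pair each word with a count of 1
        .reduceByKey(lambda a, b: a + b)  # add up the counts per word
    )
    return dict(counts.collect())         # bring the small result to the driver
```

In the shell or notebook, calling `word_count(sc, SAMPLE_LINES)` should return a dictionary in which `big`, `data` and `spark` each map to 2. If that works, the environment is ready for larger exercises.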

So the best option is to install a VM and use the Quickstart VM. If you face issues with the installation or your laptop configuration, or if you don’t want to do it on your laptop, then go for CloudALabs. CloudALabs is very cheap and gives you all the features you need, including Cloudera Manager, Spark UI access, Hadoop UI access, etc. It’s worth spending a few bucks to practise Big Data.


Author - Sanjeev Krishna


To learn more on Spark click here.
Let us know your views or feedback on Facebook or Twitter @BigdataDiscuss.

2 comments:

  1. Hi Buddy ... You can add CloudxLab and Cognitive Class.
    CloudxLab charges very minimal fees (it also has a 7-day trial period with full access to all services) and provides a stable web console (better than the ITVersity lab).
    Cognitive Class, on the other hand, is completely free and provides a good environment to practise Spark.

  2. Thanks Himanshu, I will add them soon.

