Big Data Analytics

Course Syllabus

  • Name of the Course: Big Data Analytics
  • LTP structure of the course: 2-1-1
  • Objective of the course: This course covers the concepts, algorithms, applications, and frameworks of big data analytics.
  • Outcome of the course: Students will study big data analytics in detail and be able to apply it to practical problems.
  • Course Plan:

    • Component 1, Unit 1: Introduction to Big Data and its importance, 3 Vs and more, Big data analytics, Big data applications, Hadoop & Hadoop Ecosystem, Moving Data in and out of Hadoop, Inputs and outputs of MapReduce, Hadoop Architecture, HDFS, Common Hadoop Shell commands, NameNode, Secondary NameNode, and DataNode
    • Component 1, Unit 2: Hadoop MapReduce paradigm, Map and Reduce tasks, Job and Task trackers, Algorithms using MapReduce, Examples of MapReduce (Word count problem, Matrix-Vector Multiplication), YARN & ZooKeeper, Hadoop Cluster Setup & Hadoop Configuration, HDFS Administration: Monitoring & Maintenance
    • Component 2, Unit 3: Hive Architecture, Comparison with Traditional Database, HiveQL - Querying Data - Sorting and Aggregating, MapReduce Scripts, Joins & Subqueries, HBase concepts, Advanced Usage, Schema Design & Indexing, Pig, ZooKeeper
    • Component 2, Unit 4: Spark: RDDs in Spark, Data Frames & Spark SQL, Spark Streaming, MongoDB, NoSQL
  • Textbooks:

    • Chris Eaton, Dirk deRoos et al., “Understanding Big Data”, McGraw Hill, 2012.
    • Boris Lublinsky, Kevin T. Smith, Alexey Yakubovich, “Professional Hadoop Solutions”, Wiley, ISBN: 9788126551071, 2015.
    • Tom White, “Hadoop: The Definitive Guide”, O'Reilly, 2012.
    • Jeffrey Aven, “Data Analytics with Spark Using Python”, First Edition, Pearson, November 2018.