tertiarycourses/ApacheHiveTraining

Big Data Analysis with Apache Hive

These are the exercise files for the Big Data Analysis with Apache Hive course.

The course outline can be found at:

https://www.tertiarycourses.com.sg/big-data-analysis-apache-hive.html

https://www.tertiarycourses.com.my/big-data-analysis-with-apache-hive-malaysia.html

Module 1: Get Started on Apache Hive

  • What is Hive?
  • How Hive Works
  • Setup Hive with VirtualBox and CDH

Module 2: Manipulating Data in Hive

  • Data Structures in Hive
  • Creating Tables in Hive
  • Handling CSV files in Hive
  • Partitioning Tables

Module 3: Retrieving Data from Hive

  • Retrieving Data with SELECT
  • Retrieving Data from Complex Structures

Module 4: Aggregating Data with Hive

  • Simple Aggregations
  • Grouping Sets
  • Using CUBE and ROLLUP

Module 5: Filtering Results with Hive

  • Simple filter with WHERE
  • Filtering aggregates with HAVING
  • Finding similar values with LIKE

Module 6: Joining Tables

  • Combining tables with JOIN
  • Where to use SEMI JOIN
  • Joining multiple tables together

Module 7: Manipulating Data

  • Data Manipulation Functions
  • String Functions
  • Math Functions
  • Date Functions
  • Conditional Functions