Big Data Frameworks

  • neovijayk
  • Jul 6, 2020
  • 1 min read

Big Data

  1. Difference between structured, unstructured, and semi-structured data (coming soon)

  2. What is Big Data?

  3. What is Internet of Things?

Hadoop Framework:

  1. Distributed systems and parallel computing.

  2. About the Hadoop framework. Hadoop 1 vs. Hadoop 2. Good sources for getting started with Hadoop. (coming soon)

  3. What is the Hadoop Distributed File System (HDFS)?

  4. Hive, Pig

Spark Framework:

  1. Cheatsheet for PySpark basics


Cloudera and Hortonworks:

  1. Introduction to Cloudera and Hortonworks. (coming soon)

Google BigQuery:

  1. About Google BigQuery (BQ). Steps to create a dataset and a table in BQ

  2. Basic BQ queries, and creating tables and datasets using Python. (coming soon)

  3. Exporting data from BQ to Google Cloud Storage using the Google API in Python. (coming soon)

  4. Good sources and courses for getting started with BQ. (coming soon)

NoSQL:

  1. HBase, MongoDB (coming soon)

Data Ingestion

Sqoop

  1. Introduction to Sqoop (coming soon)

  2. Simple example implementation using Sqoop (coming soon)

Talend Open Studio

  1. Introduction to Talend Open Studio

  2. Example data ingestion using Talend (coming soon)

Apache Kafka

Some interesting articles I want to share:

  1. Massively Parallel Computations using DataProc. (Dataproc is Google Cloud’s managed Apache Hadoop service)

© 2023 by my-learnings. Copyright Vijay@my-learnings.com
