Batch
Date: Apr 12th & 13th @4:30PM
Faculty: Mr. N. Vijay Sunder Sagar (20+ Yrs Of Exp,..)
Duration: 16 Weekends Batch
Venue:
DURGA SOFTWARE SOLUTIONS,
Flat No : 202,
2nd Floor,
HUDA Maitrivanam,
Ameerpet, Hyderabad - 500038
Ph. No: +91 - 9246212143, 80 96 96 96 96
Syllabus:
BIG DATA HADOOP
I: INTRODUCTION
- What is Big Data?
- What is Hadoop?
- Need of Hadoop
- Sources and Types of Data
- Comparison with Other Technologies
- Challenges with Big Data
- i. Storage
- ii. Processing
- RDBMS vs Hadoop
- Advantages of Hadoop
- Hadoop Ecosystem components
II: HDFS (Hadoop Distributed File System)
- Features of HDFS
- Name node, Data node, Blocks
- Configuring Block size
- HDFS Architecture (5 Daemons)
- i. Name Node
- ii. Data Node
- iii. Secondary Name node
- iv. Job Tracker
- v. Task Tracker
- Metadata management
- Storage and processing
- Replication in Hadoop
- Configuring Custom Replication
- Fault Tolerance in Hadoop
- HDFS Commands
III: MAP REDUCE
- Map Reduce Architecture
- Processing Daemons of Hadoop
- Job Tracker (Roles and Responsibilities)
- Task Tracker (Roles and Responsibilities)
- Phases of Map Reduce
- i) Mapper phase
- ii) Reducer phase
- Input split
- Input split vs Block size
- Partitioner in Map Reduce
- Groupings and Aggregations
- Data Types in Map Reduce
- Map Reduce Programming Model
- Driver Code
- Mapper Code
- Reducer Code
- Programming examples
- File input formats
- File output formats
- Merging in Map Reduce
- Speculative Execution Model
- Speculative Job
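
The mapper, reducer and driver roles listed above can be pictured with a minimal plain-Python sketch of a word count. This is only an illustration of the programming model, not Hadoop's API: the function names (`mapper`, `shuffle`, `reducer`, `word_count`) are assumptions made for this example, and a real job would be written against the Hadoop framework.

```python
from collections import defaultdict

def mapper(line):
    """Mapper phase: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group values by key, roughly what the framework does between the two phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reducer(key, values):
    """Reducer phase: aggregate all values collected for one key."""
    return key, sum(values)

def word_count(lines):
    """Driver (hypothetical): wire the mapper, shuffle and reducer together."""
    mapped = [pair for line in lines for pair in mapper(line)]
    return dict(reducer(key, values) for key, values in shuffle(mapped).items())

counts = word_count(["big data hadoop", "hadoop map reduce", "big data"])
print(counts)
```

Here each input line stands in for one record of an input split; in a cluster the mapper calls would run in parallel on different nodes.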
IV: SQOOP (SQL + HADOOP)
- Introduction to Sqoop
- SQOOP Import
- SQOOP Export
- Importing Data From RDBMS to HDFS
- Importing Data From RDBMS to HIVE
- Importing Data From RDBMS to HBASE
- Exporting From HBASE to RDBMS
- Exporting From HIVE to RDBMS
- Exporting From HDFS to RDBMS
- Transformations While Importing / Exporting
- Filtering data while importing
- Vertical and Horizontal merging while import
- Working with delimiters while importing
- Groupings and Aggregations while import
- Incremental import
- Examples and operations
- Defining SQOOP Jobs
V: YARN
- Introduction
- Speculative Execution, Speculative Job and Speculative Task
- Comparison of Hadoop1.xx with Hadoop2.xx
- Comparison with previous versions
- YARN Architecture Components
- i. Resource Manager
- ii. Application Master
- iii. Node Manager
- iv. Application Manager
- v. Resource Scheduler
- vi. Job History Server
- vii. Container
VI: NOSQL
- What is “Not only SQL”
- NOSQL Advantages
- What is the problem with RDBMS for Large Data Scaling Systems
- Types of NOSQL & Purposes
- Key Value Store
- Columnar Store
- Document Store
- Graph Store
- Introduction to Cassandra – NOSQL Database
- Introduction to MongoDB and CouchDB Database
- Integration of NOSQL Databases with Hadoop
VII: HBASE
- Introduction to Bigtable
- What is NOSQL and columnar store Database
- HBASE Introduction
- Hbase use cases
- Hbase basics
- Column families
- Scans
- Hbase Architecture
- Map Reduce Over Hbase
- Hbase data Modeling
- Hbase Schema design
- Hbase CRUD operations
- Hive & Hbase integration
- Hbase storage handlers
VIII: HIVE
- Introduction
- Hive Architecture
- Hive Metastore
- Hive Query Language
- Difference between HQL and SQL
- Hive Built in Functions
- Loading Data From Local Files To Hive Tables
- Loading Data From Hdfs Files To Hive Tables
- Table Types
- Internal Tables
- External Tables
- Hive Working with unstructured data
- Hive Working With Xml Data
- Hive Working With Json Data
- Hive Working With Urls And Weblog Data
- Hive Unions
- Hive Joins
- Multi Table / File Inserts
- Inserting Into Local Files
- Inserting Into Hdfs Files
- Hive UDF (user defined functions)
- Hive UDAF (user defined Aggregate functions)
- Hive UDTF (user defined table Generating functions)
- Partitioned Tables
- Non-Partitioned Tables
- Multi-column Partitioning
- Dynamic Partitions In Hive
- Performance Tuning mechanism
- Bucketing in hive
- Indexing in Hive
- Hive Examples
- Hive & Hbase Integration
PYSPARK
I ) PYSPARK INTRODUCTION
- What is Apache Spark?
- Why PySpark?
- Need for PySpark
- Spark: Python vs Scala
- PySpark features
- Real-life usage of PySpark
- PySpark Web/Application
- PySpark - SparkSession
- PySpark – SparkContext
- PySpark – RDD
- PySpark – Parallelize
- PySpark – repartition() vs coalesce()
- PySpark – Broadcast Variables
- PySpark – Accumulator
II) PYSPARK - RDD COMPUTATION
- Operations on a RDD
- Directed Acyclic Graph (DAG)
- RDD Actions and Transformations
- RDD computation
- Steps in RDD computation
- RDD persistence
- Persistence features
II) PERSISTENCE Options:
- 1) MEMORY_ONLY
- 2) MEMORY_ONLY_SER
- 3) DISK_ONLY
- 4) DISK_SER_ONLY
- 5) MEMORY_AND_DISK
III) PYSPARK - CORE COMPUTING
- Fault Tolerance model in Spark
- Different ways of creating a RDD
- Word Count Example
- Creating Spark objects (RDDs) from Scala objects (lists)
- Increasing the number of partitions
- Aggregations Over Structured Data:
- reduceByKey()
IV) GROUPINGS AND AGGREGATIONS
- i) Single Grouping and Single Aggregation
- ii) Single Grouping and multiple Aggregation
- iii) multi Grouping and Single Aggregation
- iv) Multi Grouping and Multi Aggregation
- Differences b/w reduceByKey() and groupByKey()
- Process of groupByKey
- Process of reduceByKey
- reduce() function
- Various Transformations
- Various Built-in Functions
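
The difference between reduceByKey() and groupByKey() listed above can be pictured with a plain-Python simulation. This sketch does not use Spark: partitions are modelled as lists of (key, value) pairs, and the helper names (`group_by_key`, `reduce_by_key`) are assumptions made for illustration. It shows the commonly described behaviour that reduceByKey() combines values inside each partition before the shuffle, while groupByKey() ships every pair.

```python
from collections import defaultdict

def group_by_key(partitions):
    """groupByKey-style: every (key, value) pair crosses the shuffle, then values are grouped."""
    shuffled = [pair for part in partitions for pair in part]
    grouped = defaultdict(list)
    for key, value in shuffled:
        grouped[key].append(value)
    return dict(grouped), len(shuffled)

def reduce_by_key(partitions, func):
    """reduceByKey-style: combine inside each partition first, then merge the partial results."""
    partials = []
    for part in partitions:
        local = {}
        for key, value in part:
            local[key] = func(local[key], value) if key in local else value
        partials.extend(local.items())
    merged = {}
    for key, value in partials:
        merged[key] = func(merged[key], value) if key in merged else value
    return merged, len(partials)

# Two simulated partitions of (key, value) pairs.
partitions = [[("a", 1), ("b", 1), ("a", 1)], [("a", 1), ("b", 1), ("b", 1)]]

grouped, shuffled_by_group = group_by_key(partitions)
sums_from_groups = {key: sum(values) for key, values in grouped.items()}
sums, shuffled_by_reduce = reduce_by_key(partitions, lambda x, y: x + y)

print(sums_from_groups, shuffled_by_group)
print(sums, shuffled_by_reduce)
```

Both approaches give the same totals; the second count is smaller because fewer records are moved between the simulated partitions.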
V) Various Actions and Transformations:
- countByKey()
- countByValue()
- sortByKey()
- zip()
- union()
- distinct()
- Various count aggregation
- Joins
- -inner join
- -outer join
- cartesian()
- cogroup()
- Other actions and transformations
VI) PySpark SQL - DataFrame
- Introduction
- Making data Structured
- Case Classes
- ways to extract case class objects
- 1) using function
- 2) using map with multiple expressions
- 3) using map with single expression
- Sql Context
- Data Frames API
- DataSet API
- RDD vs DataFrame vs DataSet
- PySpark – Create a DataFrame
- PySpark – Create an empty DataFrame
- PySpark – Convert RDD to DataFrame
- PySpark – Convert DataFrame to Pandas
- PySpark – show()
- PySpark – StructType & StructField
- PySpark – Row Class
- PySpark – Column Class
- PySpark – select()
- PySpark – collect()
- PySpark – withColumn()
- PySpark – withColumnRenamed()
- PySpark – where() & filter()
- PySpark – drop() & dropDuplicates()
- PySpark – orderBy() and sort()
- PySpark – groupBy()
- PySpark – join()
- PySpark – union() & unionAll()
- PySpark – unionByName()
- PySpark – UDF (User Defined Function)
- PySpark – map()
- PySpark – flatMap()
- PySpark – foreach()
- PySpark – sample() vs sampleBy()
- PySpark – fillna() & fill()
- PySpark – pivot() (Row to Column)
- PySpark – partitionBy()
- PySpark – ArrayType Column (Array)
- PySpark – MapType (Map/Dict)
VII) PySpark SQL Functions
- PySpark – Aggregate Functions
- PySpark – Window Functions
- PySpark – Date and Timestamp Functions
- PySpark – JSON Functions
- PySpark – Read & Write JSON file
VIII) PySpark Built-In Functions
- PySpark – when()
- PySpark – expr()
- PySpark – lit()
- PySpark – split()
- PySpark – concat_ws()
- PySpark – substring()
- PySpark – translate()
- PySpark – regexp_replace()
- PySpark – overlay()
- PySpark – to_timestamp()
- PySpark – to_date()
- PySpark – date_format()
- PySpark – datediff()
- PySpark – months_between()
- PySpark – explode()
- PySpark – array_contains()
- PySpark – array()
- PySpark – collect_list()
- PySpark – collect_set()
- PySpark – create_map()
- PySpark – map_keys()
- PySpark – map_values()
- PySpark – struct()
- PySpark – countDistinct()
- PySpark – sum(), avg()
- PySpark – row_number()
- PySpark – rank()
- PySpark – dense_rank()
- PySpark – percent_rank()
- PySpark – typedLit()
- PySpark – from_json()
- PySpark – to_json()
- PySpark – json_tuple()
- PySpark – get_json_object()
- PySpark – schema_of_json()
- Working Examples
IX) PySpark External Sources
- Working with sql statements
- Spark and Hive Integration
- Spark and MySQL Integration
- Working with CSV
- Working with JSON
- Transformations and actions on dataframes
- Narrow, wide transformations
- Addition of new columns, dropping of columns, renaming columns
- Addition of new rows, dropping rows
- Handling nulls
- Joins
- Window function
- Writing data back to External sources
- Creation of tables from DataFrames (Internal tables, Temporary tables)
X) DEPLOYMENT MODES
- Local Mode
- Cluster Modes (Standalone, YARN)
XI) PYSPARK APPLICATION
- Stages and Tasks
- Driver and Executor
- Building spark applications/pipelines
- Deploying spark apps to cluster and tuning
- Performance tuning
PySpark Streaming Concepts
Integration with Kafka
PySpark-mllib
PYTHON
1. Python Basics
- What is Python
- Why Python?
- History of python
- Applications of Python
- Features of Python
- Advantages of Python
- Versions of Python
- Installation of Python
- Flavors of Python
- Comparison b/w various programming languages C, Java and Python
2. Python Operations
- Python Modes of Execution
- Interactive mode of Execution
- Batch mode of Execution
- Python Editors and IDEs
- Python Data Types
- Python Constants
- Python Variables
- Comments in python
- Output: print() function
- input() function: Accepting input
- Type Conversion
- type(), id() functions
- Escape Sequences in Python
- Strings in Python
- String indices and slicing
3. Operators in Python
- Arithmetic Operators
- Comparison Operators
- Logical Operators
- Assignment Operators
- Short Hand Assignment Operators
- Bitwise Operators
- Membership Operators
- Identity Operators
4. Python IDEs
- Pycharm IDE Installation
- Working with Pycharm
- Pycharm components
- Installing Anaconda
- What is Conda?
- Anaconda Prompt
- Anaconda Navigator
- Jupyter Notebook
- Jupyter Features
- Spyder IDE
- Spyder Features
- Conda and PIP
5. Flow Control statements
- Block/clause
- Indentation in Python
- Conditional Statements
- if statement
- if…else statement
- if…elif…statement
6. Looping Statements
- while loop
- while … else
- for loop
- range() in for loop
- Nested for loop
- Break statement
- Continue statement
- Pass statement
7. Strings in Python
- Creating Strings
- String indexing
- String slicing
- String Concatenation
- String Comparison
- String splitting and joining
- Finding Sub Strings
- String Case Change
- Split strings
- String methods
8. Collections in Python
- Introduction
- Lists
- Tuples
- Sets
- Dictionaries
- Operations on collections
- Functions for collections
- Methods of collection
- Nested collections
- Differences b/w list, tuple, set and dictionary
9. Python Lists
- List properties
- List Creation
- List indexing and slicing
- List Operations
- List addresses
- List functions
- Different ways of creating lists
- Nested Lists
- List modification
- List insertion and deletion
- List Methods
10. Python Tuples
- Tuple properties
- Tuple Creation
- Tuple indexing and slicing
- Different ways of creating tuples
- Tuple Operations
- Tuple Addresses
- Tuple Functions
- Nested Tuples
- Tuple Methods
- Differences b/w List and Tuple
11. Python Sets
- Set properties
- Set Creation
- Set Operations
- Set Functions
- Set Addresses
- Set Mathematical Operations
- Set Methods
- Insertion and Deletion operation
12. Python Dictionary
- Dictionary properties
- Dictionary Creation
- Dictionary Operations
- Dictionary Addresses
- Nested Dictionaries
- Dictionary Methods
- Insertion and Deletion of elements
- Differences b/w list, tuple, set and dictionary
13. Functions in Python
- Defining a function
- Calling a function
- Properties of Function
- Examples of Functions
- Categories of Functions
- Argument types
- default arguments
- non-default arguments
- keyword arguments
- non keyword arguments
- Variable Length Arguments
- Variables scope
- Call by value and Call by Reference
- Passing collections to function
- Local and Global variables
- Recursive Function
- Boolean Function
- Passing functions to function
- Anonymous or Lambda function
- filter() and map() functions
- reduce() function
14. Modules in Python
- What is a module?
- Different types of module
- Creating user defined module
- Setting path
- The import statement
- Normal Import
- From … Import
- Module Aliases
- Reloading a module
- Dir function
- Working with standard modules: math, random, datetime and os
15. Packages
- Introduction to packages
- Defining packages
- Importing from packages
- __init__.py file
- Defining sub packages
- Importing from sub packages
16. Errors and Exception Handling
- Types of errors
- Compile-Time Errors
- Run-Time Errors
- What is Exception?
- Need of Exception handling
- Predefined Exceptions
- try, except, finally blocks
- Nested blocks
- Handling Multiple Exceptions
- User defined Exceptions
- Raise statement
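
The try/except/finally flow, multiple handlers, a user-defined exception and the raise statement listed above fit together roughly as in this minimal sketch. The names `InsufficientFundsError`, `withdraw` and `safe_withdraw` are hypothetical and chosen only for illustration.

```python
class InsufficientFundsError(Exception):
    """User-defined exception (hypothetical name for this example)."""

def withdraw(balance, amount):
    """Raise the custom exception when the request cannot be satisfied."""
    if amount > balance:
        raise InsufficientFundsError(f"cannot withdraw {amount} from {balance}")
    return balance - amount

def safe_withdraw(balance, amount):
    """Handle more than one exception type and always run the finally block."""
    log = []
    try:
        balance = withdraw(balance, amount)
        log.append("ok")
    except InsufficientFundsError as exc:
        log.append(f"rejected: {exc}")
    except TypeError:
        # Comparing a str with an int raises TypeError in Python 3.
        log.append("bad input")
    finally:
        log.append("done")
    return balance, log

print(safe_withdraw(100, 30))
print(safe_withdraw(100, 300))
print(safe_withdraw(100, "x"))
```

The finally block runs on every path, which is why "done" is logged whether the withdrawal succeeds or fails.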
17. File Handling
- Introduction
- Types of Files in Python
- Opening a file
- Closing a file
- Writing data to files
- tell() and seek() methods
- Reading data from files
- Appending data to files
- with open statement
- Various functions
18. OOPs Concepts
- OOPS Features
- Encapsulation
- Abstraction
- Class
- Object
- Static and non static variables
- Defining methods
- Diff b/w functions & methods
- Constructors
- Parameterized Constructors
- Built-in attributes
- Object Reference count
- Destructor
- Garbage Collection
- Inheritance
- Types of Inheritances
- Object class
- Polymorphism
- Overriding
- super() function
19. Regular Expressions
- What is regular expression?
- Special characters
- Forming regular expression
- Compiling regular expressions
- Grouping
- findall() function
- finditer() function
- sub() function
- match() function
- search() function
- Matching vs searching
- Splitting a string
- Replacing text
- validations
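
The matching vs searching distinction and the related functions listed above can be seen in a short sketch with the standard `re` module. The pattern and the sample text are made up for illustration.

```python
import re

# Compiled pattern with two groups: three digits, a hyphen, four digits.
pattern = re.compile(r"(\d{3})-(\d{4})")
text = "call 555-1234 or 555-9876"

at_start = pattern.match(text)           # match() only tries at the start of the string
anywhere = pattern.search(text)          # search() scans for the first occurrence
all_pairs = pattern.findall(text)        # findall() returns the groups of every occurrence
spans = [m.span() for m in pattern.finditer(text)]  # finditer() yields match objects
masked = pattern.sub("XXX-XXXX", text)   # sub() replaces every occurrence
parts = re.split(r"\s+or\s+", text)      # split() cuts the string on a pattern

print(at_start, anywhere.group(0), all_pairs, spans, masked, parts)
```

Because the text starts with "call", match() finds nothing while search() still locates the first number.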
20. Database Access
- Introduction
- Installing MySQL database
- Creating database users
- Installing Oracle Python modules
- Establishing connection with MySQL
- Closing database connections
- Connection object
- Cursor object
- Executing SQL queries
- Retrieving data from Database
- Using bind variables when executing SQL queries
- Transaction Management
- Handling errors
21. Python Date and Time
- How to Use Date & DateTime Class
- Time and date Objects
- Calendar in Python
- The Time Module
- Python Calendar Module
22. Operating System Module
- Introduction
- getcwd
- listdir
- chdir
- mkdir
- rename file/dir
- remove file/dir
- rmtree()
- Os help
- Os operations
23. Advanced concepts
- Python Iterator
- Python Generator
- Python closure
- Python Decorators
- Web Scraping
- PIP
- Working with CSV files
- Working with XML files
- Working with JSON files
- Debugging
24. GUI Programming (tkinter)
- Introduction
- Components and events
- Root window
- Labels
- Fonts and colors
- Buttons, checkbox
- Label widget
- Message widget
- Text widget
- Radio button
- image
25. Excel Workbook
- Installing and working with XlsxWriter
- Creating Excel Workbook
- Inserting into excel sheet
- Inserting data into multiple excel sheets
- Creating headers
- Installing and working with xlrd module
- Reading a specific cell or row or column
- Reading specific rows and columns
26. Data Analytics
- Introduction
- pandas module
- Numpy module
- Matplotlib module
- Working Examples
27. Introduction to Data Science
- Machine Learning Introduction
- Datasets
- Supervised/Unsupervised Learning
- Statistical Analysis
- Data Analysis
- Uni-variate/multi-variate analysis
- Correlation Analysis
- Algorithm types
- Applications
28. Python Pandas
- Introduction to Pandas
- Creating Pandas Series
- Creating Data Frames
- Pandas Data Frames from dictionaries
- Pandas Data Frames from list
- Pandas Data Frames from series
- Pandas Data Frames from CSV, Excel
- Pandas Data Frames from JSON
- Pandas Data Frames from Databases
- Pandas Data Functionality
- Pandas Timedelta
- Creating Data Frames from Timedelta
- Pandas Groupings and Aggregations
- Converting Data Frames from list
- Creating Functions
- Converting Different Formats
- Pandas and Matplotlib
- Pandas use cases
29. Python Numpy
- Introduction to Numpy
- Numpy Arrays
- Numpy Array Indexing
- 2-Dimensional and 3-Dimensional Arrays
- Numpy Mathematical operations
- Numpy Flattening and reshaping
- Numpy Horizontal and Vertical Stack
- Numpy linspace and arange
- Numpy asarray and Random numbers
- Numpy iterations and Transpose
- Numpy Array Manipulation
- Numpy and matplotlib
- Numpy Linear Algebra
- Numpy String Functions
- Numpy operations and use cases
- Numpy Working Examples
30. Python Matplotlib
- Introduction to matplotlib
- Installing matplotlib
- Generating graphs
- Normal plottings
- Generating Bar graphs
- Histograms
- Scatter plots
- Stack plots
- Pie plots
- Matplotlib working examples
POWER BI
Introduction:
1. Downloads
2. Install POWERBI DESKTOP
3. Connect to POWERBI DESKTOP
Source Connections:
1. Get data from SQL Server
2. Get data from SSAS
3. Get data from Excel and multiple Excel sheets
4. Get data from Text files
5. Load data from multiple data sources
Working with Transformations:
1. Change datatype
2. Combine multiple tables
3. Enter data
4. Format dates
5. Joins
6. Pivot table
7. Reorder and remove columns
8. Rename columns
9. Rename tables
10. Split columns
11. Unpivot table
Working with Visualizations:
1. Area chart
2. Bar chart
3. Card
4. Column chart
5. Donut chart
6. Pie chart
7. Line chart
8. Table
9. Matrix
10. Ribbon chart
11. Scatter chart
12. Map
13. Tree map
14. Waterfall chart
15. Format charts (All types)
POWER BI Filters:
1. Slicer
2. Basic filters
3. Advanced filters
4. Top N filters
5. Filters on Measures
POWER BI Calculated Fields:
1. Calculated Columns
2. Calculated Measures
3. Calculated Tables
4. Conditional Columns
Dashboards:
1. Register to Pro Service
2. Dashboard Introduction
3. Connect Desktop with BI Service or Pro
4. Publish Desktop Reports
5. Create Workspace
6. Create Dashboard
Working with Dashboards:
1. Dashboard Functionalities
2. Dashboard Settings
3. Pin report to Dashboard
4. Delete a Dashboard
Sharing Work:
1. Share a Dashboard
2. Share a Report
3. Share Workspace
Subscriptions:
1. Subscribe Dashboard
2. Subscribe Report
DAX:
1. Aggregate functions
2. Date functions
3. Logical functions
4. Math functions
5. String functions
6. Trigonometric functions
Interview Questions:
1. FAQs
PRE-REQUISITES:
1. SQLSERVER KNOWLEDGE
2. SSRS KNOWLEDGE