Video description
Sharpen your architectural skills by understanding challenges in the main areas of distributed systems: storage, computation, messaging, timing, and consensus. You’ll learn how to develop highly scalable big data applications using Apache Accumulo, to model and design an agile data warehouse, and to use Elasticsearch to search, aggregate, analyze, and scale large-volume datastores. You’ll also learn how to identify security weaknesses in your big data cluster and how to secure the cluster using authentication with MIT Kerberos and Active Directory, and authorization.
Table of contents
- Welcome
- What Distributed Systems Are, and Why They Exist
- Read Replication
- Sharding
- Consistent Hashing
- CAP Theorem
- Distributed Transactions
- Distributed Computation Introduction
- Map Reduce
- Hadoop
- Spark
- Storm
- Lambda Architecture
- Synchronization
- Network Time Protocol
- Vector Clocks
- Distributed Consensus: Paxos
- Messaging Introduction
- Kafka
- Zookeeper
- Wrap-Up
- Getting Started
- Data Warehouse Overview
- Data Sources
- Staging Tables
- Data Warehouse Modeling Basics
- Recurrent Dimensions
- DW Modeling - Advanced Dimension
- DW Modeling - Advanced Fact
- Data Warehouse Modeling Recap
- Data Warehouse Design
- Conceptual, Logical, Physical Models
- System Attributes - Part 1
- System Attributes - Part 2
- Data Types And Domains
- Nullability
- Constraints
- Data Warehouse Tuning - Part 1
- Data Warehouse Tuning - Part 2
- Views - Part 1
- Views - Part 2
- Miscellaneous Aspects Of Design
- Practical Tips
- Self Assessment Test
- Case Study: Create Staging SQL
- Case Study: Execute Staging SQL
- Case Study: Create Warehouse SQL
- Case Study: Execute Warehouse SQL
- Data Warehouse Data
- Warehouse Data Overview
- Source-To-Target Mappings
- Data Profiling
- Loading Staging Tables - Part 1
- Loading Staging Tables - Part 2
- Loading The Date and Time Dimensions - Part 1
- Loading The Date and Time Dimensions - Part 2
- Initial Warehouse Loading: Dimensions
- Initial Warehouse Loading: Facts
- Updating The Warehouse
- Warehouse Data Processing And Agile Development
- Case Study: Load Warehouse Data
- End User Access
- Data And Metadata Management
- Conclusion
- In Search Of Database Nirvana
- Getting Started
- Basic Operations
- Data Structure
- Queries And Relevance
- Aggregations
- Document Relationships
- Performance And Scaling
- Monitoring And Administration
- Conclusion
- Data Model And Architecture
- Working With Accumulo
- Basic Application Development
- Application Security
- Intermediate Application Development
- Advanced Application Development
- Performance
- Administration
- Conclusion
- Course Overview
- Tooling
- Hadoop Insecurities
- Authentication With MIT Kerberos
- Authentication With Active Directory
- Authorization
- Encryption
- Creating An HDFS Encryption Zone
- Using HDFS Encryption Zones
- SSL: Crash Course In SSL Tools
- SSL: Preparing A Cluster For SSL Using A Self-Signed Root CA
- SSL: Enabling SSL For HDFS And Yarn
- SSL: Verifying SSL With HDFS And Yarn
- SASL Hive And HiveServer2
- SSL With HBase And Oozie
- SSL With Impala
- SSL With Hue
- Developer Topics
- Administrator Topics
- Secure Hadoop Topics
- Conclusion
Product information
- Title: Advanced Architecture for Big Data Applications
- Author(s):
- Release date: December 2016
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781491978658