西维蜀黍

【Hadoop】HBase Shell

Commands using HBase Shell

Listing a Table

# Listing a Table
list
  ...


【Hadoop】Hive

Apache Hive

Apache Hive is a distributed, fault-tolerant data warehouse system that enables analytics at a massive scale.

Hive provides a SQL-like interface for querying data stored in the various databases and file systems that integrate with Hadoop.

Apache Hive supports the analysis of large datasets stored in Hadoop’s HDFS and in compatible file systems such as the Amazon S3 filesystem and Alluxio. It provides a SQL-like query language called HiveQL with schema on read, and transparently converts queries into MapReduce, Apache Tez or Spark jobs.
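To make "schema on read" concrete, here is a minimal sketch in plain Python. It does not use Hive itself; the sample data, the `SCHEMA` list and the `read_with_schema` helper are all hypothetical names chosen for illustration. The raw text is stored untouched, and types are applied only when the data is read.

```python
import csv
import io

# Raw file contents, stored untouched (hypothetical sample data).
RAW = "1,alice,30\n2,bob,not_a_number\n3,carol,41\n"

# Hypothetical schema, applied only at read time.
SCHEMA = [("id", int), ("name", str), ("age", int)]


def read_with_schema(raw, schema):
    """Yield one dict per row, casting fields on read.

    Values that do not fit the declared type become None instead of
    rejecting the row (an assumption made for this sketch).
    """
    for fields in csv.reader(io.StringIO(raw)):
        row = {}
        for (name, cast), value in zip(schema, fields):
            try:
                row[name] = cast(value)
            except ValueError:
                row[name] = None
        yield row


rows = list(read_with_schema(RAW, SCHEMA))

# Roughly what "SELECT name FROM people WHERE age > 35" would express.
older = [r["name"] for r in rows if r["age"] is not None and r["age"] > 35]
print(older)
```

The point of the design is that a malformed value does not block loading: the file is accepted as-is, and the mismatch only surfaces when a query interprets it.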

  ...


【Hadoop】Learning

Hadoop

Hadoop allows the distributed processing of large data sets stored across clusters of computers.

The Hadoop framework consists of two main components:

  • Hadoop Distributed File System (HDFS)
    • HDFS is an open source variant of the Google File System (GFS)
  • MapReduce programming framework
    • Hadoop MapReduce is the open source variant of Google MapReduce
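
As a rough illustration of the MapReduce programming model, the sketch below counts words in a single Python process. It is not the Hadoop API; the function names `map_phase`, `shuffle` and `reduce_phase` are hypothetical, and a real cluster would run the map and reduce steps in parallel across machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


# Hypothetical input records.
documents = [
    "hadoop stores data in hdfs",
    "mapreduce processes data in hadoop",
]

pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)
```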
  ...


【Database】Entity Relationship (E-R) Diagrams

ER Diagrams

An Entity Relationship (ER) Diagram is a type of flowchart that illustrates how “entities” such as people, objects or concepts relate to each other within a system. ER Diagrams are most often used to design or debug relational databases in the fields of software engineering, business information systems, education and research. Also known as ERDs or ER Models, they use a defined set of symbols such as rectangles, diamonds, ovals and connecting lines to depict the interconnectedness of entities, relationships and their attributes. They mirror grammatical structure, with entities as nouns and relationships as verbs.
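
One way to see the noun/verb idea is to translate a tiny ER model into tables. The sketch below uses Python's built-in `sqlite3` module; the entities `student` and `course` and the relationship `enrolls_in` are hypothetical examples, not taken from the original post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Entities (nouns) become tables; the relationship (verb) becomes a
# linking table whose foreign keys point at the entities it connects.
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE enrolls_in (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id  INTEGER NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
""")

conn.execute("INSERT INTO student VALUES (1, 'Ada')")
conn.execute("INSERT INTO course VALUES (10, 'Databases')")
conn.execute("INSERT INTO enrolls_in VALUES (1, 10)")

# Read the relationship back as a sentence: student enrolls in course.
rows = conn.execute("""
    SELECT s.name, c.title
    FROM enrolls_in e
    JOIN student s ON s.student_id = e.student_id
    JOIN course  c ON c.course_id  = e.course_id
""").fetchall()
print(rows)
```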

  ...


【Database】Online Transaction Processing (OLTP) vs Online Analytical Processing (OLAP)

Online Transaction Processing

Online transaction processing (OLTP) is a type of database system used in transaction-oriented applications, such as many operational systems. “Online” means that such systems are expected to respond to user requests and process transactions in real time. The term is contrasted with online analytical processing (OLAP), which instead focuses on data analysis (for example, planning and management systems).

OLTP systems use a relational database that can do the following:

  • Process a large number of relatively simple transactions, usually insertions, updates and deletions of data.
  • Enable multi-user access to the same data, while ensuring data integrity.
  • Support very rapid processing, with response times measured in milliseconds.
  • Provide indexed data sets for rapid searching, retrieval and querying.
  • Be available around the clock, with constant incremental backups.
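
The first few capabilities in this list can be sketched with Python's built-in `sqlite3` module. SQLite is only a stand-in for a production OLTP database here, and the `orders` table, its columns and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer TEXT, status TEXT)"
)
# An index on the lookup column supports fast searching and retrieval.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

# Each `with conn` block is one short transaction:
# it commits on success and rolls back if an exception escapes.
with conn:
    conn.execute(
        "INSERT INTO orders (customer, status) VALUES (?, ?)", ("alice", "new")
    )
    conn.execute(
        "INSERT INTO orders (customer, status) VALUES (?, ?)", ("bob", "new")
    )

with conn:
    conn.execute(
        "UPDATE orders SET status = 'shipped' WHERE customer = ?", ("alice",)
    )

with conn:
    conn.execute("DELETE FROM orders WHERE customer = ?", ("bob",))

# Point lookup on the indexed column.
remaining = conn.execute(
    "SELECT customer, status FROM orders WHERE customer = ?", ("alice",)
).fetchall()
print(remaining)
```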

Use

OLTP has also been used to refer to processing in which the system responds immediately to user requests. An automated teller machine (ATM) for a bank is an example of a commercial transaction processing application. Online transaction processing applications have high throughput and are insert- or update-intensive in database management. These applications are used concurrently by hundreds of users. The key goals of OLTP applications are availability, speed, concurrency and recoverability (durability).
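
The ATM example can be sketched as a single atomic transaction. The code below uses Python's built-in `sqlite3` module as a stand-in for a bank's database; the `accounts` and `withdrawals` tables, the `withdraw` function and the `InsufficientFunds` exception are hypothetical. It shows the recoverability goal in miniature: if the withdrawal cannot complete, every change made inside the transaction is rolled back.

```python
import sqlite3


class InsufficientFunds(Exception):
    """Raised when a withdrawal would overdraw the account."""


def withdraw(conn, account_id, amount):
    """Debit the account and log the withdrawal as one transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE account_id = ?",
            (amount, account_id),
        )
        conn.execute(
            "INSERT INTO withdrawals (account_id, amount) VALUES (?, ?)",
            (account_id, amount),
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE account_id = ?", (account_id,)
        ).fetchone()
        if balance < 0:
            raise InsufficientFunds(f"account {account_id} would be overdrawn")
    return balance


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.execute("CREATE TABLE withdrawals (account_id INTEGER, amount INTEGER)")
with conn:
    conn.execute("INSERT INTO accounts VALUES (1, 100)")

first = withdraw(conn, 1, 60)  # succeeds and is committed

try:
    withdraw(conn, 1, 60)  # would overdraw, so both writes are undone
    second_failed = False
except InsufficientFunds:
    second_failed = True

final_balance = conn.execute(
    "SELECT balance FROM accounts WHERE account_id = 1"
).fetchone()[0]
logged = conn.execute("SELECT COUNT(*) FROM withdrawals").fetchone()[0]
print(first, second_failed, final_balance, logged)
```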

  ...