Data Warehouse : Introduction
- The term Data Warehouse combines two words: Data, meaning raw facts and figures, and Warehouse, meaning a storage repository.
- A Data Warehouse is a storage repository in which data, information and knowledge from heterogeneous databases or data sources are combined, but only after that data has been processed to remove errors and inconsistencies.
- Data warehouses differ from traditional databases in size, volume and storage space as well as in content. A data warehouse contains all sorts of information drawn from multiple databases or data sources. A database, by contrast, is a repository in which data is stored in an organized way.
- Because of their size and nature, data warehouses are maintained separately from the databases that feed them.
- In the words of Ralph Kimball: “A data warehouse is a copy of transaction data specifically structured for query and analysis.”
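The idea of combining heterogeneous sources only after removing errors and inconsistencies can be sketched in a few lines of Python. This is a minimal illustration, not a real ETL tool: the two source systems (`crm_rows`, `billing_rows`), their field names and the cleaning rules are all assumptions made up for the example.

```python
from datetime import datetime

# Hypothetical records from two source systems with different conventions.
crm_rows = [
    {"customer": " Alice ", "amount": "120.50", "date": "2023-01-15"},
    {"customer": "Bob", "amount": "n/a", "date": "2023-01-16"},  # unusable amount
]
billing_rows = [
    {"cust_name": "ALICE", "total_cents": 9900, "day": "15/02/2023"},
    {"cust_name": "", "total_cents": 500, "day": "16/02/2023"},  # missing name
]


def clean_crm(row):
    """Normalize one CRM row, or return None if it cannot be repaired."""
    name = row["customer"].strip().title()
    try:
        amount = float(row["amount"])
    except ValueError:
        return None
    if not name:
        return None
    return {"customer": name, "amount": amount,
            "date": row["date"], "source": "crm"}


def clean_billing(row):
    """Normalize one billing row into the same shape as the CRM rows."""
    name = row["cust_name"].strip().title()
    if not name:
        return None
    # Billing uses day/month/year; convert to ISO dates like the CRM system.
    day = datetime.strptime(row["day"], "%d/%m/%Y").date().isoformat()
    return {"customer": name, "amount": row["total_cents"] / 100,
            "date": day, "source": "billing"}


def build_warehouse(crm, billing):
    """Combine both sources, keeping only rows that survived cleaning."""
    cleaned = [clean_crm(r) for r in crm] + [clean_billing(r) for r in billing]
    return [r for r in cleaned if r is not None]


warehouse = build_warehouse(crm_rows, billing_rows)
for record in warehouse:
    print(record)
```

Only the two valid rows reach the warehouse, and both now share one naming, currency and date convention even though they started in different formats.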
Data Warehouse : Properties
- As per Bill Inmon: “A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision making process.”
- Every data warehouse possesses these four properties. They can be explained as follows:
- Subject Oriented: A data warehouse is organized so that a particular subject can be analysed. Which subjects are available depends on the organization that owns the data warehouse. For example: analysis of the last five years of financial statistics from a particular organization’s data warehouse.
- Time Variant: A major advantage of a data warehouse over a traditional database is that it keeps track of historical data. This historical data can be retrieved for periods ranging from one month to five years or more, depending on the need.
- Integrated: Data-warehouses are repositories made by integrating data, information and knowledge from heterogeneous databases.
- Non-Volatile: Because a data warehouse holds historical data, data is not altered or removed once it has been loaded.
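The four properties above can be made concrete with a small sketch using Python's built-in `sqlite3` module. It is only an illustration under stated assumptions: the table `finance_fact`, its columns and the sample figures are hypothetical, and blocking changes with triggers is just one possible way to imitate non-volatility in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE finance_fact (           -- subject-oriented: one subject, finance
    fiscal_year   INTEGER NOT NULL,   -- time-variant: every row carries its period
    source_system TEXT    NOT NULL,   -- integrated: rows come from several sources
    revenue       REAL    NOT NULL
);
-- non-volatile: reject any attempt to change or remove loaded rows
CREATE TRIGGER finance_no_update BEFORE UPDATE ON finance_fact
BEGIN SELECT RAISE(ABORT, 'warehouse rows are read-only'); END;
CREATE TRIGGER finance_no_delete BEFORE DELETE ON finance_fact
BEGIN SELECT RAISE(ABORT, 'warehouse rows are read-only'); END;
""")

# Hypothetical history loaded from two source systems.
rows = [
    (2019, "erp", 100.0),
    (2020, "erp", 120.0),
    (2020, "webshop", 30.0),
    (2021, "erp", 150.0),
]
conn.executemany("INSERT INTO finance_fact VALUES (?, ?, ?)", rows)


def revenue_by_year(conn, first_year, last_year):
    """Total revenue per fiscal year for the requested period."""
    cur = conn.execute(
        "SELECT fiscal_year, SUM(revenue) FROM finance_fact "
        "WHERE fiscal_year BETWEEN ? AND ? "
        "GROUP BY fiscal_year ORDER BY fiscal_year",
        (first_year, last_year),
    )
    return cur.fetchall()


def try_update(conn):
    """Return True if an UPDATE succeeds, False if the trigger blocks it."""
    try:
        conn.execute("UPDATE finance_fact SET revenue = 0 WHERE fiscal_year = 2019")
    except sqlite3.DatabaseError:
        return False
    return True


print(revenue_by_year(conn, 2020, 2021))
print("update allowed:", try_update(conn))
```

New rows can still be appended for later years, so the history keeps growing while existing rows stay unchanged.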
Data Warehouse : Need
- A data warehouse contains historical data, which means that past data can be accessed from it whenever required. This data can be very beneficial from an organizational point of view in the following ways:
- Reports: Reports can be generated from the historical data for a particular subject or for a group of subjects or even for a department.
- Analytics: Analytics can be performed on the basis of the reports generated from the historical data.
- Data Mining: Data mining can be applied to a data warehouse to extract higher-level information that is not directly visible in the stored data.
- Metadata Ability: Because a data warehouse is such a large repository, it is difficult to understand that much data without a description of it. This description is the metadata.
- Decision Making & Decision Support System: Once the reports and analytics are available, decisions can be based on them. Supporting decision making is the fundamental purpose of a data warehouse.
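The chain from metadata to reports to analytics to a decision can be sketched as follows. The sample history, the department names and the "growth above zero" decision rule are all hypothetical and chosen only to keep the example small.

```python
from collections import defaultdict

# Hypothetical metadata: a description of what each field means.
metadata = {
    "dept": "department that booked the sale",
    "year": "calendar year of the sale",
    "sales": "total sales in thousands",
}

# Hypothetical historical data held in the warehouse.
history = [
    {"dept": "retail", "year": 2021, "sales": 200},
    {"dept": "retail", "year": 2022, "sales": 250},
    {"dept": "online", "year": 2021, "sales": 100},
    {"dept": "online", "year": 2022, "sales": 90},
]


def report_by_dept(rows):
    """Report: total sales per department and year."""
    report = defaultdict(dict)
    for row in rows:
        per_year = report[row["dept"]]
        per_year[row["year"]] = per_year.get(row["year"], 0) + row["sales"]
    return dict(report)


def growth(report, dept, start, end):
    """Analytics: relative change in sales between two years."""
    before, after = report[dept][start], report[dept][end]
    return (after - before) / before


report = report_by_dept(history)

# Decision support: list the departments whose sales grew.
growing = [d for d in sorted(report) if growth(report, d, 2021, 2022) > 0]
print(report)
print("growing departments:", growing)
```

The analytics step reads only the report, not the raw rows, which mirrors the order described in the list above.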
Advantages : Data Warehouse
- Data can be accessed quickly and easily.
- High system performance.
- Report generation and analytics.
- High quality data and information extraction.
- Easier access to data, which results in better decision making.
Disadvantages : Data Warehouse
- Building a data warehouse is a time-consuming process.
- High building cost.
- High maintenance cost.
- Complex structure.
- A larger number of resources is required.
Applications : Data Warehouse
Given their many advantages and properties, data warehouses are widely used in various sectors, some of which are listed below.