Data warehouses and operational databases are two different types of databases that are designed for different purposes. Data warehouses are used for analytical processing of historical data, while operational databases are used for transactional processing of current data.
Here is a table that compares data warehouses and operational databases:
Characteristic | Data warehouse (OLAP) | Operational database (OLTP) |
---|---|---|
Purpose | Analytical processing of historical data | Transactional processing of current data |
Users | Knowledge workers such as executives, managers, and analysts | Clerks, DBAs, or database professionals |
Focus | Information out | Data in |
Schema | Star schema, snowflake schema, or fact constellation schema | Entity relationship model |
Orientation | Information-oriented | Application-oriented |
Data | Historical | Current |
Summary | Summarized and consolidated | Primitive and highly detailed |
View | Summarized and multidimensional | Detailed and flat relational |
Number of users | Hundreds | Thousands |
Number of records accessed | Millions | Tens |
Database size | 100GB to 100TB | 100MB to 100GB |
Flexibility | Highly flexible | Provides high performance |
Other key differences between data warehouses and operational databases include:
- Data warehouses are typically updated in batches, while operational databases are updated in real time.
- Data warehouses are designed to support complex queries, while operational databases are designed to support fast and efficient transactions.
- Data warehouses are typically stored on disk, while operational databases are typically stored in memory.
Additive, semi-additive, and non-additive measures
- Additive measures are measures that can be summed up to get a meaningful result. For example, sales quantity is an additive measure.
- Semi-additive measures are measures that can be aggregated using a subset of aggregation functions. For example, account balance is a semi-additive measure. A sum() function on balance does not give a useful result, but a max() or min() balance might be useful.
- Non-additive measures are measures that cannot be aggregated using any numeric aggregation function. For example, profit margin is a non-additive measure.
Factless fact tables
A factless fact table is a fact table that does not contain any numeric fact columns. Factless fact tables are typically used to store event data, such as the number of logins or the number of orders placed.
Data warehousing schemas
The three most common data warehousing schemas are the star schema, snowflake schema, and fact constellation.
- Star schema: A star schema consists of a single fact table surrounded by dimension tables. Each dimension table contains a single attribute that is used to describe the fact table.
- Snowflake schema: A snowflake schema is a variant of the star schema in which dimension tables are normalized. This means that dimension tables are split into multiple tables based on their different attributes.
- Fact constellation: A fact constellation is a collection of star schemas that share dimension tables. This schema is often used to model complex data relationships.
Conclusion
Data warehouses and operational databases are two different types of databases that are designed for different purposes. Data warehouses are used for analytical processing of historical data, while operational databases are used for transactional processing of current data.
The best type of database to use will depend on your specific needs. If you need to perform complex analytical queries on historical data, then a data warehouse is a good option. If you need to support fast and efficient transactions, then an operational database is a good option.
0 Comentários