Highlander Glossary

The following glossary defines some of the common terminology used with data warehousing generally and with the Highlander products specifically.

Atomic Level
The atomic level is the finest grain of aggregation summarized by a dimensional data model. When applied to dimensions, the atomic level refers to the discrete values the dimension may assume. When applied to a dataset, the atomic level is the cell created by the intersection of all dimensions at the atomic level. The atomic level is the lowest level of detail normally stored in a multi-dimensional database.

Data Mart
Usually sponsored at the department level and developed with a specific issue or subject in mind, a Data Mart is a data warehouse with a focused objective. The scope of data in a data mart is limited to data supporting the target objective. In this sense, the data mart is more structured than a general-purpose data warehouse. Data marts are now generally recognized as the most cost-effective approach to data warehousing.

Data Warehouse
According to William Inmon, widely considered the father of the modern data warehouse, a Data Warehouse is a "Subject-Oriented, Integrated, Time-Variant, Nonvolatile collection of data in support of decision making". Data Warehouses tend to have these distinguishing features: (1) they use a subject-oriented dimensional data model; (2) they contain publishable data from potentially multiple sources; and (3) they contain integrated reporting tools.

Dataset
A Highlander Dataset - also known as a hypercube - is a summary measure arrayed by a collection of dimensions. The most common summary measure is the count of records falling in a cell, based on the dimensions defining the cell. Multi-dimensional databases typically contain many datasets.
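As a rough illustration of the count measure described above, the following sketch builds a dataset from a handful of entity records. It is not Highlander code; the record fields, the dimension names, and the function `build_count_dataset` are assumptions made up for the example.

```python
from collections import Counter

# Hypothetical entity records: one dict per source record.
records = [
    {"color": "Red", "size": "S"},
    {"color": "Red", "size": "S"},
    {"color": "Blue", "size": "L"},
    {"color": "Red", "size": "L"},
]


def build_count_dataset(records, dimensions):
    """Count the records falling in each cell.

    A cell is identified by a tuple holding one value per dimension.
    """
    return Counter(tuple(rec[d] for d in dimensions) for rec in records)


dataset = build_count_dataset(records, ("color", "size"))
print(dataset[("Red", "S")])   # cell with two records
print(dataset[("Blue", "S")])  # empty cell, reported as zero
```

Only the cell counts survive in `dataset`; the individual records are no longer needed to answer summary questions.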

Dataview
A Highlander Dataview is a data structure containing multiple datasets, rules for making dimension queries, and data reduction algorithms. A dataview can provide a transparent view to a single dataset or can combine datasets in various ways, such as by forming ratios. User-defined calculations using summarized data are implemented by dataviews. End users interact with dataviews, not directly with datasets.
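One way to picture a dataview that combines datasets by forming a ratio is the sketch below. The function `ratio_dataview`, the dataset names, and the figures are hypothetical and only illustrate the idea; they do not reflect how Highlander stores or evaluates dataviews.

```python
def ratio_dataview(numerator, denominator):
    """Combine two datasets cell by cell into a ratio.

    Both datasets map a cell (tuple of dimension values) to a summary
    value. Cells whose denominator is zero yield None.
    """
    result = {}
    for cell, denom in denominator.items():
        result[cell] = numerator.get(cell, 0) / denom if denom else None
    return result


# Hypothetical datasets sharing a single "region" dimension.
claims = {("East",): 30, ("West",): 10}
members = {("East",): 120, ("West",): 50, ("North",): 0}

claims_per_member = ratio_dataview(claims, members)
print(claims_per_member)
```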

Dataview Tree
A Dataview Tree is a hierarchical structure of nodes representing dataviews and navigation junctions. This structure visually portrays the subject matter context of the Highlander databases and is an integral part of the database. Dataview nodes are "opened" to run the dataview itself. Navigation nodes organize the subject matter hierarchically. Users quickly grasp the scope and nature of the database's information content through this structure.
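The two kinds of node can be sketched with a small tree type. The class `Node`, its fields, and the example subject names are assumptions for illustration, not the Highlander implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    """A tree node: a dataview node if `run` is set, otherwise navigation."""
    name: str
    run: Optional[Callable[[], object]] = None
    children: List["Node"] = field(default_factory=list)

    def is_dataview(self) -> bool:
        return self.run is not None


def outline(node, depth=0):
    """Return the tree as indented lines, marking dataview nodes with '*'."""
    marker = " *" if node.is_dataview() else ""
    lines = ["  " * depth + node.name + marker]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


# Hypothetical tree: navigation nodes group the dataviews by subject.
tree = Node("Claims", children=[
    Node("Counts", run=lambda: "counts dataview result"),
    Node("Rates", children=[
        Node("By Region", run=lambda: "rates dataview result"),
    ]),
])

print("\n".join(outline(tree)))
```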

Dimension
A Highlander Dimension is an independent variable assuming a finite number of discrete possibilities. Dimensions may be categorical in nature, or numeric in nature, ranging over a continuous spectrum. In the latter case, the range of possible dimension values is partitioned into intervals, and each record is categorized by the interval containing its value. Virtual dimensions are computed from a set of independent variables according to a user-defined expression, then categorized by an interval partition.
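The interval partition can be sketched with a sorted list of boundaries and a binary search. The boundaries, labels, field names, and the per-person income expression below are all hypothetical; the sketch only shows the categorization step, not Highlander's own syntax for defining dimensions.

```python
from bisect import bisect_right


def categorize(value, boundaries, labels):
    """Return the label of the interval containing value.

    boundaries holds the lower edge of every interval except the first,
    in ascending order, so len(labels) == len(boundaries) + 1.
    """
    return labels[bisect_right(boundaries, value)]


# Numeric dimension: age partitioned into four intervals.
AGE_BOUNDS = [18, 35, 65]
AGE_LABELS = ["0-17", "18-34", "35-64", "65+"]

# Virtual dimension: computed from two variables, then partitioned.
PCI_BOUNDS = [10000, 30000]
PCI_LABELS = ["low", "middle", "high"]


def income_per_person(record):
    """Hypothetical user-defined expression over two independent variables."""
    return record["income"] / record["household_size"]


record = {"age": 18, "income": 60000, "household_size": 3}
print(categorize(record["age"], AGE_BOUNDS, AGE_LABELS))
print(categorize(income_per_person(record), PCI_BOUNDS, PCI_LABELS))
```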

Dimension Query
A Dimension Query is a selection criterion for a dimension of a dimensional data model. The query might select all atomic dimension values ('ALL') or summarize all values ('TOTAL'), essentially ignoring the dimension. Individual values are selected by naming the value (Dimension Color = 'Red' 'Blue'), or are summarized by inserting a plus sign between their values (Dimension Color = 'Red' + 'Blue'). Taken together, the queries for all dimensions specify a collection of cells for extraction from the database. Cells are summarized by summing their contents, or by applying a summary metric that depends on the dataset's definition.
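The query forms described above can be sketched against a small count dataset. The function `run_query` and its way of encoding queries (the strings "ALL" and "TOTAL", or a list of value groups where a multi-value group stands for values joined by a plus sign) are assumptions for illustration, not Highlander's query syntax. The sketch summarizes cells by summing, the simplest of the summary metrics mentioned.

```python
from collections import defaultdict
from itertools import product


def run_query(dataset, domains, queries):
    """Apply one query per dimension and sum the selected cells.

    dataset: dict mapping a cell (tuple of values) to a count.
    domains: list of the atomic values of each dimension.
    queries: per dimension, "ALL", "TOTAL", or a list of value groups.
    """
    groups_per_dim = []
    for domain, query in zip(domains, queries):
        if query == "ALL":
            groups = [(value, {value}) for value in domain]
        elif query == "TOTAL":
            groups = [("TOTAL", set(domain))]
        else:
            groups = [("+".join(group), set(group)) for group in query]
        groups_per_dim.append(groups)

    result = defaultdict(int)
    for cell, count in dataset.items():
        # For each dimension, find the output labels this cell feeds into.
        label_options = [
            [label for label, values in groups if value in values]
            for value, groups in zip(cell, groups_per_dim)
        ]
        # Cells not selected in some dimension produce no combinations.
        for labels in product(*label_options):
            result[labels] += count
    return dict(result)


dataset = {("Red", "S"): 2, ("Blue", "L"): 1, ("Red", "L"): 1, ("Green", "S"): 5}
domains = [["Red", "Blue", "Green"], ["S", "L"]]

# Color = 'Red' + 'Blue', Size = ALL
summed = run_query(dataset, domains, [[("Red", "Blue")], "ALL"])
# Color = 'Red' 'Blue', Size = TOTAL
named = run_query(dataset, domains, [[("Red",), ("Blue",)], "TOTAL"])
print(summed)
print(named)
```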

Dimensional Data Model
The Dimensional Data Model represents data as values residing in cells defined by the intersection of a set of independent variables - dimensions - each assuming a discrete set of values. A dimensional data model is often referred to as a hypercube or the Cartesian product of dimensions forming an n-dimensional structure of cells. For example, two dimensions form a table, the classic spreadsheet. Cells are defined at each row x column intersection. Three dimensions form a cube of cells based on the face, row, and column dimensions. Specifying a value along the face dimension defines a table in the row x column plane, and further specification of rows defines lines in the column dimension. The best metrics of a dimensional data model size are the number of cells (the product of the dimension lengths along all dimensions) and the density (the proportion of cells having data). Dimensional data models are the intuitive models most users have of their data.
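The two size metrics can be worked through in a few lines. The function name `model_size` and the example figures are hypothetical.

```python
from math import prod


def model_size(dimension_lengths, populated_cells):
    """Return (number of cells, density) for a dimensional data model.

    The number of cells is the product of the dimension lengths; the
    density is the proportion of those cells that hold data.
    """
    cells = prod(dimension_lengths)
    density = populated_cells / cells
    return cells, density


# A cube with face, row, and column dimensions of lengths 3, 4, and 5,
# of which 12 cells actually contain data.
cells, density = model_size([3, 4, 5], 12)
print(cells, density)
```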

Drill Down
Drill Down refers to the process of disaggregating summary results along some dimension. Disaggregation might be to an intermediate level or to an atomic level, the finest level recorded for the dimension. Drilling "through the floor" to the entity level refers to finding the entities that were summarized in a cell at the atomic level.
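A minimal sketch of drilling down along one dimension, from a total to an intermediate level to the atomic level. The state-to-region hierarchy, the counts, and the function `summarize` are assumptions invented for the example.

```python
from collections import Counter

# Hypothetical hierarchy: atomic value (state) -> intermediate level (region).
REGION_OF = {"CO": "West", "UT": "West", "NY": "East"}

# Counts stored at the atomic level of a single "state" dimension.
counts_by_state = {"CO": 4, "UT": 1, "NY": 7}


def summarize(counts, level):
    """Summarize atomic counts at the requested level of the dimension."""
    if level == "total":
        return {"TOTAL": sum(counts.values())}
    if level == "region":
        by_region = Counter()
        for state, n in counts.items():
            by_region[REGION_OF[state]] += n
        return dict(by_region)
    if level == "state":
        return dict(counts)
    raise ValueError(f"unknown level: {level}")


# Each step disaggregates the previous result one level further.
for level in ("total", "region", "state"):
    print(level, summarize(counts_by_state, level))
```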

Enterprise Data Warehouse
An Enterprise Data Warehouse is a data warehouse containing all publishable quality data of a permanent nature collected by an organization. This inevitably includes historic data from multiple data sources. Operational transaction data is usually excluded due to its volatile nature. Enterprise data warehouses are valuable resources, are costly to construct, and require a long time to evolve.

Entity Record
Entity records are tabular records structured as fields within records within relational tables or flat files. Entity records, sometimes called transaction records or just transactions, are the source-level subject matter detail summarized by a dimensional data model. Entity records are not retained in a multi-dimensional database; however, many data warehouses contain methods for identifying and retrieving the summarized entities.

Multi-Dimensional Database
A Multi-Dimensional Database is a database that stores data natively in multi-dimensional format. Database access is done by specifying a query for each dimension. Results of an access are summarized cell values over a subset of cells. The entity source records are themselves gone and only dimensions and summary measures remain. Dimensional data models can be implemented in relational databases; however, multi-dimensional databases are more efficient and perform better. Currently multi-dimensional databases are not standardized; multi-dimensional databases contain the necessary query and reporting tools.

On-Line Analytical Processing (OLAP)
On-Line Analytical Processing - OLAP - is the ability to conduct data analysis within the context of the database. Data analysis may include predefined descriptive statistics, user-defined expressions executed against the data, or customized models driven by the data.

Reach Back
Reach Back is the extraction of one or more entity records summarized during the creation of a multi-dimensional database. Entity records are not normally retained in a multi-dimensional database, so reach back operations require recourse to the source entity data used to create the database.
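Because the summarized database no longer holds the records, a reach back has to filter the source data by the cell's dimension values. The sketch below assumes the source records are still available as a list of dicts; the field names and the function `reach_back` are hypothetical.

```python
def reach_back(source_records, dimensions, cell):
    """Return the source records that were summarized into the given cell."""
    return [
        rec for rec in source_records
        if tuple(rec[d] for d in dimensions) == cell
    ]


# Hypothetical source entity data used to create the database.
source_records = [
    {"id": 1, "color": "Red", "size": "S"},
    {"id": 2, "color": "Blue", "size": "L"},
    {"id": 3, "color": "Red", "size": "S"},
]

matches = reach_back(source_records, ("color", "size"), ("Red", "S"))
print([rec["id"] for rec in matches])
```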

Email: sales@peaksoftware.com


Site created and maintained by Infopoint, Inc., Copyright 2000