In database systems, data storage and data representation are among the major problems, because the efficiency with which these functions are implemented directly influences processing speed and user productivity. One of the most popular concepts for data storage and data mining is the hierarchical representation of information. Fundamental concepts such as trees and graphs serve as the basis for this representation. The aim of this essay is to discuss these two concepts, describe the types of trees, and analyze the efficiency of these concepts for modern computational and database systems.
Mathematically, a graph is an ordered pair C=(X, Y), where X is a set of nodes and Y is a set of edges, each edge joining a pair of nodes (Diestel, 2005). Geometrically, a graph may be represented by a set of points (nodes) connected by lines (edges). A graph may be undirected (its edges are unordered pairs, drawn as simple lines) or directed (its edges are ordered pairs, drawn as arrows instead of lines). A graph is weighted if each edge has an assigned number. Graphs are used for modeling and solving many problems, such as logistics, the implementation of neural networks, and the description of relations between data. In computer science, graph data structures are used to represent connections or relationships (Gross & Yellen, 1998). However, the general concept of a graph is too broad for implementing a hierarchical structure like those in relational databases.
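The definition above can be illustrated with a short Python sketch. It stores the pair (X, Y) as a dictionary that maps each node to its neighbors and edge weights. The function name build_adjacency, the sample nodes, and the weights are assumptions made for illustration only.

```python
def build_adjacency(nodes, edges, directed=False):
    """Return a dict mapping each node to a dict {neighbor: weight}."""
    adjacency = {node: {} for node in nodes}
    for source, target, weight in edges:
        adjacency[source][target] = weight
        if not directed:
            # An undirected edge is an unordered pair, so store both directions.
            adjacency[target][source] = weight
    return adjacency


# Hypothetical sample data: X is the node set, Y the weighted edge list.
X = {"a", "b", "c"}
Y = [("a", "b", 4), ("b", "c", 1)]

undirected = build_adjacency(X, Y)
directed = build_adjacency(X, Y, directed=True)
```

In the undirected version, node "b" is linked to both "a" and "c". In the directed version it keeps only the outgoing edge to "c".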
A tree is defined as an acyclic connected graph. In a rooted tree, each node has zero or more child nodes and at most one parent node. Each such tree has a root node, which has no parent. The data structure is so named because, when drawn from root to leaves, it resembles a tree turned upside down. The most frequently used type of tree is the ordered tree: a tree in which an order is imposed in some way, for example by ordering the children of every node or by assigning numbers to every node in the tree.
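The definition of a tree as an acyclic connected graph can be checked mechanically. The sketch below relies on a standard fact: a connected undirected graph with n nodes is acyclic exactly when it has n - 1 edges. The function name is_tree and the sample graphs are assumptions chosen for illustration.

```python
def is_tree(nodes, edges):
    """Return True if the undirected graph (nodes, edges) is connected and acyclic."""
    nodes = set(nodes)
    # A tree on n nodes has exactly n - 1 edges.
    if not nodes or len(edges) != len(nodes) - 1:
        return False
    neighbors = {node: set() for node in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    # Depth-first traversal to test connectivity.
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for nxt in neighbors[current]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen == nodes


tree_result = is_tree("abcd", [("a", "b"), ("a", "c"), ("c", "d")])
cycle_result = is_tree("abc", [("a", "b"), ("b", "c"), ("c", "a")])
split_result = is_tree("abcd", [("a", "b"), ("b", "c"), ("c", "a")])
```

The first graph is a tree, the second contains a cycle, and the third has the right number of edges but leaves node "d" disconnected.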
3. Types of trees
The most popular structures used for information representation and storage are binary and n-ary trees. A binary tree is a tree structure in which each node has at most two children. Similarly, an n-ary tree is a tree in which each node has at most n children. Several kinds of binary trees are used especially often in computer science: the full binary tree (each node other than the leaves has exactly two children), the perfect binary tree (a full tree in which all leaves are at the same level) and the complete binary tree (each level except possibly the last is completely filled, and all nodes of the last level are placed as far left as possible). Binary trees are often used to implement binary search and binary heaps (Mehta & Sartaj, 2004).
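The three kinds of binary tree described above can be told apart with a few short checks. The following Python sketch is one possible formulation. The Node class, the function names, and the sample trees are assumptions introduced for illustration, and height is counted in levels of nodes.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def is_full(node):
    """Every node has either zero or exactly two children."""
    if node is None:
        return True
    if (node.left is None) != (node.right is None):
        return False
    return is_full(node.left) and is_full(node.right)


def is_complete(root):
    """Level-order scan: once a gap appears, no further node may follow."""
    queue = deque([root])
    gap_seen = False
    while queue:
        node = queue.popleft()
        if node is None:
            gap_seen = True
        elif gap_seen:
            return False
        else:
            queue.append(node.left)
            queue.append(node.right)
    return True


def height(node):
    """Number of levels in the tree (0 for an empty tree)."""
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


def count(node):
    """Number of nodes in the tree."""
    return 0 if node is None else 1 + count(node.left) + count(node.right)


def is_perfect(root):
    """A perfect tree with h levels has exactly 2**h - 1 nodes."""
    return count(root) == 2 ** height(root) - 1


perfect = Node(1, Node(2), Node(3))
complete_only = Node(1, Node(2, Node(4)), Node(3))
full_only = Node(1, Node(2), Node(3, Node(4), Node(5)))
```

Here perfect satisfies all three checks, complete_only is complete but neither full nor perfect, and full_only is full but neither complete nor perfect.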
4. Expediency of tree usage
In databases, information is stored in structures resembling tables, and tables are connected by one-to-many relations, where one entity of one table may be related to many entities of another table. This relation resembles the structure of a tree node and its children. Trees are therefore frequently used for representing interconnections between data entities. Graphs other than trees would be a worse choice, for instance for describing data type attributes, because a tree is the minimal structure that possesses all the relations necessary for a hierarchical model.
Therefore, if general graphs were chosen as the basis of the hierarchy, redundant information would have to be stored and additional storage space would be consumed.
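The correspondence between a one-to-many relation and a tree node with its children can be sketched in Python. The example assumes a hypothetical table whose rows hold an identifier, the identifier of the parent row, and a name. The sample rows and the function names are invented for illustration and do not come from any particular database.

```python
from collections import defaultdict

# Hypothetical rows of the form (id, parent_id, name); the root has no parent.
rows = [
    (1, None, "Company"),
    (2, 1, "Engineering"),
    (3, 1, "Sales"),
    (4, 2, "Databases"),
]


def children_by_parent(rows):
    """Group row ids by parent id: the 'many' side of the one-to-many relation."""
    children = defaultdict(list)
    for row_id, parent_id, _name in rows:
        children[parent_id].append(row_id)
    return children


def render(rows):
    """Return the hierarchy as indented lines, one line per row."""
    names = {row_id: name for row_id, _parent_id, name in rows}
    children = children_by_parent(rows)
    lines = []

    def walk(row_id, depth):
        lines.append("  " * depth + names[row_id])
        for child in children[row_id]:
            walk(child, depth + 1)

    for root in children[None]:
        walk(root, 0)
    return lines


outline = render(rows)
```

Each row stores only a reference to its single parent, which is all that is needed to recover the whole hierarchy.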
The tree is thus a successful choice for representing and storing hierarchical data. Using trees makes it possible to reflect relationships between objects and information in a way similar to the relationships that exist in the real world.
The relational database model and the object-relational database model, the two data models most frequently used today, draw on the concept of the tree through their one-to-many relations. Different implementations of this construct help to customize data representation.