This volume is the first in a series which aims to contribute to the wider dissemination of the results of research and development in database systems for non-traditional applications and non-traditional machine organizations. It contains updated versions of selected papers from the First International Symposium on Database Systems for Advanced Applications.
https://doi.org/10.1142/9789814503686_fmatter
The following sections are included:
https://doi.org/10.1142/9789814503686_0001
A pattern-based knowledge induction approach is developed for discovering the inferential relationships and correlations between data objects from databases. This approach is characterized by providing quantitative measurement for approximate knowledge and supporting knowledge abstraction.
https://doi.org/10.1142/9789814503686_0002
A new concept of beliefs, called an interactive extension, derivable from two interconnected default reasoning systems is proposed, and its properties are discussed. Default reasoning, as proposed by R. Reiter, is suitable for reasoning with incomplete knowledge in artificial intelligence, logic programming and deductive databases, because it can draw plausible conclusions from incomplete axioms. Since these conclusions can be invalidated when the partial world description is supplemented by new information, the logic based on this reasoning is nonmonotonic. When two default reasoning systems are interconnected and exchange their beliefs with each other, each system eventually acquires a set of beliefs through default reasoning over its original beliefs and the exchanged beliefs. A pair of acceptable belief sets, one for each system, is called an interactive extension of the systems. When a large-scale default reasoning system (deductive database) is constructed by consolidating many small-scale default reasoning systems, the beliefs of each system grow through default reasoning and through interactions (communication) with the other systems. The proposed interactive extension of default systems is an indispensable concept for analyzing the properties of the set of final beliefs acquired by each system.
https://doi.org/10.1142/9789814503686_0003
In this paper we present a parallel processing model for the evaluation of recursive queries in deductive databases. The model is based on what we call the reduced rule dependency graph. For each group of mutually recursive predicates and each possible query mode on those predicates, a process class to compute the answer to the query mode is created by a compiler. Process instantiations are created dynamically. In this model three kinds of parallelism are exploited: and-parallelism, pipeline-parallelism and loop-parallelism. Or-parallelism is obtained implicitly by the set-oriented treatment of processes. Because the processes are responsible for a rather elaborate task, the overhead is significantly less than in the models proposed so far for Horn logic programs. The model also allows optimization techniques to be incorporated as appropriate in a concrete situation.
https://doi.org/10.1142/9789814503686_0004
An asynchronous chain recursion is a recursion whose compiled formula consists of a single chain or of asynchronous chains. We study the compilation and evaluation of asynchronous chain recursions and show that many complex function-free recursions, which may contain single or multiple linear recursive rules, nonlinear recursive rules, mutually recursive rules and multiple-level recursions, can be compiled to asynchronous chain recursions. Our study of compilation methods, simplification of compiled formulas and query processing techniques shows that asynchronous chain recursions can be compiled to relatively simple compiled formulas and processed efficiently using transitive closure query evaluation techniques.
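The transitive closure evaluation techniques mentioned above are not spelled out in the abstract; a minimal sketch of one standard such technique, semi-naive evaluation, is shown below (the function name and set-of-pairs representation are illustrative, not from the paper):

```python
def transitive_closure(edges):
    """Semi-naive evaluation of a transitive closure: on each iteration,
    only the newly derived tuples (the 'delta') are joined against the
    base relation, avoiding rederivation of known facts."""
    closure = set(edges)
    delta = set(edges)
    while delta:
        # Join the delta with the base edge relation to derive new pairs.
        new = {(a, c) for (a, b) in delta for (b2, c) in edges if b == b2}
        delta = new - closure
        closure |= delta
    return closure
```

A compiled chain formula can then be processed by applying such a closure operator to each chain, rather than by general fixpoint iteration over the original rules.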
https://doi.org/10.1142/9789814503686_0005
Object-oriented database schemas are dynamically modified because object-oriented database applications are dynamically evolving. As schema definitions are modified, existing methods written for previous schemas may no longer be valid. It is important to be able to achieve compatibility of methods in the face of changing schemas. In this paper, we analyze the effects of schema changes on object-oriented database methods in order to arrive at a reasonable framework for method conversion.
https://doi.org/10.1142/9789814503686_0006
This paper presents NICE-OS, the object server of the system NICE-C++, an environment for persistent programming in the C++ language. NICE-OS has four main goals: persistent object management, multi-device storage, efficient access, and a multi-user environment. The object server implements the logical independence of the NICE-C++ persistency model and can also be used to support other object programming models. NICE-OS is, on the one hand, an object manager and, on the other, a set of tools for persistent programming in the C++ language.
https://doi.org/10.1142/9789814503686_0007
As the applications of DB/DC (Data Base/Data Communication) systems expand, DB/DC systems are growing very rapidly in terms of size and functionality. Distributed database and composite sub-system management facilities have been developed in XDM,1) the strategic DB/DC system program product of Hitachi, Ltd., to meet recent users' requirements. As a result of these functional extensions of XDM, it is possible to construct a large-scale distributed DB/DC system with high performance and reliability.
https://doi.org/10.1142/9789814503686_0008
Constraint validation, an essential feature of any database system, is the process of guaranteeing and maintaining a set of semantic invariants across database state transitions. This process has been studied by many authors and various simplification methods have been proposed. This survey describes recent developments in the field of integrity constraint checking methods. We first give a review of the different characteristics of methods for enforcing integrity constraints, followed by a survey of different methods to perform and simplify integrity checking in relational databases, and the corresponding ways to express integrity constraints.
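The simplification methods surveyed above typically exploit the fact that the constraint held before the update, so only the affected tuples need rechecking. A minimal sketch of that idea for an inclusion dependency (the relation and attribute names are hypothetical):

```python
def check_inclusion_on_insert(inserted_emps, dept_ids):
    """Simplified checking of the constraint EMP.dept ⊆ DEPT.id.
    Since the constraint held before the update and DEPT is unchanged,
    an insertion into EMP only requires testing the inserted tuples,
    not rescanning the whole EMP relation.
    Returns the violating tuples (empty list = constraint preserved)."""
    return [e for e in inserted_emps if e["dept"] not in dept_ids]
```

The full constraint is quantified over all of EMP; the simplified check touches only the delta, which is the essence of the methods the survey compares.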
https://doi.org/10.1142/9789814503686_0009
In design database systems, design objects as well as various kinds of binary relationships should be handled. In order to support the cooperation of many designers, flexible views are essential. The views required by designers, however, cannot be defined by subgraphs, because composite relationships that are not explicitly expressed must be considered. As conventional relational algebra lacks the capability of handling recursive operations, using a powerful language like Datalog or Prolog is a very common approach. Since the use of such a language results in systems with poor efficiency, we take the approach of utilizing labeled directed graphs and regular expressions. Query processing on views can be achieved by conversion of regular expressions. As regular expressions are closed under the most common operations and there are efficient algorithms to manipulate them, view handling procedures based on regular expressions are efficient while keeping enough expressive power.
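The paper's own conversion algorithm is not given in the abstract, but the underlying idea of evaluating a regular expression over a labeled directed graph can be sketched by composing per-label edge relations; the label `part_of` below is a hypothetical example of a composite relationship:

```python
def label_relation(graph, label):
    """Binary relation of node pairs joined by one edge with this label.
    `graph` is a set of (source, label, target) triples."""
    return {(u, v) for (u, lbl, v) in graph if lbl == label}

def compose(r, s):
    """Relational composition: concatenation of two regex factors."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def star(r):
    """Reflexive-transitive closure over the nodes occurring in r
    (the Kleene star of a label relation)."""
    nodes = {n for pair in r for n in pair}
    result = {(n, n) for n in nodes} | set(r)
    new = result
    while new:
        new = compose(result, r) - result
        result |= new
    return result
```

Under this reading, a view path expression such as `part_of part_of*` (one or more `part_of` hops) is evaluated as `compose(label_relation(g, 'part_of'), star(label_relation(g, 'part_of')))`.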
https://doi.org/10.1142/9789814503686_0010
Before database systems are suitable to support new application areas such as CAD/CAM or CASE, they have to consider the special requirements of these environments. The main characteristics are the long duration of a design process, group-oriented division of areas of responsibility, and cooperative work between and within these groups. Due to the complexity, the group orientation, the long duration of design processes and the enormous number of objects involved, a nested hierarchical transaction model seems to be absolutely necessary. In this paper we present such a concept. The suggested transaction model supports several levels (transaction types) which, among other things, allow one to naturally reproduce the hierarchy within a design process. On the one hand, a design team has to be protected from the outside world. This is done with the help of a so-called database transaction, which corresponds to a conventional transaction. On the other hand, mechanisms are provided which not only permit expressing the hierarchy within a project but also support controlled cooperative work on common objects. For these purposes, group transactions and user transactions are introduced. Group transactions are especially tailored to the cooperative kind of work of a group. The real development process, however, takes place in user transactions, which are installed as children of group transactions. Both kinds of transactions avoid the rigid serializability criterion of conventional database systems; instead, they offer some flexibility in working cooperatively on and with data. A first prototype of this transaction model was implemented on a VAX on top of the ORACLE relational database system.
https://doi.org/10.1142/9789814503686_0011
An approach and mechanism to support the seamless interconnection of federated database systems is described. In the context of autonomous but interconnected database systems, it is necessary to accommodate the efficient sharing and transmission of information across database system boundaries. Towards achieving the goal of seamless interconnection, it is desirable to insulate a given database system from unnecessary detail of remote database systems, allowing users to utilize local tools for database management. A core object database model is used as a framework in which to provide techniques to support the sharing of information units (objects) at various levels of abstraction as well as the sharing of behavior.
https://doi.org/10.1142/9789814503686_0012
This paper presents a novel mechanism which can be used to support the optimistic approach to the network partitioning problem in distributed database systems. Our protocol detects the partitioning point effectively by checking the status of the network only when necessary, and system reconfiguration is performed in a highly reliable way. Its correctness in providing the necessary change points for the partition operation is shown.
https://doi.org/10.1142/9789814503686_0013
Real-time database systems support applications that have severe operational constraints such as transaction deadlines and continued operation in the face of failures. In designing real-time database systems, there are two approaches to meeting those constraints. The first approach is to redesign the policies for scheduling transaction accesses to system resources, which is the primary determinant of the real-time performance of the database system. The second approach is to trade off desired features, such as serializability, or to exploit semantic information about transactions and replicated data for high performance and reliability. In this paper, we discuss the issues involved in these approaches, and present algorithms to solve problems in real-time database systems.
https://doi.org/10.1142/9789814503686_0014
Real-time systems must maintain data consistency while minimizing the number of tasks that miss their deadlines. To satisfy both the consistency and real-time constraints, synchronization protocols need to be integrated with real-time priority scheduling protocols. In this paper, we address the problem of priority scheduling in real-time database systems. We first present a prototyping environment for investigating distributed software. Specific priority-based real-time locking protocols are then discussed, together with a performance study which illustrates the use of the prototyping environment for evaluating synchronization protocols for real-time database systems.
https://doi.org/10.1142/9789814503686_0015
A world model for electronic secretaries is proposed to manage office procedures. Networks in a world hierarchy are used to store all past cases in addition to current activities and future plans. The model allows the user to modify the schema of the network whenever the office procedure changes. Moreover, the model can also learn the schema from actual cases.
https://doi.org/10.1142/9789814503686_0016
Software AG of Far East (SAGFE) has established various system environments peculiar to Japanese use since Japanese Kanji was first supported on ADABAS in 1978.
Support for the Japanese language, which started by printing character strings on a special Kanji printer, has been improving with the development of terminal equipment and controllers.
This paper describes the progress of Japanese-language support at SAGFE and Software AG (SAG) Germany, as well as a study on Korean Hangul done by SAG and Penta Computer Korea in addition to Japanese Kanji. Additionally, the method called DBCS (double-byte character set) support is proposed. We discuss the remaining problems and solve them by adapting the fourth-generation language NATURAL, a SAG product.
https://doi.org/10.1142/9789814503686_0017
One of the important issues of statistical database management is to define a data model and language for modelling and manipulating complex statistical summaries in the database. This paper proposes a matrix relational model and language for statistical database management. The matrix relational data structure combines the relation and matrix organizations. It offers a natural and densely built-up representation of complex statistical summaries, and it proposes new objects and constructors for statistical database organization. The matrix relational language is an algebraic query language, using an extended relational algebra.
https://doi.org/10.1142/9789814503686_0018
This paper presents a new scheme for the design and implementation of a functional data model on top of a conventional relational DBMS. The motivation comes from the need for improved tools and techniques for the specification, design and implementation of chemical structure databases. The scheme provides a high-level user interface integrated into the programming language LISP. An extended relational DBMS with capabilities for handling ADTs (Abstract Data Types) is developed. A functional data model, called the TIME data model, is designed and implemented as an external model of the extended relational DBMS. The scheme supports behavioral abstraction as well as structural abstraction. Implementation of some features has been completed in UNIX environments, and these features are under experimental evaluation.
https://doi.org/10.1142/9789814503686_0019
This paper describes the design and implementation of a visual query language, HistoryChart, for historical databases. The underlying data model for HistoryChart is a tuple-sequence data model, where each tuple allows set-valued attributes and contains a time-stamp attribute. A history of an object is represented by a sequence of these tuples. The major characteristics of the language HistoryChart are: (a) each historical (temporal) query is represented by a combination of visual symbolic patterns like time charts, (b) generalization hierarchies for attribute names and attribute values are supported for specifying queries and displaying the retrieval results, and (c) the outputs of queries are also sequences of tuples, and are displayed in a visual form.
https://doi.org/10.1142/9789814503686_0020
A very significant goal for multi-media database systems (MMDBS) is the integrated management of unstructured data such as drawings, images, text and voice, as well as formatted or structured data. An MMDBS is therefore required to harmonize multi-media data storage and processing capabilities. Traditional data management technology is based on the storage and retrieval of structured numeric and character data, and the current technological environment is not sufficient for implementing practical application systems that store and process unstructured data. In order to overcome the problems involved in implementing an MMDBS, this paper proposes a new multi-media data processing language, SL/B5, based on screen flow. An SL/B5 program is represented by a collection of screens which are connected through what we have termed a “screen call”. The specification of each screen is divided into four parts: IMAGE for formatting multi-media data on a display terminal with speakers, FUNCTION for data entry and transformation, DATABASE for database access, and CALL for conditional calls of other screens with parameters. The database is a collection of 5-tuples (value, data type, attribute, entity, term of validity) which can include multimedia objects, meta-data, user information, and security and integrity specifications. This special database structure is called the B5 structure. SL/B5 provides powerful, flexible and friendly multi-media data processing capabilities based on the flexible database structure B5, and it was specially designed to support the development of information systems by naive end-users in harmony with B5.
https://doi.org/10.1142/9789814503686_0021
New computer technologies and instruments are widely used, and it is time to make efforts to shape the computer world to the user's world. One such effort is the improvement of programming languages. In the area of database systems, some fourth-generation languages have adopted visual representations in order to improve their user interfaces. These are, however, not sufficient for naive users because of their poor capability for describing procedural programs and their difficulty of understanding. This paper proposes a new visual language named MOL, which can be used by non-programmers to develop application systems. The structure of the database is represented as a special diagram, referred to as a “map” in this paper. On the map there are information centers, each of which manages a collection of data. Two information centers are connected by an avenue when there is a certain relationship between their respective data. A program is represented as a trail on the map. In order to access data with MOL, a user orders a messenger to trace avenues, visit information centers, collect data, fill a reporting sheet with the data and bring it back. The collected data can be processed on a spreadsheet. By using MOL, the user interface can be made friendlier for non-programmers than with other fourth-generation languages, without losing the power of procedural languages.
https://doi.org/10.1142/9789814503686_0022
Some ideas on managing visual objects and their viewing environment are proposed as navigational aids for multimedia databases. The “Image Base Navigator” described in this paper is an organized browsing environment in which we can effectively see and examine a complex of visual objects, and in which we can find associative and related objects with little concern for query procedures. A new information-complex modeling technology and a new information-world viewing model are introduced, and prototype examples of the navigational environment are given. This environment is implemented on an object-oriented data modeling basis.
https://doi.org/10.1142/9789814503686_0023
This paper surveys recent research involving the retrieval of symbolic pictures from a large database. The symbolic pictures are composed of various types of icons that may represent either physical objects or the feature points of these objects in the original image. The queries to the database may involve the use of the operators <, =, and > to compare the X and Y coordinates of the icons. S. K. Chang [1] has defined several versions of the spatial match query problem, depending on exactly how the patterns are intended to match the subject pictures:
a. Type 2 - Does there exist an exact match which preserves distances?
b. Type 1 - Does there exist a match if rows and columns of the subject can be deleted?
c. Type 0 - Does there exist a match if rows and columns of the subject may be merged together?
Although the type 0 and type 1 problems generally have an exponential number of matches in the worst case, the above existential queries can usually be quickly answered by using a variety of techniques. We will survey some of these techniques in this paper.
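The match types above all reduce to comparing the X and Y coordinate orderings of same-named icons. A minimal brute-force sketch of the existential check for an order-preserving (type-1 style) match follows; the icon names and the `(name, x, y)` tuple encoding are illustrative, not the surveyed data structures:

```python
from itertools import permutations

def order(a, b):
    """Three-way comparison: -1, 0 or 1, mirroring the <, =, > operators."""
    return (a > b) - (a < b)

def spatial_match(pattern, subject):
    """Existential check: can the pattern icons be assigned to same-named
    subject icons so that every pairwise x- and y-ordering is preserved?
    Only relative order matters, not distances (so not an exact/type-2 match).
    Icons are (name, x, y) tuples; brute force over assignments."""
    names = [p[0] for p in pattern]
    for assign in permutations(subject, len(pattern)):
        if any(a[0] != n for a, n in zip(assign, names)):
            continue
        if all(order(pattern[i][1], pattern[j][1]) == order(assign[i][1], assign[j][1])
               and order(pattern[i][2], pattern[j][2]) == order(assign[i][2], assign[j][2])
               for i in range(len(pattern)) for j in range(i + 1, len(pattern))):
            return True
    return False
```

The techniques surveyed in the paper answer such existential queries far more efficiently than this enumeration, e.g. via 2D-string indexing.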
https://doi.org/10.1142/9789814503686_0024
A multi-layered spatial data model allows a user to operate at a high level of abstraction, yet allows efficient processing and storage of spatial data with finite-precision machinery. At the level just below the application layer, lexical and spatial objects are organized into object sets which, in turn, are represented as nested relations. While users view curves, regions, and images as continuous entities, all actual operations on the data take place at lower levels on discretized versions of idealized objects. Taking generalized intersection (relational join) as a prototypical operation, the concepts of completeness, homogeneity, regularization, interval arithmetic, round-off estimation, rational arithmetic, and topological data structures are examined in the context of the correspondence between the continuous- and the discrete-model representations.
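Of the concepts listed above, rational arithmetic is the easiest to illustrate: computing a geometric predicate with exact rationals makes the discrete answer agree with the continuous model, where floating point could give the wrong sign. A small sketch (the function names are illustrative, not from the paper):

```python
from fractions import Fraction

def orient(a, b, c):
    """Cross product (b-a) x (c-a); its sign classifies c relative to
    line ab. Exact when the coordinates are Fractions."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def segments_cross(p1, p2, q1, q2):
    """Proper-intersection test for segments p1p2 and q1q2 using exact
    rational coordinates, so no round-off can flip an orientation sign."""
    p1, p2, q1, q2 = [tuple(map(Fraction, pt)) for pt in (p1, p2, q1, q2)]
    d1, d2 = orient(q1, q2, p1), orient(q1, q2, p2)
    d3, d4 = orient(p1, p2, q1), orient(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0
```

Interval arithmetic plays the complementary role in such a scheme: a cheap floating-point filter with error bounds, falling back to exact rationals only when the interval straddles zero.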
https://doi.org/10.1142/9789814503686_0025
In this paper, we will discuss processing methods for set constraint queries. Set constraint queries produce set-valued solutions, and their conditions are given as “constraints,” which describe the solution sets using first-order quantified predicates on their member tuples. We shall examine a subclass of set constraint queries, called functional dependency (FD) set constraint queries. For a given set of FDs as query constraints, we will give a method to analyze query complexity using an FD-graph. We will show that certain classes of queries can be processed in time polynomial in the size of the database, and other classes are NP-complete. We also show that certain classes of NP-complete queries can be processed in polynomial time if certain data dependencies exist in the database.
https://doi.org/10.1142/9789814503686_0026
This paper introduces the OD-RETE pattern matching algorithm, which is obtained by incorporating an on-demand evaluation mechanism into the RETE algorithm and which is suitable for production systems that access large-scale data. It allows the construction of both a “data-driven” network that simulates an ordinary RETE network and a “request-driven” network that performs on-demand formula evaluation. The latter network can handle large data sets without loading the whole data into the network. This paper describes the OD-RETE algorithm by defining nodes as “objects” that exchange messages with each other, and discusses ways to utilize this algorithm with DBMSs.
https://doi.org/10.1142/9789814503686_0027
The average cost for answering partial-match queries can be dramatically reduced by storing multiple copies of the data, each with a different clustering. We analyse the cost benefits (in terms of disk accesses) of this arrangement and present algorithms for determining a minimal cost file organisation for a given probability distribution of queries. We also show how constraining the range of values for specific attributes affects the usefulness of maintaining multiple copies.
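The cost analysis described above can be illustrated with a simple multi-attribute hashing model: each copy allocates hash bits to attributes, a query must scan every page whose bits for its *unspecified* attributes vary, and each query is routed to its cheapest copy. This is a hedged sketch of the general idea, not the paper's actual cost model or algorithms:

```python
def query_cost(bits, specified):
    """Pages read for a partial-match query on one copy: 2 raised to the
    number of hash bits allocated to attributes the query leaves
    unspecified. `bits` maps attribute -> bit count for this copy."""
    return 2 ** sum(b for attr, b in bits.items() if attr not in specified)

def expected_cost(copies, workload):
    """Route each query to its cheapest copy and weight by probability.
    `workload` maps frozenset-of-specified-attributes -> probability."""
    return sum(p * min(query_cost(bits, q) for bits in copies)
               for q, p in workload.items())
```

For a workload split evenly between queries on `a` and queries on `b`, two copies each clustered fully on one attribute answer every query in one page, whereas any single copy must compromise; this is the benefit the paper quantifies.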
https://doi.org/10.1142/9789814503686_0028
In order to improve the two-level signature file scheme proposed by Sacks-Davis et al. [19], we propose a new multikey access scheme based on term discrimination: we create a separate, efficient access method for the terms frequently used in user queries. In addition, we cluster similar signatures by means of these terms so as to achieve good retrieval performance. We also provide a gross space-time analysis of our scheme and compare it with that of the two-level signature file scheme, showing that our scheme can achieve 20-30% savings in retrieval time. Finally, we propose a family of access methods constructed using our scheme.
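The signature files underlying both schemes are built by superimposed coding; a minimal sketch of that base technique follows (the hashing choice and parameters are illustrative, not those of either scheme):

```python
import hashlib

def term_signature(term, width=64, bits_per_term=3):
    """Superimposed coding: each term sets a few pseudo-randomly chosen
    bit positions in a fixed-width signature."""
    sig = 0
    for i in range(bits_per_term):
        h = hashlib.md5(f"{term}:{i}".encode()).digest()
        sig |= 1 << (int.from_bytes(h[:4], "big") % width)
    return sig

def block_signature(terms, width=64):
    """OR together the signatures of all terms in a text block."""
    sig = 0
    for t in terms:
        sig |= term_signature(t, width)
    return sig

def may_contain(block_sig, query_terms, width=64):
    """Signature test: every query-term bit must be set in the block
    signature. False positives are possible; false negatives are not,
    so matching blocks are never missed, only over-fetched."""
    q = block_signature(query_terms, width)
    return block_sig & q == q
```

Term discrimination, as proposed above, then pulls the high-frequency query terms out of this probabilistic filter into a dedicated access method, reducing the false-positive traffic for exactly the queries that occur most often.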
https://doi.org/10.1142/9789814503686_0029
The Functional Disk System with Relational database engine (FDS-R) is a relational storage system designed to accelerate relational algebraic operations. FDS-R employs a filtering and dynamic clustering mechanism as a special hardware function and provides intelligent data management and efficient data processing. The relation size that could be handled on the first prototype, however, was limited to the size of the staging buffer. We therefore built a second version of FDS-R, FDS-RII, designed to handle large relations efficiently. We have presented its processing strategy, the “Extended Task Cycle” for relational algebraic operations on FDS-RII, where the strategy is selected at run time from two algorithms (Nested Loop and Grace Hash) by comparing their estimated I/O costs.
In this paper, we present an overview of the hardware configuration and software system of FDS-RII. First we show the basic Task Cycle and give performance evaluation results using the original Wisconsin Benchmark for small relations, where the filtered relation can be staged in the staging buffer. FDS-RII attains higher performance than current database systems such as INGRES, Oracle and IDM. Next we explain the “Extended Task Cycle”, which was introduced to handle relations larger than the staging buffer. With an expanded version of the Wisconsin Benchmark, we measured the performance of FDS-RII. FDS-RII attained high performance for processing large relations compared to other large database systems such as Gamma and Teradata, even though FDS-RII uses just one disk and three MC68020s, whereas Teradata uses 40 disks and 20 AMPs and Gamma requires 8 disks and 17 VAX 11/750s.
https://doi.org/10.1142/9789814503686_0030
Large arrays of small disks are providing an attractive approach for high performance I/O systems. They allow for low-cost, reliable storage and can achieve higher throughput compared to large disks. However, in order to make effective use of the commercially available architectures, it is necessary to develop intelligent software tools that allow automatic tuning of the disk arrays to varying workloads. In this paper we describe an integrated set of algorithms and the implementation of a file manager for automatic file partitioning and allocation and for load balancing on disk arrays. Our approach consists of modular building blocks that can be invoked independently of each other; thus, algorithms for file allocation and disk load balancing can be used regardless of whether striping is employed or not. Our heuristic method for file partitioning aims to determine the optimal striping unit of a file, depending on the access characteristics of the file and the overall workload. Our heuristic method for file allocation and load balancing deals with dynamically changing access frequencies of files by reallocating file extents, thus “cooling down” hot disks. In addition, the method takes into account the fact that some files exhibit periodical access patterns, and considers explicitly the cost of performing the cooling operations.
https://doi.org/10.1142/9789814503686_0031
This paper presents an implementation method for a parallel query processing system using our proposed stream-oriented parallel processing scheme. The scheme is based on demand-driven evaluation incorporating stream processing, and is used to execute application-specific and/or user-defined database operations in parallel.
A primitive set has been designed as the basic operators for implementing this scheme. Specific database operations for each database application can be described and executed in parallel using these primitives. An implementation method for these primitives in environments where message passing is used for interprocessor communication is presented; this method realizes stream-oriented parallel processing via message passing. The paper also shows several experimental results of actual query processing in a parallel processing environment where multiple conventional processors are loosely coupled by a conventional high-speed network.
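Demand-driven stream processing of the kind described above maps naturally onto lazy generator pipelines; a single-process sketch of the idea (the operator names are illustrative, and the real system distributes these operators across processors via message passing):

```python
def scan(relation):
    """Stream source: yields tuples one at a time, only on demand."""
    yield from relation

def select(stream, pred):
    """Filter operator: pulls from upstream only when downstream asks."""
    return (t for t in stream if pred(t))

def project(stream, attrs):
    """Projection operator over dict-shaped tuples."""
    return ({a: t[a] for a in attrs} for t in stream)

# Chained operators form a pipeline; nothing executes until the
# result is consumed, which is the demand-driven property.
emp = [{"name": "kim", "age": 31}, {"name": "lee", "age": 25}]
result = list(project(select(scan(emp), lambda t: t["age"] > 30), ["name"]))
```

In the message-passing setting, each operator runs as a process and "demand" becomes a request message flowing upstream, with tuple batches flowing back downstream.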
https://doi.org/10.1142/9789814503686_bmatter
The following sections are included: