Imieliński–Lipski algebra

In database theory, Imieliński–Lipski algebra is an extension of relational algebra onto tables with different types of null values. It is used to operate on relations with incomplete information.

Imieliński–Lipski algebras are defined to satisfy precise conditions for semantically meaningful extension of the usual relational operators, such as projection, selection, union, and join, from operators on relations to operators on relations with various kinds of "null values".

These conditions require that the system be safe in the sense that no incorrect conclusion is derivable by using a specified subset F of the relational operators; and that it be complete in the sense that all valid conclusions expressible by relational expressions using operators in F are in fact derivable in this system. For example, it is well known that the three-valued logic approach to deal with null values, supported treatment of nulls values by SQL is not complete, see Ullman book.[1] To show this, let T be:

T=
NAME CLASS GRADE SEMESTER
Rohit Databases B Spring
Igor Networks A NULL

Take SQL query Q

SELECT NAME
FROM T
 WHERE (CLASS = 'Networks' AND SEMESTER = 'Spring') OR (GRADE = 'A' AND SEMESTER <> 'Spring')

SQL query Q will return empty set (no results) under 3-valued semantics currently adopted by all variants of SQL. This is the case because in SQL, NULL is never equal to any constant – in this case, neither to “Spring” nor “Fall” nor “Winter” (if there is Winter semester in this school). NULL='Spring' will evaluate to MAYBE and so will NULL='Fall'. The disjunction MAYBE OR MAYBE evaluates to MAYBE (not TRUE). Thus Igor will not be part of the answer (and of course neither will Rohit). But Igor should be returned as the answer.

Indeed, regardless what semester Igor took the Networks class (no matter what was the unknown value of NULL), the selection condition will be true. This “Igor” will be missed by SQL and the SQL answer would be incomplete according to completeness requirements specified in Tomasz Imieliński, Witold Lipski, 'Incomplete Information in Relational Databases'.[2] It is also argued there that 3-valued logic (TRUE, FALSE, MAYBE) can never provide guarantee of complete answer for tables with incomplete information.

Three algebras which satisfy conditions of safety and completeness are defined as Imielinski–Lipski algebras: the Codd-Tables algebra, the V-tables algebra and the Conditional tables (C-tables) algebra.

Codd-tables algebra

edit

Codd-tables algebra is based on the usual Codd's single NULL values. The table T above is an example of Codd-table. Codd-table algebra supports projection and positive selections only. It is also demonstrated in [IL84 that it is not possible to correctly extend more relational operators over Codd-Tables. For example, such basic operation as join is not extendable over Codd-tables. It is not possible to define selections with Boolean conditions involving negation and preserve completeness. For example, queries like the above query Q cannot be supported. In order to be able to extend more relational operators, more expressive form of null value representation is needed in tables which are called V-table.

V-tables algebra

edit

V-tables algebra is based on many different ("marked") null values or variables allowed to appear in a table. V-tables allow to show that a value may be unknown but the same for different tuples. For example, in the table below Gaurav and Igor order the same (but unknown) beer in two unknown bars (which may, or may not be different – but remain unknown). Gaurav and Jane frequent the same unknown bar (Y1). Thus, instead one NULL value, we use indexed variables, or Skolem constants .[3]

DRINKER BEER BAR
Zhihan Heineken Cabana
Gaurav X1 Y1
Igor X1 Y2
Jane Bud Y1
John X2 Y3

V-tables algebra is shown to correctly support projection, positive selection (with no negation occurring in the selection condition), union, and renaming of attributes, which allows for processing arbitrary conjunctive queries. A very desirable property enjoyed by the V-table algebra is that all relational operators on tables are performed in exactly the same way as in the case of the usual relations.

Conditional tables (c-tables) algebra

edit

Example of conditional table (c-table) is shown below.

NAME CLASS GRADE SEMESTER con
Rohit Databases B Spring true
Igor Networks A x x = 'Spring'
Igor Networks A x x <> 'Spring'

It has additional column “con” which is a Boolean condition involving variables, null values – same as in V-tables.

SELECT *
  FROM T
 WHERE (CLASS = 'Networks' AND SEMESTER = 'Spring') OR (GRADE = 'A' AND SEMESTER <> 'Spring')

over the following table c-table

T1=
NAME CLASS GRADE SEMESTER con
Rohit Databases B Spring true
Igor Networks A x true

Conditional tables algebra, mainly of theoretical interest, supports projection, selection, union, join, and renaming. Under closed-world assumption, it can also handle the operator of difference, thus it can support all relational operators.

History

edit

Imieliński–Lipski algebras were introduced by Tomasz Imieliński and Witold Lipski Jr. in Incomplete Information in Relational Databases.[2]

References

edit
  1. ^ J.D. Ullman (1982). Principles of Database Systems (2nd ed.). Computer Science Press, Potomac, MD. Bibcode:1981pds..book.....U.
  2. ^ a b Imieliński, T.; Lipski Jr., W. (1984). "Incomplete information in relational databases". Journal of the ACM. 31 (4): 761–791. doi:10.1145/1634.1886. S2CID 288040.
  3. ^ Skolem constants

Further reading

edit