8 Extended Database Concepts

Lecture: Databases (Vorlesung Datenbanken), winter semester 2012/13
DOOD stands for
• deductive and
• object–oriented databases.
DOOD databases offer advanced features for
• data modelling and
• database programming
for complex data structures.
Prof. Dr. Dietmar Seipel
8.1 Deductive Databases and Logic Programming
The ease of handling the data structure of terms and the powerful built–in
control structure of backtracking are features that distinguish PROLOG from
other programming languages.
PROLOG is very well–suited for embedded database programming.
In the database context, frequently a restricted version is used, which is
called DATALOG – the basis of deductive databases.
• PROLOG and DATALOG are declarative languages; they can access
databases and XML documents.
• Relations and complex objects (e.g., XML documents) can be
represented as term structures.
• With the help of declarative rules, we can represent integrity constraints
and inference rules for deriving conclusions from given information.
8.1.1 PROLOG as a Database Language
1. PROLOG can be used for representing tables from relational databases.
The tuples of a table become PROLOG facts with the same predicate
symbol – usually, the table name is used.
2. The data dictionary of a relational database can also be represented
using PROLOG facts. This can be done using PROLOG terms that
correspond to an XML representation of the data dictionary.
3. Queries and integrity constraints can be represented as PROLOG rules.
Conjunctive queries are posed in the form of PROLOG goals, which are
then evaluated using the PROLOG rules.
4. DATALOG is a restricted version of PROLOG, which ensures termination
and the efficient evaluation of recursive queries.
5. The deductive database system DDBASE combines PROLOG and DATALOG.
Database Tables in MySQL:

EMPLOYEE
FNAME     MINIT  LNAME    SSN        BDATE       ADDRESS                   SEX  SALARY  SUPERSSN   DNO
John      B      Smith    444444444  1955-01-09  731 Fondren, Houston, TX  M    30000   222222222  5
Franklin  T      Wong     222222222  1945-12-08  638 Voss, Houston, TX     M    40000   111111111  5
Alicia    J      Zelaya   777777777  1958-07-19  3321 Castle, Spring, TX   F    25000   333333333  4
Jennifer  S      Wallace  333333333  1931-06-20  291 Berry, Bellaire, TX   F    43000   111111111  4
Ramesh    K      Narayan  555555555  1952-09-15  975 Fire Oak, Humble, TX  M    38000   222222222  5
Joyce     A      English  666666666  1962-07-31  5631 Rice, Houston, TX    F    25000   222222222  5
Ahmad     V      Jabbar   888888888  1959-03-29  980 Dallas, Houston, TX   M    25000   333333333  4
James     E      Borg     111111111  1927-11-10  450 Stone, Houston, TX    M    55000   NULL       1
A database table p can be represented by a set of PROLOG facts, namely one
fact p(t1, ..., tn) for each tuple (t1, ..., tn) of the table.
WORKS_ON
ESSN       PNO  HOURS
111111111  20   NULL
222222222  2    10.0
222222222  3    10.0
333333333  20   15.0
333333333  30   20.0
444444444  1    32.5
444444444  2    7.5
555555555  3    40.0
666666666  1    20.0
666666666  2    20.0
777777777  10   10.0
777777777  30   30.0
888888888  10   35.5
888888888  30   5.0

PROJECT
PNAME            PNUMBER  PLOCATION  DNUM
ProductX         1        Bellaire   5
ProductY         2        Sugarland  5
ProductZ         3        Houston    5
Computerization  10       Stafford   4
Reorganization   20       Houston    1
Newbenefits      30       Stafford   4
Database Tables in PROLOG:
employee('John', 'B', 'Smith', 444444444,
    1955-01-09, '731 Fondren, Houston, TX',
    'M', 30000, 222222222, 5).
employee('Franklin', 'T', 'Wong', ...).
...
works_on(444444444, 1, 32.5).
works_on(444444444, 2, 7.5).
...
department('Research', 5, 222222222, 1978-05-22).
...
project('ProductX', 1, 'Bellaire', 5).
...
We do not quote the date values. Then, they are terms, and we can access
their components more conveniently without string parsing.
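As a minimal sketch of this point, the following standalone program shows how the components of an unquoted date can be reached by unification alone. The predicate born/2 and the helper names are hypothetical and only serve the illustration; they are not part of the COMPANY schema or of the DDK.

```prolog
% Hypothetical helper facts: last name and birth date (unquoted date term).
born('Smith', 1955-01-09).
born('Wong',  1945-12-08).

% birth_year(?Name, ?Year): the pattern Year-_-_ unifies with the
% date term, which has the prefix form -(-(Year, Month), Day).
birth_year(Name, Year) :-
    born(Name, Year-_Month-_Day).

% born_before(?Name, +Limit): persons born before the year Limit.
born_before(Name, Limit) :-
    birth_year(Name, Year),
    Year < Limit.
```

A goal such as ?- born_before(Name, 1950). then selects by year without any string parsing.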
Export from MySQL to XML
Using the PROLOG library DDK, we can also export a MySQL database or
table to XML:
?- mysql_database_to_xml(mysql, company, Xml),
dwrite(xml, Xml).
<database name="company">
<table name="employee"> ... </table>
<table name="works_on">
<row ESSN="111111111" PNO="20" HOURS="0.0"/>
<row ESSN="222222222" PNO="2" HOURS="10.0"/> ...
</table> ...
</database>
?- mysql_database_table_to_xml(
mysql, company:employee, Xml).
Data Dictionary
Using the PROLOG library DDK, we can export the data dictionary of a
relational database from MySQL to an XML representation:
?- mysql_database_schema_to_xml(company, Xml),
    dwrite(xml, Xml).
This produces an XML element with one table sub–element for every table:
<database name="company">
<table name="department"> ... </table>
<table name="employee"> ... </table>
<table name="dependent"> ... </table>
<table name="dept_locations"> ... </table>
<table name="project"> ... </table>
<table name="works_on"> ... </table>
</database>
<table name="employee">
<attribute name="FNAME" type="varchar(15)" is_nullable="NO"/>
<attribute name="MINIT" type="char(1)" is_nullable="YES"/>
<attribute name="LNAME" type="varchar(15)" is_nullable="NO"/>
<attribute name="SSN" type="varchar(9)" is_nullable="NO"/>
<attribute name="BDATE" type="date" is_nullable="YES"/>
<attribute name="ADDRESS" type="varchar(30)" is_nullable="YES"/>
<attribute name="SEX" type="char(1)" is_nullable="YES"/>
<attribute name="SALARY" type="decimal(10,2)" is_nullable="YES"/>
<attribute name="SUPERSSN" type="varchar(9)" is_nullable="YES"/>
<attribute name="DNO" type="int(11)" is_nullable="NO"/>
<primary_key> <attribute name="SSN"/> </primary_key>
<foreign_key> <attribute name="SUPERSSN"/>
<references table="employee">
<attribute name="SSN"/> </references> </foreign_key>
<foreign_key> <attribute name="DNO"/>
<references table="department"> <attribute name="DNUMBER"/>
</references> </foreign_key>
</table>
Data Dictionary as a PROLOG Term
table:[name:employee]:[
    attribute:[name:'FNAME',
        type:'varchar(15)', is_nullable:'NO']:[],
    attribute:[name:'MINIT', ...]:[],
    attribute:[name:'LNAME', ...]:[],
    attribute:[name:'SSN', ...]:[],
    ...
    attribute:[name:'SUPERSSN', ...]:[],
    attribute:[name:'DNO', ...]:[],
    primary_key:[ attribute:[name:'SSN']:[] ],
    ...
    foreign_key:[ attribute:[name:'DNO']:[],
        references:[table:'department']:[
            attribute:[name:'DNUMBER']:[] ] ] ]
This PROLOG representation of XML can be queried and transformed using
the DDK library FNQuery.
A foreign key
foreign_key:[
attribute:[name:A1]:[], ..., attribute:[name:An]:[],
references:[table:T]:[
attribute:[name:B1]:[], ..., attribute:[name:Bn]:[] ] ]
can be represented in short form as
[A1,...,An] -> T:[B1,...,Bn].
Then, the list of all foreign keys becomes a PROLOG term
foreign_keys:[fk1,...,fkm].
Similarly, the list of attributes and the primary key can be simplified to a
short form.
DDBASE stores a PROLOG fact schema(table:[...]:[...]) with
the simplified term representation for every database table.
Data Dictionary as PROLOG Facts (Short Form)
schema( table:[name:employee, database:company]:[
    attributes:['FNAME', 'MINIT', 'LNAME', 'SSN', 'BDATE',
        'ADDRESS', 'SEX', 'SALARY', 'SUPERSSN', 'DNO'],
    primary_key:['SSN'],
    foreign_keys:[ ['SUPERSSN']->employee:['SSN'],
        ['DNO']->department:['DNUMBER'] ] ] ).
schema( table:[name:works_on, database:company]:[
    attributes:['ESSN', 'PNO', 'HOURS'],
    primary_key:['ESSN', 'PNO'],
    foreign_keys:[ ['ESSN']->employee:['SSN'],
        ['PNO']->project:['PNUMBER'] ] ] ).
schema( table:[name:department, ...]:[...] ).
schema( table:[name:project, ...]:[...] ).
Views and Queries vs. Rules and Goals
• SQL VIEW:
CREATE VIEW QUERY_1 AS
    SELECT LNAME, PNAME, HOURS
    FROM EMPLOYEE, WORKS_ON, PROJECT
    WHERE EMPLOYEE.SSN = WORKS_ON.ESSN
    AND PROJECT.PNUMBER = WORKS_ON.PNO
• PROLOG rule:
query_1(LNAME, PNAME, HOURS) :-
    employee(_,_, LNAME, SSN, _,_,_,_,_,_),
    project(PNAME, P, _,_),
    works_on(SSN, P, HOURS).
• SQL SELECT:
SELECT *
FROM QUERY_1
The SELECT statement calls the view.
• PROLOG goal:
?- query_1(LNAME, PNAME, HOURS).
The query is submitted to the PROLOG interpreter as a goal.
The goal corresponds to the SELECT statement calling the view.
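To try out the correspondence between view and rule, the rule can be loaded together with a few of the facts from the tables above. The following is a small self-contained sketch; only two employees and two projects are included, so the answers cover just that excerpt.

```prolog
% Excerpt of the COMPANY database as facts.
employee('John', 'B', 'Smith', 444444444, 1955-01-09,
         '731 Fondren, Houston, TX', 'M', 30000, 222222222, 5).
employee('Franklin', 'T', 'Wong', 222222222, 1945-12-08,
         '638 Voss, Houston, TX', 'M', 40000, 111111111, 5).

works_on(444444444, 1, 32.5).
works_on(444444444, 2, 7.5).
works_on(222222222, 2, 10.0).

project('ProductX', 1, 'Bellaire', 5).
project('ProductY', 2, 'Sugarland', 5).

% The rule plays the role of the view QUERY_1: the shared variables
% SSN and P express the two join conditions.
query_1(LNAME, PNAME, HOURS) :-
    employee(_,_, LNAME, SSN, _,_,_,_,_,_),
    project(PNAME, P, _,_),
    works_on(SSN, P, HOURS).
```

The goal ?- query_1(LNAME, PNAME, HOURS). enumerates the answers one by one on backtracking; findall/3 collects them into a list, which is closer to the set-oriented result of the SELECT statement.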
Recursive Queries: Transitive Closure (Version 1)
The following recursive rule set derives the transitive supervisor relation on
the social security numbers:
supervisor(SSN_1, SSN_2) :-
    direct_supervisor(SSN_1, SSN_2).
supervisor(SSN_1, SSN_2) :-
    direct_supervisor(SSN_1, SSN_3),
    supervisor(SSN_3, SSN_2).
[Figure: SSN_1 --direct_supervisor--> SSN_3 --supervisor--> SSN_2]
direct_supervisor(SSN_1, SSN_2) :-
    employee(_, _, _, SSN_2, _, _, _, _, SSN_1, _).
The following query assigns names to the social security numbers:
query_2(F1-M1-L1, F2-M2-L2) :-
    supervisor(SSN_1, SSN_2),
    employee(F1, M1, L1, SSN_1, _, _, _, _, _, _),
    employee(F2, M2, L2, SSN_2, _, _, _, _, _, _).
Transitive closure queries cannot be formulated in SQL without recursion.
Some relational database systems, however, offer limited forms of
recursion – cf. SQL–99.
CREATE RECURSIVE VIEW supervisor(Emp, Sup) AS
    SELECT Emp, Sup
    FROM direct_supervisor
    UNION
    SELECT D.Emp, S.Sup
    FROM direct_supervisor D, supervisor S
    WHERE D.Sup = S.Emp
This assumes a table direct_supervisor with the attributes Emp
and Sup. This SQL implementation is structurally equivalent to
the following shorter rule implementation (";" means "or").
supervisor(Emp, Sup) :-
    ( direct_supervisor(Emp, Sup)
    ; direct_supervisor(Emp, X), supervisor(X, Sup) ).
Further Applications of Recursion
• computation of aggregate functions
• parts list resolution (bill of materials)
Meta–Predicates: Transitive Closure (Version 2)
Using the generic meta–predicate transitive_closure, the previous
two rules for supervisor can be replaced by a single and much more
compact and abstract rule:
supervisor(SSN_1, SSN_2) :-
    transitive_closure(
        direct_supervisor, SSN_1, SSN_2 ).
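The implementation of transitive_closure/3 in the DDK is not shown in the lecture. As a rough sketch of how such a meta–predicate could be written in plain PROLOG, the following hypothetical predicate closure/3 calls the given binary predicate via call/3 and keeps a list of visited nodes, so that the search also terminates on cyclic graphs. The names closure/3, reach/4 and edge/2 are assumptions made for this example.

```prolog
% Sample edge facts, including the cycle a -> b -> c -> a.
edge(a, b).
edge(b, c).
edge(c, a).
edge(c, d).

% closure(+Pred, ?X, ?Y): Y is reachable from X by one or more Pred steps.
closure(Pred, X, Y) :-
    call(Pred, X, Z),
    reach(Pred, Z, [Z], Y).

% reach(+Pred, +Node, +Visited, ?Y): Y is Node itself or reachable from
% Node; the list Visited keeps a path from entering a node twice.
reach(_, Y, _, Y).
reach(Pred, Z, Visited, Y) :-
    call(Pred, Z, W),
    \+ memberchk(W, Visited),
    reach(Pred, W, [W|Visited], Y).
```

With this definition, a rule like supervisor(X, Y) :- closure(direct_supervisor, X, Y). would have the same shape as the rule above. Answers can be produced more than once if several paths lead to the same node; setof/3 removes such duplicates.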
Aggregation Queries
The meta–predicate ddbase_aggregate/3 in the following query
groups over the employees:
• for every employee – given by FNAME, MINIT, LNAME, SSN –
• the corresponding list of all tuples [PNO,HOURS] is computed:
?- ddbase_aggregate( [F, M, L, S, list([P,H])],
       ( works_on(S, P, H),
         employee(F, M, L, S, _,_,_,_,_,_) ),
       Tuples ),
   Attributes =
       ['FNAME','MINIT','LNAME','SSN','[PNO,HOURS]'],
   xpce_display_table(Attributes, Tuples).
The result is displayed as a table in the XPCE extension of SWI-PROLOG.
Tuples = [
    ['Ahmad', 'V', 'Jabbar', '888888888', [[10, 35.5], [30, 5.0]]],
    ['Alicia', 'J', 'Zelaya', '777777777', [[10, 10.0], [30, 30.0]]],
    ... ]
Thus, DDBASE can produce nested (NF2) tables, which is not possible
in SQL.
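A similar grouping can be sketched with the standard predicate bagof/3, which groups over the variables that occur free in the goal. This is not the implementation of ddbase_aggregate/3, only an illustration of the idea; the predicate names projects_of/2 and nested_tuples/1 are hypothetical, and only the WORKS_ON rows of two employees are used.

```prolog
% Excerpt of the table WORKS_ON.
works_on(777777777, 10, 10.0).
works_on(777777777, 30, 30.0).
works_on(888888888, 10, 35.5).
works_on(888888888, 30, 5.0).

% projects_of(?Ssn, -Pairs): bagof/3 groups over the free variable Ssn
% and yields one list of [Pno, Hours] pairs per employee.
projects_of(Ssn, Pairs) :-
    bagof([Pno, Hours], works_on(Ssn, Pno, Hours), Pairs).

% nested_tuples(-Tuples): collect one nested tuple per group.
nested_tuples(Tuples) :-
    findall([Ssn, Pairs], projects_of(Ssn, Pairs), Tuples).
```

Each element of Tuples pairs a social security number with the list of its [Pno, Hours] entries, i.e., it is a row of a nested relation.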
Transitive Closure (Version 3)
We can also compute the list of subordinates for each employee in PROLOG:
?- findall( Boss-Emp,
       ( employee(_,_,_, Emp, _,_,_,_, Boss, _),
         Boss \= '$null$' ),
       Edges ),
   edges_to_ugraph(Edges, Graph),
   transitive_closure(Graph, Tc_Graph).
Tc_Graph = [
    '111111111'-['222222222', '333333333',
        '444444444', '555555555', '666666666',
        '777777777', '888888888'],
    '222222222'-['444444444', '555555555', '666666666'],
    '333333333'-['777777777', '888888888'],
    '444444444'-[], '555555555'-[], '666666666'-[],
    '777777777'-[], '888888888'-[] ].
Firstly, we compute a list Edges of pairs Boss-Emp of social security
numbers in DDBASE, such that Boss is the boss of Emp and Boss is not the
NULL value.
Secondly, we transform Edges to an adjacency representation Graph using
the predicate edges_to_ugraph/2 from SWI-PROLOG:
Graph = [
    '111111111'-['222222222', '333333333'],
    '222222222'-['444444444', '555555555', '666666666'],
    '333333333'-['777777777', '888888888'],
    '444444444'-[], '555555555'-[], '666666666'-[],
    '777777777'-[], '888888888'-[] ].
Thirdly, the predicate transitive_closure/2 from SWI-PROLOG
computes the transitive closure of Graph. It infers, e.g., that
'111111111' is the transitive supervisor of all the other employees.
In PROLOG, the edges of a graph G = (N, E), where
• nodes N = { a, ..., d } and
• edges E = { (a, b), (b, c), (c, a), (c, d) },
can be represented as a list
Edges = [ a-b, b-c, c-a, c-d ].
[Figure: graph G with the edges a -> b, b -> c, c -> a, and c -> d]
In SWI-PROLOG, the call
edges_to_ugraph(Edges, Graph)
converts Edges to an adjacency list representation
Graph = [ a-[b], b-[c], c-[a,d], d-[] ].
For every node V, a tuple V-Vs is given, such that Vs consists of all
successor nodes of V.
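The same two steps can be reproduced with predicates that are documented for the SWI-PROLOG library ugraphs. The sketch below uses vertices_edges_to_ugraph/3 with an empty vertex list in place of edges_to_ugraph/2, which is an assumption made to keep the example independent of the DDK; the wrapper name edges_closure/3 is hypothetical.

```prolog
:- use_module(library(ugraphs)).

% edges_closure(+Edges, -Graph, -Closure): build the adjacency list
% representation of Edges and compute its transitive closure.
edges_closure(Edges, Graph, Closure) :-
    vertices_edges_to_ugraph([], Edges, Graph),
    transitive_closure(Graph, Closure).
```

For the list Edges = [a-b, b-c, c-a, c-d], the resulting Graph is the adjacency list shown above. Because of the cycle through a, b, and c, each of these three nodes reaches all four nodes in the closure, including itself.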
Termination Issues in PROLOG and DATALOG
• Version 1 can be evaluated both in PROLOG and in DATALOG.
The DATALOG evaluation always terminates, whereas the PROLOG
evaluation is only suitable for acyclic graphs; it may not terminate for
cyclic graphs.
• Versions 2 and 3 can only be evaluated in PROLOG.
The predicates transitive_closure/3 from the DDK and
transitive_closure/2 from SWI-PROLOG ensure termination for
arbitrary graphs.
Graph Representations
Versions 1 and 2 work on facts. Version 3 works on a list representation
of the graph edges.
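As a side note that goes beyond the lecture, recent versions of SWI-PROLOG offer tabling, with which the rules of Version 1 also terminate on cyclic graphs. The following sketch assumes a SWI-PROLOG version with tabling support; the predicate names edge/2 and reachable/2 are chosen for this example only.

```prolog
% Tabling stores the answers of reachable/2 and suspends repeated
% calls, so the recursion also terminates on the cycle a -> b -> c -> a.
:- table reachable/2.

edge(a, b).
edge(b, c).
edge(c, a).
edge(c, d).

% Same structure as the two supervisor rules of Version 1.
reachable(X, Y) :-
    edge(X, Y).
reachable(X, Y) :-
    edge(X, Z),
    reachable(Z, Y).
```

Without the table directive, the goal ?- reachable(a, Y). would enumerate answers forever on this graph; with it, the goal has finitely many answers, which makes the evaluation resemble the DATALOG behaviour described above.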
Basic Syntax of PROLOG
Constant Symbol: a, 10, 'Smith, John B.'
Variable Symbol: X, Lname (starts with a capital letter)
Term: f(t1, ..., tn), with function symbol f and terms ti
    a, X (constant and variable symbols are terms),
    f(g(a,b),X,10), a*(b+c) (complex terms),
    [LNAME, ..., DNO] (this is a list)
Predicate Symbol: employee, attributes, query_1, transitive_closure
Atom: p(t1, ..., tn), with predicate symbol p and terms ti.
Terms in Infix / Prefix Form
• The infix term 1955-01-09 representing a date has the prefix form
-(-(1955,01),09).
• The infix term a*(b+c) representing an arithmetic expression has the
prefix form *(a,+(b,c)).
The operator trees for the terms above are given in the following:
        -                 *
       / \               / \
      -   09            a   +
     / \                   / \
  1955  01                b   c
Term Representation for XML
An XML element
<table name="employee">
<attribute name="FNAME"/>
</table>
can be represented by a complex term in field notation (FN):
table:[name:employee]:[
    attribute:[name:'FNAME']:[] ].
This infix form uses the binary functor ":".
The sub–term name:employee could be equivalently represented in prefix
form as :(name, employee).
Lists are denoted as "[X1,...,Xn]", and "[]" is the empty list – above,
the list of sub–elements of the attribute element is empty.
Term Representation for Lists
In term notation, a non–empty list is represented as .(X, Xs), where
• X is the first element (head) and
• Xs represents the rest of the list (tail).
The list functor "." is binary, and the empty list is given by "[]".
[b] = .(b, [])
[a, b] = .(a, [b]) = .(a, .(b, []))
For communicating lists with the user, PROLOG uses the compact list
notation [X1,...,Xn], which is called syntactic sugar.
It helps the user to better comprehend the list.
When an infix operator ⊙ is used multiple times in a term a ⊙ b ⊙ c, then
there are rules in PROLOG that determine whether a and b or b and c are
joined first in the prefix form.
• The infix term 1955-01-09 representing a date has the prefix form
-(-(1955,01),09).
• The infix term T:As:Es representing an XML element has the prefix
form :(T,:(As,Es)).
The operator trees for the terms above are given in the following:
        -                 :
       / \               / \
      -   09            T   :
     / \                   / \
  1955  01               As   Es
Thus, the term attribute:[name:'FNAME']:[], which is equivalent
to :(attribute, :(.(:(name,'FNAME'), []), [])), has the
following operator tree:
          :
         / \
 attribute   :
            / \
           .   []
          / \
         :   []
        / \
    name   'FNAME'
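Since ":" associates to the right, an FN term Tag:Attributes:SubElements can be taken apart by a single unification. The following helper predicates are a small sketch of this; they are hypothetical and much simpler than the FNQuery library of the DDK.

```prolog
:- use_module(library(lists)).

% fn_parts(+Element, -Tag, -Attributes, -SubElements):
% the pattern matches the prefix form :(Tag, :(Attributes, SubElements)).
fn_parts(Tag:Attributes:SubElements, Tag, Attributes, SubElements).

% fn_attribute(+Element, ?Name, ?Value): look up an attribute value.
fn_attribute(_Tag:Attributes:_Subs, Name, Value) :-
    member(Name:Value, Attributes).

% fn_child(+Element, ?Child): enumerate the direct sub-elements.
fn_child(_Tag:_Attributes:SubElements, Child) :-
    member(Child, SubElements).

% A shortened version of the employee table element in field notation.
example(table:[name:employee]:[
            attribute:[name:'FNAME']:[],
            attribute:[name:'SSN']:[] ]).
```

For instance, the goal ?- example(E), fn_child(E, C), fn_attribute(C, name, N). enumerates the names of the attribute sub–elements.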
Facts, Rules, and Goals
Literal: atom A or negated atom not(A)
Fact: A, with atom A; e.g.,
    employee('John', 'B', 'Smith', ...)
Rule: A :- B1, ..., Bm
    with atom A (the head) and literals Bi (the body), example later
Goal: :- B1, ..., Bm
    with literals Bi
A set of facts for the same predicate symbol corresponds to a relation in
databases. Rules generalize views. Goals are used for expressing queries.
Argument Positions vs. Field Notation (FN)
• As in other programming languages, the arguments ti of an atom
p(t1, ..., tn) are handed over by position in PROLOG. E.g., in
works_on(S, P, H),
the first position t1 = S is the social security number of an employee
who has worked on the project with the number t2 = P (second position)
for t3 = H hours (third position).
• In the database context, we could use a meta–interpreter for accessing
arguments in field notation – in a more abstract way – by their
corresponding attribute name. Then, according to the database schema,
works_on('PNO':P, 'ESSN':S)
means that the employee with the social security number S has worked
on the project with the number P, independently of the order of the
arguments – and it is, e.g., not necessary to refer to the hours.
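How such a meta–interpreter could map attribute names to argument positions can be sketched in a few lines. The predicates fn_call/2 and bind_fields/3 and the simplified schema/2 facts below are assumptions made for this illustration; DDBASE uses its own schema representation.

```prolog
:- use_module(library(lists)).

% Simplified schema fact: table name and attribute list in column order.
schema(works_on, ['ESSN', 'PNO', 'HOURS']).

works_on(444444444, 1, 32.5).
works_on(444444444, 2, 7.5).
works_on(555555555, 3, 40.0).

% fn_call(+Table, +Fields): Fields is a list of Attribute:Value pairs
% in arbitrary order; attributes that are not mentioned stay unbound.
fn_call(Table, Fields) :-
    schema(Table, Attributes),
    length(Attributes, N),
    length(Args, N),                      % one fresh variable per column
    bind_fields(Fields, Attributes, Args),
    Goal =.. [Table|Args],
    call(Goal).

bind_fields([], _, _).
bind_fields([Attribute:Value|Fields], Attributes, Args) :-
    once(nth1(I, Attributes, Attribute)), % column position of the attribute
    nth1(I, Args, Value),                 % unify that argument with Value
    bind_fields(Fields, Attributes, Args).
```

The goal ?- fn_call(works_on, ['PNO':P, 'ESSN':444444444]). then corresponds to works_on(444444444, P, _), without the caller having to know the argument positions.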
Integrity Constraints in PROLOG
• Primary Key Constraint for Employee:
primary_key_violation(employee, X, Y) :-
    X = employee(_,_,_, SSN, _,_,_,_,_,_),
    Y = employee(_,_,_, SSN, _,_,_,_,_,_),
    call(X), call(Y), X \= Y.
• Foreign Key Constraint for Employee:
foreign_key_violation(
        employee('DNO'), department('DNUMBER'), X) :-
    X = employee(_,_,_,_,_,_,_,_,_, DNO),
    call(X),
    not(department(_, DNO, _,_)).
In DDBASE, the primary and foreign key constraints of a relational database
are transformed to such rules, which are then tested on database updates.
In a less elegant, naive implementation, we have to assign variable symbols
for all the argument positions of the two violating employee facts:
• Primary Key Constraint for Employee:
primary_key_violation(employee, X, Y) :-
    employee(A,B,C, SSN, D,E,F,G,H,I),
    employee(J,K,L, SSN, M,N,O,P,Q,R),
    X = employee(A,B,C, SSN, D,E,F,G,H,I),
    Y = employee(J,K,L, SSN, M,N,O,P,Q,R),
    X \= Y.
Moreover, we have to repeat all these variable symbols when we define the
return arguments X and Y of the call
primary_key_violation(employee, X, Y).
The many variable symbols and their repetition make the rule more
error–prone.
In the shorter, first primary key rule above, we use the templates
X = employee(_,_,_, SSN, _,_,_,_,_,_),
Y = employee(_,_,_, SSN, _,_,_,_,_,_),
to avoid the naming and the repeated writing of all the arguments.
call(X), call(Y), X \= Y.
calls the templates in the PROLOG database and tries to assign values to all
argument positions – even the ones with anonymous variables "_" – and tests
if X and Y represent two different database tuples.
If the primary key constraint is violated, then the instantiated templates are
returned.
Analogously, we proceed for the foreign key constraint.
As a general purpose programming language, PROLOG offers rich
functionality for defining integrity constraints.
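The template technique can be tried out on a reduced example. The sketch below uses hypothetical three–argument facts emp(Name, SSN, Dno) and one–argument facts dept(Dno) instead of the full COMPANY tables, and the data is deliberately inconsistent so that both rules fire.

```prolog
% Reduced, deliberately inconsistent sample data.
emp('Smith', 444444444, 5).
emp('Smyth', 444444444, 4).   % duplicates the key 444444444
emp('Wong',  222222222, 7).   % refers to a missing department 7

dept(4).
dept(5).

% Two different emp tuples sharing the same SSN violate the primary key.
primary_key_violation(emp, X, Y) :-
    X = emp(_, SSN, _),
    Y = emp(_, SSN, _),
    call(X), call(Y), X \= Y.

% An emp tuple whose department number has no dept fact violates the
% foreign key.
foreign_key_violation(emp, X) :-
    X = emp(_, _, Dno),
    call(X),
    \+ dept(Dno).
```

Note that every primary key violation is reported twice, once for each order of the two tuples, since the rule is symmetric in X and Y.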
Semantic Constraints in Field Notation (FN)
• No employee should earn more than his manager:
trigger(salary, X, Y) :-
    employee('SSN':X, 'SALARY':S1, 'SUPERSSN':Y),
    employee('SSN':Y, 'SALARY':S2),
    S1 > S2.
• Which employee works on a project of a foreign department?
trigger(employee_works_on_foreign_project, E, P) :-
    works_on('ESSN':E, 'PNO':P),
    employee('SSN':E, 'DNO':D1),
    project('PNUMBER':P, 'DNUM':D2),
    D1 \= D2.
FN abstracts from argument positions: employee('SSN':E, 'DNO':D1)
corresponds to employee(_,_,_, E, _,_,_,_,_, D1).
Bottom–Up Evaluation of DATALOG
• The set of all given facts for a predicate corresponds to a relation.
• A rule without function symbols corresponds to a VIEW statement
defining a relation for the head predicate.
• The relations for the body predicates may themselves be derived using rules.
Thus, it can happen that a rule transitively helps to derive tuples for one
of its body predicates (recursion).
E.g., the second rule for supervisor is directly recursive.
• The bottom–up evaluation iteratively enlarges the relations for the
predicates by repeatedly evaluating all rules until a fixpoint is reached.
Thus, e.g., all transitive supervisors can be derived, which is provably
not possible using SQL without recursion.
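The fixpoint iteration can be imitated in PROLOG itself. The following naive sketch represents the derived supervisor facts as an ordered list of pairs X-Y and applies both supervisor rules in every round until nothing new is derived. It treats direct_supervisor as given facts, uses a small hypothetical hierarchy with short numbers instead of social security numbers, and makes no attempt to avoid redundant derivations.

```prolog
:- use_module(library(lists)).
:- use_module(library(ordsets)).

% Hypothetical hierarchy: 1 supervises 2 and 3, and 2 supervises 4.
direct_supervisor(1, 2).
direct_supervisor(1, 3).
direct_supervisor(2, 4).

% step(+Facts0, -Facts): apply both supervisor rules once, where the
% recursive body atom is looked up in the facts derived so far.
step(Facts0, Facts) :-
    findall(X-Y, direct_supervisor(X, Y), Base),
    findall(X-Y, ( direct_supervisor(X, Z), member(Z-Y, Facts0) ), Rec),
    append(Base, Rec, Derived0),
    sort(Derived0, Derived),
    ord_union(Facts0, Derived, Facts).

% fixpoint(+Facts0, -Facts): iterate until a round derives nothing new.
fixpoint(Facts0, Facts) :-
    step(Facts0, Facts1),
    (   Facts1 == Facts0
    ->  Facts = Facts0
    ;   fixpoint(Facts1, Facts)
    ).

% supervisor_facts(-Facts): start the iteration with the empty relation.
supervisor_facts(Facts) :-
    fixpoint([], Facts).
```

The first round yields the three direct pairs, the second round adds the transitive pair 1-4, and the third round derives nothing new, so the iteration stops.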
Example (Recursion and Transitive Closure)
[Figure: supervisor hierarchy]
111111111
    222222222
        444444444, 555555555, 666666666
    333333333
        777777777, 888888888
The following recursive rule set derives the transitive supervisor relation on
the social security numbers:
supervisor(SSN_1, SSN_2) :-
    direct_supervisor(SSN_1, SSN_2).
supervisor(SSN_1, SSN_2) :-
    direct_supervisor(SSN_1, SSN_3),
    supervisor(SSN_3, SSN_2).
direct_supervisor(SSN_1, SSN_2) :-
    employee(_,_,_, SSN_2, _,_,_,_, SSN_1, _).
The first iteration derives the facts for direct_supervisor from the
facts for employee:
direct_supervisor(111111111, 222222222).
direct_supervisor(111111111, 333333333).
direct_supervisor(222222222, 444444444).
direct_supervisor(222222222, 555555555).
direct_supervisor(222222222, 666666666).
direct_supervisor(333333333, 777777777).
direct_supervisor(333333333, 888888888).
The second iteration translates these facts to the corresponding 7 facts for
supervisor.
supervisor(111111111, 222222222).
...
supervisor(333333333, 888888888).
The third iteration derives the 5 new facts that 111111111 is the transitive
(indirect) supervisor of the employees 444444444 to 888888888:
supervisor(111111111, 444444444).
supervisor(111111111, 555555555).
supervisor(111111111, 666666666).
supervisor(111111111, 777777777).
supervisor(111111111, 888888888).
Since the hierarchy is of limited depth 2 here, the relations corresponding to
these facts could also be derived in SQL.
For arbitrary hierarchies of unlimited depth, however, it is not possible to
derive the transitive supervisors in SQL without recursion.
In principle, all rules can be used in all iterations. However, a rule can only
fire and derive facts once facts for its body atoms have been derived in previous
iterations. From then on, the rule derives the same facts again in every iteration.
One of the purposes of efficient bottom–up evaluation is to avoid these
redundant derivations – especially in the presence of recursion.
The rule for query_2 fires in iteration 3 for the first time and derives 7 facts
for direct supervisors:
query_2('James'-'E'-'Borg', 'Franklin'-'T'-'Wong').
...
query_2('Jennifer'-'S'-'Wallace', 'Ahmad'-'V'-'Jabbar').
Finally, in iteration 4, the 5 facts for transitive supervisors are derived.
Iteration 5 does not derive any new facts.
Thus, a fixpoint is reached, and the iteration terminates.
Comparison with SQL
• Non–recursive DATALOG could be simulated in SQL by mapping the
rules to VIEW statements – or to INSERT statements whose result is
computed using a SELECT statement.
• Recursion brings higher expressivity to DATALOG.
• There are DATALOG extensions which allow for default negation and
aggregate operations as well.
• The rule–based approach of DATALOG supports modularization:
instead of one single, complex VIEW or SELECT statement in SQL, a set
of simpler and more compact DATALOG rules can be used.
The deductive database system DDBASE also supports update operations
such as INSERT and DELETE, and it can connect to relational databases.
8.1.2 The Deductive Database System DDBASE
The deductive database system DDBASE, which is part of the DDK, can
process
• relational databases and
• XML documents
within the same query using ODBC and FNQuery, respectively:
[Figure: DDBASE accesses an RDB via ODBC and XML documents via FNQuery]
This extends database programming languages (DBPL) by XML capabilities.
ODBC
The following PROLOG rule accesses a relational database – given by the
connection handle mysql – using the ODBC library of SWI-PROLOG.
generate_html_table(Salary, table:Rows) :-
    concat('SELECT fname, minit, lname, salary \
        FROM employee WHERE salary >= ', Salary, Query),
    Types = [types([atom,atom,atom,integer])],
    findall( Row,
        ( odbc_query(mysql, Query, row(F,M,L,S), Types),
          Row = tr:[td:[F], td:[M], td:[L], td:[S]] ),
        Rows ).
The query string Query is obtained by concatenating a partial SELECT
statement with the value for the salary. Types gives the types of the
components of the result tuples.
The findall Statement
• The call odbc_query(mysql, Query, row(F,M,L,S),
Types) returns the values F,M,L,S for the attributes fname,
minit, lname, salary of the table employee.
• By backtracking, the findall statement produces a list Rows of
PROLOG terms Row of the form tr:[td:[F], td:[M],
td:[L], td:[S]], which represent XML elements in FNQuery.
• For a given Salary, the call generate_html_table(Salary,
table:Rows) produces a PROLOG term table:Rows, which
represents the following HTML table in FNQuery.
The generated HTML table
<table>
<tr><th>Fname</th><th>Minit</th><th>Lname</th><th>Salary</th></tr>
<tr><td>John</td><td>B</td><td>Smith</td><td>30000</td></tr>
<tr><td>Franklin</td><td>T</td><td>Wong</td><td>40000</td></tr>
<tr><td>Jennifer</td><td>S</td><td>Wallace</td><td>43000</td></tr>
<tr><td>Ramesh</td><td>K</td><td>Narayan</td><td>38000</td></tr>
<tr><td>James</td><td>E</td><td>Borg</td><td>55000</td></tr>
</table>
can be rendered in a web browser.
By ODBC, we can make SQL tables available in DDBASE:
employee(A,B,C,D,E,F,G,H,I,J) :-
    Goal = company:employee(A,B,C,D,E,F,G,H,I,J),
    ddbase_call(odbc(mysql), Goal).
works_on(A,B,C) :-
    Goal = company:works_on(A,B,C),
    ddbase_call(odbc(mysql), Goal).
It is also possible to generate these rules in DDBASE, which avoids the
error–prone, repeated use of so many variable symbols.
The call ddbase_connect(odbc(mysql), M, Database:Table)
asserts a corresponding rule in a PROLOG module M.
The following two aggregation statements refer to the predicate
employee/10 provided by ODBC. The facts for works_on/3 are derived
using FNQuery from an XML document works_on.xml.
Aggregation on RDB and XML
For every Ssn in the table employee, the following query groups all
corresponding entries from the document works_on.xml:
?- ddbase_aggregate( [Ssn, list([Pno, Hours])],
       ( employee(_,_,_, Ssn, _,_,_,_,_,_),
         Row := doc('works_on.xml')/row::[@'ESSN'=Ssn],
         Pno := Row@'PNO', Hours := Row@'HOURS' ),
       Tuples ).
Tuples = [
    ['111111111', [['20', '0.0']]],
    ['222222222', [['2', '10.0'], ['3', '10.0']]], ... ]
The resulting list Tuples represents an NF2 relation.
A query optimizer could rearrange the Goal in ddbase_aggregate/3 by
changing the order of the calls to the predicate employee/10 and the XML
document works_on.xml.
In DDBASE, we can define arbitrary binary aggregation predicates.
ddbase_aggregate/3 groups over all variable symbols that occur
standalone in the result template [Ssn, list([Pno, Hours])];
in this case, this is Ssn.
• For every Ssn, the above call to ddbase_aggregate/3 computes
the list Xs of all corresponding pairs [Pno, Hours].
• Then, the call list(Xs, Pairs), which will be explained later,
simply passes Xs to Pairs.
• Thus, ddbase_aggregate/3 produces a nested tuple [Ssn,
Pairs] for every Ssn. Pairs is a list of lists; it represents a relation.
The resulting list Tuples is the output.
The following statement aggregates the working hours of the employees
per department:
?- ddbase_aggregate( [Dno, sum(Hours)],
       ( employee(_,_,_, Ssn, _,_,_,_,_, Dno),
         Row := doc('works_on.xml')/row::[@'ESSN'=Ssn],
         H := Row@'HOURS', atom_number(H, Hours) ),
       Tuples ).
Tuples = [[1, 0.0], [4, 115.5], [5, 140.0]]
The attribute value H of the attribute 'HOURS' of Row is an atom that has to
be converted to a number Hours.
The template [Dno, sum(Hours)] leads to a grouping on the department
numbers. For every Dno, first the list Xs of all corresponding Hours is
computed, and then the sum is computed by the call sum(Xs, Sum); thus,
we obtain a standard result tuple [Dno, Sum].
For explaining the effect of the template [Dno, sum(Hours)], we
abstract the second argument of the call above as follows:
dno_hours(Dno, Hours) :-
    employee(_,_,_, Ssn, _,_,_,_,_, Dno),
    Row := doc('works_on.xml')/row::[@'ESSN'=Ssn],
    H := Row@'HOURS', atom_number(H, Hours).
The intermediate variable symbols Ssn, Row, and H do not become
arguments of dno_hours/2, since they are not used in the template.
Then, the following call has the same result as the call above:
?- ddbase_aggregate( [Dno, sum(Hours)],
       dno_hours(Dno, Hours),
       Tuples ).
E.g., for Dno=4, first the list Xs of all working hours of employees from
department 4 is computed by dno_hours(4, Hours) in the following
functional set notation, and then the sum Sum is computed:
?- Xs <= { Hours | dno_hours(4, Hours) },
Sum <= sum(Xs).
Xs = [15.0, 20.0, 10.0, 30.0, 35.5, 5.0],
Sum = 115.5.
These functional notations, which are possible in the DDK, can even be
nested to get rid of the intermediate variable symbol Xs:
?- Sum <= sum({ Hours | dno_hours(4, Hours) }).
Sum = 115.5.
The functional notation Sum <= sum(Xs) is equivalent to the relational
notation sum(Xs, Sum), which includes the return value as the last
argument. Thus, sum should be defined as a binary predicate in PROLOG.
Aggregation Predicates
In DDBase, arbitrary user–defined aggregation predicates can be used.
The predicate list/2 simply passes the input to the output:
list(Xs, Xs).
The predicate sum/2 uses an accumulator, which is initialized to 0. sum/3
traverses the input list recursively. The list head X is added to the accumulator
Acc, and then sum/3 is called recursively on the list tail Xs and the new
accumulator Acc_2; if the list is empty, then Acc becomes the output:
sum(Xs, Sum) :-
   sum(Xs, 0, Sum).
sum([X|Xs], Acc, Sum) :-
   Acc_2 is Acc + X, sum(Xs, Acc_2, Sum).
sum([], Acc, Acc).
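As a further illustration of a user–defined aggregation predicate, the following sketch defines an average on top of sum/2. The predicate avg/2 is a hypothetical example and not taken from DDBase; it fails for the empty list, which is an assumption made here to avoid a division by zero.

```prolog
% sum/2 with an accumulator, as described above.
sum(Xs, Sum) :-
   sum(Xs, 0, Sum).

sum([X|Xs], Acc, Sum) :-
   Acc_2 is Acc + X,
   sum(Xs, Acc_2, Sum).
sum([], Acc, Acc).

% avg(+Xs, -Avg): hypothetical aggregation predicate; fails for [].
avg(Xs, Avg) :-
   Xs = [_|_],
   sum(Xs, Sum),
   length(Xs, N),
   Avg is Sum / N.

% ?- avg([15.0, 20.0, 10.0], Avg).
% Avg = 15.0.
```

Like sum/2, such a predicate takes the list of collected values as its first argument and returns the aggregated value as its last argument.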
Lightweight Fact Database
A relational database can also be imported into a lightweight fact
representation in Prolog.
The following sequence of statements loads the data dictionary from the
MySQL database company into a module c. Subsequently, the corresponding
relations are imported from the MySQL database, and a summary is shown.
?- ddbase_load(odbc(mysql), company, c),
ddbase_load_tables(c), ddbase_show_tables(c).
We can describe the schema of a database table based on the data dictionary
of MySQL:
?- ddbase_describe_table(company:works_on).
<table name="works_on">
   <attribute name="ESSN" type="char(9)" is_nullable="NO"/>
   <attribute name="PNO" type="int(11)" is_nullable="NO"/>
   <attribute name="HOURS" type="decimal(3,1)" is_nullable="NO"/>
   <primary_key>
      <attribute name="ESSN"/> <attribute name="PNO"/>
   </primary_key>
   <foreign_key> <attribute name="ESSN"/>
      <references table="employee"> <attribute name="SSN"/>
      </references>
   </foreign_key>
   <foreign_key> <attribute name="PNO"/>
      <references table="project"> <attribute name="PNUMBER"/>
      </references>
   </foreign_key>
</table>
true.
We can display a complete database or single relations.
?- ddbase_facts_to_display(c).
?- ddbase_facts_to_display(c:works_on/3).
Of course, the Prolog representation of the relational database can be
queried in the standard way in Prolog.
Moreover, we can also execute update statements, which respect the integrity
constraints of the relational database. After an insertion or deletion in the
database, the primary and foreign key constraints from the data dictionary are
checked.
?- ddbase_insert(c, works_on('666666666', 10, 3)),
   ddbase_insert(c, works_on('666666666', 10, 4)),
   ddbase_delete(c, works_on('666666666', 10, 3)).
The second update is rejected, since it violates a primary key constraint.
All tuples from all relations of a database can be deleted in one step.
The data dictionary remains unchanged.
?- ddbase_drop_database(c).
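The idea of a key–respecting insertion can be sketched in plain Prolog as follows. This is not the implementation of ddbase_insert/2: the helper key_insert/1 is hypothetical, the primary key (ESSN, PNO) is hard–coded instead of being read from the data dictionary, and foreign keys are not checked.

```prolog
:- dynamic works_on/3.

% key_insert(+Fact): hypothetical helper, not a DDBase predicate.
% Inserts a works_on/3 fact unless the key (Essn, Pno) already exists.
key_insert(works_on(Essn, Pno, Hours)) :-
   \+ works_on(Essn, Pno, _),
   assertz(works_on(Essn, Pno, Hours)).

% ?- key_insert(works_on('666666666', 10, 3)).   % succeeds
% ?- key_insert(works_on('666666666', 10, 4)).   % fails: key exists
```

The second call fails because a tuple with the same key values is already stored, which corresponds to the rejected update above.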
Complex Computations with FNQuery
element_to_subtree(Xml, Course_1, Course_2) :-
   [T] := Course_1/'Title'/content::'*',
   ( Ps := Course_1@'Prerequisites' ->
     let( Trees := Xml/'Course'::[
        @'CourseNr' = N, name_contains_name(Ps, N) ]
        /call::element_to_subtree(Xml) )
   ; Trees = [] ),
   Course_2 = 'Course':['Title':T]:Trees.
?- dread(xml, 'Uni.xml', [Xml]),
   let( Trees := Xml/descendant::'Course'
      /call::element_to_subtree(Xml) ),
   dwrite(xml, 'CourseHierarchy':Trees).
In FNQuery, the attribute value of an element C is accessed by C@A,
whereas in XPath, it is accessed by C/@A.
The call element_to_subtree(Xml, Course_1, Course_2)
takes an XML document and a course element Course_1 and produces
another course element Course_2:
• Firstly, T becomes the content of the Title element of the Course
element Course_1.
• If Course_1 has prerequisites, then we determine a list Trees of
XML terms using let.
For every course in the document, we check whether the course number
N is contained in the list Ps of prerequisites of Course_1.
In that case, we call the predicate element_to_subtree/3
recursively on that course to produce an element of Trees.
The global XML document is also a parameter of the call.
• If Course_1 has no prerequisites, then we determine the empty list
Trees = [].
The main call reads the XML document Uni.xml into a Prolog variable
Xml using dread/3.
Subsequently, let/1 calls element_to_subtree/3 on every
descendant Course element.
The resulting list Trees represents a list of XML elements, which are then
packed into a CourseHierarchy element.
The corresponding Prolog term 'CourseHierarchy':Trees
represents the XML output of the whole computation, and it can be written to
standard output (the screen) using dwrite/2.
8.1.3 Prolog as a Programming Language
In the following, we will present Prolog implementations of well–known
algorithms for searching in graphs and binary search trees.
The benefits of Prolog are
• the elegant handling of data structures (lists, trees, XML),
• (implicit) backtracking, and
• the compact representation of case distinctions in different rules.
The algorithms are typically recursive. Recursion can be formulated nicely
due to the compact list access.
Meta–predicates also support a compact and elegant encoding.
Graph Search
Labyrinth:
Computation of Simple Paths by Backtracking
The predicate graph_search/2 computes a simple path from a given
node to a sink of a graph:
% graph_search(+Node, ?Path) <-
graph_search(X, Path) :-
   graph_search(X, [X], Path).
Another predicate graph_search/3 with the same predicate symbol but a
different arity is called.
The graph is given by facts for the predicates graph_arc/2 and
graph_sink/1.
Notation for arguments in the comment line:
+: bound, -: free, ?: either bound or free
[Figure: the nodes in Visited lead to X, an edge leads from X to Y, and Path = [Y1 = Y, ..., Yn = Z] leads from Y to the sink Z.]
• A call graph_search(X, Visited, Nodes) with a bound
argument X, which is not a sink, and a list Visited of already visited
nodes
– uses an edge from X to a not yet visited successor node Y, and then
– calculates a path Path from Y to a sink Z, which does not revisit Y or
any node in Visited.
If no path from Y to a sink can be found, then another successor node of
X must be used (backtracking).
• The result Nodes = [X|Path] is a simple path from X to a sink of
the graph.
The predicate graph_search/3 is recursive, because of its second rule:
% graph_search(+Node, +Visited, ?Path) <-
graph_search(X, _, [X]) :-
   graph_sink(X).
graph_search(X, Visited, [X|Path]) :-
   graph_edge(X, Y),
   not(member(Y, Visited)),
   write(user, '->'), write(user, Y),
   graph_search(Y, [Y|Visited], Path).
Termination is ensured by the fact that already visited nodes cannot be visited
again.
• The initial call graph_search(X, [X], Path) calculates a
simple path from X to a sink of the graph.
– If X is a sink, then the first rule for graph_search/3 computes
Path as the one–element list [X].
– Otherwise, the recursive, second rule chooses a successor node Y using
graph_edge(X, Y), and then it continues searching from there.
• Further paths can be searched for by backtracking.
– Alternative successor nodes Y can be used in the second rule.
– In the implementation above, we can continue searching beyond a
sink by using the second rule instead of the first one.
Implicit and Explicit Backtracking
In Prolog, backtracking is used automatically (implicitly).
In a procedural language, backtracking has to be implemented explicitly.
In a direct translation of the code above to a procedural environment, a call
graph_edge(X, Y) can only produce a single successor node Y of X –
if there is no path from Y to a sink, then the computation fails.
Moreover, at most one solution could be computed.
If we implement the graph search procedurally using explicit backtracking,
then we get more code than in Prolog.
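To give an impression of the additional bookkeeping, the following sketch manages the alternatives explicitly on a stack of partial paths instead of relying on Prolog's backtracking. It is written in Prolog only to stay within one language; the predicates explicit_search/2 and search_stack/2 are hypothetical names, and the facts repeat the labyrinth graph used in this section. Only the first path is returned.

```prolog
graph_arc(i, f). graph_arc(i, h). graph_arc(h, g). graph_arc(g, d).
graph_arc(d, e). graph_arc(d, a). graph_arc(a, b). graph_arc(b, c).
graph_sink(c).

graph_edge(X, Y) :-
   ( graph_arc(X, Y)
   ; graph_arc(Y, X) ).

% explicit_search(+Node, -Path): first simple path from Node to a sink.
explicit_search(Start, Path) :-
   search_stack([[Start]], Reversed),
   reverse(Reversed, Path).

% search_stack(+Stack, -ReversedPath): Stack holds reversed partial paths.
search_stack([[X|Visited]|_], [X|Visited]) :-
   graph_sink(X), !.
search_stack([[X|Visited]|Stack], Path) :-
   % push all extensions of the top path by an unvisited successor
   findall([Y,X|Visited],
      ( graph_edge(X, Y), \+ memberchk(Y, [X|Visited]) ),
      Extensions),
   append(Extensions, Stack, Stack_2),
   search_stack(Stack_2, Path).

% ?- explicit_search(i, Path).
% Path = [i, h, g, d, a, b, c].
```

A dead end such as the node f simply contributes no extensions, so the next partial path on the stack is tried; this is the step that Prolog performs implicitly.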
Representation of a Graph by Prolog Facts
Labyrinth:
[Figure: labyrinth with the nodes a, b, c, d, e, f, g, h, i.]
graph_arc(i, f).
graph_arc(i, h).
graph_arc(h, g).
graph_arc(g, d).
graph_arc(d, e).
graph_arc(d, a).
graph_arc(a, b).
graph_arc(b, c).
graph_sink(c).
The following rule symmetrizes the predicate graph_arc/2:
graph_edge(X, Y) :-
   ( graph_arc(X, Y)
   ; graph_arc(Y, X) ).
Thus, it is not necessary to explicitly list the inverse edges:
graph_edge(i, f).
graph_edge(f, i).
graph_edge(i, h).
graph_edge(h, i).
...
Computation
• The predicate graph_search/2 uses depth–first search, and it
calculates simple paths (without duplicate nodes).
• With the call graph_search(+Node, -Path), we can calculate
all simple paths from Node to a sink (graph_sink) by backtracking:
?- graph_search(i, Path).
->f->h->g->d->e->a->b->c
Path = [i, h, g, d, a, b, c]
?- graph_search(e, Path).
->d->a->b->c
Path = [e, d, a, b, c] ;
->g->h->i->f
No
• If we add another edge graph_arc(e, b) to the graph (i.e., we tear
down the wall between e and b), then there appears another simple path
[e, b, c] from e to the sink c.
• All results can be calculated by backtracking and findall/3:
graph_arc(e, b).
?- findall( Path,
graph_search(e, Path),
Paths ).
Paths = [[e, b, c], [e, d, a, b, c]]
Yes
The Meta–Predicate findall/3
Finding of all solutions for a goal:
findall( X,
goal(X),
Xs )
The DDK allows for the following equivalent set notation:
Xs <= { X | goal(X) }
Further important meta–predicates are checklist/2 and maplist/3 for
lists, as well as the predicates for loops (control structures) from the library
loops.pl (e.g., foreach-do).
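A small sketch of these meta–predicates in SWI-Prolog: the facts goal/1 and the helpers double/2 and doubled_solutions/1 are illustrative assumptions. maplist/2 is used here for checking every list element, which is the role that checklist/2 plays above.

```prolog
goal(1).
goal(2).
goal(3).

% double(+X, -Y): helper applied to every list element.
double(X, Y) :-
   Y is 2 * X.

% doubled_solutions(-Ys): collect all solutions, then transform them.
doubled_solutions(Ys) :-
   findall(X, goal(X), Xs),
   maplist(integer, Xs),        % check every element
   maplist(double, Xs, Ys).     % transform every element

% ?- doubled_solutions(Ys).
% Ys = [2, 4, 6].
```

In each case, the first argument of the meta–predicate is itself a predicate, which is called with additional arguments taken from the lists.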
Binary Search Trees
% search_in_tree(+Key, +Tree) <-
search_in_tree(Key, Tree) :-
   parse_tree(Tree, Root, Lson, Rson),
   ( Key = Root
   ; Key < Root ->
     search_in_tree(Key, Lson)
   ; Key > Root ->
     search_in_tree(Key, Rson) ).
arguments: +: bound, -: free, ?: either bound or free
Search Tree
XML Representation:
<node key="5">
   <node key="4"/>
   <node key="9">
      <node key="6"/>
      <node key="10"/>
   </node>
</node>
Prolog Representation:
node:[key:5]:[
   node:[key:4]:[],
   node:[key:9]:[
      node:[key:6]:[],
      node:[key:10]:[] ] ]
Alternative Prolog representation:
[5, [4], [9, [6], [10]]]
[Figure: binary search tree with root 5, left child 4, and right child 9, which has the children 6 and 10.]
Encapsulation of the Tree Access
% parse_tree(+Tree, ?Key, ?Lson, ?Rson) <-
% parse_tree(?Tree, +Key, +Lson, +Rson) <-
parse_tree(Tree, Key, Empty, Empty) :-
   Tree = node:[key:Key]:[],
   Empty = node:[]:[].
parse_tree(Tree, Key, Lson, Rson) :-
   Tree = node:[key:Key]:[Lson, Rson].
% binary_tree_empty(?Tree) <-
binary_tree_empty(node:[]:[]).
The same code for parse_tree/4 can be called both for extracting the
root key and the two subtrees of a binary tree (+,-,-,-) and for
constructing a binary tree from a key and two subtrees (-,+,+,+) .
Examples:
?- Tree = node:[key:5]:[
node:[key:4]:[],
node:[key:9]:[node:[key:6]:[], ...] ],
parse_tree(Tree, Key, Lson, Rson).
Key = 5,
Lson = node:[key:4]:[],
Rson = node:[key:9]:[node:[key:6]:[], ...]
?- Key = 9,
Lson = node:[key:6]:[],
Rson = node:[key:10]:[],
parse_tree(Tree, Key, Lson, Rson).
Tree = node:[key:9]:[node:[key:6]:[], node:[key:10]:[]]
Alternative Encapsulation of the Tree Access
parse_tree([Root, Lson, Rson], Root, Lson, Rson).
parse_tree([Root], Root, [], []).
binary_tree_empty([]).
[Figure: the search tree with root 5, left child 4, and right child 9 with the children 6 and 10.]
Example:
?- Tree = [5, [4], [9, [6], [10]]],
parse_tree(Tree, Root, Lson, Rson).
Root = 5, Lson = [4], Rson = [9, [6], [10]]
% insert_into_tree(+Key, +Tree, ?New_Tree) <-
insert_into_tree(Key, Tree, New_Tree) :-
   parse_tree(Tree, Root, Lson, Rson),
   ( Key = Root ->
     New_Tree = Tree
   ; Key < Root ->
     insert_into_tree(Key, Lson, L),
     parse_tree(New_Tree, Root, L, Rson)
   ; Key > Root ->
     insert_into_tree(Key, Rson, R),
     parse_tree(New_Tree, Root, Lson, R) ).
insert_into_tree(Key, _, New_Tree) :-
   binary_tree_empty(Empty),
   parse_tree(New_Tree, Key, Empty, Empty).
If the tree is empty, then parse_tree(Tree, Root, Lson, Rson)
fails. Then the second rule builds a new tree with two empty subtrees using
parse_tree(New_Tree, Key, Empty, Empty).
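For experimenting, searching and inserting can be combined into one self–contained sketch based on the alternative list representation of trees. It deviates from the code above in three points, all of which are assumptions of this sketch: a cut commits to the first rule whenever the tree is non–empty, keys are compared arithmetically with =:=, and new trees are built directly as lists instead of calling parse_tree/4 in the constructing mode.

```prolog
% Alternative list representation of trees.
parse_tree([Root, Lson, Rson], Root, Lson, Rson).
parse_tree([Root], Root, [], []).

% search_in_tree(+Key, +Tree)
search_in_tree(Key, Tree) :-
   parse_tree(Tree, Root, Lson, Rson),
   ( Key =:= Root -> true
   ; Key < Root -> search_in_tree(Key, Lson)
   ; search_in_tree(Key, Rson) ).

% insert_into_tree(+Key, +Tree, -New_Tree)
insert_into_tree(Key, Tree, New_Tree) :-
   parse_tree(Tree, Root, Lson, Rson),
   !,
   ( Key =:= Root -> New_Tree = Tree
   ; Key < Root ->
     insert_into_tree(Key, Lson, L),
     New_Tree = [Root, L, Rson]
   ; insert_into_tree(Key, Rson, R),
     New_Tree = [Root, Lson, R] ).
insert_into_tree(Key, _, [Key]).     % empty tree: new leaf

% ?- foldl(insert_into_tree, [5, 4, 9, 6, 10], [], Tree).
% Tree = [5, [4], [9, [6], [10]]].
```

Inserting the keys 5, 4, 9, 6, 10 one after the other into the empty tree [] yields the search tree from the example; foldl/4 is the SWI-Prolog library predicate for such an iteration.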
Important Concepts
• Terms (for Data and Control Structures) and Unification
• Backtracking
• SLDNF–Resolution
Prolog allows for
• declarative programming,
• compact programs, and
• rapid prototyping, agile software development.
We are using the XPCE extension of SWI-Prolog.
Top–Down Evaluation of Prolog: SLDNF–Resolution
• Like in conventional programming, Prolog is evaluated top–down:
a call to a predicate looks for an applicable rule with the predicate in
its head and then successively calls the statements in the body.
• Unlike in conventional programming languages, there can be many such
rules, which are then used successively – comparable to the different
options of a case–statement. The evaluation of a call using a rule can
fail; then, the next applicable rule is used (backtracking). This is done
until finally the complete computation is successful.
• Since the arguments of a rule head can be partially instantiated, the
passing of parameters is done using unification, which suitably extends
the standard way of parameter passing.
• A negated call succeeds, if the corresponding positive call fails
(negation–as–finite–failure).
• Using backtracking, it is possible to compute the list of all answers to a
given call (query). This corresponds to query answering in relational
databases using SQL.
In practical Prolog systems, there exists a large collection of pre–defined
built–in predicates and also meta–predicates (i.e., predicates, some of whose
arguments can be predicates themselves).
Moreover, there can be side–effects – mainly for I/O and access to the
internal fact database (assert, retract).
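The following sketch illustrates backtracking, negation–as–failure, and unification on a few facts. The facts and the predicate idle/1 are illustrative assumptions and not part of the company database used elsewhere in this chapter.

```prolog
% Illustrative facts (assumption).
works_on(ann, p1).
works_on(bob, p1).
works_on(bob, p2).

employee(ann).
employee(bob).
employee(eve).

% idle(?E): E is an employee without any project (negation as failure).
idle(E) :-
   employee(E),
   \+ works_on(E, _).

% ?- works_on(bob, P).                      % backtracking: P = p1 ; P = p2
% ?- findall(E-P, works_on(E, P), Pairs).   % list of all answers
% ?- idle(E).                               % E = eve
% ?- point(X, 2) = point(1, Y).             % unification: X = 1, Y = 2
```

The negated call \+ works_on(E, _) succeeds exactly if the positive call works_on(E, _) fails; E is bound by employee(E) before the negated call is made.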
Data Structures, Operations, and Control Structures
• The restriction to a few basic data types and a single complex data type,
namely the terms, which is generic and subsumes all the other types,
standardizes the data structures.
• There are no explicit type declarations.
• There exists a large collection of generic operations that are applicable to
terms – and thus to all data types.
• Frequently, meta–predicates are used.
• Actually, control structures are meta–predicates as well. In addition to
standard control structures, such as branching (if–then–else), loops (for,
while), and recursion, user–defined control structures can be built as
meta–predicates.
Software Engineering Aspects
Prolog supports abstraction and compact code, and thus stimulates
refactoring:
• The generic type of terms with generic operations supports abstraction
and code reuse.
• User–defined control structures allow for further abstraction.
• Unification, implicit backtracking, and abstaining from explicit type
declarations, result in very compact code and support rapid prototyping.
• Declarativity makes the code much more readable and thus extensible.
Switching from conventional programming languages to the logic
programming paradigm is difficult and usually requires a lot of training
and effort.
Disjunctive Logic Programming
Sink in a Network
Fact Base: a network is represented as node facts and arc facts.
[Figure: network with the nodes a, b, c, and d.]
node(a).
node(b).
node(c).
node(d).
arc(a,b); arc(a,c).
arc(b,d).
arc(c,d).
Either there exists an arc from node a to b or from a to c (disjunction).
Rule Base: A node X is a sink, if there is no other node Y for which there is
no transitive connection (transitive closure, tc) from Y to X.
sink(X) :-
   node(X), not(not_sink(X)).
not_sink(X) :-
   node(X), node(Y), X \= Y, not(tc(Y, X)).
tc(X, Y) :-
   arc(X, Z), tc(Z, Y).
tc(X, Y) :-
   arc(X, Y).
Query:
?- sink(X)
X = d
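Standard Prolog cannot load the disjunctive fact arc(a,b); arc(a,c). One way to experiment with the example in plain Prolog is to make the choice explicit by an additional argument that names the alternative: in w1 the arc from a to b exists, in w2 the arc from a to c. The names w1 and w2 and the predicate certain_sink/1 are assumptions of this sketch; it is a simplified reading of the disjunction and not DisLog syntax or its evaluation method.

```prolog
node(a).
node(b).
node(c).
node(d).

% arc(World, From, To): w1 and w2 are the two alternatives.
arc(w1, a, b).
arc(w2, a, c).
arc(_, b, d).
arc(_, c, d).

tc(W, X, Y) :-
   arc(W, X, Y).
tc(W, X, Y) :-
   arc(W, X, Z), tc(W, Z, Y).

not_sink(W, X) :-
   node(Y), X \= Y, \+ tc(W, Y, X).

sink(W, X) :-
   node(X), \+ not_sink(W, X).

% certain_sink(?X): X is a sink in both alternatives.
certain_sink(X) :-
   node(X),
   forall(member(W, [w1, w2]), sink(W, X)).

% ?- certain_sink(X).
% X = d.
```

In both alternatives, every other node has a transitive connection to d, so d is a sink regardless of which disjunct holds.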
Course on Deductive Databases
Topics:
• foundations and applications of Prolog and Datalog,
data modelling and programming;
• the deductive database system DDBase;
• efficient evaluation of Datalog programs;
• further language constructs in the DDK (DisLog Developers' Kit):
– complex data structures,
– default negation and disjunction;
• applications on the basis of Prolog and DDBase.
8.2 Semantic Web Databases
Knowledge Engineering in the Semantic Web (Web 2.0) is based on
ontologies and logic.
Reasoning Tasks:
• supporting the search (query answering);
• in knowledge engineering / modelling: analysis of the structure of the
ontologies for anomalies.
Knowledge engineering and reasoning in the Semantic Web can be supported
by deductive databases and logic programming techniques.
In the Semantic Web, it is possible to reason about
• the ontology / taxonomy (i.e., the schema) and
• the instances.
This is called terminological or assertional (T–Box or A–Box) reasoning,
respectively. This makes search in the Semantic Web more effective.
• In the following printer ontology, we could search for a printer from HP,
and the result could be a laser–jet printer from HP, since the system
knows that hpLaserJetPrinter is a sub–class of hpPrinter.
• It can also be derived that no laser–jet printer from HP is a laser
writer from Apple; in this case, this is very easy, since it is explicitly
stored in the ontology.
Moreover, we will show in the following how to support knowledge
engineering by detecting anomalies in OWL ontologies.
The Web Ontology Language (OWL)
In OWL, we can mix concepts from
• rdf (Resource Description Framework) for defining instances and
• rdfs (rdf Schema) for defining the schema
of an application. Moreover, tags with the namespace owl are allowed.
The Semantic Web Rule Language (SWRL) incorporates logic programming
rules into OWL ontologies.
There exist well–known, powerful tools for asking queries on and for
reasoning with OWL ontologies.
The Printer Ontology
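[Figure: class hierarchy of the printer ontology with the classes product, hpProduct, printer, personalPrinter, ibmLaserPrinter, laserJetPrinter, appleLaserWriter, hpPrinter, hpLaserJetPrinter, and hpApplePrinter, including a {disjoint} constraint.]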
The Printer Ontology in OWL
<rdf:RDF
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
   xmlns:owl="http://www.w3.org/2002/07/owl#"
   xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
   xmlns="file:/protege/Ontologies/p.owl#">
   <owl:Ontology rdf:about="">
      <owl:versionInfo> Printer Example, Version 1.3, 02.02.2013
      </owl:versionInfo> </owl:Ontology>
   <owl:Class rdf:ID="printer"/>
   <owl:Class rdf:ID="laserJetPrinter">
      <rdfs:subClassOf rdf:resource="#printer"/> </owl:Class>
   ...
</rdf:RDF>
The following owl:Class element defines the class appleLaserWriter:
<owl:Class rdf:ID="appleLaserWriter">
<rdfs:comment>
Apple laser writers are laser jet printers
</rdfs:comment>
<rdfs:subClassOf rdf:resource="#laserJetPrinter"/>
<owl:disjointWith rdf:resource="#hpLaserJetPrinter"/>
</owl:Class>
The rdfs:subClassOf sub–element states that appleLaserWriter is a
sub–class of laserJetPrinter. The owl:disjointWith sub–element
states that appleLaserWriter is disjoint from hpLaserJetPrinter.
Observe that a reference to a class uses the attribute rdf:resource and a leading “#”,
whereas the defining owl:Class element uses the attribute rdf:ID and no “#”.
The following owl:Class element defines a class of printers from a joint
venture of HP and Apple:
<owl:Class rdf:ID="hpApplePrinter">
<rdfs:comment>
Printers from a joint venture of HP and Apple
</rdfs:comment>
<rdfs:subClassOf rdf:resource="#hpLaserJetPrinter"/>
<rdfs:subClassOf rdf:resource="#appleLaserWriter"/>
</owl:Class>
The existence of such printers would contradict the disjointWith
restriction between the classes hpLaserJetPrinter and
appleLaserWriter.
The emptiness of the class hpApplePrinter can be detected by reasoners
in the ontology editor Protégé.
Every laserJetPrinter is a printer, and every hpPrinter is an
hpProduct:
<owl:Class rdf:ID="printer"/>
<owl:Class rdf:ID="laserJetPrinter">
<rdfs:subClassOf rdf:resource="#printer"/>
</owl:Class>
<owl:Class rdf:ID="hpProduct"/>
<owl:Class rdf:ID="hpPrinter">
<rdfs:subClassOf rdf:resource="#hpProduct"/>
</owl:Class>
Redundant subClassOf Relation
Since hpLaserJetPrinter is a sub–class of hpPrinter and hpPrinter
is a sub–class of hpProduct, it is redundant to explicitly state that
hpLaserJetPrinter is a sub–class of hpProduct.
<owl:Class rdf:ID="hpLaserJetPrinter">
<rdfs:subClassOf rdf:resource="#laserJetPrinter"/>
<rdfs:subClassOf rdf:resource="#hpPrinter"/>
<rdfs:subClassOf rdf:resource="#hpProduct"/>
<owl:disjointWith rdf:resource="#appleLaserWriter"/>
</owl:Class>
This redundancy is not an error. We could simply consider it as an anomaly
that should be reported to the knowledge engineer.
This anomaly is not reported by reasoners in the ontology editor Protégé.
Instances
Finally, we have some instances of some of the defined classes:
<appleLaserWriter rdf:ID="1001"/>
<appleLaserWriter rdf:ID="1002"/>
<hpLaserJetPrinter rdf:ID="1003"/>
<hpLaserJetPrinter rdf:ID="1004"/>
As mentioned before, there cannot exist instances of the class
hpApplePrinter.
The Ontology Editor Protégé
The ontology editor Protégé comes with several plug–in reasoners, such as
• FaCT++,
• HermiT, and
• Racer.
In the session that is shown in the screenshot above, the emptiness of the class
hpApplePrinter was detected by the ontology reasoner FaCT++.
It is inferred that the class hpApplePrinter is EquivalentTo the
empty class Nothing. By clicking the question mark, an explanation can be
shown.
There are also databases for handling rdf data, so–called triple stores, such
as Sesame or Jena. They use SQL–like query languages, most notably SPARQL.
Declarative Queries in FNQuery
Complex XML data structures in Prolog:
'owl:Class':['rdf:ID':'appleLaserWriter']:[
   'rdfs:comment':['Apple laser ...'],
   'rdfs:subClassOf':[
      'rdf:resource':'#laserJetPrinter']:[],
   'owl:disjointWith':[
      'rdf:resource':'#hpLaserJetPrinter']:[] ]
An XML element is represented as a term structure T:As:C, called
FN–triple.
• T is the tag of the element,
• As is the list of the attribute/value pairs A:V of the element, and
• C is a list of FN–triples for the sub–elements.
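Since FN–triples are ordinary Prolog terms, they can also be inspected with plain unification and member/2. The following sketch is not FNQuery itself: the helpers fn_attribute/3 and fn_child/3 are hypothetical names that mimic the path steps Element@A and Element/Tag, and the comment sub–element of the example is omitted for brevity.

```prolog
% An FN-triple T:As:C as a plain Prolog term.
class_element(
   'owl:Class':['rdf:ID':'appleLaserWriter']:[
      'rdfs:subClassOf':['rdf:resource':'#laserJetPrinter']:[],
      'owl:disjointWith':['rdf:resource':'#hpLaserJetPrinter']:[] ]).

% fn_attribute(+Element, ?A, ?V): hypothetical helper for Element@A.
fn_attribute(_:As:_, A, V) :-
   member(A:V, As).

% fn_child(+Element, ?Tag, ?Child): hypothetical helper for Element/Tag.
fn_child(_:_:Cs, Tag, Tag:As:C) :-
   member(Tag:As:C, Cs).

% ?- class_element(E),
%    fn_child(E, 'rdfs:subClassOf', S),
%    fn_attribute(S, 'rdf:resource', R).
% R = '#laserJetPrinter'.
```

The operator : is right–associative, so T:As:C is read as T:(As:C), which is what the patterns in the two helpers rely on.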
FNSelect
In an OWL knowledge base Owl, there exists an isa relation between two
classes C1 and C2, if a subClassOf relation is stated explicitly, or
if C1 was defined as the intersection of C2 and some other classes:
% isa(+Owl, ?C1, ?C2) <-
isa(Owl, C1, C2) :-
   C := Owl/'owl:Class'::[@'rdf:ID'=C1],
   ( R2 := C/'rdfs:subClassOf'@'rdf:resource'
   ; R2 := C/'owl:intersectionOf'/'owl:Class'@'rdf:about' ),
   owl_reference_to_id(R2, C2).
% owl_reference_to_id(+Reference, ?Id) <-
owl_reference_to_id(Reference, Id) :-
   ( concat('#', Id, Reference)
   ; Id = Reference ).
Disjointness of Classes
% disjointWith(+Owl, ?C1, ?C2) <-
disjointWith(Owl, C1, C2) :-
   R2 := Owl/'owl:Class'::[@'rdf:about'=R1]
      /'owl:disjointWith'@'rdf:resource',
   owl_reference_to_id(R1, C1),
   owl_reference_to_id(R2, C2).
In the following, we often suppress the ontology argument Owl.
Transitive Closure of isa
% subClassOf(?C1, ?C2) <-
subClassOf(C1, C2) :-
   isa(C1, C2).
subClassOf(C1, C2) :-
   isa(C1, C), subClassOf(C, C2).
Anomalies in Ontologies
Cycle
?- isa(C1, C2), subClassOf(C2, C1).
C1 = personalPrinter,
C2 = printer
Partition Error
?- disjointWith(C1, C2),
subClassOf(C, C1), subClassOf(C, C2).
C = hpApplePrinter,
C1 = hpLaserJetPrinter,
C2 = appleLaserWriter
The class C is a sub–class of two disjoint classes C1 and C2.
Incompleteness
?- isa(C1, C), isa(C2, C), isa(C3, C),
disjointWith(C1, C2), not(disjointWith(C2, C3)).
C = laserJetPrinter,
C1 = hpLaserJetPrinter,
C2 = appleLaserWriter,
C3 = ibmLaserPrinter
The class C has three sub–classes C1, C2 and C3, of which only the two
sub–classes C1 and C2 are declared as disjoint in the knowledge base.
The fact that C2 and C3 are disjoint and that C1 and C3 are disjoint as well
was possibly forgotten by the knowledge engineer during the creation of the
knowledge base.
Redundant subClassOf/instanceOf Relations
% redundant_isa(?Chain) <-
redundant_isa(C1->C2->C3) :-
   isa(C1, C2), subClassOf(C2, C3),
   isa(C1, C3).
?- redundant_isa(Chain).
Chain = hpLaserJetPrinter -> hpPrinter -> hpProduct
The sub–class relation between C1 and C3 can be derived by transitivity over
the class C2.
Here, isa(C1, C2), subClassOf(C2, C3), requires that this deduction is
done over at least two levels.
Undefined Reference
During the development of an ontology in OWL, it is possible that we
reference a class that we have not yet defined.
% undefined_reference(+Owl, ?Ref) <-
undefined_reference(Owl, Ref) :-
   rdf_reference(Owl, Ref),
   not(owl_class(Owl, Ref)).
rdf_reference(Owl, Ref) :-
   ( R := Owl/descendant_or_self::'*'@'rdf:resource'
   ; R := Owl/descendant_or_self::'*'@'rdf:about' ),
   owl_reference_to_id(R, Ref).
owl_class(Owl, Ref) :-
   Ref := Owl/'owl:Class'@'rdf:ID'.
If we load such an ontology into Protégé, then the ontology reasoners may
produce wrong results, even for unrelated parts of the ontology.
8.3 Object–Oriented Databases
Application Domains
• engineering (CAD/CAM, CIM)
• image and graphics databases
• scientific applications
• geo–databases
• multimedia systems
• integration of heterogeneous databases
Influences and concepts from other areas of computer science:
• programming languages:
abstract data types and encapsulation
completeness (w.r.t. expressivity)
• software engineering:
modularisation, code extensibility and reuse
• artificial intelligence:
concepts for knowledge representation, classification
• databases:
(semantic) data modelling
8.3.1 Complex Objects
Every object has a unique object identifier OID.
This value is invisible for the user. It is only used internally by the system to
identify an object and to allow for references between different objects.
An object $o$ is represented by a triple $\langle i, c, v \rangle$:
• i is the unique object identifier,
• c is a type constructor,
• v is the value of o.
Type Constructors: atom, tuple, set, list, array
Domains for atomic values: integer, real, string, boolean, date, . . .
Given an object $o = \langle i, c, v \rangle$.
• If c = atom, then v is an atomic value.
• If c = tuple, then
$v = \langle a_1 : i_1, a_2 : i_2, \ldots, a_n : i_n \rangle$
is a tuple with attribute names $a_j$ and OIDs $i_j$.
• If c = set, then
$v = \{ i_1, i_2, \ldots, i_n \}$
is a set of OIDs $i_j$.
• If c = list/array, then v is an ordered list / an array of OIDs.
A complex object can be represented by a graph.
Two complex objects $o_1 = \langle i_1, c_1, v_1 \rangle$ and $o_2 = \langle i_2, c_2, v_2 \rangle$ are called
• deeply equal, if $c_1 = c_2$ and $v_1 = v_2$,
• shallow equal, if their graphs are isomorphic and the atomic values in
the corresponding leaves are the same.
OODDL: Object–Oriented Data Definition Language
Nowadays, complex objects are frequently stored and managed using XML
databases.
Example (Complex Objects)
The complex objects $o_1$, $o_2$, and $o_3$ contain the atomic objects $o_4$, $o_5$, and $o_6$
as sub–objects:
$o_1 = \langle i_1, tuple, \langle a_1 : i_4, a_2 : i_6 \rangle \rangle$,
$o_2 = \langle i_2, tuple, \langle a_1 : i_4, a_2 : i_6 \rangle \rangle$,
$o_3 = \langle i_3, tuple, \langle a_1 : i_5, a_2 : i_6 \rangle \rangle$,
$o_4 = \langle i_4, atom, 10 \rangle$,
$o_5 = \langle i_5, atom, 10 \rangle$,
$o_6 = \langle i_6, atom, 20 \rangle$.
$o_1$ and $o_2$ are deeply equal; $o_1$ and $o_3$ are shallow equal.
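The two notions of equality can be checked on this example with a small Prolog sketch. The representation object(Oid, Constructor, Value) and all predicate names are assumptions made for illustration; only the constructors atom and tuple are handled.

```prolog
% object(Oid, Constructor, Value): the six objects of the example.
object(i1, tuple, [a1:i4, a2:i6]).
object(i2, tuple, [a1:i4, a2:i6]).
object(i3, tuple, [a1:i5, a2:i6]).
object(i4, atom, 10).
object(i5, atom, 10).
object(i6, atom, 20).

% deeply_equal(+I1, +I2): same constructor and same value (same OIDs).
deeply_equal(I1, I2) :-
   object(I1, C, V),
   object(I2, C, V).

% resolved(+Oid, -State): value with all OID references resolved.
resolved(I, V) :-
   object(I, atom, V).
resolved(I, tuple(Avs)) :-
   object(I, tuple, Refs),
   maplist(resolve_pair, Refs, Avs).

resolve_pair(A:I, A:V) :-
   resolved(I, V).

% shallow_equal(+I1, +I2): equal after resolving the references.
shallow_equal(I1, I2) :-
   resolved(I1, S),
   resolved(I2, S).

% ?- deeply_equal(i1, i2).    % true
% ?- deeply_equal(i1, i3).    % false: a1 refers to i4 and i5, respectively
% ?- shallow_equal(i1, i3).   % true: both resolve to tuple([a1:10, a2:20])
```

Comparing the resolved states is a simple stand–in for the isomorphism test on the object graphs; it suffices for tree–shaped objects such as the ones in this example.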
[Figure: graphs of the objects o1 (OID i1) and o3 (OID i3). Both are tuples whose attributes a1 and a2 reference atomic objects: o1 references o4 and o6, o3 references o5 and o6. Both resolve to <a1:10, a2:20>.]
Identity after resolving the references
Nested Relations
Given a set $U$ of attributes with domains $dom(A)$, $A \in U$.
Formats $R$ and domains $dom(R)$ are defined recursively:
• $R = (A_1, \ldots, A_n, R_1, \ldots, R_m)$ with $A_i \in U$, $1 \le i \le n$, and formats
$R_j$, $1 \le j \le m$, is a format with
$dom(R) = dom(A_1) \times \ldots \times dom(A_n) \times 2^{dom(R_1)} \times \ldots \times 2^{dom(R_m)}$.
• If $m = 0$, then $R$ is a basic format.
A nested tuple over a format $R$ is an element of $dom(R)$.
A nested relation or NF²–relation (Non–First–Normal–Form) over $R$ is a
subset of $dom(R)$.
Example (NF²–Relation)
Formats:
Children = (Cname, BDate, Sex)
Graduations = (Type, Date)
Employees = (Id, Name, Address, Children, Graduations)
NF²–relation over the format Employees:
[Table: Employees with the atomic attributes Id, Name, Address and the relation–valued attributes Children and Graduations. Values in table order:]
Id, Name, Address: (100, Joe, LA), (200, Theo, NY)
Children (Cname, Bdate, Sex): (Mary, 120261, F), (Peter, 041465, M), (John, 082270, M), (Mary, 051578, F), (Laura, 051578, F)
Graduations (Type, Date): (driv_lic, 121255), (phd_cs, 021565), (driv_lic, 082686)
8.3.2 Features of Object–Orientation
Encapsulation of Structure and Behaviour
In the relational data model there exist generic operations for searching,
inserting, deleting, and updating tuples, which can be applied to arbitrary
relation schemas.
In object–oriented databases there are visible and hidden attributes.
• The visible attributes can be accessed by a declarative query language.
• The hidden attributes are accessed by sending messages (message
passing) between the objects.
Each object type “has” integrity conditions, which are realised in the access
operations.
Type and Class Hierarchies, Inheritance
A type is given by its
• type name,
• attributes, and
• operations (methods).
As a generalization of attribute and method we use the term function.
A type hierarchy is an acyclic, binary relation on the set of types:

           Person                  Supertype
          /      \                     |
     Student    Faculty                v
        |                           Subtype
   Grad_Student

specialization: ↓
generalization: ↑
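As a sketch, the hierarchy can be stored as a set of (subtype, supertype) pairs, assuming Student and Faculty are subtypes of Person and Grad_Student is a subtype of Student. The helper names `supertypes` and `is_acyclic` are illustrative.

```python
# (subtype, supertype) pairs of the example hierarchy
SUBTYPE_OF = {
    ("Student", "Person"),
    ("Faculty", "Person"),
    ("Grad_Student", "Student"),
}


def supertypes(t, relation=SUBTYPE_OF):
    """All direct and indirect supertypes of t (following the generalization direction)."""
    result = set()
    frontier = [t]
    while frontier:
        current = frontier.pop()
        for sub, sup in relation:
            if sub == current and sup not in result:
                result.add(sup)
                frontier.append(sup)
    return result


def is_acyclic(relation=SUBTYPE_OF):
    """The relation is acyclic if no type is its own (indirect) supertype."""
    types = {t for pair in relation for t in pair}
    return all(t not in supertypes(t, relation) for t in types)
```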
A sub–type inherits the functions of the super–type (inheritance).
Additionally, the sub–type has its own functions.
→ multiple inheritance, selective inheritance
A class is a set of objects, which usually are of the same type.
Usually, the set of all stored objects of each type forms a class.
Classes can form hierarchies, too.
The type system in OODBs can be extended at run time.
Frequently, the non–standard data type BLOB (binary large object) is used for
• raster pixel pictures and
• long text strings.
These are supported as abstract data types with suitable access operations.
Polymorphism (Operator Overloading)
The same operator name can have different implementations.
The implementation which is suitable for a certain object is determined at run
time, when the type of the object is known (late binding).
E.g., the function “area” for calculating the area is implemented differently
for different geometrical objects.
GEOMETRY_OBJECT: Shape, Area, Centerpoint
RECTANGLE subtype-of GEOMETRY_OBJECT
  (Shape='rectangle'): Width, Height
TRIANGLE subtype-of GEOMETRY_OBJECT
  (Shape='triangle'): Side1, Side2, Angle
CIRCLE subtype-of GEOMETRY_OBJECT
  (Shape='circle'): Radius
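Late binding of "area" can be sketched in Python as follows. The class and attribute names mirror the type definitions; the assumption that Angle is the angle enclosed by Side1 and Side2, given in radians, is not stated in the source.

```python
import math


class GeometryObject:
    shape = "unknown"

    def area(self):
        raise NotImplementedError("each subtype provides its own implementation")


class Rectangle(GeometryObject):
    shape = "rectangle"

    def __init__(self, width, height):
        self.width, self.height = width, height

    def area(self):
        return self.width * self.height


class Triangle(GeometryObject):
    shape = "triangle"

    def __init__(self, side1, side2, angle):
        # Assumption: angle is the angle enclosed by side1 and side2, in radians.
        self.side1, self.side2, self.angle = side1, side2, angle

    def area(self):
        return 0.5 * self.side1 * self.side2 * math.sin(self.angle)


class Circle(GeometryObject):
    shape = "circle"

    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


def total_area(objects):
    """The implementation of area() is selected per object at run time."""
    return sum(obj.area() for obj in objects)
```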
Multiple and Selective Inheritance
• Multiple inheritance occurs in a type hierarchy, if a type T is a sub–type
of several super–types T1 , . . . , Tn :
  T1   T2   ...   Tn
    \   |        /
         T
Then T inherits the functions of T1 , . . . , Tn ; this can lead to ambiguities.
• Selective inheritance occurs, if a type should only inherit some special
functions of one of its super–types T ′ . In this case, the undesired
functions are excluded (EXCEPT clause).
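Both cases can be imitated in Python, as a rough sketch with hypothetical types. Python resolves the ambiguity of multiple inheritance by its method resolution order; it has no EXCEPT clause, so the exclusion below is emulated by hand and is not how an OODB would implement it.

```python
class Student:
    def id_card(self):
        return "student card"

    def enroll(self):
        return "enrolled"


class Staff:
    def id_card(self):
        return "staff card"

    def salary(self):
        return 1000


class TeachingAssistant(Student, Staff):
    """Multiple inheritance: id_card is defined by both supertypes (ambiguity)."""

    def id_card(self):
        # Resolve the ambiguity explicitly instead of relying on the default order.
        return Staff.id_card(self)


class Visitor(Student):
    """Selective inheritance: inherit from Student EXCEPT enroll (emulated)."""

    EXCLUDED = {"enroll"}

    def __getattribute__(self, name):
        if name in type(self).EXCLUDED:
            raise AttributeError(name)
        return super().__getattribute__(name)
```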
Versions and Configurations
Many database applications require the management of different versions
of complex objects:
• software projects
• CAD applications.
A version graph shows the relations between the different versions of an
object.
A configuration of a complex object is a composition of compatible versions
for the sub–objects.
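A small Python sketch of these two notions, with made-up data: the version names, the derived-from edges, and the compatibility pairs are all hypothetical and only serve to show the structure.

```python
from itertools import combinations

# Version graph: each version maps to the versions it was derived from (hypothetical data).
VERSION_GRAPH = {
    "engine_v1": [],
    "engine_v2": ["engine_v1"],
    "engine_v3": ["engine_v2"],
    "body_v1": [],
    "body_v2": ["body_v1"],
}

# Pairs of versions that are declared compatible (hypothetical data).
COMPATIBLE = {
    frozenset({"engine_v1", "body_v1"}),
    frozenset({"engine_v2", "body_v2"}),
    frozenset({"engine_v3", "body_v2"}),
}


def history(version):
    """All versions from which the given version was derived, nearest first."""
    result = []
    stack = list(VERSION_GRAPH[version])
    while stack:
        v = stack.pop()
        if v not in result:
            result.append(v)
            stack.extend(VERSION_GRAPH[v])
    return result


def is_configuration(choice):
    """choice maps each sub-object to one version; all chosen pairs must be compatible."""
    return all(frozenset(pair) in COMPATIBLE
               for pair in combinations(choice.values(), 2))
```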
8.3.3 Examples: COMPANY and UNIVERSITY Database
In the following we will see the
1. types,
2. classes,
3. methods, and
4. some queries
for two examples.
The COMPANY Database as an OODB
[Figure: object graph of the Research department. The tuple object o8 with the attributes DNAME, DNUMBER, MGR, LOCATIONS, EMPLOYEES, PROJECTS references the atom, set, and tuple objects o1, ..., o11 (values Houston, Bellaire, Sugarland, 5, Research, 22-May-78) and further tuple objects i12, ..., i17.]
Complex Objects
o1  = < i1, atom, Houston >,
o2  = < i2, atom, Bellaire >,
o3  = < i3, atom, Sugarland >,
o4  = < i4, atom, 5 >,
o5  = < i5, atom, Research >,
o6  = < i6, atom, 22-May-78 >,
o7  = < i7, set, { i1, i2, i3 } >,
o8  = < i8, tuple, < DNAME: i5, DNUMBER: i4, MGR: i9,
                     LOCATIONS: i7, EMPLOYEES: i10, PROJECTS: i11 > >,
o9  = < i9, tuple, < MANAGER: i12, MANAGERSTARTDATE: i6 > >,
o10 = < i10, set, { i12, i13, i14 } >,
o11 = < i11, set, { i15, i16, i17 } >, ...
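The triples (identifier, type constructor, value) can be sketched as a Python dictionary keyed by object identifier. The sketch only covers the objects whose values are fully listed; the tuple i8 is therefore restricted to three of its attributes, and `resolve` is an illustrative helper that would not terminate on cyclic references.

```python
# Object store: oid -> (type constructor, value); references are stored as oids.
OBJECTS = {
    "i1": ("atom", "Houston"),
    "i2": ("atom", "Bellaire"),
    "i3": ("atom", "Sugarland"),
    "i4": ("atom", 5),
    "i5": ("atom", "Research"),
    "i6": ("atom", "22-May-78"),
    "i7": ("set", {"i1", "i2", "i3"}),
    # Only the attributes of o8 whose targets are listed above (simplification).
    "i8": ("tuple", {"DNAME": "i5", "DNUMBER": "i4", "LOCATIONS": "i7"}),
}


def resolve(oid):
    """Replace references by the referenced values (assumes an acyclic object graph)."""
    kind, value = OBJECTS[oid]
    if kind == "atom":
        return value
    if kind == "set":
        # A sorted list is returned because resolved members need not be hashable.
        return [resolve(member) for member in sorted(value)]
    if kind == "tuple":
        return {attr: resolve(ref) for attr, ref in value.items()}
    raise ValueError(f"unknown type constructor: {kind}")
```

Two objects with different identifiers can resolve to equal values, which is the difference between identity and value equality.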
Data Types
define type Date:
tuple( year: integer, month: integer, day: integer );
define type Employee:
tuple( name: string, ssn: string,
birthdate: Date, sex: char, dept: Department );
define type Department:
tuple( dname: string, dnumber: integer,
mgr: tuple( manager: Employee, startdate: Date ),
locations: set(string),
employees: set(Employee),
projects: set(Project) );
Classes
define class Employee:
type tuple( name: string,
ssn: string,
birthdate: Date,
sex: char,
dept: Department );
operations
age(e: Employee): integer,
create_new_emp: Employee,
destroy_emp(e: Employee): boolean;
define class Department:
type tuple ( dname: string, dnumber: integer,
mgr: tuple( manager: Employee, startdate: Date ),
locations: set (string),
employees: set (Employee),
projects: set (Project) );
operations
number_of_emps(d: Department): integer,
create_new_dept: Department,
destroy_dept(d: Department): boolean,
add_emp(d: Department, e: Employee): boolean,
remove_emp(d: Department, e: Employee): boolean;
define class DepartmentSet:
type set (Department);
operations
create_dept_set: DepartmentSet,
destroy_dept_set(ds: DepartmentSet): boolean,
add_dept(ds: DepartmentSet, d: Department): boolean,
remove_dept(ds: DepartmentSet, d: Department): boolean;
persistent name AllDepartments: DepartmentSet;
/* AllDepartments is a persistent named object of type set(Department) */
...
d := create_new_dept;
/* creates new department object in the variable d */
b := add_dept(AllDepartments, d);
/* makes d persistent by adding it to a persistent named object */
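The idea that an object becomes persistent by being attached to a persistent named object can be sketched in Python. The in-memory set `all_departments` merely stands in for the persistent root AllDepartments; nothing is actually written to a database, and `is_persistent` is an illustrative helper.

```python
class Department:
    def __init__(self, dname=None):
        self.dname = dname


# Stand-in for the persistent named object AllDepartments (in memory only).
all_departments = set()


def create_new_dept():
    """Creates a new, initially transient department object."""
    return Department()


def add_dept(ds, d):
    """Adds d to the department set ds; returns False if d was already a member."""
    if d in ds:
        return False
    ds.add(d)
    return True


def is_persistent(obj, roots):
    """An object counts as persistent here if some persistent root contains it."""
    return any(obj in root for root in roots)
```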
The UNIVERSITY Database as an OODB
Data Types
type Phone: tuple(
area_code: integer,
number: integer );
type Date: tuple(
year: integer,
month: integer,
day: integer );
Classes
class Person
type tuple(
ssn: string,
name: tuple( firstname: string, middlename: string, lastname: string ),
address: tuple( number: integer, street: string, apt_no: string,
city: string, state: string, zipcode: string ),
birthdate: Date,
sex: character );
method age: integer
end
class Student inherit Person
type tuple(
class: string, majors_in: Department, minors_in: Department,
registered_in: set(Section),
transcript: set ( tuple(
grade: character, ngrade: real, section: Section ) ) );
method grade_point_average: real, change_class: boolean,
change_major(new_major: Department): boolean;
end
class Grad_Student inherit Student
type tuple(
degrees: set ( tuple ( college: string, degree: string, year: integer ) ),
advisor: Faculty );
end
class Faculty inherit Person
type tuple(
salary: real, rank: string, foffice: string, fphone: Phone,
belongs_to: set(Department),
grants: set(Grants), advises: set(Student) ),
method promote_faculty, give_raise(percent: real);
end
class Department
type tuple(
dname: string, office: string, dphone: Phone,
members: set(Faculty), majors: set(Student),
chairperson: Faculty, courses: set(Course) ),
method add_major(s: Student), remove_major(s: Student):boolean
end
class Section
type tuple(
sec_num: integer, qtr: Quarter, year: Year,
students: set ( tuple( stud: Student, grade: character ) ),
course: Course,
teacher: Instructor ),
method change_grade(s: Student, g: character);
end
class Course
type tuple(
cname: string, cnumber: string, cdescription: string,
sections: set(Section),
offering_dept: Department );
end
Methods
method body age: integer in class Person
{
  int a;
  Date d;
  d = today();
  a = d->year - self->birthdate->year;
  /* subtract one year if this year's birthday has not yet occurred */
  if ((d->month < self->birthdate->month) ||
      ((d->month == self->birthdate->month) &&
       (d->day < self->birthdate->day)))
    --a;
  return a;
}
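For comparison, the same computation as a Python sketch using the standard `datetime.date` type. The optional parameter `today` is an addition for testability and is not part of the method above.

```python
from datetime import date


def age(birthdate, today=None):
    """Age in full years; one year is subtracted if the birthday is still ahead this year."""
    if today is None:
        today = date.today()
    years = today.year - birthdate.year
    # Tuple comparison covers both cases: earlier month, or same month and earlier day.
    if (today.month, today.day) < (birthdate.month, birthdate.day):
        years -= 1
    return years
```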
method body grade_point_average: real in class Student
{
float sum=0.0;
int count=0;
struct {
char gr;
float ngrade;
o2_Section sec;
} t;
for (t in self->transcript) {
/* increment sum by ngrade, count by 1 */
sum += t->ngrade; ++count;
}
return sum/count;
}
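A Python sketch of the same aggregation, assuming the transcript is given as (grade, ngrade, section) tuples. Returning None for an empty transcript is a choice made here; the method body above would divide by zero in that case.

```python
def grade_point_average(transcript):
    """Mean of the numeric grades in a transcript of (grade, ngrade, section) tuples."""
    ngrades = [ngrade for _grade, ngrade, _section in transcript]
    if not ngrades:
        return None  # assumption: the average is undefined for an empty transcript
    return sum(ngrades) / len(ngrades)
```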
method body
change_major (new_major: Department): boolean in class Student
{
  /* remove_major returns 1 on success; give up if self was not a major */
  if (!self->majors_in->remove_major(self)) {
    return 0;
  }
  else {
    new_major->add_major(self);
    self->majors_in = new_major;
    return 1;
  }
}
method body
remove_major(s: Student): boolean in class Department
{
if (s in self->majors) {
/* -= applies set difference to remove object s from the set of majors */
self->majors -= set(s);
return 1;
}
else return 0;
}
method body
add_major(s: Student) in class Department
{
/* += apply set union to add object s to set of majors */
self->majors += set(s);
}
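How the three methods change_major, remove_major, and add_major cooperate can be sketched in Python. The sketch assumes that remove_major reports success with a true value and that change_major refuses the change when the student was not registered as a major of the current department; the constructor arguments are illustrative.

```python
class Department:
    def __init__(self, dname):
        self.dname = dname
        self.majors = set()

    def add_major(self, s):
        self.majors |= {s}          # set union, like += on sets above

    def remove_major(self, s):
        if s in self.majors:
            self.majors -= {s}      # set difference, like -= on sets above
            return True
        return False


class Student:
    def __init__(self, name, majors_in):
        self.name = name
        self.majors_in = majors_in
        majors_in.add_major(self)

    def change_major(self, new_major):
        # Both sides of the relationship are updated together, or not at all.
        if not self.majors_in.remove_major(self):
            return False
        new_major.add_major(self)
        self.majors_in = new_major
        return True
```

Keeping `majors_in` and `majors` in step inside the methods is what maintains the inverse relationship between Student and Department.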
/* a persistent root to hold all persistent Person objects */
name All_Persons: set(Person);
/* a persistent root to hold a single Person object */
name John_Smith: Person;
run body {
/* create a new Person object p */
o2 Person p = new Person;
*p = tuple (
  ssn: "222222222",
  name: tuple(
    firstname: "Franklin", middlename: "T", lastname: "Wong" ),
  address: tuple( number: 638, street: "Voss Road",
    city: "Houston", state: "Texas", zipcode: "77079" ),
  birthdate: tuple( year: 1945, month: 12, day: 8 ),
  sex: 'M' );
/* p becomes persistent by attaching to persistent root */
All_Persons += set(p);
/* now put values in persistent named object John_Smith */
John_Smith->ssn = "444444444";
John_Smith->name = tuple(
  firstname: "John", middlename: "B", lastname: "Smith" );
John_Smith->address = tuple( number: 731, street: "Fondren Road",
  city: "Houston", state: "Texas", zipcode: "77036" );
John_Smith->birthdate = tuple( year: 1955, month: 1, day: 9 );
John_Smith->sex = 'M';
}
Queries
select tuple (
  fname: s.name.firstname, lname: s.name.lastname )
from s in Student
where s.majors_in.dname = "Computer Science"

select tuple(
  fname: s.name.firstname, lname: s.name.lastname,
  transcript: select tuple(
      cname: sc.section.course.cname, sec_no: sc.section.sec_num,
      quarter: sc.section.qtr, year: sc.section.year, grade: sc.grade )
    from sc in sec )
from s in Student, sec in s.transcript
where s.majors_in.dname = "Computer Science"
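The first query corresponds closely to a list comprehension over the class extent. The following Python sketch assumes students are given as nested dictionaries; the sample data in the usage note is hypothetical.

```python
def cs_majors(students):
    """select tuple(fname, lname) from s in Student where s.majors_in.dname = 'Computer Science'"""
    return [
        {"fname": s["name"]["firstname"], "lname": s["name"]["lastname"]}
        for s in students
        if s["majors_in"]["dname"] == "Computer Science"
    ]


# Hypothetical extent of the class Student.
sample_students = [
    {"name": {"firstname": "Ann", "lastname": "Example"},
     "majors_in": {"dname": "Computer Science"}},
    {"name": {"firstname": "Bob", "lastname": "Sample"},
     "majors_in": {"dname": "Mathematics"}},
]
```

Path expressions such as s.majors_in.dname become chained lookups, and the constructed result tuples become dictionaries.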