Difference between revisions of "Update through views: rewrites"

Latest revision as of 11:00, 7 February 2017

[todo: currently includes all examples from discussions; add more rewrites for full coverage; precisely specify the rewrite rules and qualifying conditions]

Preliminaries/general approach

Assume a variant of Tutorial D in which targets for update can be relational expressions, not merely relvar names. Then a view name (virtual relvar) is to be a shorthand for its defining expression. We're already doing that in read contexts. Do so also in update target contexts.

I.e. if there's a virtual defined in the form

VAR V VIRTUAL (R WHERE p)

as a target for update

UPDATE V WHERE q : assigns;

expand to the syntax for update 'targets' as relational expressions:

UPDATE (R WHERE p) WHERE q : assigns;

Is only a smidgeon away from:

UPDATE R WHERE (p AND q) : assigns;

So that is now targetting a base variable.

A virtual's defining expression might itself refer to virtuals. Recursively substitute their defining expressions, until the target expression consists only of base relvar names, literals/constants and relational operators.

Where a given base relvar appears more than once in the target expression, refactor the expression (if possible) so that it appears only once. For example:

(R WHERE p) UNION (R WHERE q)	==> R WHERE (p OR q)
R MINUS (R WHERE q)	==> R WHERE NOT q

That is, refactor using the equational equivalences used typically for optimising query plans. These must be information-preserving.

If a multiply-occuring base relvar cannot be refactored to a single occurence, that is of itself survivable, but it might have the effect that some relexpr is not updatable-through, whereas a suitably refactored expression would be.

If there's an IND between two relations R1, R2, it doesn't seem a big stretch to put them together as target in a single update statement, rather than two. (That is, put them together as a shorthand for the Multiple Assignment. Figuring out the separate updates is something a dbms could do mechanically; why impose that load on the programmer?)

INSERT (R1 JOIN R2) v;

==> INSERT R1 (v {attribs of R1}, INSERT R2 (v {attribs of R2});

In various examples I've identified the sort of criteria that can be considered:

keys and cardinality of joins
constraints (particularly INDs)
projections that make attributes 'inaccessible' from the view
the form of the update request, Prescription 2.

(for example DELETE ... WHERE ... could succeed whereas DELETE ... v might not.)

in general anything that can be examined at compile time, Prescription 3.

Semantics: effect of update operations

For definiteness, this is the specified effect for the update operations (Prescription 2), in style of RM Prescription 21

After assignment of v to V (the “target variable”), the equality comparison V = v shall evaluate to TRUE.

These are entirely standard; stated here to show equivalences of update operations.

Operation (see #Notation)	After update this equation shall hold true (where re' is the 'after' value)
INSERT re v;	re' == re UNION v
DELETE "re" v;	re' == re MINUS v
DELETE "re" WHERE p;	re' == re MINUS (re WHERE p)
UPDATE re WHERE p : assigns;	re' == (re MINUS (re WHERE p)) UNION Upd((re WHERE p), assigns)
re := v; // assignment	re' == v, == (re MINUS (re MINUS v)) UNION (v MINUS re)

The rewrites may take advantage of the following equivalences, especially so that some operation be supported which would not otherwise be updatable-through.

Operation	==> equivalent operation
INSERT re v;	==> re := re UNION v;
DELETE re v;	==> re := re MINUS v;
DELETE re v;	==> DELETE re WHERE (rel{tup{*}} MATCHING v);
re := v;	==> DELETE re (re MINUS v) , INSERT re (v MINUS re);

In particular, := assignment is only supported on condition that both DELETE and INSERT are supported, so usually is rewritten as such.

Sub-terms that are not updatable-through

Where a sub-term is a literal/constant, it is not updatable-through. (In effect it is subject to a constraint that it always be that value, so falls under Prescription 2 "Golden Rule".) Rewrites can take advantage of that constraint to disambiguate rewrites. For example

[to do]

Notation

relexpr, re, re', re1, re2, ...	relational expressions
R, R', R1, R2, ...	relvar names (i.e. bases)
p, q, p1, q1, ...	predicate expressions (booleans) for WHERE clauses
assigns	attribute assignment lists in UPDATE or EXTEND
Upd(relexpr, assigns)	a function that applies assigns to the relation value of relexpr, yielding a relation value
==>	rewrite rule: transform the left operation to the right
<==>	two-way rewrite: transform in either direction
rel{tup{*}}	a singleton relation formed from the 'current' tuple of some enclosing relation expression
relop	an arbitrary relational operator (infix), including projection considered as an operator
relexpr {attribs of re2}	a projection of relexpr on the attributes of re2
upd1 , upd2;	Note the , comma, not semicolon separating two update statements. This is standard Tutorial D Multiple Update (aka Multiple Assignment). Typically the rewrites for a target with more than one base relvar will be set of Multiple Updates (separated by commas), one for each relvar. The rewrites will 'walk' down the target's expression tree, generating an update for each sub-term until it gets to the base relvars.

JOINs

A) INSERT through 1:many JOIN with IND Supported for update.

INSERT (R1 JOIN R2) v ;

==> INSERT R1 (v{attribs of R1}), INSERT R2 (v{attribs of R2});

B) INSERT through 1:many JOIN without IND Not supported: an INSERT on the 1 side could MATCH tuples already in the many side, that would thereby 'appear' in the result; so breaking Assignment Principle.

C) DELETE through 1:many JOIN (with or without IND) Supported: delete from the many side only.

DELETE (R1 JOIN R2) v; // R2 is the many side

==> DELETE R2 ( (R2 MATCHING R1) MATCHING v);

Deleting from the 1 side might 'unMATCH' non-targetted tuples on the many side, that would thereby 'disappear' from the result. If no such tuples, we might DELETE the 1 side, but that's not essential.

D) DELETE through 1:1 JOIN with two-way IND (i.e. EQD) Supported: delete from both sides.

DELETE (R1 JOIN R2) v;

==> DELETE R1 ( (R1 MATCHING R2) MATCHING v), // Multiple Update

DELETE R2 ( (R2 MATCHING R1) MATCHING v);

E) DELETE through 1:1 JOIN with a single IND Supported: delete from the dependent (subset) side only.

DELETE (R1 JOIN R2) v; // R2 is the subset side

==> DELETE R2 ( (R2 MATCHING R1) MATCHING v);

F) DELETE through 1:1 join without IND Not supported: could be achieved by deleting from either side only or both. Then neither tuple DELETE is essential.

Assign through 1:many JOIN with IND

(R1 JOIN R2) := v;

==> DELETE (R1 JOIN R2) ((R1 JOIN R2) MINUS v), INSERT (R1 JOIN R2) (v MINUS (R1 JOIN R2));

Restrictions

The Appendix A approach is that Restrictions are Joins. And yet joins that arise from restrictions are typically not joins of bases. Examples:

(R WHERE X = 1) <==> R JOIN REL{TUP{X 1}}

REL{TUP{X 1}} is not a base. In fact it's a constant, so not updatable at all.

(R WHERE X > Y) <==> R JOIN REL{{{x, y} | x IN dom(X), y IN dom(Y), x > y}}

That REL{ } is not a base. I've defined it using set builder notation. In fact it's a constant, so not updatable at all.

In case of (R WHERE <subselect on R2>) we need to consider more carefully. Typically R2 has attributes that don't appear in R, so any update would amount to update-through-projection. Which is a different kettle of fish to update-through-join.

So we need update-through-restriction, which is usually regarded as non-controversial:

INSERT (R WHERE p) v; ==> INSERT R (v WHERE p);
DELETE (R WHERE p) v; ==> DELETE R (v WHERE p);
DELETE (R WHERE p) WHERE q; ==> DELETE R WHERE (p AND q)
Therefore assign(:=) is supported, being a combo of DELETE, INSERT.
UPDATE also supported, being a combo of DELETE, INSERT.

A tad tricky because the updated tuples might fail the restriction.

And that attempt should be rejected as failing prescription 2 "Assignment P". aka SQL's WITH CHECK OPTION, approximately.

Assume I have a function Upd(rel, assigns) that applies assigns to rel:

UPDATE (R WHERE p) WHERE q : assigns; ==>

IF IS_EMPTY(Upd(R WHERE (p AND q), assigns) WHERE NOT p)

THEN UPDATE R WHERE (p AND q) : assigns ;

ELSE Fail ;

Projections

Projections are awkward for updating through. Push projections towards the leaves as far as possible, for example:

For [NOT] MATCHING, which needs a projection and JOIN

Appendix A gives: re1 MATCHING re2 ==> (re1 JOIN re2) { attribs of re1 }

Instead: re1 MATCHING re2 ==> re1 JOIN (re2 { attribs in common } )

For projections that are really implementing RENAME, change to RENAME

(EXTEND re : { B := A }) {ALL BUT A} ==> (re RENAME A AS B)

For no-op projections (i.e. is projecting all attributes), all updates are supported.

Rewrite updop (re{ ... }) ...; ==> updop re ...;

G) DELETE through projection Supported (whether or not the projection is key-preserving.) DELETE (R{ ... }) WHERE p; ==> DELETE R WHERE p; DELETE (R{ ... }) v; ==> DELETE R (R MATCHING v); We must DELETE exactly those base tuples to achieve the DELETE of the proj. We can't DELETE too many, by those rewrites.

H) INSERT through projection Not valid. (No surprise.) Therefore

I) assign (:=) through projection Not valid.

J) UPDATE through projection Valid providing the projection is key-preserving (i.e. at least one key). UPDATE (R{ ... }) WHERE p : assigns; ==> UPDATE R WHERE p : assigns; Note: the assigns could be changing the key.

Valid even if the projection is not key-preserving, providing the assigns are not changing attribs in the based-on's key (for at least one key).

[For ref: Dave's suggested 'template' for spcifying rewrites]

2. Update through JOIN

Given r1 JOIN r2 where r1 and r2 are arbitrary relations.

a. Rewrite UPDATE r1 JOIN r2: {p := q} as UPDATE ….

b. Rewrite INSERT (r1 JOIN r2) v as INSERT ….

c. Rewrite DELETE r1 JOIN r2 WHERE p as DELETE ...

d. Rewrite (r1 JOIN r2) := v as ...

Or: Not supported.

@@ Line 59: / Line 59: @@
 | DELETE "re" WHERE ''p'';  || ''re' '' == ''re'' MINUS (''re'' WHERE ''p'')
 |- style="vertical-align:top;"
-| UPDATE ''re'' WHERE ''p'' : ''assigns'';  ||  ''re' '' == (''re'' MINUS (''re'' WHERE ''p'')) UNION ''Upd''((''re'' WHERE ''p''), ''assigns'')
+| UPDATE ''re'' WHERE ''p'' : ''assigns'';  || ''re' '' == (''re'' MINUS (''re'' WHERE ''p'')) UNION ''Upd''((''re'' WHERE ''p''), ''assigns'')
 |- style="vertical-align:top;"
 | ''re'' ''':=''' ''v'';  // assignment   || ''re' '' == ''v'',  == (''re'' MINUS (''re'' MINUS ''v'')) UNION (''v'' MINUS ''re'')
@@ Line 67: / Line 67: @@
 {|
-! style="width: 35%; " |Operation         !! ==> equivalent operation
+! style="width: 40%; " | Operation         !! ==> equivalent operation
 |-
 | INSERT ''re''  ''v'';    || ==> ''re'' := ''re'' UNION ''v'';

Difference between revisions of "Update through views: rewrites"

Latest revision as of 11:00, 7 February 2017

Contents

Preliminaries/general approach

Semantics: effect of update operations

Sub-terms that are not updatable-through

Notation

JOINs

Restrictions

Projections

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

Tools