Update through views: a not-quite-so seminal paper - asking too much; can base relations be orthogonal?

From D Wiki
Jump to: navigation, search

Another much-cited paper on updating through views, of the same vintage as Dayal & Bernstein or Codd's 12 rules, is Bancilhon & Spyratos 1981, which introduced the "constant complement" approach (and which anticipates Date's Principle of Interchangeability).

B&S aim to give a more precise specification for D&B's unique, avoiding extraneous updates. They don't offer an opinion on how few views are updatable-through, compared to D&B's "very few situations" conclusion. But it isn't difficult to see their restrictions are more stringent.

I think their approach rests on an unstated assumption that is unrealistic. I'm going to draw out my reasons, to discuss the implications for independent update of relations (base or views). First I'll outline their approach.

What is a "View complement"?

Given base relations and a view defined over them, the view complement is "The information [in the bases] not visible within the view", as B&S put it.

  • Take view V defined over base relations B1, B2, ..., Bn.
(If V is defined over relvars that are themselves views, substitute in the view definitions (recursively) until there are all bases.)
  • Interchange the schema for an information equivalent one in which:
    • There is a base BV with same heading as V.
    • There are views VB1, VB2, ..., VBn with the same headings as the based-ons respectively, and to be defined over BV and ....
    • There are bases BBm, BBm+1, ..., BBm+o (not necessarily corresponding to B1, B2, ...).
    • View V' is defined over VB1, VB2, ...VBn exactly as was V over the original bases.
  • The content for these alternative relvars (i.e. their internal predicate) is to be:
    • Base BV is to have the same content as would view V.
    • View VB1, VB2, ..., VBn are to have the same content as would B1, B2, ..., Bn respectively.
    • Bases BBm, BBm+1, ..., BBm+o are to have 'just enough' content to combine with the content of BV and supply the content for views VB1, VB2, ..., VBn.
    • So V' always yields the same content as BV.

The idea is that BBm, BBm+1, ..., BBm+o represent the original base relations B1, B2, ..., Bn minus the view V -- i.e. the complement of the view. For example:

  • If V is a restriction over a single base, its complement could be a base with the same heading, containing the tuples excluded from the view (i.e. disjoint internal predicate).
  • If V is a projection over a single base, its complement could be a base with the projected-out attributes, plus a key to join back.
To be information equivalent/a lossless decomposition would need declaring an EQD between the two projections.
  • If V is a join, its complement could be two bases with the headings of the original bases, containing respectively the tuples not MATCHING.

In general there will be many ways to choose the schema and internal predicates for BBm, BBm+1, ..., BBm+o. The schema designer has a free hand to decompose or combine the original bases.

Constant complement

The ideal choice of complement is one in which the view-as-base BV can be freely updated without requiring update to any other bases; and from which update bases-as-views VB1, VB2, ..., VBn are correctly generated, such that V' yields the same as BV.

  • With the restriction view, since its complement has a disjunctive internal predicate, tuples can be freely updated into the view-as-base. (An attempt to modify an existing tuple to values that would exclude it from the restriction is disallowed, because that would violate the view-as-base's internal predicate.)
  • With the projection view-as-base, it is possible to modify the non-key attributes; but not insert (because tuples would need to be inserted to the complement, to be able to join); and (arguably) not delete, because the 'information equivalence' requirement imposes an EQD on the keys.
  • With the join view-as-base, opportunities for update are even more limited. (Only modify the non-common attributes; not INSERT because that might need 'moving' a tuple from the complement to the join-as-base; not DELETE because that might need 'moving' a tuple to the complement.)

It is notable that the example initial base schema B&S work with (two base relations each of two attributes) has an unusually large number of integrity constraints, including bi-directional Functional Dependencies, and an EQD such that the two relations are a lossless decomposition.

B&S tighten the screw further [Secion 3]:

User requirements impose two constraints on this set [of translatable updates], namely,

(i) If the user is allowed two updates u and v, then he must be allowed the composite update uv.
(ii) The user must have the means to cancel, if he wishes, the effect of every update that he is allowed to the view.

Re (i) they don't appear to consider that applying u might make v invalid (or v.v.). Re (ii), it is not clear which User(s) stated this requirement, or why.

The invalid assumption: that base relvars are independently updatable

Behind the hold-the-complement-constant approach is an assumption that if a view can be updated independently, as if it was a base, it is first-class. On the grounds updates to each base are independent (and are therefore first-class), because they hold all other bases constant.

"The problem of finding independent views is extremely important in database theory." pronounce B&S [section 6], with no citation (beyond their own technical work).

To put it plainly: updates to base are not necessarily independent. And B&S need look no further than their own initial schema. Updates are not independent because of inter-relational constraints. B&S's should look at how 'tight' are the constraints in their schema (two base relations), and consider that neither base relation could form a constant complement of the other.

Independence of update of base relations is indeed an important criterion for schema design: Normalisation is to remove redundancy/avoid update anomalies. Note, though, that if a normalised schema is to be information-equivalent to its un-normalised predecessor, each decomposition must add inter-relational constraints to ensure lossless re-composition.

Inter-relational constraints are the norm with mature schema designs. Therefore an ideal of independent update is unachievable for base relations. It is an entirely pointless criterion for update through views.

Orthogonal Design and updating base relations

'Orthogonal Design' (Principle Of, introduced as part of Date & McGoveran's 1994 treatment of updating through views) of a schema is claimed to be essential to translating updates through a view to updates to base. Definitions vary for what POOD means precisely,

"Loosely, the principle that no two relvars in a given database should have overlapping meanings." says Date in his Relational DB Dictionary (2008 Extended Edition). And Date goes on to give a precise extensional definition.

" The semantic orthogonality property holds in a database design (a particular schema) if and only if, ignoring multi-relation constraints, there is no "overlap" or redundancy between the intended membership of any two relations in the schema. If this is true, then we can add (or remove) tuples to (from) one without that operation in and of itself implying tuples need to be added (or removed) to the other for database consistency, except as required by some multi-relation constraint (e.g., a foreign key). This notion is analogous to having orthogonal axes in a coordinate system: The variables such axes represent are independent – we can alter the value on one such variable without altering any of the others. " says McGoveran in his draft Chapter on Updating Relations (2015).

As we have seen, normalising a design (so that "we can add (or remove) tuples to (from) [a relation] without that operation in and of itself implying [other] tuples need to be added (or removed) to " that same relation "for database consistency") entails defining multi-relation constraints to preserve information equivalence. So allowing an exception for inter-relational (multi-relation) constraints makes POOD a dead letter.

Behind B&S's advocacy for constant complement is the presumption that base relations are first-class because "we can alter the value on one such [relation] without altering any of the others". This in fact does not apply for base relations (because of muti-relation constraints). Neither is it reasonable to expect it to apply for update through views, a fortiori because their content is redundant necessarily.


To compare updates to a base relvar subject to a multi-relation constraint with update through a view:

  • Consider an INSERT to a base relation that is the dependent in an INclusion Dependency.
That INSERT might fail because there is no matching tuple in the defining base relation.
Violation of the IND can only be detected at run time.
Does this mean the base relation is not updatable-to?
  • Then consider an INSERT through a view defined using MATCHING.
If that's translated to an INSERT to the MATCHING's left operand, it might fail because there is no matching tuple in the right operand.
(That is, fail because the inserted tuple would therefore not appear in the resulting value for the view -- violating no side effects/The Assignment Principle.)
So (arguably) violation of The Assignment Principle can only be detected at run time.
Does this mean the MATCHING view is not updatable-through?

[in progress]

References

  • Bancilhon and Spyratos 1981 "Update Semantics of Relational Views"
http://pages.cs.wisc.edu/~jhuang/qual/update-semantics-of-views.pdf
  • David McGoveran Chapter draft March 2015 "Can All Relations Be Updated? "/"Updating Relations"
http://www.alternativetech.com/publications/Logic%20for%20DB%20Folks%20Series%20XX%20-%20Updating%20Relations.pdf
  • Chris Date 2008 "The Relational Database Dictionary, Extended Edition"
http://www.apress.com/book/view/1430210419


See also Update through views: a possible approach#References