sql - Inner Join vs Natural Join vs USING clause: are there any advantages? -
imagine have 2 simple tables, such as:
create table departments(dept int primary key, name); create table employees(id primary key, fname, gname, dept int references departments(dept));
(simplified, of course).
i have of following statements:
select * employees e inner join departments d on e.dept=d.dept; select * employees e natural join departments d; select * employees e join departments d using(dept);
a working example can found here: sql fiddle: http://sqlfiddle.com/#!15/864a5/13/10
they give same results — same rows.
i have preferred first form because of flexibility, readability , predictability — define connected what.
now, apart fact first form has duplicated column, there real advantage other 2 forms? or syntactic sugar?
i can see disadvantage in latter forms expected have named primary , foreign keys same, not practical.
now, apart fact first form has duplicated column, there real advantage other 2 forms? or syntactic sugar?
tl;dr natural join used in style of relational programming simpler usual sql style. (although when embedded in sql burdened rest of sql query syntax.) that's because 1. directly uses simple operators of predicate logic, language of precision in engineering (including software engineering), science (including computer science) , mathematics, , 2. simultaneously , alternatively directly uses simple operators of relational algebra.
the common complaint natural join since shared columns aren't explicit, after schema change inappropriate column pairing may occur. , may case in development environment. in case there requirement only columns joined , natural join without project was not appropriate. these arguments assume natural join being used inappropriately. arguers aren't aware ignoring requirements. such complaints specious. (moreover, sound software engineering design principles lead not having interfaces such specificiatons.)
you have described difference in effect: 1 copy of each common column returned.
from is there rule of thumb construct sql query human-readable description?:
it turns out natural language expressions , logical expressions , relational algebra expressions , sql expressions (a hybrid of last two) correspond in rather direct way.
eg codd 1970:
the relation depicted called component. [...] meaning of component(x, y,z) part x immediate component (or subassembly) of part y, , z units of part x needed assemble 1 unit of part y.
from this answer:
every base table has statement template, aka predicate, parameterized column names, put row in or leave out.
plugging row predicate gives statement aka proposition. rows make true proposition go in table , rows make false proposition stay out. (so table states proposition of each present row , states not proposition of each absent row.)
but every table expression value has predicate per expression. relational model designed if tables
t
,u
hold rows t(...) , u(...) (respectively) then:
t natural join u
holds rows t(...) , u(...)t where
condition
holds rows t(...) , conditiont union corresponding u
holds rows t(...) or u(...)t except corresponding u
holds rows t(...) , not u(...)select distinct
columns keep
from t
holds rows where
there exists columns drop such t(...)- etc
whereas reasoning sql otherwise is... not "natural":
an sql select statement can thought of algebraically 1. implicitly renameing each column c
of table (possibly implicit) correlation name t
t.c
, 2. cross joining, 3. restricting per inner on, 4. restricting per where, 5. projecting per select, 6. renameing per select, dropping t.
s, 7. implicitly renameing drop remaining t.
s between t.
-renameings algebra operators can thought of logic operators , table names predicates: t join ...
vs employee t.employee has name t.name ... , ...
. conceptually inside select statement double-rename-inducing cross join table t.c
s column names while outside tables have c
s column names.
alternatively sql select statement can thought of logically 1. introducing forsome t in e
around entire statement per correlation name t
, base name or subquery e
, 2. referring value of quantified t
using t.c
refer c
part, 3. building result rows t.c
s per etc, 4. naming result row columns per select clause, 4. leaving scope of forsome
s. again algebra operators being thought of logic operators , table names predicates. again though, conceptually has t.c
inside selects c
outside correlation names coming , going.
these 2 sql interpretations near straightforward using join or and, etc, interchangeably. (you don't have agree it's simpler, perception why natural join , union/except corresponding there.) (arguments criticizing style outside context of intended use specious.)
using kind of middle ground orphan 1 foot in natural join camp , 1 in cross join. has no real role in former because there no duplicate column names there. in latter more or less abbreviating join conditions , select clauses.
i can see disadvantage in latter forms expected have named primary , foreign keys same, not practical.
pks (primary keys), fks (foreign keys) & other constraints not needed querying. (knowing column function of others allows scalar subqueries, can phrase without.) 2 tables can meaningfully joined. if need 2 columns have same name natural join rename via select as.
Comments
Post a Comment