2008 Summer School: Advanced SQL
This talk is intended to introduce advanced concepts in Structured Query Language (SQL). Examples of each concept are presented.
We will use the SQL Search front end to the SDSS database available here. Throughout, remember that the Schema Browser is your friend. Refer to it often for column names and function definitions.
The simplest query is just a select and an expression.
SELECT sqrt(100.0)The next simplest is a select with a from on a single table.
SELECT chunkID FROM chunkNext in complexity is a query with constraints.
SELECT run, rerun, camcol, field FROM field WHERE camcol between 4 and 6
Each SQL database vendor is responsible for implementing the SQL grammar for
their particular product. There are standards for SQL and standard extensions
(like Transact-SQL), but frequently these standards are not followed to the
A good comilation of the differences between different dialects is located at: http://troels.arvin.dk/db/rdbms/
Often it is nice to return just a few rows to make sure the query is working, or maybe you just want the "most something" values (i.e. brightest, most distant, nearest, bluest, etc.).
SELECT TOP 10 ra, dec FROM PhotoPrimary
Aggregate functions are useful for doing simple math on return columns. A common usage is to find the number of records that match a constraint via the COUNT function.
SELECT AVG(mjd_r), COUNT(DISTINCT field), MAX(run), MIN(run), SUM(nObjects) FROM field
In cases where it appears that an intermediate table is necessary to get what you want, a sub-select may do what you want in a single query. Sub-selects act just like a table in the FROM clause.
SELECT a.ra, a.dec, b.bestobjid FROM galaxy a, (SELECT TOP 50 ra, dec, bestobjid FROM SpecObjAll where bestobjid > 0) b WHERE a.objid = b.bestobjid
A real world usecase is getting multiple lines out of the SDSS lines tables.
SELECT a.ew, b.ew, c.ew, a.specobjid, b.specobjid, c.specobjid, a.name, b.name, c.name FROM (select ew, name, specobjid from speclineindex where name = "Lick_CN1") a, (select ew, name, specobjid from speclineindex where name = "Lick_CN2") b, (select ew, name, specobjid from speclineindex where name = "Lick_Hb") c WHERE a.specobjid = b.specobjid and b.specobjid = c.specobjid and (a.ew > -9999 and b.ew > -9999 and c.ew > -9999)
We won't go into how to create functions, but many dataproviders will define useful functions for you to use. In general, functions operate just like tables. The following example uses the SDSS function for doing a cone search on ra, dec, and r in arcmin. The return is a list of object IDs.
SELECT a.modelMag_g, a.modelMag_r, a.modelMag_i, a.modelMag_g - a.modelMag_r as gr, a.modelMag_r - a.modelMag_i as ri FROM PhotoObjAll a, dbo.fGetNearbyObjEq(210.0, 35.0, 10) b WHERE a.objID = b.objID AND a.type=3
On occasion, one may only care about records in an grouped sense. For eample, the following query returns the average redshift of all objects in the database grouped by their spectral classification name with the constraint that the redshift is less than unity.
SELECT c.name, AVG(s.z) as AVG_Z FROM specObj s, specClass c WHERE s.z < 1 AND s.specClass = c.value GROUP BY c.name
Further constraints can be placed on the groups created in the GROUP BY clause with the HAVING clause. In the above example, average redshifts were returned for all spectral classes. In some the following example, only spectral classes with star in the name will be returned.
SELECT c.name, AVG(s.z) as AVG_Z FROM specObj s, specClass c WHERE s.z < 1 AND s.specClass = c.value GROUP BY c.name HAVING c.name LIKE '%STAR%'
Flags may contain lots of information about objects in a relatively compact form. Bitwise operators are the way to deal with these flags.
SELECT TOP 10 ra, dec FROM photoObjAll WHERE flags & dbo.fPhotoFlags('SATURATED') > 0
The SATURATED flag in binary is 1000000000000000000 which is 262144 in decimal. The & operator computes the bitwise AND of the two arguments, thus if the SATURATED flag is not set, the result will be zero. The | operator is the bitwise OR.
If the database you are using does not have the nice builtin functions for flags that the SDSS DB has, you can just do the bitwise operations on the decimal representation of the binary flag.
SELECT TOP 10 ra, dec FROM photoObjAll WHERE flags & 262144 > 0
The NVO Summer School is made possible through the support of the National Science Foundation.