SQL style guide
Overview
You can use this set of guidelines, fork them or make your own - the
key here is that you pick a style and stick to it. To suggest changes
or fix bugs please open an issue or pull request on GitHub.
These guidelines are designed to be compatible with Joe Celko’s SQL Programming
Style book to make adoption for teams who have already read that book
easier. This guide is a little more opinionated in some areas and in others a
little more relaxed. It is certainly more succinct where Celko’s book
contains anecdotes and reasoning behind each rule as thoughtful prose.
It is easy to include this guide in Markdown format as a part of a
project’s code base or reference it here for anyone on the project to freely
read—much harder with a physical book.
SQL style guide by Simon Holywell is licensed under a Creative Commons
Attribution-ShareAlike 4.0 International License.
Based on a work at http://www.sqlstyle.guide.
General
Do
- Use consistent and descriptive identifiers and names.
- Make judicious use of white space and indentation to make code easier to read.
- Store ISO-8601 compliant time and date information
(YYYY-MM-DD HH:MM:SS.SSSSS).
- Try to use only standard SQL functions instead of vendor specific functions for
reasons of portability.
- Keep code succinct and devoid of redundant SQL, such as unnecessary quoting or
parentheses or WHERE clauses that can otherwise be derived.
- Include comments in SQL code where necessary. Use the C style opening /* and
closing */ where possible, otherwise precede comments with -- and finish
them with a new line.
    SELECT file_hash -- stored ssdeep hash
    /* Updating the file record after writing to the file */
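As a fuller sketch of the two comment styles described above, the statements below annotate a query inline and introduce an update with a block comment. The file_system table, the file_name, file_modified_date and file_size columns, and the sample values are assumptions made for illustration; the table is assumed to exist already.

```sql
-- file_system, file_name, file_modified_date and file_size are hypothetical names
SELECT file_hash  -- stored ssdeep hash
  FROM file_system
 WHERE file_name = '.vimrc';

/* Updating the file record after writing to the file */
UPDATE file_system
   SET file_modified_date = '1980-02-22 13:19:01.00000',
       file_size = 209732
 WHERE file_name = '.vimrc';
```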
Avoid
- CamelCase—it is difficult to scan quickly.
- Descriptive prefixes or Hungarian notation such as sp_ or tbl.
- Plurals: use the more natural collective term where possible instead. For
example staff instead of employees or people instead of individuals.
- Quoted identifiers: if you must use them then stick to SQL-92 double quotes for
portability (you may need to configure your SQL server to support this depending
on vendor).
- Object oriented design principles should not be applied to SQL or database
structures.
Naming conventions
General
- Ensure the name is unique and does not exist as a reserved keyword.
- Keep the length to a maximum of 30 bytes; in practice this is 30 characters
unless you are using a multi-byte character set.
- Names must begin with a letter and may not end with an underscore.
- Only use letters, numbers and underscores in names.
- Avoid the use of multiple consecutive underscores, as these can be hard to read.
- Use underscores where you would naturally include a space in the name (first
name becomes first_name).
- Avoid abbreviations and if you have to use them make sure they are commonly
understood.
    SELECT first_name
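A minimal sketch of a query that follows these naming rules might look like this. Only first_name comes from the example above; the staff table, the last_name column and the sample value are assumptions, and the table is assumed to exist already.

```sql
-- staff, last_name and the sample value are illustrative assumptions.
-- Forms the rules above steer away from: "FirstName", first__name, tbl_staff.
SELECT first_name, last_name
  FROM staff
 WHERE last_name = 'Smith';
```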
Tables
- Use a collective name or, less ideally, a plural form. For example (in order of
preference) staff and employees.
- Do not prefix with tbl or any other such descriptive prefix or Hungarian
notation.
- Never give a table the same name as one of its columns and vice versa.
- Avoid, where possible, concatenating two table names together to create the name
of a relationship table. Rather than cars_mechanics prefer services.
Columns
- Always use the singular name.
- Where possible avoid simply using id as the primary identifier for the table.
- Do not add a column with the same name as its table and vice versa.
- Always use lowercase except where it may make sense not to such as proper nouns.
Aliasing or correlations
- Should relate in some way to the object or expression they are aliasing.
- As a rule of thumb the correlation name should be the first letter of each word
in the object’s name.
- If there is already a correlation with the same name then append a number.
- Always include the AS keyword; it makes the alias easier to read as it is
explicit.
- For computed data (SUM() or AVG()) use the name you would give it were it
a column defined in the schema.
    SELECT first_name AS fn
    SELECT SUM(s.monitor_tally) AS monitor_total
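The aliasing rules above could be applied as in the following sketch. The staff and students tables and the mentor_id and staff_num columns are hypothetical and assumed to exist; s1 and s2 show the first-letter rule with a number appended to avoid a clash.

```sql
-- staff, students, mentor_id and staff_num are hypothetical names.
-- Both tables start with "s", so a number is appended to each correlation name.
SELECT s1.first_name AS fn
  FROM staff AS s1
       INNER JOIN students AS s2
       ON s2.mentor_id = s1.staff_num;

-- A computed value is named as if it were a schema column, using the _total suffix.
SELECT SUM(s.monitor_tally) AS monitor_total
  FROM staff AS s;
```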
Stored procedures
- The name must contain a verb.
- Do not prefix with sp_ or any other such descriptive prefix or Hungarian
notation.
Uniform suffixes
The following suffixes have a universal meaning ensuring the columns can be read
and understood easily from SQL code. Use the correct suffix where appropriate.
- _id: a unique identifier such as a column that is a primary key.
- _status: flag value or some other status of any type such as
publication_status.
- _total: the total or sum of a collection of values.
- _num: denotes the field contains any kind of number.
- _name: signifies a name such as first_name.
- _seq: contains a contiguous sequence of values.
- _date: denotes a column that contains the date of something.
- _tally: a count.
- _size: the size of something such as a file size or clothing.
- _addr: an address for the record, which could be physical or intangible, such
as ip_addr.
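To see how the suffixes read in practice, here is a small sketch of a query over a hypothetical articles table. Every table and column name except publication_status is invented for illustration, and the table is assumed to exist.

```sql
-- articles and all columns except publication_status are hypothetical names;
-- each column ends in one of the suffixes listed above.
SELECT a.article_id,
       a.publication_status,
       a.author_name,
       a.publication_date,
       a.comment_tally,
       a.attachment_size
  FROM articles AS a
 WHERE a.publication_status = 'published';
```

The suffix alone tells a reader that comment_tally is a count and attachment_size is a size, without needing to consult the schema.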
Query syntax
Reserved words
Always use uppercase for the reserved keywords like SELECT and WHERE.
It is best to avoid the abbreviated keywords and use the full length ones where
available (prefer ABSOLUTE to ABS).
Do not use database server specific keywords where an ANSI SQL keyword already
exists performing the same function. This helps to make code more portable.
    SELECT model_num
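A short sketch of the keyword casing rule, with reserved words in uppercase and identifiers in lowercase. The phones table, the release_date column and the date value are assumptions; only model_num appears in the example above.

```sql
-- phones, release_date and the date literal are illustrative assumptions
SELECT p.model_num
  FROM phones AS p
 WHERE p.release_date > '2014-09-30';
```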
White space
To make the code easier to read it is important that the correct complement of
spacing is used. Do not crowd code or remove natural language spaces.
Spaces
Spaces should be used to line up the code so that the root keywords all end on
the same character boundary. This forms a river down the middle making it easy for
the reader's eye to scan over the code and separate the keywords from the
implementation detail. Rivers are bad in typography, but helpful here.
    SELECT f.average_height, f.average_diameter
Notice that SELECT, FROM, etc. are all right aligned while the actual column
names and implementation specific details are left aligned.
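The river alignment described above might look like the following sketch, in which every root keyword ends at the same column. The flora table, the species_name column and the species values are assumptions added to complete the statement; the table is assumed to exist.

```sql
-- flora, species_name and the species values are illustrative assumptions.
-- SELECT, FROM, WHERE and OR all end at the sixth character, forming the river.
SELECT f.average_height, f.average_diameter
  FROM flora AS f
 WHERE f.species_name = 'Banksia'
    OR f.species_name = 'Sheoak'
    OR f.species_name = 'Wattle';
```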
Although not exhaustive always include spaces:
- before and after equals (=)
- after commas (,)
- surrounding apostrophes (') where not within parentheses or with a trailing
comma or semicolon.
    SELECT a.title, a.release_date, a.recording_date
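As a sketch of the spacing rules, compare a crowded statement (shown in a comment) with a spaced one. The albums table is assumed to exist and the titles are sample values chosen for illustration.

```sql
-- Crowded form to avoid:
-- SELECT a.title,a.release_date FROM albums AS a WHERE a.title='Charcoal Lane';

-- Spaced form: a space after each comma and around each equals sign.
-- albums and the title values are illustrative assumptions.
SELECT a.title, a.release_date, a.recording_date
  FROM albums AS a
 WHERE a.title = 'Charcoal Lane'
    OR a.title = 'The New Danger';
```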
Line spacing
Always include newlines/vertical space:
- before AND or OR
- after semicolons to separate queries for easier reading
- after each keyword definition
- after a comma when separating multiple columns into logical groups
- to separate code into related sections, which helps to ease the readability of
large chunks of code.
Keeping all the keywords aligned to the right-hand side and the values left
aligned creates a uniform gap down the middle of the query. It also makes the
query definition much easier to scan quickly.
    INSERT INTO albums (title, release_date, recording_date)
    UPDATE albums
    SELECT a.title,
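The three statements begun above could be completed along the following lines. This is only a sketch: the albums table is assumed to exist with these three columns, and all titles and timestamps are made-up sample data.

```sql
-- All values are made-up sample data; albums is assumed to exist.
INSERT INTO albums (title, release_date, recording_date)
VALUES ('Charcoal Lane', '1990-01-01 01:01:01.00000', '1990-01-01 01:01:01.00000'),
       ('The New Danger', '2008-01-01 01:01:01.00000', '1990-01-01 01:01:01.00000');

UPDATE albums
   SET release_date = '1990-01-01 01:01:01.00000'
 WHERE title = 'The New Danger';

SELECT a.title,
       a.release_date, a.recording_date  -- the two dates grouped on one line
  FROM albums AS a
 WHERE a.title = 'Charcoal Lane'
    OR a.title = 'The New Danger';
```

Each statement is followed by a blank line, each clause starts on its own line, and the related date columns share a line after the comma that ends the title column.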
Indentation
To ensure that SQL is readable it is important that standards of indentation
are followed.
Joins
Joins should be indented to the other side of the river and grouped with a new
line where necessary.
    SELECT r.last_name
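One way to lay out joins on the far side of the river is sketched below. The riders, bikes and crew tables and all of their columns other than last_name are hypothetical and assumed to exist.

```sql
-- riders, bikes, crew and their join columns are hypothetical names.
-- Each join sits to the right of the river; a blank line separates the groups.
SELECT r.last_name
  FROM riders AS r
       INNER JOIN bikes AS b
       ON r.bike_vin_num = b.vin_num
          AND b.engine_tally > 2

       INNER JOIN crew AS c
       ON r.crew_chief_last_name = c.last_name
          AND c.chief = 'Y';
```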
Subqueries
Subqueries should also be aligned to the right side of the river and then laid
out using the same style as any other query. Sometimes it will make sense to have
the closing parenthesis on a new line at the same character position as its
opening partner—this is especially true where you have nested subqueries.
    SELECT r.last_name,
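A sketch of the subquery layout follows, with each subquery starting on the right of the outer river and forming its own inner river. The champions table and its columns are assumptions, both tables are assumed to exist, and the standard EXTRACT(YEAR FROM ...) form is used here in place of any vendor specific year function.

```sql
-- riders, champions and their columns are hypothetical names.
-- Inside each subquery, SELECT, FROM, WHERE and AND end at the same column.
SELECT r.last_name,
       (SELECT MAX(EXTRACT(YEAR FROM c.championship_date))
          FROM champions AS c
         WHERE c.last_name = r.last_name
           AND c.confirmed = 'Y') AS last_championship_year
  FROM riders AS r
 WHERE r.last_name IN
       (SELECT c.last_name
          FROM champions AS c
         WHERE EXTRACT(YEAR FROM c.championship_date) > 2008
           AND c.confirmed = 'Y');
```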
Preferred formalisms
- Make use of BETWEEN where possible instead of combining multiple statements
with AND.
- Similarly use IN() instead of multiple OR clauses.
- Where a value needs to be interpreted before leaving the database use the CASE
expression. CASE statements can be nested to form more complex logical
structures.
- Avoid the use of UNION clauses and temporary tables where possible. If the
schema can be optimised to remove the reliance on these features then it most
likely should be.
    SELECT CASE postcode
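The formalisms above might combine as in this sketch. The office_locations table, the country and opening_time columns, and all postcode and city values are assumptions for illustration; the table is assumed to exist.

```sql
-- office_locations, country, opening_time and all values are hypothetical.
-- BETWEEN 8 AND 9 stands in for: opening_time >= 8 AND opening_time <= 9
-- IN (...) stands in for four postcode = '...' tests joined with OR
SELECT CASE postcode
       WHEN 'BN1' THEN 'Brighton'
       WHEN 'EH1' THEN 'Edinburgh'
       END AS city
  FROM office_locations
 WHERE country = 'United Kingdom'
   AND opening_time BETWEEN 8 AND 9
   AND postcode IN ('EH1', 'BN1', 'NN1', 'KW1');
```

With no ELSE branch, the CASE expression yields NULL for any postcode that matches neither WHEN clause.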
Create syntax
When declaring schema information it is also important to maintain human
readable code. To facilitate this ensure the column definitions are ordered and
grouped where it makes sense to do so.
Indent column definitions by four (4) spaces within the CREATE definition.
Choosing data types
- Where possible do not use vendor specific data types; these are not portable
and may not be available in older versions of the same vendor’s software.
- Only use REAL or FLOAT types where it is strictly necessary for floating
point mathematics, otherwise prefer NUMERIC and DECIMAL at all times. Floating
point rounding errors are a nuisance!
Specifying default values
- The default value must be the same type as the column; if a column is declared
a DECIMAL do not provide an INTEGER default value.
- Default values must follow the data type declaration and come before any
NOT NULL statement.
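A minimal sketch of these data type and default rules, using a hypothetical products table. The table and column names are invented for illustration, and some database engines require table-level constraints to follow the column definitions, in which case the PRIMARY KEY line would move to the end.

```sql
-- products, product_num and unit_price are hypothetical names.
CREATE TABLE products (
    PRIMARY KEY (product_num),
    product_num INTEGER        NOT NULL,
    -- DECIMAL rather than FLOAT for money; the default is a decimal literal
    -- and sits between the data type and NOT NULL.
    unit_price  DECIMAL(10, 2) DEFAULT 0.00 NOT NULL
);
```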
Constraints and keys
Constraints and their subset, keys, are a very important component of any
database definition. They can quickly become very difficult to read and reason
about though so it is important that a standard set of guidelines is followed.
Choosing keys
Deciding the column(s) that will form the keys in the definition should be a
carefully considered activity as it will affect performance and data integrity.
1. The key should be unique to some degree.
2. Consistency in terms of data type for the value across the schema and a lower
likelihood of this changing in the future.
3. Can the value be validated against a standard format (such as one published by
ISO)? Encouraging conformity to point 2.
4. Keeping the key as simple as possible whilst not being scared to use compound
keys where necessary.
It is a reasoned and considered balancing act to be performed at the definition
of a database. Should requirements evolve in the future it is possible to make
changes to the definitions to keep them up to date.
Defining constraints
Once the keys are decided it is possible to define them in the system using
constraints along with field value validation.
General
- Tables must have at least one key to be complete and useful.
- Constraints should be given a custom name excepting UNIQUE, PRIMARY KEY and
FOREIGN KEY where the database vendor will generally supply sufficiently
intelligible names automatically.
Layout and order
- Specify the primary key first right after the CREATE TABLE statement.
- Constraints should be defined directly beneath the column they correspond to.
Indent the constraint so that it aligns to the right of the column name.
- If it is a multi-column constraint then consider putting it as close to both
column definitions as possible and where this is difficult as a last resort
include them at the end of the CREATE TABLE definition.
- If it is a table level constraint that applies to the entire table then it
should also appear at the end.
- Use alphabetical order where ON DELETE comes before ON UPDATE.
- If it makes sense to do so align each aspect of the query on the same character
position. For example all NOT NULL definitions could start at the same
character position. This is not hard and fast, but it certainly makes the code
much easier to scan and read.
Validation
- Use LIKE and SIMILAR TO constraints to ensure the integrity of strings
where the format is known.
- Where the ultimate range of a numerical value is known it must be written as a
range CHECK() to prevent incorrect values entering the database or the silent
truncation of data too large to fit the column definition. At the least it
should check that the value is greater than zero in most cases.
- CHECK() constraints should be kept in separate clauses to ease debugging.
Example
    CREATE TABLE staff (
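A sketch of how the layout, ordering and validation rules could combine in one definition is given below. Apart from the staff table name, every table, column and constraint name is an assumption, as is the choice of RESTRICT and CASCADE. Some database engines require table-level constraints to come after all column definitions, so the placement shown may need adjusting there.

```sql
-- offices is a hypothetical parent table so the foreign key below has a target.
CREATE TABLE offices (
    PRIMARY KEY (office_num),
    office_num INTEGER NOT NULL
);

-- All column and constraint names below are illustrative assumptions.
CREATE TABLE staff (
    PRIMARY KEY (staff_num),
    staff_num      INTEGER      NOT NULL,
    first_name     VARCHAR(100) NOT NULL,
    office_num     INTEGER      NOT NULL,
                   FOREIGN KEY (office_num)
                   REFERENCES offices (office_num)
                   ON DELETE RESTRICT
                   ON UPDATE CASCADE,
    pens_in_drawer INTEGER      NOT NULL,
                   CONSTRAINT pens_in_drawer_range
                   CHECK(pens_in_drawer BETWEEN 1 AND 99)
);
```

The primary key comes first, each constraint sits directly beneath its column and aligned to the right of the column name, ON DELETE precedes ON UPDATE, and the range check carries a custom name while the keys rely on vendor-supplied names.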
Designs to avoid
- Object oriented design principles do not effectively translate to relational
database designs, so avoid this pitfall.
- Placing the value in one column and the units in another column. The column
should make the units self evident to prevent the requirement to combine
columns again later in the application. Use CHECK() to ensure valid data is
inserted into the column.
- EAV (Entity Attribute Value) tables: use a specialist product intended for
handling such schema-less data instead.
- Splitting up data that should be in one table across many because of arbitrary
concerns such as time-based archiving or location in a multi-national
organisation. Later queries must then work across multiple tables with UNION
rather than just simply querying one table.
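To illustrate the last point, the sketch below contrasts a query over tables split by year with the same question asked of a single table. The orders, orders_2022 and orders_2023 tables and their columns are hypothetical and assumed to exist.

```sql
-- Split design to avoid: each query has to stitch the yearly tables together.
-- orders_2022, orders_2023, order_num and order_date are hypothetical names.
SELECT o.order_num, o.order_date
  FROM orders_2022 AS o
 UNION ALL
SELECT o.order_num, o.order_date
  FROM orders_2023 AS o;

-- Single-table design: one query, with the period expressed as a filter.
SELECT o.order_num, o.order_date
  FROM orders AS o
 WHERE o.order_date BETWEEN '2022-01-01' AND '2023-12-31';
```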
Appendix
Reserved keyword reference
A list of ANSI SQL (92, 99 and 2003), MySQL 3 to 5.x, PostgreSQL 8.1, MS SQL Server 2000, MS ODBC and Oracle 10.2 reserved keywords.
    A