Get Adobe Flash player

SQL Server

Database Naming Convention

Naming can be very hard to standardize on as everyone has their own way of naming variables, columns, tables, and etc. Many times, there are way too many rules to remember. I am going to propose what I think is a consistent way of naming.

History: If you look at history of naming objects back to the mainframe days. There were physical constraints that designers had to live with. Today, you would never name a column FN (first name) as it is too cryptic. But, when storage was very expensive, it was done. Many systems had restrictions on number of characters. In the early days, the limit was up to 6. Even the first version of dBase had a limit of 8 characters. I remember one system that didn't use ASCII and they could save a bit or two by only having upper case letters.

Today, we really don't have physical constraints. Most systems allow 256 character names. Some even allow Unicode characters. Oracle is the only exception to the rule.

What do you think of these examples: FIRSTNAME, FIRST_NAME, firstname, first_name, First_Name, FirstName, firstName. How about these: FrstNm, FstNme, fn, FN, Fn. I am sure you have picked out your favorite.

Before we have a standard, we have to look at the development language. Is the language case sensitive? Java and c++ are. PHP and ASP are not. If the language is case sensitive, many developers like to have all of the names in lower case (firstname) or proper case (FirstName) or the modified proper case (firstName).

Should the development language drive the naming convention for databases? The answer is maybe. It might be easier to have a standard that will work across all development efforts.

What ever you decide to be your standard, just be consistent.

Database Naming Standards

The Standard to Think About:
There are many articles on the web talk naming conventions and none will be the same.   However there is a few common attributes that are the same.   They are:  1) don't use underscore (_) to separate names and 2) don't use all upper case characters.   Everything else is open to personal preference.     

A Standard: In the following, we will have a logical and consistent discussion of standard naming convention.  

A few consistent rules:

  • No underscores (_)
  • No s at the end (i.e. Names).  Every database or table will hold more than one item, plural is assumed.  Name should be singular.  I.e.  Name table will hold names.
  • No Abbreviations.   All modern database systems allow more characters for a name they you will ever use.   For example:  "nbr" and "no"  spell out number.   "Amt"  spell out amount.   
  • Be consistent.  

Databases
What data does it hold?   What is the system name?   Name the database.  Remember the rules above.  

Tables

  • Simple Name:   Use Proper case. (i.e.  Name,  Address,  ClientName, and etc)
  • Singular naming -- No s at the end of the name.
  • Avoid prefix or suffix of List, Data,  Info.   (i.e don't use NameList).  All tables are a list or data or some sort as their nature.
  • For Junction tables (many to many) or Parent/Child tables (1 to many) use all names that define the relationship.   (i.e.  Order (parent), OrderDetail (child))

Columns

  • Simple Name:   Use Proper Case (i.e. FirstName, LastName, and etc)
  • DO NOT: have the table name part of the column name.   If the table is named Client, do not have a column named ClientCode.  It should just be Code.
  • Foreign key Column names should identify the parent table name.  Using the Client example above.  It is ok to have ClientCode in an Invoice table.  
  • For Primary Keys, use ID suffix.  The only inconsistency in naming will be this:   Don't just use ID as the column name.  Use the table name too. Example: OrderID.    The reason is simple and logical.  When writing sql code,  PK and FK (which are used all of the time) are is easier to read when you can type OrderDetail.OrderID = Order.OrderID compared OrderDetail.OrderID = Order.ID
  • Don't use abbreviations.  
  • For consistent data typing, here are some suggestions. 
Type of Data Naming suggestion
Boolean <ColumName> + Flag  *
Date <ColumName> + Date
Date Time <ColumName> + DateTime
Time <ColumName> + Time
Currency or Money <ColumName> + Amount
Percent <ColumName> + Percent

* this data type you will see many different opinions. Some standards suggest prefixing the column name with Is (i.e. IsActive). Some use the word flag or bit. Recommend using a suffix.

Creating a Data Dictionary

What is a Data Dictionary?    
Simply put, it is a document that describes databases, tables and columns.  

Maybe a better question is, why do you need a Data Dictionary?
1. At a minimum a Data Dictionary is useful to educate new employees.   

2. One could argue that is a database is designed properly that you wouldn't need one.   Full names with no abbreviations would be the best database design.    However, with the best designs, there are still names that don't make full since.

For example, if you saw address, city state in an address table, you really wouldn't need a Data Dictionary.     How about s_address, s_city, s_state, m_address, m_city, m_state for column names?  Names are not too clear.  A Data Dictionary would be useful.

How to maintain a Data Dictionary?
Databases and tables change all of the time to meet the changing business requirements.  With that said, how do you keep the document up to date?
1. you could make it part of your release process to maintain the documentation. However, it seems that documentation is always last on everyone's list.
2.  You could use the database it's self for documentation.  While making a database change it is easier to maintain the Data Dictionary.  

SQL Server has a somewhat crude way of  adding a description to a column with a command sp_addextendedproperty.   This command will add a record into a system table called sysproperties
   EXEC sp_addextendedproperty
       @name='<columnName>',
       @value='<columnDescription>',
       @level0type='user',    @level0name='dbo',
       @level1type='table',   @level1name='<tableName>'

   where @name is the column name, @level1name is user table name, and @value is the column description

Creating the Data Dictionary
attached is a sql script that will produce a list of every table and every column.   It has the table name, column name, data type, constraints and defaults, and column descriptions (if using sp_addextendedproperty)

 

SQL Regular Expression

I have been getting into biking more and more.   Biking is a good exercise.   I have read health articles that just 3 hours a week can reduce hart-attack and stokes by 50%.word lock

I purchased a Word Combination Lock for the bike.    It has four dials with ten letters each.   I couldn’t find a word I liked so I searching the web for samples.  But I didn’t find any lists.  

Now for the real purpose of this blog.   Using Regular Expressions.    I used regular expressions to produce a list of possible combinations.     This is how I did it:

 

1) first off, you need a list of English words.    I did a search and found a huge list of words (WordList.txt).  There was a lot of garbage in the file I found, so I modified it a bit. 

 

2) Create a table to hold the word list:
    Create table wordlist (word varchar(255));

3) Import the list into your database.    There is many ways to get data into your database.    I used toad's import tool.

There are duplicates in the list.   That is why there isn’t a primary key defined.

 

4) Time for the query:

Letter options for each dial.   The default combination is SHED.    As you can see from the list, there is runs, fast, bike, loop, play and well.     

 

S

H

E

D

T

W

R

G

D

R

M

K

M

Y

T

M

R

U

N

S

F

A

S

T

B

I

K

E

L

O

O

P

P

L

A

Y

W

E

L

L

 

Using the letter combinations for each dial, we can use simple regular expression to produce a list of words that can be formed.  

 

See http://en.wikipedia.org/wiki/Regular_expression for definition and history of regular expression.

 

The Query:

select distinct word
from wordlist
where ucase (word) REGEXP '[STDMRFBLPW][HWRYUAIOLE][ERMTNSKOAL][DGKMSTEPYL]'
and length(word)=4;

 

Looking at the query.   

1. select distinct word
Remember there are duplicates in the word list.  Distinct will eliminate duplicates the list.   Note, distinct shouldn’t be used in any production environment

2.  where ucase (word)...
It is good practice to match the case that you are expressing. Ucase will convert all words in the table to upper case.    

Do note, that some database engines will not use an index when changing case.   Thus producing a slower query.

3.  Looking at the simple version ...word regexp...  is similar to ...word = 'SHED' or word='SHEG' and etc.   looking at each character that matches each dial letter.

 

 

 

For the Microsoft SQL Server users:   there really isn’t a good way to do regular expression.    Click here for a goole search that might help.

 

 

The Result.  The query produced a list of 5026 possible words.    See Word_Lock_Sample_List.txt for the list.   The list isn't the best. The source file did have non-existent words. 

 

 

 

 

 

 

 

 

How to Add a Default Value to an Existing Column

I  have discussed with other articles that a good database design shouldn’t use nulls.   See Real World Use of nulls.  Nulls can cause issues in application development.

I was helping out a friend with a query and noticed that he had written a check (in WHERE) for a column with two conditions.   One of them being null.   The column was a status column (I.e true/false).    First off, All status columns should have a default value of zero (false).   

To help him with his query and worry about fixing the database later I helped him to write his query better.

 

Before:
    Select...
    From...
    Where status <> 1 OR status is null. 

 

On the top of my list of best practices is never to use OR.   He didn’t think he had another option until I showed him this version:

 

After:
    Select...
    From...
    Where coalesce(status,0) <> 1

Notice NO OR.   Coalesce will return the first non null.   In above example, if status is null, it will then return 0.  If status was 0 or 1, it would return that value.

Should also note that coalesce will work on all database engines. 

 

Now lets fix the database issue.   How to add a default to an existing column?  
First lets look at a sample table.  Col1 and Col2 allow nulls:

CREATE TABLE Table1 (
   Table1ID INT AUTO_INCREMENT NOT NULL,
   Col1 INT,
   Col2 VARCHAR(50),
  PRIMARY KEY (Table1ID)
);

 

Put some data into that table (mysql code):
insert into Table1 (Col1, Col2)
VALUES (1,NULL), (NULL,'A'), (null,null);

 

we get:

select * from Table1;

Table1ID    Col1    Col2
1            1      NULL
2            NULL   A
3            NULL   NULL

 

 

 

Now lets update col1.  Set a default and make it not nullable.

-mysql
ALTER TABLE Table1
 CHANGE Col1 Col1 INT(11) NOT NULL DEFAULT '0';

--mssql  v2000 and 2005
ALTER TABLE Table1
 alter column Col1 INT NOT NULL DEFAULT '0';

 
--mssql  v2000 and 2005
ALTER TABLE Table1
 alter column Col1 INT NOT NULL DEFAULT '0';

--mssql 2008
ALTER TABLE Table1
add constraint dfTable1 default '0' for Col1

 

Looking at the existing data you will see that Col1 values that were once null are now 0
select * from Table1;
Table1ID  Col1  Col2
1         1     NULL
2         0     A
3         0     NULL