Get Adobe Flash player

SQL2005

Generate SQL Server Index Creates

There are several reasons you may want to create script to add indexes. If you have used SQL Server DTS application to transfer data from one database to another, the data will be transferred over, but not the indexes. If you used the enterprise manager (or management studio) to script a table create, indexes again will not be created. Want a brute force way of recreating all of your indexes? Read further.

How to Add a Default Value to an Existing Column

I  have discussed with other articles that a good database design shouldn’t use nulls.   See Real World Use of nulls.  Nulls can cause issues in application development.

I was helping out a friend with a query and noticed that he had written a check (in WHERE) for a column with two conditions.   One of them being null.   The column was a status column (I.e true/false).    First off, All status columns should have a default value of zero (false).   

To help him with his query and worry about fixing the database later I helped him to write his query better.

 

Before:
    Select...
    From...
    Where status <> 1 OR status is null. 

 

On the top of my list of best practices is never to use OR.   He didn’t think he had another option until I showed him this version:

 

After:
    Select...
    From...
    Where coalesce(status,0) <> 1

Notice NO OR.   Coalesce will return the first non null.   In above example, if status is null, it will then return 0.  If status was 0 or 1, it would return that value.

Should also note that coalesce will work on all database engines. 

 

Now lets fix the database issue.   How to add a default to an existing column?  
First lets look at a sample table.  Col1 and Col2 allow nulls:

CREATE TABLE Table1 (
   Table1ID INT AUTO_INCREMENT NOT NULL,
   Col1 INT,
   Col2 VARCHAR(50),
  PRIMARY KEY (Table1ID)
);

 

Put some data into that table (mysql code):
insert into Table1 (Col1, Col2)
VALUES (1,NULL), (NULL,'A'), (null,null);

 

we get:

select * from Table1;

Table1ID    Col1    Col2
1            1      NULL
2            NULL   A
3            NULL   NULL

 

 

 

Now lets update col1.  Set a default and make it not nullable.

-mysql
ALTER TABLE Table1
 CHANGE Col1 Col1 INT(11) NOT NULL DEFAULT '0';

--mssql  v2000 and 2005
ALTER TABLE Table1
 alter column Col1 INT NOT NULL DEFAULT '0';

 
--mssql  v2000 and 2005
ALTER TABLE Table1
 alter column Col1 INT NOT NULL DEFAULT '0';

--mssql 2008
ALTER TABLE Table1
add constraint dfTable1 default '0' for Col1

 

Looking at the existing data you will see that Col1 values that were once null are now 0
select * from Table1;
Table1ID  Col1  Col2
1         1     NULL
2         0     A
3         0     NULL

 

 

Real World use of Database NULLs -- Part 2

Article: Real World use of Database NULLs -- Part 2

Database: MS SQL Server, MS access and MySql. Could also apply to Progress and Oracle.
Author: Robert Cunningham

Review: In part 1, the concept of a NULL is bad was discussed. There were a few examples where a having the possibility of a NULL would cause programmers and report developers to handle data differently. Simple concatenation of FirstName and LastName (SQL: select FirstName + ' ' + LastName) would cause undesirable outcomes if NULLs are allowed. Also discussed was updating existing databases to eliminate NULLs without changing existing programs or reports. In Part 2, we will look at not using NULLs in database design, performance and changing existing databases. The concept does go against common practices and I am sure database purist will disagree with the article. Again the concept is: Real world use don't use NULLs

Database Design In part 1, there was a very simple table called Name. There was a column called MiddleName that allowed NULLs. There is a way to get rid of the Name table and start over and eliminate the option for NULLs. But that method is complicated. But here is pseudo code to redesign the table. 1) remove any foreign key reference 2) rename table Name to Name_old 3) create new table Name. With MiddleName NOT NULL 4) If there was an auto number ID (NameID is not), turn off restrictions to allow keeping the old ID 5) insert all of the Name_old records into the newly created Name table 6) turn on auto number if needed 7 delete Name_old. As you read the above steps, you might think "why can't I just alter the table and column?" The reason is because you cannot turn a column into a NOT NULL. As you can see, changing an existing table can be complicated. Leave the table alone and just use these two commands:

update Name Set MiddleName ='' where MiddleName is null alter table Name add constraint dfMiddleName default '' for MiddleName

There are a few things to keep in mind why the above will work:

  • The data type of MiddleName is varchar. With a varchar, even a ' ' (tic space tic) will be truncated and made into an empty string.
  • If the data type of MiddleName were char, then there would be some other issues. A char data type stores spaces. If MiddleName was defined as char(25), the length would always be 25. Good database design would define variable length data as varchar.
  • Having a default defined will force all further inserts or updates to not have a NULL.
  • Having a default on a NOT NULL column can also help as a insert or update statement can exclude the column.

Fixing Data Relationship Using the Name table example from Part 1:

create table Name ( NameID integer NOT NULL, LastName varchar(25) NOT NULL, FirstName varchar(25) NOT NULL, MiddleName varchar(25) NULL);

We will add a new column called NameTypeID. And also define a reference or relationship to a new table called NameType. Create The New Table:

Create table NameType ( NameTypeID integer NOT NULL, Description varchar(50) NOT NULL); alter table NameType add constraint PK_NameType primary key (NameTypeID);

Insert Records Into NameType:

Insert into NameType values (1, 'given name') Insert into NameType values (2, 'married name')

Create New Column for Name Table:

Alter table Name Add NameTypeID integer NOT NULL default 0

Please note that you cannot add a new column with the NOT NULL option without having a default defined. Define Relationship: If you added the relationship without adding a 0 record, you will get an error. Remember there are three records in the Name Table. Adding the NameTypeID column to the Name Table will set each existing record to zero. This code will insert a zero 0 record and then define the relationship between the NameType and Name Tables:

Insert into NameType values (0, 'unknown') alter table Name add constraint fkNameType_Name foreign key(NameTypeID) references NameType (NameTypeID)

Seeing the Data The Join is now simple because NULLs are not used:

select Name.FirstName, Name.LastName, NameType.Description from Name inner join NameType on NameType.NameTypeID = Name.NameTypeID

The query will return all names, plus the Description of the NameType. The description will all be "unknown" as none of the records have a name type defined. Here is what the query would look like if NULLs were used:

select Name.FirstName, Name.LastName, Description=coalesce(NameType.Description,'') from Name right join NameType ON NameType.NameTypeID = Name.NameTypeID

How to handle Zero To Many record NULLs In the above example with NameTypeID, we could handle the relationship as it is -- a "lookup" relationship. The relationship is still a One to Many, but for each Name record had a corresponding NameTypeID because the 0 record was added. Now let's look at another real world condition. Zero to Many record relationship. The difference is that the relationship doesn't not have to exist. Keeping with the simple Person type of data, let's add a new table called Address. A person can have zero or more address.

create table Address ( AddressID int not null primary key AUTO_INCREMENT, NameID int not null, Street varchar(50) null, City varchar(50) null, State varchar(50) null, PostalCode varchar(50) null);

Add one Address record for Fred Anderson:

Insert into Address (NameID, Street, City, State, PostalCode) Values( 101, '939 Lincoln Street', 'Mercerville', 'NJ','8619')

Knowing that the relationship between Name and Address is a Zero to Many, the keywords LEFT JOIN must be used because we want to return all Names and Addresses (if we have them).

select Name.LastName, Address.Street, Address.City, Address.State, Address.PostalCode from Name left join Address on Address.NameID = Name.NameID;

Executing the select statement will return all names and a single address for Fred Anderson. The Addresses do not exist for Sally and Matt so they are NULL.

LastName Street City State PostalCode
Anderson 939 Lincoln Street Mercerville NJ 8619
Johnson NULL NULL NULL NULL
Smith NULL NULL NULL NULL

Now to eliminate NULLs, we must check for existence of NULLs. The following code uses a ANSI standard keyword coalesce. This code will make any NULL an empty string.

select Name.LastName coalesce (Address.Street,'') as Street, coalesce (Address.City,'') as Address, coalesce (Address.State,'') as State, coalesce (Address.PostalCode,'') as PostalCode from Name left join Address on Address.NameID = Name.NameID

You might be thinking, that is a lot of work to get rid of NULLs. In the real word, NULLs still exist. There are two arguments about where (location) you should handle the NULLs. 1) The first argument states that all presentation of data should be handled by the application. The application should handle NULLs, data formatting and etc. 2) The second argument is that the database should handle NULLs and formatting. As it is just data. 3) I would add a third argument. NULLs should be handled on the database side and formatting of data on the application side. Taking argument 3, how do you eliminate having to code or change code to get rid of NULLs? How do you consistently eliminate nulls without having to put code in many places (i.e. reporting, exports, user interface)? One solution without coding is to add the missing rows to the Address Table. Every Person has some type of address. So, they should have at least one record. We are talking about changing the relationship to be a one to many (not zero to many). This code will add all missing Address records. Keep in mind that more disk space will be used having a one to many relationship.

insert into address(NameID, Street, City, State, PostalCode) select NameID, Street='',City='',State='',PostalCode='' from Name where not exists (select 1 from Address where Address.NameID = Name.NameID)

After running the above code, the left join is no longer needed. Turn it into inner join. The performance would also be improved.
Conclusion

 

 

 

  • Don't use NULLs in your databases
  •  

     

     

    [End of Part 2]

    • Existing queries and reports do not need to change
    • There are ways to take existing databases and tables and eliminate the need for NULLs
    • Adding Defaults to columns can eliminate NULLs
    • Do keep in mind, that not all relationships can be change into a one to many.
    • In the real word, NULLs still exist.

    Real World use of Database NULLs -- Part 1

    Article: Real World use of Database NULLs -- Part 1
    Database: MS SQL Server, MS access and MySql. Could also apply to Progress and Oracle.
    Author: Robert Cunningham

    You can search the web and find a lot of information on the subject of NULLs. In this article, I will be contradicting some of the information you may find. Contradicting some business best practice. That is because academic or theoretical doesn't apply to Real World usage. Real World -- don't use NULLs -- with one exception.

    WHY? Here are a few benefits:

    • Developers and end users do not have to handle data differently. See below for issues
    • JOIN performance is better. The JOINS are natural one to one ratio
    • Updates are faster. If NULLs are used, most databases require more disk writes.


    May cause some issues:

    • More database space will be used. Larger database requires more disk space.
    • Parent Tables need to have a record to show not assigned. See Part 2 for more details


    WHAT IS A NULL To start, what is a NULL? Simply put, a NULL is the absents of data. Or put another way, value of nothing. If you look at a simple Name table:

    create table Name ( NameID integer NOT NULL, LastName varchar(25) NOT NULL, FirstName varchar(25) NOT NULL, MiddleName varchar(25) NULL);

     

    NameID LastName FirstName MiddleName
    100 Anderson Fred John
    101 Johnson Sally
    102 Smith Matt NULL

    In the above example, MiddleName is the only column that allows NULLs. From the academic prospective: a NULL should be used if the value is unknown. For example, Matt Smith's middle name is not known, so a NULL is used. Whereas Sally Johnson's doesn't have a middle name and it is set to an empty string. From the real world prospective: Don't use NULLs. NULLs can cause problems for programming. If programming of a Stored Procedure, Java or PHP code, you have to accommodate a NULL differently than any other data. Does it really matter if Matt Smith has a MiddleName? No! Set it to an empty string and don't use a NULL.

    PROGRAMMING ISSUES There are many issues with NULLs when it comes to programming. Most programming languages don't like NULLs and the programmer has to handle data differently. SQL:

    If NULLs are Allowed No NULLs -- NOTE The MiddleName Column does not have nulls
    Return unknown middle names:
    select * from Names where (MiddleName is not null OR MiddleName = '')

    (Will return Sally and Matt)
    select * from Names where MiddleName = ''
    (will return Sally and Matt because they do not have middle names)

    Looking at the above queries both return the same number of rows. With the first query, The problem OR is used. You also have to remember to use the (). With the second query, it is easier to write and you don't have to worry about the concept of a NULL.

    If NULLs are Allowed No NULLs -- NOTE: The MiddleName Column does not have nulls
    Select EmployeeName = FirstName + ' ' + MiddleName + ' ' + LastName from Names (The above query would return NULL for EmployeeName of Matt) To fix query to work with NULLs: Select EmployeeName = FirstName + ' ' + coalesce(MiddleName, '') + ' ' + LastName from Names Select EmployeeName = FirstName + ' ' + MiddleName + ' ' + LastName from Names
    (This query works because NULLs are not allowed. It is simple to read. Simple to write)


    Notes:
    1) coalesce is a command that will return a first non-null value.
    2) Remember Anything + nothing (a NULL) is nothing.
    NO NULLs with Different Data Types If you are designing a new database, you can force each column not to have NULLs. For existing tables, you have two options:

    • The first is to alter each table and each column to not allow NULLs. This can be difficult and can require someone with Medium to Advanced level of databases experience.
    • The second option is easier to implement. This option just updates the existing data to a default not NULL value. See table below for examples for each type of data.


    Example:

    update Name Set MiddleName = '' where MiddleName is null


    Optionally, To insure the data doesn't get NULLs again:

    SQL Server: alter table Name add constraint dfMiddleName default '' for MiddleName
    Data Type Not NULL value
    Text Data -- varchar, text or char Empty string. tic tic ('') or a quote quote (""). Storage is zero bytes
    Numeric -- integer, int, and etc Set to zero. If you really need to have a non-answer value, use a -1.
    Date Set to the value will be 1900-01-01 00:00:00.000.
    Currency, Rates, Percentage Currency or money type of columns should NEVER be null. A currency type of column should have a predefined DEFAULT of zero.

    THE EXCEPTION The exception to the rule is when it comes to data relationship columns. For All database engines, a foreign key is a definition of data relationship between two tables. Most foreign keys are of numeric data type -- typically Integer. To set a NULL value to a zero would cause foreign keyviolation . So, how do you eliminate a need for a NULL. Easy! Insert a 0 ID record in the parent table. This is actually common for data warehouse databases. See Part 2 for details. [End of Part 1]