Archive | Applying Additional Query Techniques RSS for this section

Control execution plans

May include but is not limited to: table hints, query hints

As you may know, SQL Server tries to use the best possible way to execute a query. It has very powerful query optimization techniques, with which you’ll be happy most of the time. But sometimes (very very rarely) you need to provide hints on how to execute a given statement (in a real world, you’d never do that, because it’s the responsibility of the DBA).

Fortunately, there are a bunch of built-in query hints. Even more fortunately, there isn’t a single goddamn word on these hints (to be funny, there isn’t any hint of them) in the Training Kit, so let’s query our old friend, MSDN.

Now there are two kind of these hints, query and table. Let’s start with the former. You can set a query hint on any query (the sole exception is when you use UNIONs), by adding the OPTION keyword, along with the desired hint to it. Here’s a brief list of what you may encounter:
Read More…

Implement CTE (common table expression) queries

May include but is not limited to: recursive, non recursive

A common table expression is a powerful new tool which generates a named result set for the lifetime of the current query. In some cases, they can replace views (you don’t always need views), and therefore reducing complexity. They can be referenced by their name in the query multiple times, and they can be self-referencing, thus enabling recursion.

Here’s how a CTE looks like:

WITH Name of CTE
( List of columns)
AS
( SELECT statement )
Outer SELECT statement

This structure is fairly useful, but let’s see the real cool feature, the self-referencing and the recursion:

WITH Numbers (n)
AS
(SELECT 1 AS n
UNION ALL
SELECT n + 1
FROM Numbers
WHERE n < 10)
SELECT n FROM Numbers

The sample above, which originates from Michael Coles: Pro T-SQL 2008 Programmer’s Guide p.255, does nothing but lists the numbers from one to ten. But it does this in a recursive manner, which opens up new horizons in data access code.

A last thing to remember: CTEs have a recursion level of a hundred, by default. You can override this behavior by specifying the MAXRECURSION option, can be 0 to 32767.

Manage international considerations

May include but is not limited to: collations, defining custom errors, filtering data, sort order, nvarchar, database collation, column collation

SQL Server supports a tremendous set of different cultures and languages, which leads to interesting things. For a second, try to imagine my situation. As a native Hungarian, I constantly have to battle with characters like ‘ő’ or ‘í’, and so on. Let’s I send you a table with the names of my customers, who happen to be Hungarians, too. How would you sort this table, with names like Álmos or Éva?

Every language and accent has sorting rules and SQL Server respects those rules, and enforces them within collations. In Hungary, the letter á comes immediately after a, and not after z. But in Latin, á is seen as an accent variable of a. Now the sorting is obfuscated! Even worse, what if we’re filtering? In Latin, ‘á’ = ‘a’ would return true, but in Hungarian, it certainly isn’t the case.

If you create a database, SQL Server asks you which collation you’d like to use (or uses the default one, if you don’t specify anything). You have the ability to override this setting on the column level, using the COLLATE keyword, and the collation name. For example:

CREATE TABLE GreekTable
( Name nvarchar(20) COLLATE greek_ci_as)

I’m sure you found out which part of the code above sets the collation to Greek. The other two values CI means case insensitive (CS is the case-sensitive one), and AS means accent-sensitive (guess how they mark accent-insensitive).

The last thing in this post will be the SQL string types, stay tuned! We’ve all encountered them, char, nchar, varchar, nvarchar, text, ntext… But what do they do? The good news is that text and ntext are deprecated, so avoid using them. The others: n means Unicode, var means variable length. When you use char and varchar (the ones without the n prefix), you use the underlying collation to specify the code page of the string. This is not the case with Unicode strings, which can store any character, because it uses two bytes for each of them. In both cases, the specified collation determines the filtering and sorting rules.

Apply ranking functions

May include but is not limited to: RANK, PARTITION BY, DENSE_RANK, OVER, ROW_NUMBER, NTILE

When you need to have your result sets numbered, ranked, partitioned by some custom logic, you should turn your attention towards ranking functions (also known as windowing functions). Luckily, there are only a few of them, and you can implement some good behaviors (such as paging) with the help of them. Let’s see the list:

  • ROW_NUMBER: is an ideal candidate for paging data. It takes the OVER clause with an ORDER BY clause, and an optional PARTITION BY. For example: SELECT ROW_NUMBER() OVER(ORDER BY LastName) FROM Employees. When you use PARTITION BY (which must take place within the OVER clause, preceding ORDER BY), you can partition your result by the column name you’ve specified. This means that after every new value, the numbering starts again.
  • RANK: the syntax is the same as was with ROW_NUMBER, but RANK, as the name shows, ranks the results by the specified ORDER BY value. When duplicates are found, it assigns the same number to the other values to them, but skips the next value.
  • DENSE_RANK: the same as RANK, but without skipping any rank number.
  • NTILE: divides the result set into approximate n-tiles, such as quintiles, percentiles, etc. If you have ever stumbled into statistics, you should be quite familiar with these quintiles or deciles. A good example on NTILE is to compute who are the richest in a set.

Implement subqueries

May include but is not limited to: simple, correlated, scalar, list, table valued

A subquery is a query inside another one. I must confess that when I began working with SQL Server (way back in the past) I used subqueries instead of for example joins.

For those who are still here, let me explain why did I do that. Because it is straightforward and simple and requires only the absolute beginner SQL statements. You can do almost everything using subqueries, and if you don’t know how to do something, you definitely will use them.

Now to be a little professional, I’ll introduce the two types of subqueries: correlated and (shocking name) noncorrelated. These names are a little bit self-explanatory. A correlated subquery is one that correlates with the main query, to be more concrete, the subquery depends on the main one. An example:

SELECT * FROM Customers
WHERE CustomerID IN
(SELECT CustomerID FROM Orders WHERE TotalValue > 100)

In contrast, a noncorrelated subquery hasn’t got this relationship.