Archive for the 'Transact SQL' Category

Subselect with inline comma delimited query result

I needed to write a query that would output a list of entities, with a subselect containing all of it’s related entities (could be multiple) in a comma separated list. This was for a one time report, so performance was not a concern, just simplicity. This post was had a solution that dominated my task, check it out!

Composite Key Guid & Int

I have been reading up on the pros and cons of INT vs GUID primary keys and have learned quite a bit I wasn’t aware of regarding the downsides of GUID when it comes to indexing and performance. One thing I saw a few people suggest was to use an INT as the clustered primary key and then a GUID as a nonclustered key in addition to this. The idea is that you would have the performance benefits of using the INT as the primary key, but still maintain the ability for two way data synchronization across servers.

My question is this. It seems to me that if you one were to use GUID as the primary keys, a tool like SQL Data Compare would be extremely powerful as off the top of my head, I feel like data synchronization would be a breeze. However, if the guid was only a secondary key, and INT was still used for primary keys and foreign key reletionships, then I would think that an automated tool like this would not be effective. My assumption is the GUID would only exist on the original record, not on related records, therefore there would be conflicting primary/foreign keys causing confusion to a data comparison tool. Yes, with stored procedures these secondary GUID keys would make it possible to perform two way synchronization, but don’t see it as offering the same level of ease that using the GUID as a primary would.

Anyone have experience with this concept and want to chime in? Maybe I am misinterpreting the way this has been described and someone can steer me in the right direction.

Transact SQL: Using partition by with row_number()

I stumbled across something interesting today when looking at the documentation for Transact SQL's row_number() function.  In the past, when using row_number(), I had limited myself to simple scenarios only using the over clause with "order by". Here is an example where I take a simple resultset containing 7 grades and rank them in descending order:

SQL:
  1. SELECT row_number() over(ORDER BY grades.grade DESC) AS grade_rank, grades.grade
  2. FROM
  3. (
  4.     SELECT 99 AS grade
  5.     union
  6.     SELECT 87
  7.     union
  8.     SELECT 71
  9.     union
  10.     SELECT 83
  11.     union
  12.     SELECT 78
  13.     union
  14.     SELECT 63
  15.     union
  16.     SELECT 92
  17. ) grades

 

In the documentation found here I found mention of using not only "order by", but also something I was unfamiliar with called "partition by", within the over clause. Upon reading further, I learned that this can be of value if you wanted to apply some sort of rank or sequence to subsets within a resultset rather than the entire resultset. A relevant example I came up with was taking a resultset that contains grades for multiple students in multiple classes and ranking the students' grade in descending order within each class. Here is the query:

SQL:
  1. SELECT classid, studentid, row_number() over(partition BY classid ORDER BY classid ASC, grade DESC) AS class_rank, grade
  2. FROM
  3. (
  4.     SELECT 1 AS classid, 1 AS studentid, 99 AS grade
  5.     union
  6.     SELECT 1 AS classid, 2 AS studentid, 87 AS grade
  7.     union
  8.     SELECT 1 AS classid, 3 AS studentid, 71 AS grade
  9.     union
  10.     SELECT 2 AS classid, 4 AS studentid, 83 AS grade
  11.     union
  12.     SELECT 2 AS classid, 5 AS studentid, 78 AS grade
  13.     union
  14.     SELECT 2 AS classid, 6 AS studentid, 63 AS grade
  15.     union
  16.     SELECT 2 AS classid, 7 AS studentid, 92 AS grade
  17. ) student_grades

Running this query will show you 3 students ranked in order of grade descending for class 1 and 4 students ranked in order of grade descending for class 2. Pretty sweet, huh?

Transact SQL StripNonNumeric Function

I was looking to scrub some phone fields that allow for some dirty varchar values at the moment, so I needed a tool to strip non-numeric values from the existing data to determine which records had legit phone numbers and which did not.  Here is a quick function I wrote to help me with this in case anyone needed something similar. I included both the StripNonNumeric function and a more generic StripPattern function that you can use to basically do the same thing, but strip any pattern you like.

SQL:
  1. CREATE CREATE dbo.StripNonNumeric(@value AS VARCHAR(MAX)) RETURNS VARCHAR(MAX) AS
  2. BEGIN
  3.     DECLARE @len AS INT
  4.     DECLARE @pattern AS VARCHAR(5)
  5.     DECLARE @result AS VARCHAR(MAX)
  6.    
  7.     SET @len = LEN(@Value)
  8.     SET @pattern = '[0-9]'
  9.     SET @result = ''
  10.    
  11.     WHILE @len> 0
  12.     BEGIN
  13.         SET @result = @result + CASE WHEN SUBSTRING(@value,@len,1) LIKE @pattern THEN SUBSTRING(@value,@len,1) ELSE '' END
  14.         SET @len = @len - 1
  15.     END
  16.     RETURN reverse(@result)
  17. END
  18.  
  19.  
  20. CREATE FUNCTION dbo.StripPattern(@value AS VARCHAR(MAX), @pattern AS VARCHAR(100)) RETURNS VARCHAR(MAX) AS
  21. BEGIN
  22.     DECLARE @len AS INT
  23.     DECLARE @result AS VARCHAR(MAX)
  24.    
  25.     SET @len = LEN(@Value)
  26.     SET @result = ''
  27.    
  28.     WHILE @len> 0
  29.     BEGIN
  30.         SET @result = @result + CASE WHEN SUBSTRING(@value,@len,1) LIKE @pattern THEN SUBSTRING(@value,@len,1) ELSE '' END
  31.         SET @len = @len - 1
  32.     END
  33.     RETURN reverse(@result)
  34. END