In this article, I am
giving a quick overview about Ranking Functions in SQL Server. After completing
this article you will understand:
What is Ranking Function?
What are different Ranking Functions in SQL Server?
Please give your valuable
suggestions and feedback to improve this article.
What
is Ranking Function?
In SQL Server Ranking functions are used to give a rank to
each record among all records or in a group.
Let’s understand what does this means. I will take a simple real
world example to make you understand this. Let’s go to in your school days. Our
class teacher maintains a register in which all student names are written. Each
student is given a roll number based on their names for example if there are
50 students then each student will get one roll number between 1 to 50, this is nothing but teacher has
given a serial number to each student. When your result comes out,
based on student marks now student will be given rank, which will be
different from roll number (serial number) i.e roll number 10 came first, roll
number 21 cane second and so on. This is nothing but teacher has given a
rank based on the student marks. In some case suppose two students get
the same marks, then same rank will be given to both the students.
All Ranking functions are non deterministic function, for
more information see Deterministic and Non Deterministic functions in SQLServer.
What
are the different Ranking Functions in SQL Server?
In SQL Server there are four types of Ranking Functions
1.
RANK
2.
DENSE_RANK
3.
ROW_NUMBER
4.
NTILE
Let’s understand each ranking function one by one.
Before we start, let's populate some data into our TEACHMESQLSERVER Database. If you don’t
have database created then use the below query to create.
CREATE DATABASE
TEACHMESQLSERVER
Let’s create one table
name as EMP in our database and populate data. To create and populate data you
can run below set of SQL statements.
USE
TEACHMESQLSERVER
GO
IF EXISTS(SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE
TABLE_NAME='EMP')
DROP TABLE EMP
GO
CREATE TABLE EMP(EMPNAME VARCHAR(MAX),DEPARTMENT VARCHAR(MAX), SALARY MONEY)
GO
INSERT INTO EMP VALUES ('GHANESH','IT',18000)
INSERT INTO EMP VALUES ('PRASAD','HR',2000)
INSERT INTO EMP VALUES ('GAUTAM','SALES',5000)
INSERT INTO EMP VALUES ('KUMAR','HR',85000)
INSERT INTO EMP VALUES ('SUMIT','IT',18000)
INSERT INTO EMP VALUES ('PRIYANKA','HR',25000)
INSERT INTO EMP VALUES ('KAPIL','SALES',5000)
INSERT INTO EMP VALUES ('GOLDY','HR',12000)
INSERT INTO EMP VALUES ('SACHIN','IT',21500)
INSERT INTO EMP VALUES ('OJAL','SALES',19500)
INSERT INTO EMP VALUES ('ISHU','HR',28000)
INSERT INTO EMP VALUES ('GAURAV','IT',15500)
INSERT INTO EMP VALUES ('SAURABH','SALES',20500)
INSERT INTO EMP VALUES ('MADHU','IT',18000)
INSERT INTO EMP VALUES('ATUL','SALES',35000)
GO
SELECT * FROM EMP
As you can see from the above query result set, data has
been successfully populated. I will be using this data set in coming examples
to explain all ranking functions.
RANK
Rank function returns the rank of each record among all
records or in a partition. If two or more rows tie for a rank, each tied rows
receives the same rank but RANK function does not always return
consecutive integers. Don’t worry if second point is not clear to you I
will explain this with example.
Syntax: RANK ()
OVER (PARTITION BY [column_list] ORDER BY [column_list])
PARTITION BY [Column_list]
– This is optional if you want to divide the result set into partitions to
which the function is applied then must give the column name, if you don’t
use this clause then the function treats all records of the query result set as
a single group.
ORDER BY [Column_list]
– This is not optional it is must clause; this determines the order of the data
before the function is applied.
Let’s understand the RANK function using example.
Problem 1:
You want to get each employee rank among all other employees
based on their highest salary.
Solution:
You can use RANK function without PARTITION By Clause and
ORDER BY Clause on SALARY in Descending order.
SQL Code:
SELECT EMPNAME, DEPARTMENT, SALARY , RANK() OVER(ORDER BY SALARY DESC) AS OVER_ALL_RANK_BY_SALARY FROM EMP
Output:
Explanation:
As you can see from the result set each employees has
given a rank based on their salary. Employee KUMAR has the
highest Salary rank is 1 and PRASAD has the lowest salary rank is 15. If you
notice Employee MADHU, SUMIT and GHANESH have given rank 8 because all three
has the same salary. Employee GAURAV has given 11 because 8 rank can be given
to a least one person but two more employees shared the same rank so next two
ranks 9 and 10 will not be given to any employee. This is the problem with the
RANK Function it does not always return consecutive integers.
Problem 2:
You want to get each employee rank in their department based
on their highest salary.
Solution:
You can use RANK function with PARTITION BY Clause on
DEPARTMENT
and ORDER BY Clause on SALARY in Descending order.
and ORDER BY Clause on SALARY in Descending order.
SQL Code:
SELECT EMPNAME, DEPARTMENT, SALARY , RANK() OVER(PARTITION BY DEPARTMENT ORDER BY SALARY DESC) AS RANK_IN_DEP_BY_SALARY FROM EMP
Output:
Explanation:
As you can see from the result set each employees has
given a rank based on their salary in their department. In HR Department every
employee has distinct salary so each has given different rank but in IT
Department three employees has same salary so they has given rank 2 and next
employee gets rank 5.This is the problem with the RANK Function it does not
always return consecutive integers.
DENSE_RANK
RANK function does not always return consecutive integers
to overcome this problem we have another Ranking function in SQL Server which
is known as DENSE_RANK. DENSE_RANK function Returns the rank of
rows within the partition of a result set or within all records, without any gaps in
the ranking. The rank of a row is one plus the number of distinct ranks that
come from the previous row.
Syntax: DENSE_RANK
() OVER (PARTITION BY [column_list] ORDER BY [column_list])
PARTITION BY [Column_list] – This is optional if you want
to divide the result set into partitions to which the function is applied
then you must give the column name, if you don’t use this clause then the
function treats all records of the query result set as a single group.
ORDER BY [Column_list] – This is not optional it is a must
clause; this determines the order of the data before the function is applied.
Let’s understand the DENSE_RANK function using the old examples.
Problem 1:
You want to get each employee rank without gap among
all other employees based on their highest salary.
Solution:
You can use DENSE_RANK function without PARTITION By
Clause and ORDER BY Clause on SALAY in Descending order.
SQL Code:
SELECT EMPNAME, DEPARTMENT, SALARY , DENSE_RANK() OVER(ORDER BY SALARY DESC) AS
OVER_ALL_RANK_BY_SALARY FROM EMP
Output:
Explanation:
As you can see from the result set each employees has
given a rank based on their salary. Whoever is having same
salary has been given same rank i.e 8 and 11 but there is no gap in the ranking
like RANK function.
Problem 2:
You want to get each employee rank without any gap in
their department based on their highest salary.
Solution:
You can use DENSE_RANK function with PARTITION BY Clause on
DEPARTMENT and
ORDER BY Clause on SALAY in Descending order.
SQL Code:
SELECT EMPNAME, DEPARTMENT, SALARY , DENSE_RANK() OVER(PARTITION BY DEPARTMENT ORDER BY SALARY DESC) AS
RANK_IN_DEP_BY_SALARY FROM EMP
Output:
Explanation:
As you can see from the result set each employees has
given a rank based on their salary in their department. In HR Department every
employee has distinct salary so each has given different rank but in IT
Department three employees has same salary so they has given rank 2 and next
Employee gets rank as 3.
ROW_NUMBER
ROW_NUMBER function Returns the serial number of each
record within a partition of a result set or within all records. It
starts serial number as 1 for the first record and for next records it adds
one. We get unique serial number for each record if using ROW_NUMBER function. If we are using ROW_NUMBER function within a partition of a result
set then it starts with 1 in each partition.
Syntax: ROW_NUMBER
() OVER (PARTITION BY [column_list] ORDER BY [column_list])
PARTITION BY [Column_list] – This is optional if you want
to divide the result set into partitions to which the function is applied
then you must give the column name, if you don’t use this clause then the
function treats all records of the query result set as a single group.
ORDER BY [Column_list] – This is not optional it is a must
clause; this determines the order of the data before the function is applied.
Let’s understand the ROW_NUMBER function with simple
examples.
Problem 1:
You want to get serial number for each employee among all
employees based on their highest salary.
Solution:
You can use ROW_NUMBER
function without PARTITION By Clause and ORDER BY Clause on SALAY in Descending
order.
SQL Code:
SELECT EMPNAME, DEPARTMENT, SALARY, ROW_NUMBER() OVER(ORDER BY SALARY DESC) AS ROW_NUMBER_BY_SALARY FROM EMP
Output:
Explanation:
As you can see from the result set each employees has
given a serial number based on their salary among all employees. Serial number
starts with 1 and goes till 15 (because we have total 15 records in EMP table).Whoever
is having same salary has not given same serial number like DENSE_RANK
function.
Problem 2:
You want to get serial number for each employee in their
department based on their highest salary.
Solution:
You can use ROW_NUMBER
function with PARTITION BY Clause on DEPARTMENT and ORDER BY Clause on SALARY in
Descending order.
SQL Code:
SELECT EMPNAME, DEPARTMENT, SALARY, ROW_NUMBER() OVER(PARTITION BY DEPARTMENT ORDER BY SALARY DESC) AS ROW_NUMBER_IN_DEP_BY_SALARY FROM EMP
Output:
Explanation:
As you can see from the result set each employee has
given a serial number based on their salary in their department and it’s
unique, it starts with 1 and goes till 5 because each department has max 5
employees.
Integer_Expression – It is a positive integer constant expression that specifies the number of groups into which each partition must be divided
NTILE
NTILE function distributes all
your records in an ordered groups with a specified group number. The
groups are numbered, starting at 1 for the first group and adds 1 for the next
group. For each record, NTILE returns the number of the group to which the
record belongs.
Syntax: NTILE
(Interger_Expression) OVER (PARTITION BY [column_list] ORDER BY [column_list])
Integer_Expression – It is a positive integer constant expression that specifies the number of groups into which each partition must be divided
PARTITION BY [Column_list] – This is optional if you
want to divide the result set into partitions to which the function is applied
then you must give the column name, if you don’t use this clause then the
function treats all records of the query result set as a single group.
ORDER BY [Column_list] – This is not optional it is a
must clause; this determines the order of the data before the function is
applied.
Let’s understand the NTILE function with simple
examples.
Problem 1:
You want to divide all employees into 3 groups based
on their highest salary.
Solution:
You can use NTILE function without PARTITION By Clause
and with Inter_expression equals to 3 and ORDER BY Clause on SALAY in
Descending order.
SQL
Code:
SELECT EMPNAME, DEPARTMENT, SALARY , NTILE(3) OVER(ORDER BY SALARY DESC) AS GROUP_NUMBER_BY_SALARY FROM
EMP
Output:
Explanation:
As you can see from the result set there are total 15
records in EMP table. All records have been divided into three groups. Each
group has it group number with 5 records.
Problem 2:
You want divide each employee in their department
based on their highest salary into 2 groups.
Solution:
You can use NTILE function with INTEGER EXPRESSION value
equal to 2, PARTITION BY Clause on DEPARTMENT and ORDER BY Clause on SALAY in
Descending order.
SQL Code:
SELECT EMPNAME, DEPARTMENT, SALARY , NTILE(3) OVER(PARTITION BY
DEPARTMENT ORDER BY
SALARY DESC) AS GROUP_NUMBER_IN_DEP_BY_SALARY FROM EMP
Output:
As you can see from the result set there are total 3
Departments, each department has 5 records in it. All records in each department
have been divided into 3 groups. Each group has it group number with its records.
If you liked the article hit the Facebook like button.
Keep Learning!
1 comment:
This is the best article on Ranking functions I have read. Thanks a lot. Keep posting.
Post a Comment