Today, I will continue the discussion of SQL queries with queries that involve different ways to group or derive new data based on the data in a table. The SQL in this post is based on the TeamMembers database. For instructions on setting up the database in LibreOffice and a short description of the database and tables, see the post at http://wp.me/pvwht-a4.
Herding the Data in Tables
The word aggregation comes from the Latin with a literal meaning of “to herd.” So, an aggregation is a collection or gathering of things together. In this post I will show you how to herd some of the data in your tables. I will also show you how to get new data derived from the data in a table.
There are times when you know you are going to get more results than you need. You may want to herd them into smaller groups. You can use TOP and LIMIT to create smaller groups for finding the data you are after. TOP will give you the first x number of records in the result table.
SELECT TOP 5 * FROM "TeamMembers";
The above query will give you the top 5 results in the set.
But what about seeing the next 5 results? You can use the following query to see the next 5 results:
SELECT LIMIT 5 5 * FROM "TeamMembers";
The first number after LIMIT is the offset from the beginning (in this case 5), and the second number is the limit (or number of records to display).
Using TOP and LIMIT, you could work your way through a query with a large result set by herding (grouping) together smaller chucks of the result set. Think of these two keywords as the SQL equivalent of the Page Up/Down buttons.
When using the TOP and LIMIT keywords in Query (SQL View), you need to make sure the Run SQL command directly button is activated or the query will fail. The reason is you are invoking keywords understood only by the underlying HSQLDB engine.
COUNT allows you to count the number of times a field appears in the resulting data set. Keep in mind that if a field in a record is NULL the field is not counted.
SELECT COUNT(*) FROM "TeamMembers";
Gives us the total number of records in the table.
SELECT COUNT(DISTINCT "City") FROM "TeamMembers";
Gives us a count of the number of distinct cities in the table. In other words, it eliminates duplicates. So, if you wanted to know how many different cities the team members live in, this is your command.
SELECT COUNT("LastName") FROM "TeamMembers" WHERE "LastName" = 'Walker';
You can narrow things down with a WHERE clause. This last query tells you how many team members have the last name Walker.
MIN and MAX will give you the smallest or the greatest value in a field. If the field is a text type, the order is determined by alphabetic order. Numbers are considered by normal numeric value. When used on date fields, the MAX is the most recent date and MIN is the oldest date.
To get the date of the most recent hire and the date of the first hire, you can us the following query:
SELECT MAX("DateOfHire"), MIN("DateOfHire") FROM "TeamMembers";
SOME and EVERY only works with Boolean types (1, 0, True, or False). SOME returns True if at least one entry in the field is True (or 1) and False only if all entries are False (or 0). EVERY returns True only if all entries in the field are True (or 1) and False (or 0) otherwise.
SELECT SOME("Married") FROM "TeamMembers";
Will return True (or a checked box) because some of the members are married.
SELECT EVERY("Married") FROM "TeamMembers";
Will return False (or unchecked box) because not all members are married.
SELECT EVERY("Married") FROM "TeamMembers" WHERE "LastName" = 'Walker';
Will return True because all members with the last name Walker are married.
So by using the aggregation functions in SQL, we can “herd” the data into groups and collect information about the nature of the data in a table.
Written with StackEdit.