A question that I answer what seems like all the time is how to run cross-database queries against Azure SQL Database. So rather than answering those questions repeatedly I figured I should blog my response up so from here on out I can just forward the link to this blog on to my customers and others interested.
Most people I talk to still think that cross-database queries in Azure SQL DB aren’t possible. But thanks to a new feature called “Elastic Query”, we can issue cross database queries. There’s actually a few different types of cross database queries that you can run. Just keep in mind that the Elastic Query feature is currently in preview so there are some limitations to be aware of.
1. Horizontal queries (in preview): Horizontal elastic queries are queries that are executed across a set of databases featuring the same schema. The data is partitioned across the many different databases and the (horizontal) queries issued are routed to the correct database based on a shard map. This technique is commonly referred to as “sharding”. You can read more about horizontal partitioning in Azure SQL Database here. An elastic query can be executed across many different databases that share the exact same schema but contain different data. The diagram below represents a horizontal elastic query.
2. Vertical queries (in preview): A vertical elastic query is a query that is executed across databases that contain different schemas and different data sets. An elastic query can be executed across any two Azure SQL Database instances. This is actually really easy to set up and that what this blog post is about! The diagram below represents a query being issued against tables that exist in separate Azure SQL Database instances that contain different schemas.
3. TSQL queries from Azure SQL Database to Azure SQL Data Warehouse (planned feature): This feature is not yet available but basically you’ll be able to issue queries from Azure SQL Database to Azure SQL Data Warehouse. I don’t have an ETA on this feature, but this will be very cool!
Setting up Vertical Elastic Queries in Azure SQL Database
In this blog post, we’re going to set up an Elastic Query that queries different tables in two different Azure SQL Database instance (#2 in the list above). In my example, I have two SQL DBs that are on the same server (which you can see in the diagram below), but they could very well be on separate servers. That part doesn’t matter.
The databases contain tables from the AdventureWorksLT database. Sqldustyeq1 has the following tables has the Product and SalesOrder related tables while sqldustyeq2 has the Customer and Address related tables, as you can see below.
In this example, we’re going to set up elastic queries to query the Customer and Address tables on sqldustyeq2 from the sqldustyeq1 database.
To set this up, there’s basically four steps:
1. Create a Master Key
The Master Key is used to protect the Database Scoped Credential:
CREATE MASTER KEY ENCRYPTION BY PASSWORD = ‘<your password goes here>’;
2. Create a Database Scoped Credential
The database scoped credential contains the credentials used to connect to your external data source. The user name and password should be for an account that has access to the tables in the external database.
CREATE DATABASE SCOPED CREDENTIAL <credential name> WITH IDENTITY = ‘<user name>’,
SECRET = ‘<password for user name>’;
3. Create an External Data Source
CREATE EXTERNAL DATA SOURCE <external data source name> WITH
(TYPE = RDBMS,
LOCATION = ‘<server i.e. myserver.database.windows.net’,
DATABASE_NAME = ‘<name of your database>’,
CREDENTIAL = <credential name you used in the previous step>
) ;
4. Create an External Table
The Create External Table statement specifies the external table you want to query. In my case, I created external tables for SalesLT.Address, SalesLT.Customer, and SalesLT.CustomerAddress. Here’s the Create External Table Statement for the SalesLT.Address table:
CREATE EXTERNAL TABLE [SalesLT].[CustomerAddress](
[CustomerID] [int] NOT NULL,
[AddressID] [int] NOT NULL,
[AddressType] [dbo].[Name] NOT NULL,
[rowguid] [uniqueidentifier] NOT NULL,
[ModifiedDate] [datetime] NOT NULL,
)
WITH
(
DATA_SOURCE = <external data source name you used in Step 3>
);
After creating the external tables, you should see the tables listed under the External Tables folder in SSMS.
So now that we’ve done that, we’re ready to start running some cross database queries!
Querying External Tables
There’s basically two ways to query external tables in Azure SQL Database.
1. Write a Select statement (duh!)
First of all, you can write a basic select statement using the external table just like you would any other physical table.
Select distinct count(s.SalesOrderID) as OrderCount,
etc.CustomerID as ExternalTableCustomerID
From SalesLT.SalesOrderHeader s
inner join SalesLT.Customer etc
on s.CustomerID = etc.CustomerID
Where etc.CustomerID = 29568
Group By etc.CustomerID
Order by 1 DESC
2. Use sp_execute_remote to execute the SQL statement or external stored procedure.
And secondly, you can use sp_execute_remote to execute TSQL or stored procedures on the remote database.
Executing an external stored procedure:
exec sp_execute_remote
N’sqldustyeqtest.sqldustyeq2′, — This is the external data source name…
N’get_CustomerCount’ — This is the external procedure…
Executing a TSQL statement:
exec sp_execute_remote
N’sqldustyeqtest.sqldustyeq2′, — This is the external data source name…
N’Select distinct count(etc.CustomerID) as CustomerCount
From SalesLT.Customer etc’ — This is the TSQL statement
And just like that, you’re executing cross-database queries from one Azure SQL Database to another.
One of the really nice things about the external queries is that filter predicates can be pushed down to the remote data source. So when you’re running your external queries, try to use filter predicates that can be pushed down to the remote data source. In the screenshot below, you can see the Where clause has been pushed down to the remote data source.
Resources
The documentation on Elastic Query is pretty good, so if you have further questions, start here: https://docs.microsoft.com/en-us/azure/sql-database/sql-database-elastic-query-overview
Here’s some information on the pre-requisites for vertical queries: https://docs.microsoft.com/en-us/azure/sql-database/sql-database-elastic-query-vertical-partitioning
Thoughts and Feedback?
I hope you found this useful. Let me know if you’ve played around with Elastic Query in Azure SQL Database and what you learned. I’d love to hear about it!
HI,
Thanks for helpful post.
Is cross database query is available between two stand alone sql databases (which are not part of elastic pool)?
-Chintak.
Yes. The database don’t need to be in an elastic pool. They can be any two SQL DB instances, even different servers.
How is that possible can you redirect me to any specific links where I can try out this.
Try it out with any SQL DB instances in Azure.
Is performance comparable to what you would see on Linked Servers on “regular” SQL ? Or better ? Thanks
I think there’s a lot of “it depends” to answer that question. A lot of variables to consider. Both linked servers and cross database queries in Azure support remote query execution so whenever writing cross database queries I think you’d want to try to write those queries to pull the smallest row set from the remote database to ensure the best performance. But as far as which one is faster I think you’d have to test your specific scenario.
Cross database queries between Azure Datawarehouse and SQL Database possible, if not when this feature going to be available.
I don’t have a timeframe for the release of this feature.
Couple of things: it took a while for me to figure out that the master key and db-scoped credential are created on the DESTINATION database, rather than the source. It makes sense in retrospect.
When I went to CREATE EXTERNAL TABLE, I was informed that the type ‘sys.geography’ is not supported with external tables for sharded data, which kills my deal. That’s what I was trying to transfer. (I have in mind a “horizontal” query… even though my tables do match. Not sure if it just won’t do geography, or if it has inferred I’m sharding, when I am not.
Also wondering where the db-scoped credentials can be managed. SSMS doesn’t show them that I can see, and an attempt to DROP by name in a query window tells me that’s not supported.
Thanks for your blog post. Best of several I found on this topic.
Looks like this is so simple in on-premise SQL Server, a bit more complex in Azure SQL
I have SQL Server Management Studio and my Azure server opened. Yet I cannot see any “External Tables” nor the folder when I expand the databases. Any ideas?
Downloaded SSMS 2017 and I now see the proper folders.
Good to know! Thanks for sharing, Alex.
hey guys,
i have one management data base and 10 other dbs having same schema. from the management db, i need to select sama data from the other 10 dbs. any idea how to do that in a simpliest way.. thanks
I have performed all the above steps but when I try to select data from my external table I got this error –
Error retrieving data from 123.201.6.106, 1717.DbMilestone. The underlying error message received was: ‘A connection was successfully established with the server, but then an error occurred during the login process. (provider: SSL Provider, error: 0 – The certificate chain was issued by an authority that is not trusted.)’.
Thanks for your sharing, it’s real helpful as i followed provided steps and it worked perfectly.
You’re welcome! I’m glad it was helpful!