# Connectivity issues in Azure SQL Database Before we jump into troubleshooting, we should clarify the following points first: - It's **connectivity** error or **authentication** error - The connection **always fails** or it's **intermittently failed**. - Understand the network topology. Is the client hosted in Azure, is the client behind a firewall, or routed through a 3rd-party VPN or proxy? ## Basic concepts The available options for SQL Database network controls: Server Firewalls: - [IP based firewall rules](learn.microsoft.com/en-us/azure/azure-sql/database/firewall-configure?view=azuresql): explicitly allow connections from a specific IP address. For example, from on-premises machines or a range of IP addresses by specifying the start and end IP address. - [VNET firewall rules (service endpoint)](learn.microsoft.com/en-us/azure/azure-sql/database/vnet-service-endpoint-rule-overview?view=azuresql): allow traffic from a specific virtual network within the Azure boundary. - Allow Azure services and resources to access this server: any resources within the Azure boundary can access SQL Database - [Azure IP service range](microsoft.com/en-my/download/details.aspx?id=56519&msockid=2763aef7699d64f52b82bdfa68e765e2). For example, an Azure Virtual Machine can access the SQL Database resources. Private endpoint: - [Private Link](https://learn.microsoft.com/en-us/azure/azure-sql/database/private-endpoint-overview?view=azuresql): create a private endpoint for the SQL Server within a specific virtual network. ## Troubleshooting steps ### Determine error types We will be able to quickly identify it's a service related issue or network related issue by checking the error messages and code, for instance error `40197` or `40613` usually indciates a service issue and Error `18456` and `40531` usually indicates an authenticaiton issue. [Troubleshoot Common Connection Issues - Azure SQL & SQL database in Fabric | Microsoft Learn](https://learn.microsoft.com/en-us/azure/azure-sql/database/troubleshoot-common-errors-issues?view=azuresql#transient-fault-error-messages-40197-40613-and-others) ### Permanent network-related error For network-related errors or instance-specific error, we can leverage network utilities to investigate further. Regarding timeouts errors we should first verify whether the TCP traffic is allowed or not: - telnet {host} {port} - nc -vz {host} {port} - nmap -p {port} {host} Windows featured - Test-NetConnection -computer myserver.database.windows.net -port 1433 -InformationLevel Detailed - [PsPing.exe](https://learn.microsoft.com/en-us/sysinternals/downloads/PsPing) mysqldbsrvr.database.windows.net:1433 Azure SQL Database use a wide range of port for Redirect connection policy: 11000 to 11999 for public endpoint, 1433 to 65535 for private endpoint, plus 1433 for gateway. It's common to see connection timeouts due to this port range not allowed properly: [Connectivity architecture ](https://learn.microsoft.com/en-us/azure/azure-sql/database/connectivity-architecture?view=azuresql#connection-policy) When using private endpoint, it's important to verify if DNS resoluted to the private IP properly : - nslookup {domain} We can leverage PowerShell to verify using specific DNS server: - Resolve-DnsName -Name www.bing.com -Server 10.0.0.1 #### Advanced tools ##### SQL connectivity checker [GitHub - Azure/SQL-Connectivity-Checker: This PowerShell script will run some connectivity checks from this machine to the server and database.](https://github.com/Azure/SQL-Connectivity-Checker) This checked is super useful, it can be executed both on Windows and Linux OS. Also it can collect network trace to analyze further. To open etl file captured with SQL connectivity checker - Download Network Monitor 4.3 - From menu select "Options": change parser profile → select "Windows" → set as active profile. Tips: - check specific service by filtering ports: ``` SourcePort == 1433 or DestinationPort == 1433 OR Source == "DBHOST" OR Destination == "10.97.24.188" ``` - check specific communication by filtering ports and source/destination ``` SourcePort == 1433 or DestinationPort == 1433 OR Source == "DBHOST" OR Destination == "10.97.24.188" ``` ##### SSMS sqlclient trace We can enable SQL Client trace from SSMS which is helpful to live troubleshoot connectivity issue: https://learn.microsoft.com/en-us/ssms/menu-help/options-output-window-page How to enable this trace: 1. Menu: Tools -> Options -> Output Window -> General -> check "SQL Client" 2. Restart the SSMS. How to show output: Menu: View -> Output -> show output from "SQL Client" - sample log for login ```log 2025-12-25T05:50:48.2373992Z [1]: 19, 'Data Source=XXXXXX.database.windows.net;Persist Security Info=True;User ID=XXXX;Pooling=False;MultipleActiveResultSets=False;Connect Timeout=30;Encrypt=True;TrustServerCertificate=False;Packet Size=4096;Command Timeout=0' 2025-12-25T05:50:48.2373992Z [1]: Requested the ConnectionProvider value. 2025-12-25T05:50:48.2383954Z [153]: SQL authentication 2025-12-25T05:50:48.5330570Z [153]: Sending prelogin handshake 2025-12-25T05:50:48.5330570Z [153]: ClientConnectionID 77dc3d3c-b5df-418a-a0f5-3d1381eb0dc2, ActivityID df7fd146-1a52-4f6a-8410-06f48f896e62:1 2025-12-25T05:50:48.5330570Z [153]: Consuming prelogin handshake 2025-12-25T05:50:49.0423727Z [153]: Prelogin handshake successful 2025-12-25T05:50:49.2047005Z [153]: Received login acknowledgement token 2025-12-25T05:50:49.2047005Z [153]: Routed toXXXXX1.database.windows.net 2025-12-25T05:50:49.2047005Z [153]: SQL authentication 2025-12-25T05:50:49.4602473Z [153]: Sending prelogin handshake 2025-12-25T05:50:49.4602473Z [153]: ClientConnectionID 9ea9e0e6-2238-4e87-8849-193e9a19d403, ActivityID df7fd146-1a52-4f6a-8410-06f48f896e62:2 2025-12-25T05:50:49.4602473Z [153]: Consuming prelogin handshake 2025-12-25T05:50:49.9681568Z [153]: Prelogin handshake successful 2025-12-25T05:50:50.1179660Z [153]: Received login acknowledgement token 2025-12-25T05:50:50.1189649Z [153]: State changed from NotStarted to HasLoggedIn. 2025-12-25T05:50:50.1189649Z [153]: 1, Non-pooled database connection created. 2025-12-25T05:50:50.1189649Z [72]: 19 2025-12-25T05:50:50.1189649Z [1]: 19 2025-12-25T05:50:50.1649649Z [1]: 19 2025-12-25T05:50:50.1664709Z [1]: 123, String Value = 'null' .... 2025-12-25T05:50:50.1664709Z [1]: SqlCommand.WriteBeginExecuteEvent | INFO | Object Id 124, Client connection Id 9ea9e0e6-2238-4e87-8849-193e9a19d403, Command Text DECLARE @edition sysname; 2025-12-25T05:50:50.1664709Z [1]: 124, Command executed as SQLBATCH. 2025-12-25T05:50:50.1664709Z [1]: ActivityID cb37ee21-7e9d-4199-b8d1-2bafe41b1879:6 2025-12-25T05:50:50.3098985Z [1]: SqlCommand.WriteEndExecuteEvent | INFO | Object Id 124, Client Connection Id 9ea9e0e6-2238-4e87-8849-193e9a19d403, Composite State 5, Sql Exception Number 0 ``` #### Useful links - 64 Error [Lesson Learned #433:Azure SQL Pre-login Handshake Error 64:"The specified network name is no longer"](https://techcommunity.microsoft.com/blog/azuredbsupport/lesson-learned-433azure-sql-pre-login-handshake-error-64the-specified-network-na/3936302): This error suggests that although a connection to the SQL Server was initially successful, it failed during the pre-login handshake. This phase is vital in the SQL Server connection process, and failures here commonly indicate network-related issues, incorrect configurations, or software incompatibilities. ### Intermittent connection drops/termination errors Given the nature of cloud environment, 100% network stability cannot be guaranteed and a few transient faults are inevitable. Regarding intermittent connectivity issues with SQL Database, there are 2 patterns; cannot establish connection - login failures; exiting conneciton drops - abrupt disconnections. Java SDK throws connection reset exceptions, and .NET SDK throws SqlExceptions like the [following termination errors](https://learn.microsoft.com/en-us/azure/azure-sql/database/troubleshoot-common-errors-issues?view=azuresql#network-connection-termination-errors). Those are client-side TCP provider errors (SQL error number = 0) raised when the network stack detects that an already-open TCP socket was closed/reset unexpectedly by the peer (or by an intermediate device): - A connection was successfully established with the server, but then an error occurred during the pre-login handshake. (provider: TCP Provider, error: 0 - An existing connection was forcibly closed by the remote host.) An existing connection was forcibly closed by the remote host → drop happened before TDS login, while exchanging PreLogin + TLS parameters. - A transport-level error has occurred when sending the request to the server. (provider: TCP Provider, error: 0 - An existing connection was forcibly closed by the remote host.) →drop happened after TLS was up and the driver was sending LOGIN7 / waiting for LoginAck - A connection was successfully established with the server, but then an error occurred during the login process. (provider: TCP Provider, error: 0 - An existing connection was forcibly closed by the remote host.) → drop happened after login, during query/request I/O. Often seen when result sets are large or the network path is unstable. > Why error number 0? Because these are not SQL engine failures; they originate in the TCP provider (Winsock/.NET or Java sockets), so there’s no database error code to report. Above intermitent termination errors are generally a symptom of network instability and can be caused by any hops involved in the TCP connection establishment process. Think of the TCP connection as spanning multiple hops (client → DNS → gateway → DB node, plus any firewalls/LBs/VPN/proxies). A reset at any hop will surface to the SDK as “forcibly closed.” - The target is temporarily unavailable or too busy to accept new connections. - Corporate firewalls, SSL/TLS inspection, NAT, proxies, or VPN concentrators may inject resets or timeouts under load or policy (idle timeout, connection reaping, packet-size anomalies). - If TLS can’t be established quickly (network latency, inspection devices, certificate negotiation issues), you’ll see pre‑login handshake failures or timeouts. In terms of Azure SQL Database, it's easy to figure out failures caused by database maintenance. The other one impactful but hard to notice is SQL Gateway maintenance, as it's completely invisible to customers: https://learn.microsoft.com/en-us/azure/azure-sql/database/maintenance-window?view=azuresql#gateway-maintenance > In Azure SQL Database, any connections using the proxy connection policy could be affected by both the chosen maintenance window and a gateway node maintenance window. However, client connections using the recommended redirect connection policy are unaffected by a gateway node maintenance reconfiguration. SQL Gateway is a regional shared resource for database service. The gateway restarts are part of the maintenance of the service. During the maintenance most of gateway nodes remain operational. For nodes undergoing maintenance, the primary impact is a brief instance start, typically for seconds. Using Redirect connection policy is an effective approach to mitigate the impact from the SQL Gateway interruptions. However, tracing all intermittent network faults can be challenging, as it requires capturing a network trace while the issue is still occurring. As a general guidance, applications that interact with cloud-based services should use a fixed number of retry attempts to handle such exceptions transparently. Retry guidance from Architecture Center: - [Transient fault handling - Azure Architecture Center | Microsoft Learn](https://learn.microsoft.com/en-us/azure/architecture/best-practices/transient-faults) - [Retry pattern - Azure Architecture Center | Microsoft Learn](https://learn.microsoft.com/en-us/azure/architecture/patterns/retry#example) Retry coding: - ADO.NET Configurable retry logic: [Configurable retry logic core APIs in SqlClient - ADO.NET Provider for SQL Server | Microsoft Learn](https://learn.microsoft.com/en-us/sql/connect/ado-net/configurable-retry-logic-core-apis-sqlclient?view=sql-server-ver17) - ADO.NET code sample that demonstrates custom retry logic: [Step 4: Connect Resiliently to SQL with ADO.NET - ADO.NET Provider for SQL Server | Microsoft Learn](https://learn.microsoft.com/en-us/sql/connect/ado-net/step-4-connect-resiliently-sql-ado-net?view=sql-server-ver17) ### SqlClient tracing - [Data tracing in SqlClient - ADO.NET Provider for SQL Server | Microsoft Learn](https://learn.microsoft.com/en-us/sql/connect/ado-net/data-tracing?view=sql-server-ver17#access-diagnostic-information-in-the-extended-events-log) - [SqlException.ClientConnectionId Property (System.Data.SqlClient) | Microsoft Learn](https://learn.microsoft.com/en-us/dotnet/api/system.data.sqlclient.sqlexception.clientconnectionid?view=netframework-4.8.1) ### A few edge cases - Login failed due to schema lock wait timeout. ### SQL Database connection latency To test pure network latency, user may use PsPing Usage: https://learn.microsoft.com/en-us/azure/azure-sql/database/private-endpoint-overview?view=azuresql#check-connectivity-using-psping However, ping uses ICMP and only tests basic network reachability and round‑trip time at the IP layer. While database connection involves many additional steps, and customized scripts to establish conenction is effective to quickly measure the latency. In Azure many customers use Application Insights in Azure Monitor to track application performance, which also can measure the call duration to database. https://learn.microsoft.com/en-us/azure/azure-monitor/app/dependencies?tabs=otel