{"id":113,"date":"2017-11-09T22:27:40","date_gmt":"2017-11-09T22:27:40","guid":{"rendered":"http:\/\/eipsoftware.com\/musings\/?p=113"},"modified":"2018-02-19T16:40:43","modified_gmt":"2018-02-19T16:40:43","slug":"sql-update-many-rows","status":"publish","type":"post","link":"https:\/\/eipsoftware.com\/musings\/sql-update-many-rows\/","title":{"rendered":"SQL Update Many Rows"},"content":{"rendered":"<p>When working with a large table, the UPDATE statement can easily bring even the most robust server to a screeching halt. In this post I will show an alternative so that even the largest tables can be updated without negatively impacting server performance.<\/p>\n<h4>Updating a million rows<\/h4>\n<p><!--more-->We&#8217;ve all been there: writing a simple UPDATE statement that just sets one column&#8217;s value to a new value, perhaps with a simple join to determine which rows need to be updated. Without thinking about it, you click execute and wait&#8230; and wait&#8230; and wait&#8230;<\/p>\n<pre class=\"theme:github width-mode:1 width:700 lang:tsql decode:true\">UPDATE\t\tsdo\r\nSET\t\t\tsdo.flo\t= 1\r\nFROM\t\tsto.SourceData_Outlet AS sdo\r\n\t\t\tINNER JOIN [tmp].[flo_dlvr_pnt_no] AS fdp\r\n\t\t\t\tON sdo.dlvr_pnt_no = fdp.dlvr_pnt_no\r\nWHERE\t\tsdo.flo IS NULL<\/pre>\n<p>The transaction tries to update more than 1 million rows and begins writing to the local disk system or, worse, the SAN. The query may eventually finish if you are lucky, or it may time out and need to roll back if you are unlucky.<\/p>\n<p><strong>What to do? 
What to do?<\/strong><\/p>\n<p>Fortunately, T-SQL supports a simple looping structure. You can use a WHILE loop with a BEGIN...END block to delimit the SQL statements to execute.<\/p>\n<h4>Example<\/h4>\n<pre class=\"theme:github width-set:true width-mode:1 width:700 lang:tsql decode:true \">SET ROWCOUNT 10000\r\n\r\nDECLARE\t\t@rows int = 1\r\n\r\nWHILE\t\t@rows &gt; 0\r\n\tBEGIN\r\n\t\t\tUPDATE\t\tsdo\r\n\t\t\tSET\t\t\tsdo.flo\t= 1\r\n\t\t\tFROM\t\tsto.SourceData_Outlet AS sdo\r\n\t\t\t\t\t\tINNER JOIN [tmp].[flo_dlvr_pnt_no] AS fdp\r\n\t\t\t\t\t\t\tON sdo.dlvr_pnt_no = fdp.dlvr_pnt_no\r\n\t\t\tWHERE\t\tsdo.flo IS NULL\r\n\r\n\t\t\tSET\t\t\t@rows = @@ROWCOUNT\t-- capture the batch size before another statement resets it\r\n\tEND\r\n\r\nSET\t\t\tROWCOUNT 0<\/pre>\n<h4>Explanation<\/h4>\n<p>In the example above, SET ROWCOUNT 10000 tells SQL Server to stop processing a statement once it has affected 10000 rows; note that ROWCOUNT is a session setting, not a variable. The WHILE loop then repeats the UPDATE statement in batches of 10000 for as long as the previous batch affected at least one row. The @rows variable captures @@ROWCOUNT immediately after each UPDATE, because @@ROWCOUNT itself is reset by other statements (including SET) and so cannot reliably be tested directly in the loop condition.<\/p>\n<p>Each pass executes and commits an UPDATE of the first 10000 matching rows, then the next 10000 rows, and so on until all of the rows have been updated. The final SET ROWCOUNT 0 restores the default behavior of processing all rows.<\/p>\n<h4>Caveat<\/h4>\n<p>The query above is still ACID compliant, even if it is interrupted mid-stream, and can simply be restarted without any issues. Each batch of 10000 rows is committed as its own transaction, so if a batch doesn&#8217;t complete, its changes are rolled back. And since the WHERE clause only matches rows that haven&#8217;t been updated yet (in this example, rows where flo is still NULL), the restarted query just picks up where it left off.<\/p>\n<h4>Real World Statistics<\/h4>\n<p>By updating in batches of 10000 I was able to update a table with more than 100 million rows in under 4 minutes; a single UPDATE of all 100 million rows never completed at all. In another example I updated 10 million rows in under 1 1\/2 minutes, while the same query updating all 10 million rows in one statement took more than 25 minutes. 
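<\/p>\n<h4>A Modern Alternative<\/h4>\n<p>SET ROWCOUNT is deprecated for INSERT, UPDATE, and DELETE statements in newer versions of SQL Server; Microsoft recommends the TOP clause instead. The same batching pattern can be written with UPDATE TOP (n). The sketch below reuses the table and column names from the example above and is an untested illustration of the pattern, not a drop-in replacement.<\/p>\n<pre class=\"theme:github width-mode:1 width:700 lang:tsql decode:true\">WHILE 1 = 1\r\n\tBEGIN\r\n\t\t\t-- TOP (10000) limits each batch without any session setting\r\n\t\t\tUPDATE\t\tTOP (10000) sdo\r\n\t\t\tSET\t\t\tsdo.flo\t= 1\r\n\t\t\tFROM\t\tsto.SourceData_Outlet AS sdo\r\n\t\t\t\t\t\tINNER JOIN [tmp].[flo_dlvr_pnt_no] AS fdp\r\n\t\t\t\t\t\t\tON sdo.dlvr_pnt_no = fdp.dlvr_pnt_no\r\n\t\t\tWHERE\t\tsdo.flo IS NULL\r\n\r\n\t\t\t-- @@ROWCOUNT still reflects the UPDATE here\r\n\t\t\tIF @@ROWCOUNT = 0 BREAK\r\n\tEND<\/pre>\n<p>Because each batch commits on its own, this version is restartable for the same reason as the original.<\/p>\n<p>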
Your results may vary.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>When working with a large table, the UPDATE statement can easily bring even the most robust server to a screeching halt. In this post I will show an alternative so that even the largest tables can be updated without negatively impacting server performance.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_crdt_document":"","footnotes":""},"categories":[43,4,6],"tags":[13,14,15,25],"series":[],"class_list":["post-113","post","type-post","status-publish","format-standard","hentry","category-sql-t-sql","category-code","category-sql","tag-hint","tag-sql","tag-t-sql","tag-update"],"_links":{"self":[{"href":"https:\/\/eipsoftware.com\/musings\/wp-json\/wp\/v2\/posts\/113","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/eipsoftware.com\/musings\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/eipsoftware.com\/musings\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/eipsoftware.com\/musings\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/eipsoftware.com\/musings\/wp-json\/wp\/v2\/comments?post=113"}],"version-history":[{"count":5,"href":"https:\/\/eipsoftware.com\/musings\/wp-json\/wp\/v2\/posts\/113\/revisions"}],"predecessor-version":[{"id":145,"href":"https:\/\/eipsoftware.com\/musings\/wp-json\/wp\/v2\/posts\/113\/revisions\/145"}],"wp:attachment":[{"href":"https:\/\/eipsoftware.com\/musings\/wp-json\/wp\/v2\/media?parent=113"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/eipsoftware.com\/musings\/wp-json\/wp\/v2\/categories?post=113"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/eipsoftware.com\/musings\/wp-json\/wp\/v2\/tags?post=113"},{"taxonomy":"series","embeddable":true,"href":"https:\/\/eipsoftware.com\/musings\/wp-json
\/wp\/v2\/series?post=113"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}