GentleRebuild is a PowerShell script for online index rebuilds in
high-load clustered MSSQL databases running 24/7. Enterprise Edition
is preferred; Standard Edition is supported with limitations.
Detailed description: https://habr.com/ru/articles/761518/
The script executes ALTER INDEX ... REBUILD/REORGANIZE in a
safe, controlled mode using 2 threads: one thread executes the command
while the second checks the environment for several conditions. What
is controlled:
- High CPU
- Too active an index rebuild can overflow the LDF file -
transaction log backups may not be fast enough to flush the generated
log to disk
- High index rebuild activity can fill the AlwaysOn queues (Send and
Redo) above safe thresholds
- Despite the ONLINE option, a rebuild can block other processes via
schema locks - in such cases the script "yields", stopping its work
temporarily
- Controlled deadline time - the deadline is the time when the script should
stop (so as not to interfere with other important processes running on a
fixed schedule)
- Exception list of tables and indexes - e.g. schemas named 'import' or 'tmp',
tables named 'tmp_%', etc.
- Indexes which can't be rebuilt with the option RESUMABLE=ON
or even ONLINE=ON - in such cases indexes are skipped when their size is
above the defined thresholds
- Maximum duration of continuous index rebuild
- Maximum size of indexes rebuilt during a single run - for cases when the
total space for transaction log backups is limited or logs are
used to roll over to STANDBY databases.
- Custom throttling conditions - review custom.ps1 and adjust
the throttling and lock sensitivity for your needs (when you download a new
version, don't overwrite your custom.ps1)
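For orientation, a custom throttling check in custom.ps1 could look like the sketch below. The function name CustomThrottling and the return convention (0 = no throttle, 1 = soft throttle between operations, 2 = hard throttle, see v1.12 in the change log) come from this document; the tempdb free-space condition, the query, and the thresholds are invented examples only.

```powershell
# Hypothetical custom throttling condition for custom.ps1.
# Return values: 0 - no throttling, 1 - soft throttle (between operations),
# 2 - hard throttle (immediately).
function CustomThrottling($conn) {
    # Illustrative condition: throttle when tempdb free space gets low.
    # The 1000/5000 Mb thresholds are assumptions, tune for your environment.
    $cmd = $conn.CreateCommand()
    $cmd.CommandText = @"
SELECT SUM(unallocated_extent_page_count) * 8 / 1024 AS FreeMb
FROM tempdb.sys.dm_db_file_space_usage
"@
    $freeMb = [int]$cmd.ExecuteScalar()
    if ($freeMb -lt 1000) { return 2 }   # critical - stop immediately
    if ($freeMb -lt 5000) { return 1 }   # low - finish the current operation first
    return 0
}
```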
SQL server script
- Execute the script FRG_install.sql on a "DBA" database.
In almost any environment there is a dedicated "DBA" database - DBA,
SYSSYS, Optimization, names differ. In the worst case, if there is no
such database, use msdb.
- Execute FRG_FillUsageStats if needed (check changes in v1.37
for more details)
- Execute the procedure FRG_FillSizeStats to fill a table
with the object list and sizes. This procedure usually finishes its work in a
few seconds. You can limit it to a single database
- Execute the procedure FRG_FillFragmentation to analyze the
level of fragmentation. It could take up to a day or more to analyze
30-100Tb (to limit the impact, check '2-server mode'); the default analysis
mode is DETAILED. This is why the PowerShell program doesn't analyze the
fragmentation level during its run, but relies on the previously
collected information. The first parameter of this stored procedure allows
you to limit the analysis to a single database. By default all non-system
databases are analyzed.
- When the script finishes an index rebuild, it updates the size and
fragmentation info using the stored procedure FRG_FillFragmentationOne.
Typically the fragmentation level drops below the threshold, so on the next
run the index won't be affected
- The PowerShell script relies on static, previously collected
information, not on the actual fragmentation level right now (which can
gradually deteriorate after an analysis). This is why steps 2 and 3
should be executed again from time to time.
- The underlying tables keep the full history of sizes and fragmentation
levels; the last values can be selected from the view FRG_last.
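Putting the steps above together, a typical preparation run on the DBA database might look like the following. The procedure and view names are from this document; the exact parameter signatures are assumptions - check FRG_install.sql for the real ones.

```sql
USE DBA;  -- your "DBA" database, see above
GO
-- 1. Collect the object list and sizes (fast, usually seconds)
EXEC dbo.FRG_FillSizeStats;             -- optionally limited to a single database
-- 2. Collect index usage statistics (needed for the "activity" column, v1.37)
EXEC dbo.FRG_FillUsageStats;
-- 3. Analyze fragmentation (slow on large databases; DETAILED mode by default)
EXEC dbo.FRG_FillFragmentation 'MyDatabase';  -- omit to analyze all non-system databases
-- 4. Review the latest collected values
SELECT TOP (100) * FROM dbo.FRG_last ORDER BY TotalSpaceMb DESC;
```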
Powershell Script parameters
When you run the PowerShell script, supply a configuration file name as the
first parameter, for example:
powershell .\GentleRebuild.ps1 param
where param.ps1 is a PowerShell file setting parameter variables inside.
Check config_example.ps1
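A minimal parameter file sketch (saved as param.ps1 next to the script). The parameter names are described below; the values here are illustrative only - start from config_example.ps1 for the complete, authoritative list.

```powershell
# param.ps1 - illustrative values only, see config_example.ps1
$server        = "SQL01"           # target server (Integrated Security)
$replicaserver = ""                # blank = single-server mode
$dbname        = "*"               # all non-system databases
$workdb        = "DBA"             # database holding the FRG_ objects
$threshold     = 30                # fragmentation threshold, percent
$extrafilter   = "page_count > 100 and TotalSpaceMb < 1000000"
$rebuildopt    = "ONLINE = ON, RESUMABLE = ON, MAXDOP = 2"
$deadline      = "06:30"           # stop before the morning load
$relaxation    = 60                # seconds to sleep when throttled
$checkevery    = 15                # lock check frequency, seconds
$maxlogusedpct = 50                # throttle when LDF is half full
$maxqlen       = 10000             # AlwaysOn queue limit, Mb
```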
All parameters in detail:
- Where and what to rebuild:
- server - server name (Integrated Security is
used; if you need a SQL login, edit the ps1 file)
- replicaserver - read-only replica where the fragmentation
analysis is executed, to limit the stress on the main server; check
'2-server mode'. If the parameter is blank or the same as server, all
queries run against the same server
- dbname - database name to defragment. You can
provide a comma-separated list; * means all non-system databases.
- workdb - DBA database (see above)
- threshold - fragmentation threshold in percent.
- extrafilter - Additional filter, which is applied
against FRG_last view. Useful filter examples:
- page_count>100 to skip tiny tables. Even after a rebuild they
could be too fragmented because they don't have enough pages to
be completely populated
- TotalSpaceMb<1000000 - Skip huge indexes, they need extra
care.
- exceptions by SchemaName, Tablename, IndexName etc
- Reorganize - 0 or 1; use 1 for INDEX REORGANIZE.
- AllowNonResumable - size in Gb. If an index can't
be rebuilt using RESUMABLE=ON, an attempt is made to rebuild
it with ONLINE=ON. However, in that case the process can't be
interrupted, and the program can't yield to locks unless killnonresumable=1
is specified. This is why the index should be small enough: if the
index size is bigger than this value, the index is skipped. Most HEAPs
fall into this category.
- AllowNonOnline - size in Gb. Some indexes can't
be rebuilt even using ONLINE=ON. They are then rebuilt
OFFLINE, locking other processes, and this process is not
interrupted unless killnonresumable=1 is specified. They
should be smaller than this threshold, or they are skipped. Obviously AllowNonOnline
should be even smaller than AllowNonResumable
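The fallback logic described by these two parameters can be summarized in the sketch below (sizes in Gb). This is a simplification for illustration, not the script's actual code.

```powershell
# Simplified decision sketch; the real script also handles COLUMNSTORE,
# killnonresumable and other parameters.
function Get-RebuildMode($indexSizeGb, $supportsResumable, $supportsOnline,
                         $AllowNonResumable, $AllowNonOnline) {
    if ($supportsResumable) { return "ONLINE=ON, RESUMABLE=ON" }       # always safe
    if ($supportsOnline -and $indexSizeGb -le $AllowNonResumable) {
        return "ONLINE=ON"       # uninterruptible, so only below the size limit
    }
    if (-not $supportsOnline -and $indexSizeGb -le $AllowNonOnline) {
        return "OFFLINE"         # locks other processes, so an even smaller limit
    }
    return "SKIP"                # too big to rebuild without an interruptible operation
}
```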
- How to rebuild
- deadline - a string in "HH:MM" format or an empty
string. If the current time is greater than the deadline, tomorrow's time is
assumed. What the deadline affects:
- If the deadline is reached and there are still indexes to work
on, the process stops without starting work on the next index
- If during a rebuild the projected completion time (ETA) falls
behind the deadline while the work is between 2% and 3% done (not far
enough), the rebuild is terminated (using ABORT). However, this
doesn't guarantee that the work always finishes before the
deadline, as the process can slow down later. In such cases manual
intervention is required (check "interactive menu")
- If harddeadline=1, a resumable index rebuild is
terminated at any stage, not only between 2% and 3%, to
guarantee that there is no activity outside of the maintenance
window
- If the ETA is behind the deadline, the progress messages are
highlighted in yellow to attract attention.
- harddeadline - 0 (default). When 1, forces aborting
operations no matter what the progress is.
- starttime - a string in "HH:MM" format or an empty string. If the
current time is greater than starttime, tomorrow's time is assumed.
Forces the script to wait until that time is reached - typically the
start of a maintenance window.
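The "tomorrow's time is assumed" rule for deadline and starttime can be illustrated with this small sketch (not the script's actual code):

```powershell
# If the given "HH:MM" is already in the past today, roll it over to tomorrow.
function Resolve-Deadline([string]$hhmm) {
    if ([string]::IsNullOrEmpty($hhmm)) { return $null }   # empty string = no deadline
    $t = [datetime]::ParseExact($hhmm, "HH:mm", $null)
    $candidate = (Get-Date).Date.AddHours($t.Hour).AddMinutes($t.Minute)
    if ($candidate -le (Get-Date)) { $candidate = $candidate.AddDays(1) }
    return $candidate
}
```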
- rebuildopt - rebuild options for ordinary
indexes (CLUSTERED, NONCLUSTERED, HEAP). Don't use MAX_DURATION; use
chunkminutes instead. Don't use SORT_IN_TEMPDB: for
RESUMABLE it is not supported, and for offline rebuilds there is a
special parameter sortintempdb. Regarding MAXDOP:
- MAXDOP = 1 - slow, without much pressure on the server
- MAXDOP = 2 - normal work, recommended option
- MAXDOP = 4 - aggressive work
- It is not recommended to use MAXDOP>4: you will hit the
AlwaysOn limitations and the process will be throttled
- Note: for some indexes MAXDOP is ignored and the server uses 1-2
threads no matter what MAXDOP is.
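With rebuildopt set as recommended, the command the script issues for an ordinary index would look similar to the following (the table and index names are placeholders). MAX_DURATION is intentionally absent, since chunking is handled by the script via chunkminutes:

```sql
-- Rebuild driven by the script; options come from the rebuildopt parameter
ALTER INDEX [IX_Orders_CustomerId] ON [dbo].[Orders]
REBUILD WITH (ONLINE = ON, RESUMABLE = ON, MAXDOP = 2);

-- A resumable operation can later be paused (e.g. when throttled) and resumed:
ALTER INDEX [IX_Orders_CustomerId] ON [dbo].[Orders] PAUSE;
ALTER INDEX [IX_Orders_CustomerId] ON [dbo].[Orders] RESUME;
```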
- columnstoreopt - similar options for COLUMNSTORE
indexes (COLUMNSTORE index support is experimental)
- reorganizeopt - options for INDEX REORGANIZE
- relaxation - the number of seconds the process sleeps
when throttled by throttling conditions (LDF used size, AlwaysOn
queues, locking)
- checkevery - check for locks every N seconds, default 15
- maxcpu - CPU percentage for the last 10 minutes;
the process is throttled if the CPU is higher
- maxlogused - size in Mb used in the log (LDF). When the
used space is greater than this value, the script stops (throttles) for
(relaxation) seconds and waits until the condition is cleared
(a transaction log backup carries the log away). Value 0 means 'don't
check'
- maxlogusedpct - the same but as a percentage of the total
size of the transaction log. Value 0 means 'don't check'. Typically only one
of these 2 conditions is verified.
- maxqlen - maximum size of the AlwaysOn queues in Mb (log
send queue + redo queue) for all databases. If the value is above the
threshold, the process is throttled. To skip this check, specify a very
high value
- maxdailysize - maximum size of indexes
rebuilt during a single run, in Mb. The condition is verified when work on a
new index is started. 0 means 'don't check'
- chunkminutes - similar to MAX_DURATION, but
implemented in PowerShell. 0 means no maximum duration unless there
are throttling conditions. Time is counted since the last throttling.
- killnonresumable - when 1, operations without RESUMABLE=ON
can be killed. You can increase the values of AllowNonResumable
and AllowNonOnline in the hope that no locks will be
detected during the operation (so value 1 is an optimistic
strategy).
- forceoffline - the Enterprise edition works like the Standard one;
all rebuilds are offline
- sortintempdb - for offline rebuilds only, size in Mb. If the
index is smaller than this size, SORT_IN_TEMPDB=ON is used; if the index
is bigger than the provided value, it will be OFF.
Resumable and online rebuilds can't use this option.
- offlineretries - when a rebuild is not resumable and killnonresumable=1,
the index operation can be aborted when there are locks. The script will
retry the operation soon after, but after the specified number of
retries it gives up and skips this index
- orderby - the order in which tables are processed; for more
details check the changes in v1.37
Interactive menu
By pressing Ctrl/C while the script is progressing (not throttling) you enter
an interactive menu (it might take up to 15 seconds for the menu to appear).
It displays the current rebuild command and waits for a single-character
command. Don't leave it waiting for a command - in such a state the throttling
conditions are not controlled.
For non-resumable index rebuilds, option S - STOP is not available.
For REORGANIZE, there is an extra option R - it stops and recalculates new
stats (because part of the REORGANIZE is already done), then the process goes
to the next index in the list.
After command 'S' you can change the parameters and restart the script; it
will continue from the same index. However, MAXDOP can't be changed once a
rebuild is started.
Additional comments:
- The script starts with smaller indexes and progresses towards the
bigger ones
- The percentage is calculated based on the data in the table index_resumable_operations
if the operation runs with the RESUMABLE option. Otherwise, there is a rough
estimation based on the IO count vs page_count ("rough estimation ... %").
For COLUMNSTORE, the percentage is calculated based on the number of
ROWSTOREs rebuilt.
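For resumable operations, the progress the script reads is exposed by SQL Server itself in the sys.index_resumable_operations catalog view; you can watch it directly, for example:

```sql
-- Progress of resumable index operations in the current database
SELECT OBJECT_NAME(object_id) AS table_name,
       name                   AS index_name,
       state_desc,            -- RUNNING or PAUSED
       percent_complete,
       page_count
FROM sys.index_resumable_operations;
```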
- Sometimes after throttling the percentage doesn't increase and you
will see ETA: (unknown). SQL Server is seeking the place from which to
continue (the "rewinding" effect). Sometimes it can take a long time
- You can check the global progress in the console title
- ETA for the global progress is calculated when at least 1% of the
total work is done
- Don't leave indexes in a suspended resumable state: SQL Server still
has to track all the changes, and some queries can run slower.
- Indexes in a stopped resumable state can fail over to another server,
and the rebuild can continue on that server (!!!)
- To track changes for ONLINE index operations, SQL Server has to change the
execution plans which update these tables. In most cases there is only a mild
performance degradation, but in a few cases (with MERGE) the difference can be
dramatic
- Sometimes an index rebuild can slow down processes even in unrelated
databases, for unknown reasons. Throttling or chunkminutes solves
this problem. It is not known where the magic is, but it looks like
Microsoft knows something and invented MAX_DURATION for this reason.
- If you choose chunkminutes too small, the rebuild performance
could suffer because of the "rewinding effect"
- COLUMNSTORE support is experimental.
Standard Edition: tips and tricks
- All throttling conditions are checked only between operations,
when a new operation starts. Exception: when other processes are locked
and killnonresumable=1
- Use killnonresumable=0 for a "hard" defrag which ignores locks,
when you have a maintenance window (use the deadline parameter as
well)
- You can benefit from setting the sortintempdb value to make
rebuilds faster; you can also increase MAXDOP to reduce the probability
of locks
- Set offlineretries to skip "hot" indexes which can't be
rebuilt without causing too much trouble
- The most important part is defining the custom procedure VictimClassifier,
which allows you to define different categories for different connections,
and for each category to define how long it can wait for locks (for
example, a job named XXX can wait for 1 hour, job YYY for 10 minutes,
other processes are interactive and the process should yield immediately)
- A progress indicator is available for non-resumable operations - even
though it is not provided by SQL Server, a quite accurate estimation is made
based on other system counters
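The VictimClassifier idea described above might be sketched as follows for custom.ps1. The function name and its purpose come from this document; the session fields, the job names, and the matching logic are invented examples - adapt them to the actual signature in your custom.ps1.

```powershell
# Hypothetical VictimClassifier sketch: given a blocked (victim) session,
# return how many seconds it is allowed to wait for the rebuild's locks.
function VictimClassifier($session) {
    if ($session.program_name -like "*job XXX*") { return 3600 }  # batch job: 1 hour
    if ($session.program_name -like "*job YYY*") { return 600 }   # 10 minutes
    return 0   # interactive connection: the rebuild should yield immediately
}
```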
INDEX REORGANIZE: tips and tricks
- This operation is resumable as well and works on Standard Edition, but
it is 3-5 times slower than INDEX REBUILD on non-clustered indexes and up
to 20x slower on CLUSTERED ones
- There is an estimation of the percentage of work done for
REORGANIZE; it is accurate to +/- 15% in most cases
- Sometimes the reorganized area hits a "hot spot" where data is being
inserted on a system under heavy load, and the REORGANIZE never ends. For
that reason, if the estimated percentage reaches 120%, the process is
terminated and the statistics are recalculated.
- REORGANIZE is throttled the same way as other operations. It can cause
locking - typically schema locks.
Rebuild Log
The rebuild log is saved into the table FRG_Log
Two server mode (taking advantage of a readable replica)
Index analysis creates a huge read stress on the server, reading all tables.
If possible, it makes sense to move this process to a secondary replica.
Simply provide the replica server name in the replicaserver parameter.
Note: it is assumed that there is a DBA database there and that this
database is read-write, i.e. it is not part of the AlwaysOn group (or the
replica server is primary for that group).
You should run the stressful FRG_FillFragmentation there.
After finishing each index rebuild, the PowerShell script will also run FRG_FillFragmentationOne
on the secondary server to update the fragmentation statistics.
Note: before running FRG_FillFragmentationOne it waits
for the current LSN of the primary database to be replicated to the
replica server.
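A sketch of the replica catch-up check: SQL Server exposes the LSNs involved in sys.dm_hadr_database_replica_states, so a query like the one below can show how far the replica's redo is behind (the script's actual query may differ):

```sql
-- Run on the replica: how far redo is behind the end of the received log
SELECT DB_NAME(database_id) AS db,
       last_redone_lsn,       -- last LSN already redone on this replica
       end_of_log_lsn,        -- end of the log received from the primary
       redo_queue_size        -- Kb of log not yet redone
FROM sys.dm_hadr_database_replica_states
WHERE is_local = 1;
```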
Change Log:
v1.10
- Introducing custom throttle events. They work similarly to all other
throttling conditions (logs, locks, AlwaysOn). For an idea, check the
function CustomThrottling
- New config parameter killnonresumable: 0 (please add it to your
parameter file). When 1, operations without RESUMABLE=ON can be killed.
You can increase the values of AllowNonResumable and AllowNonOnline
in the hope that no locks will be detected during the operation (so value 1
is an optimistic strategy).
- Minor bug fix in FRG_FillFragmentationOne - please alter the
procedure from the FRG_install.sql file
v1.11
- maxcpu parameter and throttling based on cpu for the last 10
minutes.
- Partial support for Standard Edition - all operations are offline,
throttling happens between operations only, with the exception of having
other processes locked. If killnonresumable=1, the operation will be
aborted but retried offlineretries times.
- Using forceoffline=1 you can force Enterprise Edition to work
as Standard; all rebuilds will be offline
v1.12
- The sortintempdb parameter uses SORT_IN_TEMPDB=ON if the rebuild is
offline (the option is not compatible with online and resumable operations)
and if the index is smaller than the value of sortintempdb (value in Mb).
For example, the value 100000 means that indexes smaller than 100Gb would
use SORT_IN_TEMPDB=ON
- Custom throttle events can now return the values 1 (soft throttle, only
between operations, so indexes are not left in a resumable state) and 2
(hard, throttle immediately).
- Custom throttle events and other throttling conditions affect offline
rebuilds only between operations; the exception is locking.
- quite accurate progress percentage estimation for non-resumable
operations
- sanity checks before run (that FRG_ tables exist and are populated)
v1.13
- VictimClassifier is used extensively on Standard Edition.
It kills rebuilds which lock other processes when:
- the victim process waits longer than the limit given by VictimClassifier
- the victim process wait time + the projected (estimated) rebuild time to
completion (based on percentage) will be longer than the limit given by
VictimClassifier
- but (in both cases) the percentage done is not past the "point of no
return", where rollback would take longer than completion. Depending
on the actual MAXDOP, it is 75%-50% (the higher the MAXDOP, the lower the
point of no return)
v1.13
- 2 custom functions, CustomThrottling and VictimClassifier
are extracted into a separate file custom.ps1
v1.2
- The configuration file is no longer JSON but *.ps1. Reasons:
- I need comments in the configuration file
- I need multi-line strings (here-strings) for the filters
- YAML is not natively supported by PowerShell
v1.23
- new heuristics for INDEX REBUILD OFFLINE
v1.26
- index reorganize now supported with progress estimation
- use option $reorganize = 1
v1.30
- multiple bug fixes and optimizations
- 2-server mode
v1.32
- The FRG_last view now stores the compression type. Please regenerate the
stats for the new version! The underlying tables have changed! (but the .ps1
code is compatible with the old ones)
- checkevery parameter - locks can be checked with any frequency, if
15 seconds is too long.
- text log has the same name as config file
v1.33
- Supports SAMPLED and LIMITED index analysis, default is still DETAILED
- FRG_FillSizeStats can be limited by a single database
v1.34
- harddeadline = 1 forces a stop at the deadline no matter what
- starttime - wait for the start time (used to delay until the maintenance
window in interactive mode)
v1.35
v1.36
- Added FileGroupName to FRG_last and the underlying views, so the script can
filter by filegroup. You need to recreate the script tables
(uninstall/install)
v1.37
- new parameter, orderby to control the processing order
- "TotalSpaceMb" default - by size from the smallest
- "frag_pct desc" from the worst
- "frag_count desc" from highest entropy
- "activity desc" most active first
- "activity*frag_count desc" extra workload for read access
- "activity*(100-density) desc" potential space benefit
after rebuild
Note: the new column activity is calculated based on the index usage
statistics from sys.dm_db_index_usage_stats; the value is calculated
as the sum of seeks, lookups, and scans, multiplied by the table
size. The value is populated by the procedure FRG_FillUsageStats, which
must be called before FRG_FillFragmentation.
In a two-server configuration, where the activity happens on the primary node
but the fragmentation analysis is done on the secondary node, the information
from dm_db_index_usage_stats can be transferred using the procedure FRG_PrintUsageStats
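The raw ingredients of the activity value can be seen directly in the DMV; the query below is an approximation for orientation, and the exact formula used by FRG_FillUsageStats (including the table-size weighting) may differ:

```sql
-- Per-index usage since the last server restart; multiply by table size
-- elsewhere to obtain the "activity" weighting described above.
SELECT DB_NAME(s.database_id)                    AS db,
       OBJECT_NAME(s.object_id, s.database_id)   AS table_name,
       s.index_id,
       s.user_seeks + s.user_lookups + s.user_scans AS activity_raw
FROM sys.dm_db_index_usage_stats AS s
WHERE s.database_id = DB_ID();
```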
Check other projects: https://www.actionatdistance.com/