How to use Minion Enterprise for the first time, part 3

This is the third article in a multi-part series to guide you through your first weeks in Minion Enterprise.

  • In Part 1 we covered the philosophy of environment monitoring, and how to register servers.
  • In Part 2 we configured error log search terms, and got some good “navigating ME” tips.

Today let’s talk a little bit about configuring alerting, handling email alerts, and a quick note on applications and roles.

Configure Alerting

Minion Enterprise comes with a set of alerts – each with custom thresholds – all configured. Of course, each environment will have its own requirements and quirks.

You can set thresholds, deferments, and/or exceptions for:

  • Missing Backup alerts
  • Drive Space Full alerts
  • Error Log Search Term found alerts
  • Instance Configuration (sp_configure) alerts
  • Replication Latency alerts
  • Service Status alerts
  • Weak Password alerts

You can also turn data collections “off” for one or more servers, by inserting one or more rows to the dbo.CollectionExceptionsServer table.

For information on configuring alerts, see How To: Configure Alerting.

Handle Email Alerts

Inevitably, something will trigger an email alert. Alerts in ME are designed to be aggregated and actionable. For example, the following email alerts us to SQL services with “Auto” start mode, which are stopped:

In this email, we can easily see that:

  1. The alert applies to “Gold” servers.
  2. The only server we’re alerting on is P___.
  3. The service that’s stopped is “SQL Server CEIP service (MSSQLSERVER)”.
  4. We have ready-made code available to us, to defer or exclude this service from alerting.

All of the MinionWare alerts follow these conventions.

Note: We do not believe in junk alerts, inactionable alerts, or alert storming. Take the necessary actions to make sure the alerts you receive are real alerts!

In this example, I personally considered running the Setup.ServiceStatusException code, because I don’t need emails for CEIP (customer feedback program) alerts for this server, ever.

Come to think of it, I don’t need alerts for this service on any server! So I make the exception global:

For more information, see Service Down Alerts.

Set Roles, Applications, Application Owners, and/or Environments

We often refer to different servers not by their name, but by their associated function or application. Minion Enterprise provides a way to define applications and roles for your servers. For each application you wish to define, you’ll need to insert:

  • the application information
  • the server’s role for this application (e.g., Database, SSIS, SSRS, etc.)
  • the application’s environment on this server (e.g., production, test, etc.)
  • the associated SQL instance

For instructions, see How to: Define Applications.

Next time we’ll talk more about what you can do with Minion Enterprise. In the meantime, feel free to look around the online documentation!

How to use Minion Enterprise for the first time, part 2

This is the second article in a multi-part series to guide you through your first weeks in Minion Enterprise. In Part 1 we covered the philosophy of environment monitoring, and how to register servers.

Today let’s cover how to configure error log search terms, and how to navigate ME.

Configure Error Log Search Terms

The Error Log Search module allows you to set up specific error log search terms in the dbo.ErrorLogSearch table.  This module automatically gathers any search term matches, and logs them in the Collector.ErrorLog table for alerting or reporting.

For more information on error log searches, see Error Log Search Module.

IMPORTANT: SQL Server error logs are simply text files; they aren’t indexable. So, any search on an error log file must (by definition) search the entire file.  It’s possible, then, that you could see some performance lag during error log searches if the SQL Server error log is extremely large.  To minimize this effect, set up a nightly job to cycle the SQL Server error log on each instance, and configure SQL Server to retain 30 days of logs.  This is good log management that we recommend in any case; it has the added benefit here of helping the performance of this process.

To set up an error log search, insert that search into the dbo.ErrorLogSearch table.  From that time forward, the search is valid for all active, managed SQL Server instances (that don’t have an exception defined in dbo.ErrorLogSearchServerExceptions).

Take the example of an enterprise-wide search for corruption errors. SQL Server records I/O-level corruption in its error log as Error 823 and Error 824, for example when DBCC CHECKDB or an ordinary query reads a damaged page. So we will define one search for 823, and one for 824:

Remember: from now on, ME will search all of your active SQL Server instances for these errors, and will log them in the Collector.ErrorLog table for alerting or reporting. But we have not set up an alert for this! You can make your own alert SP, or pull one down from the Community Zone!
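To make the idea of a search term concrete, here is a minimal Python sketch of term matching against error log text. It is only an illustration of the concept, not how the Error Log Search module is implemented; the `search_error_log` helper, the sample log lines, and the exact term strings are all made up for the example.

```python
def search_error_log(lines, search_terms):
    """Scan every line of an error log for every search term.

    Returns (line_number, term, line_text) for each match. The whole log is
    read from top to bottom, which is why very large logs are slow to search.
    """
    matches = []
    for line_number, line in enumerate(lines, start=1):
        for term in search_terms:
            if term.lower() in line.lower():
                matches.append((line_number, term, line.strip()))
    return matches


# Hypothetical search terms for the two corruption errors.
terms = ["Error: 823,", "Error: 824,"]

# Made-up sample lines in the general shape of a SQL Server error log.
sample_log = [
    "2017-03-14 02:10:58.10 spid52      Starting up database 'Sales'.",
    "2017-03-14 02:11:05.42 spid52      Error: 824, Severity: 24, State: 2.",
    "2017-03-14 02:12:00.00 Logon       Login succeeded for user 'CORP\\app_svc'.",
]

matches = search_error_log(sample_log, terms)
for line_number, term, text in matches:
    print(line_number, term, text)
```

In a real shop the matches would land in a table for alerting or reporting rather than being printed.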

Getting Around ME

The Architecture Overview article is an excellent place to start. But, to name a few salient points, we’ll tell you:

  • How to get around in ME
  • A bit about default values
  • A couple of super important features
  • What we think about your customizations

Getting Around

Minion Enterprise is made up of SQL Server stored procedures, tables, jobs, executables, and configuration (“config”) files. Almost everything you’ll touch is native SQL!

The Minion Enterprise repository groups database objects logically into schemas. Some of the most common schemas you’ll see are:

  • Alert – The Alert schema includes alert-related stored procedures, and tables that control alert thresholds and deferments.
  • Archive – The Archive schema includes tables and stored procedures related to Minion Enterprise’s self-cleanup. For example, the Archive.Config table determines how many days’ worth of data to keep for each of the tables in the Minion database. Be sure to look at the archival settings in this table to customize your data retention.
  • Collector – Tables in the Collector schema hold collected data from managed instances. As you might suspect, Collector stored procedures play a part in the actual data collection.
  • dbo – Data in the dbo schema tables tends to be more static than in other schemas – for example, the data in dbo.Servers will change as servers are added, removed, and upgraded, but the frequency of server changes is far less than data collections or alerts.
  • History – Alert details are saved to tables in the History schema, before those alerts are emailed. For example, one row in the table History.Backups represents one detected missed backup. This allows you to (among other things) report on databases that have frequent trouble with backups.
  • Setup – The Setup schema denotes stored procedures that allow you to set up various deferments, exceptions, and thresholds (e.g., Setup.DiskSpaceThreshold allows you to create or alter a disk space alert threshold). In the future this may expand to additional setup stored procedures.

Using this, and a touch of “sys.tables”, you can find pretty much anything you’re looking for. (“Where is the latest job history collection? Oh yeah, Collector.SysJobHistoryCurrent…”)

On Defaults

Global Default – Any time you see InstanceID = 0 in Minion Enterprise, it represents a global default. So for example, you can set an alert exception for all servers, using InstanceID=0 in that alert’s exceptions table.

Instance Default – Any time you see DBName = MinionDefault, it represents the default for all databases on the given InstanceID.

Global Database Default – You can also set an environment-wide default for a specific database, by using the database name along with InstanceID = 0.
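One way to picture these defaults is as a fallback chain from most specific to least specific. The Python sketch below models that idea for illustration only. The lookup order shown (exact match, then instance default, then global database default, then global default) and the `resolve_setting` helper are assumptions made for this example, not documented Minion Enterprise behavior, and the threshold rows are invented.

```python
from typing import Optional

GLOBAL_INSTANCE = 0            # InstanceID = 0 marks a global default
DEFAULT_DB = "MinionDefault"   # DBName = MinionDefault marks an instance-wide default


def resolve_setting(settings: dict, instance_id: int, db_name: str) -> Optional[int]:
    """Return the most specific value configured for (instance_id, db_name).

    The fallback order below is an assumption for illustration only.
    """
    candidates = [
        (instance_id, db_name),          # exact row for this database on this instance
        (instance_id, DEFAULT_DB),       # instance default
        (GLOBAL_INSTANCE, db_name),      # global database default
        (GLOBAL_INSTANCE, DEFAULT_DB),   # global default
    ]
    for key in candidates:
        if key in settings:
            return settings[key]
    return None


# Hypothetical threshold rows keyed by (InstanceID, DBName).
thresholds = {
    (0, "MinionDefault"): 24,
    (0, "ReportingDB"): 48,
    (7, "MinionDefault"): 12,
    (7, "Sales"): 6,
}

print(resolve_setting(thresholds, 7, "Sales"))        # exact match
print(resolve_setting(thresholds, 7, "Inventory"))    # instance default
print(resolve_setting(thresholds, 3, "ReportingDB"))  # global database default
print(resolve_setting(thresholds, 3, "Inventory"))    # global default
```

Check the documentation for each table before relying on any particular precedence between an instance default and a global database default.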

Super Important Features

IsActive – Many tables have an IsActive column. This allows you to “turn off” that particular row, temporarily or permanently, without deleting the row.

Service Levels – Each managed server in Minion Enterprise must be assigned a service level, a simple label for the level of the server’s importance. (You assign a service level to a server in the ServiceLevel column of the dbo.Servers table.) By default, Minion Enterprise handles three specific service levels:

  • Gold
  • Silver
  • Bronze

ME collects a lot of data, keeps it for a (configurable) amount of time, then deletes it.

Customizing – Do it!

MinionWare encourages you to create stored procedures and views as you find you need them. ME is meant to be customized.

We further recommend you create your own schemas to organize these custom objects. For example, if your company name is ABC, you might create the schemas ABCAlert and ABCReport.

As long as you do not modify existing Minion Enterprise objects, there should be no ill effects. And, Minion Enterprise upgrades will not remove or modify your custom objects and schemas.

Next time we’ll talk more about what you can do with Minion Enterprise. In the meantime, feel free to look around the online documentation!

How to use Minion Enterprise for the first time

Minion Enterprise lets you monitor and investigate your SQL Server environment for security and stability. It’s a big system, though, and it’s easy to get lost.

This is the first article in a multi-part series to guide you through your first weeks in Minion Enterprise. Topics we’ll cover include:

  • Getting around ME
  • Configure Error Log Search Terms
  • Set Roles, Applications, Application Owners, and/or Environments
  • Configure Alerting
  • Handle Email Alerts
  • Check High-Level Health
  • Check the Security of the servers
  • Research Index health
  • Research (and set alerts for) sp_configure settings
  • Get an inventory of clusters and Availability Groups

And of course, more. For today, let’s start with the philosophy of environment monitoring, and getting your servers registered in ME.

Monitor the environment

The database industry is familiar with performance monitoring, which tracks performance-related metrics. But the health of a server depends on so much more.

  • Are the disks filling up?
  • What service packs are installed?
  • Who has permissions to what servers and objects?
  • How well are the databases indexed?
  • Are backups and maintenance running well?
  • Have sp_configure settings been changed from the standard?

Environment monitoring covers as many aspects of database administration as possible, including:

  • Server Environment Data – Disk space utilization alerts, anyone? How about OS patch levels and service properties?
  • SQL Server Environment Data – Everything from sp_configure values to error log alerting.
  • Database Information – DB properties, files, scripts, space used, index stats, and more.
  • Maintenance and Backups – When did backups happen last? How about DBCC CheckDB?
  • Security and Encryption – All logins, users, Active Directory information (including group expansion!), role membership, and scripted permissions.
  • Replication Latency – We’ve got replication latency data for all subscriptions.
  • Availability Groups – ME gathers information on AG replicas, listeners, groups, status, and read-only routing lists.

First things first: Register servers in Minion Enterprise

You can install and configure Minion Enterprise in about five minutes. Just be sure your server meets the requirements outlined in the Quick Start Guide!

Minion Enterprise references the dbo.Servers table to determine what SQL Server instances it should be managing. So the first step after installation is to “register”, or insert, instances to that table!

Because this is a table, and we’re all DBAs, there are a number of ways you can insert servers to the table:

  • Import the list from a CSV or Excel spreadsheet.
  • Import the list from another table.
  • Enter instances using INSERT statements.
  • And so on.

Whichever way works for you, you’ll need to insert to dbo.Servers with the following information:

  • ServerName – of course.
  • Port – this must be NULL for the default port (1433), or the port number for a nonstandard port (anything other than 1433).
  • ServiceLevel – Determine whether your server ranks as “Gold”, “Silver”, or “Bronze”. This service level determines what level of service (e.g., how often collections and alerts are performed) each server receives.
  • IsSQL – This should be 1 for SQL Server instances, and 0 for non-SQL servers. Servers without SQL still get the benefit of drive space monitoring and more!
  • IsActive – This should be 1 for any server you want collections and alerts to run on, and 0 for any other server. I personally keep inactive servers in the list specifically just to keep a master list of servers!
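If your server list starts life as a CSV file, the rules above can be applied before anything reaches the table. The following Python sketch shows one way to do that. The column list comes from the bullets above (the real table may have more columns), while the `to_server_row` helper, the validation choices, and the server names are assumptions for illustration.

```python
import csv
import io

DEFAULT_PORT = 1433
SERVICE_LEVELS = {"Gold", "Silver", "Bronze"}

# Column list taken from the bullets above; the real table may have more columns.
INSERT_SQL = (
    "INSERT INTO dbo.Servers (ServerName, Port, ServiceLevel, IsSQL, IsActive) "
    "VALUES (?, ?, ?, ?, ?);"
)


def to_server_row(record: dict) -> tuple:
    """Convert one CSV record into a parameter tuple for INSERT_SQL."""
    name = record["ServerName"].strip()
    if not name:
        raise ValueError("ServerName is required")

    port_text = (record.get("Port") or "").strip()
    port = int(port_text) if port_text else None
    if port == DEFAULT_PORT:
        port = None  # the default port is stored as NULL

    level = record["ServiceLevel"].strip().capitalize()
    if level not in SERVICE_LEVELS:
        raise ValueError(f"Unknown service level for {name}: {level!r}")

    # Blank IsSQL / IsActive values fall back to 1 (an assumption of this sketch).
    is_sql = int(record.get("IsSQL") or 1)
    is_active = int(record.get("IsActive") or 1)
    return (name, port, level, is_sql, is_active)


# Hypothetical server list.
sample_csv = """ServerName,Port,ServiceLevel,IsSQL,IsActive
SQLPROD01,1433,gold,1,1
SQLPROD02,,Silver,1,1
SQLDEV01,50123,bronze,1,0
FILESRV01,,Bronze,0,1
"""

rows = [to_server_row(r) for r in csv.DictReader(io.StringIO(sample_csv))]
for row in rows:
    print(row)
```

The resulting tuples can be passed, together with INSERT_SQL, to the `executemany` method of a DB-API driver that uses `?` placeholders (pyodbc is one such driver).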

Next time we’ll talk more about what you can do with Minion Enterprise. In the meantime, feel free to look around the online documentation!

In Beta: Health Check for Minion Enterprise!

At the PASS Summit we were thrilled to announce a new feature: Health Check for Minion Enterprise!

Minion Enterprise already gathers a ton of information about your servers and SQL Server instances into a central repository, which is perfect for alerting, reporting and configuration.

Now, we’re adding standard audits and internal health checks, so you can assess your entire environment all at once!

We want feedback! Answer four questions to give us input for our new Health Check module, and we’ll contact you when the module is ready for beta testers!

“Audits are easy!” said no one before now.

Like everything in Minion Enterprise, Health Check is native SQL: run a stored procedure, and see the results in the log tables and the associated SSRS report.

For example, to run the CIS Microsoft SQL Server Benchmark, we run:

That’s rather unassuming, so let’s say it a bit louder: Your entire benchmark is now complete for every instance in your environment.

The final version of Health Check will include other external benchmarks and audits, and customizable internal audits.

Help shape the Health Check

This is where you come in: email us to get in on the beta and have your input shape the module!

Don’t get us wrong: we already have a list of metrics a mile long that we’re adding to ME Health Check. But we know we won’t think of absolutely everything. So…what does your shop need to check?

Now, back to the details…

Live in audit mode, automatically

Like everything else in Minion Enterprise, your data is stored in SQL Server tables. What’s the point of being a DBA, really, if we can’t get our hands on the data?

Of course, we’ve also made pretty reports for the Health Check results, so you can drill down into the data easily. Here’s an early version of the drill-down report:

What this all means is you can do anything you like with the Health Check feature. For example, you can:

  1. Schedule a health check to run every Sunday night.
  2. Create an alert to check the results for any failures, and email them to the DBA team.
  3. Schedule regular report emails for the team leads and managers.
  4. Never, ever be caught by surprise by an audit.

It’s already turning out to be one of our favorite modules ever, and we know you’re going to love it, too! Remember, email us to get in on the beta!

Protect yourself from Ransomware with Minion Enterprise

Ransomware is becoming a huge problem in the corporate world as more and more companies fall prey to this heinous act of terrorism. One of the biggest disasters that could befall you as a company – or even specifically as a DBA – is to come in one day and discover your shop has been taken over and held hostage.

“Ransomware is a type of malicious software … that threatens to publish the victim’s data or perpetually block access to it unless a ransom is paid.” – Wikipedia.org

There is no single step that can protect you from ransomware. Microsoft, and many others, have said for years that security is a layered process: never count on a single product, or a single avenue of protection. In this case they’re most definitely right…when ransomware slips past your antivirus software, you’re going to need another layer of protection to help contain it.

Today we’ll talk about an auditing layer, and how Minion Enterprise can help protect your shop from this vicious attack.

How ransomware works

Ransomware encrypts your files, then makes you pay to get them decrypted.  But it doesn’t just take your workstation. It also reaches out to any network locations you have mapped, or that you’re simply using, or even sometimes locations that remain in cache…you don’t even have to be actively connected to them!

This happened to me once.  Before ransomware was widely known, I visited a legitimate website, and my laptop became infected with ransomware. It then reached out and encrypted my OneDrive, as well as network locations I had connected to earlier that day.

Knowing this, we know that the first step is to lock down all of your server shares, and audit them to make sure permissions aren’t expanded again. For a home office like mine, this is fairly simple. But for a large shop with many servers, the task gets bigger and bigger. It can become impossible, because it’s a never-ending task.

The audit layer in two steps

To implement the audit layer, you must first perform discovery: Which servers have shares, and what are the permissions to those shares?

This is a huge question. It’s bad enough if you have 20 servers, but what if you have 200, or 700, or 1500? No one in IT has the time to go through all those servers by hand to gather shares and permissions. Even if they did have time, if you convinced management to make that project a priority and got it done and everyone’s happy that you’ve locked down the environment…well, one part of it anyway. Even if you did, what about next week? Or the week after that, or after that?

“What about after that?” is the second aspect of implementing the audit layer. You must not only find and fix those permissions issues on the shares, but you also need to regularly review to ensure that nothing has reverted.

You may have convinced the brass to make it a priority once, but nobody wants to make this one task a full-time job. There’s just too much else to do.

Automate the audit layer

This is where Minion Enterprise literally saves you. ME discovers all of your shares and their permissions, and it maintains a constant audit of permissions on all the servers in your shop.

Louder, for the back row: ME does discovery of shares and permissions, and maintains a constant audit.

You can easily set up an alert to let you know when permissions on important shares change. This gives you an unprecedented view into your environment, and increases your security exponentially.

The alerting mechanism is flexible, so you can alert on any security change to the shares. You can:

  • Alert when accounts get an access change
  • Alert when new accounts are added
  • Just alert on specific shares and specific accounts
  • Alert when specific permissions are granted (like giving Everyone read and write)

There’s nothing to install on any of the servers you’re managing, and the alerting is done from a central location.  This makes it effortless to change every aspect of your alert scenario.
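The alert conditions listed above all boil down to comparing the latest permissions snapshot with the previous one. The Python sketch below illustrates that comparison. It is not Minion Enterprise’s alerting code: the `ShareGrant` type, the `diff_share_permissions` helper, the finding names, and the sample servers and accounts are all hypothetical.

```python
from typing import NamedTuple


class ShareGrant(NamedTuple):
    server: str
    share: str
    account: str
    permission: str  # e.g. "Read", "Change", "Full Control"


WRITE_PERMISSIONS = {"Change", "Full Control"}


def diff_share_permissions(previous, current):
    """Compare two permission snapshots and return (kind, grant) findings."""
    prev = {(g.server, g.share, g.account): g.permission for g in previous}
    findings = []
    for g in current:
        key = (g.server, g.share, g.account)
        if key not in prev:
            findings.append(("new_account", g))
        elif prev[key] != g.permission:
            findings.append(("access_changed", g))
        # Flag a risky grant whenever it is present in the current snapshot.
        if g.account == "Everyone" and g.permission in WRITE_PERMISSIONS:
            findings.append(("everyone_can_write", g))
    return findings


# Hypothetical snapshots of one share, taken a day apart.
previous = [
    ShareGrant("FILESRV01", "Backups", "CORP\\dba_team", "Full Control"),
    ShareGrant("FILESRV01", "Backups", "CORP\\app_svc", "Read"),
]
current = [
    ShareGrant("FILESRV01", "Backups", "CORP\\dba_team", "Full Control"),
    ShareGrant("FILESRV01", "Backups", "CORP\\app_svc", "Change"),
    ShareGrant("FILESRV01", "Backups", "Everyone", "Change"),
]

findings = diff_share_permissions(previous, current)
for kind, grant in findings:
    print(kind, grant.server, grant.share, grant.account, grant.permission)
```

This sketch only looks at grants that exist in the current snapshot; reporting removed accounts would need a second pass over the previous one.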

Protect your shop today

You can’t guarantee that some computer in your environment won’t be infected with ransomware. But you can implement layered security. If ransomware does make it into your shop, ME will greatly help contain the spread.

And of course, that’s just one small part of the security features in Minion Enterprise.

Celebrate! Second Anniversary Giveaway (is now over)

Last year, we gave away free licenses of our SQL Server management software for the company birthday! Let’s do it again, right now.

Edit: The giveaway is now over. If you didn’t make it in time, you should still get a 30 day trial of Minion Enterprise, which will cover ALL your instances, and the free modules Minion Backup, Minion Reindex, and Minion CheckDB.

Take a look at the Introduction to Minion Enterprise webinar recording!

The Giveaway…

Two years ago, we officially became MinionWare and launched the absolutely masterful SQL Server management solution, Minion Enterprise. We have talked to literally hundreds of people at dozens of database events, meetings, webinars, conferences – you name it!  Even better, clients are raving about the software.

Our business model is still: Give away as much as you can.  Our world-class maintenance tools - Minion Backup and Minion Reindex - are still free, and this year we added Minion CheckDB!

Last year, we gave away Minion Enterprise. This year, I thought we’d do something different. We talked about it at length…then decided that the world needs more free Minion Enterprise.

Have Some Free Enterprise-Level Software

From now until 5:00pm (Central Time) on July 20, 2017, anyone who enters receives three free Minion Enterprise licenses.

We’ve been thrilled with the reception and feedback we’ve gotten this year. So, let’s do this one more time. Maybe next year we’ll write thank you cards, instead.

Giveaway Rules

Of course, there are a couple of caveats, so see the restrictions below:

  1. Sign up before 5:00pm Central Time on July 20.  It’s a good idea to just do it now, because after July 20, the offer is over.
  2. Current version only.  Free licenses are eligible for patches and service releases of the current version, but not upgrades.
  3. Includes 3 months of support. After that, a support contract will need to be purchased, if you want continued support.
  4. Additional licenses are available for purchase.
  5. Licenses are not transferable to any other companies.
  6. Sorry, no repeat giveaways! If you or your company have won free licenses from us before, you can’t do it again.

Enter Here

Fill out this form for your free software!

Databases need more than performance monitoring

There is more to databases than performance monitoring. So why are the most popular DBA tools, performance monitoring tools? They don’t even begin to cover the vast majority of DBA responsibilities. What we need is environment monitoring.

You’re gonna need more info

When you read a job spec for “database administrator”, it does not simply say:

  • Performance Monitoring

It says something like:

  • Database Performance Tuning
  • Database Security
  • Promoting Process Improvement
  • Problem Solving
  • Presenting Technical Information
  • Database Management
  • Data Maintenance
  • Operating Systems

(List taken in part from Hiring.Monster.com.)

So why are the most popular DBA tools, performance monitoring tools? Those are great, sure, but they don’t even begin to cover the vast majority of DBA responsibilities. What we need is environment monitoring.

Environment Monitoring

I don’t have a website I can link to, to give you a definition of SQL environment monitoring. That’s because we’ve been defining it ourselves, for the past nine years.

An environment monitor is a system that allows administrators to examine the overall and specific health of database instances.

An environment monitor should touch on performance, yes. It should also:

  • Manage and monitor security
  • Speed audits
  • Make a majority of common DBA tasks effortless
  • Collect and present as much system information as possible, including service packs, disk space, errors, and more
  • And more
  • And also, lots more

Minion Enterprise is far more than glorified maintenance

This is exactly what we built Minion Enterprise for: to monitor and manage the environment. To take away the Server-By-Agonizing-Server aspect of administration by introducing the “set based enterprise” approach. To automate everything that can be automated, and to make data available to the DBA on everything else.

Get a trial and a demo, and you’ll see exactly what we mean. Monitor your environment, not just your performance.

Five minutes to freedom: Installing Minion Enterprise

Let’s take five minutes and get our entire enterprise in order.

Get your repo server, your Minion Enterprise download, and your license key (trial or permanent). Install and config take just five minutes.

1. Install

Extract the MinionEnterprise2.3Setup.exe and run it on your repository server.  Give the installation “localhost” for the instance name.

2. Install the license key

Once you receive your license key – trial or permanent – from MinionWare:

  1. Copy License.txt to C:\MinionByMidnightDBA\Collector on the repository server.
  2. Rename the file to License.dll
  3. From PowerShell, run the command:
    C:\MinionByMidnightDBA\Collector\License.exe Install

3. Configure email for alerts

Connect to the repo server and insert your alert email:

4. Configure servers

Insert (or bulk insert) servers into the repo:

5. You’re done!

Jobs will begin kicking off to collect data within the next hour. Some jobs run hourly, some run daily or weekly or monthly. You’ll start getting tables full of useful data, and email alerts that actually mean something.

If you’re impatient to start getting some of the good stuff right away, kick off the CollectorServerInfo% jobs manually. These populate data in the dbo.Servers table, which other jobs need to run. You’ll start noticing data in the Collector.DBProperties, Collector.ServiceProperties, and Collector.DriveSpace tables first, as these jobs run most frequently.

While ME is taking care of your shop for you, you can get to know it better with some of our better resources:

Download our free maintenance modules:

  • Minion Backup: Free SQL Server Backup utility with world-class enterprise features and full lifecycle management.
  • Minion Reindex: Free SQL Server Reindex utility with unmatched configurability.
  • Minion CheckDB: Free SQL Server CHECKDB that solves all of your biggest CHECKDB problems.

Proper Backup Alerting

When I saw the topic for T-SQL Tuesday this time, I just had to get in. Maybe I’ve never mentioned it, but backups are one of my big things.  Today I’d like to talk about two topics that get overlooked quite often, the “backups” to the backup, so to speak. First up: proper backup alerting.  And second, missing backup recovery.

Traditional alerting falls short

Well, let’s begin with a story from my days as senior DBA. Years ago, one of the application groups messed something up in their database, and they needed a restore.  “Sure thing,” I said.  No problem.  So I went to the backup drive, and there wasn’t anything that could even be vaguely considered a fresh backup. The last backup file on the drive was from about three months ago.

OOPS.  Oh crap…so what do I tell the app team?

First, a little investigation.  I had to find out why the backup alert didn’t kick off. Every box was set up to alert us when a backup job failed.  I found the problem right away.  The SQL Agent was turned off.  And from the looks of things, it had been turned off for quite some time.  And as you may realize, there’s just no way to alert on missing backups if the Agent is off and can’t fire the alert.

But that was just the first part of the problem.  The SQL Agent couldn’t send the email, of course. But the job never actually failed, because it didn’t start in the first place.

This is the crux of the issue: jobs that don’t start, can’t fail.  Alerting on failed backup jobs isn’t the way to go.

“But it’s okay, we have…”

Hold on, I know what you’re thinking. You have service alerts through some other monitoring tool, so that could never happen to you! To a degree, you’re right. But let’s see what else can go wrong along those same lines:

  • The database in question isn’t included in the backup job.
  • The network monitor agent was turned off, or not deployed to that server.
  • SMTP on the server has stopped working.
  • The backup job has the actual backup step commented out.
  • Someone deleted that backup job.
  • Someone disabled the backup job, or just disabled the job’s schedule.

Service alerts won’t help you in any of these circumstances.

Proper backup alerting

I’ve run into every one of those scenarios, many times.  And there are only two ways to mitigate every one of them (and any other situation you come across) with proper backup alerting.

Number one: Move to a centralized alerting system.  You can’t put alerts on each of your servers. When you do that, you’re at the mercy of the conditions on that box, and those conditions can be whimsical at best.

Move the backup alerts from the server level to the enterprise level.  Then, when there’s an issue with SMTP or something, you only have one place to check. It’s much easier to keep track of whether an enterprise-level alerting system isn’t working than to keep track of dozens, hundreds, or even thousands of servers.  After all, if you haven’t heard from a server in a long time, how do you know whether it’s because there’s nothing to hear, or if the alerting mechanism is down?

Number two: Stop alerting on failed backups.  Alert on missing backups.  When you alert on missing backups, it doesn’t matter if the job didn’t kick off, if the database wasn’t part of the job, or if the job was deleted.  The only thing that matters is that it’s been 24 hours since your last backup. Then when you get the alert, you can look into what the problem is.

The important point is that the backup may or may not have failed, but your enterprise alert will fire no matter what.  This is a very effective method for alerting on backups, because it’s incredibly resilient to all types of issues…not only in the backups, but also in the alerting process.  If you do it right, it’s just about foolproof.
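The difference between the two approaches is easy to see in code. A missing-backup check never asks whether a job ran or failed; it only asks how old the newest backup is. The Python sketch below illustrates that check under simple assumptions: the `find_missing_backups` helper and the sample inventory are hypothetical, and in practice the last-backup times would be collected centrally (for example from each instance’s msdb backup history) rather than typed in.

```python
from datetime import datetime, timedelta


def find_missing_backups(last_backups, now, max_age=timedelta(hours=24)):
    """Return (server, database, last_backup) for each database whose newest
    backup is older than max_age, or that has never been backed up."""
    missing = []
    for server, database in sorted(last_backups):
        last_backup = last_backups[(server, database)]
        if last_backup is None or now - last_backup > max_age:
            missing.append((server, database, last_backup))
    return missing


# Hypothetical central inventory: newest backup per (server, database).
now = datetime(2017, 3, 14, 8, 0)
last_backups = {
    ("SQLPROD01", "Sales"): datetime(2017, 3, 14, 0, 15),   # backed up last night
    ("SQLPROD01", "Billing"): datetime(2017, 3, 12, 0, 10), # job silently stopped running
    ("SQLPROD02", "Archive"): None,                          # never included in any job
}

missing = find_missing_backups(last_backups, now)
for server, database, last_backup in missing:
    print(server, database, last_backup or "never backed up")
```

Notice that the Agent being off, a deleted job, and a database left out of the job all produce the same result here: the database shows up in the list.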

Part 2: Missing Backups

Handling missed backups is not the same as alerting on missing backups (like we talked about above). What we want to do is avoid the need for the alert to begin with.

Minion Backup (which is free, so we get to talk about it all we want, ha!) includes a feature called “Missing Backups”, which allows you to run any backups that failed during the last run.  

Here’s what this looks like:  You set your backups to run at midnight, and they’re usually done by around 2:00 AM.  However, occasionally they fail for one reason or another. Then you get an alert in the middle of the night, and you have to get up to deal with it.

Missing Backups lets you set Minion Backup to run again at, say, 2:30 or 3:00 AM with the @Include = ‘Missing’ parameter.  This will look at the last run and see if there were any backups that failed; if there were, then MB will retry them.  This will prevent the need for alerts in the first place.

We use this feature in many shops we consult in because we see databases that fail from time to time for weird reasons, but they always pass the second time.  So Minion Backup helps improve your backups simply by giving you a second chance at your backups.
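The second-chance idea can be sketched in a few lines of Python. This is only a model of the concept, not Minion Backup’s logic (the real feature is the @Include = ‘Missing’ parameter described above). The `run_backups` and `select_missing` helpers, the status strings, and the deliberately flaky backup function are all invented for the example.

```python
def run_backups(databases, backup_fn):
    """Run backup_fn for each database and record a status per database."""
    results = {}
    for db in databases:
        try:
            backup_fn(db)
            results[db] = "Complete"
        except Exception as exc:
            results[db] = f"Failed: {exc}"
    return results


def select_missing(last_run):
    """Return the databases from the previous run that did not complete."""
    return [db for db, status in last_run.items() if status != "Complete"]


# A stand-in backup function that fails once for one database, then succeeds.
attempts = {}


def flaky_backup(db):
    attempts[db] = attempts.get(db, 0) + 1
    if db == "Billing" and attempts[db] == 1:
        raise RuntimeError("network path not found")


midnight_run = run_backups(["Sales", "Billing", "HR"], flaky_backup)
retry_run = run_backups(select_missing(midnight_run), flaky_backup)

print(midnight_run)
print(retry_run)
```

Only the database that failed at midnight is attempted again, so the retry pass is cheap when everything went well.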

Now we mention Minion Enterprise

We’ve got you covered for enterprise-level alerting, too.  Our flagship product, Minion Enterprise, was made for just that purpose and it comes with many enterprise-level features; not just backup alerting.  I invite you to take a look at it if you like.

But if you don’t, then by all means, write yourself an enterprise-level alerting system and stop relying on alerts that only fire on failed backups.

And, improve your situation in general by switching to the free Minion Backup.

What Really Causes Performance Problems?

Every IT shop has its problems with performance: some localized, and some that span a server, or even multiple servers. Technologists tend to treat these problems as isolated incidents – solving one, then another, and then another. This happens especially when a problem is recurring but intermittent. When a slowdown or error happens every so often, it’s far too easy to lose the big picture.

Some shops suffer from these issues for years without ever getting to the bottom of it all. So, how can you determine what really causes performance problems?

First, a story

A developer in your shop creates an SSIS package to move data from one server to another. He makes the decision to pull the data from production using SELECT * FROM dbo.CustomerOrders.  This works just fine in his development environment, and it works fine in QA, and it works fine when he pushes it into production.  The package runs on an hourly schedule, and all is well.

What he doesn’t realize is that there’s a VARCHAR(MAX) column in that table that holds 2GB of data in almost every row…in production.

Things run just fine for a couple months.  Then without warning, one day things in production start to slow down.  It’s subtle at first, but then it gets worse and worse.  The team opens a downtime bridge, and a dozen IT guys get on to look at the problem.  And they find it!  An important query is getting the wrong execution plan from time to time.  They naturally conclude that they need to manage statistics, or put in a plan guide, or whatever other avenue they decide to take to solve the problem.  All is well again.

A couple of days later, it happens again.  And then again and then again.  Then it stops.  And a couple weeks later they start seeing a lot of blocking.  They put together another bridge, and diagnose and fix the issue.  Then they start seeing performance issues on another server that’s completely unrelated to that production server.  There’s another bridge line, and another run through the whole process.

What’s missing here?

The team has been finding and fixing individual problems, but they haven’t gotten to the root of the issue: the SSIS package data pull is very expensive.  It ran fine for a while, but once the data grew (or more processes or more users came onto the server), the system was no longer able to keep up with demand.  The symptoms manifested differently every time.  While they’re busy blaming conditions on the server, or blaming the way the app was written, the real cause of the issues is that original data pull.

Now multiply this situation times several dozen, and you’ll get a true representation of what happens in IT shops all over the world, all the time.

What nobody saw is that the original developer should never have had access to pull that much data from production to begin with.  He didn’t need to pull all of the columns in that table, especially the VARCHAR(MAX) column.  By giving him access to prod – by not limiting his data access in any way – they allowed this situation to occur.
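To make the cost of that SELECT * concrete, here is a small sketch that uses an in-memory SQLite database as a stand-in for SQL Server. The column names (`OrderID`, `CustomerID`, `Notes`) and the row sizes are assumptions for illustration; only the table name comes from the story.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CustomerOrders (OrderID INTEGER, CustomerID INTEGER, Notes TEXT)"
)

# Each row carries a large text payload, standing in for the VARCHAR(MAX) column.
big_note = "x" * 100_000
conn.executemany(
    "INSERT INTO CustomerOrders VALUES (?, ?, ?)",
    [(i, i % 10, big_note) for i in range(50)],
)


def payload_size(query):
    """Approximate the size of a result set, in characters."""
    total = 0
    for row in conn.execute(query):
        total += sum(len(str(value)) for value in row)
    return total


everything = payload_size("SELECT * FROM CustomerOrders")
needed = payload_size("SELECT OrderID, CustomerID FROM CustomerOrders")
print(everything, needed)
```

Naming only the columns the package needs leaves the large column on the server, so the hourly pull stays small even as the table grows.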

What Really Causes Performance Problems?

Just as too many cooks spoil the broth, too many people with access to production will cause instability. Instability is probably the biggest performance killer. But IT shops are now in the habit of letting almost anyone make changes as needed, and then treating the resulting chaos one CPU spike at a time.

This is why performance issues go undiagnosed in so many shops.  The people in the trenches need the ability to stand back and see the real root cause of issues past the singular event they’re in the middle of, and it’s not an easy skill to develop.  It takes a lot of experience and it takes wisdom, and not everyone has both.  So, these issues can be very difficult to ferret out.

Even when someone does have this experience, they’re likely only one person in a company of others who aren’t able to make the leap.  Management quite often doesn’t understand enough about IT to see how these issues can build on each other and cause problems, so they’ll often refuse to make the necessary changes to policy.

So really, the problem is environmental, from a people point of view:

  • Too many people in production makes for an unstable shop.
  • It takes someone with vision to see that this is the problem, as opposed to troubleshooting the symptoms.
  • Most of the time, they’re overridden by others who only see the one issue.

What’s the ultimate solution?

In short: seriously limit the access people have in production. It’s absolutely critical to keep your production environments free from extra processes.

Security is one of those areas that must be constantly managed and audited, because it’s quite easy to inadvertently escalate permissions without realizing it.  This is where Minion Enterprise comes in: I spent 20 years in different shops, working out the best way to manage these permissions, and even harder, working out how to make sure permissions didn’t get out of control.

Minion Enterprise gives you a complete view of your entire shop to make it effortless to audit management conditions on all your servers.

That’s the difference between performance monitoring and management monitoring.  The entire industry thinks of performance as a single event, when in reality, performance is multi-layered.  It’s made up of many events: management-level events where important decisions have been ignored or pushed aside.  And these decisions build on each other.  One bad decision – giving developers full access to production – can have drastic consequences that nobody will realize for a long time.

Sign up for your trial of Minion Enterprise today.

MinionWare