Wednesday, September 28, 2011

Going with the (net)Flow

If you are using SolarWinds' NetFlow Traffic Analyzer (NTA) module, then you might have run into confusion over all the different settings.

Most people keep them at the defaults, but if you are experiencing performance hits, you will want to see where a tweak here or there might be beneficial.

The problem is (and no disrespect meant to the hard-working tech writers at SolarWinds), the options don't make a whole lot of sense at first blush.

What appears below are my notes after about 20 (no exaggeration) email exchanges with tech support to nail down what each of the options means. It also includes what SolarWinds is doing behind the scenes with your data.

To see these options, log in to the regular website as an administrator, go to Settings (upper-right corner), then "NTA Settings".

“Compress Data” is talking about rolling up the data - averaging the detailed statistics into hourly values. So the options that apply to this are:

(“Keep Uncompressed Data for...”) 
How long NPM should keep the minute-by-minute data from each data source (the default is one hour). During this time, a new table is created for each netflow source every 15 minutes. If set to 60 minutes, you get 4 tables per netflow source. If you had 1,000 netflow sources, that would be 4,000 tables.

This can be bumped up to 240 minutes, but doing so will create proportionally more tables (see the back-of-envelope sketch after this list).
  • Once the time limit (again, 60 min is the default) is reached, all those detailed values in all those tables are calculated into a 15-minute average. This becomes the database table NetflowSummary1.
  • Every 24 hours (this is NOT tunable), the 15-minute data is compressed (averaged) into hourly data. This becomes the table NetflowSummary2.
  • After 3 days (again, not tunable), the hourly data is compressed into a daily average, which is moved to the table NetflowSummary3.
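To put real numbers on that table sprawl, here's the back-of-envelope math, written as plain T-SQL arithmetic (the variable names are mine for illustration - nothing here is part of the Orion schema):

    -- Rough math: detail tables in play at any given moment.
    -- NTA creates one new table per netflow source every 15 minutes and
    -- keeps them for the "Keep Uncompressed Data for..." window.
    DECLARE @RetentionMinutes int = 60;    -- the default; tunable up to 240
    DECLARE @NetflowSources   int = 1000;
    SELECT (@RetentionMinutes / 15) * @NetflowSources AS DetailTablesInPlay;  -- = 4,000

Bump the retention to 240 minutes and that same 1,000-source install is juggling 16,000 tables, which is presumably why the default is as low as it is.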
“Keep Compressed data for”
The daily averages are held for 30 days (they can be kept longer), after which they are deleted.

“Delete expired flow data”
The expired data (i.e., older than 30 days or whatever you set) is deleted however often you indicate in this setting. "Once a day" is the default.
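If you're curious what that cleanup actually amounts to, it boils down to something like the query below. This is strictly a sketch - the real job is internal to NTA, and the date column name is my invention, not the actual schema:

    -- Illustrative only: "FlowDate" is an assumed column name, not NTA's real schema.
    -- Deletes daily averages older than the "Keep Compressed data for" window (30 days).
    DELETE FROM NetflowSummary3
    WHERE FlowDate < DATEADD(day, -30, GETDATE());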

“Compress database and log files”
is a shrink operation. As in, it tells the MS-SQL server to shrink the database and log files. Nothing more exciting than that.
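If you ever want to kick that off by hand, it's the stock SQL Server shrink command. I'm assuming the default Orion database name of NetPerfMon here - substitute your own if you renamed it:

    -- Manual equivalent of the scheduled shrink.
    -- "NetPerfMon" is the default Orion database name; yours may differ.
    DBCC SHRINKDATABASE (NetPerfMon);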

“Enable aggregation of Top Talker data” 
This uses memory on the primary poller to store a certain amount of NetFlow statistics. The web server (either locally, or via port 17777 if you have an additional web server) pulls the statistics from RAM rather than making a distinct query to the DB server. This improves the overall load times of the NTA webpages (especially top talkers, top conversations, top applications) and has the secondary effect of reducing load on the database server. Of course, any of that is only true if you have a lot of people hitting the NTA pages all the time.

Tuesday, September 27, 2011

Yes yes yes! 100%. Like. Plus. Bump. Retweet. Buzz. Digg. Stumble. Forward.


Could someone please forward this to my Mother-in-law?
Thanks.

Tuesday, September 13, 2011

Self-Reliance

There's disaster recovery, and then there's how you recover from a disaster.

No, I'm not talking about Irene. I'm talking about the perfect storm of travel, customer visits, and a crashing hard disk.

It's a familiar story. I mean, hard drives gotta die sometime. That's what the MTBF (mean time between failures) rating IS. And since I use my laptop (yes, the big one) almost constantly, it was really due to happen any time now.

So when I booted up Ubuntu and it asked me to perform a lengthy fsck routine twice in a row, I knew it was time to take action.

Step 1: Back up all the data I could, to whatever I had handy. Luckily, I carry a SanDisk Cruzer 16GB flash drive, so I could back up A LOT of my immediately important stuff. I had also backed up my laptop before I left, so I knew I wasn't completely sunk, just slowed down.

Step 2: Get a new drive. No problem, that's why God gave us Fry's.

Step 3: Transfer the data from the old drive to the new one. I mean, that's the simple part, right?!? You just hook it up to a hard drive replicator (a technology that's been around for years, making cloning and other techniques obsolete) and in an hour or two you are good to go.

Right? RIGHT?!?

Apparently not.

My first stop - the internal desktop support folks at my company - was a 3-hour odyssey of getting first Ghost and then "some other program I haven't used much" to run on an old Dell 386 with hand-spliced cables shooting out the front. While I'm sure that setup does work, it didn't like my Ubuntu drive and helpfully failed at the end of the 3-hour copy attempt.

Having given the home-team the chance to prove itself, I went to the experts - those wizards at Fry's - where I was certain they'd be able to get me back on my feet while I leisurely browsed their aisles.

Uh... no. First, I was informed in a condescending tone that what I wanted was called "ghosting" ("Yes," I thought while maintaining a rigid smile. "I remember Symantec Ghost. I also remember Norton Ghost. I also remember PartitionMagic. I also remember using a LapLink cable to provision an entire training room. And I'm also certain that what I want is a clone of my hard drive. But who am I to quibble?")

Second, I was informed that they weren't certain Linux would work correctly if the old drive had bad sectors. ("Weeeeellll, if the drive runs NOW, I am fairly certain it will run after copying it to the new hard disk. I mean, it's not going to DAMAGE the sectors on the new drive, right?")

Third, this was going to cost me $70. Fine.

Finally, it would take 2-3 days.

Okay. Buh Bye.

Taking my leave of the lack-of-service counter, I decided to see if wandering the aisles offered any inspiration. Plus, walking around Fry's always makes me feel better. It just does.

I knew that my laptop had two drive bays, so if I could score some drive rails and a flat SATA cable (as described here) I might be able to set up a RAID 1 mirror and just replicate the whole darn thing.

Short story long, they didn't have either the rails or the SATA cable. What they DID have was a $20 SATA-to-USB connector. Now I could connect both drives, but how to get my whole OS over to the new disk? I didn't want to spend the rest of the night reinstalling all my stuff (not that I had the install disks with me in the first place).

In researching RAID options, I stumbled upon CloneZilla. A quick CD-burn later, and I was booting into a beautifully Linux-esque system that would let me copy my data from the old drive (now connected via the SATA-to-USB cable) to the new one (safely ensconced inside the laptop). The first copy attempt - using default settings - ran for just 5 minutes, but didn't work (too many disk read errors). But the second attempt - which included a pre-copy fsck and was a RAW (bit-for-bit, no matter what) copy - was a complete success.

It took 9 hours to run, but I was able to catch some z's during that time and awoke to a laptop that was actually usable and didn't leave my heart palpitating.

Tuesday, September 06, 2011

SolarWinds: Giving rights to NCM without giving away the farm

This is an enhancement to a thread that originally started on thwack:

Since NPM 10.1.x, everyone has enjoyed the ability to use AD groups rather than individual user accounts. Yay for NPM. But now we have to somehow validate all these "new" users in NCM. Users who might not even have logged in yet, because you added an AD GROUP rather than a single account.
  1. To do that, in NPM you have to give the group (or account) the "View Customization" right, which ain't gonna happen, because then all your users can change anything about any screen anywhere.
  2. Not to mention that NCM doesn't allow you to add AD Groups, so you have to:
    1. Add user accounts individually to the NCM system
    2. OR stick with generic NCM roles and map them for each user in NPM
While I'm hopeful that the next version of NCM (rumor has it that it will be version 7.0, due out by the end of 2011) will have some improvements to this, we've found a work-around.

This assumes you've set up the generic roles (webviewer, engineer, etc) on the NCM server.

  1. Log onto your SolarWinds website with an account that has “Change View” permissions
  2. Go to the "Config" tab and make sure you set the credentials to use the account “Webviewer” (with whatever password you gave it in the NCM Console)
  3. Open an RDP session to your NPM server
  4. Start the Solarwinds Orion Database Manager utility
  5. Find the table “WebUserSettings”, right-click it, and choose "Query"
  6. Run the query: “Select * from WebUserSettings where SettingName like '%cirrus%' and AccountID like '%%'” (you can put the account name from step 1 between the percent signs to narrow the results)
  7. Make sure the AccountID returned is the one you used in step 1 above
  8. Click the read-write radio button and hit “refresh”
  9.  Change the AccountID for the 3 settings (CirrusIsFirstTime, CirrusISPassword, CirrusISUserName) to use the user account, in the form:
    DOMAIN\username
    ...or...
    DOMAIN\GroupID
Repeat this step for each additional user or group: go back to the SolarWinds website and hit refresh (you will see that you have to re-enter your credentials), then go back to the RDP session, hit refresh, and rename the account again.
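If you'd rather script that last rename than edit cells one at a time, the equivalent query looks something like this. It's a sketch based on the steps above - the two account values are placeholders, so double-check them against your environment before running anything:

    -- Re-point the three Cirrus settings at the target user or group.
    -- 'DOMAIN\username' is whoever you are granting access (a group works too);
    -- 'OldAccount' is the account you logged in with in step 1.
    UPDATE WebUserSettings
    SET AccountID = 'DOMAIN\username'
    WHERE AccountID = 'OldAccount'
      AND SettingName IN ('CirrusIsFirstTime', 'CirrusISPassword', 'CirrusISUserName');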