Leon's Log: 05/01/2011

Tuesday, May 17, 2011

Yes, it's big. Get over it.

A couple of years ago, I had the opportunity to buy any laptop I wanted (within reason.) My thought was that I wanted something larger than a 15.x" screen, so that put me in the 17.x" camp.

I work with a great VAR so I just bounced ideas off him, along with a few models I had found.

He responded that he had a laptop that had 1Gb more than my choice, and 100Gb more disk. It was only $100 more. Was I interested? Of course I was.

It was, he told me, a little bigger than 17" as well.

The Toshiba Qosmio G55-804 is 18.4" to be exact. Big enough that you can't fit it into most standard laptop bags. But it still fits under the seat when you are taking a flight.

I actually like this laptop a lot. It's got a great screen, good keyboard and it runs well under Ubuntu.

What I *don't* like about it is that everyone feels obligated to comment on it. There are guys at work who have - I'm not making this up - said something about it every time they have seen it. For over a year now.

It's a laptop, people. It's just a laptop.

Wednesday, May 11, 2011

I posted this over on thwack.com , the support/advice/therapy forum for SolarWinds users (you can read it in situ here). But I thought it was worth reposting here since this is, after all, my technical blog.

This is the first in a series of posts where, in the name of giving back to the community, I’m going to share some of the customizations that make SolarWinds a little more robust for us and our customers.

First, a little background about my company and how we use SolarWinds. Sentinel is an IT solutions provider that focuses on communications technologies, Data Center, and Outsourced / Managed Solutions.

One of our key services (and the thing that lets me put food on the table) is a remote monitoring solution (based on SolarWinds, of course). All we have to do is drop a VPN router onto the customer’s premises and set up NAT’s for the devices they want (read “pay us”) to monitor, and we’re good to go. This is a perfect fit for our customer base, where they don’t want to divert resources for the ongoing investment in staff, software, and skills to set up an enterprise-wide monitoring and management solution (not to mention figuring out who’s going to handle all those pesky tickets).

So our model – where we have many independent customers with different sets of values, different monitoring requirements and so on has driven us to come up with some customizations that focus on:
•    How to stop alerting on various devices (because of pilot projects, new customer onboarding, or maintenance windows) while continuing to collect statistics
•    How to set thresholds for devices when that could be different on nearly a device-by-device basis
•    How to ignore alerts based on the built-in monitors for CPU/RAM, etc on older or closed-architecture devices where a custom OID gave better data

This post is going to look at our solution for the first bullet – how to stop alerting but continue to collect statistics.

Of course, we all know that SolarWinds has the “unmanage” feature. This is a nifty little function that even has a scheduler associated with it, and can handle one-time or recurring events.

But our problem was that in some cases we needed to continue to collect the statistics even during the window where alerts would be a problem. For example, when a circuit goes down, our Network Operations Center (NOC) staff contact the customer’s carrier and act as the point of contact for testing and resolution. During that time, *we* want to know the status of the WAN circuit, but we don’t want additional alerts (read “tickets, where we have an SLA that carries $$$ penalties if we fail to acknowledge and close”). Unmanage would certainly turn off the alerts, but we’d have no way of knowing what SolarWinds thought about the circuit status until we managed the interface again and – you guessed it – potentially cut another ticket.

So we developed the “MUTE” field. The logic is very simple:
1.    Set up a custom property (a yes/no field) labeled "mute"
2.    for specific nodes, set that property to "yes"
3.    Within your alerts, make sure one of your logic checks is something like "MUTE is not equal to YES"

That’s the basic idea. But here at Sentinel we’ve made it a bit more granular. The following mute fields are in place:

•    n_mute - node mute. This is an overall mute. All alerts should check for node-mute, and if it is set to "yes", the alert should be ignored.
•    i_mute - interface mute. This is, as the name implies, used in any interface-related alert
•    v_mute - volume mute. Again, the name should be a good clue to the usage. Very valuable when you have disks that are always at the edge of being full, but (for whatever reason) you don't care.
•    APM_mute - This mute option is very useful when you are bringing new applications online and want to pilot them, but you still need to get hardware alerts (CPU, RAM, etc).

The logic for any alert then looks like this:

Where ALL of the following are true
N_MUTE is not equal to YES

For an interface alert, the logic would simply include two lines:

Where ALL of the following are true
N_MUTE is not equal to YES
I_MUTE is not equal to YES

Along with the MUTE fields, there are associated DESCRIPTION (n_mute_desc, i_mute_desc, etc) fields. That way we can add comments about when and why the element was muted.

As long as everything stays nice and standard, views and reports can be designed that let you know which elements are muted and why.

We’ve developed a standard set of terms for use in these description fields so that, for example, we can create a view that shows all the muted nodes – so that we can know when a device has been muted for too long - but ignores ones that are purposely muted forever based on customer requirements.
IN THE NEXT POST: How to easily set per-device thresholds.

Leon Adato is a monitoring engineer at Sentinel Technologies. Sentinel is an independent technology company providing integrated, customized IT solutions including remote systems Monitoring and Management. Find out more at http://www.sentinel.com/

Wednesday, May 04, 2011

Required Reading if you ever want to email me

Seriously. Read this.

Come to think of it, I should probably print this and hand it out to certain members of my family, and stand over them to make sure they read it. And then show them the clue-by-four I'll smack them with if they violate the rules set therein.

http://lifehacker.com/#!5798413/how-to-identify-and-avoid-spreading-misinformation-myths-and-urban-legends-on-the-internet

How to Identify and Avoid Spreading Misinformation, Myths, and Urban Legends on the Internet

Adam Pash — Even if you've never embarrassed yourself by unknowingly spreading an urban legend as fact to friends and family or, say, retweeting a fake quote by Martin Luther King, Jr after Osama bin Laden's death, you've at least been on the receiving end of one of these misinformed messages. Next time an email, tweet, or link seems a little fishy, here's how to spot it before your itchy trigger finger sends it to all your friends or followers. (Send this one to your forward- or retweet-happy family and friends.)

A little timely backstory: Osama bin Laden's death resulted in millions of bin Laden-related tweets every hour on Twitter. Thousands of those related tweets included a nice quote attributed to Martin Luther King, Jr that, unfortunately, had never been uttered by King. The story of how an innocent Facebook update turned into a widespread fake quotation is an interesting read, but more importantly, for those of us who prefer to avoid internet egg on our faces: How do you identify and avoid spreading misinformation, myths, and urban legends on the internet?

Click here to read the rest of the article.