SNMP Behavior Wishlist

In my few years of working in a NOC I've seen a lot of stupid behavior from devices when it comes to monitoring and notifications. Here's my current list of gripes.

I wish your app, monitoring product and/or device would take the following list into consideration when you decide to add SNMP Notification (trap) support.

  1. You probably don't need more than one custom trap. A few different traps for different types of notifications is fine. 30,000 different traps that all have identical structure is not ok. (I'm looking at you Symantec I3)
  2. Use a different OID for each piece of information. Please don't make us use regex to pull out the data.
  3. If you're trapping about a problem, trap about when the problem is resolved.
  4. Cold/warm start, link up, link down and most of the other basic RFC traps that are well defined are more than welcome.
  5. If you're going to tell me the status changed of something, tell me what it is! (I'm looking at you Arista Networks.)
  6. If you're going to tell me the status changed of something in particular, tell me how it changed!
  7. Nobody ever got angry because you provided too much data. If I have to poll your device to find out what happened you're doing it wrong.
  8. If there is extra data you might think is useful (Say the label on the port when you trap the port had a link down event.) please provide it. You can still do this and conform to the RFC spec.
  9. SNMP V2c support is fine, nobody cares about V3. Feel free to support it but don't assume people will use it by default and don't make people jump through hoops to disable it.
  10. Don't hold internal state of the "Alert" you're trapping about. If I have to manually log into your system and clear or acknowledge your alert before you'll tell me it happened again you're doing it wrong. (I'm looking at you RAD Data Comm.)
  11. If you're reporting on state changes, tell me every state change every time! Nobody ever complained about too much data.
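To make items 2, 5, 6 and 8 concrete, here's a rough sketch of what a well-behaved link trap could carry. It's plain Python with no SNMP library involved; the `Trap` class and the `link_trap` and `describe` helpers are names I made up for illustration. The OIDs are the standard ones from SNMPv2-MIB and IF-MIB.

```python
from dataclasses import dataclass, field

# Standard notification and column OIDs (SNMPv2-MIB, IF-MIB).
LINK_DOWN = "1.3.6.1.6.3.1.1.5.3"
LINK_UP = "1.3.6.1.6.3.1.1.5.4"
IF_INDEX = "1.3.6.1.2.1.2.2.1.1"
IF_ADMIN_STATUS = "1.3.6.1.2.1.2.2.1.7"
IF_OPER_STATUS = "1.3.6.1.2.1.2.2.1.8"
IF_ALIAS = "1.3.6.1.2.1.31.1.1.1.18"  # the label on the port


@dataclass
class Trap:
    """Hypothetical in-memory stand-in for a received trap."""
    trap_oid: str
    varbinds: dict = field(default_factory=dict)


def link_trap(if_index, is_up, label):
    """Build a link up/down trap with one varbind per piece of information."""
    suffix = f".{if_index}"
    return Trap(
        trap_oid=LINK_UP if is_up else LINK_DOWN,
        varbinds={
            IF_INDEX + suffix: if_index,                 # what changed
            IF_ADMIN_STATUS + suffix: 1,                 # 1 = up (configured)
            IF_OPER_STATUS + suffix: 1 if is_up else 2,  # how: 1 = up, 2 = down
            IF_ALIAS + suffix: label,                    # extra data, still legal
        },
    )


def describe(trap):
    """Read the trap by OID. No regex needed."""
    by_column = {oid.rsplit(".", 1)[0]: value for oid, value in trap.varbinds.items()}
    state = {1: "up", 2: "down"}[by_column[IF_OPER_STATUS]]
    return f"port {by_column[IF_INDEX]} ({by_column[IF_ALIAS]}) is {state}"


print(describe(link_trap(8, False, "uplink to core-12")))
```

Because every fact lives under its own OID, the receiving side is a dictionary lookup instead of a parsing project.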

Life in the NOC

As you may know I work in a Network Operations Center or, as my business card says, "THE NOC". What is life like in the NOC?

My team is responsible for monitoring a large number of hardware devices (Are they working? Did something break? Is it fixed?), servers (Are they responding? How's their storage? Is something trying to kill them?), applications (Is it running? Why did it crash? Can it start up again?), and business processes (Did these jobs fail? Will they succeed before someone needs them? Can they be rerun?). It's a broad set of responsibilities and it involves working with many different teams.

One of the more interesting parts of my job is monitoring a new device or application. It usually goes like this. "We've just purchased something new and we want it to send its alerts to you. Can you make that happen?" Why of course! Someone from that team will ask a few questions, turn on "SNMP" on the device, point the "traps" at our servers and walk away. My team will see these "alarms" and panic. It usually looks like a cross between mayhem and spam. Neither is good and frankly I don't want either of them on my watch. There is sometimes some yelling. "Hey knock that off!" "No! This is important!" And then we sit down and work things out. Let me give you some background first.

SNMP for the purposes of this story is a simple way for things to send us messages. These messages when sent to us are called "traps". Things send us messages about anything they want to. Maybe they're hungry. Maybe they just turned on. More often than not they speak up when something goes wrong and it's usually along the lines of "Help I've lost power!" or "I HAD AN ERROR!" or "My eighth port on card 23 had too many collisions in the last 30 seconds!".

My systems listen to these messages, decide what's important, and create Alerts for my team to respond to. These are along the lines of "Pardon me sir, a switch named core-12 in New York is seeing too many errors on the following ports. Here are the procedures for this type of thing and here are the tickets from when they happened last week. Do have a nice day." I've trained my systems to be very polite.

When a device is set to send us all its messages, my system does its best to guess what's important and ignores the rest. It's harsh but necessary. There are some standard messages that most devices know (some useful, some not), but there is such a plethora of useless chatter that it would be very troublesome to pay attention to all of it. That stuff gets recorded but not alerted. "Why yes that's very interesting that you just did that thing. I can tell it's very important to you, I'll write that down and put it over here for later."
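In code, that triage is roughly the following. This is a toy sketch, not my actual system: the `ALERT_ON` policy, the `triage` function and the chatter OID under enterprise number 99999 are all made up for illustration. The coldStart and linkDown OIDs are the standard ones.

```python
# Standard SNMPv2-MIB / IF-MIB notification OIDs.
COLD_START = "1.3.6.1.6.3.1.1.5.1"
LINK_DOWN = "1.3.6.1.6.3.1.1.5.3"

# Made-up enterprise trap standing in for "I just did that thing".
CHATTER = "1.3.6.1.4.1.99999.0.1"

# Hypothetical policy: which traps deserve a human's attention.
ALERT_ON = {
    COLD_START: "restarted unexpectedly",
    LINK_DOWN: "lost a link",
}

recorded = []  # everything else is written down and put over here for later


def triage(source, trap_oid):
    """Return a polite alert for important traps, or None after recording the rest."""
    reason = ALERT_ON.get(trap_oid)
    if reason is None:
        recorded.append((source, trap_oid))
        return None
    return f"Pardon me, {source} {reason}."


print(triage("core-12", LINK_DOWN))
print(triage("core-12", CHATTER))
```

Nothing is thrown away. The chatter just doesn't wake anyone up.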

Often the devices don't know how to tell you what's wrong, or tell you when anything happens but not what happened. "I need help." or "Hey, I saw something!" Neither is helpful. Helpful devices will be very specific and tell you when problems go away. "I lost something." followed by "Hey I found it again!" is a very nice occurrence and my system knows it can resolve that Alert and we can all go on with life.
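Here's a small sketch of why the "found it again" message matters so much. The `AlertBoard` class and its state names are hypothetical, and a real system tracks far more than a counter, but the idea is the same: a problem opens an Alert, a repeat bumps it, and a clear resolves it without a human touching anything.

```python
class AlertBoard:
    """Toy correlator: pairs "problem" traps with their matching "clear" traps."""

    def __init__(self):
        self.open = {}  # (source, thing) -> {"count": times the problem was reported}

    def handle(self, source, thing, state):
        key = (source, thing)
        if state == "problem":
            alert = self.open.setdefault(key, {"count": 0})
            alert["count"] += 1
            return "opened" if alert["count"] == 1 else "repeated"
        if state == "clear":
            # Only resolve if we actually had an Alert open for this thing.
            return "resolved" if self.open.pop(key, None) else "ignored"
        raise ValueError(f"unknown state: {state!r}")


board = AlertBoard()
print(board.handle("core-12", "port 8", "problem"))  # "I lost something."
print(board.handle("core-12", "port 8", "clear"))    # "Hey I found it again!"
```

Note that this only works if the device names the thing it's talking about. "I need help." gives the correlator nothing to key on.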

Today I found a device that didn't want to bother me. If it had an issue it would tell you, but only the first time. "I lost power! but.. I lost power a few weeks ago, I shouldn't bother you with that again. I'll keep it to myself." I don't understand why someone would make a device like that.

I wasn't honest before. Applications can talk SNMP too. It's not just for devices. Most of the applications that talk to my system do it poorly and I know who made them do it. They sit on the other side of my office and while I'm not a mean person, I do judge their work quite harshly. They're notorious for spam. You see, if you can tell a computer to send a message once, you can easily tell it to send a message a million times. Or even better, you can tell it to send you a message every time it thinks something. A ton of messages from an application that deals with city dog registration isn't helpful if all it tells you is "I thought about dogs." every time a dog comes up in conversation. In fact that's a bit rude. I'm often asked to ignore things "for now". Could you ignore that for very long?

To be fair, application programmers don't want to be rude. Their application was made to do a task, not alert about it. They usually do the task very well and that is most important. But when things go wrong (and they will, and soon) how they tell my team suddenly matters and is the difference between the right reaction and a very slow wrong reaction. Imagine if the dog registration program was handed a cat and decided it couldn't work with any more dogs until someone took this cat away. Does it even know what a cat is? Can it say "I have a cat and I'm stumped what to do with it. It's not very dog-like at all." or will it just say "I can't think about this dog!". One message will make sense to me and I'll be able to help, the other will not and I wouldn't be able to guess what it meant.

I hold that it is important to talk to your NOC before sending us any new messages. But it's a step that's often forgotten.

I've barely scratched the surface of what life in the NOC can be like, but I should stop myself before I go on for too long. I hope this gives you a greater understanding of some of what we do. I also hope it helps garner a greater respect for your local NOC workers in your neighborhood or city.

Hugo

Go watch Martin Scorsese's "Hugo", based on the book "The Invention of Hugo Cabret" by Brian Selznick. Don't, however, watch any trailers or pay attention to any advertising. Whoever was in charge of marketing this movie screwed it up royally. I hadn't seen the trailers, and now that I've seen them I can say, "Don't watch them!". They are misleading and give away important plot points.

Other non-spoiling things I learned about Hugo from its marketing that turned out to be untrue:

  • It was animated.
  • It was a love story.
  • It was about a heart shaped key.
  • It was about a little boy who's a thief on the run from the law for killing possibly his father.
  • It was about a steampunk 1930's Paris.
  • It was about a robot who befriends Hugo after Hugo fixes him.
  • It was about a crooked Paris Inspector who hates little kids.
  • Everyone in Paris is English.

I don't want to tell you what the movie was about because discovering it was half the fun. Maybe that's what the movie was about: adventure and discovery.

Enjoy.

Umbrellas

Good umbrellas are rare. Today I was coveting the BLUNT Mini, an umbrella that will probably never break and satisfies my inner engineer. It's designed to have a low wind profile, mitigate the risk of poking people in the eye, and distribute stress across its structure. It doesn't look half bad either.

The problem with nice umbrellas is that they cost a lot. The BLUNT Mini is $80 and I'll probably leave it somewhere before it breaks. I'd have to go through more than 20 of the $4 umbrellas you can buy on every corner in Manhattan when it rains. I'd rather not have tons of crappy umbrellas traveling around the world only to end up buried in a Long Island landfill, but I wonder if the manufacturing costs (in money and environmental impact) of the BLUNT outweigh a couple of them making that trip.

This is neither here nor there, I have a better use for the $80. I also don't mind getting wet.

The Feynman Series — Beauty — Honors — Curiosity

I've been reading a lot about Richard Feynman lately. I found this breathtaking. I love how this man looks at the world. It makes me so happy.

On Investigative Journalism

I was listening to some damn fine journalism the other day.

Ira Glass of This American Life spent a few weeks in Glynn County, Georgia investigating a "drug court" that seems to have the power to legally imprison people without oversight, for indefinite amounts of time and relatively small charges. In the show's two acts he touched on two points I found very pertinent.

  1. The judge behind the injustices, Amanda Williams, has enough power that no one in her county will challenge her.
  2. What she is doing is legal and the victims have no recourse. They can't appeal once they enter the Drug Court program.

There will always be people who overstep their power, make mistakes, make bad judgments, etc. It's why we have checks and balances. No one can rule absolutely. What makes me angry is that there are no checks on Judge Amanda Williams. She has overstepped her bounds (on the record) on multiple occasions, and it takes an out-of-state reporter to come in and show what's happening.

Since this is the kind of journalism and programming I want to encourage, I've donated $10 to the "This American Life" show. If this is the kind of thing you want to encourage too, please donate at least a dollar, either via PayPal or by buying the episode on the iTunes Store.

Listen to the episode or read the transcript (pdf).

Next up:

This Developer's Life did an interview with Remon Zakaria, an Egyptian programmer who runs a successful business working on projects with American companies. Remon was an active participant in the recent revolution, against his family's wishes and with a company to run.

His turning point was when they shut off the internet and phones. He discovered that most phones still had data connections; they were limited to the networks they were on. To keep his team busy he had them develop a Twitter-like program that could run on every phone they could find (Nokia, Android, and weird java implementations on feature phones), but instead of requiring the internet they would scan the phone's subnet for other phones. They would connect with them, transferring messages and information about other phones. This led to a localized Twitter that could connect people in an area.
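I don't know how their program actually worked beyond what's said in the interview, but the idea as described (scan the subnet, connect, swap messages and lists of other phones) can be sketched as a toy simulation. Everything here is my own guess for illustration: the `Phone` class, the `candidates` and `sync` names, and the addresses. There is no real networking in it, just the bookkeeping.

```python
import ipaddress


def candidates(own_ip, prefix_len=24):
    """List the other addresses on our subnet that would be worth probing."""
    net = ipaddress.ip_network(f"{own_ip}/{prefix_len}", strict=False)
    return [str(host) for host in net.hosts() if str(host) != own_ip]


class Phone:
    """Hypothetical peer: holds messages and the addresses of phones it has heard of."""

    def __init__(self, ip):
        self.ip = ip
        self.messages = set()
        self.known_peers = set()

    def post(self, text):
        self.messages.add((self.ip, text))

    def sync(self, other):
        """Swap everything both ways: messages, and who else is out there."""
        merged = self.messages | other.messages
        self.messages = set(merged)
        other.messages = set(merged)
        peers = self.known_peers | other.known_peers | {self.ip, other.ip}
        self.known_peers = peers - {self.ip}
        other.known_peers = peers - {other.ip}


a, b, c = Phone("10.0.0.1"), Phone("10.0.0.2"), Phone("10.0.0.3")
a.post("hello")
a.sync(b)  # a finds b on the subnet
b.sync(c)  # b later finds c, and passes along a's message and address
print(c.messages, c.known_peers)
```

The interesting part is the second sync: c never talked to a, but ends up with a's message and knows a exists. That's what lets messages spread across an area one hop at a time.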

He talks about why older people trusted the TV more than any social media. That led to a great disagreement about what was actually going on in the streets and whether his family should be involved. He also covers how life has changed since the revolution ended. It's worth a listen.

On charging money

(This post is for all you freelancers out there.)

I got work consulting right out of college. I quit school. It was more giving up spending money on something I wasn't happy with than not wanting an education. (Not to speak ill of Rowan University, but I wasn't happy in the middle of nowhere New Jersey.) When I came home I figured I'd straighten myself out and go back to school elsewhere. My godfather had plans for me in the meantime.

He figured as long as I was home I should work repairing computers (a hobby of mine). I didn't have anything better to do so I went along with it. I made some flyers ("$20 an hour, fix anything PC or MAC!") and got someone from my old junior high school to let me stuff the teachers' mailboxes with them.

That led to a few jobs, which led to a few more. I talked about my new venture with anybody who would listen. I gave out cards all the time. I offered free advice, tons of free advice. I learned the phrase, "Well, to figure out anything more I'd really have to take a closer look, here's my card." And eventually I had enough work that I felt like I was undercharging.

Two things convinced me. A guy at a copy center asked how much I was charging (I was telling a friend about my new business) and when I told him he immediately said, "That's way too little, you'll never make it." I spun off, "Oh I'm just starting blah blah blah" and he didn't elaborate further. (He ran away quite nervously if I recall.) But that guy was right. I started noticing that people really didn't mind paying for my work.

And while I was getting better at what I did every day, I could barely believe that anyone was giving me money. It felt like a scam. This was a hobby, I mean I actually liked what I did. But as long as they insisted on paying I bet I could get more.

And so began my trek up the pricing chart. $20/hour turned into $35/hour, turned into $55 residential, $65 business, turned into $75/$85 and so on. $85 an hour was a plateau that kept me happy and ensured clients reserved me for the interesting jobs. I began working with a startup around that time and quickly became unavailable for most of the trivial work. But if I ever needed money I had a roster of clients a phone call away. There's always something I can do to help them.

A side effect of charging low was that people didn't mind wasting my time and would argue over pennies. My best guess is if they don't see you valuing your own time, they won't value it either. They wouldn't show up, or they'd spend hours talking to me, not learning about computers (I try to teach as much as I can), but sharing personal stories about computers, telling me about their family, their personal problems, etc. Some people became good friends (you know who you are) but it's an odd experience when you charge by the hour.

I'll leave you with the quote from this blog post that inspired my story.

Your response reminded me of how I moved from a $20/hr web developer to a $120/hr web developer in about 12 months simply by increasing my rate at every contract bid I put out and occasionally doing some very high, just-to-see type bids. I never lost a contract due to cost (unless I wanted to ;) ), it was purely a mental barrier.

I wasn't writing to these points, but if you want a takeaway:

  • Teach as much as you can. Keeping your client dumb doesn't keep you in business, it just keeps your business with them dumb.
  • Charge more with every project until it becomes prohibitive to the clients. As long as you're honest, you won't get paid more than you're worth.

Good luck!

Birds

Heard these little guys on my way home the other day. No idea what kind of birds they are but they sure are loud. Good thing they're cute. They hung around a while and flew off.

I saw them visit the next day too.

Wheels

Your daily fortune

Your daily fortune: (917) 652‑6846

(tron is brought to you by daft punk)