Some fungus from my neck… of the woods

June 29th, 2008

All photos taken with my beloved Nikon D40.

Starting with the ones I can identify:

thumb_20080622_12-47-25_114.jpg Lepiota morgani (green-spored lepiota), ultra common around here. These grow in my front yard and the yards of my neighbors. It’s difficult to see the stem ring in the photo but it’s there just beneath the caps. These supposedly grow in fairy rings sometimes, but I haven’t seen them doing it around here.
thumb_20080527_191035-1.jpg Lemon-yellow lepiota, less common around here, but they seem to like my mother-in-law’s terra-cotta pots.

thumb_20080515_165648.jpg Netted stinkhorn. These are really cool mushrooms. The stalks are spongy and hollow, and man do they stink. These sprout in the late spring after nearly every rain down near the creek where I live. They are edible but my guides tell me they are “not recommended.” Heh, I can’t imagine wanting to eat one either.
thumb_20080515_165048.jpg Fawn mushroom, growing out of a partially buried tree stump that isn’t quite visible in the photo. These are edible and supposedly quite tasty but I haven’t gone there yet.

thumb_20080426_174823.jpg thumb_20080426_174836.jpg When I first spotted these, I thought they might be chanterelles, but a quick glance at the gills shows them to be some species of little brown mushroom I haven’t been able to identify. I haven’t taken a spore print yet.

thumb_20080515_170546.jpg This might be some species of amanita, or (because of the ribs on the edge of its cap) a white lepiota. I don’t really know, and didn’t get a spore print for this one either.

thumb_20080523_090257.jpg No earthly idea what this is (although I suspect it might be a more mature version of the mushroom above). None of my identification guides has anything resembling this guy. I didn’t take spore prints of him either; I decided to take a few photos and come back later to see how it matured and get a spore print. When I got back it was gone. Eaten.. trampled.. whatever. Any ideas?

The Law of Ironic Potential

June 29th, 2008

Events have occurred, the details of which I won’t bore you with, which have convinced me of the presence of a probabilistic law of nature. My extensive research (4 Google searches) seems to indicate that this law has not been previously published, so although the details are still sketchy, and hypotheses remain to be tested, it seemed important that the community be made aware of it as quickly as possible. Chop chop people, there’s science to be done.
Closely related to Murphy’s law, the law of ironic potential states that the probability of a given negative outcome occurring increases linearly with its ironic potential. I will give you an example:

Assume the owner of a 19-year-old dog happens upon a dog-food sale at a local pet store. The law of ironic potential predicts that the probability of the dog being dead upon the owner’s return home varies directly with the amount of dog food the owner buys. If the owner were to buy an infinite amount of food (a “lifetime supply for $1200” for example) the probability of the dog already being dead approaches 1.

Stuff you don’t see every day

May 21st, 2008

lostchicken

Hate to say I told you so, but..

April 29th, 2008

If you use RBLs, you make the battle about IP space.

If you make the battle about IP space, they’ll attack the freaking IP space.

Don’t say you weren’t warned.

linkedin is gay

April 8th, 2008

Social networks in general are gay. This is merely a statement of fact. On the gay continuum they’re nearly as gay as blogs, or maybe even more so. If I had an online poll thingy I might take a poll about which had a higher coefficient of gayness except that these things are not subjective, and anyway online polls are utterly gay.
Anyway, for reasons unknown to myself, I’m completely fascinated by linkedin. I just can’t get enough. Someone will send me an invite and three hours later I’ll still be clicking around the infernal site reading the profile of someone in Argentina or answering questions I would LART people for asking in any other venue. It’s SO gay, yet somehow strongly compelling, and like I mentioned 3 sentences ago, I have no idea why.

Anyway, here’s 3 months’ worth of articles since I can’t be bothered to update my blog.

Mystical Flows from December

Permission to parse from February
Comply from this month’s issue in April

It’s that special time of year again!

November 7th, 2007

No, I’m not talking about Decemberween, I’m talking about LISA of course! They’ve invited me to give the Homeless Vikings talk this year which makes me — drumroll please — a LISA Invited Speaker. Now all I have to do is learn Spanish and hike the CDT and I can pretty much die happy. I’m changing things up a bit because I don’t think I quite got my point across at Defcon. Hopefully I can refine my delivery as well; I still find public speaking to be pretty much terrifying, and at this point it looks like my entire “LISA crowd” is going to be there this year (except Clarence (what’s up dude?! Per is coming and he lives in DENMARK!)), so I’ll have plenty of people to laugh at me when I start stuttering like a schoolgirl.

They’ve made PDFs of the papers available already to registered attendees, which is a first (and very cool). As usual the papers track looks fantastic (glancing around, NetADHICT, ATLANTIDES, and Usher all look promising (lots of tools papers this year)). Lots of Nagios-related stuff going on (none of which I’m directly involved in but odds are good I’ll show up :-) ), and I’m also looking forward to the panel on configuration management (I’m hopefully awaiting the day someone smarter than I am stands up and asks why the heck the CM people are all so in love with XML (yeah yeah, I’ve read the Burgess stuff)). There’s even an early BOF on digital SLR photography, which I’ve recently taken an interest in. Too cool.

See you there?

Opaque Brews, better late than never

October 29th, 2007

Work, as usual, is kicking my butt, so I’m just now getting the October ;login article up. In a sentence, it’s about monitoring Java virtual machines. They can be challenging because they abstract the application’s inner workings behind their own thread model and memory management. Anyway.. here ya go: Opaque_Brews

Vote for Ron Paul

August 12th, 2007

If you haven’t seen/heard of the guy.. just go to YouTube and search on his name, or better yet, go read some of the speeches he’s made in Congress. Forget your party affiliation for a second and just listen to the guy.

Electing him equates to a net gain in freedom, which benefits you more than having your way on any given combination of “the issues” possibly could, regardless of what party you feel obligated to vote for.

If you think he’s ‘too weird’, do me a favor and read this

Defcon

August 3rd, 2007

I’ll be at Defcon tomorrow giving the Homeless Vikings talk. The talk is at 3pm in Track 2. I’ll be the rather large white dude with sandals standing in a corner and looking forlorn, so I should stick out like a needle in a haystack (well, I won’t be wearing black, so that should help). Come up and talk to me please. I’m never lonelier than at a conference without any friends.

Edit:

Slides etc.. are now available on the Defcon media archives. You should be able to get video and audio of the talk there, but the link doesn’t seem to be working for me. *shrug*. By the way.. if you stopped going to Defcon around Defcon 10 or 11 like I did because it stopped being fun, you should definitely give it another go; they have a new venue and it is a BLAST. Dare I say better even than it ever was? I dare.

New ;login Magazine Column

August 3rd, 2007

So I’ve accepted an offer from ;login magazine to write a monitoring column for them. One of the cool things about writing for ;login is that you own the copyright to your articles, so I’ve decided to post them here, since I’m not very good at coming up with blog material anyway (and also, this will provide a way for people to comment if they so desire). So without further ado and with my sincerest apologies, I give you Column 1, “A View From Someplace Nearby”.

iVoyeur: A View From Someplace Nearby

Dave Josephsen

Greetings, and welcome to ;login magazine’s shiny new monitoring column. When Rik first approached me with the idea, I must admit my first thought was to wonder if there was enough subject matter to fill a bimonthly column for a reasonable length of time. Is systems monitoring really that deep? If you have any experience with the large enterprise-strength monitoring apps, then you know the vendors don’t seem to think so; they view systems monitoring as a largely turnkey affair. Purchase license, install agent, reboot server, repeat.

Even the corporate-backed open source upstarts seem to share this opinion to a certain degree [1]. While the Patrols and OpenViews of the world clamor to support the largest number of gadgets, the Hyperics and Zenosses appear to be differentiating themselves based on their auto-discovery tools and ease of configuration. If the vendor claims of “zero to monitoring solution in 30 minutes” are to be believed, then a monitoring column might not be a particularly entertaining prospect for you.

But as a good friend of mine once (quite rightly) said: “Knowing that there is a web server on port 8080 is about 2% of the problem”. Systems monitoring, it turns out, is anything but a turnkey affair. Just behind the shiny facade of port scanners and SNMP traps is a stunningly complex problem: a question, the answer to which is unique for each person who asks it. A problem, in fact, that I think we have yet as a community to fully understand, much less actually solve.

Consider for a moment what happens when you type a URL in your browser and get back an error page. At that moment, the actual status of the web “service” in question is a quantum superposition; it is, to you, in a Schroedinger’s state. You have observed an error page, but that isn’t necessarily indicative of a problem with the website itself. There are a great many things that could be wrong that have nothing whatsoever to do with the web server. The blame might rest with your system’s network connection, DNS, an unfriendly filter, or a mistyped “ip route” command by some sleepy admin somewhere in the world-wide mass of interconnected routers between you and the webpage you seek. Some of these you can test for, and some are more difficult to detect. The website is up AND it is down. There is an objective reality, a singular state, but for the moment it eludes you. You’ll have to tease it out.

Teasing things out, however, is a talent your monitoring system doesn’t possess. It checks exactly the parameters you tell it to check, and returns the result. If you called the parameter ‘web service’, then that’s what the monitoring system will tell you is down, and if you aren’t careful about choosing the parameters, it might even tell you everything is fine in the presence of a problem — an even more distressing proposition. If only knowing the state of the cat were as simple as opening the box. Arguably, the pinnacle of our error detection capability at this point is end-to-end monitoring: scripts that mimic user behavior, thereby encountering the same problems a user would. But end-to-end monitoring programs are something of a cop-out because they don’t actually give you the state of the cat either. They tell you that there is a problem (from the perspective of the monitoring system), but not where the problem might actually reside. Their real intent is to catch errors that more specific checks like port scanners might not. Monitoring systems, it seems, are not (yet) capable of making the observations necessary to solve our quantum conundrum.
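To make the "problem, but not where" complaint a bit more concrete, here's a minimal sketch of a check that walks the path in stages (DNS, then TCP, then HTTP) and reports the first stage that fails. Everything in it is my own illustration rather than anything Nagios ships: the names run_stages and web_stages are hypothetical, and the only borrowed convention is the Nagios plugin exit codes (0 for OK, 2 for CRITICAL). It also only sees the world from the monitoring box, so it narrows the question without answering it.

```python
import socket
import urllib.request
from typing import Callable, List, Tuple

# Nagios plugin exit-code convention: 0 = OK, 2 = CRITICAL.
OK, CRITICAL = 0, 2

# A stage is a label plus a probe that raises an exception on failure.
Stage = Tuple[str, Callable[[], None]]


def run_stages(stages: List[Stage]) -> Tuple[int, str]:
    """Run probes in order and report the first stage that raises."""
    for name, probe in stages:
        try:
            probe()
        except Exception as exc:
            return CRITICAL, f"CRITICAL - {name} failed: {exc}"
    return OK, "OK - all stages passed"


def web_stages(host: str, port: int = 80, timeout: float = 5.0) -> List[Stage]:
    """Hypothetical stage list for a plain-HTTP site (an assumption, not a standard)."""

    def dns() -> None:
        # Raises socket.gaierror if the name does not resolve.
        socket.getaddrinfo(host, port)

    def tcp() -> None:
        # Raises OSError if nothing accepts the connection in time.
        socket.create_connection((host, port), timeout=timeout).close()

    def http() -> None:
        # urlopen raises HTTPError for 4xx and 5xx responses.
        urllib.request.urlopen(f"http://{host}:{port}/", timeout=timeout).close()

    return [("dns", dns), ("tcp", tcp), ("http", http)]
```

Calling run_stages(web_stages("www.example.com")) returns an exit code and a one-line message, so "dns failed" and "http failed" at least stop being the same "web outage".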

So you can call this notification from your monitoring system a “web outage” on the reporting interface if you like, but that doesn’t make it true. Like the demanding helplessness of the user crying ‘the interwebs are broken’, there’s information there, but not very much, and it’s of questionable accuracy. Perhaps knowing where the problem lies is not critical to you; it’s enough to know that there is a problem, and you’ll take it from there. But perhaps automated site-to-site failover depends on bulletproof detection of a specific error or set of errors, or maybe the problem is chronic and requires a human to detect patterns in the service availability over time (false alarms make pattern hunting a bit more difficult). Either way, the monitoring system probably hasn’t actually answered the question it was intended to answer, and many of the humans using the system won’t be aware of the distinction. In systems monitoring, the area where the humans and system meet is especially problematic.

Really, the core of the monitoring problem is that we’ve created ourselves some rather untrustworthy machines. There are just an awful lot of places where things can go bad, and for all of our fancy packet pitching, today’s PCs are very much islands unto themselves, barely aware of their own state, much less that of the network around them. We, like an unfortunate mix between detective and geologist, rely mostly on forensics to gain what insights we can: netflow, syslog, utilization graphs, monitoring tools. And being every bit as untrustworthy as the systems it is trying to monitor, the monitoring box itself can have all of the same problems. In the end all it can give you is its own crudely-gleaned opinion of the current state of a set of services from a single static point in the network, which is often a poor substitute for knowing the service state first hand.

So asking one fallible machine in a fallible network its opinion about the fallible machines surrounding it might not be so great an idea. Doing so is not unlike paying a guy $5 to watch your car at 2:30am in Tijuana (usually it works out fine but that doesn’t make it a good idea). And speaking of misplaced faith in humanity, the humans in this equation are just as fallible as the machines (if not more so). For one thing, in classic failure-to-quantify-the-risk fashion, we sysadmins and our managers seem to place an unfounded amount of trust in our monitoring systems. As if calling them “monitoring systems” somehow imbues them with a magical immunity from mistaking a DNS failure for a website outage, or even just crashing outright.

But alas, our monitoring tools betray us. They crash like normal systems, and are largely dependent upon the same network infrastructure as the other systems. And yet for some reason the false positives surprise us as much as false negatives; it “feels” like this sort of thing shouldn’t happen to the monitoring system. It seems ironic, when there’s no real reason it should. It’s telling even that I used the word “betray”. So there is an emotional component here, and its most common effect is to cause us to ignore a monitoring system that has proven itself to be unduly chatty, or sometimes incorrect. We don’t “lose faith” in Tomcat when it runs out of threads and starts handing out 500s, but for some reason we are quick to anthropomorphize and discredit a monitoring server for its transgressions, even though it may be the worst possible server to take with a grain of salt.

With “normal” systems — the ones without the magical “monitoring system” moniker — we mitigate the risk of failure with redundancy. Load balancers, VRRP, BGP multi-homing; redundancy is an industry unto itself, and it could certainly help out in a monitoring context. It’s not uncommon for a large organization to have a failover monitoring box, and large installs sometimes require lots of monitoring systems to aggregate alerts to a master in order to scale, but these setups don’t improve the resolution of our failure detection ability.

Parallel systems have potential in this regard; two opinions are better than one. Yet curiously, monitoring systems are seldom deployed this way (if you have one, I’d like to hear about it).

This might be because having parallel monitoring systems agree on a given service state is a difficult problem to solve, which in itself is a decent proof of the fallibility I alluded to above. What do you do when two systems disagree? Further, avoiding things like redundant notifications requires that the monitoring systems be somewhat aware of each other’s opinion of the current state of things, making prospects even hairier.
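Here's a rough sketch of why two boxes alone don't get you very far. It's purely illustrative: reconcile and Notifier are hypothetical names, the states reuse the Nagios exit codes (0 OK, 2 CRITICAL, 3 UNKNOWN), and the duplicate suppression assumes both monitors can somehow see one shared record of what has already been sent, which is exactly the hairy part.

```python
from typing import Optional, Set, Tuple

# Nagios-style states: 0 = OK, 2 = CRITICAL, 3 = UNKNOWN.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def reconcile(local: int, peer: Optional[int]) -> int:
    """Combine two monitors' opinions of a single service.

    Agreement is passed through. Disagreement, or a peer that never
    answered (None), yields UNKNOWN: with only two opinions there is
    no way to tell which one is wrong.
    """
    if peer is None or local != peer:
        return UNKNOWN
    return local


class Notifier:
    """Suppress repeat alerts for a (service, state) pair already sent.

    Assumes a single record shared by both monitors (a big assumption).
    """

    def __init__(self) -> None:
        self._sent: Set[Tuple[str, int]] = set()

    def notify(self, service: str, state: int) -> bool:
        """Return True if an alert should go out, False if it is a repeat."""
        key = (service, state)
        if key in self._sent:
            return False
        self._sent.add(key)
        return True
```

Note that reconcile(OK, CRITICAL) comes back UNKNOWN. The second box told us the first might be wrong, but not which of the two to believe.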

Leslie Lamport is intimately familiar with getting parallel computational entities to agree on states. His work on the Byzantine Generals [2] problem is used widely at NASA and in the aerospace industry to design fault-tolerant flight-control systems. His work, and the work of those at SRI, showed that 3n+1 processors are generally required to tolerate n faults. In layman’s terms and warped to suit our needs, the opinions of 4 monitoring systems would be needed to reach a trustworthy agreement on a given service if one of them were malfunctioning. I don’t have any fancy math to back this up, but it “feels” like the odds of 1 out of 4 PC-based monitoring systems misbehaving at any given moment are good. So monitoring systems, it seems, may need to be a bit more redundant than we’re used to before they can begin to give really meaningful opinions. We can’t simply toss another box in without making things a lot more complicated, and it turns out we’d need to throw in quite a few before seeing a real return on the investment.
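The arithmetic is easy enough to play with. The sketch below is not Lamport's algorithm (the real thing has the participants exchanging messages over several rounds); it just encodes the 3n+1 bound quoted above, a naive tally by some trusted party, and the chance that at least one box is off, assuming each box misbehaves independently with the same probability. The function names and the 10% figure are made up for illustration.

```python
from collections import Counter
from typing import Optional, Sequence


def monitors_needed(faults: int) -> int:
    """The bound quoted above: 3n + 1 participants tolerate n faults."""
    if faults < 0:
        raise ValueError("faults must be non-negative")
    return 3 * faults + 1


def p_any_faulty(p_single: float, monitors: int) -> float:
    """Chance that at least one of `monitors` boxes misbehaves.

    Assumes independent failures with the same probability p_single.
    """
    return 1.0 - (1.0 - p_single) ** monitors


def tally(opinions: Sequence[int], faults: int) -> Optional[int]:
    """Naive vote counted by a trusted party (not Byzantine agreement).

    Returns a state only if there are at least 3n + 1 opinions and all
    but `faults` of them agree; otherwise returns None.
    """
    if len(opinions) < monitors_needed(faults):
        return None
    state, votes = Counter(opinions).most_common(1)[0]
    return state if votes >= len(opinions) - faults else None
```

With four monitors and a hypothetical 10% chance of any one box being wrong, p_any_faulty(0.1, 4) comes out to roughly 0.34, so "one of the four is lying right now" isn't an exotic case. And tally([2, 2, 2, 0], faults=1) returns 2 (CRITICAL), while a 2-2 split returns None.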

None of this is to say that systems monitoring is impossible or hopelessly broken. In practice monitoring tools usually work pretty darn well, and are certainly better than nothing at all. My monitoring systems have saved my gravy more times than I can remember. But it’s useful, I think, to imagine a reference system: some inexpensive, Byzantine-failure-proof, massively parallel monitoring system communicating securely via out-of-band channels and telling us with flawless accuracy and resolution about specific problems and their causes without burdening the network with traffic, or the systems with bulky agents. It introduces no security flaws, has an infinite amount of trending and utilization data on every metric we can imagine on every server and network device in the environment, and can do complex event correlation and aberrant behavior detection in real time. Maybe it has some of those heuristics and biological diversity I’m always reading about, and what the heck, it runs Plan 9, and doubles as a margarita machine. This makes it easier to imagine the huge space of grey between the reference system and the system you probably have in your shop today. That enormous grey-space is what the vendors are ignoring when they say “0 to monitoring solution in 30 minutes”.

So needless to say, I happily took Rik up on his offer. In the column, I want to explore the grey space, providing practical solutions, advice, code, and general food for thought. My sincere hope is that perhaps somewhere along the way we’ll both gain a better understanding of the problem, and maybe move a few gradients closer to the monitoring system of our dreams. Expect topics to range from network architecture to SNMP to security to data visualization to temperature sensors to dealing with humans and back again, running the gamut of what you as a sysadmin might run into in the course of implementing and maintaining a monitoring system.

To a large extent the information I provide will be specific to Nagios [3], which is probably the most ubiquitous open source monitoring program today. This is not, however, a column about Nagios. I would prefer that you think of Nagios as a reference implementation language rather than a design requirement. If systems monitoring has an XML-like means of specifying solutions, a prototyping language that is relatively easily translated between disparate systems, then Nagios, with its (almost painfully) open architecture and liberal lack of design assumptions, is probably the closest thing I’ve seen to it. So my use of Nagios in this column is only to ensure that the solutions discussed herein have a good chance of being translated to whatever you happen to use (and if that’s Nagios, then all the better).

Feel free to shoot me an email, or comment on my blog if you would like to talk about something specific, or just want to say hi. And finally, believe it or not, I honestly plan to maintain a better signal-to-noise ratio in my future articles, so sorry for the theoretical ramble. I promise to have some nitty gritty for you in the next issue.

Take it easy.
–dave.

dave-usenix@skeptech.org

http://www.skeptech.org

[1] http://books.slashdot.org/comments.pl?sid=230333&cid=18695063

[2] http://research.microsoft.com/users/lamport/pubs/pubs.html#byz

[3] http://www.nagios.org