Many organizations like to see certificates. Certification has its limits. The ability to pass a test requiring memorization is more a test of one's ability to memorize details than of one's native skill in a hypothetical worst-case scenario where the entire network is down and everyone is looking to you for leadership.
Since image-editing software such as Photoshop became commonly available, the reliability of paper certification has dropped substantially. As a court-appointed foster parent, I learned that my own foster daughter's older sister spent several years in prison as a habitual criminal whose modus operandi involved creating fake certifications - diplomas, awards, permits, and the like - for herself and others; it got her in trouble, time after time.
All that having been said, here are some certificates to establish that I am a Microsoft-certified information technology professional, and that I have also been certified to install and administer a variety of databases, firewalls, and operating systems:
The definitive list of all the special courses I have taken is best reviewed in my resume.
NOTE: The certificates shown above are only those courses where a paper certificate was issued at the end of the class, AND where that paper certificate has survived years of turbulence.
I do not recall Oracle University or Sybase User Education issuing graduation certificates to their employees, for instance; the courses were usually week-long intensives and the evidence that we had graduated was in our continued employment and increased responsibilities.
Here we have assembled over three decades of work-related information technology exhibits, going backwards in time.
Consider this a virtual interview - these are the materials I would want to show you, the order I would present them in, and, if I felt comfortable, the comments I would make while presenting them, if we were meeting in person to discuss employment, in the pre-COVID, pre-H-1B days of yore:
In 2015 my oldest daughter was hospitalized and I returned to Humboldt County to care for her full time. The past several years have been consumed dealing with the complexities resulting from her care. A week or so after my daughter was moved to a hospital 250 miles away ... my mother died. In 2020 my younger brother was killed - also in Humboldt County; we suspect murder because his wallet was emptied - and I spent two years dealing with the fallout from that. Employment opportunities have been spotty during these past few years: all those H-1Bs, and then COVID, which popularized remote employment - and which has opened up new opportunities. The field of remote administration is exploding. (Personally, I have been telecommuting to work since 1986 - when I first started carrying an off-hours pager for GE Calma, and would be automatically paged by my own programs when they detected something that needed human intervention; at which point I would dial up the workplace's modem from my home, log in, authenticate myself, and do what needed to be done. So I have no problem with working remotely or managing others who are working remotely.) I am now back on the job market and actively looking for work that is meaningful and professional.
Send me an email: childers@redwoodhodling.com
Project: Cheap Classroom NFS Workgroup
The Raspberry Pi is a single-board computer the size of a 3-by-5-inch index card. A variety of operating systems have been ported to it, including several Linux releases, FreeBSD, and an operating system written in assembler specifically for the Raspberry Pi's CPU architecture; the number of available operating systems is constantly growing.

A teacher at my oldest daughter's school tried to introduce the students to robotics using the Raspberry Pi, back in 2014. He required my daughter to get a Raspberry Pi, so I ordered two - I thought that perhaps I would have an opportunity to contribute. But the teacher encountered resistance and I don't think he was invited back - Academy of the Redwoods was rolling out Chromebooks and did not seem to see the value in teaching children electronics, programming, or Linux.

However, I continued playing around with my Raspberry Pi, and I found they were versatile little machines. I actually downloaded a version of Minecraft that ran on the first version of the Raspberry Pi. The size of the universe was limited by the amount of available memory to, I think, something like 1024 blocks in any direction, and the CPU could only handle two players before it started slowing down. But, considering that this diminutive computer was running not only a UNIX operating system, but a game on top of it ... I was quite impressed.

I did some tests to see how the Raspberry Pi held up as a UNIX server and found that it didn't have what it takes. A server tends to spend a lot of time accessing its storage, and, on the Raspberry Pi, storage is accessed through the USB controller, which sits on the same bus as the CPU and everything else - so when the bus is busy transferring data it can't be used for processing data, and vice versa. As a client, however, the Raspberry Pi is superb.
For instance: the adjacent screengrab shows a Raspberry Pi 3B running the FreeBSD operating system, a web server (nginx), and the Gnome windowing system. On the left is an X utility, xosview, that shows the machine's state graphically. On the right is an xterm running htop, which shows the process table and the load of each CPU - note the four cores. In the lower left is a program called tbclock that tells the time in binary - hours are blue, minutes are red, seconds are yellow. Very impressive, for a computer the size of a postcard. Don't you agree? Everything you see in this screengrab is open source - the Raspberry Pi, the operating system, the services, the applications. No license required. No fees. It's all free.

I had spent some time working with another of my children's teachers, at a middle school in San Francisco, and I had become aware that the San Francisco school district possessed hundreds, perhaps thousands, of Raspberry Pis - but didn't have a good use for them, as administering a lab full of UNIX computers is a full-time job in its own right, leaving teachers who wanted to teach computer science with no time left for teaching. As San Francisco goes, so goes the rest of California; perhaps the rest of the nation. That's an awful lot of Raspberry Pis sitting idle. I decided to do something about it. This was an area I was something of an expert in, because deploying and remotely managing large populations of UNIX computers is what I have done for a living for forty years. Professional UNIX administrators use powerful software tools to orchestrate the administration of dozens, hundreds, or even thousands of UNIX computers simultaneously - think of the racks and racks of computers at companies like Amazon, Ebay, and Google.
I began to wonder whether it might be possible to create a software release that used these orchestration tools to administer a classroom full of Raspberry Pis: a traditional UNIX workgroup architecture with Raspberry Pi clients and a more powerful Intel- or AMD-based workgroup server at the teacher's workstation, from which the clients could be managed centrally through a friendly graphical user interface. And thus RaspiLab - a project to create software to weave an old Intel computer and a box of idle Raspberry Pis into a fully functional, centrally managed, easily orchestrated, classroom-oriented NFS cluster - was born.
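The core of such a workgroup is small: the server exports a shared filesystem, and each client mounts it at boot. A minimal sketch of the two sides (the hostname, subnet, and export options here are illustrative assumptions, not RaspiLab's actual configuration):

```
# /etc/exports on the teacher's workgroup server (hypothetical subnet)
/home       192.168.10.0/24(rw,sync,no_subtree_check)
/usr/local  192.168.10.0/24(ro,sync,no_subtree_check)

# /etc/fstab entries on each Raspberry Pi client (hypothetical server name)
server:/home       /home       nfs  rw,hard  0  0
server:/usr/local  /usr/local  nfs  ro,hard  0  0
```

With home directories on the server, a student can sit down at any Raspberry Pi in the room and see the same files - and the teacher has one machine to back up, not thirty.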
Project: Automated NAS Cleanup
Project: Breathe New Life Into An Old Tablet
The last time I worked in Silicon Valley, it was for Roche Molecular Systems. Roche Molecular Systems was spun off by Roche Pharmaceuticals, the parent company in Switzerland, in order to profit from recent advances in biocomputing - RMS sold a CRISPR editing unit the size of a laminar hood. As an indication of how fast things were moving, functionality similar to that offered by Roche's CRISPR unit was available by 2015 as a small device that could be plugged into a laptop computer's USB port and operated on a desktop. It was said that the smaller units were less precise, but also that the inaccuracy could be compensated for by using many such desktop-sized USB CRISPR readers in parallel to check one another, while still achieving significant savings compared to the six- and seven-figure prices of the larger units. So I was thrilled to be working on the fringes of biocomputing. I'd taken a lot of flak in the 2000s for not having enough "Big Data" experience on my resume - me, a guy who has survived and flourished while computing with 8-bit, 16-bit, 32-bit, and now 64-bit CPUs ... me, a guy who runs Big Iron for Big Companies ... me, a guy whose home RAID capacity has been measured in terabytes since 2005 ... me, a guy who was raised by an astronomer, knows the speed of light, and has been using scientific notation since he first started programming computers at age 14. Gradually I realized that what the recruiters meant was, "You're too old". But none of the problems I encountered at Roche were technological.
I've always gotten along well with scientists, including those at Roche. They wanted to know how and why things work. So do I. Some of the best explanations I've gotten came from graduate students and senior scientists. The presence of a college degree does not seem to be the important factor; what is important is a burning interest in the truth. The rest is details. So when some of the researchers started asking me questions about load average - why did the computer show one CPU to be 100% busy while the other twenty-seven processors showed as idle, and how could they change this behavior - I was excited to have a chance to share everything I had learned about parallel processing at GE Calma, Oracle, Sybase, and Ingres, and from studying parallel processing at Sequent's User Education, in Portland, while working at Oracle. But I also learned something very disturbing about the people I was working with in IT - the scientists told me that they had installed their own Nagios instance to monitor the uptime and performance of their own servers, to gather statistics with which to argue about resource allocations with their management ... and that the IT manager I was reporting to - the one on vacation - had ordered it deleted, without warning. That seemed rather savage behavior for an information technology professional. After all, if we are both gathering metrics on the same device, using the same technology, you would expect the metrics to agree with one another. So where is the harm? And with 27 idle cores, it was not as if Roche was lacking in computing resources.
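The answer to the researchers' question, by the way, is usually that the program in question is single-threaded: one process can only ever keep one core busy, no matter how many cores the machine has. The cure is to split the work into independent pieces and run them concurrently. A minimal shell sketch of the idea - the work function here is a stand-in for a real computation:

```shell
#!/bin/sh
# Stand-in for one independent slice of the real computation.
do_slice() {
    # ... the real work for slice "$1" would go here ...
    echo "slice $1 done"
}

# Run four slices concurrently - each backgrounded job can occupy
# its own core - then wait for all of them to finish.
run_parallel() {
    for i in 1 2 3 4; do
        do_slice "$i" &
    done
    wait
}
```

With four slices running, htop would show four busy cores instead of one; the same principle scales to however many cores the machine has, as long as the slices do not depend on one another.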
I guess you could say that I am an inclusive IT manager. I like my customers to know what I am doing and why I am doing it. No secrets. Everything is transparent. I am the person they trust with their mail and their file transfers. I can see everything. I need to be honest with them in order to maintain that trust. Furthermore, I am not against sharing power. With power comes responsibility. If I want my customers to take responsibility for their computing resources and their information security, I need to empower them. With empowerment usually come mature relations between equals. However, the person whom I had replaced for two weeks while he vacationed in Hawaii was furious with me and instructed me to cease communicating with my customers. It is also possible that Roche no longer needed my services, and that their negotiation of a six-month contract was just a ploy to secure my services for a few weeks while their manager took time off, then find a reason to terminate the contract. Silicon Valley has gotten very Machiavellian these past ten or fifteen years. It was at that point that I informed my manager that my daughter had just been hospitalized; and another contract was concluded.
2013: Pacific Gas & Electric (PG&E)
MobiTV brought me onboard because they needed help running what they called a "war room" - more precisely, an NFL war room. You see, MobiTV presents itself as a Silicon Valley dot-com corporation, but really, they are an entertainment company: their entire product line is designed to turn your cellphone into a television. I would call this a desperate last gasp of the live entertainment industry to remain relevant in a future that will be entirely based on streaming video. MobiTV is the only company I have ever worked at where software bugs were noted on a spreadsheet rather than fed into a bug database. MobiTV didn't even assign ID numbers to their bugs. They were all related to appearance and features. I felt that the company's approach to software development was remarkably casual. So, basically, the "war room" job was to babysit dozens of audio and video feeds that were going out in sync with one another, targeting half a dozen vendors and maybe a dozen different screen resolutions. The primary feed was up on a six-foot-wide screen at one end of the board of directors' table, and we were scattered around it, watching metrics and instrumentation on our laptops, watching professional football on that screen - and being paid to do so, too. The job came to an end one Sunday, when the satellite they were using passed in front of the Sun and, for ten minutes, was invisible against the Sun's electromagnetic radiation. It seems that real broadcasting companies use two satellites for exactly this reason - at least, they must do so when one of the satellites is going to transit in front of the Sun. A scapegoat was required, and I volunteered myself by applying my astronomical knowledge to quickly guess the cause of the problem, drawing the anger of the manager whose project this was.
In 2004 Daemonized Networking Services contracted with Button IT Services, in Alameda, to develop a workgroup configuration, including a working demonstration model, the workgroup server configuration, the workgroup client configuration, and all documentation for creating them.
The operating system we used for this project was a Linux release called Immunix, which incorporated what were at that time state-of-the-art mechanisms to counter buffer overflow attacks. The deliverables included instructions on installing and configuring the operating system; SSH; Lynx, a text-based web browser; DHCP; DNS; and LDAP, taking the place of NIS for purposes of authentication. If I recall correctly the infrastructure also used NFS.
Installing and configuring LDAP proved to be the most challenging part of the task. LDAP has its own unique syntax dating from the early days of computing (ASN.1), and using LDAP as a replacement for NIS required the installation of a custom schema as well - but it worked. Problems ensued, however, when we ran X Windows, because X had not been modified to recognize LDAP as an authentication mechanism. It was at this point that I left the project, with my part complete. I don't think Russ Button's idea of a Linux-based workgroup configuration ever made it to market, even after Sun Microsystems folded, essentially removing his biggest competitor. I think Russ had a child in the hospital; and, like so many others with a great idea, he had no funding. Today, the work I did for Russ Button inspires my own efforts to create a similar product - a UNIX-based NFS workgroup configuration - but I am using different computers, and a different operating system, this time around; and the customer base I am aiming at is schools, with boxes and boxes of old computers gathering dust for lack of good software to install.
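For reference, the client side of using LDAP in place of NIS boils down to a pair of configuration fragments along these lines (the server name and base DN are hypothetical, and the exact file names vary from release to release):

```
# /etc/nsswitch.conf - consult local files first, then the directory
passwd: files ldap
group:  files ldap
shadow: files ldap

# /etc/ldap.conf - point the nss_ldap client at the workgroup directory
host  ldap.example.lan
base  dc=example,dc=lan
```

Once the name-service switch is wired up this way, any program that looks up users through the standard C library - login, ls, cron - resolves them against the directory without knowing LDAP exists.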
Also in 2003, I did a gig working with a company called Emageon, which had designed and built an extremely robust, highly available medical image server and was installing it in Kaiser hospitals throughout Northern California. Emageon's medical image server incorporated two separate servers, each running an HA-Linux kernel and connected by a heartbeat link. The level of replication between the two servers, due to the use of the HA-Linux kernel, was such that you could type a command on one console, then move to the other console, type 'history' at the command prompt, and see the command you had just typed on the first machine reflected in the second machine's command history - the two machines were truly one. Here we see the procedural documentation for installing an Emageon server; it is remarkably similar to the sort of documentation I am accustomed to drafting for myself and others.
2002: Daemonized Networking Services
I'd like to mention one more Dot Com derelict: Linuxcare. Speaking as a UNIX systems administrator, my first commitment was to reliability and structural integrity. I had some problems with using Linux in an enterprise role where hundreds or thousands of people might be depending upon one machine to work properly, perhaps for years. I also found precompiled packages problematic - I often had insights into how the software might be better compiled to fit the local environment, and precompiled binaries did not allow me the opportunity to customize the installation.
The Linux kernel is excellent. I have no complaints about the quality of the kernel. However, the suite of applications bundled with a typical Linux release is often selected for glitz and glamor rather than for reliability or interoperability. Command line utilities are given short shrift.
I'd applied to Linuxcare back when things were flush, in 1999, but I hadn't heard anything back - probably because Linux was The New Wave and UNIX was The Old Wave, and because I'd been a UNIX administrator for ten years already I could not possibly have anything to contribute to the Linux movement, which was Revolutionary™ - unlike UNIX, which was old-fashioned and proprietary, not open source and young and sexy, like Linux. Now it was 2001. The Dot Com economy - all smoke and mirrors and promises - had collapsed, and 90% of Linuxcare's employees had been laid off.
Speaking of smoke and mirrors, take note of whose name appears the most on the 'Our Linux Experts' page. Take note of whose name also appears on the 'Strong Management' page. 'Secure networking', you say. 'Netfilters', you say. Show me the projects. 'Blogosphere icon' - LOL. Currently Dave Sifry is warming a seat for the Anti-Defamation League. Was Linuxcare just a giant slo-mo smash-and-grab, with management misleading investors, on a corporate dot-com scale? All that catering. All those first-class hotel rooms and airline tickets. That female "intern", from Japan. All that attention. All that free publicity. All those freebies. Ya gotta wonder.
The Linux community was chock full of unique individuals and I fit right in. I loved the willingness to experiment. On the other hand, enterprise infrastructure needs to be rock-solid stable; you can't build magnificent architecture on the shifting sand of college IT projects - there needs to be reliable support for the product. Ten years before LinuxWorld, I had attended InterOp in the exact same convention hall. It seemed to me that what Linux needed was to interoperate with other UNIX operating systems; more cooperation, less rebellion.
Linuxcare's faith in Linux as a shield against hackers was, I thought, misplaced. But as a platform for innovation and experimentation - in things like Linuxcare's bootable business cards, short-range infrared communication between ThinkPads, or controlling battle robots and drones - Linux was, and is, great.
While I was at Linuxcare, one of the things we dealt with was being constantly targeted by people strobing our external network, looking for vulnerabilities. Scanssh was the most common probe we received. I wrote a script that scanned system logs for error and warning messages related to strobes. As it detected each message, it would extract the IP address ... nmap(1) the remote host first, so that if it disconnected we would still have its fingerprint ... then run a traceroute(1) back to the remote IP address, to document the path the packets had taken. Then it would look up the administrative email address of each intervening domain with whois(1), and send them an email detailing how <insert remote IP address here>'s TCP packets had traversed their network infrastructure, en route to our firewall, for purposes which we regarded as devious - and that we weren't complaining, just notifying them, in the interests of due diligence.
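The heart of such a script is the pipeline that pulls the offending address out of each log line; everything else hangs off that. A simplified sketch, with the active-response steps reduced to comments (the log format shown is a hypothetical syslog-style line, not Linuxcare's actual logs):

```shell
#!/bin/sh
# Extract the first dotted-quad IP address from a log line.
extract_ip() {
    # grep -Eo pulls out just the part of the line matching the pattern
    echo "$1" | grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' | head -n 1
}

# For each suspicious log line: fingerprint, trace, and notify.
handle_strobe() {
    ip=$(extract_ip "$1")
    [ -n "$ip" ] || return 0
    echo "strobe from $ip"
    # nmap -O "$ip"       # fingerprint the host before it vanishes
    # traceroute "$ip"    # document the path the packets took
    # whois ...           # look up each intervening domain's contact
}
```

In the real script the three commented steps ran in that order, and the whois(1) output fed the notification emails to each intervening network's administrative contact.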
The message sent to the intervening Internet services' administrative email addresses was carried at the end of the script, as a payload, commented out - a technique that I had used previously at Oracle Corporation, in a script I wrote to automatically warn Oracle employees when their shared /home filesystem was nearing its capacity. The earlier Oracle script would identify the top ten users of disk capacity, by login; then embed this information in a friendly, preformatted email to all occupants of the shared filesystem, encouraging everyone to contact these individuals - and their managers, if necessary - and ask them to clean up their home directories and free up some space for everyone else. In this way the community itself enforced the discipline required to administer their shared resource; the software only informed, it did not act. It was a masterful piece of human engineering - and so I copied its design when I drafted a solution to our Internet probes at Linuxcare, and, from all reports, it was massively successful. If we really want to stop black hat hackers, IMHO, it's not that hard. It just takes commitment. Follow-up. Attention to detail. Burglars are looking for low-risk opportunities. They don't expect you to come and try their doorknob. Active defenses - blocking the offending IP for a week or two with a firewall rule, for instance - are quite effective.
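The payload trick is worth spelling out: the message template lives at the bottom of the script itself, commented out, and the script recovers it from its own file at run time. A minimal sketch of the technique (the marker name and message text are illustrative):

```shell
#!/bin/sh
# Extract everything after the __PAYLOAD__ marker in a script file,
# stripping the leading comment characters to recover the message.
extract_payload() {
    sed -n '/^#__PAYLOAD__/,$p' "$1" | sed '1d; s/^#//'
}

# Everything below the marker is ignored by the shell, but can be
# recovered as the body of the notification email.
#__PAYLOAD__
#Dear administrator,
#We observed traffic from your network probing our firewall.
```

Keeping the message in the same file as the logic means the script is self-contained: there is no separate template file to lose, and editing the wording never touches the code above the marker.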
One of the tasks I assigned myself was cleaning out the desks of the employees who had been laid off, in order to reacquire the various assets that had been in their possession. I sought to assemble a curated library of Linux releases from what had been in the possession of all 200+ Linuxcare employees - to preserve their work, as it were - perhaps as an exhibit in a museum, somewhere. When I was done we had collected well over 100 distinct Linux releases. I think my favorite Linux name might have been Yellow Dog Linux. It was in this way that I acquired a small collection of Linuxcare bootable business cards, including what I think may be hand-numbered cards #1, #2, #3, #4, #5, and #6 of the original, version-number-free release of the Linuxcare Bootable Toolkit, or LBT.
MimEcom was another flash-in-the-pan Dot Com, started by a bunch of tech bros who had gone to college together and founded a company - Fort Point Partners - that built web sites; sort of a fratboys' managed equivalent of Organic Online (another well-known local turn-of-the-century website builder, whose offices were visible from the San Francisco approach to the Bay Bridge). They delivered the websites to their customers, but then the customers wanted them to run the websites, too ... and so they spun off MimEcom, which did nothing but manage website infrastructure. Overwhelmed by the demand for their services, MimEcom started hiring people off the street - even the CEO's girlfriend's in-laws. The quality of the work we were doing decayed, and I left.
1998-1999: Hambrecht & Quist (HQ)
1997: Wells Fargo Bank, Information Security Services (WFB/ISS)
The work I did for Wells Fargo's Business Banking Group led to another contract with Wells Fargo - this time with Wells Fargo's internal security team, which had been tasked with deploying something called Powerbroker and was approaching a deadline, after two years, with nothing to show. When a few other contractors and I first started working on Powerbroker, we were operating in a vacuum. Nobody knew anything about Powerbroker. The manuals had disappeared. The employee tasked with completing the project, Jim Nation, had retired and moved to Colorado. All we had was the software - it had been installed on some servers - and the online manual pages. I printed out the online manual page for the pbrun(1) command and studied it carefully. This led to another online manual page, for the local pb.conf(5) file. These documents led me to other, related online manual pages describing other parts of the Powerbroker infrastructure. I gradually figured out that Powerbroker was, basically, like the common UNIX utility sudo(1), except that instead of consulting a local file to see what a user was allowed to do as root, pbrun(1) went out over the network, via an encrypted connection, and got the same sort of information from another, different computer - ideally, one that the person using pbrun(1) did not have access to. I sketched the fundamental relationship amongst all the parts, in the image displayed here, in order to preserve my understanding of this arcane software's functionality. Once I grasped this relationship between pbrun(1) as a client and the Powerbroker infrastructure as the server that granted or denied permission for the operation to occur, I quickly realized that this architecture could be employed recursively. This realization - that Powerbroker was designed to be used recursively - is preserved in the sketch that I made, in the lower right corner of the diagram.
The diagram shows how a hypothetical departmental client (labelled 'CLIENTS') could query a departmental server (labelled 'DEP') for permissions ... while, simultaneously, administrators of the departmental server would need to query a CIS server (labelled 'CIS') for the authority to administer the departmental server ... while, simultaneously, administrators of the CIS server would need their commands approved by a top-level, company-wide Powerbroker server (labelled 'PB') for the authority to administer the CIS server. Therein lies the origin of Wells Fargo Bank's mighty Powerbroker infrastructure - one lone contractor and a piece of scratch paper, reverse-engineering the design from a few well-written online UNIX man pages and explaining it to management. Just another day in the life of an easily replaced IT contractor whose work is quickly forgotten, LOL.
1997: Wells Fargo Bank, Business Banking Group (WFB/BBG)
I next found myself working for Wells Fargo Bank's Business Banking Group, known as BBG. If I recall correctly, Alan Myers, the administrator I was filling in for, had broken his collarbone motorcycling - just as BBG was getting ready to do a major expansion of their Oracle server, named 'orion'. Here we see orion's controllers, hard drives, and their interrelationships, exposed.
|
The goal was to expand the logical volume containing the Oracle data partitions so that Oracle could store more data. However, there was a problem: Hewlett Packard did not support logical volumes over a certain size - I think perhaps 4 GB - and Wells Fargo wanted to create a partition that exceeded the size supported by HP's LVM graphical user interface. The only way to accomplish the task was to build the new LVM partition manually, one extent at a time, using command-line tools. But each extent took about fifteen minutes to complete; we did the math and realized that building the new partition this way would take around 48 hours. My proposal was that we write a program to build the entire LVM infrastructure automatically, over the weekend; redirect the output to a logfile so that we could monitor progress; and have the script keep its state in temporary files, so that if we needed to stop it to fix something, we could restart it exactly where it had stopped - i.e., I designed my program to be re-entrant.
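The re-entrant pattern itself is simple: record each completed step in a state file, and on startup skip any step already recorded. A sketch of the skeleton, with the actual LVM command replaced by an echo (in the real script, the HP-UX extent-building command would take its place):

```shell
#!/bin/sh
# Build N extents one at a time, recording each completed extent in
# a state file so an interrupted run can resume where it left off.
build_extents() {
    total=$1
    state=$2
    touch "$state"
    i=1
    while [ "$i" -le "$total" ]; do
        # grep -qx: skip extents already recorded as complete
        if ! grep -qx "$i" "$state"; then
            echo "building extent $i"   # the real LVM command goes here
            echo "$i" >> "$state"
        fi
        i=$((i + 1))
    done
}
```

Because every completed extent is on disk before the next one starts, killing the script at any point loses at most the extent in flight; the next run picks up exactly there.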
|
The program was designed, tested, and used to build the new LVM infrastructure, and it worked perfectly. Doing this was not easy! The physical machines were all in a data center in Sacramento. I was not allowed to become 'root', yet every single command I was preparing to run, to build the LVM, needed to be run as 'root'. So there was no real way to test the software: I had to write and test a program that someone else would run. Hence the re-entrant architecture. I received absolutely no recognition for the stellar job I had done designing and implementing a solution to a thorny problem that pushed the boundaries of HP's LVM software, of course. I was just a contractor, doing my job.
1996: Rosenberg Capital Management (RCM)
After Dave Pinho (who had hired me) left Sybase to run Ops at Silicon Graphics, the new manager Sybase replaced him with - Al Montemayor - and I did not get along. Several coworkers left Sybase; I joined the herd. I next ended up working for Rosenberg Capital Management, or RCM. I reported directly to Cathy Scharf - whose husband, I think, is now CEO and President of Wells Fargo Bank. Cathy Scharf assigned me to work on the Trade AFFirmation software, known in-house as 'TAFF'.
|
TAFF was a system that accumulated the purchases and sales of stocks and bonds, bundling them together and transmitting them to a brokerage, where they were executed. It operated twenty-four hours a day, seven days a week. Sometimes TAFF would mysteriously come to a stop. When this happened, someone in RCM's Information Technology department had to be paged, had to interrupt whatever they were doing - usually, sleeping - and log in and restart the system. This pager duty came with a premium for off-hours work and was a welcome source of additional income to RCM's pampered IT personnel - whose workspaces, we might add, were partitioned with panels made from rare woods imported from Sweden. Cathy Scharf asked me to fix TAFF so that people did not need to be paged in the middle of the night to restart it.
|
My solution was to store the process ID inside a lockfile, so that when the script was called, the first thing it did was check whether the process ID listed in the lockfile existed in the process table - if it did not, then the process which had created the lockfile was dead, the lockfile could be deleted, and a new process started. Cathy Scharf loved the work I did, but the IT employees who had come to depend upon the income from carrying a pager off-hours, as a rotating side gig restarting TAFF, were not so happy; and friction ensued.
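The lockfile pattern is a standard one and still useful today. A minimal sketch, with the actual restart command reduced to an echo (the function and file names are mine, not TAFF's):

```shell
#!/bin/sh
# Return 0 if the process recorded in the lockfile is still alive.
holder_alive() {
    lock=$1
    [ -f "$lock" ] || return 1
    pid=$(cat "$lock")
    # kill -0 sends no signal; it only tests whether the PID exists
    kill -0 "$pid" 2>/dev/null
}

# Watchdog entry point: if the recorded process is dead, clear the
# stale lock, record our own PID, and restart the service.
ensure_running() {
    lock=$1
    if ! holder_alive "$lock"; then
        rm -f "$lock"
        echo "$$" > "$lock"
        echo "restarted"        # the real restart command goes here
    fi
}
```

Run from cron every few minutes, a watchdog like this restarts a dead service automatically - which is exactly what removed the need to page a human at 3 a.m.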
A recruiter contacted me about RadioMail and I jumped at the chance to work on cutting-edge wireless technology. RadioMail allowed people to communicate with their offices remotely, wherever they were. The company had grown out of an earlier company formed around the operation of an electronic mail gateway that interconnected the ccMail universe with the Internet, so that the two worlds could talk to one another.
|
Unfortunately, the ccMail side of the business had been allowed to rot and fester while the RadioMail side - with wealthy customers like Bill Joy, of Sun Microsystems - claimed all the attention from management. Being a diligent fellow, I discovered that our ccMail customers were being ignored, and I made an issue of it when I found that a shipment of medicine had been left sitting on an African airstrip, unclaimed by the missionaries to whom it had been sent, because RadioMail was not delivering the mail the company had contracted to deliver, and the missionaries had not received timely notice of the shipment. My manager - he told me he was Paul Vixie's neighbor, in Redwood City, and that was how he'd gotten the job; that his degree was actually in civil engineering - then instructed me to read not the body of each email, but only the header, so that I would not be bothered by abandoned medical supplies - because the medical supplies were not my responsibility. In retrospect, it seems the only consequence of my identifying this state of affairs and trying to correct it was that my employment was terminated.
In 1991 I applied for a position at Oracle Corporation as a systems administrator, and I was snapped up immediately - during a hiring freeze. While at Oracle I received extensive training in database design and in systems administration of diverse operating systems. However, Oracle was extremely polarized by the huge bonuses, and the politics became intolerable.

One of my areas of expertise at Oracle was reading and writing magnetic tapes. I was an expert at using dd(1), mt(1), dump(8), restore(8), tar(1) and other UNIX archival utilities to extract usable files from tapes produced on unconventional computers and operating systems. Employees involved in customer support were often asking the Data Center for help in reading and writing data to and from tapes. Someone - it might even have been me - suggested setting up some publicly accessible tape drives in the Data Center, a sort of self-service tape drive kiosk, so that users could read and write their own tapes without assistance from computer operators. But even after this was done, users still needed help composing the commands necessary to extract data from, or write data to, a tape in a way that was guaranteed to be usable by the customer at the other end. So I ended up writing a small instructional manual for users on the theory and practice of cutting tapes. I might even have titled it 'On The Theory & Practice Of Cutting Tapes'.

A few years later, when I started working at ASK/Ingres, another database company, located in Alameda, I met the person I was replacing - Dan Dick - and Dan told me he was leaving to work at Oracle, in the Data Center. I told him that we were trading jobs, and we stayed in touch. A few months later Dan sent me an email telling me that he had encountered my tape manual in the Data Center, that it was still in use, and that it still listed me as the author.
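The manual itself is long gone, but its recipes were of this general shape; here a plain file stands in for a tape device such as /dev/rmt8, so the commands can be tried anywhere, and the filenames and blocking factor are illustrative:

```shell
#!/bin/sh
TAPE=/tmp/fake_tape.img          # stand-in for a real tape device

# Cutting a tape: write a tar archive with an explicit blocking factor
# (20 x 512-byte blocks here), so the recipient knows how to read it back.
echo "create table t (x number);" > /tmp/schema.sql
tar cbf 20 "$TAPE" -C /tmp schema.sql

# Reading an unknown tape: copy it off with dd(1) using a generous block
# size, then probe the copy with tar, dump, or cpio until one of them bites.
dd if="$TAPE" of=/tmp/tape_copy.img bs=10k 2>/dev/null
tar tbf 20 /tmp/tape_copy.img    # list contents without extracting
```

Copying the raw tape to a file first was the safe move: a tape can only be rewound and re-read so many times, but a disk image can be probed indefinitely.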
Many of the servers that we administered in the Data Center provided home directories for Oracle employees - often hundreds of users per server. When one of these home directory servers was offline, hundreds of Oracle employees could not work. One did not need to shut down a computer to render hundreds of people idle, however; one only needed to fill up the shared disk partition in which the hundreds of home directories co-resided. All it took was one user copying one large dataset or SQL dump to their home directory, and then forgetting to delete it ... then doing it again ... and again ... until, one day, everything came to a halt, and frantic calls to the Data Center demanded that we find and fix the problem NOW. My method for fixing the problem was to run a utility to measure the disk use of each user, in kilobytes, sort the results to see who was using the most storage, then contact them, or their manager, and ask them to clean things up and not do it again. Cooperation was essential. I found myself doing this so often that I wrote a program to automatically check the free disk space in the home directory partition and, if it crossed a threshold, identify the problematic home directories and send their owners electronic mail politely asking them to clean up before the partition filled. Our boss, Patricia McElroy, who managed the UNIX sysadmins, admiringly described it as "solving a human problem with technology". (This was a decade before "proactive" was a buzzword, by the way.) The result: no more home directory filesystems filling up, and no more hundreds of idle employees. I saved Oracle Corporation hundreds of thousands of dollars in lost productivity - probably much more. Pretty cool, huh? The odds are pretty good that my script, or a derivative of it, is still in use, thirty years later. Because nothing, really, has changed.
People still have home directories. The home directories are still concentrated on larger machines so that they can be backed up. People still fill up partitions - nowadays with digitized videos rather than datasets. But the principle is the same: find the biggest files and delete them, or move them elsewhere. And the solution always involves eliciting cooperation.
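The measure-sort-notify core of such a watchdog is easy to reconstruct in outline. In this sketch the directory names are invented, mail(1) is replaced by echo, and the df(1) threshold check that gated the original is noted in a comment rather than run:

```shell
#!/bin/sh
# report_hogs: list the largest home directories under $1, biggest first,
# so their owners can be asked to clean up. The original ran from cron,
# fired only when df(1) showed free space below a threshold, and mailed
# each offender instead of printing.
report_hogs() {
    du -sk "$1"/* 2>/dev/null | sort -rn | head -5 |
    while read kb dir; do
        echo "NOTICE: $dir is using ${kb} KB; please clean up."
    done
}

# Demonstration on a throwaway tree:
mkdir -p /tmp/homedemo/alice /tmp/homedemo/bob
dd if=/dev/zero of=/tmp/homedemo/alice/dump.sql bs=1024 count=200 2>/dev/null
dd if=/dev/zero of=/tmp/homedemo/bob/notes.txt  bs=1024 count=10  2>/dev/null
report_hogs /tmp/homedemo
```

The `du -sk | sort -rn` pipeline is the whole trick: per-directory usage in kilobytes, numerically sorted, largest first. Everything after that is diplomacy.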
1989-1991: VA Medical Center, San Francisco (SFVAMC)
By 1990 I was dabbling with starting my own business. Another friend from GE Calma, Walter Karshat, referred me to an opening for a systems manager at the VA Medical Center in San Francisco, where he was doing graphics programming for a medical research project. I ended up working part-time for the VA's Department of Radiology.
Dr Hedgecock, who was overseeing the Radiology Department, arranged for me to work part time in the Department of Microbiology as well, so that between the two jobs there was something approximating full-time employment. Dr Hedgecock's research was into digitally acquired medical imagery and the application of heuristics to automatically diagnose pathologies. The goal was not some sort of Jetsons-like robotic doctor so much as the accumulation and concentration of decades of medical and image-processing expertise, to provide what I would call decision support for the hospital personnel, who remained ultimately responsible. AT&T had partnered with Philips to create what may have been the world's first digital X-ray system. Philips provided the medical image acquisition technology - X-ray, computer-aided tomography, and magnetic resonance imaging systems - and delivered a digitized stream of data, which fed into an AT&T Commview switch. The switch preserved the images on a huge laser disc. The images were then shared from an AT&T server, running System V, via FTP. The Radiology Department was only wired for AppleTalk, because everyone used Macintoshes. So one of my first tasks was to cable the computer room and the research office together via Ethernet so the images could be transferred to the Sun workstation with the Pixar video processing card. My second task was to install an AppleTalk-to-Ethernet gateway so that the other offices in the Radiology wing could also reach the Ethernet. Even with Ethernet, it took a good five or ten minutes to download each medical image.
So my next task was to write a command-line utility that could take a patient's name, a patient's ID, or a date, and recursively identify and download all matching images, so that researchers needed to type only one command; the matching images would be downloaded automatically, and an email would be sent when the batch was done, so that they could do other things instead of being tied up for hours manually downloading huge files. Supporting medical research was very interesting, but the Veterans' Administration Hospital was an incredibly depressing place to work; almost everyone you saw was missing a limb or three, and nobody looked very happy. The Veterans' Administration itself was a terrible employer - during my time there I watched as the VA's management decided that everyone who was not a VA employee would have to park offsite, and began charging its own employees for parking at their own job.
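The matching logic of that utility is simple to sketch. Here a local directory tree stands in for the Commview server's FTP hierarchy, the patient names and paths are invented, and the actual transfer and the completion email are reduced to an echo:

```shell
#!/bin/sh
# fetch_matching: find every image whose path matches the given patient
# name, ID, or date, and fetch each one. "Fetch" here is an echo; the
# real tool drove ftp(1) and sent mail when the whole batch finished.
fetch_matching() {    # usage: fetch_matching <tree> <pattern>
    find "$1" -type f -path "*$2*" | while read img; do
        echo "fetching $img"
    done
}

# Fake acquisition tree, dated directories holding per-patient images:
mkdir -p /tmp/commview/19910212
touch /tmp/commview/19910212/SMITH_1234_chest.img
touch /tmp/commview/19910212/JONES_5678_skull.img

fetch_matching /tmp/commview SMITH        # by patient name
fetch_matching /tmp/commview 19910212     # by date: matches both images
```

Matching on the full path rather than the basename is what makes one pattern serve for names, IDs, and dates alike.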
It was while I was at AMPEX that the Great Internet Worm became a topic of discussion amongst networking professionals. Here we see my original copy of the reports published by Gene Spafford: ... and Rutgers University, analyzing what went wrong and how to avoid another such event.
Probably the most interesting project I was involved in while at AMPEX was VSD's request that I mirror their server's hard drive. We met, at AMPEX, with the founder of a company called Ciproco, which was just starting to make RAID controllers for Sun computers, but they were very expensive - I think the gentleman wanted $5000 for one controller, and we couldn't afford it; although, to be fair, the card would probably have been hand-assembled and hand-programmed by the founder himself - I don't think he had an assembly line set up yet. We determined that copying the hard drive once a day, so that we would never lose more than 24 hours of work, would be sufficient. My guess, looking back, is that two days' work was worth more than $5000 but one day's work was not, even after buying an extra hard drive - hence, buying a spare drive and doing a roll-your-own mirror with 24 hours' latency was more cost-effective than buying the RAID card, which could always be bought later. I then cobbled together a script that executed during the single-user phase of the operating system's boot cycle, so that every time the machine rebooted it would pause, preen its filesystems, make brand-new filesystems on the target hard drive, mount them on a scratch directory, and invoke dump(8) and restore(8) piped together - the output of dump(8) feeding the input of restore(8) - thus making a pristine copy of each filesystem on the source hard drive, every night at around 0200, seven days a week. Last but not least, the script would rewrite /etc/fstab so that the mirror drive would boot, if used.
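dump(8) and restore(8) need raw disk devices, so a runnable sketch has to substitute; the pipe pattern itself - serialize one filesystem to stdout, reconstruct it from stdin on a freshly emptied target - carries over directly to tar(1), with invented directory paths standing in for the two drives:

```shell
#!/bin/sh
SRC=/tmp/mirror_src        # stands in for the source drive's filesystem
DST=/tmp/mirror_dst        # stands in for the freshly made mirror

mkdir -p "$SRC" "$DST"
echo "chart frame 001" > "$SRC/frame001.dat"

rm -rf "$DST"/*            # the "make brand-new filesystems" step
# The original piped dump into restore; tar | tar shows the same idea:
# one process serializes the tree, the other rebuilds it as it arrives.
( cd "$SRC" && tar cf - . ) | ( cd "$DST" && tar xf - )
# The real script then rewrote /etc/fstab on the mirror so that the
# spare drive could boot standalone if it was ever promoted.
```

The appeal of the pipe is that no intermediate archive file ever exists: the copy needs no free space beyond the target drive itself.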
Of course, I also made conventional disk-to-tape backups. So I suppose I can say that I have been building and operating high-availability infrastructures since 1989. AMPEX was also where I was introduced to software that could show our network's traffic in real time - Ethernet frames, IP packets, and application-level data, too. We brought in a gentleman by the name of Bill Pachoud, of the Root Group, to inventory our network traffic. He installed some probes, gathered some data, performed some analyses, and introduced me to the idea of modelling a network using statistical analysis. Working with Bill for a week was more educational than sitting in a classroom for a whole year.
1987-1988: Network Equipment Technologies
After GE Calma laid everyone off and relocated the project to San Diego, I followed my former manager, Tim Radzykewycz, to Network Equipment Technologies, where I became N.E.T.'s first UNIX systems administrator. Network Equipment Technologies made routers for T1 traffic, and so the company stood at the front of the burgeoning networking economy.
Here we see what is probably the world's first network management system, long before Cisco dominated networking. Note that the software is running on a Sun Microsystems workstation, probably a model 3/50 with a 68020 processor and 4 to 8 MB of RAM. Tim told me later that after I left Network Equipment Technologies, Barbara Hooker, our VP of MIS, had to hire five people to fill the gap I left. That reminds me of a story ... One of those five people was a man named Bjorn Satdeva. I actually interviewed Bjorn, and I read his resume. His background could be summarized as having managed a small network of maybe 30 Sun workstations at Stanford University. Network Equipment Technologies, however, had an infrastructure over ten times that size - 300+ workstations and a dozen Sun 3/180 and 3/280 servers, every single one of them installed by me. I managed the entire infrastructure from my cubicle. You see, my customers at Network Equipment Technologies were all low-level systems programmers. Many of them knew that the Sun Microsystems workstation sitting on their desk could be halted with the correct combination of keys - like CTRL-ALT-DEL for Microsoft Windows, but a decade earlier - and that, once halted, the machine could be booted into single-user mode, at which point the root password could be changed and the configuration modified so that I, the legitimate administrator, could no longer log in unless I came to their physical office and did the same thing: halted the machine, brought it back up into single-user mode, and returned everything to normal. Quickly tiring of this thankless and confrontational task of resetting workstations, I automated the process by writing a program that periodically compared the host computer's configuration files, in /etc, with known good copies, kept in /.../etc, and sent me the differences via email.
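A reconstruction of that comparison loop might look like the following, with invented directory names standing in for the hidden /.../etc copies and with mail(1) reduced to echo:

```shell
#!/bin/sh
# check_drift: diff each known-good file against its live counterpart
# and report any difference. The original ran from cron on every host
# and mailed the diff output to the administrator.
check_drift() {    # usage: check_drift <good-copies-dir> <live-etc-dir>
    for f in "$1"/*; do
        name=$(basename "$f")
        if ! diff "$f" "$2/$name" >/dev/null 2>&1; then
            echo "DRIFT in $name"
        fi
    done
}

# Demonstration: one tampered file, one untouched file.
mkdir -p /tmp/goodetc /tmp/liveetc
printf 'root:GOODHASH:0:0::/:/bin/sh\n' > /tmp/goodetc/passwd
printf 'root:TAMPERED:0:0::/:/bin/sh\n' > /tmp/liveetc/passwd
printf '127.0.0.1 localhost\n' > /tmp/goodetc/hosts
cp /tmp/goodetc/hosts /tmp/liveetc/hosts    # unchanged: no report

check_drift /tmp/goodetc /tmp/liveetc
```

A production version would re-run the full diff into the mail body, so the report says not just which file drifted but exactly which lines changed.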
I would see what had changed - usually the root password, and sometimes the token at the end of the /etc/passwd file that enabled extended user lookup in the YP/NIS tables - saunter into their office, display an uncanny and intimate knowledge of their recent actions, explain why they must not do that, persuade them to change it back, and make a friend, all in one move. I had the job completely wired. I had prebuilt diskless (ND, or Network Disk) images for the Sun 3/50s, and I could unbox a machine, assemble it on someone's desktop, plug it into the Ethernet, go back to my workstation, build an image for the new machine on a host Sun 3/180 server in the computer room, walk back to the new machine, and see it booting to completion - in less than fifteen minutes, every time. After I left Network Equipment Technologies, Bjorn Satdeva found my remote reporting infrastructure - the /... directory must have caught his eye, in the crontab - and decided it would make a great topic for the next LISA, or Large Installation Systems Administration, meeting in Monterey that November (which may actually have been the organization's very first meeting). I was sitting in the audience when Bjorn Satdeva presented my work as his own. I did not call Bjorn out for misappropriating my work and publishing it as his own. Like many or even most computer people, I am just a humble INTJ, inclined towards introspective behavior. I despise drama. I didn't want to be thrown out of the conference. So I said nothing. As it turns out, I do not think that Bjorn Satdeva ever actually published any software of his own. I think he may have partnered with someone on something once - perhaps he had learned a lesson from his last experience - but we do not see any more fountains of creativity or productivity coming from Bjorn Satdeva. Just sayin'.
With 30 years of perspective to guide us, I think everyone can now see for themselves that Bjorn Satdeva could not possibly have been the author of software that scaled to 300+ workstations and a machine room full of servers, coming as he did from a background of perhaps a year or two managing maybe 30 desktop machines and no file servers - which was obvious to me at the time, but not so obvious to the people I complained to about the situation. It is my understanding that Bjorn parlayed his expertise, via his LISA presentation in Monterey, into getting himself elected to the leadership of the USENIX Association - after which followed some sort of drama, which included the USENIX Association's headquarters being burgled and all the copies of the USENIX journal being stolen from the organization's library - or maybe it was just the copy containing the scripts that Bjorn had published as his own work ... but Bjorn was out of the leadership, and he never clawed his way back. At one point I had been interested in joining the USENIX Association, but the experience at LISA had left me viewing the organization as elitist. People who claimed to be my peers had found Bjorn's claim to authority convincing because he had a college degree; they had found my claim to authority unconvincing because I did not. Clearly they did not regard me as a peer; but they did regard Bjorn as one of their own. As far as I was concerned, they could have him. I mean, really, where did you go to study how to be a UNIX systems administrator, in 1986, other than maybe MIT or CalTech ... and who issues a bachelor's degree in systems administration, anyway? Network Equipment Technologies was also my introduction to databases. As NET's business grew, it was decided to invest in a 24x7 help desk for customers: TAC, the Technical Assistance Center, reachable via an 800 number.
To track problems, Oracle was selected, and two Oracle consultants joined our IT department, on loan from Oracle. TAC required three servers - we named them 'tic', 'tac', and 'toe'. One of those servers hosted Oracle v4, which at that point in Oracle's history still bore many marks of its recent port from VAX/VMS. Oracle was a resource pig. We didn't get along. Most of the overtime I racked up at night and on weekends while working at NET came from frantic calls from TAC complaining that the database was down, followed by my remotely logging into the TAC servers and using ipcs(1) and ipcrm(1) to query the state of the System V semaphore infrastructure, remove the orphaned semaphores, and then restart the database. Oracle was notorious for not really understanding how to properly start its own executables, running as processes owned by the user 'oracle', at boot time. I am probably the one responsible for helping Oracle figure that out at NET; and I did it again, at Oracle, a few years later. Network Equipment Technologies then went through an explosion of growth and hired a new crop of executives who laid off all the people I reported to, so I ended up leaving NET and going across the freeway to AMPEX R&D.
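For the curious, those late-night recovery sessions amounted to something like the following. The column layout here follows modern Linux ipcs; the owner and semid column positions varied on old System V, so treat the awk fields as an assumption:

```shell
#!/bin/sh
# orphaned_sems: list the ids of System V semaphore sets owned by the
# given user. After a database crash, each id would be removed with
# ipcrm(1) before restarting the 'oracle' processes.
orphaned_sems() {    # usage: orphaned_sems <owner>
    ipcs -s 2>/dev/null | awk -v owner="$1" '$3 == owner { print $2 }'
}

for semid in $(orphaned_sems oracle); do
    echo "would run: ipcrm -s $semid"    # echo for safety in this sketch
done
```

Removing the stale semaphores mattered because the crashed server processes left them allocated; until they were reclaimed, a fresh database instance could not acquire the IPC resources it needed to start.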