This post is part of a series. For more information and links to other posts in the series, see the “My hi-tech adventure… original” home page.
This chapter is about my experiences at IBM.
“n+1 trivial tasks are expected to be accomplished in the same time as n trivial tasks.” —Gray’s Law of Programming
IBM San Jose
By 1985 it was becoming clear to me that most of the interesting work on VM at SLAC was over for me. IBM had been steadily improving the VM product and I found myself taking out more local SLAC VM modifications every time we got a new release of the product. I also began to get the itch to work somewhere else and see more of the world. This was my version of the mid-life crisis you have all heard about, and have maybe experienced yourself. I decided to try to get a job at IBM, because at the time IBM was the computer company and I knew many people who worked there, having met them at SHARE meetings and the various joint efforts SLAC had done with IBM Research at Yorktown Heights.
In August 1986, I went to work for the IBM Storage Subsystem Division (SSD) in San Jose. I came in as a Senior Programmer (which was a rare event at the time) and entered the VM Technical Office with Margarita Chieng as my manager. In SSD there was a programming laboratory under Lynn Yates, an IBM vice president. The programming lab consisted of almost 1000 people in San Jose and Tucson, Arizona, and was responsible for providing operating system device support for SSD disk and tape hardware. Most of the people worked on MVS device support. I was in a much smaller group under Don Bussey (later Monte Anglin) providing device support for VM. My knowledge of VM was what got me the job.
At the time I got there, SSD designed and built the lion’s share of disk and tape storage for all computers. A significant share of IBM’s revenues came from selling 3380 disk drives and 3880 disk subsystems built in San Jose. The disk drive assembly line was right there in Building 50 at San Jose. I toured the line soon after I got there. How this picture changed in the years I stayed at IBM! Just as I was about to retire from IBM, IBM sold the remnants of its disk drive business to Hitachi, including the San Jose plant site. In 2001, the disk business actually lost huge amounts of money for IBM.
Our vice president, Lynn Yates, had a ranch in Hollister and wore cowboy boots to work. This had established some kind of work apparel norm, and there were quite a few people walking around the halls in my part of SSD wearing cowboy boots. I wore cowboy boots in the 3rd grade, but not at SSD! Yates also had his own version of salty language, which I used to hear quite a bit in the hallways.
When I arrived, the main activity in the Yates organization was developing device support for the MVS operating system (VM was only a sideline). There were almost 1000 people working on this in San Jose and Tucson, Arizona. They were accustomed to working in a slow and bureaucratic way. At one point they were actually holding pre-pre-plan meetings. (The plan meeting was held twice a month, and in advance they held meetings to prepare for the meeting to prepare for the meeting!) The funny thing was that Yates, our vice president, attended almost all of these meetings. Fortunately I had to attend only once or twice. Some managers liked to attend because it got them executive “exposure.”
Projects I worked on at IBM
VM 3390 (Carmel) device support
My first project when I arrived in August 1986 was to size (estimate) the effort of adding support for the 3390 disk drive (code-named Carmel) to the VM operating system. The 3390 was the successor to the highly profitable 3380 disk drive. The 3390 project had been under way for several years when I arrived (in those days IBM took its time developing products, since it had no real competition). I struggled with this sizing task because I was more used to actually writing the code than just sizing it. I did write a prototype over the 1987 Christmas holiday break that included most of the necessary code.
When I got to IBM their software development process was very slow, complex, and inefficient. We were supposed to produce architecture, high-level design documents, low-level design documents and then finally write and debug the code. This all took far more time, documents, and meetings than I thought was necessary. You could easily spend six months writing development documents and reviewing them without writing a single line of code. Most of the people I met were used to working this way. Coming from a research environment, the culture shock was considerable! At SLAC we just wrote the code, and almost never did any documentation. I did finally produce the sizing and the necessary development documents and the developers added support for the 3390 to VM.
Note: at this time SSD had many meetings and task forces going on trying to figure out why the 3380E disk drive was not making its forecast revenues. This was the first disk IBM produced that had that problem. This was a sign of future major problems for the division that they never really overcame.
One thing that would be needed if SSD was going to sell the new 3390 disk drive to existing VM customers with older 3380 disk drives was a way to move existing data from 3380 to 3390. The geometry of the 3390 was different from that of the 3380: it had longer tracks. Without telling my manager, I began to work on a prototype of a utility that could move VM/CMS file systems (on minidisks) between 3380 and 3390.
At this same time, SSD had a large number of technical people from San Jose, Tucson, and the East Coast who were architecting more powerful storage management for all the IBM operating systems under the code name Jupiter. Once they found out about my prototype datamover utility, I was put to work in software development on a product called DFSMS/VM. I can’t remember what all those letters stood for! This was supposed to be stage one of the Jupiter Project for VM.
The product combined the existing ISMF user interface from MVS with my datamover and server virtual machines to make it possible for customers to easily move their 3380 data to 3390. Mike Dillon led this product development and I supplied the mcopy datamover. Working on the datamover taught me a lot about the details of I/O programming for IBM disk drives. Mcopy used complex channel programs that moved data between drives as fast as the drives could go.
This project gave me my first introduction to IBM divisional rivalries. The VM operating system was developed in a different division of IBM in Endicott and Kingston, NY. Once the people in that division found out what we were doing in San Jose, they didn’t like it and tried to stop us from shipping our product. This was the standard mode of operation inside IBM at the time. We were able to ship the product because there was too much potential hardware revenue involved. There was a saying I learned from this: “Never get in front of the IBM revenue train!”
The product was used by many large IBM customers to move huge amounts of data, and we never had a single data integrity problem reported. I was proud of this. Unfortunately IBM did not make much money from selling DFSMS/VM. It was given to most large customers as part of a 3390 disk purchase.
In SSD there was always a tendency for the hardware people to want to give away software to make it easier to sell the hardware. The argument was that the hardware revenues completely dwarfed anything that could be produced by selling software. To most of SSD management, software revenue was uninteresting. Later on we all learned that software profit margins were generally much larger than those for hardware and that selling software might actually be a good idea. When I retired from IBM, selling storage software and firmware was our main focus.
AIX/ESA disk and tape support
In the late 1980s, IBM was part of an industry group, the Open Software Foundation (OSF). Part of the 370 processor division of IBM, in Kingston, NY, had decided to develop a version of OSF/1 UNIX to run on the 370 mainframe. This would allow IBM to compete with a fairly successful UNIX offering from Amdahl, which was called UTS. My department, managed by Nora Denzel (now an HP vice president), got the mission of providing device support for SSD hardware in this product, which was called AIX/ESA.
Our team was responsible for the disk and tape device drivers for AIX/ESA. There were about a dozen of us in San Jose working remotely with a much larger group of several hundred in Kingston, NY. I was the technical leader of the San Jose group and was responsible for the design and overall technical direction of the device driver development. Later when the disk driver got into trouble, I became a coder responsible for fixing many bugs and actually getting us through system test of AIX/ESA.
I didn’t know it at the time, but this device driver was larger and more complex than most UNIX device drivers written up to that time. That was because the hardware we supported from SSD was very complex and required incredible amounts of error recovery code. While I was on this project I made many trips to Kingston over a period of several years.
From working on this project I learned about team leading, negotiating, UNIX, the C language, and many other things. From my standpoint it was a big success. Unfortunately, the product did not sell well once it was produced in 1989-90. We never got more than 35 paying customers. I was amazed to find out that several hundred people had worked on something for several years only to find out that the customers didn’t like the product and IBM Sales did not have much interest in selling it. In addition, part way through the development of AIX/ESA, the MVS people in Poughkeepsie began working on a rival product called “Open MVS” that allowed existing MVS customers to run some UNIX applications under MVS. This was an example of the corporate immune system at work.
Note: It is interesting that in all my years at SLAC, our budgets were always managed carefully. SLAC wasted very little money. In IBM I saw hundreds of millions of dollars wasted on just some of the products I worked on. So be careful when people start talking about how the “government” wastes money. Industry can be pretty good at it too!
Workplace OS tape support
My team was quickly pulled off of work on AIX/ESA when the failure of the product caused us to lose our funding. At this time, IBM had a large product development effort underway called Workplace OS. As a way to save money on supporting the multiple IBM operating systems, a micro-kernel was going to be developed that would allow multiple operating system personalities to run on top of the micro-kernel (based on the Mach kernel from Carnegie Mellon University). The micro-kernel would provide the low level hardware support for all the personalities.
Three of us at IBM San Jose were asked to write the SSD tape device support for this product. Primary development was in Boca Raton, Florida, which had been the home of the PC OS/2 operating system. For almost a year our team tried to work with the people in Florida. I traveled to Boca Raton several times for meetings. After a while the Boca folks stopped communicating with us. We learned later that this was because Workplace OS was in big trouble. Enough of the micro-kernel code had been written to demonstrate that performance of the operating system was terrible, and nobody could think of a way to fix it. This caused the inevitable schedule slips and the search for the guilty. By this time (around 1992) IBM itself was in big trouble and was losing money like crazy. Most of the Workplace OS developers got laid off after Lou Gerstner began running IBM in 1993.
The Mach kernel did have success in the marketplace. Apple Computer used it as the basis for their Mac OS X operating system for the Macintosh. Today it is also being used by the GNU HURD developers. In IBM, Mach died a horrible death.
ADSM (ADSTAR Distributed Storage Manager) device support
After Workplace OS I became part of a new project to provide SCSI device support for a network backup product called ADSTAR Distributed Storage Manager (ADSM). The name ADSTAR came from the name IBM SSD was supposed to have once it got spun off as a separate business under our new division president, former U.S. Congressman Ed Zschau. The spin-off never happened, and Zschau left IBM. In 1993, Louis Gerstner, the new CEO, decided to keep IBM together. At this writing (2003), ADSM still exists, but is now called Tivoli Storage Manager (TSM).
ADSM needed to be able to use tape and optical libraries to back up client data. There were many companies providing such hardware products, not just IBM SSD. After some false starts, my team wrote three device drivers using common software building blocks I helped design. We were able to support hundreds of IBM and non-IBM devices with this code. This was critical to the success of ADSM, since our competitors (the main one being Legato Networker) already could support that many devices.
Just about the time we got the code finished, most of the development team left IBM (1993). This was because there were plenty of jobs in the industry and IBM had to lay people off to cut expenses. Our developers and managers found out they could make a lot more money outside IBM, so they left. Some people got 50% raises!
For whatever reason, I hung on, but for a while we had only a few developers left. The nice thing, though, was that unlike my previous products, ADSM was a roaring success. Over several years we built the team back up again with new people and I won awards and filed patents for the work we had done. My biggest ADSM achievement was receiving an IBM Corporate Award from Lou Gerstner himself.
SSA disk Windows device driver
By 1997, Lynn Yates had been given responsibility for his existing MVS software plus all hardware firmware in the storage division. During the development of the new Tarpon/Shark disk storage subsystem, an SSD group in Havant, England, had greatly embarrassed Yates by missing critical dates that had been promised to the rest of IBM. To fix this, Yates decided that he wanted to move this firmware and associated device driver development work from England to San Jose. The hardware involved was the Serial Storage Architecture (SSA) disk and its controller cards. These disks and controllers were generating lots of revenue at the time for IBM.
I was put in place as the leader of a small San Jose team that was supposed to take ownership of the Windows device drivers for the SSA controller cards. Our team spent over a year learning how the existing card firmware and Windows device drivers worked. This was all done with only the slightest help from IBM SSD in Hursley (the Havant group had moved there). Like most human beings, the Hursley group was not interested in putting themselves out of business by giving their mission to people in San Jose. This resistance could have been overcome by executive management, but by the end of 1997, Yates was on his way out of IBM (he retired from IBM rather suddenly at the end of the year) and we had no executive backing. The Hursley people knew about this, dug in their heels (at one point it took us in San Jose over 6 weeks to get a look at their code), and waited for us to go away, which we did by early 1998. This year and a half of wasted effort was the most frustrating time I had at IBM.
Fortunately, in mid-1998 a more interesting project came along. The Storage Division had decided to get back into the disk subsystem business (which it had almost completely lost to EMC) by building disk subsystems out of common parts. The first such subsystem was named Tarpon and the next was Shark.
Since these boxes contained huge amounts of customer data, the customers wanted to have redundant hardware paths connecting the subsystem to each server. Without this capability, IBM could not compete with EMC, who already had multiple paths in their hardware and software. My new team was given the job of providing device driver software that would support this environment on the AIX, Windows, Solaris, and HP-UX operating systems. The team initially consisted of me, Limei Shaw, and Cam-Thuy Do. Our manager was Randy George. We called the product Riptide. We all worked very hard for several months and got the product working on Windows and AIX. This time IBM Austin (owner of the AIX operating system) tried to kill our product, but we managed to placate them by making the changes to our AIX code they said they wanted. They weren’t able to kill the product because, again, there was too much IBM revenue at stake. It was a good thing, too, because it was not until 2003 that AIX had similar support in the operating system.
This was the last project I worked on before I retired from IBM. By the time I left, our team had grown to over twenty people and had produced many releases of the product, which we called either Subsystem Device Driver (SDD) or Data Path Optimizer (DPO). DPO was OEMed to other storage vendors.
Since the Shark program was successful for IBM, and because our multi-pathing software was delivered on time and was very reliable, I was promoted to Senior Technical Staff Member (STSM) about a year before I retired from IBM.
IBM business travel and boondoggles
Prior to the corporate near-death of IBM in the early 90s, budgets for business travel were very generous. During these years, especially, I went to many IBM locations to attend meetings. Among the places I visited in the U.S. were:
- Tucson, Arizona
- Endicott, New York
- Kingston, New York
- Poughkeepsie, New York
- Boca Raton, Florida
- Cary, North Carolina
- Hawthorne, New York (also Yorktown Heights)
- Austin, Texas
My most frequent destination was Tucson, where roughly half of the people in my division were located. In addition, I traveled to Kingston and Endicott many times. I was able to gather information for my family history on the trips to Endicott, since I flew out of Syracuse airport which is close to Auburn where my mother’s family, the Moseleys, came from.
I remember one meeting in Kingston, NY, that was attended by a dozen people from San Jose and Tucson. During this time IBM San Jose and IBM Tucson were constantly battling for turf and hence both sites sent a large complement of people to protect their interests in all the meetings on the East Coast.
In addition to frequent regular business travel, I also had the pleasure of traveling on boondoggle trips, one to Thornwood, New York, for a one-week seminar on the future of operating systems and another one-week trip to Mayaguez, Puerto Rico, to help IBM recruit engineering students. In Thornwood we had rooms with built-in computer offices having access to the IBM corporate network. This looked nice at first, but who wants to work day and night? The predictions I heard on operating system futures were useless. Nobody took the challenge from Microsoft seriously at all. Most of the predictions for the future were wrong.
In Puerto Rico my “job” was to give two guest lectures to computer science and engineering classes at the university. So for those two hours of work I got to spend a week touring all around the island. I must confess it was fun. You can get good fish dinners in Puerto Rico!
Although I never worked for IBM Research Division, I did travel to their locations many times, either on airplane trips to Yorktown Heights and Hawthorne or driving up the hill to Almaden Research in San Jose. Until the middle 1990s, IBM Research was the inner sanctum of computer hardware and software research. All their locations had beautiful buildings, and the staff members generally had much nicer equipment than we lowly “development” people. Three of my co-workers from SLAC ended up at Yorktown Heights: Joe Wells, Paul Dantzig, and Mike Penner. The early stages of ADSM product development were a joint project between Almaden Research and SSD San Jose and Tucson.
Awards and patents
IBM has awards programs for technical people that recognize technical achievement. Usually they give these awards to people who played a significant role in a product that has some level of commercial success. I received quite a few of these while I worked at IBM, and used the money to take some nice trips to places like the Soviet Union and Costa Rica. If you want to see one of the patents, look up 5450579 issued 9-12-1995, Method and Apparatus for Error Recovery.
The biggest prize for me was an IBM Corporate award for my work on ADSM. These awards are given out once a year at a big event where IBM pays for your stay in a big city for a whole week. For me the city was San Francisco, which wasn’t too exotic. During the event I got to meet quite a few IBM executives and also attend an awards ceremony with my wife.
The Internet bubble
By 1998, it was clear that an incredible boom was happening in high-tech. The newspapers were full of articles about people going to work at Internet startups and becoming millionaires months later when the company “went public.” Many people all over Silicon Valley were working 60+ hours a week so their company could ship its first product and go public. My wife and I used to hear people talking all the time (even in Starbucks) about stock options, how much money they had, and whether they could retire by the time they were 30. Many people left IBM at this time to go get rich. IBM itself even began to give stock options to non-executives. I got some myself. One lesson Anna and I learned from the bubble times is that the money from stock options is not real until you spend it. Strangely enough, many people in the high-tech industry held on to their stock options instead of cashing them in, and watched them become worthless instead.
IBM was cautious and did not participate very much in the bubble. During the boom, Sun Microsystems and EMC took market share away from IBM at an alarming rate. When the bubble burst, Sun and EMC suffered considerably, but IBM did not at first. However, in the last year I was at IBM they finally did see rapidly falling revenues in various areas of technology, including storage. In early 2002 almost 30% of the people working on the Shark storage subsystem were laid off and the entire development effort was moved to Tucson from San Jose. This looked like the handwriting on the wall to me for the location in San Jose where I worked.
My retirement from IBM
I retired from IBM at the end of July 2002. My last year at IBM was during the largest business downturn I have ever experienced. The computer industry went from being white-hot in 1998-1999 to a wasteland by 2001. I had very little to keep me busy at IBM. We had no travel money, and no money for classes or books. Instead of starting new product development, the executives focused mostly on trimming the cost of the things they already had going. IBM even sold the entire disk drive business to Hitachi. This was a good time for me to make my departure, while my pension was still intact!