A friend asked, “I would also be curious to know how you got into math and CS. Are you just a natural born computer nerd, or did your parents or the environment influence you in that direction?”
I don’t know about “natural-born nerd”, but maybe….
My father was a musician and recording engineer, but he disappeared before I was 6. My mother came from an intellectual family, was a consummate organizer, and became a historian of nuclear weapons and technology. In my early teens I was a polymath, and deeply resented having to choose between classics (Greek and Latin) and science. I was also politically active, as were many members of my extended family, and so I became interested in economics.
At age 16, after taking my GCE O Levels, I had to pick the subjects I would study in 6th Form. I wanted to include both a science and economics, and the cluster that worked in the schedule was maths, physics, and economics. At this time, I was also a chess player and captain of the school bridge team, so I had a very analytical style. Our physics course had a distinctly experimental bent, and maths included topics up to partial differential equations and linear algebra. With these influences, I approached economics from a distinctly econometric perspective. Supply and demand curves were fine as visual representations of behaviour, but I wanted to understand whether they were mathematical models that could be validated with data. And my interest in physics prompted me to study the dynamic behaviour and stability of the systems the models described.
At this point it is necessary to understand how university admissions worked in the UK in 1968.
At the beginning of 1968 I took my “mock” A Levels, in preparation for the exams in June, and started applying for university places. Of my (fairly elite) cohort, about half were planning to stay at school for a few extra months to take Oxford and Cambridge (“Oxbridge”) application exams, with the goal of starting in September 1969. Those who weren’t trying for Oxbridge were applying to regular universities for admission in September 1968.
My teachers assumed that I would be going to Oxbridge, but I was a bit of a rebel. I had decided that I wanted to focus on econometrics, and I wanted to study with the world’s expert on the subject: Richard Lipsey. I’d read his book, and it was fantastic. And Lipsey was going to be teaching at Essex, a relatively new (and notoriously radical) university. So I decided not to follow my teachers’ advice, while still staying in sync with my Oxbridge friends. I applied to Essex for delayed admission in September 1969, and I took a year off between Grammar School and University. These days this pattern is called a “gap year”, but in 1968 it was unheard of. In the summer of 1968, I took my A Levels, got decent grades (A in economics, A2 in math/advanced math, C in physics) and was accepted by Essex.
So what to do for my gap year? My mother was working at the UK Atomic Energy Authority (UKAEA), and a good friend and colleague of hers, Ken Binning, had just transferred into a new team at the Atomic Energy Research Establishment (AERE) at Harwell. The team was called the Programmes Analysis Unit (PAU), and their job was to assess (and if possible quantify) the economic impact of government-funded research and development programmes. When my mother told Ken of my plans, he suggested I spend a year as a (paid) intern with the PAU, taking care of the number-crunching. Nepotism? Sure, but pretty innocent stuff. I moved to a staff hostel in Abingdon, and started work. (Harwell was an ex-RAF bomber base, so accommodation was military-style.)
It’s worth mentioning at this point that I had absolutely no experience of computing in any form. Our exposure at school was limited to a lunchtime talk by an RGS “Old Boy” who was working at (I think) Ferranti; the only thing I remember from it was that he passed around a core memory module. None of my maths or economics texts mentioned computers.
Economists had already identified the S-curve of technology adoption that we’re all familiar with today. What we didn’t know was what kind of S curve to use. The senior mathematician for whom I worked (I’ve forgotten most of the names) hypothesized that there were three plausible curves: the sigmoid (symmetrical, horizontal asymptote), the Gompertz (like the sigmoid, but asymptotic to a rising line, to represent exogenous growth), and a modified exponential (where the portion before the point of inflection was irrelevant). We had time series data from a number of industries and technologies. My job was to find out which family of curves fit the data best, which meant finding the parameters and degree of fit for each curve for each dataset.
For a week, I struggled to come up with a workable methodology. I could graph the data, draw a reasonable curve by eye, and then estimate the corresponding parameters, but that was clearly inadequate. Then another colleague suggested that instead of working with the equations of the basic curves, it might be easier to work with a linear transformation of each. This meant that I could graph the transformed data values and then fit a straight line through them. And I had used least-squares line fitting techniques back in the physics lab….
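For readers who want to see the linearization idea in modern terms, here is a minimal Python sketch. It is not the original work: the specific transform is an assumption, since the actual equations are not recorded above. It takes a symmetric sigmoid with a known ceiling, y = ceiling / (1 + exp(-(a + b*t))), and uses the fact that ln(y / (ceiling - y)) = a + b*t is a straight line in t. The names `fit_line`, `fit_logistic`, and `ceiling`, and the data, are all made up for illustration.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares line y = a + b*x. Returns (a, b, sse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope
    a = mean_y - b * mean_x            # intercept
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse

def fit_logistic(ts, ys, ceiling):
    """Fit a sigmoid with an assumed ceiling by fitting a line to the logit."""
    # Transform each observation so the sigmoid becomes a straight line.
    zs = [math.log(y / (ceiling - y)) for y in ys]
    return fit_line(ts, zs)

# Synthetic (invented) adoption data generated from a = -3.0, b = 0.8.
ts = list(range(10))
ys = [100.0 / (1.0 + math.exp(-(-3.0 + 0.8 * t))) for t in ts]

a, b, sse = fit_logistic(ts, ys, ceiling=100.0)
print(f"a = {a:.4f}, b = {b:.4f}, residual = {sse:.3e}")
```

The residual returned here is measured in the transformed space, which is what makes it cheap to compare one candidate ceiling against another.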
Now I needed a way of iterating on the parameters of each curve, so that I could find the parameterized curve that fit best (i.e. had the least residual error). I started with a simple trisection model, iterating until it converged.
By hand. Pencil and paper.
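In today's terms, the trisection loop can be sketched roughly as follows. This is an illustration under assumptions, not a reconstruction of the pencil-and-paper procedure: it assumes the residual has a single minimum between the two starting bounds, and the `residual` function is a hypothetical stand-in for "fit the curve with this parameter value and report the error".

```python
def trisection_minimize(f, lo, hi, tol=1e-6):
    """Shrink [lo, hi] around the minimum of a single-minimum function f."""
    while hi - lo > tol:
        third = (hi - lo) / 3.0
        left, right = lo + third, hi - third
        if f(left) < f(right):
            hi = right   # the minimum cannot lie in the right-hand third
        else:
            lo = left    # the minimum cannot lie in the left-hand third
    return (lo + hi) / 2.0

def residual(p):
    """Hypothetical residual error as a function of one curve parameter."""
    return (p - 2.5) ** 2 + 1.0

best = trisection_minimize(residual, 0.0, 10.0)
print(f"best parameter is approximately {best:.5f}")
```

Each pass discards one third of the interval at the cost of two evaluations of the residual, which goes some way to explaining why doing it by hand wore thin so fast.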
OK, that got old really quickly (in about a week). I asked my boss if he had any ideas, and the very next day a Wang Model 370 Programming Keyboard and Model 371 Card Reader appeared on my desk. This device deserves its own blog post. (It probably cost more than my wages for the whole year!) The Wang allowed me to write a “program” of up to 80 steps, including conditional branches. Programmes (British spelling!) were encoded on punched cards, using a stylus to push out the chads. I spent about a month getting familiar with the unit and using it to assist in my pencil and paper work. And then I tried a simple iterative convergence program.
The first model converged in two minutes. The second, more complex one took an hour. The third ran for twenty hours. The fourth failed to converge even after running over a weekend….
My approach was working, but I needed more “calculator” power. I asked my boss for suggestions. We went to the department library, where he handed me a copy of “McCracken on Fortran”; then he showed me the way to the building next door which housed the Harwell Computer Centre and its IBM 360/65. I went in, applied for an account, and was led to a room full of IBM 029 card punches. (No, I didn’t know how to type.)
I’m not absolutely sure, but I think I completed my first successful “production run” of the curve-fitting application within two or three weeks of this introduction. And then, like any other hacker, I started tweaking things. I noticed that if I iterated on the parameters in a particular way the linear transform would give me a residual fit curve that had a nice single minimum. So I was able to replace the simple trisection mechanism with something faster: dropping a parabola through three points and using its minimum as the next candidate. Next, my boss wanted to be able to present the findings to the department, so I tried drawing graphs on the line printer. That looked clunky, and one of the operators at the Computer Centre saw my printouts and told me about a COM (computer output to microfilm) system that had just been installed. This was the only kind of graphical output device that I had access to, but it allowed me to produce some nice slides. (The big plotters were reserved for engineering blueprints.)
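The parabola trick is what is now usually called successive parabolic interpolation. Here is a minimal Python sketch of it, written from the description above and not from the original Fortran. It assumes the three starting points surround a single minimum, and the function names and the `residual` stand-in are invented for the example.

```python
def parabola_vertex(p1, p2, p3):
    """x-coordinate of the vertex of the parabola through three (x, f) points."""
    (x1, f1), (x2, f2), (x3, f3) = p1, p2, p3
    num = (x2 - x1) ** 2 * (f2 - f3) - (x2 - x3) ** 2 * (f2 - f1)
    den = (x2 - x1) * (f2 - f3) - (x2 - x3) * (f2 - f1)
    if den == 0.0:
        return None                    # collinear points: no parabola to use
    return x2 - 0.5 * num / den

def parabolic_minimize(f, x1, x2, x3, tol=1e-8, max_iter=50):
    """Repeatedly replace the worst of three points with the parabola's vertex."""
    points = [(x, f(x)) for x in (x1, x2, x3)]
    for _ in range(max_iter):
        best_x = min(points, key=lambda p: p[1])[0]
        candidate = parabola_vertex(*points)
        if candidate is None or abs(candidate - best_x) < tol:
            break                      # no usable step, or no further progress
        points.remove(max(points, key=lambda p: p[1]))   # drop the worst point
        points.append((candidate, f(candidate)))
    return min(points, key=lambda p: p[1])[0]

def residual(p):
    """Hypothetical residual error with a single minimum at p = 2.5."""
    return (p - 2.5) ** 2

best = parabolic_minimize(residual, 0.0, 1.0, 2.0)
print(f"best parameter is approximately {best:.6f}")
```

When the residual really is close to a parabola near its minimum, one or two steps are enough, where trisection needs dozens of evaluations. The sketch has no safeguard for a parabola that opens downward, so it is only suitable for well-behaved curves like the one described.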
This project lasted from September 1968 to about March 1969. After that, I was put in charge of a database (on 7-track tape) that identified every single computer in the United Kingdom (including the classified ones). I used this to generate a series of reports on trends in different industries, reliability (based on explicit and inferred replacement), and so forth. The Fortran programs for these tasks were trivially simple, and I wound up spending most of my time at the Computer Centre hanging out with the operators, systems programmers, and users. I helped several scientists from the Theoretical Physics Division to debug their applications, and it was these connections that allowed me to return to Harwell in 1971 for a “co-op” job. (Coincidentally, one of the systems programmers – a Canadian named Dave Lyons – took a job as a lecturer at Essex University in 1969, and he taught me Operating Systems in my second year.)
Although all of my “official” work was in Fortran IV, I (necessarily) became proficient in OS/360 JCL. I was a bit of a pack rat, and whenever IBM distributed new documentation I would “liberate” the old editions. This gave me the chance to play around with OS/360 and Assembly/360, and I loved it. In fact, a large part of my work from 1969 to 1988 involved assembly language programming, for mainframes, supercomputers, minicomputers, and PCs.
My internship finished in the summer of 1969, and I arrived at Essex University in September.
So how much of this was me? I certainly got some help with the mathematics, especially the linear transformation of the various equations. And the operators at the Computer Centre helped me get started with OS/360 JCL and Fortran – I think one of them gave me a little “hello world” card deck to get going. But the algorithms and programming were all mine. So yeah, “natural born computer nerd” fits.