One of the things I get a kick out of in the clip of HAL singing Daisy is just how much the physical modules being removed <i>look</i> like hard drives being pulled out of a NAS. I can easily picture a much larger version of my Synology NAS looking just like that.<p>BUT: when I first watched 2001 sometime in the early 1990s, I had only seen a 5.25" hard drive, and never one on sliders or rails, so I thought the inside of HAL was just tacky 1960s sci-fi.<p>It's only later, as I've seen the predictions come true, that I've realized just how forward-looking 2001 is. Like the scene where they watch news reports on tablets at breakfast: it wasn't until I watched a video on my phone in the late 2010s that I realized that prediction in the movie was 100% spot-on.<p>BTW, the 4K Ultra HD Blu-ray of 2001 is awesome.
Are there any detailed descriptions available of how the music and voice were synthesized?<p>Based on the recording, the information I could find, and imagining how I'd try to do the same thing with the technology of the era, I assume the melody is based on single-cycle samples of a piano and of Max Mathews playing the violin. The vocals sound like formant or LPC synthesis, along the lines of the Votrax SC-01 or TI's LPC chips, although of course those chips didn't exist until 15+ years after the original work at Bell Labs. But I'm very curious about the details. Did the team develop a general-purpose sequencer for the melody and/or speech, or were all of the notes, slides, etc. hardcoded? Did the computer actually output all 3+ parts together, or were they separate elements mixed after the fact? I assume the output was not real-time, but it would be a neat surprise if they had achieved that in the 60s. Was it all handled digitally in the computer, or was the computer controlling some add-on hardware, maybe with analogue filters? Etc.
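To make the single-cycle idea concrete, here's roughly what I have in mind, as a modern-day sketch (numpy, with made-up pitches, durations, and waveform; obviously nothing to do with the actual program from the 60s): loop one stored cycle of a waveform at each target pitch for the length of the note.<p><pre><code># Hypothetical sketch of single-cycle wavetable playback -- not the historical
# code, just the general idea: resample one stored cycle to each pitch.
import numpy as np
import wave

SR = 8000  # low sample rate, roughly in the spirit of the era

# Stand-in "single-cycle sample": a few harmonics, vaguely piano-ish.
CYCLE_LEN = 256
t = np.arange(CYCLE_LEN) / CYCLE_LEN
cycle = (np.sin(2 * np.pi * t)
         + 0.4 * np.sin(4 * np.pi * t)
         + 0.2 * np.sin(6 * np.pi * t))

def note(freq_hz, dur_s):
    """Loop the stored cycle at the requested pitch for dur_s seconds."""
    n = int(SR * dur_s)
    # Phase accumulator: step through the stored cycle at the right rate.
    phase = (np.arange(n) * freq_hz * CYCLE_LEN / SR) % CYCLE_LEN
    samples = cycle[phase.astype(int)]
    # Simple linear fade so notes don't click.
    return samples * np.linspace(1.0, 0.0, n)

# Illustrative descending phrase (frequencies in Hz, durations in seconds);
# the real pitches and timing of the recording are not reproduced here.
melody = [(587.33, 0.8), (493.88, 0.8), (392.00, 0.8), (293.66, 1.4)]
audio = np.concatenate([note(f, d) for f, d in melody])
audio = (audio / np.abs(audio).max() * 0.8 * 32767).astype(np.int16)

with wave.open("single_cycle_sketch.wav", "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(SR)
    w.writeframes(audio.tobytes())
</code></pre><p>Running it just writes a short WAV of a descending phrase, so it illustrates the playback idea but leaves all of my questions above open, especially how much of this was tables versus code on the actual machine.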
I inherited a Southern Bell promotional card/record that includes <i>Daisy Bell</i>:<p><a href="https://dbarlett-wordpress.s3.us-east-1.amazonaws.com/wp-content/uploads/2023/05/Computer_Speaks_1-1024x762.jpg" rel="nofollow">https://dbarlett-wordpress.s3.us-east-1.amazonaws.com/wp-con...</a><p><a href="https://dbarlett-wordpress.s3.us-east-1.amazonaws.com/wp-content/uploads/2023/05/Computer_Speaks_2-1024x762.jpg" rel="nofollow">https://dbarlett-wordpress.s3.us-east-1.amazonaws.com/wp-con...</a>
The crowdsourced version of the song is quite chilling: <a href="https://youtu.be/Gz4OTFeE5JY" rel="nofollow">https://youtu.be/Gz4OTFeE5JY</a>
Nice. But why did the video creator feel the need to add fake film-projector effects? The urge people have to add "oldness" where it's already present, just not in the form they imagine, is interesting in itself.
A few years ago I picked up "Music by Computers" [1] from a used book store, and it's fascinating.<p>Published in 1969, it's a collection of papers from the 60s about music and sound processing on the machines of the day, and it goes into a lot more detail, if anybody is interested and can find a copy. It even came with recorded music on five paper-thin flexi-discs that I've never been able to play.<p><a href="https://www.amazon.com/Music-Computers-Heinz-von-Foerster/dp/0471910309" rel="nofollow">https://www.amazon.com/Music-Computers-Heinz-von-Foerster/dp...</a><p><a href="https://en.wikipedia.org/wiki/Flexi_disc" rel="nofollow">https://en.wikipedia.org/wiki/Flexi_disc</a>
Library of Congress essay by Cary O'Dell on "Daisy Bell (Bicycle Built For Two)" from song origins to Bell Labs recording:<p><a href="https://www.loc.gov/static/programs/national-recording-preservation-board/documents/DaisyBell.pdf" rel="nofollow">https://www.loc.gov/static/programs/national-recording-prese...</a>
We had a floppy vinyl 45 of that when I was a kid in the 1960s (my mother was a high school science teacher and we often had cool stuff like that around the house).
A recording of it was included in some electronics hobbyist magazines of that time on shiny black flexible vinyl for playing on a phonograph at 45rpm. I seem to recall that Bell Labs was credited on the label but IBM was not.<p>EDIT: dbarlett just posted an image of the recording's label elsewhere in this thread
The neat thing about this particular singing synthesizer is that it used a surprisingly sophisticated (especially for the 60s) physical model of the human vocal tract [1], and was perhaps the first use of physical modeling sound synthesis. Vowel shapes were obtained through physical measurements of an actual vocal tract via X-rays. In this case, they were Russian vowels, but were close enough for English.<p>While this particular kind of speech synthesis [2] isn't really used anymore, it's still fun to play around with. Pink Trombone [3] is a good example of a fun toy that uses a waveguide physical model, similar to the Kelly-Lochbaum model above. I've adapted some of the DSP in Pink Trombone a few times [4][5][6], and used it in some music [7] and projects [8] of mine.<p>For more in-depth information about specifically doing singing synthesis (as opposed to general speech synthesis) using waveguide physical models, Perry Cook's dissertation [9] is still considered a seminal work. In the early 2000s, there were a handful of follow-ups on physically-based singing synthesis done at CCRMA. Hui-Ling Lu's dissertation [10] on glottal source modelling for singing purposes comes to mind.<p>1: <a href="https://ccrma.stanford.edu/~jos/pasp/Singing_Kelly_Lochbaum_Vocal_Tract.html" rel="nofollow">https://ccrma.stanford.edu/~jos/pasp/Singing_Kelly_Lochbaum_...</a><p>2: <a href="https://en.wikipedia.org/wiki/Articulatory_synthesis" rel="nofollow">https://en.wikipedia.org/wiki/Articulatory_synthesis</a><p>3: <a href="https://dood.al/pinktrombone/" rel="nofollow">https://dood.al/pinktrombone/</a><p>4: <a href="https://pbat.ch/proj/voc/" rel="nofollow">https://pbat.ch/proj/voc/</a><p>5: <a href="https://pbat.ch/sndkit/tract/" rel="nofollow">https://pbat.ch/sndkit/tract/</a><p>6: <a href="https://pbat.ch/sndkit/glottis/" rel="nofollow">https://pbat.ch/sndkit/glottis/</a><p>7: <a href="https://soundcloud.com/patchlore/sets/looptober-2021" rel="nofollow">https://soundcloud.com/patchlore/sets/looptober-2021</a><p>8: <a href="https://pbat.ch/wiki/vocshape/" rel="nofollow">https://pbat.ch/wiki/vocshape/</a><p>9: <a href="https://www.cs.princeton.edu/~prc/SingingSynth.html" rel="nofollow">https://www.cs.princeton.edu/~prc/SingingSynth.html</a><p>10: <a href="https://web.archive.org/web/20080725195347/http://ccrma-www.stanford.edu/~vickylu/thesis/index.html" rel="nofollow">https://web.archive.org/web/20080725195347/http://ccrma-www....</a>
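To make the Kelly-Lochbaum idea a bit more concrete, here is a tiny toy version of the ladder in Python (my own sketch, with made-up section areas, end reflections, and glottal pulse; it's not the Bell Labs code and not Pink Trombone's DSP, just the basic scattering structure):<p><pre><code># Toy Kelly-Lochbaum-style vocal tract: N cylindrical sections, with forward
# and backward travelling waves scattering at each junction according to the
# ratio of neighbouring cross-sectional areas. All numbers here are invented.
import numpy as np
import wave

SR = 22050
N = 8  # tube sections (toy resolution; real models use more, with measured areas)

# Invented area profile, vaguely "ah"-like: narrow at the glottis, wide at the lips.
areas = np.array([0.6, 0.8, 1.0, 1.5, 2.2, 2.8, 3.0, 3.2])
k = (areas[1:] - areas[:-1]) / (areas[1:] + areas[:-1])  # junction reflection coeffs

fwd = np.zeros(N)  # right-going wave in each section
bwd = np.zeros(N)  # left-going wave in each section
out = []

F0 = 110.0  # glottal pitch in Hz
for n in range(SR):  # one second of audio
    # Crude glottal source: half-rectified, squared sine, once per period.
    phase = (n * F0 / SR) % 1.0
    glottal = max(0.0, np.sin(2 * np.pi * phase)) ** 2

    # Glottis end: mostly closed, partially reflects the returning wave.
    fwd_in = glottal + 0.75 * bwd[0]
    # Lip end: open, reflects with a sign flip and some loss; the rest radiates.
    lip_reflection = -0.85 * fwd[-1]
    out.append(fwd[-1] + lip_reflection)

    # Kelly-Lochbaum scattering at the internal junctions, then advance one step.
    new_fwd = np.empty(N)
    new_bwd = np.empty(N)
    new_fwd[0] = fwd_in
    for j in range(N - 1):
        new_fwd[j + 1] = (1 + k[j]) * fwd[j] - k[j] * bwd[j + 1]
        new_bwd[j] = k[j] * fwd[j] + (1 - k[j]) * bwd[j + 1]
    new_bwd[N - 1] = lip_reflection
    fwd, bwd = new_fwd, new_bwd

sig = np.array(out)
sig = (sig / (np.abs(sig).max() + 1e-9) * 0.8 * 32767).astype(np.int16)
with wave.open("kl_toy_vowel.wav", "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(SR)
    w.writeframes(sig.tobytes())
</code></pre><p>With a better glottal pulse, more sections, and time-varying areas you start to get recognizable vowels; Pink Trombone is, in essence, a much more refined take on this same structure.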
Does anyone have any information / background on the programming behind this project?<p>It seems incredible for something done so long ago, and I can't quite conceive of how they were able to do it.
HAL did it better 40 years later, in 2001: <a href="https://www.google.com/search?q=HAL+singing+D+a+bicycle+built+for+two+2001&oq=HAL+singing+D+a+bicycle+built+for+two+2001&aqs=chrome..69i57j33i160l2.13467j1j4&sourceid=chrome&ie=UTF-8#fpstate=ive&vld=cid:66e079f5,vid:E7WQ1tdxSqI" rel="nofollow">https://www.google.com/search?q=HAL+singing+D+a+bicycle+buil...</a>