HN Theater @HNTheaterMonth

The best talks and videos of Hacker News.

Hacker News Comments on
Fossil Data Part 2: 8-Inch IBM Floppy Data Recovery

CuriousMarc · Youtube · 73 HN points · 0 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention CuriousMarc's video "Fossil Data Part 2: 8-Inch IBM Floppy Data Recovery".
Youtube Summary
We help intrepid paleontologists to retrieve precious fossil data fossilized on 8-inch IBM floppies.

Our sponsor for PCBs: https://www.pcbway.com
Support the team on Patreon: https://www.patreon.com/curiousmarc
Buy shirts on Teespring: https://teespring.com/stores/curiousmarcs-store
Learn more on companion site: https://www.curiousmarc.com
Contact info: https://www.youtube.com/curiousmarc/about

Music Credits: Crinoline Dreams, Kevin MacLeod, https://incompetech.com

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
Feb 23, 2020 · 72 points, 26 comments · submitted by zdw
crmrc114
I love his videos. He has a webpage with merch if you want to support him: https://www.curiousmarc.com/

If you are new to his channel I cannot recommend enough the Apollo AGC videos https://www.youtube.com/watch?v=2KSahAoOLdU&list=PL-_93BVApb...

ggambetta
Somewhat related, I have a couple of tapes from the 80s that contained software for the ZX Spectrum (mostly my first attempts at writing code). They were written in a custom format by a custom device I no longer have, and of which I have very little information (I managed to track down one of the engineers who designed the thing some 30 years ago).

I have raw audio files of these tapes. I have managed to convert some other tapes in standard ZX Spectrum format to readable files I can load in an emulator. However, for the special tapes, there's no tooling available - all I have is a waveform.

If I had an array of bits, I could start trying to figure out the format of this thing. However, I have no idea how to go from a raw waveform to the zeros and ones it encodes. My best idea so far is to write a small program that looks for zero crossings, and depending on the timing output zeros or ones, but I suspect there might exist some software that does this already? I have next to no knowledge of signal processing.

Any suggestions on how to go about this?

tpmx
I'd recommend asking people at https://www.worldofspectrum.org/forums/. This forum has been around since the 90s (iirc). I'd expect the kind of detailed knowledge you're after to be present there.
vnchr
Interesting question you posed in your deleted comment. Unlikely to get traction in this forum, but interesting nonetheless.
jacobush
Now I'm curious.
tpmx
Yeah, it's the wrong place. I deleted the comment when it reached -3. The thread is full of hyper positive cheerleaders now. Fxxk me.
IIAOPSW
If you just want to get a quick answer on basic questions of the encoding, such as where to put the zero/one threshold, bits per second, how bits are encoded, etc., you can open up the file in Audacity and zoom in on the waveform.
TheOtherHobbes
Start with this for a technical backgrounder.

http://www.worldofspectrum.org/tapsamp.html

What you have is probably some variant of FSK, so if you view a spectrum (in Audacity, or similar) you should be able to pick out the frequencies being used.
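
If a spectrum view is not at hand, the same question (which tones are present?) can be approached in code. The following is only a sketch, assuming the samples are already in a Python sequence and that you have a rough list of candidate frequencies to test; the function names and the candidate list are hypothetical and not taken from any Spectrum tool. It uses the Goertzel algorithm to measure signal power at each candidate.

```python
import math


def goertzel_power(samples, rate, freq):
    """Return the signal power near `freq` Hz using the Goertzel algorithm."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / rate)
    s_prev = 0.0
    s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2


def strongest_tones(samples, rate, candidates, count=2):
    """Return the `count` candidate frequencies carrying the most power.

    Assumption: the real tones are close to entries in `candidates`.
    """
    ranked = sorted(
        candidates,
        key=lambda f: goertzel_power(samples, rate, f),
        reverse=True,
    )
    return sorted(ranked[:count])
```

For an unknown tape, a fine grid of candidates (say every 50 Hz) over a short excerpt is a reasonable first pass; the two strongest peaks are then only a hypothesis about the FSK pair, to be checked against the waveform.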

Decoding the block structure is going to be harder, and it will also depend on the data representation. The data is probably tokenised. If it's standard ZX BASIC it should be easy-ish to decode, but if it's a custom format it's going to be much harder.

WalterBright
A .wav file is pretty simple, just an array of amplitudes indexed by time quanta. See if your audio files can be converted to .wav files.

Then, read the amplitudes into a C array (or a D array, even better!) and they're easy to do simple processing on.

I wrote a program a few years ago that would do this, looking for things that looked like clicks and smooth them with a cubic interpolation. It was fun to dink around with it.
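
Reading the amplitudes into an array, as described above, can be sketched in a few lines using Python's standard `wave` and `array` modules. This is a minimal sketch that assumes 16-bit PCM audio and keeps only the first channel; the function name `read_wav_mono16` is made up for illustration.

```python
import array
import sys
import wave


def read_wav_mono16(path):
    """Return (sample_rate, samples) for a 16-bit PCM WAV file.

    Assumption: samples are 16-bit; only the first channel is kept.
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wf.getnchannels()
        rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    # Frames are interleaved, so every `channels`-th value is channel 0.
    return rate, samples[::channels]
```

The result is a plain array of signed integers indexed by sample number, so sample `i` corresponds to time `i / rate` seconds.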

cat199
> Any suggestions on how to go about this?

I have no audio-specific experience, so there may be something better for this, but failing that, for format exploration at least, I'd think numpy/scipy + Jupyter would be great for interactively mucking around with the bits/bytes, e.g.

x > (x.max() + x.min()) / 2

(roughly, am a bit rusty at the moment)

basically gives you a boolean array marking which samples are above the midpoint; the places where it flips are the approximate zero crossings, and so on.

similarly, you could subslice on range boundaries to ignore imagined 'marker' bits, index this through an ASCII table & display results, etc.

if you're not up with numpy array syntax / dtypes it takes a bit of getting used to, but well worth the effort IMHO in terms of the overall data exploration skills gained
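
A slightly fuller sketch of that numpy approach, under stated assumptions: the threshold is the midpoint of the signal range, a crossing is any sample where the signal changes sides, and bits are guessed from the spacing between crossings. The `split` value and the choice that a long interval means 1 are hypothetical; the real encoding of the custom tapes is unknown.

```python
import numpy as np


def crossing_intervals(x, rate):
    """Return the time in seconds between successive threshold crossings."""
    x = np.asarray(x, dtype=float)
    threshold = (x.max() + x.min()) / 2.0  # midpoint of the signal range
    high = x > threshold                   # boolean array: above or below
    # A crossing is wherever consecutive samples sit on opposite sides.
    crossings = np.flatnonzero(high[1:] != high[:-1]) + 1
    return np.diff(crossings) / rate


def classify(intervals, split):
    """Label each interval 0 (short) or 1 (long).

    Assumption: `split` is a hand-picked boundary in seconds, and
    long-means-1 is a guess to be tested against the data.
    """
    return (np.asarray(intervals) > split).astype(np.uint8)
```

Plotting a histogram of the intervals first should show whether they cluster into two groups, which suggests where to put `split`.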

jacquesm
Your typical tape from that era was simple FSK, 256 byte blocks + a checksum. You may want to start with trying that and simple variations on it. It shouldn't be too hard to figure out the FSK frequencies from the audio using a good scope.
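
Once a bit stream has been recovered, the "256-byte blocks plus a checksum" hypothesis can be tested quickly. The sketch below assumes bits are packed most significant bit first and that the checksum is a single byte equal to the XOR of the payload; both are guesses to try, not known properties of these tapes, and the function names are hypothetical.

```python
def bits_to_bytes(bits):
    """Pack a sequence of 0/1 values into bytes, most significant bit first.

    Trailing bits that do not fill a whole byte are dropped.
    """
    out = bytearray()
    for i in range(0, len(bits) - len(bits) % 8, 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def check_blocks(data, block_size=256):
    """Yield (payload, ok) for each payload followed by one checksum byte.

    Assumption: the checksum is the XOR of all payload bytes.
    """
    step = block_size + 1
    for start in range(0, len(data) - step + 1, step):
        payload = data[start:start + block_size]
        checksum = data[start + block_size]
        xor = 0
        for b in payload:
            xor ^= b
        yield payload, xor == checksum
```

If most blocks fail, trying the simple variations (a different bit order, an additive checksum, a different block size) is cheap, since each is a one-line change.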
jhallenworld
They should try Dave Dunfield's ImageDisk:

http://www.classiccmp.org/dunfield/img/index.htm

This is used by bitsavers to preserve old data from 8 inch floppy disks. I was able to write an emulator for a Motorola 6800 "Exorciser" that could boot disks saved this way.

oneplane
I don't think it handles IBM formatted disks. You can use it if you have an FDC that does some handling for you. Problem is that those disks are practically 'punchcards-on-disk' for mainframes, not PC-based at all. Oddly enough, it does know about paper tape.

Then again, OmniDisk can't autodetect it initially either, so perhaps the whole concept of reading mainframe formatted disks and mainframe encoded data on a non-mainframe system was rather problematic anyway.

jhallenworld
It does. IBM format is the most basic floppy standard (same for ASCII and EBCDIC). I watched the video again: they did use ImageDisk for the bulk of the transfer, and you can see it flash on screen a few times (13:46 is one point). It's the mostly blue screen with the red bar on the top.
oneplane
Re-watching the video and re-reading the comments, this makes sense. I probably was too focused on the subtitles I had on.

It is pretty interesting that while older formats can be somewhat obscure these days, because the format was much simpler it can be 'understood' by one person much easier.

kencausey
Part 1 for a more detailed introduction to the situation: https://www.youtube.com/watch?v=MPOYHQTMnf8
CaptArmchair
If you haven't seen this already, CuriousMarc has also done this awesome series on the restoration of an original AGC - Apollo Guidance Computer last year.

https://www.youtube.com/watch?v=2KSahAoOLdU&list=PL-_93BVApb...

And then there are real treats such as the restoration of a Teletype 33 ASR:

https://www.youtube.com/watch?v=QzfjT1mCRww&list=PL-_93BVApb...

Or them trying to get Fortran to compile on an IBM 1401 Mainframe dated 1959. https://www.youtube.com/watch?v=uFQ3sajIdaM

thedance
This skewers the notion that old programmers were honed and refined by their resource constraints. Whoever wrote these files was using a slow, expensive physical format and wasting virtually all of it on padding.
fsh
All data was typed in by hand, and the entire dataset fit into a few boxes. Resources were never an issue, so why optimize for them?
thedance
I think there's a practical difference between a folder full of data and a wheelbarrow full of data.
sys32768
About six years ago I found several 8" floppies for the 1970s Ohio Scientific system. These had been in storage since the late 1980s and contained games and utilities. One disk label was dated 1/31/1979.

I sent them to the author of the OSI emulator and I believe all but one of them were fully readable and dumped for emulation.

http://osi.marks-lab.com/

WalterBright
I retrieved my files off of old PDP-11 8" floppies by contacting a friend of mine at cheshireeng.com who had an old 11 in a closet. He didn't know if it still worked, but it did (yay for DEC quality!), and was able to transfer the disk contents out via serial line. He was able to retrieve files and images from all my disks, didn't lose a bit (yay for DEC floppy quality!). This was after 30 years of sitting in a box.

Anyhow, I was able to retrieve my original 11 version of Empire this way:

https://github.com/DigitalMars/Empire-for-PDP-11

DEC made good machines. Not one of my machines from the 80s or 90s would power up, though I stored them in working condition in warm and dry places. But the 30 year old DEC worked great.

timbit42
I notice Wikipedia states you wrote Empire for the PDP-10. Are there two versions or is Wikipedia wrong?
dzdt
The typical thing that goes wrong with computers from the 80's and 90's is the electrolytic capacitors dry out and go bad. This got even worse with the "capacitor plague" of the late 90's/2000's.

I would guess the PDP-11 has fewer capacitors, or capacitors of a different design.

Aloha
Higher quality caps
WalterBright
My vintage 1981 Carver amp still runs all day every day.
Feb 23, 2020 · 1 points, 0 comments · submitted by EvanAnderson
HN Theater is an independent project and is not operated by Y Combinator or any of the video hosting platforms linked to on this site.