HN Theater @HNTheaterMonth

The best talks and videos of Hacker News.

Hacker News Comments on
How are Images Compressed? [46MB ↘↘ 4.07MB]

Branch Education · Youtube · 109 HN points · 1 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention Branch Education's video "How are Images Compressed? [46MB ↘↘ 4.07MB]".
Youtube Summary
Go to http://brilliant.org/BranchEducation/ to sign up for free, and expand your knowledge. The first 200 people will get 20% off their annual premium membership.

You've probably saved 1000s of JPEG images, but do you know what exactly JPEG does? Our smartphones and cameras save images in JPEG format; furthermore, the majority of images you see on the internet are saved using JPEG compression. This format is everywhere, but do you know exactly how it works? Well, in this video we're going to explore the JPEG compression format. This is a rather complicated video, so it may take a few viewings to understand it all.

Do you want to support in-depth engineering and technology education? Support us on: www.patreon.com/brancheducation

Website: www.branch.education
Script, Modeling, Animation: Teddy Tablante
Twitter: @teddytablante
Voice Over: Phil Lee
Nature Photography: Tobias Karlsson

Table of Contents:
00:00 - Intro into JPEG
01:24 - What does JPEG do?
02:35 - What are the Steps of JPEG?
04:06 - Color Space Conversion
06:06 - Discrete Cosine Transform
09:32 - Quantization
11:02 - Run Length and Huffman Encoding
12:04 - H.264 Video Compression
13:25 - Rebuilding an Image
15:01 - Notes and Caveats on JPEG
17:06 - Sponsored by Brilliant
18:20 - Outro

Key Branches from this video are: How does a Camera Work? How do SSDs Work?

Erratum:
Tulips are not the same as Lilies.

Animation built using Blender 3.0.0 https://www.blender.org/
Post with Adobe Premiere Pro

References:
A Trip Through the Graphics Pipeline 2011: Index
https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/

DCT (Discrete Cosine Transform)
https://asecuritysite.com/comms/dct2

H.264 is Magic
https://sidbala.com/h-264-is-magic/

How JPEG Works
https://cgjennings.ca/articles/jpeg-compression/

Jack, Keith. Video Demystified. Fifth Edition. Elsevier 2007.

JPEG 101 - How does JPEG Work?
https://arjunsreedharan.org/post/146070390717/jpeg-101-how-does-jpeg-work

JPEG: Image Compression Algorithm
http://pi.math.cornell.edu/~web6140/TopTenAlgorithms/JPEG.html

What is H.264?
https://www.streamingmedia.com/Articles/Editorial/What-Is-.../What-Is-H.264-74735.aspx?utm_source=related_articles&utm_medium=gutenberg&utm_campaign=editors_selection

Wikipedia contributors. "Chroma Subsampling", "Chrominance", "Discrete Cosine Transform", "JPEG". Wikipedia, The Free Encyclopedia. Visited December 2021.


Music Credits
Kindred

#JPEG #Camera #Picture

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
Jun 27, 2022 · 103 points, 20 comments · submitted by davidbarker
vanderZwan
Out of curiosity: why did they decide to go with a zig-zag pattern instead of grouping same frequencies of each block together? That way you'd end up with even longer runs of zeros.

edit: is that perhaps how progressive jpegs work and is that why they typically compress better?
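For readers who want to see the zig-zag pattern in question, here is a minimal Python sketch of the scan order on a single 8x8 block. The helper names (`zigzag_order`, `scan_block`, `trailing_zeros`) and the sample coefficients are made up for illustration; it is not taken from any particular JPEG codec.

```python
def zigzag_order(n=8):
    """Return the (row, col) cells of an n x n block in zig-zag order."""
    # Cells on the same anti-diagonal share row + col. The walking direction
    # alternates: odd diagonals go top-right to bottom-left (row increasing),
    # even diagonals go bottom-left to top-right (col increasing).
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def scan_block(block):
    """Flatten a square block of coefficients in zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


def trailing_zeros(seq):
    """Count the zeros at the end of a sequence."""
    count = 0
    for value in reversed(seq):
        if value != 0:
            break
        count += 1
    return count


# Hypothetical quantized block: only a few low-frequency coefficients survive.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[1][1], block[2][0] = 52, -3, 2, 1, -1

zigzag = scan_block(block)
row_major = [v for row in block for v in row]
```

With this sample block the zig-zag scan ends in a longer run of zeros than a plain row-by-row scan, which is what makes the run-length stage that follows effective.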

GeertJohan
The QOI format is relatively easy to understand, explained in a single page pdf file. I found it very interesting, knowing nothing of image encoding/compression.

https://qoiformat.org/

vanderZwan
It's great but it's an entirely different beast, being both lossless and designed to be fast and minimal above everything else. It works by minimizing the overhead of encoding deltas and run-lengths.
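As a rough illustration of the "deltas and run-lengths" idea, here is a simplified single-channel sketch in Python. It is not QOI's actual byte format (the real format works on RGBA pixels and has more operation types); the op names `raw`, `diff` and `run` and the small-delta range are assumptions chosen for readability.

```python
def encode(pixels):
    """Encode a list of ints as ("raw", v), ("diff", d) or ("run", n) ops."""
    ops = []
    prev = 0   # assumed starting value, shared with the decoder
    run = 0    # how many times the previous value has repeated so far
    for p in pixels:
        if p == prev:
            run += 1
            continue
        if run:
            ops.append(("run", run))
            run = 0
        delta = p - prev
        if -2 <= delta <= 1:
            ops.append(("diff", delta))   # small change: cheap to store
        else:
            ops.append(("raw", p))        # large change: store the value itself
        prev = p
    if run:
        ops.append(("run", run))
    return ops


def decode(ops):
    """Invert encode(): rebuild the exact original list (lossless)."""
    out = []
    prev = 0
    for kind, value in ops:
        if kind == "run":
            out.extend([prev] * value)
        elif kind == "diff":
            prev += value
            out.append(prev)
        else:
            prev = value
            out.append(prev)
    return out


pixels = [10, 10, 10, 11, 11, 200, 200, 200, 200]
ops = encode(pixels)
```

Nine input values become six ops here, and `decode(ops)` returns the original list unchanged.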
Traubenfuchs
I was disheartened right at the beginning at the color space conversion: Why is there no green chrominance?

Is the chrominance quantization done for both red and blue chrominance?

astrange
You don’t need it. Y has all the info you need.
vanderZwan
The calculated red and blue chrominance values can be both positive and negative. The latter may sound a bit ridiculous in isolation, but if you combine it with luminance it works out: you can subtract blue and red from white to reconstruct green.
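A small Python sketch of that reconstruction, assuming the BT.601 luma weights used by JFIF (0.299, 0.587, 0.114) and signed chroma values centered on zero (JPEG files store them with a +128 offset, which is left out here). The function names are made up for this example, and values are kept as floats so no rounding is involved.

```python
KR, KG, KB = 0.299, 0.587, 0.114  # luma weights for red, green, blue


def rgb_to_ycbcr(r, g, b):
    """Convert RGB to luminance plus signed blue/red chrominance."""
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2 * (1 - KB))  # how far blue is from the luminance
    cr = (r - y) / (2 * (1 - KR))  # how far red is from the luminance
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Recover RGB; green is whatever luminance is left after red and blue."""
    r = y + 2 * (1 - KR) * cr
    b = y + 2 * (1 - KB) * cb
    g = (y - KR * r - KB * b) / KG
    return r, g, b


# Pure green: both chrominance values come out negative.
y, cb, cr = rgb_to_ycbcr(0, 255, 0)
r, g, b = ycbcr_to_rgb(y, cb, cr)
```

For pure green, both chrominance values are negative and the round trip still recovers the green channel, because luminance already carries the weighted sum of all three colors.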
diogenes_of_ak
Oh! Hey!

I just did something like this! I was messing around and used SVD to write a compressor. Do I dare share my feeble GitHub here lol

texaslonghorn5
Please do!
diogenes_of_ak
Debating it - my GitHub account is basically my name… so yah… debating the subsequent loss of anonymity
unnouinceput
Then create another one, and share that one. It's not like there is a limit of how many GitHub accounts you can have.
gabsens
Sounds similar to performing PCA and keeping only the top eigenvectors
IIAOPSW
Any sufficiently good compression algorithm is inherently also a dimensionality reduction algorithm, reducing the representation down to as few dimensions as needed. This simple fact is suggestive of an origin to a number of cognitive abilities. Out of a system which evolves only for condensed representation, the ability to discern patterns in the represented things emerges for free.
Guzba
I recently helped work on a new open source JPEG decoder in Nim. (Over here on GitHub: https://github.com/treeform/pixie/blob/master/src/pixie/file...)

This video was extremely helpful to understand the "why" of all the things the spec was trying to explain. It made a huge difference in us being able to get things working.

We talk a bit about JPEG and actually writing our decoder in Nim here: https://www.youtube.com/watch?v=vYwD7OynFcg

Overall, our concluding opinion is that JPEG has some extremely cool and really smart ideas for how to compress images but the binary file format itself has some very painful things in it (progressive and restart markers as a couple examples).

Sesse__
The amazing thing is how well JPEG performs for something that is pretty simple and worked (although very slowly) on 1992 hardware. (I don't mind restart markers, BTW, but stuffing definitely was a mistake.) Look at the state of the art in video codecs in 1992 versus today, then consider that we still make new image formats that can beat JPEG only on PSNR (not perceived quality), or in very narrow niches like super-low bit rates. As the quote goes, “it's like alien technology from the future”.

JPEG XL appears to finally be getting there, with a meaningful improvement. But still nothing like 3x. Also perhaps AVIF, but current encoders have problems with rough texture on high bitrates.

TacticalCoder
It's the same with the, even older, CD audio format. Sure, it's lossless and uncompressed but stereo 16-bit 44.1 kHz was a stroke of genius. In 1980.

Some may argue about lossy audio formats (as if the nineties called and wanted their 20 GB HDD back) while others may argue about SACD, 24 bit 96 kHz and whatnots but the fact remains: there are engineers out there who came up with the CD audio format in 1980, which is still in use to this day.

I legally and bit-perfectly rip my CDs to FLAC files and it still boggles my mind that it's basically the format from 1980 (FLAC files are lossless and you can re-burn the exact same identical CD, which you can then, if you fancy so, re-rip to the exact same, bit perfect, WAV or FLAC files, rinse & repeat as many times as you want).

Speakers definitely got better. DACs are ubiquitous now. Amps probably got better too. But 16-bit 44.1 kHz stereo has lived on for 42 years (40 years commercially). Soon half a century.

"It's like alien technology from the future" indeed.

vanderZwan
> as if the nineties called and wanted their 20 GB HDD back

I think that was the early 2000s. The nineties were the era of the CD-ROM storing huge games that did not fit on your HDD.

zRedShift
FLAC compression, although lossless, is not nearly as straightforward as raw PCM/WAV/AIFF. It has LPC (linear predictive coding), with the usual residual entropy/RLE coding (but without the quantization stage, due to being lossless). It also has an optimization for stereo input where both channels are very similar: it losslessly converts them to a mid channel and a side channel, and the values in the side channel are very small and lend themselves to RLE/entropy coding.
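The mid/side idea can be sketched in a few lines of Python. This is an illustration of the integer trick under the usual formulation (mid is the halved sum, side is the difference), not FLAC's actual bitstream code; the function names and sample values are made up.

```python
def to_mid_side(left, right):
    """Convert one stereo sample pair to (mid, side) using integers only."""
    mid = (left + right) >> 1   # halved sum; the lowest bit is dropped here
    side = left - right         # small when both channels are similar
    return mid, side


def from_mid_side(mid, side):
    """Recover the exact (left, right) pair."""
    # left + right and left - right always have the same parity, so the bit
    # dropped from the sum can be read back from the lowest bit of side.
    total = (mid << 1) | (side & 1)
    left = (total + side) >> 1
    right = (total - side) >> 1
    return left, right


# Hypothetical samples from two nearly identical channels.
left_channel = [1000, 1002, 998]
right_channel = [1001, 1001, 999]
pairs = [to_mid_side(l, r) for l, r in zip(left_channel, right_channel)]
```

For these samples the side values are all -1 or 1 while the mid values stay near 1000, and `from_mid_side` returns every original pair exactly, negative samples included.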

As far as the xiph.org audio codecs go however, Opus is the real magnum opus (pun obviously intended). SILK (the LPC part, donated by skype) + CELT + DNN (used to detect whether it's speech or music to tune the 2 codecs since libopus v1.3), it's quite complex, and I feel like some of its parts (specifically the SILK encoder, which has the donated implementation and only the high level details in its RFC, since CELT has a plethora of documentation/articles and independent encoder re-implementation in ffmpeg) are only really understood by the original authors (or at least were when they wrote them a decade and a half ago). Reverse engineering the (SILK) encoder code and making a video similar to the one on the OP (or at least an article/blog post) could be a fun activity.

Sesse__
Opus feels like it solved the problem of audio compression. Even if someone came out with a codec that gave same quality at half the bitrate, I don't think I would care much; I just want Opus in all my devices, everywhere. :-) It's good enough along pretty much all axes, except, of course, universal support.
zRedShift
If we're talking about wish-lists, encoding performance on low-power IoT devices, maybe? It has decent SIMD support on ARM/x86 and tweakable complexity settings, but if your device is weaker than an ESP32, you'll be hard-pressed to encode audio in real time, even on the lowest complexity.

The new kids on the block in the speech encoding/real time communications space (Google Lyra/Microsoft Satin) have fancy AI models and promise decent quality at ultra-low bitrates (3-6kbps), but don't look like they're any easier to run on a microcontroller.

lifthrasiir
How about Codec 2 [1]? I think it delivers comparable performance to Lyra etc. while not using ML, and has multiple ESP32 ports already. It might be usable on less powerful devices.

[1] https://www.rowetel.com/?page_id=452

Jan 09, 2022 · 3 points, 0 comments · submitted by codetrotter
I agree, I had the same click happen when I saw this set of shapes in a YouTube video on how JPEG works: https://www.youtube.com/watch?v=Kv1Hiv3ox8I.
SavantIdiot
This video is really well done. I worked on optimizing the Xing video player for Pentium MMX ages ago (almost 30 years?!), and had to figure this out by reading the same one book on the topic over and over again. I never quite got the details, but this video would have saved me weeks of study. Pre-internet days sucked for learning complex topics.
Jan 05, 2022 · 2 points, 0 comments · submitted by zaidhaan
Dec 24, 2021 · 1 points, 0 comments · submitted by U1F984
HN Theater is an independent project and is not operated by Y Combinator or any of the video hosting platforms linked to on this site.