In recent years, JPEG 2000 has become the de facto codec standard for 1G AV over IP solutions. AMX/Harman has used it for years in several of their SVSi products, Crestron used it in their DM NVX, and a range of other manufacturers also ship AV over IP solutions based on JPEG 2000. Despite the impressive growth of the 10G SDVoE ecosystem, there is still a need for lower-bitrate solutions. However, mixed feedback from customers and concerns from IT departments raise questions about whether JPEG 2000 is the best solution moving forward.
So, what are the concerns with JPEG 2000? Let’s start with some history. JPEG 2000 was developed as a still image compression standard by the Joint Photographic Experts Group (JPEG) committee in 2000, with the intention of superseding their original discrete cosine transform-based JPEG standard. Since JPEG 2000 is a wavelet-based codec, it offers better compression ratios than JPEG while eliminating the issues with blocking artifacts. JPEG 2000 video consists of a series of compressed still images, one for each video frame, typically at 24, 30 or 60 frames per second. Codecs that compress and transmit each frame individually in this way are called intra-frame codecs. This is a different approach from inter-frame codecs like H.264 or HEVC, which work on a group of neighboring frames known as a group of pictures (GOP). JPEG 2000 has been adopted by the broadcast industry as a mezzanine compression format in live production workflows, where it offers unique benefits as an alternative to uncompressed video. One of the key advantages of intra-frame codecs is low latency, and solutions based on JPEG 2000 allow video transmission with less than a single frame of latency (about 16.7 ms at 60 fps). This makes it much more attractive for AV over IP than H.264, which might have several hundred milliseconds of latency.
All this sounds pretty great though, so what’s the problem? If it works for the broadcast industry, shouldn't it be acceptable for the AV industry as well? Despite 4K motion video being frequently used in demonstrations of video quality, this type of content is far less common in Pro AV applications. Video is usually of a more static character, and users are often located very close to the display. Even when looking at a simple spreadsheet, artifacts or pixel changes would be noticeable and have a detrimental effect on the overall user experience. In order to deliver acceptable image quality for Pro AV applications, the compression ratio in my experience needs to be lower than 10:1 with JPEG 2000. This means a lot of bandwidth is needed.
The raw data from 4K/60 video with 4:4:4 chroma sampling requires a bandwidth of 14.92 Gbps. Using 10:1 compression results in a 1.49 Gbps stream, which can't possibly fit on a 1G network. However, since these images are more or less static, why do we transmit every single frame? The page you are looking at right now doesn't change while you're reading. There’s really no good reason to send the same picture 60 times every second. This is the main issue with JPEG 2000 for Pro AV use.
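The bandwidth figures above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes a 3840x2160 active picture and 10 bits per color component (assumptions that reproduce the roughly 14.9 Gbps quoted above, give or take rounding), and it ignores blanking intervals and network overhead. The function name is made up for this example.

```python
def raw_bitrate_bps(width, height, fps, bits_per_component, components=3):
    """Uncompressed video bitrate in bits per second (active pixels only)."""
    return width * height * fps * bits_per_component * components

# Assumed parameters: 4K/60, 4:4:4 (three full-resolution components), 10-bit.
raw = raw_bitrate_bps(3840, 2160, 60, 10)
compressed = raw / 10            # 10:1 compression
link = 1_000_000_000             # nominal 1G Ethernet, before any overhead

print(f"raw:  {raw / 1e9:.2f} Gbps")          # raw:  14.93 Gbps
print(f"10:1: {compressed / 1e9:.2f} Gbps")   # 10:1: 1.49 Gbps
print(f"fits on 1G: {compressed <= link}")    # fits on 1G: False
print(f"ratio needed to fit: {raw / link:.1f}:1")  # ratio needed to fit: 14.9:1
```

In other words, under these assumptions an intra-frame codec would need a compression ratio of about 15:1 just to reach the nominal line rate, before leaving any headroom for packet overhead or other traffic.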
What’s the solution? By looking at the previous frame, effectively adding some elements of an inter-frame codec, it's possible to avoid transmitting content that doesn't change from one frame to the next. This is Extron's approach with their PURE3 codec, and after finally admitting the weaknesses of JPEG 2000, seemingly what Crestron is doing through the addition of IntoPix’s FlinQ technology to their codec. Extron is using what they call Intelligent Selective Streaming Technology (ISS), which analyzes frames before they are transmitted and only encodes and transmits the required changes. This approach results in a significant reduction in bandwidth, which can in turn be spent on a lower compression ratio for improved image quality. Little information is available on the FlinQ technology used by Crestron, but based on the previews, my assumption would be that their approach is somewhat similar. How AMX/Harman plans to address the problem remains to be seen.
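To make the general idea concrete, here is a minimal sketch of sending only what changed between two frames. It is not Extron's ISS or the FlinQ algorithm, whose internals are not described here. It is just a generic tile-by-tile comparison, and the function name, tile size and toy frames are all hypothetical.

```python
def changed_tiles(prev, curr, tile=2):
    """Return the (row, col) origins of tiles that differ between two frames.

    Frames are lists of rows of pixel values with identical dimensions.
    Only the tiles returned here would need to be encoded and transmitted.
    """
    height, width = len(curr), len(curr[0])
    changed = []
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            rows = range(y, min(y + tile, height))
            # A tile counts as changed if any of its row slices differ.
            if any(prev[r][x:x + tile] != curr[r][x:x + tile] for r in rows):
                changed.append((y, x))
    return changed

# Toy 4x4 "frames": a static image, then the same image with one pixel changed.
frame_a = [[0] * 4 for _ in range(4)]
frame_b = [row[:] for row in frame_a]
frame_b[3][2] = 255

static = changed_tiles(frame_a, frame_a)    # nothing changed
updated = changed_tiles(frame_a, frame_b)   # only the bottom-right tile changed

print(static)    # []
print(updated)   # [(2, 2)]
```

With a static spreadsheet on screen, most frames would produce an empty or very short list, which is where the bandwidth saving comes from. A real codec also has to cope with noise, packet loss and receivers joining mid-stream, none of which this sketch attempts.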