torchjpeg.codec

Low-level Codec Operations

Provides access to low-level JPEG operations using libjpeg. By using libjpeg directly, coefficients can be loaded from or saved to JPEG files without needing to be recomputed. In addition to the low-level operations implemented in C++, two Python convenience functions are exported that decode the resulting coefficients to pixels.

torchjpeg.codec.read_coefficients(path: str) → Tuple[Tensor, Tensor, Tensor, Optional[Tensor]]

Read DCT coefficients from a JPEG file

Parameters

path (str) – The path to an existing JPEG file

Returns

  • Tensor – A \(\left(C, 2 \right)\) Tensor containing the size of the original image that produced the returned DCT coefficients. This is usually different from the size of the coefficient Tensor because padding is added during the compression process. The format is \(\left(H, W \right)\).

  • Tensor – A \(\left(C, 8, 8 \right)\) Tensor containing the quantization matrices for each of the channels. Usually the color channels have the same quantization matrix.

  • Tensor – A \(\left(1, \frac{H}{8}, \frac{W}{8}, 8, 8 \right)\) Tensor containing the Y channel DCT coefficients for each \(8 \times 8\) block.

  • Optional[Tensor] – A \(\left(2, \frac{H}{8}, \frac{W}{8}, 8, 8 \right)\) Tensor containing the Cb and Cr channel DCT coefficients for each \(8 \times 8\) block, or None if the image is grayscale.

Note

The return values from this function are “raw” values, as output by libjpeg with no transformation. In particular, the DCT coefficients are quantized and will need to be dequantized using the returned quantization matrices before they can be converted into displayable image pixels. They will likely also need cropping, and the chroma channels, if they exist, will probably be downsampled. The type of all Tensors is torch.short except the dimensions (first return value), which are of type torch.int.
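The dequantization step mentioned in the note is an elementwise product of each 8 × 8 block with the quantization matrix of its channel. The following is a minimal sketch, not the library's implementation: `dequantize_block` is a hypothetical pure-Python helper that spells out the arithmetic, and `load_dequantized_luma` is an illustrative wrapper that assumes torch and torchjpeg are installed and that the first quantization matrix belongs to the Y channel.

```python
def dequantize_block(block, table):
    """Multiply a quantized 8x8 block elementwise by its 8x8 quantization matrix.

    Both arguments are nested lists (8 rows of 8 integers). Hypothetical helper,
    shown only to make the arithmetic explicit.
    """
    return [[block[r][c] * table[r][c] for c in range(8)] for r in range(8)]


def load_dequantized_luma(path):
    """Illustrative wrapper (not part of torchjpeg): read and dequantize Y.

    Assumes torch and torchjpeg are installed; torchjpeg is imported lazily so
    the helper above can be used without it.
    """
    import torchjpeg.codec

    dimensions, quantization, Y, CbCr = torchjpeg.codec.read_coefficients(path)
    # Y is (1, H/8, W/8, 8, 8) and quantization[0] is (8, 8), so the product
    # broadcasts the matrix over every block. Assumption: index 0 is the Y
    # matrix. The raw tensors are torch.short, so cast to float first.
    return Y.float() * quantization[0].float()
```

The same broadcasting pattern would apply to the chroma coefficients with the remaining quantization matrices, when they are present.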

torchjpeg.codec.write_coefficients(path: str, dimensions: Tensor, quantization: Tensor, Y_coefficients: Tensor, CrCb_coefficients: Optional[Tensor] = None) → None

Write DCT coefficients to a JPEG file.

Parameters
  • path (str) – The path to the JPEG file to write. An existing file at this path will be overwritten.

  • dimensions (Tensor) – A \(\left(C, 2 \right)\) Tensor containing the size of the original image before taking the DCT. If you padded the image to produce the coefficients, pass the size before padding here.

  • quantization (Tensor) – A \(\left(C, 8, 8 \right)\) Tensor containing the quantization matrices that were used to quantize the DCT coefficients.

  • Y_coefficients (Tensor) – A \(\left(1, \frac{H}{8}, \frac{W}{8}, 8, 8 \right)\) Tensor of Y channel DCT coefficients separated into \(8 \times 8\) blocks.

  • CbCr_coefficients (Optional[Tensor]) – A \(\left(2, \frac{H}{8}, \frac{W}{8}, 8, 8 \right)\) Tensor of Cb and Cr channel DCT coefficients separated into \(8 \times 8\) blocks.

Note

The parameters passed to this function are in the same “raw” format as returned by read_coefficients(). The DCT coefficients must be appropriately quantized and the color channel coefficients must be downsampled if desired. The type of the Tensors must be torch.short except the dimensions parameter which must be torch.int.
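As a rough illustration of the expected argument layout, the sketch below builds the tensors for a flat grayscale image by hand. `blocks_needed` and `write_flat_gray` are hypothetical names introduced here. The sketch assumes torch and torchjpeg are installed, that the coefficient grid is the image size rounded up to whole 8 × 8 blocks, and that an all-ones quantization matrix is acceptable to the encoder.

```python
def blocks_needed(size):
    """Number of 8-pixel blocks needed to cover `size` pixels (ceiling division)."""
    return (size + 7) // 8


def write_flat_gray(path, height, width, dc_value=0):
    """Illustrative wrapper: write a grayscale JPEG whose blocks hold only a DC term.

    Assumes torch and torchjpeg are installed (imported lazily). With every AC
    coefficient at zero, each block decodes to a constant value.
    """
    import torch
    import torchjpeg.codec

    # One channel: the (1, 2) size tensor is torch.int, the rest torch.short.
    dimensions = torch.tensor([[height, width]], dtype=torch.int)
    quantization = torch.ones(1, 8, 8, dtype=torch.short)
    Y = torch.zeros(
        1, blocks_needed(height), blocks_needed(width), 8, 8, dtype=torch.short
    )
    Y[..., 0, 0] = dc_value  # set only the DC coefficient of every block
    # The chroma argument is left at its default of None for grayscale.
    torchjpeg.codec.write_coefficients(path, dimensions, quantization, Y)
```

For example, a 20 × 30 image needs a 3 × 4 grid of blocks, while the dimensions tensor still records the unpadded size.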

torchjpeg.codec.quantize_at_quality(pixels: Tensor, quality: int, color_samp_factor_vertical: int = 2, color_samp_factor_horizontal: int = 2, baseline: bool = True) → Tuple[Tensor, Tensor, Tensor, Optional[Tensor]]

Quantize pixels using libjpeg at the given quality. Using this function instead of torchjpeg.quantization guarantees that the result is exactly the same as if the JPEG had been quantized by an image library such as Pillow. The coefficients are returned directly, without needing to be recomputed from pixels.

Parameters
  • pixels (Tensor) – A \(\left(C, H, W \right)\) Tensor of image pixels in pytorch format (normalized to [0, 1]).

  • quality (int) – The integer quality level to quantize to, in [0, 100] with 100 being maximum quality and 0 being minimal quality.

  • color_samp_factor_vertical (int) – Vertical chroma subsampling factor. Defaults to 2.

  • color_samp_factor_horizontal (int) – Horizontal chroma subsampling factor. Defaults to 2.

  • baseline (bool) – Use the baseline quantization matrices, i.e. quantization matrix entries cannot be larger than 255. True by default; don’t change it unless you know what you’re doing.

Returns

  • Tensor – A \(\left(C, 2 \right)\) Tensor containing the size of the original image that produced the returned DCT coefficients. This is usually different from the size of the coefficient Tensor because padding is added during the compression process. The format is \(\left(H, W \right)\).

  • Tensor – A \(\left(C, 8, 8 \right)\) Tensor containing the quantization matrices for each of the channels. Usually the color channels have the same quantization matrix.

  • Tensor – A \(\left(1, \frac{H}{8}, \frac{W}{8}, 8, 8 \right)\) Tensor containing the Y channel DCT coefficients for each \(8 \times 8\) block.

  • Optional[Tensor] – A \(\left(2, \frac{H}{8}, \frac{W}{8}, 8, 8 \right)\) Tensor containing the Cb and Cr channel DCT coefficients for each \(8 \times 8\) block, or None if the image is grayscale.

Note

The output format of this function is the same as that of read_coefficients().
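To make the quality and baseline parameters more concrete, the sketch below reproduces the arithmetic libjpeg uses to scale a base quantization table entry (its `jpeg_quality_scaling` and `jpeg_add_quant_table` routines). It is an assumption that quantize_at_quality goes through this standard path. The Python function names are hypothetical and the code is a pure-Python illustration, not a call into the library.

```python
def quality_scaling(quality):
    """Map a quality setting to libjpeg's percentage scale factor.

    libjpeg clamps quality to [1, 100]; below 50 the factor is 5000 / quality,
    otherwise 200 - 2 * quality (so quality 50 leaves the base table unchanged).
    """
    quality = max(1, min(100, quality))
    if quality < 50:
        return 5000 // quality
    return 200 - 2 * quality


def scale_table_entry(base, quality, baseline=True):
    """Scale one base quantization table entry the way libjpeg does.

    The entry is rounded, kept at 1 or above, and capped at 255 when
    baseline is True (32767 otherwise).
    """
    value = (base * quality_scaling(quality) + 50) // 100
    limit = 255 if baseline else 32767
    return max(1, min(limit, value))
```

With a base entry of 99 at quality 10, the scaled value is 495, which baseline mode caps at 255. This is the case the baseline flag exists to control.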

torchjpeg.codec.pixels_for_channel()

Converts a single channel of quantized DCT coefficients into pixels.

Parameters
  • channel (torch.Tensor) – A \(\left(1, \frac{H}{8}, \frac{W}{8}, 8, 8 \right)\) Tensor of quantized DCT coefficients.

  • quantization (torch.Tensor) – An (8, 8) Tensor of the quantization matrix that was used to quantize channel.

  • crop (torch.Tensor) – An optional (2) Tensor containing the \(\left(H, W \right)\) original size of the image channel stored in channel. The pixel result will be cropped to this size.

Returns

A \(\left(H, W \right)\) Tensor containing the pixel values of the channel in [0, 1]

Return type

torch.Tensor

Note

This function takes inputs in the same format as returned by read_coefficients() separated into a single channel.
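Conceptually, converting a channel to pixels means dequantizing each block, applying the inverse DCT, and mapping the result into [0, 1]. The pure-Python sketch below walks through those steps for a single 8 × 8 block. The function names are hypothetical, and the level shift of 128, the division by 255, and the final clamp are assumptions based on standard 8-bit JPEG decoding rather than a description of the library's code.

```python
import math


def idct_8x8(block):
    """Inverse 2-D DCT of one 8x8 block using the JPEG normalization."""

    def scale(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * 8 for _ in range(8)]
    for x in range(8):
        for y in range(8):
            total = 0.0
            for u in range(8):
                for v in range(8):
                    total += (
                        scale(u) * scale(v) * block[u][v]
                        * math.cos((2 * x + 1) * u * math.pi / 16)
                        * math.cos((2 * y + 1) * v * math.pi / 16)
                    )
            out[x][y] = total / 4
    return out


def block_to_pixels(block, table):
    """Dequantize one block, invert the DCT, and map the result to [0, 1]."""
    dequantized = [[block[r][c] * table[r][c] for c in range(8)] for r in range(8)]
    spatial = idct_8x8(dequantized)
    # Assumed post-processing: undo the 128 level shift, scale by 255, clamp.
    return [[min(1.0, max(0.0, (v + 128) / 255)) for v in row] for row in spatial]
```

A block containing only a DC coefficient decodes to a constant patch, which makes a convenient sanity check for any implementation of these steps.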

torchjpeg.codec.reconstruct_full_image()

Converts quantized DCT coefficients into an image.

This function is designed to work on the output of read_coefficients() and quantize_at_quality(). Note that the color channel coefficients will be upsampled by a factor of 2, since chroma subsampling is currently assumed. If the image is color, it will be converted from YCbCr to RGB.

Parameters
  • y_coefficients (torch.Tensor) – A \(\left(1, \frac{H}{8}, \frac{W}{8}, 8, 8 \right)\) Tensor of quantized Y channel DCT coefficients.

  • quantization (torch.Tensor) – A \(\left(C, 8, 8 \right)\) Tensor of quantization matrices for each channel.

  • cbcr_coefficients (Optional[torch.Tensor]) – A \(\left(2, \frac{H}{8}, \frac{W}{8}, 8, 8 \right)\) Tensor of quantized color channel DCT coefficients.

  • crop (Optional[torch.Tensor]) – A \(\left(C, 2 \right)\) Tensor containing the \(\left(H, W \right)\) dimensions of the image that produced the given DCT coefficients. The pixel result will be cropped to this size.

Returns

A \(\left(C, H, W \right)\) Tensor containing the image pixels in pytorch format (normalized to [0, 1])

Return type

torch.Tensor
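The two color-specific steps described above, upsampling the chroma planes by 2 and converting YCbCr to RGB, can be sketched in pure Python as follows. The helper names are hypothetical. Nearest-neighbor upsampling and the JFIF conversion constants are assumptions about what a typical decoder does, not a statement of how this function is implemented. `decode_jpeg` is an illustrative wrapper that assumes torchjpeg is installed and passes the outputs of read_coefficients() in the parameter order listed above.

```python
def upsample_by_2(plane):
    """Nearest-neighbor upsampling of a 2-D list by a factor of 2 in each axis."""
    out = []
    for row in plane:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def ycbcr_to_rgb(y, cb, cr):
    """Convert one YCbCr pixel in [0, 1] to RGB using the JFIF constants."""
    r = y + 1.402 * (cr - 0.5)
    g = y - 0.344136 * (cb - 0.5) - 0.714136 * (cr - 0.5)
    b = y + 1.772 * (cb - 0.5)
    return r, g, b


def decode_jpeg(path):
    """Illustrative wrapper: read coefficients and reconstruct the image.

    Assumes torchjpeg is installed (imported lazily).
    """
    import torchjpeg.codec

    dimensions, quantization, Y, CbCr = torchjpeg.codec.read_coefficients(path)
    return torchjpeg.codec.reconstruct_full_image(Y, quantization, CbCr, dimensions)
```

A neutral chroma value of 0.5 leaves the luma unchanged in all three RGB channels, so a grayscale image encoded as color decodes to equal R, G, and B values.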