Support specifying output channels (RGB vs grayscale) for io.image.read_image #2948

Closed
@rwightman

🚀 Feature

Similar to the channels argument of tf.io.decode_image or PIL.Image.open(..).convert('format'), allow the caller to specify the output channel 'format' and, if it is set, have a sensible conversion performed during decoding.

Motivation

Remove the need to manually check for and perform RGB -> grayscale or grayscale -> RGB conversion; datasets often contain a mix of RGB and grayscale images. For many typical use cases, the output color format of image loading in a data pipeline needs to be consistent.

The lower-level libraries (libjpeg, libpng, etc.) have facilities for specifying the output color space and will do the conversion for you if you set it up (e.g. out_color_space for libjpeg). This is what TensorFlow's decode_image does, and it is the most efficient approach.

Pitch

Add a channels= arg to read_image that matches TensorFlow semantics.

  • channels=0 - leave as original: a grayscale image decodes to 1 channel out, an RGB image to 3
  • channels=1 - grayscale out
  • channels=3 - RGB out
  • channels=4 - RGBA out (PNG or other formats that support alpha; not valid for JPEG)
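
To make the intended semantics concrete, here is a minimal sketch in plain Python that applies the same rules to pixels that have already been decoded. It is only an illustration, not the proposed implementation: the helper names (to_gray, convert_pixel, convert_image) are hypothetical and not part of torchvision, the BT.601 luma weights are an assumption (the request does not say which weights to use), and dropping alpha for RGBA -> RGB is likewise an assumption. A real implementation would ideally ask the decoder for the target color space instead of converting afterwards.

```python
# Hypothetical helpers, not part of torchvision: they emulate the proposed
# `channels` semantics on already-decoded pixels.
# A pixel is a tuple of 8-bit ints: (L,), (R, G, B) or (R, G, B, A).


def to_gray(r, g, b):
    # ITU-R BT.601 luma weights with integer rounding.
    # Assumption: the issue does not specify which weights to use.
    return (299 * r + 587 * g + 114 * b + 500) // 1000


def convert_pixel(pixel, channels):
    """Convert one pixel to the requested channel count (0 keeps it as-is)."""
    if channels not in (0, 1, 3, 4):
        raise ValueError(f"unsupported channels value: {channels}")
    if channels == 0 or len(pixel) == channels:
        return tuple(pixel)

    # Normalize the input to RGB plus an alpha value (opaque if absent).
    if len(pixel) == 1:
        rgb, alpha = (pixel[0],) * 3, 255
    elif len(pixel) == 3:
        rgb, alpha = tuple(pixel), 255
    elif len(pixel) == 4:
        rgb, alpha = tuple(pixel[:3]), pixel[3]
    else:
        raise ValueError(f"unsupported pixel layout: {len(pixel)} channels")

    if channels == 1:
        return (to_gray(*rgb),)
    if channels == 3:
        # Assumption: alpha is simply dropped, with no compositing.
        return rgb
    return rgb + (alpha,)


def convert_image(pixels, channels):
    """Apply convert_pixel to every pixel in a flat list of pixels."""
    return [convert_pixel(p, channels) for p in pixels]
```

For example, convert_image([(10,), (200,)], channels=3) returns [(10, 10, 10), (200, 200, 200)], so a grayscale image mixed into an RGB dataset comes out with the same channel count as the rest.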

Alternatives

Additional context
