The OME team is leading a community effort to design a new cloud-friendly “Next Generation” file format (NGFF).
See the announcement
and other image.sc posts tagged with ome-ngff.
The ome-zarr spec is currently
under development and includes a changelog with version numbers.
A number of tools are being developed
to work with ome-zarr data. The vizarr viewer is used below to view OME-Zarr images.
Below is a statement delivered November 2018 to the Euro-Bioimaging Industry
Board regarding the support of proprietary file formats by Bio-Formats. This
was discussed during the
From Images to Knowledge with ImageJ & Friends meeting in
December and since then, there have been a growing number of conversations
about a common format for bioimaging data. We're posting it here to tie the
conversations back together and continue an open discussion of this critical
As many of you know, work on Bio-Formats began in 2006, and over the first 10
years of development, support was added for over 140 file formats. If you
include the per-format variants that have emerged over the years, that number
might be 5 to 10 times higher, but precise counts are hard to come by.
In 2016, we issued a public
that OME, or more specifically its funding model, was not going to keep up with
the accelerated development of new formats. We warned that we would be spending
less time on closed formats, and we suggested that format developers either
move to open formats or invest their own time or money to support their formats.
How did that turn out? Well, two years later the growth curve has naturally
levelled off as we pursue other priorities. Currently there are just over 150
formats supported. One company, 3i, has taken over support of their own file
with a closed source reader that lives outside of Bio-Formats.
A few other companies have added support for their format either by
contributing directly to the library or by commissioning Glencoe Software to
do so. Where necessary, the open source team has added support for formats
that are needed for their funded priorities like datasets published in the
Image Data Resource.1,2,3,4
But paying for the initial cost of a format is not enough. The need for
indefinite support carries a larger, longer-lived price tag that leaves data
written in a given format constantly at risk. These costs are exacerbated by
format variants. Even when a format follows a standard such as DICOM, multiple
implementations must be contended with, as is the case in the radiology domain.
The same happened with the Olympus OIR format added in 2017
in partnership with Olympus Europe. Following public release, the community has
periodically reported breakages caused by new variants of the format.
Put simply, the format landscape has scaled beyond a manageable level. The
result is that scientists end up blocked in accessing and properly handling
their data, and thus blocked in their scientific endeavor. If Bio-Formats were
to cease to exist, a large percentage of imaging data would immediately cease
to be accessible at least until someone took on the burden of support.
We understand the push to develop new formats. From numerous interactions, we
know how crucial it is for data producers to write data quickly, for users to
access their data quickly, and for both to work across as many platforms as
possible. We also know that, optimally, this ecosystem should all just keep
working for years to come. But while these requirements need to be
fulfilled, something must give.
We think the only scalable way forward is to work together on an ever smaller
number of formats. That’s why we’ve been concentrating on open formats instead
of adding new proprietary formats. For example, Bio-Formats
support for the open BigDataViewer (BDV) format, a strong candidate for support
across the community.
BDV provides a testbed for moving beyond the current single binary format of
OME-TIFF. The OME Model will be extended to permit describing the multiscale,
multidimensional data that is currently stored in BDV XML/H5. As a stable
container format, HDF5 allows us a quick way to validate these concepts.
At the same time there’s a consensus that HDF5 itself as currently implemented
cannot be the only binary container for our community, and, therefore, we are
also collaborating on next-generation open-source, chunked (or “cloud”) formats
for the scale of data generated by future acquisition systems. Two candidates —
N5 — were independently developed but
overlap in most of their core concepts. Both communities have since begun
on a common storage spec, and other groups from
are getting involved.
We would like to see the bioimaging community agree on a minimal set of open
formats covering a broad range of imaging modalities. We need to reduce the
long-term cost of our domain’s file formats and their variants. We want data
users and producers to be able to ensure the long-term viability of their data.
OME-TIFF has been available for over a decade and today is in use by software
across industry and academia, minimally as an export format, but it still
doesn’t have the traction to stop a proliferation of new file formats. As
support for this new binary format solidifies, we intend to invest long-term
support in a new OME format.
Some of this work is the regular work of supporting the bioimaging
community, but we feel this is a larger effort that could use more collaboration
and funding. We are considering an application to the CZI’s Essential Open Source Software
and welcome any coordinated efforts. Beyond that, a truly common format
will need indefinite support, and we will continue to look for avenues to do
After the intensive development period of IDR’s
first releases, the 5.4 series of OMERO was intended to be a stable platform for the
community and the OME team to build on. From its first release in October 2017 to its
tenth and final release this year, 5.4 has, we think, served as a reference point for
In trying to maintain that stability, however, it’s become ever more clear that we
need the ability to quickly release individual parts of OMERO to the community.
Fixes to file formats, performance improvements, security patches, and more should
not need to wait on the simultaneous release of the entire OMERO platform.
Enabling such releases has been the focus of the upcoming, largely developer-centric
release. Still, with production-quality Docker deployments and the
fresh-off-the-presses Bio-Formats 6.1, we hope that OMERO 5.5 will provide
something for everyone.
During the development of OMERO 5.5, all 800,000+ lines of Java and Matlab code were
migrated out of the openmicroscopy/openmicroscopy GitHub repository into individual repositories, each with a new
Gradle build system. Support for Java 7, Python 2.6 and
Ice 3.5 was dropped. Java 11 support was added. The versions of most of these new
repositories began at 5.5.0, but they have already begun to diverge following
semver principles. Though initially disruptive, we hope this
modernization and modularization will ease participation in the development of OMERO.
See the Gradle super project omero-build for more details.
Beyond the changes for building OMERO, the distribution of
as Docker images is now considered production quality. Examples for using these
images in various configurations are available under omero-deployment-examples. Both images will be updated with every
OMERO release, and will also be updated with releases of the embedded components
and plugins as necessary.
Other Docker images from the OME team that you may have used over the years
have been deprecated and will soon be removed. A next step will be to
additionally provide Helm Charts for easing
deployment on Kubernetes. If you are interested,
please get in touch through the forum.
But don’t worry: we didn’t forget our users either. OMERO 5.5 finally makes the
jump to Bio-Formats 6, with its support both for pyramidal TIFFs
and for new community file formats like BDV. See the
Bio-Formats 6.1 announcement
for more details.
Moving forward, we look forward to helping you to create and share these more
In the coming months, we will continue to release fixes for the individual
components of OMERO and hope to ease their introduction into your local installation.
Feedback on how you find working with the decoupled repositories and installing
changes would be much appreciated.
At the same time, we will begin preparing for the next large changes:
With the deprecation of Python 2, all OME code bases will need to be upgraded to
work with Python 3. A similar modularization will likely be applied to the Python
and Web code, such that pip install -U omero-web should be all that is needed to
receive the latest updates to OMERO.web.
A development version of OMERO will begin with a flexible extension mechanism for
instrument and eventually experiment metadata. This is likely to become the basis for
OMERO 6 which, unlike OMERO 5.5, will require a database upgrade.
For OMERO to properly fulfill its role as a useful repository for
microscopy images, its users must have easy access to that data. As data
sets grow in size, providing that access becomes a correspondingly greater
challenge. This motivates the creation of server-side
solutions such as the IDR’s
Virtual Analysis Environment.
For the past couple of years the OME team has also been investigating
ways to improve users’ ability to obtain data from OMERO for client-side
storage and processing.
We now release OMERO.downloader,
a Java application that acts as a command-line OMERO client. It writes
selected data from an OMERO server into a local directory and creates
soft links to represent some of the relationships among server objects.
This is still an early version missing many features but it can already
download some original files and metadata.
OMERO.downloader is designed to handle situations in which not all the
specified data can be downloaded in a single session. If download is
interrupted then it can be resumed by repeating the same command line
invocation. If files have already been downloaded then they will not be
Downloading original files
The files that were uploaded for OMERO image ID 1234 are available
These are downloaded within the current directory. The -b option can
be used to specify a different preexisting directory to use as a base
for the downloads. We recommend using a different base directory for
each OMERO server that you use because the directory structure created
locally reflects how the server stores your data.
The above command would download image files into the
Image/1234/Binary/ directory with any companion files (not containing
pixel data) in the Image/1234/Companion/ directory. The files are soft
links that, perhaps via a Fileset/ directory, link to files in
Repository/. In the repository directory the binary and companion
files are located together. On systems with the GNU Core Utilities
installed, a command like:
showinf "$(realpath Image/1234/Binary/my-image.fmt)"
can be used to conveniently direct Bio-Formats’ command-line tools to
the directory that includes the binary and companion files together.
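The same link resolution can be done from a script. The following is a minimal Python sketch, not part of OMERO.downloader: it builds a throwaway directory tree that imitates the layout described above and then follows the soft link to find the directory holding the binary and companion files together. The Repository/ subpath, the file names, and the helper names are all hypothetical, chosen only for illustration.

```python
import os
import tempfile
from pathlib import Path


def repository_dir(link: Path) -> Path:
    """Follow soft links (as `realpath` does) and return the real file's directory."""
    return Path(os.path.realpath(link)).parent


def build_demo_tree(base: Path) -> Path:
    """Create a hypothetical layout: real files in Repository/, a soft link in Image/."""
    repo = base / "Repository" / "demo_fileset"  # hypothetical subpath
    repo.mkdir(parents=True)
    (repo / "my-image.fmt").write_bytes(b"pixels")    # stand-in for a binary file
    (repo / "my-image.txt").write_text("companion")   # stand-in for a companion file
    binary = base / "Image" / "1234" / "Binary"
    binary.mkdir(parents=True)
    link = binary / "my-image.fmt"
    link.symlink_to(repo / "my-image.fmt")
    return link


with tempfile.TemporaryDirectory() as tmp:
    link = build_demo_tree(Path(tmp))
    target_dir = repository_dir(link)
    # Both the binary and its companion sit side by side in the resolved directory.
    siblings = sorted(p.name for p in target_dir.iterdir())
    print(siblings)
```

Against a real download, only `repository_dir` would be needed, called on a path under the chosen base directory.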
The original files for multiple images can be downloaded by specifying,
e.g., Dataset:123 or Image:1234,1235,1236. However, nothing stored
in the base directory indicates which datasets or other containers held
the downloaded images. Original files from plates may be downloaded only
if the server’s omero.policy.binary_access setting is configured to
Exporting metadata as OME-XML
Metadata representing images, ROIs and some annotations can be fetched
from the OMERO server and written locally as OME-XML:
The OME-XML is stored in two forms. First, each top-level schema object
is stored in its own file, e.g.,
Image/1234/Metadata/image-1234.ome.xml. Soft links exist among related
model objects, e.g., Image/1234/Annotation/567 may link to
Annotation/567/ which contains Metadata/annotation-567.ome.xml. To
use those files and links to list the IDs of the images that are tagged
Second, each specified model object is assembled from the various object
files into a single OME-XML document, e.g.,
Image/1234/Export/image-1234.ome.xml. The OME-XML files in Export/
can include multiple top-level schema objects: for example, with
ROIRef elements linking an image to its ROIs.
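As a rough illustration of how such an assembled document might be read, the sketch below uses Python's standard library to list the ROIRef IDs found under each Image element. The sample XML is a hypothetical, heavily trimmed stand-in for a real export (it omits Pixels and most other metadata), the 2016-06 namespace is an assumption about the schema version in use, and the function names are invented for this example. Tags are matched by local name so the code does not depend on the exact namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed document; a real export carries far more metadata.
SAMPLE = """<OME xmlns="http://www.openmicroscopy.org/Schemas/OME/2016-06">
  <Image ID="Image:1234" Name="demo">
    <ROIRef ID="ROI:10"/>
    <ROIRef ID="ROI:11"/>
  </Image>
  <ROI ID="ROI:10"/>
  <ROI ID="ROI:11"/>
</OME>
"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def rois_by_image(xml_text: str) -> dict:
    """Map each Image ID to the IDs of the ROIRef elements directly beneath it."""
    root = ET.fromstring(xml_text)
    result = {}
    for element in root:
        if local_name(element.tag) != "Image":
            continue  # skip ROI and any other top-level objects
        refs = [child.get("ID") for child in element
                if local_name(child.tag) == "ROIRef"]
        result[element.get("ID")] = refs
    return result


links = rois_by_image(SAMPLE)
print(links)
```

For a file on disk, `ET.parse(path).getroot()` could replace `ET.fromstring` with the rest unchanged.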
As the pixel data is not included, any Pixels element contains
Plans for the future
OMERO.downloader is an early prototype: we have many ideas for how to
improve both how it is engineered and what it can do. For instance, it
cannot yet fetch map annotations or file attachments but both should be
feasible. We have been working toward offering export of pixel data into
TIFF or OME-TIFF, even for large images. This could make local image
analysis easier for pathology images that are too large for the server
to export or for plates where file download is disabled. We intend to
benefit from new developments in Bio-Formats such as having large
exported OME-TIFFs include pyramids.
There are also more ambitious possibilities. For example,
OMERO.downloader’s operation could be parallelized for greater speed, a
graphical user interface could be added, and images’ container structure
(screens, projects, etc.) could be fetched. Further work depends on what
our user community most needs and what best supports our funded
deliverables. We would gladly exchange design and implementation ideas
with collaborators who wish to assist with OMERO.downloader development.
In the meantime, we hope that the present version is already very useful
to some scientists. We welcome questions and comments via our forum and