The OME Blog Because metadata is worth a thousand pictures

OME 2016 Users Meeting

For those of you who couldn’t make it to Dundee for OME 2016, or maybe just didn’t get to all the sessions you wanted to, we have a range of content on our downloads site: notes, slides and even movie versions of workshop presentations. You can browse it there directly or, if there is specific content you’re interested in, follow the links on the programme page.

One of the things the user meeting is really valuable for is hearing about what the rest of the community is doing with our tools, so the lightning talks have been a welcome addition to the format over the past two years. Together with the formal talks, this year’s examples illustrate people doing everything from archiving to data visualization to image processing, from the scale of individual institutions to international research collaborations.

It’s always great to hear positive stories from the community rather than only getting feedback when something needs fixing. To this end, we’d always invite you to make use of the mailing lists and forums to discuss the challenges you’re working on even if you don’t necessarily need help. We’d also be keen to feature your stories on this blog if you’d like to submit something longer, e.g. with screenshots: you can open a PR or just drop us an email and we’ll get back to you for details. Did you know we maintain a list of publications using our tools? We are always happy to hear of new citations to add.

An even newer addition this year were the Unconference sessions on the third day. Again, we’ve put some notes from the sessions up on our downloads site. The topics covered in the Technical Developer discussions include data pipelines, federation and scaling, while the Admining OMERO section is more about managing data and users, having grown out of a session aimed at facility managers.

The File Formats discussion provides a good example of how we carry issues raised forward. The Unconference session on Reader integration and decentralisation prompted the creation of a GitHub Design issue where the interested members of the community can follow the options we’re considering and make comments.

If you haven’t noticed our Design repo before, you can find Issues relating to a whole range of design scoping for the UI, OMERO.web viewer, Folders etc. You do need a GitHub account to comment but signing up only requires a valid email address. Similarly, we have a selection of public Trello boards to help you track what’s going on with the project; if you sign up for free, you can also receive notifications and give feedback.

We’ve had some great feedback from the meeting attendees and would love to carry forward more positive engagement and to help empower our user community to help each other. We’ll be looking at ways to try to promote the activities of our community better going forward and in the meantime, please keep in touch!

Upcoming support for ROI Folders

For several months the OME team has been working on what will soon be released as the new 2016 OME Data Model. The OME Data Model specifies Regions of Interest (ROIs) in terms of a set of Shapes. As OMERO 5.3 will use the new Data Model, the upcoming changes include an initial round of adjustments that improve consistency between Shape representations in the Data Model and OMERO. Both the Data Model and OMERO will also include a significant addition: Folders, a new top-level model object.

Using ROI Folders

Our initial focus is on supporting ROI-based Folder workflows. OMERO.insight, OMERO.web and OMERO.cli will offer some support for users to have their images’ ROIs organized into a hierarchy of folders. We have several use cases in mind: for example, one may wish to sort cells into phenotypes or assign ontology terms to them, or an analysis script may track entities across a set of images and use a folder for each entity to store sightings of it. We expect the community to think of many more uses for folders.

Folders may seem rather like Datasets in their implementation: in our current draft of the Data Model one may name Folders, add a description, even annotate them, exactly as with datasets. The most obvious difference from Datasets is what Folders may contain: ROIs, but also other folders, even a mix of both. Folders may be nested arbitrarily deeply but with a caveat: a Folder may have only one parent. The same Dataset can be put into many projects but the Folder hierarchy is a strict tree. This simplifies the user interface and speeds up processing.
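To make the "strict tree" rule concrete, here is a minimal Python sketch. The `Folder` class below is a hypothetical stand-in written for illustration, not the OMERO model object, and its field and method names are assumptions. It only shows what "one parent at most" means in practice: linking a folder under a second parent, or under one of its own descendants, is refused.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Folder:
    """Illustrative stand-in for a Folder (hypothetical, not the OMERO class)."""
    name: str
    description: str = ""
    parent: Optional["Folder"] = None
    child_folders: List["Folder"] = field(default_factory=list)
    rois: List[str] = field(default_factory=list)  # ROI identifiers, for illustration only

    def add_child(self, child: "Folder") -> None:
        # Strict tree, part 1: a folder may have only one parent.
        if child.parent is not None:
            raise ValueError(f"{child.name!r} already has a parent")
        # Strict tree, part 2: walking up from self must never reach the child,
        # otherwise the link would create a cycle.
        ancestor: Optional["Folder"] = self
        while ancestor is not None:
            if ancestor is child:
                raise ValueError("a folder cannot contain itself or one of its ancestors")
            ancestor = ancestor.parent
        child.parent = self
        self.child_folders.append(child)


# A folder can hold ROIs, other folders, or a mix of both.
phenotypes = Folder("phenotypes", rois=["roi-7"])
mitotic = Folder("mitotic", rois=["roi-1", "roi-2"])
phenotypes.add_child(mitotic)
```

Because every folder has at most one parent, the path from any folder up to the root is unique, which is the property that keeps display and traversal simple.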

Graphical clients

An ROI may be both on an image and in some folders. When a scientist views the ROIs of an image in OMERO.insight or OMERO.web they will be able to see how those ROIs are organized into folders. Work is ongoing in both clients to deliver at least some ability to work with ROIs and folder hierarchies in OMERO 5.3.0. On Twitter, YouTube and elsewhere we have already published some screenshots of how OMERO.insight can be used to organize ROIs into folders and we are working on usability features such as making images searchable by the name of a folder that contains any of the image’s ROIs.

ROI Folder screenshot

OMERO.server and scripts

Users of the Blitz API are well aware that it allows a wider range of actions than the graphical clients support. Those exploring how Folders are represented will see that a Folder has a parentFolder property for its parent, if any, and childFolders and roiLinks properties for its contents, as they might expect.

In our current draft of the new OME Data Model, folders may directly contain images regardless of ROIs. One may expect the graphical clients to ignore this experimental feature, relying instead on folder-image linkage via ROIs as above, but the additional imageLinks property of Folder objects may be useful for grouping images in a different way within a hierarchy.

One concern with writing scripts may be that processing an arbitrarily deep Folder hierarchy could require many separate calls to the IQuery query service. OMERO 5.3.0 will include two new Request subclasses for queries, FindParents and FindChildren, that can be used to traverse hierarchies arbitrarily far in one call: for instance, to ask which images are contained in a set of Folders or in any of their subfolders.
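The question such a request answers ("which images sit in these folders or anywhere beneath them?") can be illustrated with a small in-memory sketch. This is not the FindChildren API and makes no server calls; the function name, the dictionaries and the IDs are all invented for illustration. It simply shows the traversal that would otherwise take one query per level of the hierarchy.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def images_under(
    roots: Iterable[int],
    child_folders: Dict[int, List[int]],
    folder_images: Dict[int, List[int]],
) -> Set[int]:
    """Collect image IDs linked to the given folders or to any of their subfolders."""
    found: Set[int] = set()
    seen: Set[int] = set()
    queue = deque(roots)
    while queue:
        folder_id = queue.popleft()
        if folder_id in seen:
            continue  # roots may overlap, so visit each folder only once
        seen.add(folder_id)
        found.update(folder_images.get(folder_id, []))
        queue.extend(child_folders.get(folder_id, []))
    return found


# Hypothetical data: folder 1 contains folders 2 and 3; folder 2 contains folder 4.
child_folders = {1: [2, 3], 2: [4]}
folder_images = {1: [101], 3: [102], 4: [103, 104], 5: [105]}

result = images_under([1], child_folders, folder_images)
print(sorted(result))  # prints: [101, 102, 103, 104]
```

Done client-side against a server, each step of that loop would be a separate round trip, which is the cost a single hierarchy-aware request avoids.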

The future of Folders

We have considered proposals for containing other objects, like plates, wells and annotations, in Folders. One can imagine that someday Folders may supersede everything from tag sets to datasets but, in starting out by allowing only a couple of kinds of content, and then only in a strict tree of Folders, we take a conservative approach that allows us to adjust our plans based on how the community reacts to the introduction of Folders in OMERO. We welcome your input on how you see Folders being used in your workflows.

Windows support update

The OME team has always been committed to building specifications and software that are cross-platform solutions and that support as many different types of users as possible, including those with limited IT support/budget. In keeping with our open source ethos and the fact we rely on public and charity grant funding, we also try to use open source tools as part of our workflows for testing, building and deploying our products wherever possible. We do support commercial operating systems and platforms: we build and test OMERO.insight on Windows, actively work to support accessing the OMERO and Bio-Formats APIs in MATLAB, and actively support browsing OMERO.web using IE.

Since the beginning of the OMERO project, we have actively tested and supported builds and tests of OMERO.server on Windows. Several users (students, faculty and institutions) have highlighted the importance of this support over many years. Therefore we are frustrated and saddened to announce that we have to withdraw support for OMERO.server on Windows starting with the 5.3.0 release. This means OMERO.web hosting will not be possible on Windows either. We will of course still support running OMERO.insight on Windows and OMERO.web browsing on IE, and continue to provide full support for Bio-Formats on Windows (including the C++ components of this project). The reasons for this decision are outlined below.

Ever-increasing technical challenges

Our Continuous Integration (CI) system uses Travis and allows the OME Consortium’s work to be built and tested on a per-commit basis. One of the challenges of running OME’s CI system is including tests for the numerous products we release, across several operating systems with different configurations, e.g. Python 2.7, OpenJDK 1.8, Ice 3.5. The testing matrix is constantly growing; already we are adding Ice 3.6 and soon Java 1.9 will need to be on our radar too. The resources we have for building and maintaining our CI system aren’t infinite, and we have to balance them against core development work. There’s always a tension between rapid development of new functionality and robust, reliable testing.
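As a rough illustration of why a testing matrix grows so quickly, the sketch below counts the combinations produced by a few axes. The axis values are taken from names mentioned in this post, but the matrix itself is an assumption made for the example and not the actual CI configuration.

```python
from itertools import product
from typing import Dict, List


def matrix(axes: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Return every combination of one value per axis."""
    names = list(axes)
    return [dict(zip(names, combo)) for combo in product(*axes.values())]


# Illustrative axes only (assumed for this example).
axes = {
    "os": ["Ubuntu 14.04", "CentOS 6", "Windows"],
    "python": ["2.7"],
    "java": ["1.8"],
    "ice": ["3.5"],
}
before = matrix(axes)

# Adding one value to each of two axes multiplies the number of jobs.
axes["ice"].append("3.6")
axes["java"].append("1.9")
after = matrix(axes)

print(len(before), len(after))  # prints: 3 12
```

Each new value multiplies the total rather than adding to it, so every supported operating system carries the cost of every other axis.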

The focus of our OMERO 5.2.1 release was on deployment, following feedback from system administrators (e.g. this forum thread). We improved our server installation guides and OMERO.web deployment documentation, and provided stepwise deployment scripts, e.g. for CentOS 6 with Python 2.7. We extensively used Docker to test our Linux installation scripts and also to test our installation documentation. All the installation scripts are available (see omero-install) and run on the CI system each time a commit is pushed.

During the development phase of OMERO 5.2.1, we dedicated a large amount of extra resources—from the devops team and in terms of computing hardware—to test the Windows installation scripts and improve our installation documentation. This level of effort will not scale with the introduction of new elements to our testing matrix e.g. Ice 3.6 support. Moreover, we have other critical priorities—public repositories and improved support for ontologies to name but two—so we are forced to make difficult decisions.

Our usage statistics indicate that a large majority of production servers are installed on Ubuntu 14.04 and CentOS 6, and fewer than 10% are run on Windows, so this decision should have relatively limited impact. We are, nonetheless, an open source project and aim to support our incredibly diverse community and the needs they have to deploy our software. So, even if only a few OMERO.server installations run on Windows, we very much want to support them. We simply can’t afford to do so.

For all these reasons, from version 5.3, we will not be able to support OMERO.server on Windows. Instead, we will focus on ensuring we provide the best support we can for a range of UNIX-like systems, continuing the effort to make OMERO.server easier to install, maintain and manage. Again, this is a build and testing issue, so if anyone out there would like to contribute their time, energy, expertise and compute resources to provide Windows support, we’d welcome them doing so.

This decision is for OMERO.server support and OMERO.web hosting only; we will continue to support and test OMERO.insight, OMERO.web browsing and all aspects of the Bio-Formats project on Windows.

Bio-Formats and OME Data Model Development Status

This is an update about what we are working on in the Bio-Formats codebase for the next few months. As this is where the OME Data Model lives, it covers our current and upcoming work on the Model and the Bio-Formats project.

Current Bio-Formats development focus

The release of 5.1.7 back in December is likely to be the last regularly planned release of Bio-Formats 5.1.x. Bio-Formats development has now shifted to focus on 5.2.0 in the develop branch. There are two points for Bio-Formats users to note:

  • the primary aim of the Bio-Formats 5.2.0 work is to upgrade our OME Data Model (as discussed below) to provide critical new functionality for many of our users
  • our regular Bio-Formats Java schedule of monthly releases will be suspended and non-critical bug fixes and new format support will have a lower priority until this model work is complete

For developers using Bio-Formats, the develop branch will include development schema versions and should not be used for writing OME files (OME-XML, OME-TIFF) until Bio-Formats 5.2.0 is released.

We hope to release Bio-Formats version 5.2.0 in Spring 2016. You can follow our progress on the public Trello board.

Data Model work

The main effort of the Bio-Formats 5.2.0 development work will be focused on updating the Data Model to include a folder-like structure for storing Regions of Interest (ROIs), as discussed in the most recent OMERO status post.

Regions of Interest are core features of the OME Data Model, currently stored as image components without any ordering or structure. We have identified several use cases across a wide range of imaging domains, from high content screening to digital pathology, where this representation limits the usability of ROIs. For instance, the Image Data Resource¹ built by OME contains several datasets where each image is associated with several hundreds of thousands of ROIs (nice examples are here, here, and here). Similar orders of magnitude of ROIs are commonly generated computationally by analytical tools in high content screening. In other domains, an ROI or set of ROIs needs to be associated with a complex hierarchical representation such as an ontology. Across all these use cases, there is a growing need to organize, browse and filter ROIs at the model level. To address it, we will introduce a folder concept allowing the ROIs within an image to be grouped in a hierarchical manner.

We aim to update OMERO to include ROI Folders and release this as version 5.3 during Spring 2016.

If you are interested in our design process, you can follow the discussion on the issues in our Design GitHub repository.

We also aim to extend our support of experimental and analytic metadata (more about this in a later entry). In brief, our aim is to package and release all the work we’ve done on the Image Data Resource as tools for the community to use to access a broad range of types of metadata.

New format support

Despite the focus on the Data Model, 5.2.0 will also introduce two new formats. These are scheduled to be Becker & Hickl SPC and Princeton Instruments SPE. We are currently working on the readers for these and would greatly appreciate sample files if you have any to help us with testing (you can submit files via our QA system or get in touch on our mailing lists for details on how to submit larger files).

While the core team won’t be focusing on any other readers for 5.2.0, we continue to encourage community submissions. New readers submitted by external collaborators will be treated on a case-by-case basis. We always aim to review external PRs promptly but our capacity for reviewing major changes is going to be reduced for the next couple of months so release of new readers may be delayed to 5.2.1 or later. We will endeavour to keep you informed of the timeline, including having public Trello boards for future Bio-Formats releases so the whole community can follow what is upcoming (these will be listed on the Getting Started Trello board).


The OME Data Model and Bio-Formats C++ will be decoupled from the main Bio-Formats code repository and renamed as OME-Files. This new API will provide the reference implementation for working with the file formats defined by the Open Microscopy Environment—OME-XML and OME-TIFF—in Java and C++ and the new development cycle will allow us to get updates out to our users as quickly as possible.

  1. At the time of publication, this was referred to as the ‘Image Data Repository’. 

Supporting complex formats - what we will and won't do, and what you can do to help

You may have noticed that a few months ago, we received an email asking when we expect to support 3D HISTECH .mrxs files. This sort of request isn’t particularly unusual and the reply gives an insight into one of the key challenges we face.

Just because we don’t have a reader, doesn’t mean we haven’t done any work

3D HISTECH .mrxs is an example of a complex format, the design of which does not make our work any easier. In fact, we can say with some confidence that the 3D HISTECH .mrxs file format is the most complex whole slide imaging file format we have ever encountered. We can say this because although we haven’t delivered a full reader for .mrxs—and there hasn’t been substantial public development—we have spent a great deal of time examining the format and potential solutions, and building test readers. Thanks to the example data the community has generously provided, we have been able to analyse the on-disk layout as well as the compression types, and map out the details of what an implementation would entail.

Unfortunately, the result of all this work has been the conclusion that we simply do not have the resources to prioritize delivering a complete solution for this format. This is not the only format we have reached this conclusion about. For example, support for 3i Slidebook 6 files was only added to Bio-Formats last May when 3i committed to developing the reader themselves. Obviously, we are very grateful for this, but that doesn’t change the fact that we had already spent years working on various versions of this format (our initial single-series Slidebook reader was released back in 2006 and obviously the work to produce it started even further back than that). Nikon ND2 and Zeiss CZI are other examples of formats with a complex design that makes them very difficult for us to support.

We won’t deliver something that doesn’t do the job well enough

One thing to understand about our work, strategy and commitment to supporting all file formats, especially formats used in production-scale facilities that use technology like whole slide imaging, is that we insist on delivering as close to complete support as possible. This is important given the size of community we support, the breadth of applications that use our software, and the need for utility and reliability in the software we deliver to the community.

With 3D HISTECH .mrxs, it is very hard and expensive to meet this goal. To be specific:

  1. The design decisions of 3D HISTECH with respect to image pyramid layout are at odds with what we can reasonably handle within the infrastructure currently in place. Our analysis suggests we will have to re-calculate several of the resolution levels, because of choices 3D HISTECH has made in their tiling strategy. This will create a substantial performance penalty for anyone using Bio-Formats to read this format.
  2. The strategy for storing binary data on disk differs substantially between brightfield and fluorescence images in 3D HISTECH .mrxs. They are essentially two different file formats, thus doubling the work required.
  3. Based on recent data submissions and information from the community, 3D HISTECH scanners default to JPEG-XR compression when acquiring fluorescence data. This is another doubling of work and complexity, as we would need to support both compressed and uncompressed data, in brightfield and fluorescence.
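To give a feel for the cost behind the first point, here is a generic sketch of what rebuilding pyramid levels involves. It is not Bio-Formats code and says nothing about the actual .mrxs layout; the downsampling factor of 2, the stopping size and the function names are all assumptions for illustration. Each rebuilt level has to be computed from the pixels of the level above it, so the amount of data read grows quickly the higher up the pyramid the rebuild has to start.

```python
from typing import List, Tuple


def pyramid_levels(width: int, height: int, scale: int = 2,
                   min_side: int = 256) -> List[Tuple[int, int]]:
    """List (width, height) for each resolution level, largest first."""
    if scale < 2:
        raise ValueError("scale must be at least 2")
    levels = [(width, height)]
    # Keep shrinking until the longest side fits within min_side.
    while max(levels[-1]) > min_side:
        w, h = levels[-1]
        levels.append((max(1, w // scale), max(1, h // scale)))
    return levels


def pixels_read_to_rebuild(levels: List[Tuple[int, int]], first_rebuilt: int) -> int:
    """Pixels read if every level from index first_rebuilt onwards is
    recomputed from the level directly above it."""
    if not 1 <= first_rebuilt < len(levels):
        raise ValueError("first_rebuilt must name a level below the full-resolution one")
    return sum(w * h for (w, h) in levels[first_rebuilt - 1:-1])


levels = pyramid_levels(4096, 2048)
print(levels)
# prints: [(4096, 2048), (2048, 1024), (1024, 512), (512, 256), (256, 128)]
```

In this toy case, rebuilding only the smallest level means reading the 512 x 256 level above it, whereas rebuilding everything below full resolution means reading the full-resolution plane as well.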

These points are specific to this format but similar issues occur with other proprietary formats. As a team, we are not comfortable with releasing a reader implementation that works on a limited set of file format variants, or requires time consuming and computationally expensive reprocessing and pyramid creation, just because of the implementation choices made by the format designers.

A philosophical point about our funding and resources

The OME Consortium and the wider development community have worked steadily since 2002, funded mostly by grants from non-profit charities and public funders, to build tools for the scientific community.

Building readers for proprietary formats has never been funded, and we don’t think it would ever be funded by any grant funding panel. New readers are created either by diverting our precious resources from other projects, by contributions from the community (most recently by the companies themselves), or by work commissioned by customers of Glencoe Software. We certainly listen to the community and adjust our priorities based on requests, but we can’t do everything with limited resources.

Perhaps we could crowdsource the funding for file formats but that misses the point—the formats we often lack the resources to support are those which are complex, expensive, difficult, proprietary, closed formats, designed to lock their users into a single, proprietary software application. The community’s resources are finite and should be used for things other than reverse engineering this type of format; work that, if subjected to peer review, would be declined as a waste of community resources. Several of those “other things” were discussed at our most recent Annual Users Meeting and represent key technologies that the community needs to achieve its scientific goals.

Over the last few years, we have seen efforts by several commercial imaging companies to support open formats, provide open APIs, and to make it easier for researchers and clinicians to work with the data acquired by their instruments. We have also received specifications and input from several imaging companies, which we have used to improve our own work and output. We applaud this trend; ultimately it means scientists, clinicians, engineers and developers spend less time dealing with data formats and more time doing science, developing new technologies and treating patients.

What you can do

The community has the power to change this situation. You are paying for these proprietary formats. You can condition your purchase, continued payment of support and maintenance fees etc. on:

  • the delivery of a rational, well-designed, efficient, open format
  • use of open compression schemes
  • support for the community’s efforts to deliver open readers for these files

You can of course also commit your own development resources to help solve this problem.