The OME Blog Because metadata is worth a thousand pictures

The Image Data Resource is added to the list of Scientific Data recommended data repositories

The Nature Research journal Scientific Data is a peer-reviewed, open-access journal for descriptions of scientifically valuable datasets, and research that advances the sharing and reuse of scientific data. These data descriptors provide a path for publishing datasets associated with scientific publications.

Scientific Data requires that all datasets related to a data descriptor, including experimental, computational and curated data, be submitted to an appropriate open, public community repository. Although Scientific Data mandates the release of the datasets that accompany its manuscripts, it does not itself host data. Instead, it encourages submission of data to community-recognized data repositories where possible.

The Image Data Resource (IDR), the world’s largest public bioimage database, is now recommended by Scientific Data as a repository for bioimaging data. Authors wishing to submit a data descriptor that includes imaging data will be asked to deposit the data in one of the image data repositories that now include IDR. IDR accepts reference image datasets (as defined at IDR Submissions). Datasets that are not reference images can be published using data archives (e.g. BioStudies or Data Dryad) or, if appropriate, any of the other imaging data repositories recommended by Scientific Data (e.g. EM DataBank, Cancer Imaging Archive, SICAS Medical Image Repository, or Coherent X-ray Imaging Data Bank). Datasets published in IDR will receive a Data DOI that should then be included during the Scientific Data manuscript submission process. This follows the successful publication in IDR of the imaging data for a Scientific Data data descriptor by Pascual-Vargas et al on the role of Rho GTPases in triple negative breast cancer.

The Scientific Data repository list is mirrored for use by other Springer Nature journals and is therefore also available for use by authors submitting to other Springer Nature journals. As an example, an article published by Kilpinen et al in Nature is supported by imaging data that has been archived in IDR.

We look forward to publishing many more imaging datasets associated with Scientific Data and other Springer Nature articles now that this partnership is officially launched!

Quality figures from OMERO.figure

This is a repost from the figure.openmicroscopy.org blog where its creator Will Moore talks about how OMERO.figure works:

I recently read an excellent guide by Benjamin Nanes on How to Create Publication-Quality Figures.

It describes the goals of scientific figure creation (accuracy, quality, transparency) and a very thorough workflow to achieve these goals. The key is to understand your data and how it is stored, manipulated and displayed. In particular, it is important to minimise the number of steps where data is transformed and perform lossy steps as late as possible in the figure creation process.

Benjamin documents specific tools that he uses for his workflow such as ImageJ for images and Inkscape for figure layout. But much of his workflow can also be applied to OMERO.figure since it was designed with the same principles in mind.

I highly recommend you read the guide above, as it provides a lot of background information on how computers handle vector and raster data. The steps of Benjamin’s guide that can be replicated in OMERO.figure are described below.

Preparing figure components (High-bit-depth images)

The OMERO server stores your original microscope files and can render them as 8-bit images using your chosen rendering settings. Single-color LUTs can be applied to each channel over a specified intensity range and channels can be toggled on and off. None of these changes will alter the original microscope data. OMERO.figure does not require you to save 8-bit images before creating a figure, since all rendering is done ‘live’ within the figure app itself after importing images, as described below.
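To make 'rendering' a little more concrete, here is a minimal sketch of mapping a high-bit-depth intensity onto 8 bits over a chosen window and tinting it with a single color. This is not OMERO's rendering engine; the function name and parameters are invented for illustration, and a simple linear mapping is assumed.

```python
def render_pixel(value, start, end, color=(0, 255, 0)):
    """Map a raw intensity to an 8-bit RGB tuple.

    Hypothetical helper, not part of OMERO. `start` and `end` define the
    intensity window; `color` is the single-color LUT for the channel.
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    # Clip to the window, then scale linearly to the range 0..1.
    fraction = (min(max(value, start), end) - start) / (end - start)
    # Tint: scale each component of the channel color.
    return tuple(round(fraction * c) for c in color)


raw = [100, 600, 1100, 4000]   # e.g. 12-bit camera values
rendered = [render_pixel(v, 100, 1100) for v in raw]
```

The `raw` list is never modified: changing the window or the color only changes what is computed from it, which is the sense in which rendering settings leave the original data alone.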

Figure layout

OMERO.figure is similar to Inkscape and Adobe Illustrator in that it defines figures in a vector-based format that embeds linked images. This means that moving and resizing images within a figure does not require resampling of pixel data, avoiding loss of image quality.

Screenshot: Editing layout and rendering settings in OMERO.figure.
Data from Wang et al JCB 2013.

Importing images

To add images to OMERO.figure, you simply specify the ID of the image in the OMERO server. The necessary data such as image size, number of channels, pixel bit-depth etc. is then imported from the server. You can then edit the image rendering settings while working on the figure layout and these changes are stored in the OMERO.figure file. The file format is a JavaScript object (saved as JSON data) and contains no pixel data. OMERO.figure retrieves rendered 8-bit images from the OMERO.server and assembles them into a figure in the browser as needed.
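As a rough idea of what a figure file without pixel data might look like, the sketch below builds a small panel description and saves it as JSON. The field names and values are illustrative assumptions, not the actual OMERO.figure schema.

```python
import json

# Hypothetical figure description: field names are illustrative only.
figure = {
    "paper_width": 595, "paper_height": 842,   # page size in points (A4)
    "panels": [{
        "imageId": 1234,                        # placeholder image ID
        "x": 50, "y": 50, "width": 200, "height": 200,
        "zoom": 100,
        "channels": [
            {"color": "00FF00", "active": True,
             "window": {"start": 100, "end": 1100}},
        ],
    }],
}

text = json.dumps(figure, indent=2)   # what would be saved to the file
restored = json.loads(text)           # what would be loaded again later
```

Everything needed to redraw the panel is a reference to the image plus a handful of settings, so the file stays tiny however large the image is.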

The resolution (dpi) of images in OMERO.figure is calculated from their size on the page and the printed size of the page itself (which can be edited under File > Paper Setup…). The dpi of each image can be seen under the ‘Info’ tab and will change as the image is resized and zoomed.
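The calculation can be sketched roughly as follows. This assumes page coordinates are measured in points (72 per inch) and that zoom is a percentage; the function and parameter names are made up for the example.

```python
POINTS_PER_INCH = 72  # assumption: panel sizes on the page are in points


def panel_dpi(image_width_px, panel_width_pt, zoom_percent=100):
    """Estimate the dpi of an image panel on the printed page."""
    # Zooming in shows fewer source pixels across the same panel width.
    visible_px = image_width_px * 100 / zoom_percent
    width_inches = panel_width_pt / POINTS_PER_INCH
    return visible_px / width_inches


full_view = panel_dpi(512, 144)                    # 512 px across 2 inches
zoomed_in = panel_dpi(512, 144, zoom_percent=200)  # half as many pixels shown
shrunk = panel_dpi(512, 72)                        # same pixels in 1 inch
```

Making a panel smaller raises its dpi, while zooming in lowers it.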

Journals usually require all images to be at 300 dpi or above in order to avoid a pixelated appearance when figures are displayed at their published size. If you need to increase the dpi for an image, you can set an export dpi and the panel will be resampled as necessary in the exported PDF.
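A minimal sketch of the arithmetic behind an export dpi, assuming panels are only ever scaled up to reach the target and never scaled down. The function name is hypothetical and this is not the export script itself.

```python
import math


def export_size(width_px, height_px, current_dpi, export_dpi=300):
    """Return the pixel size a panel needs in order to reach export_dpi.

    Panels already at or above the target are left unchanged.
    """
    if current_dpi >= export_dpi:
        return width_px, height_px
    scale = export_dpi / current_dpi
    return math.ceil(width_px * scale), math.ceil(height_px * scale)


low_res = export_size(256, 256, current_dpi=128)    # needs resampling
high_res = export_size(512, 512, current_dpi=400)   # already sufficient
```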

Clipping masks

OMERO.figure allows you to crop images. It uses a ‘clipping mask’ to produce the cropping effect which means you can undo or edit the crop at any time. You can crop by using the zoom slider to zoom the image, then pan to the desired spot, or you can use a standard ‘crop’ tool to draw a crop region on an image.
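One way to picture a clipping mask is as a rectangle computed from the zoom and pan settings, with the image itself left untouched. The sketch below assumes zoom is a percentage and the pan offset is given in source pixels; these names are illustrative rather than the app's own.

```python
def visible_region(image_w, image_h, zoom_percent=100, dx=0.0, dy=0.0):
    """Return (x, y, w, h) of the source region shown in the panel.

    dx, dy: hypothetical pan offset of the view, in source pixels.
    """
    w = image_w * 100 / zoom_percent
    h = image_h * 100 / zoom_percent
    # Centre the view on the image, then apply the pan offset.
    x = (image_w - w) / 2 + dx
    y = (image_h - h) / 2 + dy
    return x, y, w, h


cropped = visible_region(1000, 800, zoom_percent=200, dx=100)
uncropped = visible_region(1000, 800)   # resetting the zoom undoes the crop
```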

Calculating scale bars

Scale bars can be easily added to images in OMERO.figure and the known pixel size will be used to calculate the correct length. Scale bars are vector objects overlaid on the image and will be automatically resized if you resize or zoom the image.
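The length calculation is simple enough to sketch. The version below assumes the pixel size is in micrometres and the panel width is in points; the function name is hypothetical.

```python
def scalebar_length_pt(bar_um, pixel_size_um, visible_px, panel_width_pt):
    """Length on the page (in points) of a scale bar of bar_um micrometres."""
    bar_px = bar_um / pixel_size_um          # bar length in image pixels
    points_per_px = panel_width_pt / visible_px
    return bar_px * points_per_px


# 10 um bar, 0.5 um pixels, 400 px visible across a 200 pt wide panel.
normal = scalebar_length_pt(10, 0.5, 400, 200)
# Zooming in 2x halves the visible pixels, so the bar doubles in length.
zoomed = scalebar_length_pt(10, 0.5, 200, 200)
```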

Exporting final figure files

OMERO.figure offers export in PDF and TIFF formats. Both are generated on the OMERO server using a Python script and the Python Imaging Library (PIL) for image manipulation. Figures are saved on the server and are then available to download.

Creating TIFF Images

TIFF images at 300 dpi are generated by resampling all the embedded images using a bilinear filter. Vector data such as labels and scalebars is rasterized and overlaid on the image.
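For readers unfamiliar with bilinear filtering, here is a small pure-Python sketch of the idea: each output pixel is a weighted average of the four nearest source pixels. It uses a simplified corner-aligned mapping and is not the PIL implementation used by the export script.

```python
def resize_bilinear(src, new_w, new_h):
    """Resize a 2D list of numbers using bilinear interpolation."""
    old_h, old_w = len(src), len(src[0])
    out = []
    for j in range(new_h):
        # Map the output row to a fractional source row.
        fy = j * (old_h - 1) / (new_h - 1) if new_h > 1 else 0.0
        y0 = int(fy)
        y1 = min(y0 + 1, old_h - 1)
        wy = fy - y0
        row = []
        for i in range(new_w):
            fx = i * (old_w - 1) / (new_w - 1) if new_w > 1 else 0.0
            x0 = int(fx)
            x1 = min(x0 + 1, old_w - 1)
            wx = fx - x0
            # Interpolate along x on both rows, then along y.
            top = src[y0][x0] * (1 - wx) + src[y0][x1] * wx
            bottom = src[y1][x0] * (1 - wx) + src[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


small = [[0, 100],
         [100, 200]]
bigger = resize_bilinear(small, 3, 3)
```

The new in-between pixels are blends of their neighbours, which is why resampling should be left as late as possible: it invents values that were never measured.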

Creating PDF Files

The Python script uses the Reportlab library to produce PDF files. Images are rotated, cropped and resampled to the chosen dpi as necessary and saved as TIFFs before embedding in the PDF. Labels and scalebars remain as vector objects that can subsequently be manipulated in other vector-based packages if needed.

Export with images

An additional option with TIFF or PDF figure export is to export all the embedded images as TIFFs, saved at each stage of the figure generation process:

  • As 8-bit images at full size as rendered by OMERO
  • After cropping & rotating, but before any resampling
  • Finally, saved as they are embedded in the figure

This option increases the transparency of the image processing steps, and also provides images that can be used for other purposes if needed.

Summary

OMERO.figure is a web application that stores figures in a vector-based file format linked to images. By linking to the original microscope images in the OMERO.server, we have complete control over rendering of high bit-depth images within the figure. Only when the figure is exported do we need to save images as 8-bit TIFFs. This pushes the lossy and file-format specific steps to the very end of the figure creation process, ensuring the highest possible quality of images in the final figure.

Thanks to Benjamin Nanes for his original guide which provided the basis of this article.

(Originally published on April 30th 2015)

OME FAIR

Recently there have been several publications and substantial discussion about the FAIR principles (see for example, Wilkinson et al, 2016 and the Force11 FAIR Data Principles). Overall, the goal of the FAIR principles is “to facilitate knowledge discovery by assisting humans and machines in their discovery of, access to, integration and analysis of …scientific data and their associated algorithms and workflows.”1 These principles are extremely powerful but, as has been repeatedly noted, their routine implementation is a significant challenge.

Imaging datasets present a particular challenge for implementing FAIR. The datasets are large, multidimensional and complex. Perhaps most importantly it is probably unrealistic to suggest that a single metadata standard will handle the huge diversity of imaging experiments and datasets. In the best possible case, it is likely that there will be families of metadata standards or flexible APIs, each tuned and designed for accessing specific types of imaging metadata.

OME has been working on the image data publication problem for many years. Our recent work on the Image Data Resource (IDR) is an example of an added value database that integrates imaging data from many biological imaging datasets and links gene and drug perturbations with cell phenotypes2. IDR focuses on reference image datasets, i.e. those datasets that have high levels of biological and molecular annotations and have a strong likelihood of re-use by the scientific community.

Our work on IDR has been well-received and the resource is growing in size and usage. However, IDR doesn’t address more routine data publication: the datasets that are not reference images but are associated with a scientific publication in the biological sciences. For example, our lab in Dundee has recently published a paper that explores the interaction of a single protein, Bod1, and the Ndc80 complex, a protein complex that mediates the attachment of microtubules to chromosomes during cell division3. How should the imaging data associated with this paper be published?

As you might guess, we’ve used OMERO to publish and link these data. We’ve used our institutional OMERO server, and used an institutional DOI as the definitive link to the data. The datasets associated with this paper were imported into OMERO as part of the analysis workflows and then were moved into a public OMERO group for publication. The data can be browsed, searched, viewed and downloaded. We believe we’ve made the datasets “AIR”—Accessible, Interoperable and Reusable. Making these datasets truly “Findable” will take more time as we develop routine landing pages and JSON-LD-based metadata for these images.
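To give a flavour of what JSON-LD-based metadata for a landing page could look like, here is a minimal sketch using the schema.org Dataset vocabulary. The choice of schema.org is an assumption made for this example, and every value below is a placeholder rather than a real identifier.

```python
import json

# Placeholder values throughout: none of these identify a real dataset.
dataset = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example imaging dataset",
    "description": "Images supporting an example publication.",
    "identifier": "https://doi.org/10.0000/example",
    "url": "https://omero.example.org/webclient/?show=project-1",
    "license": "https://creativecommons.org/licenses/by/4.0/",
}

# A landing page would embed this text inside a
# <script type="application/ld+json"> element.
json_ld = json.dumps(dataset, indent=2)
```

Search engines and other machines can read a block like this without having to understand the page layout, which is what would make the datasets easier to find.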

In the meantime, we thought it might be useful for the community to see how we have achieved this work. With the latest releases of OMERO (5.4 and beyond), we have made it fairly easy for images to be managed and published online. Documentation describing exactly what we did is available4.

We hope this work is an important contribution to the movement for making data available online. We believe we’ve made reasonable progress in making data AIR and look forward to fully achieving the goals of the FAIR principles.

OME Project Status Update

The year so far…

2017 has been a very busy year for the OME team so far with:

  • 2 major OMERO releases plus 6 security or patch releases - introducing ROI Folders, a whole new UI layout for HCS data, a configurable restricted admin user role and many other fixes and improvements
  • 12 Bio-Formats releases to date (and another in the pipeline) - featuring 2 new formats and improved support for many more
  • 4 OME Files C++ releases - improving support for reading and writing OME-TIFF in an open, liberally-licensed, native library
  • 4 OMERO.figure releases - adding support for loading ROIs and choosing look-up tables from OMERO, using markdown syntax for italics and bold labels, setting background colors, plus other fixes and improvements
  • various other web app releases - all now available from PyPI, including OMERO.FPBioimage, a volumetric visualization tool
  • the launch of our new OMERO.iviewer - a web browser-based interactive multidimensional image viewing app that includes ROI drawing and measurement functionality. OMERO.iviewer is approaching a full v1.0 release, a sign of our commitment to maintaining and growing the functionality in this app
  • AND a new website!

Plus of course, our usual Annual User Meeting bringing together members of the community from across the globe (if you couldn’t make it, talks and workshops are available from the event page).

The Image Data Resource (IDR) also continues to go from strength to strength. If you missed it, our Nature Methods paper (or the open access PubMedCentral version) discusses how the IDR can be used to obtain new biological insights from existing datasets, plus an in-depth explanation of how the resource is set up. It also includes information on how you can set up your own IDR.

With all that, hopefully you’ll forgive us for neglecting this blog!

What’s still to come…

We’re running two days of Bioinformatics training at Cambridge University in December. In the run up to this, we’ll be developing new training materials which will all be available via our website for those of you who can’t attend. We’ll also be expanding our collection of Jupyter notebooks providing you with examples of how to carry out image analysis via the OMERO API and likely adding to our collection of how-to movies on our YouTube channel. The IDR will also be updated, publishing several new datasets (our first light sheet fluorescence microscopy dataset is almost ready to go!) and improvements to the Jupyter analysis tools.
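As a taste of the kind of analysis such notebooks cover, here is a minimal sketch that fetches one plane through the OMERO Python gateway and computes its mean intensity. The host, credentials and image ID are placeholders supplied by the caller, and running `mean_intensity` requires the OMERO Python bindings and a reachable server.

```python
def plane_mean(plane):
    """Mean intensity of a 2D plane given as rows of numbers."""
    total = 0.0
    count = 0
    for row in plane:
        for value in row:
            total += float(value)
            count += 1
    return total / count


def mean_intensity(host, username, password, image_id, z=0, c=0, t=0):
    """Fetch one plane from an OMERO server and return its mean intensity."""
    # Imported here so that plane_mean() can be used without OMERO installed.
    from omero.gateway import BlitzGateway

    conn = BlitzGateway(username, password, host=host, port=4064)
    if not conn.connect():
        raise RuntimeError("could not connect to OMERO")
    try:
        image = conn.getObject("Image", image_id)
        if image is None:
            raise ValueError("image not found or not accessible")
        plane = image.getPrimaryPixels().getPlane(z, c, t)
        return plane_mean(plane)
    finally:
        conn.close()   # always release the session
```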

In terms of releases, we expect to put out patch releases for OMERO, Bio-Formats and OME Files. As always, the content of these will be driven by both the community and our own projects. For example, the IDR continues to challenge us in terms of format support and display, and in the way our tools connect and interact with analysis packages. You can always keep up with the latest upcoming releases and more via our Trello boards (you can sign up here with just an email address).

Beyond this, we are continuing to push forward our functionality and the scale at which our tools operate. As imaging technology takes huge leaps in scale, this is challenging to say the least and we’ll need the support of the community more than ever. If you have resources to contribute, we’d love you to get involved: write to the mailing lists or forums, check out our contributing developer docs, or read our blog posts on helping us support new file formats. Even if you don’t have resources to spare, you can always help us secure grant funding by citing our work in your publications.

OMERO 5.3 Status Update

2016 has been a busy year for OME. Many of you will have noticed the number of Bio-Formats releases and the fact that OME Files C++ is now available for implementing the OME Data Model and OME-TIFF support in C++ software (you can read our preprint here).

That doesn’t mean we haven’t been busy on the OMERO front, though. We’ve pushed our deadline to make sure we can deliver several new features:

Data Management

As well as implementing the new Data Model, we have introduced the Folders feature discussed in a previous post. Folders will allow you to organize your images’ ROIs into a hierarchical structure so that, for example, you can sort cells into phenotypes or assign ontology terms to them, or use an analysis script to track entities across a set of images and then use a folder for each entity to store sightings of it in OMERO.insight.

ROI Folder screenshot
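The hierarchy can be pictured as a simple tree. The sketch below is a stand-in data structure to illustrate the idea, assuming ROIs are referred to by ID; it is not the OMERO model object or its API.

```python
from dataclasses import dataclass, field


@dataclass
class Folder:
    """Hypothetical folder node holding ROI IDs and child folders."""
    name: str
    roi_ids: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def add_child(self, name):
        child = Folder(name)
        self.children.append(child)
        return child

    def all_roi_ids(self):
        """ROI IDs in this folder and in every folder beneath it."""
        ids = list(self.roi_ids)
        for child in self.children:
            ids.extend(child.all_roi_ids())
        return ids


phenotypes = Folder("Phenotypes")
mitotic = phenotypes.add_child("Mitotic")
mitotic.roi_ids.extend([1, 2])
interphase = phenotypes.add_child("Interphase")
interphase.roi_ids.append(3)
```

Asking the top-level folder for its ROIs gathers everything beneath it, so a phenotype or a tracked entity can be retrieved as a single group.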

UI development

We’ve been working on our metadata capturing and display, drawing lessons from our IDR project. We’ve also revamped the display of Screen-Plate-Well data to allow browsing of all fields within a well and display of their positions within each well. Zooming of both the plate and the fields is supported. It is also possible to add and view annotations on individual wells (and not just the Images they contain).

New Well UI screenshot

Permissions improvements

To aid the workflows of facilities managers, we are designing a new ‘Light Admin’ role for the OMERO permissions system. This will allow a manager to import data for any other user (i.e. it will belong to the other user), delete other users’ data (for clean up e.g. after someone has left the lab), and manage user groups (creating groups and adding existing users to them) without having the full rights and responsibilities of a full Administrator. If you are interested in all the technical details, there is a design issue on GitHub.

Reading data

OMERO 5.3.0 will bundle Bio-Formats 5.3.x. This will include, amongst other changes, support for JPEG-XR compressed CZI data. This work was funded by a partnership between Glencoe Software and ZEISS. See the announcement.

Additionally, OMERO 5.3.x will benefit from the improvements in Bio-Formats 5.3.x for tile-based image writing for TIFF and derived formats like OME-TIFF. For more information about the plan for Bio-Formats 5.3, see Bio-Formats Development Status.
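Tile-based writing means that a large plane is handled one rectangular block at a time instead of being held in memory whole. The sketch below only shows how such a tile grid can be enumerated; it is not the Bio-Formats API (which is Java), and the function name is made up.

```python
def iter_tiles(width, height, tile_w, tile_h):
    """Yield (x, y, w, h) for each tile covering a width x height plane."""
    for y in range(0, height, tile_h):
        for x in range(0, width, tile_w):
            # Tiles on the right and bottom edges may be smaller.
            yield x, y, min(tile_w, width - x), min(tile_h, height - y)


tiles = list(iter_tiles(1000, 600, 512, 512))
```

A writer would loop over these rectangles, fetching and writing the pixels for one tile before moving on to the next.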

Image rendering

We’ve introduced support for Lookup Tables and reverse intensity and improved the rendering engine to allow full projection thumbnails.

LUT support screenshot
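In essence, a lookup table maps each 8-bit intensity to a color, and reverse intensity flips the intensity before the lookup. The sketch below illustrates this with a simple black-to-color ramp; it is an illustration of the concept under those assumptions, not the OMERO rendering engine.

```python
def make_ramp_lut(color):
    """Build a 256-entry table running from black to the given RGB color."""
    return [tuple(c * i // 255 for c in color) for i in range(256)]


def apply_lut(values, lut, reverse=False):
    """Look up each 8-bit value, optionally reversing the intensity first."""
    return [lut[255 - v if reverse else v] for v in values]


magenta = make_ramp_lut((255, 0, 255))
normal = apply_lut([0, 255], magenta)
reversed_view = apply_lut([0, 255], magenta, reverse=True)
```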

We’ve also added histograms of image pixel intensities in the OMERO.insight and OMERO.web clients.

Histograms screenshot
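A histogram of this kind simply counts how many pixels fall into each intensity range. Here is a minimal sketch, assuming non-negative integer intensities and equal-width bins; the function name and defaults are chosen for this example only.

```python
def intensity_histogram(pixels, bins=8, max_value=255):
    """Count pixels per equal-width intensity bin over 0..max_value."""
    counts = [0] * bins
    for v in pixels:
        # Integer arithmetic keeps bin edges exact; clamp the top value.
        index = min(v * bins // (max_value + 1), bins - 1)
        counts[index] += 1
    return counts


counts = intensity_histogram([0, 31, 32, 255, 255])
```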

OMERO.web architecture

A big effort has gone into making OMERO.web deployable as a separate component from the server and the various web apps installable from Python Package Index (PyPI). This has involved reorganizing all the packages. OMERO.web now has an ‘Open With’ function for using custom viewers and our next generation web image viewer (OMERO.iviewer) is coming on too. There has also been a lot of work behind the scenes to improve the Web API.

In line with this, our future client development will be very much focused on web clients and OMERO.insight will enter maintenance mode once the updates for 5.3.0 are complete.

Developer previews

If you’re a developer with your own OMERO code, you can check out the work so far to see if it’ll affect you. There are further details in the milestone version history and as ever, the code is all on GitHub (our latest development milestone is here).

Heads up for sysadmins

We’re changing our recommended way to install OMERO.web so you can deploy it separately to the main OMERO.server, making it easier to get your virtual environment set up with all the necessary prerequisites. We have also deprecated support for Apache and recommend you deploy using Nginx as we are likely to drop Apache completely during the 5.3.x development line.

We are moving to Ice 3.6 as the minimum supported version. Other updates to version requirements include Python 2.7 but we will continue to support Java 1.7 for OMERO 5.3.x, as we are aware that upgrading to 1.8 would be incompatible with current MATLAB distributions.

With the requirements changes mentioned above, we are also considering no longer deploying and building OMERO on CentOS 6 to reduce our testing matrix. It will still be possible to use OMERO on CentOS 6 with the minimum requirements. See OMERO.server installation on CentOS 6 with Python 2.7 and Ice 3.6

Note that we are in the process of updating the documentation so although a preview of the code changes is publicly available, you should not expect all the milestone documentation to reflect these changes before the final release.

Release schedule

As a consequence of all this work, OMERO 5.3.0 is still under development and is now not expected until the first quarter of 2017.