What is it about?

Most scientific facilities produce large amounts of heterogeneous data at a rapid pace. Managing users, instruments, reports and invoices presents additional challenges. To address these challenges, EMhub, a web platform designed to support the daily operations and record-keeping of a scientific facility, has been introduced. EMhub enables the easy management of user information, instruments, bookings and projects. The application was initially developed to meet the needs of a cryoEM facility, but its functionality and adaptability have proven broad enough to extend to other data-generating centers. This expansion is enabled by the modular nature of EMhub's core functionalities. External processes can connect to the application via a REST API, automating tasks such as folder creation, user and password generation, and the execution of real-time data-processing pipelines. EMhub has been used for several years at the Swedish National CryoEM Facility and is installed in the CryoEM center at the Structural Biology Department at St. Jude Children's Research Hospital, where a fully automated single-particle pipeline has been implemented for on-the-fly data processing and analysis. At St. Jude, the X-Ray Crystallography Center and the Single-Molecule Imaging Center have already expanded the platform to support their operational and data-management workflows.
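
To make the idea of an external process driven by a REST API more concrete, here is a minimal Python sketch of one such task, folder creation for a new session. It is an illustration only and does not reproduce EMhub's actual API: the route `/api/sessions/<id>`, the bearer-token header, the record fields (`project`, `name`, `instrument`) and both function names are assumptions made for this example. The request is built but never sent.

```python
import json
from pathlib import Path
from urllib import request


def build_session_request(base_url: str, session_id: int, token: str) -> request.Request:
    """Build (without sending) a GET request for one session record.

    ASSUMPTION: the route and the bearer-token header are hypothetical;
    consult the EMhub developer guide for the real endpoints.
    """
    url = f"{base_url.rstrip('/')}/api/sessions/{session_id}"  # hypothetical route
    headers = {"Authorization": f"Bearer {token}"}
    return request.Request(url, headers=headers, method="GET")


def create_session_folder(root: Path, session: dict) -> Path:
    """Create <root>/<project>/<name> and store the record beside the data.

    ASSUMPTION: the 'project' and 'name' keys are a hypothetical record shape.
    """
    folder = root / str(session["project"]) / str(session["name"])
    folder.mkdir(parents=True, exist_ok=True)  # safe to call repeatedly
    (folder / "session.json").write_text(json.dumps(session, indent=2))
    return folder
```

A worker process could poll the platform for new sessions, call `create_session_folder` for each record it receives, and then report the resulting path back, so that the folder layout on disk stays in step with the bookings recorded in the web application.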

Why is it important?

Data management is crucial for cryoEM and other scientific facilities. However, many facilities lack proper computational infrastructure and software for their operations. Access to different instruments must be coordinated and scheduled based on user type, facility policies, and the data-acquisition technique. Users may need multiple sessions on several instruments for the same project, and keeping track of related experiments can help with planning and timely execution. Having on-the-fly data-processing workflows is also advantageous for continuously assessing sample and data quality and instrument performance with minimal delay. This helps to ensure efficient instrument usage and high-quality services for users.

Perspectives

The current implementation of EMhub has already proven valuable, and we expect it to appeal to other facilities as well. Since different facilities have diverse needs and operations, we do not think a single monolithic solution can be ideal for every case. Accordingly, we have focused on developing a framework with many useful built-in features that can be extended and customized for different situations. Extending it does require some programming skills, namely the basics of the Python language and web programming concepts (HTML, JavaScript, REST, etc.). Moreover, a facility's data-management workflow can be complex, and this needs to be considered when integrating it into EMhub. To facilitate this process, we have put a lot of effort into documenting EMhub (https://3dem.github.io/emdocs/emhub/developer_guide/), providing many examples that can serve as a guide when implementing new features. We envision EMhub becoming a community hub where other developers implement custom features while contributing core functionality or enhancements back to the project. We plan to support this collaborative development model to grow EMhub and serve the scientific community better.

Jose Miguel De la Rosa Trevin
St. Jude Children's Research Hospital

Read the Original

This page is a summary of: EMhub: a web platform for data management and on-the-fly processing in scientific facilities, Acta Crystallographica Section D Structural Biology, October 2024, International Union of Crystallography,
DOI: 10.1107/s2059798324009471.