Boost Histogram 0.6

boost-histogram logo

The foundational histogramming package for Python, boost-histogram, has just been updated to version 0.6! This is a major update to the new Boost.Histogram bindings; version 0.6.1 is based on the Histogram package in the recently released Boost C++ Libraries version 1.72.

This Python library is part of a larger picture in the Scikit-HEP ecosystem of tools for Particle Physics and is funded by DIANA/HEP and IRIS-HEP. It is the core library for making and manipulating histograms. Other packages are under development to provide a complete set of tools to work with and visualize histograms. The Aghast package is designed to convert between popular histogram formats, and the Hist package is being designed to make common analysis tasks simple, such as plotting via tools like the mplhep package. Hist and Aghast will initially be driven by HEP (High Energy and Particle Physics) needs, but outside issues and contributions are welcome and encouraged.

Continue reading on ISciNumPy →

Configuration settings in the ASPIRE package

As with any substantial application package, the ASPIRE project needed a convenient way to specify configuration settings pertaining to different parts of the computational pipeline.

What follows below are some outlines from our attempts to tackle this configuration issue. Where a supplementary (and hopefully useful) nugget is provided, or a caveat discussed, I shall append a linked numeral, like so: (n)
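To make the problem concrete, one common pattern for pipeline-wide settings (a generic sketch, not ASPIRE's actual mechanism; the section and option names below are hypothetical) is an INI-style file parsed with the standard library's `configparser`, with one section per pipeline stage:

```python
from configparser import ConfigParser

# Hypothetical settings for two stages of a pipeline
INI_TEXT = """
[common]
n_workers = 4

[apple]
particle_size = 78
"""

config = ConfigParser()
config.read_string(INI_TEXT)

# Typed accessors avoid ad-hoc string parsing at each call site
n_workers = config.getint("common", "n_workers")
particle_size = config.getint("apple", "particle_size")
print(n_workers, particle_size)
```

Sectioned files like this let each part of the pipeline look up only its own settings while sharing the common ones.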

A brief background of ASPIRE

ASPIRE is a Python (3.6) package under development that ingests micrographs, the output of Cryo-Electron Microscopy (images that closely resemble television static), and produces a 3D reconstruction of the molecule. Read the excellent writeup on the ASPIRE page for a more comprehensive review of the package.

Continue reading

Developing a GPU Version of APPLE-Picker in a Five-day Hackathon Event

Background on APPLE-Picker

APPLE-Picker is a submodule of the ASPIRE Python package, developed by researchers in Professor Amit Singer’s group, for reconstructing a 3D CryoEM map of a biomolecule from corresponding 2D particle images. It is an automatic tool for selecting millions of particles from thousands of micrographs, a critical step in the CryoEM image reconstruction pipeline. Particle picking used to be performed manually, which is very tedious and difficult, especially for small particles with low contrast (low signal-to-noise ratio). The CPU version takes ~80 seconds on average to process one micrograph. To achieve the goal of finishing thousands of micrographs in a few minutes, we need an alternative approach, such as GPU acceleration.
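One common way to prepare NumPy code for a GPU port like this (a generic sketch, not the actual APPLE-Picker implementation) is to write kernels against an array-module parameter, so the same function runs on CPU arrays with NumPy or, by passing CuPy instead, on GPU arrays:

```python
import numpy as np

def normalize_patch(patch, xp=np):
    """Zero-mean, unit-variance normalization of an image patch.

    `xp` is the array module: numpy for CPU, or cupy for GPU,
    since the two libraries share a large common API.
    """
    patch = xp.asarray(patch, dtype=xp.float64)
    return (patch - xp.mean(patch)) / xp.std(patch)

# CPU usage; the same call with xp=cupy would run on the GPU
out = normalize_patch([[1.0, 2.0], [3.0, 4.0]])
print(out.mean(), out.std())
```

Because CuPy mirrors much of the NumPy API, this style lets a codebase move computation to the GPU without maintaining two copies of each kernel.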

2019 Princeton GPU Hackathon

Princeton University held its first GPU hackathon on campus this summer, from June 24 to 28, organized and hosted by the Princeton Institute for Computational Science and Engineering (PICSciE) and co-sponsored by NVIDIA and the Oak Ridge Leadership Computing Facility (OLCF). The main goal of this hackathon was to port research codes to GPUs or optimize them with the help of experts from industry, academia, and national labs, as emphasized by Ian Cosden, one of the lead organizers and manager of Princeton’s Research Software Engineering Group. This blog post reports our attempts and the story behind accelerating APPLE-Picker using GPUs and parallel computing in Python.

Continue reading

Development of Python ASPIRE Package

Background on ASPIRE (Algorithms for Single Particle Reconstruction)

Significant progress in computational algorithms and software is one of the major drivers of the resolution revolution in three-dimensional structure determination of biomolecules using CryoEM. In this technique, rapidly frozen and randomly oriented 3D particles are projected into noisy 2D images on micrographs, and 3D density maps are reconstructed at atomistic resolution through computer software. Because 3D biomolecules such as protein enzymes play crucial roles in structural and chemical biology, biophysics, biomedicine, and other related fields, the 2017 Nobel Prize in Chemistry was awarded to three scholars for significantly advancing the CryoEM technique, as explained in this YouTube video.

During the past 10 years, Professor Amit Singer’s group has proposed many new numerical algorithms and developed the ASPIRE Matlab package to tackle the problems involved in reconstructing a 3D CryoEM map of a biomolecule from corresponding 2D particle images, including CTF estimation, denoising, particle picking, 2D and 3D classification, and ab initio 3D reconstruction.

Feature Summary of ASPIRE
Continue reading

Hello World from the Princeton RSE Group!

Welcome to the first Princeton RSE group blog entry!  Before we get into the good stuff, here’s a bit of background.

About Us

A relatively new addition, the Princeton RSE group formed in late 2016 within the Princeton Research Computing department.  We work long term with traditional research groups, developing software to enable and advance their research.  We strive to develop high-quality software in terms of both performance and sustainability/maintainability, and we work alongside research groups from multiple disciplines.  You can read more about our group on our webpage, and about its individual members here.

What you can expect out of this blog

We’ll be sharing software projects we’ve worked on, including major releases, scripts and solutions we’ve developed, and interesting little discoveries that have come up along the way.  We’re looking to share some of the best practices we’ve settled on, point out some lessons learned, and foster discussion within the broader RSE and research software community.


Paper published in Science: A metagenomic strategy for harnessing the chemical repertoire of the human microbiome

By Yuki Sugimoto, Francine R. Camacho, Shuo Wang, Pranatchareeya Chankhamjon, Arman Odabas, Abhishek Biswas, Philip D. Jeffrey, Mohamed S. Donia

The human microbiome is a vast and complex ecosystem, teeming with microbial life that influences our health in profound ways. While correlations between microbiome composition and disease have been widely studied, the molecular mechanisms behind these relationships remain elusive. One promising avenue for exploration lies in the small molecules produced by these microbes—compounds that mediate interactions both among microbes and between microbes and their human hosts.

In this groundbreaking study, the authors introduce MetaBGC, a hybrid computational and synthetic biology strategy designed to uncover biosynthetic gene clusters (BGCs) directly from metagenomic sequencing data. These clusters encode the machinery for producing bioactive small molecules, including antibiotics and other therapeutics.

Read the paper: https://www.science.org/doi/10.1126/science.aax9176


Published in Socius: Improving Metadata Infrastructure for Complex Surveys: Insights from the Fragile Families Challenge

By Kindel, A. T., Bansal, V., Catena, K. D., Hartshorne, T. H., Jaeger, K., Koffman, D., McLanahan, S., Phillips, M., Rouhani, S., Vinh, R., & Salganik, M. J.

Researchers rely on metadata systems to prepare data for analysis. As the complexity of data sets increases and the breadth of data analysis practices grow, existing metadata systems can limit the efficiency and quality of data preparation. This article describes the redesign of a metadata system supporting the Fragile Families and Child Wellbeing Study on the basis of the experiences of participants in the Fragile Families Challenge. The authors demonstrate how treating metadata as data (i.e., releasing comprehensive information about variables in a format amenable to both automated and manual processing) can make the task of data preparation less arduous and less error prone for all types of data analysis. The authors hope that their work will facilitate new applications of machine-learning methods to longitudinal surveys and inspire research on data preparation in the social sciences. The authors have open-sourced the tools they created so that others can use and improve them.
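The "metadata as data" idea can be illustrated with a small sketch (the variable names and fields below are hypothetical, not the actual Fragile Families metadata schema): once per-variable metadata is released in a machine-readable format, a data-preparation step like "find all continuous variables from a given wave" becomes a short programmatic query instead of a manual codebook search:

```python
import json

# Hypothetical machine-readable variable metadata for a survey
METADATA_JSON = """
[
  {"name": "m1_age",  "label": "Mother's age at baseline",  "type": "continuous",  "wave": 1},
  {"name": "m1_race", "label": "Mother's race/ethnicity",   "type": "categorical", "wave": 1},
  {"name": "f2_age",  "label": "Father's age at wave 2",    "type": "continuous",  "wave": 2}
]
"""

variables = json.loads(METADATA_JSON)

# Automated selection: names of all continuous variables
continuous = [v["name"] for v in variables if v["type"] == "continuous"]
print(continuous)
```

The same metadata file serves both automated pipelines (filtering, type checking, recoding) and human readers browsing variable labels, which is the dual use the authors emphasize.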

Read the paper: https://doi.org/10.1177/2378023118817378
