Published in Water: Development of a Deep Learning Emulator for a Distributed Groundwater–Surface Water Model: ParFlow-ML

By Tran, H.; Leonarduzzi, E.; De la Fuente, L.; Hull, R.B.; Bansal, V.; Chennault, C.; Gentine, P.; Melchior, P.; Condon, L.E.; Maxwell, R.M.

Integrated hydrologic models solve coupled mathematical equations that represent natural processes, including groundwater, unsaturated, and overland flow. However, these models are computationally expensive. It has been recently shown that machine learning (ML) and deep learning (DL) in particular could be used to emulate complex physical processes in the earth system. In this study, we demonstrate how a DL model can emulate transient, three-dimensional integrated hydrologic model simulations at a fraction of the computational expense. This emulator is based on a DL model previously used for modeling video dynamics, PredRNN. The emulator is trained based on physical parameters used in the original model, inputs such as hydraulic conductivity and topography, and produces spatially distributed outputs (e.g., pressure head) from which quantities such as streamflow and water table depth can be calculated. Simulation results from the emulator and ParFlow agree well with average relative biases of 0.070, 0.092, and 0.032 for streamflow, water table depth, and total water storage, respectively. Moreover, the emulator is up to 42 times faster than ParFlow. Given this promising proof of concept, our results open the door to future applications of full hydrologic model emulation, particularly at larger scales.

Read the paper: https://doi.org/10.3390/w13233393

Posted in Uncategorized

The human microbiome encodes resistance to the antidiabetic drug acarbose

By Balaich J, Estrella M, Wu G, Jeffrey PD, Biswas A, Zhao L, Korennykh A, Donia MS

The human microbiome encodes a large repertoire of biochemical enzymes and pathways, most of which remain uncharacterized. Here, using a metagenomics-based search strategy, we discovered that bacterial members of the human gut and oral microbiome encode enzymes that selectively phosphorylate a clinically used antidiabetic drug, acarbose1,2, resulting in its inactivation. Acarbose is an inhibitor of both human and bacterial α-glucosidases3, limiting the ability of the target organism to metabolize complex carbohydrates. Using biochemical assays, X-ray crystallography and metagenomic analyses, we show that microbiome-derived acarbose kinases are specific for acarbose, provide their harbouring organism with a protective advantage against the activity of acarbose, and are widespread in the microbiomes of western and non-western human populations. These results provide an example of widespread microbiome resistance to a non-antibiotic drug, and suggest that acarbose resistance has disseminated in the human microbiome as a defensive strategy against a potential endogenous producer of a closely related molecule.

Read the paper: https://www.nature.com/articles/s41586-021-04091-0

New mouse models for high resolution and live imaging of planar cell polarity proteins in vivo

By Lena P. Basta, Michael Hill-Oliva, Sarah V. Paramore, Rishabh Sharan, Audrey Goh, Abhishek Biswas, Marvin Cortez, Katherine A. Little, Eszter Posfai, Danelle Devenport

The collective polarization of cellular structures and behaviors across a tissue plane is a near universal feature of epithelia known as planar cell polarity (PCP). This property is controlled by the core PCP pathway, which consists of highly conserved membrane-associated protein complexes that localize asymmetrically at cell junctions. Here, we introduce three new mouse models for investigating the localization and dynamics of transmembrane PCP proteins: Celsr1, Fz6 and Vangl2. Using the skin epidermis as a model, we characterize and verify the expression, localization and function of endogenously tagged Celsr1-3xGFP, Fz6-3xGFP and tdTomato-Vangl2 fusion proteins. Live imaging of Fz6-3xGFP in basal epidermal progenitors reveals that the polarity of the tissue is not fixed through time. Rather, asymmetry dynamically shifts during cell rearrangements and divisions, while global, average polarity of the tissue is preserved. We show using super-resolution STED imaging that Fz6-3xGFP and tdTomato-Vangl2 can be resolved, enabling us to observe their complex localization along junctions. We further explore PCP fusion protein localization in the trachea and neural tube, and discover new patterns of PCP expression and localization throughout the mouse embryo.

Read the paper: https://doi.org/10.1242/dev.199695

cuFINUFFT: a load-balanced GPU library for general-purpose nonuniform FFTs

By Yu-hsuan Shih, Garrett Wright, Joakim Andén, Johannes Blaschke, Alex H. Barnett

Nonuniform fast Fourier transforms dominate the computational cost in many applications including image reconstruction and signal processing. We thus present a general-purpose GPU-based CUDA library for type 1 (nonuniform to uniform) and type 2 (uniform to nonuniform) transforms in dimensions 2 and 3, in single or double precision. It achieves high performance for a given user-requested accuracy, regardless of the distribution of nonuniform points, via cache-aware point reordering, and load-balanced blocked spreading in shared memory. At low accuracies, this gives on-GPU throughputs around 10^9 nonuniform points per second, and (even including host-device transfer) is typically 4-10x faster than the latest parallel CPU code FINUFFT (at 28 threads). It is competitive with two established GPU codes, being up to 90x faster at high accuracy and/or type 1 clustered point distributions. Finally we demonstrate a 5-12x speedup versus CPU in an X-ray diffraction 3D iterative reconstruction task at 10^-12 accuracy, observing excellent multi-GPU weak scaling up to one rank per GPU.

Read the paper: https://doi.org/10.48550/arXiv.2102.08463

Using Intel Advisor at Princeton Research Computing clusters: analyze performance remotely and visualize results locally

Intel Advisor is an optimization tool that helps developers identify hot spots and performance issues, and provides recommendations for performance improvement. It is installed on most of the Princeton Research Computing systems. Intel Advisor was previously part of the licensed Parallel Studio XE (PSXE) releases. It is now included in the Intel oneAPI Base Toolkit, which is free to download. In this article, we walk you through the process of collecting performance data remotely on Princeton Research Computing clusters using the Intel Advisor command line interface (CLI) and displaying the results on a local macOS system using the Intel Advisor graphical user interface (GUI).

Preparing Applications for Performance Analysis

For C/C++ and Fortran code (on Linux OS), it is recommended to set the following compiler flags before running the performance analysis:

  1. Request full debug information (compiler and linker): -g
  2. Request moderate optimization: -O2 or higher 
  3. Disable interprocedural optimization, which may prevent the profiler from collecting performance data: -no-ipo
  4. Produce compiler diagnostics: -qopt-report=5
  5. Enable OpenMP directives: -qopenmp

See: https://software.intel.com/content/www/us/en/develop/documentation/advisor-user-guide/top.html
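
As a minimal sketch, the flags listed above can be combined into a single compile line. The compiler name (icc), the source file (./src/myApplication.c) and the output path are assumptions made for illustration; substitute the compiler and files used by your own project.

```shell
# Flags recommended above for a build that Intel Advisor can analyze well.
ADVISOR_FLAGS="-g -O2 -no-ipo -qopt-report=5 -qopenmp"

# Hypothetical compiler and file names; replace them with your own.
CC="${CC:-icc}"
BUILD_CMD="$CC $ADVISOR_FLAGS -o ./bin/myApplication ./src/myApplication.c"

# Print the command for inspection; run it with: eval "$BUILD_CMD"
echo "$BUILD_CMD"
```

Keeping the flags in one variable makes it easy to reuse the same set in a Makefile or build script when switching between profiling and production builds.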

Using Intel Advisor on Princeton Research Computing Clusters

Previously, we recommended using the CLI to collect data via batch jobs on compute nodes and then viewing the results with the GUI on a login node. Now that the Intel Advisor GUI is freely available on macOS, we recommend copying the collected data from the remote system to your local macOS system for viewing. Note that Intel Advisor does not support data collection on macOS; macOS can only be used to display data collected on Windows or Linux.

Collecting Data at Remote System

Once on a remote system (e.g., Tigercpu or Adroit), start by loading the module, e.g., module load intel-advisor. You can then collect data using the Intel Advisor CLI, which is launched with the advisor command. Use advisor --help to look up the command for a specific action. For example, after issuing advisor --help, you will see

Intel(R) Advisor Command Line Tool
Copyright (C) 2009-2020 Intel Corporation. All rights reserved.

 Usage: advisor <--action> [--action-option] [--global-option] [[--] <target>
 [target options]] 

<action> is one of the following:
 
    collect          Run the specified analysis and collect data. Tip: Specify the search-dir when collecting data.
    command          Issue a command to a running collection.
    create-project   Create an empty project, if it does not already exist.
    help             Explain command line actions with corresponding options.
    import-dir       Import and finalize data collected on an MPI cluster.
    mark-up-loops    After running a Survey analysis and identifying loops of interest, select loops (by file and line number or criteria) for deeper analysis.
    report           Generate a report from data collected during a previous analysis.
    snapshot         Create a result snapshot.
    version          Display product version information. 
    workflow         Explain typical Intel Advisor user scenarios, with corresponding command lines. 

For help on a specific action, type: advisor --help <action>
 
Examples: 

 Perform a Survey analysis.
 
 	advisor --collect=survey --project-dir=./advi --search-dir src:r=./src -- ./bin/myApplication
 
 Generate a Survey report.
 
 	advisor --report=survey --project-dir=./advi --search-dir src:r=./src
 
 Display help for the collect action.
 
 	advisor --help collect

advisor --help collect shows the options for performing a specific analysis. For example, to perform a survey analysis to determine hotspots, we use

advisor --collect=survey --project-dir=./advi --search-dir src:r=./src   -- ./bin/myApplication

To collect the roofline, you can run a tripcounts analysis on top of the above survey analysis. Note the project directory needs to be the same for both analyses.

advisor --collect=tripcounts --flop --project-dir=./advi --search-dir src:r=./src -- ./bin/myApplication

Intel Advisor version 2021.1 provides a roofline analysis option that combines the two-step roofline collection above into a single step.

advisor --collect=roofline --project-dir=./advi --search-dir src:r=./src -- ./bin/myApplication

Note that using the --no-auto-finalize option to reduce collection and finalization time is NOT recommended if the data will be reviewed on a local macOS system later, since macOS might have different versions of the compiler, runtimes, math libraries and other parts of the analyzed application stack (see: https://software.intel.com/content/www/us/en/develop/documentation/advisor-cookbook/top/analyze-performance-remotely-and-visualize-results-on-macos.html).
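
Since collection is normally run as a batch job on a compute node, the roofline command can be wrapped in a job script. The sketch below assumes the cluster uses the Slurm scheduler; the file name advisor_roofline.slurm and all #SBATCH resource values are hypothetical placeholders rather than site recommendations.

```shell
# Write a hypothetical Slurm job script (advisor_roofline.slurm).
# All #SBATCH values are placeholders; adjust them for your application.
cat > advisor_roofline.slurm <<'EOF'
#!/bin/bash
#SBATCH --job-name=advisor-roofline
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --time=00:30:00

module purge
module load intel-advisor
# Also load any other modules your application needs at run time.

# Match the OpenMP thread count to the cores requested above.
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

# Single-step roofline collection, as in the command above.
advisor --collect=roofline --project-dir=./advi --search-dir src:r=./src -- ./bin/myApplication
EOF

# On the cluster, submit the job with: sbatch advisor_roofline.slurm
echo "wrote advisor_roofline.slurm"
```

The heredoc delimiter is quoted so that $SLURM_CPUS_PER_TASK is written literally into the script and only expanded when the job runs.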

It is also helpful to use the GUI to find out the command. For example, you can:

  1. Log in to a remote head node with ssh -Y username@adroit.princeton.edu
  2. Load the module with module load intel-advisor
  3. Launch the Intel Advisor GUI with advisor-gui
  4. Create a project
  5. Set up the project properties
  6. Choose the appropriate analysis type
  7. Click the get command line button on the workflow tab under the desired analysis
  8. Copy the command line to the clipboard and paste it into the script for remote runs

To view the results, you can copy the whole project directory to your local macOS system. However, we recommend first packing the analysis results into a snapshot and then copying the packed *.advixeexpz file. For example:

advisor --snapshot --project-dir=./advi --pack --cache-sources --cache-binaries -- ./advi_snapshot
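
The packed snapshot can then be pulled down from the local Mac with scp. This is a sketch under stated assumptions: the user name, host and remote directory are hypothetical placeholders, and the file name advi_snapshot.advixeexpz assumes the snapshot command above wrote its packed output next to the ./advi_snapshot path.

```shell
# Hypothetical values: replace with your user name, cluster and remote path.
REMOTE_USER="username"
REMOTE_HOST="adroit.princeton.edu"
REMOTE_DIR="/path/to/run/directory"
SNAPSHOT="advi_snapshot.advixeexpz"   # assumed name of the packed snapshot

# Assemble the copy command (run it on the local Mac, not on the cluster).
COPY_CMD="scp ${REMOTE_USER}@${REMOTE_HOST}:${REMOTE_DIR}/${SNAPSHOT} ."

# Print the command for inspection; run it with: eval "$COPY_CMD"
echo "$COPY_CMD"
```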

Viewing Results on a Local macOS System

You can download Intel Advisor for macOS as part of the oneAPI Base Toolkit. After launching the Intel Advisor GUI, go to File > Open > Project/Result and navigate to the copied project directory or snapshot.

Intel Advisor GUI

This article covers the following features that are new in Intel Advisor version 2021.1:

  1. Intel Advisor is included as part of the Intel oneAPI Base Toolkit
  2. The executables are renamed: advixe-cl is now advisor, and advixe-gui is now advisor-gui
  3. The roofline analysis is provided as a single command. In earlier versions, roofline analysis was done by first running a survey analysis followed by a tripcounts analysis. Now we can run the roofline in a single step using the --collect=roofline option.

For a complete list of new updates, please see https://software.intel.com/content/www/us/en/develop/documentation/advisor-user-guide/top/what-s-new.html.

References:

  1. https://software.intel.com/content/www/us/en/develop/documentation/advisor-user-guide/top.html
  2. https://software.intel.com/content/www/us/en/develop/documentation/advisor-cookbook/top/analyze-performance-remotely-and-visualize-results-on-macos.html

Using Codeocean for sharing reproducible research

As a researcher, inquiries about previously published research probably evoke two feelings: panic-filled regret or calm authority. Often the difference is time; it’s easier to talk about the project you worked on last week than last decade. Talking about old protocols or software is a lot like someone critically examining the finger painting you did as a child. You know it’s not perfect and you would do several things differently in hindsight, but it is the method in the public record. The rate of change seems faster with software development, where new technologies redefine best practices and standards at dizzying rates.

Perhaps the most challenging problem is when researchers outside of your institution fail to reproduce results. How can you troubleshoot the software on every system or determine what missing piece is required to get things working? What was that magic bash command you wrote 5 years ago?

Published in Nature Physics: Topological limits to the parallel processing capability of network architectures

By Giovanni Petri, Sebastian Musslick, Biswadip Dey, Kayhan Özcimder, David Turner, Nesreen K. Ahmed, Theodore L. Willke & Jonathan D. Cohen

The ability to learn new tasks and generalize to others is a remarkable characteristic of both human brains and recent artificial intelligence systems. The ability to perform multiple tasks simultaneously is also a key characteristic of parallel architectures, as is evident in the human brain and exploited in traditional parallel architectures. Here we show that these two characteristics reflect a fundamental tradeoff between interactive parallelism, which supports learning and generalization, and independent parallelism, which supports processing efficiency through concurrent multitasking. Although the maximum number of possible parallel tasks grows linearly with network size, under realistic scenarios their expected number grows sublinearly. Hence, even modest reliance on shared representations, which support learning and generalization, constrains the number of parallel tasks. This has profound consequences for understanding the human brain’s mix of sequential and parallel capabilities, as well as for the development of artificial intelligence systems that can optimally manage the tradeoff between learning and processing efficiency.

Read the paper: https://www.nature.com/articles/s41567-021-01170-x

Linting non-inclusive language with blocklint

Blocklint is a simple command line utility for finding non-inclusive wording with an emphasis on source code. If you’ve used a modern IDE, you know the importance of immediate feedback for compilation errors or even stylistic slip-ups.  Knowing all variables should be declared or that lines must be less than 80 characters long is good, but adhering to those rules takes a back seat when in the flow of writing code.  A linter brings these issues back into your consciousness by highlighting the problematic lines of code.  Over time, the enforced style becomes more intuitive but the linter is always there to nudge you if you slip.

FlyVR

I started working as a research software engineer for the Princeton Neuroscience Institute (PNI) in May 2017. At the end of my first week I received an email from Professor Mala Murthy and post-doc David Deutsch of the MurthyLab, asking if I would be interested in working on a project involving a “virtual reality environment for neural recording experiments”. The kid in me got very excited at the prospect of making video games. At the time I did not know the project was to develop a virtual reality simulation for flies!

The FlyVR setup. Projection arena for visual stimuli not shown as it surrounds parts of the setup and occludes components. Projector for visual stimuli is directed at a curved mirror to project onto half dome surrounding fly.

Virtual reality experiments have a long history in neuroscience. They allow researchers to restrict the movement of animal subjects so that they can use advanced microscopy to image their brains in “naturalistic” environments. In the Murthy lab’s VR setup, the fly is fixed to the objective of a two-photon microscope and suspended above a small sphere floating on a column of air. This small sphere is used as a sort of omni-directional treadmill. While the fly cannot actually move, it can move its legs, which in turn move the freely rotating sphere. The movement of the sphere is tracked with computer vision algorithms and thus a fictive path for the fly in a virtual world can be reconstructed. This setup allows a “moving” fly’s brain to be imaged with techniques that require it to be stationary. The two-photon imaging system then provides a very flexible and powerful tool for studying changes in the fly’s brain activity over the course of the experiment. Different spatial and temporal resolutions are available depending on the needs of the experimenter.


Monitoring slurm efficiency with reportseff

Motivation

As I started using Snakemake, I had hundreds of jobs that I wanted to get performance information about. seff gives the efficiency information I wanted, but for only a single job at a time. sacct handles multiple jobs, but couldn’t give the efficiency. With the current Python implementation of reportseff, all job information is obtained from a single sacct call and, with click, the output is colored so you can quickly see how things are running.
