Using Code Ocean for sharing reproducible research

For a researcher, inquiries about previously published work probably evoke one of two feelings: panic-filled regret or calm authority. Often the difference is time; it’s easier to talk about the project you worked on last week than the one from last decade. Talking about old protocols or software is a lot like having someone critically examine the finger painting you did as a child. You know it’s not perfect, and in hindsight you would do several things differently, but it is the method in the public record. The rate of change seems even faster in software development, where new technologies redefine best practices and standards at a dizzying rate.

Perhaps the most challenging problem arises when researchers outside of your institution fail to reproduce your results. How can you troubleshoot the software on every system or determine which missing piece is required to get things working? What was that magic bash command you wrote five years ago?


Linting non-inclusive language with blocklint

Blocklint is a simple command-line utility for finding non-inclusive wording, with an emphasis on source code. If you’ve used a modern IDE, you know the importance of immediate feedback for compilation errors or even stylistic slip-ups. Knowing that all variables should be declared or that lines must be less than 80 characters long is good, but adhering to those rules takes a back seat when you are in the flow of writing code. A linter brings these issues back into your consciousness by highlighting the problematic lines of code. Over time, the enforced style becomes more intuitive, but the linter is always there to nudge you if you slip.


Monitoring Slurm efficiency with reportseff

Motivation

When I started using Snakemake, I had hundreds of jobs whose performance I wanted to check. seff gives the efficiency information I wanted, but only for a single job at a time. sacct handles multiple jobs, but doesn’t report efficiency. The current Python implementation of reportseff obtains all job information from a single sacct call and uses click to color the output, so you can quickly see how things are running.
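To make the idea concrete, here is a minimal sketch of how CPU efficiency for several jobs could be derived from one batch of sacct output. It is not reportseff's actual implementation: the function names and the sample text are made up for illustration, and the sample only assumes the pipe-delimited layout you would get from something like `sacct --parsable2 --format=JobID,State,Elapsed,TotalCPU,AllocCPUS --jobs=...`. Job steps (such as `.batch` rows) are not handled specially here.

```python
import csv
import io


def parse_duration(text):
    """Convert a Slurm duration like '1-02:03:04', '02:03:04' or '03:04.500' to seconds."""
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = [float(p) for p in text.split(":")]
    # Pad missing leading fields (e.g. 'MM:SS.mmm' has no hours field).
    while len(parts) < 3:
        parts.insert(0, 0.0)
    hours, minutes, seconds = parts
    return days * 86400 + hours * 3600 + minutes * 60 + seconds


def cpu_efficiency(sacct_output):
    """Return {JobID: CPU efficiency in percent} for each row of pipe-delimited output."""
    reader = csv.DictReader(io.StringIO(sacct_output), delimiter="|")
    result = {}
    for row in reader:
        # Core-seconds the job had available: wall time times allocated CPUs.
        available = parse_duration(row["Elapsed"]) * int(row["AllocCPUS"])
        if available > 0:
            result[row["JobID"]] = 100 * parse_duration(row["TotalCPU"]) / available
    return result


# Hypothetical sample text standing in for real sacct output.
SAMPLE = """JobID|State|Elapsed|TotalCPU|AllocCPUS
1001|COMPLETED|01:00:00|03:00:00|4
1002|COMPLETED|00:10:00|00:01:00|1
"""

efficiencies = cpu_efficiency(SAMPLE)
```

In this made-up sample, job 1001 used three CPU-hours out of the four core-hours it was allocated, while job 1002 used one CPU-minute out of ten, so the second job would be the one worth a closer look.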
