Posts

An update (v 0.1-1) of the intrval package was recently published on CRAN. The package simplifies interval-related logical operations (read more about the motivation in this post). So what is new in this version? Some of the inconsistencies in the first CRAN release have been cleaned up, and I have been pushed hard (see the GitHub issue) to implement all 16 interval-to-interval operators. These operators define the open/closed nature of the lower/upper limits of the intervals on the left- and right-hand sides of the o in the middle, as in c(a1, b1) %[]o[]% c(a2, b2).

I recently posted a piece about how to write and document special functions in R. I meant that as a prelude to the topic I am writing about in this post. Let me start at the beginning. The other day Dirk Eddelbuettel tweeted about the new release of the data.table package (v1.9.8). New features were announced for joins based on %inrange% and %between%. That got me thinking: it would be really cool to generalize this idea to different intervals, for example as x %[]% c(a, b).
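To make the x %[]% c(a, b) idea concrete, here is a minimal base R sketch of a closed-interval operator. It is a hypothetical illustration written for this excerpt and is not taken from the intrval package, whose implementation may handle edge cases (missing values, reversed limits) differently.

```r
# Hypothetical operator for illustration only; not the intrval implementation.
# Tests whether each element of x lies in the closed interval [a, b].
`%[]%` <- function(x, interval) {
  stopifnot(length(interval) == 2L)
  a <- min(interval)  # lower limit
  b <- max(interval)  # upper limit
  x >= a & x <= b
}

x <- c(1, 2.5, 4, 7)
inside <- x %[]% c(2, 4)
print(inside)
```

Both limits are included here, so the value 4 counts as inside; an open-interval variant would swap >= and <= for > and <.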

I spend a considerable portion of my working hours on data processing, where I often use the %in% R function as x %in% y. Whenever I needed the negation of that, I used to write !(x %in% y). Not much of a hassle, but still, wouldn’t it be nicer to have x %notin% y instead? So I decided to code it for my mefa4 package, which I maintain primarily to make my data munging time shorter and more efficient. Coding a %special% function was no big deal. But it took quite a bit of research and trial and error before I figured out the proper documentation. So here it goes.
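The negation described above can be sketched in one line of base R. This is a minimal definition for illustration; the version shipped in mefa4 may differ in details.

```r
# Negated version of %in%; a minimal sketch, which may differ from mefa4's.
`%notin%` <- function(x, table) !(x %in% table)

x <- c("a", "b", "c")
y <- c("b", "d")
not_in_y <- x %notin% y
print(not_in_y)
```

The backticks are needed when defining the function because %notin% is not a syntactic name; once defined, it is used as an infix operator just like %in%.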

Transformation of native habitat by human activity is the main cause of global biodiversity loss. Humans have visibly transformed 27% of Alberta to date. The effects of these changes depend on the species, and on the nature and extent of the human activities in question. Teasing apart these factors in a cumulative effects framework is the focus of several initiatives and organizations in Alberta. The Alberta Biodiversity Monitoring Institute (ABMI) collects data and produces information that helps attribute the effects of human activities on species to different industrial sectors, or as we call them, sector effects.

As a testament to my obsession with progress bars in R, here is a quick investigation about the overhead cost of drawing a progress bar during computations in R. I compared several approaches including my pbapply and Hadley Wickham’s plyr.
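A rough way to see this kind of overhead is to time the same loop with and without a progress bar. The sketch below uses base R's utils::txtProgressBar as a stand-in rather than pbapply or plyr, and the loop size and the trivial work function are assumptions chosen only for illustration; timings will vary by machine.

```r
# Compare a plain loop with one that updates a text progress bar each iteration.
# txtProgressBar is a base R stand-in here, not the pbapply or plyr bars.
n <- 1000
work <- function(i) sqrt(i)  # trivial placeholder computation

plain <- function() {
  out <- numeric(n)
  for (i in seq_len(n)) out[i] <- work(i)
  out
}

with_bar <- function() {
  out <- numeric(n)
  pb <- utils::txtProgressBar(min = 0, max = n, style = 3, file = stderr())
  on.exit(close(pb))
  for (i in seq_len(n)) {
    out[i] <- work(i)
    utils::setTxtProgressBar(pb, i)
  }
  out
}

t_plain <- system.time(res_plain <- plain())
t_bar <- system.time(res_bar <- with_bar())
print(c(plain = t_plain[["elapsed"]], with_bar = t_bar[["elapsed"]]))
```

Because the work per iteration is tiny, most of any difference in elapsed time comes from redrawing the bar, which is the cost worth reducing.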

As of today, there are 20 R packages that reverse depend/import/suggest (3/14/3) the pbapply package. Current and future package developers who decide to incorporate the progress bar using pbapply might want to customize the type and style of the progress bar in their packages to better suit the needs of certain functions or to create a distinctive look. Here is a quick guide to help in setting up and customizing the progress bar.

The pbapply R package adds a progress bar to vectorized functions, like lapply. A feature request for a progress bar for parallel functions has been sitting in the development GitHub repository for a few months. More recently, the author of the pbmcapply package dropped a note about his implementation of forking functionality with a progress bar for Unix/Linux computers, which got me thinking. How should we add a progress bar to snow-type clusters? This led to more important questions: what is the real cost of the progress bar, and how can we reduce its overhead on processing times?

The title says it all. I wrote this piece about Publication Viability Analysis while pondering a pattern that I observed when looking at Hungarian ecologists’ publication output through time using the Web of Science database (the original post is in Hungarian).

This post was prompted by this blog post by Gabor Csardi about using the cranlogs package. But my own interest as a long-time package developer dates back to this post by Ben Bolker. I like to see that my packages are being used. So I thought: why stop at counting downloads and plotting the past? Why not predict into the future?

I was invited to represent ABMI at the Multi-taxa Monitoring in North America symposium, North American Congress for Conservation Biology, Madison, Wisconsin, July 18, 2016. The symposium was organized by Michael Lucid (Idaho Department of Fish and Game). It was great to see all the good work happening in North America, and the commitment to push the agenda of multi-taxa monitoring against critics and scarce funding (of course Alberta ‘has all the oil money’).

As I was preparing for an R intro course I came up with the idea of creating a fake data set that is stuffed full of all the conceivable errors one can imagine. Just in case my imagination falls short, I’d appreciate all the suggestions in the comments so that I can incorporate more errors.

Automated acoustic monitoring is gaining momentum worldwide. Alberta is stepping up its game by implementing automated recording unit (ARU) based monitoring programs. An improved command line tool is here to help in the process.

pbapply is a lightweight R extension package that adds a progress bar to vectorized R functions (*apply). The latest addition in version 1.2-0 is the timerProgressBar function, which adds a text-based progress bar with a timer. It all started with this pull request.

The mefa4 R package is aimed at efficient manipulation of very big data sets, leveraging sparse matrices thanks to the Matrix package. The recent update (version 0.3-3) of the package includes a bug fix and a few new functions for comparing sets and finding dominant features in compositional data, as described in the ChangeLog.

Personal website revamped

February 27, 2016 Etc site

It all started when the Jekyll 3.0 update on GitHub Pages broke my site, which was based on the SinglePaged theme. Although Karthik Raman sent a nice pull request with a fix, I opted to revamp my site instead of fixing the old theme.

One-day short course at NACCB congress in Madison, WI, on July 16th, with Peter Solymos and Subhash Lele.

Information on spatial distribution, habitat associations, responses to human footprint, and predicted relative abundance distributions for 2285 species in Alberta by the Alberta Biodiversity Monitoring Institute (ABMI) at http://species.abmi.ca.

One-day teaching workshop at ICCB/ECCB congress in Montpellier on August 1st, 2015.

We presented a poster at the ICCB/ECCB 2015 congress in Montpellier, France, that summarized our research on single visit methodology.

Habitat associations and responses to human footprint were quantified for several breeding bird species as part of a collaborative modeling effort that synthesized the available information in Alberta.

The ABMI hosted its 2nd annual Speakers’ Series ‘Better Environmental Management Through Monitoring 2015’ to understand distribution of biodiversity and to inform sustainable resource development and biological conservation in Alberta.

The Alberta Biodiversity Monitoring Institute (ABMI) monitors species and their habitats to understand the distribution of biodiversity and to inform sustainable resource development and biological conservation in Alberta. The species website can be accessed at http://species.abmi.ca.

I presented a guest lecture ‘Data cloning: bridging the Bayesian and frequentist statistical paradigms’, at the Budapest R User Group meetup, Budapest, Hungary.

Discussing problems vs. finding solutions: an operational framework for dealing with imperfect detection in species distribution modelling, International Statistical Ecology Conference 2014, Montpellier, France.

Development of predictive models for migratory landbirds and estimation of cumulative effects of human development in the oil sands areas of Alberta, Joint Oil Sands Monitoring: Cause-Effects Assessment of Oil Sands Activity on Migratory Landbirds, Edmonton, AB, 2014.

Statistical computing meets biodiversity conservation and natural resource management

