As I was preparing for an R intro course, I came up with the idea of creating a fake data set stuffed with every error one can imagine. In case my imagination falls short, I’d appreciate any suggestions in the comments so that I can incorporate more errors.
There is a Hungarian saying about the veterinarian’s horse, used to describe a case that exhibits every possible condition a subject can suffer from (read more of the etymology here).
I would like to create a data set that shows all the possible errors a data set can exhibit. These data would then be used in the aforementioned course to make the participants’ lives miserable, or rather, their experience more diverse.
So far I have been able to come up with the following issues:
"1,234,567.0058654" (the commas need to be cleared and the value turned into numeric; the extra digits are irrelevant but eat up memory)
0-3 works fine, but
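For the comma-formatted number, a minimal sketch of the cleanup in base R could look like the following. The vector `raw` and its extra values are made up for illustration, and rounding to 2 decimals is an arbitrary choice, not a recommendation.

```r
# Hypothetical raw values, stored as character strings with thousands separators
raw <- c("1,234,567.0058654", "42", "7,001.5")

# Remove the commas as literal characters, then convert to numeric
clean <- as.numeric(gsub(",", "", raw, fixed = TRUE))

# Drop the digits that carry no information (2 decimals is an assumption)
rounded <- round(clean, 2)

print(rounded)
```

Note that a double takes 8 bytes in R no matter how many digits it holds, so the rounding mostly pays off when the values are kept as text or written back to a file.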
I don’t imagine that this list can ever be complete, and right now it is far from it. If you have struggled with a problem in the past and would like others to learn from it, please leave a comment and I will expand the list accordingly.
In a paper recently published in The Condor, titled Evaluating time-removal models for estimating availability of boreal birds during point-count surveys: sample size requirements and model complexity, we assessed different ways of controlling for point-count duration in bird counts using data from the Boreal Avian Modelling Project. As the title indicates, the paper describes a cost-benefit analysis to make recommendations about when to use different types of the removal model. The paper is open access, so feel free to read the whole paper here.