class: inverse, middle, center
class: left, middle, inverse

<!-- Title slide -->

### Friends don't let friends copy-paste

![:pull right, 35%](
<img src="https://github.com/crsh/papaja/raw/main/tools/images/papaja_hex.png" style="padding-top: 0.5em;">
)

# Preventing code rot

#### Frederik Aust & Marius Barth

<small>
21.04.2023
</small>

---
exclude: true

---
layout: true
name: footer

<div class="my-footer">
<div style="float: left;"><span>Frederik Aust & Marius Barth</span></div>
<div style="text-align: right;"><span>Friends don't let friends copy-paste</span></div>
<div style="float: center;"><span>21.04.2023</span></div>
</div>

<script type="text/x-mathjax-config">
MathJax.Hub.Config({
  "HTML-CSS": {
    scale: 150,
  }
});
</script>

---

<script src="https://cdn.jsdelivr.net/npm/medium-zoom@1.0.6/dist/medium-zoom.js"></script>

<script type="module">
import mediumZoom from 'https://cdn.jsdelivr.net/npm/medium-zoom@1.0.6/dist/medium-zoom.esm.js'

const zoomDefault = mediumZoom('#zoom-default')
const zoomMargin = mediumZoom('#zoom-margin', { margin: 45 })
</script>

# Preventing code rot

---
layout: true
template: footer
name: rot

# Preventing code rot

---

### Recap

*Computational reproducibility is hard!*

--

- Shared reproducibility packages often fail<br />.grey[(e.g., Eubank, 2016)]

--

- Code is increasingly likely to break as time passes<br />(it "rots")
    - This is true even if code is *untouched*

---

- Computing environments inevitably change

--

    - Installation or removal of software
    - Software updates
    - Removal or relocation of files
    - (Hardware updates)

--

- Complex computing environments exacerbate the problem

---

<img src="data:image/png;base64,#img/dependendy-graph.png" width="70%" id="zoom-margin" style="display: block; margin: auto;" />

---

.pull-left-50[

<small>

- Computational reproducibility requires a stable computing environment
- But how?
    - Archive the computer used to run the analysis?
- Trade-off between robustness and feasibility

</small>

]

--

.pull-right-35[

.center[![:scale 35%](data:image/png;base64,#img/toppling-tower.jpg)]

]

---
exclude: true

<img src="data:image/png;base64,#img/reproducibility-stack.jpg" width="" height="425px" style="display: block; margin: auto;" />

.center[
<span style = "font-size: 75%">Adapted from Grüning et al. (2018)</span>
]

---

1. [`checkpoint`](https://github.com/RevolutionAnalytics/checkpoint/) by Microsoft
    - Requires a project-based workflow
    - Package database will gradually grow

--

2. [`groundhog`](https://groundhogr.com/)
    - Package database will gradually grow

--

3. [`renv`](https://rstudio.github.io/renv/articles/renv.html) by RStudio
    - Most flexible and powerful
    - Least straightforward to use
    - No "forensic" applications

---

<img src="data:image/png;base64,#img/checkpoint-pkg.png" width="1016" style="display: block; margin: auto;" />

---

Dependencies are detected automatically

.pull-left-50[

.center[<small>R script</small>]

```r
library("checkpoint")
# Use package versions as they were on CRAN on 2022-04-27
checkpoint("2022-04-27", r_version = "4.1.2")

library("ggplot2")
```

]

.pull-right-45[

.center[<small>Console</small>]

```r
checkpoint("2022-04-27")

# Installs the version available on the snapshot date
install.packages("ggplot2")
```

]

--

<br /><br /><br /><br /><br /><br />

Uses a date-specific directory outside of the usual library

```bash
~/.checkpoint/...
```

---
name: toppling

.pull-left-75[

.center[![:scale 45%](data:image/png;base64,#img/toppling-tower.jpg)]

]

--

.pull-right-25[

<br /><br />

✅

<br /><br />

(✅)

]

---

### Virtual machines

<img src="data:image/png;base64,#img/vm-stack.png" width="450px" height="" style="display: block; margin: auto;" />

---

### Containers

<img src="data:image/png;base64,#img/container-stack.png" width="450px" height="" style="display: block; margin: auto;" />

.footnote[<small>(Piccolo & Frampton, 2016)</small>]

---

- Virtual machines (e.g., [Vagrant](https://www.vagrantup.com/))
    - Encapsulate an entire operating system
    - Require a lot of disk space
    - Take a little longer to set up and boot

--

- Containers (e.g., [Docker](https://www.docker.com/))
    - Widely used in software development
    - Depend on the host OS kernel, but can be run on all common operating systems

---
template: toppling

.pull-right-25[

<br /><br />

✅

<br /><br />

✅

<br /><br />

✅

]

---

- Virtual machines and containers require some expertise
- Online services simplify setup, collaboration, sharing, and archiving
    - [PsychNotebooks](https://www.psychnotebook.org/)
    - [CodeOcean](https://codeocean.com/)
    - [RStudio.cloud](https://rstudio.cloud)

---

[An unobtrusive Docker workflow for papaja](https://github.com/crsh/papaja_docker)

<img src="data:image/png;base64,#img/papaja-docker-repo.png" width="700px" style="display: block; margin: auto;" />

---
layout: false
template: footer
class: middle, center

# Time for a demonstration!

---
template: rot

### Summary

- Computing environments inevitably change

--

- Robust computational reproducibility requires a fixed computing environment
    1. prevents code rot
    2. facilitates sharing reproducible analyses

--

- Mind software licenses when sharing software environments
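
---
template: rot

### Sketch: fixing package versions with `renv`

<small>
One way to approximate a fixed computing environment at the package level is a `renv` lockfile. This is a minimal sketch, not the workflow shown in the demonstration. It assumes `renv` is installed and that the working directory is the analysis project. It pins R packages only, not R itself or system libraries.
</small>

```r
# Minimal sketch of a renv workflow (assumes renv is installed and the
# working directory is the analysis project)

# Create a project-specific library and an initial lockfile (renv.lock)
renv::init()

# After installing or updating packages, record their exact versions
renv::snapshot()

# On another machine, or much later, reinstall the recorded versions
renv::restore()
```

<small>
Sharing `renv.lock` alongside the analysis code lets collaborators call `renv::restore()` to rebuild the same package library.
</small>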