Stop writing nonsense code instructions
People new to programming often wonder why they can copy code precisely and then watch it break, i.e. not work at all. Medium is one source of such broken code. I am not writing this article to bash Medium; it is only one of many places where people record interesting ideas related to code. I wish to point out something far more general: code has a baked-in replication crisis. Popular programming languages, environments, hardware and software can change really quickly. Unfortunately, this means that many of the programming articles I have seen that were meant to share code are very difficult for new programmers to use. This need not be the case. Code can be designed such that it is easy to replicate.
To get right to the heart of the solution, a few lines of code for Python:
!! pip install watermark
or
!! pip install makedalytics
but depending on your environment, you may want
conda install makedalytics
or …OK, I think I made my point.
These lines ironically illustrate the problem. They are the lines of code one needs to install libraries whose functions aid in making code reproducible, but even they have dependencies. Watermark requires Python >=3.7. Makedalytics (full disclosure, I wrote this Python library) requires pip.
I do not envision a world in my lifetime in which a perfect and universally available computing system appears, and perfected, bug-free languages stop changing. So long as languages and packages are upgraded, we will write code that may not be executable everywhere, and quite frankly it may not even be executable on our own machines once we install some new libraries. What we can do is at least explain our environments and dependencies to others when we share code.
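As one illustration of what such an explanation could look like, here is a minimal sketch that uses only the Python standard library. The function name `describe_environment` and the package list are my own hypothetical choices, not part of Watermark or Makedalytics, and the sketch assumes Python 3.8 or later because it relies on `importlib.metadata`.

```python
import platform
from importlib import metadata


def describe_environment(packages):
    """Return a plain-text report of the interpreter, OS and package versions.

    `packages` is a list of distribution names as you would pass them to pip.
    This is a hypothetical helper for illustration, not a library API.
    """
    lines = [
        f"Python: {platform.python_version()}",
        f"Implementation: {platform.python_implementation()}",
        f"OS: {platform.system()} {platform.release()}",
        f"Machine: {platform.machine()}",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of crashing, so the
            # statement is still useful on a partially set up machine.
            version = "not installed"
        lines.append(f"{name}: {version}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Swap in whichever packages your shared code actually imports.
    print(describe_environment(["pip", "watermark"]))
```

Pasting the printed report under the code you share gives readers something concrete to compare against when their copy does not run.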
It may seem like a superficial issue that code on blogs is not easy to replicate, but when you consider how many new scientific discoveries rely in one way or another on code, it’s hard not to wonder how much of a problem we have in many fields. Standards are slowly moving towards open publication of scientific data, but code is also an essential part of much important work. A conspiracy theorist might wonder if some academics use proprietary technologies such as MATLAB when open source alternatives such as Octave are available, as a barrier to entry that keeps people from investigating their work. Scientists and other academics who seek to obfuscate their methodologies probably cannot be helped. But people who write on Medium or blog about code are usually trying to be helpful. So please, folks, include a statement about your environment and dependencies when you share code, whether it be on a blog, in a Medium post or even in the appendix of a scientific article. You can make your work go farther with a few lines of comments.