LSDTopoTools is software for analysing landscapes. It has applications in hydrology, ecology, soil science, and geomorphology. This document is not about the use of the software, but rather about its history and its philosophy. Your authors feel the need to write these things down because:
Readers might wonder why we don’t use existing software and
We feel very strongly about open science and this is a chance to proselytize.
2. A brief history (by Simon Mudd)
The project started around 2010 due to my increasing frustration with my inability to reproduce topographic analyses that I found in scientific papers and saw at conferences. Some of the papers that had irreproducible analyses were my own! Like many scientists working with topographic data, I was using a geographic information system (GIS) to prepare figures and analyse topography, and after a long session of clicking on commercial software to get just that right figure, I did not have a record of the steps I took to get there. Mea culpa. However, I do not think I am the only person guilty of doing this! I wanted a way of doing topographic analysis that did not involve a sequence of mouse clicks.
A second motivation came when my PhD student, Martin Hurst, finished his PhD and left Edinburgh for warmer pastures in England (he is now back in Scotland, where he belongs). His PhD included several novel analyses that were clearly very useful, but also built using the python functionality in a certain commercial GIS and not very portable. I and my other PhD students wanted to run Martin’s analyses on other landscapes, but this proved to be a painful process that required numerous emails and telephone calls between Martin and our group.
This motivated me to start writing my own software for dealing with topographic data. This seemed crazy at the time. Why were we trying to reinvent a GIS? The answer is that the resulting software, LSDTopoTools, IS NOT A GIS! It is a series of algorithms that are open-source and can be used to analyse topography, and the programs that run these analyses, which we call driver programs, are intended to be redistributed such that if you have the same topographic data as was used in the original analysis, you should be able to reproduce the analysis exactly. In addition the philosophy of my research group is that each of our publications will coincide with the release of the software used to generate the figures: we made the (often frightening) decision that there would be no hiding behind cherry-picked figures. (Of course, our figures in our papers are chosen to be good illustrations of some landscape property, but other researchers can always use our code to find the ugly examples as well).
We hope that others outside our group will find our tools useful, and this document will help users get our tools working on their systems. I do plead for patience: we have yet to involve anyone in the project that has any formal training in computer science or software engineering! But we do hope to distribute beyond the walls of the School of GeoSciences at the University of Edinburgh, so please contact us for help, questions or suggestions.
3. Some comments on open data and reproducibility
What separates science from philosophy and art? If you look up "science" in a dictionary you will probably find something similar to what is on the Wikipedia page for science:
Science (from Latin scientia, meaning "knowledge") is a systematic enterprise that builds and organizes knowledge in the form of testable explanations and predictions about nature and the universe.
Two important words in that definition are "systematic" and "testable". Philosophers like Karl Popper have argued that a defining feature of science is the ability to falsify hypotheses. Scientists need a way to know if a hypothesis is wrong. Other scientists must be able to test the findings.
In 1989 Stanley Pons and Martin Fleischmann claimed to have carried out a successful cold fusion experiment, but others were unable to reproduce their results and thus their findings were considered unsound. Naturally you want to avoid this sort of thing. Ideally all research should be reproducible. But researchers actually have quite a poor track record for creating reproducible research. Here is a fun game: ask any senior academic to reproduce the results from their PhD or even show the documentation of the data, figure preparation, etc. Many will be unable to do so.
In scientific publications, if other scientists cannot test the results, then the paper is more of an advertisement than actual science (in the succinct words of one of our Hutton Club speakers). Testability and reproducibility is the aspiration but in reality very few scientific papers are reproducible. This has been at times described as a "crisis" in science. In fact, Nature published an entire special collection on this topic:
This problem can be mitigated significantly if you are using computation to test hypotheses. However this is only possible if other actually have access to the programs you use for your research. Again, several recent articles have highlighted the need for transparent computation in science.
Open code for open science (published in Nature).
Code share (published in Nature Geoscience).
Journals unite for reproducibility (published in Nature).
On the LSDTopoTools team, we aspire for our publications to be fully reproducible. We admit to not fully reaching this goal in many cases. When you are processing many, many, gigabytes of data there are technical barriers to making all the steps open and reproducible (e.g., how do we make all the topographic data public?), and making the work reproducible requires carefully thought out workflows. In each publication we make results in a slightly more streamlined workflow for reproducibility. Hopefully our experience can help others in creating reproducible research since we have learned many lessons about the best way to create fully reproducible analyses the hard way.
4. The philosophy behind LSDTopoTools
LSDTopoTools is a software package designed to analyse landscapes for applications in geomorphology, hydrology, ecology and allied fields. It is not intended as a substitute for a GIS, but rather is designed to be a research and analysis tool that produces reproducible data. The motivations behind its development were:
To serve as a framework for implementing the latest developments in topographic analysis.
To serve as a framework for developing new topographic analysis techniques.
To serve as a framework for numerical modelling of landscapes (for hydrology, geomorphology and ecology).
To improve the speed and performance of topographic analysis versus other tools (e.g., commercial GIS software).
To enable reproducible topographic analysis in the research context.
The toolbox is organised around objects, which are used to store and manipulate specific kinds of data, and driver functions, which users write to interface with the objects.
For most readers of this documentation, you can exist in blissful ignorance of the implementation and simply stay on these pages to learn how to use the software for your topographic analysis needs.
5. Why don’t we just use ArcMap/QGIS? It has topographic analysis tools.
One of the things our group does as geomorphologists is try to understand the physics and evolution of the Earth’s surface by analysing topography. Many geomorphologists will take some topographic data and perform a large number of steps to produce and original analysis. Our code is designed to automate such steps as well as make these steps reproducible. If you send another geomorphologist your code and data they should be able to exactly reproduce your analysis. This is not true of work done in ArcMap or other GIS systems. ArcMap and QGIS are good at many things! But they are not that great for analysis that can easily be reproduced by other groups. Our software was built to do the following:
LSDTopoTools automates things that would be slow in ArcMap or QGIS.
LSDTopoTools is designed to be reproducible: it does not depend on one individual’s mouse clicks.
LSDTopoTools uses the latest fast algorithms so it is much faster than ArcMap or QGIS for many things (for example, flow routing).
LSDTopoTools has topographic analysis algorithms designed and coded by us or designed by someone else but coded by us soon after publication that are not available in ArcMap or QGIS.
LSDTopoTools contains some elements of landscape evolution models which cannot be done in ArcMap or QGIS.
6. Why didn’t you just use TopoToolbox?
TopoToolbox is a popular topographic analysis package written by Wolfgang Schwanghart. However, when we started LSDTopoTools, TopoToolbox was in its infancy and at the time could not do any of the things we wanted it to (LSDTopoTools does a number of analyses related to hillslopes that are not in any other software package). In addition, TopoToolbox is written in a commercial programming language that starts with
Ma, and we wanted an open programming language. When we started LSDTopoTools, the Stream profiler tool, which uses the same
Ma language as TopoToolbox, used a statistics toolbox that the University of Edinburgh did not own, and so we could not reproduce any stream profile analysis that was appearing in many papers at the time. This was, to say the least, irritating. At that time we would also run out of
Ma licences on the server just before AGU and EGU, leading to pandemonium. The final straw was when a collaborator sent a question to the support team of this commercial organisation and was told that his licence was an educational licence and that he needed to stop using the software for research immediately or would face legal consequences. We didn’t want users of our software facing the same issues we did, so we chose to develop LSDTopoTools in a combination of python and C++. No element of the LSDTopoTools toolchain is commercial or proprietary, and we intend to keep it that way.