Discussion: how to wrap up your paper (Paper writing for PhD students pt 11)

Time to wrap up the paper! You have presented your research question, its greater role in the universe, and your findings. Now let’s close the circle and discuss your answers and the questions that still remain, or the questions that you have just discovered. This is what the Discussion section is for.

First, before we get started, a note for those poor souls who work in fields where you are not supposed to discuss your results in the Results section at all. In this case, the Results section is pure data, and therefore the Discussion section usually has two parts, Discussion and General Discussion. The first part is about interpreting each of your results, and the second part is what we will mainly talk about in the following. Sorry for the confusion (not my fault).

And while we are at it, what’s the difference between Discussion and Conclusion? The Conclusion is typically fairly short—it is almost like an echo of the abstract. It is a summary of what you have done, and it adds no new ideas or thoughts. In contrast, the Discussion section can introduce new points and insights (but of course no new results). It can connect the dots in new ways, and show how your findings relate to the broader issue at hand. It can ask new questions.

In the Discussion section, you can remind the reader of your research question and provide a synthesis of your results. You can present the answer to your question and show how it contributes to solving the larger problem that you have painted in the abstract and the introduction. You can suggest further research based on your results. You can contextualize your results (and your research problem) within the literature—you can even cite papers that were not cited in the intro if they help to interpret your results or their significance.

In general, there is a lot of freedom in choosing how to shape the (General) Discussion section. One way to begin is by reminding the reader of the broader knowledge gap and the specific research question of the paper, before proceeding through the results, perhaps grouping them to make certain points, summing up all evidence before arriving at the final conclusion. Or you can begin with a paragraph that recaps your question and your answer, the final conclusion of the paper, and continue with paragraphs that discuss the elements that led to this conclusion.

Whatever structure you choose, it pays off to plan the flow of the Discussion section in advance at the paragraph level. In my own papers, a very typical formula goes like this. Paragraph 1: quick summary of the problem and the main result(s); Paragraph 2: discuss point A and its deeper meaning; Paragraph 3: discuss point B and its meaning; Paragraph 4: wrap up and finish the paper (see below).

The last paragraph and your final words are tremendously important, so do not waste them. First, if the reader has been with you this far, reward her at the end. Second, endings are power positions: the last words of your paper will be remembered by your readers (or at least by those who made it this far, plus some skimmers who jumped directly to the end). Never waste a power position, never waste an ending! Always end on a high note.

There are some all too common ways of ending with a whimper instead of a bang. These are to discuss limitations of your work in the final paragraph (never do this!), or to say something very vague (never do this either!). As I wrote in the chapter on Methods, all technical limitations should be discussed there, and if they have to be mentioned in the Discussion section, they should never appear in the final paragraph. In particular, do not dwell on limitations that everyone knows already and that everyone has (“if we just had more data”). Also, please do avoid vague endings like “further research is needed”. Further research is always needed; saying so just makes it sound like you didn’t do enough even when you did! Besides undermining the importance of your work, vague statements are never memorable. Say something that you would want the reader to remember.

One good way of ending the Discussion section, and the paper, is to write a short paragraph that recaps your conclusion and the significance of your work. You can even signpost this to the reader and begin the paragraph with the words “in conclusion, we have shown that…” This paragraph can resemble the bottom part of the hourglass of the abstract: what did you find out, what does it mean, and what are its broader implications? How has your work concretely contributed to the big picture? How far have you come from where you started?

[Figure 1: Coming soon. Or not so soon. But closer by almost one chapter now…]

Paper Writing for PhD Students, pt 9: Methods

“It doesn’t matter how beautiful your theory is. If it disagrees with experiment, it’s wrong. In that simple statement is the key to science.” -Richard Feynman

While much of this series has been about writing an exciting story, we now need to put excitement aside for a while. I’ve earlier claimed that papers are not only containers of information. Their Methods sections, however, are. Their role is entirely utilitarian. So before we discuss form, let’s discuss function.

A Methods section serves two purposes. First, it should let other researchers gauge whether your conclusions are justified and backed up by evidence—it should let other researchers assess how credible your data are, and how credible your analysis is. Second, it should allow other researchers to replicate your study and repeat whatever it is that you have done.

Unfortunately, as any experienced researcher knows, these goals are not always met. More often than not, the authors of a paper do not explain the procedures that they have used in enough detail, even if there is a Supporting Information document with an unrestricted page count. It happens all too often that when the reader attempts to understand in detail how the authors have arrived at their results, she has to give up because that information is simply not there or it is too patchy.

Not being able to understand a paper’s methods or to replicate its pipeline leads to many problems. First, this contributes to the replicability crisis and therefore erodes the very foundation of science, the scientific principle itself: only those results that can be replicated by others can be taken as facts. Second, selling your discovery to the scientific community will be hard if your fellow scientists cannot trust your findings because they do not understand how they were obtained. Third, if your pipeline—from data collection to analysis—contains new methods or ideas, those will not be adopted by anyone unless they are clearly explained (or even wrapped up and served on a plate, say, as a software package). This leads to many lost citations and your work not being discovered. If you release data, someone can also use it for things that you didn’t think of, and if you release software, there will always be someone who needs it.

So please do take replicability and reuse seriously. Explain what you have done in as much detail as possible. Release your raw data. Release your intermediate results. Release your code. Reveal everything. Hide nothing. Be a good scientist. Don’t be an evil scientist.

If you release everything that there is to release, you will probably need to use external repositories. Some journals, however, do allow submitting supplementary data and code files, to be published together with the article. If you are thinking of hosting the data and code yourself, consider that we are talking about the scientific record here: your paper, your data, and your code should, in theory, be available forever. And forever is a mighty long time, as the late artist known as Prince once put it. It certainly is longer than the lifetime of the URL that points to your www homepage on your university’s server, or of the server daemon that runs on the Linux machine in your bedroom closet. So no DIY here, please—always use long-term data and code repositories, like Zenodo. While even those might not last forever, they’ll last longer than any self-hosted repository. Note that even GitHub is not futureproofed: it is run by a commercial company that can become extinct just like any other company.

Let’s return to the paper itself, and move from function to form. First, where to describe materials, data, and methods? This, of course, depends on the journal, and there are many options. The top-tier journal style (think PNAS, Nature, etc) is to have Materials and Methods as a separate section at the end of the article, as an appendix of sorts. In these journals, methods are only briefly described in the main text and the reader is referred to the Materials and Methods appendix for details. While writing a paper this way may at first feel difficult, this structure does make sense: the short letter format is all about the story, and technical details that would get in the way of the story are pushed aside. This may make writing feel harder because one cannot hide behind technical details: there has to be a story. However, beware of the dark side: referring the reader to the Materials and Methods section where only superficial details are given and where the reader is further referred to the Supplementary Information that adds detail but still lacks essential information, or where the limitations of the chosen methods are hidden in a subordinate clause on page 28. This structure makes it dangerously easy to sweep something under the rug. Which is why it often happens.

So if you are writing for one of these journals, do resist the dark side: do not hide problems in the SI. Other than that, just strive for clarity in the Materials and Methods. Typically this section comprises independent subsections for different items, so there is not much storytelling involved. In the main text, when talking about methods, describe their purpose, not their details: “we measure the similarity of X and Y with the help of (insert name of fancy similarity measure), see Materials and Methods for details.”

Then there is the style common to biomedical journals, where Methods are described in all their detail straight after the Introduction. This makes it easier to describe everything properly and more difficult to hide problems, which is good. The downside is that being hit by several pages of painstakingly detailed method descriptions is something of a turn-off: the story suffers. While this cannot be entirely avoided, it helps if you remember to provide context: begin each subsection by reminding the reader why this data set was collected, why this experiment was done, or why you are going to next describe some mathematical methods. Often, this is not more difficult than simply saying that, e.g., “to measure the similarity of X with Y, we need some well-behaved distance measure for probability distributions that…” and then describing the chosen measure.

The third way, common to, say, the journals of the American Physical Society, is to happily mix methods with results, explaining how things were done and what the outcome was without making a distinction between the two. In this case, things like experimental setups or data collection procedures may still be explained separately, but typically all mathematical and statistical methods are described together with the results. In my view, this makes writing a smoothly flowing story easier than the biomedical style. It is easier to motivate the methods by saying that “next, we’ll investigate X, and to do that, we need to do Y, and look, here’s the result”. In the biomedical style, this connection is harder to make because the methods and results are separated, so one has to focus on making sure that the reader understands why the methods have been chosen and why the reader should understand their details.

Before concluding, let us return to being good versus being evil, and talk about discussing the limitations of your methods. All methods have limitations, as every scientist knows, and it is best to lay these out in the open. In my view, the Methods section is the best place for doing this: while even minor limitations of methods are often discussed in the Discussion section, it feels more natural if they are addressed when the methods are introduced. Strangely, this even feels more honest. First, at least to me, it feels a bit like having been cheated if I have read a long paper, and only in the last paragraph is it mentioned that, by the way, we’re not sure that things work the way we just said they would. Second, it is easier for the writer to explain the limitations together with the methods. Third, it is also easier for the reader to understand the limitations and their implications if the details of the methods are fresh in her memory.

When addressing limitations, you should tread carefully: being honest is different from making it sound like your study is flawed. Joshua Schimel’s “Writing Science” introduces a great principle: say “but, yes” instead of “yes, but”. Instead of saying that your quite clear results would be much more detailed if your experimental setup had a higher resolution (or similar), say that even though the resolution of your experimental setup is limited, your results are quite clear. The latter has a much more positive ring to it, although both sentences have the same information content. So don’t make it sound like there is something wrong with your work. If there is, fix it first, before writing your paper.

Coming up next: the Results section.

Do an MSc in Complex Systems – Admission Now Open!

Admission is now open for our Life Science Technologies master’s programme at Aalto University, Finland, Europe; there is a major in Complex Systems within the programme and I am the responsible professor.

What’s in the major? Well, everything that’s cool and fun and interesting, of course: network science, data science, machine learning, nonlinear dynamics, to mention a few! Here’s why networks are the thing. And if you want to know more about what complex systems are, just have a look at previous posts in this blog, e.g. on mobile-telephone calls, ants, and the immune system.

Here is a complete list of courses in the complex systems major for this winter (only minor changes coming for 2018).

There are almost no mandatory courses in the major; rather, there are many courses to choose from, including courses by other Life Science Technologies majors. This makes it possible to mix and match: want a combination of machine learning and complex networks? Check. Want to be a network neuroscientist? Check. Want to get a broad training in data science? Check.

Note: even though the programme is called Life Science Technologies, you can almost completely avoid anything that begins with “bio” if you so wish. As an example, I have students who focus on social networks and computational social science.

One more thing: the doctoral track. If you are talented and your grades are good, you can apply to the doctoral track, where your final target is not the master’s degree but a PhD; your studies are tailored towards that goal and you’ll get to spend time as an intern in our research groups, with the aim of publishing the first journal article(s) of your thesis before you even get the master’s degree.

So, what are you waiting for? Apply here!