Are you new around here?

As there has recently been a surge of visitors coming from Moodles and other learning platforms, I thought I’d say hi — hello there!! — to everyone who is new to this blog, and provide some guidance in the form of a table of contents of sorts.

So, where have you landed? This is a blog by me, where me = Jari Saramäki, an interdisciplinary physicist and a professor at Aalto University, Finland, dabbling in network science and other complexities, and a big fan of lucid writing. Also, a bass guitar player, because someone has to be.

The blog contains things that students have found useful (which may be why you are here), in particular, advice on how to write scientific papers and how to develop your scientific writing skills.

Welcome again, and I hope you’ll find something in this blog that is either useful or entertaining, or both!

The abstract as a tool for better thinking

Having recently spent considerable time writing abstracts for some papers-in-the-making, I thought I’d share another post on the topic, even though it has been heavily featured on this blog before.

As you may already know, I advocate for writing the abstract before the rest of the paper, contrary to what is advised by some writing guides, e.g., this one (thanks Riitta H for the tip). Why?

To me, writing the abstract is, first and foremost, an exercise in thinking, to the extent that the written abstract itself can feel almost like a byproduct.

This exercise is all about clearly understanding what the paper is about: what the research question being asked is, why it is being asked, what the outcome is, and why someone should be interested in it.

While most of these questions may have been answered when the research was designed – e.g., you don’t build an expensive experimental setup without knowing why and what for – this is not always the case. Sometimes the data lead to unexpected directions, rendering the initial question obsolete. More often than not, your perspective shifts along the way: the initial question becomes something larger or morphs into something else. But what exactly?

To figure this out, you’ll need to give the abstract a go before even considering the rest of the paper. So, how to write the abstract of a research paper? As those who have read my book or attended my writing lectures know, the abstract template that I recommend is the same as the one used by Nature. Not because it’s Nature, but because it does exactly what it should: it forces you to think clearly.

In plain language, the abstract template goes like this (sorry, Nature, for this abuse):

  1. There is an important phenomenon/topic/something.
  2. But within it, there are unknowns that need to be sorted out for achieving X.
  3. In particular, we don’t know Y, because of something that was missing until now.
  4. Here we solve the problem of Y using a clever method/experimental design/something.
  5. We discover Z, which is surprising for some reasons.
  6. Knowing Z advances our scientific field like this.
  7. More broadly, understanding Z makes the world a better place in this way.

This template helps you refine your story and the point of your paper and serves as an acid test: if you cannot write the abstract, you are not ready to write the paper. It also ruthlessly exposes any gaps in your thinking, which is excellent because it’s a template, not Reviewer #2 who gleefully rejects your paper from the journal and taunts you in the process.

Writing the abstract first using the above template helps you improve your paper on your own before it is even written (which is optimal, isn’t it?).

In fact, I often try to formulate a mock abstract that follows the template during the very early stages of a research project, often well before the final results materialize. I find that this helps to understand where the project is going, and what might still be required. If I feel confused [narrator’s voice: which happens very frequently], the template sometimes shows the way.

Slides for my NetPLACE@NetSci2023 talk

It was a great pleasure to give a short keynote on writing in Vienna (in a hall with the above text on the wall)! My slides for the talk can be accessed here.

The whole NetSci conference was excellent and it was great to meet many friends and colleagues after so many years. A great many thanks to the organizers!

On scientific writing in the age of AI, part 2: A thought experiment

In the spirit of my post last week, let us continue figuring out the role of AI in scientific writing through a Gedankenexperiment. Where we left off was the use of AI as an assistant — a virtual editor if you’d like — to suggest improvements to one’s text, instead of churning out autogenerated content. Think Grammarly++, or similar. This is, at least to me, perfectly fine. However, I would appreciate it if the text still retains its voice and human touch, lest everything sound exactly the same.

Now, fast forward to the future. If people still write science 25 years from now, how will they use AI tools? What are those tools capable of?

Here is where I feel science — at least natural science — might diverge from more creative forms of writing, as the purpose of written science is ultimately to transmit information. It might even become desirable to have AI write up our results.

Consider the following: suppose that I have carried out an experiment and want to write a paper on its results. I feed my plots, maybe together with a few lines of text about background, impact, etc, to my virtual writing assistant, and off it goes, returning with a complete manuscript. As my virtual assistant has been taught to write in my voice, the manuscript actually sounds like me. I read the manuscript and find that it is factually correct, and submit it to a journal.

Now, if the information in this paper is factually correct and it is written in a way that is appreciated by human readers, how should we feel about this? Is this ethical or unethical? Is this a future we’d like to see or not?

For this to be ethical, it should be done openly and the use of AI acknowledged. Which is of course very easy to do. Maybe this will be common: maybe most papers will be written by AIs that have been fed with original research results.

Beyond ethics, is this good or bad? That, I guess, depends. If all papers sound the same, it is bad. But what if the papers are indistinguishable from human writing, considering that everyone trains their own AI to write in their voice? What might be lost here is the finesse of argumentation, nuances, deep thoughts, and all those things that make famous writers/academics famous. On the other hand, perhaps this loss would be compensated by far fewer crappy, incomprehensible papers… just maybe.

It may also be that written scientific papers will become obsolete, or at least obsolete as stand-alone products (this is already happening with all the Jupyter notebooks and SI data sets and so on). Some fields (e.g., biomedicine) already have paper formats that leave very little room for creative writing; these are mostly just data containers.

Perhaps scientific papers will in the end not be structured for human readers, but for other AIs that can then better pick up their arguments to propose new theories, experiments, and so on — in other words, replace us, scientists. But I have my doubts on this, as I at least hope that science requires creativity that is beyond mere statistics of words. Let us hope that humans can still out-weird AIs in the years to come (is that even a word)!

To be still continued, I think…

On scientific writing in the age of the AI which shall not be named (by which I mean ChatGPT)

Disclaimer: this post will probably not age well. Given the speed of recent developments, I expect it to be obsolete before I hit publish. Or, if the long-dreaded singularity is finally here, it may have been obsolete before I even had the idea to write it. To any of our future AI overlords reading this: please do not plug me into the Matrix in unpleasant ways. No AIs were harmed in writing this text! I just did a few experiments, is all…

But I digress. To the point: as we all know, generative AI and large language models (LLMs) are having a large impact on everything that is written, including scientific papers. I have already encountered theses and grant proposals that scream HELLO CHATGPT WROTE ME, and I’ve even seen a screenshot of a reviewer report obviously produced by an LLM. So, are we doomed?

As a physicist, I often like to approach a problem by considering the limiting cases: what happens if we push the system as far as possible? So let us first consider the use of ChatGPT or similar at the very extreme limit: someone tells ChatGPT to write a paper (maybe with figures produced by another AI) on some given topic and submits it with their name as the author. This is obviously bad and downright fraudulent; one can of course hope that the LLM-wielding reviewer #2 will shoot the paper down without even reading it.

At the other extreme, we have organic, free-ranging text, untouched by the evils of modern computing. Or do we? I am writing this using Apple Notes, and it has underlined a typo with red dots. Were I to use Word, it would not limit itself to this, but also suggest rephrasing long sentences, as would Grammarly, which I regularly use to check my writing. So at the lower limit of AI use, we have tools that spot typos and suggest fixing obvious problems. This is, at least in my books, quite all right; I even recommend that my students use Grammarly. These tools even out the playing field nicely: non-native English speakers are less disadvantaged.

Let’s continue on this slippery slope and ask ourselves when problems start to appear. Here, things get blurry rather quickly, as exemplified below:

>Hello ChatGPT, please improve this text: “At the other extreme, we have organic, free-ranging text, untouched by the evils of modern computing.”

>“Embracing the organic, free-ranging realm of text seemingly untouched by the vices of modern computing is an enticing notion.”

I am left somewhat speechless here. Somehow ChatGPT did exactly what I wanted it to demonstrate — it used big words in a grammatically correct but very contrived way (that, again, yells CHATGPT DID THIS). Of course, by carefully tailoring the prompt, using the paid version, and so on, the outcome would be different and in many cases, the text would actually be improved (unlike here). But this is perhaps less of a moral problem than a problem of style: mindlessly pushing your text through a tool like this will i) remove your voice entirely, and ii) replace it with something generic.

Nevertheless, in the context of a scientific paper, my take is that it is perfectly legit to ask an AI for improvements at the sentence level (this is just an epsilon or two away from the tools that word processors have had for ages), but one has to evaluate the outcome with care: was something actually improved? Was something lost in translation? Is the AI-generated version easier and more pleasant to read? Would it obviously stand out as not having been written by you? (Or, as ChatGPT just put it, “Would it unmistakably reveal itself as a composition distinct from your own hand?” I cannot stop laughing and/or crying.)

Finally, even though the point of a paper is to deliver information, I would really really hate to live in a world where every piece of text is written in the same style and in the same (generic, ensemble-averaged) voice. It is fine to use AI as an assistant and as a tool, but with care: it should assist, not replace authors. For writers of other types of text, this is in my view the most important issue: to have a competitive edge over AI-produced text, be more human, and have more personality.

To be continued…

Science — stories or pure data?

In his recent post, Petter Holme presents an entertaining inner dialogue about whether one should market one’s scientific output or not. Much of this centers around the concept of stories — and the discussion on whether we should publish papers that have storylike narratives or just plain data has been going on for a while.

Being an advocate of papers-as-stories, let me add another point of view to the mix.

I feel that there are two dimensions here. The first one is the axis from facts to fiction, and being scientists, we all know where we should place ourselves here. The second dimension is about pure data versus understanding/insight, and it is this dimension that in my view necessitates some storytelling.

Let me explain my reasoning by starting from pure data. Suppose I have carried out an experiment/done some simulations/analyzed a bunch of data I found on the Internet. Now, if I wanted my output to be pure data, I could just release the numbers as tables or graphs or whatever, and maybe an explanation on how the experiments or simulations were carried out. Pure data — no story.

However, my pure data would probably not make sense to many people, if any. To take a step in the direction of meaning, I should at least explain what the research question is that the experiment/simulations/analysis project was designed to answer. I might also feel compelled to tell how the data answer this question, i.e., to give the numbers some meaning.

Notice the elements of a story sneaking in? There is a question, there is an answer.

But even after these additions, only an expert reader would be able to see the meaning in what I have done. For anyone else, more would be needed — why should this question be asked? What is the context for the question? And why should one care about the results?

Add these elements, and we have arrived at the typical structure of a scientific paper that begins with an introduction and ends with a discussion. We have also strayed pretty far from pure data, and are now firmly in the realm of stories. First, we introduce the world and the characters that inhabit it, then we create tension with an open question, and release this tension with an answer.

But such stories of science are not works of fiction; they are told with facts. This, to me, is why papers should be stories — stories provide clarity, understanding, and meaning. They help the reader to connect the dots. Of course, one can and should release pure data too: numbers, results, code, everything. But these only get their meaning through stories.

Podcast interview on writing

I was recently interviewed by Daniel Shea for his podcast Scholarly Communications — you can listen to the interview here: https://newbooksnetwork.com/how-to-write-a-scientific-paper

We discussed my writing book and writing in general. This was a very enjoyable discussion & Daniel had plenty of good points and new perspectives that I could immediately agree with — do have a listen, highly recommended!

How to Choose the Title for Your Paper

“Any word you have to hunt for in a thesaurus is the wrong word. There are no exceptions to this rule.” ― Stephen King

This post is a chapter from the book “How to Write a Scientific Paper.”

After you have written your abstract, the next task is to consider the title of your paper. If the abstract is a compressed version of your storyline, the title of your paper is even more so. Titles are hard—it is often surprisingly difficult to come up with a short, informative, and catchy title. For me, this has at times felt like the hardest part of writing a paper.

The title of the paper serves a dual purpose: it delivers information by telling readers what your paper is about, and it serves as a marketing tool that makes others want to read your paper. Unfortunately, unlike the abstract, there is no general-purpose formula to follow when thinking of a title. There are, however, some points that you should consider.

The title has to be in perfect sync with the abstract—they have to tell the same story. Make sure that your title and abstract use the same words and concepts. Also, make sure that everything that is mentioned in the title is discussed in the abstract.

Use words that everyone in your target audience can understand. Avoid subfield-specific jargon. Simple does it! The paper’s title should only contain concepts that can be understood on their own, without any explanation. While there is some room in the abstract for explaining one or two important concepts in brief, there is no such luxury in the title: the reader should already be familiar with every word used in it.

The title should be focused and clear. If it is possible to give away the main result in the title, do so. Avoid vague titles, such as “Investigating Problem X with Method Y”. Instead, go for something more concrete: “Investigating Problem X with Method Y Reveals Z.”

A small request: please never, ever use a title of the “Towards Understanding Problem X” variety. Just don’t do it. Pretty please. If your research is worth publishing, you have arrived somewhere. Just be confident and tell the reader where this is, instead of telling them where you would rather have gone! It is OK to say something about the bigger picture in the title, as long as your key point plays a leading role. But to keep your title concise, it may be better to describe long-term goals elsewhere in the paper.

It helps if the title is catchy as well as informative. But do not exaggerate—consider how your title will look 10 years from now. Will it stand the test of time? If the title is too gimmicky or contains a joke that becomes stale after you’ve heard it a few times, it won’t. You should also avoid jargon and buzzwords that may go out of fashion before the paper gets published.

Consider search engines and online search. Your paper needs to be found if it is to be read, so the title should contain the right keywords or search terms. As a network scientist, I almost always include the word “network” in my paper titles, even if this makes the title longer or if other network scientists would understand the title perfectly well without networks being explicitly mentioned. Without the word “network”, they would not necessarily find my paper when they hunt online for new reading material.

Keep your title short. Research has shown that shorter titles attract more citations—see Letchford et al., R. Soc. Open Sci. 2(8):150266 (2015). This should not come as a big surprise: long and cluttered titles are not as contagious as simple, focused ones. If the title is convoluted and hard to grasp, then the paper probably is too.

Sometimes there are field-specific conventions that you should be familiar with. In some biomedical fields, for example, the paper’s title often expresses just the key result—“Transcription Factor X is Involved in Process Y”—and the titles can be fairly long. In some areas of physics and computer science, shorter and less informative titles are the norm. Have a look at other papers in your field, and try to imitate their best titles.

If you get stuck at this point and find it hard to decide on the title, it might be easier to initially lower your bar a bit. Just come up with some candidate titles that do not have to be perfect. Then ask your colleagues—your fellow PhD students, your supervisor, anyone—to have a look at the list and to pick the most promising candidates for refinement and final polishing.

Get the ebook from your favourite digital store! Paperbacks are available too (Amazon only!)

Cheatsheet: How to Revise Your 1st Draft (2/2)

Here is the second cheatsheet on how to revise the first draft of your scientific paper, focusing on sentences and words. (Here is the first one if you missed it). Enjoy!

For a hi-res PDF, please click here!

Want more? In my book How to Write a Scientific Paper you’ll learn a systematic approach that makes it easier and faster to turn your hard-won results into great papers. Or check out the series of posts that starts here.

Cheatsheet: How to Revise Your 1st Draft