How to choose a research question?

In mathematics, the art of proposing a question must be held of higher value than solving it. –Georg Cantor

It can be said with complete confidence that any scientist of any age who wants to make important discoveries must study important problems. Dull or piffling problems yield dull or piffling answers. It is not enough that a problem should be ‘interesting’—almost any problem is interesting if it is studied in sufficient depth. —P. B. Medawar

Choosing which problems to work on is perhaps the hardest and most crucial part of science. It is also an invisible and underrated part. In university, we are taught to solve problems that someone has designed for us to solve. I’ve yet to see an exam where the task is to invent a problem rather than solve one! But without problems, there are no solutions either. Problems come first, and solutions are only meaningful when the problems themselves are meaningful.

Despite this, solutions and the clever methods we develop to achieve them tend to take center stage when we write up our research. While we typically provide a post hoc justification for our research question in the introduction, convincing the reader why it is important, we practically never document how we actually chose that problem over a large number of others.

How did we search for the problem? How did we identify it? Was it already circulating in the literature, known to others as well? Did it come to us suddenly in a flash of inspiration — this would be a great problem to study? Did we have an intuitive feeling that if we begin to chip away at this problem, something useful will emerge? Or did we accidentally stumble upon some results, only later realizing what problem they actually solve? There are several ways of arriving at a research problem.

While having a lot of experience in one’s field is of course useful for identifying great problems, not knowing everything can also be an advantage. Creativity thrives when unexpected connections are made, and knowing too much can lead to tunnel vision. This is why it might be a good idea to switch fields once or twice during your career… Of course, if you know nothing, it’s not possible to invent meaningful problems, and if you know too little, you might come up with problems that others have already solved. So know your literature, but don’t be afraid to set your own research goals instead of relying solely on what everyone else thinks is important.

When doing a Ph.D., start developing the skill of inventing meaningful scientific problems from day one! This investment pays compound interest: the better the problems you invent, the more doors your results open, leading to even better questions.

Often, a good research problem is like a seed: plant it in fertile soil, tend to it well, and more great problems will sprout.

Let’s now suppose that you have a list of candidate problems—research questions that you might well like to study. Which one should you pick? What does a great research problem look like? First, let’s impose some real-world constraints.

Of course, the problem should be important, and its solution should make a difference. But it also needs to be solvable—impossible problems are not worth it, especially if you’re trying to finish a Ph.D. Still, problems may look deceptively easy, and it’s probably safe to say that the large majority of problems are more difficult than they first appear.

Since there is a finite time in which the Ph.D. has to be completed, it might be wise to mitigate risk by balancing your more ambitious endeavors with some “safe” problems that guarantee results. It is also good to have a plan B for getting something publishable out of problems that are too hard or too slow to completely crack. Then there is always material to write about, even if the more grandiose undertakings fail or take one to several lifetimes to complete.

Therefore “finding a cure for cancer” or “developing an artificial brain” do not qualify as great research problems for your Ph.D. Rather, they are long-term targets of entire fields of science. However, “figuring out the role of pathway X in preventing our immune cells from attacking tumours” or “understanding the role of criticality in the predictive power of liquid-state machines” could be steps in the right direction.

Confession: I made these up sort of randomly (though not entirely randomly). Yet both are focused and concrete enough that one could actually start attacking them from some experimental or theoretical angle.

Thus far, we’ve covered the easy part—problems should be concrete and solvable, and impossible problems are better left alone. But not all concrete, solvable problems are worth solving. The elephant in the room is importance. What makes a problem important? And what does that even mean?

Of course, your results might yield obvious and direct benefits—perhaps contributing to a new medical treatment or laying the groundwork for future advancements in artificial intelligence. However, most of the time, “importance” is a more elusive concept. It’s usually easier to assess the significance of a result after the fact: if a scientific article is highly cited, it likely had a substantial impact on other scientists’ thinking.

When we move from concrete outputs like articles to the more abstract and hazy realm of ideas, one way to visualize science is as an ever-growing network where ideas give birth to new ideas. An impactful idea is one that sparks many others downstream, either directly as offspring or indirectly through a chain of intermediate concepts. This latent process is what generates the aforementioned citations: ideas spawning ideas, and, in a Darwinian sense, great ideas giving rise to more ideas.

Unlike in biology, however, an idea can have far more than two parents, and its fitness isn’t always immediately apparent—there can be a long delay before its importance is recognized. And unfortunately, sometimes the building blocks that had to be put in place before a major new idea could emerge are forgotten. Everyone knows about Einstein’s theory of relativity, but few are aware of all the earlier efforts that went into developing ways to synchronize clocks across countries and continents using electric cables. Yet Einstein was undoubtedly familiar with this work, and it must have influenced his way of thinking.

In any case, even if assessing the importance of a research question is not trivial, it is worth asking why someone would cite your results, say, 10 years down the line. Is the question fundamental enough that if you solve it, others will build on your results? Try to see the question as part of a bigger tapestry.

The above picture of science as a flow of ideas being born, merging, and mutating is also helpful for reviewing the literature, both for coming up with research questions and understanding what has already been done in the general vicinity of a question that you have chosen to address. Science is a network—to make discoveries, follow the links of that network!

Identify impactful and highly cited papers and try to figure out why they are important. Then, use Google Scholar or some other tool to find out who cites them; just look at the abstracts of the citing papers to get the big picture and then dive into the details of those pieces of work that sound relevant to you. This is, in my view, more useful than trying to read all the literature in detail, in some random order. Try to first see the forest for the trees, and then focus on those trees that you find important. If you feel that some trees or entire forests are missing, you have a research question!
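To make the network picture a bit more concrete, here is a minimal Python sketch of the idea that an impactful paper is one with many others downstream of it. Everything in it is an assumption made for illustration: the paper names, the toy `CITED_BY` data, and the helper functions `downstream` and `rank_by_reach` are all hypothetical, and counting descendants is only a crude stand-in for real importance.

```python
from collections import deque

# Hypothetical toy data: each key is a paper (or idea), and the list holds
# the papers that directly cite it. All names here are made up.
CITED_BY = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": ["F"],
    "E": [],
    "F": [],
}


def downstream(graph, start):
    """Return every paper reachable from `start` by following citing links."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:  # a paper can have several "parents"
                seen.add(child)
                queue.append(child)
    return seen


def rank_by_reach(graph):
    """Sort papers by how many others they sparked, directly or indirectly."""
    reach = {node: len(downstream(graph, node)) for node in graph}
    return sorted(reach.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for paper, count in rank_by_reach(CITED_BY):
        print(paper, count)
```

In this toy network, paper "A" comes out on top because everything else builds on it, even though only two papers cite it directly. With real data you would replace `CITED_BY` with citation links collected from whatever literature tool you use.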

Apply Now to Our Master’s Programme in Life Science Technologies!

Our popular Master’s programme in Life Science Technologies at Aalto University, Finland has a major in Complex Systems! The major includes a lot of network science taught by top scientists in the field — yours truly, Mikko Kivelä, and Petter Holme. You’ll also learn some Python programming, data science, machine learning, and nonlinear dynamics — or, if you wish, you can choose a more maths-heavy subset of courses, or combine your studies with, e.g., human neuroscience.

This major has very tight connections to research, and many students have continued toward their doctoral degrees after receiving their M.Sc. Another very popular and successful career path has been that of an industrial data scientist or consultant, e.g., in the health industry. There is a lot of demand for these in the Finnish job market, so a Master’s degree in Life Science Technologies is a great investment in your future.

The application period is open only until 2 Jan 2024, 3:00 PM GMT+2 so be quick & apply now!

Season’s Grant-Writing Tips, Part 2/2

A very, very AI-generated image where money falls down like snow.

In the first part of this grant-writing mini-series, we learned the fundamental secret of grant-writing (and, in fact, any writing): everything revolves around the reader. The only purpose of a grant proposal is to make it easy for the reviewer to recommend funding.

Let’s break that statement down. For the reviewer to recommend funding, she has to feel that what you aim to do is important, novel, and feasible, and that you are exactly the right person/team to do this. In more touchy-feely terms, the reviewer has to like the proposal. And you.

As we discussed in the previous post, this is much more likely to happen if the proposal doesn’t make the reviewer work too hard: it should be focused, clearly written, and provide clear answers to the questions the reviewer must address.

To help with the above, we’ll now address writing at the level of paragraphs and sentences, borrowing some tricks from professional copywriters who craft advertising text. These techniques not only involve gently manipulating the reader—all writing is about manipulating the reader!—but also aim to ensure that the text flows. An ad where the reader gets lost or bored is a failed ad.

Let’s begin at the beginning because it is the most important place. In any writing, the first sentence and the first few words have enormous power—“Call me Ishmael”—and you should tap into this power. This is because they prime the reader’s mind for what is to come. They also set the general mood. Begin your proposal with a few strong sentences that almost win the grant! These sentences should summarize your plan and its impact: why is it important to do the things you plan to do? Why are you in a unique position to do this? If your grant is funded, how will the world become a much better place?

This mini-summary serves a dual purpose in priming the reader. Firstly, on an emotional level, the reviewer should feel excited – “This sounds like a great proposal!” If you achieve this, the reviewer will have a positive bias from the very beginning. However, with a weak or muddled beginning, you’ll need to work hard to win them over. Secondly, it is much easier for the reviewer to follow the text when they know where it is going — easier in terms of both comprehension and how reading the text feels (these two are, in fact, the same).

There is another place of power: endings. The power of endings is different from that of beginnings: whereas beginnings prime the reader, the endings are what the reader remembers. This is because between paragraphs and between sections there is a break in reading, where the stream of input to the reader’s brain temporarily ceases. This leaves more space for whatever the last input was to echo around in the reader’s head.

Saving important bits to the end is a common copywriter trick—ever seen an ad with “click here to buy” in the middle?

However, this trick works best for short sections and well-written text. If you lose your readers along the way, they won’t reach the end. Remember the overworked, sleep-deprived reviewer from the last post? She might be tempted to just skim, you know. To mitigate this risk, write short paragraphs ensuring that the reader makes it through to their end—and write them well. For section endings, a strong recap sentence — perhaps as a separate paragraph—can do wonders. “In summary, my research can be expected to have an enormous impact, because…”

We’ve now covered beginnings and endings. What is left is how to get from the former to the latter. Here, a copywriter’s trick is to understand that while the sentences must deliver information — including enough details of your research plan to judge its feasibility, etc — their task is also to propel the reader forward. In ad copy, the primary task of every sentence is to make the reader read the next!

This means that the sentences should seamlessly flow into one another, which is a general sign of good writing regardless of the genre. This is particularly important for information-dense grant proposals: information is much, much easier to absorb through a narrative than when it is presented as disconnected bits and pieces. The narrative is what keeps the reader going: as humans, we’ve enjoyed stories since the dawn of man, singing around campfires.

For a grant, the narrative is particularly important for sections prone to being dense, taxing, and boring—imagine the sleep-deprived reviewer having to wade through 25 poorly written state-of-the-art sections! This is especially crucial if the section is at the proposal’s beginning, as state-of-the-art sections often are. So next time when writing one, consider the reviewer, and instead of just listing references, write a story of how your field of science has evolved to the point where you can both ask and answer your research question.

Finally, as I mentioned in the previous post, there is one spot in the proposal where you can be slightly difficult to understand on purpose, in particular, if the reviewer is not really in your (sub)field and your proposal involves theory/maths/data analysis/similar.

This is in the methods section, or whatever the section where you describe what you are going to do is called. Whereas the research question and its importance should be written with absolute clarity so that everyone can understand them, here you can show off a bit. The point is to give the impression that you really know your stuff. Even though your proposal should generally be as free of jargon as humanly possible, it doesn’t hurt to have one strategically placed sentence where you flex your claws, show that you can devour your field’s most complicated concepts for breakfast, and instill a bit of fear and awe in the reviewer. Then you can be all nice again, and wait for the gifts to arrive.

I wish you merry grant-writing!

Season’s Grant-Writing Tips, Part 1/2

Grant money falling like snow (a very, very AI-generated image, by craiyon.com)

It is grant-writing season here in snowy Finland, and to keep away from the actual work, I thought I’d write a couple of posts on grant-writing tips. Today we’ll be all nice, but in the next episode, we’ll get a bit naughty because that might in the end bring us more gifts. Ho ho ho.

Let’s start at the very beginning. When writing a grant, the most important thing for you to understand is what is going through the heads of your target audience—the reviewers. You are writing the grant to persuade them to recommend that you get funded. Your one and only task is to make this as easy as possible for them.

This simple rule — to make it as easy as possible for the reviewers to recommend funding the proposal — gives rise to many corollaries.

To arrive at those, consider the situation that the reviewers find themselves in. It is very rare to get a single proposal to review that is spot on in the reviewer’s own subfield. What is more common is that there is a large pile of proposals on the reviewer’s desk, they are almost but not entirely off-topic, the deadline was last week, the reviewer has barely slept because the kids are sick, and even the coffee has gotten cold.

In this situation, the reviewer will be very, very grateful if you make her task easier.

This means, among other things, that a) the proposal must be easy to understand, even to a non-expert, b) the proposal’s value and level of ambition must be immediately visible, c) the proposal must contain direct answers to the questions that the reviewer has to answer, and d) the proposal must not contain any more stuff than is necessary to convince the reviewer.

The first corollary requires that you’ve actually given your research plan enough thought so that you can understand it yourself—in other words, you must know what you are doing. It helps a lot to have a clear focus: it is a common beginner’s mistake to try to squeeze all your ideas into one proposal, which then reads like a confusing superposition of several muddled research plans. Focus on a single topic and your best idea to avoid confusing the reviewers because otherwise, they won’t know which of your parallel plans they should be rating. Confused people are rarely happy people, and only happy people give top ratings!

Being easy to understand also means well-written: reading a good grant proposal shouldn’t feel taxing. Avoid jargon and complicated sentences; always err on the side of simplicity. Also, your proposal should not read like lecture notes because the proposal is not about teaching the reviewers. Nothing is as annoying as being lectured to if you only want to get your reviews done!

The proposal should contain enough information to convince the reviewers of how and why you plan to do what you plan to do, but no more than that. Again, think of the poor reviewer who has 20 proposals on her desk: do you think that she is happy to try to become an expert in 20 new topics by reading about a metric ton of intricate details under heavy time pressure, with cold coffee and cranky kids demanding attention? I don’t think so.

That being said, there is one spot in the proposal where you can be a bit difficult to understand on purpose, but let’s leave that for the next part of this series.

Being easy to understand also means no bulls*it: no fluff or fancy-sounding, big words that mean nothing. For god’s sake, no ChatGPT-produced text because it is full of the above, unless you really, really know how to use it. Write the text yourself. Write concisely, simply, and powerfully. Write like you mean it.

The second corollary demands that you make your case clear directly and very early on. Here, my suggestion is to start with a summary paragraph that is almost enough to win the grant for you. More about this later.

The third corollary — the proposal must contain direct answers to the questions that the reviewers have to answer — is hugely important as well. This requires you to do a bit of reconnaissance: the reviewer guidelines and/or review forms of many grant agencies are public. Get them. Study them. Learn them by heart. Find out what specific questions the reviewers are asked, and make sure that your text contains copy-pasteable answers to each, preferably well-highlighted (in italics, or so), so that in a hurry, the reviewers can recycle your text in their statement. Make sure that your answers are winners and that it is easy for the reviewer to give them full points.

Lastly, clarity and readability are often in direct conflict with the amount of stuff in a proposal. Again, a common beginner’s mistake is to cram in as much text as possible, fiddling with the margins or font sizes and using stamp-sized figures, etc. In contrast, the pros choose what elements to include and then focus on those, leaving enough white space and room to breathe. Don’t make the reviewers choke on the amount of stuff they have to ingest! Focus on what matters. Quality instead of quantity.

That’s all for today. In the next episode, we’ll put on our black hats and talk about some Jedi mind tricks, stolen from the evil folks who write ad copy that makes you buy stuff that you don’t need. Stay tuned!

Are you new around here?

As there has recently been a surge of visitors coming from Moodles and other learning platforms, I thought I’d say hi — hello there!! — to everyone who is new to this blog, and provide some guidance in the form of a table of contents of sorts.

So, where have you landed? This is a blog by me, where me = Jari Saramäki, an interdisciplinary physicist and a professor at Aalto University, Finland, dabbling in network science and other complexities, and a big fan of lucid writing. Also, a bass guitar player, because someone has to be.

The blog contains things that students have found useful (which may be why you are here), in particular, advice on how to write scientific papers and how to develop your scientific writing skills:

Welcome again, and I hope you’ll find something in this blog that is either useful or entertaining, or both!

On scientific writing in the age of AI, part 2: A thought experiment

In the spirit of my post last week, let us continue figuring out the role of AI in scientific writing through a Gedankenexperiment. Where we left off was the use of AI as an assistant — a virtual editor if you’d like — to suggest improvements to one’s text, instead of churning out autogenerated content. Think Grammarly++, or similar. This is, at least to me, perfectly fine. However, I would appreciate it if the text still retains its voice and human touch, lest everything sound exactly the same.

Now, fast forward to the future. If people still write science 25 years from now, how will they use AI tools? What are those tools capable of?

Here is where I feel science — at least natural science — might diverge from more creative forms of writing, as the purpose of written science is ultimately to transmit information. It might even become desirable to have AI write up our results.

Consider the following: suppose that I have carried out an experiment and want to write a paper on its results. I feed my plots, maybe together with a few lines of text about background, impact, etc, to my virtual writing assistant, and off it goes, returning with a complete manuscript. As my virtual assistant has been taught to write in my voice, the manuscript actually sounds like me. I read the manuscript and find that it is factually correct, and submit it to a journal.

Now, if the information in this paper is factually correct and it is written in a way that is appreciated by human readers, how should we feel about this? Is this ethical or unethical? Is this a future we’d like to see or not?

For this to be ethical, it should be done openly and the use of AI acknowledged. Which is of course very easy to do. Maybe this will be common: maybe most papers will be written by AIs that have been fed with original research results.

Beyond ethics, is this good or bad? That, I guess, depends. If all papers sound the same, it is bad. But what if the papers are indistinguishable from human writing, considering that everyone trains their own AI to write in their voice? What might be lost here is the finesse of argumentation, nuances, deep thoughts, and all those things that make famous writers/academics famous. On the other hand, perhaps this loss would be compensated by far fewer crappy, incomprehensible papers… just maybe.

It may also be that written scientific papers will become obsolete, or at least obsolete as stand-alone products (this is already happening with all the Jupyter notebooks and SI data sets and so on). There are already paper formats in some fields (e.g., biomedicine) that leave very little room for creative writing—these are mostly just data containers.

Perhaps scientific papers will in the end not be structured for human readers, but for other AIs that can then better pick up their arguments to propose new theories, experiments, and so on — in other words, replace us scientists. But I have my doubts about this, as I at least hope that science requires creativity that is beyond mere statistics of words. Let us hope that humans can still out-weird AIs in the years to come (is that even a word)!

Still to be continued, I think…

Slides for my CCS warm-up presentation

The young researchers in Complex Systems Society (yrCSS) invited me to talk on scientific writing in Palma de Mallorca on October 15, 2022. It was really great to speak to an active & interested audience!

Here are the slides — I hope you find them helpful!

There is a video recording of the whole talk as well, available on YouTube. Go check it out.

Science — stories or pure data?

In his recent post, Petter Holme presents an entertaining inner dialogue about whether one should market one’s scientific output or not. Much of this centers around the concept of stories — and the discussion on whether we should publish papers that have storylike narratives or just plain data has been going on for a while.

Being an advocate of papers-as-stories, let me add another point of view to the mix.

I feel that there are two dimensions here. The first one is the axis from facts to fiction, and being scientists, we all know where we should place ourselves here. The second dimension is about pure data versus understanding/insight, and it is this dimension that in my view necessitates some storytelling.

Let me explain my reasoning by starting from pure data. Suppose I have carried out an experiment/done some simulations/analyzed a bunch of data I found on the Internet. Now, if I wanted my output to be pure data, I could just release the numbers as tables or graphs or whatever, and maybe an explanation of how the experiments or simulations were carried out. Pure data — no story.

However, my pure data would probably not make sense to many people, if any. To take a step in the direction of meaning, I should at least explain what the research question is that the experiment/simulations/analysis project was designed to answer. I might also feel compelled to tell how the data answer this question, i.e., to give the numbers some meaning.

Notice the elements of a story sneaking in? There is a question, there is an answer.

But even after these additions, only an expert reader would be able to see the meaning in what I have done. For anyone else, more would be needed — why should this question be asked? What is the context for the question? And why should one care about the results?

Add these elements, and we have arrived at the typical structure of a scientific paper that begins with an introduction and ends with a discussion. We have also strayed pretty far from pure data, and are now firmly in the realm of stories. First, we introduce the world and the characters that inhabit it, then we create tension with an open question, and release this tension with an answer.

But such stories of science are not works of fiction; they are told with facts. This, to me, is why papers should be stories — stories provide clarity, understanding, and meaning. They help the reader to connect the dots. Of course, one can and should release pure data too: numbers, results, code, everything. But these only get their meaning through stories.

Podcast interview on writing

I was recently interviewed by Daniel Shea for his podcast Scholarly Communications — you can listen to the interview here: https://newbooksnetwork.com/how-to-write-a-scientific-paper

We discussed my writing book and writing in general. This was a very enjoyable discussion & Daniel had plenty of good points and new perspectives that I could immediately agree with — do have a listen, highly recommended!

How to Write an Excellent Master’s Thesis

I was asked to give a talk on how to write a Master’s Thesis at our department’s Comms & Coffee event this morning; here are the slides.

This talk is an adapted version of my paper-writing system (no, I haven’t written a book about writing Master’s theses, at least not yet). You’ll notice that companies & businesses are mentioned—Aalto is a technical university, so many MSc theses are in fact done by students working as interns/trainees in companies.

I hope the slides are useful. Feel free to share with your students!