How to choose a research question?

In mathematics, the art of proposing a question must be held of higher value than solving it. –Georg Cantor

It can be said with complete confidence that any scientist of any age who wants to make important discoveries must study important problems. Dull or piffling problems yield dull or piffling answers. It is not enough that a problem should be ‘interesting’: almost any problem is interesting if it is studied in sufficient depth. –P. B. Medawar

Choosing which problems to work on is perhaps the hardest and most crucial part of science. It is also an invisible and underrated part. In university, we are taught to solve problems that someone has designed for us to solve. I’ve yet to see an exam where the task is to invent a problem rather than solve one! But without problems, there are no solutions either. Problems come first, and solutions are only meaningful when the problems themselves are meaningful.

Despite this, solutions and the clever methods we develop to achieve them tend to take center stage when we write up our research. While we typically provide a post hoc justification for our research question in the introduction, convincing the reader why it is important, we practically never document how we actually chose that problem over a large number of others.

How did we search for the problem? How did we identify it? Was it already circulating in the literature, known to others as well? Did it come to us suddenly in a flash of inspiration — this would be a great problem to study? Did we have an intuitive feeling that if we begin to chip away at this problem, something useful will emerge? Or did we accidentally stumble upon some results, only later realizing what problem they actually solve? There are several ways of arriving at a research problem.

While having a lot of experience in one’s field is of course useful for identifying great problems, not knowing everything can also be an advantage. Creativity thrives when unexpected connections are made, and knowing too much can lead to tunnel vision. This is why it might be a good idea to switch fields once or twice during your career… Of course, if you know nothing, it’s not possible to invent meaningful problems, and if you know too little, you might come up with problems that others have already solved. So know your literature, but don’t be afraid to set your own research goals, not solely relying on what everyone else thinks is important.

When doing a Ph.D., start developing the skill of inventing meaningful scientific problems from day one! This investment pays compound interest: the better the problems you invent, the more doors your results open, leading to even better questions.

Often, a good research problem is like a seed: plant it in fertile soil, tend to it well, and more great problems will sprout.

Let’s now suppose that you have a list of candidate problems—research questions that you might well like to study. Which one should you pick? What does a great research problem look like? First, let’s impose some real-world constraints.

Of course, the problem should be important, and its solution should make a difference. But it also needs to be solvable—impossible problems are not worth it, especially if you’re trying to finish a Ph.D. Still, problems may look deceptively easy, and it’s probably safe to say that the large majority of problems are more difficult than they first appear.

Since there is a finite time in which the Ph.D. has to be completed, it might be wise to mitigate risk by balancing your more ambitious endeavors with some “safe” problems that guarantee results. It is also good to have a plan B for getting something publishable out of problems that are too hard or too slow to completely crack. Then there is always material to write about, even if the more grandiose undertakings fail or take one to several lifetimes to complete.

Therefore, “finding a cure for cancer” or “developing an artificial brain” do not qualify as great research problems for your Ph.D. Rather, they are long-term targets of entire fields of science. However, “figuring out the role of pathway X in preventing our immune cells from attacking tumours” or “understanding the role of criticality in the predictive power of liquid-state machines” could be steps in the right direction.

Confession: I made these up sort of randomly (though not entirely randomly). Yet both are focused and concrete enough that one could actually start attacking them from some experimental or theoretical angle.

Thus far, we’ve covered the easy part: problems should be concrete and solvable, and impossible problems are better left alone. But not all concrete, solvable problems are worth solving. The elephant left in the room is importance. What makes a problem important? And what does that even mean?

Of course, your results might yield obvious and direct benefits—perhaps contributing to a new medical treatment or laying the groundwork for future advancements in artificial intelligence. However, most of the time, “importance” is a more elusive concept. It’s usually easier to assess the significance of a result after the fact: if a scientific article is highly cited, it likely had a substantial impact on other scientists’ thinking.

When we move from concrete outputs like articles to the more abstract and hazy realm of ideas, one way to visualize science is as an ever-growing network where ideas give birth to new ideas. An impactful idea is one that sparks many others downstream, either directly as offspring or indirectly through a chain of intermediate concepts. This latent process is what generates the aforementioned citations: ideas spawning ideas, and, in a Darwinian sense, great ideas giving rise to more ideas.

Unlike in biology, however, an idea can have far more than two parents, and its fitness isn’t always immediately apparent—there can be a long delay before its importance is recognized. And unfortunately, sometimes the building blocks that had to be put in place before a major new idea could emerge are forgotten. Everyone knows about Einstein’s theory of relativity, but few are aware of all the earlier efforts that went into developing ways to synchronize clocks across countries and continents using electric cables. Yet Einstein was undoubtedly familiar with this work, and it must have influenced his way of thinking.

In any case, even if assessing the importance of a research question is not trivial, it is worth asking why someone would cite your results, say, 10 years down the line. Is the question fundamental enough that, if you solve it, others will build on your results? Try to see the question as part of a bigger tapestry.

The above picture of science as a flow of ideas being born, merging, and mutating is also helpful for reviewing the literature, both for coming up with research questions and understanding what has already been done in the general vicinity of a question that you have chosen to address. Science is a network—to make discoveries, follow the links of that network!

Identify impactful and highly cited papers and try to figure out why they are important. Then, use Google Scholar or some other tool to find out who cites them; just look at the abstracts of the citing papers to get the big picture and then dive into the details of those pieces of work that sound relevant to you. This is, in my view, more useful than trying to read all the literature in detail, in some random order. Try to first see the forest for the trees, and then focus on those trees that you find important. If you feel that some trees or entire forests are missing, you have a research question!

The Sound of Temporal Networks

I recently gave a talk at the Complexity, Aesthetics, and Sonification workshop in Bielefeld, Germany, organized by Thilo Gross, Maximilian Schich, and Cristián Huepe. A really great workshop with lots of different points of view from art to science!

For the talk, I did a bit of exploration in representing temporal networks with sounds. As those who have dabbled with temporal networks know, visualizing them is very difficult, as they live in time instead of space. But so do sounds. Let’s hear what temporal networks sound like, then!

So what was that? That was one month’s worth of data on students’ phone calls from the Copenhagen experiment, compressed into 13 seconds. I took 10 random students and assigned each their own random pitch so that a sound is played every time the student makes a call. I then turned the time series into MIDI which was fed into one of the synthesizers of Apple’s Logic Pro X.
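For readers who want to try something similar, here is a minimal sketch of this kind of mapping in Python. It is not the actual pipeline behind the clip above: the function name `sonify_calls`, the `(timestamp, caller)` event format, and the pitch range are all assumptions made for illustration, and the step of writing a real MIDI file (which would need a MIDI library) is left out.

```python
import random

def sonify_calls(events, students, target_seconds=13.0, seed=0):
    """Map (timestamp, caller) call events to (onset_seconds, midi_pitch) notes.

    Hypothetical helper: the event format and the pitch range are assumptions.
    """
    rng = random.Random(seed)
    # Give each selected student a distinct random MIDI note number in 48..83.
    pitches = rng.sample(range(48, 84), len(students))
    pitch_of = dict(zip(students, pitches))

    # Keep only calls made by the selected students, in time order.
    selected = sorted(
        ((t, c) for t, c in events if c in pitch_of),
        key=lambda event: event[0],
    )
    if not selected:
        return []

    # Linearly compress the observed time span into target_seconds.
    t0, t1 = selected[0][0], selected[-1][0]
    span = (t1 - t0) or 1  # avoid division by zero for a single event
    return [((t - t0) / span * target_seconds, pitch_of[c]) for t, c in selected]
```

The returned list of onsets and pitches is roughly what a MIDI track stores, so it could be handed to whatever MIDI writer or synthesizer you prefer.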

For such a simple and straightforward exercise, there’s a surprising amount of information in the sonification. If you are into temporal networks, you can hear several familiar patterns: there is a daily cycle, weekdays are different from weekends, and there’s also burstiness.

Let’s continue listening to these data. The Copenhagen data set contains metadata on text messages as well, so let’s pick one of the students and listen to their egonet: everyone they call or text gets their own pitch, so that, e.g., one friend is always C (in some octave). Then we’ll feed the calls into a sampler with piano sounds and the texts into another with sampled upright bass.

Quite jazzy, isn’t it? And, again, one can pick up a lot of information here. The daily cycle and the burstiness are still there — and there are even some repeated patterns, parts of temporal motifs. There is also a finding that had escaped my attention earlier — at around the middle of the timeline, there is a cluster of notes being played on the piano, as the student makes a large number of calls in a short period of time. This pattern is, in fact, present in several other students’ timelines at the very same time.

Now let’s have a bit of fun with probing the network with random walkers. I use greedy walkers — a random walker is placed on a node (student), and when the student makes a phone call, the walker moves on to the student being called, and so on. Every newly visited student gets their own pitch that is one semitone higher; when the pitch goes down during the process, this means that the walker is visiting nodes that were already visited. Let’s hear one walk, starting from a random node:

The walker explores a larger subnetwork around the starting point, sometimes backtracking, before escaping. Now let’s hear another walk:

Quite different, right? This walker has literally become stuck in a neighbourhood of a few students who only keep calling one another and the walker cannot escape. So the social neighbourhoods of these two students are quite different indeed!
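The walker rule described above is simple enough to sketch in a few lines of Python. This is only an illustration under assumptions: the function name `greedy_walk` and the `(time, caller, callee)` tuple format are made up here, and pitches are returned as semitone offsets rather than actual notes.

```python
def greedy_walk(calls, start):
    """Follow time-ordered (time, caller, callee) calls, starting from `start`.

    Returns a list of (time, node, semitone) steps. Each newly visited node
    gets a semitone offset one higher than the previous new node, so a drop
    in pitch means the walker has returned to an already visited node.
    """
    semitone_of = {start: 0}
    position = start
    steps = [(None, start, 0)]  # the starting node has no arrival time
    for time, caller, callee in sorted(calls, key=lambda call: call[0]):
        if caller != position:
            continue  # the walker moves only when its current host makes a call
        position = callee
        if position not in semitone_of:
            semitone_of[position] = len(semitone_of)
        steps.append((time, position, semitone_of[position]))
    return steps
```

A walker stuck between two students who only call each other would produce the offsets 0, 1, 0, 1, and so on, which is the trapped pattern heard in the second walk.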

Finally, for something entirely different: the sound of criticality. This is simulated (by my student Sara Laurila): what we have is the SIS (Susceptible-Infectious-Susceptible) model on a network of N = 50,000 nodes, parametrized exactly at criticality, that is, on the boundary between two phases: in one, all activity dies out, and in the other, there is persistent activity. (In the model, nodes are S until they are in contact with an I; then they become I and make others I too, until they revert to being S, to become I again at some point in the future. So this excitation (I) propagates through the network.)
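To make the model concrete, here is a minimal SIS sketch in Python. It is not the simulation used for the sounds in this post: the discrete-time synchronous update, the parameter names `beta` (infection probability per contact per step) and `mu` (recovery probability per step), and the function names are all assumptions. It also makes no attempt to locate the critical point, which depends on the network.

```python
import random

def sis_step(neighbors, infected, beta, mu, rng):
    """One synchronous SIS update; returns the new set of infectious nodes."""
    new_infected = set()
    for node in infected:
        # An infectious node stays I with probability 1 - mu, else reverts to S.
        if rng.random() >= mu:
            new_infected.add(node)
        # It infects each currently susceptible neighbour with probability beta.
        for nb in neighbors[node]:
            if nb not in infected and rng.random() < beta:
                new_infected.add(nb)
    return new_infected

def simulate_sis(neighbors, seeds, beta, mu, steps, sentinels, seed=0):
    """Return (step, node) events for every time a sentinel node turns I."""
    rng = random.Random(seed)
    infected = set(seeds)
    events = []
    for step in range(1, steps + 1):
        nxt = sis_step(neighbors, infected, beta, mu, rng)
        # A sentinel "sounds" when it is I now but was S on the previous step.
        events.extend((step, n) for n in sentinels if n in nxt and n not in infected)
        infected = nxt
    return events
```

Here `neighbors` is a plain dictionary mapping each node to a list of its neighbours. The returned events have the same shape as call events, so they could be turned into notes in the same way as the phone-call data.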

In the sonification below, I again use a random sample of sentinel nodes, each assigned their own random pitch. The nodes make a sound whenever they turn I, i.e., whenever the wave of excitation hits them. Here’s what criticality sounds like:

Here’s the same but with drum sounds instead. Sounds like Zappa, but without intention or direction, as a drummer friend of mine remarked.

And finally, criticality from the point of view of one single sentinel node:

Apply Now to Our Master’s Programme in Life Science Technologies!

Our popular Master’s programme in Life Science Technologies at Aalto University, Finland has a major in Complex Systems! The major includes a lot of network science taught by top scientists in the field — yours truly, Mikko Kivelä, and Petter Holme. You’ll also learn some Python programming, data science, machine learning, and nonlinear dynamics — or, if you wish, you can choose a more maths-heavy subset of courses, or combine your studies with, e.g., human neuroscience.

This major has very tight connections to research, and many students have continued toward their doctoral degrees after receiving their M.Sc. Another very popular and successful career path has been that of an industrial data scientist or consultant, e.g., in the health industry. There is a lot of demand for these in the Finnish job market, so a Master’s degree in Life Science Technologies is a great investment in your future.

The application period is open only until 2 Jan 2024, 3:00 PM GMT+2 so be quick & apply now!

Season’s Grant-Writing Tips, Part 2/2

A very, very AI-generated image where money falls down like snow.

In the first part of this grant-writing mini-series, we learned the fundamental secret of grant-writing (and, in fact, any writing): everything revolves around the reader. The only purpose of a grant proposal is to make it easy for the reviewer to recommend funding.

Let’s break that statement down. For the reviewer to recommend funding, she has to feel that what you aim to do is important, novel, and feasible, and that you are exactly the right person/team to do this. In more touchy-feely terms, the reviewer has to like the proposal. And you.

As we discussed in the previous post, this is much more likely to happen if the proposal doesn’t make the reviewer work too hard: it should be focused, clearly written, and provide clear answers to the questions the reviewer must address.

To help with the above, we’ll now address writing at the level of paragraphs and sentences, borrowing some tricks from professional copywriters who craft advertising text. These techniques not only involve gently manipulating the reader—all writing is about manipulating the reader!—but also aim to ensure that the text flows. An ad where the reader gets lost or bored is a failed ad.

Let’s begin at the beginning because it is the most important place. In any writing, the first sentence and the first few words have enormous power (“Call me Ishmael”), and you should tap into this power. This is because they prime the reader’s mind for what is to come. They also set the general mood. Begin your proposal with a few strong sentences that almost win the grant! These sentences should summarize your plan and its impact: why is it important to do the things you plan to do? Why are you in a unique position to do this? If your grant is funded, how will the world become a much better place?

This mini-summary serves a dual purpose in priming the reader. Firstly, on an emotional level, the reviewer should feel excited – “This sounds like a great proposal!” If you achieve this, the reviewer will have a positive bias from the very beginning. However, with a weak or muddled beginning, you’ll need to work hard to win them over. Secondly, it is much easier for the reviewer to follow the text when they know where it is going — easier in terms of both comprehension and how reading the text feels (these two are, in fact, the same).

There is another place of power: endings. The power of endings is different from that of beginnings: whereas beginnings prime the reader, the endings are what the reader remembers. This is because between paragraphs and between sections there is a break in reading, where the stream of input to the reader’s brain temporarily ceases. This leaves more space for whatever the last input was to echo around in the reader’s head.

Saving important bits to the end is a common copywriter trick—ever seen an ad with “click here to buy” in the middle?

However, this trick works best for short sections and well-written text. If you lose your readers along the way, they won’t reach the end. Remember the overworked, sleep-deprived reviewer from the last post? She might be tempted to just skim, you know. To mitigate this risk, write short paragraphs, so that the reader makes it through to their end, and write them well. For section endings, a strong recap sentence, perhaps as a separate paragraph, can do wonders. “In summary, my research can be expected to have an enormous impact, because…”

We’ve now covered beginnings and endings. What is left is how to get from the former to the latter. Here, a copywriter’s trick is to understand that while the sentences must deliver information (including enough details of your research plan to judge its feasibility, etc.), their task is also to propel the reader forward. In ad copy, the primary task of every sentence is to make the reader read the next!

This means that the sentences should seamlessly flow into one another, which is a general sign of good writing regardless of the genre. This is particularly important for information-dense grant proposals: information is much, much easier to absorb through a narrative than when it is presented as disconnected bits and pieces. The narrative is what keeps the reader going: as humans, we’ve enjoyed stories since the dawn of man, singing around campfires.

For a grant, the narrative is particularly important for sections prone to being dense, taxing, and boring—imagine the sleep-deprived reviewer having to wade through 25 poorly written state-of-the-art sections! This is especially crucial if the section is at the proposal’s beginning, as state-of-the-art sections often are. So next time when writing one, consider the reviewer, and instead of just listing references, write a story of how your field of science has evolved to the point where you can both ask and answer your research question.

Finally, as I mentioned in the previous post, there is one spot in the proposal where you can be slightly difficult to understand on purpose, in particular, if the reviewer is not really in your (sub)field and your proposal involves theory/maths/data analysis/similar.

This is in the methods section, or whatever the section where you describe what you are going to do is called. Whereas the research question and its importance should be written with absolute clarity so that everyone can understand them, here you can show off a bit. The point is to give the impression that you really know your stuff. Even though your proposal should generally be as free of jargon as humanly possible, it doesn’t hurt to have one strategically placed sentence where you flex your claws, show that you can devour your field’s most complicated concepts for breakfast, and instill a bit of fear and awe in the reviewer. Then you can be all nice again, and wait for the gifts to arrive.

I wish you merry grant-writing!

Season’s Grant-Writing Tips, Part 1/2

Grant money falling like snow (a very, very AI-generated image, by craiyon.com)

It is grant-writing season here in snowy Finland, and to keep away from the actual work, I thought I’d write a couple of posts on grant-writing tips. Today we’ll be all nice, but in the next episode, we’ll get a bit naughty because that might in the end bring us more gifts. Ho ho ho.

Let’s start at the very beginning. When writing a grant, the most important thing for you to understand is what is going through the heads of your target audience—the reviewers. You are writing the grant to persuade them to recommend you to get funded. Your one and only task is to make this as easy as possible for them.

This simple rule — to make it as easy as possible for the reviewers to recommend funding the proposal — gives rise to many corollaries.

To arrive at those, consider the situation that the reviewers find themselves in. It is very rare to get a single proposal to review that is spot on in the reviewer’s own subfield. What is more common is that there is a large pile of proposals on the reviewer’s desk, they are almost but not entirely off-topic, the deadline was last week, the reviewer has barely slept because the kids are sick, and even the coffee has gotten cold.

In this situation, the reviewer will be very, very grateful if you make her task easier.

This means, among other things, that a) the proposal must be easy to understand, even to a non-expert, b) the proposal’s value and level of ambition must be immediately visible, c) the proposal must contain direct answers to the questions that the reviewer has to answer, and d) the proposal must not contain any more stuff than is necessary to convince the reviewer.

The first corollary requires that you’ve actually given your research plan enough thought so that you can understand it yourself—in other words, you must know what you are doing. It helps a lot to have a clear focus: it is a common beginner’s mistake to try to squeeze all your ideas into one proposal, which then reads like a confusing superposition of several muddled research plans. Focus on a single topic and your best idea to avoid confusing the reviewers because otherwise, they won’t know which of your parallel plans they should be rating. Confused people are rarely happy people, and only happy people give top ratings!

Being easy to understand also means well-written: reading a good grant proposal shouldn’t feel taxing. Avoid jargon and complicated sentences; always err on the side of simplicity. Also, your proposal should not read like lecture notes because the proposal is not about teaching the reviewers. Nothing is as annoying as being lectured to if you only want to get your reviews done!

The proposal should contain enough information to convince the reviewers of how and why you plan to do what you plan to do, but no more than that. Again, think of the poor reviewer who has 20 proposals on her desk: do you think that she is happy to try to become an expert in 20 new topics by reading about a metric ton of intricate details under heavy time pressure, with cold coffee and cranky kids demanding attention? I don’t think so.

That being said, there is one spot in the proposal where you can be a bit difficult to understand on purpose, but let’s leave that for the next part of this series.

Being easy to understand also means no bulls*it: no fluff or fancy-sounding, big words that mean nothing. For god’s sake, no ChatGPT-produced text because it is full of the above, unless you really, really know how to use it. Write the text yourself. Write concisely, simply, and powerfully. Write like you mean it.

The second corollary demands that you make your case clear directly and very early on. Here, my suggestion is to start with a summary paragraph that is almost enough to win the grant for you. More about this later.

The third corollary — the proposal must contain direct answers to the questions that the reviewers have to answer — is hugely important as well. This requires you to do a bit of reconnaissance: the reviewer guidelines and/or review forms of many grant agencies are public. Get them. Study them. Learn them by heart. Find out what specific questions the reviewers are asked, and make sure that your text contains copy-pasteable answers to each, preferably well-highlighted (in italics, or so), so that in a hurry, the reviewers can recycle your text in their statement. Make sure that your answers are winners and that it is easy for the reviewer to give them full points.

Lastly, clarity and readability are often in direct conflict with the amount of stuff in a proposal. Again, a common beginner’s mistake is to cram in as much text as possible, fiddling with the margins or font sizes and using stamp-sized figures, etc. In contrast, the pros choose what elements to include and then focus on those, leaving enough white space and room to breathe. Don’t make the reviewers choke on the amount of stuff they have to ingest! Focus on what matters. Quality instead of quantity.

That’s all for today. In the next episode, we’ll put on our black hats and talk about some Jedi mind tricks, stolen from the evil folks who write ad copy that makes you buy stuff that you don’t need. Stay tuned!

Are you new around here?

Notebooks and a pencil

As there has recently been a surge of visitors coming from Moodles and other learning platforms, I thought I’d say hi — hello there!! — to everyone who is new to this blog, and provide some guidance in the form of a table of contents of sorts.

So, where have you landed? This is a blog by me, where me = Jari Saramäki, an interdisciplinary physicist and a professor at Aalto University, Finland, dabbling in network science and other complexities, and a big fan of lucid writing. Also, a bass guitar player, because someone has to be.

The blog contains things that students have found useful (which may be why you are here), in particular, advice on how to write scientific papers and how to develop your scientific writing skills:

Welcome again, and I hope you’ll find something in this blog that is either useful or entertaining, or both!

Slides for my keynote at Complex Networks 2019

I gave a keynote talk at the Complex Networks 2019 conference in Lisbon—here are the slides, if you are interested.

If you are interested in temporal networks in general, here are some pointers:

Postdoc Wanted — Network Science, Public Transport Networks, Cities, etc

We are looking for a postdoc (2 years) to work on the intersection of complex systems/networks, transport engineering, human mobility, the science of cities, and data science.

This position is related to ongoing collaboration between my group and Prof. Milos Mladenovic’s (Twitter: milosplanner) transport engineering group (both at Aalto University, Helsinki area, Finland).

We want to bridge the gap between network science and transport engineering, including city planning and public transport network planning; for our earlier joint works, see, e.g.,

What we can offer:

  • Access to unique data: e.g., details of all trips from Kutsuplus, famous for being the world’s first on-demand public transport service; vehicle-level geocoordinate trajectories for public transport in the Helsinki region; aggregated mobile-phone flow data; and more coming in.
  • True multidisciplinarity with real-life application potential: in addition to the two teams from different domains (networks & transport), we interact with on-demand transport companies, the Helsinki Region Transport Authority, etc.
  • Access to heavy-duty computational resources (our Triton cluster, etc)
  • Access to lots of in-house expertise on networks, data science, and transport studies
  • Lively environment: Aalto University with a campus ~10 km from the centre of Helsinki with its own subway station (great public transport connectivity!)
  • Decent salary: >3keur/month, which is really quite OK in the Helsinki area (despite taxes + costs of living being a bit higher than in most countries)
  • Darkness of winter that is compensated by almost around-the-clock sunlight in the summer!

What we expect:

  • Expertise in network science/complex systems/data science
  • Some level of expertise in cities, transport, spatial networks, geodata, etc
  • PhD in a field relevant to the above
  • Skills in Python or willingness to learn them fairly quickly (packages such as gtfspy will help you get started)
  • Interest in the topic!

The call is open until the 20th of December; the applications will be processed and (Skype) interviews with shortlisted candidates will be conducted in January 2020.

Please email a single combined PDF document containing 1) a cover letter, 2) your CV and publication list, and 3) contact details for two references to jari.saramaki@aalto.fi, with “Mobility postdoc” in the subject line.

What is scientific creativity—and how do you feed it? (Part I)

Last winter, on a speaking trip to Norrköping, someone asked me to write about skills (and meta-skills) that scientists and PhD students need, beyond writing papers. Turns out that this is a lot more difficult than writing about writing, where the end product (a scientific paper) is something tangible and amenable to analysis: what do great introductions look like? How do the greatest writers finish their papers? It is much more difficult to write, say, about learning to be creative, which is what I shall try to do here. But what would be more important for aspiring scientists than creativity?

Science is all about creativity: coming up with the right questions, developing clever methods to answer those questions, and connecting the answers in imaginative ways to learn something greater. But we rarely talk about creativity as a skill—often, people view it as something that you either have or don’t have, just like an ear for music or an eye for design. And just like with music and design, this view is wrong: everything can be learned. So how do you learn to be creative?

Before attempting to answer this question, let’s take the bull by the horns and ask what creativity is. If by creativity we mean the ability to bring forth ideas that are entirely new, we are immediately hit by a very difficult, philosophical question: where do new ideas come from? At least to us (recovering ex-) physicists, the emergence of something that wasn’t there before is kind of strange: aren’t there conservation laws that forbid this kind of travesty from happening? What is it that gives birth to new information (because that is what happens when a new idea emerges, whether it is a question or an answer)?

If physics doesn’t provide us with answers, let’s drop it for a while and put on the hat of a biologist: in the realm of living things, don’t new things gradually emerge, driven by slow Darwinian evolution? Notice the word “gradually”: biological evolution is slow tinkering, a process where existing forms and shapes and organs are gradually transformed into something new, of dinosaurs developing feathers that eventually help some of them to learn to fly, of finches’ beak shapes adapting to their habitats. So in biological evolution, everything that is “new” is built on top of a lot of something old, and this happens slowly: a slow expansion into the adjacent possible, if you’ve read your Kauffman.

Are there some other natural processes where new forms emerge more rapidly? The human immune system provides a great example. Somewhat surprisingly, not all our cells carry the same sets of genes: the T and B cells of our immune system, our ultimate smart weapons against viruses and other invaders, display an enormous diversity of different receptors that recognise those invaders. This diversity results from those cells carrying some randomised (but not too randomised) parts of our genome. The precursor cells that eventually become T and B cells have strings of different modules in their genetic code, and in the process of randomisation, some of those modules are randomly picked and joined together (the rest are discarded). Then, a bit of extra randomness (extra letters, deleted letters, and so on) is added to their junction. So to arrive at new kinds of receptors, our bodies randomly merge things that are known to work (those receptor modules) and then add some noise on top. Again, “new” equals “old, but with added something.”
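The recipe “pick known modules, join them, add noise at the junction” can be caricatured in a few lines of Python. This is a toy sketch only, loosely inspired by what immunologists call V(D)J recombination: the function name `recombine`, the module strings, and the noise parameter are all hypothetical, and only inserted letters are modelled (deleted letters are left out for brevity).

```python
import random

def recombine(module_groups, alphabet="ACGT", max_noise=3, rng=None):
    """Pick one module per group, join them, and add random junction letters.

    Toy model: `module_groups` is a list of lists of strings, one list per
    group of interchangeable modules. The unpicked modules are discarded.
    """
    rng = rng or random.Random()
    picked = [rng.choice(group) for group in module_groups]
    receptor = picked[0]
    for module in picked[1:]:
        # Insert between 0 and max_noise random letters at each junction.
        n_extra = rng.randint(0, max_noise)
        junction = "".join(rng.choice(alphabet) for _ in range(n_extra))
        receptor += junction + module
    return receptor
```

Calling `recombine` repeatedly on the same small set of modules yields many different strings, which is the point of the analogy: most of each result is old, and only the combination and a little noise are new.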

Let’s now return to creativity, in the context of science or otherwise. The above examples point out that the old rhyme (“something old, something new, something borrowed, something blue”) is scientifically highly accurate, except for the blue bit perhaps. In other words, the things that we think are new are in fact modifications and clever combinations of old things, with perhaps some small amount of additional randomness. Ideas do not live in a vacuum; they emerge because of other ideas.

Therefore, creativity is the ability to merge existing ideas in new ways (while possibly adding a magic ingredient on top).

This brings us to a fairly simple recipe for feeding one’s creativity: collect lots of things that can be combined/transmogrified into something new, and then just combine them! In other words, first, feed your head with lots of information—and not just any information, but preferably pieces of information that haven’t yet been combined.

To maximise the chance of something entirely new emerging out of this process, your input information—the stuff that you feed your head with—should be diverse enough. There are, however, different possibilities: on the one hand, if you know everything that there is to know about your field, you can probably see where the holes are and combine bits of your knowledge in order to fill them. On the other hand, if you know enough about a lot of fields, you might be able to spot connections between them (think of, say, network neuroscience, applying network theory to problems of neuroscience). There are different styles here, but even if you choose to go deep instead of wide, do keep the diversity of input information in mind: just for fun, learn some mathematical techniques that people do not (yet) use in your field! You never know, those might turn out to be useful later.

To be continued…