Functional brain networks: the problem of node definition

Summary: Nodes in brain networks from fMRI are usually defined using ROIs (Regions of Interest), so that each ROI node has a time series that is the average of the BOLD time series of the ROI’s voxels, and links represent correlations between nodes. Here, we show that this averaging of voxel time series is problematic.

The human brain is a complex network of neurons. The problem is that there are about 10^11 of them with ~10^4 outgoing connections each; mapping out a network of this scale is not possible. Therefore, one needs to zoom out and look at the coarse-grained picture. This coarse-grained picture can be anatomical (a map of the large-scale wiring diagram between parts of the brain) or functional, indicating which parts of the brain tend to become active together under a given task.

But how should this coarse-graining be done in practice? How should the nodes of a brain network be defined, and what should they represent? In functional magnetic resonance imaging (fMRI), the highest level of detail is determined by the imaging technology. In an fMRI experiment, subjects are put inside a scanner that measures the dynamics of blood oxygenation in a 3D representation of the brain, divided into around 10,000 volume elements (voxels). Blood oxygenation is thought to correlate with the level of neural activity in the area. As each voxel contains about 5.5 million neurons, the network of voxels is significantly smaller than the network of neurons. However, it is still too large for many analysis tasks, and further coarse-graining is needed.

A typical approach in the fMRI community is to group voxels into larger brain regions that are, for historical reasons, known as Regions of Interest (ROIs). This can be done in many ways, and there are many pre-defined maps (“brain atlases”) that define ROIs; these maps are based on anatomy, histology, or data-driven methods. It is common to use ROIs as the nodes of a functional brain network. The first step in constructing the network is to assign to each ROI a time series that is the average of the time series of its voxels measured in the imaging experiment. Then, to get the links, similarities between the ROI time series are calculated, usually with the Pearson correlation coefficient. The correlation between two ROIs becomes their link weight. Often, only the strongest correlations are retained, and weak links are pruned from the network.
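The construction steps described above can be sketched in a few lines of Python. This is only a minimal illustration on synthetic data, not the pipeline used in our paper: the function name `roi_network`, the `density` parameter, and the toy signals are all assumptions made for the example.

```python
import numpy as np

def roi_network(voxel_ts, roi_labels, density=0.1):
    """Build a ROI-level network from voxel time series.

    voxel_ts   : array of shape (n_voxels, n_timepoints)
    roi_labels : array of shape (n_voxels,), the ROI index of each voxel
    density    : fraction of strongest links to keep (illustrative choice)
    """
    rois = np.unique(roi_labels)
    # Step 1: each ROI time series is the average over its voxels.
    roi_ts = np.vstack([voxel_ts[roi_labels == r].mean(axis=0) for r in rois])
    # Step 2: link weights are Pearson correlations between ROI time series.
    weights = np.corrcoef(roi_ts)
    weights = (weights + weights.T) / 2  # guard against rounding asymmetry
    np.fill_diagonal(weights, 0.0)       # no self-links
    # Step 3: keep only the strongest links, prune the rest.
    upper = weights[np.triu_indices(len(rois), k=1)]
    n_keep = max(1, int(round(density * upper.size)))
    threshold = np.sort(upper)[-n_keep]
    pruned = np.where(weights >= threshold, weights, 0.0)
    return rois, weights, pruned

# Synthetic example: ROIs 0 and 1 share a common signal, ROI 2 does not.
rng = np.random.default_rng(0)
n_t = 200
shared = rng.standard_normal(n_t)
labels = np.repeat([0, 1, 2], 5)                 # 3 ROIs with 5 voxels each
voxels = 0.5 * rng.standard_normal((15, n_t))    # voxel-level noise
voxels[labels < 2] += shared
rois, weights, pruned = roi_network(voxels, labels, density=1 / 3)
print(np.round(weights, 2))
```

With these toy signals, the only link that survives pruning is the one between the two ROIs that share a signal. In real data, the choice of threshold is itself a judgment call.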

If the ROI approach is to work, the ROIs should be functionally homogeneous: their underlying voxels should behave approximately similarly. Otherwise, it is not clear what the brain network represents. Because this assumption hasn’t really been tested properly and because it is fundamentally important, we recently set out to explore whether it really holds.

We used resting-state data – data recorded with subjects who are just resting in the scanner, instructed to do nothing – to construct functional ROI-level networks based on some available atlases. We defined a measure of ROI consistency that has a value of one if all the voxels that make up the ROI have identical time series (making the ROI functionally homogeneous, which is good), and a value of zero if the voxels do not correlate at all (making that ROI a bad idea, in general).
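One simple measure with these two properties is the mean Pearson correlation over all pairs of voxels inside the ROI. The sketch below assumes that definition for illustration; see the paper for the exact measure we used. The function name `roi_consistency` and the synthetic signals are made up for the example.

```python
import numpy as np

def roi_consistency(voxel_ts):
    """Mean Pearson correlation over all distinct voxel pairs in one ROI.

    voxel_ts : array of shape (n_voxels, n_timepoints)
    """
    corr = np.corrcoef(voxel_ts)
    pairs = np.triu_indices(corr.shape[0], k=1)  # each pair counted once
    return float(corr[pairs].mean())

rng = np.random.default_rng(42)
signal = rng.standard_normal(500)

# A perfectly homogeneous ROI: every voxel has the same time series.
identical = np.tile(signal, (20, 1))
# A ROI whose voxels have nothing to do with each other.
independent = rng.standard_normal((20, 500))

c_identical = roi_consistency(identical)
c_independent = roi_consistency(independent)
print(c_identical, c_independent)
```

The first value comes out at one (up to rounding) and the second close to zero, matching the two extremes described above.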

Distribution of consistency for ROIs as brain network nodes
[Figure from our paper in Network Neuroscience]

We found that consistency varied broadly between ROIs. While a few ROIs were quite consistent (values around 0.6), many were not (values around 0.2).  There were many low-consistency ROIs in three commonly used brain atlases.

From the viewpoint of network analysis, the existence of many low-consistency ROIs is a bit alarming. We also observed strong links between low-consistency ROIs. How should this be interpreted? These links may be an artefact, as they disappear if we look at the voxel-level signals. This means that the source of the problem is probably the averaging of voxel signals into ROI time series. While this averaging can reduce noise, it can also remove the signal: at one extreme, if one subpopulation of voxels goes up while another goes down, the average signal is flat. More generally, if an ROI consists of many functionally different subareas, their average signal is not necessarily representative of anything.
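The extreme case is easy to reproduce with synthetic data. In the toy example below (all signals and parameters are invented for illustration), half of the voxels in an ROI follow a sine wave and the other half follow its mirror image. Every voxel carries a strong signal, but the ROI average is almost flat.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0, 10, 500)
signal = np.sin(2 * np.pi * 0.5 * t)

# Two subpopulations of voxels: one goes up when the other goes down.
up = signal + 0.1 * rng.standard_normal((10, t.size))
down = -signal + 0.1 * rng.standard_normal((10, t.size))

# The ROI time series is the average over all 20 voxels.
roi_mean = np.vstack([up, down]).mean(axis=0)

voxel_std = up[0].std()      # amplitude of a single voxel's fluctuations
average_std = roi_mean.std() # amplitude of the averaged ROI signal
print(voxel_std, average_std)
```

The averaged time series has only a small fraction of the amplitude of any single voxel, and what remains is noise.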

In conclusion, we would recommend being careful with functional brain networks constructed using ROIs; at least, it would be good to go back to the voxel-level data to verify that the obtained results are indeed meaningful.

For details, see our recent paper in Network Neuroscience.

This post was co-written by Onerva Korhonen, Enrico Glerean & Jari Saramäki.

[PS: The definition of brain network nodes is not the only complicated issue in the study of functional brain networks. Even before one has to worry about node selection, a possible distortion has already taken place: preprocessing of the measurement data. We’ll continue this story soon.]

Why can writing a paper be such a pain?

This is the first in a series of “self-help” posts for PhD students on how to write a scientific paper.

Writing a scientific paper

Show me a researcher who has never struggled with writing, and I’ll show you someone who hasn’t written anything, or who doesn’t care about the quality of the output. Science is hard, and so is writing. Together they are harder. Now add in lack of experience as a researcher and as a writer, together with the usual time pressure, and it’s no wonder that the blank document in front of you looks like the north face of Mount Everest. We’ve all been there, staring at that wall.

While no mountaineer would risk climbing Everest without a route plan, an inexperienced writer tends to neglect the importance of planning. Having no plan, she tries to do everything at once. She opens the blank document in her editor, stares at it, tries to decide what to make of her results, what the first sentence of the first paragraph should be, what the point of the first paragraph should be, and what the point of the whole paper should be.

It’s no wonder that this feels impossible. No-one can solve that many problems in parallel. Problems are best solved one at a time.

Writing becomes easier if one separates the process of thinking from the process of writing. To write clearly is to think clearly, and thinking precedes writing. Writing becomes a lot less of a struggle when you think through the right things in the right order, before putting down a single word.  A successful software project begins with the big picture: what functions and classes are needed, and for what purpose. It doesn’t begin with developing code for the internal bits of these functions and classes. A writing project should also begin with addressing the overall point and structure of the paper, before moving to details such as words or sentences.

Another way of looking at the problem is linearity versus modularity. The fear of the blank page arises out of linearity: the feeling that the only way to fill the page is to start with the first word and proceed towards the last, word by word. This is not so. Whereas reading is usually linear, writing doesn’t have to be. The process of writing should be modular – first, sculpt your raw materials into rough blocks that form your text, and then start working on the blocks, filling in more and more details, so that entire sentences only begin appearing towards the end of this process.

The approach I try to teach my students is splitting the writing process into a series of hierarchical tasks. This way, getting from a pile of results to a polished research paper is a bit less painful.

This approach begins by identifying the key point of the paper and then moving on to structuring the material that supports this point into a storyline. This storyline is then condensed into the abstract of the paper. My advice is to always write the abstract first, not last! This serves as an acid test: if you cannot do it, you haven’t developed your storyline enough.

After that, there are many steps to be taken before writing any more complete sentences: planning the order of presentation, including figures, and for each section of the paper, mapping the arc of the storyline into paragraphs so that the point addressed by each of the paragraphs is decided in advance. Then, the paragraph contents are expanded into rough sketches, and these sketches are finally transformed into whole sentences. At this point, there is no fear of the blank page, because there are no blank pages: for each section, for each paragraph, there is a map, a route plan, and the only decision that is needed is how to best transform that plan into a series of words. Often, this feels almost effortless.

[Next in the series on how to write a scientific paper: how to write a great abstract]

There is now an ebook based on this series, available from a number of stores (Kindle Store, Apple Books, Kobo, Tolino, etc.).

What the timings of our mobile phone calls tell about us

[This post was originally written in Finnish; the original English-language version can be found here. The rest of the posts in this blog are in English.]

This post is intended as background material for science journalists, in connection with the Academy's science breakfast on 27 April. But of course you don't need to be a journalist to read on!

My research group has studied mobile phone data for more than a decade. In the meantime, the name for what we do has changed from network analysis to data science and computational social science. Whatever it is called, our research looks at human behaviour with computational means, and the datasets can contain up to millions of individuals!

We use automatically collected, anonymized, time-stamped data that originate from the billing systems of telecom operators. In addition, we study data collected from volunteer participants, for example with smartphone applications. With mobile phone records (who called whom and when), we can reconstruct the connections of social networks and also look at the time series of calls. These time series have turned out to be very interesting!

Let's first look at very short time scales, from seconds to minutes. If we look at the calls of a single person and draw a line on a timeline whenever the person is on the phone, we get a picture like this:

Burstiness of phone calls

This time series is bursty: it is random but not uniformly random! It contains bursts of calls that take place within very short intervals (from tens of seconds to a couple of minutes), and longer pauses between these bursts. Human communication and other human activity is often bursty, and no one really knows why. By the way, the time series of firing neurons look quite similar! Maybe we are all just neurons in a social network that spans the whole world… well, let's leave that to science-fiction writers.
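One common way to put a number on burstiness is the coefficient B = (σ − μ) / (σ + μ), where μ and σ are the mean and standard deviation of the times between consecutive events. B is −1 for perfectly regular events, around 0 for a uniformly random (Poisson) process, and approaches 1 for very bursty sequences. The sketch below uses this coefficient on synthetic event times; it is an illustration, not necessarily the measure used in the papers listed at the end of this post, and the mixture of short and long gaps is an invented toy model.

```python
import numpy as np

def burstiness(event_times):
    """Burstiness coefficient B = (sigma - mu) / (sigma + mu) of the gaps."""
    gaps = np.diff(np.sort(np.asarray(event_times, dtype=float)))
    mu, sigma = gaps.mean(), gaps.std()
    return (sigma - mu) / (sigma + mu)

rng = np.random.default_rng(7)
n = 5000

# Perfectly regular events: one call every 10 time units.
regular = np.arange(0, 1000, 10.0)

# Uniformly random (Poisson) events: exponentially distributed gaps.
poisson = np.cumsum(rng.exponential(10.0, n))

# Toy bursty sequence: mostly short gaps, with occasional long pauses.
is_pause = rng.random(n) < 0.1
gaps = np.where(is_pause, rng.exponential(100.0, n), rng.exponential(1.0, n))
bursty = np.cumsum(gaps)

b_regular = burstiness(regular)
b_poisson = burstiness(poisson)
b_bursty = burstiness(bursty)
print(b_regular, b_poisson, b_bursty)
```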

Let's move on to longer time scales, hours and days. There we find circadian rhythms, which are much better understood. Our daily activity follows the alternation of day and night in 24-hour cycles. Let's pick a couple of people from the data and see how many calls they make at each time of day:

Daily rhythms of phone calls
This shows that although people generally sleep at night and are awake during the day, there are still clear differences in daily rhythms, and these are also visible in the numbers of calls. There are early birds who make calls while others are still asleep, and evening people who call late in the evening (presumably to other evening people). We are all different!

There is more to daily rhythms than variation in call numbers: for example, in the evening calls are often targeted at a few (close) friends, whereas during the day they are more random.

Let's then move on to even longer time scales: months and years. Now the exact timings of individual calls no longer matter. So let's count how many calls our participant makes to each of their friends (and relatives), and see how this pattern changes in time! We get a pattern like this:

Egocentric network

This distribution tells what fraction of a person's calls goes to the friend who receives the most calls, what fraction to the second, and so on. In other words, it answers the question of how popular the most popular friend is, and how equally we treat our friends (usually quite unequally: the top three can get more than half of the calls!). This reflects the way we build our social world: we have only a few very close friends, and many friends who do not belong to this limited inner circle. Most of our ties are weak, and the few strong ties are very meaningful.
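Computing such a distribution from a call log takes only a few lines. The sketch below is an illustration with a made-up log of 60 outgoing calls; the function name `social_signature` and the names of the friends are hypothetical.

```python
from collections import Counter

def social_signature(callees):
    """Fraction of calls going to each friend, from most to least called."""
    counts = Counter(callees)
    total = sum(counts.values())
    return [n / total for _, n in counts.most_common()]

# Hypothetical call log of one person: one entry per outgoing call.
calls = ["anna"] * 30 + ["ben"] * 15 + ["carla"] * 10 + ["dan"] * 3 + ["eve"] * 2

signature = social_signature(calls)
top_three_share = sum(signature[:3])
print(signature, top_three_share)
```

In this invented example, the most called friend receives half of all calls, and the top three together receive more than 90 percent.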

Such call distributions are slightly different for everyone, and they have turned out to be very persistent even when there is high turnover in the network. If you tend to focus on 1-2 close friends, you will keep doing so even if these friends are replaced by others, for example because you move to another town. Similarly, if you divide your time evenly among your friends, you will probably continue to do so.

Call distributions and network turnover are linked to personality traits; if this interests you, my colleague Simone Centellegher has written a blog post on the topic, based on our recently published article.

Is all this knowledge good for anything besides being interesting? Probably. Well-being applications based on data collected from the user are one possibility, as long as they are scientifically validated. My research group has an ongoing pilot project with the Department of Psychiatry at the University of Helsinki, where the aim is to find factors that predict the well-being of mood-disorder patients from the data stream collected by applications.

Finally, some links to the original scientific publications:

  • Small But Slow World [Phys. Rev. E | arXiv] (2011)
  • Daily Rhythms in Mobile Telephone Communication [PLoS One] (2015)
  • Persistence of Social Signatures in Human Communication [PNAS | arXiv] (2014)
  • Personality Traits and Ego-Network Dynamics [PLoS One] (2017)
  • Effects of time window size and placement on the structure of an aggregated communication network [EPJ Data Science] (2012)
  • From Seconds to Months: the Multi-scale Dynamics of Mobile Telephone Calls [EPJB | arXiv] (2015)

Ant supercolonies: networks of nests

An ant (F. aquilonia)

Ant colonies are complex systems par excellence. It’s almost as if the colony is the organism, not the ant. Ants follow simple behavioural patterns, depositing pheromones as they go and following trails of scent laid down by others. Because of their collective actions, the colony seems to have a life of its own, sprouting its foraging trails towards food sources much like a slime mold grows its branches along the shortest path to food. The colony appears to have its own reproductive cycle too: queens and males mate during the nuptial flight, and the impregnated queens then land to give birth to new colonies, like fertilized eggs. Ordinary workers play no role in reproduction; they are outside the germline.

But some species of ants behave in ways that are even more complex: they form supercolonies, networks of interconnected nests with hundreds of reproductive queens. In these supercolonies, queens and workers move freely between nests without eliciting aggression; they cooperate across nest boundaries. Ant supercolonies are the largest cooperative units known in nature: for some ants, they can extend for hundreds of kilometres.  They are also among the strangest: their existence is difficult to explain from the point of view of gene-centric evolutionary theory. This has to do with altruism: relatedness among nestmates can be low, and workers will end up helping unrelated individuals that carry a different set of genes. It may even be that ant supercolonies represent an evolutionary dead end.

Recently, I had a chance to have some fun with the genetics of ant supercolonies. My colleagues Eva Schultner and Heikki Helanterä, who work on ants, had collected a number of samples from tens of nests of F. aquilonia in southern Finland. As Eva and Heikki wanted to understand the genetic structure of F. aquilonia supercolonies, the sampled ants were genotyped for estimating genetic similarities between the nests (for technical details, scroll down). From a network-science point of view, the nests and their similarities span a weighted spatial network: nests are nodes and pairwise genetic similarities are mapped to link weights. The resulting similarity network looks like this:


There are two supercolonies, one to the NE and one to the SW – the link weights inside the colonies are higher than between them, much like you would have for two communities in a social network. A closer look inside these two supercolonies (with methods more advanced than bare-bones network thresholding) revealed that there is a faint hint of substructure, of subclusters inside supercolonies. And because queens, workers, and pupae were genotyped separately and sampled at two time points, we could see that the genetic relationships between nests are not the same in terms of queens as they are in terms of workers, and not the same in spring as they are in summer when workers have started migrating.
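The bare-bones thresholding mentioned above can be sketched as follows: remove all links weaker than a threshold and see which groups of nests remain connected. The similarity matrix below is a synthetic toy with two blocks of three nests, not the ant data, and the function name `threshold_components` and the threshold value are assumptions made for illustration.

```python
import numpy as np

def threshold_components(similarity, threshold):
    """Label connected components after removing links below the threshold."""
    n = similarity.shape[0]
    adjacency = similarity >= threshold
    np.fill_diagonal(adjacency, False)  # ignore self-similarity
    labels = -np.ones(n, dtype=int)     # -1 means "not visited yet"
    current = 0
    for start in range(n):
        if labels[start] >= 0:
            continue
        # Depth-first search over the links that survived thresholding.
        labels[start] = current
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour in np.flatnonzero(adjacency[node]):
                if labels[neighbour] < 0:
                    labels[neighbour] = current
                    stack.append(neighbour)
        current += 1
    return labels

# Toy similarity matrix: strong links inside two groups, weak links between.
similarity = np.full((6, 6), 0.05)
similarity[:3, :3] = 0.3
similarity[3:, 3:] = 0.3

labels = threshold_components(similarity, threshold=0.2)
print(labels)
```

In this toy case, the two blocks fall apart into two separate components. Thresholding alone cannot reveal any finer substructure inside the components, which is why more advanced methods were needed for that.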

This means that there may be an extra layer of complexity in the genetics of ant supercolonies: fine structure in time and space, and in terms of class.

This work was published in Molecular Ecology last year. If you are interested in toying around with ant genetics, the data are available on Datadryad and my Python scripts can be found here:

[Technical details: the ants were sequenced at 8 polymorphic microsatellite loci; microsatellites are nonsensical bits of DNA where a random sequence is repeated 5-50 times. They do not do anything and there is no selection pressure, and therefore microsatellite alleles are great for just seeing how close or far two populations are genetically. There are various measures for quantifying this: the simplest would be to see how often the same alleles appear in populations. In social-insect studies, the typical measure is the so-called relatedness (Queller & Goodnight 1989) and we used it in this work.]

A Neuroscience Conference On Twitter

Brain Twitter Conference ad

My colleagues at the Department of Neuroscience and Biomedical Engineering at Aalto University are organizing the Brain Twitter Conference. It takes place on Twitter on the 20th of April, with an impressive list of speakers. Talks and keynotes will be delivered under the #brainTC hashtag.

While the idea of a Twitter conference may sound like a gimmick, it should be taken seriously – not as a substitute but as something new. There are no coffee breaks or conference dinners for socializing, but anyone can attend for free. And, even better, because the tweets will remain available, a kind of time travel becomes possible – one can revisit any talk, any discussion, and any debate at will. The conference becomes frozen in time.

Networks are everywhere, and they are beautiful

This is another popular-science post, for anyone out there who wants to see the light of network science! It could be considered a network sermon of sorts. This is also how I begin my course on complex networks and most of my pop science talks.

networks from genetic regulation to the Internet

So, why are networks so fundamentally important? Because they exist on all levels of the living universe.

Let’s begin by having a look at what’s inside our cells. First, there is some (messy) software written into double-stranded DNA, where the functional subunits are called genes. They like to talk to other genes, upregulating or downregulating their activity. This network of genetic regulation determines what happens inside our cells. In particular, it determines which proteins are to be built. Mirroring the genes that code for them, the proteins also like to interact, again forming a network (which is coupled to the network of genes…). To make all this happen, some fuel and some building blocks are needed, and this is taken care of by a network of chemical reactions: the metabolic network that is responsible for the logistics of energy and matter.

Our cells are full of networks. We are full of networks.

As often happens in nature, similar kinds of structures emerge on multiple levels. If we now zoom out from inside the cells and look around, we again see networks. This time, the networks are those of cells talking to one another and influencing each other’s actions, the immune system being a most beautiful example (I’ll return to it in a later post). But there are no cells that are more fond of networking than neurons, the nerve cells. I am typing this paragraph because of spike trains transmitted by each of my tens of billions of neurons to about ten thousand other recipient neurons (this much can be said, but no-one knows how these spike trains actually encode the words that I am typing). These neurons are the fundamental building blocks of my brain, a network of enormous complexity (and by that I don’t mean my brain but any human brain).

Let us continue zooming out to get a broader view. Just like the neurons inside them, brains also like to network! Practically speaking, we do not even exist in isolation. Our brains have evolved into social supercomputers: most of the concepts inside our heads exist because larger networks of brains have agreed that they are meaningful, and developed a language to describe them. And boy have we come a long way from those times when these larger networks were of the size of a tribe (of about 150 members, it has been claimed). Now our connections transcend space and time through social media and the Internet, and we are all part of a gargantuan social network that spans the entire planet. What is happening right here, right now, is a long-distance connection between two brains. One brain talks to another, mine to yours, across time and distance. Hi, brain!

But these networks of brains do not only exist for talking. We humans like to build things: systems of trade, structures of power, grids that transmit energy, webs that transmit information, organizations that exist for making things that didn’t exist before. We dig up raw materials and transform them into parts that are brought elsewhere and merged with other parts to make larger parts, over and over again, until this complex weave of logistics and manufacturing spits out cars and cloud servers. We connect cities with ships and trains and airlines; we connect minds with phones and computers.

We really, really like to build networks. But we cannot do it alone, so we connect with others. We form networks that build networks. The same thing happens on all levels. From genes to cells, from brains to people. Networks building networks.

This is why networks are cool, essential, and beautiful.

(I’ll stop here to end on a high note. I’ll talk about applications and other earthly things in later posts).