The internet is currently made up of around 50 billion pages, linked to
form a vast, virtual landscape. Our interactions with it generate data
which, when broken down and analysed, can help us understand a wide range
of human activities, from the cultural to the economic.
Funded by the EU’s FP7 under the
Future and Emerging Technologies scheme, the
New tools and Algorithms for DIrected NEtwork analysis
(NADINE) project is contributing to the development of new types of
search engines, putting Europe in the lead in this important area.
‘We are trying to map the net to show how pages are linked together
and how people use these links in their voyage around the net,’ says
NADINE project coordinator Dima Shepelyansky, research director at the
Laboratoire de Physique Théorique, CNRS Toulouse.
The project uses, among other tools, some provided by Google to show
how pages are linked together. Doing so can, for example, show the
probability of people visiting certain sites, making choices, buying
objects or voting in certain ways.
Refining ways of tracking online interaction
To develop and test their methodologies, researchers looked at
Wikipedia biographical entries to see if they could rank the people
referred to in order of influence. They analysed 24 major language
editions, considering the number of articles linking to each individual
and using Google’s PageRank system, which says a page is important if
important pages link to it.
But this threw up an interesting problem for the project to iron out
– the scientist Linnaeus appeared to be the most important individual.
Since he was responsible for classifying organisms, there are links to
his page on every Wikipedia page referring to plants and animals, which
skewed the results.
So researchers decided to introduce CheiRank, which describes the
importance of a page in proportion to the number of its outgoing links.
By combining both, researchers were able to establish a robust way of
measuring importance. The methods developed can also detect
self-organising, hyperlinked web communities.
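As a rough illustration of the two rankings described above, the following Python sketch computes PageRank by simple power iteration and CheiRank by running the same procedure on the graph with every link reversed. It is a minimal sketch under stated assumptions, not the NADINE project's own code: the toy graph, the node names, the damping factor of 0.85 and the function names are all hypothetical choices made for the example.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank.

    links maps each page to the list of pages it links to.
    The damping value and iteration count are illustrative assumptions.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Every page receives a small share from random jumps.
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = links.get(v, [])
            if targets:
                # A page passes its score equally along its outgoing links.
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[v] / n
                for t in nodes:
                    new_rank[t] += share
        rank = new_rank
    return rank


def cheirank(links, **kwargs):
    """CheiRank: PageRank computed on the graph with all links reversed."""
    reversed_links = {}
    for source, targets in links.items():
        reversed_links.setdefault(source, [])
        for t in targets:
            reversed_links.setdefault(t, []).append(source)
    return pagerank(reversed_links, **kwargs)


# Hypothetical toy graph: many pages point to "linnaeus",
# while "hub" points to many pages but nothing points to it.
toy_links = {
    "hub": ["plant", "animal", "fungus"],
    "plant": ["linnaeus"],
    "animal": ["linnaeus"],
    "fungus": ["linnaeus"],
    "linnaeus": [],
}

pr = pagerank(toy_links)
cr = cheirank(toy_links)
print("Top by PageRank:", max(pr, key=pr.get))
print("Top by CheiRank:", max(cr, key=cr.get))
```

On this toy graph the heavily linked-to page comes first under PageRank and the page with many outgoing links comes first under CheiRank, which is why looking at both gives a fuller picture than either alone.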
Online information flows similar to commercial exchanges
Because links to and from a page can show how information is
exchanged, the project then applied its findings to the
analysis of commercial flows. NADINE has been using the
United Nations’ world trade database,
which gathers data from the last 50 years. ‘We have been developing a
new way of analysing the commercial exchange of 61 products across the
UN countries, determining the sensitivity of trade balance to price
variations,’ Shepelyansky explains.
NADINE brings together a
partnership
of theoretical physicists, mathematicians and computer scientists from
France, the Netherlands, Hungary and Italy. ‘Transnational, EU funding
was indispensable when it comes to getting a team of scientists from
such a variety of disciplines together,’ he adds.
The project has been running for three years and ends this April
(2015). It is supported by nearly EUR 1.223 million in EU funding. Now
that the methodology is clearly established, researchers from the NADINE
consortium intend to continue the work with various partners including
the
World Trade Organisation.
Other links
http://www.quantware.ups-tlse.fr/QWART/cheirank/cheirank.html
http://www.quantware.ups-tlse.fr/QWLIB/topwikipeople/
http://www.quantware.ups-tlse.fr/QWLIB/wtnmatrix/
http://en.wikipedia.org/wiki/CheiRank