Using Citation Tracking and Networks in Research: From Established Databases to the Latest in AI


Here at Gottesman Libraries, I frequently assist people seeking peer-reviewed articles for
literature reviews in education, psychology, or allied health fields. A literature review gives an
overview of the relevant academic literature surrounding a topic or study, situating one’s research
within a scholarly context. The process of finding key literature on an academic topic (identifying
the important conversations taking place between scholars, locating the seminal, impactful
papers about a theory, uncovering the foundational research that new studies build upon) can
seem daunting. I’ve worked with many students preparing for a literature review, and they ask
me: Where is this information? How do I find the right articles? How do I know I’m not missing
anything?


There are various ways to do research, and different methods serve different purposes. During
my research consultations, I mostly help people develop queries to systematically search
databases and library catalogs in order to surface as much relevant literature as possible.
Another method is citation tracking (also called citation analysis, citation tracing, citation mining,
or cited reference searching). Citation tracking looks at the number of times that a particular
work, author or journal has been cited in the bibliographies of other works. This gives an
indication of how impactful something is in the academic community. With advances in artificial
intelligence and the robustness of bibliographic data for academic literature, in the last few
years we’ve seen the development of digital tools designed to help researchers perform citation
tracking more efficiently and intuitively. In this blog post, I’ll discuss a few of these tools, weigh
their pros and cons, and illustrate their potential for research.


Database Logic - A Review


Before diving into citation tracking, it is helpful to review a complementary method: database
logic. With this research strategy, Boolean operators, truncation, keywords, search fields, and
other components of database logic are combined into a query to systematically search
collections of academic literature. This method is tried and true. A comprehensive search query
is crucial for surfacing the relevant literature while
reducing the irrelevant results you have to sift through. When writing literature reviews of all types, searching databases this way will save you time and give you greater assurance that you are not missing relevant literature on a topic.
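To make this concrete, here is a minimal sketch of how concept groups can be combined into a Boolean query. The function name `build_query` and the sample terms are my own illustrations, and exact syntax (truncation symbols, phrase quoting, field codes) varies between databases, so check the help pages of the database you are using.

```python
def build_query(concepts):
    """Combine concept groups into one Boolean search string.

    Synonyms inside a group are joined with OR; groups are joined with AND.
    """
    groups = []
    for terms in concepts:
        # Quote multi-word phrases so they are searched as a unit.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


# Hypothetical topic: burnout among elementary school teachers.
concepts = [
    ["teacher*", "educator*"],            # truncation: teacher, teachers, ...
    ["burnout", "occupational stress"],   # synonyms for the same concept
    ["elementary school", "primary school"],
]
print(build_query(concepts))
```

Each inner list holds synonyms for one concept, so adding a synonym broadens the search, while adding a new concept group narrows it.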

For more information on database logic and systematic search queries, see the Gottesman
Libraries guide on conducting library research.

Citation Tracking


While database logic is a fundamental research method, it doesn’t illustrate research impact
over time or how scholars engage with each other in their work.
Citation tracking is a research method that involves starting with a “seed article,” that is, an article
that exemplifies the kind of literature you are seeking, and going through its reference
section to see what prior literature the seed article is in conversation with. In other words, you
are going backwards in time in the literature. Citation tracking also works forwards in
time, by identifying the articles that cite the seed article. This method gives you a more
holistic picture of the seed article’s context: not only the related scholarship that came before
it, but also the scholarship that came after.
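The two directions can be sketched with a small, made-up set of papers. The paper labels and the `references` dictionary are hypothetical; real data would come from a database export or a reference manager.

```python
# Toy data: each paper maps to the papers listed in its reference section.
references = {
    "seed": ["A", "B"],
    "A": ["B"],
    "B": [],
    "C": ["seed", "A"],
    "D": ["seed"],
}


def backward(paper, refs):
    """Papers that the given paper cites (going backwards in time)."""
    return sorted(refs.get(paper, []))


def forward(paper, refs):
    """Papers that cite the given paper (going forwards in time)."""
    return sorted(p for p, cited in refs.items() if paper in cited)


print(backward("seed", references))  # ['A', 'B']
print(forward("seed", references))   # ['C', 'D']
```

Backward tracking only needs the seed article itself, while forward tracking needs an index of other papers' reference lists, which is why the databases described below are so useful for it.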


Bibliometrics


Citation analysis is an element of bibliometrics. Bibliometrics is the statistical analysis of
books, articles, or other publications. These analyses are used to track author or researcher
output, impact, and influence. The quantitative impact of a given publication, author, or article is
appraised by measuring the number of times a work is cited by other sources.
Bibliometrics, while a useful method for understanding scholarship, is not an exhaustive way to
measure impact and has limitations in its accuracy (Põder, 2022). It can also lead to the same
handful of articles being cited again and again while new literature on a subject is overlooked.


Citation Tracking - Databases


Many databases provide features that make citation tracking easy, because
databases are getting more advanced in the ways they store bibliographic information.
Articles and the information that comes with them are connected through hyperlinks.


1. Google Scholar:

Google Scholar is a large search engine that searches the web for scholarly literature (not
necessarily peer-reviewed). While Google Scholar should not be used as a replacement for a
library catalog, it has some effective built-in features, especially for citation tracking.
Search results in Google Scholar have the link “Cited by…” beneath an article’s information.

 

Image: Screenshot of results in Google Scholar


Clicking on “Cited by…” takes you to a list of articles that have cited it. Keep in mind
that this number is not comprehensive, as it lists only the articles indexed by Google
Scholar. While Google Scholar’s ranking algorithm has not been made public, we do know that
it tends to place more highly cited articles first. Be mindful of the results in Google Scholar, as this
can create an echo chamber where researchers keep citing the same, already well-cited articles.

 

Image: Screenshot of citations for seed article in Google Scholar

 

***Remember: Google Scholar does not have a way to filter for peer-reviewed literature.
So when using sources found in Google Scholar, make sure to check the
publisher/journal for credibility and authority in the field.

 

2. Scopus:

 

Scopus is a multidisciplinary database, launched by the academic publisher Elsevier, with
comprehensive coverage of scientific, technical, medical, and social sciences literature.

 

Image: Screenshot of seed article in Scopus

 

You’ll notice that the number of citations for the seed article is different in Scopus than in
Google Scholar. This is because Scopus indexes a smaller amount of content; however, its
collection is curated for high-quality, peer-reviewed literature. Google Scholar indexes a broader
scope of content.

 

3. Web of Science:


Web of Science is an online index that covers journal articles published in the physical and life
sciences, health sciences, social sciences, and arts and humanities. Like Scopus, it
allows you to easily see and access the articles citing the seed article, as well as its references.

 

Image: Screenshot of seed article in Web of Science


Web of Science provides useful ways to sort your results. In the image below, we are seeing a
list of the citations (not the references) of the seed article. Notice that you can organize the
articles citing the seed article in various ways. One interesting option is the citation class filter, which separates articles that
cited the seed article in the background section, articles that use the seed article to support their
findings, articles that cite the seed article as having differing findings, and so on. I encourage you
to play around with the Web of Science interface to see if its features for nuanced citation
tracking are helpful for your purposes.

 

Image: Screenshot of citations for seed article in Web of Science showing the Citation class filter

Citation Networks and Graphs – Tools and Software

 

Image: Citation graph from Ioannidis, J. P., Cristea, I. A., & Boyack, K. W. (2020). Work
honored by Nobel prizes clusters heavily in a few scientific fields. PLOS ONE, 15(7), e0234612.
Read more about the scholarly literature and citation metrics represented in this
network here: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0234612

 

Digital tools using artificial intelligence and bibliometrics create visual representations of citation
tracking. These network visualizations use algorithms to surface related literature based on a
seed article. Instead of showing lists of citations and references, these tools display how
scholarship and authors are connected through shared citations and references. Citation
networks are another way to gain insight into scholarly impact and influence within an
academic field or topic.
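As a rough illustration of what "connected through shared references" can mean, the sketch below links any two papers whose reference lists overlap and counts the overlap. This is a simplified, assumed approach using made-up data and a hypothetical function name (`coupling_edges`); it is not the algorithm of any particular tool discussed in this post.

```python
from itertools import combinations


def coupling_edges(refs):
    """Return (paper_a, paper_b, shared_count) for papers with shared references."""
    edges = []
    for a, b in combinations(sorted(refs), 2):
        shared = set(refs[a]) & set(refs[b])
        if shared:
            edges.append((a, b, len(shared)))
    return edges


# Toy data: each paper maps to the works in its reference list.
refs = {
    "seed": ["R1", "R2", "R3"],
    "P1": ["R1", "R2", "R4"],
    "P2": ["R3"],
    "P3": ["R5"],
}
print(coupling_edges(refs))  # [('P1', 'seed', 2), ('P2', 'seed', 1)]
```

In a network graph, each tuple would become a line between two dots, and the shared count could control how strongly the two papers are pulled together.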


You’ll notice that the databases discussed in this post present a lot of bibliographic information
as links: you can click on an author to see their other works, and the reference section is
displayed as article links that you can click to access the full text. The ability to link robust bibliographic
data together is what allows people to develop tools employing artificial intelligence
and computational methods to generate citation networks from a literature corpus.

 

1. ResearchRabbit.ai

 

ResearchRabbit is an innovative “citation-based literature mapping tool” available free online;
you just need to make an account. The software scans publicly available sources online
and selects papers based on their similarities. However, it seems to work only for scholarly
papers, and is therefore unlikely to find sources of information that are not journal articles,
such as books.


To use ResearchRabbit, you build your own collection of seed articles. The tool then creates a
network of similar literature based on shared citations and references. Start by adding one or
more seed articles.

 

Image: Screenshot of adding a seed article to a collection in ResearchRabbit.


Once a seed article is added to your collection, ResearchRabbit provides ways to “explore
papers.” You can visualize the article’s references and citations, as well as “similar work.”
ResearchRabbit is proprietary, which means that the platform has not made publicly available
the algorithms the AI uses to generate this network. Keep in mind that ResearchRabbit does not
include all references or citations. It uses an algorithm to find the work that is most likely useful
given your corpus. Learn more about ResearchRabbit’s algorithms and data sources here.

 

Image: Screenshot of “Similar Work” network in ResearchRabbit

 

In order to get the most out of a network graph, it is important to know how to read it.
ResearchRabbit’s FAQ provides the following information regarding what to look for in a
network:


Understanding Paper Graphs
What does a Dot mean? Each dot represents a paper!
Can I learn more about a Dot? Absolutely! Simply click on the dot to see all the information
about that paper!
What does a Line represent? Each line represents a citation relationship between two papers.
In other words, one of the dots cites the other dot. You’ll see arrows that indicate the direction of
the citation.
Colors of Dots? Green dots are papers that come from your Collection (papers that you’ve
added)! Blue dots are papers that come from the column immediately to the left of this graph!
Intensity of dot colors is based on recency! More recent papers are darker in color. Older
papers are lighter in color.
When I click a dot, what do the arrows mean? The dot receiving the arrow is being cited by
the dot sending the arrow!
Size of Dots? Great question! The Green Dots will always be the biggest size. The Blue Dots
are sized based on the number of connections they have with the Green Dots (in other words,
the “connectivity” a given set of papers has with your initial starting papers).

 

Image: Screenshot of network of references in ResearchRabbit

 

In the image above, we are seeing the connectedness between the references of the seed
article.


By adding relevant articles to your collection, you are providing the AI more information to then
create networks of scholarly connections and similarity.


Another useful feature of ResearchRabbit is the ability to import citations you’ve already saved
in a reference manager like Zotero or EndNote. This way, you can discover more articles that
are connected to the articles you are already referencing in your paper.

 

2. Inciteful.xyz


Similar to ResearchRabbit, Inciteful is a free online tool that helps you map academic papers. It
uses network algorithms and bibliometrics to provide a list of relevant articles.


You can input one seed article to see an overview of the current state of that topic. From there,
you can add more articles to your seed corpus to discover more relevant literature. You can also
input two papers to see what literature connects them. This can be useful for interdisciplinary
work.
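To give a sense of what "connecting" two papers can look like, here is a minimal sketch that finds the shortest chain of citation links between two papers using a breadth-first search. The data and the function name `connecting_path` are hypothetical, and this is only one assumed way to approach the idea, not a description of how Inciteful works internally.

```python
from collections import deque


def connecting_path(start, goal, refs):
    """Shortest chain of citation links between two papers, or None."""
    # Treat each citation as a two-way link between the citing and cited paper.
    neighbors = {}
    for paper, cited in refs.items():
        for other in cited:
            neighbors.setdefault(paper, set()).add(other)
            neighbors.setdefault(other, set()).add(paper)

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(neighbors.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no chain of citations links the two papers


# Toy data: each paper maps to the papers it cites.
refs = {
    "education_paper": ["bridge_1"],
    "bridge_2": ["bridge_1", "psychology_paper"],
    "unrelated": ["other"],
}
print(connecting_path("education_paper", "psychology_paper", refs))
```

The papers in the middle of the returned chain are the ones that bridge the two starting papers, which is often where interdisciplinary connections show up.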

 

Image: Screenshot of seed article bibliometrics in Inciteful.xyz

 

Inciteful provides a few ways to discover more literature. Unlike other network tools, Inciteful
provides the ability to use Boolean logic to search within the paper graph. The “Similar Papers”
table lists papers that cite the same papers as your seed article. The “Important Papers” table
lists the important or fundamental papers in a topic.


PageRank is used as the ranking algorithm for these lists. PageRank is an algorithm first
developed by Google’s founders to rank websites in search engine results. Basically,
PageRank measures the importance of webpages based on the quality and quantity of links
pointing to them. Learn more about PageRank here. Given the nature of academic works, the
PageRank algorithm tends to be biased towards older, highly cited papers. Nonetheless, it does give
weight to papers that may not have many citations themselves but are cited by papers that do.
This is in contrast with major citation databases such as Web of Science or Scopus, whose
result lists, if sorted by “times cited,” are based only on the total number of citations each item
has.
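The difference between the two rankings can be seen in a small sketch. The code below is a simplified, textbook-style PageRank over a made-up citation graph; the paper labels are hypothetical, the damping value of 0.85 is a commonly used default that I am assuming here, and none of this reflects Inciteful's actual implementation.

```python
from collections import Counter


def pagerank(cites, damping=0.85, iterations=50):
    """Simplified PageRank over a dict of {citing paper: [cited papers]}."""
    papers = set(cites)
    for cited in cites.values():
        papers.update(cited)
    n = len(papers)
    rank = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        # Papers with no outgoing citations spread their score over all papers.
        dangling = sum(rank[p] for p in papers if not cites.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in papers}
        for paper, cited in cites.items():
            for target in cited:
                # Each paper splits its score evenly among the works it cites.
                new_rank[target] += damping * rank[paper] / len(cited)
        rank = new_rank
    return rank


# Toy graph: "classic" is cited three times and itself cites "hidden";
# "popular" is cited twice, but only by otherwise uncited papers.
cites = {
    "a": ["classic"], "b": ["classic"], "c": ["classic"],
    "classic": ["hidden"],
    "d": ["popular"], "e": ["popular"],
}
ranks = pagerank(cites)
times_cited = Counter(t for cited in cites.values() for t in cited)

for paper in ("classic", "popular", "hidden"):
    print(paper, times_cited[paper], round(ranks[paper], 3))
```

In this toy graph, "hidden" has only one citation, fewer than "popular," yet it receives a higher PageRank score because that one citation comes from the well-cited "classic" paper. A "times cited" sort would place "popular" above it.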


Request a consultation with a librarian to learn more about how to utilize these tools for your
research!

 

The inspiration for this blog post came from Computational Research Instruction Librarian,
Daniel Woulfin’s workshop, “Level Up Your Lit Review” at Columbia University Libraries.

 

References

ResearchRabbit tutorial from James Cook University’s Learning Centre


Põder, E. (2022). What Is Wrong With the Current Evaluative Bibliometrics? Frontiers in
Research Metrics and Analytics, 6. https://doi.org/10.3389/frma.2021.824518


Inciteful: Explore Literature Using Academic Papers Graph. (n.d.). Retrieved April 1, 2024, from
https://library.hkust.edu.hk/sc/inciteful/


Ioannidis, J. P., Cristea, I. A., & Boyack, K. W. (2020). Work honored by Nobel prizes clusters
heavily in a few scientific fields. PLOS ONE, 15(7), e0234612.

 

Further Reading and Resources on Citation Analysis and Network Visualization


Langville, A. N., & Meyer, C. D. (2012). Google’s PageRank and Beyond: The Science of
Search Engine Rankings. Princeton University Press. https://clio.columbia.edu/catalog/14088989


Lawson, P. (n.d.). Guides: Data Visualization: Network Visualization. Retrieved April 2, 2024,
from https://guides.library.jhu.edu/datavisualization/network


Page Rank Algorithm and Implementation. (2017, August 30). GeeksforGeeks.
https://www.geeksforgeeks.org/page-rank-algorithm-implementation/


Robins, G. (2015). Doing social network research: Network-based research design for social
scientists. Sage Publications Ltd. https://clio.columbia.edu/catalog/17383865 


Tchangalova, N. (n.d.). Research Guides: Bibliometrics and Altmetrics: Measuring the
Impact of Knowledge: Bibliometrics. Retrieved April 1, 2024, from
https://lib.guides.umd.edu/bibliometrics/bibliometrics


Wetzel, D. (n.d.). Library Guides: Generative AI: ChatGPT and Beyond: ResearchRabbit.
Retrieved March 29, 2024, from
https://guides.libraries.psu.edu/c.php?g=1338692&p=9867286


Tags:
  • Learning at the Library