Publication annotation (tagging) is time-consuming

In Citation tracking, Tools by Xu Cui

Many of our customers annotate their citations upon receiving their citation lists from us. The most popular fields they add include: product used (SKU), product category, application, research field, species, copyright etc. Citation annotation is valuable because it makes the data easier to analyze and digest, thus allowing the company to gain more insights into their products.

However, citation annotation is very time-consuming. It can easily take days or weeks to annotate a medium-sized citation list. This has caused some businesses to give up on this laborious task entirely.

“I used to tag our citations by products, applications, research fields, and technologies. But I stopped because it took too much time!”

To address this pain, we can do the annotation work for you. Below are the methods we use:

1. Phrase matching

Our first method is to match the tagging terms against each citation. For example, let’s say we want to tag the citations by research field, and “Neuroscience” is one of the options. If we find the term “Neuroscience” in a citation’s title, abstract, or snippet, then we tag this citation as “Neuroscience”.

For example, the paper titled “Application of CRISPR-Cas systems in neuroscience” will be tagged as “Neuroscience”.
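To make the idea concrete, here is a minimal sketch of phrase matching in Python. It is an illustration only, not our production code: the function name `phrase_match_tags` and the field names `title`, `abstract`, and `snippet` are assumptions chosen for this example.

```python
import re

def phrase_match_tags(citation, tag_terms):
    """Return the tag terms that appear verbatim in the citation text.

    `citation` is assumed to be a dict with optional "title", "abstract",
    and "snippet" keys (hypothetical field names for this sketch).
    """
    text = " ".join(citation.get(field, "") for field in ("title", "abstract", "snippet"))
    matched = []
    for term in tag_terms:
        # Match the whole term only, ignoring case, so that "Neuroscience"
        # does not fire on a longer word that merely contains it.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            matched.append(term)
    return matched

citation = {"title": "Application of CRISPR-Cas systems in neuroscience"}
print(phrase_match_tags(citation, ["Neuroscience", "Oncology"]))
# prints ['Neuroscience']
```

The match is case-insensitive because titles rarely capitalize a field name the way a tag list does.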

2. Semantic matching

It is often the case that our tag terms do not appear in a citation even though they are semantically related to it. Let’s take a look at the following citation titled:

“Amygdala-Insula Circuit Computations in Posttraumatic Stress Disorder.”

This citation should be tagged as “Neuroscience”, but the term “Neuroscience” appears in neither the title nor the abstract. To solve this issue, we developed a semantic-aware AI. It is based on one of the world’s most powerful machine learning models and has been trained on articles containing hundreds of billions of words, so it captures the semantic relationship between texts very well. We use this AI to calculate a semantic similarity score between a citation and each tag term. Terms with high scores are used to tag the citation.
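The scoring step can be sketched as follows. This is a simplified illustration under stated assumptions, not a description of our actual model: it assumes the citation and each tag term have already been converted to numeric vectors (embeddings) by some language model, and it uses cosine similarity as the score. The three-dimensional vectors and the 0.5 threshold below are invented purely to show the mechanics.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def semantic_tags(citation_vector, tag_vectors, threshold=0.5):
    """Return (tag, score) pairs scoring at or above the threshold, best first."""
    scored = [(tag, cosine_similarity(citation_vector, vec)) for tag, vec in tag_vectors.items()]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Toy vectors, made up for illustration. Real embeddings come from a
# language model and typically have hundreds of dimensions.
tag_vectors = {
    "Neuroscience": [0.9, 0.1, 0.0],
    "Oncology": [0.1, 0.9, 0.1],
}
citation_vector = [0.8, 0.2, 0.1]  # stands in for the amygdala-insula paper

for tag, score in semantic_tags(citation_vector, tag_vectors):
    print(tag, round(score, 2))
```

In practice the threshold is a tuning choice: a higher value yields fewer but more reliable tags, while a lower value tags more citations at the cost of more false positives.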

3. Dictionary mapping

We frequently notice that researchers cite a product name in various formats. For example, a product called “ExoQuick-TC” can be cited as “Exo-Quick-TC”, “Exo Quick TC”, “ExoQuick TC”, etc. To tag these citations correctly, we can create a dictionary that maps the various terms to a single, standard term.

This method can also be used to tag research fields, applications, etc. For example, if one of the application tags is “Animal”, then we can create a dictionary mapping mouse, mice, rat, rats, pig, pigs, etc. to “Animal”.
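A dictionary of this kind can be sketched in a few lines of Python. Again, this is only an illustration: the names `VARIANT_TO_TAG` and `dictionary_tags` are hypothetical, and the entries are limited to the examples mentioned above.

```python
import re

# Each variant spelling (lowercase) maps to the single standard tag.
VARIANT_TO_TAG = {
    "exoquick-tc": "ExoQuick-TC",
    "exo-quick-tc": "ExoQuick-TC",
    "exo quick tc": "ExoQuick-TC",
    "exoquick tc": "ExoQuick-TC",
    "mouse": "Animal",
    "mice": "Animal",
    "rat": "Animal",
    "rats": "Animal",
    "pig": "Animal",
    "pigs": "Animal",
}

def dictionary_tags(text, variant_to_tag):
    """Return the sorted standard tags whose variants occur in the text."""
    tags = set()
    for variant, tag in variant_to_tag.items():
        # Whole-term match, so "rat" does not fire inside "separate".
        pattern = r"(?<!\w)" + re.escape(variant) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            tags.add(tag)
    return sorted(tags)

text = "Exosomes were isolated from mice serum using Exo Quick TC."
print(dictionary_tags(text, VARIANT_TO_TAG))
# prints ['Animal', 'ExoQuick-TC']
```

Keeping the variants in a plain dictionary means new spellings can be added as they are discovered, without changing the matching code.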

4. Manual tagging by human experts

Should your citations require special expertise for annotation, we have Ph.D. level human experts who can work on your data.

In summary, the 4 methods listed above can be used to tag citations with high accuracy. If you are spending too much time tagging your citations manually, please let us know.
