AI Literature Review Assistant

Classification System: Binary Decision Tree for Paper Classification

The AI Literature Review Assistant helps researchers conduct literature reviews, a crucial step in academic research that consolidates existing knowledge in a chosen field into a single analysis. Literature reviews are time-intensive, especially the initial pass over a large corpus of papers, so the tool proposes a semi-automatic method to streamline that pass. It combines keyword selection, database searching, and machine learning to classify academic papers as "interesting" or "not interesting" according to the researcher's focus. At its core is a binary decision tree model that is trained, optimized, and deployed through a Jupyter Notebook. The flowchart below gives an overview of the whole process.

Features

Automated Paper Classification:
- Employs a binary decision tree model for categorizing academic papers based on predefined criteria.
- Speeds up the literature review by filtering out irrelevant papers early on.
- Pre-processes paper titles by removing noise, tokenizing them, and extracting features. This makes the raw titles usable by the decision tree classifier.
- Uses lemmatization to reduce words to their base form, so that different forms of the same word are treated as a single feature during classification.
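
The pre-processing steps listed above can be sketched in plain Python. This is only an illustration, not the repository's code: the names `preprocess_title`, `bag_of_words`, `STOP_WORDS`, and `LEMMAS` are hypothetical, and the tiny lookup table stands in for a real lemmatizer, which the tool presumably uses instead.

```python
import re
from collections import Counter

# Hypothetical stop-word list and lemma table, kept tiny for illustration.
# A real pipeline would likely rely on a full lemmatizer instead of a lookup.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}
LEMMAS = {"networks": "network", "trees": "tree", "reviews": "review", "models": "model"}


def preprocess_title(title):
    """Lowercase a title, strip noise, tokenize, drop stop words, lemmatize."""
    text = title.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # digits and punctuation become spaces
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return [LEMMAS.get(tok, tok) for tok in tokens]


def bag_of_words(tokens, vocabulary):
    """Count how often each vocabulary term occurs in the token list."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


titles = [
    "Decision Trees for Automated Literature Reviews",
    "A Survey of Neural Networks (2021)",
]
token_lists = [preprocess_title(title) for title in titles]
vocabulary = sorted({tok for tokens in token_lists for tok in tokens})
features = [bag_of_words(tokens, vocabulary) for tokens in token_lists]

print(vocabulary)
print(features)
```

Each row of `features` is a fixed-length count vector, which is the kind of numeric input a decision tree classifier can be trained on.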
User-friendly GUI Application:
- Displays paper titles and abstracts so they can be read and classified in one place.
- Enables researchers to classify papers as "interesting" or "not interesting" with simple interactive elements.
Scopus Database Queries:
- Automates the search and retrieval of academic papers from the Scopus database using tailored queries.
- Returns a list of papers with key information such as title and authors, aiding quick assessment of relevance.
Python and Jupyter Notebook Optimization:
- Provides a collaborative environment for training, optimizing, and deploying the machine learning model.
- Fine-tunes the binary decision tree model to improve classification accuracy.
- Tunes model parameters to minimize false negatives, that is, "interesting" titles wrongly classified as "not interesting".
- Uses metrics such as accuracy, the ROC curve and its AUC, recall, and F1-score to guide optimization and assess the model's performance.
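
One way to tune a decision tree toward fewer false negatives is a grid search scored on recall, sketched below with scikit-learn. This is a hedged example under stated assumptions, not the notebook's actual procedure: the data is synthetic (generated with `make_classification` as a stand-in for title features), label 1 plays the role of "interesting", and the parameter grid is an arbitrary illustration.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score, recall_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for title features; label 1 means "interesting".
X, y = make_classification(
    n_samples=300, n_features=20, weights=[0.7, 0.3], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Illustrative grid; the values are assumptions, not the project's settings.
param_grid = {
    "max_depth": [3, 5, None],
    "min_samples_leaf": [1, 5],
    "class_weight": [None, "balanced"],
}

# Scoring on recall favours parameter sets that miss few "interesting" papers.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0), param_grid, scoring="recall", cv=5
)
search.fit(X_train, y_train)

best_tree = search.best_estimator_
y_pred = best_tree.predict(X_test)
y_score = best_tree.predict_proba(X_test)[:, 1]  # probability of class 1

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "roc_auc": roc_auc_score(y_test, y_score),
}
print(search.best_params_)
print(metrics)
```

Optimizing for recall alone can inflate false positives, so it is worth checking accuracy and F1-score alongside it before settling on a parameter set.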
Keyword Extraction and Query Preparation:
- Enables researchers to define keywords and prepare queries for searching the Scopus database.
- Includes an example of keyword extraction and query preparation in the `./method` folder.
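
As a rough illustration of query preparation, the sketch below turns groups of keywords into a Scopus search string using the `TITLE-ABS-KEY` field code. The function name `build_scopus_query` and the example keywords are hypothetical, and the example in the `./method` folder may structure its queries differently.

```python
def build_scopus_query(keyword_groups, field="TITLE-ABS-KEY"):
    """OR together the synonyms within each group, then AND the groups."""
    clauses = []
    for group in keyword_groups:
        terms = " OR ".join(f'"{keyword}"' for keyword in group)
        clauses.append(f"{field}({terms})")
    return " AND ".join(clauses)


# Hypothetical keyword groups: each inner list holds interchangeable terms.
keyword_groups = [
    ["literature review", "systematic review"],
    ["machine learning", "decision tree"],
]
query = build_scopus_query(keyword_groups)
print(query)
```

Grouping synonyms with `OR` widens each concept, while joining the groups with `AND` keeps only papers that touch every concept.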

To learn more about the AI Literature Review Assistant, see the GitHub repository.

Figure: Flowchart of how the literature review unfolds with the tool.