Automated-Content-Analysis-Website

Team Whackos

Abhijeet Kumar, Aryan Mittal, Devansh Verma, Riya Sanket Kashive

Overview

This solution automates the extraction and content analysis of all embedded links on a website, regardless of where they appear on the page. For the extracted content, it generates concise and relevant questions and identifies the most relevant links and topics for each question. An automated verification and metric system then assesses the questions for conciseness and relevance. Detailed documentation of the repository is laid out in this document.

Key Features:

Data Scraping: Selenium is used to extract all embedded links from the target website.
Data Storage: The extracted data is stored and organized in JSON files.
Question Generation: The duckduckgo_search library is used together with the Gemini API to generate precise and pertinent questions.
Link-Question Mapping and Relevance Metric: TF-IDF vectorization maps each generated question to its most relevant links, and also serves as the relevance metric for evaluating the quality of those mappings.
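
The link-question mapping described above can be illustrated with a minimal sketch. It is not the code from the notebook: it is a hand-rolled TF-IDF in plain Python, and it assumes cosine similarity as the relevance score, a smoothed IDF, and a simple regex tokenizer. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `map_questions_to_links`) and the example URLs are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(texts):
    """Return one sparse TF-IDF vector (dict of term -> weight) per text."""
    tokenized = [tokenize(t) for t in texts]
    n_docs = len(tokenized)
    # Document frequency: number of texts in which each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            # Term frequency times a smoothed inverse document frequency.
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def map_questions_to_links(questions, link_texts):
    """Map each question to (best_url, score).

    link_texts is a non-empty dict of url -> page text (an assumed input shape).
    """
    urls = list(link_texts)
    vectors = tfidf_vectors([link_texts[u] for u in urls] + list(questions))
    link_vecs = vectors[:len(urls)]
    question_vecs = vectors[len(urls):]
    mapping = {}
    for question, q_vec in zip(questions, question_vecs):
        scores = [cosine(q_vec, l_vec) for l_vec in link_vecs]
        best = max(range(len(urls)), key=scores.__getitem__)
        mapping[question] = (urls[best], scores[best])
    return mapping


if __name__ == "__main__":
    pages = {
        "https://example.com/account": "Reset your password and manage account settings",
        "https://example.com/pricing": "Compare pricing plans and billing options",
    }
    for q, (url, score) in map_questions_to_links(
        ["How do I reset my password?", "What are the pricing plans?"], pages
    ).items():
        print(f"{q} -> {url} ({score:.2f})")
```

The returned score doubles as a rough relevance signal: a value near zero suggests that no scraped page shares vocabulary with the question.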

Problem Statement

For detailed information on the problem statement, please refer to this document.

Achievements

Our solution achieved an accuracy of 83%.

Milestones

Extraction of embedded links from a website.
Content analysis and question generation.
Implementation of an automated system for verification and relevance assessment.
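
The first milestone, link extraction followed by JSON storage, might look roughly like the sketch below. It is an illustration under stated assumptions, not the notebook's code: the helper names (`normalize_links`, `collect_hrefs`, `save_links`) and the JSON layout (`source` and `links` keys) are hypothetical. Only `driver.find_elements(By.TAG_NAME, "a")` and `get_attribute("href")` are Selenium 4 calls; everything else uses the Python standard library.

```python
import json
from urllib.parse import urldefrag, urljoin, urlparse


def normalize_links(base_url, hrefs):
    """Resolve hrefs against base_url, drop fragments and non-HTTP(S)
    schemes, and remove duplicates while keeping first-seen order."""
    seen = set()
    links = []
    for href in hrefs:
        if not href:
            continue
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if urlparse(absolute).scheme not in ("http", "https"):
            continue
        if absolute not in seen:
            seen.add(absolute)
            links.append(absolute)
    return links


def collect_hrefs(driver):
    """Read the href of every anchor element on the page currently loaded
    in a Selenium WebDriver."""
    # Imported lazily so the pure helpers above work without Selenium installed.
    from selenium.webdriver.common.by import By

    return [a.get_attribute("href") for a in driver.find_elements(By.TAG_NAME, "a")]


def save_links(path, base_url, links):
    """Write the links to a JSON file (the layout here is an assumption)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"source": base_url, "links": links}, f, indent=2)
```

With a browser driver available, usage would be along the lines of `driver.get(url)`, then `save_links("links.json", url, normalize_links(url, collect_hrefs(driver)))`, where `links.json` is a placeholder file name.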

Repository files

Json files
.DS_Store
Documentation.pdf
Final_Overlayy.ipynb
README.md

Devanshv17/Smart-Link
