
How to develop a tag-based web searcher using AI

This service makes saved web pages easier to manage and search by tagging them automatically with AI.

Hyunjoong Shin
Amazon Employee
Published Apr 18, 2024


Technology used

  • Streamlit
  • LangChain
  • Amazon Bedrock (anthropic.claude-instant-v1)
  • OpenSearch

Purpose of development

As the number of websites and web pages keeps growing, it becomes harder for individuals to manage the pages they have saved. Once a collection is large, finding and reusing a page is difficult, because the only way to recall what each page contains is to open it and check.

This service addresses that problem by using an AI model to extract key information and suitable tags from a web page, given only the URL the user enters. It summarizes the overall content of the page and extracts representative keywords as tags. The summary and tags are stored with each page as a brief description, so users can search and manage their pages more efficiently, and selecting a tag immediately brings up the related pages. The service is intended as an AI-based tagging solution for individual users or companies that store large numbers of web pages.

Brief description of the feature

  1. Users register a page by entering the URL of the web page they want to bookmark.
  2. After a page is registered and the view is refreshed, the AI analyzes the page content and automatically generates appropriate tags.
  3. Users can click a generated tag to search for other pages with the same tag.
  • Tag-based search lets you quickly browse pages with similar content.
  • Page-specific tags also let you grasp the content and characteristics of a saved page at a glance.
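The tag click in step 3 amounts to an exact-match lookup on a tag field. A minimal sketch of that lookup, assuming each saved page is stored with a `url` and a list of `tags` (both field names are hypothetical, as are the function names), and that `tags` is mapped as a `keyword` field in OpenSearch:

```python
from typing import Any


def build_tag_query(tag: str, size: int = 20) -> dict[str, Any]:
    """Build an OpenSearch query body matching pages that carry exactly `tag`.

    Assumes the hypothetical `tags` field is mapped as `keyword`, so a
    `term` query compares the whole tag without text analysis.
    """
    return {
        "size": size,
        "query": {"term": {"tags": tag}},
    }


def pages_with_tag(pages: list[dict[str, Any]], tag: str) -> list[str]:
    """In-memory equivalent of the query above, handy for reasoning and tests."""
    return [page["url"] for page in pages if tag in page.get("tags", [])]


# Illustrative data only; these URLs and tags are not from the article.
saved_pages = [
    {"url": "https://example.com/a", "tags": ["aws", "search"]},
    {"url": "https://example.com/b", "tags": ["python"]},
    {"url": "https://example.com/c", "tags": ["aws"]},
]

print(build_tag_query("aws"))
print(pages_with_tag(saved_pages, "aws"))
```

Using a `term` query rather than a full-text `match` keeps a click on one tag from also returning pages whose tags merely share a word with it.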

Step 1. Configure AWS credentials

  • Create a new IAM user.
  • Attach the AmazonBedrockFullAccess and AmazonOpenSearchServiceFullAccess policies to the IAM user you created.
  • In a command prompt, run aws configure and enter the credentials of the IAM user created above.
  • When prompted for the region, enter the region in which the OpenSearch domain was created.
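A quick way to confirm the configuration is to compare the configured region with the region embedded in the OpenSearch endpoint. The sketch below assumes the standard endpoint shape (`<name>.<region>.es.amazonaws.com`); the helper names are hypothetical, and `check_credentials` needs boto3 and network access, so it is defined but not called here.

```python
import re
from typing import Optional

# Matches managed-domain (es) and serverless (aoss) endpoints, e.g.
# search-mydomain-abc123.us-east-1.es.amazonaws.com
_ENDPOINT_RE = re.compile(
    r"\.([a-z]{2}(?:-[a-z]+)+-\d+)\.(?:es|aoss)\.amazonaws\.com$"
)


def region_from_host(host: str) -> Optional[str]:
    """Return the AWS region embedded in an OpenSearch endpoint, or None."""
    host = host.removeprefix("https://").rstrip("/")
    match = _ENDPOINT_RE.search(host)
    return match.group(1) if match else None


def check_credentials(host: str) -> None:
    """Print the caller identity and warn if the configured region differs.

    Requires boto3, valid credentials, and network access.
    """
    import boto3  # imported lazily so region_from_host works without it

    session = boto3.Session()
    identity = session.client("sts").get_caller_identity()
    print("Authenticated as:", identity["Arn"])

    expected = region_from_host(host)
    if expected and session.region_name != expected:
        print(
            f"Region mismatch: configured {session.region_name}, "
            f"but the domain is in {expected}"
        )


print(region_from_host("search-mydomain-abc123.us-east-1.es.amazonaws.com"))
```

The ARN printed by `check_credentials` is the same value that gets mapped to the `all_access` role in the next step.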


Step 2. Configure OpenSearch

  • Open OpenSearch Dashboards, go to Management > Security > Roles, and search for all_access.
  • In all_access, open Mapped users and click Manage mapping.
  • Map the ARN of the IAM user created in the previous step.
  • In OpenSearch Dashboards, go to Management > Dev Tools.
  • In Dev Tools, create the index that will store the registered pages.
  • Insert a few example documents into the index.
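The index definition and example documents could look roughly like the following. This is a sketch written in Python with the opensearch-py client instead of Dev Tools requests; the index name `bookmarks`, the fields `url`, `summary`, and `tags`, and the sample documents are all assumptions for illustration, not the author's actual schema.

```python
from typing import Any

INDEX_NAME = "bookmarks"  # hypothetical index name


def index_body() -> dict[str, Any]:
    """Index definition: full-text summary plus exact-match URL and tags."""
    return {
        "mappings": {
            "properties": {
                "url": {"type": "keyword"},
                "summary": {"type": "text"},
                "tags": {"type": "keyword"},  # one value per tag
            }
        }
    }


# Illustrative documents only.
EXAMPLE_DOCS: list[dict[str, Any]] = [
    {
        "url": "https://example.com/opensearch-intro",
        "summary": "An introduction to search indexes.",
        "tags": ["opensearch", "search"],
    },
    {
        "url": "https://example.com/streamlit-basics",
        "summary": "Building a small web UI in Python.",
        "tags": ["streamlit", "python"],
    },
]


def make_client(host: str, region: str):
    """Create a SigV4-signed client; requires opensearch-py and boto3."""
    import boto3
    from opensearchpy import AWSV4SignerAuth, OpenSearch, RequestsHttpConnection

    credentials = boto3.Session().get_credentials()
    return OpenSearch(
        hosts=[{"host": host, "port": 443}],
        http_auth=AWSV4SignerAuth(credentials, region, "es"),
        use_ssl=True,
        verify_certs=True,
        connection_class=RequestsHttpConnection,
    )


def setup_index(client) -> None:
    """Create the index if it is missing, then insert the example documents."""
    if not client.indices.exists(index=INDEX_NAME):
        client.indices.create(index=INDEX_NAME, body=index_body())
    for doc in EXAMPLE_DOCS:
        client.index(index=INDEX_NAME, body=doc, refresh=True)
```

Calling `setup_index(make_client(host, region))` with your own domain endpoint and region would create the index and load the samples. Mapping `tags` as `keyword` is a design choice that supports exact tag matching.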

Step 3. Code settings

  • Open a Python IDE such as PyCharm.
  • Install the required libraries.
  • Enter the application code.
  • Change the region and host in the code to match your environment.
  • Once the code has been entered, run it with the command below.
streamlit run <your python file>.py
  • If everything is configured correctly, the application page opens in your browser.
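As a rough illustration of the tagging step the application performs, the sketch below turns page HTML into text, builds a prompt, and parses a summary and tags from the model output. It is an assumption-laden sketch, not the author's code: it calls Bedrock directly through boto3 rather than through LangChain, and the prompt wording, the JSON output format, and every function name are hypothetical. Only the model ID comes from the article.

```python
import json
import re
from html.parser import HTMLParser

MODEL_ID = "anthropic.claude-instant-v1"  # model named in the article


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str, max_chars: int = 4000) -> str:
    """Strip markup and truncate so the prompt stays small."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)[:max_chars]


def build_prompt(page_text: str, max_tags: int = 5) -> str:
    """Claude text-completion prompts use the Human/Assistant turn format."""
    return (
        "\n\nHuman: Summarize the web page below in two sentences, "
        f"then list up to {max_tags} short tags.\n"
        'Answer only with JSON of the form {"summary": "...", "tags": ["..."]}.\n\n'
        f"<page>\n{page_text}\n</page>"
        "\n\nAssistant:"
    )


def parse_response(completion: str) -> dict:
    """Extract the JSON object from the model output and normalize the tags."""
    match = re.search(r"\{.*\}", completion, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    data = json.loads(match.group(0))
    cleaned = [str(tag).strip().lower() for tag in data.get("tags", [])]
    tags = list(dict.fromkeys(tag for tag in cleaned if tag))  # dedupe, keep order
    return {"summary": str(data.get("summary", "")).strip(), "tags": tags}


def summarize_and_tag(page_text: str, region: str) -> dict:
    """Call Bedrock; requires boto3, credentials, and access to the model."""
    import boto3

    client = boto3.client("bedrock-runtime", region_name=region)
    body = json.dumps(
        {"prompt": build_prompt(page_text), "max_tokens_to_sample": 400, "temperature": 0}
    )
    response = client.invoke_model(modelId=MODEL_ID, body=body)
    return parse_response(json.loads(response["body"].read())["completion"])
```

The dictionary returned by `summarize_and_tag` has the shape a page document would need before being indexed alongside its URL. Lowercasing and deduplicating tags keeps "AWS" and "aws" from becoming separate search targets.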

 

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
