GenFlowChart is a framework that implements flowchart parsing using generative AI. Leveraging SAM for segmentation and OCR for text extraction, it reconstructs workflows through prompt-engineered integration.

ResponsibleAILab/GenFlowchart

GenFlowchart: Parsing and Understanding Flowcharts Using Generative AI

Flowcharts serve as integral visual aids, encapsulating both logical flows and specific component-level information in a manner easily interpretable by humans. However, automated parsing of these diagrams poses a significant challenge due to their intricate logical structure and text-rich nature.

In this paper, we introduce GenFlowChart, a novel framework that employs generative AI to enhance the parsing and understanding of flowcharts. First, the Segment Anything Model (SAM) is used to segment the flowchart's components and geometric shapes. Second, Optical Character Recognition (OCR) extracts the text inside each component for deeper functional comprehension. Finally, we use prompt engineering to formulate prompts that let the generative AI integrate the segmentation results with the extracted text, thereby reconstructing the flowchart's workflows. To validate the effectiveness of GenFlowChart, we evaluate its performance across multiple flowcharts and benchmark it against several baseline approaches.
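
The final integration step can be illustrated with a minimal sketch. It is not the repository's code: the `Component` structure, the `build_prompt` function, and the prompt wording are all hypothetical, and the sketch assumes segmentation has already produced one bounding box per component and OCR has already produced the text inside each box.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Component:
    """One segmented flowchart element (hypothetical structure, not from the repository)."""
    component_id: int
    bbox: Tuple[int, int, int, int]  # assumed layout: (x, y, width, height) in pixels
    text: str                        # OCR output for the region, possibly with stray whitespace


def build_prompt(components: List[Component]) -> str:
    """Combine segmentation boxes and OCR text into one prompt for a generative model."""
    # Sort top-to-bottom, then left-to-right, so the listing roughly follows reading order.
    ordered = sorted(components, key=lambda c: (c.bbox[1], c.bbox[0]))
    lines = [
        "The following components were detected in a flowchart image.",
        "Each entry gives a bounding box in pixels and the text found inside it.",
    ]
    for c in ordered:
        x, y, w, h = c.bbox
        lines.append(
            f"- component {c.component_id}: box=(x={x}, y={y}, w={w}, h={h}), "
            f"text={c.text.strip()!r}"
        )
    lines.append("Describe the workflow these components represent, step by step.")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative values only; real boxes and text would come from SAM and OCR.
    demo = [
        Component(2, (40, 180, 120, 40), "Check input\n"),
        Component(1, (40, 20, 120, 40), "Start"),
    ]
    print(build_prompt(demo))
```

The resulting string would then be sent to whichever generative model is in use. Listing components in approximate reading order is one design choice among several; a real pipeline may also need to describe the arrows connecting the components.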

Installation

Install the following libraries before running the commands in the provided Jupyter notebook:

pip install pdf2image PyMuPDF 
pip install pytesseract
pip install bert_score
pip install -U sentence-transformers
pip install openai
pip install Word2Vec
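
After installation, a short check like the one below can confirm that the packages are importable. This is a sketch under assumptions: the import names listed are the usual ones for these packages (PyMuPDF is commonly imported as `fitz`, sentence-transformers as `sentence_transformers`), the import name for the Word2Vec package is left out because it is not confirmed here, and the helper `missing_modules` is hypothetical.

```python
import importlib.util

# Assumed import names for the packages installed above; adjust if your versions differ.
ASSUMED_MODULES = [
    "pdf2image",
    "fitz",                   # PyMuPDF
    "pytesseract",
    "bert_score",
    "sentence_transformers",  # installed as sentence-transformers
    "openai",
]


def missing_modules(names):
    """Return the names from `names` that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(ASSUMED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

Note that `pytesseract` is a wrapper around the Tesseract OCR engine and `pdf2image` relies on Poppler, so both system tools likely need to be installed separately from the pip packages.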
