Screph – CV-based GUI annotation & desktop automation data for LLM agents


Hi everyone,

I’d like to share a project I’m working on: Screph, a desktop tool for annotating graphical interfaces and preparing structured data for automation, built around computer vision, human‑like input emulation and a PySide6 GUI.

The main idea is:

Annotate interfaces and export structured data to JSON; LLM agents then generate automation scripts from that data.

Instead of hardcoding logic into a traditional bot, you visually describe how the UI looks and behaves. Screph captures screens, lets you annotate elements and flows, and produces machine‑readable data that other tools (including LLM agents) can use to generate and maintain automation scripts.

What is Screph?

Screph is a Windows desktop application that combines:

- computer vision for detecting and analyzing UI elements on the screen;
- human‑like mouse and keyboard input emulation;
- a configurable PySide6 GUI for building and testing visual annotations;
- speech recognition used to create voice annotations for images and logical links between them.

You work at the level of “this is the button”, “this is the field”, “this sequence means this operation”, and Screph turns that into structured data (JSON) that can drive higher‑level automation.
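
To make the "button / field / sequence" idea concrete, here is a minimal sketch of how such annotations could be modelled and serialized. It is not Screph's actual export format: the class names (`Element`, `Step`, `Flow`, `Screen`), the field names and the `(x, y, width, height)` box layout are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Element:
    """One annotated UI element ("this is the button", "this is the field")."""
    id: str
    kind: str                        # e.g. "button", "field"
    label: str                       # typed or voice annotation text
    bbox: Tuple[int, int, int, int]  # hypothetical layout: x, y, width, height


@dataclass
class Step:
    """One action in a flow, pointing at an element by its id."""
    element_id: str
    action: str                      # e.g. "click", "type"
    text: Optional[str] = None       # payload for "type" actions


@dataclass
class Flow:
    """A named sequence of steps ("this sequence means this operation")."""
    name: str
    steps: List[Step] = field(default_factory=list)


@dataclass
class Screen:
    """All annotations that belong to one captured screen."""
    name: str
    elements: List[Element] = field(default_factory=list)
    flows: List[Flow] = field(default_factory=list)


def export_screen(screen: Screen) -> str:
    """Serialize a screen annotation to pretty-printed JSON."""
    return json.dumps(asdict(screen), indent=2, ensure_ascii=False)


# Usage: one button, one flow that clicks it.
screen = Screen(
    name="login",
    elements=[Element("btn_ok", "button", "Confirm login", (420, 310, 96, 28))],
    flows=[Flow("log_in", [Step("btn_ok", "click")])],
)
exported = export_screen(screen)
```

Keeping elements and flows separate, with steps referring to elements by id, means a region can be redrawn without touching the flows that use it.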

More about the concept and product:

- Website:

Intended use cases

Screph is focused on GUI annotation and data preparation for automation, not on being a one‑click bot generator. Typical scenarios:

- Legacy systems without API
  - annotate screens of old terminals, SCADA, banking ABS;
  - let an agent generate monitoring/control scripts based on your annotations.
- Digitizing expert workflows
  - record how an expert works with a complex UI;
  - annotate key steps and decisions;
  - generate training and automation scripts from those annotations.
- Visual compliance and auditing
  - annotate forms with validation rules and expectations;
  - generate scripts that perform visual checks and logging.
- Integration without API
  - annotate UIs of two systems (e.g. CAD ↔ ERP, LIMS ↔ Excel);
  - generate pipelines that transfer data through visual interaction.
- ETL from screencasts
  - annotate frames of recorded sessions;
  - generate scripts and documentation (SOPs, best practices).
- Other GUI‑heavy domains
  - video surveillance operators, laboratory protocols, creative pipelines (Adobe / DaVinci / Blender), and similar “screen‑driven” processes.

Key features

- Computer vision for GUI understanding
  - YOLO‑based detection of UI elements and regions;
  - classic OpenCV processing (segmentation, contours, masks, metrics);
  - reusable presets and an extensible architecture for custom CV modes.
- Human‑like input emulation
  - mouse moves, clicks, drags, scrolling with human‑like trajectories;
  - keyboard events with variable typing speed, micro‑movements and random delays;
  - optional hardware‑based modes (e.g. Arduino) if you work with strict environments.
- Annotation‑oriented PySide6 GUI
  - multi‑tab interface for CV settings, annotations and input emulation;
  - screen selector for drawing rectangles / polygons / arbitrary ROIs;
  - tray integration for quick access while you work in real applications.
- Speech recognition for annotations (not direct control)
  - multiple speech recognition providers (online and offline);
  - using recognized text as voice annotations to images, regions and logical flows;
  - convenient for documenting what happens on the screen while you work.
- Web panel & API (optional)
  - separate Django backend with a web panel and REST API;
  - suitable for remote control, integrations, and “cloud side” workflows when needed.
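
One place where the CV features and the screen selector meet is deciding which detected box belongs to which hand-drawn region. The sketch below shows one common way to do that, intersection over union (IoU); it is a generic technique, not a description of Screph's internals. The `(x, y, width, height)` box layout, the function names and the 0.5 threshold are assumptions for illustration, and the detections are plain tuples rather than output from a real detector.

```python
from typing import Dict, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # assumed layout: x, y, width, height


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes (0.0 to 1.0)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2 = min(a[0] + a[2], b[0] + b[2])
    iy2 = min(a[1] + a[3], b[1] + b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    if inter == 0:
        return 0.0
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union


def match_rois(
    rois: Dict[str, Box],
    detections: List[Box],
    threshold: float = 0.5,
) -> Dict[str, Optional[Box]]:
    """For each named ROI, pick the best-overlapping detection.

    An ROI maps to None when no detection reaches the threshold.
    """
    matches: Dict[str, Optional[Box]] = {}
    for name, roi in rois.items():
        best = max(detections, key=lambda d: iou(roi, d), default=None)
        if best is not None and iou(roi, best) >= threshold:
            matches[name] = best
        else:
            matches[name] = None
    return matches


# Usage: two hand-drawn ROIs, two boxes as a detector might report them.
rois = {"ok": (100, 100, 100, 40), "cancel": (500, 500, 60, 30)}
detections = [(104, 98, 94, 43), (300, 300, 40, 20)]
matched = match_rois(rois, detections)
```

An ROI that ends up with `None` is a hint that the region was drawn loosely or that the detector missed the element, so it is worth a second look before export.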

How Screph is typically used

A common workflow looks like this:

1. Start the Screph GUI.
2. Configure basic parameters (language, logging, CV/input settings).
3. Use the screen selector to capture screens and mark UI elements or regions.
4. Optionally run CV analysis on selected regions to refine detection.
5. Add text or voice annotations that describe what these regions mean and how they are related.
6. Export structured data (e.g. JSON) that can be consumed by LLM agents or other automation systems to generate scripts.

The focus is on describing the GUI and flows, not on embedding all automation logic directly into Screph.
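
For the last step of the workflow, here is a rough sketch of how a downstream tool might turn an exported file into instructions for an LLM agent. The JSON layout in `SAMPLE_EXPORT`, the `build_prompt` function and the prompt wording are all hypothetical; the real export format may differ, and the call to an actual LLM is left out on purpose.

```python
import json

# Hypothetical export; the real structure produced by the tool may differ.
SAMPLE_EXPORT = """
{
  "name": "login",
  "elements": [
    {"id": "fld_user", "kind": "field", "label": "User name",
     "bbox": [400, 240, 180, 24]},
    {"id": "btn_ok", "kind": "button", "label": "Confirm login",
     "bbox": [420, 310, 96, 28]}
  ],
  "flows": [
    {"name": "log_in", "steps": [
      {"element_id": "fld_user", "action": "type", "text": "<username>"},
      {"element_id": "btn_ok", "action": "click", "text": null}
    ]}
  ]
}
"""


def build_prompt(export_json: str) -> str:
    """Render an annotation export as plain-text instructions for an agent."""
    data = json.loads(export_json)
    elements = {e["id"]: e for e in data["elements"]}

    lines = [f'Screen: {data["name"]}', "Elements:"]
    for e in data["elements"]:
        x, y, w, h = e["bbox"]
        lines.append(
            f'- {e["id"]} ({e["kind"]}, "{e["label"]}") at x={x}, y={y}, w={w}, h={h}'
        )

    for flow in data["flows"]:
        lines.append(f'Flow "{flow["name"]}":')
        for i, step in enumerate(flow["steps"], start=1):
            target = elements.get(step["element_id"])
            if target is None:
                # A step that points at an unknown element is an export error.
                raise ValueError(f'unknown element id: {step["element_id"]}')
            detail = f' with text {step["text"]!r}' if step.get("text") else ""
            lines.append(
                f'{i}. {step["action"]} {target["label"]} [{step["element_id"]}]{detail}'
            )

    lines.append(
        "Write a script that performs each flow using only the elements listed above."
    )
    return "\n".join(lines)


prompt = build_prompt(SAMPLE_EXPORT)
```

Resolving every step against the element table before anything reaches the agent catches broken references early, instead of letting the generated script fail on screen.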

Downloads, docs and community

- Product website and overview:
- GitHub:
- Telegram:
- Development & collaboration:
  - We manage issues, discussions and pull requests via GitHub (links are available from the website and documentation).
  - The docs also contain more technical details about architecture, modules and APIs.

As the project evolves, these resources will be kept up to date with new releases and guides.

Security & ethical notice

Screph is a general‑purpose tool for GUI annotation and automation data preparation.

It is NOT intended to be used as a cheat, hack or botting tool in online games or other software where this would violate Terms of Service, EULAs or local laws.

By using Screph you agree that:

- you are fully responsible for how you use this software;
- you will not use it to bypass anti‑cheat systems or gain unfair advantage in online environments;
- you will comply with all applicable ToS, EULAs and laws.

I do not provide support for cheating, botting or bypassing protections in games or other software.

Feedback

I’d appreciate feedback from other developers and automation practitioners:

- ideas for better CV/annotation abstractions;
- suggestions for useful JSON schemas and patterns for LLM‑driven automation;
- real‑world use cases where GUI annotation would simplify integration or training.

If you have questions about the internals (CV pipeline, input emulation, speech recognition for annotations), feel free to ask and I’ll share more technical details.

— Screph author

Tags
computer-vision, desktop-automation, gui-annotation, llm-agents, python-pyside6

