hKG Web Importer Chrome Extension
Overview
The hKG Web Importer is a Chrome extension that imports web content into your Hybrid Knowledge Graph (hKG) system. It skips exact duplicates but preserves near-duplicates as incremental redundancies for triangulation, a key design principle of hKG.
Key Features
- 🔍 Smart Duplicate Detection: Checks for identical content but preserves near-duplicates for triangulation
- 📊 Incremental Redundancy Tracking: Captures micro-adjustments and variations in content
- 🎯 Customizable Import Options: Choose what content to import (text, images, videos, audio, metadata)
- 🤖 AI Inference: Extract entities, relationships, summaries, and topics
- 📸 Media Processing: OCR on images, transcription of audio/video
- 🔗 Context Preservation: Maintains relationships between similar content
- ⚡ Quick Import: Keyboard shortcuts and context menu integration
- 📈 Import History: Track what you've imported with similarity scores
Philosophy: Triangulation Through Redundancy
Unlike traditional systems that avoid duplicates, hKG embraces controlled redundancy. When content is similar but not identical, these variations provide valuable signals for understanding how information evolves and relates. The extension supports three duplicate handling modes:
- Skip only if 100% identical (Default): Import everything except exact duplicates
- Always import: Track all variations, even identical content with different access contexts
- Smart diff: Highlight changes between versions
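The three modes above can be sketched as a small decision function. This is a minimal illustration, not the extension's actual logic: the names `DuplicateMode` and `decide_import` are hypothetical, a similarity of 1.0 is assumed to mean "100% identical" (a real implementation might compare content hashes instead), and skipping identical content in smart-diff mode is also an assumption, since there would be no changes to highlight.

```python
from enum import Enum


class DuplicateMode(Enum):
    """Hypothetical names for the three duplicate handling modes."""
    SKIP_IDENTICAL = "skip_identical"  # default: skip only exact duplicates
    ALWAYS_IMPORT = "always_import"    # import everything, even exact duplicates
    SMART_DIFF = "smart_diff"          # highlight changes between versions


def decide_import(similarity: float, mode: DuplicateMode,
                  threshold: float = 0.85) -> str:
    """Return 'skip', 'import', or 'import_with_diff' for a similarity in [0, 1]."""
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must be between 0.0 and 1.0")
    identical = similarity >= 1.0  # assumption: 1.0 means 100% identical

    if mode is DuplicateMode.ALWAYS_IMPORT:
        return "import"
    if mode is DuplicateMode.SKIP_IDENTICAL:
        return "skip" if identical else "import"
    # SMART_DIFF (assumed behavior): nothing to diff when identical,
    # diff against the stored version when similar, plain import otherwise.
    if identical:
        return "skip"
    if similarity >= threshold:
        return "import_with_diff"
    return "import"
```

For example, `decide_import(0.99, DuplicateMode.SKIP_IDENTICAL)` returns `"import"`, which matches the default mode's intent of keeping near-duplicates for triangulation.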
Installation
1. Install the Chrome Extension
Developer Mode Installation (Recommended for now)
- Clone this repository or download the extension files
- Open Chrome and navigate to chrome://extensions/
- Enable "Developer mode" in the top right
- Click "Load unpacked"
- Select the hkg-chrome-extension directory
- The extension icon (🧠) will appear in your toolbar
2. Set Up the API Bridge Server
The API bridge connects the Chrome extension to your hKG infrastructure.
Prerequisites
# Install Python dependencies
pip install flask flask-cors psycopg2-binary qdrant-client neo4j \
sentence-transformers numpy requests
Start the API Bridge
cd /home/robin/CascadeProjects/ihkg/hkg-chrome-extension/api-bridge
python hkg_web_api.py
The API bridge will start on port 59877 by default. You can override the port by setting HKG_API_PORT or PORT.
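One way the port override could work is sketched below. The function name `resolve_port` is hypothetical, and giving HKG_API_PORT precedence over PORT is an assumption; check hkg_web_api.py for the actual behavior.

```python
import os
from typing import Mapping, Optional

DEFAULT_PORT = 59877  # default port stated in this README


def resolve_port(env: Optional[Mapping[str, str]] = None) -> int:
    """Pick the listening port from HKG_API_PORT, then PORT, then the default."""
    env = os.environ if env is None else env
    for name in ("HKG_API_PORT", "PORT"):  # assumed precedence order
        raw = env.get(name)
        if raw:
            port = int(raw)  # raises ValueError on non-numeric input
            if not 1 <= port <= 65535:
                raise ValueError(f"{name}={raw} is not a valid TCP port")
            return port
    return DEFAULT_PORT
```

Passing a mapping instead of reading `os.environ` directly keeps the function easy to test, for example `resolve_port({"PORT": "8080"})`.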
Configure Environment (Optional)
Create a .env file in the api-bridge directory:
POSTGRES_HOST=192.168.0.111
POSTGRES_PORT=5432
POSTGRES_DB=hkg
POSTGRES_USER=hkg_user
POSTGRES_PASSWORD=h3Lp-hKg#2024
NEO4J_URI=bolt://192.168.0.111:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=h3Lp-Ne0#2024
QDRANT_HOST=192.168.0.173
QDRANT_PORT=6333
SIMILARITY_THRESHOLD=0.85
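How the API bridge actually loads this file is not shown here, so the following is only a sketch of a minimal KEY=VALUE parser. The function name `parse_env` and the sample values are hypothetical. It splits on the first "=" only and treats "#" as a comment marker only at the start of a line, so values containing "#" or "=" (common in passwords) survive intact.

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; skip blank lines and full-line comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so '=' inside a value is preserved.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


# Hypothetical sample values, not real credentials.
sample = """
# database settings
POSTGRES_PORT=5432
POSTGRES_PASSWORD=pa#ss=word
NEO4J_URI=bolt://localhost:7687
SIMILARITY_THRESHOLD=0.85
"""

config = parse_env(sample)
postgres_port = int(config["POSTGRES_PORT"])
similarity_threshold = float(config["SIMILARITY_THRESHOLD"])
```

Numeric settings such as POSTGRES_PORT and SIMILARITY_THRESHOLD arrive as strings and need explicit conversion, as the last two lines show.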
3. Configure the Extension
- Click the extension icon (🧠) in Chrome
- Click on "Advanced Options"
- Set the API Endpoint (default: http://localhost:59877/import)
- Configure your import preferences
Usage
Quick Import
- Keyboard Shortcut: Press Ctrl+Shift+I (configurable in Chrome settings)
- Extension Icon: Click the 🧠 icon and click "Import to hKG"
Context Menu Import
Right-click on any:
- Selected text
- Image
- Video
- Audio
- Link
- Page background
Select "Import to hKG" from the context menu.
Advanced Import Options
Content Selection
- Page Text Content: Extract all text from the page
- Images: Include all images (with OCR option)
- Videos: Extract video URLs and metadata
- Audio: Include audio files
- Links & References: Capture all hyperlinks
- Page Metadata: Include title, description, Open Graph data
- Selected Text Only: Import only highlighted text
Duplicate Handling
- Similarity Threshold: Adjust the slider (50-100%) to control what's considered "similar"
- Duplicate Modes: Choose how to handle similar content
AI Inference Options
- Extract Entities: Identify people, places, organizations
- Extract Relationships: Find connections between entities
- Generate Summary: Create concise summaries
- Detect Topics: Identify main themes
- OCR on Images: Extract text from images
- Transcribe Media: Convert audio/video to text
Checking for Duplicates
Click "Check Duplicates" to see if similar content already exists in your hKG:
- Green (✨ Unique): No similar content found
- Yellow (⚠ Similar): Similar content exists (shows similarity %)
- Red (✓ Identical): Exact duplicate found
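The mapping from a similarity score to these three statuses might look like the sketch below. The function names are hypothetical, and it assumes the configured similarity threshold is the boundary between "Unique" and "Similar" and that a score of 1.0 means an exact duplicate.

```python
def duplicate_status(similarity: float, threshold: float = 0.85) -> str:
    """Classify a similarity score in [0, 1] as 'unique', 'similar', or 'identical'."""
    if similarity >= 1.0:
        return "identical"
    if similarity >= threshold:
        return "similar"
    return "unique"


def status_label(similarity: float, threshold: float = 0.85) -> str:
    """Build the text shown to the user, including the similarity percentage."""
    status = duplicate_status(similarity, threshold)
    if status == "identical":
        return "✓ Identical"
    if status == "similar":
        return f"⚠ Similar ({similarity:.0%})"
    return "✨ Unique"
```

Lowering the threshold widens the "Similar" band, which is what the Similarity Threshold slider controls.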
Preview Content
Click "Preview" to see what will be imported before committing.
API Endpoints
The API bridge provides several endpoints:
- POST /import: Import content to hKG
- POST /check-duplicates: Check for similar/identical content
- POST /search: Search content in hKG
- GET /stats: Get import statistics
- GET /health: Health check
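A small standard-library client for these endpoints could be sketched as follows. The helper names `build_request` and `call` are hypothetical, as are the payload fields in the usage comment. The request and response schemas are not documented here, so JSON bodies in both directions are an assumption.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

BASE_URL = "http://localhost:59877"  # default API bridge address from this README


def build_request(path: str,
                  payload: Optional[Dict[str, Any]] = None) -> urllib.request.Request:
    """Build a GET request when there is no payload, otherwise a JSON POST."""
    url = BASE_URL + path
    if payload is None:
        return urllib.request.Request(url, method="GET")
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(request: urllib.request.Request, timeout: float = 5.0) -> Any:
    """Send the request and decode the response, assuming it is JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, with the API bridge running (payload fields are hypothetical):
#   call(build_request("/health"))
#   call(build_request("/check-duplicates", {"url": "https://example.com"}))
```

Separating request construction from sending makes it possible to inspect a request without a running server.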
Architecture
Chrome Extension
├── popup.html/js (UI)
├── content.js (Page extraction)
└── background.js (Service worker)
↓
API Bridge (Flask)
↓
hKG Infrastructure
├── PostgreSQL (Raw data + metadata)
├── Neo4j (Knowledge graph)
└── Qdrant (Vector embeddings)
Development
Building from Source
# Install dependencies
npm install
# Build extension (if using build tools)
npm run build
# Package for distribution
npm run package
Testing
- Load the extension in developer mode
- Navigate to any webpage
- Try importing different types of content
- Check the browser console for debug messages
Debugging the API Bridge
# Run with debug logging
FLASK_ENV=development python hkg_web_api.py
# Check logs
tail -f /var/log/hkg-web-api.log
Troubleshooting
Extension Not Working
- Check that the API bridge is running: curl http://localhost:59877/health
- Verify Chrome permissions are granted
- Check browser console for errors (F12 → Console)
Import Failures
- Verify database connections (PostgreSQL, Neo4j, Qdrant)
- Check API bridge logs for errors
- Ensure the universal importer is properly configured
Duplicate Detection Issues
- Adjust similarity threshold in settings
- Check if Qdrant collection exists and has proper embeddings
- Verify the embedding model is loaded correctly
Advanced Configuration
Custom Tags and Context
Add custom tags and context notes to imports for better organization:
- Custom Tags: Comma-separated tags (e.g., "research, project-x, important")
- Context Note: Add notes about why you're importing this content
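As an illustration of how a comma-separated tag string might be normalized before it is attached to an import, here is a minimal sketch. The function name `parse_tags` is hypothetical, and dropping empty entries and case-insensitive duplicates is an assumption rather than documented behavior.

```python
from typing import List


def parse_tags(raw: str) -> List[str]:
    """Split a comma-separated string into trimmed, de-duplicated tags."""
    seen = set()
    tags: List[str] = []
    for part in raw.split(","):
        tag = part.strip()
        key = tag.lower()  # compare case-insensitively, keep first spelling
        if tag and key not in seen:
            seen.add(key)
            tags.append(tag)
    return tags
```

With the example from the list above, `parse_tags("research, project-x, important")` yields the three tags in their original order.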
Multiple API Endpoints
You can run multiple API bridges on different high ports for different projects. The server reads HKG_API_PORT (or PORT):
# Project 1
HKG_API_PORT=59877 python hkg_web_api.py
# Project 2
HKG_API_PORT=59878 python hkg_web_api.py
Security Considerations
- The extension only accesses the active tab when you explicitly trigger an import
- API communication is local by default (localhost)
- For remote API access, use HTTPS and authentication
- Sensitive data is never logged or exposed in the extension
Contributing
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
License
[Specify your license here]
Support
For issues or questions:
- Check the Issues page
- Contact: [your contact info]
Roadmap
- Batch import multiple tabs
- Scheduled imports for monitoring changes
- Integration with browser bookmarks
- Export/import settings
- Team collaboration features
- Advanced diff visualization
- Mobile browser support
Remember: hKG thrives on incremental redundancies. Don't worry about importing similar content - these variations help build a richer, more nuanced knowledge graph!